[ewg] ofa_1_5_kernel 20110216-0200 daily build status
This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters:

Passed:
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.28
Passed on i686 with linux-2.6.27
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.31
Passed on i686 with linux-2.6.29
Passed on i686 with linux-2.6.33
Passed on i686 with linux-2.6.32
Passed on i686 with linux-2.6.30
Passed on i686 with linux-2.6.35
Passed on i686 with linux-2.6.34
Passed on i686 with linux-2.6.36
Passed on x86_64 with linux-2.6.16.60-0.54.5-smp
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-238.el5
Passed on x86_64 with linux-2.6.18-194.el5
Passed on x86_64 with linux-2.6.18-164.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.28
Passed on x86_64 with linux-2.6.27.19-5-smp
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.28
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] rping/cxgb3 regression
On 02/16/2011 04:00 AM, Hefty, Sean wrote:
> Not a big deal. Vlad, can you pull librdmacm 1.0.14.1 into the next
> OFED 1.5.3 RC? The only change versus 1.0.14 is reverting a patch to
> the rping sample.
>
> Thanks,
> Sean

Done,

Regards,
Vladimir
Re: [ewg] [PATCH]mlx4_ib XRC RCV: Fix mlx4_ib_reg_xrc_rcv_qp() locking
You are correct! Good catch. We will add this to OFED.

(P.S., I would rather leave irqsave -- it is used everywhere else for this spinlock).

-Jack

On Monday 14 February 2011 09:32, sebastien dugue wrote:
> Resending to the proper ML (sorry).
>
> In mlx4_ib_reg_xrc_rcv_qp(), we need to take the xrc_reg_list_lock
> spinlock when walking the xrc_reg_list.
>
> We've been hit by this on 2 customer sites.
>
> Also, I guess spin_lock_irqsave() could be replaced by spin_lock_irq()
> in that function as we know for sure we're in process context.
>
> Signed-off-by: Sébastien Dugué <sebastien.du...@bull.net>
> ---
>  qp.c |    3 +++
>  1 file changed, 3 insertions(+)
>
> Index: kernel-ib/drivers/infiniband/hw/mlx4/qp.c
> ===================================================================
> --- kernel-ib.orig/drivers/infiniband/hw/mlx4/qp.c	2011-01-31 16:52:11.0 +0100
> +++ kernel-ib/drivers/infiniband/hw/mlx4/qp.c	2011-02-11 15:24:27.0 +0100
> @@ -2549,13 +2549,16 @@
>  	}
>  
>  	mutex_lock(&mibqp->mutex);
> +	spin_lock_irqsave(&mibqp->xrc_reg_list_lock, flags);
>  	list_for_each_entry(tmp, &mibqp->xrc_reg_list, list)
>  		if (tmp->context == context) {
> +			spin_unlock_irqrestore(&mibqp->xrc_reg_list_lock, flags);
>  			mutex_unlock(&mibqp->mutex);
>  			kfree(ctx_entry);
>  			mutex_unlock(&to_mdev(xrcd->device)->xrc_reg_mutex);
>  			return 0;
>  		}
> +	spin_unlock_irqrestore(&mibqp->xrc_reg_list_lock, flags);
>  
>  	ctx_entry->context = context;
>  	spin_lock_irqsave(&mibqp->xrc_reg_list_lock, flags);
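The point of the patch, that a list shared with other writers must only be walked with its lock held, can be sketched outside the kernel. The following is a minimal userspace illustration, not the mlx4 code: the `toy_*` names are hypothetical, and a C11 `atomic_flag` busy-wait lock stands in for the kernel spinlock (it does nothing about interrupts).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kernel spinlock. */
struct toy_spinlock {
	atomic_flag held;
};

static void toy_spin_lock(struct toy_spinlock *l)
{
	/* Busy-wait until the flag was clear before we set it. */
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical entry, loosely modelled on the xrc_reg_list items. */
struct toy_reg_entry {
	struct toy_reg_entry *next;
	const void *context;
};

struct toy_qp {
	struct toy_spinlock reg_list_lock;
	struct toy_reg_entry *reg_list;
};

/* Start with an empty list and the lock released. */
static void toy_qp_init(struct toy_qp *qp)
{
	atomic_flag_clear(&qp->reg_list_lock.held);
	qp->reg_list = NULL;
}

/* Writers modify the list only while holding the lock. */
static void toy_qp_add(struct toy_qp *qp, struct toy_reg_entry *e,
		       const void *context)
{
	assert(e != NULL);
	e->context = context;
	toy_spin_lock(&qp->reg_list_lock);
	e->next = qp->reg_list;
	qp->reg_list = e;
	toy_spin_unlock(&qp->reg_list_lock);
}

/* Readers take the same lock for the whole walk and release it on
 * every exit path, which is what the patch adds to the driver. */
static bool toy_qp_is_registered(struct toy_qp *qp, const void *context)
{
	struct toy_reg_entry *e;
	bool found = false;

	toy_spin_lock(&qp->reg_list_lock);
	for (e = qp->reg_list; e != NULL; e = e->next) {
		if (e->context == context) {
			found = true;
			break;
		}
	}
	toy_spin_unlock(&qp->reg_list_lock);
	return found;
}
```

Funnelling the early exit through a single unlock is a design choice of this sketch; the patch itself unlocks separately on the early-return path and after the loop.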
Re: [ewg] [PATCH]mlx4_ib XRC RCV: Fix mlx4_ib_reg_xrc_rcv_qp() locking
On Wed, 16 Feb 2011 14:50:02 +0200 Jack Morgenstein <ja...@dev.mellanox.co.il> wrote:
> You are correct! Good catch. We will add this to OFED.

Thanks,

> (P.S., I would rather leave irqsave -- it is used everywhere else for this spinlock).

Right, but wherever you know for sure which context you are in (process or interrupt), there's no need to use the save/restore variant. Those are only meant for places where you don't know which context you are in.

Also, one thing I noticed in that same function: why allocate ctx_entry before knowing if it's going to be of any use? The allocation could be done right before the first use.

Sébastien.

> -Jack
>
> On Monday 14 February 2011 09:32, sebastien dugue wrote:
> > Resending to the proper ML (sorry).
> >
> > In mlx4_ib_reg_xrc_rcv_qp(), we need to take the xrc_reg_list_lock
> > spinlock when walking the xrc_reg_list.
> >
> > We've been hit by this on 2 customer sites.
> >
> > Also, I guess spin_lock_irqsave() could be replaced by spin_lock_irq()
> > in that function as we know for sure we're in process context.
> >
> > Signed-off-by: Sébastien Dugué <sebastien.du...@bull.net>
> > ---
> >  qp.c |    3 +++
> >  1 file changed, 3 insertions(+)
> >
> > Index: kernel-ib/drivers/infiniband/hw/mlx4/qp.c
> > ===================================================================
> > --- kernel-ib.orig/drivers/infiniband/hw/mlx4/qp.c	2011-01-31 16:52:11.0 +0100
> > +++ kernel-ib/drivers/infiniband/hw/mlx4/qp.c	2011-02-11 15:24:27.0 +0100
> > @@ -2549,13 +2549,16 @@
> >  	}
> >  
> >  	mutex_lock(&mibqp->mutex);
> > +	spin_lock_irqsave(&mibqp->xrc_reg_list_lock, flags);
> >  	list_for_each_entry(tmp, &mibqp->xrc_reg_list, list)
> >  		if (tmp->context == context) {
> > +			spin_unlock_irqrestore(&mibqp->xrc_reg_list_lock, flags);
> >  			mutex_unlock(&mibqp->mutex);
> >  			kfree(ctx_entry);
> >  			mutex_unlock(&to_mdev(xrcd->device)->xrc_reg_mutex);
> >  			return 0;
> >  		}
> > +	spin_unlock_irqrestore(&mibqp->xrc_reg_list_lock, flags);
> >  
> >  	ctx_entry->context = context;
> >  	spin_lock_irqsave(&mibqp->xrc_reg_list_lock, flags);
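The difference between the two lock variants debated in this thread can be shown with a toy model. This is only a sketch under a strong simplification: a single boolean stands in for the CPU's "interrupts enabled" state, no real locking takes place, and all `toy_*` names are hypothetical. It shows why the plain `_irq` pair is only safe when interrupts are known to be enabled on entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one flag standing in for "local interrupts enabled". */
static bool toy_irqs_enabled = true;

/* Models spin_lock_irqsave(): remember the previous state, then disable. */
static void toy_lock_irqsave(bool *flags)
{
	*flags = toy_irqs_enabled;
	toy_irqs_enabled = false;
}

/* Models spin_unlock_irqrestore(): put back whatever was saved. */
static void toy_unlock_irqrestore(bool flags)
{
	toy_irqs_enabled = flags;
}

/* Models spin_lock_irq(): disable unconditionally, saving nothing. */
static void toy_lock_irq(void)
{
	toy_irqs_enabled = false;
}

/* Models spin_unlock_irq(): enable unconditionally. */
static void toy_unlock_irq(void)
{
	toy_irqs_enabled = true;
}

/* Interrupt state after a critical section using the save/restore pair. */
static bool toy_state_after_irqsave(bool enabled_on_entry)
{
	bool flags;

	toy_irqs_enabled = enabled_on_entry;
	toy_lock_irqsave(&flags);
	assert(!toy_irqs_enabled);	/* always off inside the section */
	toy_unlock_irqrestore(flags);
	return toy_irqs_enabled;
}

/* Interrupt state after a critical section using the plain irq pair. */
static bool toy_state_after_irq(bool enabled_on_entry)
{
	toy_irqs_enabled = enabled_on_entry;
	toy_lock_irq();
	assert(!toy_irqs_enabled);	/* always off inside the section */
	toy_unlock_irq();
	return toy_irqs_enabled;
}
```

In this model both pairs behave identically when interrupts are enabled on entry. Only the save/restore pair preserves a caller that entered with interrupts already disabled, which is one reason to keep it where the calling context is not certain.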
Re: [ewg] [PATCH]mlx4_ib XRC RCV: Fix mlx4_ib_reg_xrc_rcv_qp() locking
On Wednesday 16 February 2011 15:02, sebastien dugue wrote:
> Also, one thing I noticed in that same function: why allocate ctx_entry
> before knowing if it's going to be of any use? The allocation could be
> done right before the first use.

I did it just to gather all the error returns at the beginning of the function. You are correct, though: I could have walked the list before doing the allocation. I don't see this as critical, though.

-Jack
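The ordering suggested here, walking the list first and allocating only when a new entry is really needed, might look like the following userspace sketch. It is not the driver code: the `toy_*` names and the allocation counter are hypothetical, `malloc()` stands in for the kernel allocator, and locking is left out to keep the ordering visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical registration entry. */
struct toy_entry {
	struct toy_entry *next;
	const void *context;
};

/* Counts successful allocations so the effect of the ordering is visible. */
static int toy_alloc_count;

/* Returns 0 if the context is registered on exit (new or already
 * present), -1 if an entry was needed but could not be allocated. */
static int toy_register(struct toy_entry **head, const void *context)
{
	struct toy_entry *e;

	assert(head != NULL);

	/* Walk first: an existing registration needs no allocation. */
	for (e = *head; e != NULL; e = e->next)
		if (e->context == context)
			return 0;

	/* Allocate only now that the entry is known to be needed. */
	e = malloc(sizeof(*e));
	if (e == NULL)
		return -1;
	toy_alloc_count++;

	e->context = context;
	e->next = *head;
	*head = e;
	return 0;
}

/* Releases every entry and leaves the list empty. */
static void toy_free_all(struct toy_entry **head)
{
	while (*head != NULL) {
		struct toy_entry *e = *head;

		*head = e->next;
		free(e);
	}
}
```

The trade-off raised in the reply still applies: allocating up front keeps every failure return at the top of the function, while allocating late avoids an allocate-and-free round trip on the duplicate path.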
[ewg] pull request
Please pull the following fix from git.openfabrics.org/~mmarciniszyn/scm/linux-2.6.to_ofed.

This fixes an issue revealed during OFED 1.5.3 testing.

Mike

commit 0012f25501856cf0f47397fbb57e26bf46b11b99
Author: Mike Marciniszyn <mike.marcinis...@qlogic.com>
Date:   Wed Feb 16 09:47:41 2011 -0500

    IB/qib: Prevent double completions after a timeout or RNR error

    From: Mike Marciniszyn <mike.marcinis...@qlogic.com>

    There is a double completion associated with error handling for RC QPs.

    The sequence is:
    - The do_rc_ack() routine fields an RNR nack and there are 0
      rnr_retries configured on the QP.
    - qib_error_qp() stops the pending timer
    - qib_rc_send_complete() is called from sdma_complete()
    - qib_rc_send_complete() starts the timer because the msb of the psn
      just completed says an ack is needed.
    - a bunch of flushes occur as ipoib posts wqe's to an error'ed qp
    - rc_timeout() calls qib_restart_rc()
    - qib_restart_rc() calls qib_send_complete() with an
      IB_WC_RETRY_EXC_ERR on a wqe that has already been completed in
      the past

    The fix avoids starting the timer since another packet will never arrive.

    Signed-off-by: Mike Marciniszyn <mike.marcinis...@qlogic.com>
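The sequence in the commit message can be replayed with a small state model. This is a sketch only: the actual qib patch is not shown in this thread, so guarding the timer start on an "in error" flag is an assumption about how the fix might be shaped, and every `toy_*` name is hypothetical. The model tracks one WQE and counts how often it gets completed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced QP state. */
struct toy_qp {
	bool in_error;		/* QP has been moved to the error state */
	bool timer_armed;	/* retry timer is pending */
	int completions;	/* completions generated for the one WQE */
};

/* Models the error path: stop the timer and complete the WQE in error. */
static void toy_error_qp(struct toy_qp *qp)
{
	qp->in_error = true;
	qp->timer_armed = false;
	qp->completions++;
}

/* Models the send-complete callback. With guard set (the assumed fix),
 * the timer is not started once the QP is in error, because no further
 * packet will ever arrive. */
static void toy_send_complete(struct toy_qp *qp, bool ack_requested, bool guard)
{
	if (!ack_requested)
		return;
	if (guard && qp->in_error)
		return;
	qp->timer_armed = true;
}

/* Models the timer firing: a pending timer completes the WQE again. */
static void toy_timeout(struct toy_qp *qp)
{
	if (!qp->timer_armed)
		return;
	qp->timer_armed = false;
	qp->completions++;
}

/* Replays the sequence and returns the number of completions seen. */
static int toy_run_sequence(bool guard)
{
	struct toy_qp qp = { false, true, 0 };	/* timer pending for a request */

	toy_error_qp(&qp);			/* RNR nack with 0 retries left */
	toy_send_complete(&qp, true, guard);	/* late send completion */
	toy_timeout(&qp);			/* timer expiry, if armed */
	assert(!qp.timer_armed);
	return qp.completions;
}
```

In this model `toy_run_sequence(false)` produces two completions for the same WQE and `toy_run_sequence(true)` produces one.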
Re: [ewg] pull request
On 02/16/2011 05:55 PM, Mike Marciniszyn wrote:
> Please pull the following fix from git.openfabrics.org/~mmarciniszyn/scm/linux-2.6.to_ofed.
>
> This fixes an issue revealed during OFED 1.5.3 testing.
>
> Mike

Done,

Regards,
Vladimir
[ewg] [ANNOUNCE] management tarballs release
Hi,

There is a new release of the management (OpenSM and infiniband
diagnostics) tarballs available in:

http://www.openfabrics.org/downloads/management/
(listed in http://www.openfabrics.org/downloads/management/latest.txt)

c0b24a1053ae8b0b3caf5950b3ede6dc  infiniband-diags-1.5.8.tar.gz
c2755aa360d3f29d04865ba4e2454a98  libibmad-1.3.7.tar.gz
c7575b7620615d7dfa1c7fdbbd310ec7  libibumad-1.3.7.tar.gz
df051f5f0192d369b0b904147cb045a8  opensm-3.3.8.tar.gz

All component versions are from recent master branch. Full list of changes is below.

OpenSM:
=======

Alex Netes (1):
      opensm: fixed getline pointer allocation free in osm_console_io

Eli Dorfman (Voltaire) (1):
      Wrong handling of MC create and delete traps

Hal Rosenstock (6):
      opensm/osm_state_mgr.c: Don't signal DISCOVER to SM state machine when already DISCOVERING
      opensm: Fix some typos
      osmtest/osmt_service.c: In osmt_run_service_records_flow, add missing status
      opensm/osm_ucast_ftree: When roots are not connected, update hop count but not lft
      opensm/osm_trap_rcv.c: No need to check for sweep for trap 145
      opensm: Add support for SwitchInfo:MulticastFDBTop

Ira Weiny (1):
      Add node/port/qos information to some error messages

Jason Gunthorpe (1):
      Fix autotools to include the necessary M4 files

Sasha Khapyorsky (3):
      opensm/sa: simplify osm_mcmr_rcv_find_or_create_new_mgrp() function call
      opensm/osm_node_info_rcv.c: move p_physp declaration under code block
      opensm/osm_db_files.c: malloc() return value run-time check

Stan C. Smith (2):
      replace (long*)(long) casting with transportable data type (uintptr_t)
      replace (long*)(long) casting with transportable data type (uintptr_t)

Yevgeny Kliteynik (28):
      opensm/osm_qos_policy.c: change a log message
      opensm/osm_prtn.c: removing TopSpin hack
      libvendor/osm_vendor_ibumad_sa.c: remove useless if statement
      libvendor/osm_vendor_mlx_sa.c: remove useless if statement
      opensm/osm_mtree.c: removing useless 'if' statement
      opensm/osm_sminfo_rcv.c: removing unused variable
      opensm/osm_pkey.c: removing unused function
      opensm/osm_sa_pkey_record.c: removing unused variable
      opensm/osm_sa_vlarb_record.c: removed unused variable
      opensm/osm_node_info_rcv.c: remove useless code line
      osmtest/osmtest.c: handle timeouts in PR stress test
      opensm/osm_helper.c: fix potential overrun of the array
      opensm/osm_helper.c: cosmetics - move define closer to the relevant code
      opensm/osm_mesh.c: fixing a bug in compare_switches()
      opensm/osm_subnet.c: fixing small bug in error path
      opensm/osm_db_files.c: fix small memory leak
      osmtest/osmt_slvl_vl_arb.c: handling fopen() failure
      opensm/osm_helper.c: use ARR_SIZE macro instead of hardcoded values
      osm_vl15intf.c: fixing use-after-free coredump
      opensm/osm_trap_rcv.c: fix possible core dump
      opensm/osm_ucast_ftree.c: fix small memory leak in error path
      opensm/osm_ucast_ftree.c: fixing another memory leak at error path
      opensm/osm_ucast_lash.c: small bug in calculating allocated size
      opensm/osm_pkey_mgr.c: fixing small memory leak
      opensm/osm_ucast_file.c: closing file descriptor in error path
      opensm/osm_qos_parser_y.y: fixing bunch of memory leaks on invalid values
      opensm/osm_console.c: fix memory and file descriptor leaks
      opensm/st.c: fix potential core dumps

libibumad:
==========

Jason Gunthorpe (1):
      Fix autotools to include the necessary M4 files

Mike Heinz (1):
      FW: [PATCH] umad_send.3 (man page)

Yevgeny Kliteynik (1):
      umad.{c,h}: moving stdlib.h include from C to H file

libibmad:
=========

Ira Weiny (1):
      libibmad/fields.c: Change all PortCounter names to match the Specification

Jason Gunthorpe (1):
      Fix autotools to include the necessary M4 files

infiniband-diags:
=================

Albert Chu (4):
      add --diff support to iblinkinfo
      support --diffcheck in iblinkinfo
      Add lid and node description diff options for --diffcheck in iblinkinfo
      support --filterdownports in iblinkinfo

Alex Netes (3):
      Makefile: ChangeLog and version generation script path fix
      infiniband-diags: update shared library versions
      infiniband-diags: package versions update

Eli Dorfman (Voltaire) (2):
      infiniband-diags: Do not exit when unexpected node found
      inifiband-diags: Support Voltaire switch ISR4200

Hal Rosenstock (3):
      infiniband-diags/ibtracert: Eliminate direct route (-D) option
      infiniband-diags/saquery.c: In dump_one_mcmember_record, fix flow label endian
      infiniband-diags/iblinkinfo.c: Limit some queries to switches

Ira Weiny (4):
      libibmad/fields.c: Change all PortCounter names to match the Specification
      infiniband-diags: Verify timeout value specified to diagnostics
      Further timeout paramater verification (Was: [PATCH] infiniband-diags: Verify