RE: OFA Management maintainership

2013-02-06 Thread Hal Rosenstock
Hi,

I think we all owe a debt of gratitude for Alex's excellent 2+ years of OpenSM, 
libibumad, and ibsim maintainership. I hope I can live up to the high standard 
Alex set. Thanks for all you've done Alex!

-- Hal 

-----Original Message-----
From: Alex Netes [mailto:ale...@dev.mellanox.co.il] On Behalf Of Alex Netes
Sent: Thursday, February 07, 2013 1:37 AM
To: linux-rdma@vger.kernel.org; Hal Rosenstock
Subject: OFA Management maintainership

Hi,

I want to announce that, starting today, Hal Rosenstock, whom you are 
familiar with, is going to maintain OpenSM, libibumad and ibsim development.

So starting today his trees should be considered the master development 
trees:

git://git.openfabrics.org/~halr/libibumad
git://git.openfabrics.org/~halr/opensm
git://git.openfabrics.org/~halr/ibsim

I would like to wish Hal a lot of success with the new role.
Additionally, I would like to thank the whole community for the good time 
working together. I will keep working on OpenSM and will continue to 
contribute to the community in the future.

--Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


OFA Management maintainership

2013-02-06 Thread Alex Netes
Hi,

I want to announce that, starting today, Hal Rosenstock, whom you are
familiar with, is going to maintain OpenSM, libibumad and ibsim development.

So starting today his trees should be considered the master
development trees:

git://git.openfabrics.org/~halr/libibumad
git://git.openfabrics.org/~halr/opensm
git://git.openfabrics.org/~halr/ibsim

I would like to wish Hal a lot of success with the new role.
Additionally, I would like to thank the whole community for the good time
working together. I will keep working on OpenSM and will continue to
contribute to the community in the future.

--Alex


[GIT PULL] please pull infiniband.git

2013-02-06 Thread Roland Dreier
Hi Linus,

Please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git 
tags/rdma-for-linus



IB regression fixes for 3.8:
 - Fix mlx4 VFs not working on old guests because of 64B CQE changes
 - Fix ill-considered sparse fix for qib
 - Fix IPoIB crash due to skb double destruct introduced in 3.8-rc1


Mike Marciniszyn (1):
  IB/qib: Fix for broken sparse warning fix

Or Gerlitz (1):
  mlx4_core: Fix advertisement of wrong PF context behaviour

Roland Dreier (1):
  Merge branches 'ipoib', 'mlx4' and 'qib' into for-next

Shlomo Pongratz (1):
  IPoIB: Fix crash due to skb double destruct

 drivers/infiniband/hw/qib/qib_qp.c| 11 +++
 drivers/infiniband/ulp/ipoib/ipoib_cm.c   |  6 +++---
 drivers/infiniband/ulp/ipoib/ipoib_ib.c   |  6 +++---
 drivers/net/ethernet/mellanox/mlx4/main.c |  2 +-
 4 files changed, 10 insertions(+), 15 deletions(-)


[PATCH 3/3] infiniband-diags: deprecate dump_[m|l]fts.sh scripts

2013-02-06 Thread Ira Weiny

Signed-off-by: Ira Weiny 
---
 Makefile.am|6 +-
 configure.in   |4 +-
 doc/man/dump_lfts.8.in |  177 
 doc/man/dump_mfts.8.in |  170 --
 doc/rst/dump_lfts.8.in.rst |   73 --
 doc/rst/dump_mfts.8.in.rst |   64 
 man/dump_lfts.8|2 +
 man/dump_mfts.8|2 +
 scripts/dump_lfts.sh   |   72 --
 scripts/dump_lfts.sh.in|   12 +++
 scripts/dump_mfts.sh   |   72 --
 scripts/dump_mfts.sh.in|   12 +++
 12 files changed, 33 insertions(+), 633 deletions(-)
 delete mode 100644 doc/man/dump_lfts.8.in
 delete mode 100644 doc/man/dump_mfts.8.in
 delete mode 100644 doc/rst/dump_lfts.8.in.rst
 delete mode 100644 doc/rst/dump_mfts.8.in.rst
 create mode 100644 man/dump_lfts.8
 create mode 100644 man/dump_mfts.8
 delete mode 100755 scripts/dump_lfts.sh
 create mode 100755 scripts/dump_lfts.sh.in
 delete mode 100755 scripts/dump_mfts.sh
 create mode 100755 scripts/dump_mfts.sh.in

diff --git a/Makefile.am b/Makefile.am
index 42c2c75..f44b4d6 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -47,8 +47,6 @@ man_MANS = doc/man/ibaddr.8 \
doc/man/ibccconfig.8 \
doc/man/ibccquery.8 \
doc/man/dump_fts.8 \
-   doc/man/dump_lfts.8 \
-   doc/man/dump_mfts.8 \
doc/man/iblinkinfo.8 \
doc/man/ibfindnodesusing.8 \
doc/man/ibhosts.8 \
@@ -71,7 +69,9 @@ man_MANS = doc/man/ibaddr.8 \
doc/man/smpdump.8 \
doc/man/smpquery.8 \
doc/man/vendstat.8 \
-   doc/man/infiniband-diags.8
+   doc/man/infiniband-diags.8 \
+   man/dump_lfts.8 \
+   man/dump_mfts.8
 
 # define this for the dist target
 compat_man_pages = man/ibdiscover.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \
diff --git a/configure.in b/configure.in
index b54222b..edac1e3 100644
--- a/configure.in
+++ b/configure.in
@@ -216,14 +216,14 @@ AC_CONFIG_FILES([\
scripts/ibrouters \
scripts/iblinkinfo.pl \
scripts/ibqueryerrors.pl \
+   scripts/dump_lfts.sh \
+   scripts/dump_mfts.sh \
doc/man/ibaddr.8 \
doc/man/check_lft_balance.8 \
doc/man/ibcacheedit.8 \
doc/man/ibccconfig.8 \
doc/man/ibccquery.8 \
doc/man/dump_fts.8 \
-   doc/man/dump_lfts.8 \
-   doc/man/dump_mfts.8 \
doc/man/ibhosts.8 \
doc/man/ibidsverify.8 \
doc/man/iblinkinfo.8 \
diff --git a/doc/man/dump_lfts.8.in b/doc/man/dump_lfts.8.in
deleted file mode 100644
index a75a425..000
--- a/doc/man/dump_lfts.8.in
+++ /dev/null
@@ -1,177 +0,0 @@
-.\" Man page generated from reStructeredText.
-.
-.TH DUMP_LFTS.SH 8 "@BUILD_DATE@" "" "OpenIB Diagnostics"
-.SH NAME
-DUMP_LFTS.SH \- dump InfiniBand unicast forwarding tables
-.
-.nr rst2man-indent-level 0
-.
-.de1 rstReportMargin
-\\$1 \\n[an-margin]
-level \\n[rst2man-indent-level]
-level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
--
-\\n[rst2man-indent0]
-\\n[rst2man-indent1]
-\\n[rst2man-indent2]
-..
-.de1 INDENT
-.\" .rstReportMargin pre:
-. RS \\$1
-. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
-. nr rst2man-indent-level +1
-.\" .rstReportMargin post:
-..
-.de UNINDENT
-. RE
-.\" indent \\n[an-margin]
-.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
-.nr rst2man-indent-level -1
-.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
-.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
-..
-.SH SYNOPSIS
-.sp
-dump_lfts.sh [\-h] [\-D] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] 
[>/path/to/dump\-file]
-.SH DESCRIPTION
-.sp
-dump_lfts.sh is a script which dumps the InfiniBand unciast forwarding
-tables (MFTs) in the switch nodes in the subnet.
-.sp
-The dump file format is compatible with loading into OpenSM using
-the \-R file \-U /path/to/dump\-file syntax.
-.SH OPTIONS
-.sp
-\fB\-D\fP
-dump forwarding tables using direct routed rather than LID routed SMPs
-.sp
-\fB\-h\fP
-show help
-.SS Port Selection flags
-.\" Define the common option -C
-.
-.sp
-\fB\-C, \-\-Ca \fPuse the specified ca_name.
-.\" Define the common option -P
-.
-.sp
-\fB\-P, \-\-Port \fPuse the specified ca_port.
-.\" Explanation of local port selection
-.
-.SS Local port Selection
-.sp
-Multiple port/Multiple CA support: when no IB device or port is specified
-(see the "local umad parameters" below), the libibumad library
-selects the port to use by the following criteria:
-.INDENT 0.0
-.INDENT 3.5
-.INDENT 0.0
-.IP 1. 3
-.
-the first port that is ACTIVE.
-.IP 2. 3
-.
-if not found, the first port that is UP (physical link up).
-.UNINDENT
-.sp
-If a port and/or CA name is specified, the libibumad library attempts
-to fulfill the user request, and will fail if it is not possible.
-.sp
-For example:
-.sp
-.nf
-.ft C
-ibaddr

[PATCH V3 2/3] infiniband-diags: add dump_fts tool

2013-02-06 Thread Ira Weiny



dump_fts provides a faster version of the functionality of dump_[l|m]fts.sh.
The code is based on the ibroute code and uses libibnetdisc to scan the fabric,
instead of running ibnetdiscover and letting ibroute query all that data over
again.  This improves things in 3 ways.

1) performance improves by more than an order of magnitude (about 37x in the
   run shown here).
2) this version needs far fewer MADs and thus reduces the impact on the
   fabric.
3) Everything is queried with DR paths, which ensures that the query still
   completes and gives you the information you were looking for even if the
   routing tables on the cluster are bad.  (To be fair, dump_lfts.sh has a DR
   option, but it is currently buggy.)

Example runs on the ~1400 nodes of the Hyperion test cluster show:

13:45:46 > time ./dump_lfts.sh > /dev/null

real    4m58.175s
user    0m6.407s
sys     0m17.983s

13:53:12 > time ./dump_fts > /dev/null
dump tables: linear forwarding table get failed

real    0m8.121s
user    0m3.032s
sys     0m3.342s
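
As a quick check of the numbers above, the ratio of the two wall-clock
(`real`) times can be computed directly.  The sketch below is only
illustrative; the helper names are made up for this note and are not part of
infiniband-diags.

```c
#include <stdio.h>

/* Convert a `time`-style "XmY.YYYs" reading into seconds. */
static double to_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Ratio of the old run time to the new one. */
static double speedup(double before_s, double after_s)
{
	return before_s / after_s;
}

/* Print the ratio for the two runs quoted above. */
static void print_speedup(void)
{
	double old_s = to_seconds(4, 58.175);	/* dump_lfts.sh */
	double new_s = to_seconds(0, 8.121);	/* dump_fts */

	printf("speedup: %.1fx\n", speedup(old_s, new_s));
}
```

Calling print_speedup() from a small main() reports a ratio of roughly 36.7
for this particular run; other fabrics will of course give other numbers.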

Changes from V1:
Add status and query information to error messages.
Add man page files which were missed in the first patch

Changes from V2:
Additional status and address information on error messages.
clean up error handling when fabric scan fails.

Signed-off-by: Ira Weiny 
---
 Makefile.am   |7 +-
 configure.in  |1 +
 doc/man/dump_fts.8.in |  236 ++
 doc/rst/dump_fts.8.in.rst |   85 
 infiniband-diags.spec.in  |2 +
 src/dump_fts.c|  489 +
 6 files changed, 819 insertions(+), 1 deletions(-)
 create mode 100644 doc/man/dump_fts.8.in
 create mode 100644 doc/rst/dump_fts.8.in.rst
 create mode 100644 src/dump_fts.c

diff --git a/Makefile.am b/Makefile.am
index a35a432..42c2c75 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -15,7 +15,8 @@ sbin_PROGRAMS = src/ibaddr src/ibnetdiscover src/ibping 
src/ibportstate \
src/perfquery src/sminfo src/smpdump src/smpquery \
src/saquery src/vendstat src/iblinkinfo \
src/ibqueryerrors src/ibcacheedit src/ibccquery \
-   src/ibccconfig
+   src/ibccconfig \
+   src/dump_fts
 
 if ENABLE_TEST_UTILS
 sbin_PROGRAMS += src/ibsendtrap src/mcm_rereg_test
@@ -45,6 +46,7 @@ man_MANS = doc/man/ibaddr.8 \
doc/man/ibcacheedit.8 \
doc/man/ibccconfig.8 \
doc/man/ibccquery.8 \
+   doc/man/dump_fts.8 \
doc/man/dump_lfts.8 \
doc/man/dump_mfts.8 \
doc/man/iblinkinfo.8 \
@@ -118,6 +120,9 @@ src_ibqueryerrors_LDFLAGS = -L$(top_builddir)/libibnetdisc 
-libnetdisc
 src_ibcacheedit_SOURCES = src/ibcacheedit.c
 src_ibcacheedit_LDFLAGS = -L$(top_builddir)/libibnetdisc -libnetdisc
 
+src_dump_fts_SOURCES = src/dump_fts.c
+src_dump_fts_LDFLAGS = -L$(top_builddir)/libibnetdisc -libnetdisc
+
 BUILT_SOURCES = ibdiag_version
 ibdiag_version:
if [ -x $(top_srcdir)/gen_ver.sh ] ; then \
diff --git a/configure.in b/configure.in
index ca62d5b..b54222b 100644
--- a/configure.in
+++ b/configure.in
@@ -221,6 +221,7 @@ AC_CONFIG_FILES([\
doc/man/ibcacheedit.8 \
doc/man/ibccconfig.8 \
doc/man/ibccquery.8 \
+   doc/man/dump_fts.8 \
doc/man/dump_lfts.8 \
doc/man/dump_mfts.8 \
doc/man/ibhosts.8 \
diff --git a/doc/man/dump_fts.8.in b/doc/man/dump_fts.8.in
new file mode 100644
index 000..a64c6da
--- /dev/null
+++ b/doc/man/dump_fts.8.in
@@ -0,0 +1,236 @@
+.\" Man page generated from reStructeredText.
+.
+.TH DUMP_FTS 8 "@BUILD_DATE@" "" "OpenIB Diagnostics"
+.SH NAME
+DUMP_FTS \- dump InfiniBand forwarding tables
+.
+.nr rst2man-indent-level 0
+.
+.de1 rstReportMargin
+\\$1 \\n[an-margin]
+level \\n[rst2man-indent-level]
+level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
+-
+\\n[rst2man-indent0]
+\\n[rst2man-indent1]
+\\n[rst2man-indent2]
+..
+.de1 INDENT
+.\" .rstReportMargin pre:
+. RS \\$1
+. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
+. nr rst2man-indent-level +1
+.\" .rstReportMargin post:
+..
+.de UNINDENT
+. RE
+.\" indent \\n[an-margin]
+.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
+.nr rst2man-indent-level -1
+.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
+.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
+..
+.SH SYNOPSIS
+.sp
+dump_fts [options] [ []]
+.SH DESCRIPTION
+.sp
+dump_fts is similar to ibroute but dumps tables for every switch found in an
+ibnetdiscover scan of the subnet.
+.sp
+The dump file format is compatible with loading into OpenSM using
+the \-R file \-U /path/to/dump\-file syntax.
+.SH OPTIONS
+.INDENT 0.0
+.TP
+.B \fB\-a, \-\-all\fP
+.sp
+show all lids in range, even invalid entries
+.TP
+.B \fB\-n, \-\-no_dests\fP
+.sp
+do not try to resolve destinations
+.TP
+.B \fB\-M, \-\-Multicast\fP
+.sp
+show multicast forwarding tables
+In this case, the range par

[PATCH V3 1/3] infiniband-diags: libibnetdisc add find node by lid

2013-02-06 Thread Ira Weiny


NOTE: this change adds a glib requirement to the package.

Changes since V1:
Use GINT_TO_POINTER rather than allocating keys

Changes since V2:
Use internal object everywhere rather than just hacked into
discover_fabric
Generate lid2port hash when reading cached fabrics
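
For readers unfamiliar with the idea, the lookup this patch adds amounts to a
table keyed by the 16-bit LID.  A minimal sketch follows, assuming a plain
array indexed by LID in place of the glib hash table used by the patch;
`toy_fabric`, `toy_port`, `toy_add_port()` and `toy_find_port_lid()` are
hypothetical names, not the libibnetdisc API.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_LID_COUNT 65536	/* a LID is 16 bits wide */

/* Hypothetical stand-in for a discovered port. */
struct toy_port {
	uint16_t lid;
	int portnum;
};

/* Hypothetical fabric object holding the LID -> port table. */
struct toy_fabric {
	struct toy_port *lid2port[TOY_LID_COUNT];
};

/* Record a port under its LID; done once while the fabric is scanned. */
static void toy_add_port(struct toy_fabric *fabric, struct toy_port *port)
{
	fabric->lid2port[port->lid] = port;
}

/* Return the port registered under lid, or NULL if there is none. */
static struct toy_port *toy_find_port_lid(const struct toy_fabric *fabric,
					  uint16_t lid)
{
	return fabric->lid2port[lid];
}
```

With glib, GINT_TO_POINTER() plays the role the array index plays here: the
integer LID is stored directly in the pointer-sized key, so no key has to be
allocated per entry.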

Signed-off-by: Ira Weiny 
---
 configure.in|7 ++
 infiniband-diags.spec.in|4 +-
 libibnetdisc/Makefile.am|4 +-
 libibnetdisc/include/infiniband/ibnetdisc.h |3 +
 libibnetdisc/libibnetdisc.ver   |2 +-
 libibnetdisc/src/ibnetdisc.c|  126 ++
 libibnetdisc/src/ibnetdisc_cache.c  |   32 
 libibnetdisc/src/internal.h |   15 +++-
 libibnetdisc/src/libibnetdisc.map   |1 +
 9 files changed, 132 insertions(+), 62 deletions(-)

diff --git a/configure.in b/configure.in
index 2dc60a0..ca62d5b 100644
--- a/configure.in
+++ b/configure.in
@@ -161,6 +161,13 @@ IBSCRIPTPATH_TMP2="`echo $IBSCRIPTPATH_TMP1 | sed 
's/^NONE/$ac_default_prefix/'`
 IBSCRIPTPATH="${with_ibpath_override:-`eval echo $IBSCRIPTPATH_TMP2`}"
 AC_SUBST(IBSCRIPTPATH)
 
+dnl check for glib
+PKG_CHECK_MODULES([GLIB], [glib-2.0], ac_glib=yes, ac_glib=no)
+AM_CONDITIONAL([HAVE_GLIB], test "$ac_glib" = "yes")
+if test "$ac_glib" = "yes"; then
+   AC_DEFINE([HAVE_GLIB], 1, [Define to 1 to indicate GLIB support])
+fi
+
 dnl Begin libibnetdisc stuff
 ibnetdisc_api_version=`grep LIBVERSION $srcdir/libibnetdisc/libibnetdisc.ver | 
sed 's/LIBVERSION=//'`
 if test -z $ibnetdisc_api_version; then
diff --git a/infiniband-diags.spec.in b/infiniband-diags.spec.in
index d3fcd13..9cd195b 100644
--- a/infiniband-diags.spec.in
+++ b/infiniband-diags.spec.in
@@ -11,8 +11,8 @@ Group: System Environment/Libraries
 BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
 Source: http://www.openfabrics.org/downloads/management/@TARBALL@
 Url: http://openfabrics.org/
-BuildRequires: libibmad-devel, opensm-devel, libibumad-devel
-Requires: libibmad, opensm-libs, libibumad
+BuildRequires: libibmad-devel, opensm-devel, libibumad-devel, glib-devel
+Requires: libibmad, opensm-libs, libibumad, glib
 Provides: perl(IBswcountlimits)
 Obsoletes: openib-diags
 
diff --git a/libibnetdisc/Makefile.am b/libibnetdisc/Makefile.am
index fbf0e60..d05604f 100644
--- a/libibnetdisc/Makefile.am
+++ b/libibnetdisc/Makefile.am
@@ -24,10 +24,10 @@ endif
 
 libibnetdisc_la_SOURCES = src/ibnetdisc.c src/ibnetdisc_cache.c src/chassis.c \
  src/chassis.h src/internal.h src/query_smp.c
-libibnetdisc_la_CFLAGS = -Wall $(DBGFLAGS)
+libibnetdisc_la_CFLAGS = -Wall $(DBGFLAGS) $(GLIB_CFLAGS)
 libibnetdisc_la_LDFLAGS = -version-info $(ibnetdisc_api_version) \
-export-dynamic $(libibnetdisc_version_script) \
-   -libmad
+   -libmad $(GLIB_LIBS)
 libibnetdisc_la_DEPENDENCIES = $(srcdir)/src/libibnetdisc.map
 
 libibnetdiscincludedir = $(includedir)/infiniband
diff --git a/libibnetdisc/include/infiniband/ibnetdisc.h 
b/libibnetdisc/include/infiniband/ibnetdisc.h
index e41c92c..acde1dc 100644
--- a/libibnetdisc/include/infiniband/ibnetdisc.h
+++ b/libibnetdisc/include/infiniband/ibnetdisc.h
@@ -231,6 +231,9 @@ IBND_EXPORT ibnd_port_t *ibnd_find_port_guid(ibnd_fabric_t 
* fabric,
uint64_t guid);
 IBND_EXPORT ibnd_port_t *ibnd_find_port_dr(ibnd_fabric_t * fabric,
char *dr_str);
+IBND_EXPORT ibnd_port_t *ibnd_find_port_lid(ibnd_fabric_t * fabric,
+   uint16_t lid);
+
 typedef void (*ibnd_iter_port_func_t) (ibnd_port_t * port, void *user_data);
 IBND_EXPORT void ibnd_iter_ports(ibnd_fabric_t * fabric,
ibnd_iter_port_func_t func, void *user_data);
diff --git a/libibnetdisc/libibnetdisc.ver b/libibnetdisc/libibnetdisc.ver
index c513f2a..59fca19 100644
--- a/libibnetdisc/libibnetdisc.ver
+++ b/libibnetdisc/libibnetdisc.ver
@@ -6,4 +6,4 @@
 # API_REV - advance on any added API
 # RUNNING_REV - advance any change to the vendor files
 # AGE - number of backward versions the API still supports
-LIBVERSION=7:0:2
+LIBVERSION=8:0:3
diff --git a/libibnetdisc/src/ibnetdisc.c b/libibnetdisc/src/ibnetdisc.c
index 3a7dd8f..9d120dd 100644
--- a/libibnetdisc/src/ibnetdisc.c
+++ b/libibnetdisc/src/ibnetdisc.c
@@ -98,10 +98,10 @@ static int add_port_to_dpath(ib_dr_path_t * path, int 
nextport)
 static int retract_dpath(smp_engine_t * engine, ib_portid_t * portid)
 {
ibnd_scan_t *scan = engine->user_data;
-   ibnd_fabric_t *fabric = scan->fabric;
+   f_internal_t *f_int = scan->f_int;
 
if (scan->cfg->max_hops &&
-   fabric->maxhops_discovered > scan->cfg->max_hops)
+   f_int->fabric.maxhops_discovered > scan->cfg->max_hops)
return 0;
 
/* this may seem wrong but the only time we would r

Re: NFS over RDMA crashing

2013-02-06 Thread Steve Wise

On 2/6/2013 4:24 PM, J. Bruce Fields wrote:

On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:

When killing mount command that got stuck:
---

BUG: unable to handle kernel paging request at 880324dc7ff8
IP: [] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
PGD 1a0c063 PUD 32f82e063 PMD 32f2fd063 PTE 800324dc7161
Oops: 0003 [#1] PREEMPT SMP
Modules linked in: md5 ib_ipoib xprtrdma svcrdma rdma_cm ib_cm iw_cm
ib_addr nfsd exportfs netconsole ip6table_filter ip6_tables
iptable_filter ip_tables ebtable_nat nfsv3 nfs_acl ebtables x_tables
nfsv4 auth_rpcgss nfs lockd autofs4 sunrpc target_core_iblock
target_core_file target_core_pscsi target_core_mod configfs 8021q
bridge stp llc ipv6 dm_mirror dm_region_hash dm_log vhost_net
macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support kvm_intel
kvm crc32c_intel microcode pcspkr joydev i2c_i801 lpc_ich mfd_core
ehci_pci ehci_hcd sg ioatdma ixgbe mdio mlx4_ib ib_sa ib_mad ib_core
mlx4_en mlx4_core igb hwmon dca ptp pps_core button dm_mod ext3 jbd
sd_mod ata_piix libata uhci_hcd megaraid_sas scsi_mod
CPU 6
Pid: 4744, comm: nfsd Not tainted 3.8.0-rc5+ #4 Supermicro
X8DTH-i/6/iF/6F/X8DTH
RIP: 0010:[]  []
rdma_read_xdr+0x8bb/0xd40 [svcrdma]
RSP: 0018:880324c3dbf8  EFLAGS: 00010297
RAX: 880324dc8000 RBX: 0001 RCX: 880324dd8428
RDX: 880324dc7ff8 RSI: 880324dd8428 RDI: 81149618
RBP: 880324c3dd78 R08: 60f9c860 R09: 0001
R10: 880324dd8000 R11: 0001 R12: 8806299dcb10
R13: 0003 R14: 0001 R15: 0010
FS:  () GS:88063fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 880324dc7ff8 CR3: 01a0b000 CR4: 07e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process nfsd (pid: 4744, threadinfo 880324c3c000, task 88033055)
Stack:
  880324c3dc78 880324c3dcd8 0282 880631cec000
  880324dd8000 88062ed33040 000124c3dc48 880324dd8000
  88062ed33058 880630ce2b90 8806299e8000 0003
Call Trace:
  [] svc_rdma_recvfrom+0x3ee/0xd80 [svcrdma]
  [] ? try_to_wake_up+0x2f0/0x2f0
  [] svc_recv+0x3ef/0x4b0 [sunrpc]
  [] ? nfsd_svc+0x740/0x740 [nfsd]
  [] nfsd+0xad/0x130 [nfsd]
  [] ? nfsd_svc+0x740/0x740 [nfsd]
  [] kthread+0xd6/0xe0
  [] ? __init_kthread_worker+0x70/0x70
  [] ret_from_fork+0x7c/0xb0
  [] ? __init_kthread_worker+0x70/0x70
Code: 63 c2 49 8d 8c c2 18 02 00 00 48 39 ce 77 e1 49 8b 82 40 0a 00
00 48 39 c6 0f 84 92 f7 ff ff 90 48 8d 50 f8 49 89 92 40 0a 00 00
<48> c7 40 f8 00 00 00 00 49 8b 82 40 0a 00 00 49 3b 82 30 0a 00
RIP  [] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
  RSP 
CR2: 880324dc7ff8
---[ end trace 06d0384754e9609a ]---


It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
"nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
is responsible for the crash (it seems to be crashing in
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527)
It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
CONFIG_DEBUG_RODATA enabled. I did not try to disable them yet.

When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
was no longer getting the server crashes,
so the rest of my tests were done using that point (it is somewhere
in the middle of 3.7.0-rc2).

OK, so this part's clearly my fault--I'll work on a patch, but the
rdma's use of the ->rq_pages array is pretty confusing.


Maybe Tom can shed some light?



Re: NFS over RDMA crashing

2013-02-06 Thread J. Bruce Fields
On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
> When killing mount command that got stuck:
> ---
> 
> BUG: unable to handle kernel paging request at 880324dc7ff8
> IP: [] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
> PGD 1a0c063 PUD 32f82e063 PMD 32f2fd063 PTE 800324dc7161
> Oops: 0003 [#1] PREEMPT SMP
> Modules linked in: md5 ib_ipoib xprtrdma svcrdma rdma_cm ib_cm iw_cm
> ib_addr nfsd exportfs netconsole ip6table_filter ip6_tables
> iptable_filter ip_tables ebtable_nat nfsv3 nfs_acl ebtables x_tables
> nfsv4 auth_rpcgss nfs lockd autofs4 sunrpc target_core_iblock
> target_core_file target_core_pscsi target_core_mod configfs 8021q
> bridge stp llc ipv6 dm_mirror dm_region_hash dm_log vhost_net
> macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support kvm_intel
> kvm crc32c_intel microcode pcspkr joydev i2c_i801 lpc_ich mfd_core
> ehci_pci ehci_hcd sg ioatdma ixgbe mdio mlx4_ib ib_sa ib_mad ib_core
> mlx4_en mlx4_core igb hwmon dca ptp pps_core button dm_mod ext3 jbd
> sd_mod ata_piix libata uhci_hcd megaraid_sas scsi_mod
> CPU 6
> Pid: 4744, comm: nfsd Not tainted 3.8.0-rc5+ #4 Supermicro
> X8DTH-i/6/iF/6F/X8DTH
> RIP: 0010:[]  []
> rdma_read_xdr+0x8bb/0xd40 [svcrdma]
> RSP: 0018:880324c3dbf8  EFLAGS: 00010297
> RAX: 880324dc8000 RBX: 0001 RCX: 880324dd8428
> RDX: 880324dc7ff8 RSI: 880324dd8428 RDI: 81149618
> RBP: 880324c3dd78 R08: 60f9c860 R09: 0001
> R10: 880324dd8000 R11: 0001 R12: 8806299dcb10
> R13: 0003 R14: 0001 R15: 0010
> FS:  () GS:88063fc0() knlGS:
> CS:  0010 DS:  ES:  CR0: 8005003b
> CR2: 880324dc7ff8 CR3: 01a0b000 CR4: 07e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process nfsd (pid: 4744, threadinfo 880324c3c000, task 88033055)
> Stack:
>  880324c3dc78 880324c3dcd8 0282 880631cec000
>  880324dd8000 88062ed33040 000124c3dc48 880324dd8000
>  88062ed33058 880630ce2b90 8806299e8000 0003
> Call Trace:
>  [] svc_rdma_recvfrom+0x3ee/0xd80 [svcrdma]
>  [] ? try_to_wake_up+0x2f0/0x2f0
>  [] svc_recv+0x3ef/0x4b0 [sunrpc]
>  [] ? nfsd_svc+0x740/0x740 [nfsd]
>  [] nfsd+0xad/0x130 [nfsd]
>  [] ? nfsd_svc+0x740/0x740 [nfsd]
>  [] kthread+0xd6/0xe0
>  [] ? __init_kthread_worker+0x70/0x70
>  [] ret_from_fork+0x7c/0xb0
>  [] ? __init_kthread_worker+0x70/0x70
> Code: 63 c2 49 8d 8c c2 18 02 00 00 48 39 ce 77 e1 49 8b 82 40 0a 00
> 00 48 39 c6 0f 84 92 f7 ff ff 90 48 8d 50 f8 49 89 92 40 0a 00 00
> <48> c7 40 f8 00 00 00 00 49 8b 82 40 0a 00 00 49 3b 82 30 0a 00
> RIP  [] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
>  RSP 
> CR2: 880324dc7ff8
> ---[ end trace 06d0384754e9609a ]---
> 
> 
> It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
> "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
> is responsible for the crash (it seems to be crashing in
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527)
> It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
> CONFIG_DEBUG_RODATA enabled. I did not try to disable them yet.
> 
> When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
> was no longer getting the server crashes,
> so the rest of my tests were done using that point (it is somewhere
> in the middle of 3.7.0-rc2).

OK, so this part's clearly my fault--I'll work on a patch, but the
rdma's use of the ->rq_pages array is pretty confusing.

--b.


Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8

2013-02-06 Thread Vu Pham

Bart Van Assche wrote:

On 02/05/13 21:54, Or Gerlitz wrote:
On Tue, Feb 5, 2013 at 6:25 PM, Bart Van Assche  
wrote:

On 02/04/13 22:11, Or Gerlitz wrote:

Bart, I'd like to sharpen the point: could you please clarify if the
series posted to linux-rdma stands for itself in the sense that SRP HA
scheme X (please state it) now works/better when the patches applied
on top of the latest 3.8-rc cut? OR for X to do better/work, one needs
this series AND the one you posted to linux-scsi.


Hello Or,

A huge number of patches have been taken upstream between 3.8-rc1 and 
3.8-rc6. I have retested these three patches with 3.8-rc6 and would 
appreciate if you would also repeat your tests.


Thanks,

Bart.

Hello Bart,

I tested your 3.8 v3 patchset. I did the following:
- clone & checkout Roland's ib tree for-next branch
- applied Bart's 3.8 v3 patchset
- applied "save & restore host_scribble during error handling" patch - 
http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg17809.html


I have two paths to target thru port 1 & 2 (scsi_host host9 & host10)

- run I/Os
- disable port 1 @ 19:11:30
- error recovery for host9 kick in @ 19:12:04
- multipath remove the path, I/Os fail-over @ 19:12:51
- error recovery was still going on with host9 (sysfs entry for host9 
still intact)

- enable port 1 @19:15:00
- host9 reconnects to the target through error recovery and the multipathd 
module reinstates the path in the kernel; then host9 is REMOVED, and usermode 
"multipath -l" does not show a reinstated path through host9


Feb  6 19:15:04 vsa30 kernel: scsi host9: SRP abort called
Feb  6 19:15:05 vsa30 multipathd: overflow in attribute 
'/sys/devices/pci:00/:00:02.0/:02:00.0/host9/target9:0:0/9:0:0:2/state'

Feb  6 19:15:14 vsa30 kernel: scsi host9: SRP abort called
Feb  6 19:15:14 vsa30 kernel: scsi host9: SRP reset_device called
Feb  6 19:15:14 vsa30 kernel: scsi host9: ib_srp: SRP reset_host called
Feb  6 19:15:14 vsa30 kernel: scsi host9: ib_srp: reconnect succeeded
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c440050a522180003: sdd 
- tur checker reports path is up

Feb  6 19:15:26 vsa30 multipathd: 8:48: reinstated
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c440050a522180003: 
remaining active paths: 2
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c440050a522180002: sdc 
- tur checker reports path is up

Feb  6 19:15:26 vsa30 multipathd: 8:32: reinstated
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c440050a522180002: 
remaining active paths: 2

Feb  6 19:15:26 vsa30 multipathd: sdc: remove path (uevent)
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c440050a522180002: 
load table [0 409600 multipath 0 0 1 1 round-robin 0 1 1 8:80 1]
Feb  6 19:15:26 vsa30 multipathd: sdc: path removed from map 
3600144f0665c440050a522180002

Feb  6 19:15:26 vsa30 kernel: sd 9:0:0:1: [sdc] Synchronizing SCSI cache
Feb  6 19:15:26 vsa30 multipathd: sdd: remove path (uevent)
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c440050a522180003: 
load table [0 409600 multipath 0 0 1 1 round-robin 0 1 1 8:96 1]
Feb  6 19:15:26 vsa30 multipathd: sdd: path removed from map 
3600144f0665c440050a522180003

Feb  6 19:15:26 vsa30 kernel: sd 9:0:0:2: [sdd] Synchronizing SCSI cache

- disable port 2 @19:22:50
- error recovery kicked in on host10 @ 19:23:40
- I/Os failed with NO path to target @ 19:24:27
- without enabling port 2, error recovery kept running on host10 until 
19:57:52 and then stopped.
- host10 was still in sysfs /sys/class/scsi_host/host10 & taking 
reference on ib_srp module

- enable port 2 - nothing happened.

Conclusion:
1. If the port/path stays disabled long enough (>35 minutes), we are left 
with a dangling scsi host.
2. If the port is re-enabled within 30 minutes, the scsi host re-establishes 
the connection and the path is reinstated, but then the scsi_host is removed 
(no entry in sysfs).


I attached a log here to show what happened above.

thanks,
-vu


messages.bz2
Description: Binary data


[PATCH 38/77] IB/core: convert to idr_alloc()

2013-02-06 Thread Tejun Heo
Convert to the much saner new idr interface.

Only compile tested.

v2: Mike triggered WARN_ON() in idr_preload() because send_mad(),
which may be used from non-process context, was calling
idr_preload() unconditionally.  Preload iff @gfp_mask has
__GFP_WAIT.
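
To illustrate the range semantics the converted call sites rely on:
idr_alloc() takes a [start, end) range and returns the allocated ID or a
negative errno, so passing [snum, snum + 1) asks for exactly one ID.  The
following is a userspace toy model of that behaviour, not kernel code;
`toy_idr`, `toy_idr_alloc()` and `toy_alloc_port()` are hypothetical names,
and the fixed table size is an assumption made to keep the sketch short.
Preloading and locking are not modelled.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_IDR_SIZE 64		/* arbitrary capacity for the sketch */

/* Toy ID table: slot i holds the pointer registered under ID i. */
struct toy_idr {
	void *slot[TOY_IDR_SIZE];
};

/*
 * Allocate the lowest free ID in [start, end) and bind it to ptr.
 * As with idr_alloc(), end <= 0 means "no upper limit" (here: the table
 * size).  Returns the ID, or -ENOSPC if the range has no free ID.
 * ptr must be non-NULL, since NULL marks a free slot in this model.
 */
static int toy_idr_alloc(struct toy_idr *idr, void *ptr, int start, int end)
{
	int id;

	if (start < 0)
		return -EINVAL;
	if (end <= 0 || end > TOY_IDR_SIZE)
		end = TOY_IDR_SIZE;

	for (id = start; id < end; id++) {
		if (!idr->slot[id]) {
			idr->slot[id] = ptr;
			return id;
		}
	}
	return -ENOSPC;
}

/*
 * Request exactly ID snum, mirroring the shape of the cma_alloc_port()
 * conversion: an occupied ID is reported as "address not available".
 */
static int toy_alloc_port(struct toy_idr *idr, void *ptr, int snum)
{
	int ret = toy_idr_alloc(idr, ptr, snum, snum + 1);

	if (ret < 0)
		return ret == -ENOSPC ? -EADDRNOTAVAIL : ret;
	return 0;
}
```

In this model a second toy_alloc_port() call for the same snum fails with
-EADDRNOTAVAIL, which is the case the old code detected by comparing the
returned port against snum.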

Signed-off-by: Tejun Heo 
Reviewed-by: Sean Hefty 
Reported-by: "Marciniszyn, Mike" 
Cc: Roland Dreier 
Cc: Sean Hefty 
Cc: Hal Rosenstock 
Cc: linux-rdma@vger.kernel.org
---
 drivers/infiniband/core/cm.c | 22 +++---
 drivers/infiniband/core/cma.c| 24 +++-
 drivers/infiniband/core/sa_query.c   | 18 ++
 drivers/infiniband/core/ucm.c| 16 
 drivers/infiniband/core/ucma.c   | 32 
 drivers/infiniband/core/uverbs_cmd.c | 17 -
 6 files changed, 48 insertions(+), 81 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 394fea2..98281fe 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -382,20 +382,21 @@ static int cm_init_av_by_path(struct ib_sa_path_rec 
*path, struct cm_av *av)
 static int cm_alloc_id(struct cm_id_private *cm_id_priv)
 {
unsigned long flags;
-   int ret, id;
+   int id;
static int next_id;
 
-   do {
-   spin_lock_irqsave(&cm.lock, flags);
-   ret = idr_get_new_above(&cm.local_id_table, cm_id_priv,
-   next_id, &id);
-   if (!ret)
-   next_id = ((unsigned) id + 1) & MAX_IDR_MASK;
-   spin_unlock_irqrestore(&cm.lock, flags);
-   } while( (ret == -EAGAIN) && idr_pre_get(&cm.local_id_table, 
GFP_KERNEL) );
+   idr_preload(GFP_KERNEL);
+   spin_lock_irqsave(&cm.lock, flags);
+
+   id = idr_alloc(&cm.local_id_table, cm_id_priv, next_id, 0, GFP_NOWAIT);
+   if (id >= 0)
+   next_id = ((unsigned) id + 1) & MAX_IDR_MASK;
+
+   spin_unlock_irqrestore(&cm.lock, flags);
+   idr_preload_end();
 
cm_id_priv->id.local_id = (__force __be32)id ^ cm.random_id_operand;
-   return ret;
+   return id < 0 ? id : 0;
 }
 
 static void cm_free_id(__be32 local_id)
@@ -3844,7 +3845,6 @@ static int __init ib_cm_init(void)
cm.remote_sidr_table = RB_ROOT;
idr_init(&cm.local_id_table);
get_random_bytes(&cm.random_id_operand, sizeof cm.random_id_operand);
-   idr_pre_get(&cm.local_id_table, GFP_KERNEL);
INIT_LIST_HEAD(&cm.timewait_list);
 
ret = class_register(&cm_class);
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d789eea..c32eeaa 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2143,33 +2143,23 @@ static int cma_alloc_port(struct idr *ps, struct 
rdma_id_private *id_priv,
  unsigned short snum)
 {
struct rdma_bind_list *bind_list;
-   int port, ret;
+   int ret;
 
bind_list = kzalloc(sizeof *bind_list, GFP_KERNEL);
if (!bind_list)
return -ENOMEM;
 
-   do {
-   ret = idr_get_new_above(ps, bind_list, snum, &port);
-   } while ((ret == -EAGAIN) && idr_pre_get(ps, GFP_KERNEL));
-
-   if (ret)
-   goto err1;
-
-   if (port != snum) {
-   ret = -EADDRNOTAVAIL;
-   goto err2;
-   }
+   ret = idr_alloc(ps, bind_list, snum, snum + 1, GFP_KERNEL);
+   if (ret < 0)
+   goto err;
 
bind_list->ps = ps;
-   bind_list->port = (unsigned short) port;
+   bind_list->port = (unsigned short)ret;
cma_bind_port(bind_list, id_priv);
return 0;
-err2:
-   idr_remove(ps, port);
-err1:
+err:
kfree(bind_list);
-   return ret;
+   return ret == -ENOSPC ? -EADDRNOTAVAIL : ret;
 }
 
 static int cma_alloc_any_port(struct idr *ps, struct rdma_id_private *id_priv)
diff --git a/drivers/infiniband/core/sa_query.c 
b/drivers/infiniband/core/sa_query.c
index a8905ab..934f45e 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -611,19 +611,21 @@ static void init_mad(struct ib_sa_mad *mad, struct 
ib_mad_agent *agent)
 
 static int send_mad(struct ib_sa_query *query, int timeout_ms, gfp_t gfp_mask)
 {
+   bool preload = gfp_mask & __GFP_WAIT;
unsigned long flags;
int ret, id;
 
-retry:
-   if (!idr_pre_get(&query_idr, gfp_mask))
-   return -ENOMEM;
+   if (preload)
+   idr_preload(gfp_mask);
spin_lock_irqsave(&idr_lock, flags);
-   ret = idr_get_new(&query_idr, query, &id);
+
+   id = idr_alloc(&query_idr, query, 0, 0, GFP_NOWAIT);
+
spin_unlock_irqrestore(&idr_lock, flags);
-   if (ret == -EAGAIN)
-   goto retry;
-   if (ret)
-   return ret;
+   if (preload)
+   idr_preload_end();
+   if (id < 0)
+   re

[PATCH 40/77] IB/cxgb3: convert to idr_alloc()

2013-02-06 Thread Tejun Heo
Convert to the much saner new idr interface.

Only compile tested.

Signed-off-by: Tejun Heo 
Reviewed-by: Steve Wise 
Cc: linux-rdma@vger.kernel.org
---
 drivers/infiniband/hw/cxgb3/iwch.h | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch.h 
b/drivers/infiniband/hw/cxgb3/iwch.h
index a1c4457..8378622 100644
--- a/drivers/infiniband/hw/cxgb3/iwch.h
+++ b/drivers/infiniband/hw/cxgb3/iwch.h
@@ -153,19 +153,17 @@ static inline int insert_handle(struct iwch_dev *rhp, 
struct idr *idr,
void *handle, u32 id)
 {
int ret;
-   int newid;
-
-   do {
-   if (!idr_pre_get(idr, GFP_KERNEL)) {
-   return -ENOMEM;
-   }
-   spin_lock_irq(&rhp->lock);
-   ret = idr_get_new_above(idr, handle, id, &newid);
-   BUG_ON(newid != id);
-   spin_unlock_irq(&rhp->lock);
-   } while (ret == -EAGAIN);
-
-   return ret;
+
+   idr_preload(GFP_KERNEL);
+   spin_lock_irq(&rhp->lock);
+
+   ret = idr_alloc(idr, handle, id, id + 1, GFP_NOWAIT);
+
+   spin_unlock_irq(&rhp->lock);
+   idr_preload_end();
+
+   BUG_ON(ret == -ENOSPC);
+   return ret < 0 ? ret : 0;
 }
 
 static inline void remove_handle(struct iwch_dev *rhp, struct idr *idr, u32 id)
-- 
1.8.1
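
For readers following the conversion, here is a minimal userspace sketch of
the range semantics the new call sites rely on: idr_alloc() returns an ID in
[start, end), treats end <= 0 as "no upper bound", and reports an exhausted
range with -ENOSPC. Everything named toy_* (and TOY_MAX_IDS) is hypothetical
and only models that behaviour. It is not the kernel implementation, and it
ignores locking and preloading entirely.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_IDS 64	/* hypothetical capacity of this toy table */

struct toy_idr {
	void *slots[TOY_MAX_IDS];
	unsigned char used[TOY_MAX_IDS];
};

/*
 * Model of the idr_alloc() range rule: return the lowest free ID in
 * [start, end), or -ENOSPC if that range is full. end <= 0 means
 * "no upper bound", which here is the table capacity.
 */
static int toy_idr_alloc(struct toy_idr *idr, void *ptr, int start, int end)
{
	int id;

	if (start < 0)
		return -EINVAL;
	if (end <= 0 || end > TOY_MAX_IDS)
		end = TOY_MAX_IDS;

	for (id = start; id < end; id++) {
		if (!idr->used[id]) {
			idr->used[id] = 1;
			idr->slots[id] = ptr;
			return id;
		}
	}
	return -ENOSPC;
}

/*
 * Same shape as insert_handle() above: ask for exactly one ID by passing
 * [id, id + 1). Unlike the driver, this sketch returns -ENOSPC to the
 * caller instead of treating it as a BUG.
 */
static int toy_insert_exact(struct toy_idr *idr, void *ptr, int id)
{
	int ret = toy_idr_alloc(idr, ptr, id, id + 1);

	return ret < 0 ? ret : 0;
}
```

Passing id and id + 1 as the bounds is what lets the converted code drop the
old "newid != id" comparison: a busy slot now surfaces as -ENOSPC.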



[PATCH 39/77] IB/amso1100: convert to idr_alloc()

2013-02-06 Thread Tejun Heo
Convert to the much saner new idr interface.

Only compile tested.

Signed-off-by: Tejun Heo 
Reviewed-by: Steve Wise 
Cc: Tom Tucker 
Cc: linux-rdma@vger.kernel.org
---
 drivers/infiniband/hw/amso1100/c2_qp.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/amso1100/c2_qp.c 
b/drivers/infiniband/hw/amso1100/c2_qp.c
index 28cd5cb..0ab826b 100644
--- a/drivers/infiniband/hw/amso1100/c2_qp.c
+++ b/drivers/infiniband/hw/amso1100/c2_qp.c
@@ -382,14 +382,17 @@ static int c2_alloc_qpn(struct c2_dev *c2dev, struct 
c2_qp *qp)
 {
int ret;
 
-do {
-   spin_lock_irq(&c2dev->qp_table.lock);
-   ret = idr_get_new_above(&c2dev->qp_table.idr, qp,
-   c2dev->qp_table.last++, &qp->qpn);
-   spin_unlock_irq(&c2dev->qp_table.lock);
-} while ((ret == -EAGAIN) &&
-idr_pre_get(&c2dev->qp_table.idr, GFP_KERNEL));
-   return ret;
+   idr_preload(GFP_KERNEL);
+   spin_lock_irq(&c2dev->qp_table.lock);
+
+   ret = idr_alloc(&c2dev->qp_table.idr, qp, c2dev->qp_table.last++, 0,
+   GFP_NOWAIT);
+   if (ret >= 0)
+   qp->qpn = ret;
+
+   spin_unlock_irq(&c2dev->qp_table.lock);
+   idr_preload_end();
+   return ret < 0 ? ret : 0;
 }
 
 static void c2_free_qpn(struct c2_dev *c2dev, int qpn)
-- 
1.8.1



[PATCH 43/77] IB/ipath: convert to idr_alloc()

2013-02-06 Thread Tejun Heo
Convert to the much saner new idr interface.

Only compile tested.

Signed-off-by: Tejun Heo 
Cc: Mike Marciniszyn 
Cc: linux-rdma@vger.kernel.org
---
 drivers/infiniband/hw/ipath/ipath_driver.c | 16 
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c 
b/drivers/infiniband/hw/ipath/ipath_driver.c
index 7b371f5..fcdaeea 100644
--- a/drivers/infiniband/hw/ipath/ipath_driver.c
+++ b/drivers/infiniband/hw/ipath/ipath_driver.c
@@ -194,11 +194,6 @@ static struct ipath_devdata *ipath_alloc_devdata(struct 
pci_dev *pdev)
struct ipath_devdata *dd;
int ret;
 
-   if (!idr_pre_get(&unit_table, GFP_KERNEL)) {
-   dd = ERR_PTR(-ENOMEM);
-   goto bail;
-   }
-
dd = vzalloc(sizeof(*dd));
if (!dd) {
dd = ERR_PTR(-ENOMEM);
@@ -206,9 +201,10 @@ static struct ipath_devdata *ipath_alloc_devdata(struct 
pci_dev *pdev)
}
dd->ipath_unit = -1;
 
+   idr_preload(GFP_KERNEL);
spin_lock_irqsave(&ipath_devs_lock, flags);
 
-   ret = idr_get_new(&unit_table, dd, &dd->ipath_unit);
+   ret = idr_alloc(&unit_table, dd, 0, 0, GFP_NOWAIT);
if (ret < 0) {
printk(KERN_ERR IPATH_DRV_NAME
   ": Could not allocate unit ID: error %d\n", -ret);
@@ -216,6 +212,7 @@ static struct ipath_devdata *ipath_alloc_devdata(struct 
pci_dev *pdev)
dd = ERR_PTR(ret);
goto bail_unlock;
}
+   dd->ipath_unit = ret;
 
dd->pcidev = pdev;
pci_set_drvdata(pdev, dd);
@@ -224,7 +221,7 @@ static struct ipath_devdata *ipath_alloc_devdata(struct 
pci_dev *pdev)
 
 bail_unlock:
spin_unlock_irqrestore(&ipath_devs_lock, flags);
-
+   idr_preload_end();
 bail:
return dd;
 }
@@ -2503,11 +2500,6 @@ static int __init infinipath_init(void)
 * the PCI subsystem.
 */
idr_init(&unit_table);
-   if (!idr_pre_get(&unit_table, GFP_KERNEL)) {
-   printk(KERN_ERR IPATH_DRV_NAME ": idr_pre_get() failed\n");
-   ret = -ENOMEM;
-   goto bail;
-   }
 
ret = pci_register_driver(&ipath_driver);
if (ret < 0) {
-- 
1.8.1



[PATCH 46/77] IB/qib: convert to idr_alloc()

2013-02-06 Thread Tejun Heo
Convert to the much saner new idr interface.

Only compile tested.

Signed-off-by: Tejun Heo 
Cc: Mike Marciniszyn 
Cc: linux-rdma@vger.kernel.org
---
 drivers/infiniband/hw/qib/qib_init.c | 21 -
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_init.c 
b/drivers/infiniband/hw/qib/qib_init.c
index ddf066d..50e33aa 100644
--- a/drivers/infiniband/hw/qib/qib_init.c
+++ b/drivers/infiniband/hw/qib/qib_init.c
@@ -1060,22 +1060,23 @@ struct qib_devdata *qib_alloc_devdata(struct pci_dev 
*pdev, size_t extra)
struct qib_devdata *dd;
int ret;
 
-   if (!idr_pre_get(&qib_unit_table, GFP_KERNEL)) {
-   dd = ERR_PTR(-ENOMEM);
-   goto bail;
-   }
-
dd = (struct qib_devdata *) ib_alloc_device(sizeof(*dd) + extra);
if (!dd) {
dd = ERR_PTR(-ENOMEM);
goto bail;
}
 
+   idr_preload(GFP_KERNEL);
spin_lock_irqsave(&qib_devs_lock, flags);
-   ret = idr_get_new(&qib_unit_table, dd, &dd->unit);
-   if (ret >= 0)
+
+   ret = idr_alloc(&qib_unit_table, dd, 0, 0, GFP_NOWAIT);
+   if (ret >= 0) {
+   dd->unit = ret;
list_add(&dd->list, &qib_dev_list);
+   }
+
spin_unlock_irqrestore(&qib_devs_lock, flags);
+   idr_preload_end();
 
if (ret < 0) {
qib_early_err(&pdev->dev,
@@ -1180,11 +1181,6 @@ static int __init qlogic_ib_init(void)
 * the PCI subsystem.
 */
idr_init(&qib_unit_table);
-   if (!idr_pre_get(&qib_unit_table, GFP_KERNEL)) {
-   pr_err("idr_pre_get() failed\n");
-   ret = -ENOMEM;
-   goto bail_cq_wq;
-   }
 
ret = pci_register_driver(&qib_driver);
if (ret < 0) {
@@ -1199,7 +1195,6 @@ static int __init qlogic_ib_init(void)
 
 bail_unit:
idr_destroy(&qib_unit_table);
-bail_cq_wq:
destroy_workqueue(qib_cq_wq);
 bail_dev:
qib_dev_cleanup();
-- 
1.8.1



[PATCH 44/77] IB/mlx4: convert to idr_alloc()

2013-02-06 Thread Tejun Heo
Convert to the much saner new idr interface.

Only compile tested.

Signed-off-by: Tejun Heo 
Cc: Jack Morgenstein 
Cc: Or Gerlitz 
Cc: Roland Dreier 
Cc: linux-rdma@vger.kernel.org
---
 drivers/infiniband/hw/mlx4/cm.c | 32 +++-
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
index dbc99d4..80e59ed 100644
--- a/drivers/infiniband/hw/mlx4/cm.c
+++ b/drivers/infiniband/hw/mlx4/cm.c
@@ -203,7 +203,7 @@ static void sl_id_map_add(struct ib_device *ibdev, struct 
id_map_entry *new)
 static struct id_map_entry *
 id_map_alloc(struct ib_device *ibdev, int slave_id, u32 sl_cm_id)
 {
-   int ret, id;
+   int ret;
static int next_id;
struct id_map_entry *ent;
struct mlx4_ib_sriov *sriov = &to_mdev(ibdev)->sriov;
@@ -220,25 +220,23 @@ id_map_alloc(struct ib_device *ibdev, int slave_id, u32 
sl_cm_id)
ent->dev = to_mdev(ibdev);
INIT_DELAYED_WORK(&ent->timeout, id_map_ent_timeout);
 
-   do {
-   spin_lock(&to_mdev(ibdev)->sriov.id_map_lock);
-   ret = idr_get_new_above(&sriov->pv_id_table, ent,
-   next_id, &id);
-   if (!ret) {
-   next_id = ((unsigned) id + 1) & MAX_IDR_MASK;
-   ent->pv_cm_id = (u32)id;
-   sl_id_map_add(ibdev, ent);
-   }
+   idr_preload(GFP_KERNEL);
+   spin_lock(&to_mdev(ibdev)->sriov.id_map_lock);
 
-   spin_unlock(&sriov->id_map_lock);
-   } while (ret == -EAGAIN && idr_pre_get(&sriov->pv_id_table, 
GFP_KERNEL));
-   /*the function idr_get_new_above can return -ENOSPC, so don't insert in 
that case.*/
-   if (!ret) {
-   spin_lock(&sriov->id_map_lock);
+   ret = idr_alloc(&sriov->pv_id_table, ent, next_id, 0, GFP_NOWAIT);
+   if (ret >= 0) {
+   next_id = ((unsigned)ret + 1) & MAX_IDR_MASK;
+   ent->pv_cm_id = (u32)ret;
+   sl_id_map_add(ibdev, ent);
list_add_tail(&ent->list, &sriov->cm_list);
-   spin_unlock(&sriov->id_map_lock);
-   return ent;
}
+
+   spin_unlock(&sriov->id_map_lock);
+   idr_preload_end();
+
+   if (ret >= 0)
+   return ent;
+
/*error flow*/
kfree(ent);
mlx4_ib_warn(ibdev, "No more space in the idr (err:0x%x)\n", ret);
-- 
1.8.1



[PATCH 45/77] IB/ocrdma: convert to idr_alloc()

2013-02-06 Thread Tejun Heo
Convert to the much saner new idr interface.

Only compile tested.

Signed-off-by: Tejun Heo 
Cc: Roland Dreier 
Cc: linux-rdma@vger.kernel.org
---
 drivers/infiniband/hw/ocrdma/ocrdma_main.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c 
b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
index c4e0131..48928c8 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
@@ -51,18 +51,6 @@ static DEFINE_IDR(ocrdma_dev_id);
 
 static union ib_gid ocrdma_zero_sgid;
 
-static int ocrdma_get_instance(void)
-{
-   int instance = 0;
-
-   /* Assign an unused number */
-   if (!idr_pre_get(&ocrdma_dev_id, GFP_KERNEL))
-   return -1;
-   if (idr_get_new(&ocrdma_dev_id, NULL, &instance))
-   return -1;
-   return instance;
-}
-
 void ocrdma_get_guid(struct ocrdma_dev *dev, u8 *guid)
 {
u8 mac_addr[6];
@@ -416,7 +404,7 @@ static struct ocrdma_dev *ocrdma_add(struct be_dev_info 
*dev_info)
goto idr_err;
 
memcpy(&dev->nic_info, dev_info, sizeof(*dev_info));
-   dev->id = ocrdma_get_instance();
+   dev->id = idr_alloc(&ocrdma_dev_id, NULL, 0, 0, GFP_KERNEL);
if (dev->id < 0)
goto idr_err;
 
-- 
1.8.1



[PATCH 41/77] IB/cxgb4: convert to idr_alloc()

2013-02-06 Thread Tejun Heo
Convert to the much saner new idr interface.

Only compile tested.

Signed-off-by: Tejun Heo 
Reviewed-by: Steve Wise 
Cc: linux-rdma@vger.kernel.org
---
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h 
b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
index 9c1644f..7f862da 100644
--- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
+++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
@@ -260,20 +260,21 @@ static inline int _insert_handle(struct c4iw_dev *rhp, 
struct idr *idr,
 void *handle, u32 id, int lock)
 {
int ret;
-   int newid;
 
-   do {
-   if (!idr_pre_get(idr, lock ? GFP_KERNEL : GFP_ATOMIC))
-   return -ENOMEM;
-   if (lock)
-   spin_lock_irq(&rhp->lock);
-   ret = idr_get_new_above(idr, handle, id, &newid);
-   BUG_ON(!ret && newid != id);
-   if (lock)
-   spin_unlock_irq(&rhp->lock);
-   } while (ret == -EAGAIN);
-
-   return ret;
+   if (lock) {
+   idr_preload(GFP_KERNEL);
+   spin_lock_irq(&rhp->lock);
+   }
+
+   ret = idr_alloc(idr, handle, id, id + 1, GFP_ATOMIC);
+
+   if (lock) {
+   spin_unlock_irq(&rhp->lock);
+   idr_preload_end();
+   }
+
+   BUG_ON(ret == -ENOSPC);
+   return ret < 0 ? ret : 0;
 }
 
 static inline int insert_handle(struct c4iw_dev *rhp, struct idr *idr,
-- 
1.8.1



[PATCH 42/77] IB/ehca: convert to idr_alloc()

2013-02-06 Thread Tejun Heo
Convert to the much saner new idr interface.

Only compile tested.

Signed-off-by: Tejun Heo 
Cc: Hoang-Nam Nguyen 
Cc: Christoph Raisch 
Cc: linux-rdma@vger.kernel.org
---
 drivers/infiniband/hw/ehca/ehca_cq.c | 27 +++
 drivers/infiniband/hw/ehca/ehca_qp.c | 34 +++---
 2 files changed, 22 insertions(+), 39 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_cq.c 
b/drivers/infiniband/hw/ehca/ehca_cq.c
index 8f52901..212150c 100644
--- a/drivers/infiniband/hw/ehca/ehca_cq.c
+++ b/drivers/infiniband/hw/ehca/ehca_cq.c
@@ -128,7 +128,7 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int 
cqe, int comp_vector,
void *vpage;
u32 counter;
u64 rpage, cqx_fec, h_ret;
-   int ipz_rc, ret, i;
+   int ipz_rc, i;
unsigned long flags;
 
if (cqe >= 0x - 64 - additional_cqe)
@@ -163,32 +163,19 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, 
int cqe, int comp_vector,
adapter_handle = shca->ipz_hca_handle;
param.eq_handle = shca->eq.ipz_eq_handle;
 
-   do {
-   if (!idr_pre_get(&ehca_cq_idr, GFP_KERNEL)) {
-   cq = ERR_PTR(-ENOMEM);
-   ehca_err(device, "Can't reserve idr nr. device=%p",
-device);
-   goto create_cq_exit1;
-   }
-
-   write_lock_irqsave(&ehca_cq_idr_lock, flags);
-   ret = idr_get_new(&ehca_cq_idr, my_cq, &my_cq->token);
-   write_unlock_irqrestore(&ehca_cq_idr_lock, flags);
-   } while (ret == -EAGAIN);
+   idr_preload(GFP_KERNEL);
+   write_lock_irqsave(&ehca_cq_idr_lock, flags);
+   my_cq->token = idr_alloc(&ehca_cq_idr, my_cq, 0, 0x200, GFP_NOWAIT);
+   write_unlock_irqrestore(&ehca_cq_idr_lock, flags);
+   idr_preload_end();
 
-   if (ret) {
+   if (my_cq->token < 0) {
cq = ERR_PTR(-ENOMEM);
ehca_err(device, "Can't allocate new idr entry. device=%p",
 device);
goto create_cq_exit1;
}
 
-   if (my_cq->token > 0x1FF) {
-   cq = ERR_PTR(-ENOMEM);
-   ehca_err(device, "Invalid number of cq. device=%p", device);
-   goto create_cq_exit2;
-   }
-
/*
 * CQs maximum depth is 4GB-64, but we need additional 20 as buffer
 * for receiving errors CQEs.
diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c 
b/drivers/infiniband/hw/ehca/ehca_qp.c
index 1493939..00d6861 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -636,30 +636,26 @@ static struct ehca_qp *internal_create_qp(
my_qp->send_cq =
container_of(init_attr->send_cq, struct ehca_cq, ib_cq);
 
-   do {
-   if (!idr_pre_get(&ehca_qp_idr, GFP_KERNEL)) {
-   ret = -ENOMEM;
-   ehca_err(pd->device, "Can't reserve idr resources.");
-   goto create_qp_exit0;
-   }
+   idr_preload(GFP_KERNEL);
+   write_lock_irqsave(&ehca_qp_idr_lock, flags);
 
-   write_lock_irqsave(&ehca_qp_idr_lock, flags);
-   ret = idr_get_new(&ehca_qp_idr, my_qp, &my_qp->token);
-   write_unlock_irqrestore(&ehca_qp_idr_lock, flags);
-   } while (ret == -EAGAIN);
+   ret = idr_alloc(&ehca_qp_idr, my_qp, 0, 0x200, GFP_NOWAIT);
+   if (ret >= 0)
+   my_qp->token = ret;
 
-   if (ret) {
-   ret = -ENOMEM;
-   ehca_err(pd->device, "Can't allocate new idr entry.");
+   write_unlock_irqrestore(&ehca_qp_idr_lock, flags);
+   idr_preload_end();
+   if (ret < 0) {
+   if (ret == -ENOSPC) {
+   ret = -EINVAL;
+   ehca_err(pd->device, "Invalid number of qp");
+   } else {
+   ret = -ENOMEM;
+   ehca_err(pd->device, "Can't allocate new idr entry.");
+   }
goto create_qp_exit0;
}
 
-   if (my_qp->token > 0x1FF) {
-   ret = -EINVAL;
-   ehca_err(pd->device, "Invalid number of qp");
-   goto create_qp_exit1;
-   }
-
if (has_srq)
parms.srq_token = my_qp->token;
 
-- 
1.8.1



Re: NFS over RDMA crashing

2013-02-06 Thread Jeff Becker
Hi. In case you're interested, I did the NFS/RDMA backports for OFED. I 
tested that NFS/RDMA in OFED 3.5 works on kernel 3.5, and also the RHEL 
6.3 kernel. However, I did not test it with SRIOV. If you test it 
(OFED-3.5-rc6 was released last week), I'd like to know how it goes. Thanks.


Jeff Becker

On 02/06/2013 07:58 AM, Steve Wise wrote:

On 2/6/2013 9:48 AM, Yan Burman wrote:

When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I was
no longer getting the server crashes,
so the rest of my tests were done using that point (it is somewhere
in the middle of 3.7.0-rc2)


+tom tucker

I'd try going back a few kernels, like to 3.5.x and see if things are
more stable.  If you find a point that works, then git bisect might help
identify the regression.


[PATCH for-next 06/10] IB/core: Enhance memory windows support

2013-02-06 Thread Or Gerlitz
From: Shani Michaeli 

This patch enhances the IB core support for Memory Windows.

Memory Windows (MW) allow an application to have better/flexible control
over remote access to memory.

Two types of MWs are supported, type 1 and type 2, the latter in two variants:

Type 1  - associated with PD only
Type 2A - associated with QPN only
Type 2B - associated with PD and QPN

Applications can allocate a MW once, and then repeatedly bind the MW to
different ranges in MRs that are associated to the same PD. Type 1 windows
are bound through a verb, while type 2 windows are bound by
posting a work request.

The 32-bit memory key is composed of a 24-bit index and an 8-bit key. The key is
changed with each bind, thus allowing more control over the peer's use of the
memory key.
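
As a rough illustration of the key layout described above, the sketch below
splits a 32-bit rkey into its index and key parts and advances only the key
part. It assumes the 8-bit key sits in the low byte and the 24-bit index in
the upper three bytes; the text above does not state the bit positions. The
names RKEY_TAG_MASK, rkey_index, rkey_tag and rkey_inc_tag are made up for
this example and use userspace types rather than the kernel's u32.

```c
#include <stdint.h>

/* Assumption: the 8-bit key (tag) is the low byte of the rkey. */
#define RKEY_TAG_MASK UINT32_C(0x000000ff)

/* 24-bit index part, under the layout assumed above. */
static inline uint32_t rkey_index(uint32_t rkey)
{
	return rkey >> 8;
}

/* 8-bit key (tag) part, under the layout assumed above. */
static inline uint32_t rkey_tag(uint32_t rkey)
{
	return rkey & RKEY_TAG_MASK;
}

/*
 * Advance the tag by one, wrapping within its 8 bits, and leave the
 * index untouched. This is the kind of step a helper that "advances
 * the tag part of an rkey" would perform before each new bind.
 */
static inline uint32_t rkey_inc_tag(uint32_t rkey)
{
	return ((rkey + 1) & RKEY_TAG_MASK) | (rkey & ~RKEY_TAG_MASK);
}
```

Because the increment is masked before being merged back, a tag of 0xff
wraps to 0x00 without carrying into the index.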

The changes introduced are the following:

* add memory window type enum and a corresponding parameter to ib_alloc_mw.
* type 2 memory window bind work request support.
* create a single struct, ib_mw_bind_info, that holds the fields common to
  the bind verb struct ibv_mw_bind and the bind work request.
* add the ib_inc_rkey helper function to advance the tag part of an rkey.

Consumer interface details:

* new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and
  IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support
  for these features.
  A device can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or
  IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B
  memory windows. It can set neither to indicate that it doesn't support
  type 2 windows at all.

* modify existing providers' and consumers' code to use the new param of
  ib_alloc_mw and the ib_mw_bind_info structure
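
A consumer-side check of the two capability flags might look roughly like
the sketch below. The DEV_MEM_WINDOW_TYPE_2A/2B macros are stand-ins for
IB_DEVICE_MEM_WINDOW_TYPE_2A/2B with placeholder bit positions chosen for
this example only, and the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Placeholder bit positions, NOT the values used by ib_verbs.h. */
#define DEV_MEM_WINDOW_TYPE_2A (UINT32_C(1) << 0)
#define DEV_MEM_WINDOW_TYPE_2B (UINT32_C(1) << 1)

enum type2_support {
	TYPE2_NONE,	/* neither flag set: no type 2 windows */
	TYPE2_A,
	TYPE2_B,
};

/*
 * Map a device's capability flags to the type 2 variant it offers.
 * A device is expected to set at most one of the two flags.
 */
static enum type2_support query_type2_support(uint32_t device_cap_flags)
{
	if (device_cap_flags & DEV_MEM_WINDOW_TYPE_2B)
		return TYPE2_B;
	if (device_cap_flags & DEV_MEM_WINDOW_TYPE_2A)
		return TYPE2_A;
	return TYPE2_NONE;
}
```

A caller would consult this before choosing the type argument for
ib_alloc_mw, and fall back to type 1 windows (or to no windows at all)
when TYPE2_NONE comes back.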

Signed-off-by: Haggai Eran 
Signed-off-by: Shani Michaeli 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/core/verbs.c |5 +-
 drivers/infiniband/hw/cxgb3/iwch_provider.c |5 ++-
 drivers/infiniband/hw/cxgb3/iwch_qp.c   |   15 +++---
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h  |2 +-
 drivers/infiniband/hw/cxgb4/mem.c   |5 ++-
 drivers/infiniband/hw/ehca/ehca_iverbs.h|2 +-
 drivers/infiniband/hw/ehca/ehca_mrmw.c  |5 ++-
 drivers/infiniband/hw/nes/nes_verbs.c   |   19 ---
 include/rdma/ib_verbs.h |   73 +++---
 net/sunrpc/xprtrdma/verbs.c |   20 
 10 files changed, 110 insertions(+), 41 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 30f199e..a8fdd33 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1099,18 +1099,19 @@ EXPORT_SYMBOL(ib_free_fast_reg_page_list);
 
 /* Memory windows */
 
-struct ib_mw *ib_alloc_mw(struct ib_pd *pd)
+struct ib_mw *ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type)
 {
struct ib_mw *mw;
 
if (!pd->device->alloc_mw)
return ERR_PTR(-ENOSYS);
 
-   mw = pd->device->alloc_mw(pd);
+   mw = pd->device->alloc_mw(pd, type);
if (!IS_ERR(mw)) {
mw->device  = pd->device;
mw->pd  = pd;
mw->uobject = NULL;
+   mw->type= type;
atomic_inc(&pd->usecnt);
}
 
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c 
b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 0bdf09a..074d5c2 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -738,7 +738,7 @@ static struct ib_mr *iwch_get_dma_mr(struct ib_pd *pd, int 
acc)
return ibmr;
 }
 
-static struct ib_mw *iwch_alloc_mw(struct ib_pd *pd)
+static struct ib_mw *iwch_alloc_mw(struct ib_pd *pd, enum ib_mw_type type)
 {
struct iwch_dev *rhp;
struct iwch_pd *php;
@@ -747,6 +747,9 @@ static struct ib_mw *iwch_alloc_mw(struct ib_pd *pd)
u32 stag = 0;
int ret;
 
+   if (type != IB_MW_TYPE_1)
+   return ERR_PTR(-EINVAL);
+
php = to_iwch_pd(pd);
rhp = php->rhp;
mhp = kzalloc(sizeof(*mhp), GFP_KERNEL);
diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c 
b/drivers/infiniband/hw/cxgb3/iwch_qp.c
index 6de8463..e5649e8 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_qp.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c
@@ -567,18 +567,19 @@ int iwch_bind_mw(struct ib_qp *qp,
if (mw_bind->send_flags & IB_SEND_SIGNALED)
t3_wr_flags = T3_COMPLETION_FLAG;
 
-   sgl.addr = mw_bind->addr;
-   sgl.lkey = mw_bind->mr->lkey;
-   sgl.length = mw_bind->length;
+   sgl.addr = mw_bind->bind_info.addr;
+   sgl.lkey = mw_bind->bind_info.mr->lkey;
+   sgl.length = mw_bind->bind_info.length;
wqe->bind.reserved = 0;
wqe->bind.type = TPT_VATO;
 
/* TBD: check perms */
-   wqe->bind.perms = iwch_ib_to_tpt_bind_access(mw_bind->mw_access_flags);
-   wqe->bind.mr_stag = cpu_to_be32(mw_bind->mr->lkey);
+   wqe->bind.perms = iwch_ib_to_tpt_bind_access(
+   mw_bind->bind_info.mw_access_flags);
+   wqe->bind.mr_stag = c

[PATCH for-next 05/10] net/mlx4_core: Enable memory windows in {INIT,QUERY}_HCA

2013-02-06 Thread Or Gerlitz
From: Shani Michaeli 

Add memory windows related code to INIT_HCA and QUERY_HCA

Signed-off-by: Haggai Eran 
Signed-off-by: Shani Michaeli 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/fw.c   |3 +++
 drivers/net/ethernet/mellanox/mlx4/fw.h   |1 +
 drivers/net/ethernet/mellanox/mlx4/main.c |4 
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |2 ++
 4 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c 
b/drivers/net/ethernet/mellanox/mlx4/fw.c
index a389612..d136b36 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -1207,6 +1207,7 @@ int mlx4_INIT_HCA(struct mlx4_dev *dev, struct 
mlx4_init_hca_param *param)
 #define  INIT_HCA_FS_IB_NUM_ADDRS_OFFSET  (INIT_HCA_FS_PARAM_OFFSET + 0x26)
 #define INIT_HCA_TPT_OFFSET 0x0f0
 #define INIT_HCA_DMPT_BASE_OFFSET   (INIT_HCA_TPT_OFFSET + 0x00)
+#define  INIT_HCA_TPT_MW_OFFSET (INIT_HCA_TPT_OFFSET + 0x08)
 #define INIT_HCA_LOG_MPT_SZ_OFFSET  (INIT_HCA_TPT_OFFSET + 0x0b)
 #define INIT_HCA_MTT_BASE_OFFSET(INIT_HCA_TPT_OFFSET + 0x10)
 #define INIT_HCA_CMPT_BASE_OFFSET   (INIT_HCA_TPT_OFFSET + 0x18)
@@ -1323,6 +1324,7 @@ int mlx4_INIT_HCA(struct mlx4_dev *dev, struct 
mlx4_init_hca_param *param)
/* TPT attributes */
 
MLX4_PUT(inbox, param->dmpt_base,  INIT_HCA_DMPT_BASE_OFFSET);
+   MLX4_PUT(inbox, param->mw_enabled, INIT_HCA_TPT_MW_OFFSET);
MLX4_PUT(inbox, param->log_mpt_sz, INIT_HCA_LOG_MPT_SZ_OFFSET);
MLX4_PUT(inbox, param->mtt_base,   INIT_HCA_MTT_BASE_OFFSET);
MLX4_PUT(inbox, param->cmpt_base,  INIT_HCA_CMPT_BASE_OFFSET);
@@ -1419,6 +1421,7 @@ int mlx4_QUERY_HCA(struct mlx4_dev *dev,
/* TPT attributes */
 
MLX4_GET(param->dmpt_base,  outbox, INIT_HCA_DMPT_BASE_OFFSET);
+   MLX4_GET(param->mw_enabled, outbox, INIT_HCA_TPT_MW_OFFSET);
MLX4_GET(param->log_mpt_sz, outbox, INIT_HCA_LOG_MPT_SZ_OFFSET);
MLX4_GET(param->mtt_base,   outbox, INIT_HCA_MTT_BASE_OFFSET);
MLX4_GET(param->cmpt_base,  outbox, INIT_HCA_CMPT_BASE_OFFSET);
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.h 
b/drivers/net/ethernet/mellanox/mlx4/fw.h
index dbf2f69..9f1a25c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.h
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.h
@@ -170,6 +170,7 @@ struct mlx4_init_hca_param {
u8  log_mc_table_sz;
u8  log_mpt_sz;
u8  log_uar_sz;
+   u8  mw_enabled;  /* Enable memory windows */
u8  uar_page_sz; /* log pg sz in 4k chunks */
u8  fs_hash_enable_bits;
u8  steering_mode; /* for QUERY_HCA */
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c 
b/drivers/net/ethernet/mellanox/mlx4/main.c
index 9a84c75..2a4dda0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -1447,6 +1447,10 @@ static int mlx4_init_hca(struct mlx4_dev *dev)
 
init_hca.log_uar_sz = ilog2(dev->caps.num_uars);
init_hca.uar_page_sz = PAGE_SHIFT - 12;
+   init_hca.mw_enabled = 0;
+   if (dev->caps.flags & MLX4_DEV_CAP_FLAG_MEM_WINDOW ||
+   dev->caps.bmme_flags & MLX4_BMME_FLAG_TYPE_2_WIN)
+   init_hca.mw_enabled = INIT_HCA_TPT_MW_ENABLE;
 
err = mlx4_init_icm(dev, &dev_cap, &init_hca, icm_size);
if (err)
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 539212b..8b75d5e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -60,6 +60,8 @@
 #define MLX4_FS_MGM_LOG_ENTRY_SIZE 7
 #define MLX4_FS_NUM_MCG(1 << 17)
 
+#define INIT_HCA_TPT_MW_ENABLE  (1 << 7)
+
 enum {
MLX4_FS_L2_HASH = 0,
MLX4_FS_L2_L3_L4_HASH,
-- 
1.7.1



[PATCH for-next 02/10] net/mlx4_core: Rename MPT related service routines to have mpt_ prefix

2013-02-06 Thread Or Gerlitz
From: Shani Michaeli 

The MPT - Memory Protection Table - is used by both memory windows and memory
regions. Hence, all MPT references are relevant for both types of memory
objects. Rename the relevant functions to start with mpt_ instead of the
current mr_ prefix.

Signed-off-by: Haggai Eran 
Signed-off-by: Shani Michaeli 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/mlx4.h  |   16 +++---
 drivers/net/ethernet/mellanox/mlx4/mr.c|   48 ++--
 .../net/ethernet/mellanox/mlx4/resource_tracker.c  |   14 +++---
 3 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 116c5c2..5075236 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -118,10 +118,10 @@ enum {
MLX4_NUM_CMPTS  = MLX4_CMPT_NUM_TYPE << MLX4_CMPT_SHIFT
 };
 
-enum mlx4_mr_state {
-   MLX4_MR_DISABLED = 0,
-   MLX4_MR_EN_HW,
-   MLX4_MR_EN_SW
+enum mlx4_mpt_state {
+   MLX4_MPT_DISABLED = 0,
+   MLX4_MPT_EN_HW,
+   MLX4_MPT_EN_SW
 };
 
 #define MLX4_COMM_TIME 1
@@ -871,10 +871,10 @@ int __mlx4_cq_alloc_icm(struct mlx4_dev *dev, int *cqn);
 void __mlx4_cq_free_icm(struct mlx4_dev *dev, int cqn);
 int __mlx4_srq_alloc_icm(struct mlx4_dev *dev, int *srqn);
 void __mlx4_srq_free_icm(struct mlx4_dev *dev, int srqn);
-int __mlx4_mr_reserve(struct mlx4_dev *dev);
-void __mlx4_mr_release(struct mlx4_dev *dev, u32 index);
-int __mlx4_mr_alloc_icm(struct mlx4_dev *dev, u32 index);
-void __mlx4_mr_free_icm(struct mlx4_dev *dev, u32 index);
+int __mlx4_mpt_reserve(struct mlx4_dev *dev);
+void __mlx4_mpt_release(struct mlx4_dev *dev, u32 index);
+int __mlx4_mpt_alloc_icm(struct mlx4_dev *dev, u32 index);
+void __mlx4_mpt_free_icm(struct mlx4_dev *dev, u32 index);
 u32 __mlx4_alloc_mtt_range(struct mlx4_dev *dev, int order);
 void __mlx4_free_mtt_range(struct mlx4_dev *dev, u32 first_seg, int order);
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c 
b/drivers/net/ethernet/mellanox/mlx4/mr.c
index c202d3a..49705cf 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -321,7 +321,7 @@ static int mlx4_mr_alloc_reserved(struct mlx4_dev *dev, u32 
mridx, u32 pd,
mr->size   = size;
mr->pd = pd;
mr->access = access;
-   mr->enabled= MLX4_MR_DISABLED;
+   mr->enabled= MLX4_MPT_DISABLED;
mr->key= hw_index_to_key(mridx);
 
return mlx4_mtt_init(dev, npages, page_shift, &mr->mtt);
@@ -335,14 +335,14 @@ static int mlx4_WRITE_MTT(struct mlx4_dev *dev,
MLX4_CMD_TIME_CLASS_A,  MLX4_CMD_WRAPPED);
 }
 
-int __mlx4_mr_reserve(struct mlx4_dev *dev)
+int __mlx4_mpt_reserve(struct mlx4_dev *dev)
 {
struct mlx4_priv *priv = mlx4_priv(dev);
 
return mlx4_bitmap_alloc(&priv->mr_table.mpt_bitmap);
 }
 
-static int mlx4_mr_reserve(struct mlx4_dev *dev)
+static int mlx4_mpt_reserve(struct mlx4_dev *dev)
 {
u64 out_param;
 
@@ -353,17 +353,17 @@ static int mlx4_mr_reserve(struct mlx4_dev *dev)
return -1;
return get_param_l(&out_param);
}
-   return  __mlx4_mr_reserve(dev);
+   return  __mlx4_mpt_reserve(dev);
 }
 
-void __mlx4_mr_release(struct mlx4_dev *dev, u32 index)
+void __mlx4_mpt_release(struct mlx4_dev *dev, u32 index)
 {
struct mlx4_priv *priv = mlx4_priv(dev);
 
mlx4_bitmap_free(&priv->mr_table.mpt_bitmap, index);
 }
 
-static void mlx4_mr_release(struct mlx4_dev *dev, u32 index)
+static void mlx4_mpt_release(struct mlx4_dev *dev, u32 index)
 {
u64 in_param;
 
@@ -376,17 +376,17 @@ static void mlx4_mr_release(struct mlx4_dev *dev, u32 
index)
  index);
return;
}
-   __mlx4_mr_release(dev, index);
+   __mlx4_mpt_release(dev, index);
 }
 
-int __mlx4_mr_alloc_icm(struct mlx4_dev *dev, u32 index)
+int __mlx4_mpt_alloc_icm(struct mlx4_dev *dev, u32 index)
 {
struct mlx4_mr_table *mr_table = &mlx4_priv(dev)->mr_table;
 
return mlx4_table_get(dev, &mr_table->dmpt_table, index);
 }
 
-static int mlx4_mr_alloc_icm(struct mlx4_dev *dev, u32 index)
+static int mlx4_mpt_alloc_icm(struct mlx4_dev *dev, u32 index)
 {
u64 param;
 
@@ -397,17 +397,17 @@ static int mlx4_mr_alloc_icm(struct mlx4_dev *dev, u32 
index)
MLX4_CMD_TIME_CLASS_A,
MLX4_CMD_WRAPPED);
}
-   return __mlx4_mr_alloc_icm(dev, index);
+   return __mlx4_mpt_alloc_icm(dev, index);
 }
 
-void __mlx4_mr_free_icm(struct mlx4_dev *dev, u32 index)
+void __mlx4_mpt_free_icm(struct mlx4_dev *dev, u32 index)
 {
struct mlx4_mr_table *mr_table = &mlx4_priv(dev)->mr_table;
 
mlx4_table_put(dev, 

[PATCH for-next 10/10] IB/mlx4_ib: Advertize MW support

2013-02-06 Thread Or Gerlitz
From: Shani Michaeli 

Indicate memory windows support through device capabilities, kernel
verb entries and the relevant uverbs command mask entries.

Signed-off-by: Haggai Eran 
Signed-off-by: Shani Michaeli 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/hw/mlx4/main.c |   19 +++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index e7d81c0..f77ff4f 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -137,6 +137,14 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
props->device_cap_flags |= IB_DEVICE_MEM_MGT_EXTENSIONS;
if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)
props->device_cap_flags |= IB_DEVICE_XRC;
+   if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_MEM_WINDOW)
+   props->device_cap_flags |= IB_DEVICE_MEM_WINDOW;
+   if (dev->dev->caps.bmme_flags & MLX4_BMME_FLAG_TYPE_2_WIN) {
+   if (dev->dev->caps.bmme_flags & MLX4_BMME_FLAG_WIN_TYPE_2B)
+   props->device_cap_flags |= IB_DEVICE_MEM_WINDOW_TYPE_2B;
+   else
+   props->device_cap_flags |= IB_DEVICE_MEM_WINDOW_TYPE_2A;
+   }
 
props->vendor_id   = be32_to_cpup((__be32 *) (out_mad->data + 36)) &
0xff;
@@ -1434,6 +1442,17 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ibdev->ib_dev.dealloc_fmr   = mlx4_ib_fmr_dealloc;
}
 
+   if (dev->caps.flags & MLX4_DEV_CAP_FLAG_MEM_WINDOW ||
+   dev->caps.bmme_flags & MLX4_BMME_FLAG_TYPE_2_WIN) {
+   ibdev->ib_dev.alloc_mw = mlx4_ib_alloc_mw;
+   ibdev->ib_dev.bind_mw = mlx4_ib_bind_mw;
+   ibdev->ib_dev.dealloc_mw = mlx4_ib_dealloc_mw;
+
+   ibdev->ib_dev.uverbs_cmd_mask |=
+   (1ull << IB_USER_VERBS_CMD_ALLOC_MW) |
+   (1ull << IB_USER_VERBS_CMD_DEALLOC_MW);
+   }
+
if (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) {
ibdev->ib_dev.alloc_xrcd = mlx4_ib_alloc_xrcd;
ibdev->ib_dev.dealloc_xrcd = mlx4_ib_dealloc_xrcd;
-- 
1.7.1
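
The capability mapping in this patch can be tried outside the kernel with a small stand-alone sketch. The `SKETCH_*` constants and the function name below are placeholders invented for illustration; their bit values are not the kernel's definitions. Only the branching mirrors the hunk in mlx4_ib_query_device().

```c
#include <stdint.h>

/* Placeholder bits, NOT the kernel values. They stand in for
 * MLX4_DEV_CAP_FLAG_MEM_WINDOW, MLX4_BMME_FLAG_* and IB_DEVICE_MEM_WINDOW*. */
#define SKETCH_DEV_CAP_MEM_WINDOW     (1ULL << 0)
#define SKETCH_BMME_TYPE_2_WIN        (1u << 0)
#define SKETCH_BMME_WIN_TYPE_2B       (1u << 1)
#define SKETCH_IB_MEM_WINDOW          (1u << 0)
#define SKETCH_IB_MEM_WINDOW_TYPE_2A  (1u << 1)
#define SKETCH_IB_MEM_WINDOW_TYPE_2B  (1u << 2)

/* Translate low-level capability bits into the flags reported to verbs users. */
static uint32_t sketch_mw_cap_flags(uint64_t dev_flags, uint32_t bmme_flags)
{
    uint32_t out = 0;

    /* Type 1 windows are advertised from the general capability flags. */
    if (dev_flags & SKETCH_DEV_CAP_MEM_WINDOW)
        out |= SKETCH_IB_MEM_WINDOW;

    /* Type 2 windows are reported as exactly one of 2A or 2B. */
    if (bmme_flags & SKETCH_BMME_TYPE_2_WIN) {
        if (bmme_flags & SKETCH_BMME_WIN_TYPE_2B)
            out |= SKETCH_IB_MEM_WINDOW_TYPE_2B;
        else
            out |= SKETCH_IB_MEM_WINDOW_TYPE_2A;
    }
    return out;
}
```

In this sketch the 2B bit on its own reports nothing, because it is only consulted once type 2 support is present.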



[PATCH for-next 09/10] IB/mlx4_ib: Support memory window binding

2013-02-06 Thread Or Gerlitz
From: Shani Michaeli 

* Implement memory windows binding in mlx4_ib_post_send.

* Implement mlx4_ib_bind_mw by deferring to mlx4_ib_post_send.

* Rename MLX4_WQE_FMR_PERM_* flags to MLX4_WQE_FMR_AND_BIND_PERM_*,
  indicating that they are used both for fast registration work
  requests, and for memory window bind work requests.

Signed-off-by: Haggai Eran 
Signed-off-by: Shani Michaeli 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/hw/mlx4/mlx4_ib.h |2 +
 drivers/infiniband/hw/mlx4/mr.c  |   22 +
 drivers/infiniband/hw/mlx4/qp.c  |   35 +++--
 include/linux/mlx4/qp.h  |   11 +++--
 4 files changed, 64 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 6d28491..5a21783 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -592,6 +592,8 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
  struct ib_udata *udata);
 int mlx4_ib_dereg_mr(struct ib_mr *mr);
 struct ib_mw *mlx4_ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type);
+int mlx4_ib_bind_mw(struct ib_qp *qp, struct ib_mw *mw,
+   struct ib_mw_bind *mw_bind);
 int mlx4_ib_dealloc_mw(struct ib_mw *mw);
 struct ib_mr *mlx4_ib_alloc_fast_reg_mr(struct ib_pd *pd,
int max_page_list_len);
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index 5adf4c4..e471f08 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -231,6 +231,28 @@ err_free:
return ERR_PTR(err);
 }
 
+int mlx4_ib_bind_mw(struct ib_qp *qp, struct ib_mw *mw,
+   struct ib_mw_bind *mw_bind)
+{
+   struct ib_send_wr  wr;
+   struct ib_send_wr *bad_wr;
+   int ret;
+
+   memset(&wr, 0, sizeof(wr));
+   wr.opcode   = IB_WR_BIND_MW;
+   wr.wr_id= mw_bind->wr_id;
+   wr.send_flags   = mw_bind->send_flags;
+   wr.wr.bind_mw.mw= mw;
+   wr.wr.bind_mw.bind_info = mw_bind->bind_info;
+   wr.wr.bind_mw.rkey  = ib_inc_rkey(mw->rkey);
+
+   ret = mlx4_ib_post_send(qp, &wr, &bad_wr);
+   if (!ret)
+   mw->rkey = wr.wr.bind_mw.rkey;
+
+   return ret;
+}
+
 int mlx4_ib_dealloc_mw(struct ib_mw *ibmw)
 {
struct mlx4_ib_mw *mw = to_mmw(ibmw);
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index c6dde71..93bdae5 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -104,6 +104,7 @@ static const __be32 mlx4_ib_opcode[] = {
[IB_WR_FAST_REG_MR] = cpu_to_be32(MLX4_OPCODE_FMR),
[IB_WR_MASKED_ATOMIC_CMP_AND_SWP]   = cpu_to_be32(MLX4_OPCODE_MASKED_ATOMIC_CS),
[IB_WR_MASKED_ATOMIC_FETCH_AND_ADD] = cpu_to_be32(MLX4_OPCODE_MASKED_ATOMIC_FA),
+   [IB_WR_BIND_MW] = cpu_to_be32(MLX4_OPCODE_BIND_MW),
 };
 
 static struct mlx4_ib_sqp *to_msqp(struct mlx4_ib_qp *mqp)
@@ -1953,9 +1954,12 @@ static int mlx4_wq_overflow(struct mlx4_ib_wq *wq, int nreq, struct ib_cq *ib_cq
 
 static __be32 convert_access(int acc)
 {
-   return (acc & IB_ACCESS_REMOTE_ATOMIC ? cpu_to_be32(MLX4_WQE_FMR_PERM_ATOMIC)   : 0) |
-  (acc & IB_ACCESS_REMOTE_WRITE  ? cpu_to_be32(MLX4_WQE_FMR_PERM_REMOTE_WRITE) : 0) |
-  (acc & IB_ACCESS_REMOTE_READ   ? cpu_to_be32(MLX4_WQE_FMR_PERM_REMOTE_READ)  : 0) |
+   return (acc & IB_ACCESS_REMOTE_ATOMIC ?
+   cpu_to_be32(MLX4_WQE_FMR_AND_BIND_PERM_ATOMIC)   : 0) |
+  (acc & IB_ACCESS_REMOTE_WRITE  ?
+   cpu_to_be32(MLX4_WQE_FMR_AND_BIND_PERM_REMOTE_WRITE) : 0) |
+  (acc & IB_ACCESS_REMOTE_READ   ?
+   cpu_to_be32(MLX4_WQE_FMR_AND_BIND_PERM_REMOTE_READ)  : 0) |
   (acc & IB_ACCESS_LOCAL_WRITE   ? 
cpu_to_be32(MLX4_WQE_FMR_PERM_LOCAL_WRITE)  : 0) |
cpu_to_be32(MLX4_WQE_FMR_PERM_LOCAL_READ);
 }
@@ -1981,6 +1985,24 @@ static void set_fmr_seg(struct mlx4_wqe_fmr_seg *fseg, struct ib_send_wr *wr)
fseg->reserved[1]   = 0;
 }
 
+static void set_bind_seg(struct mlx4_wqe_bind_seg *bseg, struct ib_send_wr *wr)
+{
+   bseg->flags1 =
+   convert_access(wr->wr.bind_mw.bind_info.mw_access_flags) &
+   cpu_to_be32(MLX4_WQE_FMR_AND_BIND_PERM_REMOTE_READ  |
+   MLX4_WQE_FMR_AND_BIND_PERM_REMOTE_WRITE |
+   MLX4_WQE_FMR_AND_BIND_PERM_ATOMIC);
+   bseg->flags2 = 0;
+   if (wr->wr.bind_mw.mw->type == IB_MW_TYPE_2)
+   bseg->flags2 |= cpu_to_be32(MLX4_WQE_BIND_TYPE_2);
+   if (wr->wr.bind_mw.bind_info.mw_access_flags & IB_ZERO_BASED)
+   bseg->flags2 |= cpu_to_be32(MLX4_WQE_BIND_ZERO_BASED);
+   bseg->new_rkey = cpu_to_be32(wr->wr.b
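
The rkey handling in mlx4_ib_bind_mw() above follows a compute, post, then commit pattern: the new key is derived first and stored in the window only if the post succeeds. A stand-alone sketch is given below. It assumes that ib_inc_rkey(), which comes from an earlier patch in this series and is not shown here, advances only the low 8 bits of the key. All `sketch_*` names, including the callback standing in for mlx4_ib_post_send(), are hypothetical.

```c
#include <stdint.h>

/* Assumed behaviour of ib_inc_rkey(): bump the low 8-bit key portion and
 * leave the upper 24 index bits untouched (wrapping within those 8 bits). */
static uint32_t sketch_inc_rkey(uint32_t rkey)
{
    const uint32_t mask = 0x000000ff;

    return ((rkey + 1) & mask) | (rkey & ~mask);
}

struct sketch_mw {
    uint32_t rkey;
};

/* Hypothetical stand-in for posting the bind work request; 0 means success. */
typedef int (*sketch_post_fn)(uint32_t new_rkey);

static int sketch_bind_mw(struct sketch_mw *mw, sketch_post_fn post)
{
    uint32_t new_rkey = sketch_inc_rkey(mw->rkey);
    int ret = post(new_rkey);

    /* Commit the new key only if the work request was accepted. */
    if (!ret)
        mw->rkey = new_rkey;
    return ret;
}

/* Two trivial posters used to exercise both outcomes. */
static int sketch_post_ok(uint32_t new_rkey)   { (void)new_rkey; return 0; }
static int sketch_post_fail(uint32_t new_rkey) { (void)new_rkey; return -22; }
```

Because a failed post leaves the stored key untouched, the caller's view of the window stays consistent with what the hardware last accepted.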

[PATCH for-next 08/10] mlx4: Implement memory windows allocation and deallocation

2013-02-06 Thread Or Gerlitz
From: Shani Michaeli 

Implement MW allocation and deallocation in mlx4_core and mlx4_ib.
Pass down the enable bind flag when registering memory regions.

Signed-off-by: Haggai Eran 
Signed-off-by: Shani Michaeli 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/hw/mlx4/mlx4_ib.h|   12 
 drivers/infiniband/hw/mlx4/mr.c |   52 +
 drivers/net/ethernet/mellanox/mlx4/mr.c |   95 +++
 include/linux/mlx4/device.h |   20 ++-
 4 files changed, 178 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index dcd845b..6d28491 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -116,6 +116,11 @@ struct mlx4_ib_mr {
struct ib_umem *umem;
 };
 
+struct mlx4_ib_mw {
+   struct ib_mwibmw;
+   struct mlx4_mw  mmw;
+};
+
 struct mlx4_ib_fast_reg_page_list {
struct ib_fast_reg_page_listibfrpl;
__be64 *mapped_page_list;
@@ -533,6 +538,11 @@ static inline struct mlx4_ib_mr *to_mmr(struct ib_mr *ibmr)
return container_of(ibmr, struct mlx4_ib_mr, ibmr);
 }
 
+static inline struct mlx4_ib_mw *to_mmw(struct ib_mw *ibmw)
+{
+   return container_of(ibmw, struct mlx4_ib_mw, ibmw);
+}
+
 static inline struct mlx4_ib_fast_reg_page_list *to_mfrpl(struct ib_fast_reg_page_list *ibfrpl)
 {
return container_of(ibfrpl, struct mlx4_ib_fast_reg_page_list, ibfrpl);
@@ -581,6 +591,8 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
  u64 virt_addr, int access_flags,
  struct ib_udata *udata);
 int mlx4_ib_dereg_mr(struct ib_mr *mr);
+struct ib_mw *mlx4_ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type);
+int mlx4_ib_dealloc_mw(struct ib_mw *mw);
 struct ib_mr *mlx4_ib_alloc_fast_reg_mr(struct ib_pd *pd,
int max_page_list_len);
 struct ib_fast_reg_page_list *mlx4_ib_alloc_fast_reg_page_list(struct ib_device *ibdev,
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index 254e1cf..5adf4c4 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -41,9 +41,19 @@ static u32 convert_access(int acc)
   (acc & IB_ACCESS_REMOTE_WRITE  ? MLX4_PERM_REMOTE_WRITE : 0) |
   (acc & IB_ACCESS_REMOTE_READ   ? MLX4_PERM_REMOTE_READ  : 0) |
   (acc & IB_ACCESS_LOCAL_WRITE   ? MLX4_PERM_LOCAL_WRITE  : 0) |
+  (acc & IB_ACCESS_MW_BIND   ? MLX4_PERM_BIND_MW  : 0) |
   MLX4_PERM_LOCAL_READ;
 }
 
+static enum mlx4_mw_type to_mlx4_type(enum ib_mw_type type)
+{
+   switch (type) {
+   case IB_MW_TYPE_1:  return MLX4_MW_TYPE_1;
+   case IB_MW_TYPE_2:  return MLX4_MW_TYPE_2;
+   default:return -1;
+   }
+}
+
 struct ib_mr *mlx4_ib_get_dma_mr(struct ib_pd *pd, int acc)
 {
struct mlx4_ib_mr *mr;
@@ -189,6 +199,48 @@ int mlx4_ib_dereg_mr(struct ib_mr *ibmr)
return 0;
 }
 
+struct ib_mw *mlx4_ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type)
+{
+   struct mlx4_ib_dev *dev = to_mdev(pd->device);
+   struct mlx4_ib_mw *mw;
+   int err;
+
+   mw = kmalloc(sizeof(*mw), GFP_KERNEL);
+   if (!mw)
+   return ERR_PTR(-ENOMEM);
+
+   err = mlx4_mw_alloc(dev->dev, to_mpd(pd)->pdn,
+   to_mlx4_type(type), &mw->mmw);
+   if (err)
+   goto err_free;
+
+   err = mlx4_mw_enable(dev->dev, &mw->mmw);
+   if (err)
+   goto err_mw;
+
+   mw->ibmw.rkey = mw->mmw.key;
+
+   return &mw->ibmw;
+
+err_mw:
+   mlx4_mw_free(dev->dev, &mw->mmw);
+
+err_free:
+   kfree(mw);
+
+   return ERR_PTR(err);
+}
+
+int mlx4_ib_dealloc_mw(struct ib_mw *ibmw)
+{
+   struct mlx4_ib_mw *mw = to_mmw(ibmw);
+
+   mlx4_mw_free(to_mdev(ibmw->device)->dev, &mw->mmw);
+   kfree(mw);
+
+   return 0;
+}
+
 struct ib_mr *mlx4_ib_alloc_fast_reg_mr(struct ib_pd *pd,
int max_page_list_len)
 {
diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c b/drivers/net/ethernet/mellanox/mlx4/mr.c
index 5e785bd..602ca9b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -654,6 +654,101 @@ int mlx4_buf_write_mtt(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
 }
 EXPORT_SYMBOL_GPL(mlx4_buf_write_mtt);
 
+int mlx4_mw_alloc(struct mlx4_dev *dev, u32 pd, enum mlx4_mw_type type,
+ struct mlx4_mw *mw)
+{
+   u32 index;
+
+   if ((type == MLX4_MW_TYPE_1 &&
+!(dev->caps.flags & MLX4_DEV_CAP_FLAG_MEM_WINDOW)) ||
+(type == MLX4_MW_TYPE_2 &&
+!(dev->caps.bmme_flags & MLX4_BMME_FLAG_TYPE_2_WIN)))
+   return -ENOTSUPP;
+
+   index = mlx4_mpt_reserve(dev)

[PATCH for-next 07/10] IB/uverbs: Implement memory windows support in uverbs

2013-02-06 Thread Or Gerlitz
From: Shani Michaeli 

The existing user/kernel uverbs API already defines
IB_USER_VERBS_CMD_ALLOC/DEALLOC_MW. Implement these calls, along with
destroying user memory windows during process cleanup.

Signed-off-by: Haggai Eran 
Signed-off-by: Shani Michaeli 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/core/uverbs.h  |2 +
 drivers/infiniband/core/uverbs_cmd.c  |  121 +
 drivers/infiniband/core/uverbs_main.c |   13 +++-
 include/uapi/rdma/ib_user_verbs.h |   16 +
 4 files changed, 150 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 5bcb2af..0fcd7aa 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -188,6 +188,8 @@ IB_UVERBS_DECLARE_CMD(alloc_pd);
 IB_UVERBS_DECLARE_CMD(dealloc_pd);
 IB_UVERBS_DECLARE_CMD(reg_mr);
 IB_UVERBS_DECLARE_CMD(dereg_mr);
+IB_UVERBS_DECLARE_CMD(alloc_mw);
+IB_UVERBS_DECLARE_CMD(dealloc_mw);
 IB_UVERBS_DECLARE_CMD(create_comp_channel);
 IB_UVERBS_DECLARE_CMD(create_cq);
 IB_UVERBS_DECLARE_CMD(resize_cq);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 0cb0007..3983a05 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -48,6 +48,7 @@ struct uverbs_lock_class {
 
 static struct uverbs_lock_class pd_lock_class  = { .name = "PD-uobj" };
 static struct uverbs_lock_class mr_lock_class  = { .name = "MR-uobj" };
+static struct uverbs_lock_class mw_lock_class  = { .name = "MW-uobj" };
 static struct uverbs_lock_class cq_lock_class  = { .name = "CQ-uobj" };
 static struct uverbs_lock_class qp_lock_class  = { .name = "QP-uobj" };
 static struct uverbs_lock_class ah_lock_class  = { .name = "AH-uobj" };
@@ -1049,6 +1050,126 @@ ssize_t ib_uverbs_dereg_mr(struct ib_uverbs_file *file,
return in_len;
 }
 
+ssize_t ib_uverbs_alloc_mw(struct ib_uverbs_file *file,
+const char __user *buf, int in_len,
+int out_len)
+{
+   struct ib_uverbs_alloc_mw  cmd;
+   struct ib_uverbs_alloc_mw_resp resp;
+   struct ib_uobject *uobj;
+   struct ib_pd  *pd;
+   struct ib_mw  *mw;
+   intret;
+
+   if (out_len < sizeof(resp))
+   return -ENOSPC;
+
+   if (copy_from_user(&cmd, buf, sizeof(cmd)))
+   return -EFAULT;
+
+   uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
+   if (!uobj)
+   return -ENOMEM;
+
+   init_uobj(uobj, 0, file->ucontext, &mw_lock_class);
+   down_write(&uobj->mutex);
+
+   pd = idr_read_pd(cmd.pd_handle, file->ucontext);
+   if (!pd) {
+   ret = -EINVAL;
+   goto err_free;
+   }
+
+   mw = pd->device->alloc_mw(pd, cmd.mw_type);
+   if (IS_ERR(mw)) {
+   ret = PTR_ERR(mw);
+   goto err_put;
+   }
+
+   mw->device  = pd->device;
+   mw->pd  = pd;
+   mw->uobject = uobj;
+   atomic_inc(&pd->usecnt);
+
+   uobj->object = mw;
+   ret = idr_add_uobj(&ib_uverbs_mw_idr, uobj);
+   if (ret)
+   goto err_unalloc;
+
+   memset(&resp, 0, sizeof(resp));
+   resp.rkey  = mw->rkey;
+   resp.mw_handle = uobj->id;
+
+   if (copy_to_user((void __user *)(unsigned long)cmd.response,
+&resp, sizeof(resp))) {
+   ret = -EFAULT;
+   goto err_copy;
+   }
+
+   put_pd_read(pd);
+
+   mutex_lock(&file->mutex);
+   list_add_tail(&uobj->list, &file->ucontext->mw_list);
+   mutex_unlock(&file->mutex);
+
+   uobj->live = 1;
+
+   up_write(&uobj->mutex);
+
+   return in_len;
+
+err_copy:
+   idr_remove_uobj(&ib_uverbs_mw_idr, uobj);
+
+err_unalloc:
+   ib_dealloc_mw(mw);
+
+err_put:
+   put_pd_read(pd);
+
+err_free:
+   put_uobj_write(uobj);
+   return ret;
+}
+
+ssize_t ib_uverbs_dealloc_mw(struct ib_uverbs_file *file,
+  const char __user *buf, int in_len,
+  int out_len)
+{
+   struct ib_uverbs_dealloc_mw cmd;
+   struct ib_mw   *mw;
+   struct ib_uobject  *uobj;
+   int ret = -EINVAL;
+
+   if (copy_from_user(&cmd, buf, sizeof(cmd)))
+   return -EFAULT;
+
+   uobj = idr_write_uobj(&ib_uverbs_mw_idr, cmd.mw_handle, file->ucontext);
+   if (!uobj)
+   return -EINVAL;
+
+   mw = uobj->object;
+
+   ret = ib_dealloc_mw(mw);
+   if (!ret)
+   uobj->live = 0;
+
+   put_uobj_write(uobj);
+
+   if (ret)
+   return ret;
+
+   idr_remove_uobj(&ib_uverbs_mw_idr, uobj);
+
+   mutex_lock(&file->mutex);
+   list_del(&uobj->list);
+   mutex_unlock(&file->mutex);
+
+   put_uobj(uobj);
+
+   return in_len;
+}
+
 ssize_t ib_uverbs_create_comp_channel(struc

[PATCH for-next 04/10] net/mlx4_core: Disable memory windows for VFs

2013-02-06 Thread Or Gerlitz
From: Shani Michaeli 

Do not enable memory windows allocation for virtual functions.

In addition, add a few safety checks, such as:

* Verifying the PD of a new MPT matches the VF.

* Making sure memory window binding isn't enabled for FMRs, and
  that new memory windows are not FMRs themselves.

Signed-off-by: Haggai Eran 
Signed-off-by: Shani Michaeli 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/fw.c|   11 -
 drivers/net/ethernet/mellanox/mlx4/mlx4.h  |   16 ++
 drivers/net/ethernet/mellanox/mlx4/mr.c|   14 --
 .../net/ethernet/mellanox/mlx4/resource_tracker.c  |   49 
 4 files changed, 75 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index 8b3d051..a389612 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -757,15 +757,19 @@ int mlx4_QUERY_DEV_CAP_wrapper(struct mlx4_dev *dev, int slave,
u64 flags;
int err = 0;
u8  field;
+   u32 bmme_flags;
 
err = mlx4_cmd_box(dev, 0, outbox->dma, 0, 0, MLX4_CMD_QUERY_DEV_CAP,
   MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
if (err)
return err;
 
-   /* add port mng change event capability unconditionally to slaves */
+   /* add port mng change event capability and disable mw type 1
+* unconditionally to slaves
+*/
MLX4_GET(flags, outbox->buf, QUERY_DEV_CAP_EXT_FLAGS_OFFSET);
flags |= MLX4_DEV_CAP_FLAG_PORT_MNG_CHG_EV;
+   flags &= ~MLX4_DEV_CAP_FLAG_MEM_WINDOW;
MLX4_PUT(outbox->buf, flags, QUERY_DEV_CAP_EXT_FLAGS_OFFSET);
 
/* For guests, report Blueflame disabled */
@@ -773,6 +777,11 @@ int mlx4_QUERY_DEV_CAP_wrapper(struct mlx4_dev *dev, int slave,
field &= 0x7f;
MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_BF_OFFSET);
 
+   /* For guests, disable mw type 2 */
+   MLX4_GET(bmme_flags, outbox, QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
+   bmme_flags &= ~MLX4_BMME_FLAG_TYPE_2_WIN;
+   MLX4_PUT(outbox->buf, bmme_flags, QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
+
return 0;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 5075236..539212b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -268,6 +268,22 @@ struct mlx4_icm_table {
struct mlx4_icm   **icm;
 };
 
+#define MLX4_MPT_FLAG_SW_OWNS  (0xfUL << 28)
+#define MLX4_MPT_FLAG_FREE (0x3UL << 28)
+#define MLX4_MPT_FLAG_MIO  (1 << 17)
+#define MLX4_MPT_FLAG_BIND_ENABLE   (1 << 15)
+#define MLX4_MPT_FLAG_PHYSICAL (1 <<  9)
+#define MLX4_MPT_FLAG_REGION   (1 <<  8)
+
+#define MLX4_MPT_PD_FLAG_FAST_REG   (1 << 27)
+#define MLX4_MPT_PD_FLAG_RAE   (1 << 28)
+#define MLX4_MPT_PD_FLAG_EN_INV(3 << 24)
+
+#define MLX4_MPT_QP_FLAG_BOUND_QP   (1 << 7)
+
+#define MLX4_MPT_STATUS_SW 0xF0
+#define MLX4_MPT_STATUS_HW 0x00
+
 /*
  * Must be packed because mtt_seg is 64 bits but only aligned to 32 bits.
  */
diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c b/drivers/net/ethernet/mellanox/mlx4/mr.c
index 06b16e4..5e785bd 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -44,20 +44,6 @@
 #include "mlx4.h"
 #include "icm.h"
 
-#define MLX4_MPT_FLAG_SW_OWNS  (0xfUL << 28)
-#define MLX4_MPT_FLAG_FREE (0x3UL << 28)
-#define MLX4_MPT_FLAG_MIO  (1 << 17)
-#define MLX4_MPT_FLAG_BIND_ENABLE   (1 << 15)
-#define MLX4_MPT_FLAG_PHYSICAL (1 <<  9)
-#define MLX4_MPT_FLAG_REGION   (1 <<  8)
-
-#define MLX4_MPT_PD_FLAG_FAST_REG   (1 << 27)
-#define MLX4_MPT_PD_FLAG_RAE   (1 << 28)
-#define MLX4_MPT_PD_FLAG_EN_INV(3 << 24)
-
-#define MLX4_MPT_STATUS_SW 0xF0
-#define MLX4_MPT_STATUS_HW 0x00
-
 static u32 mlx4_buddy_alloc(struct mlx4_buddy *buddy, int order)
 {
int o;
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index 2287dfd..9185e2e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -1796,6 +1796,26 @@ static int mr_get_mtt_size(struct mlx4_mpt_entry *mpt)
return be32_to_cpu(mpt->mtt_sz);
 }
 
+static u32 mr_get_pd(struct mlx4_mpt_entry *mpt)
+{
+   return be32_to_cpu(mpt->pd_flags) & 0x00ff;
+}
+
+static int mr_is_fmr(struct mlx4_mpt_entry *mpt)
+{
+   return be32_to_cpu(mpt->pd_flags) & MLX4_MPT_PD_FLAG_FAST_REG;
+}
+
+static int mr_is_bind_enabled(struct mlx4_mpt_entry *mpt)
+{
+   return be32_to_cpu(mpt->flags) & MLX4_MPT_FLAG_BIND_ENABLE;
+}
+
+static int mr_is_region(struct mlx4_mpt_entry *mpt)
+{
+   return be32_to_cpu(mpt->flags) & MLX4_MPT_FLAG_RE
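
The filtering that mlx4_QUERY_DEV_CAP_wrapper() applies to the capabilities reported to a VF amounts to forcing one bit on and clearing the two memory-window bits. A minimal sketch follows, using a made-up `struct sketch_dev_cap` in place of the command mailbox and placeholder bit positions rather than the real register layout.

```c
#include <stdint.h>

/* Placeholder bit positions, chosen for illustration only. */
#define SKETCH_CAP_PORT_MNG_CHG_EV  (1ULL << 0)
#define SKETCH_CAP_MEM_WINDOW       (1ULL << 1)
#define SKETCH_CAP_OTHER            (1ULL << 2)   /* some unrelated capability */
#define SKETCH_BMME_TYPE_2_WIN      (1u << 0)
#define SKETCH_BMME_OTHER           (1u << 1)     /* some unrelated BMME flag */

/* Hypothetical stand-in for the fields read from and written to the outbox. */
struct sketch_dev_cap {
    uint64_t flags;
    uint32_t bmme_flags;
};

/* Adjust the capabilities a guest is allowed to see. */
static void sketch_filter_caps_for_vf(struct sketch_dev_cap *cap)
{
    /* Always report the port management change event capability. */
    cap->flags |= SKETCH_CAP_PORT_MNG_CHG_EV;

    /* Hide type 1 and type 2 memory windows from the guest. */
    cap->flags &= ~SKETCH_CAP_MEM_WINDOW;
    cap->bmme_flags &= ~SKETCH_BMME_TYPE_2_WIN;
}
```

Unrelated bits pass through unchanged, so the guest still sees every other capability the device reports.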

[PATCH for-next 00/10] mlx4: Add Memory Windows support

2013-02-06 Thread Or Gerlitz
Hi Roland,

Here's a series from Shani Michaeli and Haggai Eran that adds mlx4 driver 
support for Memory Windows.

The first entries in this set are "pre patches" that prepare the ground for the 
actual implementation of MWs. They are followed by two core patches, one to 
ib_verbs.h adding support for type 2 MWs and another to uverbs that 
exposes MW commands to user space. Finally come the actual mlx4 driver 
MW patches.

Or.


Shani Michaeli (10):
  IB/mlx4_ib: Remove local invalidate segment unused fields
  net/mlx4_core: Rename MPT related service routines to have mpt_ prefix
  net/mlx4_core: Propogate MR deregistration failure
  net/mlx4_core: Disable memory windows for VFs
  net/mlx4_core: Enable memory windows in {INIT,QUERY}_HCA
  IB/core: Enhance memory windows support
  IB/uverbs: Implement memory windows support in uverbs
  mlx4: Implement memory windows allocation and deallocation
  IB/mlx4_ib: Support memory window binding
  IB/mlx4_ib: Advertize MW support

 drivers/infiniband/core/uverbs.h   |2 +
 drivers/infiniband/core/uverbs_cmd.c   |  121 +
 drivers/infiniband/core/uverbs_main.c  |   13 ++-
 drivers/infiniband/core/verbs.c|5 +-
 drivers/infiniband/hw/cxgb3/iwch_provider.c|5 +-
 drivers/infiniband/hw/cxgb3/iwch_qp.c  |   15 +-
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h |2 +-
 drivers/infiniband/hw/cxgb4/mem.c  |5 +-
 drivers/infiniband/hw/ehca/ehca_iverbs.h   |2 +-
 drivers/infiniband/hw/ehca/ehca_mrmw.c |5 +-
 drivers/infiniband/hw/mlx4/main.c  |   19 ++
 drivers/infiniband/hw/mlx4/mlx4_ib.h   |   14 ++
 drivers/infiniband/hw/mlx4/mr.c|   87 +-
 drivers/infiniband/hw/mlx4/qp.c|   41 -
 drivers/infiniband/hw/nes/nes_verbs.c  |   19 ++-
 drivers/net/ethernet/mellanox/mlx4/en_main.c   |4 +-
 drivers/net/ethernet/mellanox/mlx4/fw.c|   14 ++-
 drivers/net/ethernet/mellanox/mlx4/fw.h|1 +
 drivers/net/ethernet/mellanox/mlx4/main.c  |4 +
 drivers/net/ethernet/mellanox/mlx4/mlx4.h  |   34 +++-
 drivers/net/ethernet/mellanox/mlx4/mr.c|  186 +++-
 .../net/ethernet/mellanox/mlx4/resource_tracker.c  |   63 ++-
 include/linux/mlx4/device.h|   22 ++-
 include/linux/mlx4/qp.h|   19 ++-
 include/rdma/ib_verbs.h|   73 +++-
 include/uapi/rdma/ib_user_verbs.h  |   16 ++
 net/sunrpc/xprtrdma/verbs.c|   20 +-
 27 files changed, 683 insertions(+), 128 deletions(-)



[PATCH for-next 03/10] net/mlx4_core: Propogate MR deregistration failure

2013-02-06 Thread Or Gerlitz
From: Shani Michaeli 

MR deregistration fails when memory windows are still bound to the MR.
Handle such failures by propagating them to the calling ULP.

Signed-off-by: Haggai Eran 
Signed-off-by: Shani Michaeli 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/hw/mlx4/mr.c  |   13 +++
 drivers/net/ethernet/mellanox/mlx4/en_main.c |4 +-
 drivers/net/ethernet/mellanox/mlx4/mr.c  |   29 +++--
 include/linux/mlx4/device.h  |2 +-
 4 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index bbaf617..254e1cf 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -68,7 +68,7 @@ struct ib_mr *mlx4_ib_get_dma_mr(struct ib_pd *pd, int acc)
return &mr->ibmr;
 
 err_mr:
-   mlx4_mr_free(to_mdev(pd->device)->dev, &mr->mmr);
+   (void) mlx4_mr_free(to_mdev(pd->device)->dev, &mr->mmr);
 
 err_free:
kfree(mr);
@@ -163,7 +163,7 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
return &mr->ibmr;
 
 err_mr:
-   mlx4_mr_free(to_mdev(pd->device)->dev, &mr->mmr);
+   (void) mlx4_mr_free(to_mdev(pd->device)->dev, &mr->mmr);
 
 err_umem:
ib_umem_release(mr->umem);
@@ -177,8 +177,11 @@ err_free:
 int mlx4_ib_dereg_mr(struct ib_mr *ibmr)
 {
struct mlx4_ib_mr *mr = to_mmr(ibmr);
+   int ret;
 
-   mlx4_mr_free(to_mdev(ibmr->device)->dev, &mr->mmr);
+   ret = mlx4_mr_free(to_mdev(ibmr->device)->dev, &mr->mmr);
+   if (ret)
+   return ret;
if (mr->umem)
ib_umem_release(mr->umem);
kfree(mr);
@@ -212,7 +215,7 @@ struct ib_mr *mlx4_ib_alloc_fast_reg_mr(struct ib_pd *pd,
return &mr->ibmr;
 
 err_mr:
-   mlx4_mr_free(dev->dev, &mr->mmr);
+   (void) mlx4_mr_free(dev->dev, &mr->mmr);
 
 err_free:
kfree(mr);
@@ -291,7 +294,7 @@ struct ib_fmr *mlx4_ib_fmr_alloc(struct ib_pd *pd, int acc,
return &fmr->ibfmr;
 
 err_mr:
-   mlx4_mr_free(to_mdev(pd->device)->dev, &fmr->mfmr.mr);
+   (void) mlx4_mr_free(to_mdev(pd->device)->dev, &fmr->mfmr.mr);
 
 err_free:
kfree(fmr);
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
index 3a2b8c6..a298714 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_main.c
@@ -176,7 +176,7 @@ static void mlx4_en_remove(struct mlx4_dev *dev, void *endev_ptr)
 
flush_workqueue(mdev->workqueue);
destroy_workqueue(mdev->workqueue);
-   mlx4_mr_free(dev, &mdev->mr);
+   (void) mlx4_mr_free(dev, &mdev->mr);
iounmap(mdev->uar_map);
mlx4_uar_free(dev, &mdev->priv_uar);
mlx4_pd_free(dev, mdev->priv_pdn);
@@ -283,7 +283,7 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
return mdev;
 
 err_mr:
-   mlx4_mr_free(dev, &mdev->mr);
+   (void) mlx4_mr_free(dev, &mdev->mr);
 err_map:
if (!mdev->uar_map)
iounmap(mdev->uar_map);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c b/drivers/net/ethernet/mellanox/mlx4/mr.c
index 49705cf..06b16e4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -442,7 +442,7 @@ int mlx4_mr_alloc(struct mlx4_dev *dev, u32 pd, u64 iova, u64 size, u32 access,
 }
 EXPORT_SYMBOL_GPL(mlx4_mr_alloc);
 
-static void mlx4_mr_free_reserved(struct mlx4_dev *dev, struct mlx4_mr *mr)
+static int mlx4_mr_free_reserved(struct mlx4_dev *dev, struct mlx4_mr *mr)
 {
int err;
 
@@ -450,20 +450,31 @@ static void mlx4_mr_free_reserved(struct mlx4_dev *dev, struct mlx4_mr *mr)
err = mlx4_HW2SW_MPT(dev, NULL,
 key_to_hw_index(mr->key) &
 (dev->caps.num_mpts - 1));
-   if (err)
-   mlx4_warn(dev, "xxx HW2SW_MPT failed (%d)\n", err);
+   if (err) {
+   mlx4_warn(dev, "HW2SW_MPT failed (%d),", err);
+   mlx4_warn(dev, "MR has MWs bound to it.\n");
+   return err;
+   }
 
mr->enabled = MLX4_MPT_EN_SW;
}
mlx4_mtt_cleanup(dev, &mr->mtt);
+
+   return 0;
 }
 
-void mlx4_mr_free(struct mlx4_dev *dev, struct mlx4_mr *mr)
+int mlx4_mr_free(struct mlx4_dev *dev, struct mlx4_mr *mr)
 {
-   mlx4_mr_free_reserved(dev, mr);
+   int ret;
+
+   ret = mlx4_mr_free_reserved(dev, mr);
+   if (ret)
+   return ret;
if (mr->enabled)
mlx4_mpt_free_icm(dev, key_to_hw_index(mr->key));
mlx4_mpt_release(dev, key_to_hw_index(mr->key));
+
+   return 0;
 }
 EXPORT_SYMBOL_GPL(mlx4_mr_free);
 
@@ -831,7 +842,7 @@ int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages,
return 0;
 
 err_free:
-   mlx4_mr_free(dev, &fmr->mr);
+ 
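
The change of mlx4_mr_free() from void to int boils down to one rule: when the hardware refuses to release the MR, report the error and keep the software state so the caller can retry. The sketch below models that with a made-up `struct sketch_mr`. The use of -EBUSY is an assumption for illustration; the patch simply passes on whatever mlx4_HW2SW_MPT() returns.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical MR state: windows still bound, and whether HW owns the entry. */
struct sketch_mr {
    int bound_windows;
    int hw_owned;
};

/* Stand-in for the HW2SW_MPT command: refuses while windows are bound. */
static int sketch_hw_release(struct sketch_mr *mr)
{
    if (mr->bound_windows > 0)
        return -EBUSY;      /* assumed error code, see the note above */
    mr->hw_owned = 0;
    return 0;
}

/* Mirrors the new int-returning free path: stop at the first failure. */
static int sketch_mr_free(struct sketch_mr *mr)
{
    int ret = sketch_hw_release(mr);

    if (ret)
        return ret;
    return 0;
}

/* Mirrors the dereg path: release memory only after a successful free. */
static int sketch_dereg_mr(struct sketch_mr **mrp)
{
    int ret = sketch_mr_free(*mrp);

    if (ret)
        return ret;         /* object stays valid so the caller can retry */
    free(*mrp);
    *mrp = NULL;
    return 0;
}
```

On error paths that are already unwinding, the patch casts the result to (void), since there is nothing useful to do with a second failure there.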

[PATCH for-next 01/10] IB/mlx4_ib: Remove local invalidate segment unused fields

2013-02-06 Thread Or Gerlitz
From: Shani Michaeli 

Remove unused fields from the local invalidate WQE segment structure.

Signed-off-by: Haggai Eran 
Signed-off-by: Shani Michaeli 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/hw/mlx4/qp.c |6 ++
 include/linux/mlx4/qp.h |8 +++-
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 19e0637..c6dde71 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -1983,10 +1983,8 @@ static void set_fmr_seg(struct mlx4_wqe_fmr_seg *fseg, struct ib_send_wr *wr)
 
 static void set_local_inv_seg(struct mlx4_wqe_local_inval_seg *iseg, u32 rkey)
 {
-   iseg->flags = 0;
-   iseg->mem_key   = cpu_to_be32(rkey);
-   iseg->guest_id  = 0;
-   iseg->pa= 0;
+   memset(iseg, 0, sizeof(*iseg));
+   iseg->mem_key = cpu_to_be32(rkey);
 }
 
 static __always_inline void set_raddr_seg(struct mlx4_wqe_raddr_seg *rseg,
diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h
index 4b4ad6f..6c8a68c 100644
--- a/include/linux/mlx4/qp.h
+++ b/include/linux/mlx4/qp.h
@@ -304,12 +304,10 @@ struct mlx4_wqe_fmr_ext_seg {
 };
 
 struct mlx4_wqe_local_inval_seg {
-   __be32  flags;
-   u32 reserved1;
+   u64 reserved1;
__be32  mem_key;
-   u32 reserved2[2];
-   __be32  guest_id;
-   __be64  pa;
+   u32 reserved2;
+   u64 reserved3[2];
 };
 
 struct mlx4_wqe_raddr_seg {
-- 
1.7.1
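
Since this patch reshapes a hardware-visible WQE segment, the new layout should keep the same total size and the same mem_key offset as the old one. The sketch below mirrors both layouts with fixed-width types from <stdint.h> standing in for the kernel's __be32/__be64/u32/u64. The `sketch_*` struct names are invented for the comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch. */
struct sketch_inval_seg_old {
    uint32_t flags;
    uint32_t reserved1;
    uint32_t mem_key;
    uint32_t reserved2[2];
    uint32_t guest_id;
    uint64_t pa;
};

/* Layout after the patch: unused fields folded into reserved space. */
struct sketch_inval_seg_new {
    uint64_t reserved1;
    uint32_t mem_key;
    uint32_t reserved2;
    uint64_t reserved3[2];
};

/* Compile-time checks that the segment size and key offset are unchanged. */
_Static_assert(sizeof(struct sketch_inval_seg_old) ==
               sizeof(struct sketch_inval_seg_new),
               "segment size must not change");
_Static_assert(offsetof(struct sketch_inval_seg_old, mem_key) ==
               offsetof(struct sketch_inval_seg_new, mem_key),
               "mem_key must stay at the same offset");
```

With the layouts matching, zeroing the whole segment with memset() and then writing mem_key produces the same bytes as the four removed assignments did.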



Re: NFS over RDMA crashing

2013-02-06 Thread Steve Wise

On 2/6/2013 9:48 AM, Yan Burman wrote:
> When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I was
> no longer getting the server crashes,
> so the rest of my tests were done using that point (it is somewhere
> in the middle of 3.7.0-rc2).




+tom tucker

I'd try going back a few kernels, like to 3.5.x and see if things are 
more stable.  If you find a point that works, then git bisect might help 
identify the regression.



NFS over RDMA crashing

2013-02-06 Thread Yan Burman

Hi.

I have been trying to set up NFS over RDMA, but I am getting crashes.

I am using a Mellanox ConnectX-3 HCA with SR-IOV enabled and two KVM VMs
running RHEL 6.3, each with one VF.
My test case uses one VM's storage from the other over NFS/RDMA
(192.168.20.210 is the server, 192.168.20.211 is the client).
I started with two physical hosts, but because of the crashes I moved to
VMs, which are easier to debug.

I have a functional IPoIB connection between the two VMs, and rping also
works between them.

My /etc/exports has the following entry:
/mnt/tmp *(fsid=1,rw,async,insecure,all_squash)
while /mnt/tmp has tmpfs mounted on it.

My mount command is:
mount -t nfs -o rdma,port=2050 192.168.20.210:/mnt/tmp /mnt/tmp

I tried the latest net-next kernel first, but I was getting the following
errors:


=
[ INFO: possible recursive locking detected ]
3.8.0-rc5+ #4 Not tainted
-
kworker/6:0/49 is trying to acquire lock:
 (&id_priv->handler_mutex){+.+.+.}, at: [] rdma_destroy_id+0x33/0x250 [rdma_cm]


but task is already holding lock:
 (&id_priv->handler_mutex){+.+.+.}, at: [] cma_disable_callback+0x2b/0x60 [rdma_cm]


other info that might help us debug this:
 Possible unsafe locking scenario:

   CPU0
   
  lock(&id_priv->handler_mutex);
  lock(&id_priv->handler_mutex);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by kworker/6:0/49:
 #0:  (ib_cm){.+.+.+}, at: [] process_one_work+0x160/0x720
 #1:  ((&(&work->work)->work)){+.+.+.}, at: [] process_one_work+0x160/0x720
 #2:  (&id_priv->handler_mutex){+.+.+.}, at: [] cma_disable_callback+0x2b/0x60 [rdma_cm]


stack backtrace:
Pid: 49, comm: kworker/6:0 Not tainted 3.8.0-rc5+ #4
Call Trace:
 [] validate_chain+0xdcc/0x11f0
 [] ? save_trace+0x3f/0xc0
 [] __lock_acquire+0x440/0xc30
 [] ? __lock_acquire+0x440/0xc30
 [] lock_acquire+0x95/0x1e0
 [] ? rdma_destroy_id+0x33/0x250 [rdma_cm]
 [] ? rdma_destroy_id+0x33/0x250 [rdma_cm]
 [] mutex_lock_nested+0x5f/0x3b0
 [] ? rdma_destroy_id+0x33/0x250 [rdma_cm]
 [] ? trace_hardirqs_on_caller+0x10d/0x1a0
 [] ? trace_hardirqs_on+0xd/0x10
 [] ? _raw_spin_unlock_irqrestore+0x3d/0x80
 [] rdma_destroy_id+0x33/0x250 [rdma_cm]
 [] cma_req_handler+0x719/0x730 [rdma_cm]
 [] ? _raw_spin_unlock_irqrestore+0x4/0x80
 [] cm_process_work+0x22/0x170 [ib_cm]
 [] cm_req_handler+0x67d/0xa70 [ib_cm]
 [] cm_work_handler+0x12d/0x1218 [ib_cm]
 [] process_one_work+0x1d2/0x720
 [] ? process_one_work+0x160/0x720
 [] ? cm_req_handler+0xa70/0xa70 [ib_cm]
 [] worker_thread+0x120/0x460
 [] ? preempt_schedule+0x44/0x60
 [] ? manage_workers+0x300/0x300
 [] kthread+0xd6/0xe0
 [] ? __init_kthread_worker+0x70/0x70
 [] ret_from_fork+0x7c/0xb0
 [] ? __init_kthread_worker+0x70/0x70


When killing the mount command that got stuck:
---

BUG: unable to handle kernel paging request at 880324dc7ff8
IP: [] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
PGD 1a0c063 PUD 32f82e063 PMD 32f2fd063 PTE 800324dc7161
Oops: 0003 [#1] PREEMPT SMP
Modules linked in: md5 ib_ipoib xprtrdma svcrdma rdma_cm ib_cm iw_cm 
ib_addr nfsd exportfs netconsole ip6table_filter ip6_tables 
iptable_filter ip_tables ebtable_nat nfsv3 nfs_acl ebtables x_tables 
nfsv4 auth_rpcgss nfs lockd autofs4 sunrpc target_core_iblock 
target_core_file target_core_pscsi target_core_mod configfs 8021q bridge 
stp llc ipv6 dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan 
tun uinput iTCO_wdt iTCO_vendor_support kvm_intel kvm crc32c_intel 
microcode pcspkr joydev i2c_i801 lpc_ich mfd_core ehci_pci ehci_hcd sg 
ioatdma ixgbe mdio mlx4_ib ib_sa ib_mad ib_core mlx4_en mlx4_core igb 
hwmon dca ptp pps_core button dm_mod ext3 jbd sd_mod ata_piix libata 
uhci_hcd megaraid_sas scsi_mod

CPU 6
Pid: 4744, comm: nfsd Not tainted 3.8.0-rc5+ #4 Supermicro 
X8DTH-i/6/iF/6F/X8DTH
RIP: 0010:[]  [] 
rdma_read_xdr+0x8bb/0xd40 [svcrdma]

RSP: 0018:880324c3dbf8  EFLAGS: 00010297
RAX: 880324dc8000 RBX: 0001 RCX: 880324dd8428
RDX: 880324dc7ff8 RSI: 880324dd8428 RDI: 81149618
RBP: 880324c3dd78 R08: 60f9c860 R09: 0001
R10: 880324dd8000 R11: 0001 R12: 8806299dcb10
R13: 0003 R14: 0001 R15: 0010
FS:  () GS:88063fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 880324dc7ff8 CR3: 01a0b000 CR4: 07e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process nfsd (pid: 4744, threadinfo 880324c3c000, task 88033055)
Stack:
 880324c3dc78 880324c3dcd8 0282 880631cec000
 880324dd8000 88062ed33040 000124c3dc48 880324dd8000
 88062ed33058 880630ce2b90 8806299e8000 0003
Call Trace:
 [] svc_rdma_recv

Re: "Virtual" ibnetdiscover command fails

2013-02-06 Thread Or Gerlitz

On 06/02/2013 12:40, Sebastian Riemer wrote:
> So if I don't use the unmaintained srptools to get the SRP connection
> strings but instead send them directly to the initiator to connect to
> the SRP target, then SRP should also be possible with the virtual
> GUID. Am I right?

Basically YES, you can use the initiator VM's vGID as the source GID for
the connection.


Or.



Re: "Virtual" ibnetdiscover command fails

2013-02-06 Thread Sebastian Riemer
On 06.02.2013 11:20, Or Gerlitz wrote:
> On 06/02/2013 12:04, Mathis GAVILLON wrote:
>> Just a last question: is it possible for a VF's LID to be different
>> from the PF's?
>
> NO, we've implemented a "shared port" model, so all functions on the
> same IB port use the same LID; each function has its own
> virtual GUID, though.

So if I don't use the unmaintained srptools to get the SRP connection
strings but instead send them directly to the initiator to connect to
the SRP target, then SRP should also be possible with the virtual GUID.
Am I right?

Cheers,
Sebastian


Re: "Virtual" ibnetdiscover command fails

2013-02-06 Thread Sebastian Riemer
On 06.02.2013 10:22, Or Gerlitz wrote:
> On 06/02/2013 11:17, Mathis GAVILLON wrote:
>> Ok. But what is possible with InfiniBand VFs if QP0 is not
>> available?
>
> EVERYTHING, e.g. run IPoIB, iSER, RDS, MPI, etc, etc - except for what
> requires QP0, such as running SM or issuing SMPs for
> discovery/diagnostics purposes

But I've heard that SRP isn't provided with SR-IOV. Is it just a matter of
software, or is it a matter of firmware/hardware?



Re: "Virtual" ibnetdiscover command fails

2013-02-06 Thread Or Gerlitz

On 06/02/2013 12:04, Mathis GAVILLON wrote:
> Just a last question: is it possible for a VF's LID to be different
> from the PF's?

NO, we've implemented a "shared port" model, so all functions on the
same IB port use the same LID; each function has its own virtual GUID,
though.

Or.
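
To make the "shared port" model above concrete, here is a minimal Python sketch. It is only an illustration of the relationship described in this reply (one LID per IB port, one GUID per function); the class names, function labels, and numeric values are hypothetical and do not come from any driver API.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    """One PCI function (PF or VF) exposed on an IB port."""
    name: str  # illustrative label such as "PF" or "VF0"
    guid: int  # per-function (virtual) GUID


@dataclass
class SharedPort:
    """An IB port whose single LID is shared by every function on it."""
    lid: int
    functions: list = field(default_factory=list)

    def add_function(self, name, guid):
        # Each function needs its own GUID, even though the LID is shared.
        if any(f.guid == guid for f in self.functions):
            raise ValueError("GUID already in use on this port")
        self.functions.append(Function(name, guid))

    def address_of(self, name):
        # Every function reports the same port LID, paired with its own GUID.
        for f in self.functions:
            if f.name == name:
                return (self.lid, f.guid)
        raise KeyError(name)
```

With a port created as `SharedPort(lid=4)` and two functions added, `address_of` returns the same LID for both and a different GUID for each, which is why a peer has to distinguish functions by GID rather than by LID.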




Thanks

2013/2/6 Mathis GAVILLON :

EVERYTHING, e.g. run IPoIB, iSER, RDS, MPI, etc, etc - except for what
requires QP0, such as running SM or issuing SMPs for discovery/diagnostics
purposes

Ok. I am just getting started with InfiniBand technology, so I don't know
everything about it yet.

Thanks

Mathis




[ANNOUNCE] OpenSM tarball release

2013-02-06 Thread Alex Netes
Hi,

There is a new release of OpenSM.
The tarball is available at:

http://www.openfabrics.org/downloads/management/

(listed in http://www.openfabrics.org/downloads/management/latest.txt)

md5sum:
32b16efbaba69d478f8c05df42ce0462  opensm-3.3.16.tar.gz
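
For readers who want to check a downloaded tarball against the checksum above, here is a minimal Python sketch. It assumes the file has already been downloaded locally; the helper names `md5_of_file` and `tarball_matches` are illustrative and are not part of any OpenSM tooling.

```python
import hashlib

# MD5 published in this announcement for opensm-3.3.16.tar.gz
EXPECTED_MD5 = "32b16efbaba69d478f8c05df42ce0462"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tarball_matches(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected.lower()
```

Calling `tarball_matches("opensm-3.3.16.tar.gz")` on the downloaded file does the same comparison as running `md5sum` and checking the output by eye. MD5 only guards against a corrupted download, not against deliberate tampering.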

All component versions are from the recent master branch. The full list of
changes is below.

Albert Chu (5):
  opensm: Manage ports that do not support congestion control
  opensm: Fix signed vs unsigned int comparison
  opensm: Protect against spurious wakeups when calling cl_event_wait_on
  opensm/osm_perfmgr_db.c: Fix output error due to possible 32bit int overflow
  opensm: Add better error output when parsing node name maps

Alex Netes (15):
  opensm: fix crash in DFSSSP routing engine on reroute
  opensm: fix default cc_max_outstanding_mads assignment
  opensm/osm_subnet.c: Only parameters that marked with can_update flag should be updated during conf file rescan
  opensm: Changed #if to #ifdef when using ENABLE_OSM_PERF_MGR_PROFILE
  opensm/osm_link_mgr.c: Set AM SMSupportExtendedSpeeds bit if port supports ExtPortInfo
  opensm/osm_link_mgr.c: Fix sending PortInfo(Set) with AM SMSupportExtendedSpeeds bit set for switch base port 0
  opensm: Revert "opensm/osm_ucast_ftree: When roots are not connected, update hop count but not lft"
  opensm/osm_sm_mad_ctrl.c: Upon receiving trap repress we should decrease qp0_mads_outstanding_on_wire
  opensm: Add physp_p discovery count support
  opensm/osm_sm_state_mgr.c: Start sweep immedeately when recieving HANDOVER in DISCOVERING state
  opensm/configure.in: Remove Default-Start from opensmd init script
  opensm/osm_req.c: fix first sweep m_key search algorithm
  opensm: update shared library versions
  opensm_release_notes-3.3: update
  opensm: packages versions update

Bart Van Assche (7):
  opensm: osm_pkey: Remove unused variables
  opensm: Add .gitignore
  Correct option names in opensm man page
  Add command-line option --pidfile
  Make it possible to enable opensm with chkconfig
  opensm.spec.in: Improve portability
  /etc/init.d/opensmd: Improve systemd integration

Dan Ben Yosef (3):
  opensm/osm_ucast_dfsssp.c : Fix resource leak
  opensm/osm_ucast_dfsssp.c : fix dereference null return value
  opensm/osm_ucast_dfsssp.c : fix dereference before null check

Daniel Klein (1):
  opensm: improve search common pkeys.

Garrett Cooper (3):
  Fix linker error with clang with -O < 2
  Fix -Wtautological-compare warnings with clang
  Fix -Wformat-security warnings with clang

Hal Rosenstock (51):
  opensm/osm_vendor_ibumad.c: Add management class to error log message
  opensm/osm_sw_info_rcv.c: Fixed locking issue on osm_get_node_by_guid error
  OpenSM: Add new Mellanox OUI
  osmtest/osmt_multicast.c: Fix 02BF error
  opensm/osm_torus.c: Add error code to error log message
  opensm/complib/cl_spinlock.h: Remove some unimplemented routines
  opensm/ib_types.h: Commentary and cosmetic formatting change
  opensm/osm_sa.h: Cosmetic commentary change
  opensm/osm_ucast_updn.c: Add error codes to a couple of log messages
  opensm/osm_helper.c: Add some missing new lines to log message output
  opensm/osm_torus.c: Cosmetic formatting change
  opensm: Track minimum value in the fabric for data VLs supported on switch external ports
  opensm/osm_torus.c: Check fabric minimum data VLs on switch external ports
  opensm/complib/cl_atomic_osd.h: Fix long standing bug in cl_atomic_sub
  opensm/osm_trap_rcv.c: Eliminate unneeded trap_rcv_process_response routine
  opensm/osm_vl15intf.c: Fix commentary typo
  opensm/include/complib/cl_packon.h: Fix some commentary typos
  opensm/osm_sa_mcmember_record.c: Return proper scope for query with valid SA key
  Add Per Module Logging support for Congestion Manager
  opensm/include/osm_opensm.h: Fix commentary typo
  opensm: Add routing specific update_vlarb hook routine
  opensm/osm_torus.c: Require only 2 data VLs supported (PortInfo.VLCap) and use VLs 0-1 on CA links
  opensm: Update doc for changes to torus routing for CA, support
  opensm/osm_torus.c: Improve QoS configuration
  opensm/osm_torus.c: Add copyright
  opensm: Update doc for changes to torus routing for, endport support
  opensm/osm_torus.c: Minor simplification to check_qos_config
  opensm/osm_torus.c: Improve some misconfiguration error messages
  opensm/osm_req.c: In req_determine_mkey, add more info when ERR 1107 occurs
  opensm/osm_subnet.c: Improve error messages in subn_validate_neighbor
  opensm/osm_ucast_ftree.c: Remove duplicate free in fabric_create_leaf_switch_array
  opensm/osm_ucast_ftree.c: Eliminate unneeded NULL pointer checks prior to calls to free
  opensm/osm_torus.c: Fix crash in torus_update_osm_vlarb
  opensm/osm_port_info_rcv.c

Re: "Virtual" ibnetdiscover command fails

2013-02-06 Thread Mathis GAVILLON
> EVERYTHING, e.g. run IPoIB, iSER, RDS, MPI, etc, etc - except for what
> requires QP0, such as running SM or issuing SMPs for discovery/diagnostics
> purposes

Ok. I am just getting started with InfiniBand technology, so I don't know
everything about it yet.

Thanks

Mathis


Re: "Virtual" ibnetdiscover command fails

2013-02-06 Thread Or Gerlitz

On 06/02/2013 11:17, Mathis GAVILLON wrote:
> Ok. But what is possible with InfiniBand VFs if QP0 is not available?

EVERYTHING, e.g. run IPoIB, iSER, RDS, MPI, etc, etc - except for what
requires QP0, such as running SM or issuing SMPs for
discovery/diagnostics purposes.


2013/2/5 Jack Morgenstein :

Mathis,

You cannot use SMP packets on a virtual host (this is a security issue,
VFs are not trusted).  Since QP0 (SMP) is not available on VFs,
any tool which attempts to use QP0 (SMPs) will fail.

Thus, OpenSM will not run over a VF, nor will ibnetdiscover,
nor will sminfo (which uses SMP).
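
The rule described above (anything that needs QP0/SMPs fails on a VF, everything else works) can be sketched as a small lookup in Python. The table lists only the tools and ULPs named in this thread; the function name `usable_on_vf` and the table itself are illustrative assumptions, not an API of any IB tool.

```python
# Tools and ULPs named in this thread, mapped to whether they need QP0 (SMPs).
NEEDS_QP0 = {
    "opensm": True,
    "ibnetdiscover": True,
    "sminfo": True,
    "ipoib": False,
    "iser": False,
    "rds": False,
    "mpi": False,
}


def usable_on_vf(name):
    """Return True if the named tool/ULP is expected to work on a VF.

    A VF has no QP0, so anything that sends SMPs is expected to fail there.
    """
    try:
        return not NEEDS_QP0[name.lower()]
    except KeyError:
        raise ValueError(f"unknown tool or ULP: {name!r}") from None
```

For example, `usable_on_vf("IPoIB")` is true while `usable_on_vf("ibnetdiscover")` is false; names not mentioned in the thread raise an error rather than being guessed.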

-Jack




Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8

2013-02-06 Thread Or Gerlitz

On 06/02/2013 09:59, Bart Van Assche wrote:
> On 02/06/13 08:44, Or Gerlitz wrote:
>> On 06/02/2013 09:22, Bart Van Assche wrote:
>>> A huge number of patches have been taken upstream between 3.8-rc1 and
>>> 3.8-rc6. I have retested these three patches with 3.8-rc6 and would
>>> appreciate it if you would also repeat your tests.
>>
>> Not really... this is what I see in Linus' tree for the relevant
>> directories. Is there anywhere else I need to look?
>>
>> linux-2.6]# git log --oneline v3.8-rc1..v3.8-rc6 drivers/scsi/ drivers/block/ block/ drivers/infiniband/ulp/srp
>> bdb0ae6 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>> 83e6818 efi: Make 'efi_enabled' a function to query EFI facilities
>> 2263647 Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
>> 8d85fce Drivers: block: remove __dev* attributes.
>> 6f03979 Drivers: scsi: remove __dev* attributes.
>> f4953fe virtio-blk: Don't free ida when disk is in use
>
> Nobody outside Mellanox has ever been able to reproduce the behavior
> reported by you.

I have asked for a second opinion so we can get a quorum either way.

> Something in your tests might have been specific to the Mellanox
> environment. Have you perhaps been running your tests with a firmware
> version that is not available to the general public?

NO

> I would appreciate it if you could check your test environment and
> repeat your tests.

We will repeat the tests, indeed.

Or.