Re: qib_lookup_qpn() appears to leak pointer out of rcu_read_unlock()

2014-02-12 Thread Paul E. McKenney
On Wed, Feb 12, 2014 at 02:57:14PM +, Marciniszyn, Mike wrote:
> BTW, I am considering eliminating the atomic_inc() in favor of widening the 
> scope of the rcu lock expanse.

As long as the newly included code doesn't block, that should work fine.
(If it does block, another option is SRCU.)

Thanx, Paul

> Mike
> 
> > -Original Message-
> > From: Paul E. McKenney [mailto:paul...@linux.vnet.ibm.com]
> > Sent: Wednesday, February 12, 2014 9:56 AM
> > To: Marciniszyn, Mike
> > Cc: rol...@kernel.org; Hefty, Sean; hal.rosenst...@gmail.com; linux-
> > r...@vger.kernel.org; linux-ker...@vger.kernel.org
> > Subject: Re: qib_lookup_qpn() appears to leak pointer out of 
> > rcu_read_unlock()
> > 
> > On Wed, Feb 12, 2014 at 01:59:30PM +, Marciniszyn, Mike wrote:
> > > > So what am I missing here?
> > > >
> > >
> > > The atomic increment of a reference count:
> > 
> > Got it, thank you, apologies for the noise!
> > 
> > Thanx, Paul
> > 
> > > struct qib_qp *qib_lookup_qpn(struct qib_ibport *ibp, u32 qpn) {
> > > struct qib_qp *qp = NULL;
> > >
> > > rcu_read_lock();
> > > if (unlikely(qpn <= 1)) {
> > > if (qpn == 0)
> > > qp = rcu_dereference(ibp->qp0);
> > > else
> > > qp = rcu_dereference(ibp->qp1);
> > > if (qp)
> > > atomic_inc(&qp->refcount); 
> > > <--
> > > } else {
> > > struct qib_ibdev *dev = &ppd_from_ibp(ibp)->dd->verbs_dev;
> > > unsigned n = qpn_hash(dev, qpn);
> > >
> > > for (qp = rcu_dereference(dev->qp_table[n]); qp;
> > > qp = rcu_dereference(qp->next))
> > > if (qp->ibqp.qp_num == qpn) {
> > > atomic_inc(&qp->refcount); 
> > > <-
> > > break;
> > > }
> > > }
> > > rcu_read_unlock();
> > > return qp;
> > > }
> > >
> > > Mike
> > >
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH opensm] osm_ucast_mgr.c: Use LFT block of all port 0s to indicate resend

2014-02-12 Thread Hal Rosenstock

In set_lft_block, use LFT block set to port 0 rather than invalid port
as a better way to detect that resending the LFT block is needed.

Signed-off-by: Hal Rosenstock 
--
diff --git a/opensm/osm_ucast_mgr.c b/opensm/osm_ucast_mgr.c
index c8a7360..2329ea6 100644
--- a/opensm/osm_ucast_mgr.c
+++ b/opensm/osm_ucast_mgr.c
@@ -1006,7 +1006,7 @@ static int set_lft_block(IN osm_switch_t *p_sw, IN 
osm_ucast_mgr_t *p_mgr,
 * Zero the stored LFT block, so in case the MAD will end up
 * with error, we will resend it in the next sweep.
 */
-   memset(p_sw->lft + block_id_ho * IB_SMP_DATA_SIZE, OSM_NO_PATH,
+   memset(p_sw->lft + block_id_ho * IB_SMP_DATA_SIZE, 0,
   IB_SMP_DATA_SIZE);
 
OSM_LOG(p_mgr->p_log, OSM_LOG_DEBUG,
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: mlx4 qp allocation

2014-02-12 Thread Bob Biloxi
Hi Jack,

Thanks for the reply. Now I understand. On a related note, I had the
following question. Would really appreciate if you can help answer the
same:

Considering the resources QPs, CQs, EQs etc after going through the
code my understanding is that:

1. Physical Function Driver/Hypervisor allocates memory only for the
ICM space for these resources.
2. Virtual Function Driver needs to allocate corresponding system
memory for the resources

For e.g let's say I need 32K QPs, 64K CQs, 512 EQs, the PF driver
allocates the memory only for the ICM.

The VFs need to allocate the memory for Send Queue Buffer, Receive
Queue Buffer, Completion Queue Buffer, Event Queue Buffer.

Is that right?

Also, as the QPs, CQs etc are created by the HCA when ALLOC_RES
command is issued, does the PF driver need to maintain anything to
associate the QPs, CQs created by the HCA with owners(VFs) possessing
them?

I would really appreciate your help!

Thanks so much..

Best Regards,
Bob


On Tue, Feb 11, 2014 at 5:01 PM, Jack Morgenstein
 wrote:
> On Wed, 29 Jan 2014 15:52:09 +0530
> Bob Biloxi  wrote:
>
>> These paths are taken based on the return value of mlx4_is_func(dev).
>> This is true for MASTER or SLAVE which I believe is Physical Function
>> Driver/Virtual Function Driver. So for SRIOV, it covers all cases.
>>
>> The MAP_ICM portion which gets executed as part of __mlx4_qp_alloc_icm
>> never gets called??
>
> For slaves (VFs), the command is sent via the comm channel to the
> Hypervisor.  It is the Hypervisor which invokes map_icm on behalf of
> that slave.
>
> -Jack
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] IB/qib: add missing serdes init sequeuce

2014-02-12 Thread Mike Marciniszyn
Research has shown that commit a77fcf89
("IB/qib: Use a single txselect module parameter for serdes tuning")
missed a key serdes init sequence.

This patch add that sequence.

Cc: 
Reviewed-by: Dennis Dalessandro 
Signed-off-by: Mike Marciniszyn 
---
 drivers/infiniband/hw/qib/qib_iba7322.c |5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/infiniband/hw/qib/qib_iba7322.c 
b/drivers/infiniband/hw/qib/qib_iba7322.c
index 5bfc02f..d1bd213 100644
--- a/drivers/infiniband/hw/qib/qib_iba7322.c
+++ b/drivers/infiniband/hw/qib/qib_iba7322.c
@@ -2395,6 +2395,11 @@ static int qib_7322_bringup_serdes(struct qib_pportdata 
*ppd)
qib_write_kreg_port(ppd, krp_ibcctrl_a, ppd->cpspec->ibcctrl_a);
qib_write_kreg(dd, kr_scratch, 0ULL);
 
+   /* ensure previous Tx parameters are not still forced */
+   qib_write_kreg_port(ppd, krp_tx_deemph_override,
+   SYM_MASK(IBSD_TX_DEEMPHASIS_OVERRIDE_0,
+   reset_tx_deemphasis_override));
+
if (qib_compat_ddr_negotiate) {
ppd->cpspec->ibdeltainprog = 1;
ppd->cpspec->ibsymsnap = read_7322_creg32_port(ppd,

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: qib_lookup_qpn() appears to leak pointer out of rcu_read_unlock()

2014-02-12 Thread Marciniszyn, Mike
BTW, I am considering eliminating the atomic_inc() in favor of widening the 
scope of the rcu lock expanse.

Mike

> -Original Message-
> From: Paul E. McKenney [mailto:paul...@linux.vnet.ibm.com]
> Sent: Wednesday, February 12, 2014 9:56 AM
> To: Marciniszyn, Mike
> Cc: rol...@kernel.org; Hefty, Sean; hal.rosenst...@gmail.com; linux-
> r...@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: Re: qib_lookup_qpn() appears to leak pointer out of rcu_read_unlock()
> 
> On Wed, Feb 12, 2014 at 01:59:30PM +, Marciniszyn, Mike wrote:
> > > So what am I missing here?
> > >
> >
> > The atomic increment of a reference count:
> 
> Got it, thank you, apologies for the noise!
> 
>   Thanx, Paul
> 
> > struct qib_qp *qib_lookup_qpn(struct qib_ibport *ibp, u32 qpn) {
> > struct qib_qp *qp = NULL;
> >
> > rcu_read_lock();
> > if (unlikely(qpn <= 1)) {
> > if (qpn == 0)
> > qp = rcu_dereference(ibp->qp0);
> > else
> > qp = rcu_dereference(ibp->qp1);
> > if (qp)
> > atomic_inc(&qp->refcount); 
> > <--
> > } else {
> > struct qib_ibdev *dev = &ppd_from_ibp(ibp)->dd->verbs_dev;
> > unsigned n = qpn_hash(dev, qpn);
> >
> > for (qp = rcu_dereference(dev->qp_table[n]); qp;
> > qp = rcu_dereference(qp->next))
> > if (qp->ibqp.qp_num == qpn) {
> > atomic_inc(&qp->refcount); 
> > <-
> > break;
> > }
> > }
> > rcu_read_unlock();
> > return qp;
> > }
> >
> > Mike
> >

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: qib_lookup_qpn() appears to leak pointer out of rcu_read_unlock()

2014-02-12 Thread Paul E. McKenney
On Wed, Feb 12, 2014 at 01:59:30PM +, Marciniszyn, Mike wrote:
> > So what am I missing here?
> > 
> 
> The atomic increment of a reference count:

Got it, thank you, apologies for the noise!

Thanx, Paul

> struct qib_qp *qib_lookup_qpn(struct qib_ibport *ibp, u32 qpn)
> {
> struct qib_qp *qp = NULL;
> 
> rcu_read_lock();
> if (unlikely(qpn <= 1)) {
> if (qpn == 0)
> qp = rcu_dereference(ibp->qp0);
> else
> qp = rcu_dereference(ibp->qp1);
> if (qp)
> atomic_inc(&qp->refcount); <--
> } else {
> struct qib_ibdev *dev = &ppd_from_ibp(ibp)->dd->verbs_dev;
> unsigned n = qpn_hash(dev, qpn);
> 
> for (qp = rcu_dereference(dev->qp_table[n]); qp;
> qp = rcu_dereference(qp->next))
> if (qp->ibqp.qp_num == qpn) {
> atomic_inc(&qp->refcount); 
> <-
> break;
> }
> }
> rcu_read_unlock();
> return qp;
> }
> 
> Mike
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: qib_lookup_qpn() appears to leak pointer out of rcu_read_unlock()

2014-02-12 Thread Marciniszyn, Mike
> So what am I missing here?
> 

The atomic increment of a reference count:

struct qib_qp *qib_lookup_qpn(struct qib_ibport *ibp, u32 qpn)
{
struct qib_qp *qp = NULL;

rcu_read_lock();
if (unlikely(qpn <= 1)) {
if (qpn == 0)
qp = rcu_dereference(ibp->qp0);
else
qp = rcu_dereference(ibp->qp1);
if (qp)
atomic_inc(&qp->refcount); <--
} else {
struct qib_ibdev *dev = &ppd_from_ibp(ibp)->dd->verbs_dev;
unsigned n = qpn_hash(dev, qpn);

for (qp = rcu_dereference(dev->qp_table[n]); qp;
qp = rcu_dereference(qp->next))
if (qp->ibqp.qp_num == qpn) {
atomic_inc(&qp->refcount); 
<-
break;
}
}
rcu_read_unlock();
return qp;
}

Mike
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 libibverbs 2/3] Use neighbour lookup for RoCE UD QPs Eth L2 resolution

2014-02-12 Thread Matan Barak

On 11/2/2014 5:18 PM, Yann Droneaud wrote:

Le mardi 11 février 2014 à 14:31 +0200, Or Gerlitz a écrit :

From: Matan Barak 

In order to implement RoCE IP based addressing for UD QPs, without introducing
uverbs changes, we need a way to resolve the L2 Ethernet addresses from 
user-space.

This is done with netlink through libnl, and in libibverbs such that multiple
vendor provider libraries can use the code.

The Ethernet L2 params are passed to the provider driver using an extension
verb drv_ibv_create_ah_ex call. This resolution is done only for providers
that support this extension verb, otherwise, there's not real reason to do that.

The steps for resolution:

1. get sgid

2. from sgid, get the interface

3. query route from interface to the destination

4. query the neigh table, if the MAC for the destination is already known, done.

5. if not, loop until timeout:
 a. send a UDP packet to that IP targeted to the "discard" port


Hi Yann,

Thanks for commenting about this code.



What will happen if the system is configured to refuse to send packet to
this port: eg it exists an netfilter rules that disallow this:

   iptables -t filter -A OUTPUT -p udp --dport 9 -j REJECT --reject-with
icmp-admin-prohibited

(not tested)

And why send a packet in the first place, would it be enough to connect
to the socket to target address:9 ? It's possible to call connect() on a
dgram socket to bind the destination address, but I don't know if it's
trigger a neighbor resolution.



The iptables rule indeed blocks the ARP resolution.
I don't think using connect sends a neighbor resolution, so it's not 
really a solution.
After considering some alternatives, I think that using non-privileged 
ping:

socket(AF_INET, SOCK_DGRAM, IPPROTO_ICMP)
is our best option. What do you think?

If that's an appropriate solution, we'll need some backports to current 
distributions, as the ICMP6 code was only submitted to the kernel last year.



 b. listen to events from the neigh table
 c. query neigh table in order to know if the neigh has shown up between
query until we started monitoring it



Such userspace ping must be configurable: timeout and number of tries
should probably be configurable by userspace.



How do you suggest configuring those parameters without changing the API?


6. query vlan id from the interface

This solution depends on libnl v1.



According to http://www.infradead.org/~tgr/libnl/

"1.1.x releases
Support for 1.1.x releases is limited, backports are only done upon
request. Do not develop new applications based on libnl1 and consider
porting your applications to libnl3"

Please provide some rational for using "obsolete" version 1.



This was done since most current linux distributions (RH 6.x, SLES 11) 
don't have libnl > 1 in their repositories. We didn't see the obsolete 
message, thus we'll provide a libnl3 port of this code. This will 
require supplying backports for libnl1 for these distributions.



Signed-off-by: Matan Barak 
Signed-off-by: Or Gerlitz 
---
  Makefile.am|3 +
  configure.ac   |   11 +
  include/infiniband/verbs.h |   35 ++-
  src/neigh.c|  866 
  src/neigh.h|   42 +++
  src/verbs.c|  160 -
  6 files changed, 1114 insertions(+), 3 deletions(-)
  create mode 100644 src/neigh.c
  create mode 100644 src/neigh.h

diff --git a/Makefile.am b/Makefile.am
index a6767de..f21697f 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -12,6 +12,9 @@ libibverbs_version_script = @LIBIBVERBS_VERSION_SCRIPT@
  src_libibverbs_la_SOURCES = src/cmd.c src/compat-1_0.c src/device.c 
src/init.c \
src/marshall.c src/memory.c src/sysfs.c src/verbs.c 
\
src/enum_strs.c
+if ! NO_RESOLVE_NEIGH
+src_libibverbs_la_SOURCES += src/neigh.c
+endif
  src_libibverbs_la_LDFLAGS = -version-info 1 -export-dynamic \
  $(libibverbs_version_script)
  src_libibverbs_la_DEPENDENCIES = $(srcdir)/src/libibverbs.map
diff --git a/configure.ac b/configure.ac
index b8d4cea..ba67b0a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -28,6 +28,17 @@ else
  fi
  fi

+AC_ARG_WITH([resolve-neigh],
+AC_HELP_STRING([--with-resolve-neigh],
+[Enable neighbour resolution in Ethernet (default YES)]))
+if  test x$with_resolve_neigh = x || test x$with_resolve_neigh = xyes; then
+   AC_CHECK_LIB(nl, rtnl_link_vlan_get_id, [],
+   AC_MSG_ERROR([rtnl_link_vlan_get_id not found.  libibverbs 
requires libnl.]))
+else
+AC_DEFINE([NRESOLVE_NEIGH], 1, [Define to 1 to disable resovle neigh 
annotations.])
+fi
+AM_CONDITIONAL(NO_RESOLVE_NEIGH, test x$with_resolve_neigh = xno)
+
  dnl Checks for libraries
  AC_CHECK_LIB(dl, dlsym, [],
  AC_MSG_ERROR([dlsym() not found.  libibverbs requires libdl.]))
diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index a79a1de..63c4756 100644
--- a/include/infin

Cash Awaiting Pick Up..

2014-02-12 Thread 2014 Heritage Foundation Board


This is to re-notify you that you have $500,000.00 waiting for pick-up at Money 
Gram, Contact 
Mrs Hillary Florence via email : heritd...@xtra.co.nz for claims.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 libibverbs 1/3] Add ibv_port_cap_flags

2014-02-12 Thread Or Gerlitz

On 11/02/2014 15:53, Yann Droneaud wrote:

The last part is no more true.

sure, will fix that for V2, thanks for spotting this!
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html