Re: [PATCH v4 17/50] IB/hfi1: add PSM driver control/data path

2015-07-31 Thread Al Viro
On Fri, Jul 31, 2015 at 12:31:58AM -0700, Christoph Hellwig wrote:
 On Thu, Jul 30, 2015 at 05:42:16PM -0400, Doug Ledford wrote:
  I have no problem with this code.  That Al finds the user space ABI for
  this driver to be bizarre is neither here nor there to me.  Sure, this
  file does not exhibit normal file API behavior.  Who cares?
 
 Everyone who cares about filesystem semantics does.
 
 A NACK from me for a this features, and b) for trying to sneak it in
 after a negative comment without cc to -fsdevel and Al.

FWIW, the lack of comments appears to have tripped Doug, judging by his
another option would be to drop the write function altogether and just make
all commands come through writev/write_iter and if you only have one command,
you only send one element.

The thing is, ipath and qib (and AFAICS this one as well) have write(2) and
writev(2) take different and completely unrelated sets of commands.  On
the same file.  IOW, the effects of
writev(fd, (struct iovec){buf, len}, 1)
and
write(fd, buf, len)
are not even remotely similar.  _That_ is the gratuitous weirdness I'd been
unhappy with.  And yes, it is gratuitous - it's trivial to have separate files
for separate command sets.

If you drop -write() in there, you certainly lose the weirdness - along with
one of those command sets.  Sure, having individual iovecs correspond to
separate datagrams is fine; tons of drivers are like that.  qib and ipath
are unique, though, in having *two* command sets overloaded on the same file,
with write() vs. writev() acting as selector (BTW, single-element AIO
going like writev(), not like write()).

PS: I'm back after several weeks of being sick (recipe for fun: start with
40C in shade, get completely soaked in serious rain, then have half an hour
cab ride, with AC set to ~15C) and while I'd managed to get the mailbox
down to relatively sane size, I might have easily deleted more than I should
have.  If there had been anything important sent my way and I don't reply to 
it by Saturday, please resend.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


mlx5: don't duplicate kvfree()

2014-11-20 Thread Al Viro
Signed-off-by: Al Viro v...@zeniv.linux.org.uk
---
 drivers/infiniband/hw/mlx5/cq.c |8 
 drivers/infiniband/hw/mlx5/mr.c |4 ++--
 drivers/infiniband/hw/mlx5/qp.c |8 
 drivers/infiniband/hw/mlx5/srq.c|6 +++---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c|4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c |4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/port.c  |4 ++--
 include/linux/mlx5/driver.h |8 
 9 files changed, 21 insertions(+), 35 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 10cfce5..c463e7b 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -805,14 +805,14 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev, 
int entries,
}
 
 
-   mlx5_vfree(cqb);
+   kvfree(cqb);
return cq-ibcq;
 
 err_cmd:
mlx5_core_destroy_cq(dev-mdev, cq-mcq);
 
 err_cqb:
-   mlx5_vfree(cqb);
+   kvfree(cqb);
if (context)
destroy_cq_user(cq, context);
else
@@ -1159,11 +1159,11 @@ int mlx5_ib_resize_cq(struct ib_cq *ibcq, int entries, 
struct ib_udata *udata)
}
mutex_unlock(cq-resize_mutex);
 
-   mlx5_vfree(in);
+   kvfree(in);
return 0;
 
 ex_alloc:
-   mlx5_vfree(in);
+   kvfree(in);
 
 ex_resize:
if (udata)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 8ee7cb4..4c89b64 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -853,14 +853,14 @@ static struct mlx5_ib_mr *reg_create(struct ib_pd *pd, 
u64 virt_addr,
goto err_2;
}
mr-umem = umem;
-   mlx5_vfree(in);
+   kvfree(in);
 
mlx5_ib_dbg(dev, mkey = 0x%x\n, mr-mmr.key);
 
return mr;
 
 err_2:
-   mlx5_vfree(in);
+   kvfree(in);
 
 err_1:
kfree(mr);
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index e261a53..0e2ef9f 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -647,7 +647,7 @@ err_unmap:
mlx5_ib_db_unmap_user(context, qp-db);
 
 err_free:
-   mlx5_vfree(*in);
+   kvfree(*in);
 
 err_umem:
if (qp-umem)
@@ -761,7 +761,7 @@ err_wrid:
kfree(qp-rq.wrid);
 
 err_free:
-   mlx5_vfree(*in);
+   kvfree(*in);
 
 err_buf:
mlx5_buf_free(dev-mdev, qp-buf);
@@ -971,7 +971,7 @@ static int create_qp_common(struct mlx5_ib_dev *dev, struct 
ib_pd *pd,
goto err_create;
}
 
-   mlx5_vfree(in);
+   kvfree(in);
/* Hardware wants QPN written in big-endian order (after
 * shifting) for send doorbell.  Precompute this value to save
 * a little bit when posting sends.
@@ -988,7 +988,7 @@ err_create:
else if (qp-create_type == MLX5_QP_KERNEL)
destroy_qp_kernel(dev, qp);
 
-   mlx5_vfree(in);
+   kvfree(in);
return err;
 }
 
diff --git a/drivers/infiniband/hw/mlx5/srq.c b/drivers/infiniband/hw/mlx5/srq.c
index 97cc1ba..41fec66 100644
--- a/drivers/infiniband/hw/mlx5/srq.c
+++ b/drivers/infiniband/hw/mlx5/srq.c
@@ -141,7 +141,7 @@ static int create_srq_user(struct ib_pd *pd, struct 
mlx5_ib_srq *srq,
return 0;
 
 err_in:
-   mlx5_vfree(*in);
+   kvfree(*in);
 
 err_umem:
ib_umem_release(srq-umem);
@@ -209,7 +209,7 @@ static int create_srq_kernel(struct mlx5_ib_dev *dev, 
struct mlx5_ib_srq *srq,
return 0;
 
 err_in:
-   mlx5_vfree(*in);
+   kvfree(*in);
 
 err_buf:
mlx5_buf_free(dev-mdev, srq-buf);
@@ -306,7 +306,7 @@ struct ib_srq *mlx5_ib_create_srq(struct ib_pd *pd,
in-ctx.pd = cpu_to_be32(to_mpd(pd)-pdn);
in-ctx.db_record = cpu_to_be64(srq-db.dma);
err = mlx5_core_create_srq(dev-mdev, srq-msrq, in, inlen);
-   mlx5_vfree(in);
+   kvfree(in);
if (err) {
mlx5_ib_dbg(dev, create SRQ failed, err %d\n, err);
goto err_usr_kern_srq;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c 
b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index a278238..eb3aa15 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -391,7 +391,7 @@ int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct 
mlx5_eq *eq, u8 vecidx,
 */
eq_update_ci(eq, 1);
 
-   mlx5_vfree(in);
+   kvfree(in);
return 0;
 
 err_irq:
@@ -401,7 +401,7 @@ err_eq:
mlx5_cmd_destroy_eq(dev, eq-eqn);
 
 err_in:
-   mlx5_vfree(in);
+   kvfree(in);
 
 err_buf:
mlx5_buf_free(dev, eq-buf);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c 
b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
index d476918..4fdaae9 100644
--- a/drivers/net/ethernet/mellanox

Re: mlx5: don't duplicate kvfree()

2014-11-20 Thread Al Viro
On Thu, Nov 20, 2014 at 08:13:57AM +, Al Viro wrote:
  9 files changed, 21 insertions(+), 35 deletions(-)

grr... 8 files changed, actually - that was from the diff that included mlx4
bits.  Patch split correctly and sent in two pieces, summary left as is ;-/
Sorry about the confusion it might cause...
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: mlx5: don't duplicate kvfree()

2014-11-20 Thread Al Viro
On Thu, Nov 20, 2014 at 07:09:31PM +0200, Or Gerlitz wrote:
 On 11/20/2014 10:17 AM, Al Viro wrote:
 On Thu, Nov 20, 2014 at 08:13:57AM +, Al Viro wrote:
   9 files changed, 21 insertions(+), 35 deletions(-)
 grr... 8 files changed, actually - that was from the diff that included mlx4
 bits.  Patch split correctly and sent in two pieces, summary left as is ;-/
 Sorry about the confusion it might cause...
 Al,
 
 I think the best way to proceed here is to come up with mlx4 and
 mlx5 patches and get both of them in through netdev / net-nexttree,
 don't worry about each of these drivers have a part which is also
 under drivers/infiniband/hw -- it doesn't justify splitting in this
 case

Both had been sent to netdev today (and ACKed by respective drivers'
maintainers).  I'd certainly prefer not to put them through vfs.git,
but beyond that I really don't care which tree do those go into...
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: races in ipathfs

2012-03-18 Thread Al Viro
On Thu, Jan 19, 2012 at 08:20:04PM +, Al Viro wrote:
   Use of qib_super is seriously racy.  qibfs_add() (and worse,
 qibfs_remove()) can happen during qibfs_mount() and qibfs_kill_super().

[snip]

FWIW, I've put a completely untested patchset into
git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git qibfs

It tries to avoid that kind of crap by getting rid of populate at mount
time logics - we keep (internal) mount of that sucker for as long as
there are any devices owned by ib_qib, adding them on add_device() and
removing on remove_device().  The only complication is with module
use counts - that fs has to be a separate module, or we'll have ib_qib
impossible to rmmod, because fs keeps its module pinned and any devices
held by ib_qib PCI driver will keep the fs pinned, so we never get
to unregistering said PCI driver.  With skeleton of qibfs (static parts
only) taken to ib_qib_fs.c we avoid that problem - it is what ends up
being pinned down for as long as ib_qib owns any devices, but then it's
pinned down by ib_qib using exports from ib_qib_fs anyway.  And once
ib_qib is removed, all internal references to that vfsmount and superblock
disappear as well...

This stuff is completely untested; I don't have the hardware in question.
It does compile and survive modpost, but that's it.  Please, review and
comment...
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: races in ipathfs

2012-03-18 Thread Al Viro
On Sun, Mar 18, 2012 at 06:45:47PM +, Al Viro wrote:
 On Thu, Jan 19, 2012 at 08:20:04PM +, Al Viro wrote:
  Use of qib_super is seriously racy.  qibfs_add() (and worse,
  qibfs_remove()) can happen during qibfs_mount() and qibfs_kill_super().
 
 [snip]
 
 FWIW, I've put a completely untested patchset into
 git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git qibfs

Corresponding fix for ipathfs added to the same branch; again, completely
untested.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


races in ipathfs

2012-01-19 Thread Al Viro
Use of qib_super is seriously racy.  qibfs_add() (and worse,
qibfs_remove()) can happen during qibfs_mount() and qibfs_kill_super().

1) CPU1: qib_init_one().  The sucker is allocated and placed
on the list.  CPU2: ipathfs is mounted, directory created.  CPU1: finally
gets around to qibfs_add(); by now qib_super is non-NULL and off we go,
trying to create it again.  The worst part is, that code doesn't even
notice that dentry is there and positive; you silently leak the old inode.

2) CPU1: qib_init_one().  Allocated the sucker.  CPU2: ipathfs
is getting mounted.  Picked the first device off the list, creating
directory for it.  CPU1: inserted new device into the head of the list,
continued working.  Got around to qibfs_add(); qib_super is NULL, so
we do nothing.  CPU2: walked the rest of the list, creating directories
for all devices.  Our device is missed, since we are past that point in
the list.  Worse, shift the timing a bit and it doesn't matter whether
you add to the head or to the tail of the list - if qibfs_add() happens
just before we set qib_super, we are screwed again.

3) CPU1: qib_remove_one().  CPU2: mount ipathfs is walking that
list and decides to try and create a directory for the device that is
being freed.  Oops...

4) CPU1: qib_init_one() or qib_remove_one(), doesn't matter which.
CPU2: final umount of ipathfs already got through setting sb-s_root to
NULL but still hadn't set qib_super to the same.  Oops...  And no,
moving that qib_super = NULL; up prior to kill_litter_super() won't
fix the race either, of course.

AFAICS, the older driver (in hw/ipath) has the same problems.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html