Re: [RFC 0/2] nvme/loop: Add support for controllers-per-port model

2016-06-09 Thread Nicholas A. Bellinger
On Thu, 2016-06-09 at 15:39 +0200, Christoph Hellwig wrote:
> On Wed, Jun 08, 2016 at 10:13:53PM -0700, Nicholas A. Bellinger wrote:
> > I still don't see why a loopback controller on a port of an individual
> > subsystem NQN can't be created + deleted directly from configfs, and
> > operate independently of other loopback controllers on ports of
> > different subsystem NQNs.
> 
> I think you are misunderstanding the NVMe entities - there is no port
> of a subsystem.  There are ports, and there are subsystems, and there
> is an N-to-M relation between them, not an N-to-one relation.
> 

Yes, the mapping of subsystem NQNs and ports works identically to how
iSCSI IQNs and network portals work.

That is, network portals can be set up across multiple IQNs, or set up
specifically for one IQN.
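
For example, in the LIO configfs layout (a sketch; the IQNs and the
portal address are placeholders, and the IQN/TPG directories are
assumed to have been created first):

  # the same portal address set up under two different IQNs:
  mkdir /sys/kernel/config/target/iscsi/iqn.2016-06.example:a/tpgt_1/np/192.168.0.10:3260
  mkdir /sys/kernel/config/target/iscsi/iqn.2016-06.example:b/tpgt_1/np/192.168.0.10:3260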

> Multiple controllers may and usually will use the same port to access
> one or more subsystems, but multiple controllers may also use different
> ports to use the same subsystem.
> 

Yes.  Also similar to an MC/S setup with a single IQN across multiple
network portals.  ;)

> For full NVMe functionality we thus need to be able to create all
> these objects independently.

Yes.

> While loopback ports are extremely cheap
> I see no reason to introduce a shortcut to break this model for our
> debug local implementation - the whole point of the loopback driver
> is to give you the full NVMe over Fabrics experience without needing
> a (possibly expensive) transport.
> 

I think making this extra step for loopback is pointless; it just adds
unnecessary list lookups, and it is something that configfs is better
suited to handle directly.

But loopback is a special case, both for drivers/target/loopback/ and
drivers/nvme/target/loop.c code.

So why not just let configfs do it..?

However, your point is taken about using the global hostnqn for loopback
controllers in this patch.  Following what we do for SCSI loopback, I
think it makes sense to have each loopback subsystem port set up its
own hostnqn instead.

Updating the patch to do this.

> > > Something like the patch below is the right way, it just needs a bit more
> > > fine tuning and proper test cases:
> > 
> > I don't see how adding an internal mutex for loopback ports, and doing
> > an internal sequential list lookup holding the global nvmet_config_sem
> > whilst doing nvmet_fabric_ops->add_port(), helps scale at all..
> 
> Scalability of the controller creation is not the prime goal for the
> nvme-loop driver.  It's supposed to be the simplest possible implementation
> of an NVMe over Fabrics transport.
> 

Btw, I consider adding a global list for loopback to be 'less' simple.

> > AFAICT for loopback, neither of these is necessary.  Why can't a
> > loopback controller on a port for an individual subsystem NQN be created
> > + deleted directly from configfs,
> 
> We could trivially do this, but I think it's counterproductive.  NVMe
> over Fabrics does not automatically create one or multiple controllers,
> and nvme-loop is just another transport that follows the higher level
> spec.
> 

I disagree.  The more interesting question is whether to enforce a limit
on the number of loopback ports that can be allocated with this patch
per /sys/kernel/config/nvmet/subsystems/$NQN.FOO/.

That is, each echo 1 > ../$NQN_FOO/$ANY_LOOPBACK/enable will create its
own controller, regardless of whether another loopback controller
already exists on the subsystem NQN.
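
For example (a sketch using the configfs-ng paths above; whether the
$ANY_LOOPBACK directories are created via mkdir is an assumption):

  # each enable allocates its own loopback controller, even on the same NQN:
  mkdir /sys/kernel/config/nvmet/subsystems/$NQN_FOO/loop_0
  echo 1 > /sys/kernel/config/nvmet/subsystems/$NQN_FOO/loop_0/enable
  mkdir /sys/kernel/config/nvmet/subsystems/$NQN_FOO/loop_1
  echo 1 > /sys/kernel/config/nvmet/subsystems/$NQN_FOO/loop_1/enable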

Leaving that as-is for -v2.



Re: [RFC 0/2] nvme/loop: Add support for controllers-per-port model

2016-06-09 Thread Christoph Hellwig
On Wed, Jun 08, 2016 at 10:13:53PM -0700, Nicholas A. Bellinger wrote:
> I still don't see why a loopback controller on a port of an individual
> subsystem NQN can't be created + deleted directly from configfs, and
> operate independently of other loopback controllers on ports of
> different subsystem NQNs.

I think you are misunderstanding the NVMe entities - there is no port
of a subsystem.  There are ports, and there are subsystems, and there
is an N-to-M relation between them, not an N-to-one relation.

Multiple controllers may and usually will use the same port to access
one or more subsystems, but multiple controllers may also use different
ports to use the same subsystem.
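
As an illustration, with independent port objects that N:M wiring is
just configfs symlinks (a sketch; the directory layout here is assumed
for illustration, not taken from this patch):

  cd /sys/kernel/config/nvmet
  mkdir subsystems/nqn.example:a subsystems/nqn.example:b
  mkdir ports/1 ports/2
  # port 1 exports both subsystems; nqn.example:a is also reachable via port 2:
  ln -s $PWD/subsystems/nqn.example:a ports/1/subsystems/nqn.example:a
  ln -s $PWD/subsystems/nqn.example:b ports/1/subsystems/nqn.example:b
  ln -s $PWD/subsystems/nqn.example:a ports/2/subsystems/nqn.example:a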

For full NVMe functionality we thus need to be able to create all
these objects independently.  While loopback ports are extremely cheap
I see no reason to introduce a shortcut to break this model for our
debug local implementation - the whole point of the loopback driver
is to give you the full NVMe over Fabrics experience without needing
a (possibly expensive) transport.

> > Something like the patch below is the right way, it just needs a bit more
> > fine tuning and proper test cases:
> 
> I don't see how adding an internal mutex for loopback ports, and doing
> an internal sequential list lookup holding the global nvmet_config_sem
> whilst doing nvmet_fabric_ops->add_port(), helps scale at all..

Scalability of the controller creation is not the prime goal for the
nvme-loop driver.  It's supposed to be the simplest possible implementation
of an NVMe over Fabrics transport.

> AFAICT for loopback, neither of these is necessary.  Why can't a
> loopback controller on a port for an individual subsystem NQN be created
> + deleted directly from configfs,

We could trivially do this, but I think it's counterproductive.  NVMe
over Fabrics does not automatically create one or multiple controllers,
and nvme-loop is just another transport that follows the higher level
spec.

> independent of other subsystem NQNs'
> loopback controller ports..?

The only limitation at the moment is that there is only one port.  I've
shown you a fairly trivial patch to lift that limitation.  Subsystems
can be created and deleted independently of each other, as can
controllers, and as can ports for transports where more than one can
exist.
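
For example, continuing the sketch above, teardown of each object is
likewise independent (configfs symlinks pin both ends, so detach first):

  rm ports/2/subsystems/nqn.example:a    # detach; port and subsystem both survive
  rmdir ports/2                          # delete the port; the subsystems are untouched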


Re: [RFC 0/2] nvme/loop: Add support for controllers-per-port model

2016-06-08 Thread Nicholas A. Bellinger
On Wed, 2016-06-08 at 14:14 +0200, Christoph Hellwig wrote:
> Hi Nic,
> 
> multiple ports have been on the todo list ever since we put that
> shortcut in place, but your patches aren't really how this should work.
> 
> The host NQN is not available to the actual drivers - we need to be able
> to virtualize it for containers, for example; that's why we need to pass
> it by pointer depending on the arguments.  Also there is no need to
> add another sysfs interface - just like for real drivers we can simply
> create the ports in configfs and assign an address to each; we just need
> to stop ignoring it.
> 

Not sure what you mean, but my patch does not propose 'to add another
sysfs interface'.

It simply follows what drivers/target/loopback/ already does for SCSI,
and allocates a controller from nvmet/configfs-ng at subsystem NQN
port enable time.

I still don't see why a loopback controller on a port of an individual
subsystem NQN can't be created + deleted directly from configfs, and
operate independently of other loopback controllers on ports of
different subsystem NQNs.

> Something like the patch below is the right way, it just needs a bit more
> fine tuning and proper test cases:

I don't see how adding an internal mutex for loopback ports, and doing
an internal sequential list lookup holding the global nvmet_config_sem
whilst doing nvmet_fabric_ops->add_port(), helps scale at all..

AFAICT for loopback, neither of these is necessary.  Why can't a
loopback controller on a port for an individual subsystem NQN be created
+ deleted directly from configfs, independent of other subsystem NQNs'
loopback controller ports..?

Now for RDMA, just like with iscsi-target + iSER with network portals,
we do need to keep an internal list of ports, given that ports can be
shared across different target endpoints.

That's the part that I'm working on next for nvmet/configfs-ng.



Re: [RFC 0/2] nvme/loop: Add support for controllers-per-port model

2016-06-08 Thread Christoph Hellwig
Hi Nic,

multiple ports have been on the todo list ever since we put that
shortcut in place, but your patches aren't really how this should work.

The host NQN is not available to the actual drivers - we need to be able
to virtualize it for containers, for example; that's why we need to pass
it by pointer depending on the arguments.  Also there is no need to
add another sysfs interface - just like for real drivers we can simply
create the ports in configfs and assign an address to each; we just need
to stop ignoring it.
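
For illustration, the flow the patch below enables might look like this
(a sketch; the addr_* attribute names and the nvme-cli flags are
assumptions, and the traddr string is just what __nvme_loop_find_port()
matches on):

  mkdir /sys/kernel/config/nvmet/ports/1
  echo loop > /sys/kernel/config/nvmet/ports/1/addr_trtype
  echo 0 > /sys/kernel/config/nvmet/ports/1/addr_traddr
  ln -s /sys/kernel/config/nvmet/subsystems/$NQN \
        /sys/kernel/config/nvmet/ports/1/subsystems/$NQN

  # host side: connect through the loop transport, naming the port's traddr
  nvme connect -t loop -a 0 -n $NQN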

Something like the patch below is the right way, it just needs a bit more
fine tuning and proper test cases:

diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index 94e7829..ecd2c4c 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -57,6 +57,7 @@ struct nvme_loop_ctrl {
struct blk_mq_tag_set   tag_set;
struct nvme_loop_iod    async_event_iod;
struct nvme_ctrlctrl;
+   struct nvme_loop_port   *port;
 
struct nvmet_ctrl   *target_ctrl;
struct work_struct  delete_work;
@@ -74,11 +75,17 @@ struct nvme_loop_queue {
struct nvme_loop_ctrl   *ctrl;
 };
 
-static struct nvmet_port *nvmet_loop_port;
-
 static LIST_HEAD(nvme_loop_ctrl_list);
 static DEFINE_MUTEX(nvme_loop_ctrl_mutex);
 
+struct nvme_loop_port {
+   struct list_headentry;
+   struct nvmet_port   *port;
+};
+
+static LIST_HEAD(nvme_loop_ports);
+static DEFINE_SPINLOCK(nvme_loop_port_lock);
+
 static void nvme_loop_queue_response(struct nvmet_req *nvme_req);
 static void nvme_loop_delete_ctrl(struct nvmet_ctrl *ctrl);
 
@@ -172,7 +179,7 @@ static int nvme_loop_queue_rq(struct blk_mq_hw_ctx *hctx,
return ret;
 
iod->cmd.common.flags |= NVME_CMD_SGL_METABUF;
-   iod->req.port = nvmet_loop_port;
+   iod->req.port = queue->ctrl->port->port;
if (!nvmet_req_init(&iod->req, &queue->nvme_cq,
&queue->nvme_sq, &nvme_loop_ops)) {
nvme_cleanup_cmd(req);
@@ -210,6 +217,7 @@ static void nvme_loop_submit_async_event(struct nvme_ctrl *arg, int aer_idx)
iod->cmd.common.opcode = nvme_admin_async_event;
iod->cmd.common.command_id = NVME_LOOP_AQ_BLKMQ_DEPTH;
iod->cmd.common.flags |= NVME_CMD_SGL_METABUF;
+   iod->req.port = queue->ctrl->port->port;
 
if (!nvmet_req_init(&iod->req, &queue->nvme_cq, &queue->nvme_sq,
&nvme_loop_ops)) {
@@ -597,6 +605,20 @@ out_destroy_queues:
return ret;
 }
 
+static struct nvme_loop_port *
+__nvme_loop_find_port(const char *traddr)
+{
+   struct nvme_loop_port *p;
+
+   list_for_each_entry(p, &nvme_loop_ports, entry) {
+   if (!strncmp(traddr, p->port->disc_addr.traddr,
+   NVMF_TRADDR_SIZE))
+   return p;
+   }
+
+   return NULL;
+}
+
 static struct nvme_ctrl *nvme_loop_create_ctrl(struct device *dev,
struct nvmf_ctrl_options *opts)
 {
@@ -610,6 +632,17 @@ static struct nvme_ctrl *nvme_loop_create_ctrl(struct device *dev,
ctrl->ctrl.opts = opts;
INIT_LIST_HEAD(&ctrl->list);
 
+   spin_lock(&nvme_loop_port_lock);
+   ctrl->port = __nvme_loop_find_port(opts->traddr);
+   spin_unlock(&nvme_loop_port_lock);
+   if (!ctrl->port) {
+   ret = -EINVAL;
+   dev_warn(ctrl->ctrl.device,
+"could not find port at addr %s\n",
+opts->traddr);
+   goto out_put_ctrl;
+   }
+
INIT_WORK(&ctrl->delete_work, nvme_loop_del_ctrl_work);
INIT_WORK(&ctrl->reset_work, nvme_loop_reset_ctrl_work);
 
@@ -684,27 +717,40 @@ out_put_ctrl:
 
 static int nvme_loop_add_port(struct nvmet_port *port)
 {
-   /*
-* XXX: disalow adding more than one port so
-* there is no connection rejections when a
-* a subsystem is assigned to a port for which
-* loop doesn't have a pointer.
-* This scenario would be possible if we allowed
-* more than one port to be added and a subsystem
-* was assigned to a port other than nvmet_loop_port.
-*/
+   struct nvme_loop_port *p, *old;
+
+   p = kmalloc(sizeof(*p), GFP_KERNEL);
+   if (!p)
+   return -ENOMEM;
+
+   spin_lock(&nvme_loop_port_lock);
+   old = __nvme_loop_find_port(port->disc_addr.traddr);
+   if (old)
+   goto out_dup;
 
-   if (nvmet_loop_port)
-   return -EPERM;
+   p->port = port;
+   list_add_tail_rcu(&p->entry, &nvme_loop_ports);
+   port->priv = p;
+   spin_unlock(&nvme_loop_port_lock);
 
-   nvmet_loop_port = port;
return 0;
+out_dup:
+   spin_unlock(&nvme_loop_port_lock);
+   kfree(p);
+   pr_err("duplicate port name: %s\n", port->disc_addr.traddr);
+   return -EINVAL;
 }
 
 static void nvme_loop_remove_port(struct nvmet_port *port)
 {