Re: [RFC 1/2] target: Add documentation on the target userspace pass-through driver

2014-09-02 Thread Andy Grover

On 08/31/2014 02:22 PM, Richard W.M. Jones wrote:

Reading this several times, I now think I get what it's trying to say,
but I think it needs to introduce the terms (as the Economist style
does).  Something like this:

   TCM is the new name for LIO, an in-kernel iSCSI target (server).
   Existing TCM targets run in the kernel.  TCMU (TCM in Userspace)
   allows userspace programs to be written which act as iSCSI targets.
   This document describes the design.

   The existing kernel provides modules for different SCSI transport
   protocols.  TCM also modularizes the data storage.  There are
   existing modules for file, block device, RAM or using another SCSI
   device as storage.  These are called backstores or storage
   engines.  These built-in modules are implemented entirely as kernel
   code.

And hopefully having defined a bit of background, the rest of the
document just flows nicely:


Thanks much! I've put this in the doc, and will hopefully send out a 
final patchset for inclusion with the new text in the next week or so.


begin naming potential-bikeshed wasteoftime

The only change I made was "another" instead of "the new" name --
because to be honest I don't know if there was actually an attempt at a 
name change, and if so if it was successful or not :) LIO seems to have 
stuck, but TCM seems to refer just to the backend part of LIO.


end bikeshed

Thanks again -- Andy

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 1/2] target: Add documentation on the target userspace pass-through driver

2014-08-31 Thread Andy Grover

On 08/30/2014 10:35 AM, Richard W.M. Jones wrote:

On Tue, Jul 01, 2014 at 12:11:14PM -0700, Andy Grover wrote:

Describes the driver and its interface to make it possible for user
programs to back a LIO-exported LUN.

Signed-off-by: Andy Grover agro...@redhat.com
---
  Documentation/target/tcmu-design.txt | 210 +++
  1 file changed, 210 insertions(+)
  create mode 100644 Documentation/target/tcmu-design.txt

diff --git a/Documentation/target/tcmu-design.txt b/Documentation/target/tcmu-design.txt
new file mode 100644
index 000..200ff3e
--- /dev/null
+++ b/Documentation/target/tcmu-design.txt
@@ -0,0 +1,210 @@
+TCM Userspace Design
+
+
+
+Background:
+
+In addition to modularizing the transport protocol used for carrying
+SCSI commands (fabrics), the Linux kernel target, LIO, also modularizes
+the actual data storage as well. These are referred to as backstores
+or storage engines. The target comes with backstores that allow a
+file, a block device, RAM, or another SCSI device to be used for the
+local storage needed for the exported SCSI LUN. Like the rest of LIO,
+these are implemented entirely as kernel code.
+
+These backstores cover the most common use cases, but not all. One new
+use case that other non-kernel target solutions, such as tgt, are able
+to support is using Gluster's GLFS or Ceph's RBD as a backstore. The
+target then serves as a translator, allowing initiators to store data
+in these non-traditional networked storage systems, while still only
+using standard protocols themselves.
+
+If the target is a userspace process, supporting these is easy. tgt,
+for example, needs only a small adapter module for each, because the
+modules just use the available userspace libraries for RBD and GLFS.
+
+Adding support for these backstores in LIO is considerably more
+difficult, because LIO is entirely kernel code. Instead of undertaking
+the significant work to port the GLFS or RBD APIs and protocols to the
+kernel, another approach is to create a userspace pass-through
+backstore for LIO, TCMU.


It has to be said that this documentation is terrible.

Jumping in medias res[1] is great for fiction, awful for technical
documentation.

I would recommend the Economist Style Guide[2].  They always say
Barack Obama, President of the United States the first time he is
mentioned in an article, even though almost everyone knows who Barack
Obama is.

In this case you're leaping into something .. fabrics, LIO,
backstores, target solutions, ... aargh.  Explain what you mean by
each term and how it all fits together.


Thanks for the feedback. I am undoubtedly too close to the details, 
because I thought I *was* explaining things :)


This doc is for people like you -- tech-savvy but unfamiliar with this 
specific area. Would you be so kind as to point out exactly the terms 
this document should explain? Should it explain SCSI and SCSI commands? 
What a SCSI target is? Say "target implementations" rather than "target
solutions"? Do I need some ASCII art?


Or, if in finding these gaps you've actually picked up the jargon and 
wanted to just take a pass at rewriting it, that would be fine too :)
I've read some of your libguestfs docs and they were very understandable.


Regards -- Andy



Re: [RFC 1/2] target: Add documentation on the target userspace pass-through driver

2014-08-31 Thread Richard W.M. Jones
On Sun, Aug 31, 2014 at 12:49:26PM -0700, Andy Grover wrote:
 Thanks for the feedback. I am undoubtedly too close to the details,
 because I thought I *was* explaining things :)

Yeah, sorry it came across as a bit harsh.

Benoit did explain it to me so I understood it in the end (I think!)

 This doc is for people like you -- tech-savvy but unfamiliar with
 this specific area. Would you be so kind as to point out exactly the
 terms this document should explain? Should it explain SCSI and SCSI
 commands? What a SCSI target is? Say "target implementations" rather
 than "target solutions"? Do I need some ASCII art?

So I can only speak for myself here, but I'm pretty familiar with
iSCSI, using it, and some of the internals -- in fact I'm using the
Linux kernel target daily.

 TCM Userspace Design
 In addition to modularizing the transport protocol used for carrying
 SCSI commands (fabrics), the Linux kernel target, LIO, also
 modularizes the actual data storage as well.  These are referred to
 as backstores or storage engines.

Reading this several times, I now think I get what it's trying to say,
but I think it needs to introduce the terms (as the Economist style
does).  Something like this:

  TCM is the new name for LIO, an in-kernel iSCSI target (server).
  Existing TCM targets run in the kernel.  TCMU (TCM in Userspace)
  allows userspace programs to be written which act as iSCSI targets.
  This document describes the design.

  The existing kernel provides modules for different SCSI transport
  protocols.  TCM also modularizes the data storage.  There are
  existing modules for file, block device, RAM or using another SCSI
  device as storage.  These are called backstores or storage
  engines.  These built-in modules are implemented entirely as kernel
  code.

And hopefully having defined a bit of background, the rest of the
document just flows nicely:

 These backstores cover the most common use cases, but not all. One new
 use case that other non-kernel target solutions, such as tgt, are able
 to support is using Gluster's GLFS or Ceph's RBD as a backstore. The
 target then serves as a translator, allowing initiators to store data
 in these non-traditional networked storage systems, while still only
 using standard protocols themselves.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/


Re: [RFC 1/2] target: Add documentation on the target userspace pass-through driver

2014-08-30 Thread Richard W.M. Jones
On Tue, Jul 01, 2014 at 12:11:14PM -0700, Andy Grover wrote:
 Describes the driver and its interface to make it possible for user
 programs to back a LIO-exported LUN.
 
 Signed-off-by: Andy Grover agro...@redhat.com
 ---
  Documentation/target/tcmu-design.txt | 210 +++
  1 file changed, 210 insertions(+)
  create mode 100644 Documentation/target/tcmu-design.txt
 
 diff --git a/Documentation/target/tcmu-design.txt b/Documentation/target/tcmu-design.txt
 new file mode 100644
 index 000..200ff3e
 --- /dev/null
 +++ b/Documentation/target/tcmu-design.txt
 @@ -0,0 +1,210 @@
 +TCM Userspace Design
 +
 +
 +
 +Background:
 +
 +In addition to modularizing the transport protocol used for carrying
 +SCSI commands (fabrics), the Linux kernel target, LIO, also modularizes
 +the actual data storage as well. These are referred to as backstores
 +or storage engines. The target comes with backstores that allow a
 +file, a block device, RAM, or another SCSI device to be used for the
 +local storage needed for the exported SCSI LUN. Like the rest of LIO,
 +these are implemented entirely as kernel code.
 +
 +These backstores cover the most common use cases, but not all. One new
 +use case that other non-kernel target solutions, such as tgt, are able
 +to support is using Gluster's GLFS or Ceph's RBD as a backstore. The
 +target then serves as a translator, allowing initiators to store data
 +in these non-traditional networked storage systems, while still only
 +using standard protocols themselves.
 +
 +If the target is a userspace process, supporting these is easy. tgt,
 +for example, needs only a small adapter module for each, because the
 +modules just use the available userspace libraries for RBD and GLFS.
 +
 +Adding support for these backstores in LIO is considerably more
 +difficult, because LIO is entirely kernel code. Instead of undertaking
 +the significant work to port the GLFS or RBD APIs and protocols to the
 +kernel, another approach is to create a userspace pass-through
 +backstore for LIO, TCMU.

It has to be said that this documentation is terrible.

Jumping in medias res[1] is great for fiction, awful for technical
documentation.

I would recommend the Economist Style Guide[2].  They always say
Barack Obama, President of the United States the first time he is
mentioned in an article, even though almost everyone knows who Barack
Obama is.

In this case you're leaping into something .. fabrics, LIO,
backstores, target solutions, ... aargh.  Explain what you mean by
each term and how it all fits together.

Thanks,
Rich.

[1] https://en.wikipedia.org/wiki/In_medias_res

[2] http://www.economist.com/styleguide/introduction

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html


Re: [RFC 1/2] target: Add documentation on the target userspace pass-through driver

2014-07-08 Thread Andy Grover

[re-adding individual CCs that were dropped]

On 07/05/2014 04:29 AM, Alex Elsayed wrote:

+Device Discovery:
+
+Other devices may be using UIO besides TCMU. Unrelated user processes
+may also be handling different sets of TCMU devices. TCMU userspace
+processes must find their devices by scanning sysfs
+class/uio/uio*/name. For TCMU devices, these names will be of the
+format:
+
+tcm-user/subtype/path
+
+where tcm-user is common for all TCMU-backed UIO devices. subtype
+will be a userspace-process-unique string to identify the TCMU device
+as expecting to be backed by a certain handler, and path will be an
+additional handler-specific string for the user process to configure
+the device, if needed. Neither subtype nor path can contain ':',
+due to LIO limitations.


It might be good to change this somewhat; in the vast majority of cases it'd
be saner for userspace programs to figure this information out via udev etc.
rather than parsing sysfs themselves. This information is still worth
documenting, but saying things like "must find their devices by scanning
sysfs" is likely to lead to users of this interface making suboptimal
choices.


I agree. There's no getting around a certain degree of work required by 
the backing user program. I'm planning on writing a tcmu-runner 
program with a plugin interface, that will handle the event loop, device 
notifications, enumeration, and possibly thread pools, to minimize the 
amount of boilerplate code each implementation must contain.
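
For illustration, here is a minimal sketch of the name-matching part of
enumeration, using the tcm-user/subtype/path format from the draft. The
function name parse_tcmu_name and the example names in the usage note are
hypothetical. How the name string is obtained (udev, sysfs, or something
else) is deliberately left out.

```python
from typing import NamedTuple, Optional


class TcmuName(NamedTuple):
    subtype: str  # identifies the handler expected to back the device
    path: str     # handler-specific configuration string, may be empty


def parse_tcmu_name(name: str) -> Optional[TcmuName]:
    """Split a UIO name of the form tcm-user/subtype/path.

    Returns None for UIO devices that do not belong to TCMU.
    """
    # Split at most twice so that '/' characters inside path survive.
    parts = name.strip().split("/", 2)
    if len(parts) < 2 or parts[0] != "tcm-user" or not parts[1]:
        return None
    subtype = parts[1]
    path = parts[2] if len(parts) == 3 else ""
    # The draft says neither field may contain ':'.
    if ":" in subtype or ":" in path:
        return None
    return TcmuName(subtype, path)
```

A handler that only cares about one subtype would call this on each UIO
name it sees and skip anything that returns None or carries a different
subtype, for example parse_tcmu_name("tcm-user/glfs/volume@host/image").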



+Device Events:
+
+If a new device is added or removed, user processes will receive a HUP
+signal, and should re-scan sysfs. File descriptors for devices no
+longer in sysfs should be closed, and new devices should be opened and
+handled.


Is there a cleaner way to do this? In particular, re-scanning sysfs may
cause race conditions (device removed, one of the same name re-added but a
different UIO device node; probably more to be found). Perhaps recommend
netlink uevents, so that remove+add is noticeable? Also, is the SIGHUP
itself the best option? Could we simply require the user process to listen
for add/remove uevents to get such change notifications, and thus enforce
good behavior?


Yes this sounds better, let's do it this way.
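
A rough sketch of what listening for add/remove uevents could look like
in a user process, assuming kernel uevents arrive as an "action@devpath"
summary followed by NUL-separated KEY=VALUE pairs and that UIO devices
report SUBSYSTEM=uio. The helper names are hypothetical, and the netlink
protocol number is defined locally because not every Python build exports
it.

```python
import socket
from typing import Dict, Set

NETLINK_KOBJECT_UEVENT = 15  # value from <linux/netlink.h>


def open_uevent_socket() -> socket.socket:
    """Open a netlink socket subscribed to kernel uevents (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    sock.bind((0, 1))  # pid 0: kernel assigns one; group 1: kernel events
    return sock


def parse_uevent(data: bytes) -> Dict[str, str]:
    """Turn one raw kernel uevent message into a KEY -> VALUE dict."""
    fields = data.decode("utf-8", errors="replace").split("\0")
    event: Dict[str, str] = {}
    for field in fields[1:]:  # fields[0] is the 'action@devpath' summary
        key, sep, value = field.partition("=")
        if sep:
            event[key] = value
    return event


def apply_uevent(devices: Set[str], event: Dict[str, str]) -> None:
    """Update the set of known UIO device paths from one event."""
    devpath = event.get("DEVPATH")
    if event.get("SUBSYSTEM") != "uio" or devpath is None:
        return
    if event.get("ACTION") == "add":
        devices.add(devpath)
    elif event.get("ACTION") == "remove":
        devices.discard(devpath)
```

Because each remove and each add arrives as its own message, in order, a
device that disappears and is replaced by one of the same name shows up as
two distinct events rather than as an unchanged sysfs entry.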


One use case I'm actually interested in is having userspace provide
something other than just SPC - for instance, tgt can provide a virtual tape
library or an OSD, and CDemu can provide emulated optical discs from various
image formats.

Currently, CDemu uses its own out-of-tree driver called VHBA (Virtual Host
Bus Adapter) to do pretty much exactly what TCMU+Loopback would
accomplish... and in the process misses out on all of the other fabrics,
unless you're willing to _re-import_ those devices using PSCSI, which has
its own quirks.

Perhaps there could be a level 0 (or 4, or whatever) which means "explicitly
enabled list of commands" - maybe as a bitmap that could be passed to the
kernel somehow? Hopefully, that could also avoid some of the quirks of PSCSI
regarding ALUA and such - if it's not implemented, leave the relevant bits
at zero, and LIO handles it.


I'm beginning to sour on pass_level and having configurable cmd 
filtering in the kernel interface.


I think a less-clever but simpler approach might be to eliminate 
filtering, and the user process can return CHECK_CONDITION, INVALID 
COMMAND OPERATION CODE for commands it doesn't wish to support. TCMU 
checks for this, and the pending command thus returned can still be 
emulated by LIO (it looks like we could just re-call sbc_parse_cdb and 
target_execute_cmd).
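
To make the idea concrete, here is a small sketch of the user-process
side, assuming the handler hands back a SCSI status byte plus fixed-format
sense data. The function names are hypothetical and this is not the TCMU
interface itself; only the status, sense key, and additional sense code
values come from the SCSI specifications.

```python
SAM_STAT_GOOD = 0x00
SAM_STAT_CHECK_CONDITION = 0x02
ILLEGAL_REQUEST = 0x05       # sense key
ASC_INVALID_OPCODE = 0x20    # INVALID COMMAND OPERATION CODE (ASCQ 0x00)

READ_10 = 0x28
WRITE_10 = 0x2A


def invalid_opcode_sense() -> bytes:
    """Build 18 bytes of fixed-format sense data for an unsupported opcode."""
    sense = bytearray(18)
    sense[0] = 0x70              # response code: current error, fixed format
    sense[2] = ILLEGAL_REQUEST
    sense[7] = 0x0A              # additional sense length (bytes after byte 7)
    sense[12] = ASC_INVALID_OPCODE
    sense[13] = 0x00
    return bytes(sense)


def handle_command(cdb: bytes):
    """Return (status, sense) for one CDB; only READ(10)/WRITE(10) are handled."""
    if cdb[0] in (READ_10, WRITE_10):
        # A real handler would transfer data here.
        return SAM_STAT_GOOD, b""
    return SAM_STAT_CHECK_CONDITION, invalid_opcode_sense()


def should_fall_back(status: int, sense: bytes) -> bool:
    """The kind of check TCMU could make before letting LIO emulate a command."""
    return (status == SAM_STAT_CHECK_CONDITION
            and len(sense) >= 14
            and (sense[2] & 0x0F) == ILLEGAL_REQUEST
            and sense[12] == ASC_INVALID_OPCODE
            and sense[13] == 0x00)
```

With this shape, a handler that only implements READ and WRITE needs no
configuration at all: anything else it rejects, and the rejection itself
is the signal for emulation.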



This does look really nice, thanks for writing it!


Thanks for your helpful feedback! :)

-- Andy



Re: [RFC 1/2] target: Add documentation on the target userspace pass-through driver

2014-07-05 Thread Alex Elsayed
Reply inline, with a good bit of snipping done (posting via gmane, so 
quote/content ratio is an issue).

Andy Grover wrote:

 +These backstores cover the most common use cases, but not all. One new
 +use case that other non-kernel target solutions, such as tgt, are able
 +to support is using Gluster's GLFS or Ceph's RBD as a backstore. The
 +target then serves as a translator, allowing initiators to store data
 +in these non-traditional networked storage systems, while still only
 +using standard protocols themselves.

Another use case is in supporting various image formats, like (say) qcow2, 
and then handing those off to vhost_scsi.

 +Benefits:
 +
 +In addition to allowing relatively easy support for RBD and GLFS, TCMU
 +will also allow easier development of new backstores. TCMU combines
 +with the LIO loopback fabric to become something similar to FUSE
 +(Filesystem in Userspace), but at the SCSI layer instead of the
 +filesystem layer. A SUSE, if you will.

As long as people don't start calling it L[UNs in ]USER[space] :P

Between that and ABUSE (A Block device in USErspace), this domain has some 
real naming potential...

 +Device Discovery:
 +
 +Other devices may be using UIO besides TCMU. Unrelated user processes
 +may also be handling different sets of TCMU devices. TCMU userspace
 +processes must find their devices by scanning sysfs
 +class/uio/uio*/name. For TCMU devices, these names will be of the
 +format:
 +
 +tcm-user/subtype/path
 +
 +where tcm-user is common for all TCMU-backed UIO devices. subtype
 +will be a userspace-process-unique string to identify the TCMU device
 +as expecting to be backed by a certain handler, and path will be an
 +additional handler-specific string for the user process to configure
 +the device, if needed. Neither subtype nor path can contain ':',
 +due to LIO limitations.

It might be good to change this somewhat; in the vast majority of cases it'd 
be saner for userspace programs to figure this information out via udev etc. 
rather than parsing sysfs themselves. This information is still worth 
documenting, but saying things like "must find their devices by scanning
sysfs" is likely to lead to users of this interface making suboptimal
choices.

 +Device Events:
 +
 +If a new device is added or removed, user processes will receive a HUP
 +signal, and should re-scan sysfs. File descriptors for devices no
 +longer in sysfs should be closed, and new devices should be opened and
 +handled.

Is there a cleaner way to do this? In particular, re-scanning sysfs may 
cause race conditions (device removed, one of the same name re-added but a 
different UIO device node; probably more to be found). Perhaps recommend 
netlink uevents, so that remove+add is noticeable? Also, is the SIGHUP 
itself the best option? Could we simply require the user process to listen 
for add/remove uevents to get such change notifications, and thus enforce 
good behavior?

 +Writing a user backstore handler:
 +
 +Variable emulation with pass_level:
 +
 +TCMU supports a pass_level option with valid values of 1, 2, or
 +3. This controls how many different SCSI commands are passed up,
 +versus being emulated by LIO. The purpose of this is to give the user
 +handler author a choice of how much of the full SCSI command set they
 +care to support.
 +
 +At level 1, only READ and WRITE commands will be seen. At level 2,
 +additional commands defined in the SBC SCSI specification such as
 +WRITE SAME, SYNCHRONIZE CACHE, and UNMAP will be passed up. Finally, at
 +level 3, almost all commands defined in the SPC SCSI specification
 +will also be passed up for processing by the user handler.

One use case I'm actually interested in is having userspace provide 
something other than just SPC - for instance, tgt can provide a virtual tape 
library or an OSD, and CDemu can provide emulated optical discs from various 
image formats.

Currently, CDemu uses its own out-of-tree driver called VHBA (Virtual Host 
Bus Adapter) to do pretty much exactly what TCMU+Loopback would 
accomplish... and in the process misses out on all of the other fabrics, 
unless you're willing to _re-import_ those devices using PSCSI, which has 
its own quirks.

Perhaps there could be a level 0 (or 4, or whatever) which means "explicitly
enabled list of commands" - maybe as a bitmap that could be passed to the
kernel somehow? Hopefully, that could also avoid some of the quirks of PSCSI 
regarding ALUA and such - if it's not implemented, leave the relevant bits 
at zero, and LIO handles it.
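
A sketch of what such a bitmap might look like, assuming one bit per SCSI
operation code (the first CDB byte), so 256 bits in total. The class name
and the idea of handing the kernel a 32-byte blob are hypothetical. Note
that commands distinguished only by service action would share a bit
under this scheme.

```python
class OpcodeBitmap:
    """256-bit map, one bit per SCSI operation code."""

    def __init__(self, opcodes=()):
        self._bits = bytearray(32)
        for op in opcodes:
            self.enable(op)

    def enable(self, opcode: int) -> None:
        # Byte index is opcode / 8, bit index is opcode % 8.
        self._bits[opcode >> 3] |= 1 << (opcode & 7)

    def is_enabled(self, opcode: int) -> bool:
        return bool(self._bits[opcode >> 3] & (1 << (opcode & 7)))

    def to_bytes(self) -> bytes:
        """The 32-byte form that could be passed to the kernel."""
        return bytes(self._bits)
```

A handler wanting only READ(10) and WRITE(10) would build
OpcodeBitmap([0x28, 0x2A]); every bit left at zero means the command is
left to LIO.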

This does look really nice, thanks for writing it!
