Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-03 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Konrad Rzeszutek Wilk
> Sent: 03 March 2016 14:07
> To: Juergen Gross; Ian Jackson; Paul Durrant
> Cc: xen-devel@lists.xen.org; jbeul...@suse.com; Roger Pau Monne
> Subject: Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to
> pass-through SCSI commands
> 
> On March 3, 2016 7:25:11 AM EST, Juergen Gross  wrote:
> >On 03/03/16 13:03, Ian Jackson wrote:
> >> Paul Durrant writes ("RE: [RFC PATCH] xen-block: introduces extra
> >request to pass-through SCSI commands"):
> >>> Actually, that's quite desirable. The existing driver is actually a
> >>> SCSI HBA driver (because that's all you can write is you use thr
> >>> Windows STORPORT wrapper... which you need to if you want to pass
> >>> WHQL) so making it use a PV scsi protocol would probably make it
> >>> smaller since all the scsi <-> blkif translation code could be
> >>> ripped out and all the stuff to synthesize INQUIRY responses based
> >>> on vbd data in xenstore could go away too.
> >>
> 
> Wait a minute? You have that in the driver? Is this documented somewhere?
> Did I miss this in blkif.h?
> 

Yes, the Windows XENVBD driver has a whole load of code for synthesizing 
INQUIRY responses since that's how Windows probes for disks. There's nothing in 
blkif.h for any of this though since it’s an implementation detail of the 
Windows frontend. However, if you want to take a look, go to 
http://xenbits.xen.org/gitweb/?p=pvdrivers/win/xenvbd.git;a=blob;f=src/xenvbd/pdoinquiry.c.
 There's even code in there for base64 decoding VPD pages from xenstore! 
(That's a XenServer implementation artefact... I'm not recommending it as a way 
forward ;-))

  Paul

> 
> >> Do non-Linux hosts have something like LIO ?
> >
> >Are there any pvSCSI backends other than the one in Linux?
> >
> 
> Not that I know off.
> 
> 
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-03 Thread Konrad Rzeszutek Wilk
On March 3, 2016 7:25:11 AM EST, Juergen Gross  wrote:
>On 03/03/16 13:03, Ian Jackson wrote:
>> Paul Durrant writes ("RE: [RFC PATCH] xen-block: introduces extra
>request to pass-through SCSI commands"):
>>> Actually, that's quite desirable. The existing driver is actually a
>>> SCSI HBA driver (because that's all you can write is you use thr
>>> Windows STORPORT wrapper... which you need to if you want to pass
>>> WHQL) so making it use a PV scsi protocol would probably make it
>>> smaller since all the scsi <-> blkif translation code could be
>>> ripped out and all the stuff to synthesize INQUIRY responses based
>>> on vbd data in xenstore could go away too.
>> 

Wait a minute? You have that in the driver? Is this documented somewhere? Did I 
miss this in blkif.h?


>> Do non-Linux hosts have something like LIO ?
>
>Are there any pvSCSI backends other than the one in Linux?
>

Not that I know off.




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-03 Thread Juergen Gross
On 03/03/16 13:03, Ian Jackson wrote:
> Paul Durrant writes ("RE: [RFC PATCH] xen-block: introduces extra request to 
> pass-through SCSI commands"):
>> Actually, that's quite desirable. The existing driver is actually a
>> SCSI HBA driver (because that's all you can write is you use thr
>> Windows STORPORT wrapper... which you need to if you want to pass
>> WHQL) so making it use a PV scsi protocol would probably make it
>> smaller since all the scsi <-> blkif translation code could be
>> ripped out and all the stuff to synthesize INQUIRY responses based
>> on vbd data in xenstore could go away too.
> 
> Do non-Linux hosts have something like LIO ?

Are there any pvSCSI backends other than the one in Linux?


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-03 Thread Ian Jackson
Paul Durrant writes ("RE: [RFC PATCH] xen-block: introduces extra request to 
pass-through SCSI commands"):
> Actually, that's quite desirable. The existing driver is actually a
> SCSI HBA driver (because that's all you can write is you use thr
> Windows STORPORT wrapper... which you need to if you want to pass
> WHQL) so making it use a PV scsi protocol would probably make it
> smaller since all the scsi <-> blkif translation code could be
> ripped out and all the stuff to synthesize INQUIRY responses based
> on vbd data in xenstore could go away too.

Do non-Linux hosts have something like LIO ?

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-03 Thread Paul Durrant
> -Original Message-
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: 02 March 2016 17:23
> To: Ian Jackson
> Cc: Bob Liu; Juergen Gross; xen-devel@lists.xen.org; jbeul...@suse.com;
> Roger Pau Monne; Paul Durrant
> Subject: Re: [RFC PATCH] xen-block: introduces extra request to pass-
> through SCSI commands
> 
> On Wed, Mar 02, 2016 at 02:44:12PM +, Ian Jackson wrote:
> > Bob Liu writes ("Re: [RFC PATCH] xen-block: introduces extra request to
> pass-through SCSI commands"):
> > > On 03/02/2016 07:40 PM, Ian Jackson wrote:
> > > > I can't see how that could cause anything but pain.  In many cases
> > > > "the underlying SCSI storage target" wouldn't be well defined.  Even
> > > > if it was, these side channel SCSI commands are likely to Go Wrong in
> > > > exciting ways.
> > > >
> > > > What SCSI commands do you want to send ?
> > >
> > > * INQUIRY
> >
> > ... but why ?
> 
> Need to expose VPD information to the guest for application usage.
> 
> >
> > > * PERSISTENT RESERVE IN
> > > * PERSISTENT RESERVE OUT
> > >
> > > This is for Failover Clusters in Windows, not sure whether more
> > > commands are required.  I didn't get a required scsi commands list
> > > in the failover document.
> >
> > So you want to be able to reserve the volume against concurrent
> > access ?  If you're using LVM, such a reservation should apply to the
> > LVM LV, not to the underlying physical storage device, clearly.  So I
> > think LIO [1] + PVSCSI might be what you want.
> >
> 
> Except it would mean writting a new Windows PV driver and replacing the
> existing block one.

Actually, that's quite desirable. The existing driver is actually a SCSI HBA 
driver (because that's all you can write is you use thr Windows STORPORT 
wrapper... which you need to if you want to pass WHQL) so making it use a PV 
scsi protocol would probably make it smaller since all the scsi <-> blkif 
translation code could be ripped out and all the stuff to synthesize INQUIRY 
responses based on vbd data in xenstore could go away too.

  Paul

> 
> 
> > Ian.
> >
> > [1] http://linux-iscsi.org/wiki/LIO

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-02 Thread Ian Jackson
Bob Liu writes ("Re: [RFC PATCH] xen-block: introduces extra request to 
pass-through SCSI commands"):
> On 03/02/2016 07:40 PM, Ian Jackson wrote:
> > I can't see how that could cause anything but pain.  In many cases
> > "the underlying SCSI storage target" wouldn't be well defined.  Even
> > if it was, these side channel SCSI commands are likely to Go Wrong in
> > exciting ways.
> > 
> > What SCSI commands do you want to send ?
> 
> * INQUIRY

... but why ?

> * PERSISTENT RESERVE IN
> * PERSISTENT RESERVE OUT
> 
> This is for Failover Clusters in Windows, not sure whether more
> commands are required.  I didn't get a required scsi commands list
> in the failover document.

So you want to be able to reserve the volume against concurrent
access ?  If you're using LVM, such a reservation should apply to the
LVM LV, not to the underlying physical storage device, clearly.  So I
think LIO [1] + PVSCSI might be what you want.

Ian.

[1] http://linux-iscsi.org/wiki/LIO

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-02 Thread Bob Liu

On 03/02/2016 07:40 PM, Ian Jackson wrote:
> Bob Liu writes ("Re: [RFC PATCH] xen-block: introduces extra request to 
> pass-through SCSI commands"):
>> Do you know whether pvscsi can work on top of multipath(the device-mapper 
>> framework) or LVMs?
> 
> No, it can't.  devmapper and LVM work with the block device
> abstraction.
> 
> Implicitly you seem to be suggesting that you want to use dm-multipath
> and LVM, but also send other SCSI CDBs from the upper layers through
> to the underlying SCSI storage target.
> 

Exactly!

> I can't see how that could cause anything but pain.  In many cases
> "the underlying SCSI storage target" wouldn't be well defined.  Even
> if it was, these side channel SCSI commands are likely to Go Wrong in
> exciting ways.
> 
> What SCSI commands do you want to send ?
> 

* INQUIRY

* PERSISTENT RESERVE IN
* PERSISTENT RESERVE OUT
This is for Failover Clusters in Windows, not sure whether more commands are 
required.
I didn't get a required scsi commands list in the failover document.

-- 
Regards,
-Bob

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-02 Thread Juergen Gross
On 02/03/16 12:46, Paul Durrant wrote:
>> -Original Message-
>> From: Ian Jackson [mailto:ian.jack...@eu.citrix.com]
>> Sent: 02 March 2016 11:41
>> To: Bob Liu
>> Cc: Juergen Gross; xen-devel@lists.xen.org; jbeul...@suse.com; Roger Pau
>> Monne; Paul Durrant; konrad.w...@oracle.com
>> Subject: Re: [RFC PATCH] xen-block: introduces extra request to pass-
>> through SCSI commands
>>
>> Bob Liu writes ("Re: [RFC PATCH] xen-block: introduces extra request to
>> pass-through SCSI commands"):
>>> Do you know whether pvscsi can work on top of multipath(the device-
>> mapper framework) or LVMs?
>>
>> No, it can't.  devmapper and LVM work with the block device
>> abstraction.
>>
> 
> Google suggests you can use LIO to present an LV as a virtual SCSI LUN, in 
> which case pvscsi should work.

Indeed: http://linux-iscsi.org/wiki/Targetcli#IBLOCK


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-02 Thread Paul Durrant
> -Original Message-
> From: Ian Jackson [mailto:ian.jack...@eu.citrix.com]
> Sent: 02 March 2016 11:41
> To: Bob Liu
> Cc: Juergen Gross; xen-devel@lists.xen.org; jbeul...@suse.com; Roger Pau
> Monne; Paul Durrant; konrad.w...@oracle.com
> Subject: Re: [RFC PATCH] xen-block: introduces extra request to pass-
> through SCSI commands
> 
> Bob Liu writes ("Re: [RFC PATCH] xen-block: introduces extra request to
> pass-through SCSI commands"):
> > Do you know whether pvscsi can work on top of multipath(the device-
> mapper framework) or LVMs?
> 
> No, it can't.  devmapper and LVM work with the block device
> abstraction.
>

Google suggests you can use LIO to present an LV as a virtual SCSI LUN, in 
which case pvscsi should work.

  Paul
 
> Implicitly you seem to be suggesting that you want to use dm-multipath
> and LVM, but also send other SCSI CDBs from the upper layers through
> to the underlying SCSI storage target.
> 
> I can't see how that could cause anything but pain.  In many cases
> "the underlying SCSI storage target" wouldn't be well defined.  Even
> if it was, these side channel SCSI commands are likely to Go Wrong in
> exciting ways.
> 
> What SCSI commands do you want to send ?
> 
> Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-02 Thread Ian Jackson
Bob Liu writes ("Re: [RFC PATCH] xen-block: introduces extra request to 
pass-through SCSI commands"):
> Do you know whether pvscsi can work on top of multipath(the device-mapper 
> framework) or LVMs?

No, it can't.  devmapper and LVM work with the block device
abstraction.

Implicitly you seem to be suggesting that you want to use dm-multipath
and LVM, but also send other SCSI CDBs from the upper layers through
to the underlying SCSI storage target.

I can't see how that could cause anything but pain.  In many cases
"the underlying SCSI storage target" wouldn't be well defined.  Even
if it was, these side channel SCSI commands are likely to Go Wrong in
exciting ways.

What SCSI commands do you want to send ?

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-01 Thread Bob Liu
Hi Juergen,

On 03/02/2016 03:39 PM, Juergen Gross wrote:
> On 01/03/16 19:08, Ian Jackson wrote:
>> Bob Liu writes ("Re: [RFC PATCH] xen-block: introduces extra request to 
>> pass-through SCSI commands"):
>>> One thing I'm still not sure about PVSCSI is do we have the same security 
>>> issue since LIO can interface to any block device.
>>> E.g when using a partition /dev/sda1 as the PVSCSI-backend, but the 
>>> PVSCSI-frontend may still send SCSI operates on LUN bases (the whole disk).
>>
>> I don't think you can use pvscsi to passthrough a partition such as
>> /dev/sda1.  Such a thing is not a SCSI command target.
> 
> It might be possible via the fileio target backend. In this case LUN
> based SCSI operations are ignored/refused/emulated by LIO.
> 

Do you know whether pvscsi can work on top of multipath(the device-mapper 
framework) or LVMs?
Thank you!

Bob

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-01 Thread Juergen Gross
On 01/03/16 19:08, Ian Jackson wrote:
> Bob Liu writes ("Re: [RFC PATCH] xen-block: introduces extra request to 
> pass-through SCSI commands"):
>> One thing I'm still not sure about PVSCSI is do we have the same security 
>> issue since LIO can interface to any block device.
>> E.g when using a partition /dev/sda1 as the PVSCSI-backend, but the 
>> PVSCSI-frontend may still send SCSI operates on LUN bases (the whole disk).
> 
> I don't think you can use pvscsi to passthrough a partition such as
> /dev/sda1.  Such a thing is not a SCSI command target.

It might be possible via the fileio target backend. In this case LUN
based SCSI operations are ignored/refused/emulated by LIO.


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-01 Thread Ian Jackson
Bob Liu writes ("Re: [RFC PATCH] xen-block: introduces extra request to 
pass-through SCSI commands"):
> One thing I'm still not sure about PVSCSI is do we have the same security 
> issue since LIO can interface to any block device.
> E.g when using a partition /dev/sda1 as the PVSCSI-backend, but the 
> PVSCSI-frontend may still send SCSI operates on LUN bases (the whole disk).

I don't think you can use pvscsi to passthrough a partition such as
/dev/sda1.  Such a thing is not a SCSI command target.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Bob Liu

On 03/01/2016 12:29 AM, Ian Jackson wrote:
> Ian Jackson writes ("Re: [RFC PATCH] xen-block: introduces extra request to 
> pass-through SCSI commands"):
>> [stuff suggesting use of PVSCSI instead]
> 
> For the avoidance of doubt:
> 
> 1. Thanks very much for bringing this proposal to us at the concept
> stage.  It is much easier to discuss these matters in a constructive
> way before a lot of effort has been put into an implementation.
> 
> 2. I should explain the downsides which I see in your proposal:
> 
> - Your suggestion has bad security properties: previously, the PV
>   block protocol would present only a very simple and narrow
>   interface.  Your SCSI CDB passthrough proposal means that guests
>   would be able to activate features in SCSI targets which would be
>   unexpected and unintended by the host administrator.  Such features
>   would perhaps even be unknown to the host administrator.
> 
>   This could be mitigated by making this feature configurable, of
>   course, defaulting to off, along with clear documentation.  But it's
>   not a desirable property.
> 
> - For similar reasons it will often be difficult to use such a feature
>   safely.  Guest software in particular might expect that it can
>   safely use whatever features it can see, and do all sorts of
>   exciting things.
> 
> - It involves duplicating multiplexing logic which already exists in
>   PVSCSI.
> 

One thing I'm still not sure about PVSCSI is do we have the same security issue 
since LIO can interface to any block device.
E.g when using a partition /dev/sda1 as the PVSCSI-backend, but the 
PVSCSI-frontend may still send SCSI operates on LUN bases (the whole disk).

P.S. Thanks to all of you, it helps a lot!

-- 
Regards,
-Bob

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Bob Liu

On 03/01/2016 12:29 AM, Ian Jackson wrote:
> Ian Jackson writes ("Re: [RFC PATCH] xen-block: introduces extra request to 
> pass-through SCSI commands"):
>> [stuff suggesting use of PVSCSI instead]
> 
> For the avoidance of doubt:
> 
> 1. Thanks very much for bringing this proposal to us at the concept
> stage.  It is much easier to discuss these matters in a constructive
> way before a lot of effort has been put into an implementation.
> 
> 2. I should explain the downsides which I see in your proposal:
> 
> - Your suggestion has bad security properties: previously, the PV
>   block protocol would present only a very simple and narrow
>   interface.  Your SCSI CDB passthrough proposal means that guests
>   would be able to activate features in SCSI targets which would be
>   unexpected and unintended by the host administrator.  Such features
>   would perhaps even be unknown to the host administrator.
> 
>   This could be mitigated by making this feature configurable, of
>   course, defaulting to off, along with clear documentation.  But it's
>   not a desirable property.
> 
> - For similar reasons it will often be difficult to use such a feature
>   safely.  Guest software in particular might expect that it can
>   safely use whatever features it can see, and do all sorts of
>   exciting things.
> 
> - It involves duplicating multiplexing logic which already exists in
>   PVSCSI.
> 

One thing I'm still not sure about PVSCSI is do we have the same security issue 
since LIO can interface to any block device.
E.g when using a partition /dev/sda1 as the PVSCSI-backend, but the 
PVSCSI-frontend may send SCSI operates on LUN bases (the whole disk).

P.S. Thanks to all of you, it helps a lot!

-- 
Regards,
-Bob

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Paul Durrant
> -Original Message-
[snip]
> > > I was casting my mind back to incompatibilities that crept in with
> persistent grants. TBH I haven't used blkback much since then; I tend to use
> qemu qdisk as my backend these days.
> >
> > FWIW, QEMU Qdisk also has persistent grants support... and was
> > implemented more or less at the same time as blkback persistent grants.
> 
> 
> 
> Perhaps Pauls' definition of  'recent times' means == two years or so :-)
> 
> :-)

Indeed. That's a blink of an eye for someone as ancient as me :-)

  Paul

> >
> > Roger.
> >

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Konrad Rzeszutek Wilk
On Mon, Feb 29, 2016 at 04:35:36PM +0100, Roger Pau Monné wrote:
> El 29/2/16 a les 16:28, Paul Durrant ha escrit:
> >> -Original Message-
> >> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> >> Sent: 29 February 2016 14:56
> >> To: Paul Durrant
> >> Cc: Bob Liu; xen-devel@lists.xen.org; Ian Jackson; jbeul...@suse.com; Roger
> >> Pau Monne; jgr...@suse.com
> >> Subject: Re: [RFC PATCH] xen-block: introduces extra request to pass-
> >> through SCSI commands
> >>
> >> On Mon, Feb 29, 2016 at 09:13:41AM +, Paul Durrant wrote:
>  -Original Message-
>  From: Bob Liu [mailto:bob@oracle.com]
>  Sent: 29 February 2016 03:37
>  To: xen-devel@lists.xen.org
>  Cc: Ian Jackson; jbeul...@suse.com; Roger Pau Monne;
> >> jgr...@suse.com;
>  Paul Durrant; konrad.w...@oracle.com; Bob Liu
>  Subject: [RFC PATCH] xen-block: introduces extra request to pass-
>   * xen-block has been existed for many years, widely used and more
> >> stable.
> 
> >>>
> >>> It's definitely widely used, but it has had stability issues in recent 
> >>> times.
> >>
> >> Oh? Could you send the bug-reports to me and Roger, CC xen-devel and
> >> LKML please ?
> > 
> > I was casting my mind back to incompatibilities that crept in with 
> > persistent grants. TBH I haven't used blkback much since then; I tend to 
> > use qemu qdisk as my backend these days.
> 
> FWIW, QEMU Qdisk also has persistent grants support... and was
> implemented more or less at the same time as blkback persistent grants.



Perhaps Pauls' definition of  'recent times' means == two years or so :-)

:-)
> 
> Roger.
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Ian Jackson
Ian Jackson writes ("Re: [RFC PATCH] xen-block: introduces extra request to 
pass-through SCSI commands"):
> [stuff suggesting use of PVSCSI instead]

For the avoidance of doubt:

1. Thanks very much for bringing this proposal to us at the concept
stage.  It is much easier to discuss these matters in a constructive
way before a lot of effort has been put into an implementation.

2. I should explain the downsides which I see in your proposal:

- Your suggestion has bad security properties: previously, the PV
  block protocol would present only a very simple and narrow
  interface.  Your SCSI CDB passthrough proposal means that guests
  would be able to activate features in SCSI targets which would be
  unexpected and unintended by the host administrator.  Such features
  would perhaps even be unknown to the host administrator.

  This could be mitigated by making this feature configurable, of
  course, defaulting to off, along with clear documentation.  But it's
  not a desirable property.

- For similar reasons it will often be difficult to use such a feature
  safely.  Guest software in particular might expect that it can
  safely use whatever features it can see, and do all sorts of
  exciting things.

- It involves duplicating multiplexing logic which already exists in
  PVSCSI.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Ian Jackson
Bob Liu writes ("[RFC PATCH] xen-block: introduces extra request to 
pass-through SCSI commands"):
> 4) Alternative approach: Using PVSCSI instead:
>  * Doubt PVSCSI can support as many type of backend storage devices
>as Xen-block.

I don't understand why this would be the case.  Your "extra requests"
are SCSI CDBs, so your proposal only works with SCSI targets.

>  * Much longer path:
>ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> 
> PVSCSI-backend -> Target framework(LIO?) ->

The pvscsi pieces here is precisely the multiplexer that you really
want, for this all to work correctly.

If you want your guests to be able to drive this thing properly, you
need to make the thing look like a SCSI target anyway.  So effectively
you've reinvented PVSCSI.

When you say "longer path" do you really mean that the performance is
poor ?  I don't think this is due to the number of layers here.  The
layers are quite thin.

If there is a performance problem with PVSCSI then perhaps that should
be investigated.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Paul Durrant
> -Original Message-
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: 29 February 2016 14:56
> To: Paul Durrant
> Cc: Bob Liu; xen-devel@lists.xen.org; Ian Jackson; jbeul...@suse.com; Roger
> Pau Monne; jgr...@suse.com
> Subject: Re: [RFC PATCH] xen-block: introduces extra request to pass-
> through SCSI commands
> 
> On Mon, Feb 29, 2016 at 09:13:41AM +, Paul Durrant wrote:
> > > -Original Message-
> > > From: Bob Liu [mailto:bob@oracle.com]
> > > Sent: 29 February 2016 03:37
> > > To: xen-devel@lists.xen.org
> > > Cc: Ian Jackson; jbeul...@suse.com; Roger Pau Monne;
> jgr...@suse.com;
> > > Paul Durrant; konrad.w...@oracle.com; Bob Liu
> > > Subject: [RFC PATCH] xen-block: introduces extra request to pass-
> through
> > > SCSI commands
> > >
> > > 1) What is this patch about?
> > > This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
> > > A request with BLKIF_OP_EXTRA_FLAG set means the following request
> is an
> > > extra request which is used to pass through SCSI commands.
> > > This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> > > It can be extended easily to transmit other per-request/bio data from
> > > frontend
> > > to backend e.g Data Integrity Field per bio.
> > >
> > > 2) Why we need this?
> > > Currently only raw data segments are transmitted from blkfront to
> blkback,
> > > which
> > > means some advanced features are lost.
> > >  * Guest knows nothing about features of the real backend storage.
> > >   For example, on bare-metal environment INQUIRY SCSI command
> > > can be used
> > >   to query storage device information. If it's a SSD or flash device we
> > >   can have the option to use the device as a fast cache.
> > >   But this can't happen in current domU guests, because blkfront only
> > >   knows it's just a normal virtual disk
> > >
> >
> > That's the sort of information that should be advertised via xenstore then.
> There already feature flags for specific things but if some form of
> throughput/latency information is meaningful to a frontend stack then
> perhaps that can be advertised too.
> 
> Certainly could be put on the XenStore. Do you envision this being done
> pre guest creation (so toolstack does it), or the backend finds this
> and populates the XenStore keys?
> 
> Or that the frontend writes an XenStore key 'scsi-inq=vpd80' and the
> backend
> responds by populating an 'scsi-inq-vpd80=' '? If so can
> the XenStore accept binary payloads? Can it be more than 4K?
> 

I was thinking more along the lines of blkback creating xenstore keys with any 
relevant information. We have sector size and number of sectors already but I 
don't see any harm in having values for other quantities that may be useful to 
a frontend. Bouncing SCSI inquiries via xenstore was certainly not what I was 
thinking.

> 
> >
> > >  * Failover Clusters in Windows
> > >   Failover clusters require SCSI-3 persistent reservation target disks,
> > >   but now this can't work in domU.
> > >
> >
> > That's true but allowing arbitrary SCSI messages through is not the way
> forward IMO. Just because Windows thinks every HBA is SCSI doesn't mean
> other OS do so I think reservation/release should have dedicated messages
> in the blkif protocol if it's desirable to support clustering in the frontend.
> 
> Could you expand a bit on the 'dedicated message' you have in mind please?
> 

As in we create message types to reserve/release whatever backend is being used 
and the backend uses whatever mechanism is appropriate to deal with that. E.g. 
if it were qdisk talking to a raw file then that might just be an flock.

> >
> > > 3) Known issues:
> > >  * Security issues, how to 'validate' this extra request payload.
> > >E.g SCSI operates on LUN bases (the whole disk) while we really just
> want
> > > to
> > >operate on partitions
> > >
> > >  * Can't pass SCSI commands through if the backend storage driver is bio-
> > > based
> > >instead of request-based.
> > >
> > > 4) Alternative approach: Using PVSCSI instead:
> > >  * Doubt PVSCSI can support as many type of backend storage devices as
> > > Xen-block.
> > >
> >
> > LIO can interface to any block device in much the same way blkback does
> IIRC.
> 
> But it can't do multipath or LVMs - which is an integral component.
> 

Really? I was not aware of that limitation and it surprises me since AFAIK LIO 
can also use a raw file as a backstore which seems like it would be above 
either of those.

> Anyhow that is more of a implementation specific quirk.
> >
> > >  * Much longer path:
> > >ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> 
> > > PVSCSI-
> > > backend -> Target framework(LIO?) ->
> > >
> > >With xen-block we only need:
> > >ioctl() -> blkfront -> blkback ->
> > >
> >
> > ...and what happens if the block device that blkback is talking to is a SCSI
> LUN?
> >
> > That latter path is also not true for Windows. You've got all the SCSI
> t

Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Roger Pau Monné
El 29/2/16 a les 16:28, Paul Durrant ha escrit:
>> -Original Message-
>> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
>> Sent: 29 February 2016 14:56
>> To: Paul Durrant
>> Cc: Bob Liu; xen-devel@lists.xen.org; Ian Jackson; jbeul...@suse.com; Roger
>> Pau Monne; jgr...@suse.com
>> Subject: Re: [RFC PATCH] xen-block: introduces extra request to pass-
>> through SCSI commands
>>
>> On Mon, Feb 29, 2016 at 09:13:41AM +, Paul Durrant wrote:
 -Original Message-
 From: Bob Liu [mailto:bob@oracle.com]
 Sent: 29 February 2016 03:37
 To: xen-devel@lists.xen.org
 Cc: Ian Jackson; jbeul...@suse.com; Roger Pau Monne;
>> jgr...@suse.com;
 Paul Durrant; konrad.w...@oracle.com; Bob Liu
 Subject: [RFC PATCH] xen-block: introduces extra request to pass-
  * xen-block has been existed for many years, widely used and more
>> stable.

>>>
>>> It's definitely widely used, but it has had stability issues in recent 
>>> times.
>>
>> Oh? Could you send the bug-reports to me and Roger, CC xen-devel and
>> LKML please ?
> 
> I was casting my mind back to incompatibilities that crept in with persistent 
> grants. TBH I haven't used blkback much since then; I tend to use qemu qdisk 
> as my backend these days.

FWIW, QEMU Qdisk also has persistent grants support... and was
implemented more or less at the same time as blkback persistent grants.

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Juergen Gross
On 29/02/16 16:05, Konrad Rzeszutek Wilk wrote:
> On Mon, Feb 29, 2016 at 09:12:30AM +0100, Juergen Gross wrote:
>> On 29/02/16 04:37, Bob Liu wrote:
>>> 1) What is this patch about?
>>> This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
>>> A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
>>> extra request which is used to pass through SCSI commands.
>>> This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
>>> It can be extended easily to transmit other per-request/bio data from 
>>> frontend
>>> to backend e.g Data Integrity Field per bio.
>>>
>>> 2) Why we need this?
>>> Currently only raw data segments are transmitted from blkfront to blkback, 
>>> which
>>> means some advanced features are lost.
>>>  * Guest knows nothing about features of the real backend storage.
>>> For example, on bare-metal environment INQUIRY SCSI command can be used
>>> to query storage device information. If it's a SSD or flash device we
>>> can have the option to use the device as a fast cache.
>>> But this can't happen in current domU guests, because blkfront only
>>> knows it's just a normal virtual disk
>>>
>>>  * Failover Clusters in Windows
>>> Failover clusters require SCSI-3 persistent reservation target disks,
>>> but now this can't work in domU.
>>>
>>> 3) Known issues:
>>>  * Security issues, how to 'validate' this extra request payload.
>>>E.g SCSI operates on LUN bases (the whole disk) while we really just 
>>> want to
>>>operate on partitions
>>
>> It's not only validation: some operations just affect the whole LUN
>> (e.g. Reserve/Release). And what about "multi-LUN" commands like
>> "report LUNs"?
> 
> Don't expose them. Bob and I want to get an idea of what would be a good
> compromise to allow some SCSI specific (or perhaps ATA specific or DIF/DIX?) 
> type of
> commands go through the PV driver.
> 
> Would it be better if it was through XenBus? But that may not work for some
> that are tied closely to requests, such as DIF/DIX.
> 
> However the 'DISCARD' for example worked out - it is an umbrella for both
> SCSI UNMAP and ATA DISCARD operation and hides the complexity of the low level
> protocol. Could there be an 'INQ' ? Since the SCSI VPD is the most exhaustive
> in terms of details it may make sense to base it on that..?
> 
>>
>>>  * Can't pass SCSI commands through if the backend storage driver is 
>>> bio-based
>>>instead of request-based.
>>>
>>> 4) Alternative approach: Using PVSCSI instead:
>>>  * Doubt PVSCSI can support as many type of backend storage devices as 
>>> Xen-block.
>>
>> pvSCSI won't need to support all types of backends. It's enough to
>> support those where passing through SCSI commands makes sense.
>>
>> Seems to be a similar issue as the above mentioned problem with
>> bio-based backend storage drivers.
> 
> In particular the ones we care about are:
>  - Multipath over FibreChannel devices.
>  - Linear mapping (LVM) over the multipath.
>  - And then potentially an filesystem on top of that
>  - .. and a raw file on the filesystem.
> 
> Having SCSI VPD 0x83 page sent to frontend for that would be good.
> 
> Not sure about SCSI reservations. Perhaps those are more of .. unique
> in that the implementation would have to make sure that the guest
> owns the whole LUN. But that is implementation question.
> 
> This is about the design - how would you envision to to cram in 
> SCSI commands or DIF/DIX commands or ATA commands via PV block layer?

Have some kind of abstraction which can be mapped to SCSI commands
easily, but don't stick to the SCSI definitions. Use your own
structures, commands etc. and build SCSI commands from those in the
backend. This way you avoid having to emulate SCSI. Instead of
naming it "SCSI passthrough" call it "special commands". :-)
I would add only those operations you really need. Add an inquiry
operation which returns the supported "special commands". The
availability of the inquiry can be reported via Xenstore.

> 
>>
>>>  * Much longer path:
>>>ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> 
>>> PVSCSI-backend -> Target framework(LIO?) ->
>>>
>>>With xen-block we only need:
>>>ioctl() -> blkfront -> blkback ->
>>
>> I'd like to see performance numbers before making a decision.
> 
> For SCSI INQ? 
> 
> Or are you talking about raw READ/WRITE?

READ/WRITE. I'm quite sure pvSCSI will be slower than pvblk, but I don't
know how much slower.

>>
>>>  * xen-block has been existed for many years, widely used and more stable.
>>
>> Adding another SCSI passthrough capability wasn't accepted for pvSCSI
>> (that's the reason I used the Target Framework). Why do you think it
>> will be accepted for pvblk?
>>
>> This is not my personal opinion, just a heads up from someone who had a
>> try already. ;-)
> 
> Right. So SCSI passthrough out. What about mediated access for
> specific SCSI, or specific ATA, or DIF/DIX ones? And how would you do it
> 

Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Konrad Rzeszutek Wilk
On Mon, Feb 29, 2016 at 09:12:30AM +0100, Juergen Gross wrote:
> On 29/02/16 04:37, Bob Liu wrote:
> > 1) What is this patch about?
> > This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
> > A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
> > extra request which is used to pass through SCSI commands.
> > This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> > It can be extended easily to transmit other per-request/bio data from 
> > frontend
> > to backend e.g Data Integrity Field per bio.
> > 
> > 2) Why we need this?
> > Currently only raw data segments are transmitted from blkfront to blkback, 
> > which
> > means some advanced features are lost.
> >  * Guest knows nothing about features of the real backend storage.
> > For example, on bare-metal environment INQUIRY SCSI command can be used
> > to query storage device information. If it's a SSD or flash device we
> > can have the option to use the device as a fast cache.
> > But this can't happen in current domU guests, because blkfront only
> > knows it's just a normal virtual disk
> > 
> >  * Failover Clusters in Windows
> > Failover clusters require SCSI-3 persistent reservation target disks,
> > but now this can't work in domU.
> > 
> > 3) Known issues:
> >  * Security issues, how to 'validate' this extra request payload.
> >E.g SCSI operates on LUN bases (the whole disk) while we really just 
> > want to
> >operate on partitions
> 
> It's not only validation: some operations just affect the whole LUN
> (e.g. Reserve/Release). And what about "multi-LUN" commands like
> "report LUNs"?

Don't expose them. Bob and I want to get an idea of what would be a good
compromise to allow some SCSI specific (or perhaps ATA specific or DIF/DIX?) 
type of
commands go through the PV driver.

Would it be better if it was through XenBus? But that may not work for some
that are tied closely to requests, such as DIF/DIX.

However the 'DISCARD' for example worked out - it is an umbrella for both
SCSI UNMAP and ATA DISCARD operation and hides the complexity of the low level
protocol. Could there be an 'INQ' ? Since the SCSI VPD is the most exhaustive
in terms of details it may make sense to base it on that..?

> 
> >  * Can't pass SCSI commands through if the backend storage driver is 
> > bio-based
> >instead of request-based.
> > 
> > 4) Alternative approach: Using PVSCSI instead:
> >  * Doubt PVSCSI can support as many type of backend storage devices as 
> > Xen-block.
> 
> pvSCSI won't need to support all types of backends. It's enough to
> support those where passing through SCSI commands makes sense.
> 
> Seems to be a similar issue as the above mentioned problem with
> bio-based backend storage drivers.

In particular the ones we care about are:
 - Multipath over FibreChannel devices.
 - Linear mapping (LVM) over the multipath.
 - And then potentially an filesystem on top of that
 - .. and a raw file on the filesystem.

Having SCSI VPD 0x83 page sent to frontend for that would be good.

Not sure about SCSI reservations. Perhaps those are more of .. unique
in that the implementation would have to make sure that the guest
owns the whole LUN. But that is implementation question.

This is about the design - how would you envision to to cram in 
SCSI commands or DIF/DIX commands or ATA commands via PV block layer?

> 
> >  * Much longer path:
> >ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> 
> > PVSCSI-backend -> Target framework(LIO?) ->
> > 
> >With xen-block we only need:
> >ioctl() -> blkfront -> blkback ->
> 
> I'd like to see performance numbers before making a decision.

For SCSI INQ? 

Or are you talking about raw READ/WRITE?
> 
> >  * xen-block has been existed for many years, widely used and more stable.
> 
> Adding another SCSI passthrough capability wasn't accepted for pvSCSI
> (that's the reason I used the Target Framework). Why do you think it
> will be accepted for pvblk?
> 
> This is not my personal opinion, just a heads up from someone who had a
> try already. ;-)

Right. So SCSI passthrough out. What about mediated access for
specific SCSI, or specific ATA, or DIF/DIX ones? And how would you do it
knowing the SCSI maintainers much better than we do?


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Konrad Rzeszutek Wilk
On Mon, Feb 29, 2016 at 09:13:41AM +, Paul Durrant wrote:
> > -Original Message-
> > From: Bob Liu [mailto:bob@oracle.com]
> > Sent: 29 February 2016 03:37
> > To: xen-devel@lists.xen.org
> > Cc: Ian Jackson; jbeul...@suse.com; Roger Pau Monne; jgr...@suse.com;
> > Paul Durrant; konrad.w...@oracle.com; Bob Liu
> > Subject: [RFC PATCH] xen-block: introduces extra request to pass-through
> > SCSI commands
> > 
> > 1) What is this patch about?
> > This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
> > A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
> > extra request which is used to pass through SCSI commands.
> > This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> > It can be extended easily to transmit other per-request/bio data from
> > frontend
> > to backend e.g Data Integrity Field per bio.
> > 
> > 2) Why we need this?
> > Currently only raw data segments are transmitted from blkfront to blkback,
> > which
> > means some advanced features are lost.
> >  * Guest knows nothing about features of the real backend storage.
> > For example, on bare-metal environment INQUIRY SCSI command
> > can be used
> > to query storage device information. If it's a SSD or flash device we
> > can have the option to use the device as a fast cache.
> > But this can't happen in current domU guests, because blkfront only
> > knows it's just a normal virtual disk
> > 
> 
> That's the sort of information that should be advertised via xenstore then. 
> There already feature flags for specific things but if some form of 
> throughput/latency information is meaningful to a frontend stack then perhaps 
> that can be advertised too.

Certainly could be put on the XenStore. Do you envision this being done
pre guest creation (so toolstack does it), or the backend finds this
and populates the XenStore keys?

Or that the frontend writes an XenStore key 'scsi-inq=vpd80' and the backend
responds by populating an 'scsi-inq-vpd80=' '? If so can
the XenStore accept binary payloads? Can it be more than 4K?


> 
> >  * Failover Clusters in Windows
> > Failover clusters require SCSI-3 persistent reservation target disks,
> > but now this can't work in domU.
> > 
> 
> That's true but allowing arbitrary SCSI messages through is not the way 
> forward IMO. Just because Windows thinks every HBA is SCSI doesn't mean other 
> OS do so I think reservation/release should have dedicated messages in the 
> blkif protocol if it's desirable to support clustering in the frontend.

Could you expand a bit on the 'dedicated message' you have in mind please?

> 
> > 3) Known issues:
> >  * Security issues, how to 'validate' this extra request payload.
> >E.g SCSI operates on LUN bases (the whole disk) while we really just want
> > to
> >operate on partitions
> > 
> >  * Can't pass SCSI commands through if the backend storage driver is bio-
> > based
> >instead of request-based.
> > 
> > 4) Alternative approach: Using PVSCSI instead:
> >  * Doubt PVSCSI can support as many type of backend storage devices as
> > Xen-block.
> > 
> 
> LIO can interface to any block device in much the same way blkback does IIRC.

But it can't do multipath or LVMs - which is an integral component.

Anyhow that is more of a implementation specific quirk.
> 
> >  * Much longer path:
> >ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> PVSCSI-
> > backend -> Target framework(LIO?) ->
> > 
> >With xen-block we only need:
> >ioctl() -> blkfront -> blkback ->
> > 
> 
> ...and what happens if the block device that blkback is talking to is a SCSI 
> LUN?
> 
> That latter path is also not true for Windows. You've got all the SCSI 
> translation logic in the frontend when using blkif so that first path would 
> collapse to:
> 
> Disk driver -> (SCSI) HBA Driver -> xen-scsiback -> LIO -> backstore -> XXX

I don't know if it matters on the length of the path for say SCSI INQ. It isn't 
like
that is performance specific. Neither are the clustering SCSI commands.

> 
> >  * xen-block has been existed for many years, widely used and more stable.
> > 
> 
> It's definitely widely used, but it has had stability issues in recent times.

Oh? Could you send the bug-reports to me and Roger, CC xen-devel and LKML 
please ?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Paul Durrant
> -Original Message-
> From: Bob Liu [mailto:bob@oracle.com]
> Sent: 29 February 2016 03:37
> To: xen-devel@lists.xen.org
> Cc: Ian Jackson; jbeul...@suse.com; Roger Pau Monne; jgr...@suse.com;
> Paul Durrant; konrad.w...@oracle.com; Bob Liu
> Subject: [RFC PATCH] xen-block: introduces extra request to pass-through
> SCSI commands
> 
> 1) What is this patch about?
> This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
> A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
> extra request which is used to pass through SCSI commands.
> This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> It can be extended easily to transmit other per-request/bio data from
> frontend
> to backend e.g Data Integrity Field per bio.
> 
> 2) Why we need this?
> Currently only raw data segments are transmitted from blkfront to blkback,
> which
> means some advanced features are lost.
>  * Guest knows nothing about features of the real backend storage.
>   For example, on bare-metal environment INQUIRY SCSI command
> can be used
>   to query storage device information. If it's a SSD or flash device we
>   can have the option to use the device as a fast cache.
>   But this can't happen in current domU guests, because blkfront only
>   knows it's just a normal virtual disk
> 

That's the sort of information that should be advertised via xenstore then. 
There already feature flags for specific things but if some form of 
throughput/latency information is meaningful to a frontend stack then perhaps 
that can be advertised too.

>  * Failover Clusters in Windows
>   Failover clusters require SCSI-3 persistent reservation target disks,
>   but now this can't work in domU.
> 

That's true but allowing arbitrary SCSI messages through is not the way forward 
IMO. Just because Windows thinks every HBA is SCSI doesn't mean other OS do so 
I think reservation/release should have dedicated messages in the blkif 
protocol if it's desirable to support clustering in the frontend.

> 3) Known issues:
>  * Security issues, how to 'validate' this extra request payload.
>E.g SCSI operates on LUN bases (the whole disk) while we really just want
> to
>operate on partitions
> 
>  * Can't pass SCSI commands through if the backend storage driver is bio-
> based
>instead of request-based.
> 
> 4) Alternative approach: Using PVSCSI instead:
>  * Doubt PVSCSI can support as many type of backend storage devices as
> Xen-block.
> 

LIO can interface to any block device in much the same way blkback does IIRC.

>  * Much longer path:
>ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> PVSCSI-
> backend -> Target framework(LIO?) ->
> 
>With xen-block we only need:
>ioctl() -> blkfront -> blkback ->
> 

...and what happens if the block device that blkback is talking to is a SCSI 
LUN?

That latter path is also not true for Windows. You've got all the SCSI 
translation logic in the frontend when using blkif so that first path would 
collapse to:

Disk driver -> (SCSI) HBA Driver -> xen-scsiback -> LIO -> backstore -> XXX

>  * xen-block has been existed for many years, widely used and more stable.
> 

It's definitely widely used, but it has had stability issues in recent times.

  Paul

> Welcome any input, thank you!
> 
> Signed-off-by: Bob Liu 
> ---
>  xen/include/public/io/blkif.h |   73
> +
>  1 file changed, 73 insertions(+)
> 
> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
> index 99f0326..7c10bce 100644
> --- a/xen/include/public/io/blkif.h
> +++ b/xen/include/public/io/blkif.h
> @@ -635,6 +635,28 @@
>  #define BLKIF_OP_INDIRECT  6
> 
>  /*
> + * Recognised only if "feature-extra-request" is present in backend xenbus
> info.
> + * A request with BLKIF_OP_EXTRA_FLAG indicates an extra request is
> followed
> + * in the shared ring buffer.
> + *
> + * By this way, extra data like SCSI command, DIF/DIX and other per-
> request/bio
> + * data can be transmitted from frontend to backend.
> + *
> + * The 'wire' format is like:
> + *  Request 1: xen_blkif_request
> + * [Request 2: xen_blkif_extra_request](only if request 1 has
> BLKIF_OP_EXTRA_FLAG)
> + *  Request 3: xen_blkif_request
> + *  Request 4: xen_blkif_request
> + * [Request 5: xen_blkif_extra_request](only if request 4 has
> BLKIF_OP_EXTRA_FLAG)
> + *  ...
> + *  Request N: xen_blkif_request
> + *
> + * If a backend does not recognize BLKIF_OP_EXTRA_FLAG, it should *not*
> create the
> + * "feature-extra-request" node!
> + */
> +#define BLKIF_OP_EXTRA_FLAG (0x80)
> +
> +/*
>   * Maximum scatter/gather segments per request.
>   * This is carefully chosen so that sizeof(blkif_ring_t) <= PAGE_SIZE.
>   * NB. This could be 12 if the ring indexes weren't stored in the same page.
> @@ -703,10 +725,61 @@ struct blkif_request_indirect {
>  };
>  typedef struct blkif_request_indirect blkif_request_in

Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Juergen Gross
On 29/02/16 04:37, Bob Liu wrote:
> 1) What is this patch about?
> This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
> A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
> extra request which is used to pass through SCSI commands.
> This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> It can be extended easily to transmit other per-request/bio data from frontend
> to backend e.g Data Integrity Field per bio.
> 
> 2) Why we need this?
> Currently only raw data segments are transmitted from blkfront to blkback, 
> which
> means some advanced features are lost.
>  * Guest knows nothing about features of the real backend storage.
>   For example, on bare-metal environment INQUIRY SCSI command can be used
>   to query storage device information. If it's a SSD or flash device we
>   can have the option to use the device as a fast cache.
>   But this can't happen in current domU guests, because blkfront only
>   knows it's just a normal virtual disk
> 
>  * Failover Clusters in Windows
>   Failover clusters require SCSI-3 persistent reservation target disks,
>   but now this can't work in domU.
> 
> 3) Known issues:
>  * Security issues, how to 'validate' this extra request payload.
>E.g SCSI operates on LUN bases (the whole disk) while we really just want 
> to
>operate on partitions

It's not only validation: some operations just affect the whole LUN
(e.g. Reserve/Release). And what about "multi-LUN" commands like
"report LUNs"?

>  * Can't pass SCSI commands through if the backend storage driver is bio-based
>instead of request-based.
> 
> 4) Alternative approach: Using PVSCSI instead:
>  * Doubt PVSCSI can support as many type of backend storage devices as 
> Xen-block.

pvSCSI won't need to support all types of backends. It's enough to
support those where passing through SCSI commands makes sense.

Seems to be a similar issue as the above mentioned problem with
bio-based backend storage drivers.

>  * Much longer path:
>ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> 
> PVSCSI-backend -> Target framework(LIO?) ->
> 
>With xen-block we only need:
>ioctl() -> blkfront -> blkback ->

I'd like to see performance numbers before making a decision.

>  * xen-block has been existed for many years, widely used and more stable.

Adding another SCSI passthrough capability wasn't accepted for pvSCSI
(that's the reason I used the Target Framework). Why do you think it
will be accepted for pvblk?

This is not my personal opinion, just a heads up from someone who had a
try already. ;-)


Juergen

> Welcome any input, thank you!
> 
> Signed-off-by: Bob Liu 
> ---
>  xen/include/public/io/blkif.h |   73 
> +
>  1 file changed, 73 insertions(+)
> 
> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
> index 99f0326..7c10bce 100644
> --- a/xen/include/public/io/blkif.h
> +++ b/xen/include/public/io/blkif.h
> @@ -635,6 +635,28 @@
>  #define BLKIF_OP_INDIRECT  6
>  
>  /*
> + * Recognised only if "feature-extra-request" is present in backend xenbus 
> info.
> + * A request with BLKIF_OP_EXTRA_FLAG indicates an extra request is followed
> + * in the shared ring buffer.
> + *
> + * By this way, extra data like SCSI command, DIF/DIX and other 
> per-request/bio
> + * data can be transmitted from frontend to backend.
> + *
> + * The 'wire' format is like:
> + *  Request 1: xen_blkif_request
> + * [Request 2: xen_blkif_extra_request](only if request 1 has 
> BLKIF_OP_EXTRA_FLAG)
> + *  Request 3: xen_blkif_request
> + *  Request 4: xen_blkif_request
> + * [Request 5: xen_blkif_extra_request](only if request 4 has 
> BLKIF_OP_EXTRA_FLAG)
> + *  ...
> + *  Request N: xen_blkif_request
> + *
> + * If a backend does not recognize BLKIF_OP_EXTRA_FLAG, it should *not* 
> create the
> + * "feature-extra-request" node!
> + */
> +#define BLKIF_OP_EXTRA_FLAG (0x80)
> +
> +/*
>   * Maximum scatter/gather segments per request.
>   * This is carefully chosen so that sizeof(blkif_ring_t) <= PAGE_SIZE.
>   * NB. This could be 12 if the ring indexes weren't stored in the same page.
> @@ -703,10 +725,61 @@ struct blkif_request_indirect {
>  };
>  typedef struct blkif_request_indirect blkif_request_indirect_t;
>  
> +enum blkif_extra_request_type {
> + BLKIF_EXTRA_TYPE_SCSI_CMD = 1,  /* Transmit SCSI command.  */
> +};
> +
> +struct scsi_cmd_req {
> + /*
> +  * Grant mapping for transmiting SCSI command to backend, and
> +  * also receive sense data from backend.
> +  * One 4KB page is enough.
> +  */
> + grant_ref_t cmd_gref;
> + /* Length of SCSI command in the grant mapped page. */
> + unsigned int cmd_len;
> +
> + /*
> +  * SCSI command may require transmiting data segment length less
> +  * than a sector(512 bytes).
> +  * Record num_sg and last segment length in extra request so that