date:20120718

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Paolo Bonzini

Il 18/07/2012 21:12, Anthony Liguori ha scritto:
> Is that true for all OSes?  Linux may handle things gracefully if UNMAP
> starts throwing errors but that doesn't mean that Windows will.

There is so much USB crap (not just removable, think of ATA-to-USB
enclosures) that they have to deal with pretty much everything.

> Windows does this with a points system and I do believe that INQUIRY
> responses from any local disks are included in this tally.

INQUIRY responses (at least vendor/product/type) should not change.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[GIT PULL] target fixes for v3.5

2012-07-18 Thread Nicholas A. Bellinger

Hi Linus,

Here are the last handful of target fixes that have come in recently
that are headed for-3.5 final.  Please go ahead and pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git master

This includes a bugfix from MDR to address a NULL pointer OOPs with FCoE
aborts, along with a WRITE_SAME emulation bugfix for NOLB=0 cases +
persistent reservation return cleanups from Roland.

All three patches are CC'ed to stable.

Thank you!

--nab

Mark Rustad (1):
  tcm_fc: Fix crash seen with aborts and large reads

Roland Dreier (2):
  target: Clean up returning errors in PR handling code
  target: Fix range calculation in WRITE SAME emulation when num blocks
== 0

 drivers/target/target_core_cdb.c |2 +-
 drivers/target/target_core_pr.c  |7 ---
 drivers/target/tcm_fc/tfc_cmd.c  |2 ++
 3 files changed, 7 insertions(+), 4 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] target: Allow for target_submit_cmd() returning errors

2012-07-18 Thread Nicholas A. Bellinger

On Wed, 2012-07-18 at 11:10 +0200, Sebastian Andrzej Siewior wrote:
> On 07/18/2012 02:05 AM, Nicholas A. Bellinger wrote:
> >> index 02ace18..9ddf315 100644
> >> --- a/drivers/usb/gadget/tcm_usb_gadget.c
> >> +++ b/drivers/usb/gadget/tcm_usb_gadget.c
> >> @@ -1060,7 +1060,10 @@ static void usbg_cmd_work(struct work_struct *work)
> >>tpg = cmd->fu->tpg;
> >>tv_nexus = tpg->tpg_nexus;
> >>dir = get_cmd_dir(cmd->cmd_buf);
> >> -  if (dir<  0) {
> >> +  if (dir<  0 ||
> >> +  target_submit_cmd(se_cmd, tv_nexus->tvn_se_sess,
> >> +cmd->cmd_buf, cmd->sense_iu.sense, 
> >> cmd->unpacked_lun,
> >> +0, cmd->prio_attr, dir, TARGET_SCF_UNKNOWN_SIZE)) 
> >> {
> >>transport_init_se_cmd(se_cmd,
> >>tv_nexus->tvn_se_sess->se_tpg->se_tpg_tfo,
> >>tv_nexus->tvn_se_sess, cmd->data_len, DMA_NONE,
> >> @@ -1069,12 +1072,7 @@ static void usbg_cmd_work(struct work_struct *work)
> >>transport_send_check_condition_and_sense(se_cmd,
> >>TCM_UNSUPPORTED_SCSI_OPCODE, 1);
> >>usbg_cleanup_cmd(cmd);
> >> -  return;
> >>}
> >> -
> >> -  target_submit_cmd(se_cmd, tv_nexus->tvn_se_sess,
> >> -  cmd->cmd_buf, cmd->sense_iu.sense, cmd->unpacked_lun,
> >> -  0, cmd->prio_attr, dir, TARGET_SCF_UNKNOWN_SIZE);
> >>   }
> >>
> >>   static int usbg_submit_command(struct f_uas *fu,
> >
> > Looking at this part again target_submit_cmd() has been moved ahead of
> > target_init_se_cmd() in the exception path, which ends up getting called
> > twice during a target_get_sess_cmd() failure..
> >
> > It looks like usbg_cmd_work() + bot_cmd_work() will need some work if
> > they intends to use a non zero failure to signal exception here, without
> > invoking a CHECK_CONDITION and sense.  Actually, I'm not even sure
> > sending a CHECK_CONDITION here after the target_submit_cmd() is going to
> > work for other fabrics drivers, but it appears to work with usb-gadget.
> > (Sebastian..?)
> 
> For UAS the status/ sense URB is sent right away (without any data) and
> this worked the last time I tested.

, thanks for the re-clarification on this..  ;)

> For BOT things are a little complicated. What I do is to send empty
> data (to fill the data pipe) and send a bad status so the other side
> learns that the transfer failed somehow. The sense information is lost.
> What I should do is to queue this sense data for the next MODE_SENSE
> request which will be coming soon. Right now WinXP repeats a failed
> command thrice if MODE_SENSE returns no error. "This" is on my to-fix
> list…
> 

Ok, so I think the special case for usb-gadget where is able to call
transport_send_check_condition_and_sense() should be OK for the single
target_get_sess_cmd() -> target_submit_cmd() failure case..

This is not going to be safe for just about every other fabric AFAICT,
so I think we need a comment here to describe that fact..

> > So for the moment, lets fix the double se_cmd init and make things a
> > little easier to read..   Sebastian, please have a look and double check
> > that the (dir<  0) + target_submit_cmd() failures cases are both going
> > to work as expected whilst sending a CHECK_CONDITION response.
> 
> The sense code should only be sent once and
> transport_send_check_condition_and_sense() sets
> SCF_SENT_CHECK_CONDITION and checks for it so doing it twice should not
> do any harm right? But grep does not say when this flag gets removed.
> 

Correct.  The SCF_SENT_CHECK_CONDITION bit does not get removed ahead of
fabric descriptor release.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: scsi_lib: fix scsi_io_completion's SG_IO error propagation

2012-07-18 Thread Mike Snitzer

Hi James,

Seems this one slipped through the cracks.  Please review/apply.

Thanks,
Mike

On Thu, May 31 2012 at  4:33pm -0400,
Moger, Babu  wrote:

> Thanks Mike. Looks good.
> 
> > -Original Message-
> > From: Mike Snitzer [mailto:snit...@redhat.com]
> > Sent: Thursday, May 31, 2012 3:24 PM
> > To: Paolo Bonzini
> > Cc: linux-scsi@vger.kernel.org; Moger, Babu; h...@suse.de;
> > micha...@cs.wisc.edu; james.bottom...@hansenpartnership.com;
> > sta...@vger.kernel.org.#.3.4
> > Subject: Re: scsi_lib: fix scsi_io_completion's SG_IO error propagation
> > 
> > On Thu, May 31 2012 at  3:51pm -0400,
> > Paolo Bonzini  wrote:
> > 
> > > Il 31/05/2012 21:05, Mike Snitzer ha scritto:
> > > > The following v3.4-rc1 commit unmasked an existing bug in
> > > > scsi_io_completion's SG_IO error handling:
> > > > 47ac56d [SCSI] scsi_error: classify some ILLEGAL_REQUEST sense as a
> > permanent TARGET_ERROR
> > > >
> > > > Given that certain ILLEGAL_REQUEST are now properly categorized as
> > > > TARGET_ERROR the host_byte is being set (before host_byte wasn't ever
> > > > set for these ILLEGAL_REQUEST).
> > > >
> > > > In scsi_io_completion, initialize req->errors with cmd->result _after_
> > > > the SG_IO block that calls __scsi_error_from_host_byte (which may
> > > > modify the host_byte).
> > > >
> > > > Before this fix:
> > > >
> > > > cdb to send: 12 01 01 00 00 00
> > > > ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[12, 01, 01, 00, 00, 00],
> > > > mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=2, flags=0,
> > > > status=02, masked_status=01, sb[19]=[70, 00, 05, 00, 00, 00, 00, 0b,
> > > > 00, 00, 00, 00, 24, 00, 00, 00, 00, 00, 00], host_status=0x10,
> > > > driver_status=0x8, resid=0, duration=0, info=0x1}) = 0
> > > > SCSI Status: Check Condition
> > > >
> > > > Sense Information:
> > > > sense buffer empty
> > > >
> > > > After:
> > > >
> > > > cdb to send: 12 01 01 00 00 00
> > > > ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[12, 01, 01, 00, 00, 00],
> > > > mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=2, flags=0,
> > > > status=02, masked_status=01, sb[19]=[70, 00, 05, 00, 00, 00, 00, 0b,
> > > > 00, 00, 00, 00, 24, 00, 00, 00, 00, 00, 00], host_status=0,
> > > > driver_status=0x8, resid=0, duration=0, info=0x1}) = 0
> > > > SCSI Status: Check Condition
> > > >
> > > > Sense Information:
> > > >  Fixed format, current;  Sense key: Illegal Request
> > > >  Additional sense: Invalid field in cdb
> > > >  Raw sense data (in hex):
> > > > 70 00 05 00 00 00 00 0b  00 00 00 00 24 00 00 00
> > > > 00 00 00
> > > >
> > > > Signed-off-by: Mike Snitzer 
> > > > Reviewed-by: Babu Moger 
> > >
> > > Reported-by: Paolo Bonzini 
> > > Tested-by: Paolo Bonzini 
> > 
> > Yes, thanks for backfilling the Reported-by and Tested-by Paolo!
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Nicholas A. Bellinger

On Wed, 2012-07-18 at 08:42 -0500, Anthony Liguori wrote:
> On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:
> > On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
> >> On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> >>> On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote:
> >



> >
> >> But I do think the kernel should carefully consider whether it wants to 
> >> support
> >> an interface like this.  This an extremely complicated ABI with a lot of 
> >> subtle
> >> details around state and compatibility.
> >>
> >> Are you absolutely confident that you can support a userspace application 
> >> that
> >> expects to get exactly the same response from all possible commands in 20 
> >> kernel
> >> versions from now?  Virtualization requires absolutely precise 
> >> compatibility in
> >> terms of bugs and features.  This is probably not something the TCM stack 
> >> has
> >> had to consider yet.
> >>
> >
> > We most certainly have thought about long term userspace compatibility
> > with TCM.  Our userspace code (that's now available in all major
> > distros) is completely forward-compatible with new fabric modules such
> > as tcm_vhost.  No update required.
> 
> I'm not sure we're talking about the same thing when we say compatibility.
> 
> I'm not talking about the API.  I'm talking about the behavior of the 
> commands 
> that tcm_vhost supports.
> 

OK, I understand what your getting at now..

> If you add support for a new command, you need to provide userspace a way to 
> disable this command.  If you change what gets reported for VPD, you need to 
> provide userspace a way to make VPD look like what it did in a previous 
> version.
> 
> Basically, you need to be able to make a TCM device behave 100% the same as 
> it 
> did in an older version of the kernel.
> 
> This is unique to virtualization due to live migration.  If you migrate from 
> a 
> 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM 
> device behaves exactly like the 3.6 kernel because the guest that is 
> interacting 
> with it does not realize that live migration happened.
> 
> Yes, you can add knobs via configfs to control this behavior, but I think the 
> question is, what's the plan for this?
> 

So we already allow for some types of CDBs emulation to be toggled via
backend device attributes:

root@tifa:/usr/src/target-pending.git# tree 
/sys/kernel/config/target/core/iblock_2/fioa/attrib/
/sys/kernel/config/target/core/iblock_2/fioa/attrib/
├── block_size
├── emulate_dpo
├── emulate_fua_read
├── emulate_fua_write
├── emulate_rest_reord
├── emulate_tas
├── emulate_tpu
├── emulate_tpws
├── emulate_ua_intlck_ctrl
├── emulate_write_cache
├── enforce_pr_isids



Some things like SPC-3 persistent reservations + implict/explict ALUA
multipath currently can't be disabled, but adding two more backend
attributes to disable/enable this logic individual is easy enough to do.

So that said, I don't have a problem with adding the necessary device
attributes to limit what type of CDBs a backend device is capable of
processing.

Trying to limiting this per-guest (instead of per-device) is where
things get a little more tricky..

> BTW, I think this is a good thing to cover in 
> Documentation/vhost/tcm_vhost.txt. 
>   I think that's probably the only change that's needed here.
> 

Sure, but I'll need to know what else that you'd like to optionally
restrict it terms of CDB processing that's not already there..

Thanks for your feedback!

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Bug 44771] [REGRESSION] a7a20d103994fd760766e6c9d494daa569cbfe06 makes kernel 3.5 unbootable on an Intel chipset based motherboard

2012-07-18 Thread Dan Williams

On Wed, Jul 18, 2012 at 1:25 PM, Alan Stern  wrote:
> On Tue, 17 Jul 2012, James Bottomley wrote:
>
>> On Tue, 2012-07-17 at 12:51 -0700, Linus Torvalds wrote:
>> > Wrong people cc'd, it looks like.
>> >
>> > Guys, commit a7a20d103994 ("sd: limit the scope of the async probe
>> > domain") is causing boot problems. It's timing-dependent and
>> > apparently sometimes works, which makes sense with that commit.
>> >
>> > However, it *should* have been fixed by commit 43a8d39d0137 ("fix
>> > async probe regression"), but Artem seems to report the problem even
>> > in -rc7.
>> >
>> > Comments?
>>
>> As far as I can tell, the fix should have worked.  However, there are a
>> lot of assumptions in the async stuff that end up not being true in the
>> presence of separate async domains.   We should be fixing it all here:
>>
>> http://git.kernel.org/?p=linux/kernel/git/jejb/scsi.git;a=shortlog;h=refs/heads/async
>
> Those commits haven't been merged into 3.5-rc7, right?  So Artem isn't
> using them.
>
>> I've got to say, I don't understand the bug report.  all of those
>> commits were about probing for devices.  However, the screen shot
>>
>> https://bugzilla.kernel.org/attachment.cgi?id=75351
>>
>> shows the devices were found, it's the partition tables that weren't.
>> For us to see the message about sda's capacity, we're already in the
>> async code the commits were trying to synchronise with.
>
> That's understandable, given that those commits aren't present in
> Artem's kernel.  Are they queued for -stable?

I did not flag them as such given the wider entanglements.

>
>> However, there are some missing messages: there's no partition table
>> print and no final
>>
>>   sd_printk(KERN_NOTICE, sdkp, "Attached SCSI %sdisk\n",
>> sdp->removable ? "removable " : "");
>>
>> So sd_probe_async() got stuck somewhere after the first
>> sd_revalidata_disk().
>
> No, it didn't get stuck.  The problem is that the scsi_wait_scan module
> didn't wait for the async scanning to finish.
>
> And the reason for that is one you're already familiar with:
> CONFIG_SCSI_MOD=y.

I had forgotten that additional aspect. but I don't think that is the
root cause.  scsi_wait_scan() is a nop in Artem's config given
CONFIG_SCSI_WAIT_SCAN=m.  From what I have seen ahci typically ensures
that wait_for_device_probe() is all that is needed to guarantee
scanning is complete which commit a7a20d103994 ("sd: limit the scope
of the async probe domain") breaks, and which commit 43a8d39d0137
("[SCSI] fix async probe regression"), does not resolve because that
requires a call to scsi_complete_async_scans() to close the loop.

> Artem, if you change all your SCSI drivers to be modular rather than
> built-in, that ought to fix the problem.  Alternatively, you can simply
> continue to use the "rootwait" option.

I think setting CONFIG_SCSI_WAIT_SCAN=y would also be a workaround, or
going forward with the pending async rework to make sd probe work once
again visible to async_synchronize_domain_full().

--
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [MINI SUMMIT] SCSI core performance

2012-07-18 Thread Nicholas A. Bellinger

On Wed, 2012-07-18 at 09:00 +0100, James Bottomley wrote:
> On Tue, 2012-07-17 at 19:39 -0700, Nicholas A. Bellinger wrote:
> > Hi KS-PCs,
> > 
> > I'd like to propose a SCSI performance mini-summit to see how interested
> > folks are in helping address the long-term issues that SCSI core is
> > currently facing wrt to multi-lun per host and heavy small block random
> > I/O workloads.
> > 
> > I know this would probably be better suited for LSF (for the record it
> > was proposed this year) but now that we've acknowledge there is a
> > problem with SCSI LLDs vs. raw block drivers vs. other SCSI subsystems,
> > it would be useful to get the storage folks into a single room at some
> > point during KS/LPC to figure out what is actually going on with SCSI
> > core.
> 
> You seem to have a short memory:  The last time it was discussed
> 
> http://marc.info/?t=13415537393
> 
> It rapidly became apparent there isn't a problem.  Enabling high IOPS in
> the SCSI stack is what I think you mean.
> 

small block random I/O == performance, that is correct.

The host-lock-less stuff is doing better these days for small-ish
multi-lun setups with large block sequential I/O workloads.

Doing ~1 GB/sec per LUN is achievable with multi-lun per host (say up to
6-8 LUNs dependent on your setup) using PCI-e Gen3 hardware.

> > As mentioned in the recent tcm_vhost thread, there are a number of cases
> > where drivers/target/ code can demonstrate this limitation pretty
> > vividly now.
> > 
> > This includes the following scenarios using raw block flash export with
> > target_core_mod + target_core_iblock export and the same small block
> > (4k) mixed random I/O workload with fio:
> > 
> > *) tcm_loop local SCSI LLD performance is an order of magnitude slower 
> >than the same local raw block flash backend.
> > *) tcm_qla2xxx performs better using MSFT Server hosts than Linux v3.x
> >based hosts using 2x socket Nehalem hardware w/ PCI-e Gen2 HBAs
> > *) ib_srpt performs better using MSFT Server host than RHEL 6.x .32 
> >based hosts using 2x socket Romley hardware w/ PCI-e Gen3 HCAs
> > *) Raw block IBLOCK export into KVM guest v3.5-rc w/ virtio-scsi is 
> >behind in performance vs. raw local block flash.  (cmwq on the host 
> >is helping here, but still need to with MSFT SCSI mini-port)
> > 
> > Also, with 1M IOPs into a single VM guest now being done by other non
> > Linux based hypervisors, the virtualization bit for high performance KVM
> > SCSI based storage is quickly coming on..
> > 
> > So all of that said, I'd like to at least have a discussion with the key
> > SCSI + block folks who will be present in San Diego on path forward to
> > address these without having to wait until LSF-2013 + hope for a topic
> > slot to materialize then.
> > 
> > Thank you for your consideration,
> 
> Well, your proposal is devoid of an actual proposal.
> 

Huh..?  It's a proposal for a discussion to (hopefully) identify the
main culprit(s) and figure out an incremental way forward.

Due to the fact that 1M IOPs machines aren't quite the norm (yet), the
idea is to get storage folks in the same room who do have access to 1M
IOPs systems + have an interest in making SCSI core go faster for random
small block I/O workloads.

This can be vendors / LLD maintainers who've run into similar
limitations with SCSI core, or folks who have an interest in KVM guest
SCSI performance.

> Enabling high IOPS involves reducing locking overhead and path length
> through the code.  I think most of the low hanging fruit in this area is
> already picked, but if you have an idea, please say.  There might be
> something we can extract from the lockless queue work Jens is doing, but
> we need that to materialise first.
> 

Would really like to hear from Jens here, but I don't know how much time
he is spending on the SCSI layer these days..

I've been more interested recently in working on a fabric that can
demonstrate this bottleneck with raw block flash into KVM guest <->
virtio-scsi, as I think it's a important vehicle for short-term
diagnosis.

> Without a concrete thing to discuss, shooting the breeze on high IOPS in
> the SCSI stack is about as useful as discussing what happened in last
> night's episode of Coronation Street which, when it happens in my house,
> always helps me see how incredibly urgent fixing the leaky tap I've been
> putting off for months actually is.
> 

Sorry, I've never heard of that show.  

> If someone can come up with a proposal ... or even perhaps another path
> trace showing where the reducible overhead and lock problems are we can
> discuss it on the list and we might have a real topic by the time LSF
> rolls around.
> 

So identifying root culprit(s) is still a WIP at this point.

In the next weeks I'll be back spending time back on 1M IOPs machines
with raw block flash + qla2xxx/srpt/vhost + Linux/MSFT SCSI clients, and
should be getting some more data-points by then.

Anyways, if it ends up taking until

Re: [RFC-v3 3/4] vhost: Add vhost_scsi specific defines

2012-07-18 Thread Michael S. Tsirkin

On Wed, Jul 18, 2012 at 02:17:05PM -0700, Nicholas A. Bellinger wrote:
> On Wed, 2012-07-18 at 16:05 +0300, Michael S. Tsirkin wrote:
> > On Wed, Jul 18, 2012 at 12:59:31AM +, Nicholas A. Bellinger wrote:
> > > From: Nicholas Bellinger 
> > > 
> > > This patch adds the initial vhost_scsi_ioctl() callers for 
> > > VHOST_SCSI_SET_ENDPOINT
> > > and VHOST_SCSI_CLEAR_ENDPOINT respectively, and also adds struct 
> > > vhost_vring_target
> > > that is used by tcm_vhost code when locating target ports during qemu 
> > > setup.
> > > 
> > > Signed-off-by: Stefan Hajnoczi 
> > > Cc: Zhi Yong Wu 
> > > Cc: Michael S. Tsirkin 
> > > Cc: Paolo Bonzini ,
> > > Signed-off-by: Nicholas A. Bellinger 
> > > ---
> > >  include/linux/vhost.h |9 +
> > >  1 files changed, 9 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/include/linux/vhost.h b/include/linux/vhost.h
> > > index e847f1e..33b313b 100644
> > > --- a/include/linux/vhost.h
> > > +++ b/include/linux/vhost.h
> > > @@ -24,7 +24,11 @@ struct vhost_vring_state {
> > >  struct vhost_vring_file {
> > >   unsigned int index;
> > >   int fd; /* Pass -1 to unbind from file. */
> > > +};
> > >  
> > > +struct vhost_vring_target {
> > 
> > Can this be renamed vhost_scsi_target?
> 
> Done
> 
> > 
> > > + unsigned char vhost_wwpn[224];
> > 
> > 224? I am guessing ISCSI_NAME_LEN from include/scsi/iscsi_proto.h?
> > Unfortunately we can't include iscsi_proto.h here as it
> > is not exported to users. But let's add a comment for now.
> > 
> 
> This is actually from target/target_core_base.h:TRANSPORT_IQN_LEN.
> 
> Fixing this up now..
> 
> > > + unsigned short vhost_tpgt;
> > >  };
> > >  
> > >  struct vhost_vring_addr {
> > > @@ -121,6 +125,11 @@ struct vhost_memory {
> > >   * device.  This can be used to stop the ring (e.g. for migration). */
> > >  #define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct 
> > > vhost_vring_file)
> > >  
> > > +/* VHOST_SCSI specific defines */
> > > +
> > > +#define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct 
> > > vhost_vring_target)
> > > +#define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct 
> > > vhost_vring_target)
> > > +
> > >  /* Feature bits */
> > >  /* Log all write descriptors. Can be changed while device is active. */
> > 
> > Can these go into appropriate ifdef CONFIG_TCP_VHOST please?
> > 
> 
> M, I don't think we can do that with CONFIG_TCM_VHOST=m, or at least
> not with the following patch
> 
> diff --git a/include/linux/vhost.h b/include/linux/vhost.h
> index 33b313b..e4b1ee3 100644
> --- a/include/linux/vhost.h
> +++ b/include/linux/vhost.h
> @@ -125,10 +126,14 @@ struct vhost_memory {
>   * device.  This can be used to stop the ring (e.g. for migration). */
>  #define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct 
> vhost_vring_file)
>  
> +#ifdef CONFIG_TCM_VHOST
> +
>  /* VHOST_SCSI specific defines */
>  
> -#define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct 
> vhost_vring_target)
> -#define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct 
> vhost_vring_target)
> +# define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct 
> vhost_scsi_target)
> +# define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct 
> vhost_scsi_target)
> +
> +#endif
> 
> --
> 
> drivers/vhost/tcm/tcm_vhost.c: In function ‘vhost_scsi_ioctl’:
> drivers/vhost/tcm/tcm_vhost.c:970: error: ‘VHOST_SCSI_SET_ENDPOINT’ 
> undeclared (first use in this function)
> drivers/vhost/tcm/tcm_vhost.c:970: error: (Each undeclared identifier is 
> reported only once
> drivers/vhost/tcm/tcm_vhost.c:970: error: for each function it appears in.)
> drivers/vhost/tcm/tcm_vhost.c:975: error: ‘VHOST_SCSI_CLEAR_ENDPOINT’ 
> undeclared (first use in this function)
> make[3]: *** [drivers/vhost/tcm/tcm_vhost.o] Error 1


Maybe ifdefs only work for booleans? If yes you can probably add a boolean and 
select it?
What I want to prevent is exposing tcm stuff in the header if it is
configured off. I'll play with it tomorrow.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v3 4/4] tcm_vhost: Initial merge for vhost level target fabric driver

2012-07-18 Thread Nicholas A. Bellinger

On Wed, 2012-07-18 at 19:09 +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 18, 2012 at 12:59:32AM +, Nicholas A. Bellinger wrote:



> > 
> > Changelog v2 -> v3:
> > 
> >   Unlock on error in tcm_vhost_drop_nexus() (DanC)
> >   Fix strlen() doesn't count the terminator (DanC)
> >   Call kfree() on an error path (DanC)
> >   Convert tcm_vhost_write_pending to use target_execute_cmd (hch + nab)
> >   Fix another strlen() off by one in tcm_vhost_make_tport (DanC)
> >   Add option under drivers/staging/Kconfig, and move to drivers/vhost/tcm/
> >   as requested by MST (nab)
> > 
> > ---
> >  drivers/staging/Kconfig   |2 +
> >  drivers/vhost/Makefile|2 +
> >  drivers/vhost/tcm/Kconfig |6 +
> >  drivers/vhost/tcm/Makefile|1 +
> >  drivers/vhost/tcm/tcm_vhost.c | 1611 
> > +
> >  drivers/vhost/tcm/tcm_vhost.h |   74 ++
> >  6 files changed, 1696 insertions(+), 0 deletions(-)
> >  create mode 100644 drivers/vhost/tcm/Kconfig
> >  create mode 100644 drivers/vhost/tcm/Makefile
> >  create mode 100644 drivers/vhost/tcm/tcm_vhost.c
> >  create mode 100644 drivers/vhost/tcm/tcm_vhost.h
> > 
> 
> Really sorry about making you run around like that,
> I did not mean moving all of tcm to a directory,
> just adding tcm/Kconfig or adding drivers/vhost/Kconfig.tcm
> because eventually it's easier to keep it all together
> in one place.
> 

Er, apologies for the slight mis-understanding here..  Moving back now +
fixing up the Kbuild bits.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v3 3/4] vhost: Add vhost_scsi specific defines

2012-07-18 Thread Nicholas A. Bellinger

On Wed, 2012-07-18 at 16:05 +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 18, 2012 at 12:59:31AM +, Nicholas A. Bellinger wrote:
> > From: Nicholas Bellinger 
> > 
> > This patch adds the initial vhost_scsi_ioctl() callers for 
> > VHOST_SCSI_SET_ENDPOINT
> > and VHOST_SCSI_CLEAR_ENDPOINT respectively, and also adds struct 
> > vhost_vring_target
> > that is used by tcm_vhost code when locating target ports during qemu setup.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > Cc: Zhi Yong Wu 
> > Cc: Michael S. Tsirkin 
> > Cc: Paolo Bonzini ,
> > Signed-off-by: Nicholas A. Bellinger 
> > ---
> >  include/linux/vhost.h |9 +
> >  1 files changed, 9 insertions(+), 0 deletions(-)
> > 
> > diff --git a/include/linux/vhost.h b/include/linux/vhost.h
> > index e847f1e..33b313b 100644
> > --- a/include/linux/vhost.h
> > +++ b/include/linux/vhost.h
> > @@ -24,7 +24,11 @@ struct vhost_vring_state {
> >  struct vhost_vring_file {
> > unsigned int index;
> > int fd; /* Pass -1 to unbind from file. */
> > +};
> >  
> > +struct vhost_vring_target {
> 
> Can this be renamed vhost_scsi_target?

Done

> 
> > +   unsigned char vhost_wwpn[224];
> 
> 224? I am guessing ISCSI_NAME_LEN from include/scsi/iscsi_proto.h?
> Unfortunately we can't include iscsi_proto.h here as it
> is not exported to users. But let's add a comment for now.
> 

This is actually from target/target_core_base.h:TRANSPORT_IQN_LEN.

Fixing this up now..

> > +   unsigned short vhost_tpgt;
> >  };
> >  
> >  struct vhost_vring_addr {
> > @@ -121,6 +125,11 @@ struct vhost_memory {
> >   * device.  This can be used to stop the ring (e.g. for migration). */
> >  #define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct 
> > vhost_vring_file)
> >  
> > +/* VHOST_SCSI specific defines */
> > +
> > +#define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct 
> > vhost_vring_target)
> > +#define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct 
> > vhost_vring_target)
> > +
> >  /* Feature bits */
> >  /* Log all write descriptors. Can be changed while device is active. */
> 
> Can these go into appropriate ifdef CONFIG_TCP_VHOST please?
> 

M, I don't think we can do that with CONFIG_TCM_VHOST=m, or at least
not with the following patch

diff --git a/include/linux/vhost.h b/include/linux/vhost.h
index 33b313b..e4b1ee3 100644
--- a/include/linux/vhost.h
+++ b/include/linux/vhost.h
@@ -125,10 +126,14 @@ struct vhost_memory {
  * device.  This can be used to stop the ring (e.g. for migration). */
 #define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
 
+#ifdef CONFIG_TCM_VHOST
+
 /* VHOST_SCSI specific defines */
 
-#define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct 
vhost_vring_target)
-#define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct 
vhost_vring_target)
+# define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct 
vhost_scsi_target)
+# define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct 
vhost_scsi_target)
+
+#endif

--

drivers/vhost/tcm/tcm_vhost.c: In function ‘vhost_scsi_ioctl’:
drivers/vhost/tcm/tcm_vhost.c:970: error: ‘VHOST_SCSI_SET_ENDPOINT’ undeclared 
(first use in this function)
drivers/vhost/tcm/tcm_vhost.c:970: error: (Each undeclared identifier is 
reported only once
drivers/vhost/tcm/tcm_vhost.c:970: error: for each function it appears in.)
drivers/vhost/tcm/tcm_vhost.c:975: error: ‘VHOST_SCSI_CLEAR_ENDPOINT’ 
undeclared (first use in this function)
make[3]: *** [drivers/vhost/tcm/tcm_vhost.o] Error 1


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Bug 44771] [REGRESSION] a7a20d103994fd760766e6c9d494daa569cbfe06 makes kernel 3.5 unbootable on an Intel chipset based motherboard

2012-07-18 Thread Alan Stern

On Tue, 17 Jul 2012, James Bottomley wrote:

> On Tue, 2012-07-17 at 12:51 -0700, Linus Torvalds wrote:
> > Wrong people cc'd, it looks like.
> > 
> > Guys, commit a7a20d103994 ("sd: limit the scope of the async probe
> > domain") is causing boot problems. It's timing-dependent and
> > apparently sometimes works, which makes sense with that commit.
> > 
> > However, it *should* have been fixed by commit 43a8d39d0137 ("fix
> > async probe regression"), but Artem seems to report the problem even
> > in -rc7.
> > 
> > Comments?
> 
> As far as I can tell, the fix should have worked.  However, there are a
> lot of assumptions in the async stuff that end up not being true in the
> presence of separate async domains.   We should be fixing it all here:
> 
> http://git.kernel.org/?p=linux/kernel/git/jejb/scsi.git;a=shortlog;h=refs/heads/async

Those commits haven't been merged into 3.5-rc7, right?  So Artem isn't 
using them.

> I've got to say, I don't understand the bug report.  all of those
> commits were about probing for devices.  However, the screen shot
> 
> https://bugzilla.kernel.org/attachment.cgi?id=75351
> 
> shows the devices were found, it's the partition tables that weren't.
> For us to see the message about sda's capacity, we're already in the
> async code the commits were trying to synchronise with.

That's understandable, given that those commits aren't present in
Artem's kernel.  Are they queued for -stable?

> However, there are some missing messages: there's no partition table
> print and no final 
> 
>   sd_printk(KERN_NOTICE, sdkp, "Attached SCSI %sdisk\n",
> sdp->removable ? "removable " : "");
> 
> So sd_probe_async() got stuck somewhere after the first
> sd_revalidata_disk().

No, it didn't get stuck.  The problem is that the scsi_wait_scan module 
didn't wait for the async scanning to finish.

And the reason for that is one you're already familiar with: 
CONFIG_SCSI_MOD=y.

Artem, if you change all your SCSI drivers to be modular rather than 
built-in, that ought to fix the problem.  Alternatively, you can simply 
continue to use the "rootwait" option.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Rustad, Mark D

On Jul 18, 2012, at 10:17 AM, Michael S. Tsirkin wrote:



>> You do have to pay very close attention to some things however. Don't change 
>> the device identity in any way - even version information, otherwise a 
>> Windows initiator will blue-screen. I made that mistake myself, so I 
>> remember it well. It seemed like such an innocent change. I don't recall 
>> there being any issue with adding commands and we did do that on occasion.
> 
> How about removing commands?

Good question. With the storage system I am familiar with, that would only be a 
risk if firmware were downgraded. Downgrading would never have been 
recommended. I am sure that if something like persistent reserve suddenly went 
away it would cause big trouble for some initiators.

-- 
Mark Rustad, LAN Access Division, Intel Corporation

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Anthony Liguori


On 07/18/2012 11:47 AM, James Bottomley wrote:

On Wed, 2012-07-18 at 11:00 -0500, Anthony Liguori wrote:

Of course: Think about the consequences: you want to upgrade one array
on your SAN.  You definitely don't want to shut down your entire data
centre to achieve it.  In place upgrades on running SANs have been
common in enterprise environments for a while.


Would firmware upgrades ever result in major OS visible changes though?

Maybe OSes are more robust with SCSI than with other types of buses, but I don't 
think it's safe to completely ignore the problem.



I agree that in general, SCSI targets don't need this, but I'm pretty sure that
if a guest probes for a command, you migrate to an old version, and that command
is no longer there, badness will ensue.


What command are we talking about?  Operation of initiators is usually
just READ and WRITE.  So perhaps we might have inline UNMAP ... but the
world wouldn't come to an end even if the latter stopped working.


Is that true for all OSes?  Linux may handle things gracefully if UNMAP starts 
throwing errors but that doesn't mean that Windows will.


There are other cases where this creates problems.  Windows (and some other 
OSes) fingerprint the hardware profile in order to do license enforcement.  If 
the hardware changes beyond a certain amount, then they refuse to boot.


Windows does this with a points system and I do believe that INQUIRY responses 
from any local disks are included in this tally.



Most of the complex SCSI stuff is done at start of day; it's actually
only then we'd notice things like changes in INQUIRY strings or mode
pages.

Failover, which is what you're talking about, requires reinstatement of
all the operating parameters of the source/target system, but that's not
wholly the responsibility of the storage system ...


It's the responsibility of the hypervisor when dealing with live migration.

Regards,

Anthony Liguori



James


It's different when you're talking about a reboot happening or a
disconnect/reconnect due to firmware upgrade.  The OS would naturally be
reprobing in this case.






--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Michael S. Tsirkin

On Wed, Jul 18, 2012 at 04:42:33PM +, Rustad, Mark D wrote:
> On Jul 18, 2012, at 9:00 AM, Michael S. Tsirkin wrote:
> 
> > On Wed, Jul 18, 2012 at 11:53:38AM -0400, Christoph Hellwig wrote:
> >> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> >>> 
> >>> If you add support for a new command, you need to provide userspace
> >>> a way to disable this command.  If you change what gets reported for
> >>> VPD, you need to provide userspace a way to make VPD look like what
> >>> it did in a previous version.
> >>> 
> >>> Basically, you need to be able to make a TCM device behave 100% the
> >>> same as it did in an older version of the kernel.
> >>> 
> >>> This is unique to virtualization due to live migration.  If you
> >>> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> >>> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> >>> because the guest that is interacting with it does not realize that
> >>> live migration happened.
> >> 
> >> I don't think these strict live migration rules apply to SCSI targets.
> >> 
> >> Real life storage systems get new features and different behaviour with
> >> firmware upgrades all the time, and SCSI initiators deal with that just
> >> fine.
> >> I don't see any reason to be more picky just because we're
> >> virtualized.
> > 
> > Presumably initiators are shut down for target firmware upgrades?
> > With virtualization your host can change without guest shutdown.
> > You can also *lose* commands when migrating to an older host.
> 
> 
> Actually no. Storage vendors do not want to impose a need to take initiators 
> down for any reason. I have worked for a storage system vendor that routinely 
> did firmware upgrades on-the-fly. This is done by multi-pathing and taking 
> one path down, upgrade, bring up, repeat.

With live migration even that does not happen.

> There was even one non-redundant system that I am aware of that could upgrade 
> firmware and reboot fast enough that the initiators would not notice.
> 
> You do have to pay very close attention to some things however. Don't change 
> the device identity in any way - even version information, otherwise a 
> Windows initiator will blue-screen. I made that mistake myself, so I remember 
> it well. It seemed like such an innocent change. I don't recall there being 
> any issue with adding commands and we did do that on occasion.

How about removing commands?

> -- 
> Mark Rustad, LAN Access Division, Intel Corporation
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] scsi/ibmvscsi: /sys/class/scsi_host/hostX/config doesn't show any information

2012-07-18 Thread olaf

From: Linda Xie 

Expected result:
It should show something like this:
x1521p4:~ # cat /sys/class/scsi_host/host1/config
PARTITIONNAME='x1521p4'
NWSDNAME='X1521P4'
HOSTNAME='X1521P4'
DOMAINNAME='RCHLAND.IBM.COM'
NAMESERVERS='9.10.244.100 9.10.244.200'

Actual result:
x1521p4:~ # cat /sys/class/scsi_host/host0/config
x1521p4:~ #

This patch changes the size of the buffer used for transfering config
data to 4K. It was tested against 2.6.19-rc2 tree.

Reported by IBM during SLES11 beta testing:

https://bugzilla.novell.com/show_bug.cgi?id=439970
LTC49349

Signed-off-by: Olaf Hering 

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index e580aa4..1513ca8 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -93,6 +93,8 @@ static int max_requests = IBMVSCSI_MAX_REQUESTS_DEFAULT;
 static int max_events = IBMVSCSI_MAX_REQUESTS_DEFAULT + 2;
 static int fast_fail = 1;
 static int client_reserve = 1;
+/* host data buffer size */
+#define HOST_BUFFER_SIZE 4096
 
 static struct scsi_transport_template *ibmvscsi_transport_template;
 
@@ -1666,7 +1668,7 @@ static ssize_t show_host_srp_version(struct device *dev,
struct ibmvscsi_host_data *hostdata = shost_priv(shost);
int len;
 
-   len = snprintf(buf, PAGE_SIZE, "%s\n",
+   len = snprintf(buf, HOST_BUFFER_SIZE, "%s\n",
   hostdata->madapter_info.srp_version);
return len;
 }
@@ -1687,7 +1689,7 @@ static ssize_t show_host_partition_name(struct device 
*dev,
struct ibmvscsi_host_data *hostdata = shost_priv(shost);
int len;
 
-   len = snprintf(buf, PAGE_SIZE, "%s\n",
+   len = snprintf(buf, HOST_BUFFER_SIZE, "%s\n",
   hostdata->madapter_info.partition_name);
return len;
 }
@@ -1708,7 +1710,7 @@ static ssize_t show_host_partition_number(struct device 
*dev,
struct ibmvscsi_host_data *hostdata = shost_priv(shost);
int len;
 
-   len = snprintf(buf, PAGE_SIZE, "%d\n",
+   len = snprintf(buf, HOST_BUFFER_SIZE, "%d\n",
   hostdata->madapter_info.partition_number);
return len;
 }
@@ -1728,7 +1730,7 @@ static ssize_t show_host_mad_version(struct device *dev,
struct ibmvscsi_host_data *hostdata = shost_priv(shost);
int len;
 
-   len = snprintf(buf, PAGE_SIZE, "%d\n",
+   len = snprintf(buf, HOST_BUFFER_SIZE, "%d\n",
   hostdata->madapter_info.mad_version);
return len;
 }
@@ -1748,7 +1750,7 @@ static ssize_t show_host_os_type(struct device *dev,
struct ibmvscsi_host_data *hostdata = shost_priv(shost);
int len;
 
-   len = snprintf(buf, PAGE_SIZE, "%d\n", hostdata->madapter_info.os_type);
+   len = snprintf(buf, HOST_BUFFER_SIZE, "%d\n", 
hostdata->madapter_info.os_type);
return len;
 }
 
@@ -1767,7 +1769,7 @@ static ssize_t show_host_config(struct device *dev,
struct ibmvscsi_host_data *hostdata = shost_priv(shost);
 
/* returns null-terminated host config data */
-   if (ibmvscsi_do_host_config(hostdata, buf, PAGE_SIZE) == 0)
+   if (ibmvscsi_do_host_config(hostdata, buf, HOST_BUFFER_SIZE) == 0)
return strlen(buf);
else
return 0;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] scsi/ibmvscsi: add module alias for ibmvscsic

2012-07-18 Thread olaf

From: Olaf Hering 

The driver is named ibmvscsic, at runtime it its name is advertised as
ibmvscsi. For this reason mkinitrd wont pickup the driver properly.
Reported by IBM during SLES11 beta testing:

https://bugzilla.novell.com/show_bug.cgi?id=459933
LTC50724

Signed-off-by: Olaf Hering 

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 3a6c474..e580aa4 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -100,6 +100,9 @@ static struct scsi_transport_template 
*ibmvscsi_transport_template;
 
 static struct ibmvscsi_ops *ibmvscsi_ops;
 
+/* The driver is named ibmvscsic, map ibmvscsi to module name */
+#define IBMVSCSI_NAME "ibmvscsi"
+MODULE_ALIAS(IBMVSCSI_NAME);
 MODULE_DESCRIPTION("IBM Virtual SCSI");
 MODULE_AUTHOR("Dave Boutcher");
 MODULE_LICENSE("GPL");
@@ -1796,7 +1799,7 @@ static struct device_attribute *ibmvscsi_attrs[] = {
 static struct scsi_host_template driver_template = {
.module = THIS_MODULE,
.name = "IBM POWER Virtual SCSI Adapter " IBMVSCSI_VERSION,
-   .proc_name = "ibmvscsi",
+   .proc_name = IBMVSCSI_NAME,
.queuecommand = ibmvscsi_queuecommand,
.eh_abort_handler = ibmvscsi_eh_abort_handler,
.eh_device_reset_handler = ibmvscsi_eh_device_reset_handler,
@@ -1936,7 +1939,7 @@ static int ibmvscsi_probe(struct vio_dev *vdev, const 
struct vio_device_id *id)
}
 
hostdata->work_thread = kthread_run(ibmvscsi_work, hostdata, "%s_%d",
-   "ibmvscsi", host->host_no);
+   IBMVSCSI_NAME, host->host_no);
 
if (IS_ERR(hostdata->work_thread)) {
dev_err(&vdev->dev, "couldn't initialize kthread. rc=%ld\n",
@@ -2061,7 +2064,7 @@ static struct vio_driver ibmvscsi_driver = {
.probe = ibmvscsi_probe,
.remove = ibmvscsi_remove,
.get_desired_dma = ibmvscsi_get_desired_dma,
-   .name = "ibmvscsi",
+   .name = IBMVSCSI_NAME,
.pm = &ibmvscsi_pm_ops,
 };
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread James Bottomley

On Wed, 2012-07-18 at 11:00 -0500, Anthony Liguori wrote:
> On 07/18/2012 10:53 AM, Christoph Hellwig wrote:
> > On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> >>
> >> If you add support for a new command, you need to provide userspace
> >> a way to disable this command.  If you change what gets reported for
> >> VPD, you need to provide userspace a way to make VPD look like what
> >> it did in a previous version.
> >>
> >> Basically, you need to be able to make a TCM device behave 100% the
> >> same as it did in an older version of the kernel.
> >>
> >> This is unique to virtualization due to live migration.  If you
> >> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> >> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> >> because the guest that is interacting with it does not realize that
> >> live migration happened.
> >
> > I don't think these strict live migration rules apply to SCSI targets.
> >
> > Real life storage systems get new features and different behaviour with
> > firmware upgrades all the time, and SCSI initiators deal with that just
> > fine.  I don't see any reason to be more picky just because we're
> > virtualized.
> 
> But would this happen while a system is running live?

Of course: Think about the consequences: you want to upgrade one array
on your SAN.  You definitely don't want to shut down your entire data
centre to achieve it.  In place upgrades on running SANs have been
common in enterprise environments for a while.

> I agree that in general, SCSI targets don't need this, but I'm pretty sure 
> that 
> if a guest probes for a command, you migrate to an old version, and that 
> command 
> is no longer there, badness will ensue.

What command are we talking about?  Operation of initiators is usually
just READ and WRITE.  So perhaps we might have inline UNMAP ... but the
world wouldn't come to an end even if the latter stopped working.

Most of the complex SCSI stuff is done at start of day; it's actually
only then we'd notice things like changes in INQUIRY strings or mode
pages.

Failover, which is what you're talking about, requires reinstatement of
all the operating parameters of the source/target system, but that's not
wholly the responsibility of the storage system ...

James

> It's different when you're talking about a reboot happening or a 
> disconnect/reconnect due to firmware upgrade.  The OS would naturally be 
> reprobing in this case.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Rustad, Mark D

On Jul 18, 2012, at 9:00 AM, Michael S. Tsirkin wrote:

> On Wed, Jul 18, 2012 at 11:53:38AM -0400, Christoph Hellwig wrote:
>> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
>>> 
>>> If you add support for a new command, you need to provide userspace
>>> a way to disable this command.  If you change what gets reported for
>>> VPD, you need to provide userspace a way to make VPD look like what
>>> it did in a previous version.
>>> 
>>> Basically, you need to be able to make a TCM device behave 100% the
>>> same as it did in an older version of the kernel.
>>> 
>>> This is unique to virtualization due to live migration.  If you
>>> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
>>> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
>>> because the guest that is interacting with it does not realize that
>>> live migration happened.
>> 
>> I don't think these strict live migration rules apply to SCSI targets.
>> 
>> Real life storage systems get new features and different behaviour with
>> firmware upgrades all the time, and SCSI initiators deal with that just
>> fine.
>> I don't see any reason to be more picky just because we're
>> virtualized.
> 
> Presumably initiators are shut down for target firmware upgrades?
> With virtualization your host can change without guest shutdown.
> You can also *lose* commands when migrating to an older host.

Actually no. Storage vendors do not want to impose a need to take initiators 
down for any reason. I have worked for a storage system vendor that routinely 
did firmware upgrades on-the-fly. This is done by multi-pathing and taking one 
path down, upgrade, bring up, repeat. There was even one non-redundant system 
that I am aware of that could upgrade firmware and reboot fast enough that the 
initiators would not notice.

You do have to pay very close attention to some things however. Don't change 
the device identity in any way - even version information, otherwise a Windows 
initiator will blue-screen. I made that mistake myself, so I remember it well. 
It seemed like such an innocent change. I don't recall there being any issue 
with adding commands and we did do that on occasion.

-- 
Mark Rustad, LAN Access Division, Intel Corporation

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Anthony Liguori


On 07/18/2012 10:53 AM, Christoph Hellwig wrote:

On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:


If you add support for a new command, you need to provide userspace
a way to disable this command.  If you change what gets reported for
VPD, you need to provide userspace a way to make VPD look like what
it did in a previous version.

Basically, you need to be able to make a TCM device behave 100% the
same as it did in an older version of the kernel.

This is unique to virtualization due to live migration.  If you
migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
because the guest that is interacting with it does not realize that
live migration happened.


I don't think these strict live migration rules apply to SCSI targets.

Real life storage systems get new features and different behaviour with
firmware upgrades all the time, and SCSI initiators deal with that just
fine.  I don't see any reason to be more picky just because we're
virtualized.


But would this happen while a system is running live?

I agree that in general, SCSI targets don't need this, but I'm pretty sure that 
if a guest probes for a command, you migrate to an old version, and that command 
is no longer there, badness will ensue.


It's different when you're talking about a reboot happening or a 
disconnect/reconnect due to firmware upgrade.  The OS would naturally be 
reprobing in this case.


Regards,

Anthony Liguori





--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v3 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Michael S. Tsirkin

On Wed, Jul 18, 2012 at 12:59:28AM +, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> Hi folks,
> 
> The following is the RFC-v3 series of tcm_vhost target fabric driver code
> currently in-flight for-3.6 mainline code.

So I sent some comments. I think it's in an OK state for a staging
driver.  I'd suggest doing some interations while there is still time
and the rest we can fix up in tree.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v3 4/4] tcm_vhost: Initial merge for vhost level target fabric driver

2012-07-18 Thread Michael S. Tsirkin

On Wed, Jul 18, 2012 at 12:59:32AM +, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> This patch adds the initial code for tcm_vhost, a Vhost level TCM
> fabric driver for virtio SCSI initiators into KVM guest.
> 
> This code is currently up and running on v3.5-rc2 host+guest along
> with the virtio-scsi vdev->scan() patch to allow a proper
> scsi_scan_host() to occur once the tcm_vhost nexus has been established
> by the paravirtualized virtio-scsi client here:
> 
> virtio-scsi: Add vdrv->scan for post VIRTIO_CONFIG_S_DRIVER_OK LUN scanning
> http://marc.info/?l=linux-scsi&m=134160609212542&w=2
> 
> Using tcm_vhost requires Zhi's -> Stefan's qemu vhost-scsi tree here:
> 
> https://github.com/wuzhy/qemu/tree/vhost-scsi
> 
> along with the recent QEMU patch to hw/virtio-scsi.c to set max_target=0
> during vhost-scsi operation.
> 
> Changelog v2 -> v3:
> 
>   Unlock on error in tcm_vhost_drop_nexus() (DanC)
>   Fix strlen() doesn't count the terminator (DanC)
>   Call kfree() on an error path (DanC)
>   Convert tcm_vhost_write_pending to use target_execute_cmd (hch + nab)
>   Fix another strlen() off by one in tcm_vhost_make_tport (DanC)
>   Add option under drivers/staging/Kconfig, and move to drivers/vhost/tcm/
>   as requested by MST (nab)
> 
> Changelog v1 -> v2:
> 
>   Fix tv_cmd completion -> release SGL memory leak (nab)
>   Fix sparse warnings for static variable usage ((Fengguang Wu)
>   Fix sparse warnings for min() typing + printk format specs (Fengguang Wu)
>   Convert to cmwq submission for I/O dispatch (nab + hch)
> 
> Changelog v0 -> v1:
> 
>   Merge into single source + header file, and move to drivers/vhost/
> 
> Cc: Stefan Hajnoczi 
> Cc: Zhi Yong Wu 
> Cc: Michael S. Tsirkin 
> Cc: Paolo Bonzini 
> Cc: Christoph Hellwig 
> Cc: Hannes Reinecke 
> Cc: Jens Axboe 
> Signed-off-by: Nicholas Bellinger 
> ---
>  drivers/staging/Kconfig   |2 +
>  drivers/vhost/Makefile|2 +
>  drivers/vhost/tcm/Kconfig |6 +
>  drivers/vhost/tcm/Makefile|1 +
>  drivers/vhost/tcm/tcm_vhost.c | 1611 
> +
>  drivers/vhost/tcm/tcm_vhost.h |   74 ++
>  6 files changed, 1696 insertions(+), 0 deletions(-)
>  create mode 100644 drivers/vhost/tcm/Kconfig
>  create mode 100644 drivers/vhost/tcm/Makefile
>  create mode 100644 drivers/vhost/tcm/tcm_vhost.c
>  create mode 100644 drivers/vhost/tcm/tcm_vhost.h
> 

Really sorry about making you run around like that,
I did not mean moving all of tcm to a directory,
just adding tcm/Kconfig or adding drivers/vhost/Kconfig.tcm
because eventually it's easier to keep it all together
in one place.

> diff --git a/drivers/staging/Kconfig b/drivers/staging/Kconfig
> index 05e33c7..8d1a627 100644
> --- a/drivers/staging/Kconfig
> +++ b/drivers/staging/Kconfig
> @@ -132,4 +132,6 @@ source "drivers/staging/ipack/Kconfig"
>  
>  source "drivers/staging/gdm72xx/Kconfig"
>  
> +source "drivers/vhost/tcm/Kconfig"
> +
>  endif # STAGING
> diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
> index 72dd020..3408bea 100644
> --- a/drivers/vhost/Makefile
> +++ b/drivers/vhost/Makefile
> @@ -1,2 +1,4 @@
>  obj-$(CONFIG_VHOST_NET) += vhost_net.o
>  vhost_net-y := vhost.o net.o
> +
> +obj-$(CONFIG_TCM_VHOST) += tcm/
> diff --git a/drivers/vhost/tcm/Kconfig b/drivers/vhost/tcm/Kconfig
> new file mode 100644
> index 000..a9c6f76
> --- /dev/null
> +++ b/drivers/vhost/tcm/Kconfig
> @@ -0,0 +1,6 @@
> +config TCM_VHOST
> + tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
> + depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
> + default n
> + ---help---
> + Say M here to enable the TCM_VHOST fabric module for use with 
> virtio-scsi guests
> diff --git a/drivers/vhost/tcm/Makefile b/drivers/vhost/tcm/Makefile
> new file mode 100644
> index 000..54b0ea6
> --- /dev/null
> +++ b/drivers/vhost/tcm/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_TCM_VHOST) += tcm_vhost.o
> diff --git a/drivers/vhost/tcm/tcm_vhost.c b/drivers/vhost/tcm/tcm_vhost.c
> new file mode 100644
> index 000..0ee4046
> --- /dev/null
> +++ b/drivers/vhost/tcm/tcm_vhost.c
> @@ -0,0 +1,1611 @@
> +/***
> + * Vhost kernel TCM fabric driver for virtio SCSI initiators
> + *
> + * (C) Copyright 2010-2012 RisingTide Systems LLC.
> + * (C) Copyright 2010-2012 IBM Corp.
> + *
> + * Licensed to the Linux Foundation under the General Public License (GPL) 
> version 2.
> + *
> + * Authors: Nicholas A. Bellinger 
> + *  Stefan Hajnoczi 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Michael S. Tsirkin

On Wed, Jul 18, 2012 at 11:53:38AM -0400, Christoph Hellwig wrote:
> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> > 
> > If you add support for a new command, you need to provide userspace
> > a way to disable this command.  If you change what gets reported for
> > VPD, you need to provide userspace a way to make VPD look like what
> > it did in a previous version.
> > 
> > Basically, you need to be able to make a TCM device behave 100% the
> > same as it did in an older version of the kernel.
> > 
> > This is unique to virtualization due to live migration.  If you
> > migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> > that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> > because the guest that is interacting with it does not realize that
> > live migration happened.
> 
> I don't think these strict live migration rules apply to SCSI targets.
> 
> Real life storage systems get new features and different behaviour with
> firmware upgrades all the time, and SCSI initiators deal with that just
> fine.
>  I don't see any reason to be more picky just because we're
> virtualized.


Presumably initiators are shut down for target firmware upgrades?
With virtualization your host can change without guest shutdown.
You can also *lose* commands when migrating to an older host.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Christoph Hellwig

On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> 
> If you add support for a new command, you need to provide userspace
> a way to disable this command.  If you change what gets reported for
> VPD, you need to provide userspace a way to make VPD look like what
> it did in a previous version.
> 
> Basically, you need to be able to make a TCM device behave 100% the
> same as it did in an older version of the kernel.
> 
> This is unique to virtualization due to live migration.  If you
> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> because the guest that is interacting with it does not realize that
> live migration happened.

I don't think these strict live migration rules apply to SCSI targets.

Real life storage systems get new features and different behaviour with
firmware upgrades all the time, and SCSI initiators deal with that just
fine.  I don't see any reason to be more picky just because we're
virtualized.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Michael S. Tsirkin

On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:
> >On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
> >>On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> >>>On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote:
> >
> >
> >
> >>>
> >>>It still seems not 100% clear whether this driver will have major
> >>>userspace using it. And if not, it would be very hard to support a driver
> >>>when recent userspace does not use it in the end.
> >>
> >>I don't think this is a good reason to exclude something from the kernel.
> >>However, there are good reasons why this doesn't make sense for something 
> >>like
> >>QEMU--specifically because we have a large number of features in our block 
> >>layer
> >>that tcm_vhost would bypass.
> >>
> >
> >I can definitely appreciate your concern here as the QEMU maintainer.
> >
> >>But perhaps it makes sense for something like native kvm tool.  And if it 
> >>did go
> >>into the kernel, we would certainly support it in QEMU.
> >>
> >
> >...
> >
> >>But I do think the kernel should carefully consider whether it wants to 
> >>support
> >>an interface like this.  This an extremely complicated ABI with a lot of 
> >>subtle
> >>details around state and compatibility.
> >>
> >>Are you absolutely confident that you can support a userspace application 
> >>that
> >>expects to get exactly the same response from all possible commands in 20 
> >>kernel
> >>versions from now?  Virtualization requires absolutely precise 
> >>compatibility in
> >>terms of bugs and features.  This is probably not something the TCM stack 
> >>has
> >>had to consider yet.
> >>
> >
> >We most certainly have thought about long term userspace compatibility
> >with TCM.  Our userspace code (that's now available in all major
> >distros) is completely forward-compatible with new fabric modules such
> >as tcm_vhost.  No update required.
> 
> I'm not sure we're talking about the same thing when we say compatibility.
> 
> I'm not talking about the API.  I'm talking about the behavior of
> the commands that tcm_vhost supports.
> 
> If you add support for a new command, you need to provide userspace
> a way to disable this command.  If you change what gets reported for
> VPD, you need to provide userspace a way to make VPD look like what
> it did in a previous version.
> 
> Basically, you need to be able to make a TCM device behave 100% the
> same as it did in an older version of the kernel.
> 
> This is unique to virtualization due to live migration.  If you
> migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure
> that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel
> because the guest that is interacting with it does not realize that
> live migration happened.
> 
> Yes, you can add knobs via configfs to control this behavior, but I
> think the question is, what's the plan for this?
> 
> BTW, I think this is a good thing to cover in
> Documentation/vhost/tcm_vhost.txt.  I think that's probably the only
> change that's needed here.
> 
> Regards,
> 
> Anthony Liguori

I agree it's needed but it's not a requirement for merging IMHO.
As a first step we can disable live migration.

> >
> >Also, by virtue of the fact that we are using configfs + rtslib (python
> >object library) on top, it's very easy to keep any type of compatibility
> >logic around in python code.  With rtslib, we are able to hide configfs
> >ABI changes from higher level apps.
> >
> >So far we've had a track record of 100% userspace ABI compatibility in
> >mainline since .38, and I don't intend to merge a patch that breaks this
> >any time soon.  But if that ever happens, apps using rtslib are not
> >going to be effected.
> >
> >>>I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
> >>>Then we don't commit to an ABI.
> >>
> >>I think this is a good idea.  Even if it goes in, a really clear policy 
> >>would be
> >>needed wrt the userspace ABI.
> >>
> >>While tcm_vhost is probably more useful than vhost_blk, it's a much more 
> >>complex
> >>ABI to maintain.
> >>
> >
> >As far as I am concerned, the kernel API (eg: configfs directory layout)
> >as it is now in sys/kernel/config/target/vhost/ is not going to change.
> >It's based on the same drivers/target/target_core_fabric_configfs.c
> >generic layout that we've had since .38.
> >
> >The basic functional fabric layout in configfs is identical (with fabric
> >dependent WWPN naming of course) regardless of fabric driver, and by
> >virtue of being generic it means we can add things like fabric dependent
> >attributes + parameters in the future for existing fabrics without
> >breaking userspace.
> >
> >So while I agree the ABI is more complex than vhost-blk, the logic in
> >target_core_fabric_configfs.c is a basic ABI fabric definition that we
> >are enforcing across all fabric modules in mainline for long term
> >compatibility.
> >
> >--nab
> >
--
To unsubscribe fro

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Paolo Bonzini

Il 18/07/2012 15:42, Anthony Liguori ha scritto:
> If you add support for a new command, you need to provide userspace a
> way to disable this command.  If you change what gets reported for VPD,
> you need to provide userspace a way to make VPD look like what it did in
> a previous version.

The QEMU target is not enforcing this to this level.  We didn't for
CD-ROM ATAPI, and we're not doing it for SCSI.

It may indeed be useful for changes to VPD pages or major features.
However, so far we've never introduced any feature that deserved it.
This is also because OSes typically don't care: they use a small subset
of the features and all the remaining "decorations" are only needed to
be pedantically compliant to the spec.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Anthony Liguori


On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:

On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:

On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:

On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote:






It still seems not 100% clear whether this driver will have major
userspace using it. And if not, it would be very hard to support a driver
when recent userspace does not use it in the end.


I don't think this is a good reason to exclude something from the kernel.
However, there are good reasons why this doesn't make sense for something like
QEMU--specifically because we have a large number of features in our block layer
that tcm_vhost would bypass.



I can definitely appreciate your concern here as the QEMU maintainer.


But perhaps it makes sense for something like native kvm tool.  And if it did go
into the kernel, we would certainly support it in QEMU.



...


But I do think the kernel should carefully consider whether it wants to support
an interface like this.  This an extremely complicated ABI with a lot of subtle
details around state and compatibility.

Are you absolutely confident that you can support a userspace application that
expects to get exactly the same response from all possible commands in 20 kernel
versions from now?  Virtualization requires absolutely precise compatibility in
terms of bugs and features.  This is probably not something the TCM stack has
had to consider yet.



We most certainly have thought about long term userspace compatibility
with TCM.  Our userspace code (that's now available in all major
distros) is completely forward-compatible with new fabric modules such
as tcm_vhost.  No update required.


I'm not sure we're talking about the same thing when we say compatibility.

I'm not talking about the API.  I'm talking about the behavior of the commands 
that tcm_vhost supports.


If you add support for a new command, you need to provide userspace a way to 
disable this command.  If you change what gets reported for VPD, you need to 
provide userspace a way to make VPD look like what it did in a previous version.


Basically, you need to be able to make a TCM device behave 100% the same as it 
did in an older version of the kernel.


This is unique to virtualization due to live migration.  If you migrate from a 
3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM 
device behaves exactly like the 3.6 kernel because the guest that is interacting 
with it does not realize that live migration happened.


Yes, you can add knobs via configfs to control this behavior, but I think the 
question is, what's the plan for this?


BTW, I think this is a good thing to cover in Documentation/vhost/tcm_vhost.txt. 
 I think that's probably the only change that's needed here.


Regards,

Anthony Liguori



Also, by virtue of the fact that we are using configfs + rtslib (python
object library) on top, it's very easy to keep any type of compatibility
logic around in python code.  With rtslib, we are able to hide configfs
ABI changes from higher level apps.

So far we've had a track record of 100% userspace ABI compatibility in
mainline since .38, and I don't intend to merge a patch that breaks this
any time soon.  But if that ever happens, apps using rtslib are not
going to be effected.


I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING.
Then we don't commit to an ABI.


I think this is a good idea.  Even if it goes in, a really clear policy would be
needed wrt the userspace ABI.

While tcm_vhost is probably more useful than vhost_blk, it's a much more complex
ABI to maintain.



As far as I am concerned, the kernel API (eg: configfs directory layout)
as it is now in sys/kernel/config/target/vhost/ is not going to change.
It's based on the same drivers/target/target_core_fabric_configfs.c
generic layout that we've had since .38.

The basic functional fabric layout in configfs is identical (with fabric
dependent WWPN naming of course) regardless of fabric driver, and by
virtue of being generic it means we can add things like fabric dependent
attributes + parameters in the future for existing fabrics without
breaking userspace.

So while I agree the ABI is more complex than vhost-blk, the logic in
target_core_fabric_configfs.c is a basic ABI fabric definition that we
are enforcing across all fabric modules in mainline for long term
compatibility.

--nab



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v3 3/4] vhost: Add vhost_scsi specific defines

2012-07-18 Thread Michael S. Tsirkin

On Wed, Jul 18, 2012 at 12:59:31AM +, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> This patch adds the initial vhost_scsi_ioctl() callers for 
> VHOST_SCSI_SET_ENDPOINT
> and VHOST_SCSI_CLEAR_ENDPOINT respectively, and also adds struct 
> vhost_vring_target
> that is used by tcm_vhost code when locating target ports during qemu setup.
> 
> Signed-off-by: Stefan Hajnoczi 
> Cc: Zhi Yong Wu 
> Cc: Michael S. Tsirkin 
> Cc: Paolo Bonzini ,
> Signed-off-by: Nicholas A. Bellinger 
> ---
>  include/linux/vhost.h |9 +
>  1 files changed, 9 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/vhost.h b/include/linux/vhost.h
> index e847f1e..33b313b 100644
> --- a/include/linux/vhost.h
> +++ b/include/linux/vhost.h
> @@ -24,7 +24,11 @@ struct vhost_vring_state {
>  struct vhost_vring_file {
>   unsigned int index;
>   int fd; /* Pass -1 to unbind from file. */
> +};
>  
> +struct vhost_vring_target {

Can this be renamed vhost_scsi_target?

> + unsigned char vhost_wwpn[224];

224? I am guessing ISCSI_NAME_LEN from include/scsi/iscsi_proto.h?
Unfortunately we can't include iscsi_proto.h here as it
is not exported to users. But let's add a comment for now.

> + unsigned short vhost_tpgt;
>  };
>  
>  struct vhost_vring_addr {
> @@ -121,6 +125,11 @@ struct vhost_memory {
>   * device.  This can be used to stop the ring (e.g. for migration). */
>  #define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct 
> vhost_vring_file)
>  
> +/* VHOST_SCSI specific defines */
> +
> +#define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct 
> vhost_vring_target)
> +#define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct 
> vhost_vring_target)
> +
>  /* Feature bits */
>  /* Log all write descriptors. Can be changed while device is active. */

Can these go into appropriate ifdef CONFIG_TCP_VHOST please?

>  #define VHOST_F_LOG_ALL 26
> -- 
> 1.7.2.5
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v3 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-18 Thread Michael S. Tsirkin

On Wed, Jul 18, 2012 at 12:59:28AM +, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> Hi folks,
> 
> The following is the RFC-v3 series of tcm_vhost target fabric driver code
> currently in-flight for-3.6 mainline code.
> 
> With the merge window opening soon, the tcm_vhost code has started seeing
> time in linux-next.  The v2 -> v3 changelog from the last week is currently
> looking like:
> 
>  *) Unlock on error in tcm_vhost_drop_nexus() (DanC)
>  *) Fix strlen() doesn't count the terminator (DanC)
>  *) Call kfree() on an error path (DanC)
>  *) Convert tcm_vhost_write_pending to use target_execute_cmd (hch + nab)
>  *) Fix another strlen() off by one in tcm_vhost_make_tport (DanC)
>  *) Add option under drivers/staging/Kconfig, and move to drivers/vhost/tcm/
> as requested by MST (nab)

I actually only wanted to have a separate Kconfig (in a separate
directory or Kconfig.tcm), so you do not need to move code back to move
driver out of staging.  But if you prefer this, I'm fine with it too for
now.

> Thanks to Dan Carpenter for his smatch fixes this past round.
> 
> Also as requested by MST, the code has been moved to a seperate tcm/ 
> subdirectory
> under drivers/vhost/ so that it can be included under staging's config options
> until we can settle on the necessary userspace bits for QEMU and kvm-tool.
> 
> The updated series will be going out shortly to target-pending/for-next-merge.
> 
> Please have another look and let us know if you have any concerned.
> 
> Thanks!
> 
> Nicholas Bellinger (2):
>   vhost: Add vhost_scsi specific defines
>   tcm_vhost: Initial merge for vhost level target fabric driver
> 
> Stefan Hajnoczi (2):
>   vhost: Separate vhost-net features from vhost features
>   vhost: make vhost work queue visible
> 
>  drivers/staging/Kconfig   |2 +
>  drivers/vhost/Makefile|2 +
>  drivers/vhost/net.c   |4 +-
>  drivers/vhost/tcm/Kconfig |6 +
>  drivers/vhost/tcm/Makefile|1 +
>  drivers/vhost/tcm/tcm_vhost.c | 1611 
> +
>  drivers/vhost/tcm/tcm_vhost.h |   74 ++
>  drivers/vhost/test.c  |4 +-
>  drivers/vhost/vhost.c |5 +-
>  drivers/vhost/vhost.h |6 +-
>  include/linux/vhost.h |9 +
>  11 files changed, 1716 insertions(+), 8 deletions(-)
>  create mode 100644 drivers/vhost/tcm/Kconfig
>  create mode 100644 drivers/vhost/tcm/Makefile
>  create mode 100644 drivers/vhost/tcm/tcm_vhost.c
>  create mode 100644 drivers/vhost/tcm/tcm_vhost.h
> 
> -- 
> 1.7.2.5
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH RESEND 0/5] Add vhost-blk support

2012-07-18 Thread Stefan Hajnoczi

On Tue, Jul 17, 2012 at 4:09 PM, Michael S. Tsirkin  wrote:
> On Fri, Jul 13, 2012 at 04:55:06PM +0800, Asias He wrote:
>>
>> Hi folks,
>>
>> [I am resending to fix the broken thread in the previous one.]
>>
>> This patchset adds vhost-blk support. vhost-blk is a in kernel virito-blk
>> device accelerator. Compared to userspace virtio-blk implementation, 
>> vhost-blk
>> gives about 5% to 15% performance improvement.
>
> Same thing as tcm_host comment:
>
> It seems not 100% clear whether this driver will have major
> userspace using it. And if not, it would be very hard to support a
> driver when recent userspace does not use it in the end.
>
> I think a good idea for 3.6 would be to make it depend on
> CONFIG_STAGING.  Then we don't commit to an ABI.  For this, you can 
> add
> a separate Kconfig and source it from drivers/staging/Kconfig.  Maybe 
> it
> needs to be in a separate directory drivers/vhost/staging/Kconfig.
>
> I Cc'd the list of tcm_host in the hope that you can cooperate on this.
>

Adding it to staging allows more people to try it out, so that's a
good thing.  If I get a moment to play with it I'll let you know the
results.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] target: Allow for target_submit_cmd() returning errors

2012-07-18 Thread Sebastian Andrzej Siewior


On 07/18/2012 02:05 AM, Nicholas A. Bellinger wrote:

index 02ace18..9ddf315 100644
--- a/drivers/usb/gadget/tcm_usb_gadget.c
+++ b/drivers/usb/gadget/tcm_usb_gadget.c
@@ -1060,7 +1060,10 @@ static void usbg_cmd_work(struct work_struct *work)
tpg = cmd->fu->tpg;
tv_nexus = tpg->tpg_nexus;
dir = get_cmd_dir(cmd->cmd_buf);
-   if (dir<  0) {
+   if (dir<  0 ||
+   target_submit_cmd(se_cmd, tv_nexus->tvn_se_sess,
+ cmd->cmd_buf, cmd->sense_iu.sense, 
cmd->unpacked_lun,
+ 0, cmd->prio_attr, dir, TARGET_SCF_UNKNOWN_SIZE)) 
{
transport_init_se_cmd(se_cmd,
tv_nexus->tvn_se_sess->se_tpg->se_tpg_tfo,
tv_nexus->tvn_se_sess, cmd->data_len, DMA_NONE,
@@ -1069,12 +1072,7 @@ static void usbg_cmd_work(struct work_struct *work)
transport_send_check_condition_and_sense(se_cmd,
TCM_UNSUPPORTED_SCSI_OPCODE, 1);
usbg_cleanup_cmd(cmd);
-   return;
}
-
-   target_submit_cmd(se_cmd, tv_nexus->tvn_se_sess,
-   cmd->cmd_buf, cmd->sense_iu.sense, cmd->unpacked_lun,
-   0, cmd->prio_attr, dir, TARGET_SCF_UNKNOWN_SIZE);
  }

  static int usbg_submit_command(struct f_uas *fu,


Looking at this part again target_submit_cmd() has been moved ahead of
target_init_se_cmd() in the exception path, which ends up getting called
twice during a target_get_sess_cmd() failure..

It looks like usbg_cmd_work() + bot_cmd_work() will need some work if
they intends to use a non zero failure to signal exception here, without
invoking a CHECK_CONDITION and sense.  Actually, I'm not even sure
sending a CHECK_CONDITION here after the target_submit_cmd() is going to
work for other fabrics drivers, but it appears to work with usb-gadget.
(Sebastian..?)


For UAS the status/ sense URB is sent right away (without any data) and
this worked the last time I tested.
For BOT things are a little complicated. What I do is to send empty
data (to fill the data pipe) and send a bad status so the other side
learns that the transfer failed somehow. The sense information is lost.
What I should do is to queue this sense data for the next MODE_SENSE
request which will be coming soon. Right now WinXP repeats a failed
command thrice if MODE_SENSE returns no error. "This" is on my to-fix
list…


So for the moment, lets fix the double se_cmd init and make things a
little easier to read..   Sebastian, please have a look and double check
that the (dir<  0) + target_submit_cmd() failures cases are both going
to work as expected whilst sending a CHECK_CONDITION response.


The sense code should only be sent once and
transport_send_check_condition_and_sense() sets
SCF_SENT_CHECK_CONDITION and checks for it so doing it twice should not
do any harm right? But grep does not say when this flag gets removed.



Thanks!

--nab


Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [MINI SUMMIT] SCSI core performance

2012-07-18 Thread James Bottomley

On Tue, 2012-07-17 at 19:39 -0700, Nicholas A. Bellinger wrote:
> Hi KS-PCs,
> 
> I'd like to propose a SCSI performance mini-summit to see how interested
> folks are in helping address the long-term issues that SCSI core is
> currently facing wrt to multi-lun per host and heavy small block random
> I/O workloads.
> 
> I know this would probably be better suited for LSF (for the record it
> was proposed this year) but now that we've acknowledge there is a
> problem with SCSI LLDs vs. raw block drivers vs. other SCSI subsystems,
> it would be useful to get the storage folks into a single room at some
> point during KS/LPC to figure out what is actually going on with SCSI
> core.

You seem to have a short memory:  The last time it was discussed

http://marc.info/?t=13415537393

It rapidly became apparent there isn't a problem.  Enabling high IOPS in
the SCSI stack is what I think you mean.

> As mentioned in the recent tcm_vhost thread, there are a number of cases
> where drivers/target/ code can demonstrate this limitation pretty
> vividly now.
> 
> This includes the following scenarios using raw block flash export with
> target_core_mod + target_core_iblock export and the same small block
> (4k) mixed random I/O workload with fio:
> 
> *) tcm_loop local SCSI LLD performance is an order of magnitude slower 
>than the same local raw block flash backend.
> *) tcm_qla2xxx performs better using MSFT Server hosts than Linux v3.x
>based hosts using 2x socket Nehalem hardware w/ PCI-e Gen2 HBAs
> *) ib_srpt performs better using MSFT Server host than RHEL 6.x .32 
>based hosts using 2x socket Romley hardware w/ PCI-e Gen3 HCAs
> *) Raw block IBLOCK export into KVM guest v3.5-rc w/ virtio-scsi is 
>behind in performance vs. raw local block flash.  (cmwq on the host 
>is helping here, but still need to with MSFT SCSI mini-port)
> 
> Also, with 1M IOPs into a single VM guest now being done by other non
> Linux based hypervisors, the virtualization bit for high performance KVM
> SCSI based storage is quickly coming on..
> 
> So all of that said, I'd like to at least have a discussion with the key
> SCSI + block folks who will be present in San Diego on path forward to
> address these without having to wait until LSF-2013 + hope for a topic
> slot to materialize then.
> 
> Thank you for your consideration,

Well, your proposal is devoid of an actual proposal.

Enabling high IOPS involves reducing locking overhead and path length
through the code.  I think most of the low hanging fruit in this area is
already picked, but if you have an idea, please say.  There might be
something we can extract from the lockless queue work Jens is doing, but
we need that to materialise first.

Without a concrete thing to discuss, shooting the breeze on high IOPS in
the SCSI stack is about as useful as discussing what happened in last
night's episode of Coronation Street which, when it happens in my house,
always helps me see how incredibly urgent fixing the leaky tap I've been
putting off for months actually is.

If someone can come up with a proposal ... or even perhaps another path
trace showing where the reducible overhead and lock problems are we can
discuss it on the list and we might have a real topic by the time LSF
rolls around.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

[GIT PULL] target fixes for v3.5

Re: [PATCH] target: Allow for target_submit_cmd() returning errors

Re: scsi_lib: fix scsi_io_completion's SG_IO error propagation

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [Bug 44771] [REGRESSION] a7a20d103994fd760766e6c9d494daa569cbfe06 makes kernel 3.5 unbootable on an Intel chipset based motherboard

Re: [MINI SUMMIT] SCSI core performance

Re: [RFC-v3 3/4] vhost: Add vhost_scsi specific defines

Re: [RFC-v3 4/4] tcm_vhost: Initial merge for vhost level target fabric driver

Re: [RFC-v3 3/4] vhost: Add vhost_scsi specific defines

Re: [Bug 44771] [REGRESSION] a7a20d103994fd760766e6c9d494daa569cbfe06 makes kernel 3.5 unbootable on an Intel chipset based motherboard

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

[PATCH] scsi/ibmvscsi: /sys/class/scsi_host/hostX/config doesn't show any information

[PATCH] scsi/ibmvscsi: add module alias for ibmvscsic

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v3 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v3 4/4] tcm_vhost: Initial merge for vhost level target fabric driver

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [RFC-v3 3/4] vhost: Add vhost_scsi specific defines

Re: [RFC-v3 0/4] tcm_vhost+cmwq fabric driver code for-3.6

Re: [PATCH RESEND 0/5] Add vhost-blk support

Re: [PATCH] target: Allow for target_submit_cmd() returning errors

Re: [MINI SUMMIT] SCSI core performance

31 matches

Site Navigation

Mail list logo

Footer information