RE: [RFC] hv_storvsc: error handling.
> -Original Message- > From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Monday, March 6, 2017 9:06 PM > To: KY Srinivasan <k...@microsoft.com> > Cc: Stephen Hemminger <step...@networkplumber.org>; James > Bottomley <james.bottom...@hansenpartnership.com>; Hannes Reinecke > <h...@suse.de>; Christoph Hellwig <h...@lst.de>; James Bottomley > <j...@linux.vnet.ibm.com>; Jens Axboe <ax...@kernel.dk>; Linus Torvalds > <torva...@linux-foundation.org>; Martin K. Petersen > <martin.peter...@oracle.com>; Dexuan Cui <de...@microsoft.com>; Long > Li <lon...@microsoft.com>; Josh Poulson <jopou...@microsoft.com>; Adrian > Suhov (Cloudbase Solutions SRL) <v-ads...@microsoft.com>; linux- > s...@vger.kernel.org; Haiyang Zhang <haiya...@microsoft.com> > Subject: Re: [RFC] hv_storvsc: error handling. > > > Was the invalid LUN in the LUN0 position. Inquiry of LUN0 support (when > LUN0 is not populated) > > was added only recently to address host side issue. > > How does HyperV expect device scanning to happen for a not populated > LUN? > > REPORT SUPPORTED LUNS but nothing else on LUN 0? Maybe a TEST UNIT > READY > thrown? Or does it actually support the REPORT LUNS well known LU? LUN0 on every target is supposed at least return enough data to support report luns scan for the target. It looks like if LUN0 on a target is empty DVD device, the INQUIRY data that is returned from the host is bogus. Stephen can give additional information on this. K. Y
Re: [RFC] hv_storvsc: error handling.
> Was the invalid LUN in the LUN0 position. Inquiry of LUN0 support (when LUN0 > is not populated) > was added only recently to address host side issue. How does HyperV expect device scanning to happen for a not populated LUN? REPORT SUPPORTED LUNS but nothing else on LUN 0? Maybe a TEST UNIT READY thrown? Or does it actually support the REPORT LUNS well known LU?
Re: [RFC] hv_storvsc: error handling.
> > I will try it, but it can't work for two reasons. > > First, the INVALID_LUN error is masked off on INQUIRY in current code. > > Second, the scsi_device is instantiated already as part of scan probe > > process > > before it gets here. > > Was the invalid LUN in the LUN0 position. Inquiry of LUN0 support (when LUN0 > is not populated) > was added only recently to address host side issue. When probing the code probes with LUN 1, ... There is a cause where kernel does INQUIRY on LUN0, it looks kernel is asking for page code 80 which is optional "Unit Serial Number". And then WS2016 is returning an error and invalid sense data. The old masking of errors caused kernel to interpret sense data as Unit Serial Number which is also not good but looks harmless. > > The best solution so far is: > > - remove old INQUIRY/SENSE error masking > > + add new workaround for INQUIRY of device id on LUN 0 > > which appears to be the reason for old masking > > + return errors on missing LUN > > + provide better transport services for hot remove (rather > > than detecting by failed I/O). > > This the mechanism used by the host for notifying LUN removal - invalid LUN > error code. This has a couple of problems. First, it means that hotplug doesn't occur until an I/O is done. Second the current code was not being truthful to block layer. If it has to handle hotplug in this manner, it should have still failed the I/O. If application was using direct I/O it would want to know that write failed. Perhaps the existing channel mechanism can be used as notification path.
RE: [RFC] hv_storvsc: error handling.
> -Original Message- > From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Monday, March 6, 2017 8:36 AM > To: KY Srinivasan <k...@microsoft.com> > Cc: James Bottomley <james.bottom...@hansenpartnership.com>; Hannes > Reinecke <h...@suse.de>; Christoph Hellwig <h...@lst.de>; James > Bottomley <j...@linux.vnet.ibm.com>; Jens Axboe <ax...@kernel.dk>; > Linus Torvalds <torva...@linux-foundation.org>; Martin K. Petersen > <martin.peter...@oracle.com>; Dexuan Cui <de...@microsoft.com>; Long > Li <lon...@microsoft.com>; Josh Poulson <jopou...@microsoft.com>; Adrian > Suhov (Cloudbase Solutions SRL) <v-ads...@microsoft.com>; linux- > s...@vger.kernel.org; Haiyang Zhang <haiya...@microsoft.com> > Subject: Re: [RFC] hv_storvsc: error handling. > > On Sat, 4 Mar 2017 21:03:41 + > KY Srinivasan <k...@microsoft.com> wrote: > > > > -Original Message- > > > From: Stephen Hemminger [mailto:step...@networkplumber.org] > > > Sent: Friday, March 3, 2017 4:50 PM > > > To: James Bottomley <james.bottom...@hansenpartnership.com> > > > Cc: Hannes Reinecke <h...@suse.de>; Christoph Hellwig <h...@lst.de>; > > > James Bottomley <j...@linux.vnet.ibm.com>; Jens Axboe > > > <ax...@kernel.dk>; Linus Torvalds <torva...@linux-foundation.org>; > > > Martin K. Petersen <martin.peter...@oracle.com>; KY Srinivasan > > > <k...@microsoft.com>; Dexuan Cui <de...@microsoft.com>; Long Li > > > <lon...@microsoft.com>; Josh Poulson <jopou...@microsoft.com>; > Adrian > > > Suhov (Cloudbase Solutions SRL) <v-ads...@microsoft.com>; linux- > > > s...@vger.kernel.org; Haiyang Zhang <haiya...@microsoft.com> > > > Subject: [RFC] hv_storvsc: error handling. > > > > > > Needs more testing but this does fix the observed problem. > > > > > > From: Stephen Hemminger <sthem...@microsoft.com> > > > > > > Subject: [PATCH] hv_storvsc: fix error handling > > > > > > The Hyper-V storvsc SCSI driver was hiding all errors in INQUIRY and > > > MODE_SENSE commands. This caused the scan process to incorrectly > think > > > devices were present and online. Also invalid LUN errors were not > > > being handled correctly. > > > > > > This fixes problems booting a GEN2 VM on Hyper-V. It effectively > > > reverts commit 4ed51a21c0f69 ("Staging: hv: storvsc: Fixup > > > srb and scsi status for INQUIRY and MODE_SENSE") > > > > > > Signed-off-by: Stephen Hemminger <sthem...@microsoft.com> > > > --- > > > drivers/scsi/storvsc_drv.c | 48 > > > -- > > > 1 file changed, 4 insertions(+), 44 deletions(-) > > > > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > > > index 638e5f427c90..8cc241fc54b8 100644 > > > --- a/drivers/scsi/storvsc_drv.c > > > +++ b/drivers/scsi/storvsc_drv.c > > > @@ -543,28 +543,6 @@ static void storvsc_host_scan(struct work_struct > > > *work) > > > kfree(wrk); > > > } > > > > > > -static void storvsc_remove_lun(struct work_struct *work) > > > -{ > > > - struct storvsc_scan_work *wrk; > > > - struct scsi_device *sdev; > > > - > > > - wrk = container_of(work, struct storvsc_scan_work, work); > > > - if (!scsi_host_get(wrk->host)) > > > - goto done; > > > - > > > - sdev = scsi_device_lookup(wrk->host, 0, wrk->tgt_id, wrk->lun); > > > - > > > - if (sdev) { > > > - scsi_remove_device(sdev); > > > - scsi_device_put(sdev); > > > - } > > > - scsi_host_put(wrk->host); > > > - > > > -done: > > > - kfree(wrk); > > > -} > > > - > > > - > > > /* > > > * We can get incoming messages from the host that are not in response > to > > > * messages that we have sent out. An example of this would be > messages > > > @@ -955,8 +933,7 @@ static void storvsc_handle_error(struct > > > vmscsi_request *vm_srb, > > > } > > > break; > > > case SRB_STATUS_INVALID_LUN: > > > - do_work = true; > > > - process_err_fn = storvsc_remove_lun; > > > + set_host_byte(scmnd, DID_NO_CONNECT); > > > break; > > > case SRB_STATUS_ABO
Re: [RFC] hv_storvsc: error handling.
On Sat, 4 Mar 2017 21:03:41 + KY Srinivasanwrote: > > -Original Message- > > From: Stephen Hemminger [mailto:step...@networkplumber.org] > > Sent: Friday, March 3, 2017 4:50 PM > > To: James Bottomley > > Cc: Hannes Reinecke ; Christoph Hellwig ; > > James Bottomley ; Jens Axboe > > ; Linus Torvalds ; > > Martin K. Petersen ; KY Srinivasan > > ; Dexuan Cui ; Long Li > > ; Josh Poulson ; Adrian > > Suhov (Cloudbase Solutions SRL) ; linux- > > s...@vger.kernel.org; Haiyang Zhang > > Subject: [RFC] hv_storvsc: error handling. > > > > Needs more testing but this does fix the observed problem. > > > > From: Stephen Hemminger > > > > Subject: [PATCH] hv_storvsc: fix error handling > > > > The Hyper-V storvsc SCSI driver was hiding all errors in INQUIRY and > > MODE_SENSE commands. This caused the scan process to incorrectly think > > devices were present and online. Also invalid LUN errors were not > > being handled correctly. > > > > This fixes problems booting a GEN2 VM on Hyper-V. It effectively > > reverts commit 4ed51a21c0f69 ("Staging: hv: storvsc: Fixup > > srb and scsi status for INQUIRY and MODE_SENSE") > > > > Signed-off-by: Stephen Hemminger > > --- > > drivers/scsi/storvsc_drv.c | 48 > > -- > > 1 file changed, 4 insertions(+), 44 deletions(-) > > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > > index 638e5f427c90..8cc241fc54b8 100644 > > --- a/drivers/scsi/storvsc_drv.c > > +++ b/drivers/scsi/storvsc_drv.c > > @@ -543,28 +543,6 @@ static void storvsc_host_scan(struct work_struct > > *work) > > kfree(wrk); > > } > > > > -static void storvsc_remove_lun(struct work_struct *work) > > -{ > > - struct storvsc_scan_work *wrk; > > - struct scsi_device *sdev; > > - > > - wrk = container_of(work, struct storvsc_scan_work, work); > > - if (!scsi_host_get(wrk->host)) > > - goto done; > > - > > - sdev = scsi_device_lookup(wrk->host, 0, wrk->tgt_id, wrk->lun); > > - > > - if (sdev) { > > - scsi_remove_device(sdev); > > - scsi_device_put(sdev); > > - } > > - scsi_host_put(wrk->host); > > - > > -done: > > - kfree(wrk); > > -} > > - > > - > > /* > > * We can get incoming messages from the host that are not in response to > > * messages that we have sent out. An example of this would be messages > > @@ -955,8 +933,7 @@ static void storvsc_handle_error(struct > > vmscsi_request *vm_srb, > > } > > break; > > case SRB_STATUS_INVALID_LUN: > > - do_work = true; > > - process_err_fn = storvsc_remove_lun; > > + set_host_byte(scmnd, DID_NO_CONNECT); > > break; > > case SRB_STATUS_ABORTED: > > if (vm_srb->srb_status & SRB_STATUS_AUTOSENSE_VALID > > && > > @@ -1050,32 +1027,15 @@ static void storvsc_on_io_completion(struct > > storvsc_device *stor_device, > > > > stor_pkt = >vstor_packet; > > > > - /* > > -* The current SCSI handling on the host side does > > -* not correctly handle: > > -* INQUIRY command with page code parameter set to 0x80 > > -* MODE_SENSE command with cmd[2] == 0x1c > > -* > > -* Setup srb and scsi status so this won't be fatal. > > -* We do this so we can distinguish truly fatal failues > > -* (srb status == 0x4) and off-line the device in that case. > > -*/ > > - > > - if ((stor_pkt->vm_srb.cdb[0] == INQUIRY) || > > - (stor_pkt->vm_srb.cdb[0] == MODE_SENSE)) { > > - vstor_packet->vm_srb.scsi_status = 0; > > - vstor_packet->vm_srb.srb_status = SRB_STATUS_SUCCESS; > > - } > > - > > - > > /* Copy over the status...etc */ > > stor_pkt->vm_srb.scsi_status = vstor_packet->vm_srb.scsi_status; > > stor_pkt->vm_srb.srb_status = vstor_packet->vm_srb.srb_status; > > stor_pkt->vm_srb.sense_info_length = > > vstor_packet->vm_srb.sense_info_length; > > > > - if (vstor_packet->vm_srb.scsi_status != 0 || > > - vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS) > > + if (stor_pkt->vm_srb.cdb[0] != INQUIRY && > > + (vstor_packet->vm_srb.scsi_status != 0 || > > +vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS)) > > storvsc_log(device, STORVSC_LOGGING_WARN, > > "cmd 0x%x scsi status 0x%x srb status 0x%x\n", > > stor_pkt->vm_srb.cdb[0], > > -- > > This patch gets rid of the ability to "hot remove" LUNs. I don't think that > can be part of any > solution. The INQUIRY hack I put in a long time ago was to deal with host > bugs on prior
RE: [RFC] hv_storvsc: error handling.
> -Original Message- > From: KY Srinivasan > Sent: Saturday, March 4, 2017 1:40 PM > To: 'James Bottomley' <j...@linux.vnet.ibm.com>; Stephen Hemminger > <step...@networkplumber.org> > Cc: Hannes Reinecke <h...@suse.de>; Christoph Hellwig <h...@lst.de>; Jens > Axboe <ax...@kernel.dk>; Linus Torvalds <torvalds@linux- > foundation.org>; Martin K. Petersen <martin.peter...@oracle.com>; > Dexuan Cui <de...@microsoft.com>; Long Li <lon...@microsoft.com>; Josh > Poulson <jopou...@microsoft.com>; Adrian Suhov (Cloudbase Solutions SRL) > <v-ads...@microsoft.com>; linux-scsi@vger.kernel.org; Haiyang Zhang > <haiya...@microsoft.com> > Subject: RE: [RFC] hv_storvsc: error handling. > > > > > -Original Message- > > From: James Bottomley [mailto:j...@linux.vnet.ibm.com] > > Sent: Saturday, March 4, 2017 1:37 PM > > To: KY Srinivasan <k...@microsoft.com>; Stephen Hemminger > > <step...@networkplumber.org> > > Cc: Hannes Reinecke <h...@suse.de>; Christoph Hellwig <h...@lst.de>; > Jens > > Axboe <ax...@kernel.dk>; Linus Torvalds <torvalds@linux- > > foundation.org>; Martin K. Petersen <martin.peter...@oracle.com>; > > Dexuan Cui <de...@microsoft.com>; Long Li <lon...@microsoft.com>; Josh > > Poulson <jopou...@microsoft.com>; Adrian Suhov (Cloudbase Solutions > SRL) > > <v-ads...@microsoft.com>; linux-scsi@vger.kernel.org; Haiyang Zhang > > <haiya...@microsoft.com> > > Subject: Re: [RFC] hv_storvsc: error handling. > > > > On Sat, 2017-03-04 at 21:03 +, KY Srinivasan wrote: > > > > > > > -Original Message- > > > > From: Stephen Hemminger [mailto:step...@networkplumber.org] > > > > Sent: Friday, March 3, 2017 4:50 PM > > > > To: James Bottomley <james.bottom...@hansenpartnership.com> > > > > Cc: Hannes Reinecke <h...@suse.de>; Christoph Hellwig > <h...@lst.de>; > > > > James Bottomley <j...@linux.vnet.ibm.com>; Jens Axboe > > > > <ax...@kernel.dk>; Linus Torvalds <torva...@linux-foundation.org>; > > > > Martin K. Petersen <martin.peter...@oracle.com>; KY Srinivasan > > > > <k...@microsoft.com>; Dexuan Cui <de...@microsoft.com>; Long Li > > > > <lon...@microsoft.com>; Josh Poulson <jopou...@microsoft.com>; > > > > Adrian > > > > Suhov (Cloudbase Solutions SRL) <v-ads...@microsoft.com>; linux- > > > > s...@vger.kernel.org; Haiyang Zhang <haiya...@microsoft.com> > > > > Subject: [RFC] hv_storvsc: error handling. > > > > > > > > Needs more testing but this does fix the observed problem. > > > > > > > > From: Stephen Hemminger <sthem...@microsoft.com> > > > > > > > > Subject: [PATCH] hv_storvsc: fix error handling > > > > > > > > The Hyper-V storvsc SCSI driver was hiding all errors in INQUIRY > > > > and > > > > MODE_SENSE commands. This caused the scan process to incorrectly > > > > think > > > > devices were present and online. Also invalid LUN errors were not > > > > being handled correctly. > > > > > > > > This fixes problems booting a GEN2 VM on Hyper-V. It effectively > > > > reverts commit 4ed51a21c0f69 ("Staging: hv: storvsc: Fixup > > > > srb and scsi status for INQUIRY and MODE_SENSE") > > > > > > > > Signed-off-by: Stephen Hemminger <sthem...@microsoft.com> > > > > --- > > > > drivers/scsi/storvsc_drv.c | 48 -- > > > > > > > > 1 file changed, 4 insertions(+), 44 deletions(-) > > > > > > > > diff --git a/drivers/scsi/storvsc_drv.c > > > > b/drivers/scsi/storvsc_drv.c > > > > index 638e5f427c90..8cc241fc54b8 100644 > > > > --- a/drivers/scsi/storvsc_drv.c > > > > +++ b/drivers/scsi/storvsc_drv.c > > > > @@ -543,28 +543,6 @@ static void storvsc_host_scan(struct > > > > work_struct > > > > *work) > > > > kfree(wrk); > > > > } > > > > > > > > -static void storvsc_remove_lun(struct work_struct *work) > > > > -{ > > > > - struct storvsc_scan_work *wrk; > > > > - struct scsi_device *sdev; > > > > - > > > > - wrk = container_of(work, struct storvsc_scan_work, work); > >
RE: [RFC] hv_storvsc: error handling.
> -Original Message- > From: James Bottomley [mailto:j...@linux.vnet.ibm.com] > Sent: Saturday, March 4, 2017 1:37 PM > To: KY Srinivasan <k...@microsoft.com>; Stephen Hemminger > <step...@networkplumber.org> > Cc: Hannes Reinecke <h...@suse.de>; Christoph Hellwig <h...@lst.de>; Jens > Axboe <ax...@kernel.dk>; Linus Torvalds <torvalds@linux- > foundation.org>; Martin K. Petersen <martin.peter...@oracle.com>; > Dexuan Cui <de...@microsoft.com>; Long Li <lon...@microsoft.com>; Josh > Poulson <jopou...@microsoft.com>; Adrian Suhov (Cloudbase Solutions SRL) > <v-ads...@microsoft.com>; linux-scsi@vger.kernel.org; Haiyang Zhang > <haiya...@microsoft.com> > Subject: Re: [RFC] hv_storvsc: error handling. > > On Sat, 2017-03-04 at 21:03 +, KY Srinivasan wrote: > > > > > -Original Message- > > > From: Stephen Hemminger [mailto:step...@networkplumber.org] > > > Sent: Friday, March 3, 2017 4:50 PM > > > To: James Bottomley <james.bottom...@hansenpartnership.com> > > > Cc: Hannes Reinecke <h...@suse.de>; Christoph Hellwig <h...@lst.de>; > > > James Bottomley <j...@linux.vnet.ibm.com>; Jens Axboe > > > <ax...@kernel.dk>; Linus Torvalds <torva...@linux-foundation.org>; > > > Martin K. Petersen <martin.peter...@oracle.com>; KY Srinivasan > > > <k...@microsoft.com>; Dexuan Cui <de...@microsoft.com>; Long Li > > > <lon...@microsoft.com>; Josh Poulson <jopou...@microsoft.com>; > > > Adrian > > > Suhov (Cloudbase Solutions SRL) <v-ads...@microsoft.com>; linux- > > > s...@vger.kernel.org; Haiyang Zhang <haiya...@microsoft.com> > > > Subject: [RFC] hv_storvsc: error handling. > > > > > > Needs more testing but this does fix the observed problem. > > > > > > From: Stephen Hemminger <sthem...@microsoft.com> > > > > > > Subject: [PATCH] hv_storvsc: fix error handling > > > > > > The Hyper-V storvsc SCSI driver was hiding all errors in INQUIRY > > > and > > > MODE_SENSE commands. This caused the scan process to incorrectly > > > think > > > devices were present and online. Also invalid LUN errors were not > > > being handled correctly. > > > > > > This fixes problems booting a GEN2 VM on Hyper-V. It effectively > > > reverts commit 4ed51a21c0f69 ("Staging: hv: storvsc: Fixup > > > srb and scsi status for INQUIRY and MODE_SENSE") > > > > > > Signed-off-by: Stephen Hemminger <sthem...@microsoft.com> > > > --- > > > drivers/scsi/storvsc_drv.c | 48 -- > > > > > > 1 file changed, 4 insertions(+), 44 deletions(-) > > > > > > diff --git a/drivers/scsi/storvsc_drv.c > > > b/drivers/scsi/storvsc_drv.c > > > index 638e5f427c90..8cc241fc54b8 100644 > > > --- a/drivers/scsi/storvsc_drv.c > > > +++ b/drivers/scsi/storvsc_drv.c > > > @@ -543,28 +543,6 @@ static void storvsc_host_scan(struct > > > work_struct > > > *work) > > > kfree(wrk); > > > } > > > > > > -static void storvsc_remove_lun(struct work_struct *work) > > > -{ > > > - struct storvsc_scan_work *wrk; > > > - struct scsi_device *sdev; > > > - > > > - wrk = container_of(work, struct storvsc_scan_work, work); > > > - if (!scsi_host_get(wrk->host)) > > > - goto done; > > > - > > > - sdev = scsi_device_lookup(wrk->host, 0, wrk->tgt_id, wrk > > > ->lun); > > > - > > > - if (sdev) { > > > - scsi_remove_device(sdev); > > > - scsi_device_put(sdev); > > > - } > > > - scsi_host_put(wrk->host); > > > - > > > -done: > > > - kfree(wrk); > > > -} > > > - > > > - > > > /* > > > * We can get incoming messages from the host that are not in > > > response to > > > * messages that we have sent out. An example of this would be > > > messages > > > @@ -955,8 +933,7 @@ static void storvsc_handle_error(struct > > > vmscsi_request *vm_srb, > > > } > > > break; > > > case SRB_STATUS_INVALID_LUN: > > > - do_work = true; > > > - process_err_fn = storvsc_remove_lun; > > > + set_host_byte(scmnd, DID_NO_CONNECT); > > > break; > > > case SRB_
Re: [RFC] hv_storvsc: error handling.
On Sat, 2017-03-04 at 21:03 +, KY Srinivasan wrote: > > > -Original Message- > > From: Stephen Hemminger [mailto:step...@networkplumber.org] > > Sent: Friday, March 3, 2017 4:50 PM > > To: James Bottomley> > Cc: Hannes Reinecke ; Christoph Hellwig ; > > James Bottomley ; Jens Axboe > > ; Linus Torvalds ; > > Martin K. Petersen ; KY Srinivasan > > ; Dexuan Cui ; Long Li > > ; Josh Poulson ; > > Adrian > > Suhov (Cloudbase Solutions SRL) ; linux- > > s...@vger.kernel.org; Haiyang Zhang > > Subject: [RFC] hv_storvsc: error handling. > > > > Needs more testing but this does fix the observed problem. > > > > From: Stephen Hemminger > > > > Subject: [PATCH] hv_storvsc: fix error handling > > > > The Hyper-V storvsc SCSI driver was hiding all errors in INQUIRY > > and > > MODE_SENSE commands. This caused the scan process to incorrectly > > think > > devices were present and online. Also invalid LUN errors were not > > being handled correctly. > > > > This fixes problems booting a GEN2 VM on Hyper-V. It effectively > > reverts commit 4ed51a21c0f69 ("Staging: hv: storvsc: Fixup > > srb and scsi status for INQUIRY and MODE_SENSE") > > > > Signed-off-by: Stephen Hemminger > > --- > > drivers/scsi/storvsc_drv.c | 48 -- > > > > 1 file changed, 4 insertions(+), 44 deletions(-) > > > > diff --git a/drivers/scsi/storvsc_drv.c > > b/drivers/scsi/storvsc_drv.c > > index 638e5f427c90..8cc241fc54b8 100644 > > --- a/drivers/scsi/storvsc_drv.c > > +++ b/drivers/scsi/storvsc_drv.c > > @@ -543,28 +543,6 @@ static void storvsc_host_scan(struct > > work_struct > > *work) > > kfree(wrk); > > } > > > > -static void storvsc_remove_lun(struct work_struct *work) > > -{ > > - struct storvsc_scan_work *wrk; > > - struct scsi_device *sdev; > > - > > - wrk = container_of(work, struct storvsc_scan_work, work); > > - if (!scsi_host_get(wrk->host)) > > - goto done; > > - > > - sdev = scsi_device_lookup(wrk->host, 0, wrk->tgt_id, wrk > > ->lun); > > - > > - if (sdev) { > > - scsi_remove_device(sdev); > > - scsi_device_put(sdev); > > - } > > - scsi_host_put(wrk->host); > > - > > -done: > > - kfree(wrk); > > -} > > - > > - > > /* > > * We can get incoming messages from the host that are not in > > response to > > * messages that we have sent out. An example of this would be > > messages > > @@ -955,8 +933,7 @@ static void storvsc_handle_error(struct > > vmscsi_request *vm_srb, > > } > > break; > > case SRB_STATUS_INVALID_LUN: > > - do_work = true; > > - process_err_fn = storvsc_remove_lun; > > + set_host_byte(scmnd, DID_NO_CONNECT); > > break; > > case SRB_STATUS_ABORTED: > > if (vm_srb->srb_status & > > SRB_STATUS_AUTOSENSE_VALID > > && > > @@ -1050,32 +1027,15 @@ static void storvsc_on_io_completion(struct > > storvsc_device *stor_device, > > > > stor_pkt = >vstor_packet; > > > > - /* > > -* The current SCSI handling on the host side does > > -* not correctly handle: > > -* INQUIRY command with page code parameter set to 0x80 > > -* MODE_SENSE command with cmd[2] == 0x1c > > -* > > -* Setup srb and scsi status so this won't be fatal. > > -* We do this so we can distinguish truly fatal failues > > -* (srb status == 0x4) and off-line the device in that > > case. > > -*/ > > - > > - if ((stor_pkt->vm_srb.cdb[0] == INQUIRY) || > > - (stor_pkt->vm_srb.cdb[0] == MODE_SENSE)) { > > - vstor_packet->vm_srb.scsi_status = 0; > > - vstor_packet->vm_srb.srb_status = > > SRB_STATUS_SUCCESS; > > - } > > - > > - > > /* Copy over the status...etc */ > > stor_pkt->vm_srb.scsi_status = vstor_packet > > ->vm_srb.scsi_status; > > stor_pkt->vm_srb.srb_status = vstor_packet > > ->vm_srb.srb_status; > > stor_pkt->vm_srb.sense_info_length = > > vstor_packet->vm_srb.sense_info_length; > > > > - if (vstor_packet->vm_srb.scsi_status != 0 || > > - vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS) > > + if (stor_pkt->vm_srb.cdb[0] != INQUIRY && > > + (vstor_packet->vm_srb.scsi_status != 0 || > > +vstor_packet->vm_srb.srb_status != > > SRB_STATUS_SUCCESS)) > > storvsc_log(device, STORVSC_LOGGING_WARN, > > "cmd 0x%x scsi status 0x%x srb status > > 0x%x\n", > > stor_pkt->vm_srb.cdb[0], > > -- > > This patch gets rid of the ability to "hot remove" LUNs. I don't > think that can be part of any > solution. The INQUIRY hack I put in a long time ago
RE: [RFC] hv_storvsc: error handling.
> -Original Message- > From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Friday, March 3, 2017 4:50 PM > To: James Bottomley> Cc: Hannes Reinecke ; Christoph Hellwig ; > James Bottomley ; Jens Axboe > ; Linus Torvalds ; > Martin K. Petersen ; KY Srinivasan > ; Dexuan Cui ; Long Li > ; Josh Poulson ; Adrian > Suhov (Cloudbase Solutions SRL) ; linux- > s...@vger.kernel.org; Haiyang Zhang > Subject: [RFC] hv_storvsc: error handling. > > Needs more testing but this does fix the observed problem. > > From: Stephen Hemminger > > Subject: [PATCH] hv_storvsc: fix error handling > > The Hyper-V storvsc SCSI driver was hiding all errors in INQUIRY and > MODE_SENSE commands. This caused the scan process to incorrectly think > devices were present and online. Also invalid LUN errors were not > being handled correctly. > > This fixes problems booting a GEN2 VM on Hyper-V. It effectively > reverts commit 4ed51a21c0f69 ("Staging: hv: storvsc: Fixup > srb and scsi status for INQUIRY and MODE_SENSE") > > Signed-off-by: Stephen Hemminger > --- > drivers/scsi/storvsc_drv.c | 48 > -- > 1 file changed, 4 insertions(+), 44 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > index 638e5f427c90..8cc241fc54b8 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -543,28 +543,6 @@ static void storvsc_host_scan(struct work_struct > *work) > kfree(wrk); > } > > -static void storvsc_remove_lun(struct work_struct *work) > -{ > - struct storvsc_scan_work *wrk; > - struct scsi_device *sdev; > - > - wrk = container_of(work, struct storvsc_scan_work, work); > - if (!scsi_host_get(wrk->host)) > - goto done; > - > - sdev = scsi_device_lookup(wrk->host, 0, wrk->tgt_id, wrk->lun); > - > - if (sdev) { > - scsi_remove_device(sdev); > - scsi_device_put(sdev); > - } > - scsi_host_put(wrk->host); > - > -done: > - kfree(wrk); > -} > - > - > /* > * We can get incoming messages from the host that are not in response to > * messages that we have sent out. An example of this would be messages > @@ -955,8 +933,7 @@ static void storvsc_handle_error(struct > vmscsi_request *vm_srb, > } > break; > case SRB_STATUS_INVALID_LUN: > - do_work = true; > - process_err_fn = storvsc_remove_lun; > + set_host_byte(scmnd, DID_NO_CONNECT); > break; > case SRB_STATUS_ABORTED: > if (vm_srb->srb_status & SRB_STATUS_AUTOSENSE_VALID > && > @@ -1050,32 +1027,15 @@ static void storvsc_on_io_completion(struct > storvsc_device *stor_device, > > stor_pkt = >vstor_packet; > > - /* > - * The current SCSI handling on the host side does > - * not correctly handle: > - * INQUIRY command with page code parameter set to 0x80 > - * MODE_SENSE command with cmd[2] == 0x1c > - * > - * Setup srb and scsi status so this won't be fatal. > - * We do this so we can distinguish truly fatal failues > - * (srb status == 0x4) and off-line the device in that case. > - */ > - > - if ((stor_pkt->vm_srb.cdb[0] == INQUIRY) || > -(stor_pkt->vm_srb.cdb[0] == MODE_SENSE)) { > - vstor_packet->vm_srb.scsi_status = 0; > - vstor_packet->vm_srb.srb_status = SRB_STATUS_SUCCESS; > - } > - > - > /* Copy over the status...etc */ > stor_pkt->vm_srb.scsi_status = vstor_packet->vm_srb.scsi_status; > stor_pkt->vm_srb.srb_status = vstor_packet->vm_srb.srb_status; > stor_pkt->vm_srb.sense_info_length = > vstor_packet->vm_srb.sense_info_length; > > - if (vstor_packet->vm_srb.scsi_status != 0 || > - vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS) > + if (stor_pkt->vm_srb.cdb[0] != INQUIRY && > + (vstor_packet->vm_srb.scsi_status != 0 || > + vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS)) > storvsc_log(device, STORVSC_LOGGING_WARN, > "cmd 0x%x scsi status 0x%x srb status 0x%x\n", > stor_pkt->vm_srb.cdb[0], > -- This patch gets rid of the ability to "hot remove" LUNs. I don't think that can be part of any solution. The INQUIRY hack I put in a long time ago was to deal with host bugs on prior versions of Windows server. WS2016 should not be trigerring this code. Stephen, could you please test this patch - a quick hack: >From b97f24f224a71a6e745c42e5640045a553eb407c Mon Sep 17 00:00:00 2001 From: K. Y. Srinivasan
Re: [RFC] hv_storvsc: error handling.
On 03/04/2017 01:50 AM, Stephen Hemminger wrote: Needs more testing but this does fix the observed problem. From: Stephen HemmingerSubject: [PATCH] hv_storvsc: fix error handling The Hyper-V storvsc SCSI driver was hiding all errors in INQUIRY and MODE_SENSE commands. This caused the scan process to incorrectly think devices were present and online. Also invalid LUN errors were not being handled correctly. This fixes problems booting a GEN2 VM on Hyper-V. It effectively reverts commit 4ed51a21c0f69 ("Staging: hv: storvsc: Fixup srb and scsi status for INQUIRY and MODE_SENSE") Signed-off-by: Stephen Hemminger --- drivers/scsi/storvsc_drv.c | 48 -- 1 file changed, 4 insertions(+), 44 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 638e5f427c90..8cc241fc54b8 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -543,28 +543,6 @@ static void storvsc_host_scan(struct work_struct *work) kfree(wrk); } -static void storvsc_remove_lun(struct work_struct *work) -{ - struct storvsc_scan_work *wrk; - struct scsi_device *sdev; - - wrk = container_of(work, struct storvsc_scan_work, work); - if (!scsi_host_get(wrk->host)) - goto done; - - sdev = scsi_device_lookup(wrk->host, 0, wrk->tgt_id, wrk->lun); - - if (sdev) { - scsi_remove_device(sdev); - scsi_device_put(sdev); - } - scsi_host_put(wrk->host); - -done: - kfree(wrk); -} - - /* * We can get incoming messages from the host that are not in response to * messages that we have sent out. An example of this would be messages @@ -955,8 +933,7 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb, } break; case SRB_STATUS_INVALID_LUN: - do_work = true; - process_err_fn = storvsc_remove_lun; + set_host_byte(scmnd, DID_NO_CONNECT); break; case SRB_STATUS_ABORTED: if (vm_srb->srb_status & SRB_STATUS_AUTOSENSE_VALID && @@ -1050,32 +1027,15 @@ static void storvsc_on_io_completion(struct storvsc_device *stor_device, stor_pkt = >vstor_packet; - /* -* The current SCSI handling on the host side does -* not correctly handle: -* INQUIRY command with page code parameter set to 0x80 -* MODE_SENSE command with cmd[2] == 0x1c -* -* Setup srb and scsi status so this won't be fatal. -* We do this so we can distinguish truly fatal failues -* (srb status == 0x4) and off-line the device in that case. -*/ - - if ((stor_pkt->vm_srb.cdb[0] == INQUIRY) || - (stor_pkt->vm_srb.cdb[0] == MODE_SENSE)) { - vstor_packet->vm_srb.scsi_status = 0; - vstor_packet->vm_srb.srb_status = SRB_STATUS_SUCCESS; - } - - /* Copy over the status...etc */ stor_pkt->vm_srb.scsi_status = vstor_packet->vm_srb.scsi_status; stor_pkt->vm_srb.srb_status = vstor_packet->vm_srb.srb_status; stor_pkt->vm_srb.sense_info_length = vstor_packet->vm_srb.sense_info_length; - if (vstor_packet->vm_srb.scsi_status != 0 || - vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS) + if (stor_pkt->vm_srb.cdb[0] != INQUIRY && + (vstor_packet->vm_srb.scsi_status != 0 || +vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS)) storvsc_log(device, STORVSC_LOGGING_WARN, "cmd 0x%x scsi status 0x%x srb status 0x%x\n", stor_pkt->vm_srb.cdb[0], I really do wonder; according to the comments above the wrong error handling really only affects inquiry VPD page 0x80 and MODE_SENSE page 0x1c. So why do we need to blank out _every_ inquiry? Wouldn't it far better to check _what_ goes wrong when asking for page 0x80, and fix up things correctly? Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage h...@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)