date:20171205

Re: [REGRESSION][v4.13.y][v4.14.y] scsi: libsas: allow async aborts

2017-12-05 Thread Fabian Grünbichler

On Fri, Nov 17, 2017 at 03:09:37PM +0100, Hannes Reinecke wrote:
> On 11/16/2017 11:08 PM, Joseph Salisbury wrote:
> > Hi Christoph,
> > 
> > A kernel bug report was opened against Ubuntu [0].  After a kernel
> > bisect, it was found that reverting the following commit resolved this bug:
> > 
> > 909657615d9b ("scsi: libsas: allow async aborts")
> > 
> >  
> > The regression was introduced as of v4.12-rc1, and it still exists in
> > 4.14 mainline.
> > 
> > I was hoping to get your feedback, since you are the patch author.  Do
> > you think gathering any additional data will help diagnose this issue,
> > or would it be best to submit a revert request?
> > 
> I'll be checking what's going on there.
> 
> Cheers,
> 
> Hannes

Any news about this regression? Any traces / data affected users can
provide?

Thanks!

[PATCH 0/17] scsi: arcmsr: add some driver options and support new adapter ARC-1884

2017-12-05 Thread Ching Huang

From: Ching Huang 

Hi Martin,

Due to 4.16/scsi-queue conflict with 4.15-rc1, so I resubmit these patches 
again.
The following patches apply to Linus' 4.15-rc1 tree.

Patch 1: redefine ACB_ADAPTER_TYPE_A, _B, _C, _D and subsequent changes.

Patch 2: simplify arcmsr_iop_init function.

Patch 3: add codes for ACB_ADAPTER_TYPE_E to support new adapter ARC-1884

Patch 4. update ARCMSR_MAX_OUTSTANDING_CMD and ARCMSR_MAX_FREECCB_NUM to 1024

Patch 5: replace constant ARCMSR_MAX_FREECCB_NUM by variable acb->maxFreeCCB 
that was got from firmware.

Patch 6: add driver option host_can_queue to set host->can_queue value by user. 
It's value expands
 up to 1024.

Patch 7: replace constant ARCMSR_MAX_OUTSTANDING_CMD by variable 
acb->maxOutstanding that was determined by user.

Patch 8: add driver option cmd_per_lun to set host->cmd_per_lun value by user.

Patch 9: add ACB_F_MSG_GET_CONFIG to acb->acb_flags for for message interrupt 
checking before schedule work for
 get device map.

Patch 10: add a function arcmsr_set_iop_datetime and driver option 
set_date_time to set date and time to firmware.

Patch 11: fix clear doorbell queue on ACB_ADAPTER_TYPE_B controller.

Patch 12: spin off duplicate code of timer init for message isr BH in 
arcmsr_probe and arcmsr_resume as a function
  arcmsr_init_get_devmap_timer

Patch 13: adjust some tab or white-space to make text alignment.

Patch 14: fix grammar error.

Patch 15: Add module parameter msi_enable to has a chance to disable msi 
interrupt if it does not work properly.

Patch 16: Add module parameter msix_enable to has a chance to disable msix 
interrupt if it does not work properly.

Patch 17: update driver version to v1.40.00.04-20171130
---

[PATCH 1/17] scsi: arcmsr: redefine ACB_ADAPTER_TYPE_A, _B, _C, _D and subsequent changes.

2017-12-05 Thread Ching Huang

From: Ching Huang 

redefine ACB_ADAPTER_TYPE_A, _B, _C, _D and subsequent changes.

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 10:45:50.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-08-03 18:54:46.0 +0800
@@ -621,10 +621,10 @@ struct MessageUnit_D {
 struct AdapterControlBlock
 {
uint32_t  adapter_type;/* adapter A,B. */
-   #define ACB_ADAPTER_TYPE_A0x0001/* hba I IOP */
-   #define ACB_ADAPTER_TYPE_B0x0002/* hbb M IOP */
-   #define ACB_ADAPTER_TYPE_C0x0004/* hbc P IOP */
-   #define ACB_ADAPTER_TYPE_D0x0008/* hbd A IOP */
+   #define ACB_ADAPTER_TYPE_A  0x  /* hba I IOP */
+   #define ACB_ADAPTER_TYPE_B  0x0001  /* hbb M IOP */
+   #define ACB_ADAPTER_TYPE_C  0x0002  /* hbc L IOP */
+   #define ACB_ADAPTER_TYPE_D  0x0003  /* hbd M IOP */
u32 roundup_ccbsize;
struct pci_dev *pdev;
struct Scsi_Host *  host;
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 10:45:30.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:47:40.0 +0800
@@ -1785,7 +1785,7 @@ arcmsr_Read_iop_rqbuffer_data(struct Ada
uint8_t __iomem *iop_data;
uint32_t iop_len;
 
-   if (acb->adapter_type & (ACB_ADAPTER_TYPE_C | ACB_ADAPTER_TYPE_D))
+   if (acb->adapter_type > ACB_ADAPTER_TYPE_B)
return arcmsr_Read_iop_rqbuffer_in_DWORD(acb, prbuffer);
iop_data = (uint8_t __iomem *)prbuffer->data;
iop_len = readl(&prbuffer->data_len);
@@ -1871,7 +1871,7 @@ arcmsr_write_ioctldata2iop(struct Adapte
uint8_t __iomem *iop_data;
int32_t allxfer_len = 0;
 
-   if (acb->adapter_type & (ACB_ADAPTER_TYPE_C | ACB_ADAPTER_TYPE_D)) {
+   if (acb->adapter_type > ACB_ADAPTER_TYPE_B) {
arcmsr_write_ioctldata2iop_in_DWORD(acb);
return;
}

[PATCH 2/17] scsi: arcmsr: simplify arcmsr_iop_init function

2017-12-05 Thread Ching Huang

From: Ching Huang 

simplify arcmsr_iop_init function

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:47:40.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:47:28.0 +0800
@@ -3671,6 +3671,39 @@ static void arcmsr_hardware_reset(struct
msleep(1000);
return;
 }
+
+static bool arcmsr_reset_in_progress(struct AdapterControlBlock *acb)
+{
+   bool rtn = true;
+
+   switch(acb->adapter_type) {
+   case ACB_ADAPTER_TYPE_A:{
+   struct MessageUnit_A __iomem *reg = acb->pmuA;
+   rtn = ((readl(®->outbound_msgaddr1) &
+   ARCMSR_OUTBOUND_MESG1_FIRMWARE_OK) == 0) ? true : false;
+   }
+   break;
+   case ACB_ADAPTER_TYPE_B:{
+   struct MessageUnit_B *reg = acb->pmuB;
+   rtn = ((readl(reg->iop2drv_doorbell) &
+   ARCMSR_MESSAGE_FIRMWARE_OK) == 0) ? true : false;
+   }
+   break;
+   case ACB_ADAPTER_TYPE_C:{
+   struct MessageUnit_C __iomem *reg = acb->pmuC;
+   rtn = (readl(®->host_diagnostic) & 0x04) ? true : false;
+   }
+   break;
+   case ACB_ADAPTER_TYPE_D:{
+   struct MessageUnit_D *reg = acb->pmuD;
+   rtn = ((readl(reg->sample_at_reset) & 0x80) == 0) ?
+   true : false;
+   }
+   break;
+   }
+   return rtn;
+}
+
 static void arcmsr_iop_init(struct AdapterControlBlock *acb)
 {
uint32_t intmask_org;
@@ -3725,197 +3758,55 @@ static uint8_t arcmsr_iop_reset(struct A
 static int arcmsr_bus_reset(struct scsi_cmnd *cmd)
 {
struct AdapterControlBlock *acb;
-   uint32_t intmask_org, outbound_doorbell;
int retry_count = 0;
int rtn = FAILED;
acb = (struct AdapterControlBlock *) cmd->device->host->hostdata;
-   printk(KERN_ERR "arcmsr: executing bus reset eh.num_resets = %d, 
num_aborts = %d \n", acb->num_resets, acb->num_aborts);
+   pr_notice("arcmsr: executing bus reset eh.num_resets = %d,"
+   " num_aborts = %d \n", acb->num_resets, acb->num_aborts);
acb->num_resets++;
 
-   switch(acb->adapter_type){
-   case ACB_ADAPTER_TYPE_A:{
-   if (acb->acb_flags & ACB_F_BUS_RESET){
-   long timeout;
-   printk(KERN_ERR "arcmsr: there is an  bus reset 
eh proceeding...\n");
-   timeout = wait_event_timeout(wait_q, 
(acb->acb_flags & ACB_F_BUS_RESET) == 0, 220*HZ);
-   if (timeout) {
-   return SUCCESS;
-   }
-   }
-   acb->acb_flags |= ACB_F_BUS_RESET;
-   if (!arcmsr_iop_reset(acb)) {
-   struct MessageUnit_A __iomem *reg;
-   reg = acb->pmuA;
-   arcmsr_hardware_reset(acb);
-   acb->acb_flags &= ~ACB_F_IOP_INITED;
-sleep_again:
-   ssleep(ARCMSR_SLEEPTIME);
-   if ((readl(®->outbound_msgaddr1) & 
ARCMSR_OUTBOUND_MESG1_FIRMWARE_OK) == 0) {
-   printk(KERN_ERR "arcmsr%d: waiting for 
hw bus reset return, retry=%d\n", acb->host->host_no, retry_count);
-   if (retry_count > ARCMSR_RETRYCOUNT) {
-   acb->fw_flag = FW_DEADLOCK;
-   printk(KERN_ERR "arcmsr%d: 
waiting for hw bus reset return, RETRY TERMINATED!!\n", acb->host->host_no);
-   return FAILED;
-   }
-   retry_count++;
-   goto sleep_again;
-   }
-   acb->acb_flags |= ACB_F_IOP_INITED;
-   /* disable all outbound interrupt */
-   intmask_org = arcmsr_disable_outbound_ints(acb);
-   arcmsr_get_firmware_spec(acb);
-   arcmsr_start_adapter_bgrb(acb);
-   /* clear Qbuffer if door bell ringed */
-   outbound_doorbell = 
readl(®->outbound_doorbell);
-   writel(outbound_doorbell, 
®->outbound_doorbell); /*clear interrupt */
-   writel(ARCMSR_INBOUND_DRIVER_DATA_READ_OK, 
®->inbound_doorbell);
-   /* enable outbound Post Queue,outbound doorbell 
Interrupt */
-   arcmsr_enable_outbound_in

[PATCH] scsi: libsas: fix length error in sas_smp_handler()

2017-12-05 Thread Jason Yan

The bsg_job_done() requires the length of payload received, but we give
it the untransferred residual.

Fixes: 651a01364994 ("scsi: scsi_transport_sas: switch to bsg-lib for SMP")
Reported-and-tested-by: chenqilin 
Signed-off-by: Jason Yan 
CC: Christoph Hellwig 
---
 drivers/scsi/libsas/sas_expander.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/libsas/sas_expander.c 
b/drivers/scsi/libsas/sas_expander.c
index 50cb0f3..8323dc1 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -2177,9 +2177,9 @@ void sas_smp_handler(struct bsg_job *job, struct 
Scsi_Host *shost,
 
ret = smp_execute_task_sg(dev, job->request_payload.sg_list,
job->reply_payload.sg_list);
-   if (ret > 0) {
+   if (ret >= 0) {
/* positive number is the untransferred residual */
-   reslen = ret;
+   reslen = job->reply_payload.payload_len - ret;
ret = 0;
}
 
-- 
2.9.5

[PATCH 3/17] scsi: arcmsr: add codes for ACB_ADAPTER_TYPE_E to support new adapter ARC-1884

2017-12-05 Thread Ching Huang

From: Ching Huang 

add codes for ACB_ADAPTER_TYPE_E to support new adapter ARC-1884

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-08-03 18:54:46.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-08-04 11:19:22.0 +0800
@@ -65,6 +65,7 @@ struct device_attribute;
 #define ARCMSR_MAX_HBB_POSTQUEUE   
264
 #define ARCMSR_MAX_ARC1214_POSTQUEUE   256
 #define ARCMSR_MAX_ARC1214_DONEQUEUE   257
+#define ARCMSR_MAX_HBE_DONEQUEUE   512
 #define ARCMSR_MAX_XFER_LEN
0x26000 /* 152K */
 #define ARCMSR_CDB_SG_PAGE_LENGTH  
256 
 #define ARCMST_NUM_MSIX_VECTORS4
@@ -77,6 +78,9 @@ struct device_attribute;
 #ifndef PCI_DEVICE_ID_ARECA_1203
#define PCI_DEVICE_ID_ARECA_12030x1203
 #endif
+#ifndef PCI_DEVICE_ID_ARECA_1884
+   #define PCI_DEVICE_ID_ARECA_18840x1884
+#endif
 /*
 
**
 **
@@ -405,6 +409,31 @@ struct FIRMWARE_INFO
 /*ARCMSR_HBAMU_MESSAGE_FIRMWARE_OK*/
 #define ARCMSR_ARC1214_MESSAGE_FIRMWARE_OK 0x8000
 #define ARCMSR_ARC1214_OUTBOUND_LIST_INTERRUPT_CLEAR   0x0001
+/* 
+***
+**SPEC. for Areca Type E adapter
+***
+*/
+#define ARCMSR_SIGNATURE_1884  0x188417D3
+
+#define ARCMSR_HBEMU_DRV2IOP_DATA_WRITE_OK 0x0002
+#define ARCMSR_HBEMU_DRV2IOP_DATA_READ_OK  0x0004
+#define ARCMSR_HBEMU_DRV2IOP_MESSAGE_CMD_DONE  0x0008
+
+#define ARCMSR_HBEMU_IOP2DRV_DATA_WRITE_OK 0x0002
+#define ARCMSR_HBEMU_IOP2DRV_DATA_READ_OK  0x0004
+#define ARCMSR_HBEMU_IOP2DRV_MESSAGE_CMD_DONE  0x0008
+
+#define ARCMSR_HBEMU_MESSAGE_FIRMWARE_OK   0x8000
+
+#define ARCMSR_HBEMU_OUTBOUND_DOORBELL_ISR 0x0001
+#define ARCMSR_HBEMU_OUTBOUND_POSTQUEUE_ISR0x0008
+#define ARCMSR_HBEMU_ALL_INTMASKENABLE 0x0009
+
+/* ARC-1884 doorbell sync */
+#define ARCMSR_HBEMU_DOORBELL_SYNC 0x100
+#define ARCMSR_ARC188X_RESET_ADAPTER   0x0004
+#define ARCMSR_ARC1884_DiagWrite_ENABLE0x0080
 /*
 ***
 **ARECA SCSI COMMAND DESCRIPTOR BLOCK size 0x1F8 (504)
@@ -614,6 +643,88 @@ struct MessageUnit_D {
u32 __iomem *msgcode_rwbuffer;  /* 0x2200 */
 };
 /*
+*
+** Messaging Unit (MU) of Type E processor(LSI)
+*
+*/
+struct MessageUnit_E{
+   uint32_tiobound_doorbell;   /* 0003*/
+   uint32_twrite_sequence_3xxx;/*0004 0007*/
+   uint32_thost_diagnostic_3xxx;   /*0008 000B*/
+   uint32_tposted_outbound_doorbell;   /*000C 000F*/
+   uint32_tmaster_error_attribute; /*0010 0013*/
+   uint32_tmaster_error_address_low;   /*0014 0017*/
+   uint32_tmaster_error_address_high;  /*0018 001B*/
+   uint32_thcb_size;   /*001C 001F*/
+   uint32_tinbound_doorbell;   /*0020 0023*/
+   uint32_tdiagnostic_rw_data; /*0024 0027*/
+   uint32_tdiagnostic_rw_address_low;  /*0028 002B*/
+   uint32_tdiagnostic_rw_address_high; /*002C 002F*/
+   uint32_thost_int_status;/*0030 0033*/
+   uint32_thost_int_mask;  /*0034 0037*/
+   uint32_tdcr_data;   /*0038 003B*/
+   uint32_tdcr_address;/*003C 003F*/
+   uint32_tinbound_queueport;  /*0040 0043*/
+   uint32_toutbound_queueport; /*0044 0047*/
+   uint32_thcb_pci_address_low;/*0048 004B*/
+   uint32_thcb_pci_address_high;   /*004C 004F*/
+   uint32_tiop_int_status; /*0050 0053*/
+   uint32_tiop_int_mask;   /*0054 0057*/
+   uint32_tiop_inbound_queue_port; /*0058 005B*/
+   uint32_tiop_outbound_queue_port;/*005C 005F*/
+   uint32_tinbound_free_list_index;/*0060 0063*/
+   uint32_tinbound_post_list_index;/*0064 0067*/
+   uint32_t

[PATCH 4/17] scsi: arcmsr: enable host controller command queueu up to 1024

2017-12-05 Thread Ching Huang

From: Ching Huang 

update ARCMSR_MAX_OUTSTANDING_CMD and ARCMSR_MAX_FREECCB_NUM to 1024

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-08-04 11:19:22.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:24:16.0 +0800
@@ -45,13 +45,8 @@
 #include 
 struct device_attribute;
 /*The limit of outstanding scsi command that firmware can handle*/
-#ifdef CONFIG_XEN
-   #define ARCMSR_MAX_FREECCB_NUM  160
-#define ARCMSR_MAX_OUTSTANDING_CMD 155
-#else
-   #define ARCMSR_MAX_FREECCB_NUM  320
-#define ARCMSR_MAX_OUTSTANDING_CMD 255
-#endif
+#define ARCMSR_MAX_FREECCB_NUM 1024
+#define ARCMSR_MAX_OUTSTANDING_CMD 1024
 #define ARCMSR_DRIVER_VERSION  "v1.30.00.22-20151126"
 #define ARCMSR_SCSI_INITIATOR_ID   
255
 #define ARCMSR_MAX_XFER_SECTORS
512

[PATCH 5/17] scsi: arcmsr: replace constant ARCMSR_MAX_FREECCB_NUM by variable

2017-12-05 Thread Ching Huang

From: Ching Huang 

replace constant ARCMSR_MAX_FREECCB_NUM by variable acb->maxFreeCCB that was 
got from firmware

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:24:16.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:25:06.0 +0800
@@ -831,6 +831,7 @@ struct AdapterControlBlock
atomic_tante_token_value;
uint32_tmaxOutstanding;
int vector_count;
+   uint32_tmaxFreeCCB;
uint32_tdoneq_index;
uint32_tccbsize;
uint32_tin_doorbell;
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:47:16.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:47:04.0 +0800
@@ -688,7 +688,7 @@ static int arcmsr_alloc_ccb_pool(struct 
acb->host->max_sectors = max_xfer_len/512;
acb->host->sg_tablesize = max_sg_entrys;
roundup_ccbsize = roundup(sizeof(struct CommandControlBlock) + 
(max_sg_entrys - 1) * sizeof(struct SG64ENTRY), 32);
-   acb->uncache_size = roundup_ccbsize * ARCMSR_MAX_FREECCB_NUM;
+   acb->uncache_size = roundup_ccbsize * acb->maxFreeCCB;
dma_coherent = dma_alloc_coherent(&pdev->dev, acb->uncache_size, 
&dma_coherent_handle, GFP_KERNEL);
if(!dma_coherent){
printk(KERN_NOTICE "arcmsr%d: dma_alloc_coherent got error\n", 
acb->host->host_no);
@@ -700,7 +700,7 @@ static int arcmsr_alloc_ccb_pool(struct 
acb->ccbsize = roundup_ccbsize;
ccb_tmp = dma_coherent;
acb->vir2phy_offset = (unsigned long)dma_coherent - (unsigned 
long)dma_coherent_handle;
-   for(i = 0; i < ARCMSR_MAX_FREECCB_NUM; i++){
+   for(i = 0; i < acb->maxFreeCCB; i++){
cdb_phyaddr = dma_coherent_handle + offsetof(struct 
CommandControlBlock, arcmsr_cdb);
switch (acb->adapter_type) {
case ACB_ADAPTER_TYPE_A:
@@ -1427,7 +1427,7 @@ static void arcmsr_remove(struct pci_dev
 
arcmsr_abort_allcmd(acb);
arcmsr_done4abort_postqueue(acb);
-   for (i = 0; i < ARCMSR_MAX_FREECCB_NUM; i++) {
+   for (i = 0; i < acb->maxFreeCCB; i++) {
struct CommandControlBlock *ccb = acb->pccb_pool[i];
if (ccb->startdone == ARCMSR_CCB_START) {
ccb->startdone = ARCMSR_CCB_ABORTED;
@@ -3239,6 +3239,9 @@ static bool arcmsr_get_firmware_spec(str
else
acb->maxOutstanding = acb->firm_numbers_queue - 1;
acb->host->can_queue = acb->maxOutstanding;
+   acb->maxFreeCCB = acb->host->can_queue;
+   if (acb->maxFreeCCB < ARCMSR_MAX_FREECCB_NUM)
+   acb->maxFreeCCB += 64;
return rtn;
 }
 
@@ -4261,7 +4264,7 @@ static uint8_t arcmsr_iop_reset(struct A
rtnval = arcmsr_abort_allcmd(acb);
/* clear all outbound posted Q */
arcmsr_done4abort_postqueue(acb);
-   for (i = 0; i < ARCMSR_MAX_FREECCB_NUM; i++) {
+   for (i = 0; i < acb->maxFreeCCB; i++) {
ccb = acb->pccb_pool[i];
if (ccb->startdone == ARCMSR_CCB_START) {
scsi_dma_unmap(ccb->pcmd);
@@ -4369,7 +4372,7 @@ static int arcmsr_abort(struct scsi_cmnd
}
 
intmask_org = arcmsr_disable_outbound_ints(acb);
-   for (i = 0; i < ARCMSR_MAX_FREECCB_NUM; i++) {
+   for (i = 0; i < acb->maxFreeCCB; i++) {
struct CommandControlBlock *ccb = acb->pccb_pool[i];
if (ccb->startdone == ARCMSR_CCB_START && ccb->pcmd == cmd) {
ccb->startdone = ARCMSR_CCB_ABORTED;

[PATCH 6/17] scsi: arcmsr: add driver option host_can_queue to set host->can_queue value by user

2017-12-05 Thread Ching Huang

From: Ching Huang 

add driver option host_can_queue to set host->can_queue value by user. It's 
value expands up to 1024

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:25:06.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:25:36.0 +0800
@@ -47,6 +47,8 @@ struct device_attribute;
 /*The limit of outstanding scsi command that firmware can handle*/
 #define ARCMSR_MAX_FREECCB_NUM 1024
 #define ARCMSR_MAX_OUTSTANDING_CMD 1024
+#define ARCMSR_DEFAULT_OUTSTANDING_CMD 128
+#define ARCMSR_MIN_OUTSTANDING_CMD 32
 #define ARCMSR_DRIVER_VERSION  "v1.30.00.22-20151126"
 #define ARCMSR_SCSI_INITIATOR_ID   
255
 #define ARCMSR_MAX_XFER_SECTORS
512
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:47:04.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:46:50.0 +0800
@@ -75,6 +75,10 @@ MODULE_DESCRIPTION("Areca ARC11xx/12xx/1
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_VERSION(ARCMSR_DRIVER_VERSION);
 
+static int host_can_queue = ARCMSR_DEFAULT_OUTSTANDING_CMD;
+module_param(host_can_queue, int, S_IRUGO);
+MODULE_PARM_DESC(host_can_queue, " adapter queue depth(32 ~ 1024), default is 
128");
+
 #defineARCMSR_SLEEPTIME10
 #defineARCMSR_RETRYCOUNT   12
 
@@ -133,7 +137,7 @@ static struct scsi_host_template arcmsr_
.eh_bus_reset_handler   = arcmsr_bus_reset,
.bios_param = arcmsr_bios_param,
.change_queue_depth = arcmsr_adjust_disk_queue_depth,
-   .can_queue  = ARCMSR_MAX_OUTSTANDING_CMD,
+   .can_queue  = ARCMSR_DEFAULT_OUTSTANDING_CMD,
.this_id= ARCMSR_SCSI_INITIATOR_ID,
.sg_tablesize   = ARCMSR_DEFAULT_SG_ENTRIES, 
.max_sectors= ARCMSR_MAX_XFER_SECTORS_C, 
@@ -877,7 +881,9 @@ static int arcmsr_probe(struct pci_dev *
host->max_lun = ARCMSR_MAX_TARGETLUN;
host->max_id = ARCMSR_MAX_TARGETID; /*16:8*/
host->max_cmd_len = 16; /*this is issue of 
64bit LBA ,over 2T byte*/
-   host->can_queue = ARCMSR_MAX_OUTSTANDING_CMD;
+   if ((host_can_queue < ARCMSR_MIN_OUTSTANDING_CMD) || (host_can_queue > 
ARCMSR_MAX_OUTSTANDING_CMD))
+   host_can_queue = ARCMSR_DEFAULT_OUTSTANDING_CMD;
+   host->can_queue = host_can_queue;   /* max simultaneous cmds */ 

host->cmd_per_lun = ARCMSR_MAX_CMD_PERLUN;  
host->this_id = ARCMSR_SCSI_INITIATOR_ID;
host->unique_id = (bus << 8) | dev_fun;
@@ -3234,11 +3240,11 @@ static bool arcmsr_get_firmware_spec(str
default:
break;
}
-   if (acb->firm_numbers_queue > ARCMSR_MAX_OUTSTANDING_CMD)
-   acb->maxOutstanding = ARCMSR_MAX_OUTSTANDING_CMD;
+   acb->maxOutstanding = acb->firm_numbers_queue - 1;
+   if (acb->host->can_queue >= acb->firm_numbers_queue)
+   acb->host->can_queue = acb->maxOutstanding;
else
-   acb->maxOutstanding = acb->firm_numbers_queue - 1;
-   acb->host->can_queue = acb->maxOutstanding;
+   acb->maxOutstanding = acb->host->can_queue;
acb->maxFreeCCB = acb->host->can_queue;
if (acb->maxFreeCCB < ARCMSR_MAX_FREECCB_NUM)
acb->maxFreeCCB += 64;

[PATCH 7/17] scsi: arcmsr: replace constant ARCMSR_MAX_OUTSTANDING_CMD by variable acb->maxOutstanding that was determined by user

2017-12-05 Thread Ching Huang

From: Ching Huang 

replace constant ARCMSR_MAX_OUTSTANDING_CMD by variable acb->maxOutstanding 
that was determined by user

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:46:50.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:46:30.0 +0800
@@ -1315,7 +1315,7 @@ static void arcmsr_done4abort_postqueue(
/*clear and abort all outbound posted Q*/
writel(outbound_intstatus, ®->outbound_intstatus);/*clear 
interrupt*/
while(((flag_ccb = readl(®->outbound_queueport)) != 
0x)
-   && (i++ < ARCMSR_MAX_OUTSTANDING_CMD)) {
+   && (i++ < acb->maxOutstanding)) {
pARCMSR_CDB = (struct ARCMSR_CDB *)(acb->vir2phy_offset 
+ (flag_ccb << 5));/*frame must be 32 bytes aligned*/
pCCB = container_of(pARCMSR_CDB, struct 
CommandControlBlock, arcmsr_cdb);
error = (flag_ccb & ARCMSR_CCBREPLY_FLAG_ERROR_MODE0) ? 
true : false;
@@ -1345,7 +1345,7 @@ static void arcmsr_done4abort_postqueue(
break;
case ACB_ADAPTER_TYPE_C: {
struct MessageUnit_C __iomem *reg = acb->pmuC;
-   while ((readl(®->host_int_status) & 
ARCMSR_HBCMU_OUTBOUND_POSTQUEUE_ISR) && (i++ < ARCMSR_MAX_OUTSTANDING_CMD)) {
+   while ((readl(®->host_int_status) & 
ARCMSR_HBCMU_OUTBOUND_POSTQUEUE_ISR) && (i++ < acb->maxOutstanding)) {
/*need to do*/
flag_ccb = readl(®->outbound_queueport_low);
ccb_cdb_phy = (flag_ccb & 0xFFF0);
@@ -1421,7 +1421,7 @@ static void arcmsr_remove(struct pci_dev
acb->acb_flags |= ACB_F_SCSISTOPADAPTER;
acb->acb_flags &= ~ACB_F_IOP_INITED;
 
-   for (poll_count = 0; poll_count < ARCMSR_MAX_OUTSTANDING_CMD; 
poll_count++){
+   for (poll_count = 0; poll_count < acb->maxOutstanding; poll_count++){
if (!atomic_read(&acb->ccboutstandingcount))
break;
arcmsr_interrupt(acb);/* FIXME: need spinlock */

[PATCH 8/17] scsi: arcmsr: add driver option cmd_per_lun to set host->cmd_per_lun value by user

2017-12-05 Thread Ching Huang

From: Ching Huang 

add driver option cmd_per_lun to set host->cmd_per_lun value by user

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:26:06.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:26:32.0 +0800
@@ -56,7 +56,9 @@ struct device_attribute;
 #define ARCMSR_MAX_XFER_SECTORS_C  
304
 #define ARCMSR_MAX_TARGETID
17
 #define ARCMSR_MAX_TARGETLUN   
8
-#define ARCMSR_MAX_CMD_PERLUN   
ARCMSR_MAX_OUTSTANDING_CMD
+#define ARCMSR_MAX_CMD_PERLUN  128
+#define ARCMSR_DEFAULT_CMD_PERLUN  32
+#define ARCMSR_MIN_CMD_PERLUN  1
 #define ARCMSR_MAX_QBUFFER 
4096
 #define ARCMSR_DEFAULT_SG_ENTRIES  
38
 #define ARCMSR_MAX_HBB_POSTQUEUE   
264
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:46:30.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:46:18.0 +0800
@@ -79,6 +79,10 @@ static int host_can_queue = ARCMSR_DEFAU
 module_param(host_can_queue, int, S_IRUGO);
 MODULE_PARM_DESC(host_can_queue, " adapter queue depth(32 ~ 1024), default is 
128");
 
+static int cmd_per_lun = ARCMSR_DEFAULT_CMD_PERLUN;
+module_param(cmd_per_lun, int, S_IRUGO);
+MODULE_PARM_DESC(cmd_per_lun, " device queue depth(1 ~ 128), default is 32");
+
 #defineARCMSR_SLEEPTIME10
 #defineARCMSR_RETRYCOUNT   12
 
@@ -141,7 +145,7 @@ static struct scsi_host_template arcmsr_
.this_id= ARCMSR_SCSI_INITIATOR_ID,
.sg_tablesize   = ARCMSR_DEFAULT_SG_ENTRIES, 
.max_sectors= ARCMSR_MAX_XFER_SECTORS_C, 
-   .cmd_per_lun= ARCMSR_MAX_CMD_PERLUN,
+   .cmd_per_lun= ARCMSR_DEFAULT_CMD_PERLUN,
.use_clustering = ENABLE_CLUSTERING,
.shost_attrs= arcmsr_host_attrs,
.no_write_same  = 1,
@@ -884,7 +888,9 @@ static int arcmsr_probe(struct pci_dev *
if ((host_can_queue < ARCMSR_MIN_OUTSTANDING_CMD) || (host_can_queue > 
ARCMSR_MAX_OUTSTANDING_CMD))
host_can_queue = ARCMSR_DEFAULT_OUTSTANDING_CMD;
host->can_queue = host_can_queue;   /* max simultaneous cmds */ 

-   host->cmd_per_lun = ARCMSR_MAX_CMD_PERLUN;  
+   if ((cmd_per_lun < ARCMSR_MIN_CMD_PERLUN) || (cmd_per_lun > 
ARCMSR_MAX_CMD_PERLUN))
+   cmd_per_lun = ARCMSR_DEFAULT_CMD_PERLUN;
+   host->cmd_per_lun = cmd_per_lun;
host->this_id = ARCMSR_SCSI_INITIATOR_ID;
host->unique_id = (bus << 8) | dev_fun;
pci_set_drvdata(pdev, host);

[PATCH 9/17] scsi: arcmsr: add ACB_F_MSG_GET_CONFIG to acb->acb_flags for for message interrupt checking before schedule work for get device map

2017-12-05 Thread Ching Huang

From: Ching Huang 

add ACB_F_MSG_GET_CONFIG to acb->acb_flags for for message interrupt checking 
before schedule work for get device map

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:26:32.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:27:04.0 +0800
@@ -782,6 +782,7 @@ struct AdapterControlBlock
/* iop init */
#define ACB_F_ABORT 0x0200
#define ACB_F_FIRMWARE_TRAP 0x0400
+   #define ACB_F_MSG_GET_CONFIG0x1000
struct CommandControlBlock *
pccb_pool[ARCMSR_MAX_FREECCB_NUM];
/* used for memory free */
struct list_headccb_free_list;
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:46:18.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:46:06.0 +0800
@@ -743,6 +743,7 @@ static void arcmsr_message_isr_bh_fn(str
struct scsi_device *psdev;
char diff, temp;
 
+   acb->acb_flags &= ~ACB_F_MSG_GET_CONFIG;
switch (acb->adapter_type) {
case ACB_ADAPTER_TYPE_A: {
struct MessageUnit_A __iomem *reg  = acb->pmuA;
@@ -2328,7 +2329,8 @@ static void arcmsr_hbaA_message_isr(stru
struct MessageUnit_A __iomem *reg  = acb->pmuA;
/*clear interrupt and message state*/
writel(ARCMSR_MU_OUTBOUND_MESSAGE0_INT, ®->outbound_intstatus);
-   schedule_work(&acb->arcmsr_do_message_isr_bh);
+   if (acb->acb_flags & ACB_F_MSG_GET_CONFIG)
+   schedule_work(&acb->arcmsr_do_message_isr_bh);
 }
 static void arcmsr_hbaB_message_isr(struct AdapterControlBlock *acb)
 {
@@ -2336,7 +2338,8 @@ static void arcmsr_hbaB_message_isr(stru
 
/*clear interrupt and message state*/
writel(ARCMSR_MESSAGE_INT_CLEAR_PATTERN, reg->iop2drv_doorbell);
-   schedule_work(&acb->arcmsr_do_message_isr_bh);
+   if (acb->acb_flags & ACB_F_MSG_GET_CONFIG)
+   schedule_work(&acb->arcmsr_do_message_isr_bh);
 }
 /*
 
**
@@ -2352,7 +2355,8 @@ static void arcmsr_hbaC_message_isr(stru
struct MessageUnit_C __iomem *reg  = acb->pmuC;
/*clear interrupt and message state*/
writel(ARCMSR_HBCMU_IOP2DRV_MESSAGE_CMD_DONE_DOORBELL_CLEAR, 
®->outbound_doorbell_clear);
-   schedule_work(&acb->arcmsr_do_message_isr_bh);
+   if (acb->acb_flags & ACB_F_MSG_GET_CONFIG)
+   schedule_work(&acb->arcmsr_do_message_isr_bh);
 }
 
 static void arcmsr_hbaD_message_isr(struct AdapterControlBlock *acb)
@@ -2361,7 +2365,8 @@ static void arcmsr_hbaD_message_isr(stru
 
writel(ARCMSR_ARC1214_IOP2DRV_MESSAGE_CMD_DONE, reg->outbound_doorbell);
readl(reg->outbound_doorbell);
-   schedule_work(&acb->arcmsr_do_message_isr_bh);
+   if (acb->acb_flags & ACB_F_MSG_GET_CONFIG)
+   schedule_work(&acb->arcmsr_do_message_isr_bh);
 }
 
 static void arcmsr_hbaE_message_isr(struct AdapterControlBlock *acb)
@@ -2369,7 +2374,8 @@ static void arcmsr_hbaE_message_isr(stru
struct MessageUnit_E __iomem *reg  = acb->pmuE;
 
writel(0, ®->host_int_status);
-   schedule_work(&acb->arcmsr_do_message_isr_bh);
+   if (acb->acb_flags & ACB_F_MSG_GET_CONFIG)
+   schedule_work(&acb->arcmsr_do_message_isr_bh);
 }
 
 static int arcmsr_hbaA_handle_isr(struct AdapterControlBlock *acb)
@@ -3826,6 +3832,7 @@ static void arcmsr_hbaA_request_device_m
return;
}
writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, ®->inbound_msgaddr0);
+   acb->acb_flags |= ACB_F_MSG_GET_CONFIG;
mod_timer(&acb->eternal_timer, jiffies + msecs_to_jiffies(6 * 
HZ));
}
return;
@@ -3848,6 +3855,7 @@ static void arcmsr_hbaB_request_device_m
return;
}
writel(ARCMSR_MESSAGE_GET_CONFIG, reg->drv2iop_doorbell);
+   acb->acb_flags |= ACB_F_MSG_GET_CONFIG;
mod_timer(&acb->eternal_timer, jiffies + msecs_to_jiffies(6 * 
HZ));
}
return;
@@ -3871,6 +3879,7 @@ static void arcmsr_hbaC_request_device_m
}
writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, ®->inbound_msgaddr0);
writel(ARCMSR_HBCMU_DRV2IOP_MESSAGE_CMD_DONE, 
®->inbound_doorbell);
+   acb->acb_flags |= ACB_F_MSG_GET_CONFIG;
mod_timer(&acb->eternal_timer, jiffies + msecs_to_jiffies(6 * 
HZ));
}
return;
@@ -3900,6 +3909,7 @@ static void arcmsr_hbaD_request_device_m
}
writel(ARCMSR_INBOUND_MESG0_GET_CONFIG,
reg->inbound_msgaddr0);
+   acb->acb_flags |= ACB_F_MSG_GET_CONFI

[PATCH 10/17] scsi: arcmsr: add a function arcmsr_set_iop_datetime and driver option set_date_time to set date and time to firmware

2017-12-05 Thread Ching Huang

From: Ching Huang 

add a function arcmsr_set_iop_datetime and driver option set_date_time to set 
date and time to firmware

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:27:04.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:27:32.0 +0800
@@ -80,6 +80,8 @@ struct device_attribute;
 #ifndef PCI_DEVICE_ID_ARECA_1884
#define PCI_DEVICE_ID_ARECA_18840x1884
 #endif
+#defineARCMSR_HOURS(1000 * 60 * 60 * 4)
+#defineARCMSR_MINUTES  (1000 * 60 * 60)
 /*
 
**
 **
@@ -280,6 +282,7 @@ struct FIRMWARE_INFO
 #define ARCMSR_MESSAGE_FLUSH_CACHE0x00050008
 /* (ARCMSR_INBOUND_MESG0_START_BGRB<<16)|ARCMSR_DRV2IOP_MESSAGE_CMD_POSTED) */
 #define ARCMSR_MESSAGE_START_BGRB0x00060008
+#define ARCMSR_MESSAGE_SYNC_TIMER  0x00080008
 #define ARCMSR_MESSAGE_START_DRIVER_MODE 0x000E0008
 #define ARCMSR_MESSAGE_SET_POST_WINDOW   0x000F0008
 #define ARCMSR_MESSAGE_ACTIVE_EOI_MODE 0x0018
@@ -837,6 +840,7 @@ struct AdapterControlBlock
uint32_tmaxOutstanding;
int vector_count;
uint32_tmaxFreeCCB;
+   struct timer_list   refresh_timer;
uint32_tdoneq_index;
uint32_tccbsize;
uint32_tin_doorbell;
diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:46:06.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:45:52.0 +0800
@@ -83,6 +83,10 @@ static int cmd_per_lun = ARCMSR_DEFAULT_
 module_param(cmd_per_lun, int, S_IRUGO);
 MODULE_PARM_DESC(cmd_per_lun, " device queue depth(1 ~ 128), default is 32");
 
+static int set_date_time = 0;
+module_param(set_date_time, int, S_IRUGO);
+MODULE_PARM_DESC(set_date_time, " send date, time to iop(0 ~ 1), 
set_date_time=1(enable), default(=0) is disable");
+
 #defineARCMSR_SLEEPTIME10
 #defineARCMSR_RETRYCOUNT   12
 
@@ -125,6 +129,7 @@ static const char *arcmsr_info(struct Sc
 static irqreturn_t arcmsr_interrupt(struct AdapterControlBlock *acb);
 static void arcmsr_free_irq(struct pci_dev *, struct AdapterControlBlock *);
 static void arcmsr_wait_firmware_ready(struct AdapterControlBlock *acb);
+static void arcmsr_set_iop_datetime(struct timer_list *);
 static int arcmsr_adjust_disk_queue_depth(struct scsi_device *sdev, int 
queue_depth)
 {
if (queue_depth > ARCMSR_MAX_CMD_PERLUN)
@@ -852,6 +857,13 @@ out_free_irq:
return FAILED;
 }
 
+static void arcmsr_init_set_datetime_timer(struct AdapterControlBlock *pacb)
+{
+   timer_setup(&pacb->refresh_timer, arcmsr_set_iop_datetime, 0);
+   pacb->refresh_timer.expires = jiffies + msecs_to_jiffies(60 * 1000);
+   add_timer(&pacb->refresh_timer);
+}
+
 static int arcmsr_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
struct Scsi_Host *host;
@@ -941,11 +953,15 @@ static int arcmsr_probe(struct pci_dev *
timer_setup(&acb->eternal_timer, arcmsr_request_device_map, 0);
acb->eternal_timer.expires = jiffies + msecs_to_jiffies(6 * HZ);
add_timer(&acb->eternal_timer);
+   if (set_date_time)
+   arcmsr_init_set_datetime_timer(acb);
if(arcmsr_alloc_sysfs_attr(acb))
goto out_free_sysfs;
scsi_scan_host(host);
return 0;
 out_free_sysfs:
+   if (set_date_time)
+   del_timer_sync(&acb->refresh_timer);
del_timer_sync(&acb->eternal_timer);
flush_work(&acb->arcmsr_do_message_isr_bh);
arcmsr_stop_adapter_bgrb(acb);
@@ -988,6 +1004,8 @@ static int arcmsr_suspend(struct pci_dev
intmask_org = arcmsr_disable_outbound_ints(acb);
arcmsr_free_irq(pdev, acb);
del_timer_sync(&acb->eternal_timer);
+   if (set_date_time)
+   del_timer_sync(&acb->refresh_timer);
flush_work(&acb->arcmsr_do_message_isr_bh);
arcmsr_stop_adapter_bgrb(acb);
arcmsr_flush_adapter_cache(acb);
@@ -1032,6 +1050,8 @@ static int arcmsr_resume(struct pci_dev 
timer_setup(&acb->eternal_timer, arcmsr_request_device_map, 0);
acb->eternal_timer.expires = jiffies + msecs_to_jiffies(6 * HZ);
add_timer(&acb->eternal_timer);
+   if (set_date_time)
+   arcmsr_init_set_datetime_timer(acb);
return 0;
 controller_stop:
arcmsr_stop_adapter_bgrb(acb);
@@ -1422,6 +1442,8 @@ static void arcmsr_remove(struct pci_dev
scsi_remove_host(host);
flush_work(&acb->arcmsr_do_message_isr_bh);
del_timer_sync(&acb->eternal_timer);
+   if (set_date_time)
+   del_timer_sync(&acb

[PATCH 11/17] scsi: arcmsr: fix clear doorbell queue on ACB_ADAPTER_TYPE_B controller

2017-12-05 Thread Ching Huang

From: Ching Huang 

fix clear doorbell queue on ACB_ADAPTER_TYPE_B controller

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:45:52.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:45:34.0 +0800
@@ -4196,10 +4196,19 @@ static void arcmsr_clear_doorbell_queue_
 
case ACB_ADAPTER_TYPE_B: {
struct MessageUnit_B *reg = acb->pmuB;
-   /*clear interrupt and message state*/
-   writel(ARCMSR_MESSAGE_INT_CLEAR_PATTERN, reg->iop2drv_doorbell);
+   uint32_t outbound_doorbell, i;
+   writel(ARCMSR_DOORBELL_INT_CLEAR_PATTERN, 
reg->iop2drv_doorbell);
writel(ARCMSR_DRV2IOP_DATA_READ_OK, reg->drv2iop_doorbell);
/* let IOP know data has been read */
+   for(i=0; i < 200; i++) {
+   msleep(20);
+   outbound_doorbell = readl(reg->iop2drv_doorbell);
+   if( outbound_doorbell & ARCMSR_IOP2DRV_DATA_WRITE_OK) {
+   writel(ARCMSR_DOORBELL_INT_CLEAR_PATTERN, 
reg->iop2drv_doorbell);
+   writel(ARCMSR_DRV2IOP_DATA_READ_OK, 
reg->drv2iop_doorbell);
+   } else
+   break;
+   }
}
break;
case ACB_ADAPTER_TYPE_C: {

[PATCH 12/17] scsi: arcmsr: spin off duplicate code of timer init for message isr BH in arcmsr_probe and arcmsr_resume as a function

2017-12-05 Thread Ching Huang

From: Ching Huang 

spin off duplicate code of timer init for message isr BH in arcmsr_probe and 
arcmsr_resume as a function arcmsr_init_get_devmap_timer

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:45:34.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:45:06.0 +0800
@@ -857,6 +857,17 @@ out_free_irq:
return FAILED;
 }
 
+static void arcmsr_init_get_devmap_timer(struct AdapterControlBlock *pacb)
+{
+   INIT_WORK(&pacb->arcmsr_do_message_isr_bh, arcmsr_message_isr_bh_fn);
+   atomic_set(&pacb->rq_map_token, 16);
+   atomic_set(&pacb->ante_token_value, 16);
+   pacb->fw_flag = FW_NORMAL;
+   timer_setup(&pacb->eternal_timer, arcmsr_request_device_map, 0);
+   pacb->eternal_timer.expires = jiffies + msecs_to_jiffies(6 * HZ);
+   add_timer(&pacb->eternal_timer);
+}
+
 static void arcmsr_init_set_datetime_timer(struct AdapterControlBlock *pacb)
 {
timer_setup(&pacb->refresh_timer, arcmsr_set_iop_datetime, 0);
@@ -946,13 +957,7 @@ static int arcmsr_probe(struct pci_dev *
if (arcmsr_request_irq(pdev, acb) == FAILED)
goto scsi_host_remove;
arcmsr_iop_init(acb);
-   INIT_WORK(&acb->arcmsr_do_message_isr_bh, arcmsr_message_isr_bh_fn);
-   atomic_set(&acb->rq_map_token, 16);
-   atomic_set(&acb->ante_token_value, 16);
-   acb->fw_flag = FW_NORMAL;
-   timer_setup(&acb->eternal_timer, arcmsr_request_device_map, 0);
-   acb->eternal_timer.expires = jiffies + msecs_to_jiffies(6 * HZ);
-   add_timer(&acb->eternal_timer);
+   arcmsr_init_get_devmap_timer(acb);
if (set_date_time)
arcmsr_init_set_datetime_timer(acb);
if(arcmsr_alloc_sysfs_attr(acb))
@@ -1043,13 +1048,7 @@ static int arcmsr_resume(struct pci_dev 
if (arcmsr_request_irq(pdev, acb) == FAILED)
goto controller_stop;
arcmsr_iop_init(acb);
-   INIT_WORK(&acb->arcmsr_do_message_isr_bh, arcmsr_message_isr_bh_fn);
-   atomic_set(&acb->rq_map_token, 16);
-   atomic_set(&acb->ante_token_value, 16);
-   acb->fw_flag = FW_NORMAL;
-   timer_setup(&acb->eternal_timer, arcmsr_request_device_map, 0);
-   acb->eternal_timer.expires = jiffies + msecs_to_jiffies(6 * HZ);
-   add_timer(&acb->eternal_timer);
+   arcmsr_init_get_devmap_timer(acb);
if (set_date_time)
arcmsr_init_set_datetime_timer(acb);
return 0;

Re: [PATCH] SCSI: delay run queue if device is blocked in scsi_dev_queue_ready()

2017-12-05 Thread Holger Hoffstätte

On Tue, 05 Dec 2017 06:45:08 +0800, Ming Lei wrote:

> On Mon, Dec 04, 2017 at 03:09:20PM +, Bart Van Assche wrote:
>> On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:
>> > Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for 
>> > blk-mq")
>> 
>> It might be safer to revert commit 0df21c86bdbf instead of trying to fix all
>> issues introduced by that commit for kernel version v4.15 ...
> 
> What are all issues in v4.15-rc? Up to now, it is the only issue reported,
> and can be fixed by this simple patch, which one can be thought as cleanup
> too.

Even with this patch I've encountered at least one hang that
seemed related. I'm using most of block/scsi-4.15 on top of 4.14 and
the hang in question was on a rotating disk. It could be solved by activating
a different scheduler on the hanging device; all hanging sync/df processes got
unstuck and all was fine again, which leads me to believe that there is at least
one more rare condition where delaying requests (as done in the budget patch)
leads to a hang.

This happened with mq-deadline which I was testing specifically to avoid
any BFQ-related side effects.

I didn't do anything specific to trigger the hang and have not been able
to reproduce it during regular usage.

-h

[PATCH 13/17] scsi: arcmsr: adjust some tab or white-space to make text alignment

2017-12-05 Thread Ching Huang

From: Ching Huang 

adjust some tab or white-space to make text alignment

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:28:50.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:29:38.0 +0800
@@ -50,35 +50,35 @@ struct device_attribute;
 #define ARCMSR_DEFAULT_OUTSTANDING_CMD 128
 #define ARCMSR_MIN_OUTSTANDING_CMD 32
 #define ARCMSR_DRIVER_VERSION  "v1.30.00.22-20151126"
-#define ARCMSR_SCSI_INITIATOR_ID   
255
-#define ARCMSR_MAX_XFER_SECTORS
512
-#define ARCMSR_MAX_XFER_SECTORS_B  
4096
-#define ARCMSR_MAX_XFER_SECTORS_C  
304
-#define ARCMSR_MAX_TARGETID
17
-#define ARCMSR_MAX_TARGETLUN   
8
+#define ARCMSR_SCSI_INITIATOR_ID   255
+#define ARCMSR_MAX_XFER_SECTORS512
+#define ARCMSR_MAX_XFER_SECTORS_B  4096
+#define ARCMSR_MAX_XFER_SECTORS_C  304
+#define ARCMSR_MAX_TARGETID17
+#define ARCMSR_MAX_TARGETLUN   8
 #define ARCMSR_MAX_CMD_PERLUN  128
 #define ARCMSR_DEFAULT_CMD_PERLUN  32
 #define ARCMSR_MIN_CMD_PERLUN  1
-#define ARCMSR_MAX_QBUFFER 
4096
-#define ARCMSR_DEFAULT_SG_ENTRIES  
38
-#define ARCMSR_MAX_HBB_POSTQUEUE   
264
+#define ARCMSR_MAX_QBUFFER 4096
+#define ARCMSR_DEFAULT_SG_ENTRIES  38
+#define ARCMSR_MAX_HBB_POSTQUEUE   264
 #define ARCMSR_MAX_ARC1214_POSTQUEUE   256
 #define ARCMSR_MAX_ARC1214_DONEQUEUE   257
 #define ARCMSR_MAX_HBE_DONEQUEUE   512
-#define ARCMSR_MAX_XFER_LEN
0x26000 /* 152K */
-#define ARCMSR_CDB_SG_PAGE_LENGTH  
256 
+#define ARCMSR_MAX_XFER_LEN0x26000 /* 152K */
+#define ARCMSR_CDB_SG_PAGE_LENGTH  256 
 #define ARCMST_NUM_MSIX_VECTORS4
 #ifndef PCI_DEVICE_ID_ARECA_1880
-#define PCI_DEVICE_ID_ARECA_1880 0x1880
- #endif
+#define PCI_DEVICE_ID_ARECA_1880   0x1880
+#endif
 #ifndef PCI_DEVICE_ID_ARECA_1214
-   #define PCI_DEVICE_ID_ARECA_12140x1214
+#define PCI_DEVICE_ID_ARECA_1214   0x1214
 #endif
 #ifndef PCI_DEVICE_ID_ARECA_1203
-   #define PCI_DEVICE_ID_ARECA_12030x1203
+#define PCI_DEVICE_ID_ARECA_1203   0x1203
 #endif
 #ifndef PCI_DEVICE_ID_ARECA_1884
-   #define PCI_DEVICE_ID_ARECA_18840x1884
+#define PCI_DEVICE_ID_ARECA_1884   0x1884
 #endif
 #defineARCMSR_HOURS(1000 * 60 * 60 * 4)
 #defineARCMSR_MINUTES  (1000 * 60 * 60)
@@ -87,15 +87,15 @@ struct device_attribute;
 **
 
**
 */
-#define ARC_SUCCESS   0
-#define ARC_FAILURE   1
+#define ARC_SUCCESS0
+#define ARC_FAILURE1
 /*
 ***
 **split 64bits dma addressing
 ***
 */
-#define dma_addr_hi32(addr)   (uint32_t) ((addr>>16)>>16)
-#define dma_addr_lo32(addr)   (uint32_t) (addr & 0x)
+#define dma_addr_hi32(addr)(uint32_t) ((addr>>16)>>16)
+#define dma_addr_lo32(addr)(uint32_t) (addr & 0x)
 /*
 ***
 **MESSAGE CONTROL CODE
@@ -135,7 +135,7 @@ struct CMD_MESSAGE_FIELD
 #define FUNCTION_SAY_HELLO 0x0807
 #define FUNCTION_SAY_GOODBYE   0x0808
 #define FUNCTION_FLUSH_ADAPTER_CACHE   0x0809
-#define FUNCTION_GET_FIRMWARE_STATUS   0x080A
+#define FUNCTION_GET_FIRMWARE_STATUS   0x080A
 #define FUNCTION_HARDWARE_RESET0x080B
 /* ARECA IO CONTROL CODE*/
 #define ARCMSR_MESSAGE_READ_RQBUFFER   \
@@ -166,18 +166,18 @@ struct CMD_MESSAGE_FIELD
 **   structure for holding DMA address data
 *
 */
-#define IS_DMA64   (sizeof(dma_addr_t) == 8)
-#define IS_SG64_ADDR0x0100 /* bit24 */
+#define IS_DMA64   (sizeof(dma_addr_t) == 8)
+#define IS_SG64_ADDR   0x0100 /* bit24 */
 struct  SG32ENTRY
 {
-   __le32  length;
-   __le32  address;
+   __le32  length;
+   __le32  address;
 }__att

[PATCH 14/17] scsi: arcmsr: fix grammar error

2017-12-05 Thread Ching Huang

From: Ching Huang 

fix grammar error.

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:44:52.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:44:36.0 +0800
@@ -4453,7 +4453,7 @@ static int arcmsr_bus_reset(struct scsi_
 
if (acb->acb_flags & ACB_F_BUS_RESET) {
long timeout;
-   pr_notice("arcmsr: there is an bus reset eh proceeding...\n");
+   pr_notice("arcmsr: there is a bus reset eh proceeding...\n");
timeout = wait_event_timeout(wait_q, (acb->acb_flags
& ACB_F_BUS_RESET) == 0, 220 * HZ);
if (timeout)

Re: [PATCH v3 1/2] Ensure that the SCSI error handler gets woken up

2017-12-05 Thread Pavel Tikhomirov


On 12/04/2017 09:06 PM, Bart Van Assche wrote:

If scsi_eh_scmd_add() is called concurrently with
scsi_host_queue_ready() while shost->host_blocked > 0 then it can
happen that neither function wakes up the SCSI error handler. Fix
this by making every function that decreases the host_busy counter
wake up the error handler if necessary and by protecting the
host_failed checks with the SCSI host lock.

Reported-by: Pavel Tikhomirov 
References: https://marc.info/?l=linux-kernel&m=150461610630736
Fixes: commit 746650160866 ("scsi: convert host_busy to atomic_t")
Signed-off-by: Bart Van Assche  > Cc: Konstantin Khorenko 

Cc: Stuart Hayes 
Cc: Pavel Tikhomirov 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Johannes Thumshirn 
Cc: 
---
  drivers/scsi/hosts.c  |  6 ++
  drivers/scsi/scsi_error.c | 18 --
  drivers/scsi/scsi_lib.c   | 39 ---
  include/scsi/scsi_host.h  |  2 ++
  4 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index a306af6a5ea7..a0a7e4ff255c 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -324,6 +324,9 @@ static void scsi_host_dev_release(struct device *dev)
  
  	scsi_proc_hostdir_rm(shost->hostt);
  
+	/* Wait for functions invoked through call_rcu(&shost->rcu, ...) */

+   rcu_barrier();
+
if (shost->tmf_work_q)
destroy_workqueue(shost->tmf_work_q);
if (shost->ehandler)
@@ -331,6 +334,8 @@ static void scsi_host_dev_release(struct device *dev)
if (shost->work_q)
destroy_workqueue(shost->work_q);
  
+	destroy_rcu_head(&shost->rcu);

+
if (shost->shost_state == SHOST_CREATED) {
/*
 * Free the shost_dev device name here if scsi_host_alloc()
@@ -399,6 +404,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template 
*sht, int privsize)
INIT_LIST_HEAD(&shost->starved_list);
init_waitqueue_head(&shost->host_wait);
mutex_init(&shost->scan_mutex);
+   init_rcu_head(&shost->rcu);
  
  	index = ida_simple_get(&host_index_ida, 0, 0, GFP_KERNEL);

if (index < 0)
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 5e89049e9b4e..258b8a741992 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -226,6 +226,17 @@ static void scsi_eh_reset(struct scsi_cmnd *scmd)
}
  }
  
+static void scsi_eh_inc_host_failed(struct rcu_head *head)

+{
+   struct Scsi_Host *shost = container_of(head, typeof(*shost), rcu);
+   unsigned long flags;
+
+   spin_lock_irqsave(shost->host_lock, flags);
+   shost->host_failed++;


May be we need to increment host_failed before call_rcu(), so that all 
rcu protected readers already see a change at these point?



+   scsi_eh_wakeup(shost);
+   spin_unlock_irqrestore(shost->host_lock, flags);
+}
+
  /**
   * scsi_eh_scmd_add - add scsi cmd to error handling.
   * @scmd: scmd to run eh on.
@@ -248,9 +259,12 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
  
  	scsi_eh_reset(scmd);

list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
-   shost->host_failed++;
-   scsi_eh_wakeup(shost);
spin_unlock_irqrestore(shost->host_lock, flags);
+   /*
+* Ensure that all tasks observe the host state change before the
+* host_failed change.
+*/
+   call_rcu(&shost->rcu, scsi_eh_inc_host_failed);
  }
  
  /**

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b6d3842b6809..5cbc69b2b1ae 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -318,22 +318,39 @@ static void scsi_init_cmd_errh(struct scsi_cmnd *cmd)
cmd->cmd_len = scsi_command_size(cmd->cmnd);
  }
  
-void scsi_device_unbusy(struct scsi_device *sdev)

+/*
+ * Decrement the host_busy counter and wake up the error handler if necessary.
+ * Avoid as follows that the error handler is not woken up if shost->host_busy
+ * == shost->host_failed: use call_rcu() in scsi_eh_scmd_add() in combination
+ * with an RCU read lock in this function to ensure that this function in its
+ * entirety either finishes before scsi_eh_scmd_add() increases the
+ * host_failed counter or that it notices the shost state change made by
+ * scsi_eh_scmd_add().
+ */
+static void scsi_dec_host_busy(struct Scsi_Host *shost)
  {
-   struct Scsi_Host *shost = sdev->host;
-   struct scsi_target *starget = scsi_target(sdev);
unsigned long flags;
  
+	rcu_read_lock();

atomic_dec(&shost->host_busy);
-   if (starget->can_queue > 0)
-   atomic_dec(&starget->target_busy);
-
-   if (unlikely(scsi_host_in_recovery(shost) &&
-(shost->host_failed || shost->host_eh_scheduled))) {
+   if (unlikely(scsi_host_in_recovery(shost))) {
spin_lock_irqsave(shost->host_lock, flags);
-   scsi_eh_wakeup(shost);
+   if (shost->host_failed ||

[PATCH 15/17] scsi: arcmsr: Add driver module parameter msi_enable

2017-12-05 Thread Ching Huang

From: Ching Huang 

Add module parameter msi_enable to has a chance to disable msi interrupt if 
between controller and system has
msi INT compatible issue.

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 11:44:36.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 16:59:58.0 +0800
@@ -75,6 +75,10 @@ MODULE_DESCRIPTION("Areca ARC11xx/12xx/1
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_VERSION(ARCMSR_DRIVER_VERSION);
 
+static int msi_enable = 1;
+module_param(msi_enable, int, S_IRUGO);
+MODULE_PARM_DESC(msi_enable, "Enable MSI interrupt(0 ~ 1), 
msi_enable=1(enable), =0(disable)");
+
 static int host_can_queue = ARCMSR_DEFAULT_OUTSTANDING_CMD;
 module_param(host_can_queue, int, S_IRUGO);
 MODULE_PARM_DESC(host_can_queue, " adapter queue depth(32 ~ 1024), default is 
128");
@@ -831,11 +835,17 @@ arcmsr_request_irq(struct pci_dev *pdev,
pr_info("arcmsr%d: msi-x enabled\n", acb->host->host_no);
flags = 0;
} else {
-   nvec = pci_alloc_irq_vectors(pdev, 1, 1,
-   PCI_IRQ_MSI | PCI_IRQ_LEGACY);
+   if (msi_enable == 1) {
+   nvec = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
+   if (nvec == 1) {
+   dev_info(&pdev->dev, "msi enabled\n");
+   goto msi_int1;
+   }
+   }
+   nvec = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_LEGACY);
if (nvec < 1)
return FAILED;
-
+msi_int1:
flags = IRQF_SHARED;
}

[PATCH 16/17] scsi: arcmsr: Add driver module parameter msix_enable

2017-12-05 Thread Ching Huang

From: Ching Huang 

Add module parameter msix_enable to has a chance to disable msix interrupt if 
between controller and system has
msix INT compatible issue.

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 16:59:58.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 17:04:52.0 +0800
@@ -75,6 +75,10 @@ MODULE_DESCRIPTION("Areca ARC11xx/12xx/1
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_VERSION(ARCMSR_DRIVER_VERSION);
 
+static int msix_enable = 1;
+module_param(msix_enable, int, S_IRUGO);
+MODULE_PARM_DESC(msix_enable, "Enable MSI-X interrupt(0 ~ 1), 
msix_enable=1(enable), =0(disable)");
+
 static int msi_enable = 1;
 module_param(msi_enable, int, S_IRUGO);
 MODULE_PARM_DESC(msi_enable, "Enable MSI interrupt(0 ~ 1), 
msi_enable=1(enable), =0(disable)");
@@ -829,12 +833,15 @@ arcmsr_request_irq(struct pci_dev *pdev,
unsigned long flags;
int nvec, i;
 
+   if (msix_enable == 0)
+   goto msi_int0;
nvec = pci_alloc_irq_vectors(pdev, 1, ARCMST_NUM_MSIX_VECTORS,
PCI_IRQ_MSIX);
if (nvec > 0) {
pr_info("arcmsr%d: msi-x enabled\n", acb->host->host_no);
flags = 0;
} else {
+msi_int0:
if (msi_enable == 1) {
nvec = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
if (nvec == 1) {

[PATCH 17/17] scsi: arcmsr: Update driver version to v1.40.00.04-20171130

2017-12-05 Thread Ching Huang

From: Ching Huang 

Update driver version to v1.40.00.04-20171130

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
--- a/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 15:30:30.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr.h  2017-12-05 09:18:58.0 +0800
@@ -49,7 +49,7 @@ struct device_attribute;
 #define ARCMSR_MAX_OUTSTANDING_CMD 1024
 #define ARCMSR_DEFAULT_OUTSTANDING_CMD 128
 #define ARCMSR_MIN_OUTSTANDING_CMD 32
-#define ARCMSR_DRIVER_VERSION  "v1.30.00.22-20151126"
+#define ARCMSR_DRIVER_VERSION  "v1.40.00.04-20171130"
 #define ARCMSR_SCSI_INITIATOR_ID   255
 #define ARCMSR_MAX_XFER_SECTORS512
 #define ARCMSR_MAX_XFER_SECTORS_B  4096

[PATCH 18/17] scsi: arcmsr: Fix report command result error when CHECK_CONDITION

2017-12-05 Thread Ching Huang

From: Ching Huang 

Fix report command result error when CHECK_CONDITION.

Signed-off-by: Ching Huang 
---

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 17:04:52.0 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c  2017-12-05 19:17:28.0 +0800
@@ -1205,7 +1205,7 @@ static void arcmsr_report_sense_info(str
 
struct scsi_cmnd *pcmd = ccb->pcmd;
struct SENSE_DATA *sensebuffer = (struct SENSE_DATA 
*)pcmd->sense_buffer;
-   pcmd->result = DID_OK << 16;
+   pcmd->result = (DID_OK << 16) | (CHECK_CONDITION << 1);
if (sensebuffer) {
int sense_data_length =
sizeof(struct SENSE_DATA) < SCSI_SENSE_BUFFERSIZE
@@ -1214,6 +1214,7 @@ static void arcmsr_report_sense_info(str
memcpy(sensebuffer, ccb->arcmsr_cdb.SenseData, 
sense_data_length);
sensebuffer->ErrorCode = SCSI_SENSE_CURRENT_ERRORS;
sensebuffer->Valid = 1;
+   pcmd->result |= (DRIVER_SENSE << 24);
}
 }

Re: [PATCH] SCSI: delay run queue if device is blocked in scsi_dev_queue_ready()

2017-12-05 Thread Holger Hoffstätte

On 12/05/17 06:16, Ming Lei wrote:
> On Mon, Dec 04, 2017 at 11:48:07PM +, Holger Hoffstätte wrote:
>> On Tue, 05 Dec 2017 06:45:08 +0800, Ming Lei wrote:
>>
>>> On Mon, Dec 04, 2017 at 03:09:20PM +, Bart Van Assche wrote:
 On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:
> Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for 
> blk-mq")

 It might be safer to revert commit 0df21c86bdbf instead of trying to fix 
 all
 issues introduced by that commit for kernel version v4.15 ...
>>>
>>> What are all issues in v4.15-rc? Up to now, it is the only issue reported,
>>> and can be fixed by this simple patch, which one can be thought as cleanup
>>> too.
>>
>> Even with this patch I've encountered at least one hang that
>> seemed related. I'm using most of block/scsi-4.15 on top of 4.14 and
>> the hang in question was on a rotating disk. It could be solved by activating
>> a different scheduler on the hanging device; all hanging sync/df processes 
>> got
>> unstuck and all was fine again, which leads me to believe that there is at 
>> least
>> one more rare condition where delaying requests (as done in the budget patch)
>> leads to a hang.
>>
>> This happened with mq-deadline which I was testing specifically to avoid
>> any BFQ-related side effects.
> 
> OK, this looks a new report.
> 
> Without any log, we can't make any progress, and even we can't guess
> what the issue is related with.

Considering that you just had an idea about a corner case and posted v2
of the patch, it's safe to say that we actually can...which is why I
described the situation exactly the way I did. :)

I did try to get stack traces but all the hanging processes were
simply stuck on the device mutex (deep inside btrfs), so nothing too
helpful.

> Could you post your dmesg log(include the hang process stack trace)? And
> dump the debugfs log by the following script when this hang happens?
> 
>   http://people.redhat.com/minlei/tests/tools/dump-blk-info
> 
> BTW, you just need to pass the disk name to the script, such as: /dev/sda.

Thanks for the script. I'm now running with the new patch and will see what
happens.

cheers,
Holger

Re: [PATCH] scsi: fix race condition when removing target

2017-12-05 Thread Jason Yan




On 2017/12/1 23:35, James Bottomley wrote:

On Fri, 2017-12-01 at 16:40 +0800, Jason Yan wrote:

On 2017/12/1 7:56, James Bottomley wrote:

b/include/scsi/scsi_device.h
index 571ddb49b926..2e4d48d8cd68 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -380,6 +380,23 @@ extern struct scsi_device
*__scsi_iterate_devices(struct Scsi_Host *,
   #define __shost_for_each_device(sdev, shost) \
list_for_each_entry((sdev), &((shost)->__devices),
siblings)



Seems that __shost_for_each_device() is still not safe. scsi device
been deleted stays in the list and put_device() can be called
anywhere out of the host lock.


Not if it's used with scsi_get_device().  As I said, I only did a
cursory inspectiont, so if I've missed a loop, please specify.

The point was more a demonstration of how we could fix the problem if
we don't change get_device().

James



Yes, it's OK now. __shost_for_each_device() is not used with
scsi_get_device() yet.

Another problem is that put_device() cannot be called while holding the
host lock, so we need to remove all put_device() out of the lock. Some
places like scsi_device_lookup() and scsi_device_lookup_by_target() need 
rework:


@@ -765,12 +772,22 @@ struct scsi_device *scsi_device_lookup(struct 
Scsi_Host *shost,

unsigned long flags;

spin_lock_irqsave(shost->host_lock, flags);
-   sdev = __scsi_device_lookup(shost, channel, id, lun);
-   if (sdev && scsi_device_get(sdev))
-   sdev = NULL;
+   __sdev_for_each_get(sdev, &shost->__devices, siblings) {
+   spin_unlock_irqrestore(shost->host_lock, flags);
+   if (sdev->sdev_state != SDEV_DEL &&
+   sdev->channel == channel && sdev->id == id &&
+   sdev->lun ==lun) {
+   if (!scsi_device_get(sdev)) {
+   put_device(&sdev->sdev_gendev);
+   return sdev;
+   }
+   }
+   put_device(&sdev->sdev_gendev);
+   spin_lock_irqsave(shost->host_lock, flags);
+   }
spin_unlock_irqrestore(shost->host_lock, flags);

-   return sdev;
+   return NULL;
 }
 EXPORT_SYMBOL(scsi_device_lookup);

Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle

2017-12-05 Thread Johannes Thumshirn

[ +Cc Omar ]

Ming Lei  writes:
> Before commit 0df21c86bdbf ("scsi: implement .get_budget and .put_budget
> for blk-mq"), we run queue after 3ms if queue is idle and SCSI device
> queue isn't ready, which is done in handling BLK_STS_RESOURCE. After
> commit 0df21c86bdbf is introduced, queue won't be run any more under
> this situation.
>
> IO hang is observed when timeout happened, and this patch fixes the IO
> hang issue by running queue after delay in scsi_dev_queue_ready, just like
> non-mq. This issue can be triggered by the following script[1].
>
> There is another issue which can be covered by running idle queue:
> when .get_budget() is called on request coming from hctx->dispatch_list,
> if one request just completes during .get_budget(), we can't depend on
> SCSI's restart to make progress any more. This patch fixes the race too.
>
> With this patch, we basically recover to previous behaviour(before commit
> 0df21c86bdbf) of handling idle queue when running out of resource.
>
> [1] script for test/verify SCSI timeout
> rmmod scsi_debug
> modprobe scsi_debug max_queue=1
>
> DEVICE=`ls -d 
> /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 
> | xargs basename`
> DISK_DIR=`ls -d /sys/block/$DEVICE/device/scsi_disk/*`
>
> echo "using scsi device $DEVICE"
> echo "-1" >/sys/bus/pseudo/drivers/scsi_debug/every_nth
> echo "temporary write through" >$DISK_DIR/cache_type
> echo "128" >/sys/bus/pseudo/drivers/scsi_debug/opts
> echo none > /sys/block/$DEVICE/queue/scheduler
> dd if=/dev/$DEVICE of=/dev/null bs=1M iflag=direct count=1 &
> sleep 5
> echo "0" >/sys/bus/pseudo/drivers/scsi_debug/opts
> wait
> echo "SUCCESS"

SO I turned the above into a blktest but have found some shortcommings
of my bash skills. Maybe you or Omar has a soution for it:

--- 8< ---
>From 80e5810011d52bc188cd858962ce202bfd4dbee5 Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn 
Date: Tue, 5 Dec 2017 15:21:08 +0100
Subject: [PATCH blktests] block/013: add test for scsi_device queue starvation

Add a test for Ming Lei's patch titled "SCSI: run queue if SCSI device
queue isn't ready and queue is idle"

Signed-off-by: Johannes Thumshirn 

---
This test case has two shortcommings, which need to be addressed I'm
just lacking a bit of the shell magic to address them properly.

1) Testing without the patch applied hangs the test forever as it
   doesn't get killed after a specific timeout (I think this should be
   solved in a common function).
2) It has a nasty sleep at it's end to wait for scsi_debug's refcounts
   to drop to 0 before removing the module or removing will fail and thus
   the test case. This as well should be solved in a more generic way.
---
 tests/block/013 | 63 +
 tests/block/013.out |  2 ++
 2 files changed, 65 insertions(+)
 create mode 100755 tests/block/013
 create mode 100644 tests/block/013.out

diff --git a/tests/block/013 b/tests/block/013
new file mode 100755
index ..f73724fc9ed2
--- /dev/null
+++ b/tests/block/013
@@ -0,0 +1,63 @@
+#!/bin/bash
+#
+# Regression test for patch "SCSI: delay run queue if device is
+# blocked in scsi_dev_queue_ready()"
+#
+# Copyright (C) 2017 Johannes Thumshirn
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+
+. common/scsi_debug
+
+DESCRIPTION="Test if a SCSI device's queue can be run if it isn't ready but 
the device is idle"
+TIMED=1
+
+requires() {
+   _have_scsi_debug && _have_module sd_mod && \
+   grep -q Y /sys/module/scsi_mod/parameters/use_blk_mq
+}
+
+test_one_device()
+{
+local device=$1
+
+echo "-1" > /sys/bus/pseudo/drivers/scsi_debug/every_nth
+echo "temporary write through" > \
+   /sys/block/"${device}"/device/scsi_disk/"$(basename $(readlink 
/sys/block/${device}/device))"/cache_type
+echo "128" > /sys/bus/pseudo/drivers/scsi_debug/opts
+echo "none" > /sys/block/${device}/queue/scheduler
+dd if=/dev/"${device}" of=/dev/null bs=1M iflag=direct \
+   count=1 2> /dev/null &
+sleep 5
+echo 0 > /sys/bus/pseudo/drivers/scsi_debug/opts
+wait
+}
+
+test() {
+   echo "Running ${TEST_NAME}"
+
+   if ! _init_scsi_debug statistics=1 max_queue=1; then
+   return
+   fi
+
+   local device
+   for device in "${SCSI_DEBUG_DEVICES[@]}"; do
+   test_one_device ${device}   
+   done
+
+

Re: [PATCH] scsi: fix race condition when removing target

2017-12-05 Thread James Bottomley

On Tue, 2017-12-05 at 20:37 +0800, Jason Yan wrote:
> 
> On 2017/12/1 23:35, James Bottomley wrote:
> > 
> > On Fri, 2017-12-01 at 16:40 +0800, Jason Yan wrote:
> > > 
> > > On 2017/12/1 7:56, James Bottomley wrote:
> > > > 
> > > > b/include/scsi/scsi_device.h
> > > > index 571ddb49b926..2e4d48d8cd68 100644
> > > > --- a/include/scsi/scsi_device.h
> > > > +++ b/include/scsi/scsi_device.h
> > > > @@ -380,6 +380,23 @@ extern struct scsi_device
> > > > *__scsi_iterate_devices(struct Scsi_Host *,
> > > >    #define __shost_for_each_device(sdev, shost) \
> > > >     list_for_each_entry((sdev), &((shost)->__devices),
> > > > siblings)
> > > > 
> > > 
> > > Seems that __shost_for_each_device() is still not safe. scsi
> > > device
> > > been deleted stays in the list and put_device() can be called
> > > anywhere out of the host lock.
> > 
> > Not if it's used with scsi_get_device().  As I said, I only did a
> > cursory inspectiont, so if I've missed a loop, please specify.
> > 
> > The point was more a demonstration of how we could fix the problem
> > if we don't change get_device().
> > 
> > James
> > 
> 
> Yes, it's OK now. __shost_for_each_device() is not used with
> scsi_get_device() yet.
> 
> Another problem is that put_device() cannot be called while holding
> the host lock,

Yes it can.  That's one of the design goals of the execute in process
context: you can call it from interrupt context and you can call it
with locks held and we'll return immediately and delay all the
dangerous stuff until we have a process context.

To get the process context to be acquired, the in_interrupt() test must
pass (so the spin lock must be acquired irqsave) ; is that condition
missing anywhere?

James

Re: UFS utilities

2017-12-05 Thread Bean Huo (beanhuo)

Hi, greg k-h

>
>So what UFS commands are you missing that you need to see implemented?
>
>And again, have you checked the different forks of the driver?
>

Seems there is something misunderstood, I want to use UPIU, rather than CDB.
Maybe it is not possible based on current UFS stacks. Of course, exactly, 
there is no missing SCSI command listed in UFS 2.1.

>> >> And also it doesn't support several UFS special command.
>> >
>> >Are you referring to SCSI commands or rather to UFS commands that
>> >fall outside the SCSI spec? Anyway, an approach that is used by many
>> >SCSI drivers to export information to user space that falls outside
>> >the SCSI spec is to create additional sysfs attributes. See also the
>> >sdev_attrs and shost_attrs members of struct scsi_host_template.
>> >
>> Yes, for the UFS information, I can use these interface/approach to easily
>get.
>> I am thinking how about some testing case and configuration operation.
>
>Which ones exactly?
>
>> Also, is it possible bypass SCSI stacks and go into directly UFS stack?
>
>Look at the different sysfs files for the UFS device, it does that for some
>commands.
>
To be honest, I don't know which interface, it can pass UPIU to UFS driver,
And bypass SCSI stacks.

Thanks.
Bean Huo

Re: UFS utilities

2017-12-05 Thread gre...@linuxfoundation.org

On Tue, Dec 05, 2017 at 03:44:19PM +, Bean Huo (beanhuo) wrote:
> Hi, greg k-h
> 
> >
> >So what UFS commands are you missing that you need to see implemented?
> >
> >And again, have you checked the different forks of the driver?
> >
> 
> Seems there is something misunderstood, I want to use UPIU, rather than CDB.
> Maybe it is not possible based on current UFS stacks. Of course, exactly, 
> there is no missing SCSI command listed in UFS 2.1.

Again, have you looked at the different forks out there?  I bet you can
already do this...

If not, I'm sure simple patches should be able to add the missing
functionality, it would be good to get a solid UFS driver one of these
days in the tree :)

thanks,

greg k-h

Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle

2017-12-05 Thread Bart Van Assche

On Tue, 2017-12-05 at 15:52 +0800, Ming Lei wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index db9556662e27..1816dd8259b3 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx 
> *hctx)
>  out_put_device:
>   put_device(&sdev->sdev_gendev);
>  out:
> + if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> + blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
>   return false;
>  }

This cannot work since multiple threads can call scsi_mq_get_budget()
concurrently and hence it can happen that none of them sees
atomic_read(&sdev->device_busy) == 0. BTW, the above patch is something I
consider as a workaround. Have you considered to fix the root cause and to
add blk_mq_sched_mark_restart_hctx() where these are missing, e.g. in 
blk_mq_sched_dispatch_requests()? The patch below is not a full solution
but resulted in a significant improvement in my tests:

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 69e3226e66ca..9d86876ec503 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -226,6 +226,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx 
*hctx)
 * TODO: get more budgets, and dequeue more requests in
 * one time.
 */
+   blk_mq_sched_mark_restart_hctx(hctx);
blk_mq_do_dispatch_ctx(hctx);
} else {
blk_mq_flush_busy_ctxs(hctx, &rq_list);

Bart.

Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle

2017-12-05 Thread Bart Van Assche

On Tue, 2017-12-05 at 15:29 +0100, Johannes Thumshirn wrote:
> 1) Testing without the patch applied hangs the test forever as it
>doesn't get killed after a specific timeout (I think this should be
>solved in a common function).

Hello Johannes,

If a request queue got stuck then the processes that submitted the requests
on that queue are unkillable. The only approach I know of to stop these
processes is to send a kill signal and next to trigger a queue run from user
space. One possible approach is to run the following command:

for d in /sys/kernel/debug/block/*; do echo kick >$d/state; done

Bart.

Re: [PATCH v3 1/2] Ensure that the SCSI error handler gets woken up

2017-12-05 Thread Bart Van Assche

On Tue, 2017-12-05 at 13:18 +0300, Pavel Tikhomirov wrote:
> On 12/04/2017 09:06 PM, Bart Van Assche wrote:
> > +static void scsi_eh_inc_host_failed(struct rcu_head *head)
> > +{
> > +   struct Scsi_Host *shost = container_of(head, typeof(*shost), rcu);
> > +   unsigned long flags;
> > +
> > +   spin_lock_irqsave(shost->host_lock, flags);
> > +   shost->host_failed++;
> 
> May be we need to increment host_failed before call_rcu(), so that all 
> rcu protected readers already see a change at these point?

Sorry but I don't think that would work. If host_failed would be changed first
then it can happen that scsi_dec_host_busy() encounters that the host state is
"running" and hence that it skips the host_failed check. That can result in a
missed error handler wakeup, which is what we want to avoid.

Bart.

Re: [PATCH 1/2] scsi-mq: Only show the CDB if available

2017-12-05 Thread Bart Van Assche

On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote:
> No, do not mix two different things in one patch, especially the fix part
> need to be backported to stable.
> 
> The fix part should aim at V4.15, and the other part can be a V4.16
> stuff.

Does this mean that you do not plan to post a v5 of your patch and that you
want me to rework this patch series? I can do that.

Bart.

Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle

2017-12-05 Thread Ming Lei

On Tue, Dec 05, 2017 at 04:08:20PM +, Bart Van Assche wrote:
> On Tue, 2017-12-05 at 15:52 +0800, Ming Lei wrote:
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index db9556662e27..1816dd8259b3 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx 
> > *hctx)
> >  out_put_device:
> > put_device(&sdev->sdev_gendev);
> >  out:
> > +   if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> > +   blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
> > return false;
> >  }
> 
> This cannot work since multiple threads can call scsi_mq_get_budget()

That is exactly the way we are handling these cases before 0df21c86bdbf(scsi:
implement .get_budget and .put_budget for blk-mq), so if it can't work,
that is not fault of commit 0df21c86bdbf.

> concurrently and hence it can happen that none of them sees
> atomic_read(&sdev->device_busy) == 0. BTW, the above patch is something I

If there is concurrent .get_budget(), one of them will see the counter
becoming zero finally because each sdev->device_busy is inc/dec
atomically. Or scsi_dev_queue_ready() return true.

Anyway, we need this patch to avoid possible regression. If you think
there are bugs in blk-mq RESTART, just root cause and and fix it.

> consider as a workaround. Have you considered to fix the root cause and to
> add blk_mq_sched_mark_restart_hctx() where these are missing, e.g. in 
> blk_mq_sched_dispatch_requests()? The patch below is not a full solution
> but resulted in a significant improvement in my tests:
> 
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 69e3226e66ca..9d86876ec503 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -226,6 +226,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx 
> *hctx)
>* TODO: get more budgets, and dequeue more requests in
>* one time.
>*/
> + blk_mq_sched_mark_restart_hctx(hctx);
>   blk_mq_do_dispatch_ctx(hctx);
>   } else {
>   blk_mq_flush_busy_ctxs(hctx, &rq_list);

This is still a workaround for RESTART, see my comment before:

https://marc.info/?l=linux-block&m=151217500929341&w=2

-- 
Ming

Re: [PATCH 1/2] scsi-mq: Only show the CDB if available

2017-12-05 Thread Ming Lei

On Tue, Dec 05, 2017 at 04:22:33PM +, Bart Van Assche wrote:
> On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote:
> > No, do not mix two different things in one patch, especially the fix part
> > need to be backported to stable.
> > 
> > The fix part should aim at V4.15, and the other part can be a V4.16
> > stuff.
> 
> Does this mean that you do not plan to post a v5 of your patch and that you
> want me to rework this patch series? I can do that.

I believe V4 has been OK for merge, actually the only concern from James
is that 'set the cmnd to NULL *before* calling free so we narrow the race
window.', but that isn't required as I explained, even though you don't do
that in this patch too.

https://marc.info/?t=15103046433&r=1&w=2

I will work on V5 if Martin/James thinks it is needed.

-- 
Ming

Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle

2017-12-05 Thread Bart Van Assche

On Wed, 2017-12-06 at 00:28 +0800, Ming Lei wrote:
> This is still a workaround for RESTART, see my comment before:
> 
>   https://marc.info/?l=linux-block&m=151217500929341&w=2

A quote from that e-mail: "The theory about using BLK_MQ_S_SCHED_RESTART in
current way is that we mark it after requests are added to hctx->dispatch".
Reading that makes me wonder whether you understand the purpose of the
BLK_MQ_S_SCHED_RESTART flag? That flag is not set after requests are added
to the dispatch list but after requests have been *removed*. The purpose of
that flag is to detect whether another thread has run the queue after
requests were removed from the dispatch list and before these were readded.
If so, the queue needs to be rerun.

Bart.

Re: [PATCH 1/2] scsi-mq: Only show the CDB if available

2017-12-05 Thread James Bottomley

On Wed, 2017-12-06 at 00:38 +0800, Ming Lei wrote:
> On Tue, Dec 05, 2017 at 04:22:33PM +, Bart Van Assche wrote:
> > 
> > On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote:
> > > 
> > > No, do not mix two different things in one patch, especially the
> > > fix part need to be backported to stable.
> > > 
> > > The fix part should aim at V4.15, and the other part can be a
> > > V4.16 stuff.
> > 
> > Does this mean that you do not plan to post a v5 of your patch and
> > that you want me to rework this patch series? I can do that.
> 
> I believe V4 has been OK for merge, actually the only concern from
> James is that 'set the cmnd to NULL *before* calling free so we
> narrow the race window.', but that isn't required as I explained,
> even though you don't do that in this patch too.
> 
>   https://marc.info/?t=15103046433&r=1&w=2
> 
> I will work on V5 if Martin/James thinks it is needed.

I don't buy that it isn't needed.  The point (and the pattern) is for a
destructive action set the signal *before* you execute the action not
after.  The reason should be obvious: if you set it after you invite a
race where the check says OK but the object has gone.  Even if the race
is highly unlikely, the pattern point still holds.

James

Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle

2017-12-05 Thread Ming Lei

On Tue, Dec 05, 2017 at 04:41:46PM +, Bart Van Assche wrote:
> On Wed, 2017-12-06 at 00:28 +0800, Ming Lei wrote:
> > This is still a workaround for RESTART, see my comment before:
> > 
> > https://marc.info/?l=linux-block&m=151217500929341&w=2
> 
> A quote from that e-mail: "The theory about using BLK_MQ_S_SCHED_RESTART in
> current way is that we mark it after requests are added to hctx->dispatch".
> Reading that makes me wonder whether you understand the purpose of the
> BLK_MQ_S_SCHED_RESTART flag? That flag is not set after requests are added
> to the dispatch list but after requests have been *removed*. The purpose of
> that flag is to detect whether another thread has run the queue after
> requests were removed from the dispatch list and before these were readded.
> If so, the queue needs to be rerun.

If you want to discuss that, please reply on that thread of '[PATCH 4/7] blk-mq:
Avoid that request processing sta', I will reply on you there too.

-- 
Ming

Re: [PATCH V4] scsi_debugfs: fix crash in scsi_show_rq()

2017-12-05 Thread Ming Lei

On Wed, Nov 15, 2017 at 08:04:49PM +0800, Ming Lei wrote:
> On Wed, Nov 15, 2017 at 07:28:00PM +0900, James Bottomley wrote:
> > On Wed, 2017-11-15 at 18:09 +0800, Ming Lei wrote:
> > > On Tue, Nov 14, 2017 at 10:14:52AM -0800, James Bottomley wrote:
> > > > 
> > > > On Tue, 2017-11-14 at 08:55 +0800, Ming Lei wrote:
> > > > > 
> > > > > Hi James,
> > > > > 
> > > > > On Mon, Nov 13, 2017 at 10:55:52AM -0800, James Bottomley wrote:
> > > > > > 
> > > > > > 
> > > > > > On Sat, 2017-11-11 at 10:43 +0800, Ming Lei wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > So from CPU1's review, cmd->cmnd is in a remote NUMA node,
> > > > > > > __scsi_format_command() is executed much slower than
> > > > > > > mempool_free().
> > > > > > > So when mempool_free() returns, __scsi_format_command() may
> > > > > > > not fetched the buffer in L1 cache yet, then use-after-free
> > > > > > > is still triggered.
> > > > > > > 
> > > > > > > That is why I say this use-after-free is inevitable no matter
> > > > > > > 'setting SCpnt->cmnd to NULL before calling mempool_free()'
> > > > > > > or not.
> > > > > > 
> > > > > > The bottom line is that there are several creative ways around
> > > > > > this but the proposed code is currently broken and simply
> > > > > > putting a comment in saying so doesn't make it acceptable.
> > > > > 
> > > > > As I explained above, I didn't see one really workable way. Or
> > > > > please correct it if I am wrong.
> > > > 
> > > > I simply can't believe it's beyond the wit of man to solve a use
> > > > after free race.  About 40% of kernel techniques are devoted to
> > > > this.  All I really care about is not losing the PI information we
> > > > previously had.  I agree with Bart that NULL cmnd is a good
> > > > indicator, so it seems reasonable to use it.  If you have another
> > > > mechanism, feel free to propose it.
> > > 
> > > Hi James,
> > > 
> > > This patch is my proposal, no others thought of yet.
> > > 
> > > We can fix the use-after-free easily via lock, rcu and ..., but some
> > > cost has to pay. In this case, we can't wait too long in show_rq(),
> > > otherwise we may lose important debug info, so I do not have better
> > > way.
> > > 
> > > IMO this use-after-free is actually no harm, I don't think we have to
> > > fix it, but it should be better to not let utility warn on this case.
> > 
> > Fine, so lose the snide comment and set the cmnd to NULL *before*
> > calling free so we narrow the race window.
> 
> Hi James,
> 
> Given we can't avoid the use-after-free, how about not do that way so
> we won't lose the precious debug info too early?

Hi James,

Are you fine with V4?

As I explained, the use-after-free can't be avoided, we have to make
scsi_show_rq() to survive that, so we don't need to touch code in free path.
Also we won't lose debug info too early in this way, not like 'set the cmnd
to NULL *before* calling free'.

Thanks,
Ming

[GIT PULL] SCSI fixes for 4.15-rc2

2017-12-05 Thread James Bottomley

We have a bunch of fixes for aacraid, a set of coherency fixes that
only affect non-coherent platforms and one coccinelle detected null
check after use.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Christoph Hellwig (1):
  scsi: dma-mapping: always provide dma_get_cache_alignment

Guilherme G. Piccoli (3):
  scsi: aacraid: Prevent crash in case of free interrupt during scsi EH path
  scsi: aacraid: Perform initialization reset only once
  scsi: aacraid: Check for PCI state of device in a generic way

Gustavo A. R. Silva (1):
  scsi: ufs: ufshcd: fix potential NULL pointer dereference in 
ufshcd_config_vreg

Huacai Chen (2):
  scsi: libsas: align sata_device's rps_resp on a cacheline
  scsi: use dma_get_cache_alignment() as minimum DMA alignment

And the diffstat:

 drivers/scsi/aacraid/aacraid.h |  1 +
 drivers/scsi/aacraid/commsup.c | 35 +++
 drivers/scsi/aacraid/linit.c   |  3 +++
 drivers/scsi/aacraid/rx.c  | 15 ++-
 drivers/scsi/aacraid/src.c | 20 ++--
 drivers/scsi/scsi_lib.c| 10 ++
 drivers/scsi/ufs/ufshcd.c  |  7 +--
 include/linux/dma-mapping.h|  2 --
 include/scsi/libsas.h  |  2 +-
 9 files changed, 43 insertions(+), 52 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
index 403a639574e5..6e3d81969a77 100644
--- a/drivers/scsi/aacraid/aacraid.h
+++ b/drivers/scsi/aacraid/aacraid.h
@@ -1673,6 +1673,7 @@ struct aac_dev
struct aac_hba_map_info hba_map[AAC_MAX_BUSES][AAC_MAX_TARGETS];
u8  adapter_shutdown;
u32 handle_pci_error;
+   boolinit_reset;
 };
 
 #define aac_adapter_interrupt(dev) \
diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 525a652dab48..bec9f3193f60 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -467,35 +467,6 @@ int aac_queue_get(struct aac_dev * dev, u32 * index, u32 
qid, struct hw_fib * hw
return 0;
 }
 
-#ifdef CONFIG_EEH
-static inline int aac_check_eeh_failure(struct aac_dev *dev)
-{
-   /* Check for an EEH failure for the given
-* device node. Function eeh_dev_check_failure()
-* returns 0 if there has not been an EEH error
-* otherwise returns a non-zero value.
-*
-* Need to be called before any PCI operation,
-* i.e.,before aac_adapter_check_health()
-*/
-   struct eeh_dev *edev = pci_dev_to_eeh_dev(dev->pdev);
-
-   if (eeh_dev_check_failure(edev)) {
-   /* The EEH mechanisms will handle this
-* error and reset the device if
-* necessary.
-*/
-   return 1;
-   }
-   return 0;
-}
-#else
-static inline int aac_check_eeh_failure(struct aac_dev *dev)
-{
-   return 0;
-}
-#endif
-
 /*
  * Define the highest level of host to adapter communication routines.
  * These routines will support host to adapter FS commuication. These
@@ -701,7 +672,7 @@ int aac_fib_send(u16 command, struct fib *fibptr, unsigned 
long size,
return -ETIMEDOUT;
}
 
-   if (aac_check_eeh_failure(dev))
+   if (unlikely(pci_channel_offline(dev->pdev)))
return -EFAULT;
 
if ((blink = aac_adapter_check_health(dev)) > 
0) {
@@ -801,7 +772,7 @@ int aac_hba_send(u8 command, struct fib *fibptr, 
fib_callback callback,
 
spin_unlock_irqrestore(&fibptr->event_lock, flags);
 
-   if (aac_check_eeh_failure(dev))
+   if (unlikely(pci_channel_offline(dev->pdev)))
return -EFAULT;
 
fibptr->flags |= FIB_CONTEXT_FLAG_WAIT;
@@ -1583,6 +1554,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int 
forced, u8 reset_type)
 * will ensure that i/o is queisced and the card is flushed in that
 * case.
 */
+   aac_free_irq(aac);
aac_fib_map_free(aac);
dma_free_coherent(&aac->pdev->dev, aac->comm_size, aac->comm_addr,
  aac->comm_phys);
@@ -1590,7 +1562,6 @@ static int _aac_reset_adapter(struct aac_dev *aac, int 
forced, u8 reset_type)
aac->comm_phys = 0;
kfree(aac->queues);
aac->queues = NULL;
-   aac_free_irq(aac);
kfree(aac->fsa_dev);
aac->fsa_dev = NULL;
 
diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index c9252b138c1f..bdf127aaab41 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -1680,6 +1680,9 @@ static int aac_probe_one(struct pci_dev *pdev, const 
struct pci_device_id *id)

Re: [PATCH 1/2] scsi-mq: Only show the CDB if available

2017-12-05 Thread Bart Van Assche

On Wed, 2017-12-06 at 00:38 +0800, Ming Lei wrote:
> On Tue, Dec 05, 2017 at 04:22:33PM +, Bart Van Assche wrote:
> > On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote:
> > > No, do not mix two different things in one patch, especially the fix part
> > > need to be backported to stable.
> > > 
> > > The fix part should aim at V4.15, and the other part can be a V4.16
> > > stuff.
> > 
> > Does this mean that you do not plan to post a v5 of your patch and that you
> > want me to rework this patch series? I can do that.
> 
> I believe V4 has been OK for merge, actually the only concern from James
> is that 'set the cmnd to NULL *before* calling free so we narrow the race
> window.', but that isn't required as I explained, even though you don't do
> that in this patch too.

The next version of this patch series will include the sd driver change 
requested
by James.

Bart.

Re: [PATCH v3 1/2] Ensure that the SCSI error handler gets woken up

2017-12-05 Thread Bart Van Assche

On Wed, 2017-12-06 at 00:17 +0300, ptikhomirov wrote:
> I mean threads in scsi_dec_host_busy() the part under rcu_read_lock are
> divided into two groups: a) finished before call_rcu, b) beginning rcu
> section after call_rcu. So first, in scsi_eh_inc_host_failed() we will
> see all changes to host busy from group (a), second, all threads in group
> (b) will see our change to host_failed. Either there is nobody in (b) and
> we will start EH, or the thread from (b) which entered spin_lock last will
> start EH.
> 
> In your case tasks from b does not see host_failed was incremented, and
> will not start EH.

Hello Pavel,

What does "your case" mean? In my previous e-mail I explained a scenario that
cannot happen so it's not clear to me what "your case" refers to?

Additionally, it seems like you are assuming that RCU guarantees ordering of
RCU read-locked sections against call_rcu()? That's not how RCU works. RCU
guarantees serialization of read-locked sections against grace periods. The
function scsi_eh_inc_host_failed() is invoked through call_rcu() and hence
will be called during a grace period.

Anyway, the different scenarios I see are as follows:
(a) scsi_dec_host_busy() finishes before scsi_eh_inc_host_failed() starts.
(b) scsi_dec_host_busy() starts after scsi_eh_inc_host_failed() has
finished.

In case (a) scsi_eh_inc_host_failed() will wake up the error handler. And in
case (b) scsi_dec_host_busy() will wake up the error handler. So it's not
clear to me why you think that there is a scenario in which the EH won't be
woken up?

Bart.

Re: UFS utilities

2017-12-05 Thread Bart Van Assche

On Mon, 2017-12-04 at 15:20 +, Bean Huo (beanhuo) wrote:
> Also, is it possible bypass SCSI stacks and go into directly UFS stack?

Hello Bean,

Sorry but I think it would be wrong to bypass the block layer when submitting
UFS commands. My understanding is that UFS devices are used in systems where
power management functionality is important (see also Documentation/power).
If the block layer would be bypassed then the power management support that
exists in the block layer will have to be reimplemented in UFS devices. That
would be a duplicate effort. I'm not sure that we want such duplication.

Bart.

Re: [PATCH v3 1/2] Ensure that the SCSI error handler gets woken up

2017-12-05 Thread Pavel Tikhomirov


Hello, Bart!

On 12/06/2017 12:46 AM, Bart Van Assche wrote:

On Wed, 2017-12-06 at 00:17 +0300, ptikhomirov wrote:

I mean threads in scsi_dec_host_busy() the part under rcu_read_lock are
divided into two groups: a) finished before call_rcu, b) beginning rcu
section after call_rcu. So first, in scsi_eh_inc_host_failed() we will
see all changes to host busy from group (a), second, all threads in group
(b) will see our change to host_failed. Either there is nobody in (b) and
we will start EH, or the thread from (b) which entered spin_lock last will
start EH.

In your case tasks from b does not see host_failed was incremented, and
will not start EH.


Hello Pavel,

What does "your case" mean? In my previous e-mail I explained a scenario that
cannot happen so it's not clear to me what "your case" refers to?


By "your case" I meant how your code works, especially that host_failed 
increment is inside scsi_eh_inc_host_failed() in your patch.




Additionally, it seems like you are assuming that RCU guarantees ordering of
RCU read-locked sections against call_rcu()?


Sorry.. Not "call_rcu" itself, In my previous reply I meant the delayed 
callback which actually triggers. (s/call_rcu/call_rcu's callback start/)



That's not how RCU works. RCU
guarantees serialization of read-locked sections against grace periods. The
function scsi_eh_inc_host_failed() is invoked through call_rcu() and hence
will be called during a grace period.


May be I missunderstand something, but I think that callback from 
call_rcu is guaranteed to _start_ after a grace period, but not to end 
before a next grace period. Other threads can enter rcu_read_lock 
protected critical regions while we are still in a callback and will run 
concurrently with callback.




Anyway, the different scenarios I see are as follows:
(a) scsi_dec_host_busy() finishes before scsi_eh_inc_host_failed() starts.
(b) scsi_dec_host_busy() starts after scsi_eh_inc_host_failed() has
 finished.


So I think in (b) scsi_dec_host_busy starts after 
scsi_eh_inc_host_failed has _started_.




In case (a) scsi_eh_inc_host_failed() will wake up the error handler. And in
case (b) scsi_dec_host_busy() will wake up the error handler. So it's not
clear to me why you think that there is a scenario in which the EH won't be
woken up?


So in case (b), in my understanding, scsi_dec_host_busy can skip wakeups 
as it does not see host_failed change yet.




Bart.



To prove my point, some parts of kernel docs:

"This function invokes func(head) after a grace period has elapsed." 
Documentation/RCU/whatisRCU.txt


"
15. The whole point of call_rcu(), synchronize_rcu(), and friends
is to wait until all pre-existing readers have finished before
carrying out some otherwise-destructive operation.
...
Because these primitives only wait for pre-existing readers, it
is the caller's responsibility to guarantee that any subsequent
readers will execute safely.
"
Documentation/RCU/checklist.txt

There is nothing about "callback ends before next grace period".

Sorry, if I'm missing something.

Pavel.

Re: [PATCH v3 1/2] Ensure that the SCSI error handler gets woken up

2017-12-05 Thread Bart Van Assche

On Wed, 2017-12-06 at 01:49 +0300, Pavel Tikhomirov wrote:
> On 12/06/2017 12:46 AM, Bart Van Assche wrote:
> > Anyway, the different scenarios I see are as follows:
> > (a) scsi_dec_host_busy() finishes before scsi_eh_inc_host_failed() starts.
> > (b) scsi_dec_host_busy() starts after scsi_eh_inc_host_failed() has
> >  finished.
> 
> So I think in (b) scsi_dec_host_busy starts after scsi_eh_inc_host_failed
> has _started_.

Agreed, and that's fine, since in that case the SCSI host state has alread been
modified and hence both functions will obtain the SCSI host lock. The relevant
code will be serialized through the SCSI host lock.

> > In case (a) scsi_eh_inc_host_failed() will wake up the error handler. And in
> > case (b) scsi_dec_host_busy() will wake up the error handler. So it's not
> > clear to me why you think that there is a scenario in which the EH won't be
> > woken up?
> 
> So in case (b), in my understanding, scsi_dec_host_busy can skip wakeups 
> as it does not see host_failed change yet.

That's not correct. If scsi_dec_host_busy() obtains the SCSI host lock before
scsi_eh_inc_host_failed() obtains it then the latter function will trigger a
SCSI EH wakeup.

Bart.

Re: [PATCH] scsi: fix race condition when removing target

2017-12-05 Thread Jason Yan



On 2017/12/5 23:37, James Bottomley wrote:

On Tue, 2017-12-05 at 20:37 +0800, Jason Yan wrote:


On 2017/12/1 23:35, James Bottomley wrote:


On Fri, 2017-12-01 at 16:40 +0800, Jason Yan wrote:


On 2017/12/1 7:56, James Bottomley wrote:


b/include/scsi/scsi_device.h
index 571ddb49b926..2e4d48d8cd68 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -380,6 +380,23 @@ extern struct scsi_device
*__scsi_iterate_devices(struct Scsi_Host *,
#define __shost_for_each_device(sdev, shost) \
list_for_each_entry((sdev), &((shost)->__devices),
siblings)



Seems that __shost_for_each_device() is still not safe. scsi
device
been deleted stays in the list and put_device() can be called
anywhere out of the host lock.


Not if it's used with scsi_get_device().  As I said, I only did a
cursory inspectiont, so if I've missed a loop, please specify.

The point was more a demonstration of how we could fix the problem
if we don't change get_device().

James



Yes, it's OK now. __shost_for_each_device() is not used with
scsi_get_device() yet.

Another problem is that put_device() cannot be called while holding
the host lock,


Yes it can.  That's one of the design goals of the execute in process
context: you can call it from interrupt context and you can call it
with locks held and we'll return immediately and delay all the
dangerous stuff until we have a process context.

To get the process context to be acquired, the in_interrupt() test must
pass (so the spin lock must be acquired irqsave) ; is that condition
missing anywhere?

James




Call it from interrupt context is ok. I'm talking about calling it from
process context.

Think about this in a process context:
scsi_device_lookup()
   ->spin_lock_irqsave(shost->host_lock, flags);
   ->__scsi_device_lookup()
  ->iterate and kobject_get_unless_zero()
  ->put_device()
 ->scsi_device_dev_release() if the last put
 ->scsi_device_dev_release_usercontext()
->acquire the host lock = deadlock

Jason


.

[PATCH v2 0/3] Show all commands in debugfs

2017-12-05 Thread Bart Van Assche

Hello Jens,

While debugging an issue with the SCSI error handler I noticed that commands
that got stuck in that error handler are not shown in debugfs. That is very
annoying for anyone who relies on the information in debugfs for root-causing
such an issue. Hence this patch series that makes sure that commands that got
stuck in a block driver timeout handler are also shown in debugfs. Please
consider these patches for kernel v4.16.

Thanks,

Bart.

Changes compared to v1:
- Split the SCSI patch into two patches.
- Added the tag "Cc: stable" to two of the three patches.
- Included a change for the SCSI sd driver.

Bart Van Assche (3):
  scsi: Fix a scsi_show_rq() NULL pointer dereference
  blk-mq-debugfs: Also show requests that have not yet been started
  scsi-mq-debugfs: Show more information

 block/blk-mq-debugfs.c  |  2 +-
 drivers/scsi/scsi_debugfs.c | 47 -
 drivers/scsi/sd.c   |  4 +++-
 3 files changed, 46 insertions(+), 7 deletions(-)

-- 
2.15.0

[PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference

2017-12-05 Thread Bart Van Assche

Avoid that scsi_show_rq() triggers a NULL pointer dereference if
called after sd_uninit_command(). Swap the NULL pointer assignment
and the mempool_free() call in sd_uninit_command() to make it less
likely that scsi_show_rq() triggers a use-after-free. Note: even
with these changes scsi_show_rq() can trigger a use-after-free but
that's a lesser evil than e.g. suppressing debug information for
T10-PI commands completely. This patch fixes the following oops:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: scsi_format_opcode_name+0x1a/0x1c0
CPU: 1 PID: 1881 Comm: cat Not tainted 4.14.0-rc2.blk_mq_io_hang+ #516
Call Trace:
 __scsi_format_command+0x27/0xc0
 scsi_show_rq+0x5c/0xc0
 __blk_mq_debugfs_rq_show+0x116/0x130
 blk_mq_debugfs_rq_show+0xe/0x10
 seq_read+0xfe/0x3b0
 full_proxy_read+0x54/0x90
 __vfs_read+0x37/0x160
 vfs_read+0x96/0x130
 SyS_read+0x55/0xc0
 entry_SYSCALL_64_fastpath+0x1a/0xa5

Fixes: 0eebd005dd07 ("scsi: Implement blk_mq_ops.show_rq()")
Reported-by: Ming Lei 
Signed-off-by: Bart Van Assche 
Cc: James E.J. Bottomley 
Cc: Martin K. Petersen 
Cc: Ming Lei 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Johannes Thumshirn 
Cc: sta...@vger.kernel.org
---
 drivers/scsi/scsi_debugfs.c | 6 --
 drivers/scsi/sd.c   | 4 +++-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_debugfs.c b/drivers/scsi/scsi_debugfs.c
index 01f08c03f2c1..c3765d29fd3f 100644
--- a/drivers/scsi/scsi_debugfs.c
+++ b/drivers/scsi/scsi_debugfs.c
@@ -8,9 +8,11 @@ void scsi_show_rq(struct seq_file *m, struct request *rq)
 {
struct scsi_cmnd *cmd = container_of(scsi_req(rq), typeof(*cmd), req);
int msecs = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);
-   char buf[80];
+   const u8 *const cdb = READ_ONCE(cmd->cmnd);
+   char buf[80] = "(?)";
 
-   __scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);
+   if (cdb)
+   __scsi_format_command(buf, sizeof(buf), cdb, cmd->cmd_len);
seq_printf(m, ", .cmd=%s, .retries=%d, allocated %d.%03d s ago", buf,
   cmd->retries, msecs / 1000, msecs % 1000);
 }
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d175c5c5ccf8..d841743b2107 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1284,6 +1284,7 @@ static int sd_init_command(struct scsi_cmnd *cmd)
 static void sd_uninit_command(struct scsi_cmnd *SCpnt)
 {
struct request *rq = SCpnt->request;
+   u8 *cmnd;
 
if (SCpnt->flags & SCMD_ZONE_WRITE_LOCK)
sd_zbc_write_unlock_zone(SCpnt);
@@ -1292,9 +1293,10 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
__free_page(rq->special_vec.bv_page);
 
if (SCpnt->cmnd != scsi_req(rq)->cmd) {
-   mempool_free(SCpnt->cmnd, sd_cdb_pool);
+   cmnd = SCpnt->cmnd;
SCpnt->cmnd = NULL;
SCpnt->cmd_len = 0;
+   mempool_free(cmnd, sd_cdb_pool);
}
 }
 
-- 
2.15.0

[PATCH v2 3/3] scsi-mq-debugfs: Show more information

2017-12-05 Thread Bart Van Assche

Show the request result, request timeout and SCSI command flags.
This information is very helpful when trying to figure out why a
queue got stuck. An example of the information that is exported
through debugfs:

$ (cd /sys/kernel/debug/block && find -type f -print0 | xargs -0 grep ago)
./sda/hctx0/busy:8804a4523300 {.op=READ, 
.cmd_flags=FAILFAST_DEV|FAILFAST_TRANSPORT|FAILFAST_DRIVER|RAHEAD, 
.rq_flags=MQ_INFLIGHT|DONTPREP|IO_STAT|STATS, .atomic_flags=STARTED, .tag=24, 
.internal_tag=-1, .cmd=Read(10) 28 00 06 80 1c c8 00 00 08 00, .retries=0, 
.result = 0x0, .flags=TAGGED|INITIALIZED, .timeout=90.000, allocated 0.010 s 
ago}

Signed-off-by: Bart Van Assche 
Cc: James E.J. Bottomley 
Cc: Martin K. Petersen 
Cc: Ming Lei 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Johannes Thumshirn 
---
 drivers/scsi/scsi_debugfs.c | 41 ++---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_debugfs.c b/drivers/scsi/scsi_debugfs.c
index c3765d29fd3f..37ed6bb8e6ec 100644
--- a/drivers/scsi/scsi_debugfs.c
+++ b/drivers/scsi/scsi_debugfs.c
@@ -4,15 +4,50 @@
 #include 
 #include "scsi_debugfs.h"
 
+#define SCSI_CMD_FLAG_NAME(name) [ilog2(SCMD_##name)] = #name
+static const char *const scsi_cmd_flags[] = {
+   SCSI_CMD_FLAG_NAME(TAGGED),
+   SCSI_CMD_FLAG_NAME(UNCHECKED_ISA_DMA),
+   SCSI_CMD_FLAG_NAME(ZONE_WRITE_LOCK),
+   SCSI_CMD_FLAG_NAME(INITIALIZED),
+};
+#undef SCSI_CMD_FLAG_NAME
+
+static int scsi_flags_show(struct seq_file *m, const unsigned long flags,
+  const char *const *flag_name, int flag_name_count)
+{
+   bool sep = false;
+   int i;
+
+   for (i = 0; i < sizeof(flags) * BITS_PER_BYTE; i++) {
+   if (!(flags & BIT(i)))
+   continue;
+   if (sep)
+   seq_puts(m, "|");
+   sep = true;
+   if (i < flag_name_count && flag_name[i])
+   seq_puts(m, flag_name[i]);
+   else
+   seq_printf(m, "%d", i);
+   }
+   return 0;
+}
+
 void scsi_show_rq(struct seq_file *m, struct request *rq)
 {
struct scsi_cmnd *cmd = container_of(scsi_req(rq), typeof(*cmd), req);
-   int msecs = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);
+   int alloc_ms = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);
+   int timeout_ms = jiffies_to_msecs(rq->timeout);
const u8 *const cdb = READ_ONCE(cmd->cmnd);
char buf[80] = "(?)";
 
if (cdb)
__scsi_format_command(buf, sizeof(buf), cdb, cmd->cmd_len);
-   seq_printf(m, ", .cmd=%s, .retries=%d, allocated %d.%03d s ago", buf,
-  cmd->retries, msecs / 1000, msecs % 1000);
+   seq_printf(m, ", .cmd=%s, .retries=%d, .result = %#x, .flags=", buf,
+  cmd->retries, cmd->result);
+   scsi_flags_show(m, cmd->flags, scsi_cmd_flags,
+   ARRAY_SIZE(scsi_cmd_flags));
+   seq_printf(m, ", .timeout=%d.%03d, allocated %d.%03d s ago",
+  timeout_ms / 1000, timeout_ms % 1000,
+  alloc_ms / 1000, alloc_ms % 1000);
 }
-- 
2.15.0

[PATCH v2 2/3] blk-mq-debugfs: Also show requests that have not yet been started

2017-12-05 Thread Bart Van Assche

When debugging e.g. the SCSI timeout handler it is important that
requests that have not yet been started or that already have
completed are also reported through debugfs.

Fixes: commit 2720bab50258 ("blk-mq-debugfs: Show busy requests")
Signed-off-by: Bart Van Assche 
Cc: Ming Lei 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Johannes Thumshirn 
Cc: Martin K. Petersen 
Cc: sta...@vger.kernel.org
---
 block/blk-mq-debugfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index f7db73f1698e..886b37163f17 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -409,7 +409,7 @@ static void hctx_show_busy_rq(struct request *rq, void 
*data, bool reserved)
const struct show_busy_params *params = data;
 
if (blk_mq_map_queue(rq->q, rq->mq_ctx->cpu) == params->hctx &&
-   test_bit(REQ_ATOM_STARTED, &rq->atomic_flags))
+   list_empty(&rq->queuelist))
__blk_mq_debugfs_rq_show(params->m,
 list_entry_rq(&rq->queuelist));
 }
-- 
2.15.0

Re: [PATCH 1/2] scsi-mq: Only show the CDB if available

2017-12-05 Thread Ming Lei

On Tue, Dec 05, 2017 at 08:43:49AM -0800, James Bottomley wrote:
> On Wed, 2017-12-06 at 00:38 +0800, Ming Lei wrote:
> > On Tue, Dec 05, 2017 at 04:22:33PM +, Bart Van Assche wrote:
> > > 
> > > On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote:
> > > > 
> > > > No, do not mix two different things in one patch, especially the
> > > > fix part need to be backported to stable.
> > > > 
> > > > The fix part should aim at V4.15, and the other part can be a
> > > > V4.16 stuff.
> > > 
> > > Does this mean that you do not plan to post a v5 of your patch and
> > > that you want me to rework this patch series? I can do that.
> > 
> > I believe V4 has been OK for merge, actually the only concern from
> > James is that 'set the cmnd to NULL *before* calling free so we
> > narrow the race window.', but that isn't required as I explained,
> > even though you don't do that in this patch too.
> > 
> > https://marc.info/?t=15103046433&r=1&w=2
> > 
> > I will work on V5 if Martin/James thinks it is needed.
> 
> I don't buy that it isn't needed.  The point (and the pattern) is for a
> destructive action set the signal *before* you execute the action not
> after.  The reason should be obvious: if you set it after you invite a
> race where the check says OK but the object has gone.  Even if the race

Even you do that, the race is still highly likely there:

1) mempool_free() can be much quicker than scsi_show_rq() because it is
a local free, and scsi_show_rq() can be run from remote CPU wrt. the
allocated 'cmd->cmnd', and access to remote NUMA node should be slower
than mempool_free(), so use-after-free is triggered.

2) any preemption or local IRQ in scsi_show_rq() can make it touch
a freed buffer, and sd_uninit_command() is run from irq context.

3) no any barrier is applied, so the actual write can be reordered
in sd_uninit_command()

So setting the cmd->cmnd as NULL before mempool_free() can't avoid
the use-after-free, scsi_show_rq() has to survive that, then
do we really need to add the unnecessary change in sd_uninit_command()?

Not mention the change will make the debug info disappear too early, is
that what we need?

> is highly unlikely, the pattern point still holds.

-- 
Ming

Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle

2017-12-05 Thread Ming Lei

On Wed, Dec 06, 2017 at 12:28:25AM +0800, Ming Lei wrote:
> On Tue, Dec 05, 2017 at 04:08:20PM +, Bart Van Assche wrote:
> > On Tue, 2017-12-05 at 15:52 +0800, Ming Lei wrote:
> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index db9556662e27..1816dd8259b3 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx 
> > > *hctx)
> > >  out_put_device:
> > >   put_device(&sdev->sdev_gendev);
> > >  out:
> > > + if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> > > + blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
> > >   return false;
> > >  }
> > 
> > This cannot work since multiple threads can call scsi_mq_get_budget()
> 
> That is exactly the way we are handling these cases before 0df21c86bdbf(scsi:
> implement .get_budget and .put_budget for blk-mq), so if it can't work,
> that is not fault of commit 0df21c86bdbf.
> 
> > concurrently and hence it can happen that none of them sees
> > atomic_read(&sdev->device_busy) == 0. BTW, the above patch is something I
> 
> If there is concurrent .get_budget(), one of them will see the counter
> becoming zero finally because each sdev->device_busy is inc/dec
> atomically. Or scsi_dev_queue_ready() return true.
> 
> Anyway, we need this patch to avoid possible regression. If you think
> there are bugs in blk-mq RESTART, just root cause and and fix it.
> 
> > consider as a workaround. Have you considered to fix the root cause and to
> > add blk_mq_sched_mark_restart_hctx() where these are missing, e.g. in 
> > blk_mq_sched_dispatch_requests()? The patch below is not a full solution
> > but resulted in a significant improvement in my tests:
> > 
> > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> > index 69e3226e66ca..9d86876ec503 100644
> > --- a/block/blk-mq-sched.c
> > +++ b/block/blk-mq-sched.c
> > @@ -226,6 +226,7 @@ void blk_mq_sched_dispatch_requests(struct 
> > blk_mq_hw_ctx *hctx)
> >  * TODO: get more budgets, and dequeue more requests in
> >  * one time.
> >  */
> > +   blk_mq_sched_mark_restart_hctx(hctx);
> > blk_mq_do_dispatch_ctx(hctx);
> > } else {
> > blk_mq_flush_busy_ctxs(hctx, &rq_list);
> 

BTW, this kind of change can't cover scsi_set_blocked() which is
triggered by timeout, scsi dispatch failure. You will see that
easily if you run the SCSI test script I provided in the commit log.

-- 
Ming

Re: [PATCH] scsi: fix race condition when removing target

2017-12-05 Thread James Bottomley

On Wed, 2017-12-06 at 08:41 +0800, Jason Yan wrote:
> On 2017/12/5 23:37, James Bottomley wrote:
> > 
> > On Tue, 2017-12-05 at 20:37 +0800, Jason Yan wrote:
> > > 
> > > 
> > > On 2017/12/1 23:35, James Bottomley wrote:
> > > > 
> > > > 
> > > > On Fri, 2017-12-01 at 16:40 +0800, Jason Yan wrote:
> > > > > 
> > > > > 
> > > > > On 2017/12/1 7:56, James Bottomley wrote:
> > > > > > 
> > > > > > 
> > > > > > b/include/scsi/scsi_device.h
> > > > > > index 571ddb49b926..2e4d48d8cd68 100644
> > > > > > --- a/include/scsi/scsi_device.h
> > > > > > +++ b/include/scsi/scsi_device.h
> > > > > > @@ -380,6 +380,23 @@ extern struct scsi_device
> > > > > > *__scsi_iterate_devices(struct Scsi_Host *,
> > > > > > #define __shost_for_each_device(sdev, shost) \
> > > > > > list_for_each_entry((sdev), &((shost)-
> > > > > > >__devices),
> > > > > > siblings)
> > > > > > 
> > > > > 
> > > > > Seems that __shost_for_each_device() is still not safe. scsi
> > > > > device
> > > > > been deleted stays in the list and put_device() can be called
> > > > > anywhere out of the host lock.
> > > > 
> > > > Not if it's used with scsi_get_device().  As I said, I only did
> > > > a cursory inspectiont, so if I've missed a loop, please
> > > > specify.
> > > > 
> > > > The point was more a demonstration of how we could fix the
> > > > problem if we don't change get_device().
> > > > 
> > > > James
> > > > 
> > > 
> > > Yes, it's OK now. __shost_for_each_device() is not used with
> > > scsi_get_device() yet.
> > > 
> > > Another problem is that put_device() cannot be called while
> > > holding the host lock,
> > 
> > Yes it can.  That's one of the design goals of the execute in
> > process context: you can call it from interrupt context and you can
> > call it with locks held and we'll return immediately and delay all
> > the dangerous stuff until we have a process context.
> > 
> > To get the process context to be acquired, the in_interrupt() test
> > must pass (so the spin lock must be acquired irqsave) ; is that
> > condition missing anywhere?
> > 
> > James
> > 
> > 
> 
> Call it from interrupt context is ok. I'm talking about calling it
> from process context.
> 
> Think about this in a process context:
> scsi_device_lookup()
> ->spin_lock_irqsave(shost->host_lock, flags);
> ->__scsi_device_lookup()
>    ->iterate and kobject_get_unless_zero()
>    ->put_device()
>   ->scsi_device_dev_release() if the last put
>   ->scsi_device_dev_release_usercontext()
>  ->acquire the host lock = deadlock

execute_in_process_context() is supposed to produce us a context
whenever the local context isn't available, and that's supposed to
include when interrupts are disabled as in spin_lock_irqsave().

So let me ask this another way: have you seen this deadlock (which
would mean we have a bug in execute_process_context())?

James

Spende

2017-12-05 Thread winda

Hallo, ich bin Roy Cockrum, 58, aus Knoxville, Tennessee Vereinigte Staaten von 
Amerika, Sie haben eine Wohltätigkeitsspende von € 3.400.000,00 EUR, 
kontaktieren Sie mich für weitere Details: roycockrumpovertyfoundat...@gmail.com

Re: [PATCH v2] scsi_debug: add cdb_len paramete

2017-12-05 Thread Bart Van Assche

On Tue, 2017-12-05 at 00:05 -0500, Douglas Gilbert wrote:
> While testing "sd: Micro-optimize READ / WRITE CDB encoding" patches it was
> helpful to check various code paths associated with READ/WRITE 6, 10
> and 16 byte cdb variants. There seems to be no user space "knobs" to
> twiddle use_10_for_rw and friends in the scsi_device structure.
> So add a parameter to scsi_debug called "cdb_len" for this purpose.
> 
> Changes since v1:
>   - address most of the concerns from Bart Van Assche
>   - keep driver version and date and tie them into some responses
> (e.g. version becomes INQUIRY Revision field)
> 
> Patch built on lk 4.15.0-rc2

Reviewed-by: Bart Van Assche

Re: [PATCH] scsi: fix race condition when removing target

2017-12-05 Thread Jason Yan




execute_in_process_context() is supposed to produce us a context
whenever the local context isn't available, and that's supposed to
include when interrupts are disabled as in spin_lock_irqsave().

So let me ask this another way: have you seen this deadlock (which
would mean we have a bug in execute_process_context())?

James




I havn't seen this dead lock but in_interrupt() do not check whether
the interrupts are disabled. Please refer to the definition of
in_interrupt().

#define irq_count() (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK\
 | NMI_MASK))
#define in_interrupt()  (irq_count())


Jason


.

Re: [PATCH] bfa: fix access to bfad_im_port_s

2017-12-05 Thread James Bottomley

On Tue, 2017-11-28 at 16:26 +0100, Johannes Thumshirn wrote:
> Commit 'cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() helper")'
> changed access to bfa's 'struct bfad_im_port_s' by using shost_priv()
> instead of shost->hostdata[0].
> 
> This lead to crashes like in the following back-trace:
> 
> task: 880046375300 ti: 8800a2ef8000 task.ti: 8800a2ef8000
> RIP: e030:[]  []
> bfa_fcport_get_attr+0x82/0x260 [bfa]
> RSP: e02b:8800a2efba10  EFLAGS: 00010046
> RAX: 575f415441536432 RBX: 8800a2efba28 RCX: 
> RDX:  RSI: 8800a2efba28 RDI: 880004dc31d8
> RBP: 880004dc31d8 R08:  R09: 0001
> R10: 88011fadc468 R11: 0001 R12: 880004dc31f0
> R13: 0200 R14: 880004dc61d0 R15: 880004947a10
> FS:  7feb1e489700() GS:88011fac()
> knlGS:
> CS:  e033 DS:  ES:  CR0: 8005003b
> CR2: 7ffe14e46c10 CR3: 957b8000 CR4: 0660
> Stack:
>  88001d4da000 880004dc31c0 a048a9df 81e56380
>     
> [] bfad_iocmd_ioc_get_info+0x4f/0x220 [bfa]
> [] bfad_iocmd_handler+0xa00/0xd40 [bfa]
> [] bfad_im_bsg_request+0xee/0x1b0 [bfa]
> [] fc_bsg_dispatch+0x10b/0x1b0 [scsi_transport_fc]
> [] bsg_request_fn+0x11d/0x1c0
> [] __blk_run_queue+0x2f/0x40
> [] blk_execute_rq_nowait+0xa8/0x160
> [] blk_execute_rq+0x77/0x120
> [] bsg_ioctl+0x1b6/0x200
> [] do_vfs_ioctl+0x2cd/0x4a0
> [] SyS_ioctl+0x74/0x80
> [] entry_SYSCALL_64_fastpath+0x12/0x6d
> 
> Fixes: cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() helper")
> Signed-off-by: Johannes Thumshirn 
> Cc: Michal Koutný 
> ---
>  drivers/scsi/bfa/bfad_bsg.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/bfa/bfad_bsg.c
> b/drivers/scsi/bfa/bfad_bsg.c
> index 72ca2a2e08e2..09ef68c8225f 100644
> --- a/drivers/scsi/bfa/bfad_bsg.c
> +++ b/drivers/scsi/bfa/bfad_bsg.c
> @@ -3135,7 +3135,8 @@ bfad_im_bsg_vendor_request(struct bsg_job *job)
>   struct fc_bsg_request *bsg_request = job->request;
>   struct fc_bsg_reply *bsg_reply = job->reply;
>   uint32_t vendor_cmd = bsg_request-
> >rqst_data.h_vendor.vendor_cmd[0];
> - struct bfad_im_port_s *im_port =
> shost_priv(fc_bsg_to_shost(job));
> + struct Scsi_Host *shost = fc_bsg_to_shost(job);
> + struct bfad_im_port_s *im_port = shost->hostdata[0];
>   struct bfad_s *bfad = im_port->bfad;
>   void *payload_kbuf;
>   int rc = -EINVAL;
> @@ -3350,7 +3351,8 @@ int
>  bfad_im_bsg_els_ct_request(struct bsg_job *job)
>  {
>   struct bfa_bsg_data *bsg_data;
> - struct bfad_im_port_s *im_port =
> shost_priv(fc_bsg_to_shost(job));
> + struct Scsi_Host *shost = fc_bsg_to_shost(job);
> + struct bfad_im_port_s *im_port = shost->hostdata[0];
>   struct bfad_s *bfad = im_port->bfad;
>   bfa_bsg_fcpt_t *bsg_fcpt;
>   struct bfad_fcxp*drv_fcxp;

OK, so we had a linux next failure with this:

After merging the scsi tree, today's linux-next build (x86_64
allmodconfig) produced these warnings:

drivers/scsi/bfa/bfad_bsg.c: In function 'bfad_im_bsg_vendor_request':
drivers/scsi/bfa/bfad_bsg.c:3137:35: warning: initialization makes pointer 
from integer without a cast [-Wint-conversion]
  struct bfad_im_port_s *im_port = shost->hostdata[0];
   ^
drivers/scsi/bfa/bfad_bsg.c: In function 'bfad_im_bsg_els_ct_request':
drivers/scsi/bfa/bfad_bsg.c:3353:35: warning: initialization makes pointer 
from integer without a cast [-Wint-conversion]
  struct bfad_im_port_s *im_port = shost->hostdata[0];
   ^

Introduced by commit

  45349821ab3a ("scsi: bfa: fix access to bfad_im_port_s")

-- 
Cheers,
Stephen Rothwell

The reason is you should be using shost_priv(shost) not shost->hostdata[0].

James

Re: [PATCH v3 1/2] Ensure that the SCSI error handler gets woken up

2017-12-05 Thread Pavel Tikhomirov


In case (a) scsi_eh_inc_host_failed() will wake up the error handler. And in
case (b) scsi_dec_host_busy() will wake up the error handler. So it's not
clear to me why you think that there is a scenario in which the EH won't be
woken up?


So in case (b), in my understanding, scsi_dec_host_busy can skip wakeups
as it does not see host_failed change yet.


That's not correct. If scsi_dec_host_busy() obtains the SCSI host lock before
scsi_eh_inc_host_failed() obtains it then the latter function will trigger a
SCSI EH wakeup.


You are right! Thanks a lot for pointing that out! Now when I understand 
it, your patch looks good for me:


Reviewed-by: Pavel Tikhomirov 

By the way, I very much like your idea of using rcu for these case.

Thanks, Pavel.



Bart.

58 matches

Mail list logo