[PATCH] hpsa: correct enclosure sas address
- separate enclosure logical identifier from the SAS address.

The original complaint was that lsscsi -t showed the same SAS address for the two enclosures (SEP devices). In fact the SAS address was being set to the Enclosure Logical Identifier (ELI).

Reviewed-by: Scott Teel
Reviewed-by: Kevin Barnett
Signed-off-by: Don Brace
---
 drivers/scsi/hpsa.c | 25 +
 drivers/scsi/hpsa.h |1 +
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 15c7f3b6f35e..58bb70b886d7 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3440,11 +3440,11 @@ static void hpsa_get_enclosure_info(struct ctlr_info *h,
 	struct ext_report_lun_entry *rle = &rlep->LUN[rle_index];
 	u16 bmic_device_index = 0;
 
-	bmic_device_index = GET_BMIC_DRIVE_NUMBER(&rle->lunid[0]);
-
-	encl_dev->sas_address =
+	encl_dev->eli =
 		hpsa_get_enclosure_logical_identifier(h, scsi3addr);
 
+	bmic_device_index = GET_BMIC_DRIVE_NUMBER(&rle->lunid[0]);
+
 	if (encl_dev->target == -1 || encl_dev->lun == -1) {
 		rc = IO_OK;
 		goto out;
@@ -9697,7 +9697,24 @@ hpsa_sas_get_linkerrors(struct sas_phy *phy)
 static int
 hpsa_sas_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
 {
-	*identifier = rphy->identify.sas_address;
+	struct Scsi_Host *shost = phy_to_shost(rphy);
+	struct ctlr_info *h;
+	struct hpsa_scsi_dev_t *sd;
+
+	if (!shost)
+		return -ENXIO;
+
+	h = shost_to_hba(shost);
+
+	if (!h)
+		return -ENXIO;
+
+	sd = hpsa_find_device_by_sas_rphy(h, rphy);
+	if (!sd)
+		return -ENXIO;
+
+	*identifier = sd->eli;
+
 	return 0;
 }
 
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index fb9f5e7f8209..59e023696fff 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -68,6 +68,7 @@ struct hpsa_scsi_dev_t {
 #define RAID_CTLR_LUNID "\0\0\0\0\0\0\0\0"
 	unsigned char device_id[16];	/* from inquiry pg. 0x83 */
 	u64 sas_address;
+	u64 eli;			/* from report diags. */
 	unsigned char vendor[8];	/* bytes 8-15 of inquiry data */
 	unsigned char model[16];	/* bytes 16-31 of inquiry data */
 	unsigned char rev;		/* byte 2 of inquiry data */
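
For readers who want to see the idea of the patch in isolation, here is a minimal userspace sketch, not the driver code. It assumes a cut-down device record (`struct encl_dev`) and a lookup helper (`find_dev_by_rphy`), both hypothetical names that stand in for `struct hpsa_scsi_dev_t` and `hpsa_find_device_by_sas_rphy()`. The point it illustrates is that the enclosure identifier comes from a dedicated `eli` field and that an unknown device yields `-ENXIO`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct hpsa_scsi_dev_t: only the two fields
 * that the patch keeps apart are modelled here. */
struct encl_dev {
	const void *rphy;	/* opaque key, stands in for struct sas_rphy * */
	uint64_t sas_address;	/* SAS address of the SEP device */
	uint64_t eli;		/* Enclosure Logical Identifier */
};

/* Hypothetical lookup, playing the role of hpsa_find_device_by_sas_rphy(). */
static const struct encl_dev *
find_dev_by_rphy(const struct encl_dev *devs, size_t n, const void *rphy)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].rphy == rphy)
			return &devs[i];
	return NULL;
}

/* Report the ELI rather than the SAS address; -ENXIO if the device is unknown. */
static int
get_enclosure_identifier(const struct encl_dev *devs, size_t n,
			 const void *rphy, uint64_t *identifier)
{
	const struct encl_dev *sd;

	assert(identifier != NULL);

	sd = find_dev_by_rphy(devs, n, rphy);
	if (!sd)
		return -ENXIO;

	*identifier = sd->eli;
	return 0;
}
```

With two records that carry different `sas_address` and `eli` values, the identifier returned is the `eli` of the matching record, and the SAS address stays untouched for other consumers.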
RE: aacraid driver, kernel 4.14 and up, ASR8xxx controller : doesn't work
> I have already set up many drives of this size previously, using
> ASR8xx5 controllers, without any problem.
>
> On another machine with a simple 8x1TB array, it doesn't work any better,
> while an older kernel works perfectly fine too:
>
> 4.14.48 :
>
> [ 61.069190] Adaptec aacraid driver 1.2.1[50834]-custom
> [ 61.069527] aacraid :21:00.0: SME is active, device will require DMA bounce buffers
> [ 61.076949] SME is active and system is using DMA bounce buffers
> [ 61.076954] aacraid: Comm Interface type2 enabled
>

Both controllers are capable of 64-bit DMA; however, the communication area should be 32-bit. Is this issue specific to Secure Memory Encryption? Does it occur without that enabled?

>
> Nothing else happens for minutes. Attached devices are unavailable.
> "arcconf" finds no controller. "rmmod aacraid" doesn't work (device in use).
>
> Running kernel 4.16.14 with the 8 TB array is exactly the same:

Thanks for that info ... I won't have to find a lot of drives.

>
> [ 40.380488] Adaptec aacraid driver 1.2.1[50877]-custom
> [ 40.380871] aacraid :21:00.0: SME is active, device will require DMA bounce buffers
> [ 40.388991] SME is active and system is using DMA bounce buffers
> [ 40.388995] aacraid: Comm Interface type2 enabled
>
> In contrast, kernel 4.13, though the driver version is the same as 4.14, just
> works:
>
> [ 25.286437] Adaptec aacraid driver 1.2.1[50834]-custom
> [ 25.293694] aacraid: Comm Interface type2 enabled
> [ 25.300799] AAC0: kernel 7.13-0[33263] Mar 16 2018
> [ 25.300801] AAC0: monitor 7.13-0[33263]
> [ 25.300802] AAC0: bios 7.13-0[33263]
> [ 25.300804] AAC0: serial 6A46639462D
> [ 25.300804] AAC0: Non-DASD support enabled.
> [ 25.300805] AAC0: 64bit support enabled.
> [ 25.300807] aacraid :21:00.0: 64 Bit DAC enabled
> ...
> [ 27.307013] scsi host16: aacraid
> [ 27.307228] scsi 16:0:0:0: Direct-Access ASR8805 LogicalDrv 0 V1.0 PQ: 0 ANSI: 2
> [ 27.307364] sd 16:0:0:0: [sdm] Very big device. Trying to use READ CAPACITY(16).
> [ 27.307376] sd 16:0:0:0: [sdm] 11714670592 512-byte logical blocks: (6.00 TB/5.45 TiB)
> [ 27.307385] sd 16:0:0:0: [sdm] Write Protect is off
> [ 27.307386] sd 16:0:0:0: [sdm] Mode Sense: 12 00 10 08
> [ 27.307403] sd 16:0:0:0: [sdm] Write cache: disabled, read cache: enabled, supports DPO and FUA
> [ 27.307411] sd 16:0:0:0: Attached scsi generic sg12 type 0
> [ 27.307507] sd 16:0:0:0: [sdm] Very big device. Trying to use READ CAPACITY(16).
> [ 27.307830] sd 16:0:0:0: [sdm] Very big device. Trying to use READ CAPACITY(16).
> [ 27.307855] sd 16:0:0:0: [sdm] Attached SCSI removable disk
> [ 27.332731] scsi 16:1:0:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
> [ 27.355431] scsi 16:1:0:0: Attached scsi generic sg13 type 0
> [ 27.355830] scsi 16:1:1:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
> [ 27.385377] scsi 16:1:1:0: Attached scsi generic sg14 type 0
> [ 27.385788] scsi 16:1:2:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
> [ 27.413412] scsi 16:1:2:0: Attached scsi generic sg15 type 0
> [ 27.413813] scsi 16:1:3:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
> [ 27.437420] scsi 16:1:3:0: Attached scsi generic sg16 type 0
> [ 27.437836] scsi 16:1:4:0: Direct-Access ATA WDC WD10TPVT-00H 1A01 PQ: 1 ANSI: 6
> [ 27.636675] scsi 16:1:4:0: Attached scsi generic sg17 type 0
> [ 27.637077] scsi 16:1:5:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
> [ 27.664591] scsi 16:1:5:0: Attached scsi generic sg18 type 0
> [ 27.664996] scsi 16:1:6:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
> [ 27.697619] scsi 16:1:6:0: Attached scsi generic sg19 type 0
> [ 27.698027] scsi 16:1:7:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
> [ 27.732645] scsi 16:1:7:0: Attached scsi generic sg20 type 0
>

Thanks for the additional info ...

-Dave
Re: aacraid driver, kernel 4.14 and up, ASR8xxx controller : doesn't work
On Tue, 3 Jul 2018 16:59:41 + Dave Carroll wrote:

> Hi Emmanuel,
>
> It is curious that the FW is having outstanding commands ... I've
> created a ticket to identify the differences. I suspect that the
> large drive size may be related, but all options are open.
>

I have already set up many drives of this size previously, using ASR8xx5 controllers, without any problem.

On another machine with a simple 8x1TB array, it doesn't work any better, while an older kernel works perfectly fine too:

4.14.48 :

[ 61.069190] Adaptec aacraid driver 1.2.1[50834]-custom
[ 61.069527] aacraid :21:00.0: SME is active, device will require DMA bounce buffers
[ 61.076949] SME is active and system is using DMA bounce buffers
[ 61.076954] aacraid: Comm Interface type2 enabled

Nothing else happens for minutes. Attached devices are unavailable. "arcconf" finds no controller. "rmmod aacraid" doesn't work (device in use).

Running kernel 4.16.14 with the 8 TB array is exactly the same:

[ 40.380488] Adaptec aacraid driver 1.2.1[50877]-custom
[ 40.380871] aacraid :21:00.0: SME is active, device will require DMA bounce buffers
[ 40.388991] SME is active and system is using DMA bounce buffers
[ 40.388995] aacraid: Comm Interface type2 enabled

In contrast, kernel 4.13, though the driver version is the same as 4.14, just works:

[ 25.286437] Adaptec aacraid driver 1.2.1[50834]-custom
[ 25.293694] aacraid: Comm Interface type2 enabled
[ 25.300799] AAC0: kernel 7.13-0[33263] Mar 16 2018
[ 25.300801] AAC0: monitor 7.13-0[33263]
[ 25.300802] AAC0: bios 7.13-0[33263]
[ 25.300804] AAC0: serial 6A46639462D
[ 25.300804] AAC0: Non-DASD support enabled.
[ 25.300805] AAC0: 64bit support enabled.
[ 25.300807] aacraid :21:00.0: 64 Bit DAC enabled
...
[ 27.307013] scsi host16: aacraid
[ 27.307228] scsi 16:0:0:0: Direct-Access ASR8805 LogicalDrv 0 V1.0 PQ: 0 ANSI: 2
[ 27.307364] sd 16:0:0:0: [sdm] Very big device. Trying to use READ CAPACITY(16).
[ 27.307376] sd 16:0:0:0: [sdm] 11714670592 512-byte logical blocks: (6.00 TB/5.45 TiB)
[ 27.307385] sd 16:0:0:0: [sdm] Write Protect is off
[ 27.307386] sd 16:0:0:0: [sdm] Mode Sense: 12 00 10 08
[ 27.307403] sd 16:0:0:0: [sdm] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 27.307411] sd 16:0:0:0: Attached scsi generic sg12 type 0
[ 27.307507] sd 16:0:0:0: [sdm] Very big device. Trying to use READ CAPACITY(16).
[ 27.307830] sd 16:0:0:0: [sdm] Very big device. Trying to use READ CAPACITY(16).
[ 27.307855] sd 16:0:0:0: [sdm] Attached SCSI removable disk
[ 27.332731] scsi 16:1:0:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
[ 27.355431] scsi 16:1:0:0: Attached scsi generic sg13 type 0
[ 27.355830] scsi 16:1:1:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
[ 27.385377] scsi 16:1:1:0: Attached scsi generic sg14 type 0
[ 27.385788] scsi 16:1:2:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
[ 27.413412] scsi 16:1:2:0: Attached scsi generic sg15 type 0
[ 27.413813] scsi 16:1:3:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
[ 27.437420] scsi 16:1:3:0: Attached scsi generic sg16 type 0
[ 27.437836] scsi 16:1:4:0: Direct-Access ATA WDC WD10TPVT-00H 1A01 PQ: 1 ANSI: 6
[ 27.636675] scsi 16:1:4:0: Attached scsi generic sg17 type 0
[ 27.637077] scsi 16:1:5:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
[ 27.664591] scsi 16:1:5:0: Attached scsi generic sg18 type 0
[ 27.664996] scsi 16:1:6:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
[ 27.697619] scsi 16:1:6:0: Attached scsi generic sg19 type 0
[ 27.698027] scsi 16:1:7:0: Direct-Access ATA WDC WD10TPVT-00U 1A01 PQ: 1 ANSI: 6
[ 27.732645] scsi 16:1:7:0: Attached scsi generic sg20 type 0

-- 
Emmanuel Florac | Direction technique | Intellique | | +33 1 78 94 84 02
RE: aacraid driver, kernel 4.14 and up, ASR8xxx controller : doesn't work
> After a very long time, it finally boots up and sees the disks, but
> here's the output from dmesg | grep aacraid:
>
> [1.357760] Adaptec aacraid driver 1.2.1[50877]-custom
> [1.388119] aacraid: Comm Interface type2 enabled
> [3.405113] scsi host0: aacraid
> [ 50.156024] aacraid: Host adapter abort request.
> aacraid: Outstanding commands on (0,0,0,0):
> [ 50.156126] aacraid: Host adapter abort request.
> aacraid: Outstanding commands on (0,0,0,0):
> [ 50.172032] aacraid: Host adapter reset request. SCSI hang ?
> [ 65.536106] aacraid: Host adapter reset request. SCSI hang ?
> [ 65.536204] aacraid :01:00.0: outstanding cmd: midlevel-0
> [ 65.536206] aacraid :01:00.0: outstanding cmd: lowlevel-0
> [ 65.536207] aacraid :01:00.0: outstanding cmd: error handler-0
> [ 65.536208] aacraid :01:00.0: outstanding cmd: firmware-2
> [ 65.536210] aacraid :01:00.0: outstanding cmd: kernel-0
> [ 65.536228] aacraid :01:00.0: Controller reset type is 3
> [ 65.536314] aacraid :01:00.0: Issuing IOP reset
> [ 98.675617] aacraid :01:00.0: IOP reset succeded
> [ 98.684161] aacraid: Comm Interface type2 enabled
> [ 110.015352] aacraid :01:00.0: Scheduling bus rescan
> [ 166.896116] aacraid: Host adapter reset request. SCSI hang ?
> [ 166.896214] aacraid :01:00.0: outstanding cmd: midlevel-0
> [ 166.896216] aacraid :01:00.0: outstanding cmd: lowlevel-0
> [ 166.896217] aacraid :01:00.0: outstanding cmd: error handler-0
> [ 166.896218] aacraid :01:00.0: outstanding cmd: firmware-2
> [ 166.896220] aacraid :01:00.0: outstanding cmd: kernel-0
> [ 166.896236] aacraid :01:00.0: Controller reset type is 3
> [ 166.896322] aacraid :01:00.0: Issuing IOP reset
> [ 198.858466] aacraid :01:00.0: IOP reset succeded
> [ 198.870660] aacraid: Comm Interface type2 enabled
> [ 211.129896] aacraid :01:00.0: Scheduling bus rescan
> [ 228.844034] aacraid: Host adapter abort request.
> aacraid: Outstanding commands on (0,0,0,0):
> [ 266.835610] aacraid: Host adapter reset request. SCSI hang ?
> [ 266.837891] aacraid :01:00.0: outstanding cmd: midlevel-0
> [ 266.837894] aacraid :01:00.0: outstanding cmd: lowlevel-0
> [ 266.837897] aacraid :01:00.0: outstanding cmd: error handler-0
> [ 266.837899] aacraid :01:00.0: outstanding cmd: firmware-2
> [ 266.837902] aacraid :01:00.0: outstanding cmd: kernel-0
> [ 266.837939] aacraid :01:00.0: Controller reset type is 3
> [ 266.840415] aacraid :01:00.0: Issuing IOP reset
> [ 299.846642] aacraid :01:00.0: IOP reset succeded
> [ 299.858811] aacraid: Comm Interface type2 enabled
> [ 312.098221] aacraid :01:00.0: Scheduling bus rescan
> [ 367.869277] aacraid: Host adapter reset request. SCSI hang ?
> [ 367.871382] aacraid :01:00.0: outstanding cmd: midlevel-0
> [ 367.871385] aacraid :01:00.0: outstanding cmd: lowlevel-0
> [ 367.871387] aacraid :01:00.0: outstanding cmd: error handler-0
> [ 367.871389] aacraid :01:00.0: outstanding cmd: firmware-2
> [ 367.871391] aacraid :01:00.0: outstanding cmd: kernel-0
> [ 367.871480] aacraid :01:00.0: Controller reset type is 3
> [ 367.873982] aacraid :01:00.0: Issuing IOP reset
> [ 400.765513] aacraid :01:00.0: IOP reset succeded
> [ 400.776673] aacraid: Comm Interface type2 enabled
> [ 413.036505] aacraid :01:00.0: Scheduling bus rescan
> [ 468.995678] aacraid: Host adapter reset request. SCSI hang ?
> [ 468.995700] aacraid :01:00.0: outstanding cmd: midlevel-0
> [ 468.995704] aacraid :01:00.0: outstanding cmd: lowlevel-0
> [ 468.995706] aacraid :01:00.0: outstanding cmd: error handler-0
> [ 468.995709] aacraid :01:00.0: outstanding cmd: firmware-2
> [ 468.995711] aacraid :01:00.0: outstanding cmd: kernel-0
> [ 468.995740] aacraid :01:00.0: Controller reset type is 3
> [ 468.995745] aacraid :01:00.0: Issuing IOP reset
> [ 501.875537] aacraid :01:00.0: IOP reset succeded
> [ 501.887288] aacraid: Comm Interface type2 enabled
> [ 514.148609] aacraid :01:00.0: Scheduling bus rescan
>
> The RAID controller is unusable, obviously. Rebooting with 4.13, now...

Hi Emmanuel,

It is curious that the FW is having outstanding commands ... I've created a ticket to identify the differences. I suspect that the large drive size may be related, but all options are open.

Thanks,
-Dave
Re: aacraid driver, kernel 4.14 and up, ASR8xxx controller : doesn't work
On Wed, 27 Jun 2018 18:48:56 + Dave Carroll wrote:

> > Is this size consistent with the 4.13 kernel? That size is greater
> > than the 64-bit LBA addressing (0x93 539F B000).
>
> Sorry, that comment was incorrect, but I would like to see if the
> size is consistent between the kernels.

After a very long time, it finally boots up and sees the disks, but here's the output from dmesg | grep aacraid:

[1.357760] Adaptec aacraid driver 1.2.1[50877]-custom
[1.388119] aacraid: Comm Interface type2 enabled
[3.405113] scsi host0: aacraid
[ 50.156024] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,0,0,0):
[ 50.156126] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,0,0,0):
[ 50.172032] aacraid: Host adapter reset request. SCSI hang ?
[ 65.536106] aacraid: Host adapter reset request. SCSI hang ?
[ 65.536204] aacraid :01:00.0: outstanding cmd: midlevel-0
[ 65.536206] aacraid :01:00.0: outstanding cmd: lowlevel-0
[ 65.536207] aacraid :01:00.0: outstanding cmd: error handler-0
[ 65.536208] aacraid :01:00.0: outstanding cmd: firmware-2
[ 65.536210] aacraid :01:00.0: outstanding cmd: kernel-0
[ 65.536228] aacraid :01:00.0: Controller reset type is 3
[ 65.536314] aacraid :01:00.0: Issuing IOP reset
[ 98.675617] aacraid :01:00.0: IOP reset succeded
[ 98.684161] aacraid: Comm Interface type2 enabled
[ 110.015352] aacraid :01:00.0: Scheduling bus rescan
[ 166.896116] aacraid: Host adapter reset request. SCSI hang ?
[ 166.896214] aacraid :01:00.0: outstanding cmd: midlevel-0
[ 166.896216] aacraid :01:00.0: outstanding cmd: lowlevel-0
[ 166.896217] aacraid :01:00.0: outstanding cmd: error handler-0
[ 166.896218] aacraid :01:00.0: outstanding cmd: firmware-2
[ 166.896220] aacraid :01:00.0: outstanding cmd: kernel-0
[ 166.896236] aacraid :01:00.0: Controller reset type is 3
[ 166.896322] aacraid :01:00.0: Issuing IOP reset
[ 198.858466] aacraid :01:00.0: IOP reset succeded
[ 198.870660] aacraid: Comm Interface type2 enabled
[ 211.129896] aacraid :01:00.0: Scheduling bus rescan
[ 228.844034] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,0,0,0):
[ 266.835610] aacraid: Host adapter reset request. SCSI hang ?
[ 266.837891] aacraid :01:00.0: outstanding cmd: midlevel-0
[ 266.837894] aacraid :01:00.0: outstanding cmd: lowlevel-0
[ 266.837897] aacraid :01:00.0: outstanding cmd: error handler-0
[ 266.837899] aacraid :01:00.0: outstanding cmd: firmware-2
[ 266.837902] aacraid :01:00.0: outstanding cmd: kernel-0
[ 266.837939] aacraid :01:00.0: Controller reset type is 3
[ 266.840415] aacraid :01:00.0: Issuing IOP reset
[ 299.846642] aacraid :01:00.0: IOP reset succeded
[ 299.858811] aacraid: Comm Interface type2 enabled
[ 312.098221] aacraid :01:00.0: Scheduling bus rescan
[ 367.869277] aacraid: Host adapter reset request. SCSI hang ?
[ 367.871382] aacraid :01:00.0: outstanding cmd: midlevel-0
[ 367.871385] aacraid :01:00.0: outstanding cmd: lowlevel-0
[ 367.871387] aacraid :01:00.0: outstanding cmd: error handler-0
[ 367.871389] aacraid :01:00.0: outstanding cmd: firmware-2
[ 367.871391] aacraid :01:00.0: outstanding cmd: kernel-0
[ 367.871480] aacraid :01:00.0: Controller reset type is 3
[ 367.873982] aacraid :01:00.0: Issuing IOP reset
[ 400.765513] aacraid :01:00.0: IOP reset succeded
[ 400.776673] aacraid: Comm Interface type2 enabled
[ 413.036505] aacraid :01:00.0: Scheduling bus rescan
[ 468.995678] aacraid: Host adapter reset request. SCSI hang ?
[ 468.995700] aacraid :01:00.0: outstanding cmd: midlevel-0
[ 468.995704] aacraid :01:00.0: outstanding cmd: lowlevel-0
[ 468.995706] aacraid :01:00.0: outstanding cmd: error handler-0
[ 468.995709] aacraid :01:00.0: outstanding cmd: firmware-2
[ 468.995711] aacraid :01:00.0: outstanding cmd: kernel-0
[ 468.995740] aacraid :01:00.0: Controller reset type is 3
[ 468.995745] aacraid :01:00.0: Issuing IOP reset
[ 501.875537] aacraid :01:00.0: IOP reset succeded
[ 501.887288] aacraid: Comm Interface type2 enabled
[ 514.148609] aacraid :01:00.0: Scheduling bus rescan

The RAID controller is unusable, obviously. Rebooting with 4.13, now...

-- 
Emmanuel Florac | Direction technique | Intellique | | +33 1 78 94 84 02
Re: aacraid driver, kernel 4.14 and up, ASR8xxx controller : doesn't work
On Wed, 27 Jun 2018 18:48:56 + Dave Carroll wrote:

> Sorry, that comment was incorrect, but I would like to see if the
> size is consistent between the kernels.
>

I just booted the 4.16 kernel from Debian testing, same problem, so this is not an artefact of my custom compilation:

aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,0,0,0):
aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,0,0,0):
aacraid: Host adapter abort request.
aacraid: Host adapter reset request. SCSI hang ?

The very same adapter connected to the very same disks works perfectly fine with a 4.13 kernel, and not at all with 4.14, 4.15, 4.16. I didn't test 4.17 yet but...

-- 
Emmanuel Florac | Direction technique | Intellique | | +33 1 78 94 84 02
[Bug 199703] HPSA blocking boot on HP smart Array P400
https://bugzilla.kernel.org/show_bug.cgi?id=199703

--- Comment #24 from Rich Reamer (richr...@yahoo.com) ---
UPDATE:
* Found out that CONFIG_BLK_CPQ_CISS_DA was removed and replaced with the HPSA driver.
* The "hpsa_allow_any=1" boot parameter does sort of find the raid/disk, and creates a "/dev/disk/by-path/pci-:00:06.1-scsi-0:0:0:0", but that's it (no "by-uuid" or "by-id" entries).
* blkid just hangs completely when tried.

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [PATCH] mpt3sas: Fix for regression caused due to cf6bf9710c patch
On Tue, 2018-07-03 at 22:49 +0900, David Miller wrote:
> From: Sreekanth Reddy
> Date: Tue, 3 Jul 2018 17:48:49 +0530
>
> > Any suggestion/update over my previous mail. I am using 4.13
> > kernel.
>
> I think the issue is that if you are reading a 32-bit word and then
> interpreting it as a struct full of individual bytes, you have to
> order the bytes in the structure appropriately for the cpu
> endianness.

This is undoubtedly it. The point being if you read from a structure using readX, you have to read every element at its correct length for the endian swaps to work. You can't do a readq on 2 32 bit words and expect the endianness to be correct (you'll find they come out in the wrong order).

I think you're using a shared (device and cpu) memory mapped structured data with a doorbell register, which is pretty identical to how the qla1280 does it. We went through several iterations of fixing that driver for big endian but finally settled on putting __le annotations on all the structures and doing cpu_to_leX() swaps as we wrote them (and obviously leX_to_cpu() swaps to read them), meaning the structure in memory is always correct for the device. Then we used a writeX to poke the doorbell and the device just picked up the correct information.

The rule you want to be following is: memory mapped structure, you're responsible for annotation and swapping; readX/writeX to correctly sized data, the API will swap for you.

So, can we just revert the original patch which is clearly now a regression and try to get this fixed in the merge window? I think the actual bug is simply you're missing __leX annotations on the shared memory mapped structure to fix sparse, but otherwise everything is working.

James
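
The rule stated above can be sketched in plain userspace C. This is an illustration under assumptions, not code from any driver: `struct shared_desc`, `put_le32()` and `get_le32()` are hypothetical names that stand in for a structure with `__le32` fields and for `cpu_to_le32()`/`le32_to_cpu()`. The last function models the "readq on 2 32 bit words" pitfall by splitting a 64-bit value the way a big-endian CPU lays it out in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order on any host
 * (userspace stand-in for cpu_to_le32() plus a store). */
static void put_le32(uint8_t *dst, uint32_t v)
{
	assert(dst != NULL);
	dst[0] = (uint8_t)v;
	dst[1] = (uint8_t)(v >> 8);
	dst[2] = (uint8_t)(v >> 16);
	dst[3] = (uint8_t)(v >> 24);
}

/* Load a little-endian 32-bit value on any host
 * (stand-in for a load plus le32_to_cpu()). */
static uint32_t get_le32(const uint8_t *src)
{
	return (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
}

/* Hypothetical descriptor shared with a device: two little-endian words. */
struct shared_desc {
	uint8_t word0[4];
	uint8_t word1[4];
};

static void desc_write(struct shared_desc *d, uint32_t w0, uint32_t w1)
{
	put_le32(d->word0, w0);
	put_le32(d->word1, w1);
}

/* Correct access: every element is read at its own width. */
static void desc_read(const struct shared_desc *d, uint32_t *w0, uint32_t *w1)
{
	*w0 = get_le32(d->word0);
	*w1 = get_le32(d->word1);
}

/* Pitfall model: one 64-bit little-endian read, then the result is split
 * as a big-endian CPU stores a 64-bit value (high half first). */
static void desc_read_as_u64_on_be(const struct shared_desc *d,
				   uint32_t *first, uint32_t *second)
{
	uint64_t q = (uint64_t)get_le32(d->word0) |
		     ((uint64_t)get_le32(d->word1) << 32);

	*first = (uint32_t)(q >> 32);
	*second = (uint32_t)q;
}
```

In this model the per-word accessors return the words in the order they were written, while the 64-bit read followed by a big-endian split returns them swapped, which is the "wrong order" effect described in the message.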
[PATCH] qedi: Send driver state to mfw.
In an iSCSI offload BFS environment, the MFW needs to mark the virtual link based upon the qedi load status.

Signed-off-by: Manish Rangankar
---
 drivers/scsi/qedi/qedi_main.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index 682f3ce..253f305 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -2273,6 +2273,7 @@ static int qedi_setup_boot_info(struct qedi_ctx *qedi)
 static void __qedi_remove(struct pci_dev *pdev, int mode)
 {
 	struct qedi_ctx *qedi = pci_get_drvdata(pdev);
+	int rval;
 
 	if (qedi->tmf_thread) {
 		flush_workqueue(qedi->tmf_thread);
@@ -2302,6 +2303,10 @@ static void __qedi_remove(struct pci_dev *pdev, int mode)
 	if (mode == QEDI_MODE_NORMAL)
 		qedi_free_iscsi_pf_param(qedi);
 
+	rval = qedi_ops->common->update_drv_state(qedi->cdev, false);
+	if (rval)
+		QEDI_ERR(&qedi->dbg_ctx, "Failed to send drv state to MFW\n");
+
 	if (!test_bit(QEDI_IN_OFFLINE, &qedi->flags)) {
 		qedi_ops->common->slowpath_stop(qedi->cdev);
 		qedi_ops->common->remove(qedi->cdev);
@@ -2576,6 +2581,12 @@ static int __qedi_probe(struct pci_dev *pdev, int mode)
 		if (qedi_setup_boot_info(qedi))
 			QEDI_ERR(&qedi->dbg_ctx,
 				 "No iSCSI boot target configured\n");
+
+		rc = qedi_ops->common->update_drv_state(qedi->cdev, true);
+		if (rc)
+			QEDI_ERR(&qedi->dbg_ctx,
+				 "Failed to send drv state to MFW\n");
+
 	}
 
 	return 0;
-- 
1.8.3.1
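
The control flow added by this patch can be modelled outside the kernel as follows. This is only a sketch under assumptions: `struct fake_cdev`, `struct common_ops`, `sketch_probe()` and `sketch_remove()` are hypothetical stand-ins, and the only behaviour taken from the diff is that the driver reports `true` at the end of probe, reports `false` during remove, and logs a failure without aborting either path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical device handle: records what the firmware was last told. */
struct fake_cdev {
	bool drv_active;	/* state as seen by the (modelled) MFW */
	bool fail;		/* force the callback to fail, for testing */
};

/* Mirrors only the single callback used in the patch. */
struct common_ops {
	int (*update_drv_state)(struct fake_cdev *cdev, bool active);
};

static int fake_update_drv_state(struct fake_cdev *cdev, bool active)
{
	assert(cdev != NULL);
	if (cdev->fail)
		return -1;
	cdev->drv_active = active;
	return 0;
}

static const struct common_ops ops = {
	.update_drv_state = fake_update_drv_state,
};

/* Probe path: announce "driver loaded"; a failure is logged, not fatal. */
static int sketch_probe(struct fake_cdev *cdev)
{
	int rc = ops.update_drv_state(cdev, true);

	if (rc)
		fprintf(stderr, "Failed to send drv state to MFW\n");
	return 0;
}

/* Remove path: announce "driver unloaded" before tearing things down. */
static void sketch_remove(struct fake_cdev *cdev)
{
	int rval = ops.update_drv_state(cdev, false);

	if (rval)
		fprintf(stderr, "Failed to send drv state to MFW\n");
}
```

Keeping the notification behind an ops table, as the driver does with `qedi_ops->common`, lets the same probe and remove paths run against whatever implementation sits underneath.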
Re: [PATCH] mpt3sas: Fix for regression caused due to cf6bf9710c patch
From: Sreekanth Reddy
Date: Tue, 3 Jul 2018 17:48:49 +0530

> Any suggestion/update over my previous mail. I am using 4.13 kernel.

I think the issue is that if you are reading a 32-bit word and then interpreting it as a struct full of individual bytes, you have to order the bytes in the structure appropriately for the cpu endianness.

Look at the 32-bit value read, it is identical in both the x86 and sparc cases.
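
The effect described here can be reproduced with a small host-independent model. It is a sketch under assumptions: the function names are hypothetical, and it assumes, as the snippet and structure quoted in this thread suggest, that the 16-bit doorbell word 0x0311 is stored at byte offset 2 of the reply buffer, where `MsgLength` (offset 0x02) and `Function` (offset 0x03) live.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets taken from the MPI2_DEFAULT_REPLY layout quoted in the thread. */
enum { OFF_MSG_LENGTH = 2, OFF_FUNCTION = 3 };

/* Store a 16-bit value as a little-endian CPU would (low byte first). */
static void store16_le(uint8_t *dst, uint16_t v)
{
	assert(dst != NULL);
	dst[0] = (uint8_t)v;
	dst[1] = (uint8_t)(v >> 8);
}

/* Store a 16-bit value as a big-endian CPU would (high byte first). */
static void store16_be(uint8_t *dst, uint16_t v)
{
	assert(dst != NULL);
	dst[0] = (uint8_t)(v >> 8);
	dst[1] = (uint8_t)v;
}

/* Model of the handshake step: the same 16-bit word is stored natively,
 * then the MsgLength byte is read back through the structure layout. */
static uint8_t msg_length_after_store(uint16_t word, int big_endian_cpu)
{
	uint8_t reply[4] = { 0 };

	if (big_endian_cpu)
		store16_be(reply + 2, word);
	else
		store16_le(reply + 2, word);

	return reply[OFF_MSG_LENGTH];
}
```

For the word 0x0311 the model gives 17 for the little-endian store and 3 for the big-endian store, matching the x86 and Sparc64 values reported in the thread even though the 32-bit value read from the doorbell is the same in both cases.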
Re: [PATCH] mpt3sas: Fix for regression caused due to cf6bf9710c patch
Hi,

Any suggestion/update over my previous mail. I am using 4.13 kernel.

Thanks,
Sreekanth

On Sat, Jun 30, 2018 at 12:34 AM, Sreekanth Reddy wrote:
> Hi All,
>
> Here is the issue which we are observing when the driver doesn't use
> le16_to_cpu() in the code snippet below on a Sparc64 machine, when the
> driver is reading 2 bytes of data posted by the HBA firmware:
>
> u32 reply1;
> reply1 = readl(&ioc->chip->Doorbell);
> reply[1] = (reply1 & MPI2_DOORBELL_DATA_MASK);
>
> printk("LSI debug.. 0x%x, 0x%x \n", reply1, reply[1]);
> writel(0, &ioc->chip->HostInterruptStatus);
>
> printk("LSI MsgLength :%d\n", default_reply->MsgLength);
>
> When I execute the above code I get the output below on a Sparc64 machine:
>
> LSI debug.. 0x1c000311, 0x311
> LSI MsgLength :3
>
> When I execute the same code on an x86 machine I get the output below:
>
> LSI debug.. 0x1c000311, 0x311
> LSI MsgLength :17
>
> The correct message length (here I am referring to the IOCFacts message)
> is 17 words. But on the Sparc64 machine we got a message length of 3
> words, which is wrong.
>
> Here is the data structure of the default reply message:
>
> typedef struct _MPI2_DEFAULT_REPLY {
>         U16 FunctionDependent1; /*0x00 */
>         U8  MsgLength;          /*0x02 */
>         U8  Function;           /*0x03 */
>         U16 FunctionDependent2; /*0x04 */
>         U8  FunctionDependent3; /*0x06 */
>         U8  MsgFlags;           /*0x07 */
>         U8  VP_ID;              /*0x08 */
>         U8  VF_ID;              /*0x09 */
>         U16 Reserved1;          /*0x0A */
>         U16 FunctionDependent5; /*0x0C */
>         U16 IOCStatus;          /*0x0E */
>         U32 IOCLogInfo;         /*0x10 */
> } MPI2_DEFAULT_REPLY, *PTR_MPI2_DEFAULT_REPLY,
>   MPI2DefaultReply_t, *pMPI2DefaultReply_t;
>
> Until the host reads the correct number of reply words, the IOC won't
> clear the Doorbell Used bit, and hence we see the error messages below
> while loading the driver, and IOC initialization fails.
>
> Jun 28 02:21:57 localhost kernel: mpt4sas_cm0: _base_get_ioc_facts
> Jun 28 02:21:57 localhost kernel: mpt4sas_cm0: _base_wait_for_iocstate
> Jun 28 02:21:57 localhost kernel: mpt4sas_cm0: doorbell is in use (line=5241)
> Jun 28 02:21:57 localhost kernel: mpt4sas_cm0: _base_get_ioc_facts: handshake failed (r=-14)
> Jun 28 02:21:57 localhost kernel: mpt4sas_cm0: mpt3sas_base_free_resources
> Jun 28 02:21:57 localhost kernel: mpt4sas_cm0: _base_make_ioc_ready
> Jun 28 02:21:57 localhost kernel: mpt4sas_cm0: mpt3sas_base_unmap_resources
> Jun 28 02:21:57 localhost kernel: mpt4sas_cm0: _base_release_memory_pools
> Jun 28 02:21:57 localhost kernel: mpt4sas_cm0: failure at /home/chaitra/mpt3sas_with_sparse_patch/mpt3sas_scsih.c:10776/_scsih_probe()
>
> Thanks,
> Sreekanth
>
> On Sat, Jun 30, 2018 at 12:06 AM, Andy Shevchenko
> wrote:
>> On Fri, Jun 29, 2018 at 7:06 PM, James Bottomley
>> wrote:
>>> On Fri, 2018-06-29 at 10:58 -0400, Chaitra P B wrote:
>>>> "scsi: mpt3sas: Bug fix for big endian systems"
>>>> Above patch with commit id
>>>> "cf6bf9710cabba1fe94a4349f4eb8db623c77ebc" was posted to fix sparse
>>>> warnings. While posting this patch it was assumed that readl() &
>>>> writel() APIs internally calls le32_to_cpu() & cpu_to_le32() APIs
>>>> respectively. Looks like it is not true for all architecture
>>>
>>> Just a minute, it damn well should be. The definition of readl/writel
>>> is barriers and little endian (you can see this in asm-generic/io.h).
>>>
>>> Which architecture is getting this wrong? Because it sounds like
>>> that's what we need to fix rather than doing something like this in all
>>> drivers.
>>>
>>> Sparc (and parisc) definitely do the little endian thing, so if this
>>> code is what it takes to get them working again, it looks like you're
>>> double swapping somewhere. I really think cf6bf9710c needs to be
>>> reverted and you should look again at your sparse warnings. Since the
>>> driver is using the non-raw readX/writeX it should be cpu endian
>>> internally which cf6bf9710c upsets.
>>
>> And we definitely won't see the constructions like
>> writeq(cpu_to_le64()) in the code, because it's weird.
>> If I get it correctly it's equivalent to __raw_writeq().
>>
>> --
>> With Best Regards,
>> Andy Shevchenko
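
The remark quoted above, that wrapping the argument of a non-raw write in cpu_to_leX() amounts to a raw write, can be checked with a byte-order model. This is a sketch under assumptions: all `model_*` functions are hypothetical, the CPU byte order is passed in as a flag so that the model behaves the same on any host, and only byte order is modelled (the barrier semantics of the real accessors are ignored). The 32-bit case is shown; the 64-bit case follows the same pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Model of cpu_to_le32(): a swap on a big-endian CPU, a no-op otherwise. */
static uint32_t model_cpu_to_le32(uint32_t v, int big_endian_cpu)
{
	return big_endian_cpu ? swab32(v) : v;
}

/* Model of a raw store: bytes reach the register in CPU byte order. */
static void model_raw_writel(uint8_t *reg, uint32_t v, int big_endian_cpu)
{
	assert(reg != NULL);
	for (int i = 0; i < 4; i++) {
		int shift = big_endian_cpu ? 24 - 8 * i : 8 * i;

		reg[i] = (uint8_t)(v >> shift);
	}
}

/* Model of writel(): convert to little endian, then do the raw store. */
static void model_writel(uint8_t *reg, uint32_t v, int big_endian_cpu)
{
	model_raw_writel(reg, model_cpu_to_le32(v, big_endian_cpu),
			 big_endian_cpu);
}
```

In this model `model_writel()` puts the same bytes on the register for both byte orders, while `model_writel(reg, model_cpu_to_le32(v, 1), 1)` swaps twice and so leaves the bytes exactly as `model_raw_writel(reg, v, 1)` would, which is the double swap discussed in the thread.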