Re: [PATCH v2] qla2xxx: Fix for locking issue between driver ISR and mailbox routines
Acked-by: Saurav Kashyap saurav.kash...@qlogic.com Thanks, ~Saurav The driver uses ha-mbx_cmd_flags variable to pass information between its ISR and mailbox routines, however, it does so without the protection of any locks. Under certain conditions, this can lead to multiple mailbox command completions being signaled, which, in turn, leads to a false mailbox timeout error for the subsequently issued mailbox command. The issue occurs frequently but intermittenly with the Qlogic 8GFC mezz card during card initialization, resulting in card initialization failure. Signed-off-by: Gurinder (Sunny) Shergill gurinder.sherg...@hp.com --- drivers/scsi/qla2xxx/qla_inline.h | 11 +++ drivers/scsi/qla2xxx/qla_isr.c| 27 --- drivers/scsi/qla2xxx/qla_mbx.c| 2 -- drivers/scsi/qla2xxx/qla_mr.c | 10 ++ drivers/scsi/qla2xxx/qla_nx.c | 26 ++ 5 files changed, 27 insertions(+), 49 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_inline.h b/drivers/scsi/qla2xxx/qla_inline.h index 98ab921..0a5c895 100644 --- a/drivers/scsi/qla2xxx/qla_inline.h +++ b/drivers/scsi/qla2xxx/qla_inline.h @@ -278,3 +278,14 @@ qla2x00_do_host_ramp_up(scsi_qla_host_t *vha) set_bit(HOST_RAMP_UP_QUEUE_DEPTH, vha-dpc_flags); } + +static inline void +qla2x00_handle_mbx_completion(struct qla_hw_data *ha, int status) +{ + if (test_bit(MBX_INTR_WAIT, ha-mbx_cmd_flags) + (status MBX_INTERRUPT) ha-flags.mbox_int) { + set_bit(MBX_INTERRUPT, ha-mbx_cmd_flags); + clear_bit(MBX_INTR_WAIT, ha-mbx_cmd_flags); + complete(ha-mbx_intr_comp); + } +} diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index 259d920..d2a4c75 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -104,14 +104,9 @@ qla2100_intr_handler(int irq, void *dev_id) RD_REG_WORD(reg-hccr); } } + qla2x00_handle_mbx_completion(ha, status); spin_unlock_irqrestore(ha-hardware_lock, flags); - if (test_bit(MBX_INTR_WAIT, ha-mbx_cmd_flags) - (status MBX_INTERRUPT) ha-flags.mbox_int) { - set_bit(MBX_INTERRUPT, ha-mbx_cmd_flags); - complete(ha-mbx_intr_comp); - } - return (IRQ_HANDLED); } @@ -221,14 +216,9 @@ qla2300_intr_handler(int irq, void *dev_id) WRT_REG_WORD(reg-hccr, HCCR_CLR_RISC_INT); RD_REG_WORD_RELAXED(reg-hccr); } + qla2x00_handle_mbx_completion(ha, status); spin_unlock_irqrestore(ha-hardware_lock, flags); - if (test_bit(MBX_INTR_WAIT, ha-mbx_cmd_flags) - (status MBX_INTERRUPT) ha-flags.mbox_int) { - set_bit(MBX_INTERRUPT, ha-mbx_cmd_flags); - complete(ha-mbx_intr_comp); - } - return (IRQ_HANDLED); } @@ -2613,14 +2603,9 @@ qla24xx_intr_handler(int irq, void *dev_id) if (unlikely(IS_QLA83XX(ha) (ha-pdev-revision == 1))) ndelay(3500); } + qla2x00_handle_mbx_completion(ha, status); spin_unlock_irqrestore(ha-hardware_lock, flags); - if (test_bit(MBX_INTR_WAIT, ha-mbx_cmd_flags) - (status MBX_INTERRUPT) ha-flags.mbox_int) { - set_bit(MBX_INTERRUPT, ha-mbx_cmd_flags); - complete(ha-mbx_intr_comp); - } - return IRQ_HANDLED; } @@ -2763,13 +2748,9 @@ qla24xx_msix_default(int irq, void *dev_id) } WRT_REG_DWORD(reg-hccr, HCCRX_CLR_RISC_INT); } while (0); + qla2x00_handle_mbx_completion(ha, status); spin_unlock_irqrestore(ha-hardware_lock, flags); - if (test_bit(MBX_INTR_WAIT, ha-mbx_cmd_flags) - (status MBX_INTERRUPT) ha-flags.mbox_int) { - set_bit(MBX_INTERRUPT, ha-mbx_cmd_flags); - complete(ha-mbx_intr_comp); - } return IRQ_HANDLED; } diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c index 9e5d89d..3587ec2 100644 --- a/drivers/scsi/qla2xxx/qla_mbx.c +++ b/drivers/scsi/qla2xxx/qla_mbx.c @@ -179,8 +179,6 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp) wait_for_completion_timeout(ha-mbx_intr_comp, mcp-tov * HZ); - clear_bit(MBX_INTR_WAIT, ha-mbx_cmd_flags); - } else { ql_dbg(ql_dbg_mbx, vha, 0x1011, Cmd=%x Polling Mode.\n, command); diff --git a/drivers/scsi/qla2xxx/qla_mr.c b/drivers/scsi/qla2xxx/qla_mr.c index 729b743..5483da8 100644 --- a/drivers/scsi/qla2xxx/qla_mr.c +++ b/drivers/scsi/qla2xxx/qla_mr.c @@ -148,9 +148,6 @@ qlafx00_mailbox_command(scsi_qla_host_t *vha, struct mbx_cmd_32 *mcp) spin_unlock_irqrestore(ha-hardware_lock, flags); wait_for_completion_timeout(ha-mbx_intr_comp, mcp-tov * HZ); - - clear_bit(MBX_INTR_WAIT, ha-mbx_cmd_flags); - } else { ql_dbg(ql_dbg_mbx,
sg_ses -j shows Transport protocol: Oxc not decoded
Hi, I have 3 boxes, each with an LSI 9211-8i and a mix of LSI expanders (Supermicro SAS-846EL2, SAS-826EL2). For some of my expanders, 'sg_ses -j' (originally sg3_utils 1.33, now 1.35) is showing: Slot 24 [0,23] Element type: Array device slot ... Additional Element Status: Transport protocol: Oxc not decoded ...where the slot contains a SATA device. It's always Slot 24, and other slots show up fine. E.g. on one of the expanders with SATA drives in both Slot 23 and 24: h3# sg_ses -j /dev/sg81 LSI SAS2X36 0e0b Primary enclosure logical identifier (hex): 500304800013453f ... Slot 23 [0,22] Element type: Array device slot Enclosure Status: Predicted failure=0, Disabled=0, Swap=0, status: OK OK=0, Reserved device=0, Hot spare=0, Cons check=0 In crit array=0, In failed array=0, Rebuild/remap=0, R/R abort=0 App client bypass A=0, Do not remove=0, Enc bypass A=0, Enc bypass B=0 Ready to insert=0, RMV=0, Ident=0, Report=0 App client bypass B=0, Fault sensed=0, Fault reqstd=0, Device off=0 Bypassed A=0, Bypassed B=0, Dev bypassed A=0, Dev bypassed B=0 Additional Element Status: Transport protocol: SAS number of phys: 1, not all phys: 0, device slot number: 22 phy index: 0 device type: no device attached initiator port for: target port for: SATA_device attached SAS address: 0x500304800013453f SAS address: 0x5003048000134522 phy identifier: 0x0 Slot 24 [0,23] Element type: Array device slot Enclosure Status: Predicted failure=0, Disabled=0, Swap=0, status: OK OK=0, Reserved device=0, Hot spare=0, Cons check=0 In crit array=0, In failed array=0, Rebuild/remap=0, R/R abort=0 App client bypass A=0, Do not remove=0, Enc bypass A=0, Enc bypass B=0 Ready to insert=0, RMV=0, Ident=0, Report=0 App client bypass B=0, Fault sensed=0, Fault reqstd=0, Device off=0 Bypassed A=0, Bypassed B=0, Dev bypassed A=0, Dev bypassed B=0 Additional Element Status: Transport protocol: Oxc not decoded ... This may be unrelated, but 'sg_ses -j' is also coming up with the following error on 3 of the 6 expanders identified as LSI SAS2X36 0e0b (this doesn't include any of the expanders with the Slot 24 problem): join_work: oi=6, ei=255 (broken_ei=0) not in join_arr The expander types are: -- $ for h in h1 h2 h3; do echo === $h === ssh $h 'lsscsi | grep enclosu' done === h1 === [0:0:24:0] enclosu LSI CORP SAS2X36 0717 - [0:0:27:0] enclosu LSI SAS2X36 0e0b - [0:0:38:0] enclosu LSI CORP SAS2X28 0717 - [0:0:62:0] enclosu LSI SAS2X36 0e0b - [0:0:85:0] enclosu LSI SAS2X36 0e0b - === h2 === [0:0:25:0] enclosu LSI CORP SAS2X36 0717 - [0:0:29:0] enclosu LSI CORP SAS2X28 0717 - === h3 === [0:0:23:0] enclosu LSI CORP SAS2X36 0717 - [0:0:45:0] enclosu LSI SAS2X36 0e0b - [0:0:57:0] enclosu LSI CORP SAS2X28 0717 - [0:0:81:0] enclosu LSI SAS2X36 0e0b - [0:0:88:0] enclosu LSI SAS2X36 0e0b - -- ...and they're daisy-chained like this: -- for h in b2 b4 b5; do echo === $h === ssh $h 'find /sys/bus/scsi/devices/host0/ -name expander\* | egrep -v bsg|sas_(expander|device)' done === h1 === /sys/bus/scsi/devices/host0/port-0:0/expander-0:0 /sys/bus/scsi/devices/host0/port-0:0/expander-0:0/port-0:0:0/expander-0:1 /sys/bus/scsi/devices/host0/port-0:0/expander-0:0/port-0:0:0/expander-0:1/port-0:1:25/expander-0:4 /sys/bus/scsi/devices/host0/port-0:1/expander-0:2 /sys/bus/scsi/devices/host0/port-0:1/expander-0:2/port-0:2:0/expander-0:3 === h2 === /sys/bus/scsi/devices/host0/port-0:0/expander-0:0 /sys/bus/scsi/devices/host0/port-0:1/expander-0:1 === h3 === /sys/bus/scsi/devices/host0/port-0:0/expander-0:0 /sys/bus/scsi/devices/host0/port-0:0/expander-0:0/port-0:0:0/expander-0:1 /sys/bus/scsi/devices/host0/port-0:1/expander-0:2 /sys/bus/scsi/devices/host0/port-0:1/expander-0:2/port-0:2:0/expander-0:3 /sys/bus/scsi/devices/host0/port-0:1/expander-0:2/port-0:2:0/expander-0:3/port-0:3:0/expander-0:4 -- (Sorry, I don't know how to relate the /sys/bus/scsi stuff to the scsi ids or /dev/sgXX.) The errors are showing up like: -- $ for h in h1 h2 h3; do ssh $h ' for d in $(lsscsi -tg | awk \$2 == \enclosu\ { print \$5 }); do echo === $(hostname):$d === sg_ses -j $d 21 done ' done | egrep 'LSI|^=|^Slot 24|join_work|not decoded' | sed -r 's/^=/\n=/' === h1:/dev/sg24 === LSI CORP SAS2X36 0717 Slot 24 [0,23] Element type: Array device slot === h1:/dev/sg27 === LSI
Re: T10 WCE interpretation in Linux device level access
Hi Rob, Comments inline below. On 04/24/2013 01:44 AM, Elliott, Robert (Server Storage) wrote: If the writeback cache is enabled (per the WCE bit in the Caching mode page), prudent software uses the FUA bit in WRITE commands when writing metadata and/or sends the SYNCHRONIZE CACHE command at important checkpoints to ensure the data is not going to be lost due to a power loss. Some database software is particularly prolific at sending these commands. Around 2003, many RAID controllers with non-volatile writeback caches honored the SYNCHRONIZE CACHE command, flushing the entire cache to the drives. This started causing timeouts as non-volatile write cache sizes grew. Recently, it's even causing trouble on individual disk drives with growing volatile write caches. The intent of software using these commands and bits was unclear - it could be: a) ensure data is in non-volatile cache (and will eventually be flushed) or on the medium; or b) ensure data is on the medium (so the drives are ready for removal). Linux issues SYNCHRONIZE_CACHE commands when we need to make sure that the data needs to be crash safe (after a transaction commit from a file system journal, an explicit fsync call or write system call with O_SYNC set). If the cache is nonvolatile (i.e., the target will have it after a power outage or reboot), we are fine - pretty much your (a) clause above. Not sure we have thought through (or can control) how an array would handle pulling a drive from behind a RAID controller that has not flushed its state. As a short-term fix, many RAID controllers assumed intent (a) and started interpreting the SYNCHRONIZE CACHE command as a NOP and ignoring the FUA bit. We have seen problems with some RAID controllers that leave the write cache enabled on back end drives - their cache is battery backed, but the cache on those backend drives is exposed to certain data loss on a power outage. It would be nice if they always disabled the write cache on the backend drives *or* advertised WCE and propagated the SYNCHRONIZE_CACHE commands to each drive when we send them down. Surprise removal of a drive from a RAID controller is risky even if software has run SYNCHRONIZE CACHE, since the RAID controller might be doing other activity in the background. So, there are other reasons to justify assuming that the user just won't do that. Afraid of breaking software with intent (b) (which was more likely in the days of floppy disks, Bournelli Boxes, and other removable block devices), T10 chose to clarify that the original meaning was (b) and added new FUA_NV and SYNC_NV bits to let software express intent (a). The hope was that devices would implement the bits and software would start using them at appropriate times. Unfortunately, the short-term fix worked well enough that it still prevails today, and most standalone removable media block devices have disappeared. There is not much software actually sending the FUA_NV and SYNC_NV bits and few devices honoring the bits per the standard. As an SBC-3 letter ballot comment, I recently submitted T10 proposal 13-050 (see http://www.t10.org/doc13.htm) to obsolete the SYNC_NV and FUA_NV bits and change the meaning of the commands without those bits to intent (a), reflecting what the industry has actually done. This is definitely something that we should review and take into account going forward. It does sound like we have a lot of confusion around WCE meaning in the storage industry today though, which leads me to think that we will need to allow raw block accessing applications to manually override our flush settings (reluctantly!). Regards, Ric -Original Message- From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Jeremy Linton Sent: Tuesday, April 23, 2013 5:40 PM To: James Bottomley Cc: Ric Wheeler; linux-scsi@vger.kernel.org; Martin K. Petersen; Jeff Moyer; Tejun Heo; Mike Snitzer; dgilb...@interlog.com Subject: Re: T10 WCE interpretation in Linux device level access On 4/23/2013 3:07 PM, James Bottomley wrote: I bet they don't; they probably obey the spec. There's a SYNC_NV bit which if unset (which it is in our implementation) means only sync your non-NV cache. For a device with all NV, that equates to nop. Yes, linux leaves the SYNC_NV bit unset in scsi_setup_flush_cmnd(). The draft specs, and a couple others I have laying about says: says the device shall sync cache to medium for both volatile and non volatile cache data if SYNC_NV is _unset_. With it set, the table could be more confusing! For volatile cache blocks with SYNC_NV set If a non-volatile cache is present, then the device server shall synchronize to non-volatile cache or to the medium. If a non-volatile cache is not present, then the device server shall synchronize to the medium. And for Non-volatile cache with it set No Requirement Which to me says, don't expect any particular behavior
Re: T10 WCE interpretation in Linux device level access
Il 23/04/2013 22:07, James Bottomley ha scritto: On Tue, 2013-04-23 at 15:41 -0400, Ric Wheeler wrote: For many years, we have used WCE as an indication that a device has a volatile write cache (not just a write cache) and used this as a trigger to send down SYNCHRONIZE_CACHE commands as needed. Some arrays with non-volatile cache seem to have WCE set and simply ignore the command. I bet they don't; they probably obey the spec. There's a SYNC_NV bit which if unset (which it is in our implementation) means only sync your non-NV cache. For a device with all NV, that equates to nop. Isn't it the other way round? SYNC_NV = 0 means sync all your caches to the medium, and it's what we do. SYNC_NV = 1 means sync volatile to non-volatile, and it's what Ric wants. So we should set SYNC_NV=1 if NV_SUP is set, perhaps only if the medium is non-removable just to err on the safe side. Ric, can you check what your arrays set in VPD page 0x86 byte 6 bit 1? Paolo Some arrays with non-volatile cache seem to not set WCE. Others arrays with non-volatile cache - our problem arrays - set WCE and do something horrible and slow when sent the SYNCHRONIZE_CACHE commands. These arrays sound to be out of spec, so we should probably just blacklist them. Note that for file systems, you can override this behavior by mounting with our barriers disabled (mount -o nobarrier .). There is currently no way do disable this for anything using the device directly, not through the file system. Some applications run against block devices - not through a file system - and want not to slow to a crawl when they have an array in my problem set. Giving them a hook to ignore WCE seems to be a hack, but one that would resolve issues with users who won't want to wait months (years?) for us to convince the array vendors. Is this a hook worth doing? We already have the ability to set the cache type in sysfs. It tries to do a mode select back to the array, but the USB guys want it for the reverse problem (write back cache behind bridge declared as write through). Have we hashed this out in the T10 committee? SBC-3 contains everything one could wish for about handling devices with volatile and NV cache, I thought. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
On 04/23/2013 10:07 PM, James Bottomley wrote: On Tue, 2013-04-23 at 15:41 -0400, Ric Wheeler wrote: For many years, we have used WCE as an indication that a device has a volatile write cache (not just a write cache) and used this as a trigger to send down SYNCHRONIZE_CACHE commands as needed. Some arrays with non-volatile cache seem to have WCE set and simply ignore the command. I bet they don't; they probably obey the spec. There's a SYNC_NV bit which if unset (which it is in our implementation) means only sync your non-NV cache. For a device with all NV, that equates to nop. Some arrays with non-volatile cache seem to not set WCE. Others arrays with non-volatile cache - our problem arrays - set WCE and do something horrible and slow when sent the SYNCHRONIZE_CACHE commands. These arrays sound to be out of spec, so we should probably just blacklist them. Don't think so. There is no time limit for the SYNCHRONIZE_CACHE command, so the array might take all day to write out the cache. In fact, I've recently had a rather heated discussion with a certain storage vendor which reserved his right to take up to several seconds when flushing the cache. Might be that we're in fact talking about the same one ... are we on a naming-and-shaming policy here ? If so I could tell you some really _interesting_ details about their behaviour ... Also note that SYNCHRONIZE_CACHE was always problematic; and as we're not even setting the LBA range we're even have cross-speak when issuing it to partitioned devices. Note that for file systems, you can override this behavior by mounting with our barriers disabled (mount -o nobarrier .). There is currently no way do disable this for anything using the device directly, not through the file system. Some applications run against block devices - not through a file system - and want not to slow to a crawl when they have an array in my problem set. Giving them a hook to ignore WCE seems to be a hack, but one that would resolve issues with users who won't want to wait months (years?) for us to convince the array vendors. Is this a hook worth doing? We already have the ability to set the cache type in sysfs. It tries to do a mode select back to the array, but the USB guys want it for the reverse problem (write back cache behind bridge declared as write through). You can always set the 'IMMED' bit for these arrays :-) Cheers, Hannes -- Dr. Hannes Reinecke zSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH][v3] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
scsi_send_eh_cmnd() is calling queuecommand() directly, so it needs to check the return value here. The only valid return codes for queuecommand() are 'busy' states, so we need to wait for a bit to allow the LLDD to recover. Based on an earlier patch from Wen Xiong. Cc: Wen Xiong wenxi...@linux.vnet.ibm.com Cc: Brian King brk...@linux.vnet.ibm.com Signed-off-by: Hannes Reinecke h...@suse.de diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index c1b05a8..1fc6da94 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -791,22 +791,32 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, struct scsi_device *sdev = scmd-device; struct Scsi_Host *shost = sdev-host; DECLARE_COMPLETION_ONSTACK(done); - unsigned long timeleft; + unsigned long timeleft = timeout; struct scsi_eh_save ses; + const int stall_for = min(HZ/10, 1); int rtn; +retry: scsi_eh_prep_cmnd(scmd, ses, cmnd, cmnd_size, sense_bytes); shost-eh_action = done; scsi_log_send(scmd); scmd-scsi_done = scsi_eh_done; - shost-hostt-queuecommand(shost, scmd); - - timeleft = wait_for_completion_timeout(done, timeout); + rtn = shost-hostt-queuecommand(shost, scmd); + if (rtn) { + if (timeleft) { + scsi_eh_restore_cmnd(scmd, ses); + timeleft -= stall_for; + msleep(stall_for); + goto retry; + } + rtn = NEEDS_RETRY; + } else + timeleft = wait_for_completion_timeout(done, timeout); shost-eh_action = NULL; - scsi_log_completion(scmd, SUCCESS); + scsi_log_completion(scmd, rtn); SCSI_LOG_ERROR_RECOVERY(3, printk(%s: scmd: %p, timeleft: %ld\n, @@ -837,7 +847,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, rtn = FAILED; break; } - } else { + } else if (!rtn) { scsi_abort_eh_cmnd(scmd); rtn = FAILED; } -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
Il 24/04/2013 14:07, Hannes Reinecke ha scritto: On 04/24/2013 01:17 PM, Paolo Bonzini wrote: Il 23/04/2013 22:07, James Bottomley ha scritto: On Tue, 2013-04-23 at 15:41 -0400, Ric Wheeler wrote: For many years, we have used WCE as an indication that a device has a volatile write cache (not just a write cache) and used this as a trigger to send down SYNCHRONIZE_CACHE commands as needed. Some arrays with non-volatile cache seem to have WCE set and simply ignore the command. I bet they don't; they probably obey the spec. There's a SYNC_NV bit which if unset (which it is in our implementation) means only sync your non-NV cache. For a device with all NV, that equates to nop. Isn't it the other way round? SYNC_NV = 0 means sync all your caches to the medium, and it's what we do. SYNC_NV = 1 means sync volatile to non-volatile, and it's what Ric wants. So we should set SYNC_NV=1 if NV_SUP is set, perhaps only if the medium is non-removable just to err on the safe side. Or use 'WRITE_AND_VERIFY' here; that's guaranteed to hit the disk. Plus it even has a guarantee about data consistency on the disk, which the normal WRITE command has not. The point is to _avoid_ hitting the disk. :) Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
On 04/24/2013 02:08 PM, Paolo Bonzini wrote: Il 24/04/2013 14:07, Hannes Reinecke ha scritto: On 04/24/2013 01:17 PM, Paolo Bonzini wrote: Il 23/04/2013 22:07, James Bottomley ha scritto: On Tue, 2013-04-23 at 15:41 -0400, Ric Wheeler wrote: For many years, we have used WCE as an indication that a device has a volatile write cache (not just a write cache) and used this as a trigger to send down SYNCHRONIZE_CACHE commands as needed. Some arrays with non-volatile cache seem to have WCE set and simply ignore the command. I bet they don't; they probably obey the spec. There's a SYNC_NV bit which if unset (which it is in our implementation) means only sync your non-NV cache. For a device with all NV, that equates to nop. Isn't it the other way round? SYNC_NV = 0 means sync all your caches to the medium, and it's what we do. SYNC_NV = 1 means sync volatile to non-volatile, and it's what Ric wants. So we should set SYNC_NV=1 if NV_SUP is set, perhaps only if the medium is non-removable just to err on the safe side. Or use 'WRITE_AND_VERIFY' here; that's guaranteed to hit the disk. Plus it even has a guarantee about data consistency on the disk, which the normal WRITE command has not. The point is to _avoid_ hitting the disk. :) Ah. Really? Why do we discuss SYNCHRONIZE CACHE then? I was under the impression that we're talking various 'barriers' (or rather 'flush' nowadays) implementations. Which require that some data needs to be written to disk before continuing. Or did I somehow misread the thread? Confused, Hannes -- Dr. Hannes Reinecke zSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
Il 24/04/2013 14:12, Hannes Reinecke ha scritto: On 04/24/2013 02:08 PM, Paolo Bonzini wrote: Il 24/04/2013 14:07, Hannes Reinecke ha scritto: On 04/24/2013 01:17 PM, Paolo Bonzini wrote: Il 23/04/2013 22:07, James Bottomley ha scritto: On Tue, 2013-04-23 at 15:41 -0400, Ric Wheeler wrote: For many years, we have used WCE as an indication that a device has a volatile write cache (not just a write cache) and used this as a trigger to send down SYNCHRONIZE_CACHE commands as needed. Some arrays with non-volatile cache seem to have WCE set and simply ignore the command. I bet they don't; they probably obey the spec. There's a SYNC_NV bit which if unset (which it is in our implementation) means only sync your non-NV cache. For a device with all NV, that equates to nop. Isn't it the other way round? SYNC_NV = 0 means sync all your caches to the medium, and it's what we do. SYNC_NV = 1 means sync volatile to non-volatile, and it's what Ric wants. So we should set SYNC_NV=1 if NV_SUP is set, perhaps only if the medium is non-removable just to err on the safe side. Or use 'WRITE_AND_VERIFY' here; that's guaranteed to hit the disk. Plus it even has a guarantee about data consistency on the disk, which the normal WRITE command has not. The point is to _avoid_ hitting the disk. :) Ah. Really? Why do we discuss SYNCHRONIZE CACHE then? Because we do want the data to hit the non-volatile cache, just in case the disk has both a volatile and a non-volatile cache. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
On 04/24/2013 08:08 AM, Paolo Bonzini wrote: Il 24/04/2013 14:07, Hannes Reinecke ha scritto: On 04/24/2013 01:17 PM, Paolo Bonzini wrote: Il 23/04/2013 22:07, James Bottomley ha scritto: On Tue, 2013-04-23 at 15:41 -0400, Ric Wheeler wrote: For many years, we have used WCE as an indication that a device has a volatile write cache (not just a write cache) and used this as a trigger to send down SYNCHRONIZE_CACHE commands as needed. Some arrays with non-volatile cache seem to have WCE set and simply ignore the command. I bet they don't; they probably obey the spec. There's a SYNC_NV bit which if unset (which it is in our implementation) means only sync your non-NV cache. For a device with all NV, that equates to nop. Isn't it the other way round? SYNC_NV = 0 means sync all your caches to the medium, and it's what we do. SYNC_NV = 1 means sync volatile to non-volatile, and it's what Ric wants. So we should set SYNC_NV=1 if NV_SUP is set, perhaps only if the medium is non-removable just to err on the safe side. Or use 'WRITE_AND_VERIFY' here; that's guaranteed to hit the disk. Plus it even has a guarantee about data consistency on the disk, which the normal WRITE command has not. The point is to _avoid_ hitting the disk. :) Paolo The point is to have a crash-proof version of the data acknowledged by the target device while letting data sit in volatile state as long as possible. To be even clearer, we would love to do this for a sub-range of the device but currently use a big hammer to flush the entire cache (possibly for multiple file systems on one target storage device). Once we use the SYNCHRONIZE_CACHE (or CACHE_FLUSH_EXT) command, we want the data on that target device to be there if someone loses power. If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. Just as importantly, we don't want to destage data to the back end drives if that is not required since it is really, really slow. The confusion here is that various storage devices have used the standard bits in arbitrary ways which makes it very hard to have one clear set of rules. Even harder to explain to end users when to use a work around (like mount -o nobarrier) or the proposed ignore flushes block level call :) Regards, Ric -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
Il 24/04/2013 14:27, Ric Wheeler ha scritto: The point is to _avoid_ hitting the disk. :) The point is to have a crash-proof version of the data acknowledged by the target device while letting data sit in volatile state as long as possible. To be even clearer, we would love to do this for a sub-range of the device but currently use a big hammer to flush the entire cache (possibly for multiple file systems on one target storage device). Once we use the SYNCHRONIZE_CACHE (or CACHE_FLUSH_EXT) command, we want the data on that target device to be there if someone loses power. If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. That's exactly the point of SYNC_NV=1. Just as importantly, we don't want to destage data to the back end drives if that is not required since it is really, really slow. The confusion here is that various storage devices have used the standard bits in arbitrary ways which makes it very hard to have one clear set of rules. Also that we have ignored the problem for long, and it's worked surprisingly well. :) Even harder to explain to end users when to use a work around (like mount -o nobarrier) or the proposed ignore flushes block level call :) Hoping Thunderbird doesn't mangle the patch too badly, code might be worth a thousand words... see after the sig, compile-tested only. I have no access to these controllers, neither the good ones nor the bad ones. :) Paolo - 8 - From: Paolo Bonzini pbonz...@redhat.com Subject: [PATCH] scsi: only make REQ_FLUSH flush to non-volatile cache The point of REQ_FLUSH is to have a crash-proof version of the data acknowledged by the target device. We want the data on that target device to be there if someone loses power, but we don't care (and don't want to know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. This is exactly what SYNC_NV=1 does. Instead, SYNC_NV=0 should flush the data to the medium, which is not desirable when we have a non-volatile cache (except perhaps if the medium is removable). NOT-Tested-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 7992635..97ecfd9 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -800,9 +800,17 @@ static int sd_setup_write_same_cmnd(struct scsi_device *sdp, struct request *rq) static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct request *rq) { + struct scsi_disk *sdkp = scsi_disk(rq-rq_disk); + rq-timeout = SD_FLUSH_TIMEOUT; rq-retries = SD_MAX_RETRIES; rq-cmd[0] = SYNCHRONIZE_CACHE; + + /* No need to synchronize to medium if we have a non-volatile cache, +* but be safe if the medium could just go away. +*/ + if (sdkp-nv_sup !sdp-removable) + rq-cmd[1] |= 4; /* SYNC_NV */ rq-cmd_len = 10; return scsi_setup_blk_pc_cmnd(sdp, rq); @@ -2511,6 +2519,26 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer) } /** + * sd_read_extended_inquiry - Query disk device for non-volatile cache. + * @disk: disk to query + */ +static void sd_read_extended_inquiry(struct scsi_disk *sdkp) +{ + const int vpd_len = 64; + unsigned char *buffer = kmalloc(vpd_len, GFP_KERNEL); + + if (!buffer || + /* Block Limits VPD */ + scsi_get_vpd_page(sdkp-device, 0x86, buffer, vpd_len)) + goto out; + + sdkp-nv_sup = (buffer[6] 0x02) != 0; + + out: + kfree(buffer); +} + +/** * sd_read_block_limits - Query disk device for preferred I/O sizes. * @disk: disk to query */ @@ -2684,6 +2712,7 @@ static int sd_revalidate_disk(struct gendisk *disk) sd_read_capacity(sdkp, buffer); if (sd_try_extended_inquiry(sdp)) { + sd_read_extended_inquiry(sdkp); sd_read_block_provisioning(sdkp); sd_read_block_limits(sdkp); sd_read_block_characteristics(sdkp); diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h index 74a1e4c..6334dfe 100644 --- a/drivers/scsi/sd.h +++ b/drivers/scsi/sd.h @@ -84,6 +84,7 @@ struct scsi_disk { unsignedlbpws10 : 1; unsignedlbpvpd : 1; unsignedws16 : 1; + unsignednv_sup : 1; }; #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev) Regards, Ric -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/1] scsi: ufs: Add support for sending NOP OUT UPIU
As part of device initialization sequence, sending NOP OUT UPIU and waiting for NOP IN UPIU response is mandatory. This confirms that the device UFS Transport (UTP) layer is functional and the host can configure the device with further commands. Add support for sending NOP OUT UPIU to check the device connection path and test whether the UTP layer on the device side is functional during initialization. Signed-off-by: Sujit Reddy Thumma sthu...@codeaurora.org --- drivers/scsi/ufs/ufshcd.c | 167 ++--- drivers/scsi/ufs/ufshcd.h |4 + 2 files changed, 162 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index ced421a..1a8cba3 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -34,12 +34,21 @@ */ #include ufshcd.h +#include linux/iopoll.h +#include linux/async.h /* Timeout after 10 secs if UIC command hangs */ #define UIC_COMMAND_TIMEOUT(10 * HZ) /* Retry for 3 times in case of UIC failures */ #define UIC_COMMAND_RETRIES3 +/* NOP OUT retries waiting for NOP IN response */ +#define NOP_OUT_RETRIES10 +/* Timeout after 30 msecs if NOP OUT hangs without response */ +#define NOP_OUT_TIMEOUT30 /* msecs */ +/* Reserved tag for internal commands */ +#define INTERNAL_CMD_TAG 0 + enum { UFSHCD_MAX_CHANNEL = 0, UFSHCD_MAX_ID = 1, @@ -607,7 +616,7 @@ static void ufshcd_prepare_req_desc(struct ufshcd_lrb *lrbp, u32 *upiu_flags) { struct utp_transfer_req_desc *req_desc = lrbp-utr_descriptor_ptr; enum dma_data_direction cmd_dir = - lrbp-cmd-sc_data_direction; + lrbp-cmd ? lrbp-cmd-sc_data_direction : DMA_NONE; u32 data_direction; u32 dword_0; @@ -624,6 +633,8 @@ static void ufshcd_prepare_req_desc(struct ufshcd_lrb *lrbp, u32 *upiu_flags) dword_0 = data_direction | (lrbp-command_type UPIU_COMMAND_TYPE_OFFSET); + if (lrbp-intr_cmd) + dword_0 |= UTP_REQ_DESC_INT_CMD; /* Transfer request descriptor header fields */ req_desc-header.dword_0 = cpu_to_le32(dword_0); @@ -712,6 +723,18 @@ static inline void ufshcd_prepare_utp_query_req_upiu(struct ufs_hba *hba, } +static inline void ufshcd_prepare_utp_nop_upiu(struct ufshcd_lrb *lrbp) +{ + struct utp_upiu_req *ucd_req_ptr = lrbp-ucd_req_ptr; + + memset(ucd_req_ptr, 0, sizeof(struct utp_upiu_req)); + + /* command descriptor fields */ + ucd_req_ptr-header.dword_0 = + UPIU_HEADER_DWORD( + UPIU_TRANSACTION_NOP_OUT, 0, 0, lrbp-task_tag); +} + /** * ufshcd_compose_upiu - form UFS Protocol Information Unit(UPIU) * @hba - UFS hba @@ -741,12 +764,14 @@ static int ufshcd_compose_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) case UTP_CMD_TYPE_SCSI: case UTP_CMD_TYPE_DEV_MANAGE: ufshcd_prepare_req_desc(lrbp, upiu_flags); - if (lrbp-cmd lrbp-command_type == UTP_CMD_TYPE_SCSI) + if (lrbp-cmd lrbp-command_type == UTP_CMD_TYPE_SCSI) { ufshcd_prepare_utp_scsi_cmd_upiu(lrbp, upiu_flags); - else if (lrbp-cmd) + } else if (lrbp-cmd ufshcd_is_query_req(lrbp)) { ufshcd_prepare_utp_query_req_upiu(hba, lrbp, upiu_flags); - else { + } else if (!lrbp-cmd) { + ufshcd_prepare_utp_nop_upiu(lrbp); + } else { dev_err(hba-dev, %s: Invalid UPIU request\n, __func__); ret = -EINVAL; @@ -799,6 +824,7 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) lrbp-sense_buffer = cmd-sense_buffer; lrbp-task_tag = tag; lrbp-lun = cmd-device-lun; + lrbp-intr_cmd = false; if (ufshcd_is_query_req(lrbp)) lrbp-command_type = UTP_CMD_TYPE_DEV_MANAGE; @@ -1112,6 +1138,113 @@ static int ufshcd_dme_link_startup(struct ufs_hba *hba) return err; } +static inline int +ufshcd_compose_nop_out_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) +{ + lrbp-cmd = NULL; + lrbp-sense_bufflen = 0; + lrbp-sense_buffer = NULL; + lrbp-task_tag = INTERNAL_CMD_TAG; + lrbp-lun = 0; /* NOP OUT is not specific to any LUN */ + lrbp-command_type = UTP_CMD_TYPE_DEV_MANAGE; + lrbp-intr_cmd = true; /* No interrupt aggregation */ + + return ufshcd_compose_upiu(hba, lrbp); +} + +/** + * ufshcd_validate_dev_connection() - Check device connection status + * @hba: per-adapter instance + * + * Send NOP OUT UPIU and wait for NOP IN response to check whether the + * device Transport Protocol (UTP) layer is ready after a reset. + * If the UTP layer at the device side is not initialized, it
Re: T10 WCE interpretation in Linux device level access
On 4/24/2013 7:57 AM, Paolo Bonzini wrote: If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. That's exactly the point of SYNC_NV=1. Well its the point, but the specification is written such that the vendors can choose to implement it any way they wish, especially for split cache systems where there is both volatile and non volatile cache. Flushing the NV cache to medium (as is the current behavior) may not be a bad idea anyway. Thats because I know of a large vendors array where the non-volatile cache might be better described as the sometimes non-volatile cache. That is because a failure to flush the volatile portions results in the non-volatile portions being considered invalid when power is restored. This fences the volume, and the usual method for recovering the array is to call support and have them invalidate the NV portions of the cache. Thereby negating the whole reason for having a NV cache. I'm sure they don't tell customers this fact when they sell the array, when it happened in our lab I was in a state of shock for about a week. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][v3] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
On Wed, 2013-04-24 at 13:32 +0200, Hannes Reinecke wrote: scsi_send_eh_cmnd() is calling queuecommand() directly, so it needs to check the return value here. The only valid return codes for queuecommand() are 'busy' states, so we need to wait for a bit to allow the LLDD to recover. Based on an earlier patch from Wen Xiong. This looks good, only one minor nitpick: Cc: Wen Xiong wenxi...@linux.vnet.ibm.com Cc: Brian King brk...@linux.vnet.ibm.com Signed-off-by: Hannes Reinecke h...@suse.de diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index c1b05a8..1fc6da94 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -791,22 +791,32 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, struct scsi_device *sdev = scmd-device; struct Scsi_Host *shost = sdev-host; DECLARE_COMPLETION_ONSTACK(done); - unsigned long timeleft; + unsigned long timeleft = timeout; struct scsi_eh_save ses; + const int stall_for = min(HZ/10, 1); int rtn; +retry: scsi_eh_prep_cmnd(scmd, ses, cmnd, cmnd_size, sense_bytes); shost-eh_action = done; scsi_log_send(scmd); scmd-scsi_done = scsi_eh_done; - shost-hostt-queuecommand(shost, scmd); - - timeleft = wait_for_completion_timeout(done, timeout); + rtn = shost-hostt-queuecommand(shost, scmd); + if (rtn) { + if (timeleft) { + scsi_eh_restore_cmnd(scmd, ses); + timeleft -= stall_for; + msleep(stall_for); Stall for is in HZ: need to convert to ms, so msleep(stall_for * 1000/HZ); I also don't see a need to restore and reprep the command each time, but I don't see a problem with it either. James + goto retry; + } + rtn = NEEDS_RETRY; + } else + timeleft = wait_for_completion_timeout(done, timeout); shost-eh_action = NULL; - scsi_log_completion(scmd, SUCCESS); + scsi_log_completion(scmd, rtn); SCSI_LOG_ERROR_RECOVERY(3, printk(%s: scmd: %p, timeleft: %ld\n, @@ -837,7 +847,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, rtn = FAILED; break; } - } else { + } else if (!rtn) { scsi_abort_eh_cmnd(scmd); rtn = FAILED; } -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
On 13-04-23 03:41 PM, Ric Wheeler wrote: For many years, we have used WCE as an indication that a device has a volatile write cache (not just a write cache) and used this as a trigger to send down SYNCHRONIZE_CACHE commands as needed. Some arrays with non-volatile cache seem to have WCE set and simply ignore the command. Some arrays with non-volatile cache seem to not set WCE. Others arrays with non-volatile cache - our problem arrays - set WCE and do something horrible and slow when sent the SYNCHRONIZE_CACHE commands. Note that for file systems, you can override this behavior by mounting with our barriers disabled (mount -o nobarrier .). There is currently no way do disable this for anything using the device directly, not through the file system. Some applications run against block devices - not through a file system - and want not to slow to a crawl when they have an array in my problem set. Giving them a hook to ignore WCE seems to be a hack, but one that would resolve issues with users who won't want to wait months (years?) for us to convince the array vendors. Is this a hook worth doing? Have we hashed this out in the T10 committee? Naturally I'm biased, but I tend to think the user space is usually smarter than the kernel. That assumes skilled users. So if the user space issues a SYNCHRONIZE_CACHE with the IMMED bit set and for the whole disk then the user should have a way of forcing that command to be issued. The assumption here is that the skilled user is about to power down that array or pull some disks or SSDs *. The more questionable cases are when a file system or the block layer is issuing a barrier or some such that translates to a SYNCHRONIZE_CACHE. That should be ignored in some cases already discussed in this thread. While working with SoCs I have noticed an interesting technique. Sub-system sized sections of the memory mapped IO space (e.g. a bank of GPIOs) can be write protected by a simple ASCII sequence **. Attempts to change configuration registers after write protect are ignored and an error is noted (if anyone cares). The same ACSII sequence can be used to un-write protect those sub-system configuration registers. Typically on a SoC if the GPIOs are randomly re-configured, it's game over. Back to the SCSI world: a better solution might be if an LLD could be informed of the reason a SCSI control command is being issued (a sort of come from field). Failing, or it addition to that, a sysfs interface could be added to filter out dangerous SCSI commands: echo SC /sys/class/scsi_device/8:0:0:0/device/filter cat /sys/class/scsi_device/8:0:0:0/device/filter FU SC If, for whatever reason, we did ignore a SYNCHRONIZE_CACHE command we could use vendor specific sense data (vendor=Linux) to indicate that a command had been ignored. That could be extended to all SCSI commands that are filtered out ***; better that than EIO, EACCES etc. Doug Gilbert * and if Linux doesn't permit this, then user might be advised to run another, more obedient, host OS with Linux running as a VM. A pass-by rather than a pass-through ... ** only the configuration registers are write protected, so data can still be written to the GPIOs *** like me, many pass-through users cannot see why SCSI commands injected to the SCSI subsystem (e.g. via sg or bsg) are filtered out silently by the block layer. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/5] scsi: ufs: move the ufshcd_hba_stop to ufshcd.c
Move the ufshcd_hba_stop from header file. Signed-off-by: Seungwon Jeon tgih@samsung.com --- drivers/scsi/ufs/ufshcd.c |9 + drivers/scsi/ufs/ufshcd.h |9 - 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 60fd40c..41b9639 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -285,6 +285,15 @@ static inline void ufshcd_hba_start(struct ufs_hba *hba) } /** + * ufshcd_hba_stop - Send controller to reset state + * @hba: per adapter instance + */ +static inline void ufshcd_hba_stop(struct ufs_hba *hba) +{ + writel(CONTROLLER_DISABLE, (hba-mmio_base + REG_CONTROLLER_ENABLE)); +} + +/** * ufshcd_is_hba_active - Get controller state * @hba: per adapter instance * diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h index 6b99a42..1680394 100644 --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -190,13 +190,4 @@ int ufshcd_init(struct device *, struct ufs_hba ** , void __iomem * , unsigned int); void ufshcd_remove(struct ufs_hba *); -/** - * ufshcd_hba_stop - Send controller to reset state - * @hba: per adapter instance - */ -static inline void ufshcd_hba_stop(struct ufs_hba *hba) -{ - writel(CONTROLLER_DISABLE, (hba-mmio_base + REG_CONTROLLER_ENABLE)); -} - #endif /* End of Header */ -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/5] scsi: ufs: amend interrupt configuration
It makes interrupt setting more flexible especially for disabling. And wrong bit mask is fixed for ver 1.0. [17:16] is added for mask. Signed-off-by: Seungwon Jeon tgih@samsung.com --- drivers/scsi/ufs/ufshcd.c | 86 - drivers/scsi/ufs/ufshcd.h |4 +- drivers/scsi/ufs/ufshci.h |5 ++- 3 files changed, 66 insertions(+), 29 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index b6c19b0..efe2256 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -35,6 +35,10 @@ #include ufshcd.h +#define UFSHCD_ENABLE_INTRS(UTP_TRANSFER_REQ_COMPL |\ +UTP_TASK_REQ_COMPL |\ +UFSHCD_ERROR_MASK) + enum { UFSHCD_MAX_CHANNEL = 0, UFSHCD_MAX_ID = 1, @@ -64,6 +68,20 @@ enum { }; /** + * ufshcd_get_intr_mask - Get the interrupt bit mask + * @hba - Pointer to adapter instance + * + * Returns interrupt bit mask per version + */ +static inline u32 ufshcd_get_intr_mask(struct ufs_hba *hba) +{ + if (hba-ufs_version == UFSHCI_VERSION_10) + return INTERRUPT_MASK_ALL_VER_10; + else + return INTERRUPT_MASK_ALL_VER_11; +} + +/** * ufshcd_get_ufs_version - Get the UFS version supported by the HBA * @hba - Pointer to adapter instance * @@ -397,25 +415,45 @@ static int ufshcd_map_sg(struct ufshcd_lrb *lrbp) } /** - * ufshcd_int_config - enable/disable interrupts + * ufshcd_enable_intr - enable interrupts * @hba: per adapter instance - * @option: interrupt option + * @intrs: interrupt bits */ -static void ufshcd_int_config(struct ufs_hba *hba, u32 option) +static void ufshcd_enable_intr(struct ufs_hba *hba, u32 intrs) { - switch (option) { - case UFSHCD_INT_ENABLE: - ufshcd_writel(hba, REG_INTERRUPT_ENABLE, hba-int_enable_mask); - break; - case UFSHCD_INT_DISABLE: - if (hba-ufs_version == UFSHCI_VERSION_10) - ufshcd_writel(hba, REG_INTERRUPT_ENABLE, - INTERRUPT_DISABLE_MASK_10); - else - ufshcd_writel(hba, REG_INTERRUPT_ENABLE, - INTERRUPT_DISABLE_MASK_11); - break; + u32 set = ufshcd_readl(hba, REG_INTERRUPT_ENABLE); + + if (hba-ufs_version == UFSHCI_VERSION_10) { + u32 rw; + rw = set INTERRUPT_MASK_RW_VER_10; + set = rw | ((set ^ intrs) intrs); + } else { + set |= intrs; + } + + ufshcd_writel(hba, REG_INTERRUPT_ENABLE, set); +} + +/** + * ufshcd_disable_intr - disable interrupts + * @hba: per adapter instance + * @intrs: interrupt bits + */ +static void ufshcd_disable_intr(struct ufs_hba *hba, u32 intrs) +{ + u32 set = ufshcd_readl(hba, REG_INTERRUPT_ENABLE); + + if (hba-ufs_version == UFSHCI_VERSION_10) { + u32 rw; + rw = (set INTERRUPT_MASK_RW_VER_10) + ~(intrs INTERRUPT_MASK_RW_VER_10); + set = rw | ((set intrs) ~INTERRUPT_MASK_RW_VER_10); + + } else { + set = ~intrs; } + + ufshcd_writel(hba, REG_INTERRUPT_ENABLE, set); } /** @@ -717,8 +755,7 @@ static int ufshcd_dme_link_startup(struct ufs_hba *hba) uic_cmd-argument3 = 0; /* enable UIC related interrupts */ - hba-int_enable_mask |= UIC_COMMAND_COMPL; - ufshcd_int_config(hba, UFSHCD_INT_ENABLE); + ufshcd_enable_intr(hba, UIC_COMMAND_COMPL); /* sending UIC commands to controller */ ufshcd_send_uic_command(hba, uic_cmd); @@ -765,13 +802,9 @@ static int ufshcd_make_hba_operational(struct ufs_hba *hba) } /* Enable required interrupts */ - hba-int_enable_mask |= (UTP_TRANSFER_REQ_COMPL | -UIC_ERROR | -UTP_TASK_REQ_COMPL | -DEVICE_FATAL_ERROR | -CONTROLLER_FATAL_ERROR | -SYSTEM_BUS_FATAL_ERROR); - ufshcd_int_config(hba, UFSHCD_INT_ENABLE); + ufshcd_enable_intr(hba, UTP_TRANSFER_REQ_COMPL | UIC_ERROR | + UTP_TASK_REQ_COMPL | DEVICE_FATAL_ERROR | + CONTROLLER_FATAL_ERROR | SYSTEM_BUS_FATAL_ERROR); /* Configure interrupt aggregation */ ufshcd_config_int_aggr(hba, INT_AGGR_CONFIG); @@ -1578,7 +1611,7 @@ static void ufshcd_hba_free(struct ufs_hba *hba) void ufshcd_remove(struct ufs_hba *hba) { /* disable interrupts */ - ufshcd_int_config(hba, UFSHCD_INT_DISABLE); + ufshcd_disable_intr(hba, hba-intr_mask); ufshcd_hba_stop(hba); ufshcd_hba_free(hba); @@ -1636,6 +1669,9 @@ int ufshcd_init(struct device *dev, struct ufs_hba **hba_handle, /* Get UFS version supported by
[PATCH 2/5] scsi: ufs: wrap the i/o access operations
Simplify operations with hiding mmio_base. Signed-off-by: Seungwon Jeon tgih@samsung.com --- drivers/scsi/ufs/ufshcd.c | 106 +++-- drivers/scsi/ufs/ufshcd.h |5 ++ 2 files changed, 49 insertions(+), 62 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 41b9639..b6c19b0 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -71,7 +71,7 @@ enum { */ static inline u32 ufshcd_get_ufs_version(struct ufs_hba *hba) { - return readl(hba-mmio_base + REG_UFS_VERSION); + return ufshcd_readl(hba, REG_UFS_VERSION); } /** @@ -130,8 +130,7 @@ static inline int ufshcd_get_tm_free_slot(struct ufs_hba *hba) */ static inline void ufshcd_utrl_clear(struct ufs_hba *hba, u32 pos) { - writel(~(1 pos), - (hba-mmio_base + REG_UTP_TRANSFER_REQ_LIST_CLEAR)); + ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_LIST_CLEAR, ~(1 pos)); } /** @@ -165,7 +164,7 @@ static inline int ufshcd_get_lists_status(u32 reg) */ static inline int ufshcd_get_uic_cmd_result(struct ufs_hba *hba) { - return readl(hba-mmio_base + REG_UIC_COMMAND_ARG_2) + return ufshcd_readl(hba, REG_UIC_COMMAND_ARG_2) MASK_UIC_COMMAND_RESULT; } @@ -243,18 +242,14 @@ ufshcd_config_int_aggr(struct ufs_hba *hba, int option) { switch (option) { case INT_AGGR_RESET: - writel((INT_AGGR_ENABLE | - INT_AGGR_COUNTER_AND_TIMER_RESET), - (hba-mmio_base + -REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL)); + ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL, + INT_AGGR_ENABLE | INT_AGGR_COUNTER_AND_TIMER_RESET); break; case INT_AGGR_CONFIG: - writel((INT_AGGR_ENABLE | - INT_AGGR_PARAM_WRITE | - INT_AGGR_COUNTER_THRESHOLD_VALUE | - INT_AGGR_TIMEOUT_VALUE), - (hba-mmio_base + -REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL)); + ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL, + INT_AGGR_ENABLE | INT_AGGR_PARAM_WRITE | + INT_AGGR_COUNTER_THRESHOLD_VALUE | + INT_AGGR_TIMEOUT_VALUE); break; } } @@ -267,12 +262,10 @@ ufshcd_config_int_aggr(struct ufs_hba *hba, int option) */ static void ufshcd_enable_run_stop_reg(struct ufs_hba *hba) { - writel(UTP_TASK_REQ_LIST_RUN_STOP_BIT, - (hba-mmio_base + - REG_UTP_TASK_REQ_LIST_RUN_STOP)); - writel(UTP_TRANSFER_REQ_LIST_RUN_STOP_BIT, - (hba-mmio_base + - REG_UTP_TRANSFER_REQ_LIST_RUN_STOP)); + ufshcd_writel(hba, REG_UTP_TASK_REQ_LIST_RUN_STOP, + UTP_TASK_REQ_LIST_RUN_STOP_BIT); + ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_LIST_RUN_STOP, + UTP_TRANSFER_REQ_LIST_RUN_STOP_BIT); } /** @@ -281,7 +274,7 @@ static void ufshcd_enable_run_stop_reg(struct ufs_hba *hba) */ static inline void ufshcd_hba_start(struct ufs_hba *hba) { - writel(CONTROLLER_ENABLE , (hba-mmio_base + REG_CONTROLLER_ENABLE)); + ufshcd_writel(hba, REG_CONTROLLER_ENABLE, CONTROLLER_ENABLE); } /** @@ -290,7 +283,7 @@ static inline void ufshcd_hba_start(struct ufs_hba *hba) */ static inline void ufshcd_hba_stop(struct ufs_hba *hba) { - writel(CONTROLLER_DISABLE, (hba-mmio_base + REG_CONTROLLER_ENABLE)); + ufshcd_writel(hba, REG_CONTROLLER_ENABLE, CONTROLLER_DISABLE); } /** @@ -301,7 +294,7 @@ static inline void ufshcd_hba_stop(struct ufs_hba *hba) */ static inline int ufshcd_is_hba_active(struct ufs_hba *hba) { - return (readl(hba-mmio_base + REG_CONTROLLER_ENABLE) 0x1) ? 0 : 1; + return (ufshcd_readl(hba, REG_CONTROLLER_ENABLE) 0x1) ? 0 : 1; } /** @@ -313,8 +306,7 @@ static inline void ufshcd_send_command(struct ufs_hba *hba, unsigned int task_tag) { __set_bit(task_tag, hba-outstanding_reqs); - writel((1 task_tag), - (hba-mmio_base + REG_UTP_TRANSFER_REQ_DOOR_BELL)); + ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_DOOR_BELL, 1 task_tag); } /** @@ -338,8 +330,7 @@ static inline void ufshcd_copy_sense_data(struct ufshcd_lrb *lrbp) */ static inline void ufshcd_hba_capabilities(struct ufs_hba *hba) { - hba-capabilities = - readl(hba-mmio_base + REG_CONTROLLER_CAPABILITIES); + hba-capabilities = ufshcd_readl(hba, REG_CONTROLLER_CAPABILITIES); /* nutrs and nutmrs are 0 based values */ hba-nutrs = (hba-capabilities MASK_TRANSFER_REQUESTS_SLOTS) + 1; @@ -356,16 +347,13 @@ static inline void ufshcd_send_uic_command(struct ufs_hba *hba, struct uic_command *uic_cmnd) { /* Write Args */ - writel(uic_cmnd-argument1, -
[PATCH 4/5] scsi: ufs: rework link start-up process
Link start-up requires long time with multiphase handshakes between UFS host and device. This affects driver's probe time. This patch let link start-up run asynchronously. And completion time of uic command is defined to avoid a permanent wait. Signed-off-by: Seungwon Jeon tgih@samsung.com --- drivers/scsi/ufs/ufshcd.c | 114 +--- drivers/scsi/ufs/ufshcd.h |6 ++- 2 files changed, 89 insertions(+), 31 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index efe2256..76ff332 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -38,6 +38,7 @@ #define UFSHCD_ENABLE_INTRS(UTP_TRANSFER_REQ_COMPL |\ UTP_TASK_REQ_COMPL |\ UFSHCD_ERROR_MASK) +#define UIC_CMD_TIMEOUT100 enum { UFSHCD_MAX_CHANNEL = 0, @@ -357,13 +358,15 @@ static inline void ufshcd_hba_capabilities(struct ufs_hba *hba) } /** - * ufshcd_send_uic_command - Send UIC commands to unipro layers + * ufshcd_dispatch_uic_cmd - Dispatch UIC commands to unipro layers * @hba: per adapter instance * @uic_command: UIC command */ static inline void -ufshcd_send_uic_command(struct ufs_hba *hba, struct uic_command *uic_cmnd) +ufshcd_dispatch_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmnd) { + init_completion(uic_cmnd-done); + /* Write Args */ ufshcd_writel(hba, REG_UIC_COMMAND_ARG_1, uic_cmnd-argument1); ufshcd_writel(hba, REG_UIC_COMMAND_ARG_2, uic_cmnd-argument2); @@ -375,6 +378,45 @@ ufshcd_send_uic_command(struct ufs_hba *hba, struct uic_command *uic_cmnd) } /** + * ufshcd_wait_for_uic_cmd - Wait complectioin of UIC command + * @hba: per adapter instance + * @uic_command: UIC command + * + * Returns 0 only if success. + */ +static int ufshcd_wait_for_uic_cmd(struct ufs_hba *hba) +{ + struct uic_command *uic_cmd = hba-active_uic_cmd; + int ret; + + if (wait_for_completion_timeout(uic_cmd-done, + msecs_to_jiffies(UIC_CMD_TIMEOUT))) + ret = ufshcd_get_uic_cmd_result(hba); + else + ret = -ETIMEDOUT; + + return ret; +} + +/** + * ufshcd_ready_uic_cmd - Check if controller is ready + *to accept UIC commands + * @hba: per adapter instance + * Return true on success, else false + */ +static inline bool ufshcd_ready_uic_cmd(struct ufs_hba *hba) +{ + if (ufshcd_readl(hba, REG_CONTROLLER_STATUS) UIC_COMMAND_READY) { + return true; + } else { + dev_err(hba-dev, + Controller not ready +to accept UIC commands\n); + return false; + } +} + +/** * ufshcd_map_sg - Map scatter-gather list to prdt * @lrbp - pointer to local reference block * @@ -735,15 +777,10 @@ static int ufshcd_dme_link_startup(struct ufs_hba *hba) { struct uic_command *uic_cmd; unsigned long flags; + int ret; - /* check if controller is ready to accept UIC commands */ - if (((ufshcd_readl(hba, REG_CONTROLLER_STATUS)) - UIC_COMMAND_READY) == 0x0) { - dev_err(hba-dev, - Controller not ready -to accept UIC commands\n); + if (!ufshcd_ready_uic_cmd(hba)) return -EIO; - } spin_lock_irqsave(hba-host-host_lock, flags); @@ -754,13 +791,16 @@ static int ufshcd_dme_link_startup(struct ufs_hba *hba) uic_cmd-argument2 = 0; uic_cmd-argument3 = 0; - /* enable UIC related interrupts */ - ufshcd_enable_intr(hba, UIC_COMMAND_COMPL); + /* Dispatching UIC commands to controller */ + ufshcd_dispatch_uic_cmd(hba, uic_cmd); - /* sending UIC commands to controller */ - ufshcd_send_uic_command(hba, uic_cmd); spin_unlock_irqrestore(hba-host-host_lock, flags); - return 0; + + ret = ufshcd_wait_for_uic_cmd(hba); + if (ret) + dev_err(hba-dev, link startup: error code %d returned\n, ret); + + return ret; } /** @@ -898,6 +938,9 @@ static int ufshcd_initialize_hba(struct ufs_hba *hba) if (ufshcd_hba_enable(hba)) return -EIO; + /* enable UIC related interrupts */ + ufshcd_enable_intr(hba, UIC_COMMAND_COMPL | UIC_ERROR); + /* Configure UTRL and UTMRL base address registers */ ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_LIST_BASE_L, lower_32_bits(hba-utrdl_dma_addr)); @@ -909,7 +952,9 @@ static int ufshcd_initialize_hba(struct ufs_hba *hba) upper_32_bits(hba-utmrdl_dma_addr)); /* Initialize unipro link startup procedure */ - return ufshcd_dme_link_startup(hba); + schedule_work(hba-link_startup_wq); + + return 0; } /** @@ -1186,6 +1231,16 @@ ufshcd_transfer_rsp_status(struct
[PATCH 5/5] scsi: ufs: add dme operations
Add uic command operations including DME_XXX series. Signed-off-by: Seungwon Jeon tgih@samsung.com --- drivers/scsi/ufs/ufs-attrs.h | 129 drivers/scsi/ufs/ufshcd.c| 220 +- drivers/scsi/ufs/ufshcd.h| 55 +++ drivers/scsi/ufs/ufshci.h| 19 4 files changed, 422 insertions(+), 1 deletions(-) create mode 100644 drivers/scsi/ufs/ufs-attrs.h diff --git a/drivers/scsi/ufs/ufs-attrs.h b/drivers/scsi/ufs/ufs-attrs.h new file mode 100644 index 000..562bb49 --- /dev/null +++ b/drivers/scsi/ufs/ufs-attrs.h @@ -0,0 +1,129 @@ +/* + * drivers/scsi/ufs/ufs-attrs.h + * + * Copyright (C) 2013 Samsung Electronics Co., Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#ifndef _UFS_ATTRS_H_ +#define _UFS_ATTRS_H_ + +/* + * PHY Adpater attributes + */ +#define PA_ACTIVETXDATALANES 0x1560 +#define PA_ACTIVERXDATALANES 0x1580 +#define PA_TXTRAILINGCLOCKS0x1564 +#define PA_PHY_TYPE0x1500 +#define PA_AVAILTXDATALANES0x1520 +#define PA_AVAILRXDATALANES0x1540 +#define PA_MINRXTRAILINGCLOCKS 0x1543 +#define PA_TXPWRSTATUS 0x1567 +#define PA_RXPWRSTATUS 0x1582 +#define PA_TXFORCECLOCK0x1562 +#define PA_TXPWRMODE 0x1563 +#define PA_LEGACYDPHYESCDL 0x1570 +#define PA_MAXTXSPEEDFAST 0x1521 +#define PA_MAXTXSPEEDSLOW 0x1522 +#define PA_MAXRXSPEEDFAST 0x1541 +#define PA_MAXRXSPEEDSLOW 0x1542 +#define PA_TXLINKSTARTUPHS 0x1544 +#define PA_TXSPEEDFAST 0x1565 +#define PA_TXSPEEDSLOW 0x1566 +#define PA_REMOTEVERINFO 0x15A0 +#define PA_TXGEAR 0x1568 +#define PA_TXTERMINATION 0x1569 +#define PA_HSSERIES0x156A +#define PA_PWRMODE 0x1571 +#define PA_RXGEAR 0x1583 +#define PA_RXTERMINATION 0x1584 +#define PA_MAXRXPWMGEAR0x1586 +#define PA_MAXRXHSGEAR 0x1587 +#define PA_RXHSUNTERMCAP 0x15A5 +#define PA_RXLSTERMCAP 0x15A6 +#define PA_PACPREQTIMEOUT 0x1590 +#define PA_PACPREQEOBTIMEOUT 0x1591 +#define PA_LOCALVERINFO0x15A9 +#define PA_TACTIVATE 0x15A8 +#define PA_PACPFRAMECOUNT 0x15C0 +#define PA_PACPERRORCOUNT 0x15C1 +#define PA_PHYTESTCONTROL 0x15C2 +#define PA_PWRMODEUSERDATA00x15B0 +#define PA_PWRMODEUSERDATA10x15B1 +#define PA_PWRMODEUSERDATA20x15B2 +#define PA_PWRMODEUSERDATA30x15B3 +#define PA_PWRMODEUSERDATA40x15B4 +#define PA_PWRMODEUSERDATA50x15B5 +#define PA_PWRMODEUSERDATA60x15B6 +#define PA_PWRMODEUSERDATA70x15B7 +#define PA_PWRMODEUSERDATA80x15B8 +#define PA_PWRMODEUSERDATA90x15B9 +#define PA_PWRMODEUSERDATA10 0x15BA +#define PA_PWRMODEUSERDATA11 0x15BB +#define PA_CONNECTEDTXDATALANE 0x1561 +#define PA_CONNECTEDRXDATALANE 0x1581 +#define PA_LOGICALLANEMAP 0x15A1 +#define PA_SLEEPNOCONFIGTIME 0x15A2 +#define PA_STALLNOCONFIGTIME 0x15A3 +#define PA_SAVECONFIGTIME 0x15A4 + +/* + * Data Link Layer Attributes + */ +#define DL_TC0TXFCTHRESHOLD0x2040 +#define DL_FC0PROTTIMEOUTVAL 0x2041 +#define DL_TC0REPLAYTIMEOUTVAL 0x2042 +#define DL_AFC0REQTIMEOUTVAL 0x2043 +#define DL_AFC0CREDITTHRESHOLD 0x2044 +#define DL_TC0OUTACKTHRESHOLD 0x2045 +#define DL_TC1TXFCTHRESHOLD0x2060 +#define DL_FC1PROTTIMEOUTVAL 0x2061 +#define DL_TC1REPLAYTIMEOUTVAL 0x2062 +#define DL_AFC1REQTIMEOUTVAL 0x2063 +#define DL_AFC1CREDITTHRESHOLD 0x2064 +#define DL_TC1OUTACKTHRESHOLD 0x2065 +#define DL_TXPREEMPTIONCAP 0x2000 +#define DL_TC0TXMAXSDUSIZE 0x2001 +#define DL_TC0RXINITCREDITVAL 0x2002 +#define DL_TC0TXBUFFERSIZE 0x2005 +#define DL_PEERTC0PRESENT 0x2046 +#define DL_PEERTC0RXINITCREVAL 0x2047 +#define DL_TC1TXMAXSDUSIZE 0x2003 +#define DL_TC1RXINITCREDITVAL 0x2004 +#define DL_TC1TXBUFFERSIZE 0x2006 +#define DL_PEERTC1PRESENT 0x2066 +#define DL_PEERTC1RXINITCREVAL 0x2067 + +/* + * Network Layer Attributes + */ +#define N_DEVICEID 0x3000 +#define N_DEVICEID_VALID 0x3001 +#define N_TC0TXMAXSDUSIZE 0x3020 +#define N_TC1TXMAXSDUSIZE 0x3021 + +/* + * Transport Layer Attributes + */ +#define T_NUMCPORTS0x4000 +#define T_NUMTESTFEATURES 0x4001 +#define T_CONNECTIONSTATE 0x4020 +#define T_PEERDEVICEID 0x4021 +#define T_PEERCPORTID 0x4022 +#define T_TRAFFICCLASS 0x4023 +#define T_PROTOCOLID 0x4024 +#define T_CPORTFLAGS 0x4025 +#define T_TXTOKENVALUE 0x4026 +#define T_RXTOKENVALUE 0x4027 +#define T_LOCALBUFFERSPACE 0x4028 +#define T_PEERBUFFERSPACE 0x4029 +#define T_CREDITSTOSEND0x402A +#define T_CPORTMODE0x402B +#define T_TC0TXMAXSDUSIZE
RE: T10 WCE interpretation in Linux device level access
Jeremy, It looks like, you, Paolo and Ric have hit the nail on the head here - this is a nice summary, IMHO: On 4/24/2013 7:57 AM, Paolo Bonzini wrote: If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. That's exactly the point of SYNC_NV=1. Well its the point, but the specification is written such that the vendors can choose to implement it any way they wish, especially for split cache systems where there is both volatile and non volatile cache. Independent of T10's best intentions at the time, the implementations aren't doing what's needed or intended, and I'd guess that the SYNC_NV bit is not being set to 1 by [other people's ;-) ] software that should be setting it to 1 if it were paying attention to the standard. This is further complicated by it being completely legitimate wrt the SCSI standard to put non-volatile cache in a system and not have the SCSI interface admit that the non-volatile cache exists (WCE=0, SYNCHRONIZE CACHE is a no-op independent of the value of SYNC_NV). I believe that Rob Elliot's 13-050 proposal to obsolete SYNC_NV and re-specify SYNCHRONIZE CACHE to make all data non-volatile by whatever means the target chooses is what T10 should do, and that matches Ric's summary: If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. Beyond that, attempting to manage drive removal from storage systems via the SCSI interface with standard commands is a waste of time and effort, IMHO. In a serious storage array (and even some fairly simple RAID controllers), some vendor-specific magic is needed to get the array (or controller) to prepare so that the drive can be removed cleanly. To oversimplify, it's not enough to flush data to the drive; the array or controller is stateful, and hence has to be told to forget the drive, where forget involves things that are rather implementation-specific. Thanks, --David David L. Black, Distinguished Engineer EMC Corporation, 176 South St., Hopkinton, MA 01748 +1 (508) 293-7953 FAX: +1 (508) 293-7786 david.bl...@emc.com Mobile: +1 (978) 394-7754 -Original Message- From: Jeremy Linton [mailto:jlin...@tributary.com] Sent: Wednesday, April 24, 2013 10:36 AM To: Paolo Bonzini Cc: Ric Wheeler; Hannes Reinecke; James Bottomley; linux-scsi@vger.kernel.org; Martin K. Petersen; Jeff Moyer; Tejun Heo; Mike Snitzer; Black, David; Elliott, Robert (Server Storage); Knight, Frederick Subject: Re: T10 WCE interpretation in Linux device level access On 4/24/2013 7:57 AM, Paolo Bonzini wrote: If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. That's exactly the point of SYNC_NV=1. Well its the point, but the specification is written such that the vendors can choose to implement it any way they wish, especially for split cache systems where there is both volatile and non volatile cache. Flushing the NV cache to medium (as is the current behavior) may not be a bad idea anyway. Thats because I know of a large vendors array where the non-volatile cache might be better described as the sometimes non-volatile cache. That is because a failure to flush the volatile portions results in the non-volatile portions being considered invalid when power is restored. This fences the volume, and the usual method for recovering the array is to call support and have them invalidate the NV portions of the cache. Thereby negating the whole reason for having a NV cache. I'm sure they don't tell customers this fact when they sell the array, when it happened in our lab I was in a state of shock for about a week. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
On 04/24/2013 02:20 PM, Black, David wrote: Jeremy, It looks like, you, Paolo and Ric have hit the nail on the head here - this is a nice summary, IMHO: On 4/24/2013 7:57 AM, Paolo Bonzini wrote: If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. That's exactly the point of SYNC_NV=1. Well its the point, but the specification is written such that the vendors can choose to implement it any way they wish, especially for split cache systems where there is both volatile and non volatile cache. Independent of T10's best intentions at the time, the implementations aren't doing what's needed or intended, and I'd guess that the SYNC_NV bit is not being set to 1 by [other people's ;-) ] software that should be setting it to 1 if it were paying attention to the standard. This is further complicated by it being completely legitimate wrt the SCSI standard to put non-volatile cache in a system and not have the SCSI interface admit that the non-volatile cache exists (WCE=0, SYNCHRONIZE CACHE is a no-op independent of the value of SYNC_NV). I believe that Rob Elliot's 13-050 proposal to obsolete SYNC_NV and re-specify SYNCHRONIZE CACHE to make all data non-volatile by whatever means the target chooses is what T10 should do, and that matches Ric's summary: If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. Beyond that, attempting to manage drive removal from storage systems via the SCSI interface with standard commands is a waste of time and effort, IMHO. In a serious storage array (and even some fairly simple RAID controllers), some vendor-specific magic is needed to get the array (or controller) to prepare so that the drive can be removed cleanly. To oversimplify, it's not enough to flush data to the drive; the array or controller is stateful, and hence has to be told to forget the drive, where forget involves things that are rather implementation-specific. Thanks, --David So I think that leaves us with some arrays that might benefit from Paolo's proposed patch, but almost certainly still will need to be able to ignore flushes for some block device accessing DB's, etc Ric -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
On Wed, 2013-04-24 at 16:41 -0400, Ric Wheeler wrote: On 04/24/2013 02:20 PM, Black, David wrote: Jeremy, It looks like, you, Paolo and Ric have hit the nail on the head here - this is a nice summary, IMHO: On 4/24/2013 7:57 AM, Paolo Bonzini wrote: If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. That's exactly the point of SYNC_NV=1. Well its the point, but the specification is written such that the vendors can choose to implement it any way they wish, especially for split cache systems where there is both volatile and non volatile cache. Independent of T10's best intentions at the time, the implementations aren't doing what's needed or intended, and I'd guess that the SYNC_NV bit is not being set to 1 by [other people's ;-) ] software that should be setting it to 1 if it were paying attention to the standard. This is further complicated by it being completely legitimate wrt the SCSI standard to put non-volatile cache in a system and not have the SCSI interface admit that the non-volatile cache exists (WCE=0, SYNCHRONIZE CACHE is a no-op independent of the value of SYNC_NV). I believe that Rob Elliot's 13-050 proposal to obsolete SYNC_NV and re-specify SYNCHRONIZE CACHE to make all data non-volatile by whatever means the target chooses is what T10 should do, and that matches Ric's summary: If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. Beyond that, attempting to manage drive removal from storage systems via the SCSI interface with standard commands is a waste of time and effort, IMHO. In a serious storage array (and even some fairly simple RAID controllers), some vendor-specific magic is needed to get the array (or controller) to prepare so that the drive can be removed cleanly. To oversimplify, it's not enough to flush data to the drive; the array or controller is stateful, and hence has to be told to forget the drive, where forget involves things that are rather implementation-specific. Thanks, --David So I think that leaves us with some arrays that might benefit from Paolo's proposed patch, but almost certainly still will need to be able to ignore flushes for some block device accessing DB's, etc That just leaves us with random standards behaviour. Lets permit the deterministic thing instead for the distros. It kills two birds with one stone because we can set WCE for the stupid UAS devices that clear it wrongly as well. For those who don't read code well, you add a temporary prefix to the cache set in echo xxx /sys/class/scsi_disk/disk/cache_type and it will set the flags for the lifetime of the current kernel, but won't try to do a mode select to make them permanent. James --- diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 7992635..af16e88 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -142,6 +142,7 @@ sd_store_cache_type(struct device *dev, struct device_attribute *attr, char *buffer_data; struct scsi_mode_data data; struct scsi_sense_hdr sshdr; + const char *temp = temporary ; int len; if (sdp-type != TYPE_DISK) @@ -150,6 +151,13 @@ sd_store_cache_type(struct device *dev, struct device_attribute *attr, * it's not worth the risk */ return -EINVAL; + if (strncmp(buf, temp, sizeof(temp) - 1) == 0) { + buf += sizeof(temp) - 1; + sdkp-cache_override = 1; + } else { + sdkp-cache_override = 0; + } + for (i = 0; i ARRAY_SIZE(sd_cache_types); i++) { len = strlen(sd_cache_types[i]); if (strncmp(sd_cache_types[i], buf, len) == 0 @@ -162,6 +170,13 @@ sd_store_cache_type(struct device *dev, struct device_attribute *attr, return -EINVAL; rcd = ct 0x01 ? 1 : 0; wce = ct 0x02 ? 1 : 0; + + if (sdkp-cache_override) { + sdkp-WCE = wce; + sdkp-RCD = rcd; + return count; + } + if (scsi_mode_sense(sdp, 0x08, 8, buffer, sizeof(buffer), SD_TIMEOUT, SD_MAX_RETRIES, data, NULL)) return -EINVAL; @@ -2319,6 +2334,10 @@ sd_read_cache_type(struct scsi_disk *sdkp, unsigned char *buffer) int old_rcd = sdkp-RCD; int old_dpofua = sdkp-DPOFUA; + + if (sdkp-cache_override) + return; + first_len = 4; if (sdp-skip_ms_page_8) { if (sdp-type == TYPE_RBC) @@ -2812,6 +2831,7 @@ static void sd_probe_async(void *data, async_cookie_t cookie) sdkp-capacity = 0;
Re: T10 WCE interpretation in Linux device level access
Il 24/04/2013 23:02, James Bottomley ha scritto: That just leaves us with random standards behaviour. Lets permit the deterministic thing instead for the distros. It kills two birds with one stone because we can set WCE for the stupid UAS devices that clear it wrongly as well. For those who don't read code well, you add a temporary prefix to the cache set in echo xxx /sys/class/scsi_disk/disk/cache_type and it will set the flags for the lifetime of the current kernel, but won't try to do a mode select to make them permanent. Having the knob is useful indeed. I don't like the temporary name though, because temporary write-through doesn't sound like it can eat data on a power loss. What about force or assume? Also, this would be in addition to my patch (when tested), right? Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
On Wed, 2013-04-24 at 23:54 +0200, Paolo Bonzini wrote: Il 24/04/2013 23:02, James Bottomley ha scritto: That just leaves us with random standards behaviour. Lets permit the deterministic thing instead for the distros. It kills two birds with one stone because we can set WCE for the stupid UAS devices that clear it wrongly as well. For those who don't read code well, you add a temporary prefix to the cache set in echo xxx /sys/class/scsi_disk/disk/cache_type and it will set the flags for the lifetime of the current kernel, but won't try to do a mode select to make them permanent. Having the knob is useful indeed. I don't like the temporary name though, because temporary write-through doesn't sound like it can eat data on a power loss. What about force or assume? I'm fairly ambivalent, except not force. The default behaviour is to do the mode select, so force seems to imply that as well, except it won't. I don't see a difference between assume and temporary. Also, this would be in addition to my patch (when tested), right? Not really ... given T10s deprecation I don't think we want to touch anything to do with SYNC_NV because it just adds to the uncertainty about what will actually happen. Giving the ability to control WCE (and RCD) fixes all the problems raised so far. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] qla2xxx: Patches for scsi misc branch.
Hi James, This patchset fixes the sparse warnings. Please apply the following patches to the scsi tree at your earliest convenience for inclusion in the next mainline merge window. Thanks, ~Saurav Armen Baloyan (1): qla2xxx: fix sparse warning large integer implicitly truncated to unsigned type Fengguang Wu (1): qla2xxx: qla2x00_sp_compl can be static. drivers/scsi/qla2xxx/qla_mr.c |6 ++ drivers/scsi/qla2xxx/qla_os.c |2 +- 2 files changed, 3 insertions(+), 5 deletions(-) -- 1.7.7 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] qla2xxx: qla2x00_sp_compl can be static.
From: Fengguang Wu fengguang...@intel.com Signed-off-by: Fengguang Wu fengguang...@intel.com Signed-off-by: Saurav Kashyap saurav.kash...@qlogic.com --- drivers/scsi/qla2xxx/qla_os.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index a083715..7b621c2 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -644,7 +644,7 @@ qla2x00_sp_free_dma(void *vha, void *ptr) qla2x00_rel_sp(sp-fcport-vha, sp); } -void +static void qla2x00_sp_compl(void *data, void *ptr, int res) { struct qla_hw_data *ha = (struct qla_hw_data *)data; -- 1.7.7 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] qla2xxx: fix sparse warning large integer implicitly truncated to unsigned type
From: Armen Baloyan armen.balo...@qlogic.com Signed-off-by: Armen Baloyan armen.balo...@qlogic.com Signed-off-by: Saurav Kashyap saurav.kash...@qlogic.com --- drivers/scsi/qla2xxx/qla_mr.c |6 ++ 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_mr.c b/drivers/scsi/qla2xxx/qla_mr.c index 729b743..937fed8 100644 --- a/drivers/scsi/qla2xxx/qla_mr.c +++ b/drivers/scsi/qla2xxx/qla_mr.c @@ -3003,12 +3003,10 @@ qlafx00_build_scsi_iocbs(srb_t *sp, struct cmd_type_7_fx00 *cmd_pkt, /* Set transfer direction */ if (cmd-sc_data_direction == DMA_TO_DEVICE) { - lcmd_pkt-cntrl_flags = - __constant_cpu_to_le16(TMF_WRITE_DATA); + lcmd_pkt-cntrl_flags = TMF_WRITE_DATA; vha-qla_stats.output_bytes += scsi_bufflen(cmd); } else if (cmd-sc_data_direction == DMA_FROM_DEVICE) { - lcmd_pkt-cntrl_flags = - __constant_cpu_to_le16(TMF_READ_DATA); + lcmd_pkt-cntrl_flags = TMF_READ_DATA; vha-qla_stats.input_bytes += scsi_bufflen(cmd); } -- 1.7.7 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
On 04/24/2013 06:09 PM, James Bottomley wrote: On Wed, 2013-04-24 at 23:54 +0200, Paolo Bonzini wrote: Il 24/04/2013 23:02, James Bottomley ha scritto: That just leaves us with random standards behaviour. Lets permit the deterministic thing instead for the distros. It kills two birds with one stone because we can set WCE for the stupid UAS devices that clear it wrongly as well. For those who don't read code well, you add a temporary prefix to the cache set in echo xxx /sys/class/scsi_disk/disk/cache_type and it will set the flags for the lifetime of the current kernel, but won't try to do a mode select to make them permanent. Having the knob is useful indeed. I don't like the temporary name though, because temporary write-through doesn't sound like it can eat data on a power loss. What about force or assume? I'm fairly ambivalent, except not force. The default behaviour is to do the mode select, so force seems to imply that as well, except it won't. I don't see a difference between assume and temporary. Also, this would be in addition to my patch (when tested), right? Not really ... given T10s deprecation I don't think we want to touch anything to do with SYNC_NV because it just adds to the uncertainty about what will actually happen. Giving the ability to control WCE (and RCD) fixes all the problems raised so far. James Why are we turning off the RCD bit in this? Not sure it matters, but we only should care about WCE (and the dirty write cache data)? Thanks! Ric -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
On Wed, 2013-04-24 at 18:36 -0400, Ric Wheeler wrote: On 04/24/2013 06:09 PM, James Bottomley wrote: On Wed, 2013-04-24 at 23:54 +0200, Paolo Bonzini wrote: Il 24/04/2013 23:02, James Bottomley ha scritto: That just leaves us with random standards behaviour. Lets permit the deterministic thing instead for the distros. It kills two birds with one stone because we can set WCE for the stupid UAS devices that clear it wrongly as well. For those who don't read code well, you add a temporary prefix to the cache set in echo xxx /sys/class/scsi_disk/disk/cache_type and it will set the flags for the lifetime of the current kernel, but won't try to do a mode select to make them permanent. Having the knob is useful indeed. I don't like the temporary name though, because temporary write-through doesn't sound like it can eat data on a power loss. What about force or assume? I'm fairly ambivalent, except not force. The default behaviour is to do the mode select, so force seems to imply that as well, except it won't. I don't see a difference between assume and temporary. Also, this would be in addition to my patch (when tested), right? Not really ... given T10s deprecation I don't think we want to touch anything to do with SYNC_NV because it just adds to the uncertainty about what will actually happen. Giving the ability to control WCE (and RCD) fixes all the problems raised so far. Why are we turning off the RCD bit in this? Not sure it matters, but we only should care about WCE (and the dirty write cache data)? Well, it's in the code. Cache policy is a combination of those two bits. The cache type takes a cache policy string, ergo it must update both. We don't do anything with it because having a write back cache and no cache at all is transparent to us. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6] scsi_debug: fix logical block provisioning support
On 13-04-16 09:11 AM, Akinobu Mita wrote: I tried testing the logical block provisioning support in scsi_debug, but it didn't work as I expected. For example, load scsi_debug module with UNMAP command supported and fill the storage with random data. # modprobe scsi_debug lbpu=1 # dd if=/dev/urandom of=/dev/sdb Then, try to unmap LBA 0, but Get LBA status reports: # sg_unmap --lba=0 --num=1 /dev/sdb # sg_get_lba_status --lba=0 /dev/sdb descriptor LBA: 0x blocks: 16384 mapped This is unexpected result. Because UNMAP command to LBA 0 finished without any errors, but Get LBA status shows that LBA 0 is still mapped. I looked around the logical block provisioning support in scsi_debug, and I found several problems there. This patch series tries to fix these problems and it is broken into small patches as much as possible for ease of review. Cc: James E.J. Bottomley jbottom...@parallels.com Cc: linux-scsi@vger.kernel.org Cc: Douglas Gilbert dgilb...@interlog.com Cc: Martin K. Petersen martin.peter...@oracle.com Akinobu Mita (6): scsi_debug: call map_region() and unmap_region() only when needed scsi_debug: prohibit scsi_debug_unmap_granularity == scsi_debug_unmap_alignment scsi_debug: clear correct memory region when LBPRZ is enabled scsi_debug: add translation functions between LBA and index of provisioning map scsi_debug: fix initialization of provisioning map scsi_debug: fix logical block provisioning support I'd like to see some feedback from Martin Petersen on this set of patches. For my part, for this patch series (1/6 fo 6/6): Acked-by: Douglas Gilbert dgilb...@interlog.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] scsi_debug: fix data integrity support
On 13-04-21 05:17 AM, Akinobu Mita wrote: When I tried testing the data integrity support in scsi_debug on x86_32, I got CONFIG_DEBUG_HIGHMEM warnings and protection errors. This was triggered due to misused kmap_atomic/kunmap_atomic. And then, while I was testing the fix of the above issue with several combination with module parameters dix and dif, I found that doing 'modprobe scsi_debug dif=0 dix=1' causes kernel crash. This patch set includes these fixes and cleanup which is related to data integrity support. Cc: James E.J. Bottomley jbottom...@parallels.com Cc: Douglas Gilbert dgilb...@interlog.com Cc: Martin K. Petersen martin.peter...@oracle.com Cc: linux-scsi@vger.kernel.org Akinobu Mita (3): scsi_debug: fix data integrity support on highmem machine scsi_debug: fix NULL pointer dereference with parameters dif=0 dix=1 scsi_debug: simplify offset calculation for dif_storep drivers/scsi/scsi_debug.c | 29 - 1 file changed, 12 insertions(+), 17 deletions(-) Again, I'd like to see some feedback from Martin Petersen on this set of patches. For my part, for this patch series (1/3 to 3/3): Acked-by: Douglas Gilbert dgilb...@interlog.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] [SCSI] sd: workaround invalid OPTIMAL TRANSFER LENGTH
Mike == Mike Snitzer snit...@redhat.com writes: Mike, Following up on our discussion at LSF/MM last week... Mike Workaround disk firmware that improperly sets OPTIMAL TRANSFER Mike LENGTH to 0x (aka UINT_MAX or 4294967295U) by assuming Mike this _optional_ BLOCK LIMITS VPD field was not specified (0). My only real objection is that UINT_MAX appears to be completely coincidental. What if the next drive specifies UINT_MAX-1 or $RANDOM? Lacking a decent heuristic I'd rather just blacklist the offending drive using the skip_vpd_pages knob we already have in place. -- Martin K. Petersen Oracle Linux Engineering [SCSI] Workaround for disks that report bad optimal transfer length Not all disks fill out the VPD pages correctly. Add a blacklist flag that allows us ignore the SBC-3 VPD pages for a given device. The BLIST_SKIP_VPD_PAGES flag triggers our existing skip_vpd_pages scsi_device parameter to bypass VPD scanning. Also blacklist the offending Seagate drive model. Reported-by: Mike Snitzer snit...@redhat.com Signed-off-by: Martin K. Petersen martin.peter...@oracle.com diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c index 43fca91..f969aca 100644 --- a/drivers/scsi/scsi_devinfo.c +++ b/drivers/scsi/scsi_devinfo.c @@ -228,6 +228,7 @@ static struct { {SanDisk, ImageMate CF-SD1, NULL, BLIST_FORCELUN}, {SEAGATE, ST34555N, 0930, BLIST_NOTQ},/* Chokes on tagged INQUIRY */ {SEAGATE, ST3390N, 9546, BLIST_NOTQ}, + {SEAGATE, ST900MM0006, NULL, BLIST_SKIP_VPD_PAGES}, {SGI, RAID3, *, BLIST_SPARSELUN}, {SGI, RAID5, *, BLIST_SPARSELUN}, {SGI, TP9100, *, BLIST_REPORTLUN2}, diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 3e58b22..b9e39e0 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -924,6 +924,9 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result, if (*bflags BLIST_NO_DIF) sdev-no_dif = 1; + if (*bflags BLIST_SKIP_VPD_PAGES) + sdev-skip_vpd_pages = 1; + transport_configure_device(sdev-sdev_gendev); if (sdev-host-hostt-slave_configure) { diff --git a/include/scsi/scsi_devinfo.h b/include/scsi/scsi_devinfo.h index cc1f3e7..447d2d7 100644 --- a/include/scsi/scsi_devinfo.h +++ b/include/scsi/scsi_devinfo.h @@ -31,4 +31,5 @@ #define BLIST_MAX_512 0x80 /* maximum 512 sector cdb length */ #define BLIST_ATTACH_PQ3 0x100 /* Scan: Attach to PQ3 devices */ #define BLIST_NO_DIF 0x200 /* Disable T10 PI (DIF) */ +#define BLIST_SKIP_VPD_PAGES 0x400 /* Ignore SBC-3 VPD pages */ #endif -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: T10 WCE interpretation in Linux device level access
James == James Bottomley james.bottom...@hansenpartnership.com writes: James I'm fairly ambivalent, except not force. The default behaviour James is to do the mode select, so force seems to imply that as well, James except it won't. I don't see a difference between assume and James temporary. I'm ok with your patch. And a strong believer in not altering the SYNCHRONIZE CACHE behavior that's been rigorously tested in the field by adding SYNC_NV to the mix. James Not really ... given T10s deprecation I don't think we want to James touch anything to do with SYNC_NV because it just adds to the James uncertainty about what will actually happen. Yep. James Giving the ability to control WCE (and RCD) fixes all the James problems raised so far. If there are devices that would truly benefit from SYNC_NV we could add a sync_nv parameter to scsi_disk's sysfs that could be used to set that bit when issuing flush_cmnd. But it would be something we would do manually on a per-device basis and not something that is automatically keyed off of NV_SUP (SYNC_NV doesn't require NV_SUP, btw., so that's not even a valid check). -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/6] scsi_debug: prohibit scsi_debug_unmap_granularity == scsi_debug_unmap_alignment
Akinobu == Akinobu Mita akinobu.m...@gmail.com writes: Akinobu scsi_debug prohibits setting scsi_debug_unmap_alignment to be Akinobu greater than scsi_debug_unmap_granularity. But setting them to Akinobu be the same value is not prohibited. In this case, the only Akinobu difference with scsi_debug_unmap_alignment == 0 is the logical Akinobu blocks from 0 to scsi_debug_unmap_alignment - 1 cannot be Akinobu unmapped. But the difference is not properly handled in the Akinobu current code. Akinobu So this prohibits such unusual setting. Acked-by: Martin K. Petersen martin.peter...@oracle.com -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/6] scsi_debug: clear correct memory region when LBPRZ is enabled
Akinobu == Akinobu Mita akinobu.m...@gmail.com writes: Akinobu The function unmap_region() clears memory region specified as Akinobu the logical block address and the number of logical blocks in Akinobu ramdisk storage (fake_storep) if lbpu and lbprz module Akinobu parameters are enabled. Akinobu In the while loop of unmap_region(), it advances optimal unmap Akinobu granularity in logical blocks. But it only clears one logical Akinobu block at LBA 'block' per loop iteration. And furthermore, the Akinobu 'block' is not pointing to a logical block address which should Akinobu be cleared, it is a index of probisioning map (map_storep). Acked-by: Martin K. Petersen martin.peter...@oracle.com -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/6] scsi_debug: add translation functions between LBA and index of provisioning map
Akinobu == Akinobu Mita akinobu.m...@gmail.com writes: Akinobu The translation from LBA to index of provisioning map Akinobu (map_storep) is used in various places (map_state(), Akinobu map_region(), and unmap_region()). But it is not correctly Akinobu calculated if scsi_debug_unmap_alignment is zero. Akinobu This introduces correct translation functions between LBA and Akinobu index of provisioning map: Akinobu static unsigned long lba_to_map_index(sector_t lba); Akinobu static sector_t map_index_to_lba(unsigned long index); Akinobu Actual bug fixes with using these functions will be done by Akinobu forthcoming patches. Acked-by: Martin K. Petersen martin.peter...@oracle.com -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6] scsi_debug: fix initialization of provisioning map
Akinobu == Akinobu Mita akinobu.m...@gmail.com writes: Akinobu provisioning map (map_storep) is a bitmap accessed by bitops. Akinobu So the allocation size should be a multiple of sizeof(unsigned Akinobu long) and also the bitmap should be cleared by using Akinobu bitmap_clear() instead of memset(). Akinobu Otherwise it will cause problem on big-endian architecture if Akinobu the number of bits is not a multiple of BITS_PER_LONG. Acked-by: Martin K. Petersen martin.peter...@oracle.com -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: sd: workaround invalid OPTIMAL TRANSFER LENGTH
On Wed, Apr 24 2013 at 9:19pm -0400, Martin K. Petersen martin.peter...@oracle.com wrote: Mike == Mike Snitzer snit...@redhat.com writes: Mike, Following up on our discussion at LSF/MM last week... Mike Workaround disk firmware that improperly sets OPTIMAL TRANSFER Mike LENGTH to 0x (aka UINT_MAX or 4294967295U) by assuming Mike this _optional_ BLOCK LIMITS VPD field was not specified (0). My only real objection is that UINT_MAX appears to be completely coincidental. What if the next drive specifies UINT_MAX-1 or $RANDOM? Lacking a decent heuristic I'd rather just blacklist the offending drive using the skip_vpd_pages knob we already have in place. OK, sounds fine to me, thanks. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/6] scsi_debug: fix logical block provisioning support
Akinobu == Akinobu Mita akinobu.m...@gmail.com writes: Akinobu This problem is due to the wrong translation between LBA and Akinobu index of provisioning map. Fix it by using correct translation Akinobu functions added previously. Acked-by: Martin K. Petersen martin.peter...@oracle.com -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch 1/1] sd_dif: problem with verify of type 1 protection information (PI)
Jeremy == Jeremy Higdon jer...@sgi.com writes: Jeremy It appears to me that there is a problem with handling of type 1 Jeremy protection information. It is considering a logical block Jeremy reference tag of 0x to be an error, but it is actually Jeremy valid any time ((lba 0x) == 0x) [for example, Jeremy 2TiB-1, 4TiB-1, 6TiB-1, etc.]. Yes, that's a bogus check. It's actually one I put in way back to debug a faulty device and forgot to back out. My DIF test tooling does not rely on the block layer generate/verify functions so I never noticed. The same issue was reported recently by Keith Busch. Signed-off-by: Martin K. Petersen martin.peter...@oracle.com -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] scsi_debug: call map_region() and unmap_region() only when needed
Akinobu == Akinobu Mita akinobu.m...@gmail.com writes: Akinobu If the logical block provisioning is not enabled, map_region() Akinobu and unmap_region() have no effect and they don't need to be Akinobu called. Akinobu So this makes map_region() and unmap_region() to be called only Akinobu when scsi_debug_lbp() returns true, i.e. logical block Akinobu provisioning is enabled. Acked-by: Martin K. Petersen martin.peter...@oracle.com Akinobu Because scsi_debug_unmap_granularity cannot be zero with usual Akinobu setting: scsi_debug_unmap_granularity is 1 by default, and it Akinobu can be changed to zero with explicit module parameter setting Akinobu only when the logical block provisioning is disabled. But it Akinobu is only meaningful module parameter when the logical block Akinobu provisioning is enabled. As you have found out, I generally allow module parameter values and combinations thereof that do not make strict sense. That's because the main reason for adding these parameters in the first place is to test the block and SCSI layer code that queries them. scsi_debug is a test vehicle and being able to simulate a device with broken reporting is something I use a lot. That said, I don't think we have any test scripts that actually depend on this combination of unmap parameters so I'm ok with you cleaning them up. -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] scsi_debug: fix data integrity support on highmem machine
Akinobu == Akinobu Mita akinobu.m...@gmail.com writes: Akinobu kmap_atomic() is now using stack based implementation and Akinobu doesn't take the KM_type argument anymore. So nesting Akinobu kmap_atomic() calls must be properly stacked. Akinobu This fixes nesting kmap_atomic() calls for scsi_sglist and Akinobu scsi_prot_sglist in prot_verify_write(). I don't see how the prog_sglist is incorrectly nested? Also, with your code you also end up mapping and unmapping the protection page for every data segment. The two scatterlists have different cadence. Akinobu This also fixes another issue that invalid kmap address is used Akinobu for kunmap_atomic(): the kmap address 'daddr' is incremented in Akinobu the loop for each data page, and it can reach the next page Akinobu boundary. That fix is fine. -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] scsi_debug: simplify offset calculation for dif_storep
Akinobu == Akinobu Mita akinobu.m...@gmail.com writes: + sector += len / sizeof(*dif_storep); I'd rather see sizeof(struct scsi_dif_tuple) here. But that's just personal preference. Otherwise ok. -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] scsi_debug: fix NULL pointer dereference with parameters dif=0 dix=1
Akinobu == Akinobu Mita akinobu.m...@gmail.com writes: Akinobu The protection info dif_storep is allocated only when parameter Akinobu dif is not zero. But it will be accessed when reading or Akinobu writing to the storage installed with parameter dix is not Akinobu zero. Akinobu So kernel crashes if scsi_debug module is loaded with Akinobu parameters dix=1 and dif=0. The full story is that scsi_debug does not support DIF and DIX correctly by virtue of simultaneously being the HBA and the target. And since there is no actual data transfer between the HBA and the target the notion of DIF is weak at best. I did look into making scsi_debug do the right thing but it's quite a bit of code and I lost interest about halfway through the effort. If you'd like to fix this properly I can probably find the patch to use as baseline? In this particular case I'm ok with your patch keying off of scsi_debug_dix. But it's very important to me that I'm able to set P_TYPE and the host protection mask as I see fit using the module parameters. Even when the combination of options don't make strict sense. Acked-by: Martin K. Petersen martin.peter...@oracle.com -- Martin K. Petersen Oracle Linux Engineering -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] scsi: ufs: wrap the i/o access operations
On 4/24/2013 9:36 PM, Seungwon Jeon wrote: Simplify operations with hiding mmio_base. Signed-off-by: Seungwon Jeon tgih@samsung.com --- drivers/scsi/ufs/ufshcd.c | 106 +++-- drivers/scsi/ufs/ufshcd.h |5 ++ 2 files changed, 49 insertions(+), 62 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 41b9639..b6c19b0 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -71,7 +71,7 @@ enum { */ static inline u32 ufshcd_get_ufs_version(struct ufs_hba *hba) { - return readl(hba-mmio_base + REG_UFS_VERSION); + return ufshcd_readl(hba, REG_UFS_VERSION); } /** @@ -130,8 +130,7 @@ static inline int ufshcd_get_tm_free_slot(struct ufs_hba *hba) */ static inline void ufshcd_utrl_clear(struct ufs_hba *hba, u32 pos) { - writel(~(1 pos), - (hba-mmio_base + REG_UTP_TRANSFER_REQ_LIST_CLEAR)); + ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_LIST_CLEAR, ~(1 pos)); } /** @@ -165,7 +164,7 @@ static inline int ufshcd_get_lists_status(u32 reg) */ static inline int ufshcd_get_uic_cmd_result(struct ufs_hba *hba) { - return readl(hba-mmio_base + REG_UIC_COMMAND_ARG_2) + return ufshcd_readl(hba, REG_UIC_COMMAND_ARG_2) MASK_UIC_COMMAND_RESULT; } @@ -243,18 +242,14 @@ ufshcd_config_int_aggr(struct ufs_hba *hba, int option) { switch (option) { case INT_AGGR_RESET: - writel((INT_AGGR_ENABLE | - INT_AGGR_COUNTER_AND_TIMER_RESET), - (hba-mmio_base + -REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL)); + ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL, + INT_AGGR_ENABLE | INT_AGGR_COUNTER_AND_TIMER_RESET); break; case INT_AGGR_CONFIG: - writel((INT_AGGR_ENABLE | - INT_AGGR_PARAM_WRITE | - INT_AGGR_COUNTER_THRESHOLD_VALUE | - INT_AGGR_TIMEOUT_VALUE), - (hba-mmio_base + -REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL)); + ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL, + INT_AGGR_ENABLE | INT_AGGR_PARAM_WRITE | + INT_AGGR_COUNTER_THRESHOLD_VALUE | + INT_AGGR_TIMEOUT_VALUE); break; } } @@ -267,12 +262,10 @@ ufshcd_config_int_aggr(struct ufs_hba *hba, int option) */ static void ufshcd_enable_run_stop_reg(struct ufs_hba *hba) { - writel(UTP_TASK_REQ_LIST_RUN_STOP_BIT, - (hba-mmio_base + - REG_UTP_TASK_REQ_LIST_RUN_STOP)); - writel(UTP_TRANSFER_REQ_LIST_RUN_STOP_BIT, - (hba-mmio_base + - REG_UTP_TRANSFER_REQ_LIST_RUN_STOP)); + ufshcd_writel(hba, REG_UTP_TASK_REQ_LIST_RUN_STOP, + UTP_TASK_REQ_LIST_RUN_STOP_BIT); + ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_LIST_RUN_STOP, + UTP_TRANSFER_REQ_LIST_RUN_STOP_BIT); } /** @@ -281,7 +274,7 @@ static void ufshcd_enable_run_stop_reg(struct ufs_hba *hba) */ static inline void ufshcd_hba_start(struct ufs_hba *hba) { - writel(CONTROLLER_ENABLE , (hba-mmio_base + REG_CONTROLLER_ENABLE)); + ufshcd_writel(hba, REG_CONTROLLER_ENABLE, CONTROLLER_ENABLE); } /** @@ -290,7 +283,7 @@ static inline void ufshcd_hba_start(struct ufs_hba *hba) */ static inline void ufshcd_hba_stop(struct ufs_hba *hba) { - writel(CONTROLLER_DISABLE, (hba-mmio_base + REG_CONTROLLER_ENABLE)); + ufshcd_writel(hba, REG_CONTROLLER_ENABLE, CONTROLLER_DISABLE); } /** @@ -301,7 +294,7 @@ static inline void ufshcd_hba_stop(struct ufs_hba *hba) */ static inline int ufshcd_is_hba_active(struct ufs_hba *hba) { - return (readl(hba-mmio_base + REG_CONTROLLER_ENABLE) 0x1) ? 0 : 1; + return (ufshcd_readl(hba, REG_CONTROLLER_ENABLE) 0x1) ? 0 : 1; } /** @@ -313,8 +306,7 @@ static inline void ufshcd_send_command(struct ufs_hba *hba, unsigned int task_tag) { __set_bit(task_tag, hba-outstanding_reqs); - writel((1 task_tag), - (hba-mmio_base + REG_UTP_TRANSFER_REQ_DOOR_BELL)); + ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_DOOR_BELL, 1 task_tag); } /** @@ -338,8 +330,7 @@ static inline void ufshcd_copy_sense_data(struct ufshcd_lrb *lrbp) */ static inline void ufshcd_hba_capabilities(struct ufs_hba *hba) { - hba-capabilities = - readl(hba-mmio_base + REG_CONTROLLER_CAPABILITIES); + hba-capabilities = ufshcd_readl(hba, REG_CONTROLLER_CAPABILITIES); /* nutrs and nutmrs are 0 based values */ hba-nutrs = (hba-capabilities MASK_TRANSFER_REQUESTS_SLOTS) + 1; @@ -356,16 +347,13 @@ static inline void ufshcd_send_uic_command(struct ufs_hba *hba, struct uic_command
Re: [PATCH 4/5] scsi: ufs: rework link start-up process
On 4/24/2013 9:36 PM, Seungwon Jeon wrote: Link start-up requires long time with multiphase handshakes between UFS host and device. This affects driver's probe time. This patch let link start-up run asynchronously. And completion time of uic command is defined to avoid a permanent wait. I have similar patch posted few days back scsi: ufs: Generalize UFS Interconnect Layer (UIC) command support which does a bit more (mutex, error handling) than what is done here. Can that be used/improved? Signed-off-by: Seungwon Jeon tgih@samsung.com --- drivers/scsi/ufs/ufshcd.c | 114 +--- drivers/scsi/ufs/ufshcd.h |6 ++- 2 files changed, 89 insertions(+), 31 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index efe2256..76ff332 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -38,6 +38,7 @@ #define UFSHCD_ENABLE_INTRS (UTP_TRANSFER_REQ_COMPL |\ UTP_TASK_REQ_COMPL |\ UFSHCD_ERROR_MASK) +#define UIC_CMD_TIMEOUT100 enum { UFSHCD_MAX_CHANNEL = 0, @@ -357,13 +358,15 @@ static inline void ufshcd_hba_capabilities(struct ufs_hba *hba) } /** - * ufshcd_send_uic_command - Send UIC commands to unipro layers + * ufshcd_dispatch_uic_cmd - Dispatch UIC commands to unipro layers * @hba: per adapter instance * @uic_command: UIC command */ static inline void -ufshcd_send_uic_command(struct ufs_hba *hba, struct uic_command *uic_cmnd) +ufshcd_dispatch_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmnd) { + init_completion(uic_cmnd-done); + /* Write Args */ ufshcd_writel(hba, REG_UIC_COMMAND_ARG_1, uic_cmnd-argument1); ufshcd_writel(hba, REG_UIC_COMMAND_ARG_2, uic_cmnd-argument2); @@ -375,6 +378,45 @@ ufshcd_send_uic_command(struct ufs_hba *hba, struct uic_command *uic_cmnd) } /** + * ufshcd_wait_for_uic_cmd - Wait complectioin of UIC command + * @hba: per adapter instance + * @uic_command: UIC command + * + * Returns 0 only if success. + */ +static int ufshcd_wait_for_uic_cmd(struct ufs_hba *hba) +{ + struct uic_command *uic_cmd = hba-active_uic_cmd; + int ret; + + if (wait_for_completion_timeout(uic_cmd-done, + msecs_to_jiffies(UIC_CMD_TIMEOUT))) + ret = ufshcd_get_uic_cmd_result(hba); + else + ret = -ETIMEDOUT; + + return ret; +} + +/** + * ufshcd_ready_uic_cmd - Check if controller is ready + *to accept UIC commands + * @hba: per adapter instance + * Return true on success, else false + */ +static inline bool ufshcd_ready_uic_cmd(struct ufs_hba *hba) +{ + if (ufshcd_readl(hba, REG_CONTROLLER_STATUS) UIC_COMMAND_READY) { + return true; + } else { + dev_err(hba-dev, + Controller not ready +to accept UIC commands\n); + return false; + } +} + +/** * ufshcd_map_sg - Map scatter-gather list to prdt * @lrbp - pointer to local reference block * @@ -735,15 +777,10 @@ static int ufshcd_dme_link_startup(struct ufs_hba *hba) { struct uic_command *uic_cmd; unsigned long flags; + int ret; - /* check if controller is ready to accept UIC commands */ - if (((ufshcd_readl(hba, REG_CONTROLLER_STATUS)) - UIC_COMMAND_READY) == 0x0) { - dev_err(hba-dev, - Controller not ready -to accept UIC commands\n); + if (!ufshcd_ready_uic_cmd(hba)) return -EIO; - } spin_lock_irqsave(hba-host-host_lock, flags); @@ -754,13 +791,16 @@ static int ufshcd_dme_link_startup(struct ufs_hba *hba) uic_cmd-argument2 = 0; uic_cmd-argument3 = 0; - /* enable UIC related interrupts */ - ufshcd_enable_intr(hba, UIC_COMMAND_COMPL); + /* Dispatching UIC commands to controller */ + ufshcd_dispatch_uic_cmd(hba, uic_cmd); - /* sending UIC commands to controller */ - ufshcd_send_uic_command(hba, uic_cmd); spin_unlock_irqrestore(hba-host-host_lock, flags); - return 0; + + ret = ufshcd_wait_for_uic_cmd(hba); + if (ret) + dev_err(hba-dev, link startup: error code %d returned\n, ret); + + return ret; } /** @@ -898,6 +938,9 @@ static int ufshcd_initialize_hba(struct ufs_hba *hba) if (ufshcd_hba_enable(hba)) return -EIO; + /* enable UIC related interrupts */ + ufshcd_enable_intr(hba, UIC_COMMAND_COMPL | UIC_ERROR); + /* Configure UTRL and UTMRL base address registers */ ufshcd_writel(hba, REG_UTP_TRANSFER_REQ_LIST_BASE_L, lower_32_bits(hba-utrdl_dma_addr)); @@ -909,7 +952,9 @@ static int ufshcd_initialize_hba(struct ufs_hba *hba)
Re: [PATCH][v3] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
On 04/24/2013 04:48 PM, James Bottomley wrote: On Wed, 2013-04-24 at 13:32 +0200, Hannes Reinecke wrote: scsi_send_eh_cmnd() is calling queuecommand() directly, so it needs to check the return value here. The only valid return codes for queuecommand() are 'busy' states, so we need to wait for a bit to allow the LLDD to recover. Based on an earlier patch from Wen Xiong. This looks good, only one minor nitpick: Cc: Wen Xiong wenxi...@linux.vnet.ibm.com Cc: Brian King brk...@linux.vnet.ibm.com Signed-off-by: Hannes Reinecke h...@suse.de diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index c1b05a8..1fc6da94 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -791,22 +791,32 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, struct scsi_device *sdev = scmd-device; struct Scsi_Host *shost = sdev-host; DECLARE_COMPLETION_ONSTACK(done); -unsigned long timeleft; +unsigned long timeleft = timeout; struct scsi_eh_save ses; +const int stall_for = min(HZ/10, 1); int rtn; +retry: scsi_eh_prep_cmnd(scmd, ses, cmnd, cmnd_size, sense_bytes); shost-eh_action = done; scsi_log_send(scmd); scmd-scsi_done = scsi_eh_done; -shost-hostt-queuecommand(shost, scmd); - -timeleft = wait_for_completion_timeout(done, timeout); +rtn = shost-hostt-queuecommand(shost, scmd); +if (rtn) { +if (timeleft) { +scsi_eh_restore_cmnd(scmd, ses); +timeleft -= stall_for; +msleep(stall_for); Stall for is in HZ: need to convert to ms, so msleep(stall_for * 1000/HZ); Right. Will be converting it. I also don't see a need to restore and reprep the command each time, but I don't see a problem with it either. This was mainly thought as a safeguard here; just in case someone else might be fiddling with the command. But yeah, it's not required. Cheers, Hannes -- Dr. Hannes Reinecke zSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] qla2xxx: fix sparse warning large integer implicitly truncated to unsigned type
From: Armen Baloyan armen.balo...@qlogic.com Signed-off-by: Armen Baloyan armen.balo...@qlogic.com Signed-off-by: Saurav Kashyap saurav.kash...@qlogic.com --- drivers/scsi/qla2xxx/qla_mr.c |6 ++ 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_mr.c b/drivers/scsi/qla2xxx/qla_mr.c index 729b743..937fed8 100644 --- a/drivers/scsi/qla2xxx/qla_mr.c +++ b/drivers/scsi/qla2xxx/qla_mr.c @@ -3003,12 +3003,10 @@ qlafx00_build_scsi_iocbs(srb_t *sp, struct cmd_type_7_fx00 *cmd_pkt, /* Set transfer direction */ if (cmd-sc_data_direction == DMA_TO_DEVICE) { - lcmd_pkt-cntrl_flags = - __constant_cpu_to_le16(TMF_WRITE_DATA); + lcmd_pkt-cntrl_flags = TMF_WRITE_DATA; vha-qla_stats.output_bytes += scsi_bufflen(cmd); } else if (cmd-sc_data_direction == DMA_FROM_DEVICE) { - lcmd_pkt-cntrl_flags = - __constant_cpu_to_le16(TMF_READ_DATA); + lcmd_pkt-cntrl_flags = TMF_READ_DATA; vha-qla_stats.input_bytes += scsi_bufflen(cmd); } -- 1.7.7 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] qla2xxx: Patches for scsi misc branch.
Hi James, This patchset fixes the sparse warnings. Please apply the following patches to the scsi tree at your earliest convenience for inclusion in the next mainline merge window. Thanks, ~Saurav Armen Baloyan (1): qla2xxx: fix sparse warning large integer implicitly truncated to unsigned type Fengguang Wu (1): qla2xxx: qla2x00_sp_compl can be static. drivers/scsi/qla2xxx/qla_mr.c |6 ++ drivers/scsi/qla2xxx/qla_os.c |2 +- 2 files changed, 3 insertions(+), 5 deletions(-) -- 1.7.7 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] qla2xxx: qla2x00_sp_compl can be static.
From: Fengguang Wu fengguang...@intel.com Signed-off-by: Fengguang Wu fengguang...@intel.com Signed-off-by: Saurav Kashyap saurav.kash...@qlogic.com --- drivers/scsi/qla2xxx/qla_os.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index a083715..7b621c2 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -644,7 +644,7 @@ qla2x00_sp_free_dma(void *vha, void *ptr) qla2x00_rel_sp(sp-fcport-vha, sp); } -void +static void qla2x00_sp_compl(void *data, void *ptr, int res) { struct qla_hw_data *ha = (struct qla_hw_data *)data; -- 1.7.7 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html