Re: [PATCH v4] ibmvscsis: Do not send aborted task response
Hi Bryant,

Given that we're almost out of time for -rc1, I'd like to avoid having to
rebase the handful of patches that are atop the -v3 that was applied to
target-pending/for-next over the weekend...

So if you'd be so kind, please post an incremental patch atop -v3, and I'll
apply that instead.

On Mon, 2017-05-08 at 17:07 -0500, Bryant G. Ly wrote:
> The driver is sending a response to the actual scsi op that was
> aborted by an abort task TM, while LIO is sending a response to
> the abort task TM.
>
> ibmvscsis_tgt does not send the response to the client until
> release_cmd time. The reason for this was because if we did it
> at queue_status time, then the client would be free to reuse the
> tag for that command, but we're still using the tag until the
> command is released at release_cmd time, so we chose to delay
> sending the response until then. That then caused this issue, because
> release_cmd is always called, even if queue_status is not.
>
> The SCSI spec says that the initiator that sends the abort task
> TM NEVER gets a response to the aborted op, yet with the current
> code it will send a response. Thus this fix removes that response
> when CMD_T_ABORTED && !CMD_T_TAS.
>
> Another case with a small timing window is the case where if LIO sends a
> TMR_DOES_NOT_EXIST, and the release_cmd callback is called for the TMR Abort
> cmd before the release_cmd for the (attempted) aborted cmd, then we need to
> ensure that we send the response for the (attempted) aborted cmd to the
> client before we send the response for the TMR Abort cmd.
>
> Forgot to reset the cmd->abort_cmd pointer after finding it in send_messages.
>
> Cc: # v4.8+
> Signed-off-by: Bryant G. Ly
> Signed-off-by: Michael Cyr
> ---
>  drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 117 ---
>  drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h |   2 +
>  2 files changed, 94 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
> index 0f80779..ee64241 100644
> --- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
> +++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
> @@ -1170,6 +1170,9 @@ static struct ibmvscsis_cmd *ibmvscsis_get_free_cmd(struct scsi_info *vscsi)
>  		cmd = list_first_entry_or_null(&vscsi->free_cmd,
>  					       struct ibmvscsis_cmd, list);
>  		if (cmd) {
> +			if (cmd->abort_cmd)
> +				cmd->abort_cmd = NULL;
> +			cmd->flags &= ~(DELAY_SEND);
>  			list_del(&cmd->list);
>  			cmd->iue = iue;
>  			cmd->type = UNSET_TYPE;
> @@ -1749,45 +1752,80 @@ static void srp_snd_msg_failed(struct scsi_info *vscsi, long rc)
>  static void ibmvscsis_send_messages(struct scsi_info *vscsi)
>  {
>  	u64 msg_hi = 0;
> -	/* note do not attmempt to access the IU_data_ptr with this pointer
> +	/* note do not attempt to access the IU_data_ptr with this pointer
>  	 * it is not valid
>  	 */
>  	struct viosrp_crq *crq = (struct viosrp_crq *)&msg_hi;
>  	struct ibmvscsis_cmd *cmd, *nxt;
>  	struct iu_entry *iue;
>  	long rc = ADAPT_SUCCESS;
> +	bool retry = false;
>
>  	if (!(vscsi->flags & RESPONSE_Q_DOWN)) {
> -		list_for_each_entry_safe(cmd, nxt, &vscsi->waiting_rsp, list) {
> -			iue = cmd->iue;
> +		do {
> +			retry = false;
> +			list_for_each_entry_safe(cmd, nxt, &vscsi->waiting_rsp,
> +						 list) {
> +				/*
> +				 * Check to make sure abort cmd gets processed
> +				 * prior to the abort tmr cmd
> +				 */
> +				if (cmd->flags & DELAY_SEND)
> +					continue;
>
> -			crq->valid = VALID_CMD_RESP_EL;
> -			crq->format = cmd->rsp.format;
> +				if (cmd->abort_cmd) {
> +					retry = true;
> +					cmd->abort_cmd->flags &= ~(DELAY_SEND);
> +					cmd->abort_cmd = NULL;
> +				}
>
> -			if (cmd->flags & CMD_FAST_FAIL)
> -				crq->status = VIOSRP_ADAPTER_FAIL;
> +				/*
> +				 * If CMD_T_ABORTED w/o CMD_T_TAS scenarios and
> +				 * the case where LIO issued a
> +				 * ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST
> +				 * case then we dont send a response, since it
> +				 * was already done.
> +				 */
Re: [PATCH] scsi: qedf: properly update arguments position in function call
Hi Martin,

Quoting "Martin K. Petersen":
> Gustavo A.,
>
>> Properly update the position of the arguments in function call.
>
> Applied to 4.12/scsi-fixes, thank you!

Awesome, glad to help. :)

--
Gustavo A. R. Silva
Re: [PATCH] scsi: pmcraid: remove redundant check to see if request_size is less than zero
Colin, > The 2nd check to see if request_size is less than zero is redundant > because the first check takes error exit path on this condition. So, > since it is redundant, remove it. Applied to 4.12/scsi-fixes. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi: lpfc: ensure els_wq is being checked before destroying it
Colin, > I believe there is a typo on the wq destroy of els_wq, currently the > driver is checking if els_cq is not null and I think this should be a > check on els_wq instead. Applied to 4.12/scsi-fixes. Thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] cxlflash: Select IRQ_POLL
Guenter, > The driver now uses IRQ_POLL and needs to select it to avoid the following > build error. Applied to 4.12/scsi-fixes, thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi: qedf: Avoid reading past end of buffer
Kees, > Using memcpy() from a string that is shorter than the length copied > means the destination buffer is being filled with arbitrary data from > the kernel rodata segment. Instead, use strncpy() which will fill the > trailing bytes with zeros. Applied to 4.12/scsi-fixes, thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: remove REQ_OP_WRITE_SAME
Christoph, > Any chance to get a sneak preview of that work? I have been on the road since LSF/MM and just got back home. I'll make it a priority. -- Martin K. Petersen Oracle Linux Engineering
Re: [REEEEPOST] bnx2i + bnx2fc: convert to generic workqueue (#3)
Sebastian, > Martin, do you see any chance to get this merged? Chad replied to the > list that he is going to test it on 2017-04-10, didn't respond to the > ping 10 days later. The series stalled last time in the same way. I am very reluctant to merge something when a driver has an active maintainer and that person has not acked the change. That said, Chad: You have been sitting on this for quite a while. Please make it a priority. In exchange for veto rights you do have to provide timely feedback on anything that touches your driver. Thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v2] sd: Ignore sync cache failures when not supported
Christoph, > Normally we'd just pass the scsi_sense_hdr structure in from the caller > if we care about sense data. Is this something you considered? > > Otherwise this looks fine to me. I agree with Christoph that passing the sense header would be more consistent with the rest of the SCSI code. Even if we only need the key in this case. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi: qedf: Cleanup the type of io_log->op
Dan, > We store sc_cmd->cmnd[0] which is an unsigned char in io_log->op so > this should also be unsigned char. The other thing is that this is > displayed in the debugfs: Applied to 4.12/scsi-fixes, thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi: lpfc: double lock typo in lpfc_ns_rsp()
Dan, > There is a double lock bug here so this will deadlock instead of > unlocking. Applied to 4.12/scsi-fixes. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi: qedf: properly update arguments position in function call
Gustavo A., > Properly update the position of the arguments in function call. Applied to 4.12/scsi-fixes, thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] scsi_lib: Add #include
Bart, > This patch avoids that when building with W=1 the compiler complains > that __scsi_init_queue() has not been declared. See also commit > d48777a633d6 ("scsi: remove __scsi_alloc_queue"). Applied to 4.12/scsi-fixes. Thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH, RFC] MAINTAINERS: update OSD entries
Christoph, > The open-osd domain doesn't exist anymore, and mails to the list lead > to really annoying bounces that repeat every day. > > Also the primarydata address for Benny bounces, and while I have a new > one for him he doesn't seem to be maintaining the OSD code any more. > > Which begs the question: should we really leave the Supported status > in MAINTAINERS given that the code is barely maintained? Applied to 4.12/scsi-fixes. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH]scsi: megaraid_sas: fix raid card hotswap failure
Zhou, > When a scsi_device is unplugged from the scsi controller, if the > scsi_device is still being used by the application layer, it won't be released > until users release it. In this case, scsi_device_remove just sets the > scsi_device's state to SDEV_DEL. But if you plug the disk just > before the old scsi_device is released, then there will be two > scsi_device structures in scsi_host->__devices. When the next > unplugging event happens, some low-level drivers will check whether the > scsi_device has been added to the host (for example, the megaraid sas > series controller) by calling scsi_device_lookup (which calls > __scsi_device_lookup) in function > megasas_aen_polling. __scsi_device_lookup will return the first > scsi_device. Because its state is SDEV_DEL, scsi_device_lookup > will finally return NULL, making the low-level driver assume that the > scsi_device has been removed, so it won't call scsi_device_remove, which > will lead to the hot swap failure. Applied to 4.12/scsi-fixes. Thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v3] lpfc: Fix panic on BFS configuration.
James, > Fix is to reset the sli-3 function before sending the mailbox command, > thus synchronizing the function/driver on mailbox location. Applied to 4.12/scsi-fixes. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] libfc: do not flood console with messages 'libfc: queue full ...'
Hannes, > When the FCoE sending side becomes congested libfc tries to reduce the > queue depth on the host; however due to the built-in lag before > attempting to ramp down the queue depth _again_ the message log is > flooded with messages > > libfc: queue full, reducing can_queue to 512 > > With this patch the message is printed only once (ie when it's > actually changed). Applied to 4.12/scsi-fixes. -- Martin K. Petersen Oracle Linux Engineering
[PATCH v4] ibmvscsis: Do not send aborted task response
The driver is sending a response to the actual scsi op that was aborted by an
abort task TM, while LIO is sending a response to the abort task TM.

ibmvscsis_tgt does not send the response to the client until release_cmd
time. The reason for this was because if we did it at queue_status time,
then the client would be free to reuse the tag for that command, but we're
still using the tag until the command is released at release_cmd time, so we
chose to delay sending the response until then. That then caused this issue,
because release_cmd is always called, even if queue_status is not.

The SCSI spec says that the initiator that sends the abort task TM NEVER
gets a response to the aborted op, yet with the current code it will send a
response. Thus this fix removes that response when
CMD_T_ABORTED && !CMD_T_TAS.

Another case with a small timing window is the case where if LIO sends a
TMR_DOES_NOT_EXIST, and the release_cmd callback is called for the TMR Abort
cmd before the release_cmd for the (attempted) aborted cmd, then we need to
ensure that we send the response for the (attempted) aborted cmd to the
client before we send the response for the TMR Abort cmd.

Forgot to reset the cmd->abort_cmd pointer after finding it in send_messages.

Cc: # v4.8+
Signed-off-by: Bryant G. Ly
Signed-off-by: Michael Cyr
---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 117 ---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h |   2 +
 2 files changed, 94 insertions(+), 25 deletions(-)

diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
index 0f80779..ee64241 100644
--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -1170,6 +1170,9 @@ static struct ibmvscsis_cmd *ibmvscsis_get_free_cmd(struct scsi_info *vscsi)
 		cmd = list_first_entry_or_null(&vscsi->free_cmd,
 					       struct ibmvscsis_cmd, list);
 		if (cmd) {
+			if (cmd->abort_cmd)
+				cmd->abort_cmd = NULL;
+			cmd->flags &= ~(DELAY_SEND);
 			list_del(&cmd->list);
 			cmd->iue = iue;
 			cmd->type = UNSET_TYPE;
@@ -1749,45 +1752,80 @@ static void srp_snd_msg_failed(struct scsi_info *vscsi, long rc)
 static void ibmvscsis_send_messages(struct scsi_info *vscsi)
 {
 	u64 msg_hi = 0;
-	/* note do not attmempt to access the IU_data_ptr with this pointer
+	/* note do not attempt to access the IU_data_ptr with this pointer
 	 * it is not valid
 	 */
 	struct viosrp_crq *crq = (struct viosrp_crq *)&msg_hi;
 	struct ibmvscsis_cmd *cmd, *nxt;
 	struct iu_entry *iue;
 	long rc = ADAPT_SUCCESS;
+	bool retry = false;

 	if (!(vscsi->flags & RESPONSE_Q_DOWN)) {
-		list_for_each_entry_safe(cmd, nxt, &vscsi->waiting_rsp, list) {
-			iue = cmd->iue;
+		do {
+			retry = false;
+			list_for_each_entry_safe(cmd, nxt, &vscsi->waiting_rsp,
+						 list) {
+				/*
+				 * Check to make sure abort cmd gets processed
+				 * prior to the abort tmr cmd
+				 */
+				if (cmd->flags & DELAY_SEND)
+					continue;

-			crq->valid = VALID_CMD_RESP_EL;
-			crq->format = cmd->rsp.format;
+				if (cmd->abort_cmd) {
+					retry = true;
+					cmd->abort_cmd->flags &= ~(DELAY_SEND);
+					cmd->abort_cmd = NULL;
+				}

-			if (cmd->flags & CMD_FAST_FAIL)
-				crq->status = VIOSRP_ADAPTER_FAIL;
+				/*
+				 * If CMD_T_ABORTED w/o CMD_T_TAS scenarios and
+				 * the case where LIO issued a
+				 * ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST
+				 * case then we dont send a response, since it
+				 * was already done.
+				 */
+				if (cmd->se_cmd.transport_state & CMD_T_ABORTED &&
+				    !(cmd->se_cmd.transport_state & CMD_T_TAS)) {
+					list_del(&cmd->list);
+					ibmvscsis_free_cmd_resources(vscsi,
+								     cmd);
+				} else {
Re: Race to power off harming SATA SSDs
On Mon 2017-05-08 16:40:11, David Woodhouse wrote: > On Mon, 2017-05-08 at 13:50 +0200, Boris Brezillon wrote: > > On Mon, 08 May 2017 11:13:10 +0100 > > David Woodhouse wrote: > > > > > > > > On Mon, 2017-05-08 at 11:09 +0200, Hans de Goede wrote: > > > > > > > > You're forgetting that the SSD itself (this thread is about SSDs) also > > > > has > > > > a major software component which is doing housekeeping all the time, so > > > > even > > > > if the main CPU gets reset the SSD's controller may still happily be > > > > erasing > > > > blocks. > > > We're not really talking about SSDs at all any more; we're talking > > > about real flash with real maintainable software. > > > > It's probably a good sign that this new discussion should take place in > > a different thread :-). > > Well, maybe. But it was a silly thread in the first place. SATA SSDs > aren't *expected* to be reliable. Citation needed? I'm pretty sure SATA SSDs are expected to be reliable, up to the maximum amount of gigabytes written (specified by the manufacturer), as long as you don't cut power without warning. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
RE: [PATCH 00/19] aacraid: Patchset with reset rework and misc fixes
Hi, The patch set has been sent twice. Please ignore the later sent/received patch set. I had sent the original patch set on Saturday, but I did not receive a confirmation nor did any mailing list archive (http://marc.info/?l=linux-scsi) pick them up (should have waited). The next day I resent the patch set and here we are. Regards, Raghava Aditya > -----Original Message----- > From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi- > ow...@vger.kernel.org] On Behalf Of Raghava Aditya Renukunta > Sent: Sunday, May 7, 2017 6:34 AM > To: j...@linux.vnet.ibm.com; martin.peter...@oracle.com; linux- > s...@vger.kernel.org > Cc: Dave Carroll; Gana Sridaran > ; Scott Benesh > ; Prasad Munirathnam > > Subject: [PATCH 00/19] aacraid: Patchset with reset rework and misc fixes > > EXTERNAL EMAIL > > > This patchset primarily focuses on tweaking and hardening the controller > reset support for both ARC and HBA1000 devices. Now the driver can only > reset the controller thru eh reset. Included a srb memory fix and pci > dma allocation fix.
>
> Raghava Aditya Renukunta (19):
>   [SCSI] aacraid: Remove GFP_DMA for raw srb memory
>   [SCSI] aacraid: Fix DMAR issues with iommu=pt
>   [SCSI] aacraid: Added 32 and 64 queue depth for arc natives
>   [SCSI] aacraid: Set correct Queue Depth for HBA1000 RAW disks
>   [SCSI] aacraid: Remove reset support from check_health
>   [SCSI] aacraid: Change wait time for fib completion
>   [SCSI] aacraid: Log count info of scsi cmds before reset
>   [SCSI] aacraid: Print ctrl status before eh reset
>   [SCSI] aacraid: Using single reset mask for IOP reset
>   [SCSI] aacraid: Rework IOP reset
>   [SCSI] aacraid: Add periodic checks to see IOP reset status
>   [SCSI] aacraid: Rework SOFT reset code
>   [SCSI] aacraid: Rework aac_src_restart
>   [SCSI] aacraid: Use correct function to get ctrl health
>   [SCSI] aacraid: Make sure ioctl returns on controller reset
>   [SCSI] aacraid: Enable ctrl reset for both hba and arc
>   [SCSI] aacraid: Add reset debugging statements
>   [SCSI] aacraid: Remove reference to Series-9
>   [SCSI] aacraid: Update driver version to 50834
>
>  drivers/scsi/aacraid/aachba.c   |  17 ++-
>  drivers/scsi/aacraid/aacraid.h  |  22 +++-
>  drivers/scsi/aacraid/commctrl.c |  13 ++-
>  drivers/scsi/aacraid/comminit.c |  18 +---
>  drivers/scsi/aacraid/commsup.c  |  73 +++--
>  drivers/scsi/aacraid/linit.c    | 232 ----
>  drivers/scsi/aacraid/src.c      | 136 +--
>  7 files changed, 296 insertions(+), 215 deletions(-)
>
> --
> 2.7.4
Re: Race to power off harming SATA SSDs
Hello, On Mon, May 08, 2017 at 08:56:15PM +0200, Pavel Machek wrote: > Well... the SMART counter tells us that the device was not shut down > correctly. Do we have reason to believe that it is _not_ telling us > truth? It is more than one device. It also finished power off command successfully. > SSDs die when you power them without warning: > http://lkcl.net/reports/ssd_analysis.html > > What kind of data would you like to see? "I have been using linux and > my SSD died"? We have had such reports. "I have killed 10 SSDs in a > week then I added one second delay, and this SSD survived 6 months"? Repeating shutdown cycles and showing that the device actually is in trouble would be great. It doesn't have to reach full-on device failure. Showing some sign of corruption would be enough - increase in CRC failure counts, bad block counts (a lot of devices report remaining reserve or lifetime in one way or the other) and so on. Right now, it might as well be just the SMART counter being funky. Thanks. -- tejun
Re: Race to power off harming SATA SSDs
On Mon 2017-05-08 13:43:03, Tejun Heo wrote: > Hello, > > On Mon, May 08, 2017 at 06:43:22PM +0200, Pavel Machek wrote: > > What I was trying to point out was that storage people try to treat > > SSDs as HDDs... and SSDs are very different. Harddrives mostly survive > > powerfails (with emergency parking), while it is very, very difficult > > to make SSD survive random powerfail, and we have to make sure we > > always powerdown SSDs "cleanly". > > We do. > > The issue raised is that some SSDs still increment the unexpected > power loss count even after clean shutdown sequence and that the > kernel should wait for some secs before powering off. > > We can do that for select devices but I want something more than "this > SMART counter is getting incremented" before doing that. Well... the SMART counter tells us that the device was not shut down correctly. Do we have reason to believe that it is _not_ telling us truth? It is more than one device. SSDs die when you power them without warning: http://lkcl.net/reports/ssd_analysis.html What kind of data would you like to see? "I have been using linux and my SSD died"? We have had such reports. "I have killed 10 SSDs in a week then I added one second delay, and this SSD survived 6 months"? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
RE: Race to power off harming SATA SSDs
> Well, you are right.. and I'm responsible. > > What I was trying to point out was that storage people try to treat SSDs as > HDDs... > and SSDs are very different. Harddrives mostly survive powerfails (with > emergency > parking), while it is very, very difficult to make SSD survive random > powerfail, > and we have to make sure we always powerdown SSDs "cleanly". It all depends on the class of SSD that we're discussing. "Enterprise class" SSDs will often use either ultracapacitors or batteries to allow them to successfully complete all of the necessary operations upon a power cut.
Re: Race to power off harming SATA SSDs
Hello, On Mon, May 08, 2017 at 06:43:22PM +0200, Pavel Machek wrote: > What I was trying to point out was that storage people try to treat > SSDs as HDDs... and SSDs are very different. Harddrives mostly survive > powerfails (with emergency parking), while it is very, very difficult > to make SSD survive random powerfail, and we have to make sure we > always powerdown SSDs "cleanly". We do. The issue raised is that some SSDs still increment the unexpected power loss count even after clean shutdown sequence and that the kernel should wait for some secs before powering off. We can do that for select devices but I want something more than "this SMART counter is getting incremented" before doing that. Thanks. -- tejun
Re: Race to power off harming SATA SSDs
On Mon 2017-05-08 13:50:05, Boris Brezillon wrote: > On Mon, 08 May 2017 11:13:10 +0100 > David Woodhouse wrote: > > > On Mon, 2017-05-08 at 11:09 +0200, Hans de Goede wrote: > > > You're forgetting that the SSD itself (this thread is about SSDs) also has > > > a major software component which is doing housekeeping all the time, so > > > even > > > if the main CPU gets reset the SSD's controller may still happily be > > > erasing > > > blocks. > > > > We're not really talking about SSDs at all any more; we're talking > > about real flash with real maintainable software. > > It's probably a good sign that this new discussion should take place in > a different thread :-). Well, you are right.. and I'm responsible. What I was trying to point out was that storage people try to treat SSDs as HDDs... and SSDs are very different. Harddrives mostly survive powerfails (with emergency parking), while it is very, very difficult to make SSD survive random powerfail, and we have to make sure we always powerdown SSDs "cleanly". Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Race to power off harming SATA SSDs
On Mon, 2017-05-08 at 13:50 +0200, Boris Brezillon wrote: > On Mon, 08 May 2017 11:13:10 +0100 > David Woodhouse wrote: > > > > > On Mon, 2017-05-08 at 11:09 +0200, Hans de Goede wrote: > > > > > > You're forgetting that the SSD itself (this thread is about SSDs) also has > > > a major software component which is doing housekeeping all the time, so > > > even > > > if the main CPU gets reset the SSD's controller may still happily be > > > erasing > > > blocks. > > We're not really talking about SSDs at all any more; we're talking > > about real flash with real maintainable software. > > It's probably a good sign that this new discussion should take place in > a different thread :-). Well, maybe. But it was a silly thread in the first place. SATA SSDs aren't *expected* to be reliable.
Re: [PATCH] cxlflash: Select IRQ_POLL
> On May 5, 2017, at 6:31 PM, Guenter Roeck wrote: > > The driver now uses IRQ_POLL and needs to select it to avoid the following > build error. > > ERROR: ".irq_poll_complete" [drivers/scsi/cxlflash/cxlflash.ko] undefined! > ERROR: ".irq_poll_sched" [drivers/scsi/cxlflash/cxlflash.ko] undefined! > ERROR: ".irq_poll_disable" [drivers/scsi/cxlflash/cxlflash.ko] undefined! > ERROR: ".irq_poll_init" [drivers/scsi/cxlflash/cxlflash.ko] undefined! > > Fixes: cba06e6de403 ("scsi: cxlflash: Implement IRQ polling for RRQ > processing") > Signed-off-by: Guenter Roeck Acked-by: Matthew R. Ochs
Re: Race to power off harming SATA SSDs
Hi! > > 'clean marker' is a good idea... empty pages have plenty of space. > > Well... you lose that space permanently. Although I suppose you could > do things differently and erase a block immediately prior to using it. > But in that case why ever write the cleanmarker? Just maintain a set of > blocks that you *will* erase and re-use. Yes, but erase is slow so that would hurt performance...? > > How do you handle the issue during regular write? Always ignore last > > successfully written block? > > Log nodes have a CRC. If you get interrupted during a write, that CRC > should fail. Umm. That is not what the "unstable bits" issue is about, right? If you are interrupted during write, you can get into a state where readback will be correct on next boot (CRC, ECC ok), but then the bits will go back a few hours after that. You can't rely on checksums to detect that.. because the bits will have the right values -- for a while. > > Do you handle "paired pages" problem on MLC? > > No. It would theoretically be possible, by not considering a write to > the first page "committed" until the second page of the pair is also > written. Essentially, it's not far off expanding the existing 'wbuf' > which we use to gather writes into full pages for NAND, to cover the > *whole* of the set of pages which are affected by MLC. > > But we mostly consider JFFS2 to be obsolete these days, in favour of > UBI/UBIFS or other approaches. Yes, I guess MLC NAND chips are mostly too big for JFFS2. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Race to power off harming SATA SSDs
Boris, On 08.05.2017 at 13:48, Boris Brezillon wrote: >>> How do you handle the issue during regular write? Always ignore last >>> successfully written block? > > I guess UBIFS can know what was written last, because of the log-based > approach + the seqnum stored along with FS nodes, but I'm pretty sure > UBIFS does not re-write the last written block in case of an unclean > mount. Richard, am I wrong? Yes. UBIFS has the machinery but uses it differently. When it faces ECC errors while replaying the journal it can recover good data from the LEB. It assumes that an interrupted write always leads to ECC errors. >> >> The last page of a block is inspected and allowed to be corrupted. > > Actually, it's not really about corrupted pages, it's about pages that > might become unreadable after a few reads. As stated before, it assumes an ECC error from an interrupted write. We could automatically re-write everything in UBIFS that was written last, but we don't have this information for data UBI itself wrote, since UBI has no journal. If unstable bits can be triggered with current systems we can think of a clever trick to deal with that. So far nobody was able to show me the problem. Thanks, //richard
Re: [PATCH] scsi: qla4xxx: check for null return from iscsi_lookup_endpoint
This should be CC'd to qlogic-storage-upstr...@qlogic.com as well.

regards,
dan carpenter

On Sun, May 07, 2017 at 10:30:20PM +0100, Colin King wrote:
> From: Colin Ian King
>
> iscsi_lookup_endpoint can potentially return null and in 9 out of
> the 10 calls to this function a null return is checked, so I think
> it is pertinent to perform a null check here too and return -EINVAL
> as in the other null cases.
>
> Detected by CoverityScan, CID#147282 ("Dereference null return value")
>
> Signed-off-by: Colin Ian King
> ---
>  drivers/scsi/qla4xxx/ql4_os.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
> index 64c6fa563fdb..803e342e1093 100644
> --- a/drivers/scsi/qla4xxx/ql4_os.c
> +++ b/drivers/scsi/qla4xxx/ql4_os.c
> @@ -3189,6 +3189,8 @@ static int qla4xxx_conn_bind(struct iscsi_cls_session *cls_session,
>  	if (iscsi_conn_bind(cls_session, cls_conn, is_leading))
>  		return -EINVAL;
>  	ep = iscsi_lookup_endpoint(transport_fd);
> +	if (!ep)
> +		return -EINVAL;
>  	conn = cls_conn->dd_data;
>  	qla_conn = conn->dd_data;
>  	qla_conn->qla_ep = ep->dd_data;
> --
> 2.11.0
Re: Race to power off harming SATA SSDs
On Mon, 8 May 2017 13:48:07 +0200 Boris Brezillon wrote: > On Mon, 8 May 2017 13:06:17 +0200 > Richard Weinberger wrote: > > > On Mon, May 8, 2017 at 12:49 PM, Pavel Machek wrote: > > > Aha, nice, so it looks like ubifs is a step back here. > > > > > > 'clean marker' is a good idea... empty pages have plenty of space. > > > > If UBI (not UBIFS) faces an empty block, it also re-erases it. > > Unfortunately, that's not the case, though UBI can easily be patched > to do that (see below). Sorry for the noise, I was wrong, UBI already re-erases empty blocks [1]. [1] http://elixir.free-electrons.com/linux/latest/source/drivers/mtd/ubi/attach.c#L983
Re: Race to power off harming SATA SSDs
On Mon, 08 May 2017 11:13:10 +0100 David Woodhouse wrote: > On Mon, 2017-05-08 at 11:09 +0200, Hans de Goede wrote: > > You're forgetting that the SSD itself (this thread is about SSDs) also has > > a major software component which is doing housekeeping all the time, so even > > if the main CPU gets reset the SSD's controller may still happily be erasing > > blocks. > > We're not really talking about SSDs at all any more; we're talking > about real flash with real maintainable software. It's probably a good sign that this new discussion should take place in a different thread :-).
[PATCH] qlogicpti: Fix resource releasing in order handling path
Reorder 'fail_free_irq' and 'fail_unmap_regs' in order to correctly free
resources in case of error.

Signed-off-by: Christophe JAILLET
---
 drivers/scsi/qlogicpti.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/qlogicpti.c b/drivers/scsi/qlogicpti.c
index 69bfc0a1aea3..e5aeb26bd9b1 100644
--- a/drivers/scsi/qlogicpti.c
+++ b/drivers/scsi/qlogicpti.c
@@ -1385,6 +1385,9 @@ static int qpti_sbus_probe(struct platform_device *op)
 			  qpti->req_cpu, qpti->req_dvma);
 #undef QSIZE

+fail_free_irq:
+	free_irq(qpti->irq, qpti);
+
 fail_unmap_regs:
 	of_iounmap(&op->resource[0], qpti->qregs,
 		   resource_size(&op->resource[0]));
@@ -1392,9 +1395,6 @@ static int qpti_sbus_probe(struct platform_device *op)
 	of_iounmap(&op->resource[0], qpti->sreg,
 		   sizeof(unsigned char));

-fail_free_irq:
-	free_irq(qpti->irq, qpti);
-
 fail_unlink:
 	scsi_host_put(host);

--
2.11.0
Re: Race to power off harming SATA SSDs
On Mon, 2017-05-08 at 12:49 +0200, Pavel Machek wrote: > On Mon 2017-05-08 10:34:08, David Woodhouse wrote: > > > > On Mon, 2017-05-08 at 11:28 +0200, Pavel Machek wrote: > > > > > > > > > Are you sure you have it right in JFFS2? Do you journal block erases? > > > Apparently, that was pretty much non-issue on older flashes. > > It isn't necessary in JFFS2. It is a *purely* log-structured file > > system (which is why it doesn't scale well past the 1GiB or so that we > > made it handle for OLPC). > > > > So we don't erase a block until all its contents are obsolete. And if > > we fail to complete the erase... well the contents are either going to > > fail a CRC check, or... still be obsoleted by later entries elsewhere. > > > > And even if it *looks* like an erase has completed and the block is all > > 0xFF, we erase it again and write a 'clean marker' to it to indicate > > that the erase was completed successfully. Because otherwise it can't > > be trusted. > Aha, nice, so it looks like ubifs is a step back here. > > 'clean marker' is a good idea... empty pages have plenty of space. Well... you lose that space permanently. Although I suppose you could do things differently and erase a block immediately prior to using it. But in that case why ever write the cleanmarker? Just maintain a set of blocks that you *will* erase and re-use. > How do you handle the issue during regular write? Always ignore last > successfully written block? Log nodes have a CRC. If you get interrupted during a write, that CRC should fail. > Do you handle "paired pages" problem on MLC? No. It would theoretically be possible, by not considering a write to the first page "committed" until the second page of the pair is also written. Essentially, it's not far off expanding the existing 'wbuf' which we use to gather writes into full pages for NAND, to cover the *whole* of the set of pages which are affected by MLC. 
But we mostly consider JFFS2 to be obsolete these days, in favour of UBI/UBIFS or other approaches. smime.p7s Description: S/MIME cryptographic signature
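The "log nodes have a CRC" recovery rule above can be sketched with a toy node format. This is an illustration only; the node layout, field names, and CRC placement are invented here and are not the real JFFS2 on-flash format. A write that is cut short by power loss leaves a node whose stored CRC no longer matches its payload, so the mount-time scan simply discards it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (polynomial 0xEDB88320); table-free, enough for a sketch. */
static uint32_t crc32_calc(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1));
    }
    return ~crc;
}

struct node {                 /* hypothetical log-node layout */
    uint8_t  payload[16];
    uint32_t crc;             /* CRC over the payload, written last */
};

static void node_write(struct node *n, const uint8_t *data)
{
    memcpy(n->payload, data, sizeof(n->payload));
    n->crc = crc32_calc(n->payload, sizeof(n->payload));
}

/* Scan-time check: a node whose CRC fails is treated as garbage. */
static bool node_valid(const struct node *n)
{
    return crc32_calc(n->payload, sizeof(n->payload)) == n->crc;
}
```

Because the CRC is written after the payload, any interruption mid-write produces a node that fails this check and is ignored on the next mount.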
Re: Race to power off harming SATA SSDs
On Mon, May 8, 2017 at 12:49 PM, Pavel Machek wrote: > Aha, nice, so it looks like ubifs is a step back here. > > 'clean marker' is a good idea... empty pages have plenty of space. If UBI (not UBIFS) faces an empty block, it also re-erases it. The EC header is used as clean marker. > How do you handle the issue during regular write? Always ignore last > successfully written block? The last page of a block is inspected and allowed to be corrupted. > Do you handle "paired pages" problem on MLC? Nope, no MLC support in mainline so far. -- Thanks, //richard
Re: Race to power off harming SATA SSDs
On Mon 2017-05-08 10:34:08, David Woodhouse wrote: > On Mon, 2017-05-08 at 11:28 +0200, Pavel Machek wrote: > > > > Are you sure you have it right in JFFS2? Do you journal block erases? > > Apparently, that was pretty much non-issue on older flashes. > > It isn't necessary in JFFS2. It is a *purely* log-structured file > system (which is why it doesn't scale well past the 1GiB or so that we > made it handle for OLPC). > > So we don't erase a block until all its contents are obsolete. And if > we fail to complete the erase... well the contents are either going to > fail a CRC check, or... still be obsoleted by later entries elsewhere. > > And even if it *looks* like an erase has completed and the block is all > 0xFF, we erase it again and write a 'clean marker' to it to indicate > that the erase was completed successfully. Because otherwise it can't > be trusted. Aha, nice, so it looks like ubifs is a step back here. 'clean marker' is a good idea... empty pages have plenty of space. How do you handle the issue during regular write? Always ignore last successfully written block? Do you handle "paired pages" problem on MLC? Best regards, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Race to power off harming SATA SSDs
On Mon, 2017-05-08 at 11:06 +0200, Ricard Wanderlof wrote: > > My point is really that say that the problem is in fact not that the erase > is cut short due to the power fail, but that the software issues a second > command before the first erase command has completed, for instance, or > some other situation. Then we'd have a concrete situation which we can > resolve (i.e., fix the bug), rather than assuming that it's the hardware's > fault and implement various software workarounds. On NOR flash we have *definitely* seen it during powerfail testing. A block looks like it's all 0xFF when you read it back on mount, but if you read it repeatedly, you may see bit flips because it wasn't completely erased. And even if you read it ten times and 'trust' that it's properly erased, it could start to show those bit flips when you start to program it. It was very repeatable, and that's when we implemented the 'clean markers' written after a successful erase, rather than trusting a block that "looks empty". smime.p7s Description: S/MIME cryptographic signature
Re: Race to power off harming SATA SSDs
On Mon, 2017-05-08 at 11:09 +0200, Hans de Goede wrote: > You're forgetting that the SSD itself (this thread is about SSDs) also has > a major software component which is doing housekeeping all the time, so even > if the main CPU gets reset the SSD's controller may still happily be erasing > blocks. We're not really talking about SSDs at all any more; we're talking about real flash with real maintainable software. smime.p7s Description: S/MIME cryptographic signature
Re: Race to power off harming SATA SSDs
Pavel, On Mon, May 8, 2017 at 11:28 AM, Pavel Machek wrote: > Are you sure you have it right in JFFS2? Do you journal block erases? > Apparently, that was pretty much non-issue on older flashes. This is what the website says, yes. Do you have hardware where you can trigger it? If so, I'd love to get access to it. So far I never saw the issue, sometimes people claim to suffer from it but when I inspect the problems in detail it is always something else. -- Thanks, //richard
Re: Race to power off harming SATA SSDs
On Mon, 2017-05-08 at 11:28 +0200, Pavel Machek wrote: > > Are you sure you have it right in JFFS2? Do you journal block erases? > Apparently, that was pretty much non-issue on older flashes. It isn't necessary in JFFS2. It is a *purely* log-structured file system (which is why it doesn't scale well past the 1GiB or so that we made it handle for OLPC). So we don't erase a block until all its contents are obsolete. And if we fail to complete the erase... well the contents are either going to fail a CRC check, or... still be obsoleted by later entries elsewhere. And even if it *looks* like an erase has completed and the block is all 0xFF, we erase it again and write a 'clean marker' to it to indicate that the erase was completed successfully. Because otherwise it can't be trusted. smime.p7s Description: S/MIME cryptographic signature
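The re-erase-unless-marked rule described above can be modelled in a few lines of C. Everything here (block layout, marker flag, function names) is a made-up sketch of the idea, not JFFS2 code:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BLOCK_SIZE 64

struct block {                  /* toy flash block, not an MTD structure */
    unsigned char data[BLOCK_SIZE];
    bool has_marker;            /* stands in for an on-flash clean marker */
};

/* A fully erased block reads back as all 0xFF. */
static bool looks_erased(const struct block *b)
{
    for (int i = 0; i < BLOCK_SIZE; i++)
        if (b->data[i] != 0xFF)
            return false;
    return true;
}

static void erase(struct block *b)
{
    memset(b->data, 0xFF, BLOCK_SIZE);
    b->has_marker = false;
}

/* Only a *completed* erase gets stamped with the clean marker. */
static void erase_and_mark(struct block *b)
{
    erase(b);
    b->has_marker = true;
}

/*
 * Mount-time rule from the thread: a block that merely looks erased but
 * carries no clean marker cannot be trusted -- the erase may have been
 * cut short by a power failure -- so erase it again before use.
 */
static bool mount_scan(struct block *b)
{
    if (looks_erased(b) && !b->has_marker) {
        erase_and_mark(b);
        return true;            /* a re-erase was required */
    }
    return false;
}
```

The design point is that "reads as all 0xFF" is treated as necessary but not sufficient evidence of a clean erase; only the marker, written after the erase completed, makes the block trustworthy.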
Re: Race to power off harming SATA SSDs
On Mon 2017-05-08 08:21:34, David Woodhouse wrote: > On Sun, 2017-05-07 at 22:40 +0200, Pavel Machek wrote: > > > > NOTE: unclean SSD power-offs are dangerous and may brick the device in > > > > the worst case, or otherwise harm it (reduce longevity, damage flash > > > > blocks). It is also not impossible to get data corruption. > > > > > I get that the incrementing counters might not be pretty but I'm a bit > > > skeptical about this being an actual issue. Because if that were > > > true, the device would be bricking itself from any sort of power > > > losses be that an actual power loss, battery rundown or hard power off > > > after crash. > > > > And that's exactly what users see. If you do enough power fails on a > > SSD, you usually brick it, some die sooner than others. There was some > > test results published, some are here > > http://lkcl.net/reports/ssd_analysis.html, I believe I seen some > > others too. > > > > It is very hard for a NAND to work reliably in face of power > > failures. In fact, not even Linux MTD + UBIFS works well in that > > regards. See > > http://www.linux-mtd.infradead.org/faq/ubi.html. (Unfortunately, its > > down now?!). If we can't get it right, do you believe SSD manufacturers > > do? > > > > [Issue is, if you powerdown during erase, you get "weakly erased" > > page, which will contain expected 0xff's, but you'll get bitflips > > there quickly. Similar issue exists for writes. It is solveable in > > software, just hard and slow... and we don't do it.] > > It's not that hard. We certainly do it in JFFS2. I was fairly sure that > it was also part of the design considerations for UBI — it really ought > to be right there too. I'm less sure about UBIFS but I would have > expected it to be OK. Are you sure you have it right in JFFS2? Do you journal block erases? Apparently, that was pretty much non-issue on older flashes. 
https://web-beta.archive.org/web/20160923094716/http://www.linux-mtd.infradead.org:80/doc/ubifs.html#L_unstable_bits > SSDs however are often crap; power fail those at your peril. And of > course there's nothing you can do when they do fail, whereas we accept > patches for things which are implemented in Linux. Agreed. If the SSD indicates unexpected powerdown, it is a problem and we need to fix it. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Race to power off harming SATA SSDs
Hi, On 08-05-17 11:06, Ricard Wanderlof wrote: > On Mon, 8 May 2017, David Woodhouse wrote: > > > On Mon, 8 May 2017, David Woodhouse wrote: > > > > Our empirical testing trumps your "can never happen" theory :) > > > I'm sure it does. But what is the explanation then? Has anyone analyzed what is going on using an oscilloscope to verify relationship between erase command and supply voltage drop? > > Not that I'm aware of. Once we have reached the "it does happen and we have to cope" there was not a lot of point in working out *why* it happened. > > In fact, the only examples I *personally* remember were on NOR flash, which takes longer to erase. So it's vaguely possible that it doesn't happen on NAND. But really, it's not something we should be depending on and the software mechanisms have to remain in place. > My point is really that say that the problem is in fact not that the erase is cut short due to the power fail, but that the software issues a second command before the first erase command has completed, for instance, or some other situation. Then we'd have a concrete situation which we can resolve (i.e., fix the bug), rather than assuming that it's the hardware's fault and implement various software workarounds. You're forgetting that the SSD itself (this thread is about SSDs) also has a major software component which is doing housekeeping all the time, so even if the main CPU gets reset the SSD's controller may still happily be erasing blocks. Regards, Hans
Re: Race to power off harming SATA SSDs
On Mon, 8 May 2017, David Woodhouse wrote: > > On Mon, 8 May 2017, David Woodhouse wrote: > > > Our empirical testing trumps your "can never happen" theory :) > > > > I'm sure it does. But what is the explanation then? Has anyone analyzed > > what is going on using an oscilloscope to verify relationship between > > erase command and supply voltage drop? > > Not that I'm aware of. Once we have reached the "it does happen and we > have to cope" there was not a lot of point in working out *why* it > happened. > > In fact, the only examples I *personally* remember were on NOR flash, > which takes longer to erase. So it's vaguely possible that it doesn't > happen on NAND. But really, it's not something we should be depending > on and the software mechanisms have to remain in place. My point is really that say that the problem is in fact not that the erase is cut short due to the power fail, but that the software issues a second command before the first erase command has completed, for instance, or some other situation. Then we'd have a concrete situation which we can resolve (i.e., fix the bug), rather than assuming that it's the hardware's fault and implement various software workarounds. On the other hand, making the software resilient to erase problems essentially makes the system more robust in any case, so it's not a bad thing of course. It's just that I've seen this "we're software guys, and it must be the hardware's fault" (and vice versa) enough times to cause a small warning bell to go off here. /Ricard -- Ricard Wolf Wanderlöf ricardw(at)axis.com Axis Communications AB, Lund, Sweden www.axis.com Phone +46 46 272 2016 Fax +46 46 13 61 30
Re: Race to power off harming SATA SSDs
On Mon, 2017-05-08 at 10:36 +0200, Ricard Wanderlof wrote: > On Mon, 8 May 2017, David Woodhouse wrote: > > Our empirical testing trumps your "can never happen" theory :) > > I'm sure it does. But what is the explanation then? Has anyone analyzed > what is going on using an oscilloscope to verify relationship between > erase command and supply voltage drop? Not that I'm aware of. Once we have reached the "it does happen and we have to cope" there was not a lot of point in working out *why* it happened. In fact, the only examples I *personally* remember were on NOR flash, which takes longer to erase. So it's vaguely possible that it doesn't happen on NAND. But really, it's not something we should be depending on and the software mechanisms have to remain in place. smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH] libfc: do not flood console with messages 'libfc: queue full ...'
On Thu, Apr 27, 2017 at 04:25:03PM +0200, Hannes Reinecke wrote: > When the FCoE sending side becomes congested libfc tries to > reduce the queue depth on the host; however due to the built-in > lag before attempting to ramp down the queue depth _again_ the > message log is flooded with messages > > libfc: queue full, reducing can_queue to 512 > > With this patch the message is printed only once (ie when it's > actually changed). > > Signed-off-by: Hannes Reinecke > --- Acked-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: Race to power off harming SATA SSDs
On Mon, 8 May 2017, David Woodhouse wrote: > > I've got a problem with the underlying mechanism. How long does it take to > > erase a NAND block? A couple of milliseconds. That means that for an erase > > to be "weak" due to a power fail, the host CPU must issue an erase command, > > and then the power to the NAND must drop within those milliseconds. > > However, in most systems there will be a power monitor which will > > essentially reset the CPU as soon as the power starts dropping. So in > > practice, by the time the voltage is too low to successfully supply the > > NAND chip, the CPU has already been reset, hence, no reset command will > > have been given by the time NAND runs out of steam. > > > > Sure, with switchmode power supplies, we don't have those large capacitors > > in the power supply which can keep the power going for a second or more, > > but still, I would think that the power wouldn't die fast enough for this > > to be an issue. > > > Our empirical testing trumps your "can never happen" theory :) I'm sure it does. But what is the explanation then? Has anyone analyzed what is going on using an oscilloscope to verify relationship between erase command and supply voltage drop? /Ricard -- Ricard Wolf Wanderlöf ricardw(at)axis.com Axis Communications AB, Lund, Sweden www.axis.com Phone +46 46 272 2016 Fax +46 46 13 61 30
Re: Race to power off harming SATA SSDs
On Mon, 2017-05-08 at 09:38 +0200, Ricard Wanderlof wrote: > On Mon, 8 May 2017, David Woodhouse wrote: > > > > > > > > > [Issue is, if you powerdown during erase, you get "weakly erased" > > > page, which will contain expected 0xff's, but you'll get bitflips > > > there quickly. Similar issue exists for writes. It is solveable in > > > software, just hard and slow... and we don't do it.] > > It's not that hard. We certainly do it in JFFS2. I was fairly sure that > > it was also part of the design considerations for UBI — it really ought > > to be right there too. I'm less sure about UBIFS but I would have > > expected it to be OK. > I've got a problem with the underlying mechanism. How long does it take to > erase a NAND block? A couple of milliseconds. That means that for an erase > to be "weak" due to a power fail, the host CPU must issue an erase command, > and then the power to the NAND must drop within those milliseconds. > However, in most systems there will be a power monitor which will > essentially reset the CPU as soon as the power starts dropping. So in > practice, by the time the voltage is too low to successfully supply the > NAND chip, the CPU has already been reset, hence, no reset command will > have been given by the time NAND runs out of steam. > > Sure, with switchmode power supplies, we don't have those large capacitors > in the power supply which can keep the power going for a second or more, > but still, I would think that the power wouldn't die fast enough for this > to be an issue. > > But I could very well be wrong and I haven't had experience with that many > NAND flash systems. But then please tell me where the above reasoning is > flawed. Our empirical testing trumps your "can never happen" theory :) smime.p7s Description: S/MIME cryptographic signature
[PATCH 19/19] aacraid: Update driver version to 50834
Update the driver version to 50834 Signed-off-by: Raghava Aditya Renukunta --- drivers/scsi/aacraid/aacraid.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h index 58ccd2a..0995265 100644 --- a/drivers/scsi/aacraid/aacraid.h +++ b/drivers/scsi/aacraid/aacraid.h @@ -97,7 +97,7 @@ enum { #define PMC_GLOBAL_INT_BIT0 0x0001 #ifndef AAC_DRIVER_BUILD -# define AAC_DRIVER_BUILD 50792 +# define AAC_DRIVER_BUILD 50834 # define AAC_DRIVER_BRANCH "-custom" #endif #define MAXIMUM_NUM_CONTAINERS 32 -- 2.7.4
Re: Race to power off harming SATA SSDs
On Mon, 8 May 2017, David Woodhouse wrote: > > [Issue is, if you powerdown during erase, you get "weakly erased" > > page, which will contain expected 0xff's, but you'll get bitflips > > there quickly. Similar issue exists for writes. It is solveable in > > software, just hard and slow... and we don't do it.] > > It's not that hard. We certainly do it in JFFS2. I was fairly sure that > it was also part of the design considerations for UBI — it really ought > to be right there too. I'm less sure about UBIFS but I would have > expected it to be OK. I've got a problem with the underlying mechanism. How long does it take to erase a NAND block? A couple of milliseconds. That means that for an erase to be "weak" due to a power fail, the host CPU must issue an erase command, and then the power to the NAND must drop within those milliseconds. However, in most systems there will be a power monitor which will essentially reset the CPU as soon as the power starts dropping. So in practice, by the time the voltage is too low to successfully supply the NAND chip, the CPU has already been reset, hence, no reset command will have been given by the time NAND runs out of steam. Sure, with switchmode power supplies, we don't have those large capacitors in the power supply which can keep the power going for a second or more, but still, I would think that the power wouldn't die fast enough for this to be an issue. But I could very well be wrong and I haven't had experience with that many NAND flash systems. But then please tell me where the above reasoning is flawed. /Ricard -- Ricard Wolf Wanderlöf ricardw(at)axis.com Axis Communications AB, Lund, Sweden www.axis.com Phone +46 46 272 2016 Fax +46 46 13 61 30
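Ricard's timing argument comes down to comparing the supply rail's hold-up time against the few-millisecond block erase. A back-of-the-envelope t = C * dV / I sketch shows the two can be the same order of magnitude, which is why the race can plausibly go either way; every component value below is hypothetical, not a measurement from any board in this thread:

```c
#include <assert.h>

/*
 * Hold-up time of a supply rail after input power is lost:
 *     t = C * dV / I
 * where C is the bulk capacitance on the rail, dV the margin between the
 * nominal voltage and the brown-out threshold, and I the load current.
 */
static double holdup_ms(double cap_farads, double delta_v, double load_amps)
{
    return cap_farads * delta_v / load_amps * 1000.0;
}
/*
 * With made-up but plausible numbers -- 100 uF of bulk capacitance, a
 * 5.0 V rail browning out at 2.7 V (dV = 2.3 V), and 30 mA drawn by
 * CPU + flash -- holdup_ms(100e-6, 2.3, 30e-3) comes to roughly 7.7 ms:
 * the same order as a "couple of milliseconds" block erase, so whether
 * the erase finishes before the rail collapses depends on the board.
 */
```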
Re: [PATCH] scsi_lib: Add #include
On Tue, May 02, 2017 at 10:45:03AM -0700, Bart Van Assche wrote: > This patch avoids that when building with W=1 the compiler > complains that __scsi_init_queue() has not been declared. > See also commit d48777a633d6 ("scsi: remove __scsi_alloc_queue"). > > Signed-off-by: Bart Van Assche > Cc: Christoph Hellwig > Cc: Hannes Reinecke > Cc: Johannes Thumshirn > --- Looks good, Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: Race to power off harming SATA SSDs
On Sun, 2017-05-07 at 22:40 +0200, Pavel Machek wrote: > > > NOTE: unclean SSD power-offs are dangerous and may brick the device in > > > the worst case, or otherwise harm it (reduce longevity, damage flash > > > blocks). It is also not impossible to get data corruption. > > > I get that the incrementing counters might not be pretty but I'm a bit > > skeptical about this being an actual issue. Because if that were > > true, the device would be bricking itself from any sort of power > > losses be that an actual power loss, battery rundown or hard power off > > after crash. > > And that's exactly what users see. If you do enough power fails on a > SSD, you usually brick it, some die sooner than others. There was some > test results published, some are here > http://lkcl.net/reports/ssd_analysis.html, I believe I seen some > others too. > > It is very hard for a NAND to work reliably in face of power > failures. In fact, not even Linux MTD + UBIFS works well in that > regards. See > http://www.linux-mtd.infradead.org/faq/ubi.html. (Unfortunately, its > down now?!). If we can't get it right, do you believe SSD manufacturers > do? > > [Issue is, if you powerdown during erase, you get "weakly erased" > page, which will contain expected 0xff's, but you'll get bitflips > there quickly. Similar issue exists for writes. It is solveable in > software, just hard and slow... and we don't do it.] It's not that hard. We certainly do it in JFFS2. I was fairly sure that it was also part of the design considerations for UBI — it really ought to be right there too. I'm less sure about UBIFS but I would have expected it to be OK. SSDs however are often crap; power fail those at your peril. And of course there's nothing you can do when they do fail, whereas we accept patches for things which are implemented in Linux. smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH v3] lpfc: Fix panic on BFS configuration.
On Thu, Apr 27, 2017 at 03:08:26PM -0700, jsmart2...@gmail.com wrote: > From: James Smart > > To select the appropriate shost template, the driver is issuing > a mailbox command to retrieve the wwn. Turns out the sending of > the command precedes the reset of the function. On SLI-4 adapters, > this is inconsequential as the mailbox command location is specified > by dma via the BMBX register. However, on SLI-3 adapters, the > location of the mailbox command submission area changes. When the > function is first powered on or reset, the cmd is submitted via PCI > bar memory. Later the driver changes the function config to use > host memory and DMA. The request to start a mailbox command is the > same, a simple doorbell write, regardless of submission area. > So.. if there has not been a boot driver run against the adapter, > the mailbox command works as defaults are ok. But, if the boot > driver has configured the card, and if no platform pci > function/slot reset occurs as the os starts, the mailbox command > will fail. The SLI-3 device will use the stale boot driver dma > location. This can cause PCI eeh errors. > > Fix is to reset the sli-3 function before sending the > mailbox command, thus synchronizing the function/driver on mailbox > location. > > Note: The fix uses routines that are typically invoked later in the > call flow to reset the sli-3 device. The issue in using those routines is > that the normal (non-fix) flow does additional initialization, namely the > allocation of the pport structure. So, rather than significantly reworking > the initialization flow so that the pport is alloc'd first, pointer checks > are added to work around it. Checks are limited to the routines invoked > by a sli-3 adapter (s3 routines) as this fix/early call is only invoked > on a sli3 adapter. Nothing changes post the fix. Subsequent initialization, > and another adapter reset, still occur - both on sli-3 and sli-4 adapters. 
> > Signed-off-by: Dick Kennedy > Signed-off-by: James Smart > Fixes: 96418b5e2c88 ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.") > Cc: sta...@vger.kernel.org > --- Looks good, Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
[PATCH v2] sd: Write lock zone for REQ_OP_WRITE_ZEROES
From: Damien Le Moal For a zoned block device, sd_zbc_complete() handles zone write unlock on completion of a REQ_OP_WRITE_ZEROES command but the zone write locking is missing from sd_setup_write_zeroes_cmnd(). This patch fixes this problem by locking the target zone of a REQ_OP_WRITE_ZEROES request. Signed-off-by: Damien Le Moal --- Changes from v1: * Lock zone in all cases so that no warning is triggered in sd_zbc_complete() with unlocked zones for the unmap case. drivers/scsi/sd.c | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index e60a309..de9e2f2 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -827,21 +827,32 @@ static int sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd) struct scsi_disk *sdkp = scsi_disk(rq->rq_disk); u64 sector = blk_rq_pos(rq) >> (ilog2(sdp->sector_size) - 9); u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9); + int ret; if (!(rq->cmd_flags & REQ_NOUNMAP)) { switch (sdkp->zeroing_mode) { case SD_ZERO_WS16_UNMAP: - return sd_setup_write_same16_cmnd(cmd, true); + ret = sd_setup_write_same16_cmnd(cmd, true); + goto out; case SD_ZERO_WS10_UNMAP: - return sd_setup_write_same10_cmnd(cmd, true); + ret = sd_setup_write_same10_cmnd(cmd, true); + goto out; } } if (sdp->no_write_same) return BLKPREP_INVALID; + if (sdkp->ws16 || sector > 0x || nr_sectors > 0x) - return sd_setup_write_same16_cmnd(cmd, false); - return sd_setup_write_same10_cmnd(cmd, false); + ret = sd_setup_write_same16_cmnd(cmd, false); + else + ret = sd_setup_write_same10_cmnd(cmd, false); + +out: + if (sd_is_zoned(sdkp) && ret == BLKPREP_OK) + return sd_zbc_write_lock_zone(cmd); + + return ret; } static void sd_config_write_same(struct scsi_disk *sdkp) -- 2.9.3
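The invariant the v2 patch enforces (return BLKPREP_OK with the target zone write-locked, and any error status with it unlocked, so the completion path never sees a stray lock state) can be sketched as a toy model. The function names, status values, and lock representation below are invented for illustration and are not the kernel API:

```c
#include <assert.h>
#include <stdbool.h>

#define BLKPREP_OK      0   /* status values mirror the kernel's, sketch only */
#define BLKPREP_INVALID 1

static int zone_locks;      /* stand-in for the per-zone write lock */

static void zone_lock(void)   { zone_locks++; }
static void zone_unlock(void) { zone_locks--; }

/* Hypothetical inner setup step that may reject the request. */
static int setup_inner(bool fail)
{
    return fail ? BLKPREP_INVALID : BLKPREP_OK;
}

/*
 * The invariant: leave the zone write-locked exactly when BLKPREP_OK is
 * returned (the completion path will unlock it), and unlocked on every
 * error return -- otherwise completion would either unlock a zone that
 * was never locked or leak a lock on error.
 */
static int setup_write_zeroes(bool fail)
{
    int ret = setup_inner(fail);

    if (ret == BLKPREP_OK)
        zone_lock();    /* lock only when the command will be issued */
    return ret;
}
```

In this model the success path leaves exactly one lock held for the completion path to release, and the error path leaves the count at zero, which is the pairing the patch restores for write-zeroes requests.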
[PATCH] sd: Write lock zone for REQ_OP_WRITE_ZEROES
From: Damien Le Moal For a zoned block device, sd_zbc_complete() handles zone write unlock on completion of a REQ_OP_WRITE_ZEROES command but the zone write locking is missing from sd_setup_write_zeroes_cmnd(). This patch fixes this problem by locking the target zone of a REQ_OP_WRITE_ZEROES request, ignoring the unmap == true case (as in this case, the write same command will necessarily be aligned on the entire zone and be equivalent to a discard which does not write lock zones). Signed-off-by: Damien Le Moal --- drivers/scsi/sd.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index e60a309..b0a525d 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -827,6 +827,7 @@ static int sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd) struct scsi_disk *sdkp = scsi_disk(rq->rq_disk); u64 sector = blk_rq_pos(rq) >> (ilog2(sdp->sector_size) - 9); u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9); + int ret; if (!(rq->cmd_flags & REQ_NOUNMAP)) { switch (sdkp->zeroing_mode) { @@ -839,9 +840,22 @@ static int sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd) if (sdp->no_write_same) return BLKPREP_INVALID; + + if (sd_is_zoned(sdkp)) { + ret = sd_zbc_write_lock_zone(cmd); + if (ret != BLKPREP_OK) + return ret; + } + if (sdkp->ws16 || sector > 0x || nr_sectors > 0x) - return sd_setup_write_same16_cmnd(cmd, false); - return sd_setup_write_same10_cmnd(cmd, false); + ret = sd_setup_write_same16_cmnd(cmd, false); + else + ret = sd_setup_write_same10_cmnd(cmd, false); + + if (sd_is_zoned(sdkp) && ret != BLKPREP_OK) + sd_zbc_write_unlock_zone(cmd); + + return ret; } static void sd_config_write_same(struct scsi_disk *sdkp) -- 2.9.3
[PATCH] sd: Unlock zone in case of error in sd_setup_write_same_cmnd()
From: Damien Le Moal scsi_init_io() may fail, leaving a zone of a zoned block device locked. Fix this by properly unlocking the write same request target zone if scsi_init_io() fails. Signed-off-by: Damien Le Moal Cc: sta...@vger.kernel.org --- drivers/scsi/sd.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index f9d1432..e60a309 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -948,6 +948,10 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd) rq->__data_len = sdp->sector_size; ret = scsi_init_io(cmd); rq->__data_len = nr_bytes; + + if (sd_is_zoned(sdkp) && ret != BLKPREP_OK) + sd_zbc_write_unlock_zone(cmd); + return ret; } -- 2.9.3