Re:
With profound love in my heart, I Kindly Oblige your interest to very important proposal.. It is Truly Divine and require your utmost attention.. S hlubokou láskou v mém srdci, Laskave jsem prinutit svuj zájem k návrhu .. Je velmi duležité, skutecne Divine a vyžadují vaši nejvyšší pozornost. Kontaktujte me prímo pres: helenarobert...@gmail.com pro úplné podrobnosti.complete. HELINA .A ROBERTS --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus
[PATCH 1/3] target: Fix compare_and_write_callback handling for non GOOD status
From: Nicholas Bellinger Following the bugfix for handling non SAM_STAT_GOOD COMPARE_AND_WRITE status during COMMIT phase in commit 9b2792c3da1, the same bug exists for the READ phase as well. This would manifest first as a lost SCSI response, and eventual hung task during fabric driver logout or re-login, as existing shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref to reach zero. To address this bug, compare_and_write_callback() has been changed to set post_ret = 1 and return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE as necessary to signal failure status. Reported-by: Bill Borsari Cc: Bill Borsari Tested-by: Gary Guo Cc: Gary Guo Signed-off-by: Nicholas Bellinger --- drivers/target/target_core_sbc.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c index f9250b3..a0ad618 100644 --- a/drivers/target/target_core_sbc.c +++ b/drivers/target/target_core_sbc.c @@ -507,8 +507,11 @@ static sense_reason_t compare_and_write_callback(struct se_cmd *cmd, bool succes * been failed with a non-zero SCSI status. */ if (cmd->scsi_status) { - pr_err("compare_and_write_callback: non zero scsi_status:" + pr_debug("compare_and_write_callback: non zero scsi_status:" " 0x%02x\n", cmd->scsi_status); + *post_ret = 1; + if (cmd->scsi_status == SAM_STAT_CHECK_CONDITION) + ret = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE; goto out; } -- 1.9.1
[PATCH 3/3] target: Don't force session reset if queue_depth does not change
From: Nicholas Bellinger Keeping in the idempotent nature of target_core_fabric_configfs.c, if a queue_depth value is set and it's the same as the existing value, don't attempt to force session reinstatement. Reported-by: Raghu Krishnamurthy Cc: Raghu Krishnamurthy Tested-by: Gary Guo Cc: Gary Guo Signed-off-by: Nicholas Bellinger --- drivers/target/target_core_tpg.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c index dfaef4d..310d9e5 100644 --- a/drivers/target/target_core_tpg.c +++ b/drivers/target/target_core_tpg.c @@ -398,6 +398,13 @@ int core_tpg_set_initiator_node_queue_depth( struct se_portal_group *tpg = acl->se_tpg; /* +* Allow the setting of se_node_acl queue_depth to be idempotent, +* and not force a session shutdown event if the value is not +* changing. +*/ + if (acl->queue_depth == queue_depth) + return 0; + /* * User has requested to change the queue depth for a Initiator Node. * Change the value in the Node's struct se_node_acl, and call * target_set_nacl_queue_depth() to set the new queue depth. -- 1.9.1
[PATCH 2/3] iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement
From: Nicholas Bellinger While testing modification of per se_node_acl queue_depth forcing session reinstatement via lio_target_nacl_cmdsn_depth_store() -> core_tpg_set_initiator_node_queue_depth(), a hung task bug triggered when changing cmdsn_depth invoked session reinstatement while an iscsi login was already waiting for session reinstatement to complete. This can happen when an outstanding se_cmd descriptor is taking a long time to complete, and session reinstatement from iscsi login or cmdsn_depth change occurs concurrently. To address this bug, explicitly set session_fall_back_to_erl0 = 1 when forcing session reinstatement, so session reinstatement is not attempted if an active session is already being shutdown. This patch has been tested with two scenarios. The first when iscsi login is blocked waiting for iscsi session reinstatement to complete followed by queue_depth change via configfs, and second when queue_depth change via configfs us blocked followed by a iscsi login driven session reinstatement. Note this patch depends on commit d36ad77f702 to handle multiple sessions per se_node_acl when changing cmdsn_depth, and for pre v4.5 kernels will need to be included for stable as well. Reported-by: Gary Guo Tested-by: Gary Guo Cc: Gary Guo Signed-off-by: Nicholas Bellinger --- drivers/target/iscsi/iscsi_target.c | 1 + drivers/target/iscsi/iscsi_target_configfs.c | 1 + drivers/target/iscsi/iscsi_target_login.c| 1 + 3 files changed, 3 insertions(+) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 0f7ade0..26a9bcd 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -4663,6 +4663,7 @@ int iscsit_release_sessions_for_tpg(struct iscsi_portal_group *tpg, int force) continue; } atomic_set(&sess->session_reinstatement, 1); + atomic_set(&sess->session_fall_back_to_erl0, 1); spin_unlock(&sess->conn_lock); list_move_tail(&se_sess->sess_list, &free_list); diff --git a/drivers/target/iscsi/iscsi_target_configfs.c b/drivers/target/iscsi/iscsi_target_configfs.c index 344e844..96d9c73 100644 --- a/drivers/target/iscsi/iscsi_target_configfs.c +++ b/drivers/target/iscsi/iscsi_target_configfs.c @@ -1528,6 +1528,7 @@ static void lio_tpg_close_session(struct se_session *se_sess) return; } atomic_set(&sess->session_reinstatement, 1); + atomic_set(&sess->session_fall_back_to_erl0, 1); spin_unlock(&sess->conn_lock); iscsit_stop_time2retain_timer(sess); diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c index ad8f301..6623847 100644 --- a/drivers/target/iscsi/iscsi_target_login.c +++ b/drivers/target/iscsi/iscsi_target_login.c @@ -208,6 +208,7 @@ int iscsi_check_for_session_reinstatement(struct iscsi_conn *conn) initiatorname_param->value) && (sess_p->sess_ops->SessionType == sessiontype))) { atomic_set(&sess_p->session_reinstatement, 1); + atomic_set(&sess_p->session_fall_back_to_erl0, 1); spin_unlock(&sess_p->conn_lock); iscsit_inc_session_usage_count(sess_p); iscsit_stop_time2retain_timer(sess_p); -- 1.9.1
[PATCH 0/3] target: Fixes for v4.12-rc1
From: Nicholas Bellinger Hi all, Here are a couple of fixes from the last weeks testing while continuing longevity and scale out workloads on v4.x target code. This series contains three patches. The first is to address a COMPARE_AND_WRITE se_cmd reference leak where the READ phase hits a non GOOD status, observed with ESX VAAI hosts when outstanding READ I/O reaches a point where non SAM_STAT_GOOD status completions start to happen. The second addresses a hung task bug observed with iscsi-target ports while explicitly changing the active per se_node_acl queue_depth via the existing configfs attribute, if a new iscsi login was already forcing session reinstatement. And the third to is avoid forcing an session reinstatement if queue_depth is changed via configfs, but the value itself has not changed. The series has been verified on v4.1.y by DATERA Q/A and automation teams. Thank you, --nab Nicholas Bellinger (3): target: Fix compare_and_write_callback handling for non GOOD status iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement target: Don't force session reset if queue_depth does not change drivers/target/iscsi/iscsi_target.c | 1 + drivers/target/iscsi/iscsi_target_configfs.c | 1 + drivers/target/iscsi/iscsi_target_login.c| 1 + drivers/target/target_core_sbc.c | 5 - drivers/target/target_core_tpg.c | 7 +++ 5 files changed, 14 insertions(+), 1 deletion(-) -- 1.9.1
[Bug 176951] boot fails unless acpi=off Acer Travelmate X-349
https://bugzilla.kernel.org/show_bug.cgi?id=176951 Max (700...@gmail.com) changed: What|Removed |Added CC||700...@gmail.com --- Comment #23 from Max (700...@gmail.com) --- I can confirm the bug on my Acer Swift 3 Model SF314-51 (bios 1.08). I able to install just Mint 18.1. I found that it freezes when: 1. ~4 times of 5 when my machine start ups (boot logo/black screen); 2. usually when chrome starts; 3. when brightness changes very fast; 4. when shut down. -- You are receiving this mail because: You are watching the assignee of the bug.
Re: [PATCH v2] ibmvscsis: Do not send aborted task response
On Tue, 2017-05-02 at 13:54 -0500, Bryant G. Ly wrote: > The driver is sending a response to the actual scsi op that was > aborted by an abort task TM, while LIO is sending a response to > the abort task TM. > > ibmvscsis_tgt does not send the response to the client until > release_cmd time. The reason for this was because if we did it > at queue_status time, then the client would be free to reuse the > tag for that command, but we're still using the tag until the > command is released at release_cmd time, so we chose to delay > sending the response until then. That then caused this issue, because > release_cmd is always called, even if queue_status is not. > > SCSI spec says that the initiator that sends the abort task > TM NEVER gets a response to the aborted op and with the current > code it will send a response. Thus this fix will remove that response > if the TAS bit is set. > > Cc: # v4.8+ > Signed-off-by: Bryant G. Ly > Reviewed-by: Tyrel Datwyler > --- > drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 66 > ++-- > 1 file changed, 45 insertions(+), 21 deletions(-) Applied, with a small update to the last sentence of the commit log wrt to 'if ABORTED && !TAS bit is set'. Thanks Bryant + Tyrel.
Re: [PATCH] tcmu: Recalculate the tcmu_cmd size to save cmd area memories
On Tue, 2017-05-02 at 21:06 -0500, Mike Christie wrote: > On 05/02/2017 02:54 AM, lixi...@cmss.chinamobile.com wrote: > > From: Xiubo Li > > > > For the "struct tcmu_cmd_entry" in cmd area, the minimum size > > will be sizeof(struct tcmu_cmd_entry) == 112 Bytes. And it could > > fill about (sizeof(struct rsp) - sizeof(struct req)) / > > sizeof(struct iovec) == 68 / 16 ~= 4 data regions(iov[4]) by > > default. > > > > For most tcmu_cmds, the data block indexes allocated from the > > data area will be continuous. And for the continuous blocks they > > will be merged into the same region using only one iovec. For > > the current code, it will always allocates the same number of > > iovecs with blocks for each tcmu_cmd, and it will wastes much > > memories. > > > > For example, when the block size is 4K and the DATA_OUT buffer > > size is 64K, and the regions needed is less than 5(on my > > environment is almost 99.7%). The current code will allocate > > about 16 iovecs, and there will be (16 - 4) * sizeof(struct > > iovec) = 192 Bytes cmd area memories wasted. > > > > Here adds two helpers to calculate the base size and full size > > of the tcmu_cmd. And will recalculate them again when it make sure > > how many iovs is needed before insert it to cmd area. > > > > Signed-off-by: Xiubo Li > > Looks ok to me. Thanks. > > Acked-by: Mike Christie Applied. Thanks Xiubo + MNC.
Re: [PATCH 25/27] block: remove the discard_zeroes_data flag
On Tue, 2017-05-02 at 09:23 +0200, h...@lst.de wrote: > On Tue, May 02, 2017 at 12:16:13AM -0700, Nicholas A. Bellinger wrote: > > Or, another options is use bdev_write_zeroes_sectors() to determine when > > dev_attrib->unmap_zeroes_data should be set. > > Yes, that in combination with your patch to use bdev_write_zeroes_sectors > for zeroing from write same seems like the right fix. The larger target/iblock conversion patch looks like post v4.12 material at this point, so to avoid breakage wrt to existing LBPRZ behavior, I'll plan to push the following patch post -rc1. diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c index d2f089c..e7caf78 100644 --- a/drivers/target/target_core_device.c +++ b/drivers/target/target_core_device.c @@ -851,7 +851,7 @@ bool target_configure_unmap_from_queue(struct se_dev_attrib *attrib, attrib->unmap_granularity = q->limits.discard_granularity / block_size; attrib->unmap_granularity_alignment = q->limits.discard_alignment / block_size; - attrib->unmap_zeroes_data = 0; + attrib->unmap_zeroes_data = (q->limits.max_write_zeroes_sectors); return true; } EXPORT_SYMBOL(target_configure_unmap_from_queue);
Re: [PATCH] tcmu: Recalculate the tcmu_cmd size to save cmd area memories
On 05/02/2017 02:54 AM, lixi...@cmss.chinamobile.com wrote: > From: Xiubo Li > > For the "struct tcmu_cmd_entry" in cmd area, the minimum size > will be sizeof(struct tcmu_cmd_entry) == 112 Bytes. And it could > fill about (sizeof(struct rsp) - sizeof(struct req)) / > sizeof(struct iovec) == 68 / 16 ~= 4 data regions(iov[4]) by > default. > > For most tcmu_cmds, the data block indexes allocated from the > data area will be continuous. And for the continuous blocks they > will be merged into the same region using only one iovec. For > the current code, it will always allocates the same number of > iovecs with blocks for each tcmu_cmd, and it will wastes much > memories. > > For example, when the block size is 4K and the DATA_OUT buffer > size is 64K, and the regions needed is less than 5(on my > environment is almost 99.7%). The current code will allocate > about 16 iovecs, and there will be (16 - 4) * sizeof(struct > iovec) = 192 Bytes cmd area memories wasted. > > Here adds two helpers to calculate the base size and full size > of the tcmu_cmd. And will recalculate them again when it make sure > how many iovs is needed before insert it to cmd area. > > Signed-off-by: Xiubo Li Looks ok to me. Thanks. Acked-by: Mike Christie
Re: [PATCH] Avoid that scsi_exit_rq() triggers a use-after-free
On Tue, May 02, 2017 at 10:43:30AM -0700, Bart Van Assche wrote: > This patch fixes the following KASAN complaint: > > == > BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at addr 8802b7fedf00 > Read of size 1 by task rcuos/5/53 > CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0 > ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 > Call Trace: > dump_stack+0x63/0x8f > kasan_object_err+0x21/0x70 > kasan_report.part.1+0x231/0x500 > __asan_report_load1_noabort+0x2e/0x30 > scsi_exit_rq+0xf3/0x120 > free_request_size+0x44/0x60 > mempool_destroy.part.6+0x9b/0x150 > mempool_destroy+0x13/0x20 > blk_exit_rl+0x36/0x40 > blkg_free+0x146/0x200 > __blkg_release_rcu+0x121/0x220 > rcu_nocb_kthread+0x61f/0xca0 > kthread+0x298/0x390 > ret_from_fork+0x2c/0x40 > Object at 8802b7fedd80, in cache kmalloc-2048 size: 2048 > Allocated: > PID = 3992 > save_stack_trace+0x1b/0x20 > save_stack+0x46/0xd0 > kasan_kmalloc+0xad/0xe0 > __kmalloc+0x134/0x220 > scsi_host_alloc+0x6b/0x11c0 > 0xc101d94a > driver_probe_device+0x49e/0xc60 > __device_attach_driver+0x1d3/0x2a0 > bus_for_each_drv+0x11a/0x1d0 > __device_attach+0x1e1/0x2c0 > device_initial_probe+0x13/0x20 > bus_probe_device+0x19b/0x240 > device_add+0x86d/0x1450 > device_register+0x1a/0x20 > 0xc10270ce > 0xc1048a62 > do_one_initcall+0xa7/0x250 > do_init_module+0x1d0/0x55d > load_module+0x7c9f/0x9850 > SYSC_finit_module+0x189/0x1c0 > SyS_finit_module+0xe/0x10 > entry_SYSCALL_64_fastpath+0x1a/0xa9 > Freed: > PID = 4128 > save_stack_trace+0x1b/0x20 > save_stack+0x46/0xd0 > kasan_slab_free+0x71/0xb0 > kfree+0x8d/0x1b0 > scsi_host_dev_release+0x2cb/0x430 > device_release+0x76/0x1e0 > kobject_release+0x107/0x370 > kobject_put+0x56/0xb0 > put_device+0x17/0x20 > scsi_host_put+0x15/0x20 > 0xc101fcd7 > device_release_driver_internal+0x26a/0x4e0 > device_release_driver+0x12/0x20 > bus_remove_device+0x2d0/0x590 > device_del+0x55b/0x920 > device_unregister+0x1a/0xa0 > 0xc101e0ca > 0xc102fccc > SyS_delete_module+0x334/0x3e0 > entry_SYSCALL_64_fastpath+0x1a/0xa9 > Memory state around the buggy address: > 8802b7fede00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > 8802b7fede80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > >8802b7fedf00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >^ > 8802b7fedf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > 8802b7fee000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > == > > Reported-by: Scott Bauer > Fixes: e9c787e65c0c ("scsi: allocate scsi_cmnd structures as part of struct > request") > Signed-off-by: Bart Van Assche > Cc: Scott Bauer > Cc: Christoph Hellwig > Cc: Jan Kara > Cc: Hannes Reinecke > Cc: > --- > drivers/scsi/scsi_lib.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index 15c9fe766071..d698364df072 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -2095,11 +2095,14 @@ static int scsi_init_rq(struct request_queue *q, > struct request *rq, gfp_t gfp) > struct Scsi_Host *shost = q->rq_alloc_data; > struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq); > > + if (!scsi_host_get(shost)) > + goto fail; > + > memset(cmd, 0, sizeof(*cmd)); > > cmd->sense_buffer = scsi_alloc_sense_buffer(shost, gfp, NUMA_NO_NODE); > if (!cmd->sense_buffer) > - goto fail; > + goto put; > cmd->req.sense = cmd->sense_buffer; > > if (scsi_host_get_prot(shost) >= SHOST_DIX_TYPE0_PROTECTION) { > @@ -2112,6 +2115,8 @@ static int scsi_init_rq(struct request_queue *q, struct > request *rq, gfp_t gfp) > > fail_free_sense: > scsi_free_sense_buffer(shost, cmd->sense_buffer); > +put: > + scsi_host_put(shost); > fail: > return -ENOMEM; > } > @@ -2124,6 +2129,7 @@ static void scsi_exit_rq(struct request_queue *q, > struct request *rq) > if (cmd->prot_sdb) > kmem_cache_free(scsi_sdb_cache, cmd->prot_sdb); > scsi_free_sense_buffer(shost, cmd->sense_buffer); > + scsi_host_put(shost); > } > > struct request_queue *scsi_alloc_queue(struct scsi_device *sdev) > -- > 2.12.2 > I've applied this on-top of Jens' For-Linus and re-ran the test. I get the following scheduling while atomic BUG() splat: [ 30.686851] run fstests generic/108 at 2017-05-02 16:56:42 [ 30.953920] XFS (nvme1n1): Unmounting Filesystem [ 31.020543] scsi host2: scsi_debug: version 1.86 [20160430] [ 31.020543] dev_size_mb=128, opts=0x0, submit_queues=1, statistics=0 [ 31.022341] scsi 2:0:0:0: Direct-Access Linuxscsi_debug 0186 PQ: 0 ANSI: 7 [ 31.0
[PATCH v2] ibmvscsis: Do not send aborted task response
The driver is sending a response to the actual scsi op that was aborted by an abort task TM, while LIO is sending a response to the abort task TM. ibmvscsis_tgt does not send the response to the client until release_cmd time. The reason for this was because if we did it at queue_status time, then the client would be free to reuse the tag for that command, but we're still using the tag until the command is released at release_cmd time, so we chose to delay sending the response until then. That then caused this issue, because release_cmd is always called, even if queue_status is not. SCSI spec says that the initiator that sends the abort task TM NEVER gets a response to the aborted op and with the current code it will send a response. Thus this fix will remove that response if the TAS bit is set. Cc: # v4.8+ Signed-off-by: Bryant G. Ly Reviewed-by: Tyrel Datwyler --- drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 66 ++-- 1 file changed, 45 insertions(+), 21 deletions(-) diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c index 4bb5635..d6e62ce 100644 --- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c +++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c @@ -1758,33 +1758,46 @@ static void ibmvscsis_send_messages(struct scsi_info *vscsi) if (!(vscsi->flags & RESPONSE_Q_DOWN)) { list_for_each_entry_safe(cmd, nxt, &vscsi->waiting_rsp, list) { - iue = cmd->iue; - - crq->valid = VALID_CMD_RESP_EL; - crq->format = cmd->rsp.format; + /* +* If Task Abort Status Bit is set, then dont send a +* response. +*/ + if (cmd->se_cmd.transport_state & CMD_T_ABORTED && + !(cmd->se_cmd.transport_state & CMD_T_TAS)) { + list_del(&cmd->list); + ibmvscsis_free_cmd_resources(vscsi, cmd); + } else { + iue = cmd->iue; - if (cmd->flags & CMD_FAST_FAIL) - crq->status = VIOSRP_ADAPTER_FAIL; + crq->valid = VALID_CMD_RESP_EL; + crq->format = cmd->rsp.format; - crq->IU_length = cpu_to_be16(cmd->rsp.len); + if (cmd->flags & CMD_FAST_FAIL) + crq->status = VIOSRP_ADAPTER_FAIL; - rc = h_send_crq(vscsi->dma_dev->unit_address, - be64_to_cpu(msg_hi), - be64_to_cpu(cmd->rsp.tag)); + crq->IU_length = cpu_to_be16(cmd->rsp.len); - pr_debug("send_messages: cmd %p, tag 0x%llx, rc %ld\n", -cmd, be64_to_cpu(cmd->rsp.tag), rc); + rc = h_send_crq(vscsi->dma_dev->unit_address, + be64_to_cpu(msg_hi), + be64_to_cpu(cmd->rsp.tag)); - /* if all ok free up the command element resources */ - if (rc == H_SUCCESS) { - /* some movement has occurred */ - vscsi->rsp_q_timer.timer_pops = 0; - list_del(&cmd->list); + pr_debug("send_messages: cmd %p, tag 0x%llx, rc %ld\n", +cmd, be64_to_cpu(cmd->rsp.tag), rc); - ibmvscsis_free_cmd_resources(vscsi, cmd); - } else { - srp_snd_msg_failed(vscsi, rc); - break; + /* if all ok free up the command element +* resources +*/ + if (rc == H_SUCCESS) { + /* some movement has occurred */ + vscsi->rsp_q_timer.timer_pops = 0; + list_del(&cmd->list); + + ibmvscsis_free_cmd_resources(vscsi, +cmd); + } else { + srp_snd_msg_failed(vscsi, rc); + break; + } } } @@ -3581,9 +3594,20 @@ static int ibmvscsis_write_pending(struct se_cmd *se_cmd) { struct ibmvscsis_cmd *cmd = container_of(se_cmd, struct ibmvscsis_cmd, se_cmd); + struct scsi_info *vs
[PATCH v6 5/5] Make __scsi_remove_device go straight from BLOCKED to DEL
If a device is blocked, make __scsi_remove_device() cause it to transition to the DEL state. This means that all the commands issued in .shutdown() will error in the mid-layer, thus making the removal proceed without being stopped. This patch is a slightly modified version of a patch from James Bottomley. This patch avoids that the following lockup occurs: Call Trace: schedule+0x35/0x80 schedule_timeout+0x237/0x2d0 io_schedule_timeout+0xa6/0x110 wait_for_completion_io+0xa3/0x110 blk_execute_rq+0xdf/0x120 scsi_execute+0xce/0x150 [scsi_mod] scsi_execute_req_flags+0x8f/0xf0 [scsi_mod] sd_sync_cache+0xa9/0x190 [sd_mod] sd_shutdown+0x6a/0x100 [sd_mod] sd_remove+0x64/0xc0 [sd_mod] __device_release_driver+0x8d/0x120 device_release_driver+0x1e/0x30 bus_remove_device+0xf9/0x170 device_del+0x127/0x240 __scsi_remove_device+0xc1/0xd0 [scsi_mod] scsi_forget_host+0x57/0x60 [scsi_mod] scsi_remove_host+0x72/0x110 [scsi_mod] srp_remove_work+0x8b/0x200 [ib_srp] Reported-by: Israel Rukshin Signed-off-by: Bart Van Assche Cc: James Bottomley Cc: Israel Rukshin Cc: Max Gurtovoy Cc: Hannes Reinecke Cc: Benjamin Block --- drivers/scsi/scsi_lib.c | 2 +- drivers/scsi/scsi_sysfs.c | 13 + 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index bbce1f1db515..b83dca6b495b 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2618,7 +2618,6 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state) case SDEV_QUIESCE: case SDEV_OFFLINE: case SDEV_TRANSPORT_OFFLINE: - case SDEV_BLOCK: break; default: goto illegal; @@ -2632,6 +2631,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state) case SDEV_OFFLINE: case SDEV_TRANSPORT_OFFLINE: case SDEV_CANCEL: + case SDEV_BLOCK: case SDEV_CREATED_BLOCK: break; default: diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index a91537a3abbf..1f243ac16010 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -1290,7 +1290,20 @@ void __scsi_remove_device(struct scsi_device *sdev) * wait until it has finished before changing the device state. */ mutex_lock(&sdev->state_mutex); + /* +* If blocked, we go straight to DEL and restart the queue so +* any commands issued during driver shutdown (like sync +* cache) are errored immediately. +*/ res = scsi_device_set_state(sdev, SDEV_CANCEL); + if (res != 0) { + res = scsi_device_set_state(sdev, SDEV_DEL); + if (res == 0) { + scsi_start_queue(sdev); + sdev_printk(KERN_DEBUG, sdev, + "Changed state from BLOCKED to DEL\n"); + } + } mutex_unlock(&sdev->state_mutex); if (res != 0) -- 2.12.2
[PATCH v6 4/5] Introduce scsi_start_queue()
This patch does not change any functionality. Signed-off-by: Bart Van Assche Cc: Israel Rukshin Cc: Max Gurtovoy Cc: Hannes Reinecke Cc: Benjamin Block --- drivers/scsi/scsi_lib.c | 25 +++-- drivers/scsi/scsi_priv.h | 1 + 2 files changed, 16 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index d2854558437d..bbce1f1db515 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -3024,6 +3024,20 @@ static int scsi_internal_device_block(struct scsi_device *sdev) return err; } +void scsi_start_queue(struct scsi_device *sdev) +{ + struct request_queue *q = sdev->request_queue; + unsigned long flags; + + if (q->mq_ops) { + blk_mq_start_stopped_hw_queues(q, false); + } else { + spin_lock_irqsave(q->queue_lock, flags); + blk_start_queue(q); + spin_unlock_irqrestore(q->queue_lock, flags); + } +} + /** * scsi_internal_device_unblock_nowait - resume a device after a block request * @sdev: device to resume @@ -3042,9 +3056,6 @@ static int scsi_internal_device_block(struct scsi_device *sdev) int scsi_internal_device_unblock_nowait(struct scsi_device *sdev, enum scsi_device_state new_state) { - struct request_queue *q = sdev->request_queue; - unsigned long flags; - /* * Try to transition the scsi device to SDEV_RUNNING or one of the * offlined states and goose the device queue if successful. @@ -3062,13 +3073,7 @@ int scsi_internal_device_unblock_nowait(struct scsi_device *sdev, sdev->sdev_state != SDEV_OFFLINE) return -EINVAL; - if (q->mq_ops) { - blk_mq_start_stopped_hw_queues(q, false); - } else { - spin_lock_irqsave(q->queue_lock, flags); - blk_start_queue(q); - spin_unlock_irqrestore(q->queue_lock, flags); - } + scsi_start_queue(sdev); return 0; } diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h index f11bd102d6d5..c7629e31a75b 100644 --- a/drivers/scsi/scsi_priv.h +++ b/drivers/scsi/scsi_priv.h @@ -89,6 +89,7 @@ extern void scsi_run_host_queues(struct Scsi_Host *shost); extern void scsi_requeue_run_queue(struct work_struct *work); extern struct request_queue *scsi_alloc_queue(struct scsi_device *sdev); extern struct request_queue *scsi_mq_alloc_queue(struct scsi_device *sdev); +extern void scsi_start_queue(struct scsi_device *sdev); extern int scsi_mq_setup_tags(struct Scsi_Host *shost); extern void scsi_mq_destroy_tags(struct Scsi_Host *shost); extern int scsi_init_queue(void); -- 2.12.2
[PATCH v6 2/5] Create two versions of scsi_internal_device_unblock()
This will make it easier to serialize SCSI device state changes through a mutex. Signed-off-by: Bart Van Assche Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Johannes Thumshirn Cc: Sreekanth Reddy --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 4 ++-- drivers/scsi/scsi_lib.c | 46 +--- include/scsi/scsi_device.h | 4 ++-- 3 files changed, 36 insertions(+), 18 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index 25e89cfe4417..d671f6e6062c 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -2883,7 +2883,7 @@ _scsih_internal_device_unblock(struct scsi_device *sdev, sdev_printk(KERN_WARNING, sdev, "device_unblock and setting to running, " "handle(0x%04x)\n", sas_device_priv_data->sas_target->handle); sas_device_priv_data->block = 0; - r = scsi_internal_device_unblock(sdev, SDEV_RUNNING); + r = scsi_internal_device_unblock_nowait(sdev, SDEV_RUNNING); if (r == -EINVAL) { /* The device has been set to SDEV_RUNNING by SD layer during * device addition but the request queue is still stopped by @@ -2902,7 +2902,7 @@ _scsih_internal_device_unblock(struct scsi_device *sdev, r, sas_device_priv_data->sas_target->handle); sas_device_priv_data->block = 0; - r = scsi_internal_device_unblock(sdev, SDEV_RUNNING); + r = scsi_internal_device_unblock_nowait(sdev, SDEV_RUNNING); if (r) sdev_printk(KERN_WARNING, sdev, "retried device_unblock" " failed with return(%d) for handle(0x%04x)\n", diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index ca20d3702b45..79bb05fa09d5 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -3016,24 +3016,22 @@ static int scsi_internal_device_block(struct scsi_device *sdev) } /** - * scsi_internal_device_unblock - resume a device after a block request + * scsi_internal_device_unblock_nowait - resume a device after a block request * @sdev: device to resume - * @new_state: state to set devices to after unblocking + * @new_state: state to set the device to after unblocking * - * Called by scsi lld's or the midlayer to restart the device queue - * for the previously suspended scsi device. Called from interrupt or - * normal process context. + * Restart the device queue for a previously suspended SCSI device. Does not + * sleep. * - * Returns zero if successful or error if not. + * Returns zero if successful or a negative error code upon failure. * - * Notes: - * This routine transitions the device to the SDEV_RUNNING state - * or to one of the offline states (which must be a legal transition) - * allowing the midlayer to goose the queue for this device. + * Notes: + * This routine transitions the device to the SDEV_RUNNING state or to one of + * the offline states (which must be a legal transition) allowing the midlayer + * to goose the queue for this device. */ -int -scsi_internal_device_unblock(struct scsi_device *sdev, -enum scsi_device_state new_state) +int scsi_internal_device_unblock_nowait(struct scsi_device *sdev, + enum scsi_device_state new_state) { struct request_queue *q = sdev->request_queue; unsigned long flags; @@ -3065,7 +3063,27 @@ scsi_internal_device_unblock(struct scsi_device *sdev, return 0; } -EXPORT_SYMBOL_GPL(scsi_internal_device_unblock); +EXPORT_SYMBOL_GPL(scsi_internal_device_unblock_nowait); + +/** + * scsi_internal_device_unblock - resume a device after a block request + * @sdev: device to resume + * @new_state: state to set the device to after unblocking + * + * Restart the device queue for a previously suspended SCSI device. May sleep. + * + * Returns zero if successful or a negative error code upon failure. + * + * Notes: + * This routine transitions the device to the SDEV_RUNNING state or to one of + * the offline states (which must be a legal transition) allowing the midlayer + * to goose the queue for this device. + */ +static int scsi_internal_device_unblock(struct scsi_device *sdev, + enum scsi_device_state new_state) +{ + return scsi_internal_device_unblock_nowait(sdev, new_state); +} static void device_block(struct scsi_device *sdev, void *data) diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index e2f43ae3e264..bb784045ba71 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -473,8 +473,8 @@ static inline int scsi_device_created(struct scsi_device *sdev) } int scsi_internal_device_block_nowait(struct scsi_device *sdev); -int scsi_internal_device_unblock(struct scsi_device *sdev, -enum scs
[PATCH v6 1/5] Split scsi_internal_device_block()
Instead of passing a "wait" argument to scsi_internal_device_block(), split this function into a function that waits and a function that doesn't wait. This will make it easier to serialize SCSI device state changes through a mutex. Signed-off-by: Bart Van Assche Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Johannes Thumshirn Cc: Sreekanth Reddy --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 4 +- drivers/scsi/scsi_lib.c | 73 +++- include/scsi/scsi_device.h | 2 +- 3 files changed, 50 insertions(+), 29 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index 919ba2bb15f1..25e89cfe4417 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -2859,7 +2859,7 @@ _scsih_internal_device_block(struct scsi_device *sdev, sas_device_priv_data->sas_target->handle); sas_device_priv_data->block = 1; - r = scsi_internal_device_block(sdev, false); + r = scsi_internal_device_block_nowait(sdev); if (r == -EINVAL) sdev_printk(KERN_WARNING, sdev, "device_block failed with return(%d) for handle(0x%04x)\n", @@ -2895,7 +2895,7 @@ _scsih_internal_device_unblock(struct scsi_device *sdev, "performing a block followed by an unblock\n", r, sas_device_priv_data->sas_target->handle); sas_device_priv_data->block = 1; - r = scsi_internal_device_block(sdev, false); + r = scsi_internal_device_block_nowait(sdev); if (r) sdev_printk(KERN_WARNING, sdev, "retried device_block " "failed with return(%d) for handle(0x%04x)\n", diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index b7b340c494ab..ca20d3702b45 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2937,28 +2937,20 @@ scsi_target_resume(struct scsi_target *starget) EXPORT_SYMBOL(scsi_target_resume); /** - * scsi_internal_device_block - internal function to put a device temporarily into the SDEV_BLOCK state - * @sdev: device to block - * @wait: Whether or not to wait until ongoing .queuecommand() / - * .queue_rq() calls have finished. + * scsi_internal_device_block_nowait - try to transition to the SDEV_BLOCK state + * @sdev: device to block * - * Block request made by scsi lld's to temporarily stop all - * scsi commands on the specified device. May sleep. + * Pause SCSI command processing on the specified device. Does not sleep. * - * Returns zero if successful or error if not + * Returns zero if successful or a negative error code upon failure. * - * Notes: - * This routine transitions the device to the SDEV_BLOCK state - * (which must be a legal transition). When the device is in this - * state, all commands are deferred until the scsi lld reenables - * the device with scsi_device_unblock or device_block_tmo fires. - * - * To do: avoid that scsi_send_eh_cmnd() calls queuecommand() after - * scsi_internal_device_block() has blocked a SCSI device and also - * remove the rport mutex lock and unlock calls from srp_queuecommand(). + * Notes: + * This routine transitions the device to the SDEV_BLOCK state (which must be + * a legal transition). When the device is in this state, command processing + * is paused until the device leaves the SDEV_BLOCK state. See also + * scsi_internal_device_unblock_nowait(). */ -int -scsi_internal_device_block(struct scsi_device *sdev, bool wait) +int scsi_internal_device_block_nowait(struct scsi_device *sdev) { struct request_queue *q = sdev->request_queue; unsigned long flags; @@ -2978,21 +2970,50 @@ scsi_internal_device_block(struct scsi_device *sdev, bool wait) * request queue. */ if (q->mq_ops) { - if (wait) - blk_mq_quiesce_queue(q); - else - blk_mq_stop_hw_queues(q); + blk_mq_stop_hw_queues(q); } else { spin_lock_irqsave(q->queue_lock, flags); blk_stop_queue(q); spin_unlock_irqrestore(q->queue_lock, flags); - if (wait) - scsi_wait_for_queuecommand(sdev); } return 0; } -EXPORT_SYMBOL_GPL(scsi_internal_device_block); +EXPORT_SYMBOL_GPL(scsi_internal_device_block_nowait); + +/** + * scsi_internal_device_block - try to transition to the SDEV_BLOCK state + * @sdev: device to block + * + * Pause SCSI command processing on the specified device and wait until all + * ongoing scsi_request_fn() / scsi_queue_rq() calls have finished. May sleep. + * + * Returns zero if successful or a negative error code upon failure. + * + * Note: + * This routine transitions the device to the SDEV_BLOCK state (which must be + * a legal transition). When the device is in th
[PATCH v6 0/5] Avoid that __scsi_remove_device() hangs
__scsi_remove_device() hangs if it is waiting for the SYNCHRONIZE CACHE command submitted by the sd driver to finish if the block layer queue is stopped and does not get restarted. This patch series avoids that that hang occurs. Changes compared to v5: - Reworked the approach. Instead of submitting the SYNCHRONIZE CACHE command asynchronously, make __scsi_remove_device() go straight from BLOCKED to DEL. Changes compared to v4: - Fixed the deadlock reported by the 0-day bot by changing a blk_put_request() call into __blk_put_request(). Changes compared to v3: - Removed the logging statements from sd_sync_cache_done() that triggered a crash due to a use-after-free. Changes compared to v2: - Moved the "stop_disk" assignment after the sdkp check in the sd driver. - Added a completion function for asynchronous SYNCHRONIZE CACHE commands. - Added "disk" and "done" arguments to scsi_execute_async(). Changes compared to v1: - Reworked the approach of this patch series. Bart Van Assche (5): Split scsi_internal_device_block() Create two versions of scsi_internal_device_unblock() Protect SCSI device state changes with a mutex Introduce scsi_start_queue() Make __scsi_remove_device go straight from BLOCKED to DEL drivers/scsi/mpt3sas/mpt3sas_scsih.c | 8 +- drivers/scsi/scsi_error.c| 8 +- drivers/scsi/scsi_lib.c | 171 +++ drivers/scsi/scsi_priv.h | 1 + drivers/scsi/scsi_scan.c | 16 ++-- drivers/scsi/scsi_sysfs.c| 37 +++- drivers/scsi/scsi_transport_srp.c| 7 +- drivers/scsi/sd.c| 7 +- include/scsi/scsi_device.h | 7 +- 9 files changed, 181 insertions(+), 81 deletions(-) -- 2.12.2
[PATCH v6 3/5] Protect SCSI device state changes with a mutex
Enable this mechanism for all scsi_target_*block() callers but not for the scsi_internal_device_unblock() calls from the mpt3sas driver because that driver can call scsi_internal_device_unblock() from atomic context. Signed-off-by: Bart Van Assche Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Johannes Thumshirn --- drivers/scsi/scsi_error.c | 8 +++- drivers/scsi/scsi_lib.c | 27 +-- drivers/scsi/scsi_scan.c | 16 +--- drivers/scsi/scsi_sysfs.c | 24 +++- drivers/scsi/scsi_transport_srp.c | 7 --- drivers/scsi/sd.c | 7 +-- include/scsi/scsi_device.h| 1 + 7 files changed, 66 insertions(+), 24 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index f2cafae150bc..02f5f7f49885 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -1696,11 +1696,17 @@ static void scsi_eh_offline_sdevs(struct list_head *work_q, struct list_head *done_q) { struct scsi_cmnd *scmd, *next; + struct scsi_device *sdev; list_for_each_entry_safe(scmd, next, work_q, eh_entry) { sdev_printk(KERN_INFO, scmd->device, "Device offlined - " "not ready after error recovery\n"); - scsi_device_set_state(scmd->device, SDEV_OFFLINE); + sdev = scmd->device; + + mutex_lock(&sdev->state_mutex); + scsi_device_set_state(sdev, SDEV_OFFLINE); + mutex_unlock(&sdev->state_mutex); + if (scmd->eh_eflags & SCSI_EH_CANCEL_CMD) { /* * FIXME: Handle lost cmds. diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 79bb05fa09d5..d2854558437d 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2875,7 +2875,12 @@ static void scsi_wait_for_queuecommand(struct scsi_device *sdev) int scsi_device_quiesce(struct scsi_device *sdev) { - int err = scsi_device_set_state(sdev, SDEV_QUIESCE); + int err; + + mutex_lock(&sdev->state_mutex); + err = scsi_device_set_state(sdev, SDEV_QUIESCE); + mutex_unlock(&sdev->state_mutex); + if (err) return err; @@ -2903,10 +2908,11 @@ void scsi_device_resume(struct scsi_device *sdev) * so assume the state is being managed elsewhere (for example * device deleted during suspend) */ - if (sdev->sdev_state != SDEV_QUIESCE || - scsi_device_set_state(sdev, SDEV_RUNNING)) - return; - scsi_run_queue(sdev->request_queue); + mutex_lock(&sdev->state_mutex); + if (sdev->sdev_state == SDEV_QUIESCE && + scsi_device_set_state(sdev, SDEV_RUNNING) == 0) + scsi_run_queue(sdev->request_queue); + mutex_unlock(&sdev->state_mutex); } EXPORT_SYMBOL(scsi_device_resume); @@ -3005,6 +3011,7 @@ static int scsi_internal_device_block(struct scsi_device *sdev) struct request_queue *q = sdev->request_queue; int err; + mutex_lock(&sdev->state_mutex); err = scsi_internal_device_block_nowait(sdev); if (err == 0) { if (q->mq_ops) @@ -3012,6 +3019,8 @@ static int scsi_internal_device_block(struct scsi_device *sdev) else scsi_wait_for_queuecommand(sdev); } + mutex_unlock(&sdev->state_mutex); + return err; } @@ -3082,7 +3091,13 @@ EXPORT_SYMBOL_GPL(scsi_internal_device_unblock_nowait); static int scsi_internal_device_unblock(struct scsi_device *sdev, enum scsi_device_state new_state) { - return scsi_internal_device_unblock_nowait(sdev, new_state); + int ret; + + mutex_lock(&sdev->state_mutex); + ret = scsi_internal_device_unblock_nowait(sdev, new_state); + mutex_unlock(&sdev->state_mutex); + + return ret; } static void diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 6f7128f49c30..e6de4eee97a3 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -231,6 +231,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget, sdev->id = starget->id; sdev->lun = lun; sdev->channel = starget->channel; + mutex_init(&sdev->state_mutex); sdev->sdev_state = SDEV_CREATED; INIT_LIST_HEAD(&sdev->siblings); INIT_LIST_HEAD(&sdev->same_target_siblings); @@ -943,16 +944,17 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result, /* set the device running here so that slave configure * may do I/O */ + mutex_lock(&sdev->state_mutex); ret = scsi_device_set_state(sdev, SDEV_RUNNING); - if (ret) { + if (ret) ret = scsi_device_set_state(sdev, SDEV_BLOCK); + mutex_unlock(&sdev->state_mut
[PATCH] scsi_lib: Add #include
This patch avoids that when building with W=1 the compiler complains that __scsi_init_queue() has not been declared. See also commit d48777a633d6 ("scsi: remove __scsi_alloc_queue"). Signed-off-by: Bart Van Assche Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Johannes Thumshirn --- drivers/scsi/scsi_lib.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index d698364df072..b7b340c494ab 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -30,6 +30,7 @@ #include #include #include +#include /* __scsi_init_queue() */ #include #include -- 2.12.2
[PATCH] Avoid that scsi_exit_rq() triggers a use-after-free
This patch fixes the following KASAN complaint: == BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at addr 8802b7fedf00 Read of size 1 by task rcuos/5/53 CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0 ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0x63/0x8f kasan_object_err+0x21/0x70 kasan_report.part.1+0x231/0x500 __asan_report_load1_noabort+0x2e/0x30 scsi_exit_rq+0xf3/0x120 free_request_size+0x44/0x60 mempool_destroy.part.6+0x9b/0x150 mempool_destroy+0x13/0x20 blk_exit_rl+0x36/0x40 blkg_free+0x146/0x200 __blkg_release_rcu+0x121/0x220 rcu_nocb_kthread+0x61f/0xca0 kthread+0x298/0x390 ret_from_fork+0x2c/0x40 Object at 8802b7fedd80, in cache kmalloc-2048 size: 2048 Allocated: PID = 3992 save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 kasan_kmalloc+0xad/0xe0 __kmalloc+0x134/0x220 scsi_host_alloc+0x6b/0x11c0 0xc101d94a driver_probe_device+0x49e/0xc60 __device_attach_driver+0x1d3/0x2a0 bus_for_each_drv+0x11a/0x1d0 __device_attach+0x1e1/0x2c0 device_initial_probe+0x13/0x20 bus_probe_device+0x19b/0x240 device_add+0x86d/0x1450 device_register+0x1a/0x20 0xc10270ce 0xc1048a62 do_one_initcall+0xa7/0x250 do_init_module+0x1d0/0x55d load_module+0x7c9f/0x9850 SYSC_finit_module+0x189/0x1c0 SyS_finit_module+0xe/0x10 entry_SYSCALL_64_fastpath+0x1a/0xa9 Freed: PID = 4128 save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 kasan_slab_free+0x71/0xb0 kfree+0x8d/0x1b0 scsi_host_dev_release+0x2cb/0x430 device_release+0x76/0x1e0 kobject_release+0x107/0x370 kobject_put+0x56/0xb0 put_device+0x17/0x20 scsi_host_put+0x15/0x20 0xc101fcd7 device_release_driver_internal+0x26a/0x4e0 device_release_driver+0x12/0x20 bus_remove_device+0x2d0/0x590 device_del+0x55b/0x920 device_unregister+0x1a/0xa0 0xc101e0ca 0xc102fccc SyS_delete_module+0x334/0x3e0 entry_SYSCALL_64_fastpath+0x1a/0xa9 Memory state around the buggy address: 8802b7fede00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 8802b7fede80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >8802b7fedf00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ 8802b7fedf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 8802b7fee000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb == Reported-by: Scott Bauer Fixes: e9c787e65c0c ("scsi: allocate scsi_cmnd structures as part of struct request") Signed-off-by: Bart Van Assche Cc: Scott Bauer Cc: Christoph Hellwig Cc: Jan Kara Cc: Hannes Reinecke Cc: --- drivers/scsi/scsi_lib.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 15c9fe766071..d698364df072 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2095,11 +2095,14 @@ static int scsi_init_rq(struct request_queue *q, struct request *rq, gfp_t gfp) struct Scsi_Host *shost = q->rq_alloc_data; struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq); + if (!scsi_host_get(shost)) + goto fail; + memset(cmd, 0, sizeof(*cmd)); cmd->sense_buffer = scsi_alloc_sense_buffer(shost, gfp, NUMA_NO_NODE); if (!cmd->sense_buffer) - goto fail; + goto put; cmd->req.sense = cmd->sense_buffer; if (scsi_host_get_prot(shost) >= SHOST_DIX_TYPE0_PROTECTION) { @@ -2112,6 +2115,8 @@ static int scsi_init_rq(struct request_queue *q, struct request *rq, gfp_t gfp) fail_free_sense: scsi_free_sense_buffer(shost, cmd->sense_buffer); +put: + scsi_host_put(shost); fail: return -ENOMEM; } @@ -2124,6 +2129,7 @@ static void scsi_exit_rq(struct request_queue *q, struct request *rq) if (cmd->prot_sdb) kmem_cache_free(scsi_sdb_cache, cmd->prot_sdb); scsi_free_sense_buffer(shost, cmd->sense_buffer); + scsi_host_put(shost); } struct request_queue *scsi_alloc_queue(struct scsi_device *sdev) -- 2.12.2
Re: BUG: KASAN: use-after-free in scsi_exit_rq
On Tue, 2017-05-02 at 16:41 +0200, Jan Kara wrote: > So I'm also not aware of any particular breakage this would cause. However > logically the freeing of request mempools really belongs to > blk_release_queue() so it seems a bit dumb to move blk_exit_rl() just > because SCSI stores the fact from which slab cache it has allocated the > sense buffer in a structure (shost) that it frees under its hands by the > time blk_release_queue() is called. :-| Hello Jan, My concern when I wrote my previous e-mail was that I didn't want to add a scsi_host_get() / scsi_host_put() pair to the hot path in the SCSI core. But I just realized that scsi_init_rq() and scsi_exit_rq() are not in the hot path so adding a scsi_host_get() / scsi_host_put() pair should work fine. I will post a patch. Bart.
Re: BUG: KASAN: use-after-free in scsi_exit_rq
On Fri 28-04-17 17:46:47, Tejun Heo wrote: > On Fri, Apr 21, 2017 at 09:49:17PM +, Bart Van Assche wrote: > > On Thu, 2017-04-20 at 15:18 -0600, Scott Bauer wrote: > > > [ 642.638860] BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at > > > addr 8802b7fedf00 > > > [ 642.639362] Read of size 1 by task rcuos/5/53 > > > [ 642.639713] CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13 > > > [ 642.640170] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > > > BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org > > > 04/01/2014 > > > [ 642.640923] Call Trace: > > > [ 642.641080] dump_stack+0x63/0x8f > > > [ 642.641289] kasan_object_err+0x21/0x70 > > > [ 642.641531] kasan_report.part.1+0x231/0x500 > > > [ 642.641823] ? scsi_exit_rq+0xf3/0x120 > > > [ 642.642054] ? _raw_spin_unlock_irqrestore+0xe/0x10 > > > [ 642.642353] ? free_percpu+0x1b7/0x340 > > > [ 642.642586] ? put_task_stack+0x117/0x2b0 > > > [ 642.642837] __asan_report_load1_noabort+0x2e/0x30 > > > [ 642.643138] scsi_exit_rq+0xf3/0x120 > > > [ 642.643366] free_request_size+0x44/0x60 > > > [ 642.643614] mempool_destroy.part.6+0x9b/0x150 > > > [ 642.643899] ? kasan_slab_free+0x87/0xb0 > > > [ 642.644152] mempool_destroy+0x13/0x20 > > > [ 642.644394] blk_exit_rl+0x36/0x40 > > > [ 642.644614] blkg_free+0x146/0x200 > > > [ 642.644836] __blkg_release_rcu+0x121/0x220 > > > [ 642.645112] rcu_nocb_kthread+0x61f/0xca0 > > > [ 642.645376] ? get_state_synchronize_rcu+0x20/0x20 > > > [ 642.645690] ? pci_mmcfg_check_reserved+0x110/0x110 > > > [ 642.646011] kthread+0x298/0x390 > > > [ 642.646224] ? get_state_synchronize_rcu+0x20/0x20 > > > [ 642.646535] ? kthread_park+0x160/0x160 > > > [ 642.646787] ret_from_fork+0x2c/0x40 > > > > I'm not familiar with cgroups but seeing this makes me wonder whether it > > would > > be possible to move the blk_exit_rl() calls from blk_release_queue() into > > blk_cleanup_queue()? The SCSI core frees a SCSI host after > > blk_cleanup_queue() > > has finished for all associated SCSI devices. This is why I think that > > calling > > blk_exit_rl() earlier would be sufficient to avoid that scsi_exit_rq() > > dereferences a SCSI host pointer after it has been freed. > > Hmm... I see. Didn't know request put path involved derefing those > structs. The blk_exit_rl() call just replaced open coded > mempool_destroy call, so the destruction order was always like this. > We shouldn't have any request in flight by cleanup, so moving it there > should be fine. So I'm also not aware of any particular breakage this would cause. However logically the freeing of request mempools really belongs to blk_release_queue() so it seems a bit dumb to move blk_exit_rl() just because SCSI stores the fact from which slab cache it has allocated the sense buffer in a structure (shost) that it frees under its hands by the time blk_release_queue() is called. :-| Honza -- Jan Kara SUSE Labs, CR
I NEED YOUR URGENT HELP AND CORPERATION
Dear Friend, I am contacting you on a business deal of $9,500,000.00 Million United States Dollars, ready for transfer into your own personal account and if we make this claim, we will share it on the ratio of 50% / 50% basis, I would like to assure you that it be 100% risk free and it will be legally backed up with government approval. Once you are interested to transact this business with me, kindly give me your consent response immediately. Hoping to hear from you. My regards, Mr Ibrahim Kabore EMAIL,kabreibrahim...@gmail.com
Re: [PATCH, RFC] MAINTAINERS: update OSD entries
On Tue, 2017-05-02 at 09:57 +0200, Christoph Hellwig wrote: > The open-osd domain doesn't exist anymore, and mails to the list lead > to really annoying bounced that repeat every day. > > Also the primarydata address for Benny bounces, and while I have a new > one for him he doesn't seem to be maintaining the OSD code any more. > > Which beggs the question: should we really leave the Supported status > in MAINTAINERS given that the code is barely maintained? > > Signed-off-by: Christoph Hellwig > --- > MAINTAINERS | 4 > 1 file changed, 4 deletions(-) > > diff --git a/MAINTAINERS b/MAINTAINERS > index 1bb06c5f7716..28dd83a1d9e2 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -9418,10 +9418,6 @@ F: drivers/net/wireless/intersil/orinoco/ > > OSD LIBRARY and FILESYSTEM > M: Boaz Harrosh > -M: Benny Halevy > -L: osd-...@open-osd.org > -W: http://open-osd.org > -T: git git://git.open-osd.org/open-osd.git > S: Maintained > F: drivers/scsi/osd/ > F: include/scsi/osd_* Hah, you beat me to it! I was going to spin up a patch for this today. Acked-by: Jeff Layton
答复: [PATCH v6 0/2] tcmu: Dynamic growing data area support
> > Xiubo Li (2): > > tcmu: Add dynamic growing data area feature support > > tcmu: Add global data block pool support > > > > drivers/target/target_core_user.c | 598 > > ++ > > 1 file changed, 469 insertions(+), 129 deletions(-) > > > > So based upon the feedback from MNC, this looks OK to merge. > > It looks like there is one bug as reported by MNC introduced by patch > #1. > It's actually one enhancement. > Please go ahead and post an incremental patch to address this at your > earliest convenience. > And I will address this later. Thank, BRs Xiubo
[PATCH, RFC] MAINTAINERS: update OSD entries
The open-osd domain doesn't exist anymore, and mails to the list lead to really annoying bounced that repeat every day. Also the primarydata address for Benny bounces, and while I have a new one for him he doesn't seem to be maintaining the OSD code any more. Which beggs the question: should we really leave the Supported status in MAINTAINERS given that the code is barely maintained? Signed-off-by: Christoph Hellwig --- MAINTAINERS | 4 1 file changed, 4 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 1bb06c5f7716..28dd83a1d9e2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9418,10 +9418,6 @@ F: drivers/net/wireless/intersil/orinoco/ OSD LIBRARY and FILESYSTEM M: Boaz Harrosh -M: Benny Halevy -L: osd-...@open-osd.org -W: http://open-osd.org -T: git git://git.open-osd.org/open-osd.git S: Maintained F: drivers/scsi/osd/ F: include/scsi/osd_* -- 2.11.0
[PATCH] tcmu: Recalculate the tcmu_cmd size to save cmd area memories
From: Xiubo Li For the "struct tcmu_cmd_entry" in cmd area, the minimum size will be sizeof(struct tcmu_cmd_entry) == 112 Bytes. And it could fill about (sizeof(struct rsp) - sizeof(struct req)) / sizeof(struct iovec) == 68 / 16 ~= 4 data regions(iov[4]) by default. For most tcmu_cmds, the data block indexes allocated from the data area will be continuous. And for the continuous blocks they will be merged into the same region using only one iovec. For the current code, it will always allocates the same number of iovecs with blocks for each tcmu_cmd, and it will wastes much memories. For example, when the block size is 4K and the DATA_OUT buffer size is 64K, and the regions needed is less than 5(on my environment is almost 99.7%). The current code will allocate about 16 iovecs, and there will be (16 - 4) * sizeof(struct iovec) = 192 Bytes cmd area memories wasted. Here adds two helpers to calculate the base size and full size of the tcmu_cmd. And will recalculate them again when it make sure how many iovs is needed before insert it to cmd area. Signed-off-by: Xiubo Li --- drivers/target/target_core_user.c | 52 ++- 1 file changed, 41 insertions(+), 11 deletions(-) diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c index 0b29e4f..89b75ce 100644 --- a/drivers/target/target_core_user.c +++ b/drivers/target/target_core_user.c @@ -602,6 +602,27 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd, return true; } +static inline size_t tcmu_cmd_get_base_cmd_size(size_t iov_cnt) +{ + return max(offsetof(struct tcmu_cmd_entry, req.iov[iov_cnt]), + sizeof(struct tcmu_cmd_entry)); +} + +static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd *tcmu_cmd, + size_t base_command_size) +{ + struct se_cmd *se_cmd = tcmu_cmd->se_cmd; + size_t command_size; + + command_size = base_command_size + + round_up(scsi_command_size(se_cmd->t_task_cdb), + TCMU_OP_ALIGN_SIZE); + + WARN_ON(command_size & (TCMU_OP_ALIGN_SIZE-1)); + + return command_size; +} + static sense_reason_t tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd) { @@ -624,16 +645,16 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd, * Must be a certain minimum size for response sense info, but * also may be larger if the iov array is large. * -* We prepare way too many iovs for potential uses here, because it's -* expensive to tell how many regions are freed in the bitmap - */ - base_command_size = max(offsetof(struct tcmu_cmd_entry, - req.iov[tcmu_cmd_get_block_cnt(tcmu_cmd)]), - sizeof(struct tcmu_cmd_entry)); - command_size = base_command_size - + round_up(scsi_command_size(se_cmd->t_task_cdb), TCMU_OP_ALIGN_SIZE); - - WARN_ON(command_size & (TCMU_OP_ALIGN_SIZE-1)); +* We prepare as many iovs as possbile for potential uses here, +* because it's expensive to tell how many regions are freed in +* the bitmap & global data pool, as the size calculated here +* will only be used to do the checks. +* +* The size will be recalculated later as actually needed to save +* cmd area memories. +*/ + base_command_size = tcmu_cmd_get_base_cmd_size(tcmu_cmd->dbi_cnt); + command_size = tcmu_cmd_get_cmd_size(tcmu_cmd, base_command_size); mutex_lock(&udev->cmdr_lock); @@ -694,7 +715,6 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd, entry = (void *) mb + CMDR_OFF + cmd_head; tcmu_flush_dcache_range(entry, sizeof(*entry)); tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_CMD); - tcmu_hdr_set_len(&entry->hdr.len_op, command_size); entry->hdr.cmd_id = tcmu_cmd->cmd_id; entry->hdr.kflags = 0; entry->hdr.uflags = 0; @@ -736,6 +756,16 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd, entry->req.iov_bidi_cnt = iov_cnt; } + /* +* Recalaulate the command's base size and size according +* to the actual needs +*/ + base_command_size = tcmu_cmd_get_base_cmd_size(entry->req.iov_cnt + + entry->req.iov_bidi_cnt); + command_size = tcmu_cmd_get_cmd_size(tcmu_cmd, base_command_size); + + tcmu_hdr_set_len(&entry->hdr.len_op, command_size); + /* All offsets relative to mb_addr, not start of entry! */ cdb_off = CMDR_OFF + cmd_head + base_command_size; memcpy((void *) mb + cdb_off, se_cmd->t_task_cdb, scsi_command_size(se_cmd->t_task_cdb)); -- 1.8.3.1
Re: [PATCH 25/27] block: remove the discard_zeroes_data flag
On Tue, May 02, 2017 at 12:16:13AM -0700, Nicholas A. Bellinger wrote: > Or, another options is use bdev_write_zeroes_sectors() to determine when > dev_attrib->unmap_zeroes_data should be set. Yes, that in combination with your patch to use bdev_write_zeroes_sectors for zeroing from write same seems like the right fix.
Re: [PATCH 25/27] block: remove the discard_zeroes_data flag
On Mon, 2017-05-01 at 23:43 -0700, Nicholas A. Bellinger wrote: > On Mon, 2017-05-01 at 20:45 +, Bart Van Assche wrote: > > On Wed, 2017-04-05 at 19:21 +0200, Christoph Hellwig wrote: > > > Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can > > > kill this hack. > > > > > > Signed-off-by: Christoph Hellwig > > > Reviewed-by: Martin K. Petersen > > > Reviewed-by: Hannes Reinecke > > > [ ... ] > > > diff --git a/drivers/target/target_core_device.c > > > b/drivers/target/target_core_device.c > > > index c754ae33bf7b..d2f089cfa9ae 100644 > > > --- a/drivers/target/target_core_device.c > > > +++ b/drivers/target/target_core_device.c > > > @@ -851,7 +851,7 @@ bool target_configure_unmap_from_queue(struct > > > se_dev_attrib *attrib, > > > attrib->unmap_granularity = q->limits.discard_granularity / block_size; > > > attrib->unmap_granularity_alignment = q->limits.discard_alignment / > > > block_size; > > > - attrib->unmap_zeroes_data = q->limits.discard_zeroes_data; > > > + attrib->unmap_zeroes_data = 0; > > > return true; > > > } > > > EXPORT_SYMBOL(target_configure_unmap_from_queue); > > > > Hello Christoph, > > > > Sorry that I hadn't noticed this before but I think that this patch > > introduces a significant performance regressions for LIO users. Before > > this patch the LBPRZ flag was reported correctly to initiator systems > > through the thin provisioning VPD. With this patch applied that flag > > will always be reported as zero, forcing initiators to submit WRITE > > commands with zeroed data buffers instead of submitting the SCSI UNMAP > > command to block devices for which discard_zeroes_data was set. From > > target_core_spc.c: > > > > /* Thin Provisioning VPD */ > > static sense_reason_t spc_emulate_evpd_b2(struct se_cmd *cmd, unsigned char > > *buf) > > { > > [ ... ] > > /* > > * The unmap_zeroes_data set means that the underlying device supports > > * REQ_DISCARD and has the discard_zeroes_data bit set. This satisfies > > * the SBC requirements for LBPRZ, meaning that a subsequent read > > * will return zeroes after an UNMAP or WRITE SAME (16) to an LBA > > * See sbc4r36 6.6.4. > > */ > > if (((dev->dev_attrib.emulate_tpu != 0) || > > (dev->dev_attrib.emulate_tpws != 0)) && > > (dev->dev_attrib.unmap_zeroes_data != 0)) > > buf[5] |= 0x04; > > [ ... ] > > } > > > > According to sd_config_discard(), it's SD_LBP_WS16, SD_LBP_WS10 and > SD_LBP_ZERO that where ever setting unmap_zeros_data = 1 to begin with. > For UNMAP, q->limits.discard_zeroes_data was never set. > > That said, it's pretty much implied that supporting DISCARD means > subsequent READs return zeros, so target_configure_unmap_from_queue() > should be setting attrib->unmap_zeroes_data = 1, or just dropping it > all-together. > > Post -rc1, I'll push a patch to do the latter. > Or, another options is use bdev_write_zeroes_sectors() to determine when dev_attrib->unmap_zeroes_data should be set.