Re: [PATCH 1/2] ibmvfc: fix potential race in ibmvfc_wait_for_ops
Reviewed-by: Brian King

--
Brian King
Power Linux I/O
IBM Linux Technology Center
[PATCH 1/2] ibmvfc: fix potential race in ibmvfc_wait_for_ops
For various EH activities the ibmvfc driver uses ibmvfc_wait_for_ops() to
wait for the completion of commands that match a given criteria, be it
cancel key or specific LUN. With recent changes, commands are completed
outside the lock in bulk by removing them from the sent list and adding
them to a private completion list. This introduces a potential race in
ibmvfc_wait_for_ops(), since the criterion for a command to be outstanding
is no longer simply being on the sent list, but instead not being on the
free list. Avoid this race by scanning the entire command event pool and
checking that any matching command that ibmvfc needs to wait on is not
already on the free list.

Fixes: 1f4a4a19508d ("scsi: ibmvfc: Complete commands outside the host/queue lock")
Signed-off-by: Tyrel Datwyler
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 42 +++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index f140eafc0d6a..5414b465a92f 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -2371,6 +2371,24 @@ static int ibmvfc_match_lun(struct ibmvfc_event *evt, void *device)
 	return 0;
 }
 
+/**
+ * ibmvfc_event_is_free - Check if event is free or not
+ * @evt:	ibmvfc event struct
+ *
+ * Returns:
+ *	true / false
+ **/
+static bool ibmvfc_event_is_free(struct ibmvfc_event *evt)
+{
+	struct ibmvfc_event *loop_evt;
+
+	list_for_each_entry(loop_evt, &evt->queue->free, queue_list)
+		if (loop_evt == evt)
+			return true;
+
+	return false;
+}
+
 /**
  * ibmvfc_wait_for_ops - Wait for ops to complete
  * @vhost:	ibmvfc host struct
@@ -2385,7 +2403,7 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device,
 {
 	struct ibmvfc_event *evt;
 	DECLARE_COMPLETION_ONSTACK(comp);
-	int wait;
+	int wait, i;
 	unsigned long flags;
 	signed long timeout = IBMVFC_ABORT_WAIT_TIMEOUT * HZ;
 
@@ -2393,10 +2411,13 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device,
 	do {
 		wait = 0;
 		spin_lock_irqsave(&vhost->crq.l_lock, flags);
-		list_for_each_entry(evt, &vhost->crq.sent, queue_list) {
-			if (match(evt, device)) {
-				evt->eh_comp = &comp;
-				wait++;
+		for (i = 0; i < vhost->crq.evt_pool.size; i++) {
+			evt = &vhost->crq.evt_pool.events[i];
+			if (!ibmvfc_event_is_free(evt)) {
+				if (match(evt, device)) {
+					evt->eh_comp = &comp;
+					wait++;
+				}
 			}
 		}
 		spin_unlock_irqrestore(&vhost->crq.l_lock, flags);
@@ -2407,10 +2428,13 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device,
 			if (!timeout) {
 				wait = 0;
 				spin_lock_irqsave(&vhost->crq.l_lock, flags);
-				list_for_each_entry(evt, &vhost->crq.sent, queue_list) {
-					if (match(evt, device)) {
-						evt->eh_comp = NULL;
-						wait++;
+				for (i = 0; i < vhost->crq.evt_pool.size; i++) {
+					evt = &vhost->crq.evt_pool.events[i];
+					if (!ibmvfc_event_is_free(evt)) {
+						if (match(evt, device)) {
+							evt->eh_comp = NULL;
+							wait++;
+						}
 					}
 				}
 				spin_unlock_irqrestore(&vhost->crq.l_lock, flags);
--
2.27.0
[PATCH 5.11 487/775] pwm: rockchip: Eliminate potential race condition when probing
From: Simon South

[ Upstream commit d21ba5d6217bd5a6a696678385906ed1994b380b ]

Commit 48cf973cae33 ("pwm: rockchip: Avoid glitches on already running
PWMs") introduced a potential race condition in rockchip_pwm_probe(): a
consumer could enable an inactive PWM, or disable a running one, between
rockchip_pwm_probe() registering the device via pwmchip_add() and checking
whether it is enabled (to determine whether it was started by a
bootloader).

This could result in a device's PWM clock being either enabled once more
than necessary, potentially causing it to continue running when no longer
needed, or disabled once more than necessary, producing a warning from
the kernel.

Eliminate these possibilities by modifying rockchip_pwm_probe() so it
checks whether a device is enabled before registering it rather than
after.

Fixes: 48cf973cae33 ("pwm: rockchip: Avoid glitches on already running PWMs")
Reported-by: Trent Piepho
Signed-off-by: Simon South
Signed-off-by: Thierry Reding
Signed-off-by: Sasha Levin
---
 drivers/pwm/pwm-rockchip.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/pwm/pwm-rockchip.c b/drivers/pwm/pwm-rockchip.c
index 90f969f9f5e2b..f3a5641f6bca5 100644
--- a/drivers/pwm/pwm-rockchip.c
+++ b/drivers/pwm/pwm-rockchip.c
@@ -288,6 +288,7 @@ static int rockchip_pwm_probe(struct platform_device *pdev)
 	const struct of_device_id *id;
 	struct rockchip_pwm_chip *pc;
 	u32 enable_conf, ctrl;
+	bool enabled;
 	int ret, count;
 
 	id = of_match_device(rockchip_pwm_dt_ids, &pdev->dev);
@@ -349,6 +350,10 @@ static int rockchip_pwm_probe(struct platform_device *pdev)
 		pc->chip.of_pwm_n_cells = 3;
 	}
 
+	enable_conf = pc->data->enable_conf;
+	ctrl = readl_relaxed(pc->base + pc->data->regs.ctrl);
+	enabled = (ctrl & enable_conf) == enable_conf;
+
 	ret = pwmchip_add(&pc->chip);
 	if (ret < 0) {
 		dev_err(&pdev->dev, "pwmchip_add() failed: %d\n", ret);
@@ -356,9 +361,7 @@ static int rockchip_pwm_probe(struct platform_device *pdev)
 	}
 
 	/* Keep the PWM clk enabled if the PWM appears to be up and running. */
-	enable_conf = pc->data->enable_conf;
-	ctrl = readl_relaxed(pc->base + pc->data->regs.ctrl);
-	if ((ctrl & enable_conf) != enable_conf)
+	if (!enabled)
 		clk_disable(pc->clk);
 
 	clk_disable(pc->pclk);
--
2.27.0
[PATCH 5.10 403/663] pwm: rockchip: Eliminate potential race condition when probing
[PATCH 5.10 241/663] nvmet-tcp: fix potential race of tcp socket closing accept_work
From: Sagi Grimberg

[ Upstream commit 0fbcfb089a3f2f2a731d01f0aec8f7697a849c28 ]

When we accept a TCP connection and allocate an nvmet-tcp queue we should
make sure not to fully establish it or reference it, as the connection may
already be closing, which triggers queue release work, which does not
fence against queue establishment.

In order to address such a race, we make sure to check the sk_state and
contain the queue reference to be done underneath the sk_callback_lock
such that the queue release work correctly fences against it.

Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver")
Reported-by: Elad Grupi
Signed-off-by: Sagi Grimberg
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin
---
 drivers/nvme/target/tcp.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 577ce7d403ae9..8b0485ada315b 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1485,17 +1485,27 @@ static int nvmet_tcp_set_queue_sock(struct nvmet_tcp_queue *queue)
 	if (inet->rcv_tos > 0)
 		ip_sock_set_tos(sock->sk, inet->rcv_tos);
 
+	ret = 0;
 	write_lock_bh(&sock->sk->sk_callback_lock);
-	sock->sk->sk_user_data = queue;
-	queue->data_ready = sock->sk->sk_data_ready;
-	sock->sk->sk_data_ready = nvmet_tcp_data_ready;
-	queue->state_change = sock->sk->sk_state_change;
-	sock->sk->sk_state_change = nvmet_tcp_state_change;
-	queue->write_space = sock->sk->sk_write_space;
-	sock->sk->sk_write_space = nvmet_tcp_write_space;
+	if (sock->sk->sk_state != TCP_ESTABLISHED) {
+		/*
+		 * If the socket is already closing, don't even start
+		 * consuming it
+		 */
+		ret = -ENOTCONN;
+	} else {
+		sock->sk->sk_user_data = queue;
+		queue->data_ready = sock->sk->sk_data_ready;
+		sock->sk->sk_data_ready = nvmet_tcp_data_ready;
+		queue->state_change = sock->sk->sk_state_change;
+		sock->sk->sk_state_change = nvmet_tcp_state_change;
+		queue->write_space = sock->sk->sk_write_space;
+		sock->sk->sk_write_space = nvmet_tcp_write_space;
+		queue_work_on(queue_cpu(queue), nvmet_tcp_wq, &queue->io_work);
+	}
 	write_unlock_bh(&sock->sk->sk_callback_lock);
 
-	return 0;
+	return ret;
 }
 
 static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
@@ -1543,8 +1553,6 @@ static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
 	if (ret)
 		goto out_destroy_sq;
 
-	queue_work_on(queue_cpu(queue), nvmet_tcp_wq, &queue->io_work);
-
 	return 0;
 out_destroy_sq:
 	mutex_lock(&nvmet_tcp_queue_mutex);
--
2.27.0
[PATCH 5.11 286/775] nvmet-tcp: fix potential race of tcp socket closing accept_work
[PATCH 5.9 093/133] scsi: ibmvscsi: Fix potential race after loss of transport
From: Tyrel Datwyler

[ Upstream commit 665e0224a3d76f36da40bd9012270fa629aa42ed ]

After a loss of transport due to an adapter migration or crash/disconnect
from the host partner there is a tiny window where we can race adjusting
the request_limit of the adapter. The request limit is atomically
increased/decreased to track the number of inflight requests against the
allowed limit of our VIOS partner.

After a transport loss we set the request_limit to zero to reflect this
state. However, there is a window where the adapter may attempt to queue
a command because the transport loss event hasn't been fully processed
yet and request_limit is still greater than zero. The hypercall to send
the event will fail and the error path will increment the request_limit
as a result. If the adapter processes the transport event prior to this
increment, the request_limit becomes out of sync with the adapter state
and can result in SCSI commands being submitted on the now reset
connection prior to an SRP Login, resulting in a protocol violation.

Fix this race by protecting request_limit with the host lock when changing
the value via atomic_set() to indicate no transport.

Link: https://lore.kernel.org/r/20201025001355.4527-1-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 36 ++++++++++++++++++++++++----------
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index b1f3017b6547a..29fcc44be2d57 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code)
 	spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 }
 
+/**
+ * ibmvscsi_set_request_limit - Set the adapter request_limit in response to
+ * an adapter failure, reset, or SRP Login. Done under host lock to prevent
+ * race with SCSI command submission.
+ * @hostdata:	adapter to adjust
+ * @limit:	new request limit
+ */
+static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, int limit)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(hostdata->host->host_lock, flags);
+	atomic_set(&hostdata->request_limit, limit);
+	spin_unlock_irqrestore(hostdata->host->host_lock, flags);
+}
+
 /**
  * ibmvscsi_reset_host - Reset the connection to the server
  * @hostdata:	struct ibmvscsi_host_data to reset
@@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code)
 static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata)
 {
 	scsi_block_requests(hostdata->host);
-	atomic_set(&hostdata->request_limit, 0);
+	ibmvscsi_set_request_limit(hostdata, 0);
 
 	purge_requests(hostdata, DID_ERROR);
 	hostdata->action = IBMVSCSI_HOST_ACTION_RESET;
@@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct *evt_struct)
 		dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n",
			 evt_struct->xfer_iu->srp.login_rej.reason);
 		/* Login failed.  */
-		atomic_set(&hostdata->request_limit, -1);
+		ibmvscsi_set_request_limit(hostdata, -1);
 		return;
	default:
 		dev_err(hostdata->dev, "Invalid login response typecode 0x%02x!\n",
			evt_struct->xfer_iu->srp.login_rsp.opcode);
 		/* Login failed.  */
-		atomic_set(&hostdata->request_limit, -1);
+		ibmvscsi_set_request_limit(hostdata, -1);
 		return;
 	}
 
@@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct)
 	 * This value is set rather than added to request_limit because
 	 * request_limit could have been set to -1 by this client.
 	 */
-	atomic_set(&hostdata->request_limit,
+	ibmvscsi_set_request_limit(hostdata,
		   be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta));
 
 	/* If we had any pending I/Os, kick them */
@@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data *hostdata)
 	login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
					 SRP_BUF_FORMAT_INDIRECT);
 
-	spin_lock_irqsave(hostdata->host->host_lock, flags);
 	/* Start out with a request limit of 0, since this is negotiated in
 	 * the login request we are just sending and login requests always
 	 * get sent by the driver regardless of request_limit.
 	 */
-	atomic_set(&hostdata->request_limit, 0);
+	ibmvscsi_set_request_limit(hostdata, 0);
 
+	spin_lock_irqsave(hostdata->host->host_lock, flags);
 	rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2);
 	spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 	dev_info(hostdata->dev, "sent SRP login\n");
@@
[PATCH 5.4 53/85] scsi: ibmvscsi: Fix potential race after loss of transport
[PATCH AUTOSEL 5.9 24/35] scsi: ibmvscsi: Fix potential race after loss of transport
[PATCH AUTOSEL 5.4 15/24] scsi: ibmvscsi: Fix potential race after loss of transport
[PATCH AUTOSEL 5.8 20/29] scsi: ibmvscsi: Fix potential race after loss of transport
From: Tyrel Datwyler [ Upstream commit 665e0224a3d76f36da40bd9012270fa629aa42ed ] After a loss of transport due to an adapter migration or crash/disconnect from the host partner there is a tiny window where we can race adjusting the request_limit of the adapter. The request limit is atomically increased/decreased to track the number of inflight requests against the allowed limit of our VIOS partner. After a transport loss we set the request_limit to zero to reflect this state. However, there is a window where the adapter may attempt to queue a command because the transport loss event hasn't been fully processed yet and request_limit is still greater than zero. The hypercall to send the event will fail and the error path will increment the request_limit as a result. If the adapter processes the transport event prior to this increment the request_limit becomes out of sync with the adapter state and can result in SCSI commands being submitted on the now reset connection prior to an SRP Login resulting in a protocol violation. Fix this race by protecting request_limit with the host lock when changing the value via atomic_set() to indicate no transport. Link: https://lore.kernel.org/r/20201025001355.4527-1-tyr...@linux.ibm.com Signed-off-by: Tyrel Datwyler Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++- 1 file changed, 26 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c index 14f687e9b1f44..62faeab47d905 100644 --- a/drivers/scsi/ibmvscsi/ibmvscsi.c +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c @@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code) spin_unlock_irqrestore(hostdata->host->host_lock, flags); } +/** + * ibmvscsi_set_request_limit - Set the adapter request_limit in response to + * an adapter failure, reset, or SRP Login. Done under host lock to prevent + * race with SCSI command submission. 
+ * @hostdata: adapter to adjust + * @limit: new request limit + */ +static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, int limit) +{ + unsigned long flags; + + spin_lock_irqsave(hostdata->host->host_lock, flags); + atomic_set(&hostdata->request_limit, limit); + spin_unlock_irqrestore(hostdata->host->host_lock, flags); +} + /** * ibmvscsi_reset_host - Reset the connection to the server * @hostdata: struct ibmvscsi_host_data to reset @@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code) static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata) { scsi_block_requests(hostdata->host); - atomic_set(&hostdata->request_limit, 0); + ibmvscsi_set_request_limit(hostdata, 0); purge_requests(hostdata, DID_ERROR); hostdata->action = IBMVSCSI_HOST_ACTION_RESET; @@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct *evt_struct) dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n", evt_struct->xfer_iu->srp.login_rej.reason); /* Login failed. */ - atomic_set(&hostdata->request_limit, -1); + ibmvscsi_set_request_limit(hostdata, -1); return; default: dev_err(hostdata->dev, "Invalid login response typecode 0x%02x!\n", evt_struct->xfer_iu->srp.login_rsp.opcode); /* Login failed. */ - atomic_set(&hostdata->request_limit, -1); + ibmvscsi_set_request_limit(hostdata, -1); return; } @@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct) * This value is set rather than added to request_limit because * request_limit could have been set to -1 by this client. 
 */ - atomic_set(&hostdata->request_limit, + ibmvscsi_set_request_limit(hostdata, be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta)); /* If we had any pending I/Os, kick them */ @@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data *hostdata) login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT | SRP_BUF_FORMAT_INDIRECT); - spin_lock_irqsave(hostdata->host->host_lock, flags); /* Start out with a request limit of 0, since this is negotiated in * the login request we are just sending and login requests always * get sent by the driver regardless of request_limit. */ - atomic_set(&hostdata->request_limit, 0); + ibmvscsi_set_request_limit(hostdata, 0); + spin_lock_irqsave(hostdata->host->host_lock, flags); rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2); spin_unlock_irqrestore(hostdata->host->host_lock, flags); dev_info(hostdata->dev, "sent SRP login\n"); @@
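The heart of the fix above is that `request_limit` stays an atomic counter, but its transitions to the "no transport" values (0 and -1) now happen under the same host lock that command submission holds. A minimal userspace sketch of that pattern, using a pthread mutex in place of the host lock and C11 atomics in place of the kernel's `atomic_t` (all names here are illustrative, not the driver's real entry points):

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_mutex_t host_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int request_limit;

/* Models ibmvscsi_set_request_limit(): take the host lock around the
 * atomic store so it serializes with the submission path below. */
static void set_request_limit(int limit)
{
    pthread_mutex_lock(&host_lock);
    atomic_store(&request_limit, limit);
    pthread_mutex_unlock(&host_lock);
}

/* Models the queuecommand() side, which runs entirely under the host
 * lock: it observes either the old limit or the new one, never a
 * window where a reset is half-processed. */
static bool try_queue_command(void)
{
    bool queued = false;

    pthread_mutex_lock(&host_lock);
    if (atomic_load(&request_limit) > 0) {
        atomic_fetch_sub(&request_limit, 1); /* claim an inflight slot */
        queued = true;
    }
    pthread_mutex_unlock(&host_lock);
    return queued;
}
```

Once `set_request_limit(0)` has run, no submitter can slip a command past the check, which is exactly the window the patch closes.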
Re: [PATCH] ibmvscsi: fix potential race after loss of transport
Tyrel Datwyler writes: > After a loss of transport due to an adapter migration or crash/disconnect from > the host partner there is a tiny window where we can race adjusting the > request_limit of the adapter. The request limit is atomically inc/dec to track > the number of inflight requests against the allowed limit of our VIOS partner. > After a transport loss we set the request_limit to zero to reflect this state. > However, there is a window where the adapter may attempt to queue a command > because the transport loss event hasn't been fully processed yet and > request_limit is still greater than zero. The hypercall to send the event will > fail and the error path will increment the request_limit as a result. If the > adapter processes the transport event prior to this increment the > request_limit > becomes out of sync with the adapter state and can result in scsi commands > being > submitted on the now reset connection prior to an SRP Login resulting in a > protocol violation. > > Fix this race by protecting request_limit with the host lock when changing the > value via atomic_set() to indicate no transport. > > Signed-off-by: Tyrel Datwyler > --- > drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++- > 1 file changed, 26 insertions(+), 10 deletions(-) > > diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c > b/drivers/scsi/ibmvscsi/ibmvscsi.c > index b1f3017b6547..188ed75417a5 100644 > --- a/drivers/scsi/ibmvscsi/ibmvscsi.c > +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c > @@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data > *hostdata, int error_code) > spin_unlock_irqrestore(hostdata->host->host_lock, flags); > } > > +/** > + * ibmvscsi_set_request_limit - Set the adapter request_limit in response to > + * an adapter failure, reset, or SRP Login. Done under host lock to prevent > + * race with scsi command submission. 
> + * @hostdata: adapter to adjust > + * @limit: new request limit > + */ > +static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, > int limit) > +{ > + unsigned long flags; > + > + spin_lock_irqsave(hostdata->host->host_lock, flags); > + atomic_set(&hostdata->request_limit, limit); > + spin_unlock_irqrestore(hostdata->host->host_lock, flags); > +} > + > /** > * ibmvscsi_reset_host - Reset the connection to the server > * @hostdata: struct ibmvscsi_host_data to reset ... > @@ -2137,12 +2153,12 @@ static void ibmvscsi_do_work(struct > ibmvscsi_host_data *hostdata) > } > > hostdata->action = IBMVSCSI_HOST_ACTION_NONE; > + spin_unlock_irqrestore(hostdata->host->host_lock, flags); You drop the lock ... > if (rc) { > - atomic_set(&hostdata->request_limit, -1); > + ibmvscsi_set_request_limit(hostdata, -1); .. then retake it, then drop it again in ibmvscsi_set_request_limit(). Which introduces the possibility that something else gets the lock before you can set the limit to -1. I'm not sure that's a bug, but it's not obviously correct either? cheers > dev_err(hostdata->dev, "error after %s\n", action); > } > -spin_unlock_irqrestore(hostdata->host->host_lock, flags); > > scsi_unblock_requests(hostdata->host); > }
Re: [PATCH] ibmvscsi: fix potential race after loss of transport
On 10/27/20 10:21 PM, Michael Ellerman wrote: > Tyrel Datwyler writes: >> After a loss of transport due to an adapter migration or crash/disconnect from >> the host partner there is a tiny window where we can race adjusting the >> request_limit of the adapter. The request limit is atomically inc/dec to >> track >> the number of inflight requests against the allowed limit of our VIOS >> partner. >> After a transport loss we set the request_limit to zero to reflect this >> state. >> However, there is a window where the adapter may attempt to queue a command >> because the transport loss event hasn't been fully processed yet and >> request_limit is still greater than zero. The hypercall to send the event >> will >> fail and the error path will increment the request_limit as a result. If the >> adapter processes the transport event prior to this increment the >> request_limit >> becomes out of sync with the adapter state and can result in scsi commands >> being >> submitted on the now reset connection prior to an SRP Login resulting in a >> protocol violation. >> >> Fix this race by protecting request_limit with the host lock when changing >> the >> value via atomic_set() to indicate no transport. >> >> Signed-off-by: Tyrel Datwyler >> --- >> drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++- >> 1 file changed, 26 insertions(+), 10 deletions(-) >> >> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c >> b/drivers/scsi/ibmvscsi/ibmvscsi.c >> index b1f3017b6547..188ed75417a5 100644 >> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c >> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c >> @@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data >> *hostdata, int error_code) >> spin_unlock_irqrestore(hostdata->host->host_lock, flags); >> } >> >> +/** >> + * ibmvscsi_set_request_limit - Set the adapter request_limit in response to >> + * an adapter failure, reset, or SRP Login. Done under host lock to prevent >> + * race with scsi command submission. 
>> + * @hostdata: adapter to adjust >> + * @limit: new request limit >> + */ >> +static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, >> int limit) >> +{ >> +unsigned long flags; >> + >> +spin_lock_irqsave(hostdata->host->host_lock, flags); >> +atomic_set(&hostdata->request_limit, limit); >> +spin_unlock_irqrestore(hostdata->host->host_lock, flags); >> +} >> + >> /** >> * ibmvscsi_reset_host - Reset the connection to the server >> * @hostdata: struct ibmvscsi_host_data to reset > ... >> @@ -2137,12 +2153,12 @@ static void ibmvscsi_do_work(struct >> ibmvscsi_host_data *hostdata) >> } >> >> hostdata->action = IBMVSCSI_HOST_ACTION_NONE; >> +spin_unlock_irqrestore(hostdata->host->host_lock, flags); > > You drop the lock ... > >> if (rc) { >> -atomic_set(&hostdata->request_limit, -1); >> +ibmvscsi_set_request_limit(hostdata, -1); > > .. then retake it, then drop it again in ibmvscsi_set_request_limit(). > > Which introduces the possibility that something else gets the lock > before you can set the limit to -1. > > I'm not sure that's a bug, but it's not obviously correct either? Yeah, I'd already moved the request_limit update into its own function before I got to this case which made me a bit uneasy when I realized I had to drop the lock because my new function takes the lock. However, we only need to protect ourselves from racing with queuecommand() which is locked for its entire call. Further, if we've gotten here it means we were either resetting or re-enabling the adapter which would have already set request_limit to zero. At this point the transport was already gone and we've further failed to reset it. Also, we've blocked any new scsi requests at this point. -Tyrel > > cheers > >> dev_err(hostdata->dev, "error after %s\n", action); >> } >> -spin_unlock_irqrestore(hostdata->host->host_lock, flags); >> >> scsi_unblock_requests(hostdata->host); >> }
[PATCH 5.4 070/408] crypto: picoxcell - Fix potential race condition bug
From: Madhuparna Bhowmik [ Upstream commit 64f4a62e3b17f1e473f971127c2924cae42afc82 ] engine->stat_irq_thresh was initialized after device_create_file() in the probe function, the initialization may race with call to spacc_stat_irq_thresh_store() which updates engine->stat_irq_thresh, therefore initialize it before creating the file in probe function. Found by Linux Driver Verification project (linuxtesting.org). Fixes: ce92136843cb ("crypto: picoxcell - add support for the...") Signed-off-by: Madhuparna Bhowmik Acked-by: Jamie Iles Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin --- drivers/crypto/picoxcell_crypto.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/crypto/picoxcell_crypto.c b/drivers/crypto/picoxcell_crypto.c index 2680e1525db58..13ecbb0e58528 100644 --- a/drivers/crypto/picoxcell_crypto.c +++ b/drivers/crypto/picoxcell_crypto.c @@ -1697,11 +1697,6 @@ static int spacc_probe(struct platform_device *pdev) goto err_clk_put; } - ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); - if (ret) - goto err_clk_disable; - - /* * Use an IRQ threshold of 50% as a default. This seems to be a * reasonable trade off of latency against throughput but can be @@ -1709,6 +1704,10 @@ static int spacc_probe(struct platform_device *pdev) */ engine->stat_irq_thresh = (engine->fifo_sz / 2); + ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); + if (ret) + goto err_clk_disable; + /* * Configure the interrupts. We only use the STAT_CNT interrupt as we * only submit a new packet for processing when we complete another in -- 2.25.1
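The picoxcell fix is a pure ordering change: the default must be written before the sysfs file that can overwrite it becomes visible, or a concurrent store can be clobbered by the late default assignment. A hedged userspace model of the corrected ordering (struct fields and function names are illustrative stand-ins, not the driver's API):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct engine {
    int fifo_sz;
    atomic_int stat_irq_thresh;
    atomic_bool sysfs_visible;   /* stands in for device_create_file() */
};

/* Corrected probe ordering: initialize the field first, then publish
 * the knob that user space can write through. */
static void probe_fixed(struct engine *e)
{
    atomic_store(&e->stat_irq_thresh, e->fifo_sz / 2); /* 50% default */
    atomic_store(&e->sysfs_visible, true);             /* now expose it */
}

/* Models the sysfs store callback: it can only run once the file is
 * visible, so after probe_fixed() its value can no longer be lost. */
static bool store_thresh(struct engine *e, int val)
{
    if (!atomic_load(&e->sysfs_visible))
        return false;
    atomic_store(&e->stat_irq_thresh, val);
    return true;
}
```

With the buggy ordering the default assignment could land after a user write; here the publish step is the last thing probe does.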
[PATCH 5.4 208/408] RDMA/mlx5: Fix potential race between destroy and CQE poll
From: Leon Romanovsky [ Upstream commit 4b916ed9f9e85f705213ca8d69771d3c1cd6ee5a ] The SRQ can be destroyed right before mlx5_cmd_get_srq is called. In such case the latter will return NULL instead of expected SRQ. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20200830084010.102381-5-l...@kernel.org Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin --- drivers/infiniband/hw/mlx5/cq.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c index ff664355de550..73d5b8dc74d86 100644 --- a/drivers/infiniband/hw/mlx5/cq.c +++ b/drivers/infiniband/hw/mlx5/cq.c @@ -167,7 +167,7 @@ static void handle_responder(struct ib_wc *wc, struct mlx5_cqe64 *cqe, { enum rdma_link_layer ll = rdma_port_get_link_layer(qp->ibqp.device, 1); struct mlx5_ib_dev *dev = to_mdev(qp->ibqp.device); - struct mlx5_ib_srq *srq; + struct mlx5_ib_srq *srq = NULL; struct mlx5_ib_wq *wq; u16 wqe_ctr; u8 roce_packet_type; @@ -179,7 +179,8 @@ static void handle_responder(struct ib_wc *wc, struct mlx5_cqe64 *cqe, if (qp->ibqp.xrcd) { msrq = mlx5_cmd_get_srq(dev, be32_to_cpu(cqe->srqn)); - srq = to_mibsrq(msrq); + if (msrq) + srq = to_mibsrq(msrq); } else { srq = to_msrq(qp->ibqp.srq); } -- 2.25.1
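The mlx5 change is the classic guard for a lookup that can race with destroy: initialize the result pointer to NULL and only do the container-style conversion when the lookup succeeded. A small self-contained sketch of that shape (types and names here are invented for illustration, not mlx5's):

```c
#include <assert.h>
#include <stddef.h>

struct core_srq { int srqn; };
struct ib_srq { struct core_srq core; int polled; };

/* Stand-in for mlx5_cmd_get_srq(): returns NULL once the SRQ has
 * been destroyed (slot cleared) or the number is unknown. */
static struct core_srq *get_srq(struct core_srq *table[], int n, int srqn)
{
    for (int i = 0; i < n; i++)
        if (table[i] && table[i]->srqn == srqn)
            return table[i];
    return NULL;
}

/* Stand-in for to_mibsrq(): container_of-style conversion, which is
 * only meaningful on a non-NULL pointer. */
#define to_ib_srq(p) ((struct ib_srq *)((char *)(p) - offsetof(struct ib_srq, core)))

static struct ib_srq *poll_cqe(struct core_srq *table[], int n, int srqn)
{
    struct ib_srq *srq = NULL;          /* pre-fix this was uninitialized */
    struct core_srq *msrq = get_srq(table, n, srqn);

    if (msrq)                           /* the added NULL guard */
        srq = to_ib_srq(msrq);
    return srq;
}
```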
[PATCH 5.8 105/633] crypto: picoxcell - Fix potential race condition bug
From: Madhuparna Bhowmik [ Upstream commit 64f4a62e3b17f1e473f971127c2924cae42afc82 ] engine->stat_irq_thresh was initialized after device_create_file() in the probe function, the initialization may race with call to spacc_stat_irq_thresh_store() which updates engine->stat_irq_thresh, therefore initialize it before creating the file in probe function. Found by Linux Driver Verification project (linuxtesting.org). Fixes: ce92136843cb ("crypto: picoxcell - add support for the...") Signed-off-by: Madhuparna Bhowmik Acked-by: Jamie Iles Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin --- drivers/crypto/picoxcell_crypto.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/crypto/picoxcell_crypto.c b/drivers/crypto/picoxcell_crypto.c index 7384e91c8b32b..0d32b641a7f9d 100644 --- a/drivers/crypto/picoxcell_crypto.c +++ b/drivers/crypto/picoxcell_crypto.c @@ -1666,11 +1666,6 @@ static int spacc_probe(struct platform_device *pdev) goto err_clk_put; } - ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); - if (ret) - goto err_clk_disable; - - /* * Use an IRQ threshold of 50% as a default. This seems to be a * reasonable trade off of latency against throughput but can be @@ -1678,6 +1673,10 @@ static int spacc_probe(struct platform_device *pdev) */ engine->stat_irq_thresh = (engine->fifo_sz / 2); + ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); + if (ret) + goto err_clk_disable; + /* * Configure the interrupts. We only use the STAT_CNT interrupt as we * only submit a new packet for processing when we complete another in -- 2.25.1
[PATCH 5.8 336/633] RDMA/mlx5: Fix potential race between destroy and CQE poll
From: Leon Romanovsky [ Upstream commit 4b916ed9f9e85f705213ca8d69771d3c1cd6ee5a ] The SRQ can be destroyed right before mlx5_cmd_get_srq is called. In such case the latter will return NULL instead of expected SRQ. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20200830084010.102381-5-l...@kernel.org Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin --- drivers/infiniband/hw/mlx5/cq.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c index 0c18cb6a2f148..3ca379513d0e6 100644 --- a/drivers/infiniband/hw/mlx5/cq.c +++ b/drivers/infiniband/hw/mlx5/cq.c @@ -168,7 +168,7 @@ static void handle_responder(struct ib_wc *wc, struct mlx5_cqe64 *cqe, { enum rdma_link_layer ll = rdma_port_get_link_layer(qp->ibqp.device, 1); struct mlx5_ib_dev *dev = to_mdev(qp->ibqp.device); - struct mlx5_ib_srq *srq; + struct mlx5_ib_srq *srq = NULL; struct mlx5_ib_wq *wq; u16 wqe_ctr; u8 roce_packet_type; @@ -180,7 +180,8 @@ static void handle_responder(struct ib_wc *wc, struct mlx5_cqe64 *cqe, if (qp->ibqp.xrcd) { msrq = mlx5_cmd_get_srq(dev, be32_to_cpu(cqe->srqn)); - srq = to_mibsrq(msrq); + if (msrq) + srq = to_mibsrq(msrq); } else { srq = to_msrq(qp->ibqp.srq); } -- 2.25.1
[PATCH 5.9 413/757] RDMA/mlx5: Fix potential race between destroy and CQE poll
From: Leon Romanovsky [ Upstream commit 4b916ed9f9e85f705213ca8d69771d3c1cd6ee5a ] The SRQ can be destroyed right before mlx5_cmd_get_srq is called. In such case the latter will return NULL instead of expected SRQ. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20200830084010.102381-5-l...@kernel.org Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin --- drivers/infiniband/hw/mlx5/cq.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c index dceb0eb2bed16..b318bde2e565f 100644 --- a/drivers/infiniband/hw/mlx5/cq.c +++ b/drivers/infiniband/hw/mlx5/cq.c @@ -168,7 +168,7 @@ static void handle_responder(struct ib_wc *wc, struct mlx5_cqe64 *cqe, { enum rdma_link_layer ll = rdma_port_get_link_layer(qp->ibqp.device, 1); struct mlx5_ib_dev *dev = to_mdev(qp->ibqp.device); - struct mlx5_ib_srq *srq; + struct mlx5_ib_srq *srq = NULL; struct mlx5_ib_wq *wq; u16 wqe_ctr; u8 roce_packet_type; @@ -180,7 +180,8 @@ static void handle_responder(struct ib_wc *wc, struct mlx5_cqe64 *cqe, if (qp->ibqp.xrcd) { msrq = mlx5_cmd_get_srq(dev, be32_to_cpu(cqe->srqn)); - srq = to_mibsrq(msrq); + if (msrq) + srq = to_mibsrq(msrq); } else { srq = to_msrq(qp->ibqp.srq); } -- 2.25.1
[PATCH 5.9 122/757] crypto: picoxcell - Fix potential race condition bug
From: Madhuparna Bhowmik [ Upstream commit 64f4a62e3b17f1e473f971127c2924cae42afc82 ] engine->stat_irq_thresh was initialized after device_create_file() in the probe function, the initialization may race with call to spacc_stat_irq_thresh_store() which updates engine->stat_irq_thresh, therefore initialize it before creating the file in probe function. Found by Linux Driver Verification project (linuxtesting.org). Fixes: ce92136843cb ("crypto: picoxcell - add support for the...") Signed-off-by: Madhuparna Bhowmik Acked-by: Jamie Iles Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin --- drivers/crypto/picoxcell_crypto.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/crypto/picoxcell_crypto.c b/drivers/crypto/picoxcell_crypto.c index dac6eb37fff93..fb34bf92861d1 100644 --- a/drivers/crypto/picoxcell_crypto.c +++ b/drivers/crypto/picoxcell_crypto.c @@ -1685,11 +1685,6 @@ static int spacc_probe(struct platform_device *pdev) goto err_clk_put; } - ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); - if (ret) - goto err_clk_disable; - - /* * Use an IRQ threshold of 50% as a default. This seems to be a * reasonable trade off of latency against throughput but can be @@ -1697,6 +1692,10 @@ static int spacc_probe(struct platform_device *pdev) */ engine->stat_irq_thresh = (engine->fifo_sz / 2); + ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); + if (ret) + goto err_clk_disable; + /* * Configure the interrupts. We only use the STAT_CNT interrupt as we * only submit a new packet for processing when we complete another in -- 2.25.1
[PATCH 4.19 042/264] crypto: picoxcell - Fix potential race condition bug
From: Madhuparna Bhowmik [ Upstream commit 64f4a62e3b17f1e473f971127c2924cae42afc82 ] engine->stat_irq_thresh was initialized after device_create_file() in the probe function, the initialization may race with call to spacc_stat_irq_thresh_store() which updates engine->stat_irq_thresh, therefore initialize it before creating the file in probe function. Found by Linux Driver Verification project (linuxtesting.org). Fixes: ce92136843cb ("crypto: picoxcell - add support for the...") Signed-off-by: Madhuparna Bhowmik Acked-by: Jamie Iles Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin --- drivers/crypto/picoxcell_crypto.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/crypto/picoxcell_crypto.c b/drivers/crypto/picoxcell_crypto.c index e2491754c468f..1ef47f7208b92 100644 --- a/drivers/crypto/picoxcell_crypto.c +++ b/drivers/crypto/picoxcell_crypto.c @@ -1701,11 +1701,6 @@ static int spacc_probe(struct platform_device *pdev) goto err_clk_put; } - ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); - if (ret) - goto err_clk_disable; - - /* * Use an IRQ threshold of 50% as a default. This seems to be a * reasonable trade off of latency against throughput but can be @@ -1713,6 +1708,10 @@ static int spacc_probe(struct platform_device *pdev) */ engine->stat_irq_thresh = (engine->fifo_sz / 2); + ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); + if (ret) + goto err_clk_disable; + /* * Configure the interrupts. We only use the STAT_CNT interrupt as we * only submit a new packet for processing when we complete another in -- 2.25.1
Re: [PATCH] ibmvscsi: fix potential race after loss of transport
Tyrel, > After a loss of transport due to an adapter migration or > crash/disconnect from the host partner there is a tiny window where we > can race adjusting the request_limit of the adapter. Applied to 5.10/scsi-fixes, thanks! -- Martin K. Petersen Oracle Linux Engineering
[PATCH] ibmvscsi: fix potential race after loss of transport
After a loss of transport due to an adapter migration or crash/disconnect from the host partner there is a tiny window where we can race adjusting the request_limit of the adapter. The request limit is atomically inc/dec to track the number of inflight requests against the allowed limit of our VIOS partner. After a transport loss we set the request_limit to zero to reflect this state. However, there is a window where the adapter may attempt to queue a command because the transport loss event hasn't been fully processed yet and request_limit is still greater than zero. The hypercall to send the event will fail and the error path will increment the request_limit as a result. If the adapter processes the transport event prior to this increment the request_limit becomes out of sync with the adapter state and can result in scsi commands being submitted on the now reset connection prior to an SRP Login resulting in a protocol violation. Fix this race by protecting request_limit with the host lock when changing the value via atomic_set() to indicate no transport. Signed-off-by: Tyrel Datwyler --- drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++- 1 file changed, 26 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c index b1f3017b6547..188ed75417a5 100644 --- a/drivers/scsi/ibmvscsi/ibmvscsi.c +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c @@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code) spin_unlock_irqrestore(hostdata->host->host_lock, flags); } +/** + * ibmvscsi_set_request_limit - Set the adapter request_limit in response to + * an adapter failure, reset, or SRP Login. Done under host lock to prevent + * race with scsi command submission. 
+ * @hostdata: adapter to adjust + * @limit: new request limit + */ +static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, int limit) +{ + unsigned long flags; + + spin_lock_irqsave(hostdata->host->host_lock, flags); + atomic_set(&hostdata->request_limit, limit); + spin_unlock_irqrestore(hostdata->host->host_lock, flags); +} + /** * ibmvscsi_reset_host - Reset the connection to the server * @hostdata: struct ibmvscsi_host_data to reset @@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code) static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata) { scsi_block_requests(hostdata->host); - atomic_set(&hostdata->request_limit, 0); + ibmvscsi_set_request_limit(hostdata, 0); purge_requests(hostdata, DID_ERROR); hostdata->action = IBMVSCSI_HOST_ACTION_RESET; @@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct *evt_struct) dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n", evt_struct->xfer_iu->srp.login_rej.reason); /* Login failed. */ - atomic_set(&hostdata->request_limit, -1); + ibmvscsi_set_request_limit(hostdata, -1); return; default: dev_err(hostdata->dev, "Invalid login response typecode 0x%02x!\n", evt_struct->xfer_iu->srp.login_rsp.opcode); /* Login failed. */ - atomic_set(&hostdata->request_limit, -1); + ibmvscsi_set_request_limit(hostdata, -1); return; } @@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct) * This value is set rather than added to request_limit because * request_limit could have been set to -1 by this client. 
 */ - atomic_set(&hostdata->request_limit, + ibmvscsi_set_request_limit(hostdata, be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta)); /* If we had any pending I/Os, kick them */ @@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data *hostdata) login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT | SRP_BUF_FORMAT_INDIRECT); - spin_lock_irqsave(hostdata->host->host_lock, flags); /* Start out with a request limit of 0, since this is negotiated in * the login request we are just sending and login requests always * get sent by the driver regardless of request_limit. */ - atomic_set(&hostdata->request_limit, 0); + ibmvscsi_set_request_limit(hostdata, 0); + spin_lock_irqsave(hostdata->host->host_lock, flags); rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2); spin_unlock_irqrestore(hostdata->host->host_lock, flags); dev_info(hostdata->dev, "sent SRP login\n"); @@ -1781,7 +1797,7 @@ static void ibmvscsi_handle_crq(struct viosrp_crq *crq, return; case VIOSRP_CRQ_XPORT_EVENT: /* Hypervisor telling us the connection is closed */ scsi_block_requests(hostdata->host); -
[PATCH 4.9 086/121] ALSA: hda: Fix potential race in unsol event handler
From: Takashi Iwai [ Upstream commit c637fa151259c0f74665fde7cba5b7eac1417ae5 ] The unsol event handling code has a loop retrieving the read/write indices and the arrays without locking while the append to the array may happen concurrently. This may lead to some inconsistency. Although there hasn't been any proof of this bad results, it's still safer to protect the racy accesses. This patch adds the spinlock protection around the unsol handling loop for addressing it. Here we take bus->reg_lock as the writer side snd_hdac_bus_queue_event() is also protected by that lock. Link: https://lore.kernel.org/r/20200516062556.30951-1-ti...@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_bus.c | 4 1 file changed, 4 insertions(+) diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 0e81ea89a5965..e3f68a76d90eb 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -155,6 +155,7 @@ static void process_unsol_events(struct work_struct *work) struct hdac_driver *drv; unsigned int rp, caddr, res; + spin_lock_irq(&bus->reg_lock); while (bus->unsol_rp != bus->unsol_wp) { rp = (bus->unsol_rp + 1) % HDA_UNSOL_QUEUE_SIZE; bus->unsol_rp = rp; @@ -166,10 +167,13 @@ static void process_unsol_events(struct work_struct *work) codec = bus->caddr_tbl[caddr & 0x0f]; if (!codec || !codec->dev.driver) continue; + spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); if (drv->unsol_event) drv->unsol_event(codec, res); + spin_lock_irq(&bus->reg_lock); } + spin_unlock_irq(&bus->reg_lock); } /** -- 2.25.1
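The shape of the HDA fix is worth calling out: the ring-buffer indices are only read and advanced under the lock, but the lock is dropped around the per-event driver callback and retaken before the next iteration. A userspace sketch of that consumer pattern, with a pthread mutex standing in for `bus->reg_lock` (queue size and names are illustrative):

```c
#include <assert.h>
#include <pthread.h>

#define QSZ 64

static pthread_mutex_t reg_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int ring[QSZ], unsol_rp, unsol_wp;
static unsigned int last_res;

/* Producer side, modeled on snd_hdac_bus_queue_event(): append under
 * the same lock the consumer uses for the indices. */
static void queue_event(unsigned int res)
{
    pthread_mutex_lock(&reg_lock);
    unsol_wp = (unsol_wp + 1) % QSZ;
    ring[unsol_wp] = res;
    pthread_mutex_unlock(&reg_lock);
}

/* Example callback used in the test below. */
static void record(unsigned int res) { last_res = res; }

/* Consumer: indices touched only under the lock; the callback runs
 * with the lock dropped so it may block or queue more events. */
static int process_events(void (*cb)(unsigned int))
{
    int handled = 0;

    pthread_mutex_lock(&reg_lock);
    while (unsol_rp != unsol_wp) {
        unsigned int rp = (unsol_rp + 1) % QSZ;
        unsigned int res = ring[rp];

        unsol_rp = rp;
        pthread_mutex_unlock(&reg_lock);
        cb(res);
        handled++;
        pthread_mutex_lock(&reg_lock);
    }
    pthread_mutex_unlock(&reg_lock);
    return handled;
}
```

Dropping the lock around the callback is what keeps the patch from introducing a lock-held call into arbitrary driver code.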
[PATCH 4.14 117/166] ALSA: hda: Fix potential race in unsol event handler
From: Takashi Iwai [ Upstream commit c637fa151259c0f74665fde7cba5b7eac1417ae5 ] The unsol event handling code has a loop retrieving the read/write indices and the arrays without locking while the append to the array may happen concurrently. This may lead to some inconsistency. Although there hasn't been any proof of this bad results, it's still safer to protect the racy accesses. This patch adds the spinlock protection around the unsol handling loop for addressing it. Here we take bus->reg_lock as the writer side snd_hdac_bus_queue_event() is also protected by that lock. Link: https://lore.kernel.org/r/20200516062556.30951-1-ti...@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_bus.c | 4 1 file changed, 4 insertions(+) diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 714a51721a313..ab9236e4c157e 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -155,6 +155,7 @@ static void process_unsol_events(struct work_struct *work) struct hdac_driver *drv; unsigned int rp, caddr, res; + spin_lock_irq(&bus->reg_lock); while (bus->unsol_rp != bus->unsol_wp) { rp = (bus->unsol_rp + 1) % HDA_UNSOL_QUEUE_SIZE; bus->unsol_rp = rp; @@ -166,10 +167,13 @@ static void process_unsol_events(struct work_struct *work) codec = bus->caddr_tbl[caddr & 0x0f]; if (!codec || !codec->dev.driver) continue; + spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); if (drv->unsol_event) drv->unsol_event(codec, res); + spin_lock_irq(&bus->reg_lock); } + spin_unlock_irq(&bus->reg_lock); } /** -- 2.25.1
[PATCH 4.14 131/166] ceph: fix potential race in ceph_check_caps
From: Jeff Layton [ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ] Nothing ensures that session will still be valid by the time we dereference the pointer. Take and put a reference. In principle, we should always be able to get a reference here, but throw a warning if that's ever not the case. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin --- fs/ceph/caps.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 589cfe3ed873b..ce94d09f6abf9 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1906,12 +1906,24 @@ ack: if (mutex_trylock(&session->s_mutex) == 0) { dout("inverting session/ino locks on %p\n", session); + session = ceph_get_mds_session(session); spin_unlock(&ci->i_ceph_lock); if (took_snap_rwsem) { up_read(&mdsc->snap_rwsem); took_snap_rwsem = 0; } - mutex_lock(&session->s_mutex); + if (session) { + mutex_lock(&session->s_mutex); + ceph_put_mds_session(session); + } else { + /* + * Because we take the reference while + * holding the i_ceph_lock, it should + * never be NULL. Throw a warning if it + * ever is. + */ + WARN_ON_ONCE(true); + } goto retry; } } -- 2.25.1
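The ceph fix follows a general rule: take a counted reference to an object while still holding the lock that keeps it alive, and only then drop the lock. A minimal userspace model of that pattern, assuming invented names (`session_get` / `pin_then_unlock` are not ceph's real helpers):

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

struct session {
    atomic_int refcount;        /* 0 means the session is being torn down */
};

static pthread_mutex_t i_ceph_lock = PTHREAD_MUTEX_INITIALIZER;

/* Like ceph_get_mds_session(): returns NULL if the object is already
 * dying, otherwise bumps the refcount with a CAS loop. */
static struct session *session_get(struct session *s)
{
    int ref = atomic_load(&s->refcount);

    while (ref > 0)
        if (atomic_compare_exchange_weak(&s->refcount, &ref, ref + 1))
            return s;
    return NULL;
}

static void session_put(struct session *s)
{
    atomic_fetch_sub(&s->refcount, 1);
}

/* The fixed ordering: the reference is taken *before* the lock is
 * dropped, so the pointer stays valid afterwards. Pre-fix code
 * dereferenced s after unlocking with no reference held. */
static struct session *pin_then_unlock(struct session *s)
{
    struct session *pinned;

    pthread_mutex_lock(&i_ceph_lock);
    pinned = session_get(s);    /* s guaranteed live while locked */
    pthread_mutex_unlock(&i_ceph_lock);
    return pinned;              /* caller must session_put() if non-NULL */
}
```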
[PATCH 4.19 164/245] ALSA: hda: Fix potential race in unsol event handler
From: Takashi Iwai [ Upstream commit c637fa151259c0f74665fde7cba5b7eac1417ae5 ] The unsol event handling code has a loop retrieving the read/write indices and the arrays without locking while the append to the array may happen concurrently. This may lead to some inconsistency. Although there hasn't been any proof of this bad results, it's still safer to protect the racy accesses. This patch adds the spinlock protection around the unsol handling loop for addressing it. Here we take bus->reg_lock as the writer side snd_hdac_bus_queue_event() is also protected by that lock. Link: https://lore.kernel.org/r/20200516062556.30951-1-ti...@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_bus.c | 4 1 file changed, 4 insertions(+) diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 714a51721a313..ab9236e4c157e 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -155,6 +155,7 @@ static void process_unsol_events(struct work_struct *work) struct hdac_driver *drv; unsigned int rp, caddr, res; + spin_lock_irq(&bus->reg_lock); while (bus->unsol_rp != bus->unsol_wp) { rp = (bus->unsol_rp + 1) % HDA_UNSOL_QUEUE_SIZE; bus->unsol_rp = rp; @@ -166,10 +167,13 @@ static void process_unsol_events(struct work_struct *work) codec = bus->caddr_tbl[caddr & 0x0f]; if (!codec || !codec->dev.driver) continue; + spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); if (drv->unsol_event) drv->unsol_event(codec, res); + spin_lock_irq(&bus->reg_lock); } + spin_unlock_irq(&bus->reg_lock); } /** -- 2.25.1
[PATCH 4.19 189/245] ceph: fix potential race in ceph_check_caps
From: Jeff Layton [ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ] Nothing ensures that session will still be valid by the time we dereference the pointer. Take and put a reference. In principle, we should always be able to get a reference here, but throw a warning if that's ever not the case. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin --- fs/ceph/caps.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index a2d4eed27f804..c0dbf8b7762b4 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -2015,12 +2015,24 @@ ack: if (mutex_trylock(&session->s_mutex) == 0) { dout("inverting session/ino locks on %p\n", session); + session = ceph_get_mds_session(session); spin_unlock(&ci->i_ceph_lock); if (took_snap_rwsem) { up_read(&mdsc->snap_rwsem); took_snap_rwsem = 0; } - mutex_lock(&session->s_mutex); + if (session) { + mutex_lock(&session->s_mutex); + ceph_put_mds_session(session); + } else { + /* + * Because we take the reference while + * holding the i_ceph_lock, it should + * never be NULL. Throw a warning if it + * ever is. + */ + WARN_ON_ONCE(true); + } goto retry; } } -- 2.25.1
[PATCH 5.4 259/388] ALSA: hda: Fix potential race in unsol event handler
From: Takashi Iwai [ Upstream commit c637fa151259c0f74665fde7cba5b7eac1417ae5 ] The unsol event handling code has a loop retrieving the read/write indices and the arrays without locking while the append to the array may happen concurrently. This may lead to some inconsistency. Although there hasn't been any proof of these bad results, it's still safer to protect the racy accesses. This patch adds the spinlock protection around the unsol handling loop for addressing it. Here we take bus->reg_lock, as the writer side snd_hdac_bus_queue_event() is also protected by that lock. Link: https://lore.kernel.org/r/20200516062556.30951-1-ti...@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_bus.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 8f19876244ebe..53be2cac98e7c 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -158,6 +158,7 @@ static void snd_hdac_bus_process_unsol_events(struct work_struct *work) struct hdac_driver *drv; unsigned int rp, caddr, res; + spin_lock_irq(&bus->reg_lock); while (bus->unsol_rp != bus->unsol_wp) { rp = (bus->unsol_rp + 1) % HDA_UNSOL_QUEUE_SIZE; bus->unsol_rp = rp; @@ -169,10 +170,13 @@ static void snd_hdac_bus_process_unsol_events(struct work_struct *work) codec = bus->caddr_tbl[caddr & 0x0f]; if (!codec || !codec->dev.driver) continue; + spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); if (drv->unsol_event) drv->unsol_event(codec, res); + spin_lock_irq(&bus->reg_lock); } + spin_unlock_irq(&bus->reg_lock); } /** -- 2.25.1
[PATCH 5.4 296/388] ceph: fix potential race in ceph_check_caps
From: Jeff Layton [ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ] Nothing ensures that session will still be valid by the time we dereference the pointer. Take and put a reference. In principle, we should always be able to get a reference here, but throw a warning if that's ever not the case. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin --- fs/ceph/caps.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index b2695919435e8..af563d73d252c 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -2013,12 +2013,24 @@ ack: if (mutex_trylock(&session->s_mutex) == 0) { dout("inverting session/ino locks on %p\n", session); + session = ceph_get_mds_session(session); spin_unlock(&ci->i_ceph_lock); if (took_snap_rwsem) { up_read(&mdsc->snap_rwsem); took_snap_rwsem = 0; } - mutex_lock(&session->s_mutex); + if (session) { + mutex_lock(&session->s_mutex); + ceph_put_mds_session(session); + } else { + /* + * Because we take the reference while + * holding the i_ceph_lock, it should + * never be NULL. Throw a warning if it + * ever is. + */ + WARN_ON_ONCE(true); + } goto retry; } } -- 2.25.1
[PATCH 4.9 096/121] ceph: fix potential race in ceph_check_caps
From: Jeff Layton [ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ] Nothing ensures that session will still be valid by the time we dereference the pointer. Take and put a reference. In principle, we should always be able to get a reference here, but throw a warning if that's ever not the case. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin --- fs/ceph/caps.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index e11aacb35d6b5..cbd92dd89de16 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1807,12 +1807,24 @@ ack: if (mutex_trylock(&session->s_mutex) == 0) { dout("inverting session/ino locks on %p\n", session); + session = ceph_get_mds_session(session); spin_unlock(&ci->i_ceph_lock); if (took_snap_rwsem) { up_read(&mdsc->snap_rwsem); took_snap_rwsem = 0; } - mutex_lock(&session->s_mutex); + if (session) { + mutex_lock(&session->s_mutex); + ceph_put_mds_session(session); + } else { + /* + * Because we take the reference while + * holding the i_ceph_lock, it should + * never be NULL. Throw a warning if it + * ever is. + */ + WARN_ON_ONCE(true); + } goto retry; } } -- 2.25.1
[PATCH 4.4 68/85] ceph: fix potential race in ceph_check_caps
From: Jeff Layton [ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ] Nothing ensures that session will still be valid by the time we dereference the pointer. Take and put a reference. In principle, we should always be able to get a reference here, but throw a warning if that's ever not the case. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin --- fs/ceph/caps.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 3d0497421e62b..49e693232916f 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1777,12 +1777,24 @@ ack: if (mutex_trylock(&session->s_mutex) == 0) { dout("inverting session/ino locks on %p\n", session); + session = ceph_get_mds_session(session); spin_unlock(&ci->i_ceph_lock); if (took_snap_rwsem) { up_read(&mdsc->snap_rwsem); took_snap_rwsem = 0; } - mutex_lock(&session->s_mutex); + if (session) { + mutex_lock(&session->s_mutex); + ceph_put_mds_session(session); + } else { + /* + * Because we take the reference while + * holding the i_ceph_lock, it should + * never be NULL. Throw a warning if it + * ever is. + */ + WARN_ON_ONCE(true); + } goto retry; } } -- 2.25.1
[PATCH 4.4 60/85] ALSA: hda: Fix potential race in unsol event handler
From: Takashi Iwai [ Upstream commit c637fa151259c0f74665fde7cba5b7eac1417ae5 ] The unsol event handling code has a loop retrieving the read/write indices and the arrays without locking while the append to the array may happen concurrently. This may lead to some inconsistency. Although there hasn't been any proof of these bad results, it's still safer to protect the racy accesses. This patch adds the spinlock protection around the unsol handling loop for addressing it. Here we take bus->reg_lock, as the writer side snd_hdac_bus_queue_event() is also protected by that lock. Link: https://lore.kernel.org/r/20200516062556.30951-1-ti...@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_bus.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 0e81ea89a5965..e3f68a76d90eb 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -155,6 +155,7 @@ static void process_unsol_events(struct work_struct *work) struct hdac_driver *drv; unsigned int rp, caddr, res; + spin_lock_irq(&bus->reg_lock); while (bus->unsol_rp != bus->unsol_wp) { rp = (bus->unsol_rp + 1) % HDA_UNSOL_QUEUE_SIZE; bus->unsol_rp = rp; @@ -166,10 +167,13 @@ static void process_unsol_events(struct work_struct *work) codec = bus->caddr_tbl[caddr & 0x0f]; if (!codec || !codec->dev.driver) continue; + spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); if (drv->unsol_event) drv->unsol_event(codec, res); + spin_lock_irq(&bus->reg_lock); } + spin_unlock_irq(&bus->reg_lock); } /** -- 2.25.1
[PATCH AUTOSEL 4.19 164/206] ALSA: hda: Fix potential race in unsol event handler
From: Takashi Iwai [ Upstream commit c637fa151259c0f74665fde7cba5b7eac1417ae5 ] The unsol event handling code has a loop retrieving the read/write indices and the arrays without locking while the append to the array may happen concurrently. This may lead to some inconsistency. Although there hasn't been any proof of these bad results, it's still safer to protect the racy accesses. This patch adds the spinlock protection around the unsol handling loop for addressing it. Here we take bus->reg_lock, as the writer side snd_hdac_bus_queue_event() is also protected by that lock. Link: https://lore.kernel.org/r/20200516062556.30951-1-ti...@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_bus.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 714a51721a313..ab9236e4c157e 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -155,6 +155,7 @@ static void process_unsol_events(struct work_struct *work) struct hdac_driver *drv; unsigned int rp, caddr, res; + spin_lock_irq(&bus->reg_lock); while (bus->unsol_rp != bus->unsol_wp) { rp = (bus->unsol_rp + 1) % HDA_UNSOL_QUEUE_SIZE; bus->unsol_rp = rp; @@ -166,10 +167,13 @@ static void process_unsol_events(struct work_struct *work) codec = bus->caddr_tbl[caddr & 0x0f]; if (!codec || !codec->dev.driver) continue; + spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); if (drv->unsol_event) drv->unsol_event(codec, res); + spin_lock_irq(&bus->reg_lock); } + spin_unlock_irq(&bus->reg_lock); } /** -- 2.25.1
[PATCH AUTOSEL 4.19 190/206] ceph: fix potential race in ceph_check_caps
From: Jeff Layton [ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ] Nothing ensures that session will still be valid by the time we dereference the pointer. Take and put a reference. In principle, we should always be able to get a reference here, but throw a warning if that's ever not the case. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin --- fs/ceph/caps.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index a2d4eed27f804..c0dbf8b7762b4 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -2015,12 +2015,24 @@ ack: if (mutex_trylock(&session->s_mutex) == 0) { dout("inverting session/ino locks on %p\n", session); + session = ceph_get_mds_session(session); spin_unlock(&ci->i_ceph_lock); if (took_snap_rwsem) { up_read(&mdsc->snap_rwsem); took_snap_rwsem = 0; } - mutex_lock(&session->s_mutex); + if (session) { + mutex_lock(&session->s_mutex); + ceph_put_mds_session(session); + } else { + /* + * Because we take the reference while + * holding the i_ceph_lock, it should + * never be NULL. Throw a warning if it + * ever is. + */ + WARN_ON_ONCE(true); + } goto retry; } } -- 2.25.1
[PATCH AUTOSEL 4.14 101/127] ALSA: hda: Fix potential race in unsol event handler
From: Takashi Iwai [ Upstream commit c637fa151259c0f74665fde7cba5b7eac1417ae5 ] The unsol event handling code has a loop retrieving the read/write indices and the arrays without locking while the append to the array may happen concurrently. This may lead to some inconsistency. Although there hasn't been any proof of these bad results, it's still safer to protect the racy accesses. This patch adds the spinlock protection around the unsol handling loop for addressing it. Here we take bus->reg_lock, as the writer side snd_hdac_bus_queue_event() is also protected by that lock. Link: https://lore.kernel.org/r/20200516062556.30951-1-ti...@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_bus.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 714a51721a313..ab9236e4c157e 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -155,6 +155,7 @@ static void process_unsol_events(struct work_struct *work) struct hdac_driver *drv; unsigned int rp, caddr, res; + spin_lock_irq(&bus->reg_lock); while (bus->unsol_rp != bus->unsol_wp) { rp = (bus->unsol_rp + 1) % HDA_UNSOL_QUEUE_SIZE; bus->unsol_rp = rp; @@ -166,10 +167,13 @@ static void process_unsol_events(struct work_struct *work) codec = bus->caddr_tbl[caddr & 0x0f]; if (!codec || !codec->dev.driver) continue; + spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); if (drv->unsol_event) drv->unsol_event(codec, res); + spin_lock_irq(&bus->reg_lock); } + spin_unlock_irq(&bus->reg_lock); } /** -- 2.25.1
[PATCH AUTOSEL 4.14 116/127] ceph: fix potential race in ceph_check_caps
From: Jeff Layton [ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ] Nothing ensures that session will still be valid by the time we dereference the pointer. Take and put a reference. In principle, we should always be able to get a reference here, but throw a warning if that's ever not the case. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin --- fs/ceph/caps.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 589cfe3ed873b..ce94d09f6abf9 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1906,12 +1906,24 @@ ack: if (mutex_trylock(&session->s_mutex) == 0) { dout("inverting session/ino locks on %p\n", session); + session = ceph_get_mds_session(session); spin_unlock(&ci->i_ceph_lock); if (took_snap_rwsem) { up_read(&mdsc->snap_rwsem); took_snap_rwsem = 0; } - mutex_lock(&session->s_mutex); + if (session) { + mutex_lock(&session->s_mutex); + ceph_put_mds_session(session); + } else { + /* + * Because we take the reference while + * holding the i_ceph_lock, it should + * never be NULL. Throw a warning if it + * ever is. + */ + WARN_ON_ONCE(true); + } goto retry; } } -- 2.25.1
[PATCH AUTOSEL 4.9 74/90] ALSA: hda: Fix potential race in unsol event handler
From: Takashi Iwai [ Upstream commit c637fa151259c0f74665fde7cba5b7eac1417ae5 ] The unsol event handling code has a loop retrieving the read/write indices and the arrays without locking while the append to the array may happen concurrently. This may lead to some inconsistency. Although there hasn't been any proof of these bad results, it's still safer to protect the racy accesses. This patch adds the spinlock protection around the unsol handling loop for addressing it. Here we take bus->reg_lock, as the writer side snd_hdac_bus_queue_event() is also protected by that lock. Link: https://lore.kernel.org/r/20200516062556.30951-1-ti...@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_bus.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 0e81ea89a5965..e3f68a76d90eb 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -155,6 +155,7 @@ static void process_unsol_events(struct work_struct *work) struct hdac_driver *drv; unsigned int rp, caddr, res; + spin_lock_irq(&bus->reg_lock); while (bus->unsol_rp != bus->unsol_wp) { rp = (bus->unsol_rp + 1) % HDA_UNSOL_QUEUE_SIZE; bus->unsol_rp = rp; @@ -166,10 +167,13 @@ static void process_unsol_events(struct work_struct *work) codec = bus->caddr_tbl[caddr & 0x0f]; if (!codec || !codec->dev.driver) continue; + spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); if (drv->unsol_event) drv->unsol_event(codec, res); + spin_lock_irq(&bus->reg_lock); } + spin_unlock_irq(&bus->reg_lock); } /** -- 2.25.1
[PATCH AUTOSEL 4.9 84/90] ceph: fix potential race in ceph_check_caps
From: Jeff Layton [ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ] Nothing ensures that session will still be valid by the time we dereference the pointer. Take and put a reference. In principle, we should always be able to get a reference here, but throw a warning if that's ever not the case. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin --- fs/ceph/caps.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index e11aacb35d6b5..cbd92dd89de16 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1807,12 +1807,24 @@ ack: if (mutex_trylock(&session->s_mutex) == 0) { dout("inverting session/ino locks on %p\n", session); + session = ceph_get_mds_session(session); spin_unlock(&ci->i_ceph_lock); if (took_snap_rwsem) { up_read(&mdsc->snap_rwsem); took_snap_rwsem = 0; } - mutex_lock(&session->s_mutex); + if (session) { + mutex_lock(&session->s_mutex); + ceph_put_mds_session(session); + } else { + /* + * Because we take the reference while + * holding the i_ceph_lock, it should + * never be NULL. Throw a warning if it + * ever is. + */ + WARN_ON_ONCE(true); + } goto retry; } } -- 2.25.1
[PATCH AUTOSEL 4.4 52/64] ALSA: hda: Fix potential race in unsol event handler
From: Takashi Iwai [ Upstream commit c637fa151259c0f74665fde7cba5b7eac1417ae5 ] The unsol event handling code has a loop retrieving the read/write indices and the arrays without locking while the append to the array may happen concurrently. This may lead to some inconsistency. Although there hasn't been any proof of these bad results, it's still safer to protect the racy accesses. This patch adds the spinlock protection around the unsol handling loop for addressing it. Here we take bus->reg_lock, as the writer side snd_hdac_bus_queue_event() is also protected by that lock. Link: https://lore.kernel.org/r/20200516062556.30951-1-ti...@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_bus.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 0e81ea89a5965..e3f68a76d90eb 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -155,6 +155,7 @@ static void process_unsol_events(struct work_struct *work) struct hdac_driver *drv; unsigned int rp, caddr, res; + spin_lock_irq(&bus->reg_lock); while (bus->unsol_rp != bus->unsol_wp) { rp = (bus->unsol_rp + 1) % HDA_UNSOL_QUEUE_SIZE; bus->unsol_rp = rp; @@ -166,10 +167,13 @@ static void process_unsol_events(struct work_struct *work) codec = bus->caddr_tbl[caddr & 0x0f]; if (!codec || !codec->dev.driver) continue; + spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); if (drv->unsol_event) drv->unsol_event(codec, res); + spin_lock_irq(&bus->reg_lock); } + spin_unlock_irq(&bus->reg_lock); } /** -- 2.25.1
[PATCH AUTOSEL 4.4 60/64] ceph: fix potential race in ceph_check_caps
From: Jeff Layton [ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ] Nothing ensures that session will still be valid by the time we dereference the pointer. Take and put a reference. In principle, we should always be able to get a reference here, but throw a warning if that's ever not the case. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin --- fs/ceph/caps.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 3d0497421e62b..49e693232916f 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1777,12 +1777,24 @@ ack: if (mutex_trylock(&session->s_mutex) == 0) { dout("inverting session/ino locks on %p\n", session); + session = ceph_get_mds_session(session); spin_unlock(&ci->i_ceph_lock); if (took_snap_rwsem) { up_read(&mdsc->snap_rwsem); took_snap_rwsem = 0; } - mutex_lock(&session->s_mutex); + if (session) { + mutex_lock(&session->s_mutex); + ceph_put_mds_session(session); + } else { + /* + * Because we take the reference while + * holding the i_ceph_lock, it should + * never be NULL. Throw a warning if it + * ever is. + */ + WARN_ON_ONCE(true); + } goto retry; } } -- 2.25.1
[PATCH AUTOSEL 5.4 305/330] ceph: fix potential race in ceph_check_caps
From: Jeff Layton [ Upstream commit dc3da0461cc4b76f2d0c5b12247fcb3b520edbbf ] Nothing ensures that session will still be valid by the time we dereference the pointer. Take and put a reference. In principle, we should always be able to get a reference here, but throw a warning if that's ever not the case. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin --- fs/ceph/caps.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index b2695919435e8..af563d73d252c 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -2013,12 +2013,24 @@ ack: if (mutex_trylock(&session->s_mutex) == 0) { dout("inverting session/ino locks on %p\n", session); + session = ceph_get_mds_session(session); spin_unlock(&ci->i_ceph_lock); if (took_snap_rwsem) { up_read(&mdsc->snap_rwsem); took_snap_rwsem = 0; } - mutex_lock(&session->s_mutex); + if (session) { + mutex_lock(&session->s_mutex); + ceph_put_mds_session(session); + } else { + /* + * Because we take the reference while + * holding the i_ceph_lock, it should + * never be NULL. Throw a warning if it + * ever is. + */ + WARN_ON_ONCE(true); + } goto retry; } } -- 2.25.1
[PATCH AUTOSEL 5.4 267/330] ALSA: hda: Fix potential race in unsol event handler
From: Takashi Iwai [ Upstream commit c637fa151259c0f74665fde7cba5b7eac1417ae5 ] The unsol event handling code has a loop retrieving the read/write indices and the arrays without locking while the append to the array may happen concurrently. This may lead to some inconsistency. Although there hasn't been any proof of these bad results, it's still safer to protect the racy accesses. This patch adds the spinlock protection around the unsol handling loop for addressing it. Here we take bus->reg_lock, as the writer side snd_hdac_bus_queue_event() is also protected by that lock. Link: https://lore.kernel.org/r/20200516062556.30951-1-ti...@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_bus.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 8f19876244ebe..53be2cac98e7c 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -158,6 +158,7 @@ static void snd_hdac_bus_process_unsol_events(struct work_struct *work) struct hdac_driver *drv; unsigned int rp, caddr, res; + spin_lock_irq(&bus->reg_lock); while (bus->unsol_rp != bus->unsol_wp) { rp = (bus->unsol_rp + 1) % HDA_UNSOL_QUEUE_SIZE; bus->unsol_rp = rp; @@ -169,10 +170,13 @@ static void snd_hdac_bus_process_unsol_events(struct work_struct *work) codec = bus->caddr_tbl[caddr & 0x0f]; if (!codec || !codec->dev.driver) continue; + spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); if (drv->unsol_event) drv->unsol_event(codec, res); + spin_lock_irq(&bus->reg_lock); } + spin_unlock_irq(&bus->reg_lock); } /** -- 2.25.1
Re: [PATCH] drivers: crypto: picoxcell_crypto: Fix potential race condition bug
On Tue, Aug 11, 2020 at 06:00:24PM +0530, madhuparnabhowmi...@gmail.com wrote: > From: Madhuparna Bhowmik > > engine->stat_irq_thresh was initialized after device_create_file() in > the probe function, the initialization may race with a call to > spacc_stat_irq_thresh_store() which updates engine->stat_irq_thresh, > therefore initialize it before creating the file in probe function. > > Found by Linux Driver Verification project (linuxtesting.org). > > Signed-off-by: Madhuparna Bhowmik > --- > drivers/crypto/picoxcell_crypto.c | 9 ++++----- > 1 file changed, 4 insertions(+), 5 deletions(-) Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] drivers: crypto: picoxcell_crypto: Fix potential race condition bug
On Tue, Aug 11, 2020 at 06:00:24PM +0530, madhuparnabhowmi...@gmail.com wrote: > From: Madhuparna Bhowmik > > engine->stat_irq_thresh was initialized after device_create_file() in > the probe function, the initialization may race with a call to > spacc_stat_irq_thresh_store() which updates engine->stat_irq_thresh, > therefore initialize it before creating the file in probe function. > > Found by Linux Driver Verification project (linuxtesting.org). > > Signed-off-by: Madhuparna Bhowmik Acked-by: Jamie Iles Thanks!
[PATCH] drivers: crypto: picoxcell_crypto: Fix potential race condition bug
From: Madhuparna Bhowmik engine->stat_irq_thresh was initialized after device_create_file() in the probe function, the initialization may race with a call to spacc_stat_irq_thresh_store() which updates engine->stat_irq_thresh, therefore initialize it before creating the file in probe function. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Madhuparna Bhowmik --- drivers/crypto/picoxcell_crypto.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/crypto/picoxcell_crypto.c b/drivers/crypto/picoxcell_crypto.c index dac6eb37fff9..fb34bf92861d 100644 --- a/drivers/crypto/picoxcell_crypto.c +++ b/drivers/crypto/picoxcell_crypto.c @@ -1685,11 +1685,6 @@ static int spacc_probe(struct platform_device *pdev) goto err_clk_put; } - ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); - if (ret) - goto err_clk_disable; - - /* * Use an IRQ threshold of 50% as a default. This seems to be a * reasonable trade off of latency against throughput but can be @@ -1697,6 +1692,10 @@ */ engine->stat_irq_thresh = (engine->fifo_sz / 2); + ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); + if (ret) + goto err_clk_disable; + /* * Configure the interrupts. We only use the STAT_CNT interrupt as we * only submit a new packet for processing when we complete another in -- 2.17.1
[tip: core/urgent] stop_machine: Avoid potential race behaviour
The following commit has been merged into the core/urgent branch of tip: Commit-ID: b1fc5833357524d5d342737913dbe32ff3557bc5 Gitweb: https://git.kernel.org/tip/b1fc5833357524d5d342737913dbe32ff3557bc5 Author: Mark Rutland AuthorDate: Mon, 07 Oct 2019 11:45:36 +01:00 Committer: Thomas Gleixner CommitterDate: Thu, 17 Oct 2019 12:47:12 +02:00 stop_machine: Avoid potential race behaviour Both multi_cpu_stop() and set_state() access multi_stop_data::state racily using plain accesses. These are subject to compiler transformations which could break the intended behaviour of the code, and this situation is detected by KCSAN on both arm64 and x86 (splats below). Improve matters by using READ_ONCE() and WRITE_ONCE() to ensure that the compiler cannot elide, replay, or tear loads and stores. In multi_cpu_stop() the two loads of multi_stop_data::state are expected to be a consistent value, so snapshot the value into a temporary variable to ensure this. The state transitions are serialized by atomic manipulation of multi_stop_data::num_threads, and other fields in multi_stop_data are not modified while subject to concurrent reads.
KCSAN splat on arm64: | BUG: KCSAN: data-race in multi_cpu_stop+0xa8/0x198 and set_state+0x80/0xb0 | | write to 0x1003bd00 of 4 bytes by task 24 on cpu 3: | set_state+0x80/0xb0 | multi_cpu_stop+0x16c/0x198 | cpu_stopper_thread+0x170/0x298 | smpboot_thread_fn+0x40c/0x560 | kthread+0x1a8/0x1b0 | ret_from_fork+0x10/0x18 | | read to 0x1003bd00 of 4 bytes by task 14 on cpu 1: | multi_cpu_stop+0xa8/0x198 | cpu_stopper_thread+0x170/0x298 | smpboot_thread_fn+0x40c/0x560 | kthread+0x1a8/0x1b0 | ret_from_fork+0x10/0x18 | | Reported by Kernel Concurrency Sanitizer on: | CPU: 1 PID: 14 Comm: migration/1 Not tainted 5.3.0-7-g67ab35a199f4-dirty #3 | Hardware name: linux,dummy-virt (DT) KCSAN splat on x86: | write to 0xb0bac0013e18 of 4 bytes by task 19 on cpu 2: | set_state kernel/stop_machine.c:170 [inline] | ack_state kernel/stop_machine.c:177 [inline] | multi_cpu_stop+0x1a4/0x220 kernel/stop_machine.c:227 | cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516 | smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165 | kthread+0x1b5/0x200 kernel/kthread.c:255 | ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352 | | read to 0xb0bac0013e18 of 4 bytes by task 44 on cpu 7: | multi_cpu_stop+0xb4/0x220 kernel/stop_machine.c:213 | cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516 | smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165 | kthread+0x1b5/0x200 kernel/kthread.c:255 | ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352 | | Reported by Kernel Concurrency Sanitizer on: | CPU: 7 PID: 44 Comm: migration/7 Not tainted 5.3.0+ #1 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 Signed-off-by: Mark Rutland Signed-off-by: Thomas Gleixner Acked-by: Marco Elver Link: https://lkml.kernel.org/r/20191007104536.27276-1-mark.rutl...@arm.com --- kernel/stop_machine.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c index c7031a2..998d50e 100644 --- a/kernel/stop_machine.c +++
b/kernel/stop_machine.c @@ -7,6 +7,7 @@ * Copyright (C) 2010 SUSE Linux Products GmbH * Copyright (C) 2010 Tejun Heo */ +#include <linux/compiler.h> #include <linux/completion.h> #include <linux/cpu.h> #include <linux/init.h> @@ -167,7 +168,7 @@ static void set_state(struct multi_stop_data *msdata, /* Reset ack counter. */ atomic_set(&msdata->thread_ack, msdata->num_threads); smp_wmb(); - msdata->state = newstate; + WRITE_ONCE(msdata->state, newstate); } /* Last one to ack a state moves to the next state. */ @@ -186,7 +187,7 @@ void __weak stop_machine_yield(const struct cpumask *cpumask) static int multi_cpu_stop(void *data) { struct multi_stop_data *msdata = data; - enum multi_stop_state curstate = MULTI_STOP_NONE; + enum multi_stop_state newstate, curstate = MULTI_STOP_NONE; int cpu = smp_processor_id(), err = 0; const struct cpumask *cpumask; unsigned long flags; @@ -210,8 +211,9 @@ static int multi_cpu_stop(void *data) do { /* Chill out and ensure we re-read multi_stop_state. */ stop_machine_yield(cpumask); - if (msdata->state != curstate) { - curstate = msdata->state; + newstate = READ_ONCE(msdata->state); + if (newstate != curstate) { + curstate = newstate; switch (curstate) { case MULTI_STOP_DISABLE_IRQ: local_irq_disable();
Re: [PATCH] stop_machine: avoid potential race behaviour
On Tue, Oct 08, 2019 at 10:36:37AM +0200, Peter Zijlstra wrote: > On Mon, Oct 07, 2019 at 11:45:36AM +0100, Mark Rutland wrote: > > Both multi_cpu_stop() and set_state() access multi_stop_data::state > > racily using plain accesses. These are subject to compiler > > transformations which could break the intended behaviour of the code, > > and this situation is detected by KCSAN on both arm64 and x86 (splats > > below). > > I really don't think there is anything the compiler can do wrong here. > > That is, I'm thinking I'd like to get this called out as a false-positive. I agree that in practice, it's very unlikely this would go wrong. There are some things the compiler could do, e.g. with re-ordering of volatile and plain reads of the same variable: https://lore.kernel.org/lkml/20191003161233.gb38...@lakrids.cambridge.arm.com/ ... and while I agree that's vanishingly unlikely to happen here, I couldn't say how to rule that out without ruling out cases that would actually blow up in practice. > That said, the patch looks obviously fine and will help with the > validation effort so no real objection there. Great! Can I take that as an Acked-by? I assume this should go via the tip tree. Thanks, Mark.
Re: [PATCH] stop_machine: avoid potential race behaviour
On Mon, Oct 07, 2019 at 11:45:36AM +0100, Mark Rutland wrote: > Both multi_cpu_stop() and set_state() access multi_stop_data::state > racily using plain accesses. These are subject to compiler > transformations which could break the intended behaviour of the code, > and this situation is detected by KCSAN on both arm64 and x86 (splats > below). I really don't think there is anything the compiler can do wrong here. That is, I'm thinking I'd like to get this called out as a false-positive. That said, the patch looks obviously fine and will help with the validation effort so no real objection there.
Re: [PATCH] stop_machine: avoid potential race behaviour
On Mon, 7 Oct 2019 at 12:45, Mark Rutland wrote:
>
> Both multi_cpu_stop() and set_state() access multi_stop_data::state
> racily using plain accesses. These are subject to compiler
> transformations which could break the intended behaviour of the code,
> and this situation is detected by KCSAN on both arm64 and x86 (splats
> below).
>
> Let's improve matters by using READ_ONCE() and WRITE_ONCE() to ensure
> that the compiler cannot elide, replay, or tear loads and stores. In
> multi_cpu_stop() we expect the two loads of multi_stop_data::state to be
> a consistent value, so we snapshot the value into a temporary variable
> to ensure this.
>
> The state transitions are serialized by atomic manipulation of
> multi_stop_data::num_threads, and other fields in multi_stop_data are
> not modified while subject to concurrent reads.
>
> Signed-off-by: Mark Rutland
> Cc: Marco Elver
> Cc: Thomas Gleixner
> Cc: Peter Zijlstra

Thanks for fixing this!

Acked-by: Marco Elver

> [quoted KCSAN splats and diff trimmed; see the original posting below]
[PATCH] stop_machine: avoid potential race behaviour
Both multi_cpu_stop() and set_state() access multi_stop_data::state
racily using plain accesses. These are subject to compiler
transformations which could break the intended behaviour of the code,
and this situation is detected by KCSAN on both arm64 and x86 (splats
below).

Let's improve matters by using READ_ONCE() and WRITE_ONCE() to ensure
that the compiler cannot elide, replay, or tear loads and stores. In
multi_cpu_stop() we expect the two loads of multi_stop_data::state to be
a consistent value, so we snapshot the value into a temporary variable
to ensure this.

The state transitions are serialized by atomic manipulation of
multi_stop_data::num_threads, and other fields in multi_stop_data are
not modified while subject to concurrent reads.

KCSAN splat on arm64:

| BUG: KCSAN: data-race in multi_cpu_stop+0xa8/0x198 and set_state+0x80/0xb0
|
| write to 0x1003bd00 of 4 bytes by task 24 on cpu 3:
|  set_state+0x80/0xb0
|  multi_cpu_stop+0x16c/0x198
|  cpu_stopper_thread+0x170/0x298
|  smpboot_thread_fn+0x40c/0x560
|  kthread+0x1a8/0x1b0
|  ret_from_fork+0x10/0x18
|
| read to 0x1003bd00 of 4 bytes by task 14 on cpu 1:
|  multi_cpu_stop+0xa8/0x198
|  cpu_stopper_thread+0x170/0x298
|  smpboot_thread_fn+0x40c/0x560
|  kthread+0x1a8/0x1b0
|  ret_from_fork+0x10/0x18
|
| Reported by Kernel Concurrency Sanitizer on:
| CPU: 1 PID: 14 Comm: migration/1 Not tainted 5.3.0-7-g67ab35a199f4-dirty #3
| Hardware name: linux,dummy-virt (DT)

KCSAN splat on x86:

| write to 0xb0bac0013e18 of 4 bytes by task 19 on cpu 2:
|  set_state kernel/stop_machine.c:170 [inline]
|  ack_state kernel/stop_machine.c:177 [inline]
|  multi_cpu_stop+0x1a4/0x220 kernel/stop_machine.c:227
|  cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516
|  smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165
|  kthread+0x1b5/0x200 kernel/kthread.c:255
|  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
|
| read to 0xb0bac0013e18 of 4 bytes by task 44 on cpu 7:
|  multi_cpu_stop+0xb4/0x220 kernel/stop_machine.c:213
|  cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516
|  smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165
|  kthread+0x1b5/0x200 kernel/kthread.c:255
|  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
|
| Reported by Kernel Concurrency Sanitizer on:
| CPU: 7 PID: 44 Comm: migration/7 Not tainted 5.3.0+ #1
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014

Signed-off-by: Mark Rutland
Cc: Marco Elver
Cc: Thomas Gleixner
Cc: Peter Zijlstra
---
 kernel/stop_machine.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index c7031a22aa7b..998d50ee2d9b 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -7,6 +7,7 @@
  * Copyright (C) 2010	SUSE Linux Products GmbH
  * Copyright (C) 2010	Tejun Heo
  */
+#include <linux/compiler.h>
 #include <linux/completion.h>
 #include <linux/cpu.h>
 #include <linux/init.h>
@@ -167,7 +168,7 @@ static void set_state(struct multi_stop_data *msdata,
 	/* Reset ack counter. */
 	atomic_set(&msdata->thread_ack, msdata->num_threads);
 	smp_wmb();
-	msdata->state = newstate;
+	WRITE_ONCE(msdata->state, newstate);
 }
 
 /* Last one to ack a state moves to the next state. */
@@ -186,7 +187,7 @@ void __weak stop_machine_yield(const struct cpumask *cpumask)
 static int multi_cpu_stop(void *data)
 {
 	struct multi_stop_data *msdata = data;
-	enum multi_stop_state curstate = MULTI_STOP_NONE;
+	enum multi_stop_state newstate, curstate = MULTI_STOP_NONE;
 	int cpu = smp_processor_id(), err = 0;
 	const struct cpumask *cpumask;
 	unsigned long flags;
@@ -210,8 +211,9 @@ static int multi_cpu_stop(void *data)
 	do {
 		/* Chill out and ensure we re-read multi_stop_state. */
 		stop_machine_yield(cpumask);
-		if (msdata->state != curstate) {
-			curstate = msdata->state;
+		newstate = READ_ONCE(msdata->state);
+		if (newstate != curstate) {
+			curstate = newstate;
 			switch (curstate) {
 			case MULTI_STOP_DISABLE_IRQ:
 				local_irq_disable();
--
2.11.0
[RFC 0/2] Potential race condition with page lock
In the 4.14 kernel, we observed the following two BUG_ON(!PageLocked(page)) scenarios. Both look to have a similar cause.

Case 1:

[127823.176076] try_to_free_buffers+0xfc/0x108 (BUG_ON(), page lock was freed somehow)
[127823.176079] jbd2_journal_try_to_free_buffers+0x15c/0x194 (page lock was available till this function)
[127823.176083] ext4_releasepage+0xe0/0x110
[127823.176087] try_to_release_page+0x68/0x90 (page lock was available till this function)
[127823.176090] invalidate_inode_page+0x94/0xa8
[127823.176093] invalidate_mapping_pages_without_uidlru+0xec/0x1a4 (page lock taken here)
...
...

Case 2:

[] el1_dbg+0x18
[] __remove_mapping+0x160 (BUG_ON(), page lock is not available. Someone might have freed it.)
[] remove_mapping+0x28
[] invalidate_inode_page+0xa4
[] invalidate_mapping_pages+0xd4 (acquired the page lock)
[] inode_lru_isolate+0x128
[] __list_lru_walk_one+0x10c
[] list_lru_walk_one+0x58
[] prune_icache_sb+0x50
[] super_cache_scan+0xfc
[] shrink_slab+0x304
[] shrink_node+0x254
[] do_try_to_free_pages+0x144
[] try_to_free_pages+0x390
[] __alloc_pages_nodemask+0x940
[] __get_free_pages+0x28
[] proc_pid_readlink+0x6c
[] vfs_readlink+0x124
[] SyS_readlinkat+0xc8
[] __sys_trace_return+0x0

Both scenarios say that the current stack took the page lock, but it was released in the meantime by someone else. There are two possibilities here:

1) Someone updating the page flags raced with the lock holder, and the PG_locked bit got cleared unintentionally.

2) Someone else took the lock without checking whether it was already locked, since there are explicit APIs that set PG_locked non-atomically.

I didn't find any history explaining why PG_locked is set non-atomically. I believe it could be for performance reasons, though I'm not sure.
Chintan Pandya (2):
  page-flags: Make page lock operation atomic
  page-flags: Catch the double setter of page flags

 fs/cifs/file.c             | 8 ++++----
 fs/pipe.c                  | 2 +-
 include/linux/page-flags.h | 4 ++--
 include/linux/pagemap.h    | 6 +++---
 mm/filemap.c               | 4 ++--
 mm/khugepaged.c            | 2 +-
 mm/ksm.c                   | 2 +-
 mm/memory-failure.c        | 2 +-
 mm/memory.c                | 2 +-
 mm/migrate.c               | 2 +-
 mm/shmem.c                 | 6 +++---
 mm/swap_state.c            | 4 ++--
 mm/vmscan.c                | 2 +-
 13 files changed, 23 insertions(+), 23 deletions(-)

--
2.17.1
[PATCH 3.16 252/305] SUNRPC: Fix a potential race in xprt_connect()
3.16.63-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Trond Myklebust

commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 upstream.

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Ben Hutchings
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -721,8 +721,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 }
[PATCH 3.18 16/31] SUNRPC: Fix a potential race in xprt_connect()
3.18-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 56e4e150e80e..dca234b1f77d 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -721,8 +721,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 }
--
2.19.1
[PATCH 4.19 28/67] SUNRPC: Fix a potential race in xprt_connect()
4.19-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index a8db2e3f8904..d066aae3cb6d 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -781,8 +781,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 	xprt_release_write(xprt, task);
 }
--
2.19.1
[PATCH 4.14 46/72] SUNRPC: Fix a potential race in xprt_connect()
4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 8eb0c4f3b3e9..d0282cc88b14 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -780,8 +780,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 	xprt_release_write(xprt, task);
 }
--
2.19.1
[PATCH 4.9 39/61] SUNRPC: Fix a potential race in xprt_connect()
4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 685e6d225414..1a8df242d26a 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -778,8 +778,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 	xprt_release_write(xprt, task);
 }
--
2.19.1
[PATCH 4.4 25/40] SUNRPC: Fix a potential race in xprt_connect()
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 2e98f4a243e5..112c191b8336 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -758,8 +758,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 	xprt_release_write(xprt, task);
 }
--
2.19.1
[PATCH AUTOSEL 4.19 27/73] SUNRPC: Fix a potential race in xprt_connect()
From: Trond Myklebust

[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index a8db2e3f8904..d066aae3cb6d 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -781,8 +781,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 	xprt_release_write(xprt, task);
 }
--
2.19.1
[PATCH AUTOSEL 4.14 12/41] SUNRPC: Fix a potential race in xprt_connect()
From: Trond Myklebust

[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 8eb0c4f3b3e9..d0282cc88b14 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -780,8 +780,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 	xprt_release_write(xprt, task);
 }
--
2.19.1
[PATCH AUTOSEL 4.4 09/23] SUNRPC: Fix a potential race in xprt_connect()
From: Trond Myklebust

[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 2e98f4a243e5..112c191b8336 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -758,8 +758,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 	xprt_release_write(xprt, task);
 }
--
2.19.1
[PATCH AUTOSEL 3.18 06/16] SUNRPC: Fix a potential race in xprt_connect()
From: Trond Myklebust

[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 56e4e150e80e..dca234b1f77d 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -721,8 +721,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 }
--
2.19.1
[PATCH AUTOSEL 4.9 10/34] SUNRPC: Fix a potential race in xprt_connect()
From: Trond Myklebust

[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]

If an asynchronous connection attempt completes while another task is in
xprt_connect(), then the call to rpc_sleep_on() could end up racing with
the call to xprt_wake_pending_tasks().

So add a second test of the connection state after we've put the task
to sleep and set the XPRT_CONNECTING flag, when we know that there can
be no asynchronous connection attempts still in progress.

Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 net/sunrpc/xprt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 685e6d225414..1a8df242d26a 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -778,8 +778,15 @@ void xprt_connect(struct rpc_task *task)
 			return;
 		if (xprt_test_and_set_connecting(xprt))
 			return;
-		xprt->stat.connect_start = jiffies;
-		xprt->ops->connect(xprt, task);
+		/* Race breaker */
+		if (!xprt_connected(xprt)) {
+			xprt->stat.connect_start = jiffies;
+			xprt->ops->connect(xprt, task);
+		} else {
+			xprt_clear_connecting(xprt);
+			task->tk_status = 0;
+			rpc_wake_up_queued_task(&xprt->pending, task);
+		}
 	}
 	xprt_release_write(xprt, task);
 }
--
2.19.1
[PATCH 4.4 045/114] arm64: Fix potential race with hardware DBM in ptep_set_access_flags()
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 6d332747fa5f0a6843b56b5b129168ba909336d1 ]

In a system with DBM (dirty bit management) capable agents there is a
possible race between a CPU executing ptep_set_access_flags() (maybe
non-DBM capable) and a hardware update of the dirty state (clearing of
PTE_RDONLY). The scenario:

a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old
   (PTE_AF clear)
b) ptep_set_access_flags() is called as a result of a read access and it
   needs to set the pte to writable, clean and young (PTE_AF set)
c) a DBM-capable agent, as a result of a different write access, is
   marking the entry as young (setting PTE_AF) and dirty (clearing
   PTE_RDONLY)

The current ptep_set_access_flags() implementation would set the
PTE_RDONLY bit in the resulting value overriding the DBM update and
losing the dirty state.

This patch fixes such race by setting PTE_RDONLY to the most permissive
(lowest value) of the current entry and the new one.

Fixes: 66dbd6e61a52 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Cc: Will Deacon
Acked-by: Mark Rutland
Acked-by: Steve Capper
Signed-off-by: Catalin Marinas
Signed-off-by: Will Deacon
Signed-off-by: Sasha Levin
---
 arch/arm64/mm/fault.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 86485415c5f0..be7f8416809f 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -107,26 +107,27 @@ int ptep_set_access_flags(struct vm_area_struct *vma,
 	/* only preserve the access flags and write permission */
 	pte_val(entry) &= PTE_AF | PTE_WRITE | PTE_DIRTY;
 
-	/*
-	 * PTE_RDONLY is cleared by default in the asm below, so set it in
-	 * back if necessary (read-only or clean PTE).
-	 */
+	/* set PTE_RDONLY if actual read-only or clean PTE */
 	if (!pte_write(entry) || !pte_sw_dirty(entry))
 		pte_val(entry) |= PTE_RDONLY;
 
 	/*
 	 * Setting the flags must be done atomically to avoid racing with the
-	 * hardware update of the access/dirty state.
+	 * hardware update of the access/dirty state. The PTE_RDONLY bit must
+	 * be set to the most permissive (lowest value) of *ptep and entry
+	 * (calculated as: a & b == ~(~a | ~b)).
 	 */
+	pte_val(entry) ^= PTE_RDONLY;
 	asm volatile("//	ptep_set_access_flags\n"
 	"	prfm	pstl1strm, %2\n"
 	"1:	ldxr	%0, %2\n"
-	"	and	%0, %0, %3		// clear PTE_RDONLY\n"
+	"	eor	%0, %0, %3		// negate PTE_RDONLY in *ptep\n"
 	"	orr	%0, %0, %4		// set flags\n"
+	"	eor	%0, %0, %3		// negate final PTE_RDONLY\n"
 	"	stxr	%w1, %0, %2\n"
 	"	cbnz	%w1, 1b\n"
 	: "=&r" (old_pteval), "=&r" (tmp), "+Q" (pte_val(*ptep))
-	: "L" (~PTE_RDONLY), "r" (pte_val(entry)));
+	: "L" (PTE_RDONLY), "r" (pte_val(entry)));
 
 	flush_tlb_fix_spurious_fault(vma, address);
 	return 1;
--
2.17.1
[PATCH AUTOSEL 4.4 27/65] arm64: Fix potential race with hardware DBM in ptep_set_access_flags()
From: Catalin Marinas

[ Upstream commit 6d332747fa5f0a6843b56b5b129168ba909336d1 ]

In a system with DBM (dirty bit management) capable agents there is a
possible race between a CPU executing ptep_set_access_flags() (maybe
non-DBM capable) and a hardware update of the dirty state (clearing of
PTE_RDONLY). The scenario:

a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old
   (PTE_AF clear)
b) ptep_set_access_flags() is called as a result of a read access and it
   needs to set the pte to writable, clean and young (PTE_AF set)
c) a DBM-capable agent, as a result of a different write access, is
   marking the entry as young (setting PTE_AF) and dirty (clearing
   PTE_RDONLY)

The current ptep_set_access_flags() implementation would set the
PTE_RDONLY bit in the resulting value overriding the DBM update and
losing the dirty state.

This patch fixes such race by setting PTE_RDONLY to the most permissive
(lowest value) of the current entry and the new one.

Fixes: 66dbd6e61a52 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Cc: Will Deacon
Acked-by: Mark Rutland
Acked-by: Steve Capper
Signed-off-by: Catalin Marinas
Signed-off-by: Will Deacon
Signed-off-by: Sasha Levin
---
 arch/arm64/mm/fault.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 86485415c5f0..be7f8416809f 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -107,26 +107,27 @@ int ptep_set_access_flags(struct vm_area_struct *vma,
 	/* only preserve the access flags and write permission */
 	pte_val(entry) &= PTE_AF | PTE_WRITE | PTE_DIRTY;
 
-	/*
-	 * PTE_RDONLY is cleared by default in the asm below, so set it in
-	 * back if necessary (read-only or clean PTE).
-	 */
+	/* set PTE_RDONLY if actual read-only or clean PTE */
 	if (!pte_write(entry) || !pte_sw_dirty(entry))
 		pte_val(entry) |= PTE_RDONLY;
 
 	/*
 	 * Setting the flags must be done atomically to avoid racing with the
-	 * hardware update of the access/dirty state.
+	 * hardware update of the access/dirty state. The PTE_RDONLY bit must
+	 * be set to the most permissive (lowest value) of *ptep and entry
+	 * (calculated as: a & b == ~(~a | ~b)).
 	 */
+	pte_val(entry) ^= PTE_RDONLY;
 	asm volatile("//	ptep_set_access_flags\n"
 	"	prfm	pstl1strm, %2\n"
 	"1:	ldxr	%0, %2\n"
-	"	and	%0, %0, %3		// clear PTE_RDONLY\n"
+	"	eor	%0, %0, %3		// negate PTE_RDONLY in *ptep\n"
 	"	orr	%0, %0, %4		// set flags\n"
+	"	eor	%0, %0, %3		// negate final PTE_RDONLY\n"
 	"	stxr	%w1, %0, %2\n"
 	"	cbnz	%w1, 1b\n"
 	: "=&r" (old_pteval), "=&r" (tmp), "+Q" (pte_val(*ptep))
-	: "L" (~PTE_RDONLY), "r" (pte_val(entry)));
+	: "L" (PTE_RDONLY), "r" (pte_val(entry)));
 
 	flush_tlb_fix_spurious_fault(vma, address);
 	return 1;
--
2.17.1
[PATCH AUTOSEL 4.4 27/65] arm64: Fix potential race with hardware DBM in ptep_set_access_flags()
From: Catalin Marinas

[ Upstream commit 6d332747fa5f0a6843b56b5b129168ba909336d1 ]

In a system with DBM (dirty bit management) capable agents there is a
possible race between a CPU executing ptep_set_access_flags() (maybe
non-DBM capable) and a hardware update of the dirty state (clearing of
PTE_RDONLY). The scenario:

a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old
   (PTE_AF clear)
b) ptep_set_access_flags() is called as a result of a read access and it
   needs to set the pte to writable, clean and young (PTE_AF set)
c) a DBM-capable agent, as a result of a different write access, is
   marking the entry as young (setting PTE_AF) and dirty (clearing
   PTE_RDONLY)

The current ptep_set_access_flags() implementation would set the
PTE_RDONLY bit in the resulting value overriding the DBM update and
losing the dirty state. This patch fixes such race by setting PTE_RDONLY
to the most permissive (lowest value) of the current entry and the new
one.

Fixes: 66dbd6e61a52 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Cc: Will Deacon
Acked-by: Mark Rutland
Acked-by: Steve Capper
Signed-off-by: Catalin Marinas
Signed-off-by: Will Deacon
Signed-off-by: Sasha Levin
---
 arch/arm64/mm/fault.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 86485415c5f0..be7f8416809f 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -107,26 +107,27 @@ int ptep_set_access_flags(struct vm_area_struct *vma,
 	/* only preserve the access flags and write permission */
 	pte_val(entry) &= PTE_AF | PTE_WRITE | PTE_DIRTY;
 
-	/*
-	 * PTE_RDONLY is cleared by default in the asm below, so set it in
-	 * back if necessary (read-only or clean PTE).
-	 */
+	/* set PTE_RDONLY if actual read-only or clean PTE */
 	if (!pte_write(entry) || !pte_sw_dirty(entry))
 		pte_val(entry) |= PTE_RDONLY;
 
 	/*
 	 * Setting the flags must be done atomically to avoid racing with the
-	 * hardware update of the access/dirty state.
+	 * hardware update of the access/dirty state. The PTE_RDONLY bit must
+	 * be set to the most permissive (lowest value) of *ptep and entry
+	 * (calculated as: a & b == ~(~a | ~b)).
 	 */
+	pte_val(entry) ^= PTE_RDONLY;
 	asm volatile("//	ptep_set_access_flags\n"
 	"	prfm	pstl1strm, %2\n"
 	"1:	ldxr	%0, %2\n"
-	"	and	%0, %0, %3		// clear PTE_RDONLY\n"
+	"	eor	%0, %0, %3		// negate PTE_RDONLY in *ptep\n"
 	"	orr	%0, %0, %4		// set flags\n"
+	"	eor	%0, %0, %3		// negate final PTE_RDONLY\n"
 	"	stxr	%w1, %0, %2\n"
 	"	cbnz	%w1, 1b\n"
 	: "=&r" (old_pteval), "=&r" (tmp), "+Q" (pte_val(*ptep))
-	: "L" (~PTE_RDONLY), "r" (pte_val(entry)));
+	: "L" (PTE_RDONLY), "r" (pte_val(entry)));
 
 	flush_tlb_fix_spurious_fault(vma, address);
 	return 1;
-- 
2.17.1
[PATCH AUTOSEL for 4.9 098/293] ALSA: hda: Fix potential race at unregistration and unsol events
From: Takashi Iwai

[ Upstream commit eb8d0eaaf84b0398533a7c091a0b65663f2fd7ea ]

When the codec device is unregistered / freed, it may release the
resource while being used in an unsolicited event like the jack
detection work. This leads to use-after-free.

The fix here is to unregister the device at first, i.e. removing the
codec from the list, then flushing the pending works to assure that all
unsol events are gone. After this point, we're free from accessing the
codec via unsol events, thus can release the resources gracefully.

The issue was spotted originally by Intel CI, but it couldn't be
reproduced reliably by its nature. So let's hope this fix really
addresses the whole issues.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196045
Reported-by: Martin Peres
Signed-off-by: Takashi Iwai
Signed-off-by: Sasha Levin
---
 sound/hda/hdac_bus.c    | 1 +
 sound/hda/hdac_device.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c
index 0e81ea89a596..714a51721a31 100644
--- a/sound/hda/hdac_bus.c
+++ b/sound/hda/hdac_bus.c
@@ -212,5 +212,6 @@ void snd_hdac_bus_remove_device(struct hdac_bus *bus,
 	bus->caddr_tbl[codec->addr] = NULL;
 	clear_bit(codec->addr, &bus->codec_powered);
 	bus->num_codecs--;
+	flush_work(&bus->unsol_work);
 }
 EXPORT_SYMBOL_GPL(snd_hdac_bus_remove_device);
diff --git a/sound/hda/hdac_device.c b/sound/hda/hdac_device.c
index 03c9872c31cf..19deb306facb 100644
--- a/sound/hda/hdac_device.c
+++ b/sound/hda/hdac_device.c
@@ -159,6 +159,7 @@ void snd_hdac_device_unregister(struct hdac_device *codec)
 	if (device_is_registered(&codec->dev)) {
 		hda_widget_sysfs_exit(codec);
 		device_del(&codec->dev);
+		snd_hdac_bus_remove_device(codec->bus, codec);
 	}
 }
 EXPORT_SYMBOL_GPL(snd_hdac_device_unregister);
-- 
2.15.1
[PATCH AUTOSEL for 4.4 050/162] ALSA: hda: Fix potential race at unregistration and unsol events
From: Takashi Iwai

[ Upstream commit eb8d0eaaf84b0398533a7c091a0b65663f2fd7ea ]

When the codec device is unregistered / freed, it may release the
resource while being used in an unsolicited event like the jack
detection work. This leads to use-after-free.

The fix here is to unregister the device at first, i.e. removing the
codec from the list, then flushing the pending works to assure that all
unsol events are gone. After this point, we're free from accessing the
codec via unsol events, thus can release the resources gracefully.

The issue was spotted originally by Intel CI, but it couldn't be
reproduced reliably by its nature. So let's hope this fix really
addresses the whole issues.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196045
Reported-by: Martin Peres
Signed-off-by: Takashi Iwai
Signed-off-by: Sasha Levin
---
 sound/hda/hdac_bus.c    | 1 +
 sound/hda/hdac_device.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c
index 0e81ea89a596..714a51721a31 100644
--- a/sound/hda/hdac_bus.c
+++ b/sound/hda/hdac_bus.c
@@ -212,5 +212,6 @@ void snd_hdac_bus_remove_device(struct hdac_bus *bus,
 	bus->caddr_tbl[codec->addr] = NULL;
 	clear_bit(codec->addr, &bus->codec_powered);
 	bus->num_codecs--;
+	flush_work(&bus->unsol_work);
 }
 EXPORT_SYMBOL_GPL(snd_hdac_bus_remove_device);
diff --git a/sound/hda/hdac_device.c b/sound/hda/hdac_device.c
index e361024eabb6..4a7400ae8af3 100644
--- a/sound/hda/hdac_device.c
+++ b/sound/hda/hdac_device.c
@@ -159,6 +159,7 @@ void snd_hdac_device_unregister(struct hdac_device *codec)
 	if (device_is_registered(&codec->dev)) {
 		hda_widget_sysfs_exit(codec);
 		device_del(&codec->dev);
+		snd_hdac_bus_remove_device(codec->bus, codec);
 	}
 }
 EXPORT_SYMBOL_GPL(snd_hdac_device_unregister);
-- 
2.15.1
z3fold: fix potential race in z3fold_reclaim_page
It is possible that on a (partially) unsuccessful page reclaim,
kref_put() called in z3fold_reclaim_page() does not yield page release,
but the page is released shortly afterwards by another thread. Then
z3fold_reclaim_page() would try to list_add() that (released) page again
which is obviously a bug.

To avoid that, spin_lock() has to be taken earlier, before the
kref_put() call mentioned earlier.

Signed-off-by: Vitaly Wool
---
 mm/z3fold.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/z3fold.c b/mm/z3fold.c
index 486550df32be..b04fa3ba1bf2 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -875,16 +875,18 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 				goto next;
 		}
 next:
+		spin_lock(&pool->lock);
 		if (test_bit(PAGE_HEADLESS, &page->private)) {
 			if (ret == 0) {
+				spin_unlock(&pool->lock);
 				free_z3fold_page(page);
 				return 0;
 			}
 		} else if (kref_put(&zhdr->refcount, release_z3fold_page)) {
 			atomic64_dec(&pool->pages_nr);
+			spin_unlock(&pool->lock);
 			return 0;
 		}
-		spin_lock(&pool->lock);
 
 		/*
 		 * Add to the beginning of LRU.
-- 
2.11.0
Re: [PATCH] cgroup: Fix potential race between cgroup_exit and migrate path
On 08/31/2017 06:42 AM, Tejun Heo wrote:
> On Wed, Aug 30, 2017 at 06:03:19PM -0700, Tejun Heo wrote:
>> Oops, more like the following.
>>
>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>> index df2e0f1..6f34025 100644
>> --- a/kernel/cgroup/cgroup.c
>> +++ b/kernel/cgroup/cgroup.c
>> @@ -683,7 +683,7 @@ static void css_set_move_task(struct task_struct *task,
>>  		if (it->task_pos == &task->cg_list)
>>  			css_task_iter_advance(it);
>>  
>> -		list_del_init(&task->cg_list);
>> +		list_del(&task->cg_list);
>>  		if (!css_set_populated(from_cset))
>>  			css_set_update_populated(from_cset, false);
>>  	} else {
>> @@ -702,6 +702,8 @@ static void css_set_move_task(struct task_struct *task,
>>  		rcu_assign_pointer(task->cgroups, to_cset);
>>  		list_add_tail(&task->cg_list, use_mg_tasks ? &to_cset->mg_tasks
>>  							   : &to_cset->tasks);
>> +	} else {
>> +		INIT_LIST_HEAD(&task->cg_list);
>>  	}
>>  }
>
> On the third thought, I don't think this can happen either because now
> migration is strongly synchronized against exits. Please take a look at
> the changes around cgroup_threadgroup_rwsem.
>
> Thanks.

Thank you for the suggestion; found below fix, which is not present in
v4.4.86 stable code base. Please let me know in case I am missing
something:

eedd0f4 cgroupns: Close race between cgroup_post_fork and copy_cgroup_ns

Thanks.

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH] cgroup: Fix potential race between cgroup_exit and migrate path
On Wed, Aug 30, 2017 at 06:03:19PM -0700, Tejun Heo wrote:
> Oops, more like the following.
>
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index df2e0f1..6f34025 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -683,7 +683,7 @@ static void css_set_move_task(struct task_struct *task,
>  		if (it->task_pos == &task->cg_list)
>  			css_task_iter_advance(it);
>  
> -		list_del_init(&task->cg_list);
> +		list_del(&task->cg_list);
>  		if (!css_set_populated(from_cset))
>  			css_set_update_populated(from_cset, false);
>  	} else {
> @@ -702,6 +702,8 @@ static void css_set_move_task(struct task_struct *task,
>  		rcu_assign_pointer(task->cgroups, to_cset);
>  		list_add_tail(&task->cg_list, use_mg_tasks ? &to_cset->mg_tasks
>  							   : &to_cset->tasks);
> +	} else {
> +		INIT_LIST_HEAD(&task->cg_list);
>  	}
>  }

On the third thought, I don't think this can happen either because now
migration is strongly synchronized against exits. Please take a look at
the changes around cgroup_threadgroup_rwsem.

Thanks.

-- 
tejun
Re: [PATCH] cgroup: Fix potential race between cgroup_exit and cpuset_attach
Hello,

On Wed, Aug 30, 2017 at 06:23:50PM +0530, Neeraj Upadhyay wrote:
> cpuset_hotplug_workfn()
>   cgroup_transfer_tasks()
>     cgroup_migrate()
>       cgroup_migrate_execute()
>         list_del_init(&task->cg_list)
>
> cpuset_attach()
>   cgroup_taskset_first(tset, &css) // css is not set
>   guarantee_online_mems(cs, ...) // data abort

I don't think this can happen anymore. We lock out task exits during
migration and don't call migration callbacks if there's no tasks to
migrate.

Thanks.

-- 
tejun
Re: [PATCH] cgroup: Fix potential race between cgroup_exit and migrate path
On Wed, Aug 30, 2017 at 05:55:45PM -0700, Tejun Heo wrote:
> Would something like the following fix the issue? Thanks.
>
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index df2e0f1..cd85ca0 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -683,7 +683,7 @@ static void css_set_move_task(struct task_struct *task,
>  		if (it->task_pos == &task->cg_list)
>  			css_task_iter_advance(it);
>  
> -		list_del_init(&task->cg_list);
> +		list_del(&task->cg_list);
>  		if (!css_set_populated(from_cset))
>  			css_set_update_populated(from_cset, false);
>  	} else {

Oops, more like the following.

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index df2e0f1..6f34025 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -683,7 +683,7 @@ static void css_set_move_task(struct task_struct *task,
 		if (it->task_pos == &task->cg_list)
 			css_task_iter_advance(it);
 
-		list_del_init(&task->cg_list);
+		list_del(&task->cg_list);
 		if (!css_set_populated(from_cset))
 			css_set_update_populated(from_cset, false);
 	} else {
@@ -702,6 +702,8 @@ static void css_set_move_task(struct task_struct *task,
 		rcu_assign_pointer(task->cgroups, to_cset);
 		list_add_tail(&task->cg_list, use_mg_tasks ? &to_cset->mg_tasks
 							   : &to_cset->tasks);
+	} else {
+		INIT_LIST_HEAD(&task->cg_list);
 	}
 }
Re: [PATCH] cgroup: Fix potential race between cgroup_exit and migrate path
Hello, Neeraj.

On Wed, Aug 30, 2017 at 06:24:09PM +0530, Neeraj Upadhyay wrote:
> There is a potential race between cgroup_exit() and the
> migration path. This race happens because cgroup_exit path
> reads the css_set and does cg_list empty check outside of
> css_set lock. This can potentially race with the migrate path
> trying to move the tasks to a different css_set. For instance,
> below is the interleaved sequence of events, where race is
> observed:
>
> cpuset_hotplug_workfn()
>   cgroup_transfer_tasks()
>     cgroup_migrate()
>       cgroup_migrate_execute()
>         css_set_move_task()
>           list_del_init(&task->cg_list);
>
> cgroup_exit()
>   cset = task_css_set(tsk);
>   if (!list_empty(&tsk->cg_list))
>
> list_add_tail(&task->cg_list, use_mg_tasks
>
> In above sequence, as cgroup_exit() read the cg_list for
> the task as empty, it didn't disassociate it from its
> current css_set, and was moved to new css_set instance
> css_set_move_task() called from cpuset_hotplug_workfn()
> path. This eventually can result in use after free scenarios,
> while accessing the same task_struct again, like in following
> sequence:
>
> kernfs_seq_start()
>   cgroup_seqfile_start()
>     cgroup_pidlist_start()
>       css_task_iter_next()
> __put_task_struct()
>
> Fix this problem, by moving the css_set and cg_list fetch in
> cgroup_exit() inside css_set lock.

Hmm... I haven't really thought through but could the problem be that
css_set_move_task() is temporarily making ->cg_list empty? The
use_task_css_set_links optimization can't handle that.

Would something like the following fix the issue? Thanks.

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index df2e0f1..cd85ca0 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -683,7 +683,7 @@ static void css_set_move_task(struct task_struct *task,
 		if (it->task_pos == &task->cg_list)
 			css_task_iter_advance(it);
 
-		list_del_init(&task->cg_list);
+		list_del(&task->cg_list);
 		if (!css_set_populated(from_cset))
 			css_set_update_populated(from_cset, false);
 	} else {
[PATCH] cgroup: Fix potential race between cgroup_exit and migrate path
There is a potential race between cgroup_exit() and the migration path.
This race happens because cgroup_exit path reads the css_set and does
cg_list empty check outside of css_set lock. This can potentially race
with the migrate path trying to move the tasks to a different css_set.
For instance, below is the interleaved sequence of events, where race is
observed:

cpuset_hotplug_workfn()
  cgroup_transfer_tasks()
    cgroup_migrate()
      cgroup_migrate_execute()
        css_set_move_task()
          list_del_init(&task->cg_list);

cgroup_exit()
  cset = task_css_set(tsk);
  if (!list_empty(&tsk->cg_list))

list_add_tail(&task->cg_list, use_mg_tasks

In above sequence, as cgroup_exit() read the cg_list for the task as
empty, it didn't disassociate it from its current css_set, and was moved
to new css_set instance css_set_move_task() called from
cpuset_hotplug_workfn() path. This eventually can result in use after
free scenarios, while accessing the same task_struct again, like in
following sequence:

kernfs_seq_start()
  cgroup_seqfile_start()
    cgroup_pidlist_start()
      css_task_iter_next()
__put_task_struct()

Fix this problem, by moving the css_set and cg_list fetch in
cgroup_exit() inside css_set lock.

Signed-off-by: Neeraj Upadhyay <neer...@codeaurora.org>
---
Hi,

We observed this issue for cgroup code corresponding to stable v4.4.85
snapshot 3144d81 ("cgroup, kthread: close race window where new kthreads
can be migrated to non-root cgroups"). Can you please tell us, if there
are any patches in latest code, which fixes these issue?
 kernel/cgroup/cgroup.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index df2e0f1..f746b70 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -692,10 +692,10 @@ static void css_set_move_task(struct task_struct *task,
 
 	if (to_cset) {
 		/*
-		 * We are synchronized through cgroup_threadgroup_rwsem
-		 * against PF_EXITING setting such that we can't race
-		 * against cgroup_exit() changing the css_set to
-		 * init_css_set and dropping the old one.
+		 * We are synchronized through css_set_lock against
+		 * PF_EXITING setting such that we can't race against
+		 * cgroup_exit() disassociating the task from the
+		 * css_set.
 		 */
 		WARN_ON_ONCE(task->flags & PF_EXITING);
 
@@ -4934,20 +4934,24 @@ void cgroup_exit(struct task_struct *tsk)
 	int i;
 
 	/*
-	 * Unlink from @tsk from its css_set. As migration path can't race
-	 * with us, we can check css_set and cg_list without synchronization.
+	 * Avoid potential race with the migrate path.
+	 */
+	spin_lock_irq(&css_set_lock);
+
+	/*
+	 * Unlink from @tsk from its css_set.
 	 */
 	cset = task_css_set(tsk);
 
 	if (!list_empty(&tsk->cg_list)) {
-		spin_lock_irq(&css_set_lock);
 		css_set_move_task(tsk, cset, NULL, false);
 		cset->nr_tasks--;
-		spin_unlock_irq(&css_set_lock);
 	} else {
 		get_css_set(cset);
 	}
 
+	spin_unlock_irq(&css_set_lock);
+
 	/* see cgroup_post_fork() for details */
 	do_each_subsys_mask(ss, i, have_exit_callback) {
 		ss->exit(tsk);

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation
[PATCH] cgroup: Fix potential race between cgroup_exit and cpuset_attach
There is a potential race between cgroup_exit() and the taskset
migration path. This race happens when all tasks associated with the
cg_list entries in mg_tasks, detach themselves before the tasks can be
attached to the destination cpuset in cpuset_attach(). Below is the
sequence where race is observed:

cpuset_hotplug_workfn()
  cgroup_transfer_tasks()
    cgroup_migrate()
      cgroup_migrate_execute()
        list_del_init(&task->cg_list)

cpuset_attach()
  cgroup_taskset_first(tset, &css) // css is not set
  guarantee_online_mems(cs, ...) // data abort

Fix this by adding a checking to verify that css is set from
cgroup_taskset_first(), before proceeding.

Signed-off-by: Neeraj Upadhyay <neer...@codeaurora.org>
---
Hi,

We observed this issue for cgroup code corresponding to stable v4.4.85
snapshot 3144d81 ("cgroup, kthread: close race window where new kthreads
can be migrated to non-root cgroups"). Can you please tell us, if there
are any patches in latest code, which fixes these issue?

 kernel/cgroup/cpuset.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 87a1213..7e245f26 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1510,11 +1510,17 @@ static void cpuset_attach(struct cgroup_taskset *tset)
 	static nodemask_t cpuset_attach_nodemask_to;
 	struct task_struct *task;
 	struct task_struct *leader;
-	struct cgroup_subsys_state *css;
+	struct cgroup_subsys_state *css = NULL;
 	struct cpuset *cs;
 	struct cpuset *oldcs = cpuset_attach_old_cs;
 
 	cgroup_taskset_first(tset, &css);
+	/* If all mg_tasks detached (from cgroup_exit())
+	 * before we started scanning the mg_tasks in
+	 * cgroup_taskset_next().
+	 */
+	if (css == NULL)
+		return;
 	cs = css_cs(css);
 
 	mutex_lock(&cpuset_mutex);

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH] crypto: rsa - fix a potential race condition in build
On 12/4/2016 10:48 PM, Herbert Xu wrote:
> On Fri, Dec 02, 2016 at 03:41:04PM -0800, Yang Shi wrote:
>> When building the kernel with RSA enabled with multithreaded make, the
>> below compile failure might be caught:
>>
>> | /buildarea/kernel-source/crypto/rsa_helper.c:18:28: fatal error: rsapubkey-asn1.h: No such file or directory
>> | #include "rsapubkey-asn1.h"
>> |          ^
>> | compilation terminated.
>> |   CC      crypto/rsa-pkcs1pad.o
>> |   CC      crypto/algboss.o
>> |   CC      crypto/testmgr.o
>> | make[3]: *** [/buildarea/kernel-source/scripts/Makefile.build:289: crypto/rsa_helper.o] Error 1
>> | make[3]: *** Waiting for unfinished jobs
>> | make[2]: *** [/buildarea/kernel-source/Makefile:969: crypto] Error 2
>> | make[1]: *** [Makefile:150: sub-make] Error 2
>> | make: *** [Makefile:24: __sub-make] Error 2
>>
>> The header file is not generated before rsa_helper is compiled, so add
>> the dependency to avoid such an issue.
>>
>> Signed-off-by: Yang Shi
>
> This should already be fixed in the latest crypto tree. Could you
> please double-check?
>
> Thanks,

Herbert,

I just found the commit. Please ignore my patch, sorry for the inconvenience. I will backport that commit to our kernel tree.

Yang
Re: [PATCH] crypto: rsa - fix a potential race condition in build
On Fri, Dec 02, 2016 at 03:41:04PM -0800, Yang Shi wrote:
> When building the kernel with RSA enabled with multithreaded make, the
> below compile failure might be caught:
>
> | /buildarea/kernel-source/crypto/rsa_helper.c:18:28: fatal error: rsapubkey-asn1.h: No such file or directory
> | #include "rsapubkey-asn1.h"
> |          ^
> | compilation terminated.
> |   CC      crypto/rsa-pkcs1pad.o
> |   CC      crypto/algboss.o
> |   CC      crypto/testmgr.o
> | make[3]: *** [/buildarea/kernel-source/scripts/Makefile.build:289: crypto/rsa_helper.o] Error 1
> | make[3]: *** Waiting for unfinished jobs
> | make[2]: *** [/buildarea/kernel-source/Makefile:969: crypto] Error 2
> | make[1]: *** [Makefile:150: sub-make] Error 2
> | make: *** [Makefile:24: __sub-make] Error 2
>
> The header file is not generated before rsa_helper is compiled, so add
> the dependency to avoid such an issue.
>
> Signed-off-by: Yang Shi

This should already be fixed in the latest crypto tree. Could you please double-check?

Thanks,
-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[PATCH] crypto: rsa - fix a potential race condition in build
When building the kernel with RSA enabled with multithreaded make, the below compile failure might be caught:

| /buildarea/kernel-source/crypto/rsa_helper.c:18:28: fatal error: rsapubkey-asn1.h: No such file or directory
| #include "rsapubkey-asn1.h"
|          ^
| compilation terminated.
|   CC      crypto/rsa-pkcs1pad.o
|   CC      crypto/algboss.o
|   CC      crypto/testmgr.o
| make[3]: *** [/buildarea/kernel-source/scripts/Makefile.build:289: crypto/rsa_helper.o] Error 1
| make[3]: *** Waiting for unfinished jobs
| make[2]: *** [/buildarea/kernel-source/Makefile:969: crypto] Error 2
| make[1]: *** [Makefile:150: sub-make] Error 2
| make: *** [Makefile:24: __sub-make] Error 2

The header file is not generated before rsa_helper is compiled, so add the dependency to avoid such an issue.

Signed-off-by: Yang Shi
---
 crypto/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/crypto/Makefile b/crypto/Makefile
index 99cc64a..8db39f9 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -40,6 +40,7 @@ obj-$(CONFIG_CRYPTO_ECDH) += ecdh_generic.o
 
 $(obj)/rsapubkey-asn1.o: $(obj)/rsapubkey-asn1.c $(obj)/rsapubkey-asn1.h
 $(obj)/rsaprivkey-asn1.o: $(obj)/rsaprivkey-asn1.c $(obj)/rsaprivkey-asn1.h
+$(obj)/rsa_helper.o: $(obj)/rsa_helper.c $(obj)/rsaprivkey-asn1.h
 
 clean-files += rsapubkey-asn1.c rsapubkey-asn1.h
 clean-files += rsaprivkey-asn1.c rsaprivkey-asn1.h
-- 
2.0.2
Re: A potential race in drivers/iio/adc/vf610_adc.ko
On Saturday 03 September 2016 08:53 PM, Jonathan Cameron wrote:
> On 02/09/16 09:05, Pavel Andrianov wrote:
>> Hi!
>>
>> There is a potential race in drivers/iio/adc/vf610_adc.ko. Handlers
>> vf610_set_conversion_mode and vf610_write_raw are called via the
>> device_attribute interface, but they are related to different
>> attributes, so they may be executed in parallel.
>> vf610_set_conversion_mode acquires the mutex indio_dev->mlock, and
>> vf610_write_raw does not. Thus updating the structure 'info' may be
>> performed simultaneously.
>>
>> Should vf610_write_raw also acquire the same mutex indio_dev->mlock?
>
> As Alison observed, mlock is not a general purpose lock for use by
> drivers. It's there to prevent state changes between direct reads
> (polled) and buffered/triggered reads (pushed).
>
> The write raw simply sets the sampling frequency. That's not a problem
> whilst buffered capture is running or otherwise. Interesting question
> of whether changing mode causes any trouble as well. It's possible
> something is undefined in the hardware during a mode change...
>
> Anyhow, that covers mlock. Next question: is there a race condition in
> general?
>
> Yes, there definitely is, as we have read-modify-write cycles on
> VF610_REG_ADC_CFG in both paths. So what is needed is a local lock to
> protect these accesses. Whilst in theory mlock could be used, it
> should not be, as it has a clearly stated purpose and using it for
> other purposes makes for much fiddlier and harder to read code!

Makes sense. What would be the best solution in this case? Should we just introduce a local lock for this module and use it for both, or is there anything we need to take care of while we have mlock for one?

> (as an aside, IIRC there is no locking in sysfs attributes to prevent
> a single attribute being read twice at the same time.)
>
> Jonathan

-- 
Vaishali
Re: A potential race in drivers/iio/adc/vf610_adc.ko
On 05/09/16 07:49, Vaishali Thakkar wrote:
> On Saturday 03 September 2016 08:53 PM, Jonathan Cameron wrote:
>> On 02/09/16 09:05, Pavel Andrianov wrote:
>>> Hi!
>>>
>>> There is a potential race in drivers/iio/adc/vf610_adc.ko. Handlers
>>> vf610_set_conversion_mode and vf610_write_raw are called via the
>>> device_attribute interface, but they are related to different
>>> attributes, so they may be executed in parallel.
>>> vf610_set_conversion_mode acquires the mutex indio_dev->mlock, and
>>> vf610_write_raw does not. Thus updating the structure 'info' may be
>>> performed simultaneously.
>>>
>>> Should vf610_write_raw also acquire the same mutex indio_dev->mlock?
>>
>> As Alison observed, mlock is not a general purpose lock for use by
>> drivers. It's there to prevent state changes between direct reads
>> (polled) and buffered/triggered reads (pushed).
>>
>> The write raw simply sets the sampling frequency. That's not a problem
>> whilst buffered capture is running or otherwise. Interesting question
>> of whether changing mode causes any trouble as well. It's possible
>> something is undefined in the hardware during a mode change...
>>
>> Anyhow, that covers mlock. Next question: is there a race condition in
>> general?
>>
>> Yes, there definitely is, as we have read-modify-write cycles on
>> VF610_REG_ADC_CFG in both paths. So what is needed is a local lock to
>> protect these accesses. Whilst in theory mlock could be used, it
>> should not be, as it has a clearly stated purpose and using it for
>> other purposes makes for much fiddlier and harder to read code!
>
> Makes sense. What would be the best solution in this case? Should we
> just introduce a local lock for this module and use it for both, or is
> there anything we need to take care of while we have mlock for one?

I'd leave the mlock as it is (or use the direct_claim wrappers as relevant). It would require a deep dive into the data sheet and hardware testing (for undocumented 'features') to be sure we could relax the current locking. Tightening it doesn't make sense unless we have reason to believe the frequency change causes trouble.

A local lock, held as part of the structure retrieved from iio_priv, to protect against multiple read-modify-write cycles occurring at the same time would cover the race conditions nicely.

Jonathan

>> (as an aside, IIRC there is no locking in sysfs attributes to prevent
>> a single attribute being read twice at the same time.)
>>
>> Jonathan
Re: A potential race in drivers/staging/speakup/speakup.ko
Pavel Andrianov, on Mon 05 Sep 2016 13:33:33 +0300, wrote:
> synth_direct_store may be called via the device_attributes interface.

Ah, right.

> In which function should the lock be added?

That'd be synth_direct_store then, around the while loop.

Samuel
Re: A potential race in drivers/staging/speakup/speakup.ko
05.09.2016 12:56, Samuel Thibault writes:
> Pavel Andrianov, on Mon 05 Sep 2016 12:54:10 +0300, wrote:
>> 05.09.2016 12:43, Samuel Thibault writes:
>>> Pavel Andrianov, on Mon 05 Sep 2016 11:51:50 +0300, wrote:
>>>> There is a potential race in drivers/staging/speakup/speakup.ko. All
>>>> operations with the global pointers buff_in and buff_out are
>>>> performed without any locks. Thus, a simultaneous write (via
>>>> synth_buffer_clear or synth_buffer_add) to the pointers may lead to
>>>> inconsistent data. Should a local lock be used here?
>>> AIUI, all callers of these functions have speakup_info.spinlock held.
>> Regard this call stack:
>>   synth_direct_store -> synth_printf -> synth_buffer_add
>> These functions do not hold speakup_info.spinlock.
> Apparently there is currently no caller of synth_direct_store and
> synth_store. But taking the lock here would be needed indeed.
>
> Samuel

synth_direct_store may be called via the device_attributes interface. In which function should the lock be added?

-- 
Pavel Andrianov
Linux Verification Center, ISPRAS
web: http://linuxtesting.org
e-mail: andria...@ispras.ru