Re: general protection fault in tomoyo_socket_sendmsg_permission
On Sat, Feb 13, 2021 at 07:02:22PM +0900, Tetsuo Handa wrote: > Greg, will you queue > https://lkml.kernel.org/r/20210205135707.4574-1-penguin-ker...@i-love.sakura.ne.jp > (which can > close a report which is wasting syzbot's resource with 5300+ crashes) for > 5.12 ? The change shown below will be > too large to test before merge window for 5.12 opens. > > The patch for fixing "general protection fault in > tomoyo_socket_sendmsg_permission" will kill kthread_get_run(). > Closing frequently crashing bug now is the better. > > On 2021/02/11 22:40, Tetsuo Handa wrote: > > I guess that we need to serialize attach operation and reset/detach > > operations, for > > it seems there is a race window that might result in "general protection > > fault in > > tomoyo_socket_sendmsg_permission". Details follows... > > Here is untested diff that is expected to be complete. > > (1) Handle kthread_create() failure (which avoids "KASAN: null-ptr-deref > Write in vhci_shutdown_connection") > by grouping socket lookup, SOCK_STREAM check and kthread_get_run() into > usbip_prepare_threads() function. > > (2) Serialize usbip_sockfd_store(), detach_store(), attach_store(), > usbip_sockfd_store() and > ud->eh_ops.shutdown()/ud->eh_ops.reset()/ud->eh_ops.unusable() operations > using usbip_store_mutex mutex > (which avoids "general protection fault in > tomoyo_socket_sendmsg_permission"). > > (3) Don't reset ud->tcp_socket to NULL in vhci_device_reset(). Since > tx_thread/rx_thread depends on > ud->tcp_socket != NULL whereas tcp_socket and tx_thread/rx_thread are > assigned at the same time, > it is never safe to reset only ud->tcp_socket from ud->eh_ops.reset(). > And actually nobody is > calling ud->eh_ops.reset() without ud->eh_ops.shutdown(). > > (4) usbip_sockfd_store() must perform {sdev,udc}->ud.status != > SDEV_ST_AVAILABLE && {sdev,udc}->ud.status = SDEV_ST_USED > exclusively, or multiple tx_thread/rx_thread can be created when > concurrently called. Although (2) will already > serialize this action, (1) will make it possible to perform within one > spinlock section. Shouldn't this be 4 different patches? thanks, greg k-h
Re: general protection fault in tomoyo_socket_sendmsg_permission
On Sat, Feb 13, 2021 at 07:02:22PM +0900, Tetsuo Handa wrote: > Greg, will you queue > https://lkml.kernel.org/r/20210205135707.4574-1-penguin-ker...@i-love.sakura.ne.jp > (which can > close a report which is wasting syzbot's resource with 5300+ crashes) for > 5.12 ? The change shown below will be > too large to test before merge window for 5.12 opens. > > The patch for fixing "general protection fault in > tomoyo_socket_sendmsg_permission" will kill kthread_get_run(). > Closing frequently crashing bug now is the better. > > On 2021/02/11 22:40, Tetsuo Handa wrote: > > I guess that we need to serialize attach operation and reset/detach > > operations, for > > it seems there is a race window that might result in "general protection > > fault in > > tomoyo_socket_sendmsg_permission". Details follows... > > Here is untested diff that is expected to be complete. Please work and test this and get it merged in a normal manner, there is no "rush" here at all. Submit it properly and all will be fine. thanks, greg k-h
Re: general protection fault in tomoyo_socket_sendmsg_permission
Greg, will you queue https://lkml.kernel.org/r/20210205135707.4574-1-penguin-ker...@i-love.sakura.ne.jp (which can close a report which is wasting syzbot's resource with 5300+ crashes) for 5.12 ? The change shown below will be too large to test before merge window for 5.12 opens. The patch for fixing "general protection fault in tomoyo_socket_sendmsg_permission" will kill kthread_get_run(). Closing frequently crashing bug now is the better. On 2021/02/11 22:40, Tetsuo Handa wrote: > I guess that we need to serialize attach operation and reset/detach > operations, for > it seems there is a race window that might result in "general protection > fault in > tomoyo_socket_sendmsg_permission". Details follows... Here is untested diff that is expected to be complete. (1) Handle kthread_create() failure (which avoids "KASAN: null-ptr-deref Write in vhci_shutdown_connection") by grouping socket lookup, SOCK_STREAM check and kthread_get_run() into usbip_prepare_threads() function. (2) Serialize usbip_sockfd_store(), detach_store(), attach_store(), usbip_sockfd_store() and ud->eh_ops.shutdown()/ud->eh_ops.reset()/ud->eh_ops.unusable() operations using usbip_store_mutex mutex (which avoids "general protection fault in tomoyo_socket_sendmsg_permission"). (3) Don't reset ud->tcp_socket to NULL in vhci_device_reset(). Since tx_thread/rx_thread depends on ud->tcp_socket != NULL whereas tcp_socket and tx_thread/rx_thread are assigned at the same time, it is never safe to reset only ud->tcp_socket from ud->eh_ops.reset(). And actually nobody is calling ud->eh_ops.reset() without ud->eh_ops.shutdown(). (4) usbip_sockfd_store() must perform {sdev,udc}->ud.status != SDEV_ST_AVAILABLE && {sdev,udc}->ud.status = SDEV_ST_USED exclusively, or multiple tx_thread/rx_thread can be created when concurrently called. Although (2) will already serialize this action, (1) will make it possible to perform within one spinlock section. drivers/usb/usbip/stub_dev.c | 56 ++-- drivers/usb/usbip/usbip_common.c | 55 +++ drivers/usb/usbip/usbip_common.h | 13 drivers/usb/usbip/usbip_event.c | 9 + drivers/usb/usbip/vhci_hcd.c | 7 +--- drivers/usb/usbip/vhci_sysfs.c | 50 drivers/usb/usbip/vudc_sysfs.c | 50 7 files changed, 175 insertions(+), 65 deletions(-) diff --git a/drivers/usb/usbip/stub_dev.c b/drivers/usb/usbip/stub_dev.c index 2305d425e6c9..522d58826049 100644 --- a/drivers/usb/usbip/stub_dev.c +++ b/drivers/usb/usbip/stub_dev.c @@ -39,12 +39,11 @@ static DEVICE_ATTR_RO(usbip_status); * is used to transfer usbip requests by kernel threads. -1 is a magic number * by which usbip connection is finished. */ -static ssize_t usbip_sockfd_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t count) +static ssize_t __usbip_sockfd_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) { struct stub_device *sdev = dev_get_drvdata(dev); int sockfd = 0; - struct socket *socket; int rv; if (!sdev) { @@ -57,7 +56,12 @@ static ssize_t usbip_sockfd_store(struct device *dev, struct device_attribute *a return -EINVAL; if (sockfd != -1) { - int err; + struct usbip_thread_info uti; + int err = usbip_prepare_threads(, >ud, sockfd, + stub_tx_loop, "stub_tx", stub_rx_loop, "stub_rx"); + + if (err) + return err; dev_info(dev, "stub up\n"); @@ -65,44 +69,46 @@ static ssize_t usbip_sockfd_store(struct device *dev, struct device_attribute *a if (sdev->ud.status != SDEV_ST_AVAILABLE) { dev_err(dev, "not ready\n"); - goto err; + spin_unlock_irq(>ud.lock); + usbip_unprepare_threads(); + return -EINVAL; } - socket = sockfd_lookup(sockfd, ); - if (!socket) - goto err; - - sdev->ud.tcp_socket = socket; + sdev->ud.tcp_socket = uti.tcp_socket; sdev->ud.sockfd = sockfd; - - spin_unlock_irq(>ud.lock); - - sdev->ud.tcp_rx = kthread_get_run(stub_rx_loop, >ud, - "stub_rx"); - sdev->ud.tcp_tx = kthread_get_run(stub_tx_loop, >ud, - "stub_tx"); - - spin_lock_irq(&g
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 2/11/21 9:58 PM, Tetsuo Handa wrote: On 2021/02/12 11:22, Tetsuo Handa wrote: On 2021/02/12 10:34, Shuah Khan wrote: On 2/10/21 6:14 PM, Tetsuo Handa wrote: (Dropping LSM ML because this is not a TOMOYO's bug.) On 2021/02/11 4:29, Shuah Khan wrote: This is a good find. I already replied to the thread to send a complete fix. As I said at https://lkml.kernel.org/r/f8cae6b1-8f84-0e6a-7d9c-fc4aec68f...@i-love.sakura.ne.jp , the as-is patch is effectively a complete fix. And applying the as-is patch should help spending syzbot resources for reproducing "general protection fault in tomoyo_socket_sendmsg_permission" with debug printk() patch applied, which in turn will help you in Right. I would like to get a clear understanding of how this condition is triggered. I am not saying this isn't a problem. Understanding how it is triggered helps find the best fix. part. Therefore, I strongly expect you to apply this version now. Is there a reproducer for this problem? There is no reproducer for "general protection fault in tomoyo_socket_sendmsg_permission" problem, but the race condition is explained at https://lkml.kernel.org/r/676d4518-0faa-9fab-15db-0db8d216d...@i-love.sakura.ne.jp . Here is a race window widening patch, and I locally reproduced "general protection fault in tomoyo_socket_sendmsg_permission". diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h index 8be857a4fa13..a7c68097aa1d 100644 --- a/drivers/usb/usbip/usbip_common.h +++ b/drivers/usb/usbip/usbip_common.h @@ -286,6 +286,8 @@ struct usbip_device { if (!IS_ERR(__k)) {\ get_task_struct(__k); \ wake_up_process(__k); \ + } else { \ + __k = NULL;\ } \ __k; \ }) diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c index be37aec250c2..93e1271d0f5d 100644 --- a/drivers/usb/usbip/vhci_sysfs.c +++ b/drivers/usb/usbip/vhci_sysfs.c @@ -389,8 +389,12 @@ static ssize_t attach_store(struct device *dev, struct device_attribute *attr, /* end the lock */ vdev->ud.tcp_rx = kthread_get_run(vhci_rx_loop, >ud, "vhci_rx"); - vdev->ud.tcp_tx = kthread_get_run(vhci_tx_loop, >ud, "vhci_tx"); + { + struct task_struct *tx = kthread_get_run(vhci_tx_loop, >ud, "vhci_tx"); + schedule_timeout_uninterruptible(HZ); + vdev->ud.tcp_tx = tx; + } rh_port_connect(vdev, speed); return count; Thank you. This is helpful. I will try to reproduce the problem with this test code. thanks, -- Shuah
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 2021/02/12 11:22, Tetsuo Handa wrote: > On 2021/02/12 10:34, Shuah Khan wrote: >> On 2/10/21 6:14 PM, Tetsuo Handa wrote: >>> (Dropping LSM ML because this is not a TOMOYO's bug.) >>> >>> On 2021/02/11 4:29, Shuah Khan wrote: >>>> This is a good find. I already replied to the thread to send a complete >>>> fix. >>> >>> As I said at >>> https://lkml.kernel.org/r/f8cae6b1-8f84-0e6a-7d9c-fc4aec68f...@i-love.sakura.ne.jp >>> , >>> the as-is patch is effectively a complete fix. And applying the as-is patch >>> should help spending >>> syzbot resources for reproducing "general protection fault in >>> tomoyo_socket_sendmsg_permission" >>> with debug printk() patch applied, which in turn will help you in >>> >>>> Right. I would like to get a clear understanding of how this condition >>>> is triggered. I am not saying this isn't a problem. Understanding how >>>> it is triggered helps find the best fix. >>> >>> part. Therefore, I strongly expect you to apply this version now. >>> >> >> Is there a reproducer for this problem? > > There is no reproducer for "general protection fault in > tomoyo_socket_sendmsg_permission" problem, but > the race condition is explained at > https://lkml.kernel.org/r/676d4518-0faa-9fab-15db-0db8d216d...@i-love.sakura.ne.jp > . > Here is a race window widening patch, and I locally reproduced "general protection fault in tomoyo_socket_sendmsg_permission". diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h index 8be857a4fa13..a7c68097aa1d 100644 --- a/drivers/usb/usbip/usbip_common.h +++ b/drivers/usb/usbip/usbip_common.h @@ -286,6 +286,8 @@ struct usbip_device { if (!IS_ERR(__k)) {\ get_task_struct(__k); \ wake_up_process(__k); \ + } else { \ + __k = NULL;\ } \ __k; \ }) diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c index be37aec250c2..93e1271d0f5d 100644 --- a/drivers/usb/usbip/vhci_sysfs.c +++ b/drivers/usb/usbip/vhci_sysfs.c @@ -389,8 +389,12 @@ static ssize_t attach_store(struct device *dev, struct device_attribute *attr, /* end the lock */ vdev->ud.tcp_rx = kthread_get_run(vhci_rx_loop, >ud, "vhci_rx"); - vdev->ud.tcp_tx = kthread_get_run(vhci_tx_loop, >ud, "vhci_tx"); + { + struct task_struct *tx = kthread_get_run(vhci_tx_loop, >ud, "vhci_tx"); + schedule_timeout_uninterruptible(HZ); + vdev->ud.tcp_tx = tx; + } rh_port_connect(vdev, speed); return count; [ 134.880383][ T7879] vhci_hcd vhci_hcd.0: pdev(2) rhport(0) sockfd(6) [ 134.883465][ T7879] vhci_hcd vhci_hcd.0: devid(0) speed(2) speed_str(full-speed) [ 134.904750][ T74] vhci_hcd: vhci_device speed not set [ 134.933053][ T7873] vhci_hcd: connection closed [ 134.933212][ T7870] vhci_hcd: connection closed [ 134.933415][ T4488] vhci_hcd: stop threads [ 134.942970][ T4488] vhci_hcd: release socket [ 134.944949][ T14] vhci_hcd: vhci_device speed not set [ 134.947063][ T4488] vhci_hcd: disconnect device [ 134.951508][ T4488] vhci_hcd: stop threads [ 134.953861][ T4488] vhci_hcd: release socket [ 134.957113][ T4488] vhci_hcd: disconnect device [ 135.014433][ T56] usb 15-1: enqueue for inactive port 0 [ 135.065119][ T7877] vhci_hcd: connection closed [ 135.065205][ T7880] vhci_hcd: connection closed [ 135.065370][ T4480] vhci_hcd: stop threads [ 135.072854][ T4480] vhci_hcd: release socket [ 135.076490][ T4480] vhci_hcd: disconnect device [ 135.079535][ T4480] vhci_hcd: stop threads [ 135.084024][ T4480] vhci_hcd: release socket [ 135.087979][ T4480] vhci_hcd: disconnect device [ 135.134526][ T6820] vhci_hcd: vhci_device speed not set [ 135.314959][ T6821] vhci_hcd: vhci_device speed not set [ 135.621271][ T7884] vhci_hcd vhci_hcd.0: pdev(4) rhport(0) sockfd(3) [ 135.621290][ T7885] vhci_hcd vhci_hcd.0: pdev(0) rhport(0) sockfd(3) [ 135.625072][ T7884] vhci_hcd vhci_hcd.0: devid(0) speed(5) speed_str(super-speed) [ 135.628665][ T7885] vhci_hcd vhci_hcd.0: devid(0) speed(5) speed_str(super-speed) [ 135.630109][ T7887] vhci_hcd vhci_hcd.0: pdev(1) rhport(0) sockfd(3) [ 135.630116][ T7887] vhci_hcd vhci_hcd.0: devid(0) speed(5) speed_str(super-speed) [ 135.672834][ T7895] vhci_hcd
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 2021/02/12 10:34, Shuah Khan wrote: > On 2/10/21 6:14 PM, Tetsuo Handa wrote: >> (Dropping LSM ML because this is not a TOMOYO's bug.) >> >> On 2021/02/11 4:29, Shuah Khan wrote: >>> This is a good find. I already replied to the thread to send a complete >>> fix. >> >> As I said at >> https://lkml.kernel.org/r/f8cae6b1-8f84-0e6a-7d9c-fc4aec68f...@i-love.sakura.ne.jp >> , >> the as-is patch is effectively a complete fix. And applying the as-is patch >> should help spending >> syzbot resources for reproducing "general protection fault in >> tomoyo_socket_sendmsg_permission" >> with debug printk() patch applied, which in turn will help you in >> >>> Right. I would like to get a clear understanding of how this condition >>> is triggered. I am not saying this isn't a problem. Understanding how >>> it is triggered helps find the best fix. >> >> part. Therefore, I strongly expect you to apply this version now. >> > > Is there a reproducer for this problem? There is no reproducer for "general protection fault in tomoyo_socket_sendmsg_permission" problem, but the race condition is explained at https://lkml.kernel.org/r/676d4518-0faa-9fab-15db-0db8d216d...@i-love.sakura.ne.jp .
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 2/10/21 6:14 PM, Tetsuo Handa wrote: (Dropping LSM ML because this is not a TOMOYO's bug.) On 2021/02/11 4:29, Shuah Khan wrote: This is a good find. I already replied to the thread to send a complete fix. As I said at https://lkml.kernel.org/r/f8cae6b1-8f84-0e6a-7d9c-fc4aec68f...@i-love.sakura.ne.jp , the as-is patch is effectively a complete fix. And applying the as-is patch should help spending syzbot resources for reproducing "general protection fault in tomoyo_socket_sendmsg_permission" with debug printk() patch applied, which in turn will help you in Right. I would like to get a clear understanding of how this condition is triggered. I am not saying this isn't a problem. Understanding how it is triggered helps find the best fix. part. Therefore, I strongly expect you to apply this version now. Is there a reproducer for this problem? thanks, -- Shuah
Re: general protection fault in tomoyo_socket_sendmsg_permission
(Dropping LSM ML because this is not a TOMOYO's bug.) On 2021/02/11 4:29, Shuah Khan wrote: > This is a good find. I already replied to the thread to send a complete > fix. As I said at https://lkml.kernel.org/r/f8cae6b1-8f84-0e6a-7d9c-fc4aec68f...@i-love.sakura.ne.jp , the as-is patch is effectively a complete fix. And applying the as-is patch should help spending syzbot resources for reproducing "general protection fault in tomoyo_socket_sendmsg_permission" with debug printk() patch applied, which in turn will help you in > Right. I would like to get a clear understanding of how this condition > is triggered. I am not saying this isn't a problem. Understanding how > it is triggered helps find the best fix. part. Therefore, I strongly expect you to apply this version now.
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 2/10/21 12:07 PM, Tetsuo Handa wrote: On 2021/02/11 3:17, Shuah Khan wrote: I am looking to understand the syzbot configuration and a reproducer to be able to debug and fix the problem. How is syzbot triggering the vhci_hcd attach and detach sequence? I don't know. I'm waiting for syzbot to reproduce the problem on linux-next with https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/usb/usbip?id=f1bdf414e7dd0cbc26460425719fc3ea479947a2 . This helps me determine all these fix suggestions that are coming in are fixes or papering over a real problem. What are these fix suggestions? "general protection fault in tomoyo_socket_sendmsg_permission" is a NULL pointer dereference which can happen if vhci_device_reset() and/or vhci_device_init() (which does vdev->ud.tcp_socket = NULL;) were unexpectedly called. There is no reproducer, and (as far as I know) no fix suggestion. Right. I would like to get a clear understanding of how this condition is triggered. I am not saying this isn't a problem. Understanding how it is triggered helps find the best fix. "KASAN: null-ptr-deref Write in vhci_shutdown_connection" is an ERR_PTR(-EINTR) pointer dereference which can happen if kthread_create() was SIGKILLed. There is a reproducer, and https://lkml.kernel.org/r/20210205135707.4574-1-penguin-ker...@i-love.sakura.ne.jp is a fix suggestion. This is a good find. I already replied to the thread to send a complete fix. thanks, -- Shuah
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 2021/02/11 3:17, Shuah Khan wrote: > I am looking to understand the syzbot configuration and a reproducer > to be able to debug and fix the problem. How is syzbot triggering the > vhci_hcd attach and detach sequence? I don't know. I'm waiting for syzbot to reproduce the problem on linux-next with https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/usb/usbip?id=f1bdf414e7dd0cbc26460425719fc3ea479947a2 . > > This helps me determine all these fix suggestions that are coming in > are fixes or papering over a real problem. What are these fix suggestions? "general protection fault in tomoyo_socket_sendmsg_permission" is a NULL pointer dereference which can happen if vhci_device_reset() and/or vhci_device_init() (which does vdev->ud.tcp_socket = NULL;) were unexpectedly called. There is no reproducer, and (as far as I know) no fix suggestion. "KASAN: null-ptr-deref Write in vhci_shutdown_connection" is an ERR_PTR(-EINTR) pointer dereference which can happen if kthread_create() was SIGKILLed. There is a reproducer, and https://lkml.kernel.org/r/20210205135707.4574-1-penguin-ker...@i-love.sakura.ne.jp is a fix suggestion.
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 1/29/21 7:25 PM, Tetsuo Handa wrote: On 2021/01/30 6:18, Shuah Khan wrote: In this console log: It seems "this console log" refers to https://syzkaller.appspot.com/x/log.txt?x=1045303450 . 06:57:50 executing program 1: socketpair$tipc(0x1e, 0x2, 0x0, &(0x7fc0)={0x}) sendmsg$BATADV_CMD_GET_TRANSTABLE_LOCAL(r0, &(0x7f0002c0)={&(0x7f0001c0), 0xc, &(0x7f000280)={0x0, 0xd0010100}}, 0x0) [ 1151.090883][T23361] vhci_hcd vhci_hcd.0: pdev(4) rhport(0) sockfd(4) [ 1151.097445][T23361] vhci_hcd vhci_hcd.0: devid(0) speed(1) speed_str(low-speed) 06:57:50 executing program 0: r0 = syz_open_dev$binderN(&(0x7f000680)='/dev/binder#\x00', 0x0, 0x0) ioctl$BINDER_WRITE_READ(r0, 0xc0306201, &(0x7f000cc0)={0x88, 0x0, &(0x7f000b80)=[@transaction={0x40406300, {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, @transaction={0x40406300, {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}], 0x0, 0x0, 0x0}) [ 1151.164402][T23363] vhci_hcd: connection closed [ 1151.167346][ T240] vhci_hcd: stop threads [ 1151.178329][T26761] usb 17-1: new low-speed USB device number 2 using vhci_hcd SK: Looking at the console log, it looks like while connection is being torn down, Excuse me, but it looks like (what comes here) while connection is being torn down ? I'm not familiar with driver code. [ 1151.181245][ T240] vhci_hcd: release socket Can you share your your test code for this program: "executing program 1" I don't think program 1 is relevant. I think program 4 06:57:50 executing program 4: r0 = socket$tipc(0x1e, 0x2, 0x0) syz_usbip_server_init(0x1) close_range(r0, 0x, 0x0) which calls syz_usbip_server_init() as with other duplicates is relevant. Also your setup? Do you run usbip_host and vhci_hcd both? Who are you referring to with "you/your" ? I'm not running syzkaller in my setup and I don't have test code. I'm just proposing printing more messages in order to confirm the ordering of events and member values in structures. I am looking to understand the syzbot configuration and a reproducer to be able to debug and fix the problem. How is syzbot triggering the vhci_hcd attach and detach sequence? This helps me determine all these fix suggestions that are coming in are fixes or papering over a real problem. thanks, -- Shuah
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 2021/01/30 6:18, Shuah Khan wrote: > In this console log: It seems "this console log" refers to https://syzkaller.appspot.com/x/log.txt?x=1045303450 . > > 06:57:50 executing program 1: > socketpair$tipc(0x1e, 0x2, 0x0, &(0x7fc0)={0x}) > sendmsg$BATADV_CMD_GET_TRANSTABLE_LOCAL(r0, > &(0x7f0002c0)={&(0x7f0001c0), 0xc, &(0x7f000280)={0x0, > 0xd0010100}}, 0x0) > > [ 1151.090883][T23361] vhci_hcd vhci_hcd.0: pdev(4) rhport(0) sockfd(4) > [ 1151.097445][T23361] vhci_hcd vhci_hcd.0: devid(0) speed(1) > speed_str(low-speed) > 06:57:50 executing program 0: > r0 = syz_open_dev$binderN(&(0x7f000680)='/dev/binder#\x00', 0x0, 0x0) > ioctl$BINDER_WRITE_READ(r0, 0xc0306201, &(0x7f000cc0)={0x88, 0x0, > &(0x7f000b80)=[@transaction={0x40406300, {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, > 0x0, 0x0, 0x0, 0x0, 0x0}}, @transaction={0x40406300, {0x0, 0x0, 0x0, 0x0, > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}], 0x0, 0x0, 0x0}) > > [ 1151.164402][T23363] vhci_hcd: connection closed > [ 1151.167346][ T240] vhci_hcd: stop threads > > > [ 1151.178329][T26761] usb 17-1: new low-speed USB device number 2 using > vhci_hcd > > > SK: Looking at the console log, it looks like while connection is being > torn down, Excuse me, but it looks like (what comes here) while connection is being torn down ? I'm not familiar with driver code. > > > [ 1151.181245][ T240] vhci_hcd: release socket > > > Can you share your your test code for this program: > "executing program 1" I don't think program 1 is relevant. I think program 4 06:57:50 executing program 4: r0 = socket$tipc(0x1e, 0x2, 0x0) syz_usbip_server_init(0x1) close_range(r0, 0x, 0x0) which calls syz_usbip_server_init() as with other duplicates is relevant. > > Also your setup? Do you run usbip_host and vhci_hcd both? Who are you referring to with "you/your" ? I'm not running syzkaller in my setup and I don't have test code. I'm just proposing printing more messages in order to confirm the ordering of events and member values in structures.
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 1/29/21 10:08 AM, Tetsuo Handa wrote: On 2021/01/30 1:05, Shuah Khan wrote: Since "general protection fault in tomoyo_socket_sendmsg_permission" is caused by unexpectedly resetting ud->tcp_socket to NULL without waiting for tx thread to terminate, tracing the ordering of events is worth knowing. Even adding schedule_timeout_uninterruptible() to before kernel_sendmsg() might help. What about the duplicate bug information that was in my email? Did you take a look at that? I was not aware of the duplicate bugs. It is interesting that "KASAN: null-ptr-deref Write in event_handler" says that vdev->ud.tcp_tx became NULL at if (vdev->ud.tcp_tx) { /* this location */ kthread_stop_put(vdev->ud.tcp_tx); vdev->ud.tcp_tx = NULL; } which means that somebody else is unexpectedly resetting vdev->ud.tcp_tx to NULL. If memset() from vhci_device_init() from vhci_start() were unexpectedly called, all of tcp_socket, tcp_rx, tcp_tx etc. becomes NULL which can explain these bugs ? I'm inclined to report not only tcp_socket but also other fields when kernel_sendmsg() detected that tcp_socket is NULL... In this console log: 06:57:50 executing program 1: socketpair$tipc(0x1e, 0x2, 0x0, &(0x7fc0)={0x}) sendmsg$BATADV_CMD_GET_TRANSTABLE_LOCAL(r0, &(0x7f0002c0)={&(0x7f0001c0), 0xc, &(0x7f000280)={0x0, 0xd0010100}}, 0x0) [ 1151.090883][T23361] vhci_hcd vhci_hcd.0: pdev(4) rhport(0) sockfd(4) [ 1151.097445][T23361] vhci_hcd vhci_hcd.0: devid(0) speed(1) speed_str(low-speed) 06:57:50 executing program 0: r0 = syz_open_dev$binderN(&(0x7f000680)='/dev/binder#\x00', 0x0, 0x0) ioctl$BINDER_WRITE_READ(r0, 0xc0306201, &(0x7f000cc0)={0x88, 0x0, &(0x7f000b80)=[@transaction={0x40406300, {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, @transaction={0x40406300, {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}], 0x0, 0x0, 0x0}) [ 1151.164402][T23363] vhci_hcd: connection closed [ 1151.167346][ T240] vhci_hcd: stop threads [ 1151.178329][T26761] usb 17-1: new low-speed USB device number 2 using vhci_hcd SK: Looking at the console log, it looks like while connection is being torn down, [ 1151.181245][ T240] vhci_hcd: release socket Can you share your your test code for this program: "executing program 1" Also your setup? Do you run usbip_host and vhci_hcd both? thanks, -- Shuah
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 2021/01/30 1:05, Shuah Khan wrote: >> Since "general protection fault in tomoyo_socket_sendmsg_permission" is >> caused by >> unexpectedly resetting ud->tcp_socket to NULL without waiting for tx thread >> to >> terminate, tracing the ordering of events is worth knowing. Even adding >> schedule_timeout_uninterruptible() to before kernel_sendmsg() might help. >> > > What about the duplicate bug information that was in my email? > Did you take a look at that? I was not aware of the duplicate bugs. It is interesting that "KASAN: null-ptr-deref Write in event_handler" says that vdev->ud.tcp_tx became NULL at if (vdev->ud.tcp_tx) { /* this location */ kthread_stop_put(vdev->ud.tcp_tx); vdev->ud.tcp_tx = NULL; } which means that somebody else is unexpectedly resetting vdev->ud.tcp_tx to NULL. If memset() from vhci_device_init() from vhci_start() were unexpectedly called, all of tcp_socket, tcp_rx, tcp_tx etc. becomes NULL which can explain these bugs ? I'm inclined to report not only tcp_socket but also other fields when kernel_sendmsg() detected that tcp_socket is NULL...
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 2021/01/29 4:05, Shuah Khan wrote: > The reason I don't like adding printk's is this is a race condition > and as a result time sensitive. Adding printks in the path will not > help debug this issue. It will make it harder to reproduce the problem. Not always. Adding printk() might make it easier to reproduce the problem. > > I am unable to reproduce the problem using the reproducer and running > multiple instances of the reproducer. Since syzkaller cannot find a reproducer for "general protection fault in tomoyo_socket_sendmsg_permission", and you cannot reproduce other problem using reproducer, trying to obtain some clue via printing messages by asking syzkaller to try debug patch can be very helpful. Since "general protection fault in tomoyo_socket_sendmsg_permission" is caused by unexpectedly resetting ud->tcp_socket to NULL without waiting for tx thread to terminate, tracing the ordering of events is worth knowing. Even adding schedule_timeout_uninterruptible() to before kernel_sendmsg() might help.
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 2020/11/14 2:14, Shuah Khan wrote: > On 11/13/20 5:00 AM, Hillf Danton wrote: >> Thu, 12 Nov 2020 23:21:26 -0800 >>> syzbot found the following issue on: >>> >>> HEAD commit: 9dbc1c03 Merge tag 'xfs-5.10-fixes-3' of git://git.kernel... >>> git tree: upstream >>> console output: https://syzkaller.appspot.com/x/log.txt?x=1045303450 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=1735b7978b1c3721 >>> dashboard link: https://syzkaller.appspot.com/bug?extid=95ce4b142579611ef0a9 >>> compiler: gcc (GCC) 10.1.0-syz 20200507 >>> userspace arch: i386 >>> >>> Unfortunately, I don't have any reproducer for this issue yet. >>> > > I would like to see the reproducer for this. I just can't reproduce > this problem. > >> >> Fix 96c2737716d5 ("usbip: move usbip kernel code out of staging") >> by closing the race between readers and writer of ud->tcp_socket on >> sock shutdown. To do that, add waitqueue for usbip device and make >> writer wait for all readers to go home before releasing the socket. Worrysome part for me is vhci_device_reset() which resets ud->tcp_socket to NULL without waiting for tx thread to terminate, though I don't know if vhci_device_reset() can be called while tx thread is running. I'd like to try below debug printk() patch on linx-next tree, for this bug is reported on linux.git and linux-next.git trees. Which git tree can be used for sending this to-be-removed patch to linux-next.git ? I wish there is a kernel config for fuzzers in linux.git so that every git tree can carry debug printk() patch for syzbot's reports... Subject: [PATCH] usb: usbip: vhci_hcd: add printk() for debug This is linux-next only patch for examining a bug reported at https://syzkaller.appspot.com/bug?id=3e1d941a31361efc4ced2ba8b4af2044d4e43c59 . Signed-off-by: Tetsuo Handa --- drivers/usb/usbip/stub_dev.c | 6 ++ drivers/usb/usbip/vhci_hcd.c | 11 +++ drivers/usb/usbip/vhci_sysfs.c | 4 drivers/usb/usbip/vhci_tx.c| 12 drivers/usb/usbip/vudc_dev.c | 6 ++ 5 files changed, 39 insertions(+) diff --git a/drivers/usb/usbip/stub_dev.c b/drivers/usb/usbip/stub_dev.c index 2305d425e6c9..627f83c0ebc8 100644 --- a/drivers/usb/usbip/stub_dev.c +++ b/drivers/usb/usbip/stub_dev.c @@ -131,10 +131,14 @@ static void stub_shutdown_connection(struct usbip_device *ud) /* 1. stop threads */ if (ud->tcp_rx) { + if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT)) + pr_info("%s: stop rx %p\n", __func__, ud->tcp_rx); kthread_stop_put(ud->tcp_rx); ud->tcp_rx = NULL; } if (ud->tcp_tx) { + if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT)) + pr_info("%s: stop tx %p\n", __func__, ud->tcp_tx); kthread_stop_put(ud->tcp_tx); ud->tcp_tx = NULL; } @@ -146,6 +150,8 @@ static void stub_shutdown_connection(struct usbip_device *ud) * not touch NULL socket. */ if (ud->tcp_socket) { + if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT)) + pr_info("%s: close sock %p\n", __func__, ud->tcp_socket); sockfd_put(ud->tcp_socket); ud->tcp_socket = NULL; ud->sockfd = -1; diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c index 3209b5ddd30c..9e95bf9330f5 100644 --- a/drivers/usb/usbip/vhci_hcd.c +++ b/drivers/usb/usbip/vhci_hcd.c @@ -1016,10 +1016,14 @@ static void vhci_shutdown_connection(struct usbip_device *ud) /* kill threads related to this sdev */ if (vdev->ud.tcp_rx) { + if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT)) + pr_info("%s: stop rx %p\n", __func__, vdev->ud.tcp_rx); kthread_stop_put(vdev->ud.tcp_rx); vdev->ud.tcp_rx = NULL; } if (vdev->ud.tcp_tx) { + if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT)) + pr_info("%s: stop tx %p\n", __func__, vdev->ud.tcp_tx); kthread_stop_put(vdev->ud.tcp_tx); vdev->ud.tcp_tx = NULL; } @@ -1027,6 +1031,8 @@ static void vhci_shutdown_connection(struct usbip_device *ud) /* active connection is closed */ if (vdev->ud.tcp_socket) { + if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT)) + pr_info("%s: close sock %p\n", __func__, ud->tcp_socket); sockfd_put(vdev->ud.tcp_socket); vdev->ud.tcp_socket = NULL; vdev->ud.sockfd = -1; @@ -1074,6 +1080,11 @@ static void vhci_device_reset(struct usbip_device *ud) vdev->udev = NULL; if (ud->tcp_socket) { + if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT)) { + pr_info("%s: close sock %p\n", __func__, ud->tcp_socket); + BUG_ON(vdev->ud.tcp_tx); +
Re: general protection fault in tomoyo_socket_sendmsg_permission
On 11/13/20 5:00 AM, Hillf Danton wrote: Thu, 12 Nov 2020 23:21:26 -0800 syzbot found the following issue on: HEAD commit:9dbc1c03 Merge tag 'xfs-5.10-fixes-3' of git://git.kernel... git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1045303450 kernel config: https://syzkaller.appspot.com/x/.config?x=1735b7978b1c3721 dashboard link: https://syzkaller.appspot.com/bug?extid=95ce4b142579611ef0a9 compiler: gcc (GCC) 10.1.0-syz 20200507 userspace arch: i386 Unfortunately, I don't have any reproducer for this issue yet. I would like to see the reproducer for this. I just can't reproduce this problem. Fix 96c2737716d5 ("usbip: move usbip kernel code out of staging") by closing the race between readers and writer of ud->tcp_socket on sock shutdown. To do that, add waitqueue for usbip device and make writer wait for all readers to go home before releasing the socket. --- a/drivers/usb/usbip/usbip_common.h +++ b/drivers/usb/usbip/usbip_common.h @@ -277,6 +277,9 @@ struct usbip_device { void (*reset)(struct usbip_device *); void (*unusable)(struct usbip_device *); } eh_ops; + + atomic_t sk_shutdown_count; + wait_queue_head_t sk_shutdown_waitq; }; #define kthread_get_run(threadfn, data, namefmt, ...) \ --- a/drivers/usb/usbip/vhci_hcd.c +++ b/drivers/usb/usbip/vhci_hcd.c @@ -1008,15 +1008,20 @@ static void vhci_shutdown_connection(str /* kill threads related to this sdev */ if (vdev->ud.tcp_rx) { + atomic_inc(>sk_shutdown_count); kthread_stop_put(vdev->ud.tcp_rx); vdev->ud.tcp_rx = NULL; } if (vdev->ud.tcp_tx) { + atomic_inc(>sk_shutdown_count); kthread_stop_put(vdev->ud.tcp_tx); vdev->ud.tcp_tx = NULL; } pr_info("stop threads\n"); + wait_event(ud->sk_shutdown_waitq, + !atomic_read(>sk_shutdown_count)); + /* active connection is closed */ if (vdev->ud.tcp_socket) { sockfd_put(vdev->ud.tcp_socket); @@ -1104,6 +1109,8 @@ static void vhci_device_init(struct vhci vdev->ud.eh_ops.reset = vhci_device_reset; vdev->ud.eh_ops.unusable = vhci_device_unusable; + init_waitqueue_head(>ud.sk_shutdown_waitq); + usbip_start_eh(>ud); } --- a/drivers/usb/usbip/vhci_rx.c +++ b/drivers/usb/usbip/vhci_rx.c @@ -264,5 +264,8 @@ int vhci_rx_loop(void *data) vhci_rx_pdu(ud); } + if (atomic_dec_and_test(>sk_shutdown_count)) + wake_up(>sk_shutdown_waitq); + return 0; } --- a/drivers/usb/usbip/vhci_tx.c +++ b/drivers/usb/usbip/vhci_tx.c @@ -252,5 +252,8 @@ int vhci_tx_loop(void *data) usbip_dbg_vhci_tx("pending urbs ?, now wake up\n"); } + if (atomic_dec_and_test(>sk_shutdown_count)) + wake_up(>sk_shutdown_waitq); + return 0; } Without a reproducer, it is hard to verify the fix. thanks, -- Shuah
general protection fault in tomoyo_socket_sendmsg_permission
Hello, syzbot found the following issue on: HEAD commit:9dbc1c03 Merge tag 'xfs-5.10-fixes-3' of git://git.kernel... git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1045303450 kernel config: https://syzkaller.appspot.com/x/.config?x=1735b7978b1c3721 dashboard link: https://syzkaller.appspot.com/bug?extid=95ce4b142579611ef0a9 compiler: gcc (GCC) 10.1.0-syz 20200507 userspace arch: i386 Unfortunately, I don't have any reproducer for this issue yet. IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+95ce4b142579611ef...@syzkaller.appspotmail.com general protection fault, probably for non-canonical address 0xdc03: [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0018-0x001f] CPU: 0 PID: 23364 Comm: vhci_tx Not tainted 5.10.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tomoyo_socket_sendmsg_permission+0x80/0x340 security/tomoyo/network.c:762 Code: c7 40 0c f3 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 84 24 a0 00 00 00 31 c0 e8 0b 70 f1 fd 48 8d 7d 18 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00 0f 85 7d 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c RSP: 0018:c90016f77ad0 EFLAGS: 00010206 RAX: 0003 RBX: 192002deef5b RCX: 83b3789e RDX: 8880186c8000 RSI: 837e9945 RDI: 0018 RBP: R08: 0030 R09: c90016f77d97 R10: R11: R12: dc00 R13: c90016f77ce8 R14: c90016f77ce8 R15: FS: () GS:8880b9e0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 562623b70868 CR3: 2af19000 CR4: 001526f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: security_socket_sendmsg+0x50/0xb0 security/security.c:2140 sock_sendmsg+0x3f/0x120 net/socket.c:668 vhci_send_cmd_submit+0x897/0xf90 drivers/usb/usbip/vhci_tx.c:139 vhci_tx_loop+0xfc/0x370 drivers/usb/usbip/vhci_tx.c:241 kthread+0x3af/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 Modules linked in: ---[ end trace 535369ec3e6ef304 ]--- RIP: 0010:tomoyo_socket_sendmsg_permission+0x80/0x340 security/tomoyo/network.c:762 Code: c7 40 0c f3 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 84 24 a0 00 00 00 31 c0 e8 0b 70 f1 fd 48 8d 7d 18 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00 0f 85 7d 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c RSP: 0018:c90016f77ad0 EFLAGS: 00010206 RAX: 0003 RBX: 192002deef5b RCX: 83b3789e RDX: 8880186c8000 RSI: 837e9945 RDI: 0018 RBP: R08: 0030 R09: c90016f77d97 R10: R11: R12: dc00 R13: c90016f77ce8 R14: c90016f77ce8 R15: FS: () GS:8880b9e0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f0c3a930740 CR3: 2af19000 CR4: 001526f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot.