Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-13 Thread Greg Kroah-Hartman
On Sat, Feb 13, 2021 at 07:02:22PM +0900, Tetsuo Handa wrote:
> Greg, will you queue 
> https://lkml.kernel.org/r/20210205135707.4574-1-penguin-ker...@i-love.sakura.ne.jp
>  (which can
> close a report which is wasting syzbot's resource with 5300+ crashes) for 
> 5.12 ? The change shown below will be
> too large to test before merge window for 5.12 opens. 
> 
> The patch for fixing "general protection fault in 
> tomoyo_socket_sendmsg_permission" will kill kthread_get_run().
> Closing frequently crashing bug now is the better.
> 
> On 2021/02/11 22:40, Tetsuo Handa wrote:
> > I guess that we need to serialize attach operation and reset/detach 
> > operations, for
> > it seems there is a race window that might result in "general protection 
> > fault in
> > tomoyo_socket_sendmsg_permission". Details follows...
> 
> Here is untested diff that is expected to be complete.
> 
> (1) Handle kthread_create() failure (which avoids "KASAN: null-ptr-deref 
> Write in vhci_shutdown_connection")
> by grouping socket lookup, SOCK_STREAM check and kthread_get_run() into 
> usbip_prepare_threads() function.
> 
> (2) Serialize usbip_sockfd_store(), detach_store(), attach_store(), 
> usbip_sockfd_store() and
> ud->eh_ops.shutdown()/ud->eh_ops.reset()/ud->eh_ops.unusable() operations 
> using usbip_store_mutex mutex
> (which avoids "general protection fault in 
> tomoyo_socket_sendmsg_permission").
> 
> (3) Don't reset ud->tcp_socket to NULL in vhci_device_reset(). Since 
> tx_thread/rx_thread depends on
> ud->tcp_socket != NULL whereas tcp_socket and tx_thread/rx_thread are 
> assigned at the same time,
> it is never safe to reset only ud->tcp_socket from ud->eh_ops.reset(). 
> And actually nobody is
> calling ud->eh_ops.reset() without ud->eh_ops.shutdown().
> 
> (4) usbip_sockfd_store() must perform {sdev,udc}->ud.status != 
> SDEV_ST_AVAILABLE && {sdev,udc}->ud.status = SDEV_ST_USED
> exclusively, or multiple tx_thread/rx_thread can be created when 
> concurrently called. Although (2) will already
> serialize this action, (1) will make it possible to perform within one 
> spinlock section.

Shouldn't this be 4 different patches?

thanks,

greg k-h


Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-13 Thread Greg Kroah-Hartman
On Sat, Feb 13, 2021 at 07:02:22PM +0900, Tetsuo Handa wrote:
> Greg, will you queue 
> https://lkml.kernel.org/r/20210205135707.4574-1-penguin-ker...@i-love.sakura.ne.jp
>  (which can
> close a report which is wasting syzbot's resource with 5300+ crashes) for 
> 5.12 ? The change shown below will be
> too large to test before merge window for 5.12 opens. 
> 
> The patch for fixing "general protection fault in 
> tomoyo_socket_sendmsg_permission" will kill kthread_get_run().
> Closing frequently crashing bug now is the better.
> 
> On 2021/02/11 22:40, Tetsuo Handa wrote:
> > I guess that we need to serialize attach operation and reset/detach 
> > operations, for
> > it seems there is a race window that might result in "general protection 
> > fault in
> > tomoyo_socket_sendmsg_permission". Details follows...
> 
> Here is untested diff that is expected to be complete.

Please work and test this and get it merged in a normal manner, there is
no "rush" here at all.  Submit it properly and all will be fine.

thanks,

greg k-h


Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-13 Thread Tetsuo Handa
Greg, will you queue 
https://lkml.kernel.org/r/20210205135707.4574-1-penguin-ker...@i-love.sakura.ne.jp
 (which can
close a report which is wasting syzbot's resource with 5300+ crashes) for 5.12 
? The change shown below will be
too large to test before merge window for 5.12 opens. 

The patch for fixing "general protection fault in 
tomoyo_socket_sendmsg_permission" will kill kthread_get_run().
Closing frequently crashing bug now is the better.

On 2021/02/11 22:40, Tetsuo Handa wrote:
> I guess that we need to serialize attach operation and reset/detach 
> operations, for
> it seems there is a race window that might result in "general protection 
> fault in
> tomoyo_socket_sendmsg_permission". Details follows...

Here is untested diff that is expected to be complete.

(1) Handle kthread_create() failure (which avoids "KASAN: null-ptr-deref Write 
in vhci_shutdown_connection")
by grouping socket lookup, SOCK_STREAM check and kthread_get_run() into 
usbip_prepare_threads() function.

(2) Serialize usbip_sockfd_store(), detach_store(), attach_store(), 
usbip_sockfd_store() and
ud->eh_ops.shutdown()/ud->eh_ops.reset()/ud->eh_ops.unusable() operations 
using usbip_store_mutex mutex
(which avoids "general protection fault in 
tomoyo_socket_sendmsg_permission").

(3) Don't reset ud->tcp_socket to NULL in vhci_device_reset(). Since 
tx_thread/rx_thread depends on
ud->tcp_socket != NULL whereas tcp_socket and tx_thread/rx_thread are 
assigned at the same time,
it is never safe to reset only ud->tcp_socket from ud->eh_ops.reset(). And 
actually nobody is
calling ud->eh_ops.reset() without ud->eh_ops.shutdown().

(4) usbip_sockfd_store() must perform {sdev,udc}->ud.status != 
SDEV_ST_AVAILABLE && {sdev,udc}->ud.status = SDEV_ST_USED
exclusively, or multiple tx_thread/rx_thread can be created when 
concurrently called. Although (2) will already
serialize this action, (1) will make it possible to perform within one 
spinlock section.

 drivers/usb/usbip/stub_dev.c | 56 ++--
 drivers/usb/usbip/usbip_common.c | 55 +++
 drivers/usb/usbip/usbip_common.h | 13 
 drivers/usb/usbip/usbip_event.c  |  9 +
 drivers/usb/usbip/vhci_hcd.c |  7 +---
 drivers/usb/usbip/vhci_sysfs.c   | 50 
 drivers/usb/usbip/vudc_sysfs.c   | 50 
 7 files changed, 175 insertions(+), 65 deletions(-)

diff --git a/drivers/usb/usbip/stub_dev.c b/drivers/usb/usbip/stub_dev.c
index 2305d425e6c9..522d58826049 100644
--- a/drivers/usb/usbip/stub_dev.c
+++ b/drivers/usb/usbip/stub_dev.c
@@ -39,12 +39,11 @@ static DEVICE_ATTR_RO(usbip_status);
  * is used to transfer usbip requests by kernel threads. -1 is a magic number
  * by which usbip connection is finished.
  */
-static ssize_t usbip_sockfd_store(struct device *dev, struct device_attribute 
*attr,
-   const char *buf, size_t count)
+static ssize_t __usbip_sockfd_store(struct device *dev, struct 
device_attribute *attr,
+   const char *buf, size_t count)
 {
struct stub_device *sdev = dev_get_drvdata(dev);
int sockfd = 0;
-   struct socket *socket;
int rv;
 
if (!sdev) {
@@ -57,7 +56,12 @@ static ssize_t usbip_sockfd_store(struct device *dev, struct 
device_attribute *a
return -EINVAL;
 
if (sockfd != -1) {
-   int err;
+   struct usbip_thread_info uti;
+   int err = usbip_prepare_threads(, >ud, sockfd,
+   stub_tx_loop, "stub_tx", 
stub_rx_loop, "stub_rx");
+
+   if (err)
+   return err;
 
dev_info(dev, "stub up\n");
 
@@ -65,44 +69,46 @@ static ssize_t usbip_sockfd_store(struct device *dev, 
struct device_attribute *a
 
if (sdev->ud.status != SDEV_ST_AVAILABLE) {
dev_err(dev, "not ready\n");
-   goto err;
+   spin_unlock_irq(>ud.lock);
+   usbip_unprepare_threads();
+   return -EINVAL;
}
 
-   socket = sockfd_lookup(sockfd, );
-   if (!socket)
-   goto err;
-
-   sdev->ud.tcp_socket = socket;
+   sdev->ud.tcp_socket = uti.tcp_socket;
sdev->ud.sockfd = sockfd;
-
-   spin_unlock_irq(>ud.lock);
-
-   sdev->ud.tcp_rx = kthread_get_run(stub_rx_loop, >ud,
- "stub_rx");
-   sdev->ud.tcp_tx = kthread_get_run(stub_tx_loop, >ud,
- "stub_tx");
-
-   spin_lock_irq(&g

Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-12 Thread Shuah Khan

On 2/11/21 9:58 PM, Tetsuo Handa wrote:

On 2021/02/12 11:22, Tetsuo Handa wrote:

On 2021/02/12 10:34, Shuah Khan wrote:

On 2/10/21 6:14 PM, Tetsuo Handa wrote:

(Dropping LSM ML because this is not a TOMOYO's bug.)

On 2021/02/11 4:29, Shuah Khan wrote:

This is a good find. I already replied to the thread to send a complete
fix.


As I said at 
https://lkml.kernel.org/r/f8cae6b1-8f84-0e6a-7d9c-fc4aec68f...@i-love.sakura.ne.jp
 ,
the as-is patch is effectively a complete fix. And applying the as-is patch 
should help spending
syzbot resources for reproducing "general protection fault in 
tomoyo_socket_sendmsg_permission"
with debug printk() patch applied, which in turn will help you in


Right. I would like to get a clear understanding of how this condition
is triggered. I am not saying this isn't a problem. Understanding how
it is triggered helps find the best fix.


part. Therefore, I strongly expect you to apply this version now.



Is there a reproducer for this problem?


There is no reproducer for "general protection fault in 
tomoyo_socket_sendmsg_permission" problem, but
the race condition is explained at 
https://lkml.kernel.org/r/676d4518-0faa-9fab-15db-0db8d216d...@i-love.sakura.ne.jp
 .



Here is a race window widening patch, and I locally reproduced "general protection 
fault in tomoyo_socket_sendmsg_permission".

diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h
index 8be857a4fa13..a7c68097aa1d 100644
--- a/drivers/usb/usbip/usbip_common.h
+++ b/drivers/usb/usbip/usbip_common.h
@@ -286,6 +286,8 @@ struct usbip_device {
if (!IS_ERR(__k)) {\
get_task_struct(__k);  \
wake_up_process(__k);  \
+   } else {   \
+   __k = NULL;\
}  \
__k;   \
  })
diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c
index be37aec250c2..93e1271d0f5d 100644
--- a/drivers/usb/usbip/vhci_sysfs.c
+++ b/drivers/usb/usbip/vhci_sysfs.c
@@ -389,8 +389,12 @@ static ssize_t attach_store(struct device *dev, struct 
device_attribute *attr,
/* end the lock */
  
  	vdev->ud.tcp_rx = kthread_get_run(vhci_rx_loop, >ud, "vhci_rx");

-   vdev->ud.tcp_tx = kthread_get_run(vhci_tx_loop, >ud, "vhci_tx");
+   {
+   struct task_struct *tx = kthread_get_run(vhci_tx_loop, >ud, 
"vhci_tx");
  
+		schedule_timeout_uninterruptible(HZ);

+   vdev->ud.tcp_tx = tx;
+   }
rh_port_connect(vdev, speed);
  
  	return count;




Thank you. This is helpful. I will try to reproduce the problem with
this test code.

thanks,
-- Shuah


Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-11 Thread Tetsuo Handa
On 2021/02/12 11:22, Tetsuo Handa wrote:
> On 2021/02/12 10:34, Shuah Khan wrote:
>> On 2/10/21 6:14 PM, Tetsuo Handa wrote:
>>> (Dropping LSM ML because this is not a TOMOYO's bug.)
>>>
>>> On 2021/02/11 4:29, Shuah Khan wrote:
>>>> This is a good find. I already replied to the thread to send a complete
>>>> fix.
>>>
>>> As I said at 
>>> https://lkml.kernel.org/r/f8cae6b1-8f84-0e6a-7d9c-fc4aec68f...@i-love.sakura.ne.jp
>>>  ,
>>> the as-is patch is effectively a complete fix. And applying the as-is patch 
>>> should help spending
>>> syzbot resources for reproducing "general protection fault in 
>>> tomoyo_socket_sendmsg_permission"
>>> with debug printk() patch applied, which in turn will help you in
>>>
>>>> Right. I would like to get a clear understanding of how this condition
>>>> is triggered. I am not saying this isn't a problem. Understanding how
>>>> it is triggered helps find the best fix.
>>>
>>> part. Therefore, I strongly expect you to apply this version now.
>>>
>>
>> Is there a reproducer for this problem?
> 
> There is no reproducer for "general protection fault in 
> tomoyo_socket_sendmsg_permission" problem, but
> the race condition is explained at 
> https://lkml.kernel.org/r/676d4518-0faa-9fab-15db-0db8d216d...@i-love.sakura.ne.jp
>  .
> 

Here is a race window widening patch, and I locally reproduced "general 
protection fault in tomoyo_socket_sendmsg_permission".

diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h
index 8be857a4fa13..a7c68097aa1d 100644
--- a/drivers/usb/usbip/usbip_common.h
+++ b/drivers/usb/usbip/usbip_common.h
@@ -286,6 +286,8 @@ struct usbip_device {
if (!IS_ERR(__k)) {\
get_task_struct(__k);  \
wake_up_process(__k);  \
+   } else {   \
+   __k = NULL;\
}  \
__k;   \
 })
diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c
index be37aec250c2..93e1271d0f5d 100644
--- a/drivers/usb/usbip/vhci_sysfs.c
+++ b/drivers/usb/usbip/vhci_sysfs.c
@@ -389,8 +389,12 @@ static ssize_t attach_store(struct device *dev, struct 
device_attribute *attr,
/* end the lock */
 
vdev->ud.tcp_rx = kthread_get_run(vhci_rx_loop, >ud, "vhci_rx");
-   vdev->ud.tcp_tx = kthread_get_run(vhci_tx_loop, >ud, "vhci_tx");
+   {
+   struct task_struct *tx = kthread_get_run(vhci_tx_loop, 
>ud, "vhci_tx");
 
+   schedule_timeout_uninterruptible(HZ);
+   vdev->ud.tcp_tx = tx;
+   }
rh_port_connect(vdev, speed);
 
return count;


[  134.880383][ T7879] vhci_hcd vhci_hcd.0: pdev(2) rhport(0) sockfd(6)
[  134.883465][ T7879] vhci_hcd vhci_hcd.0: devid(0) speed(2) 
speed_str(full-speed)
[  134.904750][   T74] vhci_hcd: vhci_device speed not set
[  134.933053][ T7873] vhci_hcd: connection closed
[  134.933212][ T7870] vhci_hcd: connection closed
[  134.933415][ T4488] vhci_hcd: stop threads
[  134.942970][ T4488] vhci_hcd: release socket
[  134.944949][   T14] vhci_hcd: vhci_device speed not set
[  134.947063][ T4488] vhci_hcd: disconnect device
[  134.951508][ T4488] vhci_hcd: stop threads
[  134.953861][ T4488] vhci_hcd: release socket
[  134.957113][ T4488] vhci_hcd: disconnect device
[  135.014433][   T56] usb 15-1: enqueue for inactive port 0
[  135.065119][ T7877] vhci_hcd: connection closed
[  135.065205][ T7880] vhci_hcd: connection closed
[  135.065370][ T4480] vhci_hcd: stop threads
[  135.072854][ T4480] vhci_hcd: release socket
[  135.076490][ T4480] vhci_hcd: disconnect device
[  135.079535][ T4480] vhci_hcd: stop threads
[  135.084024][ T4480] vhci_hcd: release socket
[  135.087979][ T4480] vhci_hcd: disconnect device
[  135.134526][ T6820] vhci_hcd: vhci_device speed not set
[  135.314959][ T6821] vhci_hcd: vhci_device speed not set
[  135.621271][ T7884] vhci_hcd vhci_hcd.0: pdev(4) rhport(0) sockfd(3)
[  135.621290][ T7885] vhci_hcd vhci_hcd.0: pdev(0) rhport(0) sockfd(3)
[  135.625072][ T7884] vhci_hcd vhci_hcd.0: devid(0) speed(5) 
speed_str(super-speed)
[  135.628665][ T7885] vhci_hcd vhci_hcd.0: devid(0) speed(5) 
speed_str(super-speed)
[  135.630109][ T7887] vhci_hcd vhci_hcd.0: pdev(1) rhport(0) sockfd(3)
[  135.630116][ T7887] vhci_hcd vhci_hcd.0: devid(0) speed(5) 
speed_str(super-speed)
[  135.672834][ T7895] vhci_hcd 

Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-11 Thread Tetsuo Handa
On 2021/02/12 10:34, Shuah Khan wrote:
> On 2/10/21 6:14 PM, Tetsuo Handa wrote:
>> (Dropping LSM ML because this is not a TOMOYO's bug.)
>>
>> On 2021/02/11 4:29, Shuah Khan wrote:
>>> This is a good find. I already replied to the thread to send a complete
>>> fix.
>>
>> As I said at 
>> https://lkml.kernel.org/r/f8cae6b1-8f84-0e6a-7d9c-fc4aec68f...@i-love.sakura.ne.jp
>>  ,
>> the as-is patch is effectively a complete fix. And applying the as-is patch 
>> should help spending
>> syzbot resources for reproducing "general protection fault in 
>> tomoyo_socket_sendmsg_permission"
>> with debug printk() patch applied, which in turn will help you in
>>
>>> Right. I would like to get a clear understanding of how this condition
>>> is triggered. I am not saying this isn't a problem. Understanding how
>>> it is triggered helps find the best fix.
>>
>> part. Therefore, I strongly expect you to apply this version now.
>>
> 
> Is there a reproducer for this problem?

There is no reproducer for "general protection fault in 
tomoyo_socket_sendmsg_permission" problem, but
the race condition is explained at 
https://lkml.kernel.org/r/676d4518-0faa-9fab-15db-0db8d216d...@i-love.sakura.ne.jp
 .


Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-11 Thread Shuah Khan

On 2/10/21 6:14 PM, Tetsuo Handa wrote:

(Dropping LSM ML because this is not a TOMOYO's bug.)

On 2021/02/11 4:29, Shuah Khan wrote:

This is a good find. I already replied to the thread to send a complete
fix.


As I said at 
https://lkml.kernel.org/r/f8cae6b1-8f84-0e6a-7d9c-fc4aec68f...@i-love.sakura.ne.jp
 ,
the as-is patch is effectively a complete fix. And applying the as-is patch 
should help spending
syzbot resources for reproducing "general protection fault in 
tomoyo_socket_sendmsg_permission"
with debug printk() patch applied, which in turn will help you in


Right. I would like to get a clear understanding of how this condition
is triggered. I am not saying this isn't a problem. Understanding how
it is triggered helps find the best fix.


part. Therefore, I strongly expect you to apply this version now.



Is there a reproducer for this problem?

thanks,
-- Shuah


Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-10 Thread Tetsuo Handa
(Dropping LSM ML because this is not a TOMOYO's bug.)

On 2021/02/11 4:29, Shuah Khan wrote:
> This is a good find. I already replied to the thread to send a complete
> fix.

As I said at 
https://lkml.kernel.org/r/f8cae6b1-8f84-0e6a-7d9c-fc4aec68f...@i-love.sakura.ne.jp
 ,
the as-is patch is effectively a complete fix. And applying the as-is patch 
should help spending
syzbot resources for reproducing "general protection fault in 
tomoyo_socket_sendmsg_permission"
with debug printk() patch applied, which in turn will help you in

> Right. I would like to get a clear understanding of how this condition
> is triggered. I am not saying this isn't a problem. Understanding how
> it is triggered helps find the best fix.

part. Therefore, I strongly expect you to apply this version now.



Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-10 Thread Shuah Khan

On 2/10/21 12:07 PM, Tetsuo Handa wrote:

On 2021/02/11 3:17, Shuah Khan wrote:

I am looking to understand the syzbot configuration and a reproducer
to be able to debug and fix the problem. How is syzbot triggering the
vhci_hcd attach and detach sequence?


I don't know. I'm waiting for syzbot to reproduce the problem on linux-next
with 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/usb/usbip?id=f1bdf414e7dd0cbc26460425719fc3ea479947a2
 .



This helps me determine all these fix suggestions that are coming in
are fixes or papering over a real problem.


What are these fix suggestions?

"general protection fault in tomoyo_socket_sendmsg_permission" is a NULL pointer
dereference which can happen if vhci_device_reset() and/or vhci_device_init()
(which does vdev->ud.tcp_socket = NULL;) were unexpectedly called. There is no
reproducer, and (as far as I know) no fix suggestion.


Right. I would like to get a clear understanding of how this condition
is triggered. I am not saying this isn't a problem. Understanding how
it is triggered helps find the best fix.


"KASAN: null-ptr-deref Write in vhci_shutdown_connection" is an ERR_PTR(-EINTR)
pointer dereference which can happen if kthread_create() was SIGKILLed. There is
a reproducer, and 
https://lkml.kernel.org/r/20210205135707.4574-1-penguin-ker...@i-love.sakura.ne.jp
is a fix suggestion.



This is a good find. I already replied to the thread to send a complete
fix.

thanks,
-- Shuah


Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-10 Thread Tetsuo Handa
On 2021/02/11 3:17, Shuah Khan wrote:
> I am looking to understand the syzbot configuration and a reproducer
> to be able to debug and fix the problem. How is syzbot triggering the
> vhci_hcd attach and detach sequence?

I don't know. I'm waiting for syzbot to reproduce the problem on linux-next
with 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/usb/usbip?id=f1bdf414e7dd0cbc26460425719fc3ea479947a2
 .

> 
> This helps me determine all these fix suggestions that are coming in
> are fixes or papering over a real problem.

What are these fix suggestions?

"general protection fault in tomoyo_socket_sendmsg_permission" is a NULL pointer
dereference which can happen if vhci_device_reset() and/or vhci_device_init()
(which does vdev->ud.tcp_socket = NULL;) were unexpectedly called. There is no
reproducer, and (as far as I know) no fix suggestion.

"KASAN: null-ptr-deref Write in vhci_shutdown_connection" is an ERR_PTR(-EINTR)
pointer dereference which can happen if kthread_create() was SIGKILLed. There is
a reproducer, and 
https://lkml.kernel.org/r/20210205135707.4574-1-penguin-ker...@i-love.sakura.ne.jp
is a fix suggestion.



Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-02-10 Thread Shuah Khan

On 1/29/21 7:25 PM, Tetsuo Handa wrote:

On 2021/01/30 6:18, Shuah Khan wrote:

In this console log:


It seems "this console log" refers to 
https://syzkaller.appspot.com/x/log.txt?x=1045303450 .



06:57:50 executing program 1:
socketpair$tipc(0x1e, 0x2, 0x0, &(0x7fc0)={0x})
sendmsg$BATADV_CMD_GET_TRANSTABLE_LOCAL(r0, &(0x7f0002c0)={&(0x7f0001c0), 
0xc, &(0x7f000280)={0x0, 0xd0010100}}, 0x0)

[ 1151.090883][T23361] vhci_hcd vhci_hcd.0: pdev(4) rhport(0) sockfd(4)
[ 1151.097445][T23361] vhci_hcd vhci_hcd.0: devid(0) speed(1) 
speed_str(low-speed)
06:57:50 executing program 0:
r0 = syz_open_dev$binderN(&(0x7f000680)='/dev/binder#\x00', 0x0, 0x0)
ioctl$BINDER_WRITE_READ(r0, 0xc0306201, &(0x7f000cc0)={0x88, 0x0, 
&(0x7f000b80)=[@transaction={0x40406300, {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 
0x0, 0x0, 0x0, 0x0}}, @transaction={0x40406300, {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 
0x0, 0x0, 0x0, 0x0}}], 0x0, 0x0, 0x0})

[ 1151.164402][T23363] vhci_hcd: connection closed
[ 1151.167346][  T240] vhci_hcd: stop threads


[ 1151.178329][T26761] usb 17-1: new low-speed USB device number 2 using 
vhci_hcd


SK: Looking at the console log, it looks like while connection is being
     torn down,


Excuse me, but it looks like (what comes here) while connection is being torn 
down ?
I'm not familiar with driver code.




[ 1151.181245][  T240] vhci_hcd: release socket


Can you share your your test code for this program:
"executing program 1"


I don't think program 1 is relevant. I think program 4

   06:57:50 executing program 4:
   r0 = socket$tipc(0x1e, 0x2, 0x0)
   syz_usbip_server_init(0x1)
   close_range(r0, 0x, 0x0)

which calls syz_usbip_server_init() as with other duplicates is relevant.



Also your setup? Do you run usbip_host and vhci_hcd both?


Who are you referring to with "you/your" ? I'm not running syzkaller in my setup
and I don't have test code.

I'm just proposing printing more messages in order to confirm the ordering of
events and member values in structures.



I am looking to understand the syzbot configuration and a reproducer
to be able to debug and fix the problem. How is syzbot triggering the
vhci_hcd attach and detach sequence?

This helps me determine all these fix suggestions that are coming in
are fixes or papering over a real problem.

thanks,
-- Shuah


Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-01-30 Thread Tetsuo Handa
On 2021/01/30 6:18, Shuah Khan wrote:
> In this console log:

It seems "this console log" refers to 
https://syzkaller.appspot.com/x/log.txt?x=1045303450 .

> 
> 06:57:50 executing program 1:
> socketpair$tipc(0x1e, 0x2, 0x0, &(0x7fc0)={0x})
> sendmsg$BATADV_CMD_GET_TRANSTABLE_LOCAL(r0, 
> &(0x7f0002c0)={&(0x7f0001c0), 0xc, &(0x7f000280)={0x0, 
> 0xd0010100}}, 0x0)
> 
> [ 1151.090883][T23361] vhci_hcd vhci_hcd.0: pdev(4) rhport(0) sockfd(4)
> [ 1151.097445][T23361] vhci_hcd vhci_hcd.0: devid(0) speed(1) 
> speed_str(low-speed)
> 06:57:50 executing program 0:
> r0 = syz_open_dev$binderN(&(0x7f000680)='/dev/binder#\x00', 0x0, 0x0)
> ioctl$BINDER_WRITE_READ(r0, 0xc0306201, &(0x7f000cc0)={0x88, 0x0, 
> &(0x7f000b80)=[@transaction={0x40406300, {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 
> 0x0, 0x0, 0x0, 0x0, 0x0}}, @transaction={0x40406300, {0x0, 0x0, 0x0, 0x0, 
> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}], 0x0, 0x0, 0x0})
> 
> [ 1151.164402][T23363] vhci_hcd: connection closed
> [ 1151.167346][  T240] vhci_hcd: stop threads
> 
> 
> [ 1151.178329][T26761] usb 17-1: new low-speed USB device number 2 using 
> vhci_hcd
> 
> 
> SK: Looking at the console log, it looks like while connection is being
>     torn down,

Excuse me, but it looks like (what comes here) while connection is being torn 
down ?
I'm not familiar with driver code.

> 
> 
> [ 1151.181245][  T240] vhci_hcd: release socket
> 
> 
> Can you share your your test code for this program:
> "executing program 1"

I don't think program 1 is relevant. I think program 4

  06:57:50 executing program 4:
  r0 = socket$tipc(0x1e, 0x2, 0x0)
  syz_usbip_server_init(0x1)
  close_range(r0, 0x, 0x0)

which calls syz_usbip_server_init() as with other duplicates is relevant.

> 
> Also your setup? Do you run usbip_host and vhci_hcd both?

Who are you referring to with "you/your" ? I'm not running syzkaller in my setup
and I don't have test code.

I'm just proposing printing more messages in order to confirm the ordering of
events and member values in structures.


Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-01-29 Thread Shuah Khan

On 1/29/21 10:08 AM, Tetsuo Handa wrote:

On 2021/01/30 1:05, Shuah Khan wrote:

Since "general protection fault in tomoyo_socket_sendmsg_permission" is caused 
by
unexpectedly resetting ud->tcp_socket to NULL without waiting for tx thread to
terminate, tracing the ordering of events is worth knowing. Even adding
schedule_timeout_uninterruptible() to before kernel_sendmsg() might help.



What about the duplicate bug information that was in my email?
Did you take a look at that?


I was not aware of the duplicate bugs. It is interesting that
"KASAN: null-ptr-deref Write in event_handler" says that vdev->ud.tcp_tx became 
NULL at

if (vdev->ud.tcp_tx) {

/* this location */

kthread_stop_put(vdev->ud.tcp_tx);
vdev->ud.tcp_tx = NULL;
}

which means that somebody else is unexpectedly resetting vdev->ud.tcp_tx to 
NULL.

If memset() from vhci_device_init() from vhci_start() were unexpectedly called,
all of tcp_socket, tcp_rx, tcp_tx etc. becomes NULL which can explain these 
bugs ?
I'm inclined to report not only tcp_socket but also other fields when 
kernel_sendmsg()
detected that tcp_socket is NULL...



In this console log:

06:57:50 executing program 1:
socketpair$tipc(0x1e, 0x2, 0x0, &(0x7fc0)={0x})
sendmsg$BATADV_CMD_GET_TRANSTABLE_LOCAL(r0, 
&(0x7f0002c0)={&(0x7f0001c0), 0xc, &(0x7f000280)={0x0, 
0xd0010100}}, 0x0)


[ 1151.090883][T23361] vhci_hcd vhci_hcd.0: pdev(4) rhport(0) sockfd(4)
[ 1151.097445][T23361] vhci_hcd vhci_hcd.0: devid(0) speed(1) 
speed_str(low-speed)

06:57:50 executing program 0:
r0 = syz_open_dev$binderN(&(0x7f000680)='/dev/binder#\x00', 0x0, 0x0)
ioctl$BINDER_WRITE_READ(r0, 0xc0306201, &(0x7f000cc0)={0x88, 0x0, 
&(0x7f000b80)=[@transaction={0x40406300, {0x2, 0x0, 0x0, 0x0, 0x0, 
0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, @transaction={0x40406300, {0x0, 0x0, 
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}], 0x0, 0x0, 0x0})


[ 1151.164402][T23363] vhci_hcd: connection closed
[ 1151.167346][  T240] vhci_hcd: stop threads


[ 1151.178329][T26761] usb 17-1: new low-speed USB device number 2 using 
vhci_hcd



SK: Looking at the console log, it looks like while connection is being
torn down,


[ 1151.181245][  T240] vhci_hcd: release socket


Can you share your your test code for this program:
"executing program 1"

Also your setup? Do you run usbip_host and vhci_hcd both?

thanks,
-- Shuah




Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-01-29 Thread Tetsuo Handa
On 2021/01/30 1:05, Shuah Khan wrote:
>> Since "general protection fault in tomoyo_socket_sendmsg_permission" is 
>> caused by
>> unexpectedly resetting ud->tcp_socket to NULL without waiting for tx thread 
>> to
>> terminate, tracing the ordering of events is worth knowing. Even adding
>> schedule_timeout_uninterruptible() to before kernel_sendmsg() might help.
>>
> 
> What about the duplicate bug information that was in my email?
> Did you take a look at that?

I was not aware of the duplicate bugs. It is interesting that
"KASAN: null-ptr-deref Write in event_handler" says that vdev->ud.tcp_tx became 
NULL at

if (vdev->ud.tcp_tx) {

/* this location */

kthread_stop_put(vdev->ud.tcp_tx);
vdev->ud.tcp_tx = NULL;
}

which means that somebody else is unexpectedly resetting vdev->ud.tcp_tx to 
NULL.

If memset() from vhci_device_init() from vhci_start() were unexpectedly called,
all of tcp_socket, tcp_rx, tcp_tx etc. becomes NULL which can explain these 
bugs ?
I'm inclined to report not only tcp_socket but also other fields when 
kernel_sendmsg()
detected that tcp_socket is NULL...


Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-01-28 Thread Tetsuo Handa
On 2021/01/29 4:05, Shuah Khan wrote:
> The reason I don't like adding printk's is this is a race condition
> and as a result time sensitive. Adding printks in the path will not
> help debug this issue. It will make it harder to reproduce the problem.

Not always. Adding printk() might make it easier to reproduce the problem.

> 
> I am unable to reproduce the problem using the reproducer and running 
> multiple instances of the reproducer.

Since syzkaller cannot find a reproducer for "general protection fault in
tomoyo_socket_sendmsg_permission", and you cannot reproduce other problem
using reproducer, trying to obtain some clue via printing messages by asking
syzkaller to try debug patch can be very helpful.

Since "general protection fault in tomoyo_socket_sendmsg_permission" is caused 
by
unexpectedly resetting ud->tcp_socket to NULL without waiting for tx thread to
terminate, tracing the ordering of events is worth knowing. Even adding
schedule_timeout_uninterruptible() to before kernel_sendmsg() might help.



Re: general protection fault in tomoyo_socket_sendmsg_permission

2021-01-27 Thread Tetsuo Handa
On 2020/11/14 2:14, Shuah Khan wrote:
> On 11/13/20 5:00 AM, Hillf Danton wrote:
>> Thu, 12 Nov 2020 23:21:26 -0800
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:    9dbc1c03 Merge tag 'xfs-5.10-fixes-3' of git://git.kernel...
>>> git tree:   upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1045303450
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=1735b7978b1c3721
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=95ce4b142579611ef0a9
>>> compiler:   gcc (GCC) 10.1.0-syz 20200507
>>> userspace arch: i386
>>>
>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>
> 
> I would like to see the reproducer for this. I just can't reproduce
> this problem.
> 
>>
>> Fix 96c2737716d5 ("usbip: move usbip kernel code out of staging")
>> by closing the race between readers and writer of ud->tcp_socket on
>> sock shutdown. To do that, add waitqueue for usbip device and make
>> writer wait for all readers to go home before releasing the socket.

Worrysome part for me is vhci_device_reset() which resets ud->tcp_socket to NULL
without waiting for tx thread to terminate, though I don't know if
vhci_device_reset() can be called while tx thread is running.

I'd like to try below debug printk() patch on linx-next tree, for this bug is
reported on linux.git and linux-next.git trees. Which git tree can be used for
sending this to-be-removed patch to linux-next.git ? I wish there is a kernel
config for fuzzers in linux.git so that every git tree can carry debug printk()
patch for syzbot's reports...

Subject: [PATCH] usb: usbip: vhci_hcd: add printk() for debug

This is linux-next only patch for examining a bug reported at
https://syzkaller.appspot.com/bug?id=3e1d941a31361efc4ced2ba8b4af2044d4e43c59 .

Signed-off-by: Tetsuo Handa 
---
 drivers/usb/usbip/stub_dev.c   |  6 ++
 drivers/usb/usbip/vhci_hcd.c   | 11 +++
 drivers/usb/usbip/vhci_sysfs.c |  4 
 drivers/usb/usbip/vhci_tx.c| 12 
 drivers/usb/usbip/vudc_dev.c   |  6 ++
 5 files changed, 39 insertions(+)

diff --git a/drivers/usb/usbip/stub_dev.c b/drivers/usb/usbip/stub_dev.c
index 2305d425e6c9..627f83c0ebc8 100644
--- a/drivers/usb/usbip/stub_dev.c
+++ b/drivers/usb/usbip/stub_dev.c
@@ -131,10 +131,14 @@ static void stub_shutdown_connection(struct usbip_device 
*ud)
 
/* 1. stop threads */
if (ud->tcp_rx) {
+   if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT))
+   pr_info("%s: stop rx %p\n", __func__, ud->tcp_rx);
kthread_stop_put(ud->tcp_rx);
ud->tcp_rx = NULL;
}
if (ud->tcp_tx) {
+   if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT))
+   pr_info("%s: stop tx %p\n", __func__, ud->tcp_tx);
kthread_stop_put(ud->tcp_tx);
ud->tcp_tx = NULL;
}
@@ -146,6 +150,8 @@ static void stub_shutdown_connection(struct usbip_device 
*ud)
 * not touch NULL socket.
 */
if (ud->tcp_socket) {
+   if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT))
+   pr_info("%s: close sock %p\n", __func__, 
ud->tcp_socket);
sockfd_put(ud->tcp_socket);
ud->tcp_socket = NULL;
ud->sockfd = -1;
diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
index 3209b5ddd30c..9e95bf9330f5 100644
--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -1016,10 +1016,14 @@ static void vhci_shutdown_connection(struct 
usbip_device *ud)
 
/* kill threads related to this sdev */
if (vdev->ud.tcp_rx) {
+   if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT))
+   pr_info("%s: stop rx %p\n", __func__, vdev->ud.tcp_rx);
kthread_stop_put(vdev->ud.tcp_rx);
vdev->ud.tcp_rx = NULL;
}
if (vdev->ud.tcp_tx) {
+   if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT))
+   pr_info("%s: stop tx %p\n", __func__, vdev->ud.tcp_tx);
kthread_stop_put(vdev->ud.tcp_tx);
vdev->ud.tcp_tx = NULL;
}
@@ -1027,6 +1031,8 @@ static void vhci_shutdown_connection(struct usbip_device 
*ud)
 
/* active connection is closed */
if (vdev->ud.tcp_socket) {
+   if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT))
+   pr_info("%s: close sock %p\n", __func__, 
ud->tcp_socket);
sockfd_put(vdev->ud.tcp_socket);
vdev->ud.tcp_socket = NULL;
vdev->ud.sockfd = -1;
@@ -1074,6 +1080,11 @@ static void vhci_device_reset(struct usbip_device *ud)
vdev->udev = NULL;
 
if (ud->tcp_socket) {
+   if (IS_BUILTIN(CONFIG_DEBUG_AID_FOR_SYZBOT)) {
+   pr_info("%s: close sock %p\n", __func__, 
ud->tcp_socket);
+   BUG_ON(vdev->ud.tcp_tx);
+   

Re: general protection fault in tomoyo_socket_sendmsg_permission

2020-11-13 Thread Shuah Khan

On 11/13/20 5:00 AM, Hillf Danton wrote:

Thu, 12 Nov 2020 23:21:26 -0800

syzbot found the following issue on:

HEAD commit:9dbc1c03 Merge tag 'xfs-5.10-fixes-3' of git://git.kernel...
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1045303450
kernel config:  https://syzkaller.appspot.com/x/.config?x=1735b7978b1c3721
dashboard link: https://syzkaller.appspot.com/bug?extid=95ce4b142579611ef0a9
compiler:   gcc (GCC) 10.1.0-syz 20200507
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.



I would like to see the reproducer for this. I just can't reproduce
this problem.



Fix 96c2737716d5 ("usbip: move usbip kernel code out of staging")
by closing the race between readers and writer of ud->tcp_socket on
sock shutdown. To do that, add waitqueue for usbip device and make
writer wait for all readers to go home before releasing the socket.


--- a/drivers/usb/usbip/usbip_common.h
+++ b/drivers/usb/usbip/usbip_common.h
@@ -277,6 +277,9 @@ struct usbip_device {
void (*reset)(struct usbip_device *);
void (*unusable)(struct usbip_device *);
} eh_ops;
+
+   atomic_t sk_shutdown_count;
+   wait_queue_head_t sk_shutdown_waitq;
  };
  
  #define kthread_get_run(threadfn, data, namefmt, ...)			   \

--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -1008,15 +1008,20 @@ static void vhci_shutdown_connection(str
  
  	/* kill threads related to this sdev */

if (vdev->ud.tcp_rx) {
+   atomic_inc(>sk_shutdown_count);
kthread_stop_put(vdev->ud.tcp_rx);
vdev->ud.tcp_rx = NULL;
}
if (vdev->ud.tcp_tx) {
+   atomic_inc(>sk_shutdown_count);
kthread_stop_put(vdev->ud.tcp_tx);
vdev->ud.tcp_tx = NULL;
}
pr_info("stop threads\n");
  
+	wait_event(ud->sk_shutdown_waitq,

+   !atomic_read(>sk_shutdown_count));
+
/* active connection is closed */
if (vdev->ud.tcp_socket) {
sockfd_put(vdev->ud.tcp_socket);
@@ -1104,6 +1109,8 @@ static void vhci_device_init(struct vhci
vdev->ud.eh_ops.reset = vhci_device_reset;
vdev->ud.eh_ops.unusable = vhci_device_unusable;
  
+	init_waitqueue_head(>ud.sk_shutdown_waitq);

+
usbip_start_eh(>ud);
  }
  
--- a/drivers/usb/usbip/vhci_rx.c

+++ b/drivers/usb/usbip/vhci_rx.c
@@ -264,5 +264,8 @@ int vhci_rx_loop(void *data)
vhci_rx_pdu(ud);
}
  
+	if (atomic_dec_and_test(>sk_shutdown_count))

+   wake_up(>sk_shutdown_waitq);
+
return 0;
  }
--- a/drivers/usb/usbip/vhci_tx.c
+++ b/drivers/usb/usbip/vhci_tx.c
@@ -252,5 +252,8 @@ int vhci_tx_loop(void *data)
usbip_dbg_vhci_tx("pending urbs ?, now wake up\n");
}
  
+	if (atomic_dec_and_test(>sk_shutdown_count))

+   wake_up(>sk_shutdown_waitq);
+
return 0;
  }




Without a reproducer, it is hard to verify the fix.

thanks,
-- Shuah


general protection fault in tomoyo_socket_sendmsg_permission

2020-11-12 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:9dbc1c03 Merge tag 'xfs-5.10-fixes-3' of git://git.kernel...
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1045303450
kernel config:  https://syzkaller.appspot.com/x/.config?x=1735b7978b1c3721
dashboard link: https://syzkaller.appspot.com/bug?extid=95ce4b142579611ef0a9
compiler:   gcc (GCC) 10.1.0-syz 20200507
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+95ce4b142579611ef...@syzkaller.appspotmail.com

general protection fault, probably for non-canonical address 
0xdc03:  [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0018-0x001f]
CPU: 0 PID: 23364 Comm: vhci_tx Not tainted 5.10.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:tomoyo_socket_sendmsg_permission+0x80/0x340 
security/tomoyo/network.c:762
Code: c7 40 0c f3 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 84 24 a0 00 00 00 
31 c0 e8 0b 70 f1 fd 48 8d 7d 18 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00 0f 85 7d 
02 00 00 48 b8 00 00 00 00 00 fc ff df 4c
RSP: 0018:c90016f77ad0 EFLAGS: 00010206
RAX: 0003 RBX: 192002deef5b RCX: 83b3789e
RDX: 8880186c8000 RSI: 837e9945 RDI: 0018
RBP:  R08: 0030 R09: c90016f77d97
R10:  R11:  R12: dc00
R13: c90016f77ce8 R14: c90016f77ce8 R15: 
FS:  () GS:8880b9e0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 562623b70868 CR3: 2af19000 CR4: 001526f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 security_socket_sendmsg+0x50/0xb0 security/security.c:2140
 sock_sendmsg+0x3f/0x120 net/socket.c:668
 vhci_send_cmd_submit+0x897/0xf90 drivers/usb/usbip/vhci_tx.c:139
 vhci_tx_loop+0xfc/0x370 drivers/usb/usbip/vhci_tx.c:241
 kthread+0x3af/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
Modules linked in:
---[ end trace 535369ec3e6ef304 ]---
RIP: 0010:tomoyo_socket_sendmsg_permission+0x80/0x340 
security/tomoyo/network.c:762
Code: c7 40 0c f3 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 84 24 a0 00 00 00 
31 c0 e8 0b 70 f1 fd 48 8d 7d 18 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00 0f 85 7d 
02 00 00 48 b8 00 00 00 00 00 fc ff df 4c
RSP: 0018:c90016f77ad0 EFLAGS: 00010206
RAX: 0003 RBX: 192002deef5b RCX: 83b3789e
RDX: 8880186c8000 RSI: 837e9945 RDI: 0018
RBP:  R08: 0030 R09: c90016f77d97
R10:  R11:  R12: dc00
R13: c90016f77ce8 R14: c90016f77ce8 R15: 
FS:  () GS:8880b9e0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f0c3a930740 CR3: 2af19000 CR4: 001526f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.