Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-10 Thread Tetsuo Handa
On 2021/03/10 11:07, Shuah Khan wrote:
> On 3/9/21 6:02 PM, Tetsuo Handa wrote:
>> On 2021/03/10 9:29, Shuah Khan wrote:
 It is not a large grain lock. Since event_handler() is exclusively 
 executed, this lock
 does _NOT_ block event_handler() unless attach/detach operations run 
 concurrently.

>>>
>>> event handler queues the events. It shouldn't be blocked by attach
>>> and detach. The events could originate for various reasons during
>>> the host and vhci operations. I don't like using this lock for
>>> attach and detach.
>>
>> How can attach/detach deadlock event_handler()?
>> event_handler() calls e.g. vhci_shutdown_connection() via 
>> ud->eh_ops.shutdown(ud).
>> vhci_shutdown_connection() e.g. waits for termination of tx/rx threads via 
>> kthread_stop_put().
>> event_handler() is already blocked by detach operation.
>> How it can make situation worse to wait for creation of tx/rx threads in 
>> attach operation?
>>
> 
> event_lock shouldn't be held during event ops. usbip_event_add()
> uses it to add events. Protecting shutdown path needs a different
> approach.

I can't understand what you are talking about.

event_lock is defined as

  static DEFINE_SPINLOCK(event_lock);

in drivers/usb/usbip/usbip_event.c and usbip_event_add() uses it like

  spin_lock_irqsave(&event_lock, flags);
  spin_unlock_irqrestore(&event_lock, flags);

, but so what? I know event_lock spinlock cannot be taken when calling

  ud->eh_ops.shutdown(ud);
  ud->eh_ops.reset(ud);
  ud->eh_ops.unusable(ud);

in event_handler() because e.g. vhci_shutdown_connection() can sleep.

What my [PATCH v4 01/12] is suggesting is usbip_event_mutex which is defined as

  static DEFINE_MUTEX(usbip_event_mutex);

in drivers/usb/usbip/usbip_event.c and held when calling

  ud->eh_ops.shutdown(ud);
  ud->eh_ops.reset(ud);
  ud->eh_ops.unusable(ud);

in event_handler(). Since event_handler() is executed exclusively via
singlethreaded workqueue, "event_handler() holds usbip_event_mutex" itself
does not introduce a new lock dependency.

My question is, how holding usbip_event_mutex can cause deadlock if
usbip_sockfd_store()/attach_store()/detach_store() also hold usbip_event_mutex .

kthread_create() is essentially several kmalloc(GFP_KERNEL) where 
event_handler()
cannot interfere unless event_handler() serves as a memory reclaiming function.
My question again. What functions might sleep and hold locks other than
kthread_create() for tx/rx threads?

Your answer to my question (if you identified such dependency) will go into
patch shown bottom which replaces my [PATCH v4 01/12]-[PATCH v4 04/12] patches.

> 
> In any case, do you have comments on this patch which doesn't even
> touch vhci driver?

I'm just replying to your [PATCH 4/6] because all your [PATCH 4/6]-[PATCH 6/6]
patches have the same oversight.

> 
> I understand you are identifying additional race condition that
> the vhci patches in this series might not fix. That doesn't mean
> that these patches aren't valid.
> 
> Do you have any comments specific to the patches in this series?

Your [PATCH 5/6] is still racy regarding rh_port_connect() versus 
rh_port_disconnect().
As soon as you call wake_up_process(), rh_port_disconnect() can be called before
rh_port_connect() is called. Since you don't like serializing event_handler() 
and
usbip_sockfd_store()/attach_store()/detach_store() using usbip_event_mutex, my
patch shown below which uses attach_detach_lock mutex cannot close such race 
window.
Ideally, wake_up_process() should be deferred to after releasing 
attach_detach_lock,
but please answer to my question first.



From: Tetsuo Handa 
Date: Wed, 10 Mar 2021 18:31:54 +0900

syzbot is reporting a NULL pointer dereference at sock_sendmsg() [1], for
lack of serialization between attach_store() and event_handler() causes
vhci_shutdown_connection() to observe vdev->ud.tcp_tx == NULL while
vdev->ud.tcp_socket != NULL. Please read the reference link for details of
this race window.

While usbip module exclusively runs event_handler(), usbip module has never
thought the possibility that multiple userspace processes can concurrently
call usbip_sockfd_store()/attach_store()/detach_store(). As a result, there
is a TOCTTOU bug in these functions regarding ud->status value which can
result in similar crashes like [1].

Simplest way would be to run all of
event_handler()/usbip_sockfd_store()/attach_store()/detach_store() functions
exclusively using a global mutex. But Shuah Khan does not want to share a
global mutex between these functions, for ...[please fill in this part]... .

Therefore, this patch introduces a per-submodule local mutex which closes
race window within usbip_sockfd_store() and attach_store()/detach_store().

This local mutex cannot close race window between event_handler()
and usbip_sockfd_store()/attach_store()/detach_store(), for calling
wake_up_process() allows kernel thread to call
usbip_event_add(VDEV_EVENT_DOWN) and event_handler() will start
detach 

Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-09 Thread Shuah Khan

On 3/9/21 6:02 PM, Tetsuo Handa wrote:

On 2021/03/10 9:29, Shuah Khan wrote:

It is not a large grain lock. Since event_handler() is exclusively executed, 
this lock
does _NOT_ block event_handler() unless attach/detach operations run 
concurrently.





event handler queues the events. It shouldn't be blocked by attach
and detach. The events could originate for various reasons during
the host and vhci operations. I don't like using this lock for
attach and detach.


How can attach/detach deadlock event_handler()?
event_handler() calls e.g. vhci_shutdown_connection() via 
ud->eh_ops.shutdown(ud).
vhci_shutdown_connection() e.g. waits for termination of tx/rx threads via 
kthread_stop_put().
event_handler() is already blocked by detach operation.
How it can make situation worse to wait for creation of tx/rx threads in attach 
operation?



event_lock shouldn't be held during event ops. usbip_event_add()
uses it to add events. Protecting shutdown path needs a different
approach.

In any case, do you have comments on this patch which doesn't even
touch vhci driver?

I understand you are identifying additional race condition that
the vhci patches in this series might not fix. That doesn't mean
that these patches aren't valid.

Do you have any comments specific to the patches in this series?

thanks,
-- Shuah





Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-09 Thread Tetsuo Handa
On 2021/03/10 9:29, Shuah Khan wrote:
>> It is not a large grain lock. Since event_handler() is exclusively executed, 
>> this lock
>> does _NOT_ block event_handler() unless attach/detach operations run 
>> concurrently.
>>
>>>
> 
> event handler queues the events. It shouldn't be blocked by attach
> and detach. The events could originate for various reasons during
> the host and vhci operations. I don't like using this lock for
> attach and detach.

How can attach/detach deadlock event_handler()?
event_handler() calls e.g. vhci_shutdown_connection() via 
ud->eh_ops.shutdown(ud).
vhci_shutdown_connection() e.g. waits for termination of tx/rx threads via 
kthread_stop_put().
event_handler() is already blocked by detach operation.
How it can make situation worse to wait for creation of tx/rx threads in attach 
operation?



Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-09 Thread Shuah Khan

On 3/9/21 5:03 PM, Tetsuo Handa wrote:

On 2021/03/10 8:52, Shuah Khan wrote:

On 3/9/21 4:40 PM, Tetsuo Handa wrote:

On 2021/03/10 4:50, Shuah Khan wrote:

On 3/9/21 4:04 AM, Tetsuo Handa wrote:

On 2021/03/09 1:27, Shuah Khan wrote:

Yes. We might need synchronization between events, threads, and shutdown
in usbip_host side and in connection polling and threads in vhci.

I am also looking at the shutdown sequences closely as well since the
local state is referenced without usbip_device lock in these paths.

I am approaching these problems as peeling the onion an expression so
we can limit the changes and take a spot fix approach. We have the
goal to address these crashes and not introduce regressions.


I think my [PATCH v4 01/12]-[PATCH v4 06/12] simplify your further changes
without introducing regressions. While ud->lock is held when checking 
ud->status,
current attach/detach code is racy about read/update of ud->status . I think we
can close race in attach/detach code via a simple usbip_event_mutex 
serialization.



Do you mean patches 1,2,3,3,4,5,6?


Yes, my 1,2,3,4,5,6.

Since you think that usbip_prepare_threads() does not worth introducing, I'm 
fine with
replacing my 7,8,9,10,11,12 with your "[PATCH 0/6] usbip fixes to crashes found by 
syzbot".



Using event lock isn't the right approach to solve the race. It is a
large grain lock. I am not looking to replace patches.


It is not a large grain lock. Since event_handler() is exclusively executed, 
this lock
does _NOT_ block event_handler() unless attach/detach operations run 
concurrently.





event handler queues the events. It shouldn't be blocked by attach
and detach. The events could originate for various reasons during
the host and vhci operations. I don't like using this lock for
attach and detach.


I still haven't seen any response from you about if you were able to
verify the fixes I sent in fix the problem you are seeing.
 > I won't be able to verify your fixes, for it is syzbot who is seeing 

the problem.

Thank you for letting me know.

thanks,
-- Shuah



Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-09 Thread Tetsuo Handa
On 2021/03/10 8:52, Shuah Khan wrote:
> On 3/9/21 4:40 PM, Tetsuo Handa wrote:
>> On 2021/03/10 4:50, Shuah Khan wrote:
>>> On 3/9/21 4:04 AM, Tetsuo Handa wrote:
 On 2021/03/09 1:27, Shuah Khan wrote:
> Yes. We might need synchronization between events, threads, and shutdown
> in usbip_host side and in connection polling and threads in vhci.
>
> I am also looking at the shutdown sequences closely as well since the
> local state is referenced without usbip_device lock in these paths.
>
> I am approaching these problems as peeling the onion an expression so
> we can limit the changes and take a spot fix approach. We have the
> goal to address these crashes and not introduce regressions.

 I think my [PATCH v4 01/12]-[PATCH v4 06/12] simplify your further changes
 without introducing regressions. While ud->lock is held when checking 
 ud->status,
 current attach/detach code is racy about read/update of ud->status . I 
 think we
 can close race in attach/detach code via a simple usbip_event_mutex 
 serialization.

>>>
>>> Do you mean patches 1,2,3,3,4,5,6?
>>
>> Yes, my 1,2,3,4,5,6.
>>
>> Since you think that usbip_prepare_threads() does not worth introducing, I'm 
>> fine with
>> replacing my 7,8,9,10,11,12 with your "[PATCH 0/6] usbip fixes to crashes 
>> found by syzbot".
>>
> 
> Using event lock isn't the right approach to solve the race. It is a
> large grain lock. I am not looking to replace patches.

It is not a large grain lock. Since event_handler() is exclusively executed, 
this lock
does _NOT_ block event_handler() unless attach/detach operations run 
concurrently.

> 
> I still haven't seen any response from you about if you were able to
> verify the fixes I sent in fix the problem you are seeing.

I won't be able to verify your fixes, for it is syzbot who is seeing the 
problem.
But I can see that your patch description is wrong because you are ignoring 
what I'm commenting.

Global serialization had better come first. Your patch description depends on 
global serialization.


Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-09 Thread Shuah Khan

On 3/9/21 4:40 PM, Tetsuo Handa wrote:

On 2021/03/10 4:50, Shuah Khan wrote:

On 3/9/21 4:04 AM, Tetsuo Handa wrote:

On 2021/03/09 1:27, Shuah Khan wrote:

Yes. We might need synchronization between events, threads, and shutdown
in usbip_host side and in connection polling and threads in vhci.

I am also looking at the shutdown sequences closely as well since the
local state is referenced without usbip_device lock in these paths.

I am approaching these problems as peeling the onion an expression so
we can limit the changes and take a spot fix approach. We have the
goal to address these crashes and not introduce regressions.


I think my [PATCH v4 01/12]-[PATCH v4 06/12] simplify your further changes
without introducing regressions. While ud->lock is held when checking 
ud->status,
current attach/detach code is racy about read/update of ud->status . I think we
can close race in attach/detach code via a simple usbip_event_mutex 
serialization.



Do you mean patches 1,2,3,3,4,5,6?


Yes, my 1,2,3,4,5,6.

Since you think that usbip_prepare_threads() does not worth introducing, I'm 
fine with
replacing my 7,8,9,10,11,12 with your "[PATCH 0/6] usbip fixes to crashes found by 
syzbot".



Using event lock isn't the right approach to solve the race. It is a
large grain lock. I am not looking to replace patches.

I still haven't seen any response from you about if you were able to
verify the fixes I sent in fix the problem you are seeing.

thanks,
-- Shuah




Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-09 Thread Tetsuo Handa
On 2021/03/10 4:50, Shuah Khan wrote:
> On 3/9/21 4:04 AM, Tetsuo Handa wrote:
>> On 2021/03/09 1:27, Shuah Khan wrote:
>>> Yes. We might need synchronization between events, threads, and shutdown
>>> in usbip_host side and in connection polling and threads in vhci.
>>>
>>> I am also looking at the shutdown sequences closely as well since the
>>> local state is referenced without usbip_device lock in these paths.
>>>
>>> I am approaching these problems as peeling the onion an expression so
>>> we can limit the changes and take a spot fix approach. We have the
>>> goal to address these crashes and not introduce regressions.
>>
>> I think my [PATCH v4 01/12]-[PATCH v4 06/12] simplify your further changes
>> without introducing regressions. While ud->lock is held when checking 
>> ud->status,
>> current attach/detach code is racy about read/update of ud->status . I think 
>> we
>> can close race in attach/detach code via a simple usbip_event_mutex 
>> serialization.
>>
> 
> Do you mean patches 1,2,3,3,4,5,6?

Yes, my 1,2,3,4,5,6.

Since you think that usbip_prepare_threads() does not worth introducing, I'm 
fine with
replacing my 7,8,9,10,11,12 with your "[PATCH 0/6] usbip fixes to crashes found 
by syzbot".


Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-09 Thread Shuah Khan

On 3/9/21 4:04 AM, Tetsuo Handa wrote:

On 2021/03/09 1:27, Shuah Khan wrote:

Yes. We might need synchronization between events, threads, and shutdown
in usbip_host side and in connection polling and threads in vhci.

I am also looking at the shutdown sequences closely as well since the
local state is referenced without usbip_device lock in these paths.

I am approaching these problems as peeling the onion an expression so
we can limit the changes and take a spot fix approach. We have the
goal to address these crashes and not introduce regressions.


I think my [PATCH v4 01/12]-[PATCH v4 06/12] simplify your further changes
without introducing regressions. While ud->lock is held when checking 
ud->status,
current attach/detach code is racy about read/update of ud->status . I think we
can close race in attach/detach code via a simple usbip_event_mutex 
serialization.



Do you mean patches 1,2,3,3,4,5,6?

thanks,
-- Shuah


Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-09 Thread Shuah Khan

On 3/8/21 9:27 AM, Shuah Khan wrote:

On 3/8/21 3:10 AM, Tetsuo Handa wrote:

On 2021/03/08 16:35, Tetsuo Handa wrote:

On 2021/03/08 12:53, Shuah Khan wrote:

Fix the above problems:
- Stop using kthread_get_run() macro to create/start threads.
- Create threads and get task struct reference.
- Add kthread_create() failure handling and bail out.
- Hold usbip_device lock to update local and shared states after
   creating rx and tx threads.
- Update usbip_device status to SDEV_ST_USED.
- Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx
- Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx,
   and status) is complete.


No, the whole usbip_sockfd_store() etc. should be serialized using a 
mutex,
for two different threads can open same file and write the same 
content at
the same moment. This results in seeing SDEV_ST_AVAILABLE and 
creating two
threads and overwiting global variables and setting SDEV_ST_USED and 
starting
two threads by each of two thread, which will later fail to call 
kthread_stop()

on one of two thread because global variables are overwritten.

kthread_crate() (which involves GFP_KERNEL allocation) can take long 
time

enough to hit

   usbip_sockfd_store() must perform

   if (sdev->ud.status != SDEV_ST_AVAILABLE) {


Oops. This is

if (sdev->ud.status == SDEV_ST_AVAILABLE) {

of course.


 /* misc assignments for attach operation */
 sdev->ud.status = SDEV_ST_USED;
   }

   under a lock, or multiple ud->tcp_{tx,rx} are created (which will 
later

   cause a crash like [1]) and refcount on ud->tcp_socket is leaked when
   usbip_sockfd_store() is concurrently called.

problem. That's why my patch introduced usbip_event_mutex lock.



And I think that same serialization is required between 
"rh_port_connect() from attach_store()" and
"rh_port_disconnect() from vhci_shutdown_connection() via 
usbip_event_add(&vdev->ud, VDEV_EVENT_DOWN)
  from vhci_port_disconnect() from detach_store()", for both 
vhci_rx_pdu() from vhci_rx_loop() and
vhci_port_disconnect() from detach_store() can queue VDEV_EVENT_DOWN 
event which can be processed

without waiting for attach_store() to complete.



Yes. We might need synchronization between events, threads, and shutdown
in usbip_host side and in connection polling and threads in vhci.

I am also looking at the shutdown sequences closely as well since the
local state is referenced without usbip_device lock in these paths.

I am approaching these problems as peeling the onion an expression so
we can limit the changes and take a spot fix approach. We have the
goal to address these crashes and not introduce regressions.

I don't seem to be able to reproduce these problems consistently in my
env. with the reproducer. I couldn't reproduce them in normal case at
all. Hence, the this cautious approach that reduces the chance of
regressions and if we see regressions, they can fixed easily.

https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d0

If this patch series fixes the problems you are seeing, I would like
get these fixes in and address the other two potential race conditions
in another round of patches. I also want to soak these in the next
for a few weeks.

Please let me know if these patches fix the problems you are seeing in 
your env.




Can you verify these patches in your environment and see if you are
seeing any problems? I want to first see where we are with these
fixes.

thanks,
-- Shuah


Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-09 Thread Tetsuo Handa
On 2021/03/09 20:04, Tetsuo Handa wrote:
> On 2021/03/09 1:27, Shuah Khan wrote:
>> Yes. We might need synchronization between events, threads, and shutdown
>> in usbip_host side and in connection polling and threads in vhci.
>>
>> I am also looking at the shutdown sequences closely as well since the
>> local state is referenced without usbip_device lock in these paths.
>>
>> I am approaching these problems as peeling the onion an expression so
>> we can limit the changes and take a spot fix approach. We have the
>> goal to address these crashes and not introduce regressions.
> 
> I think my [PATCH v4 01/12]-[PATCH v4 06/12] simplify your further changes
> without introducing regressions. While ud->lock is held when checking 
> ud->status,
> current attach/detach code is racy about read/update of ud->status . I think 
> we
> can close race in attach/detach code via a simple usbip_event_mutex 
> serialization.
> 

Commit 04679b3489e048cd ("Staging: USB/IP: add client driver") says

  /*
   * The important thing is that only one context begins cleanup.
   * This is why error handling and cleanup become simple.
   * We do not want to consider race condition as possible.
   */
  static void vhci_shutdown_connection(struct usbip_device *ud)

but why are we allowing multiple contexts to begin startup?
That's where a subtle race window syzbot is reporting happened.
This driver has never thought the possibility that multiple userspace
processes can concurrently begin startup. Then, it is quite natural that
we make startup simple and safe, by enforcing that only one context
begins startup.

Without serialization between startup/cleanup/event_handler(),
"Fix the above problems:" in your changelog cannot become true.
First step should be closing time-of-check to time-of-use bug
via entire serialization as if "nonpreemptible UP kernel".


Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-09 Thread Tetsuo Handa
On 2021/03/09 1:27, Shuah Khan wrote:
> Yes. We might need synchronization between events, threads, and shutdown
> in usbip_host side and in connection polling and threads in vhci.
> 
> I am also looking at the shutdown sequences closely as well since the
> local state is referenced without usbip_device lock in these paths.
> 
> I am approaching these problems as peeling the onion an expression so
> we can limit the changes and take a spot fix approach. We have the
> goal to address these crashes and not introduce regressions.

I think my [PATCH v4 01/12]-[PATCH v4 06/12] simplify your further changes
without introducing regressions. While ud->lock is held when checking 
ud->status,
current attach/detach code is racy about read/update of ud->status . I think we
can close race in attach/detach code via a simple usbip_event_mutex 
serialization.



Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-08 Thread Shuah Khan

On 3/8/21 3:10 AM, Tetsuo Handa wrote:

On 2021/03/08 16:35, Tetsuo Handa wrote:

On 2021/03/08 12:53, Shuah Khan wrote:

Fix the above problems:
- Stop using kthread_get_run() macro to create/start threads.
- Create threads and get task struct reference.
- Add kthread_create() failure handling and bail out.
- Hold usbip_device lock to update local and shared states after
   creating rx and tx threads.
- Update usbip_device status to SDEV_ST_USED.
- Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx
- Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx,
   and status) is complete.


No, the whole usbip_sockfd_store() etc. should be serialized using a mutex,
for two different threads can open same file and write the same content at
the same moment. This results in seeing SDEV_ST_AVAILABLE and creating two
threads and overwiting global variables and setting SDEV_ST_USED and starting
two threads by each of two thread, which will later fail to call kthread_stop()
on one of two thread because global variables are overwritten.

kthread_crate() (which involves GFP_KERNEL allocation) can take long time
enough to hit

   usbip_sockfd_store() must perform

   if (sdev->ud.status != SDEV_ST_AVAILABLE) {


Oops. This is

if (sdev->ud.status == SDEV_ST_AVAILABLE) {

of course.


 /* misc assignments for attach operation */
 sdev->ud.status = SDEV_ST_USED;
   }

   under a lock, or multiple ud->tcp_{tx,rx} are created (which will later
   cause a crash like [1]) and refcount on ud->tcp_socket is leaked when
   usbip_sockfd_store() is concurrently called.

problem. That's why my patch introduced usbip_event_mutex lock.



And I think that same serialization is required between "rh_port_connect() from 
attach_store()" and
"rh_port_disconnect() from vhci_shutdown_connection() via 
usbip_event_add(&vdev->ud, VDEV_EVENT_DOWN)
  from vhci_port_disconnect() from detach_store()", for both vhci_rx_pdu() from 
vhci_rx_loop() and
vhci_port_disconnect() from detach_store() can queue VDEV_EVENT_DOWN event 
which can be processed
without waiting for attach_store() to complete.



Yes. We might need synchronization between events, threads, and shutdown
in usbip_host side and in connection polling and threads in vhci.

I am also looking at the shutdown sequences closely as well since the
local state is referenced without usbip_device lock in these paths.

I am approaching these problems as peeling the onion an expression so
we can limit the changes and take a spot fix approach. We have the
goal to address these crashes and not introduce regressions.

I don't seem to be able to reproduce these problems consistently in my
env. with the reproducer. I couldn't reproduce them in normal case at
all. Hence, the this cautious approach that reduces the chance of
regressions and if we see regressions, they can fixed easily.

https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d0

If this patch series fixes the problems you are seeing, I would like
get these fixes in and address the other two potential race conditions
in another round of patches. I also want to soak these in the next
for a few weeks.

Please let me know if these patches fix the problems you are seeing in 
your env.


thanks,
-- Shuah




Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-08 Thread Tetsuo Handa
On 2021/03/08 16:35, Tetsuo Handa wrote:
> On 2021/03/08 12:53, Shuah Khan wrote:
>> Fix the above problems:
>> - Stop using kthread_get_run() macro to create/start threads.
>> - Create threads and get task struct reference.
>> - Add kthread_create() failure handling and bail out.
>> - Hold usbip_device lock to update local and shared states after
>>   creating rx and tx threads.
>> - Update usbip_device status to SDEV_ST_USED.
>> - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx
>> - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx,
>>   and status) is complete.
> 
> No, the whole usbip_sockfd_store() etc. should be serialized using a mutex,
> for two different threads can open same file and write the same content at
> the same moment. This results in seeing SDEV_ST_AVAILABLE and creating two
> threads and overwiting global variables and setting SDEV_ST_USED and starting
> two threads by each of two thread, which will later fail to call 
> kthread_stop()
> on one of two thread because global variables are overwritten.
> 
> kthread_crate() (which involves GFP_KERNEL allocation) can take long time
> enough to hit
> 
>   usbip_sockfd_store() must perform
> 
>   if (sdev->ud.status != SDEV_ST_AVAILABLE) {

Oops. This is

if (sdev->ud.status == SDEV_ST_AVAILABLE) {

of course.

> /* misc assignments for attach operation */
> sdev->ud.status = SDEV_ST_USED;
>   }
> 
>   under a lock, or multiple ud->tcp_{tx,rx} are created (which will later
>   cause a crash like [1]) and refcount on ud->tcp_socket is leaked when
>   usbip_sockfd_store() is concurrently called.
> 
> problem. That's why my patch introduced usbip_event_mutex lock.
> 

And I think that same serialization is required between "rh_port_connect() from 
attach_store()" and
"rh_port_disconnect() from vhci_shutdown_connection() via 
usbip_event_add(&vdev->ud, VDEV_EVENT_DOWN)
 from vhci_port_disconnect() from detach_store()", for both vhci_rx_pdu() from 
vhci_rx_loop() and
vhci_port_disconnect() from detach_store() can queue VDEV_EVENT_DOWN event 
which can be processed
without waiting for attach_store() to complete.


Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

2021-03-07 Thread Tetsuo Handa
On 2021/03/08 12:53, Shuah Khan wrote:
> Fix the above problems:
> - Stop using kthread_get_run() macro to create/start threads.
> - Create threads and get task struct reference.
> - Add kthread_create() failure handling and bail out.
> - Hold usbip_device lock to update local and shared states after
>   creating rx and tx threads.
> - Update usbip_device status to SDEV_ST_USED.
> - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx
> - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx,
>   and status) is complete.

No, the whole usbip_sockfd_store() etc. should be serialized using a mutex,
for two different threads can open same file and write the same content at
the same moment. This results in seeing SDEV_ST_AVAILABLE and creating two
threads and overwiting global variables and setting SDEV_ST_USED and starting
two threads by each of two thread, which will later fail to call kthread_stop()
on one of two thread because global variables are overwritten.

kthread_crate() (which involves GFP_KERNEL allocation) can take long time
enough to hit

  usbip_sockfd_store() must perform

  if (sdev->ud.status != SDEV_ST_AVAILABLE) {
/* misc assignments for attach operation */
sdev->ud.status = SDEV_ST_USED;
  }

  under a lock, or multiple ud->tcp_{tx,rx} are created (which will later
  cause a crash like [1]) and refcount on ud->tcp_socket is leaked when
  usbip_sockfd_store() is concurrently called.

problem. That's why my patch introduced usbip_event_mutex lock.