Re: [Cocci] Coccinelle rule for CVE-2019-18683

2020-04-10 Thread Alexander Popov
On 09.04.2020 22:41, Alexander Popov wrote:
> On 09.04.2020 01:26, Jann Horn wrote:
>> On Thu, Apr 9, 2020 at 12:01 AM Alexander Popov  wrote:
>>> CVE-2019-18683 refers to three similar vulnerabilities caused by the same
>>> incorrect approach to locking that is used in 
>>> vivid_stop_generating_vid_cap(),
>>> vivid_stop_generating_vid_out(), and sdr_cap_stop_streaming().
>>>
>>> For fixes please see the commit 6dcd5d7a7a29c1e4 (media: vivid: Fix wrong
>>> locking that causes race conditions on streaming stop).
>>>
>>> These three functions are called during streaming stopping with 
>>> vivid_dev.mutex
>>> locked. And they all do the same mistake while stopping their kthreads, 
>>> which
>>> need to lock this mutex as well. See the example from
>>> vivid_stop_generating_vid_cap():
>>> /* shutdown control thread */
>>> vivid_grab_controls(dev, false);
>>> mutex_unlock(>mutex);
>>> kthread_stop(dev->kthread_vid_cap);
>>> dev->kthread_vid_cap = NULL;
>>> mutex_lock(>mutex);
>>>
>>> But when this mutex is unlocked, another vb2_fop_read() can lock it instead 
>>> of
>>> the kthread and manipulate the buffer queue. That causes use-after-free.
>>>
>>> I created a Coccinelle rule that detects 
>>> mutex_unlock+kthread_stop+mutex_lock
>>> within one function.
>> [...]
>>> mutex_unlock@unlock_p(E)
>>> ...
>>> kthread_stop@stop_p(...)
>>> ...
>>> mutex_lock@lock_p(E)
>>
>> Is the kthread_stop() really special here? It seems to me like it's
>> pretty much just a normal instance of the "temporarily dropping a
>> lock" pattern - which does tend to go wrong quite often, but can also
>> be correct.
> 
> Right, searching without kthread_stop() gives more cases.
> 
>> I think it would be interesting though to have a list of places that
>> drop and then re-acquire a mutex/spinlock/... that was not originally
>> acquired in the same block of code (but was instead originally
>> acquired in an outer block, or by a parent function, or something like
>> that). So things like this:

The following rule reported 146 matching cases, which might be interesting.

```
virtual report
virtual context

@race exists@
expression E;
position unlock_p;
position lock_p;
@@

... when != mutex_lock(E)
* mutex_unlock@unlock_p(E)
... when != schedule()
when != schedule_timeout(...)
when != cond_resched()
when != wait_event(...)
when != wait_event_timeout(...)
when != wait_event_interruptible_timeout(...)
when != wait_event_interruptible(...)
when != msleep()
when != msleep_interruptible(...)
* mutex_lock@lock_p(E)

@script:python@
unlock_p << race.unlock_p;
lock_p << race.lock_p;
E << race.E;
@@

coccilib.report.print_report(unlock_p[0], 'see mutex_unlock(' + E + ') here')
coccilib.report.print_report(lock_p[0], 'see mutex_lock(' + E + ') here\n')
```

Analysing each matching case would take a lot of time.

However, I'm focused on searching kernel security issues.
So I will filter out the code that:
 - is not enabled in popular kernel configurations,
 - doesn't create additional attack surface.
Then I'll take the time to analyse the rest of reported cases.

I'll inform you if I find any bug.

Best regards,
Alexander
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] Coccinelle rule for CVE-2019-18683

2020-04-09 Thread Alexander Popov
On 09.04.2020 13:53, Julia Lawall wrote:
> On Thu, 9 Apr 2020, Alexander Popov wrote:
>> virtual report
>>
>> @race exists@
>> expression E;
>> position stop_p;
>> position unlock_p;
>> position lock_p;
>> @@
>>
>> mutex_unlock@unlock_p(E)
>> ...
> 
> It would be good to put when != mutex_lock(E) after the ... above.  Your
> rule doesn't actually prevent the lock from being retaken.

Thanks Julia! I used this trick in the second version of the rule that I've just
sent.

>> kthread_stop@stop_p(...)
>> ...
>> mutex_lock@lock_p(E)
>>
>> @script:python@
>> stop_p << race.stop_p;
>> unlock_p << race.unlock_p;
>> lock_p << race.lock_p;
>> E << race.E;
>> @@
>>
>> coccilib.report.print_report(unlock_p[0], 'mutex_unlock(' + E + ') here')
>> coccilib.report.print_report(stop_p[0], 'kthread_stop here')
>> coccilib.report.print_report(lock_p[0], 'mutex_lock(' + E + ') here\n')

...

> Based on Jann's suggestion, it seem like it could be interesting to find
> these locking pauses, and then collect functions that are used in locks
> and in lock pauses.  If a function is mostly used with locks held, then
> using it in a lock pause could be a sign of a bug.  I will see if it turns
> up anything interesting.

Do you mean collecting the behaviour that happens between unlocking and locking
and then analysing it somehow?

Best regards,
Alexander
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] Coccinelle rule for CVE-2019-18683

2020-04-09 Thread Alexander Popov
Jann, thanks for your reply!

On 09.04.2020 01:26, Jann Horn wrote:
> On Thu, Apr 9, 2020 at 12:01 AM Alexander Popov  wrote:
>> CVE-2019-18683 refers to three similar vulnerabilities caused by the same
>> incorrect approach to locking that is used in 
>> vivid_stop_generating_vid_cap(),
>> vivid_stop_generating_vid_out(), and sdr_cap_stop_streaming().
>>
>> For fixes please see the commit 6dcd5d7a7a29c1e4 (media: vivid: Fix wrong
>> locking that causes race conditions on streaming stop).
>>
>> These three functions are called during streaming stopping with 
>> vivid_dev.mutex
>> locked. And they all do the same mistake while stopping their kthreads, which
>> need to lock this mutex as well. See the example from
>> vivid_stop_generating_vid_cap():
>> /* shutdown control thread */
>> vivid_grab_controls(dev, false);
>> mutex_unlock(>mutex);
>> kthread_stop(dev->kthread_vid_cap);
>> dev->kthread_vid_cap = NULL;
>> mutex_lock(>mutex);
>>
>> But when this mutex is unlocked, another vb2_fop_read() can lock it instead 
>> of
>> the kthread and manipulate the buffer queue. That causes use-after-free.
>>
>> I created a Coccinelle rule that detects mutex_unlock+kthread_stop+mutex_lock
>> within one function.
> [...]
>> mutex_unlock@unlock_p(E)
>> ...
>> kthread_stop@stop_p(...)
>> ...
>> mutex_lock@lock_p(E)
> 
> Is the kthread_stop() really special here? It seems to me like it's
> pretty much just a normal instance of the "temporarily dropping a
> lock" pattern - which does tend to go wrong quite often, but can also
> be correct.

Right, searching without kthread_stop() gives more cases.

> I think it would be interesting though to have a list of places that
> drop and then re-acquire a mutex/spinlock/... that was not originally
> acquired in the same block of code (but was instead originally
> acquired in an outer block, or by a parent function, or something like
> that). So things like this:

It's a very good idea. I tried it and got first results (described below).

> void X(...) {
>   mutex_lock(A);
>   for (...) {
> ...
> mutex_unlock(A);
> ...
> mutex_lock(A);
> ...
>   }
>   mutex_unlock(A);
> }

I'm not an expert in SmPL yet. Don't know how to describe this case.

> or like this:
> 
> void X(...) {
>   ... [no mutex operations on A]
>   mutex_unlock(A);
>   ...
>   mutex_lock(A);
>   ...
> }

Yes, I adapted the rule for that easier case:

```
virtual report
virtual context

@race exists@
expression E;
position unlock_p;
position lock_p;
@@

... when != mutex_lock(E)
* mutex_unlock@unlock_p(E)
...
* mutex_lock@lock_p(E)

@script:python@
unlock_p << race.unlock_p;
lock_p << race.lock_p;
E << race.E;
@@

coccilib.report.print_report(unlock_p[0], 'see mutex_unlock(' + E + ') here')
coccilib.report.print_report(lock_p[0], 'see mutex_lock(' + E + ') here\n')
```

The command to run it:
  COCCI=./scripts/coccinelle/kthread_race.cocci make coccicheck MODE=context
It shows the code context around in a form of diff.

This rule found 195 matches. Not that much!

> But of course, there are places where this kind of behavior is
> correct; so such a script wouldn't just return report code, just code
> that could use a bit more scrutiny than normal. 

I've spent some time looking through the results.
Currently I see 3 types of cases.


1. Cases that look legit: a mutex is unlocked for some waiting or sleeping.

Example:
./fs/io_uring.c:7908:2-14: see mutex_unlock(& ctx -> uring_lock) here
./fs/io_uring.c:7910:2-12: see mutex_lock(& ctx -> uring_lock) here

diff -u -p ./fs/io_uring.c /tmp/nothing/fs/io_uring.c
--- ./fs/io_uring.c
+++ /tmp/nothing/fs/io_uring.c
@@ -7905,9 +7905,7 @@ static int __io_uring_register(struct io
 * to drop the mutex here, since no new references will come in
 * after we've killed the percpu ref.
 */
-   mutex_unlock(>uring_lock);
ret = wait_for_completion_interruptible(>completions[0]);
-   mutex_lock(>uring_lock);
if (ret) {
percpu_ref_resurrect(>refs);
ret = -EINTR;


Another example that looks legit:
./mm/ksm.c:2709:2-14: see mutex_unlock(& ksm_thread_mutex) here
./mm/ksm.c:2712:2-12: see mutex_lock(& ksm_thread_mutex) here

diff -u -p ./mm/ksm.c /tmp/nothing/mm/ksm.c
--- ./mm/ksm.c
+++ /tmp/nothing/mm/ksm.c
@@ -2706,10 +2706,8 @@ void ksm_migrate_page(struct page *newpa
 static void wait_while_offlining(void)
 {
while (ksm_run & KSM_RUN_OFFLINE) {
-   mutex_unlock(_thread_mutex);
wait_on_bit(_run,

Re: [Cocci] Coccinelle rule for CVE-2019-18683

2020-04-09 Thread Alexander Popov
Markus, thanks for your remarks!

On 09.04.2020 11:41, Markus Elfring wrote:
> * The source code search pattern can be too generic.
>   How do you think about to consider additional constraints
>   for safer data control flow analysis?

Could you please elaborate on that?

I used 'exists' keyword to find at least one branch that has
mutex_unlock+kthread_stop+mutex_lock chain.

> * Other operation modes might become helpful.

Thanks! I added 'context' mode, it's very good for this purpose.

Best regards,
Alexander
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


[Cocci] Coccinelle rule for CVE-2019-18683

2020-04-09 Thread Alexander Popov
Hello!

Some time ago I fixed CVE-2019-18683 in the V4L2 subsystem of the Linux kernel.

I created a Coccinelle rule that detects that bug pattern. Let me show it.


Bug pattern
===

CVE-2019-18683 refers to three similar vulnerabilities caused by the same
incorrect approach to locking that is used in vivid_stop_generating_vid_cap(),
vivid_stop_generating_vid_out(), and sdr_cap_stop_streaming().

For fixes please see the commit 6dcd5d7a7a29c1e4 (media: vivid: Fix wrong
locking that causes race conditions on streaming stop).

These three functions are called during streaming stopping with vivid_dev.mutex
locked. And they all do the same mistake while stopping their kthreads, which
need to lock this mutex as well. See the example from
vivid_stop_generating_vid_cap():
/* shutdown control thread */
vivid_grab_controls(dev, false);
mutex_unlock(>mutex);
kthread_stop(dev->kthread_vid_cap);
dev->kthread_vid_cap = NULL;
mutex_lock(>mutex);

But when this mutex is unlocked, another vb2_fop_read() can lock it instead of
the kthread and manipulate the buffer queue. That causes use-after-free.

I created a Coccinelle rule that detects mutex_unlock+kthread_stop+mutex_lock
within one function.


Coccinelle rule
===

virtual report

@race exists@
expression E;
position stop_p;
position unlock_p;
position lock_p;
@@

mutex_unlock@unlock_p(E)
...
kthread_stop@stop_p(...)
...
mutex_lock@lock_p(E)

@script:python@
stop_p << race.stop_p;
unlock_p << race.unlock_p;
lock_p << race.lock_p;
E << race.E;
@@

coccilib.report.print_report(unlock_p[0], 'mutex_unlock(' + E + ') here')
coccilib.report.print_report(stop_p[0], 'kthread_stop here')
coccilib.report.print_report(lock_p[0], 'mutex_lock(' + E + ') here\n')


Testing the rule


I reverted the commit 6dcd5d7a7a29c1e4 and called:
COCCI=./scripts/coccinelle/kthread_race.cocci make coccicheck MODE=report

The result:

./drivers/media/platform/vivid/vivid-kthread-out.c:347:1-13: mutex_unlock(& dev
-> mutex) here
./drivers/media/platform/vivid/vivid-kthread-out.c:348:1-13: kthread_stop here
./drivers/media/platform/vivid/vivid-kthread-out.c:350:1-11: mutex_lock(& dev ->
mutex) here

./drivers/media/platform/vivid/vivid-sdr-cap.c:306:1-13: mutex_unlock(& dev ->
mutex) here
./drivers/media/platform/vivid/vivid-sdr-cap.c:307:1-13: kthread_stop here
./drivers/media/platform/vivid/vivid-sdr-cap.c:309:1-11: mutex_lock(& dev ->
mutex) here

./drivers/media/platform/vivid/vivid-kthread-cap.c:1001:1-13: mutex_unlock(& dev
-> mutex) here
./drivers/media/platform/vivid/vivid-kthread-cap.c:1002:1-13: kthread_stop here
./drivers/media/platform/vivid/vivid-kthread-cap.c:1004:1-11: mutex_lock(& dev
-> mutex) here

There are no other bugs detected.

Do you have any idea how to improve it?
Do we need that rule for regression testing in the upstream?

Thanks in advance!
Alexander
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] [PATCH] floppy: fix usercopy direction

2019-08-12 Thread Alexander Popov
Hello everyone!

On 27.03.2019 1:03, Jann Horn wrote:
> As sparse points out, these two copy_from_user() should actually be
> copy_to_user().

I've spent some time on these bugs, but it turned out that they are already 
public.

I think Jann's patch is lost, it is not applied to the mainline.
So I add a new floppy maintainer Denis Efremov to CC.

These bugs on x86_64 cause memset for the userspace memory from the kernelspace.
That is funny:
 - access_ok for the copy_from_user source (2nd argument) returns zero;
 - copy_from_user tries to erase the destination (1st argument);
 - but the destination is in the userspace instead of kernelspace.

So we have:
[   40.937098] BUG: unable to handle page fault for address: 41414242
[   40.938714] #PF: supervisor write access in kernel mode
[   40.939951] #PF: error_code(0x0002) - not-present page
[   40.941121] PGD 7963f067 P4D 7963f067 PUD 0
[   40.942107] Oops: 0002 [#1] SMP NOPTI
[   40.942968] CPU: 0 PID: 292 Comm: d Not tainted 5.3.0-rc3+ #7
[   40.944288] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   40.946478] RIP: 0010:__memset+0x24/0x30
[   40.947394] Code: 90 90 90 90 90 90 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07
48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6  48 ab 89
d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
[   40.951721] RSP: 0018:c93dbd58 EFLAGS: 00010206
[   40.952941] RAX:  RBX: 0034 RCX: 0006
[   40.954592] RDX: 0004 RSI:  RDI: 41414242
[   40.956169] RBP: 41414242 R08: 8200bd80 R09: 41414242
[   40.957753] R10: 00121806 R11: 88807da28ab0 R12: c93dbd7c
[   40.959407] R13: 0001 R14: 41414242 R15: 41414242
[   40.961062] FS:  7f91115c4440() GS:88807da0()
knlGS:
[   40.962603] CS:  0010 DS:  ES:  CR0: 80050033
[   40.963695] CR2: 41414242 CR3: 7c584000 CR4: 06f0
[   40.965004] Call Trace:
[   40.965459]  _copy_from_user+0x51/0x60
[   40.966141]  compat_getdrvstat+0x124/0x170
[   40.966781]  fd_compat_ioctl+0x69c/0x6d0
[   40.967423]  ? selinux_file_ioctl+0x16f/0x210
[   40.968117]  compat_blkdev_ioctl+0x21d/0x8f0
[   40.968864]  __x32_compat_sys_ioctl+0x99/0x250
[   40.969659]  do_syscall_64+0x4a/0x110
[   40.970337]  entry_SYSCALL_64_after_hwframe+0x44/0xa9


I haven't found a way to exploit it.

> Fixes: 229b53c9bf4e ("take floppy compat ioctls to sodding floppy.c")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Jann Horn 
> Reviewed-by: Mukesh Ojha 
> ---
> compile-tested only

Acked-by: Alexander Popov 

>  drivers/block/floppy.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
> index 95f608d1a098..8c641245ff12 100644
> --- a/drivers/block/floppy.c
> +++ b/drivers/block/floppy.c
> @@ -3749,7 +3749,7 @@ static int compat_getdrvprm(int drive,
>   v.native_format = UDP->native_format;
>   mutex_unlock(_mutex);
>  
> - if (copy_from_user(arg, , sizeof(struct compat_floppy_drive_params)))
> + if (copy_to_user(arg, , sizeof(struct compat_floppy_drive_params)))
>   return -EFAULT;
>   return 0;
>  }
> @@ -3785,7 +3785,7 @@ static int compat_getdrvstat(int drive, bool poll,
>   v.bufblocks = UDRS->bufblocks;
>   mutex_unlock(_mutex);
>  
> - if (copy_from_user(arg, , sizeof(struct compat_floppy_drive_struct)))
> + if (copy_to_user(arg, , sizeof(struct compat_floppy_drive_struct)))
>   return -EFAULT;
>   return 0;
>  Eintr:
> 

I also wrote a coccinelle rule for detecting similar bugs (adding coccinelle
experts to CC).


virtual report

@cfu@
identifier f;
type t;
identifier v;
position decl_p;
position copy_p;
@@

f(..., t v@decl_p, ...)
{
<+...
copy_from_user@copy_p(v, ...)
...+>
}

@script:python@
f << cfu.f;
t << cfu.t;
v << cfu.v;
decl_p << cfu.decl_p;
copy_p << cfu.copy_p;
@@

if '__user' in t:
  msg0 = "function \"" + f + "\" has arg \"" + v + "\" of type \"" + t + "\""
  coccilib.report.print_report(decl_p[0], msg0)
  msg1 = "copy_from_user uses \"" + v + "\" as the destination. What a shame!\n"
  coccilib.report.print_report(copy_p[0], msg1)


The rule output:

./drivers/block/floppy.c:3756:49-52: function "compat_getdrvprm" has arg "arg"
of type "struct compat_floppy_drive_params __user *"
./drivers/block/floppy.c:3783:5-19: copy_from_user uses "arg" as the
destination. What a shame!

./drivers/block/floppy.c:3789:49-52: function "compat_getdrvstat" has arg "arg"
of type "struct compat_floppy_drive_struct __user *"
./drivers/block/floppy.c:3819:5-19: copy_from_user uses "arg" as the
destination. What a shame!


Best regards,
Alexander
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] [PATCH] floppy: fix usercopy direction

2019-08-12 Thread Alexander Popov
On 09.08.2019 16:56, Julia Lawall wrote:
> On Fri, 9 Aug 2019, Alexander Popov wrote:
>> On 27.03.2019 1:03, Jann Horn wrote:
>>> As sparse points out, these two copy_from_user() should actually be
>>> copy_to_user().
>>
>> I also wrote a coccinelle rule for detecting similar bugs (adding coccinelle
>> experts to CC).
>>
>>
>> virtual report
>>
>> @cfu@
> 
> You can replace the above line with @cfu exists@.  You want to find the
> existence of such a call, not ensure that the call occurs on every
> control-flow path, which is the default.

Thanks Julia, I see `exists` allows to drop `<+ +>`, right?

> Do you want this rule to go into the kernel?

It turned out that sparse already can find these bugs.
Is this rule useful anyway? If so, I can prepare a patch.

>> identifier f;
>> type t;
>> identifier v;
>> position decl_p;
>> position copy_p;
>> @@
>>
>> f(..., t v@decl_p, ...)
>> {
>> <+...
>> copy_from_user@copy_p(v, ...)
>> ...+>
>> }
>>
>> @script:python@
>> f << cfu.f;
>> t << cfu.t;
>> v << cfu.v;
>> decl_p << cfu.decl_p;
>> copy_p << cfu.copy_p;
>> @@
>>
>> if '__user' in t:
>>   msg0 = "function \"" + f + "\" has arg \"" + v + "\" of type \"" + t + "\""
>>   coccilib.report.print_report(decl_p[0], msg0)
>>   msg1 = "copy_from_user uses \"" + v + "\" as the destination. What a 
>> shame!\n"
>>   coccilib.report.print_report(copy_p[0], msg1)
>>
>>
>> The rule output:
>>
>> ./drivers/block/floppy.c:3756:49-52: function "compat_getdrvprm" has arg 
>> "arg"
>> of type "struct compat_floppy_drive_params __user *"
>> ./drivers/block/floppy.c:3783:5-19: copy_from_user uses "arg" as the
>> destination. What a shame!
>>
>> ./drivers/block/floppy.c:3789:49-52: function "compat_getdrvstat" has arg 
>> "arg"
>> of type "struct compat_floppy_drive_struct __user *"
>> ./drivers/block/floppy.c:3819:5-19: copy_from_user uses "arg" as the
>> destination. What a shame!
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci