Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-22 Thread syzbot

Bisection is inconclusive: the first bad commit could be any of:

2c43838c sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default
bf29cb23 sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION
d94d1053 sched/isolation: Document boot parameters dependency on CONFIG_CPU_ISOLATION=y
4c470317 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip


bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1592b03720
start commit:   0072a0c1
git tree:   upstream
dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-18 Thread zhong jiang
On 2019/3/17 3:42, Andrea Arcangeli wrote:
> On Sat, Mar 16, 2019 at 05:38:54PM +0800, zhong jiang wrote:
>> On 2019/3/16 5:39, Andrea Arcangeli wrote:
>>> On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
>>>> I can reproduce the issue in arm64 qemu machine.  The issue will leave
>>>> after applying the
>>>> patch.
>>>>
>>>> Tested-by: zhong jiang
>>> Thanks a lot for the quick testing!
>>>
>>>> Meanwhile,  I just has a little doubt whether it is necessary to use RCU
>>>> to free the task struct or not.
>>>> I think that mm->owner alway be NULL after failing to create to process.
>>>> Because we call mm_clear_owner.
>>> I wish it was enough, but the problem is that the other CPU may be in
>>> the middle of get_mem_cgroup_from_mm() while this runs, and it would
>>> dereference mm->owner while it is been freed without the call_rcu
>>> affter we clear mm->owner. What prevents this race is the
>> As you had said, It would dereference mm->owner after we clear mm->owner.
>>
>> But after we clear mm->owner,  mm->owner should be NULL.  Is it right?
>>
>> And mem_cgroup_from_task will check the parameter. 
>> you mean that it is possible after checking the parameter to  clear the 
>> owner .
>> and the NULL pointer will trigger. :-(
> Dereference mm->owner didn't mean reading the value of the mm->owner
> pointer, it really means to dereference the value of the pointer. It's
> like below:
>
> get_mem_cgroup_from_mm()        failing fork()
> ----                            ---
> task = mm->owner
>                                 mm->owner = NULL;
>                                 free(mm->owner)
> *task /* use after free */
>
> We didn't set mm->owner to NULL before, so the window for the race was
> larger, but setting mm->owner to NULL only hides the problem and it
> can still happen (albeit with a smaller window).
>
> If get_mem_cgroup_from_mm() can see at any time mm->owner not NULL,
> then the free of the task struct must be delayed until after
> rcu_read_unlock has returned in get_mem_cgroup_from_mm(). This is
> the standard RCU model, the freeing must be delayed until after the
> next quiescent point.

Thank you for your patient explanation.  The patch should go upstream too.
I think you should send a formal patch to the mainline.  Maybe other people
suffer from the issue.  :-)

Thanks,
zhong jiang
> BTW, both mm_update_next_owner() and mm_clear_owner() should have used
> WRITE_ONCE when they write to mm->owner, I can update that too but
> it's just to not to make assumptions that gcc does the right thing
> (and we still rely on gcc to do the right thing in other places) so
> that is just an orthogonal cleanup.
>
> Thanks,
> Andrea
>
> .
>




Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-16 Thread Andrea Arcangeli
On Sat, Mar 16, 2019 at 05:38:54PM +0800, zhong jiang wrote:
> On 2019/3/16 5:39, Andrea Arcangeli wrote:
> > On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
> >> I can reproduce the issue in arm64 qemu machine.  The issue will leave 
> >> after applying the
> >> patch.
> >>
> >> Tested-by: zhong jiang 
> > Thanks a lot for the quick testing!
> >
> >> Meanwhile,  I just has a little doubt whether it is necessary to use RCU 
> >> to free the task struct or not.
> >> I think that mm->owner alway be NULL after failing to create to process. 
> >> Because we call mm_clear_owner.
> > I wish it was enough, but the problem is that the other CPU may be in
> > the middle of get_mem_cgroup_from_mm() while this runs, and it would
> > dereference mm->owner while it is been freed without the call_rcu
> > affter we clear mm->owner. What prevents this race is the
> As you had said, It would dereference mm->owner after we clear mm->owner.
> 
> But after we clear mm->owner,  mm->owner should be NULL.  Is it right?
> 
> And mem_cgroup_from_task will check the parameter. 
> you mean that it is possible after checking the parameter to  clear the owner 
> .
> and the NULL pointer will trigger. :-(

Dereferencing mm->owner doesn't mean reading the value of the mm->owner
pointer; it means accessing the memory that the pointer value refers to.
It looks like this:

get_mem_cgroup_from_mm()        failing fork()
----                            ---
task = mm->owner
                                mm->owner = NULL;
                                free(mm->owner)
*task /* use after free */

We didn't set mm->owner to NULL before, so the window for the race was
larger, but setting mm->owner to NULL only hides the problem and it
can still happen (albeit with a smaller window).

If get_mem_cgroup_from_mm() can at any time see mm->owner as not NULL,
then the free of the task struct must be delayed until after
rcu_read_unlock has returned in get_mem_cgroup_from_mm(). This is
the standard RCU model: the freeing must be delayed until after the
next quiescent point.
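
The interleaving in the diagram above can be replayed in plain user-space C. This is only a minimal single-threaded sketch: the toy_* helpers and the reader counter are hypothetical stand-ins for rcu_read_lock(), rcu_read_unlock() and call_rcu(), not the kernel implementation, and "freed" is a flag so the example never touches released memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; these are not the real structures. */
struct toy_task { int memcg_id; bool freed; };
struct toy_mm   { struct toy_task *owner; };

static int readers;               /* readers currently inside a read-side section */
static struct toy_task *pending;  /* task whose free is waiting for readers to leave */

static void toy_free(struct toy_task *t) { t->freed = true; }

static void toy_read_lock(void) { readers++; }

/* When the last reader leaves, the deferred free is allowed to run. */
static void toy_read_unlock(void)
{
	assert(readers > 0);
	if (--readers == 0 && pending) {
		toy_free(pending);
		pending = NULL;
	}
}

/* Stand-in for call_rcu(): free at once only if no reader is active. */
static void toy_call_rcu(struct toy_task *t)
{
	if (readers == 0)
		toy_free(t);
	else
		pending = t;
}

/*
 * Replays the interleaving from the diagram. Returns true if the
 * reader touched the task after it had been freed.
 */
static bool replay(bool defer_free)
{
	struct toy_task child = { .memcg_id = 7, .freed = false };
	struct toy_mm mm = { .owner = &child };
	struct toy_task *task;
	bool use_after_free;

	readers = 0;
	pending = NULL;

	toy_read_lock();
	task = mm.owner;                /* reader: task = mm->owner */

	mm.owner = NULL;                /* failing fork(): clear the owner ... */
	if (defer_free)
		toy_call_rcu(&child);   /* ... and defer the free */
	else
		toy_free(&child);       /* ... or free immediately */

	use_after_free = task->freed;   /* reader: *task */
	toy_read_unlock();

	return use_after_free;
}
```

In this toy model replay(false) reports the use after free even though mm->owner was cleared first, while replay(true) does not, because the free waits until the reader has left its section.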

BTW, both mm_update_next_owner() and mm_clear_owner() should have used
WRITE_ONCE when they write to mm->owner. I can update that too, but
it's just to avoid assuming that gcc does the right thing
(and we still rely on gcc to do the right thing in other places), so
that is just an orthogonal cleanup.
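
For readers unfamiliar with the idea, a marked store can be approximated in user space with a volatile access. This is a simplified sketch only: the TOY_* macros are assumptions that mimic the spirit of the kernel's READ_ONCE/WRITE_ONCE, the real definitions handle more cases, and toy_clear_owner() is a hypothetical analogue of mm_clear_owner(), not the kernel function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified approximations for illustration; __typeof__ is a GCC/Clang
 * extension. The volatile access asks the compiler to emit exactly one
 * load or store and not to tear, repeat or drop it.
 */
#define TOY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define TOY_READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

struct toy_task { int pid; };
struct toy_mm   { struct toy_task *owner; };

/* Hypothetical analogue of mm_clear_owner(): clear only if p is the owner. */
static void toy_clear_owner(struct toy_mm *mm, struct toy_task *p)
{
	assert(mm != NULL);
	if (TOY_READ_ONCE(mm->owner) == p)
		TOY_WRITE_ONCE(mm->owner, NULL);
}
```

The marked accesses do not add any ordering or lifetime guarantee; they only constrain what the compiler may do with that one access, which is why this is a cleanup separate from the RCU fix.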

Thanks,
Andrea


Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-16 Thread zhong jiang
On 2019/3/16 5:39, Andrea Arcangeli wrote:
> On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
>> I can reproduce the issue in arm64 qemu machine.  The issue will leave after 
>> applying the
>> patch.
>>
>> Tested-by: zhong jiang 
> Thanks a lot for the quick testing!
>
>> Meanwhile,  I just has a little doubt whether it is necessary to use RCU to 
>> free the task struct or not.
>> I think that mm->owner alway be NULL after failing to create to process. 
>> Because we call mm_clear_owner.
> I wish it was enough, but the problem is that the other CPU may be in
> the middle of get_mem_cgroup_from_mm() while this runs, and it would
> dereference mm->owner while it is been freed without the call_rcu
> affter we clear mm->owner. What prevents this race is the
As you said, it would dereference mm->owner after we clear mm->owner.

But after we clear mm->owner, mm->owner should be NULL.  Is that right?

And mem_cgroup_from_task will check the parameter.
Do you mean that the owner can be cleared after the parameter has been checked,
and a NULL pointer dereference will then trigger? :-(

Thanks,
zhong jiang
> rcu_read_lock() in get_mem_cgroup_from_mm() and the corresponding
> call_rcu to free the task struct in the fork failure path (again only
> if CONFIG_MEMCG=y is defined). Considering you can reproduce this tiny
> race on arm64 qemu (perhaps tcg JIT timing variantions helps?), you
> might also in theory be able to still reproduce the race condition if
> you remove the call_rcu from delayed_free_task and you replace it with
> free_task.
>
> .
>




Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-15 Thread Andrea Arcangeli
On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
> I can reproduce the issue in arm64 qemu machine.  The issue will leave after 
> applying the
> patch.
> 
> Tested-by: zhong jiang 

Thanks a lot for the quick testing!

> Meanwhile,  I just has a little doubt whether it is necessary to use RCU to 
> free the task struct or not.
> I think that mm->owner alway be NULL after failing to create to process. 
> Because we call mm_clear_owner.

I wish it was enough, but the problem is that the other CPU may be in
the middle of get_mem_cgroup_from_mm() while this runs, and it would
dereference mm->owner while it is being freed without the call_rcu
after we clear mm->owner. What prevents this race is the
rcu_read_lock() in get_mem_cgroup_from_mm() and the corresponding
call_rcu to free the task struct in the fork failure path (again only
if CONFIG_MEMCG=y is defined). Considering you can reproduce this tiny
race on arm64 qemu (perhaps tcg JIT timing variations help?), you
might also in theory be able to still reproduce the race condition if
you remove the call_rcu from delayed_free_task and you replace it with
free_task.


Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-07 Thread zhong jiang
On 2019/3/6 10:05, Andrea Arcangeli wrote:
> Hello everyone,
>
> [ CC'ed Mike and Peter ]
>
> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang  wrote:
>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote:
>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
>>>>>>>> Hi, guys
>>>>>>>>
>>>>>>>> I also hit the following issue. but it fails to reproduce the issue by
>>>>>>>> the log.
>>>>>>>>
>>>>>>>> it seems to the case that we access the mm->owner and deference it
>>>>>>>> will result in the UAF.
>>>>>>>> But it should not be possible that we specify the incomplete process
>>>>>>>> to be the mm->owner.
>>>>>>>>
>>>>>>>> Any thoughts?
>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>>>> something like few instructions inconsistency window.
>>>>>>>
>>>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>>>
>>>>>> I guess that you mean some smb barriers should be taken into account.:-)
>>>>>>
>>>>>> Because IMO, It should not be the lock case to result in the issue.
>>>>> Since the crash was triggered on x86 _most likley_ this is not a
>>>>> missed barrier. What I meant is that one thread needs to executed some
>>>>> code, while another thread is stopped within few instructions.
>>>>>
>>>>>
>>>> It is weird and I can not find any relationship you had said with the
>>>> issue.:-(
>>>>
>>>> Because It is the cause that mm->owner has been freed, whereas we still
>>>> deference it.
>>>>
>>>> From the lastest freed task call trace, It fails to create process.
>>>>
>>>> Am I miss something or I misunderstand your meaning. Please correct me.
>>> Your analysis looks correct. I am just saying that the root cause of
>>> this use-after-free seems to be a race condition.
>>>
>>>
>>>
>> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
> Yes it's a race condition.
>
> We were aware about the non-cooperative fork userfaultfd feature
> creating userfaultfd file descriptor that gets reported to the parent
> uffd, despite they belong to mm created by failed forks.
>
> https://www.spinics.net/lists/linux-mm/msg136357.html
>
> The fork failure in my testcase happened because of signal pending
> that interrupted fork after the failed-fork uffd context, was already
> pushed to the userfaultfd reader/monitor. CRIU then takes care of
> filtering the failed fork cases so we didn't want to make the fork
> code more complicated just for userfaultfd.
>
> In reality if MEMCG is enabled at build time, mm->owner maintainance
> code now creates a race condition in the above case, with any fork
> failure.
>
> I pinged Mike yesterday to ask if my theory could be true for this bug
> and one solution he suggested is to do the userfaultfd_dup at a point
> where fork cannot fail anymore. That's precisely what we were
> wondering to do back then to avoid the failed fork reports to the
> non cooperative uffd monitor.
>
> That will solve the false positive deliveries that CRIU manager
> currently filters out too. From a theoretical standpoint it's also
> quite strange to even allow any uffd ioctl to run on a otherwise long
> gone mm created for a process that in the end wasn't even created (the
> mm got temporarily fully created, but no child task really ever used
> such mm). However that mm is on its way to exit_mmap as soon as the
> ioclt returns and this only ever happens during race conditions, so
> the way CRIU monitor works there wasn't anything fundamentally
> concerning about this detail, despite it's remarkably "strange". Our
> priority was to keep the fork code as simple as possible and keep
> userfaultfd as non intrusive as possible.
>
> One alternative solution I'm wondering about for this memcg issue is
> to free the task struct with RCU also when fork has failed and to add
> the mm_update_next_owner before mmput. That will still report failed
> forks to the uffd monitor, so it's not the ideal fix, but since it's
> probably simpler I'm posting it below. Also I couldn't reproduce the
> problem with the testcase here yet.
>
> >From 6cbf9d377b705476e5226704422357176f79e32c Mon Sep 17 00:00:00 2001
> From: Andrea Arcangeli 
> Date: Tue, 5 Mar 2019 19:21:37 -0500
> Subject: [PATCH 1/1] userfaultfd: use RCU to free the task struct when fork
>  fails if MEMCG
>
> MEMCG depends on the task structure not to be freed under
> rcu_read_lock() in get_mem_cgroup_from_mm() after it dereferences
> mm->owner.
>
> A better fix would be to avoid registering forked vmas in userfaultfd
> contexts reported to the monitor, if case fork ends up failing.
Hi,  Andrea

I can reproduce the issue in arm64 qemu machine.  The 

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-06 Thread zhong jiang
On 2019/3/7 2:29, Andrea Arcangeli wrote:
> Hello Zhong,
>
> On Wed, Mar 06, 2019 at 09:07:00PM +0800, zhong jiang wrote:
>> The patch use call_rcu to delay free the task_struct, but It is possible to 
>> free the task_struct
>> ahead of get_mem_cgroup_from_mm. is it right?
> Yes it is possible to free before get_mem_cgroup_from_mm, but if it's
> freed before get_mem_cgroup_from_mm rcu_read_lock,
> rcu_dereference(mm->owner) will return NULL in such case and there
> will be no problem.
Yes
> The simple fix also clears the mm->owner of the failed-fork-mm before
> doing the call_rcu. The call_rcu delays the freeing after no other CPU
> runs in between rcu_read_lock/unlock anymore. That guarantees that
> those critical section will see mm->owner == NULL if the freeing of
> the task strut already happened.
We have set mm->owner to NULL, when the child process fails to fork, ahead of
freeing the task struct.

Do those critical sections still have a chance to see an mm->owner which is
not NULL?

I have tested the patch.  No oops or panic has appeared so far.

Thanks,
zhong jiang
> The solution Mike suggested for this and that we were wondering as
> ideal in the past for the signal issue too, is to move the uffd
> delivery at a point where fork is guaranteed to succeed. We should
> probably try that too to see how it looks like and if it can be done
> in a not intrusive way, but the simple fix that uses RCU should work
> too.
>
> Rolling back in case of errors inside fork itself isn't easily doable:
> the moment we push the uffd ctx to the other side of the uffd pipe
> there's no coming back as that information can reach the userland of
> the uffd monitor/reader thread immediately after. The rolling back is
> really the other thread failing at mmget_not_zero eventually. It's the
> userland that has to rollback in such case when it gets a -ESRCH
> retval.
>
> Note that this fork feature is only ever needed in the non-cooperative
> case, these things never need to happen when userfaultfd is used by an
> app (or a lib) that is aware that it is using userfaultfd.
>
> Thanks,
> Andrea
>
> .
>




Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-06 Thread Andrea Arcangeli
Hello Zhong,

On Wed, Mar 06, 2019 at 09:07:00PM +0800, zhong jiang wrote:
> The patch use call_rcu to delay free the task_struct, but It is possible to 
> free the task_struct
> ahead of get_mem_cgroup_from_mm. is it right?

Yes it is possible to free before get_mem_cgroup_from_mm, but if it's
freed before get_mem_cgroup_from_mm rcu_read_lock,
rcu_dereference(mm->owner) will return NULL in such case and there
will be no problem.

The simple fix also clears the mm->owner of the failed-fork-mm before
doing the call_rcu. The call_rcu delays the freeing until no other CPU
is running in between rcu_read_lock/unlock anymore. That guarantees that
those critical sections will see mm->owner == NULL if the freeing of
the task struct already happened.
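
The reader side of that guarantee can be sketched in plain user-space C. This is only an illustration under stated assumptions: the toy_* types, the function name and the fallback to a root group are hypothetical, and in the kernel the load would be rcu_dereference(mm->owner) inside an rcu_read_lock()/rcu_read_unlock() section.

```c
#include <assert.h>
#include <stddef.h>

struct toy_cgroup { int id; };
struct toy_task   { struct toy_cgroup *memcg; };
struct toy_mm     { struct toy_task *owner; };

/* Assumed fallback group, used when the mm has no owner. */
static struct toy_cgroup toy_root = { .id = 0 };

/*
 * Hypothetical reader. It loads the owner pointer once and then works
 * only with that local copy, so it sees either NULL or a task that the
 * deferred free keeps alive until the read-side section ends.
 */
static struct toy_cgroup *toy_get_cgroup_from_mm(const struct toy_mm *mm)
{
	struct toy_task *task;

	assert(mm != NULL);
	task = mm->owner;       /* kernel: rcu_dereference(mm->owner) */
	if (!task)
		return &toy_root;   /* owner already cleared by the failed fork */
	return task->memcg;
}
```

A reader that starts after the owner was cleared takes the NULL branch; a reader that loaded the pointer earlier relies on the delayed free for the task to stay valid.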

The solution Mike suggested for this, and that we were considering as
ideal in the past for the signal issue too, is to move the uffd
delivery to a point where fork is guaranteed to succeed. We should
probably try that too to see what it looks like and if it can be done
in a non-intrusive way, but the simple fix that uses RCU should work
too.

Rolling back in case of errors inside fork itself isn't easily doable:
the moment we push the uffd ctx to the other side of the uffd pipe
there's no coming back as that information can reach the userland of
the uffd monitor/reader thread immediately after. The rolling back is
really the other thread failing at mmget_not_zero eventually. It's the
userland that has to rollback in such case when it gets a -ESRCH
retval.
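
On the monitor side, the userland rollback could look roughly like the sketch below. It assumes the usual syscall convention that a failed ioctl(uffd, UFFDIO_COPY, ...) returns -1 with errno set to ESRCH; the enum and function names are hypothetical and the policy for other errors is a placeholder.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical decisions a uffd monitor might take after a copy attempt. */
enum toy_action {
	TOY_CONTINUE,      /* copy succeeded, keep serving this context */
	TOY_DROP_CONTEXT,  /* the target mm is gone: forget the forked uffd */
	TOY_FATAL          /* any other error: left to the monitor's policy */
};

/*
 * ret and err are assumed to be the return value and errno of an
 * ioctl(uffd, UFFDIO_COPY, ...) call made on a context received
 * through a fork event.
 */
static enum toy_action toy_after_copy(int ret, int err)
{
	if (ret == 0)
		return TOY_CONTINUE;
	if (err == ESRCH)
		return TOY_DROP_CONTEXT;  /* e.g. the fork failed after the event */
	return TOY_FATAL;
}
```

A monitor would call toy_after_copy(ret, errno) right after the ioctl and close the per-child uffd when it gets TOY_DROP_CONTEXT.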

Note that this fork feature is only ever needed in the non-cooperative
case, these things never need to happen when userfaultfd is used by an
app (or a lib) that is aware that it is using userfaultfd.

Thanks,
Andrea


Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-06 Thread zhong jiang
On 2019/3/6 16:12, Peter Xu wrote:
> On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote:
>> On 2019/3/6 14:26, Mike Rapoport wrote:
>>> Hi,
>>>
>>> On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
>>>> On 2019/3/6 10:05, Andrea Arcangeli wrote:
>>>>> Hello everyone,
>>>>>
>>>>> [ CC'ed Mike and Peter ]
>>>>>
>>>>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
>>>>>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
>>>>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote:
>>>>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>>>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote:
>>>>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
>>>>>>>>>>>> Hi, guys
>>>>>>>>>>>>
>>>>>>>>>>>> I also hit the following issue. but it fails to reproduce the
>>>>>>>>>>>> issue by the log.
>>>>>>>>>>>>
>>>>>>>>>>>> it seems to the case that we access the mm->owner and deference it
>>>>>>>>>>>> will result in the UAF.
>>>>>>>>>>>> But it should not be possible that we specify the incomplete
>>>>>>>>>>>> process to be the mm->owner.
>>>>>>>>>>>>
>>>>>>>>>>>> Any thoughts?
>>>>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>>>>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>>>>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>>>>>>>> something like few instructions inconsistency window.
>>>>>>>>>>>
>>>>>>>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>>>>>>>
>>>>>>>>>> I guess that you mean some smb barriers should be taken into
>>>>>>>>>> account.:-)
>>>>>>>>>>
>>>>>>>>>> Because IMO, It should not be the lock case to result in the issue.
>>>>>>>>> Since the crash was triggered on x86 _most likley_ this is not a
>>>>>>>>> missed barrier. What I meant is that one thread needs to executed some
>>>>>>>>> code, while another thread is stopped within few instructions.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> It is weird and I can not find any relationship you had said with the
>>>>>>>> issue.:-(
>>>>>>>>
>>>>>>>> Because It is the cause that mm->owner has been freed, whereas we
>>>>>>>> still deference it.
>>>>>>>>
>>>>>>>> From the lastest freed task call trace, It fails to create process.
>>>>>>>>
>>>>>>>> Am I miss something or I misunderstand your meaning. Please correct me.
>>>>>>> Your analysis looks correct. I am just saying that the root cause of
>>>>>>> this use-after-free seems to be a race condition.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Yep, Indeed,  I can not figure out how the race works. I will dig up
>>>>>> further.
>>>>> Yes it's a race condition.
>>>>>
>>>>> We were aware about the non-cooperative fork userfaultfd feature
>>>>> creating userfaultfd file descriptor that gets reported to the parent
>>>>> uffd, despite they belong to mm created by failed forks.
>>>>>
>>>>> https://www.spinics.net/lists/linux-mm/msg136357.html
>>>>>
>>>> Hi, Andrea
>>>>
>>>> I still not clear why uffd ioctl can use the incomplete process as the
>>>> mm->owner.
>>>> and how to produce the race.
>>> There is a C reproducer in  the syzcaller report:
>>>
>>> https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
>>>
>>>> From your above explainations,   My underdtanding is that the process
>>>> handling do_exexve
>>>> will have a temporary mm,  which will be used by the UUFD ioctl.
>>> The race is between userfaultfd operation and fork() failure:
>>>
>>> forking thread                  | userfaultfd monitor thread
>>> --------------------------------+-------------------------------
>>> fork()                          |
>>>   dup_mmap()                    |
>>>     dup_userfaultfd()           |
>>>     dup_userfaultfd_complete()  |
>>>                                 |  read(UFFD_EVENT_FORK)
>>>                                 |  uffdio_copy()
>>>                                 |    mmget_not_zero()
>>>   goto bad_fork_something       |
>>>   ...                           |
>>> bad_fork_free:                  |
>>>   free_task()                   |
>>>                                 |  mem_cgroup_from_task()
>>>                                 |    /* access stale mm->owner */
>>>  
>> Hi, Mike
> Hi, Zhong,
>
>> forking thread fails to create the process ,and then free the allocated task 
>> struct.
>> Other userfaultfd monitor thread should not access the stale mm->owner.
>>
>> The parent process and child process do not share the mm struct.  
>> Userfaultfd monitor thread's
>> mm->owner should not point to the freed child task_struct.
> IIUC the problem is that above mm (of the mm->owner) is the child
> process's mm rather than the uffd monitor's.  When
> dup_userfaultfd_complete() is called there will be a new userfaultfd
> context sent to the uffd monitor thread which linked to the chlid
> process's mm, and if the monitor thread do UFFDIO_COPY upon the newly
> received userfaultfd it'll operate on that new mm 

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-06 Thread Mike Rapoport
On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote:
> On 2019/3/6 14:26, Mike Rapoport wrote:
> > Hi,
> >
> > On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
> >> On 2019/3/6 10:05, Andrea Arcangeli wrote:
> >>> Hello everyone,
> >>>
> >>> [ CC'ed Mike and Peter ]
> >>>
> >>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
> >>>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
> >>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote:
> >>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> >>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote:
> >>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
> >>>>>>>>>> Hi, guys
> >>>>>>>>>>
> >>>>>>>>>> I also hit the following issue. but it fails to reproduce the
> >>>>>>>>>> issue by the log.
> >>>>>>>>>>
> >>>>>>>>>> it seems to the case that we access the mm->owner and deference it
> >>>>>>>>>> will result in the UAF.
> >>>>>>>>>> But it should not be possible that we specify the incomplete
> >>>>>>>>>> process to be the mm->owner.
> >>>>>>>>>>
> >>>>>>>>>> Any thoughts?
> >>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
> >>>>>>>>> This looks like a very subtle race (threaded reproducer that runs
> >>>>>>>>> repeatedly in multiple processes), so most likely we are looking for
> >>>>>>>>> something like few instructions inconsistency window.
> >>>>>>>>>
> >>>>>>>> I has a little doubtful about the instrustions inconsistency window.
> >>>>>>>>
> >>>>>>>> I guess that you mean some smb barriers should be taken into
> >>>>>>>> account.:-)
> >>>>>>>>
> >>>>>>>> Because IMO, It should not be the lock case to result in the issue.
> >>>>>>> Since the crash was triggered on x86 _most likley_ this is not a
> >>>>>>> missed barrier. What I meant is that one thread needs to executed some
> >>>>>>> code, while another thread is stopped within few instructions.
> >>>>>>>
> >>>>>>>
> >>>>>> It is weird and I can not find any relationship you had said with the
> >>>>>> issue.:-(
> >>>>>>
> >>>>>> Because It is the cause that mm->owner has been freed, whereas we
> >>>>>> still deference it.
> >>>>>>
> >>>>>> From the lastest freed task call trace, It fails to create process.
> >>>>>>
> >>>>>> Am I miss something or I misunderstand your meaning. Please correct me.
> >>>>> Your analysis looks correct. I am just saying that the root cause of
> >>>>> this use-after-free seems to be a race condition.
> >>>>>
> >>>>>
> >>>>>
> >>>> Yep, Indeed,  I can not figure out how the race works. I will dig up
> >>>> further.
> >>> Yes it's a race condition.
> >>>
> >>> We were aware about the non-cooperative fork userfaultfd feature
> >>> creating userfaultfd file descriptor that gets reported to the parent
> >>> uffd, despite they belong to mm created by failed forks.
> >>>
> >>> https://www.spinics.net/lists/linux-mm/msg136357.html
> >>>
> >> Hi, Andrea
> >>
> >> I still not clear why uffd ioctl can use the incomplete process as the 
> >> mm->owner.
> >> and how to produce the race.
> > There is a C reproducer in  the syzcaller report:
> >
> > https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
> >  
> >> From your above explainations,   My underdtanding is that the process 
> >> handling do_exexve
> >> will have a temporary mm,  which will be used by the UUFD ioctl.
> > The race is between userfaultfd operation and fork() failure:
> >
> > forking thread                  | userfaultfd monitor thread
> > --------------------------------+-------------------------------
> > fork()                          |
> >   dup_mmap()                    |
> >     dup_userfaultfd()           |
> >     dup_userfaultfd_complete()  |
> >                                 |  read(UFFD_EVENT_FORK)
> >                                 |  uffdio_copy()
> >                                 |    mmget_not_zero()
> >   goto bad_fork_something       |
> >   ...                           |
> > bad_fork_free:                  |
> >   free_task()                   |
> >                                 |  mem_cgroup_from_task()
> >                                 |    /* access stale mm->owner */
> >  
> Hi, Mike
> 
> forking thread fails to create the process ,and then free the allocated task 
> struct.
> Other userfaultfd monitor thread should not access the stale mm->owner.
> 
> The parent process and child process do not share the mm struct.  Userfaultfd 
> monitor thread's
> mm->owner should not point to the freed child task_struct.

Userfaultfd can monitor remote mm's [1]. In this case, dup_userfaultfd() and
dup_userfaultfd_complete() create uffd context for the new process and
notify userspace uffd monitor about this new context. The uffd monitor then
can perform uffd operations on the new context.

On the right side the mmget_not_zero() will take the reference for the mm of
the newly created process.

[1] 

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-06 Thread Peter Xu
On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote:
> On 2019/3/6 14:26, Mike Rapoport wrote:
> > Hi,
> >
> > On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
> >> On 2019/3/6 10:05, Andrea Arcangeli wrote:
> >>> Hello everyone,
> >>>
> >>> [ CC'ed Mike and Peter ]
> >>>
> >>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
> >>>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
> >>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote:
> >>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> >>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote:
> >>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
> >>>>>>>>>> Hi, guys
> >>>>>>>>>>
> >>>>>>>>>> I also hit the following issue. but it fails to reproduce the
> >>>>>>>>>> issue by the log.
> >>>>>>>>>>
> >>>>>>>>>> it seems to the case that we access the mm->owner and deference it
> >>>>>>>>>> will result in the UAF.
> >>>>>>>>>> But it should not be possible that we specify the incomplete
> >>>>>>>>>> process to be the mm->owner.
> >>>>>>>>>>
> >>>>>>>>>> Any thoughts?
> >>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
> >>>>>>>>> This looks like a very subtle race (threaded reproducer that runs
> >>>>>>>>> repeatedly in multiple processes), so most likely we are looking for
> >>>>>>>>> something like few instructions inconsistency window.
> >>>>>>>>>
> >>>>>>>> I has a little doubtful about the instrustions inconsistency window.
> >>>>>>>>
> >>>>>>>> I guess that you mean some smb barriers should be taken into
> >>>>>>>> account.:-)
> >>>>>>>>
> >>>>>>>> Because IMO, It should not be the lock case to result in the issue.
> >>>>>>> Since the crash was triggered on x86 _most likley_ this is not a
> >>>>>>> missed barrier. What I meant is that one thread needs to executed some
> >>>>>>> code, while another thread is stopped within few instructions.
> >>>>>>>
> >>>>>>>
> >>>>>> It is weird and I can not find any relationship you had said with the
> >>>>>> issue.:-(
> >>>>>>
> >>>>>> Because It is the cause that mm->owner has been freed, whereas we
> >>>>>> still deference it.
> >>>>>>
> >>>>>> From the lastest freed task call trace, It fails to create process.
> >>>>>>
> >>>>>> Am I miss something or I misunderstand your meaning. Please correct me.
> >>>>> Your analysis looks correct. I am just saying that the root cause of
> >>>>> this use-after-free seems to be a race condition.
> >>>>>
> >>>>>
> >>>>>
> >>>> Yep, Indeed,  I can not figure out how the race works. I will dig up
> >>>> further.
> >>> Yes it's a race condition.
> >>>
> >>> We were aware about the non-cooperative fork userfaultfd feature
> >>> creating userfaultfd file descriptor that gets reported to the parent
> >>> uffd, despite they belong to mm created by failed forks.
> >>>
> >>> https://www.spinics.net/lists/linux-mm/msg136357.html
> >>>
> >> Hi, Andrea
> >>
> >> I still not clear why uffd ioctl can use the incomplete process as the 
> >> mm->owner.
> >> and how to produce the race.
> > There is a C reproducer in  the syzcaller report:
> >
> > https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
> >  
> >> From your above explainations,   My underdtanding is that the process 
> >> handling do_exexve
> >> will have a temporary mm,  which will be used by the UUFD ioctl.
> > The race is between userfaultfd operation and fork() failure:
> >
> > forking thread                  | userfaultfd monitor thread
> > --------------------------------+-------------------------------
> > fork()                          |
> >   dup_mmap()                    |
> >     dup_userfaultfd()           |
> >     dup_userfaultfd_complete()  |
> >                                 |  read(UFFD_EVENT_FORK)
> >                                 |  uffdio_copy()
> >                                 |    mmget_not_zero()
> >   goto bad_fork_something       |
> >   ...                           |
> > bad_fork_free:                  |
> >   free_task()                   |
> >                                 |  mem_cgroup_from_task()
> >                                 |    /* access stale mm->owner */
> >  
> Hi, Mike

Hi, Zhong,

> 
> forking thread fails to create the process ,and then free the allocated task 
> struct.
> Other userfaultfd monitor thread should not access the stale mm->owner.
> 
> The parent process and child process do not share the mm struct.  Userfaultfd 
> monitor thread's
> mm->owner should not point to the freed child task_struct.

IIUC the problem is that the above mm (of the mm->owner) is the child
process's mm rather than the uffd monitor's.  When
dup_userfaultfd_complete() is called there will be a new userfaultfd
context sent to the uffd monitor thread which is linked to the child
process's mm, and if the monitor thread does UFFDIO_COPY upon the newly
received userfaultfd it'll operate on that new mm too.

> 
> and due to the existence of tasklist_lock,  

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-05 Thread zhong jiang
On 2019/3/6 14:26, Mike Rapoport wrote:
> Hi,
>
> On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
>> On 2019/3/6 10:05, Andrea Arcangeli wrote:
>>> Hello everyone,
>>>
>>> [ CC'ed Mike and Peter ]
>>>
>>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
>>>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
>>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote:
>>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote:
>>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
>>>>>>>>>> Hi, guys
>>>>>>>>>>
>>>>>>>>>> I also hit the following issue. but it fails to reproduce the issue
>>>>>>>>>> by the log.
>>>>>>>>>>
>>>>>>>>>> it seems to the case that we access the mm->owner and deference it
>>>>>>>>>> will result in the UAF.
>>>>>>>>>> But it should not be possible that we specify the incomplete process
>>>>>>>>>> to be the mm->owner.
>>>>>>>>>>
>>>>>>>>>> Any thoughts?
>>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>>>>>> something like few instructions inconsistency window.
>>>>>>>>>
>>>>>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>>>>>
>>>>>>>> I guess that you mean some smb barriers should be taken into
>>>>>>>> account.:-)
>>>>>>>>
>>>>>>>> Because IMO, It should not be the lock case to result in the issue.
>>>>>>> Since the crash was triggered on x86 _most likley_ this is not a
>>>>>>> missed barrier. What I meant is that one thread needs to executed some
>>>>>>> code, while another thread is stopped within few instructions.
>>>>>>>
>>>>>>>
>>>>>> It is weird and I can not find any relationship you had said with the
>>>>>> issue.:-(
>>>>>>
>>>>>> Because It is the cause that mm->owner has been freed, whereas we still
>>>>>> deference it.
>>>>>>
>>>>>> From the lastest freed task call trace, It fails to create process.
>>>>>>
>>>>>> Am I miss something or I misunderstand your meaning. Please correct me.
>>>>> Your analysis looks correct. I am just saying that the root cause of
>>>>> this use-after-free seems to be a race condition.
>>>>>
>>>>>
>>>>>
>>>> Yep, Indeed,  I can not figure out how the race works. I will dig up
>>>> further.
>>> Yes it's a race condition.
>>>
>>> We were aware about the non-cooperative fork userfaultfd feature
>>> creating userfaultfd file descriptor that gets reported to the parent
>>> uffd, despite they belong to mm created by failed forks.
>>>
>>> https://www.spinics.net/lists/linux-mm/msg136357.html
>>>
>> Hi, Andrea
>>
>> I still not clear why uffd ioctl can use the incomplete process as the 
>> mm->owner.
>> and how to produce the race.
> There is a C reproducer in  the syzcaller report:
>
> https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
>  
>> From your above explainations,   My underdtanding is that the process 
>> handling do_exexve
>> will have a temporary mm,  which will be used by the UUFD ioctl.
> The race is between userfaultfd operation and fork() failure:
>
> forking thread                  | userfaultfd monitor thread
> --------------------------------+-------------------------------
> fork()                          |
>   dup_mmap()                    |
>     dup_userfaultfd()           |
>     dup_userfaultfd_complete()  |
>                                 |  read(UFFD_EVENT_FORK)
>                                 |  uffdio_copy()
>                                 |    mmget_not_zero()
>   goto bad_fork_something       |
>   ...                           |
> bad_fork_free:                  |
>   free_task()                   |
>                                 |  mem_cgroup_from_task()
>                                 |    /* access stale mm->owner */
>  
Hi, Mike

The forking thread fails to create the process and then frees the allocated
task struct.
The other thread, the userfaultfd monitor, should not access the stale
mm->owner.

The parent process and the child process do not share the mm struct, so the
userfaultfd monitor thread's mm->owner should not point to the freed child
task_struct.

And due to the existence of tasklist_lock, mm->owner cannot be set to a
freed task_struct.

I must be missing something. =-O

Thanks,
zhong jiang
>> Thanks,
>> zhong jiang




Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-05 Thread Mike Rapoport
Hi,

On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
> On 2019/3/6 10:05, Andrea Arcangeli wrote:
> > Hello everyone,
> >
> > [ CC'ed Mike and Peter ]
> >
> > On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
> >> On 2019/3/5 14:26, Dmitry Vyukov wrote:
> >>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang  wrote:
> >>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> >>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote:
> >>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
> >>>>>>>> Hi, guys
> >>>>>>>>
> >>>>>>>> I also hit the following issue. but it fails to reproduce the issue
> >>>>>>>> by the log.
> >>>>>>>>
> >>>>>>>> it seems to the case that we access the mm->owner and deference it
> >>>>>>>> will result in the UAF.
> >>>>>>>> But it should not be possible that we specify the incomplete process
> >>>>>>>> to be the mm->owner.
> >>>>>>>>
> >>>>>>>> Any thoughts?
> >>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
> >>>>>>> This looks like a very subtle race (threaded reproducer that runs
> >>>>>>> repeatedly in multiple processes), so most likely we are looking for
> >>>>>>> something like few instructions inconsistency window.
> >>>>>>>
> >>>>>> I has a little doubtful about the instrustions inconsistency window.
> >>>>>>
> >>>>>> I guess that you mean some smb barriers should be taken into
> >>>>>> account.:-)
> >>>>>>
> >>>>>> Because IMO, It should not be the lock case to result in the issue.
> >>>>> Since the crash was triggered on x86 _most likley_ this is not a
> >>>>> missed barrier. What I meant is that one thread needs to executed some
> >>>>> code, while another thread is stopped within few instructions.
> >>>>>
> >>>>>
> >>>> It is weird and I can not find any relationship you had said with the
> >>>> issue.:-(
> >>>>
> >>>> Because It is the cause that mm->owner has been freed, whereas we still
> >>>> deference it.
> >>>>
> >>>> From the lastest freed task call trace, It fails to create process.
> >>>>
> >>>> Am I miss something or I misunderstand your meaning. Please correct me.
> >>> Your analysis looks correct. I am just saying that the root cause of
> >>> this use-after-free seems to be a race condition.
> >>>
> >>>
> >>>
> >> Yep, Indeed,  I can not figure out how the race works. I will dig up 
> >> further.
> > Yes it's a race condition.
> >
> > We were aware about the non-cooperative fork userfaultfd feature
> > creating userfaultfd file descriptor that gets reported to the parent
> > uffd, despite they belong to mm created by failed forks.
> >
> > https://www.spinics.net/lists/linux-mm/msg136357.html
> >
> 
> Hi, Andrea
> 
> I still not clear why uffd ioctl can use the incomplete process as the 
> mm->owner.
> and how to produce the race.

There is a C reproducer in the syzkaller report:

https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
 
> From your above explainations,   My underdtanding is that the process 
> handling do_exexve
> will have a temporary mm,  which will be used by the UUFD ioctl.

The race is between userfaultfd operation and fork() failure:

forking thread                  | userfaultfd monitor thread
--------------------------------+-------------------------------
fork()                          |
  dup_mmap()                    |
    dup_userfaultfd()           |
    dup_userfaultfd_complete()  |
                                |  read(UFFD_EVENT_FORK)
                                |  uffdio_copy()
                                |    mmget_not_zero()
  goto bad_fork_something       |
  ...                           |
bad_fork_free:                  |
  free_task()                   |
                                |  mem_cgroup_from_task()
                                |    /* access stale mm->owner */
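
The interleaving above can be replayed deterministically with a small
userspace toy. This is only a sketch under loud assumptions: the toy_*
types and functions are made up for illustration and are not the kernel
definitions, and "freeing" is modelled with a flag so the example never
touches freed memory.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for task_struct and mm_struct (illustration only). */
struct toy_task { bool freed; };
struct toy_mm { struct toy_task *owner; int users; };

/* Models free_task() on the fork error path: mm->owner is left as is. */
static void toy_free_task(struct toy_task *t)
{
	t->freed = true;
}

/* Models mmget_not_zero(): succeeds while the mm still has users. */
static bool toy_mmget_not_zero(struct toy_mm *mm)
{
	if (mm->users == 0)
		return false;
	mm->users++;
	return true;
}

/* Models the mm->owner dereference done on behalf of the monitor. */
static bool toy_owner_is_stale(const struct toy_mm *mm)
{
	return mm->owner != NULL && mm->owner->freed;
}

/* Replays the diagram; returns true if the monitor sees a freed owner. */
static bool toy_replay_race(void)
{
	struct toy_task child = { .freed = false };
	struct toy_mm child_mm = { .owner = &child, .users = 1 };

	/* monitor: read(UFFD_EVENT_FORK), uffdio_copy(), mmget_not_zero() */
	if (!toy_mmget_not_zero(&child_mm))
		return false;

	/* forking thread: goto bad_fork_..., free_task() */
	toy_free_task(&child);

	/* monitor: mem_cgroup_from_task(mm->owner) */
	return toy_owner_is_stale(&child_mm);
}
```

In this toy, taking the mm reference succeeds before the fork fails, so
the later owner lookup lands on a task that has already been released.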
 
> Thanks,
> zhong jiang

-- 
Sincerely yours,
Mike.



Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-05 Thread zhong jiang
On 2019/3/6 10:05, Andrea Arcangeli wrote:
> Hello everyone,
>
> [ CC'ed Mike and Peter ]
>
> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang  wrote:
>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote:
>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
>>>>>>>> Hi, guys
>>>>>>>>
>>>>>>>> I also hit the following issue. but it fails to reproduce the issue by
>>>>>>>> the log.
>>>>>>>>
>>>>>>>> it seems to the case that we access the mm->owner and deference it
>>>>>>>> will result in the UAF.
>>>>>>>> But it should not be possible that we specify the incomplete process
>>>>>>>> to be the mm->owner.
>>>>>>>>
>>>>>>>> Any thoughts?
>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>>>> something like few instructions inconsistency window.
>>>>>>>
>>>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>>>
>>>>>> I guess that you mean some smb barriers should be taken into account.:-)
>>>>>>
>>>>>> Because IMO, It should not be the lock case to result in the issue.
>>>>> Since the crash was triggered on x86 _most likley_ this is not a
>>>>> missed barrier. What I meant is that one thread needs to executed some
>>>>> code, while another thread is stopped within few instructions.
>>>>>
>>>>>
>>>> It is weird and I can not find any relationship you had said with the
>>>> issue.:-(
>>>>
>>>> Because It is the cause that mm->owner has been freed, whereas we still
>>>> deference it.
>>>>
>>>> From the lastest freed task call trace, It fails to create process.
>>>>
>>>> Am I miss something or I misunderstand your meaning. Please correct me.
>>> Your analysis looks correct. I am just saying that the root cause of
>>> this use-after-free seems to be a race condition.
>>>
>>>
>>>
>> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
> Yes it's a race condition.
>
> We were aware about the non-cooperative fork userfaultfd feature
> creating userfaultfd file descriptor that gets reported to the parent
> uffd, despite they belong to mm created by failed forks.
>
> https://www.spinics.net/lists/linux-mm/msg136357.html
>
> The fork failure in my testcase happened because of signal pending
> that interrupted fork after the failed-fork uffd context, was already
> pushed to the userfaultfd reader/monitor. CRIU then takes care of
> filtering the failed fork cases so we didn't want to make the fork
> code more complicated just for userfaultfd.
>
> In reality if MEMCG is enabled at build time, mm->owner maintainance
> code now creates a race condition in the above case, with any fork
> failure.
>
> I pinged Mike yesterday to ask if my theory could be true for this bug
> and one solution he suggested is to do the userfaultfd_dup at a point
> where fork cannot fail anymore. That's precisely what we were
> wondering to do back then to avoid the failed fork reports to the
> non cooperative uffd monitor.
>
> That will solve the false positive deliveries that CRIU manager
> currently filters out too. From a theoretical standpoint it's also
> quite strange to even allow any uffd ioctl to run on a otherwise long
> gone mm created for a process that in the end wasn't even created (the
> mm got temporarily fully created, but no child task really ever used
> such mm). However that mm is on its way to exit_mmap as soon as the
> ioclt returns and this only ever happens during race conditions, so
> the way CRIU monitor works there wasn't anything fundamentally
> concerning about this detail, despite it's remarkably "strange". Our
> priority was to keep the fork code as simple as possible and keep
> userfaultfd as non intrusive as possible.

Hi, Andrea

It is still not clear to me why the uffd ioctl can use the incomplete
process as the mm->owner, or how to trigger the race.

From your explanation above, my understanding is that the process handling
do_execve will have a temporary mm, which will be used by the UFFD ioctl.

Thanks,
zhong jiang
> One alternative solution I'm wondering about for this memcg issue is
> to free the task struct with RCU also when fork has failed and to add
> the mm_update_next_owner before mmput. That will still report failed
> forks to the uffd monitor, so it's not the ideal fix, but since it's
> probably simpler I'm posting it below. Also I couldn't reproduce the
> problem with the testcase here yet.
>
> From 6cbf9d377b705476e5226704422357176f79e32c Mon Sep 17 00:00:00 2001
> From: Andrea Arcangeli 
> Date: Tue, 5 Mar 2019 19:21:37 -0500
> Subject: [PATCH 1/1] userfaultfd: use RCU to free the task struct when fork
>  fails if MEMCG
>
> MEMCG depends on the task structure not to be 

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-05 Thread Andrea Arcangeli
Hello everyone,

[ CC'ed Mike and Peter ]

On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
> On 2019/3/5 14:26, Dmitry Vyukov wrote:
> > On Mon, Mar 4, 2019 at 4:32 PM zhong jiang  wrote:
> >> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> >>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang  wrote:
> >>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
> >>>>>> Hi, guys
> >>>>>>
> >>>>>> I also hit the following issue. but it fails to reproduce the issue by
> >>>>>> the log.
> >>>>>>
> >>>>>> it seems to the case that we access the mm->owner and deference it
> >>>>>> will result in the UAF.
> >>>>>> But it should not be possible that we specify the incomplete process
> >>>>>> to be the mm->owner.
> >>>>>>
> >>>>>> Any thoughts?
> >>>>> FWIW syzbot was able to reproduce this with this reproducer.
> >>>>> This looks like a very subtle race (threaded reproducer that runs
> >>>>> repeatedly in multiple processes), so most likely we are looking for
> >>>>> something like few instructions inconsistency window.
> >>>>>
> >>>> I has a little doubtful about the instrustions inconsistency window.
> >>>>
> >>>> I guess that you mean some smb barriers should be taken into account.:-)
> >>>>
> >>>> Because IMO, It should not be the lock case to result in the issue.
> >>> Since the crash was triggered on x86 _most likley_ this is not a
> >>> missed barrier. What I meant is that one thread needs to executed some
> >>> code, while another thread is stopped within few instructions.
> >>>
> >>>
> >> It is weird and I can not find any relationship you had said with the 
> >> issue.:-(
> >>
> >> Because It is the cause that mm->owner has been freed, whereas we still 
> >> deference it.
> >>
> >> From the lastest freed task call trace, It fails to create process.
> >>
> >> Am I miss something or I misunderstand your meaning. Please correct me.
> > Your analysis looks correct. I am just saying that the root cause of
> > this use-after-free seems to be a race condition.
> >
> >
> >
> Yep, Indeed,  I can not figure out how the race works. I will dig up further.

Yes it's a race condition.

We were aware that the non-cooperative fork userfaultfd feature
creates userfaultfd file descriptors that get reported to the parent
uffd even though they belong to an mm created by a failed fork.

https://www.spinics.net/lists/linux-mm/msg136357.html

The fork failure in my testcase happened because a pending signal
interrupted fork after the failed-fork uffd context had already been
pushed to the userfaultfd reader/monitor. CRIU then takes care of
filtering the failed fork cases, so we didn't want to make the fork
code more complicated just for userfaultfd.

In reality, if MEMCG is enabled at build time, the mm->owner
maintenance code now creates a race condition in the above case, with
any fork failure.

I pinged Mike yesterday to ask if my theory could be true for this bug,
and one solution he suggested is to do the userfaultfd_dup at a point
where fork cannot fail anymore. That's precisely what we were
considering back then to avoid the failed fork reports to the
non-cooperative uffd monitor.

That will also solve the false positive deliveries that the CRIU
manager currently filters out. From a theoretical standpoint it's also
quite strange to even allow any uffd ioctl to run on an otherwise long
gone mm created for a process that in the end wasn't even created (the
mm got temporarily fully created, but no child task ever really used
such mm). However, that mm is on its way to exit_mmap as soon as the
ioctl returns, and this only ever happens during race conditions, so
given the way the CRIU monitor works there wasn't anything
fundamentally concerning about this detail, despite it being
remarkably "strange". Our priority was to keep the fork code as simple
as possible and keep userfaultfd as non-intrusive as possible.

One alternative solution I'm wondering about for this memcg issue is
to free the task struct with RCU also when fork has failed and to add
the mm_update_next_owner before mmput. That will still report failed
forks to the uffd monitor, so it's not the ideal fix, but since it's
probably simpler I'm posting it below. Also I couldn't reproduce the
problem with the testcase here yet.

From 6cbf9d377b705476e5226704422357176f79e32c Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli 
Date: Tue, 5 Mar 2019 19:21:37 -0500
Subject: [PATCH 1/1] userfaultfd: use RCU to free the task struct when fork
 fails if MEMCG

MEMCG depends on the task structure not to be freed under
rcu_read_lock() in get_mem_cgroup_from_mm() after it dereferences
mm->owner.

A better fix would be to avoid registering forked vmas in userfaultfd
contexts reported to the monitor, in case fork ends up failing.

Signed-off-by: Andrea Arcangeli 
---
 kernel/fork.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index eb9953c82104..3bcbb361ffbc 
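
[ The rest of the diff is cut off in this archive. ]

As a rough illustration of the idea in the commit message above (and
not the posted patch), the sketch below models deferring the free of a
task until no read-side section is active. Everything named rcu_toy_*
is hypothetical; a single reader counter stands in for a real RCU grace
period, which the kernel tracks very differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy task: "freed" is a flag so the sketch never touches freed memory. */
struct rcu_toy_task { bool freed; };

static int rcu_toy_readers;                  /* active read-side sections */
static struct rcu_toy_task *rcu_toy_pending; /* one deferred free at most */

/* Run the deferred free only when no reader is left. */
static void rcu_toy_run_pending(void)
{
	if (rcu_toy_readers == 0 && rcu_toy_pending != NULL) {
		rcu_toy_pending->freed = true;
		rcu_toy_pending = NULL;
	}
}

/* Stand-in for rcu_read_lock(). */
static void rcu_toy_read_lock(void)
{
	rcu_toy_readers++;
}

/* Stand-in for rcu_read_unlock(). */
static void rcu_toy_read_unlock(void)
{
	rcu_toy_readers--;
	rcu_toy_run_pending();
}

/* Stand-in for freeing the task through call_rcu() when fork fails. */
static void rcu_toy_free_task(struct rcu_toy_task *t)
{
	rcu_toy_pending = t;
	rcu_toy_run_pending();
}

/* Returns true if the reader observed a freed task inside its section. */
static bool rcu_toy_reader_sees_freed_task(void)
{
	struct rcu_toy_task child = { .freed = false };
	struct rcu_toy_task *owner = &child;  /* plays the role of mm->owner */
	bool stale;

	rcu_toy_read_lock();
	rcu_toy_free_task(&child);   /* fork fails while the reader runs */
	stale = owner->freed;        /* dereference inside the section */
	rcu_toy_read_unlock();       /* the deferred free may run now */
	return stale;
}
```

In this model the reader never sees the task as freed, because the free
is held back until the read-side section ends.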

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-04 Thread zhong jiang
On 2019/3/5 14:26, Dmitry Vyukov wrote:
> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang  wrote:
>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang  wrote:
>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
>>>>>> Hi, guys
>>>>>>
>>>>>> I also hit the following issue. but it fails to reproduce the issue by
>>>>>> the log.
>>>>>>
>>>>>> it seems to the case that we access the mm->owner and deference it will
>>>>>> result in the UAF.
>>>>>> But it should not be possible that we specify the incomplete process to
>>>>>> be the mm->owner.
>>>>>>
>>>>>> Any thoughts?
>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>> something like few instructions inconsistency window.
>>>>>
>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>
>>>> I guess that you mean some smb barriers should be taken into account.:-)
>>>>
>>>> Because IMO, It should not be the lock case to result in the issue.
>>> Since the crash was triggered on x86 _most likley_ this is not a
>>> missed barrier. What I meant is that one thread needs to executed some
>>> code, while another thread is stopped within few instructions.
>>>
>>>
>> It is weird and I can not find any relationship you had said with the 
>> issue.:-(
>>
>> Because It is the cause that mm->owner has been freed, whereas we still 
>> deference it.
>>
>> From the lastest freed task call trace, It fails to create process.
>>
>> Am I miss something or I misunderstand your meaning. Please correct me.
> Your analysis looks correct. I am just saying that the root cause of
> this use-after-free seems to be a race condition.
>
>
>
Yep, indeed. I cannot figure out how the race works. I will dig further.

Thanks,
zhong jiang
>
>>>>>> On 2018/12/4 23:43, syzbot wrote:
>>>>>>> syzbot has found a reproducer for the following crash on:
>>>>>>>
>>>>>>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of
>>>>>>> git://git.kernel..
>>>>>>> git tree:   upstream
>>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
>>>>>>> kernel config:
>>>>>>> https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
>>>>>>> dashboard link:
>>>>>>> https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
>>>>>>> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>>>>>>> syz repro:
>>>>>>> https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
>>>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
>>>>>>>
>>>>>>> IMPORTANT: if you fix the bug, please add the following tag to the
>>>>>>> commit:
>>>>>>> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com
>>>>>>>
>>>>>>> cgroup: fork rejected by pids controller in /syz2
>>>>>>> ==
>>>>>>> BUG: KASAN: use-after-free in __read_once_size
>>>>>>> include/linux/compiler.h:182 [inline]
>>>>>>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477
>>>>>>> [inline]
>>>>>>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815
>>>>>>> [inline]
>>>>>>> BUG: KASAN: use-after-free in
>>>>>>> get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>>>>>> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332
>>>>>>>
>>>>>>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
>>>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>>>>>>> Google 01/01/2011
>>>>>>> Call Trace:
>>>>>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>>>>>  dump_stack+0x244/0x39d lib/dump_stack.c:113
>>>>>>>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
>>>>>>>  kasan_report_error mm/kasan/report.c:354 [inline]
>>>>>>>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
>>>>>>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>>>>>>>  __read_once_size include/linux/compiler.h:182 [inline]
>>>>>>>  task_css include/linux/cgroup.h:477 [inline]
>>>>>>>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>>>>>>>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>>>>>>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
>>>>>>>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
>>>>>>>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
>>>>>>>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
>>>>>>>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
>>>>>>>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
>>>>>>>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
>>>>>>>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
>>>>>>>  vfs_ioctl fs/ioctl.c:46 [inline]
>>>>>>>  file_ioctl fs/ioctl.c:509 [inline]
>>>>>>>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
>>>>>>>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
>>>>>>>  __do_sys_ioctl

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-04 Thread Dmitry Vyukov
On Mon, Mar 4, 2019 at 4:32 PM zhong jiang  wrote:
>
> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> > On Mon, Mar 4, 2019 at 3:00 PM zhong jiang  wrote:
> >> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang  wrote:
> >>>> Hi, guys
> >>>>
> >>>> I also hit the following issue. but it fails to reproduce the issue by
> >>>> the log.
> >>>>
> >>>> it seems to the case that we access the mm->owner and deference it will
> >>>> result in the UAF.
> >>>> But it should not be possible that we specify the incomplete process to
> >>>> be the mm->owner.
> >>>>
> >>>> Any thoughts?
> >>> FWIW syzbot was able to reproduce this with this reproducer.
> >>> This looks like a very subtle race (threaded reproducer that runs
> >>> repeatedly in multiple processes), so most likely we are looking for
> >>> something like few instructions inconsistency window.
> >>>
> >> I has a little doubtful about the instrustions inconsistency window.
> >>
> >> I guess that you mean some smb barriers should be taken into account.:-)
> >>
> >> Because IMO, It should not be the lock case to result in the issue.
> >
> > Since the crash was triggered on x86 _most likley_ this is not a
> > missed barrier. What I meant is that one thread needs to executed some
> > code, while another thread is stopped within few instructions.
> >
> >
> It is weird and I can not find any relationship you had said with the 
> issue.:-(
>
> Because It is the cause that mm->owner has been freed, whereas we still 
> deference it.
>
> From the lastest freed task call trace, It fails to create process.
>
> Am I miss something or I misunderstand your meaning. Please correct me.

Your analysis looks correct. I am just saying that the root cause of
this use-after-free seems to be a race condition.





> >>>> On 2018/12/4 23:43, syzbot wrote:
> >>>>> syzbot has found a reproducer for the following crash on:
> >>>>>
> >>>>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of
> >>>>> git://git.kernel..
> >>>>> git tree:   upstream
> >>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
> >>>>> kernel config:
> >>>>> https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
> >>>>> dashboard link:
> >>>>> https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
> >>>>> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
> >>>>> syz repro:
> >>>>> https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
> >>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
> >>>>>
> >>>>> IMPORTANT: if you fix the bug, please add the following tag to the
> >>>>> commit:
> >>>>> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com
> >>>>>
> >>>>> cgroup: fork rejected by pids controller in /syz2
> >>>>> ==
> >>>>> BUG: KASAN: use-after-free in __read_once_size
> >>>>> include/linux/compiler.h:182 [inline]
> >>>>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477
> >>>>> [inline]
> >>>>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815
> >>>>> [inline]
> >>>>> BUG: KASAN: use-after-free in
> >>>>> get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> >>>>> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332
> >>>>>
> >>>>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
> >>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >>>>> Google 01/01/2011
> >>>>> Call Trace:
> >>>>>  __dump_stack lib/dump_stack.c:77 [inline]
> >>>>>  dump_stack+0x244/0x39d lib/dump_stack.c:113
> >>>>>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
> >>>>>  kasan_report_error mm/kasan/report.c:354 [inline]
> >>>>>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
> >>>>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >>>>>  __read_once_size include/linux/compiler.h:182 [inline]
> >>>>>  task_css include/linux/cgroup.h:477 [inline]
> >>>>>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> >>>>>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> >>>>>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
> >>>>>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
> >>>>>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
> >>>>>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
> >>>>>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
> >>>>>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
> >>>>>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
> >>>>>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
> >>>>>  vfs_ioctl fs/ioctl.c:46 [inline]
> >>>>>  file_ioctl fs/ioctl.c:509 [inline]
> >>>>>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
> >>>>>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
> >>>>>  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >>>>>  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >>>>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >>>>>

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-04 Thread zhong jiang
On 2019/3/5 5:51, Matthew Wilcox wrote:
> On Mon, Mar 04, 2019 at 12:19:32AM +0800, zhong jiang wrote:
>> I also hit the following issue. but it fails to reproduce the issue by the 
>> log.
>>
>> it seems to the case that we access the mm->owner and deference it will 
>> result in the UAF.
>> But it should not be possible that we specify the incomplete process to be 
>> the mm->owner.
> OK, so we've got thread 9325 calling fork() and failing due to the PID
> controller saying "no".  9325 calls free_task(), but somehow thread 9332
> has a reference to the struct task_struct.  There are two possibilities
> here: one is that 9332 really did manage to get a reference to the larval
> child of 9325, and the other is that 9332 has a stale reference to some
> memory which was reallocated to 9325's child.

Good guess and analysis. IMO, 9332 cannot get hold of the task_struct
directly in the code flow, but it can get a reference to the mm_struct.
Maybe I am missing something important.

> Andrea, is there any way for a UFFD thread to get access to the child's
> task_struct during the copy_process() call?  If so, I think copy_process()
> needs to call mm_update_next_owner().

Yep, I hope Andrea has time to look at this.

Thanks,
zhong jiang
> If there's no way for that to happen, then we have quite a bug-hunt ahead
> of us looking for who is missing a call to mm_update_next_owner().

>> On 2018/12/4 23:43, syzbot wrote:
>>> syzbot has found a reproducer for the following crash on:
>>>
>>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
>>> git tree:   upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
>>> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>>> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com
>>>
>>> cgroup: fork rejected by pids controller in /syz2
>>> ==
>>> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 
>>> [inline]
>>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
>>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 
>>> [inline]
>>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 
>>> mm/memcontrol.c:844
>>> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332
>>>
>>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
>>> Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>  dump_stack+0x244/0x39d lib/dump_stack.c:113
>>>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
>>>  kasan_report_error mm/kasan/report.c:354 [inline]
>>>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
>>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>>>  __read_once_size include/linux/compiler.h:182 [inline]
>>>  task_css include/linux/cgroup.h:477 [inline]
>>>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>>>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
>>>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
>>>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
>>>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
>>>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
>>>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
>>>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
>>>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
>>>  vfs_ioctl fs/ioctl.c:46 [inline]
>>>  file_ioctl fs/ioctl.c:509 [inline]
>>>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
>>>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
>>>  __do_sys_ioctl fs/ioctl.c:720 [inline]
>>>  __se_sys_ioctl fs/ioctl.c:718 [inline]
>>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> RIP: 0033:0x44c7e9
>>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 
>>> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff 
>>> ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
>>> RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010
>>> RAX: ffda RBX: 006e4a08 RCX: 0044c7e9
>>> RDX: 2100 RSI: c028aa03 RDI: 0004
>>> RBP: 006e4a00 R08:  R09: 
>>> R10:  R11: 0246 R12: 006e4a0c
>>> R13: 

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-04 Thread Matthew Wilcox
On Mon, Mar 04, 2019 at 12:19:32AM +0800, zhong jiang wrote:
> I also hit the following issue. but it fails to reproduce the issue by the 
> log.
> 
> it seems to the case that we access the mm->owner and deference it will 
> result in the UAF.
> But it should not be possible that we specify the incomplete process to be 
> the mm->owner.

OK, so we've got thread 9325 calling fork() and failing due to the PID
controller saying "no".  9325 calls free_task(), but somehow thread 9332
has a reference to the struct task_struct.  There are two possibilities
here: one is that 9332 really did manage to get a reference to the larval
child of 9325, and the other is that 9332 has a stale reference to some
memory which was reallocated to 9325's child.

Andrea, is there any way for a UFFD thread to get access to the child's
task_struct during the copy_process() call?  If so, I think copy_process()
needs to call mm_update_next_owner().

If there's no way for that to happen, then we have quite a bug-hunt ahead
of us looking for who is missing a call to mm_update_next_owner().

> On 2018/12/4 23:43, syzbot wrote:
> > syzbot has found a reproducer for the following crash on:
> >
> > HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
> > git tree:   upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
> > dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
> > compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
> > syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com
> >
> > cgroup: fork rejected by pids controller in /syz2
> > ==
> > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 
> > [inline]
> > BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
> > BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 
> > [inline]
> > BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 
> > mm/memcontrol.c:844
> > Read of size 8 at addr 8881b72af310 by task syz-executor198/9332
> >
> > CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> > Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x244/0x39d lib/dump_stack.c:113
> >  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
> >  kasan_report_error mm/kasan/report.c:354 [inline]
> >  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
> >  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >  __read_once_size include/linux/compiler.h:182 [inline]
> >  task_css include/linux/cgroup.h:477 [inline]
> >  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> >  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> >  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
> >  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
> >  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
> >  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
> >  __mcopy_atomic mm/userfaultfd.c:559 [inline]
> >  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
> >  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
> >  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >  file_ioctl fs/ioctl.c:509 [inline]
> >  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
> >  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
> >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x44c7e9
> > Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 
> > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff 
> > ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010
> > RAX: ffda RBX: 006e4a08 RCX: 0044c7e9
> > RDX: 2100 RSI: c028aa03 RDI: 0004
> > RBP: 006e4a00 R08:  R09: 
> > R10:  R11: 0246 R12: 006e4a0c
> > R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d
> >
> > Allocated by task 9325:
> >  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> >  set_track mm/kasan/kasan.c:460 [inline]
> >  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
> >  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
> >  kmem_cache_alloc_node+0x144/0x730 

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-04 Thread zhong jiang
On 2019/3/4 22:11, Dmitry Vyukov wrote:
> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang  wrote:
>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang  wrote:
>>>> Hi, guys
>>>>
>>>> I also hit the following issue, but I failed to reproduce it using the
>>>> log.
>>>>
>>>> It seems that we access mm->owner and dereference it, which results in
>>>> the UAF.
>>>> But it should not be possible for an incomplete process to be set as
>>>> mm->owner.
>>>>
>>>> Any thoughts?
>>> FWIW syzbot was able to reproduce this with this reproducer.
>>> This looks like a very subtle race (threaded reproducer that runs
>>> repeatedly in multiple processes), so most likely we are looking for
>>> something like a few-instruction inconsistency window.
>>>
>> I am a little doubtful about the instruction inconsistency window.
>>
>> I guess you mean that some SMP barriers should be taken into account. :-)
>>
>> Because IMO, it should not be a locking problem that results in the issue.
>
> Since the crash was triggered on x86, _most likely_ this is not a
> missed barrier. What I meant is that one thread needs to execute some
> code while another thread is stopped within a few instructions.
>
>
It is weird, and I cannot see how what you said relates to the issue. :-(

Because the cause is that mm->owner has been freed while we still
dereference it.

From the latest freed-task call trace, it fails to create the process.

Am I missing something, or did I misunderstand your meaning? Please correct me.

Thanks,
zhong jiang
>> Thanks,
>> zhong jiang
>>>> Thanks,
>>>> zhong jiang
>>>>
>>>> On 2018/12/4 23:43, syzbot wrote:
>>>>> syzbot has found a reproducer for the following crash on:
>>>>>
>>>>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of 
>>>>> git://git.kernel..
>>>>> git tree:   upstream
>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
>>>>> dashboard link: 
>>>>> https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
>>>>> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>>>>> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
>>>>>
>>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>>> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com
>>>>>
>>>>> cgroup: fork rejected by pids controller in /syz2
>>>>> ==
>>>>> BUG: KASAN: use-after-free in __read_once_size 
>>>>> include/linux/compiler.h:182 [inline]
>>>>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
>>>>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 
>>>>> [inline]
>>>>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 
>>>>> mm/memcontrol.c:844
>>>>> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332
>>>>>
>>>>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
>>>>> Google 01/01/2011
>>>>> Call Trace:
>>>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>>>  dump_stack+0x244/0x39d lib/dump_stack.c:113
>>>>>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
>>>>>  kasan_report_error mm/kasan/report.c:354 [inline]
>>>>>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
>>>>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>>>>>  __read_once_size include/linux/compiler.h:182 [inline]
>>>>>  task_css include/linux/cgroup.h:477 [inline]
>>>>>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>>>>>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>>>>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
>>>>>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
>>>>>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
>>>>>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
>>>>>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
>>>>>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
>>>>>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
>>>>>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
>>>>>  vfs_ioctl fs/ioctl.c:46 [inline]
>>>>>  file_ioctl fs/ioctl.c:509 [inline]
>>>>>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
>>>>>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
>>>>>  __do_sys_ioctl fs/ioctl.c:720 [inline]
>>>>>  __se_sys_ioctl fs/ioctl.c:718 [inline]
>>>>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
>>>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>>> RIP: 0033:0x44c7e9
>>>>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 
>>>>> f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 
>>>>> ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-04 Thread Dmitry Vyukov
On Mon, Mar 4, 2019 at 3:00 PM zhong jiang  wrote:
>
> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> > On Sun, Mar 3, 2019 at 5:19 PM zhong jiang  wrote:
> >> Hi, guys
> >>
> >> I also hit the following issue, but I failed to reproduce it using the
> >> log.
> >>
> >> It seems that we access mm->owner and dereference it, which results in
> >> the UAF.
> >> But it should not be possible for an incomplete process to be set as
> >> mm->owner.
> >>
> >> Any thoughts?
> > FWIW syzbot was able to reproduce this with this reproducer.
> > This looks like a very subtle race (threaded reproducer that runs
> > repeatedly in multiple processes), so most likely we are looking for
> > something like a few-instruction inconsistency window.
> >
>
> I am a little doubtful about the instruction inconsistency window.
>
> I guess you mean that some SMP barriers should be taken into account. :-)
>
> Because IMO, it should not be a locking problem that results in the issue.


Since the crash was triggered on x86, _most likely_ this is not a
missed barrier. What I meant is that one thread needs to execute some
code while another thread is stopped within a few instructions.
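
A minimal sketch of what such a window looks like, written as a deterministic
replay in Python rather than with real threads. The step names and the
dict-based task and mm are assumptions made for the illustration: the reader
is split into "load the owner pointer" and "dereference it", and the writer
may run in between.

```python
def run(schedule):
    """Replay one interleaving; return True if the reader touched a freed task."""
    task = {"freed": False}
    mm = {"owner": task}
    loaded = None          # the reader's private copy of mm["owner"]
    touched_freed = False

    for step in schedule:
        if step == "reader_load":
            loaded = mm["owner"]
        elif step == "reader_deref":
            if loaded is not None and loaded["freed"]:
                touched_freed = True
        elif step == "writer_free":
            # The writer clears the pointer and frees the task immediately.
            mm["owner"] = None
            task["freed"] = True
        else:
            raise ValueError(f"unknown step: {step}")
    return touched_freed
```

Only the schedule ["reader_load", "writer_free", "reader_deref"] trips the
check; the two orderings where the writer runs entirely before or entirely
after the reader do not. Clearing the pointer alone does not help a reader
that has already loaded it.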



> Thanks,
> zhong jiang
> >> Thanks,
> >> zhong jiang
> >>
> >> On 2018/12/4 23:43, syzbot wrote:
> >>> syzbot has found a reproducer for the following crash on:
> >>>
> >>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of 
> >>> git://git.kernel..
> >>> git tree:   upstream
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
> >>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
> >>> dashboard link: 
> >>> https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
> >>> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
> >>> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
> >>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
> >>>
> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >>> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com
> >>>
> >>> cgroup: fork rejected by pids controller in /syz2
> >>> ==
> >>> BUG: KASAN: use-after-free in __read_once_size 
> >>> include/linux/compiler.h:182 [inline]
> >>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
> >>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 
> >>> [inline]
> >>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 
> >>> mm/memcontrol.c:844
> >>> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332
> >>>
> >>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
> >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> >>> Google 01/01/2011
> >>> Call Trace:
> >>>  __dump_stack lib/dump_stack.c:77 [inline]
> >>>  dump_stack+0x244/0x39d lib/dump_stack.c:113
> >>>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
> >>>  kasan_report_error mm/kasan/report.c:354 [inline]
> >>>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
> >>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >>>  __read_once_size include/linux/compiler.h:182 [inline]
> >>>  task_css include/linux/cgroup.h:477 [inline]
> >>>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> >>>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> >>>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
> >>>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
> >>>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
> >>>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
> >>>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
> >>>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
> >>>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
> >>>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
> >>>  vfs_ioctl fs/ioctl.c:46 [inline]
> >>>  file_ioctl fs/ioctl.c:509 [inline]
> >>>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
> >>>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
> >>>  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >>>  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >>> RIP: 0033:0x44c7e9
> >>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 
> >>> f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 
> >>> ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> >>> RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010
> >>> RAX: ffda RBX: 006e4a08 RCX: 0044c7e9
> >>> RDX: 2100 RSI: c028aa03 RDI: 0004
> >>> RBP: 006e4a00 R08:  R09: 
> >>> R10:  R11: 0246 R12: 006e4a0c
> >>> R13: 

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-04 Thread zhong jiang
On 2019/3/4 15:40, Dmitry Vyukov wrote:
> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang  wrote:
>> Hi, guys
>>
>> I also hit the following issue, but I failed to reproduce it using the
>> log.
>>
>> It seems that we access mm->owner and dereference it, which results in
>> the UAF.
>> But it should not be possible for an incomplete process to be set as
>> mm->owner.
>>
>> Any thoughts?
> FWIW syzbot was able to reproduce this with this reproducer.
> This looks like a very subtle race (threaded reproducer that runs
> repeatedly in multiple processes), so most likely we are looking for
> something like a few-instruction inconsistency window.
>

I am a little doubtful about the instruction inconsistency window.

I guess you mean that some SMP barriers should be taken into account. :-)

Because IMO, it should not be a locking problem that results in the issue.


Thanks,
zhong jiang
>> Thanks,
>> zhong jiang
>>
>> On 2018/12/4 23:43, syzbot wrote:
>>> syzbot has found a reproducer for the following crash on:
>>>
>>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
>>> git tree:   upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
>>> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>>> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com
>>>
>>> cgroup: fork rejected by pids controller in /syz2
>>> ==
>>> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 
>>> [inline]
>>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
>>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 
>>> [inline]
>>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 
>>> mm/memcontrol.c:844
>>> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332
>>>
>>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
>>> Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>  dump_stack+0x244/0x39d lib/dump_stack.c:113
>>>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
>>>  kasan_report_error mm/kasan/report.c:354 [inline]
>>>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
>>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>>>  __read_once_size include/linux/compiler.h:182 [inline]
>>>  task_css include/linux/cgroup.h:477 [inline]
>>>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>>>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>>>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
>>>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
>>>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
>>>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
>>>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
>>>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
>>>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
>>>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
>>>  vfs_ioctl fs/ioctl.c:46 [inline]
>>>  file_ioctl fs/ioctl.c:509 [inline]
>>>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
>>>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
>>>  __do_sys_ioctl fs/ioctl.c:720 [inline]
>>>  __se_sys_ioctl fs/ioctl.c:718 [inline]
>>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
>>>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> RIP: 0033:0x44c7e9
>>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 
>>> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff 
>>> ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
>>> RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010
>>> RAX: ffda RBX: 006e4a08 RCX: 0044c7e9
>>> RDX: 2100 RSI: c028aa03 RDI: 0004
>>> RBP: 006e4a00 R08:  R09: 
>>> R10:  R11: 0246 R12: 006e4a0c
>>> R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d
>>>
>>> Allocated by task 9325:
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>>>  set_track mm/kasan/kasan.c:460 [inline]
>>>  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
>>>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
>>>  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
>>>  alloc_task_struct_node kernel/fork.c:158 [inline]
>>>  dup_task_struct kernel/fork.c:843 [inline]
>>>  

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-03 Thread Dmitry Vyukov
On Sun, Mar 3, 2019 at 5:19 PM zhong jiang  wrote:
>
> Hi, guys
>
> I also hit the following issue, but I failed to reproduce it using the
> log.
>
> It seems that we access mm->owner and dereference it, which results in
> the UAF.
> But it should not be possible for an incomplete process to be set as
> mm->owner.
>
> Any thoughts?

FWIW syzbot was able to reproduce this with this reproducer.
This looks like a very subtle race (threaded reproducer that runs
repeatedly in multiple processes), so most likely we are looking for
something like a few-instruction inconsistency window.


> Thanks,
> zhong jiang
>
> On 2018/12/4 23:43, syzbot wrote:
> > syzbot has found a reproducer for the following crash on:
> >
> > HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
> > git tree:   upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
> > dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
> > compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
> > syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com
> >
> > cgroup: fork rejected by pids controller in /syz2
> > ==
> > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 
> > [inline]
> > BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
> > BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 
> > [inline]
> > BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 
> > mm/memcontrol.c:844
> > Read of size 8 at addr 8881b72af310 by task syz-executor198/9332
> >
> > CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> > Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x244/0x39d lib/dump_stack.c:113
> >  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
> >  kasan_report_error mm/kasan/report.c:354 [inline]
> >  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
> >  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >  __read_once_size include/linux/compiler.h:182 [inline]
> >  task_css include/linux/cgroup.h:477 [inline]
> >  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
> >  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
> >  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
> >  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
> >  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
> >  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
> >  __mcopy_atomic mm/userfaultfd.c:559 [inline]
> >  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
> >  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
> >  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >  file_ioctl fs/ioctl.c:509 [inline]
> >  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
> >  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
> >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x44c7e9
> > Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 
> > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff 
> > ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010
> > RAX: ffda RBX: 006e4a08 RCX: 0044c7e9
> > RDX: 2100 RSI: c028aa03 RDI: 0004
> > RBP: 006e4a00 R08:  R09: 
> > R10:  R11: 0246 R12: 006e4a0c
> > R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d
> >
> > Allocated by task 9325:
> >  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> >  set_track mm/kasan/kasan.c:460 [inline]
> >  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
> >  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
> >  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
> >  alloc_task_struct_node kernel/fork.c:158 [inline]
> >  dup_task_struct kernel/fork.c:843 [inline]
> >  copy_process+0x2026/0x87a0 kernel/fork.c:1751
> >  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
> >  __do_sys_clone kernel/fork.c:2323 [inline]
> >  __se_sys_clone kernel/fork.c:2317 [inline]
> >  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
> >  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >  

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-03 Thread zhong jiang
Hi, guys

I also hit the following issue, but I failed to reproduce it using the log.

It seems that we access mm->owner and dereference it, which results in
the UAF.
But it should not be possible for an incomplete process to be set as
mm->owner.

Any thoughts?

Thanks,
zhong jiang

On 2018/12/4 23:43, syzbot wrote:
> syzbot has found a reproducer for the following crash on:
>
> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
> kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
> dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com
>
> cgroup: fork rejected by pids controller in /syz2
> ==
> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 
> [inline]
> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 
> [inline]
> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 
> mm/memcontrol.c:844
> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332
>
> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x244/0x39d lib/dump_stack.c:113
>  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
>  kasan_report_error mm/kasan/report.c:354 [inline]
>  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>  __read_once_size include/linux/compiler.h:182 [inline]
>  task_css include/linux/cgroup.h:477 [inline]
>  mem_cgroup_from_task mm/memcontrol.c:815 [inline]
>  get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
>  get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
>  mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
>  mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
>  mfill_atomic_pte mm/userfaultfd.c:418 [inline]
>  __mcopy_atomic mm/userfaultfd.c:559 [inline]
>  mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
>  userfaultfd_copy fs/userfaultfd.c:1705 [inline]
>  userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  file_ioctl fs/ioctl.c:509 [inline]
>  do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
>  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
>  __do_sys_ioctl fs/ioctl.c:720 [inline]
>  __se_sys_ioctl fs/ioctl.c:718 [inline]
>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x44c7e9
> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 
> 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010
> RAX: ffda RBX: 006e4a08 RCX: 0044c7e9
> RDX: 2100 RSI: c028aa03 RDI: 0004
> RBP: 006e4a00 R08:  R09: 
> R10:  R11: 0246 R12: 006e4a0c
> R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d
>
> Allocated by task 9325:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>  set_track mm/kasan/kasan.c:460 [inline]
>  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
>  kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
>  alloc_task_struct_node kernel/fork.c:158 [inline]
>  dup_task_struct kernel/fork.c:843 [inline]
>  copy_process+0x2026/0x87a0 kernel/fork.c:1751
>  _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
>  __do_sys_clone kernel/fork.c:2323 [inline]
>  __se_sys_clone kernel/fork.c:2317 [inline]
>  __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Freed by task 9325:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>  set_track mm/kasan/kasan.c:460 [inline]
>  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
>  __cache_free mm/slab.c:3498 [inline]
>  kmem_cache_free+0x83/0x290 mm/slab.c:3760
>  free_task_struct kernel/fork.c:163 [inline]
>  free_task+0x16e/0x1f0 kernel/fork.c:457
>  copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
>  _do_fork+0x1cb/0x11d0 

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2018-12-04 Thread syzbot

syzbot has found a reproducer for the following crash on:

HEAD commit:    0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com

cgroup: fork rejected by pids controller in /syz2
==
BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844

Read of size 8 at addr 8881b72af310 by task syz-executor198/9332

CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x244/0x39d lib/dump_stack.c:113
 print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
 __read_once_size include/linux/compiler.h:182 [inline]
 task_css include/linux/cgroup.h:477 [inline]
 mem_cgroup_from_task mm/memcontrol.c:815 [inline]
 get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
 get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
 mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
 mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
 mfill_atomic_pte mm/userfaultfd.c:418 [inline]
 __mcopy_atomic mm/userfaultfd.c:559 [inline]
 mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
 userfaultfd_copy fs/userfaultfd.c:1705 [inline]
 userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:509 [inline]
 do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
 __do_sys_ioctl fs/ioctl.c:720 [inline]
 __se_sys_ioctl fs/ioctl.c:718 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x44c7e9
Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00

RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 006e4a08 RCX: 0044c7e9
RDX: 2100 RSI: c028aa03 RDI: 0004
RBP: 006e4a00 R08:  R09: 
R10:  R11: 0246 R12: 006e4a0c
R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d

Allocated by task 9325:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
 kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
 alloc_task_struct_node kernel/fork.c:158 [inline]
 dup_task_struct kernel/fork.c:843 [inline]
 copy_process+0x2026/0x87a0 kernel/fork.c:1751
 _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
 __do_sys_clone kernel/fork.c:2323 [inline]
 __se_sys_clone kernel/fork.c:2317 [inline]
 __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 9325:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
 __cache_free mm/slab.c:3498 [inline]
 kmem_cache_free+0x83/0x290 mm/slab.c:3760
 free_task_struct kernel/fork.c:163 [inline]
 free_task+0x16e/0x1f0 kernel/fork.c:457
 copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
 _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
 __do_sys_clone kernel/fork.c:2323 [inline]
 __se_sys_clone kernel/fork.c:2317 [inline]
 __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at 8881b72ae240
 which belongs to the cache task_struct(81:syz2) of size 6080
The buggy address is located 4304 bytes inside of
 6080-byte region [8881b72ae240, 8881b72afa00)
The buggy address belongs to the page:

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2018-12-04 Thread syzbot

syzbot has found a reproducer for the following crash on:

HEAD commit:    0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340
kernel config:  https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd
dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com

cgroup: fork rejected by pids controller in /syz2
==
BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline]
BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline]
BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline]
BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844

Read of size 8 at addr 8881b72af310 by task syz-executor198/9332

CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x244/0x39d lib/dump_stack.c:113
 print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
 __read_once_size include/linux/compiler.h:182 [inline]
 task_css include/linux/cgroup.h:477 [inline]
 mem_cgroup_from_task mm/memcontrol.c:815 [inline]
 get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844
 get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline]
 mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888
 mcopy_atomic_pte mm/userfaultfd.c:71 [inline]
 mfill_atomic_pte mm/userfaultfd.c:418 [inline]
 __mcopy_atomic mm/userfaultfd.c:559 [inline]
 mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609
 userfaultfd_copy fs/userfaultfd.c:1705 [inline]
 userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:509 [inline]
 do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
 __do_sys_ioctl fs/ioctl.c:720 [inline]
 __se_sys_ioctl fs/ioctl.c:718 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x44c7e9
Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00

RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 006e4a08 RCX: 0044c7e9
RDX: 2100 RSI: c028aa03 RDI: 0004
RBP: 006e4a00 R08:  R09: 
R10:  R11: 0246 R12: 006e4a0c
R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d

Allocated by task 9325:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
 kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
 alloc_task_struct_node kernel/fork.c:158 [inline]
 dup_task_struct kernel/fork.c:843 [inline]
 copy_process+0x2026/0x87a0 kernel/fork.c:1751
 _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
 __do_sys_clone kernel/fork.c:2323 [inline]
 __se_sys_clone kernel/fork.c:2317 [inline]
 __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 9325:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
 __cache_free mm/slab.c:3498 [inline]
 kmem_cache_free+0x83/0x290 mm/slab.c:3760
 free_task_struct kernel/fork.c:163 [inline]
 free_task+0x16e/0x1f0 kernel/fork.c:457
 copy_process+0x1dcc/0x87a0 kernel/fork.c:2148
 _do_fork+0x1cb/0x11d0 kernel/fork.c:2216
 __do_sys_clone kernel/fork.c:2323 [inline]
 __se_sys_clone kernel/fork.c:2317 [inline]
 __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at 8881b72ae240
 which belongs to the cache task_struct(81:syz2) of size 6080
The buggy address is located 4304 bytes inside of
 6080-byte region [8881b72ae240, 8881b72afa00)
The buggy address belongs to the page: