Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
Bisection is inconclusive: the first bad commit could be any of:

2c43838c sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default
bf29cb23 sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION
d94d1053 sched/isolation: Document boot parameters dependency on CONFIG_CPU_ISOLATION=y
4c470317 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1592b03720
start commit:   0072a0c1
git tree:       upstream
dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12835e2540
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172fa5a340

For information about bisection process see: https://goo.gl/tpsmEJ#bisection
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/17 3:42, Andrea Arcangeli wrote:
> On Sat, Mar 16, 2019 at 05:38:54PM +0800, zhong jiang wrote:
>> On 2019/3/16 5:39, Andrea Arcangeli wrote:
>>> On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
>>>> I can reproduce the issue in arm64 qemu machine. The issue will leave
>>>> after applying the patch.
>>>>
>>>> Tested-by: zhong jiang
>>> Thanks a lot for the quick testing!
>>>
>>>> Meanwhile, I just has a little doubt whether it is necessary to use RCU
>>>> to free the task struct or not.
>>>> I think that mm->owner always be NULL after failing to create to process.
>>>> Because we call mm_clear_owner.
>>> I wish it was enough, but the problem is that the other CPU may be in
>>> the middle of get_mem_cgroup_from_mm() while this runs, and it would
>>> dereference mm->owner while it is being freed without the call_rcu
>>> after we clear mm->owner. What prevents this race is the
>> As you had said, It would dereference mm->owner after we clear mm->owner.
>>
>> But after we clear mm->owner, mm->owner should be NULL. Is it right?
>>
>> And mem_cgroup_from_task will check the parameter.
>> you mean that it is possible after checking the parameter to clear the
>> owner.
>> and the NULL pointer will trigger. :-(
> Dereference mm->owner didn't mean reading the value of the mm->owner
> pointer, it really means to dereference the value of the pointer. It's
> like below:
>
> get_mem_cgroup_from_mm()        failing fork()
> ----                            ---
> task = mm->owner
>                                 mm->owner = NULL;
>                                 free(mm->owner)
> *task /* use after free */
>
> We didn't set mm->owner to NULL before, so the window for the race was
> larger, but setting mm->owner to NULL only hides the problem and it
> can still happen (albeit with a smaller window).
>
> If get_mem_cgroup_from_mm() can see at any time mm->owner not NULL,
> then the free of the task struct must be delayed until after
> rcu_read_unlock has returned in get_mem_cgroup_from_mm(). This is
> the standard RCU model, the freeing must be delayed until after the
> next quiescent point.

Thank you for your patient explanation.
The patch should go upstream too. I think you should send a formal patch
to mainline. Maybe other people are suffering from this issue. :-)

Thanks,
zhong jiang
> BTW, both mm_update_next_owner() and mm_clear_owner() should have used
> WRITE_ONCE when they write to mm->owner, I can update that too but
> it's just to not to make assumptions that gcc does the right thing
> (and we still rely on gcc to do the right thing in other places) so
> that is just an orthogonal cleanup.
>
> Thanks,
> Andrea
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On Sat, Mar 16, 2019 at 05:38:54PM +0800, zhong jiang wrote:
> On 2019/3/16 5:39, Andrea Arcangeli wrote:
> > On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
> >> I can reproduce the issue in arm64 qemu machine. The issue will leave
> >> after applying the patch.
> >>
> >> Tested-by: zhong jiang
> > Thanks a lot for the quick testing!
> >
> >> Meanwhile, I just has a little doubt whether it is necessary to use RCU
> >> to free the task struct or not.
> >> I think that mm->owner always be NULL after failing to create to process.
> >> Because we call mm_clear_owner.
> > I wish it was enough, but the problem is that the other CPU may be in
> > the middle of get_mem_cgroup_from_mm() while this runs, and it would
> > dereference mm->owner while it is being freed without the call_rcu
> > after we clear mm->owner. What prevents this race is the
> As you had said, It would dereference mm->owner after we clear mm->owner.
>
> But after we clear mm->owner, mm->owner should be NULL. Is it right?
>
> And mem_cgroup_from_task will check the parameter.
> you mean that it is possible after checking the parameter to clear the owner.
> and the NULL pointer will trigger. :-(

Dereferencing mm->owner didn't mean reading the value of the mm->owner
pointer, it really means dereferencing the value of the pointer. It's
like below:

get_mem_cgroup_from_mm()        failing fork()
----                            ---
task = mm->owner
                                mm->owner = NULL;
                                free(mm->owner)
*task /* use after free */

We didn't set mm->owner to NULL before, so the window for the race was
larger, but setting mm->owner to NULL only hides the problem and it
can still happen (albeit with a smaller window).

If get_mem_cgroup_from_mm() can see at any time mm->owner not NULL,
then the free of the task struct must be delayed until after
rcu_read_unlock has returned in get_mem_cgroup_from_mm(). This is
the standard RCU model: the freeing must be delayed until after the
next quiescent point.
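The interleaving in the diagram above can be replayed in a small, single-threaded userspace sketch. Everything here is hypothetical: toy_task, toy_mm and the toy_* helpers are illustrative names, and the reader counter is only a stand-in for a grace period, not how kernel RCU actually tracks readers. "Freeing" is modeled as setting a flag so the sketch itself never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for task_struct and mm_struct (hypothetical, not kernel types). */
struct toy_task { bool freed; };
struct toy_mm   { struct toy_task *owner; };

static int toy_readers;               /* open read-side sections */
static struct toy_task *toy_pending;  /* task waiting for readers to leave */

static void toy_read_lock(void) { toy_readers++; }

static void toy_read_unlock(void)
{
    assert(toy_readers > 0);
    if (--toy_readers == 0 && toy_pending != NULL) {
        toy_pending->freed = true;    /* last reader left: "free" now */
        toy_pending = NULL;
    }
}

/* Models free_task(): the task is gone immediately. */
static void toy_free_now(struct toy_task *t) { t->freed = true; }

/* Models call_rcu(): the free waits until no reader is left. */
static void toy_free_deferred(struct toy_task *t)
{
    if (toy_readers == 0)
        t->freed = true;
    else
        toy_pending = t;
}

/* Replays the diagram; returns true if the reader touched a freed task. */
bool toy_replay_race(bool defer_free)
{
    struct toy_task child = { .freed = false };
    struct toy_mm mm = { .owner = &child };
    struct toy_task *task;
    bool touched_freed;

    toy_read_lock();
    task = mm.owner;                  /* reader: task = mm->owner */

    mm.owner = NULL;                  /* failing fork: mm->owner = NULL */
    if (defer_free)
        toy_free_deferred(&child);
    else
        toy_free_now(&child);

    touched_freed = task->freed;      /* reader: *task */
    toy_read_unlock();
    return touched_freed;
}
```

Calling toy_replay_race(false) models the immediate free and reports that the reader saw a freed task, while toy_replay_race(true) models the deferred free and reports that it did not. Clearing mm.owner happens in both runs, which is the point being made: the reader already holds its own copy of the pointer.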
BTW, both mm_update_next_owner() and mm_clear_owner() should have used
WRITE_ONCE when they write to mm->owner. I can update that too, but
it's just to avoid making assumptions that gcc does the right thing
(and we still rely on gcc to do the right thing in other places), so
that is just an orthogonal cleanup.

Thanks,
Andrea
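As a rough illustration of what the WRITE_ONCE remark is about, here is a userspace approximation built on a volatile cast. The TOY_* macros and the toy_* types are assumptions made for this sketch, not the kernel's definitions, and __typeof__ is a GCC/Clang extension. The idea is only that the volatile access asks the compiler to perform the load or store once, as written, rather than leaving it free to merge or repeat it.

```c
#include <stddef.h>

/* Toy approximations (hypothetical); the kernel has its own definitions. */
#define TOY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define TOY_READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

struct toy_task { int pid; };
struct toy_mm   { struct toy_task *owner; };

/* Models a writer such as mm_clear_owner() publishing NULL. */
void toy_clear_owner(struct toy_mm *mm)
{
    TOY_WRITE_ONCE(mm->owner, NULL);
}

/* Models a reader taking one snapshot of the owner pointer. */
struct toy_task *toy_read_owner(struct toy_mm *mm)
{
    return TOY_READ_ONCE(mm->owner);
}
```

This does not add any ordering or lifetime guarantee by itself; it only constrains how the compiler treats the single access, which matches the "orthogonal cleanup" framing in the message.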
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/16 5:39, Andrea Arcangeli wrote:
> On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
>> I can reproduce the issue in arm64 qemu machine. The issue will leave after
>> applying the patch.
>>
>> Tested-by: zhong jiang
> Thanks a lot for the quick testing!
>
>> Meanwhile, I just has a little doubt whether it is necessary to use RCU to
>> free the task struct or not.
>> I think that mm->owner always be NULL after failing to create to process.
>> Because we call mm_clear_owner.
> I wish it was enough, but the problem is that the other CPU may be in
> the middle of get_mem_cgroup_from_mm() while this runs, and it would
> dereference mm->owner while it is being freed without the call_rcu
> after we clear mm->owner. What prevents this race is the

As you said, it would dereference mm->owner after we clear mm->owner.

But after we clear mm->owner, mm->owner should be NULL. Is that right?

And mem_cgroup_from_task will check the parameter.
Do you mean that the owner may be cleared after the parameter has been
checked, so that the NULL pointer will trigger? :-(

Thanks,
zhong jiang
> rcu_read_lock() in get_mem_cgroup_from_mm() and the corresponding
> call_rcu to free the task struct in the fork failure path (again only
> if CONFIG_MEMCG=y is defined). Considering you can reproduce this tiny
> race on arm64 qemu (perhaps tcg JIT timing variations help?), you
> might also in theory be able to still reproduce the race condition if
> you remove the call_rcu from delayed_free_task and you replace it with
> free_task.
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On Fri, Mar 08, 2019 at 03:10:08PM +0800, zhong jiang wrote:
> I can reproduce the issue in arm64 qemu machine. The issue will leave after
> applying the patch.
>
> Tested-by: zhong jiang

Thanks a lot for the quick testing!

> Meanwhile, I just has a little doubt whether it is necessary to use RCU to
> free the task struct or not.
> I think that mm->owner always be NULL after failing to create to process.
> Because we call mm_clear_owner.

I wish it was enough, but the problem is that the other CPU may be in
the middle of get_mem_cgroup_from_mm() while this runs, and it would
dereference mm->owner while it is being freed without the call_rcu
after we clear mm->owner. What prevents this race is the
rcu_read_lock() in get_mem_cgroup_from_mm() and the corresponding
call_rcu to free the task struct in the fork failure path (again only
if CONFIG_MEMCG=y is defined). Considering you can reproduce this tiny
race on arm64 qemu (perhaps tcg JIT timing variations help?), you
might also in theory be able to still reproduce the race condition if
you remove the call_rcu from delayed_free_task and you replace it with
free_task.
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/6 10:05, Andrea Arcangeli wrote: > Hello everyone, > > [ CC'ed Mike and Peter ] > > On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote: >> On 2019/3/5 14:26, Dmitry Vyukov wrote: >>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote: On 2019/3/4 22:11, Dmitry Vyukov wrote: > On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote: >> On 2019/3/4 15:40, Dmitry Vyukov wrote: >>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang >>> wrote: Hi, guys I also hit the following issue. but it fails to reproduce the issue by the log. it seems to the case that we access the mm->owner and deference it will result in the UAF. But it should not be possible that we specify the incomplete process to be the mm->owner. Any thoughts? >>> FWIW syzbot was able to reproduce this with this reproducer. >>> This looks like a very subtle race (threaded reproducer that runs >>> repeatedly in multiple processes), so most likely we are looking for >>> something like few instructions inconsistency window. >>> >> I has a little doubtful about the instrustions inconsistency window. >> >> I guess that you mean some smb barriers should be taken into account.:-) >> >> Because IMO, It should not be the lock case to result in the issue. > Since the crash was triggered on x86 _most likley_ this is not a > missed barrier. What I meant is that one thread needs to executed some > code, while another thread is stopped within few instructions. > > It is weird and I can not find any relationship you had said with the issue.:-( Because It is the cause that mm->owner has been freed, whereas we still deference it. From the lastest freed task call trace, It fails to create process. Am I miss something or I misunderstand your meaning. Please correct me. >>> Your analysis looks correct. I am just saying that the root cause of >>> this use-after-free seems to be a race condition. >>> >>> >>> >> Yep, Indeed, I can not figure out how the race works. I will dig up further. > Yes it's a race condition. 
> > We were aware about the non-cooperative fork userfaultfd feature > creating userfaultfd file descriptor that gets reported to the parent > uffd, despite they belong to mm created by failed forks. > > https://www.spinics.net/lists/linux-mm/msg136357.html > > The fork failure in my testcase happened because of signal pending > that interrupted fork after the failed-fork uffd context, was already > pushed to the userfaultfd reader/monitor. CRIU then takes care of > filtering the failed fork cases so we didn't want to make the fork > code more complicated just for userfaultfd. > > In reality if MEMCG is enabled at build time, mm->owner maintainance > code now creates a race condition in the above case, with any fork > failure. > > I pinged Mike yesterday to ask if my theory could be true for this bug > and one solution he suggested is to do the userfaultfd_dup at a point > where fork cannot fail anymore. That's precisely what we were > wondering to do back then to avoid the failed fork reports to the > non cooperative uffd monitor. > > That will solve the false positive deliveries that CRIU manager > currently filters out too. From a theoretical standpoint it's also > quite strange to even allow any uffd ioctl to run on a otherwise long > gone mm created for a process that in the end wasn't even created (the > mm got temporarily fully created, but no child task really ever used > such mm). However that mm is on its way to exit_mmap as soon as the > ioclt returns and this only ever happens during race conditions, so > the way CRIU monitor works there wasn't anything fundamentally > concerning about this detail, despite it's remarkably "strange". Our > priority was to keep the fork code as simple as possible and keep > userfaultfd as non intrusive as possible. > > One alternative solution I'm wondering about for this memcg issue is > to free the task struct with RCU also when fork has failed and to add > the mm_update_next_owner before mmput. 
That will still report failed > forks to the uffd monitor, so it's not the ideal fix, but since it's > probably simpler I'm posting it below. Also I couldn't reproduce the > problem with the testcase here yet. > > >From 6cbf9d377b705476e5226704422357176f79e32c Mon Sep 17 00:00:00 2001 > From: Andrea Arcangeli > Date: Tue, 5 Mar 2019 19:21:37 -0500 > Subject: [PATCH 1/1] userfaultfd: use RCU to free the task struct when fork > fails if MEMCG > > MEMCG depends on the task structure not to be freed under > rcu_read_lock() in get_mem_cgroup_from_mm() after it dereferences > mm->owner. > > A better fix would be to avoid registering forked vmas in userfaultfd > contexts reported to the monitor, if case fork ends up failing. Hi, Andrea I can reproduce the issue in arm64 qemu machine. The
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/7 2:29, Andrea Arcangeli wrote:
> Hello Zhong,
>
> On Wed, Mar 06, 2019 at 09:07:00PM +0800, zhong jiang wrote:
>> The patch use call_rcu to delay free the task_struct, but It is possible to
>> free the task_struct
>> ahead of get_mem_cgroup_from_mm. is it right?
> Yes it is possible to free before get_mem_cgroup_from_mm, but if it's
> freed before get_mem_cgroup_from_mm rcu_read_lock,
> rcu_dereference(mm->owner) will return NULL in such case and there
> will be no problem.

Yes

> The simple fix also clears the mm->owner of the failed-fork-mm before
> doing the call_rcu. The call_rcu delays the freeing after no other CPU
> runs in between rcu_read_lock/unlock anymore. That guarantees that
> those critical section will see mm->owner == NULL if the freeing of
> the task struct already happened.

We have set mm->owner to NULL when the child process fails to fork,
ahead of freeing the task struct. Do those critical sections still have
a chance to see an mm->owner that is not NULL?

I have tested the patch. No Oops or panic has appeared so far.

Thanks,
zhong jiang
> The solution Mike suggested for this and that we were wondering as
> ideal in the past for the signal issue too, is to move the uffd
> delivery at a point where fork is guaranteed to succeed. We should
> probably try that too to see how it looks like and if it can be done
> in a not intrusive way, but the simple fix that uses RCU should work
> too.
>
> Rolling back in case of errors inside fork itself isn't easily doable:
> the moment we push the uffd ctx to the other side of the uffd pipe
> there's no coming back as that information can reach the userland of
> the uffd monitor/reader thread immediately after. The rolling back is
> really the other thread failing at mmget_not_zero eventually. It's the
> userland that has to rollback in such case when it gets a -ESRCH
> retval.
>
> Note that this fork feature is only ever needed in the non-cooperative
> case, these things never need to happen when userfaultfd is used by an
> app (or a lib) that is aware that it is using userfaultfd.
>
> Thanks,
> Andrea
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
Hello Zhong,

On Wed, Mar 06, 2019 at 09:07:00PM +0800, zhong jiang wrote:
> The patch use call_rcu to delay free the task_struct, but It is possible to
> free the task_struct
> ahead of get_mem_cgroup_from_mm. is it right?

Yes, it is possible to free before get_mem_cgroup_from_mm, but if it's
freed before the rcu_read_lock in get_mem_cgroup_from_mm,
rcu_dereference(mm->owner) will return NULL in that case and there
will be no problem.

The simple fix also clears the mm->owner of the failed-fork mm before
doing the call_rcu. The call_rcu delays the freeing until no other CPU
is still running inside an rcu_read_lock/unlock section that started
before it. That guarantees that those critical sections will see
mm->owner == NULL if the freeing of the task struct has already
happened.

The solution Mike suggested for this, and that we were considering as
ideal in the past for the signal issue too, is to move the uffd
delivery to a point where fork is guaranteed to succeed. We should
probably try that too, to see how it looks and if it can be done
in a non-intrusive way, but the simple fix that uses RCU should work
too.

Rolling back in case of errors inside fork itself isn't easily doable:
the moment we push the uffd ctx to the other side of the uffd pipe
there's no coming back, as that information can reach the userland of
the uffd monitor/reader thread immediately after. The rolling back is
really the other thread failing at mmget_not_zero eventually. It's the
userland that has to roll back in such a case when it gets a -ESRCH
retval.

Note that this fork feature is only ever needed in the non-cooperative
case; these things never need to happen when userfaultfd is used by an
app (or a lib) that is aware that it is using userfaultfd.

Thanks,
Andrea
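The userland rollback mentioned above could look roughly like the following decision helper on the monitor side. This is only a sketch under assumptions: the enum, its values and toy_classify_copy() are hypothetical names, and the caller is assumed to pass the return value of the uffd ioctl together with the errno it saved right after the call. The only behavior taken from the thread is that ESRCH means the mm is gone and the monitor has to drop what it knows about that child.

```c
#include <errno.h>

/* Hypothetical monitor-side actions; not part of any real uffd API. */
enum toy_uffd_action {
    TOY_UFFD_OK,          /* the ioctl succeeded */
    TOY_UFFD_DROP_CHILD,  /* mm is gone: roll back state kept for this child */
    TOY_UFFD_FATAL        /* any other error: left to the caller */
};

/* ret: ioctl() return value; err: errno saved right after the call. */
enum toy_uffd_action toy_classify_copy(int ret, int err)
{
    if (ret == 0)
        return TOY_UFFD_OK;
    if (err == ESRCH)
        return TOY_UFFD_DROP_CHILD;
    return TOY_UFFD_FATAL;
}
```

A monitor loop would call this after each operation on a child context and, on TOY_UFFD_DROP_CHILD, close the child's userfaultfd and forget its mappings instead of treating the failure as a bug.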
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/6 16:12, Peter Xu wrote: > On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote: >> On 2019/3/6 14:26, Mike Rapoport wrote: >>> Hi, >>> >>> On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote: On 2019/3/6 10:05, Andrea Arcangeli wrote: > Hello everyone, > > [ CC'ed Mike and Peter ] > > On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote: >> On 2019/3/5 14:26, Dmitry Vyukov wrote: >>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang >>> wrote: On 2019/3/4 22:11, Dmitry Vyukov wrote: > On Mon, Mar 4, 2019 at 3:00 PM zhong jiang > wrote: >> On 2019/3/4 15:40, Dmitry Vyukov wrote: >>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang >>> wrote: Hi, guys I also hit the following issue. but it fails to reproduce the issue by the log. it seems to the case that we access the mm->owner and deference it will result in the UAF. But it should not be possible that we specify the incomplete process to be the mm->owner. Any thoughts? >>> FWIW syzbot was able to reproduce this with this reproducer. >>> This looks like a very subtle race (threaded reproducer that runs >>> repeatedly in multiple processes), so most likely we are looking for >>> something like few instructions inconsistency window. >>> >> I has a little doubtful about the instrustions inconsistency window. >> >> I guess that you mean some smb barriers should be taken into >> account.:-) >> >> Because IMO, It should not be the lock case to result in the issue. > Since the crash was triggered on x86 _most likley_ this is not a > missed barrier. What I meant is that one thread needs to executed some > code, while another thread is stopped within few instructions. > > It is weird and I can not find any relationship you had said with the issue.:-( Because It is the cause that mm->owner has been freed, whereas we still deference it. From the lastest freed task call trace, It fails to create process. Am I miss something or I misunderstand your meaning. Please correct me. >>> Your analysis looks correct. 
I am just saying that the root cause of >>> this use-after-free seems to be a race condition. >>> >>> >>> >> Yep, Indeed, I can not figure out how the race works. I will dig up >> further. > Yes it's a race condition. > > We were aware about the non-cooperative fork userfaultfd feature > creating userfaultfd file descriptor that gets reported to the parent > uffd, despite they belong to mm created by failed forks. > > https://www.spinics.net/lists/linux-mm/msg136357.html > Hi, Andrea I still not clear why uffd ioctl can use the incomplete process as the mm->owner. and how to produce the race. >>> There is a C reproducer in the syzcaller report: >>> >>> https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 >>> From your above explainations, My underdtanding is that the process handling do_exexve will have a temporary mm, which will be used by the UUFD ioctl. >>> The race is between userfaultfd operation and fork() failure: >>> >>> forking thread | userfaultfd monitor thread >>> +--- >>> fork() | >>> dup_mmap()| >>> dup_userfaultfd() | >>> dup_userfaultfd_complete() | >>> | read(UFFD_EVENT_FORK) >>> | uffdio_copy() >>> |mmget_not_zero() >>> goto bad_fork_something | >>> ... | >>> bad_fork_free: | >>> free_task() | >>> | mem_cgroup_from_task() >>> | /* access stale mm->owner */ >>> >> Hi, Mike > Hi, Zhong, > >> forking thread fails to create the process ,and then free the allocated task >> struct. >> Other userfaultfd monitor thread should not access the stale mm->owner. >> >> The parent process and child process do not share the mm struct. >> Userfaultfd monitor thread's >> mm->owner should not point to the freed child task_struct. > IIUC the problem is that above mm (of the mm->owner) is the child > process's mm rather than the uffd monitor's. 
When > dup_userfaultfd_complete() is called there will be a new userfaultfd > context sent to the uffd monitor thread which linked to the chlid > process's mm, and if the monitor thread do UFFDIO_COPY upon the newly > received userfaultfd it'll operate on that new mm
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote:
> On 2019/3/6 14:26, Mike Rapoport wrote:
> > Hi,
> >
> > On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
> >> On 2019/3/6 10:05, Andrea Arcangeli wrote:
> >>> Hello everyone,
> >>>
> >>> [ CC'ed Mike and Peter ]
> >>>
> >>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
> >>>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
> >>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote:
> >>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
> >>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote:
> >>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
> >>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
> >>>>>>>>>> Hi, guys
> >>>>>>>>>>
> >>>>>>>>>> I also hit the following issue. but it fails to reproduce the
> >>>>>>>>>> issue by the log.
> >>>>>>>>>>
> >>>>>>>>>> it seems to the case that we access the mm->owner and dereference it
> >>>>>>>>>> will result in the UAF.
> >>>>>>>>>> But it should not be possible that we specify the incomplete
> >>>>>>>>>> process to be the mm->owner.
> >>>>>>>>>>
> >>>>>>>>>> Any thoughts?
> >>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
> >>>>>>>>> This looks like a very subtle race (threaded reproducer that runs
> >>>>>>>>> repeatedly in multiple processes), so most likely we are looking for
> >>>>>>>>> something like few instructions inconsistency window.
> >>>>>>>>>
> >>>>>>>> I has a little doubtful about the instructions inconsistency window.
> >>>>>>>>
> >>>>>>>> I guess that you mean some smb barriers should be taken into
> >>>>>>>> account.:-)
> >>>>>>>>
> >>>>>>>> Because IMO, It should not be the lock case to result in the issue.
> >>>>>>> Since the crash was triggered on x86 _most likely_ this is not a
> >>>>>>> missed barrier. What I meant is that one thread needs to executed some
> >>>>>>> code, while another thread is stopped within few instructions.
> >>>>>>>
> >>>>>>>
> >>>>>> It is weird and I can not find any relationship you had said with the
> >>>>>> issue.:-(
> >>>>>>
> >>>>>> Because It is the cause that mm->owner has been freed, whereas we
> >>>>>> still dereference it.
> >>>>>>
> >>>>>> From the latest freed task call trace, It fails to create process.
> >>>>>>
> >>>>>> Am I miss something or I misunderstand your meaning. Please correct me.
> >>>>> Your analysis looks correct. I am just saying that the root cause of
> >>>>> this use-after-free seems to be a race condition.
> >>>>>
> >>>>>
> >>>>>
> >>>> Yep, Indeed, I can not figure out how the race works. I will dig up
> >>>> further.
> >>> Yes it's a race condition.
> >>>
> >>> We were aware about the non-cooperative fork userfaultfd feature
> >>> creating userfaultfd file descriptor that gets reported to the parent
> >>> uffd, despite they belong to mm created by failed forks.
> >>>
> >>> https://www.spinics.net/lists/linux-mm/msg136357.html
> >>>
> >> Hi, Andrea
> >>
> >> I still not clear why uffd ioctl can use the incomplete process as the
> >> mm->owner.
> >> and how to produce the race.
> > There is a C reproducer in the syzkaller report:
> >
> > https://syzkaller.appspot.com/x/repro.c?x=172fa5a340
> >
> >> From your above explanations, My understanding is that the process
> >> handling do_execve
> >> will have a temporary mm, which will be used by the UFFD ioctl.
> > The race is between userfaultfd operation and fork() failure:
> >
> > forking thread                  | userfaultfd monitor thread
> > --------------------------------+-------------------------------
> > fork()                          |
> >   dup_mmap()                    |
> >     dup_userfaultfd()           |
> >   dup_userfaultfd_complete()    |
> >                                 |  read(UFFD_EVENT_FORK)
> >                                 |  uffdio_copy()
> >                                 |    mmget_not_zero()
> >     goto bad_fork_something     |
> >     ...                         |
> > bad_fork_free:                  |
> >       free_task()               |
> >                                 |  mem_cgroup_from_task()
> >                                 |       /* access stale mm->owner */
> >
> Hi, Mike
>
> forking thread fails to create the process ,and then free the allocated task
> struct.
> Other userfaultfd monitor thread should not access the stale mm->owner.
>
> The parent process and child process do not share the mm struct. Userfaultfd
> monitor thread's
> mm->owner should not point to the freed child task_struct.

Userfaultfd can monitor remote mm's [1]. In this case, dup_userfaultfd()
and dup_userfaultfd_complete() create a uffd context for the new process
and notify the userspace uffd monitor about this new context. The uffd
monitor can then perform uffd operations on the new context. On the
right side, mmget_not_zero() will take a reference on the mm of the
newly created process.

[1]
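For readers unfamiliar with the "get a reference unless the count is already zero" pattern behind mmget_not_zero(), here is a minimal userspace sketch using C11 atomics. The name toy_get_not_zero() and the bare counter are assumptions for illustration; the kernel operates on the mm's user count with its own atomic helpers. Note what the pattern does and does not give you: it keeps the mm alive for the monitor thread, but it says nothing about the lifetime of the task that mm->owner points to, which is the gap this thread is about.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take one reference only if the count is still non-zero.
 * Returns true on success, false if the object is already on its way out.
 */
bool toy_get_not_zero(atomic_int *users)
{
    int old = atomic_load(users);

    while (old != 0) {
        /* On failure the CAS reloads 'old' with the current value. */
        if (atomic_compare_exchange_weak(users, &old, old + 1))
            return true;
    }
    return false;
}
```

A caller that gets false must not touch the object at all; a caller that gets true owns one reference and has to drop it later.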
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote: > On 2019/3/6 14:26, Mike Rapoport wrote: > > Hi, > > > > On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote: > >> On 2019/3/6 10:05, Andrea Arcangeli wrote: > >>> Hello everyone, > >>> > >>> [ CC'ed Mike and Peter ] > >>> > >>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote: > On 2019/3/5 14:26, Dmitry Vyukov wrote: > > On Mon, Mar 4, 2019 at 4:32 PM zhong jiang > > wrote: > >> On 2019/3/4 22:11, Dmitry Vyukov wrote: > >>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang > >>> wrote: > On 2019/3/4 15:40, Dmitry Vyukov wrote: > > On Sun, Mar 3, 2019 at 5:19 PM zhong jiang > > wrote: > >> Hi, guys > >> > >> I also hit the following issue. but it fails to reproduce the > >> issue by the log. > >> > >> it seems to the case that we access the mm->owner and deference it > >> will result in the UAF. > >> But it should not be possible that we specify the incomplete > >> process to be the mm->owner. > >> > >> Any thoughts? > > FWIW syzbot was able to reproduce this with this reproducer. > > This looks like a very subtle race (threaded reproducer that runs > > repeatedly in multiple processes), so most likely we are looking for > > something like few instructions inconsistency window. > > > I has a little doubtful about the instrustions inconsistency window. > > I guess that you mean some smb barriers should be taken into > account.:-) > > Because IMO, It should not be the lock case to result in the issue. > >>> Since the crash was triggered on x86 _most likley_ this is not a > >>> missed barrier. What I meant is that one thread needs to executed some > >>> code, while another thread is stopped within few instructions. > >>> > >>> > >> It is weird and I can not find any relationship you had said with the > >> issue.:-( > >> > >> Because It is the cause that mm->owner has been freed, whereas we > >> still deference it. > >> > >> From the lastest freed task call trace, It fails to create process. 
> >> > >> Am I miss something or I misunderstand your meaning. Please correct me. > > Your analysis looks correct. I am just saying that the root cause of > > this use-after-free seems to be a race condition. > > > > > > > Yep, Indeed, I can not figure out how the race works. I will dig up > further. > >>> Yes it's a race condition. > >>> > >>> We were aware about the non-cooperative fork userfaultfd feature > >>> creating userfaultfd file descriptor that gets reported to the parent > >>> uffd, despite they belong to mm created by failed forks. > >>> > >>> https://www.spinics.net/lists/linux-mm/msg136357.html > >>> > >> Hi, Andrea > >> > >> I still not clear why uffd ioctl can use the incomplete process as the > >> mm->owner. > >> and how to produce the race. > > There is a C reproducer in the syzcaller report: > > > > https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 > > > >> From your above explainations, My underdtanding is that the process > >> handling do_exexve > >> will have a temporary mm, which will be used by the UUFD ioctl. > > The race is between userfaultfd operation and fork() failure: > > > > forking thread | userfaultfd monitor thread > > +--- > > fork() | > > dup_mmap()| > > dup_userfaultfd() | > > dup_userfaultfd_complete() | > > | read(UFFD_EVENT_FORK) > > | uffdio_copy() > > |mmget_not_zero() > > goto bad_fork_something | > > ... | > > bad_fork_free: | > > free_task() | > > | mem_cgroup_from_task() > > | /* access stale mm->owner */ > > > Hi, Mike Hi, Zhong, > > forking thread fails to create the process ,and then free the allocated task > struct. > Other userfaultfd monitor thread should not access the stale mm->owner. > > The parent process and child process do not share the mm struct. Userfaultfd > monitor thread's > mm->owner should not point to the freed child task_struct. IIUC the problem is that above mm (of the mm->owner) is the child process's mm rather than the uffd monitor's. 
When dup_userfaultfd_complete() is called there will be a new userfaultfd context sent to the uffd monitor thread which linked to the chlid process's mm, and if the monitor thread do UFFDIO_COPY upon the newly received userfaultfd it'll operate on that new mm too. > > and due to the existence of tasklist_lock,
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/6 14:26, Mike Rapoport wrote: > Hi, > > On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote: >> On 2019/3/6 10:05, Andrea Arcangeli wrote: >>> Hello everyone, >>> >>> [ CC'ed Mike and Peter ] >>> >>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote: On 2019/3/5 14:26, Dmitry Vyukov wrote: > On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote: >> On 2019/3/4 22:11, Dmitry Vyukov wrote: >>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang >>> wrote: On 2019/3/4 15:40, Dmitry Vyukov wrote: > On Sun, Mar 3, 2019 at 5:19 PM zhong jiang > wrote: >> Hi, guys >> >> I also hit the following issue. but it fails to reproduce the issue >> by the log. >> >> it seems to the case that we access the mm->owner and deference it >> will result in the UAF. >> But it should not be possible that we specify the incomplete process >> to be the mm->owner. >> >> Any thoughts? > FWIW syzbot was able to reproduce this with this reproducer. > This looks like a very subtle race (threaded reproducer that runs > repeatedly in multiple processes), so most likely we are looking for > something like few instructions inconsistency window. > I has a little doubtful about the instrustions inconsistency window. I guess that you mean some smb barriers should be taken into account.:-) Because IMO, It should not be the lock case to result in the issue. >>> Since the crash was triggered on x86 _most likley_ this is not a >>> missed barrier. What I meant is that one thread needs to executed some >>> code, while another thread is stopped within few instructions. >>> >>> >> It is weird and I can not find any relationship you had said with the >> issue.:-( >> >> Because It is the cause that mm->owner has been freed, whereas we still >> deference it. >> >> From the lastest freed task call trace, It fails to create process. >> >> Am I miss something or I misunderstand your meaning. Please correct me. > Your analysis looks correct. 
I am just saying that the root cause of > this use-after-free seems to be a race condition. > > > Yep, Indeed, I can not figure out how the race works. I will dig up further. >>> Yes it's a race condition. >>> >>> We were aware about the non-cooperative fork userfaultfd feature >>> creating userfaultfd file descriptor that gets reported to the parent >>> uffd, despite they belong to mm created by failed forks. >>> >>> https://www.spinics.net/lists/linux-mm/msg136357.html >>> >> Hi, Andrea >> >> I still not clear why uffd ioctl can use the incomplete process as the >> mm->owner. >> and how to produce the race. > There is a C reproducer in the syzcaller report: > > https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 > >> From your above explainations, My underdtanding is that the process >> handling do_exexve >> will have a temporary mm, which will be used by the UUFD ioctl. > The race is between userfaultfd operation and fork() failure: > > forking thread | userfaultfd monitor thread > +--- > fork() | > dup_mmap()| > dup_userfaultfd() | > dup_userfaultfd_complete() | > | read(UFFD_EVENT_FORK) > | uffdio_copy() > |mmget_not_zero() > goto bad_fork_something | > ... | > bad_fork_free: | > free_task() | > | mem_cgroup_from_task() > | /* access stale mm->owner */ > Hi, Mike forking thread fails to create the process ,and then free the allocated task struct. Other userfaultfd monitor thread should not access the stale mm->owner. The parent process and child process do not share the mm struct. Userfaultfd monitor thread's mm->owner should not point to the freed child task_struct. and due to the existence of tasklist_lock, we can not specify the mm->owner to freed task_struct. I miss something,=-O Thanks, zhong jiang >> Thanks, >> zhong jiang
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
Hi, On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote: > On 2019/3/6 10:05, Andrea Arcangeli wrote: > > Hello everyone, > > > > [ CC'ed Mike and Peter ] > > > > On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote: > >> On 2019/3/5 14:26, Dmitry Vyukov wrote: > >>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote: > On 2019/3/4 22:11, Dmitry Vyukov wrote: > > On Mon, Mar 4, 2019 at 3:00 PM zhong jiang > > wrote: > >> On 2019/3/4 15:40, Dmitry Vyukov wrote: > >>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang > >>> wrote: > Hi, guys > > I also hit the following issue. but it fails to reproduce the issue > by the log. > > it seems to the case that we access the mm->owner and deference it > will result in the UAF. > But it should not be possible that we specify the incomplete process > to be the mm->owner. > > Any thoughts? > >>> FWIW syzbot was able to reproduce this with this reproducer. > >>> This looks like a very subtle race (threaded reproducer that runs > >>> repeatedly in multiple processes), so most likely we are looking for > >>> something like few instructions inconsistency window. > >>> > >> I has a little doubtful about the instrustions inconsistency window. > >> > >> I guess that you mean some smb barriers should be taken into > >> account.:-) > >> > >> Because IMO, It should not be the lock case to result in the issue. > > Since the crash was triggered on x86 _most likley_ this is not a > > missed barrier. What I meant is that one thread needs to executed some > > code, while another thread is stopped within few instructions. > > > > > It is weird and I can not find any relationship you had said with the > issue.:-( > > Because It is the cause that mm->owner has been freed, whereas we still > deference it. > > From the lastest freed task call trace, It fails to create process. > > Am I miss something or I misunderstand your meaning. Please correct me. > >>> Your analysis looks correct. 
I am just saying that the root cause of
> >>> this use-after-free seems to be a race condition.
> >>>
> >> Yep, Indeed, I can not figure out how the race works. I will dig up
> >> further.
> > Yes it's a race condition.
> >
> > We were aware about the non-cooperative fork userfaultfd feature
> > creating userfaultfd file descriptor that gets reported to the parent
> > uffd, despite they belong to mm created by failed forks.
> >
> > https://www.spinics.net/lists/linux-mm/msg136357.html
> >
> Hi, Andrea
>
> I still not clear why uffd ioctl can use the incomplete process as the
> mm->owner.
> and how to produce the race.

There is a C reproducer in the syzkaller report:

https://syzkaller.appspot.com/x/repro.c?x=172fa5a340

> From your above explainations, My underdtanding is that the process
> handling do_exexve
> will have a temporary mm, which will be used by the UUFD ioctl.

The race is between userfaultfd operation and fork() failure:

forking thread                  | userfaultfd monitor thread
--------------------------------+-------------------------------
fork()                          |
  dup_mmap()                    |
    dup_userfaultfd()           |
    dup_userfaultfd_complete()  |
                                |  read(UFFD_EVENT_FORK)
                                |  uffdio_copy()
                                |    mmget_not_zero()
    goto bad_fork_something     |
    ...                         |
bad_fork_free:                  |
      free_task()               |
                                |  mem_cgroup_from_task()
                                |       /* access stale mm->owner */

> Thanks,
> zhong jiang

--
Sincerely yours,
Mike.
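The interleaving in the diagram above can be replayed in a small
userspace sketch. This is not kernel code: struct task, struct mm,
model_free_task(), model_deref() and monitor_sees_stale_owner() are
hypothetical stand-ins invented for illustration, and the "free" only
sets a flag so that the stale access can be detected without undefined
behaviour. It assumes the monitor has already loaded mm->owner by the
time the forking thread reaches free_task().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's task_struct and mm_struct. */
struct task {
    bool freed;    /* set instead of really freeing, so the model stays defined */
    int memcg_id;  /* stands in for the cgroup state read through the task */
};

struct mm {
    struct task *owner;
};

/* Forking thread: the bad_fork_free path ends in free_task(). */
static void model_free_task(struct task *t)
{
    t->freed = true;
}

/* Monitor thread: dereference the task; -1 means the task was already freed. */
static int model_deref(const struct task *t)
{
    assert(t != NULL);
    return t->freed ? -1 : t->memcg_id;
}

/*
 * Replays one ordering of the two threads in a single thread.
 * Returns true when the monitor dereferences a task that was freed.
 */
bool monitor_sees_stale_owner(bool free_before_deref)
{
    struct task child = { .freed = false, .memcg_id = 7 };
    struct mm child_mm = { .owner = &child };

    /* monitor: task = mm->owner */
    struct task *task = child_mm.owner;

    /* forking thread wins the race: fork fails and the task is freed */
    if (free_before_deref)
        model_free_task(&child);

    /* monitor: *task */
    int id = model_deref(task);

    /* forking thread loses the race: the free happens after the read */
    if (!free_before_deref)
        model_free_task(&child);

    return id == -1;
}
```

Calling monitor_sees_stale_owner(true) replays the order shown in the
diagram and reports the stale access; with false the monitor finishes
first and nothing stale is seen. The model is single-threaded on
purpose, so the ordering is deterministic.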
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/6 10:05, Andrea Arcangeli wrote: > Hello everyone, > > [ CC'ed Mike and Peter ] > > On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote: >> On 2019/3/5 14:26, Dmitry Vyukov wrote: >>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote: On 2019/3/4 22:11, Dmitry Vyukov wrote: > On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote: >> On 2019/3/4 15:40, Dmitry Vyukov wrote: >>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang >>> wrote: Hi, guys I also hit the following issue. but it fails to reproduce the issue by the log. it seems to the case that we access the mm->owner and deference it will result in the UAF. But it should not be possible that we specify the incomplete process to be the mm->owner. Any thoughts? >>> FWIW syzbot was able to reproduce this with this reproducer. >>> This looks like a very subtle race (threaded reproducer that runs >>> repeatedly in multiple processes), so most likely we are looking for >>> something like few instructions inconsistency window. >>> >> I has a little doubtful about the instrustions inconsistency window. >> >> I guess that you mean some smb barriers should be taken into account.:-) >> >> Because IMO, It should not be the lock case to result in the issue. > Since the crash was triggered on x86 _most likley_ this is not a > missed barrier. What I meant is that one thread needs to executed some > code, while another thread is stopped within few instructions. > > It is weird and I can not find any relationship you had said with the issue.:-( Because It is the cause that mm->owner has been freed, whereas we still deference it. From the lastest freed task call trace, It fails to create process. Am I miss something or I misunderstand your meaning. Please correct me. >>> Your analysis looks correct. I am just saying that the root cause of >>> this use-after-free seems to be a race condition. >>> >>> >>> >> Yep, Indeed, I can not figure out how the race works. I will dig up further. > Yes it's a race condition. 
>
> We were aware about the non-cooperative fork userfaultfd feature
> creating userfaultfd file descriptor that gets reported to the parent
> uffd, despite they belong to mm created by failed forks.
>
> https://www.spinics.net/lists/linux-mm/msg136357.html
>
> The fork failure in my testcase happened because of signal pending
> that interrupted fork after the failed-fork uffd context, was already
> pushed to the userfaultfd reader/monitor. CRIU then takes care of
> filtering the failed fork cases so we didn't want to make the fork
> code more complicated just for userfaultfd.
>
> In reality if MEMCG is enabled at build time, mm->owner maintainance
> code now creates a race condition in the above case, with any fork
> failure.
>
> I pinged Mike yesterday to ask if my theory could be true for this bug
> and one solution he suggested is to do the userfaultfd_dup at a point
> where fork cannot fail anymore. That's precisely what we were
> wondering to do back then to avoid the failed fork reports to the
> non cooperative uffd monitor.
>
> That will solve the false positive deliveries that CRIU manager
> currently filters out too. From a theoretical standpoint it's also
> quite strange to even allow any uffd ioctl to run on a otherwise long
> gone mm created for a process that in the end wasn't even created (the
> mm got temporarily fully created, but no child task really ever used
> such mm). However that mm is on its way to exit_mmap as soon as the
> ioclt returns and this only ever happens during race conditions, so
> the way CRIU monitor works there wasn't anything fundamentally
> concerning about this detail, despite it's remarkably "strange". Our
> priority was to keep the fork code as simple as possible and keep
> userfaultfd as non intrusive as possible.

Hi, Andrea

I am still not clear why a uffd ioctl can use the incomplete process as
the mm->owner, and how to produce the race.

From your explanation above, my understanding is that the process
handling do_execve will have a temporary mm, which will be used by the
UFFD ioctl.

Thanks,
zhong jiang

> One alternative solution I'm wondering about for this memcg issue is
> to free the task struct with RCU also when fork has failed and to add
> the mm_update_next_owner before mmput. That will still report failed
> forks to the uffd monitor, so it's not the ideal fix, but since it's
> probably simpler I'm posting it below. Also I couldn't reproduce the
> problem with the testcase here yet.
>
> From 6cbf9d377b705476e5226704422357176f79e32c Mon Sep 17 00:00:00 2001
> From: Andrea Arcangeli
> Date: Tue, 5 Mar 2019 19:21:37 -0500
> Subject: [PATCH 1/1] userfaultfd: use RCU to free the task struct when fork
> fails if MEMCG
>
> MEMCG depends on the task structure not to be
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
Hello everyone, [ CC'ed Mike and Peter ] On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote: > On 2019/3/5 14:26, Dmitry Vyukov wrote: > > On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote: > >> On 2019/3/4 22:11, Dmitry Vyukov wrote: > >>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote: > On 2019/3/4 15:40, Dmitry Vyukov wrote: > > On Sun, Mar 3, 2019 at 5:19 PM zhong jiang > > wrote: > >> Hi, guys > >> > >> I also hit the following issue. but it fails to reproduce the issue by > >> the log. > >> > >> it seems to the case that we access the mm->owner and deference it > >> will result in the UAF. > >> But it should not be possible that we specify the incomplete process > >> to be the mm->owner. > >> > >> Any thoughts? > > FWIW syzbot was able to reproduce this with this reproducer. > > This looks like a very subtle race (threaded reproducer that runs > > repeatedly in multiple processes), so most likely we are looking for > > something like few instructions inconsistency window. > > > I has a little doubtful about the instrustions inconsistency window. > > I guess that you mean some smb barriers should be taken into account.:-) > > Because IMO, It should not be the lock case to result in the issue. > >>> Since the crash was triggered on x86 _most likley_ this is not a > >>> missed barrier. What I meant is that one thread needs to executed some > >>> code, while another thread is stopped within few instructions. > >>> > >>> > >> It is weird and I can not find any relationship you had said with the > >> issue.:-( > >> > >> Because It is the cause that mm->owner has been freed, whereas we still > >> deference it. > >> > >> From the lastest freed task call trace, It fails to create process. > >> > >> Am I miss something or I misunderstand your meaning. Please correct me. > > Your analysis looks correct. I am just saying that the root cause of > > this use-after-free seems to be a race condition. 
> >
> Yep, Indeed, I can not figure out how the race works. I will dig up further.

Yes it's a race condition.

We were aware of the non-cooperative fork userfaultfd feature
creating userfaultfd file descriptors that get reported to the parent
uffd, despite belonging to mms created by failed forks.

https://www.spinics.net/lists/linux-mm/msg136357.html

The fork failure in my testcase happened because of a pending signal
that interrupted fork after the failed-fork uffd context was already
pushed to the userfaultfd reader/monitor. CRIU then takes care of
filtering the failed fork cases so we didn't want to make the fork
code more complicated just for userfaultfd.

In reality if MEMCG is enabled at build time, the mm->owner maintenance
code now creates a race condition in the above case, with any fork
failure.

I pinged Mike yesterday to ask if my theory could be true for this bug
and one solution he suggested is to do the userfaultfd_dup at a point
where fork cannot fail anymore. That's precisely what we were
considering back then to avoid the failed fork reports to the
non-cooperative uffd monitor.

That will solve the false positive deliveries that the CRIU manager
currently filters out too. From a theoretical standpoint it's also
quite strange to even allow any uffd ioctl to run on an otherwise long
gone mm created for a process that in the end wasn't even created (the
mm got temporarily fully created, but no child task really ever used
such mm). However that mm is on its way to exit_mmap as soon as the
ioctl returns and this only ever happens during race conditions, so
given the way the CRIU monitor works there wasn't anything fundamentally
concerning about this detail, despite it being remarkably "strange". Our
priority was to keep the fork code as simple as possible and keep
userfaultfd as non-intrusive as possible.

One alternative solution I'm wondering about for this memcg issue is
to free the task struct with RCU also when fork has failed and to add
the mm_update_next_owner before mmput. That will still report failed
forks to the uffd monitor, so it's not the ideal fix, but since it's
probably simpler I'm posting it below. Also I couldn't reproduce the
problem with the testcase here yet.

From 6cbf9d377b705476e5226704422357176f79e32c Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli
Date: Tue, 5 Mar 2019 19:21:37 -0500
Subject: [PATCH 1/1] userfaultfd: use RCU to free the task struct when fork
 fails if MEMCG

MEMCG depends on the task structure not to be freed under
rcu_read_lock() in get_mem_cgroup_from_mm() after it dereferences
mm->owner.

A better fix would be to avoid registering forked vmas in userfaultfd
contexts reported to the monitor, in case fork ends up failing.

Signed-off-by: Andrea Arcangeli
---
 kernel/fork.c | 34 --
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index eb9953c82104..3bcbb361ffbc
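The idea behind freeing the task struct with RCU can be illustrated
with a toy userspace model. This is only a sketch under stated
assumptions, not the patch and not the kernel's RCU: struct toy_rcu and
the toy_*() helpers are hypothetical, there is a single reader and a
single pending object, and "freeing" just sets a flag. It shows why a
deferred free keeps the task valid until the reader has left its
read-side critical section, while an immediate free does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and mm_struct. */
struct task {
    bool freed;  /* set instead of really freeing */
};

struct mm {
    struct task *owner;
};

/* Toy grace-period tracker: counts readers, holds one deferred free. */
struct toy_rcu {
    int readers;
    struct task *pending;
};

/* Run the deferred free once no reader is inside a critical section. */
static void toy_flush(struct toy_rcu *r)
{
    if (r->readers == 0 && r->pending != NULL) {
        r->pending->freed = true;
        r->pending = NULL;
    }
}

/* Stand-in for rcu_read_lock(). */
static void toy_read_lock(struct toy_rcu *r)
{
    r->readers++;
}

/* Stand-in for rcu_read_unlock(); the toy grace period can end here. */
static void toy_read_unlock(struct toy_rcu *r)
{
    r->readers--;
    toy_flush(r);
}

/* Stand-in for call_rcu(): queue the free instead of doing it now. */
static void toy_call_rcu(struct toy_rcu *r, struct task *t)
{
    r->pending = t;
    toy_flush(r);
}

/* Returns true when the reader dereferences an already freed task. */
bool failed_fork_uaf(bool defer_free)
{
    struct toy_rcu rcu = { .readers = 0, .pending = NULL };
    struct task child = { .freed = false };
    struct mm child_mm = { .owner = &child };

    toy_read_lock(&rcu);                 /* monitor: enter read side   */
    struct task *task = child_mm.owner;  /* monitor: task = mm->owner  */

    child_mm.owner = NULL;               /* forking thread: clear owner */
    if (defer_free)
        toy_call_rcu(&rcu, &child);      /* free waits for the reader  */
    else
        child.freed = true;              /* immediate free             */

    bool stale = task->freed;            /* monitor: *task             */
    toy_read_unlock(&rcu);               /* deferred free may run now  */

    /* With deferral, the task is freed only after the reader is done. */
    assert(!defer_free || child.freed);
    return stale;
}
```

In this model failed_fork_uaf(false) reports the stale access even
though mm->owner was cleared first, because the reader had already
loaded the pointer. failed_fork_uaf(true) does not, since the free is
held back until toy_read_unlock() runs.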
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/5 14:26, Dmitry Vyukov wrote: > On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote: >> On 2019/3/4 22:11, Dmitry Vyukov wrote: >>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote: On 2019/3/4 15:40, Dmitry Vyukov wrote: > On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote: >> Hi, guys >> >> I also hit the following issue. but it fails to reproduce the issue by >> the log. >> >> it seems to the case that we access the mm->owner and deference it will >> result in the UAF. >> But it should not be possible that we specify the incomplete process to >> be the mm->owner. >> >> Any thoughts? > FWIW syzbot was able to reproduce this with this reproducer. > This looks like a very subtle race (threaded reproducer that runs > repeatedly in multiple processes), so most likely we are looking for > something like few instructions inconsistency window. > I has a little doubtful about the instrustions inconsistency window. I guess that you mean some smb barriers should be taken into account.:-) Because IMO, It should not be the lock case to result in the issue. >>> Since the crash was triggered on x86 _most likley_ this is not a >>> missed barrier. What I meant is that one thread needs to executed some >>> code, while another thread is stopped within few instructions. >>> >>> >> It is weird and I can not find any relationship you had said with the >> issue.:-( >> >> Because It is the cause that mm->owner has been freed, whereas we still >> deference it. >> >> From the lastest freed task call trace, It fails to create process. >> >> Am I miss something or I misunderstand your meaning. Please correct me. > Your analysis looks correct. I am just saying that the root cause of > this use-after-free seems to be a race condition. > > > Yep, Indeed, I can not figure out how the race works. I will dig up further. 
Thanks, zhong jiang > >> On 2018/12/4 23:43, syzbot wrote: >>> syzbot has found a reproducer for the following crash on: >>> >>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of >>> git://git.kernel.. >>> git tree: upstream >>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340 >>> kernel config: >>> https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd >>> dashboard link: >>> https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02 >>> compiler: gcc (GCC) 8.0.1 20180413 (experimental) >>> syz repro: >>> https://syzkaller.appspot.com/x/repro.syz?x=12835e2540 >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 >>> >>> IMPORTANT: if you fix the bug, please add the following tag to the >>> commit: >>> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com >>> >>> cgroup: fork rejected by pids controller in /syz2 >>> == >>> BUG: KASAN: use-after-free in __read_once_size >>> include/linux/compiler.h:182 [inline] >>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 >>> [inline] >>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 >>> [inline] >>> BUG: KASAN: use-after-free in >>> get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 >>> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332 >>> >>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142 >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >>> Google 01/01/2011 >>> Call Trace: >>> __dump_stack lib/dump_stack.c:77 [inline] >>> dump_stack+0x244/0x39d lib/dump_stack.c:113 >>> print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 >>> kasan_report_error mm/kasan/report.c:354 [inline] >>> kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 >>> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 >>> __read_once_size include/linux/compiler.h:182 [inline] >>> task_css include/linux/cgroup.h:477 [inline] >>> mem_cgroup_from_task 
mm/memcontrol.c:815 [inline] >>> get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 >>> get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline] >>> mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888 >>> mcopy_atomic_pte mm/userfaultfd.c:71 [inline] >>> mfill_atomic_pte mm/userfaultfd.c:418 [inline] >>> __mcopy_atomic mm/userfaultfd.c:559 [inline] >>> mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609 >>> userfaultfd_copy fs/userfaultfd.c:1705 [inline] >>> userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851 >>> vfs_ioctl fs/ioctl.c:46 [inline] >>> file_ioctl fs/ioctl.c:509 [inline] >>> do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 >>> ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 >>> __do_sys_ioctl
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On Mon, Mar 4, 2019 at 4:32 PM zhong jiang wrote: > > On 2019/3/4 22:11, Dmitry Vyukov wrote: > > On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote: > >> On 2019/3/4 15:40, Dmitry Vyukov wrote: > >>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote: > Hi, guys > > I also hit the following issue. but it fails to reproduce the issue by > the log. > > it seems to the case that we access the mm->owner and deference it will > result in the UAF. > But it should not be possible that we specify the incomplete process to > be the mm->owner. > > Any thoughts? > >>> FWIW syzbot was able to reproduce this with this reproducer. > >>> This looks like a very subtle race (threaded reproducer that runs > >>> repeatedly in multiple processes), so most likely we are looking for > >>> something like few instructions inconsistency window. > >>> > >> I has a little doubtful about the instrustions inconsistency window. > >> > >> I guess that you mean some smb barriers should be taken into account.:-) > >> > >> Because IMO, It should not be the lock case to result in the issue. > > > > Since the crash was triggered on x86 _most likley_ this is not a > > missed barrier. What I meant is that one thread needs to executed some > > code, while another thread is stopped within few instructions. > > > > > It is weird and I can not find any relationship you had said with the > issue.:-( > > Because It is the cause that mm->owner has been freed, whereas we still > deference it. > > From the lastest freed task call trace, It fails to create process. > > Am I miss something or I misunderstand your meaning. Please correct me. Your analysis looks correct. I am just saying that the root cause of this use-after-free seems to be a race condition. > On 2018/12/4 23:43, syzbot wrote: > > syzbot has found a reproducer for the following crash on: > > > > HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of > > git://git.kernel.. 
> > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340 > > kernel config: > > https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd > > dashboard link: > > https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02 > > compiler: gcc (GCC) 8.0.1 20180413 (experimental) > > syz repro: > > https://syzkaller.appspot.com/x/repro.syz?x=12835e2540 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 > > > > IMPORTANT: if you fix the bug, please add the following tag to the > > commit: > > Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com > > > > cgroup: fork rejected by pids controller in /syz2 > > == > > BUG: KASAN: use-after-free in __read_once_size > > include/linux/compiler.h:182 [inline] > > BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 > > [inline] > > BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 > > [inline] > > BUG: KASAN: use-after-free in > > get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 > > Read of size 8 at addr 8881b72af310 by task syz-executor198/9332 > > > > CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > Google 01/01/2011 > > Call Trace: > > __dump_stack lib/dump_stack.c:77 [inline] > > dump_stack+0x244/0x39d lib/dump_stack.c:113 > > print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 > > kasan_report_error mm/kasan/report.c:354 [inline] > > kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 > > __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 > > __read_once_size include/linux/compiler.h:182 [inline] > > task_css include/linux/cgroup.h:477 [inline] > > mem_cgroup_from_task mm/memcontrol.c:815 [inline] > > get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 > > get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline] > > mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888 > > 
mcopy_atomic_pte mm/userfaultfd.c:71 [inline] > > mfill_atomic_pte mm/userfaultfd.c:418 [inline] > > __mcopy_atomic mm/userfaultfd.c:559 [inline] > > mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609 > > userfaultfd_copy fs/userfaultfd.c:1705 [inline] > > userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851 > > vfs_ioctl fs/ioctl.c:46 [inline] > > file_ioctl fs/ioctl.c:509 [inline] > > do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 > > ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 > > __do_sys_ioctl fs/ioctl.c:720 [inline] > > __se_sys_ioctl fs/ioctl.c:718 [inline] > > __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 > >
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/5 5:51, Matthew Wilcox wrote:
> On Mon, Mar 04, 2019 at 12:19:32AM +0800, zhong jiang wrote:
>> I also hit the following issue. but it fails to reproduce the issue by the
>> log.
>>
>> it seems to the case that we access the mm->owner and deference it will
>> result in the UAF.
>> But it should not be possible that we specify the incomplete process to be
>> the mm->owner.
> OK, so we've got thread 9325 calling fork() and failing due to the PID
> controller saying "no". 9325 calls free_task(), but somehow thread 9332
> has a reference to the struct task_struct. There are two possibilities
> here: one is that 9332 really did manage to get a reference to the larval
> child of 9325, and the other is that 9332 has a stale reference to some
> memory which was reallocated to 9325's child.

Good guess and analysis. IMO, 9332 cannot access the task_struct
directly in this code flow, but it can get a reference to the mm_struct.

Maybe I am missing something important.

> Andrea, is there any way for a UFFD thread to get access to the child's
> task_struct during the copy_process() call? If so, I think copy_process()
> needs to call mm_update_next_owner().

Yep, I hope Andrea has time to look at this.

Thanks,
zhong jiang

> If there's no way for that to happen, then we have quite a bug-hunt ahead
> of us looking for who is missing a call to mm_update_next_owner().
>> On 2018/12/4 23:43, syzbot wrote:
>>> syzbot has found a reproducer for the following crash on:
>>>
>>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
>>> git tree: upstream >>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd >>> dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02 >>> compiler: gcc (GCC) 8.0.1 20180413 (experimental) >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12835e2540 >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 >>> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit: >>> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com >>> >>> cgroup: fork rejected by pids controller in /syz2 >>> == >>> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 >>> [inline] >>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline] >>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 >>> [inline] >>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 >>> mm/memcontrol.c:844 >>> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332 >>> >>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142 >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >>> Google 01/01/2011 >>> Call Trace: >>> __dump_stack lib/dump_stack.c:77 [inline] >>> dump_stack+0x244/0x39d lib/dump_stack.c:113 >>> print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 >>> kasan_report_error mm/kasan/report.c:354 [inline] >>> kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 >>> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 >>> __read_once_size include/linux/compiler.h:182 [inline] >>> task_css include/linux/cgroup.h:477 [inline] >>> mem_cgroup_from_task mm/memcontrol.c:815 [inline] >>> get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 >>> get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline] >>> mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888 >>> mcopy_atomic_pte 
mm/userfaultfd.c:71 [inline] >>> mfill_atomic_pte mm/userfaultfd.c:418 [inline] >>> __mcopy_atomic mm/userfaultfd.c:559 [inline] >>> mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609 >>> userfaultfd_copy fs/userfaultfd.c:1705 [inline] >>> userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851 >>> vfs_ioctl fs/ioctl.c:46 [inline] >>> file_ioctl fs/ioctl.c:509 [inline] >>> do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 >>> ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 >>> __do_sys_ioctl fs/ioctl.c:720 [inline] >>> __se_sys_ioctl fs/ioctl.c:718 [inline] >>> __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 >>> do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe >>> RIP: 0033:0x44c7e9 >>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 >>> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff >>> ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 >>> RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010 >>> RAX: ffda RBX: 006e4a08 RCX: 0044c7e9 >>> RDX: 2100 RSI: c028aa03 RDI: 0004 >>> RBP: 006e4a00 R08: R09: >>> R10: R11: 0246 R12: 006e4a0c >>> R13:
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On Mon, Mar 04, 2019 at 12:19:32AM +0800, zhong jiang wrote:
> I also hit the following issue. but it fails to reproduce the issue by the
> log.
>
> it seems to the case that we access the mm->owner and deference it will
> result in the UAF.
> But it should not be possible that we specify the incomplete process to be
> the mm->owner.

OK, so we've got thread 9325 calling fork() and failing due to the PID
controller saying "no". 9325 calls free_task(), but somehow thread 9332
has a reference to the struct task_struct. There are two possibilities
here: one is that 9332 really did manage to get a reference to the larval
child of 9325, and the other is that 9332 has a stale reference to some
memory which was reallocated to 9325's child.

Andrea, is there any way for a UFFD thread to get access to the child's
task_struct during the copy_process() call? If so, I think copy_process()
needs to call mm_update_next_owner().

If there's no way for that to happen, then we have quite a bug-hunt ahead
of us looking for who is missing a call to mm_update_next_owner().

> On 2018/12/4 23:43, syzbot wrote:
> > syzbot has found a reproducer for the following crash on:
> >
> > HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel..
> > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340 > > kernel config: https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd > > dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02 > > compiler: gcc (GCC) 8.0.1 20180413 (experimental) > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12835e2540 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com > > > > cgroup: fork rejected by pids controller in /syz2 > > == > > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 > > [inline] > > BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline] > > BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 > > [inline] > > BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 > > mm/memcontrol.c:844 > > Read of size 8 at addr 8881b72af310 by task syz-executor198/9332 > > > > CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > Google 01/01/2011 > > Call Trace: > > __dump_stack lib/dump_stack.c:77 [inline] > > dump_stack+0x244/0x39d lib/dump_stack.c:113 > > print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 > > kasan_report_error mm/kasan/report.c:354 [inline] > > kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 > > __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 > > __read_once_size include/linux/compiler.h:182 [inline] > > task_css include/linux/cgroup.h:477 [inline] > > mem_cgroup_from_task mm/memcontrol.c:815 [inline] > > get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 > > get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline] > > mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888 > > mcopy_atomic_pte 
mm/userfaultfd.c:71 [inline] > > mfill_atomic_pte mm/userfaultfd.c:418 [inline] > > __mcopy_atomic mm/userfaultfd.c:559 [inline] > > mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609 > > userfaultfd_copy fs/userfaultfd.c:1705 [inline] > > userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851 > > vfs_ioctl fs/ioctl.c:46 [inline] > > file_ioctl fs/ioctl.c:509 [inline] > > do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 > > ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 > > __do_sys_ioctl fs/ioctl.c:720 [inline] > > __se_sys_ioctl fs/ioctl.c:718 [inline] > > __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 > > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > RIP: 0033:0x44c7e9 > > Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 > > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff > > ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 > > RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010 > > RAX: ffda RBX: 006e4a08 RCX: 0044c7e9 > > RDX: 2100 RSI: c028aa03 RDI: 0004 > > RBP: 006e4a00 R08: R09: > > R10: R11: 0246 R12: 006e4a0c > > R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d > > > > Allocated by task 9325: > > save_stack+0x43/0xd0 mm/kasan/kasan.c:448 > > set_track mm/kasan/kasan.c:460 [inline] > > kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 > > kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 > > kmem_cache_alloc_node+0x144/0x730
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/4 22:11, Dmitry Vyukov wrote:
> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote:
>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote:
>>>> Hi, guys
>>>>
>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
>>>>
>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
>>>>
>>>> Any thoughts?
>>> FWIW syzbot was able to reproduce this with this reproducer.
>>> This looks like a very subtle race (threaded reproducer that runs
>>> repeatedly in multiple processes), so most likely we are looking for
>>> something like few instructions inconsistency window.
>>>
>> I has a little doubtful about the instrustions inconsistency window.
>>
>> I guess that you mean some smb barriers should be taken into account.:-)
>>
>> Because IMO, It should not be the lock case to result in the issue.
>
> Since the crash was triggered on x86 _most likley_ this is not a
> missed barrier. What I meant is that one thread needs to executed some
> code, while another thread is stopped within few instructions.
>

That is odd, and I cannot see how what you describe relates to the
issue. :-(

The cause is that mm->owner has been freed while we still dereference it.

According to the call trace of the most recently freed task, the process
creation failed.

Am I missing something, or did I misunderstand your meaning? Please
correct me.

Thanks,
zhong jiang

>> Thanks,
>> zhong jinag
>>>> Thanks,
>>>> zhong jiang
>>>> On 2018/12/4 23:43, syzbot wrote:
>>>>> syzbot has found a reproducer for the following crash on:
>>>>>
>>>>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of
>>>>> git://git.kernel..
> git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340 > kernel config: https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd > dashboard link: > https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02 > compiler: gcc (GCC) 8.0.1 20180413 (experimental) > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12835e2540 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com > > cgroup: fork rejected by pids controller in /syz2 > == > BUG: KASAN: use-after-free in __read_once_size > include/linux/compiler.h:182 [inline] > BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline] > BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 > [inline] > BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 > mm/memcontrol.c:844 > Read of size 8 at addr 8881b72af310 by task syz-executor198/9332 > > CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x244/0x39d lib/dump_stack.c:113 > print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 > kasan_report_error mm/kasan/report.c:354 [inline] > kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 > __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 > __read_once_size include/linux/compiler.h:182 [inline] > task_css include/linux/cgroup.h:477 [inline] > mem_cgroup_from_task mm/memcontrol.c:815 [inline] > get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 > get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline] > mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888 > mcopy_atomic_pte mm/userfaultfd.c:71 [inline] > mfill_atomic_pte mm/userfaultfd.c:418 [inline] > 
__mcopy_atomic mm/userfaultfd.c:559 [inline] > mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609 > userfaultfd_copy fs/userfaultfd.c:1705 [inline] > userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851 > vfs_ioctl fs/ioctl.c:46 [inline] > file_ioctl fs/ioctl.c:509 [inline] > do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 > ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 > __do_sys_ioctl fs/ioctl.c:720 [inline] > __se_sys_ioctl fs/ioctl.c:718 [inline] > __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > RIP: 0033:0x44c7e9 > Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 > f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 > ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
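The interleaving discussed in this message can be modelled in user space. What follows is only a minimal sketch in plain C, not kernel code: `struct task`, `struct mm`, the `freed` flag, and every function name are hypothetical stand-ins, and the "free" merely sets a flag so the demonstration itself stays memory-safe. Under those assumptions it shows that a reader which has already loaded mm->owner still ends up touching a freed task, even if the failing fork path clears mm->owner before freeing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and mm_struct; not the kernel layouts. */
struct task {
    bool freed;   /* set instead of really freeing, so the demo never reads freed memory */
};

struct mm {
    struct task *owner;
};

/* Reader, step 1: snapshot the owner pointer. */
static struct task *reader_load_owner(const struct mm *mm)
{
    return mm->owner;
}

/* Reader, step 2: "dereference" the snapshot; true means it hit a freed task. */
static bool reader_deref_hits_freed(const struct task *t)
{
    return t != NULL && t->freed;
}

/* Failing fork path: clear mm->owner, then free the task at once (no grace period). */
static void failing_fork_cleanup(struct mm *mm)
{
    struct task *t = mm->owner;

    assert(t != NULL);
    mm->owner = NULL;
    t->freed = true;   /* models the immediate free of the task */
}

/* The reader is paused between its two steps while the fork failure runs. */
bool racy_interleaving_hits_uaf(void)
{
    struct task child = { .freed = false };
    struct mm mm = { .owner = &child };

    struct task *snapshot = reader_load_owner(&mm);
    failing_fork_cleanup(&mm);
    return reader_deref_hits_freed(snapshot);
}

/* The reader starts only after the cleanup finished: it sees NULL and is safe. */
bool late_reader_hits_uaf(void)
{
    struct task child = { .freed = false };
    struct mm mm = { .owner = &child };

    failing_fork_cleanup(&mm);
    return reader_deref_hits_freed(reader_load_owner(&mm));
}
```

In this model `racy_interleaving_hits_uaf()` returns true and `late_reader_hits_uaf()` returns false. The first case is the "thread stopped within a few instructions" window: clearing the pointer cannot help a reader that already holds the old value, so only delaying the free until such readers are done would close it.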
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On Mon, Mar 4, 2019 at 3:00 PM zhong jiang wrote: > > On 2019/3/4 15:40, Dmitry Vyukov wrote: > > On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote: > >> Hi, guys > >> > >> I also hit the following issue, but I failed to reproduce it from the > >> log. > >> > >> It seems to be the case that accessing mm->owner and dereferencing it > >> results in the UAF. > >> But it should not be possible for an incomplete process to be set as > >> mm->owner. > >> > >> Any thoughts? > > FWIW syzbot was able to reproduce this with this reproducer. > > This looks like a very subtle race (threaded reproducer that runs > > repeatedly in multiple processes), so most likely we are looking for > > something like a few-instruction inconsistency window. > > > > I am a little doubtful about the instruction inconsistency window. > > I guess you mean that some smp barriers should be taken into account. :-) > > Because IMO, it should not be a locking problem that results in the issue. Since the crash was triggered on x86, _most likely_ this is not a missed barrier. What I meant is that one thread needs to execute some code, while another thread is stopped within a few instructions. > Thanks, > zhong jiang > >> Thanks, > >> zhong jiang > >> > >> On 2018/12/4 23:43, syzbot wrote: > >>> syzbot has found a reproducer for the following crash on: > >>> > >>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of > >>> git://git.kernel.. 
> >>> git tree: upstream > >>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340 > >>> kernel config: https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd > >>> dashboard link: > >>> https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02 > >>> compiler: gcc (GCC) 8.0.1 20180413 (experimental) > >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12835e2540 > >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 > >>> > >>> IMPORTANT: if you fix the bug, please add the following tag to the commit: > >>> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com > >>> > >>> cgroup: fork rejected by pids controller in /syz2 > >>> == > >>> BUG: KASAN: use-after-free in __read_once_size > >>> include/linux/compiler.h:182 [inline] > >>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline] > >>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 > >>> [inline] > >>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 > >>> mm/memcontrol.c:844 > >>> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332 > >>> > >>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142 > >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > >>> Google 01/01/2011 > >>> Call Trace: > >>> __dump_stack lib/dump_stack.c:77 [inline] > >>> dump_stack+0x244/0x39d lib/dump_stack.c:113 > >>> print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 > >>> kasan_report_error mm/kasan/report.c:354 [inline] > >>> kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 > >>> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 > >>> __read_once_size include/linux/compiler.h:182 [inline] > >>> task_css include/linux/cgroup.h:477 [inline] > >>> mem_cgroup_from_task mm/memcontrol.c:815 [inline] > >>> get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 > >>> get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline] > >>> 
mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888 > >>> mcopy_atomic_pte mm/userfaultfd.c:71 [inline] > >>> mfill_atomic_pte mm/userfaultfd.c:418 [inline] > >>> __mcopy_atomic mm/userfaultfd.c:559 [inline] > >>> mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609 > >>> userfaultfd_copy fs/userfaultfd.c:1705 [inline] > >>> userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851 > >>> vfs_ioctl fs/ioctl.c:46 [inline] > >>> file_ioctl fs/ioctl.c:509 [inline] > >>> do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 > >>> ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 > >>> __do_sys_ioctl fs/ioctl.c:720 [inline] > >>> __se_sys_ioctl fs/ioctl.c:718 [inline] > >>> __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 > >>> do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 > >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe > >>> RIP: 0033:0x44c7e9 > >>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 > >>> f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 > >>> ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 > >>> RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010 > >>> RAX: ffda RBX: 006e4a08 RCX: 0044c7e9 > >>> RDX: 2100 RSI: c028aa03 RDI: 0004 > >>> RBP: 006e4a00 R08: R09: > >>> R10: R11: 0246 R12: 006e4a0c > >>> R13:
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On 2019/3/4 15:40, Dmitry Vyukov wrote: > On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote: >> Hi, guys >> >> I also hit the following issue, but I failed to reproduce it from the >> log. >> >> It seems to be the case that accessing mm->owner and dereferencing it >> results in the UAF. >> But it should not be possible for an incomplete process to be set as >> mm->owner. >> >> Any thoughts? > FWIW syzbot was able to reproduce this with this reproducer. > This looks like a very subtle race (threaded reproducer that runs > repeatedly in multiple processes), so most likely we are looking for > something like a few-instruction inconsistency window. > I am a little doubtful about the instruction inconsistency window. I guess you mean that some smp barriers should be taken into account. :-) Because IMO, it should not be a locking problem that results in the issue. Thanks, zhong jiang >> Thanks, >> zhong jiang >> >> On 2018/12/4 23:43, syzbot wrote: >>> syzbot has found a reproducer for the following crash on: >>> >>> HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel.. 
>>> git tree: upstream >>> console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd >>> dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02 >>> compiler: gcc (GCC) 8.0.1 20180413 (experimental) >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12835e2540 >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 >>> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit: >>> Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com >>> >>> cgroup: fork rejected by pids controller in /syz2 >>> == >>> BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 >>> [inline] >>> BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline] >>> BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 >>> [inline] >>> BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 >>> mm/memcontrol.c:844 >>> Read of size 8 at addr 8881b72af310 by task syz-executor198/9332 >>> >>> CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142 >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >>> Google 01/01/2011 >>> Call Trace: >>> __dump_stack lib/dump_stack.c:77 [inline] >>> dump_stack+0x244/0x39d lib/dump_stack.c:113 >>> print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 >>> kasan_report_error mm/kasan/report.c:354 [inline] >>> kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 >>> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 >>> __read_once_size include/linux/compiler.h:182 [inline] >>> task_css include/linux/cgroup.h:477 [inline] >>> mem_cgroup_from_task mm/memcontrol.c:815 [inline] >>> get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 >>> get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline] >>> mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888 >>> mcopy_atomic_pte 
mm/userfaultfd.c:71 [inline] >>> mfill_atomic_pte mm/userfaultfd.c:418 [inline] >>> __mcopy_atomic mm/userfaultfd.c:559 [inline] >>> mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609 >>> userfaultfd_copy fs/userfaultfd.c:1705 [inline] >>> userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851 >>> vfs_ioctl fs/ioctl.c:46 [inline] >>> file_ioctl fs/ioctl.c:509 [inline] >>> do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 >>> ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 >>> __do_sys_ioctl fs/ioctl.c:720 [inline] >>> __se_sys_ioctl fs/ioctl.c:718 [inline] >>> __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 >>> do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe >>> RIP: 0033:0x44c7e9 >>> Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 >>> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff >>> ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 >>> RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010 >>> RAX: ffda RBX: 006e4a08 RCX: 0044c7e9 >>> RDX: 2100 RSI: c028aa03 RDI: 0004 >>> RBP: 006e4a00 R08: R09: >>> R10: R11: 0246 R12: 006e4a0c >>> R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d >>> >>> Allocated by task 9325: >>> save_stack+0x43/0xd0 mm/kasan/kasan.c:448 >>> set_track mm/kasan/kasan.c:460 [inline] >>> kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 >>> kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 >>> kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644 >>> alloc_task_struct_node kernel/fork.c:158 [inline] >>> dup_task_struct kernel/fork.c:843 [inline] >>>
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
On Sun, Mar 3, 2019 at 5:19 PM zhong jiang wrote: > > Hi, guys > > I also hit the following issue, but I failed to reproduce it from the > log. > > It seems to be the case that accessing mm->owner and dereferencing it > results in the UAF. > But it should not be possible for an incomplete process to be set as > mm->owner. > > Any thoughts? FWIW syzbot was able to reproduce this with this reproducer. This looks like a very subtle race (threaded reproducer that runs repeatedly in multiple processes), so most likely we are looking for something like a few-instruction inconsistency window. > Thanks, > zhong jiang > > On 2018/12/4 23:43, syzbot wrote: > > syzbot has found a reproducer for the following crash on: > > > > HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340 > > kernel config: https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd > > dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02 > > compiler: gcc (GCC) 8.0.1 20180413 (experimental) > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12835e2540 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com > > > > cgroup: fork rejected by pids controller in /syz2 > > == > > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 > > [inline] > > BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline] > > BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 > > [inline] > > BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 > > mm/memcontrol.c:844 > > Read of size 8 at addr 8881b72af310 by task syz-executor198/9332 > > > > CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142 > > 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > Google 01/01/2011 > > Call Trace: > > __dump_stack lib/dump_stack.c:77 [inline] > > dump_stack+0x244/0x39d lib/dump_stack.c:113 > > print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 > > kasan_report_error mm/kasan/report.c:354 [inline] > > kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 > > __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 > > __read_once_size include/linux/compiler.h:182 [inline] > > task_css include/linux/cgroup.h:477 [inline] > > mem_cgroup_from_task mm/memcontrol.c:815 [inline] > > get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 > > get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline] > > mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888 > > mcopy_atomic_pte mm/userfaultfd.c:71 [inline] > > mfill_atomic_pte mm/userfaultfd.c:418 [inline] > > __mcopy_atomic mm/userfaultfd.c:559 [inline] > > mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609 > > userfaultfd_copy fs/userfaultfd.c:1705 [inline] > > userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851 > > vfs_ioctl fs/ioctl.c:46 [inline] > > file_ioctl fs/ioctl.c:509 [inline] > > do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 > > ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 > > __do_sys_ioctl fs/ioctl.c:720 [inline] > > __se_sys_ioctl fs/ioctl.c:718 [inline] > > __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 > > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > RIP: 0033:0x44c7e9 > > Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 > > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff > > ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 > > RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010 > > RAX: ffda RBX: 006e4a08 RCX: 0044c7e9 > > RDX: 2100 RSI: c028aa03 RDI: 0004 > > RBP: 006e4a00 R08: R09: > > R10: R11: 0246 R12: 006e4a0c > > R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d > > > > Allocated 
by task 9325: > > save_stack+0x43/0xd0 mm/kasan/kasan.c:448 > > set_track mm/kasan/kasan.c:460 [inline] > > kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 > > kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 > > kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644 > > alloc_task_struct_node kernel/fork.c:158 [inline] > > dup_task_struct kernel/fork.c:843 [inline] > > copy_process+0x2026/0x87a0 kernel/fork.c:1751 > > _do_fork+0x1cb/0x11d0 kernel/fork.c:2216 > > __do_sys_clone kernel/fork.c:2323 [inline] > > __se_sys_clone kernel/fork.c:2317 [inline] > > __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317 > > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 > >
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
Hi, guys I also hit the following issue, but I failed to reproduce it from the log. It seems to be the case that accessing mm->owner and dereferencing it results in the UAF. But it should not be possible for an incomplete process to be set as mm->owner. Any thoughts? Thanks, zhong jiang On 2018/12/4 23:43, syzbot wrote: > syzbot has found a reproducer for the following crash on: > > HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340 > kernel config: https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd > dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02 > compiler: gcc (GCC) 8.0.1 20180413 (experimental) > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12835e2540 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com > > cgroup: fork rejected by pids controller in /syz2 > == > BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 > [inline] > BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline] > BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 > [inline] > BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 > mm/memcontrol.c:844 > Read of size 8 at addr 8881b72af310 by task syz-executor198/9332 > > CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x244/0x39d lib/dump_stack.c:113 > print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 > kasan_report_error mm/kasan/report.c:354 [inline] > kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 > 
__asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 > __read_once_size include/linux/compiler.h:182 [inline] > task_css include/linux/cgroup.h:477 [inline] > mem_cgroup_from_task mm/memcontrol.c:815 [inline] > get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 > get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline] > mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888 > mcopy_atomic_pte mm/userfaultfd.c:71 [inline] > mfill_atomic_pte mm/userfaultfd.c:418 [inline] > __mcopy_atomic mm/userfaultfd.c:559 [inline] > mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609 > userfaultfd_copy fs/userfaultfd.c:1705 [inline] > userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851 > vfs_ioctl fs/ioctl.c:46 [inline] > file_ioctl fs/ioctl.c:509 [inline] > do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 > ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 > __do_sys_ioctl fs/ioctl.c:720 [inline] > __se_sys_ioctl fs/ioctl.c:718 [inline] > __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > RIP: 0033:0x44c7e9 > Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 > 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f > 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 > RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010 > RAX: ffda RBX: 006e4a08 RCX: 0044c7e9 > RDX: 2100 RSI: c028aa03 RDI: 0004 > RBP: 006e4a00 R08: R09: > R10: R11: 0246 R12: 006e4a0c > R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d > > Allocated by task 9325: > save_stack+0x43/0xd0 mm/kasan/kasan.c:448 > set_track mm/kasan/kasan.c:460 [inline] > kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 > kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 > kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644 > alloc_task_struct_node kernel/fork.c:158 [inline] > dup_task_struct kernel/fork.c:843 [inline] > copy_process+0x2026/0x87a0 kernel/fork.c:1751 > _do_fork+0x1cb/0x11d0 kernel/fork.c:2216 > __do_sys_clone 
kernel/fork.c:2323 [inline] > __se_sys_clone kernel/fork.c:2317 [inline] > __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317 > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > Freed by task 9325: > save_stack+0x43/0xd0 mm/kasan/kasan.c:448 > set_track mm/kasan/kasan.c:460 [inline] > __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 > kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 > __cache_free mm/slab.c:3498 [inline] > kmem_cache_free+0x83/0x290 mm/slab.c:3760 > free_task_struct kernel/fork.c:163 [inline] > free_task+0x16e/0x1f0 kernel/fork.c:457 > copy_process+0x1dcc/0x87a0 kernel/fork.c:2148 > _do_fork+0x1cb/0x11d0
Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm
syzbot has found a reproducer for the following crash on: HEAD commit:0072a0c14d5b Merge tag 'media/v4.20-4' of git://git.kernel.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=11c885a340 kernel config: https://syzkaller.appspot.com/x/.config?x=b9cc5a440391cbfd dashboard link: https://syzkaller.appspot.com/bug?extid=cbb52e396df3e565ab02 compiler: gcc (GCC) 8.0.1 20180413 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12835e2540 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=172fa5a340 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+cbb52e396df3e565a...@syzkaller.appspotmail.com cgroup: fork rejected by pids controller in /syz2 == BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:182 [inline] BUG: KASAN: use-after-free in task_css include/linux/cgroup.h:477 [inline] BUG: KASAN: use-after-free in mem_cgroup_from_task mm/memcontrol.c:815 [inline] BUG: KASAN: use-after-free in get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 Read of size 8 at addr 8881b72af310 by task syz-executor198/9332 CPU: 0 PID: 9332 Comm: syz-executor198 Not tainted 4.20.0-rc5+ #142 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x244/0x39d lib/dump_stack.c:113 print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 __read_once_size include/linux/compiler.h:182 [inline] task_css include/linux/cgroup.h:477 [inline] mem_cgroup_from_task mm/memcontrol.c:815 [inline] get_mem_cgroup_from_mm.part.62+0x6d7/0x880 mm/memcontrol.c:844 get_mem_cgroup_from_mm mm/memcontrol.c:834 [inline] mem_cgroup_try_charge+0x608/0xe20 mm/memcontrol.c:5888 mcopy_atomic_pte mm/userfaultfd.c:71 [inline] 
mfill_atomic_pte mm/userfaultfd.c:418 [inline] __mcopy_atomic mm/userfaultfd.c:559 [inline] mcopy_atomic+0xb08/0x2c70 mm/userfaultfd.c:609 userfaultfd_copy fs/userfaultfd.c:1705 [inline] userfaultfd_ioctl+0x29fb/0x5610 fs/userfaultfd.c:1851 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x44c7e9 Code: 5d c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b c5 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7f906b69fdb8 EFLAGS: 0246 ORIG_RAX: 0010 RAX: ffda RBX: 006e4a08 RCX: 0044c7e9 RDX: 2100 RSI: c028aa03 RDI: 0004 RBP: 006e4a00 R08: R09: R10: R11: 0246 R12: 006e4a0c R13: 7ffdfd47813f R14: 7f906b6a09c0 R15: 002d Allocated by task 9325: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644 alloc_task_struct_node kernel/fork.c:158 [inline] dup_task_struct kernel/fork.c:843 [inline] copy_process+0x2026/0x87a0 kernel/fork.c:1751 _do_fork+0x1cb/0x11d0 kernel/fork.c:2216 __do_sys_clone kernel/fork.c:2323 [inline] __se_sys_clone kernel/fork.c:2317 [inline] __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 9325: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kmem_cache_free+0x83/0x290 mm/slab.c:3760 free_task_struct 
kernel/fork.c:163 [inline] free_task+0x16e/0x1f0 kernel/fork.c:457 copy_process+0x1dcc/0x87a0 kernel/fork.c:2148 _do_fork+0x1cb/0x11d0 kernel/fork.c:2216 __do_sys_clone kernel/fork.c:2323 [inline] __se_sys_clone kernel/fork.c:2317 [inline] __x64_sys_clone+0xbf/0x150 kernel/fork.c:2317 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at 8881b72ae240 which belongs to the cache task_struct(81:syz2) of size 6080 The buggy address is located 4304 bytes inside of 6080-byte region [8881b72ae240, 8881b72afa00) The buggy address belongs to the page: