On Fri, 19 Mar 2021 22:01:41 +1300 Kai Huang wrote:
> On Fri, 19 Mar 2021 09:45:23 +0100 Borislav Petkov wrote:
> > On Fri, Mar 19, 2021 at 05:06:02PM +1300, Kai Huang wrote:
> > > Below kernel bug happened when running simple SGX application when EPC
> > > is under pressure.  The root cause is with commit 5b8719504e3a
> > > ("x86/sgx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page()"),
> > > __sgx_alloc_epc_page() returns NULL when there's no free EPC page can be
> > > allocated, while old behavior was it always returned ERR_PTR(-ENOMEM) in
> > > such case.
> > > 
> > > Fix by directly returning the page if __sgx_alloc_epc_page_from_node()
> > > allocates a valid page in fallback to non-local allocation, and always
> > > returning ERR_PTR(-ENOMEM) if no EPC page can be allocated.
> > > 
> > > [  253.474764] BUG: kernel NULL pointer dereference, address: 
> > > 0000000000000008
> > > [  253.500101] #PF: supervisor write access in kernel mode
> > > [  253.525462] #PF: error_code(0x0002) - not-present page
> > > ...
> > > [  254.102041] Call Trace:
> > > [  254.126699]  sgx_ioc_enclave_add_pages+0x241/0x770
> > > [  254.151305]  sgx_ioctl+0x194/0x4b0
> > > [  254.174976]  ? handle_mm_fault+0xd0/0x260
> > > [  254.198470]  ? do_user_addr_fault+0x1ef/0x570
> > > [  254.221827]  __x64_sys_ioctl+0x91/0xc0
> > > [  254.244546]  do_syscall_64+0x38/0x90
> > > [  254.266728]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > [  254.289232] RIP: 0033:0x7fdc4cf4031b
> > > ...
> > > [  254.711480] CR2: 0000000000000008
> > > [  254.735494] ---[ end trace 970dce6d4cdf7f64 ]---
> > > [  254.759915] RIP: 0010:sgx_alloc_epc_page+0x46/0x152
> > > ...
> > > 
> > > Fixes: 5b8719504e3a("x86/sgx: Add a basic NUMA allocation scheme to 
> > > sgx_alloc_epc_page()")
> > > Signed-off-by: Kai Huang <kai.hu...@intel.com>
> > > ---
> > >  arch/x86/kernel/cpu/sgx/main.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > I was on the verge whether to merge that into the original patch since
> > it is the top patch on the branch or create a new one but opted for
> > former because this way it won't break bisection and people won't have
> > to pay attention whether there's a fix patch to the NUMA patch too, in
> > case they wanna backport and whatnot.
> 
> Sure.
> 
> [...]
> 
> > +
> > +   /* Fall back to the non-local NUMA nodes: */
> > +   while (true) {
> > +           nid = next_node_in(nid, sgx_numa_mask);
> > +           if (nid == nid_of_current)
> > +                   break;
> >  
> > -           page = __sgx_alloc_epc_page_from_section(section);
> > +           page = __sgx_alloc_epc_page_from_node(nid);
> >             if (page)
> >                     return page;
> >     }
> 
> It seems "return ERR_PTR(-ENOMEM)" is missing at the bottom of this function?

Oh my fault. This line is not shown in patch probably due to it is not changed
by this patch. Please ignore this.

Reply via email to