On Fri, May 15, 2020 at 03:44:02AM +0300, Jarkko Sakkinen wrote:
> +/**
> + * sgx_reclaim_pages() - Reclaim EPC pages from the consumers
> + *
> + * Take a fixed number of pages from the head of the active page pool and
> + * reclaim them to the enclave's private shmem files. Skip the pages, which
> + * have been accessed since the last scan. Move those pages to the tail of
> + * active page pool so that the pages get scanned in LRU like fashion.
> + */
> +void sgx_reclaim_pages(void)
> +{
> +	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
> +	struct sgx_backing backing[SGX_NR_TO_SCAN];
> +	struct sgx_epc_section *section;
> +	struct sgx_encl_page *encl_page;
> +	struct sgx_epc_page *epc_page;
> +	int cnt = 0;
> +	int ret;
> +	int i;
> +
> +	spin_lock(&sgx_active_page_list_lock);
> +	for (i = 0; i < SGX_NR_TO_SCAN; i++) {
> +		if (list_empty(&sgx_active_page_list))
> +			break;
> +
> +		epc_page = list_first_entry(&sgx_active_page_list,
> +					    struct sgx_epc_page, list);
> +		list_del_init(&epc_page->list);
> +		encl_page = epc_page->owner;
> +
> +		if (kref_get_unless_zero(&encl_page->encl->refcount) != 0)
> +			chunk[cnt++] = epc_page;
> +		else
> +			/* The owner is freeing the page. No need to add the
> +			 * page back to the list of reclaimable pages.
> +			 */
> +			epc_page->desc &= ~SGX_EPC_PAGE_RECLAIMABLE;
> +	}
> +	spin_unlock(&sgx_active_page_list_lock);
> +
> +	for (i = 0; i < cnt; i++) {
> +		epc_page = chunk[i];
> +		encl_page = epc_page->owner;
> +
> +		if (!sgx_reclaimer_age(epc_page))
> +			goto skip;
> +
> +		ret = sgx_encl_get_backing(encl_page->encl,
> +					   SGX_ENCL_PAGE_INDEX(encl_page),
> +					   &backing[i]);
> +		if (ret)
> +			goto skip;
> +
> +		mutex_lock(&encl_page->encl->lock);
> +		encl_page->desc |= SGX_ENCL_PAGE_RECLAIMED;
> +		mutex_unlock(&encl_page->encl->lock);
> +		continue;
> +
> +skip:
> +		kref_put(&encl_page->encl->refcount, sgx_encl_release);
> +
> +		spin_lock(&sgx_active_page_list_lock);
> +		list_add_tail(&epc_page->list, &sgx_active_page_list);
> +		spin_unlock(&sgx_active_page_list_lock);

Ugh, this is wrong.  If the above kref_put() drops the last reference and
releases the enclave, adding the page to the active page list will result
in a use-after-free, as the enclave will already have been freed.  It also
leaks the EPC page, because sgx_encl_destroy() skips pages that are in the
process of being reclaimed (as detected by list_empty()).

The "original" code did the put() after list_add_tail(), but the put() was
moved in v15 to fix a bug where it could drop a reference to the wrong
enclave if the page was freed and reallocated by a different CPU between
list_add_tail() and put().  But that particular bug only occurred because
the code at the time was:

	sgx_encl_page_put(epc_page);

I.e. the backpointer in epc_page was consumed after dropping the spin lock.

So long as epc_page->owner (well, epc_page in general) isn't dereferenced
after the page is back on the list, I'm 99% certain this can be fixed
simply by doing kref_put() after moving the page back to the active page
list.

> +
> +		chunk[i] = NULL;
> +	}
> +
> +	for (i = 0; i < cnt; i++) {
> +		epc_page = chunk[i];
> +		if (epc_page)
> +			sgx_reclaimer_block(epc_page);
> +	}
> +
> +	for (i = 0; i < cnt; i++) {
> +		epc_page = chunk[i];
> +		if (!epc_page)
> +			continue;
> +
> +		encl_page = epc_page->owner;
> +		sgx_reclaimer_write(epc_page, &backing[i]);
> +		sgx_encl_put_backing(&backing[i], true);
> +
> +		kref_put(&encl_page->encl->refcount, sgx_encl_release);
> +		epc_page->desc &= ~SGX_EPC_PAGE_RECLAIMABLE;
> +
> +		section = sgx_epc_section(epc_page);
> +		spin_lock(&section->lock);
> +		list_add_tail(&epc_page->list, &section->page_list);
> +		section->free_cnt++;
> +		spin_unlock(&section->lock);
> +	}
> +}
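
To make the ordering argument for the skip path concrete, here is a small userspace model of "requeue first, put last". It is only a sketch, not driver code: struct encl, struct epc_page, encl_put(), active_list_add() and skip_page() are hypothetical stand-ins for the kernel's enclave, EPC page, kref_put(), list_add_tail() under the list lock, and the skip: label, respectively. The enclave pointer is cached before the page is requeued, and the page is never touched afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the enclave and its kref. */
struct encl {
	int refcount;
	bool released;		/* set when the last reference is dropped */
};

/* Hypothetical stand-in for an EPC page with its owner backpointer. */
struct epc_page {
	struct encl *owner;
	bool on_active_list;
};

/* Models kref_put(): returns true if this call dropped the last reference. */
bool encl_put(struct encl *encl)
{
	assert(encl->refcount > 0);
	if (--encl->refcount == 0) {
		encl->released = true;	/* models the release callback */
		return true;
	}
	return false;
}

/* Models list_add_tail() done while holding the active page list lock. */
void active_list_add(struct epc_page *page)
{
	page->on_active_list = true;
}

/*
 * Skip path with the proposed ordering.  Returns true if the enclave was
 * still alive at the moment the page went back on the active list.
 */
bool skip_page(struct epc_page *page)
{
	struct encl *encl = page->owner;	/* cache before requeueing */
	bool alive_at_requeue;

	active_list_add(page);
	alive_at_requeue = !encl->released;

	/* From here on only the cached pointer is used, never 'page'. */
	encl_put(encl);

	return alive_at_requeue;
}
```

In this model, even when the reclaimer holds the last reference, the page is back on the list before the release runs, so the teardown path would see a non-empty list entry instead of skipping the page. Swapping the two calls in skip_page() makes the function return false in that case, which corresponds to the use-after-free described above.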