On Mon, Mar 15, 2021 at 03:18:16PM +0200, Jarkko Sakkinen wrote:
> On Mon, Mar 15, 2021 at 08:12:36PM +1300, Kai Huang wrote:
> > On Sat, 13 Mar 2021 12:45:53 +0200 Jarkko Sakkinen wrote:
> > > On Fri, Mar 12, 2021 at 01:21:54PM -0800, Sean Christopherson wrote:
> > > > On Thu, Mar 11, 2021, Kai Huang wrote:
> > > > > From: Jarkko Sakkinen <jar...@kernel.org>
> > > > > 
> > > > > EREMOVE takes a page and removes any association between that page and
> > > > > an enclave.  It must be run on a page before it can be added into
> > > > > another enclave.  Currently, EREMOVE is run as part of pages being freed
> > > > > into the SGX page allocator.  It is not expected to fail.
> > > > > 
> > > > > KVM does not track how guest pages are used, which means that SGX
> > > > > virtualization use of EREMOVE might fail.
> > > > > 
> > > > > Break out the EREMOVE call from the SGX page allocator.  This will allow
> > > > > the SGX virtualization code to use the allocator directly.  (SGX/KVM
> > > > > will also introduce a more permissive EREMOVE helper).
> > > > > 
> > > > > Implement original sgx_free_epc_page() as sgx_encl_free_epc_page() to be
> > > > > more specific that it is used to free EPC page assigned to one enclave.
> > > > > Print an error message when EREMOVE fails to explicitly call out EPC
> > > > > page is leaked, and requires machine reboot to get leaked pages back.
> > > > > 
> > > > > Signed-off-by: Jarkko Sakkinen <jar...@kernel.org>
> > > > > Co-developed-by: Kai Huang <kai.hu...@intel.com>
> > > > > Acked-by: Jarkko Sakkinen <jar...@kernel.org>
> > > > > Signed-off-by: Kai Huang <kai.hu...@intel.com>
> > > > > ---
> > > > > v2->v3:
> > > > > 
> > > > >  - Fixed bug during copy/paste which results in SECS page and va pages are not
> > > > >    correctly freed in sgx_encl_release() (sorry for the mistake).
> > > > >  - Added Jarkko's Acked-by.
> > > > 
> > > > That Acked-by should either be dropped or moved above Co-developed-by to make
> > > > checkpatch happy.
> > > > 
> > > > Reviewed-by: Sean Christopherson <sea...@google.com>
> > > 
> > > Oops, my bad. Yup, ack should be removed.
> > > 
> > > /Jarkko
> > 
> > Hi Jarkko,
> > 
> > Your reply to the cover letter, raising your concern about this patch,
> > 
> > https://lore.kernel.org/lkml/yekjxu262yda8...@kernel.org/
> > 
> > reminded me to do more sanity checking of whether removing EREMOVE from
> > sgx_free_epc_page() will impact other code paths, and I think
> > sgx_encl_release() is not the only place that should be changed:
> > 
> > - sgx_encl_shrink() needs to call sgx_encl_free_epc_page(), since the VA
> > page can already be valid when it is called -- other failures can also
> > trigger sgx_encl_shrink().
> 
> You're right about this, good catch.
> 
> Shrink needs to always do EREMOVE as grow has done EPA, which changes
> EPC page state.
> 
> > - sgx_encl_add_page() should call sgx_encl_free_epc_page() in the
> > "err_out_free:" label, since the EPC page can already be valid when the
> > error happens, i.e. when EEXTEND fails.
> 
> Yes, correct, good work!
> 
> > Other places should be OK per my check, but I'd prefer to just replace all
> > sgx_free_epc_page() call sites in the driver with sgx_encl_free_epc_page(),
> > with one exception: sgx_alloc_va_page(), which calls sgx_free_epc_page() when
> > EPA fails, in which case EREMOVE is not required for sure.
> 
> I would not unless they require it.
> 
> > Your idea, please?
> > 
> > Btw, introducing a driver wrapper of sgx_free_epc_page() does make sense to me,
> > because virtualization has a counterpart in sgx/virt.c too.
> 
> It does make sense to use sgx_free_epc_page() everywhere it's the right
> thing to call, and here's why.
> 
> If there is some unrelated regression that causes an EPC page not to get
> uninitialized when it actually should, doing an extra EREMOVE could mask
> that bug. I.e. it can postpone a failure, which can make a bug harder
> to trace back.
> 

I.e. even though it is true that for correctly working code an extra EREMOVE
has no functional effect, it could change semantics for buggy code.
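
To make the distinction concrete, here is a rough userspace sketch of the
split being discussed. It is not the kernel code: the mock_* names, the
struct and the "valid" check inside the plain free are all assumptions made
up for illustration (the real allocator has no such check, so a stale page
would only show up later, when the page is reused).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for an EPC page; not the kernel's struct sgx_epc_page. */
struct mock_epc_page {
	bool valid;         /* set once EPA/EADD-style setup initialized the page */
	bool eremove_fails; /* models any condition under which EREMOVE fails */
	bool in_free_pool;  /* set once the page is back in the allocator */
};

/* Mock EREMOVE: 0 on success, nonzero on failure. */
static int mock_eremove(struct mock_epc_page *page)
{
	if (page->eremove_fails)
		return 1;
	page->valid = false;
	return 0;
}

/*
 * Plain allocator free: no EREMOVE. The caller must hand in a page that is
 * already uninitialized. The validity check is an assumption of this mock,
 * used only to make a contract violation visible.
 */
static int mock_free_epc_page(struct mock_epc_page *page)
{
	if (page->valid) {
		fprintf(stderr, "BUG: freeing a still-valid EPC page\n");
		return -1;
	}
	page->in_free_pool = true;
	return 0;
}

/*
 * Enclave-side wrapper: EREMOVE first, then free. If EREMOVE fails the
 * page is deliberately leaked rather than returned to the allocator.
 */
static int mock_encl_free_epc_page(struct mock_epc_page *page)
{
	if (mock_eremove(page)) {
		fprintf(stderr, "EREMOVE failed, EPC page leaked until reboot\n");
		return -1;
	}
	return mock_free_epc_page(page);
}
```

With this model, a page that went through EPA and is then passed to
mock_free_epc_page() is reported as a bug, while the same page passed to
mock_encl_free_epc_page() is freed silently. That is the masking effect: if
the wrapper were used at every call site, a path that was supposed to
uninitialize the page earlier but did not would go unnoticed.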

/Jarkko
