Re: [PATCH v10 09/14] x86/sgx: Implement async reclamation for cgroup

2024-04-04 Thread Haitao Huang

On Thu, 04 Apr 2024 06:16:54 -0500, Huang, Kai  wrote:


On Wed, 2024-03-27 at 17:22 -0700, Haitao Huang wrote:

  void sgx_cgroup_init(void)
 {
+	sgx_cg_wq = alloc_workqueue("sgx_cg_wq", WQ_UNBOUND | WQ_FREEZABLE, WQ_UNBOUND_MAX_ACTIVE);
+
+	/* All Cgroups functionalities are disabled. */
+	if (WARN_ON(!sgx_cg_wq))
+		return;
+


I don't think you should WARN(), because it's not a kernel bug or similar.  Just
print a message saying EPC cgroup is disabled and move on.

	if (!sgx_cg_wq) {
		pr_err("SGX EPC cgroup disabled: alloc_workqueue() failed.\n");
		return;
	}



Sure
Thanks
Haitao



Re: [PATCH v10 09/14] x86/sgx: Implement async reclamation for cgroup

2024-04-04 Thread Huang, Kai
On Wed, 2024-03-27 at 17:22 -0700, Haitao Huang wrote:
>  
>  void sgx_cgroup_init(void)
>  {
> +	sgx_cg_wq = alloc_workqueue("sgx_cg_wq", WQ_UNBOUND | WQ_FREEZABLE, WQ_UNBOUND_MAX_ACTIVE);
> +
> + /* All Cgroups functionalities are disabled. */
> + if (WARN_ON(!sgx_cg_wq))
> + return;
> +

I don't think you should WARN(), because it's not a kernel bug or similar.  Just
print a message saying EPC cgroup is disabled and move on.

	if (!sgx_cg_wq) {
		pr_err("SGX EPC cgroup disabled: alloc_workqueue() failed.\n");
		return;
	}


[PATCH v10 09/14] x86/sgx: Implement async reclamation for cgroup

2024-03-27 Thread Haitao Huang
From: Kristen Carlson Accardi 

In cases where EPC pages need to be allocated during a page fault and the
cgroup usage is near its limit, asynchronous reclamation needs to be
triggered to avoid blocking the page fault handling.

Create a workqueue, corresponding work item and function definitions
for EPC cgroup to support the asynchronous reclamation.

If the workqueue allocation fails during init, disable the cgroup support.

In sgx_cgroup_try_charge(), if the caller does not allow synchronous
reclamation, queue asynchronous work into the workqueue.

Co-developed-by: Sean Christopherson 
Signed-off-by: Sean Christopherson 
Signed-off-by: Kristen Carlson Accardi 
Co-developed-by: Haitao Huang 
Signed-off-by: Haitao Huang 
---
V10:
- Split asynchronous flow in separate patch. (Kai)
- Consider cgroup disabled when the workqueue allocation fails during
init. (Kai)
- Abstract out sgx_cgroup_should_reclaim().

V9:
- Add comments for static variables. (Jarkko)

V8:
- Remove alignment for substructure variables. (Jarkko)

V7:
- Split this out from the big patch, #10 in V6. (Dave, Kai)
---
 arch/x86/kernel/cpu/sgx/epc_cgroup.c | 134 ++-
 arch/x86/kernel/cpu/sgx/epc_cgroup.h |   1 +
 arch/x86/kernel/cpu/sgx/main.c   |   8 +-
 3 files changed, 135 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.c b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
index f7a487a29ed1..3ca89b1fb7e2 100644
--- a/arch/x86/kernel/cpu/sgx/epc_cgroup.c
+++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
@@ -5,9 +5,63 @@
 #include 
 #include "epc_cgroup.h"
 
+/*
+ * The minimal number of free pages maintained by the per-cgroup reclaimer.
+ * Set this to the low threshold used by the global reclaimer, ksgxd.
+ */
+#define SGX_CG_MIN_FREE_PAGE   (SGX_NR_LOW_PAGES)
+
+/*
+ * If the cgroup limit is close to SGX_CG_MIN_FREE_PAGE, maintaining the minimal
+ * free pages would barely leave any page for use, causing excessive reclamation
+ * and thrashing.
+ *
+ * Define the following limit, below which cgroup does not maintain the minimal
+ * free page threshold. Set this to quadruple of the minimal so at least 75% of
+ * pages can be used without being reclaimed.
+ */
+#define SGX_CG_LOW_LIMIT   (SGX_CG_MIN_FREE_PAGE * 4)
+
 /* The root SGX EPC cgroup */
 static struct sgx_cgroup sgx_cg_root;
 
+/*
+ * The work queue that reclaims EPC pages in the background for cgroups.
+ *
+ * A cgroup schedules a work item into this queue to reclaim pages within the
+ * same cgroup when its usage limit is reached and synchronous reclamation is
+ * not an option, i.e., in a page fault handler.
+ */
+static struct workqueue_struct *sgx_cg_wq;
+
+static inline u64 sgx_cgroup_page_counter_read(struct sgx_cgroup *sgx_cg)
+{
+	return atomic64_read(&sgx_cg->cg->res[MISC_CG_RES_SGX_EPC].usage) / PAGE_SIZE;
+}
+
+static inline u64 sgx_cgroup_max_pages(struct sgx_cgroup *sgx_cg)
+{
+   return READ_ONCE(sgx_cg->cg->res[MISC_CG_RES_SGX_EPC].max) / PAGE_SIZE;
+}
+
+/*
+ * Get the lower bound of limits of a cgroup and its ancestors. Used in
+ * sgx_cgroup_should_reclaim() to determine if EPC usage of a cgroup is
+ * close to its limit or its ancestors', hence reclamation is needed.
+ */
+static inline u64 sgx_cgroup_max_pages_to_root(struct sgx_cgroup *sgx_cg)
+{
+   struct misc_cg *i = sgx_cg->cg;
+   u64 m = U64_MAX;
+
+   while (i) {
+   m = min(m, READ_ONCE(i->res[MISC_CG_RES_SGX_EPC].max));
+   i = misc_cg_parent(i);
+   }
+
+   return m / PAGE_SIZE;
+}
+
 /**
  * sgx_cgroup_lru_empty() - check if a cgroup tree has no pages on its LRUs
  * @root:  Root of the tree to check
@@ -99,6 +153,61 @@ static void sgx_cgroup_reclaim_pages(struct misc_cg *root)
rcu_read_unlock();
 }
 
+/**
+ * sgx_cgroup_should_reclaim() - check if EPC reclamation is needed for a cgroup
+ * @sgx_cg: The cgroup to be checked.
+ *
+ * This function can be used to guard a call to sgx_cgroup_reclaim_pages() where
+ * the minimal number of free pages needs to be maintained for the cgroup to
+ * make good forward progress.
+ *
+ * Return: %true if the number of free pages available for the cgroup is below a
+ * threshold (%SGX_CG_MIN_FREE_PAGE) and there are reclaimable pages within the
+ * cgroup.
+ */
+static bool sgx_cgroup_should_reclaim(struct sgx_cgroup *sgx_cg)
+{
+   u64 cur, max;
+
+   if (sgx_cgroup_lru_empty(sgx_cg->cg))
+   return false;
+
+   max = sgx_cgroup_max_pages_to_root(sgx_cg);
+
+   /*
+* Unless the limit is very low, maintain a minimal number of free pages
+* so there are always a few pages available to serve new allocation
+* requests quickly.
+*/
+   if (max > SGX_CG_LOW_LIMIT)
+   max -= SGX_CG_MIN_FREE_PAGE;
+
+   cur = sgx_cgroup_page_counter_read(sgx_cg);
+
+   return (cur >= max);
+}
+
+/*
+ * Asynchronous work flow to reclaim pages from the cgroup when the cgroup is
+ * at/near its maximum capacity.
+ */
+static void