Add an AVX512_elapsed_ms field to /proc/<pid>/status and document it
in Documentation/filesystems/proc.txt.

Signed-off-by: Aubrey Li <aubrey...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Tim Chen <tim.c.c...@linux.intel.com>
Cc: Dave Hansen <dave.han...@intel.com>
Cc: Arjan van de Ven <ar...@linux.intel.com>
---
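Note for reviewers: the polling use case described in the new section 3.12
could look roughly like the userspace sketch below. It is only an
illustration under stated assumptions, not part of the patch. The helper
names (parse_avx512_elapsed_ms, is_avx512_task, read_status) are
hypothetical, and because this patch does not say what the field shows for a
task that has never used AVX512, the sketch simply assumes that a missing or
negative value means "not an AVX512 task".

```python
import re

# Threshold suggested in section 3.12 of the patch: 5 s or longer.
AVX512_THRESHOLD_MS = 5000


def parse_avx512_elapsed_ms(status_text):
    """Return the AVX512_elapsed_ms value from status text, or None if absent."""
    match = re.search(r"^AVX512_elapsed_ms:[ \t]*(-?\d+)[ \t]*$",
                      status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def is_avx512_task(status_text, threshold_ms=AVX512_THRESHOLD_MS):
    """Classify a task as an AVX512 task if it used AVX512 recently enough.

    Assumption: a missing or negative value means no recent AVX512 use.
    """
    elapsed = parse_avx512_elapsed_ms(status_text)
    return elapsed is not None and 0 <= elapsed <= threshold_ms


def read_status(pid):
    """Read /proc/<pid>/status for the given pid (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return f.read()
```

A job scheduler would call is_avx512_task(read_status(pid)) at an interval no
shorter than the threshold and migrate tasks based on the result.
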
 Documentation/filesystems/proc.txt | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)
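
The delta calculation documented in section 3.12 can be checked by hand with
a small model like the one below. It is a sketch under assumptions, not the
kernel code: it assumes a 64-bit unsigned jiffies counter and a HZ value
that divides 1000 evenly, in which case the conversion to milliseconds is a
plain multiplication by 1000 / HZ. The Python functions only borrow the
names used in the documentation text.

```python
HZ = 250     # tick rate used in the experiment quoted in section 3.12
BITS = 64    # assumption: unsigned long is 64 bits wide


def jiffies_delta(now, stamp, bits=BITS):
    """Model (long)(now - stamp) for unsigned counters that may wrap."""
    diff = (now - stamp) % (1 << bits)   # unsigned subtraction with wraparound
    if diff >= 1 << (bits - 1):          # reinterpret the result as signed
        diff -= 1 << bits
    return diff


def jiffies_to_msecs(j, hz=HZ):
    """Convert jiffies to milliseconds; only valid when hz divides 1000."""
    if 1000 % hz != 0:
        raise ValueError("this sketch only handles HZ values that divide 1000")
    return (1000 // hz) * j
```

With HZ = 250 each jiffy is 4 ms, so a timestamp that is 1000 jiffies old
yields 4000 ms, and the subtraction stays correct when the counter wraps
between the timestamp and the query.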

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 66cad5c86171..425f2f09c9aa 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -45,6 +45,7 @@ Table of Contents
   3.9   /proc/<pid>/map_files - Information about memory mapped files
   3.10  /proc/<pid>/timerslack_ns - Task timerslack value
   3.11 /proc/<pid>/patch_state - Livepatch patch operation state
+  3.12 /proc/<pid>/AVX512_elapsed_ms - time elapsed since last AVX512 use
 
   4    Configuring procfs
   4.1  Mount options
@@ -207,6 +208,7 @@ read the file /proc/PID/status:
   Speculation_Store_Bypass:       thread vulnerable
   voluntary_ctxt_switches:        0
   nonvoluntary_ctxt_switches:     1
+  AVX512_elapsed_ms:   8
 
 This shows you nearly the same information you would get if you viewed it with
 the ps  command.  In  fact,  ps  uses  the  proc  file  system  to  obtain its
@@ -224,7 +226,7 @@ asynchronous manner and the value may not be very precise. To see a precise
 snapshot of a moment, you can see /proc/<pid>/smaps file and scan page table.
 It's slow but very precise.
 
-Table 1-2: Contents of the status files (as of 4.19)
+Table 1-2: Contents of the status files (as of 5.1)
 ..............................................................................
  Field                       Content
  Name                        filename of the executable
@@ -289,6 +291,7 @@ Table 1-2: Contents of the status files (as of 4.19)
  Mems_allowed_list           Same as previous, but in "list format"
  voluntary_ctxt_switches     number of voluntary context switches
  nonvoluntary_ctxt_switches  number of non voluntary context switches
+ AVX512_elapsed_ms           time elapsed since last AVX512 use in milliseconds
 ..............................................................................
 
 Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
@@ -1948,6 +1951,29 @@ patched.  If the patch is being enabled, then the task has already been
 patched.  If the patch is being disabled, then the task hasn't been
 unpatched yet.
 
+3.12   /proc/<pid>/AVX512_elapsed_ms - time elapsed since last AVX512 use
+--------------------------------------------------------------------------
+If AVX512 is supported on the machine, this file displays the time elapsed
+since the task last used AVX512, in milliseconds.
+
+AVX512 usage is tracked per task at context switch. When the task is
+scheduled out and AVX512 usage is detected, the task's AVX512 timestamp is
+set to the current value of jiffies.
+
+When this interface is queried, AVX512_elapsed_ms is calculated as follows:
+
+       delta = (long)(jiffies_now - AVX512_timestamp);
+       AVX512_elapsed_ms = jiffies_to_msecs(delta);
+
+Because this tracking mechanism depends on context switches, the value of
+AVX512_elapsed_ms can be inaccurate if the task using AVX512 runs alone on
+a CPU and is not scheduled out for a long time. In an extreme experiment, a
+task spun on AVX512 ops on an isolated CPU, and even then the longest elapsed
+time observed was close to 4 seconds (HZ = 250).
+
+So a threshold of 5 seconds or longer is appropriate for a job scheduler that
+polls this value to decide whether the task should be classified as an AVX512
+task and migrated away from a core on which a non-AVX512 task is running.
 
 ------------------------------------------------------------------------------
 Configuring procfs
-- 
2.17.1
