AVX512_elapsed_ms has been added to /proc/<pid>/status. Document it in
Documentation/filesystems/proc.txt.
Signed-off-by: Aubrey Li <aubrey...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Tim Chen <tim.c.c...@linux.intel.com>
Cc: Dave Hansen <dave.han...@intel.com>
Cc: Arjan van de Ven <ar...@linux.intel.com>
---
 Documentation/filesystems/proc.txt | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 66cad5c86171..425f2f09c9aa 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -45,6 +45,7 @@ Table of Contents
   3.9   /proc/<pid>/map_files - Information about memory mapped files
   3.10  /proc/<pid>/timerslack_ns - Task timerslack value
   3.11  /proc/<pid>/patch_state - Livepatch patch operation state
+  3.12  /proc/<pid>/AVX512_elapsed_ms - time elapsed since last AVX512 use
 
   4     Configuring procfs
   4.1   Mount options
@@ -207,6 +208,7 @@ read the file /proc/PID/status:
   Speculation_Store_Bypass:       thread vulnerable
   voluntary_ctxt_switches:        0
   nonvoluntary_ctxt_switches:     1
+  AVX512_elapsed_ms:      8
 
 This shows you nearly the same information you would get if you viewed it with
 the ps command. In fact, ps uses the proc file system to obtain its
@@ -224,7 +226,7 @@ asynchronous manner and the value may not be very precise. To see a precise
 snapshot of a moment, you can see /proc/<pid>/smaps file and scan page table.
 It's slow but very precise.
 
-Table 1-2: Contents of the status files (as of 4.19)
+Table 1-2: Contents of the status files (as of 5.1)
 ..............................................................................
  Field                       Content
  Name                        filename of the executable
@@ -289,6 +291,7 @@ Table 1-2: Contents of the status files (as of 4.19)
  Mems_allowed_list           Same as previous, but in "list format"
  voluntary_ctxt_switches     number of voluntary context switches
  nonvoluntary_ctxt_switches  number of non voluntary context switches
+ AVX512_elapsed_ms           time elapsed since last AVX512 use in milliseconds
 ..............................................................................
 
 Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
@@ -1948,6 +1951,29 @@ patched. If the patch is being enabled, then the task has already been
 patched. If the patch is being disabled, then the task hasn't been
 unpatched yet.
 
+3.12 /proc/<pid>/AVX512_elapsed_ms - time elapsed since last AVX512 use
+--------------------------------------------------------------------------
+If AVX512 is supported on the machine, this file displays the time elapsed
+since the task last used AVX512, in milliseconds.
+
+The per-task AVX512 usage tracking is performed at context switch. When the
+task is scheduled out, its AVX512 timestamp is set to the current jiffies
+value if AVX512 usage is detected.
+
+When this interface is queried, AVX512_elapsed_ms is calculated as follows:
+
+	delta = (long)(jiffies_now - AVX512_timestamp);
+	AVX512_elapsed_ms = jiffies_to_msecs(delta);
+
+Because this tracking mechanism depends on context switches, the value of
+AVX512_elapsed_ms can be inaccurate if the AVX512-using task runs alone on a
+CPU and is not scheduled out for a long time. In an extreme experiment with a
+task spinning on AVX512 ops on an isolated CPU, the longest elapsed time
+observed was close to 4 seconds (HZ = 250).
+
+So 5s or even longer is an appropriate threshold for the job scheduler to poll
+and decide if the task should be classified as an AVX512 task and migrated
+away from the core on which a non-AVX512 task is running.
 
 ------------------------------------------------------------------------------
 Configuring procfs
-- 
2.17.1
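
For illustration only, the polling described in section 3.12 could be sketched
from user space roughly as below. This is not part of the patch. The helper
names, the 5000 ms constant (taken from the "5s or even longer" suggestion)
and the handling of a missing or negative value are assumptions made for the
example; the patch itself does not define them.

```python
import re
from typing import Optional

# Assumed threshold, following the "5s or even longer" suggestion in the text.
AVX512_THRESHOLD_MS = 5000

# Matches a status line such as "AVX512_elapsed_ms:\t8".
_FIELD = re.compile(r"^AVX512_elapsed_ms:[ \t]*(-?\d+)[ \t]*$", re.MULTILINE)


def parse_avx512_elapsed_ms(status_text: str) -> Optional[int]:
    """Return the AVX512_elapsed_ms value found in status text, or None."""
    match = _FIELD.search(status_text)
    return int(match.group(1)) if match else None


def is_recent_avx512_user(status_text: str,
                          threshold_ms: int = AVX512_THRESHOLD_MS) -> bool:
    """Classify a task as an AVX512 task if it used AVX512 within threshold_ms.

    Assumption for this sketch: a missing field or a negative value is
    treated as "no AVX512 use recorded".
    """
    elapsed = parse_avx512_elapsed_ms(status_text)
    return elapsed is not None and 0 <= elapsed <= threshold_ms


def read_status(pid: int) -> str:
    """Read /proc/<pid>/status; only meaningful on a kernel with this field."""
    with open(f"/proc/{pid}/status") as f:
        return f.read()
```

A job scheduler would call is_recent_avx512_user(read_status(pid)) at its own
polling interval and treat the result as a hint, since the value is only
updated when the task is scheduled out.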