On 1/22/20 12:27 PM, Jan Hubicka wrote:
Hi.

I've updated the patch based on Honza's notes and installed it
as 5f32f9cf13f99f6295591927950aaf98aa8dba91.
Thanks, you omitted the patch, but looking at git log I think there is
still problem I commented on in the previous mail:

Whoops, sorry for that.


   for (unsigned i = 0; i < GCOV_TOPN_VALUES; i++)
      {
        if (value == counters[2 * i])
         {
           if (use_atomic)
-           __atomic_fetch_add (&counters[2 * i + 1], 1, __ATOMIC_RELAXED);
+           __atomic_fetch_add (&counters[2 * i + 1], GCOV_TOPN_VALUES,
+                               __ATOMIC_RELAXED);
           else
-           counters[2 * i + 1]++;
+           counters[2 * i + 1] += GCOV_TOPN_VALUES;
           return;
         }
-      else if (counters[2 * i + 1] == 0)
+      else if (counters[2 * i + 1] <= 0)
         {
           /* We found an empty slot.  */
           counters[2 * i] = value;
-         counters[2 * i + 1] = 1;
+         counters[2 * i + 1] = GCOV_TOPN_VALUES;
           return;
         }

Lets say that sequence is Y X X Z Z X and N = 2
you will get

Y 2 _ 0
Y 2 X 2
Y 2 X 4
Y 1 X 3
Y 0 X 2
X 2 X 2

while it ought to be

Y 0 X 4

the loop really needs to walk all counters since non-zero entries are
not guaranteed to be grouped at the beggining.
(I think reordering counter is worse alternative especially for
concurent updates since we probably do not want N to be very large)

It should be fixed in the attached patch.


Just for the record, there are stats for cc1plus for PGO bootstrap:

before:

== Stats for gcc ==
stats for indirect_call:
   total: 9210
   invalid: 760
   tracked values:
     0 values:     6248 times (67.84%)
     1 values:     1795 times (19.49%)
     2 values:      220 times (2.39%)
     3 values:       87 times (0.94%)
     4 values:      100 times (1.09%)

stats for topn:
   total: 1513
   invalid: 220
   tracked values:
     0 values:     1028 times (67.94%)
     1 values:      149 times (9.85%)
     2 values:       50 times (3.30%)
     3 values:       38 times (2.51%)
     4 values:       28 times (1.85%)

after:

stats for indirect_call:
   total: 9210
   invalid: 600
   tracked values:
     0 values:     6280 times (68.19%)
     1 values:     1856 times (20.15%)
     2 values:      264 times (2.87%)
     3 values:      157 times (1.70%)
     4 values:       53 times (0.58%)

stats for topn:
   total: 1513
   invalid: 178
   tracked values:
     0 values:     1034 times (68.34%)
     1 values:      176 times (11.63%)
     2 values:       68 times (4.49%)
     3 values:       39 times (2.58%)
     4 values:       18 times (1.19%)

Thanks, so basically there are 2930 = 1856 + 264 + 157 + 53 + 600
trained entries. Out of that 1795 has always only one target, that
leaves 1135 entries with interesting behaviour wrt merging and we give
up on 600, that is about 50% of them.

Can you send me your stats script?  I will try it out at Clang as well
and will regenerate firefox numbers after the problem above is solved.

Sure:
https://github.com/marxin/script-misc/blob/master/gcov-dump-analysis.py

Make sure you have install gcov-dump in $PATH that corresponds to version
of .gcda files.

Is the patch fine for trunk?

Martin


Honza

Martin

>From 7c378b91be18fc258cc453a138a32405b8ac550c Mon Sep 17 00:00:00 2001
From: Martin Liska <mli...@suse.cz>
Date: Wed, 22 Jan 2020 13:01:28 +0100
Subject: [PATCH] Fix TOP N counter update.

libgcc/ChangeLog:

2020-01-22  Martin Liska  <mli...@suse.cz>

	PR tree-optimization/92924
	* libgcov-profiler.c (__gcov_topn_values_profiler_body): First
	try to find an existing value, then find an empty slot
	if not found.
---
 libgcc/libgcov-profiler.c | 46 ++++++++++++++++++++-------------------
 1 file changed, 24 insertions(+), 22 deletions(-)

diff --git a/libgcc/libgcov-profiler.c b/libgcc/libgcov-profiler.c
index f45ef498a6e..506a9b68022 100644
--- a/libgcc/libgcov-profiler.c
+++ b/libgcc/libgcov-profiler.c
@@ -119,35 +119,37 @@ __gcov_topn_values_profiler_body (gcov_type *counters, gcov_type value,
 
   ++counters;
 
+  /* First try to find an existing value.  */
+  int empty_counter = -1;
+
   for (unsigned i = 0; i < GCOV_TOPN_VALUES; i++)
+    if (value == counters[2 * i])
+      {
+	if (use_atomic)
+	  __atomic_fetch_add (&counters[2 * i + 1], GCOV_TOPN_VALUES,
+			      __ATOMIC_RELAXED);
+	else
+	  counters[2 * i + 1] += GCOV_TOPN_VALUES;
+	return;
+      }
+    else if (counters[2 * i + 1] <= 0 && empty_counter == -1)
+      empty_counter = i;
+
+  /* Find an empty slot for a new value.  */
+  if (empty_counter != -1)
     {
-      if (value == counters[2 * i])
-	{
-	  if (use_atomic)
-	    __atomic_fetch_add (&counters[2 * i + 1], GCOV_TOPN_VALUES,
-				__ATOMIC_RELAXED);
-	  else
-	    counters[2 * i + 1] += GCOV_TOPN_VALUES;
-	  return;
-	}
-      else if (counters[2 * i + 1] <= 0)
-	{
-	  /* We found an empty slot.  */
-	  counters[2 * i] = value;
-	  counters[2 * i + 1] = GCOV_TOPN_VALUES;
-	  return;
-	}
+      counters[2 * empty_counter] = value;
+      counters[2 * empty_counter + 1] = GCOV_TOPN_VALUES;
+      return;
     }
 
   /* We haven't found an empty slot, then decrement all
      counter values by one.  */
   for (unsigned i = 0; i < GCOV_TOPN_VALUES; i++)
-    {
-      if (use_atomic)
-	__atomic_fetch_sub (&counters[2 * i + 1], 1, __ATOMIC_RELAXED);
-      else
-	counters[2 * i + 1]--;
-    }
+    if (use_atomic)
+      __atomic_fetch_sub (&counters[2 * i + 1], 1, __ATOMIC_RELAXED);
+    else
+      counters[2 * i + 1]--;
 }
 
 #ifdef L_gcov_topn_values_profiler
-- 
2.24.1

Reply via email to