Re: [PATCH 1/5] Add gcc-auto-profile script

2016-05-29 Thread Andi Kleen
On Mon, May 30, 2016 at 02:39:06AM +0200, Jan Hubicka wrote:
> > 
> > Since maintaining the script would be somewhat tedious (needs changes
> > every time a new CPU comes out) I auto generated it from the online
> > Intel event database. The script to do that is in contrib and can be
> > rerun.
> 
> I guess we need to figure out how to ship this to users.  At the moment
> the script will tell you to rebuild when it meets a new CPU, but it refers
> to gcc sources which is not the best place.

I don't have a better solution.

> 
> Also the script should be installed when it is documented in invoke.texi

I wasn't sure whether it should be installed, and opted not to.
For now I can remove the documentation.

> 
> What happens when you have the wrong perf?

You mean with the manual example? It will just error out,
because it requires Google's specially patched version.
Also the event name is not always the same, so it only works
on some CPUs.

With my script perf should work, but you need LBR support in the kernel for
using -b. LBRs are a model specific feature, and this needs:

- The kernel is new enough and knows about your CPU model
(dmesg | grep Perf.*LBR outputs something)

- LBRs are typically not virtualized[1], so it usually does not work
in VMs

When the LBR support is not there the profiling run will not
work at all.
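
The two conditions above could be checked before starting a profiling
run. A minimal sketch in Python 3, assuming the kernel log and the
contents of /proc/cpuinfo are already available as strings. The function
names are made up for illustration; only the Perf.*LBR pattern and the
"hypervisor" check come from this thread and the patch.

```python
import re


def kernel_reports_lbr(dmesg_text):
    """True if any kernel log line matches Perf.*LBR (the grep shown above)."""
    return any(re.search(r"Perf.*LBR", line)
               for line in dmesg_text.splitlines())


def looks_like_vm(cpuinfo_text):
    """True if /proc/cpuinfo mentions 'hypervisor' (the script's VM warning)."""
    return "hypervisor" in cpuinfo_text


def lbr_problems(dmesg_text, cpuinfo_text):
    """Return reasons why 'perf record -b' may not work on this system."""
    problems = []
    if not kernel_reports_lbr(dmesg_text):
        problems.append("kernel log shows no LBR support for this CPU model")
    if looks_like_vm(cpuinfo_text):
        problems.append("running under a hypervisor; "
                        "LBRs are usually not virtualized")
    return problems
```

An empty list from lbr_problems() only means neither check fired; it is
not a guarantee that the profiling run will succeed.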

-Andi

[1] There are some patches for KVM, but they are not mainline so far.

-- 
a...@linux.intel.com -- Speaking for myself only


Re: [PATCH 1/5] Add gcc-auto-profile script

2016-05-29 Thread Jan Hubicka
Andi,
thanks a lot for working on the auto-fdo bootstrap. It is badly needed to
have some coverage for this feature.  I don't think I can approve the
build machinery changes.

> From: Andi Kleen 
> 
> Using autofdo is currently somewhat difficult. It requires using the
> model specific branches taken event, which differs on different CPUs.
> The example shown in the manual requires a special patched version of
> perf that is non-standard, and also will likely not work everywhere.
> 
> This patch adds a new gcc-auto-profile script that figures out the
> correct event and runs perf.
> 
> This is needed to actually make use of autofdo in a generic way
> in the build system and in the test suite.
> 
> Since maintaining the script would be somewhat tedious (needs changes
> every time a new CPU comes out) I auto generated it from the online
> Intel event database. The script to do that is in contrib and can be
> rerun.

I guess we need to figure out how to ship this to users.  At the moment
the script will tell you to rebuild when it meets a new CPU, but it refers
to gcc sources which is not the best place.

Also the script should be installed when it is documented in invoke.texi

What happens when you have the wrong perf?

Honza


Re: [PATCH 1/5] Add gcc-auto-profile script

2016-05-21 Thread Bernhard Reutner-Fischer
On May 21, 2016 6:36:22 PM GMT+02:00, Andi Kleen  wrote:
>From: Andi Kleen 

>+if [ "$1" = "--kernel" ] ; then
>+  FLAGS=k
>+  shift
>+fi
>+if [ "$1" == "--all" ] ; then

== is legacy, s/==/=/

>+  FLAGS=uk
>+  shift
>+fi
>+
>+if ! grep -q Intel /proc/cpuinfo ; then
>+  echo >&2 "Only Intel CPUs supported"
>+  exit 1
>+fi
>+
>+if grep -q hypervisor /proc/cpuinfo ; then
>+  echo >&2 "Warning: branch profiling may not be functional in VMs"
>+fi

grep && echo would do but OK.

>+
>+case `egrep -q "^cpu family\s*: 6" /proc/cpuinfo &&
>+  egrep "^model\s*:" /proc/cpuinfo | head -1` in

head and tail both require -n nowadays (in fact since susv2, IIRC), so please 
head -n1

thanks,



[PATCH 1/5] Add gcc-auto-profile script

2016-05-21 Thread Andi Kleen
From: Andi Kleen 

Using autofdo is currently somewhat difficult. It requires using the
model specific branches taken event, which differs on different CPUs.
The example shown in the manual requires a special patched version of
perf that is non-standard, and also will likely not work everywhere.

This patch adds a new gcc-auto-profile script that figures out the
correct event and runs perf.
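
As a rough illustration of what "figures out the correct event and runs
perf" amounts to, here is a sketch in Python 3 (the script in this patch
is a shell script). The function names and the shape of the lookup table
are assumptions for illustration; only the "perf record -b -e EVENT
program" form is taken from the patch.

```python
def select_event(model, events_by_model):
    """Look up the taken-branches event string for a CPU model number.

    events_by_model is a hypothetical dict {model number: perf event string};
    in the patch the equivalent table is generated from Intel's event files.
    """
    try:
        return events_by_model[model]
    except KeyError:
        raise LookupError(
            "CPU model %d not in table; the script needs regenerating" % model)


def perf_record_command(event, program_args):
    """Build the argv for 'perf record -b -e EVENT program ...'."""
    if not program_args:
        raise ValueError("no program to profile")
    return ["perf", "record", "-b", "-e", event] + list(program_args)
```

The resulting list could be handed to subprocess.call(); nothing here
checks that perf itself is installed or that LBRs are usable.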

This is needed to actually make use of autofdo in a generic way
in the build system and in the test suite.

Since maintaining the script would be somewhat tedious (needs changes
every time a new CPU comes out) I auto generated it from the online
Intel event database. The script to do that is in contrib and can be
rerun.
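
The core of that generation step is turning one record from an event
file into a perf event string. A minimal Python 3 sketch of that step,
mirroring find_event() in the patch below. It assumes each record carries
EventName, EventCode, UMask and optionally PEBS, and that PEBS is an
integer or a numeric string; the function names are illustrative.

```python
import json

# Event names the patch looks for, in the order it lists them.
TARGET_EVENTS = ("BR_INST_RETIRED.NEAR_TAKEN",
                 "BR_INST_EXEC.TAKEN",
                 "BR_INST_RETIRED.TAKEN_JCC",
                 "BR_INST_TYPE_RETIRED.COND_TAKEN")


def perf_event_string(record):
    """Format one event record as cpu/event=...,umask=.../ for perf."""
    event = "cpu/event=%s,umask=%s/" % (record["EventCode"], record["UMask"])
    # Assumption: PEBS may be stored as a number or a numeric string.
    if int(record.get("PEBS", 0)) > 0:
        event += "p"  # request precise sampling when the event supports it
    return event


def taken_branch_events(json_text):
    """Return (name, perf event string) for every target event in the file."""
    return [(r["EventName"], perf_event_string(r))
            for r in json.loads(json_text)
            if r["EventName"] in TARGET_EVENTS]
```

Records whose EventName is not in TARGET_EVENTS are skipped, so an event
file with no taken-branches event yields an empty list.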

Right now configure does not test whether perf works. This
would vary depending on the build and target system, and since
it currently doesn't work under virtualization and needs an up-to-date
kernel it may often fail in common distribution build setups.

So far the script is not installed.

gcc/:
2016-05-21  Andi Kleen  

* doc/invoke.texi: Document gcc-auto-profile
* config/i386/gcc-auto-profile: New file.

contrib/:

2016-05-21  Andi Kleen  

* gen_autofdo_event.py: New file to regenerate
gcc-auto-profile.
---
 contrib/gen_autofdo_event.py | 155 +++
 gcc/config/i386/gcc-auto-profile |  70 ++
 gcc/doc/invoke.texi  |  31 ++--
 3 files changed, 251 insertions(+), 5 deletions(-)
 create mode 100755 contrib/gen_autofdo_event.py
 create mode 100755 gcc/config/i386/gcc-auto-profile

diff --git a/contrib/gen_autofdo_event.py b/contrib/gen_autofdo_event.py
new file mode 100755
index 000..907430d
--- /dev/null
+++ b/contrib/gen_autofdo_event.py
@@ -0,0 +1,155 @@
+#!/usr/bin/python
+# Generate Intel taken branches Linux perf event script for autofdo profiling.
+
+# Copyright (C) 2016 Free Software Foundation, Inc.
+#
+# GCC is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 3, or (at your option) any later
+# version.
+#
+# GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+# WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+# for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.  */
+
+# Run it with perf record -b -e EVENT program ...
+# The Linux Kernel needs to support the PMU of the current CPU, and
+# It will likely not work in VMs.
+# Add --all to print for all cpus, otherwise for current cpu.
+# Add --script to generate shell script to run correct event.
+#
+# Requires internet (https) access. This may require setting up a proxy
+# with export https_proxy=...
+#
+import urllib2
+import sys
+import json
+import argparse
+import collections
+
+baseurl = "https://download.01.org/perfmon"
+
+target_events = (u'BR_INST_RETIRED.NEAR_TAKEN',
+                 u'BR_INST_EXEC.TAKEN',
+                 u'BR_INST_RETIRED.TAKEN_JCC',
+                 u'BR_INST_TYPE_RETIRED.COND_TAKEN')
+
+ap = argparse.ArgumentParser()
+ap.add_argument('--all', '-a', help='Print for all CPUs', action='store_true')
+ap.add_argument('--script', help='Generate shell script', action='store_true')
+args = ap.parse_args()
+
+eventmap = collections.defaultdict(list)
+
+def get_cpu_str():
+    with open('/proc/cpuinfo', 'r') as c:
+        vendor, fam, model = None, None, None
+        for j in c:
+            n = j.split()
+            if n[0] == 'vendor_id':
+                vendor = n[2]
+            elif n[0] == 'model' and n[1] == ':':
+                model = int(n[2])
+            elif n[0] == 'cpu' and n[1] == 'family':
+                fam = int(n[3])
+            if vendor and fam and model:
+                return "%s-%d-%X" % (vendor, fam, model), model
+    return None, None
+
+def find_event(eventurl, model):
+    print >>sys.stderr, "Downloading", eventurl
+    u = urllib2.urlopen(eventurl)
+    events = json.loads(u.read())
+    u.close()
+
+    found = 0
+    for j in events:
+        if j[u'EventName'] in target_events:
+            event = "cpu/event=%s,umask=%s/" % (j[u'EventCode'], j[u'UMask'])
+            if u'PEBS' in j and j[u'PEBS'] > 0:
+                event += "p"
+            if args.script:
+                eventmap[event].append(model)
+            else:
+                print j[u'EventName'], "event for model", model, "is", event
+            found += 1
+    return found
+
+if not args.all:
+    cpu, model = get_cpu_str()
+    if not cpu:
+        sys.exit("Unknown CPU type")
+
+url = baseurl + "/mapfile.csv"
+print >>sys.stderr, "Downloading", url
+u = urllib2.urlopen(url)
+found = 0