Re: [PATCH v7 7/7] powerpc/perf/hv-24x7: Document sysfs event description entries

2015-02-22 Thread Cody P Schafer
On Fri, Jan 30, 2015 at 4:46 PM, Sukadev Bhattiprolu
suka...@linux.vnet.ibm.com wrote:
 From: Cody P Schafer c...@linux.vnet.ibm.com

 Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
 Signed-off-by: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
 ---
 Changelog[v6]
 Update Contact info to Linux on Power Developer list

  .../testing/sysfs-bus-event_source-devices-hv_24x7 | 22 
 ++
  1 file changed, 22 insertions(+)

 diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 
 b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
 index 32f3f5f..f893337 100644
 --- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
 +++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
 @@ -21,3 +21,25 @@ Contact: Linux on PowerPC Developer List 
 linuxppc-dev@lists.ozlabs.org
  Description:
 Exposes the version field of the 24x7 catalog. This is also
 extractable from the provided binary catalog sysfs entry.
 +
 +What:  /sys/bus/event_source/devices/hv_24x7/event_descs/event-name
 +Date:  February 2014
 +Contact:   Linux on PowerPC Developer List 
 linuxppc-dev@lists.ozlabs.org
 +Description:
 +   Provides the description of a particular event as provided by
 +   the firmware. If firmware does not provide a description, no
 +   file will be created.
 +
 +   Note that the event-name lacks the domain suffix appended for
 +   events in the events/ dir.

I'm probably a bit late on this, but:

Please consider removing the need for a user to know about the domain
suffixes (which, as far as I know, are 24x7 specific).
If anyone else ever wants to add firmware/hardware/kernel-provided
event descriptions, they'll need to special-case these, as they
don't match up with the actual event names.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v5 1/4] tools/perf: support parsing parameterized events

2014-12-05 Thread Cody P Schafer
On Thu, Dec 4, 2014 at 7:44 AM, Jiri Olsa jo...@redhat.com wrote:
 On Tue, Dec 02, 2014 at 06:09:35PM -0800, Sukadev Bhattiprolu wrote:
 From: Cody P Schafer c...@linux.vnet.ibm.com

 Enable event specification like:

   pmu/event_name,param1=0x1,param2=0x4/

 Assuming that

   /sys/bus/event_source/devices/pmu/events/event_name

 Contains something like

   param2=$foo,bar=1,param1=$baz

 oops.. sorry to be PITA on this one.. I might have missed something
 in the previous discussion but I guess I might have finally some
 opinion on this ;-)

 here's how I think your patchset works:

 in /sys/bus/event_source/devices/pmu/events/event_name you can actually have:

param2=foo,bar=1,param1=baz

 notice no '$', thats what you add later in 'perf list' output, right?

 Moreover it actually does not matter whats in value 'param2=HERE',
 because it's not used in the config code at all apart from the
 'perf list' display processing.

 So when we discussed the '$' name way, I thought it'd be like:

 in /sys/bus/event_source/devices/pmu/events/event_name you have:
   param2=$foo,bar=1,param1=$baz

 and on command line you'd use:
   pmu/event_name,foo=0x1,bar=0x4/

 to assign directly to the $var, which would justify the $var
 syntax I think..


Agreed, what you've described above sounds like a good idea.

Compared to monopolizing all strings (which is what I did when
initially writing this), using a '$' prefix would allow less pain when
some events suddenly need non-integer parameters.

 anyway we could assign directly to the param term name as you do,
 but I think we just need to mark the term as parametrized, like:

 in /sys/bus/event_source/devices/pmu/events/event_name you have:
   param2=?,bar=1,param1=?

 and on command line you'd use:
   pmu/event_name,param2=0x1,param1=0x4/

 while the config code would check that the param substitution is
 done only for terms with '?' in value, like 'param2=?' and not
 for all PARSE_EVENTS__TERM_TYPE_STR type terms (as of now)

I prefer the `foo=0x1` form mentioned previously: it makes the user
interface much less painful, as we can have event-specific names for
register/hcall fields.

I'm pretty sure the code used to do this, not sure when it was removed
(haven't been following this patchset closely).

That said: I haven't fiddled with this code in a while (it's Suka's at
this point), and there might be arguments the other way on both of
those.
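
To make the '?' marking discussed above concrete, here is a minimal
userspace sketch in C. It is not the tools/perf implementation: it assumes
both the sysfs template and the user's terms are comma-separated name=value
lists, and resolve_event() and find_term() are hypothetical helper names.
Only terms whose template value is exactly '?' are substituted; a '?' term
without a user-supplied value is treated as an error.

```c
#include <stdio.h>
#include <string.h>

/* Find "name=value" in a comma-separated list and copy the value to out.
 * Returns 0 on success, -1 if the name is absent or the value is too long. */
static int find_term(const char *terms, const char *name, size_t name_len,
                     char *out, size_t out_len)
{
    const char *p = terms;

    while (*p) {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len > name_len && p[name_len] == '=' &&
            strncmp(p, name, name_len) == 0) {
            size_t vlen = len - name_len - 1;

            if (vlen >= out_len)
                return -1;
            memcpy(out, p + name_len + 1, vlen);
            out[vlen] = '\0';
            return 0;
        }
        if (!end)
            break;
        p = end + 1;
    }
    return -1;
}

/* Copy tmpl to out, replacing each "name=?" with the user's "name=value".
 * Terms that are not marked with '?' are copied through unchanged. */
static int resolve_event(const char *tmpl, const char *user,
                         char *out, size_t out_len)
{
    const char *p = tmpl;
    size_t used = 0;

    if (out_len == 0)
        return -1;
    out[0] = '\0';

    while (*p) {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        const char *eq = memchr(p, '=', len);
        char value[64];
        int n;

        if (eq && (size_t)(eq - p) + 2 == len && eq[1] == '?') {
            /* Parameterized term: the user must supply a value. */
            if (find_term(user, p, (size_t)(eq - p), value, sizeof(value)))
                return -1;
            n = snprintf(out + used, out_len - used, "%s%.*s=%s",
                         used ? "," : "", (int)(eq - p), p, value);
        } else {
            n = snprintf(out + used, out_len - used, "%s%.*s",
                         used ? "," : "", (int)len, p);
        }
        if (n < 0 || (size_t)n >= out_len - used)
            return -1;
        used += (size_t)n;
        if (!end)
            break;
        p = end + 1;
    }
    return 0;
}
```

With the template "param2=?,bar=1,param1=?" and user terms
"param2=0x1,param1=0x4", resolve_event() produces
"param2=0x1,bar=1,param1=0x4". The '$var' variant would differ only in
looking up the variable name instead of the term name.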

Re: [PATCH v5 6/6] powerpc/perf/hv-24x7: Document sysfs event description entries

2014-12-03 Thread Cody P Schafer
 diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 
 b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
 index 32f3f5f..cf70084 100644
 --- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
 +++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
 @@ -21,3 +21,25 @@ Contact: Linux on PowerPC Developer List 
 linuxppc-dev@lists.ozlabs.org
 +Contact:   Cody P Schafer c...@linux.vnet.ibm.com

Probably want someone else to be the contact here.

Re: [PATCH v4 10/10] powerpc/perf/hv-24x7: Document sysfs event description entries

2014-09-30 Thread Cody P Schafer
 +What:  /sys/bus/event_source/devices/hv_24x7/event_descs/event-name
 +Date:  February 2014
 +Contact:   Cody P Schafer c...@linux.vnet.ibm.com

May want to change this contact email to an address that still works
(perhaps the ppc devel list?)

Re: [PATCH 11/16] byteorder: provide a linux/byteorder.h with {be, le}_to_cpu() and cpu_to_{be, le}() macros

2014-05-28 Thread Cody P Schafer
On Wed, May 28, 2014 at 3:45 AM, David Laight david.lai...@aculab.com wrote:
 From: Cody P Schafer
 Rather than manually specifying the size of the integer to be converted, key
 off of the type size. Reduces duplicate size info and the occurrence of
 certain types of bugs (using the wrong sized conversion).
 ...
 +#define be_to_cpu(v) \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint8_t) , v, \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint16_t), be16_to_cpu(v), \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint32_t), be32_to_cpu(v), \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint64_t), be64_to_cpu(v), \
 + (void)0
 ...

 I'm not at all sure that using the 'size' of the constant will reduce
 the number of bugs - it just introduces a whole new category of bugs.

Certainly, if you mis-size the argument (and thus have mis-sized one of
the variables containing the be value, probably a bug anyhow), there
will be problems.

I put this interface together because of an actual bug I wrote into
the initial code of the hv_24x7 driver (resized a struct member
without adjusting the be*_to_cpu() sizing).
Having this auto sizing macro means I can avoid encoding the size of
a struct field in multiple places.
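
For readers who want to try the size-keyed dispatch outside the kernel,
here is a minimal userspace sketch. It assumes GCC or Clang (for
__builtin_choose_expr); be_to_host() and the beNN_to_host() helpers are
hypothetical stand-ins for the kernel's be_to_cpu() and beNN_to_cpu(), and
they decode bytes explicitly so the result does not depend on host
endianness.

```c
#include <stdint.h>
#include <string.h>

/* Decode a value whose in-memory bytes are big-endian. */
static inline uint16_t be16_to_host(uint16_t v)
{
    unsigned char b[2];

    memcpy(b, &v, sizeof(b));
    return (uint16_t)((b[0] << 8) | b[1]);
}

static inline uint32_t be32_to_host(uint32_t v)
{
    unsigned char b[4];

    memcpy(b, &v, sizeof(b));
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

static inline uint64_t be64_to_host(uint64_t v)
{
    unsigned char b[8];
    uint64_t r = 0;
    size_t i;

    memcpy(b, &v, sizeof(b));
    for (i = 0; i < sizeof(b); i++)
        r = (r << 8) | b[i];
    return r;
}

/* Pick the conversion from the size of the argument. An unsupported size
 * yields (void)0, which fails to compile as soon as the result is used. */
#define be_to_host(v) \
    __builtin_choose_expr(sizeof(v) == sizeof(uint8_t),  (v), \
    __builtin_choose_expr(sizeof(v) == sizeof(uint16_t), be16_to_host(v), \
    __builtin_choose_expr(sizeof(v) == sizeof(uint32_t), be32_to_host(v), \
    __builtin_choose_expr(sizeof(v) == sizeof(uint64_t), be64_to_host(v), \
    (void)0))))

/* Hypothetical on-the-wire record; resizing a field needs no call-site edits. */
struct wire_record {
    uint16_t id;
    uint32_t length;
};
```

A caller writes be_to_host(rec.id) and be_to_host(rec.length); if `length`
is later widened to 64 bits, the call sites pick up the 64-bit conversion
without being touched.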

Re: [PATCH 11/16] byteorder: provide a linux/byteorder.h with {be,le}_to_cpu() and cpu_to_{be,le}() macros

2014-05-28 Thread Cody P Schafer
On Tue, May 27, 2014 at 7:44 PM, Joe Perches j...@perches.com wrote:
 On Tue, 2014-05-27 at 17:22 -0700, Cody P Schafer wrote:
 Rather than manually specifying the size of the integer to be converted, key
 off of the type size. Reduces duplicate size info and the occurrence of
 certain types of bugs (using the wrong sized conversion).
 []
 diff --git a/include/linux/byteorder.h b/include/linux/byteorder.h
 []
 @@ -0,0 +1,34 @@
 +#ifndef LINUX_BYTEORDER_H_
 +#define LINUX_BYTEORDER_H_
 +
 +#include <asm/byteorder.h>
 +
 +#define be_to_cpu(v) \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint8_t) , v, \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint16_t), be16_to_cpu(v), \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint32_t), be32_to_cpu(v), \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint64_t), be64_to_cpu(v), \
 + (void)0

 probably better to use BUILD_BUG instead of these 0 returns


They aren't 0 returns.

$ echo 'int main(void) { int x = (void)0; return x; }' | gcc -x c -
<stdin>: In function ‘main’:
<stdin>:1:26: error: void value not ignored as it ought to be

Re: [PATCH 11/16] byteorder: provide a linux/byteorder.h with {be, le}_to_cpu() and cpu_to_{be, le}() macros

2014-05-28 Thread Cody P Schafer
On Wed, May 28, 2014 at 5:05 PM, Cody P Schafer d...@codyps.com wrote:
 On Wed, May 28, 2014 at 3:45 AM, David Laight david.lai...@aculab.com wrote:
 From: Cody P Schafer
 Rather than manually specifying the size of the integer to be converted, key
 off of the type size. Reduces duplicate size info and the occurrence of
 certain types of bugs (using the wrong sized conversion).
 ...
 +#define be_to_cpu(v) \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint8_t) , v, \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint16_t), be16_to_cpu(v), \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint32_t), be32_to_cpu(v), \
 + __builtin_choose_expr(sizeof(v) == sizeof(uint64_t), be64_to_cpu(v), \
 + (void)0
 ...

 I'm not at all sure that using the 'size' of the constant will reduce
 the number of bugs - it just introduces a whole new category of bugs.

 Certainly, if you mis-size the argument (and thus have missized one of
 the variables containing the be value, probably a bug anyhow), there
 will be problems.

 I put this interface together because of an actual bug I wrote into
 the initial code of the hv_24x7 driver (resized a struct member
 without adjusting the be*_to_cpu() sizing).
 Having this auto sizing macro means I can avoid encoding the size of
 a struct field in multiple places.

To clarify, the point I'm making here is that this simply cuts out 1
more place we can screw up endianness conversion sizing.

Re: [PATCH 11/16] byteorder: provide a linux/byteorder.h with {be, le}_to_cpu() and cpu_to_{be, le}() macros

2014-05-28 Thread Cody P Schafer
On Wed, May 28, 2014 at 6:00 PM, Joe Perches j...@perches.com wrote:
 On Wed, 2014-05-28 at 17:11 -0500, Cody P Schafer wrote:
 On Wed, May 28, 2014 at 5:05 PM, Cody P Schafer d...@codyps.com wrote:
  On Wed, May 28, 2014 at 3:45 AM, David Laight david.lai...@aculab.com wrote:
  From: Cody P Schafer
  Rather than manually specifying the size of the integer to be converted, key
  off of the type size. Reduces duplicate size info and the occurrence of
  certain types of bugs (using the wrong sized conversion).
  ...
  +#define be_to_cpu(v) \
  + __builtin_choose_expr(sizeof(v) == sizeof(uint8_t) , v, \
  + __builtin_choose_expr(sizeof(v) == sizeof(uint16_t), be16_to_cpu(v), \
  + __builtin_choose_expr(sizeof(v) == sizeof(uint32_t), be32_to_cpu(v), \
  + __builtin_choose_expr(sizeof(v) == sizeof(uint64_t), be64_to_cpu(v), \
  + (void)0
  ...
 
  I'm not at all sure that using the 'size' of the constant will reduce
  the number of bugs - it just introduces a whole new category of bugs.
 
  Certainly, if you mis-size the argument (and thus have missized one of
  the variables containing the be value, probably a bug anyhow), there
  will be problems.
 
  I put this interface together because of an actual bug I wrote into
  the initial code of the hv_24x7 driver (resized a struct member
  without adjusting the be*_to_cpu() sizing).
  Having this auto sizing macro means I can avoid encoding the size of
  a struct field in multiple places.

 To clarify, the point I'm making here is that this simply cuts out 1
 more place we can screw up endianness conversion sizing.

 It does screw up other types when you do things like:

 u8 foo = some_function();

 cpu_to_be(foo + 1);

 the return value is sizeof(int) not u8

Yep, that is a very good argument against the cpu_to_{be,le}()
variants. It might make sense to remove them and just have the
{be,le}_to_cpu() ones.
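
The promotion problem Joe describes is easy to reproduce in plain C. The
sketch below uses two hypothetical helpers that report the size a
sizeof-keyed macro would see for its argument: a bare u8, and the same u8
with 1 added. C's integer promotions turn the second expression into an int.

```c
#include <stddef.h>
#include <stdint.h>

/* Size seen by a sizeof-keyed macro when handed a plain 8-bit variable. */
static size_t seen_size_plain(uint8_t foo)
{
    return sizeof(foo);
}

/* Size seen when handed "foo + 1": the operands are promoted to int
 * before the addition, so the expression has type int, not uint8_t. */
static size_t seen_size_plus_one(uint8_t foo)
{
    return sizeof(foo + 1);
}
```

seen_size_plain() returns 1, while seen_size_plus_one() returns
sizeof(int), so a size-keyed cpu_to_be(foo + 1) would silently select the
int-sized conversion.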

[PATCH 00/16] perf: add support for parameterized events from sysfs (powerpc 24x7)

2014-05-27 Thread Cody P Schafer
What this patchset does:

 - the first patch (override sysfs in tools/perf via SYSFS_PATH) was sent out
   previously, but needed a resend anyhow. Having it is useful for testing the
   later changes to tools/perf.
 - the second patch is a bugfix to the powerpc hv-24x7 code that was also
   sent out previously; it is a good idea to have it when testing these
   patches on POWER8 hardware.

 - document perf sysfs and the changes to add parameterized events
   - semi-notably: removes the growing list of specific POWER cpu events and
     begins documenting them generically, much like the docs for
     /sys/modules/MODULENAME do for modules.
 - tools/perf changes to support parameterized events
 - export some parameterized events from the powerpc pmus hv_24x7 and hv_gpci

Description of event parameters from the documentation patch:

Event parameters are a basic way for partial events to be specified in
sysfs with per-event names given to the fields that need to be filled in
when using a particular event.

It is intended for supporting cases where the single 'cpu' parameter is
insufficient. For example, POWER8 has events for physical
sockets/cores/cpus that are accessible from within virtual machines. To
keep using the single 'cpu' parameter we'd need to perform a mapping
between Linux's cpus and the physical machine's cpus (in this case
Linux is running under a hypervisor). This isn't possible because
bindings between our cpus and physical cpus may not be fixed, and we
probably won't have a cpu on each physical cpu.

Description of the sysfs contents when events are parameterized (copied from an
included patch):

Examples:

domain=0x1,offset=0x8,starting_index=phys_cpu

In the case of the last example, a value replacing phys_cpu
would need to be provided by the user selecting the particular
event. This is referred to as event parameterization. All
non-numerical values indicate an event parameter.

Notes on how perf-list displays parameterized events (and how to use them,
again culled from an included patch):

PARAMETERIZED EVENTS


Some pmu events listed by 'perf-list' will be displayed with '?' in them. For
example:

  hv_gpci/dtbp_ptitc,phys_processor_idx=?/

This means that when provided as an event, a value for phys_processor_idx must
also be supplied. For example:

  perf stat -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...


Cody P Schafer (16):
  tools/perf: allow overriding sysfs and proc finding with env var
  powerpc/perf/hv-24x7: use kmem_cache instead of aligned stack
allocations
  perf Documentation: sysfs events/ interfaces
  perf Documentation: remove duplicated docs for powerpc cpu specific
events
  perf Documentation: add event parameters
  tools/perf: annotate list_head with type info
  tools/perf: support parsing parameterized events
  tools/perf: extend format_alias() to include event parameters
  tools/perf: document parameterized events and note symbolically formed
events
  perf: provide sysfs_show for struct perf_pmu_events_attr
  byteorder: provide a linux/byteorder.h with {be,le}_to_cpu() and
cpu_to_{be,le}() macros
  powerpc/perf/hv-24x7: parse catalog and populate sysfs with events
  powerpc/perf/hv-24x7: Documentaion for new sysfs entries which expose
descriptions
  perf: add PMU_EVENT_ATTR_STRING() helper
  powerpc/perf/{hv-gpci,hv-common}: generate requests with counters
annotated
  powerpc/perf/hv-gpci: add the remaining gpci requests

 .../testing/sysfs-bus-event_source-devices-events  | 617 ++--
 .../testing/sysfs-bus-event_source-devices-hv_24x7 |  22 +
 arch/powerpc/perf/hv-24x7-catalog.h                |  25 +
 arch/powerpc/perf/hv-24x7-domains.h                |  19 +
 arch/powerpc/perf/hv-24x7.c                        | 812 -
 arch/powerpc/perf/hv-24x7.h                        |  12 +-
 arch/powerpc/perf/hv-common.c                      |  10 +-
 arch/powerpc/perf/hv-gpci-requests.h               | 258 +++
 arch/powerpc/perf/hv-gpci.c                        |   8 +
 arch/powerpc/perf/hv-gpci.h                        |  37 +-
 arch/powerpc/perf/req-gen/_begin.h                 |  13 +
 arch/powerpc/perf/req-gen/_clear.h                 |   5 +
 arch/powerpc/perf/req-gen/_end.h                   |   4 +
 arch/powerpc/perf/req-gen/_request-begin.h         |  15 +
 arch/powerpc/perf/req-gen/_request-end.h           |   8 +
 arch/powerpc/perf/req-gen/perf.h                   | 155
 include/linux/byteorder.h                          |  34 +
 include/linux/perf_event.h                         |  10 +
 kernel/events/core.c                               |   8 +
 tools/lib/api/fs/fs.c                              |  43 +-
 tools/perf/Documentation/perf-list.txt             |  13 +
 tools/perf/Documentation/perf-record.txt

[PATCH 01/16] tools/perf: allow overriding sysfs and proc finding with env var

2014-05-27 Thread Cody P Schafer
SYSFS_PATH and PROC_PATH environment variables now let the user override
the detection of sysfs and proc locations for testing purposes.

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 tools/lib/api/fs/fs.c | 43 ++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/tools/lib/api/fs/fs.c b/tools/lib/api/fs/fs.c
index 5b5eb78..c1b49c3 100644
--- a/tools/lib/api/fs/fs.c
+++ b/tools/lib/api/fs/fs.c
@@ -1,8 +1,10 @@
 /* TODO merge/factor in debugfs.c here */
 
+#include <ctype.h>
 #include <errno.h>
 #include <stdbool.h>
 #include <stdio.h>
+#include <stdlib.h>
 #include <string.h>
 #include <sys/vfs.h>
 
@@ -96,12 +98,51 @@ static bool fs__check_mounts(struct fs *fs)
return false;
 }
 
+static void mem_toupper(char *f, size_t len)
+{
+   while (len) {
+   *f = toupper(*f);
+   f++;
+   len--;
+   }
+}
+
+/*
+ * Check for NAME_PATH environment variable to override fs location (for
+ * testing). This matches the recommendation in Documentation/sysfs-rules.txt
+ * for SYSFS_PATH.
+ */
+static bool fs__env_override(struct fs *fs)
+{
+   char *override_path;
+   size_t name_len = strlen(fs->name);
+   /* name + _PATH + '\0' */
+   char upper_name[name_len + 5 + 1];
+   memcpy(upper_name, fs->name, name_len);
+   mem_toupper(upper_name, name_len);
+   strcpy(&upper_name[name_len], "_PATH");
+
+   override_path = getenv(upper_name);
+   if (!override_path)
+       return false;
+
+   fs->found = true;
+   strncpy(fs->path, override_path, sizeof(fs->path));
+   return true;
+}
+
 static const char *fs__get_mountpoint(struct fs *fs)
 {
+   if (fs__env_override(fs))
+       return fs->path;
+
    if (fs__check_mounts(fs))
        return fs->path;
 
-   return fs__read_mounts(fs) ? fs->path : NULL;
+   if (fs__read_mounts(fs))
+       return fs->path;
+
+   return NULL;
 }
 
 static const char *fs__mountpoint(int idx)
-- 
1.9.3
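
As a standalone illustration of the override lookup in this patch, the
sketch below derives the variable name from a filesystem name ("sysfs"
becomes "SYSFS_PATH") and consults the environment. It is a userspace
approximation, not the tools/lib/api code: env_override_name() and
fs_path_from_env() are hypothetical names, and unlike the strncpy() in the
patch it rejects an override that does not fit rather than risk an
unterminated path.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Build "<NAME>_PATH" from a filesystem name, e.g. "proc" -> "PROC_PATH".
 * Returns 0 on success, -1 if out is too small. */
static int env_override_name(const char *fs_name, char *out, size_t out_len)
{
    size_t i, n = strlen(fs_name);

    /* sizeof("_PATH") counts the terminating NUL as well. */
    if (n + sizeof("_PATH") > out_len)
        return -1;
    for (i = 0; i < n; i++)
        out[i] = (char)toupper((unsigned char)fs_name[i]);
    memcpy(out + n, "_PATH", sizeof("_PATH"));
    return 0;
}

/* Copy the override path for fs_name into path.
 * Returns 0 if an override is set and fits, -1 otherwise. */
static int fs_path_from_env(const char *fs_name, char *path, size_t path_len)
{
    char var[64];
    const char *val;

    if (env_override_name(fs_name, var, sizeof(var)))
        return -1;
    val = getenv(var);
    if (!val || strlen(val) >= path_len)
        return -1;
    strcpy(path, val);
    return 0;
}
```

A caller would try fs_path_from_env("sysfs", buf, sizeof(buf)) first and
fall back to mount detection when it returns -1, mirroring the order in
fs__get_mountpoint() above.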


[PATCH 02/16] powerpc/perf/hv-24x7: use kmem_cache instead of aligned stack allocations

2014-05-27 Thread Cody P Schafer
Ian pointed out the use of __aligned(4096) caused rather large stack
consumption in single_24x7_request(), so use the kmem_cache
hv_page_cache (which we've already got set up for other allocations)
instead of allocating locally.

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Reported-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 arch/powerpc/perf/hv-24x7.c | 52 -
 1 file changed, 37 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index e0766b8..9a7a830 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -294,7 +294,7 @@ static unsigned long single_24x7_request(u8 domain, u32 offset, u16 ix,
 u16 lpar, u64 *res,
 bool success_expected)
 {
-   unsigned long ret;
+   unsigned long ret = -ENOMEM;
 
/*
 * request_buffer and result_buffer are not required to be 4k aligned,
@@ -304,7 +304,27 @@ static unsigned long single_24x7_request(u8 domain, u32 offset, u16 ix,
struct reqb {
struct hv_24x7_request_buffer buf;
struct hv_24x7_request req;
-   } __packed __aligned(4096) request_buffer = {
+   } __packed *request_buffer;
+   struct resb {
+   struct hv_24x7_data_result_buffer buf;
+   struct hv_24x7_result res;
+   struct hv_24x7_result_element elem;
+   __be64 result;
+   } __packed *result_buffer;
+
+   BUILD_BUG_ON(sizeof(*request_buffer) > 4096);
+   BUILD_BUG_ON(sizeof(*result_buffer) > 4096);
+
+   request_buffer = kmem_cache_alloc(hv_page_cache, GFP_USER);
+
+   if (!request_buffer)
+   goto out_reqb;
+
+   result_buffer = kmem_cache_zalloc(hv_page_cache, GFP_USER);
+   if (!result_buffer)
+   goto out_resb;
+
+   *request_buffer = (struct reqb) {
.buf = {
.interface_version = HV_24X7_IF_VERSION_CURRENT,
.num_requests = 1,
@@ -320,28 +340,30 @@ static unsigned long single_24x7_request(u8 domain, u32 offset, u16 ix,
}
};
 
-   struct resb {
-   struct hv_24x7_data_result_buffer buf;
-   struct hv_24x7_result res;
-   struct hv_24x7_result_element elem;
-   __be64 result;
-   } __packed __aligned(4096) result_buffer = {};
-
ret = plpar_hcall_norets(H_GET_24X7_DATA,
-   virt_to_phys(&request_buffer), sizeof(request_buffer),
-   virt_to_phys(&result_buffer),  sizeof(result_buffer));
+   virt_to_phys(request_buffer), sizeof(*request_buffer),
+   virt_to_phys(result_buffer),  sizeof(*result_buffer));
 
if (ret) {
if (success_expected)
pr_err_ratelimited("hcall failed: %d %#x %#x %d => 0x%lx (%ld) detail=0x%x failing ix=%x\n",
domain, offset, ix, lpar,
ret, ret,
-   result_buffer.buf.detailed_rc,
-   result_buffer.buf.failing_request_ix);
-   return ret;
+   result_buffer->buf.detailed_rc,
+   result_buffer->buf.failing_request_ix);
+   goto out_hcall;
}
 
-   *res = be64_to_cpu(result_buffer.result);
+   *res = be64_to_cpu(result_buffer->result);
+   kfree(result_buffer);
+   kfree(request_buffer);
+   return ret;
+
+out_hcall:
+   kfree(result_buffer);
+out_resb:
+   kfree(request_buffer);
+out_reqb:
return ret;
 }
 
-- 
1.9.3
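
The shape of this change (page-sized heap buffers in place of large aligned
stack objects, with goto-based unwinding) can be sketched in userspace as
follows. Everything here is an assumption for illustration: aligned_alloc()
stands in for the kmem_cache, fake_call() stands in for the hcall, and the
struct layouts are invented.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 4096

struct request { uint8_t domain; uint32_t offset; uint16_t index; };
struct result  { uint64_t value; };

/* Compile-time check that each object fits in one page-sized buffer. */
_Static_assert(sizeof(struct request) <= PAGE_BYTES, "request exceeds a page");
_Static_assert(sizeof(struct result) <= PAGE_BYTES, "result exceeds a page");

/* Stand-in for the hypervisor call: fails for domain 0. */
static int fake_call(const struct request *req, struct result *res)
{
    if (req->domain == 0)
        return -EINVAL;
    res->value = (uint64_t)req->offset + req->index;
    return 0;
}

static int single_request(uint8_t domain, uint32_t offset, uint16_t index,
                          uint64_t *out)
{
    int ret = -ENOMEM;
    struct request *req;
    struct result *res;

    req = aligned_alloc(PAGE_BYTES, PAGE_BYTES);
    if (!req)
        goto out;

    res = aligned_alloc(PAGE_BYTES, PAGE_BYTES);
    if (!res)
        goto out_free_req;
    memset(res, 0, PAGE_BYTES);

    *req = (struct request){ .domain = domain, .offset = offset,
                             .index = index };

    ret = fake_call(req, res);
    if (ret)
        goto out_free_res;

    *out = res->value;

    /* Success falls through the same labels, so each buffer is freed once. */
out_free_res:
    free(res);
out_free_req:
    free(req);
out:
    return ret;
}
```

Letting the success path fall through the cleanup labels avoids repeating
the frees. In kernel code, objects taken from a kmem_cache are normally
returned with kmem_cache_free() rather than kfree(); the sketch sidesteps
that question by using the C library allocator.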


[PATCH 03/16] perf Documentation: sysfs events/ interfaces

2014-05-27 Thread Cody P Schafer
Add documentation for the <event>, <event>.scale, and <event>.unit
files in sysfs.

<event>.scale and <event>.unit were undocumented.
<event> was previously documented only for specific powerpc pmu events.

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 .../testing/sysfs-bus-event_source-devices-events  | 60 ++
 1 file changed, 60 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events 
b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
index 7b40a3c..a5226f0 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
@@ -599,3 +599,63 @@ Description:   POWER-systems specific performance monitoring events
Further, multiple terms like 'event=0x' can be specified
and separated with comma. All available terms are defined in
the /sys/bus/event_source/devices/dev/format file.
+
+What: /sys/bus/event_source/devices/<pmu>/events/<event>
+Date: 2014/02/24
+Contact:   Linux kernel mailing list linux-ker...@vger.kernel.org
+Description:   Per-pmu performance monitoring events specific to the running system
+
+   Each file (except for some of those with a '.' in them, '.unit'
+   and '.scale') in the 'events' directory describes a single
+   performance monitoring event supported by the pmu. The name
+   of the file is the name of the event.
+
+   File contents:
+
+   <term>[=<value>][,<term>[=<value>]]...
+
+   Where <term> is one of the terms listed under
+   /sys/bus/event_source/devices/<pmu>/format/ and <value> is
+   a number in base-16 format with a '0x' prefix (lowercase only).
+   If a <term> is specified alone (without an assigned value), it
+   is implied that 0x1 is assigned to that <term>.
+
+   Examples (each of these lines would be in a separate file):
+
+   event=0x2abc
+   event=0x423,inv,cmask=0x3
+   domain=0x1,offset=0x8,starting_index=0x
+
+   Each of the assignments indicates a value to be assigned to a
+   particular set of bits (as defined by the format file
+   corresponding to the term) in the perf_event structure passed
+   to the perf_open syscall.
+
+What: /sys/bus/event_source/devices/<pmu>/events/<event>.unit
+Date: 2014/02/24
+Contact:   Linux kernel mailing list linux-ker...@vger.kernel.org
+Description:   Perf event units
+
+   A string specifying the English plural numerical unit that <event>
+   (once multiplied by <event>.scale) represents.
+
+   Example:
+
+   Joules
+
+What: /sys/bus/event_source/devices/<pmu>/events/<event>.scale
+Date: 2014/02/24
+Contact:   Linux kernel mailing list linux-ker...@vger.kernel.org
+Description:   Perf event scaling factors
+
+   A string representing a floating point value expressed in
+   scientific notation to be multiplied by the event count
+   received from the kernel to match the unit specified in the
+   <event>.unit file.
+
+   Example:
+
+   2.3283064365386962890625e-10
+
+   This is provided to avoid performing floating point arithmetic
+   in the kernel.
-- 
1.9.3
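
To show how a consumer might apply the scale file described in this patch,
here is a small userspace sketch. It assumes the file contents have already
been read into a string and may end in a single newline; scale_count() is a
hypothetical helper, not a perf function.

```c
#include <stdint.h>
#include <stdlib.h>

/* Multiply a raw event count by the textual scale factor.
 * Returns 0 on success, -1 if scale_str is not a plain number. */
static int scale_count(uint64_t raw, const char *scale_str, double *out)
{
    char *end;
    double scale = strtod(scale_str, &end);

    if (end == scale_str)
        return -1;          /* no digits were parsed */
    if (*end == '\n')
        end++;              /* tolerate one trailing newline */
    if (*end != '\0')
        return -1;          /* trailing garbage */

    *out = (double)raw * scale;
    return 0;
}
```

The example value in the documentation, 2.3283064365386962890625e-10, is
2^-32, so a raw count of 4294967296 scales to exactly 1.0 in the unit named
by the matching .unit file.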


[PATCH 04/16] perf Documentation: remove duplicated docs for powerpc cpu specific events

2014-05-27 Thread Cody P Schafer
Listing specific events doesn't actually help us at all here because:
 - these events actually vary between different ppc processors; they
   aren't guaranteed to be present.
 - the documentation of the (generic) file contents is now superseded by the
   docs for arbitrary event file contents.

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 .../testing/sysfs-bus-event_source-devices-events  | 573 -
 1 file changed, 573 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events 
b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
index a5226f0..20979f8 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
@@ -27,579 +27,6 @@ Description:Generic performance monitoring events
basename.
 
 
-What:  /sys/devices/cpu/events/PM_1PLUS_PPC_CMPL
-   /sys/devices/cpu/events/PM_BRU_FIN
-   /sys/devices/cpu/events/PM_BR_MPRED
-   /sys/devices/cpu/events/PM_CMPLU_STALL
-   /sys/devices/cpu/events/PM_CMPLU_STALL_BRU
-   /sys/devices/cpu/events/PM_CMPLU_STALL_DCACHE_MISS
-   /sys/devices/cpu/events/PM_CMPLU_STALL_DFU
-   /sys/devices/cpu/events/PM_CMPLU_STALL_DIV
-   /sys/devices/cpu/events/PM_CMPLU_STALL_ERAT_MISS
-   /sys/devices/cpu/events/PM_CMPLU_STALL_FXU
-   /sys/devices/cpu/events/PM_CMPLU_STALL_IFU
-   /sys/devices/cpu/events/PM_CMPLU_STALL_LSU
-   /sys/devices/cpu/events/PM_CMPLU_STALL_REJECT
-   /sys/devices/cpu/events/PM_CMPLU_STALL_SCALAR
-   /sys/devices/cpu/events/PM_CMPLU_STALL_SCALAR_LONG
-   /sys/devices/cpu/events/PM_CMPLU_STALL_STORE
-   /sys/devices/cpu/events/PM_CMPLU_STALL_THRD
-   /sys/devices/cpu/events/PM_CMPLU_STALL_VECTOR
-   /sys/devices/cpu/events/PM_CMPLU_STALL_VECTOR_LONG
-   /sys/devices/cpu/events/PM_CYC
-   /sys/devices/cpu/events/PM_GCT_NOSLOT_BR_MPRED
-   /sys/devices/cpu/events/PM_GCT_NOSLOT_BR_MPRED_IC_MISS
-   /sys/devices/cpu/events/PM_GCT_NOSLOT_CYC
-   /sys/devices/cpu/events/PM_GCT_NOSLOT_IC_MISS
-   /sys/devices/cpu/events/PM_GRP_CMPL
-   /sys/devices/cpu/events/PM_INST_CMPL
-   /sys/devices/cpu/events/PM_LD_MISS_L1
-   /sys/devices/cpu/events/PM_LD_REF_L1
-   /sys/devices/cpu/events/PM_RUN_CYC
-   /sys/devices/cpu/events/PM_RUN_INST_CMPL
-   /sys/devices/cpu/events/PM_IC_DEMAND_L2_BR_ALL
-   /sys/devices/cpu/events/PM_GCT_UTIL_7_TO_10_SLOTS
-   /sys/devices/cpu/events/PM_PMC2_SAVED
-   /sys/devices/cpu/events/PM_VSU0_16FLOP
-   /sys/devices/cpu/events/PM_MRK_LSU_DERAT_MISS
-   /sys/devices/cpu/events/PM_MRK_ST_CMPL
-   /sys/devices/cpu/events/PM_NEST_PAIR3_ADD
-   /sys/devices/cpu/events/PM_L2_ST_DISP
-   /sys/devices/cpu/events/PM_L2_CASTOUT_MOD
-   /sys/devices/cpu/events/PM_ISEG
-   /sys/devices/cpu/events/PM_MRK_INST_TIMEO
-   /sys/devices/cpu/events/PM_L2_RCST_DISP_FAIL_ADDR
-   /sys/devices/cpu/events/PM_LSU1_DC_PREF_STREAM_CONFIRM
-   /sys/devices/cpu/events/PM_IERAT_WR_64K
-   /sys/devices/cpu/events/PM_MRK_DTLB_MISS_16M
-   /sys/devices/cpu/events/PM_IERAT_MISS
-   /sys/devices/cpu/events/PM_MRK_PTEG_FROM_LMEM
-   /sys/devices/cpu/events/PM_FLOP
-   /sys/devices/cpu/events/PM_THRD_PRIO_4_5_CYC
-   /sys/devices/cpu/events/PM_BR_PRED_TA
-   /sys/devices/cpu/events/PM_EXT_INT
-   /sys/devices/cpu/events/PM_VSU_FSQRT_FDIV
-   /sys/devices/cpu/events/PM_MRK_LD_MISS_EXPOSED_CYC
-   /sys/devices/cpu/events/PM_LSU1_LDF
-   /sys/devices/cpu/events/PM_IC_WRITE_ALL
-   /sys/devices/cpu/events/PM_LSU0_SRQ_STFWD
-   /sys/devices/cpu/events/PM_PTEG_FROM_RL2L3_MOD
-   /sys/devices/cpu/events/PM_MRK_DATA_FROM_L31_SHR
-   /sys/devices/cpu/events/PM_DATA_FROM_L21_MOD
-   /sys/devices/cpu/events/PM_VSU1_SCAL_DOUBLE_ISSUED
-   /sys/devices/cpu/events/PM_VSU0_8FLOP
-   /sys/devices/cpu/events/PM_POWER_EVENT1
-   /sys/devices/cpu/events/PM_DISP_CLB_HELD_BAL
-   /sys/devices/cpu/events/PM_VSU1_2FLOP
-   /sys/devices/cpu/events/PM_LWSYNC_HELD
-   /sys/devices/cpu/events/PM_PTEG_FROM_DL2L3_SHR
-   /sys/devices/cpu/events/PM_INST_FROM_L21_MOD
-   /sys/devices/cpu/events/PM_IERAT_XLATE_WR_16MPLUS
-   /sys/devices/cpu/events/PM_IC_REQ_ALL

[PATCH 05/16] perf Documentation: add event parameters

2014-05-27 Thread Cody P Schafer
Event parameters are a basic way for partial events to be specified in
sysfs with per-event names given to the fields that need to be filled in
when using a particular event.

It is intended for supporting cases where the single 'cpu' parameter is
insufficient. For example, POWER8 has events for physical
sockets/cores/cpus that are accessible from within virtual machines. To
keep using the single 'cpu' parameter we'd need to perform a mapping
between Linux's cpus and the physical machine's cpus (in this case
Linux is running under a hypervisor). This isn't possible because
bindings between our cpus and physical cpus may not be fixed, and we
probably won't have a cpu on each physical cpu.

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 Documentation/ABI/testing/sysfs-bus-event_source-devices-events | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events 
b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
index 20979f8..c1f9850 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
@@ -52,12 +52,18 @@ Description:Per-pmu performance monitoring events specific to the running syste
event=0x2abc
event=0x423,inv,cmask=0x3
domain=0x1,offset=0x8,starting_index=0x
+   domain=0x1,offset=0x8,starting_index=phys_cpu
 
Each of the assignments indicates a value to be assigned to a
particular set of bits (as defined by the format file
corresponding to the term) in the perf_event structure passed
to the perf_open syscall.
 
+   In the case of the last example, a value replacing phys_cpu
+   would need to be provided by the user selecting the particular
+   event. This is referred to as event parameterization. All
+   non-numerical values indicate an event parameter.
+
 What: /sys/bus/event_source/devices/<pmu>/events/<event>.unit
 Date: 2014/02/24
 Contact:   Linux kernel mailing list linux-ker...@vger.kernel.org
-- 
1.9.3
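
The rule documented here ("all non-numerical values indicate an event
parameter") can be sketched as a small classifier. This is an illustration
only: is_hex_literal() and is_event_parameter() are hypothetical names, and
treating only '0x'-prefixed hex strings as numeric is an assumption drawn
from the value format in the events documentation.

```c
#include <ctype.h>
#include <stdbool.h>

/* True if value is "0x" followed by at least one hex digit. */
static bool is_hex_literal(const char *value)
{
    const char *p;

    if (value[0] != '0' || value[1] != 'x' || value[2] == '\0')
        return false;
    for (p = value + 2; *p; p++)
        if (!isxdigit((unsigned char)*p))
            return false;
    return true;
}

/* True if a term's value names a parameter the user has to fill in. */
static bool is_event_parameter(const char *value)
{
    return value[0] != '\0' && !is_hex_literal(value);
}
```

For domain=0x1,offset=0x8,starting_index=phys_cpu, only "phys_cpu" is
classified as a parameter; "0x1" and "0x8" are fixed values.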


[PATCH 06/16] tools/perf: annotate list_head with type info

2014-05-27 Thread Cody P Schafer
CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 tools/perf/util/pmu.c | 4 ++--
 tools/perf/util/pmu.h | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 00a7dcb..906ae40 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -14,8 +14,8 @@
 
 struct perf_pmu_alias {
char *name;
-   struct list_head terms;
-   struct list_head list;
+   struct list_head terms; /* HEAD struct parse_events_term - list */
+   struct list_head list;  /* ELEM */
char unit[UNIT_MAX_LEN+1];
double scale;
 };
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 8b64125..4a85230 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -17,9 +17,9 @@ struct perf_pmu {
char *name;
__u32 type;
struct cpu_map *cpus;
-   struct list_head format;
-   struct list_head aliases;
-   struct list_head list;
+   struct list_head format;  /* HEAD struct perf_pmu_format - list */
+   struct list_head aliases; /* HEAD struct perf_pmu_alias - list */
+   struct list_head list;/* ELEM */
 };
 
 struct perf_pmu *perf_pmu__find(const char *name);
-- 
1.9.3


[PATCH 07/16] tools/perf: support parsing parameterized events

2014-05-27 Thread Cody P Schafer
Enable event specification like:

pmu/event_name,param1=0x1,param2=0x4/

Assuming that

/sys/bus/event_source/devices/pmu/events/event_name

Contains something like

bar=param2,foo=1,baz=param1

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 tools/perf/util/parse-events.h |  1 +
 tools/perf/util/pmu.c  | 55 ++
 2 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index f1cb4c4..1147e87 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -60,6 +60,7 @@ struct parse_events_term {
int type_val;
int type_term;
struct list_head list;
+   bool used;
 };
 
 struct parse_events_evlist {
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 906ae40..db53fac 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -504,27 +504,57 @@ static __u64 pmu_format_value(unsigned long *format, 
__u64 value)
 }
 
 /*
+ * Term is a string term, and might be a param-term. Try to look up its value
+ * in the remaining terms.
+ * - We have a term like base-or-format-term=param-term,
+ * - We need to find the value supplied for param-term (with param-term named
+ *   in a config string) later on in the term list.
+ */
+static int pmu_resolve_param_term(struct parse_events_term *term,
+                                  struct list_head *head_terms,
+                                  __u64 *value)
+{
+   struct parse_events_term *t;
+
+   list_for_each_entry(t, head_terms, list)
+       if (t->type_val == PARSE_EVENTS__TERM_TYPE_NUM) {
+           if (!strcmp(t->config, term->val.str)) {
+               t->used = true;
+               *value = t->val.num;
+               return 0;
+           }
+       }
+
+   return -1;
+}
+
+/*
  * Setup one of config[12] attr members based on the
  * user input data - term parameter.
  */
 static int pmu_config_term(struct list_head *formats,
   struct perf_event_attr *attr,
-  struct parse_events_term *term)
+  struct parse_events_term *term,
+  struct list_head *head_terms)
 {
struct perf_pmu_format *format;
__u64 *vp;
+   __u64 val;
+
+   /*
+* If this is a parameter we've already used for parameterized-eval,
+* skip it in normal eval.
+*/
+   if (term->used)
+   return 0;
 
/*
-* Support only for hardcoded and numnerial terms.
 * Hardcoded terms should be already in, so nothing
 * to be done for them.
 */
if (parse_events__is_hardcoded_term(term))
return 0;
 
-   if (term->type_val != PARSE_EVENTS__TERM_TYPE_NUM)
-   return -EINVAL;
-
 format = pmu_find_format(formats, term->config);
if (!format)
return -EINVAL;
@@ -544,11 +574,16 @@ static int pmu_config_term(struct list_head *formats,
}
 
/*
-* XXX If we ever decide to go with string values for
-* non-hardcoded terms, here's the place to translate
-* them into value.
+* Either directly use a numeric term, or try to translate string terms
+* using event parameters.
 */
-   *vp |= pmu_format_value(format->bits, term->val.num);
+   if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM)
+       val = term->val.num;
+   else
+       if (pmu_resolve_param_term(term, head_terms, &val))
+           return -EINVAL;
+
+   *vp |= pmu_format_value(format->bits, val);
return 0;
 }
 
@@ -559,7 +594,7 @@ int perf_pmu__config_terms(struct list_head *formats,
struct parse_events_term *term;
 
list_for_each_entry(term, head_terms, list)
-   if (pmu_config_term(formats, attr, term))
+   if (pmu_config_term(formats, attr, term, head_terms))
return -EINVAL;
 
return 0;
-- 
1.9.3
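
For readers following the series outside of tools/perf, the lookup done
by pmu_resolve_param_term() can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not the perf code:
struct term, enum term_type and resolve_param() are hypothetical
stand-ins for struct parse_events_term and the list_head walk, and an
array replaces the linked list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct parse_events_term. */
enum term_type { TERM_NUM, TERM_STR };

struct term {
	const char *config;	/* left-hand side, e.g. "param1" or "bar" */
	enum term_type type;
	uint64_t num;		/* valid when type == TERM_NUM */
	const char *str;	/* valid when type == TERM_STR */
	bool used;		/* consumed as an event parameter */
};

/*
 * Find the numeric term whose name matches the string value of @t,
 * e.g. resolve "bar=param2" using a user supplied "param2=0x4".
 * Returns 0 and stores the value on success, -1 if nothing matches.
 */
static int resolve_param(const struct term *t, struct term *terms, size_t n,
			 uint64_t *value)
{
	size_t i;

	assert(t->type == TERM_STR);
	for (i = 0; i < n; i++) {
		if (terms[i].type == TERM_NUM &&
		    !strcmp(terms[i].config, t->str)) {
			terms[i].used = true;	/* skip it in normal eval */
			*value = terms[i].num;
			return 0;
		}
	}
	return -1;
}
```

Marking the matched term as used is what lets the caller skip it later,
so a parameter name does not also have to exist as a format.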


[PATCH 08/16] tools/perf: extend format_alias() to include event parameters

2014-05-27 Thread Cody P Schafer
This causes `perf list pmu` to show parameters for parameterized events
as follows:

  pmu/event_name,param1=?,param2=?/ [Kernel PMU event]

An example:

  
  hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=?/ [Kernel PMU event]

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 tools/perf/util/pmu.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index db53fac..7b8d067 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -741,10 +741,33 @@ void perf_pmu__set_format(unsigned long *bits, long from, 
long to)
set_bit(b, bits);
 }
 
+static int sub_non_neg(int a, int b)
+{
+   if (b > a)
+   return 0;
+   return a - b;
+}
+
 static char *format_alias(char *buf, int len, struct perf_pmu *pmu,
  struct perf_pmu_alias *alias)
 {
-   snprintf(buf, len, "%s/%s/", pmu->name, alias->name);
+   struct parse_events_term *term;
+   int used = snprintf(buf, len, "%s/%s", pmu->name, alias->name);
+
+   list_for_each_entry(term, &alias->terms, list)
+       if (term->type_val == PARSE_EVENTS__TERM_TYPE_STR)
+           used += snprintf(buf + used, sub_non_neg(len, used),
+                            ",%s=?", term->val.str);
+
+   if (sub_non_neg(len, used) > 0) {
+       buf[used] = '/';
+       used++;
+   }
+   if (sub_non_neg(len, used) > 0) {
+       buf[used] = '\0';
+       used++;
+   } else
+       buf[len - 1] = '\0';
return buf;
 }
 
@@ -795,6 +818,7 @@ void print_pmu_events(const char *event_glob, bool 
name_only)
                if (is_cpu && !name_only)
aliases[j] = format_alias_or(buf, sizeof(buf),
  pmu, alias);
+
aliases[j] = strdup(aliases[j]);
j++;
}
-- 
1.9.3
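
The bounded-append idea behind format_alias() can be tried in isolation.
The following is a hedged userspace sketch rather than the patch itself:
format_event() and its NULL-terminated params array are hypothetical,
and unlike the patch it clamps the write offset as well as the remaining
size, so the pointer handed to snprintf() never goes past the end of the
buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "pmu/event,param=?,.../" into @buf, truncating safely.
 * @params is a hypothetical NULL-terminated array of parameter names.
 * snprintf() returns the length it wanted to write, so @used may exceed
 * @len; the offset is clamped before each append.
 */
static char *format_event(char *buf, int len, const char *pmu,
			  const char *event, const char *const *params)
{
	int used;

	assert(len > 0);
	used = snprintf(buf, len, "%s/%s", pmu, event);
	for (; *params; params++) {
		int off = used < len ? used : len;

		used += snprintf(buf + off, len - off, ",%s=?", *params);
	}

	/* Append the closing '/' only if it and the NUL both still fit. */
	if (used < len - 1) {
		buf[used++] = '/';
		buf[used] = '\0';
	}
	return buf;
}
```

With a 64 byte buffer and the single parameter phys_processor_idx this
yields the same shape that `perf list pmu` prints; with a buffer that is
too small the result is simply cut short and still NUL-terminated.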


[PATCH 09/16] tools/perf: document parameterized events and note symbolically formed events

2014-05-27 Thread Cody P Schafer
CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 tools/perf/Documentation/perf-list.txt   | 13 +
 tools/perf/Documentation/perf-record.txt |  5 +
 2 files changed, 18 insertions(+)

diff --git a/tools/perf/Documentation/perf-list.txt 
b/tools/perf/Documentation/perf-list.txt
index 6fce6a6..626818b 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -89,6 +89,19 @@ raw encoding of 0x1A8 can be used:
 You should refer to the processor specific documentation for getting these
 details. Some of them are referenced in the SEE ALSO section below.
 
+PARAMETERIZED EVENTS
+
+
+Some pmu events listed by 'perf-list' will be displayed with '?' in them. For
+example:
+
+  hv_gpci/dtbp_ptitc,phys_processor_idx=?/
+
+This means that when provided as an event, a value for phys_processor_idx must
+also be supplied. For example:
+
+  perf stat -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...
+
 OPTIONS
 ---
 
diff --git a/tools/perf/Documentation/perf-record.txt 
b/tools/perf/Documentation/perf-record.txt
index c71b0f3..c005180 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -33,6 +33,11 @@ OPTIONS
 - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
  hexadecimal event descriptor.
 
+   - a symbolically formed PMU event like 'pmu/value1=0x3,value2/' where
+ 'value1' and 'value2' are defined as formats in
+ /sys/bus/event_source/devices/pmu/format/* OR are one of 'config',
+ 'config1', 'config2'.
+
 - a hardware breakpoint event in the form of '\mem:addr[:access]'
   where addr is the address in memory you want to break in.
   Access is the memory access type (read, write, execute) it can
-- 
1.9.3


[PATCH 10/16] perf: provide sysfs_show for struct perf_pmu_events_attr

2014-05-27 Thread Cody P Schafer
(struct perf_pmu_events_attr) is defined in include/linux/perf_event.h,
but the only show for it is in x86 and contains x86 specific stuff.

Make a generic one for those of us who are just using the event_str.

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 include/linux/perf_event.h | 3 +++
 kernel/events/core.c   | 8 
 2 files changed, 11 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3356abc..6c1d6dd 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -867,6 +867,9 @@ struct perf_pmu_events_attr {
const char *event_str;
 };
 
+ssize_t perf_event_sysfs_show(struct device *dev, struct device_attribute *attr,
+                              char *page);
+
 #define PMU_EVENT_ATTR(_name, _var, _id, _show)
\
 static struct perf_pmu_events_attr _var = {\
.attr = __ATTR(_name, 0444, _show, NULL),   \
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f83a71a..6830e21 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7971,6 +7971,14 @@ void __init perf_event_init(void)
 != 1024);
 }
 
+ssize_t perf_event_sysfs_show(struct device *dev, struct device_attribute *attr,
+                              char *page)
+{
+   struct perf_pmu_events_attr *pmu_attr =
+       container_of(attr, struct perf_pmu_events_attr, attr);
+   return sprintf(page, "%s\n", pmu_attr->event_str);
+}
+
 static int __init perf_event_sysfs_init(void)
 {
struct pmu *pmu;
-- 
1.9.3
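
The generic show routine relies on container_of() to get from the
embedded attribute back to the structure that holds event_str. A minimal
userspace sketch of that pattern follows; struct attribute,
struct events_attr and events_attr_show() are hypothetical stand-ins for
the kernel types, and container_of() is approximated with offsetof().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace approximation of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for device_attribute and perf_pmu_events_attr. */
struct attribute {
	const char *name;
};

struct events_attr {
	int id;
	struct attribute attr;	/* embedded; callbacks only see this member */
	const char *event_str;
};

/* Generic show(): recover the wrapper from @attr and print its string. */
static int events_attr_show(struct attribute *attr, char *page, size_t len)
{
	struct events_attr *ea = container_of(attr, struct events_attr, attr);

	assert(ea->event_str);
	return snprintf(page, len, "%s\n", ea->event_str);
}
```

Because the callback only needs event_str, one show function can serve
every attribute built this way, which is the point of the patch.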


[PATCH 11/16] byteorder: provide a linux/byteorder.h with {be, le}_to_cpu() and cpu_to_{be, le}() macros

2014-05-27 Thread Cody P Schafer
Rather than manually specifying the size of the integer to be converted,
key off of the type size. Reduces duplicate size info and the occurrence
of certain types of bugs (using the wrong sized conversion).

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 include/linux/byteorder.h | 34 ++
 1 file changed, 34 insertions(+)
 create mode 100644 include/linux/byteorder.h

diff --git a/include/linux/byteorder.h b/include/linux/byteorder.h
new file mode 100644
index 000..c7ab8da
--- /dev/null
+++ b/include/linux/byteorder.h
@@ -0,0 +1,34 @@
+#ifndef LINUX_BYTEORDER_H_
+#define LINUX_BYTEORDER_H_
+
+#include <asm/byteorder.h>
+
+#define be_to_cpu(v) \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint8_t) , v, \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint16_t), be16_to_cpu(v), \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint32_t), be32_to_cpu(v), \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint64_t), be64_to_cpu(v), \
+   (void)0))))
+
+#define le_to_cpu(v) \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint8_t) , v, \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint16_t), le16_to_cpu(v), \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint32_t), le32_to_cpu(v), \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint64_t), le64_to_cpu(v), \
+   (void)0))))
+
+#define cpu_to_le(v) \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint8_t) , v, \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint16_t), cpu_to_le16(v), \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint32_t), cpu_to_le32(v), \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint64_t), cpu_to_le64(v), \
+   (void)0))))
+
+#define cpu_to_be(v) \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint8_t) , v, \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint16_t), cpu_to_be16(v), \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint32_t), cpu_to_be32(v), \
+   __builtin_choose_expr(sizeof(v) == sizeof(uint64_t), cpu_to_be64(v), \
+   (void)0))))
+
+#endif
-- 
1.9.3
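
The size-keyed dispatch can be exercised in userspace with the compiler
builtins alone. This is a sketch under assumptions, not the header from
the patch: swap_bytes() is a hypothetical name, it always swaps rather
than converting relative to the host byte order, and it depends on the
GCC/Clang extensions __builtin_choose_expr() and __builtin_bswap*().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Swap the bytes of @v, picking the helper from sizeof(v) at compile time.
 * An unsupported size yields a void expression, so using the result fails
 * to compile instead of silently truncating.
 */
#define swap_bytes(v) \
	__builtin_choose_expr(sizeof(v) == sizeof(uint8_t),  (v), \
	__builtin_choose_expr(sizeof(v) == sizeof(uint16_t), __builtin_bswap16(v), \
	__builtin_choose_expr(sizeof(v) == sizeof(uint32_t), __builtin_bswap32(v), \
	__builtin_choose_expr(sizeof(v) == sizeof(uint64_t), __builtin_bswap64(v), \
		(void)0))))

/* The selected branch decides the result type, so the width is preserved. */
static_assert(sizeof(swap_bytes((uint32_t)0)) == sizeof(uint32_t),
	      "result keeps the operand's width");
```

Since the branch is chosen from the operand's type, changing a field from
32 to 64 bits no longer requires touching every conversion call site.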


[PATCH 12/16] powerpc/perf/hv-24x7: parse catalog and populate sysfs with events

2014-05-27 Thread Cody P Schafer
Retrieves and parses the 24x7 catalog on POWER systems that supply it
(right now, only POWER 8). Events are exposed via sysfs in the standard
fashion, and are all parameterized.

Catalog is (at the moment) only parsed on boot. It needs re-parsing
when some hypervisor events occur. At that point we'll also need to
prevent old events from continuing to function (counter that is passed
in via spare space in the config values?).

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 arch/powerpc/perf/hv-24x7-catalog.h |  25 ++
 arch/powerpc/perf/hv-24x7-domains.h |  19 +
 arch/powerpc/perf/hv-24x7.c | 760 +++-
 arch/powerpc/perf/hv-24x7.h |  12 +-
 4 files changed, 804 insertions(+), 12 deletions(-)
 create mode 100644 arch/powerpc/perf/hv-24x7-domains.h

diff --git a/arch/powerpc/perf/hv-24x7-catalog.h 
b/arch/powerpc/perf/hv-24x7-catalog.h
index 21b19dd..69e2e1f 100644
--- a/arch/powerpc/perf/hv-24x7-catalog.h
+++ b/arch/powerpc/perf/hv-24x7-catalog.h
@@ -30,4 +30,29 @@ struct hv_24x7_catalog_page_0 {
__u8 reserved6[2];
 } __packed;
 
+struct hv_24x7_event_data {
+   __be16 length; /* in bytes, must be a multiple of 16 */
+   __u8 reserved1[2];
+   __u8 domain; /* Chip = 1, Core = 2 */
+   __u8 reserved2[1];
+   __be16 event_group_record_offs; /* in bytes, must be 8 byte aligned */
+   __be16 event_group_record_len; /* in bytes */
+
+   /* in bytes, offset from event_group_record */
+   __be16 event_counter_offs;
+
+   /* verified_state, unverified_state, caveat_state, broken_state, ... */
+   __be32 flags;
+
+   __be16 primary_group_ix;
+   __be16 group_count;
+   __be16 event_name_len;
+   __u8 remainder[];
+   /* __u8 event_name[event_name_len - 2]; */
+   /* __be16 event_description_len; */
+   /* __u8 event_desc[event_description_len - 2]; */
+   /* __be16 detailed_desc_len; */
+   /* __u8 detailed_desc[detailed_desc_len - 2]; */
+} __packed;
+
 #endif
diff --git a/arch/powerpc/perf/hv-24x7-domains.h 
b/arch/powerpc/perf/hv-24x7-domains.h
new file mode 100644
index 000..9c5c862
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7-domains.h
@@ -0,0 +1,19 @@
+
+/*
+ * DOMAIN(name, num, index_kind, is_physical)
+ *
+ * @name: an all caps token, suitable for use in generating an enum member and
+ *appending to an event name in sysfs.
+ * @num: the number corresponding to the domain as given in documentation. We
+ *   assume the catalog domain and the hcall domain have the same numbering
+ *   (so far they do), but this may need to be changed in the future.
+ * @index_kind: a stringifiable token describing the meaning of the index within the
+ *  given domain. Must fit the parsing rules of the perf sysfs api.
+ * @is_physical: true if the domain is physical, false otherwise (if virtual).
+ */
+DOMAIN(PHYSICAL_CHIP, 0x01, chip, true)
+DOMAIN(PHYSICAL_CORE, 0x02, core, true)
+DOMAIN(VIRTUAL_PROCESSOR_HOME_CORE, 0x03, vcpu, false)
+DOMAIN(VIRTUAL_PROCESSOR_HOME_CHIP, 0x04, vcpu, false)
+DOMAIN(VIRTUAL_PROCESSOR_HOME_NODE, 0x05, vcpu, false)
+DOMAIN(VIRTUAL_PROCESSOR_REMOTE_NODE, 0x06, vcpu, false)
diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 9a7a830..c9b7c55 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1,3 +1,4 @@
+#define DEBUG 1
 /*
  * Hypervisor supplied 24x7 performance counter support
  *
@@ -12,9 +13,13 @@
 
 #define pr_fmt(fmt) "hv-24x7: " fmt
 
+#include <linux/byteorder.h>
 #include <linux/perf_event.h>
+#include <linux/rbtree.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/vmalloc.h>
+
 #include <asm/firmware.h>
 #include <asm/hvcall.h>
 #include <asm/io.h>
@@ -23,6 +28,66 @@
 #include "hv-24x7-catalog.h"
 #include "hv-common.h"
 
+static const char *domain_to_index_string(unsigned domain)
+{
+   switch (domain) {
+#define DOMAIN(n, v, x, c) \
+   case HV_PERF_DOMAIN_##n:\
+       return #x;
+#include "hv-24x7-domains.h"
+#undef DOMAIN
+   default:
+       WARN(1, "unknown domain %d\n", domain);
+       return "UNKNOWN_DOMAIN_INDEX_STRING";
+   }
+}
+
+static const char *event_domain_suffix(unsigned domain)
+{
+   switch (domain) {
+#define DOMAIN(n, v, x, c) \
+   case HV_PERF_DOMAIN_##n:\
+       return "__" #n;
+#include "hv-24x7-domains.h"
+#undef DOMAIN
+   default:
+       WARN(1, "unknown domain %d\n", domain);
+       return "__UNKNOWN_DOMAIN_SUFFIX";
+   }
+}
+
+static bool domain_is_valid(unsigned domain)
+{
+   switch (domain) {
+#define DOMAIN(n, v, x, c) \
+   case HV_PERF_DOMAIN_##n:\
+       /* fall through */
+#include "hv-24x7-domains.h"
+#undef DOMAIN
+       return true;
+   default:
+       return false;
+   }
+}
+
+static bool is_physical_domain
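
(The archived copy of this patch is cut off at this point.)

The DOMAIN() table in hv-24x7-domains.h is an X-macro: the same rows are
expanded several times with different definitions of DOMAIN. A
self-contained sketch of the technique is below. It is an assumption-laden
illustration, not the driver code: a list macro (DOMAIN_TABLE) replaces
the re-included header so that everything fits in one file, only three
rows are shown, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>	/* NULL */

/* One row per domain: DOMAIN(name, num, index_kind, is_physical). */
#define DOMAIN_TABLE(DOMAIN) \
	DOMAIN(PHYSICAL_CHIP, 0x01, chip, true) \
	DOMAIN(PHYSICAL_CORE, 0x02, core, true) \
	DOMAIN(VIRTUAL_PROCESSOR_HOME_CORE, 0x03, vcpu, false)

/* Expansion 1: enum members carrying the documented numbers. */
#define AS_ENUM(n, v, x, c) HV_PERF_DOMAIN_##n = v,
enum hv_perf_domain { DOMAIN_TABLE(AS_ENUM) };

static_assert(HV_PERF_DOMAIN_PHYSICAL_CORE == 0x02, "table numbering");

/* Expansion 2: domain number -> name of the index parameter. */
#define AS_INDEX_CASE(n, v, x, c) case HV_PERF_DOMAIN_##n: return #x;
static const char *domain_index_string(unsigned domain)
{
	switch (domain) {
	DOMAIN_TABLE(AS_INDEX_CASE)
	default:
		return NULL;
	}
}

/* Expansion 3: domain number -> physical (true) or virtual (false). */
#define AS_PHYS_CASE(n, v, x, c) case HV_PERF_DOMAIN_##n: return c;
static bool domain_is_physical(unsigned domain)
{
	switch (domain) {
	DOMAIN_TABLE(AS_PHYS_CASE)
	default:
		return false;
	}
}
```

Adding a domain then means adding one row; the enum and every switch
pick it up without further edits.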

[PATCH 13/16] powerpc/perf/hv-24x7: Documentation for new sysfs entries which expose descriptions

2014-05-27 Thread Cody P Schafer
CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 .../testing/sysfs-bus-event_source-devices-hv_24x7 | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 
b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
index e78ee79..5b501d7 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
@@ -21,3 +21,25 @@ Contact: Cody P Schafer c...@linux.vnet.ibm.com
 Description:
Exposes the version field of the 24x7 catalog. This is also
extractable from the provided binary catalog sysfs entry.
+
+What:  /sys/bus/event_source/devices/hv_24x7/event_descs/event-name
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   Provides the description of a particular event as provided by
+   the firmware. If firmware does not provide a description, no
+   file will be created.
+
+   Note that the event-name lacks the domain suffix appended for
+   events in the events/ dir.
+
+What:  /sys/bus/event_source/devices/hv_24x7/event_long_descs/event-name
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   Provides the long description of a particular event as
+   provided by the firmware. If firmware does not provide a
+   description, no file will be created.
+
+   Note that the event-name lacks the domain suffix appended for
+   events in the events/ dir.
-- 
1.9.3


[PATCH 14/16] perf: add PMU_EVENT_ATTR_STRING() helper

2014-05-27 Thread Cody P Schafer
Helper for constructing static struct perf_pmu_events_attr s.

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 include/linux/perf_event.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 6c1d6dd..1313171 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -876,6 +876,13 @@ static struct perf_pmu_events_attr _var = {
\
.id   =  _id,   \
 };
 
+#define PMU_EVENT_ATTR_STRING(_name, _var, _value) \
+static struct perf_pmu_events_attr _var = {\
+   .attr = __ATTR(_name, 0444, perf_event_sysfs_show, NULL),   \
+   .event_str = _value,\
+};
+
+
 #define PMU_FORMAT_ATTR(_name, _format)
\
 static ssize_t \
 _name##_show(struct device *dev,   \
-- 
1.9.3


[PATCH 16/16] powerpc/perf/hv-gpci: add the remaining gpci requests

2014-05-27 Thread Cody P Schafer
Add the remaining gpci requests that contain counters suitable for use
by perf. Omit those that don't contain any counters (but note their
omission).

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 arch/powerpc/perf/hv-gpci-requests.h | 179 +++
 1 file changed, 179 insertions(+)

diff --git a/arch/powerpc/perf/hv-gpci-requests.h 
b/arch/powerpc/perf/hv-gpci-requests.h
index 0dfc4d9..af3b73c 100644
--- a/arch/powerpc/perf/hv-gpci-requests.h
+++ b/arch/powerpc/perf/hv-gpci-requests.h
@@ -65,6 +65,33 @@ REQUEST(__count(0,   8,  
processor_time_in_timebase_cycles)
 )
 #include I(REQUEST_END)
 
+#define REQUEST_NAME entitled_capped_uncapped_donated_idle_timebase_by_partition
+#define REQUEST_NUM 0x20
+#define REQUEST_IDX_KIND sibling_part_id
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 8,  partition_id)
+   __count(0x8,8,  entitled_cycles)
+   __count(0x10,   8,  consumed_capped_cycles)
+   __count(0x18,   8,  consumed_uncapped_cycles)
+   __count(0x20,   8,  cycles_donated)
+   __count(0x28,   8,  purr_idle_cycles)
+)
+#include I(REQUEST_END)
+
+/*
+ * Not available for counter_info_version = 0x8, use
+ * run_instruction_cycles_by_partition(0x100) instead.
+ */
+#define REQUEST_NAME run_instructions_run_cycles_by_partition
+#define REQUEST_NUM 0x30
+#define REQUEST_IDX_KIND sibling_part_id
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 8,  partition_id)
+   __count(0x8,8,  instructions_completed)
+   __count(0x10,   8,  cycles)
+)
+#include I(REQUEST_END)
+
 #define REQUEST_NAME system_performance_capabilities
 #define REQUEST_NUM 0x40
 #define REQUEST_IDX_KIND M1
@@ -75,5 +102,157 @@ REQUEST(__field(0, 1,  perf_collect_privileged)
 )
 #include I(REQUEST_END)
 
+#define REQUEST_NAME processor_bus_utilization_abc_links
+#define REQUEST_NUM 0x50
+#define REQUEST_IDX_KIND hw_chip_id
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 4,  hw_chip_id)
+   __array(0x4,0xC,reserved1)
+   __count(0x10,   8,  total_link_cycles)
+   __count(0x18,   8,  idle_cycles_for_a_link)
+   __count(0x20,   8,  idle_cycles_for_b_link)
+   __count(0x28,   8,  idle_cycles_for_c_link)
+   __array(0x30,   0x20,   reserved2)
+)
+#include I(REQUEST_END)
+
+#define REQUEST_NAME processor_bus_utilization_wxyz_links
+#define REQUEST_NUM 0x60
+#define REQUEST_IDX_KIND hw_chip_id
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 4,  hw_chip_id)
+   __array(0x4,0xC,reserved1)
+   __count(0x10,   8,  total_link_cycles)
+   __count(0x18,   8,  idle_cycles_for_w_link)
+   __count(0x20,   8,  idle_cycles_for_x_link)
+   __count(0x28,   8,  idle_cycles_for_y_link)
+   __count(0x30,   8,  idle_cycles_for_z_link)
+   __array(0x38,   0x28,   reserved2)
+)
+#include I(REQUEST_END)
+
+#define REQUEST_NAME processor_bus_utilization_gx_links
+#define REQUEST_NUM 0x70
+#define REQUEST_IDX_KIND hw_chip_id
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 4,  hw_chip_id)
+   __array(0x4,0xC,reserved1)
+   __count(0x10,   8,  gx0_in_address_cycles)
+   __count(0x18,   8,  gx0_in_data_cycles)
+   __count(0x20,   8,  gx0_in_retries)
+   __count(0x28,   8,  gx0_in_bus_cycles)
+   __count(0x30,   8,  gx0_in_cycles_total)
+   __count(0x38,   8,  gx0_out_address_cycles)
+   __count(0x40,   8,  gx0_out_data_cycles)
+   __count(0x48,   8,  gx0_out_retries)
+   __count(0x50,   8,  gx0_out_bus_cycles)
+   __count(0x58,   8,  gx0_out_cycles_total)
+   __count(0x60,   8,  gx1_in_address_cycles)
+   __count(0x68,   8,  gx1_in_data_cycles)
+   __count(0x70,   8,  gx1_in_retries)
+   __count(0x78,   8,  gx1_in_bus_cycles)
+   __count(0x80,   8,  gx1_in_cycles_total)
+   __count(0x88,   8,  gx1_out_address_cycles)
+   __count(0x90,   8,  gx1_out_data_cycles)
+   __count(0x98,   8,  gx1_out_retries)
+   __count(0xA0,   8,  gx1_out_bus_cycles)
+   __count(0xA8,   8,  gx1_out_cycles_total)
+)
+#include I(REQUEST_END)
+
+#define REQUEST_NAME processor_bus_utilization_mc_links
+#define REQUEST_NUM 0x80
+#define REQUEST_IDX_KIND hw_chip_id
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 4,  hw_chip_id)
+   __array(0x4,0xC,reserved1)
+   __count(0x10,   8,  mc0_frames)
+   __count(0x18,   8,  mc0_reads)
+   __count(0x20,   8,  mc0_write)
+   __count(0x28,   8,  mc0_total_cycles)
+   __count(0x30,   8,  mc1_frames)
+   __count(0x38,   8,  mc1_reads)
+   __count(0x40,   8,  mc1_writes)
+   __count(0x48,   8,  mc1_total_cycles)
+)
+#include I(REQUEST_END)
+
+/* Processor_config (0x90) skipped, no counters */
+/* Current_processor_frequency

[PATCH 15/16] powerpc/perf/{hv-gpci, hv-common}: generate requests with counters annotated

2014-05-27 Thread Cody P Schafer
This adds (in req-gen/) a framework for defining gpci counter requests.
It uses macro magic similar to ftrace.

Also convert the existing hv-gpci request structures and enum values to
use the new framework (and adjust old users of the structs and enum
values to cope with changes in naming).

In exchange for this macro disaster, we get autogenerated event listing
for GPCI in sysfs, build time field offset checking, and zero
duplication of information about GPCI requests.
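
To give a feel for what the framework buys, here is a much reduced,
single-file sketch of the same idea. It is not the req-gen code and makes
several assumptions: REQ_FIELDS, struct req, struct counter and the AS_*
helpers are hypothetical names, every field is modelled as a byte array,
and the __count/__field/__array distinction is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request description: FIELD(offset, bytes, name). */
#define REQ_FIELDS(FIELD) \
	FIELD(0x0, 8, processor_time_in_timebase_cycles) \
	FIELD(0x8, 4, hw_processor_id) \
	FIELD(0xC, 2, owning_part_id)

/* Expansion 1: a packed struct with one byte array per field. */
#define AS_MEMBER(off, bytes, name) uint8_t name[bytes];
struct req { REQ_FIELDS(AS_MEMBER) } __attribute__((packed));

/* Expansion 2: build-time check that each member sits at its stated offset. */
#define AS_CHECK(off, bytes, name) \
	static_assert(offsetof(struct req, name) == (off), #name " offset");
REQ_FIELDS(AS_CHECK)

/* Expansion 3: a table that could back an event listing. */
struct counter {
	const char *name;
	size_t offset;
	size_t bytes;
};

#define AS_ROW(off, bytes, name) { #name, (off), (bytes) },
static const struct counter counters[] = { REQ_FIELDS(AS_ROW) };
```

If an offset in the description disagrees with the generated layout, the
build stops at the static_assert instead of reading the wrong bytes at
run time.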

CC: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer d...@codyps.com
---
 arch/powerpc/perf/hv-common.c  |  10 +-
 arch/powerpc/perf/hv-gpci-requests.h   |  79 +++
 arch/powerpc/perf/hv-gpci.c|   8 ++
 arch/powerpc/perf/hv-gpci.h|  37 +++
 arch/powerpc/perf/req-gen/_begin.h |  13 +++
 arch/powerpc/perf/req-gen/_clear.h |   5 +
 arch/powerpc/perf/req-gen/_end.h   |   4 +
 arch/powerpc/perf/req-gen/_request-begin.h |  15 +++
 arch/powerpc/perf/req-gen/_request-end.h   |   8 ++
 arch/powerpc/perf/req-gen/perf.h   | 155 +
 10 files changed, 304 insertions(+), 30 deletions(-)
 create mode 100644 arch/powerpc/perf/hv-gpci-requests.h
 create mode 100644 arch/powerpc/perf/req-gen/_begin.h
 create mode 100644 arch/powerpc/perf/req-gen/_clear.h
 create mode 100644 arch/powerpc/perf/req-gen/_end.h
 create mode 100644 arch/powerpc/perf/req-gen/_request-begin.h
 create mode 100644 arch/powerpc/perf/req-gen/_request-end.h
 create mode 100644 arch/powerpc/perf/req-gen/perf.h

diff --git a/arch/powerpc/perf/hv-common.c b/arch/powerpc/perf/hv-common.c
index 47e02b3..7dce8f10 100644
--- a/arch/powerpc/perf/hv-common.c
+++ b/arch/powerpc/perf/hv-common.c
@@ -9,13 +9,13 @@ unsigned long hv_perf_caps_get(struct hv_perf_caps *caps)
unsigned long r;
struct p {
struct hv_get_perf_counter_info_params params;
-   struct cv_system_performance_capabilities caps;
+   struct hv_gpci_system_performance_capabilities caps;
} __packed __aligned(sizeof(uint64_t));
 
struct p arg = {
.params = {
.counter_request = cpu_to_be32(
-   CIR_SYSTEM_PERFORMANCE_CAPABILITIES),
+   HV_GPCI_system_performance_capabilities),
.starting_index = cpu_to_be32(-1),
.counter_info_version_in = 0,
}
@@ -31,9 +31,9 @@ unsigned long hv_perf_caps_get(struct hv_perf_caps *caps)
 
    caps->version = arg.params.counter_info_version_out;
    caps->collect_privileged = !!arg.caps.perf_collect_privileged;
-   caps->ga = !!(arg.caps.capability_mask & CV_CM_GA);
-   caps->expanded = !!(arg.caps.capability_mask & CV_CM_EXPANDED);
-   caps->lab = !!(arg.caps.capability_mask & CV_CM_LAB);
+   caps->ga = !!(arg.caps.capability_mask & HV_GPCI_CM_GA);
+   caps->expanded = !!(arg.caps.capability_mask & HV_GPCI_CM_EXPANDED);
+   caps->lab = !!(arg.caps.capability_mask & HV_GPCI_CM_LAB);
 
return r;
 }
diff --git a/arch/powerpc/perf/hv-gpci-requests.h 
b/arch/powerpc/perf/hv-gpci-requests.h
new file mode 100644
index 000..0dfc4d9
--- /dev/null
+++ b/arch/powerpc/perf/hv-gpci-requests.h
@@ -0,0 +1,79 @@
+
+#include "req-gen/_begin.h"
+
+/*
+ * Based on the document getPerfCountInfo v1.07
+ */
+
+/* this needs to be -1 encoded in hex suitable for parsing by tools/perf. */
+#define M1 0x
+
+/*
+ * #define REQUEST_NAME counter_request_name
+ * #define REQUEST_NUM r_num
+ * #define REQUEST_IDX_KIND starting_index_kind
+ * #include I(REQUEST_BEGIN)
+ * REQUEST(
+ * __field(...)
+ * __field(...)
+ * __array(...)
+ * __count(...)
+ * )
+ * #include I(REQUEST_END)
+ *
+ * - starting_index_kind is one of:
+ *   M1: must be -1
+ *   chip_id: hardware chip id or -1 for current hw chip
+ *   phys_processor_idx:
+ *
+ * __count(offset, bytes, name):
+ * a counter that should be exposed via perf
+ * __field(offset, bytes, name)
+ * a normal field
+ * __array(offset, bytes, name)
+ * an array of bytes
+ *
+ *
+ * @bytes for __count, and __field _must_ be a numeral token
+ * in decimal, not an expression and not in hex.
+ *
+ *
+ * TODO:
+ * - expose secondary index (if any counter ever uses it, only 0xA0
+ *   appears to use it right now, and it doesn't have any counters)
+ * - embed versioning info
+ * - include counter descriptions
+ */
+#define REQUEST_NAME dispatch_timebase_by_processor
+#define REQUEST_NUM 0x10
+#define REQUEST_IDX_KIND phys_processor_idx
+#include I(REQUEST_BEGIN)
+REQUEST(__count(0, 8,  processor_time_in_timebase_cycles)
+   __field(0x8,4,  hw_processor_id)
+   __field(0xC,2,  owning_part_id)
+   __field(0xE,1,  processor_state)
+   __field(0xF,1,  version

Re: [PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-05-22 Thread Cody P Schafer

On 05/22/2014 01:19 AM, Ian Munsie wrote:

Hi Cody,

I just tried building this with gcc 4.5, which failed with the following
warning (treated as an error):

cc1: warnings being treated as errors
arch/powerpc/perf/hv-24x7.c: In function 'single_24x7_request':
arch/powerpc/perf/hv-24x7.c:346:1: error: the frame size of 8192 bytes is larger than 2048 bytes
make[3]: *** [arch/powerpc/perf/hv-24x7.o] Error 1
make[2]: *** [arch/powerpc/perf] Error 2

My .config has CONFIG_FRAME_WARN=2048 (default on 64bit), but the
alignment constraints in this function may require 8K on the stack -
possibly a bit large?



Yep, it is a bit large. In other places in hv-24x7 that use similar 
firmware interfaces (with similar alignment requirements), I've used a 
kmem_cache (hv_page_cache). Testing out a patch that uses that here as well.




Notably for some reason this warning no longer seems to trigger on gcc
4.8 (or at least somewhere between 4.5-4.8), though the assembly does
still show it aligning the buffers.


That's a bit concerning (and might be why I didn't pick it up, using gcc 
4.9.0 over here). Looking at the gcc docs, it seems to indicate that 
alloca() and VLAs aren't counted for -Wframe-larger-than. Perhaps gcc 
decided to move locally defined structures with alignment requirements 
into that same bucket? (while size of the structures is statically 
determinable, the stack consumption due to alignment is [to some degree] 
variable).
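
As a rough userspace illustration of the alternative discussed here
(moving the 4k-aligned buffers off the stack), the sketch below
allocates them from the heap and unwinds with gotos. It is only an
analogy under stated assumptions: aligned_alloc() stands in for the
kmem_cache, and struct request, page_buf_alloc() and do_request() are
hypothetical names, with the firmware call itself not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HV_PAGE 4096	/* alignment the firmware interface asks for */

/* Hypothetical request layout; it must fit in one aligned page. */
struct request {
	uint8_t interface_version;
	uint8_t num_requests;
	uint8_t payload[30];
};

static_assert(sizeof(struct request) <= HV_PAGE, "request must fit in a page");

/*
 * Allocate one zeroed, page-aligned buffer from the heap instead of
 * declaring an aligned local. Returns NULL on failure; release with free().
 */
static void *page_buf_alloc(void)
{
	void *p = aligned_alloc(HV_PAGE, HV_PAGE);	/* C11 */

	if (p)
		memset(p, 0, HV_PAGE);
	return p;
}

/* Two buffers with goto-based unwinding, as in the kernel idiom. */
static int do_request(uint8_t version)
{
	int ret = -1;
	struct request *req;
	void *res;

	req = page_buf_alloc();
	if (!req)
		goto out;
	res = page_buf_alloc();
	if (!res)
		goto out_free_req;

	req->interface_version = version;
	req->num_requests = 1;
	/* The firmware call would go here; only the alignment is checked. */
	if ((uintptr_t)req % HV_PAGE == 0 && (uintptr_t)res % HV_PAGE == 0)
		ret = 0;

	free(res);
out_free_req:
	free(req);
out:
	return ret;
}
```

The stack frame then holds two pointers rather than two page-aligned
structures, at the cost of an allocation that can fail.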



[PATCH] powerpc/perf/hv-24x7: use kmem_cache instead of aligned stack allocations

2014-05-22 Thread Cody P Schafer
Ian pointed out the use of __aligned(4096) caused rather large stack
consumption in single_24x7_request(), so use the kmem_cache
hv_page_cache (which we've already got set up for other allocations)
instead of allocating locally.

Reported-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.c | 52 -
 1 file changed, 37 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index e0766b8..9a7a830 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -294,7 +294,7 @@ static unsigned long single_24x7_request(u8 domain, u32 
offset, u16 ix,
 u16 lpar, u64 *res,
 bool success_expected)
 {
-   unsigned long ret;
+   unsigned long ret = -ENOMEM;
 
/*
 * request_buffer and result_buffer are not required to be 4k aligned,
@@ -304,7 +304,27 @@ static unsigned long single_24x7_request(u8 domain, u32 
offset, u16 ix,
struct reqb {
struct hv_24x7_request_buffer buf;
struct hv_24x7_request req;
-   } __packed __aligned(4096) request_buffer = {
+   } __packed *request_buffer;
+   struct resb {
+   struct hv_24x7_data_result_buffer buf;
+   struct hv_24x7_result res;
+   struct hv_24x7_result_element elem;
+   __be64 result;
+   } __packed *result_buffer;
+
+   BUILD_BUG_ON(sizeof(*request_buffer) > 4096);
+   BUILD_BUG_ON(sizeof(*result_buffer) > 4096);
+
+   request_buffer = kmem_cache_alloc(hv_page_cache, GFP_USER);
+
+   if (!request_buffer)
+   goto out_reqb;
+
+   result_buffer = kmem_cache_zalloc(hv_page_cache, GFP_USER);
+   if (!result_buffer)
+   goto out_resb;
+
+   *request_buffer = (struct reqb) {
.buf = {
.interface_version = HV_24X7_IF_VERSION_CURRENT,
.num_requests = 1,
@@ -320,28 +340,30 @@ static unsigned long single_24x7_request(u8 domain, u32 
offset, u16 ix,
}
};
 
-   struct resb {
-   struct hv_24x7_data_result_buffer buf;
-   struct hv_24x7_result res;
-   struct hv_24x7_result_element elem;
-   __be64 result;
-   } __packed __aligned(4096) result_buffer = {};
-
ret = plpar_hcall_norets(H_GET_24X7_DATA,
-   virt_to_phys(request_buffer), sizeof(request_buffer),
-   virt_to_phys(result_buffer),  sizeof(result_buffer));
+   virt_to_phys(request_buffer), sizeof(*request_buffer),
+   virt_to_phys(result_buffer),  sizeof(*result_buffer));
 
if (ret) {
if (success_expected)
pr_err_ratelimited("hcall failed: %d %#x %#x %d => 0x%lx (%ld) detail=0x%x failing ix=%x\n",
domain, offset, ix, lpar,
ret, ret,
-   result_buffer.buf.detailed_rc,
-   result_buffer.buf.failing_request_ix);
-   return ret;
+   result_buffer->buf.detailed_rc,
+   result_buffer->buf.failing_request_ix);
+   goto out_hcall;
}
 
-   *res = be64_to_cpu(result_buffer.result);
+   *res = be64_to_cpu(result_buffer->result);
+   kfree(result_buffer);
+   kfree(request_buffer);
+   return ret;
+
+out_hcall:
+   kfree(result_buffer);
+out_resb:
+   kfree(request_buffer);
+out_reqb:
return ret;
 }
 
-- 
1.9.3


Re: [PATCH] powerpc/perf/hv-24x7: use kmem_cache instead of aligned stack allocations

2014-05-22 Thread Cody P Schafer

On 05/22/2014 03:38 PM, Stephen Rothwell wrote:

Hi Cody,

On Thu, 22 May 2014 15:29:08 -0700 Cody P Schafer c...@linux.vnet.ibm.com 
wrote:


-   *res = be64_to_cpu(result_buffer.result);
+   *res = be64_to_cpu(result_buffer->result);
+   kfree(result_buffer);
+   kfree(request_buffer);
+   return ret;


Why not just fall through here by removing the above 3 lines?


No reason except me not noticing it.


+
+out_hcall:
+   kfree(result_buffer);
+out_resb:
+   kfree(request_buffer);
+out_reqb:
return ret;
  }
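
The fall-through suggested above can be sketched in user space. This is an illustration under stated assumptions, not the driver code: malloc()/calloc()/free() stand in for the kmem_cache calls, and fake_hcall() and single_request() are hypothetical names. In kernel code, buffers obtained from a kmem_cache are normally returned with kmem_cache_free() on that cache; the sketch sidesteps that question by using the C library allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the hypervisor call: returns 0 and fills
 * *res on success, or a nonzero status when `fail` is set.
 */
static long fake_hcall(const long *req, long *res, int fail)
{
	if (fail)
		return 5;
	*res = *req + 1;
	return 0;
}

/* Success and failure share one cleanup ladder; labels unwind in reverse order. */
static long single_request(long in, long *out, int fail)
{
	long ret = -ENOMEM;
	long *request_buffer;
	long *result_buffer;

	assert(out != NULL);

	request_buffer = malloc(sizeof(*request_buffer));
	if (!request_buffer)
		goto out_reqb;

	result_buffer = calloc(1, sizeof(*result_buffer));
	if (!result_buffer)
		goto out_resb;

	*request_buffer = in;
	ret = fake_hcall(request_buffer, result_buffer, fail);
	if (ret)
		goto out_hcall;

	*out = *result_buffer;
	/* the success path falls through into the cleanup below */
out_hcall:
	free(result_buffer);
out_resb:
	free(request_buffer);
out_reqb:
	return ret;
}
```

Because ret is already 0 on the success path, no separate "free both and return" block is needed; the only difference between the two paths is whether *out was written.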





[PATCH v2] powerpc/perf/hv-24x7: use kmem_cache instead of aligned stack allocations

2014-05-22 Thread Cody P Schafer
Ian pointed out the use of __aligned(4096) caused rather large stack
consumption in single_24x7_request(), so use the kmem_cache
hv_page_cache (which we've already got set up for other allocations)
instead of allocating locally.

Reported-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
In v2:
  - remove duplicate exit path


 arch/powerpc/perf/hv-24x7.c | 48 +++--
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index e0766b8..998863b 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -294,7 +294,7 @@ static unsigned long single_24x7_request(u8 domain, u32 
offset, u16 ix,
 u16 lpar, u64 *res,
 bool success_expected)
 {
-   unsigned long ret;
+   unsigned long ret = -ENOMEM;
 
/*
 * request_buffer and result_buffer are not required to be 4k aligned,
@@ -304,7 +304,27 @@ static unsigned long single_24x7_request(u8 domain, u32 
offset, u16 ix,
struct reqb {
struct hv_24x7_request_buffer buf;
struct hv_24x7_request req;
-   } __packed __aligned(4096) request_buffer = {
+   } __packed *request_buffer;
+   struct resb {
+   struct hv_24x7_data_result_buffer buf;
+   struct hv_24x7_result res;
+   struct hv_24x7_result_element elem;
+   __be64 result;
+   } __packed *result_buffer;
+
+   BUILD_BUG_ON(sizeof(*request_buffer) > 4096);
+   BUILD_BUG_ON(sizeof(*result_buffer) > 4096);
+
+   request_buffer = kmem_cache_alloc(hv_page_cache, GFP_USER);
+
+   if (!request_buffer)
+   goto out_reqb;
+
+   result_buffer = kmem_cache_zalloc(hv_page_cache, GFP_USER);
+   if (!result_buffer)
+   goto out_resb;
+
+   *request_buffer = (struct reqb) {
.buf = {
.interface_version = HV_24X7_IF_VERSION_CURRENT,
.num_requests = 1,
@@ -320,28 +340,26 @@ static unsigned long single_24x7_request(u8 domain, u32 
offset, u16 ix,
}
};
 
-   struct resb {
-   struct hv_24x7_data_result_buffer buf;
-   struct hv_24x7_result res;
-   struct hv_24x7_result_element elem;
-   __be64 result;
-   } __packed __aligned(4096) result_buffer = {};
-
ret = plpar_hcall_norets(H_GET_24X7_DATA,
-   virt_to_phys(request_buffer), sizeof(request_buffer),
-   virt_to_phys(result_buffer),  sizeof(result_buffer));
+   virt_to_phys(request_buffer), sizeof(*request_buffer),
+   virt_to_phys(result_buffer),  sizeof(*result_buffer));
 
if (ret) {
if (success_expected)
pr_err_ratelimited("hcall failed: %d %#x %#x %d => 0x%lx (%ld) detail=0x%x failing ix=%x\n",
domain, offset, ix, lpar,
ret, ret,
-   result_buffer.buf.detailed_rc,
-   result_buffer.buf.failing_request_ix);
-   return ret;
+   result_buffer->buf.detailed_rc,
+   result_buffer->buf.failing_request_ix);
+   goto out_hcall;
}
 
-   *res = be64_to_cpu(result_buffer.result);
+   *res = be64_to_cpu(result_buffer->result);
+out_hcall:
+   kfree(result_buffer);
+out_resb:
+   kfree(request_buffer);
+out_reqb:
return ret;
 }
 
-- 
1.9.3


Re: [PATCH v2] powerpc/perf/hv-24x7: use kmem_cache instead of aligned stack allocations

2014-05-22 Thread Cody P Schafer

On 05/22/2014 04:49 PM, Stephen Rothwell wrote:

Hi Cody,

On Thu, 22 May 2014 15:44:25 -0700 Cody P Schafer c...@linux.vnet.ibm.com 
wrote:


if (ret) {
if (success_expected)
pr_err_ratelimited("hcall failed: %d %#x %#x %d => 0x%lx (%ld) detail=0x%x failing ix=%x\n",
domain, offset, ix, lpar,
ret, ret,
-   result_buffer.buf.detailed_rc,
-   result_buffer.buf.failing_request_ix);
-   return ret;
+   result_buffer->buf.detailed_rc,
+   result_buffer->buf.failing_request_ix);
+   goto out_hcall;
}

-   *res = be64_to_cpu(result_buffer.result);
+   *res = be64_to_cpu(result_buffer->result);


not a biggie, but this last bit could be (remove the goto out_hcall and
the label and then)

} else {
*res = be64_to_cpu(result_buffer->result);
}



I've got a slight preference toward keeping it as is, which lets all of 
the non-error path code stay outside of if/else blocks (and the error 
handling is kept ever so slightly more consistent).



+out_hcall:
+   kfree(result_buffer);
+out_resb:
+   kfree(request_buffer);
+out_reqb:
return ret;
  }



otherwise looks good to me.




Re: [PATCH] powerpc/pseries: relocate config DTL so KConfig nests properly

2014-05-13 Thread Cody P Schafer

On 05/12/2014 11:23 PM, Michael Neuling wrote:

powerpc/pseries: relocate config DTL so KConfig nests properly


I don't know what that means.  Can you describe it in more detail?



So the "config DTL" refers to the configuration entry.

The "nests properly" refers to the indent that 'make menuconfig' shows 
when a config option depends on the config option preceding it.


In this case, moving config DTL up so it is below config PPC_SPLPAR 
means that menuconfig will show config DTL nicely indented right below 
config PPC_SPLPAR when PPC_SPLPAR is enabled.


To contrast that, right now if I enable PPC_SPLPAR in menuconfig, all I 
can immediately tell is that something showed up further down the list 
where I wasn't looking, and I end up having to toggle the option a few 
times to figure out what showed up, or look at the KConfig to find out 
that config DTL depends on config PPC_SPLPAR.


Essentially, this enables menuconfig to provide a visual hint about the 
dependencies between options.



Mikey


On Mon, 2014-05-12 at 20:09 -0700, Cody P Schafer wrote:

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
  arch/powerpc/platforms/pseries/Kconfig | 20 ++--
  1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index 2cb8b77..e00dd4d 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -33,6 +33,16 @@ config PPC_SPLPAR
  processors, that is, which share physical processors between
  two or more partitions.

+config DTL
+   bool "Dispatch Trace Log"
+   depends on PPC_SPLPAR && DEBUG_FS
+   help
+ SPLPAR machines can log hypervisor preempt & dispatch events to a
+ kernel buffer. Saying Y here will enable logging these events,
+ which are accessible through a debugfs file.
+
+ Say N if you are unsure.
+
  config PSERIES_MSI
     bool
     depends on PCI_MSI && PPC_PSERIES && EEH
@@ -122,13 +132,3 @@ config HV_PERF_CTRS
  systems. 24x7 is available on Power 8 systems.

If unsure, select Y.
-
-config DTL
-   bool "Dispatch Trace Log"
-   depends on PPC_SPLPAR && DEBUG_FS
-   help
- SPLPAR machines can log hypervisor preempt & dispatch events to a
- kernel buffer. Saying Y here will enable logging these events,
- which are accessible through a debugfs file.
-
- Say N if you are unsure.





[PATCH] powerpc/pseries: relocate config DTL so KConfig nests properly

2014-05-12 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/Kconfig | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index 2cb8b77..e00dd4d 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -33,6 +33,16 @@ config PPC_SPLPAR
  processors, that is, which share physical processors between
  two or more partitions.
 
+config DTL
+   bool "Dispatch Trace Log"
+   depends on PPC_SPLPAR && DEBUG_FS
+   help
+ SPLPAR machines can log hypervisor preempt & dispatch events to a
+ kernel buffer. Saying Y here will enable logging these events,
+ which are accessible through a debugfs file.
+
+ Say N if you are unsure.
+
 config PSERIES_MSI
bool
depends on PCI_MSI && PPC_PSERIES && EEH
@@ -122,13 +132,3 @@ config HV_PERF_CTRS
  systems. 24x7 is available on Power 8 systems.
 
   If unsure, select Y.
-
-config DTL
-   bool "Dispatch Trace Log"
-   depends on PPC_SPLPAR && DEBUG_FS
-   help
- SPLPAR machines can log hypervisor preempt & dispatch events to a
- kernel buffer. Saying Y here will enable logging these events,
- which are accessible through a debugfs file.
-
- Say N if you are unsure.
-- 
1.9.3


Re: [PATCH v2 6/6] powerpc/perf/hv-24x7: catalog version number is be64, not be32

2014-04-27 Thread Cody P Schafer

On 04/27/2014 09:47 PM, Benjamin Herrenschmidt wrote:

On Tue, 2014-04-15 at 10:10 -0700, Cody P Schafer wrote:

The catalog version number was changed from a be32 (with preceding
32 bits of padding) to a be64; update the code to treat it as a be64.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
--


Have you tested this ?

It doesn't build for me:

arch/powerpc/perf/hv-24x7.c: In function 'catalog_read':
arch/powerpc/perf/hv-24x7.c:223:3: error: format '%d' expects argument of type 
'int', but argument 2 has type 'uint64_t' [-Werror=format]
cc1: all warnings being treated as errors


I have, and I wasn't initially sure how I managed to miss that 
warning-as-error. On examination: My config (for some reason) has 
CONFIG_PPC_DISABLE_WERROR=y set (probably because it's a variation of a 
distro config). Must have been piping the warnings to a file and 
forgotten to check the file.



I'll fix that up in my tree.


Thanks.
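
For reference, the build error quoted above comes from passing a uint64_t to a '%d' conversion. A minimal user-space sketch of two common fixes follows; format_version_cast() and format_version_inttypes() are hypothetical names, and the actual fix applied in the tree is not shown in this thread. The cast to unsigned long long with a %ll conversion is the style this driver already uses in PAGE_0_ATTR(); PRIu64 from <inttypes.h> is the user-space equivalent.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Cast-based form: works wherever unsigned long long is at least 64 bits. */
static int format_version_cast(char *buf, size_t len, uint64_t version)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%llu", (unsigned long long)version);
}

/* <inttypes.h> form: the macro expands to the right length modifier. */
static int format_version_inttypes(char *buf, size_t len, uint64_t version)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu64, version);
}
```

Both functions return the number of characters written (excluding the terminating NUL), so a value such as 5000000000, which does not fit in 32 bits, is printed in full.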


[PATCH v2 0/6] powerpc/perf/hv_{gpci,24x7}: fixes

2014-04-15 Thread Cody P Schafer
 - 24x7 and gpci probing now uses pr_debug() and doesn't pad to 80 characters
 - Catalog access is fixed for LE kernels
 - remove c99 feature sparse doesn't like
 - 1 device attr made static


Cody P Schafer (6):
  powerpc/perf/hv_24x7: probe errors changed to pr_debug(), padding
fixed
  powerpc/perf/hv_gpci: probe failures use pr_debug(), and padding
reduced
  powerpc/perf/hv-gpci: make device attr static
  powerpc/perf/hv-24x7: use (unsigned long) not (u32) values when
calling plpar_hcall_norets()
  powerpc/perf/hv-24x7: remove [static 4096], sparse chokes on it
  powerpc/perf/hv-24x7: catalog version number is be64, not be32

 arch/powerpc/perf/hv-24x7.c | 30 +-
 arch/powerpc/perf/hv-gpci.c |  6 +++---
 2 files changed, 24 insertions(+), 12 deletions(-)

-- 
1.9.2


[PATCH v2 3/6] powerpc/perf/hv-gpci: make device attr static

2014-04-15 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-gpci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 8fee1dc..c9d399a 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -78,7 +78,7 @@ static ssize_t kernel_version_show(struct device *dev,
return sprintf(page, "0x%x\n", COUNTER_INFO_VERSION_CURRENT);
 }
 
-DEVICE_ATTR_RO(kernel_version);
+static DEVICE_ATTR_RO(kernel_version);
 HV_CAPS_ATTR(version, "0x%x\n");
 HV_CAPS_ATTR(ga, "%d\n");
 HV_CAPS_ATTR(expanded, "%d\n");
-- 
1.9.2


[PATCH v2 5/6] powerpc/perf/hv-24x7: remove [static 4096], sparse chokes on it

2014-04-15 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 3e8f60a..95a67f8 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -170,7 +170,7 @@ static unsigned long h_get_24x7_catalog_page_(unsigned long 
phys_4096,
index);
 }
 
-static unsigned long h_get_24x7_catalog_page(char page[static 4096],
+static unsigned long h_get_24x7_catalog_page(char page[],
 u32 version, u32 index)
 {
return h_get_24x7_catalog_page_(virt_to_phys(page),
-- 
1.9.2
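
As background for the change above: in C99, 'static' inside an array parameter declarator is a promise by the caller that at least that many elements are valid behind the pointer. The sketch below shows both spellings side by side in user space; CATALOG_PAGE_SIZE, sum_page_c99() and sum_page_plain() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define CATALOG_PAGE_SIZE 4096 /* hypothetical name */

/* C99 form: the caller promises at least 4096 valid chars behind `page`. */
static unsigned long sum_page_c99(const char page[static CATALOG_PAGE_SIZE])
{
	unsigned long sum = 0;

	for (size_t i = 0; i < CATALOG_PAGE_SIZE; i++)
		sum += (unsigned char)page[i];
	return sum;
}

/* Plain form, as after the patch: same behaviour, no size promise in the type. */
static unsigned long sum_page_plain(const char page[])
{
	unsigned long sum = 0;

	assert(page != NULL);
	for (size_t i = 0; i < CATALOG_PAGE_SIZE; i++)
		sum += (unsigned char)page[i];
	return sum;
}
```

Both functions compile to the same pointer parameter; the plain form only loses the documentation value (and possible compiler diagnostics) of the size hint, which is the cost of keeping sparse happy.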


[PATCH v2 4/6] powerpc/perf/hv-24x7: use (unsigned long) not (u32) values when calling plpar_hcall_norets()

2014-04-15 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index f5bca73..3e8f60a 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -155,16 +155,28 @@ static ssize_t read_offset_data(void *dest, size_t 
dest_len,
return copy_len;
 }
 
-static unsigned long h_get_24x7_catalog_page(char page[static 4096],
-u32 version, u32 index)
+static unsigned long h_get_24x7_catalog_page_(unsigned long phys_4096,
+ unsigned long version,
+ unsigned long index)
 {
-   WARN_ON(!IS_ALIGNED((unsigned long)page, 4096));
+   pr_devel("h_get_24x7_catalog_page(0x%lx, %lu, %lu)",
+   phys_4096,
+   version,
+   index);
+   WARN_ON(!IS_ALIGNED(phys_4096, 4096));
return plpar_hcall_norets(H_GET_24X7_CATALOG_PAGE,
-   virt_to_phys(page),
+   phys_4096,
version,
index);
 }
 
+static unsigned long h_get_24x7_catalog_page(char page[static 4096],
+u32 version, u32 index)
+{
+   return h_get_24x7_catalog_page_(virt_to_phys(page),
+   version, index);
+}
+
 static ssize_t catalog_read(struct file *filp, struct kobject *kobj,
struct bin_attribute *bin_attr, char *buf,
loff_t offset, size_t count)
-- 
1.9.2


[PATCH v2 1/6] powerpc/perf/hv_24x7: probe errors changed to pr_debug(), padding fixed

2014-04-15 Thread Cody P Schafer
fixup for powerpc/perf: Add support for the hv 24x7 interface

Makes the not enabled message less awful (and hides it in most cases).

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 297c9105..f5bca73 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -485,13 +485,13 @@ static int hv_24x7_init(void)
struct hv_perf_caps caps;
 
if (!firmware_has_feature(FW_FEATURE_LPAR)) {
-   pr_info("not a virtualized system, not enabling\n");
+   pr_debug("not a virtualized system, not enabling\n");
return -ENODEV;
}
 
hret = hv_perf_caps_get(&caps);
if (hret) {
-   pr_info("could not obtain capabilities, error 0x%80lx, not enabling\n",
+   pr_debug("could not obtain capabilities, not enabling, rc=%ld\n",
hret);
return -ENODEV;
}
-- 
1.9.2


[PATCH v2 2/6] powerpc/perf/hv_gpci: probe failures use pr_debug(), and padding reduced

2014-04-15 Thread Cody P Schafer
fixup for powerpc/perf: Add support for the hv gpci (get performance
counter info) interface.

Makes the not enabled message less awful (and hidden unless
debugging).

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-gpci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 278ba7b..8fee1dc 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -273,13 +273,13 @@ static int hv_gpci_init(void)
struct hv_perf_caps caps;
 
if (!firmware_has_feature(FW_FEATURE_LPAR)) {
-   pr_info("not a virtualized system, not enabling\n");
+   pr_debug("not a virtualized system, not enabling\n");
return -ENODEV;
}
 
hret = hv_perf_caps_get(&caps);
if (hret) {
-   pr_info("could not obtain capabilities, error 0x%80lx, not enabling\n",
+   pr_debug("could not obtain capabilities, not enabling, rc=%ld\n",
hret);
return -ENODEV;
}
-- 
1.9.2


[PATCH v2 6/6] powerpc/perf/hv-24x7: catalog version number is be64, not be32

2014-04-15 Thread Cody P Schafer
The catalog version number was changed from a be32 (with preceding
32 bits of padding) to a be64; update the code to treat it as a be64.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 95a67f8..9d4badc 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -171,7 +171,7 @@ static unsigned long h_get_24x7_catalog_page_(unsigned long 
phys_4096,
 }
 
 static unsigned long h_get_24x7_catalog_page(char page[],
-u32 version, u32 index)
+u64 version, u32 index)
 {
return h_get_24x7_catalog_page_(virt_to_phys(page),
version, index);
@@ -185,7 +185,7 @@ static ssize_t catalog_read(struct file *filp, struct 
kobject *kobj,
ssize_t ret = 0;
size_t catalog_len = 0, catalog_page_len = 0, page_count = 0;
loff_t page_offset = 0;
-   uint32_t catalog_version_num = 0;
+   uint64_t catalog_version_num = 0;
void *page = kmem_cache_alloc(hv_page_cache, GFP_USER);
struct hv_24x7_catalog_page_0 *page_0 = page;
if (!page)
@@ -197,7 +197,7 @@ static ssize_t catalog_read(struct file *filp, struct 
kobject *kobj,
goto e_free;
}
 
-   catalog_version_num = be32_to_cpu(page_0->version);
+   catalog_version_num = be64_to_cpu(page_0->version);
catalog_page_len = be32_to_cpu(page_0->length);
catalog_len = catalog_page_len * 4096;
 
@@ -255,7 +255,7 @@ e_free: 
\
 static DEVICE_ATTR_RO(_name)
 
 PAGE_0_ATTR(catalog_version, "%lld\n",
-   (unsigned long long)be32_to_cpu(page_0->version));
+   (unsigned long long)be64_to_cpu(page_0->version));
 PAGE_0_ATTR(catalog_len, "%lld\n",
(unsigned long long)be32_to_cpu(page_0->length) * 4096);
 static BIN_ATTR_RO(catalog, 0/* real length varies */);
-- 
1.9.2


[PATCH 1/2] fixup: powerpc/perf: Add support for the hv 24x7 interface

2014-04-02 Thread Cody P Schafer
Make the not enabled message less awful.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 297c9105..3246ea2 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -491,7 +491,7 @@ static int hv_24x7_init(void)
 
hret = hv_perf_caps_get(&caps);
if (hret) {
-   pr_info("could not obtain capabilities, error 0x%80lx, not enabling\n",
+   pr_info("could not obtain capabilities, not enabling (%ld)\n",
hret);
return -ENODEV;
}
-- 
1.9.1


[PATCH 2/2] fixup: powerpc/perf: Add support for the hv gpci (get performance counter info) interface

2014-04-02 Thread Cody P Schafer
Make the not enabled message less awful.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-gpci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 278ba7b..f6c471d 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -279,7 +279,7 @@ static int hv_gpci_init(void)
 
hret = hv_perf_caps_get(&caps);
if (hret) {
-   pr_info("could not obtain capabilities, error 0x%80lx, not enabling\n",
+   pr_info("could not obtain capabilities, not enabling (%ld)\n",
hret);
return -ENODEV;
}
-- 
1.9.1


[PATCH 0/2] powerpc/perf: fixup 2 patches from the 24x7 series

2014-04-02 Thread Cody P Schafer
mpe: these are fixups for 2 patches already in your merge tree (and in benh's 
next branch).

f3e622941a7cec587c00c0d17ea31514457c63c8 powerpc/perf: Add support for the hv 
24x7 interface
edd354ea4a6774bf9f380b0acf30e699070f4e8a powerpc/perf: Add support for the hv 
gpci (get performance counter info) interface

The only change is to a pr_info() printed when the interface is not detected.

Anton: I'm hesitant to switch these to pr_debug() as they are the only way
for users who expect these PMUs to exist to tell why the kernel decided they
aren't available. As a result, I've kept them as pr_info() instead of
converting to pr_debug().


Cody P Schafer (2):
  fixup: powerpc/perf: Add support for the hv 24x7 interface
  fixup: powerpc/perf: Add support for the hv gpci (get performance
counter info) interface

 arch/powerpc/perf/hv-24x7.c | 2 +-
 arch/powerpc/perf/hv-gpci.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
1.9.1


Re: [PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-03-25 Thread Cody P Schafer

On 03/25/2014 03:43 AM, Anton Blanchard wrote:


Hi Cody,

hv-24x7: could not obtain capabilities, error 0x
fffe, not enabling
hv-gpci: could not obtain capabilities, error 0x
fffe, not enabling


+   pr_info("could not obtain capabilities, error 0x%80lx, not enabling\n",


That's a lot of padding :)

I think this should also be a pr_debug, considering this is not relevant
to most ppc64 boxes.


I'm fine with that. It should probably be 0x%08lx not 0x%80lx, not 
sure when I screwed that up.
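
The effect of the transposed digits can be checked in user space. In the sketch below, format_rc_padded() and format_rc_zero_filled() are hypothetical names, and unsigned long long with %llx is used instead of unsigned long with %lx so the result does not depend on the width of long.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* "%80llx": minimum field width 80, padded on the left with spaces. */
static int format_rc_padded(char *buf, size_t len, unsigned long long rc)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "0x%80llx", rc);
}

/* "%08llx": minimum field width 8, padded on the left with zeros. */
static int format_rc_zero_filled(char *buf, size_t len, unsigned long long rc)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "0x%08llx", rc);
}
```

With rc = 0xfffffffffffffffe the first form produces 82 characters (the "0x" prefix plus an 80-column field, mostly spaces), while the second produces the 18-character string "0xfffffffffffffffe". The width in "%08" is only a minimum, so a 16-digit value is printed in full.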



Re: [PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-03-25 Thread Cody P Schafer

On 03/25/2014 03:43 AM, Anton Blanchard wrote:


Hi Cody,

hv-24x7: could not obtain capabilities, error 0x
fffe, not enabling
hv-gpci: could not obtain capabilities, error 0x
fffe, not enabling


+   pr_info("could not obtain capabilities, error 0x%80lx, not enabling\n",


That's a lot of padding :)

I think this should also be a pr_debug, considering this is not relevant
to most ppc64 boxes.


Yep, s/info/debug/ makes sense. The format should have been %08lx not 
%80lx, not sure when I screwed that up.



[PATCH v4 02/11] perf: add PMU_FORMAT_RANGE() helper for use by sw-like pmus

2014-03-05 Thread Cody P Schafer
Add PMU_FORMAT_RANGE() and PMU_FORMAT_RANGE_RESERVED() (for reserved
areas) which generate functions to extract the relevant bits from
event->attr.config{,1,2} for use by sw-like pmus where the
'config{,1,2}' values don't map directly to hardware registers.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 include/linux/perf_event.h | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e56b07f..5c12009 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -871,4 +871,27 @@ _name##_show(struct device *dev,   
\
\
 static struct device_attribute format_attr_##_name = __ATTR_RO(_name)
 
+#define format_max(name) FORMAT_MAX_(name)()
+#define FORMAT_MAX_(name) format_##name##_max
+
+#define format_get(name, event) FORMAT_GET_(name)(event)
+#define FORMAT_GET_(name) format_get_##name
+
+#define PMU_FORMAT_RANGE(name, attr_var, bit_start, bit_end)   \
+PMU_FORMAT_RANGE_RESERVED(name, attr_var, bit_start, bit_end)  \
+PMU_FORMAT_ATTR(name, #attr_var ":" #bit_start "-" #bit_end)
+
+#define PMU_FORMAT_RANGE_RESERVED(name, attr_var, bit_start, bit_end)  \
+static u64 FORMAT_MAX_(name)(void) \
+{  \
+   BUILD_BUG_ON((bit_start > bit_end)  \
+   || (bit_end >= (sizeof(1ull) * 8)));\
+   return (((1ull << (bit_end - bit_start)) - 1) << 1) + 1;\
+}  \
+static u64 FORMAT_GET_(name)(struct perf_event *event) \
+{  \
+   return (event->attr.attr_var >> (bit_start)) &  \
+   format_max(name);   \
+}
+
 #endif /* _LINUX_PERF_EVENT_H */
-- 
1.9.0
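
The arithmetic behind the generated helpers can be tried outside the kernel. The sketch below is a user-space rendering under stated assumptions: range_max() and range_get() are hypothetical function names that take the bit positions as run-time arguments instead of macro parameters, and a plain uint64_t stands in for event->attr.config{,1,2}.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Largest value that fits in bits [bit_start, bit_end], inclusive.
 * Shifting by (width - 1) and then doubling avoids an undefined
 * shift by 64 when the field covers the whole word.
 */
static uint64_t range_max(unsigned int bit_start, unsigned int bit_end)
{
	assert(bit_start <= bit_end && bit_end < 64);
	return (((1ull << (bit_end - bit_start)) - 1) << 1) + 1;
}

/* Extracts bits [bit_start, bit_end] of config as an unsigned value. */
static uint64_t range_get(uint64_t config, unsigned int bit_start,
			  unsigned int bit_end)
{
	return (config >> bit_start) & range_max(bit_start, bit_end);
}
```

For example, range_get(0x12345678, 16, 31) yields 0x1234, and range_max(0, 63) yields UINT64_MAX without ever evaluating 1ull << 64.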


[PATCH v4 00/11] powerpc: Add support for Power Hypervisor supplied performance counters

2014-03-05 Thread Cody P Schafer
These patches add basic pmus for 2 powerpc hypervisor interfaces to obtain
performance counters: gpci (get performance counter info) and 24x7.

The counters supplied by these interfaces are continually counting and never
need to be (and cannot be) disabled or enabled. They additionally do not
generate any interrupts. This makes them in some regards similar to software
counters, and as a result their implementation shares some common code (which
an initial patch exposes) with the sw counters.

These 2 PMUs end up providing access to some cpu, core, and chip level counters
not exposed via other interfaces, and additionally allow monitoring the
performance of other lpars (guests) on the same host system. Because it
provides access to core and chip level counters, this pair of PMUs could be
thought of as powerpc's counterpart to x86's uncore events.

GPCI is an interface that already exists on some power6 and power7 machines
(depending on the fw version), but is rather inflexible and code intensive to
add additional counters to.  The 24x7 interfaces currently are designed to
co-exist with the gpci interface while replacing most of gpci's functionality
on newer systems. Right now, the 24x7 code I've submitted uses the gpci calls
to check if it has permission to access certain classes of counters.

--

Since v3:
 - PMU_FORMAT_RANGE*()
- add BUILD_BUG_ON() invalid bit indexes
- rename event_get_##name(ev) to format_get(name, ev) [Michael Ellerman]
- similarly, rename event_get_##name##_max() to format_max(name)
  [Michael Ellerman]
- fix format_max() [Michael Ellerman]

Since v2:
 - sysfs: create bin_attributes under the requested group is now in
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 
driver-core-next
with commit-id: aabaf4c2050d21d39fe11eec889c508e84d6a328

 - Split hv-24x7.h catalog definition into hv-24x7-catalog.h
 - Remove unused 24x7 and gpci interface structures and enums (Michael Ellerman)
 - Update docs to point to an external source for the full catalog docs
 - Extend some of the patch changelogs (Peter Z)
 - Remove hrtimer usage and just extern the event_idx helper (now renamed) 
(Peter Z)
 - s/PMU_RANGE_ATTR/PMU_FORMAT_RANGE/ (and similar RESERVED rename) (Michael
   Ellerman)
 - hv_24x7: small clarifications in read_offset_data()'s comment
 - hv_gpci: remove h_gpci_event_read() and h_gpci_event_del(), call _stop and
   _update() directly (Michael Ellerman)
 - Kconfig relocation, dependency changes, and rewording (Scott Wood and
   Michael Ellerman)

Since v1:
 - add a few attributes to hv_gpci and hv_24x7 that expose some info about the 
interfaces
 - so the attributes show up in the right place, fix bin_attr creation in sysfs 
groups.
 - move hv_gpci.h and hv_24x7.h interface headers into arch/powerpc/perf
 - fix bit ordering in hv_gpci.h
 - split out hv_perf_caps_get() and use it to probe for the interface before 
registering
 - ensure proper alignment of hypervisor args
 - add a few missing counter requests to hv_gpci.h
 - s/CIR_xxx/CIR_XXX/ in hv_gpci.h
 - s/modules_init/device_initcall/
 - Don't set event-cpu, use the user provided one
 - remove the union of gpci events, just give the user 1024 bytes to play with
 - clarify some comments (the list of fw versions is now labeled)
 - provide an event_24x7_request() that wraps single_24x7_request()
 - probably some other small fixes I'm forgetting.


Cody P Schafer (11):
  sysfs: create bin_attributes under the requested group
  perf: add PMU_FORMAT_RANGE() helper for use by sw-like pmus
  perf: provide a common perf_event_nop_0() for use with .event_idx
  powerpc: add hvcalls for 24x7 and gpci (get performance counter info)
  powerpc/perf: add hv_gpci interface header
  powerpc/perf: add 24x7 interface headers
  powerpc/perf: add a shared interface to get gpci version and
capabilities
  powerpc/perf: add support for the hv gpci (get performance counter
info) interface
  powerpc/perf: add support for the hv 24x7 interface
  powerpc/perf: add kconfig option for hypervisor provided counters
  powerpc/perf/hv_{gpci,24x7}: add documentation of device attributes

 .../testing/sysfs-bus-event_source-devices-hv_24x7 |  23 +
 .../testing/sysfs-bus-event_source-devices-hv_gpci |  43 ++
 arch/powerpc/include/asm/hvcall.h  |   5 +
 arch/powerpc/perf/Makefile |   2 +
 arch/powerpc/perf/hv-24x7-catalog.h|  33 ++
 arch/powerpc/perf/hv-24x7.c| 493 +
 arch/powerpc/perf/hv-24x7.h| 109 +
 arch/powerpc/perf/hv-common.c  |  39 ++
 arch/powerpc/perf/hv-common.h  |  17 +
 arch/powerpc/perf/hv-gpci.c| 277 
 arch/powerpc/perf/hv-gpci.h|  73 +++
 arch/powerpc/platforms/pseries/Kconfig |  12 +
 fs/sysfs/group.c

[PATCH v4 03/11] perf: provide a common perf_event_nop_0() for use with .event_idx

2014-03-05 Thread Cody P Schafer
Rather than having every pmu that needs a function that just returns 0 for
.event_idx define their own copy, reuse the one in kernel/events/core.c.

Rename from perf_swevent_event_idx() because we're no longer using it
for just software events. Naming is based on the perf_pmu_nop_*()
functions.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 include/linux/perf_event.h |  1 +
 kernel/events/core.c   | 10 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 5c12009..23da668 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -560,6 +560,7 @@ extern void perf_pmu_migrate_context(struct pmu *pmu,
 extern u64 perf_event_read_value(struct perf_event *event,
 u64 *enabled, u64 *running);
 
+extern int perf_event_nop_0(struct perf_event *event);
 
 struct perf_sample_data {
u64 type;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index fa0b2d4..16bf7c2 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5816,7 +5816,7 @@ static int perf_swevent_init(struct perf_event *event)
return 0;
 }
 
-static int perf_swevent_event_idx(struct perf_event *event)
+int perf_event_nop_0(struct perf_event *event)
 {
return 0;
 }
@@ -5831,7 +5831,7 @@ static struct pmu perf_swevent = {
.stop   = perf_swevent_stop,
.read   = perf_swevent_read,
 
-   .event_idx  = perf_swevent_event_idx,
+   .event_idx  = perf_event_nop_0,
 };
 
 #ifdef CONFIG_EVENT_TRACING
@@ -5950,7 +5950,7 @@ static struct pmu perf_tracepoint = {
.stop   = perf_swevent_stop,
.read   = perf_swevent_read,
 
-   .event_idx  = perf_swevent_event_idx,
+   .event_idx  = perf_event_nop_0,
 };
 
 static inline void perf_tp_register(void)
@@ -6177,7 +6177,7 @@ static struct pmu perf_cpu_clock = {
.stop   = cpu_clock_event_stop,
.read   = cpu_clock_event_read,
 
-   .event_idx  = perf_swevent_event_idx,
+   .event_idx  = perf_event_nop_0,
 };
 
 /*
@@ -6257,7 +6257,7 @@ static struct pmu perf_task_clock = {
.stop   = task_clock_event_stop,
.read   = task_clock_event_read,
 
-   .event_idx  = perf_swevent_event_idx,
+   .event_idx  = perf_event_nop_0,
 };
 
 static void perf_pmu_nop_void(struct pmu *pmu)
-- 
1.9.0

[PATCH v4 04/11] powerpc: add hvcalls for 24x7 and gpci (get performance counter info)

2014-03-05 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/hvcall.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index d8b600b..5dbbb29 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -274,6 +274,11 @@
 /* Platform specific hcalls, used by KVM */
 #define H_RTAS 0xf000
 
+/* Platform specific hcalls, provided by PHYP */
+#define H_GET_24X7_CATALOG_PAGE 0xF078
+#define H_GET_24X7_DATA         0xF07C
+#define H_GET_PERF_COUNTER_INFO 0xF080
+
 #ifndef __ASSEMBLY__
 
 /**
-- 
1.9.0

[PATCH v4 01/11] sysfs: create bin_attributes under the requested group

2014-03-05 Thread Cody P Schafer
bin_attributes created/updated in create_files() (such as those listed
via (struct device).attribute_groups) were not placed under the
specified group, and instead appeared in the base kobj directory.

Fix this by making bin_attributes use creation code similar to normal
attributes.

A quick grep shows that no one is using bin_attrs in a named attribute
group yet, so we can do this without breaking anything in userspace.

Note that I do not add is_visible() support to
bin_attributes, though that could be done as well.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman gre...@linuxfoundation.org
---

 Currently in:
 git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 
driver-core-next
 with commit-id: aabaf4c2050d21d39fe11eec889c508e84d6a328

---

 fs/sysfs/group.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index 6b57938..aa04068 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -70,8 +70,11 @@ static int create_files(struct kernfs_node *parent, struct kobject *kobj,
    if (grp->bin_attrs) {
        for (bin_attr = grp->bin_attrs; *bin_attr; bin_attr++) {
            if (update)
-               sysfs_remove_bin_file(kobj, *bin_attr);
-           error = sysfs_create_bin_file(kobj, *bin_attr);
+               kernfs_remove_by_name(parent,
+                       (*bin_attr)->attr.name);
+           error = sysfs_add_file_mode_ns(parent,
+                   &(*bin_attr)->attr, true,
+                   (*bin_attr)->attr.mode, NULL);
            if (error)
                break;
        }
-- 
1.9.0

[PATCH v4 05/11] powerpc/perf: add hv_gpci interface header

2014-03-05 Thread Cody P Schafer
H_GetPerformanceCounterInfo (referred to as hv_gpci or just gpci from
here on) is an interface to retrieve specific performance counters and
other data from the hypervisor. All outputs have a fixed format. This
header only describes the portions of the interface that we plan on
using in Linux at this time.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-gpci.h | 73 +
 1 file changed, 73 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-gpci.h

diff --git a/arch/powerpc/perf/hv-gpci.h b/arch/powerpc/perf/hv-gpci.h
new file mode 100644
index 000..b25f460
--- /dev/null
+++ b/arch/powerpc/perf/hv-gpci.h
@@ -0,0 +1,73 @@
+#ifndef LINUX_POWERPC_PERF_HV_GPCI_H_
+#define LINUX_POWERPC_PERF_HV_GPCI_H_
+
+#include <linux/types.h>
+
+/* From the document H_GetPerformanceCounterInfo Interface v1.07 */
+
+/* H_GET_PERF_COUNTER_INFO argument */
+struct hv_get_perf_counter_info_params {
+   __be32 counter_request; /* I */
+   __be32 starting_index;  /* IO */
+   __be16 secondary_index; /* IO */
+   __be16 returned_values; /* O */
+   __be32 detail_rc; /* O, only needed when called via *_norets() */
+
+   /*
+    * O, size each of counter_value element in bytes, only set for version
+    * >= 0x3
+    */
+   __be16 cv_element_size;
+
+   /* I, 0 (zero) for versions < 0x3 */
+   __u8 counter_info_version_in;
+
+   /* O, 0 (zero) if version < 0x3. Must be set to 0 when making hcall */
+   __u8 counter_info_version_out;
+   __u8 reserved[0xC];
+   __u8 counter_value[];
+} __packed;
+
+/*
+ * counter info version = fw version/reference (spec version)
+ *
+ * 8 = power8 (1.07)
+ * [7 is skipped by spec 1.07]
+ * 6 = TLBIE (1.07)
+ * 5 = v7r7m0.phyp (1.05)
+ * [4 skipped]
+ * 3 = v7r6m0.phyp (?)
+ * [1,2 skipped]
+ * 0 = v7r{2,3,4}m0.phyp (?)
+ */
+#define COUNTER_INFO_VERSION_CURRENT 0x8
+
+/*
+ * These determine the counter_value[] layout and the meaning of starting_index
+ * and secondary_index.
+ *
+ * Unless otherwise noted, @secondary_index is unused and ignored.
+ */
+enum counter_info_requests {
+
+   /* GENERAL */
+
+   /* @starting_index: must be -1 (to refer to the current partition)
+*/
+   CIR_SYSTEM_PERFORMANCE_CAPABILITIES = 0X40,
+};
+
+struct cv_system_performance_capabilities {
+   /* If != 0, allowed to collect data from other partitions */
+   __u8 perf_collect_privileged;
+
+   /* These following are only valid if counter_info_version >= 0x3 */
+#define CV_CM_GA       (1 << 7)
+#define CV_CM_EXPANDED (1 << 6)
+#define CV_CM_LAB      (1 << 5)
+   /* remaining bits are reserved */
+   __u8 capability_mask;
+   __u8 reserved[0xE];
+} __packed;
+
+#endif
-- 
1.9.0

[PATCH v4 06/11] powerpc/perf: add 24x7 interface headers

2014-03-05 Thread Cody P Schafer
24x7 (also called hv_24x7 or H_24X7) is an interface to obtain
performance counters from the hypervisor. These counters do not have a
fixed format/position and are instead documented in a 24x7 Catalog,
which is provided by the hypervisor (that interface is also documented
partially in the included hv-24x7-catalog.h and fully at
https://raw.githubusercontent.com/jmesmon/catalog-24x7/master/hv-24x7-catalog.h
).

The 24x7 data access is simply a copy operation into a 4-dimensional
array of 64-bit counters (from hypervisor to kernel memory). There is no
interrupt triggered on overflow; these counters are completely disjoint
from the typical power pmu.

This method of obtaining performance counters from the hypervisor is
intended to partially replace the gpci interface.
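
A rough illustration of the request-buffer sizing used by this interface
(not part of the patch): the sketch below mirrors the field widths of
struct hv_24x7_request and the fixed part of struct hv_24x7_request_buffer
from hv-24x7.h, and works out how many requests fit in one 4K buffer. The
*_sketch names and max_requests_per_page() are made up for this example,
plain integer types stand in for the big-endian kernel types, and the 4K
limit is taken from the TODO comment in hv-24x7.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of struct hv_24x7_request (sizes only, endianness ignored). */
struct req_sketch {
	uint8_t  performance_domain;
	uint8_t  reserved[0x1];
	uint16_t data_size;
	uint32_t data_offset;
	uint16_t starting_lpar_ix;
	uint16_t max_num_lpars;
	uint16_t starting_ix;
	uint16_t max_ix;
} __attribute__((packed));

/* Fixed header of struct hv_24x7_request_buffer, without the requests[] tail. */
struct req_buffer_header_sketch {
	uint8_t interface_version;
	uint8_t num_requests;
	uint8_t reserved[0xE];
} __attribute__((packed));

static_assert(sizeof(struct req_sketch) == 16, "request is 16 bytes");
static_assert(sizeof(struct req_buffer_header_sketch) == 16, "header is 16 bytes");

/* Hypothetical helper: number of requests that fit after the header in 4096 bytes. */
static size_t max_requests_per_page(void)
{
	return (4096 - sizeof(struct req_buffer_header_sketch)) /
	       sizeof(struct req_sketch);
}
```

With these sizes the helper yields 255, which is also the largest value
the one-byte num_requests field can hold.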

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7-catalog.h |  33 +++
 arch/powerpc/perf/hv-24x7.h | 109 
 2 files changed, 142 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-24x7-catalog.h
 create mode 100644 arch/powerpc/perf/hv-24x7.h

diff --git a/arch/powerpc/perf/hv-24x7-catalog.h 
b/arch/powerpc/perf/hv-24x7-catalog.h
new file mode 100644
index 000..21b19dd
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7-catalog.h
@@ -0,0 +1,33 @@
+#ifndef LINUX_POWERPC_PERF_HV_24X7_CATALOG_H_
+#define LINUX_POWERPC_PERF_HV_24X7_CATALOG_H_
+
+#include <linux/types.h>
+
+/* From document 24x7 Event and Group Catalog Formats Proposal v0.15 */
+
+struct hv_24x7_catalog_page_0 {
+#define HV_24X7_CATALOG_MAGIC 0x32347837 /* "24x7" in ASCII */
+   __be32 magic;
+   __be32 length; /* In 4096 byte pages */
+   __be64 version; /* XXX: arbitrary? what's the meaning/usage/purpose? */
+   __u8 build_time_stamp[16]; /* MMDDHHMMSS\0\0 */
+   __u8 reserved2[32];
+   __be16 schema_data_offs; /* in 4096 byte pages */
+   __be16 schema_data_len;  /* in 4096 byte pages */
+   __be16 schema_entry_count;
+   __u8 reserved3[2];
+   __be16 event_data_offs;
+   __be16 event_data_len;
+   __be16 event_entry_count;
+   __u8 reserved4[2];
+   __be16 group_data_offs; /* in 4096 byte pages */
+   __be16 group_data_len;  /* in 4096 byte pages */
+   __be16 group_entry_count;
+   __u8 reserved5[2];
+   __be16 formula_data_offs; /* in 4096 byte pages */
+   __be16 formula_data_len;  /* in 4096 byte pages */
+   __be16 formula_entry_count;
+   __u8 reserved6[2];
+} __packed;
+
+#endif
diff --git a/arch/powerpc/perf/hv-24x7.h b/arch/powerpc/perf/hv-24x7.h
new file mode 100644
index 000..720ebce
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7.h
@@ -0,0 +1,109 @@
+#ifndef LINUX_POWERPC_PERF_HV_24X7_H_
+#define LINUX_POWERPC_PERF_HV_24X7_H_
+
+#include <linux/types.h>
+
+struct hv_24x7_request {
+   /* PHYSICAL domains require enabling via phyp/hmc. */
+#define HV_24X7_PERF_DOMAIN_PHYSICAL_CHIP 0x01
+#define HV_24X7_PERF_DOMAIN_PHYSICAL_CORE 0x02
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_HOME_CORE   0x03
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_HOME_CHIP   0x04
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_HOME_NODE   0x05
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_REMOTE_NODE 0x06
+   __u8 performance_domain;
+   __u8 reserved[0x1];
+
+   /* bytes to read starting at @data_offset. must be a multiple of 8 */
+   __be16 data_size;
+
+   /*
+* byte offset within the perf domain to read from. must be 8 byte
+* aligned
+*/
+   __be32 data_offset;
+
+   /*
+* only valid for VIRTUAL_PROCESSOR domains, ignored for others.
+* -1 means current partition only
+*  Enabling via phyp/hmc required for non--1 values. 0 forbidden
+*  unless requestor is 0.
+*/
+   __be16 starting_lpar_ix;
+
+   /*
+* Ignored when @starting_lpar_ix == -1
+* Ignored when @performance_domain is not VIRTUAL_PROCESSOR_*
+* -1 means infinite or all
+*/
+   __be16 max_num_lpars;
+
+   /* chip, core, or virtual processor based on @performance_domain */
+   __be16 starting_ix;
+   __be16 max_ix;
+} __packed;
+
+struct hv_24x7_request_buffer {
+   /* 0 - ? */
+   /* 1 - ? */
+#define HV_24X7_IF_VERSION_CURRENT 0x01
+   __u8 interface_version;
+   __u8 num_requests;
+   __u8 reserved[0xE];
+   struct hv_24x7_request requests[];
+} __packed;
+
+struct hv_24x7_result_element {
+   __be16 lpar_ix;
+
+   /*
+* represents the core, chip, or virtual processor based on the
+* request's @performance_domain
+*/
+   __be16 domain_ix;
+
+   /* -1 if @performance_domain does not refer to a virtual processor */
+   __be32 lpar_cfg_instance_id;
+
+   /* size = @result_element_data_size of containing result. */
+   __u8 element_data[];
+} __packed;
+
+struct hv_24x7_result {
+   __u8 result_ix;
+
+   /*
+* 0 = not all

[PATCH v4 07/11] powerpc/perf: add a shared interface to get gpci version and capabilities

2014-03-05 Thread Cody P Schafer
This exposes a simple way to grab the firmware provided
collect_privileged, ga, expanded, and lab capability bits. All of these
bits come in from the same gpci request, so we've exposed all of them.

Only the collect_privileged bit is really used by the hv-gpci/hv-24x7
code; the other bits are simply exposed in sysfs to inform the user.
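
As a small illustration of how those bits fall out of the hypervisor's
reply (a userspace sketch, not the patch itself): the mask values are the
CV_CM_* definitions from hv-gpci.h, while struct caps_sketch and
decode_caps() are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as defined in hv-gpci.h in this series. */
#define CV_CM_GA       (1 << 7)
#define CV_CM_EXPANDED (1 << 6)
#define CV_CM_LAB      (1 << 5)

/* Hypothetical userspace stand-in for the flag part of struct hv_perf_caps. */
struct caps_sketch {
	unsigned collect_privileged:1, ga:1, expanded:1, lab:1;
};

/* Turn the two capability bytes of the reply into individual 0/1 flags. */
static void decode_caps(uint8_t perf_collect_privileged,
			uint8_t capability_mask,
			struct caps_sketch *out)
{
	assert(out);
	out->collect_privileged = !!perf_collect_privileged;
	out->ga       = !!(capability_mask & CV_CM_GA);
	out->expanded = !!(capability_mask & CV_CM_EXPANDED);
	out->lab      = !!(capability_mask & CV_CM_LAB);
}
```

The !! idiom collapses any nonzero byte to 1, so each flag reads as 0 or
1 regardless of which bit carried it.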

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-common.c | 39 +++
 arch/powerpc/perf/hv-common.h | 17 +
 2 files changed, 56 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-common.c
 create mode 100644 arch/powerpc/perf/hv-common.h

diff --git a/arch/powerpc/perf/hv-common.c b/arch/powerpc/perf/hv-common.c
new file mode 100644
index 000..47e02b3
--- /dev/null
+++ b/arch/powerpc/perf/hv-common.c
@@ -0,0 +1,39 @@
+#include <asm/io.h>
+#include <asm/hvcall.h>
+
+#include "hv-gpci.h"
+#include "hv-common.h"
+
+unsigned long hv_perf_caps_get(struct hv_perf_caps *caps)
+{
+   unsigned long r;
+   struct p {
+   struct hv_get_perf_counter_info_params params;
+   struct cv_system_performance_capabilities caps;
+   } __packed __aligned(sizeof(uint64_t));
+
+   struct p arg = {
+   .params = {
+   .counter_request = cpu_to_be32(
+   CIR_SYSTEM_PERFORMANCE_CAPABILITIES),
+   .starting_index = cpu_to_be32(-1),
+   .counter_info_version_in = 0,
+   }
+   };
+
+   r = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO,
+                          virt_to_phys(&arg), sizeof(arg));
+
+   if (r)
+       return r;
+
+   pr_devel("capability_mask: 0x%x\n", arg.caps.capability_mask);
+
+   caps->version = arg.params.counter_info_version_out;
+   caps->collect_privileged = !!arg.caps.perf_collect_privileged;
+   caps->ga = !!(arg.caps.capability_mask & CV_CM_GA);
+   caps->expanded = !!(arg.caps.capability_mask & CV_CM_EXPANDED);
+   caps->lab = !!(arg.caps.capability_mask & CV_CM_LAB);
+
+   return r;
+}
diff --git a/arch/powerpc/perf/hv-common.h b/arch/powerpc/perf/hv-common.h
new file mode 100644
index 000..7e615bd
--- /dev/null
+++ b/arch/powerpc/perf/hv-common.h
@@ -0,0 +1,17 @@
+#ifndef LINUX_POWERPC_PERF_HV_COMMON_H_
+#define LINUX_POWERPC_PERF_HV_COMMON_H_
+
+#include <linux/types.h>
+
+struct hv_perf_caps {
+   u16 version;
+   u16 collect_privileged:1,
+   ga:1,
+   expanded:1,
+   lab:1,
+   unused:12;
+};
+
+unsigned long hv_perf_caps_get(struct hv_perf_caps *caps);
+
+#endif
-- 
1.9.0

[PATCH v4 08/11] powerpc/perf: add support for the hv gpci (get performance counter info) interface

2014-03-05 Thread Cody P Schafer
This provides a basic link between perf and hv_gpci. Notably, it does
not yet support transactions and does not list any events (they can
still be manually composed).

Example usage via perf tool:

perf stat -e 
'hv_gpci/counter_info_version=3,offset=0,length=8,secondary_index=0,starting_index=0x,request=0x10/'
 -r 0 -C 0 -x ' ' sleep 0.1
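
To make the mapping from those named terms to the raw perf config words
concrete, here is a userspace sketch that packs the fields using the bit
ranges given by the PMU_FORMAT_RANGE() lines in hv-gpci.c. pack_field(),
struct gpci_config and gpci_pack() are hypothetical helpers written for
this illustration only; the perf tool does this packing itself from the
sysfs format files.

```c
#include <assert.h>
#include <stdint.h>

/* Place @value into bits [lo, hi] (inclusive) of a 64-bit config word. */
static uint64_t pack_field(uint64_t value, unsigned lo, unsigned hi)
{
	assert(lo <= hi && hi < 64);
	unsigned width = hi - lo + 1;
	uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
	return (value & mask) << lo;
}

/* Hypothetical container for the two config words hv_gpci uses. */
struct gpci_config {
	uint64_t config;
	uint64_t config1;
};

/* Bit ranges follow the PMU_FORMAT_RANGE() declarations in hv-gpci.c. */
static struct gpci_config gpci_pack(uint32_t request, uint32_t starting_index,
				    uint16_t secondary_index,
				    uint8_t counter_info_version,
				    uint8_t length, uint32_t offset)
{
	struct gpci_config c;

	c.config  = pack_field(request, 0, 31)
		  | pack_field(starting_index, 32, 63);
	c.config1 = pack_field(secondary_index, 0, 15)
		  | pack_field(counter_info_version, 16, 23)
		  | pack_field(length, 24, 31)
		  | pack_field(offset, 32, 63);
	return c;
}
```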

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-gpci.c | 277 
 1 file changed, 277 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-gpci.c

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
new file mode 100644
index 000..cc308fc
--- /dev/null
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -0,0 +1,277 @@
+/*
+ * Hypervisor supplied gpci (get performance counter info) performance
+ * counter support
+ *
+ * Author: Cody P Schafer c...@linux.vnet.ibm.com
+ * Copyright 2014 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#define pr_fmt(fmt) "hv-gpci: " fmt
+
+#include <linux/init.h>
+#include <linux/perf_event.h>
+#include <asm/firmware.h>
+#include <asm/hvcall.h>
+#include <asm/io.h>
+
+#include "hv-gpci.h"
+#include "hv-common.h"
+
+PMU_FORMAT_RANGE(request, config, 0, 31); /* u32 */
+PMU_FORMAT_RANGE(starting_index, config, 32, 63); /* u32 */
+PMU_FORMAT_RANGE(secondary_index, config1, 0, 15); /* u16 */
+PMU_FORMAT_RANGE(counter_info_version, config1, 16, 23); /* u8 */
+PMU_FORMAT_RANGE(length, config1, 24, 31); /* u8, bytes of data (1-8) */
+PMU_FORMAT_RANGE(offset, config1, 32, 63); /* u32, byte offset */
+
+static struct attribute *format_attrs[] = {
+   &format_attr_request.attr,
+   &format_attr_starting_index.attr,
+   &format_attr_secondary_index.attr,
+   &format_attr_counter_info_version.attr,
+
+   &format_attr_offset.attr,
+   &format_attr_length.attr,
+   NULL,
+};
+
+static struct attribute_group format_group = {
+   .name = "format",
+   .attrs = format_attrs,
+};
+
+#define HV_CAPS_ATTR(_name, _format)                       \
+static ssize_t _name##_show(struct device *dev,            \
+                            struct device_attribute *attr, \
+                            char *page)                    \
+{                                                          \
+   struct hv_perf_caps caps;                               \
+   unsigned long hret = hv_perf_caps_get(&caps);           \
+   if (hret)                                               \
+       return -EIO;                                        \
+                                                           \
+   return sprintf(page, _format, caps._name);              \
+}                                                          \
+static struct device_attribute hv_caps_attr_##_name = __ATTR_RO(_name)
+
+static ssize_t kernel_version_show(struct device *dev,
+                                   struct device_attribute *attr,
+                                   char *page)
+{
+   return sprintf(page, "0x%x\n", COUNTER_INFO_VERSION_CURRENT);
+}
+
+DEVICE_ATTR_RO(kernel_version);
+HV_CAPS_ATTR(version, "0x%x\n");
+HV_CAPS_ATTR(ga, "%d\n");
+HV_CAPS_ATTR(expanded, "%d\n");
+HV_CAPS_ATTR(lab, "%d\n");
+HV_CAPS_ATTR(collect_privileged, "%d\n");
+
+static struct attribute *interface_attrs[] = {
+   &dev_attr_kernel_version.attr,
+   &hv_caps_attr_version.attr,
+   &hv_caps_attr_ga.attr,
+   &hv_caps_attr_expanded.attr,
+   &hv_caps_attr_lab.attr,
+   &hv_caps_attr_collect_privileged.attr,
+   NULL,
+};
+
+static struct attribute_group interface_group = {
+   .name = "interface",
+   .attrs = interface_attrs,
+};
+
+static const struct attribute_group *attr_groups[] = {
+   &format_group,
+   &interface_group,
+   NULL,
+};
+
+#define GPCI_MAX_DATA_BYTES \
+   (1024 - sizeof(struct hv_get_perf_counter_info_params))
+
+static unsigned long single_gpci_request(u32 req, u32 starting_index,
+   u16 secondary_index, u8 version_in, u32 offset, u8 length,
+   u64 *value)
+{
+   unsigned long ret;
+   size_t i;
+   u64 count;
+
+   struct {
+   struct hv_get_perf_counter_info_params params;
+   uint8_t bytes[GPCI_MAX_DATA_BYTES];
+   } __packed __aligned(sizeof(uint64_t)) arg = {
+   .params = {
+   .counter_request = cpu_to_be32(req),
+   .starting_index = cpu_to_be32(starting_index),
+   .secondary_index = cpu_to_be16(secondary_index),
+   .counter_info_version_in = version_in,
+   }
+   };
+
+   ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO,
+   virt_to_phys(arg

[PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-03-05 Thread Cody P Schafer
This provides a basic interface between hv_24x7 and perf. Similar to
the one provided for gpci, it lacks transaction support and does not
list any events.

Example usage via perf tool:

perf stat -e 
'hv_24x7/domain=2,offset=8,starting_index=0,lpar=0x/' -r 0 -C 0 -x ' ' 
sleep 0.1
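
Going the other way, the sketch below shows how the fields of such an
event could be pulled back out of the config words, using the bit ranges
from the PMU_FORMAT_RANGE() and PMU_FORMAT_RANGE_RESERVED() lines in
hv-24x7.c. It is a userspace illustration under those assumptions;
get_field(), struct hv24x7_fields, hv24x7_unpack() and
hv24x7_reserved_set() are hypothetical names, not the accessors the
kernel macros generate.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [lo, hi] (inclusive) from a 64-bit config word. */
static uint64_t get_field(uint64_t word, unsigned lo, unsigned hi)
{
	assert(lo <= hi && hi < 64);
	unsigned width = hi - lo + 1;
	uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
	return (word >> lo) & mask;
}

/* Hypothetical decoded view of an hv_24x7 event. */
struct hv24x7_fields {
	uint8_t  domain;          /* config:0-3   */
	uint16_t starting_index;  /* config:16-31 */
	uint32_t offset;          /* config:32-63 */
	uint16_t lpar;            /* config1:0-15 */
};

static struct hv24x7_fields hv24x7_unpack(uint64_t config, uint64_t config1)
{
	struct hv24x7_fields f = {
		.domain         = (uint8_t)get_field(config, 0, 3),
		.starting_index = (uint16_t)get_field(config, 16, 31),
		.offset         = (uint32_t)get_field(config, 32, 63),
		.lpar           = (uint16_t)get_field(config1, 0, 15),
	};
	return f;
}

/* Nonzero if any bit in the ranges declared as reserved is set. */
static int hv24x7_reserved_set(uint64_t config, uint64_t config1,
			       uint64_t config2)
{
	return get_field(config, 4, 15) != 0 ||
	       get_field(config1, 16, 63) != 0 ||
	       config2 != 0;
}
```

Rejecting events with reserved bits set keeps those ranges free for
later use without changing the meaning of existing event strings.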

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.c | 493 
 1 file changed, 493 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-24x7.c

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
new file mode 100644
index 000..81d68b6
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -0,0 +1,493 @@
+/*
+ * Hypervisor supplied 24x7 performance counter support
+ *
+ * Author: Cody P Schafer c...@linux.vnet.ibm.com
+ * Copyright 2014 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "hv-24x7: " fmt
+
+#include <linux/perf_event.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <asm/firmware.h>
+#include <asm/hvcall.h>
+#include <asm/io.h>
+
+#include "hv-24x7.h"
+#include "hv-24x7-catalog.h"
+#include "hv-common.h"
+
+/*
+ * TODO: Merging events:
+ * - Think of the hcall as an interface to a 4d array of counters:
+ *   - x = domains
+ *   - y = indexes in the domain (core, chip, vcpu, node, etc)
+ *   - z = offset into the counter space
+ *   - w = lpars (guest vms, logical partitions)
+ * - A single request is: x,y,y_last,z,z_last,w,w_last
+ *   - this means we can retrieve a rectangle of counters in y,z for a single x.
+ *
+ * - Things to consider (ignoring w):
+ *   - input  cost_per_request = 16
+ *   - output cost_per_result(ys,zs)  = 8 + 8 * ys + ys * zs
+ *   - limited number of requests per hcall (must fit into 4K bytes)
+ * - 4k = 16 [buffer header] - 16 [request size] * request_count
+ * - 255 requests per hcall
+ *   - sometimes it will be more efficient to read extra data and discard
+ */
+
+PMU_FORMAT_RANGE(domain, config, 0, 3); /* u3 0-6, one of HV_24X7_PERF_DOMAIN */
+PMU_FORMAT_RANGE(starting_index, config, 16, 31); /* u16 */
+PMU_FORMAT_RANGE(offset, config, 32, 63); /* u32, see data_offset */
+PMU_FORMAT_RANGE(lpar, config1, 0, 15); /* u16 */
+
+PMU_FORMAT_RANGE_RESERVED(reserved1, config,   4, 15);
+PMU_FORMAT_RANGE_RESERVED(reserved2, config1, 16, 63);
+PMU_FORMAT_RANGE_RESERVED(reserved3, config2,  0, 63);
+
+static struct attribute *format_attrs[] = {
+   &format_attr_domain.attr,
+   &format_attr_offset.attr,
+   &format_attr_starting_index.attr,
+   &format_attr_lpar.attr,
+   NULL,
+};
+
+static struct attribute_group format_group = {
+   .name = "format",
+   .attrs = format_attrs,
+};
+
+/*
+ * read_offset_data - copy data from one buffer to another while treating the
+ *                    source buffer as a small view on the total available
+ *                    source data.
+ *
+ * @dest: buffer to copy into
+ * @dest_len: length of @dest in bytes
+ * @requested_offset: the offset within the source data we want. Must be >= 0
+ * @src: buffer to copy data from
+ * @src_len: length of @src in bytes
+ * @source_offset: the offset in the source data that (src,src_len) refers to.
+ *                 Must be >= 0
+ *
+ * returns the number of bytes copied.
+ *
+ * The following ascii art shows the various buffer positionings we need to
+ * handle, assigns some arbitrary variables to points on the buffer, and then
+ * shows how we fiddle with those values to get things we care about (copy
+ * start in src and copy len)
+ *
+ * s = @src buffer
+ * d = @dest buffer
+ * '.' areas in d are written to.
+ *
+ *                       u
+ *   x         w         v  z
+ * d           |.........|
+ * s |----------------------|
+ *
+ *                      u
+ *   x         w        z     v
+ * d           |........------|
+ * s |------------------|
+ *
+ *   x         w        u,z,v
+ * d           |........|
+ * s |------------------|
+ *
+ *   x,w                u,v,z
+ * d |..................|
+ * s |------------------|
+ *
+ *   x        u
+ *   w        v         z
+ * d |........|
+ * s |------------------|
+ *
+ *   x      z   w      v
+ * d            |------|
+ * s |------|
+ *
+ * x = source_offset
+ * w = requested_offset
+ * z = source_offset + src_len
+ * v = requested_offset + dest_len
+ *
+ * w_offset_in_s = w - x = requested_offset - source_offset
+ * z_offset_in_s = z - x = src_len
+ * v_offset_in_s = v - x = requested_offset + dest_len - source_offset
+ */
+static ssize_t read_offset_data(void *dest, size_t dest_len,
+   loff_t requested_offset, void *src,
+   size_t src_len, loff_t source_offset)
+{
+   size_t w_offset_in_s = requested_offset
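
The patch text is cut off at this point in the archive. As a rough
illustration of the overlap arithmetic described in the comment above, a
userspace sketch might look like the following. It is not the author's
implementation: the name read_offset_sketch(), the long long offsets, the
-1 error return, and the choice to copy nothing when the request starts
before the source window are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the part of the request [requested_offset, requested_offset + dest_len)
 * that lies inside the window [source_offset, source_offset + src_len).
 * Returns the number of bytes copied, or -1 for a negative offset.
 */
static long read_offset_sketch(void *dest, size_t dest_len,
			       long long requested_offset,
			       const void *src, size_t src_len,
			       long long source_offset)
{
	assert(dest && src);

	if (requested_offset < 0 || source_offset < 0)
		return -1;

	/* Assumption: a request starting before the window copies nothing. */
	if (requested_offset < source_offset)
		return 0;

	/* w - x: where the request starts, measured from the start of src. */
	size_t w_in_s = (size_t)(requested_offset - source_offset);

	/* Request starts at or past z, the end of the window: no overlap. */
	if (w_in_s >= src_len)
		return 0;

	/* u - w: stop at whichever comes first, end of dest or end of src. */
	size_t avail = src_len - w_in_s;
	size_t copy_len = dest_len < avail ? dest_len : avail;

	memcpy(dest, (const char *)src + w_in_s, copy_len);
	return (long)copy_len;
}
```

For example, with an 8-byte window that starts at offset 100, a 4-byte
request at offset 102 copies 4 bytes, while a 16-byte request at offset
105 is clipped to the 3 bytes that remain in the window.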

[PATCH v4 10/11] powerpc/perf: add kconfig option for hypervisor provided counters

2014-03-05 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/Makefile |  2 ++
 arch/powerpc/platforms/pseries/Kconfig | 12 
 2 files changed, 14 insertions(+)

diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile
index 60d71ee..f9c083a 100644
--- a/arch/powerpc/perf/Makefile
+++ b/arch/powerpc/perf/Makefile
@@ -11,5 +11,7 @@ obj32-$(CONFIG_PPC_PERF_CTRS) += mpc7450-pmu.o
 obj-$(CONFIG_FSL_EMB_PERF_EVENT) += core-fsl-emb.o
 obj-$(CONFIG_FSL_EMB_PERF_EVENT_E500) += e500-pmu.o e6500-pmu.o
 
+obj-$(CONFIG_HV_PERF_CTRS) += hv-24x7.o hv-gpci.o hv-common.o
+
 obj-$(CONFIG_PPC64) += $(obj64-y)
 obj-$(CONFIG_PPC32) += $(obj32-y)
diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index 80b1d57..2cb8b77 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -111,6 +111,18 @@ config CMM
  will be reused for other LPARs. The interface allows firmware to
  balance memory across many LPARs.
 
+config HV_PERF_CTRS
+   bool "Hypervisor supplied PMU events (24x7 & GPCI)"
+   default y
+   depends on PERF_EVENTS && PPC_PSERIES
+   help
+ Enable access to hypervisor supplied counters in perf. Currently,
+ this enables code that uses the hcall GetPerfCounterInfo and 24x7
+ interfaces to retrieve counters. GPCI exists on Power 6 and later
+ systems. 24x7 is available on Power 8 systems.
+
+  If unsure, select Y.
+
 config DTL
    bool "Dispatch Trace Log"
    depends on PPC_SPLPAR && DEBUG_FS
-- 
1.9.0

[PATCH v4 11/11] powerpc/perf/hv_{gpci, 24x7}: add documentation of device attributes

2014-03-05 Thread Cody P Schafer
gpci and 24x7 expose some device specific attributes. Add some
documentation for them.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 .../testing/sysfs-bus-event_source-devices-hv_24x7 | 23 
 .../testing/sysfs-bus-event_source-devices-hv_gpci | 43 ++
 2 files changed, 66 insertions(+)
 create mode 100644 
Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
 create mode 100644 
Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 
b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
new file mode 100644
index 000..e78ee79
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
@@ -0,0 +1,23 @@
+What:  /sys/bus/event_source/devices/hv_24x7/interface/catalog
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   Provides access to the binary 24x7 catalog provided by the
+   hypervisor on POWER7 and 8 systems. This catalog lists events
+   available from the powerpc hv_24x7 pmu. Its format is
+   documented here:
+   https://raw.githubusercontent.com/jmesmon/catalog-24x7/master/hv-24x7-catalog.h
+
+What:  /sys/bus/event_source/devices/hv_24x7/interface/catalog_length
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   A number equal to the length in bytes of the catalog. This is
+   also extractable from the provided binary catalog sysfs entry.
+
+What:  /sys/bus/event_source/devices/hv_24x7/interface/catalog_version
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   Exposes the version field of the 24x7 catalog. This is also
+   extractable from the provided binary catalog sysfs entry.
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci 
b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
new file mode 100644
index 000..3fa58c2
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
@@ -0,0 +1,43 @@
+What:  /sys/bus/event_source/devices/hv_gpci/interface/collect_privileged
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   '0' if the hypervisor is configured to forbid access to event
+   counters being accumulated by other guests and to physical
+   domain event counters.
+   '1' if that access is allowed.
+
+What:  /sys/bus/event_source/devices/hv_gpci/interface/ga
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   0 or 1. Indicates whether we have access to GA events (listed
+   in arch/powerpc/perf/hv-gpci.h).
+
+What:  /sys/bus/event_source/devices/hv_gpci/interface/expanded
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   0 or 1. Indicates whether we have access to EXPANDED events
+   (listed in arch/powerpc/perf/hv-gpci.h).
+
+What:  /sys/bus/event_source/devices/hv_gpci/interface/lab
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   0 or 1. Indicates whether we have access to LAB events (listed
+   in arch/powerpc/perf/hv-gpci.h).
+
+What:  /sys/bus/event_source/devices/hv_gpci/interface/version
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   A number indicating the version of the gpci interface that the
+   hypervisor reports supporting.
+
+What:  /sys/bus/event_source/devices/hv_gpci/interface/kernel_version
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   A number indicating the latest version of the gpci interface
+   that the kernel is aware of.
-- 
1.9.0

Re: [PATCH v3 02/11] perf: add PMU_FORMAT_RANGE() helper for use by sw-like pmus

2014-03-05 Thread Cody P Schafer

On 03/04/2014 12:09 AM, Cody P Schafer wrote:

On 03/03/2014 09:19 PM, Michael Ellerman wrote:

On Thu, 2014-27-02 at 21:04:55 UTC, Cody P Schafer wrote:

Add PMU_FORMAT_RANGE() and PMU_FORMAT_RANGE_RESERVED() (for reserved
areas) which generate functions to extract the relevant bits from
event->attr.config{,1,2} for use by sw-like pmus where the
'config{,1,2}' values don't map directly to hardware registers.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
  include/linux/perf_event.h | 17 +
  1 file changed, 17 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e56b07f..3da5081 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -871,4 +871,21 @@ _name##_show(struct device
*dev,\
  \
  static struct device_attribute format_attr_##_name = __ATTR_RO(_name)

+#define PMU_FORMAT_RANGE(name, attr_var, bit_start, bit_end)    \
+PMU_FORMAT_ATTR(name, #attr_var ":" #bit_start "-" #bit_end);   \
+PMU_FORMAT_RANGE_RESERVED(name, attr_var, bit_start, bit_end)


I really think these should have event in the name.

Someone looking at the code is going to see event_get_foo() and wonder
where that is defined. Grep won't find a definition, tags won't find a
definition, the least you can do is have the macro name give some hint.



That is a good point (grep-ability). Let me think about this. There is
also the possibility that I could adjust the event_get_*() naming to
something else. format_get_*()? event_get_format_*()? (these names keep
growing...)



I've gone with a format_get(name, event) style macro (making it more 
grep-able), in v4.

Feel free to direct further discussion to the v4 posting.

[PATCH 1/2] perf Documentation: sysfs events/ interfaces

2014-03-05 Thread Cody P Schafer
Add documentation for the <event>, <event>.scale, and <event>.unit
files in sysfs.

<event>.scale and <event>.unit were undocumented.
<event> was previously documented only for specific powerpc pmu events.

I've added a restriction that event names cannot contain '.' characters
so we can avoid breaking the API when we (inevitably) add more
'event.' files.
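
To illustrate the file format this patch documents
(term[=value][,term[=value]]..., hex values with a lowercase '0x' prefix,
a bare term meaning 0x1), here is a small userspace parser sketch. It is
only an illustration of the documented grammar, not how the perf tool
parses these files; struct term, MAX_TERMS and parse_event_terms() are
hypothetical names, and the caller is assumed to have stripped any
trailing newline.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TERMS 8

/* One parsed "term[=value]" pair. */
struct term {
	char name[32];
	unsigned long long value;
};

/*
 * Parse "term[=value][,term[=value]]..." into @out.
 * Returns the number of terms found, or -1 on malformed input.
 */
static int parse_event_terms(const char *s, struct term *out, int max)
{
	int n = 0;

	assert(s && out);
	while (*s) {
		size_t len = strcspn(s, ",");          /* length of this term */
		const char *eq = memchr(s, '=', len);  /* '=' inside it, if any */
		size_t name_len = eq ? (size_t)(eq - s) : len;

		if (n == max || name_len == 0 ||
		    name_len >= sizeof(out[n].name))
			return -1;
		memcpy(out[n].name, s, name_len);
		out[n].name[name_len] = '\0';

		if (eq) {
			char *end;

			/* Values are documented as base-16 with a "0x" prefix. */
			if (strncmp(eq + 1, "0x", 2) != 0)
				return -1;
			out[n].value = strtoull(eq + 3, &end, 16);
			/* Need at least one digit, and nothing after the digits. */
			if (end == eq + 3 || end != s + len)
				return -1;
		} else {
			out[n].value = 1;  /* a bare term implies 0x1 */
		}
		n++;

		s += len;
		if (*s == ',') {
			s++;
			if (*s == '\0')
				return -1;  /* trailing comma */
		}
	}
	return n;
}
```

Each parsed value would then be placed into the bit range named by the
matching file under the pmu's format/ directory.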

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 .../testing/sysfs-bus-event_source-devices-events  | 59 ++
 1 file changed, 59 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events 
b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
index 3c1cc24..5393e1ed6 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
@@ -82,3 +82,62 @@ Description: POWER-systems specific performance monitoring 
events
Further, multiple terms like 'event=0x' can be specified
and separated with comma. All available terms are defined in
the /sys/bus/event_source/devices/dev/format file.
+
+What: /sys/bus/event_source/devices/pmu/events/event
+Date: 2014/02/24
+Contact:   Linux kernel mailing list linux-ker...@vger.kernel.org
+Description:   Per-pmu performance monitoring events specific to the running system
+
+   Each file (with a name not containing a '.') in the 'events'
+   directory describes a single performance monitoring event
+   supported by the pmu. The name of the file is the name of the event.
+
+   File contents:
+
+   term[=value][,term[=value]]...
+
+   Where term is one of the terms listed under
+   /sys/bus/event_source/devices/pmu/format/ and value is
+   a number in base-16 format with a '0x' prefix (lowercase only).
+   If a term is specified alone (without an assigned value), it
+   is implied that 0x1 is assigned to that term.
+
+   Examples (each of these lines would be in a separate file):
+
+   event=0x2abc
+   event=0x423,inv,cmask=0x3
+   domain=0x1,offset=0x8,starting_index=0x
+
+   Each of the assignments indicates a value to be assigned to a
+   particular set of bits (as defined by the format file
+   corresponding to the term) in the perf_event_attr structure
+   passed to the perf_event_open syscall.
+
+What: /sys/bus/event_source/devices/pmu/events/event.unit
+Date: 2014/02/24
+Contact:   Linux kernel mailing list linux-ker...@vger.kernel.org
+Description:   Perf event units
+
+   A string specifying the English plural numerical unit that event
+   (once multiplied by event.scale) represents.
+
+   Example:
+
+   Joules
+
+What: /sys/bus/event_source/devices/pmu/events/event.scale
+Date: 2014/02/24
+Contact:   Linux kernel mailing list linux-ker...@vger.kernel.org
+Description:   Perf event scaling factors
+
+   A string representing a floating point value expressed in
+   scientific notation to be multiplied by the event count
+   received from the kernel to match the unit specified in the
+   event.unit file.
+
+   Example:
+
+   2.3283064365386962890625e-10
+
+   This is provided to avoid performing floating point arithmetic
+   in the kernel.
-- 
1.9.0
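
As an illustration of the file format documented in the patch above, here is
a minimal userspace sketch in C that splits the contents of an events/ file
into terms and applies an event.scale value to a raw count. The names
parse_event_terms(), struct event_term and apply_scale() are hypothetical
helpers invented for this example; they are not part of perf or the kernel.
The sketch assumes the caller has already read the file and stripped the
trailing newline, and it is slightly more permissive than the documentation
(strtoull() also accepts uppercase hex digits).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed term; not a perf structure. */
struct event_term {
	char name[32];
	uint64_t value;
};

/*
 * Parse "term[=value][,term[=value]]..." into out[].
 * Returns the number of terms, or -1 on malformed input or when more
 * than max terms are present.
 */
int parse_event_terms(const char *s, struct event_term *out, int max)
{
	int n = 0;

	while (*s) {
		size_t len = strcspn(s, "=,");
		char *end;

		if (n == max || len == 0 || len >= sizeof(out[n].name))
			return -1;
		memcpy(out[n].name, s, len);
		out[n].name[len] = '\0';
		s += len;

		out[n].value = 1; /* a bare term implies 0x1 */
		if (*s == '=') {
			s++;
			if (s[0] != '0' || s[1] != 'x')
				return -1; /* values need a '0x' prefix */
			out[n].value = strtoull(s, &end, 16);
			if (end <= s + 2)
				return -1; /* "0x" without any digits */
			s = end;
		}
		n++;

		if (*s == ',') {
			s++;
			if (*s == '\0')
				return -1; /* trailing comma */
		} else if (*s != '\0') {
			return -1; /* junk after a value */
		}
	}
	return n;
}

/* Multiply a raw count by the contents of an event.scale file. */
double apply_scale(uint64_t count, const char *scale)
{
	return (double)count * strtod(scale, NULL);
}
```

The floating point multiplication happens entirely in userspace here, which
matches the stated reason for exposing the scale as a string.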


[PATCH 2/2] perf Documentation: remove duplicated docs for powerpc cpu specific events

2014-03-05 Thread Cody P Schafer
Listing specific events doesn't actually help us at all here because:
 - these events actually vary between different ppc processors, they
   aren't guaranteed to be present.
 - the documentation of the file contents is now duplicated by the
   docs for arbitrary event file contents.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 .../testing/sysfs-bus-event_source-devices-events  | 57 --
 1 file changed, 57 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events 
b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
index 5393e1ed6..50c30a6 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-events
@@ -26,63 +26,6 @@ Description: Generic performance monitoring events
raw code for the perf event identified by the file's
basename.
 
-
-What:  /sys/devices/cpu/events/PM_1PLUS_PPC_CMPL
-   /sys/devices/cpu/events/PM_BRU_FIN
-   /sys/devices/cpu/events/PM_BR_MPRED
-   /sys/devices/cpu/events/PM_CMPLU_STALL
-   /sys/devices/cpu/events/PM_CMPLU_STALL_BRU
-   /sys/devices/cpu/events/PM_CMPLU_STALL_DCACHE_MISS
-   /sys/devices/cpu/events/PM_CMPLU_STALL_DFU
-   /sys/devices/cpu/events/PM_CMPLU_STALL_DIV
-   /sys/devices/cpu/events/PM_CMPLU_STALL_ERAT_MISS
-   /sys/devices/cpu/events/PM_CMPLU_STALL_FXU
-   /sys/devices/cpu/events/PM_CMPLU_STALL_IFU
-   /sys/devices/cpu/events/PM_CMPLU_STALL_LSU
-   /sys/devices/cpu/events/PM_CMPLU_STALL_REJECT
-   /sys/devices/cpu/events/PM_CMPLU_STALL_SCALAR
-   /sys/devices/cpu/events/PM_CMPLU_STALL_SCALAR_LONG
-   /sys/devices/cpu/events/PM_CMPLU_STALL_STORE
-   /sys/devices/cpu/events/PM_CMPLU_STALL_THRD
-   /sys/devices/cpu/events/PM_CMPLU_STALL_VECTOR
-   /sys/devices/cpu/events/PM_CMPLU_STALL_VECTOR_LONG
-   /sys/devices/cpu/events/PM_CYC
-   /sys/devices/cpu/events/PM_GCT_NOSLOT_BR_MPRED
-   /sys/devices/cpu/events/PM_GCT_NOSLOT_BR_MPRED_IC_MISS
-   /sys/devices/cpu/events/PM_GCT_NOSLOT_CYC
-   /sys/devices/cpu/events/PM_GCT_NOSLOT_IC_MISS
-   /sys/devices/cpu/events/PM_GRP_CMPL
-   /sys/devices/cpu/events/PM_INST_CMPL
-   /sys/devices/cpu/events/PM_LD_MISS_L1
-   /sys/devices/cpu/events/PM_LD_REF_L1
-   /sys/devices/cpu/events/PM_RUN_CYC
-   /sys/devices/cpu/events/PM_RUN_INST_CMPL
-
-Date:  2013/01/08
-
-Contact:   Linux kernel mailing list linux-ker...@vger.kernel.org
-   Linux Powerpc mailing list linuxppc-...@ozlabs.org
-
-Description:   POWER-systems specific performance monitoring events
-
-   A collection of performance monitoring events that may be
-   supported by the POWER CPU. These events can be monitored
-   using the 'perf(1)' tool.
-
-   These events may not be supported by other CPUs.
-
-   The contents of each file would look like:
-
-   event=0x
-
-   where 'N' is a hex digit and the number '0x' shows the
-   raw code for the perf event identified by the file's
-   basename.
-
-   Further, multiple terms like 'event=0x' can be specified
-   and separated with comma. All available terms are defined in
-   the /sys/bus/event_source/devices/dev/format file.
-
 What: /sys/bus/event_source/devices/pmu/events/event
 Date: 2014/02/24
 Contact:   Linux kernel mailing list linux-ker...@vger.kernel.org
-- 
1.9.0


[PATCH 0/2] perf: add documentation for sysfs interfaces

2014-03-05 Thread Cody P Schafer
Documents pmu/events/event{,.scale,.unit} and then removes the redundant
POWER docs.

Slightly restricts event names to avoid API funkiness when we add new
event.? files ('.' forbidden in event names).

The contact is currently lkml, it would be very useful to have a perf
development list to put here instead (acme, feel like making one?).

--

Cody P Schafer (2):
  perf Documentation: sysfs events/ interfaces
  perf Documentation: remove duplicated docs for powerpc cpu specific
events

 .../testing/sysfs-bus-event_source-devices-events  | 92 +++---
 1 file changed, 47 insertions(+), 45 deletions(-)

-- 
1.9.0


Re: [PATCH v3 02/11] perf: add PMU_FORMAT_RANGE() helper for use by sw-like pmus

2014-03-04 Thread Cody P Schafer

On 03/03/2014 09:19 PM, Michael Ellerman wrote:

On Thu, 2014-27-02 at 21:04:55 UTC, Cody P Schafer wrote:

Add PMU_FORMAT_RANGE() and PMU_FORMAT_RANGE_RESERVED() (for reserved
areas) which generate functions to extract the relevant bits from
event->attr.config{,1,2} for use by sw-like pmus where the
'config{,1,2}' values don't map directly to hardware registers.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
  include/linux/perf_event.h | 17 +
  1 file changed, 17 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e56b07f..3da5081 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -871,4 +871,21 @@ _name##_show(struct device *dev,                     \
                                                                      \
  static struct device_attribute format_attr_##_name = __ATTR_RO(_name)

+#define PMU_FORMAT_RANGE(name, attr_var, bit_start, bit_end)         \
+PMU_FORMAT_ATTR(name, #attr_var ":" #bit_start "-" #bit_end);        \
+PMU_FORMAT_RANGE_RESERVED(name, attr_var, bit_start, bit_end)


I really think these should have "event" in the name.

Someone looking at the code is going to see event_get_foo() and wonder where
that is defined. Grep won't find a definition, tags won't find a definition,
the least you can do is have the macro name give some hint.



That is a good point (grep-ability). Let me think about this. There is 
also the possibility that I could adjust the event_get_*() naming to 
something else. format_get_*()? event_get_format_*()? (these names keep 
growing...)



+#define PMU_FORMAT_RANGE_RESERVED(name, attr_var, bit_start, bit_end)  \


It doesn't generate a format attribute.


This was done with the idea that the term format didn't just refer to 
the attribute exposed in sysfs, it referred to some subset of bits 
extractable from attr.config{,1,2}. Which is also the reasoning for the 
above naming.



+static u64 event_get_##name##_max(void)                              \
+{                                                                    \
+   int bits = (bit_end) - (bit_start) + 1;                           \
+   return ((0x1ULL << (bits - 1ULL)) - 1ULL) |                       \
+           (0xFULL << (bits - 4ULL));                                \


What's wrong with:

(0x1ULL << ((bit_end) - (bit_start) + 1)) - 1ULL;


Overflowing the << when bit_end = 63 and bit_start = 0 results in max(0, 
63) = 0.
That said, the current implementation is wrong when (bits < 4). Here's 
one that actually works (without overflowing):


return (((1ull << (bit_end - bit_start)) - 1) << 1) + 1;

And an examination of the problematic case:

#if 0
typedef unsigned long long ull;
ull a = bits - 1;  /* 63 */
ull b = 1ull << a; /* 0x8000000000000000 */
ull c = b - 1;     /* 0x7fffffffffffffff */
ull d = c << 1;    /* 0xfffffffffffffffe */
ull e = d + 1;     /* 0xffffffffffffffff */
return e;
#endif

Small number of valid inputs, so I also tested it for all of them using

unsigned bits = (bit_end) - (bit_start) + 1;
return (bits < (sizeof(0ULL) * CHAR_BIT))
? ((1ULL << bits) - 1ULL)
: ~0ULL;

As the baseline correct one.
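
The exhaustive comparison described above can be sketched in standalone C
roughly as follows. The function names are made up for this illustration,
and bits stands for (bit_end) - (bit_start) + 1, assumed to lie in the
range 1..64.

```c
#include <limits.h>
#include <stdint.h>

/* Baseline: an all-ones mask of the given width, special-casing 64. */
uint64_t mask_baseline(unsigned bits)
{
	return (bits < sizeof(uint64_t) * CHAR_BIT)
		? ((UINT64_C(1) << bits) - 1)
		: ~UINT64_C(0);
}

/*
 * Overflow-free form: the largest shift count is 63, so the shift is
 * always defined for a 64-bit operand.
 */
uint64_t mask_no_overflow(unsigned bits)
{
	return (((UINT64_C(1) << (bits - 1)) - 1) << 1) + 1;
}

/* Count the widths in 1..64 for which the two forms disagree. */
unsigned count_mismatches(void)
{
	unsigned bits, bad = 0;

	for (bits = 1; bits <= 64; bits++)
		if (mask_baseline(bits) != mask_no_overflow(bits))
			bad++;
	return bad;
}
```

Because there are only 64 possible widths, checking every one of them is
cheaper than reasoning about each edge case separately.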


Re: [PATCH v3 03/11] perf: provide a common perf_event_nop_0() for use with .event_idx

2014-03-03 Thread Cody P Schafer

On 03/03/2014 09:19 PM, Michael Ellerman wrote:

On Thu, 2014-27-02 at 21:04:56 UTC, Cody P Schafer wrote:

Rather than having every pmu that needs a function that just returns 0 for
.event_idx define its own copy, reuse the one in kernel/events/core.c.

Rename from perf_swevent_event_idx() because we're no longer using it
for just software events. Naming is based on the perf_pmu_nop_*()
functions.


You could just use perf_pmu_nop_int() directly.


No, .event_idx needs something that takes a (struct perf_event *), 
perf_pmu_nop_int() takes a (struct pmu *).
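
To make the type mismatch concrete, here is a small userspace sketch. The
demo_* types and functions are stand-ins invented for this example; they
only mirror the shape of the kernel signatures mentioned above.

```c
/* Stand-in types for illustration only; the real ones live in the kernel. */
struct demo_pmu;
struct demo_event;

/* Shaped like perf_pmu_nop_int(): takes the pmu. */
int demo_pmu_nop_int(struct demo_pmu *pmu)
{
	(void)pmu;
	return 0;
}

/* Shaped like perf_event_nop_0(): takes the event. */
int demo_event_nop_0(struct demo_event *event)
{
	(void)event;
	return 0;
}

/* Only the callback under discussion is modelled here. */
struct demo_pmu_ops {
	int (*event_idx)(struct demo_event *event);
};

/*
 * Assigning demo_pmu_nop_int here instead would be an incompatible
 * pointer type, because its parameter is a pmu and not an event.
 */
const struct demo_pmu_ops demo_ops = {
	.event_idx = demo_event_nop_0,
};
```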



Re: [PATCH v2 02/11] perf core: export swevent hrtimer helpers

2014-02-27 Thread Cody P Schafer

On 02/26/2014 12:29 AM, Peter Zijlstra wrote:

On Tue, Feb 25, 2014 at 01:38:31PM -0800, Cody P Schafer wrote:

On 02/25/2014 02:20 AM, Peter Zijlstra wrote:

On Tue, Feb 25, 2014 at 02:33:26PM +1100, Michael Ellerman wrote:

On Fri, 2014-14-02 at 22:02:06 UTC, Cody P Schafer wrote:

Export the swevent hrtimer helpers currently only used in events/core.c
to allow the addition of architecture specific sw-like pmus.


Peter, Ingo, can we get your ACK on this please?


How are they used? I saw some usage in patch 9 or so; but it's not
explained anywhere. All patches have non-existent Changelogs and the few
comments that are there are pretty hardware specific.

So please do tell; what do you need this for?


 From this patch's change log:


Export the swevent hrtimer helpers currently only used in events/core.c to 
allow the addition of architecture specific sw-like pmus.


The key part here is architecture specific sw-like pmus, where the
announcement explains why these pmus are sw-like:


I don't read announcements for crucial patch details; announcements are
lost and therefore unimportant.


And I'll be sure to elaborate further in the changelog next time (if I 
don't drop this change entirely).

This is the first comment I've got on this particular patch.


The counters supplied by these interfaces are continually counting and never
need to be (and cannot be) disabled or enabled. They additionally do not
generate any interrupts. This makes them in some regards similar to software
counters, and as a result their implementation shares some common code (which
an initial patch exposes) with the sw counters.


Essentially, these pmus just provide access to a big array of counters which
don't generate interrupts, and are all 64bit (and assumed to never
overflow). Rather than duplicate the code that we already have for managing
timing when reading from counters that don't have interrupts (the functions
that are exposed by this patch), I've reused it.


So note that all the software counters generate interrupts in their own
measuring domain. The hrtimer ones measure time and generate time based
interrupts, the event based ones generate 'interrupts' on their events.

What you have here is a hw pmu without interrupt capability. That's
fine, they don't get to generate interrupt. We have plenty of those
already.

But what you propose to do is add interrupt in another domain entirely.
That's not fine. Don't do that.


Ok, so it looks like I misunderstood the need for an interrupt. The 
intention in using the swevent_hrtimer code was to enable setting up the 
events as frequency sampled. After taking another look at the gpci and 
24x7 pmus, I'm forbidding sampling events anyhow in event init, so the 
timer code isn't even taken advantage of. I'll drop this patch in the 
next set.




You also try and conceal this information; so you suck.




[PATCH v3 01/11] sysfs: create bin_attributes under the requested group

2014-02-27 Thread Cody P Schafer
bin_attributes created/updated in create_files() (such as those listed
via (struct device).attribute_groups) were not placed under the
specified group, and instead appeared in the base kobj directory.

Fix this by making bin_attributes use creation code similar to normal
attributes.

A quick grep shows that no one is using bin_attrs in a named attribute
group yet, so we can do this without breaking anything in userspace.

Note that I do not add is_visible() support to
bin_attributes, though that could be done as well.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman gre...@linuxfoundation.org
---

No need to merge, already in driver-core-next as
aabaf4c2050d21d39fe11eec889c508e84d6a328, included for
reference/testing/verification only.

---

 fs/sysfs/group.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index 6b57938..aa04068 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -70,8 +70,11 @@ static int create_files(struct kernfs_node *parent, struct kobject *kobj,
        if (grp->bin_attrs) {
                for (bin_attr = grp->bin_attrs; *bin_attr; bin_attr++) {
                        if (update)
-                               sysfs_remove_bin_file(kobj, *bin_attr);
-                       error = sysfs_create_bin_file(kobj, *bin_attr);
+                               kernfs_remove_by_name(parent,
+                                               (*bin_attr)->attr.name);
+                       error = sysfs_add_file_mode_ns(parent,
+                                       &(*bin_attr)->attr, true,
+                                       (*bin_attr)->attr.mode, NULL);
                        if (error)
                                break;
                }
-- 
1.9.0


[PATCH v3 00/11] powerpc: Add support for Power Hypervisor supplied performance counters

2014-02-27 Thread Cody P Schafer
These patches add basic pmus for 2 powerpc hypervisor interfaces to obtain
performance counters: gpci (get performance counter info) and 24x7.

The counters supplied by these interfaces are continually counting and never
need to be (and cannot be) disabled or enabled. They additionally do not
generate any interrupts. This makes them in some regards similar to software
counters, and as a result their implementation shares some common code (which
an initial patch exposes) with the sw counters.

These 2 PMUs end up providing access to some cpu, core, and chip level counters
not exposed via other interfaces, and additionally allow monitoring the
performance of other lpars (guests) on the same host system. Because it
provides access to core and chip level counters, this pair of PMUs could be
thought of as powerpc's counterpart to x86's uncore events.

GPCI is an interface that already exists on some power6 and power7 machines
(depending on the fw version), but it is rather inflexible, and adding
counters to it is code intensive.  The 24x7 interfaces are currently designed
to co-exist with the gpci interface while replacing most of gpci's
functionality on newer systems. Right now, the 24x7 code I've submitted uses
the gpci calls to check if it has permission to access certain classes of
counters.

--

Since v2:
 - sysfs: create bin_attributes under the requested group is now in
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 
driver-core-next
with commit-id: aabaf4c2050d21d39fe11eec889c508e84d6a328

 - Split hv-24x7.h catalog definition into hv-24x7-catalog.h
 - Remove unused 24x7 and gpci interface structures and enums (Michael Ellerman)
 - Update docs to point to an external source for the full catalog docs
 - Extend some of the patch changelogs (Peter Z)
 - Remove hrtimer usage and just extern the event_idx helper (now renamed) 
(Peter Z)
 - s/PMU_RANGE_ATTR/PMU_FORMAT_RANGE/ (and similar RESERVED rename) (Michael
   Ellerman)
 - hv_24x7: small clarifications in read_offset_data()'s comment
 - hv_gpci: remove h_gpci_event_read() and h_gpci_event_del(), call _stop and
   _update() directly (Michael Ellerman)
 - Kconfig relocation, dependency changes, and rewording (Scott Wood and
   Michael Ellerman)

Since v1:
 - add a few attributes to hv_gpci and hv_24x7 that expose some info about the 
interfaces
 - so the attributes show up in the right place, fix bin_attr creation in sysfs 
groups.
 - move hv_gpci.h and hv_24x7.h interface headers into arch/powerpc/perf
 - fix bit ordering in hv_gpci.h
 - split out hv_perf_caps_get() and use it to probe for the interface before 
registering
 - ensure proper alignment of hypervisor args
 - add a few missing counter requests to hv_gpci.h
 - s/CIR_xxx/CIR_XXX/ in hv_gpci.h
 - s/modules_init/device_initcall/
 - Don't set event->cpu, use the user provided one
 - remove the union of gpci events, just give the user 1024 bytes to play with
 - clarify some comments (the list of fw versions is now labeled)
 - provide an event_24x7_request() that wraps single_24x7_request()
 - probably some other small fixes I'm forgetting.


Cody P Schafer (11):
  sysfs: create bin_attributes under the requested group
  perf: add PMU_FORMAT_RANGE() helper for use by sw-like pmus
  perf: provide a common perf_event_nop_0() for use with .event_idx
  powerpc: add hvcalls for 24x7 and gpci (get performance counter info)
  powerpc/perf: add hv_gpci interface header
  powerpc/perf: add 24x7 interface headers
  powerpc/perf: add a shared interface to get gpci version and
capabilities
  powerpc/perf: add support for the hv gpci (get performance counter
info) interface
  powerpc/perf: add support for the hv 24x7 interface
  powerpc/perf: add kconfig option for hypervisor provided counters
  powerpc/perf/hv_{gpci,24x7}: add documentation of device attributes

 .../testing/sysfs-bus-event_source-devices-hv_24x7 |  23 +
 .../testing/sysfs-bus-event_source-devices-hv_gpci |  43 ++
 arch/powerpc/include/asm/hvcall.h  |   5 +
 arch/powerpc/perf/Makefile |   2 +
 arch/powerpc/perf/hv-24x7-catalog.h|  33 ++
 arch/powerpc/perf/hv-24x7.c| 492 +
 arch/powerpc/perf/hv-24x7.h| 109 +
 arch/powerpc/perf/hv-common.c  |  39 ++
 arch/powerpc/perf/hv-common.h  |  17 +
 arch/powerpc/perf/hv-gpci.c| 277 
 arch/powerpc/perf/hv-gpci.h|  73 +++
 arch/powerpc/platforms/pseries/Kconfig |  12 +
 fs/sysfs/group.c   |   7 +-
 include/linux/perf_event.h |  18 +
 kernel/events/core.c   |  10 +-
 15 files changed, 1153 insertions(+), 7 deletions(-)
 create mode 100644 
Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
 create mode 100644 
Documentation/ABI

[PATCH v3 04/11] powerpc: add hvcalls for 24x7 and gpci (get performance counter info)

2014-02-27 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/hvcall.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index d8b600b..5dbbb29 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -274,6 +274,11 @@
 /* Platform specific hcalls, used by KVM */
 #define H_RTAS 0xf000
 
+/* Platform specific hcalls, provided by PHYP */
+#define H_GET_24X7_CATALOG_PAGE 0xF078
+#define H_GET_24X7_DATA         0xF07C
+#define H_GET_PERF_COUNTER_INFO 0xF080
+
 #ifndef __ASSEMBLY__
 
 /**
-- 
1.9.0


[PATCH v3 03/11] perf: provide a common perf_event_nop_0() for use with .event_idx

2014-02-27 Thread Cody P Schafer
Rather than having every pmu that needs a function that just returns 0 for
.event_idx define its own copy, reuse the one in kernel/events/core.c.

Rename from perf_swevent_event_idx() because we're no longer using it
for just software events. Naming is based on the perf_pmu_nop_*()
functions.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 include/linux/perf_event.h |  1 +
 kernel/events/core.c   | 10 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3da5081..24a7b45 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -560,6 +560,7 @@ extern void perf_pmu_migrate_context(struct pmu *pmu,
 extern u64 perf_event_read_value(struct perf_event *event,
 u64 *enabled, u64 *running);
 
+extern int perf_event_nop_0(struct perf_event *event);
 
 struct perf_sample_data {
u64 type;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 56003c6..2938a77 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5816,7 +5816,7 @@ static int perf_swevent_init(struct perf_event *event)
return 0;
 }
 
-static int perf_swevent_event_idx(struct perf_event *event)
+int perf_event_nop_0(struct perf_event *event)
 {
return 0;
 }
@@ -5831,7 +5831,7 @@ static struct pmu perf_swevent = {
.stop   = perf_swevent_stop,
.read   = perf_swevent_read,
 
-   .event_idx  = perf_swevent_event_idx,
+   .event_idx  = perf_event_nop_0,
 };
 
 #ifdef CONFIG_EVENT_TRACING
@@ -5950,7 +5950,7 @@ static struct pmu perf_tracepoint = {
.stop   = perf_swevent_stop,
.read   = perf_swevent_read,
 
-   .event_idx  = perf_swevent_event_idx,
+   .event_idx  = perf_event_nop_0,
 };
 
 static inline void perf_tp_register(void)
@@ -6177,7 +6177,7 @@ static struct pmu perf_cpu_clock = {
.stop   = cpu_clock_event_stop,
.read   = cpu_clock_event_read,
 
-   .event_idx  = perf_swevent_event_idx,
+   .event_idx  = perf_event_nop_0,
 };
 
 /*
@@ -6257,7 +6257,7 @@ static struct pmu perf_task_clock = {
.stop   = task_clock_event_stop,
.read   = task_clock_event_read,
 
-   .event_idx  = perf_swevent_event_idx,
+   .event_idx  = perf_event_nop_0,
 };
 
 static void perf_pmu_nop_void(struct pmu *pmu)
-- 
1.9.0


[PATCH v3 02/11] perf: add PMU_FORMAT_RANGE() helper for use by sw-like pmus

2014-02-27 Thread Cody P Schafer
Add PMU_FORMAT_RANGE() and PMU_FORMAT_RANGE_RESERVED() (for reserved
areas) which generate functions to extract the relevant bits from
event->attr.config{,1,2} for use by sw-like pmus where the
'config{,1,2}' values don't map directly to hardware registers.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 include/linux/perf_event.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e56b07f..3da5081 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -871,4 +871,21 @@ _name##_show(struct device *dev,                     \
                                                                      \
 static struct device_attribute format_attr_##_name = __ATTR_RO(_name)
 
+#define PMU_FORMAT_RANGE(name, attr_var, bit_start, bit_end)         \
+PMU_FORMAT_ATTR(name, #attr_var ":" #bit_start "-" #bit_end);        \
+PMU_FORMAT_RANGE_RESERVED(name, attr_var, bit_start, bit_end)
+
+#define PMU_FORMAT_RANGE_RESERVED(name, attr_var, bit_start, bit_end) \
+static u64 event_get_##name##_max(void)                              \
+{                                                                    \
+   int bits = (bit_end) - (bit_start) + 1;                           \
+   return ((0x1ULL << (bits - 1ULL)) - 1ULL) |                       \
+           (0xFULL << (bits - 4ULL));                                \
+}                                                                    \
+static u64 event_get_##name(struct perf_event *event)                \
+{                                                                    \
+   return (event->attr.attr_var >> (bit_start)) &                    \
+           event_get_##name##_max();                                 \
+}
+
 #endif /* _LINUX_PERF_EVENT_H */
-- 
1.9.0


[PATCH v3 05/11] powerpc/perf: add hv_gpci interface header

2014-02-27 Thread Cody P Schafer
H_GetPerformanceCounterInfo (referred to as hv_gpci or just gpci from
here on) is an interface to retrieve specific performance counters and
other data from the hypervisor. All outputs have a fixed format. This
header only describes the portions of the interface that we plan on
using in linux at this time.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-gpci.h | 73 +
 1 file changed, 73 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-gpci.h

diff --git a/arch/powerpc/perf/hv-gpci.h b/arch/powerpc/perf/hv-gpci.h
new file mode 100644
index 000..b25f460
--- /dev/null
+++ b/arch/powerpc/perf/hv-gpci.h
@@ -0,0 +1,73 @@
+#ifndef LINUX_POWERPC_PERF_HV_GPCI_H_
+#define LINUX_POWERPC_PERF_HV_GPCI_H_
+
+#include <linux/types.h>
+
+/* From the document "H_GetPerformanceCounterInfo Interface" v1.07 */
+
+/* H_GET_PERF_COUNTER_INFO argument */
+struct hv_get_perf_counter_info_params {
+   __be32 counter_request; /* I */
+   __be32 starting_index;  /* IO */
+   __be16 secondary_index; /* IO */
+   __be16 returned_values; /* O */
+   __be32 detail_rc; /* O, only needed when called via *_norets() */
+
+   /*
+    * O, size of each counter_value element in bytes, only set for version
+    * >= 0x3
+    */
+   __be16 cv_element_size;
+
+   /* I, 0 (zero) for versions < 0x3 */
+   __u8 counter_info_version_in;
+
+   /* O, 0 (zero) if version < 0x3. Must be set to 0 when making hcall */
+   __u8 counter_info_version_out;
+   __u8 reserved[0xC];
+   __u8 counter_value[];
+} __packed;
+
+/*
+ * counter info version = fw version/reference (spec version)
+ *
+ * 8 = power8 (1.07)
+ * [7 is skipped by spec 1.07]
+ * 6 = TLBIE (1.07)
+ * 5 = v7r7m0.phyp (1.05)
+ * [4 skipped]
+ * 3 = v7r6m0.phyp (?)
+ * [1,2 skipped]
+ * 0 = v7r{2,3,4}m0.phyp (?)
+ */
+#define COUNTER_INFO_VERSION_CURRENT 0x8
+
+/*
+ * These determine the counter_value[] layout and the meaning of starting_index
+ * and secondary_index.
+ *
+ * Unless otherwise noted, @secondary_index is unused and ignored.
+ */
+enum counter_info_requests {
+
+   /* GENERAL */
+
+   /* @starting_index: must be -1 (to refer to the current partition)
+*/
+   CIR_SYSTEM_PERFORMANCE_CAPABILITIES = 0X40,
+};
+
+struct cv_system_performance_capabilities {
+   /* If != 0, allowed to collect data from other partitions */
+   __u8 perf_collect_privileged;
+
+   /* The following are only valid if counter_info_version >= 0x3 */
+#define CV_CM_GA   (1 << 7)
+#define CV_CM_EXPANDED (1 << 6)
+#define CV_CM_LAB  (1 << 5)
+   /* remaining bits are reserved */
+   __u8 capability_mask;
+   __u8 reserved[0xE];
+} __packed;
+
+#endif
-- 
1.9.0
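
As a rough illustration of how the capability bits in
struct cv_system_performance_capabilities could be interpreted, here is a
minimal userspace sketch. It assumes the header comment means that the mask
is only valid when the returned counter info version is at least 0x3, and
that the shifts in the CV_CM_* definitions are left shifts. struct demo_caps
and demo_decode_caps() are hypothetical names used only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as given in the header above (assumed to be left shifts). */
#define CV_CM_GA       (1 << 7)
#define CV_CM_EXPANDED (1 << 6)
#define CV_CM_LAB      (1 << 5)

/* Hypothetical decoded view of the capabilities; not the kernel's struct. */
struct demo_caps {
	bool collect_privileged;
	bool ga;
	bool expanded;
	bool lab;
};

struct demo_caps demo_decode_caps(uint8_t perf_collect_privileged,
				  uint8_t capability_mask,
				  uint8_t counter_info_version_out)
{
	struct demo_caps caps = { 0 };

	/* Any non-zero value allows collecting from other partitions. */
	caps.collect_privileged = perf_collect_privileged != 0;

	/* Assumption: the mask carries no meaning before version 0x3. */
	if (counter_info_version_out >= 0x3) {
		caps.ga = (capability_mask & CV_CM_GA) != 0;
		caps.expanded = (capability_mask & CV_CM_EXPANDED) != 0;
		caps.lab = (capability_mask & CV_CM_LAB) != 0;
	}
	return caps;
}
```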


[PATCH v3 06/11] powerpc/perf: add 24x7 interface headers

2014-02-27 Thread Cody P Schafer
24x7 (also called hv_24x7 or H_24X7) is an interface to obtain
performance counters from the hypervisor. These counters do not have a
fixed format/position and are instead documented in a 24x7 Catalog,
which is provided by the hypervisor (that interface is also documented
partially in the included hv-24x7-catalog.h and fully at
https://raw.githubusercontent.com/jmesmon/catalog-24x7/master/hv-24x7-catalog.h 
).

The 24x7 data access is simply a copy operation into a 4 dimensional
array of 64bit counters (from hypervisor to kernel memory). There is no
interrupt triggered on overflow, and these are completely disjoint from the
typical power pmu.

This method of obtaining performance counters from the hypervisor is
intended to partially replace the gpci interface.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7-catalog.h |  33 +++
 arch/powerpc/perf/hv-24x7.h | 109 
 2 files changed, 142 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-24x7-catalog.h
 create mode 100644 arch/powerpc/perf/hv-24x7.h

diff --git a/arch/powerpc/perf/hv-24x7-catalog.h 
b/arch/powerpc/perf/hv-24x7-catalog.h
new file mode 100644
index 000..21b19dd
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7-catalog.h
@@ -0,0 +1,33 @@
+#ifndef LINUX_POWERPC_PERF_HV_24X7_CATALOG_H_
+#define LINUX_POWERPC_PERF_HV_24X7_CATALOG_H_
+
+#include <linux/types.h>
+
+/* From document "24x7 Event and Group Catalog Formats Proposal" v0.15 */
+
+struct hv_24x7_catalog_page_0 {
+#define HV_24X7_CATALOG_MAGIC 0x32347837 /* 24x7 in ASCII */
+   __be32 magic;
+   __be32 length; /* In 4096 byte pages */
+   __be64 version; /* XXX: arbitrary? what's the meaning/usage/purpose? */
+   __u8 build_time_stamp[16]; /* YYYYMMDDHHMMSS\0\0 */
+   __u8 reserved2[32];
+   __be16 schema_data_offs; /* in 4096 byte pages */
+   __be16 schema_data_len;  /* in 4096 byte pages */
+   __be16 schema_entry_count;
+   __u8 reserved3[2];
+   __be16 event_data_offs;
+   __be16 event_data_len;
+   __be16 event_entry_count;
+   __u8 reserved4[2];
+   __be16 group_data_offs; /* in 4096 byte pages */
+   __be16 group_data_len;  /* in 4096 byte pages */
+   __be16 group_entry_count;
+   __u8 reserved5[2];
+   __be16 formula_data_offs; /* in 4096 byte pages */
+   __be16 formula_data_len;  /* in 4096 byte pages */
+   __be16 formula_entry_count;
+   __u8 reserved6[2];
+} __packed;
+
+#endif
diff --git a/arch/powerpc/perf/hv-24x7.h b/arch/powerpc/perf/hv-24x7.h
new file mode 100644
index 000..720ebce
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7.h
@@ -0,0 +1,109 @@
+#ifndef LINUX_POWERPC_PERF_HV_24X7_H_
+#define LINUX_POWERPC_PERF_HV_24X7_H_
+
+#include <linux/types.h>
+
+struct hv_24x7_request {
+   /* PHYSICAL domains require enabling via phyp/hmc. */
+#define HV_24X7_PERF_DOMAIN_PHYSICAL_CHIP 0x01
+#define HV_24X7_PERF_DOMAIN_PHYSICAL_CORE 0x02
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_HOME_CORE   0x03
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_HOME_CHIP   0x04
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_HOME_NODE   0x05
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_REMOTE_NODE 0x06
+   __u8 performance_domain;
+   __u8 reserved[0x1];
+
+   /* bytes to read starting at @data_offset. must be a multiple of 8 */
+   __be16 data_size;
+
+   /*
+* byte offset within the perf domain to read from. must be 8 byte
+* aligned
+*/
+   __be32 data_offset;
+
+   /*
+* only valid for VIRTUAL_PROCESSOR domains, ignored for others.
+* -1 means current partition only
+*  Enabling via phyp/hmc required for non--1 values. 0 forbidden
+*  unless requestor is 0.
+*/
+   __be16 starting_lpar_ix;
+
+   /*
+* Ignored when @starting_lpar_ix == -1
+* Ignored when @performance_domain is not VIRTUAL_PROCESSOR_*
+* -1 means infinite or all
+*/
+   __be16 max_num_lpars;
+
+   /* chip, core, or virtual processor based on @performance_domain */
+   __be16 starting_ix;
+   __be16 max_ix;
+} __packed;
+
+struct hv_24x7_request_buffer {
+   /* 0 - ? */
+   /* 1 - ? */
+#define HV_24X7_IF_VERSION_CURRENT 0x01
+   __u8 interface_version;
+   __u8 num_requests;
+   __u8 reserved[0xE];
+   struct hv_24x7_request requests[];
+} __packed;
+
+struct hv_24x7_result_element {
+   __be16 lpar_ix;
+
+   /*
+* represents the core, chip, or virtual processor based on the
+* request's @performance_domain
+*/
+   __be16 domain_ix;
+
+   /* -1 if @performance_domain does not refer to a virtual processor */
+   __be32 lpar_cfg_instance_id;
+
+   /* size = @result_element_data_size of containing result. */
+   __u8 element_data[];
+} __packed;
+
+struct hv_24x7_result {
+   __u8 result_ix;
+
+   /*
+* 0 = not all

[PATCH v3 07/11] powerpc/perf: add a shared interface to get gpci version and capabilities

2014-02-27 Thread Cody P Schafer
This exposes a simple way to grab the firmware provided
collect_privileged, ga, expanded, and lab capability bits. All of these
bits come in from the same gpci request, so we've exposed all of them.

Only the collect_privileged bit is really used by the hv-gpci/hv-24x7
code; the other bits are simply exposed in sysfs to inform the user.
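
As a rough illustration of the decoding described here, the sketch below
turns a capability mask into individual 0/1 flags with the `!!(mask & bit)`
idiom. The bit positions (EX_CM_*) and the struct and function names are
placeholders invented for this example; the real CV_CM_* values live in
hv-gpci.h and are not shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only; the real CV_CM_*
 * definitions are in hv-gpci.h and may differ. */
#define EX_CM_GA       (1u << 7)
#define EX_CM_EXPANDED (1u << 6)
#define EX_CM_LAB      (1u << 5)

/* Userspace stand-in for struct hv_perf_caps. */
struct ex_perf_caps {
	uint16_t version;
	unsigned int collect_privileged:1,
		     ga:1,
		     expanded:1,
		     lab:1;
};

/* Decode one reply: every flag is normalized to exactly 0 or 1. */
void ex_decode_caps(struct ex_perf_caps *caps, uint16_t version,
		    uint8_t collect_privileged, uint8_t capability_mask)
{
	assert(caps);
	caps->version = version;
	caps->collect_privileged = !!collect_privileged;
	caps->ga = !!(capability_mask & EX_CM_GA);
	caps->expanded = !!(capability_mask & EX_CM_EXPANDED);
	caps->lab = !!(capability_mask & EX_CM_LAB);
}
```

The double negation matters because the flags are 1-bit fields: assigning
a raw masked value such as 0x80 to a 1-bit field would truncate it to 0.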

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-common.c | 39 +++
 arch/powerpc/perf/hv-common.h | 17 +
 2 files changed, 56 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-common.c
 create mode 100644 arch/powerpc/perf/hv-common.h

diff --git a/arch/powerpc/perf/hv-common.c b/arch/powerpc/perf/hv-common.c
new file mode 100644
index 000..47e02b3
--- /dev/null
+++ b/arch/powerpc/perf/hv-common.c
@@ -0,0 +1,39 @@
+#include <asm/io.h>
+#include <asm/hvcall.h>
+
+#include "hv-gpci.h"
+#include "hv-common.h"
+
+unsigned long hv_perf_caps_get(struct hv_perf_caps *caps)
+{
+   unsigned long r;
+   struct p {
+   struct hv_get_perf_counter_info_params params;
+   struct cv_system_performance_capabilities caps;
+   } __packed __aligned(sizeof(uint64_t));
+
+   struct p arg = {
+   .params = {
+   .counter_request = cpu_to_be32(
+   CIR_SYSTEM_PERFORMANCE_CAPABILITIES),
+   .starting_index = cpu_to_be32(-1),
+   .counter_info_version_in = 0,
+   }
+   };
+
+   r = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO,
+  virt_to_phys(&arg), sizeof(arg));
+
+   if (r)
+   return r;
+
+   pr_devel("capability_mask: 0x%x\n", arg.caps.capability_mask);
+
+   caps->version = arg.params.counter_info_version_out;
+   caps->collect_privileged = !!arg.caps.perf_collect_privileged;
+   caps->ga = !!(arg.caps.capability_mask & CV_CM_GA);
+   caps->expanded = !!(arg.caps.capability_mask & CV_CM_EXPANDED);
+   caps->lab = !!(arg.caps.capability_mask & CV_CM_LAB);
+
+   return r;
+}
diff --git a/arch/powerpc/perf/hv-common.h b/arch/powerpc/perf/hv-common.h
new file mode 100644
index 000..7e615bd
--- /dev/null
+++ b/arch/powerpc/perf/hv-common.h
@@ -0,0 +1,17 @@
+#ifndef LINUX_POWERPC_PERF_HV_COMMON_H_
+#define LINUX_POWERPC_PERF_HV_COMMON_H_
+
+#include <linux/types.h>
+
+struct hv_perf_caps {
+   u16 version;
+   u16 collect_privileged:1,
+   ga:1,
+   expanded:1,
+   lab:1,
+   unused:12;
+};
+
+unsigned long hv_perf_caps_get(struct hv_perf_caps *caps);
+
+#endif
-- 
1.9.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v3 08/11] powerpc/perf: add support for the hv gpci (get performance counter info) interface

2014-02-27 Thread Cody P Schafer
This provides a basic link between perf and hv_gpci. Notably, it does
not yet support transactions and does not list any events (they can
still be manually composed).

Example usage via perf tool:

perf stat -e 'hv_gpci/counter_info_version=3,offset=0,length=8,secondary_index=0,starting_index=0x,request=0x10/' -r 0 -C 0 -x ' ' sleep 0.1
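
The key=value terms in the event string are packed by the perf tool into
perf_event_attr.config and config1 according to the PMU's format
attributes. The sketch below assumes the bit ranges declared with
PMU_FORMAT_RANGE() later in this patch; the struct and function names are
made up for the example, and starting_index is an ordinary parameter
because its value is cut off in the archived command line.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative container for the two config words; not a kernel type. */
struct ex_gpci_config {
	uint64_t config;
	uint64_t config1;
};

/* Place 'value' into bits [lo, hi] of a 64-bit word. */
uint64_t ex_put_field(uint64_t word, unsigned lo, unsigned hi, uint64_t value)
{
	uint64_t mask;

	assert(lo <= hi && hi < 64);
	/* A shift by 64 is undefined, so handle the full-width case apart. */
	mask = (hi - lo == 63) ? ~0ULL : ((1ULL << (hi - lo + 1)) - 1);
	assert((value & ~mask) == 0);	/* value must fit in the field */
	return (word & ~(mask << lo)) | (value << lo);
}

/* Pack the hv_gpci fields using the ranges from the format attributes. */
struct ex_gpci_config ex_gpci_pack(uint32_t request, uint32_t starting_index,
				   uint16_t secondary_index,
				   uint8_t counter_info_version,
				   uint8_t length, uint32_t offset)
{
	struct ex_gpci_config c = { 0, 0 };

	c.config  = ex_put_field(c.config,   0, 31, request);
	c.config  = ex_put_field(c.config,  32, 63, starting_index);
	c.config1 = ex_put_field(c.config1,  0, 15, secondary_index);
	c.config1 = ex_put_field(c.config1, 16, 23, counter_info_version);
	c.config1 = ex_put_field(c.config1, 24, 31, length);
	c.config1 = ex_put_field(c.config1, 32, 63, offset);
	return c;
}
```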

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-gpci.c | 277 
 1 file changed, 277 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-gpci.c

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
new file mode 100644
index 000..2f64732
--- /dev/null
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -0,0 +1,277 @@
+/*
+ * Hypervisor supplied gpci (get performance counter info) performance
+ * counter support
+ *
+ * Author: Cody P Schafer c...@linux.vnet.ibm.com
+ * Copyright 2014 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#define pr_fmt(fmt) "hv-gpci: " fmt
+
+#include <linux/init.h>
+#include <linux/perf_event.h>
+#include <asm/firmware.h>
+#include <asm/hvcall.h>
+#include <asm/io.h>
+
+#include "hv-gpci.h"
+#include "hv-common.h"
+
+PMU_FORMAT_RANGE(request, config, 0, 31); /* u32 */
+PMU_FORMAT_RANGE(starting_index, config, 32, 63); /* u32 */
+PMU_FORMAT_RANGE(secondary_index, config1, 0, 15); /* u16 */
+PMU_FORMAT_RANGE(counter_info_version, config1, 16, 23); /* u8 */
+PMU_FORMAT_RANGE(length, config1, 24, 31); /* u8, bytes of data (1-8) */
+PMU_FORMAT_RANGE(offset, config1, 32, 63); /* u32, byte offset */
+
+static struct attribute *format_attrs[] = {
+   &format_attr_request.attr,
+   &format_attr_starting_index.attr,
+   &format_attr_secondary_index.attr,
+   &format_attr_counter_info_version.attr,
+
+   &format_attr_offset.attr,
+   &format_attr_length.attr,
+   NULL,
+};
+
+static struct attribute_group format_group = {
+   .name = "format",
+   .attrs = format_attrs,
+};
+
+#define HV_CAPS_ATTR(_name, _format)   \
+static ssize_t _name##_show(struct device *dev,\
+   struct device_attribute *attr,  \
+   char *page) \
+{  \
+   struct hv_perf_caps caps;   \
+   unsigned long hret = hv_perf_caps_get(&caps);  \
+   if (hret)   \
+   return -EIO;\
+   \
+   return sprintf(page, _format, caps._name);  \
+}  \
+static struct device_attribute hv_caps_attr_##_name = __ATTR_RO(_name)
+
+static ssize_t kernel_version_show(struct device *dev,
+  struct device_attribute *attr,
+  char *page)
+{
+   return sprintf(page, "0x%x\n", COUNTER_INFO_VERSION_CURRENT);
+}
+
+DEVICE_ATTR_RO(kernel_version);
+HV_CAPS_ATTR(version, "0x%x\n");
+HV_CAPS_ATTR(ga, "%d\n");
+HV_CAPS_ATTR(expanded, "%d\n");
+HV_CAPS_ATTR(lab, "%d\n");
+HV_CAPS_ATTR(collect_privileged, "%d\n");
+
+static struct attribute *interface_attrs[] = {
+   &dev_attr_kernel_version.attr,
+   &hv_caps_attr_version.attr,
+   &hv_caps_attr_ga.attr,
+   &hv_caps_attr_expanded.attr,
+   &hv_caps_attr_lab.attr,
+   &hv_caps_attr_collect_privileged.attr,
+   NULL,
+};
+
+static struct attribute_group interface_group = {
+   .name = "interface",
+   .attrs = interface_attrs,
+};
+
+static const struct attribute_group *attr_groups[] = {
+   &format_group,
+   &interface_group,
+   NULL,
+};
+
+#define GPCI_MAX_DATA_BYTES \
+   (1024 - sizeof(struct hv_get_perf_counter_info_params))
+
+static unsigned long single_gpci_request(u32 req, u32 starting_index,
+   u16 secondary_index, u8 version_in, u32 offset, u8 length,
+   u64 *value)
+{
+   unsigned long ret;
+   size_t i;
+   u64 count;
+
+   struct {
+   struct hv_get_perf_counter_info_params params;
+   uint8_t bytes[GPCI_MAX_DATA_BYTES];
+   } __packed __aligned(sizeof(uint64_t)) arg = {
+   .params = {
+   .counter_request = cpu_to_be32(req),
+   .starting_index = cpu_to_be32(starting_index),
+   .secondary_index = cpu_to_be16(secondary_index),
+   .counter_info_version_in = version_in,
+   }
+   };
+
+   ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO,
+   virt_to_phys(arg

[PATCH v3 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-02-27 Thread Cody P Schafer
This provides a basic interface between hv_24x7 and perf. Similar to
the one provided for gpci, it lacks transaction support and does not
list any events.

Example usage via perf tool:

perf stat -e 'hv_24x7/domain=2,offset=8,starting_index=0,lpar=0x/' -r 0 -C 0 -x ' ' sleep 0.1
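
For readers wondering what the kernel side has to do with such an event,
here is one plausible way to unpack and sanity-check the config words.
It assumes the bit ranges and reserved ranges declared further down in
this patch and the domain values 1..6 from hv-24x7.h; the names are
invented, and this is not the patch's event_init(). The lpar value in the
command line is truncated in the archive and is not reconstructed here.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative decoded form of an hv_24x7 event; not a kernel type. */
struct ex_24x7_event {
	uint8_t  domain;          /* config  bits  0-3  */
	uint16_t starting_index;  /* config  bits 16-31 */
	uint32_t offset;          /* config  bits 32-63 */
	uint16_t lpar;            /* config1 bits  0-15 */
};

/* Extract bits [lo, hi] of word (ranges narrower than 64 bits only). */
uint64_t ex_bits(uint64_t word, unsigned lo, unsigned hi)
{
	assert(lo <= hi && hi < 64 && hi - lo < 63);
	return (word >> lo) & ((1ULL << (hi - lo + 1)) - 1);
}

/* Returns 0 and fills *ev on success, -1 if a reserved bit is set or the
 * domain is outside 1..6. */
int ex_24x7_decode(uint64_t config, uint64_t config1, uint64_t config2,
		   struct ex_24x7_event *ev)
{
	assert(ev);
	/* reserved1: config 4-15, reserved2: config1 16-63, reserved3: config2 */
	if (ex_bits(config, 4, 15) || ex_bits(config1, 16, 63) || config2)
		return -1;

	ev->domain = (uint8_t)ex_bits(config, 0, 3);
	if (ev->domain < 1 || ev->domain > 6)
		return -1;

	ev->starting_index = (uint16_t)ex_bits(config, 16, 31);
	ev->offset = (uint32_t)ex_bits(config, 32, 63);
	ev->lpar = (uint16_t)ex_bits(config1, 0, 15);
	return 0;
}
```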

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.c | 492 
 1 file changed, 492 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-24x7.c

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
new file mode 100644
index 000..c1847a3
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -0,0 +1,492 @@
+/*
+ * Hypervisor supplied 24x7 performance counter support
+ *
+ * Author: Cody P Schafer c...@linux.vnet.ibm.com
+ * Copyright 2014 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "hv-24x7: " fmt
+
+#include <linux/perf_event.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <asm/firmware.h>
+#include <asm/hvcall.h>
+#include <asm/io.h>
+
+#include "hv-24x7.h"
+#include "hv-24x7-catalog.h"
+#include "hv-common.h"
+
+/*
+ * TODO: Merging events:
+ * - Think of the hcall as an interface to a 4d array of counters:
+ *   - x = domains
+ *   - y = indexes in the domain (core, chip, vcpu, node, etc)
+ *   - z = offset into the counter space
+ *   - w = lpars (guest vms, logical partitions)
+ * - A single request is: x,y,y_last,z,z_last,w,w_last
+ *   - this means we can retrieve a rectangle of counters in y,z for a single x.
+ *
+ * - Things to consider (ignoring w):
+ *   - input  cost_per_request = 16
+ *   - output cost_per_result(ys,zs)  = 8 + 8 * ys + ys * zs
+ *   - limited number of requests per hcall (must fit into 4K bytes)
+ * - 4k = 16 [buffer header] - 16 [request size] * request_count
+ * - 255 requests per hcall
+ *   - sometimes it will be more efficient to read extra data and discard
+ */
+
+PMU_FORMAT_RANGE(domain, config, 0, 3); /* u3 0-6, one of HV_24X7_PERF_DOMAIN */
+PMU_FORMAT_RANGE(starting_index, config, 16, 31); /* u16 */
+PMU_FORMAT_RANGE(offset, config, 32, 63); /* u32, see data_offset */
+PMU_FORMAT_RANGE(lpar, config1, 0, 15); /* u16 */
+
+PMU_FORMAT_RANGE_RESERVED(reserved1, config,   4, 15);
+PMU_FORMAT_RANGE_RESERVED(reserved2, config1, 16, 63);
+PMU_FORMAT_RANGE_RESERVED(reserved3, config2,  0, 63);
+
+static struct attribute *format_attrs[] = {
+   &format_attr_domain.attr,
+   &format_attr_offset.attr,
+   &format_attr_starting_index.attr,
+   &format_attr_lpar.attr,
+   NULL,
+};
+
+static struct attribute_group format_group = {
+   .name = "format",
+   .attrs = format_attrs,
+};
+
+/*
+ * read_offset_data - copy data from one buffer to another while treating the
+ *                    source buffer as a small view on the total available
+ *                    source data.
+ *
+ * @dest: buffer to copy into
+ * @dest_len: length of @dest in bytes
+ * @requested_offset: the offset within the source data we want. Must be > 0
+ * @src: buffer to copy data from
+ * @src_len: length of @src in bytes
+ * @source_offset: the offset in the source data that (src,src_len) refers to.
+ *                 Must be > 0
+ *
+ * returns the number of bytes copied.
+ *
+ * The following ascii art shows the various buffer positioning we need to
+ * handle, assigns some arbitrary variables to points on the buffer, and then
+ * shows how we fiddle with those values to get things we care about (copy
+ * start in src and copy len)
+ *
+ * s = @src buffer
+ * d = @dest buffer
+ * '.' areas in d are written to.
+ *
+ *   u
+ *   x wv  z
+ * d   |.|
+ * s |--|
+ *
+ *  u
+ *   x w   z v
+ * d   |--|
+ * s |--|
+ *
+ *   x wu,z,v
+ * d   ||
+ * s |--|
+ *
+ *   x,wu,v,z
+ * d |..|
+ * s |--|
+ *
+ *   xu
+ *   wvz
+ * d ||
+ * s |--|
+ *
+ *   x  z   w  v
+ * d|--|
+ * s |--|
+ *
+ * x = source_offset
+ * w = requested_offset
+ * z = source_offset + src_len
+ * v = requested_offset + dest_len
+ *
+ * w_offset_in_s = w - x = requested_offset - source_offset
+ * z_offset_in_s = z - x = src_len
+ * v_offset_in_s = v - x = request_offset + dest_len - src_len
+ */
+static ssize_t read_offset_data(void *dest, size_t dest_len,
+   loff_t requested_offset, void *src,
+   size_t src_len, loff_t source_offset)
+{
+   size_t w_offset_in_s = requested_offset

[PATCH v3 10/11] powerpc/perf: add kconfig option for hypervisor provided counters

2014-02-27 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/Makefile |  2 ++
 arch/powerpc/platforms/pseries/Kconfig | 12 
 2 files changed, 14 insertions(+)

diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile
index 60d71ee..f9c083a 100644
--- a/arch/powerpc/perf/Makefile
+++ b/arch/powerpc/perf/Makefile
@@ -11,5 +11,7 @@ obj32-$(CONFIG_PPC_PERF_CTRS) += mpc7450-pmu.o
 obj-$(CONFIG_FSL_EMB_PERF_EVENT) += core-fsl-emb.o
 obj-$(CONFIG_FSL_EMB_PERF_EVENT_E500) += e500-pmu.o e6500-pmu.o
 
+obj-$(CONFIG_HV_PERF_CTRS) += hv-24x7.o hv-gpci.o hv-common.o
+
 obj-$(CONFIG_PPC64)+= $(obj64-y)
 obj-$(CONFIG_PPC32)+= $(obj32-y)
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 80b1d57..2cb8b77 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -111,6 +111,18 @@ config CMM
  will be reused for other LPARs. The interface allows firmware to
  balance memory across many LPARs.
 
+config HV_PERF_CTRS
+   bool "Hypervisor supplied PMU events (24x7 & GPCI)"
+   default y
+   depends on PERF_EVENTS && PPC_PSERIES
+   help
+ Enable access to hypervisor supplied counters in perf. Currently,
+ this enables code that uses the hcall GetPerfCounterInfo and 24x7
+ interfaces to retrieve counters. GPCI exists on Power 6 and later
+ systems. 24x7 is available on Power 8 systems.
+
+  If unsure, select Y.
+
 config DTL
   bool "Dispatch Trace Log"
   depends on PPC_SPLPAR && DEBUG_FS
-- 
1.9.0

[PATCH v3 11/11] powerpc/perf/hv_{gpci, 24x7}: add documentation of device attributes

2014-02-27 Thread Cody P Schafer
gpci and 24x7 expose some device specific attributes. Add some
documentation for them.
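
Userspace can consume these attributes with ordinary file reads. A
minimal sketch, assuming each attribute is a single number followed by a
newline (patch 08/11 prints version and kernel_version with a 0x prefix
and the capability bits as 0 or 1); the helper names are invented for
this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the text of one attribute; base 0 accepts both "0x3\n" and "1\n".
 * Returns 0 on success, -1 if the text is not a single number. */
int ex_parse_attr(const char *text, unsigned long *out)
{
	char *end;

	assert(text && out);
	errno = 0;
	*out = strtoul(text, &end, 0);
	if (errno || end == text)
		return -1;
	if (*end != '\0' && *end != '\n')
		return -1;
	return 0;
}

/* Read one attribute file, for example
 * /sys/bus/event_source/devices/hv_gpci/interface/version */
int ex_read_attr(const char *path, unsigned long *out)
{
	char buf[64];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = ex_parse_attr(buf, out);
	fclose(f);
	return ret;
}
```

A tool could compare interface/version against interface/kernel_version
this way before composing events by hand.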

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 .../testing/sysfs-bus-event_source-devices-hv_24x7 | 23 
 .../testing/sysfs-bus-event_source-devices-hv_gpci | 43 ++
 2 files changed, 66 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
 create mode 100644 Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
new file mode 100644
index 000..e78ee79
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
@@ -0,0 +1,23 @@
+What:  /sys/bus/event_source/devices/hv_24x7/interface/catalog
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   Provides access to the binary 24x7 catalog provided by the
+   hypervisor on POWER7 and 8 systems. This catalog lists events
+   available from the powerpc hv_24x7 pmu. Its format is
+   documented here:
+   https://raw.githubusercontent.com/jmesmon/catalog-24x7/master/hv-24x7-catalog.h
+
+What:  /sys/bus/event_source/devices/hv_24x7/interface/catalog_length
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   A number equal to the length in bytes of the catalog. This is
+   also extractable from the provided binary catalog sysfs entry.
+
+What:  /sys/bus/event_source/devices/hv_24x7/interface/catalog_version
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   Exposes the version field of the 24x7 catalog. This is also
+   extractable from the provided binary catalog sysfs entry.
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
new file mode 100644
index 000..3fa58c2
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
@@ -0,0 +1,43 @@
+What:  /sys/bus/event_source/devices/hv_gpci/interface/collect_privileged
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   '0' if the hypervisor is configured to forbid access to event
+   counters being accumulated by other guests and to physical
+   domain event counters.
+   '1' if that access is allowed.
+
+What:  /sys/bus/event_source/devices/hv_gpci/interface/ga
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   0 or 1. Indicates whether we have access to GA events (listed
+   in arch/powerpc/perf/hv-gpci.h).
+
+What:  /sys/bus/event_source/devices/hv_gpci/interface/expanded
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   0 or 1. Indicates whether we have access to EXPANDED events
+   (listed in arch/powerpc/perf/hv-gpci.h).
+
+What:  /sys/bus/event_source/devices/hv_gpci/interface/lab
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   0 or 1. Indicates whether we have access to LAB events (listed
+   in arch/powerpc/perf/hv-gpci.h).
+
+What:  /sys/bus/event_source/devices/hv_gpci/interface/version
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   A number indicating the version of the gpci interface that the
+   hypervisor reports supporting.
+
+What:  /sys/bus/event_source/devices/hv_gpci/interface/kernel_version
+Date:  February 2014
+Contact:   Cody P Schafer c...@linux.vnet.ibm.com
+Description:
+   A number indicating the latest version of the gpci interface
+   that the kernel is aware of.
-- 
1.9.0

Re: [PATCH v2 01/11] perf: add PMU_RANGE_ATTR() helper for use by sw-like pmus

2014-02-25 Thread Cody P Schafer

On 02/24/2014 07:33 PM, Michael Ellerman wrote:

On Fri, 2014-14-02 at 22:02:05 UTC, Cody P Schafer wrote:

Add PMU_RANGE_ATTR() and PMU_RANGE_RESV() (for reserved areas) which
generate functions to extract the relevant bits from
event->attr.config{,1,2} for use by sw-like pmus where the
'config{,1,2}' values don't map directly to hardware registers.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
  include/linux/perf_event.h | 17 +
  1 file changed, 17 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e56b07f..2702e91 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -871,4 +871,21 @@ _name##_show(struct device *dev,   
\
\
  static struct device_attribute format_attr_##_name = __ATTR_RO(_name)

+#define PMU_RANGE_ATTR(name, attr_var, bit_start, bit_end) \
+PMU_FORMAT_ATTR(name, #attr_var ":" #bit_start "-" #bit_end);  \
+PMU_RANGE_RESV(name, attr_var, bit_start, bit_end)
+
+#define PMU_RANGE_RESV(name, attr_var, bit_start, bit_end) \
+static u64 event_get_##name##_max(void)                    \
+{  \
+   int bits = (bit_end) - (bit_start) + 1; \
+   return ((0x1ULL << (bits - 1ULL)) - 1ULL) |   \
+   (0xFULL << (bits - 4ULL));\
+}  \
+static u64 event_get_##name(struct perf_event *event)  \
+{  \
+   return (event->attr.attr_var >> (bit_start)) &  \
+   event_get_##name##_max();   \
+}


I still don't like the names.

EVENT_GETTER_AND_FORMAT()


EVENT_RANGE()

I'd prefer to describe the intended usage rather than what is generated 
both in case we change some of the specifics later, and to provide 
additional information to the developers beyond what a simple code 
reading gives.



EVENT_RESERVED()


Sure. The PMU_* naming was just based on the PMU_FORMAT_ATTR() naming, 
so I kept it for continuity with the existing API. Maybe 
EVENT_RANGE_RESERVED() would be more appropriate?



?

It's not clear to me the max routine is useful in general. Can't we just do:


+#define EVENT_RESERVED(name, attr_var, bit_start, bit_end) \
+static u64 event_get_##name(struct perf_event *event)  \
+{  \
+   return (event->attr.attr_var >> (bit_start)) &  \
+   ((0x1ULL << ((bit_end) - (bit_start) + 1)) - 1ULL);   \
+}


I use event_get_*_max() for some checking of parameters in event_init(). 
Having it lets me avoid specifying the maximum explicitly (0x7 = 
0-19, for example). Specifying it explicitly would mean we'd have the 
bit width of the field in question encoded in two places instead of one, 
and I'd prefer to avoid unneeded duplication.
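
To make the point concrete: once a field's bit range is declared in one
place, both the extracted value and its maximum can be derived from it.
The following is a plain-function rendering of that idea, not the macro
from the patch. The names are invented here, the range check is a
hypothetical example of the kind of parameter checking mentioned above,
and the full-width case is handled with an explicit branch rather than
the two-part shift the patch uses.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value representable in bits [bit_start, bit_end]. */
uint64_t ex_field_max(unsigned bit_start, unsigned bit_end)
{
	unsigned bits;

	assert(bit_start <= bit_end && bit_end < 64);
	bits = bit_end - bit_start + 1;
	/* Shifting a 64-bit value by 64 is undefined, so special-case it. */
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/* Extract the field stored in bits [bit_start, bit_end] of config. */
uint64_t ex_field_get(uint64_t config, unsigned bit_start, unsigned bit_end)
{
	return (config >> bit_start) & ex_field_max(bit_start, bit_end);
}

/* Hypothetical range check: 0 if value fits the field, -1 otherwise. */
int ex_field_check(uint64_t value, unsigned bit_start, unsigned bit_end)
{
	return value <= ex_field_max(bit_start, bit_end) ? 0 : -1;
}
```

With this shape the field width appears only in the (bit_start, bit_end)
pair, which is the duplication argument made above.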


Re: [PATCH v2 05/11] powerpc: add hv_gpci interface header

2014-02-25 Thread Cody P Schafer

On 02/24/2014 07:33 PM, Michael Ellerman wrote:

On Fri, 2014-14-02 at 22:02:09 UTC, Cody P Schafer wrote:

H_GetPerformanceCounterInfo (referred to as hv_gpci or just gpci from
here on) is an interface to retrieve specific performance counters and
other data from the hypervisor. All outputs have a fixed format (and
are represented as structs in this patch).


I still see unused stuff in here, can you strip it back to just what we need.
Same goes for the next patch.



Sure, I can remove the unused structures and enum entries (hadn't 
realized you wanted that in the last review).


Re: [PATCH v2 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-02-25 Thread Cody P Schafer

On 02/24/2014 07:33 PM, Michael Ellerman wrote:

On Fri, 2014-14-02 at 22:02:13 UTC, Cody P Schafer wrote:

This provides a basic interface between hv_24x7 and perf. Similar to
the one provided for gpci, it lacks transaction support and does not
list any events.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
  arch/powerpc/perf/hv-24x7.c | 491 
  1 file changed, 491 insertions(+)
  create mode 100644 arch/powerpc/perf/hv-24x7.c

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
new file mode 100644
index 000..13de140
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7.c

...

+
+/*
+ * read_offset_data - copy data from one buffer to another while treating the
+ *                    source buffer as a small view on the total available
+ *                    source data.
+ *
+ * @dest: buffer to copy into
+ * @dest_len: length of @dest in bytes
+ * @requested_offset: the offset within the source data we want. Must be > 0
+ * @src: buffer to copy data from
+ * @src_len: length of @src in bytes
+ * @source_offset: the offset in the source data that (src,src_len) refers to.
+ *                 Must be > 0
+ *
+ * returns the number of bytes copied.
+ *
+ * '.' areas in d are written to.
+ *
+ *   u
+ *   x wv  z
+ * d   |.|
+ * s |--|
+ *
+ *  u
+ *   x w   z v
+ * d   |--|
+ * s |--|
+ *
+ *   x wu,z,v
+ * d   ||
+ * s |--|
+ *
+ *   x,wu,v,z
+ * d |--|
+ * s |--|
+ *
+ *   xu
+ *   wvz
+ * d ||
+ * s |--|
+ *
+ *   x  z   w  v
+ * d|--|
+ * s |--|
+ *
+ * x = source_offset
+ * w = requested_offset
+ * z = source_offset + src_len
+ * v = requested_offset + dest_len
+ *
+ * w_offset_in_s = w - x = requested_offset - source_offset
+ * z_offset_in_s = z - x = src_len
+ * v_offset_in_s = v - x = request_offset + dest_len - src_len
+ * u_offset_in_s = min(z_offset_in_s, v_offset_in_s)
+ *
+ * copy_len = u_offset_in_s - w_offset_in_s = min(z_offset_in_s, v_offset_in_s)
+ * - w_offset_in_s


Comments are great, especially for complicated code like this. But at a glance
I don't actually understand what this comment is trying to tell me.


The function was composed via some number line logic. The comment tries 
to explain what that logic is. The ascii art shows the various overlapping 
buffers that we're copying between (the '+'s from the patch are messing 
with the indentation of some of the labels). The only major omission I'm 
seeing is that I failed to note that d=dest and s=src (though this could be 
inferred from the comment about '.' indicating a write).


Is there anything specific that doesn't make sense in the comment? (It 
may not be a comment that really can be read at a glance.)
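
One way to restate the number-line reasoning: the source buffer covers
[x, z) of the total data, the request covers [w, v), and the bytes to
copy start at w and end at min(z, v). The sketch below is a userspace
rewrite under that reading, not the function from the patch. It uses
unsigned offsets, so there is no negative-offset check, and it derives
the copy length as min(dest_len, z - w), which equals min(z, v) - w.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy into dest the part of the request [w, v) that the source window
 * [x, z) can satisfy, starting at w. Returns the number of bytes copied,
 * or 0 when w lies outside the window.
 */
size_t ex_read_offset_data(void *dest, size_t dest_len,
			   size_t requested_offset, const void *src,
			   size_t src_len, size_t source_offset)
{
	size_t w_in_s, avail, copy_len;

	assert(dest && src);
	/* w outside [x, z): nothing in this window can be copied. */
	if (requested_offset < source_offset ||
	    requested_offset - source_offset >= src_len)
		return 0;

	w_in_s = requested_offset - source_offset;	/* w - x */
	avail = src_len - w_in_s;			/* z - w */
	copy_len = dest_len < avail ? dest_len : avail;	/* min(z, v) - w */

	memcpy(dest, (const char *)src + w_in_s, copy_len);
	return copy_len;
}
```

For instance, with a window holding offsets 100..107 and a 4-byte request
at offset 106, only the last two bytes of the window are copied.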





+ */
+static ssize_t read_offset_data(void *dest, size_t dest_len,
+   loff_t requested_offset, void *src,
+   size_t src_len, loff_t source_offset)
+{
+   size_t w_offset_in_s = requested_offset - source_offset;
+   size_t z_offset_in_s = src_len;
+   size_t v_offset_in_s = requested_offset + dest_len - src_len;
+   size_t u_offset_in_s = min(z_offset_in_s, v_offset_in_s);
+   size_t copy_len = u_offset_in_s - w_offset_in_s;
+
+   if (requested_offset < 0 || source_offset < 0)
+   return -EINVAL;
+
+   if (z_offset_in_s <= w_offset_in_s)
+   return 0;
+
+   memcpy(dest, src + w_offset_in_s, copy_len);
+   return copy_len;
+}
+
+static unsigned long h_get_24x7_catalog_page(char page[static 4096],
+u32 version, u32 index)
+{
+   WARN_ON(!IS_ALIGNED((unsigned long)page, 4096));
+   return plpar_hcall_norets(H_GET_24X7_CATALOG_PAGE,
+   virt_to_phys(page),
+   version,
+   index);
+}
+
+static ssize_t catalog_read(struct file *filp, struct kobject *kobj,
+   struct bin_attribute *bin_attr, char *buf,
+   loff_t offset, size_t count)
+{
+   unsigned long hret;
+   ssize_t ret = 0;
+   size_t catalog_len = 0, catalog_page_len = 0, page_count = 0;
+   loff_t page_offset = 0;
+   uint32_t catalog_version_num = 0;
+   void *page = kmalloc(4096, GFP_USER);
+   struct hv_24x7_catalog_page_0 *page_0 = page;
+   if (!page)
+   return -ENOMEM;
+
+
+   hret = h_get_24x7_catalog_page(page, 0, 0);
+   if (hret) {
+   ret = -EIO;
+   goto e_free;
+   }
+
+   catalog_version_num = be32_to_cpu(page_0->version);
+   catalog_page_len = be32_to_cpu(page_0->length

Re: [PATCH v2 04/11] powerpc: add hvcalls for 24x7 and gpci (get performance counter info)

2014-02-25 Thread Cody P Schafer

On 02/24/2014 07:33 PM, Michael Ellerman wrote:

On Fri, 2014-14-02 at 22:02:08 UTC, Cody P Schafer wrote:

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
  arch/powerpc/include/asm/hvcall.h | 5 +
  1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index d8b600b..652f7e4 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -274,6 +274,11 @@
  /* Platform specific hcalls, used by KVM */
  #define H_RTAS	0xf000

+/* Platform specific hcalls, provided by PHYP */
+#define H_GET_24X7_CATALOG_PAGE 0xF078
+#define H_GET_24X7_DATA	0xF07C
+#define H_GET_PERF_COUNTER_INFO 0xF080


Some tabs some spaces, use tabs.


Ack.

Re: [PATCH v2 07/11] powerpc: add a shared interface to get gpci version and capabilities

2014-02-25 Thread Cody P Schafer

On 02/24/2014 07:33 PM, Michael Ellerman wrote:

[PATCH v2 07/11] powerpc: add a shared interface to get gpci version and capabilities

All the patches that touch perf should be powerpc/perf: foo


Ok.


On Fri, 2014-14-02 at 22:02:11 UTC, Cody P Schafer wrote:

...


I realise this is a fairly small patch but a changelog is still nice. You could
for example mention that we don't currently use .ga, .expanded or .lab but
we're adding the logic anyway because ...



Well, we do use them to expose some more information to the user (via 
sysfs attributes). Always nice to know what capabilities are enabled.


But sure, I can explain why each bit in that structure is a good idea.




Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
  arch/powerpc/perf/hv-common.c | 39 +++
  arch/powerpc/perf/hv-common.h | 17 +
  2 files changed, 56 insertions(+)
  create mode 100644 arch/powerpc/perf/hv-common.c
  create mode 100644 arch/powerpc/perf/hv-common.h

diff --git a/arch/powerpc/perf/hv-common.c b/arch/powerpc/perf/hv-common.c
new file mode 100644
index 000..47e02b3
--- /dev/null
+++ b/arch/powerpc/perf/hv-common.c
@@ -0,0 +1,39 @@
+#include <asm/io.h>
+#include <asm/hvcall.h>
+
+#include "hv-gpci.h"
+#include "hv-common.h"
+
+unsigned long hv_perf_caps_get(struct hv_perf_caps *caps)
+{
+   unsigned long r;
+   struct p {
+   struct hv_get_perf_counter_info_params params;
+   struct cv_system_performance_capabilities caps;
+   } __packed __aligned(sizeof(uint64_t));
+
+   struct p arg = {
+   .params = {
+   .counter_request = cpu_to_be32(
+   CIR_SYSTEM_PERFORMANCE_CAPABILITIES),
+   .starting_index = cpu_to_be32(-1),
+   .counter_info_version_in = 0,
+   }
+   };
+
+   r = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO,
+  virt_to_phys(&arg), sizeof(arg));
+
+   if (r)
+   return r;
+
+   pr_devel("capability_mask: 0x%x\n", arg.caps.capability_mask);
+
+   caps->version = arg.params.counter_info_version_out;
+   caps->collect_privileged = !!arg.caps.perf_collect_privileged;
+   caps->ga = !!(arg.caps.capability_mask & CV_CM_GA);
+   caps->expanded = !!(arg.caps.capability_mask & CV_CM_EXPANDED);
+   caps->lab = !!(arg.caps.capability_mask & CV_CM_LAB);
+
+   return r;
+}
diff --git a/arch/powerpc/perf/hv-common.h b/arch/powerpc/perf/hv-common.h
new file mode 100644
index 000..7e615bd
--- /dev/null
+++ b/arch/powerpc/perf/hv-common.h
@@ -0,0 +1,17 @@
+#ifndef LINUX_POWERPC_PERF_HV_COMMON_H_
+#define LINUX_POWERPC_PERF_HV_COMMON_H_
+
+#include <linux/types.h>
+
+struct hv_perf_caps {
+   u16 version;
+   u16 collect_privileged:1,
+   ga:1,
+   expanded:1,
+   lab:1,
+   unused:12;
+};
+
+unsigned long hv_perf_caps_get(struct hv_perf_caps *caps);
+
+#endif
--
1.8.5.4






Re: [PATCH v2 08/11] powerpc/perf: add support for the hv gpci (get performance counter info) interface

2014-02-25 Thread Cody P Schafer

On 02/24/2014 07:33 PM, Michael Ellerman wrote:

On Fri, 2014-14-02 at 22:02:12 UTC, Cody P Schafer wrote:

This provides a basic link between perf and hv_gpci. Notably, it does
not yet support transactions and does not list any events (they can
still be manually composed).


Can you explain how the HV_CAPS stuff ends up looking.

I'm not against adding it, but I'd like to understand how we expect it to be
used a bit better.


It's just a quick mechanism for me to expose some relevant information 
to userspace via sysfs using the hv_perf_caps_get() function's returned 
data. Documentation for this sysfs interface (and the rest) is in a 
later patch.
I don't expect any more uses to show up unless the firmware decides to 
add another capability bit (in which case I'll want to expose it as well).



diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
new file mode 100644
index 000..1f5d96d
--- /dev/null
+++ b/arch/powerpc/perf/hv-gpci.c
+
+static struct pmu h_gpci_pmu = {
+   .task_ctx_nr = perf_invalid_context,
+
+   .name = "hv_gpci",
+   .attr_groups = attr_groups,
+   .event_init  = h_gpci_event_init,
+   .add = h_gpci_event_add,
+   .del = h_gpci_event_del,

 = h_gpci_event_stop,


+   .start   = h_gpci_event_start,
+   .stop= h_gpci_event_stop,
+   .read= h_gpci_event_read,

 = h_gpci_event_update


+   .event_idx = perf_swevent_event_idx,
+};


whoops, thought I had fixed those 2 already.

Re: [PATCH v2 10/11] powerpc/perf: add kconfig option for hypervisor provided counters

2014-02-25 Thread Cody P Schafer

On 02/24/2014 07:33 PM, Michael Ellerman wrote:

On Fri, 2014-14-02 at 22:02:14 UTC, Cody P Schafer wrote:

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
  arch/powerpc/perf/Makefile | 2 ++
  arch/powerpc/platforms/Kconfig.cputype | 6 ++
  2 files changed, 8 insertions(+)

diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile
index 60d71ee..f9c083a 100644
--- a/arch/powerpc/perf/Makefile
+++ b/arch/powerpc/perf/Makefile
@@ -11,5 +11,7 @@ obj32-$(CONFIG_PPC_PERF_CTRS) += mpc7450-pmu.o
  obj-$(CONFIG_FSL_EMB_PERF_EVENT) += core-fsl-emb.o
  obj-$(CONFIG_FSL_EMB_PERF_EVENT_E500) += e500-pmu.o e6500-pmu.o

+obj-$(CONFIG_HV_PERF_CTRS) += hv-24x7.o hv-gpci.o hv-common.o
+
  obj-$(CONFIG_PPC64)   += $(obj64-y)
  obj-$(CONFIG_PPC32)   += $(obj32-y)
diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index 434fda3..dcc67cd 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -364,6 +364,12 @@ config PPC_PERF_CTRS
 help
   This enables the powerpc-specific perf_event back-end.

+config HV_PERF_CTRS
+   def_bool y


This was bool, why did you change it?


No, it wasn't. v1 also had def_bool. https://lkml.org/lkml/2014/1/16/518
Maybe you're confusing v2.1 and v2 of this patch?




+   depends on PERF_EVENTS && PPC_HAVE_PMU_SUPPORT


Should be:

depends on PERF_EVENTS && PPC_PSERIES


+   help
+ Enable access to perf counters provided by the hypervisor
+


Yep, the v2.1 patch (which I bungled and labeled as 9/11) already 
changes both of these.

It'll end up rolled into v3.


Re: [PATCH v2 02/11] perf core: export swevent hrtimer helpers

2014-02-25 Thread Cody P Schafer

On 02/25/2014 02:20 AM, Peter Zijlstra wrote:

On Tue, Feb 25, 2014 at 02:33:26PM +1100, Michael Ellerman wrote:

On Fri, 2014-14-02 at 22:02:06 UTC, Cody P Schafer wrote:

Export the swevent hrtimer helpers currently only used in events/core.c
to allow the addition of architecture specific sw-like pmus.


Peter, Ingo, can we get your ACK on this please?


How are they used? I saw some usage in patch 9 or so; but it's not
explained anywhere. All patches have non-existent Changelogs and the few
comments that are there are pretty hardware specific.

So please do tell; what do you need this for?


From this patch's change log:


Export the swevent hrtimer helpers currently only used in events/core.c to 
allow the addition of architecture specific sw-like pmus.


The key part here is "architecture specific sw-like pmus", where the 
announcement explains why these pmus are sw-like:



The counters supplied by these interfaces are continually counting and never
need to be (and cannot be) disabled or enabled. They additionally do not
generate any interrupts. This makes them in some regards similar to software
counters, and as a result their implementation shares some common code (which
an initial patch exposes) with the sw counters.


Essentially, these pmus just provide access to a big array of counters 
which don't generate interrupts, and are all 64bit (and assumed to never 
overflow). Rather than duplicate the code that we already have for 
managing timing when reading from counters that don't have interrupts 
(the functions that are exposed by this patch), I've reused it.
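
To make the "sw-like" point concrete, here is a minimal userspace sketch (not the kernel code from this series) of how a free-running, interrupt-less 64-bit counter can be turned into an event count: remember the raw value when the event starts, then fold the delta into the count on every read. The names demo_event, demo_event_start() and demo_event_update() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a perf event backed by a free-running counter. */
struct demo_event {
	uint64_t prev_raw;	/* last raw value read from the counter */
	uint64_t count;		/* accumulated delta reported to the user */
};

/* Remember the current raw value so later reads only report new counts. */
static void demo_event_start(struct demo_event *ev, uint64_t raw_now)
{
	ev->prev_raw = raw_now;
}

/*
 * Fold the delta since the previous read into the event count.
 * The counter is assumed to be 64 bits wide and never to go backwards.
 */
static uint64_t demo_event_update(struct demo_event *ev, uint64_t raw_now)
{
	assert(raw_now >= ev->prev_raw);
	ev->count += raw_now - ev->prev_raw;
	ev->prev_raw = raw_now;
	return ev->count;
}
```

Because nothing ever interrupts to signal an overflow, a periodic timer is what would drive calls like demo_event_update() when sampling is requested, which is the role the exported hrtimer helpers play.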



Re: [PATCH v2 01/11] perf: add PMU_RANGE_ATTR() helper for use by sw-like pmus

2014-02-25 Thread Cody P Schafer

On 02/25/2014 12:33 PM, Cody P Schafer wrote:

On 02/24/2014 07:33 PM, Michael Ellerman wrote:

On Fri, 2014-14-02 at 22:02:05 UTC, Cody P Schafer wrote:

Add PMU_RANGE_ATTR() and PMU_RANGE_RESV() (for reserved areas) which
generate functions to extract the relevant bits from
event->attr.config{,1,2} for use by sw-like pmus where the
'config{,1,2}' values don't map directly to hardware registers.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
  include/linux/perf_event.h | 17 +
  1 file changed, 17 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e56b07f..2702e91 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -871,4 +871,21 @@ _name##_show(struct device
*dev,\
  \
  static struct device_attribute format_attr_##_name = __ATTR_RO(_name)

+#define PMU_RANGE_ATTR(name, attr_var, bit_start, bit_end)\
+PMU_FORMAT_ATTR(name, #attr_var ":" #bit_start "-" #bit_end);\
+PMU_RANGE_RESV(name, attr_var, bit_start, bit_end)
+
+#define PMU_RANGE_RESV(name, attr_var, bit_start, bit_end)\
+static u64 event_get_##name##_max(void)\
+{\
+int bits = (bit_end) - (bit_start) + 1;\
+return ((0x1ULL << (bits - 1ULL)) - 1ULL) |\
+(0xFULL << (bits - 4ULL));\
+}\
+static u64 event_get_##name(struct perf_event *event)\
+{\
+return (event->attr.attr_var >> (bit_start)) &\
+event_get_##name##_max();\
+}


I still don't like the names.

EVENT_GETTER_AND_FORMAT()


EVENT_RANGE()

I'd prefer to describe the intended usage rather than what is generated
both in case we change some of the specifics later, and to provide
additional information to the developers beyond what a simple code
reading gives.


EVENT_RESERVED()


Sure. The PMU_* naming was just based on the PMU_FORMAT_ATTR() naming,
so I kept it for continuity with the existing API. Maybe
EVENT_RANGE_RESERVED() would be more appropriate?



Thinking about this a bit more, EVENT_RANGE() and EVENT_RANGE_RESERVED() 
aren't quite ideal either. The EVENT name collides with the files we 
put in the events/ dir, while these macros generate files for the format/ 
dir. Maybe:


FORMAT_RANGE() and FORMAT_RANGE_RESERVED()
or
PMU_FORMAT_RANGE(), PMU_FORMAT_RANGE_RESERVED()
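
Whatever the macros end up being called, the getter they generate boils down to a shift and a mask over one of the config words. A minimal userspace sketch of that idea follows. The demo_range_max() and demo_range_get() names are hypothetical, and the mask is computed as a plain (1 << bits) - 1 with a special case for 64-bit-wide fields, which is a simplification and not the expression used in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value that fits in the inclusive bit range [bit_start, bit_end]. */
static uint64_t demo_range_max(unsigned bit_start, unsigned bit_end)
{
	unsigned bits;

	assert(bit_start <= bit_end && bit_end < 64);
	bits = bit_end - bit_start + 1;
	/* Shifting a 64-bit value by 64 is undefined, so special-case it. */
	return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* Extract the field stored in bits [bit_start, bit_end] of a config word. */
static uint64_t demo_range_get(uint64_t config, unsigned bit_start,
			       unsigned bit_end)
{
	return (config >> bit_start) & demo_range_max(bit_start, bit_end);
}
```

For instance, demo_range_get(config, 16, 31) would pull out a 16-bit field that a format/ entry describes as "config:16-31".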


Re: [PATCH] powerpc: warn users of smt-snooze-delay that the API isn't there anymore

2014-02-25 Thread Cody P Schafer

On 02/24/2014 08:53 PM, Madhavan Srinivasan wrote:

On Saturday 22 February 2014 05:44 AM, Cody P Schafer wrote:

/sys/devices/system/cpu/cpu*/smt-snooze-delay was converted into a NOP
in commit 3fa8cad82b94d0bed002571bd246f2299ffc876b, and now does
nothing. Add a pr_warn() to convince any users that they should stop
using it.

The commit message from the removing commit notes that this
functionality should move into the cpuidle driver, essentially by


Would prefer to cleanup the code since the functionality is moved,
instead of adding to it.


We'd still want users of the interface to use an attribute wired up 
under the cpuidle/ dir, so a warning (to update their software) is still 
needed. As deepthi has noted, cpuidle right now doesn't support changing 
this on a per-cpu basis, so a cleanup isn't a simple matter.



[PATCH] powerpc: warn users of smt-snooze-delay that the API isn't there anymore

2014-02-21 Thread Cody P Schafer
/sys/devices/system/cpu/cpu*/smt-snooze-delay was converted into a NOP
in commit 3fa8cad82b94d0bed002571bd246f2299ffc876b, and now does
nothing. Add a pr_warn() to convince any users that they should stop
using it.

The commit message from the removing commit notes that this
functionality should move into the cpuidle driver, essentially by
adjusting target_residency to the specified value. At the moment,
target_residency is not exposed by cpuidle's sysfs, so there isn't a
drop in replacement for this.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/kernel/sysfs.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 97e1dc9..84097b4 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -50,6 +50,9 @@ static ssize_t store_smt_snooze_delay(struct device *dev,
if (ret != 1)
return -EINVAL;
 
+   pr_warn_ratelimited("%s (%d): /sys/devices/system/cpu/cpu%d/smt-snooze-delay is deprecated and is a NOP\n",
+ current->comm, task_pid_nr(current), cpu->dev.id);
+
per_cpu(smt_snooze_delay, cpu->dev.id) = snooze;
return count;
 }
@@ -60,6 +63,9 @@ static ssize_t show_smt_snooze_delay(struct device *dev,
 {
struct cpu *cpu = container_of(dev, struct cpu, dev);
 
+   pr_warn_ratelimited("%s (%d): /sys/devices/system/cpu/cpu%d/smt-snooze-delay is deprecated and is a NOP\n",
+ current->comm, task_pid_nr(current), cpu->dev.id);
+
return sprintf(buf, "%ld\n", per_cpu(smt_snooze_delay, cpu->dev.id));
 }
 
-- 
1.9.0


[PATCH v2.1 9/11] powerpc/perf: add kconfig option for hypervisor provided counters

2014-02-20 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/Makefile |  2 ++
 arch/powerpc/platforms/pseries/Kconfig | 12 
 2 files changed, 14 insertions(+)

diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile
index 60d71ee..f9c083a 100644
--- a/arch/powerpc/perf/Makefile
+++ b/arch/powerpc/perf/Makefile
@@ -11,5 +11,7 @@ obj32-$(CONFIG_PPC_PERF_CTRS) += mpc7450-pmu.o
 obj-$(CONFIG_FSL_EMB_PERF_EVENT) += core-fsl-emb.o
 obj-$(CONFIG_FSL_EMB_PERF_EVENT_E500) += e500-pmu.o e6500-pmu.o
 
+obj-$(CONFIG_HV_PERF_CTRS) += hv-24x7.o hv-gpci.o hv-common.o
+
 obj-$(CONFIG_PPC64)+= $(obj64-y)
 obj-$(CONFIG_PPC32)+= $(obj32-y)
diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index 80b1d57..2cb8b77 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -111,6 +111,18 @@ config CMM
  will be reused for other LPARs. The interface allows firmware to
  balance memory across many LPARs.
 
+config HV_PERF_CTRS
+   bool "Hypervisor supplied PMU events (24x7 & GPCI)"
+   default y
+   depends on PERF_EVENTS && PPC_PSERIES
+   help
+ Enable access to hypervisor supplied counters in perf. Currently,
+ this enables code that uses the hcall GetPerfCounterInfo and 24x7
+ interfaces to retrieve counters. GPCI exists on Power 6 and later
+ systems. 24x7 is available on Power 8 systems.
+
+  If unsure, select Y.
+
 config DTL
bool "Dispatch Trace Log"
depends on PPC_SPLPAR && DEBUG_FS
-- 
1.9.0



Re: [PATCH v2.1 9/11] powerpc/perf: add kconfig option for hypervisor provided counters

2014-02-20 Thread Cody P Schafer

Whoops, should be [Patch v2.1 10/11]



Re: [PATCH v2 10/11] powerpc/perf: add kconfig option for hypervisor provided counters

2014-02-17 Thread Cody P Schafer

On 02/16/2014 11:11 PM, Michael Ellerman wrote:

On Fri, 2014-02-14 at 16:25 -0800, Cody P Schafer wrote:

On Fri, Feb 14, 2014 at 04:32:13PM -0600, Scott Wood wrote:

On Fri, 2014-02-14 at 14:02 -0800, Cody P Schafer wrote:

diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index 434fda3..dcc67cd 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -364,6 +364,12 @@ config PPC_PERF_CTRS
 help
   This enables the powerpc-specific perf_event back-end.

+config HV_PERF_CTRS
+   def_bool y
+   depends on PERF_EVENTS && PPC_HAVE_PMU_SUPPORT
+   help
+ Enable access to perf counters provided by the hypervisor


Please don't add default-y stuff that is platform-specific, and
definitely point out that platform dependency in the config description
-- I have to look elsewhere in the patchset to determine that this is
for Power Hypervisor.  PPC_HAVE_PMU_SUPPORT is enabled by all 6xx
builds, even for hardware like e300 that doesn't have PMU at all (it has
the FSL embedded perfmon instead), much less this hv interface.

And yes, PPC_PERF_CTRS has the same problem and should be fixed. :-)


Yep, I just based this one on what PPC_PERF_CTRS was doing.

How about the following:

+config HV_PERF_CTRS
+   bool "Perf Hypervisor supplied counters"


Support for Hypervisor supplied PMU events (24x7 & GPCI) ?


Sounds good to me.




+   default y
+   depends on PERF_EVENTS && PPC_HAVE_PMU_SUPPORT && PPC_PSERIES


I think you just want:

depends on PERF_EVENTS && PPC_PSERIES


Because you're adding two completely new PMUs, they're not a struct power_pmu
backend for the existing powerpc PMU implementation.



Ack. I'll fix this up in v3



[PATCH v2 00/11] powerpc: Add support for Power Hypervisor supplied performance counters

2014-02-14 Thread Cody P Schafer
These patches add basic pmus for 2 powerpc hypervisor interfaces to obtain
performance counters: gpci (get performance counter info) and 24x7.

The counters supplied by these interfaces are continually counting and never
need to be (and cannot be) disabled or enabled. They additionally do not
generate any interrupts. This makes them in some regards similar to software
counters, and as a result their implementation shares some common code (which
an initial patch exposes) with the sw counters.

There is ongoing work to support transactions for each of these pmus.

These 2 PMUs end up providing access to some cpu, core, and chip level counters
not exposed via other interfaces, and additionally allow monitoring the
performance of other lpars (guests) on the same host system. Because it
provides access to core and chip level counters, this pair of PMUs could be
thought of as powerpc's counterpart to x86's uncore events.

As an example, processor_bus_utilization_abc and
processor_bus_utilization_wxyz (in hv_gpci.h) allow retrieval of total cycles
and idle cycles for various inter-chip buses.
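
As a rough illustration of what a consumer could do with such a pair of counters, the sketch below derives a bus utilization percentage from two snapshots of total and idle cycles. It is only a sketch under stated assumptions: the demo_bus_sample layout and the demo_bus_utilization_pct() helper are hypothetical and do not mirror the actual structs in the gpci header.

```c
#include <assert.h>
#include <stdint.h>

/* One snapshot of a pair of free-running bus counters (hypothetical layout). */
struct demo_bus_sample {
	uint64_t total_cycles;
	uint64_t idle_cycles;
};

/*
 * Busy share of the interval between two snapshots, in percent (0..100).
 * The multiplication assumes the interval is short enough that
 * busy_cycles * 100 still fits in 64 bits.
 */
static unsigned demo_bus_utilization_pct(const struct demo_bus_sample *before,
					 const struct demo_bus_sample *after)
{
	uint64_t total = after->total_cycles - before->total_cycles;
	uint64_t idle = after->idle_cycles - before->idle_cycles;

	assert(idle <= total);
	if (total == 0)
		return 0;	/* no cycles elapsed, nothing to report */
	return (unsigned)(((total - idle) * 100) / total);
}
```

Since the counters never stop, only differences between two reads are meaningful; a single snapshot says nothing about current load.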

GPCI is an interface that already exists on some power6 and power7 machines
(depending on the fw version), but is rather inflexible and code intensive to
add additional counters to.  The 24x7 interfaces currently are designed to
co-exist with the gpci interface while replacing most of gpci's functionality
on newer systems. Right now, the 24x7 code I've submitted uses the gpci calls
to check if it has permission to access certain classes of counters.

Example perf usage:

perf stat -e 
'hv_gpci/counter_info_version=3,offset=0,length=8,secondary_index=0,starting_index=0x,request=0x10/'
 -r 0 -C 0 -x ' ' sleep 0.1

perf stat -e 'hv_24x7/domain=2,offset=8,starting_index=0,lpar=0x/' -r 0 
-C 0 -x ' ' sleep 0.1

--

Changes since v1:
 - add a few attributes to hv_gpci and hv_24x7 that expose some info about the 
interfaces
 - so the attributes show up in the right place, fix bin_attr creation in sysfs 
groups.
 - move hv_gpci.h and hv_24x7.h interface headers into arch/powerpc/perf
 - fix bit ordering in hv_gpci.h
 - split out hv_perf_caps_get() and use it to probe for the interface before 
registering
 - ensure proper alignment of hypervisor args
 - add a few missing counter requests to hv_gpci.h
 - s/CIR_xxx/CIR_XXX/ in hv_gpci.h
 - s/modules_init/device_initcall/
 - Don't set event-cpu, use the user provided one
 - remove the union of gpci events, just give the user 1024 bytes to play with
 - clarify some comments (the list of fw versions is now labeled)
 - provide an event_24x7_request() that wraps single_24x7_request()
 - probably some other small fixes I'm forgetting.


Cody P Schafer (11):
  perf: add PMU_RANGE_ATTR() helper for use by sw-like pmus
  perf core: export swevent hrtimer helpers
  sysfs: create bin_attributes under the requested group
  powerpc: add hvcalls for 24x7 and gpci (get performance counter info)
  powerpc: add hv_gpci interface header
  powerpc: add 24x7 interface header
  powerpc: add a shared interface to get gpci version and capabilities
  powerpc/perf: add support for the hv gpci (get performance counter
info) interface
  powerpc/perf: add support for the hv 24x7 interface
  powerpc/perf: add kconfig option for hypervisor provided counters
  powerpc/perf/hv_{gpci,24x7}: add documentation of device attributes

 .../testing/sysfs-bus-event_source-devices-hv_24x7 |  22 +
 .../testing/sysfs-bus-event_source-devices-hv_gpci |  43 ++
 arch/powerpc/include/asm/hvcall.h  |   5 +
 arch/powerpc/perf/Makefile |   2 +
 arch/powerpc/perf/hv-24x7.c| 491 +++
 arch/powerpc/perf/hv-24x7.h| 239 ++
 arch/powerpc/perf/hv-common.c  |  39 ++
 arch/powerpc/perf/hv-common.h  |  17 +
 arch/powerpc/perf/hv-gpci.c| 290 
 arch/powerpc/perf/hv-gpci.h| 521 +
 arch/powerpc/platforms/Kconfig.cputype |   6 +
 fs/sysfs/group.c   |   7 +-
 include/linux/perf_event.h |  22 +-
 kernel/events/core.c   |   8 +-
 14 files changed, 1705 insertions(+), 7 deletions(-)
 create mode 100644 
Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
 create mode 100644 
Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
 create mode 100644 arch/powerpc/perf/hv-24x7.c
 create mode 100644 arch/powerpc/perf/hv-24x7.h
 create mode 100644 arch/powerpc/perf/hv-common.c
 create mode 100644 arch/powerpc/perf/hv-common.h
 create mode 100644 arch/powerpc/perf/hv-gpci.c
 create mode 100644 arch/powerpc/perf/hv-gpci.h

-- 
1.8.5.4



[PATCH v2 02/11] perf core: export swevent hrtimer helpers

2014-02-14 Thread Cody P Schafer
Export the swevent hrtimer helpers currently only used in events/core.c
to allow the addition of architecture specific sw-like pmus.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 include/linux/perf_event.h | 5 -
 kernel/events/core.c   | 8 
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2702e91..24378a9 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -559,7 +559,10 @@ extern void perf_pmu_migrate_context(struct pmu *pmu,
int src_cpu, int dst_cpu);
 extern u64 perf_event_read_value(struct perf_event *event,
 u64 *enabled, u64 *running);
-
+extern void perf_swevent_init_hrtimer(struct perf_event *event);
+extern void perf_swevent_start_hrtimer(struct perf_event *event);
+extern void perf_swevent_cancel_hrtimer(struct perf_event *event);
+extern int perf_swevent_event_idx(struct perf_event *event);
 
 struct perf_sample_data {
u64 type;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 56003c6..feb0347 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5816,7 +5816,7 @@ static int perf_swevent_init(struct perf_event *event)
return 0;
 }
 
-static int perf_swevent_event_idx(struct perf_event *event)
+int perf_swevent_event_idx(struct perf_event *event)
 {
return 0;
 }
@@ -6045,7 +6045,7 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct 
hrtimer *hrtimer)
return ret;
 }
 
-static void perf_swevent_start_hrtimer(struct perf_event *event)
+void perf_swevent_start_hrtimer(struct perf_event *event)
 {
struct hw_perf_event *hwc = &event->hw;
s64 period;
@@ -6067,7 +6067,7 @@ static void perf_swevent_start_hrtimer(struct perf_event 
*event)
HRTIMER_MODE_REL_PINNED, 0);
 }
 
-static void perf_swevent_cancel_hrtimer(struct perf_event *event)
+void perf_swevent_cancel_hrtimer(struct perf_event *event)
 {
struct hw_perf_event *hwc = &event->hw;
 
@@ -6079,7 +6079,7 @@ static void perf_swevent_cancel_hrtimer(struct perf_event 
*event)
}
 }
 
-static void perf_swevent_init_hrtimer(struct perf_event *event)
+void perf_swevent_init_hrtimer(struct perf_event *event)
 {
struct hw_perf_event *hwc = &event->hw;
 
-- 
1.8.5.4



[PATCH v2 01/11] perf: add PMU_RANGE_ATTR() helper for use by sw-like pmus

2014-02-14 Thread Cody P Schafer
Add PMU_RANGE_ATTR() and PMU_RANGE_RESV() (for reserved areas) which
generate functions to extract the relevant bits from
event->attr.config{,1,2} for use by sw-like pmus where the
'config{,1,2}' values don't map directly to hardware registers.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 include/linux/perf_event.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e56b07f..2702e91 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -871,4 +871,21 @@ _name##_show(struct device *dev,   
\
\
 static struct device_attribute format_attr_##_name = __ATTR_RO(_name)
 
+#define PMU_RANGE_ATTR(name, attr_var, bit_start, bit_end) \
+PMU_FORMAT_ATTR(name, #attr_var ":" #bit_start "-" #bit_end);  \
+PMU_RANGE_RESV(name, attr_var, bit_start, bit_end)
+
+#define PMU_RANGE_RESV(name, attr_var, bit_start, bit_end) \
+static u64 event_get_##name##_max(void)                    \
+{                                                          \
+   int bits = (bit_end) - (bit_start) + 1;                 \
+   return ((0x1ULL << (bits - 1ULL)) - 1ULL) |             \
+           (0xFULL << (bits - 4ULL));                      \
+}                                                          \
+static u64 event_get_##name(struct perf_event *event)      \
+{                                                          \
+   return (event->attr.attr_var >> (bit_start)) &          \
+           event_get_##name##_max();                       \
+}
+
 #endif /* _LINUX_PERF_EVENT_H */
-- 
1.8.5.4



[PATCH v2 03/11] sysfs: create bin_attributes under the requested group

2014-02-14 Thread Cody P Schafer
bin_attributes created/updated in create_files() (such as those listed
via (struct device).attribute_groups) were not placed under the
specified group, and instead appeared in the base kobj directory.

Fix this by making bin_attributes use creating code similar to normal
attributes.

A quick grep shows that no one is using bin_attrs in a named attribute
group yet, so we can do this without breaking anything in userspace.

Note that I do not add is_visible() support to
bin_attributes, though that could be done as well.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 fs/sysfs/group.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index 6b57938..aa04068 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -70,8 +70,11 @@ static int create_files(struct kernfs_node *parent, struct 
kobject *kobj,
if (grp->bin_attrs) {
for (bin_attr = grp->bin_attrs; *bin_attr; bin_attr++) {
if (update)
-   sysfs_remove_bin_file(kobj, *bin_attr);
-   error = sysfs_create_bin_file(kobj, *bin_attr);
+   kernfs_remove_by_name(parent,
+   (*bin_attr)->attr.name);
+   error = sysfs_add_file_mode_ns(parent,
+   &(*bin_attr)->attr, true,
+   (*bin_attr)->attr.mode, NULL);
if (error)
break;
}
-- 
1.8.5.4



[PATCH v2 04/11] powerpc: add hvcalls for 24x7 and gpci (get performance counter info)

2014-02-14 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/hvcall.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index d8b600b..652f7e4 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -274,6 +274,11 @@
 /* Platform specific hcalls, used by KVM */
 #define H_RTAS 0xf000
 
+/* Platform specific hcalls, provided by PHYP */
+#define H_GET_24X7_CATALOG_PAGE 0xF078
+#define H_GET_24X7_DATA         0xF07C
+#define H_GET_PERF_COUNTER_INFO 0xF080
+
 #ifndef __ASSEMBLY__
 
 /**
-- 
1.8.5.4



[PATCH v2 05/11] powerpc: add hv_gpci interface header

2014-02-14 Thread Cody P Schafer
H_GetPerformanceCounterInfo (referred to as hv_gpci or just gpci from
here on) is an interface to retrieve specific performance counters and
other data from the hypervisor. All outputs have a fixed format (and
are represented as structs in this patch).

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-gpci.h | 521 
 1 file changed, 521 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-gpci.h

diff --git a/arch/powerpc/perf/hv-gpci.h b/arch/powerpc/perf/hv-gpci.h
new file mode 100644
index 000..d602809
--- /dev/null
+++ b/arch/powerpc/perf/hv-gpci.h
@@ -0,0 +1,521 @@
+#ifndef LINUX_POWERPC_PERF_HV_GPCI_H_
+#define LINUX_POWERPC_PERF_HV_GPCI_H_
+
+#include <linux/types.h>
+
+/* From the document H_GetPerformanceCounterInfo Interface v1.07 */
+
+/* H_GET_PERF_COUNTER_INFO argument */
+struct hv_get_perf_counter_info_params {
+   __be32 counter_request; /* I */
+   __be32 starting_index;  /* IO */
+   __be16 secondary_index; /* IO */
+   __be16 returned_values; /* O */
+   __be32 detail_rc; /* O, only needed when called via *_norets() */
+
+   /*
+* O, size each of counter_value element in bytes, only set for version
+* >= 0x3
+*/
+   __be16 cv_element_size;
+
+   /* I, 0 (zero) for versions < 0x3 */
+   __u8 counter_info_version_in;
+
+   /* O, 0 (zero) if version < 0x3. Must be set to 0 when making hcall */
+   __u8 counter_info_version_out;
+   __u8 reserved[0xC];
+   __u8 counter_value[];
+} __packed;
+
+/*
+ * counter info version => fw version/reference (spec version)
+ *
+ * 8 => power8 (1.07)
+ * [7 is skipped by spec 1.07]
+ * 6 => TLBIE (1.07)
+ * 5 => v7r7m0.phyp (1.05)
+ * [4 skipped]
+ * 3 => v7r6m0.phyp (?)
+ * [1,2 skipped]
+ * 0 => v7r{2,3,4}m0.phyp (?)
+ */
+#define COUNTER_INFO_VERSION_CURRENT 0x8
+
+/*
+ * These determine the counter_value[] layout and the meaning of starting_index
+ * and secondary_index.
+ *
+ * Unless otherwise noted, @secondary_index is unused and ignored.
+ */
+enum counter_info_requests {
+
+   /* GENERAL */
+
+   /* @starting_index: starting physical processor index or -1 for
+*  current physical processor. Data is only collected
+*  for the processors' primary thread.
+*/
+   CIR_DISPATCH_TIMEBASE_BY_PROCESSOR = 0x10,
+
+   /* @starting_index: starting partition id or -1 for the current logical
+*  partition (virtual machine).
+*/
+   CIR_ENTITLED_CAPPED_UNCAPPED_DONATED_IDLE_TIMEBASE_BY_PARTITION = 0x20,
+
+   /* @starting_index: starting partition id or -1 for the current logical
+*  partition (virtual machine).
+*/
+   CIR_RUN_INSTRUCTIONS_RUN_CYCLES_BY_PARTITION = 0X30,
+
+   /* @starting_index: must be -1 (to refer to the current partition)
+*/
+   CIR_SYSTEM_PERFORMANCE_CAPABILITIES = 0X40,
+
+
+   /* Data from this should only be considered valid if
+* counter_info_version >= 0x3
+* @starting_index: starting hardware chip id or -1 for the current hw
+*  chip id
+*/
+   CIR_PROCESSOR_BUS_UTILIZATION_ABC_LINKS = 0X50,
+
+   /* Data from this should only be considered valid if
+* counter_info_version >= 0x3
+* @starting_index: starting hardware chip id or -1 for the current hw
+*  chip id
+*/
+   CIR_PROCESSOR_BUS_UTILIZATION_WXYZ_LINKS = 0X60,
+
+   /*
+* EXPANDED - the following are only available if the CV_CM_EXPANDED
+* bit is set from system_performance_capabilities. Enforcement is left
+* to the hypervisor.
+*/
+
+   /* Available if counter_info_version >= 0x3
+* @starting_index: starting hardware chip id or -1 for the current hw
+*  chip id
+*/
+   CIR_PROCESSOR_BUS_UTILIZATION_GX_LINKS = 0X70,
+
+   /* Available if counter_info_version >= 0x3
+* @starting_index: starting hardware chip id or -1 for the current hw
+*  chip id
+*/
+   CIR_PROCESSOR_BUS_UTILIZATION_MC_LINKS = 0X80,
+
+   /* Available if counter_info_version >= 0x3
+* @starting_index: starting physical processor or -1 for the current
+*  physical processor
+*/
+   CIR_PROCESSOR_CONFIG = 0X90,
+
+   /* Available if counter_info_version >= 0x3
+* @starting_index: starting physical processor or -1 for the current
+*  physical processor
+*/
+   CIR_CURRENT_PROCESSOR_FREQUENCY = 0X91,
+
+   /* Available if counter_info_version >= 0x3 and <= 0x7
+* @starting_index: starting physical processor or -1 for the current
+*  physical processor
+*/
+   CIR_PROCESSOR_CORE_UTILIZATION = 0X94,
+
+   /* Available

[PATCH v2 06/11] powerpc: add 24x7 interface header

2014-02-14 Thread Cody P Schafer
24x7 (also called hv_24x7 or H_24X7) is an interface to obtain
performance counters from the hypervisor. These counters do not have a
fixed format/position and are instead documented in a 24x7 Catalog,
which is provided by the hypervisor (that interface is also documented
in this header).

This method of obtaining performance counters from the hypervisor is
intended to partially replace the gpci interface.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.h | 239 
 1 file changed, 239 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-24x7.h
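
The WARNING comments in the header below note that elements[] and results[] can only be indexed directly for the first entry, because each result element carries a variable amount of data. A minimal userspace sketch of stepping through such a packed run by hand follows, assuming an 8-byte fixed element header (lpar_ix, domain_ix, lpar_cfg_instance_id) followed by result_element_data_size bytes of data. The demo_* helpers are hypothetical and not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed part of a result element: lpar_ix, domain_ix, lpar_cfg_instance_id. */
#define DEMO_ELEM_HDR_SIZE 8

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t demo_get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return a pointer to element number idx in a packed run of result elements,
 * or NULL if it would not fit in buf_len bytes. Each element is the 8-byte
 * header followed by data_size bytes of counter data.
 */
static const uint8_t *demo_result_element(const uint8_t *buf, size_t buf_len,
					  size_t data_size, size_t idx)
{
	size_t stride = DEMO_ELEM_HDR_SIZE + data_size;
	size_t off = idx * stride;

	assert(buf != NULL);
	if (off + stride > buf_len)
		return NULL;
	return buf + off;
}

/* domain_ix sits right after the 2-byte lpar_ix at the start of an element. */
static uint16_t demo_element_domain_ix(const uint8_t *elem)
{
	return demo_get_be16(elem + 2);
}
```

The stride has to come from the result's own data size field, which is why plain array indexing on the flexible array member only works for index 0.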

diff --git a/arch/powerpc/perf/hv-24x7.h b/arch/powerpc/perf/hv-24x7.h
new file mode 100644
index 000..bf079da
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7.h
@@ -0,0 +1,239 @@
+#ifndef LINUX_POWERPC_PERF_HV_24X7_H_
+#define LINUX_POWERPC_PERF_HV_24X7_H_
+
+#include <linux/types.h>
+
+struct hv_24x7_request {
+   /* PHYSICAL domains require enabling via phyp/hmc. */
+#define HV_24X7_PERF_DOMAIN_PHYSICAL_CHIP 0x01
+#define HV_24X7_PERF_DOMAIN_PHYSICAL_CORE 0x02
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_HOME_CORE   0x03
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_HOME_CHIP   0x04
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_HOME_NODE   0x05
+#define HV_24X7_PERF_DOMAIN_VIRTUAL_PROCESSOR_REMOTE_NODE 0x06
+   __u8 performance_domain;
+   __u8 reserved[0x1];
+
+   /* bytes to read starting at @data_offset. must be a multiple of 8 */
+   __be16 data_size;
+
+   /*
+* byte offset within the perf domain to read from. must be 8 byte
+* aligned
+*/
+   __be32 data_offset;
+
+   /*
+* only valid for VIRTUAL_PROCESSOR domains, ignored for others.
+* -1 means current partition only
+*  Enabling via phyp/hmc required for non--1 values. 0 forbidden
+*  unless requestor is 0.
+*/
+   __be16 starting_lpar_ix;
+
+   /*
+* Ignored when @starting_lpar_ix == -1
+* Ignored when @performance_domain is not VIRTUAL_PROCESSOR_*
+* -1 means infinite or all
+*/
+   __be16 max_num_lpars;
+
+   /* chip, core, or virtual processor based on @performance_domain */
+   __be16 starting_ix;
+   __be16 max_ix;
+} __packed;
+
+struct hv_24x7_request_buffer {
+   /* 0 - ? */
+   /* 1 - ? */
+#define HV_24X7_IF_VERSION_CURRENT 0x01
+   __u8 interface_version;
+   __u8 num_requests;
+   __u8 reserved[0xE];
+   struct hv_24x7_request requests[];
+} __packed;
+
+struct hv_24x7_result_element {
+   __be16 lpar_ix;
+
+   /*
+* represents the core, chip, or virtual processor based on the
+* request's @performance_domain
+*/
+   __be16 domain_ix;
+
+   /* -1 if @performance_domain does not refer to a virtual processor */
+   __be32 lpar_cfg_instance_id;
+
+   /* size = @result_element_data_size of containing result. */
+   __u8 element_data[];
+} __packed;
+
+struct hv_24x7_result {
+   __u8 result_ix;
+
+   /*
+* 0 = not all result elements fit into the buffer, additional requests
+* required
+* 1 = all result elements were returned
+*/
+   __u8 results_complete;
+   __be16 num_elements_returned;
+
+   /* This is a copy of @data_size from the corresponding hv_24x7_request */
+   __be16 result_element_data_size;
+   __u8 reserved[0x2];
+
+   /* WARNING: only valid for first result element due to variable sizes
+*  of result elements */
+   /* struct hv_24x7_result_element[@num_elements_returned] */
+   struct hv_24x7_result_element elements[];
+} __packed;
+
+struct hv_24x7_data_result_buffer {
+   /* See versioning for request buffer */
+   __u8 interface_version;
+
+   __u8 num_results;
+   __u8 reserved[0x1];
+   __u8 failing_request_ix;
+   __be32 detailed_rc;
+   __be64 cec_cfg_instance_id;
+   __be64 catalog_version_num;
+   __u8 reserved2[0x8];
+   /* WARNING: only valid for the first result due to variable sizes of
+*  results */
+   struct hv_24x7_result results[]; /* [@num_results] */
+} __packed;
+
+/* From document 24x7 Event and Group Catalog Formats Proposal v0.14 */
+struct hv_24x7_catalog_page_0 {
+#define HV_24X7_CATALOG_MAGIC 0x32347837 /* 24x7 in ASCII */
+   __be32 magic;
+   __be32 length; /* In 4096 byte pages */
+   __u8 reserved1[4];
+   __be32 version;
+   __u8 build_time_stamp[16]; /* "YYYYMMDDHHMMSS\0\0" */
+   __u8 reserved2[32];
+   __be16 schema_data_offs; /* in 4096 byte pages */
+   __be16 schema_data_len;  /* in 4096 byte pages */
+   __be16 schema_entry_count;
+   __u8 reserved3[2];
+   __be16 group_data_offs; /* in 4096 byte pages */
+   __be16 group_data_len;  /* in 4096 byte pages */
+   __be16 group_entry_count;
+   __u8 reserved4[2];
+   __be16

[PATCH v2 07/11] powerpc: add a shared interface to get gpci version and capabilities

2014-02-14 Thread Cody P Schafer
Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-common.c | 39 +++
 arch/powerpc/perf/hv-common.h | 17 +
 2 files changed, 56 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-common.c
 create mode 100644 arch/powerpc/perf/hv-common.h

diff --git a/arch/powerpc/perf/hv-common.c b/arch/powerpc/perf/hv-common.c
new file mode 100644
index 000..47e02b3
--- /dev/null
+++ b/arch/powerpc/perf/hv-common.c
@@ -0,0 +1,39 @@
+#include <asm/io.h>
+#include <asm/hvcall.h>
+
+#include "hv-gpci.h"
+#include "hv-common.h"
+
+unsigned long hv_perf_caps_get(struct hv_perf_caps *caps)
+{
+   unsigned long r;
+   struct p {
+   struct hv_get_perf_counter_info_params params;
+   struct cv_system_performance_capabilities caps;
+   } __packed __aligned(sizeof(uint64_t));
+
+   struct p arg = {
+   .params = {
+   .counter_request = cpu_to_be32(
+   CIR_SYSTEM_PERFORMANCE_CAPABILITIES),
+   .starting_index = cpu_to_be32(-1),
+   .counter_info_version_in = 0,
+   }
+   };
+
+   r = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO,
+  virt_to_phys(&arg), sizeof(arg));
+
+   if (r)
+   return r;
+
+   pr_devel("capability_mask: 0x%x\n", arg.caps.capability_mask);
+
+   caps->version = arg.params.counter_info_version_out;
+   caps->collect_privileged = !!arg.caps.perf_collect_privileged;
+   caps->ga = !!(arg.caps.capability_mask & CV_CM_GA);
+   caps->expanded = !!(arg.caps.capability_mask & CV_CM_EXPANDED);
+   caps->lab = !!(arg.caps.capability_mask & CV_CM_LAB);
+
+   return r;
+}
diff --git a/arch/powerpc/perf/hv-common.h b/arch/powerpc/perf/hv-common.h
new file mode 100644
index 000..7e615bd
--- /dev/null
+++ b/arch/powerpc/perf/hv-common.h
@@ -0,0 +1,17 @@
+#ifndef LINUX_POWERPC_PERF_HV_COMMON_H_
+#define LINUX_POWERPC_PERF_HV_COMMON_H_
+
+#include <linux/types.h>
+
+struct hv_perf_caps {
+   u16 version;
+   u16 collect_privileged:1,
+   ga:1,
+   expanded:1,
+   lab:1,
+   unused:12;
+};
+
+unsigned long hv_perf_caps_get(struct hv_perf_caps *caps);
+
+#endif
-- 
1.8.5.4



[PATCH v2 08/11] powerpc/perf: add support for the hv gpci (get performance counter info) interface

2014-02-14 Thread Cody P Schafer
This provides a basic link between perf and hv_gpci. Notably, it does
not yet support transactions and does not list any events (they can
still be manually composed).

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-gpci.c | 290 
 1 file changed, 290 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-gpci.c

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
new file mode 100644
index 000..1f5d96d
--- /dev/null
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -0,0 +1,290 @@
+/*
+ * Hypervisor supplied gpci (get performance counter info) performance
+ * counter support
+ *
+ * Author: Cody P Schafer c...@linux.vnet.ibm.com
+ * Copyright 2014 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#define pr_fmt(fmt) "hv-gpci: " fmt
+
+#include <linux/init.h>
+#include <linux/perf_event.h>
+#include <asm/firmware.h>
+#include <asm/hvcall.h>
+#include <asm/io.h>
+
+#include "hv-gpci.h"
+#include "hv-common.h"
+
+PMU_RANGE_ATTR(request, config, 0, 31); /* u32 */
+PMU_RANGE_ATTR(starting_index, config, 32, 63); /* u32 */
+PMU_RANGE_ATTR(secondary_index, config1, 0, 15); /* u16 */
+PMU_RANGE_ATTR(counter_info_version, config1, 16, 23); /* u8 */
+PMU_RANGE_ATTR(length, config1, 24, 31); /* u8, bytes of data (1-8) */
+PMU_RANGE_ATTR(offset, config1, 32, 63); /* u32, byte offset */
+
+static struct attribute *format_attrs[] = {
+   &format_attr_request.attr,
+   &format_attr_starting_index.attr,
+   &format_attr_secondary_index.attr,
+   &format_attr_counter_info_version.attr,
+
+   &format_attr_offset.attr,
+   &format_attr_length.attr,
+   NULL,
+};
+
+static struct attribute_group format_group = {
+   .name = "format",
+   .attrs = format_attrs,
+};
+
+#define HV_CAPS_ATTR(_name, _format)   \
+static ssize_t _name##_show(struct device *dev,\
+   struct device_attribute *attr,  \
+   char *page) \
+{  \
+   struct hv_perf_caps caps;   \
+   unsigned long hret = hv_perf_caps_get(&caps);  \
+   if (hret)   \
+   return -EIO;\
+   \
+   return sprintf(page, _format, caps._name);  \
+}  \
+static struct device_attribute hv_caps_attr_##_name = __ATTR_RO(_name)
+
+static ssize_t kernel_version_show(struct device *dev,
+  struct device_attribute *attr,
+  char *page)
+{
+   return sprintf(page, "0x%x\n", COUNTER_INFO_VERSION_CURRENT);
+}
+
+DEVICE_ATTR_RO(kernel_version);
+HV_CAPS_ATTR(version, "0x%x\n");
+HV_CAPS_ATTR(ga, "%d\n");
+HV_CAPS_ATTR(expanded, "%d\n");
+HV_CAPS_ATTR(lab, "%d\n");
+HV_CAPS_ATTR(collect_privileged, "%d\n");
+
+static struct attribute *interface_attrs[] = {
+   &dev_attr_kernel_version.attr,
+   &hv_caps_attr_version.attr,
+   &hv_caps_attr_ga.attr,
+   &hv_caps_attr_expanded.attr,
+   &hv_caps_attr_lab.attr,
+   &hv_caps_attr_collect_privileged.attr,
+   NULL,
+};
+
+static struct attribute_group interface_group = {
+   .name = "interface",
+   .attrs = interface_attrs,
+};
+
+static const struct attribute_group *attr_groups[] = {
+   &format_group,
+   &interface_group,
+   NULL,
+};
+
+#define GPCI_MAX_DATA_BYTES \
+   (1024 - sizeof(struct hv_get_perf_counter_info_params))
+
+static unsigned long single_gpci_request(u32 req, u32 starting_index,
+   u16 secondary_index, u8 version_in, u32 offset, u8 length,
+   u64 *value)
+{
+   unsigned long ret;
+   size_t i;
+   u64 count;
+
+   struct {
+   struct hv_get_perf_counter_info_params params;
+   uint8_t bytes[GPCI_MAX_DATA_BYTES];
+   } __packed __aligned(sizeof(uint64_t)) arg = {
+   .params = {
+   .counter_request = cpu_to_be32(req),
+   .starting_index = cpu_to_be32(starting_index),
+   .secondary_index = cpu_to_be16(secondary_index),
+   .counter_info_version_in = version_in,
+   }
+   };
+
+   ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO,
+   virt_to_phys(&arg), sizeof(arg));
+   if (ret) {
+   pr_devel("hcall failed: 0x%lx\n", ret);
+   return ret;
+   }
+
+   /*
+* we verify offset and length are within the zeroed buffer

[PATCH v2 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-02-14 Thread Cody P Schafer
This provides a basic interface between hv_24x7 and perf. Similar to
the one provided for gpci, it lacks transaction support and does not
list any events.

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.c | 491 
 1 file changed, 491 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-24x7.c

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
new file mode 100644
index 000..13de140
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -0,0 +1,491 @@
+/*
+ * Hypervisor supplied 24x7 performance counter support
+ *
+ * Author: Cody P Schafer c...@linux.vnet.ibm.com
+ * Copyright 2014 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "hv-24x7: " fmt
+
+#include <linux/perf_event.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <asm/firmware.h>
+#include <asm/hvcall.h>
+#include <asm/io.h>
+
+#include "hv-24x7.h"
+#include "hv-common.h"
+
+/*
+ * TODO: Merging events:
+ * - Think of the hcall as an interface to a 4d array of counters:
+ *   - x = domains
+ *   - y = indexes in the domain (core, chip, vcpu, node, etc)
+ *   - z = offset into the counter space
+ *   - w = lpars (guest vms, logical partitions)
+ * - A single request is: x,y,y_last,z,z_last,w,w_last
+ *   - this means we can retrieve a rectangle of counters in y,z for a single x.
+ *
+ * - Things to consider (ignoring w):
+ *   - input  cost_per_request = 16
+ *   - output cost_per_result(ys,zs)  = 8 + 8 * ys + ys * zs
+ *   - limited number of requests per hcall (must fit into 4K bytes)
+ * - 4k = 16 [buffer header] + 16 [request size] * request_count
+ * - 255 requests per hcall
+ *   - sometimes it will be more efficient to read extra data and discard
+ */
+
+PMU_RANGE_ATTR(domain, config, 0, 3); /* u3 0-6, one of HV_24X7_PERF_DOMAIN */
+PMU_RANGE_ATTR(starting_index, config, 16, 31); /* u16 */
+PMU_RANGE_ATTR(offset, config, 32, 63); /* u32, see data_offset */
+PMU_RANGE_ATTR(lpar, config1, 0, 15); /* u16 */
+
+PMU_RANGE_RESV(reserved1, config,   4, 15);
+PMU_RANGE_RESV(reserved2, config1, 16, 63);
+PMU_RANGE_RESV(reserved3, config2,  0, 63);
+
+static struct attribute *format_attrs[] = {
+   &format_attr_domain.attr,
+   &format_attr_offset.attr,
+   &format_attr_starting_index.attr,
+   &format_attr_lpar.attr,
+   NULL,
+};
+
+static struct attribute_group format_group = {
+   .name = "format",
+   .attrs = format_attrs,
+};
+
+/*
+ * read_offset_data - copy data from one buffer to another while treating the
+ *                    source buffer as a small view on the total available
+ *                    source data.
+ *
+ * @dest: buffer to copy into
+ * @dest_len: length of @dest in bytes
+ * @requested_offset: the offset within the source data we want. Must be > 0
+ * @src: buffer to copy data from
+ * @src_len: length of @src in bytes
+ * @source_offset: the offset in the source data that (src,src_len) refers to.
+ *                 Must be > 0
+ *
+ * returns the number of bytes copied.
+ *
+ * '.' areas in d are written to.
+ *
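+ *                     u
+ *   x         w       v  z
+ * d           |.......|
+ * s |--------------------|
+ *
+ *                     u
+ *   x         w       z     v
+ * d           |.......------|
+ * s |-----------------|
+ *
+ *   x         w       u,z,v
+ * d           |.......|
+ * s |-----------------|
+ *
+ *   x,w               u,v,z
+ * d |.................|
+ * s |-----------------|
+ *
+ *   x        u
+ *   w        v        z
+ * d |........|
+ * s |-----------------|
+ *
+ *   x      z   w      v
+ * d            |------|
+ * s |------|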
+ *
+ * x = source_offset
+ * w = requested_offset
+ * z = source_offset + src_len
+ * v = requested_offset + dest_len
+ *
+ * w_offset_in_s = w - x = requested_offset - source_offset
+ * z_offset_in_s = z - x = src_len
+ * v_offset_in_s = v - x = requested_offset + dest_len - source_offset
+ * u_offset_in_s = min(z_offset_in_s, v_offset_in_s)
+ *
+ * copy_len = u_offset_in_s - w_offset_in_s = min(z_offset_in_s, v_offset_in_s)
+ * - w_offset_in_s
+ */
+static ssize_t read_offset_data(void *dest, size_t dest_len,
+   loff_t requested_offset, void *src,
+   size_t src_len, loff_t source_offset)
+{
+   size_t w_offset_in_s = requested_offset - source_offset;
+   size_t z_offset_in_s = src_len;
+   size_t v_offset_in_s = requested_offset + dest_len - source_offset;
+   size_t u_offset_in_s = min(z_offset_in_s, v_offset_in_s);
+   size_t copy_len = u_offset_in_s - w_offset_in_s;
+
+   if (requested_offset < 0 || source_offset < 0)
+   return
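
The windowed copy described in the read_offset_data comment can be sketched in
plain userspace C as follows. This is a sketch under stated assumptions, not
the patch's implementation: window_copy() is a hypothetical name, it returns
-1 rather than a kernel error code, and it also rejects a requested_offset
that lies before source_offset, a case the diagrams do not cover. The
arithmetic follows the definitions in the comment, with v measured from x
(source_offset).

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the part of [requested_offset, requested_offset + dest_len) that
 * overlaps the window [source_offset, source_offset + src_len) into dest.
 * Returns the number of bytes copied, 0 when there is no overlap, and -1
 * for negative offsets or a request starting before the window.
 */
static long window_copy(void *dest, size_t dest_len, long long requested_offset,
			const void *src, size_t src_len, long long source_offset)
{
	long long w, z, v, u;
	size_t copy_len;

	if (requested_offset < 0 || source_offset < 0)
		return -1;
	if (requested_offset < source_offset)
		return -1; /* assumption: treated as invalid in this sketch */

	w = requested_offset - source_offset;	/* w_offset_in_s */
	z = (long long)src_len;			/* z_offset_in_s */
	if (w >= z)
		return 0;			/* request starts past the window */

	v = w + (long long)dest_len;		/* v_offset_in_s */
	u = (z < v) ? z : v;			/* u_offset_in_s */
	copy_len = (size_t)(u - w);

	memcpy(dest, (const char *)src + w, copy_len);
	return (long)copy_len;
}
```

With a 10-byte window "abcdefghij" at source_offset 100, a 3-byte request at
offset 104 copies "efg", while an 8-byte request at offset 108 copies only the
2 bytes "ij" that remain in the window.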
