Re: [PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-05-22 Thread Ian Munsie
Hi Cody,

I just tried building this with gcc 4.5, which failed with the following
warning (treated as an error):

cc1: warnings being treated as errors
arch/powerpc/perf/hv-24x7.c: In function 'single_24x7_request':
arch/powerpc/perf/hv-24x7.c:346:1: error: the frame size of 8192 bytes is 
larger than 2048 bytes
make[3]: *** [arch/powerpc/perf/hv-24x7.o] Error 1
make[2]: *** [arch/powerpc/perf] Error 2

My .config has CONFIG_FRAME_WARN=2048 (default on 64bit), but the
alignment constraints in this function may require 8K on the stack -
possibly a bit large?


Notably for some reason this warning no longer seems to trigger on gcc
4.8 (or at least somewhere between 4.5-4.8), though the assembly does
still show it aligning the buffers.


Excerpts from Cody P Schafer's message of 2014-03-06 11:01:08 +1100:
 +static unsigned long single_24x7_request(u8 domain, u32 offset, u16 ix,
snip
 +/*
 + * request_buffer and result_buffer are not required to be 4k aligned,
 + * but are not allowed to cross any 4k boundary. Aligning them to 4k is
 + * the simplest way to ensure that.
 + */
 +struct reqb {
 +struct hv_24x7_request_buffer buf;
 +struct hv_24x7_request req;
 +} __packed __aligned(4096) request_buffer = {
snip
 +struct resb {
 +struct hv_24x7_data_result_buffer buf;
 +struct hv_24x7_result res;
 +struct hv_24x7_result_element elem;
 +__be64 result;
 +} __packed __aligned(4096) result_buffer = {};
snip


Cheers,
-Ian

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-05-22 Thread Cody P Schafer

On 05/22/2014 01:19 AM, Ian Munsie wrote:

Hi Cody,

I just tried building this with gcc 4.5, which failed with the following
warning (treated as an error):

cc1: warnings being treated as errors
arch/powerpc/perf/hv-24x7.c: In function 'single_24x7_request':
arch/powerpc/perf/hv-24x7.c:346:1: error: the frame size of 8192 bytes is 
larger than 2048 bytes
make[3]: *** [arch/powerpc/perf/hv-24x7.o] Error 1
make[2]: *** [arch/powerpc/perf] Error 2

My .config has CONFIG_FRAME_WARN=2048 (default on 64bit), but the
alignment constraints in this function may require 8K on the stack -
possibly a bit large?



Yep, it is a bit large. In other places in hv-24x7 that use similar 
firmware interfaces (with similar alignment requirements), I've used a 
kmem_cache (hv_page_cache). Testing out a patch that uses that here as well.




Notably for some reason this warning no longer seems to trigger on gcc
4.8 (or at least somewhere between 4.5-4.8), though the assembly does
still show it aligning the buffers.


That's a bit concerning (and might be why I didn't pick it up, using gcc 
4.9.0 over here). Looking at the gcc docs, it seems to indicate that 
alloca() and VLAs aren't counted for -Wframe-larger-than. Perhaps gcc 
decided to move locally defined structures with alignment requirements 
into that same bucket? (while size of the structures is statically 
determinable, the stack consumption due to alignment is [to some degree] 
variable).


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-03-26 Thread David Laight
From: Cody P Schafer
 On 03/25/2014 03:43 AM, Anton Blanchard wrote:
 
  Hi Cody,
 
  hv-24x7: could not obtain capabilities, error 0x
 fffe, not enabling
  hv-gpci: could not obtain capabilities, error 0x
 fffe, not enabling
 
  +  pr_info(could not obtain capabilities, error 0x%80lx, not 
  enabling\n,
 
  That's a lot of padding :)
 
  I think this should also be a pr_debug, considering this is not relevant
  to most ppc64 boxes.
 
 Yep, s/info/debug/ makes sense. The format should have been %08lx not
 %80lx, not sure when I screwed that up.

Using %#x is an alternative - unless you really need fixed width.

However in this case %ld might be more appropriate!
If the value is a -ve errno one, maybe even errno %d and negate the value.

David

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-03-25 Thread Anton Blanchard

Hi Cody,

hv-24x7: could not obtain capabilities, error 0x
fffe, not enabling
hv-gpci: could not obtain capabilities, error 0x
fffe, not enabling

 + pr_info(could not obtain capabilities, error 0x%80lx, not 
 enabling\n,

That's a lot of padding :)

I think this should also be a pr_debug, considering this is not relevant
to most ppc64 boxes.

Anton
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-03-25 Thread Cody P Schafer

On 03/25/2014 03:43 AM, Anton Blanchard wrote:


Hi Cody,

hv-24x7: could not obtain capabilities, error 0x
fffe, not enabling
hv-gpci: could not obtain capabilities, error 0x
fffe, not enabling


+   pr_info(could not obtain capabilities, error 0x%80lx, not 
enabling\n,


That's a lot of padding :)

I think this should also be a pr_debug, considering this is not relevant
to most ppc64 boxes.


I'm fine with that. It should probably be 0x%08lx not 0x%80lx, not 
sure when I screwed that up.


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-03-25 Thread Cody P Schafer

On 03/25/2014 03:43 AM, Anton Blanchard wrote:


Hi Cody,

hv-24x7: could not obtain capabilities, error 0x
fffe, not enabling
hv-gpci: could not obtain capabilities, error 0x
fffe, not enabling


+   pr_info(could not obtain capabilities, error 0x%80lx, not 
enabling\n,


That's a lot of padding :)

I think this should also be a pr_debug, considering this is not relevant
to most ppc64 boxes.


Yep, s/info/debug/ makes sense. The format should have been %08lx not 
%80lx, not sure when I screwed that up.


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v4 09/11] powerpc/perf: add support for the hv 24x7 interface

2014-03-05 Thread Cody P Schafer
This provides a basic interface between hv_24x7 and perf. Similar to
the one provided for gpci, it lacks transaction support and does not
list any events.

Example usage via perf tool:

perf stat -e 
'hv_24x7/domain=2,offset=8,starting_index=0,lpar=0x/' -r 0 -C 0 -x ' ' 
sleep 0.1

Signed-off-by: Cody P Schafer c...@linux.vnet.ibm.com
---
 arch/powerpc/perf/hv-24x7.c | 493 
 1 file changed, 493 insertions(+)
 create mode 100644 arch/powerpc/perf/hv-24x7.c

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
new file mode 100644
index 000..81d68b6
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -0,0 +1,493 @@
+/*
+ * Hypervisor supplied 24x7 performance counter support
+ *
+ * Author: Cody P Schafer c...@linux.vnet.ibm.com
+ * Copyright 2014 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) hv-24x7:  fmt
+
+#include linux/perf_event.h
+#include linux/module.h
+#include linux/slab.h
+#include asm/firmware.h
+#include asm/hvcall.h
+#include asm/io.h
+
+#include hv-24x7.h
+#include hv-24x7-catalog.h
+#include hv-common.h
+
+/*
+ * TODO: Merging events:
+ * - Think of the hcall as an interface to a 4d array of counters:
+ *   - x = domains
+ *   - y = indexes in the domain (core, chip, vcpu, node, etc)
+ *   - z = offset into the counter space
+ *   - w = lpars (guest vms, logical partitions)
+ * - A single request is: x,y,y_last,z,z_last,w,w_last
+ *   - this means we can retrieve a rectangle of counters in y,z for a single 
x.
+ *
+ * - Things to consider (ignoring w):
+ *   - input  cost_per_request = 16
+ *   - output cost_per_result(ys,zs)  = 8 + 8 * ys + ys * zs
+ *   - limited number of requests per hcall (must fit into 4K bytes)
+ * - 4k = 16 [buffer header] - 16 [request size] * request_count
+ * - 255 requests per hcall
+ *   - sometimes it will be more efficient to read extra data and discard
+ */
+
+PMU_FORMAT_RANGE(domain, config, 0, 3); /* u3 0-6, one of HV_24X7_PERF_DOMAIN 
*/
+PMU_FORMAT_RANGE(starting_index, config, 16, 31); /* u16 */
+PMU_FORMAT_RANGE(offset, config, 32, 63); /* u32, see data_offset */
+PMU_FORMAT_RANGE(lpar, config1, 0, 15); /* u16 */
+
+PMU_FORMAT_RANGE_RESERVED(reserved1, config,   4, 15);
+PMU_FORMAT_RANGE_RESERVED(reserved2, config1, 16, 63);
+PMU_FORMAT_RANGE_RESERVED(reserved3, config2,  0, 63);
+
+static struct attribute *format_attrs[] = {
+   format_attr_domain.attr,
+   format_attr_offset.attr,
+   format_attr_starting_index.attr,
+   format_attr_lpar.attr,
+   NULL,
+};
+
+static struct attribute_group format_group = {
+   .name = format,
+   .attrs = format_attrs,
+};
+
+/*
+ * read_offset_data - copy data from one buffer to another while treating the
+ *source buffer as a small view on the total avaliable
+ *source data.
+ *
+ * @dest: buffer to copy into
+ * @dest_len: length of @dest in bytes
+ * @requested_offset: the offset within the source data we want. Must be  0
+ * @src: buffer to copy data from
+ * @src_len: length of @src in bytes
+ * @source_offset: the offset in the sorce data that (src,src_len) refers to.
+ * Must be  0
+ *
+ * returns the number of bytes copied.
+ *
+ * The following ascii art shows the various buffer possitioning we need to
+ * handle, assigns some arbitrary varibles to points on the buffer, and then
+ * shows how we fiddle with those values to get things we care about (copy
+ * start in src and copy len)
+ *
+ * s = @src buffer
+ * d = @dest buffer
+ * '.' areas in d are written to.
+ *
+ *   u
+ *   x wv  z
+ * d   |.|
+ * s |--|
+ *
+ *  u
+ *   x w   z v
+ * d   |--|
+ * s |--|
+ *
+ *   x wu,z,v
+ * d   ||
+ * s |--|
+ *
+ *   x,wu,v,z
+ * d |..|
+ * s |--|
+ *
+ *   xu
+ *   wvz
+ * d ||
+ * s |--|
+ *
+ *   x  z   w  v
+ * d|--|
+ * s |--|
+ *
+ * x = source_offset
+ * w = requested_offset
+ * z = source_offset + src_len
+ * v = requested_offset + dest_len
+ *
+ * w_offset_in_s = w - x = requested_offset - source_offset
+ * z_offset_in_s = z - x = src_len
+ * v_offset_in_s = v - x = request_offset + dest_len - src_len
+ */
+static ssize_t read_offset_data(void *dest, size_t dest_len,
+   loff_t requested_offset, void *src,
+   size_t src_len, loff_t source_offset)
+{
+   size_t w_offset_in_s = requested_offset -