Re: [PATCH] perf stat: fix csv output format
On Mon, Mar 05, 2018 at 10:43:53PM -0800, Cong Wang wrote:
> From: Ilya Pronin
>
> When printing stats in CSV mode, perf stat appends extra CSV
> separators when a counter is not supported:
>
> <not supported>,,L1-dcache-store-misses,mesos/bd442f34-2b4a-47df-b966-9b281f9f56fc,0,100.00
>
> which causes a failure when parsing fields. The number of separators
> should be fixed for each line, whether or not the counter is supported.
>
> Fixes: 92a61f6412d3 ("perf stat: Implement CSV metrics output")
> Cc: Andi Kleen
> Cc: Arnaldo Carvalho de Melo
> Cc: Jiri Olsa
> Signed-off-by: Ilya Pronin
> Signed-off-by: Cong Wang
> ---
>  tools/perf/builtin-stat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 98bf9d32f222..54a4c152edb3 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -917,7 +917,7 @@ static void print_metric_csv(void *ctx,
>  	char buf[64], *vals, *ends;
>  
>  	if (unit == NULL || fmt == NULL) {
> -		fprintf(out, "%s%s%s%s", csv_sep, csv_sep, csv_sep, csv_sep);
> +		fprintf(out, "%s%s", csv_sep, csv_sep);
>  		return;
>  	}

Right, the non-else leg prints just 2 values:

	fprintf(out, "%s%s%s%s", csv_sep, vals, csv_sep, unit);

Acked-by: Jiri Olsa

thanks,
jirka
Re: [PATCH v4 3/3] mm/free_pcppages_bulk: prefetch buddy while not holding lock
On 03/05/2018 12:41 PM, Aaron Lu wrote:
> On Fri, Mar 02, 2018 at 06:55:25PM +0100, Vlastimil Babka wrote:
>> On 03/01/2018 03:00 PM, Michal Hocko wrote:
>>>
>>> I am really surprised that this has such a big impact.
>>
>> It's even stranger to me. Struct page is 64 bytes these days, exactly
>> a cache line. Unless that changed, Intel CPUs prefetched a "buddy"
>> cache line (that forms an aligned 128 bytes block with the one we
>> touch). Which is exactly a order-0 buddy struct page! Maybe that
>> implicit prefetching stopped at L2 and explicit goes all the way to
>> L1, can't remember.
>
> The Intel Architecture Optimization Manual section 7.3.2 says:
>
> prefetchT0 - fetch data into all cache levels
> Intel Xeon Processors based on Nehalem, Westmere, Sandy Bridge and
> newer microarchitectures: 1st, 2nd and 3rd level cache.
>
> prefetchT2 - fetch data into 2nd and 3rd level caches (identical to
> prefetchT1)
> Intel Xeon Processors based on Nehalem, Westmere, Sandy Bridge and
> newer microarchitectures: 2nd and 3rd level cache.
>
> prefetchNTA - fetch data into non-temporal cache close to the
> processor, minimizing cache pollution
> Intel Xeon Processors based on Nehalem, Westmere, Sandy Bridge and
> newer microarchitectures: must fetch into 3rd level cache with fast
> replacement.
>
> I tried 'prefetcht0' and 'prefetcht2' instead of the default
> 'prefetchNTA' on a 2-socket Intel Skylake; the two ended up with about
> the same performance number as prefetchNTA. I had expected prefetchT0
> to deliver a better score if it was indeed due to L1D, since
> prefetchT2 will not place data into L1 while prefetchT0 will, but it
> looks like that is not the case here.
>
> It feels more like the buddy cacheline isn't in any level of the
> caches without prefetch for some reason.

So the adjacent line prefetch might be disabled? Could you check BIOS
or the MSR mentioned in
https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors

>> Would that make such a difference? It would be nice to do some
>> perf tests with cache counters to see what is really going on...
>
> Comparing prefetchT2 to no-prefetch, I saw these metrics change:
>
> no-prefetch    change     prefetchT2        metrics
> 0.18           +0.0       0.18              perf-stat.branch-miss-rate%
> 8.268e+09      +3.8%      8.585e+09         perf-stat.branch-misses
> 2.333e+10      +4.7%      2.443e+10         perf-stat.cache-misses
> 2.402e+11      +5.0%      2.522e+11         perf-stat.cache-references
> 3.52           -1.1%      3.48              perf-stat.cpi
> 0.02           -0.0       0.01 ±3%          perf-stat.dTLB-load-miss-rate%
> 8.677e+08      -7.3%      8.048e+08 ±3%     perf-stat.dTLB-load-misses
> 1.18           +0.0       1.19              perf-stat.dTLB-store-miss-rate%
> 2.359e+10      +6.0%      2.502e+10         perf-stat.dTLB-store-misses
> 1.979e+12      +5.0%      2.078e+12         perf-stat.dTLB-stores
> 6.126e+09      +10.1%     6.745e+09 ±3%     perf-stat.iTLB-load-misses
> 3464           -8.4%      3172 ±3%          perf-stat.instructions-per-iTLB-miss
> 0.28           +1.1%      0.29              perf-stat.ipc
> 2.929e+09      +5.1%      3.077e+09         perf-stat.minor-faults
> 9.244e+09      +4.7%      9.681e+09         perf-stat.node-loads
> 2.491e+08      +5.8%      2.634e+08         perf-stat.node-store-misses
> 6.472e+09      +6.1%      6.869e+09         perf-stat.node-stores
> 2.929e+09      +5.1%      3.077e+09         perf-stat.page-faults
> 2182469        -4.2%      2090977           perf-stat.path-length
>
> Not sure if this is useful though...

Looks like most stats increased in absolute values as the work done
increased, and this is a time-limited benchmark? Although the number of
instructions (calculated from iTLB misses and insns-per-iTLB-miss)
shows less than 1% increase, so dunno. And the improvement comes from
reduced dTLB-load-misses? That makes no sense for order-0 buddy struct
pages, which always share a page. And the memmap mapping should use
huge pages.

BTW what is path-length?
Re: [PATCH v2 0/2] perf sched map: re-annotate shortname if thread comm changed
On Tue, Mar 06, 2018 at 11:37:35AM +0800, changbin...@intel.com wrote:
> From: Changbin Du
>
> v2:
>   o add a patch to move thread::shortname to thread_runtime
>   o add function perf_sched__process_comm() to process PERF_RECORD_COMM event.
>
> Changbin Du (2):
>   perf sched: move thread::shortname to thread_runtime
>   perf sched map: re-annotate shortname if thread comm changed

Acked-by: Jiri Olsa

thanks,
jirka

>  tools/perf/builtin-sched.c | 132 ++---
>  tools/perf/util/thread.h   |   1 -
>  2 files changed, 90 insertions(+), 43 deletions(-)
>
> --
> 2.7.4
Re: [PATCH] perf report: Provide libtraceevent with a kernel symbol resolver
On Thu, Feb 08, 2018 at 01:20:31PM +0100, Jiri Olsa wrote:
> On Mon, Jan 15, 2018 at 12:47:32PM +0800, Wang YanQing wrote:
> > So that beautifiers wanting to resolve kernel function addresses to
> > names can do their work, and when we use "perf report" for output of
> > "perf kmem record", we will get kernel symbol output.
> >
> > Signed-off-by: Wang YanQing
>
> Acked-by: Jiri Olsa

Hi, Arnaldo Carvalho de Melo.

What is the status of this patch now? Has it sunk to the bottom of
your mailbox?

Thanks!
[PATCH v4 2/3] Input: gpio-keys - allow setting wakeup event action in DT
Allow specifying event actions to trigger wakeup when using the
gpio-keys input device as a wakeup source.

Reviewed-by: Rob Herring
Signed-off-by: Jeffy Chen
---
Changes in v4: None
Changes in v3: None
Changes in v2:
  Specify wakeup event action instead of irq trigger type as Brian
  suggested.

 Documentation/devicetree/bindings/input/gpio-keys.txt | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/input/gpio-keys.txt b/Documentation/devicetree/bindings/input/gpio-keys.txt
index a94940481e55..996ce84352cb 100644
--- a/Documentation/devicetree/bindings/input/gpio-keys.txt
+++ b/Documentation/devicetree/bindings/input/gpio-keys.txt
@@ -26,6 +26,14 @@ Optional subnode-properties:
 	  If not specified defaults to 5.
 	- wakeup-source: Boolean, button can wake-up the system.
 			 (Legacy property supported: "gpio-key,wakeup")
+	- wakeup-event-action: Specifies whether the key should wake the
+	  system when asserted, when deasserted, or both. This property is
+	  only valid for keys that wake up the system (e.g., when the
+	  "wakeup-source" property is also provided).
+	  Supported values are defined in linux-event-codes.h:
+		EV_ACT_ASSERTED		- asserted
+		EV_ACT_DEASSERTED	- deasserted
+		EV_ACT_ANY		- both asserted and deasserted
 	- linux,can-disable: Boolean, indicates that button is connected
 	  to dedicated (not shared) interrupt which can be disabled to
 	  suppress events from the button.
--
2.11.0
[PATCH v4 3/3] arm64: dts: rockchip: kevin: Avoid wakeup when inserting the pen
Add a wakeup event action for the Pen Insert gpio key, to avoid waking
up when inserting the pen.

Signed-off-by: Jeffy Chen
Tested-by: Enric Balletbo i Serra
---
Changes in v4:
  Include dt-binding gpio-keys.h
Changes in v3: None
Changes in v2:
  Specify wakeup event action instead of irq trigger type as Brian
  suggested.

 arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts b/arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts
index 191a6bcb1704..89126dbe5d91 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts
@@ -44,6 +44,7 @@
 /dts-v1/;
 #include "rk3399-gru.dtsi"
+#include <dt-bindings/input/gpio-keys.h>
 #include
 
 /*
@@ -134,6 +135,8 @@
 		gpios = <&gpio0 13 GPIO_ACTIVE_LOW>;
 		linux,code = ;
 		linux,input-type = ;
+		/* Wakeup only when ejecting */
+		wakeup-event-action = <EV_ACT_DEASSERTED>;
 		wakeup-source;
 	};
 };
--
2.11.0
Re: [PATCH 1/3 RESEND] tpm: add longer timeouts for creation commands.
On Mon, Mar 05, 2018 at 01:09:09PM +, Winkler, Tomas wrote:
> Why do you need a cover letter? What are you missing in the patch
> description?

If you submit a *patch set* I *require* a cover letter, yes.

/Jarkko
[PATCH v4 1/3] Input: gpio-keys - add support for wakeup event action
Add support for specifying event actions to trigger wakeup when using
the gpio-keys input device as a wakeup source. This would allow the
device to configure when to wake up the system. For example, a
gpio-keys input device for pen insert may only want to wake up the
system when ejecting the pen.

Suggested-by: Brian Norris
Signed-off-by: Jeffy Chen
---
Changes in v4:
  Add dt-binding gpio-keys.h, stop saving irq trigger type, add
  enable/disable wakeup helpers as Dmitry suggested.
Changes in v3:
  Adding more comments as Brian suggested.
Changes in v2:
  Specify wakeup event action instead of irq trigger type as Brian
  suggested.

 drivers/input/keyboard/gpio_keys.c    | 67 +++++++++++++++++++++++++++--
 include/dt-bindings/input/gpio-keys.h | 13 +++++++
 include/linux/gpio_keys.h             |  2 ++
 3 files changed, 79 insertions(+), 3 deletions(-)
 create mode 100644 include/dt-bindings/input/gpio-keys.h

diff --git a/drivers/input/keyboard/gpio_keys.c b/drivers/input/keyboard/gpio_keys.c
index 87e613dc33b8..4bc23648b6a7 100644
--- a/drivers/input/keyboard/gpio_keys.c
+++ b/drivers/input/keyboard/gpio_keys.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 
 struct gpio_button_data {
 	const struct gpio_keys_button *button;
@@ -45,10 +46,12 @@ struct gpio_button_data {
 	unsigned int software_debounce;	/* in msecs, for GPIO-driven buttons */
 	unsigned int irq;
+	unsigned int wakeup_trigger_type;
 	spinlock_t lock;
 	bool disabled;
 	bool key_pressed;
 	bool suspended;
+	bool wakeup_enabled;
 };
 
 struct gpio_keys_drvdata {
@@ -540,6 +543,8 @@ static int gpio_keys_setup_key(struct platform_device *pdev,
 	}
 
 	if (bdata->gpiod) {
+		int active_low = gpiod_is_active_low(bdata->gpiod);
+
 		if (button->debounce_interval) {
 			error = gpiod_set_debounce(bdata->gpiod,
 					button->debounce_interval * 1000);
@@ -568,6 +573,24 @@ static int gpio_keys_setup_key(struct platform_device *pdev,
 
 		isr = gpio_keys_gpio_isr;
 		irqflags = IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING;
+
+		switch (button->wakeup_event_action) {
+		case EV_ACT_ASSERTED:
+			bdata->wakeup_trigger_type = active_low ?
+				IRQ_TYPE_EDGE_FALLING : IRQ_TYPE_EDGE_RISING;
+			break;
+		case EV_ACT_DEASSERTED:
+			bdata->wakeup_trigger_type = active_low ?
+				IRQ_TYPE_EDGE_RISING : IRQ_TYPE_EDGE_FALLING;
+			break;
+		case EV_ACT_ANY:
+			/* fall through */
+		default:
+			/*
+			 * For other cases, we are OK letting suspend/resume
+			 * not reconfigure the trigger type.
+			 */
+			break;
+		}
 	} else {
 		if (!button->irq) {
 			dev_err(dev, "Found button without gpio or irq\n");
@@ -586,6 +609,11 @@ static int gpio_keys_setup_key(struct platform_device *pdev,
 
 		isr = gpio_keys_irq_isr;
 		irqflags = 0;
+
+		/*
+		 * For IRQ buttons, there is no interrupt for release.
+		 * So we don't need to reconfigure the trigger type for wakeup.
+		 */
 	}
 
 	bdata->code = &ddata->keymap[idx];
@@ -718,6 +746,9 @@ gpio_keys_get_devtree_pdata(struct device *dev)
 			/* legacy name */
 			fwnode_property_read_bool(child, "gpio-key,wakeup");
 
+		fwnode_property_read_u32(child, "wakeup-event-action",
+					 &button->wakeup_event_action);
+
 		button->can_disable =
 			fwnode_property_read_bool(child, "linux,can-disable");
 
@@ -845,6 +876,31 @@ static int gpio_keys_probe(struct platform_device *pdev)
 	return 0;
 }
 
+static int gpio_keys_enable_wakeup(struct gpio_button_data *bdata)
+{
+	int ret;
+
+	ret = enable_irq_wake(bdata->irq);
+	if (ret)
+		return ret;
+
+	if (bdata->wakeup_trigger_type)
+		irq_set_irq_type(bdata->irq, bdata->wakeup_trigger_type);
+
+	return 0;
+}
+
+static void gpio_keys_disable_wakeup(struct gpio_button_data *bdata)
+{
+	/*
+	 * The trigger type is always both edges for gpio-based keys and we do
+	 * not support changing wakeup trigger for interrupt-based keys.
+	 */
+	if (bdata->wakeup_trigger_type)
+		irq_set_irq_type(bdata->irq, IRQ_TYPE_EDGE_BOTH);
+	disable_irq_wake(bdata->irq);
+}
+
 static int __maybe_unused gpio_keys_suspend(struct device *dev)
 {
 	struct gpio_keys_drvdata *ddata = dev_get_drvdata(dev);
@@ -854,8 +910,10 @@ static int
[PATCH v4 0/3] gpio-keys: Add support for specifying wakeup event action
On chromebook kevin, we are using gpio-keys for the pen insert event,
but we only want it to wake up the system when ejecting the pen. So we
may need to change the interrupt trigger type during suspend.

Changes in v4:
  Add dt-binding gpio-keys.h, stop saving irq trigger type, add
  enable/disable wakeup helpers as Dmitry suggested.
  Include dt-binding gpio-keys.h
Changes in v3:
  Adding more comments as Brian suggested.
Changes in v2:
  Specify wakeup event action instead of irq trigger type as Brian
  suggested.

Jeffy Chen (3):
  Input: gpio-keys - add support for wakeup event action
  Input: gpio-keys - allow setting wakeup event action in DT
  arm64: dts: rockchip: kevin: Avoid wakeup when inserting the pen

 .../devicetree/bindings/input/gpio-keys.txt       |  8 +++
 arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts |  3 +
 drivers/input/keyboard/gpio_keys.c                | 67 +++++++++++++++++++-
 include/dt-bindings/input/gpio-keys.h             | 13 +++++
 include/linux/gpio_keys.h                         |  2 +
 5 files changed, 90 insertions(+), 3 deletions(-)
 create mode 100644 include/dt-bindings/input/gpio-keys.h

--
2.11.0
Re: [PATCH v3 1/3] Input: gpio-keys - add support for wakeup event action
Hi Dmitry,

Thanks for your review.

On 03/06/2018 08:30 AM, Dmitry Torokhov wrote:
>> +		switch (button->wakeup_event_action) {
>> +		case EV_ACT_ASSERTED:
>> +			bdata->wakeup_trigger_type = active_low ?
>> +				IRQF_TRIGGER_FALLING : IRQF_TRIGGER_RISING;
>
> 				IRQ_TYPE_EDGE_FALLING : IRQ_TYPE_EDGE_RISING;

ok, will fix in next version

>> +			break;
>> +		case EV_ACT_DEASSERTED:
>> +			bdata->wakeup_trigger_type = active_low ?
>> +				IRQF_TRIGGER_RISING : IRQF_TRIGGER_FALLING;
>> +			break;
>
> 		case EV_ACT_ANY:

ok, will fix in next version

>> +		default:
>> +			/*
>> +			 * For other cases, we are OK letting suspend/resume
>> +			 * not reconfigure the trigger type.
>> +			 */
>> +			break;
>> +		}
>> 	} else {
>> 		if (!button->irq) {
>> 			dev_err(dev, "Found button without gpio or irq\n");
>> @@ -586,6 +606,12 @@ static int gpio_keys_setup_key(struct platform_device *pdev,
>>
>> 		isr = gpio_keys_irq_isr;
>> 		irqflags = 0;
>> +
>> +		/*
>> +		 * For IRQ buttons, the irq trigger type for press and release
>> +		 * are the same. So we don't need to reconfigure the trigger
>> +		 * type for wakeup.
>
> That is not entirely accurate. Interrupt triggers button press, which
> is followed by either immediate or delayed release. There is no
> interrupt for release.

ok, will fix the comment

>> +		 */
>> 	}
>>
>> 	bdata->code = &ddata->keymap[idx];
>> @@ -618,6 +644,8 @@ static int gpio_keys_setup_key(struct platform_device *pdev,
>> 		return error;
>> 	}
>>
>> +	bdata->irq_trigger_type = irq_get_trigger_type(bdata->irq);
>
> Why do we need to store the trigger type? It is always both edges for
> gpio-based keys and we do not support changing wakeup trigger for
> interrupt-based keys.

right, this is not needed.

>> +
>> 	return 0;
>>  }
>>
>> @@ -718,6 +746,9 @@ gpio_keys_get_devtree_pdata(struct device *dev)
>> 		/* legacy name */
>> 		fwnode_property_read_bool(child, "gpio-key,wakeup");
>>
>> +	fwnode_property_read_u32(child, "wakeup-event-action",
>> +				 &button->wakeup_event_action);
>> +
>> 	button->can_disable =
>> 		fwnode_property_read_bool(child, "linux,can-disable");
>>
>> @@ -854,6 +885,10 @@ static int __maybe_unused gpio_keys_suspend(struct device *dev)
>> 	if (device_may_wakeup(dev)) {
>> 		for (i = 0; i < ddata->pdata->nbuttons; i++) {
>> 			struct gpio_button_data *bdata = &ddata->data[i];
>> +
>> +			if (bdata->button->wakeup && bdata->wakeup_trigger_type)
>> +				irq_set_irq_type(bdata->irq,
>> +						 bdata->wakeup_trigger_type);
>
> 	if (bdata->button->wakeup) {
> 		if (bdata->wakeup_trigger_type) {
> 			error = ...;
> 		}
> 		enable_irq_wake(bdata->irq);
> 	}
>
> Might need to be split into a helper; if you add error handling to
> enable_irq_wake() that would be great too.

ok, will do that.

>> 			if (bdata->button->wakeup)
>> 				enable_irq_wake(bdata->irq);
>> 			bdata->suspended = true;
>> @@ -878,6 +913,10 @@ static int __maybe_unused gpio_keys_resume(struct device *dev)
>> 	if (device_may_wakeup(dev)) {
>> 		for (i = 0; i < ddata->pdata->nbuttons; i++) {
>> 			struct gpio_button_data *bdata = &ddata->data[i];
>> +
>> +			if (bdata->button->wakeup && bdata->wakeup_trigger_type)
>> +				irq_set_irq_type(bdata->irq,
>> +						 bdata->irq_trigger_type);
>
> Just use IRQ_TYPE_EDGE_BOTH.

>> 			if (bdata->button->wakeup)
>> 				disable_irq_wake(bdata->irq);
>> 			bdata->suspended = false;
>> diff --git a/include/linux/gpio_keys.h b/include/linux/gpio_keys.h
>> index d06bf77400f1..7160df54a6fe 100644
>> --- a/include/linux/gpio_keys.h
>> +++ b/include/linux/gpio_keys.h
>> @@ -13,6 +13,7 @@ struct device;
>>   * @desc:		label that will be attached to button's gpio
>>   * @type:		input event type (%EV_KEY, %EV_SW, %EV_ABS)
>>   * @wakeup:		configure the button as a wake-up source
>> + * @wakeup_event_action:	event action to trigger wakeup
>>   * @debounce_interval:	debounce ticks interval in msecs
>>   * @can_disable:	%true indicates that userspace is allowed to
>>   *			disable button via sysfs
>> @@ -26,6 +27,7 @@ struct gpio_keys_button {
>> 	const char *desc;
>> 	unsigned int type;
>> 	int wakeup;
>> +	int wakeup_even
Re: [PATCH 8/9] drm/xen-front: Implement GEM operations
On 03/06/2018 09:26 AM, Daniel Vetter wrote: On Mon, Mar 05, 2018 at 03:46:07PM +0200, Oleksandr Andrushchenko wrote: On 03/05/2018 11:32 AM, Daniel Vetter wrote: On Wed, Feb 21, 2018 at 10:03:41AM +0200, Oleksandr Andrushchenko wrote: From: Oleksandr Andrushchenko Implement GEM handling depending on driver mode of operation: depending on the requirements for the para-virtualized environment, namely requirements dictated by the accompanying DRM/(v)GPU drivers running in both host and guest environments, number of operating modes of para-virtualized display driver are supported: - display buffers can be allocated by either frontend driver or backend - display buffers can be allocated to be contiguous in memory or not Note! Frontend driver itself has no dependency on contiguous memory for its operation. 1. Buffers allocated by the frontend driver. The below modes of operation are configured at compile-time via frontend driver's kernel configuration. 1.1. Front driver configured to use GEM CMA helpers This use-case is useful when used with accompanying DRM/vGPU driver in guest domain which was designed to only work with contiguous buffers, e.g. DRM driver based on GEM CMA helpers: such drivers can only import contiguous PRIME buffers, thus requiring frontend driver to provide such. In order to implement this mode of operation para-virtualized frontend driver can be configured to use GEM CMA helpers. 1.2. Front driver doesn't use GEM CMA If accompanying drivers can cope with non-contiguous memory then, to lower pressure on CMA subsystem of the kernel, driver can allocate buffers from system memory. Note! If used with accompanying DRM/(v)GPU drivers this mode of operation may require IOMMU support on the platform, so accompanying DRM/vGPU hardware can still reach display buffer memory while importing PRIME buffers from the frontend driver. 2. 
Buffers allocated by the backend This mode of operation is run-time configured via guest domain configuration through XenStore entries. For systems which do not provide IOMMU support, but having specific requirements for display buffers it is possible to allocate such buffers at backend side and share those with the frontend. For example, if host domain is 1:1 mapped and has DRM/GPU hardware expecting physically contiguous memory, this allows implementing zero-copying use-cases. Note! Configuration options 1.1 (contiguous display buffers) and 2 (backend allocated buffers) are not supported at the same time. Signed-off-by: Oleksandr Andrushchenko Some suggestions below for some larger cleanup work. -Daniel --- drivers/gpu/drm/xen/Kconfig | 13 + drivers/gpu/drm/xen/Makefile| 6 + drivers/gpu/drm/xen/xen_drm_front.h | 74 ++ drivers/gpu/drm/xen/xen_drm_front_drv.c | 80 ++- drivers/gpu/drm/xen/xen_drm_front_drv.h | 1 + drivers/gpu/drm/xen/xen_drm_front_gem.c | 360 drivers/gpu/drm/xen/xen_drm_front_gem.h | 46 drivers/gpu/drm/xen/xen_drm_front_gem_cma.c | 93 +++ 8 files changed, 667 insertions(+), 6 deletions(-) create mode 100644 drivers/gpu/drm/xen/xen_drm_front_gem.c create mode 100644 drivers/gpu/drm/xen/xen_drm_front_gem.h create mode 100644 drivers/gpu/drm/xen/xen_drm_front_gem_cma.c diff --git a/drivers/gpu/drm/xen/Kconfig b/drivers/gpu/drm/xen/Kconfig index 4cca160782ab..4f4abc91f3b6 100644 --- a/drivers/gpu/drm/xen/Kconfig +++ b/drivers/gpu/drm/xen/Kconfig @@ -15,3 +15,16 @@ config DRM_XEN_FRONTEND help Choose this option if you want to enable a para-virtualized frontend DRM/KMS driver for Xen guest OSes. + +config DRM_XEN_FRONTEND_CMA + bool "Use DRM CMA to allocate dumb buffers" + depends on DRM_XEN_FRONTEND + select DRM_KMS_CMA_HELPER + select DRM_GEM_CMA_HELPER + help + Use DRM CMA helpers to allocate display buffers. + This is useful for the use-cases when guest driver needs to + share or export buffers to other drivers which only expect + contiguous buffers. 
+ Note: in this mode driver cannot use buffers allocated + by the backend. diff --git a/drivers/gpu/drm/xen/Makefile b/drivers/gpu/drm/xen/Makefile index 4fcb0da1a9c5..12376ec78fbc 100644 --- a/drivers/gpu/drm/xen/Makefile +++ b/drivers/gpu/drm/xen/Makefile @@ -8,4 +8,10 @@ drm_xen_front-objs := xen_drm_front.o \ xen_drm_front_shbuf.o \ xen_drm_front_cfg.o +ifeq ($(CONFIG_DRM_XEN_FRONTEND_CMA),y) + drm_xen_front-objs += xen_drm_front_gem_cma.o +else + drm_xen_front-objs += xen_drm_front_gem.o +endif + obj-$(CONFIG_DRM_XEN_FRONTEND) += drm_xen_front.o diff --git a/drivers/gpu/drm/xen/xen_drm_front.h b/drivers/gpu/drm/xen/xen_drm_front.h index 9ed5bfb248d0..c6f52c892434 100644 --- a/drivers/gpu/drm/xen/xen_drm_front
Re: [PATCH 1/4] drm/atomic: integrate modeset lock with private objects
On Tue, Mar 06, 2018 at 08:29:20AM +0100, Daniel Vetter wrote: > On Wed, Feb 21, 2018 at 04:19:40PM +0100, Maarten Lankhorst wrote: > > Hey, > > > > Op 21-02-18 om 15:37 schreef Rob Clark: > > > Follow the same pattern of locking as with other state objects. This > > > avoids boilerplate in the driver. > > I'm afraid this will prohibit any uses of this on i915, since it still uses > > legacy lock_all(). > > > > Oh well, afaict nothing in i915 uses private objects, so I don't think it's > > harmful. :) > > We do use private objects, as part of dp mst helpers. But I also thought > that the only users left of lock_all are in the debugfs code, where this > really doesn't matter all that much. Correction, we use it in other places than debugfs. But thanks to Ville's private state obj refactoring we now have drm_atomic_private_obj_init(), so it's easy to add all the private state objects to a new list in drm_dev->mode_config.private_states or so, and use that list in drm_modeset_lock_all_ctx to also take driver private locks. I think that would actually be useful in other places, just in case. -Daniel > > > Could you cc intel-gfx just in case? > > Yeah, best to double-check with CI. 
> > > > Signed-off-by: Rob Clark > > > --- > > > drivers/gpu/drm/drm_atomic.c | 9 - > > > include/drm/drm_atomic.h | 5 + > > > 2 files changed, 13 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c > > > index fc8c4da409ff..004e621ab307 100644 > > > --- a/drivers/gpu/drm/drm_atomic.c > > > +++ b/drivers/gpu/drm/drm_atomic.c > > > @@ -1078,6 +1078,8 @@ drm_atomic_private_obj_init(struct drm_private_obj > > > *obj, > > > { > > > memset(obj, 0, sizeof(*obj)); > > > > > > + drm_modeset_lock_init(&obj->lock); > > > + > > > obj->state = state; > > > obj->funcs = funcs; > > > } > > > @@ -1093,6 +1095,7 @@ void > > > drm_atomic_private_obj_fini(struct drm_private_obj *obj) > > > { > > > obj->funcs->atomic_destroy_state(obj, obj->state); > > > + drm_modeset_lock_fini(&obj->lock); > > > } > > > EXPORT_SYMBOL(drm_atomic_private_obj_fini); > > > > > > @@ -1113,7 +1116,7 @@ struct drm_private_state * > > > drm_atomic_get_private_obj_state(struct drm_atomic_state *state, > > >struct drm_private_obj *obj) > > > { > > > - int index, num_objs, i; > > > + int index, num_objs, i, ret; > > > size_t size; > > > struct __drm_private_objs_state *arr; > > > struct drm_private_state *obj_state; > > > @@ -1122,6 +1125,10 @@ drm_atomic_get_private_obj_state(struct > > > drm_atomic_state *state, > > > if (obj == state->private_objs[i].ptr) > > > return state->private_objs[i].state; > > > > > > + ret = drm_modeset_lock(&obj->lock, state->acquire_ctx); > > > + if (ret) > > > + return ERR_PTR(ret); > > > + > > > num_objs = state->num_private_objs + 1; > > > size = sizeof(*state->private_objs) * num_objs; > > > arr = krealloc(state->private_objs, size, GFP_KERNEL); > > > diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h > > > index 09076a625637..9ae53b73c9d2 100644 > > > --- a/include/drm/drm_atomic.h > > > +++ b/include/drm/drm_atomic.h > > > @@ -218,6 +218,11 @@ struct drm_private_state_funcs { > > > * 
&drm_modeset_lock is required to duplicate and update this object's > > > state. > > > */ > > > struct drm_private_obj { > > > + /** > > > + * @lock: Modeset lock to protect the state object. > > > + */ > > > + struct drm_modeset_lock lock; > > > + > > > /** > > >* @state: Current atomic state for this driver private object. > > >*/ > > > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 1/4] drm/atomic: integrate modeset lock with private objects
On Wed, Feb 21, 2018 at 06:36:54PM +0200, Ville Syrjälä wrote: > On Wed, Feb 21, 2018 at 11:17:21AM -0500, Rob Clark wrote: > > On Wed, Feb 21, 2018 at 10:54 AM, Ville Syrjälä > > wrote: > > > On Wed, Feb 21, 2018 at 10:36:06AM -0500, Rob Clark wrote: > > >> On Wed, Feb 21, 2018 at 10:27 AM, Ville Syrjälä > > >> wrote: > > >> > On Wed, Feb 21, 2018 at 10:20:03AM -0500, Rob Clark wrote: > > >> >> On Wed, Feb 21, 2018 at 10:07 AM, Ville Syrjälä > > >> >> wrote: > > >> >> > On Wed, Feb 21, 2018 at 09:54:49AM -0500, Rob Clark wrote: > > >> >> >> On Wed, Feb 21, 2018 at 9:49 AM, Ville Syrjälä > > >> >> >> wrote: > > >> >> >> > On Wed, Feb 21, 2018 at 09:37:21AM -0500, Rob Clark wrote: > > >> >> >> >> Follow the same pattern of locking as with other state objects. > > >> >> >> >> This > > >> >> >> >> avoids boilerplate in the driver. > > >> >> >> > > > >> >> >> > I'm not sure we really want to do this. What if the driver wants > > >> >> >> > a > > >> >> >> > custom locking scheme for this state? > > >> >> >> > > >> >> >> That seems like something we want to discourage, ie. all the more > > >> >> >> reason for this patch. > > >> >> >> > > >> >> >> There is no reason drivers could not split their global state into > > >> >> >> multiple private objs's, each with their own lock, for more fine > > >> >> >> grained locking. That is basically the only valid reason I can > > >> >> >> think > > >> >> >> of for "custom locking". > > >> >> > > > >> >> > In i915 we have at least one case that would want something close > > >> >> > to an > > >> >> > rwlock. Any crtc lock is enough for read, need all of them for > > >> >> > write. > > >> >> > Though if we wanted to use private objs for that we might need to > > >> >> > actually make the states refcounted as well, otherwise I can imagine > > >> >> > we might land in some use-after-free issues once again. 
> > >> >> > > > >> >> > Maybe we could duplicate the state into per-crtc and global copies, > > >> >> > but > > >> >> > then we have to keep all of those in sync somehow which doesn't > > >> >> > sound > > >> >> > particularly pleasant. > > >> >> > > >> >> Or just keep your own driver lock for read, and use that plus the core > > >> >> modeset lock for write? > > >> > > > >> > If we can't add the private obj to the state we can't really use it. > > >> > > > >> > > >> I'm not sure why that is strictly true (that you need to add it to the > > >> state if for read-only), since you'd be guarding it with your own > > >> driver read-lock you can just priv->foo_state->bar. > > >> > > >> Since it is read-only access, there is no roll-back to worry about for > > >> test-only or failed atomic_check()s.. > > > > > > That would be super ugly. We want to access the information the same > > > way whether it has been modified or not. > > > > Well, I mean the whole idea of what you want to do seems a bit super-ugly > > ;-) > > > > I mean, in mdp5 the assigned global resources go in plane/crtc state, > > and tracking of what is assigned to which plane/crtc is in global > > state, so it fits nicely in the current locking model. For i915, I'm > > not quite sure what is the global state you are concerned about, so it > > is a bit hard to talk about the best solution in the abstract. Maybe > > the better option is to teach modeset-lock how to be a rwlock instead? > > The thing I'm thinking is the core display clock (cdclk) frequency which > we need to consult whenever computing plane states and whatnot. We don't > want a modeset on one crtc to block a plane update on another crtc > unless we actually have to bump the cdclk (which would generally require > all crtcs to undergo a full modeset). Seems like a generally useful > pattern to me. The usual way to fix that is to have read-only copies of the state in the plane or crtc states. 
And for writing (or if the requirement changes) you have to lock all the objects. Essentially what Rob's doing for his plane/crtc assignment stuff. What we do in i915 is kinda not what I've been recommending to everyone else, because it is a rather tricky and complicated way to get things done. Sure there's a tradeoff between duplicating data and complicated locking schemes, but I think for the kms case having to explicitly type code that reflects the depencies in computation (instead of having that embedded implicitly in the locking scheme) is a feature, not a bug. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v4 15/38] drm/bridge: analogix_dp: Ensure edp is disabled when shutting down the panel
Hi All,

This is the patch which introduces the issue I've pointed out here:
https://lists.freedesktop.org/archives/dri-devel/2018-March/167794.html

On 2018-03-05 23:23, Enric Balletbo i Serra wrote:
> From: Lin Huang
>
> When the panel is shut down, we should make sure eDP can be disabled
> to avoid undefined behavior.
>
> Cc: Stéphane Marchesin
> Signed-off-by: Lin Huang
> Signed-off-by: zain wang
> Signed-off-by: Sean Paul
> Signed-off-by: Thierry Escande
> Reviewed-by: Andrzej Hajda
> Signed-off-by: Enric Balletbo i Serra
> ---
>  drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> index 92fb9a072cb6..9b7d530ad24c 100644
> --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> @@ -1160,6 +1160,12 @@ static int analogix_dp_set_bridge(struct analogix_dp_device *dp)
>  
>  	pm_runtime_get_sync(dp->dev);
>  
> +	ret = clk_prepare_enable(dp->clock);
> +	if (ret < 0) {
> +		DRM_ERROR("Failed to prepare_enable the clock clk [%d]\n", ret);
> +		goto out_dp_clk_pre;
> +	}
> +
>  	if (dp->plat_data->power_on)
>  		dp->plat_data->power_on(dp->plat_data);
>  
> @@ -1191,6 +1197,8 @@ static int analogix_dp_set_bridge(struct analogix_dp_device *dp)
>  	phy_power_off(dp->phy);
>  	if (dp->plat_data->power_off)
>  		dp->plat_data->power_off(dp->plat_data);
> +	clk_disable_unprepare(dp->clock);
> +out_dp_clk_pre:
>  	pm_runtime_put_sync(dp->dev);
>  
>  	return ret;
> @@ -1234,10 +1242,13 @@ static void analogix_dp_bridge_disable(struct drm_bridge *bridge)
>  	disable_irq(dp->irq);
>  	phy_power_off(dp->phy);
>  
> +	analogix_dp_set_analog_power_down(dp, POWER_ALL, 1);

In the case of Exynos DP, an external PHY is used to power the DP
block, so no register access should be performed after
phy_power_off(). Please move analogix_dp_set_analog_power_down()
before phy_power_off().

>  	if (dp->plat_data->power_off)
>  		dp->plat_data->power_off(dp->plat_data);
>  
> +	clk_disable_unprepare(dp->clock);
> +
>  	pm_runtime_put_sync(dp->dev);
>  
>  	ret = analogix_dp_prepare_panel(dp, false, true);

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Re: [PATCH 1/4] drm/atomic: integrate modeset lock with private objects
On Wed, Feb 21, 2018 at 09:37:21AM -0500, Rob Clark wrote: > Follow the same pattern of locking as with other state objects. This > avoids boilerplate in the driver. > > Signed-off-by: Rob Clark Please also adjust the kernel doc, and I think we can remove the locking WARN_ON in drm_atomic_get_mst_topology_state after this patch (plus again adjust the kerneldoc for that please). Otherwise I think this makes sense, and encourages reasonable semantics for driver private state objects. -Daniel > --- > drivers/gpu/drm/drm_atomic.c | 9 - > include/drm/drm_atomic.h | 5 + > 2 files changed, 13 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c > index fc8c4da409ff..004e621ab307 100644 > --- a/drivers/gpu/drm/drm_atomic.c > +++ b/drivers/gpu/drm/drm_atomic.c > @@ -1078,6 +1078,8 @@ drm_atomic_private_obj_init(struct drm_private_obj *obj, > { > memset(obj, 0, sizeof(*obj)); > > + drm_modeset_lock_init(&obj->lock); > + > obj->state = state; > obj->funcs = funcs; > } > @@ -1093,6 +1095,7 @@ void > drm_atomic_private_obj_fini(struct drm_private_obj *obj) > { > obj->funcs->atomic_destroy_state(obj, obj->state); > + drm_modeset_lock_fini(&obj->lock); > } > EXPORT_SYMBOL(drm_atomic_private_obj_fini); > > @@ -1113,7 +1116,7 @@ struct drm_private_state * > drm_atomic_get_private_obj_state(struct drm_atomic_state *state, >struct drm_private_obj *obj) > { > - int index, num_objs, i; > + int index, num_objs, i, ret; > size_t size; > struct __drm_private_objs_state *arr; > struct drm_private_state *obj_state; > @@ -1122,6 +1125,10 @@ drm_atomic_get_private_obj_state(struct > drm_atomic_state *state, > if (obj == state->private_objs[i].ptr) > return state->private_objs[i].state; > > + ret = drm_modeset_lock(&obj->lock, state->acquire_ctx); > + if (ret) > + return ERR_PTR(ret); > + > num_objs = state->num_private_objs + 1; > size = sizeof(*state->private_objs) * num_objs; > arr = krealloc(state->private_objs, size, GFP_KERNEL); > diff 
--git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h > index 09076a625637..9ae53b73c9d2 100644 > --- a/include/drm/drm_atomic.h > +++ b/include/drm/drm_atomic.h > @@ -218,6 +218,11 @@ struct drm_private_state_funcs { > * &drm_modeset_lock is required to duplicate and update this object's state. > */ > struct drm_private_obj { > + /** > + * @lock: Modeset lock to protect the state object. > + */ > + struct drm_modeset_lock lock; > + > /** >* @state: Current atomic state for this driver private object. >*/ > -- > 2.14.3 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 7/9] drm/xen-front: Implement KMS/connector handling
On 03/06/2018 09:22 AM, Daniel Vetter wrote: On Mon, Mar 05, 2018 at 02:59:23PM +0200, Oleksandr Andrushchenko wrote: On 03/05/2018 11:23 AM, Daniel Vetter wrote: On Wed, Feb 21, 2018 at 10:03:40AM +0200, Oleksandr Andrushchenko wrote: From: Oleksandr Andrushchenko Implement kernel modesetting/connector handling using DRM simple KMS helper pipeline: - implement KMS part of the driver with the help of DRM simple pipeline helper which is possible due to the fact that the para-virtualized driver only supports a single (primary) plane: - initialize connectors according to XenStore configuration - handle frame done events from the backend - generate vblank events - create and destroy frame buffers and propagate those to the backend - propagate set/reset mode configuration to the backend on display enable/disable callbacks - send page flip request to the backend and implement logic for reporting backend IO errors on prepare fb callback - implement virtual connector handling: - support only pixel formats suitable for single plane modes - make sure the connector is always connected - support a single video mode as per para-virtualized driver configuration Signed-off-by: Oleksandr Andrushchenko I think once you've removed the midlayer in the previous patch it would make sense to merge the 2 patches into 1. ok, will squash the two Bunch more comments below. 
-Daniel --- drivers/gpu/drm/xen/Makefile | 2 + drivers/gpu/drm/xen/xen_drm_front_conn.c | 125 + drivers/gpu/drm/xen/xen_drm_front_conn.h | 35 drivers/gpu/drm/xen/xen_drm_front_drv.c | 15 ++ drivers/gpu/drm/xen/xen_drm_front_drv.h | 12 ++ drivers/gpu/drm/xen/xen_drm_front_kms.c | 299 +++ drivers/gpu/drm/xen/xen_drm_front_kms.h | 30 7 files changed, 518 insertions(+) create mode 100644 drivers/gpu/drm/xen/xen_drm_front_conn.c create mode 100644 drivers/gpu/drm/xen/xen_drm_front_conn.h create mode 100644 drivers/gpu/drm/xen/xen_drm_front_kms.c create mode 100644 drivers/gpu/drm/xen/xen_drm_front_kms.h diff --git a/drivers/gpu/drm/xen/Makefile b/drivers/gpu/drm/xen/Makefile index d3068202590f..4fcb0da1a9c5 100644 --- a/drivers/gpu/drm/xen/Makefile +++ b/drivers/gpu/drm/xen/Makefile @@ -2,6 +2,8 @@ drm_xen_front-objs := xen_drm_front.o \ xen_drm_front_drv.o \ + xen_drm_front_kms.o \ + xen_drm_front_conn.o \ xen_drm_front_evtchnl.o \ xen_drm_front_shbuf.o \ xen_drm_front_cfg.o diff --git a/drivers/gpu/drm/xen/xen_drm_front_conn.c b/drivers/gpu/drm/xen/xen_drm_front_conn.c new file mode 100644 index ..d9986a2e1a3b --- /dev/null +++ b/drivers/gpu/drm/xen/xen_drm_front_conn.c @@ -0,0 +1,125 @@ +/* + * Xen para-virtual DRM device + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * Copyright (C) 2016-2018 EPAM Systems Inc. 
+ * + * Author: Oleksandr Andrushchenko + */ + +#include +#include + +#include + +#include "xen_drm_front_conn.h" +#include "xen_drm_front_drv.h" + +static struct xen_drm_front_drm_pipeline * +to_xen_drm_pipeline(struct drm_connector *connector) +{ + return container_of(connector, struct xen_drm_front_drm_pipeline, conn); +} + +static const uint32_t plane_formats[] = { + DRM_FORMAT_RGB565, + DRM_FORMAT_RGB888, + DRM_FORMAT_XRGB, + DRM_FORMAT_ARGB, + DRM_FORMAT_XRGB, + DRM_FORMAT_ARGB, + DRM_FORMAT_XRGB1555, + DRM_FORMAT_ARGB1555, +}; + +const uint32_t *xen_drm_front_conn_get_formats(int *format_count) +{ + *format_count = ARRAY_SIZE(plane_formats); + return plane_formats; +} + +static enum drm_connector_status connector_detect( + struct drm_connector *connector, bool force) +{ + if (drm_dev_is_unplugged(connector->dev)) + return connector_status_disconnected; + + return connector_status_connected; +} + +#define XEN_DRM_NUM_VIDEO_MODES1 +#define XEN_DRM_CRTC_VREFRESH_HZ 60 + +static int connector_get_modes(struct drm_connector *connector) +{ + struct xen_drm_front_drm_pipeline *pipeline = + to_xen_drm_pipeline(connector); + struct drm_display_mode *mode; + struct videomode videomode; + int width, height; + + mode = drm_mode_create(connector->dev); +
Re: [PATCH 1/4] drm/atomic: integrate modeset lock with private objects
On Wed, Feb 21, 2018 at 04:19:40PM +0100, Maarten Lankhorst wrote: > Hey, > > Op 21-02-18 om 15:37 schreef Rob Clark: > > Follow the same pattern of locking as with other state objects. This > > avoids boilerplate in the driver. > I'm afraid this will prohibit any uses of this on i915, since it still uses > legacy lock_all(). > > Oh well, afaict nothing in i915 uses private objects, so I don't think it's > harmful. :) We do use private objects, as part of dp mst helpers. But I also thought that the only users left of lock_all are in the debugfs code, where this really doesn't matter all that much. > Could you cc intel-gfx just in case? Yeah, best to double-check with CI. > > Signed-off-by: Rob Clark > > --- > > drivers/gpu/drm/drm_atomic.c | 9 - > > include/drm/drm_atomic.h | 5 + > > 2 files changed, 13 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c > > index fc8c4da409ff..004e621ab307 100644 > > --- a/drivers/gpu/drm/drm_atomic.c > > +++ b/drivers/gpu/drm/drm_atomic.c > > @@ -1078,6 +1078,8 @@ drm_atomic_private_obj_init(struct drm_private_obj > > *obj, > > { > > memset(obj, 0, sizeof(*obj)); > > > > + drm_modeset_lock_init(&obj->lock); > > + > > obj->state = state; > > obj->funcs = funcs; > > } > > @@ -1093,6 +1095,7 @@ void > > drm_atomic_private_obj_fini(struct drm_private_obj *obj) > > { > > obj->funcs->atomic_destroy_state(obj, obj->state); > > + drm_modeset_lock_fini(&obj->lock); > > } > > EXPORT_SYMBOL(drm_atomic_private_obj_fini); > > > > @@ -1113,7 +1116,7 @@ struct drm_private_state * > > drm_atomic_get_private_obj_state(struct drm_atomic_state *state, > > struct drm_private_obj *obj) > > { > > - int index, num_objs, i; > > + int index, num_objs, i, ret; > > size_t size; > > struct __drm_private_objs_state *arr; > > struct drm_private_state *obj_state; > > @@ -1122,6 +1125,10 @@ drm_atomic_get_private_obj_state(struct > > drm_atomic_state *state, > > if (obj == state->private_objs[i].ptr) > > 
return state->private_objs[i].state; > > > > + ret = drm_modeset_lock(&obj->lock, state->acquire_ctx); > > + if (ret) > > + return ERR_PTR(ret); > > + > > num_objs = state->num_private_objs + 1; > > size = sizeof(*state->private_objs) * num_objs; > > arr = krealloc(state->private_objs, size, GFP_KERNEL); > > diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h > > index 09076a625637..9ae53b73c9d2 100644 > > --- a/include/drm/drm_atomic.h > > +++ b/include/drm/drm_atomic.h > > @@ -218,6 +218,11 @@ struct drm_private_state_funcs { > > * &drm_modeset_lock is required to duplicate and update this object's > > state. > > */ > > struct drm_private_obj { > > + /** > > +* @lock: Modeset lock to protect the state object. > > +*/ > > + struct drm_modeset_lock lock; > > + > > /** > > * @state: Current atomic state for this driver private object. > > */ > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 8/9] drm/xen-front: Implement GEM operations
On Mon, Mar 05, 2018 at 03:46:07PM +0200, Oleksandr Andrushchenko wrote: > On 03/05/2018 11:32 AM, Daniel Vetter wrote: > > On Wed, Feb 21, 2018 at 10:03:41AM +0200, Oleksandr Andrushchenko wrote: > > > From: Oleksandr Andrushchenko > > > > > > Implement GEM handling depending on driver mode of operation: > > > depending on the requirements for the para-virtualized environment, namely > > > requirements dictated by the accompanying DRM/(v)GPU drivers running in > > > both > > > host and guest environments, number of operating modes of para-virtualized > > > display driver are supported: > > > - display buffers can be allocated by either frontend driver or backend > > > - display buffers can be allocated to be contiguous in memory or not > > > > > > Note! Frontend driver itself has no dependency on contiguous memory for > > > its operation. > > > > > > 1. Buffers allocated by the frontend driver. > > > > > > The below modes of operation are configured at compile-time via > > > frontend driver's kernel configuration. > > > > > > 1.1. Front driver configured to use GEM CMA helpers > > > This use-case is useful when used with accompanying DRM/vGPU driver > > > in > > > guest domain which was designed to only work with contiguous > > > buffers, > > > e.g. DRM driver based on GEM CMA helpers: such drivers can only > > > import > > > contiguous PRIME buffers, thus requiring frontend driver to provide > > > such. In order to implement this mode of operation para-virtualized > > > frontend driver can be configured to use GEM CMA helpers. > > > > > > 1.2. Front driver doesn't use GEM CMA > > > If accompanying drivers can cope with non-contiguous memory then, to > > > lower pressure on CMA subsystem of the kernel, driver can allocate > > > buffers from system memory. > > > > > > Note! 
If used with accompanying DRM/(v)GPU drivers this mode of operation > > > may require IOMMU support on the platform, so accompanying DRM/vGPU > > > hardware can still reach display buffer memory while importing PRIME > > > buffers from the frontend driver. > > > > > > 2. Buffers allocated by the backend > > > > > > This mode of operation is run-time configured via guest domain > > > configuration > > > through XenStore entries. > > > > > > For systems which do not provide IOMMU support, but having specific > > > requirements for display buffers it is possible to allocate such buffers > > > at backend side and share those with the frontend. > > > For example, if host domain is 1:1 mapped and has DRM/GPU hardware > > > expecting > > > physically contiguous memory, this allows implementing zero-copying > > > use-cases. > > > > > > Note! Configuration options 1.1 (contiguous display buffers) and 2 > > > (backend > > > allocated buffers) are not supported at the same time. > > > > > > Signed-off-by: Oleksandr Andrushchenko > > Some suggestions below for some larger cleanup work. 
> > -Daniel > > > > > --- > > > drivers/gpu/drm/xen/Kconfig | 13 + > > > drivers/gpu/drm/xen/Makefile| 6 + > > > drivers/gpu/drm/xen/xen_drm_front.h | 74 ++ > > > drivers/gpu/drm/xen/xen_drm_front_drv.c | 80 ++- > > > drivers/gpu/drm/xen/xen_drm_front_drv.h | 1 + > > > drivers/gpu/drm/xen/xen_drm_front_gem.c | 360 > > > > > > drivers/gpu/drm/xen/xen_drm_front_gem.h | 46 > > > drivers/gpu/drm/xen/xen_drm_front_gem_cma.c | 93 +++ > > > 8 files changed, 667 insertions(+), 6 deletions(-) > > > create mode 100644 drivers/gpu/drm/xen/xen_drm_front_gem.c > > > create mode 100644 drivers/gpu/drm/xen/xen_drm_front_gem.h > > > create mode 100644 drivers/gpu/drm/xen/xen_drm_front_gem_cma.c > > > > > > diff --git a/drivers/gpu/drm/xen/Kconfig b/drivers/gpu/drm/xen/Kconfig > > > index 4cca160782ab..4f4abc91f3b6 100644 > > > --- a/drivers/gpu/drm/xen/Kconfig > > > +++ b/drivers/gpu/drm/xen/Kconfig > > > @@ -15,3 +15,16 @@ config DRM_XEN_FRONTEND > > > help > > > Choose this option if you want to enable a para-virtualized > > > frontend DRM/KMS driver for Xen guest OSes. > > > + > > > +config DRM_XEN_FRONTEND_CMA > > > + bool "Use DRM CMA to allocate dumb buffers" > > > + depends on DRM_XEN_FRONTEND > > > + select DRM_KMS_CMA_HELPER > > > + select DRM_GEM_CMA_HELPER > > > + help > > > + Use DRM CMA helpers to allocate display buffers. > > > + This is useful for the use-cases when guest driver needs to > > > + share or export buffers to other drivers which only expect > > > + contiguous buffers. > > > + Note: in this mode driver cannot use buffers allocated > > > + by the backend. > > > diff --git a/drivers/gpu/drm/xen/Makefile b/drivers/gpu/drm/xen/Makefile > > > index 4fcb0da1a9c5..12376ec78fbc 100644 > > > --- a/drivers/gpu/drm/xen/Makefile > > > +++ b/drivers/gpu/drm/xen/Makefile > > > @@ -8,4 +8,10 @@ drm_xen_front-objs
Re: [PATCH 7/9] drm/xen-front: Implement KMS/connector handling
On Mon, Mar 05, 2018 at 02:59:23PM +0200, Oleksandr Andrushchenko wrote: > On 03/05/2018 11:23 AM, Daniel Vetter wrote: > > On Wed, Feb 21, 2018 at 10:03:40AM +0200, Oleksandr Andrushchenko wrote: > > > From: Oleksandr Andrushchenko > > > > > > Implement kernel modesetting/connector handling using > > > DRM simple KMS helper pipeline: > > > > > > - implement KMS part of the driver with the help of DRM > > >simple pipeline helper which is possible due to the fact > > >that the para-virtualized driver only supports a single > > >(primary) plane: > > >- initialize connectors according to XenStore configuration > > >- handle frame done events from the backend > > >- generate vblank events > > >- create and destroy frame buffers and propagate those > > > to the backend > > >- propagate set/reset mode configuration to the backend on display > > > enable/disable callbacks > > >- send page flip request to the backend and implement logic for > > > reporting backend IO errors on prepare fb callback > > > > > > - implement virtual connector handling: > > >- support only pixel formats suitable for single plane modes > > >- make sure the connector is always connected > > >- support a single video mode as per para-virtualized driver > > > configuration > > > > > > Signed-off-by: Oleksandr Andrushchenko > > I think once you've removed the midlayer in the previous patch it would > > make sense to merge the 2 patches into 1. > ok, will squash the two > > > > Bunch more comments below. 
> > -Daniel > > > > > --- > > > drivers/gpu/drm/xen/Makefile | 2 + > > > drivers/gpu/drm/xen/xen_drm_front_conn.c | 125 + > > > drivers/gpu/drm/xen/xen_drm_front_conn.h | 35 > > > drivers/gpu/drm/xen/xen_drm_front_drv.c | 15 ++ > > > drivers/gpu/drm/xen/xen_drm_front_drv.h | 12 ++ > > > drivers/gpu/drm/xen/xen_drm_front_kms.c | 299 > > > +++ > > > drivers/gpu/drm/xen/xen_drm_front_kms.h | 30 > > > 7 files changed, 518 insertions(+) > > > create mode 100644 drivers/gpu/drm/xen/xen_drm_front_conn.c > > > create mode 100644 drivers/gpu/drm/xen/xen_drm_front_conn.h > > > create mode 100644 drivers/gpu/drm/xen/xen_drm_front_kms.c > > > create mode 100644 drivers/gpu/drm/xen/xen_drm_front_kms.h > > > > > > diff --git a/drivers/gpu/drm/xen/Makefile b/drivers/gpu/drm/xen/Makefile > > > index d3068202590f..4fcb0da1a9c5 100644 > > > --- a/drivers/gpu/drm/xen/Makefile > > > +++ b/drivers/gpu/drm/xen/Makefile > > > @@ -2,6 +2,8 @@ > > > drm_xen_front-objs := xen_drm_front.o \ > > > xen_drm_front_drv.o \ > > > + xen_drm_front_kms.o \ > > > + xen_drm_front_conn.o \ > > > xen_drm_front_evtchnl.o \ > > > xen_drm_front_shbuf.o \ > > > xen_drm_front_cfg.o > > > diff --git a/drivers/gpu/drm/xen/xen_drm_front_conn.c > > > b/drivers/gpu/drm/xen/xen_drm_front_conn.c > > > new file mode 100644 > > > index ..d9986a2e1a3b > > > --- /dev/null > > > +++ b/drivers/gpu/drm/xen/xen_drm_front_conn.c > > > @@ -0,0 +1,125 @@ > > > +/* > > > + * Xen para-virtual DRM device > > > + * > > > + * This program is free software; you can redistribute it and/or modify > > > + * it under the terms of the GNU General Public License as published by > > > + * the Free Software Foundation; either version 2 of the License, or > > > + * (at your option) any later version. > > > + * > > > + * This program is distributed in the hope that it will be useful, > > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the > > > + * GNU General Public License for more details. > > > + * > > > + * Copyright (C) 2016-2018 EPAM Systems Inc. > > > + * > > > + * Author: Oleksandr Andrushchenko > > > + */ > > > + > > > +#include > > > +#include > > > + > > > +#include > > > + > > > +#include "xen_drm_front_conn.h" > > > +#include "xen_drm_front_drv.h" > > > + > > > +static struct xen_drm_front_drm_pipeline * > > > +to_xen_drm_pipeline(struct drm_connector *connector) > > > +{ > > > + return container_of(connector, struct xen_drm_front_drm_pipeline, conn); > > > +} > > > + > > > +static const uint32_t plane_formats[] = { > > > + DRM_FORMAT_RGB565, > > > + DRM_FORMAT_RGB888, > > > + DRM_FORMAT_XRGB, > > > + DRM_FORMAT_ARGB, > > > + DRM_FORMAT_XRGB, > > > + DRM_FORMAT_ARGB, > > > + DRM_FORMAT_XRGB1555, > > > + DRM_FORMAT_ARGB1555, > > > +}; > > > + > > > +const uint32_t *xen_drm_front_conn_get_formats(int *format_count) > > > +{ > > > + *format_count = ARRAY_SIZE(plane_formats); > > > + return plane_formats; > > > +} > > > + > > > +static enum drm_connector_status connector_detect( > > > + struct drm_connector *connector, bool force) > > > +{ > > > + if (drm_dev_is_unplugg
[GIT PULL] siginfo fix for v4.16-rc5
Linus, Please pull the siginfo-linus branch from the git tree: git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git siginfo-linus HEAD: f6a015498dcaee72f80283cb7873d88deb07129c signal/x86: Include the field offsets in the build time checks The kbuild test robot found that I accidentally moved si_pkey when I was cleaning up siginfo_t. A short followed by an int with the int having 8 byte alignment. Sheesh siginfo_t is a weird structure. I have now corrected it and added build time checks that with a little luck will catch any similar future mistakes. The build time checks were sufficient for me to verify the bug and to verify my fix. So they are at least useful this once. Eric W. Biederman (2): signal: Correct the offset of si_pkey in struct siginfo signal/x86: Include the field offsets in the build time checks arch/x86/kernel/signal_compat.c| 65 ++ include/linux/compat.h | 4 +-- include/uapi/asm-generic/siginfo.h | 4 +-- 3 files changed, 69 insertions(+), 4 deletions(-)
[RFC PATCH] irqchip/gic-v3-its: handle wrapped case in its_wait_for_range_completion()
From: Yang Yingliang While cpus posting a bunch of ITS commands, the cmd_queue and rd_idx will be wrapped easily. And current way of handling wrapped case is not quite right. Such as, in direct case, rd_idx will wrap if other cpus post commands that make rd_idx increase. When rd_idx wrapped, the driver prints timeout messages but in fact the command is finished. This patch adds two variables to count wrapped times of ITS commands and read index. With these two variables, the driver can handle wrapped case correctly. Signed-off-by: Yang Yingliang --- drivers/irqchip/irq-gic-v3-its.c | 72 +--- 1 file changed, 60 insertions(+), 12 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 1d3056f..a03e18e 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -111,6 +111,9 @@ struct its_node { u32 pre_its_base; /* for Socionext Synquacer */ boolis_v4; int vlpi_redist_offset; + int last_rd; + u64 cmd_wrapped_cnt; + u64 rd_wrapped_cnt; }; #define ITS_ITT_ALIGN SZ_256 @@ -662,6 +665,7 @@ static int its_queue_full(struct its_node *its) static struct its_cmd_block *its_allocate_entry(struct its_node *its) { + u32 rd; struct its_cmd_block *cmd; u32 count = 100;/* 1s! */ @@ -675,11 +679,24 @@ static struct its_cmd_block *its_allocate_entry(struct its_node *its) udelay(1); } + /* +* Here is protected by its->lock and driver cannot allocate +* ITS commands, if ITS command queue is full, so the read +* won't wrap twice between this rd_idx and last rd_idx. +* Count rd wrapped times here is safe. 
+*/ + rd = readl_relaxed(its->base + GITS_CREADR); + if (rd < its->last_rd) + its->rd_wrapped_cnt++; + its->last_rd = rd; + cmd = its->cmd_write++; /* Handle queue wrapping */ - if (its->cmd_write == (its->cmd_base + ITS_CMD_QUEUE_NR_ENTRIES)) + if (its->cmd_write == (its->cmd_base + ITS_CMD_QUEUE_NR_ENTRIES)) { its->cmd_write = its->cmd_base; + its->cmd_wrapped_cnt++; + } /* Clear command */ cmd->raw_cmd[0] = 0; @@ -713,29 +730,57 @@ static void its_flush_cmd(struct its_node *its, struct its_cmd_block *cmd) static int its_wait_for_range_completion(struct its_node *its, struct its_cmd_block *from, -struct its_cmd_block *to) +struct its_cmd_block *to, +u64 last_cmd_wrapped_cnt) { - u64 rd_idx, from_idx, to_idx; + unsigned long flags; + u64 rd_idx, from_idx, to_idx, rd_wrapped_cnt; u32 count = 100;/* 1s! */ from_idx = its_cmd_ptr_to_offset(its, from); to_idx = its_cmd_ptr_to_offset(its, to); while (1) { + raw_spin_lock_irqsave(&its->lock, flags); rd_idx = readl_relaxed(its->base + GITS_CREADR); + if (rd_idx < its->last_rd) + its->rd_wrapped_cnt++; + its->last_rd = rd_idx; + rd_wrapped_cnt = its->rd_wrapped_cnt; + raw_spin_unlock_irqrestore(&its->lock, flags); - /* Direct case */ - if (from_idx < to_idx && rd_idx >= to_idx) - break; - - /* Wrapped case */ - if (from_idx >= to_idx && rd_idx >= to_idx && rd_idx < from_idx) + /* +* If rd_wrapped_cnt > last_cmd_wrapped_cnt: +* there are a lot of ITS commands posted by +* other cpus and ITS is fast. +* +* If rd_wrapped_cnt < last_cmd_wrapped_cnt: +* ITS is slow, there are some ITS commands +* not finished. +* +* If rd_wrapped_cnt == last_cmd_wrapped_cnt: +* it's common case. +*/ + if (rd_wrapped_cnt > last_cmd_wrapped_cnt) { + /* +* There is a lot of ITS commands posted by other cpus, +* it make rd_idx move foward fast and wrap. 
+*/ break; + } else if (rd_wrapped_cnt == last_cmd_wrapped_cnt) { + /* Direct case */ + if (from_idx < to_idx && rd_idx >= to_idx) + break; + + /* Wrapped case */ + if (from_idx >= to_idx && rd_idx >= to_idx && rd_idx < from_idx) + break; + }
[PATCHv2 2/2] zram: drop max_zpage_size and use zs_huge_class_size()
This patch removes ZRAM's enforced "huge object" value and uses the zsmalloc huge-class watermark instead, which makes more sense.

TEST - I used a 1G zram device, LZO compression back-end, original data set size was 444MB. Looking at the zsmalloc class stats, the test ended up being pretty fair.

BASE ZRAM/ZSMALLOC
==================
zram mm_stat
498978816 191482495 1998315520 199831552156340

zsmalloc classes
class  size  almost_full  almost_empty  obj_allocated  obj_used  pages_used  pages_per_zspage  freeable
...
  151  2448            0             0           1240      1240         744                 3         0
  168  2720            0             0           4200      4200        2800                 2         0
  190  3072            0             0          10100     10100        7575                 3         0
  202  3264            0             0            380       380         304                 4         0
  254  4096            0             0          10620     10620       10620                 1         0
Total                  7            46         106982    106187       48787                           0

PATCHED ZRAM/ZSMALLOC
=====================
zram mm_stat
498978816 182579184 1942487040 194248704156280

zsmalloc classes
class  size  almost_full  almost_empty  obj_allocated  obj_used  pages_used  pages_per_zspage  freeable
...
  151  2448            0             0           1240      1240         744                 3         0
  168  2720            0             0           4200      4200        2800                 2         0
  190  3072            0             0          10100     10100        7575                 3         0
  202  3264            0             0           7180      7180        5744                 4         0
  254  4096            0             0           3820      3820        3820                 1         0
Total                  8            45         106959    106193       47424                           0

As we can see, we reduced the number of objects stored in class-4096, because a huge number of objects which we previously forcibly stored in class-4096 are now stored in the non-huge class-3264. This results in lower memory consumption:
- zsmalloc now uses 47424 physical pages, which is less than the 48787 pages zsmalloc used before;
- objects that we store in class-3264 share zspages, which is why the overall number of pages consumed by class-4096 and class-3264 together went down from 10924 to 9564. 
Signed-off-by: Sergey Senozhatsky --- drivers/block/zram/zram_drv.c | 9 - drivers/block/zram/zram_drv.h | 16 2 files changed, 8 insertions(+), 17 deletions(-) diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index 85110e7931e5..1b8082e6d2f5 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -44,6 +44,11 @@ static const char *default_compressor = "lzo"; /* Module params (documentation at end) */ static unsigned int num_devices = 1; +/* + * Pages that compress to sizes equals or greater than this are stored + * uncompressed in memory. + */ +static size_t huge_class_size; static void zram_free_page(struct zram *zram, size_t index); @@ -786,6 +791,8 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize) return false; } + if (!huge_class_size) + huge_class_size = zs_huge_class_size(); return true; } @@ -965,7 +972,7 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec, return ret; } - if (unlikely(comp_len > max_zpage_size)) { + if (unlikely(comp_len >= huge_class_size)) { if (zram_wb_enabled(zram) && allow_wb) { zcomp_stream_put(zram->comp); ret = write_to_bdev(zram, bvec, index, bio, &element); diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h index 31762db861e3..d71c8000a964 100644 --- a/drivers/block/zram/zram_drv.h +++ b/drivers/block/zram/zram_drv.h @@ -21,22 +21,6 @@ #include "zcomp.h" -/*-- Configurable parameters */ - -/* - * Pages that compress to size greater than this are stored - * uncompressed in memory. - */ -static const size_t max_zpage_size = PAGE_SIZE / 4 * 3; - -/* - * NOTE: max_zpage_size must be less than or equal to: - * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would - * always return failure. - */ - -/*-- End of configurable params */ - #define SECTOR_SHIFT 9 #define SECTORS_PER_PAGE_SHIFT (PAGE_SHIFT - SECTOR_SHIFT) #define SECTORS_PER_PAGE (1 << SECTORS_PER_PAGE_SHIFT) -- 2.16.2
[PATCHv2 0/2] zsmalloc/zram: drop zram's max_zpage_size
Hello, ZRAM's max_zpage_size is a bad thing. It forces zsmalloc to store normal objects as huge ones, which results in bigger zsmalloc memory usage. Drop it and use actual zsmalloc huge-class value when decide if the object is huge or not. Sergey Senozhatsky (2): zsmalloc: introduce zs_huge_class_size() function zram: drop max_zpage_size and use zs_huge_class_size() drivers/block/zram/zram_drv.c | 9 - drivers/block/zram/zram_drv.h | 16 include/linux/zsmalloc.h | 2 ++ mm/zsmalloc.c | 40 4 files changed, 50 insertions(+), 17 deletions(-) -- 2.16.2
[PATCHv2 1/2] zsmalloc: introduce zs_huge_class_size() function
Not every object can share its zspage with other objects, e.g. when the object is as big as a zspage or nearly as big as a zspage. For such objects zsmalloc has a so-called huge class: every object which belongs to the huge class consumes the entire zspage (which consists of a single physical page). On an x86_64, PAGE_SHIFT 12 box, the first non-huge class size is 3264, so starting down from size 3264, objects can share page(-s) and thus minimize memory wastage. ZRAM, however, has its own statically defined watermark for huge objects, "3 * PAGE_SIZE / 4 = 3072", and forcibly stores every object larger than this watermark (3072) as a PAGE_SIZE object, in other words, in a huge class, while zsmalloc can keep some of those objects in non-huge classes. This results in increased memory consumption. zsmalloc knows better whether an object is huge or not. Introduce a zs_huge_class_size() function which tells whether the given object can be stored in one of the non-huge classes. This will let us drop ZRAM's huge-object watermark and fully rely on zsmalloc when we decide if the object is huge. 
Signed-off-by: Sergey Senozhatsky --- include/linux/zsmalloc.h | 2 ++ mm/zsmalloc.c| 40 2 files changed, 42 insertions(+) diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h index 57a8e98f2708..753c1af4d2cb 100644 --- a/include/linux/zsmalloc.h +++ b/include/linux/zsmalloc.h @@ -47,6 +47,8 @@ void zs_destroy_pool(struct zs_pool *pool); unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t flags); void zs_free(struct zs_pool *pool, unsigned long obj); +size_t zs_huge_class_size(void); + void *zs_map_object(struct zs_pool *pool, unsigned long handle, enum zs_mapmode mm); void zs_unmap_object(struct zs_pool *pool, unsigned long handle); diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index a583ab111a43..63422cf35b94 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -193,6 +193,7 @@ static struct vfsmount *zsmalloc_mnt; * (see: fix_fullness_group()) */ static const int fullness_threshold_frac = 4; +static size_t huge_class_size; struct size_class { spinlock_t lock; @@ -1407,6 +1408,24 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle) } EXPORT_SYMBOL_GPL(zs_unmap_object); +/** + * zs_huge_class_size() - Returns the size (in bytes) of the first huge + *zsmalloc &size_class. + * + * The function returns the size of the first huge class - any object of equal + * or bigger size will be stored in zspage consisting of a single physical + * page. + * + * Context: Any context. + * + * Return: the size (in bytes) of the first huge zsmalloc &size_class. 
+ */ +size_t zs_huge_class_size(void) +{ + return huge_class_size; +} +EXPORT_SYMBOL_GPL(zs_huge_class_size); + static unsigned long obj_malloc(struct size_class *class, struct zspage *zspage, unsigned long handle) { @@ -2363,6 +2382,27 @@ struct zs_pool *zs_create_pool(const char *name) pages_per_zspage = get_pages_per_zspage(size); objs_per_zspage = pages_per_zspage * PAGE_SIZE / size; + /* +* We iterate from biggest down to smallest classes, +* so huge_class_size holds the size of the first huge +* class. Any object bigger than or equal to that will +* endup in the huge class. +*/ + if (pages_per_zspage != 1 && objs_per_zspage != 1 && + !huge_class_size) { + huge_class_size = size; + /* +* The object uses ZS_HANDLE_SIZE bytes to store the +* handle. We need to subtract it, because zs_malloc() +* unconditionally adds handle size before it performs +* size class search - so object may be smaller than +* huge class size, yet it still can end up in the huge +* class because it grows by ZS_HANDLE_SIZE extra bytes +* right before class lookup. +*/ + huge_class_size -= (ZS_HANDLE_SIZE - 1); + } + /* * size_class is used for normal zsmalloc operation such * as alloc/free for that size. Although it is natural that we -- 2.16.2
Re: [PATCH 07/34] x86/entry/32: Restore segments before int registers
* H. Peter Anvin wrote: > On NX-enabled hardware NX works with PDE, but the PDPT in general doesn't > have permission bits (it's really more of a set of four CR3s than a page > table level.) The 4 PDPT entries are also shadowed in the CPU and are only refreshed on CR3 loads, not spontaneously reloaded from memory during a TLB walk like regular page table entries, right? This too strengthens the notion that the third page table level of PAE is more like a special in-memory CR3[4] array. Thanks, Ingo
Re: [PATCH 3/3] vfio/pci: Add ioeventfd support
Hi Alex, I love your patch! Perhaps something to improve: [auto build test WARNING on linus/master] [also build test WARNING on v4.16-rc4 next-20180306] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Alex-Williamson/vfio-pci-Pull-BAR-mapping-setup-from-read-write-path/20180303-015851 reproduce: # apt-get install sparse make ARCH=x86_64 allmodconfig make C=1 CF=-D__CHECK_ENDIAN__ sparse warnings: (new ones prefixed by >>) >> drivers/vfio/pci/vfio_pci_rdwr.c:290:1: sparse: incorrect type in argument 2 (different address spaces) @@ expected void [noderef] <asn:2> * @@ got void *opaque @@ drivers/vfio/pci/vfio_pci_rdwr.c:290:1: expected void [noderef] <asn:2> * drivers/vfio/pci/vfio_pci_rdwr.c:290:1: got void *opaque drivers/vfio/pci/vfio_pci_rdwr.c:291:1: sparse: incorrect type in argument 2 (different address spaces) @@ expected void [noderef] <asn:2> * @@ got void *opaque @@ drivers/vfio/pci/vfio_pci_rdwr.c:291:1: expected void [noderef] <asn:2> * drivers/vfio/pci/vfio_pci_rdwr.c:291:1: got void *opaque drivers/vfio/pci/vfio_pci_rdwr.c:292:1: sparse: incorrect type in argument 2 (different address spaces) @@ expected void [noderef] <asn:2> * @@ got void *opaque @@ drivers/vfio/pci/vfio_pci_rdwr.c:292:1: expected void [noderef] <asn:2> * drivers/vfio/pci/vfio_pci_rdwr.c:292:1: got void *opaque >> drivers/vfio/pci/vfio_pci_rdwr.c:378:52: sparse: incorrect type in argument 1 (different address spaces) @@ expected void *opaque @@ got void [noderef] <asn:2> * vim +290 drivers/vfio/pci/vfio_pci_rdwr.c 286 287 #ifdef iowrite64 288 VFIO_PCI_IOEVENTFD_HANDLER(64) 289 #endif > 290 VFIO_PCI_IOEVENTFD_HANDLER(32) 291 VFIO_PCI_IOEVENTFD_HANDLER(16) 292 VFIO_PCI_IOEVENTFD_HANDLER(8) 293 294 long vfio_pci_ioeventfd(struct vfio_pci_device *vdev, loff_t offset, 295 uint64_t data, int count, int fd) 296 { 297 struct pci_dev *pdev = vdev->pdev; 298 loff_t pos = offset & VFIO_PCI_OFFSET_MASK; 299 int ret, bar = VFIO_PCI_OFFSET_TO_INDEX(offset); 300 struct 
vfio_pci_ioeventfd *ioeventfd; 301 int (*handler)(void *addr, void *value); 302 303 /* Only support ioeventfds into BARs */ 304 if (bar > VFIO_PCI_BAR5_REGION_INDEX) 305 return -EINVAL; 306 307 if (pos + count > pci_resource_len(pdev, bar)) 308 return -EINVAL; 309 310 /* Disallow ioeventfds working around MSI-X table writes */ 311 if (bar == vdev->msix_bar && 312 !(pos + count <= vdev->msix_offset || 313pos >= vdev->msix_offset + vdev->msix_size)) 314 return -EINVAL; 315 316 switch (count) { 317 case 1: 318 handler = &vfio_pci_ioeventfd_handler8; 319 break; 320 case 2: 321 handler = &vfio_pci_ioeventfd_handler16; 322 break; 323 case 4: 324 handler = &vfio_pci_ioeventfd_handler32; 325 break; 326 #ifdef iowrite64 327 case 8: 328 handler = &vfio_pci_ioeventfd_handler64; 329 break; 330 #endif 331 default: 332 return -EINVAL; 333 } 334 335 ret = vfio_pci_setup_barmap(vdev, bar); 336 if (ret) 337 return ret; 338 339 mutex_lock(&vdev->ioeventfds_lock); 340 341 list_for_each_entry(ioeventfd, &vdev->ioeventfds_list, next) { 342 if (ioeventfd->pos == pos && ioeventfd->bar == bar && 343 ioeventfd->data == data && ioeventfd->count == count) { 344 if (fd == -1) { 345 vfio_virqfd_disable(&ioeventfd->virqfd); 346 list_del(&ioeventfd->next); 347 vdev->ioeventfds_nr--; 348 kfree(ioeventfd); 349 ret = 0; 350 } else 351 ret = -EEXIST; 352 353 goto out_unlock; 354 } 355 } 356 357 if (fd < 0) { 358 ret = -ENODEV; 359 goto out_unlock; 360 } 361 362 if (vdev->ioeventfds_nr >= VFIO_PCI_IOEVENTFD_MAX) { 363 ret = -ENOSPC; 364 goto out_unlock; 365 } 366 367 ioeventfd = kzalloc(sizeof(*ioeventfd), GFP_KERNEL); 368 if (!ioeventfd) { 369 ret = -ENOMEM;
[tip:perf/core] perf mmap: Discard legacy interfaces for mmap read forward
Commit-ID: 6afad54d2f0ddebacfcf3b829147d7fed8dab298 Gitweb: https://git.kernel.org/tip/6afad54d2f0ddebacfcf3b829147d7fed8dab298 Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:11 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:51:10 -0300 perf mmap: Discard legacy interfaces for mmap read forward Discards legacy interfaces perf_evlist__mmap_read_forward(), perf_evlist__mmap_read() and perf_evlist__mmap_consume(). No tools use them. Signed-off-by: Kan Liang Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-14-git-send-email-kan.li...@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/evlist.c | 25 + tools/perf/util/evlist.h | 4 tools/perf/util/mmap.c | 21 + 3 files changed, 2 insertions(+), 48 deletions(-) diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 7b7d535396f7..41a4666f1519 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -702,29 +702,6 @@ static int perf_evlist__resume(struct perf_evlist *evlist) return perf_evlist__set_paused(evlist, false); } -union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int idx) -{ - struct perf_mmap *md = &evlist->mmap[idx]; - - /* -* Check messup is required for forward overwritable ring buffer: -* memory pointed by md->prev can be overwritten in this case. -* No need for read-write ring buffer: kernel stop outputting when -* it hit md->prev (perf_mmap__consume()). 
-*/ - return perf_mmap__read_forward(md); -} - -union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx) -{ - return perf_evlist__mmap_read_forward(evlist, idx); -} - -void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx) -{ - perf_mmap__consume(&evlist->mmap[idx], false); -} - static void perf_evlist__munmap_nofree(struct perf_evlist *evlist) { int i; @@ -761,7 +738,7 @@ static struct perf_mmap *perf_evlist__alloc_mmap(struct perf_evlist *evlist) map[i].fd = -1; /* * When the perf_mmap() call is made we grab one refcount, plus -* one extra to let perf_evlist__mmap_consume() get the last +* one extra to let perf_mmap__consume() get the last * events after all real references (perf_mmap__get()) are * dropped. * diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h index 336b838e6957..6c41b2f78713 100644 --- a/tools/perf/util/evlist.h +++ b/tools/perf/util/evlist.h @@ -129,10 +129,6 @@ struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id); void perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist, enum bkw_mmap_state state); -union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx); - -union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, -int idx); void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx); int perf_evlist__open(struct perf_evlist *evlist); diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c index 91531a7c8fbf..4f27c464ce0b 100644 --- a/tools/perf/util/mmap.c +++ b/tools/perf/util/mmap.c @@ -63,25 +63,6 @@ static union perf_event *perf_mmap__read(struct perf_mmap *map, return event; } -/* - * legacy interface for mmap read. - * Don't use it. Use perf_mmap__read_event(). - */ -union perf_event *perf_mmap__read_forward(struct perf_mmap *map) -{ - u64 head; - - /* -* Check if event was unmapped due to a POLLHUP/POLLERR. 
-*/ - if (!refcount_read(&map->refcnt)) - return NULL; - - head = perf_mmap__read_head(map); - - return perf_mmap__read(map, &map->prev, head); -} - /* * Read event from ring buffer one by one. * Return one event for each call. @@ -191,7 +172,7 @@ void perf_mmap__munmap(struct perf_mmap *map) int perf_mmap__mmap(struct perf_mmap *map, struct mmap_params *mp, int fd) { /* -* The last one will be done at perf_evlist__mmap_consume(), so that we +* The last one will be done at perf_mmap__consume(), so that we * make sure we don't prevent tools from consuming every last event in * the ring buffer. *
[tip:perf/core] perf test: Switch to new perf_mmap__read_event() interface for task-exit
Commit-ID: 759487307625cd44ac4aa241ee547b52b72bc4ad Gitweb: https://git.kernel.org/tip/759487307625cd44ac4aa241ee547b52b72bc4ad Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:10 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:51:00 -0300 perf test: Switch to new perf_mmap__read_event() interface for task-exit The perf test 'task-exit' still use the legacy interface. No functional change. Committer notes: Testing it: # perf test exit 21: Number of exit events of a simple workload: Ok # Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-13-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/task-exit.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/task-exit.c b/tools/perf/tests/task-exit.c index 01b62b81751b..02b0888b72a3 100644 --- a/tools/perf/tests/task-exit.c +++ b/tools/perf/tests/task-exit.c @@ -47,6 +47,8 @@ int test__task_exit(struct test *test __maybe_unused, int subtest __maybe_unused char sbuf[STRERR_BUFSIZE]; struct cpu_map *cpus; struct thread_map *threads; + struct perf_mmap *md; + u64 end, start; signal(SIGCHLD, sig_handler); @@ -110,13 +112,19 @@ int test__task_exit(struct test *test __maybe_unused, int subtest __maybe_unused perf_evlist__start_workload(evlist); retry: - while ((event = perf_evlist__mmap_read(evlist, 0)) != NULL) { + md = &evlist->mmap[0]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + goto out_init; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { if (event->header.type == PERF_RECORD_EXIT) nr_exit++; - perf_evlist__mmap_consume(evlist, 0); + perf_mmap__consume(md, false); } + perf_mmap__read_done(md); +out_init: if (!exited || !nr_exit) { perf_evlist__poll(evlist, -1); goto retry;
[tip:perf/core] perf test: Switch to new perf_mmap__read_event() interface for switch-tracking
Commit-ID: ee4024ff858211316c4824b16bea446f08765ae8 Gitweb: https://git.kernel.org/tip/ee4024ff858211316c4824b16bea446f08765ae8 Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:09 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:50:50 -0300 perf test: Switch to new perf_mmap__read_event() interface for switch-tracking The perf test 'switch-tracking' still use the legacy interface. No functional change. Committer testing: # perf test switch 32: Track with sched_switch : Ok # Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-12-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/switch-tracking.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/switch-tracking.c b/tools/perf/tests/switch-tracking.c index 33e00295a972..10c4dcdc2324 100644 --- a/tools/perf/tests/switch-tracking.c +++ b/tools/perf/tests/switch-tracking.c @@ -258,16 +258,23 @@ static int process_events(struct perf_evlist *evlist, unsigned pos, cnt = 0; LIST_HEAD(events); struct event_node *events_array, *node; + struct perf_mmap *md; + u64 end, start; int i, ret; for (i = 0; i < evlist->nr_mmaps; i++) { - while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) { + md = &evlist->mmap[i]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + continue; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { cnt += 1; ret = add_event(evlist, &events, event); - perf_evlist__mmap_consume(evlist, i); +perf_mmap__consume(md, false); if (ret < 0) goto out_free_nodes; } + perf_mmap__read_done(md); } events_array = calloc(cnt, sizeof(struct event_node));
[tip:perf/core] perf test: Switch to new perf_mmap__read_event() interface for sw-clock
Commit-ID: 5d0007cdfc6612788badceb276156d6ccb30b6de Gitweb: https://git.kernel.org/tip/5d0007cdfc6612788badceb276156d6ccb30b6de Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:08 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:50:37 -0300 perf test: Switch to new perf_mmap__read_event() interface for sw-clock The perf test 'sw-clock' still use the legacy interface. No functional change. Committer testing: # perf test clock 22: Software clock events period values : Ok # Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-11-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/sw-clock.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/sw-clock.c b/tools/perf/tests/sw-clock.c index f6c72f915d48..e6320e267ba5 100644 --- a/tools/perf/tests/sw-clock.c +++ b/tools/perf/tests/sw-clock.c @@ -39,6 +39,8 @@ static int __test__sw_clock_freq(enum perf_sw_ids clock_id) }; struct cpu_map *cpus; struct thread_map *threads; + struct perf_mmap *md; + u64 end, start; attr.sample_freq = 500; @@ -93,7 +95,11 @@ static int __test__sw_clock_freq(enum perf_sw_ids clock_id) perf_evlist__disable(evlist); - while ((event = perf_evlist__mmap_read(evlist, 0)) != NULL) { + md = &evlist->mmap[0]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + goto out_init; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { struct perf_sample sample; if (event->header.type != PERF_RECORD_SAMPLE) @@ -108,9 +114,11 @@ static int __test__sw_clock_freq(enum perf_sw_ids clock_id) total_periods += sample.period; nr_samples++; next_event: - perf_evlist__mmap_consume(evlist, 0); + perf_mmap__consume(md, false); } + perf_mmap__read_done(md); +out_init: if ((u64) nr_samples == 
total_periods) { pr_debug("All (%d) samples have period value of 1!\n", nr_samples);
[tip:perf/core] perf test: Switch to new perf_mmap__read_event() interface for time-to-tsc
Commit-ID: 9dfb85dfaffe6bc38f0c9f8a8622e2a7ca333e58 Gitweb: https://git.kernel.org/tip/9dfb85dfaffe6bc38f0c9f8a8622e2a7ca333e58 Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:07 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:50:23 -0300 perf test: Switch to new perf_mmap__read_event() interface for time-to-tsc The perf test 'time-to-tsc' still use the legacy interface. No functional change. Commiter notes: Testing it: # perf test tsc 57: Convert perf time to TSC : Ok # Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-10-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/x86/tests/perf-time-to-tsc.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/tools/perf/arch/x86/tests/perf-time-to-tsc.c b/tools/perf/arch/x86/tests/perf-time-to-tsc.c index 06abe8108b33..7f82d91ef473 100644 --- a/tools/perf/arch/x86/tests/perf-time-to-tsc.c +++ b/tools/perf/arch/x86/tests/perf-time-to-tsc.c @@ -60,6 +60,8 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest __maybe union perf_event *event; u64 test_tsc, comm1_tsc, comm2_tsc; u64 test_time, comm1_time = 0, comm2_time = 0; + struct perf_mmap *md; + u64 end, start; threads = thread_map__new(-1, getpid(), UINT_MAX); CHECK_NOT_NULL__(threads); @@ -109,7 +111,11 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest __maybe perf_evlist__disable(evlist); for (i = 0; i < evlist->nr_mmaps; i++) { - while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) { + md = &evlist->mmap[i]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + continue; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { struct perf_sample sample; if (event->header.type != PERF_RECORD_COMM || @@ -128,8 
+134,9 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest __maybe comm2_time = sample.time; } next_event: - perf_evlist__mmap_consume(evlist, i); + perf_mmap__consume(md, false); } + perf_mmap__read_done(md); } if (!comm1_time || !comm2_time)
[tip:perf/core] perf test: Switch to new perf_mmap__read_event() interface for perf-record
Commit-ID: 88e37a4bbe6e05fd5ad103738c542658b81e76ea Gitweb: https://git.kernel.org/tip/88e37a4bbe6e05fd5ad103738c542658b81e76ea Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:06 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:50:21 -0300 perf test: Switch to new perf_mmap__read_event() interface for perf-record The perf test 'perf-record' still use the legacy interface. No functional change. Committer notes: Testing it: # perf test PERF_RECORD 8: PERF_RECORD_* events & perf_sample fields : Ok # Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-9-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/perf-record.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c index 0afafab85238..31f3f70adca6 100644 --- a/tools/perf/tests/perf-record.c +++ b/tools/perf/tests/perf-record.c @@ -164,8 +164,14 @@ int test__PERF_RECORD(struct test *test __maybe_unused, int subtest __maybe_unus for (i = 0; i < evlist->nr_mmaps; i++) { union perf_event *event; + struct perf_mmap *md; + u64 end, start; - while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) { + md = &evlist->mmap[i]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + continue; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { const u32 type = event->header.type; const char *name = perf_event__name(type); @@ -266,8 +272,9 @@ int test__PERF_RECORD(struct test *test __maybe_unused, int subtest __maybe_unus ++errs; } - perf_evlist__mmap_consume(evlist, i); + perf_mmap__consume(md, false); } + perf_mmap__read_done(md); } /*
[tip:perf/core] perf test: Switch to new perf_mmap__read_event() interface for tp fields
Commit-ID: 1d1b5632ed0b797721a409bbed718d85384168a2 Gitweb: https://git.kernel.org/tip/1d1b5632ed0b797721a409bbed718d85384168a2 Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:05 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:49:59 -0300 perf test: Switch to new perf_mmap__read_event() interface for tp fields The perf test 'syscalls:sys_enter_openat event fields' still use the legacy interface. No functional change. Committer notes: Testing it: # perf test sys_enter_openat 15: syscalls:sys_enter_openat event fields: Ok # Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-8-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/openat-syscall-tp-fields.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/openat-syscall-tp-fields.c b/tools/perf/tests/openat-syscall-tp-fields.c index 43519267b93b..620b21023f72 100644 --- a/tools/perf/tests/openat-syscall-tp-fields.c +++ b/tools/perf/tests/openat-syscall-tp-fields.c @@ -86,8 +86,14 @@ int test__syscall_openat_tp_fields(struct test *test __maybe_unused, int subtest for (i = 0; i < evlist->nr_mmaps; i++) { union perf_event *event; + struct perf_mmap *md; + u64 end, start; - while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) { + md = &evlist->mmap[i]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + continue; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { const u32 type = event->header.type; int tp_flags; struct perf_sample sample; @@ -95,7 +101,7 @@ int test__syscall_openat_tp_fields(struct test *test __maybe_unused, int subtest ++nr_events; if (type != PERF_RECORD_SAMPLE) { - perf_evlist__mmap_consume(evlist, i); + perf_mmap__consume(md, false); continue; } @@ -115,6 +121,7 @@ 
int test__syscall_openat_tp_fields(struct test *test __maybe_unused, int subtest goto out_ok; } + perf_mmap__read_done(md); } if (nr_events == before)
[tip:perf/core] perf test: Switch to new perf_mmap__read_event() interface for mmap-basic
Commit-ID: 334f823e2ab58b3c0e58fa71321680382c5f60ff Gitweb: https://git.kernel.org/tip/334f823e2ab58b3c0e58fa71321680382c5f60ff Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:04 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:49:37 -0300 perf test: Switch to new perf_mmap__read_event() interface for mmap-basic The perf test 'mmap-basic' still use the legacy interface. No functional change. Committer notes: Testing it: # perf test "mmap interface" 4: Read samples using the mmap interface : Ok # Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-7-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/mmap-basic.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c index c0e971da965c..44c58d69cd87 100644 --- a/tools/perf/tests/mmap-basic.c +++ b/tools/perf/tests/mmap-basic.c @@ -38,6 +38,8 @@ int test__basic_mmap(struct test *test __maybe_unused, int subtest __maybe_unuse expected_nr_events[nsyscalls], i, j; struct perf_evsel *evsels[nsyscalls], *evsel; char sbuf[STRERR_BUFSIZE]; + struct perf_mmap *md; + u64 end, start; threads = thread_map__new(-1, getpid(), UINT_MAX); if (threads == NULL) { @@ -106,7 +108,11 @@ int test__basic_mmap(struct test *test __maybe_unused, int subtest __maybe_unuse ++foo; } - while ((event = perf_evlist__mmap_read(evlist, 0)) != NULL) { + md = &evlist->mmap[0]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + goto out_init; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { struct perf_sample sample; if (event->header.type != PERF_RECORD_SAMPLE) { @@ -129,9 +135,11 @@ int test__basic_mmap(struct test *test __maybe_unused, int subtest __maybe_unuse goto 
out_delete_evlist; } nr_events[evsel->idx]++; - perf_evlist__mmap_consume(evlist, 0); + perf_mmap__consume(md, false); } + perf_mmap__read_done(md); +out_init: err = 0; evlist__for_each_entry(evlist, evsel) { if (nr_events[evsel->idx] != expected_nr_events[evsel->idx]) {
[tip:perf/core] perf test: Switch to new perf_mmap__read_event() interface for "keep tracking" test
Commit-ID: 693d32aebf857ef1d1803b08ef1b631990ae3747 Gitweb: https://git.kernel.org/tip/693d32aebf857ef1d1803b08ef1b631990ae3747 Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:03 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:49:01 -0300 perf test: Switch to new perf_mmap__read_event() interface for "keep tracking" test The perf test 'keep tracking' still use the legacy interface. No functional change. Committer testing: # perf test tracking 25: Use a dummy software event to keep tracking : Ok # Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-6-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/keep-tracking.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/keep-tracking.c b/tools/perf/tests/keep-tracking.c index c46530918938..4590d8fb91ab 100644 --- a/tools/perf/tests/keep-tracking.c +++ b/tools/perf/tests/keep-tracking.c @@ -27,18 +27,24 @@ static int find_comm(struct perf_evlist *evlist, const char *comm) { union perf_event *event; + struct perf_mmap *md; + u64 end, start; int i, found; found = 0; for (i = 0; i < evlist->nr_mmaps; i++) { - while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) { + md = &evlist->mmap[i]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + continue; + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { if (event->header.type == PERF_RECORD_COMM && (pid_t)event->comm.pid == getpid() && (pid_t)event->comm.tid == getpid() && strcmp(event->comm.comm, comm) == 0) found += 1; - perf_evlist__mmap_consume(evlist, i); + perf_mmap__consume(md, false); } + perf_mmap__read_done(md); } return found; }
[tip:perf/core] perf test: Switch to new perf_mmap__read_event() interface for 'code reading' test
Commit-ID: 00fc2460e735fa0f6add802c7426273e7dbc2b27 Gitweb: https://git.kernel.org/tip/00fc2460e735fa0f6add802c7426273e7dbc2b27 Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:02 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:48:36 -0300 perf test: Switch to new perf_mmap__read_event() interface for 'code reading' test The perf test 'object code reading' still use the legacy interface. No functional change. Committer notes: Testing: # perf test reading 23: Object code reading: Ok # Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-5-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/code-reading.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c index c7115d369511..03ed8c77b1bb 100644 --- a/tools/perf/tests/code-reading.c +++ b/tools/perf/tests/code-reading.c @@ -409,15 +409,22 @@ static int process_events(struct machine *machine, struct perf_evlist *evlist, struct state *state) { union perf_event *event; + struct perf_mmap *md; + u64 end, start; int i, ret; for (i = 0; i < evlist->nr_mmaps; i++) { - while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) { + md = &evlist->mmap[i]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + continue; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { ret = process_event(machine, evlist, event, state); - perf_evlist__mmap_consume(evlist, i); + perf_mmap__consume(md, false); if (ret < 0) return ret; } + perf_mmap__read_done(md); } return 0; }
[tip:perf/core] perf python: Switch to new perf_mmap__read_event() interface
Commit-ID: 35b7cdc6379ea8300161f0f80fe8aad083a1c5d0 Gitweb: https://git.kernel.org/tip/35b7cdc6379ea8300161f0f80fe8aad083a1c5d0 Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:00 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:47:07 -0300 perf python: Switch to new perf_mmap__read_event() interface The perf python binding still use the legacy interface. No functional change. Committer notes: Tested before and after with: [root@jouet perf]# export PYTHONPATH=/tmp/build/perf/python [root@jouet perf]# tools/perf/python/twatch.py cpu: 0, pid: 1183, tid: 6293 { type: exit, pid: 1183, ppid: 1183, tid: 6293, ptid: 6293, time: 17886646588257} cpu: 2, pid: 13820, tid: 13820 { type: fork, pid: 13820, ppid: 13820, tid: 6306, ptid: 13820, time: 17886869099529} cpu: 1, pid: 13820, tid: 6306 { type: comm, pid: 13820, tid: 6306, comm: TaskSchedulerFo } ^CTraceback (most recent call last): File "tools/perf/python/twatch.py", line 68, in main() File "tools/perf/python/twatch.py", line 40, in main evlist.poll(timeout = -1) KeyboardInterrupt [root@jouet perf]# No problems found. 
Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-3-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/python.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c index 2918cac7a142..35fb5ef7d290 100644 --- a/tools/perf/util/python.c +++ b/tools/perf/util/python.c @@ -983,13 +983,19 @@ static PyObject *pyrf_evlist__read_on_cpu(struct pyrf_evlist *pevlist, union perf_event *event; int sample_id_all = 1, cpu; static char *kwlist[] = { "cpu", "sample_id_all", NULL }; + struct perf_mmap *md; + u64 end, start; int err; if (!PyArg_ParseTupleAndKeywords(args, kwargs, "i|i", kwlist, &cpu, &sample_id_all)) return NULL; - event = perf_evlist__mmap_read(evlist, cpu); + md = &evlist->mmap[cpu]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + goto end; + + event = perf_mmap__read_event(md, false, &start, end); if (event != NULL) { PyObject *pyevent = pyrf_event__new(event); struct pyrf_event *pevent = (struct pyrf_event *)pyevent; @@ -1007,14 +1013,14 @@ static PyObject *pyrf_evlist__read_on_cpu(struct pyrf_evlist *pevlist, err = perf_evsel__parse_sample(evsel, event, &pevent->sample); /* Consume the even only after we parsed it out. */ - perf_evlist__mmap_consume(evlist, cpu); + perf_mmap__consume(md, false); if (err) return PyErr_Format(PyExc_OSError, "perf: can't parse sample, err=%d", err); return pyevent; } - +end: Py_INCREF(Py_None); return Py_None; }
[tip:perf/core] perf test: Switch to new perf_mmap__read_event() interface for bpf
Commit-ID: 2f54f3a4733c0cd857992d793af5e8321b281012 Gitweb: https://git.kernel.org/tip/2f54f3a4733c0cd857992d793af5e8321b281012 Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:09:01 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:47:54 -0300 perf test: Switch to new perf_mmap__read_event() interface for bpf The perf test 'bpf' still use the legacy interface. No functional change. Committer notes: Tested with: # perf test bpf 39: BPF filter: 39.1: Basic BPF filtering : Ok 39.2: BPF pinning : Ok 39.3: BPF prologue generation : Ok 39.4: BPF relocation checker : Ok # Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-4-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/bpf.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c index e8399beca62b..09c9c9f9e827 100644 --- a/tools/perf/tests/bpf.c +++ b/tools/perf/tests/bpf.c @@ -176,13 +176,20 @@ static int do_test(struct bpf_object *obj, int (*func)(void), for (i = 0; i < evlist->nr_mmaps; i++) { union perf_event *event; + struct perf_mmap *md; + u64 end, start; - while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) { + md = &evlist->mmap[i]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + continue; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { const u32 type = event->header.type; if (type == PERF_RECORD_SAMPLE) count ++; } + perf_mmap__read_done(md); } if (count != expect) {
[tip:perf/core] perf trace: Switch to new perf_mmap__read_event() interface
Commit-ID: d7f55c62e63461c4071afe8730851e406935d960 Gitweb: https://git.kernel.org/tip/d7f55c62e63461c4071afe8730851e406935d960 Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:08:59 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:41:59 -0300 perf trace: Switch to new perf_mmap__read_event() interface The 'perf trace' utility still uses the legacy interface. Switch to the new perf_mmap__read_event() interface. No functional change. Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-2-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-trace.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index e7f1b182fc15..1a93debc1e8d 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -2472,8 +2472,14 @@ again: for (i = 0; i < evlist->nr_mmaps; i++) { union perf_event *event; + struct perf_mmap *md; + u64 end, start; - while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) { + md = &evlist->mmap[i]; + if (perf_mmap__read_init(md, false, &start, &end) < 0) + continue; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { struct perf_sample sample; ++trace->nr_events; @@ -2486,7 +2492,7 @@ again: trace__handle_event(trace, event, &sample); next_event: - perf_evlist__mmap_consume(evlist, i); + perf_mmap__consume(md, false); if (interrupted) goto out_disable; @@ -2496,6 +2502,7 @@ next_event: draining = true; } } + perf_mmap__read_done(md); } if (trace->nr_events == before) {
[tip:perf/core] perf record: Fix crash in pipe mode
Commit-ID: ad46e48c65fa1f204fa29eaff1b91174d314a94b Gitweb: https://git.kernel.org/tip/ad46e48c65fa1f204fa29eaff1b91174d314a94b Author: Jiri Olsa AuthorDate: Fri, 2 Mar 2018 17:13:54 +0100 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 09:58:45 -0300 perf record: Fix crash in pipe mode Currently we can crash perf record when running in pipe mode, like: $ perf record ls | perf report # To display the perf.data header info, please use --header/--header-only options. # perf: Segmentation fault Error: The - file has no samples! The callstack of the crash is: 0x00515242 in perf_event__synthesize_event_update_name 3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]); (gdb) bt #0 0x00515242 in perf_event__synthesize_event_update_name #1 0x005158a4 in perf_event__synthesize_extra_attr #2 0x00443347 in record__synthesize #3 0x004438e3 in __cmd_record #4 0x0044514e in cmd_record #5 0x004cbc95 in run_builtin #6 0x004cbf02 in handle_internal_command #7 0x004cc054 in run_argv #8 0x004cc422 in main The reason for the crash is that the evsel does not have an ids array allocated and the pipe's synthesize code tries to access it. We don't force evsel ids allocation when we have a single event, because it's not needed. However, we need it in pipe mode even for a single event, as a key for the evsel update event. Fix this by forcing evsel ids allocation even for a single event when we are in pipe mode. 
Signed-off-by: Jiri Olsa Cc: Alexander Shishkin Cc: David Ahern Cc: Namhyung Kim Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20180302161354.30192-1-jo...@kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-record.c | 9 + tools/perf/perf.h | 1 + tools/perf/util/record.c| 8 ++-- 3 files changed, 16 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 62387942a1d5..12230ddb6506 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -882,6 +882,15 @@ static int __cmd_record(struct record *rec, int argc, const char **argv) } } + /* +* If we have just single event and are sending data +* through pipe, we need to force the ids allocation, +* because we synthesize event name through the pipe +* and need the id for that. +*/ + if (data->is_pipe && rec->evlist->nr_entries == 1) + rec->opts.sample_id = true; + if (record__open(rec) != 0) { err = -1; goto out_child; diff --git a/tools/perf/perf.h b/tools/perf/perf.h index 007e0dfd5ce3..8fec1abd0f1f 100644 --- a/tools/perf/perf.h +++ b/tools/perf/perf.h @@ -62,6 +62,7 @@ struct record_opts { bool overwrite; bool ignore_missing_thread; bool strict_freq; + bool sample_id; unsigned int freq; unsigned int mmap_pages; unsigned int auxtrace_mmap_pages; diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c index 4f1a82e76d39..9cfc7bf16531 100644 --- a/tools/perf/util/record.c +++ b/tools/perf/util/record.c @@ -138,6 +138,7 @@ void perf_evlist__config(struct perf_evlist *evlist, struct record_opts *opts, struct perf_evsel *evsel; bool use_sample_identifier = false; bool use_comm_exec; + bool sample_id = opts->sample_id; /* * Set the evsel leader links before we configure attributes, @@ -164,8 +165,7 @@ void perf_evlist__config(struct perf_evlist *evlist, struct record_opts *opts, * match the id. 
*/ use_sample_identifier = perf_can_sample_identifier(); - evlist__for_each_entry(evlist, evsel) - perf_evsel__set_sample_id(evsel, use_sample_identifier); + sample_id = true; } else if (evlist->nr_entries > 1) { struct perf_evsel *first = perf_evlist__first(evlist); @@ -175,6 +175,10 @@ void perf_evlist__config(struct perf_evlist *evlist, struct record_opts *opts, use_sample_identifier = perf_can_sample_identifier(); break; } + sample_id = true; + } + + if (sample_id) { evlist__for_each_entry(evlist, evsel) perf_evsel__set_sample_id(evsel, use_sample_identifier); }
[tip:perf/core] perf kvm: Switch to new perf_mmap__read_event() interface
Commit-ID: 53172f9057e92c9b27f0bbf2a46827d87f12b0d2 Gitweb: https://git.kernel.org/tip/53172f9057e92c9b27f0bbf2a46827d87f12b0d2 Author: Kan Liang AuthorDate: Thu, 1 Mar 2018 18:08:58 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 10:41:36 -0300 perf kvm: Switch to new perf_mmap__read_event() interface 'perf kvm' still uses the legacy interface. Switch to the new perf_mmap__read_event() interface for 'perf kvm'. No functional change. Committer notes: Tested before and after running: # perf kvm stat record On a machine with a kvm guest, then used: # perf kvm stat report Before/after results match and look like: # perf kvm stat record -a sleep 5 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.132 MB perf.data.guest (1828 samples) ] # perf kvm stat report Analyze events for all VMs, all VCPUs: VM-EXIT Samples Samples% Time% Min Time Max Time Avg time IO_INSTRUCTION 258 40.06% 0.08% 3.51us 122.54us 14.87us (+- 6.76%) MSR_WRITE 178 27.64% 0.01% 0.47us 6.34us 2.18us (+- 4.80%) EPT_MISCONFIG 148 22.98% 0.03% 3.76us 65.60us 11.22us (+- 8.14%) HLT 47 7.30% 99.88% 181.69us 249988.06us 102061.36us (+- 13.49%) PAUSE_INSTRUCTION 5 0.78% 0.00% 0.38us 0.79us 0.47us (+- 17.05%) MSR_READ 4 0.62% 0.00% 1.14us 3.33us 2.67us (+- 19.35%) EXTERNAL_INTERRUPT 2 0.31% 0.00% 2.15us 2.17us 2.16us (+- 0.30%) PENDING_INTERRUPT 1 0.16% 0.00% 2.56us 2.56us 2.56us (+- 0.00%) PREEMPTION_TIMER 1 0.16% 0.00% 3.21us 3.21us 3.21us (+- 0.00%) Total Samples: 644, Total events handled time: 4802790.72us. 
# Signed-off-by: Kan Liang Tested-by: Arnaldo Carvalho de Melo Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: http://lkml.kernel.org/r/1519945751-37786-1-git-send-email-kan.li...@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-kvm.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c index 55d919dc5bc6..d2703d3b8366 100644 --- a/tools/perf/builtin-kvm.c +++ b/tools/perf/builtin-kvm.c @@ -743,16 +743,24 @@ static bool verify_vcpu(int vcpu) static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx, u64 *mmap_time) { + struct perf_evlist *evlist = kvm->evlist; union perf_event *event; + struct perf_mmap *md; + u64 end, start; u64 timestamp; s64 n = 0; int err; *mmap_time = ULLONG_MAX; - while ((event = perf_evlist__mmap_read(kvm->evlist, idx)) != NULL) { - err = perf_evlist__parse_sample_timestamp(kvm->evlist, event, ×tamp); + md = &evlist->mmap[idx]; + err = perf_mmap__read_init(md, false, &start, &end); + if (err < 0) + return (err == -EAGAIN) ? 0 : -1; + + while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) { + err = perf_evlist__parse_sample_timestamp(evlist, event, ×tamp); if (err) { - perf_evlist__mmap_consume(kvm->evlist, idx); + perf_mmap__consume(md, false); pr_err("Failed to parse sample\n"); return -1; } @@ -762,7 +770,7 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx, * FIXME: Here we can't consume the event, as perf_session__queue_event will *point to it, and it'll get possibly overwritten by the kernel. */ - perf_evlist__mmap_consume(kvm->evlist, idx); + perf_mmap__consume(md, false); if (err) { pr_err("Failed to enqueue sample: %d\n", err); @@ -779,6 +787,7 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx, break; } + perf_mmap__read_done(md); return n; }
[tip:perf/core] perf annotate: Find 'call' instruction target symbol at parsing time
Commit-ID: 696703af37a28892db89ff6a6d0cdfde6fb803ab Gitweb: https://git.kernel.org/tip/696703af37a28892db89ff6a6d0cdfde6fb803ab Author: Arnaldo Carvalho de Melo AuthorDate: Fri, 2 Mar 2018 11:59:36 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 09:58:45 -0300 perf annotate: Find 'call' instruction target symbol at parsing time So that we do it just once, not everytime we press enter or -> on a 'call' instruction line. Cc: Adrian Hunter Cc: David Ahern Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-uysyojl1e6nm94amzzzs0...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/ui/browsers/annotate.c | 17 + tools/perf/util/annotate.c| 38 +- tools/perf/util/annotate.h| 1 + 3 files changed, 27 insertions(+), 29 deletions(-) diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index 6ff6839558b0..618edf96353c 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -568,35 +568,28 @@ static bool annotate_browser__callq(struct annotate_browser *browser, struct map_symbol *ms = browser->b.priv; struct disasm_line *dl = disasm_line(browser->selection); struct annotation *notes; - struct addr_map_symbol target = { - .map = ms->map, - .addr = map__objdump_2mem(ms->map, dl->ops.target.addr), - }; char title[SYM_TITLE_MAX_SIZE]; if (!ins__is_call(&dl->ins)) return false; - if (map_groups__find_ams(&target) || - map__rip_2objdump(target.map, target.map->map_ip(target.map, -target.addr)) != - dl->ops.target.addr) { + if (!dl->ops.target.sym) { ui_helpline__puts("The called function was not found."); return true; } - notes = symbol__annotation(target.sym); + notes = symbol__annotation(dl->ops.target.sym); pthread_mutex_lock(¬es->lock); - if (notes->src == NULL && symbol__alloc_hist(target.sym) < 0) { + if (notes->src == NULL && symbol__alloc_hist(dl->ops.target.sym) < 0) { pthread_mutex_unlock(¬es->lock); ui__warning("Not enough memory for 
annotating '%s' symbol!\n", - target.sym->name); + dl->ops.target.sym->name); return true; } pthread_mutex_unlock(¬es->lock); - symbol__tui_annotate(target.sym, target.map, evsel, hbt); + symbol__tui_annotate(dl->ops.target.sym, ms->map, evsel, hbt); sym_title(ms->sym, ms->map, title, sizeof(title)); ui_browser__show_title(&browser->b, title); return true; diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 28b233c3dcbe..49ff825f745c 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -187,6 +187,9 @@ bool ins__is_fused(struct arch *arch, const char *ins1, const char *ins2) static int call__parse(struct arch *arch, struct ins_operands *ops, struct map *map) { char *endptr, *tok, *name; + struct addr_map_symbol target = { + .map = map, + }; ops->target.addr = strtoull(ops->raw, &endptr, 16); @@ -208,28 +211,29 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map * ops->target.name = strdup(name); *tok = '>'; - return ops->target.name == NULL ? 
-1 : 0; + if (ops->target.name == NULL) + return -1; +find_target: + target.addr = map__objdump_2mem(map, ops->target.addr); -indirect_call: - tok = strchr(endptr, '*'); - if (tok == NULL) { - struct symbol *sym = map__find_symbol(map, map->map_ip(map, ops->target.addr)); - if (sym != NULL) - ops->target.name = strdup(sym->name); - else - ops->target.addr = 0; - return 0; - } + if (map_groups__find_ams(&target) == 0 && + map__rip_2objdump(target.map, map->map_ip(target.map, target.addr)) == ops->target.addr) + ops->target.sym = target.sym; - ops->target.addr = strtoull(tok + 1, NULL, 16); return 0; + +indirect_call: + tok = strchr(endptr, '*'); + if (tok != NULL) + ops->target.addr = strtoull(tok + 1, NULL, 16); + goto find_target; } static int call__scnprintf(struct ins *ins, char *bf, size_t size, struct ins_operands *ops) { - if (ops->target.name) - return scnprintf(bf, size, "%-6s %s", ins->name, ops->target.name); + if (ops->target.sym) + return scnprintf(bf, size, "%-6s %s", ins->name, ops->target.sym->name); if (ops->target.addr == 0)
[tip:perf/core] perf record: Throttle user defined frequencies to the maximum allowed
Commit-ID: b09c2364a4dc2a67e640c2b839d936302815693f Gitweb: https://git.kernel.org/tip/b09c2364a4dc2a67e640c2b839d936302815693f Author: Arnaldo Carvalho de Melo AuthorDate: Thu, 1 Mar 2018 14:52:50 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 09:58:44 -0300 perf record: Throttle user defined frequencies to the maximum allowed # perf record -F 20 sleep 1 warning: Maximum frequency rate (15,000 Hz) exceeded, throttling from 200,000 Hz to 15,000 Hz. The limit can be raised via /proc/sys/kernel/perf_event_max_sample_rate. The kernel will lower it when perf's interrupts take too long. Use --strict-freq to disable this throttling, refusing to record. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (15 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 For those wanting that it fails if the desired frequency can't be used: # perf record --strict-freq -F 20 sleep 1 error: Maximum frequency rate (15,000 Hz) exceeded. Please use -F freq option with a lower value or consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. 
# Suggested-by: Ingo Molnar Cc: Adrian Hunter Cc: David Ahern Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-oyebruc44nlja499nqkr1...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/Documentation/perf-record.txt | 7 ++- tools/perf/builtin-record.c | 2 ++ tools/perf/perf.h| 1 + tools/perf/util/record.c | 20 +++- 4 files changed, 24 insertions(+), 6 deletions(-) diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index 94f2faebc7f0..cc37b3a4be76 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -191,11 +191,16 @@ OPTIONS -i:: --no-inherit:: Child tasks do not inherit counters. + -F:: --freq=:: Profile at this frequency. Use 'max' to use the currently maximum allowed frequency, i.e. the value in the kernel.perf_event_max_sample_rate - sysctl. + sysctl. Will throttle down to the currently maximum allowed frequency. + See --strict-freq. + +--strict-freq:: + Fail if the specified frequency can't be used. 
-m:: --mmap-pages=:: diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index e1821eea14ef..62387942a1d5 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -1543,6 +1543,8 @@ static struct option __record_options[] = { OPT_BOOLEAN(0, "tail-synthesize", &record.opts.tail_synthesize, "synthesize non-sample events at the end of output"), OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"), + OPT_BOOLEAN(0, "strict-freq", &record.opts.strict_freq, + "Fail if the specified frequency can't be used"), OPT_CALLBACK('F', "freq", &record.opts, "freq or 'max'", "profile at this frequency", record__parse_freq), diff --git a/tools/perf/perf.h b/tools/perf/perf.h index a5df8bf73a68..007e0dfd5ce3 100644 --- a/tools/perf/perf.h +++ b/tools/perf/perf.h @@ -61,6 +61,7 @@ struct record_opts { bool tail_synthesize; bool overwrite; bool ignore_missing_thread; + bool strict_freq; unsigned int freq; unsigned int mmap_pages; unsigned int auxtrace_mmap_pages; diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c index acabf54ceccb..4f1a82e76d39 100644 --- a/tools/perf/util/record.c +++ b/tools/perf/util/record.c @@ -216,11 +216,21 @@ static int record_opts__config_freq(struct record_opts *opts) * User specified frequency is over current maximum. */ if (user_freq && (max_rate < opts->freq)) { - pr_err("Maximum frequency rate (%u) reached.\n" - "Please use -F freq option with lower value or consider\n" - "tweaking /proc/sys/kernel/perf_event_max_sample_rate.\n", - max_rate); - return -1; + if (opts->strict_freq) { + pr_err("error: Maximum frequency rate (%'u Hz) exceeded.\n" + " Please use -F freq option with a lower value or consider\n" + " tweaking /proc/sys/kernel/perf_event_max_sample_rate.\n", + max_rate); + return -1; + } else { + pr_warning("warning: Maximum f
[tip:perf/core] perf top browser: Show sample_freq in browser title line
Commit-ID: a9980a6dbb9efd154b032ad729c869784302f361 Gitweb: https://git.kernel.org/tip/a9980a6dbb9efd154b032ad729c869784302f361 Author: Arnaldo Carvalho de Melo AuthorDate: Thu, 1 Mar 2018 14:22:12 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 09:58:43 -0300 perf top browser: Show sample_freq in browser title line The '--stdio' 'perf top' UI shows it, so lets remove this UI difference and show it too in '--tui', will be useful for 'perf top --tui -F max'. Cc: Adrian Hunter Cc: David Ahern Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-n3wd8n395uo4y9irst29p...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/ui/browsers/hists.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c index 6495ee55d9c3..de2bde232cb3 100644 --- a/tools/perf/ui/browsers/hists.c +++ b/tools/perf/ui/browsers/hists.c @@ -2223,7 +2223,7 @@ static int perf_evsel_browser_title(struct hist_browser *browser, u64 nr_events = hists->stats.total_period; struct perf_evsel *evsel = hists_to_evsel(hists); const char *ev_name = perf_evsel__name(evsel); - char buf[512]; + char buf[512], sample_freq_str[64] = ""; size_t buflen = sizeof(buf); char ref[30] = " show reference callgraph, "; bool enable_ref = false; @@ -2255,10 +2255,14 @@ static int perf_evsel_browser_title(struct hist_browser *browser, if (symbol_conf.show_ref_callgraph && strstr(ev_name, "call-graph=no")) enable_ref = true; + + if (!is_report_browser(hbt)) + scnprintf(sample_freq_str, sizeof(sample_freq_str), " %d Hz,", evsel->attr.sample_freq); + nr_samples = convert_unit(nr_samples, &unit); printed = scnprintf(bf, size, - "Samples: %lu%c of event '%s',%sEvent count (approx.): %" PRIu64, - nr_samples, unit, ev_name, enable_ref ? ref : " ", nr_events); + "Samples: %lu%c of event '%s',%s%sEvent count (approx.): %" PRIu64, + nr_samples, unit, ev_name, sample_freq_str, enable_ref ? 
ref : " ", nr_events); if (hists->uid_filter_str)
[tip:perf/core] perf top: Allow asking for the maximum allowed sample rate
Commit-ID: 7831bf236505bcb2a0a1255e7f3e902a0cb732d6 Gitweb: https://git.kernel.org/tip/7831bf236505bcb2a0a1255e7f3e902a0cb732d6 Author: Arnaldo Carvalho de Melo AuthorDate: Thu, 1 Mar 2018 14:25:56 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 09:58:44 -0300 perf top: Allow asking for the maximum allowed sample rate Add the handy '-F max' shortcut, just introduced to 'perf record', to reading and using the kernel.perf_event_max_sample_rate value as the user supplied sampling frequency: Cc: Adrian Hunter Cc: David Ahern Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-hz04f296zccknnb5at06a...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/Documentation/perf-top.txt | 4 +++- tools/perf/builtin-top.c | 4 +++- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt index 8a32cc77bead..a039407d63b8 100644 --- a/tools/perf/Documentation/perf-top.txt +++ b/tools/perf/Documentation/perf-top.txt @@ -55,7 +55,9 @@ Default is to monitor all CPUS. -F :: --freq=:: - Profile at this frequency. + Profile at this frequency. Use 'max' to use the currently maximum + allowed frequency, i.e. the value in the kernel.perf_event_max_sample_rate + sysctl. 
-i:: --inherit:: diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index 35ac016fcb98..bb4f9fafd11d 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -1307,7 +1307,9 @@ int cmd_top(int argc, const char **argv) OPT_STRING(0, "sym-annotate", &top.sym_filter, "symbol name", "symbol to annotate"), OPT_BOOLEAN('z', "zero", &top.zero, "zero history across updates"), - OPT_UINTEGER('F', "freq", &opts->user_freq, "profile at this frequency"), + OPT_CALLBACK('F', "freq", &top.record_opts, "freq or 'max'", +"profile at this frequency", + record__parse_freq), OPT_INTEGER('E', "entries", &top.print_entries, "display this many functions"), OPT_BOOLEAN('U', "hide_user_symbols", &top.hide_user_symbols,
[tip:perf/core] perf record: Allow asking for the maximum allowed sample rate
Commit-ID: 67230479b2304be99e9451ee171aa288a112ea16 Gitweb: https://git.kernel.org/tip/67230479b2304be99e9451ee171aa288a112ea16 Author: Arnaldo Carvalho de Melo AuthorDate: Thu, 1 Mar 2018 13:46:23 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 09:58:43 -0300 perf record: Allow asking for the maximum allowed sample rate Add the handy '-F max' shortcut to reading and using the kernel.perf_event_max_sample_rate value as the user supplied sampling frequency: # perf record -F max sleep 1 info: Using a maximum frequency rate of 15,000 Hz [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (14 samples) ] # sysctl kernel.perf_event_max_sample_rate kernel.perf_event_max_sample_rate = 15000 # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # perf record -F 10 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (4 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 10, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # Suggested-by: Ingo Molnar Cc: Adrian Hunter Cc: David Ahern Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-4y0tiuws62c64gp4cf0hm...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/Documentation/perf-record.txt | 4 +++- tools/perf/builtin-record.c | 7 ++- tools/perf/perf.h| 2 ++ tools/perf/util/record.c | 23 +++ 4 files changed, 34 insertions(+), 2 deletions(-) diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index 76bc2181d214..94f2faebc7f0 100644 --- 
a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -193,7 +193,9 @@ OPTIONS Child tasks do not inherit counters. -F:: --freq=:: - Profile at this frequency. + Profile at this frequency. Use 'max' to use the currently maximum + allowed frequency, i.e. the value in the kernel.perf_event_max_sample_rate + sysctl. -m:: --mmap-pages=:: diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 907267206973..e1821eea14ef 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -45,6 +45,7 @@ #include #include +#include #include #include #include @@ -1542,7 +1543,9 @@ static struct option __record_options[] = { OPT_BOOLEAN(0, "tail-synthesize", &record.opts.tail_synthesize, "synthesize non-sample events at the end of output"), OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"), - OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"), + OPT_CALLBACK('F', "freq", &record.opts, "freq or 'max'", +"profile at this frequency", + record__parse_freq), OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]", "number of mmap data pages and AUX area tracing mmap pages", record__parse_mmap_pages), @@ -1651,6 +1654,8 @@ int cmd_record(int argc, const char **argv) struct record *rec = &record; char errbuf[BUFSIZ]; + setlocale(LC_ALL, ""); + #ifndef HAVE_LIBBPF_SUPPORT # define set_nobuild(s, l, c) set_option_nobuild(record_options, s, l, "NO_LIBBPF=1", c) set_nobuild('\0', "clang-path", true); diff --git a/tools/perf/perf.h b/tools/perf/perf.h index cfe46236a5e5..a5df8bf73a68 100644 --- a/tools/perf/perf.h +++ b/tools/perf/perf.h @@ -82,4 +82,6 @@ struct record_opts { struct option; extern const char * const *record_usage; extern struct option *record_options; + +int record__parse_freq(const struct option *opt, const char *str, int unset); #endif diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c index 1e97937b03a9..acabf54ceccb 100644 --- 
a/tools/perf/util/record.c +++ b/tools/perf/util/record.c @@ -5,6 +5,7 @@ #include "parse-events.h" #include #include +#include #include "util.h" #include "cloexec.h" @@ -287,3 +288,25 @@ out_delete: perf_evlist__delete(temp_evlist); return ret; } + +int record__parse_freq(const struct option *opt, const char *str, int unset __maybe_unused) +{ + unsigned int freq; + struct record_opts *opts = opt->value; + + if (!str) + return -EINVAL; + + if (strcasecmp(str, "max") == 0) { + if (get_max_rate(&freq)) { + pr_err("couldn
[PATCH] perf stat: fix cvs output format
From: Ilya Pronin When printing stats in CSV mode, perf stat appends extra CSV separators when a counter is not supported: <not supported>,,L1-dcache-store-misses,mesos/bd442f34-2b4a-47df-b966-9b281f9f56fc,0,100.00 which breaks field parsing. With this fix the number of separators is the same for each line, whether the counter is supported or not. Fixes: 92a61f6412d3 ("perf stat: Implement CSV metrics output") Cc: Andi Kleen Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Signed-off-by: Ilya Pronin Signed-off-by: Cong Wang --- tools/perf/builtin-stat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 98bf9d32f222..54a4c152edb3 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -917,7 +917,7 @@ static void print_metric_csv(void *ctx, char buf[64], *vals, *ends; if (unit == NULL || fmt == NULL) { - fprintf(out, "%s%s%s%s", csv_sep, csv_sep, csv_sep, csv_sep); + fprintf(out, "%s%s", csv_sep, csv_sep); return; } snprintf(buf, sizeof(buf), fmt, val); -- 2.13.0
[tip:perf/core] perf tests: Switch trace+probe_libc_inet_pton to use record
Commit-ID: a18ee796f8af5569628c324700b9a34b4488 Gitweb: https://git.kernel.org/tip/a18ee796f8af5569628c324700b9a34b4488 Author: Jiri Olsa AuthorDate: Thu, 1 Mar 2018 17:52:14 +0100 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 09:58:42 -0300 perf tests: Switch trace+probe_libc_inet_pton to use record There's a problem with relying on backtrace data from 'perf trace' the way the trace+probe_libc_inet_pton does. This test inserts uprobe within ping binary and checks that it gets its sample using 'perf trace'. It also checks it gets proper backtrace from sample and that's where the issue is. The 'perf trace' does not sort events (by definition) so it can happen that it processes the event sample before the ping binary memory map event. This can (very rarely) happen as proved by this events dump output (from custom added debug output): ... 7680/7680: [0x7f4e29718000(0x204000) @ 0 fd:00 33611321 4230892504]: r-xp /usr/lib64/libdl-2.17.so 7680/7680: [0x7f4e29502000(0x216000) @ 0 fd:00 33617257 2606846872]: r-xp /usr/lib64/libz.so.1.2.7 (IP, 0x2): 7680/7680: 0x7f4e29c2ed60 period: 1 addr: 0 7680/7680: [0x564842ef(0x233000) @ 0 fd:00 83 1989280200]: r-xp /usr/bin/ping 7680/7680: [0x7f4e2aca2000(0x224000) @ 0 fd:00 33611308 1219144940]: r-xp /usr/lib64/ld-2.17.so ... In this case 'perf trace' fails to resolve the last callchain IP (within the ping binary) because it does not know about the ping binary memory map yet and the test fails like this: PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.037 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.037/0.037/0.037/0.000 ms 0.000 probe_libc:inet_pton:(7f4e29c2ed60)) __GI___inet_pton (/usr/lib64/libc-2.17.so) getaddrinfo (/usr/lib64/libc-2.17.so) [0] ([unknown]) FAIL: expected backtrace entry 8 ".*\(.*/bin/ping.*\)$" got "[0] ([unknown])" Switching the test to use 'perf record' and 'perf script' instead of 'perf trace'. 
Signed-off-by: Jiri Olsa Tested-by: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: David Ahern Cc: Namhyung Kim Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20180301165215.6780-1-jo...@kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- .../perf/tests/shell/trace+probe_libc_inet_pton.sh | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh b/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh index 8c4ab0b390c0..52c3ee701a89 100755 --- a/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh +++ b/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh @@ -15,30 +15,28 @@ nm -g $libc 2>/dev/null | fgrep -q inet_pton || exit 254 trace_libc_inet_pton_backtrace() { idx=0 - expected[0]="PING.*bytes" - expected[1]="64 bytes from ::1.*" - expected[2]=".*ping statistics.*" - expected[3]=".*packets transmitted.*" - expected[4]="rtt min.*" - expected[5]="[0-9]+\.[0-9]+[[:space:]]+probe_libc:inet_pton:\([[:xdigit:]]+\)" - expected[6]=".*inet_pton[[:space:]]\($libc|inlined\)$" + expected[0]="ping[][0-9 \.:]+probe_libc:inet_pton: \([[:xdigit:]]+\)" + expected[1]=".*inet_pton[[:space:]]\($libc\)$" case "$(uname -m)" in s390x) eventattr='call-graph=dwarf' - expected[7]="gaih_inet.*[[:space:]]\($libc|inlined\)$" - expected[8]="__GI_getaddrinfo[[:space:]]\($libc|inlined\)$" - expected[9]="main[[:space:]]\(.*/bin/ping.*\)$" - expected[10]="__libc_start_main[[:space:]]\($libc\)$" - expected[11]="_start[[:space:]]\(.*/bin/ping.*\)$" + expected[2]="gaih_inet.*[[:space:]]\($libc|inlined\)$" + expected[3]="__GI_getaddrinfo[[:space:]]\($libc|inlined\)$" + expected[4]="main[[:space:]]\(.*/bin/ping.*\)$" + expected[5]="__libc_start_main[[:space:]]\($libc\)$" + expected[6]="_start[[:space:]]\(.*/bin/ping.*\)$" ;; *) eventattr='max-stack=3' - expected[7]="getaddrinfo[[:space:]]\($libc\)$" - expected[8]=".*\(.*/bin/ping.*\)$" + expected[2]="getaddrinfo[[:space:]]\($libc\)$" + expected[3]=".*\(.*/bin/ping.*\)$" 
;; esac - perf trace --no-syscalls -e probe_libc:inet_pton/$eventattr/ ping -6 -c 1 ::1 2>&1 | grep -v ^$ | while read line ; do + file=`mktemp -u /tmp/perf.data.XXX` + + perf record -e probe_libc:inet_pton/$eventattr/ -o $file ping -6 -c 1 ::1 > /dev/null 2>&1 + perf script -i $file | while read line ; do echo $line echo "$line" | egrep -q "${expected[$idx]}" if [ $? -ne 0 ] ; then @@ -48,6 +46,8 @@ trace_libc_inet_pton_backtrace() { let idx+=1
[tip:perf/core] perf tests: Rename trace+probe_libc_inet_pton to record+probe_libc_inet_pton
Commit-ID: 4f67336870f641daa485ea504777486e24a9aece Gitweb: https://git.kernel.org/tip/4f67336870f641daa485ea504777486e24a9aece Author: Jiri Olsa AuthorDate: Thu, 1 Mar 2018 17:52:15 +0100 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 09:58:42 -0300 perf tests: Rename trace+probe_libc_inet_pton to record+probe_libc_inet_pton Because the test is no longer using perf trace but perf record instead. Signed-off-by: Jiri Olsa Tested-by: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: David Ahern Cc: Namhyung Kim Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20180301165215.6780-2-jo...@kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- .../{trace+probe_libc_inet_pton.sh => record+probe_libc_inet_pton.sh} | 0 1 file changed, 0 insertions(+), 0 deletions(-) diff --git a/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh similarity index 100% rename from tools/perf/tests/shell/trace+probe_libc_inet_pton.sh rename to tools/perf/tests/shell/record+probe_libc_inet_pton.sh
[tip:perf/core] perf stat: Ignore error thread when enabling system-wide --per-thread
Commit-ID: ab6c79b819f5a50cf41a11ebec17bef63b530333 Gitweb: https://git.kernel.org/tip/ab6c79b819f5a50cf41a11ebec17bef63b530333 Author: Jin Yao AuthorDate: Tue, 16 Jan 2018 23:43:08 +0800 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 27 Feb 2018 11:29:21 -0300 perf stat: Ignore error thread when enabling system-wide --per-thread If we execute 'perf stat --per-thread' with a non-root account (even with kernel.perf_event_paranoid set to -1), it reports the error: jinyao@skl:~$ perf stat --per-thread Error: You may not have permission to collect system-wide stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid, which controls use of the performance events system by unprivileged users (without CAP_SYS_ADMIN). The current value is 2: -1: Allow use of (almost) all events by all users Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK >= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN Disallow raw tracepoint access by users without CAP_SYS_ADMIN >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN To make this setting permanent, edit /etc/sysctl.conf too, e.g.: kernel.perf_event_paranoid = -1 Perhaps the ptrace rule doesn't allow tracing some processes. But anyway the global --per-thread mode had better ignore such errors and continue working on other threads. This patch records the index of the error thread in perf_evsel__open() and removes this thread before retrying.
For example (run with non-root, kernel.perf_event_paranoid isn't set): jinyao@skl:~$ perf stat --per-thread ^C Performance counter stats for 'system wide': vmstat-3458 6.171984 cpu-clock:u (msec) # 0.000 CPUs utilized perf-3670 0.515599 cpu-clock:u (msec) # 0.000 CPUs utilized vmstat-3458 1,163,643 cycles:u # 0.189 GHz perf-3670 40,881 cycles:u # 0.079 GHz vmstat-3458 1,410,238 instructions:u # 1.21 insn per cycle perf-3670 3,536 instructions:u # 0.09 insn per cycle vmstat-3458 288,937 branches:u # 46.814 M/sec perf-3670 936 branches:u # 1.815 M/sec vmstat-3458 15,195 branch-misses:u # 5.26% of all branches perf-3670 76 branch-misses:u # 8.12% of all branches 12.651675247 seconds time elapsed Signed-off-by: Jin Yao Acked-by: Jiri Olsa Tested-by: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: Andi Kleen Cc: Kan Liang Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1516117388-10120-1-git-send-email-yao@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-stat.c | 14 +- tools/perf/util/evsel.c | 3 +++ tools/perf/util/thread_map.c | 1 + tools/perf/util/thread_map.h | 1 + 4 files changed, 18 insertions(+), 1 deletion(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index fadcff52cd09..6214d2b220b2 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -637,7 +637,19 @@ try_again: if (verbose > 0) ui__warning("%s\n", msg); goto try_again; -} + } else if (target__has_per_thread(&target) && + evsel_list->threads && + evsel_list->threads->err_thread != -1) { + /* +* For global --per-thread case, skip current +* error thread.
+*/ + if (!thread_map__remove(evsel_list->threads, + evsel_list->threads->err_thread)) { + evsel_list->threads->err_thread = -1; + goto try_again; + } + } perf_evsel__open_strerror(counter, &target, errno, msg, sizeof(msg)); diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index ef351688b797..b56e1c2ddaee 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -1915,6 +1915,9 @@ try_fallback: goto fallback_missing_features; } out_close: + if (err) + threads->err_thread = thread; + do { while (--thread >= 0) { close(FD(evsel, cpu, thread)); diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c index 729dad8f412d..5d467d8ae9ab 100644 --- a/tools/perf/util/thread_map.c +++ b/tools/perf/util/thread_map.c @@ -32,6 +32,7 @@ static void
[tip:perf/core] perf annotate browser: Be more robust when drawing jump arrows
Commit-ID: 9c04409d7f5c325233961673356ea8aced6a4ef3 Gitweb: https://git.kernel.org/tip/9c04409d7f5c325233961673356ea8aced6a4ef3 Author: Arnaldo Carvalho de Melo AuthorDate: Thu, 1 Mar 2018 11:33:59 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 5 Mar 2018 09:57:57 -0300 perf annotate browser: Be more robust when drawing jump arrows This first happened with a gcc function, _cpp_lex_token, that has the usual jumps: │1159e6c: ↓ jne 115aa32 <_cpp_lex_token@@Base+0xf92> I.e. jumps to a label inside that function (_cpp_lex_token), and those work, but also this kind: │1159e8b: ↓ jne c469be I.e. jumps to another function, outside _cpp_lex_token, which are not being correctly handled, generating as a side effect references to ab->offset[] entries that are set to NULL, so to make this code more robust, check that here. A proper fix for this will be put in place, looking at the function name right after the '<' token and probably treating this like a 'call' instruction. For now just don't draw the arrow. Reported-by: Ingo Molnar Reported-by: Linus Torvalds Cc: Adrian Hunter Cc: David Ahern Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Cc: Jin Yao Cc: Kan Liang Link: https://lkml.kernel.org/n/tip-5tzvb875ep2sel03aeefg...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/ui/browsers/annotate.c | 25 + 1 file changed, 25 insertions(+) diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index e2f666391ac4..6ff6839558b0 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -328,7 +328,32 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser) if (!disasm_line__is_valid_jump(cursor, sym)) return; + /* +* This first was seen with a gcc function, _cpp_lex_token, that +* has the usual jumps: +* +* │1159e6c: ↓ jne 115aa32 <_cpp_lex_token@@Base+0xf92> +* +* I.e.
jumps to a label inside that function (_cpp_lex_token), and +* those work, but also this kind: +* +* │1159e8b: ↓ jne c469be +* +* I.e. jumps to another function, outside _cpp_lex_token, which +* are not being correctly handled, generating as a side effect references +* to ab->offset[] entries that are set to NULL, so to make this code +* more robust, check that here. +* +* A proper fix for this will be put in place, looking at the function +* name right after the '<' token and probably treating this like a +* 'call' instruction. +*/ target = ab->offsets[cursor->ops.target.offset]; + if (target == NULL) { + ui_helpline__printf("WARN: jump target inconsistency, press 'o', ab->offsets[%#x] = NULL\n", + cursor->ops.target.offset); + return; + } bcursor = browser_line(&cursor->al); btarget = browser_line(target);
[tip:perf/core] perf top: Fix annoying fallback message on older kernels
Commit-ID: 853745f5e6d95faaae6381c9a01dbd43de992fb3 Gitweb: https://git.kernel.org/tip/853745f5e6d95faaae6381c9a01dbd43de992fb3 Author: Kan Liang AuthorDate: Mon, 26 Feb 2018 10:17:10 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 26 Feb 2018 16:04:08 -0300 perf top: Fix annoying fallback message on older kernels On older (e.g. v4.4) kernels, an annoying fallback message can be observed in 'perf top': ┌─Warning:──┐ │fall back to non-overwrite mode│ │ │ │ │ │Press any key... │ └───┘ The 'perf top' utility has been changed to overwrite mode since commit ebebbf082357 ("perf top: Switch default mode to overwrite mode"). For older kernels which don't have overwrite mode support, 'perf top' will fall back to non-overwrite mode and print out the fallback message using ui__warning(), which needs the user's input to close. The fallback message is not critical for end users, so turn it into a debug message that is only printed when running with -vv. Reported-by: Ingo Molnar Signed-off-by: Kan Liang Cc: Kan Liang Fixes: ebebbf082357 ("perf top: Switch default mode to overwrite mode") Link: http://lkml.kernel.org/r/1519669030-176549-1-git-send-email-kan.li...@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-top.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index b7c823ba8374..35ac016fcb98 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -991,7 +991,7 @@ static int perf_top_overwrite_fallback(struct perf_top *top, evlist__for_each_entry(evlist, counter) counter->attr.write_backward = false; opts->overwrite = false; - ui__warning("fall back to non-overwrite mode\n"); + pr_debug2("fall back to non-overwrite mode\n"); return 1; }
[tip:perf/core] perf cgroup: Simplify arguments when tracking multiple events
Commit-ID: 25f72f9ed88d5be86c92432fc779e2725e3506cd Gitweb: https://git.kernel.org/tip/25f72f9ed88d5be86c92432fc779e2725e3506cd Author: weiping zhang AuthorDate: Mon, 29 Jan 2018 23:48:09 +0800 Committer: Arnaldo Carvalho de Melo CommitDate: Thu, 22 Feb 2018 10:02:27 -0300 perf cgroup: Simplify arguments when tracking multiple events When using -G with one cgroup and -e with multiple events, only the first event gets the correct cgroup setting; all events from the second onwards will track system-wide events. If the user wants to track multiple events for a specific cgroup, the user must give parameters like the following: $ perf stat -e e1 -e e2 -e e3 -G test,test,test This patch simplifies this case; just specify one cgroup: $ perf stat -e e1 -e e2 -e e3 -G test $ mkdir -p /sys/fs/cgroup/perf_event/empty_cgroup $ perf stat -e cycles -e cache-misses -a -I 1000 -G empty_cgroup Before: 1.001007226 cycles empty_cgroup 1.001007226 7,506 cache-misses After: 1.000834097 cycles empty_cgroup 1.000834097 cache-misses empty_cgroup Signed-off-by: weiping zhang Acked-by: Jiri Olsa Tested-by: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: Namhyung Kim Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20180129154805.ga6...@localhost.didichuxing.com [ Improved the doc text a bit, providing an example for cgroup + system wide counting ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/Documentation/perf-record.txt | 6 +- tools/perf/Documentation/perf-stat.txt | 6 +- tools/perf/util/cgroup.c | 17 - 3 files changed, 26 insertions(+), 3 deletions(-) diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index 3eea6de35a38..76bc2181d214 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -308,7 +308,11 @@ can be provided. Each cgroup is applied to the corresponding event, i.e., first cgroup to first event, second cgroup to second event and so on.
It is possible to provide an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have corresponding events, i.e., they always refer to events defined earlier on the command -line. +line. If the user wants to track multiple events for a specific cgroup, the user can +use '-e e1 -e e2 -G foo,foo' or just use '-e e1 -e e2 -G foo'. + +If wanting to monitor, say, 'cycles' for a cgroup and also for system wide, this +command line can be used: 'perf stat -e cycles -G cgroup_name -a -e cycles'. -b:: --branch-any:: diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index 2bbe79a50d3c..2b38e222016a 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -118,7 +118,11 @@ can be provided. Each cgroup is applied to the corresponding event, i.e., first to first event, second cgroup to second event and so on. It is possible to provide an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have corresponding events, i.e., they always refer to events defined earlier on the command -line. +line. If the user wants to track multiple events for a specific cgroup, the user can +use '-e e1 -e e2 -G foo,foo' or just use '-e e1 -e e2 -G foo'. + +If wanting to monitor, say, 'cycles' for a cgroup and also for system wide, this +command line can be used: 'perf stat -e cycles -G cgroup_name -a -e cycles'. 
-o file:: --output file:: diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c index 984f69144f87..5dd9b5ea314d 100644 --- a/tools/perf/util/cgroup.c +++ b/tools/perf/util/cgroup.c @@ -157,9 +157,11 @@ int parse_cgroups(const struct option *opt __maybe_unused, const char *str, int unset __maybe_unused) { struct perf_evlist *evlist = *(struct perf_evlist **)opt->value; + struct perf_evsel *counter; + struct cgroup_sel *cgrp = NULL; const char *p, *e, *eos = str + strlen(str); char *s; - int ret; + int ret, i; if (list_empty(&evlist->entries)) { fprintf(stderr, "must define events before cgroups\n"); @@ -188,5 +190,18 @@ int parse_cgroups(const struct option *opt __maybe_unused, const char *str, break; str = p+1; } + /* for the case one cgroup combine to multiple events */ + i = 0; + if (nr_cgroups == 1) { + evlist__for_each_entry(evlist, counter) { + if (i == 0) + cgrp = counter->cgrp; + else { + counter->cgrp = cgrp; + refcount_inc(&cgrp->refcnt); + } + i++; + } + } return 0; }
[tip:perf/core] perf stat: Use xyarray dimensions to iterate fds
Commit-ID: 42811d509d6e0b0118918ce6be346be54d8e8801 Gitweb: https://git.kernel.org/tip/42811d509d6e0b0118918ce6be346be54d8e8801 Author: Andi Kleen AuthorDate: Thu, 5 Oct 2017 19:00:28 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Wed, 21 Feb 2018 11:36:57 -0300 perf stat: Use xyarray dimensions to iterate fds Now that the xyarray stores the dimensions, we can use those to iterate over the FDs for an evsel. Signed-off-by: Andi Kleen Acked-by: Jiri Olsa Link: http://lkml.kernel.org/r/20171006020029.13339-1-a...@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-stat.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 2d49eccf98f2..fadcff52cd09 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -508,14 +508,13 @@ static int perf_stat_synthesize_config(bool is_pipe) #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y)) -static int __store_counter_ids(struct perf_evsel *counter, - struct cpu_map *cpus, - struct thread_map *threads) +static int __store_counter_ids(struct perf_evsel *counter) { int cpu, thread; - for (cpu = 0; cpu < cpus->nr; cpu++) { - for (thread = 0; thread < threads->nr; thread++) { + for (cpu = 0; cpu < xyarray__max_x(counter->fd); cpu++) { + for (thread = 0; thread < xyarray__max_y(counter->fd); +thread++) { int fd = FD(counter, cpu, thread); if (perf_evlist__id_add_fd(evsel_list, counter, @@ -535,7 +534,7 @@ static int store_counter_ids(struct perf_evsel *counter) if (perf_evsel__alloc_id(counter, cpus->nr, threads->nr)) return -ENOMEM; - return __store_counter_ids(counter, cpus, threads); + return __store_counter_ids(counter); } static bool perf_evsel__should_store_id(struct perf_evsel *counter)
[tip:perf/core] perf kallsyms: Fix the usage on the man page
Commit-ID: de7112868829b3286def38297848d5d2592b4a70 Gitweb: https://git.kernel.org/tip/de7112868829b3286def38297848d5d2592b4a70 Author: Sangwon Hong AuthorDate: Mon, 12 Feb 2018 04:37:44 +0900 Committer: Arnaldo Carvalho de Melo CommitDate: Wed, 21 Feb 2018 09:23:36 -0300 perf kallsyms: Fix the usage on the man page First, all man pages highlight only perf and subcommands except 'perf kallsyms', which includes the full usage. Fix it for commands to monopolize underlines. Second, options can be omitted when executing 'perf kallsyms', so add square brackets around <options>. Signed-off-by: Sangwon Hong Acked-by: Namhyung Kim Cc: Jiri Olsa Cc: Taeung Song Link: http://lkml.kernel.org/r/1518377864-20353-1-git-send-email-qpa...@gmail.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/Documentation/perf-kallsyms.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/Documentation/perf-kallsyms.txt b/tools/perf/Documentation/perf-kallsyms.txt index 954ea9e21236..cf9f4040ea5c 100644 --- a/tools/perf/Documentation/perf-kallsyms.txt +++ b/tools/perf/Documentation/perf-kallsyms.txt @@ -8,7 +8,7 @@ perf-kallsyms - Searches running kernel for symbols SYNOPSIS [verse] -'perf kallsyms symbol_name[,symbol_name...]' +'perf kallsyms' [<options>] symbol_name[,symbol_name...] DESCRIPTION ---
Re: [GIT PULL 00/28] perf/core improvements and fixes
* Arnaldo Carvalho de Melo wrote: > Hi Ingo, > > Please consider pulling, I'll cherry pick some into a separate > perf/urgent pull request, like the jump-to-another-function one, after > the usual round of tests, but since I've been working on them in my > perf/core branch, let's flush them now. > > - Arnaldo > > Test results at the end of this message, as usual. > > The following changes since commit ddc4becca1409541c2ebb7ecb99b5cef44cf17e4: > > Merge tag 'perf-core-for-mingo-4.17-20180220' of > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core > (2018-02-21 08:50:45 +0100) > > are available in the Git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git > tags/perf-core-for-mingo-4.17-20180305 > > for you to fetch changes up to 6afad54d2f0ddebacfcf3b829147d7fed8dab298: > > perf mmap: Discard legacy interfaces for mmap read forward (2018-03-05 > 10:51:10 -0300) > > > perf/core improvements and fixes: > > - Be more robust when drawing arrows in the annotation TUI, avoiding a > segfault when jump instructions have as a target addresses in functions > other than the one currently being annotated. The full fix will come in > the following days, when jumping to other functions will work as call > instructions (Arnaldo Carvalho de Melo) > > - Allow asking for the maximum allowed sample rate in 'top' and > 'record', i.e.
'perf record -F max' will read the > kernel.perf_event_max_sample_rate sysctl and use it (Arnaldo Carvalho de > Melo) > > - When the user specifies a freq above kernel.perf_event_max_sample_rate, > throttle it down to that max freq and warn the user about it; also add > --strict-freq so that the previous behaviour of not starting the > session when the desired freq can't be used can be selected (Arnaldo > Carvalho de Melo) > > - Find 'call' instruction target symbol at parsing time, used so far in > the TUI, part of the infrastructure changes that will end up allowing > for jumps to navigate to other functions, just like 'call' > instructions. (Arnaldo Carvalho de Melo) > > - Use xyarray dimensions to iterate fds in 'perf stat' (Andi Kleen) > > - Ignore threads for which the current user doesn't have permissions when > enabling system-wide --per-thread (Jin Yao) > > - Fix some backtrace perf test cases to use 'perf record' + 'perf script' > instead, till 'perf trace' starts using ordered_events or equivalent > to avoid symbol resolving artifacts due to reordering of > PERF_RECORD_MMAP events (Jiri Olsa) > > - Fix crash in 'perf record' pipe mode, it needs to allocate the ID > array even for a single event, unlike non-pipe mode (Jiri Olsa) > > - Demote to a debug message the annoying fallback warning shown on older > kernels when newer 'perf top' binaries try to use overwrite mode and > that is not present in the older kernels (Kan Liang) > > - Switch last users of old APIs to the newer perf_mmap__read_event() > one, then discard those old mmap read forward APIs (Kan Liang) > > - Fix the usage on the 'perf kallsyms' man page (Sangwon Hong) > > - Simplify cgroup arguments when tracking multiple events (weiping zhang) > > Signed-off-by: Arnaldo Carvalho de Melo > > > Andi Kleen (1): > perf stat: Use xyarray dimensions to iterate fds > > Arnaldo Carvalho de Melo (6): > perf annotate browser: Be more robust when drawing jump arrows > perf record: Allow asking for the maximum allowed sample rate >
perf top browser: Show sample_freq in browser title line > perf top: Allow asking for the maximum allowed sample rate > perf record: Throttle user defined frequencies to the maximum allowed > perf annotate: Find 'call' instruction target symbol at parsing time > > Jin Yao (1): > perf stat: Ignore error thread when enabling system-wide --per-thread > > Jiri Olsa (3): > perf tests: Switch trace+probe_libc_inet_pton to use record > perf tests: Rename trace+probe_libc_inet_pton to > record+probe_libc_inet_pton > perf record: Fix crash in pipe mode > > Kan Liang (15): > perf top: Fix annoying fallback message on older kernels > perf kvm: Switch to new perf_mmap__read_event() interface > perf trace: Switch to new perf_mmap__read_event() interface > perf python: Switch to new perf_mmap__read_event() interface > p
[PATCH] cxgb3: remove VLA
In preparation to enabling -Wvla, remove VLA and replace it with dynamic memory allocation. Signed-off-by: Gustavo A. R. Silva --- drivers/net/ethernet/chelsio/cxgb3/t3_hw.c | 25 + 1 file changed, 21 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb3/t3_hw.c b/drivers/net/ethernet/chelsio/cxgb3/t3_hw.c index a89721f..ad6a280 100644 --- a/drivers/net/ethernet/chelsio/cxgb3/t3_hw.c +++ b/drivers/net/ethernet/chelsio/cxgb3/t3_hw.c @@ -683,20 +683,37 @@ int t3_seeprom_wp(struct adapter *adapter, int enable) static int vpdstrtouint(char *s, int len, unsigned int base, unsigned int *val) { - char tok[len + 1]; + char *tok; + int ret; + + tok = kcalloc(len + 1, sizeof(*tok), GFP_KERNEL); + if (!tok) + return -ENOMEM; memcpy(tok, s, len); tok[len] = 0; - return kstrtouint(strim(tok), base, val); + ret = kstrtouint(strim(tok), base, val); + + kfree(tok); + return ret; } static int vpdstrtou16(char *s, int len, unsigned int base, u16 *val) { - char tok[len + 1]; + char *tok; + int ret; + + tok = kcalloc(len + 1, sizeof(*tok), GFP_KERNEL); + if (!tok) + return -ENOMEM; memcpy(tok, s, len); tok[len] = 0; - return kstrtou16(strim(tok), base, val); + + ret = kstrtou16(strim(tok), base, val); + + kfree(tok); + return ret; } /** -- 2.7.4
Re: [PATCH] RDMA/bnxt_re/qplib_sp: Use true and false for boolean values
On Tue, Mar 6, 2018 at 5:06 AM, Gustavo A. R. Silva wrote: > Assign true or false to boolean variables instead of an integer value. > > This issue was detected with the help of Coccinelle. > > Signed-off-by: Gustavo A. R. Silva Thanks. Acked-by: Selvin Xavier > --- > drivers/infiniband/hw/bnxt_re/qplib_sp.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c > b/drivers/infiniband/hw/bnxt_re/qplib_sp.c > index ee98e5e..2f3f32ea 100644 > --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c > +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c > @@ -154,7 +154,7 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw, > attr->tqm_alloc_reqs[i * 4 + 3] = *(++tqm_alloc); > } > > - attr->is_atomic = 0; > + attr->is_atomic = false; > bail: > bnxt_qplib_rcfw_free_sbuf(rcfw, sbuf); > return rc; > -- > 2.7.4 >
[PATCH] perf: correct ctx_event_type in ctx_resched()
In ctx_resched(), EVENT_FLEXIBLE should be sched_out when EVENT_PINNED is added. However, ctx_resched() calculates ctx_event_type before checking this condition. As a result, pinned events will NOT get higher priority than flexible events. The following shows this issue on an Intel CPU (where ref-cycles can only use one hardware counter). 1. First start: perf stat -C 0 -e ref-cycles -I 1000 2. Then, in the second console, run: perf stat -C 0 -e ref-cycles:D -I 1000 The second perf uses pinned events, which are expected to have higher priority. However, because it fails in ctx_resched(), it is never run. This patch fixes this by calculating ctx_event_type after re-evaluating event_type. Fixes: 487f05e18aa4 ("perf/core: Optimize event rescheduling on active contexts") Signed-off-by: Song Liu Reported-by: Ephraim Park --- kernel/events/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 5789810..cf52fc0 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -2246,7 +2246,7 @@ static void ctx_resched(struct perf_cpu_context *cpuctx, struct perf_event_context *task_ctx, enum event_type_t event_type) { - enum event_type_t ctx_event_type = event_type & EVENT_ALL; + enum event_type_t ctx_event_type; bool cpu_event = !!(event_type & EVENT_CPU); /* @@ -2256,6 +2256,8 @@ static void ctx_resched(struct perf_cpu_context *cpuctx, if (event_type & EVENT_PINNED) event_type |= EVENT_FLEXIBLE; + ctx_event_type = event_type & EVENT_ALL; + perf_pmu_disable(cpuctx->ctx.pmu); if (task_ctx) task_ctx_sched_out(cpuctx, task_ctx, event_type); -- 2.9.5
Re: [PATCH] thermal: of: Allow selection of thermal governor in DT
On Tue, Mar 6, 2018 at 2:41 AM, Daniel Lezcano wrote: > On 05/03/2018 19:36, Amit Kucheria wrote: >> From: Ram Chandrasekar >> >> There is currently no way for the governor to be selected for each thermal >> zone in devicetree. This results in the default governor being used for all >> thermal zones even though no such restriction exists in the core code. >> >> Add support for specifying the thermal governor to be used for a thermal >> zone in the devicetree. The devicetree config should specify the governor >> name as a string that matches any available governors. If not specified, we >> maintain the current behaviour of using the default governor. >> >> Signed-off-by: Ram Chandrasekar >> Signed-off-by: Amit Kucheria > > Why not create a kernel parameter (eg. thermal.governor=) ? So everyone > can gain benefit of this feature. And in order to specify that from the > DT, add the 'chosen' node and bootargs with the desired kernel parameter? > This is supposed to be a per-thermal zone property. So specifying it on the command-line, while possible, might be a little cumbersome. I'm not even sure if kernel parameters can have a variable number of arguments. IOW, thermal.tz0.governor=userspace, thermal.tz1.governor=step_wise, thermal.tz2.governor=userspace, and so on. I'm already seeing SoCs defining 8 or more thermal zones.
Re: [PATCH] spi: tegra20-slink: use true and false for boolean values
On Tuesday 06 March 2018 05:23 AM, Gustavo A. R. Silva wrote: Assign true or false to boolean variables instead of an integer value. This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva Acked-by: Laxman Dewangan
Re: [PATCH v3 03/10] drivers: qcom: rpmh-rsc: log RPMH requests in FTRACE
Hi Lina, Thank you for the patch! Yet something to improve: [auto build test ERROR on robh/for-next] [also build test ERROR on v4.16-rc4 next-20180306] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Lina-Iyer/drivers-qcom-add-RPMH-communication-support/20180305-225623 base: https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next config: arm64-allmodconfig (attached as .config) compiler: aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=arm64 All error/warnings (new ones prefixed by >>): In file included from include/trace/define_trace.h:96:0, from drivers/soc/qcom/trace-rpmh.h:89, from drivers/soc/qcom/rpmh-rsc.c:28: drivers/soc/qcom/./trace-rpmh.h: In function 'trace_event_raw_event_rpmh_notify': >> drivers/soc/qcom/./trace-rpmh.h:29:3: error: implicit declaration of >> function '__assign_string'; did you mean '__assign_str'? >> [-Werror=implicit-function-declaration] __assign_string(name, d->name); ^ include/trace/trace_events.h:719:4: note: in definition of macro 'DECLARE_EVENT_CLASS' { assign; } \ ^~ >> drivers/soc/qcom/./trace-rpmh.h:28:2: note: in expansion of macro >> 'TP_fast_assign' TP_fast_assign( ^~ >> drivers/soc/qcom/./trace-rpmh.h:29:19: error: 'name' undeclared (first use >> in this function); did you mean 'node'? 
__assign_string(name, d->name); ^ include/trace/trace_events.h:719:4: note: in definition of macro 'DECLARE_EVENT_CLASS' { assign; } \ ^~ >> drivers/soc/qcom/./trace-rpmh.h:28:2: note: in expansion of macro >> 'TP_fast_assign' TP_fast_assign( ^~ drivers/soc/qcom/./trace-rpmh.h:29:19: note: each undeclared identifier is reported only once for each function it appears in __assign_string(name, d->name); ^ include/trace/trace_events.h:719:4: note: in definition of macro 'DECLARE_EVENT_CLASS' { assign; } \ ^~ >> drivers/soc/qcom/./trace-rpmh.h:28:2: note: in expansion of macro >> 'TP_fast_assign' TP_fast_assign( ^~ drivers/soc/qcom/./trace-rpmh.h: In function 'trace_event_raw_event_rpmh_send_msg': drivers/soc/qcom/./trace-rpmh.h:67:19: error: 'name' undeclared (first use in this function); did you mean 'node'? __assign_string(name, d->name); ^ include/trace/trace_events.h:719:4: note: in definition of macro 'DECLARE_EVENT_CLASS' { assign; } \ ^~ include/trace/trace_events.h:78:9: note: in expansion of macro 'PARAMS' PARAMS(assign), \ ^~ >> drivers/soc/qcom/./trace-rpmh.h:50:1: note: in expansion of macro >> 'TRACE_EVENT' TRACE_EVENT(rpmh_send_msg, ^~~ drivers/soc/qcom/./trace-rpmh.h:66:2: note: in expansion of macro 'TP_fast_assign' TP_fast_assign( ^~ In file included from include/trace/define_trace.h:97:0, from drivers/soc/qcom/trace-rpmh.h:89, from drivers/soc/qcom/rpmh-rsc.c:28: drivers/soc/qcom/./trace-rpmh.h: In function 'perf_trace_rpmh_notify': >> drivers/soc/qcom/./trace-rpmh.h:29:19: error: 'name' undeclared (first use >> in this function); did you mean 'node'? 
__assign_string(name, d->name); ^ include/trace/perf.h:66:4: note: in definition of macro 'DECLARE_EVENT_CLASS' { assign; } \ ^~ >> drivers/soc/qcom/./trace-rpmh.h:28:2: note: in expansion of macro >> 'TP_fast_assign' TP_fast_assign( ^~ drivers/soc/qcom/./trace-rpmh.h: In function 'perf_trace_rpmh_send_msg': drivers/soc/qcom/./trace-rpmh.h:67:19: error: 'name' undeclared (first use in this function); did you mean 'node'? __assign_string(name, d->name); ^ include/trace/perf.h:66:4: note: in definition of macro 'DECLARE_EVENT_CLASS' { assign; } \ ^~ include/trace/trace_events.h:78:9: note: in expansion of macro 'PARAMS' PARAMS(assign), \ ^~ >> drivers/soc/qcom/./trace-rpmh.h:50:1: note: in expansion of macro >> 'TRACE_EVENT'
Re: [PATCH] thermal: of: Allow selection of thermal governor in DT
On Tue, Mar 6, 2018 at 1:38 AM, Rob Herring wrote: > On Mon, Mar 5, 2018 at 12:36 PM, Amit Kucheria > wrote: >> From: Ram Chandrasekar >> >> There is currently no way for the governor to be selected for each thermal >> zone in devicetree. This results in the default governor being used for all >> thermal zones even though no such restriction exists in the core code. >> >> Add support for specifying the thermal governor to be used for a thermal >> zone in the devicetree. The devicetree config should specify the governor >> name as a string that matches any available governors. If not specified, we >> maintain the current behaviour of using the default governor. >> >> Signed-off-by: Ram Chandrasekar >> Signed-off-by: Amit Kucheria >> --- >> Documentation/devicetree/bindings/thermal/thermal.txt | 8 >> drivers/thermal/of-thermal.c | 6 ++ >> 2 files changed, 14 insertions(+) >> >> diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt >> b/Documentation/devicetree/bindings/thermal/thermal.txt >> index 1719d47..fced9d3 100644 >> --- a/Documentation/devicetree/bindings/thermal/thermal.txt >> +++ b/Documentation/devicetree/bindings/thermal/thermal.txt >> @@ -168,6 +168,14 @@ Optional property: >> by means of sensor ID. Additional coefficients are >> interpreted as constant offset. >> >> +- thermal-governor: Thermal governor to be used for this thermal zone. >> + Expected values are: >> + "step_wise": Use step wise governor. >> + "fair_share": Use fair share governor. >> + "user_space": Use user space governor. >> + "power_allocator": Use power allocator governor. > > This looks pretty Linux specific. Not that we can't have Linux > specific properties, but we try to avoid them. > > What determines the selection? I'd imagine only certain governors make > sense for certain devices. We should perhaps describe those > characteristics which can then infer the best governor. Not really > sure though... 
I'm not sure if it would be easy to assign preferred governors to device classes. It depends on what devices are present on the system, what throttling knobs they expose and how the system designer decided to integrate it all. e.g. A GPU driver might be controlled in the kernel or in userspace depending on whether it exposes a devfreq knob or some more esoteric statistics to userspace. The bang-bang governor seems to be designed for fans with a simple ON/OFF interface. The userspace governor is designed to move thermal policy to userspace (e.g. through thermald), so backlight brightness, battery charging, GPU scaling, even CPU frequency scaling can be offloaded to userspace. On embedded platforms, modem control typically happens in userspace. The power allocator governor is designed for a closed-loop system that keeps the total TDP of the platform under control while allowing various devices (CPU, GPU, modem, etc.) to dynamically increase or decrease their individual budget depending on the use case. Regards, Amit
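For what it's worth, the proposed property would sit directly in a thermal zone node; a hedged sketch based on the binding text above (the sensor phandle, cell value and zone name are made up for illustration):

```
thermal-zones {
        cpu-thermal: cpu-thermal {
                polling-delay-passive = <250>;
                polling-delay = <1000>;
                thermal-sensors = <&tsens 0>;
                /* proposed in this patch: must match an available
                 * governor name; the default governor is used when
                 * the property is absent */
                thermal-governor = "power_allocator";
        };
};
```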
Re: [PATCH] dump_stack: convert generic dump_stack into a weak symbol
2018-03-06 12:31 GMT+08:00 Sergey Senozhatsky : > On (03/06/18 10:50), Greentime Hu wrote: > [..] >> > Greentime Hu, you tested this on nds32. Could I use your Tested-by, >> > please? >> > >> >> Yes, please use it. :) > > Thanks. > > To be sure, is this > > Tested-by: Greentime Hu # nds32 > or > Acked-by: Greentime Hu # nds32 > Acked-by is preferred. Thanks.
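The mechanism this thread is about — a generic dump_stack() that an architecture can replace — relies on weak linkage. A minimal userspace sketch of the semantics (the kernel patch uses the sibling flavor, a __weak *definition* that a strong arch definition replaces at link time; 'arch_hook' here is a made-up name):

```c
#include <stddef.h>

/* A weak declaration with no definition anywhere resolves to NULL at
 * link time, so generic code can test whether an override was linked
 * in.  With a weak *definition* (the kernel's case), a strong
 * definition elsewhere simply wins at link time. */
__attribute__((weak)) void arch_hook(void);

static int have_arch_override(void)
{
        return arch_hook != NULL;       /* 0: fall back to generic code */
}
```

Since no override is linked here, have_arch_override() reports 0 and the caller would take the generic path.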
[RFC] rcu: Prevent expedite reporting within RCU read-side section
Hello Paul and RCU folks, I am not sure whether I correctly understand and fixed this. But I really wonder why sync_rcu_exp_handler() reports the quiescent state even in the case that the current task is within an RCU read-side section. Am I missing something? If I correctly understand it and you agree with it, I can add more logic which makes it more expedited by boosting current or making it urgent when we fail to report the quiescent state on the IPI. ->8- From 0b0191f506c19ce331a1fdb7c2c5a00fb23fbcf2 Mon Sep 17 00:00:00 2001 From: Byungchul Park Date: Tue, 6 Mar 2018 13:54:41 +0900 Subject: [RFC] rcu: Prevent expedite reporting within RCU read-side section We report the quiescent state for this cpu if it's out of an RCU read-side section at the moment the IPI was fired during the expedite process. However, the current code reports the quiescent state even in these cases: 1) the current task is still within an RCU read-side section 2) the current task has been blocked within the RCU read-side section Since we have not reached a quiescent state yet in these cases, we should not report it but check again later. Signed-off-by: Byungchul Park --- kernel/rcu/tree_exp.h | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 73e1d3d..cc69d14 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -731,13 +731,13 @@ static void sync_rcu_exp_handler(void *info) /* * We are either exiting an RCU read-side critical section (negative * values of t->rcu_read_lock_nesting) or are not in one at all -* (zero value of t->rcu_read_lock_nesting). Or we are in an RCU -* read-side critical section that blocked before this expedited -* grace period started. Either way, we can immediately report -* the quiescent state. +* (zero value of t->rcu_read_lock_nesting). We can immediately +* report the quiescent state. 
*/ - rdp = this_cpu_ptr(rsp->rda); - rcu_report_exp_rdp(rsp, rdp, true); + if (t->rcu_read_lock_nesting <= 0) { + rdp = this_cpu_ptr(rsp->rda); + rcu_report_exp_rdp(rsp, rdp, true); + } } /** -- 1.9.1
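The condition the patch adds hinges on the meaning of t->rcu_read_lock_nesting; a toy model of the counter semantics (simplified — positive while inside a read-side critical section, zero outside, with the real field's negative "exiting" encoding glossed over):

```c
#include <stdbool.h>

static int nesting;     /* models t->rcu_read_lock_nesting */

static void toy_rcu_read_lock(void)   { nesting++; }
static void toy_rcu_read_unlock(void) { nesting--; }

/* The added check: only report a quiescent state when the interrupted
 * task is not (or no longer) inside a read-side critical section. */
static bool can_report_exp_qs(void)
{
        return nesting <= 0;
}
```

With the check in place, an IPI landing in the middle of a nested read-side section would defer reporting instead of claiming a quiescent state too early.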
Re: [PATCH v8 15/15] dt-bindings: cpufreq: Document operating-points-v2-krait-cpu
On 3/6/2018 3:49 AM, Rob Herring wrote: > On Tue, Feb 27, 2018 at 07:37:02PM +0530, Sricharan R wrote: >> In Certain QCOM SoCs like ipq8064, apq8064, msm8960, msm8974 >> that has KRAIT processors the voltage/current value of each OPP >> varies based on the silicon variant in use. >> operating-points-v2-krait-cpu specifies the phandle to nvmem efuse cells >> and the operating-points-v2 table for each opp. The qcom-cpufreq driver >> reads the efuse value from the SoC to provide the required information >> that is used to determine the voltage and current value for each OPP of >> operating-points-v2 table when it is parsed by the OPP framework. >> >> Signed-off-by: Sricharan R >> --- >> .../devicetree/bindings/cpufreq/krait-cpufreq.txt | 363 >> + >> 1 file changed, 363 insertions(+) >> create mode 100644 >> Documentation/devicetree/bindings/cpufreq/krait-cpufreq.txt > > Reviewed-by: Rob Herring Thanks Rob !! Will post with all tags and the Makefile corrected. Regards, Sricharan -- "QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH 0/3] Improve and extend checkpatch.pl Kconfig help text checks
(+To: Andrew) 2018-03-06 13:52 GMT+09:00 Ulf Magnusson : > On Sat, Feb 24, 2018 at 2:53 PM, Masahiro Yamada > wrote: >> 2018-02-23 10:30 GMT+09:00 Ulf Magnusson : >>> On Fri, Feb 16, 2018 at 10:14 PM, Joe Perches wrote: On Fri, 2018-02-16 at 21:22 +0100, Ulf Magnusson wrote: > Hello, > > This patchset contains some improvements for the Kconfig help text check > in > scripts/checkconfig.pl: Seems sensible enough to me. Signed-off-by: Joe Perches >>> >>> Will you be taking this in yourself? >>> >>> (Adding Masahiro on CC -- forgot when I sent the patchset.) >>> >>> Cheers, >>> Ulf >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in >>> the body of a message to majord...@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> >> I am not a perl expert, but I have no objection for this series. >> >> >> Thanks! >> >> >> >> >> -- >> Best Regards >> Masahiro Yamada > > *Bump* Who is addressed by "*Bump*" ? I think patches for checkpatch.pl are supposed to be taken care of by Andrew. He forwards patches to Linus. $ git log --no-merges --pretty=fuller scripts/checkpatch.pl | grep 'Commit:' | sort | uniq -c | sort -nr 555 Commit: Linus Torvalds 16 Commit: Linus Torvalds 4 Commit: Paul E. McKenney 4 Commit: Michael S. Tsirkin 2 Commit: Thomas Gleixner 2 Commit: Ingo Molnar 2 Commit: Greg Kroah-Hartman 1 Commit: Tobin C. Harding 1 Commit: Rob Herring 1 Commit: Petr Mladek 1 Commit: Michal Marek 1 Commit: Mauro Carvalho Chehab 1 Commit: Masahiro Yamada 1 Commit: Lucas De Marchi 1 Commit: Jiri Kosina 1 Commit: Dan Williams 1 Commit: Bjorn Helgaas -- Best Regards Masahiro Yamada
[linux-next:master 5332/5518] drivers/net/ethernet/marvell/mvpp2.c:4288:5: sparse: symbol 'mvpp2_check_hw_buf_num' was not declared. Should it be static?
tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master head: 9c142d8a6556f069be6278ccab701039da81ad6f commit: effbf5f58d64b1d1f93cb687d9797b42f291d5fd [5332/5518] net: mvpp2: update the BM buffer free/destroy logic reproduce: # apt-get install sparse git checkout effbf5f58d64b1d1f93cb687d9797b42f291d5fd make ARCH=x86_64 allmodconfig make C=1 CF=-D__CHECK_ENDIAN__ sparse warnings: (new ones prefixed by >>) >> drivers/net/ethernet/marvell/mvpp2.c:4288:5: sparse: symbol >> 'mvpp2_check_hw_buf_num' was not declared. Should it be static? drivers/net/ethernet/marvell/mvpp2.c:6620:36: sparse: incorrect type in argument 2 (different base types) @@ expected int [signed] l3_proto @@ got restricted __be16 [usertype] protocol @@ drivers/net/ethernet/marvell/mvpp2.c:6620:36: expected int [signed] l3_proto drivers/net/ethernet/marvell/mvpp2.c:6620:36: got restricted __be16 [usertype] protocol Please review and possibly fold the followup patch. --- 0-DAY kernel test infrastructure   Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
[RFC PATCH linux-next] net: mvpp2: mvpp2_check_hw_buf_num() can be static
Fixes: effbf5f58d64 ("net: mvpp2: update the BM buffer free/destroy logic") Signed-off-by: Fengguang Wu --- mvpp2.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c index c7b8093..c360430 100644 --- a/drivers/net/ethernet/marvell/mvpp2.c +++ b/drivers/net/ethernet/marvell/mvpp2.c @@ -4285,7 +4285,7 @@ static void mvpp2_bm_bufs_free(struct device *dev, struct mvpp2 *priv, } /* Check number of buffers in BM pool */ -int mvpp2_check_hw_buf_num(struct mvpp2 *priv, struct mvpp2_bm_pool *bm_pool) +static int mvpp2_check_hw_buf_num(struct mvpp2 *priv, struct mvpp2_bm_pool *bm_pool) { int buf_num = 0;
[PATCH] kernel/memremap: Remove stale devres_free() call
devm_memremap_pages() was re-worked in e8d513483300 to take a caller allocated struct dev_pagemap as a function parameter. A call to devres_free() was left in the error cleanup path which results in a kernel panic if the remap fails for some reason. Remove it to fix the panic and let devm_memremap_pages() fail gracefully. Fixes: e8d513483300 ("memremap: change devm_memremap_pages interface to use struct dev_pagemap") Cc: Logan Gunthorpe Cc: Christoph Hellwig Cc: Dan Williams Signed-off-by: Oliver O'Halloran --- Both in-tree users of devm_memremap_pages() embed dev_pagemap into other structures so this shouldn't cause any leaks. Logan's p2p series does add one usage that assumes pgmap will be freed on error so that'll need fixing. --- kernel/memremap.c | 1 - 1 file changed, 1 deletion(-) diff --git a/kernel/memremap.c b/kernel/memremap.c index 4dd4274cabe2..895e6b76b25e 100644 --- a/kernel/memremap.c +++ b/kernel/memremap.c @@ -427,7 +427,6 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap) err_pfn_remap: err_radix: pgmap_radix_release(res, pgoff); - devres_free(pgmap); return ERR_PTR(error); } EXPORT_SYMBOL(devm_memremap_pages); -- 2.9.5
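The bug pattern here is an ownership mismatch: after the interface change the callee no longer owns the structure, so freeing it in the error path frees the caller's memory. A minimal userspace sketch of the contract (names are made up; this only illustrates the ownership rule, not the real API):

```c
#include <errno.h>

struct toy_pagemap {
        int configured;
};

/* Callee operating on a CALLER-allocated object: on failure it must
 * back out its own work and return an error, but never free 'pm' —
 * the caller still owns it (often it is embedded in a larger struct,
 * as the changelog notes for the in-tree users). */
static int toy_remap_pages(struct toy_pagemap *pm, int simulate_failure)
{
        if (simulate_failure)
                return -ENOMEM;         /* no free(pm) here */
        pm->configured = 1;
        return 0;
}
```

A stale free in the failure branch would be exactly the stale devres_free() the patch removes: the caller (or devres teardown) frees the enclosing object later, and the double/foreign free shows up as a crash.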
Re: [PATCH 0/3] Improve and extend checkpatch.pl Kconfig help text checks
On Sat, Feb 24, 2018 at 2:53 PM, Masahiro Yamada wrote: > 2018-02-23 10:30 GMT+09:00 Ulf Magnusson : >> On Fri, Feb 16, 2018 at 10:14 PM, Joe Perches wrote: >>> On Fri, 2018-02-16 at 21:22 +0100, Ulf Magnusson wrote: Hello, This patchset contains some improvements for the Kconfig help text check in scripts/checkconfig.pl: >>> >>> Seems sensible enough to me. >>> Signed-off-by: Joe Perches >> >> Will you be taking this in yourself? >> >> (Adding Masahiro on CC -- forgot when I sent the patchset.) >> >> Cheers, >> Ulf >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > I am not a perl expert, but I have no objection for this series. > > > Thanks! > > > > > -- > Best Regards > Masahiro Yamada *Bump*
Re: [PATCH 2/3] vfio: Add support for unmanaged or userspace managed SR-IOV
Hi Alexander, Thank you for the patch! Yet something to improve: [auto build test ERROR on pci/next] [also build test ERROR on v4.16-rc4 next-20180305] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Alexander-Duyck/pci-iov-Add-support-for-unmanaged-SR-IOV/20180306-063954 base: https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next config: s390-default_defconfig (attached as .config) compiler: s390x-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=s390 All errors (new ones prefixed by >>): drivers/vfio/pci/vfio_pci.c: In function 'vfio_pci_sriov_configure': >> drivers/vfio/pci/vfio_pci.c:1291:8: error: implicit declaration of function >> 'pci_sriov_configure_unmanaged'; did you mean 'pci_write_config_dword'? >> [-Werror=implicit-function-declaration] err = pci_sriov_configure_unmanaged(pdev, nr_virtfn); ^ pci_write_config_dword At top level: drivers/vfio/pci/vfio_pci.c:1265:12: warning: 'vfio_pci_sriov_configure' defined but not used [-Wunused-function] static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn) ^~~~ cc1: some warnings being treated as errors vim +1291 drivers/vfio/pci/vfio_pci.c 1264 1265 static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn) 1266 { 1267 struct vfio_pci_device *vdev; 1268 struct vfio_device *device; 1269 int err; 1270 1271 device = vfio_device_get_from_dev(&pdev->dev); 1272 if (device == NULL) 1273 return -ENODEV; 1274 1275 vdev = vfio_device_data(device); 1276 if (vdev == NULL) { 1277 vfio_device_put(device); 1278 return -ENODEV; 1279 } 1280 1281 /* 1282 * If a userspace process is already using this device just return 1283 * busy and don't allow for any changes. 
1284 */ 1285 if (vdev->refcnt) { 1286 pci_warn(pdev, 1287 "PF is currently in use, blocked until released by user\n"); 1288 return -EBUSY; 1289 } 1290 > 1291 err = pci_sriov_configure_unmanaged(pdev, nr_virtfn); 1292 if (err <= 0) 1293 return err; 1294 1295 /* 1296 * We are now leaving VFs in the control of some unknown PF entity. 1297 * 1298 * Best case is a well behaved userspace PF is expected and any VMs 1299 * that the VFs will be assigned to are dependent on the userspace 1300 * entity anyway. An example being NFV where maybe the PF is acting 1301 * as an accelerated interface for a firewall or switch. 1302 * 1303 * Worst case is somebody really messed up and just enabled SR-IOV 1304 * on a device they were planning to assign to a VM somwhere. 1305 * 1306 * In either case it is probably best for us to set the taint flag 1307 * and warn the user since this could get really ugly really quick 1308 * if this wasn't what they were planning to do. 1309 */ 1310 add_taint(TAINT_USER, LOCKDEP_STILL_OK); 1311 pci_warn(pdev, 1312 "Adding kernel taint for vfio-pci now managing SR-IOV PF device\n"); 1313 1314 return nr_virtfn; 1315 } 1316 --- 0-DAY kernel test infrastructure   Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
RE: [PATCH 5/6] dma-mapping: support fsl-mc bus
> From: Robin Murphy [mailto:robin.mur...@arm.com] > Sent: Tuesday, March 06, 2018 0:22 > > On 05/03/18 18:39, Christoph Hellwig wrote: > > On Mon, Mar 05, 2018 at 03:48:32PM +, Robin Murphy wrote: > >> Unfortunately for us, fsl-mc is conceptually rather like PCI in that it's > >> software-discoverable and the only thing described in DT is the bus "host", > >> thus we need the same sort of thing as for PCI to map from the child > >> devices back to the bus root in order to find the appropriate firmware > >> node. Worse than PCI, though, we wouldn't even have the option of > >> describing child devices statically in firmware at all, since it's actually > >> one of these runtime-configurable "build your own network accelerator" > >> hardware pools where userspace gets to create and destroy "devices" as it > >> likes. > > > > I really hate the PCI special case just as much. Maybe we just > > need a dma_configure method on the bus, and move PCI as well as fsl-mc > > to it. > > Hmm, on reflection, 100% ack to that idea. It would neatly supersede > bus->force_dma *and* mean that we don't have to effectively pull pci.h > into everything, which I've never liked. In hindsight dma_configure() > does feel like it's grown into this odd choke point where we munge > everything in just for it to awkwardly unpick things again. > > Robin. +1 to the idea. Sorry for asking a trivial question - looking into dma_configure() I see that PCI is used in the start and the end of the API. In the end part pci_put_host_bridge_device() is called. So will two bus callbacks, something like 'dma_config_start' and 'dma_config_end', be required, where the former one returns "dma_dev"? Regards, Nipun
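The idea being discussed — replacing the PCI special case with a per-bus method — can be sketched as a callback on the bus type. This is a simplified userspace model with made-up names, not the real kernel interface:

```c
#include <stddef.h>

struct toy_device;

/* Each bus type supplies its own way of finding DMA configuration for
 * a child device, so common code needs no PCI/fsl-mc special cases. */
struct toy_bus_type {
        const char *name;
        int (*dma_configure)(struct toy_device *dev);
};

struct toy_device {
        const struct toy_bus_type *bus;
};

static int toy_dma_configure(struct toy_device *dev)
{
        if (dev->bus && dev->bus->dma_configure)
                return dev->bus->dma_configure(dev);
        return 0;                       /* generic default path */
}

static int toy_pci_dma_configure(struct toy_device *dev)
{
        (void)dev;      /* would walk up to the host bridge, query FW, ... */
        return 42;      /* pretend bus-specific config was found */
}

static const struct toy_bus_type toy_pci_bus = {
        .name = "toy-pci",
        .dma_configure = toy_pci_dma_configure,
};
```

The choke-point function stays bus-agnostic; each bus's quirks (finding the host bridge, mapping back to the firmware node) live behind its own callback.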
Re: [PATCH] dump_stack: convert generic dump_stack into a weak symbol
On (03/06/18 10:50), Greentime Hu wrote: [..] > > Greentime Hu, you tested this on nds32. Could I use your Tested-by, > > please? > > > > Yes, please use it. :) Thanks. To be sure, is this Tested-by: Greentime Hu # nds32 or Acked-by: Greentime Hu # nds32 ? -ss
Re: [PATCH] dump_stack: convert generic dump_stack into a weak symbol
On (03/05/18 15:48), Petr Mladek wrote: [..] > > I hope that I did not miss anything. I could not try this at > runtime. I think you can. The rules are universal, you can do on x86 something like this --- arch/x86/kernel/dumpstack.c | 13 + 1 file changed, 13 insertions(+) diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c index a2d8a3908670..5d45f406717e 100644 --- a/arch/x86/kernel/dumpstack.c +++ b/arch/x86/kernel/dumpstack.c @@ -375,3 +375,16 @@ static int __init code_bytes_setup(char *s) return 1; } __setup("code_bytes=", code_bytes_setup); + +void dump_stack(void) +{ + dump_stack_print_info(KERN_DEFAULT); + + pr_crit("\t\tLinux\n\n"); + + pr_crit("An error has occurred. To continue:\n" + "Press Enter to return to Linux, or\n" + "Press CTRL+ALT+DEL to restart your computer.\n"); + + pr_crit("\n\n\tPress any key to continue _"); +} --- Should be enough for testing. > Anyway, from my side: > > Reviewed-by: Petr Mladek Thanks. -ss
Re: [PATCH] acpi, nfit: remove redundant __func__ in dev_dbg
On Fri, Mar 02, 2018 at 01:20:49PM +0100, Johannes Thumshirn wrote: > Dynamic debug can be instructed to add the function name to the debug > output using the +f switch, so there is no need for the nfit module to > do it again. If a user decides to add the +f switch for nfit's dynamic > debug this results in double prints of the function name like the > following: > > [ 2391.935383] acpi_nfit_ctl: nfit ACPI0012:00: acpi_nfit_ctl:nmem8 cmd: 10: > func: 1 input length: 0 > > Thus remove the stray __func__ printing. > > Signed-off-by: Johannes Thumshirn Oh, Johannes I noticed that here is one stray one still in drivers/acpi/nfit/mce.c. Do you mind pulling it into your patch to keep the drivers/acpi/nfit/* changes together?
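For reference, the '+f' switch mentioned here is set through the dynamic debug control file; a sketch of the usage (requires CONFIG_DYNAMIC_DEBUG and a mounted debugfs):

```
# enable debug messages (p) with function-name prefixes (f)
# for everything in the nfit module
echo 'module nfit +pf' > /sys/kernel/debug/dynamic_debug/control
```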
Re: [PATCH 4/7] Protectable Memory
snip . . . + +config PROTECTABLE_MEMORY +bool +depends on MMU Curious, would you also want to depend on "SECURITY" as well, as this is being advertised as a complement to __read_only_after_init, per the file header comments, as I'm assuming ro_after_init would be disabled if the SECURITY Kconfig selection is *NOT* selected? +depends on ARCH_HAS_SET_MEMORY +select GENERIC_ALLOCATOR +default y diff --git a/mm/Makefile b/mm/Makefile index e669f02c5a54..959fdbdac118 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -65,6 +65,7 @@ obj-$(CONFIG_SPARSEMEM) += sparse.o obj-$(CONFIG_SPARSEMEM_VMEMMAP) += sparse-vmemmap.o obj-$(CONFIG_SLOB) += slob.o obj-$(CONFIG_MMU_NOTIFIER) += mmu_notifier.o +obj-$(CONFIG_PROTECTABLE_MEMORY) += pmalloc.o obj-$(CONFIG_KSM) += ksm.o obj-$(CONFIG_PAGE_POISONING) += page_poison.o obj-$(CONFIG_SLAB) += slab.o diff --git a/mm/pmalloc.c b/mm/pmalloc.c new file mode 100644 index ..acdec0fbdde6 --- /dev/null +++ b/mm/pmalloc.c @@ -0,0 +1,468 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * pmalloc.c: Protectable Memory Allocator + * + * (C) Copyright 2017 Huawei Technologies Co. Ltd. + * Author: Igor Stoppa + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +/* + * pmalloc_data contains the data specific to a pmalloc pool, + * in a format compatible with the design of gen_alloc. + * Some of the fields are used for exposing the corresponding parameter + * to userspace, through sysfs. + */ +struct pmalloc_data { + struct gen_pool *pool; /* Link back to the associated pool. */ + bool protected; /* Status of the pool: RO or RW. */ nitpick, you could probably get a tad bit better byte packing alignment of this struct if "bool protected" was stuck as the last element in this data structure. + struct kobj_attribute attr_protected; /* Sysfs attribute. */ + struct kobj_attribute attr_avail; /* Sysfs attribute. 
*/ + struct kobj_attribute attr_size; /* Sysfs attribute. */ + struct kobj_attribute attr_chunks;/* Sysfs attribute. */ + struct kobject *pool_kobject; + struct list_head node; /* list of pools */ +}; + +static LIST_HEAD(pmalloc_final_list); +static LIST_HEAD(pmalloc_tmp_list); +static struct list_head *pmalloc_list = &pmalloc_tmp_list; +static DEFINE_MUTEX(pmalloc_mutex); +static struct kobject *pmalloc_kobject; + +static ssize_t pmalloc_pool_show_protected(struct kobject *dev, + struct kobj_attribute *attr, + char *buf) +{ + struct pmalloc_data *data; + + data = container_of(attr, struct pmalloc_data, attr_protected); + if (data->protected) + return sprintf(buf, "protected\n"); + else + return sprintf(buf, "unprotected\n"); +} + +static ssize_t pmalloc_pool_show_avail(struct kobject *dev, + struct kobj_attribute *attr, + char *buf) +{ + struct pmalloc_data *data; + + data = container_of(attr, struct pmalloc_data, attr_avail); + return sprintf(buf, "%lu\n", + (unsigned long)gen_pool_avail(data->pool)); +} + +static ssize_t pmalloc_pool_show_size(struct kobject *dev, + struct kobj_attribute *attr, + char *buf) +{ + struct pmalloc_data *data; + + data = container_of(attr, struct pmalloc_data, attr_size); + return sprintf(buf, "%lu\n", + (unsigned long)gen_pool_size(data->pool)); +} Curious, will this show the size in bytes? + +static void pool_chunk_number(struct gen_pool *pool, + struct gen_pool_chunk *chunk, void *data) +{ + unsigned long *counter = data; + + (*counter)++; +} + +static ssize_t pmalloc_pool_show_chunks(struct kobject *dev, + struct kobj_attribute *attr, + char *buf) +{ + struct pmalloc_data *data; + unsigned long chunks_num = 0; + + data = container_of(attr, struct pmalloc_data, attr_chunks); + gen_pool_for_each_chunk(data->pool, pool_chunk_number, &chunks_num); + return sprintf(buf, "%lu\n", chunks_num); +} + +/* Exposes the pool and its attributes through sysfs. 
*/ +static struct kobject *pmalloc_connect(struct pmalloc_data *data) +{ + const struct attribute *attrs[] = { + &data->attr_protected.attr, + &data->attr_avail.attr, + &data->attr_size.attr, + &data->attr_chunks.attr, + NULL + }; + struct kobject *kobj; + + kobj = kobject_create_and_add(data->pool->name, pmalloc_kobject); + if (unlikely(!kobj)) +
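The byte-packing nitpick above can be made concrete: with the same members, ordering a bool before pointer-sized fields forces alignment padding, while grouping small members at the end lets them share the tail. Exact sizes are ABI-dependent (on a typical LP64 target the layouts below are 24 vs 16 bytes):

```c
#include <stdbool.h>

struct bool_first {
        bool  is_protected;     /* 1 byte + padding up to pointer align */
        void *pool;
        int   flags;            /* plus trailing padding to align size */
};

struct bool_last {
        void *pool;
        int   flags;
        bool  is_protected;     /* small members packed together at end */
};
```

The reordered layout is never larger, and on 64-bit ABIs it is typically a full 8 bytes smaller per instance.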
Re: [PATCH] device-dax: remove redundant __func__ in dev_dbg
On Mon, Mar 05, 2018 at 05:09:32PM -0800, Dan Williams wrote: > Dynamic debug can be instructed to add the function name to the debug > output using the +f switch, so there is no need for the dax modules to > do it again. If a user decides to add the +f switch for the dax modules' > dynamic debug this results in double prints of the function name. > > Reported-by: Johannes Thumshirn > Reported-by: Ross Zwisler > Signed-off-by: Dan Williams Looks good to me. Reviewed-by: Ross Zwisler
Re: [PATCH] libnvdimm: remove redundant __func__ in dev_dbg
On Mon, Mar 05, 2018 at 05:09:21PM -0800, Dan Williams wrote: > Dynamic debug can be instructed to add the function name to the debug > output using the +f switch, so there is no need for the libnvdimm > modules to do it again. If a user decides to add the +f switch for > libnvdimm's dynamic debug this results in double prints of the function > name. > > Reported-by: Johannes Thumshirn > Reported-by: Ross Zwisler > Signed-off-by: Dan Williams > --- > drivers/nvdimm/badrange.c |3 +- > drivers/nvdimm/btt_devs.c | 21 > drivers/nvdimm/bus.c| 13 +- > drivers/nvdimm/claim.c |2 +- > drivers/nvdimm/core.c |4 ++- > drivers/nvdimm/dax_devs.c |5 ++-- > drivers/nvdimm/dimm_devs.c |7 ++--- > drivers/nvdimm/label.c | 51 > ++- > drivers/nvdimm/namespace_devs.c | 38 - > drivers/nvdimm/pfn_devs.c | 25 +-- > drivers/nvdimm/pmem.c |2 +- > 11 files changed, 77 insertions(+), 94 deletions(-) > > diff --git a/drivers/nvdimm/badrange.c b/drivers/nvdimm/badrange.c > index e068d72b4357..df17f1cd696d 100644 > --- a/drivers/nvdimm/badrange.c > +++ b/drivers/nvdimm/badrange.c > @@ -176,8 +176,7 @@ static void set_badblock(struct badblocks *bb, sector_t > s, int num) > (u64) s * 512, (u64) num * 512); > /* this isn't an error as the hardware will still throw an exception */ > if (badblocks_set(bb, s, num, 1)) > - dev_info_once(bb->dev, "%s: failed for sector %llx\n", > - __func__, (u64) s); > + dev_info_once(bb->dev, "failed for sector %llx\n", (u64) s); I don't think you should remove this one. dev_info_once() is just a printk(), and doesn't inherit the +f flag from the dynamic debugging code. The __func__ here does add value. The rest of these look correct, though I think you missed one in each of nvdimm_map_release() and validate_dimm(). (I made these changes as well, but you sent out your patch first. :)
[PATCH v2 1/2] perf sched: move thread::shortname to thread_runtime
From: Changbin Du The thread::shortname is only used by the sched command, so move it to a sched private structure. Signed-off-by: Changbin Du --- tools/perf/builtin-sched.c | 95 +++--- tools/perf/util/thread.h | 1 - 2 files changed, 55 insertions(+), 41 deletions(-) diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c index 83283fe..5bfc8d5 100644 --- a/tools/perf/builtin-sched.c +++ b/tools/perf/builtin-sched.c @@ -255,6 +255,8 @@ struct thread_runtime { int last_state; u64 migrations; + + char shortname[3]; }; /* per event run time data */ @@ -897,6 +899,37 @@ struct sort_dimension { struct list_headlist; }; +/* + * handle runtime stats saved per thread + */ +static struct thread_runtime *thread__init_runtime(struct thread *thread) +{ + struct thread_runtime *r; + + r = zalloc(sizeof(struct thread_runtime)); + if (!r) + return NULL; + + init_stats(&r->run_stats); + thread__set_priv(thread, r); + + return r; +} + +static struct thread_runtime *thread__get_runtime(struct thread *thread) +{ + struct thread_runtime *tr; + + tr = thread__priv(thread); + if (tr == NULL) { + tr = thread__init_runtime(thread); + if (tr == NULL) + pr_debug("Failed to malloc memory for runtime data.\n"); + } + + return tr; +} + static int thread_lat_cmp(struct list_head *list, struct work_atoms *l, struct work_atoms *r) { @@ -1480,6 +1513,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, { const u32 next_pid = perf_evsel__intval(evsel, sample, "next_pid"); struct thread *sched_in; + struct thread_runtime *tr; int new_shortname; u64 timestamp0, timestamp = sample->time; s64 delta; @@ -1519,22 +1553,28 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, if (sched_in == NULL) return -1; + tr = thread__get_runtime(sched_in); + if (tr == NULL) { + thread__put(sched_in); + return -1; + } + sched->curr_thread[this_cpu] = thread__get(sched_in); printf(" "); new_shortname = 0; - if (!sched_in->shortname[0]) { + if 
(!tr->shortname[0]) { if (!strcmp(thread__comm_str(sched_in), "swapper")) { /* * Don't allocate a letter-number for swapper:0 * as a shortname. Instead, we use '.' for it. */ - sched_in->shortname[0] = '.'; - sched_in->shortname[1] = ' '; + tr->shortname[0] = '.'; + tr->shortname[1] = ' '; } else { - sched_in->shortname[0] = sched->next_shortname1; - sched_in->shortname[1] = sched->next_shortname2; + tr->shortname[0] = sched->next_shortname1; + tr->shortname[1] = sched->next_shortname2; if (sched->next_shortname1 < 'Z') { sched->next_shortname1++; @@ -1552,6 +1592,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, for (i = 0; i < cpus_nr; i++) { int cpu = sched->map.comp ? sched->map.comp_cpus[i] : i; struct thread *curr_thread = sched->curr_thread[cpu]; + struct thread_runtime *curr_tr; const char *pid_color = color; const char *cpu_color = color; @@ -1569,9 +1610,14 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, else color_fprintf(stdout, cpu_color, "*"); - if (sched->curr_thread[cpu]) - color_fprintf(stdout, pid_color, "%2s ", sched->curr_thread[cpu]->shortname); - else + if (sched->curr_thread[cpu]) { + curr_tr = thread__get_runtime(sched->curr_thread[cpu]); + if (curr_tr == NULL) { + thread__put(sched_in); + return -1; + } + color_fprintf(stdout, pid_color, "%2s ", curr_tr->shortname); + } else color_fprintf(stdout, color, " "); } @@ -1587,7 +1633,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, pid_color = COLOR_PIDS; color_fprintf(stdout, pid_color, "%s => %s:%d", - sched_in->shortname, thread__comm_str(sched_in), sched_in->tid); + tr->shortname, thread__comm_str(sched_in), sched_in->tid); } if (sched->map.comp
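The two-character shortname scheme in the code above cycles a letter fastest and then a digit, handing out A0..Z0, A1..Z1, and so on before wrapping. A standalone sketch of that allocator:

```c
/* Mirrors the logic in map_switch_event(): next_shortname1 runs
 * 'A'..'Z', next_shortname2 runs '0'..'9'; after Z9 the sequence
 * wraps and names get reused. */
static char next_shortname1 = 'A';
static char next_shortname2 = '0';

static void next_shortname(char out[3])
{
        out[0] = next_shortname1;
        out[1] = next_shortname2;
        out[2] = '\0';

        if (next_shortname1 < 'Z') {
                next_shortname1++;
        } else {
                next_shortname1 = 'A';
                if (next_shortname2 < '9')
                        next_shortname2++;
                else
                        next_shortname2 = '0';  /* wrap around */
        }
}
```

This gives 260 distinct names, which is why moving the state into per-thread thread_runtime (rather than struct thread) costs nothing for non-sched users.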
[PATCH v2 2/2] perf sched map: re-annotate shortname if thread comm changed
From: Changbin Du This is to show the real name of a thread created via fork-exec. See below example for shortname *A0*. $ sudo ./perf sched map *A0 80393.050639 secs A0 => perf:22368 *. A0 80393.050748 secs . => swapper:0 . *.80393.050887 secs *B0 . .80393.052735 secs B0 => rcu_sched:8 *. . .80393.052743 secs . *C0 .80393.056264 secs C0 => kworker/2:1H:287 . *A0 .80393.056270 secs . *D0 .80393.056769 secs D0 => ksoftirqd/2:22 - . *A0 .80393.056804 secs + . *A0 .80393.056804 secs A0 => pi:22368 . *. .80393.056854 secs *B0 . .80393.060727 secs ... Cc: Namhyung Kim Cc: Jiri Olsa Signed-off-by: Changbin Du --- v2: add function perf_sched__process_comm() to process PERF_RECORD_COMM event. --- tools/perf/builtin-sched.c | 37 +++-- 1 file changed, 35 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c index 5bfc8d5..7aa0600 100644 --- a/tools/perf/builtin-sched.c +++ b/tools/perf/builtin-sched.c @@ -257,6 +257,7 @@ struct thread_runtime { u64 migrations; char shortname[3]; + bool comm_changed; }; /* per event run time data */ @@ -1626,7 +1627,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, timestamp__scnprintf_usec(timestamp, stimestamp, sizeof(stimestamp)); color_fprintf(stdout, color, " %12s secs ", stimestamp); - if (new_shortname || (verbose > 0 && sched_in->tid)) { + if (new_shortname || tr->comm_changed || (verbose > 0 && sched_in->tid)) { const char *pid_color = color; if (thread__has_color(sched_in)) @@ -1634,6 +1635,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, color_fprintf(stdout, pid_color, "%s => %s:%d", tr->shortname, thread__comm_str(sched_in), sched_in->tid); + tr->comm_changed = false; } if (sched->map.comp && new_cpu) @@ -1737,6 +1739,37 @@ static int perf_sched__process_tracepoint_sample(struct perf_tool *tool __maybe_ return err; } +static int perf_sched__process_comm(struct perf_tool *tool __maybe_unused, + union perf_event 
*event, + struct perf_sample *sample, + struct machine *machine) +{ + struct thread *thread; + struct thread_runtime *tr; + int err; + + err = perf_event__process_comm(tool, event, sample, machine); + if (err) + return err; + + thread = machine__find_thread(machine, sample->pid, sample->tid); + if (!thread) { + pr_err("Internal error: can't find thread\n"); + return -1; + } + + tr = thread__get_runtime(thread); + if (tr == NULL) { + thread__put(thread); + return -1; + } + + tr->comm_changed = true; + thread__put(thread); + + return 0; +} + static int perf_sched__read_events(struct perf_sched *sched) { const struct perf_evsel_str_handler handlers[] = { @@ -3306,7 +3339,7 @@ int cmd_sched(int argc, const char **argv) struct perf_sched sched = { .tool = { .sample = perf_sched__process_tracepoint_sample, - .comm= perf_event__process_comm, + .comm= perf_sched__process_comm, .namespaces = perf_event__process_namespaces, .lost= perf_event__process_lost, .fork= perf_sched__process_fork_event, -- 2.7.4
[PATCH v2 0/2] perf sched map: re-annotate shortname if thread comm changed
From: Changbin Du v2: o add a patch to move thread::shortname to thread_runtime o add function perf_sched__process_comm() to process PERF_RECORD_COMM event. Changbin Du (2): perf sched: move thread::shortname to thread_runtime perf sched map: re-annotate shortname if thread comm changed tools/perf/builtin-sched.c | 132 ++--- tools/perf/util/thread.h | 1 - 2 files changed, 90 insertions(+), 43 deletions(-) -- 2.7.4
Re: [RESEND PATCH] perf sched map: re-annotate shortname if thread comm changed
I have just finished the final version, please check v2. Thanks for your comments! On Mon, Mar 05, 2018 at 11:37:54PM +0100, Jiri Olsa wrote: > On Mon, Mar 05, 2018 at 03:11:36PM +0800, Du, Changbin wrote: > > SNIP > > > > > on the other hand it's simple enough and looks > > > > like generic solution would be more tricky > > > > > > What about adding perf_sched__process_comm() to set it in the > > > thread::priv? > > > > > It can be done, then thread->comm_changed moves to > > thread_runtime->comm_changed. > > Draft code as below. It is also a little tricky. > > > > +int perf_sched__process_comm(struct perf_tool *tool __maybe_unused, > > +union perf_event *event, > > +struct perf_sample *sample, > > +struct machine *machine) > > +{ > > + struct thread *thread; > > + struct thread_runtime *r; > > + > > + perf_event__process_comm(tool, event, sample, machine); > > + > > + thread = machine__findnew_thread(machine, pid, tid); > > should you use machine__find_thread in here? > > > + if (thread) { > > + r = thread__priv(thread); > > + if (r) > > + r->comm_changed = true; > > + thread__put(thread); > > + } > > +} > > + > > static int perf_sched__read_events(struct perf_sched *sched) > > { > > const struct perf_evsel_str_handler handlers[] = { > > @@ -3291,7 +3311,7 @@ int cmd_sched(int argc, const char **argv) > > struct perf_sched sched = { > > .tool = { > > .sample = > > perf_sched__process_tracepoint_sample, > > - .comm= perf_event__process_comm, > > + .comm= perf_sched__process_comm, > > > > > > But I'd keep 'comm_changed' where 'shortname' is defined. I think they > > should appear > > together. And 'shortname' is only used by the sched command, too. > > they can both go to struct thread_runtime then > > > > > So I still prefer my previous simpler change. Thanks! > > I was wrong thinking that the amount of code > making it sched specific would be bigger > > we're trying to keep the core structs generic, > so this one fits better > > thanks, > jirka -- Thanks, Changbin Du
linux-next: Tree for Mar 6
Hi all, Changes since 20180305: The mali-dp tree gained a conflict against the drm-misc tree. Non-merge commits (relative to Linus' tree): 4880 (diffstat: 5418 files changed, 202951 insertions(+), 143721 deletions(-)). I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And finally, a simple boot test of the powerpc pseries_le_defconfig kernel in qemu (with and without kvm enabled). Below is a summary of the state of the merge. I am currently merging 260 trees (counting Linus' and 44 trees of bug fix patches pending for the current merge release). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to adding more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes.
-- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (661e50bc8532 Linux 4.16-rc4) Merging fixes/master (7928b2cbe55b Linux 4.16-rc1) Merging kbuild-current/fixes (638e69cf2230 fixdep: do not ignore kconfig.h) Merging arc-current/for-curr (661e50bc8532 Linux 4.16-rc4) Merging arm-current/fixes (091f02483df7 ARM: net: bpf: clarify tail_call index) Merging arm64-fixes/for-next/fixes (b08e5fd90bfc arm_pmu: Use disable_irq_nosync when disabling SPI in CPU teardown hook) Merging m68k-current/for-linus (2334b1ac1235 MAINTAINERS: Add NuBus subsystem entry) Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups) Merging powerpc-fixes/fixes (e7666d046ac0 ocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL) Merging sparc/master (aebb48f5e465 sparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64) Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2) Merging net/master (a7f0fb1bfb66 Merge branch 'hv_netvsc-minor-fixes') Merging bpf/master (d02f51cbcf12 bpf: fix bpf_skb_adjust_net/bpf_skb_proto_xlat to deal with gso sctp skbs) Merging ipsec/master (b8b549eec818 xfrm: Fix ESN sequence number handling for IPsec GSO packets.) 
Merging netfilter/master (4e00f5d5f9fc Merge tag 'batadv-net-for-davem-20180302' of git://git.open-mesh.org/linux-merge) Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook mask only if set) Merging wireless-drivers/master (78dc897b7ee6 rtlwifi: rtl8723be: Fix loss of signal) Merging mac80211/master (a78872363614 cfg80211: add missing dependency to CFG80211 suboptions) Merging rdma-fixes/for-rc (4cd482c12be4 IB/core : Add null pointer check in addr_resolve) Merging sound-current/for-linus (d5078193e56b ALSA: hda - Fix a wrong FIXUP for alc289 on Dell machines) Merging pci-current/for-linus (c37406e05d1e PCI: Allow release of resources that were never assigned) Merging driver-core.current/driver-core-linus (4a3928c6f8a5 Linux 4.16-rc3) Merging tty.current/tty-linus (5d7f77ec72d1 serial: imx: fix bogus dev_err) Merging usb.current/usb-linus (4a3928c6f8a5 Linux 4.16-rc3) Merging usb-gadget-fixes/fixes (c6ba5084ce0d usb: gadget: udc: renesas_usb3: add binging for r8a77965) Merging usb-serial-fixes/usb-linus (0a17f9fef994 USB: serial: ftdi_sio: add RT Systems VX-8 cable) Merging usb-chipidea-fixes/ci-for-usb-stable (964728f9f407 USB: chipidea: msm: fix ulpi-node lookup) Merging phy/fixes (7928b2cbe55b Linux 4.16-rc1) Merging staging.current/staging-linus (cb57469c9573 staging: android: ashmem: Fix lockdep issue during llseek) Merging char-misc.current/char-misc-linus (4a3928c6f8a5 Linux 4.16-rc3) Merging input-current/for-linus (ea4f7bd2aca9 Input: matrix_keypad - fix race when disabling interrupts) Merging crypto-current/master (c927b080c67e crypto: s5p-sss - Fix kernel Oops in AES-ECB mode) Merging ide/master (8e44e6600caa Merge branch 'KASAN-read_word_at_a_time') Mer
[PATCH v7 14/14] iommu/rockchip: Support sharing IOMMU between masters
Some masters may share the same IOMMU device. Put them in the same iommu group so that they share the same iommu domain. Signed-off-by: Jeffy Chen Reviewed-by: Robin Murphy --- Changes in v7: Use iommu_group_ref_get to avoid ref leak Changes in v6: None Changes in v5: None Changes in v4: None Changes in v3: Remove rk_iommudata->domain. Changes in v2: None drivers/iommu/rockchip-iommu.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index db08978203f7..6a1c7efa7c17 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -104,6 +104,7 @@ struct rk_iommu { struct iommu_device iommu; struct list_head node; /* entry in rk_iommu_domain.iommus */ struct iommu_domain *domain; /* domain to which iommu is attached */ + struct iommu_group *group; }; /** @@ -1091,6 +1092,15 @@ static void rk_iommu_remove_device(struct device *dev) iommu_group_remove_device(dev); } +static struct iommu_group *rk_iommu_device_group(struct device *dev) +{ + struct rk_iommu *iommu; + + iommu = rk_iommu_from_dev(dev); + + return iommu_group_ref_get(iommu->group); +} + static int rk_iommu_of_xlate(struct device *dev, struct of_phandle_args *args) { @@ -1122,7 +1132,7 @@ static const struct iommu_ops rk_iommu_ops = { .add_device = rk_iommu_add_device, .remove_device = rk_iommu_remove_device, .iova_to_phys = rk_iommu_iova_to_phys, - .device_group = generic_device_group, + .device_group = rk_iommu_device_group, .pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP, .of_xlate = rk_iommu_of_xlate, }; @@ -1191,9 +1201,15 @@ static int rk_iommu_probe(struct platform_device *pdev) if (err) return err; + iommu->group = iommu_group_alloc(); + if (IS_ERR(iommu->group)) { + err = PTR_ERR(iommu->group); + goto err_unprepare_clocks; + } + err = iommu_device_sysfs_add(&iommu->iommu, dev, NULL, dev_name(dev)); if (err) - goto err_unprepare_clocks; + goto err_put_group; iommu_device_set_ops(&iommu->iommu, 
&rk_iommu_ops); iommu_device_set_fwnode(&iommu->iommu, &dev->of_node->fwnode); @@ -1217,6 +1233,8 @@ static int rk_iommu_probe(struct platform_device *pdev) return 0; err_remove_sysfs: iommu_device_sysfs_remove(&iommu->iommu); +err_put_group: + iommu_group_put(iommu->group); err_unprepare_clocks: clk_bulk_unprepare(iommu->num_clocks, iommu->clocks); return err; -- 2.11.0
[PATCH v7 13/14] iommu/rockchip: Add runtime PM support
When the power domain is powered off, the IOMMU cannot be accessed and register programming must be deferred until the power domain becomes enabled. Add runtime PM support, and use runtime PM device link from IOMMU to master to startup and shutdown IOMMU. Signed-off-by: Jeffy Chen --- Changes in v7: Add WARN_ON in irq isr, and modify iommu archdata comment. Changes in v6: None Changes in v5: Avoid race about pm_runtime_get_if_in_use() and pm_runtime_enabled(). Changes in v4: None Changes in v3: Only call startup() and shutdown() when iommu attached. Remove pm_mutex. Check runtime PM disabled. Check pm_runtime in rk_iommu_irq(). Changes in v2: None drivers/iommu/rockchip-iommu.c | 189 - 1 file changed, 148 insertions(+), 41 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 2448a0528e39..db08978203f7 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -105,7 +106,14 @@ struct rk_iommu { struct iommu_domain *domain; /* domain to which iommu is attached */ }; +/** + * struct rk_iommudata - iommu archdata of master device. + * @link: device link with runtime PM integration from the master + * (consumer) to the IOMMU (supplier). + * @iommu: IOMMU of the master device. 
+ */ struct rk_iommudata { + struct device_link *link; struct rk_iommu *iommu; }; @@ -518,7 +526,13 @@ static irqreturn_t rk_iommu_irq(int irq, void *dev_id) u32 int_status; dma_addr_t iova; irqreturn_t ret = IRQ_NONE; - int i; + bool need_runtime_put; + int i, err; + + err = pm_runtime_get_if_in_use(iommu->dev); + if (WARN_ON(err <= 0 && err != -EINVAL)) + return ret; + need_runtime_put = err > 0; WARN_ON(clk_bulk_enable(iommu->num_clocks, iommu->clocks)); @@ -570,6 +584,9 @@ static irqreturn_t rk_iommu_irq(int irq, void *dev_id) clk_bulk_disable(iommu->num_clocks, iommu->clocks); + if (need_runtime_put) + pm_runtime_put(iommu->dev); + return ret; } @@ -611,10 +628,20 @@ static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain, spin_lock_irqsave(&rk_domain->iommus_lock, flags); list_for_each(pos, &rk_domain->iommus) { struct rk_iommu *iommu; + int ret; + iommu = list_entry(pos, struct rk_iommu, node); - WARN_ON(clk_bulk_enable(iommu->num_clocks, iommu->clocks)); - rk_iommu_zap_lines(iommu, iova, size); - clk_bulk_disable(iommu->num_clocks, iommu->clocks); + + /* Only zap TLBs of IOMMUs that are powered on. */ + ret = pm_runtime_get_if_in_use(iommu->dev); + if (ret > 0 || ret == -EINVAL) { + WARN_ON(clk_bulk_enable(iommu->num_clocks, + iommu->clocks)); + rk_iommu_zap_lines(iommu, iova, size); + clk_bulk_disable(iommu->num_clocks, iommu->clocks); + } + if (ret > 0) + pm_runtime_put(iommu->dev); } spin_unlock_irqrestore(&rk_domain->iommus_lock, flags); } @@ -817,22 +844,30 @@ static struct rk_iommu *rk_iommu_from_dev(struct device *dev) return data ? 
data->iommu : NULL; } -static int rk_iommu_attach_device(struct iommu_domain *domain, - struct device *dev) +/* Must be called with iommu powered on and attached */ +static void rk_iommu_shutdown(struct rk_iommu *iommu) { - struct rk_iommu *iommu; + int i; + + /* Ignore error while disabling, just keep going */ + WARN_ON(clk_bulk_enable(iommu->num_clocks, iommu->clocks)); + rk_iommu_enable_stall(iommu); + rk_iommu_disable_paging(iommu); + for (i = 0; i < iommu->num_mmu; i++) { + rk_iommu_write(iommu->bases[i], RK_MMU_INT_MASK, 0); + rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, 0); + } + rk_iommu_disable_stall(iommu); + clk_bulk_disable(iommu->num_clocks, iommu->clocks); +} + +/* Must be called with iommu powered on and attached */ +static int rk_iommu_startup(struct rk_iommu *iommu) +{ + struct iommu_domain *domain = iommu->domain; struct rk_iommu_domain *rk_domain = to_rk_domain(domain); - unsigned long flags; int ret, i; - /* -* Allow 'virtual devices' (e.g., drm) to attach to domain. -* Such a device does not belong to an iommu group. -*/ - iommu = rk_iommu_from_dev(dev); - if (!iommu) - return 0; - ret = clk_bulk_enable(iommu->num_clocks, iommu->clocks); if (ret) return ret; @@ -845,8 +880,6 @@ static int rk_iommu_attach_device(struct io
[PATCH v7 12/14] iommu/rockchip: Fix error handling in init
It's hard to undo bus_set_iommu() in the error path, so move it to the end of rk_iommu_probe(). Signed-off-by: Jeffy Chen Reviewed-by: Tomasz Figa Reviewed-by: Robin Murphy --- Changes in v7: None Changes in v6: None Changes in v5: None Changes in v4: None Changes in v3: None Changes in v2: Move bus_set_iommu() to rk_iommu_probe(). drivers/iommu/rockchip-iommu.c | 15 ++- 1 file changed, 2 insertions(+), 13 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 1346bbb8a3e7..2448a0528e39 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -1133,6 +1133,8 @@ static int rk_iommu_probe(struct platform_device *pdev) if (!dma_dev) dma_dev = &pdev->dev; + bus_set_iommu(&platform_bus_type, &rk_iommu_ops); + return 0; err_remove_sysfs: iommu_device_sysfs_remove(&iommu->iommu); @@ -1158,19 +1160,6 @@ static struct platform_driver rk_iommu_driver = { static int __init rk_iommu_init(void) { - struct device_node *np; - int ret; - - np = of_find_matching_node(NULL, rk_iommu_dt_ids); - if (!np) - return 0; - - of_node_put(np); - - ret = bus_set_iommu(&platform_bus_type, &rk_iommu_ops); - if (ret) - return ret; - return platform_driver_register(&rk_iommu_driver); } subsys_initcall(rk_iommu_init); -- 2.11.0
[PATCH v7 11/14] iommu/rockchip: Use OF_IOMMU to attach devices automatically
Converts the rockchip-iommu driver to use the OF_IOMMU infrastructure, which allows attaching master devices to their IOMMUs automatically according to DT properties. Signed-off-by: Jeffy Chen Reviewed-by: Robin Murphy --- Changes in v7: None Changes in v6: None Changes in v5: None Changes in v4: None Changes in v3: Add struct rk_iommudata. Squash iommu/rockchip: Use iommu_group_get_for_dev() for add_device Changes in v2: None drivers/iommu/rockchip-iommu.c | 135 - 1 file changed, 40 insertions(+), 95 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 6789e11b7087..1346bbb8a3e7 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include @@ -104,6 +105,10 @@ struct rk_iommu { struct iommu_domain *domain; /* domain to which iommu is attached */ }; +struct rk_iommudata { + struct rk_iommu *iommu; +}; + static struct device *dma_dev; static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma, @@ -807,18 +812,9 @@ static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova, static struct rk_iommu *rk_iommu_from_dev(struct device *dev) { - struct iommu_group *group; - struct device *iommu_dev; - struct rk_iommu *rk_iommu; + struct rk_iommudata *data = dev->archdata.iommu; - group = iommu_group_get(dev); - if (!group) - return NULL; - iommu_dev = iommu_group_get_iommudata(group); - rk_iommu = dev_get_drvdata(iommu_dev); - iommu_group_put(group); - - return rk_iommu; + return data ? 
data->iommu : NULL; } static int rk_iommu_attach_device(struct iommu_domain *domain, @@ -989,110 +985,53 @@ static void rk_iommu_domain_free(struct iommu_domain *domain) iommu_put_dma_cookie(&rk_domain->domain); } -static bool rk_iommu_is_dev_iommu_master(struct device *dev) -{ - struct device_node *np = dev->of_node; - int ret; - - /* -* An iommu master has an iommus property containing a list of phandles -* to iommu nodes, each with an #iommu-cells property with value 0. -*/ - ret = of_count_phandle_with_args(np, "iommus", "#iommu-cells"); - return (ret > 0); -} - -static int rk_iommu_group_set_iommudata(struct iommu_group *group, - struct device *dev) +static int rk_iommu_add_device(struct device *dev) { - struct device_node *np = dev->of_node; - struct platform_device *pd; - int ret; - struct of_phandle_args args; + struct iommu_group *group; + struct rk_iommu *iommu; - /* -* An iommu master has an iommus property containing a list of phandles -* to iommu nodes, each with an #iommu-cells property with value 0. 
-*/ - ret = of_parse_phandle_with_args(np, "iommus", "#iommu-cells", 0, -&args); - if (ret) { - dev_err(dev, "of_parse_phandle_with_args(%pOF) => %d\n", - np, ret); - return ret; - } - if (args.args_count != 0) { - dev_err(dev, "incorrect number of iommu params found for %pOF (found %d, expected 0)\n", - args.np, args.args_count); - return -EINVAL; - } + iommu = rk_iommu_from_dev(dev); + if (!iommu) + return -ENODEV; - pd = of_find_device_by_node(args.np); - of_node_put(args.np); - if (!pd) { - dev_err(dev, "iommu %pOF not found\n", args.np); - return -EPROBE_DEFER; - } + group = iommu_group_get_for_dev(dev); + if (IS_ERR(group)) + return PTR_ERR(group); + iommu_group_put(group); - /* TODO(djkurtz): handle multiple slave iommus for a single master */ - iommu_group_set_iommudata(group, &pd->dev, NULL); + iommu_device_link(&iommu->iommu, dev); return 0; } -static int rk_iommu_add_device(struct device *dev) +static void rk_iommu_remove_device(struct device *dev) { - struct iommu_group *group; struct rk_iommu *iommu; - int ret; - - if (!rk_iommu_is_dev_iommu_master(dev)) - return -ENODEV; - - group = iommu_group_get(dev); - if (!group) { - group = iommu_group_alloc(); - if (IS_ERR(group)) { - dev_err(dev, "Failed to allocate IOMMU group\n"); - return PTR_ERR(group); - } - } - - ret = iommu_group_add_device(group, dev); - if (ret) - goto err_put_group; - - ret = rk_iommu_group_set_iommudata(group, dev); - if (ret) - goto err_remove_device; iommu = rk_iommu_from_dev(dev); - if (iommu
[PATCH v7 10/14] iommu/rockchip: Use IOMMU device for dma mapping operations
Use the first registered IOMMU device for dma mapping operations, and drop the domain platform device. This is similar to exynos iommu driver. Signed-off-by: Jeffy Chen Reviewed-by: Tomasz Figa Reviewed-by: Robin Murphy --- Changes in v7: None Changes in v6: None Changes in v5: None Changes in v4: None Changes in v3: None Changes in v2: None drivers/iommu/rockchip-iommu.c | 85 -- 1 file changed, 24 insertions(+), 61 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 6c6275589bd5..6789e11b7087 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -79,7 +79,6 @@ struct rk_iommu_domain { struct list_head iommus; - struct platform_device *pdev; u32 *dt; /* page directory table */ dma_addr_t dt_dma; spinlock_t iommus_lock; /* lock for iommus list */ @@ -105,12 +104,14 @@ struct rk_iommu { struct iommu_domain *domain; /* domain to which iommu is attached */ }; +static struct device *dma_dev; + static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma, unsigned int count) { size_t size = count * sizeof(u32); /* count of u32 entry */ - dma_sync_single_for_device(&dom->pdev->dev, dma, size, DMA_TO_DEVICE); + dma_sync_single_for_device(dma_dev, dma, size, DMA_TO_DEVICE); } static struct rk_iommu_domain *to_rk_domain(struct iommu_domain *dom) @@ -625,7 +626,6 @@ static void rk_iommu_zap_iova_first_last(struct rk_iommu_domain *rk_domain, static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain, dma_addr_t iova) { - struct device *dev = &rk_domain->pdev->dev; u32 *page_table, *dte_addr; u32 dte_index, dte; phys_addr_t pt_phys; @@ -643,9 +643,9 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain, if (!page_table) return ERR_PTR(-ENOMEM); - pt_dma = dma_map_single(dev, page_table, SPAGE_SIZE, DMA_TO_DEVICE); - if (dma_mapping_error(dev, pt_dma)) { - dev_err(dev, "DMA mapping error while allocating page table\n"); + pt_dma = dma_map_single(dma_dev, 
page_table, SPAGE_SIZE, DMA_TO_DEVICE); + if (dma_mapping_error(dma_dev, pt_dma)) { + dev_err(dma_dev, "DMA mapping error while allocating page table\n"); free_page((unsigned long)page_table); return ERR_PTR(-ENOMEM); } @@ -911,29 +911,20 @@ static void rk_iommu_detach_device(struct iommu_domain *domain, static struct iommu_domain *rk_iommu_domain_alloc(unsigned type) { struct rk_iommu_domain *rk_domain; - struct platform_device *pdev; - struct device *iommu_dev; if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) return NULL; - /* Register a pdev per domain, so DMA API can base on this *dev -* even some virtual master doesn't have an iommu slave -*/ - pdev = platform_device_register_simple("rk_iommu_domain", - PLATFORM_DEVID_AUTO, NULL, 0); - if (IS_ERR(pdev)) + if (!dma_dev) return NULL; - rk_domain = devm_kzalloc(&pdev->dev, sizeof(*rk_domain), GFP_KERNEL); + rk_domain = devm_kzalloc(dma_dev, sizeof(*rk_domain), GFP_KERNEL); if (!rk_domain) - goto err_unreg_pdev; - - rk_domain->pdev = pdev; + return NULL; if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(&rk_domain->domain)) - goto err_unreg_pdev; + return NULL; /* * rk32xx iommus use a 2 level pagetable. 
@@ -944,11 +935,10 @@ static struct iommu_domain *rk_iommu_domain_alloc(unsigned type) if (!rk_domain->dt) goto err_put_cookie; - iommu_dev = &pdev->dev; - rk_domain->dt_dma = dma_map_single(iommu_dev, rk_domain->dt, + rk_domain->dt_dma = dma_map_single(dma_dev, rk_domain->dt, SPAGE_SIZE, DMA_TO_DEVICE); - if (dma_mapping_error(iommu_dev, rk_domain->dt_dma)) { - dev_err(iommu_dev, "DMA map error for DT\n"); + if (dma_mapping_error(dma_dev, rk_domain->dt_dma)) { + dev_err(dma_dev, "DMA map error for DT\n"); goto err_free_dt; } @@ -969,8 +959,6 @@ static struct iommu_domain *rk_iommu_domain_alloc(unsigned type) err_put_cookie: if (type == IOMMU_DOMAIN_DMA) iommu_put_dma_cookie(&rk_domain->domain); -err_unreg_pdev: - platform_device_unregister(pdev); return NULL; } @@ -987,20 +975,18 @@ static void rk_iommu_domain_free(struct iommu_domain *domain) if (rk_dte_is_pt_valid(dte)) { phys_addr_t pt_phys = rk_dte_pt_address(dte);
Re: [PATCH 4.4 000/193] 4.4.118-stable review
Patch #132 (which didn't reach the mailing list) was: > From: Arnd Bergmann > Date: Wed, 26 Oct 2016 15:55:02 -0700 > Subject: Input: tca8418_keypad - hide gcc-4.9 -Wmaybe-uninitialized warning > > commit ea4348c8462a20e8b1b6455a7145d2b86f8a49b6 upstream. This appears to introduce a regression, fixed upstream by: commit 9dd46c02532a6bed6240101ecf4bbc407f8c6adf Author: Dmitry Torokhov Date: Mon Feb 13 15:45:59 2017 -0800 Input: tca8418_keypad - remove double read of key event register Ben. -- Ben Hutchings Software Developer, Codethink Ltd.
[PATCH v7 09/14] dt-bindings: iommu/rockchip: Add clock property
Add clock property, since we are going to control clocks in rockchip iommu driver. Signed-off-by: Jeffy Chen Reviewed-by: Rob Herring --- Changes in v7: None Changes in v6: Fix dt-binding as Robin suggested. Use aclk and iface clk as Rob and Robin suggested, and split binding patch. Changes in v5: None Changes in v4: None Changes in v3: None Changes in v2: None Documentation/devicetree/bindings/iommu/rockchip,iommu.txt | 7 +++ 1 file changed, 7 insertions(+) diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt b/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt index 2098f7732264..6ecefea1c6f9 100644 --- a/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt +++ b/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt @@ -14,6 +14,11 @@ Required properties: "single-master" device, and needs no additional information to associate with its master device. See: Documentation/devicetree/bindings/iommu/iommu.txt +- clocks : A list of clocks required for the IOMMU to be accessible by +the host CPU. +- clock-names : Should contain the following: + "iface" - Main peripheral bus clock (PCLK/HCL) (required) + "aclk" - AXI bus clock (required) Optional properties: - rockchip,disable-mmu-reset : Don't use the mmu reset operation. @@ -27,5 +32,7 @@ Example: reg = <0xff940300 0x100>; interrupts = ; interrupt-names = "vopl_mmu"; + clocks = <&cru ACLK_VOP1>, <&cru HCLK_VOP1>; + clock-names = "aclk", "iface"; #iommu-cells = <0>; }; -- 2.11.0
[PATCH v7 07/14] ARM: dts: rockchip: add clocks in iommu nodes
Add clocks in iommu nodes, since we are going to control clocks in rockchip iommu driver. Signed-off-by: Jeffy Chen --- Changes in v7: None Changes in v6: Add clk names, and modify all iommu nodes in all existing rockchip dts Changes in v5: Remove clk names. Changes in v4: None Changes in v3: None Changes in v2: None arch/arm/boot/dts/rk3036.dtsi| 2 ++ arch/arm/boot/dts/rk322x.dtsi| 8 arch/arm/boot/dts/rk3288.dtsi| 12 arch/arm64/boot/dts/rockchip/rk3328.dtsi | 10 ++ arch/arm64/boot/dts/rockchip/rk3368.dtsi | 10 ++ arch/arm64/boot/dts/rockchip/rk3399.dtsi | 14 -- 6 files changed, 54 insertions(+), 2 deletions(-) diff --git a/arch/arm/boot/dts/rk3036.dtsi b/arch/arm/boot/dts/rk3036.dtsi index a97458112ff6..567a6a725f9c 100644 --- a/arch/arm/boot/dts/rk3036.dtsi +++ b/arch/arm/boot/dts/rk3036.dtsi @@ -197,6 +197,8 @@ reg = <0x10118300 0x100>; interrupts = ; interrupt-names = "vop_mmu"; + clocks = <&cru ACLK_LCDC>, <&cru HCLK_LCDC>; + clock-names = "aclk", "iface"; #iommu-cells = <0>; status = "disabled"; }; diff --git a/arch/arm/boot/dts/rk322x.dtsi b/arch/arm/boot/dts/rk322x.dtsi index df1e47858675..be80e9a2c9af 100644 --- a/arch/arm/boot/dts/rk322x.dtsi +++ b/arch/arm/boot/dts/rk322x.dtsi @@ -584,6 +584,8 @@ reg = <0x20020800 0x100>; interrupts = ; interrupt-names = "vpu_mmu"; + clocks = <&cru ACLK_VPU>, <&cru HCLK_VPU>; + clock-names = "aclk", "iface"; iommu-cells = <0>; status = "disabled"; }; @@ -593,6 +595,8 @@ reg = <0x20030480 0x40>, <0x200304c0 0x40>; interrupts = ; interrupt-names = "vdec_mmu"; + clocks = <&cru ACLK_RKVDEC>, <&cru HCLK_RKVDEC>; + clock-names = "aclk", "iface"; iommu-cells = <0>; status = "disabled"; }; @@ -602,6 +606,8 @@ reg = <0x20053f00 0x100>; interrupts = ; interrupt-names = "vop_mmu"; + clocks = <&cru ACLK_VOP>, <&cru HCLK_VOP>; + clock-names = "aclk", "iface"; iommu-cells = <0>; status = "disabled"; }; @@ -611,6 +617,8 @@ reg = <0x20070800 0x100>; interrupts = ; interrupt-names = "iep_mmu"; + clocks = <&cru ACLK_IEP>, <&cru HCLK_IEP>; 
+ clock-names = "aclk", "iface"; iommu-cells = <0>; status = "disabled"; }; diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi index 6102e4e7f35c..ad77c8eb3c38 100644 --- a/arch/arm/boot/dts/rk3288.dtsi +++ b/arch/arm/boot/dts/rk3288.dtsi @@ -958,6 +958,8 @@ reg = <0x0 0xff900800 0x0 0x40>; interrupts = ; interrupt-names = "iep_mmu"; + clocks = <&cru ACLK_IEP>, <&cru HCLK_IEP>; + clock-names = "aclk", "iface"; #iommu-cells = <0>; status = "disabled"; }; @@ -967,6 +969,8 @@ reg = <0x0 0xff914000 0x0 0x100>, <0x0 0xff915000 0x0 0x100>; interrupts = ; interrupt-names = "isp_mmu"; + clocks = <&cru ACLK_ISP>, <&cru HCLK_ISP>; + clock-names = "aclk", "iface"; #iommu-cells = <0>; rockchip,disable-mmu-reset; status = "disabled"; @@ -1026,6 +1030,8 @@ reg = <0x0 0xff930300 0x0 0x100>; interrupts = ; interrupt-names = "vopb_mmu"; + clocks = <&cru ACLK_VOP0>, <&cru HCLK_VOP0>; + clock-names = "aclk", "iface"; power-domains = <&power RK3288_PD_VIO>; #iommu-cells = <0>; status = "disabled"; @@ -1074,6 +1080,8 @@ reg = <0x0 0xff940300 0x0 0x100>; interrupts = ; interrupt-names = "vopl_mmu"; + clocks = <&cru ACLK_VOP1>, <&cru HCLK_VOP1>; + clock-names = "aclk", "iface"; power-domains = <&power RK3288_PD_VIO>; #iommu-cells = <0>; status = "disabled"; @@ -1204,6 +1212,8 @@ reg = <0x0 0xff9a0800 0x0 0x100>; interrupts = ; interrupt-names = "vpu_mmu"; + clocks = <&cru ACLK_VCODEC>, <&cru HCLK_VCODEC>; + clock-names = "aclk", "iface"; #iommu-cells = <0>; status = "disabled"; }; @@ -1213,6 +1223,8 @@ reg = <0x0 0xff9c0440 0x0 0x40>, <0x0 0xff9c0480 0x0 0x40>; interrupts = ; interrupt-names = "hevc_mmu"; + clocks = <&cru
[PATCH v7 04/14] iommu/rockchip: Fix error handling in attach
From: Tomasz Figa Currently if the driver encounters an error while attaching device, it will leave the IOMMU in an inconsistent state. Even though it shouldn't really happen in reality, let's just add proper error path to keep things consistent. Signed-off-by: Tomasz Figa Signed-off-by: Jeffy Chen Reviewed-by: Robin Murphy --- Changes in v7: None Changes in v6: None Changes in v5: Use out labels to save the duplication between the error and success paths. Changes in v4: None Changes in v3: None Changes in v2: Move irq request to probe(in patch[0]) drivers/iommu/rockchip-iommu.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index b743d82e6fe1..f7ff3a3645ea 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -824,7 +824,7 @@ static int rk_iommu_attach_device(struct iommu_domain *domain, ret = rk_iommu_force_reset(iommu); if (ret) - return ret; + goto out_disable_stall; iommu->domain = domain; @@ -837,7 +837,7 @@ static int rk_iommu_attach_device(struct iommu_domain *domain, ret = rk_iommu_enable_paging(iommu); if (ret) - return ret; + goto out_disable_stall; spin_lock_irqsave(&rk_domain->iommus_lock, flags); list_add_tail(&iommu->node, &rk_domain->iommus); @@ -845,9 +845,9 @@ static int rk_iommu_attach_device(struct iommu_domain *domain, dev_dbg(dev, "Attached to iommu domain\n"); +out_disable_stall: rk_iommu_disable_stall(iommu); - - return 0; + return ret; } static void rk_iommu_detach_device(struct iommu_domain *domain, -- 2.11.0
[PATCH v7 08/14] iommu/rockchip: Control clocks needed to access the IOMMU
From: Tomasz Figa Current code relies on master driver enabling necessary clocks before IOMMU is accessed, however there are cases when the IOMMU should be accessed while the master is not running yet, for example allocating V4L2 videobuf2 buffers, which is done by the VB2 framework using DMA mapping API and doesn't engage the master driver at all. This patch fixes the problem by letting clocks needed for IOMMU operation to be listed in Device Tree and making the driver enable them for the time of accessing the hardware. Signed-off-by: Jeffy Chen Signed-off-by: Tomasz Figa Acked-by: Robin Murphy --- Changes in v7: None Changes in v6: Fix dt-binding as Robin suggested. Use aclk and iface clk as Rob and Robin suggested, and split binding patch. Changes in v5: Use clk_bulk APIs. Changes in v4: None Changes in v3: None Changes in v2: None drivers/iommu/rockchip-iommu.c | 54 +- 1 file changed, 48 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index c4131ca792e0..6c6275589bd5 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -4,6 +4,7 @@ * published by the Free Software Foundation. 
*/ +#include #include #include #include @@ -87,10 +88,17 @@ struct rk_iommu_domain { struct iommu_domain domain; }; +/* list of clocks required by IOMMU */ +static const char * const rk_iommu_clocks[] = { + "aclk", "iface", +}; + struct rk_iommu { struct device *dev; void __iomem **bases; int num_mmu; + struct clk_bulk_data *clocks; + int num_clocks; bool reset_disabled; struct iommu_device iommu; struct list_head node; /* entry in rk_iommu_domain.iommus */ @@ -506,6 +514,8 @@ static irqreturn_t rk_iommu_irq(int irq, void *dev_id) irqreturn_t ret = IRQ_NONE; int i; + WARN_ON(clk_bulk_enable(iommu->num_clocks, iommu->clocks)); + for (i = 0; i < iommu->num_mmu; i++) { int_status = rk_iommu_read(iommu->bases[i], RK_MMU_INT_STATUS); if (int_status == 0) @@ -552,6 +562,8 @@ static irqreturn_t rk_iommu_irq(int irq, void *dev_id) rk_iommu_write(iommu->bases[i], RK_MMU_INT_CLEAR, int_status); } + clk_bulk_disable(iommu->num_clocks, iommu->clocks); + return ret; } @@ -594,7 +606,9 @@ static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain, list_for_each(pos, &rk_domain->iommus) { struct rk_iommu *iommu; iommu = list_entry(pos, struct rk_iommu, node); + WARN_ON(clk_bulk_enable(iommu->num_clocks, iommu->clocks)); rk_iommu_zap_lines(iommu, iova, size); + clk_bulk_disable(iommu->num_clocks, iommu->clocks); } spin_unlock_irqrestore(&rk_domain->iommus_lock, flags); } @@ -823,10 +837,14 @@ static int rk_iommu_attach_device(struct iommu_domain *domain, if (!iommu) return 0; - ret = rk_iommu_enable_stall(iommu); + ret = clk_bulk_enable(iommu->num_clocks, iommu->clocks); if (ret) return ret; + ret = rk_iommu_enable_stall(iommu); + if (ret) + goto out_disable_clocks; + ret = rk_iommu_force_reset(iommu); if (ret) goto out_disable_stall; @@ -852,6 +870,8 @@ static int rk_iommu_attach_device(struct iommu_domain *domain, out_disable_stall: rk_iommu_disable_stall(iommu); +out_disable_clocks: + clk_bulk_disable(iommu->num_clocks, iommu->clocks); return ret; } @@ -873,6 +893,7 @@ 
static void rk_iommu_detach_device(struct iommu_domain *domain, spin_unlock_irqrestore(&rk_domain->iommus_lock, flags); /* Ignore error while disabling, just keep going */ + WARN_ON(clk_bulk_enable(iommu->num_clocks, iommu->clocks)); rk_iommu_enable_stall(iommu); rk_iommu_disable_paging(iommu); for (i = 0; i < iommu->num_mmu; i++) { @@ -880,6 +901,7 @@ static void rk_iommu_detach_device(struct iommu_domain *domain, rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, 0); } rk_iommu_disable_stall(iommu); + clk_bulk_disable(iommu->num_clocks, iommu->clocks); iommu->domain = NULL; @@ -1172,18 +1194,38 @@ static int rk_iommu_probe(struct platform_device *pdev) iommu->reset_disabled = device_property_read_bool(dev, "rockchip,disable-mmu-reset"); - err = iommu_device_sysfs_add(&iommu->iommu, dev, NULL, dev_name(dev)); + iommu->num_clocks = ARRAY_SIZE(rk_iommu_clocks); + iommu->clocks = devm_kcalloc(iommu->dev, iommu->num_clocks, +sizeof(*iommu->clocks), GFP_KERNEL); + if (!iommu->clocks) + return -ENOMEM; + + for (i = 0; i < iommu->num_clocks; ++i) + iommu->cl
[PATCH v7 05/14] iommu/rockchip: Use iopoll helpers to wait for hardware
From: Tomasz Figa This patch converts the rockchip-iommu driver to use the in-kernel iopoll helpers to wait for certain status bits to change in registers instead of an open-coded custom macro. Signed-off-by: Tomasz Figa Signed-off-by: Jeffy Chen Reviewed-by: Robin Murphy --- Changes in v7: None Changes in v6: None Changes in v5: Use RK_MMU_POLL_PERIOD_US instead of 100. Changes in v4: None Changes in v3: None Changes in v2: None drivers/iommu/rockchip-iommu.c | 75 ++ 1 file changed, 39 insertions(+), 36 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index f7ff3a3645ea..baba283ccdf9 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -13,7 +13,7 @@ #include #include #include -#include +#include #include #include #include @@ -36,7 +36,10 @@ #define RK_MMU_AUTO_GATING 0x24 #define DTE_ADDR_DUMMY 0xCAFEBABE -#define FORCE_RESET_TIMEOUT100 /* ms */ + +#define RK_MMU_POLL_PERIOD_US 100 +#define RK_MMU_FORCE_RESET_TIMEOUT_US 10 +#define RK_MMU_POLL_TIMEOUT_US 1000 /* RK_MMU_STATUS fields */ #define RK_MMU_STATUS_PAGING_ENABLED BIT(0) @@ -73,8 +76,6 @@ */ #define RK_IOMMU_PGSIZE_BITMAP 0x007ff000 -#define IOMMU_REG_POLL_COUNT_FAST 1000 - struct rk_iommu_domain { struct list_head iommus; struct platform_device *pdev; @@ -109,27 +110,6 @@ static struct rk_iommu_domain *to_rk_domain(struct iommu_domain *dom) return container_of(dom, struct rk_iommu_domain, domain); } -/** - * Inspired by _wait_for in intel_drv.h - * This is NOT safe for use in interrupt context. - * - * Note that it's important that we check the condition again after having - * timed out, since the timeout could be due to preemption or similar and - * we've never had a chance to check the condition before the timeout. - */ -#define rk_wait_for(COND, MS) ({ \ - unsigned long timeout__ = jiffies + msecs_to_jiffies(MS) + 1; \ - int ret__ = 0; \ - while (!(COND)) { \ - if (time_after(jiffies, timeout__)) { \ - ret__ = (COND) ? 
0 : -ETIMEDOUT;\ - break; \ - } \ - usleep_range(50, 100); \ - } \ - ret__; \ -}) - /* * The Rockchip rk3288 iommu uses a 2-level page table. * The first level is the "Directory Table" (DT). @@ -333,9 +313,21 @@ static bool rk_iommu_is_paging_enabled(struct rk_iommu *iommu) return enable; } +static bool rk_iommu_is_reset_done(struct rk_iommu *iommu) +{ + bool done = true; + int i; + + for (i = 0; i < iommu->num_mmu; i++) + done &= rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR) == 0; + + return done; +} + static int rk_iommu_enable_stall(struct rk_iommu *iommu) { int ret, i; + bool val; if (rk_iommu_is_stall_active(iommu)) return 0; @@ -346,7 +338,9 @@ static int rk_iommu_enable_stall(struct rk_iommu *iommu) rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_STALL); - ret = rk_wait_for(rk_iommu_is_stall_active(iommu), 1); + ret = readx_poll_timeout(rk_iommu_is_stall_active, iommu, val, +val, RK_MMU_POLL_PERIOD_US, +RK_MMU_POLL_TIMEOUT_US); if (ret) for (i = 0; i < iommu->num_mmu; i++) dev_err(iommu->dev, "Enable stall request timed out, status: %#08x\n", @@ -358,13 +352,16 @@ static int rk_iommu_enable_stall(struct rk_iommu *iommu) static int rk_iommu_disable_stall(struct rk_iommu *iommu) { int ret, i; + bool val; if (!rk_iommu_is_stall_active(iommu)) return 0; rk_iommu_command(iommu, RK_MMU_CMD_DISABLE_STALL); - ret = rk_wait_for(!rk_iommu_is_stall_active(iommu), 1); + ret = readx_poll_timeout(rk_iommu_is_stall_active, iommu, val, +!val, RK_MMU_POLL_PERIOD_US, +RK_MMU_POLL_TIMEOUT_US); if (ret) for (i = 0; i < iommu->num_mmu; i++) dev_err(iommu->dev, "Disable stall request timed out, status: %#08x\n", @@ -376,13 +373,16 @@ static int rk_iommu_disable_stall(struct rk_iommu *iommu) static int rk_iommu_enable_paging(struct rk_iommu *iommu) { int ret, i; + bool val; if (rk_iommu_is_paging_enabled(iommu)) return 0; rk_iommu_command(iommu, RK_MMU_CMD_ENABLE_PAGING); -
[PATCH v7 06/14] iommu/rockchip: Fix TLB flush of secondary IOMMUs
From: Tomasz Figa Due to the bug in current code, only first IOMMU has the TLB lines flushed in rk_iommu_zap_lines. This patch fixes the inner loop to execute for all IOMMUs and properly flush the TLB. Signed-off-by: Tomasz Figa Signed-off-by: Jeffy Chen --- Changes in v7: None Changes in v6: None Changes in v5: None Changes in v4: None Changes in v3: None Changes in v2: None drivers/iommu/rockchip-iommu.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index baba283ccdf9..c4131ca792e0 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -274,19 +274,21 @@ static void rk_iommu_base_command(void __iomem *base, u32 command) { writel(command, base + RK_MMU_COMMAND); } -static void rk_iommu_zap_lines(struct rk_iommu *iommu, dma_addr_t iova, +static void rk_iommu_zap_lines(struct rk_iommu *iommu, dma_addr_t iova_start, size_t size) { int i; - - dma_addr_t iova_end = iova + size; + dma_addr_t iova_end = iova_start + size; /* * TODO(djkurtz): Figure out when it is more efficient to shootdown the * entire iotlb rather than iterate over individual iovas. */ - for (i = 0; i < iommu->num_mmu; i++) - for (; iova < iova_end; iova += SPAGE_SIZE) + for (i = 0; i < iommu->num_mmu; i++) { + dma_addr_t iova; + + for (iova = iova_start; iova < iova_end; iova += SPAGE_SIZE) rk_iommu_write(iommu->bases[i], RK_MMU_ZAP_ONE_LINE, iova); + } } static bool rk_iommu_is_stall_active(struct rk_iommu *iommu) -- 2.11.0
Re: Would you help to tell why async printk solution was not taken to upstream kernel ?
On Tue, 6 Mar 2018 11:43:58 +0900 Sergey Senozhatsky wrote: > One more thing > > On (03/06/18 10:52), Sergey Senozhatsky wrote: > [..] > > > If you know the baud rate, logbuf size * console throughput is actually > > > trivial to calculate. > > It's trivial when your setup is trivial. In a less trivial case if you > set watchdog threshold based on "logbuf size * console throughput" then > things are still too bad. > > So this is what a typical printk over serial console looks like > > printk() > console_unlock() > for (;;) { >local_irq_save() > call_console_drivers() > foo_console_write() > spin_lock_irqsave(&port->lock, flags); > uart_console_write(port, s, count, foo_console_putchar); > spin_unlock_irqrestore(&port->lock, flags); >local_irq_restore() > } > > Notice that call_console_drivers->foo_console_write spins on > port->lock every time it wants to print out a logbuf line. > Why does it do this? > > In short, because of printf(). Yes, printk() may depend on printf(). > > printf() > n_tty_write() > uart_write() >uart_port_lock(state, flags) // > spin_lock_irqsave(&uport->lock, flags) > memcpy(circ->buf + circ->head, buf, c); >uart_port_unlock(port, flags) // > spin_unlock_irqrestore(&port->lock, flags); > > Now, printf() messages stored in uart circ buffer must be printed > to the console. And this is where console's IRQ handler jumps in. > > A typical IRQ handler does something like this > > static irqreturn_t foo_console_irq_handler(...) > { > spin_lock(&port->lock); > rx_chars(port, status); > tx_chars(port, status); > spin_unlock(&port->lock); > } > > Where tx_chars() usually does something like this > > while (...) { > write_char(port, xmit->buf[xmit->tail]); > xmit->tail = (xmit->tail + 1) & (UART_XMIT_SIZE - 1); > if (uart_circ_empty(xmit)) > break; > } > > Some drivers flush all pending chars, some drivers limit the number > of TX chars to some number, e.g. 512. 
> > But in any case, printk() -> call_console_drivers() -> foo_console_write() > must spin on port->lock as long as foo_console_irq_handler() has chars to > TX / RX. > > Thus, if you have O(logbuf) of kernel messages, and O(circ->buf) of user > space messages, then printk() will spend O(logbuf) + O(circ->buf) + O(RX). > > So the watchdog threshold value based purely on O(logbuf) (printing to > _all_ of the consoles) will not always work. > If you have a complex setup happening like above, you most likely have printks happening on multiple CPUs which means the work load will be spread out across those CPUs. -- Steve
Re: [PATCH 4.4 130/193] [media] tc358743: fix register i2c_rd/wr functions
On Fri, 2018-02-23 at 19:26 +0100, Greg Kroah-Hartman wrote: > 4.4-stable review patch. If anyone has any objections, please let me know. > > -- > > From: Arnd Bergmann > > commit 3538aa6ecfb2dd727a40f9ebbbf25a0c2afe6226 upstream. [...] This introduces a regression in i2c_wr8_and_or(), fixed upstream by: commit f2c61f98e0b5f8b53b8fb860e5dcdd661bde7d0b Author: Philipp Zabel Date: Thu May 4 12:20:17 2017 -0300 [media] tc358743: fix register i2c_rd/wr function fix Ben. -- Ben Hutchings Software Developer, Codethink Ltd.
Re: Would you help to tell why async printk solution was not taken to upstream kernel ?
On Tue, 6 Mar 2018 11:53:50 +0900 Sergey Senozhatsky wrote: > Yes. My point was that "CPU can print one full buffer max" is not > guaranteed and not exactly true. There are ways for CPUs to break > that O(logbuf) boundary. Yes, when printk or the consoles have a bug, it can be greater than O(logbuf). -- Steve