date:20180202

[PATCH 00/12] Major code reorganization to make all i2c transfers working

2018-02-02 Thread Abhishek Sahu

The current driver fails in the following test cases:
1. Handling of failure cases does not work in the long run in BAM
   mode. It sometimes generates the error message “bam-dma-engine
   7884000.dma: Cannot free busy channel”.
2. The following I2C transfers fail:
   a. Single transfer with multiple read messages
   b. Single transfer with multiple read/write messages with the maximum
      allowed length per message (65K) in BAM mode
   c. Single transfer with a write greater than 32 bytes in QUP v1 and
      a write greater than 64 bytes in QUP v2 in non-DMA mode
3. There is no handling for Block/FIFO interrupts. Any non-error
   interrupt is treated as transfer completion, and the FIFO is then
   polled for available/free bytes.

To fix all these issues, major code changes are required. This patch
series fixes all the above issues and makes the driver interrupt based
instead of polling based. After these changes, all the mentioned test
cases are working properly.

The code changes have been tested on QUP v1 (IPQ8064) and QUP
v2 (IPQ8074) with a sample application written on top of i2c-dev.

Abhishek Sahu (12):
  i2c: qup: fixed releasing dma without flush operation completion
  i2c: qup: minor code reorganization for use_dma
  i2c: qup: remove redundant variables for BAM SG count
  i2c: qup: schedule EOT and FLUSH tags at the end of transfer
  i2c: qup: fix the transfer length for BAM rx EOT FLUSH tags
  i2c: qup: proper error handling for i2c error in BAM mode
  i2c: qup: use the complete transfer length to choose DMA mode
  i2c: qup: change completion timeout according to transfer length
  i2c: qup: fix buffer overflow for multiple msg of maximum xfer len
  i2c: qup: send NACK for last read sub transfers
  i2c: qup: reorganization of driver code to remove polling for qup v1
  i2c: qup: reorganization of driver code to remove polling for qup v2

 drivers/i2c/busses/i2c-qup.c | 1538 +-
 1 file changed, 924 insertions(+), 614 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation

[PATCH 02/12] i2c: qup: minor code reorganization for use_dma

2018-02-02 Thread Abhishek Sahu

1. Assigns use_dma in qup_dev structure itself which will
   help in subsequent patches to determine the mode in IRQ handler.
2. Does minor code reorganization for loops to reduce the
   unnecessary comparison and assignment.

Signed-off-by: Abhishek Sahu 
---
 drivers/i2c/busses/i2c-qup.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c
index 9faa26c41a..c68f433 100644
--- a/drivers/i2c/busses/i2c-qup.c
+++ b/drivers/i2c/busses/i2c-qup.c
@@ -190,6 +190,8 @@ struct qup_i2c_dev {
 
 	/* dma parameters */
 	bool			is_dma;
+	/* To check if the current transfer is using DMA */
+	bool			use_dma;
 	struct			dma_pool *dpool;
 	struct			qup_i2c_tag start_tag;
 	struct			qup_i2c_bam brx;
@@ -1297,7 +1299,7 @@ static int qup_i2c_xfer_v2(struct i2c_adapter *adap,
 			   int num)
 {
 	struct qup_i2c_dev *qup = i2c_get_adapdata(adap);
-	int ret, len, idx = 0, use_dma = 0;
+	int ret, len, idx = 0;
 
 	qup->bus_err = 0;
 	qup->qup_err = 0;
@@ -1326,13 +1328,12 @@ static int qup_i2c_xfer_v2(struct i2c_adapter *adap,
 			len = (msgs[idx].len > qup->out_fifo_sz) ||
 			      (msgs[idx].len > qup->in_fifo_sz);
 
-			if ((!is_vmalloc_addr(msgs[idx].buf)) && len) {
-				use_dma = 1;
-			} else {
-				use_dma = 0;
+			if (is_vmalloc_addr(msgs[idx].buf) || !len)
 				break;
-			}
 		}
+
+		if (idx == num)
+			qup->use_dma = true;
 	}
 
 	idx = 0;
@@ -1356,15 +1357,17 @@ static int qup_i2c_xfer_v2(struct i2c_adapter *adap,
 
 		reinit_completion(&qup->xfer);
 
-		if (use_dma) {
+		if (qup->use_dma) {
 			ret = qup_i2c_bam_xfer(adap, &msgs[idx], num);
+			qup->use_dma = false;
+			break;
 		} else {
 			if (msgs[idx].flags & I2C_M_RD)
 				ret = qup_i2c_read_one_v2(qup, &msgs[idx]);
 			else
 				ret = qup_i2c_write_one_v2(qup, &msgs[idx]);
 		}
-	} while ((idx++ < (num - 1)) && !use_dma && !ret);
+	} while ((idx++ < (num - 1)) && !ret);
 
 	if (!ret)
 		ret = qup_i2c_change_state(qup, QUP_RESET_STATE);
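
For readers following the series, the mode selection after this patch can
be modelled outside the kernel. The sketch below is a user-space
approximation, not driver code: struct demo_msg, demo_choose_dma() and the
in_vmalloc flag are hypothetical stand-ins for struct i2c_msg, the loop in
qup_i2c_xfer_v2() and is_vmalloc_addr(). It assumes, as the hunk above
suggests, that DMA is chosen only when every message is larger than at
least one of the FIFOs and no buffer is vmalloc'ed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct i2c_msg; only what the check needs. */
struct demo_msg {
	size_t len;		/* message length in bytes */
	bool in_vmalloc;	/* models is_vmalloc_addr(msg->buf) */
};

/*
 * Return true only if every message qualifies for DMA: it must exceed the
 * output or the input FIFO size and its buffer must not be vmalloc'ed.
 */
static bool demo_choose_dma(const struct demo_msg *msgs, int num,
			    size_t out_fifo_sz, size_t in_fifo_sz)
{
	int idx;

	for (idx = 0; idx < num; idx++) {
		bool big = msgs[idx].len > out_fifo_sz ||
			   msgs[idx].len > in_fifo_sz;

		if (msgs[idx].in_vmalloc || !big)
			break;	/* one unsuitable message rules out DMA */
	}

	/* The loop ran to completion only if no message was rejected. */
	return idx == num;
}
```

Checking idx == num after the loop is what lets the patch drop the
per-iteration assignment to use_dma: the flag is written once, and only
when no message caused an early break.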

[PATCH 01/12] i2c: qup: fixed releasing dma without flush operation completion

2018-02-02 Thread Abhishek Sahu

The QUP BLSP BAM sometimes generates the following error if the
current I2C DMA transfer fails and the flush operation has been
scheduled:

“bam-dma-engine 7884000.dma: Cannot free busy channel”

If any I2C error occurs during a BAM DMA transfer, the QUP I2C
interrupt is generated and a flush operation is carried out to make
the I2C controller consume all scheduled DMA transfers. Currently, the
flush reuses the completion structure of the BAM transfer, which has
already completed, without reinitializing it. As a result,
wait_for_completion_timeout() for the flush returns immediately and
the driver proceeds to free the DMA resources while the descriptors
are still being processed.

Signed-off-by: Abhishek Sahu 
---
 drivers/i2c/busses/i2c-qup.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c
index 08f8e01..9faa26c41a 100644
--- a/drivers/i2c/busses/i2c-qup.c
+++ b/drivers/i2c/busses/i2c-qup.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2009-2013, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2009-2013, 2016-2018, The Linux Foundation. All rights reserved.
  * Copyright (c) 2014, Sony Mobile Communications AB.
  *
  *
@@ -844,6 +844,8 @@ static int qup_i2c_bam_do_xfer(struct qup_i2c_dev *qup, struct i2c_msg *msg,
 	}
 
 	if (ret || qup->bus_err || qup->qup_err) {
+		reinit_completion(&qup->xfer);
+
 		if (qup_i2c_change_state(qup, QUP_RUN_STATE)) {
 			dev_err(qup->dev, "change to run state timed out");
 			goto desc_err;
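
The effect of the missing reinit can be illustrated with a toy model of a
completion as a counter of pending "done" events. This is a user-space
sketch only: struct demo_completion and the demo_* helpers are
hypothetical, demo_try_wait() is a non-blocking stand-in for
wait_for_completion_timeout(), and the sequence of two signals before the
flush is an assumption made for illustration, not something the commit
message spells out.

```c
#include <stdbool.h>

/* Toy model of a completion: a counter of pending "done" events. */
struct demo_completion {
	unsigned int done;
};

static void demo_reinit(struct demo_completion *c)
{
	c->done = 0;
}

static void demo_complete(struct demo_completion *c)
{
	c->done++;
}

/* Returns true and consumes one event if one is pending, else false. */
static bool demo_try_wait(struct demo_completion *c)
{
	if (!c->done)
		return false;
	c->done--;
	return true;
}

/*
 * Returns true if the wait for the flush would return before the flush
 * itself has signalled anything.
 */
static bool demo_flush_returns_early(bool reinit_before_flush)
{
	struct demo_completion xfer;

	demo_reinit(&xfer);

	/* Assumed: the failed transfer signalled the completion twice. */
	demo_complete(&xfer);
	demo_complete(&xfer);

	/* The wait for the transfer consumes one of the two events. */
	demo_try_wait(&xfer);

	if (reinit_before_flush)
		demo_reinit(&xfer);

	/* The flush has been scheduled but has not completed yet. */
	return demo_try_wait(&xfer);
}
```

In this model a stale event left over from the transfer satisfies the wait
for the flush unless the counter is reset first, which is the role
reinit_completion() plays in the hunk above.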

Re: [PATCH v2 2/2] HID: core: Fix size as type u32

2018-02-02 Thread Marcus Folkesson

Hi Aaron,

On Mon, Jan 08, 2018 at 10:41:41AM +0800, Aaron Ma wrote:
> When size is negative, calling memset will cause a segmentation fault.
> Declare the size as type u32 to keep memset safe.
> 
> size in struct hid_report is unsigned, fix return type of
> hid_report_len to u32.
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Aaron Ma 
> ---
>  drivers/hid/hid-core.c | 10 +++++-----
>  include/linux/hid.h    |  6 +++---
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
> index 0c3f608131cf..cf81c53e3b98 100644
> --- a/drivers/hid/hid-core.c
> +++ b/drivers/hid/hid-core.c
> @@ -1390,7 +1390,7 @@ u8 *hid_alloc_report_buf(struct hid_report *report, gfp_t flags)
>* of implement() working on 8 byte chunks
>*/
>  
> - int len = hid_report_len(report) + 7;
> + u32 len = hid_report_len(report) + 7;
>  
>   return kmalloc(len, flags);
>  }
> @@ -1455,7 +1455,7 @@ void __hid_request(struct hid_device *hid, struct hid_report *report,
>  {
>   char *buf;
>   int ret;
> - int len;
> + u32 len;
>  
>   buf = hid_alloc_report_buf(report, GFP_KERNEL);
>   if (!buf)
> @@ -1481,14 +1481,14 @@ void __hid_request(struct hid_device *hid, struct hid_report *report,
>  }
>  EXPORT_SYMBOL_GPL(__hid_request);
>  
> -int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, int size,
> +int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, u32 size,
>   int interrupt)
>  {
>   struct hid_report_enum *report_enum = hid->report_enum + type;
>   struct hid_report *report;
>   struct hid_driver *hdrv;
>   unsigned int a;
> - int rsize, csize = size;
> + u32 rsize, csize = size;
>   u8 *cdata = data;
>   int ret = 0;
>  
> @@ -1546,7 +1546,7 @@ EXPORT_SYMBOL_GPL(hid_report_raw_event);
>   *
>   * This is data entry for lower layers.
>   */
> -int hid_input_report(struct hid_device *hid, int type, u8 *data, int size, int interrupt)
> +int hid_input_report(struct hid_device *hid, int type, u8 *data, u32 size, int interrupt)
>  {
>   struct hid_report_enum *report_enum;
>   struct hid_driver *hdrv;
> diff --git a/include/linux/hid.h b/include/linux/hid.h
> index d491027a7c22..9bc296eebc98 100644
> --- a/include/linux/hid.h
> +++ b/include/linux/hid.h
> @@ -841,7 +841,7 @@ extern int hidinput_connect(struct hid_device *hid, unsigned int force);
>  extern void hidinput_disconnect(struct hid_device *);
>  
>  int hid_set_field(struct hid_field *, unsigned, __s32);
> -int hid_input_report(struct hid_device *, int type, u8 *, int, int);
> +int hid_input_report(struct hid_device *, int type, u8 *, u32, int);
>  int hidinput_find_field(struct hid_device *hid, unsigned int type, unsigned int code, struct hid_field **field);
>  struct hid_field *hidinput_get_led_field(struct hid_device *hid);
>  unsigned int hidinput_count_leds(struct hid_device *hid);
> @@ -1088,13 +1088,13 @@ static inline void hid_hw_wait(struct hid_device *hdev)
>   *
>   * @report: the report we want to know the length
>   */
> -static inline int hid_report_len(struct hid_report *report)
> +static inline u32 hid_report_len(struct hid_report *report)

hid_report_len() is used in several files.
If we think it is a good idea to change the return type, we should fix
these files as well.

[08:47:56]marcus@little:~/git/linux$ git grep -l hid_report_len
drivers/hid/hid-core.c
drivers/hid/hid-input.c
drivers/hid/hid-multitouch.c
drivers/hid/hid-rmi.c
drivers/hid/usbhid/hid-core.c
drivers/hid/wacom_sys.c
drivers/staging/greybus/hid.c
include/linux/hid.h

>  {
>   /* equivalent to DIV_ROUND_UP(report->size, 8) + !!(report->id > 0) */
>   return ((report->size - 1) >> 3) + 1 + (report->id > 0);
>  }
>  
> -int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, int size,
> +int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, u32 size,
>   int interrupt);
>  
>  /* HID quirks API */
> -- 
> 2.14.3

Best regards
Marcus Folkesson
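
As a side note on the helper quoted above: its comment states that the
shift-based expression is equivalent to DIV_ROUND_UP(report->size, 8) plus
one byte for a non-zero report ID. The user-space sketch below checks that
claim for sizes of at least one bit. The demo_* names and the local
DEMO_DIV_ROUND_UP macro are hypothetical and only mirror the quoted
expression; a size of zero is deliberately left out of the comparison.

```c
#include <stdint.h>

#define DEMO_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Mirrors the quoted expression, using u32-style unsigned arithmetic. */
static uint32_t demo_report_len(uint32_t size_bits, uint32_t id)
{
	return ((size_bits - 1) >> 3) + 1 + (id > 0);
}

/* Reference form taken from the comment in the quoted helper. */
static uint32_t demo_report_len_ref(uint32_t size_bits, uint32_t id)
{
	return DEMO_DIV_ROUND_UP(size_bits, 8) + !!(id > 0);
}

/* Returns 1 if both forms agree for every size in 1..max_bits. */
static int demo_forms_agree(uint32_t max_bits)
{
	uint32_t bits, id;

	for (bits = 1; bits <= max_bits; bits++)
		for (id = 0; id <= 1; id++)
			if (demo_report_len(bits, id) !=
			    demo_report_len_ref(bits, id))
				return 0;
	return 1;
}
```

Any caller that stores the result in an int would still convert the value
implicitly, which is why the return type change has to be followed through
in the files listed above.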

Re: [BUG] x86 : i486 reporting to be vulnerable to Meltdown/Spectre_V1/Spectre_V2

2018-02-02 Thread David Woodhouse

On Fri, 2018-02-02 at 23:52 -0500, tedheadster wrote:
> I just tested the 4.15 kernel and it is reporting that my old i486
> (non-cpuid capable) cpu is vulnerable to all three issues: Meltdown,
> Spectre V1, and Spectre V2.
> 
> I find this to be _unlikely_.

This should be fixed in Linus' tree already by commit fec9434a1
("x86/pti: Do not enable PTI on CPUs which are not vulnerable to
Meltdown").

We'll make sure it ends up in the stable tree too, if it hasn't
already.


Re: [PATCH 1/3] net: stmmac: dwmac-sun8i: drop V3s compatible and add V3 one

2018-02-02 Thread Icenowy Zheng



On February 3, 2018 at 6:13:01 AM GMT+08:00, Maxime Ripard wrote:
>On Sat, Feb 03, 2018 at 02:04:54AM +0800, Icenowy Zheng wrote:
>> The V3s is just a differently packaged version of the V3 chip, which has
>> a MAC with the same capability with H3. The V3s just doesn't wire out
>> the external MII/RMII/RGMII bus. (V3 wired out it).
>> 
>> Drop the compatible string of V3s in the dwmac-sun8i driver, and add a
>> V3 compatible string, which has all capabilities.
>> 
>> Signed-off-by: Icenowy Zheng 
>
>This breaks the DT ABI, so NAK.

I asked about this on IRC.

The V3s compatible string has never been used in any mainline
kernel, not even in an RC version.

>
>Maxime

Re: Coccinelle: zalloc-simple: Checking consistency for SmPL rules

2018-02-02 Thread SF Markus Elfring

>> * Do we agree that a proper size determination is essential for every
>>   condition in the discussed SmPL rules together with forwarding
>>   this information?
> 
> No.  I don't mind a few false positives.

Do you care to split SmPL rules by their confidence category in such a use case?

Regards,
Markus

Re: [linux-sunxi] [PATCH 1/3] net: stmmac: dwmac-sun8i: drop V3s compatible and add V3 one

2018-02-02 Thread Icenowy Zheng



On February 3, 2018 at 2:00:33 PM GMT+08:00, Julian Calaby wrote:
>Hi Icenowy,
>
>On Sat, Feb 3, 2018 at 5:04 AM, Icenowy Zheng  wrote:
>> The V3s is just a differently packaged version of the V3 chip, which has
>> a MAC with the same capability with H3. The V3s just doesn't wire out
>> the external MII/RMII/RGMII bus. (V3 wired out it).
>>
>> Drop the compatible string of V3s in the dwmac-sun8i driver, and add a
>> V3 compatible string, which has all capabilities.
>
>Aren't compatible strings technically API, so don't we need to support
>those that are out in the wild "forever"?
>
>Therefore shouldn't we leave the v3s variant around for compatibility
>with existing device trees?

You can run grep in arch/arm/boot/dts; this compatible
string is not used at all.

>
>Thanks,

Re: clang warning: implicit conversion in intel_ddi.c:1481

2018-02-02 Thread Knut Omang

On Fri, 2018-02-02 at 16:50 +0100, Greg KH wrote:
> On Fri, Feb 02, 2018 at 04:37:55PM +0200, Jani Nikula wrote:
> > On Fri, 02 Feb 2018, Greg KH  wrote:
> > > On Fri, Feb 02, 2018 at 12:44:38PM +0200, Jani Nikula wrote:
> > >> 
> > >> +Knut, Fengguang
> > >> 
> > >> On Fri, 02 Feb 2018, Greg KH  wrote:
> > >> >- If clang now builds the kernel "cleanly", yes, I want to take
> > >> >  warning fixes in the stable tree.  And even better yet, if you
> > >> >  keep working to ensure the tree is "clean", that would be
> > >> >  wonderful.
> > >> 
> > >> So we can run sparse using 'make C=1' and friends, or other static
> > >> analysis tools using 'make CHECK=foo C=1', as long as the passed command
> > >> line params work. There was work by Knut to extend this make checker
> > >> stuff [1]. Since mixing different HOSTCC's in a single workdir seems
> > >> like a bad idea, I wonder how hard it would be to make clang work like
> > >> this:
> > >> 
> > >> $ make CHECK=clang C=1
> > >> 
> > >> Or using Knut's wrapper. Feels like that could increase the use of clang
> > >> for static analysis of patches.
> > >
> > > Why not just build with clang itself:
> > >   make CC=clang
> > 
> > Same as HOSTCC, mixing different CC's in a single build dir seems like a
> > bad idea. Sure, everyone can setup a separate build dir for clang, but
> > IMHO having 'make CHECK=clang C=1' work has least resistance. YMMV.
> 
> "O=some_output_dir" is your friend.  If you aren't doing that already
> for your test builds, you don't know what you are missing :)

I use O= a lot myself. It is good not to have all the output files "pollute" the
source tree, and to be able to switch branches and compile without having to
recompile everything, by having multiple O= directories set up.

I think what my runchecks wrapper script brings in addition is the ability to do
a number of checks which may or may not pass, or may even return error codes,
from the same 'make' command, and to configure which errors to fix now and which
to postpone/ignore (and thus not fail on).

As an example, I just tried clang (on v4.15-rc6) with:

cd $HOME/src/kernel
make O=$HOME/build/kernel/clang
cd $HOME/build/kernel/clang
make

and it fails to compile for me in arch/x86/xen/mmu_pv.o. 

If I just wanted to make sure that some patches did not introduce new errors
with clang, I would waste some time on unrelated errors, and there would be
noise in the output, also consuming personal "cycles".

I haven't really looked at the details of the errors clang reports yet, but I
can imagine that specific errors reported by clang might be useful to correct
even in old kernels, where some files will inevitably fail to compile like this.

This would be easy to handle with runchecks, using a few exceptions for those
problems/files not yet fixed. A run could then easily detect (while compiling
with gcc as the main compiler) that no new clang errors of any kind other than
those suppressed were introduced.

Thanks,
Knut

> 
> thanks,
> 
> greg k-h

Re: [PATCH RESEND v3] perf/core: Fix installing cgroup event into cpu

2018-02-02 Thread kbuild test robot

Hi leilei.lin,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/perf/core]
[also build test ERROR on v4.15 next-20180202]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/linxiulei-gmail-com/perf-core-Fix-installing-cgroup-event-into-cpu/20180203-133110
config: i386-randconfig-s0-201804 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   kernel/events/core.c: In function '__perf_install_in_context':
>> kernel/events/core.c:2332:10: error: implicit declaration of function 'perf_cgroup_from_task' [-Werror=implicit-function-declaration]
  cgrp = perf_cgroup_from_task(current, ctx);
 ^
   kernel/events/core.c:2332:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
  cgrp = perf_cgroup_from_task(current, ctx);
   ^
   kernel/events/core.c:2333:40: error: dereferencing pointer to incomplete type 'struct perf_cgroup'
  reprogram = cgroup_is_descendant(cgrp->css.cgroup,
   ^~
   kernel/events/core.c:2334:11: error: 'struct perf_event' has no member named 'cgrp'
 event->cgrp->css.cgroup);
  ^~
   cc1: some warnings being treated as errors

vim +/perf_cgroup_from_task +2332 kernel/events/core.c

  2284  
  2285  /*
  2286   * Cross CPU call to install and enable a performance event
  2287   *
  2288   * Very similar to remote_function() + event_function() but cannot assume that
  2289   * things like ctx->is_active and cpuctx->task_ctx are set.
  2290   */
  2291  static int  __perf_install_in_context(void *info)
  2292  {
  2293  struct perf_event *event = info;
  2294  struct perf_event_context *ctx = event->ctx;
  2295  struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
  2296  struct perf_event_context *task_ctx = cpuctx->task_ctx;
  2297  struct perf_cgroup *cgrp;
  2298  bool reprogram = true;
  2299  int ret = 0;
  2300  
  2301  raw_spin_lock(&cpuctx->ctx.lock);
  2302  if (ctx->task) {
  2303  raw_spin_lock(&ctx->lock);
  2304  task_ctx = ctx;
  2305  
  2306  reprogram = (ctx->task == current);
  2307  
  2308  /*
  2309   * If the task is running, it must be running on this CPU,
  2310   * otherwise we cannot reprogram things.
  2311   *
  2312   * If its not running, we don't care, ctx->lock will
  2313   * serialize against it becoming runnable.
  2314   */
  2315  if (task_curr(ctx->task) && !reprogram) {
  2316  ret = -ESRCH;
  2317  goto unlock;
  2318  }
  2319  
  2320  WARN_ON_ONCE(reprogram && cpuctx->task_ctx && cpuctx->task_ctx != ctx);
  2321  } else if (task_ctx) {
  2322  raw_spin_lock(&task_ctx->lock);
  2323  }
  2324  
  2325  if (is_cgroup_event(event)) {
  2326  /*
  2327   * Only care about cgroup events.
  2328   *
  2329   * If only the task belongs to cgroup of this event,
  2330   * we will continue the installment
  2331   */
> 2332  cgrp = perf_cgroup_from_task(current, ctx);
  2333  reprogram = cgroup_is_descendant(cgrp->css.cgroup,
  2334  event->cgrp->css.cgroup);
  2335  }
  2336  
  2337  if (reprogram) {
  2338  ctx_sched_out(ctx, cpuctx, EVENT_TIME);
  2339  add_event_to_ctx(event, ctx);
  2340  ctx_resched(cpuctx, task_ctx, get_event_type(event));
  2341  } else {
  2342  add_event_to_ctx(event, ctx);
  2343  }
  2344  
  2345  unlock:
  2346  perf_ctx_unlock(cpuctx, task_ctx);
  2347  
  2348  return ret;
  2349  }
  2350  
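
The errors above are consistent with a configuration in which cgroup perf
support is compiled out, so that neither perf_cgroup_from_task() nor the
event->cgrp member exists; that is an inference from the messages, not
something the report states. A common way to keep such a call site
building in both configurations is to hide the config-dependent code
behind a helper that has a stub for the disabled case. The user-space
sketch below illustrates the pattern only: DEMO_CGROUP_PERF,
struct demo_event and demo_should_reprogram() are hypothetical names, not
kernel symbols.

```c
#include <stdbool.h>

struct demo_event {
	int id;
#ifdef DEMO_CGROUP_PERF
	int cgrp_id;	/* member exists only when the feature is enabled */
#endif
};

#ifdef DEMO_CGROUP_PERF
/* Feature enabled: compare the event's cgroup with the current one. */
static inline bool demo_should_reprogram(const struct demo_event *event,
					 int current_cgrp_id)
{
	return event->cgrp_id == current_cgrp_id;
}
#else
/* Feature disabled: the stub never touches the missing member. */
static inline bool demo_should_reprogram(const struct demo_event *event,
					 int current_cgrp_id)
{
	(void)event;
	(void)current_cgrp_id;
	return true;
}
#endif
```

With the stub in place the caller needs no #ifdef of its own, and the
disabled case keeps the default behaviour (here, always reprogram).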

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



Re: [PATCH] media: cx25821: prevent out-of-bounds read on array card

2018-02-02 Thread kbuild test robot

Hi Colin,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linuxtv-media/master]
[also build test WARNING on v4.15 next-20180202]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Colin-King/media-cx25821-prevent-out-of-bounds-read-on-array-card/20180203-130958
base:   git://linuxtv.org/media_tree.git master
config: xtensa-allyesconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=xtensa 

All warnings (new ones prefixed by >>):

   In file included from include/linux/printk.h:7:0,
                    from include/linux/kernel.h:14,
                    from include/linux/list.h:9,
                    from include/linux/kobject.h:20,
                    from include/linux/device.h:17,
                    from include/linux/i2c.h:30,
                    from drivers/media/pci/cx25821/cx25821-core.c:22:
   drivers/media/pci/cx25821/cx25821-core.c: In function 'cx25821_dev_setup':
>> include/linux/kern_levels.h:5:18: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'unsigned int' [-Wformat=]
#define KERN_SOH "\001"  /* ASCII Start Of Header */
 ^
   include/linux/kern_levels.h:14:19: note: in expansion of macro 'KERN_SOH'
#define KERN_INFO KERN_SOH "6" /* informational */
  ^~~~
   include/linux/printk.h:308:9: note: in expansion of macro 'KERN_INFO'
 printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
^
>> drivers/media/pci/cx25821/cx25821.h:380:2: note: in expansion of macro 'pr_info'
 pr_info("(%d): " fmt, dev->board, ##args)
 ^~~
>> drivers/media/pci/cx25821/cx25821-core.c:871:3: note: in expansion of macro 'CX25821_INFO'
  CX25821_INFO("dev->nr >= %ld", ARRAY_SIZE(card));
  ^~~~

vim +/pr_info +380 drivers/media/pci/cx25821/cx25821.h

02b20b0b drivers/staging/cx25821/cx25821.h Mauro Carvalho Chehab 2009-09-15  374
36d89f7d drivers/staging/cx25821/cx25821.h Joe Perches           2010-11-07  375  #define CX25821_ERR(fmt, args...) \
36d89f7d drivers/staging/cx25821/cx25821.h Joe Perches           2010-11-07  376  	pr_err("(%d): " fmt, dev->board, ##args)
36d89f7d drivers/staging/cx25821/cx25821.h Joe Perches           2010-11-07  377  #define CX25821_WARN(fmt, args...) \
36d89f7d drivers/staging/cx25821/cx25821.h Joe Perches           2010-11-07  378  	pr_warn("(%d): " fmt, dev->board, ##args)
36d89f7d drivers/staging/cx25821/cx25821.h Joe Perches           2010-11-07  379  #define CX25821_INFO(fmt, args...) \
36d89f7d drivers/staging/cx25821/cx25821.h Joe Perches           2010-11-07 @380  	pr_info("(%d): " fmt, dev->board, ##args)
02b20b0b drivers/staging/cx25821/cx25821.h Mauro Carvalho Chehab 2009-09-15  381

:: The code at line 380 was first introduced by commit
:: 36d89f7de4a4937848de86d9b35cb03a9f0357e1 [media] drivers/staging/cx25821: Use pr_fmt and pr_

:: TO: Joe Perches 
:: CC: Mauro Carvalho Chehab 
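
The warning arises because ARRAY_SIZE() evaluates to a size_t, which on
this 32-bit target is unsigned int, while the format string asks for a
long. The usual remedy is the %zu conversion, which matches size_t on
every architecture. The user-space sketch below shows the idea with
snprintf(); DEMO_ARRAY_SIZE, demo_card and demo_format_limit() are
hypothetical stand-ins for the kernel macro, the driver's card array and
the CX25821_INFO() call.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the driver's card array. */
static const int demo_card[] = { 1, 2, 3, 4 };

/* Formats the message with %zu; returns snprintf()'s return value. */
static int demo_format_limit(char *buf, size_t buflen)
{
	return snprintf(buf, buflen, "dev->nr >= %zu",
			DEMO_ARRAY_SIZE(demo_card));
}
```

Because %zu follows the width of size_t, the same format string builds
without warnings on both 32-bit and 64-bit targets.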


Re: [linux-sunxi] [PATCH 1/3] net: stmmac: dwmac-sun8i: drop V3s compatible and add V3 one

2018-02-02 Thread Julian Calaby

Hi Icenowy,

On Sat, Feb 3, 2018 at 5:04 AM, Icenowy Zheng  wrote:
> The V3s is just a differently packaged version of the V3 chip, which has
> a MAC with the same capability with H3. The V3s just doesn't wire out
> the external MII/RMII/RGMII bus. (V3 wired out it).
>
> Drop the compatible string of V3s in the dwmac-sun8i driver, and add a
> V3 compatible string, which has all capabilities.

Aren't compatible strings technically API, so don't we need to support
those that are out in the wild "forever"?

Therefore shouldn't we leave the v3s variant around for compatibility
with existing device trees?

Thanks,

-- 
Julian Calaby

Email: julian.cal...@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/
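
The compatibility concern raised here can be sketched with a toy match
table. A driver binds to a device tree node by comparing the node's
compatible string with the entries in its table, so a device tree that
still carries a string removed from the table no longer matches. The code
below is a user-space illustration under stated assumptions: the
"vendor,..." strings, struct demo_match and demo_matches() are
hypothetical and are not the identifiers used by dwmac-sun8i.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a driver's match table entry. */
struct demo_match {
	const char *compatible;
};

/* Table before the change: still lists the older string. */
static const struct demo_match demo_table_old[] = {
	{ "vendor,soc-a-emac" },
	{ "vendor,soc-v3s-emac" },
	{ NULL },
};

/* Table after the change: older string dropped, new one added. */
static const struct demo_match demo_table_new[] = {
	{ "vendor,soc-a-emac" },
	{ "vendor,soc-v3-emac" },
	{ NULL },
};

/* Returns 1 if the compatible string appears in the table, else 0. */
static int demo_matches(const struct demo_match *table,
			const char *compatible)
{
	for (; table->compatible; table++)
		if (strcmp(table->compatible, compatible) == 0)
			return 1;
	return 0;
}
```

Whether dropping an entry matters in practice depends on whether any
device tree in circulation uses the old string, which is the point under
discussion in this thread.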

Re: [PATCH AUTOSEL for 3.18 36/40] powerpc/xmon: Avoid tripping SMP hardlockup watchdog

2018-02-02 Thread Nicholas Piggin

On Tue, 30 Jan 2018 15:35:54 +1100
Michael Ellerman  wrote:

> alexander.le...@verizon.com writes:
> 
> > On Thu, Dec 14, 2017 at 12:10:39AM +1100, Michael Ellerman wrote:  
> >>alexander.le...@verizon.com writes:
> >>  
> >>> From: Nicholas Piggin 
> >>>
> >>> [ Upstream commit 064996d62a33ffe10264b5af5dca92d54f60f806 ]
> >>>
> >>> The SMP hardlockup watchdog cross-checks other CPUs for lockups, which
> >>> causes xmon headaches because it's assuming interrupts hard disabled
> >>> means no watchdog troubles. Try to improve that by calling
> >>> touch_nmi_watchdog() in obvious places where secondaries are spinning.
> >>>
> >>> Also annotate these spin loops with spin_begin/end calls.  
> >>
> >>These macros didn't exist until 4.13, and haven't been backported AFAIK.  
> >
> > But the touch_nmi_watchdog() bits are something we want in stable, right?  
> 
> I don't think you need them unless you've also back ported
> arch/powerpc/kernel/watchdog.c, which I don't think you have.
> 
> Maybe Nick can confirm?

I'm not 100% sure. The CPUs only check themselves for lockups. They will
blow their threshold when in xmon, but when they come out of xmon, I think
by a quirk of our local_irq_enable() implementation that actually checks
timers explicitly and runs them first before re-enabling hard interrupts,
then our heartbeat starts up again just before the perf interrupt would
come in to report the lockup.

I think.

Given that we've had no reports of misbehaviour of the old perf watchdog,
I would say you can skip the backport.

Thanks,
Nick

Re: [PATCH RESEND v3] perf/core: Fix installing cgroup event into cpu

2018-02-02 Thread kbuild test robot

Hi leilei.lin,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/perf/core]
[also build test ERROR on v4.15 next-20180202]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/linxiulei-gmail-com/perf-core-Fix-installing-cgroup-event-into-cpu/20180203-133110
config: i386-randconfig-x071-201804 (attached as .config)
compiler: gcc-7 (Debian 7.2.0-12) 7.2.1 20171025
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   kernel/events/core.c: In function '__perf_install_in_context':
>> kernel/events/core.c:2332:10: error: implicit declaration of function 'perf_cgroup_from_task'; did you mean 'perf_cgroup_match'? [-Werror=implicit-function-declaration]
  cgrp = perf_cgroup_from_task(current, ctx);
 ^
 perf_cgroup_match
   kernel/events/core.c:2332:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
  cgrp = perf_cgroup_from_task(current, ctx);
   ^
>> kernel/events/core.c:2333:40: error: dereferencing pointer to incomplete type 'struct perf_cgroup'
  reprogram = cgroup_is_descendant(cgrp->css.cgroup,
   ^~
>> kernel/events/core.c:2334:11: error: 'struct perf_event' has no member named 'cgrp'
 event->cgrp->css.cgroup);
  ^~
   cc1: some warnings being treated as errors

vim +2332 kernel/events/core.c

  2284  
  2285  /*
  2286   * Cross CPU call to install and enable a performance event
  2287   *
  2288   * Very similar to remote_function() + event_function() but cannot assume that
  2289   * things like ctx->is_active and cpuctx->task_ctx are set.
  2290   */
  2291  static int  __perf_install_in_context(void *info)
  2292  {
  2293  struct perf_event *event = info;
  2294  struct perf_event_context *ctx = event->ctx;
  2295  struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
  2296  struct perf_event_context *task_ctx = cpuctx->task_ctx;
  2297  struct perf_cgroup *cgrp;
  2298  bool reprogram = true;
  2299  int ret = 0;
  2300  
  2301  raw_spin_lock(&cpuctx->ctx.lock);
  2302  if (ctx->task) {
  2303  raw_spin_lock(&ctx->lock);
  2304  task_ctx = ctx;
  2305  
  2306  reprogram = (ctx->task == current);
  2307  
  2308  /*
  2309   * If the task is running, it must be running on this CPU,
  2310   * otherwise we cannot reprogram things.
  2311   *
  2312   * If its not running, we don't care, ctx->lock will
  2313   * serialize against it becoming runnable.
  2314   */
  2315  if (task_curr(ctx->task) && !reprogram) {
  2316  ret = -ESRCH;
  2317  goto unlock;
  2318  }
  2319  
  2320  WARN_ON_ONCE(reprogram && cpuctx->task_ctx && cpuctx->task_ctx != ctx);
  2321  } else if (task_ctx) {
  2322  raw_spin_lock(&task_ctx->lock);
  2323  }
  2324  
  2325  if (is_cgroup_event(event)) {
  2326  /*
  2327   * Only care about cgroup events.
  2328   *
  2329   * If only the task belongs to cgroup of this event,
  2330   * we will continue the installment
  2331   */
> 2332  cgrp = perf_cgroup_from_task(current, ctx);
> 2333  reprogram = cgroup_is_descendant(cgrp->css.cgroup,
> 2334  event->cgrp->css.cgroup);
  2335  }
  2336  
  2337  if (reprogram) {
  2338  ctx_sched_out(ctx, cpuctx, EVENT_TIME);
  2339  add_event_to_ctx(event, ctx);
  2340  ctx_resched(cpuctx, task_ctx, get_event_type(event));
  2341  } else {
  2342  add_event_to_ctx(event, ctx);
  2343  }
  2344  
  2345  unlock:
  2346  perf_ctx_unlock(cpuctx, task_ctx);
  2347  
  2348  return ret;
  2349  }
  2350  


[PATCH] audit: update bugtracker and source URIs

2018-02-02 Thread Richard Guy Briggs

Since the Linux Audit project has transitioned completely over to
github, update the MAINTAINERS file and the primary audit source file to
reflect that reality.

Signed-off-by: Richard Guy Briggs 
---
 MAINTAINERS    | 1 -
 kernel/audit.c | 3 ++-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 845fc25..fba4875 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2479,7 +2479,6 @@ M: Paul Moore 
 M: Eric Paris 
 L: linux-au...@redhat.com (moderated for non-subscribers)
 W: https://github.com/linux-audit
-W: https://people.redhat.com/sgrubb/audit
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git
 S: Supported
 F: include/linux/audit.h
diff --git a/kernel/audit.c b/kernel/audit.c
index 227db99..5c25449 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -38,7 +38,8 @@
  *   6) Support low-overhead kernel-based filtering to minimize the
  *  information that must be passed to user-space.
  *
- * Example user-space utilities: http://people.redhat.com/sgrubb/audit/
+ * Audit userspace, documentation, tests, and bug/issue trackers:
+ * https://github.com/linux-audit
  */
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-- 
1.8.3.1

[BUG] x86 : i486 reporting to be vulnerable to Meltdown/Spectre_V1/Spectre_V2

2018-02-02 Thread tedheadster

I just tested the 4.15 kernel and it is reporting that my old i486
(non-cpuid capable) cpu is vulnerable to all three issues: Meltdown,
Spectre V1, and Spectre V2.

I find this to be _unlikely_.

/sys/devices/system/cpu/vulnerabilities/* reports the following:

meltdown: "Vulnerable"
spectre_v1: "Vulnerable"
spectre_v2: "Vulnerable: Minimal generic ASM retpoline"

The output of dmesg includes:

"Spectre V2 mitigation: Vulnerable: Minimal generic ASM retpoline"
"Spectre V2 mitigation: Filling RSB on context switch"

Also, /proc/cpuinfo reports the following:

cpuid level: -1
flags: fpu retpoline rsb_ctxsw
bugs: cpu_meltdown spectre_v1 spectre_v2

I have the hardware to test on. Send me your patches.

- Matthew Whitehead

Re: [PATCH] net: mlx5: remove pointless memcpy

2018-02-02 Thread Saeed Mahameed



On 02/02/2018 12:26 PM, Arnd Bergmann wrote:
> On Fri, Feb 2, 2018 at 8:06 PM, Jason Gunthorpe  wrote:
>> On Fri, Feb 02, 2018 at 04:46:30PM +0100, Arnd Bergmann wrote:
>>> gcc-8 notices that the memcpy in mlx5_core_query_xsrq() makes no
>>> sense because the source and destination variables are identical:
>>>
>>> drivers/net/ethernet/mellanox/mlx5/core/transobj.c: In function 
>>> 'mlx5_core_query_xsrq':
>>> drivers/net/ethernet/mellanox/mlx5/core/transobj.c:347:3: error: 'memcpy' 
>>> source argument is the same as destination [-Werror=restrict]
>>>
>>> Either one of the pointers should be something else, or the code is
>>> completely bogus. Removing the memcpy() won't change the behavior
>>> but gets rid of the warning.
>>>
>>> Fixes: 01949d0109ee ("net/mlx5_core: Enable XRCs and SRQs when using ISSI > 
>>> 0")
>>> Signed-off-by: Arnd Bergmann 
>>> Please review carefully, I have no idea what the author actually
>>> intended here.
>>
>> I think they intended to adjust the command return between
>> mlx5_ifc_query_srq_out_bits and mlx5_ifc_query_xrc_srq_out_bits?
>>
>>> index 9e38343a951f..75450f7d53bf 100644
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/transobj.c
>>> @@ -332,20 +332,12 @@ int mlx5_core_destroy_xsrq(struct mlx5_core_dev *dev, 
>>> u32 xsrqn)
>>>  int mlx5_core_query_xsrq(struct mlx5_core_dev *dev, u32 xsrqn, u32 *out)
>>>  {
>>>   u32 in[MLX5_ST_SZ_DW(query_xrc_srq_in)] = {0};
>>> - void *srqc;
>>> - void *xrc_srqc;
>>>   int err;
>>>
>>>   MLX5_SET(query_xrc_srq_in, in, opcode,   MLX5_CMD_OP_QUERY_XRC_SRQ);
>>>   MLX5_SET(query_xrc_srq_in, in, xrc_srqn, xsrqn);
>>>   err = mlx5_cmd_exec(dev, in, sizeof(in), out,
>>>   MLX5_ST_SZ_BYTES(query_xrc_srq_out));
>>> - if (!err) {
>>> - xrc_srqc = MLX5_ADDR_OF(query_xrc_srq_out, out,
>>> - xrc_srq_context_entry);
>>> - srqc = MLX5_ADDR_OF(query_srq_out, out, srq_context_entry);
>>> - memcpy(srqc, xrc_srqc, MLX5_ST_SZ_BYTES(srqc));
>>> - }

OMG!

>>
>> Probably should add a
>>
>> BUILD_BUG_ON(MLX5_BYTE_OFF(query_xrc_srq_out, xrc_srq_context_entry) == 
>> MLX5_BYTE_OFF(query_srq_out, srq_context_entry));
>>
>> Just for clarity that the SRQ and XRC_SRQ are being used interchangeably.
>>
>> and the 'err' variable can be eliminated.
>>
>> Curious though that I can't find a call site for it, and removing the
>> prototype doesn't break the build.. Seems like dead code.
> 
> I checked the git history and don't see any user ever added after the function
> first showed up in the kernel, same for a couple of other functions from
> commit 01949d0109ee ("net/mlx5_core: Enable XRCs and SRQs when
> using ISSI > 0").
> 
> Can you come up with a proper patch for this issue, either removing the
> dead code, or fixing it appropriately? You clearly understand what this
> file is about, and I don't ;-)

Simply put, this is just pointless dead code; I will remove it. There is no
point in trying to figure out what the author was thinking the day he wrote
that patch :)

Thank you Arnd for spotting this.

> 
>   Arnd
>
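
A note on the BUILD_BUG_ON() suggestion quoted above: the kernel macro breaks
the build when its condition is true, so a check that the two context entries
sit at the same offset would normally be spelled with "!=" rather than "==".
Below is a minimal userspace sketch of that polarity. The demo_* structures
and DEMO_BUILD_BUG_ON are hypothetical stand-ins, not the real mlx5_ifc
layouts or the kernel macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two firmware output layouts; the real
 * mlx5_ifc_* definitions are not reproduced here. */
struct demo_query_srq_out {
	unsigned char status[16];
	unsigned char srq_context_entry[64];
};

struct demo_query_xrc_srq_out {
	unsigned char status[16];
	unsigned char xrc_srq_context_entry[64];
};

/* Same polarity as the kernel's BUILD_BUG_ON(): compilation fails when
 * cond is TRUE. C11 _Static_assert fails when its condition is FALSE,
 * hence the negation. */
#define DEMO_BUILD_BUG_ON(cond) \
	_Static_assert(!(cond), "build-time condition must be false")

/* "The offsets must match" is therefore written with != here. */
DEMO_BUILD_BUG_ON(offsetof(struct demo_query_xrc_srq_out, xrc_srq_context_entry) !=
		  offsetof(struct demo_query_srq_out, srq_context_entry));

/* Runtime view of the same property: 1 when both entries share an offset. */
static int demo_contexts_alias(void)
{
	return offsetof(struct demo_query_xrc_srq_out, xrc_srq_context_entry) ==
	       offsetof(struct demo_query_srq_out, srq_context_entry);
}
```

With matching offsets the file compiles and demo_contexts_alias() returns 1;
changing the size of either status[] array makes the build fail instead.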

Re: [PATCH 4.15 00/55] 4.15.1-stable review

2018-02-02 Thread Dan Rue

On Fri, Feb 02, 2018 at 05:58:18PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.15.1 release.
> There are 55 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Feb  4 14:07:50 UTC 2018.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.15.1-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.15.y
> and the diffstat can be found below.

Results from Linaro’s test farm.

No regressions since 4.15 release, but you'll notice high failure counts
in kselftest. These are because it was the first RC and I ran the tests
multiple times - first without a skipfile, and then again with a partial
skipfile. All of the failures look like known issues that we also saw on
4.15 release.

Summary


kernel: 4.15.1-rc1
git repo:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.15.y
git commit: b01b3d9519f250398695c7cc6493ba1e8fb072f4
git describe: v4.15-56-gb01b3d9519f2
Test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-4.15-oe/build/v4.15-56-gb01b3d9519f2


No regressions (compared to build )

Boards, architectures and test suites:
-

hi6220-hikey - arm64
* boot - pass: 38,
* kselftest - pass: 98, skip: 14, fail: 12
* libhugetlbfs - pass: 180, skip: 2,
* ltp-cap_bounds-tests - pass: 4,
* ltp-containers-tests - pass: 128,
* ltp-fcntl-locktests-tests - pass: 4,
* ltp-filecaps-tests - pass: 4,
* ltp-fs-tests - pass: 120,
* ltp-fs_bind-tests - pass: 4,
* ltp-fs_perms_simple-tests - pass: 38,
* ltp-fsx-tests - pass: 4,
* ltp-hugetlb-tests - pass: 21, skip: 1,
* ltp-io-tests - pass: 6,
* ltp-ipc-tests - pass: 18,
* ltp-math-tests - pass: 22,
* ltp-nptl-tests - pass: 4,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 20,
* ltp-securebits-tests - pass: 8,
* ltp-syscalls-tests - pass: 1968, skip: 242,
* ltp-timers-tests - pass: 24,

juno-r2 - arm64
* boot - pass: 31,
* kselftest - pass: 111, skip: 28, fail: 12
* libhugetlbfs - pass: 90, skip: 1,
* ltp-cap_bounds-tests - pass: 4,
* ltp-containers-tests - pass: 128,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 4,
* ltp-fs-tests - pass: 120,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 38,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 44,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 18,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 4,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 10,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 987, skip: 121,
* ltp-timers-tests - pass: 24,

x15 - arm
* boot - pass: 41,
* kselftest - pass: 92, skip: 32, fail: 15
* libhugetlbfs - pass: 174, skip: 2,
* ltp-cap_bounds-tests - pass: 4,
* ltp-containers-tests - pass: 124, fail: 4
* ltp-fcntl-locktests-tests - pass: 4,
* ltp-filecaps-tests - pass: 4,
* ltp-fs-tests - pass: 120,
* ltp-fs_bind-tests - pass: 4,
* ltp-fs_perms_simple-tests - pass: 38,
* ltp-fsx-tests - pass: 4,
* ltp-hugetlb-tests - pass: 40, skip: 4,
* ltp-io-tests - pass: 6,
* ltp-ipc-tests - pass: 18,
* ltp-math-tests - pass: 22,
* ltp-nptl-tests - pass: 4,
* ltp-pty-tests - pass: 8,
* ltp-sched-tests - pass: 26, skip: 2,
* ltp-securebits-tests - pass: 8,
* ltp-syscalls-tests - pass: 2076, skip: 132,
* ltp-timers-tests - pass: 24,

x86_64
* boot - pass: 40,
* kselftest - pass: 121, skip: 16, fail: 14
* libhugetlbfs - pass: 180, skip: 2,
* ltp-cap_bounds-tests - pass: 4,
* ltp-containers-tests - pass: 128,
* ltp-fcntl-locktests-tests - pass: 4,
* ltp-filecaps-tests - pass: 4,
* ltp-fs-tests - pass: 122, skip: 2,
* ltp-fs_bind-tests - pass: 4,
* ltp-fs_perms_simple-tests - pass: 38,
* ltp-fsx-tests - pass: 4,
* ltp-hugetlb-tests - pass: 44,
* ltp-io-tests - pass: 6,
* ltp-ipc-tests - pass: 18,
* ltp-math-tests - pass: 22,
* ltp-nptl-tests - pass: 4,
* ltp-pty-tests - pass: 8,
* ltp-sched-tests - pass: 18, skip: 2,
* ltp-securebits-tests - pass: 8,
* ltp-syscalls-tests - pass: 2032, skip: 232,
* ltp-timers-tests - pass: 24,


--
Linaro QA (beta)
https://qa-reports.linaro.org

Re: [PATCH 2/2] HID: i2c-hid: Fix resume issue on Raydium touchscreen device

2018-02-02 Thread Aaron Ma

Hi

Could anyone review and apply this single patch?

The 2nd patch had been sent as v2.

Regards,
Aaron

Re: [PATCH v2 2/2] HID: core: Fix size as type u32

2018-02-02 Thread Aaron Ma

Hi:

Could anyone review and apply these 2 patches?

Regards,
Aaron

Re: [PATCH] of: cache phandle nodes to decrease cost of of_find_node_by_phandle()

2018-02-02 Thread Frank Rowand

On 02/01/18 21:53, Chintan Pandya wrote:
> 
> 
> On 2/2/2018 2:39 AM, Frank Rowand wrote:
>> On 02/01/18 06:24, Rob Herring wrote:
>>> And so
>>> far, no one has explained why a bigger cache got slower.
>>
>> Yes, I still find that surprising.
> 
> I thought a bit about this. And realized that increasing the cache size 
> should help improve the performance only if there are too many misses with 
> the smaller cache. So, from my experiments some time back, I looked up the 
> logs and saw the access pattern. Seems like, there is *not_too_much* juggling 
> during look up by phandles.
> 
> See the access pattern here: 
> https://drive.google.com/file/d/1qfAD8OsswNJABgAwjJf6Gr_JZMeK7rLV/view?usp=sharing

Thanks!  Very interesting.

I was somewhat limited at playing detective with this, because the phandle
values are not consistent with the dts file you are currently working
with (arch/arm64/boot/dts/qcom/sda670-mtp.dts).  For example, I could
not determine what the target nodes are for the hot phandle values.  That
information _could_ possibly point at algorithms within the devicetree
core code that could be improved.  Or maybe not.  Hard to tell until
actually looking at the data.

Anyway, some observations were possible.

There are 485 unique phandle values searched for.

The ten phandle values most frequently referenced account for
3932 / 6745 (or 58%) of all references.

Without the corresponding devicetree I can not tell how many nodes
need to be scanned to locate each of these ten values (using the
existing algorithm).  Thus I can not determine how much scanning
would be eliminated by caching just the nodes corresponding to
these ten phandle values.

There are 89 phandle values that were searched for 10 times
or more, accounting for 86% of the searches.

Only 164 phandle values were searched for just one time.

303 phandle values were searched for just one or two times.

Here is a more complete picture:

  10 values each used 100 or more times; searches:  3932   58%
  11 values each used  90 or more times; searches:  3994   59%
  12 values each used  80 or more times; searches:  4045   60%
  13 values each used  70 or more times; searches:  4093   61%
  14 values each used  60 or more times; searches:  4136   61%
  15 values each used  50 or more times; searches:  4178   62%
  18 values each used  40 or more times; searches:  4300   64%
  32 values each used  30 or more times; searches:  4774   71%
  54 values each used  20 or more times; searches:  5293   78%
  89 values each used  10 or more times; searches:  5791   86%
  93 values each used   9 or more times; searches:  5827   86%
 117 values each used   8 or more times; searches:  6019   89%
 122 values each used   7 or more times; searches:  6054   90%
 132 values each used   6 or more times; searches:  6114   91%
 144 values each used   5 or more times; searches:  6174   92%
 162 values each used   4 or more times; searches:  6246   93%
 181 values each used   3 or more times; searches:  6303   93%
 320 values each used   2 or more times; searches:  6581   98%
 484 values each used   1 or more times; searches:  6746  100%

A single system does not prove anything.  It is possible that
other devicetrees would exhibit similarly long tailed behavior,
but that is just wild speculation on my part.

_If_ the long tail is representative of other systems, then
identifying a few hot spots could be useful, but fixing them
is not likely to significantly reduce the overhead of calls
to of_find_node_by_phandle().  Some method of reducing the
overhead of each call would be the answer for a system of
this class.
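
To make "reducing the overhead of each call" concrete, here is a minimal
userspace sketch of a direct-mapped phandle cache in front of a linear scan.
It is only an illustration under stated assumptions: struct demo_node, the
cache size of 64 and every demo_* name are hypothetical and are not the
devicetree core's data structures or the patch under review.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHE_SIZE 64u                 /* hypothetical; power of two */
#define DEMO_CACHE_MASK (DEMO_CACHE_SIZE - 1u)

struct demo_node {
	unsigned int phandle;               /* 0 means "no phandle" */
	struct demo_node *next;             /* stand-in for the all-nodes walk */
};

static struct demo_node *demo_cache[DEMO_CACHE_SIZE];
static unsigned long demo_scanned;          /* nodes visited by the slow path */

/* Slow path: walk every node until the phandle matches. */
static struct demo_node *demo_scan(struct demo_node *head, unsigned int ph)
{
	struct demo_node *np;

	for (np = head; np; np = np->next) {
		demo_scanned++;
		if (np->phandle == ph)
			return np;
	}
	return NULL;
}

/* Fast path: direct-mapped cache indexed by the low bits of the phandle.
 * A miss falls back to the scan and remembers the result. */
static struct demo_node *demo_find(struct demo_node *head, unsigned int ph)
{
	struct demo_node *np;

	if (!ph)
		return NULL;

	np = demo_cache[ph & DEMO_CACHE_MASK];
	if (np && np->phandle == ph)
		return np;

	np = demo_scan(head, ph);
	if (np)
		demo_cache[ph & DEMO_CACHE_MASK] = np;
	return np;
}
```

In this model a repeated lookup of the same phandle (as with the runs of 262,
253 and 254 in the sample log) costs one array access after the first miss,
regardless of how deep the node sits in the list.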

> Sample log is pasted below where number in the last is phandle value.
> Line 8853: [   37.425405] OF: want to search this 262
> Line 8854: [   37.425453] OF: want to search this 262
> Line 8855: [   37.425499] OF: want to search this 262
> Line 8856: [   37.425549] OF: want to search this 15
> Line 8857: [   37.425599] OF: want to search this 5
> Line 8858: [   37.429989] OF: want to search this 253
> Line 8859: [   37.430058] OF: want to search this 253
> Line 8860: [   37.430217] OF: want to search this 253
> Line 8861: [   37.430278] OF: want to search this 253
> Line 8862: [   37.430337] OF: want to search this 253
> Line 8863: [   37.430399] OF: want to search this 254
> Line 8864: [   37.430597] OF: want to search this 254
> Line 8865: [   37.430656] OF: want to search this 254
> 
> 
> Above explains why results with cache size 64 and 128 have almost similar 
> results. Now, for cache size 256 we have degrading performance. I don't have 
> a good theory here but I'm assuming that by making large SW cache, we miss 
> the benefits of real HW cache which is typically smaller than our array size. 
> Also, in my set up, I've set max_cpu=1 to reduce the variance. That again, 
> should affect the cache holding pattern in HW and affect the perf numbers.
> 
> 
> Chintan

Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

2018-02-02 Thread Mike Galbraith

On Fri, 2018-02-02 at 13:34 -0500, Steven Sistare wrote:
> On 2/2/2018 12:39 PM, Steven Sistare wrote:
> > On 2/2/2018 12:21 PM, Peter Zijlstra wrote:
> >> On Fri, Feb 02, 2018 at 11:53:40AM -0500, Steven Sistare wrote:
> >>> It might be interesting to add a tunable for the number of random choices 
> >>> to
> >>> make, and clamp it at the max nr computed from avg_cost in 
> >>> select_idle_cpu.
> >>
> >> This needs a fairly complicated PRNG for it would need to visit each
> >> possible CPU once before looping. A LFSR does that, but requires 2^n-1
> >> elements and we have topology masks that don't match that.. The trivial
> >> example is something with 6 cores.
> > 
> > Or keep it simple and accept the possibility of choosing the same candidate
> > more than once.
> > 
> >>> Or, choose a random starting point and then search for nr sequential 
> >>> candidates; possibly limited by a tunable.
> >>
> >> And this is basically what we already do. Except with the task-cpu
> >> instead of a per-cpu rotor.
> > 
> > Righto.  Disregard this suggestion.
> 
> Actually, I take back my take back.  I suspect the primary benefit
> of random selection is that it breaks up resonance states where
> CPUs that are busy tend to stay busy, and CPUs that are idle tend
> to stay idle, which is reinforced by starting the search at target =
> last cpu ran.

I suspect the primary benefit is reduction of bouncing.  The absolutely
maddening thing about SIS is that some stuff out there (like FB's load)
doesn't give a rats ass about anything other than absolute minimum
sched latency while other stuff notices cache going missing.  Joy.

-Mike
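
For reference, the LFSR point quoted above (a period of 2^n-1 that does not
match a 6-core mask) can be sketched in userspace as follows. Skipping the
out-of-range state is an assumption added here for illustration; nobody in
the thread proposed it, and all demo_* names are hypothetical.

```c
#include <assert.h>

/* One step of a 3-bit Galois LFSR with feedback mask 0x6. Starting from
 * any non-zero state it visits all 7 non-zero states before repeating:
 * 1 -> 6 -> 3 -> 7 -> 5 -> 4 -> 2 -> 1. */
static unsigned int demo_lfsr3_next(unsigned int s)
{
	unsigned int lsb = s & 1u;

	s >>= 1;
	if (lsb)
		s ^= 0x6u;
	return s;
}

/* Write each CPU id in [0, nr_cpus) to order[] exactly once, in LFSR
 * order, for nr_cpus <= 7. States larger than nr_cpus are skipped, which
 * is how the 2^n-1 period is squeezed onto, say, 6 cores.
 * Returns the number of ids written. */
static int demo_visit_order(unsigned int seed, int nr_cpus, int *order)
{
	unsigned int s = seed;
	int i, n = 0;

	assert(seed >= 1 && seed <= 7);
	assert(nr_cpus >= 1 && nr_cpus <= 7);

	for (i = 0; i < 7; i++) {
		if (s <= (unsigned int)nr_cpus)
			order[n++] = (int)s - 1;
		s = demo_lfsr3_next(s);
	}
	return n;
}
```

Each pass still steps through all 2^n-1 states, so the skipped states are
wasted work; that overhead grows when the CPU count sits just above a power
of two.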

Re: [PATCH v2 5/6] arm64: Detect current view of GIC priorities

2018-02-02 Thread Yang Yingliang


Hi, Julien

On 2018/1/17 19:54, Julien Thierry wrote:

The values non-secure EL1 needs to use for priority registers depend on
the value of SCR_EL3.FIQ.

Since we don't have access to SCR_EL3, we fake an interrupt and compare the
GIC priority with the one present in the [re]distributor.

Also, add firmware requirements related to SCR_EL3.

Signed-off-by: Julien Thierry 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Thomas Gleixner 
Cc: Jason Cooper 
Cc: Marc Zyngier 
---
  Documentation/arm64/booting.txt |  5 +++
  arch/arm64/include/asm/arch_gicv3.h |  5 +++
  arch/arm64/include/asm/irqflags.h   |  6 +++
  arch/arm64/include/asm/sysreg.h |  1 +
  drivers/irqchip/irq-gic-v3.c| 86 +
  5 files changed, 103 insertions(+)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 8d0df62..e387938 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -188,6 +188,11 @@ Before jumping into the kernel, the following conditions 
must be met:
the kernel image will be entered must be initialised by software at a
higher exception level to prevent execution in an UNKNOWN state.

+  - SCR_EL3.FIQ must have the same value across all CPUs the kernel is
+executing on.
+  - The value of SCR_EL3.FIQ must be the same as the one present at boot
+time whenever the kernel is executing.
+
For systems with a GICv3 interrupt controller to be used in v3 mode:
- If EL3 is present:
  ICC_SRE_EL3.Enable (bit 3) must be initialiased to 0b1.
diff --git a/arch/arm64/include/asm/arch_gicv3.h 
b/arch/arm64/include/asm/arch_gicv3.h
index 490bb3a..ac7b7f6 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -124,6 +124,11 @@ static inline void gic_write_bpr1(u32 val)
write_sysreg_s(val, SYS_ICC_BPR1_EL1);
  }

+static inline u32 gic_read_rpr(void)
+{
+   return read_sysreg_s(SYS_ICC_RPR_EL1);
+}
+
  #define gic_read_typer(c) readq_relaxed(c)
  #define gic_write_irouter(v, c)   writeq_relaxed(v, c)
  #define gic_read_lpir(c)  readq_relaxed(c)
diff --git a/arch/arm64/include/asm/irqflags.h 
b/arch/arm64/include/asm/irqflags.h
index 3d5d443..d25e7ee 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -217,6 +217,12 @@ static inline int arch_irqs_disabled_flags(unsigned long 
flags)
!(ARCH_FLAGS_GET_PMR(flags) & ICC_PMR_EL1_EN_BIT);
  }

+/* Mask IRQs at CPU level instead of GIC level */
+static inline void arch_irqs_daif_disable(void)
+{
+   asm volatile ("msr daifset, #2" : : : "memory");
+}
+
  void maybe_switch_to_sysreg_gic_cpuif(void);

  #endif /* CONFIG_IRQFLAGS_GIC_MASKING */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 08cc885..46fa869 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -304,6 +304,7 @@
  #define SYS_ICC_SRE_EL1   sys_reg(3, 0, 12, 12, 5)
  #define SYS_ICC_IGRPEN0_EL1   sys_reg(3, 0, 12, 12, 6)
  #define SYS_ICC_IGRPEN1_EL1   sys_reg(3, 0, 12, 12, 7)
+#define SYS_ICC_RPR_EL1sys_reg(3, 0, 12, 11, 3)

  #define SYS_CONTEXTIDR_EL1sys_reg(3, 0, 13, 0, 1)
  #define SYS_TPIDR_EL1 sys_reg(3, 0, 13, 0, 4)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index df51d96..58b5e89 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -63,6 +63,10 @@ struct gic_chip_data {
  static struct gic_chip_data gic_data __read_mostly;
  static struct static_key supports_deactivate = STATIC_KEY_INIT_TRUE;

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+DEFINE_STATIC_KEY_FALSE(have_non_secure_prio_view);
+#endif
+
  static struct gic_kvm_info gic_v3_kvm_info;
  static DEFINE_PER_CPU(bool, has_rss);

@@ -997,6 +1001,84 @@ static int partition_domain_translate(struct irq_domain 
*d,
.select = gic_irq_domain_select,
  };

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * The behaviours of RPR and PMR registers differ depending on the value of
+ * SCR_EL3.FIQ, while the behaviour of priority registers of the distributor
+ * and redistributors is always the same.
+ *
+ * If SCR_EL3.FIQ == 1, the values used for RPR and PMR are the same as the 
ones
+ * programmed in the distributor and redistributors registers.
+ *
+ * Otherwise, the value presented by RPR as well as the value which will be
+ * compared against PMR is: (GIC_(R)DIST_PRI[irq] >> 1) | 0x80;
+ *
+ * see GICv3/GICv4 Architecture Specification (IHI0069D):
+ * - section 4.8.1 Non-secure accesses to register fields for Secure interrupt
+ *   priorities.
+ * - Figure 4-7 Secure read of the priority field for a Non-secure Group 1
+ *   interrupt.
+ */

I think we can use write/read PMR to check if SCR_EL3.FIQ == 1.
Like this:

gic_write_pmr(0xf0);
if (gic_read_pmr() == 0xf0)// if SCR_EL3.FIQ ==
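
The priority mapping described in the patch comment above can be written out
as a small userspace sketch. It only restates the formula from that comment,
(GIC_(R)DIST_PRI[irq] >> 1) | 0x80; the function name and its arguments are
hypothetical, and nothing here touches real GIC registers.

```c
#include <assert.h>
#include <stdint.h>

/* Value that RPR presents (and that is compared against PMR) for a
 * priority programmed in the distributor/redistributor, following the
 * comment in the patch:
 *   SCR_EL3.FIQ == 1: same value as programmed
 *   otherwise:        (programmed >> 1) | 0x80
 * scr_el3_fiq is a plain flag here; the kernel cannot read SCR_EL3. */
static uint8_t demo_gic_cpu_view(uint8_t dist_prio, int scr_el3_fiq)
{
	if (scr_el3_fiq)
		return dist_prio;
	return (uint8_t)((dist_prio >> 1) | 0x80);
}
```

For example, a distributor priority of 0xa0 is seen as 0xa0 when
SCR_EL3.FIQ == 1 and as 0xd0 otherwise, which is the difference the
detection code looks for.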

[PATCH 2/2] f2fs: add GC_WRITTEN_PAGE to gc atomic file

2018-02-02 Thread Yunlong Song

This patch enables GC of atomic files by adding GC_WRITTEN_PAGE to
identify the gced pages of an atomic file. The marker lets set_page_dirty
skip register_inmem_page, so the gced pages do not mix with the inmem
pages.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/data.c|  7 ++-
 fs/f2fs/gc.c  | 25 ++---
 fs/f2fs/segment.h |  3 +++
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index edafcb6..5e1fc5d 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -120,6 +120,10 @@ static void f2fs_write_end_io(struct bio *bio)
 
dec_page_count(sbi, type);
clear_cold_data(page);
+   if (IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, 0);
+   ClearPagePrivate(page);
+   }
end_page_writeback(page);
}
if (!get_pages(sbi, F2FS_WB_CP_DATA) &&
@@ -2418,7 +2422,8 @@ static int f2fs_set_data_page_dirty(struct page *page)
if (!PageUptodate(page))
SetPageUptodate(page);
 
-   if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)) {
+   if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)
+   && !IS_GC_WRITTEN_PAGE(page)) {
if (!IS_ATOMIC_WRITTEN_PAGE(page)) {
register_inmem_page(inode, page);
return 1;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 84ab3ff..9d54ddb 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,10 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode))
-   goto out;
-
if (f2fs_is_pinned_file(inode)) {
f2fs_pin_file_control(inode, true);
goto out;
@@ -680,6 +676,12 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
goto put_page_out;
}
 
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(fio.encrypted_page)) {
+   set_page_private(fio.encrypted_page, (unsigned 
long)GC_WRITTEN_PAGE);
+   SetPagePrivate(fio.encrypted_page);
+   }
set_page_dirty(fio.encrypted_page);
f2fs_wait_on_page_writeback(fio.encrypted_page, DATA, true);
if (clear_page_dirty_for_io(fio.encrypted_page))
@@ -730,9 +732,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode))
-   goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
f2fs_pin_file_control(inode, true);
@@ -742,6 +741,12 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (gc_type == BG_GC) {
if (PageWriteback(page))
goto out;
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(page);
+   }
set_page_dirty(page);
set_cold_data(page);
} else {
@@ -762,6 +767,12 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
int err;
 
 retry:
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(page);
+   }
set_page_dirty(page);
f2fs_wait_on_page_writeback(page, DATA, true);
if (clear_page_dirty_for_io(page)) {
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index f11c4bc..f0a6432 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -203,11 +203,14 @@ struct segment_allocation {
  */
 #define ATOMIC_WRITTEN_PAGE((unsigned long)-1)
 #define DUMMY_WRITTEN_PAGE ((unsigned long)-2)
+#define GC_WRITTEN_PAGE((unsigned long)-3)
 
 #define IS_ATOMIC_WRITTEN_PAGE(page)   \
(page_private(page) == (unsigned long)ATOMIC_WRITTEN_PAGE)
 #define IS_DUMMY_WRITTEN_PAGE(page)\
(page_private(page) == (unsigned long)DUMMY_WRITTEN_PAGE)
+#define IS_GC_WRITTEN_PAGE(page)   \
+   (page_private(page) == (unsigned long)GC_WRITTEN_PAGE)
 
 struct inm

[PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-02 Thread Yunlong Song

If the inode has already started an atomic commit, set_page_dirty will
not mix the gc pages with the inmem atomic pages, so the page can be
gced safely.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/data.c | 5 ++---
 fs/f2fs/gc.c   | 6 --
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
 }
 
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode))
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
 
if (f2fs_is_pinned_file(inode)) {
@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode))
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
-- 
1.8.5.2
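
The interaction between the two f2fs patches above can be modelled in
userspace as follows. This is a simplified sketch under stated assumptions:
struct demo_page, struct demo_inode and the demo_* helpers are hypothetical
stand-ins for struct page, the inode flags and the f2fs functions, and only
the sentinel values are taken from the segment.h hunk.

```c
#include <assert.h>

/* Sentinel values stored in the page's private field, mirroring the
 * definitions in the segment.h hunk. */
#define DEMO_ATOMIC_WRITTEN_PAGE ((unsigned long)-1)
#define DEMO_DUMMY_WRITTEN_PAGE  ((unsigned long)-2)
#define DEMO_GC_WRITTEN_PAGE     ((unsigned long)-3)

struct demo_page {
	unsigned long private;      /* models page_private(page) */
	int inmem_registered;       /* models register_inmem_page() having run */
};

struct demo_inode {
	int atomic_file;            /* models f2fs_is_atomic_file() */
	int commit_atomic_write;    /* models f2fs_is_commit_atomic_write() */
};

/* GC side: tag the page before dirtying it when the file is atomic and
 * the commit has not started. */
static void demo_gc_tag(const struct demo_inode *inode, struct demo_page *page)
{
	if (inode->atomic_file && !inode->commit_atomic_write &&
	    page->private != DEMO_GC_WRITTEN_PAGE)
		page->private = DEMO_GC_WRITTEN_PAGE;
}

/* Dirty path: returns 1 when the page was registered as an in-memory
 * atomic page. A GC-tagged page skips that registration. */
static int demo_set_page_dirty(const struct demo_inode *inode,
			       struct demo_page *page)
{
	if (inode->atomic_file && !inode->commit_atomic_write &&
	    page->private != DEMO_GC_WRITTEN_PAGE) {
		if (page->private != DEMO_ATOMIC_WRITTEN_PAGE) {
			page->private = DEMO_ATOMIC_WRITTEN_PAGE;
			page->inmem_registered = 1;
			return 1;
		}
	}
	return 0;
}

/* Write completion: drop the GC tag so the page is ordinary again. */
static void demo_write_end(struct demo_page *page)
{
	if (page->private == DEMO_GC_WRITTEN_PAGE)
		page->private = 0;
}
```

In this model a page dirtied by a user write on an uncommitted atomic file
is registered as an inmem page, while the same call on a GC-tagged page
returns 0 and leaves the inmem list alone.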

Re: [PATCH v4] Fix loading of module radeonfb on PowerMac

2018-02-02 Thread kbuild test robot

Hi Mathieu,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.15 next-20180202]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Mathieu-Malaterre/Fix-loading-of-module-radeonfb-on-PowerMac/20180203-085907
config: x86_64-randconfig-x009-201804 (attached as .config)
compiler: gcc-7 (Debian 7.2.0-12) 7.2.1 20171025
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   In file included from drivers/video/fbdev/aty/radeon_base.c:91:0:
>> drivers/video/fbdev/aty/../edid.h:21:0: warning: "EDID_LENGTH" redefined
#define EDID_LENGTH0x80

   In file included from include/drm/drm_crtc.h:44:0,
from include/drm/drm_fb_helper.h:35,
from drivers/video/fbdev/aty/radeon_base.c:73:
   include/drm/drm_edid.h:32:0: note: this is the location of the previous 
definition
#define EDID_LENGTH 128

   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:fls64
   Cyclomatic Complexity 1 include/linux/log2.h:__ilog2_u64
   Cyclomatic Complexity 1 include/asm-generic/getorder.h:__get_order
   Cyclomatic Complexity 1 include/linux/string.h:strnlen
   Cyclomatic Complexity 4 include/linux/string.h:strlen
   Cyclomatic Complexity 6 include/linux/string.h:strlcpy
   Cyclomatic Complexity 4 include/linux/string.h:memcpy
   Cyclomatic Complexity 1 
arch/x86/include/asm/paravirt.h:arch_local_irq_disable
   Cyclomatic Complexity 1 arch/x86/include/asm/paravirt.h:arch_local_irq_enable
   Cyclomatic Complexity 1 include/linux/spinlock.h:spinlock_check
   Cyclomatic Complexity 1 include/linux/spinlock.h:spin_unlock_irqrestore
   Cyclomatic Complexity 1 include/linux/jiffies.h:_msecs_to_jiffies
   Cyclomatic Complexity 3 include/linux/jiffies.h:msecs_to_jiffies
   Cyclomatic Complexity 1 arch/x86/include/asm/io.h:readb
   Cyclomatic Complexity 1 arch/x86/include/asm/io.h:readw
   Cyclomatic Complexity 1 arch/x86/include/asm/io.h:readl
   Cyclomatic Complexity 1 arch/x86/include/asm/io.h:writeb
   Cyclomatic Complexity 1 arch/x86/include/asm/io.h:writel
   Cyclomatic Complexity 1 arch/x86/include/asm/io.h:ioremap
   Cyclomatic Complexity 1 include/linux/kobject.h:kobject_name
   Cyclomatic Complexity 2 include/linux/device.h:dev_name
   Cyclomatic Complexity 1 include/linux/device.h:dev_get_drvdata
   Cyclomatic Complexity 1 include/linux/device.h:dev_set_drvdata
   Cyclomatic Complexity 1 include/linux/io.h:arch_phys_wc_add
   Cyclomatic Complexity 1 include/linux/io.h:arch_phys_wc_del
   Cyclomatic Complexity 68 include/linux/slab.h:kmalloc_large
   Cyclomatic Complexity 3 include/linux/slab.h:kmalloc
   Cyclomatic Complexity 1 include/linux/slab.h:kzalloc
   Cyclomatic Complexity 1 include/linux/pci.h:pci_get_drvdata
   Cyclomatic Complexity 1 include/linux/pci.h:pci_set_drvdata
   Cyclomatic Complexity 1 include/linux/pci.h:pci_name
   Cyclomatic Complexity 2 include/linux/fb.h:alloc_apertures
   Cyclomatic Complexity 2 
drivers/video/fbdev/aty/radeonfb.h:radeon_pll_errata_after_index
   Cyclomatic Complexity 2 
drivers/video/fbdev/aty/radeonfb.h:radeon_pll_errata_after_data
   Cyclomatic Complexity 1 drivers/video/fbdev/aty/radeonfb.h:round_div
   Cyclomatic Complexity 3 drivers/video/fbdev/aty/radeonfb.h:var_to_depth
   Cyclomatic Complexity 5 drivers/video/fbdev/aty/radeonfb.h:radeon_get_dstbpp
   Cyclomatic Complexity 1 drivers/video/fbdev/aty/radeonfb.h:radeonfb_bl_init
   Cyclomatic Complexity 1 drivers/video/fbdev/aty/radeonfb.h:radeonfb_bl_exit
   Cyclomatic Complexity 1 
include/drm/drm_fb_helper.h:drm_fb_helper_remove_conflicting_framebuffers
   Cyclomatic Complexity 21 
drivers/video/fbdev/aty/radeon_base.c:radeon_calc_pll_regs
   Cyclomatic Complexity 1 drivers/video/fbdev/aty/radeon_base.c:radeonfb_exit
   Cyclomatic Complexity 6 
drivers/video/fbdev/aty/radeon_base.c:radeon_find_mem_vbios
   Cyclomatic Complexity 4 
drivers/video/fbdev/aty/radeon_base.c:radeon_kick_out_firmware_fb
   Cyclomatic Complexity 5 
drivers/video/fbdev/aty/radeon_base.c:radeonfb_pci_unregister
   Cyclomatic Complexity 1 
drivers/video/fbdev/aty/radeon_base.c:radeon_show_one_edid
   Cyclomatic Complexity 3 
drivers/video/fbdev/aty/radeon_base.c:radeon_show_edid2
   Cyclomatic Complexity 3 
drivers/video/fbdev/aty/radeon_base.c:radeon_show_edid1
   Cyclomatic Complexity 2 
drivers/video/fbdev/aty/radeon_base.c:radeon_set_fbinfo
   Cyclomatic Complexity 18 
drivers/video/fbdev/aty/radeon_base.c:radeonfb_check_var
   Cyclomatic Complexity 2 
drivers/video/fbdev/aty/radeon_base.c:radeon_unmap_ROM
   Cyclomatic Complexity 7 drivers/video/fbdev/aty/radeon_base.c:radeon_map_ROM
   Cyclomatic Complexity 16 drivers/video/fbdev/aty/radeon_base.c:radeonfb_setup
   Cyclomatic Complexity 2 drivers/video/fbdev/aty/radeo

Re: bisected bd4c82c22c367e is the first bad commit (was [Bug 198617] New: zswap causing random applications to crash)

2018-02-02 Thread Sergey Senozhatsky

On (02/03/18 10:34), Sergey Senozhatsky wrote:
> so we are basically looking at 4.14-rc0+
[..]
> # first bad commit: [bd4c82c22c367e068acb1ec9ec02be2fac3e09e2] mm, THP, swap: 
> delay splitting THP after swapped out

To re-confirm, disabling CONFIG_TRANSPARENT_HUGEPAGE fixes my 4.15.0-next

-ss

[PATCH v2] x86/perf : Add check for CPUID instruction before using

2018-02-02 Thread Matthew Whitehead

We still officially support the ancient i486 cpu. First generation
versions of this processor do not have the CPUID instruction, though
later versions do. Therefore you must check that the cpu supports
it before using it. At present it fails with an "Illegal Instruction"
signal on the early processors.

v1: cpuid detection code based on GCC gcc/config/i386/cpuid.h

https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=gcc/config/i386/cpuid.h;hb=HEAD
v2: cpuid detection code based on Linux kernel arch/x86/kernel/cpu/common.c

Signed-off-by: Matthew Whitehead 
---
 tools/perf/arch/x86/util/header.c | 54 +++
 tools/perf/util/header.h  |  2 ++
 2 files changed, 56 insertions(+)

diff --git a/tools/perf/arch/x86/util/header.c 
b/tools/perf/arch/x86/util/header.c
index fb0d71a..d2508b3 100644
--- a/tools/perf/arch/x86/util/header.c
+++ b/tools/perf/arch/x86/util/header.c
@@ -7,6 +7,57 @@
 
 #include "../../util/header.h"
 
+#ifndef __x86_64__
+
+/* This code based on arch/x86/kernel/cpu/common.c
+ * Standard macro to see if a specific flag is changeable.
+ */
+static inline int flag_is_changeable_p(u32 flag)
+{
+   u32 f1, f2;
+
+   /*
+* Cyrix and IDT cpus allow disabling of CPUID
+* so the code below may return different results
+* when it is executed before and after enabling
+* the CPUID. Add "volatile" to not allow gcc to
+* optimize the subsequent calls to this function.
+*/
+   asm volatile ("pushfl   \n\t"
+ "pushfl   \n\t"
+ "popl %0  \n\t"
+ "movl %0, %1  \n\t"
+ "xorl %2, %0  \n\t"
+ "pushl %0 \n\t"
+ "popfl\n\t"
+ "pushfl   \n\t"
+ "popl %0  \n\t"
+ "popfl\n\t"
+
+ : "=&r" (f1), "=&r" (f2)
+ : "ir" (flag));
+
+   return ((f1^f2) & flag) != 0;
+}
+
+#define X86_EFLAGS_ID 0x00200000
+
+/* Probe for the CPUID instruction */
+int have_cpuid_p(void)
+{
+   return flag_is_changeable_p(X86_EFLAGS_ID);
+}
+
+#else  /* __x86_64__ */
+
+/* All x86_64 CPUs have the CPUID instruction */
+int have_cpuid_p(void)
+{
+   return 1;
+}
+
+#endif /* __x86_64__ */
+
 static inline void
 cpuid(unsigned int op, unsigned int *a, unsigned int *b, unsigned int *c,
   unsigned int *d)
@@ -28,6 +79,9 @@
int nb;
char vendor[16];
 
+   if (!have_cpuid_p())
+   return -1;
+
cpuid(0, &lvl, &b, &c, &d);
strncpy(&vendor[0], (char *)(&b), 4);
strncpy(&vendor[4], (char *)(&d), 4);
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index f28..f4de656 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -171,6 +171,8 @@ int write_padded(struct feat_fd *fd, const void *bf,
 /*
  * arch specific callback
  */
+int have_cpuid_p(void);
+
 int get_cpuid(char *buffer, size_t sz);
 
 char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused);
-- 
1.8.3.1
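
For comparison with the v1 approach mentioned in the changelog, here is a minimal userspace sketch, assuming a GCC or clang toolchain that ships <cpuid.h>. read_cpu_vendor() is a hypothetical helper and not part of perf. __get_cpuid() reports failure when the requested leaf is unsupported, so the caller never executes CPUID on a processor that lacks it. On non-x86 builds the sketch simply reports that no vendor string is available.

```c
#include <assert.h>
#include <string.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/*
 * Hypothetical helper: copy the 12-byte CPUID vendor string into
 * vendor[] and NUL-terminate it. Returns 0 on success, -1 when the
 * vendor string cannot be read (no CPUID, or not an x86 build).
 */
static int read_cpu_vendor(char vendor[13])
{
	assert(vendor != NULL);
#if defined(__i386__) || defined(__x86_64__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 if leaf 0 cannot be queried. */
	if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
		return -1;

	/* The vendor string is laid out in EBX, EDX, ECX order. */
	memcpy(&vendor[0], &ebx, 4);
	memcpy(&vendor[4], &edx, 4);
	memcpy(&vendor[8], &ecx, 4);
	vendor[12] = '\0';
	return 0;
#else
	vendor[0] = '\0';
	return -1;
#endif
}
```

A caller would treat -1 the same way the patch treats !have_cpuid_p(): skip the vendor lookup instead of faulting.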

Re: [PATCH] locking/qspinlock: Ensure node is initialised before updating prev->next

2018-02-02 Thread kbuild test robot

Hi Will,

I love your patch! Yet something to improve:

[auto build test ERROR on v4.15]
[cannot apply to tip/locking/core tip/core/locking tip/auto-latest 
next-20180202]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Will-Deacon/locking-qspinlock-Ensure-node-is-initialised-before-updating-prev-next/20180203-095222
config: x86_64-randconfig-x017-201804 (attached as .config)
compiler: gcc-7 (Debian 7.2.0-12) 7.2.1 20171025
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:10:0,
from include/linux/list.h:9,
from include/linux/smp.h:12,
from kernel/locking/qspinlock.c:25:
   kernel/locking/qspinlock.c: In function 'queued_spin_lock_slowpath':
>> include/linux/compiler.h:264:8: error: conversion to non-scalar type 
>> requested
 union { typeof(x) __val; char __c[1]; } __u = \
   ^
>> arch/x86/include/asm/barrier.h:71:2: note: in expansion of macro 'WRITE_ONCE'
 WRITE_ONCE(*p, v);  \
 ^~
   include/asm-generic/barrier.h:157:33: note: in expansion of macro 
'__smp_store_release'
#define smp_store_release(p, v) __smp_store_release(p, v)
^~~
>> kernel/locking/qspinlock.c:419:3: note: in expansion of macro 
>> 'smp_store_release'
  smp_store_release(prev->next, node);
  ^
--
   In file included from include/linux/kernel.h:10:0,
from include/linux/list.h:9,
from include/linux/smp.h:12,
from kernel//locking/qspinlock.c:25:
   kernel//locking/qspinlock.c: In function 'queued_spin_lock_slowpath':
>> include/linux/compiler.h:264:8: error: conversion to non-scalar type 
>> requested
 union { typeof(x) __val; char __c[1]; } __u = \
   ^
>> arch/x86/include/asm/barrier.h:71:2: note: in expansion of macro 'WRITE_ONCE'
 WRITE_ONCE(*p, v);  \
 ^~
   include/asm-generic/barrier.h:157:33: note: in expansion of macro 
'__smp_store_release'
#define smp_store_release(p, v) __smp_store_release(p, v)
^~~
   kernel//locking/qspinlock.c:419:3: note: in expansion of macro 
'smp_store_release'
  smp_store_release(prev->next, node);
  ^

vim +/WRITE_ONCE +71 arch/x86/include/asm/barrier.h

47933ad4 Peter Zijlstra     2013-11-06  66
1638fb72 Michael S. Tsirkin 2015-12-27  67  #define __smp_store_release(p, v)           \
47933ad4 Peter Zijlstra     2013-11-06  68  do {                                        \
47933ad4 Peter Zijlstra     2013-11-06  69      compiletime_assert_atomic_type(*p);     \
47933ad4 Peter Zijlstra     2013-11-06  70      barrier();                              \
76695af2 Andrey Konovalov   2015-08-02 @71      WRITE_ONCE(*p, v);                      \
47933ad4 Peter Zijlstra     2013-11-06  72  } while (0)
47933ad4 Peter Zijlstra     2013-11-06  73

:: The code at line 71 was first introduced by commit
:: 76695af20c015206cffb84b15912be6797d0cca2 locking, arch: use 
WRITE_ONCE()/READ_ONCE() in smp_store_release()/smp_load_acquire()

:: TO: Andrey Konovalov 
:: CC: Ingo Molnar 

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip
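
A note on the diagnostic above: __smp_store_release(p, v) expands to WRITE_ONCE(*p, v), so its first argument has to be a pointer to the location being written. The call at qspinlock.c:419 passes prev->next itself, which would make *p a whole structure and would explain "conversion to non-scalar type requested". The sketch below shows the same publish step in userspace with C11 atomics rather than the kernel macros. struct mcs_node, mcs_link() and mcs_next() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical queue node, loosely modelled on an MCS lock node. */
struct mcs_node {
	_Atomic(struct mcs_node *) next;
	int locked;
};

/*
 * Initialise node, then publish it as prev's successor. The release
 * store takes the ADDRESS of prev->next, so everything written to
 * *node before it is visible to a thread that acquire-loads the link.
 */
static void mcs_link(struct mcs_node *prev, struct mcs_node *node)
{
	assert(prev != NULL && node != NULL);

	node->locked = 0;
	atomic_store_explicit(&node->next, NULL, memory_order_relaxed);
	atomic_store_explicit(&prev->next, node, memory_order_release);
}

/* Read the successor with acquire ordering (pairs with mcs_link()). */
static struct mcs_node *mcs_next(struct mcs_node *prev)
{
	assert(prev != NULL);
	return atomic_load_explicit(&prev->next, memory_order_acquire);
}
```

The kernel-side equivalent of the store in mcs_link() would pass &prev->next to the release macro; whether that matches the author's intended fix is an assumption here.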

Re: [PATCH] locking/qspinlock: Ensure node is initialised before updating prev->next

2018-02-02 Thread kbuild test robot

Hi Will,

I love your patch! Perhaps something to improve:

[auto build test WARNING on v4.15]
[cannot apply to tip/locking/core tip/core/locking tip/auto-latest 
next-20180202]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Will-Deacon/locking-qspinlock-Ensure-node-is-initialised-before-updating-prev-next/20180203-095222
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=sparc64 

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:10:0,
from include/linux/list.h:9,
from include/linux/smp.h:12,
from kernel/locking/qspinlock.c:25:
   kernel/locking/qspinlock.c: In function 'queued_spin_lock_slowpath':
   include/linux/compiler.h:264:8: error: conversion to non-scalar type 
requested
 union { typeof(x) __val; char __c[1]; } __u = \
   ^
>> arch/sparc/include/asm/barrier_64.h:45:2: note: in expansion of macro 
>> 'WRITE_ONCE'
 WRITE_ONCE(*p, v);  \
 ^~
   include/asm-generic/barrier.h:157:33: note: in expansion of macro 
'__smp_store_release'
#define smp_store_release(p, v) __smp_store_release(p, v)
^~~
   kernel/locking/qspinlock.c:419:3: note: in expansion of macro 
'smp_store_release'
  smp_store_release(prev->next, node);
  ^
--
   In file included from include/linux/kernel.h:10:0,
from include/linux/list.h:9,
from include/linux/smp.h:12,
from kernel//locking/qspinlock.c:25:
   kernel//locking/qspinlock.c: In function 'queued_spin_lock_slowpath':
   include/linux/compiler.h:264:8: error: conversion to non-scalar type 
requested
 union { typeof(x) __val; char __c[1]; } __u = \
   ^
>> arch/sparc/include/asm/barrier_64.h:45:2: note: in expansion of macro 
>> 'WRITE_ONCE'
 WRITE_ONCE(*p, v);  \
 ^~
   include/asm-generic/barrier.h:157:33: note: in expansion of macro 
'__smp_store_release'
#define smp_store_release(p, v) __smp_store_release(p, v)
^~~
   kernel//locking/qspinlock.c:419:3: note: in expansion of macro 
'smp_store_release'
  smp_store_release(prev->next, node);
  ^

vim +/WRITE_ONCE +45 arch/sparc/include/asm/barrier_64.h

d550bbd4 David Howells      2012-03-28  40
45d9b859 Michael S. Tsirkin 2015-12-27  41  #define __smp_store_release(p, v)           \
47933ad4 Peter Zijlstra     2013-11-06  42  do {                                        \
47933ad4 Peter Zijlstra     2013-11-06  43      compiletime_assert_atomic_type(*p);     \
47933ad4 Peter Zijlstra     2013-11-06  44      barrier();                              \
76695af2 Andrey Konovalov   2015-08-02 @45      WRITE_ONCE(*p, v);                      \
47933ad4 Peter Zijlstra     2013-11-06  46  } while (0)
47933ad4 Peter Zijlstra     2013-11-06  47

:: The code at line 45 was first introduced by commit
:: 76695af20c015206cffb84b15912be6797d0cca2 locking, arch: use 
WRITE_ONCE()/READ_ONCE() in smp_store_release()/smp_load_acquire()

:: TO: Andrey Konovalov 
:: CC: Ingo Molnar 

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip

linux-next: Signed-off-by missing for commits in the s390 tree

2018-02-02 Thread Stephen Rothwell

Hi all,

Commits

  a39892ed47bf ("s390/runtime_instrumentation: re-add signum system call 
parameter")
  279d2cea3aad ("s390/cio: fix kernel-doc usage")

are missing a Signed-off-by from their committer.

-- 
Cheers,
Stephen Rothwell

Re: [PATCH bpf-next v8 0/5] libbpf: add XDP binding support

2018-02-02 Thread Alexei Starovoitov

On Wed, Jan 31, 2018 at 05:53:13PM +0100, Daniel Borkmann wrote:
> On 01/30/2018 09:50 PM, Eric Leblond wrote:
> > Hello Daniel,
> > 
> > No problem with the delay in the answer. I'm doing far worse.
> > 
> > Here is an updated version:
> > - add if_link.h in uapi and remove the definition
> > - fix a commit message
> > - remove uapi from a include
> 
> Fyi, this still needs to wait for a bit in the queue due to current
> merge window where bpf-next is closed during that time [0]. Thanks!
> 
>   [0] https://www.spinics.net/lists/netdev/msg481490.html

I've tested it and applied to bpf tree considering that
the series were practically ready long before bpf-next was closed.
Thank you Eric.
perf build was also fine, but please watch out for any unexpected
breakages, since perf has to be built on a variety of distros.

Re: RFC(V3): Audit Kernel Container IDs

2018-02-02 Thread Serge E. Hallyn

On Fri, Feb 02, 2018 at 05:05:22PM -0500, Paul Moore wrote:
> On Tue, Jan 9, 2018 at 7:16 AM, Richard Guy Briggs  wrote:
> > Containers are a userspace concept.  The kernel knows nothing of them.
> >
> > The Linux audit system needs a way to be able to track the container
> > provenance of events and actions.  Audit needs the kernel's help to do
> > this.
> 
> Two small comments below, but I tend to think we are at a point where
> you can start cobbling together some prototype/RFC patches.  Surely

Agreed.

LGTM.

> there are going to be a few changes, and new comments, that come out
> once we see an initial implementation so let's see what those are.

thanks,
-serge

bisected bd4c82c22c367e is the first bad commit (was [Bug 198617] New: zswap causing random applications to crash)

2018-02-02 Thread Sergey Senozhatsky

Hello,

On (01/30/18 11:48), Andrew Morton wrote:
> Subject: [Bug 198617] New: zswap causing random applications to crash
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=198617
> 
> Bug ID: 198617
>Summary: zswap causing random applications to crash
>Product: Memory Management
>Version: 2.5
> Kernel Version: 4.14.15
>   Hardware: x86-64
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: Page Allocator
>   Assignee: a...@linux-foundation.org
>   Reporter: kernel_...@dlk.pl
> Regression: No
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=104709
> https://bugs.kde.org/show_bug.cgi?id=389542
> 
> I did have zswap enabled for a long while, and a lot of wine games,
> plasmashell, xorg, kwin_x11 (and other) did crash randomly when reached 100% of
> physical ram and swap was like almost never used.
> 
> I could easily open a lot of browser tabs and the browser or xorg would fail
> every time.
> 
> After disabling zswap no crashes at all.
> 
> /etc/systemd/swap.conf
> zswap_enabled=1
> zswap_compressor=lz4  # lzo lz4
> zswap_max_pool_percent=25 # 1-99
> zswap_zpool=zbud  # zbud z3fold


So I did a number of tests and I confirm that under memory pressure
with frontswap enabled I do see segfaults and memory corruptions in
random user space applications.

kernel: urxvt[338]: segfault at 20 ip 7fc08889ae0d sp 7ffc73a7fc40 
error 6 in libc-2.26.so[7fc08881a000+1ae000]
 #0  0x7fc08889ae0d _int_malloc (libc.so.6)
 #1  0x7fc08889c2f3 malloc (libc.so.6)
 #2  0x560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
 #3  0x560e6005e75c n/a (urxvt)
 #4  0x560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez 
(urxvt)
 #5  0x560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
 #6  0x560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
 #7  0x560e6005c10f _Z17ev_invoke_pendingv (urxvt)
 #8  0x560e6005cb55 ev_run (urxvt)
 #9  0x560e6003b9b9 main (urxvt)
 #10 0x7fc08883af4a __libc_start_main (libc.so.6)
 #11 0x560e6003f9da _start (urxvt)

kernel: urxvt[343]: segfault at 10 ip 7fa56bd7d52b sp 7ffc09783a40 
error 4 in libc-2.26.so[7fa56bcfd000+1ae000]
 #0  0x7fa56bd7d52b _int_malloc (libc.so.6)
 #1  0x7fa56bd7f2f3 malloc (libc.so.6)
 #2  0x7fa56b3d6097 n/a (libxcb.so.1)
 #3  0x7fa56b3d64d8 n/a (libxcb.so.1)
 #4  0x7fa56c921b79 n/a (libX11.so.6)
 #5  0x7fa56c921ceb n/a (libX11.so.6)
 #6  0x7fa56c921fdd _XEventsQueued (libX11.so.6)
 #7  0x7fa56c913c49 XEventsQueued (libX11.so.6)
 #8  0x55b35cfc3262 _ZN12rxvt_display8flush_cbERN2ev7prepareEi (urxvt)
 #9  0x55b35cfc910f _Z17ev_invoke_pendingv (urxvt)
 #10 0x55b35cfc9c02 ev_run (urxvt)
 #11 0x55b35cfa89b9 main (urxvt)
 #12 0x7fa56bd1df4a __libc_start_main (libc.so.6)
 #13 0x55b35cfac9da _start (urxvt)

 Stack trace of thread 351:
 #0  0x7f5baaee7860 raise (libc.so.6)
 #1  0x7f5baaee8ec9 abort (libc.so.6)
 #2  0x7f5baaf30849 __malloc_assert (libc.so.6)
 #3  0x7f5baaf34011 _int_malloc (libc.so.6)
 #4  0x7f5baaf352f3 malloc (libc.so.6)
 #5  0x7f5baaf71cad __alloc_dir (libc.so.6)
 #6  0x7f5baaf71dbd opendir_tail (libc.so.6)
 #7  0x7f5bab5bbac4 Perl_pp_open_dir (libperl.so)
 #8  0x7f5bab55fec6 Perl_runops_standard (libperl.so)
 #9  0x7f5bab4d9390 Perl_call_sv (libperl.so)
 #10 0x5611f097e190 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez 
(urxvt)
 #11 0x5611f0947acb _ZN9rxvt_term14init_resourcesEiPKPKc (urxvt)
 #12 0x5611f0948da8 _ZN9rxvt_term5init2EiPKPKc (urxvt)
 #13 0x5611f097a0af n/a (urxvt)
 #14 0x7f5bab568259 Perl_pp_entersub (libperl.so)
 #15 0x7f5bab55fec6 Perl_runops_standard (libperl.so)
 #16 0x7f5bab4d9390 Perl_call_sv (libperl.so)
 #17 0x5611f097e190 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez 
(urxvt)
 #18 0x5611f0939a77 _ZN9rxvt_term9key_pressER9XKeyEvent (urxvt)
 #19 0x5611f093d77a _ZN9rxvt_term4x_cbER7_XEvent (urxvt)
 #20 0x5611f09572e8 _ZN12rxvt_display8flush_cbERN2ev7prepareEi (urxvt)
 #21 0x5611f095d10f _Z17ev_invoke_pendingv (urxvt)
 #22 0x5611f095dc02 ev_run (urxvt)
 #23 0x5611f093c9b9 main (urxvt)
 #24 0x7f5baaed3f4a __libc_start_main (libc.so.6)
 #25 0x5611f09409da _start (urxvt)


and so on.


However, the problem is not specific to 4.14.15 or 4.14.11.

I managed to track it down to the 4.14 merge window, so we are basically
looking at 4.14-rc0+

The bisect log looks as follows:

git bisect start
# bad: [2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e] Linux 4.14-rc1
git bisect bad 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e
# good: [569dbb88e80deb68974ef6fdd6a13edb9d686261] Linux 4.13
git bisect good 569dbb88e80deb68974ef6fdd6a13edb9d686261
# good: [aae3dbb4776e7916b6cd442d00159bea27a695c1] Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/dav

[PATCH] pvcalls-back: do not return error on inet_accept EAGAIN

2018-02-02 Thread Stefano Stabellini

When the client sends a regular blocking accept request, the backend is
expected to return only when the accept is completed, simulating a
blocking behavior, or return an error.

Specifically, on EAGAIN from inet_accept, the backend shouldn't return
"EAGAIN" to the client. Instead, it should simply continue the wait.
Otherwise, the client will send another accept request, which will cause
another EAGAIN to be sent back, which is a waste of resources and not
conforming to the expected behavior. Change the behavior by turning the
"goto error" into a return.

Signed-off-by: Stefano Stabellini 

diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
index c7822d8..156e5ae 100644
--- a/drivers/xen/pvcalls-back.c
+++ b/drivers/xen/pvcalls-back.c
@@ -548,7 +548,7 @@ static void __pvcalls_back_accept(struct work_struct *work)
ret = inet_accept(mappass->sock, sock, O_NONBLOCK, true);
if (ret == -EAGAIN) {
sock_release(sock);
-   goto out_error;
+   return;
}
 
map = pvcalls_new_active_socket(fedata,
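
The behavior described in the commit message can be modelled in a few lines of plain C. This is only a sketch under stated assumptions: struct accept_req and accept_work() are hypothetical stand-ins, not the pvcalls data structures, and accept_ret plays the role of the value returned by the non-blocking accept call in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one pending blocking-accept request. */
struct accept_req {
	bool responded;	/* true once a response went back to the client */
	int result;	/* value carried by that response */
};

/*
 * One pass of the (modelled) accept work function.
 * On -EAGAIN nothing is reported: the request stays pending and a
 * later pass retries. Any other value, success or a real error, is
 * sent back to the client.
 */
static void accept_work(struct accept_req *req, int accept_ret)
{
	assert(req != NULL);

	if (accept_ret == -EAGAIN)
		return;

	req->result = accept_ret;
	req->responded = true;
}
```

From the client's point of view the request simply blocks until a connection or a genuine error arrives, which is the contract the commit message describes.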

Re: [PATCH 1/2] Documentation/memory-barriers.txt: cross-reference "tools/memory-model/"

2018-02-02 Thread Paul E. McKenney

On Fri, Feb 02, 2018 at 10:12:48AM +0100, Andrea Parri wrote:
> Recent efforts led to the specification of a memory consistency model
> for the Linux kernel [1], which "can (roughly speaking) be thought of
> as an automated version of memory-barriers.txt" and which is (in turn)
> "accompanied by extensive documentation on its use and its design".
> 
> Make sure that the (occasional) reader of memory-barriers.txt will be
> aware of these developments.
> 
> [1] https://marc.info/?l=linux-kernel&m=151687290114799&w=2
> 
> Signed-off-by: Andrea Parri 

I am inclined to pull in something along these lines, but would like
some feedback on the wording, especially how "official" we want
the memory model to be.

Thoughts?

If I don't hear otherwise in a couple of days, I will pull this as is.

Thanx, Paul

> ---
>  Documentation/memory-barriers.txt | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/memory-barriers.txt 
> b/Documentation/memory-barriers.txt
> index a863009849a3b..8cc3f098f4a7d 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -17,7 +17,9 @@ meant as a guide to using the various memory barriers 
> provided by Linux, but
>  in case of any doubt (and there are many) please ask.
> 
>  To repeat, this document is not a specification of what Linux expects from
> -hardware.
> +hardware.  For such a specification, in the form of a memory consistency
> +model, and for documentation about its usage and its design, the reader is
> +referred to "tools/memory-model/".
> 
>  The purpose of this document is twofold:
> 
> -- 
> 2.7.4
>

[PATCH v4 4/5] irqchip/gic-v3-its: add ability to resend MAPC on resume

2018-02-02 Thread Derek Basehore

This adds functionality to resend the MAPC command to an ITS node on
resume. If the ITS is powered down during suspend and the collections
are not backed by memory, the ITS will lose that state. This just sets
up the known state for the collections after the ITS is restored.

This feature is enabled via Kconfig and a device tree entry.

Signed-off-by: Derek Basehore 
---
 arch/arm64/Kconfig   |  10 
 drivers/irqchip/irq-gic-v3-its.c | 101 ---
 2 files changed, 73 insertions(+), 38 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 53612879fe56..f38f1a7b4266 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -571,6 +571,16 @@ config HISILICON_ERRATUM_161600802
 
  If unsure, say Y.
 
+config ARM_GIC500_COLLECTIONS_RESET
+   bool "GIC-500 Collections: Workaround for GIC-500 Collections on 
suspend reset"
+   default y
+   help
+ The GIC-500 can store Collections state internally for the ITS. If
+ the ITS is reset on suspend (ie from power getting disabled), the
+ collections need to be reconfigured on resume.
+
+ If unsure, say Y.
+
 config QCOM_FALKOR_ERRATUM_E1041
bool "Falkor E1041: Speculative instruction fetches might cause errant 
memory access"
default y
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index e13515cdb68f..63764efa4dcc 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -48,6 +48,7 @@
 #define ITS_FLAGS_WORKAROUND_CAVIUM_22375  (1ULL << 1)
 #define ITS_FLAGS_WORKAROUND_CAVIUM_23144  (1ULL << 2)
 #define ITS_FLAGS_SAVE_SUSPEND_STATE   (1ULL << 3)
+#define ITS_FLAGS_WORKAROUND_GIC500_MAPC   (1ULL << 4)
 
 #define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING(1 << 0)
 
@@ -1950,52 +1951,53 @@ static void its_cpu_init_lpis(void)
dsb(sy);
 }
 
-static void its_cpu_init_collection(void)
+static void its_cpu_init_collection(struct its_node *its)
 {
-   struct its_node *its;
-   int cpu;
-
-   spin_lock(&its_lock);
-   cpu = smp_processor_id();
-
-   list_for_each_entry(its, &its_nodes, entry) {
-   u64 target;
+   int cpu = smp_processor_id();
+   u64 target;
 
-   /* avoid cross node collections and its mapping */
-   if (its->flags & ITS_FLAGS_WORKAROUND_CAVIUM_23144) {
-   struct device_node *cpu_node;
+   /* avoid cross node collections and its mapping */
+   if (its->flags & ITS_FLAGS_WORKAROUND_CAVIUM_23144) {
+   struct device_node *cpu_node;
 
-   cpu_node = of_get_cpu_node(cpu, NULL);
-   if (its->numa_node != NUMA_NO_NODE &&
-   its->numa_node != of_node_to_nid(cpu_node))
-   continue;
-   }
+   cpu_node = of_get_cpu_node(cpu, NULL);
+   if (its->numa_node != NUMA_NO_NODE &&
+   its->numa_node != of_node_to_nid(cpu_node))
+   return;
+   }
 
+   /*
+* We now have to bind each collection to its target
+* redistributor.
+*/
+   if (gic_read_typer(its->base + GITS_TYPER) & GITS_TYPER_PTA) {
/*
-* We now have to bind each collection to its target
+* This ITS wants the physical address of the
 * redistributor.
 */
-   if (gic_read_typer(its->base + GITS_TYPER) & GITS_TYPER_PTA) {
-   /*
-* This ITS wants the physical address of the
-* redistributor.
-*/
-   target = gic_data_rdist()->phys_base;
-   } else {
-   /*
-* This ITS wants a linear CPU number.
-*/
-   target = gic_read_typer(gic_data_rdist_rd_base() + 
GICR_TYPER);
-   target = GICR_TYPER_CPU_NUMBER(target) << 16;
-   }
+   target = gic_data_rdist()->phys_base;
+   } else {
+   /* This ITS wants a linear CPU number. */
+   target = gic_read_typer(gic_data_rdist_rd_base() + GICR_TYPER);
+   target = GICR_TYPER_CPU_NUMBER(target) << 16;
+   }
 
-   /* Perform collection mapping */
-   its->collections[cpu].target_address = target;
-   its->collections[cpu].col_id = cpu;
+   /* Perform collection mapping */
+   its->collections[cpu].target_address = target;
+   its->collections[cpu].col_id = cpu;
 
-   its_send_mapc(its, &its->collections[cpu], 1);
-   its_send_invall(its, &its->collections[cpu]);
-   }
+   its_send_mapc(its, &its->collections[cpu], 1);
+   its_send_invall(its, &its->collections[cpu]);
+}
+
+static void it

[PATCH v4 2/5] irqchip/gic-v3-its: add ability to save/restore ITS state

2018-02-02 Thread Derek Basehore

Some platforms power off GIC logic in suspend, so we need to
save/restore state. The distributor and redistributor registers need
to be handled in platform code due to access permissions on those
registers, but the ITS registers can be restored in the kernel.

Signed-off-by: Derek Basehore 
---
 drivers/irqchip/irq-gic-v3-its.c | 101 +++
 1 file changed, 101 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 06f025fd5726..e13515cdb68f 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -46,6 +47,7 @@
 #define ITS_FLAGS_CMDQ_NEEDS_FLUSHING  (1ULL << 0)
 #define ITS_FLAGS_WORKAROUND_CAVIUM_22375  (1ULL << 1)
 #define ITS_FLAGS_WORKAROUND_CAVIUM_23144  (1ULL << 2)
+#define ITS_FLAGS_SAVE_SUSPEND_STATE   (1ULL << 3)
 
 #define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING(1 << 0)
 
@@ -83,6 +85,15 @@ struct its_baser {
u32 psz;
 };
 
+/*
+ * Saved ITS state - this is where saved state for the ITS is stored
+ * when it's disabled during system suspend.
+ */
+struct its_ctx {
+   u64 cbaser;
+   u32 ctlr;
+};
+
 struct its_device;
 
 /*
@@ -101,6 +112,7 @@ struct its_node {
struct its_collection   *collections;
struct fwnode_handle*fwnode_handle;
u64 (*get_msi_base)(struct its_device *its_dev);
+   struct its_ctx  its_ctx;
struct list_headits_device_list;
u64 flags;
unsigned long   list_nr;
@@ -3042,6 +3054,90 @@ static void its_enable_quirks(struct its_node *its)
gic_enable_quirks(iidr, its_quirks, its);
 }
 
+static int its_save_disable(void)
+{
+   struct its_node *its;
+   int err = 0;
+
+   spin_lock(&its_lock);
+   list_for_each_entry(its, &its_nodes, entry) {
+   struct its_ctx *ctx;
+   void __iomem *base;
+
+   if (!(its->flags & ITS_FLAGS_SAVE_SUSPEND_STATE))
+   continue;
+
+   ctx = &its->its_ctx;
+   base = its->base;
+   ctx->ctlr = readl_relaxed(base + GITS_CTLR);
+   err = its_force_quiescent(base);
+   if (err) {
+   pr_err("ITS failed to quiesce\n");
+   writel_relaxed(ctx->ctlr, base + GITS_CTLR);
+   goto err;
+   }
+
+   ctx->cbaser = gits_read_cbaser(base + GITS_CBASER);
+   }
+
+err:
+   if (err) {
+   list_for_each_entry_continue_reverse(its, &its_nodes, entry) {
+   if (its->flags & ITS_FLAGS_SAVE_SUSPEND_STATE) {
+   struct its_ctx *ctx = &its->its_ctx;
+   void __iomem *base = its->base;
+
+   writel_relaxed(ctx->ctlr, base + GITS_CTLR);
+   }
+   }
+   }
+
+   spin_unlock(&its_lock);
+
+   return err;
+}
+
+static void its_restore_enable(void)
+{
+   struct its_node *its;
+
+   spin_lock(&its_lock);
+   list_for_each_entry(its, &its_nodes, entry) {
+   if (its->flags & ITS_FLAGS_SAVE_SUSPEND_STATE) {
+   struct its_ctx *ctx = &its->its_ctx;
+   void __iomem *base = its->base;
+   /*
+* Only the lower 32 bits matter here since the upper 32
+* don't include any of the offset.
+*/
+   u32 creader = readl_relaxed(base + GITS_CREADR);
+   int i;
+
+   /*
+* Reset the write location to where the ITS is
+* currently at.
+*/
+   gits_write_cbaser(ctx->cbaser, base + GITS_CBASER);
+   gits_write_cwriter(creader, base + GITS_CWRITER);
+   its->cmd_write = &its->cmd_base[
+   creader / sizeof(struct its_cmd_block)];
+   /* Restore GITS_BASER from the value cache. */
+   for (i = 0; i < GITS_BASER_NR_REGS; i++) {
+   struct its_baser *baser = &its->tables[i];
+
+   its_write_baser(its, baser, baser->val);
+   }
+   writel_relaxed(ctx->ctlr, base + GITS_CTLR);
+   }
+   }
+   spin_unlock(&its_lock);
+}
+
+static struct syscore_ops its_syscore_ops = {
+   .suspend = its_save_disable,
+   .resume = its_restore_enable,
+};
+
 static int its_init_domain(struct fwnode_handle *handle, struct its_node *its)
 {
struct irq_domain *inner_domain;
@@ -3261,6 +3357,9 @@ static int

[PATCH v4 5/5] DT/arm,gic-v3: add collections-reset-on-suspend property

2018-02-02 Thread Derek Basehore

This boolean property for the GIC-V3-ITS enables resending the MAP
COLLECTIONS commands on resume, for cases where the collections state
is reset on suspend.

Signed-off-by: Derek Basehore 
---
 Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt | 4 
 1 file changed, 4 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt 
b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
index a470147d4f14..adb958e046d2 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
+++ b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
@@ -81,6 +81,10 @@ Optional:
 - reset-on-suspend: Boolean property. Indicates that the ITS state is
   reset on suspend. The state is then saved on suspend and restored on
   resume.
+- collections-reset-on-suspend : Boolean property. If the collections for the
+  ITS are stored internally instead of externally, the state will be lost if 
the
+  GIC loses power. Setting this enables the kernel to reset the collections
+  state on resume for this ITS node.
 
 The main GIC node must contain the appropriate #address-cells,
 #size-cells and ranges properties for the reg property of all ITS
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v4 3/5] DT/arm,gic-v3-its: add reset-on-suspend property

2018-02-02 Thread Derek Basehore

This adds documentation for the new reset-on-suspend property. This
property enables saving and restoring the ITS for when it loses state
in system suspend.

Signed-off-by: Derek Basehore 
---
 Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt 
b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
index 0a57f2f4167d..a470147d4f14 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
+++ b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
@@ -78,6 +78,9 @@ These nodes must have the following properties:
 Optional:
 - socionext,synquacer-pre-its: (u32, u32) tuple describing the untranslated
   address and size of the pre-ITS window.
+- reset-on-suspend: Boolean property. Indicates that the ITS state is
+  reset on suspend. The state is then saved on suspend and restored on
+  resume.
 
 The main GIC node must contain the appropriate #address-cells,
 #size-cells and ranges properties for the reg property of all ITS
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v4 1/5] cpu_pm: add syscore_suspend error handling

2018-02-02 Thread Derek Basehore

If cpu_cluster_pm_enter() fails, cpu_pm_exit() should be called. This
will put the CPU in the correct state to resume from the failure.

Signed-off-by: Derek Basehore 
---
 kernel/cpu_pm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/cpu_pm.c b/kernel/cpu_pm.c
index 67b02e138a47..03bcc0751a51 100644
--- a/kernel/cpu_pm.c
+++ b/kernel/cpu_pm.c
@@ -186,6 +186,9 @@ static int cpu_pm_suspend(void)
return ret;
 
ret = cpu_cluster_pm_enter();
+   if (ret)
+   cpu_pm_exit();
+
return ret;
 }
 
-- 
2.16.0.rc1.238.g530d649a79-goog
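
The unwinding rule in this patch (if the second enter step fails, undo the first) can be shown with a small model. This is a sketch only: struct pm_model and model_suspend() are hypothetical and merely mimic the ordering of cpu_pm_enter(), cpu_cluster_pm_enter() and cpu_pm_exit(); they are not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-stage suspend model. */
struct pm_model {
	int cpu_entered;	/* 1 while the per-CPU enter step is in effect */
	int cluster_entered;	/* 1 while the cluster enter step is in effect */
	int cluster_ret;	/* what the cluster step returns (0 = success) */
};

static int model_suspend(struct pm_model *m)
{
	int ret;

	assert(m != NULL);

	m->cpu_entered = 1;		/* stage 1, assumed to succeed here */

	ret = m->cluster_ret;		/* stage 2 may fail */
	if (ret)
		m->cpu_entered = 0;	/* undo stage 1 before reporting */
	else
		m->cluster_entered = 1;

	return ret;
}
```

Without the undo step, a failed suspend would leave stage 1 in effect even though the caller was told that suspend did not happen.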

[PATCH v4 0/5] GICv3 Save and Restore

2018-02-02 Thread Derek Basehore

A lot of changes in v2. The distributor and redistributor saving and
restoring is left to the PSCI/firmware implementation after
discussions with ARM. This reduces the line changes by a lot and
removes now unneeded patches.

Patches are verified on an RK3399 platform with pending patches in the
ARM-Trusted-Firmware project.

Just a couple minor changes in v3 to formatting.

Fixed a false ITS wedged detection due to the cmd_write and creadr
offsets not matching up on reset in v4. Also minor formatting changes.

Derek Basehore (5):
  cpu_pm: add syscore_suspend error handling
  irqchip/gic-v3-its: add ability to save/restore ITS state
  DT/arm,gic-v3-its: add reset-on-suspend property
  irqchip/gic-v3-its: add ability to resend MAPC on resume
  DT/arm,gic-v3: add collections-reset-on-suspend property

 .../bindings/interrupt-controller/arm,gic-v3.txt   |   7 +
 arch/arm64/Kconfig |  10 +
 drivers/irqchip/irq-gic-v3-its.c   | 202 +
 kernel/cpu_pm.c|   3 +
 4 files changed, 184 insertions(+), 38 deletions(-)

-- 
2.16.0.rc1.238.g530d649a79-goog
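
The v4 fix mentioned above comes down to making the driver's software write pointer agree with the hardware read offset after a reset. A minimal sketch of that arithmetic follows, assuming 32-byte commands and a read offset that is already a plain byte offset into the queue. struct its_cmd_model and resync_write_ptr() are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 32-byte ITS command slot. */
struct its_cmd_model {
	uint64_t raw[4];
};

/*
 * Turn a byte offset read back from the hardware into a pointer to
 * the matching slot in the command queue, so that the next command
 * is written exactly where the hardware will read next.
 */
static struct its_cmd_model *resync_write_ptr(struct its_cmd_model *cmd_base,
					      uint32_t creadr_offset)
{
	assert(cmd_base != NULL);
	return &cmd_base[creadr_offset / sizeof(struct its_cmd_model)];
}
```

For example, a read offset of 96 bytes maps to slot 3. If the write pointer were left at its old slot, the queue would look non-empty while the hardware makes no progress, which is the kind of false "wedged" detection the cover letter describes.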

cris-linux-ld: cannot open linker script file ./arch/cris/kernel/vmlinux.lds: No such file or directory

2018-02-02 Thread kbuild test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   b89e32ccd1be92a3643df3908d3026b09e271616
commit: 0fbc0b67a89d756ae3a839be01440e54348159a0 cris: remove arch specific 
early DT functions
date:   3 days ago
config: cris-defconfig (attached as .config)
compiler: cris-linux-gcc (GCC) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 0fbc0b67a89d756ae3a839be01440e54348159a0
# save the attached .config to linux build tree
make.cross ARCH=cris 

All errors (new ones prefixed by >>):

>> cris-linux-ld: cannot open linker script file 
>> ./arch/cris/kernel/vmlinux.lds: No such file or directory

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip

Re: [PATCH 2/2] MAINTAINERS: list file memory-barriers.txt within the LKMM entry

2018-02-02 Thread Andrea Parri

On Fri, Feb 02, 2018 at 03:51:02PM -0800, Paul E. McKenney wrote:
> On Fri, Feb 02, 2018 at 10:13:42AM +0100, Andrea Parri wrote:
> > Now that a formal specification of the LKMM has become available to
> > the developer, some concern about how to track changes to the model
> > on the level of the "high-level documentation" was raised.
> > 
> > A first "mitigation" to this issue, suggested by Will, is to assign
> > maintainership (and responsibility!!)  of such documentation (here,
> > memory-barriers.txt) to the maintainers of the LKMM themselves.
> > 
> > Suggested-by: Will Deacon 
> > Signed-off-by: Andrea Parri 
> 
> Very good, thank you, queued!  Please see below for the usual commit-log
> rework.  BTW, in future submissions, could you please capitalize the
> first word after the colon (":") in the subject line?  It is all too
> easy for me to forget to change this, as Ingo can attest.  ;-)

Sorry, I'll do my best! ;-)


> 
> If we are going to continue to use the LKMM acronym, should we make the
> first line of the MAINTAINERS block look something like this?

I've no strong opinion about whether we should, but it makes sense to me.
(The acronym is currently defined (and heavily used) in explanation.txt.)

Thanks,
  Andrea


> 
>   LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
> 
> One alternative would be to start calling it LKMCM, though that does
> look a bit like a Roman numeral.  ;-)
> 
>   Thanx, Paul
> 
> 
> 
> commit 2f80571625dc2d1977acdef79267ba1645b07c53
> Author: Andrea Parri 
> Date:   Fri Feb 2 10:13:42 2018 +0100
> 
> MAINTAINERS: List file memory-barriers.txt within the LKMM entry
> 
> We now have a shiny new Linux-kernel memory model (LKMM) and the old
> tried-and-true Documentation/memory-barriers.txt.  It would be good to
> keep these automatically synchronized, but in the meantime we need to at
> least let people know that they are related.  Will suggested adding the
> Documentation/memory-barriers.txt file to the LKMM maintainership list,
> thus making the LKMM maintainers responsible for both the old and the new.
> This commit follows Will's excellent suggestion.
> 
> Suggested-by: Will Deacon 
> Signed-off-by: Andrea Parri 
> Signed-off-by: Paul E. McKenney 
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ba4dc08fbe95..e6ad9b44e8fb 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8101,6 +8101,7 @@ L:  linux-kernel@vger.kernel.org
>  S:   Supported
>  T:   git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>  F:   tools/memory-model/
> +F:   Documentation/memory-barriers.txt
>  
>  LINUX SECURITY MODULE (LSM) FRAMEWORK
>  M:   Chris Wright 
>

[GIT] Networking

2018-02-02 Thread David Miller


1) The bnx2x can hang if you give it a GSO packet with a segment size
   which is too big for the hardware, detect and drop in this case.
   From Daniel Axtens.

2) Fix some overflows and pointer leaks in xtables, from Dmitry
   Vyukov.

3) Missing RCU locking in igmp, from Eric Dumazet.

4) Fix RX checksum handling on r8152, it can only checksum UDP and
   TCP packets.  From Hayes Wang.

5) Minor pacing tweak to TCP BBR congestion control, from Neal
   Cardwell.

6) Missing RCU annotations in cls_u32, from Paolo Abeni.

Please pull, thanks a lot!
  

The following changes since commit 255442c93843f52b6891b21d0b485bf2c97f93c3:

  Merge tag 'docs-4.16' of git://git.lwn.net/linux (2018-01-31 19:25:25 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to edbe69ef2c90fc86998a74b08319a01c508bd497:

  Revert "defer call to mem_cgroup_sk_alloc()" (2018-02-02 19:49:31 -0500)


Alexander Monakov (1):
  net: pxa168_eth: add netconsole support

Arnd Bergmann (3):
  net: cxgb4: avoid memcpy beyond end of source buffer
  net: qed: use correct strncpy() size
  net: qlge: use memmove instead of skb_copy_to_linear_data

Christian Brauner (1):
  rtnetlink: remove check for IFLA_IF_NETNSID

Colin Ian King (4):
  be2net: remove redundant initialization of 'head' and pointer txq
  net: jme: remove unused initialization of 'rxdesc'
  lan78xx: remove redundant initialization of pointer 'phydev'
  vmxnet3: remove redundant initialization of pointer 'rq'

Daniel Axtens (2):
  net: create skb_gso_validate_mac_len()
  bnx2x: disable GSO where gso_size is too big for hardware

David S. Miller (3):
  Merge branch 'bnx2x-disable-GSO-on-too-large-packets'
  Merge git://git.kernel.org/.../pablo/nf
  Merge branch 'r8152-fix-rx-issues'

Desnes Augusto Nunes do Rosario (1):
  ibmvnic: fix firmware version when no firmware level has been provided by the VIOS server

Dmitry Vyukov (3):
  netfilter: x_tables: fix int overflow in xt_alloc_table_info()
  netfilter: x_tables: fix pointer leaks to userspace
  netfilter: ipt_CLUSTERIP: fix out-of-bounds accesses in clusterip_tg_check()

Ed Swierk (1):
  openvswitch: Remove padding from packet before L3+ conntrack processing

Edwin Peer (1):
  nfp: fix TLV offset calculation

Eric Dumazet (3):
  netfilter: x_tables: avoid out-of-bounds reads in xt_request_find_{match|target}
  net: igmp: add a missing rcu locking section
  soreuseport: fix mem leak in reuseport_add_sock()

Geert Uytterhoeven (2):
  net: bridge: Fix uninitialized error in br_fdb_sync_static()
  inet: Avoid unitialized variable warning in inet_unhash()

Hayes Wang (2):
  r8152: fix wrong checksum status for received IPv4 packets
  r8152: set rx mode early when linking on

Jiri Pirko (1):
  rocker: fix possible null pointer dereference in rocker_router_fib_event_work

Jozsef Kadlecsik (1):
  netfilter: ipset: Fix wraparound in hash:*net* types

Neal Cardwell (1):
  tcp_bbr: fix pacing_gain to always be unity when using lt_bw

Paolo Abeni (2):
  netfilter: on sockopt() acquire sock lock only in the required scope
  cls_u32: add missing RCU annotation.

Roman Gushchin (1):
  Revert "defer call to mem_cgroup_sk_alloc()"

 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  | 18 ++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h|  2 +-
 drivers/net/ethernet/emulex/benet/be_main.c   |  3 +--
 drivers/net/ethernet/ibm/ibmvnic.c|  6 +-
 drivers/net/ethernet/jme.c|  2 +-
 drivers/net/ethernet/marvell/pxa168_eth.c | 12 
 drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.c |  2 +-
 drivers/net/ethernet/qlogic/qed/qed_debug.c   |  6 ++
 drivers/net/ethernet/qlogic/qlge/qlge_main.c  |  3 +--
 drivers/net/ethernet/rocker/rocker_main.c | 18 +-
 drivers/net/usb/lan78xx.c |  2 +-
 drivers/net/usb/r8152.c   | 13 ++---
 drivers/net/vmxnet3/vmxnet3_drv.c |  6 ++
 include/linux/skbuff.h| 16 
 mm/memcontrol.c   | 14 ++
 net/bridge/br_fdb.c   |  2 +-
 net/core/rtnetlink.c  |  3 ---
 net/core/skbuff.c | 63 ++-
 net/core/sock.c   |  5 +
 net/core/sock_reuseport.c | 35 ---
 net/ipv4/igmp.c   |  4 
 net/ipv4/inet_connection_sock.c   |  1 -
 net/ipv4/inet_hashtables.c|  6 ++
 n

Re: [GIT PULL] pin control bulk changes for v4.16

2018-02-02 Thread Linus Torvalds

On Fri, Feb 2, 2018 at 4:44 PM, Linus Torvalds
 wrote:
>
> Stupid patch attached. I don't know how much this helps the insane
> dependency hell for , but it's bound to help
> _some_.

Testing it, that patch definitely cuts down on recompiles after

 touch include/linux/pinctrl/devinfo.h

a lot.

It still ends up rebuilding a fair amount of odd drivers, but now the
files it rebuilds at least make _some_ sense.

It used to really rebuild just about everything (because pretty much
everything includes ). Now it rebuilds various snd/soc
files, gpio stuff and mmc/mfc stuff.

I'm sure it could be improved upon still, but I think this is already
a fairly noticeable improvement.

One odd header include down. Ten million to go.

 Linus

[PATCH v7 5/5] iommu/vt-d: Add debugfs support for Interrupt remapping

2018-02-02 Thread Sohil Mehta

Debugfs extension for Intel IOMMU to dump Interrupt remapping table
entries for Interrupt remapping and Interrupt posting.

The file /sys/kernel/debug/intel_iommu/ir_translation_struct provides
detailed information, such as Index, Source Id, Destination Id, Vector
and the IRTE values for entries with the present bit set, in the format
shown.

Remapped Interrupt supported on IOMMU: dmar7
 IR table address:85e50
 Index  SrcID DstID    Vct IRTE_high        IRTE_low
 1  f0f8  0100 30  0004f0f8 013d
 7  f0f8  0400 22  0004f0f8 0422000d

Posted Interrupt supported on IOMMU: dmar5
 IR table address:85ec0
 Index  SrcID PDA_high PDA_low  Vct IRTE_high   IRTE_low
 4  4300  000f ff765980 41  000f00044300ff76598000418001
 5  4300  000f ff765980 51  000f00044300ff76598000518001
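
Editorial sketch for readers of the table above: the Index, SrcID, DstID and
Vct columns are bit fields of the 128-bit IRTE. The userspace code below is
only an illustration, not part of the patch. The bit positions follow my
reading of the remapped-format layout in struct irte; the names irte_fields
and irte_decode() are hypothetical and should be checked against the VT-d
specification before being relied on.

```c
#include <stdint.h>

/* Decoded view of a remapped-format IRTE; the struct name is hypothetical. */
struct irte_fields {
	unsigned int present;	/* low bit 0 */
	unsigned int posted;	/* low bit 15: entry uses the posted format */
	unsigned int vector;	/* low bits 16-23 */
	uint32_t dest_id;	/* low bits 32-63 */
	unsigned int sid;	/* high bits 0-15: source id */
};

/* Split the two 64-bit halves of an IRTE into the fields the dump prints. */
static struct irte_fields irte_decode(uint64_t low, uint64_t high)
{
	struct irte_fields f;

	f.present = (unsigned int)(low & 0x1);
	f.posted  = (unsigned int)((low >> 15) & 0x1);
	f.vector  = (unsigned int)((low >> 16) & 0xff);
	f.dest_id = (uint32_t)(low >> 32);
	f.sid     = (unsigned int)(high & 0xffff);
	return f;
}
```

An entry belongs in the "Remapped" table when present is set and posted is
clear, which mirrors the filter in ir_tbl_remap_entry_show() in the patch.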

Cc: Jacob Pan 
Cc: Fenghua Yu 
Cc: Ashok Raj 
Co-Developed-by: Gayatri Kammela 
Signed-off-by: Gayatri Kammela 
Signed-off-by: Sohil Mehta 
---

v7: Print the IR table physical base address
Simplify IR table formatting

v6: Change a couple of seq_puts to seq_putc

v5: Fix seq_puts formatting and remove leading '\n's

v4: Remove the unused function parameter
Fix checkpatch.pl warnings
Remove error reporting for debugfs_create_file function
Remove redundant IOMMU null check under for_each_active_iommu

v3: Use a macro for seq file operations 
Change the intel_iommu_interrupt_remap file name to ir_translation_struct

v2: Handle the case when IR is not enabled. Fix seq_printf formatting

 drivers/iommu/intel-iommu-debug.c | 94 +++
 1 file changed, 94 insertions(+)

diff --git a/drivers/iommu/intel-iommu-debug.c b/drivers/iommu/intel-iommu-debug.c
index a9a99aa..b66a073 100644
--- a/drivers/iommu/intel-iommu-debug.c
+++ b/drivers/iommu/intel-iommu-debug.c
@@ -229,6 +229,96 @@ static int iommu_regset_show(struct seq_file *m, void *unused)
 }
 DEFINE_SHOW_ATTRIBUTE(iommu_regset);
 
+#ifdef CONFIG_IRQ_REMAP
+static void ir_tbl_remap_entry_show(struct seq_file *m,
+   struct intel_iommu *iommu)
+{
+   struct irte *ri_entry;
+   int idx;
+
+   seq_puts(m, " Index  SrcID DstIDVct IRTE_high\t\tIRTE_low\n");
+
+   for (idx = 0; idx < INTR_REMAP_TABLE_ENTRIES; idx++) {
+   ri_entry = &iommu->ir_table->base[idx];
+   if (!ri_entry->present || ri_entry->p_pst)
+   continue;
+
+   seq_printf(m, " %d\t%04x  %08x %02x  %016llx\t%016llx\n", idx,
+  ri_entry->sid, ri_entry->dest_id, ri_entry->vector,
+  ri_entry->high, ri_entry->low);
+   }
+}
+
+static void ir_tbl_posted_entry_show(struct seq_file *m,
+struct intel_iommu *iommu)
+{
+   struct irte *pi_entry;
+   int idx;
+
+   seq_puts(m, " Index  SrcID PDA_high PDA_low  Vct IRTE_high\t\tIRTE_low\n");
+
+   for (idx = 0; idx < INTR_REMAP_TABLE_ENTRIES; idx++) {
+   pi_entry = &iommu->ir_table->base[idx];
+   if (!pi_entry->present || !pi_entry->p_pst)
+   continue;
+
+   seq_printf(m, " %d\t%04x  %08x %08x %02x  %016llx\t%016llx\n",
+  idx, pi_entry->sid, pi_entry->pda_h,
+  pi_entry->pda_l << 6, pi_entry->vector,
+  pi_entry->high, pi_entry->low);
+   }
+}
+
+/*
+ * For active IOMMUs go through the Interrupt remapping
+ * table and print valid entries in a table format for
+ * Remapped and Posted Interrupts.
+ */
+static int ir_translation_struct_show(struct seq_file *m, void *unused)
+{
+   struct dmar_drhd_unit *drhd;
+   struct intel_iommu *iommu;
+   u64 irta;
+
+   rcu_read_lock();
+   for_each_active_iommu(iommu, drhd) {
+   if (!ecap_ir_support(iommu->ecap))
+   continue;
+
+   irta = dmar_readq(iommu->reg + DMAR_IRTA_REG) & VTD_PAGE_MASK;
+   seq_printf(m, "Remapped Interrupt supported on IOMMU: %s\n"
+ " IR table address:%llx\n", iommu->name, irta);
+
+   if (iommu->ir_table && irta)
+   ir_tbl_remap_entry_show(m, iommu);
+   else
+   seq_puts(m, "Interrupt Remapping is not enabled\n");
+   seq_putc(m, '\n');
+   }
+
+   seq_puts(m, "\n\n");
+
+   for_each_active_iommu(iommu, drhd) {
+   if (!cap_pi_support(iommu->cap))
+   continue;
+
+   irta = dmar_readq(iommu->reg + DMAR_IRTA_REG) & VTD_PAGE_MASK;
+   seq_printf(m, "Posted Interrupt supported on IOMMU: %s\n"
+ " IR table address:%llx\n", iommu->name, irta);
+
+   if (iommu->ir_table && irta)
+   ir_tbl_posted_entry_show(m, iommu);
+

[PATCH v7 3/5] iommu/vt-d: Add debugfs support to show register contents

2018-02-02 Thread Sohil Mehta

From: Gayatri Kammela 

Debugfs extension to dump all the register contents for each IOMMU
device to the user space via debugfs.

Example:
root@OTC-KBLH-01:~# cat /sys/kernel/debug/intel_iommu/iommu_regset
DMAR: dmar0: Register Base Address fed9
Name    Offset  Contents
VER     0x00    0x0010
CAP     0x08    0x01cc40660462
ECAP    0x10    0x00f0101a
GCMD    0x18    0x
GSTS    0x1c    0xc700
RTADDR  0x20    0x0004071d3800
CCMD    0x28    0x0800
FSTS    0x34    0x
FECTL   0x38    0x
FEDATA  0x3c    0xfee010044021
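
Editorial sketch: the dump above is driven by a table built with a
token-pasting macro, so each register is named only once. The userspace
approximation below shows that pattern in isolation. STRINGIFY, REGSET_ENTRY,
demo_regs and regset_format_row() are hypothetical stand-ins (the patch uses
the kernel's __stringify() and IOMMU_REGSET_ENTRY); the three offsets are the
ones listed in the sample output.

```c
#include <stddef.h>
#include <stdio.h>

/* Offsets as listed in the sample output. */
#define DMAR_VER_REG	0x00
#define DMAR_CAP_REG	0x08
#define DMAR_ECAP_REG	0x10

/* Userspace stand-in for the kernel's __stringify(). */
#define STRINGIFY(x)	#x

struct regset_entry {
	int offset;
	const char *name;
};

/* DMAR_<name>_REG supplies the offset, the bare token supplies the label. */
#define REGSET_ENTRY(_reg_) { DMAR_##_reg_##_REG, STRINGIFY(_reg_) }

static const struct regset_entry demo_regs[] = {
	REGSET_ENTRY(VER),
	REGSET_ENTRY(CAP),
	REGSET_ENTRY(ECAP),
};

/* Format one "name<TAB>offset" row into buf; returns snprintf()'s result. */
static int regset_format_row(char *buf, size_t size, size_t idx)
{
	return snprintf(buf, size, "%s\t0x%02x", demo_regs[idx].name,
			(unsigned int)demo_regs[idx].offset);
}
```

Adding a register to the dump then means adding one REGSET_ENTRY() line,
provided the matching DMAR_*_REG offset is defined.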

Cc: Fenghua Yu 
Cc: Jacob Pan 
Cc: Ashok Raj 
Co-Developed-by: Sohil Mehta 
Signed-off-by: Sohil Mehta 
Signed-off-by: Gayatri Kammela 
---

v7: Use macro for register set definitions
Fix compiler warning for readq with 32bit architecture
Remove leading '\n'

v6: No change

v5: No change

v4: Fix checkpatch.pl warnings
Remove error reporting for debugfs_create_file function
Remove redundant IOMMU null check under for_each_active_iommu

v3: Use a macro for seq file operations 
Change the intel_iommu_regset file name to iommu_regset
Add information for MTRR registers

v2: Fix seq_printf formatting

 drivers/iommu/intel-iommu-debug.c | 84 +++
 include/linux/intel-iommu.h   |  2 +
 2 files changed, 86 insertions(+)

diff --git a/drivers/iommu/intel-iommu-debug.c b/drivers/iommu/intel-iommu-debug.c
index 8253503..38651ad 100644
--- a/drivers/iommu/intel-iommu-debug.c
+++ b/drivers/iommu/intel-iommu-debug.c
@@ -38,6 +38,49 @@ static const struct file_operations __name ## _fops = \
.owner  = THIS_MODULE,  \
 }
 
+struct iommu_regset {
+   int offset;
+   const char *regs;
+};
+
+#define IOMMU_REGSET_ENTRY(_reg_)  \
+   { DMAR_##_reg_##_REG, __stringify(_reg_) }
+static const struct iommu_regset iommu_regs[] = {
+   IOMMU_REGSET_ENTRY(VER),
+   IOMMU_REGSET_ENTRY(CAP),
+   IOMMU_REGSET_ENTRY(ECAP),
+   IOMMU_REGSET_ENTRY(GCMD),
+   IOMMU_REGSET_ENTRY(GSTS),
+   IOMMU_REGSET_ENTRY(RTADDR),
+   IOMMU_REGSET_ENTRY(CCMD),
+   IOMMU_REGSET_ENTRY(FSTS),
+   IOMMU_REGSET_ENTRY(FECTL),
+   IOMMU_REGSET_ENTRY(FEDATA),
+   IOMMU_REGSET_ENTRY(FEADDR),
+   IOMMU_REGSET_ENTRY(FEUADDR),
+   IOMMU_REGSET_ENTRY(AFLOG),
+   IOMMU_REGSET_ENTRY(PMEN),
+   IOMMU_REGSET_ENTRY(PLMBASE),
+   IOMMU_REGSET_ENTRY(PLMLIMIT),
+   IOMMU_REGSET_ENTRY(PHMBASE),
+   IOMMU_REGSET_ENTRY(PHMLIMIT),
+   IOMMU_REGSET_ENTRY(IQH),
+   IOMMU_REGSET_ENTRY(IQT),
+   IOMMU_REGSET_ENTRY(IQA),
+   IOMMU_REGSET_ENTRY(ICS),
+   IOMMU_REGSET_ENTRY(IRTA),
+   IOMMU_REGSET_ENTRY(PQH),
+   IOMMU_REGSET_ENTRY(PQT),
+   IOMMU_REGSET_ENTRY(PQA),
+   IOMMU_REGSET_ENTRY(PRS),
+   IOMMU_REGSET_ENTRY(PECTL),
+   IOMMU_REGSET_ENTRY(PEDATA),
+   IOMMU_REGSET_ENTRY(PEADDR),
+   IOMMU_REGSET_ENTRY(PEUADDR),
+   IOMMU_REGSET_ENTRY(MTRRCAP),
+   IOMMU_REGSET_ENTRY(MTRRDEF)
+};
+
 static void ctx_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu,
   int bus, bool ext)
 {
@@ -116,6 +159,45 @@ static int dmar_translation_struct_show(struct seq_file *m, void *unused)
 }
 DEFINE_SHOW_ATTRIBUTE(dmar_translation_struct);
 
+static int iommu_regset_show(struct seq_file *m, void *unused)
+{
+   struct dmar_drhd_unit *drhd;
+   struct intel_iommu *iommu;
+   unsigned long long base;
+   int i, ret = 0;
+   u64 value;
+
+   rcu_read_lock();
+   for_each_active_iommu(iommu, drhd) {
+   if (!drhd->reg_base_addr) {
+   seq_puts(m, "IOMMU: Invalid base address\n");
+   ret = -EINVAL;
+   goto out;
+   }
+
+   base = drhd->reg_base_addr;
+   seq_printf(m, "DMAR: %s: Register Base Address %llx\n",
+  iommu->name, base);
+   seq_puts(m, "Name\t\t\tOffset\t\tContents\n");
+   /*
+* Publish the contents of the 64-bit hardware registers
+* by adding the offset to the pointer (virtual address).
+*/
+   for (i = 0 ; i < ARRAY_SIZE(iommu_regs); i++) {
+   value = dmar_readq(iommu->reg + iommu_regs[i].offset);
+   seq_printf(m, "%-8s\t\t0x%02x\t\t0x%016llx\n",
+  iommu_regs[i].regs, iommu_regs[i].offset,
+  value);
+   }
+   seq_putc(m, '\n');
+   }
+out:
+   rcu_read_unlock();
+
+   retu

[PATCH v7 0/5] Add Intel IOMMU debugfs support

2018-02-02 Thread Sohil Mehta

Hi All,

This series aims to add debugfs support for Intel IOMMU. It exposes IOMMU
registers, internal context and dumps individual table entries to help debug
Intel IOMMUs.

The first patch does the ground work for the following patches by reorganizing
some Intel IOMMU data structures. The following patches create a new Kconfig
option - INTEL_IOMMU_DEBUG and add debugfs support for IOMMU context internals,
register contents, PASID internals, and Interrupt remapping in that order. The
information can be accessed through debugfs at '/sys/kernel/debug/intel_iommu/'.


Regards,
Sohil
 
Changes since v6:
 - Split patch 1/5 and 2/5 differently
 - Simplify and improve code formatting
 - Use macro for register set definitions
 - Fix compiler warning for readq
 - Add Co-Developed-by tag to commit messages

Changes since v5:
 - Change the order of includes to an alphabetical order
 - Change seq_printf and seq_puts formatting

Changes since v4:
 - Change to a SPDX license tag
 - Fix seq_printf formatting and remove leading '\n's

Changes since v3:
 - Remove an unused function parameter from some of the functions
 - Fix checkpatch.pl warnings
 - Remove error reporting for debugfs_create_file functions
 - Fix unnecessary reprogramming of the context entries
 - Simplify and merge the show context and extended context patch into one
 - Remove redundant IOMMU null check under for_each_active_iommu
 - Update the commit title to be consistent

Changes since v2:
 - Added a macro for seq file operations based on recommendation by Andy
   Shevchenko. The macro can be moved to seq_file.h at a future point
 - Changed the debugfs file names to more relevant ones
 - Added information for MTRR registers in the regset file

Changes since v1:
 - Fixed seq_printf formatting
 - Handled the case when Interrupt remapping is not enabled

Gayatri Kammela (4):
  iommu/vt-d: Relocate struct/function declarations to its header files
  iommu/vt-d: Enable debugfs support to show context internals
  iommu/vt-d: Add debugfs support to show register contents
  iommu/vt-d: Add debugfs support to show Pasid table contents

Sohil Mehta (1):
  iommu/vt-d: Add debugfs support for Interrupt remapping

 drivers/iommu/Kconfig |   8 +
 drivers/iommu/Makefile|   1 +
 drivers/iommu/intel-iommu-debug.c | 338 ++
 drivers/iommu/intel-iommu.c   |  34 +---
 drivers/iommu/intel-svm.c |   8 -
 include/linux/intel-iommu.h   |  39 +
 include/linux/intel-svm.h |  10 +-
 7 files changed, 400 insertions(+), 38 deletions(-)
 create mode 100644 drivers/iommu/intel-iommu-debug.c

-- 
2.7.4

[PATCH v7 1/5] iommu/vt-d: Relocate struct/function declarations to its header files

2018-02-02 Thread Sohil Mehta

From: Gayatri Kammela 

To reuse the static functions and the struct declarations, move them to
corresponding header files and export the needed functions.

Cc: Sohil Mehta 
Cc: Fenghua Yu 
Cc: Ashok Raj 
Signed-off-by: Jacob Pan 
Signed-off-by: Gayatri Kammela 
---

v7: Split patch 1/5 and 2/5 differently
Update the commit message

v6: No change

v5: No change

v4: No change

v3: No change

v2: No change

 drivers/iommu/intel-iommu.c | 33 -
 include/linux/intel-iommu.h | 31 +++
 include/linux/intel-svm.h   |  2 +-
 3 files changed, 36 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4a2de34..f6241f6 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -183,16 +183,6 @@ static int rwbf_quirk;
 static int force_on = 0;
 int intel_iommu_tboot_noforce;
 
-/*
- * 0: Present
- * 1-11: Reserved
- * 12-63: Context Ptr (12 - (haw-1))
- * 64-127: Reserved
- */
-struct root_entry {
-   u64 lo;
-   u64 hi;
-};
 #define ROOT_ENTRY_NR (VTD_PAGE_SIZE/sizeof(struct root_entry))
 
 /*
@@ -218,21 +208,6 @@ static phys_addr_t root_entry_uctp(struct root_entry *re)
 
return re->hi & VTD_PAGE_MASK;
 }
-/*
- * low 64 bits:
- * 0: present
- * 1: fault processing disable
- * 2-3: translation type
- * 12-63: address space root
- * high 64 bits:
- * 0-2: address width
- * 3-6: aval
- * 8-23: domain id
- */
-struct context_entry {
-   u64 lo;
-   u64 hi;
-};
 
 static inline void context_clear_pasid_enable(struct context_entry *context)
 {
@@ -259,7 +234,7 @@ static inline bool __context_present(struct context_entry *context)
return (context->lo & 1);
 }
 
-static inline bool context_present(struct context_entry *context)
+bool context_present(struct context_entry *context)
 {
return context_pasid_enabled(context) ?
 __context_present(context) :
@@ -819,8 +794,8 @@ static void domain_update_iommu_cap(struct dmar_domain *domain)
domain->iommu_superpage = domain_update_iommu_superpage(NULL);
 }
 
-static inline struct context_entry *iommu_context_addr(struct intel_iommu *iommu,
-  u8 bus, u8 devfn, int alloc)
+struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus,
+u8 devfn, int alloc)
 {
struct root_entry *root = &iommu->root_entry[bus];
struct context_entry *context;
@@ -5208,7 +5183,7 @@ static void intel_iommu_put_resv_regions(struct device *dev,
 
 #ifdef CONFIG_INTEL_IOMMU_SVM
 #define MAX_NR_PASID_BITS (20)
-static inline unsigned long intel_iommu_get_pts(struct intel_iommu *iommu)
+unsigned long intel_iommu_get_pts(struct intel_iommu *iommu)
 {
/*
 * Convert ecap_pss to extend context entry pts encoding, also
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index f3274d9..78ec85a 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -383,6 +383,33 @@ struct pasid_entry;
 struct pasid_state_entry;
 struct page_req_dsc;
 
+/*
+ * 0: Present
+ * 1-11: Reserved
+ * 12-63: Context Ptr (12 - (haw-1))
+ * 64-127: Reserved
+ */
+struct root_entry {
+   u64 lo;
+   u64 hi;
+};
+
+/*
+ * low 64 bits:
+ * 0: present
+ * 1: fault processing disable
+ * 2-3: translation type
+ * 12-63: address space root
+ * high 64 bits:
+ * 0-2: address width
+ * 3-6: aval
+ * 8-23: domain id
+ */
+struct context_entry {
+   u64 lo;
+   u64 hi;
+};
+
 struct intel_iommu {
void __iomem*reg; /* Pointer to hardware regs, virtual addr */
u64 reg_phys; /* physical address of hw register set */
@@ -488,8 +515,12 @@ struct intel_svm {
 
 extern int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev);
 extern struct intel_iommu *intel_svm_device_to_iommu(struct device *dev);
+extern unsigned long intel_iommu_get_pts(struct intel_iommu *iommu);
 #endif
 
 extern const struct attribute_group *intel_iommu_groups[];
+extern bool context_present(struct context_entry *context);
+extern struct context_entry *iommu_context_addr(struct intel_iommu *iommu,
+   u8 bus, u8 devfn, int alloc);
 
 #endif
diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index 99bc5b3..733eaf9 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -130,7 +130,7 @@ static inline int intel_svm_unbind_mm(struct device *dev, int pasid)
BUG();
 }
 
-static int intel_svm_is_pasid_valid(struct device *dev, int pasid)
+static inline int intel_svm_is_pasid_valid(struct device *dev, int pasid)
 {
return -EINVAL;
 }
-- 
2.7.4

[PATCH v7 4/5] iommu/vt-d: Add debugfs support to show Pasid table contents

2018-02-02 Thread Sohil Mehta

From: Gayatri Kammela 

Debugfs extension to dump the internals such as pasid table entries for
each IOMMU to the userspace.

Example of such dump in Kabylake:

root@OTC-KBLH-01:~# cat
/sys/kernel/debug/intel_iommu/dmar_translation_struct
IOMMU dmar1: Extended Root Table Address:4071d3800
Extended Root Table Entries:
Bus 0 L: 4071d7001 H: 0
Lower Context Table Entries for Bus: 0
[entry] Device B:D.F    Low             High
[16]    :00:02.0        4071d6005       102
Higher Context Table Entries for Bus: 0
[16]    :00:02.0        0               0
Pasid Table Address: 746cb0af
Pasid Table Entries for domain 0:
[Entry] Contents
[0] 12c409801

Cc: Fenghua Yu 
Cc: Jacob Pan 
Cc: Ashok Raj 
Co-Developed-by: Sohil Mehta 
Signed-off-by: Sohil Mehta 
Signed-off-by: Gayatri Kammela 
---

v7: Improve code indentation and formatting

v6: No change

v5: No change

v4: Remove the unused function parameter
Fix checkpatch.pl warnings

v3: No change

v2: Fix seq_printf formatting

 drivers/iommu/intel-iommu-debug.c | 31 +++
 drivers/iommu/intel-svm.c |  8 
 include/linux/intel-svm.h |  8 
 3 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu-debug.c b/drivers/iommu/intel-iommu-debug.c
index 38651ad..a9a99aa 100644
--- a/drivers/iommu/intel-iommu-debug.c
+++ b/drivers/iommu/intel-iommu-debug.c
@@ -81,6 +81,36 @@ static const struct iommu_regset iommu_regs[] = {
IOMMU_REGSET_ENTRY(MTRRDEF)
 };
 
+#ifdef CONFIG_INTEL_IOMMU_SVM
+static void pasid_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu)
+{
+   int pasid_size = 0, i;
+
+   if (!ecap_pasid(iommu->ecap))
+   return;
+
+   pasid_size = intel_iommu_get_pts(iommu);
+   seq_printf(m, "Pasid Table Address: %p\n", iommu->pasid_table);
+
+   if (!iommu->pasid_table)
+   return;
+
+   seq_printf(m, "Pasid Table Entries for domain %d:\n", iommu->segment);
+   seq_puts(m, "[Entry]\t\tContents\n");
+
+   /* Publish the pasid table entries here */
+   for (i = 0; i < pasid_size; i++) {
+   if (!iommu->pasid_table[i].val)
+   continue;
+
+   seq_printf(m, "[%d]\t\t%04llx\n", i, iommu->pasid_table[i].val);
+   }
+}
+#else /* CONFIG_INTEL_IOMMU_SVM */
+static inline void
+pasid_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu) {}
+#endif /* CONFIG_INTEL_IOMMU_SVM */
+
 static void ctx_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu,
   int bus, bool ext)
 {
@@ -116,6 +146,7 @@ static void ctx_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu,
   iommu->segment, bus, PCI_SLOT(ctx), PCI_FUNC(ctx),
   context[1].lo, context[1].hi);
}
+   pasid_tbl_entry_show(m, iommu);
 out:
spin_unlock_irqrestore(&iommu->lock, flags);
 }
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index ed1cf7c..c646724 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -28,14 +28,6 @@
 
 static irqreturn_t prq_event_thread(int irq, void *d);
 
-struct pasid_entry {
-   u64 val;
-};
-
-struct pasid_state_entry {
-   u64 val;
-};
-
 int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
 {
struct page *pages;
diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index 733eaf9..a8abad6 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -18,6 +18,14 @@
 
 struct device;
 
+struct pasid_entry {
+   u64 val;
+};
+
+struct pasid_state_entry {
+   u64 val;
+};
+
 struct svm_dev_ops {
void (*fault_cb)(struct device *dev, int pasid, u64 address,
 u32 private, int rwxp, int response);
-- 
2.7.4

[PATCH v7 2/5] iommu/vt-d: Enable debugfs support to show context internals

2018-02-02 Thread Sohil Mehta

From: Gayatri Kammela 

Add a new config option CONFIG_INTEL_IOMMU_DEBUG and export Intel IOMMU
internal states, such as the root and context tables, to userspace via debugfs.

Example of such dump in Kabylake:

root@OTC-KBLH-01:~# cat
/sys/kernel/debug/intel_iommu/dmar_translation_struct
IOMMU dmar1: Extended Root Table Address:4071d3800
Extended Root Table Entries:
Bus 0 L: 4071d7001 H: 0
Lower Context Table Entries for Bus: 0
[entry] Device B:D.F    Low             High
[16]    :00:02.0        4071d6005       102
Higher Context Table Entries for Bus: 0
[16]    :00:02.0        0               0

IOMMU dmar0: Extended Root Table Address:4071d4800

IOMMU dmar2: Root Table Address:4071d5000
Root Table Entries:
Bus 0 L: 406d13001 H: 0
Context Table Entries for Bus: 0
[entry] Device B:D.F    Low             High
[160]   :00:14.0        406d12001       102
[184]   :00:17.0        405756001       302
[248]   :00:1f.0        406d3b001       202
[251]   :00:1f.3        405497001       402
[254]   :00:1f.6        40662e001       502
Root Table Entries:
Bus 1 L: 401e03001 H: 0
Context Table Entries for Bus: 1
[entry] Device B:D.F    Low             High
[0]     :01:00.0        401e04001       602

Cc: Fenghua Yu 
Cc: Ashok Raj 
Co-Developed-by: Sohil Mehta 
Signed-off-by: Jacob Pan 
Signed-off-by: Sohil Mehta 
Signed-off-by: Gayatri Kammela 
---

v7: Split patch 1/5 and 2/5 differently
Update commit message and copyright year
Fix typo in a comment
Simplify code

v6: Change the order of includes to an alphabetical order
Change seq_printf formatting

v5: Change to a SPDX license tag
Fix seq_printf formatting

v4: Remove the unused function parameter
Fix checkpatch.pl warnings
Remove error reporting for debugfs_create_file function
Fix unnecessary reprogramming of the context entries
Simplify and merge the show context and extended context patch into one
Remove redundant IOMMU null check under for_each_active_iommu

v3: Add a macro for seq file operations 
Change the intel_iommu_ctx file name to dmar_translation_struct

v2: No change

 drivers/iommu/Kconfig |   8 +++
 drivers/iommu/Makefile|   1 +
 drivers/iommu/intel-iommu-debug.c | 129 ++
 drivers/iommu/intel-iommu.c   |   1 +
 include/linux/intel-iommu.h   |   6 ++
 5 files changed, 145 insertions(+)
 create mode 100644 drivers/iommu/intel-iommu-debug.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index f3a2134..332648f 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -152,6 +152,14 @@ config INTEL_IOMMU
  and include PCI device scope covered by these DMA
  remapping devices.
 
+config INTEL_IOMMU_DEBUG
+   bool "Export Intel IOMMU internals in Debugfs"
+   depends on INTEL_IOMMU && DEBUG_FS
+   help
+ Debugfs support to export IOMMU context internals, register contents,
+ PASID internals and interrupt remapping. To access this information in
+ sysfs, say Y.
+
 config INTEL_IOMMU_SVM
bool "Support for Shared Virtual Memory with Intel IOMMU"
depends on INTEL_IOMMU && X86
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 1fb6958..fdbaf46 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
 obj-$(CONFIG_DMAR_TABLE) += dmar.o
 obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o
+obj-$(CONFIG_INTEL_IOMMU_DEBUG) += intel-iommu-debug.o
 obj-$(CONFIG_INTEL_IOMMU_SVM) += intel-svm.o
 obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
 obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
diff --git a/drivers/iommu/intel-iommu-debug.c b/drivers/iommu/intel-iommu-debug.c
new file mode 100644
index 000..8253503
--- /dev/null
+++ b/drivers/iommu/intel-iommu-debug.c
@@ -0,0 +1,129 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright © 2018 Intel Corporation.
+ *
+ * Authors: Gayatri Kammela 
+ *  Jacob Pan 
+ *  Sohil Mehta 
+ */
+
+#define pr_fmt(fmt) "INTEL_IOMMU: " fmt
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "irq_remapping.h"
+
+#define TOTAL_BUS_NR   256 /* full bus range */
+#define DEFINE_SHOW_ATTRIBUTE(__name)  \
+static int __name ## _open(struct inode *inode, struct file *file) \
+{  \
+   return single_open(file, __name ## _show, inode->i_private);\
+}  \
+static const struct file_operations __name ## _fops =  \
+{  \
+   .open   = __name ## _open,  \
+   .read   = seq_read, \

Re: [PATCH] net: qlge: use memmove instead of skb_copy_to_linear_data

2018-02-02 Thread David Miller

From: Arnd Bergmann 
Date: Fri,  2 Feb 2018 16:45:44 +0100

> gcc-8 points out that the skb_copy_to_linear_data() argument points to
> the skb itself, which makes it run into a problem with overlapping
> memcpy arguments:
> 
> In file included from include/linux/ip.h:20,
>  from drivers/net/ethernet/qlogic/qlge/qlge_main.c:26:
> drivers/net/ethernet/qlogic/qlge/qlge_main.c: In function 'ql_realign_skb':
> include/linux/skbuff.h:3378:2: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
>   memcpy(skb->data, from, len);
> 
> It's unclear to me what the best solution is, maybe it ought to use a
> different helper that adjusts the skb data in a safe way. Simply using
> memmove() here seems like the easiest workaround.
> 
> Signed-off-by: Arnd Bergmann 

This looks fine, applied, thanks.
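
Editorial sketch of the distinction behind the warning quoted above: memcpy()
has undefined behaviour when source and destination overlap, while memmove()
is specified to handle that case. This is a plain userspace illustration, not
the qlge code; realign_left() and its arguments are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the qlge driver): shift `len` bytes that
 * start at buf + offset down to the start of buf. Source and destination
 * overlap whenever offset < len, so memmove() is required; memcpy() would
 * be undefined behaviour here.
 */
static void realign_left(char *buf, size_t offset, size_t len)
{
	memmove(buf, buf + offset, len);
}
```

Calling realign_left(buf, 2, 6) on a buffer holding "xxHELLO" moves "HELLO"
and its terminating NUL to the front, even though the two regions share bytes.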

Re: [GIT PULL] pin control bulk changes for v4.16

2018-02-02 Thread Linus Torvalds

On Fri, Feb 2, 2018 at 2:56 PM, Linus Torvalds
 wrote:
>
> so I would really prefer to speed up recompiles and just generally try
> to avoid horrible header file inclusion by doing the same thing in
> , adding just that
>
> struct dev_pin_info;
>
> declaration, and removing the  include.

It turns out that some pinctrl users seem to depend on this broken
situation, with at least

  drivers/pinctrl/core.c
  drivers/media/platform/sti/c8sectpfe/c8sectpfe-core.c
  drivers/pinctrl/pinctrl-ocelot.c
  drivers/pinctrl/bcm/pinctrl-iproc-gpio.c

expecting to magically get some of the pinctrl function declarations
not through some pinctrl header file, but just from <linux/device.h>.

Adding that include to <linux/pinctrl/pinctrl.h> would seem to make
those happy and make 'allmodconfig' build for me.

But I'm only testing x86-64. Can somebody test at least arm too?

Stupid patch attached. I don't know how much this helps the insane
dependency hell for <linux/device.h>, but it's bound to help
_some_.

Comments?

 Linus
 include/linux/device.h  | 2 +-
 include/linux/pinctrl/pinctrl.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index f649fc0c2571..b093405ed525 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -20,7 +20,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -41,6 +40,7 @@ struct fwnode_handle;
 struct iommu_ops;
 struct iommu_group;
 struct iommu_fwspec;
+struct dev_pin_info;
 
 struct bus_attribute {
struct attributeattr;
diff --git a/include/linux/pinctrl/pinctrl.h b/include/linux/pinctrl/pinctrl.h
index 5e45385c5bdc..8f5dbb84547a 100644
--- a/include/linux/pinctrl/pinctrl.h
+++ b/include/linux/pinctrl/pinctrl.h
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct device;
 struct pinctrl_dev;
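
The idea behind the patch, a forward declaration in place of a full header include, can be sketched in isolation as follows. This is only an illustration: struct demo_device and demo_device_has_pins() are hypothetical names, and the real struct device has many more members.

```c
#include <stddef.h>

/* Forward declaration: enough for any code that only stores or passes
 * a pointer to the structure. */
struct dev_pin_info;

/* Hypothetical reduced stand-in for struct device. */
struct demo_device {
	const char *name;
	struct dev_pin_info *pins;	/* pointer to an incomplete type is fine */
};

/* Code that only tests or copies the pointer needs no full definition. */
int demo_device_has_pins(const struct demo_device *dev)
{
	return dev->pins != NULL;
}
```

Only translation units that dereference the pointer need the header with the full definition, which is why the include can move out of the widely used header.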

Re: [PATCH] net: qed: use correct strncpy() size

2018-02-02 Thread David Miller

From: Arnd Bergmann 
Date: Fri,  2 Feb 2018 16:44:47 +0100

> passing the strlen() of the source string as the destination
> length is pointless, and gcc-8 now warns about it:
> 
> drivers/net/ethernet/qlogic/qed/qed_debug.c: In function 'qed_grc_dump':
> include/linux/string.h:253: error: 'strncpy' specified bound depends on the 
> length of the source argument [-Werror=stringop-overflow=]
> 
> This changes qed_grc_dump_big_ram() to instead use the length of
> the destination buffer, and use strscpy() to guarantee nul-termination.
> 
> Signed-off-by: Arnd Bergmann 

Applied.
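
As a rough userspace illustration of the point above, a copy should be bounded by the size of the destination, not by the length of the source. The helper copy_bounded() is hypothetical; it only approximates the behaviour of the kernel's strscpy() and returns -1 on truncation where the kernel helper returns -E2BIG.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the kernel's strscpy(): copy src into
 * dst, never writing more than dst_size bytes and always NUL-terminating
 * when dst_size is nonzero. Returns the number of characters copied, or -1
 * if src had to be truncated.
 */
long copy_bounded(char *dst, const char *src, size_t dst_size)
{
	size_t len;

	if (dst_size == 0)
		return -1;

	len = strlen(src);
	if (len >= dst_size) {
		memcpy(dst, src, dst_size - 1);
		dst[dst_size - 1] = '\0';
		return -1;
	}

	memcpy(dst, src, len + 1);	/* includes the terminating NUL */
	return (long)len;
}
```

Passing sizeof(dst) as the bound protects the destination; passing strlen(src) protects nothing, which is what the gcc-8 warning points out.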

Re: [PATCH] tools: libsubcmd: Drop the less hack that was inherited from Git.

2018-02-02 Thread Arvind Sankar

Hi, it looks like Sergey has put in a patch to fix the aliasing, looking
at the linux-next tree.

Are we still looking to remove this hack altogether?

Thanks

On Thu, Jan 25, 2018 at 08:24:26AM -0500, Arvind Sankar wrote:
> Thanks.
> 
> This was found because gcc 8 appears to be enabling -Wrestrict in -Wall,
> so there is a build failure with mainline gcc.
> 
> On Thu, Jan 25, 2018 at 05:16:52AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Jan 24, 2018 at 02:54:11PM -0600, Josh Poimboeuf escreveu:
> > > On Tue, Jan 23, 2018 at 07:38:37PM -0500, Arvind Sankar wrote:
> > > > We inherited this hack with the original code from the Git project. The
> > > > select call is invalid as the two fd_set pointers should not be aliased.
> > > > 
> > > > We could fix it, but the Git project removed this hack in 2012 in commit
> > > > e8320f3 (pager: drop "wait for output to run less" hack). The bug it
> > > > worked around was apparently fixed in less back in June 2007.
> > > > 
> > > > So remove the hack from here as well.
> > > > 
> > > > Signed-off-by: Arvind Sankar 
> > > 
> > > Looks good to me.
> > > 
> > >   Acked-by: Josh Poimboeuf 
> > > 
> > > Libsubcmd is used by perf and objtool, so adding the perf maintainers to
> > > CC.  Arnaldo, do you want to pick this one up?
> > 
> > Sure, I'll put it in my perf/core branch.
> > 
> > - Arnaldo
> >  
> > > > ---
> > > >  tools/lib/subcmd/pager.c   | 17 -
> > > >  tools/lib/subcmd/run-command.c |  2 --
> > > >  tools/lib/subcmd/run-command.h |  1 -
> > > >  3 files changed, 20 deletions(-)
> > > > 
> > > > diff --git a/tools/lib/subcmd/pager.c b/tools/lib/subcmd/pager.c
> > > > index 5ba754d17952..94d61d9b511f 100644
> > > > --- a/tools/lib/subcmd/pager.c
> > > > +++ b/tools/lib/subcmd/pager.c
> > > > @@ -1,5 +1,4 @@
> > > >  // SPDX-License-Identifier: GPL-2.0
> > > > -#include 
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > @@ -23,21 +22,6 @@ void pager_init(const char *pager_env)
> > > > subcmd_config.pager_env = pager_env;
> > > >  }
> > > >  
> > > > -static void pager_preexec(void)
> > > > -{
> > > > -   /*
> > > > -* Work around bug in "less" by not starting it until we
> > > > -* have real input
> > > > -*/
> > > > -   fd_set in;
> > > > -
> > > > -   FD_ZERO(&in);
> > > > -   FD_SET(0, &in);
> > > > -   select(1, &in, NULL, &in, NULL);
> > > > -
> > > > -   setenv("LESS", "FRSX", 0);
> > > > -}
> > > > -
> > > >  static const char *pager_argv[] = { "sh", "-c", NULL, NULL };
> > > >  static struct child_process pager_process;
> > > >  
> > > > @@ -84,7 +68,6 @@ void setup_pager(void)
> > > > pager_argv[2] = pager;
> > > > pager_process.argv = pager_argv;
> > > > pager_process.in = -1;
> > > > -   pager_process.preexec_cb = pager_preexec;
> > > >  
> > > > if (start_command(&pager_process))
> > > > return;
> > > > diff --git a/tools/lib/subcmd/run-command.c 
> > > > b/tools/lib/subcmd/run-command.c
> > > > index 5cdac2162532..9e9dca717ed7 100644
> > > > --- a/tools/lib/subcmd/run-command.c
> > > > +++ b/tools/lib/subcmd/run-command.c
> > > > @@ -120,8 +120,6 @@ int start_command(struct child_process *cmd)
> > > > unsetenv(*cmd->env);
> > > > }
> > > > }
> > > > -   if (cmd->preexec_cb)
> > > > -   cmd->preexec_cb();
> > > > if (cmd->exec_cmd) {
> > > > execv_cmd(cmd->argv);
> > > > } else {
> > > > diff --git a/tools/lib/subcmd/run-command.h 
> > > > b/tools/lib/subcmd/run-command.h
> > > > index 17d969c6add3..6256268802b5 100644
> > > > --- a/tools/lib/subcmd/run-command.h
> > > > +++ b/tools/lib/subcmd/run-command.h
> > > > @@ -46,7 +46,6 @@ struct child_process {
> > > > unsigned no_stderr:1;
> > > > unsigned exec_cmd:1; /* if this is to be external sub-command */
> > > > unsigned stdout_to_stderr:1;
> > > > -   void (*preexec_cb)(void);
> > > >  };
> > > >  
> > > >  int start_command(struct child_process *);
> > > > -- 
> > > > 2.13.6
> > > > 
> > > 
> > > -- 
> > > Josh

Re: [PATCH] net: cxgb4: avoid memcpy beyond end of source buffer

2018-02-02 Thread David Miller

From: Arnd Bergmann 
Date: Fri,  2 Feb 2018 16:18:37 +0100

> Building with link-time-optimizations revealed that the cxgb4 driver does
> a fixed-size memcpy() from a variable-length constant string into the
> network interface name:
> 
> In function 'memcpy',
> inlined from 'cfg_queues_uld.constprop' at 
> drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:335:2,
> inlined from 'cxgb4_register_uld.constprop' at 
> drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:719:9:
> include/linux/string.h:350:3: error: call to '__read_overflow2' declared with 
> attribute error: detected read beyond size of object passed as 2nd parameter
>__read_overflow2();
>^
> 
> I can see two equally workable solutions: either we use a strncpy() instead
> of the memcpy() to stop at the end of the input, or we make the source buffer
> fixed length as well. This implements the latter.
> 
> Signed-off-by: Arnd Bergmann 

Not the most pleasant thing in the world, but I can't think of a better
solution.

> @@ -355,7 +355,7 @@ struct cxgb4_lld_info {
>  };
>  
>  struct cxgb4_uld_info {
> - const char *name;
> + char name[IFNAMSIZ];
>   void *handle;
>   unsigned int nrxq;
>   unsigned int rxq_size;

David Laight asked how this can be the sole part of the patch.

All of these structures are initialized like:

static struct cxgb4_uld_info foo_uld_info = {
.name   = "foo",
...
};

So changing from "const char *" to "char []" just works.
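
A small standalone sketch of why the initializers keep working, and why the fixed-size copy becomes safe, is given below. The structure is a hypothetical reduced version for illustration only; NAME_LEN stands in for IFNAMSIZ, and copy_uld_name() is not a driver function.

```c
#include <string.h>

#define NAME_LEN 16	/* stands in for the kernel's IFNAMSIZ, which is 16 */

/* Hypothetical reduced version of the structure, for illustration only. */
struct uld_info {
	char name[NAME_LEN];	/* was: const char *name; */
	unsigned int nrxq;
};

/* The designated initializer is unchanged by the switch to an array;
 * the unused tail of name[] is zero-filled. */
const struct uld_info example_uld = {
	.name = "foo",
	.nrxq = 4,
};

/* A fixed-size copy now reads exactly sizeof(src->name) bytes, all of
 * which belong to the source object. */
void copy_uld_name(char dst[NAME_LEN], const struct uld_info *src)
{
	memcpy(dst, src->name, NAME_LEN);
}
```

With the pointer member, the same memcpy() would have read NAME_LEN bytes starting at a four-byte string literal, which is the overread the compiler reported.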

Re: [PATCH net 0/2] r8152: fix rx issues

2018-02-02 Thread David Miller

From: Hayes Wang 
Date: Fri, 2 Feb 2018 16:43:34 +0800

> The two patches are used to fix rx issues.

Series applied.

Re: [PATCH v3 14/21] fpga: dfl: add fpga manager platform driver for FME

2018-02-02 Thread Luebbers, Enno

Hi Hao, Alan,

On Fri, Feb 02, 2018 at 05:42:13PM +0800, Wu Hao wrote:
> On Thu, Feb 01, 2018 at 04:00:36PM -0600, Alan Tull wrote:
> > On Mon, Nov 27, 2017 at 12:42 AM, Wu Hao  wrote:
> > 
> > Hi Hao,
> > 
> > A few comments below.   Besides that, looks good.
> > 
> > > This patch adds fpga manager driver for FPGA Management Engine (FME). It
> > > implements fpga_manager_ops for FPGA Partial Reconfiguration function.
> > >
> > > Signed-off-by: Tim Whisonant 
> > > Signed-off-by: Enno Luebbers 
> > > Signed-off-by: Shiva Rao 
> > > Signed-off-by: Christopher Rauer 
> > > Signed-off-by: Kang Luwei 
> > > Signed-off-by: Xiao Guangrong 
> > > Signed-off-by: Wu Hao 
> > > 
> > > v3: rename driver to dfl-fpga-fme-mgr
> > > implemented status callback for fpga manager
> > > rebased due to fpga api changes
> > > ---
> > >  .../ABI/testing/sysfs-platform-fpga-dfl-fme-mgr|   8 +
> > >  drivers/fpga/Kconfig   |   6 +
> > >  drivers/fpga/Makefile  |   1 +
> > >  drivers/fpga/fpga-dfl-fme-mgr.c| 318 
> > > +
> > >  drivers/fpga/fpga-dfl.h|  39 ++-
> > >  5 files changed, 371 insertions(+), 1 deletion(-)
> > >  create mode 100644 
> > > Documentation/ABI/testing/sysfs-platform-fpga-dfl-fme-mgr
> > >  create mode 100644 drivers/fpga/fpga-dfl-fme-mgr.c
> > >
> > > diff --git a/Documentation/ABI/testing/sysfs-platform-fpga-dfl-fme-mgr 
> > > b/Documentation/ABI/testing/sysfs-platform-fpga-dfl-fme-mgr
> > > new file mode 100644
> > > index 000..2d4f917
> > > --- /dev/null
> > > +++ b/Documentation/ABI/testing/sysfs-platform-fpga-dfl-fme-mgr
> > > @@ -0,0 +1,8 @@
> > > +What:  /sys/bus/platform/devices/fpga-dfl-fme-mgr.0/interface_id
> > > +Date:  November 2017
> > > +KernelVersion:  4.15
> > > +Contact:   Wu Hao 
> > > +Description:   Read-only. It returns interface id of partial 
> > > reconfiguration
> > > +   hardware. Userspace could use this information to check if
> > > +   current hardware is compatible with given image before 
> > > FPGA
> > > +   programming.
> > 
> > I'm a little confused by this.  I can understand that the PR bitstream
> > has a dependency on the FPGA's static image, but I don't understand
> > > the dependency of the bitstream on the hardware that is used to program
> > the bitstream to the FPGA.
> 
> Sorry for the confusion, the interface_id is used to indicate the version of
> the hardware for partial reconfiguration (it's part of the static image of
> the FPGA device). Will improve the description on this.
> 

The interface_id expresses the compatibility of the static region with PR
bitstreams generated for it. It changes every time a new static region is
generated.

Would it make more sense to have the interface_id exposed as part of the FME
device (which represents the static region)? I'm not sure - it kind of also
makes sense here, where you would have all the information in one place (if the
interface_id matches, I can use this component to program a bitstream).

Sorry for my limited understanding of the infrastructure - would this same
"fpga-dfl-fme-mgr.0" be used for PR if we had multiple PR regions? In that case
it would need to expose multiple interface_ids (or we'd have to track both
interface IDs and an identifier for the target PR region).

> > 
> > > diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> > > index 57da904..0171ecb 100644
> > > --- a/drivers/fpga/Kconfig
> > > +++ b/drivers/fpga/Kconfig
> > > @@ -150,6 +150,12 @@ config FPGA_DFL_FME
> > >   FPGA platform level management features. There shall be 1 FME
> > >   per DFL based FPGA device.
> > >
> > > +config FPGA_DFL_FME_MGR
> > > +   tristate "FPGA DFL FME Manager Driver"
> > > +   depends on FPGA_DFL_FME
> > > +   help
> > > + Say Y to enable FPGA Manager driver for FPGA Management Engine.
> > > +
> > >  config INTEL_FPGA_DFL_PCI
> > > tristate "Intel FPGA DFL PCIe Device Driver"
> > > depends on PCI && FPGA_DFL
> > > diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile
> > > index cc75bb3..6378580 100644
> > > --- a/drivers/fpga/Makefile
> > > +++ b/drivers/fpga/Makefile
> > > @@ -31,6 +31,7 @@ obj-$(CONFIG_OF_FPGA_REGION)  += 
> > > of-fpga-region.o
> > >  # FPGA Device Feature List Support
> > >  obj-$(CONFIG_FPGA_DFL) += fpga-dfl.o
> > >  obj-$(CONFIG_FPGA_DFL_FME) += fpga-dfl-fme.o
> > > +obj-$(CONFIG_FPGA_DFL_FME_MGR) += fpga-dfl-fme-mgr.o
> > >
> > >  fpga-dfl-fme-objs := dfl-fme-main.o dfl-fme-pr.o
> > >
> > > diff --git a/drivers/fpga/fpga-dfl-fme-mgr.c 
> > > b/drivers/fpga/fpga-dfl-fme-mgr.c
> > > new file mode 100644
> > > index 000..70356ce
> > > --- /dev/null
> > > +++ b/drivers/fpga/fpga-dfl-fme-mgr.c
> > > @@ -0,0 +1,318 @@
> > > +/*
> > > + * FPGA Manager Driver for FPGA Management Engine (FME)
> >

Re: [PATCH v3 3/3] ARM: dts: imx: Add memory node unit name

2018-02-02 Thread Fabio Estevam

On Wed, Jan 24, 2018 at 11:22 AM, Marco Franchi  wrote:
> Fix the following warnings from dtc by adding the unit name to memory
> nodes:
>
> Warning (unit_address_vs_reg): Node /memory has a reg or ranges property, but 
> no unit name
>
> Converted using the following command:
>
> perl -p0777i -e 's/memory \{\n\t\treg = \<0x+([0-9a-f])/memory\@$1$\000 
> \{\n\t\treg = <0x$1/m' `find ./arch/arm/boot/dts -name "imx*"`
>
> The files below were manually fixed:
> -imx1-ads.dts
> -imx1-apf9328.dts
> -imx6q-pistachio.dts
>
> Signed-off-by: Marco Franchi 

Reviewed-by: Fabio Estevam

Re: [PATCH v3 2/3] ARM: dts: imx: Remove empty memory size nodes

2018-02-02 Thread Fabio Estevam

On Wed, Jan 24, 2018 at 11:22 AM, Marco Franchi  wrote:
> Remove the empty reg property from the SoC dtsi files in order to avoid
> duplicate memory nodes when the correct size is passed in board dts files.
>
> Signed-off-by: Marco Franchi 

Reviewed-by: Fabio Estevam

Re: [PATCH v3 1/3] ARM: dts: imx: Pass empty memory size on board dts

2018-02-02 Thread Fabio Estevam

On Wed, Jan 24, 2018 at 11:22 AM, Marco Franchi  wrote:
> In preparation for removing 'reg = <0 0>;' from the dtsi SoC files, pass
> 'reg = <0 0 >;' to the dts/dtsi board files that do not pass the memory
> size.
>
> Signed-off-by: Marco Franchi 

Reviewed-by: Fabio Estevam

Re: [PATCH 1/2] iommu: Fix iommu_unmap and iommu_unmap_fast return type

2018-02-02 Thread kbuild test robot

Hi Suravee,

I love your patch! Perhaps something to improve:

[auto build test WARNING on iommu/next]
[also build test WARNING on v4.15 next-20180202]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Suravee-Suthikulpanit/iommu-Fix-iommu_unmap-and-iommu_unmap_fast-return-type/20180203-015316
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> drivers/iommu/qcom_iommu.c:592:27: sparse: incorrect type in initializer 
>> (different signedness) @@ expected long ( )( ... ) @@ got unsigned long ( )( 
>> ... ) @@
   drivers/iommu/qcom_iommu.c:592:27: expected long ( )( ... )
   drivers/iommu/qcom_iommu.c:592:27: got unsigned long ( )( ... )
   drivers/iommu/qcom_iommu.c:592:12: error: initialization from incompatible 
pointer type
.unmap = qcom_iommu_unmap,
^~~~
   drivers/iommu/qcom_iommu.c:592:12: note: (near initialization for 
'qcom_iommu_ops.unmap')
   cc1: some warnings being treated as errors

vim +592 drivers/iommu/qcom_iommu.c

0ae349a0f3 Rob Clark2017-08-09  584  
0ae349a0f3 Rob Clark2017-08-09  585  static const struct iommu_ops 
qcom_iommu_ops = {
0ae349a0f3 Rob Clark2017-08-09  586 .capable= 
qcom_iommu_capable,
0ae349a0f3 Rob Clark2017-08-09  587 .domain_alloc   = 
qcom_iommu_domain_alloc,
0ae349a0f3 Rob Clark2017-08-09  588 .domain_free= 
qcom_iommu_domain_free,
0ae349a0f3 Rob Clark2017-08-09  589 .attach_dev = 
qcom_iommu_attach_dev,
0ae349a0f3 Rob Clark2017-08-09  590 .detach_dev = 
qcom_iommu_detach_dev,
0ae349a0f3 Rob Clark2017-08-09  591 .map= 
qcom_iommu_map,
0ae349a0f3 Rob Clark2017-08-09 @592 .unmap  = 
qcom_iommu_unmap,
0ae349a0f3 Rob Clark2017-08-09  593 .map_sg = 
default_iommu_map_sg,
4d689b6194 Robin Murphy 2017-09-28  594 .flush_iotlb_all = 
qcom_iommu_iotlb_sync,
4d689b6194 Robin Murphy 2017-09-28  595 .iotlb_sync = 
qcom_iommu_iotlb_sync,
0ae349a0f3 Rob Clark2017-08-09  596 .iova_to_phys   = 
qcom_iommu_iova_to_phys,
0ae349a0f3 Rob Clark2017-08-09  597 .add_device = 
qcom_iommu_add_device,
0ae349a0f3 Rob Clark2017-08-09  598 .remove_device  = 
qcom_iommu_remove_device,
0ae349a0f3 Rob Clark2017-08-09  599 .device_group   = 
generic_device_group,
0ae349a0f3 Rob Clark2017-08-09  600 .of_xlate   = 
qcom_iommu_of_xlate,
0ae349a0f3 Rob Clark2017-08-09  601 .pgsize_bitmap  = SZ_4K | 
SZ_64K | SZ_1M | SZ_16M,
0ae349a0f3 Rob Clark2017-08-09  602  };
0ae349a0f3 Rob Clark2017-08-09  603  

:: The code at line 592 was first introduced by commit
:: 0ae349a0f33fb040a2bc228fdc6d60111455feab iommu/qcom: Add qcom_iommu

:: TO: Rob Clark 
:: CC: Joerg Roedel 

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation
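
The class of mismatch reported above can be reproduced in miniature. The sketch below assumes, based on the sparse output, that the ops table now expects a signed return type; struct demo_ops and demo_unmap() are hypothetical and far simpler than the real struct iommu_ops.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical reduced ops table; the real struct iommu_ops has many
 * more members and different parameter types. */
struct demo_ops {
	ssize_t (*unmap)(unsigned long iova, size_t size);
};

/* Every implementation must use exactly the return type declared in the
 * ops table. A driver left returning size_t would trigger the
 * "incompatible pointer type" diagnostic when assigned below. */
ssize_t demo_unmap(unsigned long iova, size_t size)
{
	(void)iova;
	return (ssize_t)size;	/* pretend the whole range was unmapped */
}

const struct demo_ops demo_driver_ops = {
	.unmap = demo_unmap,
};
```

When a callback's signature changes, every driver that fills in that member has to be converted in the same series, otherwise builds like the one above break.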

Re: KASAN: use-after-free Read in __list_add_valid (3)

2018-02-02 Thread Eric Biggers

On Wed, Jan 24, 2018 at 11:57:01PM -0800, syzbot wrote:
> Hello,
> 
> syzbot hit the following crash on upstream commit
> 1f07476ec143bbed7bf0b641749783b1094b4c4f (Tue Jan 23 20:45:40 2018 +)
> Merge tag 'pci-v4.15-fixes-3' of
> git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
> 
> So far this crash happened 6 times on upstream.
> Unfortunately, I don't have any reproducer for this crash yet.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+fbbedb95ed1d1e957...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
> 
> ==
> BUG: KASAN: use-after-free in __list_add_valid+0xb1/0xd0 lib/list_debug.c:23
> Read of size 8 at addr 8801bb5893d0 by task syz-executor1/28911
> 
> CPU: 0 PID: 28911 Comm: syz-executor1 Not tainted 4.15.0-rc9+ #277
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  print_address_description+0x73/0x250 mm/kasan/report.c:252
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x25b/0x340 mm/kasan/report.c:409
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
>  __list_add_valid+0xb1/0xd0 lib/list_debug.c:23
>  __list_add include/linux/list.h:60 [inline]
>  list_add include/linux/list.h:79 [inline]
>  __add_wait_queue include/linux/wait.h:156 [inline]
>  add_wait_queue+0xcf/0x290 kernel/sched/wait.c:30
>  vhost_poll_func+0x3d/0x50 drivers/vhost/vhost.c:165
>  poll_wait include/linux/poll.h:46 [inline]
>  eventfd_poll+0xe8/0x1f0 fs/eventfd.c:123
>  vhost_poll_start+0x97/0x1c0 drivers/vhost/vhost.c:212
>  vhost_vring_ioctl+0xe28/0x19b0 drivers/vhost/vhost.c:1556
>  vhost_net_ioctl+0x9df/0x1910 drivers/vhost/net.c:1320
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
>  SYSC_ioctl fs/ioctl.c:701 [inline]
>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>  entry_SYSCALL_64_fastpath+0x29/0xa0
> RIP: 0033:0x452f19
> RSP: 002b:7f33da20dc58 EFLAGS: 0212 ORIG_RAX: 0010
> RAX: ffda RBX: 7f33da20e700 RCX: 00452f19
> RDX: 2088 RSI: 4008af20 RDI: 0017
> RBP: 00a2f850 R08:  R09: 
> R10:  R11: 0212 R12: 
> R13: 00a2f7cf R14: 7f33da20e9c0 R15: 0006
> 
> Allocated by task 28878:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
>  __do_kmalloc_node mm/slab.c:3672 [inline]
>  __kmalloc_node+0x47/0x70 mm/slab.c:3679
>  kmalloc_node include/linux/slab.h:541 [inline]
>  kvmalloc_node+0x64/0xd0 mm/util.c:397
>  kvmalloc include/linux/mm.h:541 [inline]
>  vhost_net_open+0x27/0x670 drivers/vhost/net.c:902
>  misc_open+0x382/0x500 drivers/char/misc.c:154
>  chrdev_open+0x257/0x730 fs/char_dev.c:417
>  do_dentry_open+0x667/0xd40 fs/open.c:752
>  vfs_open+0x107/0x220 fs/open.c:866
>  do_last fs/namei.c:3379 [inline]
>  path_openat+0x1151/0x3530 fs/namei.c:3519
>  do_filp_open+0x25b/0x3b0 fs/namei.c:3554
>  do_sys_open+0x502/0x6d0 fs/open.c:1059
>  SYSC_openat fs/open.c:1086 [inline]
>  SyS_openat+0x30/0x40 fs/open.c:1080
>  entry_SYSCALL_64_fastpath+0x29/0xa0
> 
> Freed by task 28878:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
>  __cache_free mm/slab.c:3488 [inline]
>  kfree+0xd6/0x260 mm/slab.c:3803
>  kvfree+0x36/0x60 mm/util.c:416
>  vhost_net_release+0x159/0x190 drivers/vhost/net.c:1012
>  __fput+0x327/0x7e0 fs/file_table.c:210
>  fput+0x15/0x20 fs/file_table.c:244
>  task_work_run+0x199/0x270 kernel/task_work.c:113
>  tracehook_notify_resume include/linux/tracehook.h:191 [inline]
>  exit_to_usermode_loop+0x296/0x310 arch/x86/entry/common.c:162
>  prepare_exit_to_usermode arch/x86/entry/common.c:195 [inline]
>  syscall_return_slowpath+0x490/0x550 arch/x86/entry/common.c:264
>  entry_SYSCALL_64_fastpath+0x9e/0xa0
> 
> The buggy address belongs to the object at 8801bb589140
>  which belongs to the cache kmalloc-65536 of size 65536
> The buggy address is located 656 bytes inside of
>  65536-byte region [8801bb589140, 8801bb599140)
> The buggy address belongs to the page:
> page:ea0006ed6000 count:1 mapcount:0 mapping:8801bb589140 index:0x0
> compound_mapcount: 0
> flags: 0x2fffc008100(slab|head)
> raw: 02fffc008100 8801bb589140  00010001
> raw: ea00068b0820 ea000693d020 8801dac02500 
> page dumped because: kasan: bad access detecte

Re: [PATCH 2/2] MAINTAINERS: list file memory-barriers.txt within the LKMM entry

2018-02-02 Thread Paul E. McKenney

On Fri, Feb 02, 2018 at 10:13:42AM +0100, Andrea Parri wrote:
> Now that a formal specification of the LKMM has become available to
> the developer, some concern about how to track changes to the model
> on the level of the "high-level documentation" was raised.
> 
> A first "mitigation" to this issue, suggested by Will, is to assign
> maintainership (and responsibility!!)  of such documentation (here,
> memory-barriers.txt) to the maintainers of the LKMM themselves.
> 
> Suggested-by: Will Deacon 
> Signed-off-by: Andrea Parri 

Very good, thank you, queued!  Please see below for the usual commit-log
rework.  BTW, in future submissions, could you please capitalize the
first word after the colon (":") in the subject line?  It is all too
easy for me to forget to change this, as Ingo can attest.  ;-)

If we are going to continue to use the LKMM acronym, should we make the
first line of the MAINTAINERS block look something like this?

LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)

One alternative would be to start calling it LKMCM, though that does
look a bit like a Roman numeral.  ;-)

Thanx, Paul

commit 2f80571625dc2d1977acdef79267ba1645b07c53
Author: Andrea Parri 
Date:   Fri Feb 2 10:13:42 2018 +0100

MAINTAINERS: List file memory-barriers.txt within the LKMM entry

We now have a shiny new Linux-kernel memory model (LKMM) and the old
tried-and-true Documentation/memory-barriers.txt.  It would be good to
keep these automatically synchronized, but in the meantime we need to at
least let people know that they are related.  Will suggested adding the
Documentation/memory-barriers.txt file to the LKMM maintainership list,
thus making the LKMM maintainers responsible for both the old and the new.
This commit follows Will's excellent suggestion.

Suggested-by: Will Deacon 
Signed-off-by: Andrea Parri 
Signed-off-by: Paul E. McKenney 

diff --git a/MAINTAINERS b/MAINTAINERS
index ba4dc08fbe95..e6ad9b44e8fb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8101,6 +8101,7 @@ L:linux-kernel@vger.kernel.org
 S: Supported
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
 F: tools/memory-model/
+F: Documentation/memory-barriers.txt

 LINUX SECURITY MODULE (LSM) FRAMEWORK
 M: Chris Wright

Re: [PATCH 1/2] tools/memory-model: clarify the origin/scope of the tool name

2018-02-02 Thread Andrea Parri

On Fri, Feb 02, 2018 at 03:19:22PM -0800, Paul E. McKenney wrote:
> On Fri, Feb 02, 2018 at 11:44:21AM +0100, Andrea Parri wrote:
> > On Thu, Feb 01, 2018 at 03:09:41PM -0800, Paul E. McKenney wrote:
> > > On Thu, Feb 01, 2018 at 10:26:50AM -0500, Alan Stern wrote:
> > > > On Thu, 1 Feb 2018, Andrea Parri wrote:
> > > > 
> > > > > Ingo pointed out that:
> > > > > 
> > > > >   "The "memory model" name is overly generic, ambiguous and somewhat
> > > > >misleading, as we usually mean the virtual memory layout/model
> > > > >when we say "memory model". GCC too uses it in that sense [...]"
> > > > > 
> > > > > Make it clearer that, in the context of tools/memory-model/, the term
> > > > > "memory-model" is used as shorthand for "memory consistency model" by
> > > > > calling out this convention in tools/memory-model/README.
> > > > > 
> > > > > Stick to the full name in sources' headers and for the subsystem name.
> > > > > 
> > > > > Suggested-by: Ingo Molnar 
> > > > > Signed-off-by: Andrea Parri 
> > > > 
> > > > For both patches:
> > > > 
> > > > Acked-by: Alan Stern 
> > > 
> > > Thank you all -- I have queued this and pushed it to my RCU tree on
> > > branch lkmm.  I did reword the commit log a bit, please see below and
> > > please let me know if any of my rewordings need halp.
> > 
> > Seems to me that your message has a leftover "is used".
> 
> Good catch, how about this instead?

Looks good to me. The same for 12a62a1d07031.

Thanks,
  Andrea


> 
>   Thanx, Paul
> 
> ---
> 
> commit 2b1b4ab5166209da849f306fbdc84114d9e611fd
> Author: Andrea Parri 
> Date:   Thu Feb 1 13:03:29 2018 +0100
> 
> tools/memory-model: Clarify the origin/scope of the tool name
> 
> Ingo pointed out that:
> 
>   "The "memory model" name is overly generic, ambiguous and somewhat
>misleading, as we usually mean the virtual memory layout/model
>when we say "memory model". GCC too uses it in that sense [...]"
> 
> Make it clear that tools/memory-model/ uses the term "memory model" as
> shorthand for "memory consistency model" by calling out this convention
> in tools/memory-model/README.
> 
> Stick to the original "memory model" term in sources' headers and for
> the subsystem name.
> 
> Suggested-by: Ingo Molnar 
> Signed-off-by: Andrea Parri 
> Acked-by: Will Deacon 
> Acked-by: Alan Stern 
> Signed-off-by: Paul E. McKenney 
> 
> diff --git a/tools/memory-model/MAINTAINERS b/tools/memory-model/MAINTAINERS
> index 711cbe72d606..db3bd3fc0435 100644
> --- a/tools/memory-model/MAINTAINERS
> +++ b/tools/memory-model/MAINTAINERS
> @@ -1,4 +1,4 @@
> -LINUX KERNEL MEMORY MODEL
> +LINUX KERNEL MEMORY CONSISTENCY MODEL
>  M:   Alan Stern 
>  M:   Andrea Parri 
>  M:   Will Deacon 
> diff --git a/tools/memory-model/README b/tools/memory-model/README
> index 43ba49492111..91414a49fac5 100644
> --- a/tools/memory-model/README
> +++ b/tools/memory-model/README
> @@ -1,15 +1,15 @@
> - =
> - LINUX KERNEL MEMORY MODEL
> - =
> + =
> + LINUX KERNEL MEMORY CONSISTENCY MODEL
> + =
>  
>  
>  INTRODUCTION
>  
>  
> -This directory contains the memory model of the Linux kernel, written
> -in the "cat" language and executable by the (externally provided)
> -"herd7" simulator, which exhaustively explores the state space of
> -small litmus tests.
> +This directory contains the memory consistency model (memory model, for
> +short) of the Linux kernel, written in the "cat" language and executable
> +by the externally provided "herd7" simulator, which exhaustively explores
> +the state space of small litmus tests.
>  
>  In addition, the "klitmus7" tool (also externally provided) may be used
>  to convert a litmus test to a Linux kernel module, which in turn allows
> diff --git a/tools/memory-model/linux-kernel.bell 
> b/tools/memory-model/linux-kernel.bell
> index 57112505f5e0..b984bbda01a5 100644
> --- a/tools/memory-model/linux-kernel.bell
> +++ b/tools/memory-model/linux-kernel.bell
> @@ -11,7 +11,7 @@
>   * which is to appear in ASPLOS 2018.
>   *)
>  
> -"Linux kernel memory model"
> +"Linux-kernel memory consistency model"
>  
>  enum Accesses = 'once (*READ_ONCE,WRITE_ONCE,ACCESS_ONCE*) ||
>   'release (*smp_store_release*) ||
> diff --git a/tools/memory-model/linux-kernel.cat 
> b/tools/memory-model/linux-kernel.cat
> index 15b7a5dd8a9a..babe2b3b0bb3 100644
> --- a/tools/memory-model/linux-kernel.cat
> +++ b/tools/memory-model/linux-kernel.cat
> @@ -11,7 +11,7 @@
>   * which is to appear in ASPLOS 2018.
>   *)
>  
> -"Linux kernel memory model"
> +"Linux-kernel memory consistency model"
>

Re: Can RCU stall lead to hard lockups?

2018-02-02 Thread Serge E. Hallyn

Quoting Paul E. McKenney (paul...@linux.vnet.ibm.com):
> On Tue, Jan 09, 2018 at 06:11:14AM -0800, Tejun Heo wrote:
> > Hello, Paul.
> > 
> > On Mon, Jan 08, 2018 at 08:24:25PM -0800, Paul E. McKenney wrote:
> > > > I don't know the RCU code at all but it *looks* like the first CPU is
> > > > taking a sweet while flushing printk buffer while holding a lock (the
> > > > console is IPMI serial console, which faithfully emulates 115200 baud
> > > > rate), and everyone else seems stuck waiting for that spinlock in
> > > > rcu_check_callbacks().
> > > > 
> > > > Does this sound possible?
> > > 
> > > 115200 baud?  Ouch!!!  That -will- result in trouble from console
> > > printing, and often also in RCU CPU stall warnings.
> > 
> > It could even be slower than 115200, and we occasionally see RCU
> > stall warnings caused by printk storms, for example, while the kernel
> > is trying to dump a lot of info after an OOM.  That's an issue we
> > probably want to improve from printk side; however, they don't usually
> > lead to NMI hard lockup detector kicking in and crashing the machine,
> > which is the peculiarity here.
> > 
> > Hmmm... show_state_filter(), the function which dumps all task
> > backtraces, shares a similar problem and it avoids it by explicitly
> > calling touch_nmi_watchdog().  Maybe we can do something like the
> > following from RCU too?
> 
> If this fixes things for you, I would welcome such a patch.

Hi - would this also be relevant to 4.9-stable and 4.4-stable, or
has something elsewhere changed after 4.9 that actually triggers this?

thanks,
-serge

>   Thanx, Paul
> 
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index db85ca3..3c4c4d3 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -561,8 +561,14 @@ static void rcu_print_detail_task_stall_rnp(struct 
> > rcu_node *rnp)
> > }
> > t = list_entry(rnp->gp_tasks->prev,
> >struct task_struct, rcu_node_entry);
> > -   list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry)
> > +   list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
> > +   touch_nmi_watchdog();
> > +   /*
> > +* We could be printing a lot of these messages while
> > +* holding a spinlock.  Avoid triggering hard lockup.
> > +*/
> > sched_show_task(t);
> > +   }
> > raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> >  }
> > 
> > @@ -1678,6 +1684,12 @@ static void print_cpu_stall_info(struct rcu_state 
> > *rsp, int cpu)
> > char *ticks_title;
> > unsigned long ticks_value;
> > 
> > +   /*
> > +* We could be printing a lot of these messages while holding a
> > +* spinlock.  Avoid triggering hard lockup.
> > +*/
> > +   touch_nmi_watchdog();
> > +
> > if (rsp->gpnum == rdp->gpnum) {
> > ticks_title = "ticks this GP";
> > ticks_value = rdp->ticks_this_gp;
> >

Re: [PATCH] Fix typo IBRS_ATT, which should be IBRS_ALL

2018-02-02 Thread David Woodhouse

On Fri, 2018-02-02 at 19:12 +, Darren Kenny wrote:
> Fixes a typo in commit 117cc7a908c83697b0b737d15ae1eb5943afe35b
> ("x86/retpoline: Fill return stack buffer on vmexit")
> 
> Signed-off-by: Darren Kenny 
> Reviewed-by: Konrad Rzeszutek Wilk 

Not strictly a typo; that was the original name for it. "IBRS all the
time". But yes, it should be IBRS_ALL now.

Acked-by: David Woodhouse 



Re: INFO: task hung in bpf_exit_net

2018-02-02 Thread Eric Biggers

On Fri, Dec 22, 2017 at 05:04:37PM -0200, Marcelo Ricardo Leitner wrote:
> On Fri, Dec 22, 2017 at 04:28:07PM -0200, Marcelo Ricardo Leitner wrote:
> > On Fri, Dec 22, 2017 at 11:58:08AM +0100, Dmitry Vyukov wrote:
> > ...
> > > > Same with this one, perhaps related to / fixed by:
> > > > http://patchwork.ozlabs.org/patch/850957/
> > > >
> > > 
> > > 
> > > 
> > > Looking at the log, this one seems to be an infinite loop in SCTP code
> > > with console output in it. The kernel is busy printing a gazillion of:
> > > 
> > > [  176.491099] sctp: sctp_transport_update_pmtu: Reported pmtu 508 too
> > > low, using default minimum of 512
> > > ** 110 printk messages dropped **
> > > [  176.503409] sctp: sctp_transport_update_pmtu: Reported pmtu 508 too
> > > low, using default minimum of 512
> > > ** 103 printk messages dropped **
> > > ...
> > > [  246.742374] sctp: sctp_transport_update_pmtu: Reported pmtu 508 too
> > > low, using default minimum of 512
> > > [  246.742484] sctp: sctp_transport_update_pmtu: Reported pmtu 508 too
> > > low, using default minimum of 512
> > > [  246.742590] sctp: sctp_transport_update_pmtu: Reported pmtu 508 too
> > > low, using default minimum of 512
> > > 
> > > Looks like a different issue.
> > > 
> > 
> > Oh. I guess this is caused by the interface having a MTU smaller than
> > SCTP_DEFAULT_MINSEGMENT (512), as the icmp frag needed handler
> > (sctp_icmp_frag_needed) will trigger an instant retransmission.
> > But as the MTU is smaller, SCTP won't update it, but will issue the
> > retransmission anyway.
> > 
> > I will test this soon. Should be fairly easy to trigger it.
> 
> Reproduced it.
> 
> netns A veth0(1500) - veth1(1500) B veth2(508) - veth3(508) C
> 
> When A sends an SCTP packet bigger than 508 bytes, it triggers the issue: B
> replies with an ICMP frag needed carrying a size that SCTP won't accept, but
> SCTP retransmits anyway.
> 

syzbot hasn't encountered this hang again (although it only happened once in
the first place).  I assume it was fixed by commit b6c5734db070, so I'm telling
syzbot this:

#syz fix: sctp: fix the handling of ICMP Frag Needed for too small MTUs

- Eric
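
For readers following the thread, the retransmission loop described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct transport`, `update_pmtu()` and `frag_needed_should_retransmit()` are hypothetical names, `MIN_SEGMENT` stands in for SCTP_DEFAULT_MINSEGMENT, and the guard shown is one plausible way to break the loop, not a copy of commit b6c5734db070.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_SEGMENT 512u  /* stands in for SCTP_DEFAULT_MINSEGMENT */

/* Hypothetical, simplified transport state; not the kernel's struct. */
struct transport {
    uint32_t pathmtu;
};

/*
 * Apply a reported PMTU. Values below the minimum are clamped to it.
 * Returns true only if the stored path MTU actually changed.
 */
static bool update_pmtu(struct transport *t, uint32_t reported)
{
    uint32_t pmtu = reported < MIN_SEGMENT ? MIN_SEGMENT : reported;

    if (pmtu == t->pathmtu)
        return false;
    t->pathmtu = pmtu;
    return true;
}

/*
 * Decide whether an ICMP "frag needed" should trigger a retransmission.
 * Retransmitting when nothing changed only provokes the same ICMP again.
 */
static bool frag_needed_should_retransmit(struct transport *t, uint32_t reported)
{
    return update_pmtu(t, reported);
}
```

With a 508-byte link, the first report lowers the path MTU to 512 and a retransmission is reasonable; every later report of 508 changes nothing, so the model declines to retransmit instead of looping.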

Re: [PATCH] xen: hypercall: fix out-of-bounds memcpy

2018-02-02 Thread Boris Ostrovsky

On 02/02/2018 10:32 AM, Arnd Bergmann wrote:
> The legacy hypercall handlers were originally added with
> a comment explaining that "copying the argument structures in
> HYPERVISOR_event_channel_op() and HYPERVISOR_physdev_op() into the local
> variable is sufficiently safe" and only made sure to not write
> past the end of the argument structure. The checks in linux/string.h
> disagree with that when link-time optimizations are used:
>
> In function 'memcpy',
> inlined from 'pirq_query_unmask' at drivers/xen/fallback.c:53:2,
> inlined from '__startup_pirq' at drivers/xen/events/events_base.c:529:2,
> inlined from 'restore_pirqs' at drivers/xen/events/events_base.c:1439:3,
> inlined from 'xen_irq_resume' at drivers/xen/events/events_base.c:1581:2:
> include/linux/string.h:350:3: error: call to '__read_overflow2' declared with 
> attribute error: detected read beyond size of object passed as 2nd parameter
>__read_overflow2();
>^
> make[3]: *** [ccLujFNx.ltrans15.ltrans.o] Error 1
> make[3]: Target 'all' not remade because of errors.
> lto-wrapper: fatal error: make returned 2 exit status
> compilation terminated.
> ld: error: lto-wrapper failed
>
> This changes the functions so that each argument is accessed with
> exactly the correct length based on the command code.
>
> Fixes: cf47a83fb06e ("xen/hypercall: fix hypercall fallback code for very old 
> hypervisors")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/xen/fallback.c | 94 
> --
>  1 file changed, 53 insertions(+), 41 deletions(-)
>
> diff --git a/drivers/xen/fallback.c b/drivers/xen/fallback.c
> index b04fb64c5a91..eded8dd821ad 100644
> --- a/drivers/xen/fallback.c
> +++ b/drivers/xen/fallback.c
> @@ -7,75 +7,87 @@
>  
>  int xen_event_channel_op_compat(int cmd, void *arg)
>  {
> - struct evtchn_op op;
> + struct evtchn_op op = { .cmd = cmd, };
> + size_t len;
>   int rc;
>  
> - op.cmd = cmd;
> - memcpy(&op.u, arg, sizeof(op.u));
> - rc = _hypercall1(int, event_channel_op_compat, &op);
> -
>   switch (cmd) {
> + case EVTCHNOP_bind_interdomain:
> + len = sizeof(struct evtchn_bind_interdomain);
> + break;
> + case EVTCHNOP_bind_virq:
> + len = sizeof(struct evtchn_bind_virq);
> + break;
> + case EVTCHNOP_bind_pirq:
> + len = sizeof(struct evtchn_bind_pirq);
> + break;
>   case EVTCHNOP_close:
> + len = sizeof(struct evtchn_close);
> + break;
>   case EVTCHNOP_send:
> + len = sizeof(struct evtchn_send);
> + break;
> + case EVTCHNOP_alloc_unbound:
> + len = sizeof(struct evtchn_alloc_unbound);
> + break;
> + case EVTCHNOP_bind_ipi:
> + len = sizeof(struct evtchn_bind_ipi);
> + break;
> + case EVTCHNOP_status:
> + len = sizeof(struct evtchn_status);
> + break;
>   case EVTCHNOP_bind_vcpu:
> + len = sizeof(struct evtchn_bind_vcpu);
> + break;
>   case EVTCHNOP_unmask:
> - /* no output */
> + len = sizeof(struct evtchn_unmask);
>   break;
> -
> -#define COPY_BACK(eop) \
> - case EVTCHNOP_##eop: \
> - memcpy(arg, &op.u.eop, sizeof(op.u.eop)); \
> - break
> -
> - COPY_BACK(bind_interdomain);
> - COPY_BACK(bind_virq);
> - COPY_BACK(bind_pirq);
> - COPY_BACK(status);
> - COPY_BACK(alloc_unbound);
> - COPY_BACK(bind_ipi);
> -#undef COPY_BACK
> -
>   default:
> - WARN_ON(rc != -ENOSYS);
> - break;
> + return -ENOSYS;
>   }
>  
> + memcpy(&op.u, arg, len);
> + rc = _hypercall1(int, event_channel_op_compat, &op);
> + memcpy(arg, &op.u, len);


We don't copy back for all commands, only those that are COPY_BACK.
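
To make the point concrete, here is a minimal userspace sketch of keeping a per-command length together with a per-command copy-back flag. Everything in it is hypothetical (the `CMD_*` codes, the argument structs, `fake_hypercall()` and `op_compat()`); it only illustrates the shape and is not proposed code for drivers/xen/fallback.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command codes and payloads, for illustration only. */
enum { CMD_SEND = 1, CMD_STATUS = 2 };

struct send_args   { unsigned int port; };
struct status_args { unsigned int port; unsigned int state; };

struct op {
    int cmd;
    union {
        struct send_args send;
        struct status_args status;
    } u;
};

/* Look up the payload size and whether the command produces output. */
static bool cmd_info(int cmd, size_t *len, bool *copy_back)
{
    switch (cmd) {
    case CMD_SEND:
        *len = sizeof(struct send_args);
        *copy_back = false;   /* input only */
        return true;
    case CMD_STATUS:
        *len = sizeof(struct status_args);
        *copy_back = true;    /* callee fills in 'state' */
        return true;
    default:
        return false;
    }
}

/* Stand-in for the hypercall; it also scribbles on the input-only payload. */
static int fake_hypercall(struct op *op)
{
    if (op->cmd == CMD_STATUS)
        op->u.status.state = 42;
    else if (op->cmd == CMD_SEND)
        op->u.send.port = 0;  /* must never reach the caller's buffer */
    return 0;
}

static int op_compat(int cmd, void *arg)
{
    struct op op = { .cmd = cmd };
    size_t len;
    bool copy_back;
    int rc;

    if (!cmd_info(cmd, &len, &copy_back))
        return -1;            /* the patch returns -ENOSYS here */

    memcpy(&op.u, arg, len);  /* copy in exactly 'len' bytes */
    rc = fake_hypercall(&op);
    if (copy_back)            /* only output commands write back */
        memcpy(arg, &op.u, len);
    return rc;
}
```

Calling `op_compat(CMD_SEND, &s)` leaves the caller's `s` untouched even though the callee modified its private copy, while `op_compat(CMD_STATUS, &st)` returns the filled-in state.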



> +
>   return rc;
>  }
>  EXPORT_SYMBOL_GPL(xen_event_channel_op_compat);
>  
>  int xen_physdev_op_compat(int cmd, void *arg)
>  {
> - struct physdev_op op;
> + struct physdev_op op = { .cmd = cmd, };
> + size_t len;
>   int rc;
>  
> - op.cmd = cmd;
> - memcpy(&op.u, arg, sizeof(op.u));
> - rc = _hypercall1(int, physdev_op_compat, &op);
> -
>   switch (cmd) {
>   case PHYSDEVOP_IRQ_UNMASK_NOTIFY:
> + len = 0;
> + break;
> + case PHYSDEVOP_irq_status_query:
> + len = sizeof(struct physdev_irq_status_query);
> + break;
>   case PHYSDEVOP_set_iopl:
> + len = sizeof(struct physdev_set_iopl);
> + break;
>   case PHYSDEVOP_set_iobitmap:
> + len = sizeof(struct physdev_set_iobitmap);
> + break;
> + case PHYSDEVOP_apic_read:
>   case PHYSDEVOP_apic_write:
> - /* no output */
> + len = sizeof(struct physdev_apic);
>   break;
> -
> -#define COPY_BACK(pop, fld) \
> - case

[RFC] x86/retpoline: Add clang support for 64-bit builds

2018-02-02 Thread Guenter Roeck

clang has its own set of compiler options for retpoline support.
Also, the thunks required by C code have their own function names.

For 64-bit builds, there is only a single thunk, which is easy
to support. Support for 32-bit builds is more complicated - in
addition to various register thunks, there is also a thunk
named __llvm_external_retpoline_push which is more challenging.
Play it safe and only support 64-bit clang builds for now.

Link: 
https://github.com/llvm-mirror/clang/commit/0d816739a82da29748caf88570affb9715e18b69
Cc: David Woodhouse 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: gno...@lxorguk.ukuu.org.uk
Cc: Rik van Riel 
Cc: Andi Kleen 
Cc: Josh Poimboeuf 
Cc: thomas.lenda...@amd.com
Cc: Peter Zijlstra 
Cc: Linus Torvalds 
Cc: Jiri Kosina 
Cc: Andy Lutomirski 
Cc: Dave Hansen 
Cc: Kees Cook 
Cc: Tim Chen 
Cc: Greg Kroah-Hartman 
Cc: Paul Turner 
Signed-off-by: Guenter Roeck 
---
Sent as RFC because I am not sure if the 64-bit only solution
is acceptable.

 arch/x86/Makefile|  5 -
 arch/x86/lib/retpoline.S | 24 
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index fad55160dcb9..536dd6775988 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -232,7 +232,10 @@ KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
 
 # Avoid indirect branches in kernel to deal with Spectre
 ifdef CONFIG_RETPOLINE
-RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern 
-mindirect-branch-register)
+RETPOLINE_CFLAGS = $(call cc-option,-mindirect-branch=thunk-extern 
-mindirect-branch-register)
+ifeq ($(RETPOLINE_CFLAGS)$(CONFIG_X86_32),)
+   RETPOLINE_CFLAGS = $(call cc-option,-mretpoline 
-mretpoline-external-thunk)
+endif
 ifneq ($(RETPOLINE_CFLAGS),)
 KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
 endif
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 480edc3a5e03..f77738b13481 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -9,14 +9,22 @@
 #include 
 #include 
 
-.macro THUNK reg
+.macro _THUNK prefix, reg
.section .text.__x86.indirect_thunk
 
-ENTRY(__x86_indirect_thunk_\reg)
+ENTRY(\prefix\reg)
CFI_STARTPROC
JMP_NOSPEC %\reg
CFI_ENDPROC
-ENDPROC(__x86_indirect_thunk_\reg)
+ENDPROC(\prefix\reg)
+.endm
+
+.macro THUNK reg
+_THUNK __x86_indirect_thunk_ \reg
+.endm
+
+.macro CLANG_THUNK reg
+_THUNK __llvm_external_retpoline_ \reg
 .endm
 
 /*
@@ -27,8 +35,11 @@ ENDPROC(__x86_indirect_thunk_\reg)
  * the simple and nasty way...
  */
 #define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
-#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
+#define _EXPORT_THUNK(thunk, reg) __EXPORT_THUNK(thunk ## reg)
+#define EXPORT_THUNK(reg) _EXPORT_THUNK(__x86_indirect_thunk_, reg)
 #define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg)
+#define EXPORT_CLANG_THUNK(reg) _EXPORT_THUNK(__llvm_external_retpoline_, reg)
+#define GENERATE_CLANG_THUNK(reg) CLANG_THUNK reg ; EXPORT_CLANG_THUNK(reg)
 
 GENERATE_THUNK(_ASM_AX)
 GENERATE_THUNK(_ASM_BX)
@@ -46,6 +57,11 @@ GENERATE_THUNK(r12)
 GENERATE_THUNK(r13)
 GENERATE_THUNK(r14)
 GENERATE_THUNK(r15)
+
+#ifdef __clang__
+GENERATE_CLANG_THUNK(r11)
+#endif
+
 #endif
 
 /*
-- 
2.7.4

Re: suspicious RCU usage at ./include/linux/rcupdate.h:LINE (4)

2018-02-02 Thread Alexei Starovoitov

On Fri, Feb 02, 2018 at 06:58:01AM -0800, syzbot wrote:
> Hello,
> 
> syzbot hit the following crash on bpf-next commit
> b2fe5fa68642860e7de76167c3111623aa0d5de1 (Wed Jan 31 22:31:10 2018 +)
> Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
> 
> So far this crash happened 1575 times on bpf-next.
> C reproducer is attached.
> syzkaller reproducer is attached.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+7dbcd2d3b85f9b608...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
> 
> audit: type=1400 audit(1517546098.866:9): avc:  denied  { prog_run } for
> pid=4159 comm="syzkaller076311"
> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 tclass=bpf
> permissive=1
> 
> =
> WARNING: suspicious RCU usage
> 4.15.0+ #10 Not tainted
> -
> ./include/linux/rcupdate.h:302 Illegal context switch in RCU read-side
> critical section!
> 
> other info that might help us debug this:
> 
> 
> rcu_scheduler_active = 2, debug_locks = 1
> 3 locks held by syzkaller076311/4159:
>  #0:  (&ctx->mutex){+.+.}, at: [<27c8872d>]
> perf_event_ctx_lock_nested+0x21b/0x450 kernel/events/core.c:1253
>  #1:  (bpf_event_mutex){+.+.}, at: [<92294d8c>]
> perf_event_query_prog_array+0x10e/0x280 kernel/trace/bpf_trace.c:876
>  #2:  (rcu_read_lock){}, at: [<2b518ca0>]
> bpf_prog_array_copy_to_user+0x0/0x4d0 kernel/bpf/core.c:1568
> 
> stack backtrace:
> CPU: 0 PID: 4159 Comm: syzkaller076311 Not tainted 4.15.0+ #10
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
>  rcu_preempt_sleep_check include/linux/rcupdate.h:301 [inline]
>  ___might_sleep+0x385/0x470 kernel/sched/core.c:6079
>  __might_sleep+0x95/0x190 kernel/sched/core.c:6067
>  __might_fault+0xab/0x1d0 mm/memory.c:4532
>  _copy_to_user+0x2c/0xc0 lib/usercopy.c:25
>  copy_to_user include/linux/uaccess.h:155 [inline]
>  bpf_prog_array_copy_to_user+0x217/0x4d0 kernel/bpf/core.c:1587
>  bpf_prog_array_copy_info+0x17b/0x1c0 kernel/bpf/core.c:1685
>  perf_event_query_prog_array+0x196/0x280 kernel/trace/bpf_trace.c:877
>  _perf_ioctl kernel/events/core.c:4737 [inline]
>  perf_ioctl+0x3e1/0x1480 kernel/events/core.c:4757
>  vfs_ioctl fs/ioctl.c:46 [inline]

FYI, this was a copy_to_user() call inside an RCU read-side critical section.
I submitted a fix here:
https://patchwork.ozlabs.org/patch/868824/
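
The general pattern for this class of bug (a possibly sleeping copy inside a section that must not sleep) can be sketched in userspace C as below. This is an illustration under assumptions, not the submitted patch: `section_enter()`/`section_exit()` model rcu_read_lock()/rcu_read_unlock(), `sleeping_copy()` models copy_to_user(), and `copy_ids()` with its fixed 16-entry buffer is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int in_atomic_section;  /* models rcu_read_lock() nesting depth */

static void section_enter(void) { in_atomic_section++; }
static void section_exit(void)  { in_atomic_section--; }

/* Models copy_to_user(): may sleep, so it must run outside the section. */
static void sleeping_copy(unsigned int *dst, const unsigned int *src, size_t n)
{
    assert(in_atomic_section == 0);
    memcpy(dst, src, n * sizeof(*src));
}

/*
 * Snapshot the protected data into a local buffer while inside the
 * section, then perform the sleeping copy only after leaving it.
 * Returns the number of entries copied (capped at the buffer size).
 */
static size_t copy_ids(unsigned int *user_dst, const unsigned int *shared,
                       size_t cnt)
{
    unsigned int tmp[16];
    size_t n = cnt < 16 ? cnt : 16;

    section_enter();
    memcpy(tmp, shared, n * sizeof(tmp[0]));  /* non-sleeping snapshot */
    section_exit();

    sleeping_copy(user_dst, tmp, n);
    return n;
}
```

Moving the `sleeping_copy()` call between `section_enter()` and `section_exit()` trips the assertion, which is the userspace analogue of the "Illegal context switch in RCU read-side critical section" splat quoted above.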

Re: Compilation error report for: drivers/firmware/qcom_scm.c:469:47: error: passing argument 3 of ?dma_alloc_coherent? from incompatible pointer type

2018-02-02 Thread Bjorn Andersson

On Tue 30 Jan 05:25 PST 2018, Arnd Bergmann wrote:

> On Tue, Jan 30, 2018 at 11:11 AM, Benjamin GAIGNARD
>  wrote:
> >
> > On 01/12/2018 05:11 PM, Arnaud Pouliquen wrote:
> >> Hello Andy,David,
> > + Arnd
> >
> > I have the same issue on drm-misc-next.
> > Does Arnaud's fix make sense or should we update/change the way of how
> > we compile the kernel ?
> 
> We've hit a couple of bugs with qcom drivers confusing physical addresses
> and DMA addresses in the past, usually the drivers were buggy in
> some form, and tried to use dma_alloc_coherent() to get a buffer
> that gets passed into a firmware interface taking a physical address,
> which is of course completely wrong.
> 

Thanks Arnd, for once again using the words "bug" and "completely wrong"
when referring to something that obviously works just fine...

The solution you introduced for venus and adreno relies on static
reservations of system ram, which isn't pretty, but more importantly
isn't viable for the qcom_scm driver.

So, how do I dynamically allocate a chunk of coherent memory?

Preferably with the possibility of unmapping it temporarily from Linux
while passing the buffer into the trusted environment (as any accesses
during the operation might cause access violations).

Regards,
Bjorn

Re: [RFC net 1/1] rtnetlink: require unique netns identifier

2018-02-02 Thread David Ahern

On 2/2/18 1:51 AM, Christian Brauner wrote:
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 56af8e41abfc..d0b7ab22eff4 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -1951,6 +1951,18 @@ static struct net *rtnl_link_get_net_capable(const 
> struct sk_buff *skb,
>   return net;
>  }
>  
> +/* Verify that rtnetlink requests that support network namespace ids do not 
> pass
> + * additional properties that allow to identify a network namespace as they
> + * might conflict.
> + */
> +static int rtnl_ensure_unique_netns_attr(struct nlattr *tb[])
> +{
> + if (tb[IFLA_IF_NETNSID] && (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD]))
> + return -EINVAL;

The days of just returning EINVAL are over; please plumb the extack argument
through to this function and add a string describing the problem. There are
plenty of examples in rtnetlink.c.

Also, what if those NSIDs all point to the same namespace? That should
not fail, right?
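
As a rough illustration of the first request, the check could carry a human-readable reason alongside the error code. The sketch below is userspace C with hypothetical stand-ins: `struct ext_ack` mimics the role of the kernel's extended ack, the `ATTR_*` indices mirror the attribute names in the patch, and the message text is invented. In rtnetlink.c itself the message would normally be set with NL_SET_ERR_MSG().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's extended-ack structure. */
struct ext_ack {
    const char *msg;
};

/* Hypothetical attribute indices, mirroring the names in the patch. */
enum { ATTR_NETNSID, ATTR_NS_PID, ATTR_NS_FD, ATTR_MAX };

/*
 * Reject requests that identify the target namespace in more than one
 * way, and say why instead of returning a bare -EINVAL.
 */
static int ensure_unique_netns_attr(const bool present[ATTR_MAX],
                                    struct ext_ack *extack)
{
    if (present[ATTR_NETNSID] &&
        (present[ATTR_NS_PID] || present[ATTR_NS_FD])) {
        if (extack)
            extack->msg = "netns id conflicts with netns pid or fd attribute";
        return -EINVAL;
    }
    return 0;
}
```

The sketch does not try to answer the second question; deciding whether several attributes naming the same namespace should be accepted would need the attributes resolved and compared first.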

Re: [PATCH v4 3/3] x86/kvm: Expose AMD Core Perf Extension flag to guests

2018-02-02 Thread Natarajan, Janakarajan


On 2/2/2018 2:03 PM, kbuild test robot wrote:

Hi Janakarajan,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tip/x86/core]
[also build test WARNING on v4.15]
[cannot apply to kvm/linux-next next-20180202]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system]

This patch uses functions defined in commit
d6321d493319bfd406c484e8359c6101cbda39d3 ("KVM: x86: generalize
guest_cpuid_has_ helpers"):

https://lkml.org/lkml/2017/8/2/811


url:
https://github.com/0day-ci/linux/commits/Janakarajan-Natarajan/Support-Perf-Extensions-on-AMD-KVM-guests/20180202-231344
reproduce:
 # apt-get install sparse
 make ARCH=x86_64 allmodconfig
 make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)


arch/x86/kvm/cpuid.c:58:6: sparse: symbol 'perf_ext_supported' was not 
declared. Should it be

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

Re: BUG: unable to handle kernel NULL pointer dereference in __crypto_register_alg

2018-02-02 Thread Eric Biggers

On Sat, Dec 23, 2017 at 11:54:01PM -0800, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on
> 6084b576dca2e898f5c101baef151f7bfdbb606d
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> 
> Unfortunately, I don't have any reproducer for this bug yet.
> 
> 
> netlink: 'syz-executor4': attribute type 29 has an invalid length.
> BUG: unable to handle kernel NULL pointer dereference at 0020
> IP: __crypto_register_alg+0x7b/0x300 crypto/algapi.c:212
> PGD 0 P4D 0
> Oops:  [#1] SMP
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 19130 Comm: cryptomgr_probe Not tainted
> 4.15.0-rc3-next-20171214+ #67
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> RIP: 0010:__crypto_register_alg+0x7b/0x300 crypto/algapi.c:212
> RSP: 0018:c9fefe00 EFLAGS: 00010293
> RAX: 88020df4c240 RBX:  RCX: 8167622b
> RDX:  RSI: 8801fa6f8559 RDI: 8801fa6f8881
> RBP: c9fefe30 R08: 0001 R09: 0004
> R10: c9fefdb0 R11: 0004 R12: 8801fa6f84a0
> R13:  R14: 080e R15: 8801fa6f8900
> FS:  () GS:88021fc0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0020 CR3: 0301e004 CR4: 001626f0
> DR0: 2000 DR1: 2000 DR2: 
> DR3:  DR6: 0ff0 DR7: 0600
> Call Trace:
>  crypto_register_instance+0x83/0x140 crypto/algapi.c:544
>  shash_register_instance+0x34/0x50 crypto/shash.c:532
>  cbcmac_create+0x15c/0x190 crypto/ccm.c:988
>  cryptomgr_probe+0x40/0x100 crypto/algboss.c:75
>  kthread+0x149/0x170 kernel/kthread.c:238
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:524
> Code: 10 49 89 44 24 18 48 81 fb 60 21 0e 83 0f 84 bb 00 00 00 e8 28 41 c4
> ff 49 39 dc 74 43 49 8d 44 24 38 48 89 45 d0 e8 15 41 c4 ff <44> 8b 6b 20 41
> f6 c5 60 75 78 e8 06 41 c4 ff 41 83 e5 10 4c 8d
> RIP: __crypto_register_alg+0x7b/0x300 crypto/algapi.c:212 RSP:
> c9fefe00
> CR2: 0020
> ---[ end trace 598f24e6511387a3 ]---
> Kernel panic - not syncing: Fatal exception
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
> 

This is yet another one that was reported while KASAN was accidentally disabled,
and it only happened once, so invalidating:

#syz invalid

- Eric

Re: RFC(V3): Audit Kernel Container IDs

2018-02-02 Thread Paul Moore

On Fri, Feb 2, 2018 at 5:19 PM, Simo Sorce  wrote:
> On Fri, 2018-02-02 at 16:24 -0500, Paul Moore wrote:
>> On Wed, Jan 10, 2018 at 2:00 AM, Richard Guy Briggs  wrote:
>> > On 2018-01-09 11:18, Simo Sorce wrote:
>> > > On Tue, 2018-01-09 at 07:16 -0500, Richard Guy Briggs wrote:

...

>> > Paul, can you justify this somewhat larger inconvenience for some
>> > relatively minor convenience on our part?
>>
>> Done in direct response to Simo.
>
> Sorry but your response sounds more like waving away than addressing
> them, the excuse being: we can't please everyone, so we are going to
> please no one.

I obviously disagree with the take on my comments but you're free to
your opinion.

I believe saying we are pleasing no one isn't really fair now is it?
Is there any type of audit container ID now?  How would you go about
associating audit events with containers now? (spoiler alert: it ain't
pretty, and there are gaps I don't believe you can cover)  This
proposal provides a mechanism to do this in a way that isn't tied to
any one particular concept of a container and is manageable inside the
kernel.

If you have a need to track audit events for containers, I find it
extremely hard to believe that you are not at least partially pleased
by the solutions presented here.  It may not be everything on your
wishlist, but when did you ever get *everything* on your wishlist?

>> But to be clear Richard, we've talked about this a few times, it's not
>> a "minor convenience" on our part, it's a pretty big convenience once
>> we starting having to route audit events and make decisions based on
>> the audit container ID information.  Audit performance is less than
>> awesome now, I'm working hard to not make it worse.
>
> Sounds like a security vs performance trade off to me.

Welcome to software development.  It's generally a pretty terrible
hobby and/or occupation, but we make up for it with long hours and
endless frustration.

>> > u64 vs u128 is easy for us to
>> > accommodate in terms of scalar comparisons.  It doubles the information
>> > in every container id field we print in audit records.
>>
>> ... and slows down audit container ID checks.
>
> Are you saying a cmp on a u128 is slower than a comparison on a u64 and
> this is something that will be noticeable ?

Do you have a 128 bit system?  I don't.  I've got a bunch of 64 bit
systems, and a couple of 32 bit systems too.  People that use audit
have a tendency to really hammer on it, to the point that we get
performance complaints on a not infrequent basis.  I don't know the
exact number of times we are going to need to check the audit
container ID, but it's reasonable to think that we'll expose it as a
filter-able field which adds a few checks, we'll use it for record
routing so that's a few more, and if we're running multiple audit
daemons we will probably want to include LSM checks which could result
in a few more audit container ID checks.  If it was one comparison I
wouldn't be too worried about it, but the point I'm trying to make is
that we don't know what the implementation is going to look like yet
and I suspect this ID is going to be leveraged in several places in
the audit subsystem and I would much rather start small to save
headaches later.

We can always expand the ID to a larger integer at a later date, but
we can't make it smaller.
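
To put the comparison-cost argument in concrete terms, here is a small sketch of what an equality check looks like for each width on a machine with 64-bit registers. `struct contid128` and both helper names are hypothetical; nothing here reflects an actual audit implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit container ID stored as two 64-bit words. */
struct contid128 {
    uint64_t hi;
    uint64_t lo;
};

/* A 64-bit ID fits in one register: a single compare. */
static bool contid64_equal(uint64_t a, uint64_t b)
{
    return a == b;
}

/* A 128-bit ID needs up to two compares and twice the loads. */
static bool contid128_equal(struct contid128 a, struct contid128 b)
{
    return a.hi == b.hi && a.lo == b.lo;
}
```

One extra compare is small in isolation; the concern raised above is that the check may run many times per audit record, across filtering, routing and LSM hooks.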

>> > A c36 is a bigger step.
>>
>> Yeah, we're not doing that, no way.
>
> Ok, I can see your point though I do not agree with it.
>
> I can see why you do not want to have arbitrary length strings, but a
> u128 sounded like a reasonable compromise to me as it has enough room
> to be able to have unique cluster-wide IDs which a u64 definitely makes
> a lot harder to provide w/o tight coordination.

I originally wanted it to be a 32-bit integer, but Richard managed to
talk me into 64-bits, that was my compromise :)

As I said earlier, if you are doing container auditing you're going to
need coordination with the orchestrator, regardless of the audit
container ID size.

-- 
paul moore
www.paul-moore.com

Re: KASAN: use-after-free Read in mon_bin_vma_fault

2018-02-02 Thread Eric Biggers

On Thu, Dec 28, 2017 at 12:15:01PM -0800, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on
> beacbc68ac3e23821a681adb30b45dc55b17488d
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> Unfortunately, I don't have any reproducer for this bug yet.
> 
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+9cd6f6d80e1a5205f...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
> 
> ==
> BUG: KASAN: use-after-free in mon_bin_vma_fault+0x378/0x400
> drivers/usb/mon/mon_bin.c:1238
> Read of size 8 at addr 8801cd040080 by task syz-executor1/5424
> 
> CPU: 1 PID: 5424 Comm: syz-executor1 Not tainted 4.15.0-rc5+ #238
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  print_address_description+0x73/0x250 mm/kasan/report.c:252
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x25b/0x340 mm/kasan/report.c:409
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
>  mon_bin_vma_fault+0x378/0x400 drivers/usb/mon/mon_bin.c:1238
>  __do_fault+0xeb/0x30f mm/memory.c:3196
>  do_read_fault mm/memory.c:3606 [inline]
>  do_fault mm/memory.c:3706 [inline]
>  handle_pte_fault mm/memory.c:3937 [inline]
>  __handle_mm_fault+0x1d8f/0x3ce0 mm/memory.c:4061
>  handle_mm_fault+0x334/0x8d0 mm/memory.c:4098
>  faultin_page mm/gup.c:502 [inline]
>  __get_user_pages+0x50c/0x15f0 mm/gup.c:699
>  populate_vma_page_range+0x20e/0x2f0 mm/gup.c:1216
>  __mm_populate+0x23a/0x450 mm/gup.c:1266
>  mm_populate include/linux/mm.h:2226 [inline]
>  vm_mmap_pgoff+0x241/0x280 mm/util.c:338
>  SYSC_mmap_pgoff mm/mmap.c:1533 [inline]
>  SyS_mmap_pgoff+0x462/0x5f0 mm/mmap.c:1491
>  SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
>  SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91
>  entry_SYSCALL_64_fastpath+0x1f/0x96
> RIP: 0033:0x452ac9
> RSP: 002b:7f7721b03c58 EFLAGS: 0212 ORIG_RAX: 0009
> RAX: ffda RBX: 0071bea0 RCX: 00452ac9
> RDX: 0104 RSI: 4000 RDI: 20ac6000
> RBP: 039b R08: 0014 R09: 
> R10: 8011 R11: 0212 R12: 006f2728
> R13:  R14: 7f7721b046d4 R15: 
> 
> Allocated by task 5424:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
>  kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3610
>  kmalloc include/linux/slab.h:499 [inline]
>  kzalloc include/linux/slab.h:688 [inline]
>  mon_bin_open+0x1ae/0x4a0 drivers/usb/mon/mon_bin.c:703
>  chrdev_open+0x257/0x730 fs/char_dev.c:417
>  do_dentry_open+0x667/0xd40 fs/open.c:752
>  vfs_open+0x107/0x220 fs/open.c:866
>  do_last fs/namei.c:3379 [inline]
>  path_openat+0x1151/0x3530 fs/namei.c:3519
>  do_filp_open+0x25b/0x3b0 fs/namei.c:3554
>  do_sys_open+0x502/0x6d0 fs/open.c:1059
>  SYSC_open fs/open.c:1077 [inline]
>  SyS_open+0x2d/0x40 fs/open.c:1072
>  entry_SYSCALL_64_fastpath+0x1f/0x96
> 
> Freed by task 5433:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
>  __cache_free mm/slab.c:3488 [inline]
>  kfree+0xd6/0x260 mm/slab.c:3803
>  mon_bin_ioctl+0x68d/0xd40 drivers/usb/mon/mon_bin.c:1040
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
>  SYSC_ioctl fs/ioctl.c:701 [inline]
>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>  entry_SYSCALL_64_fastpath+0x1f/0x96
> 
> The buggy address belongs to the object at 8801cd040080
>  which belongs to the cache kmalloc-2048 of size 2048
> The buggy address is located 0 bytes inside of
>  2048-byte region [8801cd040080, 8801cd040880)
> The buggy address belongs to the page:
> page:3d43a99d count:1 mapcount:0 mapping:83654cb9 index:0x0
> compound_mapcount: 0
> flags: 0x2fffc008100(slab|head)
> raw: 02fffc008100 8801cd040080  00010003
> raw: ea00072a2e20 ea00071e6920 8801db000c40 
> page dumped because: kasan: bad access detected
> 
> Memory state around the buggy address:
>  8801cd03ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  8801cd04: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > 8801cd040080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>^
>  8801cd040100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  8801cd040180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==

Re: [PATCH 1/2] tools/memory-model: clarify the origin/scope of the tool name

2018-02-02 Thread Paul E. McKenney

On Fri, Feb 02, 2018 at 03:17:53PM -0800, Paul E. McKenney wrote:
> On Fri, Feb 02, 2018 at 09:54:27AM +0100, Andrea Parri wrote:
> > On Thu, Feb 01, 2018 at 03:09:41PM -0800, Paul E. McKenney wrote:
> > > On Thu, Feb 01, 2018 at 10:26:50AM -0500, Alan Stern wrote:
> > > > On Thu, 1 Feb 2018, Andrea Parri wrote:
> > > > 
> > > > > Ingo pointed out that:
> > > > > 
> > > > >   "The "memory model" name is overly generic, ambiguous and somewhat
> > > > >misleading, as we usually mean the virtual memory layout/model
> > > > >when we say "memory model". GCC too uses it in that sense [...]"
> > > > > 
> > > > > Make it clearer that, in the context of tools/memory-model/, the term
> > > > > "memory-model" is used as shorthand for "memory consistency model" by
> > > > > calling out this convention in tools/memory-model/README.
> > > > > 
> > > > > Stick to the full name in sources' headers and for the subsystem name.
> > > > > 
> > > > > Suggested-by: Ingo Molnar 
> > > > > Signed-off-by: Andrea Parri 
> > > > 
> > > > For both patches:
> > > > 
> > > > Acked-by: Alan Stern 
> > > 
> > > Thank you all -- I have queued this and pushed it to my RCU tree on
> > > branch lkmm.  I did reword the commit log a bit, please see below and
> > > please let me know if any of my rewordings need halp.
> > > 
> > > Andrea, when you resend your second patch, could you please add Alan's
> > > Acked-by?
> > 
> > You mean in order to integrate Will's suggestion? I was planning to send
> > that as a separate patch, as suggested by Will: the patch is on its way,
> > IAC, please let me know if you'd prefer a V2 merging the two changes.
> 
> Ah, apologies, I misread your reply.  I have queued your second patch
> with Will's and Alan's Acked-by's.

And I did reword the commit log a bit, so please check it.

Thanx, Paul



commit 12a62a1d07031c0afa396b03334abbe30d9b9cf7
Author: Andrea Parri 
Date:   Thu Feb 1 13:04:26 2018 +0100

MAINTAINERS: Add the Memory Consistency Model subsystem

Move the contents of tools/memory-model/MAINTAINERS into the main
MAINTAINERS file, removing tools/memory-model/MAINTAINERS. This
allows get_maintainer.pl to correctly identify the maintainers of
tools/memory-model/.

Suggested-by: Ingo Molnar 
Signed-off-by: Andrea Parri 
Acked-by: Will Deacon 
Acked-by: Alan Stern 
Signed-off-by: Paul E. McKenney 

diff --git a/MAINTAINERS b/MAINTAINERS
index e3581413420c..ba4dc08fbe95 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8086,6 +8086,22 @@ M:   Kees Cook 
 S: Maintained
 F: drivers/misc/lkdtm*
 
+LINUX KERNEL MEMORY CONSISTENCY MODEL
+M: Alan Stern 
+M: Andrea Parri 
+M: Will Deacon 
+M: Peter Zijlstra 
+M: Boqun Feng 
+M: Nicholas Piggin 
+M: David Howells 
+M: Jade Alglave 
+M: Luc Maranget 
+M: "Paul E. McKenney" 
+L: linux-kernel@vger.kernel.org
+S: Supported
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
+F: tools/memory-model/
+
 LINUX SECURITY MODULE (LSM) FRAMEWORK
 M: Chris Wright 
 L: linux-security-mod...@vger.kernel.org
diff --git a/tools/memory-model/MAINTAINERS b/tools/memory-model/MAINTAINERS
deleted file mode 100644
index db3bd3fc0435..
--- a/tools/memory-model/MAINTAINERS
+++ /dev/null
@@ -1,15 +0,0 @@
-LINUX KERNEL MEMORY CONSISTENCY MODEL
-M: Alan Stern 
-M: Andrea Parri 
-M: Will Deacon 
-M: Peter Zijlstra 
-M: Boqun Feng 
-M: Nicholas Piggin 
-M: David Howells 
-M: Jade Alglave 
-M: Luc Maranget 
-M: "Paul E. McKenney" 
-L: linux-kernel@vger.kernel.org
-S: Supported
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
-F: tools/memory-model/

Re: [PATCH 1/2] tools/memory-model: clarify the origin/scope of the tool name

2018-02-02 Thread Paul E. McKenney

On Fri, Feb 02, 2018 at 11:44:21AM +0100, Andrea Parri wrote:
> On Thu, Feb 01, 2018 at 03:09:41PM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 01, 2018 at 10:26:50AM -0500, Alan Stern wrote:
> > > On Thu, 1 Feb 2018, Andrea Parri wrote:
> > > 
> > > > Ingo pointed out that:
> > > > 
> > > >   "The "memory model" name is overly generic, ambiguous and somewhat
> > > >misleading, as we usually mean the virtual memory layout/model
> > > >when we say "memory model". GCC too uses it in that sense [...]"
> > > > 
> > > > Make it clearer that, in the context of tools/memory-model/, the term
> > > > "memory-model" is used as shorthand for "memory consistency model" by
> > > > calling out this convention in tools/memory-model/README.
> > > > 
> > > > Stick to the full name in sources' headers and for the subsystem name.
> > > > 
> > > > Suggested-by: Ingo Molnar 
> > > > Signed-off-by: Andrea Parri 
> > > 
> > > For both patches:
> > > 
> > > Acked-by: Alan Stern 
> > 
> > Thank you all -- I have queued this and pushed it to my RCU tree on
> > branch lkmm.  I did reword the commit log a bit, please see below and
> > please let me know if any of my rewordings need halp.
> 
> Seems to me that your message has a leftover "is used".

Good catch, how about this instead?

Thanx, Paul

---

commit 2b1b4ab5166209da849f306fbdc84114d9e611fd
Author: Andrea Parri 
Date:   Thu Feb 1 13:03:29 2018 +0100

tools/memory-model: Clarify the origin/scope of the tool name

Ingo pointed out that:

  "The "memory model" name is overly generic, ambiguous and somewhat
   misleading, as we usually mean the virtual memory layout/model
   when we say "memory model". GCC too uses it in that sense [...]"

Make it clear that tools/memory-model/ uses the term "memory model" as
shorthand for "memory consistency model" by calling out this convention
in tools/memory-model/README.

Stick to the original "memory model" term in sources' headers and for
the subsystem name.

Suggested-by: Ingo Molnar 
Signed-off-by: Andrea Parri 
Acked-by: Will Deacon 
Acked-by: Alan Stern 
Signed-off-by: Paul E. McKenney 

diff --git a/tools/memory-model/MAINTAINERS b/tools/memory-model/MAINTAINERS
index 711cbe72d606..db3bd3fc0435 100644
--- a/tools/memory-model/MAINTAINERS
+++ b/tools/memory-model/MAINTAINERS
@@ -1,4 +1,4 @@
-LINUX KERNEL MEMORY MODEL
+LINUX KERNEL MEMORY CONSISTENCY MODEL
 M: Alan Stern 
 M: Andrea Parri 
 M: Will Deacon 
diff --git a/tools/memory-model/README b/tools/memory-model/README
index 43ba49492111..91414a49fac5 100644
--- a/tools/memory-model/README
+++ b/tools/memory-model/README
@@ -1,15 +1,15 @@
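-   =========================
-   LINUX KERNEL MEMORY MODEL
-   =========================
+   =====================================
+   LINUX KERNEL MEMORY CONSISTENCY MODEL
+   =====================================
 
 ============
 INTRODUCTION
 ============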
 
-This directory contains the memory model of the Linux kernel, written
-in the "cat" language and executable by the (externally provided)
-"herd7" simulator, which exhaustively explores the state space of
-small litmus tests.
+This directory contains the memory consistency model (memory model, for
+short) of the Linux kernel, written in the "cat" language and executable
+by the externally provided "herd7" simulator, which exhaustively explores
+the state space of small litmus tests.
 
 In addition, the "klitmus7" tool (also externally provided) may be used
 to convert a litmus test to a Linux kernel module, which in turn allows
diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
index 57112505f5e0..b984bbda01a5 100644
--- a/tools/memory-model/linux-kernel.bell
+++ b/tools/memory-model/linux-kernel.bell
@@ -11,7 +11,7 @@
  * which is to appear in ASPLOS 2018.
  *)
 
-"Linux kernel memory model"
+"Linux-kernel memory consistency model"
 
 enum Accesses = 'once (*READ_ONCE,WRITE_ONCE,ACCESS_ONCE*) ||
'release (*smp_store_release*) ||
diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
index 15b7a5dd8a9a..babe2b3b0bb3 100644
--- a/tools/memory-model/linux-kernel.cat
+++ b/tools/memory-model/linux-kernel.cat
@@ -11,7 +11,7 @@
  * which is to appear in ASPLOS 2018.
  *)
 
-"Linux kernel memory model"
+"Linux-kernel memory consistency model"
 
 (*
  * File "lock.cat" handles locks and is experimental.

Re: [PATCH 1/2] tools/memory-model: clarify the origin/scope of the tool name

2018-02-02 Thread Paul E. McKenney

On Fri, Feb 02, 2018 at 09:54:27AM +0100, Andrea Parri wrote:
> On Thu, Feb 01, 2018 at 03:09:41PM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 01, 2018 at 10:26:50AM -0500, Alan Stern wrote:
> > > On Thu, 1 Feb 2018, Andrea Parri wrote:
> > > 
> > > > Ingo pointed out that:
> > > > 
> > > >   "The "memory model" name is overly generic, ambiguous and somewhat
> > > >misleading, as we usually mean the virtual memory layout/model
> > > >when we say "memory model". GCC too uses it in that sense [...]"
> > > > 
> > > > Make it clearer that, in the context of tools/memory-model/, the term
> > > > "memory-model" is used as shorthand for "memory consistency model" by
> > > > calling out this convention in tools/memory-model/README.
> > > > 
> > > > Stick to the full name in sources' headers and for the subsystem name.
> > > > 
> > > > Suggested-by: Ingo Molnar 
> > > > Signed-off-by: Andrea Parri 
> > > 
> > > For both patches:
> > > 
> > > Acked-by: Alan Stern 
> > 
> > Thank you all -- I have queued this and pushed it to my RCU tree on
> > branch lkmm.  I did reword the commit log a bit, please see below and
> > please let me know if any of my rewordings need halp.
> > 
> > Andrea, when you resend your second patch, could you please add Alan's
> > Acked-by?
> 
> You mean in order to integrate Will's suggestion? I was planning to send
> that as a separate patch, as suggested by Will: the patch is on its way,
> IAC, please let me know if you'd prefer a V2 merging the two changes.

Ah, apologies, I misread your reply.  I have queued your second patch
with Will's and Alan's Acked-by's.

Thanx, Paul

>   Andrea
> 
> 
> > 
> > Thanx, Paul
> > 
> > 
> > 
> > commit de175b697f71b8e3e6d980b7186b909fee0c4378
> > Author: Andrea Parri 
> > Date:   Thu Feb 1 13:03:29 2018 +0100
> > 
> > tools/memory-model: Clarify the origin/scope of the tool name
> > 
> > Ingo pointed out that:
> > 
> >   "The "memory model" name is overly generic, ambiguous and somewhat
> >misleading, as we usually mean the virtual memory layout/model
> >when we say "memory model". GCC too uses it in that sense [...]"
> > 
> > Make it clearer that tools/memory-model/ uses the term "memory model"
> > is used as shorthand for "memory consistency model" by calling out this
> > convention in tools/memory-model/README.
> > 
> > Stick to the original "memory model" term in sources' headers and for
> > the subsystem name.
> > 
> > Suggested-by: Ingo Molnar 
> > Signed-off-by: Andrea Parri 
> > Acked-by: Will Deacon 
> > Acked-by: Alan Stern 
> > Signed-off-by: Paul E. McKenney 
> > 
> > diff --git a/tools/memory-model/MAINTAINERS b/tools/memory-model/MAINTAINERS
> > index 711cbe72d606..db3bd3fc0435 100644
> > --- a/tools/memory-model/MAINTAINERS
> > +++ b/tools/memory-model/MAINTAINERS
> > @@ -1,4 +1,4 @@
> > -LINUX KERNEL MEMORY MODEL
> > +LINUX KERNEL MEMORY CONSISTENCY MODEL
> >  M: Alan Stern 
> >  M: Andrea Parri 
> >  M: Will Deacon 
> > diff --git a/tools/memory-model/README b/tools/memory-model/README
> > index 43ba49492111..91414a49fac5 100644
> > --- a/tools/memory-model/README
> > +++ b/tools/memory-model/README
> > @@ -1,15 +1,15 @@
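> > -   =========================
> > -   LINUX KERNEL MEMORY MODEL
> > -   =========================
> > +   =====================================
> > +   LINUX KERNEL MEMORY CONSISTENCY MODEL
> > +   =====================================
> >  
> >  ============
> >  INTRODUCTION
> >  ============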
> >  
> > -This directory contains the memory model of the Linux kernel, written
> > -in the "cat" language and executable by the (externally provided)
> > -"herd7" simulator, which exhaustively explores the state space of
> > -small litmus tests.
> > +This directory contains the memory consistency model (memory model, for
> > +short) of the Linux kernel, written in the "cat" language and executable
> > +by the externally provided "herd7" simulator, which exhaustively explores
> > +the state space of small litmus tests.
> >  
> >  In addition, the "klitmus7" tool (also externally provided) may be used
> >  to convert a litmus test to a Linux kernel module, which in turn allows
> > diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
> > index 57112505f5e0..b984bbda01a5 100644
> > --- a/tools/memory-model/linux-kernel.bell
> > +++ b/tools/memory-model/linux-kernel.bell
> > @@ -11,7 +11,7 @@
> >   * which is to appear in ASPLOS 2018.
> >   *)
> >  
> > -"Linux kernel memory model"
> > +"Linux-kernel memory consistency model"
> >  
> >  enum Accesses = 'once (*READ_ONCE,WRITE_ONCE,ACCESS_ONCE*) ||
> > 'release (*smp_store_release*) ||

[PATCH 04/18] tracing/x86: Add arch_get_func_args() function

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Add function to get the function arguments from pt_regs.

Signed-off-by: Steven Rostedt (VMware) 
---
 arch/x86/kernel/ftrace.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 01ebcb6f263e..5e845c8cf89d 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -46,6 +46,34 @@ int ftrace_arch_code_modify_post_process(void)
return 0;
 }
 
+int arch_get_func_args(struct pt_regs *regs,
+  int start, int end, long *args)
+{
+#ifdef CONFIG_X86_64
+# define MAX_ARGS 6
+# define INIT_REGS \
+   {   regs->di, regs->si, regs->dx,   \
+   regs->cx, regs->r8, regs->r9\
+   }
+#else
+# define MAX_ARGS 3
+# define INIT_REGS \
+   {   regs->ax, regs->dx, regs->cx}
+#endif
+   if (!regs)
+   return MAX_ARGS;
+
+   {
+   long pt_args[] = INIT_REGS;
+   int i;
+
+   for (i = start; i <= end && i < MAX_ARGS; i++)
+   args[i - start] = pt_args[i];
+
+   return i - start;
+   }
+}
+
 union ftrace_code_union {
char code[MCOUNT_INSN_SIZE];
struct {
-- 
2.15.1
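
For readers who want to try the argument-gathering logic outside the kernel, here is a minimal userspace sketch. It is not the patch itself: `struct mock_regs`, `MOCK_MAX_ARGS` and `mock_get_func_args()` are hypothetical names that only model the six x86_64 argument registers, and the negative-start guard is an addition that the patch does not have.

```c
/* Hypothetical stand-in for the x86_64 struct pt_regs; only the six
 * registers that carry integer arguments are modelled. */
struct mock_regs {
	long di, si, dx, cx, r8, r9;
};

#define MOCK_MAX_ARGS 6

/*
 * Copy arguments start..end (inclusive) into args[] and return how many
 * were copied. A NULL regs pointer returns the number of arguments that
 * can be fetched, following the convention used in the patch.
 */
static int mock_get_func_args(const struct mock_regs *regs,
			      int start, int end, long *args)
{
	if (!regs)
		return MOCK_MAX_ARGS;
	if (start < 0)		/* guard added for the sketch only */
		return 0;

	const long pt_args[MOCK_MAX_ARGS] = {
		regs->di, regs->si, regs->dx,
		regs->cx, regs->r8, regs->r9
	};
	int i;

	for (i = start; i <= end && i < MOCK_MAX_ARGS; i++)
		args[i - start] = pt_args[i];

	return i - start;
}
```

Because `end` is inclusive in this loop, a caller that sizes `args[]` for N entries would pass `end = N - 1`.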

[PATCH 06/18] tracing: Add indirect offset to args of ftrace based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Add '[' ']' syntax to allow values to be read indirectly from the arguments.
For example:

 echo replenish_dl_entity(s64 dl_se[4]) > function_events

Will get the 64-bit word at index 4 (byte offset 32) from the memory that the
first parameter points to, treating it like an array.
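
The index arithmetic can be sketched in plain C as follows. This is only an illustration under stated assumptions: `struct demo_obj`, `field_index()` and `read_int_at()` are hypothetical, and the layout shown is not the real layout of any kernel structure.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object with an int field at byte offset 128. */
struct demo_obj {
	char pad[128];
	int  len;
};

/*
 * Index to write between '[' and ']' for a field at byte_offset, or -1
 * if the offset is not a multiple of the element size.
 */
static long field_index(size_t byte_offset, size_t elem_size)
{
	if (elem_size == 0 || byte_offset % elem_size)
		return -1;
	return (long)(byte_offset / elem_size);
}

/* Read the int located at base + index * sizeof(int). */
static int read_int_at(const void *base, long index)
{
	int val;

	memcpy(&val, (const char *)base + (size_t)index * sizeof(int),
	       sizeof(val));
	return val;
}
```

With a 4 byte int, `field_index(offsetof(struct demo_obj, len), sizeof(int))` evaluates to 32, which is the number that would go inside the brackets.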

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst | 32 +++-
 kernel/trace/trace_event_ftrace.c | 73 +--
 2 files changed, 101 insertions(+), 4 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst b/Documentation/trace/function-based-events.rst
index f27a0c4e829c..7d67229e8e88 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -100,11 +100,15 @@ as follows:
  'x8' | 'x16' | 'x32' | 'x64' |
  'char' | 'short' | 'int' | 'long' | 'size_t'
 
- FIELD := <name>
+ FIELD := <name> | <name> INDEX
+
+ INDEX := '[' <number> ']'
 
  Where <name> is a unique string starting with an alphabetic character
  and consists only of letters and numbers and underscores.
 
+ Where <number> is a number that can be read by kstrtol() (hex, decimal, etc).
+
 
 Simple arguments
 
@@ -128,3 +132,29 @@ If we are only interested in the first argument (skb):
 
 We use "x64" in order to make sure that the data is displayed in hex.
 This is on a x86_64 machine, and we know the pointer sizes are 8 bytes.
+
+
+Indexing
+========
+
+The skb and dev pointers themselves aren't that interesting. But if we want the
+"len" field of skb, we could index it with the index operator '[' and ']'.
+
+Using gdb, we can find the offset of 'len' from the sk_buff type:
+
+ $ gdb vmlinux
+ (gdb) printf "%d\n", &((struct sk_buff *)0)->len
+128
+
+As 128 / 4 (length of int) is 32, we can see the length of the skb with:
+
+ # echo 'ip_rcv(int skb[32], x64 dev)' > function_events
+
+ # echo 1 > events/functions/ip_rcv/enable
+ # cat trace
+-0 [003] ..s3   280.167137: __netif_receive_skb_core->ip_rcv(skb=52, dev=8801092f9400)
+-0 [003] ..s3   280.167152: __netif_receive_skb_core->ip_rcv(skb=52, dev=8801092f9400)
+-0 [003] ..s3   280.806629: __netif_receive_skb_core->ip_rcv(skb=88, dev=8801092f9400)
+-0 [003] ..s3   280.807023: __netif_receive_skb_core->ip_rcv(skb=52, dev=8801092f9400)
+
+Now we see the length of the sk_buff per event.
diff --git a/kernel/trace/trace_event_ftrace.c b/kernel/trace/trace_event_ftrace.c
index aa19c8af9d34..5d37498d1c6b 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -10,13 +10,15 @@
 
 #include "trace.h"
 
-#define FUNC_EVENT_SYSTEM "functions"
-#define WRITE_BUFSIZE  4096
+#define FUNC_EVENT_SYSTEM  "functions"
+#define WRITE_BUFSIZE  4096
+#define INDIRECT_FLAG  0x1000
 
 struct func_arg {
struct list_headlist;
char*type;
char*name;
+   longindirect;
short   offset;
short   size;
chararg;
@@ -55,6 +57,9 @@ enum func_states {
FUNC_STATE_INIT,
FUNC_STATE_FUNC,
FUNC_STATE_PARAM,
+   FUNC_STATE_BRACKET,
+   FUNC_STATE_BRACKET_END,
+   FUNC_STATE_INDIRECT,
FUNC_STATE_TYPE,
FUNC_STATE_VAR,
FUNC_STATE_COMMA,
@@ -171,6 +176,8 @@ static char *next_token(char **ptr, char *last)
 
for (str = arg; *str; str++) {
if (*str == '(' ||
+   *str == '[' ||
+   *str == ']' ||
*str == ',' ||
*str == ')')
break;
@@ -223,6 +230,7 @@ static int add_arg(struct func_event *fevent, int ftype)
 static enum func_states
 process_event(struct func_event *fevent, const char *token, enum func_states state)
 {
+   long val;
int ret;
int i;
 
@@ -269,12 +277,37 @@ process_event(struct func_event *fevent, const char *token, enum func_states sta
break;
return FUNC_STATE_VAR;
 
+   case FUNC_STATE_BRACKET:
+   WARN_ON(!fevent->last_arg);
+   ret = kstrtol(token, 0, &val);
+   if (ret)
+   break;
+   val *= fevent->last_arg->size;
+   fevent->last_arg->indirect = val ^ INDIRECT_FLAG;
+   return FUNC_STATE_INDIRECT;
+
+   case FUNC_STATE_INDIRECT:
+   if (token[0] != ']')
+   break;
+   return FUNC_STATE_BRACKET_END;
+
+   case FUNC_STATE_BRACKET_END:
+   switch (token[0]) {
+   case ')':
+   return FUNC_STATE_END;
+   case ',':
+   return FUNC_STATE_COMMA;
+   }
+   break;
+
case FUNC_STATE_VAR:
switch (token[0]) {

[PATCH 07/18] tracing: Add dereferencing multiple fields per arg

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

As an argument may be a structure or an array, we may want to dereference
more than one field per argument. Add a pipe '|' token to the parser that
allows multiple dereferenced fields per function argument.

Change func_arg fields from char to s8 or u8 to allow them to be
subscripts to arrays.
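
To see how a definition with '|' breaks into tokens, here is a small userspace tokenizer sketch. It is not the kernel's next_token(); `is_delim()` and `next_tok()` are hypothetical helpers that only assume the delimiter set visible in this patch: '(', '[', ']', ',', '|' and ')'.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Single-character tokens that also terminate a word. */
static int is_delim(char c)
{
	return c != '\0' && strchr("([],|)", c) != NULL;
}

/*
 * Copy the next token from *p into out (outsz must be at least 2) and
 * advance *p past it. Over-long words are truncated. Returns 1 if a
 * token was produced and 0 at the end of the input.
 */
static int next_tok(const char **p, char *out, size_t outsz)
{
	const char *s = *p;
	size_t n = 0;

	while (isspace((unsigned char)*s))
		s++;
	if (*s == '\0') {
		*p = s;
		return 0;
	}

	if (is_delim(*s)) {
		out[n++] = *s++;
	} else {
		while (*s && !isspace((unsigned char)*s) && !is_delim(*s)) {
			if (n + 1 < outsz)
				out[n++] = *s;
			s++;
		}
	}
	out[n] = '\0';
	*p = s;
	return 1;
}
```

Calling `next_tok()` in a loop over "ip_rcv(x64 skb | int skb[32], x64 dev)" yields the pipe as a token of its own, which is the point where a parser would start a second field for the same argument.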

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst | 20 +-
 kernel/trace/trace_event_ftrace.c | 29 ---
 2 files changed, 41 insertions(+), 8 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst b/Documentation/trace/function-based-events.rst
index 7d67229e8e88..2a002c8a500b 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -91,7 +91,7 @@ as follows:
 
  ARGS := ARG | ARG ',' ARGS | ''
 
- ARG := TYPE FIELD
+ ARG := TYPE FIELD | ARG '|' ARG
 
  TYPE := ATOM
 
@@ -158,3 +158,21 @@ As 128 / 4 (length of int) is 32, we can see the length of the skb with:
 -0 [003] ..s3   280.807023: __netif_receive_skb_core->ip_rcv(skb=52, dev=8801092f9400)
 
 Now we see the length of the sk_buff per event.
+
+
+Multiple fields per argument
+============================
+
+
+If we still want to see the skb pointer value along with the length of the
+skb, then using the '|' option allows us to add more than one option to
+an argument:
+
+ # echo 'ip_rcv(x64 skb | int skb[32], x64 dev)' > function_events
+
+ # echo 1 > events/functions/ip_rcv/enable
+ # cat trace
+-0 [003] ..s3   904.075838: __netif_receive_skb_core->ip_rcv(skb=88011396e800, skb=52, dev=880115204000)
+-0 [003] ..s3   904.075848: __netif_receive_skb_core->ip_rcv(skb=88011396e800, skb=52, dev=880115204000)
+-0 [003] ..s3   904.725486: __netif_receive_skb_core->ip_rcv(skb=88011396e800, skb=194, dev=880115204000)
+-0 [003] ..s3   905.152537: __netif_receive_skb_core->ip_rcv(skb=88011396f200, skb=88, dev=880115204000)
diff --git a/kernel/trace/trace_event_ftrace.c b/kernel/trace/trace_event_ftrace.c
index 5d37498d1c6b..8c9d4a92deab 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -21,8 +21,8 @@ struct func_arg {
longindirect;
short   offset;
short   size;
-   chararg;
-   charsign;
+   s8  arg;
+   u8  sign;
 };
 
 struct func_event {
@@ -60,6 +60,7 @@ enum func_states {
FUNC_STATE_BRACKET,
FUNC_STATE_BRACKET_END,
FUNC_STATE_INDIRECT,
+   FUNC_STATE_PIPE,
FUNC_STATE_TYPE,
FUNC_STATE_VAR,
FUNC_STATE_COMMA,
@@ -179,6 +180,7 @@ static char *next_token(char **ptr, char *last)
*str == '[' ||
*str == ']' ||
*str == ',' ||
+   *str == '|' ||
*str == ')')
break;
}
@@ -251,11 +253,15 @@ process_event(struct func_event *fevent, const char *token, enum func_states sta
break;
return FUNC_STATE_PARAM;
 
+   case FUNC_STATE_PIPE:
+   fevent->arg_cnt--;
+   goto comma;
case FUNC_STATE_PARAM:
if (token[0] == ')')
return FUNC_STATE_END;
/* Fall through */
case FUNC_STATE_COMMA:
+ comma:
for (i = 0; func_types[i].size; i++) {
if (strcmp(token, func_types[i].name) == 0)
break;
@@ -297,6 +303,8 @@ process_event(struct func_event *fevent, const char *token, enum func_states sta
return FUNC_STATE_END;
case ',':
return FUNC_STATE_COMMA;
+   case '|':
+   return FUNC_STATE_PIPE;
}
break;
 
@@ -306,6 +314,8 @@ process_event(struct func_event *fevent, const char *token, enum func_states sta
return FUNC_STATE_END;
case ',':
return FUNC_STATE_COMMA;
+   case '|':
+   return FUNC_STATE_PIPE;
case '[':
return FUNC_STATE_BRACKET;
}
@@ -364,7 +374,6 @@ static void func_event_trace(struct trace_event_file *trace_file,
int nr_args;
int size;
int pc;
-   int i = 0;
 
if (trace_trigger_soft_disabled(trace_file))
return;
@@ -386,8 +395,8 @@ static void func_event_trace(struct trace_event_file *trace_file,
nr_args = arch_get_func_args(pt_regs, 0, func_event->arg_cnt, args);
 
list_for_each_entry(arg, &func_event-

[PATCH 18/18] tracing/perf: Allow perf to use function based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Have perf use function based events.

 # echo 'SyS_openat(int dfd, string buf, x32 flags, x32 mode)' > /sys/kernel/tracing/function_events
 # perf record -e functions:SyS_openat grep task_forks /proc/kallsyms
 # perf script
grep   913 [002]  5713.413239: functions:SyS_openat: entry_SYSCALL_64_fastpath->sys_openat(dfd=-100, buf=/proc/kallsyms, flags=100, mode=0)

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst |   3 +-
 kernel/trace/trace_event_ftrace.c | 134 --
 2 files changed, 104 insertions(+), 33 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst b/Documentation/trace/function-based-events.rst
index 3b341992b93d..6effde96d3d6 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -48,7 +48,8 @@ enable  filter  format  hist  id  trigger
 
 Even though the above function based event does not record much more
 than the function tracer does, it does become a full fledge event.
-This can be used by the histogram infrastructure, and triggers.
+This can be used by the histogram infrastructure, triggers, and perf,
+to which one can attach eBPF programs.
 
  # cat events/functions/do_IRQ/format
 name: do_IRQ
diff --git a/kernel/trace/trace_event_ftrace.c b/kernel/trace/trace_event_ftrace.c
index b5b719680686..b145639eac45 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -747,46 +747,33 @@ static int get_string(unsigned long addr, unsigned int idx,
return len;
 }
 
-static void func_event_trace(struct trace_event_file *trace_file,
-struct func_event *func_event,
-unsigned long ip, unsigned long parent_ip,
-struct pt_regs *pt_regs)
+static int get_event_size(struct func_event *func_event, struct pt_regs *pt_regs,
+ long *args, int *nr_args)
 {
-   struct func_event_hdr *entry;
-   struct trace_event_call *call = &func_event->call;
-   struct ring_buffer_event *event;
-   struct ring_buffer *buffer;
-   struct func_arg *arg;
-   long args[func_event->arg_cnt];
-   long long val = 1;
-   unsigned long irq_flags;
-   int str_offset;
-   int str_idx = 0;
-   int nr_args = 0;
int size;
-   int pc;
-
-   if (trace_trigger_soft_disabled(trace_file))
-   return;
-
-   local_save_flags(irq_flags);
-   pc = preempt_count();
 
-   size = func_event->arg_offset + sizeof(*entry);
+   size = func_event->arg_offset + sizeof(struct func_event_hdr);
 
if (func_event->arg_cnt)
-   nr_args = arch_get_func_args(pt_regs, 0, func_event->arg_cnt, args);
+   *nr_args = arch_get_func_args(pt_regs, 0, func_event->arg_cnt, args);
+   else
+   *nr_args = 0;
 
if (func_event->has_strings)
-   size += calculate_strings(func_event, nr_args, args);
+   size += calculate_strings(func_event, *nr_args, args);
 
-   event = trace_event_buffer_lock_reserve(&buffer, trace_file,
-   call->event.type,
-   size, irq_flags, pc);
-   if (!event)
-   return;
+   return size;
+}
+
+static void
+record_entry(struct func_event_hdr *entry, struct func_event *func_event,
+unsigned long ip, unsigned long parent_ip, int nr_args, long *args)
+{
+   struct func_arg *arg;
+   long long val;
+   int str_offset;
+   int str_idx = 0;
 
-   entry = ring_buffer_event_data(event);
entry->ip = ip;
entry->parent_ip = parent_ip;
 
@@ -809,11 +796,80 @@ static void func_event_trace(struct trace_event_file *trace_file,
} else
memcpy(&entry->data[arg->offset], &val, arg->size);
}
+}
+
+static void func_event_trace(struct trace_event_file *trace_file,
+struct func_event *func_event,
+unsigned long ip, unsigned long parent_ip,
+struct pt_regs *pt_regs)
+{
+   struct func_event_hdr *entry;
+   struct trace_event_call *call = &func_event->call;
+   struct ring_buffer_event *event;
+   struct ring_buffer *buffer;
+   long args[func_event->arg_cnt];
+   unsigned long irq_flags;
+   int nr_args;
+   int size;
+   int pc;
+
+   if (trace_trigger_soft_disabled(trace_file))
+   return;
+
+   local_save_flags(irq_flags);
+   pc = preempt_count();
+
+   size = get_event_size(func_event, pt_regs, args, &nr_args);
+
+   event = trace_event_buffer_lock_reserve(&buffer, trace_file,
+   call->event.type,
+   size, irq_flags, pc);
+

[PATCH 09/18] tracing: Add indexing of arguments for function based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Currently, 8 byte words can only be read at offsets from the argument that
are multiples of 8 bytes. But there may be cases where they are only 4 byte
aligned. To make the capturing of arguments more flexible, add a plus '+'
operator that adds an arbitrary byte offset before indexing, so that any
location can be read.

 u64 arg+4[3]

Will get an 8 byte word at byte offset 28 (3 * 8 + 4)

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst | 24 +++-
 kernel/trace/trace_event_ftrace.c | 18 ++
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/Documentation/trace/function-based-events.rst b/Documentation/trace/function-based-events.rst
index 72e3e7730d63..bdb28f433bfb 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -100,10 +100,12 @@ as follows:
  'x8' | 'x16' | 'x32' | 'x64' |
  'char' | 'short' | 'int' | 'long' | 'size_t'
 
- FIELD := <name> | <name> INDEX
+ FIELD := <name> | <name> INDEX | <name> OFFSET | <name> OFFSET INDEX
 
  INDEX := '[' <number> ']'
 
+ OFFSET := '+' <number>
+
  Where <name> is a unique string starting with an alphabetic character
  and consists only of letters and numbers and underscores.
 
@@ -221,3 +223,23 @@ format:
 print fmt: "%pS->%pS(skb=%u)", REC->__ip, REC->__parent_ip, REC->skb
 
 It is now printed with a "%u".
+
+
+Offsets
+=======
+
+After the name of the variable, brackets '[' number ']' will index the value of
+the argument by the number given times the size of the field.
+
+ int field[5] will dereference the value of the argument 20 bytes away (4 * 5)
+  as sizeof(int) is 4.
+
+If there's a case where the type is of 8 bytes in size but is not 8 bytes
+aligned in the structure, an offset may be required.
+
+  For example: x64 param+4[2]
+
+The above will take the parameter value, add 4 to it, then index it by two
+8 byte words. It's the same in C as: ((u64 *)((void *)param + 4))[2]
+
+ Note: "int skb[32]" is the same as "int skb+4[31]".
diff --git a/kernel/trace/trace_event_ftrace.c b/kernel/trace/trace_event_ftrace.c
index 9548b93eb8cd..4c23fa18453d 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -19,6 +19,7 @@ struct func_arg {
char*type;
char*name;
longindirect;
+   longindex;
short   offset;
short   size;
s8  arg;
@@ -62,6 +63,7 @@ enum func_states {
FUNC_STATE_INDIRECT,
FUNC_STATE_UNSIGNED,
FUNC_STATE_PIPE,
+   FUNC_STATE_PLUS,
FUNC_STATE_TYPE,
FUNC_STATE_VAR,
FUNC_STATE_COMMA,
@@ -182,6 +184,7 @@ static char *next_token(char **ptr, char *last)
*str == ']' ||
*str == ',' ||
*str == '|' ||
+   *str == '+' ||
*str == ')')
break;
}
@@ -323,6 +326,15 @@ process_event(struct func_event *fevent, const char *token, enum func_states sta
}
break;
 
+   case FUNC_STATE_PLUS:
+   if (WARN_ON(!fevent->last_arg))
+   break;
+   ret = kstrtol(token, 0, &val);
+   if (ret)
+   break;
+   fevent->last_arg->index += val;
+   return FUNC_STATE_VAR;
+
case FUNC_STATE_VAR:
switch (token[0]) {
case ')':
@@ -331,6 +343,8 @@ process_event(struct func_event *fevent, const char *token, enum func_states sta
return FUNC_STATE_COMMA;
case '|':
return FUNC_STATE_PIPE;
+   case '+':
+   return FUNC_STATE_PLUS;
case '[':
return FUNC_STATE_BRACKET;
}
@@ -347,6 +361,8 @@ static long long get_arg(struct func_arg *arg, unsigned long val)
char buf[8];
int ret;
 
+   val += arg->index;
+
if (!arg->indirect)
return val;
 
@@ -779,6 +795,8 @@ static int func_event_seq_show(struct seq_file *m, void *v)
last_arg = arg->arg;
comma = true;
seq_printf(m, "%s %s", arg->type, arg->name);
+   if (arg->index)
+   seq_printf(m, "+%ld", arg->index);
if (arg->indirect && arg->size)
seq_printf(m, "[%ld]",
   (arg->indirect ^ INDIRECT_FLAG) / arg->size);
-- 
2.15.1
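
The combined offset and index arithmetic described in this patch can be checked with a short userspace sketch. The helper names `field_byte_offset()` and `read_u64_at()` are hypothetical and only model the "name+plus[index]" notation; they are not functions from the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offset reached by "<type> name+plus[index]" for a given type size. */
static size_t field_byte_offset(size_t plus, size_t index, size_t elem_size)
{
	return plus + index * elem_size;
}

/* Read the 8 byte word that "x64 name+plus[index]" refers to. */
static uint64_t read_u64_at(const void *base, size_t plus, size_t index)
{
	uint64_t val;
	size_t off = field_byte_offset(plus, index, sizeof(val));

	/* memcpy avoids an unaligned load when off is not a multiple of 8 */
	memcpy(&val, (const unsigned char *)base + off, sizeof(val));
	return val;
}
```

For instance, `field_byte_offset(4, 3, 8)` is 28, and `field_byte_offset(0, 32, 4)` equals `field_byte_offset(4, 31, 4)`, which matches the note that "int skb[32]" and "int skb+4[31]" name the same location.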

[PATCH 12/18] tracing: Add accessing direct address from function based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Allow referencing any address during the function based event. The syntax is
to use a name followed by '=' and the address. For example:

 # echo 'do_IRQ(long total_forks=0xa2a4b4c0)' > function_events
 # echo 1 > events/function/enable
 # cat trace
sshd-832   [000] d... 221639.210845: ret_from_intr->do_IRQ(total_forks=855)
sshd-832   [000] d... 221639.24: ret_from_intr->do_IRQ(total_forks=855)
  -0 [000] d... 221639.211198: ret_from_intr->do_IRQ(total_forks=855)
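
As an illustration of what accepting such an address token involves, here is a userspace sketch of a strict hexadecimal parser. It is an assumption about reasonable validation, not the parsing code from this patch, and `parse_addr()` is a hypothetical helper.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse an address token that must start with "0x" and contain only
 * hexadecimal digits after it. Returns 0 and stores the value in *addr
 * on success, -1 otherwise.
 */
static int parse_addr(const char *tok, unsigned long long *addr)
{
	char *end;

	if (tok[0] != '0' || (tok[1] != 'x' && tok[1] != 'X') ||
	    tok[2] == '\0')
		return -1;

	errno = 0;
	*addr = strtoull(tok, &end, 16);
	if (errno || *end != '\0')
		return -1;
	return 0;
}
```

Requiring the "0x" prefix keeps an address from being confused with a plain name or a decimal index.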

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst |  40 +++-
 kernel/trace/trace_event_ftrace.c | 129 +-
 2 files changed, 143 insertions(+), 26 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst b/Documentation/trace/function-based-events.rst
index f18c8f3ef330..b0e6725f3032 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -91,7 +91,7 @@ as follows:
 
  ARGS := ARG | ARG ',' ARGS | ''
 
- ARG := TYPE FIELD | ARG '|' ARG
+ ARG := TYPE FIELD | TYPE <name> '=' ADDR | TYPE ADDR | ARG '|' ARG
 
  TYPE := ATOM | 'unsigned' ATOM
 
@@ -107,6 +107,8 @@ as follows:
 
  OFFSET := '+' <number>
 
+ ADDR := A hexadecimal address starting with '0x'
+
  Where <name> is a unique string starting with an alphabetic character
  and consists only of letters and numbers and underscores.
 
@@ -267,3 +269,39 @@ Again, using gdb to find the offset of the "func" field of struct work_struct
   -0 [000] dNs3  6241.172004: delayed_work_timer_fn->__queue_work(cpu=128, wq=88011a010800, func=vmstat_shepherd+0x0/0xb0)
   worker/0:2-1689  [000] d..2  6241.172026: __queue_delayed_work->__queue_work(cpu=7, wq=88011a11da00, func=vmstat_update+0x0/0x70)
   -0 [005] d.s3  6241.347996: queue_work_on->__queue_work(cpu=128, wq=88011a011200, func=fb_flashcursor+0x0/0x110 [fb])
+
+
+Direct memory access
+====================
+
+Function arguments are not the only thing that can be recorded from a function
+based event. Memory addresses can also be examined. If there's a global variable
+that you want to monitor via an interrupt, you can put in the address directly.
+
+  # grep total_forks /proc/kallsyms
+82354c18 B total_forks
+
+  # echo 'do_IRQ(int total_forks=0x82354c18)' > function_events
+
+  # echo 1 > events/functions/do_IRQ/enable
+  # cat trace
+-0 [003] d..3   337.076709: ret_from_intr->do_IRQ(total_forks=1419)
+-0 [003] d..3   337.077046: ret_from_intr->do_IRQ(total_forks=1419)
+-0 [003] d..3   337.077076: ret_from_intr->do_IRQ(total_forks=1420)
+
+Note, address notations do not affect the argument count. For instance, with
+
+__visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
+
+  # echo 'do_IRQ(int total_forks=0x82354c18, symbol regs[16])' > function_events
+
+Is the same as
+
+  # echo 'do_IRQ(int total_forks=0x82354c18 | symbol regs[16])' > function_events
+
+  # cat trace
+-0 [003] d..3   653.839546: ret_from_intr->do_IRQ(total_forks=1504, regs=cpuidle_enter_state+0xb1/0x330)
+-0 [003] d..3   653.906011: ret_from_intr->do_IRQ(total_forks=1504, regs=cpuidle_enter_state+0xb1/0x330)
+-0 [003] d..3   655.823498: ret_from_intr->do_IRQ(total_forks=1504, regs=tick_nohz_idle_enter+0x4c/0x50)
+-0 [003] d..3   655.954096: ret_from_intr->do_IRQ(total_forks=1504, regs=cpuidle_enter_state+0xb1/0x330)
+
diff --git a/kernel/trace/trace_event_ftrace.c b/kernel/trace/trace_event_ftrace.c
index ba10177b9bd6..206114f192be 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -63,6 +63,8 @@ enum func_states {
FUNC_STATE_BRACKET_END,
FUNC_STATE_INDIRECT,
FUNC_STATE_UNSIGNED,
+   FUNC_STATE_ADDR,
+   FUNC_STATE_EQUAL,
FUNC_STATE_PIPE,
FUNC_STATE_PLUS,
FUNC_STATE_TYPE,
@@ -199,6 +201,7 @@ static char *next_token(char **ptr, char *last)
*str == ',' ||
*str == '|' ||
*str == '+' ||
+   *str == '=' ||
*str == ')')
break;
}
@@ -243,12 +246,39 @@ static int add_arg(struct func_event *fevent, int ftype, int unsign)
arg->sign = func_type->sign;
arg->offset = ALIGN(fevent->arg_offset, arg->size);
arg->func_type = ftype;
-   arg->arg = fevent->arg_cnt;
fevent->arg_offset = arg->offset + arg->size;
 
list_add_tail(&arg->list, &fevent->args);
fevent->last_arg = arg;
-   fevent->arg_cnt++;
+
+   return 0;
+}
+
+static int update_arg_name(struct func_event *fevent, const char *name)
+{
+   struct func_arg *arg = fevent->last_arg;
+
+   if (WARN_ON(!arg))
+   return -EINVAL;
+
+   arg->name = kstrdup(name, GFP_KERNEL);
+   if (!arg->name)
+   return -ENO

[PATCH 08/18] tracing: Add "unsigned" to function based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Add "unsigned" to the format processing for creating dynamic function based
events. For example: "unsigned long" now works.

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst | 47 ++-
 kernel/trace/trace_event_ftrace.c | 23 ++---
 2 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst b/Documentation/trace/function-based-events.rst
index 2a002c8a500b..72e3e7730d63 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -93,7 +93,7 @@ as follows:
 
  ARG := TYPE FIELD | ARG '|' ARG
 
- TYPE := ATOM
+ TYPE := ATOM | 'unsigned' ATOM
 
  ATOM := 'u8' | 'u16' | 'u32' | 'u64' |
  's8' | 's16' | 's32' | 's64' |
@@ -176,3 +176,48 @@ an argument:
 -0 [003] ..s3   904.075848: __netif_receive_skb_core->ip_rcv(skb=88011396e800, skb=52, dev=880115204000)
 -0 [003] ..s3   904.725486: __netif_receive_skb_core->ip_rcv(skb=88011396e800, skb=194, dev=880115204000)
 -0 [003] ..s3   905.152537: __netif_receive_skb_core->ip_rcv(skb=88011396f200, skb=88, dev=880115204000)
+
+
+Unsigned usage
+==============
+
+One can also use "unsigned" to make some types unsigned. It works with
+"long", "int", "short" and "char". It is accepted for other types without
+error, but may not make any sense there.
+
+ # echo 'ip_rcv(int skb[32])' > function_events
+ # cat events/functions/ip_rcv/format
+name: ip_rcv
+ID: 1397
+format:
+   field:unsigned short common_type;   offset:0;   size:2; 
signed:0;
+   field:unsigned char common_flags;   offset:2;   size:1; 
signed:0;
+   field:unsigned char common_preempt_count;   offset:3;   size:1; 
signed:0;
+   field:int common_pid;   offset:4;   size:4; signed:1;
+
+   field:unsigned long __parent_ip;offset:8;   size:8; 
signed:0;
+   field:unsigned long __ip;   offset:16;  size:8; signed:0;
+   field:int skb;  offset:24;  size:4; signed:1;
+
+print fmt: "%pS->%pS(skb=%d)", REC->__ip, REC->__parent_ip, REC->skb
+
+
+Notice that REC->skb is printed with "%d". By adding "unsigned"
+
+ # echo 'ip_rcv(unsigned int skb[32])' > function_events
+ # cat events/functions/ip_rcv/format
+name: ip_rcv
+ID: 1398
+format:
+   field:unsigned short common_type;   offset:0;   size:2; 
signed:0;
+   field:unsigned char common_flags;   offset:2;   size:1; 
signed:0;
+   field:unsigned char common_preempt_count;   offset:3;   size:1; 
signed:0;
+   field:int common_pid;   offset:4;   size:4; signed:1;
+
+   field:unsigned long __parent_ip;offset:8;   size:8; 
signed:0;
+   field:unsigned long __ip;   offset:16;  size:8; signed:0;
+   field:unsigned int skb; offset:24;  size:4; signed:0;
+
+print fmt: "%pS->%pS(skb=%u)", REC->__ip, REC->__parent_ip, REC->skb
+
+It is now printed with a "%u".
diff --git a/kernel/trace/trace_event_ftrace.c 
b/kernel/trace/trace_event_ftrace.c
index 8c9d4a92deab..9548b93eb8cd 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -60,6 +60,7 @@ enum func_states {
FUNC_STATE_BRACKET,
FUNC_STATE_BRACKET_END,
FUNC_STATE_INDIRECT,
+   FUNC_STATE_UNSIGNED,
FUNC_STATE_PIPE,
FUNC_STATE_TYPE,
FUNC_STATE_VAR,
@@ -198,7 +199,7 @@ static char *next_token(char **ptr, char *last)
return arg;
 }
 
-static int add_arg(struct func_event *fevent, int ftype)
+static int add_arg(struct func_event *fevent, int ftype, int unsign)
 {
struct func_type *func_type = &func_types[ftype];
struct func_arg *arg;
@@ -211,13 +212,18 @@ static int add_arg(struct func_event *fevent, int ftype)
if (!arg)
return -ENOMEM;
 
-   arg->type = kstrdup(func_type->name, GFP_KERNEL);
+   if (unsign)
+   arg->type = kasprintf(GFP_KERNEL, "unsigned %s",
+ func_type->name);
+   else
+   arg->type = kstrdup(func_type->name, GFP_KERNEL);
if (!arg->type) {
kfree(arg);
return -ENOMEM;
}
arg->size = func_type->size;
-   arg->sign = func_type->sign;
+   if (!unsign)
+   arg->sign = func_type->sign;
arg->offset = ALIGN(fevent->arg_offset, arg->size);
arg->arg = fevent->arg_cnt;
fevent->arg_offset = arg->offset + arg->size;
@@ -232,12 +238,14 @@ static int add_arg(struct func_event *fevent, int ftype)
 static enum func_states
 process_event(struct func_event *fevent, const char *token, enum func_states 
state)
 {
+   static int unsign;
long val;
int ret;
int i;
 
switch (state) {
case FUNC_STATE_INIT:
+   unsign = 0;
if (!isalpha(token[0]))
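
For readers following the add_arg() hunk above, here is a minimal userspace
sketch of how the "unsigned" flag changes the recorded type string and the
sign bit. The struct type_desc and describe_arg() names are made up for this
illustration and do not appear in the patch; snprintf() stands in for the
kernel's kasprintf()/kstrdup().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct func_type. */
struct type_desc {
	const char *name;
	int size;
	int sign;
};

/*
 * Build the type string and signedness for one argument, following the
 * add_arg() hunk: "unsigned " is prepended to the type name, and the
 * sign flag is only copied from the base type when "unsigned" was not
 * given (it is reported as 0 otherwise).
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int describe_arg(const struct type_desc *t, int unsign,
			char *type, size_t len, int *sign)
{
	int n;

	assert(t && type && sign);
	n = snprintf(type, len, "%s%s", unsign ? "unsigned " : "", t->name);
	if (n < 0 || (size_t)n >= len)
		return -1;
	*sign = unsign ? 0 : t->sign;
	return 0;
}
```

With a base type of "int", passing unsign=1 yields the type string
"unsigned int" with sign 0, which matches the "signed:0" field and the "%u"
format shown in the documentation hunk.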

[PATCH 11/18] tracing: Add symbol type to function based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Add a special type "symbol" that will use %pS to display the field of a
function based event.

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst | 26 +-
 kernel/trace/trace_event_ftrace.c | 13 ++---
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst 
b/Documentation/trace/function-based-events.rst
index bdb28f433bfb..f18c8f3ef330 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -98,7 +98,8 @@ as follows:
  ATOM := 'u8' | 'u16' | 'u32' | 'u64' |
  's8' | 's16' | 's32' | 's64' |
  'x8' | 'x16' | 'x32' | 'x64' |
- 'char' | 'short' | 'int' | 'long' | 'size_t'
+ 'char' | 'short' | 'int' | 'long' | 'size_t' |
+'symbol'
 
  FIELD := <name> | <name> INDEX | <name> OFFSET | <name> OFFSET INDEX
 
@@ -243,3 +244,26 @@ The above will take the parameter value, add it by 4, then 
index it by two
 8 byte words. It's the same in C as: (u64 *)((void *)param + 4)[2]
 
  Note: "int skb[32]" is the same as "int skb+4[31]".
+
+
+Symbols (function names)
+
+
+To display kallsyms "%pS" type of output, use the special type "symbol".
+
+Again, using gdb to find the offset of the "func" field of struct work_struct
+
+(gdb) printf "%d\n", &((struct work_struct *)0)->func
+24
+
+ Both "symbol func[3]" and "symbol func+24[0]" will work.
+
+ # echo '__queue_work(int cpu, x64 wq, symbol func[3])' > function_events
+
+ # echo 1 > events/functions/__queue_work/enable
+ # cat trace
+   bash-1641  [007] d..2  6241.171332: 
queue_work_on->__queue_work(cpu=128, wq=88011a010e00, 
func=flush_to_ldisc+0x0/0xa0)
+   bash-1641  [007] d..2  6241.171460: 
queue_work_on->__queue_work(cpu=128, wq=88011a010e00, 
func=flush_to_ldisc+0x0/0xa0)
+ -0 [000] dNs3  6241.172004: 
delayed_work_timer_fn->__queue_work(cpu=128, wq=88011a010800, 
func=vmstat_shepherd+0x0/0xb0)
+ worker/0:2-1689  [000] d..2  6241.172026: 
__queue_delayed_work->__queue_work(cpu=7, wq=88011a11da00, 
func=vmstat_update+0x0/0x70)
+ -0 [005] d.s3  6241.347996: 
queue_work_on->__queue_work(cpu=128, wq=88011a011200, 
func=fb_flashcursor+0x0/0x110 [fb])
diff --git a/kernel/trace/trace_event_ftrace.c 
b/kernel/trace/trace_event_ftrace.c
index 0f2650e97e49..ba10177b9bd6 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -76,6 +76,7 @@ typedef u64 x64;
 typedef u32 x32;
 typedef u16 x16;
 typedef u8 x8;
+typedef void * symbol;
 
 #define TYPE_TUPLE(type)   \
{ #type, sizeof(type), is_signed_type(type) }
@@ -97,7 +98,8 @@ typedef u8 x8;
TYPE_TUPLE(x16),\
TYPE_TUPLE(u8), \
TYPE_TUPLE(s8), \
-   TYPE_TUPLE(x8)
+   TYPE_TUPLE(x8), \
+   TYPE_TUPLE(symbol)
 
 static struct func_type {
char*name;
@@ -262,7 +264,7 @@ process_event(struct func_event *fevent, const char *token, 
enum func_states sta
switch (state) {
case FUNC_STATE_INIT:
unsign = 0;
-   if (!isalpha(token[0]))
+   if (!isalpha(token[0]) && token[0] != '_')
break;
/* Do not allow wild cards */
if (strstr(token, "*") || strstr(token, "?"))
@@ -305,7 +307,7 @@ process_event(struct func_event *fevent, const char *token, 
enum func_states sta
return FUNC_STATE_TYPE;
 
case FUNC_STATE_TYPE:
-   if (!isalpha(token[0]))
+   if (!isalpha(token[0]) || token[0] == '_')
break;
if (WARN_ON(!fevent->last_arg))
break;
@@ -472,6 +474,11 @@ static void make_fmt(struct func_arg *arg, char *fmt)
 {
int c = 0;
 
+   if (arg->func_type == FUNC_TYPE_symbol) {
+   strcpy(fmt, "%pS");
+   return;
+   }
+
fmt[c++] = '%';
 
if (arg->size == 8) {
-- 
2.15.1
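
The gdb expression in the documentation hunk is the classic offsetof idiom,
and the claim that "symbol func[3]" and "symbol func+24[0]" are equivalent is
just arithmetic on 8 byte words. A small sketch, assuming a mock structure
(struct mock_work is hypothetical and is not the kernel's work_struct) and a
helper name invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock; not the kernel's struct work_struct. */
struct mock_work {
	void *data;
	void *next;
	void *prev;
	void (*func)(struct mock_work *);
};

/*
 * Byte offset reached by the FIELD syntax "name+offset[index]" when the
 * type is word_size bytes wide.
 */
static size_t field_byte_offset(size_t offset, size_t index, size_t word_size)
{
	assert(word_size > 0);
	return offset + index * word_size;
}
```

For an 8 byte type, field_byte_offset(0, 3, 8) and field_byte_offset(24, 0, 8)
both give 24, and for a 4 byte int, "skb[32]" and "skb+4[31]" both give 128,
in line with the note in the documentation.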

[PATCH 10/18] tracing: Make func_type enums for easier comparing of arg types

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

For the function based event args, knowing quickly what type they are is
advantageous, as decisions can be made quickly based on them. Having an
enum for the types is useful for this purpose.

Use macros to create both the func_type array as well as enums that
match the type to the index into that array.

Signed-off-by: Steven Rostedt (VMware) 
---
 kernel/trace/trace_event_ftrace.c | 47 +--
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/kernel/trace/trace_event_ftrace.c 
b/kernel/trace/trace_event_ftrace.c
index 4c23fa18453d..0f2650e97e49 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -24,6 +24,7 @@ struct func_arg {
short   size;
s8  arg;
u8  sign;
+   u8  func_type;
 };
 
 struct func_event {
@@ -79,31 +80,42 @@ typedef u8 x8;
 #define TYPE_TUPLE(type)   \
{ #type, sizeof(type), is_signed_type(type) }
 
+#define FUNC_TYPES \
+   TYPE_TUPLE(long),   \
+   TYPE_TUPLE(int),\
+   TYPE_TUPLE(short),  \
+   TYPE_TUPLE(char),   \
+   TYPE_TUPLE(size_t), \
+   TYPE_TUPLE(u64),\
+   TYPE_TUPLE(s64),\
+   TYPE_TUPLE(x64),\
+   TYPE_TUPLE(u32),\
+   TYPE_TUPLE(s32),\
+   TYPE_TUPLE(x32),\
+   TYPE_TUPLE(u16),\
+   TYPE_TUPLE(s16),\
+   TYPE_TUPLE(x16),\
+   TYPE_TUPLE(u8), \
+   TYPE_TUPLE(s8), \
+   TYPE_TUPLE(x8)
+
 static struct func_type {
char*name;
int size;
int sign;
 } func_types[] = {
-   TYPE_TUPLE(long),
-   TYPE_TUPLE(int),
-   TYPE_TUPLE(short),
-   TYPE_TUPLE(char),
-   TYPE_TUPLE(size_t),
-   TYPE_TUPLE(u64),
-   TYPE_TUPLE(s64),
-   TYPE_TUPLE(x64),
-   TYPE_TUPLE(u32),
-   TYPE_TUPLE(s32),
-   TYPE_TUPLE(x32),
-   TYPE_TUPLE(u16),
-   TYPE_TUPLE(s16),
-   TYPE_TUPLE(x16),
-   TYPE_TUPLE(u8),
-   TYPE_TUPLE(s8),
-   TYPE_TUPLE(x8),
+   FUNC_TYPES,
{ NULL, 0,  0 }
 };
 
+#undef TYPE_TUPLE
+#define TYPE_TUPLE(type)   FUNC_TYPE_##type
+
+enum {
+   FUNC_TYPES,
+   FUNC_TYPE_MAX
+};
+
 /**
  * arch_get_func_args - retrieve function arguments via pt_regs
  * @regs: The registers at the moment the function is called
@@ -228,6 +240,7 @@ static int add_arg(struct func_event *fevent, int ftype, 
int unsign)
if (!unsign)
arg->sign = func_type->sign;
arg->offset = ALIGN(fevent->arg_offset, arg->size);
+   arg->func_type = ftype;
arg->arg = fevent->arg_cnt;
fevent->arg_offset = arg->offset + arg->size;
 
-- 
2.15.1
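
The macro trick in this patch (expand one list twice under two definitions of
TYPE_TUPLE) can be tried outside the kernel. Below is a reduced sketch with
only four types; IS_SIGNED() is a local stand-in for the kernel's
is_signed_type(), and func_type_name() is a helper added for the example.

```c
#include <stddef.h>

/* Local stand-in for the kernel's is_signed_type(). */
#define IS_SIGNED(type)		(((type)(-1)) < (type)1)

/* Shortened list; the patch carries the full set of types. */
#define FUNC_TYPES		\
	TYPE_TUPLE(long),	\
	TYPE_TUPLE(int),	\
	TYPE_TUPLE(short),	\
	TYPE_TUPLE(char)

/* First expansion: one table row per type. */
#define TYPE_TUPLE(type)	{ #type, sizeof(type), IS_SIGNED(type) }

static const struct func_type {
	const char *name;
	size_t size;
	int sign;
} func_types[] = {
	FUNC_TYPES,
	{ NULL, 0, 0 }
};

/* Second expansion: enum constants whose values are the row indices. */
#undef TYPE_TUPLE
#define TYPE_TUPLE(type)	FUNC_TYPE_##type

enum {
	FUNC_TYPES,
	FUNC_TYPE_MAX
};

/* Look up a type name by enum value; NULL if out of range. */
static const char *func_type_name(int ftype)
{
	if (ftype < 0 || ftype >= FUNC_TYPE_MAX)
		return NULL;
	return func_types[ftype].name;
}
```

Because both expansions come from the same list, adding a type in one place
keeps the table and the enum in step, which is what lets later patches compare
arg->func_type against FUNC_TYPE_symbol or FUNC_TYPE_char directly.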

[PATCH 13/18] tracing: Add array type to function based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Add syntax to allow the user to create an array type. Brackets after the
type field will denote that this is an array type. For example:

 # echo 'SyS_open(x8[32] buf, x32 flags, x32 mode)' > function_events

Will make the first argument of the sys_open function call an array of
32 bytes.

The array type can also be used in conjunction with the indirect offset
brackets as well. For example to get the interrupt stack of regs in do_IRQ()
for x86_64.

 # echo 'do_IRQ(x64[5] regs[16])' > function_events

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst |  22 +++-
 kernel/trace/trace_event_ftrace.c | 157 +-
 2 files changed, 151 insertions(+), 28 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst 
b/Documentation/trace/function-based-events.rst
index b0e6725f3032..4a8a6fb16a0a 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -93,7 +93,7 @@ as follows:
 
  ARG := TYPE FIELD | TYPE <name> '=' ADDR | TYPE ADDR | ARG '|' ARG
 
- TYPE := ATOM | 'unsigned' ATOM
+ TYPE := ATOM | ATOM '[' <number> ']' | 'unsigned' TYPE
 
  ATOM := 'u8' | 'u16' | 'u32' | 'u64' |
  's8' | 's16' | 's32' | 's64' |
@@ -305,3 +305,23 @@ Is the same as
 -0 [003] d..3   655.823498: 
ret_from_intr->do_IRQ(total_forks=1504, regs=tick_nohz_idle_enter+0x4c/0x50)
 -0 [003] d..3   655.954096: 
ret_from_intr->do_IRQ(total_forks=1504, regs=cpuidle_enter_state+0xb1/0x330)
 
+
+Array types
+===
+
+If there's a case where you want to see an array of a type, then you can
+declare a type as an array by adding '[' number ']' after the type.
+
+To get the net_device perm_addr, from the dev parameter.
+
+ (gdb) printf "%d\n", &((struct net_device *)0)->perm_addr
+558
+
+ # echo 'ip_rcv(x64 skb, x8[6] perm_addr+558)' > function_events
+
+ # echo 1 > events/functions/ip_rcv/enable
+ # cat trace
+-0 [003] ..s3   219.813582: 
__netif_receive_skb_core->ip_rcv(skb=880118195e00, 
perm_addr=b4,b5,2f,ce,18,65)
+-0 [003] ..s3   219.813595: 
__netif_receive_skb_core->ip_rcv(skb=880118195e00, 
perm_addr=b4,b5,2f,ce,18,65)
+-0 [003] ..s3   220.115053: 
__netif_receive_skb_core->ip_rcv(skb=880118195c00, 
perm_addr=b4,b5,2f,ce,18,65)
+-0 [003] ..s3   220.115293: 
__netif_receive_skb_core->ip_rcv(skb=880118195c00, 
perm_addr=b4,b5,2f,ce,18,65)
diff --git a/kernel/trace/trace_event_ftrace.c 
b/kernel/trace/trace_event_ftrace.c
index 206114f192be..64e2d7dcfd18 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -20,6 +20,7 @@ struct func_arg {
char*name;
longindirect;
longindex;
+   short   array;
short   offset;
short   size;
s8  arg;
@@ -68,6 +69,9 @@ enum func_states {
FUNC_STATE_PIPE,
FUNC_STATE_PLUS,
FUNC_STATE_TYPE,
+   FUNC_STATE_ARRAY,
+   FUNC_STATE_ARRAY_SIZE,
+   FUNC_STATE_ARRAY_END,
FUNC_STATE_VAR,
FUNC_STATE_COMMA,
FUNC_STATE_END,
@@ -289,6 +293,7 @@ process_event(struct func_event *fevent, const char *token, 
enum func_states sta
static bool update_arg;
static int unsign;
unsigned long val;
+   char *type;
int ret;
int i;
 
@@ -339,6 +344,10 @@ process_event(struct func_event *fevent, const char 
*token, enum func_states sta
return FUNC_STATE_TYPE;
 
case FUNC_STATE_TYPE:
+   if (token[0] == '[')
+   return FUNC_STATE_ARRAY;
+   /* Fall through */
+   case FUNC_STATE_ARRAY_END:
if (WARN_ON(!fevent->last_arg))
break;
if (update_arg_name(fevent, token) < 0)
@@ -350,14 +359,37 @@ process_event(struct func_event *fevent, const char 
*token, enum func_states sta
update_arg = true;
return FUNC_STATE_VAR;
 
+   case FUNC_STATE_ARRAY:
case FUNC_STATE_BRACKET:
-   WARN_ON(!fevent->last_arg);
+   if (WARN_ON(!fevent->last_arg))
+   break;
ret = kstrtoul(token, 0, &val);
if (ret)
break;
-   val *= fevent->last_arg->size;
-   fevent->last_arg->indirect = val ^ INDIRECT_FLAG;
-   return FUNC_STATE_INDIRECT;
+   if (state == FUNC_STATE_BRACKET) {
+   val *= fevent->last_arg->size;
+   fevent->last_arg->indirect = val ^ INDIRECT_FLAG;
+   return FUNC_STATE_INDIRECT;
+   }
+   if (val <= 0)
+   break;
+   fevent->last_a
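
The trace output in the documentation hunk prints an x8[6] array as
comma-separated hex bytes. The print path is cut off in this copy of the diff,
so the following is only a userspace sketch of that output format; the
function name is invented here, and whether the kernel pads single-digit
values is not visible from the sample output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format `count` bytes as comma-separated hex, e.g. "b4,b5,2f".
 * Returns the string length, or -1 if `out` is too small.
 */
static int format_x8_array(const unsigned char *data, size_t count,
			   char *out, size_t len)
{
	size_t used = 0;
	size_t i;

	assert(data && out && len > 0);
	out[0] = '\0';
	for (i = 0; i < count; i++) {
		/* a comma goes before every element except the first */
		int n = snprintf(out + used, len - used, "%s%x",
				 i ? "," : "", data[i]);

		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

Feeding it the six perm_addr bytes from the example gives
"b4,b5,2f,ce,18,65".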

[PATCH 15/18] tracing: Add string type for dynamic strings in function based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Add a "string" type that will create a dynamic length string for the
event. This is the same as the __string() field in normal TRACE_EVENTS.

[ missing 'static' found by Fengguang Wu's kbuild test robot ]
Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst |  19 ++-
 kernel/trace/trace_event_ftrace.c | 183 +++---
 2 files changed, 181 insertions(+), 21 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst 
b/Documentation/trace/function-based-events.rst
index 99ae77cd59e6..6c643ea749e7 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -99,7 +99,7 @@ as follows:
  's8' | 's16' | 's32' | 's64' |
  'x8' | 'x16' | 'x32' | 'x64' |
  'char' | 'short' | 'int' | 'long' | 'size_t' |
-'symbol'
+'symbol' | 'string'
 
  FIELD := <name> | <name> INDEX | <name> OFFSET | <name> OFFSET INDEX
 
@@ -342,3 +342,20 @@ the format "%s". If a nul is found, the output will stop. 
Use another type
   bash-1470  [003] ...2   980.678715: 
path_openat->link_path_walk(name=/lib64/ld-linux-x86-64.so.2)
   bash-1470  [003] ...2   980.678721: 
path_openat->link_path_walk(name=ld-2.24.so)
   bash-1470  [003] ...2   980.678978: 
path_lookupat->link_path_walk(name=/etc/ld.so.preload)
+
+
+Dynamic strings
+===
+
+Static strings are fine, but they can waste a lot of memory in the ring buffer.
+The above allocated 64 bytes for a character array, but most of the output was
+less than 20 characters. Not wanting to truncate strings or waste space on
+the ring buffer, the dynamic string can help.
+
+Use the "string" type for strings that have a large range in size. The max
+size that will be recorded is 512 bytes. If a string is larger than that, then
+it will be truncated.
+
+ # echo 'link_path_walk(string name)' > function_events
+
+Gives the same result as above, but does not waste buffer space.
diff --git a/kernel/trace/trace_event_ftrace.c 
b/kernel/trace/trace_event_ftrace.c
index dd24b840329d..273c5838a8e2 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -39,6 +39,7 @@ struct func_event {
struct func_arg *last_arg;
int arg_cnt;
int arg_offset;
+   int has_strings;
 };
 
 struct func_file {
@@ -83,6 +84,8 @@ typedef u32 x32;
 typedef u16 x16;
 typedef u8 x8;
 typedef void * symbol;
+/* 2 byte offset, 2 byte length */
+typedef u32 string;
 
 #define TYPE_TUPLE(type)   \
{ #type, sizeof(type), is_signed_type(type) }
@@ -105,7 +108,8 @@ typedef void * symbol;
TYPE_TUPLE(u8), \
TYPE_TUPLE(s8), \
TYPE_TUPLE(x8), \
-   TYPE_TUPLE(symbol)
+   TYPE_TUPLE(symbol), \
+   TYPE_TUPLE(string)
 
 static struct func_type {
char*name;
@@ -124,6 +128,16 @@ enum {
FUNC_TYPE_MAX
 };
 
+#define MAX_STR512
+
+/* Two contexts, normal and NMI, hence the " * 2" */
+struct func_string {
+   charbuf[MAX_STR * 2];
+};
+
+static struct func_string __percpu *str_buffer;
+static int nr_strings;
+
 /**
  * arch_get_func_args - retrieve function arguments via pt_regs
  * @regs: The registers at the moment the function is called
@@ -163,6 +177,23 @@ int __weak arch_get_func_args(struct pt_regs *regs,
return 0;
 }
 
+static void free_arg(struct func_arg *arg)
+{
+   list_del(&arg->list);
+   if (arg->func_type == FUNC_TYPE_string) {
+   nr_strings--;
+   if (WARN_ON(nr_strings < 0))
+   nr_strings = 0;
+   if (!nr_strings) {
+   free_percpu(str_buffer);
+   str_buffer = NULL;
+   }
+   }
+   kfree(arg->name);
+   kfree(arg->type);
+   kfree(arg);
+}
+
 static void free_func_event(struct func_event *func_event)
 {
struct func_arg *arg, *n;
@@ -171,10 +202,7 @@ static void free_func_event(struct func_event *func_event)
return;
 
list_for_each_entry_safe(arg, n, &func_event->args, list) {
-   list_del(&arg->list);
-   kfree(arg->name);
-   kfree(arg->type);
-   kfree(arg);
+   free_arg(arg);
}
ftrace_free_filter(&func_event->ops);
kfree(func_event->call.print_fmt);
@@ -255,6 +283,17 @@ static int add_arg(struct func_event *fevent, int ftype, 
int unsign)
list_add_tail(&arg->list, &fevent->args);
fevent->last_arg = arg;
 
+   if (ftype == FUNC_TYPE_string) {
+   fevent->has_strings++;
+   nr_strings++;
+   if (nr_strings == 1) {
+   str_buffer = alloc_
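
The patch comments the new "string" typedef as "2 byte offset, 2 byte length"
stored in a u32, and caps recorded strings at MAX_STR (512). The code that
fills the descriptor is cut off in this copy, so the sketch below is an
assumption about one possible layout (length in the high half, offset in the
low half); all helper names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_STR	512	/* cap taken from the patch */

/* Assumed layout: length in the high 16 bits, offset in the low 16 bits. */
static uint32_t pack_string_desc(uint16_t offset, uint16_t len)
{
	return ((uint32_t)len << 16) | offset;
}

static uint16_t desc_offset(uint32_t desc)
{
	return (uint16_t)(desc & 0xffff);
}

static uint16_t desc_len(uint32_t desc)
{
	return (uint16_t)(desc >> 16);
}

/* Clamp a string length to what a single record may hold. */
static size_t clamp_len(size_t len)
{
	return len > MAX_STR ? MAX_STR : len;
}
```

The point of the descriptor is that the fixed part of the record only carries
four bytes per string, while the characters themselves are appended after the
fixed fields, so short names do not pay for a 64 byte array.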

[PATCH 14/18] tracing: Have char arrays be strings for function based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

If a field in a function based event is defined with type "char[##]" then it
will be considered a static string. If a user wants an actual byte array
they should use one of u8, s8, or x8.

Now we can get strings from events:

 # echo 'SyS_openat(int dfd, char[64] buf, x32 flags, x32 mode)' > 
function_events
 # grep xxx /etc/*
 # cat trace
  grep-1745  [001]  346135.431364: 
entry_SYSCALL_64_fastpath->SyS_openat(dfd=-100, buf=/etc/adjtime, flags=100, 
mode=0)
  grep-1745  [001]  346135.431734: 
entry_SYSCALL_64_fastpath->SyS_openat(dfd=-100, buf=/etc/aliases, flags=100, 
mode=0)
  grep-1745  [001]  346135.618765: 
entry_SYSCALL_64_fastpath->SyS_openat(dfd=-100, buf=/etc/alternatives, 
flags=100, mode=0)
  grep-1745  [001]  346135.619063: 
entry_SYSCALL_64_fastpath->SyS_openat(dfd=-100, buf=/etc/anacrontab, flags=100, 
mode=0)
  grep-1745  [001]  346135.619134: 
entry_SYSCALL_64_fastpath->SyS_openat(dfd=-100, buf=/etc/asciidoc, flags=100, 
mode=0)
  grep-1745  [001]  346135.619390: 
entry_SYSCALL_64_fastpath->SyS_openat(dfd=-100, buf=/etc/asound.conf, 
flags=100, mode=0)
  grep-1745  [001]  346135.624350: 
entry_SYSCALL_64_fastpath->SyS_openat(dfd=-100, buf=/etc/audisp, flags=100, 
mode=0)
  grep-1745  [001]  346135.624565: 
entry_SYSCALL_64_fastpath->SyS_openat(dfd=-100, buf=/etc/audit, flags=100, 
mode=0)

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst | 17 +
 kernel/trace/trace_event_ftrace.c | 21 +
 2 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst 
b/Documentation/trace/function-based-events.rst
index 4a8a6fb16a0a..99ae77cd59e6 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -325,3 +325,20 @@ To get the net_device perm_addr, from the dev parameter.
 -0 [003] ..s3   219.813595: 
__netif_receive_skb_core->ip_rcv(skb=880118195e00, 
perm_addr=b4,b5,2f,ce,18,65)
 -0 [003] ..s3   220.115053: 
__netif_receive_skb_core->ip_rcv(skb=880118195c00, 
perm_addr=b4,b5,2f,ce,18,65)
 -0 [003] ..s3   220.115293: 
__netif_receive_skb_core->ip_rcv(skb=880118195c00, 
perm_addr=b4,b5,2f,ce,18,65)
+
+
+Static strings
+==
+
+An array of type 'char' or 'unsigned char' will be processed as a string using
+the format "%s". If a nul is found, the output will stop. Use another type
+(x8, u8, s8) if this is not desired.
+
+  # echo 'link_path_walk(char[64] name)' > function_events
+
+  # echo 1 > events/functions/link_path_walk/enable
+  # cat trace
+  bash-1470  [003] ...2   980.678664: 
path_openat->link_path_walk(name=/usr/bin/cat)
+  bash-1470  [003] ...2   980.678715: 
path_openat->link_path_walk(name=/lib64/ld-linux-x86-64.so.2)
+  bash-1470  [003] ...2   980.678721: 
path_openat->link_path_walk(name=ld-2.24.so)
+  bash-1470  [003] ...2   980.678978: 
path_lookupat->link_path_walk(name=/etc/ld.so.preload)
diff --git a/kernel/trace/trace_event_ftrace.c 
b/kernel/trace/trace_event_ftrace.c
index 64e2d7dcfd18..dd24b840329d 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -610,6 +610,14 @@ static void make_fmt(struct func_arg *arg, char *fmt)
 
fmt[c++] = '%';
 
+   if (arg->func_type == FUNC_TYPE_char) {
+   if (arg->array)
+   fmt[c++] = 's';
+   else
+   fmt[c++] = 'c';
+   goto out;
+   }
+
if (arg->size == 8) {
fmt[c++] = 'l';
fmt[c++] = 'l';
@@ -622,6 +630,7 @@ static void make_fmt(struct func_arg *arg, char *fmt)
else
fmt[c++] = 'u';
 
+ out:
fmt[c++] = '\0';
 }
 
@@ -639,7 +648,10 @@ static void write_data(struct trace_seq *s, const struct 
func_arg *arg, const ch
trace_seq_printf(s, fmt, *(unsigned short *)data);
break;
case 1:
-   trace_seq_printf(s, fmt, *(unsigned char *)data);
+   if (arg->array && arg->func_type == FUNC_TYPE_char)
+   trace_seq_printf(s, fmt, (char *)data);
+   else
+   trace_seq_printf(s, fmt, *(unsigned char *)data);
break;
}
 }
@@ -672,7 +684,7 @@ func_event_print(struct trace_iterator *iter, int flags,
 
make_fmt(arg, fmt);
 
-   if (arg->array) {
+   if (arg->array && arg->func_type != FUNC_TYPE_char) {
comma = false;
for (a = 0; a < arg->array; a++, data += arg->size) {
if (comma)
@@ -821,7 +833,7 @@ static int __set_print_fmt(struct func_event *func_event,
 
make_fmt(arg, fmt);
 
-   if (arg->array) {
+   if (arg->array && arg->func_type != FUNC_TYPE_char)
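
To make the format selection in make_fmt() easier to follow, here is a reduced
userspace sketch of the same decision: char arrays become "%s", a lone char
becomes "%c", and other types get a length modifier plus a sign-dependent
conversion. The struct fmt_arg descriptor and the function name are
hypothetical, and the hex ("x") types handled by the real function are left
out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced argument descriptor. */
struct fmt_arg {
	int is_char;	/* type is 'char' */
	int array;	/* element count, 0 if not an array */
	int size;	/* element size in bytes */
	int sign;	/* nonzero if signed */
};

/* Write a printf conversion for `arg` into `fmt` (needs at least 5 bytes). */
static void make_fmt_sketch(const struct fmt_arg *arg, char *fmt)
{
	int c = 0;

	assert(arg && fmt);
	fmt[c++] = '%';
	if (arg->is_char) {
		/* char[N] is a static string, a lone char is one character */
		fmt[c++] = arg->array ? 's' : 'c';
	} else {
		if (arg->size == 8) {
			fmt[c++] = 'l';
			fmt[c++] = 'l';
		}
		fmt[c++] = arg->sign ? 'd' : 'u';
	}
	fmt[c] = '\0';
}
```

This is also why the print loops in the patch skip the per-element path for
char arrays: a single "%s" consumes the whole array at once.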

[PATCH 17/18] tracing: Add indirect to indirect access for function based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Allow function based events to retrieve not only parameter offsets, but
also data from a pointer within a parameter structure. Something
like:

 # echo 'ip_rcv(string skdev+16[0][0] | x8[6] skperm+16[0]+558)' > 
function_events

 # echo 1 > events/functions/ip_rcv/enable
 # cat trace
-0 [003] ..s3   310.626391: 
__netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
-0 [003] ..s3   310.626400: 
__netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
-0 [003] ..s3   312.183775: 
__netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
-0 [003] ..s3   312.184329: 
__netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
-0 [003] ..s3   312.303895: 
__netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
-0 [003] ..s3   312.304610: 
__netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
-0 [003] ..s3   312.471980: 
__netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
-0 [003] ..s3   312.472908: 
__netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)
-0 [003] ..s3   313.135804: 
__netif_receive_skb_core->ip_rcv(skdev=em1, skperm=b4,b5,2f,ce,18,65)

That is, we retrieved the net_device of the sk_buff and displayed its name
and perm_addr info.

  sk->dev->name, sk->dev->perm_addr

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst |  40 +-
 kernel/trace/trace_event_ftrace.c | 102 --
 2 files changed, 136 insertions(+), 6 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst 
b/Documentation/trace/function-based-events.rst
index b90b52b7061d..3b341992b93d 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -101,12 +101,15 @@ as follows:
  'char' | 'short' | 'int' | 'long' | 'size_t' |
 'symbol' | 'string'
 
- FIELD := <name> | <name> INDEX | <name> OFFSET | <name> OFFSET INDEX
+ FIELD := <name> | <name> INDEX | <name> OFFSET | <name> OFFSET INDEX |
+FIELD INDIRECT
 
  INDEX := '[' <number> ']'
 
  OFFSET := '+' <number>
 
+ INDIRECT := INDEX | OFFSET | INDIRECT INDIRECT | ''
+
  ADDR := A hexidecimal address starting with '0x'
 
  Where <name> is a unique string starting with an alphabetic character
@@ -385,3 +388,38 @@ based event.
 NULL can appear in any argument, to have them ignored. Note, skipping arguments
 does not give you access to later arguments if they are not supported by the
 architecture. The architecture only supplies the first set of arguments.
+
+
+The chain of indirects
+==
+
+When a parameter is a structure, and that structure points to another 
structure,
+the data of that structure can still be found.
+
+ssize_t __vfs_read(struct file *file, char __user *buf, size_t count,
+  loff_t *pos)
+
+has the following code.
+
+   if (file->f_op->read)
+   return file->f_op->read(file, buf, count, pos);
+
+To trace all the functions that are called by f_op->read(), that information
+can be obtained from the file pointer.
+
+Using gdb again:
+
+   (gdb) printf "%d\n", &((struct file *)0)->f_op
+40
+   (gdb) printf "%d\n", &((struct file_operations *)0)->read
+16
+
+# echo '__vfs_read(symbol read+40[0]+16)' > function_events
+
+  # echo 1 > events/functions/__vfs_read/enable
+  # cat trace
+ sshd-1343  [005] ...2   199.734752: 
vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
+ bash-1344  [003] ...2   199.734822: 
vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
+ sshd-1343  [005] ...2   199.734835: 
vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
+ avahi-daemon-910   [003] ...2   200.136740: vfs_read->__vfs_read(read=
  (null))
+ avahi-daemon-910   [003] ...2   200.136750: vfs_read->__vfs_read(read=
  (null))
diff --git a/kernel/trace/trace_event_ftrace.c 
b/kernel/trace/trace_event_ftrace.c
index 22bcb67ad184..b5b719680686 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -14,8 +14,15 @@
 #define WRITE_BUFSIZE  4096
 #define INDIRECT_FLAG  0x1000
 
+struct func_arg_redirect {
+   struct list_headlist;
+   longindex;
+   longindirect;
+};
+
 struct func_arg {
struct list_headlist;
+   struct list_headredirects;
char*type;
char*name;
longindirect;
@@ -73,6 +80,8 @@ enum func_states {
FUNC_STATE_ARRAY,
FUNC_STATE_ARRAY_SIZE,
FUNC_STATE_ARRAY_END,
+   FUNC_STATE_REDIRECT_PLUS,
+   FUNC_STATE_REDIRECT_BRACKET,
FUNC_STATE_VAR,
FUNC_STATE_COMMA,
FUNC_STATE_NULL,
@@ -267,6 +276,8 @@ static int add_arg(struct func_event *fevent, int f
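
The "read+40[0]+16" chain in the documentation hunk reads as: add 40 to the
parameter, load the pointer stored there, add 16 to that, and read the final
value. A userspace sketch of the same walk follows, using mock structures
(struct mock_file and struct mock_ops are hypothetical, with offsets taken
from offsetof rather than the 40 and 16 of the real kernel structures). This
is my reading of the examples, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long (*read_fn)(void);

/* Hypothetical mocks standing in for file_operations and file. */
struct mock_ops {
	void *owner;
	read_fn read;
};

struct mock_file {
	long pad[2];
	const struct mock_ops *f_op;
};

/* A function to point at, so the walk has something to find. */
static long mock_read(void)
{
	return 0;
}

/* Load one pointer-sized word from base + off, like "+off[0]". */
static const void *load_word(const void *base, size_t off)
{
	const void *p;

	assert(base);
	memcpy(&p, (const char *)base + off, sizeof(p));
	return p;
}

/* Equivalent of file->f_op->read, done purely with offsets. */
static read_fn chase_read(const struct mock_file *file)
{
	const void *ops = load_word(file, offsetof(struct mock_file, f_op));
	read_fn fn;

	assert(ops);
	memcpy(&fn, (const char *)ops + offsetof(struct mock_ops, read),
	       sizeof(fn));
	return fn;
}
```

Each INDIRECT element in the grammar adds one more hop of this kind, which is
how "skdev+16[0][0]" can reach sk->dev->name.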

[PATCH 16/18] tracing: Add NULL to skip args for function based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

If args are to be skipped (only care about second, third or later arguments)
then add a NULL to ignore them. For example, if one only wants to record the
third argument of a function, they can perform:

 # echo 'foo(NULL, NULL, u32 arg3)' > function_events

Then only the third argument is saved in the function based event.

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst | 28 +-
 kernel/trace/trace_event_ftrace.c | 34 ++-
 2 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst 
b/Documentation/trace/function-based-events.rst
index 6c643ea749e7..b90b52b7061d 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -91,7 +91,7 @@ as follows:
 
  ARGS := ARG | ARG ',' ARGS | ''
 
- ARG := TYPE FIELD | TYPE <name> '=' ADDR | TYPE ADDR | ARG '|' ARG
+ ARG := TYPE FIELD | TYPE <name> '=' ADDR | TYPE ADDR | ARG '|' ARG | 'NULL'
 
  TYPE := ATOM | ATOM '[' <number> ']' | 'unsigned' TYPE
 
@@ -359,3 +359,29 @@ it will be truncated.
  # echo 'link_path_walk(string name)' > function_events
 
 Gives the same result as above, but does not waste buffer space.
+
+
+NULL arguments
+==
+
+If you are only interested in the second, or later parameter of a function,
+you do not have to record the previous parameters. Just set them as NULL and
+they will not be recorded.
+
+If we only wanted the perm_addr of the net_device of ip_rcv() and not the
+sk_buff, we put a NULL into the first parameter when creating the function
+based event.
+
+  # echo 'ip_rcv(NULL, x8[6] perm_addr+558)' > function_events
+
+  # echo 1 > events/functions/ip_rcv/enable
+  # cat trace
+-0 [003] ..s3   165.617114: 
__netif_receive_skb_core->ip_rcv(perm_addr=b4,b5,2f,ce,18,65)
+-0 [003] ..s3   165.617133: 
__netif_receive_skb_core->ip_rcv(perm_addr=b4,b5,2f,ce,18,65)
+-0 [003] ..s3   166.412277: 
__netif_receive_skb_core->ip_rcv(perm_addr=b4,b5,2f,ce,18,65)
+-0 [003] ..s3   166.412797: 
__netif_receive_skb_core->ip_rcv(perm_addr=b4,b5,2f,ce,18,65)
+
+
+NULL can appear in any argument, to have them ignored. Note, skipping arguments
+does not give you access to later arguments if they are not supported by the
+architecture. The architecture only supplies the first set of arguments.
diff --git a/kernel/trace/trace_event_ftrace.c 
b/kernel/trace/trace_event_ftrace.c
index 273c5838a8e2..22bcb67ad184 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -75,6 +75,7 @@ enum func_states {
FUNC_STATE_ARRAY_END,
FUNC_STATE_VAR,
FUNC_STATE_COMMA,
+   FUNC_STATE_NULL,
FUNC_STATE_END,
FUNC_STATE_ERROR,
 };
@@ -117,6 +118,7 @@ static struct func_type {
int sign;
 } func_types[] = {
FUNC_TYPES,
+   { "NULL",   0,  0 },
{ NULL, 0,  0 }
 };
 
@@ -125,6 +127,7 @@ static struct func_type {
 
 enum {
FUNC_TYPES,
+   FUNC_TYPE_NULL,
FUNC_TYPE_MAX
 };
 
@@ -364,6 +367,8 @@ process_event(struct func_event *fevent, const char *token, 
enum func_states sta
fevent->arg_cnt++;
update_arg = false;
case FUNC_STATE_PIPE:
+   if (strcmp(token, "NULL") == 0)
+   return FUNC_STATE_NULL;
if (strcmp(token, "unsigned") == 0) {
unsign = 2;
return FUNC_STATE_UNSIGNED;
@@ -513,6 +518,19 @@ process_event(struct func_event *fevent, const char 
*token, enum func_states sta
fevent->last_arg->indirect = INDIRECT_FLAG;
return FUNC_STATE_ADDR;
 
+   case FUNC_STATE_NULL:
+   ret = add_arg(fevent, FUNC_TYPE_NULL, 0);
+   if (ret < 0)
+   break;
+   switch (token[0]) {
+   case ')':
+   goto end;
+   case ',':
+   update_arg = true;
+   return FUNC_STATE_COMMA;
+   }
+   break;
+
default:
break;
}
@@ -689,6 +707,8 @@ static void func_event_trace(struct trace_event_file 
*trace_file,
entry->parent_ip = parent_ip;
 
list_for_each_entry(arg, &func_event->args, list) {
+   if (arg->func_type == FUNC_TYPE_NULL)
+   continue;
if (arg->arg < nr_args)
val = get_arg(arg, args);
else
@@ -811,6 +831,8 @@ func_event_print(struct trace_iterator *iter, int flags,
trace_seq_printf(s, "%ps->%ps(",
 (void *)entry->parent_ip, (void *)entry->ip);
list_for_each_entry(arg, &func_event->args, list) {
+   if (arg->func_type == FUNC_TYPE_NULL)
+   continue;
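
The effect of the FUNC_TYPE_NULL checks is that a placeholder still occupies
an argument position but contributes nothing to the record. A userspace sketch
of that behaviour, with invented names (struct skip_arg, record_args) and a
reduced type set; the value stored when an argument is beyond what the
architecture supplies is assumed to be 0 here, since that branch is not
visible in this copy of the diff:

```c
#include <assert.h>
#include <stddef.h>

enum { ARG_NULL, ARG_U32 };	/* hypothetical, reduced type set */

struct skip_arg {
	int type;
	size_t arg;	/* position of the argument in the call */
};

/*
 * Copy the values of recorded arguments into `out`, skipping NULL
 * placeholders. A skipped slot still consumes a position, so a later
 * argument keeps its original index. Returns the number recorded.
 */
static size_t record_args(const struct skip_arg *args, size_t nr,
			  const unsigned long *regs, size_t nr_regs,
			  unsigned long *out)
{
	size_t i, n = 0;

	assert(args && regs && out);
	for (i = 0; i < nr; i++) {
		if (args[i].type == ARG_NULL)
			continue;
		out[n++] = args[i].arg < nr_regs ? regs[args[i].arg] : 0;
	}
	return n;
}
```

For 'foo(NULL, NULL, u32 arg3)' only one value is stored, and it comes from
the third argument slot, which is why skipping does not help reach arguments
the architecture does not provide.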

[PATCH v2] x86: e820: Implement a range manipulation operator

2018-02-02 Thread Jan H . Schönherr

Add a more versatile memmap= operator, which -- in addition to all the
things that were possible before -- allows you to:
- redeclare existing ranges -- before, you were limited to adding ranges;
- drop any range -- like a mem= for any location;
- use any e820 memory type -- not just some predefined ones.

The syntax is:

  memmap=<size>%<offset>-<oldtype>+<newtype>

Size and offset work as usual. The "-<oldtype>" and "+<newtype>" are
optional and their presence determines the behavior: The command
works on the specified range of memory limited to type <oldtype>
(if specified). This memory is then configured to show up as <newtype>.
If <newtype> is not specified, the memory is removed from the e820 map.

Signed-off-by: Jan H. Schönherr 
---
v2: Small coding style and typography adjustments

 Documentation/admin-guide/kernel-parameters.txt |  9 +
 arch/x86/kernel/e820.c  | 18 ++
 2 files changed, 27 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 46b26bfee27b..60926ae3ec06 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2221,6 +2221,15 @@
The memory region may be marked as e820 type 12 (0xc)
and is NVDIMM or ADR memory.
 
+   memmap=<size>%<offset>-<oldtype>+<newtype>
+   [KNL,ACPI] Convert memory within the specified region
+   from <oldtype> to <newtype>. If "-<oldtype>" is left
+   out, the whole region will be marked as <newtype>,
+   even if previously unavailable. If "+<newtype>" is left
+   out, matching memory will be removed. Types are
+   specified as e820 types, e.g., 1 = RAM, 2 = reserved,
+   3 = ACPI, 12 = PRAM.
+
memory_corruption_check=0/1 [X86]
Some BIOSes seem to corrupt the first 64k of
memory when doing things like suspend/resume.
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 71c11ad5643e..6a2cb1442e05 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -924,6 +924,24 @@ static int __init parse_memmap_one(char *p)
 	} else if (*p == '!') {
 		start_at = memparse(p+1, &p);
 		e820__range_add(start_at, mem_size, E820_TYPE_PRAM);
+	} else if (*p == '%') {
+		enum e820_type from = 0, to = 0;
+
+		start_at = memparse(p + 1, &p);
+		if (*p == '-')
+			from = simple_strtoull(p + 1, &p, 0);
+		if (*p == '+')
+			to = simple_strtoull(p + 1, &p, 0);
+		if (*p != '\0')
+			return -EINVAL;
+		if (from && to)
+			e820__range_update(start_at, mem_size, from, to);
+		else if (to)
+			e820__range_add(start_at, mem_size, to);
+		else if (from)
+			e820__range_remove(start_at, mem_size, from, 1);
+		else
+			e820__range_remove(start_at, mem_size, 0, 0);
 	} else {
 		e820__range_remove(mem_size, ULLONG_MAX - mem_size, E820_TYPE_RAM, 1);
 	}
-- 
2.9.3.1.gcba166c.dirty
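
The four-way decision made by the new '%' branch can be tried outside the
kernel. What follows is only a userspace sketch, not the patch itself:
parse_percent(), struct memmap_op and the ACT_* names are hypothetical,
strtoull()/strtoul() stand in for memparse()/simple_strtoull(), and size
suffixes such as K/M/G are not handled.

```c
#include <stdlib.h>

/* Hypothetical labels for the four e820 calls made by the '%' branch. */
enum action {
	ACT_INVALID,		/* trailing characters: the patch returns -EINVAL */
	ACT_UPDATE,		/* both types given: e820__range_update() */
	ACT_ADD,		/* only "+<newtype>": e820__range_add() */
	ACT_REMOVE_TYPED,	/* only "-<oldtype>": e820__range_remove(..., from, 1) */
	ACT_REMOVE_ANY		/* neither: e820__range_remove(..., 0, 0) */
};

struct memmap_op {
	enum action action;
	unsigned long long start;
	unsigned long from;
	unsigned long to;
};

/*
 * Parse the text that follows '%': "<offset>[-<oldtype>][+<newtype>]".
 * Assumption: plain numbers only; the kernel's memparse() also accepts
 * size suffixes, which this sketch does not.
 */
static struct memmap_op parse_percent(const char *p)
{
	struct memmap_op op = { ACT_INVALID, 0, 0, 0 };
	char *end;

	op.start = strtoull(p, &end, 0);
	if (*end == '-')
		op.from = strtoul(end + 1, &end, 0);
	if (*end == '+')
		op.to = strtoul(end + 1, &end, 0);
	if (*end != '\0')
		return op;

	if (op.from && op.to)
		op.action = ACT_UPDATE;
	else if (op.to)
		op.action = ACT_ADD;
	else if (op.from)
		op.action = ACT_REMOVE_TYPED;
	else
		op.action = ACT_REMOVE_ANY;
	return op;
}
```

For instance, parse_percent("0x100000-1+2") selects the update path, while
parse_percent("0x100000") selects the untyped removal. As in the patch, a
type value of 0 is indistinguishable from a missing type.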

Re: [PATCH v2 1/2] ASoC: codecs: Add support for AK5558 ADC driver

2018-02-02 Thread Mark Brown

On Fri, Feb 02, 2018 at 09:33:18PM +0200, Andy Shevchenko wrote:
> On Fri, Feb 2, 2018 at 6:20 PM, Daniel Baluta  wrote:

> > +static int ak5558_set_dai_mute(struct snd_soc_dai *dai, int mute)
> > +{
> > +   struct snd_soc_codec *codec = dai->codec;
> > +   struct ak5558_priv *ak5558 = snd_soc_codec_get_drvdata(codec);
> 
> > +   int ndt = 0;
> 
> It might be even
> 
>   int ndt = max(ak5558->fs ? 583000 / ak5558->fs : 5, 5);

Please don't encourage people to use the ternary operator like that, it
does nothing for legibility not to write out the conditionals.
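
For illustration only, the quoted one-liner could be spelled out with plain
conditionals along these lines. This is a sketch, not code from the patch:
the helper name ak5558_ndt() is hypothetical, and the constant 583000 and
the floor of 5 are taken from the quoted suggestion.

```c
/* Hypothetical helper: same result as max(fs ? 583000 / fs : 5, 5). */
static int ak5558_ndt(int fs)
{
	int ndt = 5;	/* floor value, also used when no sample rate is set */

	if (fs > 0) {
		ndt = 583000 / fs;
		if (ndt < 5)
			ndt = 5;
	}

	return ndt;
}
```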

> > +static const struct i2c_device_id ak5558_i2c_id[] = {
> > +   { "ak5558", 0 },
> > +   { }
> > +};
> > +MODULE_DEVICE_TABLE(i2c, ak5558_i2c_id);

> I dunno if it's really helpful to have. Though it's up to Mark and you.

I don't care either way.



[PATCH 00/18] [ANNOUNCE] Dynamically created function based events

2018-02-02 Thread Steven Rostedt


At Kernel Summit back in October, we tried to bring up trace markers, which
would be nops within the kernel proper, that would allow modules to hook
arbitrary trace events to them. The reaction to this proposal was less than
favorable. We were told that we were trying to make a workaround for a
problem, and not solving it. The problem in our minds is the notion of a
"stable trace event".

There are maintainers that do not want trace events, or more trace events in
their subsystems. This is due to the fact that trace events post an
interface to user space, and this interface could become required by some
tool. This may cause the trace event to become stable where it must not
break the tool, and thus prevent the code from changing.

Or, the trace event may just have to add padding for fields that tools
may require. The "success" field of the sched_wakeup trace event is one such
instance. There is no more "success" variable, but tools may fail if it were
to go away, so a "1" is simply added to the trace event wasting ring buffer
real estate.

I talked with Linus about this, and he told me that we already have these
markers in the kernel. They are from the mcount/__fentry__ used by function
tracing. Have the trace events be created by these, and see if this will
satisfy most areas that want trace events.

I decided to implement this idea, and here's the patch set.

Introducing "function based events". These are created dynamically by a
tracefs file called "function_events". By writing a pseudo prototype into
this file, you create an event.

 # mount -t tracefs nodev /sys/kernel/tracing
 # cd /sys/kernel/tracing
 # echo 'do_IRQ(symbol ip[16] | x64[6] irq_stack[16])' > function_events
 # cat events/functions/do_IRQ/format
name: do_IRQ
ID: 1399
format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;

	field:unsigned long __parent_ip;	offset:8;	size:8;	signed:0;
	field:unsigned long __ip;	offset:16;	size:8;	signed:0;
	field:symbol ip;	offset:24;	size:8;	signed:0;
	field:x64 irq_stack[6];	offset:32;	size:48;	signed:0;

print fmt: "%pS->%pS(ip=%pS, irq_stack=%llx:%llx:%llx:%llx:%llx:%llx)", REC->__ip, REC->__parent_ip, REC->ip, REC->irq_stack[0], REC->irq_stack[1], REC->irq_stack[2], REC->irq_stack[3], REC->irq_stack[4], REC->irq_stack[5]

 # echo 1 > events/functions/do_IRQ/enable
 # cat trace
  <idle>-0 [003] d..3  3647.049344: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=81665db1,10,246,c96c3e80,18,88011eae9b40)
  <idle>-0 [003] d..3  3647.049433: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=81665db1,10,246,c96c3e80,18,88011eae9b40)
  <idle>-0 [003] d..3  3647.049672: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=81665db1,10,246,c96c3e80,18,88011eae9b40)
  <idle>-0 [003] d..3  3647.325709: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=81665db1,10,246,c96c3e80,18,88011eae9b40)
  <idle>-0 [003] d..3  3647.325929: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=81665db1,10,246,c96c3e80,18,88011eae9b40)
  <idle>-0 [003] d..3  3647.325993: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=81665db1,10,246,c96c3e80,18,88011eae9b40)
  <idle>-0 [003] d..3  3647.387571: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=81665db1,10,246,c96c3e80,18,88011eae9b40)
  <idle>-0 [003] d..3  3647.387791: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=81665db1,10,246,c96c3e80,18,88011eae9b40)
  <idle>-0 [003] d..3  3647.387874: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=81665db1,10,246,c96c3e80,18,88011eae9b40)

And this is much more powerful than just this. We can show strings, and
index off of structures into other structures.

  # echo '__vfs_read(symbol read+40[0]+16)' > function_events

  # echo 1 > events/functions/__vfs_read/enable
  # cat trace
 sshd-1343  [005] ...2   199.734752: vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
 bash-1344  [003] ...2   199.734822: vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
 sshd-1343  [005] ...2   199.734835: vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
 avahi-daemon-910   [003] ...2   200.136740: vfs_read->__vfs_read(read= (null))
 avahi-daemon-910   [003] ...2   200.136750: vfs_read->__vfs_read(read= (null))

And even read user space:

  # echo 'SyS_openat(int dfd, str

[PATCH 05/18] tracing: Add hex print for dynamic ftrace based events

2018-02-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Add x64, x32, x16 and x8 to represent numbers of the same size in hex.
Similar to u64, u32, u16, and u8 but uses %x instead of %u.

Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/function-based-events.rst | 14 +-
 kernel/trace/trace_event_ftrace.c | 13 -
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/Documentation/trace/function-based-events.rst b/Documentation/trace/function-based-events.rst
index 94c2c975295a..f27a0c4e829c 100644
--- a/Documentation/trace/function-based-events.rst
+++ b/Documentation/trace/function-based-events.rst
@@ -97,6 +97,7 @@ as follows:
 
  ATOM := 'u8' | 'u16' | 'u32' | 'u64' |
  's8' | 's16' | 's32' | 's64' |
+ 'x8' | 'x16' | 'x32' | 'x64' |
  'char' | 'short' | 'int' | 'long' | 'size_t'
 
  FIELD := 
@@ -116,11 +117,14 @@ int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt,
 
 If we are only interested in the first argument (skb):
 
- # echo 'ip_rcv(u64 skb, u64 dev)' > function_events
+ # echo 'ip_rcv(x64 skb, x64 dev)' > function_events
 
  # echo 1 > events/functions/ip_rcv/enable
  # cat trace
- <idle>-0 [003] ..s3  2119.041935: __netif_receive_skb_core->ip_rcv(skb=18446612136982403072, dev=18446612136968273920)
- <idle>-0 [003] ..s3  2119.041944: __netif_receive_skb_core->ip_rcv(skb=18446612136982403072, dev=18446612136968273920)
- <idle>-0 [003] ..s3  2119.288337: __netif_receive_skb_core->ip_rcv(skb=18446612136982403072, dev=18446612136968273920)
- <idle>-0 [003] ..s3  2119.288960: __netif_receive_skb_core->ip_rcv(skb=18446612136982403072, dev=18446612136968273920)
+ <idle>-0 [003] ..s3  5543.133460: __netif_receive_skb_core->ip_rcv(skb=88007f960700, net=88011425)
+ <idle>-0 [003] ..s3  5543.133475: __netif_receive_skb_core->ip_rcv(skb=88007f960700, net=88011425)
+ <idle>-0 [003] ..s3  5543.312592: __netif_receive_skb_core->ip_rcv(skb=88007f960700, net=88011425)
+ <idle>-0 [003] ..s3  5543.313150: __netif_receive_skb_core->ip_rcv(skb=88007f960700, net=88011425)
+
+We use "x64" in order to make sure that the data is displayed in hex.
+This is on a x86_64 machine, and we know the pointer sizes are 8 bytes.
diff --git a/kernel/trace/trace_event_ftrace.c b/kernel/trace/trace_event_ftrace.c
index 66465be1e6d5..aa19c8af9d34 100644
--- a/kernel/trace/trace_event_ftrace.c
+++ b/kernel/trace/trace_event_ftrace.c
@@ -62,6 +62,11 @@ enum func_states {
FUNC_STATE_ERROR,
 };
 
+typedef u64 x64;
+typedef u32 x32;
+typedef u16 x16;
+typedef u8 x8;
+
 #define TYPE_TUPLE(type)   \
{ #type, sizeof(type), is_signed_type(type) }
 
@@ -77,12 +82,16 @@ static struct func_type {
TYPE_TUPLE(size_t),
TYPE_TUPLE(u64),
TYPE_TUPLE(s64),
+   TYPE_TUPLE(x64),
TYPE_TUPLE(u32),
TYPE_TUPLE(s32),
+   TYPE_TUPLE(x32),
TYPE_TUPLE(u16),
TYPE_TUPLE(s16),
+   TYPE_TUPLE(x16),
TYPE_TUPLE(u8),
TYPE_TUPLE(s8),
+   TYPE_TUPLE(x8),
{ NULL, 0,  0 }
 };
 
@@ -353,7 +362,9 @@ static void make_fmt(struct func_arg *arg, char *fmt)
 		fmt[c++] = 'l';
 	}
 
-	if (arg->sign)
+	if (arg->type[0] == 'x')
+		fmt[c++] = 'x';
+	else if (arg->sign)
 		fmt[c++] = 'd';
 	else
 		fmt[c++] = 'u';
-- 
2.15.1
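
The effect of the make_fmt() change can be reproduced in a small userspace
sketch. This is an illustration under assumptions, not the driver code:
struct sketch_arg and sketch_make_fmt() are hypothetical names, and the
"size == 8 adds ll" condition is assumed, since the hunk only shows the
tail of that block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the driver's struct func_arg. */
struct sketch_arg {
	const char *type;	/* type name as written by the user, e.g. "x64" */
	int size;		/* size in bytes */
	int sign;		/* non-zero for signed types */
};

/*
 * Build a printf-style conversion such as "%llx", "%d" or "%u".
 * fmt must have room for at least 5 bytes.
 */
static void sketch_make_fmt(const struct sketch_arg *arg, char *fmt)
{
	int c = 0;

	assert(arg != NULL && fmt != NULL);

	fmt[c++] = '%';
	if (arg->size == 8) {		/* assumption: 64-bit values get "ll" */
		fmt[c++] = 'l';
		fmt[c++] = 'l';
	}

	if (arg->type[0] == 'x')	/* x8/x16/x32/x64: hex, checked first */
		fmt[c++] = 'x';
	else if (arg->sign)
		fmt[c++] = 'd';
	else
		fmt[c++] = 'u';
	fmt[c] = '\0';
}
```

Because the x types are typedefs of the unsigned u types, the table entry
alone cannot tell them apart; the check on the first character of the type
name is what selects hex output.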
