Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Josh Poimboeuf
On Thu, Nov 29, 2018 at 10:58:40AM -0800, Linus Torvalds wrote:
> On Thu, Nov 29, 2018 at 10:47 AM Steven Rostedt  wrote:
> >
> > Note, we do have a bit of control at what is getting called. The patch
> > set requires that the callers are wrapped in macros. We should not
> > allow just any random callers (like from asm).
> 
> Actually, I'd argue that asm is often more controlled than C code.
> 
> Right now you can do odd things if you really want to, and have the
> compiler generate indirect calls to those wrapper functions.
> 
> For example, I can easily imagine a pre-retpoline compiler turning
> 
>  if (cond)
> fn1(a,b)
>  else
>fn2(a,b);
> 
> into a function pointer conditional
> 
> (cond ? fn1 : fn2)(a,b);
> 
> and honestly, the way "static_call()" works now, can you guarantee
> that the call-site doesn't end up doing that, and calling the
> trampoline function for two different static calls from one indirect
> call?
> 
> See what I'm talking about? Saying "callers are wrapped in macros"
> doesn't actually protect you from the compiler doing things like that.
> 
> In contrast, if the call was wrapped in an inline asm, we'd *know* the
> compiler couldn't turn a "call wrapper(%rip)" into anything else.

I think objtool could warn about many such issues, including function
pointer references to trampolines and short tail call jumps.

-- 
Josh
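
For illustration, a minimal sketch of the kind of asm-wrapped call site
Linus describes above. This is hypothetical and not part of the posted
series; the macro name and the clobber list are assumptions, and argument
marshalling into registers is deliberately left out:

/*
 * Hypothetical sketch only: emitting the call through inline asm pins the
 * exact "call <key>_tramp" instruction, so the compiler cannot merge call
 * sites or convert them into an indirect call.
 */
#define static_call_asm(key)					\
	asm volatile("call " #key "_tramp"			\
		     : : : "rax", "rcx", "rdx", "rsi", "rdi",	\
			   "r8", "r9", "r10", "r11", "memory", "cc")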


Re: [PATCH] ARM: OMAP1: devices: configure omap1_spi100k only on OMAP7xx

2018-11-29 Thread Tony Lindgren
* Aaro Koskinen  [181119 11:49]:
> Configure omap1_spi100k only on OMAP7xx. This allows running multiboard
> kernels on non-OMAP7xx HW with CONFIG_SPI_OMAP_100K enabled.

Applying into omap-for-v4.21/omap1 thanks.

Tony


Re: [PATCH] ARM: OMAP1/2: fix SoC name printing

2018-11-29 Thread Tony Lindgren
* Aaro Koskinen  [181119 11:46]:
> Currently we get extra newlines on OMAP1/2 when the SoC name is printed:
> 
> [0.00] OMAP1510
> [0.00]  revision 2 handled as 15xx id: bc058c9b93111a16
> 
> [0.00] OMAP2420
> [0.00]
> 
> Fix by using pr_cont.

Applying into omap-for-v4.21/omap1 thanks.

Tony


Re: [PATCH 0/3] ARN: OMAP1: ams-delta: Post-GPIO-refresh cleanups

2018-11-29 Thread Tony Lindgren
* Linus Walleij  [181115 10:46]:
> On Wed, Nov 7, 2018 at 9:16 PM Janusz Krzysztofik  wrote:
> 
> > Janusz Krzysztofik (3):
> >   ARM: OMAP1: ams-delta: Drop board specific global GPIO numbers
> >   ARM: OMAP1: ams-delta: Drop unused symbols from the board header
> >   ARM: OMAP1: ams-delta: Move AMS_DELTA_LATCH2_NGPIO to the board file
> 
> This series:
> Reviewed-by: Linus Walleij 

Applying into omap-for-v4.21/omap1 thanks.

Tony


Re: [PATCH] ARM: OMAP: PM: Change to use DEFINE_SHOW_ATTRIBUTE macro

2018-11-29 Thread Tony Lindgren
* Yangtao Li  [181106 06:35]:
> Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.

Applying into omap-for-v4.21/omap1 thanks.

Tony


Re: [PATCH] ARM: OMAP1: clock: Change to use DEFINE_SHOW_ATTRIBUTE macro

2018-11-29 Thread Tony Lindgren
* Yangtao Li  [181106 06:34]:
> Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.

Applying into omap-for-v4.21/omap1 thanks.

Tony


Re: [PATCH] ARM: OMAP1: ams-delta: Provide GPIO lookup table for LED device

2018-11-29 Thread Tony Lindgren
* Linus Walleij  [181114 12:51]:
> On Tue, Nov 6, 2018 at 12:22 AM Janusz Krzysztofik  
> wrote:
> 
> > Global GPIO numbers no longer have to be passed to leds-gpio driver,
> > replace their assignment with a lookup table.
> >
> > Signed-off-by: Janusz Krzysztofik 
> 
> Excellent Janusz! :)
> Reviewed-by: Linus Walleij 

Applying into omap-for-v4.21/omap1 thanks.

Tony


Re: [PATCH] ARM: OMAP1: ams-delta: make board header file local to mach-omap1

2018-11-29 Thread Tony Lindgren
* Janusz Krzysztofik  [181105 15:09]:
> Now as the board header file is no longer included by drivers, move it
> to the root directory of mach-omap1.

Applying into omap-for-v4.21/omap1 thanks.

Tony


Re: [PATCH 1/4] selftests: timers: move PIE tests out of rtctest

2018-11-29 Thread Rafael David Tinoco
On 4/19/18 9:50 AM, Alexandre Belloni wrote:
> Since commit 6610e0893b8bc ("RTC: Rework RTC code to use timerqueue for
> events"), PIE are completely handled using hrtimers, without actually using
> any underlying hardware RTC.
> 
> Move PIE testing out of rtctest. It still depends on the presence of an RTC
> (to access the device file) but doesn't depend on it actually working.
> 
> Signed-off-by: Alexandre Belloni 
> ---
>  tools/testing/selftests/timers/.gitignore |   1 +
>  tools/testing/selftests/timers/Makefile   |   2 +-
>  tools/testing/selftests/timers/rtcpie.c   | 132 ++
>  tools/testing/selftests/timers/rtctest.c  |  83 +-
>  4 files changed, 138 insertions(+), 80 deletions(-)
>  create mode 100644 tools/testing/selftests/timers/rtcpie.c
> 
...
> + /* The frequencies 128Hz, 256Hz, ... 8192Hz are only allowed for root. */
> + for (tmp=2; tmp<=64; tmp*=2) {
> +
> + retval = ioctl(fd, RTC_IRQP_SET, tmp);
> + if (retval == -1) {
> + /* not all RTCs can change their periodic IRQ rate */
> + if (errno == EINVAL) {
> + fprintf(stderr,
> + "\n...Periodic IRQ rate is fixed\n");
> + goto done;
> + }
> + perror("RTC_IRQP_SET ioctl");
> + exit(errno);
> + }

Hello Alexandre,

In our tests, failures at the 64Hz step are quite common on embedded
systems with a small number of CPUs/cores:


root@bug3402:opt$ find /sys -name rtc0
/sys/devices/platform/901.pl031/rtc/rtc0
/sys/class/rtc/rtc0

selftests: timers: rtcpie
Periodic IRQ rate is 1Hz.
Counting 20 interrupts at:
2Hz: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
4Hz: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
8Hz: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
16Hz:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
32Hz:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
64Hz:   
PIE delta error: 0.017697 should be close to 0.015625
not ok 1..9 selftests: timers: rtcpie [FAIL]


Mainly because 64Hz gives us ~16ms per tick in the test, and the
variation accepted as error for this test is 10%, i.e. ~1.6ms... pretty
close to the scheduling limit for a small number of CPUs in a functional
testing environment.
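
To make the numbers concrete, a small sketch of that arithmetic (values
taken from the log above; illustrative only):

	double period    = 1.0 / 64;       /* 0.015625 s per tick at 64Hz  */
	double tolerance = period * 0.10;  /* 10% bound => ~0.0015625 s    */
	double observed  = 0.017697;       /* PIE delta reported by rtcpie */
	/* observed - period is ~0.002072 s, above the 10% bound => FAIL   */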

Would you mind if we changed the test to only go up to 32Hz? Or do you
have any other suggestion?

Best Rgds
Rafael

-- 
Rafael D. Tinoco
Linaro - Kernel Validation


[PATCH] mfd: cros_ec: Add support for MKBP more event flags

2018-11-29 Thread egranata
From: Enrico Granata 

The ChromeOS EC has support for signaling to the host that
a single IRQ can serve multiple MKBP events.

Doing this serves an optimization purpose, as it minimizes the
number of round-trips into the interrupt handling machinery, and
it proves beneficial to sensor timestamping as it keeps the desired
synchronization of event times between the two processors.

This patch adds kernel support for this EC feature, allowing the
ec_irq to loop until all events have been served.

Signed-off-by: Enrico Granata 
---
 drivers/mfd/cros_ec.c   | 20 +--
 drivers/platform/chrome/cros_ec_proto.c | 33 +++--
 include/linux/mfd/cros_ec.h |  3 ++-
 include/linux/mfd/cros_ec_commands.h| 15 +++
 4 files changed, 50 insertions(+), 21 deletions(-)

diff --git a/drivers/mfd/cros_ec.c b/drivers/mfd/cros_ec.c
index fe6f83766144f..17903a378aa1a 100644
--- a/drivers/mfd/cros_ec.c
+++ b/drivers/mfd/cros_ec.c
@@ -51,13 +51,16 @@ static const struct mfd_cell ec_pd_cell = {
.pdata_size = sizeof(pd_p),
 };
 
-static irqreturn_t ec_irq_thread(int irq, void *data)
+static bool ec_handle_event(struct cros_ec_device *ec_dev)
 {
-   struct cros_ec_device *ec_dev = data;
bool wake_event = true;
int ret;
+   bool ec_has_more_events = false;
 
ret = cros_ec_get_next_event(ec_dev, &wake_event);
+   if (ret > 0)
+   ec_has_more_events =
+   ec_dev->event_data.event_type & EC_MKBP_HAS_MORE_EVENTS;
 
/*
 * Signal only if wake host events or any interrupt if
@@ -70,6 +73,19 @@ static irqreturn_t ec_irq_thread(int irq, void *data)
if (ret > 0)
blocking_notifier_call_chain(&ec_dev->event_notifier,
 0, ec_dev);
+
+   return ec_has_more_events;
+}
+
+static irqreturn_t ec_irq_thread(int irq, void *data)
+{
+   struct cros_ec_device *ec_dev = data;
+   bool ec_has_more_events;
+
+   do {
+   ec_has_more_events = ec_handle_event(ec_dev);
+   } while (ec_has_more_events);
+
return IRQ_HANDLED;
 }
 
diff --git a/drivers/platform/chrome/cros_ec_proto.c 
b/drivers/platform/chrome/cros_ec_proto.c
index b6fd4838f60f3..bb126d95b2fd4 100644
--- a/drivers/platform/chrome/cros_ec_proto.c
+++ b/drivers/platform/chrome/cros_ec_proto.c
@@ -420,10 +420,14 @@ int cros_ec_query_all(struct cros_ec_device *ec_dev)
ret = cros_ec_get_host_command_version_mask(ec_dev,
EC_CMD_GET_NEXT_EVENT,
&ver_mask);
-   if (ret < 0 || ver_mask == 0)
+   if (ret < 0 || ver_mask == 0) {
ec_dev->mkbp_event_supported = 0;
-   else
-   ec_dev->mkbp_event_supported = 1;
+   dev_info(ec_dev->dev, "MKBP not supported\n");
+   } else {
+   ec_dev->mkbp_event_supported = fls(ver_mask);
+   dev_info(ec_dev->dev, "MKBP support version %u\n",
+   ec_dev->mkbp_event_supported - 1);
+   }
 
/*
 * Get host event wake mask, assume all events are wake events
@@ -530,28 +534,19 @@ static int get_next_event(struct cros_ec_device *ec_dev)
 {
u8 buffer[sizeof(struct cros_ec_command) + sizeof(ec_dev->event_data)];
struct cros_ec_command *msg = (struct cros_ec_command *)&buffer;
-   static int cmd_version = 1;
-   int ret;
+   const int cmd_version = ec_dev->mkbp_event_supported - 1;
 
if (ec_dev->suspended) {
dev_dbg(ec_dev->dev, "Device suspended.\n");
return -EHOSTDOWN;
}
 
-   if (cmd_version == 1) {
-   ret = get_next_event_xfer(ec_dev, msg, cmd_version,
-   sizeof(struct ec_response_get_next_event_v1));
-   if (ret < 0 || msg->result != EC_RES_INVALID_VERSION)
-   return ret;
-
-   /* Fallback to version 0 for future send attempts */
-   cmd_version = 0;
-   }
-
-   ret = get_next_event_xfer(ec_dev, msg, cmd_version,
+   if (cmd_version == 0)
+   return get_next_event_xfer(ec_dev, msg, 0,
  sizeof(struct ec_response_get_next_event));
 
-   return ret;
+   return get_next_event_xfer(ec_dev, msg, cmd_version,
+   sizeof(struct ec_response_get_next_event_v1));
 }
 
 static int get_keyboard_state_event(struct cros_ec_device *ec_dev)
@@ -607,11 +602,13 @@ EXPORT_SYMBOL(cros_ec_get_next_event);
 
 u32 cros_ec_get_host_event(struct cros_ec_device *ec_dev)
 {
+   u32 event_type =
+   ec_dev->event_data.event_type & EC_MKBP_EVENT_TYPE_MASK;
u32 host_event;
 
BUG_ON(!ec_dev->mkbp_event_supported);
 
-   if (ec_dev->event_data.event_type != EC_MKBP_EVENT_HOST_EVENT)
+   if (event_type != EC_MKBP_EVENT_HOST_EVENT)

Re: [PATCH v2] signal: add procfd_signal() syscall

2018-11-29 Thread Christian Brauner
On Thu, Nov 29, 2018 at 11:22:58AM -0800, Andy Lutomirski wrote:
> On Thu, Nov 29, 2018 at 11:17 AM Christian Brauner  
> wrote:
> >
> > On November 30, 2018 5:54:18 AM GMT+13:00, Andy Lutomirski 
> >  wrote:
> > >
> > >
> > >> On Nov 29, 2018, at 4:28 AM, Florian Weimer 
> > >wrote:
> > >>
> > >> Disclaimer: I'm looking at this patch because Christian requested it.
> > >> I'm not a kernel developer.
> > >>
> > >> * Christian Brauner:
> > >>
> > >>> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl
> > >b/arch/x86/entry/syscalls/syscall_32.tbl
> > >>> index 3cf7b533b3d1..3f27ffd8ae87 100644
> > >>> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> > >>> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> > >>> @@ -398,3 +398,4 @@
> > >>> 384i386arch_prctlsys_arch_prctl
> > >__ia32_compat_sys_arch_prctl
> > >>> 385i386io_pgeteventssys_io_pgetevents
> > >__ia32_compat_sys_io_pgetevents
> > >>> 386i386rseqsys_rseq__ia32_sys_rseq
> > >>> +387i386procfd_signalsys_procfd_signal
> > >__ia32_compat_sys_procfd_signal
> > >>> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl
> > >b/arch/x86/entry/syscalls/syscall_64.tbl
> > >>> index f0b1709a5ffb..8a30cde82450 100644
> > >>> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> > >>> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> > >>> @@ -343,6 +343,7 @@
> > >>> 332commonstatx__x64_sys_statx
> > >>> 333commonio_pgetevents__x64_sys_io_pgetevents
> > >>> 334commonrseq__x64_sys_rseq
> > >>> +33564procfd_signal__x64_sys_procfd_signal
> > >>>
> > >>> #
> > >>> # x32-specific system call numbers start at 512 to avoid cache
> > >impact
> > >>> @@ -386,3 +387,4 @@
> > >>> 545x32execveat__x32_compat_sys_execveat/ptregs
> > >>> 546x32preadv2__x32_compat_sys_preadv64v2
> > >>> 547x32pwritev2__x32_compat_sys_pwritev64v2
> > >>> +548x32procfd_signal__x32_compat_sys_procfd_signal
> > >>
> > >> Is there a reason why these numbers have to be different?
> > >>
> > >> (See the recent discussion with Andy Lutomirski.)
> > >
> > >Hah, I missed this part of the patch.  Let’s not add new x32 syscall
> > >numbers.
> > >
> > >Also, can we perhaps rework this a bit to get rid of the compat entry
> > >point?  The easier way would be to check in_compat_syscall(). The nicer
> > >way IMO would be to use the 64-bit structure for 32-bit as well.
> >
> > Do you have a syscall which set precedence/did this before I could look at?
> > Just if you happen to remember one.
> > Fwiw, I followed the other signal syscalls.
> > They all introduce compat syscalls.
> >
> 
> Not really.
> 
> Let me try to explain.  I have three issues with the approach in your 
> patchset:
> 
> 1. You're introducing a new syscall, and it behaves differently on
> 32-bit and 64-bit because the structure you pass in is different.
> This is necessary for old syscalls where compatibility matters, but
> maybe we can get rid of it for new syscalls.   Could we define a
> siginfo64_t that is identical to the 64-bit siginfo_t and just use
> that in all cases?
> 
> 2. Assuming that #1 doesn't work, then we need compat support.  But
> you're doing it by having two different entry points.  Instead, you
> could have a single entry point that calls in_compat_syscall() to
> decide which structure to read.  This would simplify things because
> x86 doesn't really support the separate compat entry points, which
> leads me to #3.
> 
> 3. The separate x32 numbers are a huge turd that may have security
> holes and certainly have comprehensibility holes.  I will object to
> any patch that adds a new one (like yours).  Fixing #1 or #2 makes
> this problem go away.
> 
> Does that make any sense?  The #2 fix would be something like:
> 
> if (in_compat_syscall)
>   copy...user32();
> else
>   copy_from_user();
> 
> The #1 fix would add a copy_siginfo_from_user64() or similar.

Thanks very much! That all helped a bunch already! I'll try to go the
copy_siginfo_from_user64() way first and see if I can make this work. If
we do this I would however only want to use it for the new syscall first
and not change all other signal syscalls over to it too. I'd rather keep
this patchset focussed and small and do such conversions caused by the
new approach later. Does that sound reasonable?
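
For reference, a minimal sketch of the single-entry-point approach Andy
describes as option #2 (illustrative only, not the posted patch; it
assumes the in-tree copy_siginfo_from_user()/copy_siginfo_from_user32()
helpers and the kernel-internal siginfo type):

static int procfd_copy_siginfo(kernel_siginfo_t *kinfo,
			       siginfo_t __user *info)
{
	/* One entry point; pick the user-space layout at run time. */
	if (in_compat_syscall())
		return copy_siginfo_from_user32(kinfo,
				(struct compat_siginfo __user *)info);

	return copy_siginfo_from_user(kinfo, info);
}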


Re: [PATCH 4.4 00/86] 4.4.166-stable review

2018-11-29 Thread kernelci.org bot
stable-rc/linux-4.4.y boot: 103 boots: 1 failed, 101 passed with 1 offline 
(v4.4.165-87-g0741f52e1a53)

Full Boot Summary: 
https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.4.y/kernel/v4.4.165-87-g0741f52e1a53/
Full Build Summary: 
https://kernelci.org/build/stable-rc/branch/linux-4.4.y/kernel/v4.4.165-87-g0741f52e1a53/

Tree: stable-rc
Branch: linux-4.4.y
Git Describe: v4.4.165-87-g0741f52e1a53
Git Commit: 0741f52e1a53c50944292102dfa048c5de080f14
Git URL: 
http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 46 unique boards, 20 SoC families, 15 builds out of 187

Boot Failure Detected:

arm64:

defconfig
qcom-qdf2400: 1 failed lab

Offline Platforms:

arm:

multi_v7_defconfig:
stih410-b2120: 1 offline lab

---
For more info write to 


Re: [PATCH 3.18 00/83] 3.18.128-stable review

2018-11-29 Thread kernelci.org bot
stable-rc/linux-3.18.y boot: 64 boots: 5 failed, 59 passed 
(v3.18.127-84-gd2846da19bca)

Full Boot Summary: 
https://kernelci.org/boot/all/job/stable-rc/branch/linux-3.18.y/kernel/v3.18.127-84-gd2846da19bca/
Full Build Summary: 
https://kernelci.org/build/stable-rc/branch/linux-3.18.y/kernel/v3.18.127-84-gd2846da19bca/

Tree: stable-rc
Branch: linux-3.18.y
Git Describe: v3.18.127-84-gd2846da19bca
Git Commit: d2846da19bca0785a50c366d388d0da53c1b4042
Git URL: 
http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 26 unique boards, 12 SoC families, 14 builds out of 185

Boot Failures Detected:

arm:

exynos_defconfig
exynos4412-odroidx2: 1 failed lab

x86:

x86_64_defconfig
qemu: 3 failed labs
x86-atom330: 1 failed lab

---
For more info write to 


Re: [PATCH 4.9 00/92] 4.9.142-stable review

2018-11-29 Thread kernelci.org bot
stable-rc/linux-4.9.y boot: 113 boots: 0 failed, 111 passed with 1 offline, 1 
untried/unknown (v4.9.141-93-gf46aebe62697)

Full Boot Summary: 
https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.9.y/kernel/v4.9.141-93-gf46aebe62697/
Full Build Summary: 
https://kernelci.org/build/stable-rc/branch/linux-4.9.y/kernel/v4.9.141-93-gf46aebe62697/

Tree: stable-rc
Branch: linux-4.9.y
Git Describe: v4.9.141-93-gf46aebe62697
Git Commit: f46aebe62697538df220b8708e266369ffd901c5
Git URL: 
http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 52 unique boards, 22 SoC families, 16 builds out of 193

Offline Platforms:

arm:

multi_v7_defconfig:
stih410-b2120: 1 offline lab

---
For more info write to 


Re: [PATCH 4.14 000/100] 4.14.85-stable review

2018-11-29 Thread kernelci.org bot
stable-rc/linux-4.14.y boot: 121 boots: 0 failed, 117 passed with 3 offline, 1 
conflict (v4.14.84-101-gfed8ae3e80b0)

Full Boot Summary: 
https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.14.y/kernel/v4.14.84-101-gfed8ae3e80b0/
Full Build Summary: 
https://kernelci.org/build/stable-rc/branch/linux-4.14.y/kernel/v4.14.84-101-gfed8ae3e80b0/

Tree: stable-rc
Branch: linux-4.14.y
Git Describe: v4.14.84-101-gfed8ae3e80b0
Git Commit: fed8ae3e80b06a2e8f9c86798dec421d50054bab
Git URL: 
http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 66 unique boards, 24 SoC families, 14 builds out of 197

Boot Regressions Detected:

arm64:

defconfig:
meson-gxbb-p200:
lab-baylibre: new failure (last pass: v4.14.83-63-g0ece78cdc128)

Offline Platforms:

arm:

multi_v7_defconfig:
stih410-b2120: 1 offline lab

arm64:

defconfig:
meson-gxl-s905d-p230: 1 offline lab
meson-gxl-s905x-p212: 1 offline lab

Conflicting Boot Failure Detected: (These likely are not failures as other labs 
are reporting PASS. Needs review.)

arm64:

defconfig:
meson-gxbb-p200:
lab-baylibre: FAIL
lab-baylibre-seattle: PASS

---
For more info write to 


Re: overlayfs access checks on underlying layers

2018-11-29 Thread Miklos Szeredi
On Thu, Nov 29, 2018 at 5:14 PM Stephen Smalley  wrote:

> Possibly I misunderstood you, but I don't think we want to copy-up on
> permission denial, as that would still allow the mounter to read/write
> special files or execute regular files to which it would normally be
> denied access, because the copy would inherit the context specified by
> the mounter in the context mount case.  It still represents an
> escalation of privilege for the mounter.  In contrast, the copy-up on
> write behavior does not allow the mounter to do anything it could not do
> already (i.e. read from the lower, write to the upper).

Let's get this straight:  when file is copied up, it inherits label
from context=, not from label of lower file?

Next question: permission to change metadata is tied to permission to
open?  Is it possible that open is denied, but metadata can be
changed?

DAC model allows this: metadata change is tied to ownership, not mode
bits.   And different capability flag.

If the same is true for MAC, then the pre-v4.20-rc1 is already
susceptible to the privilege escalation you describe, right?

Thanks,
Miklos


Re: [PATCH] thermal/drivers/hisi: Fix bad initialization

2018-11-29 Thread Eduardo Valentin
On Thu, Nov 29, 2018 at 07:26:56PM +0100, Daniel Lezcano wrote:
> Without this patch, the thermal driver on hi6220 and hi3660 is broken.
> 
> That is because part of the posted patchset was merged but a small
> change in the DT was dropped.
> 
> The hi6220 and hi3660 do not have an interrupt name in the DT, so
> finding interrupt by name fails.
> 
> In addition, the hi3660 only defines one thermal zone in the DT and we
> are trying to register two sensors assuming we have two thermal zones
> in the DT.
> 
> Fix this by adding a couple of line of code to add back compatibility
> with older DT and change the sensors number to 1 for the hi3660.

Is this a case of adding dt versioning for those nodes?

> 
> Fixes: 2cffaeff083f (thermal/drivers/hisi: Use platform_get_irq_byname)
> Signed-off-by: Daniel Lezcano 
> ---
>  drivers/thermal/hisi_thermal.c | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/thermal/hisi_thermal.c b/drivers/thermal/hisi_thermal.c
> index c4111a9..3ab0e63 100644
> --- a/drivers/thermal/hisi_thermal.c
> +++ b/drivers/thermal/hisi_thermal.c
> @@ -424,7 +424,7 @@ static int hi3660_thermal_probe(struct hisi_thermal_data 
> *data)
>   struct platform_device *pdev = data->pdev;
>   struct device *dev = &pdev->dev;
>  
> - data->nr_sensors = 2;
> + data->nr_sensors = 1;

For bisectability (heh.. is that even a word?), would you please send
one fix per patch?

>  
>   data->sensor = devm_kzalloc(dev, sizeof(*data->sensor) *
>   data->nr_sensors, GFP_KERNEL);
> @@ -590,8 +590,13 @@ static int hisi_thermal_probe(struct platform_device 
> *pdev)
>   }
>  
>   ret = platform_get_irq_byname(pdev, sensor->irq_name);
> - if (ret < 0)
> - return ret;
> + if (ret <= 0) {

Maybe a simple < is enough? Reading it seems awkward. At a glance, I
don't think platform_get_irq* ever returns 0.

> + ret = platform_get_irq(pdev, 0);
> + if (ret <=  0) {

Same here.

> + dev_err(dev, "Failed get interrupt: %d\n", ret);
> + return ret;
> + }
> + }
>  
>   ret = devm_request_threaded_irq(dev, ret, NULL,
>   hisi_thermal_alarm_irq_thread,
> -- 
> 2.7.4
> 
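
For reference, a sketch of the probe fallback with the plain "ret < 0"
checks suggested above (illustrative only, untested):

	ret = platform_get_irq_byname(pdev, sensor->irq_name);
	if (ret < 0) {
		/* Older DTs carry no interrupt names; fall back to index 0. */
		ret = platform_get_irq(pdev, 0);
		if (ret < 0) {
			dev_err(dev, "Failed to get interrupt: %d\n", ret);
			return ret;
		}
	}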


RE: [PATCH 4.14 044/100] ACPICA: AML interpreter: add region addresses in global list during initialization

2018-11-29 Thread Schmauss, Erik



> -Original Message-
> From: Jean Delvare [mailto:jdelv...@suse.de]
> Sent: Thursday, November 29, 2018 11:34 AM
> To: Greg Kroah-Hartman 
> Cc: Schmauss, Erik ; linux-
> ker...@vger.kernel.org; sta...@vger.kernel.org; Jean-Marc Lenoir
> ; Wysocki, Rafael J 
> Subject: Re: [PATCH 4.14 044/100] ACPICA: AML interpreter: add region
> addresses in global list during initialization
> 
> On Thu, 29 Nov 2018 20:21:37 +0100, Greg Kroah-Hartman wrote:
> > On Thu, Nov 29, 2018 at 06:56:40PM +, Schmauss, Erik wrote:
> > > This should only apply to 4.17 or later. I unintentionally put all
> > > applicable so please drop this for all 4.16 or earlier. I've learned
> > > my lesson and I'll put the correct tags from now on :-)
> >
> > Ok, now dropped from 4.14, thanks.
> 
> Should be dropped from 4.9 and 4.4 too... if it was not clear.

Yes, it should,

Thanks,
Erik
> 
> Thanks,
> --
> Jean Delvare
> SUSE L3 Support


Re: [PATCH 4.14 044/100] ACPICA: AML interpreter: add region addresses in global list during initialization

2018-11-29 Thread Jean Delvare
On Thu, 29 Nov 2018 20:21:37 +0100, Greg Kroah-Hartman wrote:
> On Thu, Nov 29, 2018 at 06:56:40PM +, Schmauss, Erik wrote:
> > This should only apply to 4.17 or later. I unintentionally put all 
> > applicable so
> > please drop this for all 4.16 or earlier. I've learned my lesson and I'll 
> > put the
> > correct tags from now on :-)  
> 
> Ok, now dropped from 4.14, thanks.

Should be dropped from 4.9 and 4.4 too... if it was not clear.

Thanks,
-- 
Jean Delvare
SUSE L3 Support


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Steven Rostedt
On Thu, 29 Nov 2018 11:24:43 -0800
Linus Torvalds  wrote:

> On Thu, Nov 29, 2018 at 11:16 AM Steven Rostedt  wrote:
> >
> > But then we need to implement all numbers of parameters.  
> 
> Oh, I agree, it's nasty.
> 
> But it's actually a nastiness that we've solved before. In particular,
> with the system call mappings, which have pretty much the exact same
> issue of "map unknown number of arguments to registers".
> 
> Yes, it's different - there you map the unknown number of arguments to
> a structure access instead. And yes, the macros are unbelievably ugly.
> See
> 
> arch/x86/include/asm/syscall_wrapper.h

Those are not doing inline assembly.

> 
> and the __MAP() macro from
> 
> include/linux/syscalls.h
> 
> so it's not pretty. But it would solve all the problems.
> 

Again, not inline assembly, and those only handle up to 6 parameters.

My POC started down this route, until I noticed that there are
tracepoints with 13 parameters! And I need to handle all tracepoints.

Yes, we can argue that we need to change those (if that doesn't break
the API of something using it).

-- Steve



Re: [PATCH] PCI: pciehp: Report degraded links via link bandwidth notification

2018-11-29 Thread Mika Westerberg
On Thu, Nov 29, 2018 at 07:00:58PM +, alex_gagn...@dellteam.com wrote:
> >> +  if (link_status & PCI_EXP_LNKSTA_LBMS) {
> >> +  if (pdev->subordinate && pdev->subordinate->self)
> >> +  endpoint = pdev->subordinate->self;
> > 
> > Hmm, I thought pdev->subordinate->self == pdev, no?
> 
> That makes no sense, but I think you're right. I'm trying to get to the 
> other end of the PCIe link. Is there a simple way to do that? (other 
> than convoluted logic that all leads to the same mistake)

AFAIK you should be able to find the other end by looking at the
pdev->subordinate->devices list. Not sure if there is a simpler way,
though.
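
For illustration, one way to do that along the lines suggested above (a
sketch only; the helper name is made up and device-list locking is
omitted):

static struct pci_dev *pciehp_link_partner(struct pci_dev *pdev)
{
	if (!pdev->subordinate)
		return NULL;

	/* First (usually only) device on the downstream bus. */
	return list_first_entry_or_null(&pdev->subordinate->devices,
					struct pci_dev, bus_list);
}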


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Andy Lutomirski
On Thu, Nov 29, 2018 at 11:25 AM Linus Torvalds
 wrote:
>
> On Thu, Nov 29, 2018 at 11:16 AM Steven Rostedt  wrote:
> >
> > But then we need to implement all numbers of parameters.
>
> Oh, I agree, it's nasty.
>
> But it's actually a nastiness that we've solved before. In particular,
> with the system call mappings, which have pretty much the exact same
> issue of "map unknown number of arguments to registers".
>
> Yes, it's different - there you map the unknown number of arguments to
> a structure access instead. And yes, the macros are unbelievably ugly.
> See
>
> arch/x86/include/asm/syscall_wrapper.h
>
> and the __MAP() macro from
>
> include/linux/syscalls.h
>
> so it's not pretty. But it would solve all the problems.
>

Until someone does:

struct foo foo;
static_call(thingy, foo);

For syscalls, we know better than to do that.  For static calls, I'm
less confident.


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Steven Rostedt
On Thu, 29 Nov 2018 13:22:11 -0600
Josh Poimboeuf  wrote:

> On Thu, Nov 29, 2018 at 02:16:48PM -0500, Steven Rostedt wrote:
> > > and honestly, the way "static_call()" works now, can you guarantee
> > > that the call-site doesn't end up doing that, and calling the
> > > trampoline function for two different static calls from one indirect
> > > call?
> > > 
> > > See what I'm talking about? Saying "callers are wrapped in macros"
> > > doesn't actually protect you from the compiler doing things like that.
> > > 
> > > In contrast, if the call was wrapped in an inline asm, we'd *know* the
> > > compiler couldn't turn a "call wrapper(%rip)" into anything else.  
> > 
> > But then we need to implement all numbers of parameters.  
> 
> I actually have an old unfinished patch which (ab)used C macros to
> detect the number of parameters and then setup the asm constraints
> accordingly.  At the time, the goal was to optimize the BUG code.
> 
> I had wanted to avoid this kind of approach for static calls, because
> "ugh", but now it's starting to look much more appealing.
> 
> Behold:
> 
> diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
> index aa6b2023d8f8..d63e9240da77 100644
> --- a/arch/x86/include/asm/bug.h
> +++ b/arch/x86/include/asm/bug.h
> @@ -32,10 +32,59 @@
>  
>  #ifdef CONFIG_DEBUG_BUGVERBOSE
>  
> -#define _BUG_FLAGS(ins, flags)   
> \
> +#define __BUG_ARGS_0(ins, ...) \
> +({\
> + asm volatile("1:\t" ins "\n"); \
> +})
> +#define __BUG_ARGS_1(ins, ...) \
> +({\
> + asm volatile("1:\t" ins "\n" \
> +  : : "D" (ARG1(__VA_ARGS__))); \
> +})
> +#define __BUG_ARGS_2(ins, ...) \
> +({\
> + asm volatile("1:\t" ins "\n" \
> +  : : "D" (ARG1(__VA_ARGS__)), \
> +  "S" (ARG2(__VA_ARGS__))); \
> +})
> +#define __BUG_ARGS_3(ins, ...) \
> +({\
> + asm volatile("1:\t" ins "\n" \
> +  : : "D" (ARG1(__VA_ARGS__)), \
> +  "S" (ARG2(__VA_ARGS__)), \
> +  "d" (ARG3(__VA_ARGS__))); \
> +})
> +#define __BUG_ARGS_4(ins, ...) \
> +({\
> + asm volatile("1:\t" ins "\n" \
> +  : : "D" (ARG1(__VA_ARGS__)), \
> +  "S" (ARG2(__VA_ARGS__)), \
> +  "d" (ARG3(__VA_ARGS__)), \
> +  "c" (ARG4(__VA_ARGS__))); \
> +})
> +#define __BUG_ARGS_5(ins, ...) \
> +({\
> + register u64 __r8 asm("r8") = (u64)ARG5(__VA_ARGS__); \
> + asm volatile("1:\t" ins "\n" \
> +  : : "D" (ARG1(__VA_ARGS__)), \
> +  "S" (ARG2(__VA_ARGS__)), \
> +  "d" (ARG3(__VA_ARGS__)), \
> +  "c" (ARG4(__VA_ARGS__)), \
> +  "r" (__r8)); \
> +})
> +#define __BUG_ARGS_6 foo
> +#define __BUG_ARGS_7 foo
> +#define __BUG_ARGS_8 foo
> +#define __BUG_ARGS_9 foo
> +


There exist tracepoints with 13 arguments.

-- Steve
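
For reference, the usual argument-counting trick that such __BUG_ARGS_n
variants would be dispatched through looks roughly like this (an
illustrative sketch only, shown here for up to four arguments):

#define __COUNT_ARGS(_, a1, a2, a3, a4, n, ...)	n
#define COUNT_ARGS(...)	__COUNT_ARGS(, ##__VA_ARGS__, 4, 3, 2, 1, 0)

#define ___CONCAT(a, b)	a##b
#define __CONCAT(a, b)	___CONCAT(a, b)

/* Expands to __BUG_ARGS_0(ins) ... __BUG_ARGS_4(ins, args) as needed. */
#define __BUG_ARGS(ins, ...) \
	__CONCAT(__BUG_ARGS_, COUNT_ARGS(__VA_ARGS__))(ins, ##__VA_ARGS__)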


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Andy Lutomirski
On Thu, Nov 29, 2018 at 11:08 AM Linus Torvalds
 wrote:
>
> On Thu, Nov 29, 2018 at 10:58 AM Linus Torvalds
>  wrote:
> >
> > In contrast, if the call was wrapped in an inline asm, we'd *know* the
> > compiler couldn't turn a "call wrapper(%rip)" into anything else.
>
> Actually, I think I have a better model - if the caller is done with inline 
> asm.
>
> What you can do then is basically add a single-byte prefix to the
> "call" instruction that does nothing (say, cs override), and then
> replace *that* with a 'int3' instruction.
>
> Boom. Done.
>
> Now, the "int3" handler can just update the instruction in-place, but
> leave the "int3" in place, and then return to the next instruction
> byte (which is just the normal branch instruction without the prefix
> byte).
>
> The cross-CPU case continues to work, because the 'int3' remains in
> place until after the IPI.

Hmm, cute.  But then the calls are in inline asm, which results in
giant turds like we have for the pvop vcalls.  And, if they start
being used more generally, we potentially have ABI issues where the
calling convention isn't quite what the asm expects, and we explode.

I propose a different solution:

As in this patch set, we have a direct and an indirect version.  The
indirect version remains exactly the same as in this patch set.  The
direct version just only does the patching when all seems well: the
call instruction needs to be 0xe8, and we only do it when the thing
doesn't cross a cache line.  Does that work?  In the rare case where
the compiler generates something other than 0xe8 or crosses a cache
line, then the thing just remains as a call to the out of line jmp
trampoline.  Does that seem reasonable?  It's a very minor change to
the patch set.

Alternatively, we could actually emulate call instructions like this:

void __noreturn jump_to_kernel_pt_regs(struct pt_regs *regs, ...)
{
  struct pt_regs ptregs_copy = *regs;
  barrier();
  *(unsigned long *)(regs->sp - 8) = whatever;  /* may clobber old
regs, but so what? */
  asm volatile ("jmp return_to_alternate_ptregs");
}

where return_to_alternate_ptregs points rsp to the ptregs and goes
through the normal return path.  It's ugly, but we could have a test
case for it, and it should work fine.
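
For the first approach above (patch only when the site looks safe), the check
could be sketched roughly as below. The helper name and constants are
illustrative, not taken from the patch set:

#include <stdbool.h>
#include <stdint.h>

#define CALL_INSN_SIZE 5	/* 0xe8 + rel32 */
#define CACHELINE_SIZE 64

/* hypothetical helper: true if this direct-call site may be patched live */
static bool static_call_site_patchable(const uint8_t *insn)
{
	uintptr_t addr = (uintptr_t)insn;

	/* the compiler must have emitted a plain rel32 call (opcode 0xe8) */
	if (insn[0] != 0xe8)
		return false;

	/* and the 5-byte instruction must not straddle a cache line */
	return (addr / CACHELINE_SIZE) ==
	       ((addr + CALL_INSN_SIZE - 1) / CACHELINE_SIZE);
}

Sites failing either test would simply keep calling the out-of-line jmp
trampoline, as described above.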


Re: dcache_readdir NULL inode oops

2018-11-29 Thread Jan Glauber
On Wed, Nov 28, 2018 at 08:08:06PM +, Will Deacon wrote:
> I spent some more time looking at this today...
> 
> On Fri, Nov 23, 2018 at 06:05:25PM +, Will Deacon wrote:
> > Doing some more debugging, it looks like the usual failure case is where
> > one CPU clears the inode field in the dentry via:
> >
> >   devpts_pty_kill()
> >   -> d_delete()   // dentry->d_lockref.count == 1
> >   -> dentry_unlink_inode()
> >
> > whilst another CPU gets a pointer to the dentry via:
> >
> >   sys_getdents64()
> >   -> iterate_dir()
> >   -> dcache_readdir()
> >   -> next_positive()
> >
> > and explodes on the subsequent inode dereference when trying to pass the
> > inode number to dir_emit():
> >
> >   if (!dir_emit(..., d_inode(next)->i_ino, ...))
> >
> > Indeed, the hack below triggers a warning, indicating that the inode
> > is being cleared concurrently.
> >
> > I can't work out whether the getdents64() path should hold a refcount
> > to stop d_delete() in its tracks, or whether devpts_pty_kill() shouldn't
> > be calling d_delete() like this at all.
> 
> So the issue is that opening /dev/pts/ptmx creates a new pty in /dev/pts,
> which disappears when you close /dev/pts/ptmx. Consequently, when we tear
> > down the dentry for the magic new file, we have to take the inode rwsem of
> > the *parent* so that concurrent path walkers don't trip over it whilst it's
> being freed. I wrote a simple concurrent program to getdents(/dev/pts/) in
> one thread, whilst another opens and closes /dev/pts/ptmx: it crashes the
> kernel in seconds.

I also made a testcase and verified that your fix is fine. I also tried
replacing open-close on /dev/ptmx with mkdir-rmdir but that does not
trigger the error.

> Patch below, but I'd still like somebody else to look at this, please.

I wonder why no inode_lock on parent is needed for devpts_pty_new(), but
I'm obviously not a VFS expert... So your patch looks good to me and
clearly solves the issue.

thanks,
Jan

> Will
> 
> --->8
> 
> diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
> index c53814539070..50ddb95ff84c 100644
> --- a/fs/devpts/inode.c
> +++ b/fs/devpts/inode.c
> @@ -619,11 +619,17 @@ void *devpts_get_priv(struct dentry *dentry)
>   */
>  void devpts_pty_kill(struct dentry *dentry)
>  {
> -   WARN_ON_ONCE(dentry->d_sb->s_magic != DEVPTS_SUPER_MAGIC);
> +   struct super_block *sb = dentry->d_sb;
> +   struct dentry *parent = sb->s_root;
> 
> +   WARN_ON_ONCE(sb->s_magic != DEVPTS_SUPER_MAGIC);
> +
> +   inode_lock(parent->d_inode);
> dentry->d_fsdata = NULL;
> drop_nlink(dentry->d_inode);
> d_delete(dentry);
> +   inode_unlock(parent->d_inode);
> +
> dput(dentry);   /* d_alloc_name() in devpts_pty_new() */
>  }
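
A user-space reproducer along the lines Will and Jan describe might look like
the sketch below (illustrative only, not the actual test programs from this
thread): one thread repeatedly walks /dev/pts with getdents64() while another
opens and closes /dev/ptmx.

#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>

static void *scan_pts(void *unused)
{
	char buf[4096];

	for (;;) {
		int fd = open("/dev/pts", O_RDONLY | O_DIRECTORY);

		if (fd < 0)
			continue;
		/* keep iterating while ptys are created and destroyed */
		while (syscall(SYS_getdents64, fd, buf, sizeof(buf)) > 0)
			;
		close(fd);
	}
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, scan_pts, NULL);
	for (;;) {
		int fd = open("/dev/ptmx", O_RDWR | O_NOCTTY);

		if (fd >= 0)
			close(fd);	/* tears the freshly created pty back down */
	}
}

Built with -pthread and run as an unprivileged user, this hammers the race
described above.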


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2018 at 11:16 AM Steven Rostedt  wrote:
>
> But then we need to implement all numbers of parameters.

Oh, I agree, it's nasty.

But it's actually a nastiness that we've solved before. In particular,
with the system call mappings, which have pretty much the exact same
issue of "map unknown number of arguments to registers".

Yes, it's different - there you map the unknown number of arguments to
a structure access instead. And yes, the macros are unbelievably ugly.
See

arch/x86/include/asm/syscall_wrapper.h

and the __MAP() macro from

include/linux/syscalls.h

so it's not pretty. But it would solve all the problems.

  Linus
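
For readers without the tree open, the trick in include/linux/syscalls.h is a
small family of macros that apply a two-argument macro to each (type, name)
pair of the argument list. Roughly (simplified from memory, so treat it as a
sketch rather than the exact kernel definition):

#define __MAP0(m, ...)
#define __MAP1(m, t, a, ...) m(t, a)
#define __MAP2(m, t, a, ...) m(t, a), __MAP1(m, __VA_ARGS__)
#define __MAP3(m, t, a, ...) m(t, a), __MAP2(m, __VA_ARGS__)
/* ...continued up to six arguments in the kernel... */
#define __MAP(n, ...) __MAP##n(__VA_ARGS__)

/* example expander: turn (type, name) pairs into a parameter list */
#define __SC_DECL(t, a) t a
/* __MAP(2, __SC_DECL, int, fd, char *, buf) expands to: int fd, char * buf */

The same counting-and-mapping machinery could in principle drive the asm
constraint setup for static_call() wrappers.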


Re: [PATCH v2] signal: add procfd_signal() syscall

2018-11-29 Thread Andy Lutomirski
On Thu, Nov 29, 2018 at 11:17 AM Christian Brauner  wrote:
>
> On November 30, 2018 5:54:18 AM GMT+13:00, Andy Lutomirski 
>  wrote:
> >
> >
> >> On Nov 29, 2018, at 4:28 AM, Florian Weimer 
> >wrote:
> >>
> >> Disclaimer: I'm looking at this patch because Christian requested it.
> >> I'm not a kernel developer.
> >>
> >> * Christian Brauner:
> >>
> >>> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl
> >b/arch/x86/entry/syscalls/syscall_32.tbl
> >>> index 3cf7b533b3d1..3f27ffd8ae87 100644
> >>> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> >>> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> >>> @@ -398,3 +398,4 @@
> >>> 384	i386	arch_prctl	sys_arch_prctl	__ia32_compat_sys_arch_prctl
> >>> 385	i386	io_pgetevents	sys_io_pgetevents	__ia32_compat_sys_io_pgetevents
> >>> 386	i386	rseq	sys_rseq	__ia32_sys_rseq
> >>> +387	i386	procfd_signal	sys_procfd_signal	__ia32_compat_sys_procfd_signal
> >>> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl
> >b/arch/x86/entry/syscalls/syscall_64.tbl
> >>> index f0b1709a5ffb..8a30cde82450 100644
> >>> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> >>> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> >>> @@ -343,6 +343,7 @@
> >>> 332	common	statx	__x64_sys_statx
> >>> 333	common	io_pgetevents	__x64_sys_io_pgetevents
> >>> 334	common	rseq	__x64_sys_rseq
> >>> +335	64	procfd_signal	__x64_sys_procfd_signal
> >>>
> >>> #
> >>> # x32-specific system call numbers start at 512 to avoid cache
> >impact
> >>> @@ -386,3 +387,4 @@
> >>> 545	x32	execveat	__x32_compat_sys_execveat/ptregs
> >>> 546	x32	preadv2	__x32_compat_sys_preadv64v2
> >>> 547	x32	pwritev2	__x32_compat_sys_pwritev64v2
> >>> +548	x32	procfd_signal	__x32_compat_sys_procfd_signal
> >>
> >> Is there a reason why these numbers have to be different?
> >>
> >> (See the recent discussion with Andy Lutomirski.)
> >
> >Hah, I missed this part of the patch.  Let’s not add new x32 syscall
> >numbers.
> >
> >Also, can we perhaps rework this a bit to get rid of the compat entry
> >point?  The easier way would be to check in_compat_syscall(). The nicer
> >way IMO would be to use the 64-bit structure for 32-bit as well.
>
> Do you have a syscall which set precedence/did this before I could look at?
> Just if you happen to remember one.
> Fwiw, I followed the other signal syscalls.
> They all introduce compat syscalls.
>

Not really.

Let me try to explain.  I have three issues with the approach in your patchset:

1. You're introducing a new syscall, and it behaves differently on
32-bit and 64-bit because the structure you pass in is different.
This is necessary for old syscalls where compatibility matters, but
maybe we can get rid of it for new syscalls.   Could we define a
siginfo64_t that is identical to the 64-bit siginfo_t and just use
that in all cases?

2. Assuming that #1 doesn't work, then we need compat support.  But
you're doing it by having two different entry points.  Instead, you
could have a single entry point that calls in_compat_syscall() to
decide which structure to read.  This would simplify things because
x86 doesn't really support the separate compat entry points, which
leads me to #3.

3. The separate x32 numbers are a huge turd that may have security
holes and certainly have comprehensibility holes.  I will object to
any patch that adds a new one (like yours).  Fixing #1 or #2 makes
this problem go away.

Does that make any sense?  The #2 fix would be something like:

if (in_compat_syscall())
  copy...user32();
else
  copy_from_user();

The #1 fix would add a copy_siginfo_from_user64() or similar.


Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

2018-11-29 Thread Thomas Gleixner
On Thu, 29 Nov 2018, Vitaly Kuznetsov wrote:
> Nadav Amit  writes:
> 
> >> On Nov 28, 2018, at 5:07 AM, Thomas Gleixner  wrote:
> >> 
> >> On Wed, 28 Nov 2018, Vitaly Kuznetsov wrote:
> >> 
> >>> Nadav Amit  writes:
> >>> 
>  On a different note: how come all of the hyper-v structs are not marked
>  with the “packed" attribute?
> >>> 
> >>> "packed" should not be needed with proper padding; I vaguely remember
> >>> someone (from x86@?) arguing _against_ "packed".
> >> 
> >> Packed needs to be used, when describing fixed format data structures in
> >> hardware or other ABIs, so the compiler cannot put alignment holes into
> >> them.
> >> 
> >> Using packed for generic data structures might result in suboptimal layouts
> >> and prevents layout randomization.
> >
> > Right, I forgot about the structs randomization. So at least for it, the
> > attribute should be needed.
> >
> 
> Not sure when randomization is used but Hyper-V drivers will of course be
> utterly broken with it.
> 
> > To prevent conflicts, I think that this series should also add the
> > attribute in a first patch, which would be tagged for stable.
> 
> As the patchset doesn't add new definitions and as Paolo already queued
> it I'd go with a follow-up patch adding "packed" to all hyperv-tlfs.h
> structures. The question is how to avoid conflicts when Linus will be
> merging this. We can do:
> - Topic branch in kvm
> - Send the patch to x86, make topic branch and rebase kvm
> - Send the patch to kvm
> - ... ?
> 
> Paolo/Thomas, what would be your preference?

As Paolo already has it, just route it through his tree please.

Thanks,

tglx
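
The point being argued is easy to see on a two-struct example (illustrative
only, not taken from hyperv-tlfs.h): with explicit padding the layout is
already fixed and packing adds nothing, while without it only the packed
attribute stops the compiler from inserting a hole.

#include <stdint.h>

/* layout fixed by explicit padding: no packed attribute needed */
struct msg_padded {
	uint8_t  type;
	uint8_t  reserved[3];
	uint32_t data;		/* offset 4 on any common ABI */
};

/* no explicit padding: only the attribute keeps data at offset 1 */
struct msg_packed {
	uint8_t  type;
	uint32_t data;
} __attribute__((packed));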

Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Josh Poimboeuf
On Thu, Nov 29, 2018 at 02:16:48PM -0500, Steven Rostedt wrote:
> > and honestly, the way "static_call()" works now, can you guarantee
> > that the call-site doesn't end up doing that, and calling the
> > trampoline function for two different static calls from one indirect
> > call?
> > 
> > See what I'm talking about? Saying "callers are wrapped in macros"
> > doesn't actually protect you from the compiler doing things like that.
> > 
> > In contrast, if the call was wrapped in an inline asm, we'd *know* the
> > compiler couldn't turn a "call wrapper(%rip)" into anything else.
> 
> But then we need to implement all numbers of parameters.

I actually have an old unfinished patch which (ab)used C macros to
detect the number of parameters and then setup the asm constraints
accordingly.  At the time, the goal was to optimize the BUG code.

I had wanted to avoid this kind of approach for static calls, because
"ugh", but now it's starting to look much more appealing.

Behold:

diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index aa6b2023d8f8..d63e9240da77 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -32,10 +32,59 @@
 
 #ifdef CONFIG_DEBUG_BUGVERBOSE
 
-#define _BUG_FLAGS(ins, flags) \
+#define __BUG_ARGS_0(ins, ...) \
+({\
+   asm volatile("1:\t" ins "\n"); \
+})
+#define __BUG_ARGS_1(ins, ...) \
+({\
+   asm volatile("1:\t" ins "\n" \
+: : "D" (ARG1(__VA_ARGS__))); \
+})
+#define __BUG_ARGS_2(ins, ...) \
+({\
+   asm volatile("1:\t" ins "\n" \
+: : "D" (ARG1(__VA_ARGS__)), \
+"S" (ARG2(__VA_ARGS__))); \
+})
+#define __BUG_ARGS_3(ins, ...) \
+({\
+   asm volatile("1:\t" ins "\n" \
+: : "D" (ARG1(__VA_ARGS__)), \
+"S" (ARG2(__VA_ARGS__)), \
+"d" (ARG3(__VA_ARGS__))); \
+})
+#define __BUG_ARGS_4(ins, ...) \
+({\
+   asm volatile("1:\t" ins "\n" \
+: : "D" (ARG1(__VA_ARGS__)), \
+"S" (ARG2(__VA_ARGS__)), \
+"d" (ARG3(__VA_ARGS__)), \
+"c" (ARG4(__VA_ARGS__))); \
+})
+#define __BUG_ARGS_5(ins, ...) \
+({\
+   register u64 __r8 asm("r8") = (u64)ARG5(__VA_ARGS__); \
+   asm volatile("1:\t" ins "\n" \
+: : "D" (ARG1(__VA_ARGS__)), \
+"S" (ARG2(__VA_ARGS__)), \
+"d" (ARG3(__VA_ARGS__)), \
+"c" (ARG4(__VA_ARGS__)), \
+"r" (__r8)); \
+})
+#define __BUG_ARGS_6 foo
+#define __BUG_ARGS_7 foo
+#define __BUG_ARGS_8 foo
+#define __BUG_ARGS_9 foo
+
+#define __BUG_ARGS(ins, num, ...) __BUG_ARGS_ ## num(ins, __VA_ARGS__)
+
+#define _BUG_ARGS(ins, num, ...) __BUG_ARGS(ins, num, __VA_ARGS__)
+
+#define _BUG_FLAGS(ins, flags, ...)\
 do {   \
-   asm volatile("1:\t" ins "\n"\
-".pushsection __bug_table,\"aw\"\n"\
+   _BUG_ARGS(ins, NUM_ARGS(__VA_ARGS__), __VA_ARGS__); \
+   asm volatile(".pushsection __bug_table,\"aw\"\n"\
 "2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n"   \
 "\t"  __BUG_REL(%c0) "\t# bug_entry::file\n"   \
 "\t.word %c1""\t# bug_entry::line\n"   \
@@ -76,7 +125,7 @@ do { 
\
unreachable();  \
 } while (0)
 
-#define __WARN_FLAGS(flags)_BUG_FLAGS(ASM_UD0, BUGFLAG_WARNING|(flags))
+#define __WARN_FLAGS(flags, ...)   _BUG_FLAGS(ASM_UD0, 
BUGFLAG_WARNING|(flags), __VA_ARGS__)
 
 #include 
 
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 70c7732c9594..0cb16e912c02 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -58,8 +58,8 @@ struct bug_entry {
 #endif
 
 #ifdef __WARN_FLAGS
-#define __WARN_TAINT(taint)__WARN_FLAGS(BUGFLAG_TAINT(taint))
-#define __WARN_ONCE_TAINT(taint)   
__WARN_FLAGS(BUGFLAG_ONCE|BUGFLAG_TAINT(taint))
+#define __WARN_TAINT(taint, args...)   
__WARN_FLAGS(BUGFLAG_TAINT(taint), args)
+#define __WARN_ONCE_TAINT(taint, args...)  
__WARN_FLAGS(BUGFLAG_ONCE|BUGFLAG_TAINT(taint), args)
 
 #define WARN_ON_ONCE(condition) ({ \
int __ret_warn_on = !!(condition);  \
@@ -84,11 +84,12 @@ void warn_slowpath_fmt_taint(const char *file, const int 
line, unsigned taint,
 extern void warn_slowpath_null(const char *file, const int line);
 #ifdef __WARN_TAINT
 #define __WARN()   __WARN_TAINT(TAINT_WARN)
+#define __WARN_printf(args...) __WARN_TAINT(TAINT_WARN, args)
 #else
 #define __WARN()   warn_slowpath_null(__FILE__, 
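
The hunk relies on NUM_ARGS()/ARGn() helpers that are not part of the quoted
diff. A minimal sketch of how such argument-counting macros are usually
written (illustrative definitions, not necessarily Josh's):

/* count 0..5 arguments (the 0-argument case uses the GNU ##__VA_ARGS__
 * extension, which the kernel relies on anyway)
 */
#define NUM_ARGS(...)	_NUM_ARGS(0, ##__VA_ARGS__, 5, 4, 3, 2, 1, 0)
#define _NUM_ARGS(_0, _1, _2, _3, _4, _5, N, ...) N

/* select the n-th argument */
#define ARG1(_1, ...) _1
#define ARG2(_1, _2, ...) _2
#define ARG3(_1, _2, _3, ...) _3
#define ARG4(_1, _2, _3, _4, ...) _4
#define ARG5(_1, _2, _3, _4, _5, ...) _5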

Re: [PATCH 4.14 044/100] ACPICA: AML interpreter: add region addresses in global list during initialization

2018-11-29 Thread Greg Kroah-Hartman
On Thu, Nov 29, 2018 at 06:56:40PM +, Schmauss, Erik wrote:
> 
> 
> > -Original Message-
> > From: Greg Kroah-Hartman [mailto:gre...@linuxfoundation.org]
> > Sent: Thursday, November 29, 2018 7:01 AM
> > To: Jean Delvare 
> > Cc: linux-kernel@vger.kernel.org; sta...@vger.kernel.org; Jean-Marc Lenoir
> > ; Schmauss, Erik ;
> > Wysocki, Rafael J 
> > Subject: Re: [PATCH 4.14 044/100] ACPICA: AML interpreter: add region
> > addresses in global list during initialization
> > 
> > On Thu, Nov 29, 2018 at 03:45:26PM +0100, Jean Delvare wrote:
> > > Hi Greg,
> > >
> > > On Thu, 2018-11-29 at 15:12 +0100, Greg Kroah-Hartman wrote:
> > > > 4.14-stable review patch.  If anyone has any objections, please let me
> > know.
> > > >
> > > > --
> > > >
> > > > From: Erik Schmauss 
> > > >
> > > > commit 4abb951b73ff0a8a979113ef185651aa3c8da19b upstream.
> > > >
> > > > The table load process omitted adding the operation region address
> > > > range to the global list. This omission is problematic because the
> > > > OS queries the global list to check for address range conflicts
> > > > before deciding which drivers to load. This commit may result in
> > > > warning messages that look like the following:
> > > >
> > > > [7.871761] ACPI Warning: system_IO range 0x0428-0x042F conflicts
> > > > with op_region 0x0400-0x047F (\PMIO) (20180531/utaddress-213)
> > > > [7.871769] ACPI: If an ACPI driver is available for this device,
> > > > you should use it instead of the native driver
> > > >
> > > > However, these messages do not signify regressions. It is a result
> > > > of properly adding address ranges within the global address list.
> > > >
> > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=200011
> > > > Tested-by: Jean-Marc Lenoir 
> > > > Signed-off-by: Erik Schmauss 
> > > > Cc: All applicable 
> > > > Signed-off-by: Rafael J. Wysocki 
> > > > Cc: Jean Delvare 
> > > > Signed-off-by: Greg Kroah-Hartman 
> > >
> > > I'm confused. While we were discussing the regression, Erik said that
> > > this is fixing commit 5a8361f7ecceaed64b4064000d16cb703462be49, which
> > > went upstream in v4.17. So how can the fix be needed in any kernel
> > > older than v4.17? Erik, did I understand you incorrectly?
> > 
> Hi Greg,
> 
> > The patch says "All applicable", and I assumed that meant, "as long as it
> > applies."
> >
> > Erik, should I drop this from 4.14.y?
> 
> This should only apply to 4.17 or later. I unintentionally put all applicable so
> please drop this for all 4.16 or earlier. I've learned my lesson and I'll put the
> correct tags from now on :-)

Ok, now dropped from 4.14, thanks.

greg k-h


Re: [PATCH v2] signal: add procfd_signal() syscall

2018-11-29 Thread Christian Brauner
On November 30, 2018 5:54:18 AM GMT+13:00, Andy Lutomirski 
 wrote:
>
>
>> On Nov 29, 2018, at 4:28 AM, Florian Weimer 
>wrote:
>> 
>> Disclaimer: I'm looking at this patch because Christian requested it.
>> I'm not a kernel developer.
>> 
>> * Christian Brauner:
>> 
>>> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl
>b/arch/x86/entry/syscalls/syscall_32.tbl
>>> index 3cf7b533b3d1..3f27ffd8ae87 100644
>>> --- a/arch/x86/entry/syscalls/syscall_32.tbl
>>> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
>>> @@ -398,3 +398,4 @@
>>> 384	i386	arch_prctl	sys_arch_prctl	__ia32_compat_sys_arch_prctl
>>> 385	i386	io_pgetevents	sys_io_pgetevents	__ia32_compat_sys_io_pgetevents
>>> 386	i386	rseq	sys_rseq	__ia32_sys_rseq
>>> +387	i386	procfd_signal	sys_procfd_signal	__ia32_compat_sys_procfd_signal
>>> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl
>b/arch/x86/entry/syscalls/syscall_64.tbl
>>> index f0b1709a5ffb..8a30cde82450 100644
>>> --- a/arch/x86/entry/syscalls/syscall_64.tbl
>>> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
>>> @@ -343,6 +343,7 @@
>>> 332	common	statx	__x64_sys_statx
>>> 333	common	io_pgetevents	__x64_sys_io_pgetevents
>>> 334	common	rseq	__x64_sys_rseq
>>> +335	64	procfd_signal	__x64_sys_procfd_signal
>>> 
>>> #
>>> # x32-specific system call numbers start at 512 to avoid cache
>impact
>>> @@ -386,3 +387,4 @@
>>> 545	x32	execveat	__x32_compat_sys_execveat/ptregs
>>> 546	x32	preadv2	__x32_compat_sys_preadv64v2
>>> 547	x32	pwritev2	__x32_compat_sys_pwritev64v2
>>> +548	x32	procfd_signal	__x32_compat_sys_procfd_signal
>> 
>> Is there a reason why these numbers have to be different?
>> 
>> (See the recent discussion with Andy Lutomirski.)
>
>Hah, I missed this part of the patch.  Let’s not add new x32 syscall
>numbers.
>
>Also, can we perhaps rework this a bit to get rid of the compat entry
>point?  The easier way would be to check in_compat_syscall(). The nicer
>way IMO would be to use the 64-bit structure for 32-bit as well.

Do you have a syscall which set precedence/did this before I could look at?
Just if you happen to remember one.
Fwiw, I followed the other signal syscalls.
They all introduce compat syscalls.



Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Steven Rostedt
On Thu, 29 Nov 2018 10:58:40 -0800
Linus Torvalds  wrote:

> On Thu, Nov 29, 2018 at 10:47 AM Steven Rostedt  wrote:
> >
> > Note, we do have a bit of control at what is getting called. The patch
> > set requires that the callers are wrapped in macros. We should not
> > allow just any random callers (like from asm).  
> 
> Actually, I'd argue that asm is often more controlled than C code.
> 
> Right now you can do odd things if you really want to, and have the
> compiler generate indirect calls to those wrapper functions.
> 
> For example, I can easily imagine a pre-retpoline compiler turning
> 
>  if (cond)
> fn1(a,b)
>  else
>fn2(a,b);
> 
> into a function pointer conditional
> 
> (cond ? fn1 : fn2)(a,b);

If we are worried about such a construct, wouldn't a compiler barrier
before and after the static_call solve that?

barrier();
static_call(func...);
barrier();

It should also stop tail calls too.

> 
> and honestly, the way "static_call()" works now, can you guarantee
> that the call-site doesn't end up doing that, and calling the
> trampoline function for two different static calls from one indirect
> call?
> 
> See what I'm talking about? Saying "callers are wrapped in macros"
> doesn't actually protect you from the compiler doing things like that.
> 
> In contrast, if the call was wrapped in an inline asm, we'd *know* the
> compiler couldn't turn a "call wrapper(%rip)" into anything else.

But then we need to implement all numbers of parameters.

-- Steve


Re: [PATCH] PCI: pciehp: Report degraded links via link bandwidth notification

2018-11-29 Thread Lukas Wunner
On Thu, Nov 29, 2018 at 06:57:37PM +, alex_gagn...@dellteam.com wrote:
> On 11/29/2018 11:36 AM, Bjorn Helgaas wrote:
> > On Wed, Nov 28, 2018 at 06:08:24PM -0600, Alexandru Gagniuc wrote:
> >> A warning is generated when a PCIe device is probed with a degraded
> >> link, but there was no similar mechanism to warn when the link becomes
> >> degraded after probing. The Link Bandwidth Notification provides this
> >> mechanism.
> >>
> >> Use the link bandwidth notification interrupt to detect bandwidth
> >> changes, and rescan the bandwidth, looking for the weakest point. This
> >> is the same logic used in probe().
> > 
> > I like the concept of this.  What I don't like is the fact that it's
> > tied to pciehp, since I don't think the concept of Link Bandwidth
> > Notification is related to hotplug.  So I think we'll only notice this
> > for ports that support hotplug.  Maybe it's worth doing it this way
> > anyway, even if it could be generalized in the future?
> 
> That makes sense. At first, I thought that BW notification was tied to 
> hotplug, but our PCIe spec writer disagreed with that assertion. I'm 
> just not sure where to handle the interrupt otherwise.

I guess the interrupt is shared with hotplug and PME?  In that case write
a separate pcie_port_service_driver and request the interrupt with
IRQF_SHARED.  Define a new service type in drivers/pci/pcie/portdrv.h.
Amend get_port_device_capability() to check for PCI_EXP_LNKCAP_LBNC.

Thanks,

Lukas
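
The skeleton Lukas is suggesting would look something like the sketch below.
PCIE_PORT_SERVICE_BWNOTIF and the bwnotif_* names are placeholders for
whatever the new service type ends up being called:

#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/pci.h>
#include "portdrv.h"	/* assumes the driver lives in drivers/pci/pcie/ */

static irqreturn_t bwnotif_irq(int irq, void *context)
{
	/* a real handler must check PCI_EXP_LNKSTA_LBMS before claiming the
	 * shared interrupt, then re-evaluate the link bandwidth as probe does
	 */
	return IRQ_HANDLED;
}

static int bwnotif_probe(struct pcie_device *srv)
{
	/* also enable PCI_EXP_LNKCTL_LBMIE on srv->port here */
	return request_irq(srv->irq, bwnotif_irq, IRQF_SHARED,
			   "pcie_bwnotif", srv);
}

static void bwnotif_remove(struct pcie_device *srv)
{
	free_irq(srv->irq, srv);
}

static struct pcie_port_service_driver bwnotif_driver = {
	.name		= "pcie_bwnotif",
	.port_type	= PCIE_ANY_PORT,
	.service	= PCIE_PORT_SERVICE_BWNOTIF,	/* new service type */
	.probe		= bwnotif_probe,
	.remove		= bwnotif_remove,
};

static int __init bwnotif_init(void)
{
	return pcie_port_service_register(&bwnotif_driver);
}
device_initcall(bwnotif_init);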


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Steven Rostedt
On Thu, 29 Nov 2018 11:08:26 -0800
Linus Torvalds  wrote:

> On Thu, Nov 29, 2018 at 10:58 AM Linus Torvalds
>  wrote:
> >
> > In contrast, if the call was wrapped in an inline asm, we'd *know* the
> > compiler couldn't turn a "call wrapper(%rip)" into anything else.  
> 
> Actually, I think I have a better model - if the caller is done with inline 
> asm.
> 
> What you can do then is basically add a single-byte prefix to the
> "call" instruction that does nothing (say, cs override), and then
> replace *that* with a 'int3' instruction.
> 
> Boom. Done.
> 
> Now, the "int3" handler can just update the instruction in-place, but
> leave the "int3" in place, and then return to the next instruction
> byte (which is just the normal branch instruction without the prefix
> byte).
> 
> The cross-CPU case continues to work, because the 'int3' remains in
> place until after the IPI.
> 
> But that would require that we'd mark those call instruction with
> 

In my original proof of concept, I tried to implement the callers
with asm, but then the way to handle parameters became a nightmare.

The goal of this (for me) was to replace the tracepoint indirect calls
with static calls, and tracepoints can have any number of parameters to
pass. I ended up needing the compiler to help me with the passing of
parameters.

-- Steve


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2018 at 11:08 AM Linus Torvalds
 wrote:
>
> What you can do then is basically add a single-byte prefix to the
> "call" instruction that does nothing (say, cs override), and then
> replace *that* with a 'int3' instruction.

Hmm. the segment prefixes are documented as being "reserved" for
branch instructions. I *think* that means just conditional branches
(Intel at one point used the prefixes for static prediction
information), not "call", but who knows..

It might be better to use an empty REX prefix on x86-64 or something like that.

   Linus


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2018 at 10:58 AM Linus Torvalds
 wrote:
>
> In contrast, if the call was wrapped in an inline asm, we'd *know* the
> compiler couldn't turn a "call wrapper(%rip)" into anything else.

Actually, I think I have a better model - if the caller is done with inline asm.

What you can do then is basically add a single-byte prefix to the
"call" instruction that does nothing (say, cs override), and then
replace *that* with a 'int3' instruction.

Boom. Done.

Now, the "int3" handler can just update the instruction in-place, but
leave the "int3" in place, and then return to the next instruction
byte (which is just the normal branch instruction without the prefix
byte).

The cross-CPU case continues to work, because the 'int3' remains in
place until after the IPI.

But that would require that we'd mark those call instruction with

  Linus
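
Put together as rough pseudocode (all names below are illustrative, and the
details are exactly the kind of thing the rest of this thread argues about),
the sequence would be:

#include <linux/compiler.h>
#include <linux/ptrace.h>
#include <linux/smp.h>
#include <linux/types.h>
#include <asm/processor.h>	/* sync_core(), cpu_relax() */
#include <asm/text-patching.h>	/* text_poke() */

static u8 *patch_site;		/* address of the do-nothing prefix byte */
static bool patch_rel32_done;	/* set once the new displacement is written */

static void site_sync(void *unused)
{
	sync_core();
}

static void static_call_update_site(u8 *site, void *target)
{
	const u8 int3 = 0xcc, prefix = 0x2e;	/* 0x2e: cs override */
	s32 rel32 = (s32)((long)target - (long)(site + 6));

	patch_rel32_done = false;
	patch_site = site;

	text_poke(site, &int3, 1);		/* prefix byte -> int3 */
	on_each_cpu(site_sync, NULL, 1);	/* nobody is mid-instruction now */

	text_poke(site + 2, &rel32, 4);		/* rewrite the call displacement */
	WRITE_ONCE(patch_rel32_done, true);

	on_each_cpu(site_sync, NULL, 1);
	text_poke(site, &prefix, 1);		/* remove the breakpoint */
	patch_site = NULL;
}

/*
 * int3 handler: a CPU that traps on a site being patched waits until the new
 * displacement is fully in place, then resumes at the next byte, which is the
 * ordinary 5-byte call.  (Resuming earlier would risk decoding a half-written
 * displacement.)
 */
static bool static_call_int3(struct pt_regs *regs)
{
	if (!patch_site || (u8 *)regs->ip != patch_site + 1)
		return false;
	while (!READ_ONCE(patch_rel32_done))
		cpu_relax();
	return true;	/* regs->ip already points at the 0xe8 call opcode */
}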


[PATCH] mmc: sdhci-omap: Workaround errata regarding SDR104/HS200 tuning failures (i929)

2018-11-29 Thread Faiz Abbas
Errata i929 in certain OMAP5/DRA7XX/AM57XX silicon revisions
(SPRZ426D - November 2014 - Revised February 2018 [1]) mentions
unexpected tuning pattern errors. A small failure band may be present
in the tuning range which may be missed by the current algorithm.
Furthermore, the failure bands vary with temperature leading to
different optimum tuning values for different temperatures.

As suggested in the related Application Report (SPRACA9B - October 2017
- Revised July 2018 [2]), tuning should be done in two stages.
In stage 1, assign the optimum ratio in the maximum pass window for the
current temperature. In stage 2, if the chosen value is close to the
small failure band, move away from it in the appropriate direction.

References:
[1] http://www.ti.com/lit/pdf/sprz426
[2] http://www.ti.com/lit/pdf/SPRACA9

Signed-off-by: Faiz Abbas 
---
 drivers/mmc/host/Kconfig  |  2 +
 drivers/mmc/host/sdhci-omap.c | 90 ++-
 2 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 1b58739d9744..6d3553f06f27 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -969,6 +969,8 @@ config MMC_SDHCI_XENON
 config MMC_SDHCI_OMAP
tristate "TI SDHCI Controller Support"
depends on MMC_SDHCI_PLTFM && OF
+   select THERMAL
+   select TI_SOC_THERMAL
help
  This selects the Secure Digital Host Controller Interface (SDHCI)
  support present in TI's DRA7 SOCs. The controller supports
diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
index b3cb39d0db6f..9ccce7ef3a60 100644
--- a/drivers/mmc/host/sdhci-omap.c
+++ b/drivers/mmc/host/sdhci-omap.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "sdhci-pltfm.h"
 
@@ -286,14 +287,18 @@ static int sdhci_omap_execute_tuning(struct mmc_host 
*mmc, u32 opcode)
struct sdhci_host *host = mmc_priv(mmc);
struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
struct sdhci_omap_host *omap_host = sdhci_pltfm_priv(pltfm_host);
+   struct thermal_zone_device *thermal_dev;
struct device *dev = omap_host->dev;
struct mmc_ios *ios = &mmc->ios;
u32 start_window = 0, max_window = 0;
+   bool single_point_failure = false;
u8 cur_match, prev_match = 0;
u32 length = 0, max_len = 0;
u32 phase_delay = 0;
+   int temperature;
int ret = 0;
u32 reg;
+   int i;
 
/* clock tuning is not needed for upto 52MHz */
if (ios->clock <= 52000000)
@@ -303,6 +308,16 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, 
u32 opcode)
if (ios->timing == MMC_TIMING_UHS_SDR50 && !(reg & CAPA2_TSDR50))
return 0;
 
+   thermal_dev = thermal_zone_get_zone_by_name("cpu_thermal");
+   if (IS_ERR(thermal_dev)) {
+   dev_err(dev, "Unable to get thermal zone for tuning\n");
+   return PTR_ERR(thermal_dev);
+   }
+
+   ret = thermal_zone_get_temp(thermal_dev, &temperature);
+   if (ret)
+   return ret;
+
reg = sdhci_omap_readl(omap_host, SDHCI_OMAP_DLL);
reg |= DLL_SWT;
sdhci_omap_writel(omap_host, SDHCI_OMAP_DLL, reg);
@@ -317,6 +332,11 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, 
u32 opcode)
 
omap_host->is_tuning = true;
 
+   /*
+* Stage 1: Search for a maximum pass window ignoring any
+* single point failures. If the tuning value ends up
+* near it, move away from it in stage 2 below
+*/
while (phase_delay <= MAX_PHASE_DELAY) {
sdhci_omap_set_dll(omap_host, phase_delay);
 
@@ -324,10 +344,15 @@ static int sdhci_omap_execute_tuning(struct mmc_host 
*mmc, u32 opcode)
if (cur_match) {
if (prev_match) {
length++;
+   } else if (single_point_failure) {
+   /* ignore single point failure */
+   length++;
} else {
start_window = phase_delay;
length = 1;
}
+   } else {
+   single_point_failure = prev_match;
}
 
if (length > max_len) {
@@ -345,13 +370,76 @@ static int sdhci_omap_execute_tuning(struct mmc_host 
*mmc, u32 opcode)
goto tuning_error;
}
 
+   /*
+* Assign tuning value as a ratio of maximum pass window based
+* on temperature
+*/
+   if (temperature < -2)
+   phase_delay = min(max_window + 4 * max_len - 24,
+ max_window +
+ DIV_ROUND_UP(13 * max_len, 16) * 4);
+   else if (temperature < 2)
+   phase_delay = max_window + DIV_ROUND_UP(9 * max_len, 16) * 
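
For reference, a stand-alone C sketch of the two-stage selection described in
the commit message; the phase count, the mid-window pick and the +/-2 step are
illustration-only assumptions (the driver scales the pass window by
temperature-dependent ratios and steps in DLL increments instead):

#include <stdbool.h>
#include <stdio.h>

#define NUM_PHASES 128

/* Stage 1: widest pass window, treating an isolated failing phase as
 * part of the window (mirrors the scan in the hunk above). */
static void find_max_window(const bool pass[NUM_PHASES],
			    int *max_start, int *max_len)
{
	bool prev = false, single_point_failure = false;
	int start = 0, len = 0;

	*max_start = 0;
	*max_len = 0;

	for (int i = 0; i < NUM_PHASES; i++) {
		if (pass[i]) {
			if (prev || single_point_failure) {
				len++;
			} else {
				start = i;
				len = 1;
			}
		} else {
			single_point_failure = prev;
		}

		if (len > *max_len) {
			*max_start = start;
			*max_len = len;
		}
		prev = pass[i];
	}
}

/* Stage 2: pick a point inside the window and, if the neighbouring
 * phase fails (the small failure band), step away from it. */
static int pick_tuning_value(const bool pass[NUM_PHASES])
{
	int max_start, max_len, value;

	find_max_window(pass, &max_start, &max_len);
	if (!max_len)
		return -1;			/* no pass window: tuning failed */

	value = max_start + max_len / 2;	/* illustrative ratio */

	if (value + 1 < NUM_PHASES && !pass[value + 1])
		value -= 2;			/* failure just above: move down */
	else if (value > 0 && !pass[value - 1])
		value += 2;			/* failure just below: move up */

	return value;
}

int main(void)
{
	bool pass[NUM_PHASES] = { false };

	for (int i = 30; i < 90; i++)		/* pass window: phases 30..89 */
		pass[i] = true;
	pass[60] = false;			/* single-point failure inside it */

	printf("tuning value: %d\n", pick_tuning_value(pass));
	return 0;
}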

Re: [patch V2 00/28] x86/speculation: Remedy the STIBP/IBPB overhead

2018-11-29 Thread Tim Chen
On 11/28/2018 06:24 AM, Thomas Gleixner wrote:

> 
> I've integrated the latest review feedback and the change which plugs the
> TIF async update issue and pushed all of it out to:
> 
>git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/pti
> 
> For the stable 4.14.y and 4.19.y trees, I've collected the missing bits and
> pieces and uploaded tarballs which contain everything ready for consumption:
> 
>https://tglx.de/~tglx/patches-spec-4.14.y.tar.xz
> 
>   sha256 of patches-spec-4.14.y.tar:
>   3d2976ef06ab5556c1c6cba975b0c9390eb57f43c506fb7f8834bb484feb9b17
> 
>https://tglx.de/~tglx/patches-spec-4.19.y.tar.xz
> 
>   sha256 of patches-spec-4.19.y.tar:
>   b7666cf378ad63810a17e98a471aae81a49738c552dbe912aea49de83f8145cc
> 
> Thanks everyone for review, discussion, testing ... !

Big thanks to Thomas for getting all these changes merged.

Tim


Re: [PATCH 00/10] Regulator ena_gpiod fixups

2018-11-29 Thread Mark Brown
On Thu, Nov 29, 2018 at 07:38:20PM +0100, Bartosz Golaszewski wrote:

> I'm wondering if instead of using the non-devm variants of
> gpiod_get_*() routines, we shouldn't provide helpers in the regulator
> framework that would be named accordingly, for example:
> regulator_gpiod_get_optional() etc. even if all they do is call the
> relevant gpiolib function. Those helpers could then be documented as
> passing the control over GPIO lines over to the regulator subsystem.

> The reason for that is that most driver developers will automatically
> use devm functions whenever available and having a single non-devm
> function without any comment used in a driver normally using devres
> looks like a bug. Expect people sending "fixes" in a couple months.

I predict that people would then immediately start demanding devm_
variants of that function...


signature.asc
Description: PGP signature


Re: [PATCH] PCI: pciehp: Report degraded links via link bandwidth notification

2018-11-29 Thread Alex_Gagniuc
On 11/29/2018 10:06 AM, Mika Westerberg wrote:
>> @@ -515,7 +515,8 @@ static irqreturn_t pciehp_isr(int irq, void *dev_id)
>>  struct controller *ctrl = (struct controller *)dev_id;
>>  struct pci_dev *pdev = ctrl_dev(ctrl);
>>  struct device *parent = pdev->dev.parent;
>> -u16 status, events;
>> +struct pci_dev *endpoint;
>> +u16 status, events, link_status;
> 
> Looks better if you write them in opposite order (reverse xmas-tree):
> 
>   u16 status, events, link_status;
>   struct pci_dev *endpoint;
> 

I don't decorate in November :p

>> +pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &link_status);
>> +
> 
> Unnecessary empty line.

However Bjorn wants it, though I don't like the crowded look with this 
line removed.

>> +if (link_status & PCI_EXP_LNKSTA_LBMS) {
>> +if (pdev->subordinate && pdev->subordinate->self)
>> +endpoint = pdev->subordinate->self;
> 
> Hmm, I thought pdev->subordinate->self == pdev, no?

That makes no sense, but I think you're right. I'm trying to get to the 
other end of the PCIe link. Is there a simple way to do that? (other 
than convoluted logic that all leads to the same mistake)

>>   static void pcie_enable_notification(struct controller *ctrl)
>>   {
>>  u16 cmd, mask;
>> @@ -713,6 +743,9 @@ static void pcie_enable_notification(struct controller 
>> *ctrl)
>>  pcie_write_cmd_nowait(ctrl, cmd, mask);
>>  ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
>>   pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, cmd);
>> +
>> +if (pcie_link_bandwidth_notification_supported(ctrl))
>> +pcie_enable_link_bandwidth_notification(ctrl);
> 
> Do we ever need to disable it?

I can't think of a case where that would be needed.

Alex


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2018 at 10:47 AM Steven Rostedt  wrote:
>
> Note, we do have a bit of control at what is getting called. The patch
> set requires that the callers are wrapped in macros. We should not
> allow just any random callers (like from asm).

Actually, I'd argue that asm is often more controlled than C code.

Right now you can do odd things if you really want to, and have the
compiler generate indirect calls to those wrapper functions.

For example, I can easily imagine a pre-retpoline compiler turning

 if (cond)
fn1(a,b)
 else
   fn2(a,b);

into a function pointer conditional

(cond ? fn1 : fn2)(a,b);

and honestly, the way "static_call()" works now, can you guarantee
that the call-site doesn't end up doing that, and calling the
trampoline function for two different static calls from one indirect
call?

See what I'm talking about? Saying "callers are wrapped in macros"
doesn't actually protect you from the compiler doing things like that.

In contrast, if the call was wrapped in an inline asm, we'd *know* the
compiler couldn't turn a "call wrapper(%rip)" into anything else.

  Linus


Re: [PATCH] PCI: pciehp: Report degraded links via link bandwidth notification

2018-11-29 Thread Alex_Gagniuc
On 11/29/2018 11:36 AM, Bjorn Helgaas wrote:
> On Wed, Nov 28, 2018 at 06:08:24PM -0600, Alexandru Gagniuc wrote:
>> A warning is generated when a PCIe device is probed with a degraded
>> link, but there was no similar mechanism to warn when the link becomes
>> degraded after probing. The Link Bandwidth Notification provides this
>> mechanism.
>>
>> Use the link bandwidth notification interrupt to detect bandwidth
>> changes, and rescan the bandwidth, looking for the weakest point. This
>> is the same logic used in probe().
> 
> I like the concept of this.  What I don't like is the fact that it's
> tied to pciehp, since I don't think the concept of Link Bandwidth
> Notification is related to hotplug.  So I think we'll only notice this
> for ports that support hotplug.  Maybe it's worth doing it this way
> anyway, even if it could be generalized in the future?

That makes sense. At first, I thought that BW notification was tied to 
hotplug, but our PCIe spec writer disagreed with that assertion. I'm 
just not sure where to handle the interrupt otherwise.

Alex



RE: [PATCH 4.14 044/100] ACPICA: AML interpreter: add region addresses in global list during initialization

2018-11-29 Thread Schmauss, Erik



> -Original Message-
> From: Greg Kroah-Hartman [mailto:gre...@linuxfoundation.org]
> Sent: Thursday, November 29, 2018 7:01 AM
> To: Jean Delvare 
> Cc: linux-kernel@vger.kernel.org; sta...@vger.kernel.org; Jean-Marc Lenoir
> ; Schmauss, Erik ;
> Wysocki, Rafael J 
> Subject: Re: [PATCH 4.14 044/100] ACPICA: AML interpreter: add region
> addresses in global list during initialization
> 
> On Thu, Nov 29, 2018 at 03:45:26PM +0100, Jean Delvare wrote:
> > Hi Greg,
> >
> > On Thu, 2018-11-29 at 15:12 +0100, Greg Kroah-Hartman wrote:
> > > 4.14-stable review patch.  If anyone has any objections, please let me
> know.
> > >
> > > --
> > >
> > > From: Erik Schmauss 
> > >
> > > commit 4abb951b73ff0a8a979113ef185651aa3c8da19b upstream.
> > >
> > > The table load process omitted adding the operation region address
> > > range to the global list. This omission is problematic because the
> > > OS queries the global list to check for address range conflicts
> > > before deciding which drivers to load. This commit may result in
> > > warning messages that look like the following:
> > >
> > > [7.871761] ACPI Warning: system_IO range 0x0428-0x042F
> conflicts with op_region 0x0400-0x047F (\PMIO)
> (20180531/utaddress-213)
> > > [7.871769] ACPI: If an ACPI driver is available for this device, you 
> > > should
> use it instead of the native driver
> > >
> > > However, these messages do not signify regressions. It is a result
> > > of properly adding address ranges within the global address list.
> > >
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=200011
> > > Tested-by: Jean-Marc Lenoir 
> > > Signed-off-by: Erik Schmauss 
> > > Cc: All applicable 
> > > Signed-off-by: Rafael J. Wysocki 
> > > Cc: Jean Delvare 
> > > Signed-off-by: Greg Kroah-Hartman 
> >
> > I'm confused. While we were discussing the regression, Erik said that
> > this is fixing commit 5a8361f7ecceaed64b4064000d16cb703462be49, which
> > went upstream in v4.17. So how can the fix be needed in any kernel
> > older than v4.17? Erik, did I understand you incorrectly?
> 
Hi Greg,

> The patch says "All applicable", and I assumed that meant, "as long as it
> applies."
>
> Erik, should I drop this from 4.14.y?

This should only apply to 4.17 or later. I unintentionally put all applicable so
please drop this for all 4.16 or earlier. I've learned my lesson and I'll put 
the
correct tags from now on :-)

Erik
> 
> thanks,
> 
> greg k-h


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Steven Rostedt
On Thu, 29 Nov 2018 10:00:48 -0800
Andy Lutomirski  wrote:

> > 
> > Of course, another option is to just say "we don't do the inline case,
> > then", and only ever do a call to a stub that does a "jmp"
> > instruction.  
> 
> That’s not a terrible idea.

It was the implementation of my first proof of concept that kicked off
this entire idea, where others (Peter and Josh) thought it was better
to modify the calls themselves. It does improve things.

Just a reminder of the benchmarks of enabling all tracepoints (which
use indirect jumps) and running hackbench:

  No RETPOLINES:
1.4503 +- 0.0148 seconds time elapsed  ( +-  1.02% )

  baseline RETPOLINES:
1.5120 +- 0.0133 seconds time elapsed  ( +-  0.88% )

  Added direct calls for trace_events:
1.5239 +- 0.0139 seconds time elapsed  ( +-  0.91% )

  With static calls:
1.5282 +- 0.0135 seconds time elapsed  ( +-  0.88% )

  With static call trampolines:
   1.48328 +- 0.00515 seconds time elapsed  ( +-  0.35% )

  Full static calls:
   1.47364 +- 0.00706 seconds time elapsed  ( +-  0.48% )


  Adding Retpolines caused a 1.5120 / 1.4503 = 1.0425 ( 4.25% ) slowdown

  Trampolines made it into 1.48328 / 1.4503 = 1.0227 ( 2.27% ) slowdown

The above is the stub with the jmp case.

  With full static calls 1.47364 / 1.4503 = 1.0160 ( 1.6% ) slowdown

Modifying the calls themselves does have an improvement (and this is
much greater of an improvement when I had debugging enabled).

Perhaps it's not worth the effort, but again, we do have control of
what uses this. It's not a total free-for-all.

Full results here:

  http://lkml.kernel.org/r/20181126155405.72b4f...@gandalf.local.home

Although since lore.kernel.org seems to be having issues:

  https://marc.info/?l=linux-kernel&m=154326714710686


-- Steve
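
The quoted slowdown percentages follow directly from those times; a trivial
check (numbers copied from the runs above, nothing re-measured):

#include <stdio.h>

int main(void)
{
	const double base  = 1.4503;	/* no retpolines */
	const double retp  = 1.5120;	/* baseline retpolines */
	const double tramp = 1.48328;	/* static call trampolines */
	const double full  = 1.47364;	/* full static calls */

	printf("retpolines:  +%.2f%%\n", (retp / base - 1) * 100);
	printf("trampolines: +%.2f%%\n", (tramp / base - 1) * 100);
	printf("full static: +%.2f%%\n", (full / base - 1) * 100);
	return 0;
}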


Re: [Patch v4 1/3] CIFS: Add support for direct I/O read

2018-11-29 Thread Pavel Shilovsky
Wed, Nov 28, 2018 at 15:43, Long Li :
>
> > Subject: Re: [Patch v4 1/3] CIFS: Add support for direct I/O read
> >
> > Hi Long,
> >
> > Please find my comments below.
> >
> >
> > Wed, Oct 31, 2018 at 15:14, Long Li :
> > >
> > > From: Long Li 
> > >
> > > With direct I/O read, we transfer the data directly from transport
> > > layer to the user data buffer.
> > >
> > > Change in v3: add support for kernel AIO
> > >
> > > Change in v4:
> > > Refactor common read code to __cifs_readv for direct and non-direct I/O.
> > > Retry on direct I/O failure.
> > >
> > > Signed-off-by: Long Li 
> > > ---
> > >  fs/cifs/cifsfs.h   |   1 +
> > >  fs/cifs/cifsglob.h |   5 ++
> > >  fs/cifs/file.c | 219 +++--
> > 
> > >  3 files changed, 186 insertions(+), 39 deletions(-)
> > >
> > > diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h index
> > > 5f02318..7fba9aa 100644
> > > --- a/fs/cifs/cifsfs.h
> > > +++ b/fs/cifs/cifsfs.h
> > > @@ -102,6 +102,7 @@ extern int cifs_open(struct inode *inode, struct
> > > file *file);  extern int cifs_close(struct inode *inode, struct file
> > > *file);  extern int cifs_closedir(struct inode *inode, struct file
> > > *file);  extern ssize_t cifs_user_readv(struct kiocb *iocb, struct
> > > iov_iter *to);
> > > +extern ssize_t cifs_direct_readv(struct kiocb *iocb, struct iov_iter
> > > +*to);
> > >  extern ssize_t cifs_strict_readv(struct kiocb *iocb, struct iov_iter
> > > *to);  extern ssize_t cifs_user_writev(struct kiocb *iocb, struct
> > > iov_iter *from);  extern ssize_t cifs_strict_writev(struct kiocb
> > > *iocb, struct iov_iter *from); diff --git a/fs/cifs/cifsglob.h
> > > b/fs/cifs/cifsglob.h index 7f62c98..52248dd 100644
> > > --- a/fs/cifs/cifsglob.h
> > > +++ b/fs/cifs/cifsglob.h
> > > @@ -1146,6 +1146,11 @@ struct cifs_aio_ctx {
> > > unsigned int len;
> > > unsigned int total_len;
> > > bool should_dirty;
> > > +   /*
> > > +* Indicates if this aio_ctx is for direct_io,
> > > +* If yes, iter is a copy of the user passed iov_iter
> > > +*/
> > > +   bool direct_io;
> > >  };
> > >
> > >  struct cifs_readdata;
> > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 87eece6..daab878
> > > 100644
> > > --- a/fs/cifs/file.c
> > > +++ b/fs/cifs/file.c
> > > @@ -2965,7 +2965,6 @@ cifs_uncached_readdata_release(struct kref
> > *refcount)
> > > kref_put(&rdata->ctx->refcount, cifs_aio_ctx_release);
> > > for (i = 0; i < rdata->nr_pages; i++) {
> > > put_page(rdata->pages[i]);
> > > -   rdata->pages[i] = NULL;
> > > }
> > > cifs_readdata_release(refcount);  } @@ -3092,6 +3091,63 @@
> > > cifs_uncached_copy_into_pages(struct TCP_Server_Info *server,
> > > return uncached_fill_pages(server, rdata, iter, iter->count);
> > > }
> > >
> > > +static int cifs_resend_rdata(struct cifs_readdata *rdata,
> > > + struct list_head *rdata_list,
> > > + struct cifs_aio_ctx *ctx) {
> > > +   int wait_retry = 0;
> > > +   unsigned int rsize, credits;
> > > +   int rc;
> > > +   struct TCP_Server_Info *server =
> > > +tlink_tcon(rdata->cfile->tlink)->ses->server;
> > > +
> > > +   /*
> > > +* Try to resend this rdata, waiting for credits up to 3 seconds.
> > > +* Note: we are attempting to resend the whole rdata not in 
> > > segments
> > > +*/
> > > +   do {
> > > +   rc = server->ops->wait_mtu_credits(server, rdata->bytes,
> > > +   , );
> > > +
> > > +   if (rc)
> > > +   break;
> > > +
> > > +   if (rsize < rdata->bytes) {
> > > +   add_credits_and_wake_if(server, credits, 0);
> > > +   msleep(1000);
> > > +   wait_retry++;
> > > +   }
> > > +   } while (rsize < rdata->bytes && wait_retry < 3);
> > > +
> > > +   /*
> > > +* If we can't find enough credits to send this rdata
> > > +* release the rdata and return failure, this will pass
> > > +* whatever I/O amount we have finished to VFS.
> > > +*/
> > > +   if (rsize < rdata->bytes) {
> > > +   rc = -EBUSY;
> >
> > We don't have enough credits and return EBUSY here...
> >
> > > +   goto out;
> > > +   }
> > > +
> > > +   rc = -EAGAIN;
> > > +   while (rc == -EAGAIN)
> > > +   if (!rdata->cfile->invalidHandle ||
> > > +   !(rc = cifs_reopen_file(rdata->cfile, true)))
> > > +   rc = server->ops->async_readv(rdata);
> > > +
> > > +   if (!rc) {
> > > +   /* Add to aio pending list */
> > > +   list_add_tail(&rdata->list, rdata_list);
> > > +   return 0;
> > > +   }
> > > +
> > > +   

Re: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil

2018-11-29 Thread Waiman Long
On 11/29/2018 01:43 PM, Davidlohr Bueso wrote:
> On Thu, 29 Nov 2018, Peter Zijlstra wrote:
>
>> On Thu, Nov 29, 2018 at 02:12:32PM +0100, Peter Zijlstra wrote:
>>> Yes, I think this is real, and worse, I think we need to go audit all
>>> wake_q_add() users and document this behaviour.
>>>
>>> In the ideal case we'd delay the actual wakeup to the last wake_up_q(),
>>> but I don't think we can easily fix that.
>>
>> See commit: 1d0dcb3ad9d3 ("futex: Implement lockless wakeups"), I think
>> that introduces the exact same bug.
>>
>> Something like the below perhaps, although this pattern seems to want a
>> wake_q_add() variant that already assumes get_task_struct().
>
> So I was looking at ways to avoid the redundant reference counting,
> but given how wake_q_add() and wake_up_q() are so loose I can't
> see how to avoid it -- we hold reference across the calls to maintain
> valid mem.
>
> For example, wake_q will grab reference iff the cmpxchg succeeds,
> likewise it will enter the wakeup loop in wake_up_q(), and there is no
> awareness of which caller had the failed cmpxchg because another wakeup
> was in progress.
>

I think it will be useful to make wake_q_add() return the status of
wake_q insertion so that we can respond appropriately if it fails.

> And yes, afaict all wake_q users suffer from the same issue, so we have
> to move the wake_q_add() after the condition, while explicitly doing
> the task ref counting.

How about adding two helpers - wake_q_enter(), wake_q_exit() that can
hide all the reference counting?

Thanks,
Longman
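
A rough stand-alone model of that suggestion (plain C11 with a fake task
struct; names and refcounting are simplifications, not the kernel
implementation), where wake_q_add() reports whether it won the cmpxchg and
therefore queued the task:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct task {
	_Atomic(struct task *) next;
	atomic_int refcount;
	const char *name;
};

#define WAKE_Q_TAIL ((struct task *)0x1)

struct wake_q_head {
	_Atomic(struct task *) first;
	_Atomic(struct task *) *lastp;
};

#define DEFINE_WAKE_Q(name) \
	struct wake_q_head name = { WAKE_Q_TAIL, &(name).first }

static bool wake_q_add(struct wake_q_head *head, struct task *t)
{
	struct task *expected = NULL;

	/* Only the caller whose cmpxchg wins may queue the task. */
	if (!atomic_compare_exchange_strong(&t->next, &expected, WAKE_Q_TAIL))
		return false;

	atomic_fetch_add(&t->refcount, 1);	/* models get_task_struct() */
	atomic_store(head->lastp, t);
	head->lastp = &t->next;
	return true;
}

static void wake_up_q(struct wake_q_head *head)
{
	struct task *t = atomic_load(&head->first);

	while (t != WAKE_Q_TAIL) {
		struct task *next = atomic_load(&t->next);

		atomic_store(&t->next, NULL);	   /* may be re-queued now */
		printf("waking %s\n", t->name);	   /* models wake_up_process() */
		atomic_fetch_sub(&t->refcount, 1); /* models put_task_struct() */
		t = next;
	}
}

int main(void)
{
	DEFINE_WAKE_Q(wq);
	struct task a = { .name = "A" }, b = { .name = "B" };

	if (!wake_q_add(&wq, &a))
		puts("A was already queued by someone else");
	wake_q_add(&wq, &b);
	wake_up_q(&wq);
	return 0;
}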


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Steven Rostedt
On Thu, 29 Nov 2018 10:23:44 -0800
Linus Torvalds  wrote:

> On Thu, Nov 29, 2018 at 9:59 AM Steven Rostedt  wrote:
> >
> > Do you realize that the cmpxchg used by the first attempts of the
> > dynamic modification of code by ftrace was the source of the e1000e
> > NVRAM corruption bug.  
> 
> If you have a static call in IO memory, you have bigger problems than that.
> 
> What's your point?

Just that cmpxchg on dynamically modified code brings back bad memories ;-)

> 
> Again - I will point out that the things you guys have tried to come
> up with have been *WORSE*. Much worse.

Note, we do have a bit of control at what is getting called. The patch
set requires that the callers are wrapped in macros. We should not
allow just any random callers (like from asm).

This isn't about modifying any function call. This is for a specific
subset that we can impose rules on.

-- Steve



Re: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil

2018-11-29 Thread Davidlohr Bueso

On Thu, 29 Nov 2018, Peter Zijlstra wrote:


On Thu, Nov 29, 2018 at 02:12:32PM +0100, Peter Zijlstra wrote:

Yes, I think this is real, and worse, I think we need to go audit all
wake_q_add() users and document this behaviour.

In the ideal case we'd delay the actual wakeup to the last wake_up_q(),
but I don't think we can easily fix that.


See commit: 1d0dcb3ad9d3 ("futex: Implement lockless wakeups"), I think
that introduces the exact same bug.

Something like the below perhaps, although this pattern seems to want a
wake_q_add() variant that already assumes get_task_struct().


So I was looking at ways to avoid the redundant reference counting,
but given how wake_q_add() and wake_up_q() are so loose I can't
see how to avoid it -- we hold reference across the calls to maintain
valid mem.

For example, wake_q will grab reference iff the cmpxchg succeeds,
likewise it will enter the wakeup loop in wake_up_q(), and there is no
awareness of which caller had the failed cmpxchg because another wakeup
was in progress.

And yes, afaict all wake_q users suffer from the same issue, so we have
to move the wake_q_add() after the condition, while explicitly doing
the task ref counting.

Thanks,
Davidlohr


Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

2018-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2018 at 10:00 AM Andy Lutomirski  wrote:
> > then it really sounds pretty safe to just say "ok, just make it
> > aligned and update the instruction with an atomic cmpxchg or
> > something".
>
> And how do we do that?  With a gcc plugin and some asm magic?

Asm magic.

You already have to mark the call sites with

static_call(fn, arg1, arg2, ...);

and right now it just magically depends on gcc outputting the
right code to call the trampoline. But it could do it as a jmp
instruction (tail-call), and maybe that works right, maybe it doesn't.
And maybe some gcc switch makes it output it as an indirect call due to
instrumentation or something. Doing it with asm magic would, I feel,
be safer anyway, so that we'd know *exactly* how that call gets done.

For example, if gcc does it as a jmp due to a tail-call, the
compiler/linker could in theory turn the jump into a short jump if it
sees that the trampoline is close enough. Does that happen? Probably
not. But I don't see why it *couldn't* happen in the current patch
series. The trampoline is just a regular function, even if it has been
defined by global asm.

Putting the trampoline in a different code section could fix things
like that (maybe there was a patch that did that and I missed it?) but
I do think that doing the call with an asm would *also* fix it.

But the "just always use a trampoline" is certainly the simpler model.

  Linus
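
A minimal user-space sketch of those two pieces, assuming x86-64 and GNU
toolchain behaviour; the section name, trampoline name and target are made up,
and the clobber/red-zone handling is only roughed in:

#include <stdio.h>

void default_target(void)
{
	puts("hello from default_target");
}

/* Trampoline defined by global asm in its own section: nothing but a
 * direct jmp that run-time patching could later retarget. */
asm(".pushsection .static_call.text, \"ax\", @progbits\n"
    ".globl my_func_tramp\n"
    "my_func_tramp:\n"
    "	jmp	default_target\n"
    ".popsection\n");

int main(void)
{
	/* The call site is emitted by inline asm, so it is always the
	 * literal "call my_func_tramp" instruction; the compiler cannot
	 * turn it into a tail call, short jmp or indirect call. */
	asm volatile("call my_func_tramp"
		     : : : "rax", "rcx", "rdx", "rsi", "rdi",
		       "r8", "r9", "r10", "r11", "memory", "cc");
	return 0;
}

Disassembling the result shows exactly one direct call at the call site, which
is the property the inline-asm wrapping is meant to guarantee.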


[PATCH] linux-firmware: Update AMD cpu microcode

2018-11-29 Thread Allen, John
* Update AMD cpu microcode for processor family 17h

Key Name= AMD Microcode Signing Key (for signing microcode container 
files only)
Key ID  = F328AE73
Key Fingerprint = FC7C 6C50 5DAF CC14 7183 57CA E4BE 5339 F328 AE73

Signed-off-by: John Allen 
---
 WHENCE |   2 +-
 amd-ucode/microcode_amd_fam17h.bin | Bin 3364 -> 6476 bytes
 amd-ucode/microcode_amd_fam17h.bin.asc |  21 +
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/WHENCE b/WHENCE
index 31b0b74..8761ef0 100644
--- a/WHENCE
+++ b/WHENCE
@@ -3239,7 +3239,7 @@ Version: 2018-05-24
 File: amd-ucode/microcode_amd_fam16h.bin
 Version: 2014-10-28
 File: amd-ucode/microcode_amd_fam17h.bin
-Version: 2018-05-15
+Version: 2018-11-28
 
 License: Redistributable. See LICENSE.amd-ucode for details
 
diff --git a/amd-ucode/microcode_amd_fam17h.bin 
b/amd-ucode/microcode_amd_fam17h.bin
index 
8e6f4c181c32206dfac82c1f0025b2fd6b72495a..8ba3c612e10188a4572ce6b59fa754ceb58c5d0c
 100644
GIT binary patch
delta 693
zcmZ1?b;d~3#n+Jm1Pp*!h`#~E1OuT428Jdie$**x`(K)D7U1_lWQ1vc&`1`ZaW
zDrC?E<%cz!T(RvqE>ziRA@Fps{YguXfH22P%k-Y~a_1db`epO8zjy4Xu@h)sWa
zHP4PoO!%uqf=$WGGu)*Z0i{C=mz^WxiS@X(YT8C*;4Kc+e_E0Jl?TbuX>AC
zfY|d^dH%}QZG0Q7cVLN
zw6HvC^Rx>#vEGID*Yk_Gg08JhEjX!Vx2AAwyUZ_H-28IPVqkas5>1;rj5r
zjdQM{Esw4m1HLhsWkMYzmB%Q`kHv
zpJDglP+;It5$TDjm>k0)p|Oigh(S<+L!p63qhdch2ZO>KrUeY`>Macm^moAcf8>Q2
zCNJUen8d3BL@JxPfUc5ZT;RyS!2og%h_4_6)CK}ToBP{@7?6PGDlZJNYH|YA%p{
z4LlJQ|Kumn
