Regression in 4.9-rc1 for PPC32 - bisected to commit 05fd007e4629

2016-10-19 Thread Larry Finger
Kernel 4.9-rc1 fails to boot on my PowerBook G4 Aluminum (32-bit PowerPC) with 
the following splat:


Kernel Panic - not synching: Attempted to kill init: exitcode = 0x0200

Call trace:

dump_stack+0x24/0x34 (unreliable)
panic+0x110/0x2ac
do_exit+0x464/0x834
do_group_exit+0x84/0xac
__wake_up_parent+0x0/0x34
ret_from_syscall+0x0/0x40

As the panic happens very early, I was not able to capture the output, thus the 
above was entered by hand.


The problem was bisected to commit 05fd007e4629 ("console: don't prefer first 
registered if DT specifies stdout-path"). Examining that patch and testing the 
various hunks, I found that the system booted fine when I eliminated the hunk at


--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -2077,6 +2077,8 @@ void of_alias_scan(void * (*dt_alloc)(u64 size, u64 
align))
name = of_get_property(of_aliases, "stdout", NULL);
if (name)
of_stdout = of_find_node_opts_by_path(name, 
_stdout_options);

+   if (of_stdout)
+   console_set_by_of();
}

if (!of_aliases)

Similarly, it would boot if I eliminated the hunk at

--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2647,7 +2658,7 @@ void register_console(struct console *newcon)
 *  didn't select a console we take the first one
 *  that registers here.
 */
-   if (preferred_console < 0) {
+   if (preferred_console < 0 && !of_specified_console) {
if (newcon->index < 0)
newcon->index = 0;
if (newcon->setup == NULL ||

The problem happens when of_specified_console is true, and the code following 
the modified if statement above is not executed. In my .config, CONFIG_OF=y.


As always, I will be happy to test any fixes.

Thanks,

Larry

--
If I was stranded on an island and the only way to get off
the island was to make a pretty UI, I’d die there.

Linus Torvalds


Regression in 4.9-rc1 for PPC32 - bisected to commit 05fd007e4629

2016-10-19 Thread Larry Finger
Kernel 4.9-rc1 fails to boot on my PowerBook G4 Aluminum (32-bit PowerPC) with 
the following splat:


Kernel Panic - not synching: Attempted to kill init: exitcode = 0x0200

Call trace:

dump_stack+0x24/0x34 (unreliable)
panic+0x110/0x2ac
do_exit+0x464/0x834
do_group_exit+0x84/0xac
__wake_up_parent+0x0/0x34
ret_from_syscall+0x0/0x40

As the panic happens very early, I was not able to capture the output, thus the 
above was entered by hand.


The problem was bisected to commit 05fd007e4629 ("console: don't prefer first 
registered if DT specifies stdout-path"). Examining that patch and testing the 
various hunks, I found that the system booted fine when I eliminated the hunk at


--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -2077,6 +2077,8 @@ void of_alias_scan(void * (*dt_alloc)(u64 size, u64 
align))
name = of_get_property(of_aliases, "stdout", NULL);
if (name)
of_stdout = of_find_node_opts_by_path(name, 
_stdout_options);

+   if (of_stdout)
+   console_set_by_of();
}

if (!of_aliases)

Similarly, it would boot if I eliminated the hunk at

--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2647,7 +2658,7 @@ void register_console(struct console *newcon)
 *  didn't select a console we take the first one
 *  that registers here.
 */
-   if (preferred_console < 0) {
+   if (preferred_console < 0 && !of_specified_console) {
if (newcon->index < 0)
newcon->index = 0;
if (newcon->setup == NULL ||

The problem happens when of_specified_console is true, and the code following 
the modified if statement above is not executed. In my .config, CONFIG_OF=y.


As always, I will be happy to test any fixes.

Thanks,

Larry

--
If I was stranded on an island and the only way to get off
the island was to make a pretty UI, I’d die there.

Linus Torvalds


Re: [patch] perf_event_open.2: add cycles field in LBR records

2016-10-19 Thread Michael Kerrisk (man-pages)
Hi Vince,


On 10/19/2016 05:10 PM, Vince Weaver wrote:
> 
> Linux 4.3 (71ef3c6b9d4665ee7afbbe4c208a98917dcfc32f)
> added a cycles field to the PERF_SAMPLE_BRANCH_STACK
> last branch records.
> 
> The kernel commit was a bit vague on this, but you can find
> a few more details on this in the Intel Architectural Manual
> vol3B.  The field indicates the number of core cycles elapsed
> since the previous update to the LBR stack.
> 
> This feature is only found on Skylake and newer Intel chips,
> as well as Intel Atom Goldmont chips.  I'm not sure if it's
> worth adding this info to the manpage, as it seems a bit
> specific and will probably get rapidly out of date.
> 
> I'm also not sure about the manpage project's opinion on
> Oxford commas.

We like them!

Patch applied. One comment below.
> 
> Signed-off-by: Vince Weaver 
> 
> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
> index d64adaf..c9555c6 100644
> --- a/man2/perf_event_open.2
> +++ b/man2/perf_event_open.2
> @@ -2183,15 +2183,18 @@ The branch was in a transactional memory transaction.
>  .IR abort " (since Linux 3.11)"
>  .\" commit 135c5612c460f89657c4698fe2ea753f6f667963
>  The branch was in an aborted transactional memory transaction.
> +.TP
> +.IR cycles " (since Linux 4.3)"
> +.\" commit 71ef3c6b9d4665ee7afbbe4c208a98917dcfc32f
> +This reports the number of cycles elapsed since the
> +previous branch stack update.
>  .P
>  The entries are from most to least recent, so the first entry
>  has the most recent branch.
>  
>  Support for
> -.I mispred
> -and
> -.I predicted
> -is optional; if not supported, both
> +.IR mispred ", " predicted " and " cycles

Rather than the previous, I mostly prefer something like:

Support for
.IR mispred ,
.IR predicted , 
and
.IR cycles

I actually find source in that format a little easier to
read, and a lot easier to script against.

Cheers,

Michael


> +is optional; if not supported, those
>  values will be 0.
>  
>  The type of branches recorded is specified by the
> --
> To unsubscribe from this list: send the line "unsubscribe linux-man" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


Re: [patch] perf_event_open.2: add cycles field in LBR records

2016-10-19 Thread Michael Kerrisk (man-pages)
Hi Vince,


On 10/19/2016 05:10 PM, Vince Weaver wrote:
> 
> Linux 4.3 (71ef3c6b9d4665ee7afbbe4c208a98917dcfc32f)
> added a cycles field to the PERF_SAMPLE_BRANCH_STACK
> last branch records.
> 
> The kernel commit was a bit vague on this, but you can find
> a few more details on this in the Intel Architectural Manual
> vol3B.  The field indicates the number of core cycles elapsed
> since the previous update to the LBR stack.
> 
> This feature is only found on Skylake and newer Intel chips,
> as well as Intel Atom Goldmont chips.  I'm not sure if it's
> worth adding this info to the manpage, as it seems a bit
> specific and will probably get rapidly out of date.
> 
> I'm also not sure about the manpage project's opinion on
> Oxford commas.

We like them!

Patch applied. One comment below.
> 
> Signed-off-by: Vince Weaver 
> 
> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
> index d64adaf..c9555c6 100644
> --- a/man2/perf_event_open.2
> +++ b/man2/perf_event_open.2
> @@ -2183,15 +2183,18 @@ The branch was in a transactional memory transaction.
>  .IR abort " (since Linux 3.11)"
>  .\" commit 135c5612c460f89657c4698fe2ea753f6f667963
>  The branch was in an aborted transactional memory transaction.
> +.TP
> +.IR cycles " (since Linux 4.3)"
> +.\" commit 71ef3c6b9d4665ee7afbbe4c208a98917dcfc32f
> +This reports the number of cycles elapsed since the
> +previous branch stack update.
>  .P
>  The entries are from most to least recent, so the first entry
>  has the most recent branch.
>  
>  Support for
> -.I mispred
> -and
> -.I predicted
> -is optional; if not supported, both
> +.IR mispred ", " predicted " and " cycles

Rather than the previous, I mostly prefer something like:

Support for
.IR mispred ,
.IR predicted , 
and
.IR cycles

I actually find source in that format a little easier to
read, and a lot easier to script against.

Cheers,

Michael


> +is optional; if not supported, those
>  values will be 0.
>  
>  The type of branches recorded is specified by the
> --
> To unsubscribe from this list: send the line "unsubscribe linux-man" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


[PATCH V2 4/5] arm64: Allow hw watchpoint of length 3,5,6 and 7

2016-10-19 Thread Pratyush Anand
Since, arm64 can support all offset within a double word limit. Therefore,
now support other lengths within that range as well.

Signed-off-by: Pratyush Anand 
---
 arch/arm64/include/asm/hw_breakpoint.h |  4 
 arch/arm64/kernel/hw_breakpoint.c  | 36 ++
 2 files changed, 40 insertions(+)

diff --git a/arch/arm64/include/asm/hw_breakpoint.h 
b/arch/arm64/include/asm/hw_breakpoint.h
index 4f4e58bee9bc..7a18c8520588 100644
--- a/arch/arm64/include/asm/hw_breakpoint.h
+++ b/arch/arm64/include/asm/hw_breakpoint.h
@@ -76,7 +76,11 @@ static inline void decode_ctrl_reg(u32 reg,
 /* Lengths */
 #define ARM_BREAKPOINT_LEN_1   0x1
 #define ARM_BREAKPOINT_LEN_2   0x3
+#define ARM_BREAKPOINT_LEN_3   0x7
 #define ARM_BREAKPOINT_LEN_4   0xf
+#define ARM_BREAKPOINT_LEN_5   0x1f
+#define ARM_BREAKPOINT_LEN_6   0x3f
+#define ARM_BREAKPOINT_LEN_7   0x7f
 #define ARM_BREAKPOINT_LEN_8   0xff
 
 /* Kernel stepping */
diff --git a/arch/arm64/kernel/hw_breakpoint.c 
b/arch/arm64/kernel/hw_breakpoint.c
index c57bc90b8286..4125c2152e85 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -317,9 +317,21 @@ static int get_hbp_len(u8 hbp_len)
case ARM_BREAKPOINT_LEN_2:
len_in_bytes = 2;
break;
+   case ARM_BREAKPOINT_LEN_3:
+   len_in_bytes = 3;
+   break;
case ARM_BREAKPOINT_LEN_4:
len_in_bytes = 4;
break;
+   case ARM_BREAKPOINT_LEN_5:
+   len_in_bytes = 5;
+   break;
+   case ARM_BREAKPOINT_LEN_6:
+   len_in_bytes = 6;
+   break;
+   case ARM_BREAKPOINT_LEN_7:
+   len_in_bytes = 7;
+   break;
case ARM_BREAKPOINT_LEN_8:
len_in_bytes = 8;
break;
@@ -379,9 +391,21 @@ int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl 
ctrl,
case ARM_BREAKPOINT_LEN_2:
*gen_len = HW_BREAKPOINT_LEN_2;
break;
+   case ARM_BREAKPOINT_LEN_3:
+   *gen_len = HW_BREAKPOINT_LEN_3;
+   break;
case ARM_BREAKPOINT_LEN_4:
*gen_len = HW_BREAKPOINT_LEN_4;
break;
+   case ARM_BREAKPOINT_LEN_5:
+   *gen_len = HW_BREAKPOINT_LEN_5;
+   break;
+   case ARM_BREAKPOINT_LEN_6:
+   *gen_len = HW_BREAKPOINT_LEN_6;
+   break;
+   case ARM_BREAKPOINT_LEN_7:
+   *gen_len = HW_BREAKPOINT_LEN_7;
+   break;
case ARM_BREAKPOINT_LEN_8:
*gen_len = HW_BREAKPOINT_LEN_8;
break;
@@ -425,9 +449,21 @@ static int arch_build_bp_info(struct perf_event *bp)
case HW_BREAKPOINT_LEN_2:
info->ctrl.len = ARM_BREAKPOINT_LEN_2;
break;
+   case HW_BREAKPOINT_LEN_3:
+   info->ctrl.len = ARM_BREAKPOINT_LEN_3;
+   break;
case HW_BREAKPOINT_LEN_4:
info->ctrl.len = ARM_BREAKPOINT_LEN_4;
break;
+   case HW_BREAKPOINT_LEN_5:
+   info->ctrl.len = ARM_BREAKPOINT_LEN_5;
+   break;
+   case HW_BREAKPOINT_LEN_6:
+   info->ctrl.len = ARM_BREAKPOINT_LEN_6;
+   break;
+   case HW_BREAKPOINT_LEN_7:
+   info->ctrl.len = ARM_BREAKPOINT_LEN_7;
+   break;
case HW_BREAKPOINT_LEN_8:
info->ctrl.len = ARM_BREAKPOINT_LEN_8;
break;
-- 
2.7.4



[PATCH V2 4/5] arm64: Allow hw watchpoint of length 3,5,6 and 7

2016-10-19 Thread Pratyush Anand
Since, arm64 can support all offset within a double word limit. Therefore,
now support other lengths within that range as well.

Signed-off-by: Pratyush Anand 
---
 arch/arm64/include/asm/hw_breakpoint.h |  4 
 arch/arm64/kernel/hw_breakpoint.c  | 36 ++
 2 files changed, 40 insertions(+)

diff --git a/arch/arm64/include/asm/hw_breakpoint.h 
b/arch/arm64/include/asm/hw_breakpoint.h
index 4f4e58bee9bc..7a18c8520588 100644
--- a/arch/arm64/include/asm/hw_breakpoint.h
+++ b/arch/arm64/include/asm/hw_breakpoint.h
@@ -76,7 +76,11 @@ static inline void decode_ctrl_reg(u32 reg,
 /* Lengths */
 #define ARM_BREAKPOINT_LEN_1   0x1
 #define ARM_BREAKPOINT_LEN_2   0x3
+#define ARM_BREAKPOINT_LEN_3   0x7
 #define ARM_BREAKPOINT_LEN_4   0xf
+#define ARM_BREAKPOINT_LEN_5   0x1f
+#define ARM_BREAKPOINT_LEN_6   0x3f
+#define ARM_BREAKPOINT_LEN_7   0x7f
 #define ARM_BREAKPOINT_LEN_8   0xff
 
 /* Kernel stepping */
diff --git a/arch/arm64/kernel/hw_breakpoint.c 
b/arch/arm64/kernel/hw_breakpoint.c
index c57bc90b8286..4125c2152e85 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -317,9 +317,21 @@ static int get_hbp_len(u8 hbp_len)
case ARM_BREAKPOINT_LEN_2:
len_in_bytes = 2;
break;
+   case ARM_BREAKPOINT_LEN_3:
+   len_in_bytes = 3;
+   break;
case ARM_BREAKPOINT_LEN_4:
len_in_bytes = 4;
break;
+   case ARM_BREAKPOINT_LEN_5:
+   len_in_bytes = 5;
+   break;
+   case ARM_BREAKPOINT_LEN_6:
+   len_in_bytes = 6;
+   break;
+   case ARM_BREAKPOINT_LEN_7:
+   len_in_bytes = 7;
+   break;
case ARM_BREAKPOINT_LEN_8:
len_in_bytes = 8;
break;
@@ -379,9 +391,21 @@ int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl 
ctrl,
case ARM_BREAKPOINT_LEN_2:
*gen_len = HW_BREAKPOINT_LEN_2;
break;
+   case ARM_BREAKPOINT_LEN_3:
+   *gen_len = HW_BREAKPOINT_LEN_3;
+   break;
case ARM_BREAKPOINT_LEN_4:
*gen_len = HW_BREAKPOINT_LEN_4;
break;
+   case ARM_BREAKPOINT_LEN_5:
+   *gen_len = HW_BREAKPOINT_LEN_5;
+   break;
+   case ARM_BREAKPOINT_LEN_6:
+   *gen_len = HW_BREAKPOINT_LEN_6;
+   break;
+   case ARM_BREAKPOINT_LEN_7:
+   *gen_len = HW_BREAKPOINT_LEN_7;
+   break;
case ARM_BREAKPOINT_LEN_8:
*gen_len = HW_BREAKPOINT_LEN_8;
break;
@@ -425,9 +449,21 @@ static int arch_build_bp_info(struct perf_event *bp)
case HW_BREAKPOINT_LEN_2:
info->ctrl.len = ARM_BREAKPOINT_LEN_2;
break;
+   case HW_BREAKPOINT_LEN_3:
+   info->ctrl.len = ARM_BREAKPOINT_LEN_3;
+   break;
case HW_BREAKPOINT_LEN_4:
info->ctrl.len = ARM_BREAKPOINT_LEN_4;
break;
+   case HW_BREAKPOINT_LEN_5:
+   info->ctrl.len = ARM_BREAKPOINT_LEN_5;
+   break;
+   case HW_BREAKPOINT_LEN_6:
+   info->ctrl.len = ARM_BREAKPOINT_LEN_6;
+   break;
+   case HW_BREAKPOINT_LEN_7:
+   info->ctrl.len = ARM_BREAKPOINT_LEN_7;
+   break;
case HW_BREAKPOINT_LEN_8:
info->ctrl.len = ARM_BREAKPOINT_LEN_8;
break;
-- 
2.7.4



Re: [patch] perf_event_open.2: PERF_RECORD_SWITCH support

2016-10-19 Thread Michael Kerrisk (man-pages)
On 10/19/2016 05:14 PM, Vince Weaver wrote:
> On Wed, 19 Oct 2016, Michael Kerrisk (man-pages) wrote:
> 
>>> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
>>> index 68b99bb..04a0cf5 100644
>>> +.B PERF_RECORD_SWITCH_CPU_WIDE
>>> +records when sampling in cpu-wide mode.
>>> +This functionality is in addition to existing tracepoint and
>>> +software events for measuring context switches.
>>> +The advantage of this method is that it will give full
>>
>> s/give full/give a full/
>>
>> ok?
>>
>>> +information event with strict
>>> +.I perf_event_paranoid
>>> +settings.
> 
> What I meant to say was 
> 
> "it will give full information *even* with strict perf_event_paranoid 
> settings"
> 
> Maybe saying something like "despite strict settings" would be better 
> wording.
> 
> not sure how I missed that typo, apparently my fingers are used to typing 
> "event" too much.

Thanks, Vince. Fixed now.

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


Re: [patch] perf_event_open.2: PERF_RECORD_SWITCH support

2016-10-19 Thread Michael Kerrisk (man-pages)
On 10/19/2016 05:14 PM, Vince Weaver wrote:
> On Wed, 19 Oct 2016, Michael Kerrisk (man-pages) wrote:
> 
>>> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
>>> index 68b99bb..04a0cf5 100644
>>> +.B PERF_RECORD_SWITCH_CPU_WIDE
>>> +records when sampling in cpu-wide mode.
>>> +This functionality is in addition to existing tracepoint and
>>> +software events for measuring context switches.
>>> +The advantage of this method is that it will give full
>>
>> s/give full/give a full/
>>
>> ok?
>>
>>> +information event with strict
>>> +.I perf_event_paranoid
>>> +settings.
> 
> What I meant to say was 
> 
> "it will give full information *even* with strict perf_event_paranoid 
> settings"
> 
> Maybe saying something like "despite strict settings" would be better 
> wording.
> 
> not sure how I missed that typo, apparently my fingers are used to typing 
> "event" too much.

Thanks, Vince. Fixed now.

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


[PATCH V2 5/5] selftests: arm64: add test for unaligned/inexact watchpoint handling

2016-10-19 Thread Pratyush Anand
ARM64 hardware expects 64bit aligned address for watchpoint invocation.
However, it provides byte selection method to select any number of
consecutive byte set within the range of 1-8.

This patch adds support to test all such byte selection option for
different memory write sizes.

Patch also adds a test for handling the case when the cpu does not
report an address which exactly matches one of the regions we have
been watching (which is a situation permitted by the spec if an
instruction accesses both watched and unwatched regions). The test
was failing on a MSM8996pro before this patch series and is
passing now.

Signed-off-by: Pavel Labath 
Signed-off-by: Pratyush Anand 
---
 tools/testing/selftests/breakpoints/Makefile   |   5 +-
 .../selftests/breakpoints/breakpoint_test_arm64.c  | 236 +
 2 files changed, 240 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/breakpoints/breakpoint_test_arm64.c

diff --git a/tools/testing/selftests/breakpoints/Makefile 
b/tools/testing/selftests/breakpoints/Makefile
index 74e533fd4bc5..61b79e8df1f4 100644
--- a/tools/testing/selftests/breakpoints/Makefile
+++ b/tools/testing/selftests/breakpoints/Makefile
@@ -5,6 +5,9 @@ ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/x86/ -e 
s/x86_64/x86/)
 ifeq ($(ARCH),x86)
 TEST_PROGS := breakpoint_test
 endif
+ifeq ($(ARCH),aarch64)
+TEST_PROGS := breakpoint_test_arm64
+endif
 
 TEST_PROGS += step_after_suspend_test
 
@@ -13,4 +16,4 @@ all: $(TEST_PROGS)
 include ../lib.mk
 
 clean:
-   rm -fr breakpoint_test step_after_suspend_test
+   rm -fr breakpoint_test breakpoint_test_arm64 step_after_suspend_test
diff --git a/tools/testing/selftests/breakpoints/breakpoint_test_arm64.c 
b/tools/testing/selftests/breakpoints/breakpoint_test_arm64.c
new file mode 100644
index ..3897e996541e
--- /dev/null
+++ b/tools/testing/selftests/breakpoints/breakpoint_test_arm64.c
@@ -0,0 +1,236 @@
+/*
+ * Copyright (C) 2016 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Original Code by Pavel Labath 
+ *
+ * Code modified by Pratyush Anand 
+ * for testing different byte select for each access size.
+ *
+ */
+
+#define _GNU_SOURCE
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../kselftest.h"
+
+static volatile uint8_t var[96] __attribute__((__aligned__(32)));
+
+static void child(int size, int wr)
+{
+   volatile uint8_t *addr = [32 + wr];
+
+   if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0) {
+   perror("ptrace(PTRACE_TRACEME) failed");
+   _exit(1);
+   }
+
+   if (raise(SIGSTOP) != 0) {
+   perror("raise(SIGSTOP) failed");
+   _exit(1);
+   }
+
+   if ((uintptr_t) addr % size) {
+   perror("Wrong address write for the given size\n");
+   _exit(1);
+   }
+   switch (size) {
+   case 1:
+   *addr = 47;
+   break;
+   case 2:
+   *(uint16_t *)addr = 47;
+   break;
+   case 4:
+   *(uint32_t *)addr = 47;
+   break;
+   case 8:
+   *(uint64_t *)addr = 47;
+   break;
+   case 16:
+   __asm__ volatile ("stp x29, x30, %0" : "=m" (addr[0]));
+   break;
+   case 32:
+   __asm__ volatile ("stp q29, q30, %0" : "=m" (addr[0]));
+   break;
+   }
+
+   _exit(0);
+}
+
+static bool set_watchpoint(pid_t pid, int size, int wp)
+{
+   const volatile uint8_t *addr = [32 + wp];
+   const int offset = (uintptr_t)addr % 8;
+   const unsigned int byte_mask = ((1 << size) - 1) << offset;
+   const unsigned int type = 2; /* Write */
+   const unsigned int enable = 1;
+   const unsigned int control = byte_mask << 5 | type << 3 | enable;
+   struct user_hwdebug_state dreg_state;
+   struct iovec iov;
+
+   memset(_state, 0, sizeof(dreg_state));
+   dreg_state.dbg_regs[0].addr = (uintptr_t)(addr - offset);
+   dreg_state.dbg_regs[0].ctrl = control;
+   iov.iov_base = _state;
+   iov.iov_len = offsetof(struct user_hwdebug_state, dbg_regs) +
+   sizeof(dreg_state.dbg_regs[0]);
+   if (ptrace(PTRACE_SETREGSET, pid, NT_ARM_HW_WATCH, ) == 0)
+   return true;
+
+   if (errno == EIO) {
+   

[PATCH V2 2/5] arm64: Allow hw watchpoint at varied offset from base address

2016-10-19 Thread Pratyush Anand
ARM64 hardware supports watchpoint at any double word aligned address.
However, it can select any consecutive bytes from offset 0 to 7 from that
base address. For example, if base address is programmed as 0x420030 and
byte select is 0x1C, then access of 0x420032,0x420033 and 0x420034 will
generate a watchpoint exception.

Currently, we do not have such modularity. We can only program byte,
halfword, word and double word access exception from any base address.

This patch adds support to overcome above limitations.

Signed-off-by: Pratyush Anand 
---
 arch/arm64/include/asm/hw_breakpoint.h |  2 +-
 arch/arm64/kernel/hw_breakpoint.c  | 45 --
 arch/arm64/kernel/ptrace.c |  5 ++--
 3 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/arch/arm64/include/asm/hw_breakpoint.h 
b/arch/arm64/include/asm/hw_breakpoint.h
index 115ea2a64520..4f4e58bee9bc 100644
--- a/arch/arm64/include/asm/hw_breakpoint.h
+++ b/arch/arm64/include/asm/hw_breakpoint.h
@@ -118,7 +118,7 @@ struct perf_event;
 struct pmu;
 
 extern int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
- int *gen_len, int *gen_type);
+ int *gen_len, int *gen_type, int *offset);
 extern int arch_check_bp_in_kernelspace(struct perf_event *bp);
 extern int arch_validate_hwbkpt_settings(struct perf_event *bp);
 extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
diff --git a/arch/arm64/kernel/hw_breakpoint.c 
b/arch/arm64/kernel/hw_breakpoint.c
index 26a6bf77d272..3c2b96803eba 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -349,7 +349,7 @@ int arch_check_bp_in_kernelspace(struct perf_event *bp)
  * to generic breakpoint descriptions.
  */
 int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
-  int *gen_len, int *gen_type)
+  int *gen_len, int *gen_type, int *offset)
 {
/* Type */
switch (ctrl.type) {
@@ -369,8 +369,10 @@ int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl 
ctrl,
return -EINVAL;
}
 
+   *offset = ffs(ctrl.len) - 1;
+
/* Len */
-   switch (ctrl.len) {
+   switch (ctrl.len >> *offset) {
case ARM_BREAKPOINT_LEN_1:
*gen_len = HW_BREAKPOINT_LEN_1;
break;
@@ -517,18 +519,17 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
default:
return -EINVAL;
}
-
-   info->address &= ~alignment_mask;
-   info->ctrl.len <<= offset;
} else {
if (info->ctrl.type == ARM_BREAKPOINT_EXECUTE)
alignment_mask = 0x3;
else
alignment_mask = 0x7;
-   if (info->address & alignment_mask)
-   return -EINVAL;
+   offset = info->address & alignment_mask;
}
 
+   info->address &= ~alignment_mask;
+   info->ctrl.len <<= offset;
+
/*
 * Disallow per-task kernel breakpoints since these would
 * complicate the stepping code.
@@ -665,8 +666,8 @@ static int watchpoint_handler(unsigned long addr, unsigned 
int esr,
  struct pt_regs *regs)
 {
int i, step = 0, *kernel_step, access;
-   u32 ctrl_reg;
-   u64 val, alignment_mask;
+   u32 ctrl_reg, lens, lene;
+   u64 val;
struct perf_event *wp, **slots;
struct debug_info *debug_info;
struct arch_hw_breakpoint *info;
@@ -684,25 +685,21 @@ static int watchpoint_handler(unsigned long addr, 
unsigned int esr,
goto unlock;
 
info = counter_arch_bp(wp);
-   /* AArch32 watchpoints are either 4 or 8 bytes aligned. */
-   if (is_compat_task()) {
-   if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
-   alignment_mask = 0x7;
-   else
-   alignment_mask = 0x3;
-   } else {
-   alignment_mask = 0x7;
-   }
 
-   /* Check if the watchpoint value matches. */
+   /* Check if the watchpoint value and byte select match. */
val = read_wb_reg(AARCH64_DBG_REG_WVR, i);
-   if (val != (addr & ~alignment_mask))
-   goto unlock;
-
-   /* Possible match, check the byte address select to confirm. */
ctrl_reg = read_wb_reg(AARCH64_DBG_REG_WCR, i);
decode_ctrl_reg(ctrl_reg, );
-   if (!((1 << (addr & alignment_mask)) & ctrl.len))
+   lens = ffs(ctrl.len) - 1;
+   lene = fls(ctrl.len) - 1;
+   /*
+* FIXME: reported address can be anywhere between "the
+* lowest address accessed 

[PATCH V2 5/5] selftests: arm64: add test for unaligned/inexact watchpoint handling

2016-10-19 Thread Pratyush Anand
ARM64 hardware expects 64bit aligned address for watchpoint invocation.
However, it provides byte selection method to select any number of
consecutive byte set within the range of 1-8.

This patch adds support to test all such byte selection option for
different memory write sizes.

Patch also adds a test for handling the case when the cpu does not
report an address which exactly matches one of the regions we have
been watching (which is a situation permitted by the spec if an
instruction accesses both watched and unwatched regions). The test
was failing on a MSM8996pro before this patch series and is
passing now.

Signed-off-by: Pavel Labath 
Signed-off-by: Pratyush Anand 
---
 tools/testing/selftests/breakpoints/Makefile   |   5 +-
 .../selftests/breakpoints/breakpoint_test_arm64.c  | 236 +
 2 files changed, 240 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/breakpoints/breakpoint_test_arm64.c

diff --git a/tools/testing/selftests/breakpoints/Makefile 
b/tools/testing/selftests/breakpoints/Makefile
index 74e533fd4bc5..61b79e8df1f4 100644
--- a/tools/testing/selftests/breakpoints/Makefile
+++ b/tools/testing/selftests/breakpoints/Makefile
@@ -5,6 +5,9 @@ ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/x86/ -e 
s/x86_64/x86/)
 ifeq ($(ARCH),x86)
 TEST_PROGS := breakpoint_test
 endif
+ifeq ($(ARCH),aarch64)
+TEST_PROGS := breakpoint_test_arm64
+endif
 
 TEST_PROGS += step_after_suspend_test
 
@@ -13,4 +16,4 @@ all: $(TEST_PROGS)
 include ../lib.mk
 
 clean:
-   rm -fr breakpoint_test step_after_suspend_test
+   rm -fr breakpoint_test breakpoint_test_arm64 step_after_suspend_test
diff --git a/tools/testing/selftests/breakpoints/breakpoint_test_arm64.c 
b/tools/testing/selftests/breakpoints/breakpoint_test_arm64.c
new file mode 100644
index ..3897e996541e
--- /dev/null
+++ b/tools/testing/selftests/breakpoints/breakpoint_test_arm64.c
@@ -0,0 +1,236 @@
+/*
+ * Copyright (C) 2016 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Original Code by Pavel Labath 
+ *
+ * Code modified by Pratyush Anand 
+ * for testing different byte select for each access size.
+ *
+ */
+
+#define _GNU_SOURCE
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../kselftest.h"
+
+static volatile uint8_t var[96] __attribute__((__aligned__(32)));
+
+static void child(int size, int wr)
+{
+   volatile uint8_t *addr = [32 + wr];
+
+   if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0) {
+   perror("ptrace(PTRACE_TRACEME) failed");
+   _exit(1);
+   }
+
+   if (raise(SIGSTOP) != 0) {
+   perror("raise(SIGSTOP) failed");
+   _exit(1);
+   }
+
+   if ((uintptr_t) addr % size) {
+   perror("Wrong address write for the given size\n");
+   _exit(1);
+   }
+   switch (size) {
+   case 1:
+   *addr = 47;
+   break;
+   case 2:
+   *(uint16_t *)addr = 47;
+   break;
+   case 4:
+   *(uint32_t *)addr = 47;
+   break;
+   case 8:
+   *(uint64_t *)addr = 47;
+   break;
+   case 16:
+   __asm__ volatile ("stp x29, x30, %0" : "=m" (addr[0]));
+   break;
+   case 32:
+   __asm__ volatile ("stp q29, q30, %0" : "=m" (addr[0]));
+   break;
+   }
+
+   _exit(0);
+}
+
+static bool set_watchpoint(pid_t pid, int size, int wp)
+{
+   const volatile uint8_t *addr = [32 + wp];
+   const int offset = (uintptr_t)addr % 8;
+   const unsigned int byte_mask = ((1 << size) - 1) << offset;
+   const unsigned int type = 2; /* Write */
+   const unsigned int enable = 1;
+   const unsigned int control = byte_mask << 5 | type << 3 | enable;
+   struct user_hwdebug_state dreg_state;
+   struct iovec iov;
+
+   memset(_state, 0, sizeof(dreg_state));
+   dreg_state.dbg_regs[0].addr = (uintptr_t)(addr - offset);
+   dreg_state.dbg_regs[0].ctrl = control;
+   iov.iov_base = _state;
+   iov.iov_len = offsetof(struct user_hwdebug_state, dbg_regs) +
+   sizeof(dreg_state.dbg_regs[0]);
+   if (ptrace(PTRACE_SETREGSET, pid, NT_ARM_HW_WATCH, ) == 0)
+   return true;
+
+   if (errno == EIO) {
+   printf("ptrace(PTRACE_SETREGSET, NT_ARM_HW_WATCH) "
+   "not supported 

[PATCH V2 2/5] arm64: Allow hw watchpoint at varied offset from base address

2016-10-19 Thread Pratyush Anand
ARM64 hardware supports watchpoint at any double word aligned address.
However, it can select any consecutive bytes from offset 0 to 7 from that
base address. For example, if base address is programmed as 0x420030 and
byte select is 0x1C, then access of 0x420032,0x420033 and 0x420034 will
generate a watchpoint exception.

Currently, we do not have such modularity. We can only program byte,
halfword, word and double word access exception from any base address.

This patch adds support to overcome above limitations.

Signed-off-by: Pratyush Anand 
---
 arch/arm64/include/asm/hw_breakpoint.h |  2 +-
 arch/arm64/kernel/hw_breakpoint.c  | 45 --
 arch/arm64/kernel/ptrace.c |  5 ++--
 3 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/arch/arm64/include/asm/hw_breakpoint.h 
b/arch/arm64/include/asm/hw_breakpoint.h
index 115ea2a64520..4f4e58bee9bc 100644
--- a/arch/arm64/include/asm/hw_breakpoint.h
+++ b/arch/arm64/include/asm/hw_breakpoint.h
@@ -118,7 +118,7 @@ struct perf_event;
 struct pmu;
 
 extern int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
- int *gen_len, int *gen_type);
+ int *gen_len, int *gen_type, int *offset);
 extern int arch_check_bp_in_kernelspace(struct perf_event *bp);
 extern int arch_validate_hwbkpt_settings(struct perf_event *bp);
 extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
diff --git a/arch/arm64/kernel/hw_breakpoint.c 
b/arch/arm64/kernel/hw_breakpoint.c
index 26a6bf77d272..3c2b96803eba 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -349,7 +349,7 @@ int arch_check_bp_in_kernelspace(struct perf_event *bp)
  * to generic breakpoint descriptions.
  */
 int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
-  int *gen_len, int *gen_type)
+  int *gen_len, int *gen_type, int *offset)
 {
/* Type */
switch (ctrl.type) {
@@ -369,8 +369,10 @@ int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl 
ctrl,
return -EINVAL;
}
 
+   *offset = ffs(ctrl.len) - 1;
+
/* Len */
-   switch (ctrl.len) {
+   switch (ctrl.len >> *offset) {
case ARM_BREAKPOINT_LEN_1:
*gen_len = HW_BREAKPOINT_LEN_1;
break;
@@ -517,18 +519,17 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
default:
return -EINVAL;
}
-
-   info->address &= ~alignment_mask;
-   info->ctrl.len <<= offset;
} else {
if (info->ctrl.type == ARM_BREAKPOINT_EXECUTE)
alignment_mask = 0x3;
else
alignment_mask = 0x7;
-   if (info->address & alignment_mask)
-   return -EINVAL;
+   offset = info->address & alignment_mask;
}
 
+   info->address &= ~alignment_mask;
+   info->ctrl.len <<= offset;
+
/*
 * Disallow per-task kernel breakpoints since these would
 * complicate the stepping code.
@@ -665,8 +666,8 @@ static int watchpoint_handler(unsigned long addr, unsigned 
int esr,
  struct pt_regs *regs)
 {
int i, step = 0, *kernel_step, access;
-   u32 ctrl_reg;
-   u64 val, alignment_mask;
+   u32 ctrl_reg, lens, lene;
+   u64 val;
struct perf_event *wp, **slots;
struct debug_info *debug_info;
struct arch_hw_breakpoint *info;
@@ -684,25 +685,21 @@ static int watchpoint_handler(unsigned long addr, 
unsigned int esr,
goto unlock;
 
info = counter_arch_bp(wp);
-   /* AArch32 watchpoints are either 4 or 8 bytes aligned. */
-   if (is_compat_task()) {
-   if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
-   alignment_mask = 0x7;
-   else
-   alignment_mask = 0x3;
-   } else {
-   alignment_mask = 0x7;
-   }
 
-   /* Check if the watchpoint value matches. */
+   /* Check if the watchpoint value and byte select match. */
val = read_wb_reg(AARCH64_DBG_REG_WVR, i);
-   if (val != (addr & ~alignment_mask))
-   goto unlock;
-
-   /* Possible match, check the byte address select to confirm. */
ctrl_reg = read_wb_reg(AARCH64_DBG_REG_WCR, i);
decode_ctrl_reg(ctrl_reg, );
-   if (!((1 << (addr & alignment_mask)) & ctrl.len))
+   lens = ffs(ctrl.len) - 1;
+   lene = fls(ctrl.len) - 1;
+   /*
+* FIXME: reported address can be anywhere between "the
+* lowest address accessed by the memory 

[PATCH V2 1/5] hw_breakpoint: Allow watchpoint of length 3,5,6 and 7

2016-10-19 Thread Pratyush Anand
We only support breakpoint/watchpoint of length 1, 2, 4 and 8. If we can
support other length as well, then user may watch more data with less
number of watchpoints (provided hardware supports it). For example: if we
have to watch only 4th, 5th and 6th byte from a 64 bit aligned address, we
will have to use two slots to implement it currently. One slot will watch a
half word at offset 4 and other a byte at offset 6. If we can have a
watchpoint of length 3 then we can watch it with single slot as well.

ARM64 hardware does support such functionality, therefore adding these new
definitions in generic layer.

Signed-off-by: Pratyush Anand 
---
 include/uapi/linux/hw_breakpoint.h   | 4 
 tools/include/uapi/linux/hw_breakpoint.h | 4 
 2 files changed, 8 insertions(+)

diff --git a/include/uapi/linux/hw_breakpoint.h 
b/include/uapi/linux/hw_breakpoint.h
index b04000a2296a..2b65efd19a46 100644
--- a/include/uapi/linux/hw_breakpoint.h
+++ b/include/uapi/linux/hw_breakpoint.h
@@ -4,7 +4,11 @@
 enum {
HW_BREAKPOINT_LEN_1 = 1,
HW_BREAKPOINT_LEN_2 = 2,
+   HW_BREAKPOINT_LEN_3 = 3,
HW_BREAKPOINT_LEN_4 = 4,
+   HW_BREAKPOINT_LEN_5 = 5,
+   HW_BREAKPOINT_LEN_6 = 6,
+   HW_BREAKPOINT_LEN_7 = 7,
HW_BREAKPOINT_LEN_8 = 8,
 };
 
diff --git a/tools/include/uapi/linux/hw_breakpoint.h 
b/tools/include/uapi/linux/hw_breakpoint.h
index b04000a2296a..2b65efd19a46 100644
--- a/tools/include/uapi/linux/hw_breakpoint.h
+++ b/tools/include/uapi/linux/hw_breakpoint.h
@@ -4,7 +4,11 @@
 enum {
HW_BREAKPOINT_LEN_1 = 1,
HW_BREAKPOINT_LEN_2 = 2,
+   HW_BREAKPOINT_LEN_3 = 3,
HW_BREAKPOINT_LEN_4 = 4,
+   HW_BREAKPOINT_LEN_5 = 5,
+   HW_BREAKPOINT_LEN_6 = 6,
+   HW_BREAKPOINT_LEN_7 = 7,
HW_BREAKPOINT_LEN_8 = 8,
 };
 
-- 
2.7.4



[PATCH V2 0/5] ARM64: More flexible HW watchpoint

2016-10-19 Thread Pratyush Anand
Currently, we do not support all the byte select option provided by ARM64
specs for a HW watchpoint.

This patch set will help user to instrument a watchpoint with all possible
byte select options.

Changes since v1:
Introduced a new patch 3/5 where it takes care of the situation when HW
does not report a watchpoint hit with the address that matches one of the
watchpoints set.
Added corresponding test case to test that functionality.

Pavel Labath (1):
  arm64: hw_breakpoint: Handle inexact watchpoint addresses

Pratyush Anand (4):
  hw_breakpoint: Allow watchpoint of length 3,5,6 and 7
  arm64: Allow hw watchpoint at varied offset from base address
  arm64: Allow hw watchpoint of length 3,5,6 and 7
  selftests: arm64: add test for unaligned/inexact watchpoint handling

 arch/arm64/include/asm/hw_breakpoint.h |   6 +-
 arch/arm64/kernel/hw_breakpoint.c  | 149 +
 arch/arm64/kernel/ptrace.c |   5 +-
 include/uapi/linux/hw_breakpoint.h |   4 +
 tools/include/uapi/linux/hw_breakpoint.h   |   4 +
 tools/testing/selftests/breakpoints/Makefile   |   5 +-
 .../selftests/breakpoints/breakpoint_test_arm64.c  | 236 +
 7 files changed, 366 insertions(+), 43 deletions(-)
 create mode 100644 tools/testing/selftests/breakpoints/breakpoint_test_arm64.c

-- 
2.7.4



[PATCH V2 1/5] hw_breakpoint: Allow watchpoint of length 3,5,6 and 7

2016-10-19 Thread Pratyush Anand
We only support breakpoint/watchpoint of length 1, 2, 4 and 8. If we can
support other length as well, then user may watch more data with less
number of watchpoints (provided hardware supports it). For example: if we
have to watch only 4th, 5th and 6th byte from a 64 bit aligned address, we
will have to use two slots to implement it currently. One slot will watch a
half word at offset 4 and other a byte at offset 6. If we can have a
watchpoint of length 3 then we can watch it with single slot as well.

ARM64 hardware does support such functionality, therefore adding these new
definitions in generic layer.

Signed-off-by: Pratyush Anand 
---
 include/uapi/linux/hw_breakpoint.h   | 4 
 tools/include/uapi/linux/hw_breakpoint.h | 4 
 2 files changed, 8 insertions(+)

diff --git a/include/uapi/linux/hw_breakpoint.h 
b/include/uapi/linux/hw_breakpoint.h
index b04000a2296a..2b65efd19a46 100644
--- a/include/uapi/linux/hw_breakpoint.h
+++ b/include/uapi/linux/hw_breakpoint.h
@@ -4,7 +4,11 @@
 enum {
HW_BREAKPOINT_LEN_1 = 1,
HW_BREAKPOINT_LEN_2 = 2,
+   HW_BREAKPOINT_LEN_3 = 3,
HW_BREAKPOINT_LEN_4 = 4,
+   HW_BREAKPOINT_LEN_5 = 5,
+   HW_BREAKPOINT_LEN_6 = 6,
+   HW_BREAKPOINT_LEN_7 = 7,
HW_BREAKPOINT_LEN_8 = 8,
 };
 
diff --git a/tools/include/uapi/linux/hw_breakpoint.h 
b/tools/include/uapi/linux/hw_breakpoint.h
index b04000a2296a..2b65efd19a46 100644
--- a/tools/include/uapi/linux/hw_breakpoint.h
+++ b/tools/include/uapi/linux/hw_breakpoint.h
@@ -4,7 +4,11 @@
 enum {
HW_BREAKPOINT_LEN_1 = 1,
HW_BREAKPOINT_LEN_2 = 2,
+   HW_BREAKPOINT_LEN_3 = 3,
HW_BREAKPOINT_LEN_4 = 4,
+   HW_BREAKPOINT_LEN_5 = 5,
+   HW_BREAKPOINT_LEN_6 = 6,
+   HW_BREAKPOINT_LEN_7 = 7,
HW_BREAKPOINT_LEN_8 = 8,
 };
 
-- 
2.7.4



[PATCH V2 0/5] ARM64: More flexible HW watchpoint

2016-10-19 Thread Pratyush Anand
Currently, we do not support all the byte select option provided by ARM64
specs for a HW watchpoint.

This patch set will help user to instrument a watchpoint with all possible
byte select options.

Changes since v1:
Introduced a new patch 3/5 where it takes care of the situation when HW
does not report a watchpoint hit with the address that matches one of the
watchpoints set.
Added corresponding test case to test that functionality.

Pavel Labath (1):
  arm64: hw_breakpoint: Handle inexact watchpoint addresses

Pratyush Anand (4):
  hw_breakpoint: Allow watchpoint of length 3,5,6 and 7
  arm64: Allow hw watchpoint at varied offset from base address
  arm64: Allow hw watchpoint of length 3,5,6 and 7
  selftests: arm64: add test for unaligned/inexact watchpoint handling

 arch/arm64/include/asm/hw_breakpoint.h |   6 +-
 arch/arm64/kernel/hw_breakpoint.c  | 149 +
 arch/arm64/kernel/ptrace.c |   5 +-
 include/uapi/linux/hw_breakpoint.h |   4 +
 tools/include/uapi/linux/hw_breakpoint.h   |   4 +
 tools/testing/selftests/breakpoints/Makefile   |   5 +-
 .../selftests/breakpoints/breakpoint_test_arm64.c  | 236 +
 7 files changed, 366 insertions(+), 43 deletions(-)
 create mode 100644 tools/testing/selftests/breakpoints/breakpoint_test_arm64.c

-- 
2.7.4



[PATCH V2 3/5] arm64: hw_breakpoint: Handle inexact watchpoint addresses

2016-10-19 Thread Pratyush Anand
From: Pavel Labath 

Arm64 hardware does not always report a watchpoint hit address that
matches one of the watchpoints set. It can also report an address
"near" the watchpoint if a single instruction access both watched and
unwatched addresses. There is no straight-forward way, short of
disassembling the offending instruction, to map that address back to
the watchpoint.

Previously, when the hardware reported a watchpoint hit on an address
that did not match our watchpoint (this happens in case of instructions
which access large chunks of memory such as "stp") the process would
enter a loop where we would be continually resuming it (because we did
not recognise that watchpoint hit) and it would keep hitting the
watchpoint again and again. The tracing process would never get
notified of the watchpoint hit.

This commit fixes the problem by looking at the watchpoints near the
address reported by the hardware. If the address does not exactly match
one of the watchpoints we have set, it attributes the hit to the
nearest watchpoint we have.  This heuristic is a bit dodgy, but I don't
think we can do much more, given the hardware limitations.

[panand: reworked to rebase on his patches]

Signed-off-by: Pavel Labath 
Signed-off-by: Pratyush Anand 
---
 arch/arm64/kernel/hw_breakpoint.c | 94 +++
 1 file changed, 66 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/kernel/hw_breakpoint.c 
b/arch/arm64/kernel/hw_breakpoint.c
index 3c2b96803eba..c57bc90b8286 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -662,11 +662,46 @@ unlock:
 }
 NOKPROBE_SYMBOL(breakpoint_handler);
 
+/*
+ * Arm64 hardware does not always report a watchpoint hit address that matches
+ * one of the watchpoints set. It can also report an address "near" the
+ * watchpoint if a single instruction access both watched and unwatched
+ * addresses. There is no straight-forward way, short of disassembling the
+ * offending instruction, to map that address back to the watchpoint. This
+ * function computes the distance of the memory access from the watchpoint as a
+ * heuristic for the likelyhood that a given access triggered the watchpoint.
+ *
+ * See Section D2.10.5 "Determining the memory location that caused a 
Watchpoint
+ * exception" of ARMv8 Architecture Reference Manual for details.
+ *
+ * The function returns the distance of the address from the bytes watched by
+ * the watchpoint. In case of an exact match, it returns 0.
+ */
+static u64 get_distance_from_watchpoint(unsigned long addr, u64 val,
+   struct arch_hw_breakpoint_ctrl *ctrl)
+{
+   u64 wp_low, wp_high;
+   u32 lens, lene;
+
+   lens = ffs(ctrl->len) - 1;
+   lene = fls(ctrl->len) - 1;
+
+   wp_low = val + lens;
+   wp_high = val + lene;
+   if (addr < wp_low)
+   return wp_low - addr;
+   else if (addr > wp_high)
+   return addr - wp_high;
+   else
+   return 0;
+}
+
 static int watchpoint_handler(unsigned long addr, unsigned int esr,
  struct pt_regs *regs)
 {
-   int i, step = 0, *kernel_step, access;
-   u32 ctrl_reg, lens, lene;
+   int i, step = 0, *kernel_step, access, closest_match = 0;
+   u64 min_dist = -1, dist;
+   u32 ctrl_reg;
u64 val;
struct perf_event *wp, **slots;
struct debug_info *debug_info;
@@ -676,31 +711,15 @@ static int watchpoint_handler(unsigned long addr, 
unsigned int esr,
slots = this_cpu_ptr(wp_on_reg);
debug_info = >thread.debug;
 
+   /*
+* Find all watchpoints that match the reported address. If no exact
+* match is found. Attribute the hit to the closest watchpoint.
+*/
+   rcu_read_lock();
for (i = 0; i < core_num_wrps; ++i) {
-   rcu_read_lock();
-
wp = slots[i];
-
if (wp == NULL)
-   goto unlock;
-
-   info = counter_arch_bp(wp);
-
-   /* Check if the watchpoint value and byte select match. */
-   val = read_wb_reg(AARCH64_DBG_REG_WVR, i);
-   ctrl_reg = read_wb_reg(AARCH64_DBG_REG_WCR, i);
-   decode_ctrl_reg(ctrl_reg, );
-   lens = ffs(ctrl.len) - 1;
-   lene = fls(ctrl.len) - 1;
-   /*
-* FIXME: reported address can be anywhere between "the
-* lowest address accessed by the memory access that
-* triggered the watchpoint" and "the highest watchpointed
-* address accessed by the memory access". So, it may not
-* lie in the interval of watchpoint address range.
-*/
-   if (addr < val + lens || addr > val + lene)
-   goto unlock;
+   continue;
 

[PATCH V2 3/5] arm64: hw_breakpoint: Handle inexact watchpoint addresses

2016-10-19 Thread Pratyush Anand
From: Pavel Labath 

Arm64 hardware does not always report a watchpoint hit address that
matches one of the watchpoints set. It can also report an address
"near" the watchpoint if a single instruction access both watched and
unwatched addresses. There is no straight-forward way, short of
disassembling the offending instruction, to map that address back to
the watchpoint.

Previously, when the hardware reported a watchpoint hit on an address
that did not match our watchpoint (this happens in case of instructions
which access large chunks of memory such as "stp") the process would
enter a loop where we would be continually resuming it (because we did
not recognise that watchpoint hit) and it would keep hitting the
watchpoint again and again. The tracing process would never get
notified of the watchpoint hit.

This commit fixes the problem by looking at the watchpoints near the
address reported by the hardware. If the address does not exactly match
one of the watchpoints we have set, it attributes the hit to the
nearest watchpoint we have.  This heuristic is a bit dodgy, but I don't
think we can do much more, given the hardware limitations.

[panand: reworked to rebase on his patches]

Signed-off-by: Pavel Labath 
Signed-off-by: Pratyush Anand 
---
 arch/arm64/kernel/hw_breakpoint.c | 94 +++
 1 file changed, 66 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/kernel/hw_breakpoint.c 
b/arch/arm64/kernel/hw_breakpoint.c
index 3c2b96803eba..c57bc90b8286 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -662,11 +662,46 @@ unlock:
 }
 NOKPROBE_SYMBOL(breakpoint_handler);
 
+/*
+ * Arm64 hardware does not always report a watchpoint hit address that matches
+ * one of the watchpoints set. It can also report an address "near" the
+ * watchpoint if a single instruction access both watched and unwatched
+ * addresses. There is no straight-forward way, short of disassembling the
+ * offending instruction, to map that address back to the watchpoint. This
+ * function computes the distance of the memory access from the watchpoint as a
+ * heuristic for the likelyhood that a given access triggered the watchpoint.
+ *
+ * See Section D2.10.5 "Determining the memory location that caused a 
Watchpoint
+ * exception" of ARMv8 Architecture Reference Manual for details.
+ *
+ * The function returns the distance of the address from the bytes watched by
+ * the watchpoint. In case of an exact match, it returns 0.
+ */
+static u64 get_distance_from_watchpoint(unsigned long addr, u64 val,
+   struct arch_hw_breakpoint_ctrl *ctrl)
+{
+   u64 wp_low, wp_high;
+   u32 lens, lene;
+
+   lens = ffs(ctrl->len) - 1;
+   lene = fls(ctrl->len) - 1;
+
+   wp_low = val + lens;
+   wp_high = val + lene;
+   if (addr < wp_low)
+   return wp_low - addr;
+   else if (addr > wp_high)
+   return addr - wp_high;
+   else
+   return 0;
+}
+
 static int watchpoint_handler(unsigned long addr, unsigned int esr,
  struct pt_regs *regs)
 {
-   int i, step = 0, *kernel_step, access;
-   u32 ctrl_reg, lens, lene;
+   int i, step = 0, *kernel_step, access, closest_match = 0;
+   u64 min_dist = -1, dist;
+   u32 ctrl_reg;
u64 val;
struct perf_event *wp, **slots;
struct debug_info *debug_info;
@@ -676,31 +711,15 @@ static int watchpoint_handler(unsigned long addr, 
unsigned int esr,
slots = this_cpu_ptr(wp_on_reg);
debug_info = >thread.debug;
 
+   /*
+* Find all watchpoints that match the reported address. If no exact
+* match is found. Attribute the hit to the closest watchpoint.
+*/
+   rcu_read_lock();
for (i = 0; i < core_num_wrps; ++i) {
-   rcu_read_lock();
-
wp = slots[i];
-
if (wp == NULL)
-   goto unlock;
-
-   info = counter_arch_bp(wp);
-
-   /* Check if the watchpoint value and byte select match. */
-   val = read_wb_reg(AARCH64_DBG_REG_WVR, i);
-   ctrl_reg = read_wb_reg(AARCH64_DBG_REG_WCR, i);
-   decode_ctrl_reg(ctrl_reg, );
-   lens = ffs(ctrl.len) - 1;
-   lene = fls(ctrl.len) - 1;
-   /*
-* FIXME: reported address can be anywhere between "the
-* lowest address accessed by the memory access that
-* triggered the watchpoint" and "the highest watchpointed
-* address accessed by the memory access". So, it may not
-* lie in the interval of watchpoint address range.
-*/
-   if (addr < val + lens || addr > val + lene)
-   goto unlock;
+   continue;
 
/*
 * Check that the access type matches.
@@ 

Re: [RESEND PATCH v2 4/4] usb: doc: add document for USB3 debug port usage

2016-10-19 Thread Lu Baolu
Hi Jani,

On 10/19/2016 03:48 PM, Jani Nikula wrote:
> On Wed, 19 Oct 2016, Lu Baolu  wrote:
>> Add Documentation/usb/usb3-debug-port.txt. This document includes
>> the user guide for USB3 debug port.
> If you're adding completely new files, please at least consider writing
> them in reStructuredText, so we can easily bolt them to the Sphinx
> build. Just a few tweaks would be required, comments inline below.

Thanks for your comments. I will refactor my document according
to your comments.

By the way, are there any tools that I can use to check the document
format?

Best regards,
Lu Baolu

>
> BR,
> Jani.
>
>> Cc: linux-...@vger.kernel.org
>> Signed-off-by: Lu Baolu 
>> ---
>>  Documentation/usb/usb3-debug-port.txt | 87 
>> +++
>>  1 file changed, 87 insertions(+)
>>  create mode 100644 Documentation/usb/usb3-debug-port.txt
>>
>> diff --git a/Documentation/usb/usb3-debug-port.txt 
>> b/Documentation/usb/usb3-debug-port.txt
>> new file mode 100644
>> index 000..df5ce27
>> --- /dev/null
>> +++ b/Documentation/usb/usb3-debug-port.txt
>> @@ -0,0 +1,87 @@
>> +  USB3 debug port
> Make that a title with
>
> ===
> USB3 debug port
> ===
>
>> +
>> + Lu Baolu 
> :Author: Lu Baolu 
>
> Although git blame will give a more accurate idea after the file's been
> edited by others.
>
>> +
>> +  Last-updated: October 2016
> :Date: October 2016
>
> Again, this is what git does.
>
>> +
>> +GENERAL
>> +===
>> +
>> +This is a HOWTO for using USB3 debug port on x86 systems.
>> +
>> +Before using any kernel debugging functionalities based on USB3
>> +debug port, you need to check 1) whether debug port is supported
>> +by the xHCI host, 2) which port is used for debugging purpose
>> +(normally the first USB3 root port). You must have a USB 3.0
>> +super-speed A-to-A debugging cable to connect the debug target
>> +with a debug host. In this document, a debug target stands for
>> +the system under debugging; while, a debug host stands for a
>> +stand-alone system that is able to talk to the debugging target
>> +through the USB3 debug port.
>> +
>> +EARLY PRINTK
>> +
>> +
>> +On debug target system, you need to customize a debugging kernel
>> +with CONFIG_EARLY_PRINTK_XDBC enabled. And add below kernel boot
>> +parameter.
> Add :: at the end of previous line to make the below indented block
> preformatted text. Ditto for the others.
>
>> +
>> +"earlyprintk=xdbc"
>> +
>> +If there are multiple xHCI controllers in the system, you can
>> +append a host contoller index to this kernel parameter. This
>> +index is started from 0.
>> +
>> +If you are going to leverage the keep option defined by the
>> +early printk framework to keep the boot console alive after
>> +early boot, you'd better add below kernel boot parameter.
>> +
>> +"usbcore.autosuspend=-1"
>> +
>> +On debug host side, you don't need to customize the kernel, but
>> +you need to disable usb subsystem runtime power management by
>> +adding below kernel boot parameter.
>> +
>> +"usbcore.autosuspend=-1"
>> +
>> +Before starting the debug target, you should connect the debug
>> +port on debug target with a root port or port of any external hub
>> +on the debug host. The cable used to connect these two ports
>> +should be a USB 3.0 super-speed A-to-A debugging cable.
>> +
>> +During early boot of debug target, DbC (the debug engine for USB3
>> +debug port) hardware gets initialized. Debug host should be able
>> +to enumerate the debug target as a debug device. Debug host will
>> +then bind the debug device with the usb_debug driver module and
>> +create the /dev/ttyUSB0 device.
>> +
>> +If device enumeration goes smoothly, you should be able to see
>> +below kernel messages on debug host.
> Again, add :: and indent the below lines by some spaces.
>
>> +
>> +# tail -f /var/log/kern.log
>> +
>> +[ 1815.983374] usb 4-3: new SuperSpeed USB device number 4 using xhci_hcd
>> +[ 1815.999595] usb 4-3: LPM exit latency is zeroed, disabling LPM.
>> +[ 1815.999899] usb 4-3: New USB device found, idVendor=1d6b, idProduct=0004
>> +[ 1815.02] usb 4-3: New USB device strings: Mfr=1, Product=2, 
>> SerialNumber=3
>> +[ 1815.03] usb 4-3: Product: Remote GDB
>> +[ 1815.04] usb 4-3: Manufacturer: Linux
>> +[ 1815.05] usb 4-3: SerialNumber: 0001
>> +[ 1816.000240] usb_debug 4-3:1.0: xhci_dbc converter detected
>> +[ 1816.000360] usb 4-3: xhci_dbc converter now attached to ttyUSB0
>> +
>> +You can run below bash scripts on debug host to read the kernel
>> +log sent from debug target.
> Same here. Alternatively, if you do
>
> .. code-block:: sh
>
> and indent the block, you'll get syntax highlighting in the output.
>
>> +
>> += start of bash scripts =
>> +#!/bin/bash
>> +
>> +while true ; do
>> +while [ ! -d /sys/class/tty/ttyUSB0 ] ; do
>> + 

Re: [RESEND PATCH v2 4/4] usb: doc: add document for USB3 debug port usage

2016-10-19 Thread Lu Baolu
Hi Jani,

On 10/19/2016 03:48 PM, Jani Nikula wrote:
> On Wed, 19 Oct 2016, Lu Baolu  wrote:
>> Add Documentation/usb/usb3-debug-port.txt. This document includes
>> the user guide for USB3 debug port.
> If you're adding completely new files, please at least consider writing
> them in reStructuredText, so we can easily bolt them to the Sphinx
> build. Just a few tweaks would be required, comments inline below.

Thanks for your comments. I will refactor my document according
to your comments.

By the way, are there any tools that I can use to check the document
format?

Best regards,
Lu Baolu

>
> BR,
> Jani.
>
>> Cc: linux-...@vger.kernel.org
>> Signed-off-by: Lu Baolu 
>> ---
>>  Documentation/usb/usb3-debug-port.txt | 87 
>> +++
>>  1 file changed, 87 insertions(+)
>>  create mode 100644 Documentation/usb/usb3-debug-port.txt
>>
>> diff --git a/Documentation/usb/usb3-debug-port.txt 
>> b/Documentation/usb/usb3-debug-port.txt
>> new file mode 100644
>> index 000..df5ce27
>> --- /dev/null
>> +++ b/Documentation/usb/usb3-debug-port.txt
>> @@ -0,0 +1,87 @@
>> +  USB3 debug port
> Make that a title with
>
> ===
> USB3 debug port
> ===
>
>> +
>> + Lu Baolu 
> :Author: Lu Baolu 
>
> Although git blame will give a more accurate idea after the file's been
> edited by others.
>
>> +
>> +  Last-updated: October 2016
> :Date: October 2016
>
> Again, this is what git does.
>
>> +
>> +GENERAL
>> +===
>> +
>> +This is a HOWTO for using USB3 debug port on x86 systems.
>> +
>> +Before using any kernel debugging functionalities based on USB3
>> +debug port, you need to check 1) whether debug port is supported
>> +by the xHCI host, 2) which port is used for debugging purpose
>> +(normally the first USB3 root port). You must have a USB 3.0
>> +super-speed A-to-A debugging cable to connect the debug target
>> +with a debug host. In this document, a debug target stands for
>> +the system under debugging; while, a debug host stands for a
>> +stand-alone system that is able to talk to the debugging target
>> +through the USB3 debug port.
>> +
>> +EARLY PRINTK
>> +
>> +
>> +On debug target system, you need to customize a debugging kernel
>> +with CONFIG_EARLY_PRINTK_XDBC enabled. And add below kernel boot
>> +parameter.
> Add :: at the end of previous line to make the below indented block
> preformatted text. Ditto for the others.
>
>> +
>> +"earlyprintk=xdbc"
>> +
>> +If there are multiple xHCI controllers in the system, you can
>> +append a host contoller index to this kernel parameter. This
>> +index is started from 0.
>> +
>> +If you are going to leverage the keep option defined by the
>> +early printk framework to keep the boot console alive after
>> +early boot, you'd better add below kernel boot parameter.
>> +
>> +"usbcore.autosuspend=-1"
>> +
>> +On debug host side, you don't need to customize the kernel, but
>> +you need to disable usb subsystem runtime power management by
>> +adding below kernel boot parameter.
>> +
>> +"usbcore.autosuspend=-1"
>> +
>> +Before starting the debug target, you should connect the debug
>> +port on debug target with a root port or port of any external hub
>> +on the debug host. The cable used to connect these two ports
>> +should be a USB 3.0 super-speed A-to-A debugging cable.
>> +
>> +During early boot of debug target, DbC (the debug engine for USB3
>> +debug port) hardware gets initialized. Debug host should be able
>> +to enumerate the debug target as a debug device. Debug host will
>> +then bind the debug device with the usb_debug driver module and
>> +create the /dev/ttyUSB0 device.
>> +
>> +If device enumeration goes smoothly, you should be able to see
>> +below kernel messages on debug host.
> Again, add :: and indent the below lines by some spaces.
>
>> +
>> +# tail -f /var/log/kern.log
>> +
>> +[ 1815.983374] usb 4-3: new SuperSpeed USB device number 4 using xhci_hcd
>> +[ 1815.999595] usb 4-3: LPM exit latency is zeroed, disabling LPM.
>> +[ 1815.999899] usb 4-3: New USB device found, idVendor=1d6b, idProduct=0004
>> +[ 1815.02] usb 4-3: New USB device strings: Mfr=1, Product=2, 
>> SerialNumber=3
>> +[ 1815.03] usb 4-3: Product: Remote GDB
>> +[ 1815.04] usb 4-3: Manufacturer: Linux
>> +[ 1815.05] usb 4-3: SerialNumber: 0001
>> +[ 1816.000240] usb_debug 4-3:1.0: xhci_dbc converter detected
>> +[ 1816.000360] usb 4-3: xhci_dbc converter now attached to ttyUSB0
>> +
>> +You can run below bash scripts on debug host to read the kernel
>> +log sent from debug target.
> Same here. Alternatively, if you do
>
> .. code-block:: sh
>
> and indent the block, you'll get syntax highlighting in the output.
>
>> +
>> += start of bash scripts =
>> +#!/bin/bash
>> +
>> +while true ; do
>> +while [ ! -d /sys/class/tty/ttyUSB0 ] ; do
>> +:
>> +done
>> +cat /dev/ttyUSB0 >> xdbc.log
>> +done
>> += end of bash scripts 

Re: [4.9-rc1+] intel_uncore builtin + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

2016-10-19 Thread Peter Zijlstra
On Wed, Oct 19, 2016 at 09:19:43PM +0200, Jiri Olsa wrote:
> I think the reason here is that presume pmu devices are always added,
> but we add them only if pmu_bus_running (in perf_event_sysfs_init)
> is set which might happen after uncore initcall
> 
> attached patch fixes the issue for me

Right, we never expected to be unloaded before userspace runs.

Strictly speaking we should only read pmu_bus_running while holding
pmus_lock, that way we're serialized against perf_event_sysfs_init()
flipping it while we're being removed etc..

With the current setup the introduced race is harmless, but who knows
what other crazy these device people will come up with ;-)

> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index c6e47e97b33f..c2099b799d16 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -8871,8 +8871,10 @@ void perf_pmu_unregister(struct pmu *pmu)
>   idr_remove(_idr, pmu->type);
>   if (pmu->nr_addr_filters)
>   device_remove_file(pmu->dev, _attr_nr_addr_filters);
> - device_del(pmu->dev);
> - put_device(pmu->dev);
> + if (pmu_bus_running) {
> + device_del(pmu->dev);
> + put_device(pmu->dev);
> + }
>   free_pmu_context(pmu);
>  }
>  EXPORT_SYMBOL_GPL(perf_pmu_unregister);


Re: [4.9-rc1+] intel_uncore builtin + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

2016-10-19 Thread Peter Zijlstra
On Wed, Oct 19, 2016 at 09:19:43PM +0200, Jiri Olsa wrote:
> I think the reason here is that presume pmu devices are always added,
> but we add them only if pmu_bus_running (in perf_event_sysfs_init)
> is set which might happen after uncore initcall
> 
> attached patch fixes the issue for me

Right, we never expected to be unloaded before userspace runs.

Strictly speaking we should only read pmu_bus_running while holding
pmus_lock, that way we're serialized against perf_event_sysfs_init()
flipping it while we're being removed etc..

With the current setup the introduced race is harmless, but who knows
what other crazy these device people will come up with ;-)

> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index c6e47e97b33f..c2099b799d16 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -8871,8 +8871,10 @@ void perf_pmu_unregister(struct pmu *pmu)
>   idr_remove(_idr, pmu->type);
>   if (pmu->nr_addr_filters)
>   device_remove_file(pmu->dev, _attr_nr_addr_filters);
> - device_del(pmu->dev);
> - put_device(pmu->dev);
> + if (pmu_bus_running) {
> + device_del(pmu->dev);
> + put_device(pmu->dev);
> + }
>   free_pmu_context(pmu);
>  }
>  EXPORT_SYMBOL_GPL(perf_pmu_unregister);


Re: Regression in intel_powerclamp, due to cpu whitelist removal

2016-10-19 Thread Zhang Rui
On Wed, 2016-10-19 at 20:45 -0700, Jacob Pan wrote:
> On Tue, 18 Oct 2016 14:20:49 +
> Alok Kataria  wrote:
> 
> > 
> > Hi Jacob, Zhang, 
> > 
> > One of your recent commit "thermal/powerclamp: remove cpu
> > whitelist” [1], has caused a regression in the kernel. 
> > 
> > That commit changed powerclamp_probe from requiring all of the
> > following features:
> > 
> > X86_FEATURE_NONSTOP_TSC
> > X86_FEATURE_CONSTANT_TSC
> > X86_FEATURE_MWAIT
> > X86_FEATURE_ARAT   
> > 
> > to *any* of them.  The problem is clamp_thread still wants to use
> > mwait_idle_with_hints even if the CPU doesn't support it. 
> > 
> Hi Alok,
> 
> You are right, it should be AND not OR.
>  
> +Eric who has a patch to address this.
> 
> https://patchwork.kernel.org/patch/9365005/
> 
> Rui/Rafael,
> 
> Could you consider this as an urgent fix?
> 
yeah, will include this for next -rc. And cc stable v4.7+.

thanks,
rui
> Jacob
> > 
> > This was reported by our users when running ubuntu 16.10
> > (4.8.0-22-generic) inside a VMware VM, though as mentioned above I
> > don’t think it is specific to our platform. We have seen kernel
> > panics due to invalid opcode because of this. Below is the stack
> > trace for your reference. 
> > 
> > [5.736416] invalid opcode:  [#1] SMP
> > [5.736455] Modules linked in: vmw_vsock_vmci_transport vsock
> > vmw_balloon intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
> > ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper
> > ablk_helper cryptd intel_rapl_perf input_leds joydev serio_raw
> > snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm snd_seq_midi
> > snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd
> > soundcore i2c_piix4 shpchp vmw_vmci nfit floppy(+) mac_hid
> > parport_pc
> > ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid
> > ahci libahci e1000 mptspi mptscsih psmouse mptbase vmwgfx
> > scsi_transport_spi ttm drm_kms_helper syscopyarea sysfillrect
> > sysimgblt fb_sys_fops drm pata_acpi fjes [5.744370] CPU: 1 PID:
> > 912 Comm: kidle_inject/1 Not tainted 4.8.0-22-generic #24-Ubuntu
> > [5.744373] Hardware name: VMware, Inc. VMware Virtual
> > Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> > [5.744375] task: 9658f7a663c0 task.stack: 9658fa908000
> > [5.744378] RIP: 0010:[]  []
> > clamp_thread+0x2b8/0x5d0 [intel_powerclamp] [5.744380] RSP:
> > 0018:9658fa90be00  EFLAGS: 00010246 [5.744383] RAX:
> > 9658fa908008 RBX: fffee0a6 RCX: 
> > [5.744386] RDX:  RSI: 0246 RDI:
> > 0246 [5.744388] RBP: 9658fa90bec0 R08:
> > 9658fa908000 R09:  [5.744391] R10:
> > 0001cbf7 R11:  R12: 8db581a0
> > [5.744393] R13: 9658fa908000 R14:  R15:
> > 9658fa908000 [5.744396] FS:  ()
> > GS:9658fc64() knlGS: [5.744398] CS:
> > 0010 DS:  ES:  CR0: 80050033 [5.744401] CR2:
> > 7ffa6cc262e8 CR3: 3ab3b000 CR4: 001406e0
> > [5.744403] Stack: [5.744406]  0001
> > 9658f7a66dc0 9658fc659200 e878d638 [5.744409]
> > 0001 0002fc659200 0001 9658fa908008
> > [5.744411]   9658fc64fea8 fffee0a6
> > c05720a0 [5.744414] Call Trace: [5.744416]
> > [] ? pkg_state_counter+0xa0/0xa0
> > [intel_powerclamp]
> > [5.744419]  [] ?
> > powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> > [5.744421]  [] ?
> > powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> > [5.744424]  [] kthread+0xd8/0xf0
> > [5.744427]  [] ret_from_fork+0x1f/0x40
> > [5.744429]  [] ?
> > kthread_create_on_node+0x1e0/0x1e0 [5.744432] Code: cc e9 ba 00
> > 00 00 eb 19 0f 1f 00 0f ae f0 65 48 8b 04 25 04 69 01 00 0f ae b8
> > 08
> > c0 ff ff 0f ae f0 31 d2 48 8b 44 24 38 48 89 d1 <0f> 01 c8 49 8b 45
> > 08 a8 08 75 0b b9 01 00 00 00 4c 89 f0 0f 01 [5.744434] RIP
> > [] clamp_thread+0x2b8/0x5d0 [intel_powerclamp]
> > [5.744437]  RSP  [5.70] invalid
> > opcode:
> >  [#2] SMP [5.744452] ---[ end trace cf659c4076bf2804 ]---
> > 
> > Looking at the instruction at the RIP  shows that
> > the kernel attempted to execute “monitor” instruction. 
> > 
> >  8b8:   0f 01 c8monitor %rax,%rcx,%rdx
> >  8bb:   49 8b 45 08 mov0x8(%r13),%rax
> > 
> > To fix this, I think you should restore the explicit feature check
> > “if block” that was removed in the above mentioned commit. Can you
> > please look at this ?
> > 
> > Thanks,
> > Alok
> > 
> > 
> > [1] b721ca0d192754deccb89fb01c77e41e6fd91ad9
> > https://github.com/torvalds/linux/commit/b721ca0d192754deccb89fb01c
> > 77e41e6fd91ad9, 
> > 
> [Jacob Pan]


Re: Regression in intel_powerclamp, due to cpu whitelist removal

2016-10-19 Thread Zhang Rui
On Wed, 2016-10-19 at 20:45 -0700, Jacob Pan wrote:
> On Tue, 18 Oct 2016 14:20:49 +
> Alok Kataria  wrote:
> 
> > 
> > Hi Jacob, Zhang, 
> > 
> > One of your recent commit "thermal/powerclamp: remove cpu
> > whitelist” [1], has caused a regression in the kernel. 
> > 
> > That commit changed powerclamp_probe from requiring all of the
> > following features:
> > 
> > X86_FEATURE_NONSTOP_TSC
> > X86_FEATURE_CONSTANT_TSC
> > X86_FEATURE_MWAIT
> > X86_FEATURE_ARAT   
> > 
> > to *any* of them.  The problem is clamp_thread still wants to use
> > mwait_idle_with_hints even if the CPU doesn't support it. 
> > 
> Hi Alok,
> 
> You are right, it should be AND not OR.
>  
> +Eric who has a patch to address this.
> 
> https://patchwork.kernel.org/patch/9365005/
> 
> Rui/Rafael,
> 
> Could you consider this as an urgent fix?
> 
yeah, will include this for next -rc. And cc stable v4.7+.

thanks,
rui
> Jacob
> > 
> > This was reported by our users when running ubuntu 16.10
> > (4.8.0-22-generic) inside a VMware VM, though as mentioned above I
> > don’t think it is specific to our platform. We have seen kernel
> > panics due to invalid opcode because of this. Below is the stack
> > trace for your reference. 
> > 
> > [5.736416] invalid opcode:  [#1] SMP
> > [5.736455] Modules linked in: vmw_vsock_vmci_transport vsock
> > vmw_balloon intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
> > ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper
> > ablk_helper cryptd intel_rapl_perf input_leds joydev serio_raw
> > snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm snd_seq_midi
> > snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd
> > soundcore i2c_piix4 shpchp vmw_vmci nfit floppy(+) mac_hid
> > parport_pc
> > ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid
> > ahci libahci e1000 mptspi mptscsih psmouse mptbase vmwgfx
> > scsi_transport_spi ttm drm_kms_helper syscopyarea sysfillrect
> > sysimgblt fb_sys_fops drm pata_acpi fjes [5.744370] CPU: 1 PID:
> > 912 Comm: kidle_inject/1 Not tainted 4.8.0-22-generic #24-Ubuntu
> > [5.744373] Hardware name: VMware, Inc. VMware Virtual
> > Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> > [5.744375] task: 9658f7a663c0 task.stack: 9658fa908000
> > [5.744378] RIP: 0010:[]  []
> > clamp_thread+0x2b8/0x5d0 [intel_powerclamp] [5.744380] RSP:
> > 0018:9658fa90be00  EFLAGS: 00010246 [5.744383] RAX:
> > 9658fa908008 RBX: fffee0a6 RCX: 
> > [5.744386] RDX:  RSI: 0246 RDI:
> > 0246 [5.744388] RBP: 9658fa90bec0 R08:
> > 9658fa908000 R09:  [5.744391] R10:
> > 0001cbf7 R11:  R12: 8db581a0
> > [5.744393] R13: 9658fa908000 R14:  R15:
> > 9658fa908000 [5.744396] FS:  ()
> > GS:9658fc64() knlGS: [5.744398] CS:
> > 0010 DS:  ES:  CR0: 80050033 [5.744401] CR2:
> > 7ffa6cc262e8 CR3: 3ab3b000 CR4: 001406e0
> > [5.744403] Stack: [5.744406]  0001
> > 9658f7a66dc0 9658fc659200 e878d638 [5.744409]
> > 0001 0002fc659200 0001 9658fa908008
> > [5.744411]   9658fc64fea8 fffee0a6
> > c05720a0 [5.744414] Call Trace: [5.744416]
> > [] ? pkg_state_counter+0xa0/0xa0
> > [intel_powerclamp]
> > [5.744419]  [] ?
> > powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> > [5.744421]  [] ?
> > powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> > [5.744424]  [] kthread+0xd8/0xf0
> > [5.744427]  [] ret_from_fork+0x1f/0x40
> > [5.744429]  [] ?
> > kthread_create_on_node+0x1e0/0x1e0 [5.744432] Code: cc e9 ba 00
> > 00 00 eb 19 0f 1f 00 0f ae f0 65 48 8b 04 25 04 69 01 00 0f ae b8
> > 08
> > c0 ff ff 0f ae f0 31 d2 48 8b 44 24 38 48 89 d1 <0f> 01 c8 49 8b 45
> > 08 a8 08 75 0b b9 01 00 00 00 4c 89 f0 0f 01 [5.744434] RIP
> > [] clamp_thread+0x2b8/0x5d0 [intel_powerclamp]
> > [5.744437]  RSP  [5.70] invalid
> > opcode:
> >  [#2] SMP [5.744452] ---[ end trace cf659c4076bf2804 ]---
> > 
> > Looking at the instruction at the RIP  shows that
> > the kernel attempted to execute “monitor” instruction. 
> > 
> >  8b8:   0f 01 c8monitor %rax,%rcx,%rdx
> >  8bb:   49 8b 45 08 mov0x8(%r13),%rax
> > 
> > To fix this, I think you should restore the explicit feature check
> > “if block” that was removed in the above mentioned commit. Can you
> > please look at this ?
> > 
> > Thanks,
> > Alok
> > 
> > 
> > [1] b721ca0d192754deccb89fb01c77e41e6fd91ad9
> > https://github.com/torvalds/linux/commit/b721ca0d192754deccb89fb01c
> > 77e41e6fd91ad9, 
> > 
> [Jacob Pan]


Re: [PATCH v3 1/2] arm64: dts: uniphier: add CPU clock and OPP table for LD11 SoC

2016-10-19 Thread Viresh Kumar
On 20-10-16, 13:44, Masahiro Yamada wrote:
> Add a CPU clock to every CPU node and a CPU OPP table to use the
> generic cpufreq driver.
> 
> Note:
> clock-latency-ns (300ns) was calculated based on the CPU-gear switch
> sequencer spec; it takes 12 clock cycles on the sequencer running
> at 50 MHz, plus a bit additional latency.
> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
> Changes in v2:
>   - Match the node name to the opp-hz property.

For both the patches ..

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH v3 1/2] arm64: dts: uniphier: add CPU clock and OPP table for LD11 SoC

2016-10-19 Thread Viresh Kumar
On 20-10-16, 13:44, Masahiro Yamada wrote:
> Add a CPU clock to every CPU node and a CPU OPP table to use the
> generic cpufreq driver.
> 
> Note:
> clock-latency-ns (300ns) was calculated based on the CPU-gear switch
> sequencer spec; it takes 12 clock cycles on the sequencer running
> at 50 MHz, plus a bit additional latency.
> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
> Changes in v2:
>   - Match the node name to the opp-hz property.

For both the patches ..

Acked-by: Viresh Kumar 

-- 
viresh


Re: [lkp] [ipc/sem.c] 5864a2fd30: aim9.shared_memory.ops_per_sec -13.0%

2016-10-19 Thread Manfred Spraul

On 10/20/2016 02:21 AM, Andrew Morton wrote:

On Wed, 19 Oct 2016 06:38:14 +0200 Manfred Spraul  
wrote:


Hi,

as discussed before:
The root cause for the performance regression is the smp_mb() that was
added into the fast path.

I see two options:
1) switch to full spin_lock()/spin_unlock() for the rare codepath,
   then the fast path doesn't need the smp_mb() anymore.

2) confirm that no arch needs the smp_mb(), then remove it.
   - powerpc is ok after commit
  6262db7c088b ("powerpc/spinlock: Fix spin_unlock_wait()")
   - arm is ok after commit
  d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent 
lockers")
   - for x86 is ok after commit
  2c6100227116 ("locking/qspinlock: Fix spin_unlock_wait() some more")
   - for the remaining SMP architectures, I don't have a status.

I would prefer the approach 1:
The memory ordering provided by spin_lock()/spin_unlock() is clear.

Thus:
Attached are patches for approach 1:

- Patch 1 replaces spin_unlock_wait() with spin_lock()/spin_unlock() and
   removes all memory barriers that are then unnecessary.

- Patch 2 adds the hysteresis code: This makes the rare codepath
   extremely rare.
   It also corrects some wrong comments, e.g. regarding switching
   from global lock to per-sem lock (we "must' switch, not we "can"
   switch as written right now).

The patches passed stress-testing.

What do you think?

Are you able to confirm that the performance issues are fixed?

I don't know why we are now at -13%.
The previous -9% were resolved by the patches:

http://marc.info/?l=linux-kernel=147608082017551=2

My initial idea was to aim for 4.10, then we have more time to decide.

I suppose I can slip these into -next and see what the effect is upon
the Intel test results.  But a) I don't know if they test linux-next(?)
and b) I don't know where the test results are published, assuming they
are published(?).


--

Manfred



Re: [lkp] [ipc/sem.c] 5864a2fd30: aim9.shared_memory.ops_per_sec -13.0%

2016-10-19 Thread Manfred Spraul

On 10/20/2016 02:21 AM, Andrew Morton wrote:

On Wed, 19 Oct 2016 06:38:14 +0200 Manfred Spraul  
wrote:


Hi,

as discussed before:
The root cause for the performance regression is the smp_mb() that was
added into the fast path.

I see two options:
1) switch to full spin_lock()/spin_unlock() for the rare codepath,
   then the fast path doesn't need the smp_mb() anymore.

2) confirm that no arch needs the smp_mb(), then remove it.
   - powerpc is ok after commit
  6262db7c088b ("powerpc/spinlock: Fix spin_unlock_wait()")
   - arm is ok after commit
  d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent 
lockers")
   - for x86 is ok after commit
  2c6100227116 ("locking/qspinlock: Fix spin_unlock_wait() some more")
   - for the remaining SMP architectures, I don't have a status.

I would prefer the approach 1:
The memory ordering provided by spin_lock()/spin_unlock() is clear.

Thus:
Attached are patches for approach 1:

- Patch 1 replaces spin_unlock_wait() with spin_lock()/spin_unlock() and
   removes all memory barriers that are then unnecessary.

- Patch 2 adds the hysteresis code: This makes the rare codepath
   extremely rare.
   It also corrects some wrong comments, e.g. regarding switching
   from global lock to per-sem lock (we "must' switch, not we "can"
   switch as written right now).

The patches passed stress-testing.

What do you think?

Are you able to confirm that the performance issues are fixed?

I don't know why we are now at -13%.
The previous -9% were resolved by the patches:

http://marc.info/?l=linux-kernel=147608082017551=2

My initial idea was to aim for 4.10, then we have more time to decide.

I suppose I can slip these into -next and see what the effect is upon
the Intel test results.  But a) I don't know if they test linux-next(?)
and b) I don't know where the test results are published, assuming they
are published(?).


--

Manfred



[PATCH v3 2/2] arm64: dts: uniphier: add CPU clocks and OPP tables for LD20 SoC

2016-10-19 Thread Masahiro Yamada
Add a CPU clock to every CPU node and CPU OPP tables to use the
generic cpufreq driver.  All the CPUs in each cluster share the
same OPP table.

Note:
clock-latency-ns (300ns) was calculated based on the CPU-gear switch
sequencer spec; it takes 12 clock cycles on the sequencer running
at 50 MHz, plus a bit additional latency.

Signed-off-by: Masahiro Yamada 
  - Fix cluster1 OPP as well.

---

Changes in v2:
  - Match the node name to the opp-hz property.

 arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi | 84 
 1 file changed, 84 insertions(+)

diff --git a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi 
b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi
index 6f48e82..a9a08dd 100644
--- a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi
+++ b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi
@@ -79,28 +79,36 @@
device_type = "cpu";
compatible = "arm,cortex-a72", "arm,armv8";
reg = <0 0x000>;
+   clocks = <_clk 32>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
 
cpu1: cpu@1 {
device_type = "cpu";
compatible = "arm,cortex-a72", "arm,armv8";
reg = <0 0x001>;
+   clocks = <_clk 32>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
 
cpu2: cpu@100 {
device_type = "cpu";
compatible = "arm,cortex-a53", "arm,armv8";
reg = <0 0x100>;
+   clocks = <_clk 33>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
 
cpu3: cpu@101 {
device_type = "cpu";
compatible = "arm,cortex-a53", "arm,armv8";
reg = <0 0x101>;
+   clocks = <_clk 33>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
};
 
@@ -109,6 +117,82 @@
method = "smc";
};
 
+   cluster0_opp: opp_table0 {
+   compatible = "operating-points-v2";
+   opp-shared;
+
+   opp@25000 {
+   opp-hz = /bits/ 64 <25000>;
+   clock-latency-ns = <300>;
+   };
+   opp@27500 {
+   opp-hz = /bits/ 64 <27500>;
+   clock-latency-ns = <300>;
+   };
+   opp@5 {
+   opp-hz = /bits/ 64 <5>;
+   clock-latency-ns = <300>;
+   };
+   opp@55000 {
+   opp-hz = /bits/ 64 <55000>;
+   clock-latency-ns = <300>;
+   };
+   opp@67000 {
+   opp-hz = /bits/ 64 <67000>;
+   clock-latency-ns = <300>;
+   };
+   opp@74000 {
+   opp-hz = /bits/ 64 <74000>;
+   clock-latency-ns = <300>;
+   };
+   opp@10 {
+   opp-hz = /bits/ 64 <10>;
+   clock-latency-ns = <300>;
+   };
+   opp@11 {
+   opp-hz = /bits/ 64 <11>;
+   clock-latency-ns = <300>;
+   };
+   };
+
+   cluster1_opp: opp_table1 {
+   compatible = "operating-points-v2";
+   opp-shared;
+
+   opp@25000 {
+   opp-hz = /bits/ 64 <25000>;
+   clock-latency-ns = <300>;
+   };
+   opp@27500 {
+   opp-hz = /bits/ 64 <27500>;
+   clock-latency-ns = <300>;
+   };
+   opp@5 {
+   opp-hz = /bits/ 64 <5>;
+   clock-latency-ns = <300>;
+   };
+   opp@55000 {
+   opp-hz = /bits/ 64 <55000>;
+   clock-latency-ns = <300>;
+   };
+   opp@67000 {
+   opp-hz = /bits/ 64 <67000>;
+   clock-latency-ns = <300>;
+   };
+   opp@74000 {
+   opp-hz = /bits/ 64 <74000>;
+   clock-latency-ns = <300>;
+   };
+   opp@10 {
+   opp-hz = /bits/ 64 <10>;
+   clock-latency-ns = <300>;
+   };
+   opp@11 {
+ 

[PATCH v3 2/2] arm64: dts: uniphier: add CPU clocks and OPP tables for LD20 SoC

2016-10-19 Thread Masahiro Yamada
Add a CPU clock to every CPU node and CPU OPP tables to use the
generic cpufreq driver.  All the CPUs in each cluster share the
same OPP table.

Note:
clock-latency-ns (300ns) was calculated based on the CPU-gear switch
sequencer spec; it takes 12 clock cycles on the sequencer running
at 50 MHz, plus a bit additional latency.

Signed-off-by: Masahiro Yamada 
  - Fix cluster1 OPP as well.

---

Changes in v2:
  - Match the node name to the opp-hz property.

 arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi | 84 
 1 file changed, 84 insertions(+)

diff --git a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi 
b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi
index 6f48e82..a9a08dd 100644
--- a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi
+++ b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi
@@ -79,28 +79,36 @@
device_type = "cpu";
compatible = "arm,cortex-a72", "arm,armv8";
reg = <0 0x000>;
+   clocks = <_clk 32>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
 
cpu1: cpu@1 {
device_type = "cpu";
compatible = "arm,cortex-a72", "arm,armv8";
reg = <0 0x001>;
+   clocks = <_clk 32>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
 
cpu2: cpu@100 {
device_type = "cpu";
compatible = "arm,cortex-a53", "arm,armv8";
reg = <0 0x100>;
+   clocks = <_clk 33>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
 
cpu3: cpu@101 {
device_type = "cpu";
compatible = "arm,cortex-a53", "arm,armv8";
reg = <0 0x101>;
+   clocks = <_clk 33>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
};
 
@@ -109,6 +117,82 @@
method = "smc";
};
 
+   cluster0_opp: opp_table0 {
+   compatible = "operating-points-v2";
+   opp-shared;
+
+   opp@25000 {
+   opp-hz = /bits/ 64 <25000>;
+   clock-latency-ns = <300>;
+   };
+   opp@27500 {
+   opp-hz = /bits/ 64 <27500>;
+   clock-latency-ns = <300>;
+   };
+   opp@5 {
+   opp-hz = /bits/ 64 <5>;
+   clock-latency-ns = <300>;
+   };
+   opp@55000 {
+   opp-hz = /bits/ 64 <55000>;
+   clock-latency-ns = <300>;
+   };
+   opp@67000 {
+   opp-hz = /bits/ 64 <67000>;
+   clock-latency-ns = <300>;
+   };
+   opp@74000 {
+   opp-hz = /bits/ 64 <74000>;
+   clock-latency-ns = <300>;
+   };
+   opp@10 {
+   opp-hz = /bits/ 64 <10>;
+   clock-latency-ns = <300>;
+   };
+   opp@11 {
+   opp-hz = /bits/ 64 <11>;
+   clock-latency-ns = <300>;
+   };
+   };
+
+   cluster1_opp: opp_table1 {
+   compatible = "operating-points-v2";
+   opp-shared;
+
+   opp@25000 {
+   opp-hz = /bits/ 64 <25000>;
+   clock-latency-ns = <300>;
+   };
+   opp@27500 {
+   opp-hz = /bits/ 64 <27500>;
+   clock-latency-ns = <300>;
+   };
+   opp@5 {
+   opp-hz = /bits/ 64 <5>;
+   clock-latency-ns = <300>;
+   };
+   opp@55000 {
+   opp-hz = /bits/ 64 <55000>;
+   clock-latency-ns = <300>;
+   };
+   opp@67000 {
+   opp-hz = /bits/ 64 <67000>;
+   clock-latency-ns = <300>;
+   };
+   opp@74000 {
+   opp-hz = /bits/ 64 <74000>;
+   clock-latency-ns = <300>;
+   };
+   opp@10 {
+   opp-hz = /bits/ 64 <10>;
+   clock-latency-ns = <300>;
+   };
+   opp@11 {
+   opp-hz = /bits/ 64 

[PATCH v3 1/2] arm64: dts: uniphier: add CPU clock and OPP table for LD11 SoC

2016-10-19 Thread Masahiro Yamada
Add a CPU clock to every CPU node and a CPU OPP table to use the
generic cpufreq driver.

Note:
clock-latency-ns (300ns) was calculated based on the CPU-gear switch
sequencer spec; it takes 12 clock cycles on the sequencer running
at 50 MHz, plus a bit additional latency.

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - Match the node name to the opp-hz property.

 arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi | 38 
 1 file changed, 38 insertions(+)

diff --git a/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi 
b/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi
index 73e0acf..bb05f0a 100644
--- a/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi
+++ b/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi
@@ -70,14 +70,18 @@
device_type = "cpu";
compatible = "arm,cortex-a53", "arm,armv8";
reg = <0 0x000>;
+   clocks = <_clk 33>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
 
cpu1: cpu@1 {
device_type = "cpu";
compatible = "arm,cortex-a53", "arm,armv8";
reg = <0 0x001>;
+   clocks = <_clk 33>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
};
 
@@ -86,6 +90,40 @@
method = "smc";
};
 
+   cluster0_opp: opp_table {
+   compatible = "operating-points-v2";
+   opp-shared;
+
+   opp@24500 {
+   opp-hz = /bits/ 64 <24500>;
+   clock-latency-ns = <300>;
+   };
+   opp@25000 {
+   opp-hz = /bits/ 64 <25000>;
+   clock-latency-ns = <300>;
+   };
+   opp@49000 {
+   opp-hz = /bits/ 64 <49000>;
+   clock-latency-ns = <300>;
+   };
+   opp@5 {
+   opp-hz = /bits/ 64 <5>;
+   clock-latency-ns = <300>;
+   };
+   opp@653334000 {
+   opp-hz = /bits/ 64 <653334000>;
+   clock-latency-ns = <300>;
+   };
+   opp@67000 {
+   opp-hz = /bits/ 64 <67000>;
+   clock-latency-ns = <300>;
+   };
+   opp@98000 {
+   opp-hz = /bits/ 64 <98000>;
+   clock-latency-ns = <300>;
+   };
+   };
+
clocks {
refclk: ref {
compatible = "fixed-clock";
-- 
1.9.1



[PATCH v3 1/2] arm64: dts: uniphier: add CPU clock and OPP table for LD11 SoC

2016-10-19 Thread Masahiro Yamada
Add a CPU clock to every CPU node and a CPU OPP table to use the
generic cpufreq driver.

Note:
clock-latency-ns (300ns) was calculated based on the CPU-gear switch
sequencer spec; it takes 12 clock cycles on the sequencer running
at 50 MHz, plus a bit additional latency.

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - Match the node name to the opp-hz property.

 arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi | 38 
 1 file changed, 38 insertions(+)

diff --git a/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi 
b/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi
index 73e0acf..bb05f0a 100644
--- a/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi
+++ b/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi
@@ -70,14 +70,18 @@
device_type = "cpu";
compatible = "arm,cortex-a53", "arm,armv8";
reg = <0 0x000>;
+   clocks = <_clk 33>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
 
cpu1: cpu@1 {
device_type = "cpu";
compatible = "arm,cortex-a53", "arm,armv8";
reg = <0 0x001>;
+   clocks = <_clk 33>;
enable-method = "psci";
+   operating-points-v2 = <_opp>;
};
};
 
@@ -86,6 +90,40 @@
method = "smc";
};
 
+   cluster0_opp: opp_table {
+   compatible = "operating-points-v2";
+   opp-shared;
+
+   opp@24500 {
+   opp-hz = /bits/ 64 <24500>;
+   clock-latency-ns = <300>;
+   };
+   opp@25000 {
+   opp-hz = /bits/ 64 <25000>;
+   clock-latency-ns = <300>;
+   };
+   opp@49000 {
+   opp-hz = /bits/ 64 <49000>;
+   clock-latency-ns = <300>;
+   };
+   opp@5 {
+   opp-hz = /bits/ 64 <5>;
+   clock-latency-ns = <300>;
+   };
+   opp@653334000 {
+   opp-hz = /bits/ 64 <653334000>;
+   clock-latency-ns = <300>;
+   };
+   opp@67000 {
+   opp-hz = /bits/ 64 <67000>;
+   clock-latency-ns = <300>;
+   };
+   opp@98000 {
+   opp-hz = /bits/ 64 <98000>;
+   clock-latency-ns = <300>;
+   };
+   };
+
clocks {
refclk: ref {
compatible = "fixed-clock";
-- 
1.9.1



Re: [PATCH 2/2] phy: qcom-qmp: new qmp phy driver for qcom-chipsets

2016-10-19 Thread Vivek Gautam
On Wed, Oct 19, 2016 at 4:24 PM, Vivek Gautam
 wrote:
> CC: Srinivas Kandagatla
>
>
> On 10/19/2016 04:13 PM, Vivek Gautam wrote:
>>
>> Qualcomm SOCs have QMP phy controller that provides support
>> to a number of controller, viz. PCIe, UFS, and USB.
>> Add a new driver, based on generic phy framework, for this
>> phy controller.
>>
>> USB3-phy changes: Based on phy-msm-ssusb-qmp driver available on
>> msm-4.4 kernel @codeaurora[1].
>> PCIe-phy changes: Based on msm8996-pcie-phy driver posted by
>> Srinivas [2].
>>
>> [1]
>> https://source.codeaurora.org/quic/la/kernel/msm-4.4/log/?h=caf/3.18/msm-3.18
>> [2] https://patchwork.kernel.org/patch/9318947/
>>
>> Signed-off-by: Vivek Gautam 
>> Cc: Kishon Vijay Abraham I 
>> ---

[snip]

>> +static int qcom_qmp_phy_com_init(struct qcom_qmp_phy *qphy)
>> +{
>> +   const struct qmp_phy_init_cfg *cfg = qphy->cfg;
>> +   void __iomem *serdes = qphy->serdes;
>> +   int ret;
>> +
>> +   mutex_lock(>phy_mutex);
>> +   if (qphy->init_count++) {
>> +   mutex_unlock(>phy_mutex);
>> +   return 0;
>> +   }
>> +
>> +   ret = reset_control_deassert(qphy->phy_rst);
>> +   if (ret) {
>> +   dev_err(qphy->dev, "phy reset deassert failed\n");
>> +   return ret;

Pointed out by Kbuild-test: drivers/phy/phy-qcom-qmp.c:677:2-8:
preceding lock on line 668

The mutex has to be unlocked for all error cases.
Will fix this in the next patch version.

[snip]


Thanks
Vivek


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/2] phy: qcom-qmp: new qmp phy driver for qcom-chipsets

2016-10-19 Thread Vivek Gautam
On Wed, Oct 19, 2016 at 4:24 PM, Vivek Gautam
 wrote:
> CC: Srinivas Kandagatla
>
>
> On 10/19/2016 04:13 PM, Vivek Gautam wrote:
>>
>> Qualcomm SOCs have QMP phy controller that provides support
>> to a number of controller, viz. PCIe, UFS, and USB.
>> Add a new driver, based on generic phy framework, for this
>> phy controller.
>>
>> USB3-phy changes: Based on phy-msm-ssusb-qmp driver available on
>> msm-4.4 kernel @codeaurora[1].
>> PCIe-phy changes: Based on msm8996-pcie-phy driver posted by
>> Srinivas [2].
>>
>> [1]
>> https://source.codeaurora.org/quic/la/kernel/msm-4.4/log/?h=caf/3.18/msm-3.18
>> [2] https://patchwork.kernel.org/patch/9318947/
>>
>> Signed-off-by: Vivek Gautam 
>> Cc: Kishon Vijay Abraham I 
>> ---

[snip]

>> +static int qcom_qmp_phy_com_init(struct qcom_qmp_phy *qphy)
>> +{
>> +   const struct qmp_phy_init_cfg *cfg = qphy->cfg;
>> +   void __iomem *serdes = qphy->serdes;
>> +   int ret;
>> +
>> +   mutex_lock(>phy_mutex);
>> +   if (qphy->init_count++) {
>> +   mutex_unlock(>phy_mutex);
>> +   return 0;
>> +   }
>> +
>> +   ret = reset_control_deassert(qphy->phy_rst);
>> +   if (ret) {
>> +   dev_err(qphy->dev, "phy reset deassert failed\n");
>> +   return ret;

Pointed out by Kbuild-test: drivers/phy/phy-qcom-qmp.c:677:2-8:
preceding lock on line 668

The mutex has to be unlocked for all error cases.
Will fix this in the next patch version.

[snip]


Thanks
Vivek


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


[RFC][PATCH 1/2 v2] usb: dwc2: Force port resume on switching to device mode

2016-10-19 Thread John Stultz
From: Chen Yu 

We've seen failures when switching between host and gadget mode,
which was diagnosed as being caused due to the bus being
auto-suspended when we switched.

So this patch forces a port resume when switching to device
mode if the bus is suspended.

Cc: Wei Xu 
Cc: Guodong Xu 
Cc: Amit Pundir 
Cc: Rob Herring 
Cc: John Youn 
Cc: Douglas Anderson 
Cc: Greg Kroah-Hartman 
Cc: linux-...@vger.kernel.org
Signed-off-by: Chen Yu 
Signed-off-by: John Stultz 
---
 drivers/usb/dwc2/hcd.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index df5a065..b374e60 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -54,6 +54,9 @@
 #include "core.h"
 #include "hcd.h"
 
+
+static void dwc2_port_resume(struct dwc2_hsotg *hsotg);
+
 /*
  * =
  *  Host Core Layer Functions
@@ -3204,6 +3207,11 @@ static void dwc2_conn_id_status_change(struct 
work_struct *work)
if (gotgctl & GOTGCTL_CONID_B) {
/* Wait for switch to device mode */
dev_dbg(hsotg->dev, "connId B\n");
+   if (hsotg->bus_suspended) {
+   dev_info(hsotg->dev,
+   "Do port resume before switching to device 
mode\n");
+   dwc2_port_resume(hsotg);
+   }
while (!dwc2_is_device_mode(hsotg)) {
dev_info(hsotg->dev,
 "Waiting for Peripheral Mode, Mode=%s\n",
-- 
2.7.4



[RFC][PATCH 1/2 v2] usb: dwc2: Force port resume on switching to device mode

2016-10-19 Thread John Stultz
From: Chen Yu 

We've seen failures when switching between host and gadget mode,
which was diagnosed as being caused due to the bus being
auto-suspended when we switched.

So this patch forces a port resume when switching to device
mode if the bus is suspended.

Cc: Wei Xu 
Cc: Guodong Xu 
Cc: Amit Pundir 
Cc: Rob Herring 
Cc: John Youn 
Cc: Douglas Anderson 
Cc: Greg Kroah-Hartman 
Cc: linux-...@vger.kernel.org
Signed-off-by: Chen Yu 
Signed-off-by: John Stultz 
---
 drivers/usb/dwc2/hcd.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index df5a065..b374e60 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -54,6 +54,9 @@
 #include "core.h"
 #include "hcd.h"
 
+
+static void dwc2_port_resume(struct dwc2_hsotg *hsotg);
+
 /*
  * =
  *  Host Core Layer Functions
@@ -3204,6 +3207,11 @@ static void dwc2_conn_id_status_change(struct 
work_struct *work)
if (gotgctl & GOTGCTL_CONID_B) {
/* Wait for switch to device mode */
dev_dbg(hsotg->dev, "connId B\n");
+   if (hsotg->bus_suspended) {
+   dev_info(hsotg->dev,
+   "Do port resume before switching to device 
mode\n");
+   dwc2_port_resume(hsotg);
+   }
while (!dwc2_is_device_mode(hsotg)) {
dev_info(hsotg->dev,
 "Waiting for Peripheral Mode, Mode=%s\n",
-- 
2.7.4



[RFC][PATCH 2/2 v2] usb: dwc2: Add a quirk to allow speed negotiation for Hisilicon Hi6220

2016-10-19 Thread John Stultz
From: Chen Yu 

The Hi6220's usb controller is limited in that it does not
support "Split Transactions", so it does not support communicating
with low-speed and full-speed devices behind a high-speed hub.

Thus it requires a quirk so that we can manually drop the usb
speed when low/full-speed are attached, and bump back to high
speed when they are removed.

Cc: Wei Xu 
Cc: Guodong Xu 
Cc: Amit Pundir 
Cc: Rob Herring 
Cc: John Youn 
Cc: Douglas Anderson 
Cc: Greg Kroah-Hartman 
Cc: linux-...@vger.kernel.org
Signed-off-by: Chen Yu 
[jstultz: Reworked to simplify the patch, and made
 commit log to be more specific about the issue]
Signed-off-by: John Stultz 
---
v2:
* Fix build issue reported by kbuildbot
* Rework to avoid using new dts entry suggested by RobH
* Further tweaks from Chen Yu to try to address comments from
  John Youn
* Further simplified logic
---
 drivers/usb/dwc2/core.h |  6 +
 drivers/usb/dwc2/hcd.c  | 61 +
 drivers/usb/dwc2/platform.c |  1 +
 3 files changed, 68 insertions(+)

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 2a21a04..baa4f64 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -417,6 +417,11 @@ enum dwc2_ep0_state {
  * needed.
  * 0 - No (default)
  * 1 - Yes
+ * @change_speed_quirk: Change speed configuration to DWC2_SPEED_PARAM_FULL
+ *  while full speed device connect. And change speed
+ *  back to DWC2_SPEED_PARAM_HIGH while device is gone.
+ * 0 - No (default)
+ * 1 - Yes
  *
  * The following parameters may be specified when starting the module. These
  * parameters define how the DWC_otg controller should be configured. A
@@ -457,6 +462,7 @@ struct dwc2_core_params {
int uframe_sched;
int external_id_pin_ctl;
int hibernation;
+   int change_speed_quirk;
 };
 
 /**
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index b374e60..85121e2 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -4868,6 +4868,62 @@ static void _dwc2_hcd_clear_tt_buffer_complete(struct 
usb_hcd *hcd,
spin_unlock_irqrestore(>lock, flags);
 }
 
+/*
+ * HPRT0_SPD_HIGH_SPEED: high speed
+ * HPRT0_SPD_FULL_SPEED: full speed
+ */
+static void dwc2_change_bus_speed(struct usb_hcd *hcd, int speed)
+{
+   struct dwc2_hsotg *hsotg = dwc2_hcd_to_hsotg(hcd);
+
+   if (hsotg->core_params->speed == speed)
+   return;
+
+   hsotg->core_params->speed = speed;
+   queue_work(hsotg->wq_otg, >wf_otg);
+}
+
+static void dwc2_free_dev(struct usb_hcd *hcd, struct usb_device *udev)
+{
+   struct dwc2_hsotg *hsotg = dwc2_hcd_to_hsotg(hcd);
+
+   if (!hsotg->core_params->change_speed_quirk)
+   return;
+
+   /*
+* On removal, set speed to default high-speed.
+*/
+   if (udev->parent &&
+   udev->parent->speed > USB_SPEED_UNKNOWN &&
+   udev->parent->speed < USB_SPEED_HIGH) {
+   dev_info(hsotg->dev, "Set speed to default high-speed\n");
+   dwc2_change_bus_speed(hcd, HPRT0_SPD_HIGH_SPEED);
+   }
+}
+
+static int dwc2_reset_device(struct usb_hcd *hcd, struct usb_device *udev)
+{
+   struct dwc2_hsotg *hsotg = dwc2_hcd_to_hsotg(hcd);
+
+   if (!hsotg->core_params->change_speed_quirk)
+   return 0;
+
+   if (udev->speed == USB_SPEED_HIGH) {
+   dev_info(hsotg->dev, "Set speed to high-speed\n");
+   dwc2_change_bus_speed(hcd, HPRT0_SPD_HIGH_SPEED);
+   } else if ((udev->speed == USB_SPEED_FULL ||
+   udev->speed == USB_SPEED_LOW)) {
+   /*
+* Change speed setting to full-speed if there's
+* a full-speed or low-speed device plugged in.
+*/
+   dev_info(hsotg->dev, "Set speed to full-speed\n");
+   dwc2_change_bus_speed(hcd, HPRT0_SPD_FULL_SPEED);
+   }
+
+   return 0;
+}
+
 static struct hc_driver dwc2_hc_driver = {
.description = "dwc2_hsotg",
.product_desc = "DWC OTG Controller",
@@ -5023,6 +5079,11 @@ int dwc2_hcd_init(struct dwc2_hsotg *hsotg, int irq)
dev_warn(hsotg->dev, "can't set coherent DMA mask\n");
}
 
+   if (hsotg->core_params->change_speed_quirk) {
+   dwc2_hc_driver.free_dev = dwc2_free_dev;
+   dwc2_hc_driver.reset_device = dwc2_reset_device;
+   }
+
hcd = usb_create_hcd(_hc_driver, hsotg->dev, dev_name(hsotg->dev));
if (!hcd)
goto error1;
diff --git 

[RFC][PATCH 0/2 v2] dwc2 fixes for HiKey

2016-10-19 Thread John Stultz
I wanted to send out two patches we're using on the dwc2
driver in order to make HiKey function.

The first is seemingly just a bug fix we ran into when
changing modes while the bus was auto-suspended.

The second is a little more interesting as it works around
a limitation of the the device in that it can't handle split
transactions, so the usb speed but be lowered and raised
manually when full-speed devices come and go.

I wanted to send these out for further review and consideration.

Changes in v2:
* Fix build issue reported by kbuildbot
* Rework to avoid using new dts entry suggested by RobH
* Further tweaks from Chen Yu to try to address comments from
  John Youn
* Further simplified logic


Feedback would be greatly appreciated!

thanks
-john


Cc: Wei Xu 
Cc: Guodong Xu 
Cc: Amit Pundir 
Cc: Rob Herring 
Cc: John Youn 
Cc: Douglas Anderson 
Cc: Greg Kroah-Hartman 
Cc: linux-...@vger.kernel.org
Cc: Chen Yu 


Chen Yu (2):
  usb: dwc2: Force port resume on switching to device mode
  usb: dwc2: Add a quirk to allow speed negotiation for Hisilicon Hi6220

 drivers/usb/dwc2/core.h |  6 
 drivers/usb/dwc2/hcd.c  | 69 +
 drivers/usb/dwc2/platform.c |  1 +
 3 files changed, 76 insertions(+)

-- 
2.7.4



[RFC][PATCH 2/2 v2] usb: dwc2: Add a quirk to allow speed negotiation for Hisilicon Hi6220

2016-10-19 Thread John Stultz
From: Chen Yu 

The Hi6220's usb controller is limited in that it does not
support "Split Transactions", so it does not support communicating
with low-speed and full-speed devices behind a high-speed hub.

Thus it requires a quirk so that we can manually drop the usb
speed when low/full-speed are attached, and bump back to high
speed when they are removed.

Cc: Wei Xu 
Cc: Guodong Xu 
Cc: Amit Pundir 
Cc: Rob Herring 
Cc: John Youn 
Cc: Douglas Anderson 
Cc: Greg Kroah-Hartman 
Cc: linux-...@vger.kernel.org
Signed-off-by: Chen Yu 
[jstultz: Reworked to simplify the patch, and made
 commit log to be more specific about the issue]
Signed-off-by: John Stultz 
---
v2:
* Fix build issue reported by kbuildbot
* Rework to avoid using new dts entry suggested by RobH
* Further tweaks from Chen Yu to try to address comments from
  John Youn
* Further simplified logic
---
 drivers/usb/dwc2/core.h |  6 +
 drivers/usb/dwc2/hcd.c  | 61 +
 drivers/usb/dwc2/platform.c |  1 +
 3 files changed, 68 insertions(+)

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 2a21a04..baa4f64 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -417,6 +417,11 @@ enum dwc2_ep0_state {
  * needed.
  * 0 - No (default)
  * 1 - Yes
+ * @change_speed_quirk: Change speed configuration to DWC2_SPEED_PARAM_FULL
+ *  while full speed device connect. And change speed
+ *  back to DWC2_SPEED_PARAM_HIGH while device is gone.
+ * 0 - No (default)
+ * 1 - Yes
  *
  * The following parameters may be specified when starting the module. These
  * parameters define how the DWC_otg controller should be configured. A
@@ -457,6 +462,7 @@ struct dwc2_core_params {
int uframe_sched;
int external_id_pin_ctl;
int hibernation;
+   int change_speed_quirk;
 };
 
 /**
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index b374e60..85121e2 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -4868,6 +4868,62 @@ static void _dwc2_hcd_clear_tt_buffer_complete(struct 
usb_hcd *hcd,
spin_unlock_irqrestore(>lock, flags);
 }
 
+/*
+ * HPRT0_SPD_HIGH_SPEED: high speed
+ * HPRT0_SPD_FULL_SPEED: full speed
+ */
+static void dwc2_change_bus_speed(struct usb_hcd *hcd, int speed)
+{
+   struct dwc2_hsotg *hsotg = dwc2_hcd_to_hsotg(hcd);
+
+   if (hsotg->core_params->speed == speed)
+   return;
+
+   hsotg->core_params->speed = speed;
+   queue_work(hsotg->wq_otg, >wf_otg);
+}
+
+static void dwc2_free_dev(struct usb_hcd *hcd, struct usb_device *udev)
+{
+   struct dwc2_hsotg *hsotg = dwc2_hcd_to_hsotg(hcd);
+
+   if (!hsotg->core_params->change_speed_quirk)
+   return;
+
+   /*
+* On removal, set speed to default high-speed.
+*/
+   if (udev->parent &&
+   udev->parent->speed > USB_SPEED_UNKNOWN &&
+   udev->parent->speed < USB_SPEED_HIGH) {
+   dev_info(hsotg->dev, "Set speed to default high-speed\n");
+   dwc2_change_bus_speed(hcd, HPRT0_SPD_HIGH_SPEED);
+   }
+}
+
+static int dwc2_reset_device(struct usb_hcd *hcd, struct usb_device *udev)
+{
+   struct dwc2_hsotg *hsotg = dwc2_hcd_to_hsotg(hcd);
+
+   if (!hsotg->core_params->change_speed_quirk)
+   return 0;
+
+   if (udev->speed == USB_SPEED_HIGH) {
+   dev_info(hsotg->dev, "Set speed to high-speed\n");
+   dwc2_change_bus_speed(hcd, HPRT0_SPD_HIGH_SPEED);
+   } else if ((udev->speed == USB_SPEED_FULL ||
+   udev->speed == USB_SPEED_LOW)) {
+   /*
+* Change speed setting to full-speed if there's
+* a full-speed or low-speed device plugged in.
+*/
+   dev_info(hsotg->dev, "Set speed to full-speed\n");
+   dwc2_change_bus_speed(hcd, HPRT0_SPD_FULL_SPEED);
+   }
+
+   return 0;
+}
+
 static struct hc_driver dwc2_hc_driver = {
.description = "dwc2_hsotg",
.product_desc = "DWC OTG Controller",
@@ -5023,6 +5079,11 @@ int dwc2_hcd_init(struct dwc2_hsotg *hsotg, int irq)
dev_warn(hsotg->dev, "can't set coherent DMA mask\n");
}
 
+   if (hsotg->core_params->change_speed_quirk) {
+   dwc2_hc_driver.free_dev = dwc2_free_dev;
+   dwc2_hc_driver.reset_device = dwc2_reset_device;
+   }
+
hcd = usb_create_hcd(_hc_driver, hsotg->dev, dev_name(hsotg->dev));
if (!hcd)
goto error1;
diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
index 8e1728b..17eb5fd 100644
--- a/drivers/usb/dwc2/platform.c
+++ b/drivers/usb/dwc2/platform.c
@@ -85,6 +85,7 @@ static const struct dwc2_core_params params_hi6220 = {

[RFC][PATCH 0/2 v2] dwc2 fixes for HiKey

2016-10-19 Thread John Stultz
I wanted to send out two patches we're using on the dwc2
driver in order to make HiKey function.

The first is seemingly just a bug fix we ran into when
changing modes while the bus was auto-suspended.

The second is a little more interesting as it works around
a limitation of the the device in that it can't handle split
transactions, so the usb speed but be lowered and raised
manually when full-speed devices come and go.

I wanted to send these out for further review and consideration.

Changes in v2:
* Fix build issue reported by kbuildbot
* Rework to avoid using new dts entry suggested by RobH
* Further tweaks from Chen Yu to try to address comments from
  John Youn
* Further simplified logic


Feedback would be greatly appreciated!

thanks
-john


Cc: Wei Xu 
Cc: Guodong Xu 
Cc: Amit Pundir 
Cc: Rob Herring 
Cc: John Youn 
Cc: Douglas Anderson 
Cc: Greg Kroah-Hartman 
Cc: linux-...@vger.kernel.org
Cc: Chen Yu 


Chen Yu (2):
  usb: dwc2: Force port resume on switching to device mode
  usb: dwc2: Add a quirk to allow speed negotiation for Hisilicon Hi6220

 drivers/usb/dwc2/core.h |  6 
 drivers/usb/dwc2/hcd.c  | 69 +
 drivers/usb/dwc2/platform.c |  1 +
 3 files changed, 76 insertions(+)

-- 
2.7.4



Re: [PATCH 1/2] phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips

2016-10-19 Thread Vivek Gautam
On Wed, Oct 19, 2016 at 4:13 PM, Vivek Gautam
 wrote:
> PHY transceiver driver for QUSB2 phy controller that provides
> HighSpeed functionality for DWC3 controller present on
> Qualcomm chipsets.
>
> This driver is based on phy-msm-qusb driver available in
> msm-4.4 kernel @codeaurora[1]
>
> [1] 
> https://source.codeaurora.org/quic/la/kernel/msm-4.4/log/?h=caf/3.18/msm-3.18
>
> Signed-off-by: Vivek Gautam 
> Cc: Kishon Vijay Abraham I 
> ---

[snip]

> +   qphy->ref_clk_src = devm_clk_get(dev, "ref_clk_src");
> +   if (IS_ERR(qphy->ref_clk_src)) {
> +   qphy->ref_clk_src = NULL;
> +   ret = PTR_ERR(qphy->ref_clk_src);

Pointed out by Kbuild test: ERROR: PTR_ERR applied after
initialization to constant on line

Will correct this in the next patch version.

Same for ref_clk and iface_clk.

> +   if (ret != -EPROBE_DEFER)
> +   dev_dbg(dev, "clk get failed for ref_clk_src\n");
> +   }
> +
> +   qphy->ref_clk = devm_clk_get(dev, "ref_clk");
> +   if (IS_ERR(qphy->ref_clk)) {
> +   qphy->ref_clk = NULL;
> +   ret = PTR_ERR(qphy->ref_clk);

[snip]


Thanks
Vivek


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/2] phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips

2016-10-19 Thread Vivek Gautam
On Wed, Oct 19, 2016 at 4:13 PM, Vivek Gautam
 wrote:
> PHY transceiver driver for QUSB2 phy controller that provides
> HighSpeed functionality for DWC3 controller present on
> Qualcomm chipsets.
>
> This driver is based on phy-msm-qusb driver available in
> msm-4.4 kernel @codeaurora[1]
>
> [1] 
> https://source.codeaurora.org/quic/la/kernel/msm-4.4/log/?h=caf/3.18/msm-3.18
>
> Signed-off-by: Vivek Gautam 
> Cc: Kishon Vijay Abraham I 
> ---

[snip]

> +   qphy->ref_clk_src = devm_clk_get(dev, "ref_clk_src");
> +   if (IS_ERR(qphy->ref_clk_src)) {
> +   qphy->ref_clk_src = NULL;
> +   ret = PTR_ERR(qphy->ref_clk_src);

Pointed out by Kbuild test: ERROR: PTR_ERR applied after
initialization to constant on line

Will correct this in the next patch version.

Same for ref_clk and iface_clk.

> +   if (ret != -EPROBE_DEFER)
> +   dev_dbg(dev, "clk get failed for ref_clk_src\n");
> +   }
> +
> +   qphy->ref_clk = devm_clk_get(dev, "ref_clk");
> +   if (IS_ERR(qphy->ref_clk)) {
> +   qphy->ref_clk = NULL;
> +   ret = PTR_ERR(qphy->ref_clk);

[snip]


Thanks
Vivek


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH] kbuild: provide include/asm/asm-prototypes.h for ARM

2016-10-19 Thread Nicholas Piggin
On Wed, 19 Oct 2016 16:32:00 +0100
Russell King - ARM Linux  wrote:

> On Wed, Oct 19, 2016 at 05:02:55PM +0200, Arnd Bergmann wrote:
> > On Wednesday, October 19, 2016 4:52:06 PM CEST Michal Marek wrote:  
> > > Dne 17.10.2016 v 14:26 Arnd Bergmann napsal(a):  
> > > > This adds an asm/asm-prototypes.h header for ARM to fix the
> > > > broken symbol versioning for symbols exported from assembler
> > > > files.
> > > > 
> > > > In addition to the header, we have to do these other small
> > > > changes:
> > > > 
> > > > - move the 'extern' declarations out of memset_io/memcpy_io
> > > >   to make them visible to the symbol version generator
> > > > - move the exports from bitops.h to {change,clear,set,...}bit.S
> > > > - move the exports from csumpartialgeneric.S into the files
> > > >   including it
> > > > 
> > > > I couldn't find the correct prototypes for the compiler builtins,
> > > > so I went with the fake 'void f(void)' prototypes that we had
> > > > before.
> > > > 
> > > > Signed-off-by: Arnd Bergmann   
> > > 
> > > Hi Arnd,
> > > 
> > > just to make sure I'm looking at the right code - is this based on the
> > > patch by Nick here: https://patchwork.kernel.org/patch/9377783/?
> > >   
> > 
> > (adding Russell to Cc, I missed him during my earlier mail, which
> > is now archived at https://lkml.org/lkml/2016/10/17/356)  
> 
> I'm not in favour of this.
> 
>+extern void mmioset(void *, unsigned int, size_t);
>+extern void mmiocpy(void *, const void *, size_t);
>+
> #ifndef __ARMBE__
> static inline void memset_io(volatile void __iomem *dst, unsigned c,
>size_t count)
> {
>-   extern void mmioset(void *, unsigned int, size_t);
>mmioset((void __force *)dst, c, count);
> }
> 
> The reason they're declared _within_ memset_io() is to prevent people
> from using them by hiding their declaration.  Moving them outside is
> an open invitation to stupid people starting to use them as an "oh it
> must be an official API".
> 
> We know this happens, there's been a long history of this kind of stupid
> in the ARM community, not only with cache flushing APIs, but also the
> DMA APIs.
> 
> The way the existing code is written is a completely valid way to hide
> declarations from outside the intended caller's scope.
> 
> We've been here many times, we've had many people doing this crap, so
> I'm now at the point of NAKing changes which result in an increased
> visibility to the rest of the kernel of symbols that should not be
> used by stupid driver authors.
> 
> Now, why do we have these extra functions when they're just aliased to
> memset()/memcpy() - to avoid GCC optimising them because it thinks that
> they're standard memset()/memcpy().
> 
> So overall this gets a NAK from me.

Fair point, what about leaving those as they are, and also adding
them to asm-prototypes.h protected with GENKSYMS ifdef? It's not
beautiful, but still better than armksyms.c before Al's patches (or
at least no worse).


> Now, it would have _ALSO_ been nice to have been at least COPIED on the
> original set of changes that caused the need for this change.  I wasn't.
> So I want to see the original set of changes reverted, because they're
> clearly causing breakage.  Let's revert them and then go through the
> proper process of maintainer review, rather than bypassing maintainers
> and screwing up architectures in the process.  There really is no
> excuse for this crap.

You may have a point about improvement of the process. I wasn't
involved in the original patches, but we did cc linux-arch when the
.S CRC issue became known.

However let's work on the assumption that they won't be reverted at this
stage, and try to come up with something to fix it that you're happy with.

Thanks,
Nick


Re: [PATCH] kbuild: provide include/asm/asm-prototypes.h for ARM

2016-10-19 Thread Nicholas Piggin
On Wed, 19 Oct 2016 16:32:00 +0100
Russell King - ARM Linux  wrote:

> On Wed, Oct 19, 2016 at 05:02:55PM +0200, Arnd Bergmann wrote:
> > On Wednesday, October 19, 2016 4:52:06 PM CEST Michal Marek wrote:  
> > > Dne 17.10.2016 v 14:26 Arnd Bergmann napsal(a):  
> > > > This adds an asm/asm-prototypes.h header for ARM to fix the
> > > > broken symbol versioning for symbols exported from assembler
> > > > files.
> > > > 
> > > > In addition to the header, we have to do these other small
> > > > changes:
> > > > 
> > > > - move the 'extern' declarations out of memset_io/memcpy_io
> > > >   to make them visible to the symbol version generator
> > > > - move the exports from bitops.h to {change,clear,set,...}bit.S
> > > > - move the exports from csumpartialgeneric.S into the files
> > > >   including it
> > > > 
> > > > I couldn't find the correct prototypes for the compiler builtins,
> > > > so I went with the fake 'void f(void)' prototypes that we had
> > > > before.
> > > > 
> > > > Signed-off-by: Arnd Bergmann   
> > > 
> > > Hi Arnd,
> > > 
> > > just to make sure I'm looking at the right code - is this based on the
> > > patch by Nick here: https://patchwork.kernel.org/patch/9377783/?
> > >   
> > 
> > (adding Russell to Cc, I missed him during my earlier mail, which
> > is now archived at https://lkml.org/lkml/2016/10/17/356)  
> 
> I'm not in favour of this.
> 
>+extern void mmioset(void *, unsigned int, size_t);
>+extern void mmiocpy(void *, const void *, size_t);
>+
> #ifndef __ARMBE__
> static inline void memset_io(volatile void __iomem *dst, unsigned c,
>size_t count)
> {
>-   extern void mmioset(void *, unsigned int, size_t);
>mmioset((void __force *)dst, c, count);
> }
> 
> The reason they're declared _within_ memset_io() is to prevent people
> from using them by hiding their declaration.  Moving them outside is
> an open invitation to stupid people starting to use them as an "oh it
> must be an official API".
> 
> We know this happens, there's been a long history of this kind of stupid
> in the ARM community, not only with cache flushing APIs, but also the
> DMA APIs.
> 
> The way the existing code is written is a completely valid way to hide
> declarations from outside the intended caller's scope.
> 
> We've been here many times, we've had many people doing this crap, so
> I'm now at the point of NAKing changes which result in an increased
> visibility to the rest of the kernel of symbols that should not be
> used by stupid driver authors.
> 
> Now, why do we have these extra functions when they're just aliased to
> memset()/memcpy() - to avoid GCC optimising them because it thinks that
> they're standard memset()/memcpy().
> 
> So overall this gets a NAK from me.

Fair point, what about leaving those as they are, and also adding
them to asm-prototypes.h protected with GENKSYMS ifdef? It's not
beautiful, but still better than armksyms.c before Al's patches (or
at least no worse).


> Now, it would have _ALSO_ been nice to have been at least COPIED on the
> original set of changes that caused the need for this change.  I wasn't.
> So I want to see the original set of changes reverted, because they're
> clearly causing breakage.  Let's revert them and then go through the
> proper process of maintainer review, rather than bypassing maintainers
> and screwing up architectures in the process.  There really is no
> excuse for this crap.

You may have a point about improvement of the process. I wasn't
involved in the original patches, but we did cc linux-arch when the
.S CRC issue became known.

However let's work on the assumption that they won't be reverted at this
stage, and try to come up with something to fix it that you're happy with.

Thanks,
Nick


4.9-rc1 boot regression, ambiguous bisect result

2016-10-19 Thread Dan Williams
Hi,

I am currently unable to boot a Yoga 900 with latest mainline, but 4.8 boots.

The symptom is a reboot before the video console is available.

I bisected to commit 816e76129ed5 "efi: Allow drivers to reserve boot
services forever".  However, that commit is known to be broken.  The
proposed fix, commit 92dc33501bfb "x86/efi: Round EFI memmap
reservations to EFI_PAGE_SIZE", also exhibits the reboot problem.

During the bisect some of the stopping points landed on commits that
caused the boot process to hang rather than cause a reboot.  The
commits that resulted in a hang are marked "git bisect skip" in this
log: https://gist.github.com/djbw/1b501daa98192a42ae848f03bb59c30e

I'll try treating those hangs as bad bisect results and re-run the
full bisect tomorrow.  In the meantime I wonder if the bisect log
implicates a better regression candidate?


4.9-rc1 boot regression, ambiguous bisect result

2016-10-19 Thread Dan Williams
Hi,

I am currently unable to boot a Yoga 900 with latest mainline, but 4.8 boots.

The symptom is a reboot before the video console is available.

I bisected to commit 816e76129ed5 "efi: Allow drivers to reserve boot
services forever".  However, that commit is known to be broken.  The
proposed fix, commit 92dc33501bfb "x86/efi: Round EFI memmap
reservations to EFI_PAGE_SIZE", also exhibits the reboot problem.

During the bisect some of the stopping points landed on commits that
caused the boot process to hang rather than cause a reboot.  The
commits that resulted in a hang are marked "git bisect skip" in this
log: https://gist.github.com/djbw/1b501daa98192a42ae848f03bb59c30e

I'll try treating those hangs as bad bisect results and re-run the
full bisect tomorrow.  In the meantime I wonder if the bisect log
implicates a better regression candidate?


Re: Regression in intel_powerclamp, due to cpu whitelist removal

2016-10-19 Thread Alok Kataria
On Wed, 2016-10-19 at 20:45 -0700, Jacob Pan wrote:
> On Tue, 18 Oct 2016 14:20:49 +
> Alok Kataria  wrote:
> 
> > Hi Jacob, Zhang, 
> > 
> > One of your recent commit "thermal/powerclamp: remove cpu
> > whitelist” [1], has caused a regression in the kernel. 
> > 
> > That commit changed powerclamp_probe from requiring all of the
> > following features:
> > 
> > X86_FEATURE_NONSTOP_TSC
> > X86_FEATURE_CONSTANT_TSC
> > X86_FEATURE_MWAIT
> > X86_FEATURE_ARAT   
> > 
> > to *any* of them.  The problem is clamp_thread still wants to use
> > mwait_idle_with_hints even if the CPU doesn't support it. 
> >
> Hi Alok,
> 
> You are right, it should be AND not OR.
>  
> +Eric who has a patch to address this.
> 
> https://patchwork.kernel.org/patch/9365005/

Thanks Jacob. 
Also, I don't see stable copied on that submission, shouldn't this be a
candidate for backporting to all affected kernel versions ?

Thanks,
Alok
> 
> Rui/Rafael,
> 
> Could you consider this as an urgent fix?
> 
> Jacob
> > This was reported by our users when running ubuntu 16.10
> > (4.8.0-22-generic) inside a VMware VM, though as mentioned above I
> > don’t think it is specific to our platform. We have seen kernel
> > panics due to invalid opcode because of this. Below is the stack
> > trace for your reference. 
> > 
> > [5.736416] invalid opcode:  [#1] SMP
> > [5.736455] Modules linked in: vmw_vsock_vmci_transport vsock
> > vmw_balloon intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
> > ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper
> > ablk_helper cryptd intel_rapl_perf input_leds joydev serio_raw
> > snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm snd_seq_midi
> > snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd
> > soundcore i2c_piix4 shpchp vmw_vmci nfit floppy(+) mac_hid parport_pc
> > ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid
> > ahci libahci e1000 mptspi mptscsih psmouse mptbase vmwgfx
> > scsi_transport_spi ttm drm_kms_helper syscopyarea sysfillrect
> > sysimgblt fb_sys_fops drm pata_acpi fjes [5.744370] CPU: 1 PID:
> > 912 Comm: kidle_inject/1 Not tainted 4.8.0-22-generic #24-Ubuntu
> > [5.744373] Hardware name: VMware, Inc. VMware Virtual
> > Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> > [5.744375] task: 9658f7a663c0 task.stack: 9658fa908000
> > [5.744378] RIP: 0010:[]  []
> > clamp_thread+0x2b8/0x5d0 [intel_powerclamp] [5.744380] RSP:
> > 0018:9658fa90be00  EFLAGS: 00010246 [5.744383] RAX:
> > 9658fa908008 RBX: fffee0a6 RCX: 
> > [5.744386] RDX:  RSI: 0246 RDI:
> > 0246 [5.744388] RBP: 9658fa90bec0 R08:
> > 9658fa908000 R09:  [5.744391] R10:
> > 0001cbf7 R11:  R12: 8db581a0
> > [5.744393] R13: 9658fa908000 R14:  R15:
> > 9658fa908000 [5.744396] FS:  ()
> > GS:9658fc64() knlGS: [5.744398] CS:
> > 0010 DS:  ES:  CR0: 80050033 [5.744401] CR2:
> > 7ffa6cc262e8 CR3: 3ab3b000 CR4: 001406e0
> > [5.744403] Stack: [5.744406]  0001
> > 9658f7a66dc0 9658fc659200 e878d638 [5.744409]
> > 0001 0002fc659200 0001 9658fa908008
> > [5.744411]   9658fc64fea8 fffee0a6
> > c05720a0 [5.744414] Call Trace: [5.744416]
> > [] ? pkg_state_counter+0xa0/0xa0 [intel_powerclamp]
> > [5.744419]  [] ?
> > powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> > [5.744421]  [] ?
> > powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> > [5.744424]  [] kthread+0xd8/0xf0
> > [5.744427]  [] ret_from_fork+0x1f/0x40
> > [5.744429]  [] ?
> > kthread_create_on_node+0x1e0/0x1e0 [5.744432] Code: cc e9 ba 00
> > 00 00 eb 19 0f 1f 00 0f ae f0 65 48 8b 04 25 04 69 01 00 0f ae b8 08
> > c0 ff ff 0f ae f0 31 d2 48 8b 44 24 38 48 89 d1 <0f> 01 c8 49 8b 45
> > 08 a8 08 75 0b b9 01 00 00 00 4c 89 f0 0f 01 [5.744434] RIP
> > [] clamp_thread+0x2b8/0x5d0 [intel_powerclamp]
> > [5.744437]  RSP  [5.70] invalid opcode:
> >  [#2] SMP [5.744452] ---[ end trace cf659c4076bf2804 ]---
> > 
> > Looking at the instruction at the RIP  shows that
> > the kernel attempted to execute “monitor” instruction. 
> > 
> >  8b8:   0f 01 c8monitor %rax,%rcx,%rdx
> >  8bb:   49 8b 45 08 mov0x8(%r13),%rax
> > 
> > To fix this, I think you should restore the explicit feature check
> > “if block” that was removed in the above mentioned commit. Can you
> > please look at this ?
> > 
> > Thanks,
> > Alok
> > 
> > 
> > [1] b721ca0d192754deccb89fb01c77e41e6fd91ad9
> > 

Re: Regression in intel_powerclamp, due to cpu whitelist removal

2016-10-19 Thread Alok Kataria
On Wed, 2016-10-19 at 20:45 -0700, Jacob Pan wrote:
> On Tue, 18 Oct 2016 14:20:49 +
> Alok Kataria  wrote:
> 
> > Hi Jacob, Zhang, 
> > 
> > One of your recent commit "thermal/powerclamp: remove cpu
> > whitelist” [1], has caused a regression in the kernel. 
> > 
> > That commit changed powerclamp_probe from requiring all of the
> > following features:
> > 
> > X86_FEATURE_NONSTOP_TSC
> > X86_FEATURE_CONSTANT_TSC
> > X86_FEATURE_MWAIT
> > X86_FEATURE_ARAT   
> > 
> > to *any* of them.  The problem is clamp_thread still wants to use
> > mwait_idle_with_hints even if the CPU doesn't support it. 
> >
> Hi Alok,
> 
> You are right, it should be AND not OR.
>  
> +Eric who has a patch to address this.
> 
> https://patchwork.kernel.org/patch/9365005/

Thanks Jacob. 
Also, I don't see stable copied on that submission, shouldn't this be a
candidate for backporting to all affected kernel versions ?

Thanks,
Alok
> 
> Rui/Rafael,
> 
> Could you consider this as an urgent fix?
> 
> Jacob
> > This was reported by our users when running ubuntu 16.10
> > (4.8.0-22-generic) inside a VMware VM, though as mentioned above I
> > don’t think it is specific to our platform. We have seen kernel
> > panics due to invalid opcode because of this. Below is the stack
> > trace for your reference. 
> > 
> > [5.736416] invalid opcode:  [#1] SMP
> > [5.736455] Modules linked in: vmw_vsock_vmci_transport vsock
> > vmw_balloon intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
> > ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper
> > ablk_helper cryptd intel_rapl_perf input_leds joydev serio_raw
> > snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm snd_seq_midi
> > snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd
> > soundcore i2c_piix4 shpchp vmw_vmci nfit floppy(+) mac_hid parport_pc
> > ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid
> > ahci libahci e1000 mptspi mptscsih psmouse mptbase vmwgfx
> > scsi_transport_spi ttm drm_kms_helper syscopyarea sysfillrect
> > sysimgblt fb_sys_fops drm pata_acpi fjes [5.744370] CPU: 1 PID:
> > 912 Comm: kidle_inject/1 Not tainted 4.8.0-22-generic #24-Ubuntu
> > [5.744373] Hardware name: VMware, Inc. VMware Virtual
> > Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> > [5.744375] task: 9658f7a663c0 task.stack: 9658fa908000
> > [5.744378] RIP: 0010:[]  []
> > clamp_thread+0x2b8/0x5d0 [intel_powerclamp] [5.744380] RSP:
> > 0018:9658fa90be00  EFLAGS: 00010246 [5.744383] RAX:
> > 9658fa908008 RBX: fffee0a6 RCX: 
> > [5.744386] RDX:  RSI: 0246 RDI:
> > 0246 [5.744388] RBP: 9658fa90bec0 R08:
> > 9658fa908000 R09:  [5.744391] R10:
> > 0001cbf7 R11:  R12: 8db581a0
> > [5.744393] R13: 9658fa908000 R14:  R15:
> > 9658fa908000 [5.744396] FS:  ()
> > GS:9658fc64() knlGS: [5.744398] CS:
> > 0010 DS:  ES:  CR0: 80050033 [5.744401] CR2:
> > 7ffa6cc262e8 CR3: 3ab3b000 CR4: 001406e0
> > [5.744403] Stack: [5.744406]  0001
> > 9658f7a66dc0 9658fc659200 e878d638 [5.744409]
> > 0001 0002fc659200 0001 9658fa908008
> > [5.744411]   9658fc64fea8 fffee0a6
> > c05720a0 [5.744414] Call Trace: [5.744416]
> > [] ? pkg_state_counter+0xa0/0xa0 [intel_powerclamp]
> > [5.744419]  [] ?
> > powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> > [5.744421]  [] ?
> > powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> > [5.744424]  [] kthread+0xd8/0xf0
> > [5.744427]  [] ret_from_fork+0x1f/0x40
> > [5.744429]  [] ?
> > kthread_create_on_node+0x1e0/0x1e0 [5.744432] Code: cc e9 ba 00
> > 00 00 eb 19 0f 1f 00 0f ae f0 65 48 8b 04 25 04 69 01 00 0f ae b8 08
> > c0 ff ff 0f ae f0 31 d2 48 8b 44 24 38 48 89 d1 <0f> 01 c8 49 8b 45
> > 08 a8 08 75 0b b9 01 00 00 00 4c 89 f0 0f 01 [5.744434] RIP
> > [] clamp_thread+0x2b8/0x5d0 [intel_powerclamp]
> > [5.744437]  RSP  [5.70] invalid opcode:
> >  [#2] SMP [5.744452] ---[ end trace cf659c4076bf2804 ]---
> > 
> > Looking at the instruction at the RIP  shows that
> > the kernel attempted to execute “monitor” instruction. 
> > 
> >  8b8:   0f 01 c8monitor %rax,%rcx,%rdx
> >  8bb:   49 8b 45 08 mov0x8(%r13),%rax
> > 
> > To fix this, I think you should restore the explicit feature check
> > “if block” that was removed in the above mentioned commit. Can you
> > please look at this ?
> > 
> > Thanks,
> > Alok
> > 
> > 
> > [1] b721ca0d192754deccb89fb01c77e41e6fd91ad9
> > 

Re: [PATCH v2 2/2] arm64: dts: uniphier: add CPU clocks and OPP tables for LD20 SoC

2016-10-19 Thread Masahiro Yamada
2016-10-20 12:55 GMT+09:00 Viresh Kumar :
> On 20-10-16, 10:15, Masahiro Yamada wrote:
>> + opp@6 {
>> + opp-hz = /bits/ 64 <67000>;
>> + clock-latency-ns = <300>;
>> + };
>> + opp@7 {
>> + opp-hz = /bits/ 64 <74000>;
>> + clock-latency-ns = <300>;
>> + };
>
> Forgot changing these ? Please make sure to update all of them.
>

Oops, I missed that.

Will fix soon.

Thanks!



-- 
Best Regards
Masahiro Yamada


Re: [PATCH v2 2/2] arm64: dts: uniphier: add CPU clocks and OPP tables for LD20 SoC

2016-10-19 Thread Masahiro Yamada
2016-10-20 12:55 GMT+09:00 Viresh Kumar :
> On 20-10-16, 10:15, Masahiro Yamada wrote:
>> + opp@6 {
>> + opp-hz = /bits/ 64 <67000>;
>> + clock-latency-ns = <300>;
>> + };
>> + opp@7 {
>> + opp-hz = /bits/ 64 <74000>;
>> + clock-latency-ns = <300>;
>> + };
>
> Forgot changing these ? Please make sure to update all of them.
>

Oops, I missed that.

Will fix soon.

Thanks!



-- 
Best Regards
Masahiro Yamada


Re: [PATCH v2 2/2] arm64: dts: uniphier: add CPU clocks and OPP tables for LD20 SoC

2016-10-19 Thread Viresh Kumar
On 20-10-16, 10:15, Masahiro Yamada wrote:
> + opp@6 {
> + opp-hz = /bits/ 64 <67000>;
> + clock-latency-ns = <300>;
> + };
> + opp@7 {
> + opp-hz = /bits/ 64 <74000>;
> + clock-latency-ns = <300>;
> + };

Forgot changing these ? Please make sure to update all of them.

-- 
viresh


Re: [PATCH v2 2/2] arm64: dts: uniphier: add CPU clocks and OPP tables for LD20 SoC

2016-10-19 Thread Viresh Kumar
On 20-10-16, 10:15, Masahiro Yamada wrote:
> + opp@6 {
> + opp-hz = /bits/ 64 <67000>;
> + clock-latency-ns = <300>;
> + };
> + opp@7 {
> + opp-hz = /bits/ 64 <74000>;
> + clock-latency-ns = <300>;
> + };

Forgot changing these ? Please make sure to update all of them.

-- 
viresh


Re: [GIT PULL] kbuild changes for v4.9-rc1

2016-10-19 Thread Nicholas Piggin
On Wed, 19 Oct 2016 16:38:14 +0200
Michal Marek  wrote:

> Dne 18.10.2016 v 03:34 Nicholas Piggin napsal(a):
> > We should probably just bring all these arch patches through the
> > kbuild tree.
> > 
> > I'm sorry for the breakage: I didn't realize it broke the build with
> > some configs, otherwise I would have given Michal a heads up before
> > his pull request, and worked to get this stuff in first.  
> 
> It breaks with some binutils versions only (and only with
> CONFIG_MODVERSIONS=y, of course).

Yeah this seems to be the issue, it apparently slipped past all the
automated builds. It seems like the existing CRC warnings in the tree
only trigger in rare circumstances too, so something could be a bit
fragile there.

Thanks,
Nick


Re: [GIT PULL] kbuild changes for v4.9-rc1

2016-10-19 Thread Nicholas Piggin
On Wed, 19 Oct 2016 16:38:14 +0200
Michal Marek  wrote:

> Dne 18.10.2016 v 03:34 Nicholas Piggin napsal(a):
> > We should probably just bring all these arch patches through the
> > kbuild tree.
> > 
> > I'm sorry for the breakage: I didn't realize it broke the build with
> > some configs, otherwise I would have given Michal a heads up before
> > his pull request, and worked to get this stuff in first.  
> 
> It breaks with some binutils versions only (and only with
> CONFIG_MODVERSIONS=y, of course).

Yeah this seems to be the issue, it apparently slipped past all the
automated builds. It seems like the existing CRC warnings in the tree
only trigger in rare circumstances too, so something could be a bit
fragile there.

Thanks,
Nick


[PATCH v2 3/8] drm/sun4i: tcon: Move SoC specific quirks to a DT matched data structure

2016-10-19 Thread Chen-Yu Tsai
We already have some differences between the 2 supported SoCs.
More will be added as we support other SoCs. To avoid bloating
the probe function with even more conditionals, move the quirks
to a separate data structure that's tied to the compatible string.

Signed-off-by: Chen-Yu Tsai 
---
 drivers/gpu/drm/sun4i/sun4i_tcon.c | 33 ++---
 drivers/gpu/drm/sun4i/sun4i_tcon.h | 11 +++
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.c 
b/drivers/gpu/drm/sun4i/sun4i_tcon.c
index cadacb517f95..7658f0337e0b 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tcon.c
+++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -62,7 +63,7 @@ void sun4i_tcon_channel_disable(struct sun4i_tcon *tcon, int 
channel)
return;
}
 
-   WARN_ON(!tcon->has_channel_1);
+   WARN_ON(!tcon->quirks->has_channel_1);
regmap_update_bits(tcon->regs, SUN4I_TCON1_CTL_REG,
   SUN4I_TCON1_CTL_TCON_ENABLE, 0);
clk_disable_unprepare(tcon->sclk1);
@@ -80,7 +81,7 @@ void sun4i_tcon_channel_enable(struct sun4i_tcon *tcon, int 
channel)
return;
}
 
-   WARN_ON(!tcon->has_channel_1);
+   WARN_ON(!tcon->quirks->has_channel_1);
regmap_update_bits(tcon->regs, SUN4I_TCON1_CTL_REG,
   SUN4I_TCON1_CTL_TCON_ENABLE,
   SUN4I_TCON1_CTL_TCON_ENABLE);
@@ -202,7 +203,7 @@ void sun4i_tcon1_mode_set(struct sun4i_tcon *tcon,
u8 clk_delay;
u32 val;
 
-   WARN_ON(!tcon->has_channel_1);
+   WARN_ON(!tcon->quirks->has_channel_1);
 
/* Adjust clock delay */
clk_delay = sun4i_tcon_get_clk_delay(mode, 1);
@@ -266,7 +267,7 @@ void sun4i_tcon1_mode_set(struct sun4i_tcon *tcon,
/*
 * FIXME: Undocumented bits
 */
-   if (tcon->has_mux)
+   if (tcon->quirks->has_unknown_mux)
regmap_write(tcon->regs, SUN4I_TCON_MUX_CTRL_REG, 1);
 }
 EXPORT_SYMBOL(sun4i_tcon1_mode_set);
@@ -327,7 +328,7 @@ static int sun4i_tcon_init_clocks(struct device *dev,
return PTR_ERR(tcon->sclk0);
}
 
-   if (tcon->has_channel_1) {
+   if (tcon->quirks->has_channel_1) {
tcon->sclk1 = devm_clk_get(dev, "tcon-ch1");
if (IS_ERR(tcon->sclk1)) {
dev_err(dev, "Couldn't get the TCON channel 1 clock\n");
@@ -487,14 +488,7 @@ static int sun4i_tcon_bind(struct device *dev, struct 
device *master,
drv->tcon = tcon;
tcon->drm = drm;
tcon->dev = dev;
-
-   if (of_device_is_compatible(dev->of_node, "allwinner,sun5i-a13-tcon")) {
-   tcon->has_mux = true;
-   tcon->has_channel_1 = true;
-   } else {
-   tcon->has_mux = false;
-   tcon->has_channel_1 = false;
-   }
+   tcon->quirks = of_device_get_match_data(dev);
 
tcon->lcd_rst = devm_reset_control_get(dev, "lcd");
if (IS_ERR(tcon->lcd_rst)) {
@@ -588,9 +582,18 @@ static int sun4i_tcon_remove(struct platform_device *pdev)
return 0;
 }
 
+static const struct sun4i_tcon_quirks sun5i_a13_quirks = {
+   .has_unknown_mux = true,
+   .has_channel_1  = true,
+};
+
+static const struct sun4i_tcon_quirks sun8i_a33_quirks = {
+   /* nothing is supported */
+};
+
 static const struct of_device_id sun4i_tcon_of_table[] = {
-   { .compatible = "allwinner,sun5i-a13-tcon" },
-   { .compatible = "allwinner,sun8i-a33-tcon" },
+   { .compatible = "allwinner,sun5i-a13-tcon", .data = _a13_quirks },
+   { .compatible = "allwinner,sun8i-a33-tcon", .data = _a33_quirks },
{ }
 };
 MODULE_DEVICE_TABLE(of, sun4i_tcon_of_table);
diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.h 
b/drivers/gpu/drm/sun4i/sun4i_tcon.h
index 12bd48925f4d..166064bafe2e 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tcon.h
+++ b/drivers/gpu/drm/sun4i/sun4i_tcon.h
@@ -142,6 +142,11 @@
 
 #define SUN4I_TCON_MAX_CHANNELS2
 
+struct sun4i_tcon_quirks {
+   boolhas_unknown_mux; /* sun5i has undocumented mux */
+   boolhas_channel_1;  /* a33 does not have channel 1 */
+};
+
 struct sun4i_tcon {
struct device   *dev;
struct drm_device   *drm;
@@ -160,12 +165,10 @@ struct sun4i_tcon {
/* Reset control */
struct reset_control*lcd_rst;
 
-   /* Platform adjustments */
-   boolhas_mux;
-
struct drm_panel*panel;
 
-   boolhas_channel_1;
+   /* Platform adjustments */
+   const struct sun4i_tcon_quirks  *quirks;
 };
 
 struct drm_bridge *sun4i_tcon_find_bridge(struct device_node *node);
-- 
2.9.3



[PATCH v2 5/8] drm/sun4i: Add compatible strings for A31/A31s display pipelines

2016-10-19 Thread Chen-Yu Tsai
The A31's display pipeline has 2 frontends, 2 backends, and 2 TCONs. It
also has new display enhancement blocks, such as the DRC (Dynamic Range
Controller), the DEU (Display Enhancement Unit), and the CMU (Color
Management Unit). It supports HDMI, MIPI DSI, and 2 LCD/LVDS channels.

The A31s display pipeline is almost the same, just without MIPI DSI.
Only the TCON seems to be different, due to the missing mux for MIPI
DSI.

Add compatible strings for both of them.

Signed-off-by: Chen-Yu Tsai 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt | 4 
 drivers/gpu/drm/sun4i/sun4i_backend.c | 1 +
 drivers/gpu/drm/sun4i/sun4i_drv.c | 3 +++
 3 files changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt 
b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
index 15fdca8909f2..b82c00449468 100644
--- a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
+++ b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
@@ -91,6 +91,7 @@ system.
 Required properties:
   - compatible: value must be one of:
 * allwinner,sun5i-a13-display-backend
+* allwinner,sun6i-a31-display-backend
 * allwinner,sun8i-a33-display-backend
   - reg: base address and size of the memory-mapped region.
   - clocks: phandles to the clocks feeding the frontend and backend
@@ -121,6 +122,7 @@ deinterlacing and color space conversion.
 Required properties:
   - compatible: value must be one of:
 * allwinner,sun5i-a13-display-frontend
+* allwinner,sun6i-a31-display-frontend
 * allwinner,sun8i-a33-display-frontend
   - reg: base address and size of the memory-mapped region.
   - interrupts: interrupt associated to this IP
@@ -146,6 +148,8 @@ extra node.
 Required properties:
   - compatible: value must be one of:
 * allwinner,sun5i-a13-display-engine
+* allwinner,sun6i-a31-display-engine
+* allwinner,sun6i-a31s-display-engine
 * allwinner,sun8i-a33-display-engine
 
   - allwinner,pipelines: list of phandle to the display engine
diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c 
b/drivers/gpu/drm/sun4i/sun4i_backend.c
index 32c0584e3c35..6e6c59a661b6 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -408,6 +408,7 @@ static int sun4i_backend_remove(struct platform_device 
*pdev)
 
 static const struct of_device_id sun4i_backend_of_table[] = {
{ .compatible = "allwinner,sun5i-a13-display-backend" },
+   { .compatible = "allwinner,sun6i-a31-display-backend" },
{ .compatible = "allwinner,sun8i-a33-display-backend" },
{ }
 };
diff --git a/drivers/gpu/drm/sun4i/sun4i_drv.c 
b/drivers/gpu/drm/sun4i/sun4i_drv.c
index a15c231fbd59..fa6568e1822a 100644
--- a/drivers/gpu/drm/sun4i/sun4i_drv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_drv.c
@@ -201,6 +201,7 @@ static const struct component_master_ops 
sun4i_drv_master_ops = {
 static bool sun4i_drv_node_is_frontend(struct device_node *node)
 {
return of_device_is_compatible(node, 
"allwinner,sun5i-a13-display-frontend") ||
+   of_device_is_compatible(node, 
"allwinner,sun6i-a31-display-frontend") ||
of_device_is_compatible(node, 
"allwinner,sun8i-a33-display-frontend");
 }
 
@@ -324,6 +325,8 @@ static int sun4i_drv_remove(struct platform_device *pdev)
 
 static const struct of_device_id sun4i_drv_of_table[] = {
{ .compatible = "allwinner,sun5i-a13-display-engine" },
+   { .compatible = "allwinner,sun6i-a31-display-engine" },
+   { .compatible = "allwinner,sun6i-a31s-display-engine" },
{ .compatible = "allwinner,sun8i-a33-display-engine" },
{ }
 };
-- 
2.9.3



[PATCH v2 2/8] drm/sun4i: sun6i-drc: Support DRC on A31 and A31s

2016-10-19 Thread Chen-Yu Tsai
The A31 and A31s also have the DRC as part of the display pipeline.
As we know virtually nothing about them, just add compatible strings
for both SoCs to the stub driver.

Signed-off-by: Chen-Yu Tsai 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt | 2 ++
 drivers/gpu/drm/sun4i/sun6i_drc.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt 
b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
index b95696d748c7..5368961cd727 100644
--- a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
+++ b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
@@ -64,6 +64,8 @@ adaptive backlight control.
 
 Required properties:
   - compatible: value must be one of:
+* allwinner,sun6i-a31-drc
+* allwinner,sun6i-a31s-drc
 * allwinner,sun8i-a33-drc
   - reg: base address and size of the memory-mapped region.
   - interrupts: interrupt associated to this IP
diff --git a/drivers/gpu/drm/sun4i/sun6i_drc.c 
b/drivers/gpu/drm/sun4i/sun6i_drc.c
index bf6d846d8132..6ef707c5a719 100644
--- a/drivers/gpu/drm/sun4i/sun6i_drc.c
+++ b/drivers/gpu/drm/sun4i/sun6i_drc.c
@@ -98,6 +98,8 @@ static int sun6i_drc_remove(struct platform_device *pdev)
 }
 
 static const struct of_device_id sun6i_drc_of_table[] = {
+   { .compatible = "allwinner,sun6i-a31-drc" },
+   { .compatible = "allwinner,sun6i-a31s-drc" },
{ .compatible = "allwinner,sun8i-a33-drc" },
{ }
 };
-- 
2.9.3



[PATCH v2 3/8] drm/sun4i: tcon: Move SoC specific quirks to a DT matched data structure

2016-10-19 Thread Chen-Yu Tsai
We already have some differences between the 2 supported SoCs.
More will be added as we support other SoCs. To avoid bloating
the probe function with even more conditionals, move the quirks
to a separate data structure that's tied to the compatible string.

Signed-off-by: Chen-Yu Tsai 
---
 drivers/gpu/drm/sun4i/sun4i_tcon.c | 33 ++---
 drivers/gpu/drm/sun4i/sun4i_tcon.h | 11 +++
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.c 
b/drivers/gpu/drm/sun4i/sun4i_tcon.c
index cadacb517f95..7658f0337e0b 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tcon.c
+++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -62,7 +63,7 @@ void sun4i_tcon_channel_disable(struct sun4i_tcon *tcon, int 
channel)
return;
}
 
-   WARN_ON(!tcon->has_channel_1);
+   WARN_ON(!tcon->quirks->has_channel_1);
regmap_update_bits(tcon->regs, SUN4I_TCON1_CTL_REG,
   SUN4I_TCON1_CTL_TCON_ENABLE, 0);
clk_disable_unprepare(tcon->sclk1);
@@ -80,7 +81,7 @@ void sun4i_tcon_channel_enable(struct sun4i_tcon *tcon, int 
channel)
return;
}
 
-   WARN_ON(!tcon->has_channel_1);
+   WARN_ON(!tcon->quirks->has_channel_1);
regmap_update_bits(tcon->regs, SUN4I_TCON1_CTL_REG,
   SUN4I_TCON1_CTL_TCON_ENABLE,
   SUN4I_TCON1_CTL_TCON_ENABLE);
@@ -202,7 +203,7 @@ void sun4i_tcon1_mode_set(struct sun4i_tcon *tcon,
u8 clk_delay;
u32 val;
 
-   WARN_ON(!tcon->has_channel_1);
+   WARN_ON(!tcon->quirks->has_channel_1);
 
/* Adjust clock delay */
clk_delay = sun4i_tcon_get_clk_delay(mode, 1);
@@ -266,7 +267,7 @@ void sun4i_tcon1_mode_set(struct sun4i_tcon *tcon,
/*
 * FIXME: Undocumented bits
 */
-   if (tcon->has_mux)
+   if (tcon->quirks->has_unknown_mux)
regmap_write(tcon->regs, SUN4I_TCON_MUX_CTRL_REG, 1);
 }
 EXPORT_SYMBOL(sun4i_tcon1_mode_set);
@@ -327,7 +328,7 @@ static int sun4i_tcon_init_clocks(struct device *dev,
return PTR_ERR(tcon->sclk0);
}
 
-   if (tcon->has_channel_1) {
+   if (tcon->quirks->has_channel_1) {
tcon->sclk1 = devm_clk_get(dev, "tcon-ch1");
if (IS_ERR(tcon->sclk1)) {
dev_err(dev, "Couldn't get the TCON channel 1 clock\n");
@@ -487,14 +488,7 @@ static int sun4i_tcon_bind(struct device *dev, struct 
device *master,
drv->tcon = tcon;
tcon->drm = drm;
tcon->dev = dev;
-
-   if (of_device_is_compatible(dev->of_node, "allwinner,sun5i-a13-tcon")) {
-   tcon->has_mux = true;
-   tcon->has_channel_1 = true;
-   } else {
-   tcon->has_mux = false;
-   tcon->has_channel_1 = false;
-   }
+   tcon->quirks = of_device_get_match_data(dev);
 
tcon->lcd_rst = devm_reset_control_get(dev, "lcd");
if (IS_ERR(tcon->lcd_rst)) {
@@ -588,9 +582,18 @@ static int sun4i_tcon_remove(struct platform_device *pdev)
return 0;
 }
 
+static const struct sun4i_tcon_quirks sun5i_a13_quirks = {
+   .has_unknown_mux = true,
+   .has_channel_1  = true,
+};
+
+static const struct sun4i_tcon_quirks sun8i_a33_quirks = {
+   /* nothing is supported */
+};
+
 static const struct of_device_id sun4i_tcon_of_table[] = {
-   { .compatible = "allwinner,sun5i-a13-tcon" },
-   { .compatible = "allwinner,sun8i-a33-tcon" },
+   { .compatible = "allwinner,sun5i-a13-tcon", .data = _a13_quirks },
+   { .compatible = "allwinner,sun8i-a33-tcon", .data = _a33_quirks },
{ }
 };
 MODULE_DEVICE_TABLE(of, sun4i_tcon_of_table);
diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.h 
b/drivers/gpu/drm/sun4i/sun4i_tcon.h
index 12bd48925f4d..166064bafe2e 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tcon.h
+++ b/drivers/gpu/drm/sun4i/sun4i_tcon.h
@@ -142,6 +142,11 @@
 
 #define SUN4I_TCON_MAX_CHANNELS2
 
+struct sun4i_tcon_quirks {
+   boolhas_unknown_mux; /* sun5i has undocumented mux */
+   boolhas_channel_1;  /* a33 does not have channel 1 */
+};
+
 struct sun4i_tcon {
struct device   *dev;
struct drm_device   *drm;
@@ -160,12 +165,10 @@ struct sun4i_tcon {
/* Reset control */
struct reset_control*lcd_rst;
 
-   /* Platform adjustments */
-   boolhas_mux;
-
struct drm_panel*panel;
 
-   boolhas_channel_1;
+   /* Platform adjustments */
+   const struct sun4i_tcon_quirks  *quirks;
 };
 
 struct drm_bridge *sun4i_tcon_find_bridge(struct device_node *node);
-- 
2.9.3



[PATCH v2 5/8] drm/sun4i: Add compatible strings for A31/A31s display pipelines

2016-10-19 Thread Chen-Yu Tsai
The A31's display pipeline has 2 frontends, 2 backends, and 2 TCONs. It
also has new display enhancement blocks, such as the DRC (Dynamic Range
Controller), the DEU (Display Enhancement Unit), and the CMU (Color
Management Unit). It supports HDMI, MIPI DSI, and 2 LCD/LVDS channels.

The A31s display pipeline is almost the same, just without MIPI DSI.
Only the TCON seems to be different, due to the missing mux for MIPI
DSI.

Add compatible strings for both of them.

Signed-off-by: Chen-Yu Tsai 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt | 4 
 drivers/gpu/drm/sun4i/sun4i_backend.c | 1 +
 drivers/gpu/drm/sun4i/sun4i_drv.c | 3 +++
 3 files changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt 
b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
index 15fdca8909f2..b82c00449468 100644
--- a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
+++ b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
@@ -91,6 +91,7 @@ system.
 Required properties:
   - compatible: value must be one of:
 * allwinner,sun5i-a13-display-backend
+* allwinner,sun6i-a31-display-backend
 * allwinner,sun8i-a33-display-backend
   - reg: base address and size of the memory-mapped region.
   - clocks: phandles to the clocks feeding the frontend and backend
@@ -121,6 +122,7 @@ deinterlacing and color space conversion.
 Required properties:
   - compatible: value must be one of:
 * allwinner,sun5i-a13-display-frontend
+* allwinner,sun6i-a31-display-frontend
 * allwinner,sun8i-a33-display-frontend
   - reg: base address and size of the memory-mapped region.
   - interrupts: interrupt associated to this IP
@@ -146,6 +148,8 @@ extra node.
 Required properties:
   - compatible: value must be one of:
 * allwinner,sun5i-a13-display-engine
+* allwinner,sun6i-a31-display-engine
+* allwinner,sun6i-a31s-display-engine
 * allwinner,sun8i-a33-display-engine
 
   - allwinner,pipelines: list of phandle to the display engine
diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c 
b/drivers/gpu/drm/sun4i/sun4i_backend.c
index 32c0584e3c35..6e6c59a661b6 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -408,6 +408,7 @@ static int sun4i_backend_remove(struct platform_device 
*pdev)
 
 static const struct of_device_id sun4i_backend_of_table[] = {
{ .compatible = "allwinner,sun5i-a13-display-backend" },
+   { .compatible = "allwinner,sun6i-a31-display-backend" },
{ .compatible = "allwinner,sun8i-a33-display-backend" },
{ }
 };
diff --git a/drivers/gpu/drm/sun4i/sun4i_drv.c 
b/drivers/gpu/drm/sun4i/sun4i_drv.c
index a15c231fbd59..fa6568e1822a 100644
--- a/drivers/gpu/drm/sun4i/sun4i_drv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_drv.c
@@ -201,6 +201,7 @@ static const struct component_master_ops 
sun4i_drv_master_ops = {
 static bool sun4i_drv_node_is_frontend(struct device_node *node)
 {
return of_device_is_compatible(node, 
"allwinner,sun5i-a13-display-frontend") ||
+   of_device_is_compatible(node, 
"allwinner,sun6i-a31-display-frontend") ||
of_device_is_compatible(node, 
"allwinner,sun8i-a33-display-frontend");
 }
 
@@ -324,6 +325,8 @@ static int sun4i_drv_remove(struct platform_device *pdev)
 
 static const struct of_device_id sun4i_drv_of_table[] = {
{ .compatible = "allwinner,sun5i-a13-display-engine" },
+   { .compatible = "allwinner,sun6i-a31-display-engine" },
+   { .compatible = "allwinner,sun6i-a31s-display-engine" },
{ .compatible = "allwinner,sun8i-a33-display-engine" },
{ }
 };
-- 
2.9.3



[PATCH v2 2/8] drm/sun4i: sun6i-drc: Support DRC on A31 and A31s

2016-10-19 Thread Chen-Yu Tsai
The A31 and A31s also have the DRC as part of the display pipeline.
As we know virtually nothing about them, just add compatible strings
for both SoCs to the stub driver.

Signed-off-by: Chen-Yu Tsai 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt | 2 ++
 drivers/gpu/drm/sun4i/sun6i_drc.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt 
b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
index b95696d748c7..5368961cd727 100644
--- a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
+++ b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
@@ -64,6 +64,8 @@ adaptive backlight control.
 
 Required properties:
   - compatible: value must be one of:
+* allwinner,sun6i-a31-drc
+* allwinner,sun6i-a31s-drc
 * allwinner,sun8i-a33-drc
   - reg: base address and size of the memory-mapped region.
   - interrupts: interrupt associated to this IP
diff --git a/drivers/gpu/drm/sun4i/sun6i_drc.c 
b/drivers/gpu/drm/sun4i/sun6i_drc.c
index bf6d846d8132..6ef707c5a719 100644
--- a/drivers/gpu/drm/sun4i/sun6i_drc.c
+++ b/drivers/gpu/drm/sun4i/sun6i_drc.c
@@ -98,6 +98,8 @@ static int sun6i_drc_remove(struct platform_device *pdev)
 }
 
 static const struct of_device_id sun6i_drc_of_table[] = {
+   { .compatible = "allwinner,sun6i-a31-drc" },
+   { .compatible = "allwinner,sun6i-a31s-drc" },
{ .compatible = "allwinner,sun8i-a33-drc" },
{ }
 };
-- 
2.9.3



[PATCH v2 0/8] drm/sun4i: Support first display pipeline on sun6i

2016-10-19 Thread Chen-Yu Tsai
Hi everyone,

This series adds support for the first display pipeline found on the
A31 and A31s SoCs, with output through the RGB LCD interface.

This has been tested on the Sinlinx SinA31s development board, with
its included 7" LCD panel, and the Merrii Hummingbird A31 development
board, with its dumb VGA DAC bridge.

Changes since v1:

  - Made quirk structures static

  - Dropped unused/unsupported quirks

  - Dropped dotclock max frequency quirk patch.
The check for maximum PLL clock will be moved to the clk driver.

  - Changed is_sun5i quirk to has_unknown_mux

  - Dropped SinA31s LCD patch

  - Added patch to support enable pin GPIO for dumb VGA DACs

  - Added patch to enable VGA output via dumb VGA DAC on Hummingbird
A31 board


Regards
ChenYu


Chen-Yu Tsai (8):
  drm/bridge: rgb-to-vga: Support an enable GPIO
  drm/sun4i: sun6i-drc: Support DRC on A31 and A31s
  drm/sun4i: tcon: Move SoC specific quirks to a DT matched data
structure
  drm/sun4i: Add compatible string for A31/A31s TCON (timing controller)
  drm/sun4i: Add compatible strings for A31/A31s display pipelines
  ARM: dts: sun6i: Add device nodes for first display pipeline
  ARM: dts: sun6i: Add A31 LCD0 RGB888 pins
  ARM: dts: sun6i: hummingbird-a31: Enable display output through VGA
bridge

 .../bindings/display/bridge/dumb-vga-dac.txt   |   2 +
 .../bindings/display/sunxi/sun4i-drm.txt   |  10 +-
 arch/arm/boot/dts/sun6i-a31-hummingbird.dts|  56 +++
 arch/arm/boot/dts/sun6i-a31.dtsi   | 165 +
 arch/arm/boot/dts/sun6i-a31s.dtsi  |   8 +
 drivers/gpu/drm/bridge/dumb-vga-dac.c  |  28 
 drivers/gpu/drm/sun4i/sun4i_backend.c  |   1 +
 drivers/gpu/drm/sun4i/sun4i_drv.c  |   5 +
 drivers/gpu/drm/sun4i/sun4i_tcon.c |  43 --
 drivers/gpu/drm/sun4i/sun4i_tcon.h |  11 +-
 drivers/gpu/drm/sun4i/sun6i_drc.c  |   2 +
 11 files changed, 311 insertions(+), 20 deletions(-)

-- 
2.9.3



[PATCH v2 4/8] drm/sun4i: Add compatible string for A31/A31s TCON (timing controller)

2016-10-19 Thread Chen-Yu Tsai
The A31 TCON has mux controls for how TCON outputs are routed to the
HDMI and MIPI DSI blocks.

Since the A31s does not have MIPI DSI, it only has a mux for the HDMI
controller input.

This patch only adds support for the compatible strings. Actual support
for the mux controls should be added with HDMI and MIPI DSI support.

Signed-off-by: Chen-Yu Tsai 
---
 Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt |  4 +++-
 drivers/gpu/drm/sun4i/sun4i_drv.c |  2 ++
 drivers/gpu/drm/sun4i/sun4i_tcon.c| 10 ++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt 
b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
index 5368961cd727..15fdca8909f2 100644
--- a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
+++ b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
@@ -28,6 +28,8 @@ The TCON acts as a timing controller for RGB, LVDS and TV 
interfaces.
 Required properties:
  - compatible: value must be either:
* allwinner,sun5i-a13-tcon
+   * allwinner,sun6i-a31-tcon
+   * allwinner,sun6i-a31s-tcon
* allwinner,sun8i-a33-tcon
  - reg: base address and size of memory-mapped region
  - interrupts: interrupt associated to this IP
@@ -50,7 +52,7 @@ Required properties:
   second the block connected to the TCON channel 1 (usually the TV
   encoder)
 
-On the A13, there is one more clock required:
+On SoCs other than the A33, there is one more clock required:
- 'tcon-ch1': The clock driving the TCON channel 1
 
 DRC
diff --git a/drivers/gpu/drm/sun4i/sun4i_drv.c 
b/drivers/gpu/drm/sun4i/sun4i_drv.c
index 0da9862ad8ed..a15c231fbd59 100644
--- a/drivers/gpu/drm/sun4i/sun4i_drv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_drv.c
@@ -207,6 +207,8 @@ static bool sun4i_drv_node_is_frontend(struct device_node 
*node)
 static bool sun4i_drv_node_is_tcon(struct device_node *node)
 {
return of_device_is_compatible(node, "allwinner,sun5i-a13-tcon") ||
+   of_device_is_compatible(node, "allwinner,sun6i-a31-tcon") ||
+   of_device_is_compatible(node, "allwinner,sun6i-a31s-tcon") ||
of_device_is_compatible(node, "allwinner,sun8i-a33-tcon");
 }
 
diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.c 
b/drivers/gpu/drm/sun4i/sun4i_tcon.c
index 7658f0337e0b..c6afb2448655 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tcon.c
+++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c
@@ -587,12 +587,22 @@ static const struct sun4i_tcon_quirks sun5i_a13_quirks = {
.has_channel_1  = true,
 };
 
+static const struct sun4i_tcon_quirks sun6i_a31_quirks = {
+   .has_channel_1  = true,
+};
+
+static const struct sun4i_tcon_quirks sun6i_a31s_quirks = {
+   .has_channel_1  = true,
+};
+
 static const struct sun4i_tcon_quirks sun8i_a33_quirks = {
/* nothing is supported */
 };
 
 static const struct of_device_id sun4i_tcon_of_table[] = {
{ .compatible = "allwinner,sun5i-a13-tcon", .data = _a13_quirks },
+   { .compatible = "allwinner,sun6i-a31-tcon", .data = _a31_quirks },
+   { .compatible = "allwinner,sun6i-a31s-tcon", .data = _a31s_quirks 
},
{ .compatible = "allwinner,sun8i-a33-tcon", .data = _a33_quirks },
{ }
 };
-- 
2.9.3



Re: [PATCH 5/6] clk: pxa: transfer CPU clock setting from pxa2xx-cpufreq

2016-10-19 Thread Viresh Kumar
On 19-10-16, 13:51, Stephen Boyd wrote:
> On 10/19, Robert Jarzmik wrote:
> > Stephen Boyd  writes:
> > 
> > > On 10/18, Robert Jarzmik wrote:
> > >> Robert Jarzmik  writes:
> > >> Hi Michael and Stephen,
> > >> 
> > >> I'm planing on sending a v2 next week with minor corrections, mostly in 
> > >> the data
> > >> tables (pxa25x_freqs and pxa27x_freqs), as testing prooved some values 
> > >> were wrong.
> > >> 
> > >> If you want me modify this serie, will you have time to review for next 
> > >> week or
> > >> should I delay the v2 posting ?
> > >> 
> > >
> > > No need to delay. clk patches look fine with a quick glance. It
> > > would be really neat if we could make cpufreq-dt work without DT.
> > > What's blocking that? OPP tables?
> > 
> > Heu I'm not the author of cpufreq-dt, so I'm not the best to answer.
> > To answer the question "without DT", it depends if you mean "with ACPI" or 
> > "with
> > platform_data" or something else.
> 
> I mean platform_data mostly. Do you use ACPI with the clk driver?

I would like to see sample code for that, as I am not sure what all changes
would be required in cpufreq-dt in that case. Yes it is more dependent on the
OPP framework than DT right now.

My first impression is that it should be doable..

-- 
viresh


[PATCH v2 0/8] drm/sun4i: Support first display pipeline on sun6i

2016-10-19 Thread Chen-Yu Tsai
Hi everyone,

This series adds support for the first display pipeline found on the
A31 and A31s SoCs, with output through the RGB LCD interface.

This has been tested on the Sinlinx SinA31s development board, with
its included 7" LCD panel, and the Merrii Hummingbird A31 development
board, with its dumb VGA DAC bridge.

Changes since v1:

  - Made quirk structures static

  - Dropped unused/unsupported quirks

  - Dropped dotclock max frequency quirk patch.
The check for maximum PLL clock will be moved to the clk driver.

  - Changed is_sun5i quirk to has_unknown_mux

  - Dropped SinA31s LCD patch

  - Added patch to support enable pin GPIO for dumb VGA DACs

  - Added patch to enable VGA output via dumb VGA DAC on Hummingbird
A31 board


Regards
ChenYu


Chen-Yu Tsai (8):
  drm/bridge: rgb-to-vga: Support an enable GPIO
  drm/sun4i: sun6i-drc: Support DRC on A31 and A31s
  drm/sun4i: tcon: Move SoC specific quirks to a DT matched data
structure
  drm/sun4i: Add compatible string for A31/A31s TCON (timing controller)
  drm/sun4i: Add compatible strings for A31/A31s display pipelines
  ARM: dts: sun6i: Add device nodes for first display pipeline
  ARM: dts: sun6i: Add A31 LCD0 RGB888 pins
  ARM: dts: sun6i: hummingbird-a31: Enable display output through VGA
bridge

 .../bindings/display/bridge/dumb-vga-dac.txt   |   2 +
 .../bindings/display/sunxi/sun4i-drm.txt   |  10 +-
 arch/arm/boot/dts/sun6i-a31-hummingbird.dts|  56 +++
 arch/arm/boot/dts/sun6i-a31.dtsi   | 165 +
 arch/arm/boot/dts/sun6i-a31s.dtsi  |   8 +
 drivers/gpu/drm/bridge/dumb-vga-dac.c  |  28 
 drivers/gpu/drm/sun4i/sun4i_backend.c  |   1 +
 drivers/gpu/drm/sun4i/sun4i_drv.c  |   5 +
 drivers/gpu/drm/sun4i/sun4i_tcon.c |  43 --
 drivers/gpu/drm/sun4i/sun4i_tcon.h |  11 +-
 drivers/gpu/drm/sun4i/sun6i_drc.c  |   2 +
 11 files changed, 311 insertions(+), 20 deletions(-)

-- 
2.9.3



[PATCH v2 4/8] drm/sun4i: Add compatible string for A31/A31s TCON (timing controller)

2016-10-19 Thread Chen-Yu Tsai
The A31 TCON has mux controls for how TCON outputs are routed to the
HDMI and MIPI DSI blocks.

Since the A31s does not have MIPI DSI, it only has a mux for the HDMI
controller input.

This patch only adds support for the compatible strings. Actual support
for the mux controls should be added with HDMI and MIPI DSI support.

Signed-off-by: Chen-Yu Tsai 
---
 Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt |  4 +++-
 drivers/gpu/drm/sun4i/sun4i_drv.c |  2 ++
 drivers/gpu/drm/sun4i/sun4i_tcon.c| 10 ++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt 
b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
index 5368961cd727..15fdca8909f2 100644
--- a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
+++ b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
@@ -28,6 +28,8 @@ The TCON acts as a timing controller for RGB, LVDS and TV 
interfaces.
 Required properties:
  - compatible: value must be either:
* allwinner,sun5i-a13-tcon
+   * allwinner,sun6i-a31-tcon
+   * allwinner,sun6i-a31s-tcon
* allwinner,sun8i-a33-tcon
  - reg: base address and size of memory-mapped region
  - interrupts: interrupt associated to this IP
@@ -50,7 +52,7 @@ Required properties:
   second the block connected to the TCON channel 1 (usually the TV
   encoder)
 
-On the A13, there is one more clock required:
+On SoCs other than the A33, there is one more clock required:
- 'tcon-ch1': The clock driving the TCON channel 1
 
 DRC
diff --git a/drivers/gpu/drm/sun4i/sun4i_drv.c 
b/drivers/gpu/drm/sun4i/sun4i_drv.c
index 0da9862ad8ed..a15c231fbd59 100644
--- a/drivers/gpu/drm/sun4i/sun4i_drv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_drv.c
@@ -207,6 +207,8 @@ static bool sun4i_drv_node_is_frontend(struct device_node 
*node)
 static bool sun4i_drv_node_is_tcon(struct device_node *node)
 {
return of_device_is_compatible(node, "allwinner,sun5i-a13-tcon") ||
+   of_device_is_compatible(node, "allwinner,sun6i-a31-tcon") ||
+   of_device_is_compatible(node, "allwinner,sun6i-a31s-tcon") ||
of_device_is_compatible(node, "allwinner,sun8i-a33-tcon");
 }
 
diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.c 
b/drivers/gpu/drm/sun4i/sun4i_tcon.c
index 7658f0337e0b..c6afb2448655 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tcon.c
+++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c
@@ -587,12 +587,22 @@ static const struct sun4i_tcon_quirks sun5i_a13_quirks = {
.has_channel_1  = true,
 };
 
+static const struct sun4i_tcon_quirks sun6i_a31_quirks = {
+   .has_channel_1  = true,
+};
+
+static const struct sun4i_tcon_quirks sun6i_a31s_quirks = {
+   .has_channel_1  = true,
+};
+
 static const struct sun4i_tcon_quirks sun8i_a33_quirks = {
/* nothing is supported */
 };
 
 static const struct of_device_id sun4i_tcon_of_table[] = {
{ .compatible = "allwinner,sun5i-a13-tcon", .data = _a13_quirks },
+   { .compatible = "allwinner,sun6i-a31-tcon", .data = _a31_quirks },
+   { .compatible = "allwinner,sun6i-a31s-tcon", .data = _a31s_quirks 
},
{ .compatible = "allwinner,sun8i-a33-tcon", .data = _a33_quirks },
{ }
 };
-- 
2.9.3



Re: [PATCH 5/6] clk: pxa: transfer CPU clock setting from pxa2xx-cpufreq

2016-10-19 Thread Viresh Kumar
On 19-10-16, 13:51, Stephen Boyd wrote:
> On 10/19, Robert Jarzmik wrote:
> > Stephen Boyd  writes:
> > 
> > > On 10/18, Robert Jarzmik wrote:
> > >> Robert Jarzmik  writes:
> > >> Hi Michael and Stephen,
> > >> 
> > >> I'm planing on sending a v2 next week with minor corrections, mostly in 
> > >> the data
> > >> tables (pxa25x_freqs and pxa27x_freqs), as testing prooved some values 
> > >> were wrong.
> > >> 
> > >> If you want me modify this serie, will you have time to review for next 
> > >> week or
> > >> should I delay the v2 posting ?
> > >> 
> > >
> > > No need to delay. clk patches look fine with a quick glance. It
> > > would be really neat if we could make cpufreq-dt work without DT.
> > > What's blocking that? OPP tables?
> > 
> > Heu I'm not the author of cpufreq-dt, so I'm not the best to answer.
> > To answer the question "without DT", it depends if you mean "with ACPI" or 
> > "with
> > platform_data" or something else.
> 
> I mean platform_data mostly. Do you use ACPI with the clk driver?

I would like to see sample code for that, as I am not sure what all changes
would be required in cpufreq-dt in that case. Yes it is more dependent on the
OPP framework than DT right now.

My first impression is that it should be doable..

-- 
viresh


[PATCH v2 7/8] ARM: dts: sun6i: Add A31 LCD0 RGB888 pins

2016-10-19 Thread Chen-Yu Tsai
The LCD0 controller on the A31 can do RGB output up to 8 bits per
channel. Add the pins for RGB888 output.

Signed-off-by: Chen-Yu Tsai 
---
 arch/arm/boot/dts/sun6i-a31.dtsi | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/arm/boot/dts/sun6i-a31.dtsi b/arch/arm/boot/dts/sun6i-a31.dtsi
index 4d2c7786b92a..2e8bf93dcfb2 100644
--- a/arch/arm/boot/dts/sun6i-a31.dtsi
+++ b/arch/arm/boot/dts/sun6i-a31.dtsi
@@ -540,6 +540,19 @@
allwinner,pull = ;
};
 
+   lcd0_rgb888_pins: lcd0_rgb888 {
+   allwinner,pins = "PD0", "PD1", "PD2", "PD3",
+"PD4", "PD5", "PD6", "PD7",
+"PD8", "PD9", "PD10", "PD11",
+"PD12", "PD13", "PD14", "PD15",
+"PD16", "PD17", "PD18", "PD19",
+"PD20", "PD21", "PD22", "PD23",
+"PD24", "PD25", "PD26", "PD27";
+   allwinner,function = "lcd0";
+   allwinner,drive = ;
+   allwinner,pull = ;
+   };
+
mmc0_pins_a: mmc0@0 {
allwinner,pins = "PF0", "PF1", "PF2",
 "PF3", "PF4", "PF5";
-- 
2.9.3



[PATCH v2 8/8] ARM: dts: sun6i: hummingbird-a31: Enable display output through VGA bridge

2016-10-19 Thread Chen-Yu Tsai
The Hummingbird A31 board has a RGB-to-VGA bridge which converts RGB
output from the LCD interface to VGA signals.

Enable this part of the display pipeline.

Signed-off-by: Chen-Yu Tsai 
---
 arch/arm/boot/dts/sun6i-a31-hummingbird.dts | 56 +
 1 file changed, 56 insertions(+)

diff --git a/arch/arm/boot/dts/sun6i-a31-hummingbird.dts 
b/arch/arm/boot/dts/sun6i-a31-hummingbird.dts
index 9a74637f677f..05a49b2147f1 100644
--- a/arch/arm/boot/dts/sun6i-a31-hummingbird.dts
+++ b/arch/arm/boot/dts/sun6i-a31-hummingbird.dts
@@ -63,6 +63,49 @@
stdout-path = "serial0:115200n8";
};
 
+   bridge {
+   compatible = "dumb-vga-dac";
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0>;
+
+   vga_bridge_in: endpoint@0 {
+   reg = <0>;
+   remote-endpoint = <_out_vga>;
+   };
+   };
+
+   port@1 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <1>;
+
+   vga_bridge_out: endpoint@0 {
+   reg = <0>;
+   remote-endpoint = <_con_in>;
+   };
+   };
+   };
+   };
+
+   vga {
+   compatible = "vga-connector";
+
+   port {
+   vga_con_in: endpoint {
+   remote-endpoint = <_bridge_out>;
+   };
+   };
+   };
+
wifi_pwrseq: wifi_pwrseq {
compatible = "mmc-pwrseq-simple";
reset-gpios = < 6 10 GPIO_ACTIVE_LOW>; /* PG10 */
@@ -245,6 +288,19 @@
status = "okay";
 };
 
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_rgb888_pins>;
+   status = "okay";
+};
+
+_out {
+   tcon0_out_vga: endpoint@0 {
+   reg = <0>;
+   remote-endpoint = <_bridge_in>;
+   };
+};
+
  {
pinctrl-names = "default";
pinctrl-0 = <_pins_a>;
-- 
2.9.3



[PATCH v2 6/8] ARM: dts: sun6i: Add device nodes for first display pipeline

2016-10-19 Thread Chen-Yu Tsai
The A31 has 2 parallel display pipelines, which can be intermixed.
However the driver currently only supports one of them.

Signed-off-by: Chen-Yu Tsai 
---
 arch/arm/boot/dts/sun6i-a31.dtsi  | 152 ++
 arch/arm/boot/dts/sun6i-a31s.dtsi |   8 ++
 2 files changed, 160 insertions(+)

diff --git a/arch/arm/boot/dts/sun6i-a31.dtsi b/arch/arm/boot/dts/sun6i-a31.dtsi
index c1b891e75f18..4d2c7786b92a 100644
--- a/arch/arm/boot/dts/sun6i-a31.dtsi
+++ b/arch/arm/boot/dts/sun6i-a31.dtsi
@@ -231,6 +231,11 @@
};
};
 
+   de: display-engine {
+   compatible = "allwinner,sun6i-a31-display-engine";
+   allwinner,pipelines = <>;
+   };
+
soc@01c0 {
compatible = "simple-bus";
#address-cells = <1>;
@@ -246,6 +251,44 @@
#dma-cells = <1>;
};
 
+   tcon0: lcd-controller@01c0c000 {
+   compatible = "allwinner,sun6i-a31-tcon";
+   reg = <0x01c0c000 0x1000>;
+   interrupts = ;
+   resets = < RST_AHB1_LCD0>;
+   reset-names = "lcd";
+   clocks = < CLK_AHB1_LCD0>,
+< CLK_LCD0_CH0>,
+< CLK_LCD0_CH1>;
+   clock-names = "ahb",
+ "tcon-ch0",
+ "tcon-ch1";
+   clock-output-names = "tcon0-pixel-clock";
+   status = "disabled";
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   tcon0_in: port@0 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0>;
+
+   tcon0_in_drc0: endpoint@0 {
+   reg = <0>;
+   remote-endpoint = 
<_out_tcon0>;
+   };
+   };
+
+   tcon0_out: port@1 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <1>;
+   };
+   };
+   };
+
mmc0: mmc@01c0f000 {
compatible = "allwinner,sun7i-a20-mmc";
reg = <0x01c0f000 0x1000>;
@@ -799,6 +842,115 @@
interrupts = ;
};
 
+   fe0: display-frontend@01e0 {
+   compatible = "allwinner,sun6i-a31-display-frontend";
+   reg = <0x01e0 0x2>;
+   interrupts = ;
+   clocks = < CLK_AHB1_FE0>, < CLK_FE0>,
+< CLK_DRAM_FE0>;
+   clock-names = "ahb", "mod",
+ "ram";
+   resets = < RST_AHB1_FE0>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   fe0_out: port@1 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <1>;
+
+   fe0_out_be0: endpoint@0 {
+   reg = <0>;
+   remote-endpoint = <_in_fe0>;
+   };
+   };
+   };
+   };
+
+   be0: display-backend@01e6 {
+   compatible = "allwinner,sun6i-a31-display-backend";
+   reg = <0x01e6 0x1>;
+   interrupts = ;
+   clocks = < CLK_AHB1_BE0>, < CLK_BE0>,
+< CLK_DRAM_BE0>;
+   clock-names = "ahb", "mod",
+ "ram";
+   resets = < RST_AHB1_BE0>;
+
+   assigned-clocks = < CLK_BE0>;
+   assigned-clock-rates = <3>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   be0_in: port@0 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0>;
+
+   

[PATCH v2 7/8] ARM: dts: sun6i: Add A31 LCD0 RGB888 pins

2016-10-19 Thread Chen-Yu Tsai
The LCD0 controller on the A31 can do RGB output up to 8 bits per
channel. Add the pins for RGB888 output.

Signed-off-by: Chen-Yu Tsai 
---
 arch/arm/boot/dts/sun6i-a31.dtsi | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/arm/boot/dts/sun6i-a31.dtsi b/arch/arm/boot/dts/sun6i-a31.dtsi
index 4d2c7786b92a..2e8bf93dcfb2 100644
--- a/arch/arm/boot/dts/sun6i-a31.dtsi
+++ b/arch/arm/boot/dts/sun6i-a31.dtsi
@@ -540,6 +540,19 @@
allwinner,pull = ;
};
 
+   lcd0_rgb888_pins: lcd0_rgb888 {
+   allwinner,pins = "PD0", "PD1", "PD2", "PD3",
+"PD4", "PD5", "PD6", "PD7",
+"PD8", "PD9", "PD10", "PD11",
+"PD12", "PD13", "PD14", "PD15",
+"PD16", "PD17", "PD18", "PD19",
+"PD20", "PD21", "PD22", "PD23",
+"PD24", "PD25", "PD26", "PD27";
+   allwinner,function = "lcd0";
+   allwinner,drive = ;
+   allwinner,pull = ;
+   };
+
mmc0_pins_a: mmc0@0 {
allwinner,pins = "PF0", "PF1", "PF2",
 "PF3", "PF4", "PF5";
-- 
2.9.3



[PATCH v2 8/8] ARM: dts: sun6i: hummingbird-a31: Enable display output through VGA bridge

2016-10-19 Thread Chen-Yu Tsai
The Hummingbird A31 board has a RGB-to-VGA bridge which converts RGB
output from the LCD interface to VGA signals.

Enable this part of the display pipeline.

Signed-off-by: Chen-Yu Tsai 
---
 arch/arm/boot/dts/sun6i-a31-hummingbird.dts | 56 +
 1 file changed, 56 insertions(+)

diff --git a/arch/arm/boot/dts/sun6i-a31-hummingbird.dts 
b/arch/arm/boot/dts/sun6i-a31-hummingbird.dts
index 9a74637f677f..05a49b2147f1 100644
--- a/arch/arm/boot/dts/sun6i-a31-hummingbird.dts
+++ b/arch/arm/boot/dts/sun6i-a31-hummingbird.dts
@@ -63,6 +63,49 @@
stdout-path = "serial0:115200n8";
};
 
+   bridge {
+   compatible = "dumb-vga-dac";
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0>;
+
+   vga_bridge_in: endpoint@0 {
+   reg = <0>;
+   remote-endpoint = <_out_vga>;
+   };
+   };
+
+   port@1 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <1>;
+
+   vga_bridge_out: endpoint@0 {
+   reg = <0>;
+   remote-endpoint = <_con_in>;
+   };
+   };
+   };
+   };
+
+   vga {
+   compatible = "vga-connector";
+
+   port {
+   vga_con_in: endpoint {
+   remote-endpoint = <_bridge_out>;
+   };
+   };
+   };
+
wifi_pwrseq: wifi_pwrseq {
compatible = "mmc-pwrseq-simple";
reset-gpios = < 6 10 GPIO_ACTIVE_LOW>; /* PG10 */
@@ -245,6 +288,19 @@
status = "okay";
 };
 
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_rgb888_pins>;
+   status = "okay";
+};
+
+_out {
+   tcon0_out_vga: endpoint@0 {
+   reg = <0>;
+   remote-endpoint = <_bridge_in>;
+   };
+};
+
  {
pinctrl-names = "default";
pinctrl-0 = <_pins_a>;
-- 
2.9.3



[PATCH v2 6/8] ARM: dts: sun6i: Add device nodes for first display pipeline

2016-10-19 Thread Chen-Yu Tsai
The A31 has 2 parallel display pipelines, which can be intermixed.
However the driver currently only supports one of them.

Signed-off-by: Chen-Yu Tsai 
---
 arch/arm/boot/dts/sun6i-a31.dtsi  | 152 ++
 arch/arm/boot/dts/sun6i-a31s.dtsi |   8 ++
 2 files changed, 160 insertions(+)

diff --git a/arch/arm/boot/dts/sun6i-a31.dtsi b/arch/arm/boot/dts/sun6i-a31.dtsi
index c1b891e75f18..4d2c7786b92a 100644
--- a/arch/arm/boot/dts/sun6i-a31.dtsi
+++ b/arch/arm/boot/dts/sun6i-a31.dtsi
@@ -231,6 +231,11 @@
};
};
 
+   de: display-engine {
+   compatible = "allwinner,sun6i-a31-display-engine";
+   allwinner,pipelines = <>;
+   };
+
soc@01c0 {
compatible = "simple-bus";
#address-cells = <1>;
@@ -246,6 +251,44 @@
#dma-cells = <1>;
};
 
+   tcon0: lcd-controller@01c0c000 {
+   compatible = "allwinner,sun6i-a31-tcon";
+   reg = <0x01c0c000 0x1000>;
+   interrupts = ;
+   resets = < RST_AHB1_LCD0>;
+   reset-names = "lcd";
+   clocks = < CLK_AHB1_LCD0>,
+< CLK_LCD0_CH0>,
+< CLK_LCD0_CH1>;
+   clock-names = "ahb",
+ "tcon-ch0",
+ "tcon-ch1";
+   clock-output-names = "tcon0-pixel-clock";
+   status = "disabled";
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   tcon0_in: port@0 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0>;
+
+   tcon0_in_drc0: endpoint@0 {
+   reg = <0>;
+   remote-endpoint = 
<_out_tcon0>;
+   };
+   };
+
+   tcon0_out: port@1 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <1>;
+   };
+   };
+   };
+
mmc0: mmc@01c0f000 {
compatible = "allwinner,sun7i-a20-mmc";
reg = <0x01c0f000 0x1000>;
@@ -799,6 +842,115 @@
interrupts = ;
};
 
+   fe0: display-frontend@01e0 {
+   compatible = "allwinner,sun6i-a31-display-frontend";
+   reg = <0x01e0 0x2>;
+   interrupts = ;
+   clocks = < CLK_AHB1_FE0>, < CLK_FE0>,
+< CLK_DRAM_FE0>;
+   clock-names = "ahb", "mod",
+ "ram";
+   resets = < RST_AHB1_FE0>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   fe0_out: port@1 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <1>;
+
+   fe0_out_be0: endpoint@0 {
+   reg = <0>;
+   remote-endpoint = <_in_fe0>;
+   };
+   };
+   };
+   };
+
+   be0: display-backend@01e6 {
+   compatible = "allwinner,sun6i-a31-display-backend";
+   reg = <0x01e6 0x1>;
+   interrupts = ;
+   clocks = < CLK_AHB1_BE0>, < CLK_BE0>,
+< CLK_DRAM_BE0>;
+   clock-names = "ahb", "mod",
+ "ram";
+   resets = < RST_AHB1_BE0>;
+
+   assigned-clocks = < CLK_BE0>;
+   assigned-clock-rates = <3>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   be0_in: port@0 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0>;
+
+   be0_in_fe0: endpoint@0 {

Re: [PATCH 08/10] ufs-qcom: phy/hcd: Refactoring phy clock handling

2016-10-19 Thread Vivek Gautam
On Thu, Oct 20, 2016 at 12:48 AM, Subhash Jadavani
 wrote:
> On 2016-10-19 10:45, Vivek Gautam wrote:
>>
>> Hi,
>>
>>
>> On Wed, Oct 19, 2016 at 1:43 AM, Subhash Jadavani
>>  wrote:
>>>
>>> On 2016-10-18 07:28, Vivek Gautam wrote:


 Add phy clock enable code to phy_power_on/off callbacks, and
 remove explicit calls to enable these phy clocks from the
 ufs-qcom hcd driver.

 Signed-off-by: Vivek Gautam 
 ---

 Changes since v1:
  - staticized ufs_qcom_phy_enable(/disable)_ref_clk(),
  - staticized ufs_qcom_phy_enable(/disable)_iface_clk()
  - removed function declaration and export symbol for these APIs.
>>
>>
>> [snip]
>>
 --- a/drivers/scsi/ufs/ufs-qcom.c
 +++ b/drivers/scsi/ufs/ufs-qcom.c
 @@ -1112,17 +1112,6 @@ static int ufs_qcom_setup_clocks(struct ufs_hba
 *hba, bool on)
 return 0;

 if (on) {
 -   err = ufs_qcom_phy_enable_iface_clk(host->generic_phy);
 -   if (err)
 -   goto out;
 -
 -   err = ufs_qcom_phy_enable_ref_clk(host->generic_phy);
 -   if (err) {
 -   dev_err(hba->dev, "%s enable phy ref clock
 failed,
 err=%d\n",
 -   __func__, err);
 -
 ufs_qcom_phy_disable_iface_clk(host->generic_phy);
 -   goto out;
 -   }
>>>
>>>
>>>
>>> Now that you are moving these ref clk enable/disable to phy_power_on/off
>>> and
>>> these phy_power_on/off are called only in runtime suspend/resume (3
>>> seconds
>>> after last UFS access).
>>> Goal is to disable the phy reference clock during aggressive gating (10ms
>>> from last UFS access) so shouldn't we call the phy_power_on/off from
>>> these
>>> setup_clocks() function as well?
>>>
>>
>> So setup_clocks() is called for aggressive clock gating as well ?
>> If that's the case then yes, we may need to call. But we should try to
>> understand here. The phy_power_off turns off all the clocks - reflclk,
>> and other interface clocks. Do we want all of them to be turned off ?
>
>
> Yes, we want to turn off the ref clock (& other clocks) during aggressive
> gating.

Ok.

>
>>
>> phy_power_off will also turn off the PHY. Do we want all this for
>> aggressive
>> clock gating ?
>
>
> Yes, PHY rails can be powered off both during the aggressive clk gating and
> runtime suspend. But as the regulator on/off latencies could be higher
> (especially for the shared rail), we were turning them off doing it in
> runtime suspend only (via phy_power_off()).
> Now that phy_power_on/off is managing both clocks and regulators, and we
> will really want to turn off the ref clocks during clock gating, there is no
> option but to call phy_power_on/off during aggressive gating.

Alright then, i will update the change to add phy_power_off/on()
during aggressive clock gating.


Thanks
Vivek

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: Regression in intel_powerclamp, due to cpu whitelist removal

2016-10-19 Thread Jacob Pan
On Tue, 18 Oct 2016 14:20:49 +
Alok Kataria  wrote:

> Hi Jacob, Zhang, 
> 
> One of your recent commit "thermal/powerclamp: remove cpu
> whitelist” [1], has caused a regression in the kernel. 
> 
> That commit changed powerclamp_probe from requiring all of the
> following features:
> 
> X86_FEATURE_NONSTOP_TSC
> X86_FEATURE_CONSTANT_TSC
> X86_FEATURE_MWAIT
> X86_FEATURE_ARAT   
> 
> to *any* of them.  The problem is clamp_thread still wants to use
> mwait_idle_with_hints even if the CPU doesn't support it. 
>
Hi Alok,

You are right, it should be AND not OR.
 
+Eric who has a patch to address this.

https://patchwork.kernel.org/patch/9365005/

Rui/Rafael,

Could you consider this as an urgent fix?

Jacob
> This was reported by our users when running ubuntu 16.10
> (4.8.0-22-generic) inside a VMware VM, though as mentioned above I
> don’t think it is specific to our platform. We have seen kernel
> panics due to invalid opcode because of this. Below is the stack
> trace for your reference. 
> 
> [5.736416] invalid opcode:  [#1] SMP
> [5.736455] Modules linked in: vmw_vsock_vmci_transport vsock
> vmw_balloon intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
> ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper
> ablk_helper cryptd intel_rapl_perf input_leds joydev serio_raw
> snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm snd_seq_midi
> snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd
> soundcore i2c_piix4 shpchp vmw_vmci nfit floppy(+) mac_hid parport_pc
> ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid
> ahci libahci e1000 mptspi mptscsih psmouse mptbase vmwgfx
> scsi_transport_spi ttm drm_kms_helper syscopyarea sysfillrect
> sysimgblt fb_sys_fops drm pata_acpi fjes [5.744370] CPU: 1 PID:
> 912 Comm: kidle_inject/1 Not tainted 4.8.0-22-generic #24-Ubuntu
> [5.744373] Hardware name: VMware, Inc. VMware Virtual
> Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> [5.744375] task: 9658f7a663c0 task.stack: 9658fa908000
> [5.744378] RIP: 0010:[]  []
> clamp_thread+0x2b8/0x5d0 [intel_powerclamp] [5.744380] RSP:
> 0018:9658fa90be00  EFLAGS: 00010246 [5.744383] RAX:
> 9658fa908008 RBX: fffee0a6 RCX: 
> [5.744386] RDX:  RSI: 0246 RDI:
> 0246 [5.744388] RBP: 9658fa90bec0 R08:
> 9658fa908000 R09:  [5.744391] R10:
> 0001cbf7 R11:  R12: 8db581a0
> [5.744393] R13: 9658fa908000 R14:  R15:
> 9658fa908000 [5.744396] FS:  ()
> GS:9658fc64() knlGS: [5.744398] CS:
> 0010 DS:  ES:  CR0: 80050033 [5.744401] CR2:
> 7ffa6cc262e8 CR3: 3ab3b000 CR4: 001406e0
> [5.744403] Stack: [5.744406]  0001
> 9658f7a66dc0 9658fc659200 e878d638 [5.744409]
> 0001 0002fc659200 0001 9658fa908008
> [5.744411]   9658fc64fea8 fffee0a6
> c05720a0 [5.744414] Call Trace: [5.744416]
> [] ? pkg_state_counter+0xa0/0xa0 [intel_powerclamp]
> [5.744419]  [] ?
> powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> [5.744421]  [] ?
> powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> [5.744424]  [] kthread+0xd8/0xf0
> [5.744427]  [] ret_from_fork+0x1f/0x40
> [5.744429]  [] ?
> kthread_create_on_node+0x1e0/0x1e0 [5.744432] Code: cc e9 ba 00
> 00 00 eb 19 0f 1f 00 0f ae f0 65 48 8b 04 25 04 69 01 00 0f ae b8 08
> c0 ff ff 0f ae f0 31 d2 48 8b 44 24 38 48 89 d1 <0f> 01 c8 49 8b 45
> 08 a8 08 75 0b b9 01 00 00 00 4c 89 f0 0f 01 [5.744434] RIP
> [] clamp_thread+0x2b8/0x5d0 [intel_powerclamp]
> [5.744437]  RSP  [5.70] invalid opcode:
>  [#2] SMP [5.744452] ---[ end trace cf659c4076bf2804 ]---
> 
> Looking at the instruction at the RIP  shows that
> the kernel attempted to execute “monitor” instruction. 
> 
>  8b8:   0f 01 c8monitor %rax,%rcx,%rdx
>  8bb:   49 8b 45 08 mov0x8(%r13),%rax
> 
> To fix this, I think you should restore the explicit feature check
> “if block” that was removed in the above mentioned commit. Can you
> please look at this ?
> 
> Thanks,
> Alok
> 
> 
> [1] b721ca0d192754deccb89fb01c77e41e6fd91ad9
> https://github.com/torvalds/linux/commit/b721ca0d192754deccb89fb01c77e41e6fd91ad9,
>  
> 

[Jacob Pan]


Re: [PATCH 08/10] ufs-qcom: phy/hcd: Refactoring phy clock handling

2016-10-19 Thread Vivek Gautam
On Thu, Oct 20, 2016 at 12:48 AM, Subhash Jadavani
 wrote:
> On 2016-10-19 10:45, Vivek Gautam wrote:
>>
>> Hi,
>>
>>
>> On Wed, Oct 19, 2016 at 1:43 AM, Subhash Jadavani
>>  wrote:
>>>
>>> On 2016-10-18 07:28, Vivek Gautam wrote:


 Add phy clock enable code to phy_power_on/off callbacks, and
 remove explicit calls to enable these phy clocks from the
 ufs-qcom hcd driver.

 Signed-off-by: Vivek Gautam 
 ---

 Changes since v1:
  - staticized ufs_qcom_phy_enable(/disable)_ref_clk(),
  - staticized ufs_qcom_phy_enable(/disable)_iface_clk()
  - removed function declaration and export symbol for these APIs.
>>
>>
>> [snip]
>>
 --- a/drivers/scsi/ufs/ufs-qcom.c
 +++ b/drivers/scsi/ufs/ufs-qcom.c
 @@ -1112,17 +1112,6 @@ static int ufs_qcom_setup_clocks(struct ufs_hba
 *hba, bool on)
 return 0;

 if (on) {
 -   err = ufs_qcom_phy_enable_iface_clk(host->generic_phy);
 -   if (err)
 -   goto out;
 -
 -   err = ufs_qcom_phy_enable_ref_clk(host->generic_phy);
 -   if (err) {
 -   dev_err(hba->dev, "%s enable phy ref clock
 failed,
 err=%d\n",
 -   __func__, err);
 -
 ufs_qcom_phy_disable_iface_clk(host->generic_phy);
 -   goto out;
 -   }
>>>
>>>
>>>
>>> Now that you are moving these ref clk enable/disable to phy_power_on/off
>>> and
>>> these phy_power_on/off are called only in runtime suspend/resume (3
>>> seconds
>>> after last UFS access).
>>> Goal is to disable the phy reference clock during aggressive gating (10ms
>>> from last UFS access) so shouldn't we call the phy_power_on/off from
>>> these
>>> setup_clocks() function as well?
>>>
>>
>> So setup_clocks() is called for aggressive clock gating as well ?
>> If that's the case then yes, we may need to call. But we should try to
>> understand here. The phy_power_off turns off all the clocks - reflclk,
>> and other interface clocks. Do we want all of them to be turned off ?
>
>
> Yes, we want to turn off the ref clock (& other clocks) during aggressive
> gating.

Ok.

>
>>
>> phy_power_off will also turn off the PHY. Do we want all this for
>> aggressive
>> clock gating ?
>
>
> Yes, PHY rails can be powered off both during the aggressive clk gating and
> runtime suspend. But as the regulator on/off latencies could be higher
> (especially for the shared rail), we were turning them off doing it in
> runtime suspend only (via phy_power_off()).
> Now that phy_power_on/off is managing both clocks and regulators, and we
> will really want to turn off the ref clocks during clock gating, there is no
> option but to call phy_power_on/off during aggressive gating.

Alright then, i will update the change to add phy_power_off/on()
during aggressive clock gating.


Thanks
Vivek

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: Regression in intel_powerclamp, due to cpu whitelist removal

2016-10-19 Thread Jacob Pan
On Tue, 18 Oct 2016 14:20:49 +
Alok Kataria  wrote:

> Hi Jacob, Zhang, 
> 
> One of your recent commit "thermal/powerclamp: remove cpu
> whitelist” [1], has caused a regression in the kernel. 
> 
> That commit changed powerclamp_probe from requiring all of the
> following features:
> 
> X86_FEATURE_NONSTOP_TSC
> X86_FEATURE_CONSTANT_TSC
> X86_FEATURE_MWAIT
> X86_FEATURE_ARAT   
> 
> to *any* of them.  The problem is clamp_thread still wants to use
> mwait_idle_with_hints even if the CPU doesn't support it. 
>
Hi Alok,

You are right, it should be AND not OR.
 
+Eric who has a patch to address this.

https://patchwork.kernel.org/patch/9365005/

Rui/Rafael,

Could you consider this as an urgent fix?

Jacob
> This was reported by our users when running ubuntu 16.10
> (4.8.0-22-generic) inside a VMware VM, though as mentioned above I
> don’t think it is specific to our platform. We have seen kernel
> panics due to invalid opcode because of this. Below is the stack
> trace for your reference. 
> 
> [5.736416] invalid opcode:  [#1] SMP
> [5.736455] Modules linked in: vmw_vsock_vmci_transport vsock
> vmw_balloon intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
> ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper
> ablk_helper cryptd intel_rapl_perf input_leds joydev serio_raw
> snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm snd_seq_midi
> snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd
> soundcore i2c_piix4 shpchp vmw_vmci nfit floppy(+) mac_hid parport_pc
> ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid
> ahci libahci e1000 mptspi mptscsih psmouse mptbase vmwgfx
> scsi_transport_spi ttm drm_kms_helper syscopyarea sysfillrect
> sysimgblt fb_sys_fops drm pata_acpi fjes [5.744370] CPU: 1 PID:
> 912 Comm: kidle_inject/1 Not tainted 4.8.0-22-generic #24-Ubuntu
> [5.744373] Hardware name: VMware, Inc. VMware Virtual
> Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> [5.744375] task: 9658f7a663c0 task.stack: 9658fa908000
> [5.744378] RIP: 0010:[]  []
> clamp_thread+0x2b8/0x5d0 [intel_powerclamp] [5.744380] RSP:
> 0018:9658fa90be00  EFLAGS: 00010246 [5.744383] RAX:
> 9658fa908008 RBX: fffee0a6 RCX: 
> [5.744386] RDX:  RSI: 0246 RDI:
> 0246 [5.744388] RBP: 9658fa90bec0 R08:
> 9658fa908000 R09:  [5.744391] R10:
> 0001cbf7 R11:  R12: 8db581a0
> [5.744393] R13: 9658fa908000 R14:  R15:
> 9658fa908000 [5.744396] FS:  ()
> GS:9658fc64() knlGS: [5.744398] CS:
> 0010 DS:  ES:  CR0: 80050033 [5.744401] CR2:
> 7ffa6cc262e8 CR3: 3ab3b000 CR4: 001406e0
> [5.744403] Stack: [5.744406]  0001
> 9658f7a66dc0 9658fc659200 e878d638 [5.744409]
> 0001 0002fc659200 0001 9658fa908008
> [5.744411]   9658fc64fea8 fffee0a6
> c05720a0 [5.744414] Call Trace: [5.744416]
> [] ? pkg_state_counter+0xa0/0xa0 [intel_powerclamp]
> [5.744419]  [] ?
> powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> [5.744421]  [] ?
> powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> [5.744424]  [] kthread+0xd8/0xf0
> [5.744427]  [] ret_from_fork+0x1f/0x40
> [5.744429]  [] ?
> kthread_create_on_node+0x1e0/0x1e0 [5.744432] Code: cc e9 ba 00
> 00 00 eb 19 0f 1f 00 0f ae f0 65 48 8b 04 25 04 69 01 00 0f ae b8 08
> c0 ff ff 0f ae f0 31 d2 48 8b 44 24 38 48 89 d1 <0f> 01 c8 49 8b 45
> 08 a8 08 75 0b b9 01 00 00 00 4c 89 f0 0f 01 [5.744434] RIP
> [] clamp_thread+0x2b8/0x5d0 [intel_powerclamp]
> [5.744437]  RSP  [5.70] invalid opcode:
>  [#2] SMP [5.744452] ---[ end trace cf659c4076bf2804 ]---
> 
> Looking at the instruction at the RIP  shows that
> the kernel attempted to execute “monitor” instruction. 
> 
>  8b8:   0f 01 c8monitor %rax,%rcx,%rdx
>  8bb:   49 8b 45 08 mov0x8(%r13),%rax
> 
> To fix this, I think you should restore the explicit feature check
> “if block” that was removed in the above mentioned commit. Can you
> please look at this ?
> 
> Thanks,
> Alok
> 
> 
> [1] b721ca0d192754deccb89fb01c77e41e6fd91ad9
> https://github.com/torvalds/linux/commit/b721ca0d192754deccb89fb01c77e41e6fd91ad9,
>  
> 

[Jacob Pan]


[PATCH v2 1/8] drm/bridge: rgb-to-vga: Support an enable GPIO

2016-10-19 Thread Chen-Yu Tsai
Some rgb-to-vga bridges have an enable GPIO, either directly tied to
an enable pin on the bridge IC, or indirectly controlling a power
switch.

Add support for it.

Signed-off-by: Chen-Yu Tsai 
---
 .../bindings/display/bridge/dumb-vga-dac.txt   |  2 ++
 drivers/gpu/drm/bridge/dumb-vga-dac.c  | 28 ++
 2 files changed, 30 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/bridge/dumb-vga-dac.txt 
b/Documentation/devicetree/bindings/display/bridge/dumb-vga-dac.txt
index 003bc246a270..d3484822bf77 100644
--- a/Documentation/devicetree/bindings/display/bridge/dumb-vga-dac.txt
+++ b/Documentation/devicetree/bindings/display/bridge/dumb-vga-dac.txt
@@ -16,6 +16,8 @@ graph bindings specified in 
Documentation/devicetree/bindings/graph.txt.
 - Video port 0 for RGB input
 - Video port 1 for VGA output
 
+Optional properties:
+- enable-gpios: GPIO pin to enable or disable the bridge
 
 Example
 ---
diff --git a/drivers/gpu/drm/bridge/dumb-vga-dac.c 
b/drivers/gpu/drm/bridge/dumb-vga-dac.c
index afec232185a7..b487e5e9b56d 100644
--- a/drivers/gpu/drm/bridge/dumb-vga-dac.c
+++ b/drivers/gpu/drm/bridge/dumb-vga-dac.c
@@ -10,6 +10,7 @@
  * the License, or (at your option) any later version.
  */
 
+#include 
 #include 
 #include 
 
@@ -23,6 +24,7 @@ struct dumb_vga {
struct drm_connectorconnector;
 
struct i2c_adapter  *ddc;
+   struct gpio_desc*enable_gpio;
 };
 
 static inline struct dumb_vga *
@@ -124,8 +126,26 @@ static int dumb_vga_attach(struct drm_bridge *bridge)
return 0;
 }
 
+static void dumb_vga_enable(struct drm_bridge *bridge)
+{
+   struct dumb_vga *vga = drm_bridge_to_dumb_vga(bridge);
+
+   if (vga->enable_gpio)
+   gpiod_set_value_cansleep(vga->enable_gpio, 1);
+}
+
+static void dumb_vga_disable(struct drm_bridge *bridge)
+{
+   struct dumb_vga *vga = drm_bridge_to_dumb_vga(bridge);
+
+   if (vga->enable_gpio)
+   gpiod_set_value_cansleep(vga->enable_gpio, 0);
+}
+
 static const struct drm_bridge_funcs dumb_vga_bridge_funcs = {
.attach = dumb_vga_attach,
+   .enable = dumb_vga_enable,
+   .disable= dumb_vga_disable,
 };
 
 static struct i2c_adapter *dumb_vga_retrieve_ddc(struct device *dev)
@@ -169,6 +189,14 @@ static int dumb_vga_probe(struct platform_device *pdev)
return -ENOMEM;
platform_set_drvdata(pdev, vga);
 
+   vga->enable_gpio = devm_gpiod_get_optional(>dev, "enable",
+  GPIOD_OUT_LOW);
+   if (IS_ERR(vga->enable_gpio)) {
+   ret = PTR_ERR(vga->enable_gpio);
+   dev_err(>dev, "failed to request GPIO: %d\n", ret);
+   return ret;
+   }
+
vga->ddc = dumb_vga_retrieve_ddc(>dev);
if (IS_ERR(vga->ddc)) {
if (PTR_ERR(vga->ddc) == -ENODEV) {
-- 
2.9.3



[PATCH v2 1/8] drm/bridge: rgb-to-vga: Support an enable GPIO

2016-10-19 Thread Chen-Yu Tsai
Some rgb-to-vga bridges have an enable GPIO, either directly tied to
an enable pin on the bridge IC, or indirectly controlling a power
switch.

Add support for it.

Signed-off-by: Chen-Yu Tsai 
---
 .../bindings/display/bridge/dumb-vga-dac.txt   |  2 ++
 drivers/gpu/drm/bridge/dumb-vga-dac.c  | 28 ++
 2 files changed, 30 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/bridge/dumb-vga-dac.txt 
b/Documentation/devicetree/bindings/display/bridge/dumb-vga-dac.txt
index 003bc246a270..d3484822bf77 100644
--- a/Documentation/devicetree/bindings/display/bridge/dumb-vga-dac.txt
+++ b/Documentation/devicetree/bindings/display/bridge/dumb-vga-dac.txt
@@ -16,6 +16,8 @@ graph bindings specified in 
Documentation/devicetree/bindings/graph.txt.
 - Video port 0 for RGB input
 - Video port 1 for VGA output
 
+Optional properties:
+- enable-gpios: GPIO pin to enable or disable the bridge
 
 Example
 ---
diff --git a/drivers/gpu/drm/bridge/dumb-vga-dac.c 
b/drivers/gpu/drm/bridge/dumb-vga-dac.c
index afec232185a7..b487e5e9b56d 100644
--- a/drivers/gpu/drm/bridge/dumb-vga-dac.c
+++ b/drivers/gpu/drm/bridge/dumb-vga-dac.c
@@ -10,6 +10,7 @@
  * the License, or (at your option) any later version.
  */
 
+#include 
 #include 
 #include 
 
@@ -23,6 +24,7 @@ struct dumb_vga {
struct drm_connectorconnector;
 
struct i2c_adapter  *ddc;
+   struct gpio_desc*enable_gpio;
 };
 
 static inline struct dumb_vga *
@@ -124,8 +126,26 @@ static int dumb_vga_attach(struct drm_bridge *bridge)
return 0;
 }
 
+static void dumb_vga_enable(struct drm_bridge *bridge)
+{
+   struct dumb_vga *vga = drm_bridge_to_dumb_vga(bridge);
+
+   if (vga->enable_gpio)
+   gpiod_set_value_cansleep(vga->enable_gpio, 1);
+}
+
+static void dumb_vga_disable(struct drm_bridge *bridge)
+{
+   struct dumb_vga *vga = drm_bridge_to_dumb_vga(bridge);
+
+   if (vga->enable_gpio)
+   gpiod_set_value_cansleep(vga->enable_gpio, 0);
+}
+
 static const struct drm_bridge_funcs dumb_vga_bridge_funcs = {
.attach = dumb_vga_attach,
+   .enable = dumb_vga_enable,
+   .disable= dumb_vga_disable,
 };
 
 static struct i2c_adapter *dumb_vga_retrieve_ddc(struct device *dev)
@@ -169,6 +189,14 @@ static int dumb_vga_probe(struct platform_device *pdev)
return -ENOMEM;
platform_set_drvdata(pdev, vga);
 
+   vga->enable_gpio = devm_gpiod_get_optional(>dev, "enable",
+  GPIOD_OUT_LOW);
+   if (IS_ERR(vga->enable_gpio)) {
+   ret = PTR_ERR(vga->enable_gpio);
+   dev_err(>dev, "failed to request GPIO: %d\n", ret);
+   return ret;
+   }
+
vga->ddc = dumb_vga_retrieve_ddc(>dev);
if (IS_ERR(vga->ddc)) {
if (PTR_ERR(vga->ddc) == -ENODEV) {
-- 
2.9.3



[PATCH] ata: set ncq_prio_enabled if device has support

2016-10-19 Thread Adam Manzanares
We previously had a check to see if the device has support for
prioritized ncq commands and a check to see if a device flag
is set, through a sysfs variable, in order to send a prioritized
command.

This patch only allows the sysfs variable to be set if the device
supports prioritized commands enabling one check in ata_build_rw_tf
in order to determine whether or not to send a prioritized command.

This patch depends on ata: ATA Command Priority Disabled By Default

Signed-off-by: Adam Manzanares 
---
 drivers/ata/libata-core.c | 3 +--
 drivers/ata/libata-scsi.c | 8 +++-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index b294339..43842fd 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -787,8 +787,7 @@ int ata_build_rw_tf(struct ata_taskfile *tf, struct 
ata_device *dev,
if (tf->flags & ATA_TFLAG_FUA)
tf->device |= 1 << 7;
 
-   if ((dev->flags & ATA_DFLAG_NCQ_PRIO) &&
-   (dev->flags & ATA_DFLAG_NCQ_PRIO_ENABLE)) {
+   if (dev->flags & ATA_DFLAG_NCQ_PRIO_ENABLE) {
if (class == IOPRIO_CLASS_RT)
tf->hob_nsect |= ATA_PRIO_HIGH <<
 ATA_SHIFT_PRIO;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 87597a3..831beea 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -323,8 +323,14 @@ static ssize_t ata_ncq_prio_enable_store(struct device 
*device,
goto unlock;
}
 
-   if (input)
+   if (input) {
+   if (!(dev->flags & ATA_DFLAG_NCQ_PRIO)) {
+   rc = -EOPNOTSUPP;
+   goto unlock;
+   }
+
dev->flags |= ATA_DFLAG_NCQ_PRIO_ENABLE;
+   }
else
dev->flags &= ~ATA_DFLAG_NCQ_PRIO_ENABLE;
 
-- 
2.1.4



[PATCH] ata: set ncq_prio_enabled if device has support

2016-10-19 Thread Adam Manzanares
We previously had a check to see if the device has support for
prioritized ncq commands and a check to see if a device flag
is set, through a sysfs variable, in order to send a prioritized
command.

This patch only allows the sysfs variable to be set if the device
supports prioritized commands enabling one check in ata_build_rw_tf
in order to determine whether or not to send a prioritized command.

This patch depends on ata: ATA Command Priority Disabled By Default

Signed-off-by: Adam Manzanares 
---
 drivers/ata/libata-core.c | 3 +--
 drivers/ata/libata-scsi.c | 8 +++-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index b294339..43842fd 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -787,8 +787,7 @@ int ata_build_rw_tf(struct ata_taskfile *tf, struct 
ata_device *dev,
if (tf->flags & ATA_TFLAG_FUA)
tf->device |= 1 << 7;
 
-   if ((dev->flags & ATA_DFLAG_NCQ_PRIO) &&
-   (dev->flags & ATA_DFLAG_NCQ_PRIO_ENABLE)) {
+   if (dev->flags & ATA_DFLAG_NCQ_PRIO_ENABLE) {
if (class == IOPRIO_CLASS_RT)
tf->hob_nsect |= ATA_PRIO_HIGH <<
 ATA_SHIFT_PRIO;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 87597a3..831beea 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -323,8 +323,14 @@ static ssize_t ata_ncq_prio_enable_store(struct device 
*device,
goto unlock;
}
 
-   if (input)
+   if (input) {
+   if (!(dev->flags & ATA_DFLAG_NCQ_PRIO)) {
+   rc = -EOPNOTSUPP;
+   goto unlock;
+   }
+
dev->flags |= ATA_DFLAG_NCQ_PRIO_ENABLE;
+   }
else
dev->flags &= ~ATA_DFLAG_NCQ_PRIO_ENABLE;
 
-- 
2.1.4



[PATCH 2/2] mailbox: sti: Fix module autoload for OF registration

2016-10-19 Thread Javier Martinez Canillas
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/mailbox/mailbox-sti.ko | grep alias
alias:  platform:mailbox-sti

After this patch:

$ modinfo drivers/mailbox/mailbox-sti.ko | grep alias
alias:  platform:mailbox-sti
alias:  of:N*T*Cst,stih407-mailboxC*
alias:  of:N*T*Cst,stih407-mailbox

Signed-off-by: Javier Martinez Canillas 

---

 drivers/mailbox/mailbox-sti.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/mailbox/mailbox-sti.c b/drivers/mailbox/mailbox-sti.c
index a334db5c9f1c..41bcd339b68a 100644
--- a/drivers/mailbox/mailbox-sti.c
+++ b/drivers/mailbox/mailbox-sti.c
@@ -403,6 +403,7 @@ static const struct of_device_id sti_mailbox_match[] = {
},
{ }
 };
+MODULE_DEVICE_TABLE(of, sti_mailbox_match);
 
 static int sti_mbox_probe(struct platform_device *pdev)
 {
-- 
2.7.4



[PATCH 2/2] mailbox: sti: Fix module autoload for OF registration

2016-10-19 Thread Javier Martinez Canillas
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/mailbox/mailbox-sti.ko | grep alias
alias:  platform:mailbox-sti

After this patch:

$ modinfo drivers/mailbox/mailbox-sti.ko | grep alias
alias:  platform:mailbox-sti
alias:  of:N*T*Cst,stih407-mailboxC*
alias:  of:N*T*Cst,stih407-mailbox

Signed-off-by: Javier Martinez Canillas 

---

 drivers/mailbox/mailbox-sti.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/mailbox/mailbox-sti.c b/drivers/mailbox/mailbox-sti.c
index a334db5c9f1c..41bcd339b68a 100644
--- a/drivers/mailbox/mailbox-sti.c
+++ b/drivers/mailbox/mailbox-sti.c
@@ -403,6 +403,7 @@ static const struct of_device_id sti_mailbox_match[] = {
},
{ }
 };
+MODULE_DEVICE_TABLE(of, sti_mailbox_match);
 
 static int sti_mbox_probe(struct platform_device *pdev)
 {
-- 
2.7.4



Re: [PATCH v2 1/4] cpufreq: pxa: use generic platdev driver for device-tree

2016-10-19 Thread Viresh Kumar
On 19-10-16, 22:06, Robert Jarzmik wrote:
> Viresh Kumar  writes:
> 
> >> >> +   { .compatible = "marvell,pxa250", },
> >> >> +   { .compatible = "marvell,pxa270", },
> >> >>  
> >> >> { .compatible = "samsung,exynos3250", },
> >> >> { .compatible = "samsung,exynos4210", },
> >> >
> >> > Isn't there a race between cpufreq-dt and the platform driver to
> >> > register first ?
> >> Ah, could you be more specific about the race you're talking of ?
> >> 
> >> My understanding was that cpufreq-dt-platdev does create the device, and
> >> cpufreq-dt is a driver for it, so there is no race but a direct 
> >> relationship
> >> AFAIU.
> >
> > I mean that both the driver may try to register to the cpufreq core if
> > they are both compiled in a single image.
> Euh I still don't follow you. The only driver that can register to the cpufreq
> core is cpufreq-dt.

I was wondering on what will happen if both cpufreq-dt and your pxa2xx-cpufreq
driver are present in the same kernel image. In that case the init routines of
both of them will try to call cpufreq_register_driver().

-- 
viresh


[PATCH 1/2] mailbox: mailbox-test: Fix module autoload

2016-10-19 Thread Javier Martinez Canillas
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/mailbox/mailbox-test.ko | grep alias
$

After this patch:

$ modinfo drivers/mailbox/mailbox-test.ko | grep alias
alias:  of:N*T*Cmailbox-testC*
alias:  of:N*T*Cmailbox-test

Signed-off-by: Javier Martinez Canillas 
---

 drivers/mailbox/mailbox-test.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/mailbox/mailbox-test.c b/drivers/mailbox/mailbox-test.c
index 9ca96e9db6bf..deb9c61aebb4 100644
--- a/drivers/mailbox/mailbox-test.c
+++ b/drivers/mailbox/mailbox-test.c
@@ -357,6 +357,7 @@ static const struct of_device_id mbox_test_match[] = {
{ .compatible = "mailbox-test" },
{},
 };
+MODULE_DEVICE_TABLE(of, mbox_test_match);
 
 static struct platform_driver mbox_test_driver = {
.driver = {
-- 
2.7.4



Re: [PATCH v2 1/4] cpufreq: pxa: use generic platdev driver for device-tree

2016-10-19 Thread Viresh Kumar
On 19-10-16, 22:06, Robert Jarzmik wrote:
> Viresh Kumar  writes:
> 
> >> >> +   { .compatible = "marvell,pxa250", },
> >> >> +   { .compatible = "marvell,pxa270", },
> >> >>  
> >> >> { .compatible = "samsung,exynos3250", },
> >> >> { .compatible = "samsung,exynos4210", },
> >> >
> >> > Isn't there a race between cpufreq-dt and the platform driver to
> >> > register first ?
> >> Ah, could you be more specific about the race you're talking of ?
> >> 
> >> My understanding was that cpufreq-dt-platdev does create the device, and
> >> cpufreq-dt is a driver for it, so there is no race but a direct 
> >> relationship
> >> AFAIU.
> >
> > I mean that both the driver may try to register to the cpufreq core if
> > they are both compiled in a single image.
> Euh I still don't follow you. The only driver that can register to the cpufreq
> core is cpufreq-dt.

I was wondering on what will happen if both cpufreq-dt and your pxa2xx-cpufreq
driver are present in the same kernel image. In that case the init routines of
both of them will try to call cpufreq_register_driver().

-- 
viresh


[PATCH 1/2] mailbox: mailbox-test: Fix module autoload

2016-10-19 Thread Javier Martinez Canillas
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/mailbox/mailbox-test.ko | grep alias
$

After this patch:

$ modinfo drivers/mailbox/mailbox-test.ko | grep alias
alias:  of:N*T*Cmailbox-testC*
alias:  of:N*T*Cmailbox-test

Signed-off-by: Javier Martinez Canillas 
---

 drivers/mailbox/mailbox-test.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/mailbox/mailbox-test.c b/drivers/mailbox/mailbox-test.c
index 9ca96e9db6bf..deb9c61aebb4 100644
--- a/drivers/mailbox/mailbox-test.c
+++ b/drivers/mailbox/mailbox-test.c
@@ -357,6 +357,7 @@ static const struct of_device_id mbox_test_match[] = {
{ .compatible = "mailbox-test" },
{},
 };
+MODULE_DEVICE_TABLE(of, mbox_test_match);
 
 static struct platform_driver mbox_test_driver = {
.driver = {
-- 
2.7.4



Re: [PATCH v4 3/5] powerpc/mm: allow memory hotplug into a memoryless node

2016-10-19 Thread Balbir Singh


On 07/10/16 05:36, Reza Arbab wrote:
> Remove the check which prevents us from hotplugging into an empty node.
> 
> This limitation has been questioned before [1], and judging by the
> response, there doesn't seem to be a reason we can't remove it. No issues
> have been found in light testing.
> 
> [1] 
> http://lkml.kernel.org/r/cagzkibrmksa1yyhbf5hwgxubcjse5smksmy4tpanerme2ug...@mail.gmail.com
> 
> http://lkml.kernel.org/r/20160511215051.gf22...@arbab-laptop.austin.ibm.com
> 
> Signed-off-by: Reza Arbab 
> Reviewed-by: Aneesh Kumar K.V 
> Acked-by: Balbir Singh 
> Cc: Nathan Fontenot 
> Cc: Bharata B Rao 
> ---
>  arch/powerpc/mm/numa.c | 13 +
>  1 file changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 75b9cd6..d7ac419 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1121,7 +1121,7 @@ static int hot_add_node_scn_to_nid(unsigned long 
> scn_addr)
>  int hot_add_scn_to_nid(unsigned long scn_addr)
>  {
>   struct device_node *memory = NULL;
> - int nid, found = 0;
> + int nid;
>  
>   if (!numa_enabled || (min_common_depth < 0))
>   return first_online_node;
> @@ -1137,17 +1137,6 @@ int hot_add_scn_to_nid(unsigned long scn_addr)
>   if (nid < 0 || !node_online(nid))
>   nid = first_online_node;
>  
> - if (NODE_DATA(nid)->node_spanned_pages)
> - return nid;
> -
> - for_each_online_node(nid) {
> - if (NODE_DATA(nid)->node_spanned_pages) {
> - found = 1;
> - break;
> - }
> - }
> -
> - BUG_ON(!found);
>   return nid;

FYI, these checks were temporary to begin with

I found this in git history

b226e462124522f2f23153daff31c311729dfa2f (powerpc: don't add memory to empty 
node/zone)

Balbir Singh.


Re: [PATCH v4 3/5] powerpc/mm: allow memory hotplug into a memoryless node

2016-10-19 Thread Balbir Singh


On 07/10/16 05:36, Reza Arbab wrote:
> Remove the check which prevents us from hotplugging into an empty node.
> 
> This limitation has been questioned before [1], and judging by the
> response, there doesn't seem to be a reason we can't remove it. No issues
> have been found in light testing.
> 
> [1] 
> http://lkml.kernel.org/r/cagzkibrmksa1yyhbf5hwgxubcjse5smksmy4tpanerme2ug...@mail.gmail.com
> 
> http://lkml.kernel.org/r/20160511215051.gf22...@arbab-laptop.austin.ibm.com
> 
> Signed-off-by: Reza Arbab 
> Reviewed-by: Aneesh Kumar K.V 
> Acked-by: Balbir Singh 
> Cc: Nathan Fontenot 
> Cc: Bharata B Rao 
> ---
>  arch/powerpc/mm/numa.c | 13 +
>  1 file changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 75b9cd6..d7ac419 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1121,7 +1121,7 @@ static int hot_add_node_scn_to_nid(unsigned long 
> scn_addr)
>  int hot_add_scn_to_nid(unsigned long scn_addr)
>  {
>   struct device_node *memory = NULL;
> - int nid, found = 0;
> + int nid;
>  
>   if (!numa_enabled || (min_common_depth < 0))
>   return first_online_node;
> @@ -1137,17 +1137,6 @@ int hot_add_scn_to_nid(unsigned long scn_addr)
>   if (nid < 0 || !node_online(nid))
>   nid = first_online_node;
>  
> - if (NODE_DATA(nid)->node_spanned_pages)
> - return nid;
> -
> - for_each_online_node(nid) {
> - if (NODE_DATA(nid)->node_spanned_pages) {
> - found = 1;
> - break;
> - }
> - }
> -
> - BUG_ON(!found);
>   return nid;

FYI, these checks were temporary to begin with

I found this in git history

b226e462124522f2f23153daff31c311729dfa2f (powerpc: don't add memory to empty 
node/zone)

Balbir Singh.


Re: 4.8.1 regression with cpufreq governors

2016-10-19 Thread Viresh Kumar
On 19-10-16, 15:06, Tim Walberg wrote:
> This indeed turned out to be the fix.

Really? How can this fix the ondemand governor thing?

Or is it that Ondemand never broke ?

-- 
viresh


Re: 4.8.1 regression with cpufreq governors

2016-10-19 Thread Viresh Kumar
On 19-10-16, 15:06, Tim Walberg wrote:
> This indeed turned out to be the fix.

Really? How can this fix the ondemand governor thing?

Or is it that Ondemand never broke ?

-- 
viresh


linux-next: Tree for Oct 20

2016-10-19 Thread Stephen Rothwell
Hi all,

Changes since 20161019:

The net-next tree gained a conflict against the net tree.

The akpm-current tree still had its build failures for which I applied
2 patches.

Non-merge commits (relative to Linus' tree): 1344
 1791 files changed, 152207 insertions(+), 26046 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
(this fails its final link) and pseries_le_defconfig and i386, sparc
and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 243 trees (counting Linus' and 34 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (c3f8f7fa8b19 Merge tag 'platform-drivers-x86-v4.9-2' of 
git://git.infradead.org/users/dvhart/linux-platform-drivers-x86)
Merging fixes/master (30066ce675d3 Merge branch 'linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6)
Merging kbuild-current/rc-fixes (d3e2773c4ede builddeb: Skip gcc-plugins when 
not configured)
Merging arc-current/for-curr (3da43104d318 ARC: Adjust cpuinfo for 
non-continuous cpu ids)
Merging arm-current/fixes (6127d124ee4e ARM: wire up new pkey syscalls)
Merging m68k-current/for-linus (6736e65effc3 m68k: Migrate exception table 
users off module.h and onto extable.h)
Merging metag-fixes/fixes (35d04077ad96 metag: Only define 
atomic_dec_if_positive conditionally)
Merging powerpc-fixes/fixes (78914ff08436 powerpc: Ignore the pkey system calls 
for now)
Merging sparc/master (4c1fad64eff4 Merge tag 'for-f2fs-4.9' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs)
Merging net/master (c30a70d3ac60 stmmac: fix and review the ptp registration.)
CONFLICT (content): Merge conflict in drivers/net/ethernet/qlogic/Kconfig
Applying: qed*: merge fix for CONFIG_INFINIBAND_QEDR Kconfig move
Merging ipsec/master (7f92083eb58f vti6: flush x-netns xfrm cache when vti 
interface is removed)
Merging netfilter/master (6d3a4c404648 strparser: Propagate correct error code 
in strp_recv())
Merging ipvs/master (ea43f860d984 Merge branch 'ethoc-fixes')
Merging wireless-drivers/master (1ea2643961b0 ath6kl: add Dell OEM SDIO I/O for 
the Venue 8 Pro)
Merging mac80211/master (b4f0fd4baa90 qed: Use list_move_tail instead of 
list_del/list_add_tail)
Merging sound-current/for-linus (d69bb92e402f ALSA: asihpi: fix kernel memory 
disclosure)
Merging pci-current/for-linus (15480f3ab7a8 PCI: layerscape: Fix drvdata usage 
before assignment)
Merging driver-core.current/driver-core-linus (1001354ca341 Linux 4.9-rc1)
Merging tty.current/tty-linus (1001354ca341 Linux 4.9-rc1)
Merging usb.current/usb-linus (1ee1710cd6bb wusb: fix error return code in 
wusb_prf())
Merging usb-gadget-fixes/fixes (a1aa8cf6471b Revert "Documentation: devicetree: 
dwc2: Deprecate g-tx-fifo-size")
Merging usb-serial-fixes/usb-linus (ca006f785fbf USB: serial: ftdi_sio: add 
support for Infineon TriBoard TC2X7)
Merging usb-chipidea-fixes/ci-for-usb-stable (6b7f456e67a1 usb: chipidea: host: 
fix NULL ptr dereference during shutdown)
Merging staging.current/staging-linus (c89d98e224b4 staging/lustre/llite: Move 
unstable_stats from sysfs to debugfs)
Merging char-misc.current/char-misc-linus (1001354ca341 Linux 4.9-rc1)
Merging input-current/for-linus (da25311c7ca8 Input: i8042 - add XMG C504 to 
keyboard reset table)
Merging crypto-current/master (6d4952d9d9d4 hwrng: core - Don't use a stack 
buffer in add_early_randomness())
Merging ide/master (797cee982eef Merge branch 'stable-4.8' of 
git://git.infradead.org/users/pcmoore/audit)
Merging rr-fixes/fixes (8244062e

linux-next: Tree for Oct 20

2016-10-19 Thread Stephen Rothwell
Hi all,

Changes since 20161019:

The net-next tree gained a conflict against the net tree.

The akpm-current tree still had its build failures for which I applied
2 patches.

Non-merge commits (relative to Linus' tree): 1344
 1791 files changed, 152207 insertions(+), 26046 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
(this fails its final link) and pseries_le_defconfig and i386, sparc
and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 243 trees (counting Linus' and 34 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (c3f8f7fa8b19 Merge tag 'platform-drivers-x86-v4.9-2' of 
git://git.infradead.org/users/dvhart/linux-platform-drivers-x86)
Merging fixes/master (30066ce675d3 Merge branch 'linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6)
Merging kbuild-current/rc-fixes (d3e2773c4ede builddeb: Skip gcc-plugins when 
not configured)
Merging arc-current/for-curr (3da43104d318 ARC: Adjust cpuinfo for 
non-continuous cpu ids)
Merging arm-current/fixes (6127d124ee4e ARM: wire up new pkey syscalls)
Merging m68k-current/for-linus (6736e65effc3 m68k: Migrate exception table 
users off module.h and onto extable.h)
Merging metag-fixes/fixes (35d04077ad96 metag: Only define 
atomic_dec_if_positive conditionally)
Merging powerpc-fixes/fixes (78914ff08436 powerpc: Ignore the pkey system calls 
for now)
Merging sparc/master (4c1fad64eff4 Merge tag 'for-f2fs-4.9' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs)
Merging net/master (c30a70d3ac60 stmmac: fix and review the ptp registration.)
CONFLICT (content): Merge conflict in drivers/net/ethernet/qlogic/Kconfig
Applying: qed*: merge fix for CONFIG_INFINIBAND_QEDR Kconfig move
Merging ipsec/master (7f92083eb58f vti6: flush x-netns xfrm cache when vti 
interface is removed)
Merging netfilter/master (6d3a4c404648 strparser: Propagate correct error code 
in strp_recv())
Merging ipvs/master (ea43f860d984 Merge branch 'ethoc-fixes')
Merging wireless-drivers/master (1ea2643961b0 ath6kl: add Dell OEM SDIO I/O for 
the Venue 8 Pro)
Merging mac80211/master (b4f0fd4baa90 qed: Use list_move_tail instead of 
list_del/list_add_tail)
Merging sound-current/for-linus (d69bb92e402f ALSA: asihpi: fix kernel memory 
disclosure)
Merging pci-current/for-linus (15480f3ab7a8 PCI: layerscape: Fix drvdata usage 
before assignment)
Merging driver-core.current/driver-core-linus (1001354ca341 Linux 4.9-rc1)
Merging tty.current/tty-linus (1001354ca341 Linux 4.9-rc1)
Merging usb.current/usb-linus (1ee1710cd6bb wusb: fix error return code in 
wusb_prf())
Merging usb-gadget-fixes/fixes (a1aa8cf6471b Revert "Documentation: devicetree: 
dwc2: Deprecate g-tx-fifo-size")
Merging usb-serial-fixes/usb-linus (ca006f785fbf USB: serial: ftdi_sio: add 
support for Infineon TriBoard TC2X7)
Merging usb-chipidea-fixes/ci-for-usb-stable (6b7f456e67a1 usb: chipidea: host: 
fix NULL ptr dereference during shutdown)
Merging staging.current/staging-linus (c89d98e224b4 staging/lustre/llite: Move 
unstable_stats from sysfs to debugfs)
Merging char-misc.current/char-misc-linus (1001354ca341 Linux 4.9-rc1)
Merging input-current/for-linus (da25311c7ca8 Input: i8042 - add XMG C504 to 
keyboard reset table)
Merging crypto-current/master (6d4952d9d9d4 hwrng: core - Don't use a stack 
buffer in add_early_randomness())
Merging ide/master (797cee982eef Merge branch 'stable-4.8' of 
git://git.infradead.org/users/pcmoore/audit)
Merging rr-fixes/fixes (8244062e

Re: [PATCH net-next 6/6] net: use core MTU range checking in misc drivers

2016-10-19 Thread Jarod Wilson
On Thu, Oct 20, 2016 at 12:38:46AM +0200, Stefan Richter wrote:
> On Oct 19 Sabrina Dubroca wrote:
> > 2016-10-18, 22:33:33 -0400, Jarod Wilson wrote:
> > [...]
> > > diff --git a/drivers/firewire/net.c b/drivers/firewire/net.c
> > > index 309311b..b5f125c 100644
> > > --- a/drivers/firewire/net.c
> > > +++ b/drivers/firewire/net.c
> > > @@ -1349,15 +1349,6 @@ static netdev_tx_t fwnet_tx(struct sk_buff *skb, 
> > > struct net_device *net)
> > >   return NETDEV_TX_OK;
> > >  }
> > >  
> > > -static int fwnet_change_mtu(struct net_device *net, int new_mtu)
> > > -{
> > > - if (new_mtu < 68)
> > > - return -EINVAL;
> > > -
> > > - net->mtu = new_mtu;
> > > - return 0;
> > > -}
> > > -  
> > 
> > This doesn't do any upper bound checking.
> 
> I need to check more closely, but I think the RFC 2734 encapsulation spec
> and our implementation do not impose a particular upper limit.  Though I
> guess it's bad to let userland set arbitrarily large values here.

In which case, that would suggest using IP_MAX_MTU (65535) here.

> > >  static const struct ethtool_ops fwnet_ethtool_ops = {
> > >   .get_link   = ethtool_op_get_link,
> > >  };
> > > @@ -1366,7 +1357,6 @@ static const struct net_device_ops fwnet_netdev_ops 
> > > = {
> > >   .ndo_open   = fwnet_open,
> > >   .ndo_stop   = fwnet_stop,
> > >   .ndo_start_xmit = fwnet_tx,
> > > - .ndo_change_mtu = fwnet_change_mtu,
> > >  };
> > >  
> > >  static void fwnet_init_dev(struct net_device *net)
> > > @@ -1481,6 +1471,8 @@ static int fwnet_probe(struct fw_unit *unit,
> > >   max_mtu = (1 << (card->max_receive + 1))
> > > - sizeof(struct rfc2734_header) - IEEE1394_GASP_HDR_SIZE;
> > >   net->mtu = min(1500U, max_mtu);
> > > + net->min_mtu = ETH_MIN_MTU;
> > > + net->max_mtu = net->mtu;  
> > 
> > But that will now prevent increasing the MTU above the initial value?
> 
> Indeed, therefore NAK.

However, there's an explicit calculation for 'max_mtu' right there that I
glazed right over. It would seem perhaps *that* should be used for
net->max_mtu here, no?

> PS:
> If the IP packet plus encapsulation header fits into IEEE 1394 packet
> payload, it is transported without link fragmentation.  If it does not
> fit, link fragmentation occurs (which reduces bandwidth a bit and
> consumes additional buffering resources at the transmitter and the
> receiver).
> 
> Broadcast and multicast packets are transmitted via IEEE 1394 asynchronous
> stream packets at a low bus speed (because our code does not attempt to
> find the maximum speed and size that is supported by all potential
> listeners).  This limits the payload to 512 bytes.
> 
> Unicast packets are transmitted via IEEE 1394 asynchronous write request
> packets at optimum speed.  In most cases, this means that 2048 bytes
> payload is possible, in some cases 4096 bytes.  Many CardBus FireWire
> cards support only 1024 bytes payload of these packets though.
> Furthermore, some low-speed long-haul cablings may cap the bus speed and
> thereby the payload size to 1024 or 512 bytes, but this is uncommon in
> practice.

Thorough as always, Stefan! :)

-- 
Jarod Wilson
ja...@redhat.com



Re: [PATCH net-next 6/6] net: use core MTU range checking in misc drivers

2016-10-19 Thread Jarod Wilson
On Thu, Oct 20, 2016 at 12:38:46AM +0200, Stefan Richter wrote:
> On Oct 19 Sabrina Dubroca wrote:
> > 2016-10-18, 22:33:33 -0400, Jarod Wilson wrote:
> > [...]
> > > diff --git a/drivers/firewire/net.c b/drivers/firewire/net.c
> > > index 309311b..b5f125c 100644
> > > --- a/drivers/firewire/net.c
> > > +++ b/drivers/firewire/net.c
> > > @@ -1349,15 +1349,6 @@ static netdev_tx_t fwnet_tx(struct sk_buff *skb, 
> > > struct net_device *net)
> > >   return NETDEV_TX_OK;
> > >  }
> > >  
> > > -static int fwnet_change_mtu(struct net_device *net, int new_mtu)
> > > -{
> > > - if (new_mtu < 68)
> > > - return -EINVAL;
> > > -
> > > - net->mtu = new_mtu;
> > > - return 0;
> > > -}
> > > -  
> > 
> > This doesn't do any upper bound checking.
> 
> I need to check more closely, but I think the RFC 2734 encapsulation spec
> and our implementation do not impose a particular upper limit.  Though I
> guess it's bad to let userland set arbitrarily large values here.

In which case, that would suggest using IP_MAX_MTU (65535) here.

> > >  static const struct ethtool_ops fwnet_ethtool_ops = {
> > >   .get_link   = ethtool_op_get_link,
> > >  };
> > > @@ -1366,7 +1357,6 @@ static const struct net_device_ops fwnet_netdev_ops 
> > > = {
> > >   .ndo_open   = fwnet_open,
> > >   .ndo_stop   = fwnet_stop,
> > >   .ndo_start_xmit = fwnet_tx,
> > > - .ndo_change_mtu = fwnet_change_mtu,
> > >  };
> > >  
> > >  static void fwnet_init_dev(struct net_device *net)
> > > @@ -1481,6 +1471,8 @@ static int fwnet_probe(struct fw_unit *unit,
> > >   max_mtu = (1 << (card->max_receive + 1))
> > > - sizeof(struct rfc2734_header) - IEEE1394_GASP_HDR_SIZE;
> > >   net->mtu = min(1500U, max_mtu);
> > > + net->min_mtu = ETH_MIN_MTU;
> > > + net->max_mtu = net->mtu;  
> > 
> > But that will now prevent increasing the MTU above the initial value?
> 
> Indeed, therefore NAK.

However, there's an explicit calculation for 'max_mtu' right there that I
glazed right over. It would seem perhaps *that* should be used for
net->max_mtu here, no?

> PS:
> If the IP packet plus encapsulation header fits into IEEE 1394 packet
> payload, it is transported without link fragmentation.  If it does not
> fit, link fragmentation occurs (which reduces bandwidth a bit and
> consumes additional buffering resources at the transmitter and the
> receiver).
> 
> Broadcast and multicast packets are transmitted via IEEE 1394 asynchronous
> stream packets at a low bus speed (because our code does not attempt to
> find the maximum speed and size that is supported by all potential
> listeners).  This limits the payload to 512 bytes.
> 
> Unicast packets are transmitted via IEEE 1394 asynchronous write request
> packets at optimum speed.  In most cases, this means that 2048 bytes
> payload is possible, in some cases 4096 bytes.  Many CardBus FireWire
> cards support only 1024 bytes payload of these packets though.
> Furthermore, some low-speed long-haul cablings may cap the bus speed and
> thereby the payload size to 1024 or 512 bytes, but this is uncommon in
> practice.

Thorough as always, Stefan! :)

-- 
Jarod Wilson
ja...@redhat.com



Re: x32 is broken in 4.9-rc1 due to "x86/signal: Add SA_{X32,IA32}_ABI sa_flags"

2016-10-19 Thread Adam Borowski
On Thu, Oct 20, 2016 at 01:02:59AM +0300, Dmitry Safonov wrote:
> 2016-10-19 20:33 GMT+03:00 Mikulas Patocka :
> > On Wed, 19 Oct 2016, Mikulas Patocka wrote:
> >> In the kernel 4.9-rc1, the x32 support is seriously broken, a x32 process
> >> is killed with SIGKILL after returning from any signal handler.
> >
> > I should have said they are killed with SIGSEGV, not SIGKILL.
> >
> >> I use Debian sid x64-64 distribution with x32 architecture added from
> >> debian-ports.
> >>
> >> I bisected the bug and found out that it is caused by the patch
> >> 6846351052e685c2d1428e80ead2d7ca3d7ed913 ("x86/signal: Add
> >> SA_{X32,IA32}_ABI sa_flags").
> >
> > So, the kernel somehow thinks that it is i386 process, not x32 process. A
> > core dump of a real x32 process shows "Class: ELF32, Machine: Advanced
> > Micro Devices X86-64".
> 
> could you give attached patch a shot?
> In about 10 hours I'll be at work and will have debian-x32 install,
> but for now, I can't test it.
> Thanks again on catching that.
> 

> From a546f8da1d12676fe79c746d859eb1e17aa4c331 Mon Sep 17 00:00:00 2001
> From: Dmitry Safonov <0x7f454...@gmail.com>
> Date: Thu, 20 Oct 2016 00:53:08 +0300
> Subject: [PATCH] x86/signal: set SA_X32_ABI flag for x32 programs
> 
> For x32 programs cs register is __USER_CS, so it returns here
> unconditionally - remove this check completely here.
> 
> Fixes: commit 6846351052e6 ("x86/signal: Add SA_{X32,IA32}_ABI sa_flags")
> 
> Reported-by: Mikulas Patocka 
> Signed-off-by: Dmitry Safonov <0x7f454...@gmail.com>
> ---
>  arch/x86/kernel/signal_compat.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c
> index 40df33753bae..ec1f756f9dc9 100644
> --- a/arch/x86/kernel/signal_compat.c
> +++ b/arch/x86/kernel/signal_compat.c
> @@ -105,9 +105,6 @@ void sigaction_compat_abi(struct k_sigaction *act, struct 
> k_sigaction *oact)
>   /* Don't let flags to be set from userspace */
>   act->sa.sa_flags &= ~(SA_IA32_ABI | SA_X32_ABI);
>  
> - if (user_64bit_mode(current_pt_regs()))
> - return;
> -
>   if (in_ia32_syscall())
>   act->sa.sa_flags |= SA_IA32_ABI;
>   if (in_x32_syscall())
> -- 
> 2.10.0

Works for me.  Tested on general operation, a few by-hand checks and several
random package builds.

It'd be nice to check glibc's testsuite as well as it had recent regressions
caused by kernel changes on x32 (like https://bugs.debian.org/841240) but as
gcc-6 in sid is broken right now (fails to build kernel, glibc:amd64, etc),
I didn't bother that much.

Tested-by: Adam Borowski 

-- 
A MAP07 (Dead Simple) raspberry tincture recipe: 0.5l 95% alcohol, 1kg
raspberries, 0.4kg sugar; put into a big jar for 1 month.  Filter out and
throw away the fruits (can dump them into a cake, etc), let the drink age
at least 3-6 months.


Re: x32 is broken in 4.9-rc1 due to "x86/signal: Add SA_{X32,IA32}_ABI sa_flags"

2016-10-19 Thread Adam Borowski
On Thu, Oct 20, 2016 at 01:02:59AM +0300, Dmitry Safonov wrote:
> 2016-10-19 20:33 GMT+03:00 Mikulas Patocka :
> > On Wed, 19 Oct 2016, Mikulas Patocka wrote:
> >> In the kernel 4.9-rc1, the x32 support is seriously broken, a x32 process
> >> is killed with SIGKILL after returning from any signal handler.
> >
> > I should have said they are killed with SIGSEGV, not SIGKILL.
> >
> >> I use Debian sid x64-64 distribution with x32 architecture added from
> >> debian-ports.
> >>
> >> I bisected the bug and found out that it is caused by the patch
> >> 6846351052e685c2d1428e80ead2d7ca3d7ed913 ("x86/signal: Add
> >> SA_{X32,IA32}_ABI sa_flags").
> >
> > So, the kernel somehow thinks that it is i386 process, not x32 process. A
> > core dump of a real x32 process shows "Class: ELF32, Machine: Advanced
> > Micro Devices X86-64".
> 
> could you give attached patch a shot?
> In about 10 hours I'll be at work and will have debian-x32 install,
> but for now, I can't test it.
> Thanks again on catching that.
> 

> From a546f8da1d12676fe79c746d859eb1e17aa4c331 Mon Sep 17 00:00:00 2001
> From: Dmitry Safonov <0x7f454...@gmail.com>
> Date: Thu, 20 Oct 2016 00:53:08 +0300
> Subject: [PATCH] x86/signal: set SA_X32_ABI flag for x32 programs
> 
> For x32 programs cs register is __USER_CS, so it returns here
> unconditionally - remove this check completely here.
> 
> Fixes: commit 6846351052e6 ("x86/signal: Add SA_{X32,IA32}_ABI sa_flags")
> 
> Reported-by: Mikulas Patocka 
> Signed-off-by: Dmitry Safonov <0x7f454...@gmail.com>
> ---
>  arch/x86/kernel/signal_compat.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c
> index 40df33753bae..ec1f756f9dc9 100644
> --- a/arch/x86/kernel/signal_compat.c
> +++ b/arch/x86/kernel/signal_compat.c
> @@ -105,9 +105,6 @@ void sigaction_compat_abi(struct k_sigaction *act, struct 
> k_sigaction *oact)
>   /* Don't let flags to be set from userspace */
>   act->sa.sa_flags &= ~(SA_IA32_ABI | SA_X32_ABI);
>  
> - if (user_64bit_mode(current_pt_regs()))
> - return;
> -
>   if (in_ia32_syscall())
>   act->sa.sa_flags |= SA_IA32_ABI;
>   if (in_x32_syscall())
> -- 
> 2.10.0

Works for me.  Tested on general operation, a few by-hand checks and several
random package builds.

It'd be nice to check glibc's testsuite as well as it had recent regressions
caused by kernel changes on x32 (like https://bugs.debian.org/841240) but as
gcc-6 in sid is broken right now (fails to build kernel, glibc:amd64, etc),
I didn't bother that much.

Tested-by: Adam Borowski 

-- 
A MAP07 (Dead Simple) raspberry tincture recipe: 0.5l 95% alcohol, 1kg
raspberries, 0.4kg sugar; put into a big jar for 1 month.  Filter out and
throw away the fruits (can dump them into a cake, etc), let the drink age
at least 3-6 months.


[PATCH 0/1] mm/hugetlb: fix huge page reservation leak in private mapping error paths

2016-10-19 Thread Mike Kravetz
This issue was discovered by Jan Stancek as described in
https://lkml.kernel.org/r/57ff7bb4.1070...@redhat.com

Error paths in hugetlb_cow() and hugetlb_no_page() do not properly clean
up reservation entries when freeing a newly allocated huge page.  This
issue was introduced with commit 67961f9db8c4 ("mm/hugetlb: fix huge page
reserve accounting for private mappings).  That commit uses the information
in private mapping reserve maps to determine if a reservation was already
consumed.  This is important in the case of hole punch and truncate as the
pages are released, but reservation entries are not restored.

This patch restores the reserve entries in hugetlb_cow and hugetlb_no_page
such that reserve entries are consistent with the global reservation count.

The huge page reservation code is quite hard to follow, and this patch
makes it even more complex.  One thought I had was to change the way
hole punch and truncate work so that private mapping pages are not thrown
away.  This would eliminate the need for this patch as well as 67961f9db8c4.
It would change the existing semantics (as seen by the user) in this area,
but I believe the documentation (man pages) say the behavior is unspecified.
This could be a future change as well as rewriting the existing reservation
code to make it easier to understand/maintain.  Thoughts?

In any case, this patch addresses the immediate issue.

Mike Kravetz (1):
  mm/hugetlb: fix huge page reservation leak in private mapping error
paths

 mm/hugetlb.c | 66 
 1 file changed, 66 insertions(+)

-- 
2.7.4



[PATCH 0/1] mm/hugetlb: fix huge page reservation leak in private mapping error paths

2016-10-19 Thread Mike Kravetz
This issue was discovered by Jan Stancek as described in
https://lkml.kernel.org/r/57ff7bb4.1070...@redhat.com

Error paths in hugetlb_cow() and hugetlb_no_page() do not properly clean
up reservation entries when freeing a newly allocated huge page.  This
issue was introduced with commit 67961f9db8c4 ("mm/hugetlb: fix huge page
reserve accounting for private mappings).  That commit uses the information
in private mapping reserve maps to determine if a reservation was already
consumed.  This is important in the case of hole punch and truncate as the
pages are released, but reservation entries are not restored.

This patch restores the reserve entries in hugetlb_cow and hugetlb_no_page
such that reserve entries are consistent with the global reservation count.

The huge page reservation code is quite hard to follow, and this patch
makes it even more complex.  One thought I had was to change the way
hole punch and truncate work so that private mapping pages are not thrown
away.  This would eliminate the need for this patch as well as 67961f9db8c4.
It would change the existing semantics (as seen by the user) in this area,
but I believe the documentation (man pages) say the behavior is unspecified.
This could be a future change as well as rewriting the existing reservation
code to make it easier to understand/maintain.  Thoughts?

In any case, this patch addresses the immediate issue.

Mike Kravetz (1):
  mm/hugetlb: fix huge page reservation leak in private mapping error
paths

 mm/hugetlb.c | 66 
 1 file changed, 66 insertions(+)

-- 
2.7.4



[PATCH 1/1] mm/hugetlb: fix huge page reservation leak in private mapping error paths

2016-10-19 Thread Mike Kravetz
Error paths in hugetlb_cow() and hugetlb_no_page() may free a newly
allocated huge page.  If a reservation was associated with the huge
page, alloc_huge_page() consumed the reservation while allocating.
When the newly allocated page is freed in free_huge_page(), it will
increment the global reservation count.  However, the reservation entry
in the reserve map will remain.  This is not an issue for shared
mappings as the entry in the reserve map indicates a reservation exists.
But, an entry in a private mapping reserve map indicates the reservation
was consumed and no longer exists.  This results in an inconsistency
between the reserve map and the global reservation count.  This 'leaks'
a reserved huge page.

Create a new routine restore_reserve_on_error() to restore the reserve
entry in these specific error paths.  This routine makes use of a new
function vma_add_reservation() which will add a reserve entry for a
specific address/page.

In general, these error paths were rarely (if ever) taken on most
architectures.  However, powerpc contained arch specific code that
that resulted in an extra fault and execution of these error paths
on all private mappings.

Fixes: 67961f9db8c4 ("mm/hugetlb: fix huge page reserve accounting for private 
mappings)

Cc: sta...@vger.kernel.org
Reported-by: Jan Stancek 
Signed-off-by: Mike Kravetz 
---
 mm/hugetlb.c | 66 
 1 file changed, 66 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ec49d9e..418bf01 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1826,11 +1826,17 @@ static void return_unused_surplus_pages(struct hstate 
*h,
  * is not the case is if a reserve map was changed between calls.  It
  * is the responsibility of the caller to notice the difference and
  * take appropriate action.
+ *
+ * vma_add_reservation is used in error paths where a reservation must
+ * be restored when a newly allocated huge page must be freed.  It is
+ * to be called after calling vma_needs_reservation to determine if a
+ * reservation exists.
  */
 enum vma_resv_mode {
VMA_NEEDS_RESV,
VMA_COMMIT_RESV,
VMA_END_RESV,
+   VMA_ADD_RESV,
 };
 static long __vma_reservation_common(struct hstate *h,
struct vm_area_struct *vma, unsigned long addr,
@@ -1856,6 +1862,14 @@ static long __vma_reservation_common(struct hstate *h,
region_abort(resv, idx, idx + 1);
ret = 0;
break;
+   case VMA_ADD_RESV:
+   if (vma->vm_flags & VM_MAYSHARE)
+   ret = region_add(resv, idx, idx + 1);
+   else {
+   region_abort(resv, idx, idx + 1);
+   ret = region_del(resv, idx, idx + 1);
+   }
+   break;
default:
BUG();
}
@@ -1903,6 +1917,56 @@ static void vma_end_reservation(struct hstate *h,
(void)__vma_reservation_common(h, vma, addr, VMA_END_RESV);
 }
 
+static long vma_add_reservation(struct hstate *h,
+   struct vm_area_struct *vma, unsigned long addr)
+{
+   return __vma_reservation_common(h, vma, addr, VMA_ADD_RESV);
+}
+
+/*
+ * This routine is called to restore a reservation on error paths.  In the
+ * specific error paths, a huge page was allocated (via alloc_huge_page)
+ * and is about to be freed.  If a reservation for the page existed,
+ * alloc_huge_page would have consumed the reservation and set PagePrivate
+ * in the newly allocated page.  When the page is freed via free_huge_page,
+ * the global reservation count will be incremented if PagePrivate is set.
+ * However, free_huge_page can not adjust the reserve map.  Adjust the
+ * reserve map here to be consistent with global reserve count adjustments
+ * to be made by free_huge_page.
+ */
+static void restore_reserve_on_error(struct hstate *h,
+   struct vm_area_struct *vma, unsigned long address,
+   struct page *page)
+{
+   if (unlikely(PagePrivate(page))) {
+   long rc = vma_needs_reservation(h, vma, address);
+
+   if (unlikely(rc < 0)) {
+   /*
+* Rare out of memory condition in reserve map
+* manipulation.  Clear PagePrivate so that
+* global reserve count will not be incremented
+* by free_huge_page.  This will make it appear
+* as though the reservation for this page was
+* consumed.  This may prevent the task from
+* faulting in the page at a later time.  This
+* is better than inconsistent global huge page
+* accounting of reserve counts.
+*/
+   ClearPagePrivate(page);
+   } else if (rc) {

[PATCH 1/1] mm/hugetlb: fix huge page reservation leak in private mapping error paths

2016-10-19 Thread Mike Kravetz
Error paths in hugetlb_cow() and hugetlb_no_page() may free a newly
allocated huge page.  If a reservation was associated with the huge
page, alloc_huge_page() consumed the reservation while allocating.
When the newly allocated page is freed in free_huge_page(), it will
increment the global reservation count.  However, the reservation entry
in the reserve map will remain.  This is not an issue for shared
mappings as the entry in the reserve map indicates a reservation exists.
But, an entry in a private mapping reserve map indicates the reservation
was consumed and no longer exists.  This results in an inconsistency
between the reserve map and the global reservation count.  This 'leaks'
a reserved huge page.

Create a new routine restore_reserve_on_error() to restore the reserve
entry in these specific error paths.  This routine makes use of a new
function vma_add_reservation() which will add a reserve entry for a
specific address/page.

In general, these error paths were rarely (if ever) taken on most
architectures.  However, powerpc contained arch specific code that
that resulted in an extra fault and execution of these error paths
on all private mappings.

Fixes: 67961f9db8c4 ("mm/hugetlb: fix huge page reserve accounting for private 
mappings)

Cc: sta...@vger.kernel.org
Reported-by: Jan Stancek 
Signed-off-by: Mike Kravetz 
---
 mm/hugetlb.c | 66 
 1 file changed, 66 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ec49d9e..418bf01 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1826,11 +1826,17 @@ static void return_unused_surplus_pages(struct hstate 
*h,
  * is not the case is if a reserve map was changed between calls.  It
  * is the responsibility of the caller to notice the difference and
  * take appropriate action.
+ *
+ * vma_add_reservation is used in error paths where a reservation must
+ * be restored when a newly allocated huge page must be freed.  It is
+ * to be called after calling vma_needs_reservation to determine if a
+ * reservation exists.
  */
 enum vma_resv_mode {
VMA_NEEDS_RESV,
VMA_COMMIT_RESV,
VMA_END_RESV,
+   VMA_ADD_RESV,
 };
 static long __vma_reservation_common(struct hstate *h,
struct vm_area_struct *vma, unsigned long addr,
@@ -1856,6 +1862,14 @@ static long __vma_reservation_common(struct hstate *h,
region_abort(resv, idx, idx + 1);
ret = 0;
break;
+   case VMA_ADD_RESV:
+   if (vma->vm_flags & VM_MAYSHARE)
+   ret = region_add(resv, idx, idx + 1);
+   else {
+   region_abort(resv, idx, idx + 1);
+   ret = region_del(resv, idx, idx + 1);
+   }
+   break;
default:
BUG();
}
@@ -1903,6 +1917,56 @@ static void vma_end_reservation(struct hstate *h,
(void)__vma_reservation_common(h, vma, addr, VMA_END_RESV);
 }
 
+static long vma_add_reservation(struct hstate *h,
+   struct vm_area_struct *vma, unsigned long addr)
+{
+   return __vma_reservation_common(h, vma, addr, VMA_ADD_RESV);
+}
+
+/*
+ * This routine is called to restore a reservation on error paths.  In the
+ * specific error paths, a huge page was allocated (via alloc_huge_page)
+ * and is about to be freed.  If a reservation for the page existed,
+ * alloc_huge_page would have consumed the reservation and set PagePrivate
+ * in the newly allocated page.  When the page is freed via free_huge_page,
+ * the global reservation count will be incremented if PagePrivate is set.
+ * However, free_huge_page can not adjust the reserve map.  Adjust the
+ * reserve map here to be consistent with global reserve count adjustments
+ * to be made by free_huge_page.
+ */
+static void restore_reserve_on_error(struct hstate *h,
+   struct vm_area_struct *vma, unsigned long address,
+   struct page *page)
+{
+   if (unlikely(PagePrivate(page))) {
+   long rc = vma_needs_reservation(h, vma, address);
+
+   if (unlikely(rc < 0)) {
+   /*
+* Rare out of memory condition in reserve map
+* manipulation.  Clear PagePrivate so that
+* global reserve count will not be incremented
+* by free_huge_page.  This will make it appear
+* as though the reservation for this page was
+* consumed.  This may prevent the task from
+* faulting in the page at a later time.  This
+* is better than inconsistent global huge page
+* accounting of reserve counts.
+*/
+   ClearPagePrivate(page);
+   } else if (rc) {
+   rc = 

  1   2   3   4   5   6   7   8   9   10   >