[PULL REQUEST] i2c-mux for 4.16-rc1

2017-12-30 Thread Peter Rosin
Hi Wolfram,

A couple of patches this time. Just some more compatibles for the pca954x
driver and an error handling tweak for the reg driver.

Please pull!

Cheers,
Peter

The following changes since commit ae64f9bd1d3621b5e60d7363bc20afb46aede215:

  Linux 4.15-rc2 (2017-12-03 11:01:47 -0500)

are available in the git repository at:

  https://github.com/peda-r/i2c-mux.git i2c-mux/for-next

for you to fetch changes up to ac5b85de17cb96445c51bd1a1c53c3f675582f26:

  i2c: mux: reg: don't log an error for probe deferral (2017-12-30 23:12:34 +0100)


Adrian Fiergolski (1):
  i2c: mux: pca954x: add support for NXP PCA984x family

Tomasz Bachorski (1):
  i2c: mux: reg: don't log an error for probe deferral

 .../devicetree/bindings/i2c/i2c-mux-pca954x.txt | 13 ++--
 drivers/i2c/muxes/Kconfig                       |  6 ++--
 drivers/i2c/muxes/i2c-mux-pca954x.c             | 38 +++---
 drivers/i2c/muxes/i2c-mux-reg.c                 |  3 ++
 4 files changed, 51 insertions(+), 9 deletions(-)


Re: [PATCH v2 1/2] kbuild: Require a 'make clean' if we detect gcc changed underneath us

2017-12-30 Thread Masahiro Yamada
Hi Douglas,

2017-12-23 7:14 GMT+09:00 Douglas Anderson:
> Several people reported that the commit 3298b690b21c ("kbuild: Add a
> cache for generated variables") caused them problems when they updated
> gcc versions.  Specifically the reports all looked something similar
> to this:
>
>> In file included from ./include/uapi/linux/uuid.h:21:0,
>>  from ./include/linux/uuid.h:19,
>>  from ./include/linux/mod_devicetable.h:12,
>>  from scripts/mod/devicetable-offsets.c:2:
>> ./include/linux/string.h:8:20: fatal error: stdarg.h: No such file or
>>  directory
>>  #include <stdarg.h>
>
> Masahiro Yamada determined that the problem was with:
>
>   NOSTDINC_FLAGS += -nostdinc -isystem $(call shell-cached,$(CC)
>   -print-file-name=include)
>
> Specifically that the stale result of -print-file-name is stored in
> the cache file.  It was determined that a "make clean" fixed the
> problems in all cases.
>
> In this particular case we could certainly try to clean just the cache
> when we detect a gcc update, but it seems like overall it's a bad idea
> to do an incremental build when gcc changes.  We should warn the user
> and tell them that they need a 'make clean'.
>
> Fixes: 3298b690b21c ("kbuild: Add a cache for generated variables")
> Reported-by: Yang Shi 
> Reported-by: Dave Hansen 
> Reported-by: Mathieu Malaterre 
> Signed-off-by: Douglas Anderson 
> ---
>
> Changes in v2:
> - Don't error if MAKECMDGOALS is blank.
>
>  scripts/Kbuild.include | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
> index 065324a8046f..f7efb59d85d1 100644
> --- a/scripts/Kbuild.include
> +++ b/scripts/Kbuild.include
> @@ -222,6 +222,10 @@ cc-version = $(call shell-cached,$(CONFIG_SHELL) 
> $(srctree)/scripts/gcc-version.
>  cc-fullversion = $(call shell-cached,$(CONFIG_SHELL) \
> $(srctree)/scripts/gcc-version.sh -p $(CC))
>
> +# cc-fullversion-uncached
> +cc-fullversion-uncached := $(shell $(CONFIG_SHELL) \
> +   $(srctree)/scripts/gcc-version.sh -p $(CC))
> +

No.
You used the ':=' flavor of assignment, so the gcc-version.sh
script is evaluated right here, at definition time.

The top-level Makefile includes scripts/Kbuild.include at line 278,
then defines CC at line 284.

Since $(CC) is not defined yet at this point,
the resulting cc-fullversion-uncached is
"Error: No compiler specified. Usage:  ./scripts/gcc-version.sh "


cc-fullversion-uncached works as expected
only after Kbuild descends into sub-directories.






>  # cc-ifversion
>  # Usage:  EXTRA_CFLAGS += $(call cc-ifversion, -lt, 0402, -O1)
>  cc-ifversion = $(shell [ $(cc-version) $(1) $(2) ] && echo $(3) || echo $(4))
> @@ -475,3 +479,16 @@ endif
>  endef
>  #
>  
> ###
> +
> +# Require a 'make clean' if the compiler changed; not only does the .cache.mk
> +# need to be thrown out but we should also start with fresh object files.
> +#
> +# NOTE: it's important that we don't error out when the goal is actually to
> +# try to make clean, distclean or mrproper.
> +ifeq ($(filter %clean,$(MAKECMDGOALS))$(filter mrproper,$(MAKECMDGOALS)),)
> +  ifneq ($(MAKECMDGOALS),)
> +ifneq ($(cc-fullversion-uncached),$(cc-fullversion))


As I noted above, CC is not defined yet when parsing the top-level Makefile.

Evaluating "cc-fullversion" here
adds a strange cache line, as follows:

__cached__./scripts/gcc-version.sh_-p_ := Error: No compiler
specified. Usage:  ./scripts/gcc-version.sh 


This check works only in the second inclusion or later.

At the first inclusion,
both 'cc-fullversion-uncached' and 'cc-fullversion'
contain "Error: No compiler specified. ..."


scripts/Kbuild.include is included every time Kbuild
descends into a sub-directory.

It is pointless to check the gcc version multiple times.



> +  $(error Detected new CC version ($(cc-fullversion-uncached) vs 
> $(cc-fullversion)).  Please 'make clean')
> +endif
> +  endif
> +endif
> --
> 2.15.1.620.gb9897f4670-goog
>

So I recommend moving this code into the top-level Makefile,
around line 585.


--- a/Makefile
+++ b/Makefile
@@ -585,6 +585,14 @@ virt-y := virt/
 endif # KBUILD_EXTMOD

 ifeq ($(dot-config),1)
+# Require a 'make clean' if the compiler changed; not only does the .cache.mk
+# need to be thrown out but we should also start with fresh object files.
+cc-fullversion-uncached := $(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-version.sh -p $(CC))
+
+ifneq ($(cc-fullversion-uncached),$(cc-fullversion))
+   $(error Detected new CC version ($(cc-fullversion-uncached) vs $(cc-fullversion)).  Please 'make clean')
+endif
+
 # Read in config
 -include include/config/auto.conf




-- 
Best Regards
Masahiro Yamada


Re: `pci_apply_final_quirks()` taking half a second

2017-12-30 Thread Paul Menzel



On 29.12.2017 at 17:14, Alan Stern wrote:
> On Thu, 28 Dec 2017, Bjorn Helgaas wrote:
>> On Tue, Dec 26, 2017 at 04:55:20PM +0100, Paul Menzel wrote:
>>> On 08.04.2017 at 17:41, Bjorn Helgaas wrote:
>>>> On Fri, Apr 07, 2017 at 11:07:15PM +0200, Paul Menzel wrote:
>>>>>
>>>>> Measuring where time is spent during boot with `systemd-bootchart`
>>>>> on an Asus A780FullHD, it turns out that half a second is spent in
>>>>> `pci_apply_final_quirks()`.
>>>>
>>>> I agree, that seems like a crazy amount of time.
>>>>
>>>> Can you figure out how to turn on pr_debug() (via the dynamic debug
>>>> mess or whatever) and boot with "initcall_debug"?  That should tell us
>>>> how long each quirk took.
>>>
>>> I am sorry for taking so long to reply. I finally added `dyndbg=file
>>> quirks.c +p` to the command line of Linux 4.13.13. This is on
>>> another AMD system (Asus F2A85-M Pro).
>>>
>>> ```
>>> Dez 26 16:21:46 asus-f2a85-pro kernel: pci fixup quirk_usb_early_handoff+0x0/0x6b0 returned after 197 usecs for :00:10.0
>>> Dez 26 16:21:46 asus-f2a85-pro kernel: pci fixup quirk_usb_early_handoff+0x0/0x6b0 returned after 127 usecs for :00:10.1
>>> Dez 26 16:21:46 asus-f2a85-pro kernel: pci fixup quirk_usb_early_handoff+0x0/0x6b0 returned after 88643 usecs for :00:12.0
>>> Dez 26 16:21:46 asus-f2a85-pro kernel: pci fixup quirk_usb_early_handoff+0x0/0x6b0 returned after 137 usecs for :00:12.2
>>> Dez 26 16:21:46 asus-f2a85-pro kernel: pci fixup pci_fixup_amd_ehci_pme+0x0/0x30 returned after 1 usecs for :00:12.2
>>> Dez 26 16:21:46 asus-f2a85-pro kernel: pci fixup quirk_usb_early_handoff+0x0/0x6b0 returned after 85770 usecs for :00:13.0
>>> Dez 26 16:21:46 asus-f2a85-pro kernel: pci fixup quirk_usb_early_handoff+0x0/0x6b0 returned after 134 usecs for :00:13.2
>>> Dez 26 16:21:46 asus-f2a85-pro kernel: pci fixup pci_fixup_amd_ehci_pme+0x0/0x30 returned after 1 usecs for :00:13.2
>>> Dez 26 16:21:46 asus-f2a85-pro kernel: pci fixup quirk_usb_early_handoff+0x0/0x6b0 returned after 125 usecs for :03:00.0
>>> […]
>>> ```
>>>
>>> So it’s `pci fixup quirk_usb_early_handoff` taking around 85 ms, and
>>> that twice.
>>
>> Wow.  That's pretty painful, but of course I don't know how to fix it.
>> From looking at quirk_usb_early_handoff(), it may depend on BIOS
>> details.  Maybe the USB folks will have some ideas.
>
> Can we see the output from lspci?  It would help to know what the 12.0
> and 13.0 devices are.


Sorry, that was trimmed from the original message. Here is the output
from the ASRock A780FullHD.



```
$ more /proc/version
Linux version 4.9.0-0.bpo.2-amd64 (debian-ker...@lists.debian.org) (gcc version 4.9.2 (Debian 4.9.2-10) ) #1 SMP Debian 4.9.13-1~bpo8+1 (2017-02-27)
$ lspci -nn
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] RS780 Host Bridge [1022:9600]
00:01.0 PCI bridge [0604]: ASRock Incorporation Device [1849:9602]
00:09.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] RS780/RS880 PCI to PCI bridge (PCIE port 4) [1022:9608]
00:0a.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] RS780/RS880 PCI to PCI bridge (PCIE port 5) [1022:9609]
00:11.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] [1002:4391]
00:12.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:12.1 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 Controller [1002:4398]
00:12.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:13.1 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 Controller [1002:4398]
00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller [1002:4385] (rev 3a)
00:14.1 IDE interface [0101]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 IDE Controller [1002:439c]
00:14.2 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) [1002:4383]
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller [1002:439d]
00:14.4 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge [1002:4384]
00:14.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration [1022:1100]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] Address Map [1022:1101]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] DRAM Controller [1022:1102]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] Miscellaneous Control [1022:1103]
01:05.0
```

[PATCH] [RFC] mm: page_alloc: skip over regions of invalid pfns on UMA

2017-12-30 Thread Eugeniu Rosca
As a result of bisecting the v4.10..v4.11 commit range, it was
determined that commits [1] and [2] are both responsible for a ~170ms
early startup improvement on the Rcar-H3-ES20 arm64 platform.

Since the Rcar Gen3 family is not NUMA, we don't define CONFIG_NUMA in
the rcar3 defconfig. Optimization [2] is guarded by
CONFIG_HAVE_MEMBLOCK_NODE_MAP, so without NUMA it is compiled out and
the boot time improvement is lost.

Make optimization [2] available on arm64 UMA systems and reduce the
time spent in memmap_init_zone() from 201ms to 34ms. When testing this
change on an Apollo Lake SoC, boot time didn't change.

[1] commit 0f84832fb8f9 ("arm64: defconfig: Enable NUMA and NUMA_BALANCING")
[2] commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
[3] 201ms spent in memmap_init_zone() on H3ULCB before this patch (NUMA not set)
[2.048087] On node 0 totalpages: 1003520
[2.048091]   DMA zone: 3392 pages used for memmap
[2.048094]   DMA zone: 0 pages reserved
[2.048096]   DMA zone: 217088 pages, LIFO batch:31
[2.048099] memmap_init_zone: start
[2.068881] memmap_init_zone: end
[2.068884]   Normal zone: 12288 pages used for memmap
[2.06]   Normal zone: 786432 pages, LIFO batch:31
[2.068890] memmap_init_zone: start
[2.249791] memmap_init_zone: end
[2.249824] psci: probing for conduit method from DT.

[4] 34ms spent in memmap_init_zone() on H3ULCB after this patch
[2.072935] On node 0 totalpages: 1003520
[2.072940]   DMA zone: 3392 pages used for memmap
[2.072942]   DMA zone: 0 pages reserved
[2.072945]   DMA zone: 217088 pages, LIFO batch:31
[2.072948] memmap_init_zone: start
[2.080442] memmap_init_zone: end
[2.080446]   Normal zone: 12288 pages used for memmap
[2.080449]   Normal zone: 786432 pages, LIFO batch:31
[2.080451] memmap_init_zone: start
[2.107935] memmap_init_zone: end
[2.107965] psci: probing for conduit method from DT.

Signed-off-by: Eugeniu Rosca 
---
 include/linux/memblock.h | 3 ++-
 mm/memblock.c            | 2 ++
 mm/page_alloc.c          | 2 --
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 7ed0f7782d16..876c0a334164 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -182,12 +182,13 @@ static inline bool memblock_is_nomap(struct memblock_region *m)
 	return m->flags & MEMBLOCK_NOMAP;
 }
 
+unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
+
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
 			    unsigned long  *end_pfn);
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
 			  unsigned long *out_end_pfn, int *out_nid);
-unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
 
 /**
  * for_each_mem_pfn_range - early memory pfn range iterator
diff --git a/mm/memblock.c b/mm/memblock.c
index 46aacdfa4f4d..ad48cf200e3b 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1100,6 +1100,7 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
 	if (out_nid)
 		*out_nid = r->nid;
 }
+#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
 unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
 						      unsigned long max_pfn)
@@ -1129,6 +1130,7 @@ unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
 		return min(PHYS_PFN(type->regions[right].base), max_pfn);
 }
 
+#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 /**
  * memblock_set_node - set node ID on memblock regions
  * @base: base of area to set node ID for
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7e5e775e97f4..defd5ef08c54 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5344,14 +5344,12 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			goto not_early;
 
 		if (!early_pfn_valid(pfn)) {
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 			/*
 			 * Skip to the pfn preceding the next valid one (or
 			 * end_pfn), such that we hit a valid pfn (or end_pfn)
 			 * on our next iteration of the loop.
 			 */
 			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
-#endif
 			continue;
 		}
 		if (!early_pfn_in_nid(pfn, nid))
-- 
2.14.2



[GIT] Sparc

2017-12-30 Thread David Miller

Please pull to get this sparc64 bug fix.

Thank you!

The following changes since commit 5f520fc318764df800789edd202b5e3b55130613:

  Merge tag 'trace-v4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace (2017-12-27 13:06:57 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 

for you to fetch changes up to 59585b4be9ae4dc6506551709bdcd6f5210b8a01:

  sparc64: repair calling incorrect hweight function from stubs (2017-12-27 20:29:48 -0500)


Jan Engelhardt (1):
  sparc64: repair calling incorrect hweight function from stubs

 arch/sparc/lib/hweight.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


Re: [PATCH net] ethtool: do not print warning for applications using legacy API

2017-12-30 Thread David Decotigny
Signed-off-by: David Decotigny 


On Fri, Dec 29, 2017 at 10:02 AM, Stephen Hemminger
 wrote:
> From: Stephen Hemminger 
>
> In the kernel log this message appears on every boot:
>  "warning: `NetworkChangeNo' uses legacy ethtool link settings API,
>   link modes are only partially reported"
>
> When the ethtool link settings API changed, it started complaining about
> usages of the old API. Ironically, the original patch was from Google but
> the application using the legacy API is Chrome.
>
> The Linux ABI is fixed as much as possible. The kernel must not break it
> and should not complain about applications using legacy APIs.
> This patch just removes the warning since using legacy APIs
> in Linux is perfectly acceptable.
>
> Fixes: 3f1ac7a700d0 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
> Signed-off-by: Stephen Hemminger 
> ---
>  net/core/ethtool.c | 15 ++-
>  1 file changed, 2 insertions(+), 13 deletions(-)
>
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index f8fcf450a36e..8225416911ae 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -770,15 +770,6 @@ static int ethtool_set_link_ksettings(struct net_device *dev,
> return dev->ethtool_ops->set_link_ksettings(dev, &link_ksettings);
>  }
>
> -static void
> -warn_incomplete_ethtool_legacy_settings_conversion(const char *details)
> -{
> -   char name[sizeof(current->comm)];
> -
> -   pr_info_once("warning: `%s' uses legacy ethtool link settings API, %s\n",
> -                get_task_comm(name, current), details);
> -}
> -
>  /* Query device for its ethtool_cmd settings.
>   *
>   * Backward compatibility note: for compatibility with legacy ethtool,
> @@ -805,10 +796,8 @@ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr)
>                                                    &link_ksettings);
> if (err < 0)
> return err;
> -   if (!convert_link_ksettings_to_legacy_settings(&cmd,
> -                                                  &link_ksettings))
> -   warn_incomplete_ethtool_legacy_settings_conversion(
> -   "link modes are only partially reported");
> +   convert_link_ksettings_to_legacy_settings(&cmd,
> +                                             &link_ksettings);
>
> /* send a sensible cmd tag back to user */
> cmd.cmd = ETHTOOL_GSET;
> --
> 2.11.0
>


Re: [PATCH 2/2] clk: qcom: Configure the RCGs to a safe source as needed

2017-12-30 Thread kbuild test robot
Hi Amit,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on clk/clk-next]
[also build test ERROR on v4.15-rc5 next-20171222]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Amit-Nischal/clk-qcom-MISC-RCG-changes-for-SDM845/20171218-013958
base:   https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git clk-next
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm

All errors (new ones prefixed by >>):

   WARNING: modpost: missing MODULE_LICENSE() in arch/arm/common/bL_switcher_dummy_if.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/auxdisplay/img-ascii-lcd.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/cpufreq/mediatek-cpufreq.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/gpio/gpio-ath79.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/gpio/gpio-iop.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/iio/accel/kxsd9-i2c.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/iio/adc/qcom-vadc-common.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/mtk-vcodec/mtk-vcodec-common.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/soc_camera/soc_scale_crop.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/tegra-cec/tegra_cec.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/mmc/host/renesas_sdhi_core.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/mtd/nand/denali_pci.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/cirrus/cs89x0.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/phy/qualcomm/phy-qcom-ufs.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/pinctrl/pxa/pinctrl-pxa2xx.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/power/reset/zx-reboot.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/soc/qcom/rmtfs_mem.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/staging/comedi/drivers/ni_atmio.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/video/fbdev/mmp/mmp_disp.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in sound/soc/codecs/snd-soc-pcm512x-spi.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in sound/soc/ux500/snd-soc-ux500-mach-mop500.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in sound/soc/ux500/snd-soc-ux500-plat-dma.o
   see include/linux/module.h for more information
>> ERROR: "clk_hw_is_prepared" [drivers/clk/qcom/clk-qcom.ko] undefined!

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH v3 net-next 2/5] net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state tracepoint

2017-12-30 Thread Yafang Shao
On Sun, Dec 31, 2017 at 6:33 AM, Brendan Gregg wrote:
> On Tue, Dec 19, 2017 at 7:12 PM, Yafang Shao  wrote:
>> As sk_state is a common field for struct sock, so the state
>> transition tracepoint should not be a TCP specific feature.
>> Currently it traces all AF_INET state transition, so I rename this
>> tracepoint to inet_sock_set_state tracepoint with some minor changes and
>> move it into trace/events/sock.h.
>
> The tcp:tcp_set_state probe is tcp_set_state(), so it's only going to
> fire for TCP sessions. It's not broken, and we could add a
> sctp:sctp_set_state as well. Replacing tcp:tcp_set_state with
> inet_sk_set_state is feeling like we might be baking too much
> implementation detail into the tracepoint API.
>
> If we must have inet_sk_set_state, then must we also delete tcp:tcp_set_state?
>

Hi Brendan,

The reason for this change is explained in this mail thread:
https://patchwork.kernel.org/patch/10099243/ .

The original tcp:tcp_set_state probe doesn't trace all TCP state transitions.
Some state transitions in inet_connection_sock.c and inet_hashtables.c
are missed.
So we have to place this probe into these two files to fix the issue.
But inet_connection_sock.c and inet_hashtables.c are common files
for all IPv4 protocols, not only for TCP, so it is not proper to place
a tcp_ function in these two files.
That's why we decided to rename the tcp:tcp_set_state probe to
sock:inet_sock_set_state.

Thanks
Yafang
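
The layering argument above can be sketched in user-space C. This is only an illustration under stated assumptions: struct sock_sketch, trace_hits and every *_sketch function are hypothetical stand-ins, not kernel code. The point is that one protocol-neutral helper owns both the state update and the trace hook, so shared code never has to call a tcp_-prefixed function.

```c
#include <stdio.h>

/* Hypothetical stand-ins for kernel types; not the real struct sock. */
enum proto_sketch { PROTO_TCP, PROTO_SCTP };

struct sock_sketch {
    enum proto_sketch protocol;
    int state;
};

/* Counts how many times the simulated tracepoint fired. */
static int trace_hits;

/* Stand-in for a tracepoint: records old and new state for any protocol. */
static void trace_inet_sock_set_state_sketch(const struct sock_sketch *sk,
                                             int oldstate, int newstate)
{
    trace_hits++;
    printf("proto=%d state %d -> %d\n", (int)sk->protocol, oldstate, newstate);
}

/* Protocol-neutral helper: every state change goes through here, so code
 * shared between protocols can use it directly. */
static void inet_sk_set_state_sketch(struct sock_sketch *sk, int newstate)
{
    trace_inet_sock_set_state_sketch(sk, sk->state, newstate);
    sk->state = newstate;
}

/* TCP-specific wrapper: TCP-only bookkeeping stays here, the trace hook
 * stays in the common helper. */
static void tcp_set_state_sketch(struct sock_sketch *sk, int newstate)
{
    /* TCP-only accounting would go here. */
    inet_sk_set_state_sketch(sk, newstate);
}
```

With this shape, a TCP path calls tcp_set_state_sketch() and a shared path calls inet_sk_set_state_sketch(); both reach the same trace hook.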


Re: general protection fault in skb_segment

2017-12-30 Thread Marcelo Ricardo Leitner
On Sat, Dec 30, 2017 at 10:52:20PM -0200, Marcelo Ricardo Leitner wrote:
> On Sat, Dec 30, 2017 at 08:42:41AM +0100, Willem de Bruijn wrote:
[...]
> > Somewhat tangential, but any PF_PACKET socket can set this
> > magic gso_size value in its virtio_net_hdr, so if it is assumed to
> > be an SCTP GSO specific option, setting it for a TCP GSO packet
> > may also cause unexpected results.
> 
> It seems virtio_net could use more sanity checks. When PACKET_VNET_HDR
> is used, it will end up calling:
> tpacket_rcv() {
> ...
> if (do_vnet) {
> if (virtio_net_hdr_from_skb(skb, h.raw + macoff -
> sizeof(struct virtio_net_hdr),
> vio_le(), true)) {
> spin_lock(&sk->sk_receive_queue.lock);
> goto drop_n_account;
> }
> }
> 
> and virtio_net_hdr_from_skb does:
> if (skb_is_gso(skb)) {
> ...
> if (sinfo->gso_type & SKB_GSO_TCPV4)
> hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> else if (sinfo->gso_type & SKB_GSO_TCPV6)
> hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> else
> return -EINVAL;
> 
> Meaning that any gso_type other than TCP would be rejected, but this
> SCTP one got through. Seems the packet contains an SCTP header, but the
> gso_type set was actually pointing to TCP (otherwise it would have
> been rejected). AFAICT if this packet had an ESP header, for example,
> it could have hit esp4_gso_segment. Can you please confirm this?

I added:
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -44,6 +44,18 @@ static struct sk_buff *sctp_gso_segment(struct sk_buff *skb,
 {
struct sk_buff *segs = ERR_PTR(-EINVAL);
struct sctphdr *sh;
+   int fail = 0;
+
+   if (!(skb_shinfo(skb)->gso_type & SKB_GSO_SCTP)) {
+   printk("Bogus gso_type: %x\n", skb_shinfo(skb)->gso_type);
+   fail = 1;
+   }
+   if (skb_shinfo(skb)->gso_size != GSO_BY_FRAGS) {
+   printk("Bogus gso_size: %u\n", skb_shinfo(skb)->gso_size);
+   fail = 1;
+   }
+   if (fail)
+   goto out;
 
sh = sctp_hdr(skb);
if (!pskb_may_pull(skb, sizeof(*sh)))

and with the reproducer, got:
[   54.255469] Bogus gso_type: 7
[   54.258801] Bogus gso_size: 63464
[   54.262532] ------------[ cut here ]------------
[   54.267703] syz0: caps=(0x080058c1, 0x) len=32 data_len=0 gso_size=63464 gso_type=7 ip_summed=0
[   54.279777] WARNING: CPU: 1 PID: 13005 at /root/linux/net/core/dev.c:2600 skb_warn_bad_offload+0xd6/0xec

gso_type 7 = SKB_GSO_TCPV4 | SKB_GSO_DODGY | SKB_GSO_TCP_ECN,
as the warning indicated too.

Once this gets to sctp_gso_segment, it's too late to avoid the
warning. Would be nice if we could somehow filter this earlier in the
process.

  Marcelo
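
The decomposition of gso_type 7 above can be checked with a small user-space decoder. This is a sketch, not kernel code: the SKETCH_GSO_* bit positions are an assumption chosen to agree with the decomposition given in the message (bits 0, 1 and 2), and gso_type_describe() is a hypothetical helper.

```c
#include <stdio.h>
#include <string.h>

/* Assumed bit values; they match 7 = TCPV4 | DODGY | TCP_ECN as stated
 * in the message, but are not copied from any particular kernel tree. */
#define SKETCH_GSO_TCPV4   (1u << 0)
#define SKETCH_GSO_DODGY   (1u << 1)
#define SKETCH_GSO_TCP_ECN (1u << 2)

/* Writes an "A | B | C" description of the recognised bits into buf
 * (len must be at least 1) and returns the bits that were not recognised. */
static unsigned int gso_type_describe(unsigned int gso_type, char *buf, size_t len)
{
    static const struct {
        unsigned int bit;
        const char *name;
    } names[] = {
        { SKETCH_GSO_TCPV4,   "SKB_GSO_TCPV4"   },
        { SKETCH_GSO_DODGY,   "SKB_GSO_DODGY"   },
        { SKETCH_GSO_TCP_ECN, "SKB_GSO_TCP_ECN" },
    };
    size_t i;

    buf[0] = '\0';
    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (!(gso_type & names[i].bit))
            continue;
        if (buf[0] != '\0')
            strncat(buf, " | ", len - strlen(buf) - 1);
        strncat(buf, names[i].name, len - strlen(buf) - 1);
        gso_type &= ~names[i].bit;  /* clear the bit once it is named */
    }
    return gso_type;
}
```

Calling gso_type_describe(7, buf, sizeof(buf)) leaves no unrecognised bits, which is consistent with the packet being tagged as TCP GSO rather than SCTP GSO.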


Re: [patch 0/3] x86/pti: Fix various fallout

2017-12-30 Thread Andy Lutomirski


> On Dec 30, 2017, at 2:06 PM, Linus Torvalds wrote:
> 
>> On Sat, Dec 30, 2017 at 1:35 PM, Ingo Molnar  wrote:
>> 
>> Linus, I suspect -rc6 is imminent, and it would be nice to at least have the
>> LDT error path fix in. I'll send you these fixes tomorrow, but feel free to
>> pick it up from email if you wanted to release -rc6 today.
> 
> I'll do rc6 tomorrow probably around this time (early afternoon PST).
> So I think I should be ok just waiting for your pull request.

FWIW, I think this bug is at worst just a memory leak.  Once an mm has any LDT
mapped, mapping a second one can't fail because the entire LDT area is under
512 pages, meaning that nothing needs to be allocated, so there's no
opportunity for failure.

So it's an embarrassing bug, but not catastrophic.

> 
> Thanks,
> 
>  Linus
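
The "under 512 pages" remark can be worked through as arithmetic. This is a sketch under assumptions that are not stated in the message: 4 KiB pages, an LDT of at most 8192 descriptors of 8 bytes each, two LDT slots in the mapped area, and 512 entries per page-table page. One reading of the remark is that a single page-table page covers the whole area, so mapping a second LDT needs no new page-table allocation. All SKETCH_* names are hypothetical.

```c
#define SKETCH_LDT_ENTRIES     8192u  /* maximum number of LDT descriptors */
#define SKETCH_LDT_ENTRY_SIZE  8u     /* bytes per descriptor */
#define SKETCH_PAGE_SIZE       4096u  /* assumption: 4 KiB pages */
#define SKETCH_PTES_PER_TABLE  512u   /* entries in one page-table page */
#define SKETCH_LDT_SLOTS       2u     /* assumption: two slots in the area */

/* Pages needed to map one maximum-size LDT. */
static unsigned int ldt_pages_per_slot(void)
{
    unsigned int bytes = SKETCH_LDT_ENTRIES * SKETCH_LDT_ENTRY_SIZE;

    return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Pages needed for the whole LDT area under the slot assumption. */
static unsigned int ldt_area_pages(void)
{
    return ldt_pages_per_slot() * SKETCH_LDT_SLOTS;
}

/* Non-zero if one page-table page can map the whole area. */
static int ldt_area_fits_one_page_table(void)
{
    return ldt_area_pages() < SKETCH_PTES_PER_TABLE;
}
```

Under these assumptions one LDT takes 16 pages and the area takes 32, well below 512.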


Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread

2017-12-30 Thread Sergey Senozhatsky
Hello,

On (12/29/17 22:59), Tetsuo Handa wrote:
[..]
> Just an idea: Do we really need to use a semaphore for console_sem?
> 
> Is it possible to replace it with a spinlock? Then, I feel that we can write
> to consoles from non-process context (i.e. soft or hard IRQ context), with
> write only one log (or even one byte) at a time (i.e. write one log from one
> context, and defer all remaining logs by "somehow" scheduling for calling
> that context again).
> 
> Since process context might fail to allow printk kernel thread to run for
> long period due to many threads waiting for run, I thought that interrupt
> context might fit better if we can "somehow" chain interrupt contexts.


that's a good question. printk(), indeed, does not care that much. but
the whole thing is more complex. I can copy-paste (sorry for that) one
of my previous emails to give a brief (I'm sure the description is
incomplete) idea.



the real purpose of console_sem is to synchronize all events that can
happen to VT, fbcon, TTY, video, etc. and there are many events that
can happen to VT/fbcon. and some of those events can sleep - that's
where printk() can suffer. and this is why printk() is not different
from any other console_sem user -- printk() uses that lock in order
to synchronize its own events: to have only one printing CPU, to prevent
concurrent console drivers list modification, to prevent concurrent consoles
modification, and so on.

let's take VT and fbcon for simplicity.

the events are:

1) IOCTL from user space
   they may involve things like resizing, scrolling, rotating,

   take a look at drivers/tty/vt/vt_ioctl.c  vt_ioctl().
   we need to take console_sem there because we modify the very
   important things - size, font maps, etc. we don't want those changes
   to mess with possibly active print outs happening from another CPUs.

2) timer events and workqueue events
   even cursor blinking must take console_sem. because it modifies the
   state of console/screen. take a look at drivers/video/fbdev/core/fbcon.c
   show_cursor_blink() for example.

   and take a look at fbcon_add_cursor_timer() in
   drivers/video/fbdev/core/fbcon.c

3) foreground console may change. video driver may be initialized and
   registered.

4) PM events
   for example, drivers/video/fbdev/aty/radeon_pm.c   radeonfb_pci_suspend()

5) TTY write from user space
   when user space wants to write anything to console it goes through
   nTTY -> con_write() -> do_con_write().

 CPU: 1 PID: 1 Comm: systemd
 Call Trace:
  do_con_write+0x4c/0x1a5f
  con_write+0xa/0x1d
  n_tty_write+0xdb/0x3c5
  tty_write+0x191/0x223
  n_tty_receive_buf+0x8/0x8
  do_loop_readv_writev.part.23+0x58/0x89
  do_iter_write+0x98/0xb1
  vfs_writev+0x62/0x89

take a look at drivers/tty/vt/vt.c do_con_write()


it does a ton of things. why - because we need to scroll the console;
we need to wrap around the lines; we need to process control characters
- like \r or \n and so on and need to modify the console state accordingly;
we need to do UTF8/ASCII/etc. all of these things cannot run concurrently with
IOCTLs that modify the font map or resize the console, or flip it, or rotate
it.

take a look at lf() -> con_scroll() -> fbcon_scroll() in
drivers/video/fbdev/core/fbcon.c

we also don't want printk() to mess with do_con_write(). including
printk() from IRQ.

6) even more TTY
   I suspect that TTY may be invoked from IRQ.

7) printk() write  (and occasional kmsg_dump dumpers, e.g.
   arch/um/kernel/kmsg_dump)

   printk() goes through console_unlock()->vt_console_print().
   and it, basically, must handle all the things that TTY write does.
   handle console chars properly, do scrolling, wrapping, etc. and we
   don't want anything else to jump in and mess with us at this stage.
   that's why we use console_sem in printk.c - to serialize all the
   events... including concurrent printk() from other CPUs. that's why
   we do console_trylock() in vprintk_emit().

8) even more printk()
   printk() can be called from IRQ. console_sem stops it if some of
   the consoles can't work in IRQ context right now.

9) consoles have notifiers

/*
 * We defer the timer blanking to work queue so it can take the console mutex
 * (console operations can still happen at irq time, but only from printk which
 * has the console mutex. Not perfect yet, but better than no locking
 */
static void blank_screen_t(unsigned long dummy)
{
	blank_timer_expired = 1;
	schedule_work(&console_work);
}

so console_sem is also used to, basically, synchronize IRQs/etc.

10) I suspect that some consoles can do things with console_sem from IRQ
   context.

and so on. we really use console_sem as a big-kernel-lock.


so where might console_sem users sleep? in tons of places...

like ark_pci_suspend()   console_lock(); mutex_lock(par);
or ark_pci_resume()   console_lock(); mutex_lock();
or con_install()  console_lock(); vc_allocate() -> kzalloc(GFP_KERNEL)

and so on and on and on.
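
The store-then-trylock pattern mentioned in item 7 can be sketched in user-space C. This is a minimal model under stated assumptions, not the kernel implementation: an atomic flag stands in for console_sem, a small ring buffer stands in for the log buffer, and every *_sketch name, log_buf, and printed are hypothetical. It only models deferral to the current owner; it does not model the sleeping console_sem users listed above, which is the part a spinlock could not cover.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define LOG_SLOTS 16
#define LOG_LEN   64

static char log_buf[LOG_SLOTS][LOG_LEN];
static unsigned int log_head;   /* next slot to fill */
static unsigned int log_tail;   /* next slot to print */
static unsigned int printed;    /* lines pushed to the "console" so far */

static atomic_flag log_lock = ATOMIC_FLAG_INIT;       /* protects log_buf, head, tail */
static atomic_flag console_owner = ATOMIC_FLAG_INIT;  /* stand-in for console_sem */

static void log_lock_acquire(void) { while (atomic_flag_test_and_set(&log_lock)) ; }
static void log_lock_release(void) { atomic_flag_clear(&log_lock); }

/* Returns 1 if the caller became the console owner, 0 if it is taken. */
static int console_trylock_sketch(void)
{
    return !atomic_flag_test_and_set(&console_owner);
}

/* Print everything pending, give up ownership, then re-check for messages
 * that arrived while ownership was being released. */
static void console_unlock_sketch(void)
{
    for (;;) {
        char line[LOG_LEN];
        int pending;

        for (;;) {
            log_lock_acquire();
            if (log_tail == log_head) {
                log_lock_release();
                break;
            }
            memcpy(line, log_buf[log_tail % LOG_SLOTS], LOG_LEN);
            log_tail++;
            log_lock_release();
            fputs(line, stdout);    /* the slow console driver call */
            printed++;
        }
        atomic_flag_clear(&console_owner);

        log_lock_acquire();
        pending = (log_tail != log_head);
        log_lock_release();
        if (!pending || !console_trylock_sketch())
            return;
    }
}

/* Store the message first; print only if the console is free right now. */
static void printk_sketch(const char *msg)
{
    log_lock_acquire();
    if (log_head - log_tail < LOG_SLOTS) {
        char *slot = log_buf[log_head % LOG_SLOTS];

        strncpy(slot, msg, LOG_LEN - 1);
        slot[LOG_LEN - 1] = '\0';
        log_head++;
    }
    log_lock_release();

    if (console_trylock_sketch())
        console_unlock_sketch();
    /* otherwise the current owner prints this message for us */
}
```

A caller that finds the console busy returns immediately after storing its message, and the owner flushes it later from console_unlock_sketch().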


Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread

2017-12-30 Thread Sergey Senozhatsky
Hello,

On (12/29/17 22:59), Tetsuo Handa wrote:
[..]
> Just an idea: Do we really need to use a semaphore for console_sem?
> 
> Is it possible to replace it with a spinlock? Then, I feel that we can write
> to consoles from non-process context (i.e. soft or hard IRQ context), with
> write only one log (or even one byte) at a time (i.e. write one log from one
> context, and defer all remaining logs by "somehow" scheduling for calling
> that context again).
> 
> Since process context might fail to allow printk kernel thread to run for
> long period due to many threads waiting for run, I thought that interrupt
> context might fit better if we can "somehow" chain interrupt contexts.


that's a good question. printk(), indeed, does not care that much. but
the whole thing is more complex. I can copy-paste (sorry for that) one
of my previous emails to give a brief (I'm sure the description is
incomplete) idea.



the real purpose of console_sem is to synchronize all events that can
happen to VT, fbcon, TTY, video, etc. and there are many events that
can happen to VT/fbcon. and some of those events can sleep - that's
where printk() can suffer. and this is why printk() is not different
from any  other console_sem users -- printk() uses that lock in order
to synchronize its own events: to have only one printing CPU, to prevent
concurrent console drivers list modification, to prevent concurrent consoles
modification, and so on.

let's take VT and fbcon for simplicity.

the events are.

1) IOCTL from user space
   they may involve things like resizing, scrolling, rotating,

   take a look at drivers/tty/vt/vt_ioctl.c  vt_ioctl().
   we need to take console_sem there because we modify the very
   important things - size, font maps, etc. we don't want those changes
   to mess with possibly active print outs happening from another CPUs.

2) timer events and workqueue events
   even cursor blinking must take console_sem. because it modifies the
   state of console/screen. take a look at drivers/video/fbdev/core/fbcon.c
   show_cursor_blink() for example.

   and take a look at fbcon_add_cursor_timer() in 
drivers/video/fbdev/core/fbcon.c

3) foreground console may change. video driver may be be initialized and
   registered.

4) PM events
   for exaple, drivers/video/fbdev/aty/radeon_pm.c   radeonfb_pci_suspend()

5) TTY write from user space
   when user space wants to write anything to console it goes through
   nTTY -> con_write() -> do_con_write().

 CPU: 1 PID: 1 Comm: systemd
 Call Trace:
  do_con_write+0x4c/0x1a5f
  con_write+0xa/0x1d
  n_tty_write+0xdb/0x3c5
  tty_write+0x191/0x223
  n_tty_receive_buf+0x8/0x8
  do_loop_readv_writev.part.23+0x58/0x89
  do_iter_write+0x98/0xb1
  vfs_writev+0x62/0x89

take a look at drivers/tty/vt/vt.c do_con_write()


it does a ton of things. why - because we need to scroll the console;
we need to wrap around the lines; we need to process control characters
- like \r or \n and so on and need to modify the console state accordingly;
we need to do UTF8/ASCII/etc. all of this things cannot run concurrently with
IOCTL that modify the font map or resize the console, or flip it, or rotate
it.

take a look at lf() -> con_scroll() -> fbcon_scroll() // 
drivers/video/fbdev/core/fbcon.c

we also don't want printk() to mess with do_con_write(). including
printk() from IRQ.

6) even more TTY
   I suspect that TTY may be invoked from IRQ.

7) printk() write  (and occasional ksmg_dump dumpers, e.g. 
arch/um/kernel/kmsg_dump)

   printk() goes through console_unlock()->vt_console_print().
   and it, basically, must handle all the things that TTY write does.
   handle console chars properly, do scrolling, wrapping, etc. and we
   don't want anthing else to jump in and mess with us at this stage.
   that's why we user console_sem in printk.c - to serialize all the
   events... including concurrent printk() from other CPUs. that's why
   we do console_trylock() in vprintk_emit().

8) even more printk()
   printk() can be called from IRQ. console_sem stops it if some of
   the consoles can't work in IRQ context right now.

9) consoles have notifiers

/*
 * We defer the timer blanking to work queue so it can take the console mutex
 * (console operations can still happen at irq time, but only from printk which
 * has the console mutex. Not perfect yet, but better than no locking
 */
static void blank_screen_t(unsigned long dummy)
{
blank_timer_expired = 1;
schedule_work(_work);
}

so console_sem is also used to, basically, synchronize IRQs/etc.

10) I suspect that some consoles can do things with console_sem from IRQ
   context.

and so on. we really use console_sem as a big-kernel-lock.
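
to make the console_trylock() point from 7) a bit more concrete, here is a rough userspace model. it is not the kernel code: model_printk(), flush_log(), log_buf and the counters are names made up for this sketch, and C11 atomic flags stand in for both logbuf_lock and console_sem. every caller queues its message under a short spinlock and then only *tries* to take the console; whoever owns the console prints on behalf of everyone else.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define LOG_SIZE 1024

static atomic_flag logbuf_lock = ATOMIC_FLAG_INIT; /* spinlock guarding log_buf */
static atomic_flag console_sem = ATOMIC_FLAG_INIT; /* stand-in for console_sem */
static char log_buf[LOG_SIZE];
static size_t log_len;        /* bytes queued but not yet printed */
static size_t flushed_bytes;  /* bytes written to the "console" so far */

static void spin_lock(atomic_flag *l)   { while (atomic_flag_test_and_set(l)) { } }
static void spin_unlock(atomic_flag *l) { atomic_flag_clear(l); }

/* Non-blocking: returns 1 if we now own the console, 0 otherwise. */
static int model_console_trylock(void)
{
	return !atomic_flag_test_and_set(&console_sem);
}

static void model_console_release(void)
{
	atomic_flag_clear(&console_sem);
}

/* Print everything queued so far. Caller must own console_sem. */
static void flush_log(void)
{
	spin_lock(&logbuf_lock);
	fwrite(log_buf, 1, log_len, stdout);
	flushed_bytes += log_len;
	log_len = 0;
	spin_unlock(&logbuf_lock);
}

static size_t pending_bytes(void)
{
	size_t n;

	spin_lock(&logbuf_lock);
	n = log_len;
	spin_unlock(&logbuf_lock);
	return n;
}

/* Returns 1 if this caller did the flushing, 0 if it only queued. */
static int model_printk(const char *msg)
{
	size_t n = strlen(msg);
	int flushed = 0;

	spin_lock(&logbuf_lock);
	if (n > LOG_SIZE - log_len)
		n = LOG_SIZE - log_len;  /* drop what does not fit */
	memcpy(log_buf + log_len, msg, n);
	log_len += n;
	spin_unlock(&logbuf_lock);

	/* Never wait for the console: the current owner prints for us. */
	while (pending_bytes() && model_console_trylock()) {
		flush_log();
		model_console_release();
		flushed = 1;
	}
	return flushed;
}
```

if some other context already holds the console, model_printk() returns 0 and the message just stays queued; the next caller that wins the trylock prints both. the model ignores everything that makes the real thing hard (IRQ context, sleeping console drivers, scheduling).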


so where might console_sem users sleep? in tons of places...

like ark_pci_suspend()   console_lock(); mutex_lock(par);
or ark_pci_resume()   console_lock(); mutex_lock();
or con_install()  console_lock(); vc_allocate() -> kzalloc(GFP_KERNEL)

and so on and on and on.


Re: [PATCH v2] ksm: replace jhash2 with faster hash

2017-12-30 Thread Timofey Titovets
Hi,
I have sent v5, with minor changes,
but the performance numbers are still valid.

2017-12-31 3:07 GMT+03:00 sioh Lee :
> hello
>
> First, thanks for organizing all the experiments.
>
> and i'm sending you the results of experiments
>
> Test platform: openstack cloud platform (NEWTON version)
> Experiment node: openstack based cloud compute node (CPU: xeon E5-2620 v3, 
> memory 64gb)
> VM: (2 VCPU, RAM 4GB, DISK 20GB) * 4
> Linux kernel: 4.14 (latest version)
> KSM setup - sleep_millisecs: 200ms, pages_to_scan: 200
>
> Experiment process
> Firstly, we turn off KSM and launch 4 VMs.
> Then we turn on KSM and measure the checksum computation time until
> full_scans reaches two.
>
> The experimental results (the experimental value is the average of the 
> measured values)
> crc32c_intel: 1084.10ns
> crc32c (no hardware acceleration): 7012.51ns
> xxhash32: 2227.75ns
> xxhash64: 1413.16ns
> jhash2: 5128.30ns
>
> In summary, the results show that crc32c_intel is faster than all of the other
> hash functions used in the experiment (computation time decreased by 84.54%
> compared to crc32c, 78.86% compared to jhash2, 51.33% compared to xxhash32, and
> 23.28% compared to xxhash64).
>
> the results are similar to those of Timofey.
>
> anyway, i saw the problem of Timofey and i had the same situation before.
>
> the solution is to call crc32c using the crc32c library instead of allocating a shash
> (e.g. checksum = crc32c(0,addr,PAGE_SIZE);)

Not sure which is better: allocating our own shash, or using the crc32c library,
because in both cases we need an external dependency and the performance is the same.
Usage of the Crypto API and of that library looks mixed in the kernel.

> and change code from subsys_initcall(ksm_init)  to  late_initcall(ksm_init).
>
> I solved the kernel problem using this method, so it should be helpful.
That proves that I understood correctly: ksm runs too early %).

I have another workaround in the v5 patch,
i.e. I already do 'choose checksum on first hash call'.
The only thing I did is move the zero_checksum calculation to the first call of fasthash().

That may never be called at all, or it will be called late enough that init is done.
(I.e. that will happen on the first call of the first ksm_enter()).
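
A rough userspace sketch of that 'choose on first hash call' idea follows. It is not the v5 patch: fasthash() and zero_checksum are the only names taken from this thread, while the candidate table, the toy hash bodies and select_calls are placeholders invented here. The throughput column simply reuses the MB/s figures quoted from the v2 changelog instead of running a real benchmark, and the sketch is not thread-safe.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096

typedef uint32_t (*hash_fn)(const void *buf, size_t len);

/* Toy stand-in; real code would call crc32c(), xxh64() or xxh32(). */
static uint32_t toy_hash(const void *buf, size_t len, uint32_t seed)
{
	const unsigned char *p = buf;
	uint32_t h = seed;

	while (len--)
		h = h * 31 + *p++;
	return h;
}

static uint32_t toy_crc32c(const void *b, size_t n) { return toy_hash(b, n, 1); }
static uint32_t toy_xxh64(const void *b, size_t n)  { return toy_hash(b, n, 2); }
static uint32_t toy_xxh32(const void *b, size_t n)  { return toy_hash(b, n, 3); }

struct candidate {
	const char *name;
	hash_fn fn;
	unsigned int mbps;  /* assumed throughput, copied from the v2 changelog */
};

static const struct candidate candidates[] = {
	{ "crc32c", toy_crc32c, 12081 },
	{ "xxh64",  toy_xxh64,   8770 },
	{ "xxh32",  toy_xxh32,   4529 },
};

static const struct candidate *chosen;  /* NULL until the first fasthash() call */
static uint32_t zero_checksum;          /* hash of an all-zero page */
static int select_calls;                /* how often selection actually ran */

/* Pick the fastest candidate and compute zero_checksum with it. */
static void choose_hash(void)
{
	static const unsigned char zero_page[SKETCH_PAGE_SIZE];
	size_t i;

	chosen = &candidates[0];
	for (i = 1; i < sizeof(candidates) / sizeof(candidates[0]); i++)
		if (candidates[i].mbps > chosen->mbps)
			chosen = &candidates[i];

	zero_checksum = chosen->fn(zero_page, SKETCH_PAGE_SIZE);
	select_calls++;
}

/* Selection is deferred to the first call, i.e. long after early init. */
static uint32_t fasthash(const void *buf, size_t len)
{
	if (!chosen)
		choose_hash();
	return chosen->fn(buf, len);
}
```

Nothing here runs at init time, so the selection cannot race with crypto setup; the price is a branch on every call and the need for real locking if the first calls can come from several contexts at once.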

Which is better, your solution or mine, or whether we must mix the work again,
I can't say.

> please tell me if any other problems exist.
No other problem exists.

> thanks.
>
> -sioh lee-
>

In sum, we can show that changing the hash is useful and a good performance
improvement in general,
with good potential for hardware acceleration on the CPU.

Let's wait for advice from the mm folks
on whether that is OK, and on what to do next if needed.

Thanks!

> On 2017-12-31 at 6:27 AM, Timofey Titovets wrote:
>> *FACEPALM*,
>> Sorry, just forgot about numbering of old jhash2 -> xxhash conversion
>> Also pickup patch for xxhash - arch dependent xxhash() function that will use
>> fastest algo for current arch.
>>
>> So next will be v5, as that must be v4.
>>
>> Thanks.
>>
>> 2017-12-29 12:52 GMT+03:00 Timofey Titovets :
>>> Pickup, Sioh Lee crc32 patch, after some long conversation
>>> and hassles, merge with my work on xxhash, add
>>> choice fastest hash helper.
>>>
>>> Base idea are same, replace jhash2 with something faster.
>>>
>>> Perf numbers:
>>> Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
>>> ksm: crc32c   hash() 12081 MB/s
>>> ksm: jhash2   hash()  1569 MB/s
>>> ksm: xxh64    hash()  8770 MB/s
>>> ksm: xxh32    hash()  4529 MB/s
>>>
>>> As jhash2 always will be slower, just drop it from choice.
>>>
>>> Add function to autoselect hash algo on boot, based on speed,
>>> like raid6 code does.
>>>
>>> Move init of zero_hash from init, to start of ksm thread,
>>> as ksm init run on early kernel init, run perf testing stuff on
>>> main kernel thread looks bad to me.
>>>
>>> One problem exists with that patch,
>>> ksm init run too early, and crc32c module, even compiled in
>>> can't be found, so i see:
>>>  - ksm: alloc crc32c shash error 2 in dmesg.
>>>
>>> I give up on that, so ideas welcomed.
>>>
>>> Only idea that i have, are to avoid early init by moving
>>> zero_checksum to sysfs_store parm,
>>> i.e. that's default to false, and that will work, i think.
>>>
>>> Thanks.
>>>
>>> Changes:
>>>   v1 -> v2:
>>> - Merge xxhash/crc32 patches
>>> - Replace crc32 with crc32c (crc32 have same as jhash2 speed)
>>> - Add auto speed test and auto choice of fastest hash function
>>>
>>> Signed-off-by: Timofey Titovets 
>>> Signed-off-by: leesioh 
>>> CC: Andrea Arcangeli 
>>> CC: linux...@kvack.org
>>> CC: k...@vger.kernel.org
>>> ---
>>>  mm/Kconfig |   4 ++
>>>  mm/ksm.c   | 133 
>>> -
>>>  2 files changed, 128 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>> index 03ff7703d322..d4fb147d4a22 100644
>>> --- a/mm/Kconfig
>>> +++ b/mm/Kconfig
>>> @@ -305,6 +305,10 @@ config MMU_NOTIFIER
>>>  config KSM
>>> bool "Enable KSM for page merging"
>>> depends on MMU
>>> +   select XXHASH
>>> +   select CRYPTO
>>> +   select CRYPTO_HASH
>>> +   select 

Re: [PATCH v2 1/3] irqchip/renesas-intc-irqpin: Use WAKEUP_PATH driver PM flag

2017-12-30 Thread Rafael J. Wysocki
On Fri, Dec 29, 2017 at 2:31 PM, Ulf Hansson  wrote:
> From: Geert Uytterhoeven 
>
> Since commit 705bc96c2c15313c ("irqchip: renesas-intc-irqpin: Add minimal
> runtime PM support"), when an IRQ is used for wakeup, the INTC block's
> module clock (if exists) is manually kept running during system suspend, to
> make sure the device stays active.
>
> However, this explicit clock handling is merely a workaround for a failure
> to properly communicate wakeup information to the PM core. Instead, set the
> WAKEUP_PATH driver PM flag to indicate that the device is part of the
> wakeup path, which further also enables middle-layers and PM domains (like
> genpd) to act on this.
>
> In case the device is attached to genpd and depending on if it has an
> active wakeup configuration, genpd will keep the device active (the clock
> running) during system suspend when needed. This enables us to remove all
> explicit clock handling code from the driver, so let's do that as well.
>
> Signed-off-by: Geert Uytterhoeven 
> [Ulf: Converted to use the WAKEUP_PATH driver PM flag]
> Signed-off-by: Ulf Hansson 
> ---
>  drivers/irqchip/irq-renesas-intc-irqpin.c | 42 
> +++
>  1 file changed, 15 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/irqchip/irq-renesas-intc-irqpin.c b/drivers/irqchip/irq-renesas-intc-irqpin.c
> index 06f29cf..bfc2c5c 100644
> --- a/drivers/irqchip/irq-renesas-intc-irqpin.c
> +++ b/drivers/irqchip/irq-renesas-intc-irqpin.c
> @@ -17,7 +17,6 @@
>   * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
>   */
>
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -78,16 +77,14 @@ struct intc_irqpin_priv {
> struct platform_device *pdev;
> struct irq_chip irq_chip;
> struct irq_domain *irq_domain;
> -   struct clk *clk;
> unsigned shared_irqs:1;
> -   unsigned needs_clk:1;
> +   unsigned wakeup_path:1;
> u8 shared_irq_mask;
>  };
>
>  struct intc_irqpin_config {
> unsigned int irlm_bit;
> unsigned needs_irlm:1;
> -   unsigned needs_clk:1;
>  };
>
>  static unsigned long intc_irqpin_read32(void __iomem *iomem)
> @@ -287,15 +284,7 @@ static int intc_irqpin_irq_set_wake(struct irq_data *d, unsigned int on)
> int hw_irq = irqd_to_hwirq(d);
>
> irq_set_irq_wake(p->irq[hw_irq].requested_irq, on);
> -
> -   if (!p->clk)
> -   return 0;
> -
> -   if (on)
> -   clk_enable(p->clk);
> -   else
> -   clk_disable(p->clk);
> -
> +   p->wakeup_path = on;
> return 0;
>  }
>
> @@ -365,12 +354,10 @@ static const struct irq_domain_ops intc_irqpin_irq_domain_ops = {
>  static const struct intc_irqpin_config intc_irqpin_irlm_r8a777x = {
> .irlm_bit = 23, /* ICR0.IRLM0 */
> .needs_irlm = 1,
> -   .needs_clk = 0,
>  };
>
>  static const struct intc_irqpin_config intc_irqpin_rmobile = {
> .needs_irlm = 0,
> -   .needs_clk = 1,
>  };
>
>  static const struct of_device_id intc_irqpin_dt_ids[] = {
> @@ -422,18 +409,6 @@ static int intc_irqpin_probe(struct platform_device *pdev)
> platform_set_drvdata(pdev, p);
>
> config = of_device_get_match_data(dev);
> -   if (config)
> -   p->needs_clk = config->needs_clk;
> -
> -   p->clk = devm_clk_get(dev, NULL);
> -   if (IS_ERR(p->clk)) {
> -   if (p->needs_clk) {
> -   dev_err(dev, "unable to get clock\n");
> -   ret = PTR_ERR(p->clk);
> -   goto err0;
> -   }
> -   p->clk = NULL;
> -   }
>
> pm_runtime_enable(dev);
> pm_runtime_get_sync(dev);
> @@ -602,12 +577,25 @@ static int intc_irqpin_remove(struct platform_device *pdev)
> return 0;
>  }
>
> +#ifdef CONFIG_PM_SLEEP
> +static int intc_irqpin_suspend(struct device *dev)
> +{
> +   struct intc_irqpin_priv *p = dev_get_drvdata(dev);
> +
> +   dev_pm_set_driver_flags(dev, p->wakeup_path ? DPM_FLAG_WAKEUP_PATH : 0);

If you want that thing to be a DPM_FLAG_, then please follow the rule
that these flags are only set once at the probe time.

> +   return 0;
> +}
> +#endif

Thanks,
Rafael


Re: [PATCH v2 0/3] renesas: irqchip: Use WAKEUP_PATH driver PM flag

2017-12-30 Thread Rafael J. Wysocki
On Fri, Dec 29, 2017 at 2:31 PM, Ulf Hansson  wrote:
> From: Geert Uytterhoeven 
>
> Changes in v2: [By Ulf Hansson]
> - I have picked up the series from Geert [1] and converted it to use
> the WAKEUP_PATH driver PM flag. This includes some minor changes to each
> patch and updates to the changelogs.
> - An important note, the WAKEUP_PATH driver PM flag is introduced in a
> separate series [2], not yet applied, so @subject series depends on it.
> - One more note, two of the patches have a checkpatch error, however I
> did not fix them, because I think that should be done separately.
>
> [1]
> https://lkml.org/lkml/2017/11/9/382
> [2]
> https://marc.info/?l=linux-pm&m=151454744124661&w=2
>
> More information below, picked from Geert's previous cover letter.
>
> Kind regards
> Uffe
>
>
> Hi all,
>
> If an interrupt controller in a Renesas ARM SoC is part of a Clock
> Domain, and it is part of the wakeup path, it must be kept active during
> system suspend.
>
> Currently this is handled in all interrupt controller drivers by
> explicitly increasing the use count of the module clock when the device
> is part of the wakeup path.  However, this explicit clock handling is
> merely a workaround for a failure to properly communicate wakeup
> information to the device core.
>
> Hence this series fixes the affected drivers by setting the devices'
> power.wakeup_path fields instead, to indicate they are part of the
> wakeup path.  Depending on the PM Domain's active_wakeup configuration,
> the genpd core code will keep the device enabled (and the clock running)
> during system suspend when needed.

However, there is a convention, documented in the kerneldoc comment of
device_init_wakeup(), by which devices participating in system wakeup
"passively" (like USB controllers and hubs) are expected to have it
enabled by default.

If that convention was followed by the devices in question here, the
wakeup_path bit would be set for them and no other code changes would
be necessary.  So is there any reason for not following it?

Thanks,
Rafael


Re: general protection fault in skb_segment

2017-12-30 Thread Marcelo Ricardo Leitner
On Sat, Dec 30, 2017 at 08:42:41AM +0100, Willem de Bruijn wrote:
> > syzkaller hit the following crash on
> > 37759fa6d0fa9e4d6036d19ac12f555bfc0aeafd
> > git://git.cmpxchg.org/linux-mmots.git/master
> > compiler: gcc (GCC) 7.1.1 20170620
> > .config is attached
> > Raw console output is attached.
> > C reproducer is attached
> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> > for information about syzkaller reproducers
> 
> Reproduced with the C reproducer on v4.15-rc1 and mainline
> going back at least to v4.8, but not v4.7. SCTP GSO was
> introduced in v4.8-rc1, so a patch in this set is likely the starting
> point. Indeed crashes at 90017accff61 ("sctp: Add GSO support"),
> but not at 90017accff61~4.
> 
> The reproducer with its sandbox removed shows this invocation in strace -f
> 
> # strace -f ./repro2
> [... skipped ...]
> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
> open("/dev/net/tun", O_RDONLY)  = 4
> fcntl(4, F_DUPFD, 3)= 5
> socket(PF_PACKET, SOCK_RAW|SOCK_CLOEXEC, 8) = 6
> ioctl(4, TUNSETIFF, 0x20e63000) = 0
> ioctl(3, SIOCSIFFLAGS, {ifr_name="syz0",
> ifr_flags=IFF_UP|IFF_PROMISC|IFF_ALLMULTI}) = 0
> setsockopt(6, SOL_PACKET, 0xf /* PACKET_??? */, [4096], 4) = 0
> ioctl(6, SIOCGIFINDEX, {ifr_name="syz0", ifr_index=24}) = 0
> bind(6, {sa_family=AF_PACKET, proto=, if24, pkttype=PACKET_HOST,
> addr(6)={1, aa00}, 20) = 0
> dup2(6, 5)  = 5
> write(5, "\0\201\1\0\350\367\0\0\3\0E\364\0 \0d\0\0\7\2042\342\0\0\0
> \177\0\0\1\0\t"..., 42
> 
> where 0xf in setsockopt is PACKET_VNET_HDR
> 
> So this is a packet socket writing something that apparently looks
> like an SCTP packet, is only 42 bytes long, but has GSO set in its
> virtio_net_hdr struct.
> 
> It crashes in skb_segment seemingly on a NULL list_skb.
> 
> (gdb) list *(skb_segment+0x2a4)
> 0x8167cc24 is in skb_segment (net/core/skbuff.c:3566).
> 3561            if (hsize < 0)
> 3562                    hsize = 0;
> 3563            if (hsize > len || !sg)
> 3564                    hsize = len;
> 3565
> 3566            if (!hsize && i >= nfrags && skb_headlen(list_skb) &&
> 3567                (skb_headlen(list_skb) == len || sg)) {
> 3568                    BUG_ON(skb_headlen(list_skb) > len);
> 3569
> 3570                    i = 0;
> 
> Likely there is a hidden assumption about SCTP GSO packets that does
> not hold for such packets generated by PF_PACKET.
> 
> SCTP GSO introduced the GSO_BY_FRAGS mss value, so the code
> takes a different path for SCTP packets generated by the SCTP stack.
> 
> PF_PACKET does not necessarily set gso_size to GSO_BY_FRAGS, so
> does not take the branch that requires list_skb to be non-zero here:
> 
>         if (unlikely(mss == GSO_BY_FRAGS)) {
>                 len = list_skb->len;
>         } else {
>                 len = head_skb->len - offset;
>                 if (len > mss)
>                         len = mss;
>         }
>
>         hsize = skb_headlen(head_skb) - offset;
>         if (hsize < 0)
>                 hsize = 0;
>         if (hsize > len || !sg)
>                 hsize = len;
>
>         if (!hsize && i >= nfrags && skb_headlen(list_skb) &&
>             (skb_headlen(list_skb) == len || sg)) {
> 
> Somewhat tangential, but any PF_PACKET socket can set this
> magic gso_size value in its virtio_net_hdr, so if it is assumed to
> be an SCTP GSO specific option, setting it for a TCP GSO packet
> may also cause unexpected results.
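
The condition in the quoted snippet can be modelled in plain C to see when list_skb gets touched. This is only a sketch: struct seg_state and touches_null_list_skb() are invented for illustration, the field values used with it are made up rather than taken from the reproducer, and only the branch structure and the GSO_BY_FRAGS value follow the kernel source.

```c
#define GSO_BY_FRAGS 0xFFFF  /* same value as in include/linux/skbuff.h */

/* Hypothetical snapshot of one skb_segment() loop iteration. */
struct seg_state {
	unsigned int mss;
	int head_len;      /* head_skb->len */
	int head_headlen;  /* skb_headlen(head_skb) */
	int list_len;      /* list_skb->len, ignored when has_list_skb == 0 */
	int offset;
	int i, nfrags;
	int has_list_skb;  /* 0 models list_skb == NULL */
	int sg;
};

/* Returns 1 if this iteration would dereference a NULL list_skb. */
static int touches_null_list_skb(const struct seg_state *s)
{
	int len, hsize;

	if (s->mss == GSO_BY_FRAGS) {
		if (!s->has_list_skb)
			return 1;  /* len = list_skb->len */
		len = s->list_len;
	} else {
		len = s->head_len - s->offset;
		if (len > (int)s->mss)
			len = (int)s->mss;
	}

	hsize = s->head_headlen - s->offset;
	if (hsize < 0)
		hsize = 0;
	if (hsize > len || !s->sg)
		hsize = len;

	/* && short-circuits, so list_skb is only read past these two tests */
	if (!hsize && s->i >= s->nfrags)
		return !s->has_list_skb;  /* skb_headlen(list_skb) */
	return 0;
}
```

In this model a packet with an ordinary mss avoids the first dereference but still reaches skb_headlen(list_skb) as soon as the linear area is used up and no frags are left, which matches the line gdb points at.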

It seems virtio_net could use more sanity checks. When PACKET_VNET_HDR
is used, it will end up calling:
        tpacket_rcv() {
                ...
                if (do_vnet) {
                        if (virtio_net_hdr_from_skb(skb, h.raw + macoff -
                                                    sizeof(struct virtio_net_hdr),
                                                    vio_le(), true)) {
                                spin_lock(&sk->sk_receive_queue.lock);
                                goto drop_n_account;
                        }
                }

and virtio_net_hdr_from_skb does:
        if (skb_is_gso(skb)) {
                ...
                if (sinfo->gso_type & SKB_GSO_TCPV4)
                        hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
                else if (sinfo->gso_type & SKB_GSO_TCPV6)
                        hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
                else
                        return -EINVAL;

Meaning that any gso_type other than TCP would be rejected, but this
SCTP one got through. Seems the header contains an SCTP header, but the
gso_type set was actually pointing to TCP (otherwise it would have
been rejected). AFAICT if this packet had an ESP header, for example,
it could have hit esp4_gso_segment. Can you please confirm this?

I don't know of anywhere in the stack validating if the gso_type
matches the header that actually is in there.
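
For illustration only, a check of that kind could look roughly like the sketch below. gso_type_matches_l4() is a hypothetical helper, not an existing kernel function; the VIRTIO_NET_HDR_GSO_* values mirror the uapi header linux/virtio_net.h and the L4_PROTO_* values are the usual IP protocol numbers. Where such a check would belong in the stack is left open.

```c
#include <stdint.h>

/* Values as in the uapi header linux/virtio_net.h */
#define VIRTIO_NET_HDR_GSO_NONE   0
#define VIRTIO_NET_HDR_GSO_TCPV4  1
#define VIRTIO_NET_HDR_GSO_UDP    3
#define VIRTIO_NET_HDR_GSO_TCPV6  4
#define VIRTIO_NET_HDR_GSO_ECN    0x80

/* IP protocol numbers (IPPROTO_TCP, IPPROTO_UDP, IPPROTO_SCTP) */
#define L4_PROTO_TCP   6
#define L4_PROTO_UDP   17
#define L4_PROTO_SCTP  132

/*
 * Hypothetical helper: returns 1 if the gso_type claimed in the
 * virtio_net_hdr is consistent with the transport protocol actually
 * found in the packet, 0 otherwise.
 */
static int gso_type_matches_l4(uint8_t gso_type, uint8_t l4_proto)
{
	switch (gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
	case VIRTIO_NET_HDR_GSO_NONE:
		return 1;  /* no segmentation requested, nothing to check */
	case VIRTIO_NET_HDR_GSO_TCPV4:
	case VIRTIO_NET_HDR_GSO_TCPV6:
		return l4_proto == L4_PROTO_TCP;
	case VIRTIO_NET_HDR_GSO_UDP:
		return l4_proto == L4_PROTO_UDP;
	default:
		return 0;  /* unknown gso_type: reject */
	}
}
```

With this, a header claiming VIRTIO_NET_HDR_GSO_TCPV4 on top of an SCTP payload (protocol 132) would be refused instead of reaching the segmentation code.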

The 

Re: [PATCH v2] ksm: replace jhash2 with faster hash

2017-12-30 Thread sioh Lee
hello

First, thanks for organizing all the experiments.

and i'm sending you the results of experiments

Test platform: openstack cloud platform (NEWTON version)
Experiment node: openstack based cloud compute node (CPU: xeon E5-2620 v3, 
memory 64gb)
VM: (2 VCPU, RAM 4GB, DISK 20GB) * 4
Linux kernel: 4.14 (latest version)
KSM setup - sleep_millisecs: 200ms, pages_to_scan: 200

Experiment process
Firstly, we turn off KSM and launch 4 VMs.
Then we turn on KSM and measure the checksum computation time until
full_scans reaches two.

The experimental results (the experimental value is the average of the measured 
values)
crc32c_intel: 1084.10ns
crc32c (no hardware acceleration): 7012.51ns
xxhash32: 2227.75ns
xxhash64: 1413.16ns
jhash2: 5128.30ns

In summary, the results show that crc32c_intel is faster than all of the other
hash functions used in the experiment (computation time decreased by 84.54%
compared to crc32c, 78.86% compared to jhash2, 51.33% compared to xxhash32, and
23.28% compared to xxhash64).

the results are similar to those of Timofey.

anyway, i saw the problem of Timofey and i had the same situation before.

the solution is to call crc32c using crce32c library instead of shash alloc 
(e.g. checksum = crc32c(0,addr,PAGE_SIZE);)

and change code from subsys_initcall(ksm_init)  to  late_initcall(ksm_init).

I have solved kernel problem using this method so this will be helpful.

please tell me if other problems exists.

thanks.

-sioh lee-


2017-12-31 오전 6:27에 Timofey Titovets 이(가) 쓴 글:
> *FACEPALM*,
> Sorry, just forgot about numbering of old jhash2 -> xxhash conversion
> Also pickup patch for xxhash - arch dependent xxhash() function that will use
> fastest algo for current arch.
>
> So next will be v5, as that must be v4.
>
> Thanks.
>
> 2017-12-29 12:52 GMT+03:00 Timofey Titovets :
>> Pickup, Sioh Lee crc32 patch, after some long conversation
>> and hassles, merge with my work on xxhash, add
>> choice fastest hash helper.
>>
>> Base idea are same, replace jhash2 with something faster.
>>
>> Perf numbers:
>> Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
>> ksm: crc32c   hash() 12081 MB/s
>> ksm: jhash2   hash()  1569 MB/s
>> ksm: xxh64hash()  8770 MB/s
>> ksm: xxh32hash()  4529 MB/s
>>
>> As jhash2 always will be slower, just drop it from choice.
>>
>> Add function to autoselect hash algo on boot, based on speed,
>> like raid6 code does.
>>
>> Move init of zero_hash from init, to start of ksm thread,
>> as ksm init run on early kernel init, run perf testing stuff on
>> main kernel thread looks bad to me.
>>
>> One problem exists with that patch,
>> ksm init run too early, and crc32c module, even compiled in
>> can't be found, so i see:
>>  - ksm: alloc crc32c shash error 2 in dmesg.
>>
>> I give up on that, so ideas welcomed.
>>
>> Only idea that i have, are to avoid early init by moving
>> zero_checksum to sysfs_store parm,
>> i.e. that's default to false, and that will work, i think.
>>
>> Thanks.
>>
>> Changes:
>>   v1 -> v2:
>> - Merge xxhash/crc32 patches
>> - Replace crc32 with crc32c (crc32 have same as jhash2 speed)
>> - Add auto speed test and auto choice of fastest hash function
>>
>> Signed-off-by: Timofey Titovets 
>> Signed-off-by: leesioh 
>> CC: Andrea Arcangeli 
>> CC: linux...@kvack.org
>> CC: k...@vger.kernel.org
>> ---
>>  mm/Kconfig |   4 ++
>>  mm/ksm.c   | 133 
>> -
>>  2 files changed, 128 insertions(+), 9 deletions(-)
>>
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index 03ff7703d322..d4fb147d4a22 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -305,6 +305,10 @@ config MMU_NOTIFIER
>>  config KSM
>> bool "Enable KSM for page merging"
>> depends on MMU
>> +   select XXHASH
>> +   select CRYPTO
>> +   select CRYPTO_HASH
>> +   select CONFIG_CRYPTO_CRC32C
>> help
>>   Enable Kernel Samepage Merging: KSM periodically scans those areas
>>   of an application's address space that an app has advised may be
>> diff --git a/mm/ksm.c b/mm/ksm.c
>> index be8f4576f842..fd5c9d0f7bc2 100644
>> --- a/mm/ksm.c
>> +++ b/mm/ksm.c
>> @@ -25,7 +25,6 @@
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -41,6 +40,12 @@
>>  #include 
>>
>>  #include 
>> +
>> +/* Support for xxhash and crc32c */
>> +#include 
>> +#include 
>> +#include 
>> +
>>  #include "internal.h"
>>
>>  #ifdef CONFIG_NUMA
>> @@ -186,7 +191,7 @@ struct rmap_item {
>> };
>> struct mm_struct *mm;
>> unsigned long address;  /* + low bits used for flags below */
>> -   unsigned int oldchecksum;   /* when unstable */
>> +   unsigned long oldchecksum;  /* when unstable */
>> union {
>> struct rb_node node;/* when node of unstable tree */
>> struct {/* when listed from stable 

Re: [PATCH v2] ksm: replace jhash2 with faster hash

2017-12-30 Thread sioh Lee
Hello,

First, thanks for organizing all the experiments.

I'm sending you the results of my experiments.

Test platform: OpenStack cloud platform (Newton release)
Experiment node: OpenStack-based cloud compute node (CPU: Xeon E5-2620 v3,
memory: 64 GB)
VM: (2 vCPU, RAM 4 GB, disk 20 GB) * 4
Linux kernel: 4.14 (latest version)
KSM setup - sleep_millisecs: 200ms, pages_to_scan: 200

Experiment process
First, we turn off KSM and launch 4 VMs.
Then we turn on KSM and measure the checksum computation time until
full_scans reaches two.

The experimental results (each value is the average of the measured
values)
crc32c_intel: 1084.10ns
crc32c (no hardware acceleration): 7012.51ns
xxhash32: 2227.75ns
xxhash64: 1413.16ns
jhash2: 5128.30ns

In summary, the results show that crc32c_intel outperforms all of the other
hash functions used in the experiment (time decreased by 84.54% compared to
crc32c, 78.86% compared to jhash2, 51.33% compared to xxhash32, and 23.28%
compared to xxhash64).
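
As a rough cross-check of the percentages above, a minimal C sketch of the
arithmetic (the function name percent_reduction is made up for
illustration). The quoted figures agree with this formula when the result
is truncated, not rounded, to two decimals.

```c
/*
 * Relative reduction of t_new versus t_base, in percent.
 * Example: percent_reduction(7012.51, 1084.10) is about 84.54.
 */
static double percent_reduction(double t_base, double t_new)
{
	return (t_base - t_new) / t_base * 100.0;
}
```

Here t_base is the average time of the hash being compared against and
t_new is the crc32c_intel time (1084.10 ns).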

The results are similar to Timofey's.

Anyway, I saw Timofey's problem, and I had the same situation before.

The solution is to call crc32c through the crc32c library instead of
allocating a shash (e.g. checksum = crc32c(0,addr,PAGE_SIZE);)

and to change the code from subsys_initcall(ksm_init) to late_initcall(ksm_init).
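
For readers without a kernel tree at hand, here is a userspace sketch of
the kind of value a call like crc32c(0, addr, PAGE_SIZE) produces. It is a
plain bitwise CRC-32C (Castagnoli polynomial), not the kernel
implementation; the name crc32c_sw is hypothetical, and the assumption
that the seed is used as-is with no final inversion is mine.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32C (Castagnoli) update. 'crc' is the running value; no
 * initial or final inversion is applied here, so the caller picks the seed.
 */
static uint32_t crc32c_sw(uint32_t crc, const void *data, size_t len)
{
	const uint8_t *p = data;

	while (len--) {
		crc ^= *p++;
		/* Reflected polynomial 0x82F63B78, one bit at a time. */
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}
```

One property worth noting in this sketch: with a seed of 0, an all-zero
buffer hashes to 0, which matters when the result is compared against a
checksum of the zero page.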

I solved the kernel problem using this method, so it should be helpful.

Please tell me if other problems exist.

Thanks.

-sioh lee-


On 2017-12-31 at 6:27 AM, Timofey Titovets wrote:
> *FACEPALM*,
> Sorry, I just forgot about the numbering of the old jhash2 -> xxhash conversion.
> I will also pick up the xxhash patch: an arch-dependent xxhash() function that
> will use the fastest algorithm for the current arch.
>
> So the next version will be v5, as this one should have been v4.
>
> Thanks.
>
> 2017-12-29 12:52 GMT+03:00 Timofey Titovets :
>> Pick up Sioh Lee's crc32 patch after a long conversation
>> and some hassle, merge it with my work on xxhash, and add
>> a helper that chooses the fastest hash.
>>
>> The basic idea is the same: replace jhash2 with something faster.
>>
>> Perf numbers:
>> Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
>> ksm: crc32c   hash() 12081 MB/s
>> ksm: jhash2   hash()  1569 MB/s
>> ksm: xxh64    hash()  8770 MB/s
>> ksm: xxh32    hash()  4529 MB/s
>>
>> As jhash2 will always be slower, just drop it from the choice.
>>
>> Add a function to autoselect the hash algorithm at boot, based on speed,
>> like the raid6 code does.
>>
>> Move the init of zero_hash from init to the start of the ksm thread:
>> ksm init runs during early kernel init, and running perf testing on
>> the main kernel thread looks bad to me.
>>
>> One problem exists with this patch:
>> ksm init runs too early, and the crc32c module, even when compiled in,
>> can't be found, so I see:
>>  - ksm: alloc crc32c shash error 2 in dmesg.
>>
>> I give up on that, so ideas are welcome.
>>
>> The only idea I have is to avoid early init by moving
>> zero_checksum to the sysfs_store parm,
>> i.e. it defaults to false, and that will work, I think.
>>
>> Thanks.
>>
>> Changes:
>>   v1 -> v2:
>> - Merge xxhash/crc32 patches
>> - Replace crc32 with crc32c (crc32 has the same speed as jhash2)
>> - Add auto speed test and auto choice of fastest hash function
>>
>> Signed-off-by: Timofey Titovets 
>> Signed-off-by: leesioh 
>> CC: Andrea Arcangeli 
>> CC: linux...@kvack.org
>> CC: k...@vger.kernel.org
>> ---
>>  mm/Kconfig |   4 ++
>>  mm/ksm.c   | 133 
>> -
>>  2 files changed, 128 insertions(+), 9 deletions(-)
>>
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index 03ff7703d322..d4fb147d4a22 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -305,6 +305,10 @@ config MMU_NOTIFIER
>>  config KSM
>> bool "Enable KSM for page merging"
>> depends on MMU
>> +   select XXHASH
>> +   select CRYPTO
>> +   select CRYPTO_HASH
>> +   select CONFIG_CRYPTO_CRC32C
>> help
>>   Enable Kernel Samepage Merging: KSM periodically scans those areas
>>   of an application's address space that an app has advised may be
>> diff --git a/mm/ksm.c b/mm/ksm.c
>> index be8f4576f842..fd5c9d0f7bc2 100644
>> --- a/mm/ksm.c
>> +++ b/mm/ksm.c
>> @@ -25,7 +25,6 @@
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -41,6 +40,12 @@
>>  #include 
>>
>>  #include 
>> +
>> +/* Support for xxhash and crc32c */
>> +#include 
>> +#include 
>> +#include 
>> +
>>  #include "internal.h"
>>
>>  #ifdef CONFIG_NUMA
>> @@ -186,7 +191,7 @@ struct rmap_item {
>> };
>> struct mm_struct *mm;
>> unsigned long address;  /* + low bits used for flags below */
>> -   unsigned int oldchecksum;   /* when unstable */
>> +   unsigned long oldchecksum;  /* when unstable */
>> union {
>> struct rb_node node;/* when node of unstable tree */
>> struct {/* when listed from stable tree */
>> @@ -255,7 +260,7 @@ static unsigned int ksm_thread_pages_to_scan = 100;
>>  

Re: [PATCH V5 2/2] ksm: replace jhash2 with faster hash

2017-12-30 Thread Timofey Titovets
JFYI, performance on a faster, more modern CPU:
Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
[  172.651044] ksm: crc32c hash() 22633 MB/s
[  172.776060] ksm: xxhash hash() 10920 MB/s
[  172.776066] ksm: choice crc32c as hash function


[PATCH V5 1/2] xxHash: create arch dependent 32/64-bit xxhash()

2017-12-30 Thread Timofey Titovets
xxh32() - fast on both 32/64-bit platforms
xxh64() - fast only on 64-bit platforms

Create xxhash(), which will pick the fastest version
at compile time.

As the result depends on the CPU word size,
the main purpose of this function is in-memory hashing.

Changes:
  v2:
- Create that patch
  v3 -> v5:
- Nothing, whole patchset version bump

Signed-off-by: Timofey Titovets 
---
 include/linux/xxhash.h | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/include/linux/xxhash.h b/include/linux/xxhash.h
index 9e1f42cb57e9..52b073fea17f 100644
--- a/include/linux/xxhash.h
+++ b/include/linux/xxhash.h
@@ -107,6 +107,29 @@ uint32_t xxh32(const void *input, size_t length, uint32_t seed);
  */
 uint64_t xxh64(const void *input, size_t length, uint64_t seed);
 
+/**
+ * xxhash() - calculate wordsize hash of the input with a given seed
+ * @input:  The data to hash.
+ * @length: The length of the data to hash.
+ * @seed:   The seed can be used to alter the result predictably.
+ *
+ * If the hash does not need to be comparable between machines with
+ * different word sizes, this function will call whichever of xxh32()
+ * or xxh64() is faster.
+ *
+ * Return:  wordsize hash of the data.
+ */
+
+static inline unsigned long xxhash(const void *input, size_t length,
+  uint64_t seed)
+{
+#if BITS_PER_LONG == 64
+   return xxh64(input, length, seed);
+#else
+   return xxh32(input, length, seed);
+#endif
+}
+
 /*-
  * Streaming Hash Functions
  */
-- 
2.15.1


[PATCH V5 2/2] ksm: replace jhash2 with faster hash

2017-12-30 Thread Timofey Titovets
1. Pick up Sioh Lee's crc32 patch, after a long conversation
2. Merge it with my work on xxhash
3. Add autoselect code to choose the fastest hash helper.

The basic idea is the same: replace jhash2 with something faster.

Perf numbers:
Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
ksm: crc32c   hash() 12081 MB/s
ksm: xxh64    hash()  8770 MB/s
ksm: xxh32    hash()  4529 MB/s
ksm: jhash2   hash()  1569 MB/s

As jhash2 will always be slower (for data sizes like PAGE_SIZE),
just drop it from the choice.

Add a function to autoselect the hash algorithm at boot,
based on hashing speed, like the raid6 code does.
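
The autoselect idea can be sketched in userspace roughly as follows. This
is not the code from the patch: the patch times the hashes with jiffies
over a 125 ms window with preemption disabled, while this sketch swaps in
clock() from the C library. All ex_ names, including the trivial
ex_sum_hash() candidate, are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef uint32_t (*ex_hash_fn)(const void *data, size_t len);

/* Trivial stand-in candidate so the sketch is self-contained. */
static uint32_t ex_sum_hash(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t h = 0;

	while (len--)
		h = h * 31u + *p++;
	return h;
}

/* Count how many buffers 'fn' hashes before 'budget' clock ticks elapse. */
static unsigned long ex_count_hashes(ex_hash_fn fn, const void *buf,
				     size_t len, clock_t budget)
{
	volatile uint32_t sink = 0;	/* keeps the calls from being optimized out */
	unsigned long count = 0;
	clock_t end = clock() + budget;

	while (clock() < end) {
		sink ^= fn(buf, len);
		count++;
	}
	(void)sink;
	return count;
}

/* Index of the largest entry in perf[], i.e. the fastest candidate. */
static size_t ex_pick_fastest(const unsigned long *perf, size_t n)
{
	size_t best = 0;

	for (size_t i = 1; i < n; i++)
		if (perf[i] > perf[best])
			best = i;
	return best;
}

/* Buffers hashed in 1/8 s -> MB/s, the same arithmetic as PERF_TO_MBS(). */
static uint64_t ex_perf_to_mbs(uint64_t count, uint64_t buf_size)
{
	return count * buf_size * 8 / (1024 * 1024);
}
```

As a worked example of the conversion: 386592 buffers of 4096 bytes hashed
in 1/8 s correspond to 386592 * 4096 * 8 / 2^20 = 12081 MB/s, the crc32c
figure quoted above.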

Move the init of zero_checksum from init to the first call of fasthash():
  1. KSM init runs during early kernel init, and
     running perf testing on the main kernel boot thread looks bad to me.
  2. The crypto subsystem is not available that early in boot,
     so crc32c, even when compiled in, is not available.

Output after KSM's first attempt to hash a page:
ksm: crc32c hash() 15218 MB/s
ksm: xxhash hash()  8640 MB/s
ksm: choice crc32c as hash function

Thanks.

Changes:
  v1 -> v2:
    - Move xxhash() to xxhash.h/c and separate patches
  v2 -> v3:
    - Move xxhash() xxhash.c -> xxhash.h
    - Replace xxhash_t with 'unsigned long'
    - Update kerneldoc above xxhash()
  v3 -> v4:
    - Merge xxhash/crc32 patches
    - Replace crc32 with crc32c (crc32 has the same speed as jhash2)
    - Add auto speed test and auto choice of fastest hash function
  v4 -> v5:
    - Pick up missed xxhash patch
    - Update code to use the xxhash chosen at compile time
    - Add more macros to make code more readable
    - As only xxhash or crc32c can be used now,
      on crc32c allocation error, skip speed test and fall back to xxhash
    - To work around the too-early init problem (crc32c not available),
      move zero_checksum init to first call of fasthash()
    - Don't alloc a page for hash testing, use arch zero pages for that

Signed-off-by: Timofey Titovets 
Signed-off-by: leesioh 
CC: Andrea Arcangeli 
CC: linux...@kvack.org
CC: k...@vger.kernel.org
---
 mm/Kconfig |   4 +++
 mm/ksm.c   | 114 -
 2 files changed, 109 insertions(+), 9 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 03ff7703d322..d4fb147d4a22 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -305,6 +305,10 @@ config MMU_NOTIFIER
 config KSM
bool "Enable KSM for page merging"
depends on MMU
+   select XXHASH
+   select CRYPTO
+   select CRYPTO_HASH
+   select CRYPTO_CRC32C
help
  Enable Kernel Samepage Merging: KSM periodically scans those areas
  of an application's address space that an app has advised may be
diff --git a/mm/ksm.c b/mm/ksm.c
index be8f4576f842..b90ad6903dc6 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -25,7 +25,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -41,6 +40,13 @@
 #include 
 
 #include 
+
+/* Support for xxhash and crc32c */
+#include 
+#include 
+#include 
+#include 
+
 #include "internal.h"
 
 #ifdef CONFIG_NUMA
@@ -186,7 +192,7 @@ struct rmap_item {
};
struct mm_struct *mm;
unsigned long address;  /* + low bits used for flags below */
-   unsigned int oldchecksum;   /* when unstable */
+   unsigned long oldchecksum;  /* when unstable */
union {
struct rb_node node;/* when node of unstable tree */
struct {/* when listed from stable tree */
@@ -255,7 +261,7 @@ static unsigned int ksm_thread_pages_to_scan = 100;
 static unsigned int ksm_thread_sleep_millisecs = 20;
 
 /* Checksum of an empty (zeroed) page */
-static unsigned int zero_checksum __read_mostly;
+static unsigned long zero_checksum __read_mostly;
 
 /* Whether to merge empty (zeroed) pages with actual zero pages */
 static bool ksm_use_zero_pages __read_mostly;
@@ -284,6 +290,98 @@ static DEFINE_SPINLOCK(ksm_mmlist_lock);
sizeof(struct __struct), __alignof__(struct __struct),\
(__flags), NULL)
 
+#define TIME_125MS  (HZ >> 3)
+#define PERF_TO_MBS(X) (X*PAGE_SIZE*(1 << 3)/(SZ_1M))
+
+#define HASH_NONE   0
+#define HASH_CRC32C 1
+#define HASH_XXHASH 2
+
+static struct shash_desc desc;
+
+static int fastest_hash = 0;
+
+static void __init choice_fastest_hash(void)
+{
+   void *page = ZERO_PAGE(0);
+   unsigned long checksum, perf, je;
+   unsigned long best_perf = 0;
+
+   desc.tfm = crypto_alloc_shash("crc32c", 0, 0);
+   desc.flags = 0;
+
+   if (IS_ERR(desc.tfm)) {
+   pr_warn("ksm: alloc crc32c shash error %ld\n",
+   -PTR_ERR(desc.tfm));
+   fastest_hash = HASH_XXHASH;
+   goto out;
+   }
+
+   perf = 0;
+   preempt_disable();
+   je = jiffies + TIME_125MS;
+   while (time_before(jiffies, je)) {
+   crypto_shash_digest(&desc, page, PAGE_SIZE, (u8 *)&checksum);
+   perf++;
+   }
+   preempt_enable();
+
+   if (best_perf < 

Re: [PATCH v1] eSPI: add Aspeed AST2500 eSPI driver to boot a host with PCH runs on eSPI

2017-12-30 Thread Arnd Bergmann
On Fri, Dec 29, 2017 at 2:53 AM, Haiyue Wang
 wrote:
> When PCH works under eSPI mode, the PMC (Power Management Controller) in
> PCH is waiting for SUS_ACK from BMC after it alerts SUS_WARN. It is in
> dead loop if no SUS_ACK assert. This is the basic requirement for the BMC
> works as eSPI slave.
>
> Also for the host power on / off actions, from BMC side, the following VW
> (Virtual Wire) messages are done in firmware:
> 1. SLAVE_BOOT_LOAD_DONE / SLAVE_BOOT_LOAD_STATUS
> 2. SUS_ACK
> 3. OOB_RESET_ACK
> 4. HOST_RESET_ACK

I have not looked at the driver contents yet, but I'm adding the SPI
maintainer and mailing list to Cc here for further discussion. Can you
clarify how the eSPI slave mode relates to the SPI slaves that we already
support? I was under the impression that the difference between SPI and
eSPI is mainly on the master side, but that any SPI slave can also act as
an eSPI slave. Would this driver fit into the SPI slave framework, possibly
with some extensions to the generic abstraction?

It also seems rather inflexible to have a single driver that is responsible
both for the transport (eSPI register level interface for ASPEED) and the
high-level protocol (talking to an Intel PCH), since either half of the work
could be done elsewhere, using either a different eSPI slave implementation
or a different host architecture.

   Arnd


Re: About the try to remove cross-release feature entirely by Ingo

2017-12-30 Thread Theodore Ts'o
On Sat, Dec 30, 2017 at 05:40:28PM -0500, Theodore Ts'o wrote:
> On Sat, Dec 30, 2017 at 12:44:17PM -0800, Matthew Wilcox wrote:
> > 
> > I'm not sure I agree with this part.  What if we add a new TCP lock class
> > for connections which are used for filesystems/network block devices/...?
> > Yes, it'll be up to each user to set the lockdep classification correctly,
> > but that's a relatively small number of places to add annotations,
> > and I don't see why it wouldn't work.
> 
> I was exaggerating a bit for effect, I admit.  (but only a bit).
> 
> It can probably be for all TCP connections that are used by kernel
> code (as opposed to userspace-only TCP connections).  But it would
> probably have to be each and every device-mapper instance, each and
> every block device, each and every mounted file system, each and every
> bdi object, etc.

Clarification: all TCP connections that are used by kernel code would
need to be in their own separate lock class.  All TCP connections used
only by userspace could be in their own shared lock class.  You can't
use one lock class for all kernel-used TCP connections, because of the
"Network Block Device mounted on a local file system which is then
exported via NFS and squirted out yet another TCP connection" problem.
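
To illustrate why one shared class breaks down, here is a toy userspace
model of a class-level dependency graph. It is not lockdep code: all ex_
names are hypothetical, and the direction of the recorded dependencies
(NFS socket -> file system -> NBD -> NBD socket) is an assumption made
for the example.

```c
#include <stdbool.h>
#include <string.h>

#define EX_MAX_CLASSES 8

/* dep[a][b] is true when a lock of class b was taken while holding class a. */
struct ex_lock_graph {
	bool dep[EX_MAX_CLASSES][EX_MAX_CLASSES];
};

static void ex_add_dep(struct ex_lock_graph *g, int held, int taken)
{
	g->dep[held][taken] = true;
}

/* Depth-first search: can 'target' be reached from 'from'? */
static bool ex_reaches(const struct ex_lock_graph *g, int from, int target,
		       bool *seen)
{
	for (int next = 0; next < EX_MAX_CLASSES; next++) {
		if (!g->dep[from][next])
			continue;
		if (next == target)
			return true;
		if (!seen[next]) {
			seen[next] = true;
			if (ex_reaches(g, next, target, seen))
				return true;
		}
	}
	return false;
}

/* A class that can reach itself means the class-level graph has a cycle. */
static bool ex_has_cycle(const struct ex_lock_graph *g)
{
	for (int c = 0; c < EX_MAX_CLASSES; c++) {
		bool seen[EX_MAX_CLASSES] = { false };

		if (ex_reaches(g, c, c, seen))
			return true;
	}
	return false;
}

/*
 * Record the stacked setup from the message. The two arguments are the
 * class ids given to the NFS and NBD TCP connections (0 or 1 here).
 */
static bool ex_nbd_over_nfs_reports_cycle(int nfs_tcp_class, int nbd_tcp_class)
{
	enum { EX_FS = 2, EX_NBD = 3 };
	struct ex_lock_graph g;

	memset(&g, 0, sizeof(g));
	ex_add_dep(&g, nfs_tcp_class, EX_FS);
	ex_add_dep(&g, EX_FS, EX_NBD);
	ex_add_dep(&g, EX_NBD, nbd_tcp_class);
	return ex_has_cycle(&g);
}
```

In this model, giving both connections the same class id closes a loop in
the graph even though the two sockets are distinct objects, while giving
them separate class ids does not.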

Also, what to do with TCP connections which are created in userspace
(with some authentication exchanges happening in userspace), and then
passed into kernel space for use in kernel space, is an interesting
question.

So "all you have to do is classify the locks 'properly'" is much like
the apocryphal "all you have to do is bell the cat"[1].  Or like the
saying, "colonizing the stars is *easy*; all you have to do is figure
out faster-than-light travel."

[1] https://en.wikipedia.org/wiki/Belling_the_cat

- Ted


Re: Review of KPTI patchset

2017-12-30 Thread Thomas Gleixner
On Sat, 30 Dec 2017, Thomas Gleixner wrote:
> On Sat, 30 Dec 2017, Mathieu Desnoyers wrote:
> The only asymmetry is in the error path of write_ldt() which can leak a half
> allocated page table. But, that's a nasty one because if there is an
> existing LDT mapped, then the pagetable cannot be freed. So yes, it's not
> nice, but harmless and needs some thought to fix.

In fact it's not a leak. It's just memory waste because the pagetable gets
freed when the process exits.

The memory waste is rather simple to fix. Delta patch below.

Thanks,

tglx

8<--
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -421,6 +421,14 @@ static int write_ldt(void __user *ptr, u
 */
error = map_ldt_struct(mm, new_ldt, old_ldt ? !old_ldt->slot : 0);
if (error) {
+   /*
+* Drop potentially half populated page table if the
+* mapping code failed and this was the first attempt to
+* install a LDT. If there is a LDT installed then the LDT
+* pagetable cannot be freed for obvious reasons.
+*/
+   if (!old_ldt)
+   free_ldt_pgtables(mm);
free_ldt_struct(new_ldt);
goto out_unlock;
}


Re: About the try to remove cross-release feature entirely by Ingo

2017-12-30 Thread Theodore Ts'o
On Sat, Dec 30, 2017 at 12:44:17PM -0800, Matthew Wilcox wrote:
> 
> I'm not sure I agree with this part.  What if we add a new TCP lock class
> for connections which are used for filesystems/network block devices/...?
> Yes, it'll be up to each user to set the lockdep classification correctly,
> but that's a relatively small number of places to add annotations,
> and I don't see why it wouldn't work.

I was exaggerating a bit for effect, I admit.  (but only a bit).

It can probably be for all TCP connections that are used by kernel
code (as opposed to userspace-only TCP connections).  But it would
probably have to be each and every device-mapper instance, each and
every block device, each and every mounted file system, each and every
bdi object, etc.

The point I was trying to drive home is that "all we have to do is
just classify everything well or just invalidate the right lock
objects" is a massive understatement of the complexity level of what
would be required, or the number of locks/completion handlers that
would have to be blacklisted.

- Ted


Re: [PATCH v3 net-next 2/5] net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state tracepoint

2017-12-30 Thread Brendan Gregg
On Tue, Dec 19, 2017 at 7:12 PM, Yafang Shao  wrote:
> As sk_state is a common field of struct sock, the state
> transition tracepoint should not be a TCP-specific feature.
> Currently it traces all AF_INET state transitions, so I rename this
> tracepoint to inet_sock_set_state with some minor changes and
> move it into trace/events/sock.h.

The tcp:tcp_set_state probe is tcp_set_state(), so it's only going to
fire for TCP sessions. It's not broken, and we could add a
sctp:sctp_set_state as well. Replacing tcp:tcp_set_state with
inet_sk_set_state is feeling like we might be baking too much
implementation detail into the tracepoint API.

If we must have inet_sk_set_state, then must we also delete tcp:tcp_set_state?

Brendan


> We don't need to create a file named trace/events/inet_sock.h for this
> one single tracepoint.
>
> Two helpers are introduced to trace sk_state transitions:
> - void inet_sk_state_store(struct sock *sk, int newstate);
> - void inet_sk_set_state(struct sock *sk, int state);
> As trace headers should not be included in other header files,
> they are defined in sock.c.
>
> A protocol such as SCTP may be compiled as a module (ko), hence export
> inet_sk_set_state().
>[...]


[GIT PULL] SCSI fixes for 4.15-rc5

2017-12-30 Thread James Bottomley
Two simple fixes, both for bugs which cause I/O hangs.  The storvsc one
is from Hyper-V, which can hang under certain hot add/remove conditions,
and the other is general: removing a target and a device in close
proximity can result in the release method being executed twice (with
subsequent list and other corruption and an eventual panic).

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Cathy Avery (1):
  scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error

Hannes Reinecke (1):
  scsi: core: check for device state in __scsi_remove_target()

And the diffstat:


 drivers/scsi/scsi_sysfs.c  | 5 -
 drivers/scsi/storvsc_drv.c | 3 ++-
 2 files changed, 6 insertions(+), 2 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index a9996c16f4ae..26ce17178401 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1415,7 +1415,10 @@ static void __scsi_remove_target(struct scsi_target *starget)
 * check.
 */
if (sdev->channel != starget->channel ||
-   sdev->id != starget->id ||
+   sdev->id != starget->id)
+   continue;
+   if (sdev->sdev_state == SDEV_DEL ||
+   sdev->sdev_state == SDEV_CANCEL ||
!get_device(&sdev->sdev_gendev))
continue;
spin_unlock_irqrestore(shost->host_lock, flags);
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 1b06cf0375dc..3b3d1d050cac 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -953,10 +953,11 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb,
case TEST_UNIT_READY:
break;
default:
-   set_host_byte(scmnd, DID_TARGET_FAILURE);
+   set_host_byte(scmnd, DID_ERROR);
}
break;
case SRB_STATUS_INVALID_LUN:
+   set_host_byte(scmnd, DID_NO_CONNECT);
do_work = true;
process_err_fn = storvsc_remove_lun;
break;
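
The scsi_sysfs.c hunk above can be illustrated with a minimal userspace
sketch. It assumes a flat array instead of the host's device list, and
every name in it (mock_dev, mock_remove, remove_target) is hypothetical
and not part of the SCSI midlayer. It only models the added check: a
device that is already in the deleted or cancelled state is skipped, so
its removal path cannot run a second time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for SDEV_RUNNING / SDEV_CANCEL / SDEV_DEL. */
enum mock_state { MOCK_RUNNING, MOCK_CANCEL, MOCK_DEL };

struct mock_dev {
	unsigned int channel, id;
	enum mock_state state;
	int removals;		/* how often removal ran for this device */
};

/* Models the removal path; running it twice on one device is the bug. */
static void mock_remove(struct mock_dev *dev)
{
	assert(dev->state != MOCK_DEL);
	dev->state = MOCK_DEL;
	dev->removals++;
}

/* Remove all live devices of one target; returns how many were removed. */
static int remove_target(struct mock_dev *devs, size_t n,
			 unsigned int channel, unsigned int id)
{
	int removed = 0;

	for (size_t i = 0; i < n; i++) {
		struct mock_dev *dev = &devs[i];

		if (dev->channel != channel || dev->id != id)
			continue;
		/* The added check: skip devices already being torn down. */
		if (dev->state == MOCK_DEL || dev->state == MOCK_CANCEL)
			continue;
		mock_remove(dev);
		removed++;
	}
	return removed;
}
```

Calling remove_target() twice for the same target removes each live
device once; the second call finds nothing left to do. The real function
additionally drops the host lock and restarts the list walk, which this
sketch leaves out.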


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Josh Poimboeuf
On Sun, Dec 31, 2017 at 01:03:25AM +0300, Alexander Tsoy wrote:
> > Turns out my previous code to print iret frames was a bit ...
> > misguided, to put it nicely.  Not sure what I was smoking.
> > 
> > Hopefully the below patch should fix it (in place of the previous
> > patch).  Would you mind testing again?
> > 
> 
> With that patch I get:
> 
> [2.160017] NMI backtrace for cpu 0
> [2.160017] CPU: 0 PID: 1 Comm: init Not tainted 4.15.0-rc5 #1
> [2.160017] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014
> [2.160017] RIP: 0010:double_fault+0x0/0x30
> [2.160017] RSP: :fe807fd0 EFLAGS: 00010086
> [2.160017] RAX: ffc0 RBX: 0001 RCX: c101
> [2.160017] RDX: 8edc RSI:  RDI: fe807f58
> [2.160017] RBP:  R08:  R09: 
> [2.160017] R10:  R11:  R12: a3c01426
> [2.160017] R13:  R14:  R15: 
> [2.160017] FS:  () GS:8edcffc0() knlGS:
> [2.160017] CS:  0010 DS:  ES:  CR0: 80050033
> [2.160017] CR2: fe806f08 CR3: 7c153000 CR4: 06b0
> [2.160017] Call Trace:
> [2.160017]  <#DF>
> [2.160017] RIP: 0010:do_double_fault+0xb/0x140
> [2.160017] RSP: :fe806f18 EFLAGS: 00010086
> [2.160017]  

Yes, that's more like it.  I'll clean up the patches and submit them
soon.  These nasty bugs are always a good testcase for the stack dump
code.

Thanks for testing!

-- 
Josh


Re: [patch 0/3] x86/pti: Fix various fallout

2017-12-30 Thread Linus Torvalds
On Sat, Dec 30, 2017 at 1:35 PM, Ingo Molnar  wrote:
>
> Linus, I suspect -rc6 is imminent, and it would be nice to at least have the
> LDT error path fix in. I'll send you these fixes tomorrow, but feel free to
> pick it up from email if you wanted to release -rc6 today.

I'll do rc6 tomorrow probably around this time (early afternoon PST).
So I think I should be ok just waiting for your pull request.

Thanks,

  Linus


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Alexander Tsoy
On Sat, 30 Dec 2017 11:57:46 -0600
Josh Poimboeuf wrote:

> On Sat, Dec 30, 2017 at 11:09:46AM -0600, Josh Poimboeuf wrote:
> > On Sat, Dec 30, 2017 at 11:45:13AM +0300, Alexander Tsoy wrote:  
> > > On Fri, 29/12/2017 at 21:49 -0600, Josh Poimboeuf wrote:
> > > > On Fri, Dec 29, 2017 at 05:10:35PM -0700, Andy Lutomirski
> > > > wrote:  
> > > > > (Also, Josh, the oops code should have printed the contents
> > > > > of the struct pt_regs at the top of the DF stack.  Any idea
> > > > > why it didn't?)  
> > > > 
> > > > Looking at one of the dumps:
> > > > 
> > > >   [  392.774879] NMI backtrace for cpu 0
> > > >   [  392.774881] CPU: 0 PID: 1 Comm: init Not tainted 4.14.9-gentoo #1
> > > >   [  392.774881] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> > > >   [  392.774882] task: 8802368b8000 task.stack: c900c000
> > > >   [  392.774885] RIP: 0010:double_fault+0x0/0x30
> > > >   [  392.774886] RSP: :ff527fd0 EFLAGS: 0086
> > > >   [  392.774887] RAX: 3fc0 RBX: 0001
> > > > RCX: c101
> > > >   [  392.774887] RDX: 8802 RSI: 
> > > > RDI: ff527f58
> > > >   [  392.774887] RBP:  R08: 
> > > > R09: 
> > > >   [  392.774888] R10:  R11: 
> > > > R12: 816ae726
> > > >   [  392.774888] R13:  R14: 
> > > > R15: 
> > > >   [  392.774889] FS:  ()
> > > > GS:88023fc0() knlGS:
> > > >   [  392.774889] CS:  0010 DS:  ES:  CR0: 80050033
> > > >   [  392.774890] CR2: ff526f08 CR3: 000235b48002 CR4: 001606f0
> > > >   [  392.774892] Call Trace:
> > > >   [  392.774894]  <#DF>
> > > >   [  392.774897]  do_double_fault+0xb/0x140
> > > >   [  392.774898]  
> > > > 
> > > > It should have at least printed the #DF iret frame registers,
> > > > which I recently added support for in "x86/unwinder: Handle
> > > > stack overflows more gracefully", which is in both 4.14.9 and
> > > > 4.15-rc5.
> > > > 
> > > > I think the missing iret regs are due to a bug in
> > > > show_trace_log_lvl(), where if the unwind starts with two regs
> > > > frames in a row, the second regs don't get printed.
> > > > 
> > > > Alexander, would you mind reproducing again with the below
> > > > patch?  It should still fail, but this time it should hopefully
> > > > show another RIP/RSP/EFLAGS instead of the
> > > > "do_double_fault+0xb/0x140" line. 
> > > 
> > > Yes, it works:
> > > 
> > > [   23.058064] NMI backtrace for cpu 2
> > > [   23.058068] CPU: 2 PID: 1 Comm: init Not tainted 4.15.0-rc5+ #1
> > > [   23.058069] Hardware name: QEMU Standard PC (i440FX + PIIX,
> > > 1996), BIOS 1.10.2-1.fc27 04/01/2014
> > > [   23.058074] RIP: 0010:double_fault+0x0/0x30
> > > [   23.058075] RSP: :fe85ffd0 EFLAGS: 0086
> > > [   23.058077] RAX: 3fd0 RBX: 0001 RCX:
> > > c101
> > > [   23.058077] RDX: 9681 RSI:  RDI:
> > > fe85ff58
> > > [   23.058078] RBP:  R08:  R09:
> > > 
> > > [   23.058079] R10:  R11:  R12:
> > > 92001426
> > > [   23.058080] R13:  R14:  R15:
> > > 
> > > [   23.058083] FS:  ()
> > > GS:96813fd0() knlGS:
> > > [   23.058084] CS:  0010 DS:  ES:  CR0: 80050033
> > > [   23.058085] CR2: fe85ef08 CR3: 000137a09000 CR4:
> > > 000406a0
> > > [   23.058089] Call Trace:
> > > [   23.058101]  <#DF>
> > > [   23.058104] RIP: 0010:do_double_fault+0xb/0x140
> > > [   23.058105] RSP: :fe85ef18 EFLAGS: 00010086
> > > ORIG_RAX: 
> > > [   23.058106] RAX: 3fd0 RBX: 0001 RCX:
> > > c101
> > > [   23.058107] RDX: 9681 RSI:  RDI:
> > > fe85ff58
> > > [   23.058107] RBP:  R08:  R09:
> > > 
> > > [   23.058108] R10:  R11:  R12:
> > > 92001426
> > > [   23.058108] R13:  R14:  R15:
> > > 
> > > [   23.058111]  
> > > [   23.058111] Code: 05 00 00 48 89 e7 31 f6 e8 2e 8c 61 ff e9 69
> > > 06 00 00 e8 94 05 00 00 48 89 e7 31 f6 e8 1a 8c 61 ff e9 55 06 00
> > > 00 0f 1f 44 00 00 <0f> 1f 00 48 83 c4 88 e8 e4 04 00 00 48 89 e7
> > > 48 8b 74 24 78 48  
> > 
> > That's better indeed, though still not quite right.  It should have
> > only shown a subset of those registers.  One more bug to fix
> > there...  
> 
> Turns out my previous code to print iret frames was a bit ...
> misguided, to put it nicely.  Not sure what I was smoking.
> 
> Hopefully the below patch should fix it (in place of the previous
> patch).  Would you mind testing again?


Re: Review of KPTI patchset

2017-12-30 Thread Thomas Gleixner
On Sat, 30 Dec 2017, Mathieu Desnoyers wrote:
> - On Dec 30, 2017, at 2:58 PM, Thomas Gleixner t...@linutronix.de wrote:
> > /*
> >  * Called on fork from arch_dup_mmap(). Just copy the current LDT state,
> >  * the new task is not running, so nothing can be installed.
> >  */
> > int ldt_dup_context(struct mm_struct *old_mm, struct mm_struct *mm)
> > {
> >   struct ldt_struct *new_ldt;
> >   int retval = 0;
> >
> >   if (!old_mm)
> >   return 0;
> 
> If old_mm is NULL, free_ldt_pgtables(mm) is not called.

Correct. Why should it be called? Nothing is allocated. You cannot
associate anything to a NULL pointer and you cannot duplicate the LDT
referenced by a NULL pointer, right?

> >   mutex_lock(&old_mm->context.lock);
> >   if (!old_mm->context.ldt)
> 
> If old_mm->context.ldt is NULL, free_ldt_pgtables(mm) is not called.

Again. Why would it be called? There is no page table allocated yet. Care
to look at the calling context or the comment above the function which
explains it?

That's fork: old_mm is the parent mm and mm is the child mm. So how would
the new child mm have that LDT pagetable _before_ it was ever tried to
allocate/map?

> >   goto out_unlock;
> >
> >   new_ldt = alloc_ldt_struct(old_mm->context.ldt->nr_entries);
> >   if (!new_ldt) {
> >   retval = -ENOMEM;
> 
> On allocation error, free_ldt_pgtables(mm) is not called.

Once more. There is no page table allocated yet.

> >   goto out_unlock;
> >   }
> >
> >   memcpy(new_ldt->entries, old_mm->context.ldt->entries,
> >  new_ldt->nr_entries * LDT_ENTRY_SIZE);
> >   finalize_ldt_struct(new_ldt);
> >
> >   retval = map_ldt_struct(mm, new_ldt, 0);
> >   if (retval) {
> >   free_ldt_pgtables(mm);
> 
> Here, if we fail to map_ldt_struct, then free_ldt_pgtables(mm) is called.
> >   free_ldt_struct(new_ldt);
> 
> In addition to call free_ldt_struct(), but map_ldt_struct failed... ?

Yes, because we failed to map it and free_ldt_pgtables() cleans up the
leftovers of the map function, which can exit with error and a half
populated page table.

free_ldt_struct() must be called otherwise we leak new_ldt which got
allocated above.

> This lack of symmetry makes me uncomfortable, and it may hint at something
> fishy.

It's not asymmetric at all.

The only asymmetry is in the error path of write_ldt() which can leak a half
allocated page table. But, that's a nasty one because if there is an
existing LDT mapped, then the pagetable cannot be freed. So yes, it's not
nice, but harmless and needs some thought to fix.

In the ldt_dup() case we can remove it right away because nothing has been
exposed at that point.

> >> +  this_cpu_write(cpu_tlbstate.invalidate_other, false);
> >> +}
> >> 
> >> Can this be called with preemption enabled ? If so, what happens
> >> if migrated ?
> > 
> > No, it can't and if it is then it's a bug and the smp_processor_id() debug
> > code will yell at you.
> 
> I thought the whole point about this_cpu_*() was that it could be called
> with preemption enabled, given that it figures out the per-cpu data offset
> using a segment selector prefix. How would smp_processor_id() debug code be
> involved here ?

You're right, the this_cpu_read/write won't. But the this_cpu_ptr()
dereference in one of the invoked functions will. 

Granted, it's not obvious and ideally we convert those this_cpu_read/writes
to __this_cpu_read/writes() to get the immediate fail reported on the first
access.

Thanks,

tglx
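
The cleanup rule discussed in the message above (always free the newly
allocated LDT on a mapping failure, but drop the page table only when no
LDT was installed before) can be sketched in userspace as follows. This
is a minimal model under stated assumptions: struct mock_mm, mock_map()
and install_ldt() are hypothetical stand-ins, not the kernel's
structures or functions, and a boolean takes the place of the LDT page
table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct ldt_struct and struct mm_struct. */
struct mock_ldt { int nr_entries; };
struct mock_mm {
	struct mock_ldt *ldt;	/* currently installed LDT, or NULL */
	bool pgtables;		/* models a (possibly half) populated page table */
};

/* Models a mapping step that may populate page tables and still fail. */
static int mock_map(struct mock_mm *mm, bool fail)
{
	mm->pgtables = true;
	return fail ? -1 : 0;
}

/* Models the install step with the error handling described above. */
static int install_ldt(struct mock_mm *mm, int nr_entries, bool fail_map)
{
	struct mock_ldt *old_ldt, *new_ldt;

	assert(mm != NULL);
	old_ldt = mm->ldt;
	new_ldt = malloc(sizeof(*new_ldt));
	if (!new_ldt)
		return -1;
	new_ldt->nr_entries = nr_entries;

	if (mock_map(mm, fail_map)) {
		/* First install attempt: nothing else uses the page table. */
		if (!old_ldt)
			mm->pgtables = false;
		/* Free the new object, never the installed one. */
		free(new_ldt);
		return -1;
	}

	mm->ldt = new_ldt;
	free(old_ldt);
	return 0;
}
```

A failed first install leaves the mock mm with neither an LDT nor a page
table; a failure while an LDT is already installed leaves both the old
LDT and its page table untouched.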


Re: [kernel-hardening] [PATCH 0/5] RFC: Public key encryption of dmesg by the kernel

2017-12-30 Thread Jann Horn
On Sat, Dec 30, 2017 at 6:57 PM, Dan Aloni  wrote:
> From: Dan Aloni 
>
> Hi All,
>
> There has been a lot of progress in recent times regarding the removal
> of sensitive information from dmesg (pointers, etc.), so I figured - why
> not encrypt it all? However, I have not found any existing discussions
> or references regarding this technical direction.
>
> I am not sure that desktop and power users would like to have their
> kernel message encrypted, but there are scenarios such as in mobile
> devices, where only the developers, makers of devices, may actually
> benefit from access to kernel prints messages, and the users may be
> more protected from exploits.

What is the benefit of your approach compared to setting
dmesg_restrict=1 or something like that and letting userland decide
who should get access to raw dmesg output and in what form?


uaccess.h: implement unsafe_copy_{to,from}_user()?

2017-12-30 Thread Alexander Kappner
Commit 5b24a7a2aa2040c8c50c3b71122901d01661ff78 introduced the 
unsafe_get_user and unsafe_put_user replacement functions for batched calls 
to put_user and get_user. I'm trying to make the kernel smaller and reduce 
stac/clac overhead on x86 by substituting the new functions for such 
batched calls. But there's no corresponding unsafe_copy_to_user() 
or unsafe_copy_from_user() functions to copy an arbitrary-sized buffer to 
and from userspace without calling access_ok and __uaccess_begin/end.

I know that the matter of replacing these uaccess functions has been 
discussed at length (see https://lkml.org/lkml/2017/5/13/134), so before I 
started hacking away implementing new unsafe_copy_{to,from}_user functions, 
I wanted to ask if a solution to this is already being worked on or if
there's some way of accomplishing this goal without new functions.

To illustrate, here's a batched function call (from fs/fat/dir.c):

if (put_user(0, d2->d_name) ||   
put_user(0, &d2->d_reclen)  || 
copy_to_user(d1->d_name, name, name_len) || 
// etc...
goto efault;  

This should read:

if (!access_ok(VERIFY_WRITE, d1, 2*sizeof(*d1)))
	goto efault;
user_access_begin();
unsafe_put_user(0, d2->d_name, efault);
unsafe_put_user(0, &d2->d_reclen, efault);
unsafe_copy_to_user(d1->d_name, name, name_len, efault); // we don't have this function
// etc...
user_access_end();



Thanks.
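
The batching pattern asked about above can be sketched in plain
userspace C. This is only a model under stated assumptions: the mock_*
macros and functions are hypothetical and are not the kernel's uaccess
API, the "access window" is a flag rather than stac/clac, and faults are
injected through a counter. It shows one up-front check, one window
around several stores and a copy, and a single error label that also
closes the window.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int window_open;		/* 1 while the mock access window is open */
static int window_opens;	/* how many times the window was opened */
static int mock_fault_at = -1;	/* index of the access that "faults", -1 for none */
static int mock_accesses;	/* accesses performed so far */

struct mock_dirent { long d_reclen; char d_name[16]; };

static void mock_access_begin(void) { window_open = 1; window_opens++; }
static void mock_access_end(void) { window_open = 0; }
static int mock_faults(void) { return mock_accesses++ == mock_fault_at; }

/* Store or copy inside the window; jump to the label on a mock fault. */
#define mock_unsafe_put(val, ptr, label) \
	do { assert(window_open); if (mock_faults()) goto label; *(ptr) = (val); } while (0)
#define mock_unsafe_copy(dst, src, len, label) \
	do { assert(window_open); if (mock_faults()) goto label; memcpy((dst), (src), (len)); } while (0)

static int fill_pair(struct mock_dirent *d1, struct mock_dirent *d2,
		     const char *name, size_t name_len)
{
	/* One range check up front, standing in for access_ok(). */
	if (!d1 || !d2 || name_len >= sizeof(d1->d_name))
		return -1;

	mock_access_begin();
	mock_unsafe_put(0, &d2->d_name[0], efault);
	mock_unsafe_put(0, &d2->d_reclen, efault);
	mock_unsafe_copy(d1->d_name, name, name_len, efault);
	mock_unsafe_put(0, &d1->d_name[name_len], efault);
	mock_unsafe_put((long)name_len, &d1->d_reclen, efault);
	mock_access_end();
	return 0;

efault:
	mock_access_end();	/* close the window on the error path as well */
	return -1;
}
```

Whether a real unsafe_copy_to_user() would take a label argument like
this is exactly the open question of the message; the sketch only shows
why a label-based helper fits naturally next to the existing
unsafe_put_user() style.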


Re: [patch 0/3] x86/pti: Fix various fallout

2017-12-30 Thread Ingo Molnar

* Thomas Gleixner  wrote:

> The small series fixes the recent fallout of the x86/pti merge:
> 
>  - Remove the stale local_flush_tlb() invocations from the CPU hotplug code
> 
>  - Remove the stale preempt_disable/enable() pair from __native_flush_tlb()
> 
>  - Fix a bogus free in the write_ldt() error path

Linus, I suspect -rc6 is imminent, and it would be nice to at least have the
LDT error path fix in. I'll send you these fixes tomorrow, but feel free to
pick it up from email if you wanted to release -rc6 today.

Thanks,

Ingo


Re: [patch 1/3] x86/ldt: Free the right LDT memory in write_ldt() error path

2017-12-30 Thread Ingo Molnar

* Thomas Gleixner  wrote:

> The error path in write_ldt() frees the already installed LDT memory
> instead of the newly allocated which cannot be installed.

s/newly allocated
 /newly allocated one

> 
> Fixes: f55f0501cbf6 ("x86/pti: Put the LDT in its own PGD if PTI is on")
> Reported-by: Mathieu Desnoyers 
> Signed-off-by: Thomas Gleixner 
> ---
>  arch/x86/kernel/ldt.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/arch/x86/kernel/ldt.c
> +++ b/arch/x86/kernel/ldt.c
> @@ -421,7 +421,7 @@ static int write_ldt(void __user *ptr, u
>*/
>   error = map_ldt_struct(mm, new_ldt, old_ldt ? !old_ldt->slot : 0);
>   if (error) {
> - free_ldt_struct(old_ldt);
> + free_ldt_struct(new_ldt);
>   goto out_unlock;
>   }
>  

This bug kind of scares me ...

Reviewed-by: Ingo Molnar 

Thanks,

Ingo
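
The class of bug fixed here can be sketched in userspace. This is only an illustration under assumptions: struct table, the static pool, table_alloc(), table_free(), table_replace() and the map callbacks are hypothetical names, and the pool replaces real allocation so that a wrong free is observable without undefined behaviour. It does not reproduce the LDT code.

```c
#include <stddef.h>

/* Hypothetical object that gets swapped in, loosely like an LDT. */
struct table {
    int in_use;
    int nr_entries;
};

static struct table pool[4];      /* tiny allocator so frees are visible */
static struct table *installed;   /* the currently installed table */

static struct table *table_alloc(int nr_entries)
{
    for (size_t i = 0; i < sizeof(pool) / sizeof(pool[0]); i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].nr_entries = nr_entries;
            return &pool[i];
        }
    }
    return NULL;
}

static void table_free(struct table *t)
{
    if (t)
        t->in_use = 0;
}

/* Allocate a new table, try to map it, and install it on success. */
static int table_replace(int nr_entries, int (*map)(struct table *))
{
    struct table *old = installed;
    struct table *fresh = table_alloc(nr_entries);
    int error;

    if (!fresh)
        return -1;

    error = map(fresh);
    if (error) {
        /* Free the table that could NOT be installed, never 'old':
         * 'old' is still live and still referenced by 'installed'. */
        table_free(fresh);
        return error;
    }

    installed = fresh;
    table_free(old);   /* only now is the old table unused */
    return 0;
}

static int map_ok(struct table *t)   { (void)t; return 0; }
static int map_fail(struct table *t) { (void)t; return -1; }
```

Freeing 'old' in the error branch would leave 'installed' pointing at released memory, which is presumably why the bug is scary.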


Re: [patch 2/3] x86/smpboot: Remove stale tlb flush invocations

2017-12-30 Thread Ingo Molnar

* Thomas Gleixner  wrote:

> smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector()
> invoke local_flush_tlb() for no obvious reason.
> 
> Digging in history revealed that the original code in the 2.1 era added
> those because the code manipulated a swapper_pg_dir pagetable entry. The
> pagetable manipulation was removed long ago in the 2.3 timeframe, but the
> tlb flush invocations stayed around forever.

s/tlb/TLB

> 
> Remove them along with the pointless pr_debugs which come from the same 2.1
> change.
> 
> Reported-by: Dominik Brodowski 
> Signed-off-by: Thomas Gleixner 
> ---
>  arch/x86/kernel/smpboot.c |9 -
>  1 file changed, 9 deletions(-)
> 
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -128,14 +128,10 @@ static inline void smpboot_setup_warm_re
>   spin_lock_irqsave(&rtc_lock, flags);
>   CMOS_WRITE(0xa, 0xf);
>   spin_unlock_irqrestore(&rtc_lock, flags);
> - local_flush_tlb();
> - pr_debug("1.\n");
>   *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_HIGH)) =
>   start_eip >> 4;
> - pr_debug("2.\n");
>   *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) =
>   start_eip & 0xf;
> - pr_debug("3.\n");
>  }
>  
>  static inline void smpboot_restore_warm_reset_vector(void)
> @@ -143,11 +139,6 @@ static inline void smpboot_restore_warm_
>   unsigned long flags;
>  
>   /*
> -  * Install writable page 0 entry to set BIOS data area.
> -  */
> - local_flush_tlb();
> -
> - /*
>* Paranoid:  Set warm reset code and vector here back
>* to default values.
>*/

Really nice archeology! :-)

Reviewed-by: Ingo Molnar 

Thanks,

Ingo
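
As a side note on the two stores that remain in the quoted hunk: they write start_eip >> 4 and start_eip & 0xf. Reading that pair as a real-mode segment:offset vector is an interpretation, not something the patch states, and struct warm_reset_vector, split_start_eip() and vector_to_linear() are hypothetical names used only for this arithmetic sketch.

```c
#include <stdint.h>

/* Hypothetical view of the two 16-bit values the hunk stores. */
struct warm_reset_vector {
    uint16_t segment;   /* start_eip >> 4  (the "HIGH" store) */
    uint16_t offset;    /* start_eip & 0xf (the "LOW" store)  */
};

static struct warm_reset_vector split_start_eip(uint32_t start_eip)
{
    struct warm_reset_vector v;

    v.segment = (uint16_t)(start_eip >> 4);
    v.offset = (uint16_t)(start_eip & 0xf);
    return v;
}

/* Real-mode address computation: segment * 16 + offset. */
static uint32_t vector_to_linear(struct warm_reset_vector v)
{
    return ((uint32_t)v.segment << 4) + v.offset;
}
```

For any start address below 1 MiB the split round-trips, for example 0x12345 becomes 0x1234:0x5 and maps back to 0x12345.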


Re: [patch 3/3] x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()

2017-12-30 Thread Ingo Molnar

The cleanup looks good to me, just a few speling nits:

* Thomas Gleixner  wrote:

> The preempt_disable/enable() pair in __native_flush_tlb() was added in
> commit 5cf0791da5c1 ("x86/mm: Disable preemption during CR3 read+write") to
> protect the UP variant of flush_tlb_mm_range().
> 
> That preempt_disable/enable() pair should have been added to the UP variant
> of flush_tlb_mm_range() instead.
> 
> The UP variant was removed with commit ce4a4e565f52 ("x86/mm: Remove the UP
> asm/tlbflush.h code, always use the (formerly) SMP code"), but the
> preempt_disable/enable() pair stayed around.
> 
> The latest change to __native_flush_tlb() in commit 6fd166aae78c ("x86/mm:
> Use/Fix PCID to optimize user/kernel switches") added an access to a per
> cpu variable outside the preempt disabled regions which makes no sense at
> all. __native_flush_tlb() must always be called with at least preemption
> disabled.

s/cpu/CPU

> 
> Remove the preempt_disable/enable() pair and add a WARN_ON_ONCE() to catch
> bad callers independent of the smp_processor_id() debugging.
> 
> Signed-off-by: Thomas Gleixner 
> ---
>  arch/x86/include/asm/tlbflush.h |   14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -345,15 +345,17 @@ static inline void invalidate_user_asid(
>   */
>  static inline void __native_flush_tlb(void)
>  {
> - invalidate_user_asid(this_cpu_read(cpu_tlbstate.loaded_mm_asid));
>   /*
> -  * If current->mm == NULL then we borrow a mm which may change
> -  * during a task switch and therefore we must not be preempted
> -  * while we write CR3 back:
> +  * Preemption or interrupts must be disabled to protect the access
> +  * to the per cpu variable and to prevent being preempted between
> +  * read_cr3() and write_cr3().
>*/
> - preempt_disable();
> + WARN_ON_ONCE(preemptible());
> +
> + invalidate_user_asid(this_cpu_read(cpu_tlbstate.loaded_mm_asid));
> +
> + /* If current->mm == NULL then the read_cr3() "borrows" a mm */
>   native_write_cr3(__native_read_cr3());
> - preempt_enable();

s/a mm/an mm

Reviewed-by: Ingo Molnar 

Thanks,

Ingo
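
The shape of the change (stop disabling preemption inside the helper, assert instead that the caller already did) can be mocked in userspace. Everything below is a stand-in under assumptions: preempt_count, warn_on_once(), the mock cr3 variable and flush_tlb_sketch() are hypothetical and only mimic the control flow, not the real per-CPU or CR3 handling.

```c
#include <stdbool.h>

static int preempt_count;   /* > 0 means preemption is "disabled" */
static int warnings;        /* number of times the warning fired */
static int flushes;         /* number of mock flushes performed */
static unsigned long cr3 = 0x1000;   /* mock register value */

static void preempt_disable(void) { preempt_count++; }
static void preempt_enable(void)  { preempt_count--; }
static bool preemptible(void)     { return preempt_count == 0; }

/* Reports a bad caller at most once, in the spirit of WARN_ON_ONCE(). */
static bool warn_on_once(bool cond)
{
    static bool warned;

    if (cond && !warned) {
        warned = true;
        warnings++;
    }
    return cond;
}

/* The helper no longer disables preemption itself; it only checks. */
static void flush_tlb_sketch(void)
{
    unsigned long val;

    warn_on_once(preemptible());

    val = cr3;   /* mock read */
    cr3 = val;   /* mock write-back; must not be preempted in between */
    flushes++;
}
```

A well-behaved caller wraps the call in preempt_disable()/preempt_enable() (or runs with interrupts off); a caller that forgets is reported once and then left alone.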


[PATCH] wan/fsl_ucc_hdlc: Delete an error message for a failed memory allocation in ucc_hdlc_probe()

2017-12-30 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sat, 30 Dec 2017 22:25:44 +0100

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/net/wan/fsl_ucc_hdlc.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
index 33df76405b86..98f8be206bae 100644
--- a/drivers/net/wan/fsl_ucc_hdlc.c
+++ b/drivers/net/wan/fsl_ucc_hdlc.c
@@ -1082,7 +1082,6 @@ static int ucc_hdlc_probe(struct platform_device *pdev)
utdm = kzalloc(sizeof(*utdm), GFP_KERNEL);
if (!utdm) {
ret = -ENOMEM;
-   dev_err(&pdev->dev, "No mem to alloc ucc tdm data\n");
goto free_uhdlc_priv;
}
uhdlc_priv->utdm = utdm;
-- 
2.15.1



Re: [PATCH v2] ksm: replace jhash2 with faster hash

2017-12-30 Thread Timofey Titovets
*FACEPALM*,
Sorry, I forgot about the version numbering of the old jhash2 -> xxhash conversion.
I will also pick up the patch for xxhash that adds an arch-dependent xxhash()
function, which will use the fastest algorithm for the current arch.

So the next version will be v5, as this one should have been v4.

Thanks.
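
A rough userspace sketch of the "measure each hash for a fixed time window and keep the fastest" idea follows. It makes several assumptions: fnv1a32() and djb2() are simple stand-ins and not crc32c or xxhash, clock() replaces the jiffies window used in the quoted patch, and all names (hash_candidate, hashes_in_budget(), choose_fastest_hash()) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef uint32_t (*hash_fn)(const uint8_t *data, size_t len);

/* Stand-in hash #1: 32-bit FNV-1a. */
static uint32_t fnv1a32(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Stand-in hash #2: djb2 (h = h * 33 + byte). */
static uint32_t djb2(const uint8_t *data, size_t len)
{
    uint32_t h = 5381u;

    for (size_t i = 0; i < len; i++)
        h = h * 33u + data[i];
    return h;
}

struct hash_candidate {
    const char *name;
    hash_fn fn;
};

static const struct hash_candidate candidates[] = {
    { "fnv1a32", fnv1a32 },
    { "djb2",    djb2 },
};
#define NR_CANDIDATES (sizeof(candidates) / sizeof(candidates[0]))

static volatile uint32_t sink;   /* keeps the hash calls from being elided */

/* Count how many buffers one hash gets through in 'budget' clock ticks. */
static unsigned long hashes_in_budget(hash_fn fn, const uint8_t *buf,
                                      size_t len, clock_t budget)
{
    unsigned long rounds = 0;
    clock_t end = clock() + budget;

    while (clock() < end) {
        sink ^= fn(buf, len);
        rounds++;
    }
    return rounds;
}

/* Return the index of the candidate with the most rounds per window. */
static size_t choose_fastest_hash(const uint8_t *buf, size_t len)
{
    size_t best = 0;
    unsigned long best_rounds = 0;

    for (size_t i = 0; i < NR_CANDIDATES; i++) {
        unsigned long rounds = hashes_in_budget(candidates[i].fn, buf, len,
                                                CLOCKS_PER_SEC / 100);
        if (rounds > best_rounds) {
            best_rounds = rounds;
            best = i;
        }
    }
    return best;
}
```

Which candidate wins depends on the machine and the compiler, so the result is only meaningful as a per-system choice made at startup.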

2017-12-29 12:52 GMT+03:00 Timofey Titovets :
> Pick up Sioh Lee's crc32 patch, after a long conversation
> and some hassles, merge it with my work on xxhash, and add
> a helper that chooses the fastest hash.
>
> The base idea is the same: replace jhash2 with something faster.
>
> Perf numbers:
> Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
> ksm: crc32c   hash() 12081 MB/s
> ksm: jhash2   hash()  1569 MB/s
> ksm: xxh64    hash()  8770 MB/s
> ksm: xxh32    hash()  4529 MB/s
>
> As jhash2 will always be slower, just drop it from the choice.
>
> Add a function to autoselect the hash algorithm at boot, based on speed,
> like the raid6 code does.
>
> Move the init of zero_hash from init to the start of the ksm thread:
> ksm init runs during early kernel init, and running perf testing on
> the main kernel thread looks bad to me.
>
> One problem exists with this patch:
> ksm init runs too early, and the crc32c module, even when compiled in,
> can't be found, so I see:
>  - ksm: alloc crc32c shash error 2 in dmesg.
>
> I gave up on that, so ideas are welcome.
>
> The only idea I have is to avoid early init by moving
> zero_checksum to the sysfs_store parm,
> i.e. it defaults to false, and that will work, I think.
>
> Thanks.
>
> Changes:
>   v1 -> v2:
> - Merge xxhash/crc32 patches
> - Replace crc32 with crc32c (crc32 has the same speed as jhash2)
> - Add an automatic speed test and automatic choice of the fastest hash function
>
> Signed-off-by: Timofey Titovets 
> Signed-off-by: leesioh 
> CC: Andrea Arcangeli 
> CC: linux...@kvack.org
> CC: k...@vger.kernel.org
> ---
>  mm/Kconfig |   4 ++
>  mm/ksm.c   | 133 
> -
>  2 files changed, 128 insertions(+), 9 deletions(-)
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 03ff7703d322..d4fb147d4a22 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -305,6 +305,10 @@ config MMU_NOTIFIER
>  config KSM
> bool "Enable KSM for page merging"
> depends on MMU
> +   select XXHASH
> +   select CRYPTO
> +   select CRYPTO_HASH
> +   select CONFIG_CRYPTO_CRC32C
> help
>   Enable Kernel Samepage Merging: KSM periodically scans those areas
>   of an application's address space that an app has advised may be
> diff --git a/mm/ksm.c b/mm/ksm.c
> index be8f4576f842..fd5c9d0f7bc2 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -25,7 +25,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -41,6 +40,12 @@
>  #include 
>
>  #include 
> +
> +/* Support for xxhash and crc32c */
> +#include 
> +#include 
> +#include 
> +
>  #include "internal.h"
>
>  #ifdef CONFIG_NUMA
> @@ -186,7 +191,7 @@ struct rmap_item {
> };
> struct mm_struct *mm;
> unsigned long address;  /* + low bits used for flags below */
> -   unsigned int oldchecksum;   /* when unstable */
> +   unsigned long oldchecksum;  /* when unstable */
> union {
> struct rb_node node;/* when node of unstable tree */
> struct {/* when listed from stable tree */
> @@ -255,7 +260,7 @@ static unsigned int ksm_thread_pages_to_scan = 100;
>  static unsigned int ksm_thread_sleep_millisecs = 20;
>
>  /* Checksum of an empty (zeroed) page */
> -static unsigned int zero_checksum __read_mostly;
> +static unsigned long zero_checksum __read_mostly;
>
>  /* Whether to merge empty (zeroed) pages with actual zero pages */
>  static bool ksm_use_zero_pages __read_mostly;
> @@ -284,6 +289,115 @@ static DEFINE_SPINLOCK(ksm_mmlist_lock);
> sizeof(struct __struct), __alignof__(struct __struct),\
> (__flags), NULL)
>
> +#define CRC32C_HASH 1
> +#define XXH32_HASH  2
> +#define XXH64_HASH  3
> +
> +const static char *hash_func_names[] = { "", "crc32c", "xxh32", "xxh64" };
> +
> +static struct shash_desc desc;
> +static struct crypto_shash *tfm;
> +static uint8_t fastest_hash = 0;
> +
> +static void __init choice_fastest_hash(void)
> +{
> +   void *page = kmalloc(PAGE_SIZE, GFP_KERNEL);
> +   unsigned long checksum, perf, js, je;
> +   unsigned long best_perf = 0;
> +
> +   tfm = crypto_alloc_shash(hash_func_names[CRC32C_HASH],
> +CRYPTO_ALG_TYPE_SHASH, 0);
> +
> +   if (IS_ERR(tfm)) {
> +   pr_warn("ksm: alloc %s shash error %ld\n",
> +   hash_func_names[CRC32C_HASH], -PTR_ERR(tfm));
> +   } else {
> +   desc.tfm = tfm;
> +   desc.flags = 0;
> +
> +   perf = 0;
> +   preempt_disable();
> +   js = jiffies;
> +   je = js + (HZ >> 3);
> +   while (time_before(jiffies, je)) {
> +   crypto_shash_digest(&desc, page, PAGE_SIZE,
> +   

[patch 1/3] x86/ldt: Free the right LDT memory in write_ldt() error path

2017-12-30 Thread Thomas Gleixner
The error path in write_ldt() frees the already installed LDT memory
instead of the newly allocated which cannot be installed.

Fixes: f55f0501cbf6 ("x86/pti: Put the LDT in its own PGD if PTI is on")
Reported-by: Mathieu Desnoyers 
Signed-off-by: Thomas Gleixner 
---
 arch/x86/kernel/ldt.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -421,7 +421,7 @@ static int write_ldt(void __user *ptr, u
 */
error = map_ldt_struct(mm, new_ldt, old_ldt ? !old_ldt->slot : 0);
if (error) {
-   free_ldt_struct(old_ldt);
+   free_ldt_struct(new_ldt);
goto out_unlock;
}
 




[patch 3/3] x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()

2017-12-30 Thread Thomas Gleixner
The preempt_disable/enable() pair in __native_flush_tlb() was added in
commit 5cf0791da5c1 ("x86/mm: Disable preemption during CR3 read+write") to
protect the UP variant of flush_tlb_mm_range().

That preempt_disable/enable() pair should have been added to the UP variant
of flush_tlb_mm_range() instead.

The UP variant was removed with commit ce4a4e565f52 ("x86/mm: Remove the UP
asm/tlbflush.h code, always use the (formerly) SMP code"), but the
preempt_disable/enable() pair stayed around.

The latest change to __native_flush_tlb() in commit 6fd166aae78c ("x86/mm:
Use/Fix PCID to optimize user/kernel switches") added an access to a per
cpu variable outside the preempt disabled regions which makes no sense at
all. __native_flush_tlb() must always be called with at least preemption
disabled.

Remove the preempt_disable/enable() pair and add a WARN_ON_ONCE() to catch
bad callers independent of the smp_processor_id() debugging.

Signed-off-by: Thomas Gleixner 
---
 arch/x86/include/asm/tlbflush.h |   14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -345,15 +345,17 @@ static inline void invalidate_user_asid(
  */
 static inline void __native_flush_tlb(void)
 {
-   invalidate_user_asid(this_cpu_read(cpu_tlbstate.loaded_mm_asid));
/*
-* If current->mm == NULL then we borrow a mm which may change
-* during a task switch and therefore we must not be preempted
-* while we write CR3 back:
+* Preemption or interrupts must be disabled to protect the access
+* to the per cpu variable and to prevent being preempted between
+* read_cr3() and write_cr3().
 */
-   preempt_disable();
+   WARN_ON_ONCE(preemptible());
+
+   invalidate_user_asid(this_cpu_read(cpu_tlbstate.loaded_mm_asid));
+
+   /* If current->mm == NULL then the read_cr3() "borrows" a mm */
native_write_cr3(__native_read_cr3());
-   preempt_enable();
 }
 
 /*




[patch 2/3] x86/smpboot: Remove stale tlb flush invocations

2017-12-30 Thread Thomas Gleixner
smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector()
invoke local_flush_tlb() for no obvious reason.

Digging in history revealed that the original code in the 2.1 era added
those because the code manipulated a swapper_pg_dir pagetable entry. The
pagetable manipulation was removed long ago in the 2.3 timeframe, but the
tlb flush invocations stayed around forever.

Remove them along with the pointless pr_debugs which come from the same 2.1
change.

Reported-by: Dominik Brodowski 
Signed-off-by: Thomas Gleixner 
---
 arch/x86/kernel/smpboot.c |9 -
 1 file changed, 9 deletions(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -128,14 +128,10 @@ static inline void smpboot_setup_warm_re
spin_lock_irqsave(&rtc_lock, flags);
CMOS_WRITE(0xa, 0xf);
spin_unlock_irqrestore(&rtc_lock, flags);
-   local_flush_tlb();
-   pr_debug("1.\n");
*((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_HIGH)) =
start_eip >> 4;
-   pr_debug("2.\n");
*((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) =
start_eip & 0xf;
-   pr_debug("3.\n");
 }
 
 static inline void smpboot_restore_warm_reset_vector(void)
@@ -143,11 +139,6 @@ static inline void smpboot_restore_warm_
unsigned long flags;
 
/*
-* Install writable page 0 entry to set BIOS data area.
-*/
-   local_flush_tlb();
-
-   /*
 * Paranoid:  Set warm reset code and vector here back
 * to default values.
 */




[patch 0/3] x86/pti: Fix various fallout

2017-12-30 Thread Thomas Gleixner
The small series fixes the recent fallout of the x86/pti merge:

 - Remove the stale local_flush_tlb() invocations from the CPU hotplug code

 - Remove the stale preempt_disable/enable() pair from __native_flush_tlb()

 - Fix a bogus free in the write_ldt() error path

Thanks,

tglx

---
 include/asm/tlbflush.h |   14 --
 kernel/ldt.c   |2 +-
 kernel/smpboot.c   |9 -
 3 files changed, 9 insertions(+), 16 deletions(-)



[tip:core/urgent] objtool: Fix seg fault with clang-compiled objects

2017-12-30 Thread tip-bot for Simon Ser
Commit-ID:  ce90aaf5cde4ce057b297bb6c955caf16ef00ee6
Gitweb: https://git.kernel.org/tip/ce90aaf5cde4ce057b297bb6c955caf16ef00ee6
Author: Simon Ser 
AuthorDate: Sat, 30 Dec 2017 14:43:32 -0600
Committer:  Ingo Molnar 
CommitDate: Sat, 30 Dec 2017 22:04:17 +0100

objtool: Fix seg fault with clang-compiled objects

Fix a seg fault which happens when an input file provided to 'objtool
orc generate' doesn't have a '.shstrtab' section (for instance, object
files produced by clang don't have this section).

Signed-off-by: Simon Ser 
Signed-off-by: Josh Poimboeuf 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/c0f2231683e9bed40fac1f13ce2c33b8389854bc.1514666459.git.jpoim...@redhat.com
Signed-off-by: Ingo Molnar 
---
 tools/objtool/orc_gen.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index e5ca314..e61fe70 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -165,6 +165,8 @@ int create_orc_sections(struct objtool_file *file)
 
 	/* create .orc_unwind_ip and .rela.orc_unwind_ip sections */
 	sec = elf_create_section(file->elf, ".orc_unwind_ip", sizeof(int), idx);
+	if (!sec)
+		return -1;
 
 	ip_relasec = elf_create_rela_section(file->elf, sec);
 	if (!ip_relasec)
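
The pattern behind this fix, checking a creation helper's result before it is handed to the next call, can be sketched in plain userspace C. The types and the create_section()/create_sections() helpers below are hypothetical stand-ins for illustration, not objtool's real definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for objtool's ELF types; not the real definitions. */
struct section {
	char name[32];
};

struct elf {
	int has_shstrtab;	/* 0 models an object file without .shstrtab */
};

/* Returns NULL when the section cannot be created. */
struct section *create_section(struct elf *elf, const char *name)
{
	struct section *sec;

	assert(name != NULL);

	if (!elf->has_shstrtab)
		return NULL;

	sec = calloc(1, sizeof(*sec));
	if (!sec)
		return NULL;

	strncpy(sec->name, name, sizeof(sec->name) - 1);
	return sec;
}

/* Returns 0 on success and -1 on failure; never dereferences a NULL section. */
int create_sections(struct elf *elf)
{
	struct section *sec;
	int ok;

	sec = create_section(elf, ".orc_unwind_ip");
	if (!sec)
		return -1;	/* the early return this kind of patch adds */

	/* Safe: sec is known to be valid from here on. */
	ok = sec->name[0] == '.';

	free(sec);
	return ok ? 0 : -1;
}
```

With has_shstrtab set to 0 the caller gets -1 back instead of crashing on a NULL pointer.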


[tip:core/urgent] objtool: Fix seg fault caused by missing parameter

2017-12-30 Thread tip-bot for Simon Ser
Commit-ID:  d89e426499cf36b96161bd32970d6783f1fbcb0e
Gitweb: https://git.kernel.org/tip/d89e426499cf36b96161bd32970d6783f1fbcb0e
Author: Simon Ser 
AuthorDate: Sat, 30 Dec 2017 14:43:31 -0600
Committer:  Ingo Molnar 
CommitDate: Sat, 30 Dec 2017 22:04:17 +0100

objtool: Fix seg fault caused by missing parameter

Fix a seg fault when no parameter is provided to 'objtool orc'.

Signed-off-by: Simon Ser 
Signed-off-by: Josh Poimboeuf 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/9172803ec7ebb72535bcd0b7f966ae96d515968e.1514666459.git.jpoim...@redhat.com
Signed-off-by: Ingo Molnar 
---
 tools/objtool/builtin-orc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/objtool/builtin-orc.c b/tools/objtool/builtin-orc.c
index 4c6b5c9..91e8e19 100644
--- a/tools/objtool/builtin-orc.c
+++ b/tools/objtool/builtin-orc.c
@@ -44,6 +44,9 @@ int cmd_orc(int argc, const char **argv)
 	const char *objname;
 
 	argc--; argv++;
+	if (argc <= 0)
+		usage_with_options(orc_usage, check_options);
+
 	if (!strncmp(argv[0], "gen", 3)) {
 		argc = parse_options(argc, argv, check_options, orc_usage, 0);
 		if (argc != 1)
@@ -52,7 +55,6 @@ int cmd_orc(int argc, const char **argv)
 		objname = argv[0];
 
 		return check(objname, no_fp, no_unreachable, true);
-
 	}
 
 	if (!strcmp(argv[0], "dump")) {
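
A minimal sketch of the same guard outside objtool: after the subcommand name is skipped, argc must be checked before argv[0] is read. The enum and pick_subcommand() are hypothetical names used only for illustration. The real code calls usage_with_options() at this point; the sketch returns a code instead.

```c
#include <assert.h>
#include <string.h>

enum orc_cmd { ORC_USAGE, ORC_GENERATE, ORC_DUMP };

/*
 * Hypothetical dispatcher. On entry argv[0] is the subcommand name
 * ("orc"), so the first real argument is argv[1].
 */
enum orc_cmd pick_subcommand(int argc, const char **argv)
{
	assert(argv != NULL);

	/* Skip the subcommand name. */
	argc--; argv++;

	/* Without this guard, argv[0] below would be NULL when nothing follows. */
	if (argc <= 0)
		return ORC_USAGE;

	if (!strncmp(argv[0], "gen", 3))
		return ORC_GENERATE;
	if (!strcmp(argv[0], "dump"))
		return ORC_DUMP;

	return ORC_USAGE;
}
```

Calling it with only the subcommand name yields ORC_USAGE rather than passing a NULL pointer to strncmp().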


[PATCH 2/2] at76c50x-usb: Improve size determinations in at76_usbdfu_download()

2017-12-30 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sat, 30 Dec 2017 21:56:56 +0100

Replace the two explicit type names passed to the "sizeof" operator with
dereferences of the destination pointers. This makes the corresponding size
determinations a bit safer and follows the Linux coding style convention.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/net/wireless/atmel/at76c50x-usb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/atmel/at76c50x-usb.c b/drivers/net/wireless/atmel/at76c50x-usb.c
index 2893d339b440..6144d4a258ca 100644
--- a/drivers/net/wireless/atmel/at76c50x-usb.c
+++ b/drivers/net/wireless/atmel/at76c50x-usb.c
@@ -383,7 +383,7 @@ static int at76_usbdfu_download(struct usb_device *udev, u8 *buf, u32 size,
 		return -EINVAL;
 	}
 
-	dfu_stat_buf = kmalloc(sizeof(struct dfu_status), GFP_KERNEL);
+	dfu_stat_buf = kmalloc(sizeof(*dfu_stat_buf), GFP_KERNEL);
 	if (!dfu_stat_buf) {
 		ret = -ENOMEM;
 		goto exit;
@@ -395,7 +395,7 @@ static int at76_usbdfu_download(struct usb_device *udev, u8 *buf, u32 size,
 		goto exit;
 	}
 
-	dfu_state = kmalloc(sizeof(u8), GFP_KERNEL);
+	dfu_state = kmalloc(sizeof(*dfu_state), GFP_KERNEL);
 	if (!dfu_state) {
 		ret = -ENOMEM;
 		goto exit;
-- 
2.15.1
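
For readers unfamiliar with the idiom, here is a userspace sketch of why sizeof(*ptr) is preferred over sizeof(type). It uses malloc() in place of kmalloc(), and struct status_rec is a hypothetical record, not the driver's struct dfu_status.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical status record; the driver's real struct may differ. */
struct status_rec {
	uint8_t status;
	uint8_t poll_timeout[3];
	uint8_t state;
	uint8_t string;
};

/* Returns a freshly allocated record, or NULL on allocation failure. */
struct status_rec *alloc_status(void)
{
	struct status_rec *rec;

	/*
	 * The size follows the pointee type. If the declaration of rec
	 * changes later, this line stays correct without being edited.
	 * sizeof does not evaluate its operand, so rec is not dereferenced.
	 */
	assert(sizeof(*rec) == sizeof(struct status_rec));
	rec = malloc(sizeof(*rec));
	return rec;
}

/* Size that alloc_status() requests, exposed for inspection. */
size_t status_alloc_size(void)
{
	struct status_rec *rec = NULL;

	return sizeof(*rec);
}
```

The caller owns the returned pointer and is expected to free() it.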



[PATCH 1/2] at76c50x-usb: Delete an error message for a failed memory allocation in at76_submit_rx_urb()

2017-12-30 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sat, 30 Dec 2017 21:50:12 +0100

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/net/wireless/atmel/at76c50x-usb.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/wireless/atmel/at76c50x-usb.c b/drivers/net/wireless/atmel/at76c50x-usb.c
index ede89d4ffc88..2893d339b440 100644
--- a/drivers/net/wireless/atmel/at76c50x-usb.c
+++ b/drivers/net/wireless/atmel/at76c50x-usb.c
@@ -1223,8 +1223,6 @@ static int at76_submit_rx_urb(struct at76_priv *priv)
 	if (!skb) {
 		skb = dev_alloc_skb(sizeof(struct at76_rx_buffer));
 		if (!skb) {
-			wiphy_err(priv->hw->wiphy,
-				  "cannot allocate rx skbuff\n");
 			ret = -ENOMEM;
 			goto exit;
 		}
-- 
2.15.1



[PATCH 0/2] at76c50x-usb: Adjustments for two function implementations

2017-12-30 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sat, 30 Dec 2017 22:01:23 +0100

Two update suggestions from static source code analysis
were taken into account.

Markus Elfring (2):
  Delete an error message for a failed memory allocation in at76_submit_rx_urb()
  Improve size determinations in at76_usbdfu_download()

 drivers/net/wireless/atmel/at76c50x-usb.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

-- 
2.15.1



[PATCH 00/11] drm/sun4i: Add A83T HDMI support

2017-12-30 Thread Jernej Skrabec
This patch series implements support for A83T DW HDMI and PHY. It is based
upon Maxime Ripard's "drm/sun4i: Add A83t LVDS support" patch series which
can be found here:
http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/550035.html

While this exact combination of HDMI controller and PHY is not common in
Allwinner SoCs, this patch series nevertheless lays the groundwork for other
SoCs, like H3 and H5, which have the same DW HDMI IP block but different PHYs.

All patches can also be found on github:
https://github.com/jernejsk/linux-1/commits/a83t_hdmi

Please take a look.

Best regards,
Jernej

Jernej Skrabec (11):
  clk: sunxi-ng: Don't set k if width is 0 for nkmp plls
  clk: sunxi-ng: a83t: Add M divider to TCON1 clock
  drm/bridge/synopsys: dw-hdmi: Enable workaround for v1.32a
  drm/bridge/synopsys: dw-hdmi: Export some PHY related functions
  drm/bridge/synopsys: dw-hdmi: Add deinit callback
  dt-bindings: display: sun4i-drm: Add A83T HDMI pipeline
  drm/sun4i: Add support for A83T second TCON
  drm/sun4i: Add support for A83T second DE2 mixer
  drm/sun4i: Implement A83T HDMI driver
  ARM: dts: sun8i: a83t: Add HDMI display pipeline
  ARM: dts: sun8i: a83t: Enable HDMI on BananaPi M3

 .../bindings/display/sunxi/sun4i-drm.txt   | 188 ++-
 arch/arm/boot/dts/sun8i-a83t-bananapi-m3.dts   |  29 ++
 arch/arm/boot/dts/sun8i-a83t.dtsi  | 108 +-
 drivers/clk/sunxi-ng/ccu-sun8i-a83t.c  |   4 +-
 drivers/clk/sunxi-ng/ccu_nkmp.c|  21 +-
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c  |  56 +++-
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.h  |   2 +
 drivers/gpu/drm/sun4i/Kconfig  |   9 +
 drivers/gpu/drm/sun4i/Makefile |   1 +
 drivers/gpu/drm/sun4i/sun4i_tcon.c |  46 ++-
 drivers/gpu/drm/sun4i/sun4i_tcon.h |   1 +
 drivers/gpu/drm/sun4i/sun8i_dw_hdmi.c  | 367 +
 drivers/gpu/drm/sun4i/sun8i_mixer.c|  11 +
 include/drm/bridge/dw_hdmi.h   |  11 +
 14 files changed, 808 insertions(+), 46 deletions(-)
 create mode 100644 drivers/gpu/drm/sun4i/sun8i_dw_hdmi.c

-- 
2.15.1



[PATCH 02/11] clk: sunxi-ng: a83t: Add M divider to TCON1 clock

2017-12-30 Thread Jernej Skrabec
Unlike TCON0, TCON1 also has an M divider.

Fixes: 05359be1176b ("clk: sunxi-ng: Add driver for A83T CCU")

Signed-off-by: Jernej Skrabec 
---
 drivers/clk/sunxi-ng/ccu-sun8i-a83t.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-a83t.c b/drivers/clk/sunxi-ng/ccu-sun8i-a83t.c
index 04a9c33f53f0..7d08015b980d 100644
--- a/drivers/clk/sunxi-ng/ccu-sun8i-a83t.c
+++ b/drivers/clk/sunxi-ng/ccu-sun8i-a83t.c
@@ -504,8 +504,8 @@ static SUNXI_CCU_MUX_WITH_GATE(tcon0_clk, "tcon0", tcon0_parents,
 				 0x118, 24, 3, BIT(31), CLK_SET_RATE_PARENT);
 
 static const char * const tcon1_parents[] = { "pll-video1" };
-static SUNXI_CCU_MUX_WITH_GATE(tcon1_clk, "tcon1", tcon1_parents,
-				 0x11c, 24, 3, BIT(31), CLK_SET_RATE_PARENT);
+static SUNXI_CCU_M_WITH_MUX_GATE(tcon1_clk, "tcon1", tcon1_parents,
+				 0x11c, 0, 4, 24, 2, BIT(31), CLK_SET_RATE_PARENT);
 
 static SUNXI_CCU_GATE(csi_misc_clk, "csi-misc", "osc24M", 0x130, BIT(16), 0);
 
 
-- 
2.15.1



[PATCH 05/11] drm/bridge/synopsys: dw-hdmi: Add deinit callback

2017-12-30 Thread Jernej Skrabec
Some SoCs, like the Allwinner A83T, have to do additional cleanup when the
HDMI driver unloads. When DW HDMI is used through the DRM bridge API, there
is no place to store the driver's private data so that it can be accessed in
the unbind function. Because of that, add a deinit function which is called
at the very end, so drivers can do a proper cleanup.

Signed-off-by: Jernej Skrabec 
---
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 3 +++
 include/drm/bridge/dw_hdmi.h  | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index 67467d0b683a..a6fe7a323c83 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -2592,6 +2592,9 @@ static void __dw_hdmi_remove(struct dw_hdmi *hdmi)
 		i2c_del_adapter(&hdmi->i2c->adap);
 	else
 		i2c_put_adapter(hdmi->ddc);
+
+	if (hdmi->plat_data->deinit)
+		hdmi->plat_data->deinit(hdmi->plat_data);
 }
 
 /* -----------------------------------------------------------------------------
diff --git a/include/drm/bridge/dw_hdmi.h b/include/drm/bridge/dw_hdmi.h
index f5cca4362154..a3218d3da61b 100644
--- a/include/drm/bridge/dw_hdmi.h
+++ b/include/drm/bridge/dw_hdmi.h
@@ -124,6 +124,7 @@ struct dw_hdmi_phy_ops {
 
 struct dw_hdmi_plat_data {
 	struct regmap *regm;
+	void (*deinit)(const struct dw_hdmi_plat_data *pdata);
 	enum drm_mode_status (*mode_valid)(struct drm_connector *connector,
 					   const struct drm_display_mode *mode);
 	unsigned long input_bus_format;
-- 
2.15.1
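
The optional-callback pattern used here can be sketched in userspace C as follows. The structs and functions are simplified, hypothetical stand-ins, not the kernel's dw_hdmi types. The cleanup_count member exists only so the example can observe that the hook ran.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified platform data; not the kernel's struct. */
struct plat_data {
	int *cleanup_count;	/* illustration only: records that the hook ran */
	void (*deinit)(const struct plat_data *pdata);	/* optional, may be NULL */
};

struct bridge {
	const struct plat_data *plat_data;
};

/* Example hook that a platform driver could supply. */
void count_cleanup(const struct plat_data *pdata)
{
	assert(pdata->cleanup_count != NULL);
	(*pdata->cleanup_count)++;
}

/* Core teardown: common cleanup first, then the optional hook at the very end. */
void bridge_remove(struct bridge *bridge)
{
	/* ... common teardown would go here ... */

	if (bridge->plat_data->deinit)
		bridge->plat_data->deinit(bridge->plat_data);
}
```

Platforms that leave deinit as NULL see no change in behaviour, which is why the hook can be added without touching existing users.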



[PATCH 03/11] drm/bridge/synopsys: dw-hdmi: Enable workaround for v1.32a

2017-12-30 Thread Jernej Skrabec
Allwinner SoCs have a DW HDMI controller v1.32a, which exhibits the same
magenta line issue as i.MX6Q and i.MX6DL. Enable the workaround for it.

Tests show that one iteration is enough.

Signed-off-by: Jernej Skrabec 
---
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index a38db40ce990..7ca14d7325b5 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -1634,9 +1634,10 @@ static void dw_hdmi_clear_overflow(struct dw_hdmi *hdmi)
 	 * then write one of the FC registers several times.
 	 *
 	 * The number of iterations matters and depends on the HDMI TX revision
-	 * (and possibly on the platform). So far only i.MX6Q (v1.30a) and
-	 * i.MX6DL (v1.31a) have been identified as needing the workaround, with
-	 * 4 and 1 iterations respectively.
+	 * (and possibly on the platform). So far i.MX6Q (v1.30a), i.MX6DL
+	 * (v1.31a) and multiple Allwinner SoCs (v1.32a) have been identified
+	 * as needing the workaround, with 4 iterations for v1.30a and 1
+	 * iteration for others.
 	 */
 
 	switch (hdmi->version) {
@@ -1644,6 +1645,7 @@ static void dw_hdmi_clear_overflow(struct dw_hdmi *hdmi)
 		count = 4;
 		break;
 	case 0x131a:
+	case 0x132a:
 		count = 1;
 		break;
 	default:
-- 
2.15.1
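
The version-to-iteration mapping after this patch can be summarised in a small standalone function. This is a sketch only: the function name is hypothetical, the 0x130a label is inferred from the v1.30a mention in the comment and the numbering pattern of the visible labels, and returning 0 stands in for the driver's "no workaround" default path.

```c
#include <assert.h>

/*
 * Number of workaround register writes for a given controller version,
 * or 0 when the version is not known to need the workaround.
 */
unsigned int overflow_workaround_count(unsigned int version)
{
	unsigned int count;

	switch (version) {
	case 0x130a:	/* v1.30a (inferred label): 4 iterations */
		count = 4;
		break;
	case 0x131a:	/* v1.31a */
	case 0x132a:	/* v1.32a, added by this patch */
		count = 1;
		break;
	default:
		count = 0;
		break;
	}

	assert(count <= 4);
	return count;
}
```

Letting 0x132a fall through to the 0x131a case keeps the single-iteration value in one place.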


