Re: [PATCH resent] uapi libc compat: allow non-glibc to opt out of uapi definitions

2017-03-08 Thread Carlos O'Donell
On 03/08/2017 07:46 AM, David Woodhouse wrote:
> On Fri, 2016-11-11 at 07:08 -0500, Felix Janda wrote:
>> Currently, libc-compat.h detects inclusion of specific glibc headers,
>> and defines corresponding _UAPI_DEF_* macros, which in turn are used in
>> uapi headers to prevent definition of conflicting structures/constants.
>> There is no such detection for other c libraries, for them the
>> _UAPI_DEF_* macros are always defined as 1, and so none of the possibly
>> conflicting definitions are suppressed.
>>
>> This patch enables non-glibc c libraries to request the suppression of
>> any specific interface by defining the corresponding _UAPI_DEF_* macro
>> as 0.
> 
> Ick. It's fairly horrid for kernel headers to be reacting to __GLIBC__
> in any way. That's just wrong.
> 
> It makes more sense for C libraries to define the __UAPI_DEF_xxx for
> themselves as and when they add their own support for certain things,
> and for the kernel not to have incestuous knowledge of them.
> 
> The part you add here in the #else /* !__GLIBC__ */ part is what we
> should do at *all* times.
> 
> I understand that we'll want to grandfather in the glibc horridness,
> but let's make it clear that that's what it is, by letting it set the
> appropriate __UAPI_DEF_xxx macros to zero, and then continue through to
> your new part. Something like this (incremental to yours):

Any model we propose should be documented in the header of libc-compat.h
and explain how it works to solve header inclusion order in _both_ directions.
User use cases include header inclusion in _both_ directions and we should look
to support that.
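
As a starting point for that documentation, here is a minimal single-file
sketch of the opt-out direction Felix describes, where the libc header is
seen first. All DEMO_* names are hypothetical stand-ins, not the real
__UAPI_DEF_* macros, and the three "headers" are collapsed into one
translation unit purely for illustration. It does not cover the reverse
order (kernel header first).

```c
#include <assert.h>

/* 1. Hypothetical libc header, seen first: it ships its own definition
 *    and asks the kernel headers to suppress theirs. */
struct demo_ifmap {
	unsigned long mem_start;
	unsigned long mem_end;
};
#define DEMO_UAPI_DEF_IF_IFMAP 0

/* 2. Stand-in for libc-compat.h: default to 1 only when the libc has
 *    not already expressed a preference. */
#ifndef DEMO_UAPI_DEF_IF_IFMAP
#define DEMO_UAPI_DEF_IF_IFMAP 1
#endif

/* 3. Stand-in for the uapi header that owns the structure. With the
 *    macro at 0 this block is skipped, so there is no redefinition. */
#if DEMO_UAPI_DEF_IF_IFMAP
struct demo_ifmap {
	unsigned long mem_start;
	unsigned long mem_end;
};
#endif

/* Returns nonzero when the kernel-side definition was emitted. */
static int demo_kernel_defines_ifmap(void)
{
	return DEMO_UAPI_DEF_IF_IFMAP;
}

/* Sanity check: exactly one definition exists and it came from libc. */
static void demo_check(void)
{
	assert(sizeof(struct demo_ifmap) == 2 * sizeof(unsigned long));
	assert(!demo_kernel_defines_ifmap());
}
```

Deleting step 1 makes step 2 pick the default of 1, so step 3 provides the
structure instead.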

> diff --git a/include/uapi/linux/libc-compat.h 
> b/include/uapi/linux/libc-compat.h
> index c316725..7673158 100644
> --- a/include/uapi/linux/libc-compat.h
> +++ b/include/uapi/linux/libc-compat.h
> @@ -53,41 +53,18 @@
>  
>  /* Coordinate with glibc net/if.h header. */
>  #if defined(_NET_IF_H) && defined(__USE_MISC)
> -
>  /* GLIBC headers included first so don't define anything
>   * that would already be defined. */
> -
>  #define __UAPI_DEF_IF_IFCONF 0
>  #define __UAPI_DEF_IF_IFMAP 0
>  #define __UAPI_DEF_IF_IFNAMSIZ 0
>  #define __UAPI_DEF_IF_IFREQ 0
>  /* Everything up to IFF_DYNAMIC, matches net/if.h until glibc 2.23 */
>  #define __UAPI_DEF_IF_NET_DEVICE_FLAGS 0
> -/* For the future if glibc adds IFF_LOWER_UP, IFF_DORMANT and IFF_ECHO */
> -#ifndef __UAPI_DEF_IF_NET_DEVICE_FLAGS_LOWER_UP_DORMANT_ECHO
> -#define __UAPI_DEF_IF_NET_DEVICE_FLAGS_LOWER_UP_DORMANT_ECHO 1
> -#endif /* __UAPI_DEF_IF_NET_DEVICE_FLAGS_LOWER_UP_DORMANT_ECHO */
> -
> -#else /* _NET_IF_H */
> -
> -/* Linux headers included first, and we must define everything
> - * we need. The expectation is that glibc will check the
> - * __UAPI_DEF_* defines and adjust appropriately. */
> -
> -#define __UAPI_DEF_IF_IFCONF 1
> -#define __UAPI_DEF_IF_IFMAP 1
> -#define __UAPI_DEF_IF_IFNAMSIZ 1
> -#define __UAPI_DEF_IF_IFREQ 1
> -/* Everything up to IFF_DYNAMIC, matches net/if.h until glibc 2.23 */
> -#define __UAPI_DEF_IF_NET_DEVICE_FLAGS 1
> -/* For the future if glibc adds IFF_LOWER_UP, IFF_DORMANT and IFF_ECHO */
> -#define __UAPI_DEF_IF_NET_DEVICE_FLAGS_LOWER_UP_DORMANT_ECHO 1
> -

Any header needing compat with a libc includes libc-compat.h (per the 
documented way the model works). With this patch any included linux kernel
header that also includes libc-compat.h would immediately define all 
the __UAPI_DEF_* constants to 1 as-if it had defined those structures, 
but it has not.

For example, with these two patches applied, the inclusion of linux/if.h
would define __UAPI_DEF_XATTR to 1, but linux/if.h has not defined
XATTR_CREATE or other constants, so a subsequent inclusion of sys/xattr.h
from userspace would _not_ define XATTR_CREATE because __UAPI_DEF_XATTR set
to 1 indicates the kernel has.

I don't want to read into the model you are proposing and would rather you
document the semantics clearly so we can all see what you mean.
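
To make the concern concrete, the linux/if.h example can be modelled in a
single translation unit. This is only a sketch: the DEMO_* names are
hypothetical, and it assumes a libc header that skips its own constants
whenever the macro is 1, which may not match any real header exactly.

```c
#include <assert.h>

/* Stand-in for libc-compat.h with the incremental patch applied:
 * every macro the libc did not set ends up as 1. */
#ifndef DEMO_UAPI_DEF_XATTR
#define DEMO_UAPI_DEF_XATTR 1
#endif

/* Stand-in for linux/if.h: it pulled in the block above, but it
 * defines no xattr constants at all. */

/* Stand-in for a libc xattr header (assumed behaviour): it reads
 * "macro is 1" as "the kernel already defined these" and backs off. */
#if !DEMO_UAPI_DEF_XATTR
#define DEMO_XATTR_CREATE 1
#endif

/* Returns 1 if some header ended up defining the constant. */
static int demo_xattr_create_is_defined(void)
{
#ifdef DEMO_XATTR_CREATE
	return 1;
#else
	return 0;
#endif
}

/* In this include order nobody defines the constant. */
static void demo_expect_missing(void)
{
	assert(!demo_xattr_create_is_defined());
}
```

In this model neither side defines DEMO_XATTR_CREATE, so any user of the
constant would fail to compile.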

>  #endif /* _NET_IF_H */
>  
>  /* Coordinate with glibc netinet/in.h header. */
>  #if defined(_NETINET_IN_H)
> -
>  /* GLIBC headers included first so don't define anything
>   * that would already be defined. */
>  #define __UAPI_DEF_IN_ADDR   0
> @@ -104,8 +81,6 @@
>   * additional in6_addr macros e.g. s6_addr16, and s6_addr32. */
>  #if defined(__USE_MISC) || defined (__USE_GNU)
>  #define __UAPI_DEF_IN6_ADDR_ALT  0
> -#else
> -#define __UAPI_DEF_IN6_ADDR_ALT  1
>  #endif
>  #define __UAPI_DEF_SOCKADDR_IN6  0
>  #define __UAPI_DEF_IPV6_MREQ 0
> @@ -113,62 +88,23 @@
>  #define __UAPI_DEF_IPV6_OPTIONS  0
>  #define __UAPI_DEF_IN6_PKTINFO   0
>  #define __UAPI_DEF_IP6_MTUINFO   0
> -
> -#else
> -
> -/* Linux headers included first, and we must define everything
> - * we need. The expectation is that glibc will check the
> - * __UAPI_DEF_* defines and adjust appropriately. */
> -#define __UAPI_DEF_IN_ADDR   1
> -#define 

RE: [PATCH v19 0/4] Introduce usb charger framework to deal with the usb gadget power negotation

2017-03-08 Thread Jun Li
Hi,

> -----Original Message-----
> From: Baolin Wang [mailto:baolin.w...@linaro.org]
> Sent: Tuesday, March 07, 2017 5:39 PM
> To: NeilBrown 
> Cc: Felipe Balbi ; Greg KH ;
> Sebastian Reichel ; Dmitry Eremin-Solenikov
> ; David Woodhouse ;
> r...@kernel.org; Jun Li ; Marek Szyprowski
> ; Ruslan Bilovol ;
> Peter Chen ; Alan Stern
> ; grygorii.stras...@ti.com; Yoshihiro Shimoda
> ; Lee Jones ;
> Mark Brown ; John Stultz ;
> Charles Keepax ;
> patc...@opensource.wolfsonmicro.com; Linux PM list  p...@vger.kernel.org>; USB ; device-
> mainlin...@lists.linuxfoundation.org; LKML 
> Subject: Re: [PATCH v19 0/4] Introduce usb charger framework to deal with
> the usb gadget power negotation
> 
> On 3 March 2017 at 10:23, NeilBrown  wrote:
> > On Mon, Feb 20 2017, Baolin Wang wrote:
> >
> >> Currently the Linux kernel does not provide any standard integration
> >> of this feature that integrates the USB subsystem with the system
> >> power regulation provided by PMICs meaning that either vendors must
> >> add this in their kernels or USB gadget devices based on Linux (such
> >> as mobile phones) may not behave as they should. Thus provide a
> >> standard framework for doing this in kernel.
> >>
> >> Now introduce one user with wm831x_power to support and test the usb
> >> charger.
> >> Another user introduced to support charger detection by Jun Li:
> >> https://www.spinics.net/lists/linux-usb/msg139425.html
> >> Moreover, other potential users may use it in future.
> >>
> >> 1. Before v19 patchset we've fixed below issues in extcon subsystem
> >> and usb phy driver, now all were merged. (Thanks for Neil's
> >> suggestion)
> >> (1) Have fixed the inconsistencies with USB connector types in extcon
> >> subsystem by following links:
> >> https://lkml.org/lkml/2016/12/21/13
> >> https://lkml.org/lkml/2016/12/21/15
> >> https://lkml.org/lkml/2016/12/21/79
> >> https://lkml.org/lkml/2017/1/3/13
> >>
> >> (2) Instead of using 'set_power' callback in phy drivers, we will
> >> introduce USB charger to set PMIC current drawn from USB
> >> configuration, moreover some 'set_power' callbacks did not implement
> >> anything to set PMIC current, thus remove them by following links:
> >> https://lkml.org/lkml/2017/1/18/436
> >> https://lkml.org/lkml/2017/1/18/439
> >> https://lkml.org/lkml/2017/1/18/438
> >> Now only two phy drivers (phy-isp1301-omap.c and phy-gpio-vbus-usb.c)
> >> still used 'set_power' callback to set current, we can remove them in
> >> future. (I have no platform with enabling these two phy drivers, so I
> >> can not test them if I converted 'set_power' callback to USB
> >> charger.)
> >>
> >> 2. Some issues pointed out by Neil Brown are still present in this v19
> >> patchset. I explain each issue below; they may need to be discussed again:
> >> (1) Change all usb phys to register an extcon and to send appropriate
> >> notifications.
> >> Firstly, only 3 USB phy drivers (phy-qcom-8x16-usb.c,
> >> phy-omap-otg.c and
> >> phy-msm-usb.c) have registered an extcon; most have not. I can not
> >> change all usb phys to register an extcon, since there is no extcon
> >> device to register for these different phy drivers.
> >
> > You don't have to change every driver.  You just need to make it easy
> > and obvious how to change drivers in a consistent coherent way.
> > For a start you would add a 'struct extcon_dev' to 'struct usb_phy',
> > and possibly add or extend some 'static inline's in linux/usb/phy.h to
> > send notification on that extcon (if it is non-NULL).
> > e.g. usb_phy_vbus_on() could send an extcon notification.
> >
> > Then any phy driver which adds support for setting phy->extcon_dev
> > appropriately, immediately gets the relevant notifications sent.
> 
> OK. We can move this extcon-related code into the common phy part.
>   

Will the generic phy need to add extcon as well?
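
For reference, Neil's suggestion could look roughly like the sketch below.
It uses mock demo_* types rather than the real struct usb_phy and
struct extcon_dev, and demo_extcon_notify_vbus() is a hypothetical helper,
not an existing extcon API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock stand-in for struct extcon_dev; it only records what it was told. */
struct demo_extcon_dev {
	int vbus_events;
	bool vbus_present;
};

/* Mock stand-in for struct usb_phy with the proposed extra member. */
struct demo_usb_phy {
	struct demo_extcon_dev *extcon_dev;	/* NULL if the phy has none */
};

/* Hypothetical notification helper. */
static void demo_extcon_notify_vbus(struct demo_extcon_dev *edev, bool on)
{
	assert(edev != NULL);
	edev->vbus_present = on;
	edev->vbus_events++;
}

/* Common helper: notify only when the driver set up an extcon. */
static inline void demo_usb_phy_vbus_on(struct demo_usb_phy *phy)
{
	if (phy->extcon_dev)
		demo_extcon_notify_vbus(phy->extcon_dev, true);
}
```

A driver that never sets extcon_dev keeps working unchanged, and one that
does set it gets the notification without any driver-specific code.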

> >
> >> Secondly, I also agreed with Peter's comments: Not only USB PHY to
> >> register an extcon, but also for the drivers which can detect USB
> >> charger type, it may be USB controller driver, USB type-c driver,
> >> pmic driver, and these drivers may not have an extcon device since
> >> the internal part can finish the vbus detect.
> >
> > Whichever part can detect vbus, the driver for that part must be able
> > to find the extcon and trigger a notification.
> > Maybe one part can detect VBUS, another can measure the resistance on
> > ID and a third can work through the state machine to determine if D+
> > and D- are shorted together.
> > Somehow these three need to work together to 

[GIT PULL for v4.11-rc2] media fixes

2017-03-08 Thread Mauro Carvalho Chehab
Hi Linus,

Please pull from:
  git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media tags/media/v4.11-2

For media regression fixes:

   - serial_ir: fix a Kernel crash during boot on Kernel 4.11-rc1, due
to an IRQ code called too early;
   - other IR regression fixes at lirc and at the raw IR decoding;
   - a deadlock fix at the RC nuvoton driver;
   - Fix another issue with DMA on stack at dw2102 driver.

There's an extra patch there that changes a driver interface for the
SoC VSP1 driver, which is shared between the DRM and V4L2 drivers.
The patch itself is trivial, and was acked by David Airlie.
As we're early at -rc, I hope that's ok.

Thanks!
Mauro

The following changes since commit 9eeb0ed0f30938f31a3d9135a88b9502192c18dd:

  [media] mtk-vcodec: fix build warnings without DEBUG (2017-02-08 12:08:20 -0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media tags/media/v4.11-2

for you to fetch changes up to 8c71fff434e5ecf5ff27bd61db1bc9ac4c2b2a1b:

  [media] v4l: vsp1: Adapt vsp1_du_setup_lif() interface to use a structure (2017-03-07 13:34:11 -0300)


media fixes for v4.11-rc2


Heiner Kallweit (1):
  [media] rc: nuvoton: fix deadlock in nvt_write_wakeup_codes

Jonathan McDowell (1):
  [media] dw2102: don't do DMA on stack

Kieran Bingham (1):
  [media] v4l: vsp1: Adapt vsp1_du_setup_lif() interface to use a structure

Sean Young (4):
  [media] serial_ir: ensure we're ready to receive interrupts
  [media] lirc: fix dead lock between open and wakeup_filter
  [media] rc: raw decoder for keymap protocol is not loaded on register
  [media] rc: protocol is not set on register for raw IR devices

 drivers/gpu/drm/rcar-du/rcar_du_vsp.c  |   8 +-
 drivers/media/platform/vsp1/vsp1_drm.c |  33 +++--
 drivers/media/rc/lirc_dev.c            |   4 +-
 drivers/media/rc/nuvoton-cir.c         |   5 +-
 drivers/media/rc/rc-main.c             |  26 ++--
 drivers/media/rc/serial_ir.c           | 123 -
 drivers/media/usb/dvb-usb/dw2102.c     | 244 -
 include/media/vsp1.h                   |  13 +-
 8 files changed, 260 insertions(+), 196 deletions(-)



[ANNOUNCE] 3.2.86-rt124

2017-03-08 Thread Steven Rostedt

Dear RT Folks,

I'm pleased to announce the 3.2.86-rt124 stable release.


This release is just an update to the new stable 3.2.86 version
and no RT specific changes have been made.


You can get this release via the git tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git

  branch: v3.2-rt
  Head SHA1: b48de9f0a9c2007a5cc494fc2795d2b7b4b84562


Or to build 3.2.86-rt124 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.2.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.2.86.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.2/patch-3.2.86-rt124.patch.xz




Enjoy,

-- Steve



Re: [PATCH 1/2] x86/mm/numa: trivial fix on typo and error message

2017-03-08 Thread Wei Yang
Dear masters~

Would you like to share some comments on these two?

On Mon, Feb 06, 2017 at 11:35:28PM +0800, Wei Yang wrote:
>When allocating pg_data in alloc_node_data(), it will try to allocate from
>local node first and then from any node. If it fails at the second trial,
>it means there is no available memory on any node.
>
>This patch fixes the error message and corrects one typo.
>
>Signed-off-by: Wei Yang 
>---
> arch/x86/mm/numa.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
>diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
>index 4366242356c5..3e9110b34147 100644
>--- a/arch/x86/mm/numa.c
>+++ b/arch/x86/mm/numa.c
>@@ -201,8 +201,8 @@ static void __init alloc_node_data(int nid)
>   nd_pa = __memblock_alloc_base(nd_size, SMP_CACHE_BYTES,
> MEMBLOCK_ALLOC_ACCESSIBLE);
>   if (!nd_pa) {
>-  pr_err("Cannot find %zu bytes in node %d\n",
>- nd_size, nid);
>+  pr_err("Cannot find %zu bytes in any node\n",
>+ nd_size);
>   return;
>   }
>   }
>@@ -225,7 +225,7 @@ static void __init alloc_node_data(int nid)
>  * numa_cleanup_meminfo - Cleanup a numa_meminfo
>  * @mi: numa_meminfo to clean up
>  *
>- * Sanitize @mi by merging and removing unncessary memblks.  Also check for
>+ * Sanitize @mi by merging and removing unnecessary memblks.  Also check for
>  * conflicts and clear unused memblks.
>  *
>  * RETURNS:
>-- 
>2.11.0
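
The reasoning behind the message change can be sketched outside the kernel
as follows. The demo_* allocator and its per-node byte counters are
hypothetical stand-ins for memblock, used only to show why the final
failure concerns every node and not just the requested one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_NR_NODES 2

/* Hypothetical per-node free byte counters standing in for memblock. */
static size_t demo_free_bytes[DEMO_NR_NODES];

/* Try one specific node; returns 1 on success, 0 on failure. */
static int demo_alloc_on_node(size_t size, int nid)
{
	assert(nid >= 0 && nid < DEMO_NR_NODES);
	if (demo_free_bytes[nid] < size)
		return 0;
	demo_free_bytes[nid] -= size;
	return 1;
}

/* Try every node in turn; returns the node used, or -1. */
static int demo_alloc_any_node(size_t size)
{
	for (int nid = 0; nid < DEMO_NR_NODES; nid++)
		if (demo_alloc_on_node(size, nid))
			return nid;
	return -1;
}

/* Local node first, then any node; returns the node used, or -1. */
static int demo_alloc_node_data(size_t size, int nid)
{
	int used;

	if (demo_alloc_on_node(size, nid))
		return nid;

	used = demo_alloc_any_node(size);
	if (used < 0)
		/* The fallback already covered all nodes, so naming
		 * only node nid here would be misleading. */
		fprintf(stderr, "Cannot find %zu bytes in any node\n", size);
	return used;
}
```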

-- 
Wei Yang
Help you, Help me




[sched] 1827adb11a BUG kmalloc-128 (Not tainted): Poison overwritten

2017-03-08 Thread Fengguang Wu
Hi Ingo,

FYI this also shows up in next-20170308 and tip/master 7f27de49
("Merge branch 'WIP.sched/core'"). The attached reproduce-* script may
help, however note that this bug may not show up in every boot.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

commit 1827adb11ad26b2290dc9fe2aaf54976b2439865
Merge: 7876991 5eca1c1
Author: Linus Torvalds <torva...@linux-foundation.org>
AuthorDate: Fri Mar 3 10:16:38 2017 -0800
Commit: Linus Torvalds <torva...@linux-foundation.org>
CommitDate: Fri Mar 3 10:16:38 2017 -0800

 Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
 
 Pull sched.h split-up from Ingo Molnar:
  "The point of these changes is to significantly reduce the
header footprint, to speed up the kernel build and to
   have a cleaner header structure.
 
   After these changes the new <linux/sched.h>'s typical preprocessed
   size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K
   lines), which is around 40% faster to build on typical configs.
 
   Not much changed from the last version (-v2) posted three weeks ago: I
   eliminated quirks, backmerged fixes plus I rebased it to an upstream
   SHA1 from yesterday that includes most changes queued up in -next plus
   all sched.h changes that were pending from Andrew.
 
   I've re-tested the series both on x86 and on cross-arch defconfigs,
   and did a bisectability test at a number of random points.
 
   I tried to test as many build configurations as possible, but some
   build breakage is probably still left - but it should be mostly
   limited to architectures that have no cross-compiler binaries
   available on kernel.org, and non-default configurations"
 
 * 'WIP.sched-core-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits)
   sched/headers: Clean up 
   sched/headers: Remove #ifdefs from 
   sched/headers: Remove the  include from 
   sched/headers, hrtimer: Remove the  include from 

   sched/headers, x86/apic: Remove the  header inclusion from 

   sched/headers, timers: Remove the  include from 

   sched/headers: Remove  from 
   sched/headers: Remove  from 
   sched/core: Remove unused prefetch_stack()
   sched/headers: Remove  from 
   sched/headers: Remove the 'init_pid_ns' prototype from 
   sched/headers: Remove  from 
   sched/headers: Remove  from 
   sched/headers: Remove the runqueue_is_locked() prototype
   sched/headers: Remove  from 
   sched/headers: Remove  from 
   sched/headers: Remove  from 
   sched/headers: Remove  from 
   sched/headers: Remove the  include from 
   sched/headers: Remove  from 
   ...

78769912f6  Merge tag 'linux-kselftest-4.11-rc1-urgent_fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
5eca1c10cb  sched/headers: Clean up 
1827adb11a  Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
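+-----------------------------------------------+------------+------------+------------+
|                                               | 78769912f6 | 5eca1c10cb | 1827adb11a |
+-----------------------------------------------+------------+------------+------------+
| boot_successes                                | 69         | 32         | 166        |
| boot_failures                                 | 0          | 0          | 2          |
| BUG_kmalloc-#(Not_tainted):Poison_overwritten | 0          | 0          | 2          |
| INFO:#-#.First_byte#instead_of                | 0          | 0          | 2          |
| INFO:Allocated_in_ida_pre_get_age=#cpu=#pid=  | 0          | 0          | 2          |
| INFO:Freed_in_ida_pre_get_age=#cpu=#pid=      | 0          | 0          | 2          |
| INFO:Slab#objects=#used=#fp=0x(null)flags=    | 0          | 0          | 2          |
| INFO:Object#@offset=#fp=                      | 0          | 0          | 2          |
+-----------------------------------------------+------------+------------+------------+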

[2.792346]  done.
[2.793824] Using IPI No-Shortcut mode
[2.806241] Key type trusted registered
[2.807779] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
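[2.810445] =============================================================================
[2.813344] BUG kmalloc-128 (Not tainted): Poison overwritten
[2.813344] -----------------------------------------------------------------------------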
[2.813344] 
[2.813344] Disabling lock debugging due to kernel taint
[2.813344] INFO: 0xd6ede140-0xd6ede1be. First byte 0xff instead of 0x6b
[2.813344] INFO: Allocated in ida_pre_get+0x3f/0x6a age=71 cpu=0 pid=19
[2.813344]  ___slab_alloc+0x4c6/0x4d8
[2.813344]  __sl

Re: [Xen-devel] [PATCH 0/7] Xen transport for 9pfs frontend driver

2017-03-08 Thread Roger Pau Monné
On Tue, Mar 07, 2017 at 10:27:05AM -0800, Stefano Stabellini wrote:
> On Tue, 7 Mar 2017, Roger Pau Monné wrote:
> > On Mon, Mar 06, 2017 at 12:00:41PM -0800, Stefano Stabellini wrote:
> > > Hi all,
> > > 
> > > This patch series implements a new transport for 9pfs, aimed at Xen
> > > systems.
> > > 
> > > The transport is based on a traditional Xen frontend and backend drivers
> > > pair. This patch series implements the frontend, which typically runs in
> > > a regular unprivileged guest.
> > > 
> > > I'll follow up with another series that implements the backend in
> > > userspace in QEMU, which typically runs in Dom0 (but could also run in
> > > another guest).
> > > 
> > > The frontend complies with the Xen transport for 9pfs specification
> > > version 1, available here:
> > > 
> > > http://xenbits.xen.org/gitweb/?p=xen.git;a=blob_plain;f=docs/misc/9pfs.markdown;hb=HEAD
> > 
> > Kind of tangential to this series, but maybe it would make sense to
> > implement this transport in a fuse based 9pfs driver? I see there are
> > already several fuse-9pfs implementations around. Something for a
> > GSoC/Outreach project?
> 
> Sure. Additionally, with open source frontends and backends already
> available, it should be easier to code. I am happy to co-mentor the
> project with you, if you feel like it.

I don't mind co-mentoring it (so far I haven't got lucky with any of my other
GSoC projects), but I don't know anything about 9pfs or fuse :).

This also has the difficulty that neither of us is a member of any of the
9pfs-fuse projects, so it might be hard to get the changes upstream.

Roger.


Re: [PATCH v3 06/09] iommu/ipmmu-vmsa: Write IMCTR twice

2017-03-08 Thread Magnus Damm
Hi Robin,

On Wed, Mar 8, 2017 at 9:34 PM, Robin Murphy  wrote:
> On 08/03/17 11:02, Magnus Damm wrote:
>> From: Magnus Damm 
>>
>> Write IMCTR both in the root device and the leaf node.
>>
>> Signed-off-by: Magnus Damm 
>> ---
>>
>>  Changes since V2:
>>  - None
>>
>>  Changes since V1:
>>  - None
>>
>>  drivers/iommu/ipmmu-vmsa.c |   17 ++---
>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> --- 0018/drivers/iommu/ipmmu-vmsa.c
>> +++ work/drivers/iommu/ipmmu-vmsa.c   2017-03-08 18:30:36.870607110 +0900
>> @@ -286,6 +286,16 @@ static void ipmmu_ctx_write(struct ipmmu
>>   ipmmu_write(domain->root, domain->context_id * IM_CTX_SIZE + reg, data);
>>  }
>>
>> +static void ipmmu_ctx_write2(struct ipmmu_vmsa_domain *domain, unsigned int reg,
>> +  u32 data)
>
> That's pretty cryptic. Maybe both functions could do with less ambiguous
> names - something like ipmmu_ctx_write_root() vs. ipmmu_ctx_write_all(),
> perhaps? (and if there's a more specific hardware term than "all" that
> describes this kind of configuration, even better).

Yeah I agree. Will fix in next version!

Thanks,

/ magnus


Re: [PATCH net] dccp/tcp: fix routing redirect race

2017-03-08 Thread Jonathan Maxwell
On Thu, Mar 9, 2017 at 3:40 PM, Eric Dumazet  wrote:
> On Thu, 2017-03-09 at 14:42 +1100, Jonathan Maxwell wrote:
>> Sorry let me resend in plain text mode.
>>
>> On Thu, Mar 9, 2017 at 1:10 PM, Eric Dumazet  wrote:
>> > On Thu, 2017-03-09 at 12:15 +1100, Jon Maxwell wrote:
>> >> We have seen a few incidents lately where a dst_entry has been freed
>> >> with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
>> >> dst_entry. If the conditions/timings are right a crash then ensues when
>> >> the freed dst_entry is referenced later on. A common crashing back trace is:
>> >
>> > Very nice catch !
>> >
>>
>> Thanks Eric.
>>
>> > Don't we have a similar issue for IPv6 ?
>> >
>> >
>>
>> Good point.
>>
>> We checked and as far as we can tell IPv6 does not invalidate the route.
>> So it should be safer.
>
> Simply doing :
>
> __sk_dst_check(sk, np->dst_cookie);
>
> is racy, even before calling dst->ops->redirect(dst, sk, skb);
>
> (if socket is owned by user)
>
>
>

Okay, I will add a similar patch for IPv6 to also protect from that.


Re: [PATCH net-next 0/5] sunvnet: better connection management

2017-03-08 Thread David Miller
From: Shannon Nelson 
Date: Wed, 8 Mar 2017 15:04:45 -0800

> On 3/6/2017 3:15 PM, Shannon Nelson wrote:
>> These patches remove some problems in handling of carrier state
>> with the ldmvsw vswitch, remove an xoff misuse in sunvnet, and
>> add stats for debug and tracking of point-to-point connections
>> between the ldom VMs.
> 
> Further testing shows a problem in one of the patches, so there will
> be a V2 coming.

Ok.


Re: Race condition in ext4 (was Re: 4.11-rc1 acpi stomping ext4 slabs)

2017-03-08 Thread Nikolay Borisov


On  9.03.2017 03:58, Theodore Ts'o wrote:
> On Tue, Mar 07, 2017 at 10:40:53PM +0200, Nikolay Borisov wrote:
>> So this is wrong, the reason why the issues seemed fixed is because I
>> switched my compiler to version 5.4.0. So this manifests only if I'm
>> using gcc 4.7.4. With the pr_info added here is the output of a boot. So
>> there are multiple invocations of ext4_ext_map_blocks and the freeing,
>> including with the address being used in subsequent kasan reports :
>> 88006ae8fdb0
> 
> Can you help bisect this, then?  I'm using Debian Testing, and the
> default gcc is gcc 6.3.0.  I'm currently forcing the use of gcc 5.4.1
> because I was running into problems with gcc 6.x a while back.  (TBH,
> I was thinking about trying to see if gcc 6.3 was stable for kernel
> compiles when I had some spare time.)  But I don't have access to
> *any* gcc 4.x on my development system, and I don't think I've tried
> using gcc 4.x in a long, Long, LONG time.
> 
> I'm currently kicking off a test run using 5.4.1 with KASAN enabled to
> see if I can trigger it myself.  Can you send me a copy of your
> .config so I can see what else might be interesting with your config?
> (e.g., SLAB vs SLUB, etc.)

Attached the config. After further debugging and talking with the KASAN
developers, I think this actually might be a KASAN problem when used with
an old compiler.  I bisected this all the way to 1771c6e1a567ea0ba2,
which is the commit introducing the user access instrumentation. Here is
a mail thread where I confirmed that this might be a KASAN issue:
https://lkml.org/lkml/2017/3/8/69

What I believe is happening is that the manual checks inserted in user
access code miss some context information because the instrumentation is
not inserted by the compiler. KASAN gets confused as a result, hence the
warnings.


> 
> Thanks,
> 
>  - Ted
> 
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.7.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEBUG_RODATA=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION="-nbor"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
# CONFIG_USELIB is not set
# CONFIG_AUDIT is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# 

[PATCH] edac i5000, i5400: fix use of MTR_DRAM_WIDTH macro

2017-03-08 Thread Jérémy Lefaure
The MTR_DRAM_WIDTH macro returns the data width. It is sometimes used as
if it returned a boolean that is true when the width is 8. This patch fixes
the tests where MTR_DRAM_WIDTH is misused.

Signed-off-by: Jérémy Lefaure 
---
 drivers/edac/i5000_edac.c | 2 +-
 drivers/edac/i5400_edac.c | 5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 1670d27bcac8..f683919981b0 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1293,7 +1293,7 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
dimm->mtype = MEM_FB_DDR2;
 
/* ask what device type on this row */
-   if (MTR_DRAM_WIDTH(mtr))
+   if (MTR_DRAM_WIDTH(mtr) == 8)
dimm->dtype = DEV_X8;
else
dimm->dtype = DEV_X4;
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index abf6ef22e220..37a9ba71da44 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1207,13 +1207,14 @@ static int i5400_init_dimms(struct mem_ctl_info *mci)
 
dimm->nr_pages = size_mb << 8;
dimm->grain = 8;
-   dimm->dtype = MTR_DRAM_WIDTH(mtr) ? DEV_X8 : DEV_X4;
+   dimm->dtype = MTR_DRAM_WIDTH(mtr) == 8 ?
+ DEV_X8 : DEV_X4;
dimm->mtype = MEM_FB_DDR2;
/*
 * The eccc mechanism is SDDC (aka SECC), with
 * is similar to Chipkill.
 */
-   dimm->edac_mode = MTR_DRAM_WIDTH(mtr) ?
+   dimm->edac_mode = MTR_DRAM_WIDTH(mtr) == 8 ?
  EDAC_S8ECD8ED : EDAC_S4ECD4ED;
ndimms++;
}
-- 
2.12.0
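
To see why the old tests always picked the x8 case, here is a small
standalone sketch. DEMO_MTR_DRAM_WIDTH is an assumed shape for the macro
(one register bit selecting a width of 4 or 8); the bit position and the
demo_* names are illustrative only and not taken from the driver.

```c
#include <assert.h>

enum demo_dev_type { DEMO_DEV_X4, DEMO_DEV_X8 };

/* Assumed shape: returns the width itself (4 or 8), never 0. */
#define DEMO_MTR_DRAM_WIDTH(mtr) ((((mtr) >> 6) & 0x1) ? 8 : 4)

/* Old test: both 4 and 8 are nonzero, so this always yields x8. */
static enum demo_dev_type demo_dtype_buggy(unsigned int mtr)
{
	return DEMO_MTR_DRAM_WIDTH(mtr) ? DEMO_DEV_X8 : DEMO_DEV_X4;
}

/* Fixed test: compare the returned width against 8 explicitly. */
static enum demo_dev_type demo_dtype_fixed(unsigned int mtr)
{
	int width = DEMO_MTR_DRAM_WIDTH(mtr);

	assert(width == 4 || width == 8);
	return width == 8 ? DEMO_DEV_X8 : DEMO_DEV_X4;
}
```

With the width bit clear, demo_dtype_buggy() still reports x8 while
demo_dtype_fixed() reports x4.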



Re: [lkp-robot] [x86] ed3ce2a917: BUG:unable_to_handle_kernel

2017-03-08 Thread Fengguang Wu

On Thu, Mar 09, 2017 at 10:13:10AM +0800, Ye Xiaolong wrote:

On 03/02, Borislav Petkov wrote:

Hi,

On Thu, Mar 02, 2017 at 09:09:34AM +0800, kernel test robot wrote:


FYI, we noticed the following commit:

commit: ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f ("x86: Optimize clear_page()")
url: https://github.com/0day-ci/linux/commits/Borislav-Petkov/x86-Optimize-clear_page/20170215-193441


in testcase: will-it-scale
with following parameters:

test: poll2
cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to 
n parallel copies to see if the testcase will scale. It builds both a process 
and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale


thanks for the report, I was able to reproduce.

BUT(!) this report is misleading because it talks about will-it-scale
but your splat happens when you kexec the kernel:

 [  336.340747] LKP: kexec loading...
 [  336.340852]
 [  336.343323] kexec --noefi -l 
/tmp/cache/pkg/linux/x86_64-rhel-7.2/gcc-6/ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f/vmlinuz-4.9.0-rc6-00134-ged3ce2a
 --initrd=/tmp/cache/initrd-concatenated
 [  336.343758]
 [  337.893471] --append=ip=lkp-ivb-d01::dhcp root=/dev/ram0 user=lkp 
job=/lkp/scheduled/lkp-ivb-d01/will-it-scale-poll2-performance-debian-x86_64-2016-08-31.cgz-ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f-20170301-28072-1dqjyhl-11.yaml
 ARCH=x86_64 kconfig=x86_64-rhel-7.2 branch=linux-devel/devel-hourly-2017022612 
commit=ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f 
BOOT_IMAGE=/pkg/linux/x86_64-rhel-7.2/gcc-6/ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f/vmlinuz-4.9.0-rc6-00134-ged3ce2a
 max_uptime=1500 
RESULT_ROOT=/result/will-it-scale/poll2-performance/lkp-ivb-d01/debian-x86_64-2016-08-31.cgz/x86_64-rhel-7.2/gcc-6/ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f/11
 LKP_SERVER=inn debug apic=debug sysrq_always_enabled 
rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on panic=-1 
softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 
prompt_ramdisk=0 drbd.minor_count=8 systemd.log_level=err ignore_
 [  337.895521]
 [  339.467661] BUG: unable to handle kernel paging request at 8803cf2e2008
 [  339.468000] IP: [] native_set_pmd+0x1/0x10
 ...


Maybe Fengguang has an idea what to do here, maybe something like add
markers to the log to denote where the test environment is prepared and
when the actual test starts. Then grep for those and generate the report
based on that...


Thanks for the suggestions; we'll keep improving the reports so that they
are not confusing or misleading.


One possible improvement is to provide "lkp qemu" reproduce steps for
kernel oops -- it would be way more convenient and safe to follow than
"lkp run", since the latter risks hanging the physical machine.

As for the test description, the dmesg carries markers for the user
space test start/stop points, so the robot can easily tell whether the
oops happens during the test or before/after the test -- the latter may
well (but not always) indicate the oops is not relevant to the testcase,
but to the regular kernel boot/reboot/kexec process.

Thanks,
Fengguang


linux-next: Tree for Mar 9

2017-03-08 Thread Stephen Rothwell
Hi all,

News: I will not be doing any linux-next releases next week.

Changes since 20170308:

Non-merge commits (relative to Linus' tree): 2115
 2950 files changed, 278892 insertions(+), 31441 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 253 trees (counting Linus' and 37 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (b4fb8f66f1ae mm, page_alloc: Add missing check for 
memory holes)
Merging fixes/master (c470abd4fde4 Linux 4.10)
Merging kbuild-current/rc-fixes (c7858bf16c0b asm-prototypes: Clear any CPP 
defines before declaring the functions)
Merging arc-current/for-curr (7f35144cea21 ARC: get rate from clk driver 
instead of reading device tree)
Merging arm-current/fixes (9e3440481845 ARM: 8658/1: uaccess: fix zeroing of 
64-bit get_user())
Merging m68k-current/for-linus (b5bb8f3120a7 m68k/bitops: Correct signature of 
test_bit())
Merging metag-fixes/fixes (35d04077ad96 metag: Only define 
atomic_dec_if_positive conditionally)
Merging powerpc-fixes/fixes (a7d2475af7ae powerpc: Sort the selects under 
CONFIG_PPC)
Merging sparc/master (f8e6859ea9d0 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and 
linking special files)
Merging net/master (8474c8caac7e Merge branch 'master' of 
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec)
Merging ipsec/master (e3dc847a5f85 vti6: Don't report path MTU below 
IPV6_MIN_MTU.)
Merging netfilter/master (568af6de058c netfilter: nf_tables: set pktinfo->thoff 
at AH header if found)
Merging ipvs/master (045169816b31 Merge branch 'linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6)
Merging wireless-drivers/master (52f5631a4c05 rtlwifi: rtl8192ce: Fix loading 
of incorrect firmware)
Merging mac80211/master (8d70eeb84ab2 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging sound-current/for-linus (f3ac9f737603 ALSA: seq: Fix link corruption by 
event error handling)
Merging pci-current/for-linus (3bd7db63a841 PCI/ASPM: Always set 
link->downstream to avoid NULL dereference on remove)
Merging driver-core.current/driver-core-linus (c1ae3cfa0e89 Linux 4.11-rc1)
Merging tty.current/tty-linus (f98c7bce570b serial: samsung: Continue to work 
if DMA request fails)
Merging usb.current/usb-linus (c1ae3cfa0e89 Linux 4.11-rc1)
Merging usb-gadget-fixes/fixes (35b2719e72d3 usb: dwc3: gadget: make to 
increment req->remaining in all cases)
Merging usb-serial-fixes/usb-linus (8c76d7cd520e USB: serial: safe_serial: fix 
information leak in completion handler)
Merging usb-chipidea-fixes/ci-for-usb-stable (c7fbb09b2ea1 usb: chipidea: move 
the lock initialization to core file)
Merging phy/fixes (c1ae3cfa0e89 Linux 4.11-rc1)
Merging staging.current/staging-linus (69eb1596b4df staging: octeon: remove 
unused variable)
Merging char-misc.current/char-misc-linus (c1ae3cfa0e89 Linux 4.11-rc1)
Merging input-current/for-linus (45838660e34d Input: i8042 - add noloop quirk 
for Dell Embedded Box PC 3000)
Merging crypto-current/master (b985735be7af hwrng: omap - Do not access 
INTMASK_REG on EIP76)
Merging ide/master (96297aee8bce ide: palm_bk3710: add __initdata to 
palm_bk3710_port_info)
Merging vfio-fixes/for-linus (930a42ded3fe vfio/spapr_tce: Set window when 
adding additional groups to container)

Re: [PATCH v3 01/09] iommu/ipmmu-vmsa: Introduce features, break out alias

2017-03-08 Thread Magnus Damm
Hi Robin,

On Wed, Mar 8, 2017 at 8:53 PM, Robin Murphy  wrote:
> Hi Magnus,
>
> On 08/03/17 11:01, Magnus Damm wrote:
>> From: Magnus Damm 
>>
>> Introduce struct ipmmu_features to track various hardware
>> and software implementation changes inside the driver for
>> different kinds of IPMMU hardware. Add use_ns_alias_offset
>> as a first example of a feature to control if the secure
>> register bank offset should be used or not.
>>
>> Signed-off-by: Magnus Damm 
>> ---
>>
>>  Changes since V2:
>>  - None
>>
>>  Changes since V1:
>>  - Moved patch to front of the series
>>
>>  drivers/iommu/ipmmu-vmsa.c |   35 ---
>>  1 file changed, 28 insertions(+), 7 deletions(-)
>>
>> --- 0007/drivers/iommu/ipmmu-vmsa.c
>> +++ work/drivers/iommu/ipmmu-vmsa.c   2017-03-07 12:25:47.0 +0900
>> @@ -32,11 +32,15 @@
>>
>>  #define IPMMU_CTX_MAX 1
>>
>> +struct ipmmu_features {
>> + bool use_ns_alias_offset;
>> +};
>> +
>>  struct ipmmu_vmsa_device {
>>   struct device *dev;
>>   void __iomem *base;
>>   struct list_head list;
>> -
>> + const struct ipmmu_features *features;
>>   unsigned int num_utlbs;
>>   spinlock_t lock; /* Protects ctx and domains[] */
>>   DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
>> @@ -999,13 +1003,33 @@ static void ipmmu_device_reset(struct ip
>>   ipmmu_write(mmu, i * IM_CTX_SIZE + IMCTR, 0);
>>  }
>>
>> +static const struct ipmmu_features ipmmu_features_default = {
>> + .use_ns_alias_offset = true,
>> +};
>> +
>> +static const struct of_device_id ipmmu_of_ids[] = {
>> + {
>> + .compatible = "renesas,ipmmu-vmsa",
>> + .data = &ipmmu_features_default,
>> + }, {
>> + /* Terminator */
>> + },
>> +};
>> +
>> +MODULE_DEVICE_TABLE(of, ipmmu_of_ids);
>> +
>>  static int ipmmu_probe(struct platform_device *pdev)
>>  {
>>   struct ipmmu_vmsa_device *mmu;
>> + const struct of_device_id *match;
>>   struct resource *res;
>>   int irq;
>>   int ret;
>>
>> + match = of_match_node(ipmmu_of_ids, pdev->dev.of_node);
>
> of_device_get_match_data() makes this a lot easier.
>
>> + if (!match)
>> + return -EINVAL;
>
> Also, if the driver is DT-only per the other series, note that this
> cannot happen anyway, since of_driver_match_device() would have to have
> found a match for your probe function to be called in the first place.

Yeah, you are right. As you know, in the IPMMU driver (with the
r8a7795 V3 series applied) the init handling is a bit special with
ARM32 and ARM64 being treated differently. I would like to clean it up
and share a common implementation.

Until that happens, how do you think we should handle the (!match)
case? BUG_ON()?

Cheers,

/ magnus
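
For readers unfamiliar with the helper Robin mentions:
of_device_get_match_data() returns the .data pointer of the matching table
entry directly, so the probe function does not need the intermediate match
variable. What follows is a userspace model of that lookup, not kernel code.
Every model_* name is made up for illustration, and the real helper takes a
struct device * rather than a table and a string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel types; names are illustrative only. */
struct model_features {
	bool use_ns_alias_offset;
};

struct model_device_id {
	const char *compatible;
	const void *data;
};

static const struct model_features model_features_default = {
	.use_ns_alias_offset = true,
};

static const struct model_device_id model_ids[] = {
	{ .compatible = "renesas,ipmmu-vmsa", .data = &model_features_default },
	{ .compatible = NULL },	/* terminator */
};

/* Return the .data of the first entry matching 'compatible', or NULL. */
static const void *model_get_match_data(const struct model_device_id *table,
					const char *compatible)
{
	for (; table->compatible; table++)
		if (strcmp(table->compatible, compatible) == 0)
			return table->data;
	return NULL;
}
```

In this model a NULL return is the only "no match" signal, so a caller needs
a single check on the returned features pointer.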


"mm: fix lazyfree BUG_ON check in try_to_unmap_one()" build error

2017-03-08 Thread Sergey Senozhatsky
Hello Minchan,

/* I can't find the https://marc.info/?l=linux-kernel&m=148886631303107 thread
   in my mail box for some reason so the Reply-To message-id may be wrong. */



commit "mm: fix lazyfree BUG_ON check in try_to_unmap_one()"
(mmotm fd07630cbf59bead90046dd3e5cfd891e58e6987)


if (VM_WARN_ON_ONCE(PageSwapBacked(page) !=
PageSwapCache(page))) {
...
}


does not compile on !CONFIG_DEBUG_VM configs, because there VM_WARN_ON_ONCE()
expands to BUILD_BUG_ON_INVALID(), which is a void expression:

#define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e))))



In file included from ./include/linux/mmdebug.h:4:0,
 from ./include/linux/mm.h:8,
 from mm/rmap.c:48:
mm/rmap.c: In function ‘try_to_unmap_one’:
./include/linux/bug.h:45:33: error: void value not ignored as it ought to be
 #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e))))
 ^
./include/linux/mmdebug.h:49:31: note: in expansion of macro 
‘BUILD_BUG_ON_INVALID’
 #define VM_WARN_ON_ONCE(cond) BUILD_BUG_ON_INVALID(cond)
   ^~~~
mm/rmap.c:1416:8: note: in expansion of macro ‘VM_WARN_ON_ONCE’
if (VM_WARN_ON_ONCE(PageSwapBacked(page) !=
^~~
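
The failure can be reproduced outside the kernel. The sketch below is a
userspace illustration, not kernel code: CHECK_VOID mirrors the shape of the
debug-off definition (minus __force), and CHECK_VALUE is a hypothetical
variant that still only type-checks its argument but evaluates to false, so
it can sit inside an if condition.

```c
#include <stdbool.h>

/*
 * Same shape as the debug-off BUILD_BUG_ON_INVALID(): the argument is only
 * type-checked through sizeof, never evaluated, and the result is void.
 */
#define CHECK_VOID(e)  ((void)(sizeof((long)(e))))

/*
 * Hypothetical variant (not from the kernel tree): still unevaluated, but
 * the comma operator makes the whole expression yield false.
 */
#define CHECK_VALUE(e) ((void)(sizeof((long)(e))), false)

static bool demo(int a, int b)
{
	CHECK_VOID(a != b);		/* fine as a plain statement */

	/*
	 * if (CHECK_VOID(a != b)) { ... } would be rejected with
	 * "void value not ignored as it ought to be".
	 */
	if (CHECK_VALUE(a != b))
		return true;		/* never taken in this configuration */

	return false;
}
```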

-ss


Re: [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf

2017-03-08 Thread Kees Cook
On Wed, Mar 8, 2017 at 3:55 PM, Laura Abbott  wrote:
> On 03/08/2017 02:36 PM, Kees Cook wrote:
>> On Wed, Mar 8, 2017 at 2:27 PM, Daniel Borkmann  wrote:
>>> [   28.474232] rodata_test: test data was not read only
>>> [...]
>>
>> In my tests so far, I've never been able to get rodata_test to fail
>> (Qemu 2.5.0, Ubuntu). I'll retry with your .config and see if I can
>> recheck under Qemu 2.7.1. Do you see these failures on real hardware?
>>
>> -Kees
>>
>
> FWIW, I'm seeing the same issue with qemu 2.6.2 and 2.8.0 on Fedora 24
> and rawhide respectively.
>
> I also notice that CONFIG_X86_PAE is turned off in the defconfig. If
> I set CONFIG_HIGHMEM_64G which turns on CONFIG_X86_PAE the problem
> goes away. I can't tell if this is an indication of magically hiding
> the TLB problem or if there is an issue with !X86_PAE invalidation.

I found my difference. I normally run qemu with "-cpu host" which
makes the failure go away. With "-cpu kvm64", I see the rodata_test
failure immediately. Seems like this may be a kvm cpu feature
emulation bug? I'll see if I can find the specific cpu feature in the
morning...

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH] net: sun: sungem: use new api ethtool_{get|set}_link_ksettings

2017-03-08 Thread David Miller
From: Philippe Reynes 
Date: Sun,  5 Mar 2017 00:04:18 +0100

> The ethtool api {get|set}_settings is deprecated.
> We move this driver to new api {get|set}_link_ksettings.
> 
> As I don't have the hardware, I'd be very pleased if
> someone may test this patch.
> 
> Signed-off-by: Philippe Reynes 

Applied.


Re: [PATCH] drivers: Remove OF dependency in drv260x driver

2017-03-08 Thread Dmitry Torokhov
On Wed, Mar 08, 2017 at 04:54:02PM -0800, Jingkui Wang wrote:
> As the driver is using generic device properties, it should work
> properly when CONFIG_OF is turned off. This patch removes the
> ifdef CONFIG_OF and makes sure the driver always has of_match_table.
> 
> Signed-off-by: Jingkui Wang 
> ---
>  drivers/input/misc/drv260x.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/input/misc/drv260x.c b/drivers/input/misc/drv260x.c
> index fb089d3..17eb84a 100644
> --- a/drivers/input/misc/drv260x.c
> +++ b/drivers/input/misc/drv260x.c
> @@ -652,7 +652,6 @@ static const struct i2c_device_id drv260x_id[] = {
>  };
>  MODULE_DEVICE_TABLE(i2c, drv260x_id);
> 
> -#ifdef CONFIG_OF
>  static const struct of_device_id drv260x_of_match[] = {
>   { .compatible = "ti,drv2604", },
>   { .compatible = "ti,drv2604l", },
> @@ -661,13 +660,12 @@ static const struct of_device_id drv260x_of_match[] = {
>   { }
>  };
>  MODULE_DEVICE_TABLE(of, drv260x_of_match);
> -#endif
> 
>  static struct i2c_driver drv260x_driver = {
>   .probe = drv260x_probe,
>   .driver = {
>   .name = "drv260x-haptics",
> - .of_match_table = of_match_ptr(drv260x_of_match),
> + .of_match_table = drv260x_of_match,
>   .pm = &drv260x_pm_ops,
>   },
>   .id_table = drv260x_id,

Hmm, what did you use to mail it? Your mailer ate all tabs.

Thanks.
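
For context on what the patch changes: of_match_ptr() evaluates to its
argument when CONFIG_OF is set and to NULL otherwise, which is why the table
had to sit under #ifdef before. A userspace sketch of that behavior follows.
The macro mirrors the kernel's definition, while struct model_of_device_id
and the before_patch()/after_patch() helpers are invented for illustration.

```c
#include <stddef.h>

/* Stand-in for struct of_device_id; only the field used here. */
struct model_of_device_id {
	const char *compatible;
};

/* Mirrors the kernel macro: a real pointer with CONFIG_OF, NULL without. */
#ifdef CONFIG_OF
#define of_match_ptr(ptr) (ptr)
#else
#define of_match_ptr(ptr) NULL
#endif

static const struct model_of_device_id drv260x_of_match[] = {
	{ "ti,drv2604" },
	{ "ti,drv2604l" },
	{ NULL },
};

/* Before the patch: the driver sees no table when CONFIG_OF is off. */
static const struct model_of_device_id *before_patch(void)
{
	return of_match_ptr(drv260x_of_match);
}

/* After the patch: the table is always referenced. */
static const struct model_of_device_id *after_patch(void)
{
	return drv260x_of_match;
}
```

Built without -DCONFIG_OF, before_patch() returns NULL while after_patch()
still returns the table.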

-- 
Dmitry


Re: [PATCH 1/2] x86/efi: Correct a tiny mistake in code comment

2017-03-08 Thread Dave Young
Hi,

On 03/08/17 at 04:45pm, Baoquan He wrote:
> Forgot cc to Boris, add him.
> 
> On 03/08/17 at 04:18pm, Dave Young wrote:
> > On 03/08/17 at 03:47pm, Baoquan He wrote:
> > > EFI allocates runtime services regions down from EFI_VA_START, -4G.
> > > It should be top-down handling.
> > > 
> > > Signed-off-by: Baoquan He 
> > > ---
> > >  arch/x86/platform/efi/efi_64.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/platform/efi/efi_64.c 
> > > b/arch/x86/platform/efi/efi_64.c
> > > index a4695da..6cbf9e0 100644
> > > --- a/arch/x86/platform/efi/efi_64.c
> > > +++ b/arch/x86/platform/efi/efi_64.c
> > > @@ -47,7 +47,7 @@
> > >  #include 
> > >  
> > >  /*
> > > - * We allocate runtime services regions bottom-up, starting from -4G, 
> > > i.e.
> > > + * We allocate runtime services regions top-down, starting from -4G, i.e.
> > 
> > Baoquan, I think the original bottom-up is right; it just considers
> > -68G as up, see the x86_64 mm.txt. We regard vmalloc as a higher address,
> > although from a mathematical view it is lower than positive addresses.
> 
> Thanks for reviewing!
> 
> I am not sure. In efi_map_region(), the starting va for mapping a region
> of 'size' bytes is obtained with the code below:
>   efi_va -= size;
> 
> -4G and -68G are just a trick to make this easy to understand; we still
> consider the kernel text mapping region to be in a higher address area than
> vmalloc. That is my personal view.

I understand your points; there is no right or wrong here. So I think
dropping the words, as in your V2, looks good.

Thanks
Dave
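
The disagreement is about naming, not behavior. A userspace sketch of the
allocation pattern quoted above (efi_va -= size) makes the direction
explicit: each new region starts at a numerically lower address than the
previous one. EFI_VA_START and EFI_VA_END are taken as -4G and -68G as in
the thread; efi_reserve() is a hypothetical name, and any alignment handling
is left out.

```c
#include <stdint.h>

/* -4G and -68G expressed as unsigned 64-bit virtual addresses. */
#define EFI_VA_START ((uint64_t)-4 * (UINT64_C(1) << 30))
#define EFI_VA_END   ((uint64_t)-68 * (UINT64_C(1) << 30))

/*
 * Reserve 'size' bytes below the current cursor and move the cursor down.
 * Returns the start of the new region, or 0 if the window is exhausted.
 */
static uint64_t efi_reserve(uint64_t *efi_va, uint64_t size)
{
	if (size > *efi_va - EFI_VA_END)
		return 0;

	*efi_va -= size;
	return *efi_va;
}
```

Starting from EFI_VA_START, two successive reservations land at decreasing
addresses, which is the sense in which the patch calls the scheme top-down.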


Re: [PATCH] hwmon: (dell-smm) Add Dell XPS 15 9560 into DMI list

2017-03-08 Thread Guenter Roeck

On 03/03/2017 02:41 PM, Pali Rohár wrote:

It was reported that dell-smm-hwmon is working fine on Dell XPS 15 9560.

Link: http://www.spinics.net/lists/platform-driver-x86/msg10751.html
Reported-by: Vasile Dumitrescu 
Signed-off-by: Pali Rohár 


With Vasile's feedback, I'll consider this patch tested and will apply it
to hwmon-next.

Thanks,
Guenter



---
 drivers/hwmon/dell-smm-hwmon.c |7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c
index 34704b0..3189246 100644
--- a/drivers/hwmon/dell-smm-hwmon.c
+++ b/drivers/hwmon/dell-smm-hwmon.c
@@ -995,6 +995,13 @@ enum i8k_configs {
},
.driver_data = (void *)&i8k_config_data[DELL_XPS],
},
+   {
+   .ident = "Dell XPS 15 9560",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "XPS 15 9560"),
+   },
+   },
{ }
 };






Re: [PATCH 0/3] mm/fs: get PG_error out of the writeback reporting business

2017-03-08 Thread Theodore Ts'o
On Tue, Mar 07, 2017 at 11:26:22AM +0100, Jan Kara wrote:
> On a more general note (DAX is actually fine here), I find the current
> practice of clearing page dirty bits on error and reporting it just once
> problematic. It keeps the system running but data is lost and possibly
> without getting the error anywhere where it is useful. We get away with
> this because it is a rare event but it seems like a problematic behavior.
> But this is more for the discussion at LSF.

I'm actually running into this in the last day or two because some MM
folks at $WORK have been trying to push hard for GFP_NOFS removal in
ext4 (at least when we are holding some mutex/semaphore like
i_data_sem) because otherwise it's possible for the OOM killer to be
unable to kill processes because they are holding on to locks that
ext4 is holding.

I've done some initial investigation, and while it's not that hard to
remove GFP_NOFS from certain parts of the writepages() codepath (which
is where we had been running into problems), a really, REALLY big
problem is if any_filesystem->writepages() returns ENOMEM, it causes
silent data loss, because the pages are marked clean, and so data
written using buffered writeback goes *poof*.

I confirmed this by creating a test kernel with a simple patch such
that if the ext4 file system is mounted with -o debug, there was a 1
in 16 chance that ext4_writepages will immediately return with ENOMEM
(and printk the inode number, so I knew which inodes had gotten the
ENOMEM treatment).  The result was **NOT** pretty.

What I think we should strongly consider is at the very least, special
case ENOMEM being returned by writepages() during background
writeback, and *not* mark the pages clean, and make sure the inode
stays on the dirty inode list, so we can retry the write later.  This
is especially important since the process that issued the write may
have gone away, so there might not even be a userspace process to
complain to.  By converting certain page allocations (most notably in
ext4_mb_load_buddy) from GFP_NOFS to GFP_KERNEL, this allows us to
release the i_data_sem lock and return an error.  This should allow
the OOM killer to do its dirty deed, and hopefully we can retry
the writepages() for that inode later.
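
The proposed policy can be summarized in a few lines. The sketch below is a
deliberately simplified userspace model, not a patch: struct model_inode and
writeback_done() are invented names, and real writeback tracks dirtiness per
page rather than per inode.

```c
#include <errno.h>
#include <stdbool.h>

/* Minimal model of the state discussed above. */
struct model_inode {
	bool dirty;		/* data not yet safely written */
	bool on_dirty_list;	/* will be picked up by background writeback */
	int wb_err;		/* last permanent writeback error, 0 if none */
};

/* Handle the result of one background writepages() attempt. */
static void writeback_done(struct model_inode *inode, int err)
{
	if (err == -ENOMEM) {
		/* Transient failure: keep the data dirty and retry later. */
		inode->dirty = true;
		inode->on_dirty_list = true;
		return;
	}

	/* Success or permanent error: behave as today and mark clean. */
	inode->dirty = false;
	inode->on_dirty_list = false;
	if (err)
		inode->wb_err = err;	/* remembered for a later fsync() */
}
```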

In the case of a data integrity sync being issued by fsync() or
umount(), we could allow ENOMEM to get returned to userspace in that
case as well.  I'm not convinced all userspace code will handle an
ENOMEM correctly or sanely, but at least people will be less
likely to blame file system developers.  :-)

The real problem that's going on here, by the way, is that people are
trying to run programs in insanely tight containers, and then when the
kernel locks up, they blame the mm developers.  But if there is silent
data corruption, they will blame the fs developers instead.  And while
kernel lockups are temporary (all you have to do is let the watchdog
reboot the system :-), silent data corruption is *forever*.  So what
we really need to do is to allow the OOM killer to do its work, and if
job owners are unhappy that their processes are getting OOM killed,
maybe they will be suitably incentivized to pay for more memory in
their containers.

- Ted

P.S. Michal Hocko, apologies for not getting back to you with your
GFP_NOFS removal patches.  But the possibility of fs malfunctions that
might lead to silent data corruption is why I'm being very cautious,
and I now have rather strong confirmation that this is not just an
irrational concern on my part.  (This is also performance review
season, FAST conference was last week, and Usenix ATC program
committee reviews are due this week.  So apologies for any reply
latency.)


[PATCH V2] cpufreq: schedutil: refactor sugov_next_freq_shared()

2017-03-08 Thread Viresh Kumar
The loop in sugov_next_freq_shared() contains an if block to skip the
loop for the current CPU. This turns out to be an unnecessary
conditional in the scheduler's hot-path for every CPU in the policy.

It would be better to drop the conditional and make the loop treat all
the CPUs in the same way. That would eliminate the need of calling
sugov_iowait_boost() at the top of the routine.

To keep the code optimized to return early if the current CPU has RT/DL
flags set, move the flags check to sugov_update_shared() instead in
order to avoid the function call entirely.

Signed-off-by: Viresh Kumar 
---
V1->V2:
- Keep the flags check separately for the current CPU, but move it to
  the parent routine.
- Improved commit log.

 kernel/sched/cpufreq_schedutil.c | 25 +
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 78468aa051ab..f5ffe241812e 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -217,30 +217,19 @@ static void sugov_update_single(struct update_util_data 
*hook, u64 time,
sugov_update_commit(sg_policy, time, next_f);
 }
 
-static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
-  unsigned long util, unsigned long 
max,
-  unsigned int flags)
+static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu)
 {
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
struct cpufreq_policy *policy = sg_policy->policy;
-   unsigned int max_f = policy->cpuinfo.max_freq;
u64 last_freq_update_time = sg_policy->last_freq_update_time;
+   unsigned long util = 0, max = 1;
unsigned int j;
 
-   if (flags & SCHED_CPUFREQ_RT_DL)
-   return max_f;
-
-   sugov_iowait_boost(sg_cpu, &util, &max);
-
for_each_cpu(j, policy->cpus) {
-   struct sugov_cpu *j_sg_cpu;
+   struct sugov_cpu *j_sg_cpu = &per_cpu(sugov_cpu, j);
unsigned long j_util, j_max;
s64 delta_ns;
 
-   if (j == smp_processor_id())
-   continue;
-
-   j_sg_cpu = &per_cpu(sugov_cpu, j);
/*
 * If the CPU utilization was last updated before the previous
 * frequency update and the time elapsed between the last update
@@ -254,7 +243,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu 
*sg_cpu,
continue;
}
if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
-   return max_f;
+   return policy->cpuinfo.max_freq;
 
j_util = j_sg_cpu->util;
j_max = j_sg_cpu->max;
@@ -289,7 +278,11 @@ static void sugov_update_shared(struct update_util_data 
*hook, u64 time,
sg_cpu->last_update = time;
 
if (sugov_should_update_freq(sg_policy, time)) {
-   next_f = sugov_next_freq_shared(sg_cpu, util, max, flags);
+   if (flags & SCHED_CPUFREQ_RT_DL)
+   next_f = sg_policy->policy->cpuinfo.max_freq;
+   else
+   next_f = sugov_next_freq_shared(sg_cpu);
+
sugov_update_commit(sg_policy, time, next_f);
}
 
-- 
2.7.1.410.g6faf27b



Re: [PATCH v3 03/09] iommu/ipmmu-vmsa: Enable multi context support

2017-03-08 Thread Magnus Damm
Hi Robin,

Thanks for your feedback!

On Wed, Mar 8, 2017 at 9:21 PM, Robin Murphy  wrote:
> On 08/03/17 11:01, Magnus Damm wrote:
>> From: Magnus Damm 
>>
>> Add support for up to 8 contexts. Each context is mapped to one
>> domain. One domain is assigned one or more slave devices. Contexts
>> are allocated dynamically and slave devices are grouped together
>> based on which IPMMU device they are connected to. This makes slave
>> devices tied to the same IPMMU device share the same IOVA space.
>>
>> Signed-off-by: Magnus Damm 
>> ---
>>
>>  Changes since V2:
>>  - Updated patch description to reflect code included in:
>>[PATCH v7 00/07] iommu/ipmmu-vmsa: IPMMU multi-arch update V7
>>
>>  Changes since V1:
>>  - Support up to 8 contexts instead of 4
>>  - Use feature flag and runtime handling
>>  - Default to single context
>>
>>  drivers/iommu/ipmmu-vmsa.c |   38 ++
>>  1 file changed, 30 insertions(+), 8 deletions(-)
>>
>> --- 0012/drivers/iommu/ipmmu-vmsa.c
>> +++ work/drivers/iommu/ipmmu-vmsa.c   2017-03-08 17:59:19.900607110 +0900
>> @@ -30,11 +30,12 @@
>>
>>  #include "io-pgtable.h"
>>
>> -#define IPMMU_CTX_MAX 1
>> +#define IPMMU_CTX_MAX 8
>>
>>  struct ipmmu_features {
>>   bool use_ns_alias_offset;
>>   bool has_cache_leaf_nodes;
>> + bool has_eight_ctx;
>
> Wouldn't it be more sensible to just encode a number of contexts
> directly, if it isn't reported by the hardware itself? I'm just
> imagining future hardware generations... :P
>
> bool also_has_another_eight_ctx_on_top_of_that;
> bool wait_no_this_is_the_one_where_ctx_15_isnt_usable;

=)

Sure, I agree with you!

Please note that this is currently a mix of software and hardware
policy. On R-Car Gen2 (ARM32) the legacy code only uses a single
context for now but 4 contexts are supported by hardware according to
the data sheet. The remaining 3 contexts are untested at this point.
For R-Car Gen3 (ARM64) the hardware supports 8 contexts and this patch
enables all of them.

>>  };
>>
>>  struct ipmmu_vmsa_device {
>> @@ -44,6 +45,7 @@ struct ipmmu_vmsa_device {
>>   const struct ipmmu_features *features;
>>   bool is_leaf;
>>   unsigned int num_utlbs;
>> + unsigned int num_ctx;
>>   spinlock_t lock; /* Protects ctx and domains[] */
>>   DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
>>   struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
>> @@ -376,11 +378,12 @@ static int ipmmu_domain_allocate_context
>>
>>   spin_lock_irqsave(&mmu->lock, flags);
>>
>> - ret = find_first_zero_bit(mmu->ctx, IPMMU_CTX_MAX);
>> - if (ret != IPMMU_CTX_MAX) {
>> + ret = find_first_zero_bit(mmu->ctx, mmu->num_ctx);
>> + if (ret != mmu->num_ctx) {
>>   mmu->domains[ret] = domain;
>>   set_bit(ret, mmu->ctx);
>
> Using test_and_set_bit() in a loop would avoid having to take a lock here.

So you mean that if test_and_set_bit() returns 1, we try
find_first_zero_bit() again?

This is not really a performance sensitive part of the driver, so I'm
currently optimizing for code readability. I'm of course all for
dropping the lock, but I have a hard time figuring out how your
suggestion could result in semi-readable code. Any pointers? =)
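
One way to read Robin's suggestion, sketched in userspace with C11 atomics
rather than the kernel bitops: atomic_fetch_or() plays the role of
test_and_set_bit(), and a lost race simply rescans. The function and
parameter names are illustrative. Note that the driver also stores
mmu->domains[ret] under the same lock, which this sketch does not model.

```c
#include <stdatomic.h>

#define CTX_MAX 8

/*
 * Claim the lowest free context below num_ctx (num_ctx <= CTX_MAX).
 * Returns the context index, or -1 if all are in use.
 */
static int ctx_alloc(atomic_uint *bitmap, unsigned int num_ctx)
{
	for (;;) {
		unsigned int cur = atomic_load(bitmap);
		unsigned int i;

		/* Equivalent of find_first_zero_bit() on a snapshot. */
		for (i = 0; i < num_ctx; i++)
			if (!(cur & (1u << i)))
				break;
		if (i == num_ctx)
			return -1;

		/* Equivalent of test_and_set_bit(): old bit clear means we won. */
		if (!(atomic_fetch_or(bitmap, 1u << i) & (1u << i)))
			return (int)i;

		/* Somebody else took bit i in the meantime; scan again. */
	}
}

/* Release a context previously returned by ctx_alloc(). */
static void ctx_free(atomic_uint *bitmap, unsigned int i)
{
	atomic_fetch_and(bitmap, ~(1u << i));
}
```

Passing num_ctx as a plain number also matches Robin's earlier point about
encoding the context count directly instead of a has_eight_ctx flag.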

>> @@ -1112,6 +1123,17 @@ static int ipmmu_probe(struct platform_d
>>   if (mmu->features->use_ns_alias_offset)
>>   mmu->base += IM_NS_ALIAS_OFFSET;
>>
>> + /*
>> +  * The number of contexts varies with generation and instance.
>> +  * Newer SoCs get a total of 8 contexts enabled, older ones just one.
>> +  */
>> + if (mmu->features->has_eight_ctx)
>> + mmu->num_ctx = 8;
>> + else
>> + mmu->num_ctx = 1;
>> +
>> + WARN_ON(mmu->num_ctx > IPMMU_CTX_MAX);
>
> The likelihood of that happening doesn't appear to warrant a runtime
> check. Especially one which probably isn't even generated because it's
> trivially resolvable to "if (false)..." at compile time.

Sure, I agree. Will drop.

Thanks,

/ magnus


Re: [PATCH 1/3] futex: remove duplicated code

2017-03-08 Thread Rob Landley
On 03/04/2017 07:05 AM, Russell King - ARM Linux wrote:
> On Fri, Mar 03, 2017 at 01:27:10PM +0100, Jiri Slaby wrote:
>> diff --git a/kernel/futex.c b/kernel/futex.c
>> index b687cb22301c..c5ff9850952f 100644
>> --- a/kernel/futex.c
>> +++ b/kernel/futex.c
>> @@ -1457,6 +1457,42 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int 
>> nr_wake, u32 bitset)
>>  return ret;
>>  }
>>  
>> +static int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
>> +{
>> +int op = (encoded_op >> 28) & 7;
>> +int cmp = (encoded_op >> 24) & 15;
>> +int oparg = (encoded_op << 8) >> 20;
>> +int cmparg = (encoded_op << 20) >> 20;
> 
> Hmm.  oparg and cmparg look like they're doing these shifts to get sign
> extension of the 12-bit values by assuming that "int" is 32-bit -
> probably worth a comment, or for safety, they should be "s32" so it's
> not dependent on the bit-width of "int".

I thought Linux depended on the LP64 standard for all architectures?

Standard: http://www.unix.org/whitepapers/64bit.html
Rationale: http://www.unix.org/version2/whatsnew/lp64_wp.html

So int has a defined bit width (32) on linux?
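
The decoding Russell comments on can be written without relying on the width
of int or on shifting into the sign bit. The sketch below is a userspace
illustration using fixed-width types as he suggests, not the proposed kernel
code; struct futex_op_fields and sign_extend12() are names invented here.
In practice int is 32 bits under both the ILP32 and LP64 data models, so the
original shifts produce the same values.

```c
#include <stdint.h>

/* The four fields packed into encoded_op. */
struct futex_op_fields {
	int32_t op;	/* bits 30..28 */
	int32_t cmp;	/* bits 27..24 */
	int32_t oparg;	/* bits 23..12, signed */
	int32_t cmparg;	/* bits 11..0, signed */
};

/* Sign-extend the low 12 bits of v to a 32-bit signed value. */
static int32_t sign_extend12(uint32_t v)
{
	v &= 0xfff;
	return (int32_t)(v ^ 0x800) - 0x800;
}

static struct futex_op_fields futex_decode(uint32_t encoded_op)
{
	struct futex_op_fields f;

	f.op = (int32_t)((encoded_op >> 28) & 7);
	f.cmp = (int32_t)((encoded_op >> 24) & 15);
	f.oparg = sign_extend12(encoded_op >> 12);
	f.cmparg = sign_extend12(encoded_op);
	return f;
}
```

The XOR-and-subtract form gives the same result as the shift pair
"(x << 8) >> 20" while staying within defined behavior in ISO C.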

Rob


Re: blk: improve order of bio handling in generic_make_request()

2017-03-08 Thread NeilBrown
On Wed, Mar 08 2017, Mikulas Patocka wrote:

> On Wed, 8 Mar 2017, NeilBrown wrote:
>> 
>> I don't think this will fix the DM snapshot deadlock by itself.
>> Rather, it makes it possible for some internal changes to DM to fix it.
>> The DM change might be something vaguely like:
>> 
>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
>> index 3086da5664f3..06ee0960e415 100644
>> --- a/drivers/md/dm.c
>> +++ b/drivers/md/dm.c
>> @@ -1216,6 +1216,14 @@ static int __split_and_process_non_flush(struct 
>> clone_info *ci)
>> 
>>  len = min_t(sector_t, max_io_len(ci->sector, ti), ci->sector_count);
>> 
>> +if (len < ci->sector_count) {
>> +struct bio *split = bio_split(bio, len, GFP_NOIO, fs_bio_set);
>
> fs_bio_set is a shared bio set, so it is prone to deadlocks. For this 
> change, we would need two bio sets per dm device, one for the split bio 
> and one for the outgoing bio. (this also means having one more kernel 
> thread per dm device)

Yes, two local bio_sets would be best.
But we don't really need those extra kernel threads.  I'll start working
on patches to make them optional, and then to start removing them.

Thanks,
NeilBrown



Re: [PATCH net] team: use ETH_MAX_MTU as max mtu

2017-03-08 Thread David Miller
From: Jarod Wilson 
Date: Mon,  6 Mar 2017 08:48:58 -0500

> This restores the ability to set a team device's mtu to anything higher
> than 1500. Similar to the reported issue with bonding, the team driver
> calls ether_setup(), which sets an initial max_mtu of 1500, while the
> underlying hardware can handle something much larger. Just set it to
> ETH_MAX_MTU to support all possible values, and the limitations of the
> underlying devices will prevent setting anything too large.
> 
> Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra")
> CC: Cong Wang 
> CC: Jiri Pirko 
> CC: net...@vger.kernel.org
> Signed-off-by: Jarod Wilson 

Applied and queued up for -stable, thanks.
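
To illustrate the reasoning in the commit message, here is a simplified
userspace model of core MTU range checking with one stacked device. It is a
sketch under stated assumptions: struct model_netdev and model_set_mtu() are
invented for this example, a single lower device stands in for the team
ports, and only the ETH_* constants are taken from the kernel headers.

```c
#include <errno.h>
#include <stddef.h>

#define ETH_MIN_MTU	68
#define ETH_DATA_LEN	1500
#define ETH_MAX_MTU	0xFFFFU

/* Minimal model of a net device with an optional underlying device. */
struct model_netdev {
	unsigned int min_mtu;
	unsigned int max_mtu;
	unsigned int mtu;
	struct model_netdev *lower;
};

/* Range-check against this device first, then against the lower one. */
static int model_set_mtu(struct model_netdev *dev, unsigned int new_mtu)
{
	if (new_mtu < dev->min_mtu || new_mtu > dev->max_mtu)
		return -EINVAL;

	if (dev->lower) {
		int err = model_set_mtu(dev->lower, new_mtu);

		if (err)
			return err;
	}

	dev->mtu = new_mtu;
	return 0;
}
```

With max_mtu left at ETH_DATA_LEN on the upper device, a 9000-byte request
is rejected before the hardware is consulted; with ETH_MAX_MTU the lower
device's own limit decides.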


Re: [PATCH 3/4] phy: rockchip-typec: support DP phy switch

2017-03-08 Thread Brian Norris
Hi,

On Thu, Mar 09, 2017 at 02:02:54AM +0100, Heiko Stuebner wrote:
> Am Mittwoch, 8. März 2017, 16:39:23 CET schrieb Brian Norris:
> > On Fri, Feb 10, 2017 at 03:44:13PM +0800, Chris Zhong wrote:
> > > There are 2 Type-C PHYs in RK3399, but only one DP controller. Hence
> > > only one PHY can connect to the DP controller at a time; the other should
> > > be disconnected. The GRF_SOC_CON26 register has a switch bit for this:
> > > setting the bit enables PHY 1, clearing it enables PHY 0.
> > > 
> > > Signed-off-by: Chris Zhong 
> > > ---
> > > 
> > >  drivers/phy/phy-rockchip-typec.c | 9 +
> > >  1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/drivers/phy/phy-rockchip-typec.c
> > > b/drivers/phy/phy-rockchip-typec.c index 7cfb0f8..1604aaa 100644
> > > --- a/drivers/phy/phy-rockchip-typec.c
> > > +++ b/drivers/phy/phy-rockchip-typec.c

...

> > > @@ -869,6 +873,11 @@ static int tcphy_parse_dt(struct rockchip_typec_phy
> > > *tcphy,
> > >   if (ret)
> > >   return ret;
> > > 
> > > + ret = tcphy_get_param(dev, >uphy_dp_sel,
> > > +   "rockchip,uphy-dp-sel");
> > > + if (ret)
> > > + return ret;
> > 
> > What about existing device trees? You're essentially adding this
> > new property and requiring it at the same time.
> > 
> > Or are we considering no RK3399 DP stable at the moment? I guess we
> > haven't actually merged any device trees that support this yet, no?
> 
> An interesting situation we're in here. On the one hand, you're right, this
> breaks "backwards compatibility".
> 
> But on the other hand, the type-c phy is currently very much unused. The only
> current board rk3399-evb.dts does not enable them (so they're disabled
> everywhere) and we have neither dwc3 nor dp nodes in any rk3399 devicetrees so
> far. Also Rob was ok with the binding change :-) .
> 
> So from my pov, I'd say it _should_ be ok, as nothing is using the phys at all
> yet and thus there is nothing that could get broken.

Yeah, I guess it's OK... but BTW out-of-tree DTs are perfectly
legit, once the bindings are accepted.

Another random point of contention (not worth too much, as the pattern
is already set), but why do these deserve DT properties at all? The
device already has a "rk3399" compatible property, so can't we derive
GRF offsets from that?
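
As an illustration of what the switch bit in the commit message amounts to:
Rockchip GRF registers conventionally use the upper 16 bits of a write as a
per-bit write-enable mask for the lower 16 bits. Assuming GRF_SOC_CON26
follows that convention, the value to write can be computed as below. The
bit index is a placeholder, since the thread does not state it, and
grf_dp_sel_value() is a name invented for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Build the value for a hiword-mask register write: bit (n + 16) enables
 * the update of bit n, so unrelated bits are left untouched.
 * 'bit' must be below 16.
 */
static uint32_t grf_dp_sel_value(unsigned int bit, bool select_phy1)
{
	uint32_t write_enable = UINT32_C(1) << (bit + 16);
	uint32_t value = (select_phy1 ? UINT32_C(1) : UINT32_C(0)) << bit;

	return write_enable | value;
}
```

Whether the bit position comes from a DT property or from a per-compatible
table, as Brian asks, only changes where the 'bit' argument is looked up.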

Brian


[GIT PULL] xen features and fixes for 4.11 rc1

2017-03-08 Thread Juergen Gross
Linus,

Please git pull the following tag:

 git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git 
for-linus-4.11-rc1-tag

features and fixes for 4.11 rc1

It contains one fix for MSIX handling under Xen and a trivial cleanup
patch.

Thanks.

Juergen

 arch/x86/pci/xen.c   | 23 +++
 drivers/xen/xenbus/xenbus_dev_frontend.c |  1 -
 2 files changed, 7 insertions(+), 17 deletions(-)

Dan Streetman (1):
  xen: do not re-use pirq number cached in pci device msi msg data

Masanari Iida (1):
  xenbus: Remove duplicate inclusion of linux/init.h


Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, done}

2017-03-08 Thread Dan Williams
On Mon, Mar 6, 2017 at 12:22 AM, Heiko Carstens
 wrote:
> Hello Dan,
>
>> > If you look at commit 5e33bc4165f3 ("driver core / ACPI: Avoid device hot
>> > remove locking issues") then lock_device_hotplug_sysfs() was introduced to
>> > avoid a different subtle deadlock, but it also sleeps uninterruptible, but
>> > not for more than 5ms ;)
>> >
>> > However I'm not sure if the device hotplug lock should also be used to fix
>> > an unrelated bug that was introduced with the get_online_mems() /
>> > put_online_mems() interface. Should it?
>>
>> No, I don't think it should.
>>
>> I like your proposed direction of creating a new lock internal to
>> mem_hotplug_begin() to protect active_writer, and stop relying on
>> lock_device_hotplug to serve this purpose.
>>
>> > If so, we need to sprinkle around a couple of lock_device_hotplug() calls
>> > near mem_hotplug_begin() calls, like Sebastian already started, and give it
>> > additional semantics (protecting mem_hotplug.active_writer), and hope it
>> > doesn't lead to deadlocks anywhere.
>>
>> I'll put your proposed patch through some testing.
>
> On s390 it _seems_ to work. Did it pass your testing too?
> If so I would send a patch with proper patch description for inclusion.

Looks ok here. No lockdep warnings running it through its paces with
the persistent memory use case.


Re: [f2fs-dev] [PATCH] f2fs: allocate a bio for discarding when actually issuing it

2017-03-08 Thread Chao Yu
Hi Jaegeuk,

On 2017/3/8 10:33, Jaegeuk Kim wrote:
> Let's allocate a bio when issuing discard commands later.
> 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/f2fs.h|   4 +-
>  fs/f2fs/segment.c | 113 
> --
>  2 files changed, 62 insertions(+), 55 deletions(-)
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index a58c2e43bd2a..870bb4d9bc65 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -197,10 +197,12 @@ enum {
>  struct discard_cmd {
>   struct list_head list;  /* command list */
>   struct completion wait; /* compleation */
> + struct block_device *bdev;  /* bdev */
>   block_t lstart; /* logical start address */
> + block_t start;  /* actual start address in dev */
>   block_t len;/* length */
> - struct bio *bio;/* bio */
>   int state;  /* state */
> + int error;  /* bio error */
>  };
>  
>  struct discard_cmd_control {
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 50c65cc4645a..d8f9e6c895cd 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -636,7 +636,8 @@ static void locate_dirty_segment(struct f2fs_sb_info *sbi, unsigned int segno)
>  }
>  
>  static void __add_discard_cmd(struct f2fs_sb_info *sbi,
> - struct bio *bio, block_t lstart, block_t len)
> + struct block_device *bdev, block_t lstart,
> + block_t start, block_t len)
>  {
>   struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
>   struct list_head *cmd_list = &(dcc->discard_cmd_list);
> @@ -644,9 +645,9 @@ static void __add_discard_cmd(struct f2fs_sb_info *sbi,
>  
>   dc = f2fs_kmem_cache_alloc(discard_cmd_slab, GFP_NOFS);
>   INIT_LIST_HEAD(&dc->list);
> - dc->bio = bio;
> - bio->bi_private = dc;
> + dc->bdev = bdev;
>   dc->lstart = lstart;
> + dc->start = start;
>   dc->len = len;
>   dc->state = D_PREP;
>   init_completion(&dc->wait);
> @@ -658,22 +659,66 @@ static void __add_discard_cmd(struct f2fs_sb_info *sbi,
>  
>  static void __remove_discard_cmd(struct f2fs_sb_info *sbi, struct discard_cmd *dc)
>  {
> - int err = dc->bio->bi_error;
> -
>   if (dc->state == D_DONE)
>   atomic_dec(&(SM_I(sbi)->dcc_info->submit_discard));
>  
> - if (err == -EOPNOTSUPP)
> - err = 0;
> + if (dc->error == -EOPNOTSUPP)
> + dc->error = 0;
>  
> - if (err)
> + if (dc->error)
>   f2fs_msg(sbi->sb, KERN_INFO,
> - "Issue discard failed, ret: %d", err);
> - bio_put(dc->bio);
> + "Issue discard failed, ret: %d", dc->error);
>   list_del(&dc->list);
>   kmem_cache_free(discard_cmd_slab, dc);
>  }
>  
> +static void f2fs_submit_discard_endio(struct bio *bio)
> +{
> + struct discard_cmd *dc = (struct discard_cmd *)bio->bi_private;
> +
> + complete(&dc->wait);
> + dc->error = bio->bi_error;
> + dc->state = D_DONE;
> + bio_put(bio);
> +}
> +
> +/* this function is copied from blkdev_issue_discard from block/blk-lib.c */
> +static int __submit_discard_cmd(struct discard_cmd *dc)
> +{
> + struct bio *bio = NULL;
> + int err;
> +
> + err = __blkdev_issue_discard(dc->bdev,
> + SECTOR_FROM_BLOCK(dc->start),
> + SECTOR_FROM_BLOCK(dc->len),
> + GFP_NOFS, 0, &bio);
> + if (!err && bio) {
> + bio->bi_private = dc;
> + bio->bi_end_io = f2fs_submit_discard_endio;
> + bio->bi_opf |= REQ_SYNC;
> + submit_bio(bio);

Should we set the flag only if __blkdev_issue_discard() is successful?

dc->state = D_SUBMIT;

And how about moving atomic_inc(>submit_discard); here?

Thanks,

> + }
> + dc->state = D_SUBMIT;
> + return err;
> +}
> +
> +static int __queue_discard_cmd(struct f2fs_sb_info *sbi,
> + struct block_device *bdev, block_t blkstart, block_t blklen)
> +{
> + block_t lblkstart = blkstart;
> +
> + trace_f2fs_issue_discard(bdev, blkstart, blklen);
> +
> + if (sbi->s_ndevs) {
> + int devi = f2fs_target_device_index(sbi, blkstart);
> +
> + blkstart -= FDEV(devi).start_blk;
> + }
> + __add_discard_cmd(sbi, bdev, lblkstart, blkstart, blklen);
> + wake_up(&SM_I(sbi)->dcc_info->discard_wait_queue);
> + return 0;
> +}
> +
>  /* This should be covered by global mutex, &sit_i->sentry_lock */
>  void f2fs_wait_discard_bio(struct f2fs_sb_info *sbi, block_t blkaddr)
>  {
> @@ -690,8 +735,7 @@ void f2fs_wait_discard_bio(struct f2fs_sb_info *sbi, block_t blkaddr)
>  
>   if (blkaddr == NULL_ADDR) {
>   if (dc->state == D_PREP) {
> - dc->state = D_SUBMIT;
> - submit_bio(dc->bio);

Re: [PATCH v5 2/5] powerpc: kretprobes: override default function entry offset

2017-03-08 Thread Michael Ellerman
"Naveen N. Rao"  writes:
> On 2017/03/08 11:29AM, Arnaldo Carvalho de Melo wrote:
>> > I wasn't sure if you were planning on picking up KPROBES_ON_FTRACE for 
>> > v4.11. If so, it would be good to take this patch through the powerpc 
>> > tree. Otherwise, this can go via Ingo's tree.
>> 
>> If you guys convince Ingo that this should go _now_, then just cherry
>> pick what was merged into tip/perf/core that is needed for the arch
>> specific stuff and go from there.
>
> Ok, in hindsight, I think Michael's concern was actually for v4.12 

Yes I was talking about 4.12, sorry I thought that was implied :)

> itself, in which case this particular patch can go via powerpc tree, 
> while the rest of the patches in this series can go via your tree.
>
> Michael?

Yeah I think that's the easiest option. The function will be temporarily
unused until the two trees are merged, but I think that's fine.

cheers


Re: [PATCH 3/4] phy: rockchip-typec: support DP phy switch

2017-03-08 Thread Heiko Stübner
Am Mittwoch, 8. März 2017, 16:39:23 CET schrieb Brian Norris:
> On Fri, Feb 10, 2017 at 03:44:13PM +0800, Chris Zhong wrote:
> > There are 2 Type-c PHYs in RK3399, but only one DP controller. Hence
> > only one PHY can connect to DP controller at one time, the other should
> > be disconnected. The GRF_SOC_CON26 register has a switch bit to do it,
> > set this bit means enable PHY 1, clear this bit means enable PHY 0.
> > 
> > Signed-off-by: Chris Zhong 
> > ---
> > 
> >  drivers/phy/phy-rockchip-typec.c | 9 +
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/drivers/phy/phy-rockchip-typec.c
> > b/drivers/phy/phy-rockchip-typec.c index 7cfb0f8..1604aaa 100644
> > --- a/drivers/phy/phy-rockchip-typec.c
> > +++ b/drivers/phy/phy-rockchip-typec.c
> > @@ -267,6 +267,7 @@ struct rockchip_usb3phy_port_cfg {
> > 
> > struct usb3phy_reg usb3tousb2_en;
> > struct usb3phy_reg external_psm;
> > struct usb3phy_reg pipe_status;
> > 
> > +   struct usb3phy_reg uphy_dp_sel;
> > 
> >  };
> >  
> >  struct rockchip_typec_phy {
> > 
> > @@ -736,6 +737,7 @@ static const struct phy_ops rockchip_usb3_phy_ops = {
> > 
> >  static int rockchip_dp_phy_power_on(struct phy *phy)
> >  {
> >  
> > struct rockchip_typec_phy *tcphy = phy_get_drvdata(phy);
> > 
> > +   struct rockchip_usb3phy_port_cfg *cfg = &tcphy->port_cfgs;
> > 
> > int new_mode, ret = 0;
> > u32 val;
> > 
> > @@ -766,6 +768,8 @@ static int rockchip_dp_phy_power_on(struct phy *phy)
> > 
> > tcphy_phy_init(tcphy, new_mode);
> > 
> > }
> > 
> > +   property_enable(tcphy, &cfg->uphy_dp_sel, 1);
> > +
> > 
> > ret = readx_poll_timeout(readl, tcphy->base + DP_MODE_CTL,
> 
> Idea for future work: this should just be readl_poll_timeout() here, and
> throughout the driver.
> 
> >  val, val & DP_MODE_A2, 1000,
> >  PHY_MODE_SET_TIMEOUT);
> > 
> > @@ -869,6 +873,11 @@ static int tcphy_parse_dt(struct rockchip_typec_phy *tcphy,
> > if (ret)
> > 
> > return ret;
> > 
> > +   ret = tcphy_get_param(dev, &cfg->uphy_dp_sel,
> > + "rockchip,uphy-dp-sel");
> > +   if (ret)
> > +   return ret;
> 
> What about existing device trees? You're essentially adding this
> new property and requiring it at the same time.
> 
> Or are we considering no RK3399 DP stable at the moment? I guess we
> haven't actually merged any device trees that support this yet, no?

An interesting situation we're in here. On the one hand, you're right this 
breaks "backwards compatibility".

But on the other hand, the type-c phy is currently very much unused. The only 
current board rk3399-evb.dts does not enable them (so they're disabled 
everywhere) and we have neither dwc3 nor dp nodes in any rk3399 devicetrees so 
far. Also Rob was ok with the binding change :-) .

So from my pov, I'd say it _should_ be ok, as nothing is using the phys at all 
yet and thus there is nothing that could get broken.


Heiko

> 
> Brian
> 
> > +
> > 
> > tcphy->grf_regs = syscon_regmap_lookup_by_phandle(dev->of_node,
> > 
> >   "rockchip,grf");
> > 
> > if (IS_ERR(tcphy->grf_regs)) {




Re: [PATCH -mm -v6 2/9] mm, memcg: Support to charge/uncharge multiple swap entries

2017-03-08 Thread Huang, Ying
Balbir Singh  writes:

> On Wed, 2017-03-08 at 15:26 +0800, Huang, Ying wrote:
>> From: Huang Ying 
>> 
>> This patch makes it possible to charge or uncharge a set of contiguous
>> swap entries in the swap cgroup.  The number of swap entries is
>> specified via an added parameter.
>> 
>> This will be used for the THP (Transparent Huge Page) swap support,
>> where a swap cluster backing a THP may be allocated and freed as a
>> whole.  So a set of (HPAGE_PMD_NR) contiguous swap entries backing one
>> THP needs to be charged or uncharged together.  This will batch the
>> cgroup operations for the THP swap too.
>
> A quick look at the patches makes them look sane. I wonder if it would
> make sense to track THP swapout separately as well
> (from a memory.stat perspective)

The patchset is just the first step of THP swap optimization.  So the
THP will still be split after putting the THP into the swap cache.  This
makes it unnecessary to change mem_cgroup_swapout().  I am working on a
follow-up patchset to further delay THP splitting until after swapping
out the THP to the disk.  In that patchset, I will change
mem_cgroup_swapout() too.

Best Regards,
Huang, Ying

> Balbir Singh


Re: [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates

2017-03-08 Thread Andrew Banman

Hi Ingo and Thomas,

Are these patches acceptable to you? We want to get these upstream as soon as 
possible, so please send along any more comments you have. If you're annoyed by 
the format of the emails just let me know and I'll resubmit.


Thank you,

Andrew

On 2/17/17 4:06 PM, Andrew Banman wrote:

The following patch series adds the necessary functionality to make the BAU
on UV4 operational. The purpose of these patches is to implement the correct
message completion logic on UV4. Also included is a bug fix to add a field
to the INTD payload. This is needed to verify the source of each message.

As of this patch set, the BAU operates without errors and performance tests
show TLB shootdowns take up to 42% less time with the BAU enabled.

The patches are summarized as follows:

(1) Populate a message payload field to verify messages at the destination.
Without this verification, the destination agent triggers a HUB error,
resulting in an NMI.

[PATCH 1/6] x86/platform/uv/BAU: Add uv_bau_version enumerated
[PATCH v2 2/6] x86/platform/uv/BAU: Add payload descriptor qualifier

This bug fix is included at the start of the series to avoid conflicts
in a code path shared by the rest of the series.

(2) Make the wait_completion routine part of the bau_operations interface,
and add a uv4_wait_completion routine to employ new completion logic.

The message completion logic for previous generations relies on software-
defined timeouts that are not implemented on UV4. Without these patches,
the BAU driver on UV4 erroneously identifies a UV2-WAR timeout during
normal operation.

[PATCH 3/6] x86/platform/uv/BAU: Cleanup bau_operations declaration
[PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to
[PATCH v2 5/6] x86/platform/uv/BAU: Add wait_completion to
[PATCH v2 6/6] x86/platform/uv/BAU: Implement uv4_wait_completion with


Please see the commit messages for details on the motivation and content of
each patch.

Thank you,

Andrew Banman
HPE, Linux Kernel Engineer




Re: [PATCH 6/6] kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injection

2017-03-08 Thread Wanpeng Li
2016-12-20 0:17 GMT+08:00 Paolo Bonzini :
> Since bf9f6ac8d749 ("KVM: Update Posted-Interrupts Descriptor when vCPU
> is blocked", 2015-09-18) the posted interrupt descriptor is checked
> unconditionally for PIR.ON.  Therefore we don't need KVM_REQ_EVENT to
> trigger the scan and, if NMIs or SMIs are not involved, we can avoid
> the complicated event injection path.
>
> Calling kvm_vcpu_kick if PIR.ON=1 is also useless, though it has been
> there since APICv was introduced.
>
> However, without the KVM_REQ_EVENT safety net KVM needs to be much
> more careful about races between vmx_deliver_posted_interrupt and
> vcpu_enter_guest.  First, the IPI for posted interrupts may be issued
> between setting vcpu->mode = IN_GUEST_MODE and disabling interrupts.
> If that happens, kvm_trigger_posted_interrupt returns true, but
> smp_kvm_posted_intr_ipi doesn't do anything about it.  The guest is
> entered with PIR.ON, but the posted interrupt IPI has not been sent
> and the interrupt is only delivered to the guest on the next vmentry
> (if any).  To fix this, disable interrupts before setting vcpu->mode.
> This ensures that the IPI is delayed until the guest enters non-root mode;
> it is then trapped by the processor causing the interrupt to be injected.
>
> Second, the IPI may be issued between
>
> kvm_x86_ops->hwapic_irr_update(vcpu,
> kvm_lapic_find_highest_irr(vcpu));
>
> and vcpu->mode = IN_GUEST_MODE.  In this case, kvm_vcpu_kick is called
> but it (correctly) doesn't do anything because it sees vcpu->mode ==
> OUTSIDE_GUEST_MODE.  Again, the guest is entered with PIR.ON but no
> posted interrupt IPI is pending; this time, the fix for this is to move
> the RVI update after IN_GUEST_MODE.
>
> Both issues were previously masked by the liberal usage of KVM_REQ_EVENT.
> In both race scenarios KVM_REQ_EVENT would cancel guest entry, resulting
> in another vmentry which would inject the interrupt.
>
> This saves about 300 cycles on the self_ipi_* tests of vmexit.flat.
>
> Signed-off-by: Paolo Bonzini 
> ---
>  arch/x86/kvm/lapic.c | 11 ---
>  arch/x86/kvm/vmx.c   |  8 +---
>  arch/x86/kvm/x86.c   | 44 +---
>  3 files changed, 34 insertions(+), 29 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index f644dd1dbe71..5ea94b622e88 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -385,12 +385,8 @@ int __kvm_apic_update_irr(u32 *pir, void *regs)
>  int kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
>  {
> struct kvm_lapic *apic = vcpu->arch.apic;
> -   int max_irr;
>
> -   max_irr = __kvm_apic_update_irr(pir, apic->regs);
> -
> -   kvm_make_request(KVM_REQ_EVENT, vcpu);
> -   return max_irr;
> +   return __kvm_apic_update_irr(pir, apic->regs);
>  }
>  EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
>
> @@ -423,9 +419,10 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
> vcpu = apic->vcpu;
>
> if (unlikely(vcpu->arch.apicv_active)) {
> -   /* try to update RVI */
> +   /* need to update RVI */
> apic_clear_vector(vec, apic->regs + APIC_IRR);
> -   kvm_make_request(KVM_REQ_EVENT, vcpu);
> +   kvm_x86_ops->hwapic_irr_update(vcpu,
> +   apic_find_highest_irr(apic));
> } else {
> apic->irr_pending = false;
> apic_clear_vector(vec, apic->regs + APIC_IRR);
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 27e40b180242..3dd4fad35a3e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -5062,9 +5062,11 @@ static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
> if (pi_test_and_set_pir(vector, &vmx->pi_desc))
> return;
>
> -   r = pi_test_and_set_on(&vmx->pi_desc);
> -   kvm_make_request(KVM_REQ_EVENT, vcpu);
> -   if (r || !kvm_vcpu_trigger_posted_interrupt(vcpu))
> +   /* If a previous notification has sent the IPI, nothing to do.  */
> +   if (pi_test_and_set_on(&vmx->pi_desc))
> +   return;
> +
> +   if (!kvm_vcpu_trigger_posted_interrupt(vcpu))
> kvm_vcpu_kick(vcpu);
>  }
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c666414adc1d..725473ba6dd3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6710,19 +6710,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> kvm_hv_process_stimers(vcpu);
> }
>
> -   /*
> -* KVM_REQ_EVENT is not set when posted interrupts are set by
> -* VT-d hardware, so we have to update RVI unconditionally.
> -*/
> -   if (kvm_lapic_enabled(vcpu)) {
> -   /*
> -* Update architecture specific hints for APIC
> -* virtual interrupt delivery.
> -*/
> -   

Re: [v2 PATCH 3/3] mmc: sdhci-cadence: Update PHY delay configuration

2017-03-08 Thread Masahiro Yamada
Hi Piotr,

2017-03-07 20:00 GMT+09:00 Piotr Sroka :
> Hi Masahiro,
>
>> -Original Message-
>> Sent: 07 March, 2017 9:03 AM
>> To: Piotr Sroka
>> Subject: Re: [v2 PATCH 3/3] mmc: sdhci-cadence: Update PHY delay 
>> configuration
>>
>> Hi Piotr,
>>
>> 2017-03-06 22:39 GMT+09:00 Piotr Sroka :
>> > PHY settings can be different for different platforms and SoCs.
>> > Fixed PHY input delays were replaced with SoC specific compatible data.
>> > DTS properties are used for configuring the new PHY DLL delays.
>>
>>
>> Probably you are familiar with this IP.
>>
>> Please teach me this.
>>
>> With this patch, we will have two groups for PHY parameters.
>>
>> (A) specified via a data array associated with a compatible string
>> SDHCI_CDNS_PHY_DLY_SD_HS
>> SDHCI_CDNS_PHY_DLY_SD_DEFAULT
>> SDHCI_CDNS_PHY_DLY_UHS_SDR12
>> SDHCI_CDNS_PHY_DLY_UHS_SDR25
>> SDHCI_CDNS_PHY_DLY_UHS_SDR50
>> SDHCI_CDNS_PHY_DLY_UHS_DDR50
>> SDHCI_CDNS_PHY_DLY_EMMC_LEGACY
>> SDHCI_CDNS_PHY_DLY_EMMC_SDR
>> SDHCI_CDNS_PHY_DLY_EMMC_DDR
>>
>> (B) specified with DT property
>> SDHCI_CDNS_PHY_DLY_SDCLK
>> SDHCI_CDNS_PHY_DLY_HSMMC
>> SDHCI_CDNS_PHY_DLY_STROBE
>>
>> I am confused.
>> What is the difference between (A) and (B)?
>
> The first group of delays are input delays. These delays are set in the
> current version of the sdhci-cadence driver, in the sdhci_cdns_phy_init()
> function.
> According to the spec:
> They are provided to help meet the timing relations between the data window
> and the sampling clock.
> The clock is in a fixed position with respect to SDCLK, and the idea of
> sampling is to delay and align the data to the data window.
> If the default values of the delays are not sufficient/correct for the
> chip/board implementation, they can be adjusted.
>
> The second group are DLL delays.
> There are three delays:
> SDHCI_CDNS_PHY_DLY_SDCLK  - sdclk delay line used to delay the outgoing sdclk
> signal
> SDHCI_CDNS_PHY_DLY_HSMMC  - sdclk delay line used to delay the outgoing sdclk
> signal for HS200, HS400 and HS400ES
> SDHCI_CDNS_PHY_DLY_STROBE - DLL strobe delay for HS400ES
> According to the spec:
> They allow setting up the basic DLL parameters. In general, the default values
> are sufficient to start working in any speed mode. The default values of the
> delays and phase detect select can be adjusted depending on the chip/board
> implementation.



It was not clear what makes one group different from the other.

After all, parameters from both groups
should be adjusted depending on chip/board implementation.


> In general, all PHY delay values should either be properly hardcoded in HW or
> be properly set by FW depending on the chip/board.
> So the PHY driver should either not touch the PHY delays at all, or set values
> that are proper for the specific chip/board.
>
> I am not sure where exactly they should be placed: in the dts file or in the
> compatible data.

I am not quite sure, either.
(comments are appreciated.)


FWIW:
The first group (data associated with compatible) allows per-chip adjustment,
but not per-board.   Pros are, we will not break DT compatibility,
and we can avoid a list of properties difficult to understand.
Cons are, we can not make fine-grained adjustment for each board.


The second group (DT property) gives more flexibility for per-chip and
per-board adjustment.
A bad thing is we will end up with specifying a bunch of mysterious
properties from DT.





-- 
Best Regards
Masahiro Yamada


Re: Compat 32-bit syscall entry from 64-bit task!?

2017-03-08 Thread Andrew Lutomirski
On Wed, Mar 8, 2017 at 3:41 PM, Dmitry V. Levin  wrote:
> Hi,
>
> On Thu, Jan 26, 2012 at 07:03:43PM +0100, Denys Vlasenko wrote:
>> Hi Linus,
>>
>> On Thu, Jan 26, 2012 at 4:47 AM, Linus Torvalds
>>  wrote:
>> >> Please look at strace source, get_scno() function, where
>> >> it reads syscall no and parameters. Let's see
>> >> - POWERPC: has 32-bit and 64-bit mode
>> >> - X86_64: has 32-bit and 64-bit mode
>> >> - IA64: has i386-compat mode
>> >> - ARM: has more than one ABI
>> >> - SPARC: has 32-bit and 64-bit mode
>> >>
>> >> Do you want to re-invent a different arch-specific way to report
>> >> syscall type for each of these arches?
>> >
>> > I think an arch-specific one is better than trying to make some
>> > generic one that is messy.
>> >
>> > As you say, many architectures have multiple system call ABIs.
>> >
>> > But they tend to be very *different* issues. They can be about
>> > multiple ABI's, as you mention, and even when they *look* similar
>> > (32-bit vs 64-bit ABI's) they are actually totally different issues.
>> > [skip]
>>
>> I don't have a particular attachment to my solution,
>> and I think we already talk about this problem for
>> far too long.
>>
>> Looks like nobody is _strongly_ opposed to your patch
>> which uses a few bits in eflags to report bitness
>> of the x86 syscall.
>>
>> Lets just do that already. If you commit it to kernel git,
>> I will immediately change strace accordingly.
>
> Is there any progress with this (or any alternative) solution?
>
> I see the kernel side has changed a bit, and the strace part
> is in a better shape than 5 years ago (although I'm biased of course),
> but I don't see any kernel interface that would allow strace to reliably
> recognize this 0x80 case.

I am strongly opposed to fudging registers to half-arsedly slightly
improve the epicly crappy ptrace(2) interface for syscalls.

To fix this right, please just add PTRACE_GET_SYSCALL_INFO or similar
to, in one shot, read out all the syscall details.  This means: arch,
no, arg0..arg5, and *whether it's entry or exit*.  I propose returning
this structure:

struct ptrace_syscall_info {
    u8 op;  /* 0 for entry, 1 for exit */
    u8 pad0;
    u16 pad1;
    u32 pad2;
    union {
        struct seccomp_data syscall_entry;
        s64 syscall_exit_retval;
    };
};

because struct seccomp_data already gets this right.  There's plenty
of opportunity to fine-tune this.  Now it works on all architectures.

Since struct seccomp_data may be extended in the future, the operation
should be:

ptrace(PTRACE_GET_SYSCALL_INFO, pid,
       (void *)sizeof(struct ptrace_syscall_info), &info);

returns 0 on success and some error code if, for example, the current
ptrace stop isn't a syscall entry or exit.
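
To make the proposal above concrete, here is a minimal user-space sketch of how a tracer might decode such a record. It is only an illustration under stated assumptions: the `demo_*` names are hypothetical, `struct demo_seccomp_data` is a local stand-in that mirrors the layout of struct seccomp_data, and no ptrace request is actually issued. The point is that the arch field alone tells a tracer which syscall table applies, which covers the int 0x80 from a 64-bit task case.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in mirroring the layout of struct seccomp_data. */
struct demo_seccomp_data {
	int32_t nr;                   /* syscall number */
	uint32_t arch;                /* AUDIT_ARCH_* value of the entry path */
	uint64_t instruction_pointer;
	uint64_t args[6];
};

/* Hypothetical user-space copy of the structure proposed above. */
struct demo_syscall_info {
	uint8_t op;                   /* 0 for entry, 1 for exit */
	uint8_t pad0;
	uint16_t pad1;
	uint32_t pad2;
	union {
		struct demo_seccomp_data syscall_entry;
		int64_t syscall_exit_retval;
	};
};

enum { DEMO_OP_ENTRY = 0, DEMO_OP_EXIT = 1 };

/* Same numeric values as AUDIT_ARCH_I386 and AUDIT_ARCH_X86_64. */
#define DEMO_AUDIT_ARCH_I386   0x40000003u
#define DEMO_AUDIT_ARCH_X86_64 0xC000003Eu

/* Format one syscall stop into buf; returns the snprintf() result. */
static int demo_describe(const struct demo_syscall_info *info,
			 char *buf, size_t len)
{
	if (info->op == DEMO_OP_ENTRY) {
		const struct demo_seccomp_data *sd = &info->syscall_entry;
		const char *abi = "other";

		/* The arch field, not the task's bitness, selects the table. */
		if (sd->arch == DEMO_AUDIT_ARCH_I386)
			abi = "i386";
		else if (sd->arch == DEMO_AUDIT_ARCH_X86_64)
			abi = "x86_64";

		return snprintf(buf, len,
				"entry abi=%s nr=%" PRId32 " arg0=%" PRIu64,
				abi, sd->nr, sd->args[0]);
	}

	return snprintf(buf, len, "exit ret=%" PRId64,
			info->syscall_exit_retval);
}
```

A tracer would fill the structure from the kernel at each syscall stop and call something like demo_describe() on it; because op is part of the record, it no longer has to track entry versus exit on its own.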

--Andy


Re: [PATCH net] dccp/tcp: fix routing redirect race

2017-03-08 Thread Eric Dumazet
On Thu, 2017-03-09 at 14:42 +1100, Jonathan Maxwell wrote:
> Sorry let me resend in plain text mode.
> 
> On Thu, Mar 9, 2017 at 1:10 PM, Eric Dumazet  wrote:
> > On Thu, 2017-03-09 at 12:15 +1100, Jon Maxwell wrote:
> >> We have seen a few incidents lately where a dst_entry has been freed
> >> with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
> >> dst_entry. If the conditions/timings are right a crash then ensues when the
> >> freed dst_entry is referenced later on. A common crashing back trace is:
> >
> > Very nice catch !
> >
> 
> Thanks Eric.
> 
> > Don't we have a similar issue for IPv6 ?
> >
> >
> 
> Good point.
> 
> We checked and as far as we can tell IPv6 does not invalidate the route.
> So it should be safer.

Simply doing :

__sk_dst_check(sk, np->dst_cookie);

is racy, even before calling dst->ops->redirect(dst, sk, skb);

(if socket is owned by user)





Re: [PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-08 Thread Minchan Kim
On Wed, Mar 08, 2017 at 08:58:02AM +0100, Johannes Thumshirn wrote:
> On 03/08/2017 06:11 AM, Minchan Kim wrote:
> > And could you test this patch? It avoids split bio so no need new bio
> > allocations and makes zram code simple.
> > 
> > From f778d7564d5cd772f25bb181329362c29548a257 Mon Sep 17 00:00:00 2001
> > From: Minchan Kim 
> > Date: Wed, 8 Mar 2017 13:35:29 +0900
> > Subject: [PATCH] fix
> > 
> > Not-yet-Signed-off-by: Minchan Kim 
> > ---
> 
> [...]
> 
> Yup, this works here.
> 
> I did a mkfs.xfs /dev/nvme0n1
> dd if=/dev/urandom of=/test.bin bs=1M count=128
> sha256sum test.bin
> mount /dev/nvme0n1 /dir
> mv test.bin /dir/
> sha256sum /dir/test.bin
> 
> No panics and sha256sum of the 128MB test file still matches
> 
> Tested-by: Johannes Thumshirn 
> Reviewed-by: Johannes Thumshirn 

Thanks a lot, Johannes and Hannes!!

> 
> Now that you removed the one page limit in zram_bvec_rw() you can also
> add this hunk to remove the queue splitting:

Right. I added what you suggested with detailed description.

> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 85f4df8..27b168f6 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -868,8 +868,6 @@ static blk_qc_t zram_make_request(struct request_queue *queue, struct bio *bio)
>  {
> struct zram *zram = queue->queuedata;
> 
> -   blk_queue_split(queue, &bio, queue->bio_split);
> -
> if (!valid_io_request(zram, bio->bi_iter.bi_sector,
> bio->bi_iter.bi_size)) {
> atomic64_inc(&zram->stats.invalid_io);
> 
> Byte,
>   Johannes
> 

Jens, could you replace the one merged with this? And I don't want
to add a stable mark to this patch because I feel it needs enough
testing on a 64K page system, which I don't have. ;(

From bb73e75ab0e21016f60858fd61e7dc6a6813e359 Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Thu, 9 Mar 2017 14:00:40 +0900
Subject: [PATCH] zram: handle multiple pages attached bio's bvec

Johannes Thumshirn reported that the system panics when using the NVMe over
Fabrics loopback target with zram.

The reason is that zram expects each bvec in a bio to contain a single page,
but nvme can attach a huge bulk of pages to the bio's bvec, so zram's index
arithmetic can go wrong and the resulting out-of-bounds access causes the
panic.

This patch solves the problem by removing the limit (that a bvec should
contain only a single page).

Cc: Hannes Reinecke 
Reported-by: Johannes Thumshirn 
Tested-by: Johannes Thumshirn 
Reviewed-by: Johannes Thumshirn 
Signed-off-by: Johannes Thumshirn 
Signed-off-by: Minchan Kim 
---
I intentionally did not add a stable mark because I think it's rather risky
without enough testing on a 64K page system (i.e., the partial IO part).

Thanks for the help, Johannes and Hannes!!
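
For readers following the index arithmetic described in the commit message, here is a small stand-alone sketch comparing the old and new position updates. It is a simplified illustration, not the driver code: the `demo_*` names and the fixed 4096-byte page size are assumptions, and the bvec is reduced to its length.

```c
#include <stdint.h>

/* Assumed page size for the illustration; the kernel uses PAGE_SIZE. */
#define DEMO_PAGE_SIZE 4096u

/* Old logic: the index advances by at most one page per bvec. */
static void demo_update_position_old(uint32_t *index, unsigned int *offset,
				     unsigned int bv_len)
{
	if (*offset + bv_len >= DEMO_PAGE_SIZE)
		(*index)++;
	*offset = (*offset + bv_len) % DEMO_PAGE_SIZE;
}

/* New logic: the index advances by however many pages the bvec covers. */
static void demo_update_position_new(uint32_t *index, unsigned int *offset,
				     unsigned int bv_len)
{
	*index += (*offset + bv_len) / DEMO_PAGE_SIZE;
	*offset = (*offset + bv_len) % DEMO_PAGE_SIZE;
}
```

With a bvec of three pages starting at offset 512, the old helper moves the index forward by one while the new one moves it by three, so the next segment lands on the right zram page. For a bvec that fits in one page the two helpers agree.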

 drivers/block/zram/zram_drv.c | 37 ++---
 1 file changed, 10 insertions(+), 27 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 01944419b1f3..fefdf260503a 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -137,8 +137,7 @@ static inline bool valid_io_request(struct zram *zram,
 
 static void update_position(u32 *index, int *offset, struct bio_vec *bvec)
 {
-   if (*offset + bvec->bv_len >= PAGE_SIZE)
-   (*index)++;
+   *index  += (*offset + bvec->bv_len) / PAGE_SIZE;
*offset = (*offset + bvec->bv_len) % PAGE_SIZE;
 }
 
@@ -838,34 +837,20 @@ static void __zram_make_request(struct zram *zram, struct bio *bio)
}
 
bio_for_each_segment(bvec, bio, iter) {
-   int max_transfer_size = PAGE_SIZE - offset;
-
-   if (bvec.bv_len > max_transfer_size) {
-   /*
-* zram_bvec_rw() can only make operation on a single
-* zram page. Split the bio vector.
-*/
-   struct bio_vec bv;
-
-   bv.bv_page = bvec.bv_page;
-   bv.bv_len = max_transfer_size;
-   bv.bv_offset = bvec.bv_offset;
+   struct bio_vec bv = bvec;
+   unsigned int remained = bvec.bv_len;
 
+   do {
+   bv.bv_len = min_t(unsigned int, PAGE_SIZE, remained);
if (zram_bvec_rw(zram, &bv, index, offset,
-op_is_write(bio_op(bio))) < 0)
+   op_is_write(bio_op(bio))) < 0)
goto out;
 
-   bv.bv_len = bvec.bv_len - max_transfer_size;
-   bv.bv_offset += max_transfer_size;
-   if (zram_bvec_rw(zram, &bv, index + 1, 0,
- 

Re: [PATCH] mm: Do not use double negation for testing page flags

2017-03-08 Thread Minchan Kim
Hi Vlastimil,

On Wed, Mar 08, 2017 at 08:51:23AM +0100, Vlastimil Babka wrote:
> On 03/08/2017 06:25 AM, Minchan Kim wrote:
> > Hi Anshuman,
> > 
> > On Tue, Mar 07, 2017 at 09:31:18PM +0530, Anshuman Khandual wrote:
> >> On 03/07/2017 12:06 PM, Minchan Kim wrote:
> >>> From the discussion[1], it seems that all PageFlags functions
> >>> return bool at this moment, so we don't need double negation
> >>> any more.
> >>> Although it's not a problem to keep it, it could confuse future
> >>> users into using double negation for them, too.
> >>>
> >>> Remove such possibility.
> >>
> >> A quick search of '!!Page' in the source tree does not show any other
> >> place having this double negation. So I guess this is all which need
> >> to be fixed.
> > 
> > Yeb. That's why my patch includes only the khugepaged part, but my
> > concern is that PageFlags returns int type, not boolean, so users might
> > be easily confused and tempted to use double negation.
> > 
> > On the other side, those who create a new custom PageXXX (e.g., PageMovable)
> > should keep in mind that they should return 0 or 1 although the
> > function prototype's return type is int.
> 
> > It shouldn't be
> > documented nowhere.
> 
> Was this double negation intentional? :P

Nice catch!
It seems you have a crystal ball. ;-)

> 
> > Although we can add a little description
> > somewhere in page-flags.h, I believe changing to boolean is clearer
> > and less error-prone, so Chen's work is worthwhile, I think.
> 
> Agree, unless some arches benefit from the int by performance
> for some reason (no idea if it's possible).
> 
> Anyway, to your original patch:
> 
> Acked-by: Vlastimil Babka 

Thanks!


[RESEND PATCH v3 6/7] PCI: dwc: designware: Move _unroll configurations to a separate function

2017-03-08 Thread Kishon Vijay Abraham I
No functional change. Rename dw_pcie_writel_unroll/dw_pcie_readl_unroll
to dw_pcie_writel_ob_unroll/dw_pcie_readl_ob_unroll respectively as these
functions are used to perform only outbound configurations. Also move
these _unroll configurations to a separate function.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pcie-designware.c |  112 ++---
 1 file changed, 67 insertions(+), 45 deletions(-)

diff --git a/drivers/pci/dwc/pcie-designware.c 
b/drivers/pci/dwc/pcie-designware.c
index 557ee53..6657a84 100644
--- a/drivers/pci/dwc/pcie-designware.c
+++ b/drivers/pci/dwc/pcie-designware.c
@@ -92,22 +92,64 @@ void dw_pcie_write_dbi(struct dw_pcie *pci, void __iomem 
*base, u32 reg,
dev_err(pci->dev, "write DBI address failed\n");
 }
 
-static u32 dw_pcie_readl_unroll(struct dw_pcie *pci, void __iomem *base,
-   u32 index, u32 reg)
+static u32 dw_pcie_readl_ob_unroll(struct dw_pcie *pci, void __iomem *base,
+  u32 index, u32 reg)
 {
u32 offset = PCIE_GET_ATU_OUTB_UNR_REG_OFFSET(index);
 
return dw_pcie_read_dbi(pci, base, offset + reg, 0x4);
 }
 
-static void dw_pcie_writel_unroll(struct dw_pcie *pci, void __iomem *base,
- u32 index, u32 reg, u32 val)
+static void dw_pcie_writel_ob_unroll(struct dw_pcie *pci, void __iomem *base,
+u32 index, u32 reg, u32 val)
 {
u32 offset = PCIE_GET_ATU_OUTB_UNR_REG_OFFSET(index);
 
dw_pcie_write_dbi(pci, base, offset + reg, 0x4, val);
 }
 
+void dw_pcie_prog_outbound_atu_unroll(struct dw_pcie *pci, int index, int type,
+ u64 cpu_addr, u64 pci_addr, u32 size)
+{
+   u32 retries, val;
+   void __iomem *base = pci->dbi_base;
+
+   dw_pcie_writel_ob_unroll(pci, base, index,
+PCIE_ATU_UNR_LOWER_BASE,
+lower_32_bits(cpu_addr));
+   dw_pcie_writel_ob_unroll(pci, base, index,
+PCIE_ATU_UNR_UPPER_BASE,
+upper_32_bits(cpu_addr));
+   dw_pcie_writel_ob_unroll(pci, base, index, PCIE_ATU_UNR_LIMIT,
+lower_32_bits(cpu_addr + size - 1));
+   dw_pcie_writel_ob_unroll(pci, base, index,
+PCIE_ATU_UNR_LOWER_TARGET,
+lower_32_bits(pci_addr));
+   dw_pcie_writel_ob_unroll(pci, base, index,
+PCIE_ATU_UNR_UPPER_TARGET,
+upper_32_bits(pci_addr));
+   dw_pcie_writel_ob_unroll(pci, base, index,
+PCIE_ATU_UNR_REGION_CTRL1,
+type);
+   dw_pcie_writel_ob_unroll(pci, base, index,
+PCIE_ATU_UNR_REGION_CTRL2,
+PCIE_ATU_ENABLE);
+
+   /*
+* Make sure ATU enable takes effect before any subsequent config
+* and I/O accesses.
+*/
+   for (retries = 0; retries < LINK_WAIT_MAX_IATU_RETRIES; retries++) {
+   val = dw_pcie_readl_ob_unroll(pci, base, index,
+ PCIE_ATU_UNR_REGION_CTRL2);
+   if (val & PCIE_ATU_ENABLE)
+   return;
+
+   usleep_range(LINK_WAIT_IATU_MIN, LINK_WAIT_IATU_MAX);
+   }
+   dev_err(pci->dev, "outbound iATU is not being enabled\n");
+}
+
 void dw_pcie_prog_outbound_atu(struct dw_pcie *pci, int index, int type,
   u64 cpu_addr, u64 pci_addr, u32 size)
 {
@@ -118,59 +160,39 @@ void dw_pcie_prog_outbound_atu(struct dw_pcie *pci, int 
index, int type,
cpu_addr = pci->ops->cpu_addr_fixup(cpu_addr);
 
if (pci->iatu_unroll_enabled) {
-   dw_pcie_writel_unroll(pci, base, index, PCIE_ATU_UNR_LOWER_BASE,
- lower_32_bits(cpu_addr));
-   dw_pcie_writel_unroll(pci, base, index, PCIE_ATU_UNR_UPPER_BASE,
- upper_32_bits(cpu_addr));
-   dw_pcie_writel_unroll(pci, base, index, PCIE_ATU_UNR_LIMIT,
- lower_32_bits(cpu_addr + size - 1));
-   dw_pcie_writel_unroll(pci, base, index,
- PCIE_ATU_UNR_LOWER_TARGET,
- lower_32_bits(pci_addr));
-   dw_pcie_writel_unroll(pci, base, index,
- PCIE_ATU_UNR_UPPER_TARGET,
- upper_32_bits(pci_addr));
-   dw_pcie_writel_unroll(pci, base, index,
- PCIE_ATU_UNR_REGION_CTRL1,
- type);
-   dw_pcie_writel_unroll(pci, base, index,
- 

[RESEND PATCH v3 0/7] PCI: dwc: Miscellaneous fixes and cleanups

2017-03-08 Thread Kishon Vijay Abraham I
Resending since it bounced from quite a few lists.

This should be the final set of cleanups/fixes before endpoint
support can be merged.

Keerthy's patch is a general fix in dra7xx driver and is not
directly related to endpoint mode.

The v1 of this series was previously sent with a different
cover letter $subject [1]

Changes from v2:
*) Kconfig changes that had spilled into a patch are removed.
*) In addition to renaming _unroll() to _ob_unroll(), all the
   _unroll configurations are also moved to a separate function.

Changes from v1:
*) included a patch to rename _unroll() to _ob_unroll() as
   similar thing has to be done for inbound window in the case
   of EP mode.
*) used 'size_t' instead of 'int' for specifying the size
   in read_dbi/write_dbi function arguments.
*) Populate cpu_addr_fixup ops for artpec6 as suggested by
   Niklas

This series is based on 4.11-rc1

[1] -> https://lkml.org/lkml/2017/2/16/270

Keerthy (1):
  PCI: dwc: dra7xx: Push request_irq call to the bottom of probe

Kishon Vijay Abraham I (6):
  PCI: dwc: designware: Add new *ops* for cpu addr fixup
  PCI: dwc: dra7xx: Populate cpu_addr_fixup ops
  PCI: dwc: artpec6: Populate cpu_addr_fixup ops
  PCI: dwc: all: Modify dbi accessors to take dbi_base as argument
  PCI: dwc: all: Modify dbi accessors to access data of 4/2/1  bytes
  PCI: dwc: designware: Move _unroll configurations to a separate
function

 drivers/pci/dwc/pci-dra7xx.c   |   35 +++
 drivers/pci/dwc/pci-exynos.c   |   14 +--
 drivers/pci/dwc/pci-imx6.c |   62 +++--
 drivers/pci/dwc/pci-keystone-dw.c  |   16 ++--
 drivers/pci/dwc/pcie-armada8k.c|   39 
 drivers/pci/dwc/pcie-artpec6.c |   22 +++--
 drivers/pci/dwc/pcie-designware-host.c |   20 ++--
 drivers/pci/dwc/pcie-designware.c  |  156 +---
 drivers/pci/dwc/pcie-designware.h  |   13 ++-
 drivers/pci/dwc/pcie-hisi.c|   17 ++--
 10 files changed, 241 insertions(+), 153 deletions(-)

-- 
1.7.9.5



[PATCH v2] serial: 8250_dw: Honor clk_round_rate errors in dw8250_set_termios

2017-03-08 Thread Heiko Stuebner
clk_round_rate returns a signed long and may possibly return errors
in it, for example if there is no possible rate.

Until now dw8250_set_termios ignored any error and the signedness, and would
just use the value as input to clk_set_rate. This of course falls apart
if there is an actual error, so check for errors and only try to set
a rate if the value is actually valid.

This turned up on some Rockchip platforms after commit
6a171b299379 ("serial: 8250_dw: Allow hardware flow control to be used")
enabled set_termios callback in all cases, not only ACPI.

Fixes: 6a171b299379 ("serial: 8250_dw: Allow hardware flow control to be used")
Signed-off-by: Heiko Stuebner 
Reviewed-by: Andy Shevchenko 
---
There is also another patch floating around, fixing a separate issue
on top of this one: "serial: 8250_dw: Fix breakage when HAVE_CLK=n"

changes in v2:
- adapt commit message to make it more explicit, that this is a
  somewhat critical fix
- add Andy's Reviewed-by

 drivers/tty/serial/8250/8250_dw.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_dw.c 
b/drivers/tty/serial/8250/8250_dw.c
index 6ee55a2d47bb..223ac234ddb2 100644
--- a/drivers/tty/serial/8250/8250_dw.c
+++ b/drivers/tty/serial/8250/8250_dw.c
@@ -257,7 +257,7 @@ static void dw8250_set_termios(struct uart_port *p, struct 
ktermios *termios,
 {
unsigned int baud = tty_termios_baud_rate(termios);
struct dw8250_data *d = p->private_data;
-   unsigned int rate;
+   long rate;
int ret;
 
if (IS_ERR(d->clk) || !old)
@@ -265,7 +265,10 @@ static void dw8250_set_termios(struct uart_port *p, struct 
ktermios *termios,
 
clk_disable_unprepare(d->clk);
rate = clk_round_rate(d->clk, baud * 16);
-   ret = clk_set_rate(d->clk, rate);
+   if (rate < 0)
+   ret = rate;
+   else
+   ret = clk_set_rate(d->clk, rate);
clk_prepare_enable(d->clk);
 
if (!ret)
-- 
2.11.0
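
The error-handling pattern in the patch above can be sketched outside the
kernel as follows. This is only an illustration: demo_round_rate() and
demo_set_rate() are hypothetical stand-ins for clk_round_rate() and
clk_set_rate(), and the rule that a zero request fails with -EINVAL is an
assumption made so the error path can be exercised.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for clk_round_rate(): returns a rate, or a negative errno. */
static long demo_round_rate(unsigned long requested)
{
	return requested ? (long)requested : -EINVAL;
}

/* Stand-in for clk_set_rate(): records the rate it was asked to set. */
static unsigned long demo_last_rate;

static int demo_set_rate(unsigned long rate)
{
	demo_last_rate = rate;
	return 0;
}

/* Mirrors the patched logic: propagate the error, never set a bogus rate. */
static int demo_update_rate(unsigned int baud)
{
	long rate = demo_round_rate((unsigned long)baud * 16);

	if (rate < 0)
		return (int)rate;
	return demo_set_rate((unsigned long)rate);
}

/* Self-check of both the success path and the error path. */
static void demo_round_rate_check(void)
{
	assert(demo_update_rate(115200) == 0);
	assert(demo_last_rate == 115200UL * 16);
	assert(demo_update_rate(0) == -EINVAL);
	assert(demo_last_rate == 115200UL * 16); /* unchanged on error */
}
```

Keeping the rounded rate in a signed long is what makes the "rate < 0" test
meaningful; stored in an unsigned int, a negative errno would turn into a very
large bogus rate.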



[RESEND PATCH v3 2/7] PCI: dwc: dra7xx: Populate cpu_addr_fixup ops

2017-03-08 Thread Kishon Vijay Abraham I
Populate cpu_addr_fixup ops to extract the least significant 28 bits of the
corresponding cpu address.

Acked-by: Joao Pinto 
Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index 0984baf..07c45ec 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -88,6 +88,11 @@ static inline void dra7xx_pcie_writel(struct dra7xx_pcie 
*pcie, u32 offset,
writel(value, pcie->base + offset);
 }
 
+static u64 dra7xx_pcie_cpu_addr_fixup(u64 pci_addr)
+{
+   return pci_addr & DRA7XX_CPU_TO_BUS_ADDR;
+}
+
 static int dra7xx_pcie_link_up(struct dw_pcie *pci)
 {
struct dra7xx_pcie *dra7xx = to_dra7xx_pcie(pci);
@@ -152,11 +157,6 @@ static void dra7xx_pcie_host_init(struct pcie_port *pp)
struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
struct dra7xx_pcie *dra7xx = to_dra7xx_pcie(pci);
 
-   pp->io_base &= DRA7XX_CPU_TO_BUS_ADDR;
-   pp->mem_base &= DRA7XX_CPU_TO_BUS_ADDR;
-   pp->cfg0_base &= DRA7XX_CPU_TO_BUS_ADDR;
-   pp->cfg1_base &= DRA7XX_CPU_TO_BUS_ADDR;
-
dw_pcie_setup_rc(pp);
 
dra7xx_pcie_establish_link(dra7xx);
@@ -329,6 +329,7 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie 
*dra7xx,
 }
 
 static const struct dw_pcie_ops dw_pcie_ops = {
+   .cpu_addr_fixup = dra7xx_pcie_cpu_addr_fixup,
.link_up = dra7xx_pcie_link_up,
 };
 
-- 
1.7.9.5
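
As a rough illustration of the fixup described above, the sketch below applies
an optional callback that keeps the least significant 28 bits of an address.
The mask value 0x0FFFFFFF is an assumption derived from the "28 bits" wording,
and struct demo_ops, demo_mask_fixup() and demo_translate() are hypothetical
names, not the driver's own.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mask: keeps the least significant 28 bits. */
#define DEMO_CPU_TO_BUS_ADDR 0x0FFFFFFFULL

/* Hypothetical ops table with an optional address fixup callback. */
struct demo_ops {
	uint64_t (*cpu_addr_fixup)(uint64_t cpu_addr);
};

/* Platform-specific fixup: drop everything above bit 27. */
static uint64_t demo_mask_fixup(uint64_t cpu_addr)
{
	return cpu_addr & DEMO_CPU_TO_BUS_ADDR;
}

/* Core code applies the fixup only when the platform provides one. */
static uint64_t demo_translate(const struct demo_ops *ops, uint64_t cpu_addr)
{
	if (ops && ops->cpu_addr_fixup)
		cpu_addr = ops->cpu_addr_fixup(cpu_addr);
	return cpu_addr;
}

/* Self-check: masked with the callback, untouched without it. */
static void demo_fixup_check(void)
{
	struct demo_ops ops = { .cpu_addr_fixup = demo_mask_fixup };

	assert(demo_translate(&ops, 0x20013000ULL) == 0x00013000ULL);
	assert(demo_translate(0, 0x20013000ULL) == 0x20013000ULL);
}
```

Doing the masking in a callback means the stored base addresses stay
unmodified, in contrast to the removed "&=" assignments in host_init.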



Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-08 Thread Ricardo Neri
On Wed, 2017-03-08 at 19:53 +0300, Stas Sergeev wrote:
> 08.03.2017 19:46, Andy Lutomirski wrote:
> >> No no, since I meant prot mode, this is not what I need.
> >> I would never need to disable UMIP as to allow the
> >> prot mode apps to do SLDT. Instead it would be good
> >> to have an ability to provide a replacement for the dummy
> >> emulation that is currently being proposed for kernel.
> >> All is needed for this, is just to deliver a SIGSEGV.
> > That's what I meant.  Turning off FIXUP_UMIP would leave UMIP on but
> > turn off the fixup, so you'd get a SIGSEGV indicating #GP (or a vm86
> > GP exit).
> But then I am confused with the word "compat" in
> your "COMPAT_MASK0_X86_UMIP_FIXUP" and
> "sys_adjust_compat_mask(int op, int word, u32 mask);"
> 
> Leaving UMIP on and only disabling a fixup doesn't
> sound like a compat option to me. I would expect
> compat to disable it completely.

I guess that the _UMIP_FIXUP part makes it clear that emulation, not
UMIP, is disabled, allowing the SIGSEGV to be delivered to the user space
program.

Would having a COMPAT_MASK0_X86_UMIP_FIXUP to disable emulation and a
COMPAT_MASK0_X86_UMIP to disable UMIP make sense?

Also, wouldn't having a COMPAT_MASK0_X86_UMIP to disable UMIP defeat its
purpose? Applications could simply use this compat mask to bypass UMIP
and gain access to the instructions it protects.

Thanks and BR,
Ricardo



Re: [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf

2017-03-08 Thread Fengguang Wu

On Wed, Mar 08, 2017 at 02:43:44PM -0800, Linus Torvalds wrote:
> On Wed, Mar 8, 2017 at 2:27 PM, Daniel Borkmann  wrote:
> >
> > The issue seems to be accessing buff first (can be read or write access)
> > and then doing set_memory_ro() doesn't make it read-only immediately,
> > meaning the subsequent call into probe_kernel_write() will succeed without
> > error.
> >
> > Then, if I don't touch buff first and only do the set_memory_ro() seems
> > to work and probe_kernel_write() will then fail as expected due to pages
> > being read-only now.
>
> Ok, that definitely sounds like a TLB invalidate didn't happen.
>
> > Now, if I access buff, do the set_memory_ro() and then a msleep(0), for
> > example, it "kind of" works most of the time (see last log extract below),
> > and probe_kernel_write() will fail.
>
> Yeah, very much consistent with a missing TLB invalidate. Scheduling
> will end up invalidating it, although if it's a global page even that
> might not do it (but eventually the entry will just get flushed due to
> other activity).
>
> > None of this seems an issue with x86_64 and the test_setmem runs fine all
> > the time, same for the actual BPF stuff.
>
> The code does look somewhat confused about when to actually flush
> things - see my earlier note about NX - but it would seem to always do
> __flush_tlb_all() unless I missed something. At least as long as
> CPA_FLUSHTLB is set. Maybe some case forgets to set that..

Not sure if it's relevant, but out of 189 boots there are 2 boots
showing the below "CPA: called for zero pte." warning.

[7.116932] random: trinity: uninitialized urandom read (4 bytes read)
[   16.366468] sock: process `trinity-main' is using obsolete setsockopt 
SO_BSDCOMPAT
[   17.202396] BUG: unable to handle kernel paging request at 655d9eb2
[   17.204081] IP: __release_sock+0x6e/0x100
[   17.205207] *pde = 
[   17.205208]
[   17.206755] Oops:  [#1]
[   17.207686] CPU: 0 PID: 382 Comm: trinity-main Not tainted 
4.10.0-rc8-02017-g9d876e7 #1
[   17.209819] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.9.3-20161025_171302-gandalf 04/01/2014
[   17.212431] task: d625d200 task.stack: d6222000
[   17.213655] EIP: __release_sock+0x6e/0x100
[   17.214833] EFLAGS: 00010246 CPU: 0
[   17.215951] EAX:  EBX: 655d9eb2 ECX:  EDX: 0201
[   17.217587] ESI: 0605 EDI: d6064800 EBP: d6223ef4 ESP: d6223ee8
[   17.219185]  DS: 007b ES: 007b FS:  GS: 0033 SS: 0068
[   17.220602] CR0: 80050033 CR2: 655d9eb2 CR3: 1610f000 CR4: 0610
[   17.221966] DR0: 080cb000 DR1:  DR2:  DR3: 
[   17.223444] DR6: 0ff0 DR7: 0600
[   17.224343] Call Trace:
[   17.225007]  release_sock+0x2e/0x80
[   17.225900]  sock_setsockopt+0x8c/0x880
[   17.226857]  SyS_socketcall+0x658/0x6a0
[   17.227804]  do_fast_syscall_32+0x9a/0x160
[   17.228765]  entry_SYSENTER_32+0x4c/0x7b
[   17.229694] EIP: 0xbcc5
[   17.230428] EFLAGS: 0282 CPU: 0
[   17.231263] EAX: ffda EBX: 000e ECX: bfedce00 EDX: bfedce80
[   17.232582] ESI: 001a EDI: 00ae EBP: b754f93c ESP: bfedcdec
[   17.233882]  DS: 007b ES: 007b FS:  GS: 0033 SS: 007b
[   17.235044] Code: eb 29 8d 76 00 89 da 89 f8 ff 97 98 01 00 00 31 c9 ba 06 08 00 
00 b8 d8 19 b1 c1 e8 ed 3d 85 ff e8 e8 62 04 00 85 f6 89 f3 74 42 <8b>
33 0f 18 06 8b 43 48 a8 01 74 0e 83 e0 fe 74 09 80 3d 3d 9c
[   17.240429] EIP: __release_sock+0x6e/0x100 SS:ESP: 0068:d6223ee8
[   17.241689] CR2: 655d9eb2
[   17.242509] ---[ end trace dc10480164c75444 ]---
[   17.243569] [ cut here ]
[   17.243574] WARNING: CPU: 0 PID: 15 at arch/x86/mm/pageattr.c:1150 
__cpa_process_fault+0x388/0x390
[   17.243575] CPA: called for zero pte. vaddr = d7ab4000 cpa->vaddr = d7ab4000
[   17.243577] CPU: 0 PID: 15 Comm: kworker/0:1 Tainted: G  D 
4.10.0-rc8-02017-g9d876e7 #1
[   17.243578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.9.3-20161025_171302-gandalf 04/01/2014
[   17.243582] Workqueue: events bpf_prog_free_deferred
[   17.243583] Call Trace:
[   17.243588]  dump_stack+0x16/0x25
[   17.243588]  dump_stack+0x16/0x25
[   17.243590]  __warn+0xd1/0xf0
[   17.243592]  ? __cpa_process_fault+0x388/0x390
[   17.243593]  warn_slowpath_fmt+0x3b/0x40
[   17.243594]  __cpa_process_fault+0x388/0x390
[   17.243596]  ? lookup_address_in_pgd+0xa/0x90
[   17.243598]  __change_page_attr+0x520/0x6c0
[   17.243600]  ? pfn_range_is_mapped+0xe/0x80
[   17.243601]  __change_page_attr_set_clr+0x38/0x180
[   17.243603]  change_page_attr_set_clr+0x107/0x3f0
[   17.243605]  ? dequeue_entity+0x86/0x230
[   17.243607]  set_memory_rw+0x3a/0x40
[   17.243608]  bpf_prog_free_deferred+0x16/0x30
[   17.243612]  process_one_work+0xfc/0x440
[   17.243614]  ? pick_next_task_fair+0x149/0x1d0
[   17.243615]  worker_thread+0x37/0x4e0
[   17.243617]  kthread+0xdd/0x110
[   17.243618]  ? process_one_work+0x440/0x440
[   17.243620]  ? __kthread_create_on_node+0x100/0x100
[   17.243622]  ret_from_fork+0x21/0x2c
[   17.243623] 

[PATCH] arm64: support keyctl() system call in 32-bit mode

2017-03-08 Thread Eric Biggers
From: Eric Biggers 

As is the case for a number of other architectures that have a 32-bit
compat mode, enable KEYS_COMPAT if both COMPAT and KEYS are enabled.
This allows AArch32 programs to use the keyctl() system call when
running on an AArch64 kernel.

Signed-off-by: Eric Biggers 
---
 arch/arm64/Kconfig | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a39029b5414e..f21e9a76ff67 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1063,6 +1063,10 @@ config SYSVIPC_COMPAT
def_bool y
depends on COMPAT && SYSVIPC
 
+config KEYS_COMPAT
+   def_bool y
+   depends on COMPAT && KEYS
+
 endmenu
 
 menu "Power management options"
-- 
2.12.0.246.ga2ecc84866-goog



Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-08 Thread Ricardo Neri
On Wed, 2017-03-08 at 17:08 +0300, Stas Sergeev wrote:
> 08.03.2017 03:32, Ricardo Neri wrote:
> > These are the instructions covered by UMIP:
> > * SGDT - Store Global Descriptor Table
> > * SIDT - Store Interrupt Descriptor Table
> > * SLDT - Store Local Descriptor Table
> > * SMSW - Store Machine Status Word
> > * STR - Store Task Register
> >
> > This patchset initially treated tasks running in virtual-8086 mode as a
> > special case. However, I received clarification that DOSEMU[8] does not
> > support applications that use these instructions.
> Yes, this is the case.
> But at least in the past there was an attempt to
> support SLDT as it is used by an ancient pharlap
> DOS extender (currently unsupported by dosemu1/2).
> So how difficult would it be to add an optional
> possibility of delivering such SIGSEGV to userspace
> so that the kernel's dummy emulation can be overridden?

I suppose a umip=noemulation kernel parameter could be added in this
case.

> It doesn't need to be a matter of this particular
> patch set, i.e. this proposal should not trigger a
> v7 resend of all 21 patches. :) But it would be useful
> for the future development of dosemu2.

Would dosemu2 use 32-bit processes in order to keep segmentation? If it
could use 64-bit processes, emulation is not used in this case and the
SIGSEGV is delivered to user space.

Thanks and BR,
Ricardo




Re: [PATCH net] dccp/tcp: fix routing redirect race

2017-03-08 Thread Eric Dumazet
On Thu, 2017-03-09 at 12:15 +1100, Jon Maxwell wrote:
> We have seen a few incidents lately where a dst_entry has been freed
> with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
> dst_entry. If the conditions/timings are right a crash then ensues when the
> freed dst_entry is referenced later on. A common crashing back trace is:

Very nice catch !

Don't we have a similar issue for IPv6 ?




Re: kexec regression since 4.9 caused by efi

2017-03-08 Thread Dave Young
On 03/08/17 at 12:16pm, Omar Sandoval wrote:
> Hi, everyone,
> 
> Since 4.9, kexec results in the following panic on some of our servers:
> 
> [0.001000] general protection fault:  [#1] SMP
> [0.001000] Modules linked in:
> [0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1 #53
> [0.001000] Hardware name: Wiwynn Leopard-Orv2/Leopard-DDR BW, BIOS LBM05  
>  09/30/2016
> [0.001000] task: 81e0e4c0 task.stack: 81e0
> [0.001000] RIP: 0010:virt_efi_set_variable+0x85/0x1a0
> [0.001000] RSP: :81e03e18 EFLAGS: 00010202
> [0.001000] RAX: afafafafafafafaf RBX: 81e3a4e0 RCX: 
> 0007
> [0.001000] RDX: 81e03e70 RSI: 81e3a4e0 RDI: 
> 88407f8c2de0
> [0.001000] RBP: 81e03e60 R08:  R09: 
> 
> [0.001000] R10:  R11:  R12: 
> 81e03e70
> [0.001000] R13: 0007 R14:  R15: 
> 
> [0.001000] FS:  () GS:881fff60() 
> knlGS:
> [0.001000] CS:  0010 DS:  ES:  CR0: 80050033
> [0.001000] CR2: 88407f30f000 CR3: 001fff102000 CR4: 
> 000406b0
> [0.001000] DR0:  DR1:  DR2: 
> 
> [0.001000] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [0.001000] Call Trace:
> [0.001000]  efi_delete_dummy_variable+0x7a/0x80
> [0.001000]  efi_enter_virtual_mode+0x3e2/0x494
> [0.001000]  start_kernel+0x392/0x418
> [0.001000]  ? set_init_arg+0x55/0x55
> [0.001000]  x86_64_start_reservations+0x2a/0x2c
> [0.001000]  x86_64_start_kernel+0xea/0xed
> [0.001000]  start_cpu+0x14/0x14
> [0.001000] Code: 42 25 8d ff 80 3d 43 77 95 00 00 75 68 9c 8f 04 24 48 8b 
> 05 3e 7d 7e 00 48 89 de 4d 89 f9 4d 89 f0 44 89 e9 4c 89 e2 48 8b 40 58 <48> 
> 8b 78 58 31 c0 e8 90 e4 92 ff 48 8b 3c 24 48 c7 c6 2b 0a ca
> [0.001000] RIP: virt_efi_set_variable+0x85/0x1a0 RSP: 81e03e18
> [0.001000] ---[ end trace 0bd213e540e9b19f ]---
> [0.001000] Kernel panic - not syncing: Fatal exception
> [0.001000] ---[ end Kernel panic - not syncing: Fatal exception
> 
> Booting normally (i.e., not kexec) still works.
> 
> The decoded code is:
> 
> 
>0:   42 25 8d ff 80 3d   rex.X and $0x3d80ff8d,%eax
>6:   43 77 95rex.XB ja 0xff9e
>9:   00 00   add%al,(%rax)
>b:   75 68   jne0x75
>d:   9c  pushfq
>e:   8f 04 24popq   (%rsp)
>   11:   48 8b 05 3e 7d 7e 00mov0x7e7d3e(%rip),%rax# 0x7e7d56
>   18:   48 89 demov%rbx,%rsi
>   1b:   4d 89 f9mov%r15,%r9
>   1e:   4d 89 f0mov%r14,%r8
>   21:   44 89 e9mov%r13d,%ecx
>   24:   4c 89 e2mov%r12,%rdx
>   27:   48 8b 40 58 mov0x58(%rax),%rax
>   2b:*  48 8b 78 58 mov0x58(%rax),%rdi  <-- trapping 
> instruction
>   2f:   31 c0   xor%eax,%eax
>   31:   e8 90 e4 92 ff  callq  0xff92e4c6
>   36:   48 8b 3c 24 mov(%rsp),%rdi
>   3a:   48  rex.W
>   3b:   c7  .byte 0xc7
>   3c:   c6  (bad)
>   3d:   2b 0a   sub(%rdx),%ecx
>   3f:   ca  .byte 0xca
> 
> If I'm reading this correctly, efi.systab->runtime == 0xafafafafafafafaf,
> and we're crashing when we try to dereference that.
> 
> Here is the output of efi=debug from before the crash:
> 
> [0.00] Linux version 4.11.0-rc1 (osan...@devbig561.prn1.facebook.com) 
> (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #53 SMP Wed Mar 8 
> 12:07:16 PST 2017
> [0.00] Command line: BOOT_IMAGE=/vmlinuz-4.6.7-34_fbk7_2504_g8275185 
> ro root=LABEL=/ ipv6.autoconf=0 erst_disable biosdevname=0 net.ifnames=0 
> fsck.repair=yes pcie_pme=nomsi 
> netconsole=+@2401:db00:0011:b03e:face::0009:/eth0,1514@2401:db00:eef0:a59::/02:90:fb:5b:b7:1e
>  crashkernel=128M console=tty0 co
> nsole=ttyS1,57600 efi=debug
> [0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point 
> registers'
> [0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
> [0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
> [0.00] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
> [0.00] x86/fpu: Enabled xstate features 0x7, context size is 832 
> bytes, using 'standard' format.
> [0.00] e820: BIOS-provided physical RAM map:
> [0.00] BIOS-e820: [mem 0x0100-0x0009] usable
> [0.00] BIOS-e820: [mem 0x0010-0x750bdfff] usable
> [0.00] BIOS-e820: [mem 

Re: [musl] Re: [PATCH resent] uapi libc compat: allow non-glibc to opt out of uapi definitions

2017-03-08 Thread Rich Felker
On Wed, Mar 08, 2017 at 07:51:29PM -0500, Carlos O'Donell wrote:
> On 03/08/2017 07:14 PM, Szabolcs Nagy wrote:
> > * Carlos O'Donell  [2017-03-08 10:53:00 -0500]:
> >> On 11/11/2016 07:08 AM, Felix Janda wrote:
> >>> fixes the following compiler errors when  is included
> >>> after musl :
> >>>
> >>> ./linux/in6.h:32:8: error: redefinition of 'struct in6_addr'
> >>> ./linux/in6.h:49:8: error: redefinition of 'struct sockaddr_in6'
> >>> ./linux/in6.h:59:8: error: redefinition of 'struct ipv6_mreq'
> >>
> >> Do you have plans for fixing the error when the inclusion order is the 
> >> other way?
> > 
> > the other way (linux header included first) is
> > problematic because linux headers don't follow
> > all the standards the libc follows, they violate
> > namespace rules in their struct definitions, so
> > the libc definitions are necessarily incompatible
> > with them and thus different translation units can
> > > end up referring to the same object through
> > incompatible types which is undefined.
> > (even if the abi matches and thus works across
> > the syscall interface, a sufficiently smart
> > toolchain can break such code at link time,
> > > and since the libc itself uses its own definitions
> > that's what user code should use too).
> > 
> > > there should be a way to include standard-conforming
> > libc headers and linux headers into the same tu,
> > at least the case when all conflicting definitions
> > come from the libc should work and i think that
> > should be the scope of these libc-compat.h changes.
> > (of course if glibc tries to support arbitrary
> > interleavings then the changes should not break that)
> 
> You can get non-standard defines even when including the
> linux headers _after_ libc headers because linux headers
> should rightly continue to define things that are required
> for linux-specific applications.
> 
> IMO the fact that the UAPI headers may cause problems with
> standards conformance is orthogonal to the discussion of 
> _how_ we fix inclusion order issues.
> 
> Some of the network headers can be used in relative safety
> and need to be used for some applications. It is those cases
> where I'd like to see an inclusion guard design that works
> for both inclusion orders.

The issue has been discussed on our side (musl) and our position so
far is that we don't want to try to support the case of including the
kernel headers before the libc headers, at least not at this time.
It's a big rabbit hole of stuff that could go wrong. This doesn't
preclude the kernel folks trying to make things so that it _can_ be
supported more smoothly.

Rich
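
The opt-out scheme discussed in this thread, for the libc-header-first order
only, can be sketched in a single translation unit. All DEMO_ and demo_ names
below are hypothetical and stand in for the real __UAPI_DEF_* macros and
header contents; the three commented sections imitate a libc header, the
compat header, and a uapi header included in that order.

```c
#include <assert.h>

/* Pretend libc header: provides its own type and opts out of the uapi one. */
struct demo_in6_addr {
	unsigned char s6_addr[16];
};
#define DEMO_UAPI_DEF_IN6_ADDR 0

/* Pretend compat header: default to 1 unless a libc already chose a value. */
#ifndef DEMO_UAPI_DEF_IN6_ADDR
#define DEMO_UAPI_DEF_IN6_ADDR 1
#endif

/* Pretend uapi header: define the struct only if it was not suppressed. */
#if DEMO_UAPI_DEF_IN6_ADDR
struct demo_in6_addr {
	unsigned char s6_addr[16];
};
#endif

/* Reports whether the uapi definition was skipped in this translation unit. */
static int demo_uapi_definition_suppressed(void)
{
	return DEMO_UAPI_DEF_IN6_ADDR == 0;
}

/* Self-check: the libc definition is the only one and is still usable. */
static void demo_uapi_check(void)
{
	assert(demo_uapi_definition_suppressed());
	assert(sizeof(struct demo_in6_addr) == 16);
}
```

Without the "#define ... 0" line the struct would be defined twice, which is
the kind of redefinition error quoted earlier in the thread. The sketch does
not attempt the reverse order (kernel header first), which is the harder case
under discussion.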


[RESEND PATCH v3 0/2] Add i2c dt-binding and device node for Mediatek MT2701 Soc

2017-03-08 Thread Jun Gao
This patch series, based on v4.11-rc1, includes the MT2701 i2c dt-binding
and device node.

changes since v2:
- Modify commit message
- Revise dt-binding documentation

changes since v1:
- Modify commit message

Dependent on "Add clock and power domain DT nodes for Mediatek MT2701"[1].

[1] 
http://lists.infradead.org/pipermail/linux-mediatek/2016-December/007637.html

Jun Gao (2):
  dt-bindings: i2c: Add Mediatek MT2701 i2c binding
  arm: dts: Add Mediatek MT2701 i2c device node

 .../devicetree/bindings/i2c/i2c-mt6577.txt |   11 ++---
 arch/arm/boot/dts/mt2701-evb.dts   |   42 
 arch/arm/boot/dts/mt2701.dtsi  |   42 
 3 files changed, 90 insertions(+), 5 deletions(-)

--
1.7.9.5



Re: [PATCH v7 06/07] iommu/ipmmu-vmsa: ARM and ARM64 archdata access

2017-03-08 Thread Magnus Damm
Hi Robin,

On Wed, Mar 8, 2017 at 9:48 PM, Robin Murphy  wrote:
> On 07/03/17 03:17, Magnus Damm wrote:
>> From: Magnus Damm 
>>
>> Not all architectures have an iommu member in their archdata, so
>> use #ifdefs to support building with COMPILE_TEST on any architecture.
>
> I have a feeling I might be repeating myself, but ipmmu_vmsa_archdata
> looks to be trivially convertible to iommu_fwspec, which I strongly
> encourage, not least because it would obviate bodges like this.

Yeah, I think it should be possible to use iommu_fwspec for this
purpose. The question is when to do it. =)

I actually looked into it recently, but then realised that for this to
work then due to code sharing I need to make use of iommu_fwspec on
both 32-bit and 64-bit ARM. So it requires rework of the existing
IPMMU for 32-bit ARM (including hairy legacy CONFIG_IOMMU_DMA=n code).
I was actually thinking of doing some rework of 32-bit ARM IPMMU code
anyway (I suspect iommu_device_* conversion caused breakage) and it
probably has to happen on top of current -next. I would also like to
start reducing burden of forward porting all these patches, and
stirring up the ground does not really help much there...

Cheers,

/ magnus


Re: [PATCH 1/2] mm/memblock: use NUMA_NO_NODE instead of MAX_NUMNODES as default node_id

2017-03-08 Thread Wei Yang
Hello, everyone,

After thinking about this more deeply, I would like to split these two
patches into two patch sets, since they address two different things.

The first one [Patch 1] uses NUMA_NO_NODE as the default node_id in
memblock_region.

The current implementation uses MAX_NUMNODES as the default nid in several
situations:

* when it adds a range from e820 to memblock
* when it returns an allocated range, it sets nid to MAX_NUMNODES
* on x86, before initializing the NUMA info, it sets all nids to MAX_NUMNODES

The usage of MAX_NUMNODES here is not accurate; NUMA_NO_NODE should be used
instead.

The memblock allocation path already translates MAX_NUMNODES to NUMA_NO_NODE
and notes that MAX_NUMNODES is deprecated, so I think this refactor is
reasonable.

The second one [Patch 2] addresses a similar issue in
for_each_mem_pfn_range(). The patch here is the first step. I have found all
related functions and replaced MAX_NUMNODES with NUMA_NO_NODE. The warning
will still be seen when only this patch is applied, but after all patches are
applied we won't see it again.

Hmm... it looks like tedious work, but I still think it is worth the effort
to use the correct macro.

I would appreciate some feedback :-)


On Fri, Jan 27, 2017 at 09:59:21AM +0800, Wei Yang wrote:
>According to commit  ('mm/memblock: switch to use
>NUMA_NO_NODE instead of MAX_NUMNODES'), MAX_NUMNODES is not preferred as a
>node_id indicator.
>
>This patch uses NUMA_NO_NODE as the default node_id for memblock.
>
>Signed-off-by: Wei Yang 
>---
> arch/x86/mm/numa.c | 6 +++---
> mm/memblock.c  | 8 
> 2 files changed, 7 insertions(+), 7 deletions(-)
>
>diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
>index 3f35b48d1d9d..4366242356c5 100644
>--- a/arch/x86/mm/numa.c
>+++ b/arch/x86/mm/numa.c
>@@ -506,7 +506,7 @@ static void __init numa_clear_kernel_node_hotplug(void)
>*   reserve specific pages for Sandy Bridge graphics. ]
>*/
>   for_each_memblock(reserved, mb_region) {
>-  if (mb_region->nid != MAX_NUMNODES)
>+  if (mb_region->nid != NUMA_NO_NODE)
>   node_set(mb_region->nid, reserved_nodemask);
>   }
> 
>@@ -633,9 +633,9 @@ static int __init numa_init(int (*init_func)(void))
>   nodes_clear(node_online_map);
>   memset(_meminfo, 0, sizeof(numa_meminfo));
>   WARN_ON(memblock_set_node(0, ULLONG_MAX, ,
>-MAX_NUMNODES));
>+NUMA_NO_NODE));
>   WARN_ON(memblock_set_node(0, ULLONG_MAX, ,
>-MAX_NUMNODES));
>+NUMA_NO_NODE));
>   /* In case that parsing SRAT failed. */
>   WARN_ON(memblock_clear_hotplug(0, ULLONG_MAX));
>   numa_reset_distance();
>diff --git a/mm/memblock.c b/mm/memblock.c
>index d0f2c9632187..7d27566cee11 100644
>--- a/mm/memblock.c
>+++ b/mm/memblock.c
>@@ -292,7 +292,7 @@ static void __init_memblock memblock_remove_region(struct 
>memblock_type *type, u
>   type->regions[0].base = 0;
>   type->regions[0].size = 0;
>   type->regions[0].flags = 0;
>-  memblock_set_region_node(>regions[0], MAX_NUMNODES);
>+  memblock_set_region_node(>regions[0], NUMA_NO_NODE);
>   }
> }
> 
>@@ -616,7 +616,7 @@ int __init_memblock memblock_add(phys_addr_t base, 
>phys_addr_t size)
>(unsigned long long)base + size - 1,
>0UL, (void *)_RET_IP_);
> 
>-  return memblock_add_range(, base, size, MAX_NUMNODES, 
>0);
>+  return memblock_add_range(, base, size, NUMA_NO_NODE, 
>0);
> }
> 
> /**
>@@ -734,7 +734,7 @@ int __init_memblock memblock_reserve(phys_addr_t base, 
>phys_addr_t size)
>(unsigned long long)base + size - 1,
>0UL, (void *)_RET_IP_);
> 
>-  return memblock_add_range(, base, size, MAX_NUMNODES, 
>0);
>+  return memblock_add_range(, base, size, NUMA_NO_NODE, 
>0);
> }
> 
> /**
>@@ -1684,7 +1684,7 @@ static void __init_memblock memblock_dump(struct 
>memblock_type *type, char *name
>   size = rgn->size;
>   flags = rgn->flags;
> #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
>-  if (memblock_get_region_node(rgn) != MAX_NUMNODES)
>+  if (memblock_get_region_node(rgn) != NUMA_NO_NODE)
>   snprintf(nid_buf, sizeof(nid_buf), " on node %d",
>memblock_get_region_node(rgn));
> #endif
>-- 
>2.11.0

-- 
Wei Yang
Help you, Help me
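
A minimal sketch of the distinction argued for above: a dedicated sentinel for
"no node" versus a node count that merely happens to be out of range. The
DEMO_ names are hypothetical; the value -1 matches NUMA_NO_NODE, while the
node count of 64 is an arbitrary assumption for the example.

```c
#include <assert.h>

#define DEMO_NUMA_NO_NODE (-1) /* sentinel: no node assigned */
#define DEMO_MAX_NUMNODES 64   /* assumed node count for this sketch */

/* Hypothetical stand-in for a memblock region carrying a node id. */
struct demo_region {
	int nid;
};

/* A region has a node unless it carries the sentinel. */
static int demo_region_has_node(const struct demo_region *r)
{
	return r->nid != DEMO_NUMA_NO_NODE;
}

/* Valid node ids are 0 .. DEMO_MAX_NUMNODES - 1. */
static int demo_nid_is_valid(int nid)
{
	return nid >= 0 && nid < DEMO_MAX_NUMNODES;
}

/* Self-check: the sentinel and the count are both outside the valid range. */
static void demo_nid_check(void)
{
	struct demo_region r = { .nid = DEMO_NUMA_NO_NODE };

	assert(!demo_region_has_node(&r));
	r.nid = 0;
	assert(demo_region_has_node(&r));
	assert(!demo_nid_is_valid(DEMO_NUMA_NO_NODE));
	assert(!demo_nid_is_valid(DEMO_MAX_NUMNODES));
}
```

Both values are invalid node ids, but only the sentinel states the intent
explicitly, and it does not change when the configured node count changes.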




Re: [PATCH v3 09/09] iommu/ipmmu-vmsa: Hook up r8a7795 DT matching code

2017-03-08 Thread Magnus Damm
Hi Geert,

On Wed, Mar 8, 2017 at 10:58 PM, Geert Uytterhoeven
 wrote:
> Hi Magnus,
>
> On Wed, Mar 8, 2017 at 12:02 PM, Magnus Damm  wrote:
>> From: Magnus Damm 
>>
>> Tie in r8a7795 features and update the IOMMU_OF_DECLARE
>> compat string to include the updated compat string.
>>
>> TODO:
>>  - Consider making use of iommu_fwspec_add_ids() for uTLB handling
>>  Needed to coexist with non-OF R-Car Gen2 somehow...
>>  - Break out stuff useful for R-Car Gen2 from this series
>>  Fix up the Gen2 IPMMU support code
>>and/or
>>  Fold more stuff into the multi-arch series
>>  - Add support for sysfs and iommu_device_link()/unlink()
>>
>> Signed-off-by: Magnus Damm 
>
>> --- 0018/drivers/iommu/ipmmu-vmsa.c
>> +++ work/drivers/iommu/ipmmu-vmsa.c 2017-03-08 19:11:53.600607110 +0900
>
>> @@ -1043,6 +1048,17 @@ static struct iommu_group *ipmmu_find_gr
>> return group;
>>  }
>>
>> +static bool ipmmu_slave_whitelist(struct device *dev)
>> +{
>> +   /* By default, do not allow use of IPMMU */
>> +   return false;
>> +}
>> +
>> +static const struct soc_device_attribute soc_r8a7795[] = {
>> +   { .soc_id = "r8a7795", },
>
> If/when the whitelist is/becomes device/revision specific, you probably want
> to store a  pointer to the *_slave_whitelist() function in the .data member?

Yeah, for sure. It is a bit early to tell exactly what the code will
look like at this point, but I think it will become clearer in the
future. I just want to send out a new version of r8a7796 IPMMU support
and some r8a7795 DT integration to get a coherent working set of patch
series out of the door first.

>> +   { /* sentinel */ }
>> +};
>> +
>>  static int ipmmu_of_xlate_dma(struct device *dev,
>>   struct of_phandle_args *spec)
>>  {
>> @@ -1053,6 +1069,18 @@ static int ipmmu_of_xlate_dma(struct dev
>> if (!of_device_is_available(spec->np))
>> return -ENODEV;
>>
>> +   /* Failing in ->attach_device() results in a hang, so make
>> +* sure the root device is installed before going there
>> +*/
>> +   if (!__ipmmu_find_root()) {
>> +   dev_info(dev, "Unable to locate IPMMU root device\n");
>
> dev_err?

Good idea. Will fix.

>> +   return -ENODEV;
>> +   }
>> +
>> +   /* For R-Car Gen3 use a white list to opt-in slave devices */
>> +   if (soc_device_match(soc_r8a7795) && !ipmmu_slave_whitelist(dev))
>> +   return -ENODEV;

This will have to be updated for r8a7796 somehow as well.

Thanks for your help!

Cheers,

/ magnus
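
Geert's suggestion above (keep a per-SoC or per-revision whitelist callback
behind the .data member of the match table) could look roughly like the
user-space sketch below. This is only an illustration under stated
assumptions: struct soc_attr mirrors just the fields discussed here, and the
"ES2.0" revision string, the "sdhi0" device name and all helper names are
hypothetical, not taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper so that .data can stay a plain object pointer. */
struct whitelist_ops {
	bool (*allowed)(const char *dev_name);
};

/* Simplified stand-in for the kernel's soc_device_attribute. */
struct soc_attr {
	const char *soc_id;
	const char *revision;	/* NULL matches any revision */
	const void *data;	/* points to a struct whitelist_ops */
};

/* Default policy from the patch: do not allow use of the IPMMU. */
static bool deny_all(const char *dev_name)
{
	(void)dev_name;
	return false;
}

/* Hypothetical revision-specific policy that opts in a single device. */
static bool allow_sdhi(const char *dev_name)
{
	return strcmp(dev_name, "sdhi0") == 0;
}

static const struct whitelist_ops ops_deny = { .allowed = deny_all };
static const struct whitelist_ops ops_sdhi = { .allowed = allow_sdhi };

/* More specific entries come first; the first match wins. */
static const struct soc_attr soc_table[] = {
	{ .soc_id = "r8a7795", .revision = "ES2.0", .data = &ops_sdhi },
	{ .soc_id = "r8a7795", .data = &ops_deny },
	{ .soc_id = NULL }	/* sentinel */
};

static const struct soc_attr *soc_match(const struct soc_attr *t,
					const char *soc_id, const char *rev)
{
	for (; t->soc_id; t++) {
		if (strcmp(t->soc_id, soc_id) != 0)
			continue;
		if (t->revision && strcmp(t->revision, rev) != 0)
			continue;
		return t;
	}
	return NULL;
}

/* SoCs that are not listed are not subject to the whitelist at all. */
static bool slave_allowed(const char *soc_id, const char *rev,
			  const char *dev_name)
{
	const struct soc_attr *m = soc_match(soc_table, soc_id, rev);
	const struct whitelist_ops *ops;

	if (!m)
		return true;
	ops = m->data;
	return ops->allowed(dev_name);
}
```

With this layout, supporting another SoC or revision means adding a table
entry and a callback rather than another condition in the xlate path.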


Re: [PATCH 1/3] futex: remove duplicated code

2017-03-08 Thread H. Peter Anvin

On March 8, 2017 8:16:49 PM PST, Rob Landley  wrote:
>On 03/04/2017 07:05 AM, Russell King - ARM Linux wrote:
>> On Fri, Mar 03, 2017 at 01:27:10PM +0100, Jiri Slaby wrote:
>>> diff --git a/kernel/futex.c b/kernel/futex.c
>>> index b687cb22301c..c5ff9850952f 100644
>>> --- a/kernel/futex.c
>>> +++ b/kernel/futex.c
>>> @@ -1457,6 +1457,42 @@ futex_wake(u32 __user *uaddr, unsigned int
>flags, int nr_wake, u32 bitset)
>>> return ret;
>>>  }
>>>  
>>> +static int futex_atomic_op_inuser(int encoded_op, u32 __user
>*uaddr)
>>> +{
>>> +   int op = (encoded_op >> 28) & 7;
>>> +   int cmp = (encoded_op >> 24) & 15;
>>> +   int oparg = (encoded_op << 8) >> 20;
>>> +   int cmparg = (encoded_op << 20) >> 20;
>> 
>> Hmm.  oparg and cmparg look like they're doing these shifts to get
>sign
>> extension of the 12-bit values by assuming that "int" is 32-bit -
>> probably worth a comment, or for safety, they should be "s32" so it's
>> not dependent on the bit-width of "int".
>
>I thought Linux depended on the LP64 standard for all architectures?
>
>Standard: http://www.unix.org/whitepapers/64bit.html
>Rationale: http://www.unix.org/version2/whatsnew/lp64_wp.html
>
>So int has a defined bit width (32) on linux?
>
>Rob

Linux is ILP32 on 32-bit architectures and LP64 on 64-bit architectures, but 
that doesn't inherently make this stuff clear.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
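
To make the width question above concrete, here is a small user-space
sketch of the same decoding written with fixed-width types, so that the
sign extension of the two 12-bit fields does not depend on the size of int
or on shifting signed values. The struct and helper names are made up for
illustration; this is not the code from the patch.

```c
#include <stdint.h>

/* Hypothetical container for the four decoded fields. */
struct futex_op_fields {
	int op;
	int cmp;
	int32_t oparg;
	int32_t cmparg;
};

/* Sign-extend the low 12 bits of v without relying on signed shifts. */
static int32_t sign_extend12(uint32_t v)
{
	v &= 0xfffu;
	return (int32_t)(v ^ 0x800u) - 0x800;
}

static struct futex_op_fields futex_decode(uint32_t encoded_op)
{
	struct futex_op_fields f;

	f.op = (int)((encoded_op >> 28) & 7);		/* bits 28..30 */
	f.cmp = (int)((encoded_op >> 24) & 15);		/* bits 24..27 */
	f.oparg = sign_extend12(encoded_op >> 12);	/* bits 12..23 */
	f.cmparg = sign_extend12(encoded_op);		/* bits 0..11 */
	return f;
}
```

The XOR-and-subtract form gives the same result as the shift pair in the
patch on a 32-bit int, but it spells out the field width explicitly.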


Re: "mm: fix lazyfree BUG_ON check in try_to_unmap_one()" build error

2017-03-08 Thread Minchan Kim
Hi Sergey,

On Thu, Mar 09, 2017 at 01:29:08PM +0900, Sergey Senozhatsky wrote:
> Hello Minchan,
> 
> /* I can't find the https://marc.info/?l=linux-kernel=148886631303107 thread
>in my mail box for some reason so the Reply-To message-id may be wrong. */
> 
> 
> 
> commit "mm: fix lazyfree BUG_ON check in try_to_unmap_one()"
> (mmotm fd07630cbf59bead90046dd3e5cfd891e58e6987)
> 
> 
>   if (VM_WARN_ON_ONCE(PageSwapBacked(page) !=
>   PageSwapCache(page))) {
>   ...
>   }
> 
> 
> does not compile on !CONFIG_DEBUG_VM configs, because VM_WARN_ON_ONCE() is
> 
>   #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e))))
> 
> 
> 
> In file included from ./include/linux/mmdebug.h:4:0,
>  from ./include/linux/mm.h:8,
>  from mm/rmap.c:48:
> mm/rmap.c: In function ‘try_to_unmap_one’:
> ./include/linux/bug.h:45:33: error: void value not ignored as it ought to be
>  #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e))))
>  ^
> ./include/linux/mmdebug.h:49:31: note: in expansion of macro 
> ‘BUILD_BUG_ON_INVALID’
>  #define VM_WARN_ON_ONCE(cond) BUILD_BUG_ON_INVALID(cond)
>^~~~
> mm/rmap.c:1416:8: note: in expansion of macro ‘VM_WARN_ON_ONCE’
> if (VM_WARN_ON_ONCE(PageSwapBacked(page) !=
> ^~~
> 
>   -ss
> 

Thanks for the report, Sergey!
If nobody objects, I would like to go with this.

>From 38b10e560d066c2cef8f9d028e14008cefdaa3e0 Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Thu, 9 Mar 2017 14:58:23 +0900
Subject: [PATCH] mm: do not use VM_WARN_ON_ONCE as if condition

Sergey reported that VM_WARN_ON_ONCE() evaluates to void with
!CONFIG_DEBUG_VM, so unlike WARN_ON() it cannot be used as the
condition of an if statement.

Test the condition directly and call WARN_ON_ONCE(1) inside the branch.

Signed-off-by: Minchan Kim 
---
 mm/rmap.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 1d82057144ba..7d24bb93445b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1413,12 +1413,11 @@ static int try_to_unmap_one(struct page *page, struct 
vm_area_struct *vma,
 * Store the swap location in the pte.
 * See handle_pte_fault() ...
 */
-   if (VM_WARN_ON_ONCE(PageSwapBacked(page) !=
-   PageSwapCache(page))) {
+   if (unlikely(PageSwapBacked(page) != 
PageSwapCache(page))) {
+   WARN_ON_ONCE(1);
ret = SWAP_FAIL;
page_vma_mapped_walk_done();
break;
-
}
 
/* MADV_FREE page check */
-- 
2.7.4
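
For readers who want to see the failure mode outside the kernel, the
following user-space sketch mimics it. DEBUG_WARN_ON_ONCE() is a
hypothetical stand-in for a debug check that has been compiled out (it only
type-checks its argument and has type void), and warn_once() is a
simplified stand-in for an always-available warning helper. None of these
names exist in the kernel.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Stand-in for a compiled-out debug check: the expression is only
 * type-checked, and the macro as a whole has type void.
 */
#define DEBUG_WARN_ON_ONCE(cond) ((void)sizeof((long)(cond)))

enum { SWAP_OK = 0, SWAP_FAIL = 1 };

static int warnings_printed;

/* Print the warning only the first time it is hit. */
static void warn_once(const char *what)
{
	static bool warned;

	if (!warned) {
		warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
}

static int check_swap_state(bool swap_backed, bool swap_cache)
{
	/*
	 * "if (DEBUG_WARN_ON_ONCE(swap_backed != swap_cache))" would not
	 * compile, because a void expression cannot be an if condition.
	 * Using the macro as a plain statement is fine.
	 */
	DEBUG_WARN_ON_ONCE(swap_backed != swap_cache);

	/* Test the condition directly and warn inside the branch. */
	if (swap_backed != swap_cache) {
		warn_once("swap-backed and swap-cache state disagree");
		return SWAP_FAIL;
	}
	return SWAP_OK;
}
```

The error path stays active in every configuration, and the warning is
still emitted only once.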


Re: [PATCH] net: toshiba: ps3_genic_net: use new api ethtool_{get|set}_link_ksettings

2017-03-08 Thread David Miller
From: Philippe Reynes 
Date: Sun,  5 Mar 2017 23:21:06 +0100

> The ethtool api {get|set}_settings is deprecated.
> We move this driver to new api {get|set}_link_ksettings.
> 
> As I don't have the hardware, I'd be very pleased if
> someone may test this patch.
> 
> Signed-off-by: Philippe Reynes 

Applied.


Re: [RFC 08/11] mm: make ttu's return boolean

2017-03-08 Thread John Hubbard

On 03/08/2017 10:37 PM, Minchan Kim wrote:
>[...]


I think it's a matter of taste.

if (try_to_unmap(xxx))
something
else
something

It's perfectly understandable to me. IOW, if try_to_unmap returns true,
it means it did unmap successfully. Otherwise, it failed.

IMHO, SWAP_SUCCESS or TTU_RESULT_* seems to be over-engineering.
If the user wants it, they can do it by introducing the right variable name
in their context. See below.


I'm OK with that approach. Just something to avoid the "what does !ret mean in this 
function call" is what I was looking for...




[...]

forcekill = PageDirty(hpage) || (flags & MF_MUST_KILL);
-   kill_procs(, forcekill, trapno,
- ret != SWAP_SUCCESS, p, pfn, flags);
+   kill_procs(, forcekill, trapno, !ret , p, pfn, flags);


The kill_procs() invocation was a little more readable before.


Indeed, but I think the problem is not try_to_unmap; rather, the ret variable
name isn't good any more. How about this?

bool unmap_success;

unmap_success = try_to_unmap(hpage, ttu);

..

kill_procs(, forcekill, trapno, !unmap_success , p, pfn, flags);

..

return unmap_success;

My point is that the user can introduce whatever variable name suits their
context. No need to make the return value complicated, IMHO.


Yes, the local variable basically achieves what I was hoping for, so sure, works for 
me.



[...]

-   case SWAP_FAIL:


Again: the SWAP_FAIL makes it crystal clear which case we're in.


To me, I don't feel it.
To me, below is perfectly understandable.

if (try_to_unmap())
do something

That's why I think it's a matter of taste. Okay, I admit I might be
biased, too, so I will consider what you suggested if others vote for
it.


Yes, if it's really just a matter of taste, then not worth debating. Your change 
above is fine I think.


thanks
john h



Thanks.



[PATCH net] dccp/tcp: fix routing redirect race

2017-03-08 Thread Jon Maxwell
We have seen a few incidents lately where a dst_entry has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A common crashing backtrace is:

 #8 [] page_fault at 8163e648
[exception RIP: __tcp_ack_snd_check+74]
.
.   
 #9 [] tcp_rcv_established at 81580b64
#10 [] tcp_v4_do_rcv at 8158b54a
#11 [] tcp_v4_rcv at 8158cd02
#12 [] ip_local_deliver_finish at 815668f4
#13 [] ip_local_deliver at 81566bd9
#14 [] ip_rcv_finish at 8156656d
#15 [] ip_rcv at 81566f06
#16 [] __netif_receive_skb_core at 8152b3a2
#17 [] __netif_receive_skb at 8152b608
#18 [] netif_receive_skb at 8152b690
#19 [] vmxnet3_rq_rx_complete at a015eeaf [vmxnet3]
#20 [] vmxnet3_poll_rx_only at a015f32a [vmxnet3]
#21 [] net_rx_action at 8152bac2
#22 [] __do_softirq at 81084b4f
#23 [] call_softirq at 8164845c
#24 [] do_softirq at 81016fc5
#25 [] irq_exit at 81084ee5
#26 [] do_IRQ at 81648ff8

Of course it may happen with other NIC drivers as well.

The freed dst_entry was found being referenced here:

static bool tcp_in_quickack_mode(struct sock *sk)
{
	const struct inet_connection_sock *icsk = inet_csk(sk);
	const struct dst_entry *dst = __sk_dst_get(sk);

	return (dst && dst_metric(dst, RTAX_QUICKACK)) ||
		(icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);
}

But there are other backtraces attributed to the same freed dst_entry in 
netfilter code as well. 

All the vmcores showed 2 significant clues:

- Remote hosts behind the default gateway had always been redirected to a
different gateway. A rtable/dst_entry is added for each such host, which
creates more dst_entries with lower reference counts and makes the race more
probable.

- All vmcores showed a positive LockDroppedIcmps value, e.g.:

LockDroppedIcmps  267

A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the reference count of the same dst_entry cached in
sk->sk_dst_cache is decremented twice for the same socket via:

do_redirect()->__sk_dst_check()->dst_release().

This leads to the dst_entry being freed prematurely while another socket
still points to it via sk->sk_dst_cache, and to a subsequent crash.

To fix this, skip do_redirect() if user space has the socket locked. Instead, let
the redirect take place later when user space does not have the socket
locked.

The dccp code is very similar in this respect, so fix it there too.

As Eric Garver pointed out, the following commit now invalidates routes, which
can set the dst->obsolete flag so that ipv4_dst_check() returns null and
triggers the dst_release().

Fixes: ceb3320610d6 ("ipv4: Kill routes during PMTU/redirect updates.")
Cc: Eric Garver 
Cc: Hannes Sowa 
Signed-off-by: Jon Maxwell 
---
 net/dccp/ipv4.c | 3 ++-
 net/ipv4/tcp_ipv4.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 409d0cf..b99168b 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -289,7 +289,8 @@ static void dccp_v4_err(struct sk_buff *skb, u32 info)
 
switch (type) {
case ICMP_REDIRECT:
-   dccp_do_redirect(skb, sk);
+   if (!sock_owned_by_user(sk))
+   dccp_do_redirect(skb, sk);
goto out;
case ICMP_SOURCE_QUENCH:
/* Just silently ignore these. */
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 8f3ec13..575e19d 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -431,7 +431,8 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
 
switch (type) {
case ICMP_REDIRECT:
-   do_redirect(icmp_skb, sk);
+   if (!sock_owned_by_user(sk))
+   do_redirect(icmp_skb, sk);
goto out;
case ICMP_SOURCE_QUENCH:
/* Just silently ignore these. */
-- 
1.8.3.1



Re: [PATCH 3/4] phy: rockchip-typec: support DP phy switch

2017-03-08 Thread Chris Zhong

Hi Heiko and Brian

On 03/09/2017 09:02 AM, Heiko Stübner wrote:

On Wednesday, 8 March 2017, 16:39:23 CET, Brian Norris wrote:

On Fri, Feb 10, 2017 at 03:44:13PM +0800, Chris Zhong wrote:

There are 2 Type-c PHYs in RK3399, but only one DP controller. Hence
only one PHY can connect to DP controller at one time, the other should
be disconnected. The GRF_SOC_CON26 register has a switch bit to do it,
set this bit means enable PHY 1, clear this bit means enable PHY 0.

Signed-off-by: Chris Zhong 
---

  drivers/phy/phy-rockchip-typec.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/drivers/phy/phy-rockchip-typec.c
b/drivers/phy/phy-rockchip-typec.c index 7cfb0f8..1604aaa 100644
--- a/drivers/phy/phy-rockchip-typec.c
+++ b/drivers/phy/phy-rockchip-typec.c
@@ -267,6 +267,7 @@ struct rockchip_usb3phy_port_cfg {

struct usb3phy_reg usb3tousb2_en;
struct usb3phy_reg external_psm;
struct usb3phy_reg pipe_status;

+   struct usb3phy_reg uphy_dp_sel;

  };
  
  struct rockchip_typec_phy {


@@ -736,6 +737,7 @@ static const struct phy_ops rockchip_usb3_phy_ops = {

  static int rockchip_dp_phy_power_on(struct phy *phy)
  {
  
  	struct rockchip_typec_phy *tcphy = phy_get_drvdata(phy);


+   struct rockchip_usb3phy_port_cfg *cfg = >port_cfgs;

int new_mode, ret = 0;
u32 val;

@@ -766,6 +768,8 @@ static int rockchip_dp_phy_power_on(struct phy *phy)

tcphy_phy_init(tcphy, new_mode);

}

+   property_enable(tcphy, >uphy_dp_sel, 1);
+

ret = readx_poll_timeout(readl, tcphy->base + DP_MODE_CTL,

Idea for future work: this should just be readl_poll_timeout() here, and
throughout the driver.
Yes, readl_poll_timeout() is better. If another version of the series is
needed, I will add it as a separate patch after this one.

 val, val & DP_MODE_A2, 1000,
 PHY_MODE_SET_TIMEOUT);

@@ -869,6 +873,11 @@ static int tcphy_parse_dt(struct rockchip_typec_phy *tcphy,
if (ret)

return ret;

+   ret = tcphy_get_param(dev, >uphy_dp_sel,
+ "rockchip,uphy-dp-sel");
+   if (ret)
+   return ret;

What about existing device trees? You're essentially adding this
new property and requiring it at the same time.

Or are we considering no RK3399 DP stable at the moment? I guess we
haven't actually merged any device trees that support this yet, no?

An interesting situation we're in here. On the one hand, you're right this
breaks "backwards compatibility".

But on the other hand, the type-c phy is currently very much unused. The only
current board rk3399-evb.dts does not enable them (so they're disabled
everywhere) and we have neither dwc3 nor dp nodes in any rk3399 devicetrees so
far. Also Rob was ok with the binding change :-) .

So from my pov, I'd say it _should_ be ok, as nothing is using the phys at all
yet and thus there is nothing that could get broken.


Heiko
Thanks Heiko. On the other hand, there is no display node in the rk3399
dtsi yet, so I cannot add the DP node. Do you or Mark.Yao plan to complete it?




Brian


+

tcphy->grf_regs = syscon_regmap_lookup_by_phandle(dev->of_node,

  "rockchip,grf");

if (IS_ERR(tcphy->grf_regs)) {







--
Chris Zhong
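
As a rough illustration of the switch bit described in the commit message
(set selects PHY 1, clear selects PHY 0), the sketch below models a
register in which the upper 16 bits act as per-bit write enables for the
lower 16 bits. That register convention and the bit position DP_SEL_BIT
are assumptions made for this example. The real offset and bit numbers
come from the "rockchip,uphy-dp-sel" device tree property and are not
shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position of the DP select bit. */
#define DP_SEL_BIT 3u

/*
 * Assumed convention: bits 31..16 are write enables for bits 15..0, so a
 * single write can change one bit without a read-modify-write cycle.
 */
static uint32_t grf_dp_sel_value(unsigned int phy_index)
{
	assert(phy_index <= 1);

	uint32_t write_enable = 1u << (DP_SEL_BIT + 16);
	uint32_t value = (uint32_t)phy_index << DP_SEL_BIT;

	return write_enable | value;
}

/* Toy model of how such a register would apply a masked write. */
static uint32_t grf_apply_write(uint32_t reg, uint32_t written)
{
	uint32_t mask = written >> 16;

	return (reg & ~mask) | (written & mask & 0xffffu);
}
```

Writing grf_dp_sel_value(1) routes the DP controller to PHY 1 in this
model, writing grf_dp_sel_value(0) routes it back to PHY 0, and all other
bits in the register keep their values.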




Re: [lkp-robot] [x86] ed3ce2a917: BUG:unable_to_handle_kernel

2017-03-08 Thread Ye Xiaolong
On 03/02, Borislav Petkov wrote:
>Hi,
>
>On Thu, Mar 02, 2017 at 09:09:34AM +0800, kernel test robot wrote:
>> 
>> FYI, we noticed the following commit:
>> 
>> commit: ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f ("x86: Optimize 
>> clear_page()")
>> url: 
>> https://github.com/0day-ci/linux/commits/Borislav-Petkov/x86-Optimize-clear_page/20170215-193441
>> 
>> 
>> in testcase: will-it-scale
>> with following parameters:
>> 
>>  test: poll2
>>  cpufreq_governor: performance
>> 
>> test-description: Will It Scale takes a testcase and runs it from 1 through 
>> to n parallel copies to see if the testcase will scale. It builds both a 
>> process and threads based test in order to see any differences between the 
>> two.
>> test-url: https://github.com/antonblanchard/will-it-scale
>
>thanks for the report, I was able to reproduce.
>
>BUT(!) this report is misleading because it talks about will-it-scale
>but your splat happens when you kexec the kernel:
>
>  [  336.340747] LKP: kexec loading...
>  [  336.340852] 
>  [  336.343323] kexec --noefi -l 
> /tmp/cache/pkg/linux/x86_64-rhel-7.2/gcc-6/ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f/vmlinuz-4.9.0-rc6-00134-ged3ce2a
>  --initrd=/tmp/cache/initrd-concatenated
>  [  336.343758] 
>  [  337.893471] --append=ip=lkp-ivb-d01::dhcp root=/dev/ram0 user=lkp 
> job=/lkp/scheduled/lkp-ivb-d01/will-it-scale-poll2-performance-debian-x86_64-2016-08-31.cgz-ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f-20170301-28072-1dqjyhl-11.yaml
>  ARCH=x86_64 kconfig=x86_64-rhel-7.2 
> branch=linux-devel/devel-hourly-2017022612 
> commit=ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f 
> BOOT_IMAGE=/pkg/linux/x86_64-rhel-7.2/gcc-6/ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f/vmlinuz-4.9.0-rc6-00134-ged3ce2a
>  max_uptime=1500 
> RESULT_ROOT=/result/will-it-scale/poll2-performance/lkp-ivb-d01/debian-x86_64-2016-08-31.cgz/x86_64-rhel-7.2/gcc-6/ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f/11
>  LKP_SERVER=inn debug apic=debug sysrq_always_enabled 
> rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on panic=-1 
> softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 
> prompt_ramdisk=0 drbd.minor_count=8 systemd.log_level=err ignore_
>  [  337.895521] 
>  [  339.467661] BUG: unable to handle kernel paging request at 
> 8803cf2e2008
>  [  339.468000] IP: [] native_set_pmd+0x1/0x10
>  ...
>
>
>Maybe Fengguang has an idea what to do here, maybe something like add
>markers to the log to denote where the test environment is prepared and
>when the actual test starts. Then grep for those and generate the report
>based on that...

Thanks for the suggestions. We'll keep improving the reports to avoid confusing
or misleading results.

>
>Anyway, the diff is below, please try that ontop of tip's x86/asm branch
>which already has the clear_page patch:
>
>http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/log/?h=x86/asm
>
>Thanks!

Hmm, I've checked out tip's x86/asm branch (HEAD is f25d38475 "x86/asm:
Optimize clear_page()"), but I failed to apply your diff on top of it (error
log below). Could you provide a tree/branch which contains your fix? It would
be much easier for 0day to catch and test.

error: patch failed: arch/x86/include/asm/alternative.h:227
error: arch/x86/include/asm/alternative.h: patch does not apply
error: patch failed: arch/x86/include/asm/page_64.h:41
error: arch/x86/include/asm/page_64.h: patch does not apply


Thanks,
Xiaolong
>
>---
> arch/x86/include/asm/alternative.h | 17 -
> arch/x86/include/asm/page_64.h | 11 ++-
> 2 files changed, 6 insertions(+), 22 deletions(-)
>
>diff --git a/arch/x86/include/asm/alternative.h 
>b/arch/x86/include/asm/alternative.h
>index 12e3d8d607a9..1b020381ab38 100644
>--- a/arch/x86/include/asm/alternative.h
>+++ b/arch/x86/include/asm/alternative.h
>@@ -227,23 +227,6 @@ static inline int alternatives_text_reserved(void *start, 
>void *end)
> }
> 
> /*
>- * Like alternative_call(), but there are two features and respective 
>functions.
>- * If CPU has feature2, function2 is used.
>- * Otherwise, if CPU has feature1, function1 is used.
>- * Otherwise, old function is used.
>- */
>-#define alternative_void_call_2(oldfunc, newfunc1, feature1, newfunc2,
>\
>-  feature2, input...) 
>\
>-{ 
>\
>-  register void *__sp asm(_ASM_SP);   
>\
>-  asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,  
>\
>-  "call %P[new2]", feature2)  
>\
>-  : "+r" (__sp)   
>\
>-  : [old] "i" (oldfunc), [new1] "i" (newfunc1),   
>\
>-[new2] "i" (newfunc2), ## input); 
>\
>-}
>-
>-/*
>  * use this macro(s) if you need more than one output parameter
>  * 

[PATCH] block, writeback: wait for writeback to finish before detaching wb

2017-03-08 Thread Tahsin Erdogan
__blkdev_put() could surprise writeback thread by detaching the
wb object from an inode that hasn't cleared the I_SYNC flag yet.
This causes a NULL pointer dereference as seen below:

  BUG: unable to handle kernel NULL pointer dereference at (null)
  IP: locked_inode_to_wb_and_lock_list+0x38/0x440
  PGD 0
  Oops:  [#1] SMP
  CPU: 0 PID: 34 Comm: kworker/u8:1 Not tainted 4.11.0-rc1+ #202
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Workqueue: writeback wb_workfn (flush-8:16)
  task: 88013aa780c0 task.stack: c912c000
  RIP: 0010:locked_inode_to_wb_and_lock_list+0x38/0x440
  RSP: 0018:c912fb70 EFLAGS: 00010202
  RAX: 0001 RBX:  RCX: 0018
  RDX: 88013aa780c0 RSI: 880139a478f8 RDI: 88013aa788b8
  RBP: c912fba0 R08: 0001 R09: 
  R10: 969da8e2 R11:  R12: 880139a47858
  R13: 880139a478e0 R14: 880139a478f8 R15: 8801371f4058
  FS:  () GS:88013ae0() knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2:  CR3: 01012000 CR4: 06f0
  Call Trace:
   writeback_sb_inodes+0x3e1/0x7a0
   __writeback_inodes_wb+0x87/0xc0
   wb_writeback+0x2e7/0x5c0
   wb_workfn+0x2d1/0x9c0
   process_one_work+0x1d3/0x620
   worker_thread+0x126/0x4a0
   kthread+0x10a/0x140
   ret_from_fork+0x2e/0x40
  RIP: locked_inode_to_wb_and_lock_list+0x38/0x440 RSP: c912fb70
  CR2: 
  ---[ end trace e0ea8a2695f4c86c ]---

Make __blkdev_put() wait for the I_SYNC flag to clear before detaching
wb.

Fixes: 43d1c0eb7e11 ("block: detach bdev inode from its wb in __blkdev_put()")
Signed-off-by: Tahsin Erdogan 
---
 fs/block_dev.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 2eca00ec4370..70fb82fcedd0 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -95,7 +95,7 @@ void kill_bdev(struct block_device *bdev)
 
invalidate_bh_lrus();
truncate_inode_pages(mapping, 0);
-}  
+}
 EXPORT_SYMBOL(kill_bdev);
 
 /* Invalidate clean unused buffers and pagecache. */
@@ -617,13 +617,13 @@ static loff_t block_llseek(struct file *file, loff_t 
offset, int whence)
inode_unlock(bd_inode);
return retval;
 }
-   
+
 int blkdev_fsync(struct file *filp, loff_t start, loff_t end, int datasync)
 {
struct inode *bd_inode = bdev_file_inode(filp);
struct block_device *bdev = I_BDEV(bd_inode);
int error;
-   
+
error = filemap_write_and_wait_range(filp->f_mapping, start, end);
if (error)
return error;
@@ -1038,7 +1038,7 @@ void bdput(struct block_device *bdev)
 }
 
 EXPORT_SYMBOL(bdput);
- 
+
 static struct block_device *bd_acquire(struct inode *inode)
 {
struct block_device *bdev;
@@ -1880,7 +1880,10 @@ static void __blkdev_put(struct block_device *bdev, 
fmode_t mode, int for_part)
 * Detaching bdev inode from its wb in __destroy_inode()
 * is too late: the queue which embeds its bdi (along with
 * root wb) can be gone as soon as we put_disk() below.
+* Before detaching wb, wait for any writeback activity for
+* inode to settle.
 */
+   inode_wait_for_writeback(bdev->bd_inode);
inode_detach_wb(bdev->bd_inode);
}
if (bdev->bd_contains == bdev) {
-- 
2.12.0.246.ga2ecc84866-goog



Re: Race condition in ext4 (was Re: 4.11-rc1 acpi stomping ext4 slabs)

2017-03-08 Thread Theodore Ts'o
On Tue, Mar 07, 2017 at 10:40:53PM +0200, Nikolay Borisov wrote:
> So this is wrong, the reason why the issues seemed fix is because I
> switched my compiler to version 5.4.0. So this manifests only if I'm
> using gcc 4.7.4. With the pr_info added here is the output of a boot. So
> there are multiple invocations of ext4_ext_map_blocks and the freeing,
> including with the address being used in subsequent kasan reports :
> 88006ae8fdb0

Can you help bisect this, then?  I'm using Debian Testing, and the
default gcc is gcc 6.3.0.  I'm currently forcing the use of gcc 5.4.1
because I was running into problems with gcc 6.x a while back.  (TBH,
I was thinking about trying to see if gcc 6.3 was stable for kernel
compiles when I had some spare time.)  But I don't have access to
*any* gcc 4.x on my development system, and I don't think I've tried
using gcc 4.x in a long, Long, LONG time.

I'm currently kicking off a test run using 5.4.1 with KASAN enabled to
see if I can trigger it myself.  Can you send me a copy of your
.config so I can see what else might be interesting with your config?
(e.g., SLAB vs SLUB, etc.)

Thanks,

   - Ted


Re: [PATCH] lib/idr: trivial: Fix typo in file header

2017-03-08 Thread Cao jin
It seems this was missed. CCing more people.

On 11/30/2016 03:11 PM, Cao jin wrote:
> Signed-off-by: Cao jin 
> ---
>  lib/idr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/idr.c b/lib/idr.c
> index 6098336df267..69fa487dbfda 100644
> --- a/lib/idr.c
> +++ b/lib/idr.c
> @@ -14,7 +14,7 @@
>   * by the id to obtain the pointer.  The bitmap makes allocating
>   * a new id quick.
>   *
> - * You call it to allocate an id (an int) an associate with that id a
> + * You call it to allocate an id (an int) and associate with that id a
>   * pointer or what ever, we treat it as a (void *).  You can pass this
>   * id to a user for him to pass back at a later time.  You then pass
>   * that id to this code and it returns your pointer.
> 

-- 
Sincerely,
Cao jin




Re: [PATCH 00/13] rtc: Add OF device table to I2C drivers that are missing it

2017-03-08 Thread Alexandre Belloni
On 03/03/2017 at 11:29:11 -0300, Javier Martinez Canillas wrote:
> Hello,
> 
> This series adds OF device ID tables to RTC I2C drivers whose devices are
> either used in Device Tree source files or are listed in binding docs as
> a compatible string.
> 
> That's done because the plan is to change the I2C core to report proper OF
> modaliases instead of always reporting a MODALIAS=i2c: regardless if
> a device was registered via DT or using the legacy platform data.
> 
> So these patches will make sure that RTC I2C drivers modules will continue
> to be autoloaded once the I2C core is changed to report proper OF modalias.
> 
> Best regards,
> Javier
> 
> 
> Javier Martinez Canillas (13):
>   rtc: rv8803: Add OF device ID table
>   rtc: rv3029: Add OF device ID table
>   rtc: bq32k: Add OF device ID table
>   rtc: ds1307: Add OF device ID table
>   rtc: rx8010: Add OF device ID table
>   rtc: ds3232: Add OF device ID table
>   rtc: rtc-ds1672: Add OF device ID table
>   rtc: ds1374: Set .of_match_table to OF device ID table
>   rtc: isl1208: Add OF device ID table
>   rtc: s35390a: Add OF device ID table
>   rtc: rx8581: Add OF device ID table
>   rtc: m41t80: Add OF device ID table
>   rtc: rs5c372: Add OF device ID table
> 

All applied, thanks!

>  drivers/rtc/rtc-bq32k.c|  7 +
>  drivers/rtc/rtc-ds1307.c   | 68 +-
>  drivers/rtc/rtc-ds1374.c   |  1 +
>  drivers/rtc/rtc-ds1672.c   |  9 +-
>  drivers/rtc/rtc-ds3232.c   |  7 +
>  drivers/rtc/rtc-isl1208.c  | 12 ++--
>  drivers/rtc/rtc-m41t80.c   | 63 --
>  drivers/rtc/rtc-rs5c372.c  | 37 -
>  drivers/rtc/rtc-rv3029c2.c |  9 ++
>  drivers/rtc/rtc-rv8803.c   | 21 +-
>  drivers/rtc/rtc-rx8010.c   |  7 +
>  drivers/rtc/rtc-rx8581.c   |  7 +
>  drivers/rtc/rtc-s35390a.c  |  8 ++
>  13 files changed, 248 insertions(+), 8 deletions(-)
> 
> -- 
> 2.9.3
> 

-- 
Alexandre Belloni, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Re: [PATCH net] dccp/tcp: fix routing redirect race

2017-03-08 Thread Jonathan Maxwell
Sorry let me resend in plain text mode.

On Thu, Mar 9, 2017 at 1:10 PM, Eric Dumazet  wrote:
> On Thu, 2017-03-09 at 12:15 +1100, Jon Maxwell wrote:
>> We have seen a few incidents lately where a dst_enty has been freed
>> with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
>> dst_entry. If the conditions/timings are right a crash then ensues when the
>> freed dst_entry is referenced later on. A Common crashing back trace is:
>
> Very nice catch !
>

Thanks Eric.

> Don't we have a similar issue for IPv6 ?
>
>

Good point.

We checked and as far as we can tell IPv6 does not invalidate the route.
So it should be safer.


Re: [PATCH v3 04/09] iommu/ipmmu-vmsa: Make use of IOMMU_OF_DECLARE()

2017-03-08 Thread Magnus Damm
Hi Geert,

On Wed, Mar 8, 2017 at 10:52 PM, Geert Uytterhoeven
 wrote:
> Hi Magnus,
>
> On Wed, Mar 8, 2017 at 12:02 PM, Magnus Damm  wrote:
>> From: Magnus Damm 
>>
>> Hook up IOMMU_OF_DECLARE() support in case CONFIG_IOMMU_DMA
>> is enabled. The only current supported case for 32-bit ARM
>> is disabled, however for 64-bit ARM usage of OF is required.
>>
>> Signed-off-by: Magnus Damm 
> While I'm not such a big fan of *_OF_DECLARE() (it doesn't support deferred
> probing, which is needed for any device with dependencies, like clocks and
> power domains), what's the rationale for not using IOMMU_OF_DECLARE()
> on arm32, and thus the need for setup_done?

ARM32 could (and should) be converted over to IOMMU_OF_DECLARE(), but
it is just a matter of timing. If we try to do it before ARM32 is
converted over to CONFIG_IOMMU_DMA=y then we have to handle all the
hairy legacy implementation details of IOMMU support in case of
CONFIG_IOMMU_DMA=n _and_ deal with just moving over the init order
bits to OF. Testing and keeping all the combinations working is a lot
of work.

I prefer to kill two birds with one stone and do a larger feature jump
and move over ARM32 to same state of ARM64 (with OF init) once
CONFIG_IOMMU_DMA=y is ready for 32-bit ARM. Just changing the init
order bits to OF while keeping legacy CONFIG_IOMMU_DMA=n code is
introducing potential errors with not much upside. Unless there is
some other reason to do it that I can't see that is. =)

Cheers,

/ magnus


Re: RCU used on incoming CPU before rcu_cpu_starting() called

2017-03-08 Thread Frederic Weisbecker
On Wed, Mar 08, 2017 at 03:41:52PM -0800, Paul E. McKenney wrote:
> On Wed, Mar 08, 2017 at 02:16:56PM -0800, Paul E. McKenney wrote:
> > Hello!
> > 
> > I am seeing the following splat in rcutorture testing of v4.11-rc1:
> > 
> > [   30.694013] =
> > [   30.694013] WARNING: suspicious RCU usage
> > [   30.694013] 4.11.0-rc1+ #1 Not tainted
> > [   30.694013] -
> > [   30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or 
> > wq_pool_mutex should be held!
> > [   30.694013] 
> > [   30.694013] other info that might help us debug this:
> > [   30.694013] 
> > [   30.694013] 
> > [   30.694013] RCU used illegally from offline CPU!
> > [   30.694013] rcu_scheduler_active = 2, debug_locks = 0
> > [   30.694013] no locks held by swapper/1/0.
> > [   30.694013] 
> > [   30.694013] stack backtrace:
> > [   30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
> > [   30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> > Bochs 01/01/2011
> > [   30.694013] Call Trace:
> > [   30.694013]  dump_stack+0x67/0x99
> > [   30.694013]  lockdep_rcu_suspicious+0xe7/0x120
> > [   30.694013]  get_work_pool+0x82/0x90
> > [   30.694013]  __queue_work+0x70/0x5f0
> > [   30.694013]  queue_work_on+0x33/0x70
> > [   30.694013]  clear_sched_clock_stable+0x33/0x40
> > [   30.694013]  early_init_intel+0xe7/0x2f0
> > [   30.694013]  init_intel+0x11/0x350
> > [   30.694013]  identify_cpu+0x344/0x5a0
> > [   30.694013]  identify_secondary_cpu+0x18/0x80
> > [   30.694013]  smp_store_cpu_info+0x39/0x40
> > [   30.694013]  start_secondary+0x4e/0x100
> > [   30.694013]  start_cpu+0x14/0x14
> > 
> > Here is the relevant code from x86's smp_callin():
> > 
> > /*
> >  * Save our processor parameters. Note: this information
> >  * is needed for clock calibration.
> >  */
> > smp_store_cpu_info(cpuid);
> > 
> > /*
> >  * Get our bogomips.
> >  * Update loops_per_jiffy in cpu_data. Previous call to
> >  * smp_store_cpu_info() stored a value that is close but not as
> >  * accurate as the value just calculated.
> >  */
> > calibrate_delay();
> > cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
> > pr_debug("Stack at about %p\n", &cpuid);
> > 
> > /*
> >  * This must be done before setting cpu_online_mask
> >  * or calling notify_cpu_starting.
> >  */
> > set_cpu_sibling_map(raw_smp_processor_id());
> > wmb();
> > 
> > notify_cpu_starting(cpuid);
> > 
> > The problem is that smp_store_cpu_info() indirectly invokes
> > schedule_work(), which wants to use RCU.  But RCU isn't informed
> > of the incoming CPU until the call to notify_cpu_starting(), which
> > causes lockdep to complain bitterly about the use of RCU by the
> > premature call to schedule_work().
> > 
> > I considered just moving the notify_cpu_starting() earlier in the
> > sequence, but the comments make it seem like this would not be
> > a wise choice.
> > 
> > Any suggestions?

Calling schedule_work() from an offline (booting) CPU doesn't sound like a good
idea in the first place. And neither is it a good idea to allow using
RCU on a CPU that is not yet online.

Perhaps we could delay this sched clock stability check to a later
stage in the secondary CPU boot-up code? Once the CPU is online
and RCU is initialized? For example it could be a CPU_ONLINE hotplug
callback.
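
A minimal user-space sketch of the deferral idea above, not kernel code: the early boot path only records that the clock was found unstable, and a later online callback queues the work. Every name here (fake_cpu, fake_schedule_work, early_identify, cpu_online_callback) is hypothetical and stands in for no real kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state; not a real kernel structure. */
struct fake_cpu {
	bool online;          /* set once the CPU is fully brought up */
	bool clock_unstable;  /* noted early, acted on later */
	int  work_queued;     /* counts how often "work" was queued */
};

/* Stand-in for schedule_work(): refused while the CPU is offline. */
static bool fake_schedule_work(struct fake_cpu *cpu)
{
	assert(cpu != NULL);

	if (!cpu->online)
		return false;	/* the case lockdep complained about */

	cpu->work_queued++;
	return true;
}

/* Early identification: record the condition, queue nothing. */
static void early_identify(struct fake_cpu *cpu, bool clock_unreliable)
{
	assert(cpu != NULL);

	if (clock_unreliable)
		cpu->clock_unstable = true;
}

/* Online callback: the CPU is known to RCU, so queueing is safe. */
static void cpu_online_callback(struct fake_cpu *cpu)
{
	assert(cpu != NULL);

	cpu->online = true;
	if (cpu->clock_unstable)
		fake_schedule_work(cpu);
}
```

The point of the split is only the ordering: nothing that needs RCU runs until the online step has happened.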


Re: [PATCH v3 02/09] iommu/ipmmu-vmsa: Add optional root device feature

2017-03-08 Thread Magnus Damm
Hi Geert,

On Wed, Mar 8, 2017 at 10:47 PM, Geert Uytterhoeven
 wrote:
> Hi Magnus,
>
> On Wed, Mar 8, 2017 at 12:01 PM, Magnus Damm  wrote:
>> From: Magnus Damm 
>>
>> Add root device handling to the IPMMU driver by allowing certain
>> DT compat strings to enable has_cache_leaf_nodes that in turn will
>> support both root devices with interrupts and leaf devices that
>> face the actual IPMMU consumer devices.
>>
>> Signed-off-by: Magnus Damm 
>
>> --- 0011/drivers/iommu/ipmmu-vmsa.c
>> +++ work/drivers/iommu/ipmmu-vmsa.c 2017-03-08 17:56:51.770607110 +0900
>
>> @@ -216,6 +219,44 @@ static void set_archdata(struct device *
>>  #define IMUASID_ASID0_SHIFT 0
>>
>>  /* 
>> -
>> + * Root device handling
>> + */
>> +
>> +static bool ipmmu_is_root(struct ipmmu_vmsa_device *mmu)
>> +{
>> +   if (mmu->features->has_cache_leaf_nodes)
>> +   return mmu->is_leaf ? false : true;
>
> Expressions using the ternary operator are sometimes hard to read.
> In this case, you want negation, so why not use that?
>
> return !mmu->is_leaf;
>
>> +   else
>
> I'd drop the else.

Yeah, your suggestion makes the code easier to read. Will fix.

>> +   return true; /* older IPMMU hardware treated as single root 
>> */
>> +}
>> +
>> +static struct ipmmu_vmsa_device *__ipmmu_find_root(void)
>> +{
>> +   struct ipmmu_vmsa_device *mmu;
>> +   bool found = false;
>
> struct ipmmu_vmsa_device *root = NULL;

I used to have it initialized to NULL and not use any found variable
and only return the variable. But then I ran into the error case when
devices exist on the ipmmu_devices list however none of them are root.
I returned the last one on the list regardless if they were root or
not. So I updated the code to use the found variable, and because of
that I thought I could simply drop the NULL assignment.

>> +
>> +   spin_lock(&ipmmu_devices_lock);
>> +
>> +   list_for_each_entry(mmu, &ipmmu_devices, list) {
>> +   if (ipmmu_is_root(mmu)) {
>> +   found = true;
>
> root = mmu;
>
>> +   break;
>> +   }
>> +   }
>> +
>> +   spin_unlock(&ipmmu_devices_lock);
>> +   return found ? mmu : NULL;
>
> return root;

I agree it makes sense to use root as variable name, will fix. Not
sure about the NULL assignment though, can you enlighten me?

Cheers,

/ magnus
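
On the NULL assignment question, a minimal sketch of the pattern Geert suggests, using a plain array instead of the kernel list so that it stands alone. With list_for_each_entry() the cursor is not NULL when the loop runs to completion, so a separate result pointer that starts as NULL and is set only on a match gives the "no root found" case for free. The fake_* names are hypothetical simplifications of the driver's types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct ipmmu_vmsa_device. */
struct fake_mmu {
	bool has_cache_leaf_nodes;
	bool is_leaf;
};

/* Older hardware (no leaf nodes) is treated as a single root. */
static bool fake_is_root(const struct fake_mmu *mmu)
{
	if (mmu->has_cache_leaf_nodes)
		return !mmu->is_leaf;

	return true;
}

/* Returns the first root device, or NULL if none is a root. */
static struct fake_mmu *fake_find_root(struct fake_mmu *devs, size_t n)
{
	struct fake_mmu *root = NULL;	/* stays NULL unless a match is found */
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (fake_is_root(&devs[i])) {
			root = &devs[i];
			break;
		}
	}

	return root;
}
```

The cursor is never returned, so a list that contains only leaf devices yields NULL without needing a separate found flag.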


Re: RCU used on incoming CPU before rcu_cpu_starting() called

2017-03-08 Thread Paul E. McKenney
On Thu, Mar 09, 2017 at 04:55:29AM +0100, Frederic Weisbecker wrote:
> On Wed, Mar 08, 2017 at 03:41:52PM -0800, Paul E. McKenney wrote:
> > On Wed, Mar 08, 2017 at 02:16:56PM -0800, Paul E. McKenney wrote:
> > > Hello!
> > > 
> > > I am seeing the following splat in rcutorture testing of v4.11-rc1:
> > > 
> > > [   30.694013] =
> > > [   30.694013] WARNING: suspicious RCU usage
> > > [   30.694013] 4.11.0-rc1+ #1 Not tainted
> > > [   30.694013] -
> > > [   30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU 
> > > or wq_pool_mutex should be held!
> > > [   30.694013] 
> > > [   30.694013] other info that might help us debug this:
> > > [   30.694013] 
> > > [   30.694013] 
> > > [   30.694013] RCU used illegally from offline CPU!
> > > [   30.694013] rcu_scheduler_active = 2, debug_locks = 0
> > > [   30.694013] no locks held by swapper/1/0.
> > > [   30.694013] 
> > > [   30.694013] stack backtrace:
> > > [   30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
> > > [   30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> > > BIOS Bochs 01/01/2011
> > > [   30.694013] Call Trace:
> > > [   30.694013]  dump_stack+0x67/0x99
> > > [   30.694013]  lockdep_rcu_suspicious+0xe7/0x120
> > > [   30.694013]  get_work_pool+0x82/0x90
> > > [   30.694013]  __queue_work+0x70/0x5f0
> > > [   30.694013]  queue_work_on+0x33/0x70
> > > [   30.694013]  clear_sched_clock_stable+0x33/0x40
> > > [   30.694013]  early_init_intel+0xe7/0x2f0
> > > [   30.694013]  init_intel+0x11/0x350
> > > [   30.694013]  identify_cpu+0x344/0x5a0
> > > [   30.694013]  identify_secondary_cpu+0x18/0x80
> > > [   30.694013]  smp_store_cpu_info+0x39/0x40
> > > [   30.694013]  start_secondary+0x4e/0x100
> > > [   30.694013]  start_cpu+0x14/0x14
> > > 
> > > Here is the relevant code from x86's smp_callin():
> > > 
> > >   /*
> > >* Save our processor parameters. Note: this information
> > >* is needed for clock calibration.
> > >*/
> > >   smp_store_cpu_info(cpuid);
> > > 
> > >   /*
> > >* Get our bogomips.
> > >* Update loops_per_jiffy in cpu_data. Previous call to
> > >* smp_store_cpu_info() stored a value that is close but not as
> > >* accurate as the value just calculated.
> > >*/
> > >   calibrate_delay();
> > >   cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
> > >   pr_debug("Stack at about %p\n", &cpuid);
> > > 
> > >   /*
> > >* This must be done before setting cpu_online_mask
> > >* or calling notify_cpu_starting.
> > >*/
> > >   set_cpu_sibling_map(raw_smp_processor_id());
> > >   wmb();
> > > 
> > >   notify_cpu_starting(cpuid);
> > > 
> > > The problem is that smp_store_cpu_info() indirectly invokes
> > > schedule_work(), which wants to use RCU.  But RCU isn't informed
> > > of the incoming CPU until the call to notify_cpu_starting(), which
> > > causes lockdep to complain bitterly about the use of RCU by the
> > > premature call to schedule_work().
> > > 
> > > I considered just moving the notify_cpu_starting() earlier in the
> > > sequence, but the comments make it seem like this would not be
> > > a wise choice.
> > > 
> > > Any suggestions?
> 
> Calling schedule_work() from an offline (booting) CPU doesn't sound like a 
> good
> idea in the first place. And neither is it a good idea to allow using
> RCU on a CPU that is not yet online.

Fair point, though it is only RCU readers that are in use, so no
RCU core code would be running, at least not until interrupts are
enabled and so on.  But I needed the patch to get this out of the
way of my rcutorture testing, so it is serving a purpose whether or
not it eventually hits mainline.

> Perhaps we could delay this sched clock stability check to a later
> stage in the secondary CPU boot-up code? Once the CPU is online
> and RCU is initialized? For example it could be a CPU_ONLINE hotplug
> callback.

In theory, this does seem like a cleaner solution to me.  In practice,
I must defer to those who know the code better than I do.

Thanx, Paul



Re: [PATCH] platform/x86: dell-laptop: Handle return error from dell_get_intensity.

2017-03-08 Thread Arvind Yadav

Hi,
Yes, you are right. I will handle the returned error correctly.

Thanks
-Arvind

On Wednesday 08 March 2017 06:24 PM, Pali Rohár wrote:

Hi!

On Wednesday 08 March 2017 17:52:27 Arvind Yadav wrote:

Here, dell_get_intensity can return an error.

Right. That is true and we should check for errors.


So we can assign props.brightness as max_brightness.

But why set it to max_brightness? That seems to be incorrect error
handling too...


This change is done using Coccinelle.

Signed-off-by: Arvind Yadav 
---
  drivers/platform/x86/dell-laptop.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/platform/x86/dell-laptop.c 
b/drivers/platform/x86/dell-laptop.c
index f57dd28..0891de3 100644
--- a/drivers/platform/x86/dell-laptop.c
+++ b/drivers/platform/x86/dell-laptop.c
@@ -2053,6 +2053,9 @@ static int __init dell_init(void)
  
  		dell_backlight_device->props.brightness =

dell_get_intensity(dell_backlight_device);
+   if (dell_backlight_device->props.brightness < 0) {
+   dell_backlight_device->props.brightness = 
props.max_brightness;
+   }
backlight_update_status(dell_backlight_device);
}
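
One possible reading of "handle the error correctly", as a stand-alone sketch and not the eventual patch: read the intensity into a temporary and only store it when the call succeeded, propagating the error otherwise. The fake_backlight structure and both functions are hypothetical; only the idea of a getter returning a level or a negative errno comes from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down backlight state. */
struct fake_backlight {
	int brightness;
	int max_brightness;
	int hw_error;	/* non-zero makes the read fail */
	int hw_level;
};

/* Stand-in for a getter that returns a level or a negative errno. */
static int fake_get_intensity(const struct fake_backlight *bl)
{
	if (bl->hw_error)
		return -EIO;

	return bl->hw_level;
}

/* Read into a temporary; commit the value only on success. */
static int fake_init_brightness(struct fake_backlight *bl)
{
	int ret;

	assert(bl != NULL);

	ret = fake_get_intensity(bl);
	if (ret < 0)
		return ret;	/* propagate instead of guessing a level */

	bl->brightness = ret;
	return 0;
}
```

Whether the caller should then abort initialisation or carry on without an initial brightness is a policy decision this sketch leaves open.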
  




[PATCH v2] ARM: socfpga: updates for socfpga_defconfig

2017-03-08 Thread ho . jia . jie
From: Jia Jie Ho 

This patch enables Altera TSE support in socfpga_defconfig

Signed-off-by: Jia Jie Ho 
---
v2:
 * Adding the TSE support as a module for Arria10

 arch/arm/configs/socfpga_defconfig |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/arm/configs/socfpga_defconfig 
b/arch/arm/configs/socfpga_defconfig
index 030264c..2620ce7 100644
--- a/arch/arm/configs/socfpga_defconfig
+++ b/arch/arm/configs/socfpga_defconfig
@@ -71,6 +71,7 @@ CONFIG_SCSI=y
 CONFIG_BLK_DEV_SD=y
 # CONFIG_SCSI_LOWLEVEL is not set
 CONFIG_NETDEVICES=y
+CONFIG_ALTERA_TSE=m
 CONFIG_E1000E=m
 CONFIG_IGB=m
 CONFIG_IXGBE=m
-- 
1.7.7.4



Re: [PATCH] net: toshiba: spider_net: use new api ethtool_{get|set}_link_ksettings

2017-03-08 Thread David Miller
From: Philippe Reynes 
Date: Sun,  5 Mar 2017 23:46:00 +0100

> The ethtool api {get|set}_settings is deprecated.
> We move this driver to new api {get|set}_link_ksettings.
> 
> As I don't have the hardware, I'd be very pleased if
> someone may test this patch.
> 
> Signed-off-by: Philippe Reynes 

Applied.


[PATCH 1/6] mm/migrate: Add new mode parameter to migrate_page_copy() function

2017-03-08 Thread Anshuman Khandual
From: Zi Yan 

This is a prerequisite change required to make page migration framework
copy in different modes like the default single threaded or the new
multi threaded one yet to be introduced in follow up patches. This
does not change any existing functionality. Only migrate_page_copy()
and copy_huge_page() function's signatures are affected.

Signed-off-by: Zi Yan 
Signed-off-by: Anshuman Khandual 
---
* Updated include/linux/migrate_mode.h comment as per Naoya

 fs/aio.c |  2 +-
 fs/f2fs/data.c   |  2 +-
 fs/hugetlbfs/inode.c |  2 +-
 fs/ubifs/file.c  |  2 +-
 include/linux/migrate.h  |  6 --
 include/linux/migrate_mode.h |  2 ++
 mm/migrate.c | 14 --
 7 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 873b4ca..ba3f6eb 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -418,7 +418,7 @@ static int aio_migratepage(struct address_space *mapping, 
struct page *new,
 * events from being lost.
 */
spin_lock_irqsave(&ctx->completion_lock, flags);
-   migrate_page_copy(new, old);
+   migrate_page_copy(new, old, MIGRATE_ST);
BUG_ON(ctx->ring_pages[idx] != old);
ctx->ring_pages[idx] = new;
spin_unlock_irqrestore(&ctx->completion_lock, flags);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 9ac2625..ad41356 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1997,7 +1997,7 @@ int f2fs_migrate_page(struct address_space *mapping,
SetPagePrivate(newpage);
set_page_private(newpage, page_private(page));
 
-   migrate_page_copy(newpage, page);
+   migrate_page_copy(newpage, page, MIGRATE_ST);
 
return MIGRATEPAGE_SUCCESS;
 }
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 54de77e..0e16512f 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -850,7 +850,7 @@ static int hugetlbfs_migrate_page(struct address_space 
*mapping,
rc = migrate_huge_page_move_mapping(mapping, newpage, page);
if (rc != MIGRATEPAGE_SUCCESS)
return rc;
-   migrate_page_copy(newpage, page);
+   migrate_page_copy(newpage, page, MIGRATE_ST);
 
return MIGRATEPAGE_SUCCESS;
 }
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index b0d7837..293616f 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1482,7 +1482,7 @@ static int ubifs_migrate_page(struct address_space 
*mapping,
SetPagePrivate(newpage);
}
 
-   migrate_page_copy(newpage, page);
+   migrate_page_copy(newpage, page, MIGRATE_ST);
return MIGRATEPAGE_SUCCESS;
 }
 #endif
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index ae8d475..d843b8f 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -42,7 +42,8 @@ extern void putback_movable_page(struct page *page);
 
 extern int migrate_prep(void);
 extern int migrate_prep_local(void);
-extern void migrate_page_copy(struct page *newpage, struct page *page);
+extern void migrate_page_copy(struct page *newpage, struct page *page,
+   enum migrate_mode mode);
 extern int migrate_huge_page_move_mapping(struct address_space *mapping,
  struct page *newpage, struct page *page);
 extern int migrate_page_move_mapping(struct address_space *mapping,
@@ -61,7 +62,8 @@ static inline int migrate_prep(void) { return -ENOSYS; }
 static inline int migrate_prep_local(void) { return -ENOSYS; }
 
 static inline void migrate_page_copy(struct page *newpage,
-struct page *page) {}
+struct page *page,
+enum migrate_mode mode) {}
 
 static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
  struct page *newpage, struct page *page)
diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index ebf3d89..deaeba5 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -6,11 +6,13 @@
  * on most operations but not ->writepage as the potential stall time
  * is too significant
  * MIGRATE_SYNC will block when migrating pages
+ * MIGRATE_ST will use single thread when migrating pages
  */
 enum migrate_mode {
MIGRATE_ASYNC,
MIGRATE_SYNC_LIGHT,
MIGRATE_SYNC,
+   MIGRATE_ST
 };
 
 #endif /* MIGRATE_MODE_H_INCLUDED */
diff --git a/mm/migrate.c b/mm/migrate.c
index e8ae11a..5ef4aa4 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -631,7 +631,8 @@ static void __copy_gigantic_page(struct page *dst, struct 
page *src,
}
 }
 
-static void copy_huge_page(struct page *dst, struct page *src)
+static void copy_huge_page(struct page *dst, struct page *src,
+   enum migrate_mode mode)
 {
int i;
int nr_pages;
@@ -660,12 +661,13 @@ 

[RESEND PATCH v3 4/7] PCI: dwc: all: Modify dbi accessors to take dbi_base as argument

2017-03-08 Thread Kishon Vijay Abraham I
dwc has 2 dbi address spaces labeled dbics and dbics2. The existing
helper to access dbi address space can access only dbics. However
dbics2 has to be accessed for programming the BAR registers in the
case of EP mode. This is in preparation for adding EP mode support
to dwc driver.

Cc: Jingoo Han 
Cc: Richard Zhu 
Cc: Lucas Stach 
Cc: Murali Karicheri 
Cc: Thomas Petazzoni 
Cc: Niklas Cassel 
Cc: Jesper Nilsson 
Cc: Joao Pinto 
Cc: Zhou Wang 
Cc: Gabriele Paoloni 
Acked-by: Joao Pinto 
Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c   |   10 +++--
 drivers/pci/dwc/pci-exynos.c   |   10 +++--
 drivers/pci/dwc/pci-imx6.c |   62 +++---
 drivers/pci/dwc/pci-keystone-dw.c  |   15 ---
 drivers/pci/dwc/pcie-armada8k.c|   39 +---
 drivers/pci/dwc/pcie-artpec6.c |7 +--
 drivers/pci/dwc/pcie-designware-host.c |   20 +
 drivers/pci/dwc/pcie-designware.c  |   76 ++--
 drivers/pci/dwc/pcie-designware.h  |   10 +++--
 drivers/pci/dwc/pcie-hisi.c|   17 ---
 10 files changed, 152 insertions(+), 114 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index 07c45ec..3708bd6 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -495,12 +495,13 @@ static int dra7xx_pcie_suspend(struct device *dev)
 {
struct dra7xx_pcie *dra7xx = dev_get_drvdata(dev);
struct dw_pcie *pci = dra7xx->pci;
+   void __iomem *base = pci->dbi_base;
u32 val;
 
/* clear MSE */
-   val = dw_pcie_readl_dbi(pci, PCI_COMMAND);
+   val = dw_pcie_readl_dbi(pci, base, PCI_COMMAND);
val &= ~PCI_COMMAND_MEMORY;
-   dw_pcie_writel_dbi(pci, PCI_COMMAND, val);
+   dw_pcie_writel_dbi(pci, base, PCI_COMMAND, val);
 
return 0;
 }
@@ -509,12 +510,13 @@ static int dra7xx_pcie_resume(struct device *dev)
 {
struct dra7xx_pcie *dra7xx = dev_get_drvdata(dev);
struct dw_pcie *pci = dra7xx->pci;
+   void __iomem *base = pci->dbi_base;
u32 val;
 
/* set MSE */
-   val = dw_pcie_readl_dbi(pci, PCI_COMMAND);
+   val = dw_pcie_readl_dbi(pci, base, PCI_COMMAND);
val |= PCI_COMMAND_MEMORY;
-   dw_pcie_writel_dbi(pci, PCI_COMMAND, val);
+   dw_pcie_writel_dbi(pci, base, PCI_COMMAND, val);
 
return 0;
 }
diff --git a/drivers/pci/dwc/pci-exynos.c b/drivers/pci/dwc/pci-exynos.c
index 993b650..a0d40f7 100644
--- a/drivers/pci/dwc/pci-exynos.c
+++ b/drivers/pci/dwc/pci-exynos.c
@@ -521,23 +521,25 @@ static void exynos_pcie_enable_interrupts(struct 
exynos_pcie *ep)
exynos_pcie_msi_init(ep);
 }
 
-static u32 exynos_pcie_readl_dbi(struct dw_pcie *pci, u32 reg)
+static u32 exynos_pcie_readl_dbi(struct dw_pcie *pci, void __iomem *base,
+u32 reg)
 {
struct exynos_pcie *ep = to_exynos_pcie(pci);
u32 val;
 
exynos_pcie_sideband_dbi_r_mode(ep, true);
-   val = readl(pci->dbi_base + reg);
+   val = readl(base + reg);
exynos_pcie_sideband_dbi_r_mode(ep, false);
return val;
 }
 
-static void exynos_pcie_writel_dbi(struct dw_pcie *pci, u32 reg, u32 val)
+static void exynos_pcie_writel_dbi(struct dw_pcie *pci, void __iomem *base,
+  u32 reg, u32 val)
 {
struct exynos_pcie *ep = to_exynos_pcie(pci);
 
exynos_pcie_sideband_dbi_w_mode(ep, true);
-   writel(val, pci->dbi_base + reg);
+   writel(val, base + reg);
exynos_pcie_sideband_dbi_w_mode(ep, false);
 }
 
diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c
index 801e46c..85dd901 100644
--- a/drivers/pci/dwc/pci-imx6.c
+++ b/drivers/pci/dwc/pci-imx6.c
@@ -98,12 +98,13 @@ struct imx6_pcie {
 static int pcie_phy_poll_ack(struct imx6_pcie *imx6_pcie, int exp_val)
 {
struct dw_pcie *pci = imx6_pcie->pci;
+   void __iomem *base = pci->dbi_base;
u32 val;
u32 max_iterations = 10;
u32 wait_counter = 0;
 
do {
-   val = dw_pcie_readl_dbi(pci, PCIE_PHY_STAT);
+   val = dw_pcie_readl_dbi(pci, base, PCIE_PHY_STAT);
val = (val >> PCIE_PHY_STAT_ACK_LOC) & 0x1;
wait_counter++;
 
@@ -119,21 +120,22 @@ static int pcie_phy_poll_ack(struct imx6_pcie *imx6_pcie, 
int exp_val)
 static int pcie_phy_wait_ack(struct imx6_pcie *imx6_pcie, int addr)
 {
struct dw_pcie *pci = imx6_pcie->pci;
+   void __iomem *base = pci->dbi_base;
u32 val;
int ret;
 
val = addr << PCIE_PHY_CTRL_DATA_LOC;
-   dw_pcie_writel_dbi(pci, 
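
A minimal user-space sketch of why the accessors take the base as an argument: the same helper can then serve either register window, with the caller choosing which one. The two byte arrays stand in for dbics and dbics2, and the fake_* names are hypothetical; this is not the driver code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two fake register windows standing in for dbics and dbics2. */
struct fake_pcie {
	uint8_t dbics[64];
	uint8_t dbics2[64];
};

/* The caller selects the window by passing its base address. */
static uint32_t fake_readl_dbi(const uint8_t *base, uint32_t reg)
{
	uint32_t val;

	assert(base != NULL);
	memcpy(&val, base + reg, sizeof(val));
	return val;
}

static void fake_writel_dbi(uint8_t *base, uint32_t reg, uint32_t val)
{
	assert(base != NULL);
	memcpy(base + reg, &val, sizeof(val));
}
```

A write to offset 0x4 of one window leaves the same offset in the other window untouched, which is the property EP mode needs for BAR programming through dbics2.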

[RESEND PATCH v3 5/7] PCI: dwc: all: Modify dbi accessors to access data of 4/2/1 bytes

2017-03-08 Thread Kishon Vijay Abraham I
Previously, dbi accessors could only be used to access data of size 4
bytes. But there might be situations (like accessing
MSI_MESSAGE_CONTROL in order to set/get the number of required
MSI interrupts in EP mode) where dbi accessors must
be used to access data of size 2. This is in preparation for
adding endpoint mode support to designware driver.

Cc: Jingoo Han 
Cc: Richard Zhu 
Cc: Lucas Stach 
Cc: Murali Karicheri 
Cc: Thomas Petazzoni 
Cc: Niklas Cassel 
Cc: Jesper Nilsson 
Cc: Joao Pinto 
Cc: Zhou Wang 
Cc: Gabriele Paoloni 
Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c   |8 ++--
 drivers/pci/dwc/pci-exynos.c   |   16 +++
 drivers/pci/dwc/pci-imx6.c |   54 +++---
 drivers/pci/dwc/pci-keystone-dw.c  |   13 +++---
 drivers/pci/dwc/pcie-armada8k.c|   38 
 drivers/pci/dwc/pcie-artpec6.c |6 +--
 drivers/pci/dwc/pcie-designware-host.c |   18 
 drivers/pci/dwc/pcie-designware.c  |   77 +++-
 drivers/pci/dwc/pcie-designware.h  |   14 +++---
 drivers/pci/dwc/pcie-hisi.c|   14 +++---
 10 files changed, 138 insertions(+), 120 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index 3708bd6..c6fef0a 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -499,9 +499,9 @@ static int dra7xx_pcie_suspend(struct device *dev)
u32 val;
 
/* clear MSE */
-   val = dw_pcie_readl_dbi(pci, base, PCI_COMMAND);
+   val = dw_pcie_read_dbi(pci, base, PCI_COMMAND, 0x4);
val &= ~PCI_COMMAND_MEMORY;
-   dw_pcie_writel_dbi(pci, base, PCI_COMMAND, val);
+   dw_pcie_write_dbi(pci, base, PCI_COMMAND, 0x4, val);
 
return 0;
 }
@@ -514,9 +514,9 @@ static int dra7xx_pcie_resume(struct device *dev)
u32 val;
 
/* set MSE */
-   val = dw_pcie_readl_dbi(pci, base, PCI_COMMAND);
+   val = dw_pcie_read_dbi(pci, base, PCI_COMMAND, 0x4);
val |= PCI_COMMAND_MEMORY;
-   dw_pcie_writel_dbi(pci, base, PCI_COMMAND, val);
+   dw_pcie_write_dbi(pci, base, PCI_COMMAND, 0x4, val);
 
return 0;
 }
diff --git a/drivers/pci/dwc/pci-exynos.c b/drivers/pci/dwc/pci-exynos.c
index a0d40f7..37d6d2b 100644
--- a/drivers/pci/dwc/pci-exynos.c
+++ b/drivers/pci/dwc/pci-exynos.c
@@ -521,25 +521,25 @@ static void exynos_pcie_enable_interrupts(struct 
exynos_pcie *ep)
exynos_pcie_msi_init(ep);
 }
 
-static u32 exynos_pcie_readl_dbi(struct dw_pcie *pci, void __iomem *base,
-u32 reg)
+static u32 exynos_pcie_read_dbi(struct dw_pcie *pci, void __iomem *base,
+   u32 reg, size_t size)
 {
struct exynos_pcie *ep = to_exynos_pcie(pci);
u32 val;
 
exynos_pcie_sideband_dbi_r_mode(ep, true);
-   val = readl(base + reg);
+   dw_pcie_read(base + reg, size, &val);
exynos_pcie_sideband_dbi_r_mode(ep, false);
return val;
 }
 
-static void exynos_pcie_writel_dbi(struct dw_pcie *pci, void __iomem *base,
-  u32 reg, u32 val)
+static void exynos_pcie_write_dbi(struct dw_pcie *pci, void __iomem *base,
+ u32 reg, size_t size, u32 val)
 {
struct exynos_pcie *ep = to_exynos_pcie(pci);
 
exynos_pcie_sideband_dbi_w_mode(ep, true);
-   writel(val, base + reg);
+   dw_pcie_write(base + reg, size, val);
exynos_pcie_sideband_dbi_w_mode(ep, false);
 }
 
@@ -646,8 +646,8 @@ static int __init exynos_add_pcie_port(struct exynos_pcie 
*ep,
 }
 
 static const struct dw_pcie_ops dw_pcie_ops = {
-   .readl_dbi = exynos_pcie_readl_dbi,
-   .writel_dbi = exynos_pcie_writel_dbi,
+   .read_dbi = exynos_pcie_read_dbi,
+   .write_dbi = exynos_pcie_write_dbi,
.link_up = exynos_pcie_link_up,
 };
 
diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c
index 85dd901..e58ca7a 100644
--- a/drivers/pci/dwc/pci-imx6.c
+++ b/drivers/pci/dwc/pci-imx6.c
@@ -104,7 +104,7 @@ static int pcie_phy_poll_ack(struct imx6_pcie *imx6_pcie, 
int exp_val)
u32 wait_counter = 0;
 
do {
-   val = dw_pcie_readl_dbi(pci, base, PCIE_PHY_STAT);
+   val = dw_pcie_read_dbi(pci, base, PCIE_PHY_STAT, 0x4);
val = (val >> PCIE_PHY_STAT_ACK_LOC) & 0x1;
wait_counter++;
 
@@ -125,17 +125,17 @@ static int pcie_phy_wait_ack(struct imx6_pcie *imx6_pcie, 
int addr)
int ret;
 
val = addr << PCIE_PHY_CTRL_DATA_LOC;
-   dw_pcie_writel_dbi(pci, base, PCIE_PHY_CTRL, val);
+   dw_pcie_write_dbi(pci, base, PCIE_PHY_CTRL, 0x4, val);
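
A rough sketch of size-dispatched accessors in the spirit of this patch, written for user space with memcpy() in place of MMIO. The fake_read/fake_write names, the -1 error value and the byte-array "registers" are assumptions made for the example; they are not the driver's helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read 1, 2 or 4 bytes at addr into *val; returns 0 on success. */
static int fake_read(const uint8_t *addr, size_t size, uint32_t *val)
{
	uint16_t v16;
	uint8_t v8;

	assert(addr != NULL && val != NULL);

	switch (size) {
	case 4:
		memcpy(val, addr, 4);
		return 0;
	case 2:
		memcpy(&v16, addr, 2);
		*val = v16;
		return 0;
	case 1:
		memcpy(&v8, addr, 1);
		*val = v8;
		return 0;
	default:
		*val = 0;
		return -1;	/* unsupported access width */
	}
}

/* Write the low 1, 2 or 4 bytes of val at addr; returns 0 on success. */
static int fake_write(uint8_t *addr, size_t size, uint32_t val)
{
	uint16_t v16 = (uint16_t)val;
	uint8_t v8 = (uint8_t)val;

	assert(addr != NULL);

	switch (size) {
	case 4:
		memcpy(addr, &val, 4);
		return 0;
	case 2:
		memcpy(addr, &v16, 2);
		return 0;
	case 1:
		memcpy(addr, &v8, 1);
		return 0;
	default:
		return -1;	/* unsupported access width */
	}
}
```

A 2-byte write touches only two bytes, which is what makes a 16-bit field such as MSI_MESSAGE_CONTROL safe to update without disturbing its neighbours.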
 
 

[RESEND PATCH v3 1/7] PCI: dwc: designware: Add new *ops* for cpu addr fixup

2017-03-08 Thread Kishon Vijay Abraham I
Some platforms (like dra7xx) require only the least 28 bits of the
corresponding 32 bit CPU address to be programmed in the address
translation unit. This modified address is stored in io_base/mem_base/
cfg0_base/cfg1_base in dra7xx_pcie_host_init. While this is okay for
host mode where the address range is fixed, device mode requires
different addresses to be programmed based on the host buffer address.
Add a new ops to get the least 28 bits of the corresponding 32 bit
CPU address and invoke it before programming the address translation
unit.

Acked-by: Joao Pinto 
Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pcie-designware.c |3 +++
 drivers/pci/dwc/pcie-designware.h |1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/pci/dwc/pcie-designware.c 
b/drivers/pci/dwc/pcie-designware.c
index 7e1fb7d..14ee7a3 100644
--- a/drivers/pci/dwc/pcie-designware.c
+++ b/drivers/pci/dwc/pcie-designware.c
@@ -97,6 +97,9 @@ void dw_pcie_prog_outbound_atu(struct dw_pcie *pci, int 
index, int type,
 {
u32 retries, val;
 
+   if (pp->ops->cpu_addr_fixup)
+   cpu_addr = pp->ops->cpu_addr_fixup(cpu_addr);
+
if (pci->iatu_unroll_enabled) {
dw_pcie_writel_unroll(pci, index, PCIE_ATU_UNR_LOWER_BASE,
  lower_32_bits(cpu_addr));
diff --git a/drivers/pci/dwc/pcie-designware.h 
b/drivers/pci/dwc/pcie-designware.h
index cd3b871..8f3dcb2 100644
--- a/drivers/pci/dwc/pcie-designware.h
+++ b/drivers/pci/dwc/pcie-designware.h
@@ -143,6 +143,7 @@ struct pcie_port {
 };
 
 struct dw_pcie_ops {
+   u64 (*cpu_addr_fixup)(u64 cpu_addr);
u32 (*readl_dbi)(struct dw_pcie *pcie, u32 reg);
void(*writel_dbi)(struct dw_pcie *pcie, u32 reg, u32 val);
int (*link_up)(struct dw_pcie *pcie);
-- 
1.7.9.5
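
A small stand-alone sketch of the optional-callback pattern this patch introduces: the fixup is applied only when the platform provides one, and otherwise the address passes through unchanged. The mask value below is simply "the low 28 bits" taken from the description; the fake_* names and the macro are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask keeping the low 28 bits of the CPU address. */
#define FAKE_CPU_TO_BUS_ADDR 0x0FFFFFFFULL

struct fake_ops {
	uint64_t (*cpu_addr_fixup)(uint64_t cpu_addr);	/* optional */
};

static uint64_t fake_low28_fixup(uint64_t cpu_addr)
{
	return cpu_addr & FAKE_CPU_TO_BUS_ADDR;
}

/* Returns the address that would be programmed into the ATU. */
static uint64_t fake_atu_addr(const struct fake_ops *ops, uint64_t cpu_addr)
{
	assert(ops != NULL);

	if (ops->cpu_addr_fixup)
		cpu_addr = ops->cpu_addr_fixup(cpu_addr);

	return cpu_addr;
}
```

Because the fixup runs at programming time rather than once at init, it also covers addresses that are only known later, such as host buffer addresses in device mode.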



[RESEND PATCH v3 3/7] PCI: dwc: artpec6: Populate cpu_addr_fixup ops

2017-03-08 Thread Kishon Vijay Abraham I
Populate cpu_addr_fixup ops to extract the least 28 bits of the
corresponding cpu address.

Cc: Niklas Cassel 
Acked-by: Joao Pinto 
Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pcie-artpec6.c |   15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/dwc/pcie-artpec6.c b/drivers/pci/dwc/pcie-artpec6.c
index fcd3ef8..5b3b3af 100644
--- a/drivers/pci/dwc/pcie-artpec6.c
+++ b/drivers/pci/dwc/pcie-artpec6.c
@@ -78,6 +78,11 @@ static void artpec6_pcie_writel(struct artpec6_pcie 
*artpec6_pcie, u32 offset, u
regmap_write(artpec6_pcie->regmap, offset, val);
 }
 
+static u64 artpec6_pcie_cpu_addr_fixup(u64 pci_addr)
+{
+   return pci_addr & ARTPEC6_CPU_TO_BUS_ADDR;
+}
+
 static int artpec6_pcie_establish_link(struct artpec6_pcie *artpec6_pcie)
 {
struct dw_pcie *pci = artpec6_pcie->pci;
@@ -142,11 +147,6 @@ static int artpec6_pcie_establish_link(struct artpec6_pcie 
*artpec6_pcie)
 */
dw_pcie_writel_dbi(pci, MISC_CONTROL_1_OFF, DBI_RO_WR_EN);
 
-   pp->io_base &= ARTPEC6_CPU_TO_BUS_ADDR;
-   pp->mem_base &= ARTPEC6_CPU_TO_BUS_ADDR;
-   pp->cfg0_base &= ARTPEC6_CPU_TO_BUS_ADDR;
-   pp->cfg1_base &= ARTPEC6_CPU_TO_BUS_ADDR;
-
/* setup root complex */
dw_pcie_setup_rc(pp);
 
@@ -234,6 +234,10 @@ static int artpec6_add_pcie_port(struct artpec6_pcie 
*artpec6_pcie,
return 0;
 }
 
+static const struct dw_pcie_ops dw_pcie_ops = {
+   .cpu_addr_fixup = artpec6_pcie_cpu_addr_fixup,
+};
+
 static int artpec6_pcie_probe(struct platform_device *pdev)
 {
struct device *dev = &pdev->dev;
@@ -252,6 +256,7 @@ static int artpec6_pcie_probe(struct platform_device *pdev)
return -ENOMEM;
 
pci->dev = dev;
+   pci->ops = &dw_pcie_ops;
 
artpec6_pcie->pci = pci;
 
-- 
1.7.9.5



[RESEND PATCH v3 7/7] PCI: dwc: dra7xx: Push request_irq call to the bottom of probe

2017-03-08 Thread Kishon Vijay Abraham I
From: Keerthy 

Currently devm_request_irq is being called before base, pci fields
of dra7xx_pcie structure are populated. It is called even before
pm_runtime_enable and pm_runtime_get_sync are called. This will
lead to exceptions if an interrupt is triggered before
all of the above are done. Hence push the devm_request_irq
call to the end of the probe.

Signed-off-by: Keerthy 
Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index c6fef0a..8c53233 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -410,13 +410,6 @@ static int __init dra7xx_pcie_probe(struct platform_device 
*pdev)
return -EINVAL;
}
 
-   ret = devm_request_irq(dev, irq, dra7xx_pcie_irq_handler,
-  IRQF_SHARED, "dra7xx-pcie-main", dra7xx);
-   if (ret) {
-   dev_err(dev, "failed to request irq\n");
-   return ret;
-   }
-
res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "ti_conf");
base = devm_ioremap_nocache(dev, res->start, resource_size(res));
if (!base)
@@ -478,6 +471,13 @@ static int __init dra7xx_pcie_probe(struct platform_device 
*pdev)
if (ret < 0)
goto err_gpio;
 
+   ret = devm_request_irq(dev, irq, dra7xx_pcie_irq_handler,
+  IRQF_SHARED, "dra7xx-pcie-main", dra7xx);
+   if (ret) {
+   dev_err(dev, "failed to request irq\n");
+   goto err_gpio;
+   }
+
return 0;
 
 err_gpio:
-- 
1.7.9.5



Re: [PATCH] pinctrl: samsung: fix segfault when using external interrupts on s3c24xx

2017-03-08 Thread Krzysztof Kozlowski
On Thu, Mar 9, 2017 at 7:56 AM, Tomasz Figa  wrote:
> 2017-03-09 1:34 GMT+09:00 Krzysztof Kozlowski :
>> On Mon, Mar 06, 2017 at 09:15:16AM -0400, Sergio Prado wrote:
>>> Hi Krzysztof,
>>>
>>> > > This is a regression from commit 
>>> > > 8b1bd11c1f8f529057369c5b3702d13fd24e2765.
>>> >
>>> > Checkpatch should complain here about commit format.
>>> >
>>> > >
>>> > > Tested on FriendlyARM mini2440.
>>> > >
>>> >
>>> > Please add:
>>> >   Fixes: 8b1bd11c1f8f ("pinctrl: samsung: Add the support the multiple 
>>> > IORESOURCE_MEM for one pin-bank")
>>> >   Cc: 
>>> >
>>>
>>> OK.
>>>
>>> > > Signed-off-by: Sergio Prado 
>>> > > ---
>>> > >  drivers/pinctrl/samsung/pinctrl-s3c24xx.c | 4 ++--
>>> > >  1 file changed, 2 insertions(+), 2 deletions(-)
>>> > >
>>> > > diff --git a/drivers/pinctrl/samsung/pinctrl-s3c24xx.c 
>>> > > b/drivers/pinctrl/samsung/pinctrl-s3c24xx.c
>>> > > index b82a003546ae..1b8d887796e8 100644
>>> > > --- a/drivers/pinctrl/samsung/pinctrl-s3c24xx.c
>>> > > +++ b/drivers/pinctrl/samsung/pinctrl-s3c24xx.c
>>> > > @@ -356,8 +356,8 @@ static inline void s3c24xx_demux_eint(struct 
>>> > > irq_desc *desc,
>>> > >  {
>>> > >   struct s3c24xx_eint_data *data = irq_desc_get_handler_data(desc);
>>> > >   struct irq_chip *chip = irq_desc_get_chip(desc);
>>> > > - struct irq_data *irqd = irq_desc_get_irq_data(desc);
>>> > > - struct samsung_pin_bank *bank = irq_data_get_irq_chip_data(irqd);
>>> > > + struct samsung_pinctrl_drv_data *d = data->drvdata;
>>> > > + struct samsung_pin_bank *bank = d->pin_banks;
>>> >
>>> > I think 'pin_banks' point to all banks of given controller not to the
>>> > currently accessed one.
>>>
>>> Understood. I think it worked in my tests because on s3c2440 all banks
>>> have the same eint base address.
>>>
>>> So what do you think is the best approach to solve this problem?
>>
>> Maybe you can get to this through:
>> s3c24xx_eint_domain_data = 
>> s3c24xx_eint_data->domains[virq].host_data;
>> s3c24xx_eint_domain_data->bank
>>
>> It is getting slightly more complicated...
>
> How about the suggestions I made in my reply from March 4 (JST)?

Yes, this also looks like a solution. I am not sure how much you would
like to revert, but wouldn't it create duplicated members in the pinctrl
structures? One for Exynos and the other for S3C?

Best regards,
Krzysztof


Re: [PATCH 1/3] usb: dwc3-omap: Fix missing break in dwc3_omap_set_mailbox()

2017-03-08 Thread Roger Quadros
Felipe,

On 15/02/17 13:38, Roger Quadros wrote:
> We need to break from all cases if we want to treat
> each one of them separately.
> 
> Reported-by: Gustavo A. R. Silva 
> Fixes: d2728fb3e01f ("usb: dwc3: omap: Pass VBUS and ID events transparently")
> Cc:  #v4.8+
> Signed-off-by: Roger Quadros 

Can you please pick this one for v4.11-rc? Thanks.

> ---
>  drivers/usb/dwc3/dwc3-omap.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/usb/dwc3/dwc3-omap.c b/drivers/usb/dwc3/dwc3-omap.c
> index eb1b9cb..35b6351 100644
> --- a/drivers/usb/dwc3/dwc3-omap.c
> +++ b/drivers/usb/dwc3/dwc3-omap.c
> @@ -250,6 +250,7 @@ static void dwc3_omap_set_mailbox(struct dwc3_omap *omap,
>   val = dwc3_omap_read_utmi_ctrl(omap);
>   val |= USBOTGSS_UTMI_OTG_CTRL_IDDIG;
>   dwc3_omap_write_utmi_ctrl(omap, val);
> + break;
>  
>   case OMAP_DWC3_VBUS_OFF:
>   val = dwc3_omap_read_utmi_ctrl(omap);
> 

-- 
cheers,
-roger
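
For readers skimming the thread, a tiny stand-alone illustration of the effect of the missing break. The enum values and bit masks are hypothetical stand-ins for the driver's constants; only the switch structure mirrors the patch.

```c
#include <assert.h>

enum fake_status { FAKE_ID_GROUND, FAKE_ID_FLOAT, FAKE_VBUS_OFF };

#define FAKE_IDDIG     0x1u
#define FAKE_VBUSVALID 0x2u

/* Fixed version: every case ends in break, so one branch runs. */
static unsigned int fake_set_mailbox(unsigned int val, enum fake_status s)
{
	assert(s == FAKE_ID_GROUND || s == FAKE_ID_FLOAT || s == FAKE_VBUS_OFF);

	switch (s) {
	case FAKE_ID_GROUND:
		val &= ~FAKE_IDDIG;
		break;
	case FAKE_ID_FLOAT:
		val |= FAKE_IDDIG;
		break;
	case FAKE_VBUS_OFF:
		val &= ~FAKE_VBUSVALID;
		break;
	}

	return val;
}

/* Buggy version: ID_FLOAT runs on into the VBUS_OFF handling. */
static unsigned int fake_set_mailbox_fallthrough(unsigned int val,
						 enum fake_status s)
{
	switch (s) {
	case FAKE_ID_GROUND:
		val &= ~FAKE_IDDIG;
		break;
	case FAKE_ID_FLOAT:
		val |= FAKE_IDDIG;
		/* fall through */
	case FAKE_VBUS_OFF:
		val &= ~FAKE_VBUSVALID;
		break;
	}

	return val;
}
```

With VBUS marked valid on entry, the fixed function keeps that bit on an ID float event, while the fall-through variant silently clears it.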


Re: [RFC PATCH net] net: Work around lockdep limitation in sockets that use sockets

2017-03-08 Thread David Miller
From: David Howells 
Date: Mon, 06 Mar 2017 15:04:44 +

> Fix the general case by:
> 
>  (1) Double up all the locking keys used in sockets so that one set are
>  used if the socket is created by userspace and the other set is used
>  if the socket is created by the kernel.
> 
>  (2) Store the kern parameter passed to sk_alloc() in a variable in the
>  sock struct (sk_kern_sock).  This informs sock_lock_init(),
>  sock_init_data() and sk_clone_lock() as to the lock keys to be used.
> 
>  Note that the child created by sk_clone_lock() inherits the parent's
>  kern setting.
> 
>  (3) Add a 'kern' parameter to ->accept() that is analogous to the one
>  passed in to ->create() that distinguishes whether kernel_accept() or
>  sys_accept4() was the caller and can be passed to sk_alloc().
> 
>  Note that a lot of accept functions merely dequeue an already
>  allocated socket.  I haven't touched these as the new socket already
>  exists before we get the parameter.
> 
>  Note also that there are a couple of places where I've made the accepted
>  socket unconditionally kernel-based:
> 
>   irda_accept()
>   rds_rcp_accept_one()
>   tcp_accept_from_sock()
> 
>  because they follow a sock_create_kern() and accept off of that.

I guess this is fine, but I think you can use one of the two "sk_padding"
bits in struct sock instead of making the structure larger.
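
The size argument can be illustrated with a small userspace sketch. The structs are hypothetical and only loosely modelled on a packed flag word; they are not the real struct sock layout. The sketch assumes a flag word with two spare bits and shows that taking one of them for a kern_sock flag leaves the structure size unchanged.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag word with two spare padding bits. */
struct demo_sock_before {
    unsigned int padding   : 2,
                 no_check  : 1,
                 userlocks : 4,
                 protocol  : 8,
                 type      : 16;
    int          refcnt;
};

/* Same word, with one padding bit reused as the new flag. */
struct demo_sock_bitflag {
    unsigned int padding   : 1,
                 kern_sock : 1,
                 no_check  : 1,
                 userlocks : 4,
                 protocol  : 8,
                 type      : 16;
    int          refcnt;
};

/* Alternative that stores the flag as a separate member. */
struct demo_sock_extra_member {
    unsigned int padding   : 2,
                 no_check  : 1,
                 userlocks : 4,
                 protocol  : 8,
                 type      : 16;
    bool         kern_sock;
    int          refcnt;
};

void demo_sock_selfcheck(void)
{
    struct demo_sock_bitflag sk = { 0 };

    sk.kern_sock = 1;
    assert(sk.kern_sock == 1 && sk.padding == 0);

    /* Reusing a spare bit does not grow the structure. */
    assert(sizeof(struct demo_sock_bitflag) == sizeof(struct demo_sock_before));
    /* A separate member can only keep the size or grow it. */
    assert(sizeof(struct demo_sock_extra_member) >= sizeof(struct demo_sock_before));
}
```

On common ABIs the variant with a separate bool member ends up larger because of alignment padding, which is the growth the comment above wants to avoid.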


Re: [PATCH v7 kernel 5/5] This patch contains two parts:

2017-03-08 Thread Wei Wang
On 03/06/2017 09:23 PM, David Hildenbrand wrote:
> On 03.03.2017 at 06:40, Wei Wang wrote:
>> From: Liang Li

Sorry, I just saw the message due to an email issue.

> I'd prefer to split this into two parts then and to create proper subjects.

Agree, will do.

> If I remember correctly, the general concept was accepted by most reviewers.

Yes, that's also what I was told.

Best,
Wei


Re: [PATCH 12/26] IB/ocrdma: Adjust ten checks for null pointers

2017-03-08 Thread Yuval Shaia
On Wed, Mar 08, 2017 at 02:07:01PM +0100, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Tue, 7 Mar 2017 21:32:22 +0100
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> The script “checkpatch.pl“ pointed information out like the following.
> 
> Comparison to NULL could be written !…

Good to know.

Reviewed-by: Yuval Shaia 


> 
> Thus fix the affected source code places.

Above line can be removed.

> 
> Signed-off-by: Markus Elfring 
> ---
>  drivers/infiniband/hw/ocrdma/ocrdma_hw.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_hw.c 
> b/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
> index d5b988b011d1..8c7f0b108a7f 100644
> --- a/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
> @@ -665,7 +665,7 @@ static void ocrdma_process_qpcat_error(struct ocrdma_dev 
> *dev,
>   enum ib_qp_state new_ib_qps = IB_QPS_ERR;
>   enum ib_qp_state old_ib_qps;
>  
> - if (qp == NULL)
> + if (!qp)
>   BUG();
>   ocrdma_qp_state_change(qp, new_ib_qps, &old_ib_qps);
>  }
> @@ -693,7 +693,7 @@ static void ocrdma_dispatch_ibevent(struct ocrdma_dev 
> *dev,
>   if (cqe->qpvalid_qpid & OCRDMA_AE_MCQE_QPVALID) {
>   if (qpid < dev->attr.max_qp)
>   qp = dev->qp_tbl[qpid];
> - if (qp == NULL) {
> + if (!qp) {
>   pr_err("ocrdma%d:Async event - qpid %u is not valid\n",
>  dev->id, qpid);
>   return;
> @@ -703,7 +703,7 @@ static void ocrdma_dispatch_ibevent(struct ocrdma_dev 
> *dev,
>   if (cqe->cqvalid_cqid & OCRDMA_AE_MCQE_CQVALID) {
>   if (cqid < dev->attr.max_cq)
>   cq = dev->cq_tbl[cqid];
> - if (cq == NULL) {
> + if (!cq) {
>   pr_err("ocrdma%d:Async event - cqid %u is not valid\n",
>  dev->id, cqid);
>   return;
> @@ -882,7 +882,7 @@ static int ocrdma_mq_cq_handler(struct ocrdma_dev *dev, 
> u16 cq_id)
>  
>   while (1) {
>   cqe = ocrdma_get_mcqe(dev);
> - if (cqe == NULL)
> + if (!cqe)
>   break;
>   ocrdma_le32_to_cpu(cqe, sizeof(*cqe));
>   cqe_popped += 1;
> @@ -948,7 +948,7 @@ static void ocrdma_qp_buddy_cq_handler(struct ocrdma_dev 
> *dev,
>* false - Check for RQ CQ
>*/
>   bcq = _ocrdma_qp_buddy_cq_handler(dev, cq, true);
> - if (bcq == NULL)
> + if (!bcq)
>   bcq = _ocrdma_qp_buddy_cq_handler(dev, cq, false);
>   spin_unlock_irqrestore(&dev->flush_q_lock, flags);
>  
> @@ -969,7 +969,7 @@ static void ocrdma_qp_cq_handler(struct ocrdma_dev *dev, 
> u16 cq_idx)
>   BUG();
>  
>   cq = dev->cq_tbl[cq_idx];
> - if (cq == NULL)
> + if (!cq)
>   return;
>  
>   if (cq->ibcq.comp_handler) {
> @@ -1289,7 +1289,7 @@ int ocrdma_mbx_rdma_stats(struct ocrdma_dev *dev, bool 
> reset)
>   int status;
>  
>   old_stats = kmalloc(sizeof(*old_stats), GFP_KERNEL);
> - if (old_stats == NULL)
> + if (!old_stats)
>   return -ENOMEM;
>  
>   memset(mqe, 0, sizeof(*mqe));
> @@ -1676,12 +1676,12 @@ static int ocrdma_mbx_create_ah_tbl(struct ocrdma_dev 
> *dev)
>   dev->av_tbl.pbl.va = dma_alloc_coherent(&pdev->dev, PAGE_SIZE,
>   &dev->av_tbl.pbl.pa,
>   GFP_KERNEL);
> - if (dev->av_tbl.pbl.va == NULL)
> + if (!dev->av_tbl.pbl.va)
>   goto mem_err;
>  
>   dev->av_tbl.va = dma_alloc_coherent(&pdev->dev, dev->av_tbl.size,
>   &pa, GFP_KERNEL);
> - if (dev->av_tbl.va == NULL)
> + if (!dev->av_tbl.va)
>   goto mem_err_ah;
>   dev->av_tbl.pa = pa;
>   dev->av_tbl.num_ah = max_ah;
> @@ -1722,7 +1722,7 @@ static void ocrdma_mbx_delete_ah_tbl(struct ocrdma_dev 
> *dev)
>   struct ocrdma_delete_ah_tbl *cmd;
>   struct pci_dev *pdev = dev->nic_info.pdev;
>  
> - if (dev->av_tbl.va == NULL)
> + if (!dev->av_tbl.va)
>   return;
>  
>   cmd = ocrdma_init_emb_mqe(OCRDMA_CMD_DELETE_AH_TBL, sizeof(*cmd));
> -- 
> 2.12.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND PATCH] arm: assabet_defconfig: disable IDE subsystem

2017-03-08 Thread Sekhar Nori
On Tuesday 07 March 2017 11:21 PM, Bartlomiej Zolnierkiewicz wrote:
> 
> Hi,
> 
> On Monday, December 12, 2016 07:24:47 PM Sekhar Nori wrote:
>> Hi Bartlomiej,
>>
>> On Monday 12 December 2016 06:15 PM, Bartlomiej Zolnierkiewicz wrote:
>>>
>>> Hi,
>>>
>>> On Monday, July 18, 2016 08:15:08 PM Sekhar Nori wrote:
>>>> On Friday 15 July 2016 08:45 PM, Kevin Hilman wrote:
>>>>> Arnd Bergmann  writes:
>>>>>
>>>>>> On Wednesday, July 13, 2016 12:59:23 PM CEST Bartlomiej Zolnierkiewicz wrote:
>>>>>>>
>>>>>>> On Friday, July 08, 2016 10:23:48 PM Arnd Bergmann wrote:
>>>>>>>> On Friday, July 8, 2016 5:24:41 PM CEST Bartlomiej Zolnierkiewicz wrote:
>>>>>>>>> This patch disables deprecated IDE subsystem in assabet_defconfig
>>>>>>>>> (no IDE host drivers are selected in this config so there is no
>>>>>>>>> valid reason to enable IDE subsystem itself).
>>>>>>>>>
>>>>>>>>> Cc: Dmitry Eremin-Solenikov 
>>>>>>>>> Signed-off-by: Bartlomiej Zolnierkiewicz 
>>>>>>>>
>>>>>>>> I think the series makes a lot of sense. I have checked your assertions
>>>>>>>> in the changelogs and found no flaws in your logic, so I think we should
>>>>>>>> take them all through arm-soc unless there are other concerns.
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Should I resend everything or just patches that were not reposted yet
>>>>>>> (the ones that were marked as RFT initially and got no feedback)?
>>>>>>
>>>>>> I'd be fine with just getting a pull request with all the patches that
>>>>>> had no negative feedback and that were not already applied (if any).
>>>>>>
>>>>>>>> Do you have a list of ARM defconfigs that keep using CONFIG_IDE and
>>>>>>>> how you determined that they need it?
>>>>>>>
>>>>>>> The only such defconfig is davinci_all_defconfig which uses
>>>>>>> palm_bk3710 host driver (CONFIG_BLK_DEV_PALMCHIP_BK3710).
>>>>>>>
>>>>>>>> I know that ARCH_RPC/ARCH_ACORN has a couple of special drivers that
>>>>>>>> have no libata replacement, are there any others like that, or are
>>>>>>>> they all platforms that should in theory work with libata but need
>>>>>>>> testing?
>>>>>>>
>>>>>>> All platforms except ARCH_ACORN, ARCH_DAVINCI & ARCH_RPC should work
>>>>>>> with libata.
>>>>>>
>>>>>> Adding Sekhar and Kevin for DaVinci: At first sight, palm_bk3710 looks
>>>>>> fairly straightforward (meaning someone has to do a few day's work)
>>>>>> to convert into a libata driver.
>>>>>>
>>>>>> As this is on on-chip controller that is part of a dm644x and dm646x,
>>>>>> it should also not be hard to test (as long as someone can find
>>>>>> a hard drive to plug in).
>>>>>
>>>>> I have a hard drive, but don't have any dm64xx hardware anymore to test
>>>>> this.  My last working dm644x board died last year.
>>>>
>>>> I have a working DM6446 EVM. I was able to connect a hard drive to it
>>>> and do some basic tests with v4.6 kernel.
>>>>
>>>> I will look into converting the driver to libata. Might take some time
>>>> because this is unfamiliar territory for me.
>>>
>>> Do you need some help with it?
>>>
>>> I can provide you with draft driver patch if you want.
>>
>> A draft driver patch will really help. I can test/debug. Otherwise, not
>> sure when I will really be able to get to this.
> 
> It took a while to get to it but here is the draft driver patch
> against v4.11-rc1.  Please test.

I tested this on DM6446 EVM. I was able to mount existing partitions on
the hard disk and see that the directory listing looks good[1]. I will do
more tests (including comparing performance with the old driver) tomorrow. I
did not have to do much to get it to work[2]. Great job! Thanks!

I did see a warning reported during the build[3].

Regards,
Sekhar

[1] http://pastebin.ubuntu.com/24139206/

[2] The only patch I had to apply is the following (a similar change is
required in a couple of other places too):

diff --git a/arch/arm/mach-davinci/board-dm644x-evm.c 
b/arch/arm/mach-davinci/board-dm644x-evm.c
index 023480b75244..60a1f23890cd 100644
--- a/arch/arm/mach-davinci/board-dm644x-evm.c
+++ b/arch/arm/mach-davinci/board-dm644x-evm.c
@@ -744,7 +744,7 @@ static int davinci_phy_fixup(struct phy_device *phydev)
return 0;
 }
 
-#define HAS_ATA		IS_ENABLED(CONFIG_BLK_DEV_PALMCHIP_BK3710)
+#define HAS_ATA		IS_ENABLED(CONFIG_PATA_BK3710)
 
 #define HAS_NOR		IS_ENABLED(CONFIG_MTD_PHYSMAP)
 
[3]

drivers/ata/pata_bk3710.c: In function 'pata_bk3710_set_piomode':
drivers/ata/pata_bk3710.c:223:5: warning: 'cycle_time' may be used uninitialized in this function [-Wmaybe-uninitialized]
  if (!cycle_time)
 ^



Re: [PATCH v2 08/22] PCI: dwc: designware: Add EP mode support

2017-03-08 Thread Joao Pinto
On 3/8/2017 at 3:32 PM, Joao Pinto wrote:
> On 3/8/2017 at 1:31 PM, Kishon Vijay Abraham I wrote:
>> Hi,
>>
>> On Wednesday 08 March 2017 05:07 PM, Joao Pinto wrote:
>>> On 3/8/2017 at 11:35 AM, Kishon Vijay Abraham I wrote:
>>>> Hi,
>>>>
>>>> On Wednesday 08 March 2017 05:02 PM, Joao Pinto wrote:
>>>>>
>>>>> Hi Kishon,
>>>>>
>>>>>>> Can you provide PCIE_GET_ATU_INB_UNR_REG_OFFSET (similar to
>>>>>>> PCIE_GET_ATU_OUTB_UNR_REG_OFFSET)?
>>>>>>
>>>>>> Yes of course, I will send you the definition soon.
>>>>>
>>>>> As promised here is the definition for Inbound:
>>>>>
>>>>> +/* register address builder */
>>>>> +#define PCIE_GET_ATU_INB_UNR_REG_ADDR(region, register)  \
>>>>> + ((0x3 << 20) | (region << 9) |  \
>>>>> + (0x1 << 8) | (register << 2))
>>>>
>>>> Cool, thanks!
>>>
>>> No problem! If you have doubts, please let me know.
>>
>> Okay, so this looks slightly different than the outbound macro since it takes
>> the register argument. In the case of outbound 
>> PCIE_GET_ATU_OUTB_UNR_REG_OFFSET
>> returns the offset which was used like
>> dw_pcie_write_dbi(pci, base, offset + reg, 0x4, val);
>>
>> How should the value from PCIE_GET_ATU_INB_UNR_REG_ADDR be used?
> 
> My original way was this one:
> 
> +/* Register address builder */
> +#define PCIE_GET_ATU_OUTB_UNR_REG_ADDR(region, register) \
> + ((0x3 << 20) | (region << 9) |  \
> + (register << 2))
> 
> Bjorn then converted to offset:
> 
> #define PCIE_GET_ATU_OUTB_UNR_REG_OFFSET(region)  ((0x3 << 20) | (region << 9))
> 
> and applied the <<2 shift to the ATU registers.
> 
> So you can use:
> 
> #define PCIE_GET_ATU_INB_UNR_REG_ADDR(region, register)   \
>   ((0x3 << 20) | (region << 9) |  \
>   (0x1 << 8))
> 

This one has the right name :)

#define PCIE_GET_ATU_INB_UNR_REG_OFFSET(region)   \
((0x3 << 20) | ((region) << 9) |  \
(0x1 << 8))
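
As a quick sanity check of the offsets these macros produce, here is a standalone sketch. The DEMO_ names are hypothetical copies of the two offset macros discussed in this thread, with the region argument parenthesised; the helper that adds a register offset mirrors the "offset + reg" usage quoted earlier and is likewise only an illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical copies of the outbound and inbound unroll offset macros. */
#define DEMO_ATU_OUTB_UNR_REG_OFFSET(region) \
    ((0x3u << 20) | ((uint32_t)(region) << 9))
#define DEMO_ATU_INB_UNR_REG_OFFSET(region) \
    ((0x3u << 20) | ((uint32_t)(region) << 9) | (0x1u << 8))

/* Address of one ATU register inside a region: region offset plus register. */
uint32_t demo_atu_reg(uint32_t region_offset, uint32_t reg)
{
    return region_offset + reg;
}

void demo_atu_selfcheck(void)
{
    /* Outbound and inbound blocks of the same region differ only in bit 8. */
    assert(DEMO_ATU_OUTB_UNR_REG_OFFSET(0) == 0x300000);
    assert(DEMO_ATU_INB_UNR_REG_OFFSET(0) == 0x300100);
    /* Each region is 0x200 bytes apart. */
    assert(DEMO_ATU_OUTB_UNR_REG_OFFSET(1) == 0x300200);
    assert(DEMO_ATU_INB_UNR_REG_OFFSET(2) == 0x300500);
    /* Register 0x4 of inbound region 1. */
    assert(demo_atu_reg(DEMO_ATU_INB_UNR_REG_OFFSET(1), 0x4) == 0x300304);
}
```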


> Thanks.
> 
>>
>> Thanks
>> Kishon
>>
> 



Re: [PATCH v2 08/22] PCI: dwc: designware: Add EP mode support

2017-03-08 Thread Joao Pinto
On 3/8/2017 at 1:31 PM, Kishon Vijay Abraham I wrote:
> Hi,
> 
> On Wednesday 08 March 2017 05:07 PM, Joao Pinto wrote:
>> On 3/8/2017 at 11:35 AM, Kishon Vijay Abraham I wrote:
>>> Hi,
>>>
>>> On Wednesday 08 March 2017 05:02 PM, Joao Pinto wrote:
>>>>
>>>> Hi Kishon,
>>>>
>>>>>> Can you provide PCIE_GET_ATU_INB_UNR_REG_OFFSET (similar to
>>>>>> PCIE_GET_ATU_OUTB_UNR_REG_OFFSET)?
>>>>>
>>>>> Yes of course, I will send you the definition soon.
>>>>
>>>> As promised here is the definition for Inbound:
>>>>
>>>> +/* register address builder */
>>>> +#define PCIE_GET_ATU_INB_UNR_REG_ADDR(region, register)   \
>>>> +  ((0x3 << 20) | (region << 9) |  \
>>>> +  (0x1 << 8) | (register << 2))
>>>
>>> Cool, thanks!
>>
>> No problem! If you have doubts, please let me know.
> 
> Okay, so this looks slightly different than the outbound macro since it takes
> the register argument. In the case of outbound 
> PCIE_GET_ATU_OUTB_UNR_REG_OFFSET
> returns the offset which was used like
> dw_pcie_write_dbi(pci, base, offset + reg, 0x4, val);
> 
> How should the value from PCIE_GET_ATU_INB_UNR_REG_ADDR be used?

My original way was this one:

+/* Register address builder */
+#define PCIE_GET_ATU_OUTB_UNR_REG_ADDR(region, register)   \
+   ((0x3 << 20) | (region << 9) |  \
+   (register << 2))

Bjorn then converted to offset:

#define PCIE_GET_ATU_OUTB_UNR_REG_OFFSET(region)  ((0x3 << 20) | (region << 9))

and applied the <<2 shift to the ATU registers.

So you can use:

#define PCIE_GET_ATU_INB_UNR_REG_ADDR(region, register) \
((0x3 << 20) | (region << 9) |  \
(0x1 << 8))

Thanks.

> 
> Thanks
> Kishon
> 



Re: [f2fs-dev] [PATCH] f2fs: don't allow rename unencrypted file to encrypted directory

2017-03-08 Thread Kinglong Mee
On 3/8/2017 20:08, Chao Yu wrote:
> In commit d9cdc9033181 ("ext4 crypto: enforce context consistency") we
> declared that:
> 
> 2) All files or directories in a directory must be protected using the
> same key as their containing directory.
> 
> But in f2fs_cross_rename there is a vulnerability that allow to cross
> rename unencrypted file into encrypted directory, it needs to be refused.

fscrypt_has_permitted_context() already does the same check as this patch:

	/* no restrictions if the parent directory is not encrypted */
	if (!parent->i_sb->s_cop->is_encrypted(parent))
		return 1;
	/* if the child directory is not encrypted, this is always a problem */
	if (!parent->i_sb->s_cop->is_encrypted(child))
		return 0;

So a cross rename of an unencrypted file into an encrypted directory is
already refused right now.
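
The reasoning above can be sketched as a small userspace model. Everything here is hypothetical (the demo_inode type, the key_id field, the function names); it only mirrors the two checks quoted above plus a same-key comparison, and applies them in both directions as a cross rename would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an inode: encrypted or not, and with which key. */
struct demo_inode {
    bool encrypted;
    int  key_id;
};

/* Mirrors the shape of the quoted checks; returns 1 if child may live in parent. */
int demo_has_permitted_context(const struct demo_inode *parent,
                               const struct demo_inode *child)
{
    /* no restrictions if the parent directory is not encrypted */
    if (!parent->encrypted)
        return 1;
    /* an unencrypted child in an encrypted parent is always a problem */
    if (!child->encrypted)
        return 0;
    /* both encrypted: they must use the same key */
    return parent->key_id == child->key_id;
}

/* A cross rename swaps two entries, so each must be allowed in the other's directory. */
int demo_cross_rename_allowed(const struct demo_inode *old_dir,
                              const struct demo_inode *old_inode,
                              const struct demo_inode *new_dir,
                              const struct demo_inode *new_inode)
{
    return demo_has_permitted_context(new_dir, old_inode) &&
           demo_has_permitted_context(old_dir, new_inode);
}

void demo_rename_selfcheck(void)
{
    struct demo_inode enc_dir    = { true, 1 };
    struct demo_inode enc_file   = { true, 1 };
    struct demo_inode plain_dir  = { false, 0 };
    struct demo_inode plain_file = { false, 0 };

    /* Swapping encry/hello with an unencrypted file is refused. */
    assert(!demo_cross_rename_allowed(&enc_dir, &enc_file, &plain_dir, &plain_file));
    /* Swapping two files protected by the same key is fine. */
    assert(demo_cross_rename_allowed(&enc_dir, &enc_file, &enc_dir, &enc_file));
}
```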

I have an encrypted directory "encry"; "new" is an unencrypted file.

[root@nfstestnic f2fs]# renameat2 -x encry/hello new
Operation not permitted
[root@nfstestnic f2fs]# renameat2 -x encry/hello new
Operation not permitted

How do you test it? 

thanks,
Kinglong Mee
> 
> Signed-off-by: Chao Yu 
> ---
>  fs/f2fs/namei.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
> index 25c073f6c7d4..8de684b84cb9 100644
> --- a/fs/f2fs/namei.c
> +++ b/fs/f2fs/namei.c
> @@ -855,6 +855,10 @@ static int f2fs_cross_rename(struct inode *old_dir, 
> struct dentry *old_dentry,
>   !fscrypt_has_encryption_key(new_dir)))
>   return -ENOKEY;
>  
> + if (f2fs_encrypted_inode(old_dir) && !f2fs_encrypted_inode(new_inode) ||
> + f2fs_encrypted_inode(new_dir) && 
> !f2fs_encrypted_inode(old_inode))
> + return -EPERM;
> +
>   if ((f2fs_encrypted_inode(old_dir) || f2fs_encrypted_inode(new_dir)) &&
>   (old_dir != new_dir) &&
>   (!fscrypt_has_permitted_context(new_dir, old_inode) ||
> 


Re: [PATCH 0/7] 5-level paging: prepare generic code

2017-03-08 Thread Michal Hocko
On Wed 08-03-17 18:07:42, Kirill A. Shutemov wrote:
> On Wed, Mar 08, 2017 at 03:25:01PM +0100, Michal Hocko wrote:
> > Btw. my build test machinery has reported this:
> > microblaze/allnoconfig
> 
> Thanks.
> 
> Fixup is below. I guess it should be folded into 4/7.

yes, this has passed the testing

> 
> diff --git a/arch/microblaze/include/asm/page.h 
> b/arch/microblaze/include/asm/page.h
> index fd850879854d..d506bb0893f9 100644
> --- a/arch/microblaze/include/asm/page.h
> +++ b/arch/microblaze/include/asm/page.h
> @@ -95,7 +95,8 @@ typedef struct { unsigned long pgd; } pgd_t;
>  #   else /* CONFIG_MMU */
>  typedef struct { unsigned long   ste[64]; }  pmd_t;
>  typedef struct { pmd_t   pue[1]; }   pud_t;
> -typedef struct { pud_t   pge[1]; }   pgd_t;
> +typedef struct { pud_t   p4e[1]; }   p4d_t;
> +typedef struct { p4d_t   pge[1]; }   pgd_t;
>  #   endif /* CONFIG_MMU */
>  
>  # define pte_val(x)  ((x).pte)
> -- 
>  Kirill A. Shutemov

-- 
Michal Hocko
SUSE Labs
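
To see why the fixup is harmless for a configuration that folds the upper levels, here is a standalone sketch. The demo_ types are hypothetical copies of the typedefs in the quoted hunk; each level wraps exactly one entry of the level below, so inserting the p4d level changes neither the size nor the address of the first real entry.

```c
#include <assert.h>

/* Hypothetical copies of the folded page-table types from the hunk above. */
typedef struct { unsigned long ste[64]; } demo_pmd_t;
typedef struct { demo_pmd_t    pue[1];  } demo_pud_t;
typedef struct { demo_pud_t    p4e[1];  } demo_p4d_t;
typedef struct { demo_p4d_t    pge[1];  } demo_pgd_t;

/* Walk down through every folded level to the first real entry. */
unsigned long *demo_first_entry(demo_pgd_t *pgd)
{
    return &pgd->pge[0].p4e[0].pue[0].ste[0];
}

void demo_paging_selfcheck(void)
{
    demo_pgd_t pgd;

    /* The extra level adds no storage. */
    assert(sizeof(demo_pgd_t) == sizeof(demo_pmd_t));
    /* The first entry still sits at the start of the top-level object. */
    assert((void *)demo_first_entry(&pgd) == (void *)&pgd);
}
```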


[PATCH 1/3] usb: orion-ehci: Add support for the Armada 3700

2017-03-08 Thread Gregory CLEMENT
From: jinghua 

- Add a new compatible string for the Armada 3700 SoCs.

- Add sbuscfg support to the orion usb controller driver. On SoCs
  without hlock, the BAWR/BARD/AHBBRST fields in the sbuscfg register
  must be programmed to guarantee that the AHB master's bursts do not
  overrun or underrun the FIFO.

- The sbuscfg register has to be set after the usb controller reset,
  otherwise the value is overridden to 0. To do this, a reset callback
  is registered.

[gregory.clem...@free-electrons.com: - reword commit and comments
 - fix checkpatch warning]
Signed-off-by: jinghua 
Signed-off-by: Gregory CLEMENT 
---
 .../devicetree/bindings/usb/ehci-orion.txt |  4 ++-
 drivers/usb/host/ehci-orion.c  | 39 ++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/usb/ehci-orion.txt 
b/Documentation/devicetree/bindings/usb/ehci-orion.txt
index 17c3bc858b86..9dfffc9dffec 100644
--- a/Documentation/devicetree/bindings/usb/ehci-orion.txt
+++ b/Documentation/devicetree/bindings/usb/ehci-orion.txt
@@ -1,7 +1,9 @@
 * EHCI controller, Orion Marvell variants
 
 Required properties:
-- compatible: must be "marvell,orion-ehci"
+- compatible: could be one of the following
+   "marvell,orion-ehci"
+   "marvell,armada-3700-ehci"
 - reg: physical base address of the controller and length of memory mapped
   region.
 - interrupts: The EHCI interrupt
diff --git a/drivers/usb/host/ehci-orion.c b/drivers/usb/host/ehci-orion.c
index ee8d5faa0194..cf778e166b90 100644
--- a/drivers/usb/host/ehci-orion.c
+++ b/drivers/usb/host/ehci-orion.c
@@ -47,6 +47,21 @@
 #define USB_PHY_IVREF_CTRL 0x440
 #define USB_PHY_TST_GRP_CTRL   0x450
 
+#define USB_SBUSCFG			0x90
+#define	USB_SBUSCFG_BAWR		0x6
+#define	USB_SBUSCFG_BARD		0x3
+#define	USB_SBUSCFG_AHBBRST		0x0
+
+/* BAWR = BARD = 3 : Align read/write bursts packets larger than 128 bytes */
+#define USB_SBUSCFG_BAWR_ALIGN_128B	0x3
+#define USB_SBUSCFG_BARD_ALIGN_128B	0x3
+/* AHBBRST = 3: Align AHB Burst to INCR16 (64 bytes) */
+#define USB_SBUSCFG_AHBBRST_INCR16	0x3
+
+#define USB_SBUSCFG_DEF_VAL ((USB_SBUSCFG_BAWR_ALIGN_128B << USB_SBUSCFG_BAWR) \
+			| (USB_SBUSCFG_BARD_ALIGN_128B << USB_SBUSCFG_BARD) \
+			| (USB_SBUSCFG_AHBBRST_INCR16 << USB_SBUSCFG_AHBBRST))
+
 #define DRIVER_DESC "EHCI orion driver"
 
 #define hcd_to_orion_priv(h) ((struct orion_ehci_hcd *)hcd_to_ehci(h)->priv)
@@ -151,8 +166,31 @@ ehci_orion_conf_mbus_windows(struct usb_hcd *hcd,
}
 }
 
+static int ehci_orion_drv_reset(struct usb_hcd *hcd)
+{
+   struct device *dev = hcd->self.controller;
+   int retval;
+
+   retval = ehci_setup(hcd);
+   if (retval)
+   dev_err(dev, "ehci_setup failed %d\n", retval);
+
+   /*
+* For SoC without hlock, need to program sbuscfg value to guarantee
+* AHB master's burst would not overrun or underrun FIFO.
+*
+* sbuscfg reg has to be set after usb controller reset, otherwise
+* the value would be override to 0.
+*/
+   if (of_device_is_compatible(dev->of_node, "marvell,armada-3700-ehci"))
+   wrl(USB_SBUSCFG, USB_SBUSCFG_DEF_VAL);
+
+   return retval;
+}
+
 static const struct ehci_driver_overrides orion_overrides __initconst = {
.extra_priv_size =  sizeof(struct orion_ehci_hcd),
+   .reset = ehci_orion_drv_reset,
 };
 
 static int ehci_orion_drv_probe(struct platform_device *pdev)
@@ -310,6 +348,7 @@ static int ehci_orion_drv_remove(struct platform_device 
*pdev)
 
 static const struct of_device_id ehci_orion_dt_ids[] = {
{ .compatible = "marvell,orion-ehci", },
+   { .compatible = "marvell,armada-3700-ehci", },
{},
 };
 MODULE_DEVICE_TABLE(of, ehci_orion_dt_ids);
-- 
2.11.0
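
The default value this patch writes to the sbuscfg register can be checked by hand: with all three fields set to 3 and shifts of 6, 3 and 0, it is 0xC0 | 0x18 | 0x03 = 0xDB. A minimal sketch of that arithmetic follows; the DEMO_ names are hypothetical and only restate the shifts and field values from the patch.

```c
#include <assert.h>

/* Field shifts and values restated from the patch under hypothetical names. */
#define DEMO_SBUSCFG_BAWR_SHIFT     6
#define DEMO_SBUSCFG_BARD_SHIFT     3
#define DEMO_SBUSCFG_AHBBRST_SHIFT  0
#define DEMO_SBUSCFG_FIELD_VAL      0x3u

/* Compose the register value the same way USB_SBUSCFG_DEF_VAL does. */
unsigned int demo_sbuscfg_def_val(void)
{
    return (DEMO_SBUSCFG_FIELD_VAL << DEMO_SBUSCFG_BAWR_SHIFT) |
           (DEMO_SBUSCFG_FIELD_VAL << DEMO_SBUSCFG_BARD_SHIFT) |
           (DEMO_SBUSCFG_FIELD_VAL << DEMO_SBUSCFG_AHBBRST_SHIFT);
}

void demo_sbuscfg_selfcheck(void)
{
    /* 0xC0 (BAWR) | 0x18 (BARD) | 0x03 (AHBBRST) */
    assert(demo_sbuscfg_def_val() == 0xDB);
}
```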



Re: [v6 PATCH 21/21] selftests/x86: Add tests for User-Mode Instruction Prevention

2017-03-08 Thread Andy Lutomirski
On Tue, Mar 7, 2017 at 4:32 PM, Ricardo Neri
 wrote:
> Certain user space programs that run on virtual-8086 mode may utilize
> instructions protected by the User-Mode Instruction Prevention (UMIP)
> security feature present in new Intel processors: SGDT, SIDT and SMSW. In
> such a case, a general protection fault is issued if UMIP is enabled. When
> such a fault happens, the kernel catches it and emulates the results of
> these instructions with dummy values. The purpose of this new
> test is to verify whether the impacted instructions can be executed without
> causing such #GP. If no #GP exceptions occur, we expect to exit virtual-
> 8086 mode from INT 0x80.
>
> The instructions protected by UMIP are executed in representative use
> cases:
>  a) the memory address of the result is given in the form of a displacement
> from the base of the data segment
>  b) the memory address of the result is given in a general purpose register
>  c) the result is stored directly in a general purpose register.
>
> Unfortunately, it is not possible to check the results against a set of
> expected values because no emulation will occur in systems that do not have
> the UMIP feature. Instead, results are printed for verification.

You could pre-initialize the result buffer to a bunch of non-matching
values (1, 2, 3, ...) and then check that all the invocations of the
same instruction gave the same value.

If you do this, maybe make it a follow-up patch -- see other email.
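
A minimal sketch of that idea, under stated assumptions: the instruction under test is modelled by a hypothetical callback that may or may not store a result, since the real SGDT/SIDT/SMSW invocations cannot be reproduced here. The buffer is pre-filled with distinct values, so a run that stores nothing, or stores something different, shows up as a mismatch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_RUNS 8

/* Hypothetical stand-in for one invocation of the instruction under test. */
typedef void (*demo_insn_fn)(unsigned long *out);

/* Behaves like a working (or emulated) instruction: always stores one value. */
void demo_insn_stores(unsigned long *out)
{
    *out = 0x33;
}

/* Behaves like a broken path that never writes its result. */
void demo_insn_silent(unsigned long *out)
{
    (void)out;
}

/* Returns 1 if every invocation produced the same value, 0 otherwise. */
int demo_results_consistent(demo_insn_fn insn, size_t runs)
{
    unsigned long results[DEMO_MAX_RUNS];
    size_t i;

    if (runs < 2 || runs > DEMO_MAX_RUNS)
        return 0;

    /* Pre-initialize with non-matching values: 1, 2, 3, ... */
    for (i = 0; i < runs; i++)
        results[i] = i + 1;

    for (i = 0; i < runs; i++)
        insn(&results[i]);

    for (i = 1; i < runs; i++)
        if (results[i] != results[0])
            return 0;
    return 1;
}

void demo_umip_selfcheck(void)
{
    assert(demo_results_consistent(demo_insn_stores, 4) == 1);
    /* Untouched slots keep their distinct seed values and fail the check. */
    assert(demo_results_consistent(demo_insn_silent, 4) == 0);
}
```

The check does not need to know the expected value, which is the property the suggestion relies on for systems without UMIP.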


Re: [PATCH 10/26] IB/ocrdma: Improve another size determination in ocrdma_init_emb_mqe()

2017-03-08 Thread Yuval Shaia
On Wed, Mar 08, 2017 at 02:02:46PM +0100, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Tue, 7 Mar 2017 20:33:29 +0100
> 
> Replace the specification of a data structure by a pointer dereference
> as the parameter for the operator "sizeof" to make the corresponding size
> determination a bit safer according to the Linux coding style convention.
> 
> Signed-off-by: Markus Elfring 
> ---
>  drivers/infiniband/hw/ocrdma/ocrdma_hw.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_hw.c 
> b/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
> index 7d1e1caa90de..aa32bc9f323d 100644
> --- a/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
> @@ -352,7 +352,7 @@ static void *ocrdma_init_emb_mqe(u8 opcode, u32 cmd_len)
>  {
>   struct ocrdma_mqe *mqe;
>  
> - mqe = kzalloc(sizeof(struct ocrdma_mqe), GFP_KERNEL);
> + mqe = kzalloc(sizeof(*mqe), GFP_KERNEL);
>   if (!mqe)
>   return NULL;
>   mqe->hdr.spcl_sge_cnt_emb |=

Reviewed-by: Yuval Shaia 
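
For context, the idiom being adopted can be sketched in userspace C, with calloc() standing in for kzalloc() and a hypothetical demo_mqe structure. The size is taken from the pointer being assigned, so it cannot go stale if the declared type changes later.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical structure standing in for the driver's message type. */
struct demo_mqe {
    unsigned int  hdr;
    unsigned char payload[60];
};

/* Allocate one zeroed element; the caller owns it and must free() it. */
struct demo_mqe *demo_alloc_mqe(void)
{
    struct demo_mqe *mqe;

    /* sizeof(*mqe) follows the declaration above automatically. */
    mqe = calloc(1, sizeof(*mqe));
    if (!mqe)
        return NULL;
    return mqe;
}

void demo_mqe_selfcheck(void)
{
    struct demo_mqe *mqe = demo_alloc_mqe();

    assert(mqe != NULL);
    assert(mqe->hdr == 0 && mqe->payload[59] == 0);
    free(mqe);
}
```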

> -- 
> 2.12.0
> 


Re: [RFC PATCH] sched/wait: Introduce new, more compact wait_event*() primitives

2017-03-08 Thread Linus Torvalds
On Wed, Mar 8, 2017 at 12:37 AM, Ingo Molnar  wrote:
>
> The idea is to allow call sites to supply the 'condition' function as 
> free-form C
> code, while pushing everything else into non-macro form: there's a 'struct
> wait_event_state' on stack, and a state machine. The waiting logic is 
> converted
> from procedural form to a state machine, because we have to call out into the
> 'condition' code in different circumstances.

Ok, I think the concept is fine, but you don't actually fix the
problem with the locked version that needs to unlock (with irq
versions etc) around the schedule.

And using "bool" in a struct is disgusting and wrong, and hides the
fact that the compiler will just turn it into "char" (or even "int"
for platforms where "char" is slow, like alpha).

So it would be better with a "state" variable that just has fields, I suspect.

.. and as mentioned, it doesn't actually fix the case that hit the
signal_pending() problem.

Honestly, I think my "pass in a waiter function" model was both less
subtle and indirect, and more generic.

And we can actually *fix* the problem with it for 4.11, instead of
adding the stupid header file includes.

 Linus


Re: RFC: SysRq nice-all-RT-tasks is broken

2017-03-08 Thread Steven Rostedt
On Wed, 8 Mar 2017 12:40:12 -0500
Steven Rostedt  wrote:

> I wonder if we should just have a special flag sent by that sysrq
> trigger. Since it is causing all tasks to go "nice" there's no need to
> do the pi chain walk in __sched_setscheduler().

Hah, there already is a flag!

Laurent, can you test this patch:

-- Steve

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3b31fc0..7292fa9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4129,8 +4129,8 @@ static int __sched_setscheduler(struct task_struct *p,
int queue_flags = DEQUEUE_SAVE | DEQUEUE_MOVE;
struct rq *rq;
 
-   /* May grab non-irq protected spin_locks: */
-   BUG_ON(in_interrupt());
+   /* The pi code expects interrupts enabled */
+   BUG_ON(pi && in_interrupt());
 recheck:
/* Double check policy once rq lock held: */
if (policy < 0) {


Re: [PATCH v5] NTB: Add IDT 89HPESxNTx PCIe-switches support

2017-03-08 Thread Jon Mason
On Tue, Mar 07, 2017 at 05:02:38AM +0300, Serge Semin wrote:
> IDT 89HPESxNTx device series is PCIe-switches, which support
> Non-Transparent bridging between domains connected to the device ports.
> Since new NTB API exposes multi-port interface and messaging API, the
> IDT NT-functions can be now supported in the kernel. This driver adds
> the following functionality:
> 1) Multi-port NTB API to have information of possible NT-functions
> activated in compliance with available device ports.
> 2) Memory windows of direct and look up table based address translation
> with all possible combinations of BARs setup.
> 3) Traditional doorbell NTB API.
> 4) One-on-one messaging NTB API.
> 
> There are some IDT PCIe-switch setups, which must be done before any of
> the NTB peers started. It can be performed either by system BIOS via
> IDT SMBus-slave interface or by pre-initialized IDT PCIe-switch EEPROM:
> 1) NT-functions of corresponding ports must be activated using
> SWPARTxCTL and SWPORTxCTL registers.
> 2) BAR0 must be configured to expose NT-function configuration
> registers map.
> 3) The rest of the BARs must have at least one memory window
> configured, otherwise the driver will just return an error.
> Temperature sensor of IDT PCIe-switches can be also optionally
> activated by BIOS or EEPROM.
> (See IDT documentations for details of how the pre-initialization can
> be done)
> 
> Signed-off-by: Serge Semin 
> Acked-by: Allen Hubbe 
> 
> ---
> 
> Changelog v2:
> - Fix minor checkpatch.pl issues
> - Get rid of obfuscating macros
> 
> Changelog v3:
> - No write to registers if address is either out of bound or unaligned
> - Fix idt_reg_set_bits()/idt_reg_clear_bits() methods race condition
> - Fix invalid argument of write method called from
> idt_reg_set_bits()/idt_reg_clear_bits() functions
> - Add appropriate naming of function idt_get_mw_size()
> - Fix some documentation notes
> - Replace symbolic permission S_IRUSR with octal 0400
> 
> Changelog v4:
> - Return ~0 on read from registers with invalid address
> - Don't check bits validity on registers bits clearing
> - Keep up driver loading (just print a warning) if there is no any peer
> NTBs found
> - Fix unnecessary branching logic
> - Fix some documentation notes
> 
> Changelog v5:
> - Fix minor documentation issues
> - Replace writel/readl with iowrite32/ioread32 methods
> - Discard dev_*() wrappers with origins
> - Use pci_alloc_irq_vectors() for IRQ number and ISR initialization
> - Use Mananged Device Resource as much as possible:
> devm_request_threaded_irq(), pcim_iomap_regions_request_all()
> 
>  drivers/ntb/hw/Kconfig  |1 +
>  drivers/ntb/hw/Makefile |1 +
>  drivers/ntb/hw/idt/Kconfig  |   31 +
>  drivers/ntb/hw/idt/Makefile |1 +
>  drivers/ntb/hw/idt/ntb_hw_idt.c | 2600 
> +++
>  drivers/ntb/hw/idt/ntb_hw_idt.h | 1149 +
>  6 files changed, 3783 insertions(+)
>  create mode 100644 drivers/ntb/hw/idt/Kconfig
>  create mode 100644 drivers/ntb/hw/idt/Makefile
>  create mode 100644 drivers/ntb/hw/idt/ntb_hw_idt.c
>  create mode 100644 drivers/ntb/hw/idt/ntb_hw_idt.h
> 
> diff --git a/drivers/ntb/hw/Kconfig b/drivers/ntb/hw/Kconfig
> index 7116472..a89243c 100644
> --- a/drivers/ntb/hw/Kconfig
> +++ b/drivers/ntb/hw/Kconfig
> @@ -1,2 +1,3 @@
>  source "drivers/ntb/hw/amd/Kconfig"
> +source "drivers/ntb/hw/idt/Kconfig"
>  source "drivers/ntb/hw/intel/Kconfig"
> diff --git a/drivers/ntb/hw/Makefile b/drivers/ntb/hw/Makefile
> index 532e085..87332c3 100644
> --- a/drivers/ntb/hw/Makefile
> +++ b/drivers/ntb/hw/Makefile
> @@ -1,2 +1,3 @@
>  obj-$(CONFIG_NTB_AMD)+= amd/
> +obj-$(CONFIG_NTB_IDT)+= idt/
>  obj-$(CONFIG_NTB_INTEL)  += intel/
> diff --git a/drivers/ntb/hw/idt/Kconfig b/drivers/ntb/hw/idt/Kconfig
> new file mode 100644
> index 000..b360e56
> --- /dev/null
> +++ b/drivers/ntb/hw/idt/Kconfig
> @@ -0,0 +1,31 @@
> +config NTB_IDT
> + tristate "IDT PCIe-switch Non-Transparent Bridge support"
> + depends on PCI
> + help
> +  This driver supports NTB of cappable IDT PCIe-switches.
> +
> +  Some of the pre-initializations must be made before IDT PCIe-switch
> +  exposes it NT-functions correctly. It should be done by either proper
> +  initialisation of EEPROM connected to master smbus of the switch or
> +  by BIOS using slave-SMBus interface changing corresponding registers
> +  value. Evidently it must be done before PCI bus enumeration is
> +  finished in Linux kernel.
> +
> +  First of all partitions must be activated and properly assigned to all
> +  the ports with NT-functions intended to be activated (see SWPARTxCTL
> +  and SWPORTxCTL registers). Then all NT-function BARs must be enabled
> +  with chosen valid aperture. For memory windows related BARs the
> +  aperture settings shall determine the maximum size of memory 

[PATCH 6/6] KVM: nVMX: support RDRAND and RDSEED exiting

2017-03-08 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 arch/x86/include/asm/vmx.h | 2 ++
 arch/x86/kvm/vmx.c | 5 +
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index cc54b7026567..b2b6e5b1782b 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -70,8 +70,10 @@
 #define SECONDARY_EXEC_APIC_REGISTER_VIRT       0x00000100
 #define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY    0x00000200
 #define SECONDARY_EXEC_PAUSE_LOOP_EXITING       0x00000400
+#define SECONDARY_EXEC_RDRAND                   0x00000800
 #define SECONDARY_EXEC_ENABLE_INVPCID           0x00001000
 #define SECONDARY_EXEC_SHADOW_VMCS              0x00004000
+#define SECONDARY_EXEC_RDSEED                   0x00010000
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
 #define SECONDARY_EXEC_XSAVES                   0x00100000
 #define SECONDARY_EXEC_TSC_SCALING              0x02000000
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a3395e23cf5a..23b304fc72ec 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2749,6 +2749,7 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx 
*vmx)
vmx->nested.nested_vmx_secondary_ctls_high);
vmx->nested.nested_vmx_secondary_ctls_low = 0;
vmx->nested.nested_vmx_secondary_ctls_high &=
+   SECONDARY_EXEC_RDRAND | SECONDARY_EXEC_RDSEED |
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
SECONDARY_EXEC_RDTSCP |
SECONDARY_EXEC_DESC |
@@ -8132,6 +8133,10 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu 
*vcpu)
return nested_cpu_has(vmcs12, CPU_BASED_INVLPG_EXITING);
case EXIT_REASON_RDPMC:
return nested_cpu_has(vmcs12, CPU_BASED_RDPMC_EXITING);
+   case EXIT_REASON_RDRAND:
+   return nested_cpu_has2(vmcs12, SECONDARY_EXEC_RDRAND);
+   case EXIT_REASON_RDSEED:
+   return nested_cpu_has2(vmcs12, SECONDARY_EXEC_RDSEED);
case EXIT_REASON_RDTSC: case EXIT_REASON_RDTSCP:
return nested_cpu_has(vmcs12, CPU_BASED_RDTSC_EXITING);
case EXIT_REASON_VMCALL: case EXIT_REASON_VMCLEAR:
-- 
1.8.3.1
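
For readers skimming the thread, the decision this patch adds can be sketched in plain C. This is only an illustration, not the kernel code: struct sketch_vmcs12, sketch_has2() and sketch_exit_to_l1() are hypothetical names, the bit values and exit reason numbers are taken from the Intel SDM layout (bits 11 and 16 of the secondary controls, basic exit reasons 57 and 61), and the check of the "activate secondary controls" bit is my assumption of what nested_cpu_has2() does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values follow the Intel SDM bit layout; the names mirror the patch. */
#define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000u
#define SECONDARY_EXEC_RDRAND                 0x00000800u
#define SECONDARY_EXEC_RDSEED                 0x00010000u
#define EXIT_REASON_RDRAND                    57
#define EXIT_REASON_RDSEED                    61

/* Hypothetical, trimmed-down stand-in for struct vmcs12. */
struct sketch_vmcs12 {
	uint32_t cpu_based_vm_exec_control;
	uint32_t secondary_vm_exec_control;
};

/*
 * Assumed behaviour of nested_cpu_has2(): a secondary control only counts
 * when L1 has also activated the secondary controls in the primary field.
 */
static bool sketch_has2(const struct sketch_vmcs12 *v, uint32_t bit)
{
	return (v->cpu_based_vm_exec_control &
		CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) &&
	       (v->secondary_vm_exec_control & bit);
}

/* Returns true when the exit should be reflected to L1. */
static bool sketch_exit_to_l1(const struct sketch_vmcs12 *v,
			      uint32_t exit_reason)
{
	switch (exit_reason) {
	case EXIT_REASON_RDRAND:
		return sketch_has2(v, SECONDARY_EXEC_RDRAND);
	case EXIT_REASON_RDSEED:
		return sketch_has2(v, SECONDARY_EXEC_RDSEED);
	default:
		/* Other exit reasons are out of scope for this sketch. */
		return false;
	}
}
```

The point is that L0 reflects the exit to L1 only if L1 asked for it in vmcs12; otherwise L0 keeps handling the instruction itself.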



[PATCH v2 5/7] PCI: dwc: all: Modify dbi accessors to access data of 4/2/1 bytes

2017-03-08 Thread Kishon Vijay Abraham I
Previously, the dbi accessors could only access data of size 4
bytes. However, there are situations (such as accessing
MSI_MESSAGE_CONTROL in order to set/get the number of required
MSI interrupts in EP mode) where the dbi accessors must
access data of size 2. Modify the accessors to handle data of
4, 2 or 1 bytes. This is in preparation for adding endpoint
mode support to the designware driver.

Cc: Jingoo Han 
Cc: Richard Zhu 
Cc: Lucas Stach 
Cc: Murali Karicheri 
Cc: Thomas Petazzoni 
Cc: Niklas Cassel 
Cc: Jesper Nilsson 
Cc: Joao Pinto 
Cc: Zhou Wang 
Cc: Gabriele Paoloni 
Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/Kconfig                |   18 
 drivers/pci/dwc/pci-dra7xx.c           |    8 ++--
 drivers/pci/dwc/pci-exynos.c           |   16 +++
 drivers/pci/dwc/pci-imx6.c             |   54 +++---
 drivers/pci/dwc/pci-keystone-dw.c      |   13 +++---
 drivers/pci/dwc/pcie-armada8k.c        |   38 
 drivers/pci/dwc/pcie-artpec6.c         |    6 +--
 drivers/pci/dwc/pcie-designware-host.c |   18 
 drivers/pci/dwc/pcie-designware.c      |   77 +++-
 drivers/pci/dwc/pcie-designware.h      |   14 +++---
 drivers/pci/dwc/pcie-hisi.c            |   14 +++---
 11 files changed, 147 insertions(+), 129 deletions(-)
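
A rough userspace sketch of what a size-aware accessor pair might look like is given here for illustration. It is not the code from this patch: struct fake_dbi, fake_dbi_read() and fake_dbi_write() are hypothetical names, and a byte array stands in for the ioremapped DBI space that a real driver would access with readl/readw/readb. The only thing taken from the patch is the idea of passing the access size (4, 2 or 1) as a parameter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the memory-mapped DBI register space. */
struct fake_dbi {
	uint8_t regs[64];
};

/* Accept only naturally aligned, in-bounds accesses of 1, 2 or 4 bytes. */
static int fake_dbi_check(size_t reg, size_t size, size_t limit)
{
	if (size != 1 && size != 2 && size != 4)
		return -1;
	if (reg % size != 0 || reg + size > limit)
		return -1;
	return 0;
}

/* Read 'size' bytes at offset 'reg', little-endian as in PCI config space. */
static int fake_dbi_read(const struct fake_dbi *dbi, size_t reg, size_t size,
			 uint32_t *val)
{
	size_t i;

	if (fake_dbi_check(reg, size, sizeof(dbi->regs)))
		return -1;

	*val = 0;
	for (i = 0; i < size; i++)
		*val |= (uint32_t)dbi->regs[reg + i] << (8 * i);
	return 0;
}

/* Write the low 'size' bytes of 'val' at offset 'reg'. */
static int fake_dbi_write(struct fake_dbi *dbi, size_t reg, size_t size,
			  uint32_t val)
{
	size_t i;

	if (fake_dbi_check(reg, size, sizeof(dbi->regs)))
		return -1;

	for (i = 0; i < size; i++)
		dbi->regs[reg + i] = (uint8_t)(val >> (8 * i));
	return 0;
}
```

With something of this shape, a 16-bit field such as MSI_MESSAGE_CONTROL can be updated with a 2-byte access and without a read-modify-write of the neighbouring 16 bits.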

diff --git a/drivers/pci/dwc/Kconfig b/drivers/pci/dwc/Kconfig
index dfb8a69..cb3d5d0 100644
--- a/drivers/pci/dwc/Kconfig
+++ b/drivers/pci/dwc/Kconfig
@@ -36,7 +36,7 @@ config PCIE_DW_PLAT
 config PCI_EXYNOS
bool "Samsung Exynos PCIe controller"
depends on PCI
-   depends on SOC_EXYNOS5440
+   depends on SOC_EXYNOS5440 || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
select PCIE_DW_HOST
@@ -44,7 +44,7 @@ config PCI_EXYNOS
 config PCI_IMX6
bool "Freescale i.MX6 PCIe controller"
depends on PCI
-   depends on SOC_IMX6Q
+   depends on SOC_IMX6Q || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
select PCIE_DW_HOST
@@ -52,7 +52,7 @@ config PCI_IMX6
 config PCIE_SPEAR13XX
bool "STMicroelectronics SPEAr PCIe controller"
depends on PCI
-   depends on ARCH_SPEAR13XX
+   depends on ARCH_SPEAR13XX || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
select PCIE_DW_HOST
@@ -62,7 +62,7 @@ config PCIE_SPEAR13XX
 config PCI_KEYSTONE
bool "TI Keystone PCIe controller"
depends on PCI
-   depends on ARCH_KEYSTONE
+   depends on ARCH_KEYSTONE || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
select PCIE_DW_HOST
@@ -75,7 +75,7 @@ config PCI_KEYSTONE
 config PCI_LAYERSCAPE
bool "Freescale Layerscape PCIe controller"
depends on PCI
-   depends on OF && (ARM || ARCH_LAYERSCAPE)
+   depends on OF && (ARM || ARCH_LAYERSCAPE || COMPILE_TEST)
depends on PCI_MSI_IRQ_DOMAIN
select MFD_SYSCON
select PCIE_DW_HOST
@@ -83,7 +83,7 @@ config PCI_LAYERSCAPE
  Say Y here if you want PCIe controller support on Layerscape SoCs.
 
 config PCI_HISI
-   depends on OF && ARM64
+   depends on OF && (ARM64 || COMPILE_TEST)
bool "HiSilicon Hip05 and Hip06 SoCs PCIe controllers"
depends on PCI
depends on PCI_MSI_IRQ_DOMAIN
@@ -96,7 +96,7 @@ config PCI_HISI
 config PCIE_QCOM
bool "Qualcomm PCIe controller"
depends on PCI
-   depends on ARCH_QCOM && OF
+   depends on (ARCH_QCOM || COMPILE_TEST) && OF
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
select PCIE_DW_HOST
@@ -108,7 +108,7 @@ config PCIE_QCOM
 config PCIE_ARMADA_8K
bool "Marvell Armada-8K PCIe controller"
depends on PCI
-   depends on ARCH_MVEBU
+   depends on ARCH_MVEBU || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
select PCIE_DW_HOST
@@ -121,7 +121,7 @@ config PCIE_ARMADA_8K
 config PCIE_ARTPEC6
bool "Axis ARTPEC-6 PCIe controller"
depends on PCI
-   depends on MACH_ARTPEC6
+   depends on MACH_ARTPEC6 || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
select PCIE_DW_HOST
diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index 3708bd6..c6fef0a 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -499,9 +499,9 @@ static int dra7xx_pcie_suspend(struct device *dev)
u32 val;
 
/* clear MSE */
-   val = dw_pcie_readl_dbi(pci, base, PCI_COMMAND);
+   val = dw_pcie_read_dbi(pci, base, PCI_COMMAND, 0x4);
val &= ~PCI_COMMAND_MEMORY;
-   dw_pcie_writel_dbi(pci, base, PCI_COMMAND, val);
+   

[PATCH 1/6] KVM: nVMX: we support 1GB EPT pages

2017-03-08 Thread Paolo Bonzini
Large pages at the PDPE level can be emulated by the MMU, so the bit
can be set unconditionally in the EPT capabilities MSR.  The same is
true of 2MB EPT pages, though all Intel processors with EPT in practice
support those.

Signed-off-by: Paolo Bonzini 
---
 arch/x86/kvm/vmx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 283aa8601833..89b74d9bc357 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2764,14 +2764,14 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
vmx->nested.nested_vmx_secondary_ctls_high |=
SECONDARY_EXEC_ENABLE_EPT;
vmx->nested.nested_vmx_ept_caps = VMX_EPT_PAGE_WALK_4_BIT |
-VMX_EPTP_WB_BIT | VMX_EPT_2MB_PAGE_BIT |
-VMX_EPT_INVEPT_BIT;
+VMX_EPTP_WB_BIT | VMX_EPT_INVEPT_BIT;
if (cpu_has_vmx_ept_execute_only())
vmx->nested.nested_vmx_ept_caps |=
VMX_EPT_EXECUTE_ONLY_BIT;
vmx->nested.nested_vmx_ept_caps &= vmx_capability.ept;
vmx->nested.nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT |
-   VMX_EPT_EXTENT_CONTEXT_BIT;
+   VMX_EPT_EXTENT_CONTEXT_BIT | VMX_EPT_2MB_PAGE_BIT |
+   VMX_EPT_1GB_PAGE_BIT;
} else
vmx->nested.nested_vmx_ept_caps = 0;
 
-- 
1.8.3.1
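
To make the ordering in the hunk above easier to follow, here is a standalone sketch of the capability computation after this patch. It is an illustration under assumptions, not the kernel function: sketch_nested_ept_caps() is a hypothetical name, hw_caps stands in for vmx_capability.ept, the hardware check replaces cpu_has_vmx_ept_execute_only(), and the bit positions are the ones I believe the kernel's vmx.h uses for these VMX_EPT_* names.

```c
#include <stdint.h>

/* Assumed bit positions for the EPT capability bits named in the patch. */
#define VMX_EPT_EXECUTE_ONLY_BIT   (1u << 0)
#define VMX_EPT_PAGE_WALK_4_BIT    (1u << 6)
#define VMX_EPTP_WB_BIT            (1u << 14)
#define VMX_EPT_2MB_PAGE_BIT       (1u << 16)
#define VMX_EPT_1GB_PAGE_BIT       (1u << 17)
#define VMX_EPT_INVEPT_BIT         (1u << 20)
#define VMX_EPT_EXTENT_CONTEXT_BIT (1u << 25)
#define VMX_EPT_EXTENT_GLOBAL_BIT  (1u << 26)

/* hw_caps is a stand-in for what the host CPU reports. */
static uint32_t sketch_nested_ept_caps(uint32_t hw_caps)
{
	uint32_t caps = VMX_EPT_PAGE_WALK_4_BIT | VMX_EPTP_WB_BIT |
			VMX_EPT_INVEPT_BIT;

	if (hw_caps & VMX_EPT_EXECUTE_ONLY_BIT)
		caps |= VMX_EPT_EXECUTE_ONLY_BIT;

	/* Features that depend on the host CPU are masked here ... */
	caps &= hw_caps;

	/* ... while features the MMU can emulate are added afterwards. */
	caps |= VMX_EPT_EXTENT_GLOBAL_BIT | VMX_EPT_EXTENT_CONTEXT_BIT |
		VMX_EPT_2MB_PAGE_BIT | VMX_EPT_1GB_PAGE_BIT;

	return caps;
}
```

Because the large-page bits are ORed in after the mask, L1 sees 2MB and 1GB EPT pages even on a host whose capability MSR lacks them.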



