Re: [PATCH v4 5/5] staging/android: add flags member to sync ioctl structs

2016-02-28 Thread Emil Velikov
On 27 February 2016 at 15:27, Gustavo Padovan
 wrote:
> Hi Emil,
>
> 2016-02-27 Emil Velikov :
>
>> Hi Gustavo,
>>
>> On 26 February 2016 at 18:31, Gustavo Padovan  wrote:
>> > From: Gustavo Padovan 
>> >
>> > Play safe and add a flags member to all structs, so we don't need to
>> > break the API or create a new IOCTL in the future if new features that
>> > require flags arise.
>> >
>> > v2: check if flags are valid (zero, in this case)
>> >
>> > Signed-off-by: Gustavo Padovan 
>> > ---
>> >  drivers/staging/android/sync.c  | 7 ++-
>> >  drivers/staging/android/uapi/sync.h | 6 ++
>> >  2 files changed, 12 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/staging/android/sync.c 
>> > b/drivers/staging/android/sync.c
>> > index 837cff5..54fd5ab 100644
>> > --- a/drivers/staging/android/sync.c
>> > +++ b/drivers/staging/android/sync.c
>> > @@ -445,6 +445,11 @@ static long sync_file_ioctl_merge(struct sync_file 
>> > *sync_file,
>> > goto err_put_fd;
>> > }
>> >
>> > +   if (data.flags) {
>> > +   err = -EFAULT;
>> -EINVAL ?
>>
>> > +   goto err_put_fd;
>> > +   }
>> > +
>> > fence2 = sync_file_fdget(data.fd2);
>> > if (!fence2) {
>> > err = -ENOENT;
>> > @@ -511,7 +516,7 @@ static long sync_file_ioctl_fence_info(struct 
>> > sync_file *sync_file,
>> > if (copy_from_user(&in, (void __user *)arg, sizeof(*info)))
>> > return -EFAULT;
>> >
>> > -   if (in.status || strcmp(in.name, "\0"))
>> > +   if (in.status || in.flags || strcmp(in.name, "\0"))
>> > return -EFAULT;
>> -EINVAL ?
>>
>> >
>> > if (in.num_fences && !in.sync_fence_info)
>> > diff --git a/drivers/staging/android/uapi/sync.h 
>> > b/drivers/staging/android/uapi/sync.h
>> > index 9aad623..f56a6c2 100644
>> > --- a/drivers/staging/android/uapi/sync.h
>> > +++ b/drivers/staging/android/uapi/sync.h
>> > @@ -19,11 +19,13 @@
>> >   * @fd2:   file descriptor of second fence
>> >   * @name:  name of new fence
>> >   * @fence: returns the fd of the new fence to userspace
>> > + * @flags: merge_data flags
>> >   */
>> >  struct sync_merge_data {
>> > __s32   fd2;
>> > char    name[32];
>> > __s32   fence;
>> > +   __u32   flags;
>> The overall size of the struct is not multiple of 64bit, so things
>> will end up badly if we decide to extend it in the future. Even if
>> there's a small chance that update will be needed, we might as well
>> pad it now (and check the padding for zero, returning -EINVAL).
>
> I think name could be the first field here.
>
Up to you, really. I'm afraid that it doesn't resolve the issue :-(
As a test add a u64 value at the end of the struct and check the
output of pahole for 32 and 64 bit build.
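
To make the layout concern concrete, here is a minimal user-space sketch (not part of the patch). The struct names merge_data_v4, merge_data_padded and the *_ext variants, the pad member and check_reserved() are all hypothetical, and uint32_t and friends stand in for the kernel's __u32 types. The unpadded struct is 44 bytes, so a 64-bit member appended later typically lands at offset 44 on i386 but at 48 on x86_64. With explicit padding to 48 bytes the offset should be the same on both.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the v4 layout of struct sync_merge_data: 4 + 32 + 4 + 4 = 44 bytes. */
struct merge_data_v4 {
    int32_t  fd2;
    char     name[32];
    int32_t  fence;
    uint32_t flags;
};

/* Hypothetical padded variant: an explicit pad brings the size to 48 bytes. */
struct merge_data_padded {
    int32_t  fd2;
    char     name[32];
    int32_t  fence;
    uint32_t flags;
    uint32_t pad;   /* reserved, must be zero */
};

/* Hypothetical future extension: a 64-bit member appended after the padded layout. */
struct merge_data_padded_ext {
    struct merge_data_padded base;
    uint64_t extra; /* offset 48 whether uint64_t is 4- or 8-byte aligned */
};

/* The padded size is a multiple of 8, so appending a u64 adds no hidden padding. */
static_assert(sizeof(struct merge_data_padded) % 8 == 0,
              "padded layout must be a multiple of 64 bits");

/* Reject any non-zero reserved field with -EINVAL, as suggested in the review. */
int check_reserved(const struct merge_data_padded *d)
{
    if (d->flags || d->pad)
        return -EINVAL;
    return 0;
}
```

Running pahole on 32- and 64-bit builds, as suggested above, remains the authoritative check.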

>>
>> >  };
>> >
>> >  /**
>> > @@ -31,12 +33,14 @@ struct sync_merge_data {
>> >   * @obj_name:  name of parent sync_timeline
>> >   * @driver_name:   name of driver implementing the parent
>> >   * @status:    status of the fence 0:active 1:signaled <0:error
>> > + * @flags: fence_info flags
>> >   * @timestamp_ns:  timestamp of status change in nanoseconds
>> >   */
>> >  struct sync_fence_info {
>> > char    obj_name[32];
>> > char    driver_name[32];
>> > __s32   status;
>> > +   __u32   flags;
>> > __u64   timestamp_ns;
>> Should we be doing some form of validation in sync_fill_fence_info()
>> of 'flags' ?
>
> Do you think it is necessary? The kernel allocates a zero'ed buffer to
> fill sync_fence_info array.
>
Good point. Missed out the z in kzalloc :-)

-Emil


Re: [PATCH v2] signals, pkeys: make si_pkey 32 bits

2016-02-28 Thread Ingo Molnar

* Stephen Rothwell  wrote:

> In order to prevent a change of alignment of the _sifields union in the
> siginfo structure on (some) 32 bit platforms and an ABI breakage, we
> change the type of _pkey to unsigned int.  If more bits are needed in
> the future, a second unsigned int could be added.
> 
> Fixes: cd0ea35ff551 ("signals, pkeys: Notify userspace about protection key 
> faults")
> Acked-by: Dave Hansen 
> Signed-off-by: Stephen Rothwell 
> ---
>  arch/ia64/include/uapi/asm/siginfo.h | 2 +-
>  arch/mips/include/uapi/asm/siginfo.h | 2 +-
>  include/uapi/asm-generic/siginfo.h   | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/ia64/include/uapi/asm/siginfo.h 
> b/arch/ia64/include/uapi/asm/siginfo.h
> index 0151cfab929d..19e7db0c9453 100644
> --- a/arch/ia64/include/uapi/asm/siginfo.h
> +++ b/arch/ia64/include/uapi/asm/siginfo.h
> @@ -70,7 +70,7 @@ typedef struct siginfo {
>   void __user *_upper;
>   } _addr_bnd;
>   /* used when si_code=SEGV_PKUERR */
> - u64 _pkey;
> + unsigned int _pkey;
>   };
>   } _sigfault;
>  
> diff --git a/arch/mips/include/uapi/asm/siginfo.h 
> b/arch/mips/include/uapi/asm/siginfo.h
> index 6f4edf0d794c..3cc14f4a5936 100644
> --- a/arch/mips/include/uapi/asm/siginfo.h
> +++ b/arch/mips/include/uapi/asm/siginfo.h
> @@ -93,7 +93,7 @@ typedef struct siginfo {
>   void __user *_upper;
>   } _addr_bnd;
>   /* used when si_code=SEGV_PKUERR */
> - u64 _pkey;
> + unsigned int _pkey;
>   };
>   } _sigfault;
>  
> diff --git a/include/uapi/asm-generic/siginfo.h 
> b/include/uapi/asm-generic/siginfo.h
> index 90384d55225b..f4459dc3d31b 100644
> --- a/include/uapi/asm-generic/siginfo.h
> +++ b/include/uapi/asm-generic/siginfo.h
> @@ -98,7 +98,7 @@ typedef struct siginfo {
>   void __user *_upper;
>   } _addr_bnd;
>   /* used when si_code=SEGV_PKUERR */
> - u64 _pkey;
> + unsigned int _pkey;
>   };
>   } _sigfault;
>  

Please use the standard ABI integer type pattern: __u32.

The advantage of only using __[su][8|16|32|64] integer types is that it's 
"obvious" at a glance that an ABI is bitness-invariant.

For example include/uapi/linux/perf_event.h only uses such ABI-safe types, and 
arch/x86/include/uapi is using these types 95%+ of the time.

( The various struct siginfo definitions should probably be harmonized as well, 
  but in a separate patch. )
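
A small user-space sketch of the alignment issue behind this patch, for illustration only. The type names below are hypothetical stand-ins, and uint32_t / uint64_t stand in for the kernel's __u32 / __u64. On ABIs where a 64-bit integer is 8-byte aligned but pointers are 4-byte aligned, the 64-bit key raises the alignment of the whole union; the 32-bit key should not, assuming uint32_t needs no stricter alignment than a pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the _addr_bnd member: two pointers. */
struct addr_bnd_sketch {
    void *lower;
    void *upper;
};

/* With a 64-bit key, some 32-bit ABIs raise the union's alignment from 4 to 8. */
union sigfault_u64_sketch {
    struct addr_bnd_sketch bnd;
    uint64_t pkey;
};

/* With a 32-bit key, the pointers keep dictating the union's alignment. */
union sigfault_u32_sketch {
    struct addr_bnd_sketch bnd;
    uint32_t pkey;
};

/* A fixed-width member has the same size on every ABI. */
static_assert(sizeof(uint32_t) == 4, "uint32_t is always 4 bytes");

/* Returns 1 if the 32-bit key leaves the union aligned like a pointer. */
int pkey32_keeps_pointer_alignment(void)
{
    return _Alignof(union sigfault_u32_sketch) == _Alignof(void *);
}

/* Returns 1 if the 64-bit key makes the union stricter than a pointer. */
int pkey64_raises_alignment(void)
{
    return _Alignof(union sigfault_u64_sketch) > _Alignof(void *);
}
```

On x86-64 the second helper returns 0 because pointers are already 8-byte aligned; it is expected to return 1 only on the affected 32-bit platforms.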

Thanks,

Ingo


Re: linux-next: manual merge of the iommu tree with the samsung-krzk tree

2016-02-28 Thread Joerg Roedel
Hi Stephen,

On Mon, Feb 29, 2016 at 03:20:55PM +1100, Stephen Rothwell wrote:
> Hi Joerg,
> 
> Today's linux-next merge of the iommu tree got a conflict in:
> 
>   drivers/memory/Kconfig
> 
> between commit:
> 
>   78fbb9361ca3 ("memory: Add support for Exynos SROM driver")
> 
> from the samsung-krzk tree and commit:
> 
>   cc8bbe1a8312 ("memory: mediatek: Add SMI driver")
> 
> from the iommu tree.
> 
> I fixed it up (see below) and can carry the fix as necessary (no action
> is required).

Thanks for fixing this (and the other conflict before) up.



Joerg



Re: [PATCH] mm: __delete_from_page_cache WARN_ON(page_mapped)

2016-02-28 Thread Joonsoo Kim
2016-02-29 13:49 GMT+09:00 Hugh Dickins :
> Commit e1534ae95004 ("mm: differentiate page_mapped() from page_mapcount()
> for compound pages") changed the famous BUG_ON(page_mapped(page)) in
> __delete_from_page_cache() to VM_BUG_ON_PAGE(page_mapped(page)): which
> gives us more info when CONFIG_DEBUG_VM=y, but nothing at all when not.
>
> Although it has not usually been very helpful, being hit long after the
> error in question, we do need to know if it actually happens on users'
> systems; but reinstating a crash there is likely to be opposed :)
>
> In the non-debug case, use WARN_ON() plus dump_page() and add_taint() -
> I don't really believe LOCKDEP_NOW_UNRELIABLE, but that seems to be the
> standard procedure now.  Move that, or the VM_BUG_ON_PAGE(), up before
> the deletion from tree: so that the unNULLified page->mapping gives a
> little more information.
>
> If the inode is being evicted (rather than truncated), it won't have
> any vmas left, so it's safe(ish) to assume that the raised mapcount is
> erroneous, and we can discount it from page_count to avoid leaking the
> page (I'm less worried by leaking the occasional 4kB, than losing a
> potential 2MB page with each 4kB page leaked).
>
> Signed-off-by: Hugh Dickins 
> ---
> I think this should go into v4.5, so I've written it with an atomic_sub
> on page->_count; but Joonsoo will probably want some page_ref thingy.

Okay. I will do it after this patch is merged.

Thanks for the notification.

Thanks.


Re: log spammed with "loading xx failed with error -2" since commit e40ba6d56b [replace call to fw_read_file_contents() with kernel version]

2016-02-28 Thread James Morris
On Sun, 28 Feb 2016, Luis R. Rodriguez wrote:

> >From e63d19975787c0e237a47c17efd01e41b2a8e2fa Mon Sep 17 00:00:00 2001
> From: "Luis R. Rodriguez" 
> Date: Sat, 27 Feb 2016 14:58:08 -0800
> Subject: [PATCH] firmware: change kernel read fail to dev_dbg()
> 

Applied to
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next



-- 
James Morris




Re: [PATCH] [RFC] mm/page_ref, crypto/async_pq: don't put_page from __exit

2016-02-28 Thread Joonsoo Kim
2016-02-29 6:57 GMT+09:00 Arnd Bergmann :
> The addition of tracepoints to the page reference tracking had an
> unfortunate side-effect in at least one driver that calls put_page
> from its exit function, resulting in a link error:
>
> `.exit.text' referenced in section `__jump_table' of crypto/built-in.o: 
> defined in discarded section `.exit.text' of crypto/built-in.o
>
> I could not come up with a nice solution that ignores __jump_table
> entries in discarded code, so we probably now have to treat this
> as something a driver is not allowed to do. Removing the __exit
> annotation avoids the problem in this particular driver, but the
> same problem could come back any time in other code.
>
> On a related problem regarding the runtime patching for SMP
> operations on ARM uniprocessor systems, we resorted to not
> dropping the .exit section at link time, but that doesn't seem
> appropriate here.
>
> Signed-off-by: Arnd Bergmann 
> Fixes: 0f80830dd044 ("mm/page_ref: add tracepoint to track down page 
> reference manipulation")
> ---
>  crypto/async_tx/async_pq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
> index c0748bbd4c08..be167145aa55 100644
> --- a/crypto/async_tx/async_pq.c
> +++ b/crypto/async_tx/async_pq.c
> @@ -442,7 +442,7 @@ static int __init async_pq_init(void)
> return -ENOMEM;
>  }
>
> -static void __exit async_pq_exit(void)
> +static void async_pq_exit(void)
>  {
> put_page(pq_scribble_page);
>  }

Hello, Arnd.

I think that we can avoid this error by using __free_page().
It would not be inlined so calling it would have no problem.

Could you test it, please?

Thanks.


Re: BUG: unable to handle kernel paging request from pty_write [was: Linux 4.4.2]

2016-02-28 Thread Jiri Slaby
On 02/26/2016, 08:59 PM, Robert Święcki wrote:
> It happens only with 0x6000832 ucode, and Piledriver-based CPUs: i.e.
> newer AMD FX, and Opteron 300 series (4300, 6300 etc.).

Ok, I can confirm this is:
AMD Opteron(tm) Processor 6348

And:
microcode: CPU0: patch_level=0x06000836

Thanks to all the interested parties!

-- 
js
suse labs


Re: [PATCH v5] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-02-28 Thread Huang Rui
On Fri, Feb 26, 2016 at 11:29:52AM +0100, Borislav Petkov wrote:
> On Fri, Feb 26, 2016 at 11:18:28AM +0100, Thomas Gleixner wrote:
> > On Fri, 26 Feb 2016, Huang Rui wrote:
> > > +/* Event code: LSB 8 bits, passed in attr->config any other bit is 
> > > reserved. */
> > > +#define AMD_POWER_EVENT_MASK 0xFFULL
> > > +
> > > +#define MAX_CUS  8
> > 
> > > What's that define for? Max compute units? So is that stuff eternally limited
> > > to 8?
> 
> I already sent him a cleaned up version with that dumbness removed:
> 
> https://lkml.kernel.org/r/20160128145436.ge14...@pd.tnic
> 
> Rui, what's up?
> 

Sorry, I will remove the superfluous MAX_CUS check in the next version.

Thanks,
Rui


[PATCH] PCI: PTM preliminary implementation

2016-02-28 Thread Yong, Jonathan
Simplified Precision Time Measurement driver, activates PTM feature
if a PCIe PTM requester (as per PCI Express 3.1 Base Specification
section 7.32) is found, but not before checking if the rest of the
PCI hierarchy can support it.

The driver does not take part in facilitating PTM conversations,
neither does it provide any useful services, it is only responsible
for setting up the required configuration space bits.

As of writing, there aren't any PTM capable devices on the market
yet, but it is supported by the Intel Apollo Lake platform.

Signed-off-by: Yong, Jonathan 
---
 drivers/pci/pci-sysfs.c |   7 +
 drivers/pci/pci.h   |  21 +++
 drivers/pci/pcie/Kconfig|   8 +
 drivers/pci/pcie/Makefile   |   2 +-
 drivers/pci/pcie/pcie_ptm.c | 353 
 drivers/pci/probe.c |   3 +
 6 files changed, 393 insertions(+), 1 deletion(-)
 create mode 100644 drivers/pci/pcie/pcie_ptm.c

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 95d9e7b..c634fd11 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1335,6 +1335,9 @@ static int pci_create_capabilities_sysfs(struct pci_dev 
*dev)
/* Active State Power Management */
pcie_aspm_create_sysfs_dev_files(dev);
 
+   /* PTM */
+   pci_create_ptm_sysfs(dev);
+
if (!pci_probe_reset_function(dev)) {
retval = device_create_file(&dev->dev, &reset_attr);
if (retval)
@@ -1433,6 +1436,10 @@ static void pci_remove_capabilities_sysfs(struct pci_dev 
*dev)
}
 
pcie_aspm_remove_sysfs_dev_files(dev);
+
+   /* PTM */
+   pci_release_ptm_sysfs(dev);
+
if (dev->reset_fn) {
device_remove_file(&dev->dev, &reset_attr);
dev->reset_fn = 0;
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 9a1660f..fb90420 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -320,6 +320,27 @@ static inline resource_size_t 
pci_resource_alignment(struct pci_dev *dev,
 
 void pci_enable_acs(struct pci_dev *dev);
 
+#ifdef CONFIG_PCIEPORTBUS
+int pci_enable_ptm(struct pci_dev *dev);
+void pci_create_ptm_sysfs(struct pci_dev *dev);
+void pci_release_ptm_sysfs(struct pci_dev *dev);
+void pci_disable_ptm(struct pci_dev *dev);
+#else
+static inline int pci_enable_ptm(struct pci_dev *dev)
+{
+   return -ENXIO;
+}
+static inline void pci_create_ptm_sysfs(struct pci_dev *dev)
+{
+}
+static inline void pci_release_ptm_sysfs(struct pci_dev *dev)
+{
+}
+static inline void pci_disable_ptm(struct pci_dev *dev)
+{
+}
+#endif
+
 struct pci_dev_reset_methods {
u16 vendor;
u16 device;
diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
index e294713..f65ff4d 100644
--- a/drivers/pci/pcie/Kconfig
+++ b/drivers/pci/pcie/Kconfig
@@ -80,3 +80,11 @@ endchoice
 config PCIE_PME
def_bool y
depends on PCIEPORTBUS && PM
+
+config PCIE_PTM
+   bool "Turn on Precision Time Measurement by default"
+   depends on PCIEPORTBUS
+   help
+ Say Y here to enable PTM feature on PCI Express devices that
+ support them as they are found during device enumeration. Otherwise
+ the feature can be enabled manually through sysfs entries.
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 00c62df..d18b4c7 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -5,7 +5,7 @@
 # Build PCI Express ASPM if needed
 obj-$(CONFIG_PCIEASPM) += aspm.o
 
-pcieportdrv-y  := portdrv_core.o portdrv_pci.o portdrv_bus.o
+pcieportdrv-y  := portdrv_core.o portdrv_pci.o portdrv_bus.o 
pcie_ptm.o
 pcieportdrv-$(CONFIG_ACPI) += portdrv_acpi.o
 
 obj-$(CONFIG_PCIEPORTBUS)  += pcieportdrv.o
diff --git a/drivers/pci/pcie/pcie_ptm.c b/drivers/pci/pcie/pcie_ptm.c
new file mode 100644
index 000..a128c79
--- /dev/null
+++ b/drivers/pci/pcie/pcie_ptm.c
@@ -0,0 +1,353 @@
+/*
+ * PCI Express Precision Time Measurement
+ * Copyright (c) 2016, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+#include 
+#include 
+#include 
+#include "../pci.h"
+
+#define PCI_PTM_REQ0x0001  /* Requester capable */
+#define  PCI_PTM_RSP   0x0002  /* Responder capable */
+#define  PCI_PTM_ROOT  0x0004  /* Root capable */
+#define  PCI_PTM_GRANULITY 0xFF00  /* Local clock granulity */
+#define PCI_PTM_ENABLE 0x0001  /* PTM enable */
+#define  PCI_PTM_ROOT_SEL  0x0002  /* Root select */
+
+#define PCI_PTM_HEADER_REG_OFFSET  

[RFC] PCI: PTM Driver

2016-02-28 Thread Yong, Jonathan
Hello LKML,

This is a preliminary implementation of the PTM[1] support driver, the code
is obviously hacked together and in need of refactoring. This driver has
only been tested against a virtual PCI bus.

The driver's job is to get to every PTM capable device, set some PCI config
space bits, then go back to sleep [2].

PTM capable PCIe devices will get a new sysfs entry to allow PTM to be
enabled if automatic PTM activation is disabled, or disabled if so desired.

Comments? Should I explain the PTM registers in more detail?
Please CC me, thanks.

[1] Precision Time Measurement: A protocol for synchronizing PCIe endpoint
clocks against the host clock as specified in the PCI Express Base
Specification 3.1. It is identified by the 0x001f extended capability ID.

PTM capable devices are split into 3 roles, master, responder and requester.
Summary as follows:

A master holds the master clock that will be used for all devices under its
domain (not to be confused with PCI domains). There may be multiple masters
in a PTM hierarchy, in which case, the highest master closest to the root
complex will be selected for the PTM domain. A master is also always
responder capable. Clock precision is signified by a Local Clock
Granularity field, in nano-seconds.

A responder responds to any PTM synchronization requests from a downstream
device. A responder is typically a switch device. It may also hold a local
clock signified by a non-zero Local Clock Granularity field. A value of 0
signifies that the device simply propagates timing information from
upstream devices.

A requester is typically an endpoint that will request synchronization
updates from an upstream PTM capable time source. The driver will update
the Effective Clock Granularity field based on the same field from the
PTM domain master. The field should be programmed with a value of 0 if any
intervening responder has a Local Clock Granularity field value of 0.
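
The rule in the paragraph above can be sketched as a small helper. This is only an illustration of the summary given in this mail, not code from the patch and not a restatement of the specification; ptm_effective_granularity() and its parameters are hypothetical names, and the 8-bit type is an assumption based on the 0xFF00 granularity mask in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: value to program into a requester's Effective
 * Granularity field. Returns 0 if any intervening responder reports a
 * Local Clock Granularity of 0, otherwise the PTM domain master's value.
 */
uint8_t ptm_effective_granularity(uint8_t master_gran,
                                  const uint8_t *responder_gran,
                                  size_t n_responders)
{
    size_t i;

    /* A NULL array is only acceptable when there are no responders. */
    assert(responder_gran != NULL || n_responders == 0);

    for (i = 0; i < n_responders; i++) {
        if (responder_gran[i] == 0)
            return 0;
    }
    return master_gran;
}
```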

[2] The software drivers never see the PTM packets, the PCI Express Base
Specification 3.1 reads:
PTM capable components can make their PTM context available for
inspection by software, enabling software to translate timing
information between local times and PTM Master Time.

This isn't very informative.

Yong, Jonathan (1):
  PCI: PTM preliminary implementation

 drivers/pci/pci-sysfs.c |   7 +
 drivers/pci/pci.h   |  21 +++
 drivers/pci/pcie/Kconfig|   8 +
 drivers/pci/pcie/Makefile   |   2 +-
 drivers/pci/pcie/pcie_ptm.c | 353 
 drivers/pci/probe.c |   3 +
 6 files changed, 393 insertions(+), 1 deletion(-)
 create mode 100644 drivers/pci/pcie/pcie_ptm.c

-- 
2.4.10



Re: [GIT PULL] tpmdd fix

2016-02-28 Thread James Morris
On Fri, 26 Feb 2016, Jarkko Sakkinen wrote:

> Hi James,
> 
> this is the fix for the build warning.
> 
> /Jarkko
> 
> The following changes since commit 481873d06f2bf2ad732450a3a5fa5b8c2a07ef88:
> 
>   Merge branch 'next' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 
> (2016-02-26 15:06:41 +1100)
> 
> are available in the git repository at:
> 
>   https://github.com/jsakkine/linux-tpmdd.git tags/tpmdd-next-20160226
> 
> for you to fetch changes up to 2cb6d6460f1a171c71c134e0efe3a94c2206d080:
> 
>   tpm_tis: fix build warning with tpm_tis_resume (2016-02-26 11:32:07 +0200)
> 
> 
> tpmdd fix
> 
> 
> Jarkko Sakkinen (1):
>   tpm_tis: fix build warning with tpm_tis_resume
> 

Pulled to -next.

-- 
James Morris




Re: [PATCH 8/9] powerpc: simplify csum_add(a, b) in case a or b is constant 0

2016-02-28 Thread Christophe Leroy



On 23/10/2015 05:33, Scott Wood wrote:

On Tue, 2015-09-22 at 16:34 +0200, Christophe Leroy wrote:

Simplify csum_add(a, b) in case a or b is constant 0

Signed-off-by: Christophe Leroy 
---
  arch/powerpc/include/asm/checksum.h | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/include/asm/checksum.h
b/arch/powerpc/include/asm/checksum.h
index 56deea8..f8a9704 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -119,7 +119,13 @@ static inline __wsum csum_add(__wsum csum, __wsum
addend)
  {
  #ifdef __powerpc64__
   u64 res = (__force u64)csum;
+#endif
+ if (__builtin_constant_p(csum) && csum == 0)
+ return addend;
+ if (__builtin_constant_p(addend) && addend == 0)
+ return csum;

+#ifdef __powerpc64__
   res += (__force u64)addend;
   return (__force __wsum)((u32)res + (res >> 32));
  #else

How often does this happen?


In the following patch (9/9), csum_add() is used to implement 
csum_partial() for small blocks.
In several places in the networking code, csum_partial() is called with 
0 as initial sum.
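
For readers unfamiliar with the operation, a user-space sketch of why the shortcut is valid: csum_add() is a 32-bit ones' complement addition with end-around carry, and 0 is its identity element. The names csum_add_sketch() and csum_add_sketch_check() are hypothetical; the arithmetic mirrors the __powerpc64__ branch quoted above, with uint32_t standing in for __wsum.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit ones' complement add: fold the carry out of bit 31 back into the sum. */
uint32_t csum_add_sketch(uint32_t csum, uint32_t addend)
{
    uint64_t res = (uint64_t)csum + addend;

    return (uint32_t)res + (uint32_t)(res >> 32);
}

/* Adding 0 on either side returns the other operand unchanged. */
void csum_add_sketch_check(void)
{
    assert(csum_add_sketch(0, 0x12345678u) == 0x12345678u);
    assert(csum_add_sketch(0xdeadbeefu, 0) == 0xdeadbeefu);
}
```

The patch lets the compiler apply this identity at build time via __builtin_constant_p(), so the addition disappears when one operand is a literal 0.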


Christophe


Re: [PATCH 4/9] powerpc: inline ip_fast_csum()

2016-02-28 Thread Christophe Leroy



On 23/09/2015 07:43, Denis Kirjanov wrote:

On 9/22/15, Christophe Leroy  wrote:

In several architectures, ip_fast_csum() is inlined
There are functions like ip_send_check() which do nothing
much more than calling ip_fast_csum().
Inlining ip_fast_csum() allows the compiler to optimise better

Hi Christophe,
I did try it and see no difference on ppc64. Did you test with socklib
with modified loopback and if so do you have any numbers?


Hi Denis,

I put a mftbl at start and end of ip_send_check() and tested on a MPC885:
* Without ip_fast_csum() inlined, approximately 7 TB ticks are spent in 
ip_send_check()
* With ip_fast_csum() inlined, approximately 5.4 TB ticks are spent in 
ip_send_check()


So it is about 23% time reduction.

Christophe


Re: [RFC PATCH] proc: do not include shmem and driver pages in /proc/meminfo::Cached

2016-02-28 Thread Konstantin Khlebnikov
On Mon, Feb 29, 2016 at 3:03 AM, Hugh Dickins  wrote:
> On Fri, 19 Feb 2016, Andrew Morton wrote:
>> On Fri, 19 Feb 2016 09:40:45 +0300 Konstantin Khlebnikov  
>> wrote:
>>
>> > >> What are your thoughts on this?
>> > >
>> > > My thoughts are NAK.  A misleading stat is not so bad as a
>> > > misleading stat whose meaning we change in some random kernel.
>> > >
>> > > By all means improve Documentation/filesystems/proc.txt on Cached.
>> > > By all means promote Active(file)+Inactive(file)-Buffers as often a
>> > > better measure (though Buffers itself is obscure to me - is it intended
>> > > usually to approximate resident FS metadata?).  By all means work on
>> > > /proc/meminfo-v2 (though that may entail dispiritingly long discussions).
>> > >
>> > > We have to assume that Cached has been useful to some people, and that
>> > > they've learnt to subtract Shmem from it, if slow or no swap concerns 
>> > > them.
>> > >
>> > > Added Konstantin to Cc: he's had valuable experience of people learning
>> > > to adapt to the numbers that we put out.
>> > >
>> >
>> > I think everything will be ok. Subtraction of shmem isn't widespread practice,
>> > more like secret knowledge. This wasn't documented and people who use
>> > this should be aware that this might stop working at any time. So, ACK.
>>
>> It worries me as well - we're deliberately altering the behaviour of
>> existing userspace code.  Not all of those alterations will be welcome!
>>
>> We could add a shiny new field into meminfo and train people to migrate
>> to that.  But that would just be a sum of already-available fields.  In
>> an ideal world we could solve all of this with documentation and
>> cluebatting (and some apologizing!).
>
> Ah, I missed this, and just sent a redundant addition to the thread;
> followed by this doubly redundant addition.

"Cached" has been used for ages as the amount of "potentially free memory".
This patch restores its original meaning and, at the same time, brings it
closer to that "potential" meaning.

MemAvailable means exactly that and nothing else, so the logic behind it could
be tuned and changed in the future. Thus, adding new fields makes no sense.


BTW
Glibc recently switched sysconf(_SC_PHYS_PAGES) / sysconf(_SC_AVPHYS_PAGES)
from /proc/meminfo MemTotal / MemFree to sysinfo(2) totalram / freeram for
performance reasons. It seems possible to expose MemAvailable via sysinfo:
there is space for one field. Probably it's also possible to switch
_SC_AVPHYS_PAGES to really available memory and add memcg awareness too.
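
For reference, a minimal sketch of the two user-space interfaces mentioned above, assuming Linux with glibc. The helper names are hypothetical; sysinfo(2), its totalram / freeram / mem_unit fields, and the _SC_PHYS_PAGES / _SC_AVPHYS_PAGES sysconf names are the existing interfaces. Neither path reports MemAvailable today.

```c
#include <assert.h>
#include <sys/sysinfo.h>
#include <unistd.h>

/* Total RAM in bytes from sysinfo(2); 0 on failure. */
unsigned long long total_ram_sysinfo(void)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return 0;
    return (unsigned long long)si.totalram * si.mem_unit;
}

/* Free RAM in bytes from sysinfo(2); 0 on failure. */
unsigned long long free_ram_sysinfo(void)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return 0;
    return (unsigned long long)si.freeram * si.mem_unit;
}

/* Total RAM in bytes from sysconf(); 0 on failure. */
unsigned long long total_ram_sysconf(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);

    if (pages < 0 || page_size < 0)
        return 0;
    return (unsigned long long)pages * (unsigned long long)page_size;
}

/* Sanity check: free memory can never exceed total memory. */
void check_ram_invariant(void)
{
    assert(free_ram_sysinfo() <= total_ram_sysinfo());
}
```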


Re: [PATCH 1/2] sigaltstack: implement SS_AUTODISARM flag

2016-02-28 Thread Stas Sergeev

29.02.2016 00:13, Stas Sergeev wrote:

This patch implements the SS_AUTODISARM flag that can be ORed with
SS_ONSTACK when forming ss_flags.
When this flag is set, sigaltstack will be disabled when entering
the signal handler; more precisely, after saving sas to uc_stack.
When leaving the signal handler, the sigaltstack is restored by
uc_stack.
When this flag is used, it is safe to switch from sighandler with
swapcontext(). Without this flag, the subsequent signal will corrupt
the state of the switched-away sighandler.

CC: Ingo Molnar 
CC: Peter Zijlstra 
CC: Richard Weinberger 
CC: Andrew Morton 
CC: Oleg Nesterov 
CC: Tejun Heo 
CC: Heinrich Schuchardt 
CC: Jason Low 
CC: Andrea Arcangeli 
CC: Frederic Weisbecker 
CC: Konstantin Khlebnikov 
CC: Josh Triplett 
CC: "Eric W. Biederman" 
CC: Aleksa Sarai 
CC: "Amanieu d'Antras" 
CC: Paul Moore 
CC: Sasha Levin 
CC: Palmer Dabbelt 
CC: Vladimir Davydov 
CC: linux-kernel@vger.kernel.org
CC: linux-...@vger.kernel.org
CC: Andy Lutomirski 

Signed-off-by: Stas Sergeev 
---
  include/linux/sched.h   |  1 +
  include/linux/signal.h  |  4 +++-
  include/uapi/linux/signal.h |  3 +++
  kernel/fork.c   |  4 +++-
  kernel/signal.c | 23 ---
  5 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index a10494a..f561d34 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1587,6 +1587,7 @@ struct task_struct {
  
  	unsigned long sas_ss_sp;

size_t sas_ss_size;
+   unsigned sas_ss_flags;
  
  	struct callback_head *task_works;
  
diff --git a/include/linux/signal.h b/include/linux/signal.h

index 92557bb..be3ebe0 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -432,8 +432,10 @@ int __save_altstack(stack_t __user *, unsigned long);
stack_t __user *__uss = uss; \
struct task_struct *t = current; \
put_user_ex((void __user *)t->sas_ss_sp, &__uss->ss_sp); \
-   put_user_ex(sas_ss_flags(sp), &__uss->ss_flags); \
+   put_user_ex(t->sas_ss_flags, &__uss->ss_flags); \
put_user_ex(t->sas_ss_size, &__uss->ss_size); \
+   if (t->sas_ss_flags & SS_AUTODISARM) \
+   t->sas_ss_size = 0; \

Should also reset flags here...
Will send v4.


Re: [PATCH v10 2/2] cpufreq: powernv: Add sysfs attributes to show throttle stats

2016-02-28 Thread Viresh Kumar
On 26-02-16, 16:06, Shilpasri G Bhat wrote:
> +static int powernv_cpufreq_policy_notifier(struct notifier_block *nb,
> +unsigned long action, void *data)
> +{
> + struct cpufreq_policy *policy = data;
> + int ret;
> +
> + if (action == CPUFREQ_CREATE_POLICY) {
> + ret = sysfs_create_group(&policy->kobj, &throttle_attr_grp);
> + if (ret)
> + pr_info("Failed to create throttle stats directory for 
> cpu %d\n",
> + policy->cpu);
> + } else if (action == CPUFREQ_REMOVE_POLICY) {
> + sysfs_remove_group(&policy->kobj, &throttle_attr_grp);
> + }
> +
> + return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block powernv_cpufreq_policy_nb = {
> + .notifier_call  = powernv_cpufreq_policy_notifier,
> + .next   = NULL,
> +};
> +
>  static void powernv_cpufreq_stop_cpu(struct cpufreq_policy *policy)
>  {
>   struct powernv_smp_call_data freq_data;
> @@ -603,6 +708,8 @@ static inline void clean_chip_info(void)
>  
>  static inline void unregister_all_notifiers(void)
>  {
> + cpufreq_unregister_notifier(&powernv_cpufreq_policy_nb,
> + CPUFREQ_POLICY_NOTIFIER);
>   opal_message_notifier_unregister(OPAL_MSG_OCC,
>&powernv_cpufreq_opal_nb);
>   unregister_reboot_notifier(&powernv_cpufreq_reboot_nb);
> @@ -628,6 +735,8 @@ static int __init powernv_cpufreq_init(void)
>  
>   register_reboot_notifier(&powernv_cpufreq_reboot_nb);
>   opal_message_notifier_register(OPAL_MSG_OCC, &powernv_cpufreq_opal_nb);
> + cpufreq_register_notifier(&powernv_cpufreq_policy_nb,
> +   CPUFREQ_POLICY_NOTIFIER);
>  
>   rc = cpufreq_register_driver(&powernv_cpufreq_driver);
>   if (!rc)

@Rafael: This driver needs to do this *ugly* notifier hack, just because we
aren't doing kobject_add() for policy->kobj before ->init(). And we did that
because, we wanted to create the policyX structure with the first CPU in
policy->related_cpus mask and related_cpus mask isn't available until we call
->init()..

Should we do something in core to make this easier for this driver?

-- 
viresh


linux-next: manual merge of the target-merge tree with the net-next tree

2016-02-28 Thread Stephen Rothwell
Hi Nicholas,

Today's linux-next merge of the target-merge tree got a conflict in:

  drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h

between commit:

  ba9cee6aa67d ("cxgb4/iw_cxgb4: TOS support")

from the net-next tree and commit:

  c973e2a3ff1b ("cxgb4: add definitions for iSCSI target ULD")

from the target-merge tree.

I fixed it up (the latter was a superset of the former) and can carry
the fix as necessary (no action is required).

-- 
Cheers,
Stephen Rothwell


Re: [PATCH] mm/zsmalloc: add compact column to pool stat

2016-02-28 Thread Sergey Senozhatsky
Hello,

On (02/29/16 15:02), Minchan Kim wrote:
> On Sat, Feb 27, 2016 at 03:23:53PM +0900, Sergey Senozhatsky wrote:
> > Add a new column to pool stats, which will tell us class' zs_can_compact()
> > number, so it will be easier to analyze zsmalloc fragmentation.
> 
> Just nitpick:
> 
> Strictly speaking, the zs_can_compact number is the number of "ideal freeable
> pages by compaction". How about using a high-level term in the description
> rather than the function name?

OK, makes sense.


> > At the moment, we have only numbers of FULL and ALMOST_EMPTY classes, but
> > they don't tell us how badly the class is fragmented internally.
> > 
> > The new /sys/kernel/debug/zsmalloc/zramX/classes output looks as follows:
> > 
> >  class  size almost_full almost_empty obj_allocated obj_used pages_used pages_per_zspage compact
> > [..]
> >     12   224           0            2           146        5          8                4       4
> >     13   240           0            0             0        0          0                1       0
> >     14   256           1           13          1840     1672        115                1      10
> >     15   272           0            0             0        0          0                1       0
> > [..]
> >     49   816           0            3           745      735        149                1       2
> >     51   848           3            4           361      306         76                4       8
> >     52   864          12           14           378      268         81                3      21
> >     54   896           1           12           117       57         26                2      12
> >     57   944           0            0             0        0          0                3       0
> > [..]
> >  Total                26          131         12709    10994       1071                      134
> > 
> > For example, from this particular output we can easily conclude that class-896
> > is heavily fragmented -- it occupies 26 pages, 12 can be freed by compaction.
> 
> How about using "freeable" or something which could represent "freeable"?
> IMO, it's more straightforward for the user.

OK. didn't want to put any long column name there, which would bloat the
output. will take a look.

> Other than that,
> 
> Acked-by: Minchan Kim 
> 
> 
> Thanks for the nice job!

thanks.

-ss


Re: [PATCH] mm/zsmalloc: add compact column to pool stat

2016-02-28 Thread Minchan Kim
On Sat, Feb 27, 2016 at 03:23:53PM +0900, Sergey Senozhatsky wrote:
> Add a new column to pool stats, which will tell us class' zs_can_compact()
> number, so it will be easier to analyze zsmalloc fragmentation.

Just nitpick:

Strictly speaking, the zs_can_compact number is the number of "ideal freeable
pages by compaction". How about using a high-level term in the description
rather than the function name?


> 
> At the moment, we have only numbers of FULL and ALMOST_EMPTY classes, but
> they don't tell us how badly the class is fragmented internally.
> 
> The new /sys/kernel/debug/zsmalloc/zramX/classes output looks as follows:
> 
>  class  size almost_full almost_empty obj_allocated obj_used pages_used pages_per_zspage compact
> [..]
>     12   224           0            2           146        5          8                4       4
>     13   240           0            0             0        0          0                1       0
>     14   256           1           13          1840     1672        115                1      10
>     15   272           0            0             0        0          0                1       0
> [..]
>     49   816           0            3           745      735        149                1       2
>     51   848           3            4           361      306         76                4       8
>     52   864          12           14           378      268         81                3      21
>     54   896           1           12           117       57         26                2      12
>     57   944           0            0             0        0          0                3       0
> [..]
>  Total                26          131         12709    10994       1071                      134
> 
> For example, from this particular output we can easily conclude that class-896
> is heavily fragmented -- it occupies 26 pages, 12 can be freed by compaction.

How about using "freeable" or something which could represent "freeable"?
IMO, it's more straightforward for the user.
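
For reference, the value in the compact column can be reproduced from the
other columns. The sketch below is not the zsmalloc code itself: it assumes
4096-byte pages, and the helper name freeable_pages() is made up for
illustration. It divides the unused objects by the number of objects that fit
in one zspage and multiplies the result by pages_per_zspage:

```c
#define PAGE_SIZE_ASSUMED 4096UL /* assumption: 4 KiB pages */

/*
 * Hypothetical helper: number of pages that compaction could ideally
 * free in one size class, computed from the columns of the stats file.
 */
static unsigned long freeable_pages(unsigned long size,
                                    unsigned long obj_allocated,
                                    unsigned long obj_used,
                                    unsigned long pages_per_zspage)
{
    /* How many objects of this size fit into one zspage. */
    unsigned long objs_per_zspage = pages_per_zspage * PAGE_SIZE_ASSUMED / size;
    unsigned long obj_wasted = obj_allocated - obj_used;

    /* Only whole zspages worth of unused objects can be released. */
    return obj_wasted / objs_per_zspage * pages_per_zspage;
}
```

With the class-896 row (117 allocated, 57 used, 2 pages per zspage),
freeable_pages(896, 117, 57, 2) gives 12, which matches the column.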

Other than that,

Acked-by: Minchan Kim 


Thanks for the nice job!


Re: [PATCH] asm-generic: remove old nonatomic-io wrapper files

2016-02-28 Thread Vinod Koul
On Fri, Feb 26, 2016 at 03:29:05PM +0100, Arnd Bergmann wrote:
> The two header files got moved to include/linux, and most
> users were already converted, this changes the remaining drivers
> and removes the files.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/dma/idma64.h| 2 +-
For this:

Acked-by: Vinod Koul 

Thanks
-- 
~Vinod


Re: [PATCH v3 22/22] sound/usb: Use Media Controller API to share media resources

2016-02-28 Thread Shuah Khan
On 02/27/2016 12:48 AM, Takashi Iwai wrote:
> On Sat, 27 Feb 2016 03:55:39 +0100,
> Shuah Khan wrote:
>>
>> On 02/26/2016 01:50 PM, Takashi Iwai wrote:
>>> On Fri, 26 Feb 2016 21:08:43 +0100,
>>> Shuah Khan wrote:

>>>> On 02/26/2016 12:55 PM, Takashi Iwai wrote:
>>>>> On Fri, 12 Feb 2016 00:41:38 +0100,
>>>>> Shuah Khan wrote:
>>>>>>
>>>>>> Change ALSA driver to use Media Controller API to
>>>>>> share media resources with DVB and V4L2 drivers
>>>>>> on a AU0828 media device. Media Controller specific
>>>>>> initialization is done after sound card is registered.
>>>>>> ALSA creates Media interface and entity function graph
>>>>>> nodes for Control, Mixer, PCM Playback, and PCM Capture
>>>>>> devices.
>>>>>>
>>>>>> snd_usb_hw_params() will call Media Controller enable
>>>>>> source handler interface to request the media resource.
>>>>>> If resource request is granted, it will release it from
>>>>>> snd_usb_hw_free(). If resource is busy, -EBUSY is returned.
>>>>>>
>>>>>> Media specific cleanup is done in usb_audio_disconnect().
>>>>>>
>>>>>> Signed-off-by: Shuah Khan 
>>>>>> ---
>>>>>>  sound/usb/Kconfig|   4 +
>>>>>>  sound/usb/Makefile   |   2 +
>>>>>>  sound/usb/card.c |  14 +++
>>>>>>  sound/usb/card.h |   3 +
>>>>>>  sound/usb/media.c| 318 +++
>>>>>>  sound/usb/media.h|  72 +++
>>>>>>  sound/usb/mixer.h|   3 +
>>>>>>  sound/usb/pcm.c  |  28 -
>>>>>>  sound/usb/quirks-table.h |   1 +
>>>>>>  sound/usb/stream.c   |   2 +
>>>>>>  sound/usb/usbaudio.h |   6 +
>>>>>>  11 files changed, 448 insertions(+), 5 deletions(-)
>>>>>>  create mode 100644 sound/usb/media.c
>>>>>>  create mode 100644 sound/usb/media.h
>>>>>>
>>>>>> diff --git a/sound/usb/Kconfig b/sound/usb/Kconfig
>>>>>> index a452ad7..ba117f5 100644
>>>>>> --- a/sound/usb/Kconfig
>>>>>> +++ b/sound/usb/Kconfig
>>>>>> @@ -15,6 +15,7 @@ config SND_USB_AUDIO
>>>>>>  select SND_RAWMIDI
>>>>>>  select SND_PCM
>>>>>>  select BITREVERSE
>>>>>> +select SND_USB_AUDIO_USE_MEDIA_CONTROLLER if MEDIA_CONTROLLER && MEDIA_SUPPORT
>>>>>
>>>>> Looking at the media Kconfig again, this would be broken if
>>>>> MEDIA_SUPPORT=m and SND_USB_AUDIO=y.  The ugly workaround is something
>>>>> like:
>>>>>   select SND_USB_AUDIO_USE_MEDIA_CONTROLLER \
>>>>>   if MEDIA_CONTROLLER && (MEDIA_SUPPORT=y || MEDIA_SUPPORT=SND)
>>>>
>>>> My current config is MEDIA_SUPPORT=m and SND_USB_AUDIO=y
>>>> It is working and I didn't see any issues so far.
>>>
>>> Hmm, how does it be?  In drivers/media/Makefile:
>>>
>>> ifeq ($(CONFIG_MEDIA_CONTROLLER),y)
>>>   obj-$(CONFIG_MEDIA_SUPPORT) += media.o
>>> endif
>>>
>>> So it's a module.  Meanwhile you have reference from usb-audio driver
>>> that is built-in kernel.  How is the symbol resolved?
>>
>> Sorry my mistake. I misspoke. My config had:
>> CONFIG_MEDIA_SUPPORT=m
>> CONFIG_MEDIA_CONTROLLER=y
>> CONFIG_SND_USB_AUDIO=m
>>
>> The following doesn't work as you pointed out.
>>
>> CONFIG_MEDIA_SUPPORT=m
>> CONFIG_MEDIA_CONTROLLER=y
>> CONFIG_SND_USB_AUDIO=y
>>
>> okay here is what will work for all of the possible
>> combinations of CONFIG_MEDIA_SUPPORT and CONFIG_SND_USB_AUDIO
>>
>> select SND_USB_AUDIO_USE_MEDIA_CONTROLLER \
>>    if MEDIA_CONTROLLER && ((MEDIA_SUPPORT=y) || (MEDIA_SUPPORT=m && SND_USB_AUDIO=m))
>>
>> The above will cover the cases when
>>
>> 1. CONFIG_MEDIA_SUPPORT and CONFIG_SND_USB_AUDIO are
>>both modules
>>CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER is selected
>>
>> 2. CONFIG_MEDIA_SUPPORT=y and CONFIG_SND_USB_AUDIO=m
>>CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER is selected
>>
>> 3. CONFIG_MEDIA_SUPPORT=y and CONFIG_SND_USB_AUDIO=y
>>CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER is selected
>>
>> 4. CONFIG_MEDIA_SUPPORT=m and CONFIG_SND_USB_AUDIO=y
>>This is when we don't want
>>CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER selected
>>
>> I verified all of the above combinations to make sure
>> the logic works.
>>
>> If you think of a better way to do this please let me
>> know. I will go ahead and send patch v4 with the above
>> change and you can decide if that is acceptable.
> 
> I'm not 100% sure whether CONFIG_SND_USB_AUDIO=m can be put there as
> conditional inside CONFIG_SND_USB_AUDIO definition.  Maybe a safer
> form would be like:
> 
> config SND_USB_AUDIO_USE_MEDIA_CONTROLLER
>   bool
>   default y
>   depends on SND_USB_AUDIO
>   depends on MEDIA_CONTROLLER
>   depends on (MEDIA_SUPPORT=y || MEDIA_SUPPORT=SND_USB_AUDIO)
> 
> and drop select from SND_USB_AUDIO.
> 
> 
> Other than that, it looks more or less OK to me.
> The way how media_stream_init() gets called is a bit worrisome, but it
> should work practically.  Another concern is about the disconnection.
> Can all function calls in media_device_delete() be safe even if it's
> called while the application still ope

[PATCH v4 22/22] sound/usb: Use Media Controller API to share media resources

2016-02-28 Thread Shuah Khan
Change ALSA driver to use Media Controller API to
share media resources with DVB and V4L2 drivers
on an AU0828 media device. Media Controller specific
initialization is done after sound card is registered.
ALSA creates Media interface and entity function graph
nodes for Control, Mixer, PCM Playback, and PCM Capture
devices.

snd_usb_hw_params() will call Media Controller enable
source handler interface to request the media resource.
If resource request is granted, it will release it from
snd_usb_hw_free(). If resource is busy, -EBUSY is returned.

Media specific cleanup is done in usb_audio_disconnect().

Signed-off-by: Shuah Khan 
---

Changes since v3:
- Fixed Kconfig to handle the following
1. CONFIG_MEDIA_SUPPORT and CONFIG_SND_USB_AUDIO are
   both modules
   CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER is selected

2. CONFIG_MEDIA_SUPPORT=y and CONFIG_SND_USB_AUDIO=m
   CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER is selected

3. CONFIG_MEDIA_SUPPORT=y and CONFIG_SND_USB_AUDIO=y
   CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER is selected

4. CONFIG_MEDIA_SUPPORT=m and CONFIG_SND_USB_AUDIO=y
   This is when we don't want
   CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER selected
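
The four cases above can be checked against the Kconfig condition with a small
model. This is only a sketch in plain C, not Kconfig itself: the enum and the
function name are invented for illustration, and the select is treated as
applying only when SND_USB_AUDIO is enabled at all.

```c
#include <stdbool.h>

/* Kconfig tristate values, modelled for illustration only. */
enum tristate { TRI_N, TRI_M, TRI_Y };

/*
 * Mirrors the condition
 *   MEDIA_CONTROLLER && (MEDIA_SUPPORT=y || MEDIA_SUPPORT=SND_USB_AUDIO)
 * evaluated for an enabled SND_USB_AUDIO.
 */
static bool use_media_controller(bool media_controller,
                                 enum tristate media_support,
                                 enum tristate snd_usb_audio)
{
    if (snd_usb_audio == TRI_N)
        return false; /* the select lives inside SND_USB_AUDIO */

    return media_controller &&
           (media_support == TRI_Y || media_support == snd_usb_audio);
}
```

Only case 4 (MEDIA_SUPPORT=m with SND_USB_AUDIO=y) evaluates to false, which
is the combination where built-in code would otherwise reference symbols that
live in a module.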

 sound/usb/Kconfig|   4 +
 sound/usb/Makefile   |   2 +
 sound/usb/card.c |  14 +++
 sound/usb/card.h |   3 +
 sound/usb/media.c| 318 +++
 sound/usb/media.h|  72 +++
 sound/usb/mixer.h|   3 +
 sound/usb/pcm.c  |  28 -
 sound/usb/quirks-table.h |   1 +
 sound/usb/stream.c   |   2 +
 sound/usb/usbaudio.h |   6 +
 11 files changed, 448 insertions(+), 5 deletions(-)
 create mode 100644 sound/usb/media.c
 create mode 100644 sound/usb/media.h

diff --git a/sound/usb/Kconfig b/sound/usb/Kconfig
index a452ad7..d14bf41 100644
--- a/sound/usb/Kconfig
+++ b/sound/usb/Kconfig
@@ -15,6 +15,7 @@ config SND_USB_AUDIO
select SND_RAWMIDI
select SND_PCM
select BITREVERSE
+   select SND_USB_AUDIO_USE_MEDIA_CONTROLLER if MEDIA_CONTROLLER && (MEDIA_SUPPORT=y || MEDIA_SUPPORT=SND_USB_AUDIO)
help
  Say Y here to include support for USB audio and USB MIDI
  devices.
@@ -22,6 +23,9 @@ config SND_USB_AUDIO
  To compile this driver as a module, choose M here: the module
  will be called snd-usb-audio.
 
+config SND_USB_AUDIO_USE_MEDIA_CONTROLLER
+   bool
+
 config SND_USB_UA101
tristate "Edirol UA-101/UA-1000 driver"
select SND_PCM
diff --git a/sound/usb/Makefile b/sound/usb/Makefile
index 2d2d122..8dca3c4 100644
--- a/sound/usb/Makefile
+++ b/sound/usb/Makefile
@@ -15,6 +15,8 @@ snd-usb-audio-objs := card.o \
quirks.o \
stream.o
 
+snd-usb-audio-$(CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER) += media.o
+
 snd-usbmidi-lib-objs := midi.o
 
 # Toplevel Module Dependency
diff --git a/sound/usb/card.c b/sound/usb/card.c
index 1f09d95..35fe256 100644
--- a/sound/usb/card.c
+++ b/sound/usb/card.c
@@ -66,6 +66,7 @@
 #include "format.h"
 #include "power.h"
 #include "stream.h"
+#include "media.h"
 
 MODULE_AUTHOR("Takashi Iwai ");
 MODULE_DESCRIPTION("USB Audio");
@@ -561,6 +562,11 @@ static int usb_audio_probe(struct usb_interface *intf,
if (err < 0)
goto __error;
 
+   if (quirk->media_device) {
+   /* don't want to fail when media_device_create() fails */
+   media_device_create(chip, intf);
+   }
+
usb_chip[chip->index] = chip;
chip->num_interfaces++;
usb_set_intfdata(intf, chip);
@@ -617,6 +623,14 @@ static void usb_audio_disconnect(struct usb_interface *intf)
list_for_each(p, &chip->midi_list) {
snd_usbmidi_disconnect(p);
}
+   /*
+* Nice to check quirk && quirk->media_device
+* need some special handlings. Doesn't look like
+* we have access to quirk here
+* Accesses mixer_list
+   */
+   media_device_delete(chip);
+
/* release mixer resources */
list_for_each_entry(mixer, &chip->mixer_list, list) {
snd_usb_mixer_disconnect(mixer);
diff --git a/sound/usb/card.h b/sound/usb/card.h
index 71778ca..34a0898 100644
--- a/sound/usb/card.h
+++ b/sound/usb/card.h
@@ -105,6 +105,8 @@ struct snd_usb_endpoint {
struct list_head list;
 };
 
+struct media_ctl;
+
 struct snd_usb_substream {
struct snd_usb_stream *stream;
struct usb_device *dev;
@@ -156,6 +158,7 @@ struct snd_usb_substream {
} dsd_dop;
 
bool trigger_tstamp_pending_update; /* trigger timestamp being updated from initial estimate */
+   struct media_ctl *media_ctl;
 };
 
 struct snd_usb_stream {
diff --git a/sound/usb/media.c b/sound/usb/media.c
new file mode 100644
index 000..cff1459
--- /dev/null
+++ b/sound/usb/media.c

[PATCH] phy: Fix armada375 compile test build on UM

2016-02-28 Thread Krzysztof Kozlowski
The phy-armada375-usb2 driver uses IOMEM functions so COMPILE_TEST && OF
build failed with:

drivers/built-in.o: In function `armada375_usb_phy_probe':
phy-armada375-usb2.c:(.text+0x121d): undefined reference to `devm_ioremap_resource'

Signed-off-by: Krzysztof Kozlowski 
---
 drivers/phy/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
index 0124d17bd9fe..786a9d6356b8 100644
--- a/drivers/phy/Kconfig
+++ b/drivers/phy/Kconfig
@@ -32,7 +32,7 @@ config PHY_BERLIN_SATA
 config ARMADA375_USBCLUSTER_PHY
def_bool y
depends on MACH_ARMADA_375 || COMPILE_TEST
-   depends on OF
+   depends on OF && HAS_IOMEM
select GENERIC_PHY
 
 config PHY_DM816X_USB
-- 
2.5.0



[GIT PULL] extcon next for 4.6

2016-02-28 Thread Chanwoo Choi
Dear Greg,

This is the extcon-next pull request for v4.6. A detailed description of
this pull request is included below. Please pull extcon with the following
updates.

Best Regards,
Chanwoo Choi

The following changes since commit 92e963f50fc74041b5e9e744c330dca48e04f08d:

  Linux 4.5-rc1 (2016-01-24 13:06:47 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon.git tags/extcon-next-for-4.6

for you to fetch changes up to ae64e42cc2b3a17ac0c11815f53211093a54cf55:

  extcon: palmas: Drop IRQF_EARLY_RESUME flag (2016-02-29 11:07:34 +0900)


Update extcon for 4.6

Detailed description for patchset:
1. Add new EXTCON_CHG_USB_SDP type
- SDP (Standard Downstream Port) USB Charging Port
  means the charging connector.

2. Add VBUS detection using GPIO to extcon-palmas
- The beaglex15 board uses the extcon-palmas driver,
  but the board needs GPIO support for VBUS
  detection.

3. Fix minor issues in extcon drivers


Chanwoo Choi (1):
  extcon: Add the EXTCON_CHG_USB_SDP to support SDP charing port

Charles Keepax (1):
  extcon: arizona: Use DAPM mutex helper functions

Dan Carpenter (1):
  extcon: max77843: Use correct size for reading the interrupt register

Felipe Balbi (3):
  extcon: palmas: Add the support for VBUS detection by using GPIO
  arm: boot: dts: beaglex15: Remove ID GPIO
  arm: boot: beaglex15: pass correct interrupt

Geliang Tang (1):
  extcon: Use to_i2c_client for both rt8973a and sm5502

Grygorii Strashko (1):
  extcon: palmas: Drop IRQF_EARLY_RESUME flag

Moritz Fischer (1):
  extcon: gpio: Fix typo in comment

 arch/arm/boot/dts/am57xx-beagle-x15.dts |  3 +-
 drivers/extcon/extcon-arizona.c |  4 +--
 drivers/extcon/extcon-gpio.c|  2 +-
 drivers/extcon/extcon-max14577.c|  3 ++
 drivers/extcon/extcon-max77693.c| 12 +++-
 drivers/extcon/extcon-max77843.c|  5 ++-
 drivers/extcon/extcon-max8997.c |  3 ++
 drivers/extcon/extcon-palmas.c  | 54 +++--
 drivers/extcon/extcon-rt8973a.c |  8 +++--
 drivers/extcon/extcon-sm5502.c  |  8 +++--
 include/linux/mfd/palmas.h  |  3 ++
 11 files changed, 92 insertions(+), 13 deletions(-)


Re: [PATCH v6 00/12] Add T210 support in Tegra soctherm

2016-02-28 Thread Wei Ni
Hi,
Does anyone have comments on this series?

Thanks.
Wei.

On 2016-02-22 16:05, Wei Ni wrote:
> This patchset adds following functions for tegra_soctherm driver:
> 1. add T210 support.
> 2. export debugfs to show some registers.
> 3. add thermtrip function.
> 4. add suspend/resume function.
> 
> The v5 series is at:
> http://www.spinics.net/lists/linux-tegra/msg25079.html
> The v4 series is at:
> http://www.spinics.net/lists/linux-tegra/msg24972.html
> The V3 series is at:
> http://www.spinics.net/lists/linux-tegra/msg24911.html
> The V2 series is at:
> http://www.spinics.net/lists/linux-tegra/msg24901.html
> The V1 series is at:
> http://www.spinics.net/lists/linux-tegra/msg24808.html
> 
> Main changes from V5:
> 1. Use the Linux thermal framework to implement the
> thermtrip function, per Rob's comment.
> 2. Add .set_trip_temp() in of-thermal driver, so that
> we can set trips on hardware.
> 
> Main changes from V4:
> 1. Change description of devicetree binding per Rob's comment.
> 2. Call of_node_put to decrement refcount of the node.
> 
> Main changes from V3:
> 1. Change structures to "const" in chip specific files.
> 2. Minor changes per Thierry's comments.
> 
> Main changes from V2:
> 1. Fix build error in patch [1/11].
> 2. Use of_get_child_by_name instead of of_find_node_by_name in patch [8/11].
> 3. Use debugfs_remove_recursive to remove debugfs in patch [6/11].
> 
> Main changes from V1:
> 1. Use the new type to handle different Tegra chips in one driver, as
> suggested by Thierry.
> 2. Changes per Thierry's other comments.
> 
> Wei Ni (12):
>   thermal: tegra: move tegra thermal files into tegra directory
>   thermal: tegra: combine sensor group-related data
>   thermal: tegra: get rid of PDIV/HOTSPOT hack
>   thermal: tegra: split tegra_soctherm driver
>   thermal: tegra: add Tegra210 specific SOC_THERM driver
>   thermal: tegra: add a debugfs to show registers
>   thermal: of-thermal: allow setting trip_temp on hardware
>   of: add notes of critical trips for soctherm
>   thermal: tegra: add thermtrip function
>   thermal: tegra: add PM support
>   arm64: tegra: add soctherm node for Tegra210
>   arm: tegra: set critical trips for Tegra124
> 
>  .../devicetree/bindings/thermal/tegra-soctherm.txt |  12 +
>  arch/arm/boot/dts/tegra124.dtsi|  16 +
>  arch/arm64/boot/dts/nvidia/tegra210.dtsi   |  60 ++
>  drivers/thermal/Kconfig|  12 +-
>  drivers/thermal/Makefile   |   2 +-
>  drivers/thermal/of-thermal.c   |   8 +
>  drivers/thermal/tegra/Kconfig  |  13 +
>  drivers/thermal/tegra/Makefile |   5 +
>  drivers/thermal/tegra/soctherm-fuse.c  | 169 +
>  drivers/thermal/tegra/soctherm.c   | 685 +
>  drivers/thermal/tegra/soctherm.h   | 123 
>  drivers/thermal/tegra/tegra124-soctherm.c  | 196 ++
>  drivers/thermal/tegra/tegra210-soctherm.c  | 197 ++
>  drivers/thermal/tegra_soctherm.c   | 476 --
>  include/dt-bindings/thermal/tegra124-soctherm.h|   1 +
>  include/linux/thermal.h|   1 +
>  16 files changed, 1489 insertions(+), 487 deletions(-)
>  create mode 100644 drivers/thermal/tegra/Kconfig
>  create mode 100644 drivers/thermal/tegra/Makefile
>  create mode 100644 drivers/thermal/tegra/soctherm-fuse.c
>  create mode 100644 drivers/thermal/tegra/soctherm.c
>  create mode 100644 drivers/thermal/tegra/soctherm.h
>  create mode 100644 drivers/thermal/tegra/tegra124-soctherm.c
>  create mode 100644 drivers/thermal/tegra/tegra210-soctherm.c
>  delete mode 100644 drivers/thermal/tegra_soctherm.c
> 


Re: [PATCH 01/10] fs crypto: add basic definitions for per-file encryption

2016-02-28 Thread Randy Dunlap
On 02/25/16 11:25, Jaegeuk Kim wrote:
> This patch adds definitions for per-file encryption used by ext4 and f2fs.
> 
> Signed-off-by: Jaegeuk Kim 
> ---
>  include/linux/fs.h   |   8 ++
>  include/linux/fscrypto.h | 239 +++
>  include/uapi/linux/fs.h  |  18 
>  3 files changed, 265 insertions(+)
>  create mode 100644 include/linux/fscrypto.h
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index ae68100..d8f57cf 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -53,6 +53,8 @@ struct swap_info_struct;
>  struct seq_file;
>  struct workqueue_struct;
>  struct iov_iter;
> +struct fscrypt_info;
> +struct fscrypt_operations;
>  
>  extern void __init inode_init(void);
>  extern void __init inode_init_early(void);
> @@ -678,6 +680,10 @@ struct inode {
>   struct hlist_head   i_fsnotify_marks;
>  #endif
>  
> +#ifdef CONFIG_FS_ENCRYPTION
> + struct fscrypt_info *i_crypt_info;
> +#endif
> +
>   void    *i_private; /* fs or device private pointer */
>  };
>  
> @@ -1323,6 +1329,8 @@ struct super_block {
>  #endif
>   const struct xattr_handler **s_xattr;
>  
> + const struct fscrypt_operations *s_cop;
> +
>   struct hlist_bl_head s_anon; /* anonymous dentries for (nfs) exporting */
>   struct list_head s_mounts;   /* list of mounts; _not_ for fs use */
>   struct block_device *s_bdev;
> diff --git a/include/linux/fscrypto.h b/include/linux/fscrypto.h
> new file mode 100644
> index 000..b0aed92
> --- /dev/null
> +++ b/include/linux/fscrypto.h
> @@ -0,0 +1,239 @@
> +/*
> + * General per-file encryption definition
> + *
> + * Copyright (C) 2015, Google, Inc.
> + *
> + * Written by Michael Halcrow, 2015.
> + * Modified by Jaegeuk Kim, 2015.
> + */
> +
> +#ifndef _LINUX_FSCRYPTO_H
> +#define _LINUX_FSCRYPTO_H
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define FS_KEY_DERIVATION_NONCE_SIZE 16
> +#define FS_ENCRYPTION_CONTEXT_FORMAT_V1  1
> +
> +#define FS_POLICY_FLAGS_PAD_4    0x00
> +#define FS_POLICY_FLAGS_PAD_8    0x01
> +#define FS_POLICY_FLAGS_PAD_16   0x02
> +#define FS_POLICY_FLAGS_PAD_32   0x03
> +#define FS_POLICY_FLAGS_PAD_MASK 0x03
> +#define FS_POLICY_FLAGS_VALID    0x03
> +
> +/* Encryption algorithms */
> +#define FS_ENCRYPTION_MODE_INVALID   0
> +#define FS_ENCRYPTION_MODE_AES_256_XTS   1
> +#define FS_ENCRYPTION_MODE_AES_256_GCM   2
> +#define FS_ENCRYPTION_MODE_AES_256_CBC   3
> +#define FS_ENCRYPTION_MODE_AES_256_CTS   4
> +
> +/**
> + * Encryption context for inode
> + *
> + * Protector format:
> + *  1 byte: Protector format (1 = this version)
> + *  1 byte: File contents encryption mode
> + *  1 byte: File names encryption mode
> + *  1 byte: Flags
> + *  8 bytes: Master Key descriptor
> + *  16 bytes: Encryption Key derivation nonce
> + */
> +struct fscrypt_context {
> + char format;
> + char contents_encryption_mode;
> + char filenames_encryption_mode;
> + char flags;
> + char master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
> + char nonce[FS_KEY_DERIVATION_NONCE_SIZE];

how about u8 instead of char?
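
To illustrate the concern: whether plain char is signed is
implementation-defined, so a raw byte such as 0xff stored in a char may read
back as -1, while an unsigned byte type always reads back as 255. A user-space
sketch, using uint8_t as a stand-in for the kernel's u8 and assuming
FS_KEY_DESCRIPTOR_SIZE is 8 as the layout comment above states (the struct and
function names here are for illustration only):

```c
#include <stdint.h>

#define FS_KEY_DESCRIPTOR_SIZE        8  /* assumed from the "8 bytes" comment */
#define FS_KEY_DERIVATION_NONCE_SIZE 16

/* Illustrative copy of the context with unsigned byte fields. */
struct fscrypt_context_u8 {
    uint8_t format;
    uint8_t contents_encryption_mode;
    uint8_t filenames_encryption_mode;
    uint8_t flags;
    uint8_t master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
    uint8_t nonce[FS_KEY_DERIVATION_NONCE_SIZE];
} __attribute__((packed));

/* 1 + 1 + 1 + 1 + 8 + 16 bytes, matching the documented layout. */
_Static_assert(sizeof(struct fscrypt_context_u8) == 28,
               "unexpected context size");

/* With uint8_t, a raw byte of 0xff always converts to 255, never to -1. */
static int byte_as_int(uint8_t b)
{
    return b;
}
```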

> +} __packed;
> +
> +/* Encryption parameters */
> +#define FS_XTS_TWEAK_SIZE        16
> +#define FS_AES_128_ECB_KEY_SIZE  16
> +#define FS_AES_256_GCM_KEY_SIZE  32
> +#define FS_AES_256_CBC_KEY_SIZE  32
> +#define FS_AES_256_CTS_KEY_SIZE  32
> +#define FS_AES_256_XTS_KEY_SIZE  64
> +#define FS_MAX_KEY_SIZE  64
> +
> +#define FS_KEY_DESC_PREFIX   "fscrypt:"
> +#define FS_KEY_DESC_PREFIX_SIZE  8
> +
> +/* This is passed in from userspace into the kernel keyring */
> +struct fscrypt_key {
> + __u32 mode;
> + char raw[FS_MAX_KEY_SIZE];
> + __u32 size;
> +} __packed;
> +
> +struct fscrypt_info {
> + char ci_data_mode;
> + char ci_filename_mode;
> + char ci_flags;

ditto

> + struct crypto_ablkcipher *ci_ctfm;
> + struct key *ci_keyring_key;
> + char ci_master_key[FS_KEY_DESCRIPTOR_SIZE];
> +};
> +
> +#define FS_CTX_REQUIRES_FREE_ENCRYPT_FL  0x0001
> +#define FS_WRITE_PATH_FL 0x0002
> +
> +struct fscrypt_ctx {
> + union {
> + struct {
> + struct page *bounce_page;   /* Ciphertext page */
> + struct page *control_page;  /* Original page  */
> + } w;
> + struct {
> + struct bio *bio;
> + struct work_struct work;
> + } r;
> + struct list_head free_list; /* Free list */
> + };
> + char flags; /* Flags */
> + char mode;  /* Encryption mod

Re: [PATCH 06/10] fs crypto: add Makefile and Kconfig

2016-02-28 Thread Randy Dunlap
On 02/25/16 11:26, Jaegeuk Kim wrote:
> This patch adds a facility to enable per-file encryption.
> 
> Arnd fixes a missing CONFIG_BLOCK check in the original patch.
> "The newly added generic crypto abstraction for file systems operates
> on 'struct bio' objects, which do not exist when CONFIG_BLOCK is
> disabled:
> 
> fs/crypto/crypto.c: In function 'fscrypt_zeroout_range':
> fs/crypto/crypto.c:308:9: error: implicit declaration of function 'bio_alloc' [-Werror=implicit-function-declaration]
> 
> This adds a Kconfig dependency that prevents FS_ENCRYPTION from being
> enabled without BLOCK."
> 
> Signed-off-by: Arnd Bergmann 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/Kconfig |  2 ++
>  fs/Makefile|  1 +
>  fs/crypto/Kconfig  | 17 +
>  fs/crypto/Makefile |  2 ++
>  4 files changed, 22 insertions(+)
>  create mode 100644 fs/crypto/Kconfig
>  create mode 100644 fs/crypto/Makefile
> 
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 9adee0d..9d75767 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -84,6 +84,8 @@ config MANDATORY_FILE_LOCKING
>  
> To the best of my knowledge this is dead code that no one cares about.
>  
> +source "fs/crypto/Kconfig"
> +
>  source "fs/notify/Kconfig"
>  
>  source "fs/quota/Kconfig"
> diff --git a/fs/Makefile b/fs/Makefile
> index 79f5225..47571e2 100644
> --- a/fs/Makefile
> +++ b/fs/Makefile
> @@ -30,6 +30,7 @@ obj-$(CONFIG_EVENTFD)   += eventfd.o
>  obj-$(CONFIG_USERFAULTFD) += userfaultfd.o
>  obj-$(CONFIG_AIO)   += aio.o
>  obj-$(CONFIG_FS_DAX) += dax.o
> +obj-y += crypto/
>  obj-$(CONFIG_FILE_LOCKING)  += locks.o
>  obj-$(CONFIG_COMPAT) += compat.o compat_ioctl.o
>  obj-$(CONFIG_BINFMT_AOUT) += binfmt_aout.o
> diff --git a/fs/crypto/Kconfig b/fs/crypto/Kconfig
> new file mode 100644
> index 000..9bea124e
> --- /dev/null
> +++ b/fs/crypto/Kconfig
> @@ -0,0 +1,17 @@
> +config FS_ENCRYPTION
> + bool "FS Encryption (Per-file encryption)"
> + depends on BLOCK

depends on CRYPTO
since all of the CRYPTO_xxx below also depend on CRYPTO.

> + select CRYPTO_AES
> + select CRYPTO_CBC
> + select CRYPTO_ECB
> + select CRYPTO_XTS
> + select CRYPTO_CTS
> + select CRYPTO_CTR
> + select CRYPTO_SHA256
> + select KEYS
> + select ENCRYPTED_KEYS
> + help
> +   Enable encryption of files and directories.  This
> +   feature is similar to ecryptfs, but it is more memory
> +   efficient since it avoids caching the encrypted and
> +   decrypted pages in the page cache.
> diff --git a/fs/crypto/Makefile b/fs/crypto/Makefile
> new file mode 100644
> index 000..f9f68cd
> --- /dev/null
> +++ b/fs/crypto/Makefile
> @@ -0,0 +1,2 @@
> +obj-y += fname.o
> +obj-$(CONFIG_FS_ENCRYPTION)  += crypto.o policy.o keyinfo.o
> 


-- 
~Randy


[PATCH 01/10] selftests/x86: In syscall_nt, test NT|TF as well

2016-02-28 Thread Andy Lutomirski
Setting TF prevents fastpath returns in most cases, which causes the
test to fail on 32-bit kernels because 32-bit kernels do not, in
fact, handle NT correctly on SYSENTER entries.

The next patch will fix 32-bit kernels.

Signed-off-by: Andy Lutomirski 
---
 tools/testing/selftests/x86/syscall_nt.c | 57 +++-
 1 file changed, 49 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/x86/syscall_nt.c b/tools/testing/selftests/x86/syscall_nt.c
index 60c06af4646a..a6ceff86c199 100644
--- a/tools/testing/selftests/x86/syscall_nt.c
+++ b/tools/testing/selftests/x86/syscall_nt.c
@@ -17,6 +17,9 @@
 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
 
@@ -26,6 +29,8 @@
 # define WIDTH "l"
 #endif
 
+static unsigned int nerrs;
+
 static unsigned long get_eflags(void)
 {
unsigned long eflags;
@@ -39,16 +44,52 @@ static void set_eflags(unsigned long eflags)
  : : "rm" (eflags) : "flags");
 }
 
-int main()
+static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
+  int flags)
 {
-   printf("[RUN]\tSet NT and issue a syscall\n");
-   set_eflags(get_eflags() | X86_EFLAGS_NT);
+   struct sigaction sa;
+   memset(&sa, 0, sizeof(sa));
+   sa.sa_sigaction = handler;
+   sa.sa_flags = SA_SIGINFO | flags;
+   sigemptyset(&sa.sa_mask);
+   if (sigaction(sig, &sa, 0))
+   err(1, "sigaction");
+}
+
+static void sigtrap(int sig, siginfo_t *si, void *ctx_void)
+{
+}
+
+static void do_it(unsigned long extraflags)
+{
+   unsigned long flags;
+
+   set_eflags(get_eflags() | extraflags);
syscall(SYS_getpid);
-   if (get_eflags() & X86_EFLAGS_NT) {
-   printf("[OK]\tThe syscall worked and NT is still set\n");
-   return 0;
+   flags = get_eflags();
+   if ((flags & extraflags) == extraflags) {
+   printf("[OK]\tThe syscall worked and flags are still set\n");
} else {
-   printf("[FAIL]\tThe syscall worked but NT was cleared\n");
-   return 1;
+   printf("[FAIL]\tThe syscall worked but flags were cleared (flags = 0x%lx but expected 0x%lx set)\n",
+  flags, extraflags);
+   nerrs++;
}
 }
+
+int main()
+{
+   printf("[RUN]\tSet NT and issue a syscall\n");
+   do_it(X86_EFLAGS_NT);
+
+   /*
+* Now try it again with TF set -- TF forces returns via IRET in all
+* cases except non-ptregs-using 64-bit full fast path syscalls.
+*/
+
+   sethandler(SIGTRAP, sigtrap, 0);
+
+   printf("[RUN]\tSet NT|TF and issue a syscall\n");
+   do_it(X86_EFLAGS_NT | X86_EFLAGS_TF);
+
+   return nerrs == 0 ? 0 : 1;
+}
-- 
2.5.0
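
The pass/fail condition in do_it() is a plain mask comparison: every bit that
was requested must still be set after the syscall. A stand-alone sketch of
just that check, without the inline assembly, follows. The function name is
invented here, and the two constants are written out with the architectural
EFLAGS bit positions (TF is bit 8, NT is bit 14) instead of coming from the
kernel headers.

```c
#include <stdbool.h>

#define X86_EFLAGS_TF 0x00000100UL /* trap flag, bit 8 */
#define X86_EFLAGS_NT 0x00004000UL /* nested task, bit 14 */

/* True if every bit in 'expected' is still set in 'flags'. */
static bool flags_preserved(unsigned long flags, unsigned long expected)
{
    return (flags & expected) == expected;
}
```

Comparing against the full mask, rather than testing for any non-zero overlap,
is what makes the NT|TF case fail when only one of the two flags survives.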



[PATCH 00/10] x86: Various SYSENTER/SYSEXIT/#DB fixes and cleanups

2016-02-28 Thread Andy Lutomirski
hpa asked me to get rid of the ASM_CLAC at the beginning of the SYSENTER
path.  Little did he know...

This series makes the observed behavior of SYSENTER wrt flags the same
for all sane flags and kernel bitnesses.  That is, SYSENTER preserves
flags now unless you do a syscall that explicitly changes flags, and
the HW flags that the syscall executes with are sanitized.  This
includes NT, TF, AC and all arithmetic flags.  Prior to this series,
32-bit kernels clobbered TF and the arithmetic flags and behaved
highly erratically if NT was set.  (If IF is cleared by evil userspace
when SYSENTER starts, IF will be set again on return.  There's nothing
the kernel can do about this -- SYSENTER inherently forgets the state
of IF.)

This series speeds up SYSENTER on all kernels by a surprisingly large
amount on Skylake because it eliminates an unconditional CLAC.

While SYSENTER used to handle TF correctly as far as I can tell on
64-bit kernels, the means by which it did so was heavily tangled up in
the ptrace single-step logic.  It now works just like all the other
kernel entries except insofar as do_debug has a simple special case
for it.  Relatedly, the bizarre and poorly explained old fixup in
do_debug is now hidden behind a WARN_ON_ONCE in preparation for
deleting it at some point.

The code that fixed up NMI and #DB early in SYSENTER in 32-bit kernels
used to be both terrifying and incorrect.  (It doesn't appear to have
been exploitably bad, but the reason for that is subtle, and the code
was certainly more fragile than it deserved to be.)  We still need a
special fixup, but it's much simpler now.

While I was doing all this, I also noticed that DR6 and BTF handling
in do_debug was a bit off.  Two of the patches in here try to fix it
up.

Have fun!

tl;dr: Cleanups and sanity fixes here, but no security fixes, and I
don't think anything needs to be backported or put in x86/urgent.

This series applies to the result of merging tip:x86/asm and
tip:x86/urgent.  I've been testing on a somewhat bastardized base,
because tip currently doesn't work on my laptop in 32-bit mode.  (That
bug is fixed in Linus' tree.)

Andy Lutomirski (10):
  selftests/x86: In syscall_nt, test NT|TF as well
  x86/entry/compat: In SYSENTER, sink AC clearing below the existing
FLAGS test
  x86/entry/32: Filter NT and speed up AC filtering in SYSENTER
  x86/entry/32: Restore FLAGS on SYSEXIT
  x86/traps: Clear TIF_BLOCKSTEP on all debug exceptions
  x86/traps: Clear DR6 early in do_debug and improve the comment
  x86/entry: Vastly simplify SYSENTER TF handling
  x86/entry: Only allocate space for SYSENTER_stack if needed
  x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup
  x86/entry/32: Add and check a stack canary for the SYSENTER stack

 arch/x86/entry/entry_32.S| 182 ++-
 arch/x86/entry/entry_64_compat.S |  15 ++-
 arch/x86/include/asm/processor.h |   5 +-
 arch/x86/include/asm/proto.h |  15 ++-
 arch/x86/kernel/asm-offsets_32.c |   5 +
 arch/x86/kernel/process.c|   3 +
 arch/x86/kernel/traps.c  |  87 ---
 tools/testing/selftests/x86/syscall_nt.c |  57 --
 8 files changed, 263 insertions(+), 106 deletions(-)

-- 
2.5.0



[PATCH 02/10] x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test

2016-02-28 Thread Andy Lutomirski
CLAC is slow, and the SYSENTER code already has an unlikely path
that runs if unusual flags are set.  Drop the CLAC and instead rely
on the unlikely path to clear AC.

This seems to save ~24 cycles on my Skylake laptop.  (Hey, Intel,
make this faster please!)
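
For readers who don't read entry assembly every day, here is a rough
user-space C model of the idea.  It is only a sketch: the function names
are made up for illustration, and the flag values are the architectural
EFLAGS bit positions rather than anything taken from this patch.  The
point is that a single test covers both flags, and the expensive flag
reload only happens on the rare path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFLAGS bits. */
#define X86_EFLAGS_FIXED 0x00000002u /* bit 1, always reads as set */
#define X86_EFLAGS_NT    0x00004000u /* bit 14, Nested Task */
#define X86_EFLAGS_AC    0x00040000u /* bit 18, Alignment Check */

/* Hypothetical helper: one test covers both flags, like the single testl. */
static bool sysenter_needs_flag_fixup(uint32_t saved_flags)
{
	return (saved_flags & (X86_EFLAGS_NT | X86_EFLAGS_AC)) != 0;
}

/*
 * Hypothetical model of the live flags register after the check.  The
 * out-of-line path loads a known-clean value (push $X86_EFLAGS_FIXED;
 * popf).  The saved user flags in pt_regs are not modified.
 */
static uint32_t sysenter_entry_hw_flags(uint32_t hw_flags, uint32_t saved_flags)
{
	if (sysenter_needs_flag_fixup(saved_flags))
		return X86_EFLAGS_FIXED;	/* rare, slow path */
	return hw_flags;			/* common path: nothing to do */
}
```

In the common case neither bit is set, so the only cost left on the
fast path is the test and a not-taken branch.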

Signed-off-by: Andy Lutomirski 
---
 arch/x86/entry/entry_64_compat.S | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 89bcb4979e7a..7c8e72da7654 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -66,8 +66,6 @@ ENTRY(entry_SYSENTER_compat)
 */
pushfq  /* pt_regs->flags (except IF = 0) */
orl $X86_EFLAGS_IF, (%rsp)  /* Fix saved flags */
-   ASM_CLAC/* Clear AC after saving FLAGS */
-
pushq   $__USER32_CS/* pt_regs->cs */
xorq%r8,%r8
pushq   %r8 /* pt_regs->ip = 0 (placeholder) */
@@ -90,9 +88,9 @@ ENTRY(entry_SYSENTER_compat)
cld
 
/*
-* Sysenter doesn't filter flags, so we need to clear NT
+* Sysenter doesn't filter flags, so we need to clear NT and AC
 * ourselves.  To save a few cycles, we can check whether
-* NT was set instead of doing an unconditional popfq.
+* either was set instead of doing an unconditional popfq.
 * This needs to happen before enabling interrupts so that
 * we don't get preempted with NT set.
 *
@@ -102,7 +100,7 @@ ENTRY(entry_SYSENTER_compat)
 * we're keeping that code behind a branch which will predict as
 * not-taken and therefore its instructions won't be fetched.
 */
-   testl   $X86_EFLAGS_NT, EFLAGS(%rsp)
+   testl   $X86_EFLAGS_NT|X86_EFLAGS_AC, EFLAGS(%rsp)
jnz .Lsysenter_fix_flags
 .Lsysenter_flags_fixed:
 
-- 
2.5.0



[PATCH 06/10] x86/traps: Clear DR6 early in do_debug and improve the comment

2016-02-28 Thread Andy Lutomirski
Leaving any bits set in DR6 on return from a debug exception is
asking for trouble.  Prevent it by writing zero right away and
clarify the comment.
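
As a rough illustration of the ordering (read, clear right away, then
mask), here is a user-space C sketch.  The pointer argument stands in
for the hardware register and the function name is hypothetical.  The
constant values are the ones I believe the kernel headers use, so treat
them as assumptions.

```c
#include <stdint.h>

#define DR6_RESERVED 0xFFFF0FF0u /* reserved bits (assumed kernel value) */
#define DR_STEP      0x00004000u /* BS: single-step trap */

/*
 * Hypothetical model of the start of do_debug: snapshot DR6, zero the
 * "register" immediately so no stale bits survive until the next debug
 * exception, then drop the reserved bits from the snapshot.
 */
static uint32_t read_and_clear_dr6(uint32_t *hw_dr6)
{
	uint32_t dr6 = *hw_dr6;	/* get_debugreg(dr6, 6) */

	*hw_dr6 = 0;		/* set_debugreg(0, 6) */
	return dr6 & ~DR6_RESERVED;
}
```

Every early exit taken after this point leaves the register clean, which
is what moving the write to the top buys us.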

Signed-off-by: Andy Lutomirski 
---
 arch/x86/kernel/traps.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 19e6cfa501e3..6dddc220e3ed 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -593,6 +593,18 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
ist_enter(regs);
 
get_debugreg(dr6, 6);
+   /*
+* The Intel SDM says:
+*
+*   Certain debug exceptions may clear bits 0-3. The remaining
+*   contents of the DR6 register are never cleared by the
+*   processor. To avoid confusion in identifying debug
+*   exceptions, debug handlers should clear the register before
+*   returning to the interrupted task.
+*
+* Keep it simple: clear DR6 immediately.
+*/
+   set_debugreg(0, 6);
 
/* Filter out all the reserved bits which are preset to 1 */
dr6 &= ~DR6_RESERVED;
@@ -616,9 +628,6 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
if ((dr6 & DR_STEP) && kmemcheck_trap(regs))
goto exit;
 
-   /* DR6 may or may not be cleared by the CPU */
-   set_debugreg(0, 6);
-
/* Store the virtualized DR6 value */
tsk->thread.debugreg6 = dr6;
 
-- 
2.5.0



[PATCH 07/10] x86/entry: Vastly simplify SYSENTER TF handling

2016-02-28 Thread Andy Lutomirski
Due to a blatant design error, SYSENTER doesn't clear TF.  As a result,
if a user does SYSENTER with TF set, we will single-step through the
kernel until something clears TF.  There is absolutely nothing we can
do to prevent this short of turning off SYSENTER [1].

Simplify the handling considerably with two changes:

1. We already sanitize EFLAGS in SYSENTER to clear NT and AC.  We can
   add TF to that list of flags to sanitize with no overhead whatsoever.

2. Teach do_debug to ignore single-step traps in the SYSENTER prologue.

That's all we need to do.
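
To make change 2 concrete, here is a user-space C sketch of the kind of
check do_debug could make.  This is my guess at the shape, not the code
from the patch: the struct and function names are hypothetical, and
only the idea of comparing the trapping IP against the
__begin/__end_SYSENTER_singlestep_region markers comes from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DR_STEP 0x00004000u /* BS: single-step trap */

/* Hypothetical stand-in for the two region marker symbols. */
struct sysenter_region {
	uintptr_t begin;	/* __begin_SYSENTER_singlestep_region */
	uintptr_t end;		/* __end_SYSENTER_singlestep_region */
};

/* Unsigned subtraction folds both bounds checks into one compare. */
static bool ip_in_singlestep_region(const struct sysenter_region *r,
				    uintptr_t ip)
{
	return ip - r->begin < r->end - r->begin;
}

/*
 * Drop DR_STEP for kernel-mode single-step traps that hit inside the
 * SYSENTER prologue.  Other DR6 bits are passed through untouched.
 */
static uint32_t filter_sysenter_step(const struct sysenter_region *r,
				     uintptr_t ip, bool user_mode,
				     uint32_t dr6)
{
	if (!user_mode && (dr6 & DR_STEP) && ip_in_singlestep_region(r, ip))
		dr6 &= ~DR_STEP;
	return dr6;
}
```

If nothing is left in dr6 after filtering, the handler has no more work
to do for that trap.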

Don't get too excited -- our handling is still buggy on 32-bit
kernels.  There's nothing wrong with the SYSENTER code itself, but
the #DB prologue has a clever fixup for traps on the very first
instruction of entry_SYSENTER_32, and the fixup doesn't work quite
correctly.  The next two patches will fix that.

[1] We could probably prevent it by forcing BTF on at all times and
making sure we clear TF before any branches in the SYSENTER
code.  Needless to say, this is a bad idea.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/entry/entry_32.S| 42 ++--
 arch/x86/entry/entry_64_compat.S |  9 ++-
 arch/x86/include/asm/proto.h | 15 ++--
 arch/x86/kernel/traps.c  | 52 +---
 4 files changed, 94 insertions(+), 24 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index ed171f938960..752d4f031a18 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -287,7 +287,26 @@ need_resched:
 END(resume_kernel)
 #endif
 
-   # SYSENTER  call handler stub
+GLOBAL(__begin_SYSENTER_singlestep_region)
+/*
+ * All code from here through __end_SYSENTER_singlestep_region is subject
+ * to being single-stepped if a user program sets TF and executes SYSENTER.
+ * There is absolutely nothing that we can do to prevent this from happening
+ * (thanks Intel!).  To keep our handling of this situation as simple as
+ * possible, we handle TF just like AC and NT, except that our #DB handler
+ * will ignore all of the single-step traps generated in this range.
+ */
+
+#ifdef CONFIG_XEN
+/*
+ * Xen doesn't set %esp to be precisely what the normal SYSENTER
+ * entry point expects, so fix it up before using the normal path.
+ */
+ENTRY(xen_sysenter_target)
+   addl$5*4, %esp  /* remove xen-provided frame */
+   jmp sysenter_past_esp
+#endif
+
 ENTRY(entry_SYSENTER_32)
movlTSS_sysenter_sp0(%esp), %esp
 sysenter_past_esp:
@@ -301,19 +320,25 @@ sysenter_past_esp:
SAVE_ALL pt_regs_ax=$-ENOSYS/* save rest */
 
/*
-* Sysenter doesn't filter flags, so we need to clear NT and AC
-* ourselves.  To save a few cycles, we can check whether
+* Sysenter doesn't filter flags, so we need to clear NT, AC
+* and TF ourselves.  To save a few cycles, we can check whether
 * either was set instead of doing an unconditional popfq.
 * This needs to happen before enabling interrupts so that
 * we don't get preempted with NT set.
 *
+* If TF is set, we will single-step all the way to here -- do_debug
+* will ignore all the traps.  (Yes, this is slow, but so is
+* single-stepping in general.  This allows us to avoid having
+* a more complicated code to handle the case where a user program
+* forces us to single-step through the SYSENTER entry code.)
+*
 * NB.: .Lsysenter_fix_flags is a label with the code under it moved
 * out-of-line as an optimization: NT is unlikely to be set in the
 * majority of the cases and instead of polluting the I$ unnecessarily,
 * we're keeping that code behind a branch which will predict as
 * not-taken and therefore its instructions won't be fetched.
 */
-   testl   $X86_EFLAGS_NT|X86_EFLAGS_AC, PT_EFLAGS(%esp)
+   testl   $X86_EFLAGS_NT|X86_EFLAGS_AC|X86_EFLAGS_TF, PT_EFLAGS(%esp)
jnz .Lsysenter_fix_flags
 .Lsysenter_flags_fixed:
 
@@ -369,6 +394,7 @@ sysenter_past_esp:
pushl   $X86_EFLAGS_FIXED
popfl
jmp .Lsysenter_flags_fixed
+GLOBAL(__end_SYSENTER_singlestep_region)
 ENDPROC(entry_SYSENTER_32)
 
# system call handler stub
@@ -662,14 +688,6 @@ ENTRY(spurious_interrupt_bug)
 END(spurious_interrupt_bug)
 
 #ifdef CONFIG_XEN
-/*
- * Xen doesn't set %esp to be precisely what the normal SYSENTER
- * entry point expects, so fix it up before using the normal path.
- */
-ENTRY(xen_sysenter_target)
-   addl$5*4, %esp  /* remove xen-provided frame */
-   jmp sysenter_past_esp
-
 ENTRY(xen_hypervisor_callback)
pushl   $-1 /* orig_ax = -1 => not a system call */
SAVE_ALL
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 7c8e72da7654..6aec75b41b06 100

[PATCH 08/10] x86/entry: Only allocate space for SYSENTER_stack if needed

2016-02-28 Thread Andy Lutomirski
The SYSENTER stack is only used on 32-bit kernels.  Remove it in
64-bit kernels.

(We may end up using it down the road on 64-bit kernels.  If so,
 we'll re-enable it for CONFIG_IA32_EMULATION.)

Signed-off-by: Andy Lutomirski 
---
 arch/x86/include/asm/processor.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index ecb410310e70..7cd01b71b5bd 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -297,10 +297,12 @@ struct tss_struct {
 */
unsigned long   io_bitmap[IO_BITMAP_LONGS + 1];
 
+#ifdef CONFIG_X86_32
/*
 * Space for the temporary SYSENTER stack:
 */
unsigned long   SYSENTER_stack[64];
+#endif
 
 } cacheline_aligned;
 
-- 
2.5.0



[PATCH 10/10] x86/entry/32: Add and check a stack canary for the SYSENTER stack

2016-02-28 Thread Andy Lutomirski
Signed-off-by: Andy Lutomirski 
---
 arch/x86/include/asm/processor.h | 3 ++-
 arch/x86/kernel/process.c| 3 +++
 arch/x86/kernel/traps.c  | 8 
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 7cd01b71b5bd..50a6dc871cc0 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -299,8 +299,9 @@ struct tss_struct {
 
 #ifdef CONFIG_X86_32
/*
-* Space for the temporary SYSENTER stack:
+* Space for the temporary SYSENTER stack.
 */
+   unsigned long   SYSENTER_stack_canary;
unsigned long   SYSENTER_stack[64];
 #endif
 
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 9f7c21c22477..ee9a9792caeb 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -57,6 +57,9 @@ __visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = {
  */
.io_bitmap  = { [0 ... IO_BITMAP_LONGS] = ~0 },
 #endif
+#ifdef CONFIG_X86_32
+   .SYSENTER_stack_canary  = STACK_END_MAGIC,
+#endif
 };
 EXPORT_PER_CPU_SYMBOL(cpu_tss);
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 80928ea78373..590110119e6a 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -713,6 +713,14 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
debug_stack_usage_dec();
 
 exit:
+#if defined(CONFIG_X86_32)
+   /*
+* This is the most likely code path that involves non-trivial use
+* of the SYSENTER stack.  Check that we haven't overrun it.
+*/
+   WARN(this_cpu_read(cpu_tss.SYSENTER_stack_canary) != STACK_END_MAGIC,
+"Overran or corrupted SYSENTER stack\n");
+#endif
ist_exit(regs);
 }
 NOKPROBE_SYMBOL(do_debug);
-- 
2.5.0



[PATCH 09/10] x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup

2016-02-28 Thread Andy Lutomirski
Right after SYSENTER, we can get a #DB or NMI.  On x86_32, there's no IST,
so the exception handler is invoked on the temporary SYSENTER stack.

Because the SYSENTER stack is very small, we have a fixup to switch
off the stack quickly when this happens.  The old fixup had several issues:

1. It checked the interrupt frame's CS and EIP.  This wasn't
   obviously correct on Xen or if vm86 mode was in use [1].

2. In the NMI handler, it did some frightening digging into the
   stack frame.  I'm not convinced this digging was correct.

3. The fixup didn't switch stacks and then switch back.  Instead, it
   synthesized a brand new stack frame that would redirect the IRET
   back to the SYSENTER code.  That frame was highly questionable.
   For one thing, if NMI nested inside #DB, we would effectively
   abort the #DB prologue, which was probably safe but was
   frightening.  For another, the code used PUSHFL to write the
   FLAGS portion of the frame, which was simply bogus -- by the time
   PUSHFL was called, at least TF, NT, VM, and all of the arithmetic
   flags were clobbered.

Simplify this considerably.  Instead of looking at the saved frame
to see where we came from, check the hardware ESP register against
the SYSENTER stack directly.  Malicious user code cannot spoof the
kernel ESP register, and by moving the check after SAVE_ALL, we can
use normal PER_CPU accesses to find all the relevant addresses.
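
The range check is easier to see in C than in assembly.  This is only
an illustrative sketch with a hypothetical function name, but the
arithmetic is what the subl/cmpl/jb sequence in the diff below does:
subtract ESP from the end of the SYSENTER stack and compare the result,
unsigned, against the stack size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if sp lies within the stack whose highest address is stack_end
 * and whose size is stack_size bytes.  If sp is above stack_end, the
 * unsigned subtraction wraps to a huge value and the compare fails, so
 * one comparison handles both bounds.
 */
static bool on_sysenter_stack(uintptr_t stack_end, uintptr_t stack_size,
			      uintptr_t sp)
{
	return stack_end - sp < stack_size;
}
```

With the 64-entry SYSENTER_stack on a 32-bit kernel the size is 256
bytes, so any ESP within 256 bytes below the end of the array counts as
being on the SYSENTER stack.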

With this patch applied, the improved syscall_nt_32 test finally
passes on 32-bit kernels.

[1] It isn't obviously correct, but it is nonetheless safe from vm86
shenanigans as far as I can tell.  A user can't point EIP at
entry_SYSENTER_32 while in vm86 mode because entry_SYSENTER_32,
like all kernel addresses, is greater than 0x and would thus
violate the CS segment limit.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/entry/entry_32.S| 114 ++-
 arch/x86/kernel/asm-offsets_32.c |   5 ++
 2 files changed, 56 insertions(+), 63 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 752d4f031a18..99bf636a6eaf 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -987,51 +987,48 @@ error_code:
jmp ret_from_exception
 END(page_fault)
 
-/*
- * Debug traps and NMI can happen at the one SYSENTER instruction
- * that sets up the real kernel stack. Check here, since we can't
- * allow the wrong stack to be used.
- *
- * "TSS_sysenter_sp0+12" is because the NMI/debug handler will have
- * already pushed 3 words if it hits on the sysenter instruction:
- * eflags, cs and eip.
- *
- * We just load the right stack, and push the three (known) values
- * by hand onto the new stack - while updating the return eip past
- * the instruction that would have done it for sysenter.
- */
-.macro FIX_STACK offset ok label
-   cmpw$__KERNEL_CS, 4(%esp)
-   jne \ok
-\label:
-   movlTSS_sysenter_sp0 + \offset(%esp), %esp
-   pushfl
-   pushl   $__KERNEL_CS
-   pushl   $sysenter_past_esp
-.endm
-
 ENTRY(debug)
+   /*
+* #DB can happen at the first instruction of
+* entry_SYSENTER_32 or in Xen's SYSENTER prologue.  If this
+* happens, then we will be running on a very small stack.  We
+* need to detect this condition and switch to the thread
+* stack before calling any C code at all.
+*
+* If you edit this code, keep in mind that NMIs can happen in here.
+*/
ASM_CLAC
-   cmpl$entry_SYSENTER_32, (%esp)
-   jne debug_stack_correct
-   FIX_STACK 12, debug_stack_correct, debug_esp_fix_insn
-debug_stack_correct:
pushl   $-1 # mark this as an int
SAVE_ALL
-   TRACE_IRQS_OFF
xorl%edx, %edx  # error code 0
movl%esp, %eax  # pt_regs pointer
+
+   /* Are we currently on the SYSENTER stack? */
+   PER_CPU(cpu_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx)
+   subl%eax, %ecx  /* ecx = (end of SYSENTER_stack) - esp */
+   cmpl$SIZEOF_SYSENTER_stack, %ecx
+   jb  .Ldebug_from_sysenter_stack
+
+   TRACE_IRQS_OFF
+   calldo_debug
+   jmp ret_from_exception
+
+.Ldebug_from_sysenter_stack:
+   /* We're on the SYSENTER stack.  Switch off. */
+   movl%esp, %ebp
+   movlPER_CPU_VAR(cpu_current_top_of_stack), %esp
+   TRACE_IRQS_OFF
calldo_debug
+   movl%ebp, %esp
jmp ret_from_exception
 END(debug)
 
 /*
- * NMI is doubly nasty. It can happen _while_ we're handling
- * a debug fault, and the debug fault hasn't yet been able to
- * clear up the stack. So we first check whether we got  an
- * NMI on the sysenter entry path, but after that we need to
- * check whether we got an NMI on the debug path where the debug
- * fault happened on the sysenter path.
+ * NMI is 

[PATCH 05/10] x86/traps: Clear TIF_BLOCKSTEP on all debug exceptions

2016-02-28 Thread Andy Lutomirski
The SDM says that debug exceptions clear BTF, and we need to keep
TIF_BLOCKSTEP in sync with BTF.  Clear it unconditionally and improve
the comment.

I suspect that the fact that kmemcheck could cause TIF_BLOCKSTEP not
to be cleared was just an oversight.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/kernel/traps.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index dd2c2e66c2e1..19e6cfa501e3 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -598,6 +598,13 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
dr6 &= ~DR6_RESERVED;
 
/*
+* The SDM says "The processor clears the BTF flag when it
+* generates a debug exception."  Clear TIF_BLOCKSTEP to keep
+* TIF_BLOCKSTEP in sync with the hardware BTF flag.
+*/
+   clear_tsk_thread_flag(tsk, TIF_BLOCKSTEP);
+
+   /*
 * If dr6 has no reason to give us about the origin of this trap,
 * then it's very likely the result of an icebp/int01 trap.
 * User wants a sigtrap for that.
@@ -612,11 +619,6 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
/* DR6 may or may not be cleared by the CPU */
set_debugreg(0, 6);
 
-   /*
-* The processor cleared BTF, so don't mark that we need it set.
-*/
-   clear_tsk_thread_flag(tsk, TIF_BLOCKSTEP);
-
/* Store the virtualized DR6 value */
tsk->thread.debugreg6 = dr6;
 
-- 
2.5.0



[PATCH 04/10] x86/entry/32: Restore FLAGS on SYSEXIT

2016-02-28 Thread Andy Lutomirski
We weren't restoring FLAGS at all on SYSEXIT.  Apparently no one cared.

With this patch applied, native kernels should always honor
task_pt_regs()->flags, which opens the door for some sys_iopl
cleanups.  I'll do those as a separate series, though, since getting
it right will involve tweaking some paravirt ops.

(The short version is that, before this patch, sys_iopl, invoked via
 SYSENTER, wasn't guaranteed to ever transfer the updated
 regs->flags, so sys_iopl had to change the hardware flags register
 as well.)
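
In C terms, the value that reaches POPF is just the saved user flags
with IF masked off.  A tiny sketch follows; the function name is made
up, and the bit number is the architectural position of IF.

```c
#include <stdint.h>

#define X86_EFLAGS_IF_BIT 9 /* architectural position of IF */

/*
 * Model of "btr $X86_EFLAGS_IF_BIT, (%esp); popfl": every saved flag is
 * restored except IF, which stays clear until the later STI.
 */
static uint32_t sysexit_popf_value(uint32_t saved_flags)
{
	return saved_flags & ~(UINT32_C(1) << X86_EFLAGS_IF_BIT);
}
```

IF is then turned back on by STI just before SYSEXIT, which is the
one-instruction window the new comment in the diff refers to.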

Reported-by: Brian Gerst 
Signed-off-by: Andy Lutomirski 
---
 arch/x86/entry/entry_32.S | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 263ebde6333f..ed171f938960 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -343,6 +343,15 @@ sysenter_past_esp:
popl%eax/* pt_regs->ax */
 
/*
+* Restore all flags except IF (we restore IF separately because
+* STI gives a one-instruction window in which we won't be interrupted,
+* whereas POPF does not.
+*/
+   addl$PT_EFLAGS-PT_DS, %esp  /* point esp at pt_regs->flags */
+   btr $X86_EFLAGS_IF_BIT, (%esp)
+   popfl
+
+   /*
 * Return back to the vDSO, which will pop ecx and edx.
 * Don't bother with DS and ES (they already contain __USER_DS).
 */
-- 
2.5.0



[PATCH 03/10] x86/entry/32: Filter NT and speed up AC filtering in SYSENTER

2016-02-28 Thread Andy Lutomirski
This makes the 32-bit code work just like the 64-bit code.  It should
speed up syscalls on 32-bit kernels on Skylake by something like 20
cycles (by analogy to the 64-bit compat case).

It also cleans up NT just like we do for the 64-bit case.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/entry/entry_32.S | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index ab710eee4308..263ebde6333f 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -294,7 +294,6 @@ sysenter_past_esp:
pushl   $__USER_DS  /* pt_regs->ss */
pushl   %ebp/* pt_regs->sp (stashed in bp) */
pushfl  /* pt_regs->flags (except IF = 0) */
-   ASM_CLAC/* Clear AC after saving FLAGS */
orl $X86_EFLAGS_IF, (%esp)  /* Fix IF */
pushl   $__USER_CS  /* pt_regs->cs */
pushl   $0  /* pt_regs->ip = 0 (placeholder) */
@@ -302,6 +301,23 @@ sysenter_past_esp:
SAVE_ALL pt_regs_ax=$-ENOSYS/* save rest */
 
/*
+* Sysenter doesn't filter flags, so we need to clear NT and AC
+* ourselves.  To save a few cycles, we can check whether
+* either was set instead of doing an unconditional popfq.
+* This needs to happen before enabling interrupts so that
+* we don't get preempted with NT set.
+*
+* NB.: .Lsysenter_fix_flags is a label with the code under it moved
+* out-of-line as an optimization: NT is unlikely to be set in the
+* majority of the cases and instead of polluting the I$ unnecessarily,
+* we're keeping that code behind a branch which will predict as
+* not-taken and therefore its instructions won't be fetched.
+*/
+   testl   $X86_EFLAGS_NT|X86_EFLAGS_AC, PT_EFLAGS(%esp)
+   jnz .Lsysenter_fix_flags
+.Lsysenter_flags_fixed:
+
+   /*
 * User mode is traced as though IRQs are on, and SYSENTER
 * turned them off.
 */
@@ -339,6 +355,11 @@ sysenter_past_esp:
 .popsection
_ASM_EXTABLE(1b, 2b)
PTGS_TO_GS_EX
+
+.Lsysenter_fix_flags:
+   pushl   $X86_EFLAGS_FIXED
+   popfl
+   jmp .Lsysenter_flags_fixed
 ENDPROC(entry_SYSENTER_32)
 
# system call handler stub
-- 
2.5.0



Re: [PATCH v3 3/3] sched, x86: Check that we're on the right stack in schedule and __might_sleep

2016-02-28 Thread Andy Lutomirski
On Wed, Nov 19, 2014 at 11:44 AM, Linus Torvalds
 wrote:
> On Wed, Nov 19, 2014 at 11:29 AM, Andi Kleen  wrote:
>>
>> The exception handlers which use the IST stacks don't necessarily
>> set irq count. Maybe they should.
>
> Hmm. I think they should. Since they clearly must not schedule, as
> they use a percpu stack.
>
> Which exceptions use IST?
>
> [ grep grep ]
>
> Looks like stack, doublefault, nmi, debug and mce. And yes, I really
> think they should all raise the irq count if they don't already.
> Rather than add random arch-specific "let's check that we're on the
> right stack" code to the might-sleep stuff, just use the one we have.
>

Resurrecting an old thread:

The outcome of this discussion was that ist_enter now raises
HARDIRQ_COUNT.  I think this is causing a problem.  If a user program
enables TF, it generates a bunch of debug exceptions.  The handlers
raise the IRQ count and do stuff, and apparently some of that stuff
can raise a softirq.  (I have no idea where the softirq is being
raised.)  The softirq code notices that we're in_interrupt and doesn't
wake ksoftirqd because it thinks we're about to exit the interrupt and
process the softirq.  But we don't, which causes occasional warnings
and confuses things (and me!).

So how do we fix it?  If we stop raising HARDIRQ_COUNT (and apply
$SUBJECT?), then raise_softirq will wake ksoftirqd and life is good.
But this seems a bit silly, since, if we entered the ist exception
handler from a context with irqs on and softirqs enabled, we *could*
plausibly handle the softirq right away -- we're on an essentially
empty stack.  (Of course, it's a *small* stack, since it could be the
IST stack.)

Or we could just let ksoftirqd do its thing and stop raising
HARDIRQ_COUNT.  We could add a new preempt count field just for IST
(yuck).  We could try to hijack a different preempt count field
(NMI?).  But I kind of like the idea of just reinstating the original
patch of explicitly checking that we're on a safe stack in schedule
and __might_sleep, since that is the actual condition we care about.

--Andy


[PATCH v4 3/5] ocfs2: create/remove sysfile for online file check

2016-02-28 Thread Gang He
Create the online file check sysfile when an ocfs2 volume is mounted,
and remove it when the volume is unmounted.

Signed-off-by: Gang He 
Reviewed-by: Mark Fasheh 
---
 fs/ocfs2/super.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 2de4c8a..5ef88b8 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -74,6 +74,7 @@
 #include "suballoc.h"
 
 #include "buffer_head_io.h"
+#include "filecheck.h"
 
 static struct kmem_cache *ocfs2_inode_cachep;
 struct kmem_cache *ocfs2_dquot_cachep;
@@ -1204,6 +1205,9 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
/* Start this when the mount is almost sure of being successful */
ocfs2_orphan_scan_start(osb);
 
+   /* Create filecheck sysfile /sys/fs/ocfs2//filecheck */
+   ocfs2_filecheck_create_sysfs(sb);
+
return status;
 
 read_super_error:
@@ -1671,6 +1675,7 @@ static void ocfs2_put_super(struct super_block *sb)
 
ocfs2_sync_blockdev(sb);
ocfs2_dismount_volume(sb, 0);
+   ocfs2_filecheck_remove_sysfs(sb);
 }
 
 static int ocfs2_statfs(struct dentry *dentry, struct kstatfs *buf)
-- 
2.1.2



[PATCH v4 4/5] ocfs2: check/fix inode block for online file check

2016-02-28 Thread Gang He
Implement online checking or fixing of an inode block while
reading the inode block into memory.

Signed-off-by: Gang He 
---
 fs/ocfs2/inode.c   | 225 +++--
 fs/ocfs2/ocfs2_trace.h |   2 +
 2 files changed, 218 insertions(+), 9 deletions(-)

diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 8f87e05..6ce531e 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -53,6 +53,7 @@
 #include "xattr.h"
 #include "refcounttree.h"
 #include "ocfs2_trace.h"
+#include "filecheck.h"
 
 #include "buffer_head_io.h"
 
@@ -74,6 +75,14 @@ static int ocfs2_truncate_for_delete(struct ocfs2_super *osb,
struct inode *inode,
struct buffer_head *fe_bh);
 
+static int ocfs2_filecheck_read_inode_block_full(struct inode *inode,
+struct buffer_head **bh,
+int flags, int type);
+static int ocfs2_filecheck_validate_inode_block(struct super_block *sb,
+   struct buffer_head *bh);
+static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
+ struct buffer_head *bh);
+
 void ocfs2_set_inode_flags(struct inode *inode)
 {
unsigned int flags = OCFS2_I(inode)->ip_attr;
@@ -127,6 +136,7 @@ struct inode *ocfs2_ilookup(struct super_block *sb, u64 blkno)
 struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
 int sysfile_type)
 {
+   int rc = 0;
struct inode *inode = NULL;
struct super_block *sb = osb->sb;
struct ocfs2_find_inode_args args;
@@ -161,12 +171,17 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
}
trace_ocfs2_iget5_locked(inode->i_state);
if (inode->i_state & I_NEW) {
-   ocfs2_read_locked_inode(inode, &args);
+   rc = ocfs2_read_locked_inode(inode, &args);
unlock_new_inode(inode);
}
if (is_bad_inode(inode)) {
iput(inode);
-   inode = ERR_PTR(-ESTALE);
+   if ((flags & OCFS2_FI_FLAG_FILECHECK_CHK) ||
+   (flags & OCFS2_FI_FLAG_FILECHECK_FIX))
+   /* Return OCFS2_FILECHECK_ERR_XXX related errno */
+   inode = ERR_PTR(rc);
+   else
+   inode = ERR_PTR(-ESTALE);
goto bail;
}
 
@@ -409,7 +424,7 @@ static int ocfs2_read_locked_inode(struct inode *inode,
struct ocfs2_super *osb;
struct ocfs2_dinode *fe;
struct buffer_head *bh = NULL;
-   int status, can_lock;
+   int status, can_lock, lock_level = 0;
u32 generation = 0;
 
status = -EINVAL;
@@ -477,7 +492,7 @@ static int ocfs2_read_locked_inode(struct inode *inode,
mlog_errno(status);
return status;
}
-   status = ocfs2_inode_lock(inode, NULL, 0);
+   status = ocfs2_inode_lock(inode, NULL, lock_level);
if (status) {
make_bad_inode(inode);
mlog_errno(status);
@@ -494,16 +509,32 @@ static int ocfs2_read_locked_inode(struct inode *inode,
}
 
if (can_lock) {
-   status = ocfs2_read_inode_block_full(inode, &bh,
-OCFS2_BH_IGNORE_CACHE);
+   if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_CHK)
+   status = ocfs2_filecheck_read_inode_block_full(inode,
+   &bh, OCFS2_BH_IGNORE_CACHE, 0);
+   else if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_FIX)
+   status = ocfs2_filecheck_read_inode_block_full(inode,
+   &bh, OCFS2_BH_IGNORE_CACHE, 1);
+   else
+   status = ocfs2_read_inode_block_full(inode,
+   &bh, OCFS2_BH_IGNORE_CACHE);
} else {
status = ocfs2_read_blocks_sync(osb, args->fi_blkno, 1, &bh);
/*
 * If buffer is in jbd, then its checksum may not have been
 * computed as yet.
 */
-   if (!status && !buffer_jbd(bh))
-   status = ocfs2_validate_inode_block(osb->sb, bh);
+   if (!status && !buffer_jbd(bh)) {
+   if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_CHK)
+   status = ocfs2_filecheck_validate_inode_block(
+   osb->sb, bh);
+   else if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_FIX)
+   status = ocfs2_filecheck_repair_inode_block(
+  

[PATCH v4 1/5] ocfs2: export ocfs2_kset for online file check

2016-02-28 Thread Gang He
Export the ocfs2_kset object from the ocfs2_stackglue kernel module
so that the online file check code can create its sysfiles
under it.

Signed-off-by: Gang He 
Reviewed-by: Mark Fasheh 
---
 fs/ocfs2/stackglue.c | 3 ++-
 fs/ocfs2/stackglue.h | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/ocfs2/stackglue.c b/fs/ocfs2/stackglue.c
index 5d965e8..13219ed 100644
--- a/fs/ocfs2/stackglue.c
+++ b/fs/ocfs2/stackglue.c
@@ -629,7 +629,8 @@ static struct attribute_group ocfs2_attr_group = {
.attrs = ocfs2_attrs,
 };
 
-static struct kset *ocfs2_kset;
+struct kset *ocfs2_kset;
+EXPORT_SYMBOL_GPL(ocfs2_kset);
 
 static void ocfs2_sysfs_exit(void)
 {
diff --git a/fs/ocfs2/stackglue.h b/fs/ocfs2/stackglue.h
index 66334a3..f2dce10 100644
--- a/fs/ocfs2/stackglue.h
+++ b/fs/ocfs2/stackglue.h
@@ -298,4 +298,6 @@ void ocfs2_stack_glue_set_max_proto_version(struct ocfs2_protocol_version *max_p
 int ocfs2_stack_glue_register(struct ocfs2_stack_plugin *plugin);
 void ocfs2_stack_glue_unregister(struct ocfs2_stack_plugin *plugin);
 
+extern struct kset *ocfs2_kset;
+
 #endif  /* STACKGLUE_H */
-- 
2.1.2



[PATCH v4 5/5] ocfs2: add feature document for online file check

2016-02-28 Thread Gang He
This document describes the OCFS2 online file check feature.
OCFS2 is often used in high-availability systems. However, OCFS2 usually
converts the filesystem to read-only when it encounters an error. This may not
be necessary, since turning the filesystem read-only affects other running
processes as well, decreasing availability.
For this reason a mount option (errors=continue) was introduced. It returns
-EIO to the calling process and terminates further processing so that the
filesystem is not corrupted further. The filesystem is not converted to
read-only, and the problematic file's inode number is reported in the kernel
log. The user can then try to check/fix this file via the online filecheck
feature.
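
For anyone who wants to drive the interface from a program rather than
from the shell, a minimal C sketch might look like the following.  The
helper names are hypothetical, and the attribute path is supplied by the
caller rather than hardcoded, since it depends on the mounted device.
The only behavior assumed is what the document describes: write an
inode number to the file, then read the same file to see the results.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write one inode number to a filecheck attribute
 * file, for example the "check" or "fix" file described in the document.
 * Returns 0 on success and -1 on failure.
 */
static int filecheck_submit(const char *attr_path, unsigned long long ino)
{
	FILE *f = fopen(attr_path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%llu\n", ino) > 0;
	/* fclose flushes the buffer, so a rejected write shows up here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Hypothetical helper: copy the result table to stdout line by line. */
static int filecheck_show(const char *attr_path)
{
	char line[256];
	FILE *f = fopen(attr_path, "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	fclose(f);
	return 0;
}
```

A caller would pass the same path to both helpers, once to submit the
inode number and once to read back the INO/DONE/ERROR table.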

Signed-off-by: Gang He 
Reviewed-by: Mark Fasheh 
---
 .../filesystems/ocfs2-online-filecheck.txt | 94 ++
 1 file changed, 94 insertions(+)
 create mode 100644 Documentation/filesystems/ocfs2-online-filecheck.txt

diff --git a/Documentation/filesystems/ocfs2-online-filecheck.txt b/Documentation/filesystems/ocfs2-online-filecheck.txt
new file mode 100644
index 000..1ab0786
--- /dev/null
+++ b/Documentation/filesystems/ocfs2-online-filecheck.txt
@@ -0,0 +1,94 @@
+   OCFS2 online file check
+   ---
+
+This document will describe OCFS2 online file check feature.
+
+Introduction
+
+OCFS2 is often used in high-availability systems. However, OCFS2 usually
+converts the filesystem to read-only when it encounters an error. This may not be
+necessary, since turning the filesystem read-only would affect other running
+processes as well, decreasing availability.
+Then, a mount option (errors=continue) is introduced, which would return the
+-EIO errno to the calling process and terminate further processing so that the
+filesystem is not corrupted further. The filesystem is not converted to
+read-only, and the problematic file's inode number is reported in the kernel
+log. The user can try to check/fix this file via online filecheck feature.
+
+Scope
+=
+This effort is to check/fix small issues which may hinder day-to-day operations
+of a cluster filesystem by turning the filesystem read-only. The scope of
+checking/fixing is at the file level, initially for regular files and eventually
+to all files (including system files) of the filesystem.
+
+In case of directory to file links is incorrect, the directory inode is
+reported as erroneous.
+
+This feature is not suited for extensive checks which depend on other
+components of the filesystem, such as, but not limited to, checking whether the
+bits for file blocks in the allocation have been set. In case of such an error,
+offline fsck is recommended.
+
+Finally, such an operation/feature should not be automated, lest the filesystem
+end up with more damage than before the repair attempt. So, this has to
+be performed with user interaction and consent.
+
+User interface
+==============
+When there are errors in the OCFS2 filesystem, they are usually accompanied
+by the inode number which caused the error. This inode number would be the
+input to check/fix the file.
+
+There is a sysfs directory for each mounted OCFS2 filesystem:
+
+  /sys/fs/ocfs2/<devname>/filecheck
+
+Here, <devname> indicates the name of the OCFS2 volume device which has
+already been mounted. The files in this directory accept inode numbers. They
+are used to tell kernel space which file (inode number) is to be checked or
+fixed. Currently, three operations are supported: checking an inode, fixing
+an inode, and setting the size of the result record history.
+
+1. If you want to know exactly what error happened to <inode> before fixing, do
+
+  # echo "<inode>" > /sys/fs/ocfs2/<devname>/filecheck/check
+  # cat /sys/fs/ocfs2/<devname>/filecheck/check
+
+The output is like this:
+INO    DONE  ERROR
+39502  1     GENERATION
+
+<INO> lists the inode numbers.
+<DONE> indicates whether the operation has been finished.
+<ERROR> says what kind of error was found. For the detailed error numbers,
+please refer to the file linux/fs/ocfs2/filecheck.h.
+
+2. If you decide to fix this inode, do
+
+  # echo "<inode>" > /sys/fs/ocfs2/<devname>/filecheck/fix
+  # cat /sys/fs/ocfs2/<devname>/filecheck/fix
+
+The output is like this:
+INO    DONE  ERROR
+39502  1     SUCCESS
+
+This time, the <ERROR> column indicates whether this fix is successful or not.
+
+3. The record cache is used to store the history of check/fix results. Its
+default size is 10, and it can be adjusted within the range 10 to 100. You can
+adjust the size like this:
+
+  # echo "<size>" > /sys/fs/ocfs2/<devname>/filecheck/set
+
+Fixing stuff
+============
+On receiving the inode, the filesystem would read the inode and the
+file metadata. In case of errors, the filesystem would fix the errors
+and report the problems it fixed in the kernel log. As a precautionary measure,
+the inode must first be checked for errors before performing a final fix.
+
+The inode and the result 

[PATCH v4 2/5] ocfs2: sysfile interfaces for online file check

2016-02-28 Thread Gang He
Implement online file check sysfile interfaces, e.g.
how to create the related sysfile according to device name,
how to display/handle file check request from the sysfile.

Signed-off-by: Gang He 
---
 fs/ocfs2/Makefile|   3 +-
 fs/ocfs2/filecheck.c | 606 +++
 fs/ocfs2/filecheck.h |  49 +
 fs/ocfs2/inode.h |   3 +
 4 files changed, 660 insertions(+), 1 deletion(-)
 create mode 100644 fs/ocfs2/filecheck.c
 create mode 100644 fs/ocfs2/filecheck.h

diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
index ce210d4..e27e652 100644
--- a/fs/ocfs2/Makefile
+++ b/fs/ocfs2/Makefile
@@ -41,7 +41,8 @@ ocfs2-objs := \
quota_local.o   \
quota_global.o  \
xattr.o \
-   acl.o
+   acl.o   \
+   filecheck.o
 
 ocfs2_stackglue-objs := stackglue.o
 ocfs2_stack_o2cb-objs := stack_o2cb.o
diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
new file mode 100644
index 000..2cabbcf
--- /dev/null
+++ b/fs/ocfs2/filecheck.c
@@ -0,0 +1,606 @@
+/* -*- mode: c; c-basic-offset: 8; -*-
+ * vim: noexpandtab sw=8 ts=8 sts=0:
+ *
+ * filecheck.c
+ *
+ * Code which implements online file check.
+ *
+ * Copyright (C) 2016 SuSE.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ocfs2.h"
+#include "ocfs2_fs.h"
+#include "stackglue.h"
+#include "inode.h"
+
+#include "filecheck.h"
+
+
+/* File check error strings,
+ * must correspond with error number in header file.
+ */
+static const char * const ocfs2_filecheck_errs[] = {
+   "SUCCESS",
+   "FAILED",
+   "INPROGRESS",
+   "READONLY",
+   "INJBD",
+   "INVALIDINO",
+   "BLOCKECC",
+   "BLOCKNO",
+   "VALIDFLAG",
+   "GENERATION",
+   "UNSUPPORTED"
+};
+
+static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
+static LIST_HEAD(ocfs2_filecheck_sysfs_list);
+
+struct ocfs2_filecheck {
+   struct list_head fc_head;   /* File check entry list head */
+   spinlock_t fc_lock;
+   unsigned int fc_max;    /* Maximum number of entries in list */
+   unsigned int fc_size;   /* Current entry count in list */
+   unsigned int fc_done;   /* Finished entry count in list */
+};
+
+struct ocfs2_filecheck_sysfs_entry {   /* sysfs entry per mounting */
+   struct list_head fs_list;
+   atomic_t fs_count;
+   struct super_block *fs_sb;
+   struct kset *fs_devicekset;
+   struct kset *fs_fcheckkset;
+   struct ocfs2_filecheck *fs_fcheck;
+};
+
+#define OCFS2_FILECHECK_MAXSIZE    100
+#define OCFS2_FILECHECK_MINSIZE    10
+
+/* File check operation type */
+enum {
+   OCFS2_FILECHECK_TYPE_CHK = 0,   /* Check a file(inode) */
+   OCFS2_FILECHECK_TYPE_FIX,   /* Fix a file(inode) */
+   OCFS2_FILECHECK_TYPE_SET = 100  /* Set entry list maximum size */
+};
+
+struct ocfs2_filecheck_entry {
+   struct list_head fe_list;
+   unsigned long fe_ino;
+   unsigned int fe_type;
+   unsigned int fe_done:1;
+   unsigned int fe_status:31;
+};
+
+struct ocfs2_filecheck_args {
+   unsigned int fa_type;
+   union {
+   unsigned long fa_ino;
+   unsigned int fa_len;
+   };
+};
+
+static const char *
+ocfs2_filecheck_error(int errno)
+{
+   if (!errno)
+   return ocfs2_filecheck_errs[errno];
+
+   BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
+  errno > OCFS2_FILECHECK_ERR_END);
+   return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
+}
+
+static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
+   struct kobj_attribute *attr,
+   char *buf);
+static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
+struct kobj_attribute *attr,
+const char *buf, size_t count);
+static struct kobj_attribute ocfs2_attr_filecheck_chk =
+   __ATTR(check, S_IRUSR | S_IWUSR,
+   ocfs2_filecheck_show,
+   ocfs2_filecheck_store);
+static struct kobj_attribute ocfs2_attr_filecheck_fix =
+   __ATTR(fix, S_IRUSR | S_IWUSR,
+   ocfs2_filecheck_show,
+   ocfs2_filecheck_store);
+static struct kobj_attribute ocfs2

[PATCH v4 0/5] Add online file check feature

2016-02-28 Thread Gang He
When there are errors in the ocfs2 filesystem,
they are usually accompanied by the inode number which caused the error.
This inode number would be the input to fixing the file.
One of these options could be considered:
A file in the sys filesystem which would accept inode numbers.
This could be used to communicate back what has to be fixed or is fixed.
You could write:
$# echo "<inode>" > /sys/fs/ocfs2/devname/filecheck/check
or
$# echo "<inode>" > /sys/fs/ocfs2/devname/filecheck/fix
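
For scripted use, the same request can be issued from a C program. The
following is only a minimal sketch under stated assumptions: the helper name
filecheck_request is hypothetical, the sysfs path is supplied by the caller,
and none of this is part of the patch set itself.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of this patch set): write an inode number,
 * as decimal text, to one of the filecheck sysfs files, for example
 * /sys/fs/ocfs2/devname/filecheck/check.
 * Returns 0 on success and -1 on any error.
 */
int filecheck_request(const char *path, unsigned long ino)
{
	FILE *f = fopen(path, "w");
	int ret = 0;

	if (!f)
		return -1;
	if (fprintf(f, "%lu\n", ino) < 0)
		ret = -1;
	/* Buffered data is written out on close, so check that result too. */
	if (fclose(f) != 0)
		ret = -1;
	return ret;
}
```

A caller would pass the check file first and the fix file afterwards, reading
the same file back to see the result record.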

Compared with the third version, I added a buffer_jbd() check to the inode block
fix/dirty buffer write-back path, changed the members of the
ocfs2_filecheck_entry struct from unsigned short to unsigned int, and added the
feature document to this patch set.
Compared with the second version, I re-designed the filecheck sysfs interfaces:
there are three sysfs files (check, fix and set) under the filecheck directory
(see above), and each sysfs file accepts only one argument. Second, I adjusted
some code in the ocfs2_filecheck_repair_inode_block() function according to
upstream feedback: we cannot just add the VALID_FL flag back as an inode block
fix, so this field corruption will not be fixed until there is a complete
solution.
Compared with the first version, I use strncasecmp instead of two strncmp
calls. Second, I updated the source file contribution vendor.

Gang He (5):
  ocfs2: export ocfs2_kset for online file check
  ocfs2: sysfile interfaces for online file check
  ocfs2: create/remove sysfile for online file check
  ocfs2: check/fix inode block for online file check
  ocfs2: add feature document for online file check

 .../filesystems/ocfs2-online-filecheck.txt |  94 
 fs/ocfs2/Makefile  |   3 +-
 fs/ocfs2/filecheck.c   | 606 +
 fs/ocfs2/filecheck.h   |  49 ++
 fs/ocfs2/inode.c   | 225 +++-
 fs/ocfs2/inode.h   |   3 +
 fs/ocfs2/ocfs2_trace.h |   2 +
 fs/ocfs2/stackglue.c   |   3 +-
 fs/ocfs2/stackglue.h   |   2 +
 fs/ocfs2/super.c   |   5 +
 10 files changed, 981 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/filesystems/ocfs2-online-filecheck.txt
 create mode 100644 fs/ocfs2/filecheck.c
 create mode 100644 fs/ocfs2/filecheck.h

-- 
2.1.2



linux-next: manual merge of the kvm-arm tree with the arm64 tree

2016-02-28 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the kvm-arm tree got a conflict in:

  arch/arm64/include/asm/cpufeature.h

between commit:

  104a0c02e8b1 ("arm64: Add workaround for Cavium erratum 27456")

from the arm64 tree and commit:

  d0be74f771d5 ("arm64: Add ARM64_HAS_VIRT_HOST_EXTN feature")

from the kvm-arm tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell

diff --cc arch/arm64/include/asm/cpufeature.h
index 1497163213ed,a5c769b1c65b..
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@@ -30,12 -30,12 +30,13 @@@
  #define ARM64_HAS_LSE_ATOMICS 5
  #define ARM64_WORKAROUND_CAVIUM_23154 6
  #define ARM64_WORKAROUND_834220   7
 -/* #define ARM64_HAS_NO_HW_PREFETCH   8 */
 -/* #define ARM64_HAS_UAO  9 */
 -/* #define ARM64_ALT_PAN_NOT_UAO  10 */
 +#define ARM64_HAS_NO_HW_PREFETCH  8
 +#define ARM64_HAS_UAO 9
 +#define ARM64_ALT_PAN_NOT_UAO 10
+ #define ARM64_HAS_VIRT_HOST_EXTN  11
 +#define ARM64_WORKAROUND_CAVIUM_27456 12
  
 -#define ARM64_NCAPS   12
 +#define ARM64_NCAPS   13
  
  #ifndef __ASSEMBLY__
  


Re: [PATCH V3 3/3] vhost_net: basic polling support

2016-02-28 Thread Jason Wang


On 02/29/2016 05:56 AM, Christian Borntraeger wrote:
> On 02/26/2016 09:42 AM, Jason Wang wrote:
>> > This patch tries to poll for new added tx buffer or socket receive
>> > queue for a while at the end of tx/rx processing. The maximum time
>> > spent on polling were specified through a new kind of vring ioctl.
>> > 
>> > Signed-off-by: Jason Wang 
>> > ---
>> >  drivers/vhost/net.c| 79 
>> > +++---
>> >  drivers/vhost/vhost.c  | 14 
>> >  drivers/vhost/vhost.h  |  1 +
>> >  include/uapi/linux/vhost.h |  6 
>> >  4 files changed, 95 insertions(+), 5 deletions(-)
>> > 
>> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> > index 9eda69e..c91af93 100644
>> > --- a/drivers/vhost/net.c
>> > +++ b/drivers/vhost/net.c
>> > @@ -287,6 +287,44 @@ static void vhost_zerocopy_callback(struct ubuf_info 
>> > *ubuf, bool success)
>> >rcu_read_unlock_bh();
>> >  }
>> > 
>> > +static inline unsigned long busy_clock(void)
>> > +{
>> > +  return local_clock() >> 10;
>> > +}
>> > +
>> > +static bool vhost_can_busy_poll(struct vhost_dev *dev,
>> > +  unsigned long endtime)
>> > +{
>> > +  return likely(!need_resched()) &&
>> > + likely(!time_after(busy_clock(), endtime)) &&
>> > + likely(!signal_pending(current)) &&
>> > + !vhost_has_work(dev) &&
>> > + single_task_running();
>> > +}
>> > +
>> > +static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
>> > +  struct vhost_virtqueue *vq,
>> > +  struct iovec iov[], unsigned int iov_size,
>> > +  unsigned int *out_num, unsigned int *in_num)
>> > +{
>> > +  unsigned long uninitialized_var(endtime);
>> > +  int r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
>> > +  out_num, in_num, NULL, NULL);
>> > +
>> > +  if (r == vq->num && vq->busyloop_timeout) {
>> > +  preempt_disable();
>> > +  endtime = busy_clock() + vq->busyloop_timeout;
>> > +  while (vhost_can_busy_poll(vq->dev, endtime) &&
>> > + vhost_vq_avail_empty(vq->dev, vq))
>> > +  cpu_relax();
> Can you use cpu_relax_lowlatency (which should be the same as cpu_relax
> for almost everybody but s390)? cpu_relax (without low latency) might
> give up the time slice when running under another hypervisor (like LPAR
> on s390), which might not be what we want here.

Ok, will do this in next version.
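
The bounded busy-poll loop in the quoted hunk can be illustrated in
isolation. What follows is a userspace sketch only, not the vhost code:
bounded_busy_poll, poll_time_after and the fake_poll stand-in clock are
hypothetical names, and the scheduler-related checks (need_resched(),
signal_pending(), vhost_has_work(), single_task_running()) as well as the
cpu_relax() call are deliberately left out.

```c
#include <stdbool.h>

/* Wraparound-safe "a is later than b", in the spirit of time_after(). */
#define poll_time_after(a, b) ((long)((b) - (a)) < 0)

typedef unsigned long (*poll_clock_fn)(void *ctx);
typedef bool (*poll_ready_fn)(void *ctx);

/*
 * Spin until ready(ctx) reports true or until more than `timeout` clock
 * units have passed since entry. Returns true only if the condition was
 * seen before the deadline.
 */
bool bounded_busy_poll(poll_clock_fn now, poll_ready_fn ready, void *ctx,
		       unsigned long timeout)
{
	unsigned long endtime = now(ctx) + timeout;

	while (!poll_time_after(now(ctx), endtime)) {
		if (ready(ctx))
			return true;
	}
	return false;
}

/* Deterministic stand-in clock, for demonstration only. */
struct fake_poll {
	unsigned long ticks;	/* advances by one on every clock read */
	unsigned long ready_at;	/* tick at which the condition becomes true */
};

unsigned long fake_now(void *ctx)
{
	struct fake_poll *p = ctx;

	return ++p->ticks;
}

bool fake_ready(void *ctx)
{
	struct fake_poll *p = ctx;

	return p->ticks >= p->ready_at;
}
```

Passing the clock in as a function pointer keeps the sketch independent of
any particular time source; a real caller would supply a monotonic clock
scaled the way busy_clock() does it in the patch.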


Re: [PATCH V3 3/3] vhost_net: basic polling support

2016-02-28 Thread Jason Wang


On 02/28/2016 10:09 PM, Michael S. Tsirkin wrote:
> On Fri, Feb 26, 2016 at 04:42:44PM +0800, Jason Wang wrote:
>> > This patch tries to poll for new added tx buffer or socket receive
>> > queue for a while at the end of tx/rx processing. The maximum time
>> > spent on polling were specified through a new kind of vring ioctl.
>> > 
>> > Signed-off-by: Jason Wang 
> Looks good overall, but I still see one problem.
>
>> > ---
>> >  drivers/vhost/net.c| 79 
>> > +++---
>> >  drivers/vhost/vhost.c  | 14 
>> >  drivers/vhost/vhost.h  |  1 +
>> >  include/uapi/linux/vhost.h |  6 
>> >  4 files changed, 95 insertions(+), 5 deletions(-)
>> > 
>> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> > index 9eda69e..c91af93 100644
>> > --- a/drivers/vhost/net.c
>> > +++ b/drivers/vhost/net.c
>> > @@ -287,6 +287,44 @@ static void vhost_zerocopy_callback(struct ubuf_info 
>> > *ubuf, bool success)
>> >rcu_read_unlock_bh();
>> >  }
>> >  
>> > +static inline unsigned long busy_clock(void)
>> > +{
>> > +  return local_clock() >> 10;
>> > +}
>> > +
>> > +static bool vhost_can_busy_poll(struct vhost_dev *dev,
>> > +  unsigned long endtime)
>> > +{
>> > +  return likely(!need_resched()) &&
>> > + likely(!time_after(busy_clock(), endtime)) &&
>> > + likely(!signal_pending(current)) &&
>> > + !vhost_has_work(dev) &&
>> > + single_task_running();
> So I find it quite unfortunate that this still uses single_task_running.
> This means that for example a SCHED_IDLE task will prevent polling from
> becoming active, and that seems like a bug, or at least
> an undocumented feature :).

Yes, it may need more thought.

>
> Unfortunately this logic affects the behaviour as observed
> by userspace, so we can't merge it like this and tune
> afterwards, since otherwise management tools will start
> depending on this logic.
>
>

How about removing single_task_running() here first and optimizing on top?
We probably need something like this to handle overcommitment.



[lkp] [n_tty] dd9a6fee68: INFO: possible circular locking dependency detected ]

2016-02-28 Thread kernel test robot
FYI, we noticed the below changes on

https://github.com/0day-ci/linux Brian-Bloniarz/Re-n_tty-Check-the-other-end-of-pty-pair-before-returning-EAGAIN-on-a-read/20160229-070452
commit dd9a6fee6830f16f602b1aa2e85d6307acd04945 ("n_tty: Check the other end of pty pair before returning EAGAIN on a read()")


+----------------------------------------------------+----------+------------+
|                                                    | v4.5-rc6 | dd9a6fee68 |
+----------------------------------------------------+----------+------------+
| boot_successes                                     | 128      | 2          |
| boot_failures                                      | 9        | 6          |
| invoked_oom-killer:gfp_mask=0x                     | 9        | 1          |
| Mem-Info                                           | 9        | 2          |
| Out_of_memory:Kill_process                         | 9        | 1          |
| backtrace:vfs_write                                | 1        |            |
| backtrace:SyS_write                                | 1        |            |
| backtrace:do_execveat_common                       | 1        |            |
| backtrace:compat_SyS_execve                        | 1        |            |
| backtrace:vfs_read                                 | 1        | 4          |
| backtrace:SyS_read                                 | 1        | 4          |
| backtrace:compat_process_vm_rw                     | 1        |            |
| backtrace:compat_SyS_process_vm_readv              | 1        |            |
| backtrace:_do_fork                                 | 1        |            |
| backtrace:SyS_clone                                | 1        |            |
| page_allocation_failure:order:#,mode               | 0        | 1          |
| warn_alloc_failed+0x                               | 0        | 1          |
| backtrace:kswapd                                   | 0        | 1          |
| INFO:possible_circular_locking_dependency_detected | 0        | 4          |
| backtrace:flush_to_ldisc                           | 0        | 4          |
+----------------------------------------------------+----------+------------+



[   17.523349] mount (2393) used greatest stack depth: 12392 bytes left
[   17.684314] 
[   17.684972] ==
[   17.686059] [ INFO: possible circular locking dependency detected ]
[   17.687174] 4.5.0-rc6-1-gdd9a6fe #64 Not tainted
[   17.688127] ---
[   17.689216] bootlogd/2434 is trying to acquire lock:
[   17.690167]  ((&buf->work)){+.+...}, at: [] flush_work+0x5/0x23d
[   17.692006] 
[   17.692006] but task is already holding lock:
[   17.693433]  (&tty->termios_rwsem){..}, at: [] n_tty_read+0xd0/0x882
[   17.695346] 
[   17.695346] which lock already depends on the new lock.
[   17.695346] 
[   17.697370] 
[   17.697370] the existing dependency chain (in reverse order) is:
[   17.698961] 
-> #2 (&tty->termios_rwsem){..}:
[   17.700507][] lock_acquire+0x147/0x1e2
[   17.701621][] down_read+0x48/0x90
[   17.702696][] n_tty_receive_buf_common+0x46/0x8c0
[   17.703900][] n_tty_receive_buf2+0x14/0x16
[   17.705046][] flush_to_ldisc+0xcb/0x125
[   17.706167][] process_one_work+0x2b8/0x5b2
[   17.707339][] worker_thread+0x28b/0x37d
[   17.708454][] kthread+0xfb/0x103
[   17.709511][] ret_from_fork+0x3f/0x70
[   17.710614] 
-> #1 (&buf->lock){+.+...}:
[   17.712070][] lock_acquire+0x147/0x1e2
[   17.713185][] mutex_lock_nested+0x79/0x35f
[   17.714328][] flush_to_ldisc+0x4b/0x125
[   17.715443][] process_one_work+0x2b8/0x5b2
[   17.716587][] worker_thread+0x28b/0x37d
[   17.717700][] kthread+0xfb/0x103
[   17.718752][] ret_from_fork+0x3f/0x70
[   17.719855] 
-> #0 ((&buf->work)){+.+...}:
[   17.721333][] __lock_acquire+0x12dd/0x1932
[   17.722489][] lock_acquire+0x147/0x1e2
[   17.723598][] flush_work+0x3a/0x23d
[   17.724683][] n_tty_read+0x308/0x882
[   17.725771][] tty_read+0x8b/0xcd
[   17.726830][] __vfs_read+0x26/0xb9
[   17.727910][] vfs_read+0xa0/0x12e
[   17.728974][] SyS_read+0x51/0x92
[   17.730032][] entry_SYSCALL_64_fastpath+0x12/0x72
[   17.731237] 
[   17.731237] other info that might help us debug this:
[   17.731237] 
[   17.733255] Chain exists of:
  (&buf->work) --> &buf->lock --> &tty->termios_rwsem

[   17.735644]  Possible unsafe locking scenario:
[   17.735644] 
[   17.737064]CPU0CPU1
[   17.737969]
[   17.738873]   lock(&tty->termios_rwsem);
[   17.739832]lock(&buf->lock);
[   17.740966]lock(&tty->termios_rwsem);
[   17.742181]   lock((&buf->work));
[   17.743081] 
[   17.743081]  *** DE

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-28 Thread Mike Galbraith
On Sun, 2016-02-28 at 18:01 +0100, Francois Romieu wrote:
> Mike Galbraith  :
> [...]
> > Hrm, relatively new + tasklet woes rings a bell.  Ah, that..
> > 
> > 
> > What's worse is that at the point where this code was written it was
> > already well known that tasklets are a steaming pile of crap and
> > should die.
> > 
> > 
> > Source thereof https://lwn.net/Articles/588457/
> 
> tasklets are ingrained in the dmaengine API (see 
> Documentation/dmaengine/client.txt
> and drivers/dma/virt-dma.h::vchan_cookie_complete).
> 
> Moving everything to irq context or handling his own sub-{jiffy/ms} timer
> while losing async dma doesn't exactly smell like roses either. :o(

https://lwn.net/Articles/239633/

If I'm listening properly, the root cause is that there is a timing
constraint involved, which is being exposed because one softirq raises
another (ew).  Processing timeout happens, freshly raised tasklet
wanders off to SCHED_NORMAL kthread context where its constraint dies.

Given the dma stuff apparently works fine in -rt (or did, see below),
timing constraints can't be super tight, so perhaps we could grow
realtime workqueue support for the truly deserving.  The tricky bit
would be keeping everybody and his brother from abusing it.

WRT -rt: if dma tasklets really do have hard (ish) constraints, -rt
recently "broke" in the same way.. of all softirqs which are deferred
to kthread context, due to a recent change, only timer/hrtimer are
executed at realtime priority by default.

-Mike


[PATCH v8] watchdog: Add watchdog timer support for the WinSystems EBC-C384

2016-02-28 Thread William Breathitt Gray
The WinSystems EBC-C384 has an onboard watchdog timer. The timeout range
supported by the watchdog timer is 1 second to 255 minutes. Timeouts
under 256 seconds have a 1 second granularity, while the rest have a 1
minute granularity.

This driver adds watchdog timer support for this onboard watchdog timer.
The timeout may be configured via the timeout module parameter.

Signed-off-by: William Breathitt Gray 
---
Changes in v8:
  - Utilize the roundup macro to round up second resolution to minute
granularity when setting the timeout member

 MAINTAINERS |   6 ++
 drivers/watchdog/Kconfig|   9 ++
 drivers/watchdog/Makefile   |   1 +
 drivers/watchdog/ebc-c384_wdt.c | 188 
 4 files changed, 204 insertions(+)
 create mode 100644 drivers/watchdog/ebc-c384_wdt.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 28eb61b..66107fd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11860,6 +11860,12 @@ M: David Härdeman 
 S: Maintained
 F: drivers/media/rc/winbond-cir.c
 
+WINSYSTEMS EBC-C384 WATCHDOG DRIVER
+M: William Breathitt Gray 
+L: linux-watch...@vger.kernel.org
+S: Maintained
+F: drivers/watchdog/ebc-c384_wdt.c
+
 WIMAX STACK
 M: Inaky Perez-Gonzalez 
 M: linux-wi...@intel.com
diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index 0f6d851..11f3a3d 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -713,6 +713,15 @@ config ALIM7101_WDT
 
  Most people will say N.
 
+config EBC_C384_WDT
+   tristate "WinSystems EBC-C384 Watchdog Timer"
+   depends on X86
+   select WATCHDOG_CORE
+   help
+ Enables watchdog timer support for the watchdog timer on the
+ WinSystems EBC-C384 motherboard. The timeout may be configured via
+ the timeout module parameter.
+
 config F71808E_WDT
tristate "Fintek F71808E, F71862FG, F71869, F71882FG and F71889FG Watchdog"
depends on X86
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index f566753..15762c8 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -88,6 +88,7 @@ obj-$(CONFIG_ACQUIRE_WDT) += acquirewdt.o
 obj-$(CONFIG_ADVANTECH_WDT) += advantechwdt.o
 obj-$(CONFIG_ALIM1535_WDT) += alim1535_wdt.o
 obj-$(CONFIG_ALIM7101_WDT) += alim7101_wdt.o
+obj-$(CONFIG_EBC_C384_WDT) += ebc-c384_wdt.o
 obj-$(CONFIG_F71808E_WDT) += f71808e_wdt.o
 obj-$(CONFIG_SP5100_TCO) += sp5100_tco.o
 obj-$(CONFIG_GEODE_WDT) += geodewdt.o
diff --git a/drivers/watchdog/ebc-c384_wdt.c b/drivers/watchdog/ebc-c384_wdt.c
new file mode 100644
index 000..77fda0b
--- /dev/null
+++ b/drivers/watchdog/ebc-c384_wdt.c
@@ -0,0 +1,188 @@
+/*
+ * Watchdog timer driver for the WinSystems EBC-C384
+ * Copyright (C) 2016 William Breathitt Gray
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MODULE_NAME            "ebc-c384_wdt"
+#define WATCHDOG_TIMEOUT       60
+/*
+ * The timeout value in minutes must fit in a single byte when sent to the
+ * watchdog timer; the maximum timeout possible is 15300 (255 * 60) seconds.
+ */
+#define WATCHDOG_MAX_TIMEOUT   15300
+#define BASE_ADDR              0x564
+#define ADDR_EXTENT            5
+#define CFG_ADDR               (BASE_ADDR + 1)
+#define PET_ADDR               (BASE_ADDR + 2)
+
+static bool nowayout = WATCHDOG_NOWAYOUT;
+module_param(nowayout, bool, 0);
+MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
+   __MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
+
+static unsigned timeout;
+module_param(timeout, uint, 0);
+MODULE_PARM_DESC(timeout, "Watchdog timeout in seconds (default="
+   __MODULE_STRING(WATCHDOG_TIMEOUT) ")");
+
+static int ebc_c384_wdt_start(struct watchdog_device *wdev)
+{
+   unsigned t = wdev->timeout;
+
+   /* resolution is in minutes for timeouts greater than 255 seconds */
+   if (t > 255)
+   t = DIV_ROUND_UP(t, 60);
+
+   outb(t, PET_ADDR);
+
+   return 0;
+}
+
+static int ebc_c384_wdt_stop(struct watchdog_device *wdev)
+{
+   outb(0x00, PET_ADDR);
+
+   return 0;
+}
+
+static int ebc_c384_wdt_set_timeout(struct watchdog_device *wdev, unsigned t)
+{
+   /* resolution is in minutes for timeouts greater than 255 seconds */
+   if (t > 255) {
+   /* round second resolution up to minute granularity */
+   wdev->timeout = roundup(t, 60);
+
+   /* set watchdog timer for minute

Re: [PATCH] hwmon: (ntc_thermistor) Add support for ncpXXxh103

2016-02-28 Thread Guenter Roeck

On 02/28/2016 02:31 PM, Joseph wrote:

From: Joseph McNally 

This patch adds support for the Murata NCP15XH103 thermistor series.

Signed-off-by: Joseph McNally 


Applied.

Thanks,
Guenter



[PATCH] mm: __delete_from_page_cache WARN_ON(page_mapped)

2016-02-28 Thread Hugh Dickins
Commit e1534ae95004 ("mm: differentiate page_mapped() from page_mapcount()
for compound pages") changed the famous BUG_ON(page_mapped(page)) in
__delete_from_page_cache() to VM_BUG_ON_PAGE(page_mapped(page)): which
gives us more info when CONFIG_DEBUG_VM=y, but nothing at all when not.

Although it has not usually been very helpful, being hit long after the
error in question, we do need to know if it actually happens on users'
systems; but reinstating a crash there is likely to be opposed :)

In the non-debug case, use WARN_ON() plus dump_page() and add_taint() -
I don't really believe LOCKDEP_NOW_UNRELIABLE, but that seems to be the
standard procedure now.  Move that, or the VM_BUG_ON_PAGE(), up before
the deletion from tree: so that the unNULLified page->mapping gives a
little more information.

If the inode is being evicted (rather than truncated), it won't have
any vmas left, so it's safe(ish) to assume that the raised mapcount is
erroneous, and we can discount it from page_count to avoid leaking the
page (I'm less worried by leaking the occasional 4kB, than losing a
potential 2MB page with each 4kB page leaked).

Signed-off-by: Hugh Dickins 
---
I think this should go into v4.5, so I've written it with an atomic_sub
on page->_count; but Joonsoo will probably want some page_ref thingy.

 mm/filemap.c |   22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

--- 4.5-rc6/mm/filemap.c  2016-02-28 09:04:38.816707844 -0800
+++ linux/mm/filemap.c  2016-02-28 19:45:23.406263928 -0800
@@ -195,6 +195,27 @@ void __delete_from_page_cache(struct pag
else
cleancache_invalidate_page(mapping, page);
 
+   VM_BUG_ON_PAGE(page_mapped(page), page);
+   if (!IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON(page_mapped(page))) {
+   int mapcount;
+
+   dump_page(page, "still mapped when deleted");
+   add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
+
+   mapcount = page_mapcount(page);
+   if (mapping_exiting(mapping) &&
+   page_count(page) >= mapcount + 2) {
+   /*
+* All vmas have already been torn down, so it's
+* a good bet that actually the page is unmapped,
+* and we'd prefer not to leak it: if we're wrong,
+* some other bad page check should catch it later.
+*/
+   page_mapcount_reset(page);
+   atomic_sub(mapcount, &page->_count);
+   }
+   }
+
page_cache_tree_delete(mapping, page, shadow);
 
page->mapping = NULL;
@@ -205,7 +226,6 @@ void __delete_from_page_cache(struct pag
__dec_zone_page_state(page, NR_FILE_PAGES);
if (PageSwapBacked(page))
__dec_zone_page_state(page, NR_SHMEM);
-   VM_BUG_ON_PAGE(page_mapped(page), page);
 
/*
 * At this point page must be either written or cleaned by truncate.


Re: [v7, RESEND] watchdog: Add watchdog timer support for the WinSystems EBC-C384

2016-02-28 Thread Guenter Roeck
Hi William,

On Sun, Feb 28, 2016 at 11:29:10PM -0500, William Breathitt Gray wrote:
> The WinSystems EBC-C384 has an onboard watchdog timer. The timeout range
> supported by the watchdog timer is 1 second to 255 minutes. Timeouts
> under 256 seconds have a 1 second granularity, while the rest have a 1
> minute granularity.
> 
> This driver adds watchdog timer support for this onboard watchdog timer.
> The timeout may be configured via the timeout module parameter.
> 
> Signed-off-by: William Breathitt Gray 
> Reviewed-by: Guenter Roeck 
> ---
>  MAINTAINERS |   6 ++
>  drivers/watchdog/Kconfig|   9 ++
>  drivers/watchdog/Makefile   |   1 +
>  drivers/watchdog/ebc-c384_wdt.c | 188 
> 
> 
[ ... ]

> +
> +static int ebc_c384_wdt_set_timeout(struct watchdog_device *wdev, unsigned t)
> +{
> + /* resolution is in minutes for timeouts greater than 255 seconds */
> + if (t > 255) {
> + /* round second resolution up to minute granularity */
> + wdev->timeout = DIV_ROUND_UP(t, 60) * 60;

Good catch.

Turns out there is a much better macro for this:
wdev->timeout = roundup(t, 60);

Guenter
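
The equivalence of the two forms discussed above, and the resulting timeout
granularity, can be checked with a small userspace sketch. The two macros
below are userspace copies written to behave like the kernel helpers of the
same name; ebc_effective_timeout and ebc_counter_value are hypothetical
helpers that only mirror the arithmetic in the quoted driver code, not the
driver itself.

```c
/* Userspace copies of the two helper macros discussed above. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define roundup(x, y) ((((x) + ((y) - 1)) / (y)) * (y))

/*
 * Hypothetical helper: the timeout in seconds that would be stored for a
 * request of t seconds. Above 255 seconds the hardware counts minutes, so
 * the value is rounded up to a whole number of minutes.
 */
unsigned int ebc_effective_timeout(unsigned int t)
{
	if (t > 255)
		return roundup(t, 60);
	return t;
}

/*
 * Hypothetical helper: the counter value for a timeout of t seconds,
 * expressed in seconds up to 255 and in minutes beyond that.
 */
unsigned int ebc_counter_value(unsigned int t)
{
	if (t > 255)
		t = DIV_ROUND_UP(t, 60);
	return t;
}
```

For any t, roundup(t, 60) and DIV_ROUND_UP(t, 60) * 60 give the same
result; the former simply states the intent directly.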


linux-next: manual merge of the tip tree with the pm tree

2016-02-28 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  drivers/cpufreq/intel_pstate.c

between commit:

  7791e4aa59ad ("cpufreq: intel_pstate: Enable HWP by default")

from the pm tree and commit:

  bc696ca05f5a ("x86/cpufeature: Replace the old static_cpu_has() with safe variant")

from the tip tree.

I fixed it up (the former removed the code modified by the latter)
and can carry the fix as necessary (no action is required).

-- 
Cheers,
Stephen Rothwell


Re: [PATCH v7] watchdog: Add watchdog timer support for the WinSystems EBC-C384

2016-02-28 Thread Guenter Roeck

Hi,

On 02/28/2016 08:20 PM, William Breathitt Gray wrote:

The WinSystems EBC-C384 has an onboard watchdog timer. The timeout range
supported by the watchdog timer is 1 second to 255 minutes. Timeouts
under 256 seconds have a 1 second granularity, while the rest have a 1
minute granularity.

This driver adds watchdog timer support for this onboard watchdog timer.
The timeout may be configured via the timeout module parameter.

Signed-off-by: William Breathitt Gray 
Reviewed-by: Guenter Roeck 
---
Changes in v7:
   - Make sure timeout member is in seconds resolution despite minutes
 granularity



For Wim's benefit:

You forgot the actual change. The follow-up RESEND is really confusing;
RESEND indicates that no change was made, and leaves it up to us to figure
out what is going on. If something like this happens again, just add
another rev and add a note indicating what has (really) changed.

Also, when you make code changes, please drop previous Reviewed-by: or
Acked-by: tags unless you got explicit permission from the reviewer
to keep the tag.

Thanks,
Guenter



Re: [PATCH] 3c59x: Ensure to apply the expires time

2016-02-28 Thread David Miller
From: Stafford Horne 
Date: Sun, 28 Feb 2016 16:49:29 +0900

> In commit 5b6490def9168af6a ("3c59x: Use setup_timer()") Amitoj
> removed add_timer which sets up the expires timer.  In this patch
> the behavior is restored but it uses mod_timer which is a bit more
> compact.
> 
> Signed-off-by: Stafford Horne 

Applied, thanks.


[PATCH v7 RESEND] watchdog: Add watchdog timer support for the WinSystems EBC-C384

2016-02-28 Thread William Breathitt Gray
The WinSystems EBC-C384 has an onboard watchdog timer. The timeout range
supported by the watchdog timer is 1 second to 255 minutes. Timeouts
under 256 seconds have a 1 second granularity, while the rest have a 1
minute granularity.

This driver adds watchdog timer support for this onboard watchdog timer.
The timeout may be configured via the timeout module parameter.

Signed-off-by: William Breathitt Gray 
Reviewed-by: Guenter Roeck 
---
 MAINTAINERS |   6 ++
 drivers/watchdog/Kconfig|   9 ++
 drivers/watchdog/Makefile   |   1 +
 drivers/watchdog/ebc-c384_wdt.c | 188 
 4 files changed, 204 insertions(+)
 create mode 100644 drivers/watchdog/ebc-c384_wdt.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 28eb61b..66107fd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11860,6 +11860,12 @@ M: David Härdeman 
 S: Maintained
 F: drivers/media/rc/winbond-cir.c
 
+WINSYSTEMS EBC-C384 WATCHDOG DRIVER
+M: William Breathitt Gray 
+L: linux-watch...@vger.kernel.org
+S: Maintained
+F: drivers/watchdog/ebc-c384_wdt.c
+
 WIMAX STACK
 M: Inaky Perez-Gonzalez 
 M: linux-wi...@intel.com
diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index 0f6d851..11f3a3d 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -713,6 +713,15 @@ config ALIM7101_WDT
 
  Most people will say N.
 
+config EBC_C384_WDT
+   tristate "WinSystems EBC-C384 Watchdog Timer"
+   depends on X86
+   select WATCHDOG_CORE
+   help
+ Enables watchdog timer support for the watchdog timer on the
+ WinSystems EBC-C384 motherboard. The timeout may be configured via
+ the timeout module parameter.
+
 config F71808E_WDT
tristate "Fintek F71808E, F71862FG, F71869, F71882FG and F71889FG 
Watchdog"
depends on X86
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index f566753..15762c8 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -88,6 +88,7 @@ obj-$(CONFIG_ACQUIRE_WDT) += acquirewdt.o
 obj-$(CONFIG_ADVANTECH_WDT) += advantechwdt.o
 obj-$(CONFIG_ALIM1535_WDT) += alim1535_wdt.o
 obj-$(CONFIG_ALIM7101_WDT) += alim7101_wdt.o
+obj-$(CONFIG_EBC_C384_WDT) += ebc-c384_wdt.o
 obj-$(CONFIG_F71808E_WDT) += f71808e_wdt.o
 obj-$(CONFIG_SP5100_TCO) += sp5100_tco.o
 obj-$(CONFIG_GEODE_WDT) += geodewdt.o
diff --git a/drivers/watchdog/ebc-c384_wdt.c b/drivers/watchdog/ebc-c384_wdt.c
new file mode 100644
index 000..21a4e95
--- /dev/null
+++ b/drivers/watchdog/ebc-c384_wdt.c
@@ -0,0 +1,188 @@
+/*
+ * Watchdog timer driver for the WinSystems EBC-C384
+ * Copyright (C) 2016 William Breathitt Gray
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MODULE_NAME"ebc-c384_wdt"
+#define WATCHDOG_TIMEOUT   60
+/*
+ * The timeout value in minutes must fit in a single byte when sent to the
+ * watchdog timer; the maximum timeout possible is 15300 (255 * 60) seconds.
+ */
+#define WATCHDOG_MAX_TIMEOUT   15300
+#define BASE_ADDR  0x564
+#define ADDR_EXTENT5
+#define CFG_ADDR   (BASE_ADDR + 1)
+#define PET_ADDR   (BASE_ADDR + 2)
+
+static bool nowayout = WATCHDOG_NOWAYOUT;
+module_param(nowayout, bool, 0);
+MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
+   __MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
+
+static unsigned timeout;
+module_param(timeout, uint, 0);
+MODULE_PARM_DESC(timeout, "Watchdog timeout in seconds (default="
+   __MODULE_STRING(WATCHDOG_TIMEOUT) ")");
+
+static int ebc_c384_wdt_start(struct watchdog_device *wdev)
+{
+   unsigned t = wdev->timeout;
+
+   /* resolution is in minutes for timeouts greater than 255 seconds */
+   if (t > 255)
+   t = DIV_ROUND_UP(t, 60);
+
+   outb(t, PET_ADDR);
+
+   return 0;
+}
+
+static int ebc_c384_wdt_stop(struct watchdog_device *wdev)
+{
+   outb(0x00, PET_ADDR);
+
+   return 0;
+}
+
+static int ebc_c384_wdt_set_timeout(struct watchdog_device *wdev, unsigned t)
+{
+   /* resolution is in minutes for timeouts greater than 255 seconds */
+   if (t > 255) {
+   /* round second resolution up to minute granularity */
+   wdev->timeout = DIV_ROUND_UP(t, 60) * 60;
+
+   /* set watchdog timer for minutes */
+   outb(0x00, CFG_ADDR);
+   } else {
+   wdev->timeout = t;
+

[PATCH v19 03/10] x86/xen: Mark xen_cpuid() stack frame as non-standard

2016-02-28 Thread Josh Poimboeuf
objtool reports the following false positive warning:

  arch/x86/xen/enlighten.o: warning: objtool: xen_cpuid()+0x41: can't find jump dest instruction at .text+0x108

The warning is due to xen_cpuid()'s use of XEN_EMULATE_PREFIX to insert
some fake instructions which objtool doesn't know how to decode.

Signed-off-by: Josh Poimboeuf 
Cc: David Vrabel 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
---
 arch/x86/xen/enlighten.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index d09e4c9..5c45a69 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_KEXEC_CORE
 #include 
@@ -351,8 +352,8 @@ static void xen_cpuid(unsigned int *ax, unsigned int *bx,
*cx &= maskecx;
*cx |= setecx;
*dx &= maskedx;
-
 }
+STACK_FRAME_NON_STANDARD(xen_cpuid); /* XEN_EMULATE_PREFIX */
 
 static bool __init xen_check_mwait(void)
 {
-- 
2.4.3



Re: [PATCH v7] watchdog: Add watchdog timer support for the WinSystems EBC-C384

2016-02-28 Thread William Breathitt Gray
On 02/28/2016 11:20 PM, William Breathitt Gray wrote:
> The WinSystems EBC-C384 has an onboard watchdog timer. The timeout range
> supported by the watchdog timer is 1 second to 255 minutes. Timeouts
> under 256 seconds have a 1 second granularity, while the rest have a 1
> minute granularity.
> 
> This driver adds watchdog timer support for this onboard watchdog timer.
> The timeout may be configured via the timeout module parameter.
> 
> Signed-off-by: William Breathitt Gray 
> Reviewed-by: Guenter Roeck 
> ---
> Changes in v7:
>   - Make sure timeout member is in seconds resolution despite minutes
> granularity

Oops, my apologies, I sent out the wrong commit. Please ignore this
version, I will resend the correct commit.

William Breathitt Gray
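
For readers following the granularity discussion, the rounding rule in the
quoted changelog can be sketched in plain C. This is a minimal userspace
illustration, not the driver code: the names wdt_setting and
ebc_c384_round_timeout are hypothetical, and the sketch assumes the requested
timeout is already within the 1 to 15300 second range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: what a requested timeout turns into. */
struct wdt_setting {
	unsigned timeout_s;	/* effective timeout, always in seconds */
	uint8_t counter;	/* value that fits the single-byte counter */
	bool in_minutes;	/* true when the counter counts minutes */
};

/*
 * Assumption: 1 <= t <= 15300. Timeouts above 255 seconds are rounded up
 * to the next whole minute, since the hardware only has minute
 * granularity there; shorter timeouts are kept as-is.
 */
struct wdt_setting ebc_c384_round_timeout(unsigned t)
{
	struct wdt_setting s;

	if (t > 255) {
		unsigned minutes = (t + 59) / 60;	/* DIV_ROUND_UP(t, 60) */

		s.timeout_s = minutes * 60;
		s.counter = (uint8_t)minutes;
		s.in_minutes = true;
	} else {
		s.timeout_s = t;
		s.counter = (uint8_t)t;
		s.in_minutes = false;
	}

	return s;
}
```

With this rule a request for 256 seconds becomes 5 minutes, so the timeout
reported back in seconds is 300 rather than 256.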


[PATCH v19 01/10] objtool: Mark non-standard files and directories

2016-02-28 Thread Josh Poimboeuf
Code which runs outside the kernel's normal mode of operation often does
unusual things which can cause a static analysis tool like objtool to
emit false positive warnings:

- boot image
- vdso image
- relocation
- realmode
- efi
- head
- purgatory
- modpost

Set OBJECT_FILES_NON_STANDARD for their related files and directories,
which will tell objtool to skip checking them.  It's ok to skip them
because they don't affect runtime stack traces.

Also skip the following code which does the right thing with respect to
frame pointers, but is too "special" to be validated by a tool:

- entry
- mcount

Also skip the test_nx module because it modifies its exception handling
table at runtime, which objtool can't understand.  Fortunately it's
just a test module so it doesn't matter much.

Currently objtool is the only user of OBJECT_FILES_NON_STANDARD, but it
might eventually be useful for other tools.

Signed-off-by: Josh Poimboeuf 
---
 arch/x86/boot/Makefile|  3 ++-
 arch/x86/boot/compressed/Makefile |  3 ++-
 arch/x86/entry/Makefile   |  4 
 arch/x86/entry/vdso/Makefile  |  6 --
 arch/x86/kernel/Makefile  | 11 ---
 arch/x86/platform/efi/Makefile|  2 ++
 arch/x86/purgatory/Makefile   |  2 ++
 arch/x86/realmode/Makefile|  4 +++-
 arch/x86/realmode/rm/Makefile |  3 ++-
 drivers/firmware/efi/libstub/Makefile |  1 +
 scripts/mod/Makefile  |  2 ++
 11 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index bbe1a62..0bf6749 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -9,7 +9,8 @@
 # Changed by many, many contributors over the years.
 #
 
-KASAN_SANITIZE := n
+KASAN_SANITIZE := n
+OBJECT_FILES_NON_STANDARD  := y
 
 # If you want to preset the SVGA mode, uncomment the next line and
 # set SVGA_MODE to whatever number you want.
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index f9ce75d..5e1d26e 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -16,7 +16,8 @@
 #  (see scripts/Makefile.lib size_append)
 #  compressed vmlinux.bin.all + u32 size of vmlinux.bin.all
 
-KASAN_SANITIZE := n
+KASAN_SANITIZE := n
+OBJECT_FILES_NON_STANDARD  := y
 
 targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \
vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4
diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
index bd55ded..fe91c25 100644
--- a/arch/x86/entry/Makefile
+++ b/arch/x86/entry/Makefile
@@ -1,6 +1,10 @@
 #
 # Makefile for the x86 low level entry code
 #
+
+OBJECT_FILES_NON_STANDARD_entry_$(BITS).o   := y
+OBJECT_FILES_NON_STANDARD_entry_64_compat.o := y
+
 obj-y  := entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o
 obj-y  += common.o
 
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index c854541..f9fb859 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -3,8 +3,9 @@
 #
 
 KBUILD_CFLAGS += $(DISABLE_LTO)
-KASAN_SANITIZE := n
-UBSAN_SANITIZE := n
+KASAN_SANITIZE := n
+UBSAN_SANITIZE := n
+OBJECT_FILES_NON_STANDARD  := y
 
 VDSO64-$(CONFIG_X86_64):= y
 VDSOX32-$(CONFIG_X86_X32_ABI)  := y
@@ -16,6 +17,7 @@ vobjs-y := vdso-note.o vclock_gettime.o vgetcpu.o
 
 # files to link into kernel
 obj-y  += vma.o
+OBJECT_FILES_NON_STANDARD_vma.o:= n
 
 # vDSO images to build
 vdso_img-$(VDSO64-y)   += 64
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index b1b78ff..d5fb087 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -16,9 +16,14 @@ CFLAGS_REMOVE_ftrace.o = -pg
 CFLAGS_REMOVE_early_printk.o = -pg
 endif
 
-KASAN_SANITIZE_head$(BITS).o := n
-KASAN_SANITIZE_dumpstack.o := n
-KASAN_SANITIZE_dumpstack_$(BITS).o := n
+KASAN_SANITIZE_head$(BITS).o   := n
+KASAN_SANITIZE_dumpstack.o := n
+KASAN_SANITIZE_dumpstack_$(BITS).o := n
+
+OBJECT_FILES_NON_STANDARD_head_$(BITS).o   := y
+OBJECT_FILES_NON_STANDARD_relocate_kernel_$(BITS).o:= y
+OBJECT_FILES_NON_STANDARD_mcount_$(BITS).o := y
+OBJECT_FILES_NON_STANDARD_test_nx.o:= y
 
 CFLAGS_irq.o := -I$(src)/../include/asm/trace
 
diff --git a/arch/x86/platform/efi/Makefile b/arch/x86/platform/efi/Makefile
index 2846aaa..066619b 100644
--- a/arch/x86/platform/efi/Makefile
+++ b/arch/x86/platform/efi/Makefile
@@ -1,3 +1,5 @@
+OBJECT_FILES_NON_STANDARD_efi_thunk_$(BITS).o := y
+
 obj-$(CONFIG_EFI)  += quirks.o efi.o efi_$(BITS).o efi_stub_$(BITS).o
 obj-$(CONFIG_ACPI_BGRT) += efi-bgrt.o
 obj-$(CONFIG_EARLY_PRINTK_EFI) += early_printk.o
diff --git a/arch/x86/purgatory/Makefi

[PATCH v19 05/10] sched: Mark __schedule() stack frame as non-standard

2016-02-28 Thread Josh Poimboeuf
objtool reports the following warnings for __schedule():

  kernel/sched/core.o: warning: objtool:__schedule()+0x3c0: duplicate frame pointer save
  kernel/sched/core.o: warning: objtool:__schedule()+0x3fd: sibling call from callable instruction with changed frame pointer
  kernel/sched/core.o: warning: objtool:__schedule()+0x40a: call without frame pointer save/setup
  kernel/sched/core.o: warning: objtool:__schedule()+0x7fd: frame pointer state mismatch
  kernel/sched/core.o: warning: objtool:__schedule()+0x421: frame pointer state mismatch

Basically it's confused by two unusual attributes of the switch_to()
macro:

1. It saves prev's frame pointer to the old stack and restores next's
   frame pointer from the new stack.

2. For new tasks it jumps directly to ret_from_fork.

Eventually it would probably be a good idea to clean up the
ret_from_fork hack so that new tasks are created with a valid initial
stack, as suggested by Andy:

  
https://lkml.kernel.org/r/CALCETrWsqCw4L1qKO9j9L5F+4ED4viuLQTFc=n1pkbzffpq...@mail.gmail.com

Then __schedule() could return normally into the new code and objtool
hopefully wouldn't have a problem anymore.

In the meantime, mark its stack frame as non-standard so we can have a
baseline with no objtool warnings.  The marker also serves as a reminder
that this code could be improved a bit.

Signed-off-by: Josh Poimboeuf 
---
 kernel/sched/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9503d59..641043d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -74,6 +74,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -3288,6 +3289,7 @@ static void __sched notrace __schedule(bool preempt)
 
balance_callback(rq);
 }
+STACK_FRAME_NON_STANDARD(__schedule); /* switch_to() */
 
 static inline void sched_submit_work(struct task_struct *tsk)
 {
-- 
2.4.3



[PATCH v19 09/10] objtool: Add CONFIG_STACK_VALIDATION option

2016-02-28 Thread Josh Poimboeuf
Add a CONFIG_STACK_VALIDATION option which will run "objtool check" for
each .o file to ensure the validity of its stack metadata.

Signed-off-by: Josh Poimboeuf 
---
 Makefile   |  5 -
 arch/Kconfig   |  6 ++
 lib/Kconfig.debug  | 12 
 scripts/Makefile.build | 39 +++
 4 files changed, 57 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index fbe1b92..62be03b 100644
--- a/Makefile
+++ b/Makefile
@@ -993,7 +993,10 @@ prepare0: archprepare FORCE
$(Q)$(MAKE) $(build)=.
 
 # All the preparing..
-prepare: prepare0
+prepare: prepare0 prepare-objtool
+
+PHONY += prepare-objtool
+prepare-objtool: $(if $(CONFIG_STACK_VALIDATION), tools/objtool FORCE)
 
 # Generate some files
 # ---
diff --git a/arch/Kconfig b/arch/Kconfig
index f6b649d..81869a5 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -583,6 +583,12 @@ config HAVE_COPY_THREAD_TLS
  normal C parameter passing, rather than extracting the syscall
  argument from pt_regs.
 
+config HAVE_STACK_VALIDATION
+   bool
+   help
+ Architecture supports the 'objtool check' host tool command, which
+ performs compile-time stack metadata validation.
+
 #
 # ABI hall of shame
 #
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 8bfd1ac..8552656 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -342,6 +342,18 @@ config FRAME_POINTER
  larger and slower, but it gives very useful debugging information
  in case of kernel bugs. (precise oopses/stacktraces/warnings)
 
+config STACK_VALIDATION
+   bool "Compile-time stack metadata validation"
+   depends on HAVE_STACK_VALIDATION
+   default n
+   help
+ Add compile-time checks to validate stack metadata, including frame
+ pointers (if CONFIG_FRAME_POINTER is enabled).  This helps ensure
+ that runtime stack traces are more reliable.
+
+ For more information, see
+ tools/objtool/Documentation/stack-validation.txt.
+
 config DEBUG_FORCE_WEAK_PER_CPU
bool "Force weak per-cpu definitions"
depends on DEBUG_KERNEL
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 2c47f9c..130a452 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -241,10 +241,32 @@ cmd_record_mcount =   \
fi;
 endif
 
+ifdef CONFIG_STACK_VALIDATION
+
+__objtool_obj := $(objtree)/tools/objtool/objtool
+
+objtool_args = check
+ifndef CONFIG_FRAME_POINTER
+objtool_args += --no-fp
+endif
+
+# 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
+# 'OBJECT_FILES_NON_STANDARD_foo.o := y': skip objtool checking for a file
+# 'OBJECT_FILES_NON_STANDARD_foo.o := n': override directory skip for a file
+cmd_objtool = $(if $(patsubst y%,, \
+   $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n), \
+   $(__objtool_obj) $(objtool_args) "$(@)";)
+objtool_obj = $(if $(patsubst y%,, \
+   $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n), \
+   $(__objtool_obj))
+
+endif # CONFIG_STACK_VALIDATION
+
 define rule_cc_o_c
$(call echo-cmd,checksrc) $(cmd_checksrc) \
$(call echo-cmd,cc_o_c) $(cmd_cc_o_c);\
$(cmd_modversions)\
+   $(cmd_objtool)\
$(call echo-cmd,record_mcount)\
$(cmd_record_mcount)  \
scripts/basic/fixdep $(depfile) $@ '$(call make-cmd,cc_o_c)' >\
@@ -253,14 +275,23 @@ define rule_cc_o_c
mv -f $(dot-target).tmp $(dot-target).cmd
 endef
 
+define rule_as_o_S
+   $(call echo-cmd,as_o_S) $(cmd_as_o_S);\
+   $(cmd_objtool)\
+   scripts/basic/fixdep $(depfile) $@ '$(call make-cmd,as_o_S)' >\
+ $(dot-target).tmp;  \
+   rm -f $(depfile); \
+   mv -f $(dot-target).tmp $(dot-target).cmd
+endef
+
 # Built-in and composite module parts
-$(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
+$(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_obj) FORCE
$(call cmd,force_checksrc)
$(call if_changed_rule,cc_o_c)
 
 # Single-part modules are special since we need to mark them in $(MODVERDIR)
 
-$(single-used-m): $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
+$(single-used-m): $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_obj) FORCE
$(call cmd,force_checksrc)
$(call if_changed_rule,cc_o_c)
@{ echo $(@:.o=.ko); echo $@; } > $(MODVERDIR)/$(@F:.o=.mod)
@@ -290,8 +321,8 @@ $(obj
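
The skip logic in the cmd_objtool definition above is compact enough to be
easy to misread, so here is a rough C model of the decision it appears to
make. This is only a sketch: objtool_should_check is a hypothetical name, and
NULL or an empty string stands in for an unset make variable. The make
expression concatenates the per-file value, the per-directory value, and a
literal "n", and skips objtool when the result starts with "y".

```c
#include <stdbool.h>
#include <stddef.h>

/* NULL or "" models a make variable that was never set. */
static bool is_set(const char *v)
{
	return v != NULL && v[0] != '\0';
}

/*
 * Model of $(patsubst y%,, <per_file><per_dir>n): the first character of
 * the concatenation decides. A per-file value therefore wins over the
 * per-directory value, and the trailing "n" is the default.
 */
bool objtool_should_check(const char *per_file, const char *per_dir)
{
	if (is_set(per_file))
		return per_file[0] != 'y';
	if (is_set(per_dir))
		return per_dir[0] != 'y';
	return true;
}
```

Read this way, vma.o in arch/x86/entry/vdso is still checked: its per-file
"n" comes first in the concatenation and overrides the directory-wide "y".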

[PATCH v19 07/10] x86/kprobes: Mark kretprobe_trampoline() stack frame as non-standard

2016-02-28 Thread Josh Poimboeuf
objtool reports the following warning for kretprobe_trampoline():

  arch/x86/kernel/kprobes/core.o: warning: objtool: kretprobe_trampoline()+0x20: call without frame pointer save/setup

kretprobes are a special case where the stack is intentionally wrong.
The return address isn't known at the beginning of the trampoline, so
the stack frame can't be set up properly before it calls
trampoline_handler().

Because kretprobe handlers don't sleep, the frame pointer doesn't *have*
to be accurate in the trampoline.  So it's ok to tell objtool to ignore
it.  This results in no actual changes to the generated code.

Signed-off-by: Josh Poimboeuf 
Cc: Ananth N Mavinakayanahalli 
Cc: Anil S Keshavamurthy 
Cc: "David S. Miller" 
Cc: Masami Hiramatsu 
---
 arch/x86/kernel/kprobes/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 48acaac..ae703ac 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -49,6 +49,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -703,6 +704,7 @@ asm(
".size kretprobe_trampoline, .-kretprobe_trampoline\n"
 );
 NOKPROBE_SYMBOL(kretprobe_trampoline);
+STACK_FRAME_NON_STANDARD(kretprobe_trampoline);
 
 /*
  * Called from kretprobe_trampoline
-- 
2.4.3



[PATCH v19 00/10] Compile-time stack metadata validation

2016-02-28 Thread Josh Poimboeuf
This is v19 of the compile-time stack metadata validation patch set.

It's based on tip:core/objtool.

v18 can be found here:

  https://lkml.kernel.org/r/cover.1456440439.git.jpoim...@redhat.com

For more information about the motivation behind this patch set, and
more details about what it does, see the patch 8 changelog and
tools/objtool/Documentation/stack-validation.txt.

Patches 1-7 mark various directories, files, and functions as
"non-standard" in preparation for objtool.

Patches 8-10 add objtool and integrate it into the kernel build.

v19:
- add support for CONFIG_GCOV_KERNEL, CONFIG_KASAN, CONFIG_UBSAN
- always inline context_switch() to prevent gcov inline changes
- add main() return value in objtool.c
- change warning output format to mimic gcc warnings

v18:
- include/linux/objtool.h -> include/linux/frame.h
- __objtool_ignore_func -> __func_stack_frame_non_standard
- reword commit messages and comments a bit
- reorder patches

v17:
- __ex_table fix
- rename stacktool -> objtool
- STACKTOOL_IGNORE_FUNCTION -> STACK_FRAME_NON_STANDARD
- 'STACKTOOL := n' -> 'OBJECT_FILES_NON_STANDARD := y'
- updated global_noreturns list

v16:
- fix all allyesconfig warnings, except for staging
- get rid of STACKTOOL_IGNORE_INSN which is no longer needed
- remove several whitelists in favor of automatically whitelisting any
  function with a special instruction like ljmp, lret, or vmrun
- split up stacktool patch into 3 parts as suggested by Ingo
- update the global noreturn function list
- detect noreturn function fallthroughs
- skip weak functions in noreturn call detection logic
- add empty function check to noreturn logic
- allow non-section rela symbols for __ex_table sections
- support rare switch table case with jmpq *[addr](%rip)
- don't warn on frame pointer restore without save
- rearrange patch order a bit

v15:
- restructure code for a new cmdline interface "stacktool check" using
  the new subcommand framework in tools/lib/subcmd
- fix 32 bit build fail (put __sp at end) in paravirt_types.h patch 10
  which was reported by 0day

v14:
- make tools/include/linux/list.h self-sufficient
- create FRAME_OFFSET to allow 32-bit code to be able to access function
  arguments on the stack
- add FRAME_OFFSET usage in crypto patch 14/24: "Create stack frames in
  aesni-intel_asm.S"
- rename "index" -> "idx" to fix build with some compilers

v13:
- LDFLAGS order fix from Chris J Arges
- new warning fix patches from Chris J Arges
- "--frame-pointer" -> "--check-frame-pointer"

v12:
- rename "stackvalidate" -> "stacktool"
- move from scripts/ to tools/:
  - makefile rework
  - make a copy of the x86 insn code (and warn if the code diverges)
  - use tools/include/linux/list.h
- move warning macros to a new warn.h file
- change wording: "stack validation" -> "stack metadata validation"

v11:
- attempt to answer the "why" question better in the documentation and
  commit message
- s/FP_SAVE/FRAME_BEGIN/ in documentation

v10:
- add scripts/mod to directory ignores
- remove circular dependencies for ignored objects which are built
  before stackvalidate
- fix CONFIG_MODVERSIONS incompatibility

v9:
- rename FRAME/ENDFRAME -> FRAME_BEGIN/FRAME_END
- fix jump table issue for when the original instruction is a jump
- drop paravirt thunk alignment patch
- add maintainers to CC for proposed warning fixes

v8:
- add proposed fixes for warnings
- fix all memory leaks
- process ignores earlier and add more ignore checks
- always assume POPCNT alternative is enabled
- drop hweight inline asm fix
- drop __schedule() ignore patch
- change .Ltemp_\@ to .Lstackvalidate_ignore_\@ in asm macro
- fix CONFIG_* checks in asm macros
- add C versions of ignore macros and frame macros
- change ";" to "\n" in C macros
- add ifdef CONFIG_STACK_VALIDATION checks in C ignore macros
- use numbered label in C ignore macro
- add missing break in switch case statement in arch-x86.c

v7:
- sibling call support
- document proposed solution for inline asm() frame pointer issues
- say "kernel entry/exit" instead of "context switch"
- clarify the checking of switch statement jump tables
- discard __stackvalidate_ignore_* sections in linker script
- use .Ltemp_\@ to get a unique label instead of static 3-digit number
- change STACKVALIDATE_IGNORE_FUNC variable to a static
- move STACKVALIDATE_IGNORE_INSN to arch-specific .h file

v6:
- rename asmvalidate -> stackvalidate (again)
- gcc-generated object file support
- recursive branch state analysis
- external jump support
- fixup/exception table support
- jump label support
- switch statement jump table support
- added documentation
- detection of "noreturn" dead end functions
- added a Kbuild mechanism for skipping files and dirs
- moved frame pointer macros to arch/x86/include/asm/frame.h
- moved ignore macros to include/linux/stackvalidate.h

v5:
- stackvalidate -> asmvalidate
- frame pointers only required for non-leaf functions
- check for the use of the FP_SAVE/RESTORE macros instead of manually

[PATCH v19 04/10] bpf: Mark __bpf_prog_run() stack frame as non-standard

2016-02-28 Thread Josh Poimboeuf
objtool reports the following false positive warnings:

  kernel/bpf/core.o: warning: objtool: __bpf_prog_run()+0x5c: sibling call from callable instruction with changed frame pointer
  kernel/bpf/core.o: warning: objtool: __bpf_prog_run()+0x60: function has unreachable instruction
  kernel/bpf/core.o: warning: objtool: __bpf_prog_run()+0x64: function has unreachable instruction
  [...]

It's confused by the following dynamic jump instruction in
__bpf_prog_run():

  jmp *(%r12,%rax,8)

which corresponds to the following line in the C code:

  goto *jumptable[insn->code];

There's no way for objtool to deterministically find all possible
branch targets for a dynamic jump, so it can't verify this code.

In this case the jumps all stay within the function, and there's nothing
unusual going on related to the stack, so we can whitelist the function.

Signed-off-by: Josh Poimboeuf 
Acked-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Cc: net...@vger.kernel.org
---
 kernel/bpf/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 972d9a8..be0abf6 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -649,6 +650,7 @@ load_byte:
WARN_RATELIMIT(1, "unknown opcode %02x\n", insn->code);
return 0;
 }
+STACK_FRAME_NON_STANDARD(__bpf_prog_run); /* jump table */
 
 bool bpf_prog_array_compatible(struct bpf_array *array,
   const struct bpf_prog *fp)
-- 
2.4.3
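
To illustrate the kind of code the changelog above describes, here is a toy
interpreter built on the same "goto *jumptable[...]" pattern. It is a
minimal sketch and not the BPF interpreter: the opcodes, struct insn layout,
and run_prog are all hypothetical, and it relies on the GCC/Clang
"labels as values" extension.

```c
/* Hypothetical three-opcode instruction set for the sketch. */
enum { OP_ADD, OP_SUB, OP_EXIT };

struct insn {
	unsigned char code;	/* one of the OP_* values above */
	int imm;		/* immediate operand */
};

/* Assumption: the program is well formed and ends with OP_EXIT. */
int run_prog(const struct insn *insn)
{
	/* Label addresses (&&label) are a GCC/Clang extension. */
	static const void *jumptable[] = {
		[OP_ADD]  = &&do_add,
		[OP_SUB]  = &&do_sub,
		[OP_EXIT] = &&do_exit,
	};
	int acc = 0;

	/* The branch target is only known at run time. */
	goto *jumptable[insn->code];

do_add:
	acc += insn->imm;
	insn++;
	goto *jumptable[insn->code];
do_sub:
	acc -= insn->imm;
	insn++;
	goto *jumptable[insn->code];
do_exit:
	return acc;
}
```

Each "goto *" typically compiles to an indirect jump through the table, so a
tool that only sees the object code has to guess which labels are reachable.
All targets here stay inside run_prog, which matches the reasoning given for
whitelisting __bpf_prog_run().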



[PATCH v19 10/10] objtool: Enable stack metadata validation on x86_64

2016-02-28 Thread Josh Poimboeuf
Set HAVE_STACK_VALIDATION to enable stack metadata validation for
x86_64.

Signed-off-by: Josh Poimboeuf 
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c46662f..adc5a6d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -155,6 +155,7 @@ config X86
select VIRT_TO_BUS
select X86_DEV_DMA_OPS  if X86_64
select X86_FEATURE_NAMESif PROC_FS
+   select HAVE_STACK_VALIDATIONif X86_64
 
 config INSTRUCTION_DECODER
def_bool y
-- 
2.4.3



[PATCH v19 06/10] sched: always inline context_switch()

2016-02-28 Thread Josh Poimboeuf
When CONFIG_GCOV_KERNEL is enabled, gcc decides to put context_switch()
out-of-line, which is inconsistent with its normal behavior.

It also causes an objtool warning because __schedule() no longer inlines
context_switch(), so the "STACK_FRAME_NON_STANDARD(__schedule)"
statement loses its effect.

Signed-off-by: Josh Poimboeuf 
---
 kernel/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 641043d..bb0daab 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2763,7 +2763,7 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
 /*
  * context_switch - switch to the new MM and the new thread's register state.
  */
-static inline struct rq *
+static __always_inline struct rq *
 context_switch(struct rq *rq, struct task_struct *prev,
   struct task_struct *next)
 {
-- 
2.4.3



[PATCH v19 02/10] objtool: Add STACK_FRAME_NON_STANDARD macro

2016-02-28 Thread Josh Poimboeuf
Add a new macro, STACK_FRAME_NON_STANDARD, which is used to denote a
function which does something unusual related to its stack frame.  Use
of the macro prevents objtool from emitting a false positive warning.

Signed-off-by: Josh Poimboeuf 
---
 arch/x86/kernel/vmlinux.lds.S |  5 -
 include/linux/frame.h | 23 +++
 2 files changed, 27 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/frame.h

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 92dc211..13fa0ad 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -343,7 +343,10 @@ SECTIONS
 
/* Sections to be discarded */
DISCARDS
-   /DISCARD/ : { *(.eh_frame) }
+   /DISCARD/ : {
+   *(.eh_frame)
+   *(__func_stack_frame_non_standard)
+   }
 }
 
 
diff --git a/include/linux/frame.h b/include/linux/frame.h
new file mode 100644
index 000..e6baaba
--- /dev/null
+++ b/include/linux/frame.h
@@ -0,0 +1,23 @@
+#ifndef _LINUX_FRAME_H
+#define _LINUX_FRAME_H
+
+#ifdef CONFIG_STACK_VALIDATION
+/*
+ * This macro marks the given function's stack frame as "non-standard", which
+ * tells objtool to ignore the function when doing stack metadata validation.
+ * It should only be used in special cases where you're 100% sure it won't
+ * affect the reliability of frame pointers and kernel stack traces.
+ *
+ * For more information, see tools/objtool/Documentation/stack-validation.txt.
+ */
+#define STACK_FRAME_NON_STANDARD(func) \
+   static void __used __section(__func_stack_frame_non_standard) \
+   *__func_stack_frame_non_standard_##func = func
+
+#else /* !CONFIG_STACK_VALIDATION */
+
+#define STACK_FRAME_NON_STANDARD(func)
+
+#endif /* CONFIG_STACK_VALIDATION */
+
+#endif /* _LINUX_FRAME_H */
-- 
2.4.3
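
For anyone who wants to poke at the macro outside the kernel tree, the
following is a small userspace sketch of the same trick. It assumes GCC or
Clang on an ELF target; the kernel's __used and __section() helpers are
spelled out as raw attributes, and odd_function is a hypothetical stand-in
for a function with an unusual stack frame.

```c
/*
 * Userspace approximation of the macro from include/linux/frame.h.
 * It emits a pointer to the function into a dedicated section of the
 * object file, where a tool inspecting that object can find it. The
 * kernel linker script then discards the section from the final image.
 */
#define STACK_FRAME_NON_STANDARD(func) \
	static void __attribute__((used, \
		section("__func_stack_frame_non_standard"))) \
	*__func_stack_frame_non_standard_##func = func

/* Hypothetical function we want the checker to leave alone. */
int odd_function(int x)
{
	return x + 1;
}
STACK_FRAME_NON_STANDARD(odd_function);
```

The marker sits outside the function body, so the function's generated code
is unaffected; the only addition to the object file is one pointer-sized
entry in the special section.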



[PATCH v2] perf/x86/amd: Adding support for new IOMMU performance event

2016-02-28 Thread Suravee Suthikulpanit
This patch adds new IOMMU performance events based on
the information in table 74 of the AMD I/O Virtualization Technology
(IOMMU) Specification (Document Id: 48882, Rev 2.62, Feb 2015).

Link: http://support.amd.com/TechDocs/48882_IOMMU.pdf

Reviewed-by: Joerg Roedel 
Acked-by: Joerg Roedel 
Signed-off-by: Suravee Suthikulpanit 
---

Hi Ingo/Peter,

I have rebased the patch onto tip and am resending it as V2.
If there are no other concerns, would you please accept this patch
when you get a chance? FYI, here is the link to V1
(https://lkml.org/lkml/2015/12/11/891).

Thanks,
Suravee

 arch/x86/events/amd/iommu.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
index 635e5eb..40625ca 100644
--- a/arch/x86/events/amd/iommu.c
+++ b/arch/x86/events/amd/iommu.c
@@ -118,6 +118,11 @@ static struct amd_iommu_event_desc amd_iommu_v2_event_descs[] = {
AMD_IOMMU_EVENT_DESC(cmd_processed,   "csource=0x11"),
AMD_IOMMU_EVENT_DESC(cmd_processed_inv,   "csource=0x12"),
AMD_IOMMU_EVENT_DESC(tlb_inv, "csource=0x13"),
+   AMD_IOMMU_EVENT_DESC(ign_rd_wr_mmio_1ff8h,"csource=0x14"),
+   AMD_IOMMU_EVENT_DESC(vapic_int_non_guest, "csource=0x15"),
+   AMD_IOMMU_EVENT_DESC(vapic_int_guest, "csource=0x16"),
+   AMD_IOMMU_EVENT_DESC(smi_recv,"csource=0x17"),
+   AMD_IOMMU_EVENT_DESC(smi_blk, "csource=0x18"),
{ /* end: all zeroes */ },
 };
 
-- 
1.9.1



linux-next: manual merge of the iommu tree with the samsung-krzk tree

2016-02-28 Thread Stephen Rothwell
Hi Joerg,

Today's linux-next merge of the iommu tree got a conflict in:

  drivers/memory/Kconfig

between commit:

  78fbb9361ca3 ("memory: Add support for Exynos SROM driver")

from the samsung-krzk tree and commit:

  cc8bbe1a8312 ("memory: mediatek: Add SMI driver")

from the iommu tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/memory/Kconfig
index bcb19822968b,51d5cd20c26a..
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@@ -114,7 -114,14 +114,15 @@@ config JZ4780_NEM
  the Ingenic JZ4780. This controller is used to handle external
  memory devices such as NAND and SRAM.
  
+ config MTK_SMI
+   bool
+   depends on ARCH_MEDIATEK || COMPILE_TEST
+   help
+ This driver is for the Memory Controller module in MediaTek SoCs,
+ mainly help enable/disable iommu and control the power domain and
+ clocks for each local arbiter.
+ 
 +source "drivers/memory/samsung/Kconfig"
  source "drivers/memory/tegra/Kconfig"
  
  endif


[PATCH v7] watchdog: Add watchdog timer support for the WinSystems EBC-C384

2016-02-28 Thread William Breathitt Gray
The WinSystems EBC-C384 has an onboard watchdog timer. The timeout range
supported by the watchdog timer is 1 second to 255 minutes. Timeouts
under 256 seconds have a 1 second granularity, while the rest have a 1
minute granularity.

This driver adds watchdog timer support for this onboard watchdog timer.
The timeout may be configured via the timeout module parameter.

Signed-off-by: William Breathitt Gray 
Reviewed-by: Guenter Roeck 
---
Changes in v7:
  - Make sure timeout member is in seconds resolution despite minutes
granularity

 MAINTAINERS |   6 ++
 drivers/watchdog/Kconfig|   9 ++
 drivers/watchdog/Makefile   |   1 +
 drivers/watchdog/ebc-c384_wdt.c | 188 
 4 files changed, 204 insertions(+)
 create mode 100644 drivers/watchdog/ebc-c384_wdt.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 28eb61b..66107fd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11860,6 +11860,12 @@ M: David Härdeman 
 S: Maintained
 F: drivers/media/rc/winbond-cir.c
 
+WINSYSTEMS EBC-C384 WATCHDOG DRIVER
+M: William Breathitt Gray 
+L: linux-watch...@vger.kernel.org
+S: Maintained
+F: drivers/watchdog/ebc-c384_wdt.c
+
 WIMAX STACK
 M: Inaky Perez-Gonzalez 
 M: linux-wi...@intel.com
diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index 0f6d851..11f3a3d 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -713,6 +713,15 @@ config ALIM7101_WDT
 
  Most people will say N.
 
+config EBC_C384_WDT
+   tristate "WinSystems EBC-C384 Watchdog Timer"
+   depends on X86
+   select WATCHDOG_CORE
+   help
+ Enables watchdog timer support for the watchdog timer on the
+ WinSystems EBC-C384 motherboard. The timeout may be configured via
+ the timeout module parameter.
+
 config F71808E_WDT
tristate "Fintek F71808E, F71862FG, F71869, F71882FG and F71889FG 
Watchdog"
depends on X86
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index f566753..15762c8 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -88,6 +88,7 @@ obj-$(CONFIG_ACQUIRE_WDT) += acquirewdt.o
 obj-$(CONFIG_ADVANTECH_WDT) += advantechwdt.o
 obj-$(CONFIG_ALIM1535_WDT) += alim1535_wdt.o
 obj-$(CONFIG_ALIM7101_WDT) += alim7101_wdt.o
+obj-$(CONFIG_EBC_C384_WDT) += ebc-c384_wdt.o
 obj-$(CONFIG_F71808E_WDT) += f71808e_wdt.o
 obj-$(CONFIG_SP5100_TCO) += sp5100_tco.o
 obj-$(CONFIG_GEODE_WDT) += geodewdt.o
diff --git a/drivers/watchdog/ebc-c384_wdt.c b/drivers/watchdog/ebc-c384_wdt.c
new file mode 100644
index 000..2cdaf5d
--- /dev/null
+++ b/drivers/watchdog/ebc-c384_wdt.c
@@ -0,0 +1,188 @@
+/*
+ * Watchdog timer driver for the WinSystems EBC-C384
+ * Copyright (C) 2016 William Breathitt Gray
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MODULE_NAME            "ebc-c384_wdt"
+#define WATCHDOG_TIMEOUT   60
+/*
+ * The timeout value in minutes must fit in a single byte when sent to the
+ * watchdog timer; the maximum timeout possible is 15300 (255 * 60) seconds.
+ */
+#define WATCHDOG_MAX_TIMEOUT   15300
+#define BASE_ADDR  0x564
+#define ADDR_EXTENT            5
+#define CFG_ADDR   (BASE_ADDR + 1)
+#define PET_ADDR   (BASE_ADDR + 2)
+
+static bool nowayout = WATCHDOG_NOWAYOUT;
+module_param(nowayout, bool, 0);
+MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
+   __MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
+
+static unsigned timeout;
+module_param(timeout, uint, 0);
+MODULE_PARM_DESC(timeout, "Watchdog timeout in seconds (default="
+   __MODULE_STRING(WATCHDOG_TIMEOUT) ")");
+
+static int ebc_c384_wdt_start(struct watchdog_device *wdev)
+{
+   unsigned t = wdev->timeout;
+
+   /* resolution is in minutes for timeouts greater than 255 seconds */
+   if (t > 255)
+   t = DIV_ROUND_UP(t, 60);
+
+   outb(t, PET_ADDR);
+
+   return 0;
+}
+
+static int ebc_c384_wdt_stop(struct watchdog_device *wdev)
+{
+   outb(0x00, PET_ADDR);
+
+   return 0;
+}
+
+static int ebc_c384_wdt_set_timeout(struct watchdog_device *wdev, unsigned t)
+{
+   /* resolution is in minutes for timeouts greater than 255 seconds */
+   if (t > 255) {
+   /* round second resolution up to minute resolution */
+   wdev->timeout = DIV_ROUND_UP(t, 60);
+
+   /* set watchdog timer for minute

Re: [PATCH] s390x: fix condition to choose correct function

2016-02-28 Thread Steve French
merged into cifs-2.6.git

Looks like alpha has a similar problem though

On Wed, Feb 24, 2016 at 12:45 AM, Yadan Fan  wrote:
> This issue was introduced by commit 02323db17e3a7 ("cifs: fix
> cifs_uniqueid_to_ino_t not to ever return 0"). When BITS_PER_LONG
> is 64 on s390x, the corresponding cifs_uniqueid_to_ino_t()
> function casts the 64-bit fileid to 32 bits by using (ino_t)fileid,
> because ino_t (typedef'ed to __kernel_ino_t) is an int type.
>
> Signed-off-by: Yadan Fan 
> ---
>  fs/cifs/cifsfs.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
> index 68c4547..02dcbe1 100644
> --- a/fs/cifs/cifsfs.h
> +++ b/fs/cifs/cifsfs.h
> @@ -31,7 +31,7 @@
>   * so that it will fit. We use hash_64 to convert the value to 31 bits, and
>   * then add 1, to ensure that we don't end up with a 0 as the value.
>   */
> -#if BITS_PER_LONG == 64
> +#if BITS_PER_LONG == 64 && !defined(CONFIG_S390)
>  static inline ino_t
>  cifs_uniqueid_to_ino_t(u64 fileid)
>  {
> --
> 2.6.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
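
To make the truncation described in the quoted commit message concrete, here
is a small userspace sketch. It is only an illustration: truncate_id() and
fold_id() are hypothetical names, and the XOR fold merely stands in for the
hash_64() call mentioned in the cifsfs.h comment; it is not the kernel's
hash.

```c
#include <stdint.h>

/* Plain cast to a 32-bit type: only the low 32 bits survive. */
static uint32_t truncate_id(uint64_t fileid)
{
	return (uint32_t)fileid;
}

/*
 * Hypothetical fold: mix the high half into the low half, keep 31 bits,
 * then add 1 so the result can never be 0.
 */
static uint32_t fold_id(uint64_t fileid)
{
	uint32_t folded = (uint32_t)(fileid ^ (fileid >> 32));

	return (folded & 0x7fffffffu) + 1u;
}
```

Two file IDs that differ only in their upper 32 bits collide under the plain
cast (and 0x100000000 even maps to 0), while the folded variant keeps them
apart and never yields 0.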



-- 
Thanks,

Steve


linux-next: manual merge of the iommu tree with the arm-soc tree

2016-02-28 Thread Stephen Rothwell
Hi Joerg,

Today's linux-next merge of the iommu tree got a conflict in:

  arch/arm64/boot/dts/mediatek/mt8173.dtsi

between commit:

  93e9f5ee1e35 ("dts: arm64: Add EFUSE device node")

from the arm-soc tree and commit:

  5ff6b3a6d391 ("dts: mt8173: Add iommu/smi nodes for mt8173")

from the iommu tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell

diff --cc arch/arm64/boot/dts/mediatek/mt8173.dtsi
index f4bd3c9182ad,804881181fcc..
--- a/arch/arm64/boot/dts/mediatek/mt8173.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
@@@ -277,11 -278,17 +278,22 @@@
reg = <0 0x10200620 0 0x20>;
};
  
+   iommu: iommu@10205000 {
+   compatible = "mediatek,mt8173-m4u";
+   reg = <0 0x10205000 0 0x1000>;
+   interrupts = ;
+   clocks = <&infracfg CLK_INFRA_M4U>;
+   clock-names = "bclk";
+   mediatek,larbs = <&larb0 &larb1 &larb2
+ &larb3 &larb4 &larb5>;
+   #iommu-cells = <1>;
+   };
+ 
 +  efuse: efuse@10206000 {
 +  compatible = "mediatek,mt8173-efuse";
 +  reg = <0 0x10206000 0 0x1000>;
 +  };
 +
apmixedsys: clock-controller@10209000 {
compatible = "mediatek,mt8173-apmixedsys";
reg = <0 0x10209000 0 0x1000>;


Re: [PATCH v2 02/13] clk: sunxi: add ahb1 clock for A83T

2016-02-28 Thread Chen-Yu Tsai
Hi,

On Sun, Feb 28, 2016 at 7:18 AM, Vishnu Patekar
 wrote:
> AHB1 on A83T is similar to ahb1 on A31, except parents are different.
> clock index 0b1x is PLL6.
>
> Signed-off-by: Vishnu Patekar 
> Acked-by: Chen-Yu Tsai 
> Acked-by: Rob Herring 
> ---
>  Documentation/devicetree/bindings/clock/sunxi.txt |  1 +
>  drivers/clk/sunxi/clk-sunxi.c | 76 
> +++
>  2 files changed, 77 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/clock/sunxi.txt 
> b/Documentation/devicetree/bindings/clock/sunxi.txt
> index c09f59b..2ee7841 100644
> --- a/Documentation/devicetree/bindings/clock/sunxi.txt
> +++ b/Documentation/devicetree/bindings/clock/sunxi.txt
> @@ -29,6 +29,7 @@ Required properties:
> "allwinner,sun6i-a31-ar100-clk" - for the AR100 on A31
> "allwinner,sun9i-a80-cpus-clk" - for the CPUS on A80
> "allwinner,sun6i-a31-ahb1-clk" - for the AHB1 clock on A31
> +   "allwinner,sun8i-a83t-ahb1-clk" - for the AHB1 clock on A83T
> "allwinner,sun8i-h3-ahb2-clk" - for the AHB2 clock on H3
> "allwinner,sun6i-a31-ahb1-gates-clk" - for the AHB1 gates on A31
> "allwinner,sun8i-a23-ahb1-gates-clk" - for the AHB1 gates on A23
> diff --git a/drivers/clk/sunxi/clk-sunxi.c b/drivers/clk/sunxi/clk-sunxi.c
> index 99f60ef..0ae1f09 100644
> --- a/drivers/clk/sunxi/clk-sunxi.c
> +++ b/drivers/clk/sunxi/clk-sunxi.c
> @@ -344,6 +344,67 @@ static void sun6i_ahb1_recalc(struct factors_request 
> *req)
> req->rate >>= req->p;
>  }
>
> +#define SUN8I_A83T_AHB1_PARENT_PLL6    2
> +/**
> + * sun8i_a83t_get_ahb_factors() - calculates m, p factors for AHB
> + * AHB rate is calculated as follows
> + * rate = parent_rate >> p
> + *
> + * if parent is pll6, then
> + * parent_rate = pll6 rate / (m + 1)
> + */
> +
> +static void sun8i_a83t_get_ahb1_factors(struct factors_request *req)
> +{
> +   u8 div, calcp, calcm = 1;
> +
> +   /*
> +* clock can only divide, so we will never be able to achieve
> +* frequencies higher than the parent frequency
> +*/
> +   if (req->parent_rate && req->rate > req->parent_rate)
> +   req->rate = req->parent_rate;
> +
> +   div = DIV_ROUND_UP(req->parent_rate, req->rate);
> +
> +   /* calculate pre-divider if parent is pll6 */
> +   if (req->parent_index >= SUN8I_A83T_AHB1_PARENT_PLL6) {
> +   if (div < 4)
> +   calcp = 0;
> +   else if (div / 2 < 4)
> +   calcp = 1;
> +   else if (div / 4 < 4)
> +   calcp = 2;
> +   else
> +   calcp = 3;
> +
> +   calcm = DIV_ROUND_UP(div, 1 << calcp);
> +   } else {
> +   calcp = __roundup_pow_of_two(div);
> +   calcp = calcp > 3 ? 3 : calcp;
> +}

Indent here.

> +
> +   req->rate = (req->parent_rate / calcm) >> calcp;
> +   req->p = calcp;
> +   req->m = calcm - 1;
> +}
> +
> +/**
> +* sun8i_a83t_ahb1_recalc() - calculates AHB clock rate from m, p factors and
> +*   parent index

Whitespace here.

> +*/
> +static void sun8i_a83t_ahb1_recalc(struct factors_request *req)
> +{
> +   req->rate = req->parent_rate;
> +
> +/* apply pre-divider first if parent is pll6 */

Indent here.

ChenYu

> +   if (req->parent_index >= SUN6I_AHB1_PARENT_PLL6)
> +   req->rate /= req->m + 1;
> +
> +   /* clk divider */
> +   req->rate >>= req->p;
> +}
> +
>  /**
>   * sun4i_get_apb1_factors() - calculates m, p factors for APB1
>   * APB1 rate is calculated as follows
> @@ -555,6 +616,14 @@ static const struct factors_data sun6i_ahb1_data 
> __initconst = {
> .recalc = sun6i_ahb1_recalc,
>  };
>
> +static const struct factors_data sun8i_a83t_ahb1_data __initconst = {
> +   .mux = 12,
> +   .muxmask = BIT(1) | BIT(0),
> +   .table = &sun6i_ahb1_config,
> +   .getter = sun8i_a83t_get_ahb1_factors,
> +   .recalc = sun8i_a83t_ahb1_recalc,
> +};
> +
>  static const struct factors_data sun4i_apb1_data __initconst = {
> .mux = 24,
> .muxmask = BIT(1) | BIT(0),
> @@ -627,6 +696,13 @@ static void __init sun6i_ahb1_clk_setup(struct 
> device_node *node)
>  CLK_OF_DECLARE(sun6i_a31_ahb1, "allwinner,sun6i-a31-ahb1-clk",
>sun6i_ahb1_clk_setup);
>
> +static void __init sun8i_a83t_ahb1_clk_setup(struct device_node *node)
> +{
> +   sunxi_factors_clk_setup(node, &sun8i_a83t_ahb1_data);
> +}
> +CLK_OF_DECLARE(sun8i_a83t_ahb1, "allwinner,sun8i-a83t-ahb1-clk",
> +  sun8i_a83t_ahb1_clk_setup);
> +
>  static void __init sun4i_apb1_clk_setup(struct device_node *node)
>  {
> sunxi_factors_clk_setup(node, &sun4i_apb1_data);
> --
> 1.9.1
>
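
For reference, the rate formula from the kernel-doc comment in the quoted
patch can be checked in userspace with a sketch like the one below. The
struct and function names (ahb1_factors, ahb1_rate) and the example
frequencies are assumptions made for illustration; only the arithmetic
follows the patch.

```c
#define PARENT_PLL6 2	/* mux indices 2 and 3 (0b1x) select PLL6 */

/* Hypothetical stand-in for the fields of struct factors_request used here. */
struct ahb1_factors {
	unsigned long parent_rate;
	unsigned int parent_index;
	unsigned int m;		/* pre-divider is m + 1, PLL6 parents only */
	unsigned int p;		/* final divider is 1 << p */
};

static unsigned long ahb1_rate(const struct ahb1_factors *f)
{
	unsigned long rate = f->parent_rate;

	/* apply the pre-divider first if the parent is PLL6 */
	if (f->parent_index >= PARENT_PLL6)
		rate /= f->m + 1;

	/* then the power-of-two clock divider */
	return rate >> f->p;
}
```

For example, a 600 MHz PLL6 parent with m = 2 and p = 1 gives
600 / 3 / 2 = 100 MHz, while a 24 MHz non-PLL6 parent with the same p
ignores m and gives 12 MHz.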


Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

2016-02-28 Thread Brian Bloniarz
(Take 3, fix compile error in n_hdlc.c)

Hi Peter, I saw Marc Aurele La France's proposed patch to n_tty to fix
OpenSSH, and your feedback. Patch below is an attempt to address that
feedback. Please let me know if this is the change you envisioned
(see Marc's excellent original writeup for details on the issue).

[PATCH] n_tty: wait for buffer work in read() and poll().

Undoes the following four changes:

1) f95499c3030fe1bfad57745f2db1959c5b43dca8
n_tty: Don't wait for buffer work in read() loop

2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
tty: Fix pty master read() after slave closes

3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
pty, n_tty: Simplify input processing on final close

4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
pty: Fix input race when closing

These changes caused a regression in OpenSSH, as it assumes that the
first read() to return EAGAIN after a SIGCHLD means that all the child's
output has been returned.
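
The userspace pattern in question looks roughly like the sketch below. This
is not OpenSSH code; drain_until_eagain() is a hypothetical helper that only
shows the assumption being made: on a non-blocking descriptor, the first
EAGAIN is taken to mean that nothing more is pending.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Read from a non-blocking fd until read() reports EAGAIN, EOF or a
 * real error. Returns the number of bytes consumed, or -1 on error.
 */
static long drain_until_eagain(int fd)
{
	char buf[256];
	long total = 0;

	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)
			return total;	/* EOF */
		if (errno == EINTR)
			continue;	/* interrupted, retry */
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return total;	/* caller assumes: no more data */
		return -1;
	}
}
```

If data written by the child is still sitting in the flip buffers when such
a loop sees EAGAIN, that data is effectively lost to a caller which stops
reading at that point.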

Inspired by analysis and patch from Marc Aurele La France 

Reported-by: Volth 
Reported-by: Marc Aurele La France 
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
Signed-off-by: Brian Bloniarz 
---
 Documentation/serial/tty.txt |  3 ---
 drivers/tty/n_hdlc.c |  4 ++--
 drivers/tty/n_tty.c  | 34 +++---
 drivers/tty/pty.c|  4 +---
 drivers/tty/tty_buffer.c | 29 +
 include/linux/tty.h  |  1 -
 6 files changed, 19 insertions(+), 56 deletions(-)

diff --git a/Documentation/serial/tty.txt b/Documentation/serial/tty.txt
index bc3842d..e2dea3d 100644
--- a/Documentation/serial/tty.txt
+++ b/Documentation/serial/tty.txt
@@ -213,9 +213,6 @@ TTY_IO_ERROR        If set, causes all subsequent 
userspace read/write
 
 TTY_OTHER_CLOSED   Device is a pty and the other side has closed.
 
-TTY_OTHER_DONE Device is a pty and the other side has closed and
-   all pending input processing has been completed.
-
 TTY_NO_WRITE_SPLIT Prevent driver from splitting up writes into
smaller chunks.
 
diff --git a/drivers/tty/n_hdlc.c b/drivers/tty/n_hdlc.c
index bbc4ce6..644ddb8 100644
--- a/drivers/tty/n_hdlc.c
+++ b/drivers/tty/n_hdlc.c
@@ -600,7 +600,7 @@ static ssize_t n_hdlc_tty_read(struct tty_struct *tty, 
struct file *file,
add_wait_queue(&tty->read_wait, &wait);
 
for (;;) {
-   if (test_bit(TTY_OTHER_DONE, &tty->flags)) {
+   if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
ret = -EIO;
break;
}
@@ -828,7 +828,7 @@ static unsigned int n_hdlc_tty_poll(struct tty_struct *tty, 
struct file *filp,
/* set bits for operations that won't block */
if (n_hdlc->rx_buf_list.head)
mask |= POLLIN | POLLRDNORM;/* readable */
-   if (test_bit(TTY_OTHER_DONE, &tty->flags))
+   if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
mask |= POLLHUP;
if (tty_hung_up_p(filp))
mask |= POLLHUP;
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index b280abaa..fc04011 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1952,10 +1952,20 @@ err:
return -ENOMEM;
 }
 
+/**
+ * Synchronously pushes the terminal flip buffers to the line discipline
+ * and checks for available data.
+ *
+ * Must not be called from IRQ context.
+ */
 static inline int input_available_p(struct tty_struct *tty, int poll)
 {
struct n_tty_data *ldata = tty->disc_data;
-   int amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
+   int amt;
+
+   flush_work(&tty->port->buf.work);
+
+   amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
 
if (ldata->icanon && !L_EXTPROC(tty))
return ldata->canon_head != ldata->read_tail;
@@ -1963,18 +1973,6 @@ static inline int input_available_p(struct tty_struct 
*tty, int poll)
return ldata->commit_head - ldata->read_tail >= amt;
 }
 
-static inline int check_other_done(struct tty_struct *tty)
-{
-   int done = test_bit(TTY_OTHER_DONE, &tty->flags);
-   if (done) {
-   /* paired with cmpxchg() in check_other_closed(); ensures
-* read buffer head index is not stale
-*/
-   smp_mb__after_atomic();
-   }
-   return done;
-}
-
 /**
  * copy_from_read_buf  -   copy read data directly
  * @tty: terminal device
@@ -2170,7 +2168,7 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct 
file *file,
struct n_tty_data *ldata = tty->disc_data;
unsigned char __user *b = buf;
DEFINE_WAIT_FUNC(wait, woken_wake_function);
-   int c, done;
+   int c;
int minimum, time;
ssize_t retval = 0;
 

linux-next: build failure after merge of the mfd tree

2016-02-28 Thread Stephen Rothwell
Hi Lee,

After merging the mfd tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

drivers/regulator/tps65086-regulator.c:194:9: error: implicit declaration of 
function 'regmap_write_bits' [-Werror=implicit-function-declaration]
   ret = regmap_write_bits(config->regmap,
 ^

Caused by commit

  23b92e4cf5fd ("regmap: remove regmap_write_bits()")

from the sound-asoc & regmap trees.

I am not sure why this is suddenly exposed by the mfd tree, but grep
would have been useful when the regmap tree patch was applied.

I have reverted that regmap commit for today.

-- 
Cheers,
Stephen Rothwell


[PATCH v4 3/8] fixdep: accept extra dependencies on stdin

2016-02-28 Thread Nicolas Pitre
... and merge them in the list of parsed dependencies.

Signed-off-by: Nicolas Pitre 
---
 scripts/basic/fixdep.c | 60 +-
 1 file changed, 45 insertions(+), 15 deletions(-)

diff --git a/scripts/basic/fixdep.c b/scripts/basic/fixdep.c
index 5b327c67a8..d984deb120 100644
--- a/scripts/basic/fixdep.c
+++ b/scripts/basic/fixdep.c
@@ -120,13 +120,15 @@
 #define INT_NFIG ntohl(0x4e464947)
 #define INT_FIG_ ntohl(0x4649475f)
 
+int insert_extra_deps;
 char *target;
 char *depfile;
 char *cmdline;
 
 static void usage(void)
 {
-   fprintf(stderr, "Usage: fixdep   \n");
+   fprintf(stderr, "Usage: fixdep [-e]   \n");
+   fprintf(stderr, " -e  insert extra dependencies given on stdin\n");
exit(1);
 }
 
@@ -138,6 +140,40 @@ static void print_cmdline(void)
printf("cmd_%s := %s\n\n", target, cmdline);
 }
 
+/*
+ * Print out a dependency path from a symbol name
+ */
+static void print_config(const char *m, int slen)
+{
+   int c, i;
+
+   printf("$(wildcard include/config/");
+   for (i = 0; i < slen; i++) {
+   c = m[i];
+   if (c == '_')
+   c = '/';
+   else
+   c = tolower(c);
+   putchar(c);
+   }
+   printf(".h) \\\n");
+}
+
+static void do_extra_deps(void)
+{
+   if (insert_extra_deps) {
+   char buf[80];
+   while(fgets(buf, sizeof(buf), stdin)) {
+   int len = strlen(buf);
+   if (len < 2 || buf[len-1] != '\n') {
+   fprintf(stderr, "fixdep: bad data on stdin\n");
+   exit(1);
+   }
+   print_config(buf, len-1);
+   }
+   }
+}
+
 struct item {
struct item *next;
unsigned int len;
@@ -197,23 +233,12 @@ static void define_config(const char *name, int len, 
unsigned int hash)
 static void use_config(const char *m, int slen)
 {
unsigned int hash = strhash(m, slen);
-   int c, i;
 
if (is_defined_config(m, slen, hash))
return;
 
define_config(m, slen, hash);
-
-   printf("$(wildcard include/config/");
-   for (i = 0; i < slen; i++) {
-   c = m[i];
-   if (c == '_')
-   c = '/';
-   else
-   c = tolower(c);
-   putchar(c);
-   }
-   printf(".h) \\\n");
+   print_config(m, slen);
 }
 
 static void parse_config_file(const char *map, size_t len)
@@ -250,7 +275,7 @@ static void parse_config_file(const char *map, size_t len)
}
 }
 
-/* test is s ends in sub */
+/* test if s ends in sub */
 static int strrcmp(const char *s, const char *sub)
 {
int slen = strlen(s);
@@ -374,6 +399,8 @@ static void parse_dep_file(void *map, size_t len)
exit(1);
}
 
+   do_extra_deps();
+
printf("\n%s: $(deps_%s)\n\n", target, target);
printf("$(deps_%s):\n", target);
 }
@@ -430,7 +457,10 @@ int main(int argc, char *argv[])
 {
traps();
 
-   if (argc != 4)
+   if (argc == 5 && !strcmp(argv[1], "-e")) {
+   insert_extra_deps = 1;
+   argv++;
+   } else if (argc != 4)
usage();
 
depfile = argv[1];
-- 
2.5.0
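
As a quick illustration of what print_config() produces for a line read from
stdin, the sketch below builds the same path into a buffer instead of
printing it. config_dep_path() is a hypothetical helper written for this
example, and the "KSYM_my_sym" input is an assumed sample; only the '_' to
'/' and lowercase mapping is taken from the patch.

```c
#include <ctype.h>
#include <string.h>

/*
 * Build "include/config/<path>.h" for the symbol m[0..slen) into out.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int config_dep_path(const char *m, size_t slen, char *out, size_t outsz)
{
	static const char prefix[] = "include/config/";
	static const char suffix[] = ".h";
	size_t plen = sizeof(prefix) - 1;
	size_t i;

	/* sizeof(suffix) already counts the terminating NUL */
	if (plen + slen + sizeof(suffix) > outsz)
		return -1;

	memcpy(out, prefix, plen);
	for (i = 0; i < slen; i++) {
		unsigned char c = (unsigned char)m[i];

		/* same mapping as print_config(): '_' -> '/', rest lowercased */
		out[plen + i] = (c == '_') ? '/' : (char)tolower(c);
	}
	memcpy(out + plen + slen, suffix, sizeof(suffix));
	return 0;
}
```

Under these assumptions "KSYM_my_sym" maps to include/config/ksym/my/sym.h,
which fixdep then wraps in $(wildcard ...) like any CONFIG_ dependency.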



[PATCH v4 2/8] allow for per-symbol configurable EXPORT_SYMBOL()

2016-02-28 Thread Nicolas Pitre
Similar to include/generated/autoconf.h, include/generated/autoksyms.h
will contain a list of defines for each EXPORT_SYMBOL() that we want
active. The format is:

  #define __KSYM_ 1

This list will be auto-generated with another patch.  For now we only
include the preprocessor magic to automatically create or omit the
corresponding struct kernel_symbol declaration.

Given the content of include/generated/autoksyms.h may not be known in
advance, an empty file is created early on to let the build proceed.

Signed-off-by: Nicolas Pitre 
Acked-by: Rusty Russell 
---
 Makefile   |  2 ++
 include/linux/export.h | 22 --
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 6c1a3c2479..e916428cf7 100644
--- a/Makefile
+++ b/Makefile
@@ -986,6 +986,8 @@ prepare2: prepare3 outputmakefile asm-generic
 prepare1: prepare2 $(version_h) include/generated/utsrelease.h \
include/config/auto.conf
$(cmd_crmodverdir)
+   $(Q)test -e include/generated/autoksyms.h || \
+   touch   include/generated/autoksyms.h
 
 archprepare: archheaders archscripts prepare1 scripts_basic
 
diff --git a/include/linux/export.h b/include/linux/export.h
index 96e45ea463..77afdb2a25 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -38,7 +38,7 @@ extern struct module __this_module;
 
 #ifdef CONFIG_MODULES
 
-#ifndef __GENKSYMS__
+#if defined(__KERNEL__) && !defined(__GENKSYMS__)
 #ifdef CONFIG_MODVERSIONS
 /* Mark the CRC weak since genksyms apparently decides not to
  * generate a checksums for some symbols */
@@ -53,7 +53,7 @@ extern struct module __this_module;
 #endif
 
 /* For every exported symbol, place a struct in the __ksymtab section */
-#define __EXPORT_SYMBOL(sym, sec)  \
+#define ___EXPORT_SYMBOL(sym, sec) \
extern typeof(sym) sym; \
__CRC_SYMBOL(sym, sec)  \
static const char __kstrtab_##sym[] \
@@ -65,6 +65,24 @@ extern struct module __this_module;
__attribute__((section("___ksymtab" sec "+" #sym), unused)) \
= { (unsigned long)&sym, __kstrtab_##sym }
 
+#ifdef CONFIG_TRIM_UNUSED_KSYMS
+
+#include 
+#include 
+
+#define __EXPORT_SYMBOL(sym, sec)  \
+   __cond_export_sym(sym, sec, config_enabled(__KSYM_##sym))
+#define __cond_export_sym(sym, sec, conf)  \
+   ___cond_export_sym(sym, sec, conf)
+#define ___cond_export_sym(sym, sec, enabled)  \
+   __cond_export_sym_##enabled(sym, sec)
+#define __cond_export_sym_1(sym, sec) ___EXPORT_SYMBOL(sym, sec)
+#define __cond_export_sym_0(sym, sec) /* nothing */
+
+#else
+#define __EXPORT_SYMBOL ___EXPORT_SYMBOL
+#endif
+
 #define EXPORT_SYMBOL(sym) \
__EXPORT_SYMBOL(sym, "")
 
-- 
2.5.0
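
The preprocessor dispatch used above can be tried outside the kernel with a
sketch like the following. All names here (KSYM_foo, IS_SET, COND_EXPORT,
exported[]) are made up for the example: IS_SET() is a simplified stand-in
for config_enabled(), and the "export" is just a string placed in a table
rather than a struct kernel_symbol.

```c
#include <stddef.h>

/* Stand-in for a line that autoksyms.h would contain for symbol "foo". */
#define KSYM_foo 1

/* IS_SET(x) expands to 1 if x is defined as 1, and to 0 otherwise. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND(ignored, val, ...) val
#define IS_SET__(arg1_or_junk) TAKE_SECOND(arg1_or_junk 1, 0, 0)
#define IS_SET_(val) IS_SET__(ARG_PLACEHOLDER_##val)
#define IS_SET(x) IS_SET_(x)

/* Pick COND_EXPORT_1 or COND_EXPORT_0 by pasting the 0/1 result. */
#define COND_EXPORT(sym) COND_EXPORT_(sym, IS_SET(KSYM_##sym))
#define COND_EXPORT_(sym, enabled) COND_EXPORT__(sym, enabled)
#define COND_EXPORT__(sym, enabled) COND_EXPORT_##enabled(sym)
#define COND_EXPORT_1(sym) #sym,	/* emit a table entry */
#define COND_EXPORT_0(sym)		/* nothing */

/* Only symbols with a matching KSYM_ define end up in the table. */
static const char *const exported[] = {
	COND_EXPORT(foo)
	COND_EXPORT(bar)
	NULL
};

static size_t exported_count(void)
{
	size_t n = 0;

	while (exported[n])
		n++;
	return n;
}
```

Here "bar" has no KSYM_bar define, so its entry is dropped at preprocessing
time and the table holds only "foo". The extra indirection through
COND_EXPORT_() is what forces IS_SET() to be expanded to 0 or 1 before the
token paste.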



[PATCH v4 6/8] create/adjust generated/autoksyms.h

2016-02-28 Thread Nicolas Pitre
Given the list of exported symbols needed by all modules, we can create
a header file containing preprocessor defines for each of those symbols.
Also, when some symbols are added and/or removed from the list, we can
update the time on the corresponding files used as build dependencies for
those symbols. And finally, if any symbol did change state, the
corresponding source files must be rebuilt.

The insertion or removal of an EXPORT_SYMBOL() entry within a module may
create or remove the need for another exported symbol.  This is why this
operation has to be repeated until the list of needed exported symbols
becomes stable. Only then does the final kernel and modules link take place.

Signed-off-by: Nicolas Pitre 
Acked-by: Rusty Russell 
---
 Makefile| 13 ++
 scripts/adjust_autoksyms.sh | 97 +
 2 files changed, 110 insertions(+)
 create mode 100755 scripts/adjust_autoksyms.sh

diff --git a/Makefile b/Makefile
index e916428cf7..bb865095ca 100644
--- a/Makefile
+++ b/Makefile
@@ -921,6 +921,10 @@ quiet_cmd_link-vmlinux = LINK    $@
 # Include targets which we want to
 # execute if the rest of the kernel build went well.
 vmlinux: scripts/link-vmlinux.sh $(vmlinux-deps) FORCE
+ifdef CONFIG_TRIM_UNUSED_KSYMS
+   $(Q)$(CONFIG_SHELL) scripts/adjust_autoksyms.sh \
+ "$(MAKE) KBUILD_MODULES=1 -f $(srctree)/Makefile autoksyms_recursive"
+endif
 ifdef CONFIG_HEADERS_CHECK
$(Q)$(MAKE) -f $(srctree)/Makefile headers_check
 endif
@@ -935,6 +939,15 @@ ifdef CONFIG_GDB_SCRIPTS
 endif
+$(call if_changed,link-vmlinux)
 
+autoksyms_recursive: $(vmlinux-deps)
+   $(Q)$(CONFIG_SHELL) scripts/adjust_autoksyms.sh \
+ "$(MAKE) KBUILD_MODULES=1 -f $(srctree)/Makefile autoksyms_recursive"
+PHONY += autoksyms_recursive
+
+# standalone target for easier testing
+include/generated/autoksyms.h: FORCE
+   $(Q)$(CONFIG_SHELL) scripts/adjust_autoksyms.sh true
+
 # The actual objects are generated when descending,
 # make sure no implicit rule kicks in
 $(sort $(vmlinux-deps)): $(vmlinux-dirs) ;
diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
new file mode 100755
index 00..a145a24cd8
--- /dev/null
+++ b/scripts/adjust_autoksyms.sh
@@ -0,0 +1,97 @@
+#!/bin/sh
+
+# Script to create/update include/generated/autoksyms.h and dependency files
+#
+# Copyright:   (C) 2016  Linaro Limited
+# Created by:  Nicolas Pitre, January 2016
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License version 2 as
+# published by the Free Software Foundation.
+
+# Create/update the include/generated/autoksyms.h file from the list
+# of all module's needed symbols as recorded on the third line of
+# .tmp_versions/*.mod files.
+#
+# For each symbol being added or removed, the corresponding dependency
+# file's timestamp is updated to force a rebuild of the affected source
+# file. All arguments passed to this script are assumed to be a command
+# to be exec'd to trigger a rebuild of those files.
+
+set -e
+
+cur_ksyms_file="include/generated/autoksyms.h"
+new_ksyms_file="include/generated/autoksyms.h.tmpnew"
+
+info() { [ "$quiet" != "silent_" ] && printf "  %-7s %s\n" "$1" "$2"; }
+
+info "CHK" "$cur_ksyms_file"
+
+# Use "make V=1" to debug this script.
+case "$KBUILD_VERBOSE" in
+*1*)
+   set -x
+   ;;
+esac
+
+# We need access to CONFIG_ symbols
+case "${KCONFIG_CONFIG}" in
+*/*)
+   . "${KCONFIG_CONFIG}"
+   ;;
+*)
+   # Force using a file from the current directory
+   . "./${KCONFIG_CONFIG}"
+esac
+
+# In case it doesn't exist yet...
+[ -e "$cur_ksyms_file" ] || touch "$cur_ksyms_file"
+
+# Generate a new ksym list file with symbols needed by the current
+# set of modules.
+cat > "$new_ksyms_file" << EOT
+/*
+ * Automatically generated file; DO NOT EDIT.
+ */
+
+EOT
+sed -ns -e '3s/ /\n/gp' "$MODVERDIR"/*.mod | sort -u |
+while read sym; do
+   [ -n "$CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX" ] && sym="${sym#_}"
+   echo "#define __KSYM_${sym} 1"
+done >> "$new_ksyms_file"
+
+# Special case for modversions (see modpost.c)
+if [ -n "$CONFIG_MODVERSIONS" ]; then
+   echo "#define __KSYM_module_layout 1" >> "$new_ksyms_file"
+fi
+
+# Extract changes between old and new list and touch corresponding
+# dependency files.
+# Note: sort -m doesn't work well with underscore prefixed symbols so we
+# use 'cat ... | sort' instead.
+changed=$(
+count=0
+cat "$cur_ksyms_file" "$new_ksyms_file" | sort | uniq -u |
+sed -n 's/^#define __KSYM_\(.*\) 1/\1/p' | tr "A-Z_" "a-z/" |
+while read sympath; do
+   [ -z "$sympath" ] && continue
+   depfile="include/config/ksym/${sympath}.h"
+   mkdir -p "$(dirname "$depfile")"
+   touch "$depfile"
+   echo $((count += 1))
+done | tail -1 )
+changed=${changed:-0}
+
+if [ $changed -gt 0 ]; then
+   # Replace the old list with the new one
+   old=$(grep -c "^#define __

[PATCH v4 4/8] kbuild: de-duplicate fixdep usage

2016-02-28 Thread Nicolas Pitre
The generation and postprocessing of automatic dependency rules is
duplicated in rule_cc_o_c and if_changed_dep. Since this is not a
trivial one-liner action, it is now abstracted under cmd_and_fixdep
to simplify things and make future changes easier.

In the rule_cc_o_c case that means the order of some commands has been
altered, namely fixdep and related file manipulations are executed
earlier, but they didn't depend on those commands that now execute later.

Signed-off-by: Nicolas Pitre 
---
 scripts/Kbuild.include | 5 -
 scripts/Makefile.build | 9 ++---
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 1db6d73c8d..8a257fa663 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -256,10 +256,13 @@ if_changed = $(if $(strip $(any-prereq) $(arg-check)),
   \
 # Execute the command and also postprocess generated .d dependencies file.
 if_changed_dep = $(if $(strip $(any-prereq) $(arg-check) ),  \
@set -e; \
+   $(cmd_and_fixdep))
+
+cmd_and_fixdep = \
$(echo-cmd) $(cmd_$(1)); \
scripts/basic/fixdep $(depfile) $@ '$(make-cmd)' > $(dot-target).tmp;\
rm -f $(depfile);\
-   mv -f $(dot-target).tmp $(dot-target).cmd)
+   mv -f $(dot-target).tmp $(dot-target).cmd;
 
 # Usage: $(call if_changed_rule,foo)
 # Will check if $(cmd_foo) or any of the prerequisites changed,
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index f4b4320e0d..8134ee81ad 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -243,14 +243,9 @@ endif
 
 define rule_cc_o_c
$(call echo-cmd,checksrc) $(cmd_checksrc) \
-   $(call echo-cmd,cc_o_c) $(cmd_cc_o_c);\
+   $(call cmd_and_fixdep,cc_o_c) \
$(cmd_modversions)\
-   $(call echo-cmd,record_mcount)\
-   $(cmd_record_mcount)  \
-   scripts/basic/fixdep $(depfile) $@ '$(call make-cmd,cc_o_c)' >\
- $(dot-target).tmp;  \
-   rm -f $(depfile); \
-   mv -f $(dot-target).tmp $(dot-target).cmd
+   $(call echo-cmd,record_mcount) $(cmd_record_mcount)
 endef
 
 # List module undefined symbols (or empty line if not enabled)
-- 
2.5.0



[PATCH v4 5/8] kbuild: add fine grained build dependencies for exported symbols

2016-02-28 Thread Nicolas Pitre
Like with kconfig options, we now have the ability to compile in and
out individual EXPORT_SYMBOL() declarations based on the content of
include/generated/autoksyms.h.  However we don't want the entire
world to be rebuilt whenever that file is touched.

Let's apply the same build dependency trick used for CONFIG_* symbols
where the time stamp of empty files whose paths matching those symbols
is used to trigger fine grained rebuilds. In our case the key is the
symbol name passed to EXPORT_SYMBOL().

However, unlike config options, we cannot just use fixdep to parse
the source code for EXPORT_SYMBOL(ksym) because several variants exist
and parsing them all in a separate tool, and keeping it in sync, is
not trivially maintainable.  Furthermore, there are variants such as

EXPORT_SYMBOL_GPL(pci_user_read_config_##size);

that are instantiated via a macro for which we can't easily determine
the actual exported symbol name(s) short of actually running the
preprocessor on them.

Storing the symbol name string in a special ELF section doesn't work
for targets that output assembly or preprocessed source.

So the best way is really to leverage the preprocessor by having it emit
a warning for each EXPORT_SYMBOL() instance and having the build system
filter those out of stderr. Then the list of symbols is simply fed
to fixdep to be merged with the other dependencies.
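
To show the idea behind the filtering step, here is a C sketch of what is
done to a single line of compiler output. It is a simplification and not the
sed script from the patch: extract_ksym() is a hypothetical helper, the
sample warning line in the note below is only illustrative of the format,
and continuation lines are not handled.

```c
#include <stddef.h>
#include <string.h>

#define KSYM_MARKER "KBUILD_AUTOKSYM_DEP: "

/*
 * If the line carries the marker, write "KSYM_<name>" to out and
 * return 1; otherwise return 0 (the line would be passed through).
 */
static int extract_ksym(const char *line, char *out, size_t outsz)
{
	static const char prefix[] = "KSYM_";
	size_t plen = sizeof(prefix) - 1;
	const char *p = strstr(line, KSYM_MARKER);
	size_t len;

	if (!p)
		return 0;

	p += strlen(KSYM_MARKER);
	len = strcspn(p, " \t\r\n");	/* symbol name ends at whitespace */
	if (len == 0 || plen + len + 1 > outsz)
		return 0;

	memcpy(out, prefix, plen);
	memcpy(out + plen, p, len);
	out[plen + len] = '\0';
	return 1;
}
```

A line such as "x.c:1:1: warning: KBUILD_AUTOKSYM_DEP: my_sym" would yield
"KSYM_my_sym", which fixdep -e then turns into a dependency on
include/config/ksym/my/sym.h.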

Because of the lowercasing performed by fixdep, there might be name
collisions triggering spurious rebuilds for similar symbols. But this
shouldn't be a big issue in practice. (This is the case for CONFIG_*
symbols and I didn't want to be different here, whatever the original
reason for doing so.)

To avoid needless build overhead, the exported symbol name gathering is
performed only when CONFIG_TRIM_UNUSED_KSYMS is selected.

Signed-off-by: Nicolas Pitre 
Acked-by: Rusty Russell 
---
 include/linux/export.h | 16 ++--
 scripts/Kbuild.include | 28 
 scripts/basic/fixdep.c |  1 +
 3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/include/linux/export.h b/include/linux/export.h
index 77afdb2a25..794392102d 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -76,8 +76,20 @@ extern struct module __this_module;
___cond_export_sym(sym, sec, conf)
 #define ___cond_export_sym(sym, sec, enabled)  \
__cond_export_sym_##enabled(sym, sec)
-#define __cond_export_sym_1(sym, sec) ___EXPORT_SYMBOL(sym, sec)
-#define __cond_export_sym_0(sym, sec) /* nothing */
+#define __cond_export_sym_1(sym, sec)  \
+   __KSYM_DEP(sym) ___EXPORT_SYMBOL(sym, sec)
+#define __cond_export_sym_0(sym, sec)  \
+   __KSYM_DEP(sym) /* nothing */
+
+/*
+ * For fine grained build dependencies, we want to tell the build system
+ * about each possible exported symbol even if they're not actually exported.
+ * This is accomplished with a preprocessor warning that gets captured by
+ * the make rule (see ksym_dep_filter in scripts/Kbuild.include).
+ */
+#define __KSYM_DEP(sym) __pragma_string( KBUILD_AUTOKSYM_DEP: sym )
+#define __pragma_string(x) __emit_pragma( GCC warning #x )
+#define __emit_pragma(x) _Pragma(#x)
 
 #else
 #define __EXPORT_SYMBOL ___EXPORT_SYMBOL
diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 8a257fa663..0b69479310 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -258,12 +258,40 @@ if_changed_dep = $(if $(strip $(any-prereq) $(arg-check) ),  \
@set -e; \
$(cmd_and_fixdep))
 
+ifndef CONFIG_TRIM_UNUSED_KSYMS
+
 cmd_and_fixdep = \
$(echo-cmd) $(cmd_$(1)); \
scripts/basic/fixdep $(depfile) $@ '$(make-cmd)' > $(dot-target).tmp;\
rm -f $(depfile);\
mv -f $(dot-target).tmp $(dot-target).cmd;
 
+else
+
+# Filter out exported kernel symbol names advertised as warning pragmas
+# by the preprocessor and write them to $(1). We must consider continuation
+# lines as well: they start with a blank, or the preceding line ends with
+# a ':'. Anything else is passed through as is.
+# See also __KSYM_DEP() in include/linux/export.h.
+ksym_dep_filter = sed -n \
+   -e '1 {x; $$!d}' \
+   -e '/^ / {H; $$!d}' \
+   -e 'x; /:$$/ {x; H; $$!d; s/^/ /; x}' \
+   -e ':filter; /^.*KBUILD_AUTOKSYM_DEP: /! {p; b next}' \
+   -e 's//KSYM_/; s/\n.*//; w $(1)' \
+   -e ':next; $$!d' \
+   -e '1 q; s/^/ /; x; /^ /! b filter'
+
+cmd_and_fixdep = \
+   $(echo-cmd)  \
+   $(cmd_$(1)) 2>&1 | $(call ksym_dep_filter,$(dot-target).ksym.tmp) >&2;\
+   scripts/basic/fixdep -e $(depfile) $@

[PATCH v4 0/8] [PULL REQUEST] Trim unused exported kernel symbols

2016-02-28 Thread Nicolas Pitre
This patch series provides the option to omit exported symbols from
the kernel and modules that are never referenced by any of the selected
modules in the current kernel configuration. This allows for optimizing
the compiled code and reducing final binaries' size. When using LTO the
binary size reduction is even more effective. It could also be argued
that this could bring some security advantages.

The original cover letter with lots of test results can be found here:

https://lkml.org/lkml/2016/2/8/813

Please consider for merging into your tree. Alternately, the following
branch can be merged:

http://git.linaro.org/people/nicolas.pitre/linux.git autoksyms

Thanks.

Changes from v3:

- Shell portability changes to adjust_autoksyms.sh, partly from
  suggestions by Zev Weiss.

- Fix sample modules by building them before adjust_autoksyms.sh is run.

Changes from v2:

- Generating the build dependencies by parsing the source with fixdep
  turned out to be unreliable due to all the EXPORT_SYMBOL() variants,
  and especially their use within macros where the actual symbol name
  is known only after running the preprocessor. This list of symbol names
  is now obtained from the preprocessor directly, fixing allmodconfig
  builds.

Changes from v1:

- Replaced "exp" that doesn't convey the right meaning as noted by
  Sam Ravnborg. The "ksym" identifier is actually what the kernel
  already uses for this. Therefore:
  - CONFIG_TRIM_UNUSED_EXPSYMS --> CONFIG_TRIM_UNUSED_KSYMS
  - include/generated/expsyms.h --> include/generated/autoksyms.h
  - #define __EXPSYM_* --> #define __KSYM_*

- Some sed regexp improvements as suggested by Al Viro.

- Renamed vmlinux_recursive target to autoksyms_recursive.

- Accept EXPORT_SYMBOL variants with a prefix, e.g. ACPI_EXPORT_SYMBOL.

- Minor commit log clarifications.

- Added Rusty's ACK.

diffstat:

 Makefile                    | 23 +++--
 include/linux/export.h      | 34 -
 init/Kconfig                | 16 ++
 scripts/Kbuild.include      | 33 -
 scripts/Makefile.build      | 22 +
 scripts/adjust_autoksyms.sh | 97 +
 scripts/basic/fixdep.c      | 61 +--
 7 files changed, 256 insertions(+), 30 deletions(-)





[PATCH v4 7/8] kbuild: build sample modules along with the rest of the kernel

2016-02-28 Thread Nicolas Pitre
Make sample modules in parallel with the rest of the kernel rather
than having them built from the vmlinux target. This makes the build
slightly faster, and those modules are properly considered when
adjust_autoksyms.sh is executed.

Signed-off-by: Nicolas Pitre 
---
 Makefile | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index bb865095ca..f5daa4bbf3 100644
--- a/Makefile
+++ b/Makefile
@@ -928,9 +928,6 @@ endif
 ifdef CONFIG_HEADERS_CHECK
$(Q)$(MAKE) -f $(srctree)/Makefile headers_check
 endif
-ifdef CONFIG_SAMPLES
-   $(Q)$(MAKE) $(build)=samples
-endif
 ifdef CONFIG_BUILD_DOCSRC
$(Q)$(MAKE) $(build)=Documentation
 endif
@@ -948,6 +945,11 @@ PHONY += autoksyms_recursive
 include/generated/autoksyms.h: FORCE
$(Q)$(CONFIG_SHELL) scripts/adjust_autoksyms.sh true
 
+# Build samples along the rest of the kernel
+ifdef CONFIG_SAMPLES
+vmlinux-dirs += samples
+endif
+
 # The actual objects are generated when descending,
 # make sure no implicit rule kicks in
 $(sort $(vmlinux-deps)): $(vmlinux-dirs) ;
-- 
2.5.0



[PATCH v4 8/8] kconfig option for TRIM_UNUSED_KSYMS

2016-02-28 Thread Nicolas Pitre
The config option to enable it all.

Signed-off-by: Nicolas Pitre 
Acked-by: Rusty Russell 
---
 init/Kconfig | 16 
 1 file changed, 16 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index 22320804fb..e6f666331b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1990,6 +1990,22 @@ config MODULE_COMPRESS_XZ
 
 endchoice
 
+config TRIM_UNUSED_KSYMS
+   bool "Trim unused exported kernel symbols"
+   depends on MODULES && !UNUSED_SYMBOLS
+   help
+ The kernel and some modules make many symbols available for
+ other modules to use via EXPORT_SYMBOL() and variants. Depending
+ on the set of modules being selected in your kernel configuration,
+ many of those exported symbols might never be used.
+
+ This option allows for unused exported symbols to be dropped from
+ the build. In turn, this provides the compiler more opportunities
+ (especially when using LTO) for optimizing the code and reducing
+ binary size.  This might have some security advantages as well.
+
+ If unsure say N.
+
 endif # MODULES
 
 config MODULES_TREE_LOOKUP
-- 
2.5.0



[PATCH v4 1/8] kbuild: record needed exported symbols for modules

2016-02-28 Thread Nicolas Pitre
Kernel modules are partially linked object files with some undefined
symbols that are expected to be matched with EXPORT_SYMBOL() entries
from elsewhere.

Each .tmp_versions/*.mod file currently contains two lines of text
separated by a newline character. The first line has the actual module
file name while the second line has a list of object files constituting
that module. Those files are parsed by modpost (scripts/mod/sumversion.c),
scripts/Makefile.modpost, scripts/Makefile.modsign, etc.  Only the
modpost utility cares about the second line while the others retrieve
only the first line.

Therefore we can add a third line to record the list of undefined symbols
aka required EXPORT_SYMBOL() entries for each module into that file
without breaking anything. Like for the second line, symbols are separated
by a blank and the list is terminated with a newline character.
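
As an illustration of the resulting three-line layout, here is a minimal
userspace sketch of a reader for it. It is not code from modpost or from
this series: the function names are hypothetical, and it assumes the whole
.mod file has already been read into a NUL-terminated buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the start of line `n` (0-based) inside the
 * NUL-terminated text `buf`, or NULL if the text has fewer lines.
 */
static const char *mod_line(const char *buf, int n)
{
	while (n-- > 0) {
		buf = strchr(buf, '\n');
		if (!buf)
			return NULL;
		buf++;
	}
	return buf;
}

/*
 * Count the blank-separated symbols on the third line of a .mod file
 * held in `buf`. Returns 0 when that line is missing or empty, which
 * is what a build without CONFIG_TRIM_UNUSED_KSYMS would produce.
 */
static int mod_count_undef_syms(const char *buf)
{
	const char *p = mod_line(buf, 2);
	int count = 0;

	if (!p)
		return 0;
	while (*p != '\0' && *p != '\n') {
		/* Skip blanks between symbols. */
		while (*p == ' ')
			p++;
		if (*p == '\0' || *p == '\n')
			break;
		count++;
		/* Skip over the symbol itself. */
		while (*p != '\0' && *p != ' ' && *p != '\n')
			p++;
	}
	return count;
}
```

A reader that only wants the module file name keeps using the first line
and never looks at the third one, which is why adding it breaks nothing.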

To avoid needless build overhead, the undefined symbols extraction is
performed only when CONFIG_TRIM_UNUSED_KSYMS is selected.

Signed-off-by: Nicolas Pitre 
Acked-by: Rusty Russell 
---
 scripts/Makefile.build | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 2c47f9c305..f4b4320e0d 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -253,6 +253,13 @@ define rule_cc_o_c
mv -f $(dot-target).tmp $(dot-target).cmd
 endef
 
+# List module undefined symbols (or empty line if not enabled)
+ifdef CONFIG_TRIM_UNUSED_KSYMS
+cmd_undef_syms = $(NM) $@ | sed -n 's/^ \+U //p' | xargs echo
+else
+cmd_undef_syms = echo
+endif
+
 # Built-in and composite module parts
 $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
$(call cmd,force_checksrc)
@@ -263,7 +270,8 @@ $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
 $(single-used-m): $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
$(call cmd,force_checksrc)
$(call if_changed_rule,cc_o_c)
-   @{ echo $(@:.o=.ko); echo $@; } > $(MODVERDIR)/$(@F:.o=.mod)
+   @{ echo $(@:.o=.ko); echo $@; \
+  $(cmd_undef_syms); } > $(MODVERDIR)/$(@F:.o=.mod)
 
 quiet_cmd_cc_lst_c = MKLST   $@
   cmd_cc_lst_c = $(CC) $(c_flags) -g -c -o $*.o $< && \
@@ -393,7 +401,8 @@ $(call multi_depend, $(multi-used-y), .o, -objs -y)
 
 $(multi-used-m): FORCE
$(call if_changed,link_multi-m)
-   @{ echo $(@:.o=.ko); echo $(link_multi_deps); } > $(MODVERDIR)/$(@F:.o=.mod)
+   @{ echo $(@:.o=.ko); echo $(link_multi_deps); \
+  $(cmd_undef_syms); } > $(MODVERDIR)/$(@F:.o=.mod)
 $(call multi_depend, $(multi-used-m), .o, -objs -y -m)
 
 targets += $(multi-used-y) $(multi-used-m)
-- 
2.5.0



Re: multipath: I/O hanging forever

2016-02-28 Thread Andrea Righi
On Sun, Feb 28, 2016 at 06:53:33PM -0700, Andrea Righi wrote:
... 
> I'm using 4.5.0-rc5+, from Linus' git. I'll try to do a git bisect
> later, I'm pretty sure this problem has been introduced recently (i.e.,
> I've never seen this issue with 4.1.x).

I confirm, just tested kernel 4.1 and this problem doesn't happen.

Thanks,
-Andrea


RE: [PATCH v2 3/4] mtd:spi-nor:fsl-quadspi:Add fast-read mode support

2016-02-28 Thread Yunhui Cui
Hi Han,

> But I don't think QuadSPI driver need to check the m25p,fast-read property
> again since spi-nor layer has already done that. Adding the property in flash
> node should work in the same way.

[Yunhui]: The fsl-quadspi driver has three read modes: fast read, quad read,
and DDR quad read. I have to specify the last parameter (mode) of
spi_nor_scan(); otherwise the flash is still set to quad mode.
From spi-nor.c:

    if (mode == SPI_NOR_QUAD && info->flags & SPI_NOR_QUAD_READ) {
            ret = set_quad_mode(nor, info);
            if (ret) {
                    dev_err(dev, "quad mode not supported\n");
                    return ret;
            }
            nor->flash_read = SPI_NOR_QUAD;
    } else if (mode == SPI_NOR_DUAL && info->flags & SPI_NOR_DUAL_READ) {
            nor->flash_read = SPI_NOR_DUAL;
    }


Thanks 
Yunhui
-Original Message-
From: Han Xu [mailto:xhnj...@gmail.com] 
Sent: Saturday, February 27, 2016 12:32 AM
To: Yunhui Cui
Cc: Yunhui Cui; dw...@infradead.org; computersforpe...@gmail.com; 
han...@freescale.com; linux-...@lists.infradead.org; 
linux-kernel@vger.kernel.org; linux-arm-ker...@lists.infradead.org; Yao Yuan
Subject: Re: [PATCH v2 3/4] mtd:spi-nor:fsl-quadspi:Add fast-read mode support

On Thu, Feb 25, 2016 at 08:07:22AM +, Yunhui Cui wrote:
> Hi Han,
> 
> I have provided the options " m25p,fast-read ", because there are probable 
> some flashes can't support quad mode.
> So we should support fast-read mode in our driver. Moreover,  There is a 
> option to select fast-read mode in spi_nor.c :
>/* If we were instantiated by DT, use it */
>  if (of_property_read_bool(np, "m25p,fast-read"))
>  nor->flash_read = SPI_NOR_FAST;

Do you have any REAL cases of a SPI NOR that only supports up to fast-read
being used with the Quad SPI driver? Neither fast-read nor normal-read (which
is actually more general) is supported in the driver, simply because I haven't
seen any REAL cases so far.

I didn't run against the patch, although IMO it's not that necessary. But I 
don't think QuadSPI driver need to check the m25p,fast-read property again 
since spi-nor layer has already done that. Adding the property in flash node 
should work in the same way.
 
> 
> Thanks
> Yunhui
> 
> -Original Message-
> From: Han Xu [mailto:xhnj...@gmail.com]
> Sent: Thursday, February 18, 2016 2:08 AM
> To: Yunhui Cui
> Cc: dw...@infradead.org; computersforpe...@gmail.com; 
> han...@freescale.com; linux-...@lists.infradead.org; 
> linux-kernel@vger.kernel.org; linux-arm-ker...@lists.infradead.org; 
> Yao Yuan
> Subject: Re: [PATCH v2 3/4] mtd:spi-nor:fsl-quadspi:Add fast-read mode 
> support
> 
> On Mon, Feb 01, 2016 at 07:30:07PM +0800, Yunhui Cui wrote:
> > The qspi driver adds a generic fast-read mode for different flash
> > vendors. Flashes on different boards work in different modes, such
> > as fast-read and quad-mode.
> > So we have to modify the third parameter passed to spi_nor_scan().
> > 
> > Signed-off-by: Yunhui Cui 
> > ---
> >  drivers/mtd/spi-nor/fsl-quadspi.c | 27 +--
> >  1 file changed, 21 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/mtd/spi-nor/fsl-quadspi.c
> > b/drivers/mtd/spi-nor/fsl-quadspi.c
> > index 9861290..0a31cb1 100644
> > --- a/drivers/mtd/spi-nor/fsl-quadspi.c
> > +++ b/drivers/mtd/spi-nor/fsl-quadspi.c
> > @@ -389,11 +389,21 @@ static void fsl_qspi_init_lut(struct fsl_qspi *q)
> > /* Read */
> > lut_base = SEQID_READ * 4;
> >  
> > -   qspi_writel(q, LUT0(CMD, PAD1, read_op) | LUT1(ADDR, PAD1, addrlen),
> > -   base + QUADSPI_LUT(lut_base));
> > -   qspi_writel(q, LUT0(DUMMY, PAD1, read_dm) |
> > -   LUT1(FSL_READ, PAD4, rxfifo),
> > -   base + QUADSPI_LUT(lut_base + 1));
> > +   if (nor->flash_read == SPI_NOR_FAST) {
> > +   qspi_writel(q, LUT0(CMD, PAD1, read_op) |
> > +   LUT1(ADDR, PAD1, addrlen),
> > +   base + QUADSPI_LUT(lut_base));
> > +   qspi_writel(q,  LUT0(DUMMY, PAD1, read_dm) |
> > +   LUT1(FSL_READ, PAD1, rxfifo),
> > +   base + QUADSPI_LUT(lut_base + 1));
> > +   } else if (nor->flash_read == SPI_NOR_QUAD) {
> > +   qspi_writel(q, LUT0(CMD, PAD1, read_op) |
> > +   LUT1(ADDR, PAD1, addrlen),
> > +   base + QUADSPI_LUT(lut_base));
> > +   qspi_writel(q, LUT0(DUMMY, PAD1, read_dm) |
> > +   LUT1(FSL_READ, PAD4, rxfifo),
> > +   base + QUADSPI_LUT(lut_base + 1));
> > +   }
> >  
> > /* Write enable */
> > lut_base = SEQID_WREN * 4;
> > @@ -468,6 +478,7 @@ static int fsl_qspi_get_seqid(struct fsl_qspi *q, u8 cmd)
> >  {
> > switch (cmd) {
> > case SPINOR_OP_READ_1_1_4:
> > +   case SPINOR_OP_READ_FAST:

Re: [PATCH] bus: imx-weim: Take the 'status' property value into account

2016-02-28 Thread Shawn Guo
On Mon, Feb 22, 2016 at 09:01:53AM -0300, Fabio Estevam wrote:
> From: Fabio Estevam 
> 
> Currently we have an incorrect behaviour when multiple devices
> are present under the weim node. For example:
> 
> &weim {
>   ...
>   status = "okay";
>   
>   sram@0,0 {
>   ...
>   status = "okay";
>   };
> 
>   mram@0,0 {
>   ...
>   status = "disabled";
>   };
> };
> 
> In this case only the 'sram' device should be probed and not 'mram'.
> 
> However what happens currently is that the status variable is ignored,
> causing the 'sram' device to be disabled and 'mram' to be enabled.  
> 
> Change the weim_parse_dt() function to use
> for_each_available_child_of_node() so that the devices marked with
> 'status = disabled' are not probed.
> 
> Cc: 
> Suggested-by: Wolfgang Netbal 
> Signed-off-by: Fabio Estevam 

Acked-by: Shawn Guo 

Arnd, Olof,

I do not have any other 'driver' patches queued, so please help directly
apply this one.  Considering this fixes a real problem, it would be good
if we can merge this through -rc.  But we understand that it's -rc6 now,
and this doesn't fix a regression or so-critical issue, so it should be
fine to queue the patch for the next release as well.

Shawn

> ---
>  drivers/bus/imx-weim.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/bus/imx-weim.c b/drivers/bus/imx-weim.c
> index e98d15e..1827fc4 100644
> --- a/drivers/bus/imx-weim.c
> +++ b/drivers/bus/imx-weim.c
> @@ -150,7 +150,7 @@ static int __init weim_parse_dt(struct platform_device *pdev,
>   return ret;
>   }
>  
> - for_each_child_of_node(pdev->dev.of_node, child) {
> + for_each_available_child_of_node(pdev->dev.of_node, child) {
>   if (!child->name)
>   continue;
>  
> -- 
> 1.9.1
> 
> 


Re: [PATCH 2/5] oom reaper: handle mlocked pages

2016-02-28 Thread Hugh Dickins
On Tue, 23 Feb 2016, Michal Hocko wrote:
> On Mon 22-02-16 17:36:07, David Rientjes wrote:
> > 
> > Are we concerned about munlock_vma_pages_all() taking lock_page() and 
> > perhaps stalling forever, the same way it would stall in exit_mmap() for 
> > VM_LOCKED vmas, if another thread has locked the same page and is doing an 
> > allocation?
> 
> This is a good question. I have checked for that particular case
> previously and managed to convince myself that this is OK(ish).
> munlock_vma_pages_range locks only THP pages to prevent from the
> parallel split-up AFAICS.

I think you're mistaken on that: there is also the lock_page()
on every page in Phase 2 of __munlock_pagevec().

> And split_huge_page_to_list doesn't seem
> to depend on an allocation. It can block on anon_vma lock but I didn't
> see any allocation requests from there either. I might be missing
> something of course. Do you have any specific path in mind?
> 
> > I'm wondering if in that case it would be better to do a 
> > best-effort munlock_vma_pages_all() with trylock_page() and just give up 
> > on releasing memory from that particular vma.  In that case, there may be 
> > other memory that can be freed with unmap_page_range() that would handle 
> > this livelock.

I agree with David, that we ought to trylock_page() throughout munlock:
just so long as it gets to do the TestClearPageMlocked without demanding
page lock, the rest is the usual sugarcoating for accurate Mlocked stats,
and leave the rest for reclaim to fix up.
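
A userspace analogy of that best-effort scheme, as a rough sketch only:
nothing here is kernel code, every name is hypothetical, and a C11
atomic_flag stands in for the page lock while a plain bool stands in for
the Mlocked flag.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct page_sketch {
	atomic_flag locked;	/* stand-in for the page lock */
	bool mlocked;		/* stand-in for the Mlocked flag */
};

/* Try to take the lock; never blocks. Returns true on success. */
static bool sketch_trylock(struct page_sketch *p)
{
	return !atomic_flag_test_and_set(&p->locked);
}

static void sketch_unlock(struct page_sketch *p)
{
	atomic_flag_clear(&p->locked);
}

/*
 * Best-effort munlock: always clear the mlocked flag, which needs no
 * lock, and do the lock-protected bookkeeping only when the lock can
 * be taken without waiting. Returns how many pages were skipped and
 * left for a later pass (reclaim, in the discussion above) to fix up.
 */
static size_t sketch_munlock_best_effort(struct page_sketch *pages, size_t n)
{
	size_t i, skipped = 0;

	for (i = 0; i < n; i++) {
		pages[i].mlocked = false;
		if (!sketch_trylock(&pages[i])) {
			skipped++;
			continue;
		}
		/* Accounting that needs the lock would go here. */
		sketch_unlock(&pages[i]);
	}
	return skipped;
}
```

The point of the shape is that the caller can never stall on a lock held
by a thread that is itself waiting for memory; the cost is that some
statistics stay briefly inaccurate.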

> 
> I have tried to code it up but I am not really sure the whole churn is
> really worth it - unless I am missing something that would really make
> the THP case likely to hit in the real life.

Though I must have known about it forever, it was a shock to see all
those page locks demanded in exit, brought home to us a week or so ago.

The proximate cause in this case was my own change, to defer pte_alloc
to suit huge tmpfs: it had not previously occurred to me that I was
now doing the pte_alloc while __do_fault holds page lock.  Bad Hugh.
But change not yet upstream, so not so urgent for you.

From time immemorial, free_swap_and_cache() and free_swap_cache() only
ever trylock a page, precisely so that they never hold up munmap or exit
(well, if I looked harder, I might find lock ordering reasons too).

> 
> Just for the reference this is what I came up with (just compile tested).

I tried something similar internally (on an earlier kernel).  Like
you I've set that work aside for now, there were quicker ways to fix
the issue at hand.  But it does continue to offend me that munlock
demands all those page locks: so if you don't get back to it before me,
I shall eventually.

I didn't understand why you complicated yours with the "enforce"
arg to munlock_vma_pages_range(): why not just trylock in all cases?

Hugh


Re: [PATCH] cpufreq: Select IRQ_WORK if CPU_FREQ_GOV_COMMON is set

2016-02-28 Thread Viresh Kumar
On 28-02-16, 02:33, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki 
> 
> Commit 8fb47ff100af (cpufreq: governor: Replace timers with utilization
> update callbacks) made CPU_FREQ select IRQ_WORK, but that's not
> necessary, as it is sufficient for IRQ_WORK to be selected by
> CPU_FREQ_GOV_COMMON, so modify the cpufreq Kconfig to that effect.
> 
> Signed-off-by: Rafael J. Wysocki 
> ---
> 
> On top of linux-next.
> 
> Thanks,
> Rafael
> 
> ---
>  drivers/cpufreq/Kconfig |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/cpufreq/Kconfig
> ===
> --- linux-pm.orig/drivers/cpufreq/Kconfig
> +++ linux-pm/drivers/cpufreq/Kconfig
> @@ -3,7 +3,6 @@ menu "CPU Frequency scaling"
>  config CPU_FREQ
>   bool "CPU Frequency scaling"
>   select SRCU
> - select IRQ_WORK
>   help
> CPU Frequency scaling allows you to change the clock speed of 
> CPUs on the fly. This is a nice method to save power, because 
> @@ -20,6 +19,7 @@ config CPU_FREQ
>  if CPU_FREQ
>  
>  config CPU_FREQ_GOV_COMMON
> + select IRQ_WORK
>   bool
>  
>  config CPU_FREQ_BOOST_SW

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH 09/50] pinctrl: imx: Use devm_pinctrl_register() for pinctrl registration

2016-02-28 Thread Shawn Guo
On Wed, Feb 24, 2016 at 06:45:34PM +0530, Laxman Dewangan wrote:
> Use devm_pinctrl_register() for pin control registration and remove
> need of .remove callback.
> 
> Signed-off-by: Laxman Dewangan 
> Cc: Shawn Guo 

Acked-by: Shawn Guo 


linux-next: manual merge of the drm tree with Linus' tree

2016-02-28 Thread Stephen Rothwell
Hi Dave,

Today's linux-next merge of the drm tree got a conflict in:

  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c

between commit:

  e1d09dc0ccc6 ("drm/amdgpu: Don't hang in amdgpu_flip_work_func on disabled crtc.")

from Linus' tree and commit:

  6bd9e877ce53 ("drm/amdgpu: Move MMIO flip out of spinlocked region")

from the drm tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 8297bc319369,2cb53c24dec0..
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@@ -72,13 -70,16 +70,16 @@@ static void amdgpu_flip_work_func(struc
  
struct drm_crtc *crtc = &amdgpuCrtc->base;
unsigned long flags;
 -  unsigned i;
 -  int vpos, hpos, stat, min_udelay;
 +  unsigned i, repcnt = 4;
 +  int vpos, hpos, stat, min_udelay = 0;
struct drm_vblank_crtc *vblank = &crtc->dev->vblank[work->crtc_id];
  
-   amdgpu_flip_wait_fence(adev, &work->excl);
+   if (amdgpu_flip_handle_fence(work, &work->excl))
+   return;
+ 
for (i = 0; i < work->shared_count; ++i)
-   amdgpu_flip_wait_fence(adev, &work->shared[i]);
+   if (amdgpu_flip_handle_fence(work, &work->shared[i]))
+   return;
  
/* We borrow the event spin lock for protecting flip_status */
spin_lock_irqsave(&crtc->dev->event_lock, flags);
@@@ -123,19 -119,12 +124,19 @@@
spin_lock_irqsave(&crtc->dev->event_lock, flags);
};
  
 +  if (!repcnt)
 +  DRM_DEBUG_DRIVER("Delay problem on crtc %d: min_udelay %d, "
 +   "framedur %d, linedur %d, stat %d, vpos %d, "
 +   "hpos %d\n", work->crtc_id, min_udelay,
 +   vblank->framedur_ns / 1000,
 +   vblank->linedur_ns / 1000, stat, vpos, hpos);
 +
-   /* do the flip (mmio) */
-   adev->mode_info.funcs->page_flip(adev, work->crtc_id, work->base);
/* set the flip status */
amdgpuCrtc->pflip_status = AMDGPU_FLIP_SUBMITTED;
- 
spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
+ 
+   /* Do the flip (mmio) */
+   adev->mode_info.funcs->page_flip(adev, work->crtc_id, work->base);
  }
  
  /*


RE: [PATCH 0/4] MSR: MSR: MSR Whitelist and Batch Introduction

2016-02-28 Thread Mcfadden, Marty Jay
> On Sun, Feb 28, 2016, Borislav Petkov  wrote: 
> 
> Can we have some concrete examples for that please?
> 

Our environment allows users to have exclusive access to some
number of compute nodes for a limited time.  Bit-level control of
MSRs is required when a user might gain root or, more commonly,
interfere with subsequent jobs run by other users.

The canonical examples for bitwise control are
MSR_PKG_POWER_LIMIT and MSR_DRAM_POWER_LIMIT.  We
want to provide user space control over power bounds, but if
the lock bit is set the power bound cannot be changed without
rebooting.  As setting very low power bounds can slow
performance by a factor of 4x or worse, leaving the lock bit
writable allows a crude denial-of-service attack.

A second use case for bitwise control is IA32_MISC_ENABLE.  This
MSR controls a wide variety of processor functionality, some of
which is benign ("Performance Energy Bias Hint") and some that
might not be ("Automatic Thermal Control Circuit Enable").  Rather
than do a formal security review of the dozen features controlled
by this MSR, we'd like to take the simpler step of allowing writes
to only what we know is safe.  Note that bit "Enhanced Intel
SpeedStep Technology Select Lock" is a lock bit.
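
The bitwise control described above boils down to a masked write. The
following is only a sketch of that idea, not the code from the patch
series: the function name is hypothetical, and which bits a given
whitelist entry would mark as writable is an assumption left to the
caller.

```c
#include <stdint.h>

/*
 * Merge a requested MSR value with the current one, letting only the
 * bits set in `write_mask` change. Bits outside the mask (a lock bit,
 * for instance) keep their current value no matter what was requested.
 */
static uint64_t msr_masked_write(uint64_t current, uint64_t requested,
				 uint64_t write_mask)
{
	return (current & ~write_mask) | (requested & write_mask);
}
```

For example, with a mask that covers only a power-limit field, a request
that also tries to set a lock bit outside the mask has that bit silently
dropped, so the power bound stays adjustable for the next job.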

Thanks,

Marty McFadden


Re: [PATCH 1/7] extcon: palmas: Drop IRQF_EARLY_RESUME flag

2016-02-28 Thread Chanwoo Choi
Hi Grygorii,

On 2016-02-27 00:42, Grygorii Strashko wrote:
> Palmas extcon IRQs are nested threaded and wired to the Palmas
> interrupt controller. So, this flag is not required for nested irqs
> anymore, since commit 3c646f2c6aa9 ("genirq: Don't suspend
> nested_thread irqs over system suspend") was merged.
> 
> Cc: MyungJoo Ham 
> Cc: Chanwoo Choi 
> Cc: Tony Lindgren 
> Cc: Lee Jones 
> Cc: Roger Quadros 
> Cc: Nishanth Menon 
> Signed-off-by: Grygorii Strashko 
> ---
>  drivers/extcon/extcon-palmas.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/extcon/extcon-palmas.c b/drivers/extcon/extcon-palmas.c
> index 93c30a8..0a861b3 100644
> --- a/drivers/extcon/extcon-palmas.c
> +++ b/drivers/extcon/extcon-palmas.c
> @@ -266,7 +266,7 @@ static int palmas_usb_probe(struct platform_device *pdev)
>   palmas_usb->id_irq,
>   NULL, palmas_id_irq_handler,
>   IRQF_TRIGGER_FALLING | IRQF_TRIGGER_RISING |
> - IRQF_ONESHOT | IRQF_EARLY_RESUME,
> + IRQF_ONESHOT,
>   "palmas_usb_id", palmas_usb);
>   if (status < 0) {
>   dev_err(&pdev->dev, "can't get IRQ %d, err %d\n",
> @@ -304,7 +304,7 @@ static int palmas_usb_probe(struct platform_device *pdev)
>   palmas_usb->vbus_irq, NULL,
>   palmas_vbus_irq_handler,
>   IRQF_TRIGGER_FALLING | IRQF_TRIGGER_RISING |
> - IRQF_ONESHOT | IRQF_EARLY_RESUME,
> + IRQF_ONESHOT,
>   "palmas_usb_vbus", palmas_usb);
>   if (status < 0) {
>   dev_err(&pdev->dev, "can't get IRQ %d, err %d\n",
> 

Applied it on extcon git.

Thanks,
Chanwoo Choi



[PATCH 2/2] staging: dgap: use tty_alloc_driver instead of kcalloc

2016-02-28 Thread Daeseok Youn
From 60b1e6e5d9401f10f584928d4feeb8a3b72b46a9 Mon Sep 17 00:00:00 2001
From: Daeseok Youn 
Date: Mon, 29 Feb 2016 11:04:02 +0900
Subject: [PATCH 2/2] staging: dgap: use tty_alloc_driver instead of kcalloc

tty_alloc_driver() allocates the memory for ttys and termios, and
that memory can be released easily with put_tty_driver().

Signed-off-by: Daeseok Youn 
---
 drivers/staging/dgnc/dgnc_tty.c | 86 +++--
 1 file changed, 31 insertions(+), 55 deletions(-)

diff --git a/drivers/staging/dgnc/dgnc_tty.c b/drivers/staging/dgnc/dgnc_tty.c
index 01a0018..da5cba7 100644
--- a/drivers/staging/dgnc/dgnc_tty.c
+++ b/drivers/staging/dgnc/dgnc_tty.c
@@ -176,9 +176,15 @@ int dgnc_tty_preinit(void)
  */
 int dgnc_tty_register(struct dgnc_board *brd)
 {
-   int rc = 0;
+   int rc;
+
+   brd->serial_driver = tty_alloc_driver(brd->maxports,
+ TTY_DRIVER_REAL_RAW |
+ TTY_DRIVER_DYNAMIC_DEV |
+ TTY_DRIVER_HARDWARE_BREAK);
 
-   brd->serial_driver->magic = TTY_DRIVER_MAGIC;
+   if (IS_ERR(brd->serial_driver))
+   return PTR_ERR(brd->serial_driver);
 
snprintf(brd->SerialName, MAXTTYNAMELEN, "tty_dgnc_%d_", brd->boardnum);
 
@@ -186,31 +192,10 @@ int dgnc_tty_register(struct dgnc_board *brd)
brd->serial_driver->name_base = 0;
brd->serial_driver->major = 0;
brd->serial_driver->minor_start = 0;
-   brd->serial_driver->num = brd->maxports;
brd->serial_driver->type = TTY_DRIVER_TYPE_SERIAL;
brd->serial_driver->subtype = SERIAL_TYPE_NORMAL;
brd->serial_driver->init_termios = DgncDefaultTermios;
brd->serial_driver->driver_name = DRVSTR;
-   brd->serial_driver->flags = (TTY_DRIVER_REAL_RAW |
-  TTY_DRIVER_DYNAMIC_DEV |
-  TTY_DRIVER_HARDWARE_BREAK);
-
-   /*
-* The kernel wants space to store pointers to
-* tty_struct's and termios's.
-*/
-   brd->serial_driver->ttys = kcalloc(brd->maxports,
-sizeof(*brd->serial_driver->ttys),
-GFP_KERNEL);
-   if (!brd->serial_driver->ttys)
-   return -ENOMEM;
-
-   kref_init(&brd->serial_driver->kref);
-   brd->serial_driver->termios = kcalloc(brd->maxports,
-   sizeof(*brd->serial_driver->termios),
-   GFP_KERNEL);
-   if (!brd->serial_driver->termios)
-   return -ENOMEM;
 
/*
 * Entry points for driver.  Called by the kernel from
@@ -224,7 +209,7 @@ int dgnc_tty_register(struct dgnc_board *brd)
if (rc < 0) {
dev_dbg(&brd->pdev->dev,
"Can't register tty device (%d)\n", rc);
-   return rc;
+   goto free_serial_driver;
}
brd->dgnc_Major_Serial_Registered = true;
}
@@ -234,38 +219,26 @@ int dgnc_tty_register(struct dgnc_board *brd)
 * again, separately so we don't get the LD confused about what major
 * we are when we get into the dgnc_tty_open() routine.
 */
-   brd->print_driver->magic = TTY_DRIVER_MAGIC;
+   brd->print_driver = tty_alloc_driver(brd->maxports,
+TTY_DRIVER_REAL_RAW |
+TTY_DRIVER_DYNAMIC_DEV |
+TTY_DRIVER_HARDWARE_BREAK);
+
+   if (IS_ERR(brd->print_driver)) {
+   rc = PTR_ERR(brd->print_driver);
+   goto unregister_serial_driver;
+   }
+
snprintf(brd->PrintName, MAXTTYNAMELEN, "pr_dgnc_%d_", brd->boardnum);
 
brd->print_driver->name = brd->PrintName;
brd->print_driver->name_base = 0;
brd->print_driver->major = brd->serial_driver->major;
brd->print_driver->minor_start = 0x80;
-   brd->print_driver->num = brd->maxports;
brd->print_driver->type = TTY_DRIVER_TYPE_SERIAL;
brd->print_driver->subtype = SERIAL_TYPE_NORMAL;
brd->print_driver->init_termios = DgncDefaultTermios;
brd->print_driver->driver_name = DRVSTR;
-   brd->print_driver->flags = (TTY_DRIVER_REAL_RAW |
- TTY_DRIVER_DYNAMIC_DEV |
- TTY_DRIVER_HARDWARE_BREAK);
-
-   /*
-* The kernel wants space to store pointers to
-* tty_struct's and termios's.  Must be separated from
-* the Serial Driver so we don't get confused
-*/
-   brd->print_driver->ttys = kcalloc(brd->maxports,
-   sizeof(*brd->print_driver->ttys),
-   GFP_KERNEL);
-   if (!brd->p

[PATCH 1/2] staging: dgnc: use pointer type of tty_struct

2016-02-28 Thread Daeseok Youn
From 70f8703b3bd73fa56f4ea91e98967b8925550aa6 Mon Sep 17 00:00:00 2001
From: Daeseok Youn 
Date: Thu, 25 Feb 2016 14:53:37 +0900
Subject: [PATCH 1/2] staging: dgnc: use pointer type of tty_struct

To use tty_alloc_driver(), SerialDriver has to be a pointer.
checkpatch.pl also warns about its CamelCase name, so
SerialDriver is renamed to serial_driver.

Signed-off-by: Daeseok Youn 
---
 drivers/staging/dgnc/dgnc_driver.h |   4 +-
 drivers/staging/dgnc/dgnc_tty.c| 118 ++---
 2 files changed, 61 insertions(+), 61 deletions(-)

diff --git a/drivers/staging/dgnc/dgnc_driver.h b/drivers/staging/dgnc/dgnc_driver.h
index ce7cd9b..1c7a8fa 100644
--- a/drivers/staging/dgnc/dgnc_driver.h
+++ b/drivers/staging/dgnc/dgnc_driver.h
@@ -205,9 +205,9 @@ struct dgnc_board {
 * to our channels.
 */
 
-   struct tty_driver   SerialDriver;
+   struct tty_driver *serial_driver;
charSerialName[200];
-   struct tty_driver   PrintDriver;
+   struct tty_driver *print_driver;
charPrintName[200];
 
booldgnc_Major_Serial_Registered;
diff --git a/drivers/staging/dgnc/dgnc_tty.c b/drivers/staging/dgnc/dgnc_tty.c
index 8b1ba65..01a0018 100644
--- a/drivers/staging/dgnc/dgnc_tty.c
+++ b/drivers/staging/dgnc/dgnc_tty.c
@@ -178,20 +178,20 @@ int dgnc_tty_register(struct dgnc_board *brd)
 {
int rc = 0;
 
-   brd->SerialDriver.magic = TTY_DRIVER_MAGIC;
+   brd->serial_driver->magic = TTY_DRIVER_MAGIC;
 
snprintf(brd->SerialName, MAXTTYNAMELEN, "tty_dgnc_%d_", brd->boardnum);
 
-   brd->SerialDriver.name = brd->SerialName;
-   brd->SerialDriver.name_base = 0;
-   brd->SerialDriver.major = 0;
-   brd->SerialDriver.minor_start = 0;
-   brd->SerialDriver.num = brd->maxports;
-   brd->SerialDriver.type = TTY_DRIVER_TYPE_SERIAL;
-   brd->SerialDriver.subtype = SERIAL_TYPE_NORMAL;
-   brd->SerialDriver.init_termios = DgncDefaultTermios;
-   brd->SerialDriver.driver_name = DRVSTR;
-   brd->SerialDriver.flags = (TTY_DRIVER_REAL_RAW |
+   brd->serial_driver->name = brd->SerialName;
+   brd->serial_driver->name_base = 0;
+   brd->serial_driver->major = 0;
+   brd->serial_driver->minor_start = 0;
+   brd->serial_driver->num = brd->maxports;
+   brd->serial_driver->type = TTY_DRIVER_TYPE_SERIAL;
+   brd->serial_driver->subtype = SERIAL_TYPE_NORMAL;
+   brd->serial_driver->init_termios = DgncDefaultTermios;
+   brd->serial_driver->driver_name = DRVSTR;
+   brd->serial_driver->flags = (TTY_DRIVER_REAL_RAW |
   TTY_DRIVER_DYNAMIC_DEV |
   TTY_DRIVER_HARDWARE_BREAK);
 
@@ -199,28 +199,28 @@ int dgnc_tty_register(struct dgnc_board *brd)
 * The kernel wants space to store pointers to
 * tty_struct's and termios's.
 */
-   brd->SerialDriver.ttys = kcalloc(brd->maxports,
-sizeof(*brd->SerialDriver.ttys),
+   brd->serial_driver->ttys = kcalloc(brd->maxports,
+sizeof(*brd->serial_driver->ttys),
 GFP_KERNEL);
-   if (!brd->SerialDriver.ttys)
+   if (!brd->serial_driver->ttys)
return -ENOMEM;
 
-   kref_init(&brd->SerialDriver.kref);
-   brd->SerialDriver.termios = kcalloc(brd->maxports,
-   sizeof(*brd->SerialDriver.termios),
+   kref_init(&brd->serial_driver->kref);
+   brd->serial_driver->termios = kcalloc(brd->maxports,
+   sizeof(*brd->serial_driver->termios),
GFP_KERNEL);
-   if (!brd->SerialDriver.termios)
+   if (!brd->serial_driver->termios)
return -ENOMEM;
 
/*
 * Entry points for driver.  Called by the kernel from
 * tty_io.c and n_tty.c.
 */
-   tty_set_operations(&brd->SerialDriver, &dgnc_tty_ops);
+   tty_set_operations(brd->serial_driver, &dgnc_tty_ops);
 
if (!brd->dgnc_Major_Serial_Registered) {
/* Register tty devices */
-   rc = tty_register_driver(&brd->SerialDriver);
+   rc = tty_register_driver(brd->serial_driver);
if (rc < 0) {
dev_dbg(&brd->pdev->dev,
"Can't register tty device (%d)\n", rc);
@@ -234,19 +234,19 @@ int dgnc_tty_register(struct dgnc_board *brd)
 * again, separately so we don't get the LD confused about what major
 * we are when we get into the dgnc_tty_open() routine.
 */
-   brd->PrintDriver.magic = TTY_DRIVER_MAGIC;
+   brd->print_driver->magic = TTY_DRIVER_MAGIC;
snprintf(brd->PrintName, MAXTTYNAMELEN, "pr_dgnc_%d

Re: [PATCH v4 01/17] Xen: ACPI: Hide UART used by Xen

2016-02-28 Thread Shannon Zhao


On 2016/2/12 6:22, Rafael J. Wysocki wrote:
> On Thursday, February 11, 2016 04:04:14 PM Stefano Stabellini wrote:
> > On Wed, 10 Feb 2016, Rafael J. Wysocki wrote:
> > > On Tuesday, February 09, 2016 11:19:02 AM Stefano Stabellini wrote:
> > > > On Mon, 8 Feb 2016, Rafael J. Wysocki wrote:
> > > > > On Monday, February 08, 2016 10:57:01 AM Stefano Stabellini wrote:
> > > > > > On Sat, 6 Feb 2016, Rafael J. Wysocki wrote:
> > > > > > > On Fri, Feb 5, 2016 at 4:05 AM, Shannon Zhao wrote:
> > > > > > > > From: Shannon Zhao
> > > > > > > >
> > > > > > > > ACPI 6.0 introduces a new table STAO to list the devices which are used
> > > > > > > > by Xen and can't be used by Dom0. On Xen virtual platforms, the physical
> > > > > > > > UART is used by Xen. So here it hides UART from Dom0.
> > > > > > > >
> > > > > > > > Signed-off-by: Shannon Zhao
> > > > > > > > Reviewed-by: Stefano Stabellini
> > > > > > > >
> > > > > > >
> > > > > > > Well, this doesn't look right to me.
> > > > > > >
> > > > > > > We need to find a nicer way to achieve what you want.
> > > > > >
> > > > > > I take it that you are talking about how to honor the STAO table in Linux.
> > > > > > Do you have any concrete suggestions?
> > > > >
> > > > > I do.
> > > > >
> > > > > The last hunk of the patch is likely what it needs to be, although I'm
> > > > > not sure if the place it is added to is the right one.  That's a minor thing,
> > > > > though.
> > > > >
> > > > > The other part is problematic.  Not that it doesn't work, but because of
> > > > > how it works.  With these changes the device will be visible to the OS (in
> > > > > fact to user space even), but will never be "present".  I'm not sure if
> > > > > that's what you want?
> > > > >
> > > > > It might be better to add a check to acpi_bus_type_and_status() that will
> > > > > evaluate the "should ignore?" thing and return -ENODEV if this is true.  This
> > > > > way the device won't be visible at all.
> > > >
> > > > Something like below?  Actually your suggestion is better, thank you!
> > > >
> > > > diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> > > > index 78d5f02..4778c51 100644
> > > > --- a/drivers/acpi/scan.c
> > > > +++ b/drivers/acpi/scan.c
> > > > @@ -1455,6 +1455,9 @@ static int acpi_bus_type_and_status(acpi_handle handle, int *type,
> > > >  	if (ACPI_FAILURE(status))
> > > >  		return -ENODEV;
> > > >
> > > > +	if (acpi_check_device_is_ignored(handle))
> > > > +		return -ENODEV;
> > > > +
> > > >  	switch (acpi_type) {
> > > >  	case ACPI_TYPE_ANY: /* for ACPI_ROOT_OBJECT */
> > > >  	case ACPI_TYPE_DEVICE:
> > > >
> > >
> > > I thought about doing that under ACPI_TYPE_DEVICE, because it shouldn't be
> > > applicable to the other types.  But generally, yes.
> >
> > I was pondering about it myself. Maybe an ACPI_TYPE_PROCESSOR object
> > could theoretically be hidden with the STAO?
> But this patch won't check for it anyway, will it?
>
> It seems to be only checking against the UART address or have I missed
> anything?
>
> > I added the check before
> > the switch because I thought that there would be no harm in being
> > cautious about it.
> >
> >
> > > Plus I'd move the table checks to acpi_scan_init(), so the UART address can
> > > be a static variable in scan.c.
> > >
> > > Also maybe rename acpi_check_device_is_ignored() to something like
> > > acpi_device_should_be_hidden().
> >
> > Both make sense. Shannon, are you happy to make these changes?
> Plus maybe make acpi_device_should_be_hidden() print a (KERN_INFO) message
> when it decides to hide something?
Ok, will update this patch. Thanks a lot!

-- 
Shannon
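
For reference, the direction agreed on in this thread (do the check only
under ACPI_TYPE_DEVICE, record the UART address once at scan init, rename
the helper, and log when something gets hidden) could be sketched roughly
as below. This is a user-space mock, not the actual patch and not kernel
code: the enum, stao_uart_addr, scan_init(), device_should_be_hidden() and
bus_type_and_status() are all hypothetical stand-ins, and the plain address
comparison is only assumed from the remark that the patch checks against
the UART address.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the ACPI object types named in the thread. */
enum mock_acpi_type { MOCK_TYPE_DEVICE, MOCK_TYPE_PROCESSOR };

/* Assumed: address of the UART claimed by the hypervisor, 0 if none. */
static uint64_t stao_uart_addr;

/* Mock of the one-time table check that would move to scan init. */
static void scan_init(uint64_t uart_addr_from_table)
{
	stao_uart_addr = uart_addr_from_table;
}

/* Mock of the renamed helper; logs when it decides to hide a device. */
static bool device_should_be_hidden(uint64_t dev_addr)
{
	if (!stao_uart_addr || dev_addr != stao_uart_addr)
		return false;

	printf("info: hiding device at 0x%llx\n",
	       (unsigned long long)dev_addr);
	return true;
}

/* Mock of the type/status lookup: only device objects are checked. */
static int bus_type_and_status(enum mock_acpi_type acpi_type,
			       uint64_t dev_addr, int *type)
{
	assert(type != NULL);

	switch (acpi_type) {
	case MOCK_TYPE_DEVICE:
		if (device_should_be_hidden(dev_addr))
			return -ENODEV;
		*type = MOCK_TYPE_DEVICE;
		return 0;
	case MOCK_TYPE_PROCESSOR:
		/* Not checked here, matching the ACPI_TYPE_DEVICE-only idea. */
		*type = MOCK_TYPE_PROCESSOR;
		return 0;
	}
	return -ENODEV;
}
```

With this shape a hidden device never reaches the caller at all, which is
the difference from the earlier "visible but never present" behaviour; a
processor object at the same address would still be enumerated, so whether
the check should stay before the switch remains the open question above.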


