[PATCH 3.16 080/134] PCI: Disable boot interrupt quirk for ASUS M2N-LR

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Stefan Assmann 

commit c4e649b09f55595e6df6da5465a5b3cfc93557c1 upstream.

The ASUS M2N-LR should not trigger boot interrupt quirks although it
carries an Intel 6702PXH.  On this board the boot interrupt quirks cause
incorrect IRQ assignments and should be disabled.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=43074
Tested-by: Solomon Peachy 
Signed-off-by: Stefan Assmann 
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Ben Hutchings 
---
 drivers/pci/quirks.c | 24 
 1 file changed, 24 insertions(+)

--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1637,6 +1637,29 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_IN
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL,   0x260b, quirk_intel_pcie_pm);
 
 #ifdef CONFIG_X86_IO_APIC
+static int dmi_disable_ioapicreroute(const struct dmi_system_id *d)
+{
+   noioapicreroute = 1;
+   pr_info("%s detected: disable boot interrupt reroute\n", d->ident);
+
+   return 0;
+}
+
+static struct dmi_system_id boot_interrupt_dmi_table[] = {
+   /*
+* Systems to exclude from boot interrupt reroute quirks
+*/
+   {
+   .callback = dmi_disable_ioapicreroute,
+   .ident = "ASUSTek Computer INC. M2N-LR",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "M2N-LR"),
+   },
+   },
+   {}
+};
+
 /*
  * Boot interrupts on some chipsets cannot be turned off. For these chipsets,
  * remap the original interrupt in the linux kernel to the boot interrupt, so
@@ -1645,6 +1668,7 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_IN
  */
 static void quirk_reroute_to_boot_interrupts_intel(struct pci_dev *dev)
 {
+   dmi_check_system(boot_interrupt_dmi_table);
if (noioapicquirk || noioapicreroute)
return;
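
Note: the table-driven exclusion above can be sketched in plain userspace C.
This is a minimal illustration only, not the kernel's DMI code: the struct,
the exact-string comparison, and the names board_quirk, check_board and
reroute_disabled are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a DMI quirk table entry. */
struct board_quirk {
	const char *vendor;	/* a NULL vendor marks the end of the table */
	const char *product;
	int (*callback)(const struct board_quirk *q);
};

static int reroute_disabled;	/* plays the role of noioapicreroute */

static int disable_reroute(const struct board_quirk *q)
{
	(void)q;
	reroute_disabled = 1;
	return 0;
}

static const struct board_quirk quirk_table[] = {
	{ "ASUSTek Computer INC.", "M2N-LR", disable_reroute },
	{ NULL, NULL, NULL }	/* sentinel, like the trailing {} in the patch */
};

/* Run the callback of every matching entry; return the number of matches. */
static int check_board(const struct board_quirk *table,
		       const char *vendor, const char *product)
{
	int matches = 0;

	assert(table && vendor && product);
	for (; table->vendor; table++) {
		if (strcmp(table->vendor, vendor) == 0 &&
		    strcmp(table->product, product) == 0) {
			matches++;
			if (table->callback)
				table->callback(table);
		}
	}
	return matches;
}
```

Calling check_board() before testing the flag mirrors the order in the patch:
the table lookup runs first, then the "noioapicreroute" style check decides
whether the quirk is skipped.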
 



[PATCH 1/3] jfs: Delete an error message for a failed memory allocation in diMount()

2017-08-18 Thread SF Markus Elfring
From: Markus Elfring 
Date: Fri, 18 Aug 2017 09:19:24 +0200

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 fs/jfs/jfs_imap.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/jfs/jfs_imap.c b/fs/jfs/jfs_imap.c
index f36ef68905a7..da8051299e47 100644
--- a/fs/jfs/jfs_imap.c
+++ b/fs/jfs/jfs_imap.c
@@ -119,7 +119,5 @@ int diMount(struct inode *ipimap)
-   if (imap == NULL) {
-   jfs_err("diMount: kmalloc returned NULL!");
+   if (!imap)
return -ENOMEM;
-   }
 
/* read the on-disk inode map control structure. */
 
-- 
2.14.0



[PATCH 3.16 075/134] powerpc/sysfs: Fix reference leak of cpu device_nodes present at boot

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Tyrel Datwyler 

commit e76ca27790a514590af782f83f6eae49e0ccf8c9 upstream.

For CPUs present at boot each logical CPU acquires a reference to the
associated device node of the core. This happens in register_cpu() which
is called by topology_init(). The result of this is that we end up with
a reference held by each thread of the core. However, these references
are never freed if the CPU core is DLPAR removed.

This patch fixes the reference leaks by acquiring and releasing the references
in the CPU hotplug callbacks un/register_cpu_online(). With this patch symmetric
reference counting is observed with both CPUs present at boot, and those DLPAR
added after boot.

Fixes: f86e4718f24b ("driver/core: cpu: initialize of_node in cpu's device struture")
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Michael Ellerman 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings 
---
 arch/powerpc/kernel/sysfs.c | 6 ++
 1 file changed, 6 insertions(+)

--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -672,6 +672,10 @@ static void register_cpu_online(unsigned
struct device_attribute *attrs, *pmc_attrs;
int i, nattrs;
 
+   /* For cpus present at boot a reference was already grabbed in register_cpu() */
+   if (!s->of_node)
+   s->of_node = of_get_cpu_node(cpu, NULL);
+
 #ifdef CONFIG_PPC64
if (cpu_has_feature(CPU_FTR_SMT))
device_create_file(s, &dev_attr_smt_snooze_delay);
@@ -825,6 +829,8 @@ static void unregister_cpu_online(unsign
}
 #endif
cacheinfo_cpu_offline(cpu);
+   of_node_put(s->of_node);
+   s->of_node = NULL;
 }
 
 #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
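
Note: the symmetric reference counting described in the commit message can be
sketched in userspace C. This is a simplified model under stated assumptions:
struct node, struct cpu_dev, node_get() and node_put() are hypothetical
stand-ins for the device-node API, not the kernel functions themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device node and a per-CPU device. */
struct node { int refcount; };
struct cpu_dev { struct node *of_node; };

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (!n)
		return;
	assert(n->refcount > 0);	/* catch an unbalanced put */
	n->refcount--;
}

/* Online: take a reference only if none is held yet (boot-time CPUs have one). */
static void cpu_online(struct cpu_dev *d, struct node *core)
{
	if (!d->of_node)
		d->of_node = node_get(core);
}

/* Offline: drop the reference and clear the pointer so a later online re-acquires it. */
static void cpu_offline(struct cpu_dev *d)
{
	node_put(d->of_node);
	d->of_node = NULL;
}
```

With this shape, a CPU that was present at boot and one that was added later
both leave the node's count where it started once they go offline.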



[PATCH 3.16 077/134] netfilter: ctnetlink: make it safer when updating ct->status

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Liping Zhang 

commit 53b56da83d7899de375a9de153fd7f5397de85e6 upstream.

After converting to use rcu for conntrack hash, one CPU may update
the ct->status via ctnetlink, while another CPU may process the
packets and update the ct->status.

So the non-atomic operation "ct->status |= status;" via ctnetlink
becomes unsafe, and this may clear the IPS_DYING_BIT bit set by
another CPU unexpectedly. For example:
 CPU0                            CPU1
  ctnetlink_change_status        __nf_conntrack_find_get
  old = ct->status                 nf_ct_gc_expired
      -                               nf_ct_kill
      -                                test_and_set_bit(IPS_DYING_BIT
  new = old | status;                 -
  ct->status = new; <-- oops, _DYING_ is cleared!

Now use a series of atomic bit operations to solve the above issue.

Also note, user shouldn't set IPS_TEMPLATE, IPS_SEQ_ADJUST directly,
so make these two bits unchangeable too.

If we set the IPS_TEMPLATE_BIT, ct will be freed by nf_ct_tmpl_free,
but actually it is allocated by nf_conntrack_alloc.
If we set the IPS_SEQ_ADJUST_BIT, this may cause a NULL pointer
dereference, as nfct_seqadj(ct) may be NULL.

Last, add some comments to describe the logic change due to
commit a963d710f367 ("netfilter: ctnetlink: Fix regression in CTA_STATUS
processing"), which I found a little confusing.

Fixes: 76507f69c44e ("[NETFILTER]: nf_conntrack: use RCU for conntrack hash")
Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
[bwh: Backported to 3.16: IPS_UNCHANGEABLE_MASK was not previously defined and
 ctnetlink_update_status() is not needed]
Signed-off-by: Ben Hutchings 
---
--- a/include/uapi/linux/netfilter/nf_conntrack_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
@@ -91,6 +91,15 @@ enum ip_conntrack_status {
/* Conntrack got a helper explicitly attached via CT target. */
IPS_HELPER_BIT = 13,
IPS_HELPER = (1 << IPS_HELPER_BIT),
+
+   /* Be careful here, modifying these bits can make things messy,
+* so don't let users modify them directly.
+*/
+   IPS_UNCHANGEABLE_MASK = (IPS_NAT_DONE_MASK | IPS_NAT_MASK |
+IPS_EXPECTED | IPS_CONFIRMED | IPS_DYING |
+IPS_SEQ_ADJUST | IPS_TEMPLATE),
+
+   __IPS_MAX_BIT = 14,
 };
 
 /* Connection tracking event types */
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1307,6 +1307,24 @@ ctnetlink_parse_nat_setup(struct nf_conn
 }
 #endif
 
+static void
+__ctnetlink_change_status(struct nf_conn *ct, unsigned long on,
+ unsigned long off)
+{
+   unsigned int bit;
+
+   /* Ignore these unchangable bits */
+   on &= ~IPS_UNCHANGEABLE_MASK;
+   off &= ~IPS_UNCHANGEABLE_MASK;
+
+   for (bit = 0; bit < __IPS_MAX_BIT; bit++) {
+   if (on & (1 << bit))
+   set_bit(bit, &ct->status);
+   else if (off & (1 << bit))
+   clear_bit(bit, &ct->status);
+   }
+}
+
 static int
 ctnetlink_change_status(struct nf_conn *ct, const struct nlattr * const cda[])
 {
@@ -1326,10 +1344,7 @@ ctnetlink_change_status(struct nf_conn *
/* ASSURED bit can only be set */
return -EBUSY;
 
-   /* Be careful here, modifying NAT bits can screw up things,
-* so don't let users modify them directly if they don't pass
-* nf_nat_range. */
-   ct->status |= status & ~(IPS_NAT_DONE_MASK | IPS_NAT_MASK);
+   __ctnetlink_change_status(ct, status, 0);
return 0;
 }
 
@@ -1513,7 +1528,7 @@ ctnetlink_change_seq_adj(struct nf_conn
if (ret < 0)
return ret;
 
-   ct->status |= IPS_SEQ_ADJUST;
+   set_bit(IPS_SEQ_ADJUST_BIT, &ct->status);
}
 
if (cda[CTA_SEQ_ADJ_REPLY]) {
@@ -1522,7 +1537,7 @@ ctnetlink_change_seq_adj(struct nf_conn
if (ret < 0)
return ret;
 
-   ct->status |= IPS_SEQ_ADJUST;
+   set_bit(IPS_SEQ_ADJUST_BIT, &ct->status);
}
 
return 0;
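
Note: the per-bit update in __ctnetlink_change_status() can be approximated in
userspace with C11 atomics. This is a sketch under assumptions, not the kernel
code: atomic_fetch_or()/atomic_fetch_and() stand in for set_bit()/clear_bit(),
and the ST_* names and the reduced "unchangeable" mask are made up for the
example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative status bits; names and the reduced mask are assumptions. */
#define ST_ASSURED	(1UL << 2)
#define ST_DYING	(1UL << 9)
#define ST_UNCHANGEABLE	ST_DYING
#define ST_MAX_BIT	14

/*
 * Each bit is changed with its own atomic read-modify-write, so a bit set
 * concurrently by another thread is never overwritten by a stale copy of
 * the whole word (the problem with a plain "status |= on").
 */
static void change_status(_Atomic unsigned long *status,
			  unsigned long on, unsigned long off)
{
	unsigned int bit;

	assert(status != NULL);

	/* Callers may not touch the protected bits. */
	on &= ~ST_UNCHANGEABLE;
	off &= ~ST_UNCHANGEABLE;

	for (bit = 0; bit < ST_MAX_BIT; bit++) {
		unsigned long mask = 1UL << bit;

		if (on & mask)
			atomic_fetch_or(status, mask);
		else if (off & mask)
			atomic_fetch_and(status, ~mask);
	}
}
```

A request to clear ST_DYING is silently ignored here, which matches the intent
of the IPS_UNCHANGEABLE_MASK filtering in the patch.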



[PATCH 2/3] jfs: Improve size determinations in five functions

2017-08-18 Thread SF Markus Elfring
From: Markus Elfring 
Date: Fri, 18 Aug 2017 14:54:11 +0200

* Replace the specification of data structures by pointer dereferences
  as the parameter for the operator "sizeof" to make the corresponding size
  determination a bit safer according to the Linux coding style convention.

  This issue was detected by using the Coccinelle software.

* The script "checkpatch.pl" pointed information out like the following.

  ERROR: do not use assignment in if condition

  Thus fix two affected source code places.

Signed-off-by: Markus Elfring 
---
 fs/jfs/jfs_imap.c | 2 +-
 fs/jfs/jfs_logmgr.c   | 8 +---
 fs/jfs/jfs_metapage.c | 2 +-
 fs/jfs/super.c| 2 +-
 4 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/jfs/jfs_imap.c b/fs/jfs/jfs_imap.c
index da8051299e47..a7e3a61187db 100644
--- a/fs/jfs/jfs_imap.c
+++ b/fs/jfs/jfs_imap.c
@@ -115,5 +115,5 @@ int diMount(struct inode *ipimap)
 * allocate/initialize the in-memory inode map control structure
 */
/* allocate the in-memory inode map control structure. */
-   imap = kmalloc(sizeof(struct inomap), GFP_KERNEL);
+   imap = kmalloc(sizeof(*imap), GFP_KERNEL);
if (!imap)
diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index a21f0e9eecd4..ed103c44bd52 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -1109,7 +1109,8 @@ int lmLogOpen(struct super_block *sb)
}
}
 
-   if (!(log = kzalloc(sizeof(struct jfs_log), GFP_KERNEL))) {
+   log = kzalloc(sizeof(*log), GFP_KERNEL);
+   if (!log) {
mutex_unlock(&jfs_log_mutex);
return -ENOMEM;
}
@@ -1178,7 +1179,8 @@ static int open_inline_log(struct super_block *sb)
struct jfs_log *log;
int rc;
 
-   if (!(log = kzalloc(sizeof(struct jfs_log), GFP_KERNEL)))
+   log = kzalloc(sizeof(*log), GFP_KERNEL);
+   if (!log)
return -ENOMEM;
INIT_LIST_HEAD(&log->sb_list);
init_waitqueue_head(&log->syncwait);
@@ -1839,7 +1841,7 @@ static int lbmLogInit(struct jfs_log * log)
goto error;
buffer = page_address(page);
for (offset = 0; offset < PAGE_SIZE; offset += LOGPSIZE) {
-   lbuf = kmalloc(sizeof(struct lbuf), GFP_KERNEL);
+   lbuf = kmalloc(sizeof(*lbuf), GFP_KERNEL);
if (lbuf == NULL) {
if (offset == 0)
__free_page(page);
diff --git a/fs/jfs/jfs_metapage.c b/fs/jfs/jfs_metapage.c
index 65120a471729..6c75a7c87c0c 100644
--- a/fs/jfs/jfs_metapage.c
+++ b/fs/jfs/jfs_metapage.c
@@ -107,7 +107,7 @@ static inline int insert_metapage(struct page *page, struct metapage *mp)
if (PagePrivate(page))
a = mp_anchor(page);
else {
-   a = kzalloc(sizeof(struct meta_anchor), GFP_NOFS);
+   a = kzalloc(sizeof(*a), GFP_NOFS);
if (!a)
return -ENOMEM;
set_page_private(page, (unsigned long)a);
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 78b41e1d5c67..381476422e8d 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -526,7 +526,7 @@ static int jfs_fill_super(struct super_block *sb, void *data, int silent)
 
jfs_info("In jfs_read_super: s_flags=0x%lx", sb->s_flags);
 
-   sbi = kzalloc(sizeof(struct jfs_sb_info), GFP_KERNEL);
+   sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
if (!sbi)
return -ENOMEM;
 
-- 
2.14.0
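
Note: the two changes in this patch (sizeof on the dereferenced pointer, and
no assignment inside the if condition) can be shown in a small userspace
sketch. calloc() stands in for kzalloc(); struct log_state and
log_state_alloc() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical structure; only its size matters for the example. */
struct log_state {
	int flags;
	long pages[16];
};

static struct log_state *log_state_alloc(void)
{
	struct log_state *log;

	/*
	 * sizeof(*log) follows the type of the pointer, so the size stays
	 * correct even if the declaration of 'log' is changed later.
	 */
	log = calloc(1, sizeof(*log));
	if (!log)	/* assignment and test are kept on separate lines */
		return NULL;

	assert(log->flags == 0);	/* calloc() zero-fills, like kzalloc() */
	return log;
}
```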



[PATCH 3.16 079/134] dm era: save spacemap metadata root after the pre-commit

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Somasundaram Krishnasamy 

commit 117aceb030307dcd431fdcff87ce988d3016c34a upstream.

When committing era metadata to disk, it doesn't always save the latest
spacemap metadata root in superblock. Due to this, metadata is getting
corrupted sometimes when reopening the device. The correct order of update
should be, pre-commit (shadows spacemap root), save the spacemap root
(newly shadowed block) to in-core superblock and then the final commit.

Signed-off-by: Somasundaram Krishnasamy 
Signed-off-by: Mike Snitzer 
Signed-off-by: Ben Hutchings 
---
 drivers/md/dm-era-target.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/md/dm-era-target.c
+++ b/drivers/md/dm-era-target.c
@@ -957,15 +957,15 @@ static int metadata_commit(struct era_me
}
}
 
-   r = save_sm_root(md);
+   r = dm_tm_pre_commit(md->tm);
if (r) {
-   DMERR("%s: save_sm_root failed", __func__);
+   DMERR("%s: pre commit failed", __func__);
return r;
}
 
-   r = dm_tm_pre_commit(md->tm);
+   r = save_sm_root(md);
if (r) {
-   DMERR("%s: pre commit failed", __func__);
+   DMERR("%s: save_sm_root failed", __func__);
return r;
}
 



[PATCH 3.16 069/134] x86/boot: Fix BSS corruption/overwrite bug in early x86 kernel startup

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Ashish Kalra 

commit d594aa0277e541bb997aef0bc0a55172d8138340 upstream.

The minimum size (512 bytes) of the new stack that is set up for
arch/x86/boot components when the bootloader does not set up or provide a
stack for the early boot components is not enough.

The setup code executing as part of early kernel startup code uses the stack
beyond 512 bytes and accidentally overwrites and corrupts part of the BSS
section. This is exposed mostly in the early video setup code, where
it was corrupting BSS variables like force_x, force_y, which in turn affected
kernel parameters such as screen_info (screen_info.orig_video_cols) and
later caused an exception/panic in console_init().

Most recent boot loaders set up the stack for early boot components, so this
issue of the stack overwriting the BSS section has not been exposed.

Signed-off-by: Ashish Kalra 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Josh Poimboeuf 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20170419152015.10011-1-ashishkalra@Ashishs-MacBook-Pro.local
Signed-off-by: Ingo Molnar 
Signed-off-by: Ben Hutchings 
---
 arch/x86/boot/boot.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -16,7 +16,7 @@
 #ifndef BOOT_BOOT_H
 #define BOOT_BOOT_H
 
-#define STACK_SIZE 512 /* Minimum number of bytes for stack */
+#define STACK_SIZE 1024/* Minimum number of bytes for stack */
 
 #ifndef __ASSEMBLY__
 



[PATCH 3.16 074/134] powerpc/pseries: Fix of_node_put() underflow during DLPAR remove

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Tyrel Datwyler 

commit 68baf692c435339e6295cb470ea5545cbc28160e upstream.

Historically struct device_node references were tracked using a kref embedded as
a struct field. Commit 75b57ecf9d1d ("of: Make device nodes kobjects so they
show up in sysfs") (Mar 2014) refactored device_nodes to be kobjects such that
the device tree could be more simply exposed to userspace using sysfs.

Commit 0829f6d1f69e ("of: device_node kobject lifecycle fixes") (Mar 2014)
followed up these changes to better control the kobject lifecycle and in
particular the reference counting via of_node_get(), of_node_put(), and
of_node_init().

A result of this second commit was that it introduced an of_node_put() call when
a dynamic node is detached, in of_node_remove(), that removes the initial kobj
reference created by of_node_init().

Traditionally, as the original dynamic device node user, the pseries code had
assumed responsibility for releasing this final reference in its platform
specific DLPAR detach code.

This patch fixes a refcount underflow introduced by commit 0829f6d1f6, and
recently exposed by the upstreaming of the refcount API.

Messages like the following are no longer seen in the kernel log with this
patch following DLPAR remove operations of cpus and pci devices.

  rpadlpar_io: slot PHB 72 removed
  refcount_t: underflow; use-after-free.
  [ cut here ]
  WARNING: CPU: 5 PID: 3335 at lib/refcount.c:128 refcount_sub_and_test+0xf4/0x110

Fixes: 0829f6d1f69e ("of: device_node kobject lifecycle fixes")
Signed-off-by: Tyrel Datwyler 
[mpe: Make change log commit references more verbose]
Signed-off-by: Michael Ellerman 
Signed-off-by: Ben Hutchings 
---
 arch/powerpc/platforms/pseries/dlpar.c | 1 -
 1 file changed, 1 deletion(-)

--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -298,7 +298,6 @@ int dlpar_detach_node(struct device_node
if (rc)
return rc;
 
-   of_node_put(dn); /* Must decrement the refcount */
return 0;
 }
 



[PATCH 3.16 076/134] netfilter: ctnetlink: fix deadlock due to acquire _expect_lock twice

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Liping Zhang 

commit 88be4c09d9008f9ff337cbf48c5d0f06c8f872e7 upstream.

Currently, ctnetlink_change_conntrack is always protected by _expect_lock,
but this will cause a deadlock when deleting the helper from a conntrack,
as the _expect_lock will be acquired again by nf_ct_remove_expectations:

 CPU0

  lock(nf_conntrack_expect_lock);
  lock(nf_conntrack_expect_lock);

  *** DEADLOCK ***
  May be due to missing lock nesting notation

  2 locks held by lt-conntrack_gr/12853:
  #0:  (&table[i].mutex){+.+.+.}, at: []
   nfnetlink_rcv_msg+0x399/0x6a9 [nfnetlink]
  #1:  (nf_conntrack_expect_lock){+.}, at: []
   ctnetlink_new_conntrack+0x17f/0x408 [nf_conntrack_netlink]

  Call Trace:
   dump_stack+0x85/0xc2
   __lock_acquire+0x1608/0x1680
   ? ctnetlink_parse_tuple_proto+0x10f/0x1c0 [nf_conntrack_netlink]
   lock_acquire+0x100/0x1f0
   ? nf_ct_remove_expectations+0x32/0x90 [nf_conntrack]
   _raw_spin_lock_bh+0x3f/0x50
   ? nf_ct_remove_expectations+0x32/0x90 [nf_conntrack]
   nf_ct_remove_expectations+0x32/0x90 [nf_conntrack]
   ctnetlink_change_helper+0xc6/0x190 [nf_conntrack_netlink]
   ctnetlink_new_conntrack+0x1b2/0x408 [nf_conntrack_netlink]
   nfnetlink_rcv_msg+0x60a/0x6a9 [nfnetlink]
   ? nfnetlink_rcv_msg+0x1b9/0x6a9 [nfnetlink]
   ? nfnetlink_bind+0x1a0/0x1a0 [nfnetlink]
   netlink_rcv_skb+0xa4/0xc0
   nfnetlink_rcv+0x87/0x770 [nfnetlink]

Since the operations are unrelated to nf_ct_expect, we can drop the
_expect_lock. Also note, after removing the _expect_lock protection,
another CPU may invoke nf_conntrack_helper_unregister, so we should
use rcu_read_lock to protect __nf_conntrack_helper_find invoked by
ctnetlink_change_helper.

Fixes: ca7433df3a67 ("netfilter: conntrack: seperate expect locking from nf_conntrack_lock")
Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
[bwh: Backported to 3.16:
 - ctnetlink_change_helper() still auto-loads modules, so update the unlocking
   and re-locking there
 - Adjust context]
Signed-off-by: Ben Hutchings 
---
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1384,24 +1384,22 @@ ctnetlink_change_helper(struct nf_conn *
return 0;
}
 
+   rcu_read_lock();
helper = __nf_conntrack_helper_find(helpname, nf_ct_l3num(ct),
nf_ct_protonum(ct));
if (helper == NULL) {
 #ifdef CONFIG_MODULES
-   spin_unlock_bh(&nf_conntrack_expect_lock);
+   rcu_read_unlock();
 
-   if (request_module("nfct-helper-%s", helpname) < 0) {
-   spin_lock_bh(&nf_conntrack_expect_lock);
+   if (request_module("nfct-helper-%s", helpname) < 0)
return -EOPNOTSUPP;
-   }
 
-   spin_lock_bh(&nf_conntrack_expect_lock);
+   rcu_read_lock();
helper = __nf_conntrack_helper_find(helpname, nf_ct_l3num(ct),
nf_ct_protonum(ct));
-   if (helper)
-   return -EAGAIN;
 #endif
-   return -EOPNOTSUPP;
+   rcu_read_unlock();
+   return helper ? -EAGAIN : -EOPNOTSUPP;
}
 
if (help) {
@@ -1409,13 +1407,16 @@ ctnetlink_change_helper(struct nf_conn *
/* update private helper data if allowed. */
if (helper->from_nlattr)
helper->from_nlattr(helpinfo, ct);
-   return 0;
+   err = 0;
} else
-   return -EBUSY;
+   err = -EBUSY;
+   } else {
+   /* we cannot set a helper for an existing conntrack */
+   err = -EOPNOTSUPP;
}
 
-   /* we cannot set a helper for an existing conntrack */
-   return -EOPNOTSUPP;
+   rcu_read_unlock();
+   return err;
 }
 
 static inline int
@@ -1831,9 +1832,7 @@ ctnetlink_new_conntrack(struct sock *ctn
err = -EEXIST;
ct = nf_ct_tuplehash_to_ctrack(h);
if (!(nlh->nlmsg_flags & NLM_F_EXCL)) {
-   spin_lock_bh(&nf_conntrack_expect_lock);
err = ctnetlink_change_conntrack(ct, cda);
-   spin_unlock_bh(&nf_conntrack_expect_lock);
if (err == 0) {
nf_conntrack_eventmask_report((1 << IPCT_REPLY) |
  (1 << IPCT_ASSURED) |
@@ -2165,11 +2164,7 @@ ctnetlink_nfqueue_parse(const struct nla
if (ret < 0)
return ret;
 
-   spin_lock_bh(&nf_conntrack_expect_lock);
-   ret = ctnetlink_nfqueue_parse_ct((const struct nlattr **)cda, ct);
-   spin_unlock_bh(&nf_conntrack_expect_lock);
-
-   return ret;
+   return ctnetlink_nfqueue_parse_ct((const struc

[PATCH 3.16 073/134] IB/mlx4: Fix ib device initialization error flow

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Jack Morgenstein 

commit 99e68909d5aba1861897fe7afc3306c3c81b6de0 upstream.

In mlx4_ib_add, procedure mlx4_ib_alloc_eqs is called to allocate EQs.

However, in the mlx4_ib_add error flow, procedure mlx4_ib_free_eqs is not
called to free the allocated EQs.

Fixes: e605b743f33d ("IB/mlx4: Increase the number of vectors (EQs) available for ULPs")
Signed-off-by: Jack Morgenstein 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Ben Hutchings 
---
 drivers/infiniband/hw/mlx4/main.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2281,6 +2281,7 @@ err_counter:
mlx4_counter_free(ibdev->dev, ibdev->counters[i - 1]);
 
 err_map:
+   mlx4_ib_free_eqs(dev, ibdev);
iounmap(ibdev->uar_map);
 
 err_uar:
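
Note: the error-flow rule behind this fix (every resource acquired before the
failure point is released on the way out, in reverse order) can be sketched
generically. This is not the mlx4 code: struct dev, dev_add() and the
fail_counters switch are hypothetical, and malloc()/free() stand in for the
real allocation and release helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device state with three resources acquired in order. */
struct dev {
	void *uar_map;
	void *eqs;
	void *counters;
};

/* Returns 0 on success, -1 on failure with everything released again. */
static int dev_add(struct dev *d, int fail_counters)
{
	assert(d != NULL);

	d->uar_map = malloc(16);
	if (!d->uar_map)
		goto err_uar;

	d->eqs = malloc(16);
	if (!d->eqs)
		goto err_map;

	d->counters = fail_counters ? NULL : malloc(16);
	if (!d->counters)
		goto err_eqs;

	return 0;

err_eqs:
	free(d->eqs);		/* the kind of release the patch adds */
	d->eqs = NULL;
err_map:
	free(d->uar_map);
	d->uar_map = NULL;
err_uar:
	return -1;
}
```

Each label undoes exactly one step and falls through to the labels for the
earlier steps, so adding a new acquisition means adding one matching release.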



[PATCH 3/3] jfs: Adjust 67 checks for null pointers

2017-08-18 Thread SF Markus Elfring
From: Markus Elfring 
Date: Fri, 18 Aug 2017 15:15:02 +0200
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The script “checkpatch.pl” pointed information out like the following.

Comparison to NULL could be written !…

Thus fix the affected source code places.

Signed-off-by: Markus Elfring 
---
 fs/jfs/jfs_dmap.c | 44 ++--
 fs/jfs/jfs_dtree.c|  6 +++---
 fs/jfs/jfs_imap.c | 25 -
 fs/jfs/jfs_logmgr.c   | 18 +-
 fs/jfs/jfs_metapage.c |  5 ++---
 fs/jfs/jfs_mount.c|  6 +++---
 fs/jfs/jfs_txnmgr.c   |  6 +++---
 fs/jfs/jfs_unicode.c  |  3 +--
 fs/jfs/jfs_xtree.c|  6 +++---
 fs/jfs/namei.c|  3 +--
 fs/jfs/resize.c   |  2 +-
 fs/jfs/super.c|  4 ++--
 fs/jfs/xattr.c| 10 +-
 13 files changed, 67 insertions(+), 71 deletions(-)

diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c
index 2d514c7affc2..59a8f86984c2 100644
--- a/fs/jfs/jfs_dmap.c
+++ b/fs/jfs/jfs_dmap.c
@@ -174,14 +174,14 @@ int dbMount(struct inode *ipbmap)
 */
/* allocate memory for the in-memory bmap descriptor */
bmp = kmalloc(sizeof(struct bmap), GFP_KERNEL);
-   if (bmp == NULL)
+   if (!bmp)
return -ENOMEM;
 
/* read the on-disk bmap descriptor. */
mp = read_metapage(ipbmap,
   BMAPBLKNO << JFS_SBI(ipbmap->i_sb)->l2nbperpage,
   PSIZE, 0);
-   if (mp == NULL) {
+   if (!mp) {
kfree(bmp);
return -EIO;
}
@@ -274,7 +274,7 @@ int dbSync(struct inode *ipbmap)
mp = read_metapage(ipbmap,
   BMAPBLKNO << JFS_SBI(ipbmap->i_sb)->l2nbperpage,
   PSIZE, 0);
-   if (mp == NULL) {
+   if (!mp) {
jfs_err("dbSync: read_metapage failed!");
return -EIO;
}
@@ -370,7 +370,7 @@ int dbFree(struct inode *ip, s64 blkno, s64 nblocks)
/* get the buffer for the current dmap. */
lblkno = BLKTODMAP(blkno, bmp->db_l2nbperpage);
mp = read_metapage(ipbmap, lblkno, PSIZE, 0);
-   if (mp == NULL) {
+   if (!mp) {
IREAD_UNLOCK(ipbmap);
return -EIO;
}
@@ -464,7 +464,7 @@ dbUpdatePMap(struct inode *ipbmap,
 
mp = read_metapage(bmp->db_ipbmap, lblkno, PSIZE,
   0);
-   if (mp == NULL)
+   if (!mp)
return -EIO;
metapage_wait_for_io(mp);
}
@@ -780,7 +780,7 @@ int dbAlloc(struct inode *ip, s64 hint, s64 nblocks, s64 * results)
rc = -EIO;
lblkno = BLKTODMAP(blkno, bmp->db_l2nbperpage);
mp = read_metapage(ipbmap, lblkno, PSIZE, 0);
-   if (mp == NULL)
+   if (!mp)
goto read_unlock;
 
dp = (struct dmap *) mp->data;
@@ -922,7 +922,7 @@ int dbAllocExact(struct inode *ip, s64 blkno, int nblocks)
/* read in the dmap covering the extent */
lblkno = BLKTODMAP(blkno, bmp->db_l2nbperpage);
mp = read_metapage(ipbmap, lblkno, PSIZE, 0);
-   if (mp == NULL) {
+   if (!mp) {
IREAD_UNLOCK(ipbmap);
return -EIO;
}
@@ -1078,7 +1078,7 @@ static int dbExtend(struct inode *ip, s64 blkno, s64 nblocks, s64 addnblocks)
 */
lblkno = BLKTODMAP(extblkno, bmp->db_l2nbperpage);
mp = read_metapage(ipbmap, lblkno, PSIZE, 0);
-   if (mp == NULL) {
+   if (!mp) {
IREAD_UNLOCK(ipbmap);
return -EIO;
}
@@ -1421,7 +1421,7 @@ dbAllocAG(struct bmap * bmp, int agno, s64 nblocks, int l2nb, s64 * results)
 */
lblkno = BLKTOCTL(blkno, bmp->db_l2nbperpage, bmp->db_aglevel);
mp = read_metapage(bmp->db_ipbmap, lblkno, PSIZE, 0);
-   if (mp == NULL)
+   if (!mp)
return -EIO;
dcp = (struct dmapctl *) mp->data;
budmin = dcp->budmin;
@@ -1642,7 +1642,7 @@ s64 dbDiscardAG(struct inode *ip, int agno, s64 minlen)
do_div(max_ranges, minlen);
range_cnt = min_t(u64, max_ranges + 1, 32 * 1024);
totrim = kmalloc(sizeof(struct range2trim) * range_cnt, GFP_NOFS);
-   if (totrim == NULL) {
+   if (!totrim) {
jfs_error(bmp->db_ipbmap->i_sb, "no memory for trim array\n");
IWRITE_UNLOCK(ipbmap);
return 0;
@@ -1743,7 +1743,7 @@ static int dbFindCtl(struct bmap * bmp, int l2nb, int level, s64 * blkno)
 */
lblkno = BLKTOCTL(b, bmp->db_l2nbperpage, lev);
mp = read_metapage(bmp->db_ipbmap, lblkno, PSIZE, 0);
-   if (mp == NULL)
+   if (!mp)
  

Re: [PATCH v8 2/2] sched/rt: Add support for SD_PREFER_SIBLING on find_lowest_rq()

2017-08-18 Thread Steven Rostedt
On Fri, 18 Aug 2017 17:21:59 +0900
Byungchul Park  wrote:

> It would be better to try to check other siblings first if
> SD_PREFER_SIBLING is flaged when pushing tasks - migration.
> 
> Signed-off-by: Byungchul Park 

Looks good.

Reviewed-by: Steven Rostedt (VMware) 

-- Steve


[PATCH 3.16 068/134] mwifiex: pcie: fix cmd_buf use-after-free in remove/reset

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Brian Norris 

commit 3c8cb9ad032d737b874e402c59eb51e3c991a144 upstream.

Command buffers (skb's) are allocated by the main driver, and freed upon
the last use. That last use is often in mwifiex_free_cmd_buffer(). In
the meantime, if the command buffer gets used by the PCI driver, we map
it as DMA-able, and store the mapping information in the 'cb' memory.

However, if a command was in-flight when resetting the device (and
therefore was still mapped), we don't get a chance to unmap this memory
until after the core has cleaned up its command handling.

Let's keep a refcount within the PCI driver, so we ensure the memory
only gets freed after we've finished unmapping it.

Noticed by KASAN when forcing a reset via:

  echo 1 > /sys/bus/pci/.../reset

The same code path can presumably be exercised in remove() and
shutdown().

[  205.390377] mwifiex_pcie :01:00.0: info: shutdown mwifiex...
[  205.400393] ==
[  205.407719] BUG: KASAN: use-after-free in mwifiex_unmap_pci_memory.isra.14+0x4c/0x100 [mwifiex_pcie] at addr ffc0ad471b28
[  205.419040] Read of size 16 by task bash/1913
[  205.423421] =
[  205.431625] BUG skbuff_head_cache (Tainted: GB  ): kasan: bad access detected
[  205.439815] -
[  205.439815]
[  205.449534] INFO: Allocated in __build_skb+0x48/0x114 age=1311 cpu=4 pid=1913
[  205.456709]  alloc_debug_processing+0x124/0x178
[  205.461282]  ___slab_alloc.constprop.58+0x528/0x608
[  205.466196]  __slab_alloc.isra.54.constprop.57+0x44/0x54
[  205.471542]  kmem_cache_alloc+0xcc/0x278
[  205.475497]  __build_skb+0x48/0x114
[  205.479019]  __netdev_alloc_skb+0xe0/0x170
[  205.483244]  mwifiex_alloc_cmd_buffer+0x68/0xdc [mwifiex]
[  205.488759]  mwifiex_init_fw+0x40/0x6cc [mwifiex]
[  205.493584]  _mwifiex_fw_dpc+0x158/0x520 [mwifiex]
[  205.498491]  mwifiex_reinit_sw+0x2c4/0x398 [mwifiex]
[  205.503510]  mwifiex_pcie_reset_notify+0x114/0x15c [mwifiex_pcie]
[  205.509643]  pci_reset_notify+0x5c/0x6c
[  205.513519]  pci_reset_function+0x6c/0x7c
[  205.517567]  reset_store+0x68/0x98
[  205.521003]  dev_attr_store+0x54/0x60
[  205.524705]  sysfs_kf_write+0x9c/0xb0
[  205.528413] INFO: Freed in __kfree_skb+0xb0/0xbc age=131 cpu=4 pid=1913
[  205.535064]  free_debug_processing+0x264/0x370
[  205.539550]  __slab_free+0x84/0x40c
[  205.543075]  kmem_cache_free+0x1c8/0x2a0
[  205.547030]  __kfree_skb+0xb0/0xbc
[  205.550465]  consume_skb+0x164/0x178
[  205.554079]  __dev_kfree_skb_any+0x58/0x64
[  205.558304]  mwifiex_free_cmd_buffer+0xa0/0x158 [mwifiex]
[  205.563817]  mwifiex_shutdown_drv+0x578/0x5c4 [mwifiex]
[  205.569164]  mwifiex_shutdown_sw+0x178/0x310 [mwifiex]
[  205.574353]  mwifiex_pcie_reset_notify+0xd4/0x15c [mwifiex_pcie]
[  205.580398]  pci_reset_notify+0x5c/0x6c
[  205.584274]  pci_dev_save_and_disable+0x24/0x6c
[  205.588837]  pci_reset_function+0x30/0x7c
[  205.592885]  reset_store+0x68/0x98
[  205.596324]  dev_attr_store+0x54/0x60
[  205.600017]  sysfs_kf_write+0x9c/0xb0
...
[  205.800488] Call trace:
[  205.802980] [] dump_backtrace+0x0/0x190
[  205.808415] [] show_stack+0x20/0x28
[  205.813506] [] dump_stack+0xa4/0xcc
[  205.818598] [] print_trailer+0x158/0x168
[  205.824120] [] object_err+0x4c/0x5c
[  205.829210] [] kasan_report+0x334/0x500
[  205.834641] [] check_memory_region+0x20/0x14c
[  205.840593] [] __asan_loadN+0x14/0x1c
[  205.845879] [] mwifiex_unmap_pci_memory.isra.14+0x4c/0x100 [mwifiex_pcie]
[  205.854282] [] mwifiex_pcie_delete_cmdrsp_buf+0x94/0xa8 [mwifiex_pcie]
[  205.862421] [] mwifiex_pcie_free_buffers+0x11c/0x158 [mwifiex_pcie]
[  205.870302] [] mwifiex_pcie_down_dev+0x70/0x80 [mwifiex_pcie]
[  205.877736] [] mwifiex_shutdown_sw+0x190/0x310 [mwifiex]
[  205.884658] [] mwifiex_pcie_reset_notify+0xd4/0x15c [mwifiex_pcie]
[  205.892446] [] pci_reset_notify+0x5c/0x6c
[  205.898048] [] pci_dev_save_and_disable+0x24/0x6c
[  205.904350] [] pci_reset_function+0x30/0x7c
[  205.910134] [] reset_store+0x68/0x98
[  205.915312] [] dev_attr_store+0x54/0x60
[  205.920750] [] sysfs_kf_write+0x9c/0xb0
[  205.926182] [] kernfs_fop_write+0x184/0x1f8
[  205.931963] [] __vfs_write+0x6c/0x17c
[  205.937221] [] vfs_write+0xf0/0x1c4
[  205.942310] [] SyS_write+0x78/0xd8
[  205.947312] [] el0_svc_naked+0x24/0x28
...
[  205.998268] ==

This bug has been around in different forms for a while. It was sort of
noticed in commit 955ab095c51a ("mwifiex: Do not kfree cmd buf while
unregistering PCIe"), but it just fixed the double-free, without
acknowledging the potential for use-after-free.

Fixes: fc3314609047 ("mwifiex: use pci_alloc/free_consistent APIs for PCIe")
Signed-off-by: Brian Norris 
Si

[PATCH 3.16 072/134] HSI: ssi_protocol: double free in ssip_pn_xmit()

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Dan Carpenter 

commit 3026050179a3a9a6f5c892c414b5e36ecf092081 upstream.

If skb_pad() fails then it frees skb and we don't need to free it again
at the end of the function.

Fixes: dc7bf5d7 ("HSI: Introduce driver for SSI Protocol")
Signed-off-by: Dan Carpenter 
Signed-off-by: Sebastian Reichel 
Signed-off-by: Ben Hutchings 
---
 drivers/hsi/clients/ssi_protocol.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/hsi/clients/ssi_protocol.c
+++ b/drivers/hsi/clients/ssi_protocol.c
@@ -976,7 +976,7 @@ static int ssip_pn_xmit(struct sk_buff *
goto drop;
/* Pad to 32-bits - FIXME: Revisit*/
if ((skb->len & 3) && skb_pad(skb, 4 - (skb->len & 3)))
-   goto drop;
+   goto inc_dropped;
 
/*
 * Modem sends Phonet messages over SSI with its own endianess...
@@ -1028,8 +1028,9 @@ static int ssip_pn_xmit(struct sk_buff *
 drop2:
hsi_free_msg(msg);
 drop:
-   dev->stats.tx_dropped++;
dev_kfree_skb(skb);
+inc_dropped:
+   dev->stats.tx_dropped++;
 
return 0;
 }
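
A minimal user-space sketch of the ownership rule behind this fix, assuming
a helper that (like skb_pad()) frees the buffer itself when it fails.  The
names struct buf, buf_new(), buf_pad(), buf_free() and xmit() are
hypothetical stand-ins, not the kernel API:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff, for illustration only. */
struct buf {
	size_t len;
	bool pad_should_fail;	/* test hook: force the padding step to fail */
};

int frees;	/* how many times a buffer was released */
int dropped;	/* mirrors dev->stats.tx_dropped */

struct buf *buf_new(size_t len, bool pad_should_fail)
{
	struct buf *b = malloc(sizeof(*b));

	if (b) {
		b->len = len;
		b->pad_should_fail = pad_should_fail;
	}
	return b;
}

void buf_free(struct buf *b)
{
	frees++;
	free(b);
}

/* Like skb_pad(): on failure the buffer has ALREADY been freed. */
int buf_pad(struct buf *b, size_t pad)
{
	if (b->pad_should_fail) {
		buf_free(b);
		return -1;
	}
	b->len += pad;
	return 0;
}

/* Always returns 0; every path releases the buffer exactly once. */
int xmit(struct buf *b, bool link_up)
{
	if (!link_up)
		goto drop;
	if ((b->len & 3) && buf_pad(b, 4 - (b->len & 3)))
		goto inc_dropped;	/* buffer is gone, only count the drop */

	buf_free(b);			/* "transmitted": consume the buffer */
	return 0;

drop:
	buf_free(b);
inc_dropped:
	dropped++;
	return 0;
}
```

The second label lets the failed-padding path count the drop without
touching the already-freed buffer.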



[PATCH 3.16 071/134] IB/ipoib: Update broadcast object if PKey value was changed in index 0

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Feras Daoud 

commit 9a9b8112699d78e7f317019b37f377e90023f3ed upstream.

Update the broadcast address in the priv->broadcast object when the
Pkey value changes in index 0, otherwise the multicast GID value will
keep the previous value of the PKey, and will not be updated.
This leads to interface state down because the interface will keep the
old PKey value.

For example, in an SR-IOV environment, if the PF changes the value of PKey
index 0 for one of the VFs, then the VF receives a PKey change event that
triggers a heavy flush. This flush calls update_parent_pkey(), which updates
the broadcast object and its relevant members. If the multicast GID is not
updated in this case, the interface state will be down.

Fixes: c2904141696e ("IPoIB: Fix pkey change flow for virtualization environments")
Signed-off-by: Feras Daoud 
Signed-off-by: Erez Shitrit 
Reviewed-by: Alex Vesker 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Ben Hutchings 
---
 drivers/infiniband/ulp/ipoib/ipoib_ib.c | 13 +
 1 file changed, 13 insertions(+)

--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -961,6 +961,19 @@ static inline int update_parent_pkey(str
 */
priv->dev->broadcast[8] = priv->pkey >> 8;
priv->dev->broadcast[9] = priv->pkey & 0xff;
+
+   /*
+* Update the broadcast address in the priv->broadcast object,
+* in case it already exists, otherwise no one will do that.
+*/
+   if (priv->broadcast) {
+   spin_lock_irq(&priv->lock);
+   memcpy(priv->broadcast->mcmember.mgid.raw,
+  priv->dev->broadcast + 4,
+   sizeof(union ib_gid));
+   spin_unlock_irq(&priv->lock);
+   }
+
return 0;
}
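
A small user-space sketch of the byte layout the hunk above relies on,
assuming a 20-byte IPoIB hardware address whose last 16 bytes (offset 4
onward) form the multicast GID.  update_pkey() and the two length macros
are hypothetical names used only for illustration:

```c
#include <stdint.h>
#include <string.h>

#define HW_ADDR_LEN 20	/* assumed IPoIB hardware address length */
#define GID_LEN     16	/* size of a GID, i.e. sizeof(union ib_gid) */

/*
 * Store the new PKey in bytes 8 and 9 of the broadcast hardware address,
 * then refresh the cached multicast GID from bytes 4..19 of that address.
 */
void update_pkey(uint8_t hw_bcast[HW_ADDR_LEN], uint8_t mgid[GID_LEN],
		 uint16_t pkey)
{
	hw_bcast[8] = (uint8_t)(pkey >> 8);
	hw_bcast[9] = (uint8_t)(pkey & 0xff);
	memcpy(mgid, hw_bcast + 4, GID_LEN);
}
```

Because the GID starts at offset 4, the PKey bytes land at mgid[4] and
mgid[5]; skipping the copy leaves the old PKey in the cached GID.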
 



[PATCH 3.16 070/134] NFS: Use GFP_NOIO for two allocations in writeback

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Benjamin Coddington 

commit ae97aa524ef495b6276fd26f5d5449fb22975d7c upstream.

Prevent a deadlock that can occur if we wait on allocations
that try to write back our pages.

Signed-off-by: Benjamin Coddington 
Fixes: 00bfa30abe869 ("NFS: Create a common pgio_alloc and pgio_release...")
Signed-off-by: Trond Myklebust 
[bwh: Backported to 3.16:
 - Drop changes in nfs_pageio_init()
 - Adjust context]
Signed-off-by: Ben Hutchings 
---
 fs/nfs/pagelist.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -29,13 +29,14 @@
 static struct kmem_cache *nfs_page_cachep;
 static const struct rpc_call_ops nfs_pgio_common_ops;
 
-static bool nfs_pgarray_set(struct nfs_page_array *p, unsigned int pagecount)
+static bool nfs_pgarray_set(struct nfs_page_array *p, unsigned int pagecount,
+   gfp_t gfp_flags)
 {
p->npages = pagecount;
if (pagecount <= ARRAY_SIZE(p->page_array))
p->pagevec = p->page_array;
else {
-   p->pagevec = kcalloc(pagecount, sizeof(struct page *), GFP_KERNEL);
+   p->pagevec = kcalloc(pagecount, sizeof(struct page *), gfp_flags);
if (!p->pagevec)
p->npages = 0;
}
@@ -739,9 +740,12 @@ int nfs_generic_pgio(struct nfs_pageio_d
struct list_head *head = &desc->pg_list;
struct nfs_commit_info cinfo;
unsigned int pagecount, pageused;
+   gfp_t gfp_flags = GFP_KERNEL;
 
pagecount = nfs_page_array_len(desc->pg_base, desc->pg_count);
-   if (!nfs_pgarray_set(&hdr->page_array, pagecount))
+   if (desc->pg_rw_ops->rw_mode == FMODE_WRITE)
+   gfp_flags = GFP_NOIO;
+   if (!nfs_pgarray_set(&hdr->page_array, pagecount, gfp_flags))
return nfs_pgio_error(desc, hdr);
 
nfs_init_cinfo(&cinfo, desc->pg_inode, desc->pg_dreq);



[PATCH 3.16 064/134] [media] ov2640: fix vflip control

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Frank Schaefer 

commit 7f140fc2064bcd23e0490d8210650e2ef21c1c89 upstream.

Enabling vflip currently causes wrong colors.
It seems that (at least with the current sensor setup) REG04_VFLIP_IMG only
changes the vertical readout direction.
Because pixels are arranged RGRG... in odd lines and GBGB... in even lines,
either a one line shift or even/odd line swap is required, too, but
apparently this doesn't happen.

I finally figured out that this can be done manually by setting
REG04_VREF_EN.
Looking at hflip, it turns out that bit REG04_HREF_EN is set there
permanently, but according to my tests has no effect on the pixel readout
order.
So my conclusion is that the current documentation of sensor register 0x04
is wrong (has changed after preliminary datasheet version 2.2).

I'm pretty sure that automatic vertical line shift/switch can be enabled,
too, but until anyone finds out how this works, we have to stick with manual
switching.

Signed-off-by: Frank Schäfer 
Signed-off-by: Mauro Carvalho Chehab 
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings 
---
 drivers/media/i2c/soc_camera/ov2640.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/media/i2c/soc_camera/ov2640.c
+++ b/drivers/media/i2c/soc_camera/ov2640.c
@@ -713,8 +713,10 @@ static int ov2640_s_ctrl(struct v4l2_ctr
 
switch (ctrl->id) {
case V4L2_CID_VFLIP:
-   val = ctrl->val ? REG04_VFLIP_IMG : 0x00;
-   return ov2640_mask_set(client, REG04, REG04_VFLIP_IMG, val);
+   val = ctrl->val ? REG04_VFLIP_IMG | REG04_VREF_EN : 0x00;
+   return ov2640_mask_set(client, REG04,
+  REG04_VFLIP_IMG | REG04_VREF_EN, val);
+   /* NOTE: REG04_VREF_EN: 1 line shift / even/odd line swap */
case V4L2_CID_HFLIP:
val = ctrl->val ? REG04_HFLIP_IMG : 0x00;
return ov2640_mask_set(client, REG04, REG04_HFLIP_IMG, val);
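
A user-space sketch of the read-modify-write that the VFLIP case performs,
assuming the usual "clear the masked bits, then OR in the new value"
semantics for ov2640_mask_set().  The bit values below are assumptions for
illustration, and mask_set() and set_vflip() are hypothetical helpers:

```c
#include <stdint.h>

/* Assumed bit positions in sensor register 0x04 (illustrative only). */
#define REG04_VFLIP_IMG 0x40
#define REG04_VREF_EN   0x10

/* Replace only the bits selected by mask; leave all other bits alone. */
uint8_t mask_set(uint8_t reg, uint8_t mask, uint8_t val)
{
	return (uint8_t)((reg & ~mask) | (val & mask));
}

/* Toggle vertical flip together with the one-line shift bit. */
uint8_t set_vflip(uint8_t reg04, int enable)
{
	uint8_t bits = REG04_VFLIP_IMG | REG04_VREF_EN;

	return mask_set(reg04, bits, enable ? bits : 0x00);
}
```

Putting both bits in the mask matters: disabling vflip then also clears
VREF_EN instead of leaving the line shift behind.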



[PATCH 3.16 065/134] ath9k: off by one in ath9k_hw_nvram_read_array()

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Dan Carpenter 

commit b7dcf68f383a05567bd16a390907b67022a62d3d upstream.

The > should be >= or we read one space beyond the end of the array.

Fixes: ab5c4f71d8c7 ("ath9k: allow to load EEPROM content via firmware API")
Signed-off-by: Dan Carpenter 
Signed-off-by: Kalle Valo 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings 
---
 drivers/net/wireless/ath/ath9k/eeprom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/wireless/ath/ath9k/eeprom.c
+++ b/drivers/net/wireless/ath/ath9k/eeprom.c
@@ -118,7 +118,7 @@ static bool ath9k_hw_nvram_read_blob(str
 {
u16 *blob_data;
 
-   if (off * sizeof(u16) > ah->eeprom_blob->size)
+   if (off * sizeof(u16) >= ah->eeprom_blob->size)
return false;
 
blob_data = (u16 *)ah->eeprom_blob->data;
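
A user-space sketch of the corrected bounds check, assuming a blob whose
size is an even number of bytes.  blob_read_u16() is a hypothetical helper,
not the driver function:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read the 16-bit word at index off from a blob of size bytes.
 * Valid indices are 0 .. size/2 - 1, so a byte offset equal to size is
 * already one word past the end and must be rejected (>=, not >).
 */
bool blob_read_u16(const uint8_t *blob, size_t size, size_t off,
		   uint16_t *out)
{
	if (off * sizeof(uint16_t) >= size)
		return false;

	memcpy(out, blob + off * sizeof(uint16_t), sizeof(uint16_t));
	return true;
}
```

With an 8-byte blob, index 3 is the last valid word; index 4 passes the old
'>' test (8 > 8 is false) but is rejected by '>='.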



[PATCH 3.16 060/134] PCI: Freeze PME scan before suspending devices

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Lukas Wunner 

commit ea00353f36b64375518662a8ad15e39218a1f324 upstream.

Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790)
crashes during suspend tests.  Geert Uytterhoeven managed to reproduce the
issue on an M2-W Koelsch board (r8a7791):

  It occurs when the PME scan runs, once per second.  During PME scan, the
  PCI host bridge (rcar-pci) registers are accessed while its module clock
  has already been disabled, leading to the crash.

One reproducer is to configure s2ram to use "s2idle" instead of "deep"
suspend:

  # echo 0 > /sys/module/printk/parameters/console_suspend
  # echo s2idle > /sys/power/mem_sleep
  # echo mem > /sys/power/state

Another reproducer is to write either "platform" or "processors" to
/sys/power/pm_test.  It does not happen (or is less likely to happen) during full
system suspend ("core" or "none") because system suspend also disables
timers, and thus the workqueue handling PME scans no longer runs.  Geert
believes the issue may still happen in the small window between disabling
module clocks and disabling timers:

  # echo 0 > /sys/module/printk/parameters/console_suspend
  # echo platform > /sys/power/pm_test    # Or "processors"
  # echo mem > /sys/power/state

(Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.)

Rafael Wysocki agrees that PME scans should be suspended before the host
bridge registers become inaccessible.  To that end, queue the task on a
workqueue that gets frozen before devices suspend.

Rafael notes however that as a result, some wakeup events may be missed if
they are delivered via PME from a device without working IRQ (which hence
must be polled) and occur after the workqueue has been frozen.  If that
turns out to be an issue in practice, it may be possible to solve it by
calling pci_pme_list_scan() once directly from one of the host bridge's
pm_ops callbacks.

Stacktrace for posterity:

  PM: Syncing filesystems ... [   38.566237] done.
  PM: Preparing system for sleep (mem)
  Freezing user space processes ... [   38.579813] (elapsed 0.001 seconds) done.
  Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
  PM: Suspending system (mem)
  PM: suspend of devices complete after 152.456 msecs
  PM: late suspend of devices complete after 2.809 msecs
  PM: noirq suspend of devices complete after 29.863 msecs
  suspend debug: Waiting for 5 second(s).
  Unhandled fault: asynchronous external abort (0x1211) at 0x
  pgd = c0003000
  [] *pgd=8040004003, *pmd=
  Internal error: : 1211 [#1] SMP ARM
  Modules linked in:
  CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted
  4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383
  Hardware name: Generic R8A7791 (Flattened Device Tree)
  Workqueue: events pci_pme_list_scan
  task: eb56e140 task.stack: eb58e000
  PC is at pci_generic_config_read+0x64/0x6c
  LR is at rcar_pci_cfg_base+0x64/0x84
  pc : []lr : []psr: 600d0093
  sp : eb58fe98  ip : c041d750  fp : 0008
  r10: c0e2283c  r9 :   r8 : 600d0013
  r7 : 0008  r6 : eb58fed6  r5 : 0002  r4 : eb58feb4
  r3 :   r2 : 0044  r1 : 0008  r0 : 
  Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
  Control: 30c5387d  Table: 6a9f6c80  DAC: 
  Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210)
  Stack: (0xeb58fe98 to 0xeb59)
  fe80:   0002 0044
  fea0: eb6f5800 c041d9b0 eb58feb4 0008 0044  eb78a000 eb78a000
  fec0: 0044  eb9aff00 c0424bf0 eb78a000  eb78a000 c0e22830
  fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc
  ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100
  ff20: eb55f398 c02366c4 eb56e140 eb5631c0  eb55f380 c023641c 
  ff40:    c023a928 cd105598  40506a34 eb55f380
  ff60:   dead4ead   eb58ff74 eb58ff74 
  ff80:  dead4ead   eb58ff90 eb58ff90 eb58ffac eb5631c0
  ffa0: c023a844   c0206d68    
  ffc0:        
  ffe0:     0013  3a81336c 10ccd1dd
  [] (pci_generic_config_read) from []
  (pci_bus_read_config_word+0x58/0x80)
  [] (pci_bus_read_config_word) from []
  (pci_check_pme_status+0x34/0x78)
  [] (pci_check_pme_status) from [] (pci_pme_wakeup+0x28/0x54)
  [] (pci_pme_wakeup) from [] (pci_pme_list_scan+0x58/0xb4)
  [] (pci_pme_list_scan) from []
  (process_one_work+0x1bc/0x308)
  [] (process_one_work) from [] (worker_thread+0x2a8/0x3e0)
  [] (worker_thread) from [] (kthread+0xe4/0xfc)
  [] (kthread) from [] (ret_from_fork+0x14/0x2c)
  Code: ea00 e5903000 f57ff04f e3a0 (e5843000)
  ---[ 

[PATCH 3.16 067/134] usb: host: xhci: print correct command ring address

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Peter Chen 

commit 6fc091fb0459ade939a795bfdcaf645385b951d4 upstream.

Print correct command ring address using 'val_64'.

Signed-off-by: Peter Chen 
Signed-off-by: Mathias Nyman 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Ben Hutchings 
---
 drivers/usb/host/xhci-mem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2459,7 +2459,7 @@ int xhci_mem_init(struct xhci_hcd *xhci,
(xhci->cmd_ring->first_seg->dma & (u64) ~CMD_RING_RSVD_BITS) |
xhci->cmd_ring->cycle_state;
xhci_dbg_trace(xhci, trace_xhci_dbg_init,
-   "// Setting command ring address to 0x%x", val);
+   "// Setting command ring address to 0x%016llx", val_64);
xhci_write_64(xhci, val_64, &xhci->op_regs->cmd_ring);
xhci_dbg_cmd_ptrs(xhci);
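
A user-space sketch of the formatting side of this change: a 64-bit value
needs a 64-bit conversion, zero-padded to 16 hex digits.
format_ring_addr() is a hypothetical helper; in the kernel the value is
val_64 and the sink is xhci_dbg_trace():

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit address as "0x" plus 16 zero-padded hex digits.
 * The cast keeps the argument type in step with the %llx conversion.
 */
int format_ring_addr(char *dst, size_t len, uint64_t addr)
{
	return snprintf(dst, len, "0x%016llx", (unsigned long long)addr);
}
```

Printing with "%x" would take a 32-bit argument and lose the upper half of
the address.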
 



[PATCH 3.16 066/134] KVM: arm/arm64: fix races in kvm_psci_vcpu_on

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Andrew Jones 

commit 6c7a5dce22b3f3cc44be098e2837fa6797edb8b8 upstream.

Fix potential races in kvm_psci_vcpu_on() by taking the kvm->lock
mutex.  In general, it's a bad idea to allow more than one PSCI_CPU_ON
to process the same target VCPU at the same time.  One such problem
that may arise is that one PSCI_CPU_ON could be resetting the target
vcpu, which fills the entire sys_regs array with a temporary value
including the MPIDR register, while another looks up the VCPU based
on the MPIDR value, resulting in no target VCPU found.  Resolves both
races found with the kvm-unit-tests/arm/psci unit test.

Reviewed-by: Marc Zyngier 
Reviewed-by: Christoffer Dall 
Reported-by: Levente Kurusa 
Suggested-by: Christoffer Dall 
Signed-off-by: Andrew Jones 
Signed-off-by: Christoffer Dall 
Signed-off-by: Ben Hutchings 
---
 arch/arm/kvm/psci.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -191,9 +191,10 @@ int kvm_psci_version(struct kvm_vcpu *vc
 
 static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
 {
-   int ret = 1;
+   struct kvm *kvm = vcpu->kvm;
unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
+   int ret = 1;
 
switch (psci_fn) {
case PSCI_0_2_FN_PSCI_VERSION:
@@ -213,7 +214,9 @@ static int kvm_psci_0_2_call(struct kvm_
break;
case PSCI_0_2_FN_CPU_ON:
case PSCI_0_2_FN64_CPU_ON:
+   mutex_lock(&kvm->lock);
val = kvm_psci_vcpu_on(vcpu);
+   mutex_unlock(&kvm->lock);
break;
case PSCI_0_2_FN_AFFINITY_INFO:
case PSCI_0_2_FN64_AFFINITY_INFO:
@@ -269,6 +272,7 @@ static int kvm_psci_0_2_call(struct kvm_
 
 static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
 {
+   struct kvm *kvm = vcpu->kvm;
unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
 
@@ -278,7 +282,9 @@ static int kvm_psci_0_1_call(struct kvm_
val = PSCI_RET_SUCCESS;
break;
case KVM_PSCI_FN_CPU_ON:
+   mutex_lock(&kvm->lock);
val = kvm_psci_vcpu_on(vcpu);
+   mutex_unlock(&kvm->lock);
break;
case KVM_PSCI_FN_CPU_SUSPEND:
case KVM_PSCI_FN_MIGRATE:



[PATCH 3.2 45/59] net: ethernet: ucc_geth: fix MEM_PART_MURAM mode

2017-08-18 Thread Ben Hutchings
3.2.92-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Christophe Leroy 

commit 8b8642af15ed14b9a7a34d3401afbcc274533e13 upstream.

Since commit 5093bb965a163 ("powerpc/QE: switch to the cpm_muram
implementation"), muram area is not part of immrbar mapping anymore
so immrbar_virt_to_phys() is not usable anymore.

Fixes: 5093bb965a163 ("powerpc/QE: switch to the cpm_muram implementation")
Signed-off-by: Christophe Leroy 
Acked-by: David S. Miller 
Acked-by: Li Yang 
Signed-off-by: Scott Wood 
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings 
---
 arch/powerpc/include/asm/qe.h | 1 +
 drivers/net/ethernet/freescale/ucc_geth.c | 8 +++-
 2 files changed, 4 insertions(+), 5 deletions(-)

--- a/arch/powerpc/include/asm/qe.h
+++ b/arch/powerpc/include/asm/qe.h
@@ -193,6 +193,7 @@ static inline int qe_alive_during_sleep(
 #define qe_muram_free cpm_muram_free
 #define qe_muram_addr cpm_muram_addr
 #define qe_muram_offset cpm_muram_offset
+#define qe_muram_dma cpm_muram_dma
 
 /* Structure that defines QE firmware binary files.
  *
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -2591,11 +2591,10 @@ static int ucc_geth_startup(struct ucc_g
} else if (ugeth->ug_info->uf_info.bd_mem_part ==
   MEM_PART_MURAM) {
out_be32(&ugeth->p_send_q_mem_reg->sqqd[i].bd_ring_base,
-(u32) immrbar_virt_to_phys(ugeth->
-   p_tx_bd_ring[i]));
+(u32)qe_muram_dma(ugeth->p_tx_bd_ring[i]));
out_be32(&ugeth->p_send_q_mem_reg->sqqd[i].
 last_bd_completed_address,
-(u32) immrbar_virt_to_phys(endOfRing));
+(u32)qe_muram_dma(endOfRing));
}
}
 
@@ -2856,8 +2855,7 @@ static int ucc_geth_startup(struct ucc_g
} else if (ugeth->ug_info->uf_info.bd_mem_part ==
   MEM_PART_MURAM) {
out_be32(&ugeth->p_rx_bd_qs_tbl[i].externalbdbaseptr,
-(u32) immrbar_virt_to_phys(ugeth->
-   p_rx_bd_ring[i]));
+(u32)qe_muram_dma(ugeth->p_rx_bd_ring[i]));
}
/* rest of fields handled by QE */
}



[PATCH 3.16 053/134] net: ipv6: send unsolicited NA on admin up

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: David Ahern 

commit 4a6e3c5def13c91adf2acc613837001f09af3baa upstream.

ndisc_notify is the ipv6 equivalent to arp_notify. When arp_notify is
set to 1, gratuitous arp requests are sent when the device is brought up.
The same is expected when ndisc_notify is set to 1 (per ndisc_notify in
Documentation/networking/ip-sysctl.txt). The NA is not sent on NETDEV_UP
event; add it.

Fixes: 5cb04436eef6 ("ipv6: add knob to send unsolicited ND on link-layer address change")
Signed-off-by: David Ahern 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller 
Signed-off-by: Ben Hutchings 
---
 net/ipv6/ndisc.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1605,6 +1605,8 @@ static int ndisc_netdev_event(struct not
case NETDEV_CHANGEADDR:
neigh_changeaddr(&nd_tbl, dev);
fib6_run_gc(0, net, false);
+   /* fallthrough */
+   case NETDEV_UP:
idev = in6_dev_get(dev);
if (!idev)
break;
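
A user-space sketch of the control flow this hunk sets up: the
address-change case does its own work and then falls through into the new
"up" case, so both end in the same notification path.  The enum, struct
and handle_event() are hypothetical stand-ins:

```c
/* Hypothetical event codes standing in for the NETDEV_* notifier events. */
enum netdev_event { EV_CHANGEADDR, EV_UP, EV_DOWN };

struct result {
	int flushed;	/* neighbour table was flushed */
	int na_sent;	/* unsolicited NA was sent */
};

struct result handle_event(enum netdev_event ev, int ndisc_notify)
{
	struct result r = { 0, 0 };

	switch (ev) {
	case EV_CHANGEADDR:
		r.flushed = 1;
		/* fallthrough */
	case EV_UP:
		/* shared tail: send the NA only if the knob is set */
		if (ndisc_notify)
			r.na_sent = 1;
		break;
	case EV_DOWN:
		r.flushed = 1;
		break;
	}
	return r;
}
```

The explicit fallthrough comment documents that the missing break is
intentional.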



[PATCH 3.16 061/134] USB: serial: ftdi_sio: add device ID for Microsemi/Arrow SF2PLUS Dev Kit

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Marek Vasut 

commit 31c5d1922b90ddc1da6a6ddecef7cd31f17aa32b upstream.

This development kit has an FT4232 on it with a custom USB VID/PID.
The FT4232 provides four UARTs, but only two are used. The UART 0
is used by the FlashPro5 programmer and UART 2 is connected to the
SmartFusion2 CortexM3 SoC UART port.

Note that the USB VID is registered to Actel according to Linux USB
VID database, but that was acquired by Microsemi.

Signed-off-by: Marek Vasut 
Signed-off-by: Johan Hovold 
Signed-off-by: Ben Hutchings 
---
 drivers/usb/serial/ftdi_sio.c | 1 +
 drivers/usb/serial/ftdi_sio_ids.h | 6 ++
 2 files changed, 7 insertions(+)

--- a/drivers/usb/serial/ftdi_sio.c
+++ b/drivers/usb/serial/ftdi_sio.c
@@ -886,6 +886,7 @@ static const struct usb_device_id id_tab
{ USB_DEVICE_AND_INTERFACE_INFO(MICROCHIP_VID, MICROCHIP_USB_BOARD_PID,
USB_CLASS_VENDOR_SPEC,
USB_SUBCLASS_VENDOR_SPEC, 0x00) },
+   { USB_DEVICE_INTERFACE_NUMBER(ACTEL_VID, MICROSEMI_ARROW_SF2PLUS_BOARD_PID, 2) },
{ USB_DEVICE(JETI_VID, JETI_SPC1201_PID) },
{ USB_DEVICE(MARVELL_VID, MARVELL_SHEEVAPLUG_PID),
.driver_info = (kernel_ulong_t)&ftdi_jtag_quirk },
--- a/drivers/usb/serial/ftdi_sio_ids.h
+++ b/drivers/usb/serial/ftdi_sio_ids.h
@@ -873,6 +873,12 @@
 #define FIC_VID 0x1457
 #define FIC_NEO1973_DEBUG_PID   0x5118
 
+/*
+ * Actel / Microsemi
+ */
+#define ACTEL_VID  0x1514
+#define MICROSEMI_ARROW_SF2PLUS_BOARD_PID  0x2008
+
 /* Olimex */
 #define OLIMEX_VID 0x15BA
 #define OLIMEX_ARM_USB_OCD_PID 0x0003



[PATCH 3.16 063/134] [media] dw2102: limit messages to buffer size

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Alyssa Milburn 

commit 950e252cb469f323740d78e4907843acef89eedb upstream.

Otherwise the i2c transfer functions can read or write beyond the end of
stack or heap buffers.

Signed-off-by: Alyssa Milburn 
Signed-off-by: Mauro Carvalho Chehab 
[bwh: Backported to 3.16:
 - Use obuf instead of state->data
 - Adjust context]
Signed-off-by: Ben Hutchings 
---
 drivers/media/usb/dvb-usb/dw2102.c | 54 ++
 1 file changed, 54 insertions(+)

--- a/drivers/media/usb/dvb-usb/dw2102.c
+++ b/drivers/media/usb/dvb-usb/dw2102.c
@@ -247,6 +247,20 @@ static int dw2102_serit_i2c_transfer(str
 
switch (num) {
case 2:
+   if (msg[0].len != 1) {
+   warn("i2c rd: len=%d is not 1!\n",
+msg[0].len);
+   num = -EOPNOTSUPP;
+   break;
+   }
+
+   if (2 + msg[1].len > sizeof(buf6)) {
+   warn("i2c rd: len=%d is too big!\n",
+msg[1].len);
+   num = -EOPNOTSUPP;
+   break;
+   }
+
/* read si2109 register by number */
buf6[0] = msg[0].addr << 1;
buf6[1] = msg[0].len;
@@ -262,6 +276,13 @@ static int dw2102_serit_i2c_transfer(str
case 1:
switch (msg[0].addr) {
case 0x68:
+   if (2 + msg[0].len > sizeof(buf6)) {
+   warn("i2c wr: len=%d is too big!\n",
+msg[0].len);
+   num = -EOPNOTSUPP;
+   break;
+   }
+
/* write to si2109 register */
buf6[0] = msg[0].addr << 1;
buf6[1] = msg[0].len;
@@ -305,6 +326,13 @@ static int dw2102_earda_i2c_transfer(str
/* first write first register number */
u8 ibuf[MAX_XFER_SIZE], obuf[3];
 
+   if (2 + msg[0].len != sizeof(obuf)) {
+   warn("i2c rd: len=%d is not 1!\n",
+msg[0].len);
+   ret = -EOPNOTSUPP;
+   goto unlock;
+   }
+
if (2 + msg[1].len > sizeof(ibuf)) {
warn("i2c rd: len=%d is too big!\n",
 msg[1].len);
@@ -505,6 +533,12 @@ static int dw3101_i2c_transfer(struct i2
/* first write first register number */
u8 ibuf[MAX_XFER_SIZE], obuf[3];
 
+   if (2 + msg[0].len != sizeof(obuf)) {
+   warn("i2c rd: len=%d is not 1!\n",
+msg[0].len);
+   ret = -EOPNOTSUPP;
+   goto unlock;
+   }
if (2 + msg[1].len > sizeof(ibuf)) {
warn("i2c rd: len=%d is too big!\n",
 msg[1].len);
@@ -730,6 +764,13 @@ static int su3000_i2c_transfer(struct i2
msg[0].buf[0] = ibuf[1];
break;
default:
+   if (3 + msg[0].len > sizeof(obuf)) {
+   warn("i2c wr: len=%d is too big!\n",
+msg[0].len);
+   num = -EOPNOTSUPP;
+   break;
+   }
+
/* always i2c write*/
obuf[0] = 0x08;
obuf[1] = msg[0].addr;
@@ -745,6 +786,19 @@ static int su3000_i2c_transfer(struct i2
break;
case 2:
/* always i2c read */
+   if (4 + msg[0].len > sizeof(obuf)) {
+   warn("i2c rd: len=%d is too big!\n",
+msg[0].len);
+   num = -EOPNOTSUPP;
+   break;
+   }
+   if (1 + msg[1].len > sizeof(obuf)) {
+   warn("i2c rd: len=%d is too big!\n",
+msg[1].len);
+   num = -EOPNOTSUPP;
+   break;
+   }
+
obuf[0] = 0x09;
obuf[1] = msg[0].len;
obuf[2] = msg[1].len;
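
A user-space sketch of the check pattern these hunks add: validate the
header plus payload length against the destination buffer before copying.
OBUF_SIZE and build_write() are hypothetical; the driver uses its own
buffers and header sizes:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OBUF_SIZE 8	/* hypothetical buffer size for the sketch */

/*
 * Build a write message: a 2-byte header followed by the payload.
 * Returns the message length, or -EOPNOTSUPP if it would not fit.
 */
int build_write(uint8_t addr, const uint8_t *data, size_t len,
		uint8_t obuf[OBUF_SIZE])
{
	if (2 + len > OBUF_SIZE)
		return -EOPNOTSUPP;

	obuf[0] = (uint8_t)(addr << 1);
	obuf[1] = (uint8_t)len;
	memcpy(obuf + 2, data, len);
	return (int)(2 + len);
}
```

The check has to include the header bytes, which is why the patch compares
"2 + len" (or 3, or 4) against sizeof() rather than len alone.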



[PATCH 3.16 062/134] [media] ttusb2: limit messages to buffer size

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Alyssa Milburn 

commit a12b8ab8c5ff7ccd7b107a564743507c850a441d upstream.

Otherwise ttusb2_i2c_xfer can read or write beyond the end of static and
heap buffers.

Signed-off-by: Alyssa Milburn 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Ben Hutchings 
---
 drivers/media/usb/dvb-usb/ttusb2.c | 19 +++
 1 file changed, 19 insertions(+)

--- a/drivers/media/usb/dvb-usb/ttusb2.c
+++ b/drivers/media/usb/dvb-usb/ttusb2.c
@@ -78,6 +78,9 @@ static int ttusb2_msg(struct dvb_usb_dev
u8 *s, *r = NULL;
int ret = 0;
 
+   if (4 + rlen > 64)
+   return -EIO;
+
s = kzalloc(wlen+4, GFP_KERNEL);
if (!s)
return -ENOMEM;
@@ -381,6 +384,22 @@ static int ttusb2_i2c_xfer(struct i2c_ad
write_read = i+1 < num && (msg[i+1].flags & I2C_M_RD);
read = msg[i].flags & I2C_M_RD;
 
+   if (3 + msg[i].len > sizeof(obuf)) {
+   err("i2c wr len=%d too high", msg[i].len);
+   break;
+   }
+   if (write_read) {
+   if (3 + msg[i+1].len > sizeof(ibuf)) {
+   err("i2c rd len=%d too high", msg[i+1].len);
+   break;
+   }
+   } else if (read) {
+   if (3 + msg[i].len > sizeof(ibuf)) {
+   err("i2c rd len=%d too high", msg[i].len);
+   break;
+   }
+   }
+
obuf[0] = (msg[i].addr << 1) | (write_read | read);
if (read)
obuf[1] = 0;



[PATCH 3.16 057/134] PCI: Ignore write combining when mapping I/O port space

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Bjorn Helgaas 

commit 3a92c319c44a7bcee9f48dff9d97d001943b54c6 upstream.

PCI exposes files like /proc/bus/pci/00/00.0 in procfs.  These files
support operations like this:

  ioctl(fd, PCIIOC_MMAP_IS_IO);   # request I/O port space
  ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining
  mmap(fd, ...)

Write combining is useful on PCI memory space, but I don't think it makes
sense on PCI I/O port space.

We *could* change proc_bus_pci_ioctl() to make it impossible to set
mmap_state == pci_mmap_io and write_combine at the same time, but that
would break the following sequence, which is currently legal:

  mmap(fd, ...)   # default is I/O, non-combining
  ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining
  ioctl(fd, PCIIOC_MMAP_IS_MEM);  # request memory space
  mmap(fd, ...)   # get write-combining mapping

Ignore the write-combining flag when mapping I/O port space.

This patch should have no functional effect, based on this analysis of all
implementations of pci_mmap_page_range():

  - ia64 mips parisc sh unicore32 x86 do not support mapping of I/O port
space at all.

  - arm cris microblaze mn10300 sparc xtensa support mapping of I/O port
space, but ignore the write_combine argument to pci_mmap_page_range().

  - powerpc supports mapping of I/O port space and uses write_combine, and
it disables write combining for I/O port space in
__pci_mmap_set_pgprot().

This patch makes it possible to remove __pci_mmap_set_pgprot() from
powerpc, which simplifies that path.

Signed-off-by: Bjorn Helgaas 
Signed-off-by: Ben Hutchings 
---
 drivers/pci/proc.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -231,7 +231,7 @@ static int proc_bus_pci_mmap(struct file
 {
struct pci_dev *dev = PDE_DATA(file_inode(file));
struct pci_filp_private *fpriv = file->private_data;
-   int i, ret;
+   int i, ret, write_combine;
 
if (!capable(CAP_SYS_RAWIO))
return -EPERM;
@@ -245,9 +245,12 @@ static int proc_bus_pci_mmap(struct file
if (i >= PCI_ROM_RESOURCE)
return -ENODEV;
 
+   if (fpriv->mmap_state == pci_mmap_mem)
+   write_combine = fpriv->write_combine;
+   else
+   write_combine = 0;
ret = pci_mmap_page_range(dev, vma,
- fpriv->mmap_state,
- fpriv->write_combine);
+ fpriv->mmap_state, write_combine);
if (ret < 0)
return ret;
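
A user-space sketch of the selection logic added here: the write-combining
request is honoured only for memory space and ignored for I/O port space.
effective_write_combine() is a hypothetical helper; the enum names mirror
the ones used in the patch:

```c
enum pci_mmap_state { pci_mmap_io, pci_mmap_mem };

/* Return the write-combine setting that should reach the mapping code. */
int effective_write_combine(enum pci_mmap_state state, int requested)
{
	return state == pci_mmap_mem ? requested : 0;
}
```

Deciding at mmap time keeps both ioctl orderings described above legal,
since the stored flag is never rejected, only ignored.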
 



[PATCH 3.16 048/134] perf/x86/pebs: Fix handling of PEBS buffer overflows

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Stephane Eranian 

commit daa864b8f8e34477bde817f26d736d89dc6032f3 upstream.

This patch solves a race condition between PEBS and the PMU handler.

In case multiple PEBS events are sampled at the same time,
it is possible to have GLOBAL_STATUS bit 62 set indicating
PEBS buffer overflow and also seeing at most 3 PEBS counters
having their bits set in the status register. This is a sign
that there was at least one PEBS record pending at the time
of the PMU interrupt. PEBS counters must only be processed
via the drain_pebs() calls, and not via the regular sample
processing loop coming after that in the function, otherwise
phony regular samples may be generated in the sampling buffer
not marked with the EXACT tag.

Another possibility is to have one PEBS event and at least
one non-PEBS event which overflows while PEBS is armed. In this
case, bit 62 of GLOBAL_STATUS will not be set, yet the overflow
status bit for the PEBS counter will be on Skylake.

To avoid this problem, we systematically ignore the PEBS-enabled
counters from the GLOBAL_STATUS mask and we always process PEBS
events via drain_pebs().

The problem manifested itself by having non-exact samples when
sampling only PEBS events, i.e., the PERF_SAMPLE_RECORD would
not have the EXACT flag set.

Note that this problem is only present on Skylake processors.
This fix is harmless on older processors.

Reported-by: Peter Zijlstra 
Signed-off-by: Stephane Eranian 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Link: http://lkml.kernel.org/r/1482395366-8992-1-git-send-email-eran...@google.com
Signed-off-by: Ingo Molnar 
[bwh: Backported to 3.16: adjust filename, context]
Signed-off-by: Ben Hutchings 
---
 arch/x86/kernel/cpu/perf_event_intel.c | 30 +-
 1 file changed, 21 insertions(+), 9 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1402,20 +1402,33 @@ again:
}
 
/*
+* In case multiple PEBS events are sampled at the same time,
+* it is possible to have GLOBAL_STATUS bit 62 set indicating
+* PEBS buffer overflow and also seeing at most 3 PEBS counters
+* having their bits set in the status register. This is a sign
+* that there was at least one PEBS record pending at the time
+* of the PMU interrupt. PEBS counters must only be processed
+* via the drain_pebs() calls and not via the regular sample
+* processing loop coming after that the function, otherwise
+* phony regular samples may be generated in the sampling buffer
+* not marked with the EXACT tag. Another possibility is to have
+* one PEBS event and at least one non-PEBS event whic hoverflows
+* while PEBS has armed. In this case, bit 62 of GLOBAL_STATUS will
+* not be set, yet the overflow status bit for the PEBS counter will
+* be on Skylake.
+*
+* To avoid this problem, we systematically ignore the PEBS-enabled
+* counters from the GLOBAL_STATUS mask and we always process PEBS
+* events via drain_pebs().
+*/
+   status &= ~cpuc->pebs_enabled;
+
+   /*
 * PEBS overflow sets bit 62 in the global status register
 */
if (__test_and_clear_bit(62, (unsigned long *)&status)) {
handled++;
x86_pmu.drain_pebs(regs);
-   /*
-* There are cases where, even though, the PEBS ovfl bit is set
-* in GLOBAL_OVF_STATUS, the PEBS events may also have their
-* overflow bits set for their counters. We must clear them
-* here because they have been processed as exact samples in
-* the drain_pebs() routine. They must not be processed again
-* in the for_each_bit_set() loop for regular samples below.
-*/
-   status &= ~cpuc->pebs_enabled;
status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
}
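
A user-space sketch of the masking order after this change: PEBS-enabled
counter bits are removed from the status word first, and only then is the
PEBS overflow bit (62) tested.  struct pmi_result and split_status() are
hypothetical names for illustration:

```c
#include <stdint.h>

#define PEBS_OVF_BIT 62	/* GLOBAL_STATUS bit signalling PEBS overflow */

struct pmi_result {
	uint64_t regular;	/* counters left for the regular sample loop */
	int drain_pebs;		/* whether the PEBS buffer must be drained */
};

struct pmi_result split_status(uint64_t status, uint64_t pebs_enabled)
{
	struct pmi_result r;

	/* Never hand PEBS counters to the regular sample loop. */
	status &= ~pebs_enabled;

	/* Test and clear the PEBS overflow bit. */
	r.drain_pebs = (int)((status >> PEBS_OVF_BIT) & 1);
	status &= ~(UINT64_C(1) << PEBS_OVF_BIT);

	r.regular = status;
	return r;
}
```

Masking unconditionally covers the case where a PEBS counter's overflow bit
is set but bit 62 is not.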
 



Re: [PATCH v2 3/3] vfio/pci: Don't probe devices that can't be reset

2017-08-18 Thread Jan Glauber
On Thu, Aug 17, 2017 at 07:00:17AM -0600, Alex Williamson wrote:
> On Thu, 17 Aug 2017 10:14:23 +0200
> Jan Glauber  wrote:
> 
> > If a PCI device supports neither function-level reset, nor slot
> > or bus reset then refuse to probe it. A line is printed to inform
> > the user.
> 
> But that's not what this does, this requires that the device is on a
> reset-able bus.  This is a massive regression.  With this we could no
> longer assign devices on the root complex or any device which doesn't
> return from bus reset and currently makes use of the NO_BUS_RESET flag
> and works happily otherwise.  Full NAK.  Thanks,

Looks like I missed the slot reset check. So how about this:

if (pci_probe_reset_slot(pdev->slot) && pci_probe_reset_bus(pdev->bus)) {
dev_warn(...);
return -ENODEV;
}

Or am I still missing something here?

thanks,
Jan

> Alex
>  
> > Without this change starting qemu with a vfio-pci device can lead to
> > a kernel panic on some Cavium cn8xxx systems, depending on the used
> > device.
> > 
> > Signed-off-by: Jan Glauber 
> > ---
> >  drivers/vfio/pci/vfio_pci.c | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> > index 063c1ce..029ba13 100644
> > --- a/drivers/vfio/pci/vfio_pci.c
> > +++ b/drivers/vfio/pci/vfio_pci.c
> > @@ -1196,6 +1196,12 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
> > return -EINVAL;
> >  
> > +   ret = pci_probe_reset_bus(pdev->bus);
> > +   if (ret) {
> > +   dev_warn(&pdev->dev, "Refusing to probe because reset is not possible.\n");
> > +   return ret;
> > +   }
> > +
> > group = vfio_iommu_group_get(&pdev->dev);
> > if (!group)
> > return -EINVAL;


[PATCH 3.16 052/134] ftrace: Fix removing of second function probe

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: "Steven Rostedt (VMware)" 

commit 82cc4fc2e70ec5baeff8f776f2773abc8b2cc0ae upstream.

When two function probes are added to set_ftrace_filter, and then one of
them is removed, the update to the function locations is not performed, the
record keeping of the function states is corrupted, and an ftrace_bug()
occurs.

This is easily reproducible by adding two probes, removing one, and then
adding it back again.

 # cd /sys/kernel/debug/tracing
 # echo schedule:traceoff > set_ftrace_filter
 # echo do_IRQ:traceoff > set_ftrace_filter
 # echo \!do_IRQ:traceoff > /debug/tracing/set_ftrace_filter
 # echo do_IRQ:traceoff > set_ftrace_filter

Causes:
 [ cut here ]
 WARNING: CPU: 2 PID: 1098 at kernel/trace/ftrace.c:2369 ftrace_get_addr_curr+0x143/0x220
 Modules linked in: [...]
 CPU: 2 PID: 1098 Comm: bash Not tainted 4.10.0-test+ #405
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
 Call Trace:
  dump_stack+0x68/0x9f
  __warn+0x111/0x130
  ? trace_irq_work_interrupt+0xa0/0xa0
  warn_slowpath_null+0x1d/0x20
  ftrace_get_addr_curr+0x143/0x220
  ? __fentry__+0x10/0x10
  ftrace_replace_code+0xe3/0x4f0
  ? ftrace_int3_handler+0x90/0x90
  ? printk+0x99/0xb5
  ? 0x8100
  ftrace_modify_all_code+0x97/0x110
  arch_ftrace_update_code+0x10/0x20
  ftrace_run_update_code+0x1c/0x60
  ftrace_run_modify_code.isra.48.constprop.62+0x8e/0xd0
  register_ftrace_function_probe+0x4b6/0x590
  ? ftrace_startup+0x310/0x310
  ? debug_lockdep_rcu_enabled.part.4+0x1a/0x30
  ? update_stack_state+0x88/0x110
  ? ftrace_regex_write.isra.43.part.44+0x1d3/0x320
  ? preempt_count_sub+0x18/0xd0
  ? mutex_lock_nested+0x104/0x800
  ? ftrace_regex_write.isra.43.part.44+0x1d3/0x320
  ? __unwind_start+0x1c0/0x1c0
  ? _mutex_lock_nest_lock+0x800/0x800
  ftrace_trace_probe_callback.isra.3+0xc0/0x130
  ? func_set_flag+0xe0/0xe0
  ? __lock_acquire+0x642/0x1790
  ? __might_fault+0x1e/0x20
  ? trace_get_user+0x398/0x470
  ? strcmp+0x35/0x60
  ftrace_trace_onoff_callback+0x48/0x70
  ftrace_regex_write.isra.43.part.44+0x251/0x320
  ? match_records+0x420/0x420
  ftrace_filter_write+0x2b/0x30
  __vfs_write+0xd7/0x330
  ? do_loop_readv_writev+0x120/0x120
  ? locks_remove_posix+0x90/0x2f0
  ? do_lock_file_wait+0x160/0x160
  ? __lock_is_held+0x93/0x100
  ? rcu_read_lock_sched_held+0x5c/0xb0
  ? preempt_count_sub+0x18/0xd0
  ? __sb_start_write+0x10a/0x230
  ? vfs_write+0x222/0x240
  vfs_write+0xef/0x240
  SyS_write+0xab/0x130
  ? SyS_read+0x130/0x130
  ? trace_hardirqs_on_caller+0x182/0x280
  ? trace_hardirqs_on_thunk+0x1a/0x1c
  entry_SYSCALL_64_fastpath+0x18/0xad
 RIP: 0033:0x7fe61c157c30
 RSP: 002b:7ffe87890258 EFLAGS: 0246 ORIG_RAX: 0001
 RAX: ffda RBX: 8114a410 RCX: 7fe61c157c30
 RDX: 0010 RSI: 55814798f5e0 RDI: 0001
 RBP: 8800c9027f98 R08: 7fe61c422740 R09: 7fe61ca53700
 R10: 0073 R11: 0246 R12: 558147a36400
 R13: 7ffe8788f160 R14: 0024 R15: 7ffe8788f15c
  ? trace_hardirqs_off_caller+0xc0/0x110
 ---[ end trace 99fa09b3d9869c2c ]---
 Bad trampoline accounting at: 81cc3b00 (do_IRQ+0x0/0x150)

Fixes: 59df055f1991 ("ftrace: trace different functions with a different tracer")
Signed-off-by: Steven Rostedt (VMware) 
[bwh: Backported to 3.16:
 - Use ftrace_run_update_code() instead of ftrace_run_modify_code(), and
   don't define old_hash
 - Adjust context]
Signed-off-by: Ben Hutchings 
---
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3119,23 +3119,24 @@ static void __enable_ftrace_function_pro
ftrace_probe_registered = 1;
 }
 
-static void __disable_ftrace_function_probe(void)
+static bool __disable_ftrace_function_probe(void)
 {
int i;
 
if (!ftrace_probe_registered)
-   return;
+   return false;
 
for (i = 0; i < FTRACE_FUNC_HASHSIZE; i++) {
struct hlist_head *hhd = &ftrace_func_hash[i];
if (hhd->first)
-   return;
+   return false;
}
 
/* no more funcs left */
ftrace_shutdown(&trace_probe_ops, 0);
 
ftrace_probe_registered = 0;
+   return true;
 }
 
 
@@ -3263,6 +3264,7 @@ __unregister_ftrace_function_probe(char
int type = MATCH_FULL;
int i, len = 0;
char *search;
+   bool disabled;
 
if (glob && (strcmp(glob, "*") == 0 || !strlen(glob)))
glob = NULL;
@@ -3316,12 +3318,16 @@ __unregister_ftrace_function_probe(char
}
}
mutex_lock(&ftrace_lock);
-   __disable_ftrace_function_probe();
+   disabled = __disable_ftrace_function_probe();
/*
 * Remove after the disable is called. Otherwise, if the last
 * probe is removed, a null hash means *all enabled*.

Re: [PATCH v2] blktrace: Fix potentail deadlock between delete & sysfs ops

2017-08-18 Thread Waiman Long
On 08/17/2017 07:23 PM, Steven Rostedt wrote:
> On Thu, 17 Aug 2017 18:18:18 -0400
> Steven Rostedt  wrote:
>
>> On Thu, 17 Aug 2017 18:13:20 -0400
>> Steven Rostedt  wrote:
>>
>>> On Thu, 17 Aug 2017 17:27:22 -0400
>>> Waiman Long  wrote:
>>>
>>>   
>>>> It is actually what the patch is trying to do by checking for the
>>>> deletion flag in the mutex_trylock loop. Please note that mutex does not
>>>> guarantee FIFO ordering of lock acquisition. As a result, cpu1 may call
>>>> mutex_lock() and wait for it while cpu2 can set the deletion flag later
>>>> and get the mutex first before cpu1. So checking for the deletion flag
>>>> before taking the mutex isn't enough.
>>> Yeah, I figured that out already (crossed emails). BTW, how did you
>>> trigger this warning. I'm playing around with adding loop devices,
>>> volume groups, and logical volumes, and reading the trace files
>>> created in the sysfs directory, then removing those items, but it's
>>> not triggering the "delete" path. What operation deletes the partition?  
>> I'm guessing that deleting an actual partition may work (unfortunately,
>> my test box has no partition to delete ;-) I'll find another box to
>> test on.
>>
> OK, deleting a partition doesn't trigger the lockdep splat. But I also
> added a printk in the BLKPG_DEL_PARTITION case switch, which also
> doesn't print. What command do I need to do to trigger that path?
>
> Thanks,
>
> -- Steve

Attached is a reproducer that was used to trigger the warning. Some
tuning may be needed depending on the actual configuration of the test machine.

Cheers,
Longman



run_test.sh
Description: application/shellscript


Re: [PATCH v2 6/6] kernel: tracepoints: add support for relative references

2017-08-18 Thread Steven Rostedt
On Fri, 18 Aug 2017 12:44:17 +0100
Ard Biesheuvel  wrote:

> On 18 August 2017 at 12:26, Ard Biesheuvel  wrote:
> > To avoid the need for relocating absolute references to tracepoint
> > structures at boot time when running relocatable kernels (which may
> > take a disproportionate amount of space), add the option to emit
> > these tables as relative references instead.
> >
> > Cc: Steven Rostedt 
> > Cc: Ingo Molnar 
> > Signed-off-by: Ard Biesheuvel 
> > ---
> >  include/linux/tracepoint.h | 42 ++--
> >  kernel/tracepoint.c|  7 +---
> >  2 files changed, 40 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> > index a26ffbe09e71..68701821933a 100644
> > --- a/include/linux/tracepoint.h
> > +++ b/include/linux/tracepoint.h
> > @@ -228,6 +228,42 @@ extern void syscall_unregfunc(void);
> > return static_key_false(&__tracepoint_##name.key);  \
> > }
> >
> > +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> > +#define __TRACEPOINT_ENTRY(name)   \
> > +   asm("   .section \"__tracepoints_ptrs\", \"a\"\n"   \
> > +   "   .balign 4\n"\
> > +   "   .long " VMLINUX_SYMBOL_STR(__tracepoint_##name) " - .\n"\
> > +   "   .previous\n")
> > +
> > +struct tracepoint_entry_t {
> > +   int tp_offset;
> > +};
> > +
> > +static inline
> > > +struct tracepoint *tracepoint_from_entry(const struct tracepoint_entry_t *ent)
> > +{
> > +   return (struct tracepoint *)((unsigned long)ent + ent->tp_offset);
> > +}
> > +#else
> > +#define __TRACEPOINT_ENTRY(name)\
> > +   static struct tracepoint * const __tracepoint_ptr_##name __used  \
> > +   __attribute__((section("__tracepoints_ptrs"))) = \
> > +   &__tracepoint_##name
> > +
> > +struct tracepoint_entry_t {
> > +   struct tracepoint *tp;
> > +};
> > +
> > +static inline
> > > +struct tracepoint *tracepoint_from_entry(const struct tracepoint_entry_t *ent)
> > +{
> > +   return ent->tp;
> > +}
> > +#endif
> > +
> > +extern struct tracepoint_entry_t const __start___tracepoints_ptrs[];
> > +extern struct tracepoint_entry_t const __stop___tracepoints_ptrs[];
> > +  
> 
> It appears the stuff above needs to be moved inside the double-include
> guard (which oddly enough does not cover the entire file)

Why was this moved to the header file? To fulfill some checkpatch
warning?

-- Steve

> 
> >  /*
> > >   * We have no guarantee that gcc and the linker won't up-align the tracepoint
> > >   * structures, so we create an array of pointers that will be used for iteration
> > @@ -237,11 +273,9 @@ extern void syscall_unregfunc(void);
> > static const char __tpstrtab_##name[]\
> > __attribute__((section("__tracepoints_strings"))) = #name;   \
> > struct tracepoint __tracepoint_##name\
> > -   __attribute__((section("__tracepoints"))) =  \
> > +   __attribute__((section("__tracepoints"))) __used =   \
> > > { __tpstrtab_##name, STATIC_KEY_INIT_FALSE, reg, unreg, NULL };\
> > -   static struct tracepoint * const __tracepoint_ptr_##name __used  \
> > -   __attribute__((section("__tracepoints_ptrs"))) = \
> > -   &__tracepoint_##name;
> > +   __TRACEPOINT_ENTRY(name);
> >
> >  #define DEFINE_TRACE(name) \
> > DEFINE_TRACE_FN(name, NULL, NULL);
> > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> > index 685c50ae6300..21bc41454fd6 100644
> > --- a/kernel/tracepoint.c
> > +++ b/kernel/tracepoint.c
> > @@ -28,9 +28,6 @@
> >  #include 
> >  #include 
> >
> > -extern struct tracepoint * const __start___tracepoints_ptrs[];
> > -extern struct tracepoint * const __stop___tracepoints_ptrs[];
> > -
> >  /* Set to 1 to enable tracepoint debug output */
> >  static const int tracepoint_debug;
> >
> > @@ -508,12 +505,12 @@ static void for_each_tracepoint_range(struct 
> > tracepoint * const *begin,
> > void (*fct)(struct tracepoint *tp, void *priv),
> > void *priv)
> >  {
> > -   struct tracepoint * const *iter;
> > +   struct tracepoint_entry_t const *iter;
> >
> > if (!begin)
> > return;
> > for (iter = begin; iter < end; iter++)
> > -   fct(*iter, priv);
> > +   fct(tracepoint_from_entry(iter), priv);
> >  }
> >
> >  /**
> > --
> > 2.11.0
> >  
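
The relative-reference idea in the quoted patch can be sketched in user space. This is an illustration only, not the kernel code: `struct tp`, `struct tp_entry`, `struct tp_pair`, `tp_entry_init()` and `tp_from_entry()` are hypothetical names, and the offset is computed at run time here, whereas the patch has the assembler emit it with `.long sym - .`.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct tracepoint. */
struct tp {
	const char *name;
};

/* One table slot: a 32-bit offset from the slot itself to its target. */
struct tp_entry {
	int32_t offset;
};

/* Keep slot and target in one object so the distance fits in 32 bits. */
struct tp_pair {
	struct tp_entry entry;
	struct tp target;
};

/* Record where the target lives, relative to the entry's own address. */
static void tp_entry_init(struct tp_entry *ent, const struct tp *target)
{
	ent->offset = (int32_t)((intptr_t)target - (intptr_t)ent);
}

/* Same shape as tracepoint_from_entry(): entry address plus stored offset. */
static struct tp *tp_from_entry(const struct tp_entry *ent)
{
	return (struct tp *)((intptr_t)ent + ent->offset);
}
```

Each slot is 4 bytes regardless of pointer size, and because the distance between slot and target does not change when the whole image is moved, the slot needs no boot-time relocation.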



Re: [PATCH v2 6/6] kernel: tracepoints: add support for relative references

2017-08-18 Thread Ard Biesheuvel
On 18 August 2017 at 14:43, Steven Rostedt  wrote:
> On Fri, 18 Aug 2017 12:44:17 +0100
> Ard Biesheuvel  wrote:
>
>> On 18 August 2017 at 12:26, Ard Biesheuvel  wrote:
>> > To avoid the need for relocating absolute references to tracepoint
>> > structures at boot time when running relocatable kernels (which may
>> > take a disproportionate amount of space), add the option to emit
>> > these tables as relative references instead.
>> >
>> > Cc: Steven Rostedt 
>> > Cc: Ingo Molnar 
>> > Signed-off-by: Ard Biesheuvel 
>> > ---
>> >  include/linux/tracepoint.h | 42 ++--
>> >  kernel/tracepoint.c|  7 +---
>> >  2 files changed, 40 insertions(+), 9 deletions(-)
>> >
>> > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
>> > index a26ffbe09e71..68701821933a 100644
>> > --- a/include/linux/tracepoint.h
>> > +++ b/include/linux/tracepoint.h
>> > @@ -228,6 +228,42 @@ extern void syscall_unregfunc(void);
>> > return static_key_false(&__tracepoint_##name.key);  \
>> > }
>> >
>> > +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
>> > +#define __TRACEPOINT_ENTRY(name)   \
>> > +   asm("   .section \"__tracepoints_ptrs\", \"a\"\n"   \
>> > +   "   .balign 4\n"\
>> > +   "   .long " VMLINUX_SYMBOL_STR(__tracepoint_##name) " - .\n"\
>> > +   "   .previous\n")
>> > +
>> > +struct tracepoint_entry_t {
>> > +   int tp_offset;
>> > +};
>> > +
>> > +static inline
>> > +struct tracepoint *tracepoint_from_entry(const struct tracepoint_entry_t *ent)
>> > +{
>> > +   return (struct tracepoint *)((unsigned long)ent + ent->tp_offset);
>> > +}
>> > +#else
>> > +#define __TRACEPOINT_ENTRY(name)\
>> > +   static struct tracepoint * const __tracepoint_ptr_##name __used  \
>> > +   __attribute__((section("__tracepoints_ptrs"))) = \
>> > +   &__tracepoint_##name
>> > +
>> > +struct tracepoint_entry_t {
>> > +   struct tracepoint *tp;
>> > +};
>> > +
>> > +static inline
>> > +struct tracepoint *tracepoint_from_entry(const struct tracepoint_entry_t *ent)
>> > +{
>> > +   return ent->tp;
>> > +}
>> > +#endif
>> > +
>> > +extern struct tracepoint_entry_t const __start___tracepoints_ptrs[];
>> > +extern struct tracepoint_entry_t const __stop___tracepoints_ptrs[];
>> > +
>>
>> It appears the stuff above needs to be moved inside the double-include
>> guard (which oddly enough does not cover the entire file)
>
> Why was this moved to the header file? To fulfill some checkpatch
> warning?
>

Yes.


[PATCH 3.16 058/134] PCI: Fix another sanity check bug in /proc/pci mmap

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: David Woodhouse 

commit 17caf56731311c9596e7d38a70c88fcb6afa6a1b upstream.

Don't match MMIO maps with I/O BARs and vice versa.

Signed-off-by: David Woodhouse 
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Ben Hutchings 
---
 drivers/pci/proc.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -231,14 +231,20 @@ static int proc_bus_pci_mmap(struct file
 {
struct pci_dev *dev = PDE_DATA(file_inode(file));
struct pci_filp_private *fpriv = file->private_data;
-   int i, ret, write_combine;
+   int i, ret, write_combine, res_bit;
 
if (!capable(CAP_SYS_RAWIO))
return -EPERM;
 
+   if (fpriv->mmap_state == pci_mmap_io)
+   res_bit = IORESOURCE_IO;
+   else
+   res_bit = IORESOURCE_MEM;
+
/* Make sure the caller is mapping a real resource for this device */
for (i = 0; i < PCI_ROM_RESOURCE; i++) {
-   if (pci_mmap_fits(dev, i, vma,  PCI_MMAP_PROCFS))
+   if (dev->resource[i].flags & res_bit &&
+   pci_mmap_fits(dev, i, vma,  PCI_MMAP_PROCFS))
break;
}
 



[PATCH 3.16 059/134] PCI: Only allow WC mmap on prefetchable resources

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: David Woodhouse 

commit cef4d02305a06be581bb7f4353446717a1b319ec upstream.

The /proc/bus/pci mmap interface allows the user to specify whether they
want WC or not.  Don't let them do so on non-prefetchable BARs.

Signed-off-by: David Woodhouse 
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Ben Hutchings 
---
 drivers/pci/proc.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -231,7 +231,7 @@ static int proc_bus_pci_mmap(struct file
 {
struct pci_dev *dev = PDE_DATA(file_inode(file));
struct pci_filp_private *fpriv = file->private_data;
-   int i, ret, write_combine, res_bit;
+   int i, ret, write_combine = 0, res_bit;
 
if (!capable(CAP_SYS_RAWIO))
return -EPERM;
@@ -251,10 +251,13 @@ static int proc_bus_pci_mmap(struct file
if (i >= PCI_ROM_RESOURCE)
return -ENODEV;
 
-   if (fpriv->mmap_state == pci_mmap_mem)
-   write_combine = fpriv->write_combine;
-   else
-   write_combine = 0;
+   if (fpriv->mmap_state == pci_mmap_mem &&
+   fpriv->write_combine) {
+   if (dev->resource[i].flags & IORESOURCE_PREFETCH)
+   write_combine = 1;
+   else
+   return -EINVAL;
+   }
ret = pci_mmap_page_range(dev, vma,
  fpriv->mmap_state, write_combine);
if (ret < 0)
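
Taken together, this patch and the previous one (058) apply two checks before a mapping is allowed. The following is a minimal sketch of that decision for a single BAR, assuming made-up flag values and hypothetical names (`RES_*`, `enum map_state`, `check_mmap_request()`); the real code loops over the device's resources and also calls pci_mmap_fits().

```c
#include <errno.h>

/* Illustrative flag values; the kernel's IORESOURCE_* constants differ. */
#define RES_IO        0x1u
#define RES_MEM       0x2u
#define RES_PREFETCH  0x4u

enum map_state { MAP_IO, MAP_MEM };

/*
 * Decide whether a BAR with the given flags may be mapped.
 * Returns 0 or 1 (the write-combine setting to use) on success,
 * -ENODEV if the BAR type does not match the requested mapping type,
 * -EINVAL if write-combining is requested on a non-prefetchable BAR.
 */
static int check_mmap_request(unsigned int bar_flags, enum map_state state,
			      int want_wc)
{
	unsigned int res_bit = (state == MAP_IO) ? RES_IO : RES_MEM;

	/* Patch 058: an MMIO map must not match an I/O BAR, and vice versa. */
	if (!(bar_flags & res_bit))
		return -ENODEV;

	/* Patch 059: WC is honoured only for prefetchable memory BARs. */
	if (state == MAP_MEM && want_wc) {
		if (!(bar_flags & RES_PREFETCH))
			return -EINVAL;
		return 1;
	}
	return 0;
}
```

A write-combine request on an I/O mapping is silently ignored, as in the patched function, while the same request on a non-prefetchable memory BAR is rejected.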



Re: [PATCH v4] livepatch: introduce shadow variable API

2017-08-18 Thread Nicolai Stange
Joe Lawrence  writes:


> +
> +/**
> + * klp_shadow_get() - retrieve a shadow variable data pointer
> + * @obj: pointer to parent object
> + * @id:  data identifier
> + *
> + * Return: the shadow variable data element, NULL on failure.
> + */
> +void *klp_shadow_get(void *obj, unsigned long id)
> +{
> + struct klp_shadow *shadow;
> +
> + rcu_read_lock();
> +
> + hash_for_each_possible_rcu(klp_shadow_hash, shadow, node,
> +(unsigned long)obj) {
> +
> + if (klp_shadow_match(shadow, obj, id)) {
> + rcu_read_unlock();
> + return shadow->data;

I had to think a moment about what protects shadow from getting freed by
a concurrent detach after that rcu_read_unlock(). Then I noticed that if
obj and the livepatch are alive, then so is shadow, because there
obviously hasn't been any reason to detach it.

So maybe it would be nice to have an additional comment at
klp_shadow_detach() that it's the API user's responsibility not to use a
shadow instance after detaching it...

Thanks,

Nicolai


> + }
> + }
> +
> + rcu_read_unlock();
> +
> + return NULL;
> +}
> +EXPORT_SYMBOL_GPL(klp_shadow_get);



-- 
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)


[PATCH 3.16 055/134] [media] zr364xx: enforce minimum size when reading header

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Alyssa Milburn 

commit ee0fe833d96793853335844b6d99fb76bd12cbeb upstream.

This code copies actual_length-128 bytes from the header, which will
underflow if the received buffer is too small.

Signed-off-by: Alyssa Milburn 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Ben Hutchings 
---
 drivers/media/usb/zr364xx/zr364xx.c | 8 
 1 file changed, 8 insertions(+)

--- a/drivers/media/usb/zr364xx/zr364xx.c
+++ b/drivers/media/usb/zr364xx/zr364xx.c
@@ -605,6 +605,14 @@ static int zr364xx_read_video_callback(s
ptr = pdest = frm->lpvbits;
 
if (frm->ulState == ZR364XX_READ_IDLE) {
+   if (purb->actual_length < 128) {
+   /* header incomplete */
+   dev_info(&cam->udev->dev,
+"%s: buffer (%d bytes) too small to hold jpeg header. Discarding.\n",
+__func__, purb->actual_length);
+   return -EINVAL;
+   }
+
frm->ulState = ZR364XX_READ_FRAME;
frm->cur_size = 0;
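
The underflow described in the commit message is easy to reproduce outside the driver. The sketch below assumes the length is an unsigned 32-bit quantity; `HEADER_LEN`, `payload_len()` and `unchecked_len()` are hypothetical names used only for illustration.

```c
#define HEADER_LEN 128	/* bytes skipped at the start of the first buffer */

/*
 * Return the number of payload bytes that follow the header, or -1 if
 * the buffer is too short to contain a full header.
 */
static long payload_len(unsigned int actual_length)
{
	if (actual_length < HEADER_LEN)
		return -1;	/* header incomplete, discard */
	return (long)(actual_length - HEADER_LEN);
}

/* What happens without the check: the unsigned subtraction wraps. */
static unsigned int unchecked_len(unsigned int actual_length)
{
	return actual_length - HEADER_LEN;
}
```

With a 100-byte buffer the unchecked subtraction wraps to a value close to the maximum of the type, which is the copy length the new check prevents.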
 



[PATCH 3.16 049/134] perf/x86: Fix spurious NMI with PEBS Load Latency event

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Kan Liang 

commit fd583ad1563bec5f00140e1f2444adbcd331caad upstream.

Spurious NMIs will be observed with the following command:

  while :; do
perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp"
  -e "cpu/umask=0x03,event=0x0/"
  -e "cpu/umask=0x02,event=0x0/"
  -e cycles,branches,cache-misses
  -e cache-references -- sleep 10
  done

The bug was introduced by commit:

  8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")

That commit clears the status bits for the counters used for PEBS
events, by masking the whole 64 bits pebs_enabled. However, only the
low 32 bits of both status and pebs_enabled are reserved for PEBS-able
counters.

For status, bits 32-34 are the fixed counter overflow bits. For
pebs_enabled, bits 32-34 are for PEBS Load Latency.

In the test case, the PEBS Load Latency event and a fixed counter event
could overflow at the same time. The fixed counter overflow bit would then
be cleared by mistake. Once it is cleared, the fixed counter overflow is
never processed, which finally triggers a spurious NMI.

Correct the PEBS enabled mask by ignoring the non-PEBS bits.

Signed-off-by: Kan Liang 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
Link: http://lkml.kernel.org/r/1491333246-3965-1-git-send-email-kan.li...@intel.com
Signed-off-by: Ingo Molnar 
[bwh: Backported to 3.16:
 - Drop change in get_next_pebs_record_by_bit()
 - Adjust filenames]
Signed-off-by: Ben Hutchings 
---
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1421,7 +1421,7 @@ again:
 * counters from the GLOBAL_STATUS mask and we always process PEBS
 * events via drain_pebs().
 */
-   status &= ~cpuc->pebs_enabled;
+   status &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
 
/*
 * PEBS overflow sets bit 62 in the global status register
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -79,6 +79,7 @@ struct amd_nb {
 
 /* The maximal number of PEBS events: */
 #define MAX_PEBS_EVENTS8
+#define PEBS_COUNTER_MASK  ((1ULL << MAX_PEBS_EVENTS) - 1)
 
 /*
  * A debug store configuration.
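
The effect of the mask can be checked in isolation. This is a user-space sketch, not the kernel code; `clear_status_old()` and `clear_status_fixed()` are hypothetical names, while the two macros are copied from the patch.

```c
#include <stdint.h>

#define MAX_PEBS_EVENTS   8
#define PEBS_COUNTER_MASK ((1ULL << MAX_PEBS_EVENTS) - 1)

/* Old behaviour: every bit set in pebs_enabled is cleared from status. */
static uint64_t clear_status_old(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~pebs_enabled;
}

/* Fixed behaviour: only the PEBS counter bits are cleared. */
static uint64_t clear_status_fixed(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~(pebs_enabled & PEBS_COUNTER_MASK);
}
```

With bit 0 and bit 32 set in both words (counter 0 with Load Latency enabled, plus a fixed counter 0 overflow), the old expression drops the fixed counter overflow bit and the fixed one keeps it.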



[PATCH 3.16 040/134] perf inject: Don't proceed if perf_session__process_event() fails

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: David Carrillo-Cisneros 

commit bb8d521f77f3e68a713456b7fb1e99f52ff3342c upstream.

All paths following perf_session__process_event() in __cmd_inject() are
useless if __cmd_inject() is to fail, some depend on a correct
session->evlist.

The first commit to add code that depends on session->evlist without
checking for errors was commit e558a5bd8b ("perf inject: Work with files").
It has grown since then.

Change __cmd_inject() to fail immediately after
perf_session__process_event() fails.

Signed-off-by: David Carrillo-Cisneros 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Andrew Vagin 
Cc: He Kuang 
Cc: Masami Hiramatsu 
Cc: Paul Turner 
Cc: Peter Zijlstra 
Cc: Simon Que 
Cc: Stephane Eranian 
Cc: Wang Nan 
Fixes: e558a5bd8b74 ("perf inject: Work with files")
Link: http://lkml.kernel.org/r/20170410201432.24807-2-davi...@google.com
Signed-off-by: Arnaldo Carvalho de Melo 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings 
---
 tools/perf/builtin-inject.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -387,6 +387,8 @@ static int __cmd_inject(struct perf_inje
lseek(file_out->fd, session->header.data_offset, SEEK_SET);
 
ret = perf_session__process_events(session, &inject->tool);
+   if (ret)
+   return ret;
 
if (!file_out->is_pipe) {
session->header.data_size = inject->bytes_written;



[PATCH 3.16 056/134] regulator: tps65023: Fix inverted core enable logic.

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Richard Cochran 

commit c90722b54a4f5e21ac59301ed9a6dbaa439bdb16 upstream.

Commit 43530b69d758328d3ffe6ab98fd640463e8e3667 ("regulator: Use
regmap_read/write(), regmap_update_bits functions directly") intended
to replace working inline helper functions with standard regmap
calls.  However, it also inverted the set/clear logic of the "CORE ADJ
Allowed" bit.  That patch was clearly never tested, since without that
bit cleared, the core VDCDC1 voltage output does not react to I2C
configuration changes.

This patch fixes the issue by clearing the bit as in the original,
correct implementation.  Note for stable back porting that, due to
subsequent driver churn, this patch will not apply on every kernel
version.

Fixes: 43530b69d758 ("regulator: Use regmap_read/write(), regmap_update_bits functions directly")
Signed-off-by: Richard Cochran 
Signed-off-by: Mark Brown 
Signed-off-by: Ben Hutchings 
---
 drivers/regulator/tps65023-regulator.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/regulator/tps65023-regulator.c
+++ b/drivers/regulator/tps65023-regulator.c
@@ -293,8 +293,7 @@ static int tps_65023_probe(struct i2c_cl
 
/* Enable setting output voltage by I2C */
regmap_update_bits(tps->regmap, TPS65023_REG_CON_CTRL2,
-   TPS65023_REG_CTRL2_CORE_ADJ,
-   TPS65023_REG_CTRL2_CORE_ADJ);
+  TPS65023_REG_CTRL2_CORE_ADJ, 0);
 
return 0;
 }
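
The inversion is easier to see with the read-modify-write spelled out. The sketch below models what regmap_update_bits() does to the register value, `(old & ~mask) | (val & mask)`, in plain C; `update_bits()` is a hypothetical helper and `CORE_ADJ_BIT` an illustrative bit value, not the driver's definition.

```c
#include <stdint.h>

/* Illustrative bit position; see the driver for the real definition. */
#define CORE_ADJ_BIT 0x40u

/* The read-modify-write that an update-bits helper performs. */
static uint8_t update_bits(uint8_t old, uint8_t mask, uint8_t val)
{
	return (uint8_t)((old & ~mask) | (val & mask));
}
```

Passing the bit as both mask and value sets it, which is what the regressed code did; passing the bit as mask with a value of 0 clears it, which is what the fix restores.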



[PATCH 3.16 051/134] iio: proximity: as3935: fix as3935_write

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Matt Ranostay 

commit 84ca8e364acb26aba3292bc113ca8ed4335380fd upstream.

AS3935_WRITE_DATA macro bit is incorrect and the actual write
sequence is two leading zeros.

Cc: George McCollister 
Signed-off-by: Matt Ranostay 
Signed-off-by: Jonathan Cameron 
Signed-off-by: Ben Hutchings 
---
 drivers/iio/proximity/as3935.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/iio/proximity/as3935.c
+++ b/drivers/iio/proximity/as3935.c
@@ -50,7 +50,6 @@
 #define AS3935_TUNE_CAP0x08
 #define AS3935_CALIBRATE   0x3D
 
-#define AS3935_WRITE_DATA  BIT(15)
 #define AS3935_READ_DATA   BIT(14)
 #define AS3935_ADDRESS(x)  ((x) << 8)
 
@@ -105,7 +104,7 @@ static int as3935_write(struct as3935_st
 {
u8 *buf = st->buf;
 
-   buf[0] = (AS3935_WRITE_DATA | AS3935_ADDRESS(reg)) >> 8;
+   buf[0] = AS3935_ADDRESS(reg) >> 8;
buf[1] = val;
 
return spi_write(st->spi, buf, 2);
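
The difference in the first byte sent can be worked out by hand. In this sketch `write_cmd_old()`, `write_cmd_new()` and `OLD_WRITE_DATA` are hypothetical names; `BIT()` and `AS3935_ADDRESS()` follow the definitions shown in the diff.

```c
#include <stdint.h>

#define BIT(n)            (1u << (n))
#define AS3935_ADDRESS(x) ((x) << 8)
#define OLD_WRITE_DATA    BIT(15)	/* the definition the patch removes */

/* First byte of a write as built before the patch. */
static uint8_t write_cmd_old(unsigned int reg)
{
	return (uint8_t)((OLD_WRITE_DATA | AS3935_ADDRESS(reg)) >> 8);
}

/* First byte of a write as built after the patch. */
static uint8_t write_cmd_new(unsigned int reg)
{
	return (uint8_t)(AS3935_ADDRESS(reg) >> 8);
}
```

For register 0x08 the old code sends 0x88 and the new code sends 0x08, so the two leading bits are zero as the commit message requires.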



Re: [PATCH v4] livepatch: introduce shadow variable API

2017-08-18 Thread Joe Lawrence
On 08/17/2017 10:05 AM, Petr Mladek wrote:
> On Mon 2017-08-14 16:02:43, Joe Lawrence wrote:
>> [ ... snip ... ]
>> diff --git a/samples/livepatch/livepatch-shadow-fix1.c b/samples/livepatch/livepatch-shadow-fix1.c
>> new file mode 100644
>> index ..5acc838463d1
>> --- /dev/null
>> +++ b/samples/livepatch/livepatch-shadow-fix1.c
>> +void livepatch_fix1_dummy_free(struct dummy *d)
>> +{
>> +void **shadow_leak;
>> +
>> +/*
>> + * Patch: fetch the saved SV_LEAK shadow variable, detach and
>> + * free it.  Note: handle cases where this shadow variable does
>> + * not exist (ie, dummy structures allocated before this livepatch
>> + * was loaded.)
>> + */
>> +shadow_leak = klp_shadow_get(d, SV_LEAK);
>> +if (shadow_leak) {
>> +klp_shadow_detach(d, SV_LEAK);
>> +kfree(*shadow_leak);
> 
> This should get removed. The buffer used for the shadow variable
> is freed by kfree_rcu() called from klp_shadow_detach().
> 
> Same problem is also in the other livepatch.
> 
>> +pr_info("%s: dummy @ %p, prevented leak @ %p\n",
>> + __func__, d, *shadow_leak);
> 
> This might access shadow_leak after it was (double) freed.
> 
>> +} else {
>> +pr_info("%s: dummy @ %p leaked!\n", __func__, d);
>> +}
>> +
>> +kfree(d);
>> +}

Hi Petr,

I think you're half correct.

The kfree is the crux of the memory leak patch, so it needs to stay.
However, the shadow variable is holding a copy of the pointer to the
memory leak area, so you're right that it can't be safely dereferenced
after the shadow variable is detached*.

The code should be rearranged like:

void livepatch_fix1_dummy_free(struct dummy *d)
{
void **p_shadow_leak, *shadow_leak;

p_shadow_leak = klp_shadow_get(d, SV_LEAK);
if (p_shadow_leak) {
shadow_leak = *p_shadow_leak;   << deref before detach
klp_shadow_detach(d, SV_LEAK);
kfree(shadow_leak);
...

* Aside: I usually develop with slub_debug=FZPU set to catch silly
use-after-frees like this.  However, since the shadow variable is
released via kfree_rcu(), I think there was a window before the grace
period where this one worked out okay...  once I added a
synchronize_rcu() call in between the klp_shadow_detach() and kfree()
calls, I did see the poison pattern.  This is my first time using
kfree_rcu(), so it was interesting to dig into.

Thanks,

-- Joe



Re: [PATCH v3 0/2] cpuset: Allow v2 behavior in v1 cgroup

2017-08-18 Thread Waiman Long
On 08/18/2017 07:29 AM, Tejun Heo wrote:
> On Thu, Aug 17, 2017 at 03:33:08PM -0400, Waiman Long wrote:
>> v2->v3:
>>  - Change the generic CGRP_ROOT_V2_MODE flag to a cpuset specific
>>CGRP_ROOT_CPUSET_V2_MODE flag.
>>
>> v1->v2:
>>  - Drop the kernel command line option and use cgroupfs mount option
>>instead to enable v2 controller behavior in v1 cgroup.
>>
>> v1: https://lkml.org/lkml/2017/8/15/570
> Waiman, patches look good to me but can you please take care of the
> kbuild warnings on 32bit and maybe merge the two patches into one?

The kbuild warning is in the __init code of the v1 patch for checking
the boot time argument. The __init code was gone in the v3 patch. So if
there is no further kbuild warning after a couple of days, I think it
should be OK.

Cheers,
Longman


[PATCH 3.16 032/134] MIPS: Loongson-3: Select MIPS_L1_CACHE_SHIFT_6

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Huacai Chen 

commit 17c99d9421695a0e0de18bf1e7091d859e20ec1d upstream.

Some newer Loongson-3 have 64 bytes cache lines, so select
MIPS_L1_CACHE_SHIFT_6.

Signed-off-by: Huacai Chen 
Cc: John Crispin 
Cc: Steven J . Hill 
Cc: Fuxin Zhang 
Cc: Zhangjin Wu 
Cc: linux-m...@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15755/
Signed-off-by: Ralf Baechle 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings 
---
 arch/mips/Kconfig | 1 +
 1 file changed, 1 insertion(+)

--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1193,6 +1193,7 @@ config CPU_LOONGSON3
select CPU_SUPPORTS_HUGEPAGES
select WEAK_ORDERING
select WEAK_REORDERING_BEYOND_LLSC
+   select MIPS_L1_CACHE_SHIFT_6
help
The Loongson 3 processor implements the MIPS64R2 instruction
set with many extensions.



Re: [PATCH v14 4/5] mm: support reporting free page blocks

2017-08-18 Thread Michal Hocko
On Thu 17-08-17 11:26:55, Wei Wang wrote:
> This patch adds support to walk through the free page blocks in the
> system and report them via a callback function. Some page blocks may
> leave the free list after zone->lock is released, so it is the caller's
> responsibility to either detect or prevent the use of such pages.

This could use more details, to be honest, especially the use case you are
going to use this for. This will help us to understand the motivation
in future, when the current user might be gone and new ones have largely
diverged into a different usage. This wouldn't be the first time I have seen
something like that.

> Signed-off-by: Wei Wang 
> Signed-off-by: Liang Li 
> Cc: Michal Hocko 
> Cc: Michael S. Tsirkin 
> ---
>  include/linux/mm.h |  6 ++
>  mm/page_alloc.c| 44 
>  2 files changed, 50 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 46b9ac5..cd29b9f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1835,6 +1835,12 @@ extern void free_area_init_node(int nid, unsigned long * zones_size,
>   unsigned long zone_start_pfn, unsigned long *zholes_size);
>  extern void free_initmem(void);
>  
> +extern void walk_free_mem_block(void *opaque1,
> + unsigned int min_order,
> + void (*visit)(void *opaque2,
> +   unsigned long pfn,
> +   unsigned long nr_pages));
> +
>  /*
>   * Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)
>   * into the buddy system. The freed pages will be poisoned with pattern
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6d00f74..a721a35 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4762,6 +4762,50 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
>   show_swap_cache_info();
>  }
>  
> +/**
> + * walk_free_mem_block - Walk through the free page blocks in the system
> + * @opaque1: the context passed from the caller
> + * @min_order: the minimum order of free lists to check
> + * @visit: the callback function given by the caller

The original suggestion for using visit was motivated by a visit design
pattern but I can see how this can be confusing. Maybe a more explicit
name would be better. What about report_free_range?

> + *
> + * The function is used to walk through the free page blocks in the system,
> + * and each free page block is reported to the caller via the @visit callback.
> + * Please note:
> + * 1) The function is used to report hints of free pages, so the caller should
> + * not use those reported pages after the callback returns.
> + * 2) The callback is invoked with the zone->lock being held, so it should not
> + * block and should finish as soon as possible.

I think that the explicit note about zone->lock is not really needed. This
can change in the future, and I would even bet that somebody might rely on
the lock being held for some purpose and silently get broken by the
change. Instead I would much rather see something like the following:
"
Please note that there are no locking guarantees for the callback and
that the reported pfn range might be freed or disappear after the
callback returns so the caller has to be very careful how it is used.

The callback itself must not sleep or perform any operations which would
require any memory allocations directly (not even GFP_NOWAIT/GFP_ATOMIC)
or via any lock dependency. It is generally advisable to implement
the callback as simple as possible and defer any heavy lifting to a
different context.

There is no guarantee that each free range will be reported only once
during one walk_free_mem_block invocation.

pfn_to_page on the given range is strongly discouraged and if there is
an absolute need for that make sure to contact MM people to discuss
potential problems.

The function itself might sleep so it cannot be called from atomic
contexts.

In general low orders tend to be very volatile, so it makes more
sense to query larger ones for optimizations such as ballooning.
This will reduce the overhead as well.
"

> + */
> +void walk_free_mem_block(void *opaque1,
> +  unsigned int min_order,

make the order int and...
> +  void (*visit)(void *opaque2,
> +unsigned long pfn,
> +unsigned long nr_pages))
> +{
> + struct zone *zone;
> + struct page *page;
> + struct list_head *list;
> + unsigned int order;
> + enum migratetype mt;
> + unsigned long pfn, flags;
> +
> + for_each_populated_zone(zone) {
> + for (order = MAX_ORDER - 1;
> +  order < MAX_ORDER && order >= min_order; order--) {

you will not need the underflow check which is just ugly
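
To illustrate the point, here is a minimal user-space sketch. The MAX_ORDER
value and both helper names are assumptions made up for this example, not
code from the patch. With an unsigned counter, order-- wraps around at 0 and
the extra "order < MAX_ORDER" guard is what ends the loop; with a signed
counter the ">= min_order" test is enough.

```c
#include <assert.h>

#define MAX_ORDER 11	/* stand-in value for illustration only */

/*
 * Unsigned counter: when min_order is 0, order-- wraps to UINT_MAX,
 * so the "order < MAX_ORDER" test is what terminates the loop.
 */
int count_orders_unsigned(unsigned int min_order)
{
	unsigned int order;
	int visited = 0;

	for (order = MAX_ORDER - 1;
	     order < MAX_ORDER && order >= min_order; order--)
		visited++;

	return visited;
}

/* Signed counter: order becomes -1 and the ">=" test fails on its own. */
int count_orders_signed(int min_order)
{
	int order;
	int visited = 0;

	for (order = MAX_ORDER - 1; order >= min_order; order--)
		visited++;

	return visited;
}
```

Both variants visit the same orders; the signed one just needs a single
loop condition.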

> + for (mt = 0; mt < MIGRATE_TYPES; mt++) {
> +

[PATCH 3.16 054/134] [media] digitv: limit messages to buffer size

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Alyssa Milburn 

commit 821117dc21083a99dd99174c10848d70ff43de29 upstream.

Return an error rather than memcpy()ing beyond the end of the buffer.
Internal callers use appropriate sizes, but digitv_i2c_xfer may not.

Signed-off-by: Alyssa Milburn 
Signed-off-by: Mauro Carvalho Chehab 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings 
---
--- a/drivers/media/usb/dvb-usb/digitv.c
+++ b/drivers/media/usb/dvb-usb/digitv.c
@@ -30,6 +30,10 @@ static int digitv_ctrl_msg(struct dvb_us
 {
int wo = (rbuf == NULL || rlen == 0); /* write-only */
u8 sndbuf[7],rcvbuf[7];
+
+   if (wlen > 4 || rlen > 4)
+   return -EIO;
+
memset(sndbuf,0,7); memset(rcvbuf,0,7);
 
sndbuf[0] = cmd;
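
As a rough user-space illustration of the check added above, the sketch below
rejects a payload that would not fit before copying it. The 3-byte header
size, the buffer layout and the build_msg() name are assumptions for this
example only; the patch itself just limits wlen and rlen to 4 for the 7-byte
buffers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MSG_BUF_LEN 7	/* same size as sndbuf[7]/rcvbuf[7] in the patch */
#define MSG_HDR_LEN 3	/* assumed header size, giving a 4-byte payload */

/*
 * Hypothetical helper: build a message in out[MSG_BUF_LEN], refusing
 * payloads that would run past the end of the buffer.
 */
int build_msg(unsigned char *out, unsigned char cmd,
	      const unsigned char *payload, size_t len)
{
	if (len > MSG_BUF_LEN - MSG_HDR_LEN)
		return -EIO;	/* reject instead of overflowing */

	memset(out, 0, MSG_BUF_LEN);
	out[0] = cmd;
	out[2] = (unsigned char)len;
	if (len)
		memcpy(&out[MSG_HDR_LEN], payload, len);

	return 0;
}
```

The length test sits before any write to the buffer, so an oversized request
leaves the buffer untouched.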



[PATCH 3.16 047/134] ARM: dts: at91: sama5d3_xplained: not all ADC channels are available

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Ludovic Desroches 

commit d3df1ec06353e51fc44563d2e7e18d42811af290 upstream.

Remove ADC channels that are not available by default on the sama5d3_xplained
board (resistor not populated) in order to not create confusion.

Signed-off-by: Ludovic Desroches 
Acked-by: Nicolas Ferre 
Signed-off-by: Alexandre Belloni 
Signed-off-by: Ben Hutchings 
---
 arch/arm/boot/dts/at91-sama5d3_xplained.dts | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/arch/arm/boot/dts/at91-sama5d3_xplained.dts
+++ b/arch/arm/boot/dts/at91-sama5d3_xplained.dts
@@ -142,9 +142,9 @@
 
adc0: adc@f8018000 {
atmel,adc-vref = <3300>;
+   atmel,adc-channels-used = <0xfe>;
pinctrl-0 = <
&pinctrl_adc0_adtrg
-   &pinctrl_adc0_ad0
&pinctrl_adc0_ad1
&pinctrl_adc0_ad2
&pinctrl_adc0_ad3
@@ -152,8 +152,6 @@
&pinctrl_adc0_ad5
&pinctrl_adc0_ad6
&pinctrl_adc0_ad7
-   &pinctrl_adc0_ad8
-   &pinctrl_adc0_ad9
>;
status = "okay";
};



[PATCH 3.16 046/134] ARM: dts: at91: sama5d3_xplained: fix ADC vref

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Ludovic Desroches 

commit 9cdd31e5913c1f86dce7e201b086155b3f24896b upstream.

The voltage reference for the ADC is not 3V but 3.3V since it is connected to
VDDANA.

Signed-off-by: Ludovic Desroches 
Acked-by: Nicolas Ferre 
Signed-off-by: Alexandre Belloni 
Signed-off-by: Ben Hutchings 
---
 arch/arm/boot/dts/at91-sama5d3_xplained.dts | 1 +
 1 file changed, 1 insertion(+)

--- a/arch/arm/boot/dts/at91-sama5d3_xplained.dts
+++ b/arch/arm/boot/dts/at91-sama5d3_xplained.dts
@@ -141,6 +141,7 @@
};
 
adc0: adc@f8018000 {
+   atmel,adc-vref = <3300>;
pinctrl-0 = <
&pinctrl_adc0_adtrg
&pinctrl_adc0_ad0



[PATCH 3.16 050/134] iio: dac: ad7303: fix channel description

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Pavel Roskin 

commit ce420fd4251809b4c3119b3b20c8b13bd8eba150 upstream.

realbits, storagebits and shift should be numbers, not ASCII characters.

Signed-off-by: Pavel Roskin 
Reviewed-by: Lars-Peter Clausen 
Signed-off-by: Jonathan Cameron 
Signed-off-by: Ben Hutchings 
---
 drivers/iio/dac/ad7303.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/iio/dac/ad7303.c
+++ b/drivers/iio/dac/ad7303.c
@@ -184,9 +184,9 @@ static const struct iio_chan_spec_ext_in
.address = (chan),  \
.scan_type = {  \
.sign = 'u',\
-   .realbits = '8',\
-   .storagebits = '8', \
-   .shift = '0',   \
+   .realbits = 8,  \
+   .storagebits = 8,   \
+   .shift = 0, \
},  \
.ext_info = ad7303_ext_info,\
 }
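
A small sketch of why the quoted values were wrong, assuming an ASCII
execution character set. The struct below is a simplified stand-in invented
for this example, not the real iio scan_type definition. The character
literal '8' has the value 56 and '0' has the value 48, so the old
initializers described a 56-bit sample shifted by 48 bits.

```c
#include <assert.h>

/* Simplified stand-in for the scan_type fields touched by the patch. */
struct scan_type_sketch {
	char sign;
	unsigned char realbits;
	unsigned char storagebits;
	unsigned char shift;
};

/* Before the fix: character literals, i.e. ASCII codes. */
const struct scan_type_sketch scan_before = {
	.sign = 'u', .realbits = '8', .storagebits = '8', .shift = '0',
};

/* After the fix: plain numbers. */
const struct scan_type_sketch scan_after = {
	.sign = 'u', .realbits = 8, .storagebits = 8, .shift = 0,
};

/* Bytes of storage implied by a description. */
unsigned int storage_bytes(const struct scan_type_sketch *st)
{
	return st->storagebits / 8;
}
```

Only .sign is meant to hold a character; the other three fields are counts.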



[PATCH 3.16 044/134] vfio/type1: Remove locked page accounting workqueue

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Alex Williamson 

commit 0cfef2b7410b64d7a430947e0b533314c4f97153 upstream.

If the mmap_sem is contended then the vfio type1 IOMMU backend will
defer locked page accounting updates to a workqueue task.  This has a
few problems and depending on which side the user tries to play, they
might be over-penalized for unmaps that haven't yet been accounted or
race the workqueue to enter more mappings than they're allowed.  The
original intent of this workqueue mechanism seems to be focused on
reducing latency through the ioctl, but we cannot do so at the cost
of correctness.  Remove this workqueue mechanism and update the
callers to allow for failure.  We can also now recheck the limit under
write lock to make sure we don't exceed it.

vfio_pin_pages_remote() also now necessarily includes an unwind path
which we can jump to directly if the consecutive page pinning finds
that we're exceeding the user's memory limits.  This avoids the
current lazy approach which does accounting and mapping up to the
fault, only to return an error on the next iteration to unwind the
entire vfio_dma.

Reviewed-by: Peter Xu 
Reviewed-by: Kirti Wankhede 
Signed-off-by: Alex Williamson 
[bwh: Backported to 3.16:
 - vfio_lock_acct() always operates on current->mm
 - Drop changes to vfio_{,un}pin_page_external() and
   vfio_iommu_unmap_unpin_reaccount()
 - Drop test of rsvd flag
 - Fix up the disable_hugepages case in vfio_pin_pages()
 - Use down_write() instead of down_write_killable()
 - Adjust context]
Signed-off-by: Ben Hutchings 
---
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -128,57 +128,37 @@ static void vfio_unlink_dma(struct vfio_
rb_erase(&old->node, &iommu->dma_list);
 }
 
-struct vwork {
-   struct mm_struct *mm;
-   long npage;
-   struct work_struct work;
-};
-
-/* delayed decrement/increment for locked_vm */
-static void vfio_lock_acct_bg(struct work_struct *work)
+static int vfio_lock_acct(long npage, bool *lock_cap)
 {
-   struct vwork *vwork = container_of(work, struct vwork, work);
struct mm_struct *mm;
+   int ret;
 
-   mm = vwork->mm;
-   down_write(&mm->mmap_sem);
-   mm->locked_vm += vwork->npage;
-   up_write(&mm->mmap_sem);
-   mmput(mm);
-   kfree(vwork);
-}
+   if (!npage)
+   return 0;
 
-static void vfio_lock_acct(long npage)
-{
-   struct vwork *vwork;
-   struct mm_struct *mm;
+   mm = current->mm;
+   if (!mm)
+   return -ESRCH; /* process exited */
 
-   if (!current->mm || !npage)
-   return; /* process exited or nothing to do */
+   ret = 0;
+   down_write(&mm->mmap_sem);
+   if (npage > 0) {
+   if (lock_cap ? !*lock_cap : !capable(CAP_IPC_LOCK)) {
+   unsigned long limit;
 
-   if (down_write_trylock(&current->mm->mmap_sem)) {
-   current->mm->locked_vm += npage;
-   up_write(&current->mm->mmap_sem);
-   return;
-   }
+   limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 
-   /*
-* Couldn't get mmap_sem lock, so must setup to update
-* mm->locked_vm later. If locked_vm were atomic, we
-* wouldn't need this silliness
-*/
-   vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
-   if (!vwork)
-   return;
-   mm = get_task_mm(current);
-   if (!mm) {
-   kfree(vwork);
-   return;
+   if (mm->locked_vm + npage > limit)
+   ret = -ENOMEM;
+   }
}
-   INIT_WORK(&vwork->work, vfio_lock_acct_bg);
-   vwork->mm = mm;
-   vwork->npage = npage;
-   schedule_work(&vwork->work);
+
+   if (!ret)
+   mm->locked_vm += npage;
+
+   up_write(&mm->mmap_sem);
+
+   return ret;
 }
 
 /*
@@ -260,7 +240,7 @@ static int vaddr_get_pfn(unsigned long v
 static long vfio_pin_pages(unsigned long vaddr, long npage,
   int prot, unsigned long *pfn_base)
 {
-   unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+   unsigned long pfn = 0, limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
bool lock_cap = capable(CAP_IPC_LOCK);
long ret, i;
 
@@ -282,14 +262,13 @@ static long vfio_pin_pages(unsigned long
}
 
if (unlikely(disable_hugepages)) {
-   vfio_lock_acct(1);
-   return 1;
+   ret = vfio_lock_acct(1, &lock_cap);
+   i = 1;
+   goto unpin_out;
}
 
/* Lock all the consecutive pages from pfn_base */
for (i = 1, vaddr += PAGE_SIZE; i < npage; i++, vaddr += PAGE_SIZE) {
-   unsigned long pfn = 0;
-
ret = vaddr_get_pfn(vaddr, prot, &pfn);
if (ret)
bre
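
The accounting rule introduced by vfio_lock_acct() above can be sketched in
user space as follows. This is a simplified model: struct mm_sketch and
lock_acct() are names invented for this example, the limit is passed in
rather than read from RLIMIT_MEMLOCK, and the mmap_sem write lock that the
real code holds around the check and update is only indicated by a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the one mm_struct field the accounting touches. */
struct mm_sketch {
	unsigned long locked_vm;
};

/*
 * Hypothetical model of the synchronous accounting: check the limit and
 * update the counter in one step, and report failure to the caller.
 */
int lock_acct(struct mm_sketch *mm, long npage, bool lock_cap,
	      unsigned long limit)
{
	if (!npage)
		return 0;
	if (!mm)
		return -ESRCH;	/* process exited */

	/* The kernel code takes mmap_sem for write around this section. */
	if (npage > 0 && !lock_cap && mm->locked_vm + npage > limit)
		return -ENOMEM;

	mm->locked_vm += npage;
	return 0;
}
```

Because the check and the update happen together, a caller cannot race past
the limit, and releases (negative npage) are never refused.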

[PATCH 3.16 041/134] serial: omap: fix runtime-pm handling on unbind

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit 099bd73dc17ed77aa8c98323e043613b6e8f54fc upstream.

An unbalanced and misplaced synchronous put was used to suspend the
device on driver unbind, something which with a likewise misplaced
pm_runtime_disable leads to external aborts when an open port is being
removed.

Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa024010
...
[] (serial_omap_set_mctrl) from [] (uart_update_mctrl+0x50/0x60)
[] (uart_update_mctrl) from [] (uart_shutdown+0xbc/0x138)
[] (uart_shutdown) from [] (uart_hangup+0x94/0x190)
[] (uart_hangup) from [] (__tty_hangup+0x404/0x41c)
[] (__tty_hangup) from [] (tty_vhangup+0x1c/0x20)
[] (tty_vhangup) from [] (uart_remove_one_port+0xec/0x260)
[] (uart_remove_one_port) from [] (serial_omap_remove+0x40/0x60)
[] (serial_omap_remove) from [] (platform_drv_remove+0x34/0x4c)

Fix this up by resuming the device before deregistering the port and by
suspending and disabling runtime pm only after the port has been
removed.

Also make sure to disable autosuspend before disabling runtime pm so
that the usage count is balanced and device actually suspended before
returning.

Note that due to a negative autosuspend delay being set in probe, the
unbalanced put would actually suspend the device on first driver unbind,
while rebinding and again unbinding would result in a negative
power.usage_count.

Fixes: 7e9c8e7dbf3b ("serial: omap: make sure to suspend device before remove")
Cc: Felipe Balbi 
Cc: Santosh Shilimkar 
Signed-off-by: Johan Hovold 
Acked-by: Tony Lindgren 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Ben Hutchings 
---
 drivers/tty/serial/omap-serial.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/drivers/tty/serial/omap-serial.c
+++ b/drivers/tty/serial/omap-serial.c
@@ -1754,9 +1754,13 @@ static int serial_omap_remove(struct pla
 {
struct uart_omap_port *up = platform_get_drvdata(dev);
 
+   pm_runtime_get_sync(up->dev);
+
+   uart_remove_one_port(&serial_omap_reg, &up->port);
+
+   pm_runtime_dont_use_autosuspend(up->dev);
pm_runtime_put_sync(up->dev);
pm_runtime_disable(up->dev);
-   uart_remove_one_port(&serial_omap_reg, &up->port);
pm_qos_remove_request(&up->pm_qos_request);
device_init_wakeup(&dev->dev, false);
 



[PATCH 3.16 045/134] power: supply: lp8788: prevent out of bounds array access

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Giedrius Statkevičius
 

commit bdd9968d35f7fcdb76089347d1529bf079534214 upstream.

val might become 7 in which case stime[7] (array of length 7) would be
accessed during the scnprintf call later and that will cause issues.
Obviously, string concatenation is not intended here so just a comma needs
to be added to fix the issue.

Fixes: 98a276649358 ("power_supply: Add new lp8788 charger driver")
Signed-off-by: Giedrius Statkevičius 
Acked-by: Milo Kim 
Signed-off-by: Sebastian Reichel 
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings 
---
 drivers/power/lp8788-charger.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/power/lp8788-charger.c
+++ b/drivers/power/lp8788-charger.c
@@ -642,7 +642,7 @@ static ssize_t lp8788_show_eoc_time(stru
 {
struct lp8788_charger *pchg = dev_get_drvdata(dev);
char *stime[] = { "400ms", "5min", "10min", "15min",
-   "20min", "25min", "30min" "No timeout" };
+   "20min", "25min", "30min", "No timeout" };
u8 val;
 
lp8788_read_byte(pchg->lp, LP8788_CHG_EOC, &val);
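
A minimal user-space sketch of the bug fixed above. The array contents are
copied from the patch; the helper names are invented for this example. In C,
adjacent string literals are concatenated, so the missing comma silently
merges the last two entries and leaves the array one element short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Missing comma: "30min" "No timeout" becomes one literal. */
const char *const stime_buggy[] = { "400ms", "5min", "10min", "15min",
				    "20min", "25min", "30min" "No timeout" };

/* With the comma the table has the intended eight entries. */
const char *const stime_fixed[] = { "400ms", "5min", "10min", "15min",
				    "20min", "25min", "30min", "No timeout" };

size_t buggy_count(void)
{
	return sizeof(stime_buggy) / sizeof(stime_buggy[0]);
}

size_t fixed_count(void)
{
	return sizeof(stime_fixed) / sizeof(stime_fixed[0]);
}

/* Last valid entry of the buggy table. */
const char *buggy_last(void)
{
	return stime_buggy[buggy_count() - 1];
}
```

With the buggy table, index 7 is one past the end, which is the out-of-bounds
read the commit message describes.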



[PATCH 3.16 042/134] serial: omap: suspend device on probe errors

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit 77e6fe7fd2b7cba0bf2f2dc8cde51d7b9a35bf74 upstream.

Make sure to actually suspend the device before returning after a failed
(or deferred) probe.

Note that autosuspend must be disabled before runtime pm is disabled in
order to balance the usage count due to a negative autosuspend delay as
well as to make the final put suspend the device synchronously.

Fixes: 388bc2622680 ("omap-serial: Fix the error handling in the omap_serial probe")
Cc: Shubhrajyoti D 
Signed-off-by: Johan Hovold 
Acked-by: Tony Lindgren 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Ben Hutchings 
---
 drivers/tty/serial/omap-serial.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/tty/serial/omap-serial.c
+++ b/drivers/tty/serial/omap-serial.c
@@ -1741,7 +1741,8 @@ static int serial_omap_probe(struct plat
return 0;
 
 err_add_port:
-   pm_runtime_put(&pdev->dev);
+   pm_runtime_dont_use_autosuspend(&pdev->dev);
+   pm_runtime_put_sync(&pdev->dev);
pm_runtime_disable(&pdev->dev);
 err_rs485:
 err_port_line:



[PATCH 3.16 043/134] PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platforms

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: David Woodhouse 

commit 6bccc7f426abd640f08d8c75fb22f99483f201b4 upstream.

In the PCI_MMAP_PROCFS case when the address being passed by the user is a
'user visible' resource address based on the bus window, and not the actual
contents of the resource, that's what we need to be checking it against.

Signed-off-by: David Woodhouse 
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Ben Hutchings 
---
 drivers/pci/pci-sysfs.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -965,15 +965,19 @@ void pci_remove_legacy_files(struct pci_
 int pci_mmap_fits(struct pci_dev *pdev, int resno, struct vm_area_struct *vma,
  enum pci_mmap_api mmap_api)
 {
-   unsigned long nr, start, size, pci_start;
+   unsigned long nr, start, size;
+   resource_size_t pci_start = 0, pci_end;
 
if (pci_resource_len(pdev, resno) == 0)
return 0;
nr = vma_pages(vma);
start = vma->vm_pgoff;
size = ((pci_resource_len(pdev, resno) - 1) >> PAGE_SHIFT) + 1;
-   pci_start = (mmap_api == PCI_MMAP_PROCFS) ?
-   pci_resource_start(pdev, resno) >> PAGE_SHIFT : 0;
+   if (mmap_api == PCI_MMAP_PROCFS) {
+   pci_resource_to_user(pdev, resno, &pdev->resource[resno],
+&pci_start, &pci_end);
+   pci_start >>= PAGE_SHIFT;
+   }
if (start >= pci_start && start < pci_start + size &&
start + nr <= pci_start + size)
return 1;
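
The range test at the end of pci_mmap_fits() can be sketched in user space as
below. PAGE_SHIFT_SKETCH and both function names are assumptions for this
example; all quantities are in pages, as in the kernel function, and the
window start is whatever base the caller is measured against (the
user-visible start for the procfs case, 0 otherwise).

```c
#include <assert.h>

#define PAGE_SHIFT_SKETCH 12	/* assumed 4 KiB pages */

/* Number of pages needed to cover a resource of len bytes (len > 0). */
unsigned long pages_for_len(unsigned long len)
{
	return ((len - 1) >> PAGE_SHIFT_SKETCH) + 1;
}

/*
 * Does a mapping of nr pages starting at page offset start fit inside
 * the window [win_start, win_start + size)?
 */
int mmap_fits(unsigned long start, unsigned long nr,
	      unsigned long win_start, unsigned long size)
{
	return start >= win_start && start < win_start + size &&
	       start + nr <= win_start + size;
}
```

If the offset supplied by user space and win_start come from different
address spaces, a valid request fails the first comparison, which is why the
base has to match what the user passed in.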



[PATCH 3.2 57/59] timerfd: Protect the might cancel mechanism proper

2017-08-18 Thread Ben Hutchings
3.2.92-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Thomas Gleixner 

commit 1e38da300e1e395a15048b0af1e5305bd91402f6 upstream.

The handling of the might_cancel queueing is not properly protected, so
parallel operations on the file descriptor can race with each other and
lead to list corruptions or use after free.

Protect the context for these operations with a separate lock.

The wait queue lock cannot be reused for this because that would create a
lock inversion scenario vs. the cancel lock. Replacing might_cancel with an
atomic (atomic_t or atomic bit) does not help either because it still can
race vs. the actual list operation.

Reported-by: Dmitry Vyukov 
Signed-off-by: Thomas Gleixner 
Cc: "linux-fsde...@vger.kernel.org"
Cc: syzkaller 
Cc: Al Viro 
Cc: linux-fsde...@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701311521430.3457@nanos
Signed-off-by: Thomas Gleixner 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
 fs/timerfd.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

--- a/fs/timerfd.c
+++ b/fs/timerfd.c
@@ -34,6 +34,7 @@ struct timerfd_ctx {
int clockid;
struct rcu_head rcu;
struct list_head clist;
+   spinlock_t cancel_lock;
bool might_cancel;
 };
 
@@ -86,7 +87,7 @@ void timerfd_clock_was_set(void)
rcu_read_unlock();
 }
 
-static void timerfd_remove_cancel(struct timerfd_ctx *ctx)
+static void __timerfd_remove_cancel(struct timerfd_ctx *ctx)
 {
if (ctx->might_cancel) {
ctx->might_cancel = false;
@@ -96,6 +97,13 @@ static void timerfd_remove_cancel(struct
}
 }
 
+static void timerfd_remove_cancel(struct timerfd_ctx *ctx)
+{
+   spin_lock(&ctx->cancel_lock);
+   __timerfd_remove_cancel(ctx);
+   spin_unlock(&ctx->cancel_lock);
+}
+
 static bool timerfd_canceled(struct timerfd_ctx *ctx)
 {
if (!ctx->might_cancel || ctx->moffs.tv64 != KTIME_MAX)
@@ -106,6 +114,7 @@ static bool timerfd_canceled(struct time
 
 static void timerfd_setup_cancel(struct timerfd_ctx *ctx, int flags)
 {
+   spin_lock(&ctx->cancel_lock);
if (ctx->clockid == CLOCK_REALTIME && (flags & TFD_TIMER_ABSTIME) &&
(flags & TFD_TIMER_CANCEL_ON_SET)) {
if (!ctx->might_cancel) {
@@ -114,9 +123,10 @@ static void timerfd_setup_cancel(struct
list_add_rcu(&ctx->clist, &cancel_list);
spin_unlock(&cancel_lock);
}
-   } else if (ctx->might_cancel) {
-   timerfd_remove_cancel(ctx);
+   } else {
+   __timerfd_remove_cancel(ctx);
}
+   spin_unlock(&ctx->cancel_lock);
 }
 
 static ktime_t timerfd_get_remaining(struct timerfd_ctx *ctx)
@@ -268,6 +278,7 @@ SYSCALL_DEFINE2(timerfd_create, int, clo
return -ENOMEM;
 
init_waitqueue_head(&ctx->wqh);
+   spin_lock_init(&ctx->cancel_lock);
ctx->clockid = clockid;
hrtimer_init(&ctx->tmr, clockid, HRTIMER_MODE_ABS);
ctx->moffs = ktime_get_monotonic_offset();
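
The locked/unlocked helper split used in this patch can be sketched in user
space as follows. Everything here is an assumption for illustration: a C11
atomic_flag stands in for the per-context spinlock, a counter stands in for
the RCU list, and all names are invented. The point is that the inner helper
expects the lock to be held, so the setup path can call it without taking
the lock twice.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct ctx_sketch {
	atomic_flag cancel_lock;	/* stand-in for spinlock_t cancel_lock */
	bool might_cancel;
	int on_list;			/* stand-in for membership in cancel_list */
};

static void sketch_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void sketch_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Caller must hold ctx->cancel_lock. */
static void remove_cancel_locked(struct ctx_sketch *ctx)
{
	if (ctx->might_cancel) {
		ctx->might_cancel = false;
		ctx->on_list--;
	}
}

/* Public variant: takes the lock itself. */
void remove_cancel(struct ctx_sketch *ctx)
{
	sketch_lock(&ctx->cancel_lock);
	remove_cancel_locked(ctx);
	sketch_unlock(&ctx->cancel_lock);
}

/* Flag test and list update happen under the same lock. */
void setup_cancel(struct ctx_sketch *ctx, bool want_cancel)
{
	sketch_lock(&ctx->cancel_lock);
	if (want_cancel) {
		if (!ctx->might_cancel) {
			ctx->might_cancel = true;
			ctx->on_list++;
		}
	} else {
		remove_cancel_locked(ctx);
	}
	sketch_unlock(&ctx->cancel_lock);
}
```

Making might_cancel atomic on its own would not be enough, as the commit
message notes, because the flag and the list operation must change together.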



[PATCH] mlx5: ensure 0 is returned when vport is zero

2017-08-18 Thread Colin King
From: Colin Ian King 

Currently, if vport is zero, an uninitialized return status in err is
returned.  Since the only return status at the end of the function
esw_add_uc_addr is zero for the current set of return paths, we may as
well just return 0 rather than err to fix this issue.

Detected by CoverityScan, CID#1452698 ("Uninitialized scalar variable")

Fixes: eeb66cdb6826 ("net/mlx5: Separate between E-Switch and MPFS")
Signed-off-by: Colin Ian King 
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 6d9fb6ac6e9b..c77f4c0c7769 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -401,7 +401,7 @@ static int esw_add_uc_addr(struct mlx5_eswitch *esw, struct vport_addr *vaddr)
esw_debug(esw->dev, "\tADDED UC MAC: vport[%d] %pM fr(%p)\n",
  vport, mac, vaddr->flow_rule);
 
-   return err;
+   return 0;
 }
 
 static int esw_del_uc_addr(struct mlx5_eswitch *esw, struct vport_addr *vaddr)
-- 
2.11.0



[PATCH v5 2/7] KVM: arm64: Save ESR_EL2 on guest SError

2017-08-18 Thread Dongjiu Geng
From: James Morse 

When we exit a guest due to an SError the vcpu fault info isn't updated
with the ESR. Today this is only done for traps.

The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
fault_info with the ESR on SError so that handle_exit() can determine
if this was a RAS SError and decode its severity.

Signed-off-by: James Morse 
---
 arch/arm64/kvm/hyp/switch.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 945e79c641c4..c6f17c7675ad 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -226,13 +226,20 @@ static bool __hyp_text __translate_far_to_hpfar(u64 far, u64 *hpfar)
return true;
 }
 
+static void __hyp_text __populate_fault_info_esr(struct kvm_vcpu *vcpu)
+{
+   vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
+}
+
 static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
 {
-   u64 esr = read_sysreg_el2(esr);
-   u8 ec = ESR_ELx_EC(esr);
+   u8 ec;
+   u64 esr;
u64 hpfar, far;
 
-   vcpu->arch.fault.esr_el2 = esr;
+   __populate_fault_info_esr(vcpu);
+   esr = vcpu->arch.fault.esr_el2;
+   ec = ESR_ELx_EC(esr);
 
if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
return true;
@@ -321,6 +328,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 */
if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
goto again;
+   else if (ARM_EXCEPTION_CODE(exit_code) == ARM_EXCEPTION_EL1_SERROR)
+   __populate_fault_info_esr(vcpu);
 
if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
exit_code == ARM_EXCEPTION_TRAP) {
-- 
2.14.0



[PATCH v5 7/7] arm64: kvm: handle SEI notification and inject virtual SError

2017-08-18 Thread Dongjiu Geng
After receiving an SError, KVM first calls memory failure to deal
with the error. If memory failure wants user space to handle it,
it notifies user space. This patch adds support for user space to
inject a virtual SError with a specified syndrome. The syndrome
value is written to VSESR_EL2, a new RAS Extensions register that
provides the syndrome value reported to software on taking a
virtual SError interrupt exception.

Signed-off-by: Dongjiu Geng 
Signed-off-by: Quanming Wu 
---
 arch/arm/include/asm/kvm_host.h  |  2 ++
 arch/arm/kvm/guest.c |  5 +
 arch/arm64/include/asm/kvm_emulate.h | 10 ++
 arch/arm64/include/asm/kvm_host.h|  2 ++
 arch/arm64/include/asm/sysreg.h  |  3 +++
 arch/arm64/include/asm/system_misc.h |  1 +
 arch/arm64/kvm/guest.c   | 13 +
 arch/arm64/kvm/handle_exit.c | 21 +++--
 arch/arm64/kvm/hyp/switch.c  | 14 ++
 include/uapi/linux/kvm.h |  2 ++
 virt/kvm/arm/arm.c   |  7 +++
 11 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 127e2dd2e21c..bdb6ea690257 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -244,6 +244,8 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
int exception_index);
 
+int kvm_vcpu_ioctl_sei(struct kvm_vcpu *vcpu, u64 *syndrome);
+
 static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
   unsigned long hyp_stack_ptr,
   unsigned long vector_ptr)
diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 1e0784ebbfd6..c23df72d9bec 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -248,6 +248,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
return -EINVAL;
 }
 
+int kvm_vcpu_ioctl_sei(struct kvm_vcpu *vcpu, u64 *syndrome)
+{
+   return 0;
+}
+
 int __attribute_const__ kvm_target_cpu(void)
 {
switch (read_cpuid_part()) {
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 47983db27de2..74213bd539dc 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -155,6 +155,16 @@ static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
return vcpu->arch.fault.esr_el2;
 }
 
+static inline u32 kvm_vcpu_get_vsesr(const struct kvm_vcpu *vcpu)
+{
+   return vcpu->arch.fault.vsesr_el2;
+}
+
+static inline void kvm_vcpu_set_vsesr(struct kvm_vcpu *vcpu, unsigned long val)
+{
+   vcpu->arch.fault.vsesr_el2 = val;
+}
+
 static inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
 {
u32 esr = kvm_vcpu_get_hsr(vcpu);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d68630007b14..57b011261597 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -88,6 +88,7 @@ struct kvm_vcpu_fault_info {
u32 esr_el2;/* Hyp Syndrom Register */
u64 far_el2;/* Hyp Fault Address Register */
u64 hpfar_el2;  /* Hyp IPA Fault Address Register */
+   u32 vsesr_el2;  /* Virtual SError Exception Syndrome Register */
 };
 
 /*
@@ -381,6 +382,7 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
 int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
+int kvm_vcpu_ioctl_sei(struct kvm_vcpu *vcpu, u64 *syndrome);
 
 static inline void __cpu_init_stage2(void)
 {
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 35b786b43ee4..06059eef0f5d 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -86,6 +86,9 @@
 #define REG_PSTATE_PAN_IMM sys_reg(0, 0, 4, 0, 4)
 #define REG_PSTATE_UAO_IMM sys_reg(0, 0, 4, 0, 3)
 
+/* virtual SError exception syndrome register in armv8.2 */
+#define REG_VSESR_EL2  sys_reg(3, 4, 5, 2, 3)
+
 #define SET_PSTATE_PAN(x) __emit_inst(0xd500 | REG_PSTATE_PAN_IMM | \
  (!!x)<<8 | 0x1f)
 #define SET_PSTATE_UAO(x) __emit_inst(0xd500 | REG_PSTATE_UAO_IMM | \
diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
index 07aa8e3c5630..7d07aeb02bc4 100644
--- a/arch/arm64/include/asm/system_misc.h
+++ b/arch/arm64/include/asm/system_misc.h
@@ -57,6 +57,7 @@ extern void (*arm_pm_restart)(enum reboot_mode reboot_mode, const char *cmd);
 })
 
 int handle_guest_sea(phys_addr_t addr, unsigned int esr);
+int handle_guest_sei(phys_addr_t addr, unsigned int esr);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/arm64/

[GIT PULL] apparmor updates for next

2017-08-18 Thread John Johansen
Hi James,

Please pull these apparmor changes for next.

Thanks!

-Kees

The following changes since commit 706224ae390ddbf1871abb7938245be45bf04104:

  samples: Unrename SECCOMP_RET_KILL (2017-08-17 14:17:07 +1000)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor for-security

for you to fetch changes up to 76e22e212a850bbd16cf49f9c586d4635507e0b5:

  apparmor: fix incorrect type assignment when freeing proxies (2017-08-18 06:45:37 -0700)


Christos Gkekas (1):
  apparmor: Fix logical error in verify_header()

Dan Carpenter (1):
  apparmor: Fix an error code in aafs_create()

Geert Uytterhoeven (1):
  apparmor: Fix shadowed local variable in unpack_trans_table()

John Johansen (12):
  apparmor: Redundant condition: prev_ns. in [label.c:1498]
  apparmor: add the ability to mediate signals
  apparmor: add mount mediation
  apparmor: cleanup conditional check for label in label_print
  apparmor: add support for absolute root view based labels
  apparmor: make policy_unpack able to audit different info messages
  apparmor: add more debug asserts to apparmorfs
  apparmor: add base infastructure for socket mediation
  apparmor: move new_null_profile to after profile lookup fns()
  apparmor: fix race condition in null profile creation
  apparmor: ensure unconfined profiles have dfas initialized
  apparmor: fix incorrect type assignment when freeing proxies

 security/apparmor/.gitignore  |   1 +
 security/apparmor/Makefile|  43 ++-
 security/apparmor/apparmorfs.c|  37 +-
 security/apparmor/domain.c|   4 +-
 security/apparmor/file.c  |  30 ++
 security/apparmor/include/apparmor.h  |   2 +
 security/apparmor/include/audit.h |  39 +-
 security/apparmor/include/domain.h|   5 +
 security/apparmor/include/ipc.h   |   6 +
 security/apparmor/include/label.h |   1 +
 security/apparmor/include/mount.h |  54 +++
 security/apparmor/include/net.h   | 114 ++
 security/apparmor/include/perms.h |   5 +-
 security/apparmor/include/policy.h|  13 +
 security/apparmor/include/sig_names.h |  95 +
 security/apparmor/ipc.c   |  99 +
 security/apparmor/label.c |  36 +-
 security/apparmor/lib.c   |   5 +-
 security/apparmor/lsm.c   | 472 +++
 security/apparmor/mount.c | 696 ++
 security/apparmor/net.c   | 184 +
 security/apparmor/policy.c| 166 
 security/apparmor/policy_ns.c |   2 +
 security/apparmor/policy_unpack.c | 105 -
 24 files changed, 2081 insertions(+), 133 deletions(-)
 create mode 100644 security/apparmor/include/mount.h
 create mode 100644 security/apparmor/include/net.h
 create mode 100644 security/apparmor/include/sig_names.h
 create mode 100644 security/apparmor/mount.c
 create mode 100644 security/apparmor/net.c


[PATCH v5 0/7] Add RAS virtualization support to SEA/SEI notification type

2017-08-18 Thread Dongjiu Geng
In the firmware-first RAS solution, corrupt data is detected in a
memory location when guest OS application software executing at EL0
or guest OS kernel software executing at EL1 reads from the memory.
The memory node records errors in an error record accessible using
system registers.

Because SCR_EL3.EA is 1, the CPU traps to EL3 firmware, which
records the error in the APEI table by reading the system
registers.

Because the error was taken from a lower Exception level, if the
exception is SEA/SEI and HCR_EL2.TEA/HCR_EL2.AMO is 1, firmware
sets ESR_EL2/FAR_EL2 to fake an exception trap to EL2, then
transfers to the hypervisor.

The hypervisor calls memory failure to deal with this error. Memory
failure reads the APEI table and decides whether it needs to deliver
a SIGBUS signal to user space. The advantage of using SIGBUS to
notify user space is that it stays compatible with non-KVM users.

Dongjiu Geng (5):
  acpi: apei: Add SEI notification type support for ARMv8
  support user space to query RAS extension feature
  arm64: kvm: route synchronous external abort exceptions to el2
  KVM: arm/arm64: Allow get exception syndrome and
  arm64: kvm: handle SEI notification and inject virtual SError

James Morse (1):
  KVM: arm64: Save ESR_EL2 on guest SError

Xie XiuQi (1):
  arm64: cpufeature: Detect CPU RAS Extentions

 arch/arm/include/asm/kvm_host.h  |  2 ++
 arch/arm/kvm/guest.c |  5 +++
 arch/arm64/Kconfig   | 16 ++
 arch/arm64/include/asm/barrier.h |  1 +
 arch/arm64/include/asm/cpucaps.h |  3 +-
 arch/arm64/include/asm/kvm_arm.h |  2 ++
 arch/arm64/include/asm/kvm_emulate.h | 17 ++
 arch/arm64/include/asm/kvm_host.h|  2 ++
 arch/arm64/include/asm/sysreg.h  |  5 +++
 arch/arm64/include/asm/system_misc.h |  1 +
 arch/arm64/include/uapi/asm/kvm.h|  5 +++
 arch/arm64/kernel/cpufeature.c   | 13 
 arch/arm64/kernel/process.c  |  3 ++
 arch/arm64/kvm/guest.c   | 48 +
 arch/arm64/kvm/handle_exit.c | 21 +++--
 arch/arm64/kvm/hyp/switch.c  | 29 +++--
 arch/arm64/kvm/reset.c   |  3 ++
 arch/arm64/mm/fault.c| 21 +++--
 drivers/acpi/apei/Kconfig| 15 +
 drivers/acpi/apei/ghes.c | 60 +++-
 include/acpi/ghes.h  |  2 +-
 include/uapi/linux/kvm.h |  3 ++
 virt/kvm/arm/arm.c   |  7 +
 23 files changed, 254 insertions(+), 30 deletions(-)

-- 
2.14.0



[PATCH v5 5/7] arm64: kvm: route synchronous external abort exceptions to el2

2017-08-18 Thread Dongjiu Geng
ARMv8.2 adds a new bit, HCR_EL2.TEA, which routes synchronous
external aborts to EL2, and a trap control bit, HCR_EL2.TERR,
which traps all Non-secure EL1&0 error record accesses to EL2.

This patch enables both bits for the guest OS. When a synchronous
abort is generated in the guest OS, it traps to EL3 firmware, and
the firmware decides, according to HCR_EL2.TEA, whether to jump to
the hypervisor or to the host OS. In the guest OS, RAS error record
accesses trap to EL2.

Signed-off-by: Dongjiu Geng 
---
 arch/arm64/include/asm/kvm_arm.h | 2 ++
 arch/arm64/include/asm/kvm_emulate.h | 7 +++
 2 files changed, 9 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 61d694c2eae5..1188272003c4 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -23,6 +23,8 @@
 #include 
 
 /* Hyp Configuration Register (HCR) bits */
+#define HCR_TEA        (UL(1) << 37)
+#define HCR_TERR       (UL(1) << 36)
 #define HCR_E2H        (UL(1) << 34)
 #define HCR_ID         (UL(1) << 33)
 #define HCR_CD         (UL(1) << 32)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index fe39e6841326..47983db27de2 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -47,6 +47,13 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
if (is_kernel_in_hyp_mode())
vcpu->arch.hcr_el2 |= HCR_E2H;
+   if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
+   /* route synchronous external abort exceptions to EL2 */
+   vcpu->arch.hcr_el2 |= HCR_TEA;
+   /* trap error record accesses */
+   vcpu->arch.hcr_el2 |= HCR_TERR;
+   }
+
if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
vcpu->arch.hcr_el2 &= ~HCR_RW;
 }
-- 
2.14.0
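
The effect of the vcpu_reset_hcr() hunk above can be sketched in plain C.
This is a minimal user-space illustration, not the kernel code: the bit
positions come from the patch, while sketch_reset_hcr() and its
parameters are hypothetical names invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined by the patch above. */
#define HCR_TEA   (UINT64_C(1) << 37)  /* route synchronous external aborts to EL2 */
#define HCR_TERR  (UINT64_C(1) << 36)  /* trap error record accesses to EL2 */

/*
 * Hypothetical stand-in for vcpu_reset_hcr(): start from the guest's
 * base flags and add the two RAS trap bits only when the CPU reports
 * the RAS extension.
 */
static uint64_t sketch_reset_hcr(uint64_t base_flags, bool has_ras_extn)
{
	uint64_t hcr = base_flags;

	if (has_ras_extn)
		hcr |= HCR_TEA | HCR_TERR;

	return hcr;
}
```

Without the RAS extension the base flags are returned unchanged, which
mirrors the cpus_have_const_cap() guard in the patch.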



[PATCH v5 1/7] arm64: cpufeature: Detect CPU RAS Extentions

2017-08-18 Thread Dongjiu Geng
From: Xie XiuQi 

ARM's v8.2 Extensions add support for Reliability, Availability and
Serviceability (RAS). On CPUs with these extensions, system software
can use additional barriers to isolate errors and determine if faults
are pending.

Add cpufeature detection and a barrier in the context-switch code.
There is no need to use alternatives for this as CPUs that don't
support this feature will treat the instruction as a nop.

Platform level RAS support may require additional firmware support.

Signed-off-by: Xie XiuQi 
[Rebased, added esb and config option, reworded commit message]
Signed-off-by: James Morse 
---
 arch/arm64/Kconfig   | 16 
 arch/arm64/include/asm/barrier.h |  1 +
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/include/asm/sysreg.h  |  2 ++
 arch/arm64/kernel/cpufeature.c   | 13 +
 arch/arm64/kernel/process.c  |  3 +++
 6 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index dfd908630631..4d87aa963d83 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -960,6 +960,22 @@ config ARM64_UAO
  regular load/store instructions if the cpu does not implement the
  feature.
 
+config ARM64_RAS_EXTN
+   bool "Enable support for RAS CPU Extensions"
+   default y
+   help
+ CPUs that support the Reliability, Availability and Serviceability
+ (RAS) Extensions, part of ARMv8.2 are able to track faults and
+ errors, classify them and report them to software.
+
+ On CPUs with these extensions system software can use additional
+ barriers to determine if faults are pending and read the
+ classification from a new set of registers.
+
+ Selecting this feature will allow the kernel to use these barriers
+ and access the new registers if the system supports the extension.
+ Platform RAS features may additionally depend on firmware support.
+
 endmenu
 
 config ARM64_MODULE_CMODEL_LARGE
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 0fe7e43b7fbc..8b0a0eb67625 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -30,6 +30,7 @@
 #define isb()  asm volatile("isb" : : : "memory")
 #define dmb(opt)   asm volatile("dmb " #opt : : : "memory")
 #define dsb(opt)   asm volatile("dsb " #opt : : : "memory")
+#define esb()  asm volatile("hint #16"  : : : "memory")
 
 #define mb()   dsb(sy)
 #define rmb()  dsb(ld)
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8d2272c6822c..f93bf77f1f74 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -39,7 +39,8 @@
 #define ARM64_WORKAROUND_QCOM_FALKOR_E1003 18
 #define ARM64_WORKAROUND_858921            19
 #define ARM64_WORKAROUND_CAVIUM_30115      20
+#define ARM64_HAS_RAS_EXTN                 21
 
-#define ARM64_NCAPS                        21
+#define ARM64_NCAPS                        22
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 248339e4aaf5..35b786b43ee4 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -331,6 +331,7 @@
 #define ID_AA64ISAR1_JSCVT_SHIFT   12
 
 /* id_aa64pfr0 */
+#define ID_AA64PFR0_RAS_SHIFT  28
 #define ID_AA64PFR0_GIC_SHIFT  24
 #define ID_AA64PFR0_ASIMD_SHIFT 20
 #define ID_AA64PFR0_FP_SHIFT   16
@@ -339,6 +340,7 @@
 #define ID_AA64PFR0_EL1_SHIFT  4
 #define ID_AA64PFR0_EL0_SHIFT  0
 
+#define ID_AA64PFR0_RAS_V1 0x1
 #define ID_AA64PFR0_FP_NI  0xf
 #define ID_AA64PFR0_FP_SUPPORTED   0x0
 #define ID_AA64PFR0_ASIMD_NI   0xf
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9f9e0064c8c1..a807ab55ee10 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -124,6 +124,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
+   ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_RAS_SHIFT, 4, 0),
    ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0),
    S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
    S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI),
@@ -888,6 +889,18 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.min_field_value = 0,
.matches = has_no_fpsimd,
},
+#ifdef CONFIG_ARM64_RAS_EXTN
+   {
+   .desc = "RAS Extension Support",
+   .capability = ARM64_HAS_RAS_EXTN,
+   .def_scope = SCOPE_SYSTEM,
+   .matches = has_cp

[PATCH v5 4/7] support user space to query RAS extension feature

2017-08-18 Thread Dongjiu Geng
The ARMv8.2 RAS extension adds the virtual SError exception syndrome
register (VSESR_EL2), whose value is specified by user space. User
space therefore needs to check whether the CPU has the RAS extension.
If it does, user space specifies the virtual SError syndrome value;
otherwise, it does not set one. This patch adds support for that query.

Signed-off-by: Dongjiu Geng 
---
 arch/arm64/kvm/reset.c   | 3 +++
 include/uapi/linux/kvm.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 3256b9228e75..b7313ee028e9 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -77,6 +77,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_ARM_PMU_V3:
r = kvm_arm_support_pmu_v3();
break;
+   case KVM_CAP_ARM_RAS_EXTENSION:
+   r = cpus_have_const_cap(ARM64_HAS_RAS_EXTN);
+   break;
case KVM_CAP_SET_GUEST_DEBUG:
case KVM_CAP_VCPU_ATTRIBUTES:
r = 1;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 6cd63c18708a..5a2a338cae57 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -929,6 +929,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PPC_SMT_POSSIBLE 147
 #define KVM_CAP_HYPERV_SYNIC2 148
 #define KVM_CAP_HYPERV_VP_INDEX 149
+#define KVM_CAP_ARM_RAS_EXTENSION 150
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.14.0



[PATCH v5 3/7] acpi: apei: Add SEI notification type support for ARMv8

2017-08-18 Thread Dongjiu Geng
ARMv8.2 requires implementation of the RAS extension, which adds
the SEI (SError Interrupt) notification type. This patch adds a
new GHES error source handling function for SEI. Because parsing
and handling of this error source are similar to those for SEA,
one function handles both.

In the current code, the two functions ghes_sea_add() and
ghes_sea_remove() are only called when CONFIG_ACPI_APEI_SEA
and CONFIG_ACPI_APEI_SEI are defined. If they are not, ghes_probe()
returns an error and does not continue, so remove the useless code
that handles the case where CONFIG_ACPI_APEI_SEA and
CONFIG_ACPI_APEI_SEI are not defined.

Expose one API, ghes_notify_sex(), so that external modules can
call it to parse and handle SEA/SEI.

Signed-off-by: Dongjiu Geng 
---
 arch/arm64/mm/fault.c | 21 +++--
 drivers/acpi/apei/Kconfig | 15 
 drivers/acpi/apei/ghes.c  | 60 ++-
 include/acpi/ghes.h   |  2 +-
 4 files changed, 74 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 2509e4fe6992..0aa92a69c280 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -585,7 +585,7 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
if (interrupts_enabled(regs))
nmi_enter();
 
-   ret = ghes_notify_sea();
+   ret = ghes_notify_sex(ACPI_HEST_NOTIFY_SEA);
 
if (interrupts_enabled(regs))
nmi_exit();
@@ -682,7 +682,24 @@ int handle_guest_sea(phys_addr_t addr, unsigned int esr)
int ret = -ENOENT;
 
if (IS_ENABLED(CONFIG_ACPI_APEI_SEA))
-   ret = ghes_notify_sea();
+   ret = ghes_notify_sex(ACPI_HEST_NOTIFY_SEA);
+
+   return ret;
+}
+
+/*
+ * Handle SError interrupt that occur in a guest kernel.
+ *
+ * The return value will be zero if the SEI was successfully handled
+ * and non-zero if there was an error processing the error or there was
+ * no error to process.
+ */
+int handle_guest_sei(phys_addr_t addr, unsigned int esr)
+{
+   int ret = -ENOENT;
+
+   if (IS_ENABLED(CONFIG_ACPI_APEI_SEI))
+   ret = ghes_notify_sex(ACPI_HEST_NOTIFY_SEI);
 
return ret;
 }
diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index de14d49a5c90..556370c763ec 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -54,6 +54,21 @@ config ACPI_APEI_SEA
  option allows the OS to look for such hardware error record, and
  take appropriate action.
 
+config ACPI_APEI_SEI
+   bool "APEI Asynchronous SError Interrupt logging/recovering support"
+   depends on ARM64 && ACPI_APEI_GHES
+   default y
+   help
+ This option should be enabled if the system supports
+ firmware first handling of SEI (asynchronous SError interrupt).
+
+ SEI happens with invalid instruction access or asynchronous exceptions
+ on ARMv8 systems. If a system supports firmware first handling of SEI,
+ the platform analyzes and handles hardware error notifications from
+ SEI, and it may then form a HW error record for the OS to parse and
+ handle. This option allows the OS to look for such hardware error
+ record, and take appropriate action.
+
 config ACPI_APEI_MEMORY_FAILURE
bool "APEI memory error recovering support"
depends on ACPI_APEI && MEMORY_FAILURE
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d661d452b238..705738aa48b8 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -813,20 +813,21 @@ static struct notifier_block ghes_notifier_hed = {
.notifier_call = ghes_notify_hed,
 };
 
-#ifdef CONFIG_ACPI_APEI_SEA
 static LIST_HEAD(ghes_sea);
+static LIST_HEAD(ghes_sei);
 
 /*
  * Return 0 only if one of the SEA error sources successfully reported an error
  * record sent from the firmware.
  */
-int ghes_notify_sea(void)
+
+int ghes_handle_sex(struct list_head *head)
 {
struct ghes *ghes;
int ret = -ENOENT;
 
rcu_read_lock();
-   list_for_each_entry_rcu(ghes, &ghes_sea, list) {
+   list_for_each_entry_rcu(ghes, head, list) {
if (!ghes_proc(ghes))
ret = 0;
}
@@ -834,33 +835,41 @@ int ghes_notify_sea(void)
return ret;
 }
 
-static void ghes_sea_add(struct ghes *ghes)
+int ghes_notify_sex(u8 type)
+{
+   if (type == ACPI_HEST_NOTIFY_SEA)
+   return ghes_handle_sex(&ghes_sea);
+   else if (type == ACPI_HEST_NOTIFY_SEI)
+   return ghes_handle_sex(&ghes_sei);
+
+   return -ENOENT;
+}
+
+/*
+ * This function is only called when the CONFIG_HAVE_ACPI_APEI_SEA or
+ * CONFIG_HAVE_ACPI_APEI_SEA is enabled. when disabled, it will return
+ * error in the ghes_probe
+ */
+static void ghes_sex_add(struct ghes *ghes)
 {
+   u8 

[PATCH v5 6/7] KVM: arm64: Allow get exception information from userspace

2017-08-18 Thread Dongjiu Geng
When userspace gets a SIGBUS signal, it does not know whether the
cause was a synchronous external abort or an SError, so it needs
to get the exception syndrome. This patch allows userspace to
read that value. For the syndrome, only the EC and ISS fields are
given to userspace.

Now that the synchronous external abort injection logic has moved
to userspace, userspace needs to specify the far_el1 value when it
injects an SEA exception into the guest OS, so this patch gives
the exception virtual address to user space.

Signed-off-by: Dongjiu Geng 
Signed-off-by: Quanming Wu 
---
 arch/arm64/include/uapi/asm/kvm.h |  5 +
 arch/arm64/kvm/guest.c| 35 +++
 2 files changed, 40 insertions(+)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 9f3ca24bbcc6..514261f682b8 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -181,6 +181,11 @@ struct kvm_arch_memory_slot {
 #define KVM_REG_ARM64_SYSREG_OP2_MASK  0x0007
 #define KVM_REG_ARM64_SYSREG_OP2_SHIFT 0
 
+/* AArch64 fault registers */
+#define KVM_REG_ARM64_FAULT            (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
+#define KVM_REG_ARM64_FAULT_ESR_EC_ISS (0)
+#define KVM_REG_ARM64_FAULT_FAR        (1)
+
 #define ARM64_SYS_REG_SHIFT_MASK(x,n) \
(((x) << KVM_REG_ARM64_SYSREG_ ## n ## _SHIFT) & \
KVM_REG_ARM64_SYSREG_ ## n ## _MASK)
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 5c7f657dd207..cb383c310f18 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -128,6 +128,38 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 out:
return err;
 }
+static int get_fault_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+   void __user *uaddr = (void __user *)(unsigned long)reg->addr;
+   u32 ec, value;
+   u32 id = reg->id & ~(KVM_REG_ARCH_MASK |
+   KVM_REG_SIZE_MASK | KVM_REG_ARM64_FAULT);
+
+   switch (id) {
+   case KVM_REG_ARM64_FAULT_ESR_EC_ISS:
+   /* The user space needs to know the fault exception
+* class field
+*/
+   ec = kvm_vcpu_get_hsr(vcpu) & ESR_ELx_EC_MASK;
+   value = ec | (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_ISS_MASK);
+
+   if (copy_to_user(uaddr, &value, KVM_REG_SIZE(reg->id)) != 0)
+   return -EFAULT;
+   break;
+   case KVM_REG_ARM64_FAULT_FAR:
+   /* when user space injects synchronized abort, it needs
+* to inject the fault address.
+*/
+   if (copy_to_user(uaddr, &(vcpu->arch.fault.far_el2),
+   KVM_REG_SIZE(reg->id)) != 0)
+   return -EFAULT;
+   break;
+   default:
+   return -ENOENT;
+   }
+   return 0;
+}
+
 
 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
@@ -243,6 +275,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
return get_core_reg(vcpu, reg);
 
+   if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM64_FAULT)
+   return get_fault_reg(vcpu, reg);
+
if (is_timer_reg(reg->id))
return get_timer_reg(vcpu, reg);
 
-- 
2.14.0
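
The EC/ISS filtering done by get_fault_reg() above can be sketched as a
stand-alone helper. This is an illustration only: the SKETCH_* masks
are local to the example and follow the architectural ESR layout (EC in
bits [31:26], IL in bit 25, ISS in bits [24:0]); the function names are
hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Local copies of the ESR field layout, named for this sketch only. */
#define SKETCH_ESR_EC_SHIFT  26
#define SKETCH_ESR_EC_MASK   (UINT32_C(0x3F) << SKETCH_ESR_EC_SHIFT)
#define SKETCH_ESR_ISS_MASK  UINT32_C(0x01FFFFFF)

/* Keep only the EC and ISS fields, dropping the IL bit (bit 25). */
static uint32_t sketch_esr_ec_iss(uint32_t esr)
{
	return (esr & SKETCH_ESR_EC_MASK) | (esr & SKETCH_ESR_ISS_MASK);
}

/* Extract the exception class as a small integer. */
static uint32_t sketch_esr_ec(uint32_t esr)
{
	return (esr & SKETCH_ESR_EC_MASK) >> SKETCH_ESR_EC_SHIFT;
}
```

A consumer of the filtered value could call sketch_esr_ec() on it to
tell a data abort from an SError before deciding how to react to the
SIGBUS.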



Re: [PATCH v2 6/6] kernel: tracepoints: add support for relative references

2017-08-18 Thread Steven Rostedt
On Fri, 18 Aug 2017 14:44:15 +0100
Ard Biesheuvel  wrote:

> >> It appears the stuff above needs to be move inside the double-include
> >> guard (which oddly enough does not cover the entire file)  
> >
> > Why was this moved to the header file? To fulfill some checkpatch
> > warning?
> >  
> 
> Yes.

My preference is to ignore that checkpatch warning. The section
variables are created by linker magic, and not normal "extern"
variables. They are only used in one location, and I like to keep them
where they are used, and not be something other places might think they
can be used. In other words, keep them by the C code, and out of
headers.

Tracepoints and linker/asm work always triggers a lot of bogus
checkpatch warnings. Which is unfortunate. :-/

Thanks!

-- Steve


[PATCH] drm/msm/: make clk_init_data const

2017-08-18 Thread Bhumika Goyal
Make these const as they are only stored in the init field of a clk_hw
structure, which is const.
Done using Coccinelle.

Signed-off-by: Bhumika Goyal 
---
 drivers/gpu/drm/msm/hdmi/hdmi_phy_8996.c | 2 +-
 drivers/gpu/drm/msm/hdmi/hdmi_pll_8960.c | 2 +-
 drivers/gpu/drm/msm/mdp/mdp4/mdp4_lvds_pll.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_phy_8996.c b/drivers/gpu/drm/msm/hdmi/hdmi_phy_8996.c
index 1fb7645..dea4697 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_phy_8996.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_phy_8996.c
@@ -702,7 +702,7 @@ static int hdmi_8996_pll_is_enabled(struct clk_hw *hw)
"xo",
 };
 
-static struct clk_init_data pll_init = {
+static const struct clk_init_data pll_init = {
.name = "hdmipll",
.ops = &hdmi_8996_pll_ops,
.parent_names = hdmi_pll_parents,
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_pll_8960.c b/drivers/gpu/drm/msm/hdmi/hdmi_pll_8960.c
index 9959075..2e3c147 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_pll_8960.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_pll_8960.c
@@ -419,7 +419,7 @@ static int hdmi_pll_set_rate(struct clk_hw *hw, unsigned long rate,
"pxo",
 };
 
-static struct clk_init_data pll_init = {
+static const struct clk_init_data pll_init = {
.name = "hdmi_pll",
.ops = &hdmi_pll_ops,
.parent_names = hdmi_pll_parents,
diff --git a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_lvds_pll.c b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_lvds_pll.c
index ce42459..b9a7104 100644
--- a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_lvds_pll.c
+++ b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_lvds_pll.c
@@ -137,7 +137,7 @@ static int mpd4_lvds_pll_set_rate(struct clk_hw *hw, unsigned long rate,
"pxo",
 };
 
-static struct clk_init_data pll_init = {
+static const struct clk_init_data pll_init = {
.name = "mpd4_lvds_pll",
.ops = &mpd4_lvds_pll_ops,
.parent_names = mpd4_lvds_pll_parents,
-- 
1.9.1



[PATCH 3.16 035/134] [media] cx231xx-cards: fix NULL-deref at probe

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit 0cd273bb5e4d1828efaaa8dfd11b7928131ed149 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory beyond the endpoint array should a
malicious device lack the expected endpoints.

Fixes: e0d3bafd0258 ("V4L/DVB (10954): Add cx231xx USB driver")

Cc: Sri Deevi 
Signed-off-by: Johan Hovold 
Signed-off-by: Hans Verkuil 
Signed-off-by: Mauro Carvalho Chehab 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings 
---
 drivers/media/usb/cx231xx/cx231xx-cards.c | 45 +++
 1 file changed, 40 insertions(+), 5 deletions(-)

--- a/drivers/media/usb/cx231xx/cx231xx-cards.c
+++ b/drivers/media/usb/cx231xx/cx231xx-cards.c
@@ -1258,6 +1258,9 @@ static int cx231xx_usb_probe(struct usb_
uif = udev->actconfig->interface[dev->current_pcb_config.
   hs_config_info[0].interface_info.video_index + 1];
 
+   if (uif->altsetting[0].desc.bNumEndpoints < isoc_pipe + 1)
+   return -ENODEV;
+
dev->video_mode.end_point_addr = uif->altsetting[0].
endpoint[isoc_pipe].desc.bEndpointAddress;
 
@@ -1275,8 +1278,12 @@ static int cx231xx_usb_probe(struct usb_
}
 
for (i = 0; i < dev->video_mode.num_alt; i++) {
-   u16 tmp = le16_to_cpu(uif->altsetting[i].endpoint[isoc_pipe].
-   desc.wMaxPacketSize);
+   u16 tmp;
+
+   if (uif->altsetting[i].desc.bNumEndpoints < isoc_pipe + 1)
+   return -ENODEV;
+
+   tmp = le16_to_cpu(uif->altsetting[i].endpoint[isoc_pipe].desc.wMaxPacketSize);
dev->video_mode.alt_max_pkt_size[i] =
(tmp & 0x07ff) * (((tmp & 0x1800) >> 11) + 1);
cx231xx_info("Alternate setting %i, max size= %i\n", i,
@@ -1288,6 +1295,9 @@ static int cx231xx_usb_probe(struct usb_
   hs_config_info[0].interface_info.
   vanc_index + 1];
 
+   if (uif->altsetting[0].desc.bNumEndpoints < isoc_pipe + 1)
+   return -ENODEV;
+
dev->vbi_mode.end_point_addr =
uif->altsetting[0].endpoint[isoc_pipe].desc.
bEndpointAddress;
@@ -1306,8 +1316,12 @@ static int cx231xx_usb_probe(struct usb_
}
 
for (i = 0; i < dev->vbi_mode.num_alt; i++) {
-   u16 tmp =
-   le16_to_cpu(uif->altsetting[i].endpoint[isoc_pipe].
+   u16 tmp;
+
+   if (uif->altsetting[i].desc.bNumEndpoints < isoc_pipe + 1)
+   return -ENODEV;
+
+   tmp = le16_to_cpu(uif->altsetting[i].endpoint[isoc_pipe].
desc.wMaxPacketSize);
dev->vbi_mode.alt_max_pkt_size[i] =
(tmp & 0x07ff) * (((tmp & 0x1800) >> 11) + 1);
@@ -1320,6 +1334,9 @@ static int cx231xx_usb_probe(struct usb_
   hs_config_info[0].interface_info.
   hanc_index + 1];
 
+   if (uif->altsetting[0].desc.bNumEndpoints < isoc_pipe + 1)
+   return -ENODEV;
+
dev->sliced_cc_mode.end_point_addr =
uif->altsetting[0].endpoint[isoc_pipe].desc.
bEndpointAddress;
@@ -1338,7 +1355,12 @@ static int cx231xx_usb_probe(struct usb_
}
 
for (i = 0; i < dev->sliced_cc_mode.num_alt; i++) {
-   u16 tmp = le16_to_cpu(uif->altsetting[i].endpoint[isoc_pipe].
+   u16 tmp;
+
+   if (uif->altsetting[i].desc.bNumEndpoints < isoc_pipe + 1)
+   return -ENODEV;
+
+   tmp = le16_to_cpu(uif->altsetting[i].endpoint[isoc_pipe].
desc.wMaxPacketSize);
dev->sliced_cc_mode.alt_max_pkt_size[i] =
(tmp & 0x07ff) * (((tmp & 0x1800) >> 11) + 1);
@@ -1353,6 +1375,11 @@ static int cx231xx_usb_probe(struct usb_
   interface_info.
   ts1_index + 1];
 
+   if (uif->altsetting[0].desc.bNumEndpoints < isoc_pipe + 1) {
+   retval = -ENODEV;
+   goto err_video_alt;
+   }
+
dev->ts1_mode.end_point_addr =
uif->altsetting[0].endpoint[isoc_pipe].
desc.bEndpointAddress;
@@ -1371,7 +1398,14 @@ static int cx231xx_usb_probe(struct usb_
}
 
for (i = 0; i < dev->ts1_mode.num_alt; i++) {
-   u16 tmp = le16_to_cpu(uif->altsetting[i].
+   u16 tmp;
+
+   if (uif->altsetting[i].desc.bNumEndpoints < isoc_pipe + 1) {
+   retval = -ENODEV;
+   

Re: [PATCH 3.16 119/134] ipv4: restore rt->fi for reference counting

2017-08-18 Thread Eric Dumazet
On Fri, Aug 18, 2017 at 6:13 AM, Ben Hutchings  wrote:
> 3.16.47-rc1 review patch.  If anyone has any objections, please let me know.
>
> --
>
> From: WANG Cong 
>
> commit 82486aa6f1b9bc8145e6d0fa2bc0b44307f3b875 upstream.
>
> IPv4 dst could use fi->fib_metrics to store metrics but fib_info
> itself is refcnt'ed, so without taking a refcnt fi and
> fi->fib_metrics could be freed while dst metrics still points to
> it. This triggers use-after-free as reported by Andrey twice.
>
> This patch reverts commit 2860583fe840 ("ipv4: Kill rt->fi") to
> restore this reference counting. It is a quick fix for -net and
> -stable, for -net-next, as Eric suggested, we can consider doing
> reference counting for metrics itself instead of relying on fib_info.
>
> IPv6 is very different, it copies or steals the metrics from mx6_config
> in fib6_commit_metrics() so probably doesn't need a refcnt.
>
> Decnet has already done the refcnt'ing, see dn_fib_semantic_match().
>
> Fixes: 2860583fe840 ("ipv4: Kill rt->fi")
> Reported-by: Andrey Konovalov 
> Tested-by: Andrey Konovalov 
> Signed-off-by: Cong Wang 
> Acked-by: Eric Dumazet 
> Signed-off-by: David S. Miller 
> [bwh: Backported to 3.16:
>  - Update all 5 places where rtable is initialised
>  - Open-code fib_info_hold()
>  - Adjust context]
> Signed-off-by: Ben Hutchings 
> ---

I thought we refined this later with :

commit 3fb07daff8e99243366a081e5129560734de4ada
Author: Eric Dumazet 
Date:   Thu May 25 14:27:35 2017 -0700

ipv4: add reference counting to metrics

Andrey Konovalov reported crashes in ipv4_mtu()

I could reproduce the issue with KASAN kernels, between
10.246.7.151 and 10.246.7.152 :

1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 &

2) At the same time run following loop :
while :
do
 ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
 ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
done

Cong Wang attempted to add back rt->fi in commit
82486aa6f1b9 ("ipv4: restore rt->fi for reference counting")
but this proved to add some issues that were complex to solve.

Instead, I suggested to add a refcount to the metrics themselves,
being a standalone object (in particular, no reference to other objects)

I tried to make this patch as small as possible to ease its backport,
instead of being super clean. Note that we believe that only ipv4 dst
need to take care of the metric refcount. But if this is wrong,
this patch adds the basic infrastructure to extend this to other
families.

Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang
for his efforts on this problem.

Fixes: 2860583fe840 ("ipv4: Kill rt->fi")
Signed-off-by: Eric Dumazet 
Reported-by: Andrey Konovalov 
Reviewed-by: Julian Anastasov 
Acked-by: Cong Wang 
Signed-off-by: David S. Miller 


And then :

commit 187e5b3ac84d3421d2de3aca949b2791fbcad554
Author: Eric Dumazet 
Date:   Tue Aug 15 05:26:17 2017 -0700

ipv4: fix NULL dereference in free_fib_info_rcu()

If fi->fib_metrics could not be allocated in fib_create_info()
we attempt to dereference a NULL pointer in free_fib_info_rcu() :

m = fi->fib_metrics;
if (m != &dst_default_metrics && atomic_dec_and_test(&m->refcnt))
kfree(m);

Before my recent patch, we used to call kfree(NULL) and nothing wrong
happened.

Instead of using RCU to defer freeing while we are under memory stress,
it seems better to take immediate action.

This was reported by syzkaller team.

Fixes: 3fb07daff8e9 ("ipv4: add reference counting to metrics")
Signed-off-by: Eric Dumazet 
Reported-by: Dmitry Vyukov 
Signed-off-by: David S. Miller 
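
The hazard described in the second commit message can be modelled in
user space. The sketch below is an illustration under stated
assumptions, not the kernel fix: every sketch_* name is hypothetical,
C11 atomics stand in for the kernel's atomic_t, and the NULL guard is
just one way to make the release path safe in this model. The commit
message itself describes a different remedy, namely acting immediately
when the allocation fails rather than deferring the free.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted metrics object. */
struct sketch_metrics {
	atomic_int refcnt;
};

/* Shared default object that is never freed. */
static struct sketch_metrics sketch_default_metrics;

/* Allocate a metrics object holding one reference; may return NULL. */
static struct sketch_metrics *sketch_metrics_new(void)
{
	struct sketch_metrics *m = malloc(sizeof(*m));

	if (m)
		atomic_init(&m->refcnt, 1);
	return m;
}

/* Take an additional reference. */
static void sketch_metrics_get(struct sketch_metrics *m)
{
	atomic_fetch_add(&m->refcnt, 1);
}

/*
 * Drop one reference. Returns true when this call released the last
 * reference and freed the object. A NULL pointer (allocation failed
 * earlier) and the shared default object are both left alone.
 */
static bool sketch_metrics_put(struct sketch_metrics *m)
{
	if (m == NULL || m == &sketch_default_metrics)
		return false;

	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&m->refcnt, 1) == 1) {
		free(m);
		return true;
	}
	return false;
}
```

Without the NULL check, the decrement would dereference a NULL pointer
in exactly the situation the commit message describes, whereas the
earlier kfree(NULL) call was harmless.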


[PATCH 3.16 034/134] [media] usbvision: fix NULL-deref at probe

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit eacb975b48272f54532b62f515a3cf7eefa35123 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory beyond the endpoint array should a
malicious device lack the expected endpoints.

Fixes: 2a9f8b5d25be ("V4L/DVB (5206): Usbvision: set alternate interface
modification")

Cc: Thierry MERLE 
Signed-off-by: Johan Hovold 
Signed-off-by: Hans Verkuil 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Ben Hutchings 
---
 drivers/media/usb/usbvision/usbvision-video.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/drivers/media/usb/usbvision/usbvision-video.c
+++ b/drivers/media/usb/usbvision/usbvision-video.c
@@ -1599,7 +1599,14 @@ static int usbvision_probe(struct usb_in
}
 
for (i = 0; i < usbvision->num_alt; i++) {
-   u16 tmp = le16_to_cpu(uif->altsetting[i].endpoint[1].desc.
+   u16 tmp;
+
+   if (uif->altsetting[i].desc.bNumEndpoints < 2) {
+   ret = -ENODEV;
+   goto err_pkt;
+   }
+
+   tmp = le16_to_cpu(uif->altsetting[i].endpoint[1].desc.
  wMaxPacketSize);
usbvision->alt_max_pkt_size[i] =
(tmp & 0x07ff) * (((tmp & 0x1800) >> 11) + 1);
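
The packet-size arithmetic in the hunk above (and in the cx231xx patch)
can be sketched as a small helper. It assumes the USB 2.0 layout of
wMaxPacketSize for high-speed periodic endpoints: bits 10:0 hold the
packet size and bits 12:11 hold the number of additional transactions
per microframe. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Decode a CPU-endian wMaxPacketSize value into the maximum payload
 * per microframe: packet size times (1 + additional transactions).
 */
static unsigned int sketch_iso_max_payload(uint16_t w_max_packet_size)
{
	unsigned int size = w_max_packet_size & 0x07ff;           /* bits 10:0 */
	unsigned int extra = (w_max_packet_size & 0x1800) >> 11;  /* bits 12:11 */

	return size * (extra + 1);
}
```

For example, a descriptor value of 0x1400 encodes 1024-byte packets
with two additional transactions, giving 3072 bytes per microframe.
The bNumEndpoints check added by the patch has to run before this
value is read, because the endpoint array entry may not exist.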



Re: [PATCH v4 4/8] irqchip/irq-goldfish-pic: Add Goldfish PIC driver

2017-08-18 Thread Marc Zyngier
On 18/08/17 14:08, Aleksandar Markovic wrote:
> From: Miodrag Dinic 
> 
> Add device driver for a virtual programmable interrupt controller
> 
> The virtual PIC is designed as a device tree-based interrupt controller.
> 
> The compatible string used by OS for binding the driver is
> "google,goldfish-pic".
> 
> Signed-off-by: Miodrag Dinic 
> Signed-off-by: Goran Ferenc 
> Signed-off-by: Aleksandar Markovic 
> ---
>  MAINTAINERS|   1 +
>  drivers/irqchip/Kconfig|   8 ++
>  drivers/irqchip/Makefile   |   1 +
>  drivers/irqchip/irq-goldfish-pic.c | 145 +
>  4 files changed, 155 insertions(+)
>  create mode 100644 drivers/irqchip/irq-goldfish-pic.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 013da1d..6426875 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -844,6 +844,7 @@ ANDROID GOLDFISH PIC DRIVER
>  M:   Miodrag Dinic 
>  S:   Supported
>  F:   Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt
> +F:   drivers/irqchip/irq-goldfish-pic.c
>  
>  ANDROID GOLDFISH RTC DRIVER
>  M:   Miodrag Dinic 
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index f1fd5f4..21fab14 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -306,3 +306,11 @@ config QCOM_IRQ_COMBINER
>   help
> Say yes here to add support for the IRQ combiner devices embedded
> in Qualcomm Technologies chips.
> +
> +config GOLDFISH_PIC
> + bool "Goldfish programmable interrupt controller"
> + depends on MIPS && (GOLDFISH || COMPILE_TEST)
> + select IRQ_DOMAIN
> + help
> +   Say yes here to enable Goldfish interrupt controller driver used
> +   for Goldfish based virtual platforms.
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index e88d856..ade04a1 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -78,3 +78,4 @@ obj-$(CONFIG_EZNPS_GIC) += irq-eznps.o
>  obj-$(CONFIG_ARCH_ASPEED)+= irq-aspeed-vic.o irq-aspeed-i2c-ic.o
>  obj-$(CONFIG_STM32_EXTI) += irq-stm32-exti.o
>  obj-$(CONFIG_QCOM_IRQ_COMBINER)  += qcom-irq-combiner.o
> +obj-$(CONFIG_GOLDFISH_PIC)   += irq-goldfish-pic.o
> diff --git a/drivers/irqchip/irq-goldfish-pic.c b/drivers/irqchip/irq-goldfish-pic.c
> new file mode 100644
> index 000..948c35e
> --- /dev/null
> +++ b/drivers/irqchip/irq-goldfish-pic.c
> @@ -0,0 +1,145 @@
> +/*
> + * Copyright (C) 2017 Imagination Technologies Ltd.  All rights reserved
> + *   Author: Miodrag Dinic 
> + *
> + * This file implements interrupt controller driver for MIPS Goldfish PIC.
> + *
> + * This program is free software; you can redistribute   it and/or modify it
> + * under  the terms of   the GNU General  Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +/* 0..7 MIPS CPU interrupts */
> +#define GF_CPU_IRQ_PIC   (MIPS_CPU_IRQ_BASE + 2)
> +#define GF_CPU_IRQ_COMPARE   (MIPS_CPU_IRQ_BASE + 7)
> +
> +#define GF_NR_IRQS   40
> +/* 8..39 Cascaded Goldfish PIC interrupts */
> +#define GF_IRQ_OFFSET8
> +
> +#define GF_PIC_NUMBER0x04
> +#define GF_PIC_DISABLE_ALL   0x08
> +#define GF_PIC_DISABLE   0x0c
> +#define GF_PIC_ENABLE0x10
> +
> +static struct irq_domain *irq_domain;
> +static void __iomem *gf_pic_base;
> +
> +static inline void unmask_goldfish_irq(struct irq_data *d)
> +{
> + writel(d->hwirq - GF_IRQ_OFFSET,
> +gf_pic_base + GF_PIC_ENABLE);
> + irq_enable_hazard();
> +}
> +
> +static inline void mask_goldfish_irq(struct irq_data *d)
> +{
> + writel(d->hwirq - GF_IRQ_OFFSET,
> +gf_pic_base + GF_PIC_DISABLE);
> + irq_disable_hazard();
> +}
> +
> +static struct irq_chip goldfish_irq_controller = {
> + .name   = "Goldfish PIC",
> + .irq_ack= mask_goldfish_irq,

I'm slightly puzzled.

> + .irq_mask   = mask_goldfish_irq,
> + .irq_mask_ack   = mask_goldfish_irq,

What does it mean to have irq_mask_ack implemented as mask?

> + .irq_unmask = unmask_goldfish_irq,
> + .irq_eoi= unmask_goldfish_irq,

Really? Are you joking?

> + .irq_disable= mask_goldfish_irq,
> + .irq_enable = unmask_goldfish_irq,

If enable/disable are the same as mask/unmask, you don't need separate
entry points.

> +};
> +
> +static void goldfish_irq_dispatch(void)
> +{
> + u32 irq;
> + u32 virq;
> +
> + irq = readl(gf_pic_base + GF_PIC_NUMBER);
> + if (irq == 0) {
> + /* Timer interrupt */
> + do_IRQ(GF_CPU_IRQ_COMPARE);
> + return;
> + }

Why isn't this indirected through 

[PATCH 3.16 031/134] scsi: scsi_error: count medium access timeout only once per EH run

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Hannes Reinecke 

commit 7a38dc0bfb4cc39ed57e120e2224673f3d4d200f upstream.

The current medium access timeout counter will be increased for
each command, so if there are enough failed commands we'll hit
the medium access timeout for even a single device failure and
the following kernel message is displayed:

sd H:C:T:L: [sdXY] Medium access timeout failure. Offlining disk!

Fix this by making the timeout per EH run, ie the counter will
only be increased once per device and EH run.

Fixes: 18a4d0a ("[SCSI] Handle disk devices which can not process medium access commands")
Cc: Ewan Milne 
Cc: Lawrence Obermann 
Cc: Benjamin Block 
Cc: Steffen Maier 
Signed-off-by: Hannes Reinecke 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Martin K. Petersen 
[bwh: Backported to 3.16:
 - Open-code blk_rq_is_passthrough()
 - Adjust context]
Signed-off-by: Ben Hutchings 
---
 drivers/scsi/scsi_error.c  | 18 ++
 drivers/scsi/sd.c  | 27 ++-
 drivers/scsi/sd.h  |  1 +
 include/scsi/scsi_driver.h |  1 +
 4 files changed, 46 insertions(+), 1 deletion(-)

--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -224,6 +224,23 @@ scsi_abort_command(struct scsi_cmnd *scm
 }
 
 /**
+ * scsi_eh_reset - call into ->eh_action to reset internal counters
+ * @scmd:  scmd to run eh on.
+ *
+ * The scsi driver might be carrying internal state about the
+ * devices, so we need to call into the driver to reset the
+ * internal state once the error handler is started.
+ */
+static void scsi_eh_reset(struct scsi_cmnd *scmd)
+{
+   if (scmd->request->cmd_type == REQ_TYPE_FS) {
+   struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);
+   if (sdrv->eh_reset)
+   sdrv->eh_reset(scmd);
+   }
+}
+
+/**
  * scsi_eh_scmd_add - add scsi cmd to error handling.
  * @scmd:  scmd to run eh on.
  * @eh_flag:   optional SCSI_EH flag.
@@ -252,6 +269,7 @@ int scsi_eh_scmd_add(struct scsi_cmnd *s
if (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED)
eh_flag &= ~SCSI_EH_CANCEL_CMD;
scmd->eh_eflags |= eh_flag;
+   scsi_eh_reset(scmd);
list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
shost->host_failed++;
scsi_eh_wakeup(shost);
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -112,6 +112,7 @@ static void sd_rescan(struct device *);
 static int sd_init_command(struct scsi_cmnd *SCpnt);
 static void sd_uninit_command(struct scsi_cmnd *SCpnt);
 static int sd_done(struct scsi_cmnd *);
+static void sd_eh_reset(struct scsi_cmnd *);
 static int sd_eh_action(struct scsi_cmnd *, int);
 static void sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer);
 static void scsi_disk_release(struct device *cdev);
@@ -509,6 +510,7 @@ static struct scsi_driver sd_template =
.uninit_command = sd_uninit_command,
.done   = sd_done,
.eh_action  = sd_eh_action,
+   .eh_reset   = sd_eh_reset,
 };
 
 /*
@@ -1536,6 +1538,26 @@ static const struct block_device_operati
 };
 
 /**
+ * sd_eh_reset - reset error handling callback
+ * @scmd:  sd-issued command that has failed
+ *
+ * This function is called by the SCSI midlayer before starting
+ * SCSI EH. When counting medium access failures we have to be
+ * careful to register it only once per device and SCSI EH run;
+ * there might be several timed out commands which will cause the
+ * 'max_medium_access_timeouts' counter to trigger after the first
+ * SCSI EH run already and set the device to offline.
+ * So this function resets the internal counter before starting SCSI EH.
+ **/
+static void sd_eh_reset(struct scsi_cmnd *scmd)
+{
+   struct scsi_disk *sdkp = scsi_disk(scmd->request->rq_disk);
+
+   /* New SCSI EH run, reset gate variable */
+   sdkp->ignore_medium_access_errors = false;
+}
+
+/**
  * sd_eh_action - error handling callback
  * @scmd:  sd-issued command that has failed
  * @eh_disp:   The recovery disposition suggested by the midlayer
@@ -1564,7 +1586,10 @@ static int sd_eh_action(struct scsi_cmnd
 * process of recovering or has it suffered an internal failure
 * that prevents access to the storage medium.
 */
-   sdkp->medium_access_timed_out++;
+   if (!sdkp->ignore_medium_access_errors) {
+   sdkp->medium_access_timed_out++;
+   sdkp->ignore_medium_access_errors = true;
+   }
 
/*
 * If the device keeps failing read/write commands but TEST UNIT
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -90,6 +90,7 @@ struct scsi_disk {
    unsigned    lbpvpd : 1;
    unsigned    ws10 : 1;
    unsigned    ws16 : 1;
+   unsigned    ignore_medium_access_errors : 1;
 };
 #define 
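
The gating described in the commit message can be modelled in user space. The sketch below is illustrative only: struct disk_state, eh_reset(), eh_count_timeout() and simulate_eh_run() are hypothetical stand-ins for the scsi_disk fields and the sd_eh_reset()/sd_eh_action() callbacks, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two scsi_disk fields the patch touches. */
struct disk_state {
    unsigned int medium_access_timed_out; /* EH runs that saw a timeout */
    bool ignore_medium_access_errors;     /* gate: already counted this run */
};

/* Models sd_eh_reset(): invoked as each failed command is queued for EH. */
void eh_reset(struct disk_state *d)
{
    d->ignore_medium_access_errors = false;
}

/* Models the counting step in sd_eh_action(): count once, then close the gate. */
void eh_count_timeout(struct disk_state *d)
{
    if (!d->ignore_medium_access_errors) {
        d->medium_access_timed_out++;
        d->ignore_medium_access_errors = true;
    }
}

/* One EH run: all failed commands are queued first, then handled one by one. */
void simulate_eh_run(struct disk_state *d, int failed_cmds)
{
    for (int i = 0; i < failed_cmds; i++)
        eh_reset(d);
    for (int i = 0; i < failed_cmds; i++)
        eh_count_timeout(d);
}
```

With this model, an EH run with several timed-out commands raises the counter by one instead of once per command, which is the behaviour the patch aims for.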

[PATCH 3.16 037/134] [media] cx231xx-audio: fix NULL-deref at probe

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit 65f921647f4c89a2068478c89691f39b309b58f7 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory beyond the endpoint array should a
malicious device lack the expected endpoints.

Fixes: e0d3bafd0258 ("V4L/DVB (10954): Add cx231xx USB driver")

Cc: Sri Deevi 
Signed-off-by: Johan Hovold 
Signed-off-by: Hans Verkuil 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Ben Hutchings 
---
 drivers/media/usb/cx231xx/cx231xx-audio.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

--- a/drivers/media/usb/cx231xx/cx231xx-audio.c
+++ b/drivers/media/usb/cx231xx/cx231xx-audio.c
@@ -699,6 +699,11 @@ static int cx231xx_audio_init(struct cx2
hs_config_info[0].interface_info.
audio_index + 1];
 
+   if (uif->altsetting[0].desc.bNumEndpoints < isoc_pipe + 1) {
+   err = -ENODEV;
+   goto err_free_card;
+   }
+
adev->end_point_addr =
uif->altsetting[0].endpoint[isoc_pipe].desc.
bEndpointAddress;
@@ -714,8 +719,14 @@ static int cx231xx_audio_init(struct cx2
}
 
for (i = 0; i < adev->num_alt; i++) {
-   u16 tmp =
-   le16_to_cpu(uif->altsetting[i].endpoint[isoc_pipe].desc.
+   u16 tmp;
+
+   if (uif->altsetting[i].desc.bNumEndpoints < isoc_pipe + 1) {
+   err = -ENODEV;
+   goto err_free_pkt_size;
+   }
+
+   tmp = le16_to_cpu(uif->altsetting[i].endpoint[isoc_pipe].desc.
wMaxPacketSize);
adev->alt_max_pkt_size[i] =
(tmp & 0x07ff) * (((tmp & 0x1800) >> 11) + 1);
@@ -725,6 +736,8 @@ static int cx231xx_audio_init(struct cx2
 
return 0;
 
+err_free_pkt_size:
+   kfree(adev->alt_max_pkt_size);
 err_free_card:
snd_card_free(card);
 



Re: [PATCH v2 6/6] kernel: tracepoints: add support for relative references

2017-08-18 Thread Ard Biesheuvel
On 18 August 2017 at 14:52, Steven Rostedt  wrote:
> On Fri, 18 Aug 2017 14:44:15 +0100
> Ard Biesheuvel  wrote:
>
>> >> It appears the stuff above needs to be moved inside the double-include
>> >> guard (which oddly enough does not cover the entire file)
>> >
>> > Why was this moved to the header file? To fulfill some checkpatch
>> > warning?
>> >
>>
>> Yes.
>
> My preference is to ignore that checkpatch warning. The section
> variables are created by linker magic, and not normal "extern"
> variables. They are only used in one location, and I like to keep them
> where they are used, and not be something other places might think they
> can be used. In other words, keep them by the C code, and out of
> headers.
>
> Tracepoints and linker/asm work always triggers a lot of bogus
> checkpatch warnings. Which is unfortunate. :-/
>

Actually, I couldn't agree more. I will backpedal on the checkpatch
appeasement in v3 in general.

Thanks,
Ard.


[PATCH 3.16 036/134] [media] cx231xx-audio: fix init error path

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit fff1abc4d54e469140a699612b4db8d6397bfcba upstream.

Make sure to release the snd_card also on a late allocation error.

Fixes: e0d3bafd0258 ("V4L/DVB (10954): Add cx231xx USB driver")

Cc: Sri Deevi 
Signed-off-by: Johan Hovold 
Signed-off-by: Hans Verkuil 
Signed-off-by: Mauro Carvalho Chehab 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings 
---
--- a/drivers/media/usb/cx231xx/cx231xx-audio.c
+++ b/drivers/media/usb/cx231xx/cx231xx-audio.c
@@ -672,10 +672,8 @@ static int cx231xx_audio_init(struct cx2
 
spin_lock_init(&adev->slock);
err = snd_pcm_new(card, "Cx231xx Audio", 0, 0, 1, &pcm);
-   if (err < 0) {
-   snd_card_free(card);
-   return err;
-   }
+   if (err < 0)
+   goto err_free_card;
 
snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_CAPTURE,
&snd_cx231xx_pcm_capture);
@@ -689,10 +687,9 @@ static int cx231xx_audio_init(struct cx2
INIT_WORK(&dev->wq_trigger, audio_trigger);
 
err = snd_card_register(card);
-   if (err < 0) {
-   snd_card_free(card);
-   return err;
-   }
+   if (err < 0)
+   goto err_free_card;
+
adev->sndcard = card;
adev->udev = dev->udev;
 
@@ -710,10 +707,10 @@ static int cx231xx_audio_init(struct cx2
cx231xx_info("EndPoint Addr 0x%x, Alternate settings: %i\n",
 adev->end_point_addr, adev->num_alt);
adev->alt_max_pkt_size = kmalloc(32 * adev->num_alt, GFP_KERNEL);
-
-   if (adev->alt_max_pkt_size == NULL) {
+   if (!adev->alt_max_pkt_size) {
cx231xx_errdev("out of memory!\n");
-   return -ENOMEM;
+   err = -ENOMEM;
+   goto err_free_card;
}
 
for (i = 0; i < adev->num_alt; i++) {
@@ -727,6 +724,11 @@ static int cx231xx_audio_init(struct cx2
}
 
return 0;
+
+err_free_card:
+   snd_card_free(card);
+
+   return err;
 }
 
 static int cx231xx_audio_fini(struct cx231xx *dev)



[PATCH 3.16 038/134] [media] uvcvideo: Fix empty packet statistic

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Kieran Bingham 

commit 360a3a90c6261fe24a959ff38f8f6c3a8468f23c upstream.

The frame counters are inadvertently counting packets with content as
empty.

Fix it by correcting the logic expression.

Fixes: 7bc5edb00bbd [media] uvcvideo: Extract video stream statistics

Signed-off-by: Kieran Bingham 
Signed-off-by: Laurent Pinchart 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Ben Hutchings 
---
 drivers/media/usb/uvc/uvc_video.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -810,7 +810,7 @@ static void uvc_video_stats_decode(struc
 
/* Update the packets counters. */
stream->stats.frame.nb_packets++;
-   if (len > header_size)
+   if (len <= header_size)
stream->stats.frame.nb_empty++;
 
if (data[1] & UVC_STREAM_ERR)
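
A minimal user-space sketch of the corrected test, assuming a simplified statistics structure (struct frame_stats and stats_account_packet() are hypothetical names, not the uvcvideo API): a packet counts as empty when it carries nothing beyond its header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced version of the per-frame statistics. */
struct frame_stats {
    unsigned int nb_packets; /* all packets seen */
    unsigned int nb_empty;   /* packets with no payload after the header */
};

/* Account one packet of total length len whose header is header_size bytes. */
void stats_account_packet(struct frame_stats *s, size_t len, size_t header_size)
{
    s->nb_packets++;
    if (len <= header_size) /* header only, so no payload */
        s->nb_empty++;
}
```

A 12-byte packet with a 12-byte header is counted as empty, while a 100-byte packet with the same header is not.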



[PATCH 1/1] crypto: stm32/hash - Remove uninitialized symbol

2017-08-18 Thread Lionel Debieve
Remove err symbol as this is not used in the thread context
and the variable is not initialized.

Signed-off-by: Lionel Debieve 
---
 drivers/crypto/stm32/stm32-hash.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/stm32/stm32-hash.c b/drivers/crypto/stm32/stm32-hash.c
index b585ce5..b34ee85 100644
--- a/drivers/crypto/stm32/stm32-hash.c
+++ b/drivers/crypto/stm32/stm32-hash.c
@@ -1067,7 +1067,6 @@ static int stm32_hash_cra_sha256_init(struct crypto_tfm *tfm)
 static irqreturn_t stm32_hash_irq_thread(int irq, void *dev_id)
 {
struct stm32_hash_dev *hdev = dev_id;
-   int err;
 
if (HASH_FLAGS_CPU & hdev->flags) {
if (HASH_FLAGS_OUTPUT_READY & hdev->flags) {
@@ -1084,8 +1083,8 @@ static irqreturn_t stm32_hash_irq_thread(int irq, void *dev_id)
return IRQ_HANDLED;
 
 finish:
-   /*Finish current request */
-   stm32_hash_finish_req(hdev->req, err);
+   /* Finish current request */
+   stm32_hash_finish_req(hdev->req, 0);
 
return IRQ_HANDLED;
 }
-- 
2.7.4



[PATCH 3.16 039/134] padata: free correct variable

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: "Jason A. Donenfeld" 

commit 07a77929ba672d93642a56dc2255dd21e6e2290b upstream.

The author meant to free the variable that was just allocated, instead
of the one that failed to be allocated, but made a simple typo. This
patch rectifies that.

Signed-off-by: Jason A. Donenfeld 
Signed-off-by: Herbert Xu 
Signed-off-by: Ben Hutchings 
---
 kernel/padata.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -356,7 +356,7 @@ static int padata_setup_cpumasks(struct
 
cpumask_and(pd->cpumask.pcpu, pcpumask, cpu_online_mask);
if (!alloc_cpumask_var(&pd->cpumask.cbcpu, GFP_KERNEL)) {
-   free_cpumask_var(pd->cpumask.cbcpu);
+   free_cpumask_var(pd->cpumask.pcpu);
return -ENOMEM;
}
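
The pattern being fixed is a two-stage allocation where the second failure must release the first, successful allocation. A user-space sketch under stated assumptions: struct masks, setup_masks() and the injectable alloc_fn hook are hypothetical and only mirror the shape of padata_setup_cpumasks().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two cpumasks. */
struct masks {
    unsigned long *pcpu;
    unsigned long *cbcpu;
};

/* Allocator hook so a failure can be injected for illustration. */
typedef void *(*alloc_fn)(size_t);

/* Always-failing allocator, used to exercise the error path. */
void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

int setup_masks(struct masks *m, alloc_fn alloc_pcpu, alloc_fn alloc_cbcpu)
{
    m->pcpu = alloc_pcpu(sizeof(*m->pcpu));
    if (!m->pcpu)
        return -ENOMEM;

    m->cbcpu = alloc_cbcpu(sizeof(*m->cbcpu));
    if (!m->cbcpu) {
        /* Release the allocation that succeeded, not the one that failed. */
        free(m->pcpu);
        m->pcpu = NULL;
        return -ENOMEM;
    }
    return 0;
}
```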
 



[PATCH 3.16 027/134] IPoIB: Remove unnecessary test for NULL before debugfs_remove()

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Fabian Frederick 

commit e42fa2092c1049ac9c0e38aaac39ef3c40e91a36 upstream.

Fix checkpatch warning:

WARNING: debugfs_remove(NULL) is safe this check is probably not required

Signed-off-by: Fabian Frederick 
Signed-off-by: Doug Ledford 
Signed-off-by: Roland Dreier 
Signed-off-by: Ben Hutchings 
---
 drivers/infiniband/ulp/ipoib/ipoib_fs.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/drivers/infiniband/ulp/ipoib/ipoib_fs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_fs.c
@@ -281,10 +281,8 @@ void ipoib_delete_debug_files(struct net
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
 
-   if (priv->mcg_dentry)
-   debugfs_remove(priv->mcg_dentry);
-   if (priv->path_dentry)
-   debugfs_remove(priv->path_dentry);
+   debugfs_remove(priv->mcg_dentry);
+   debugfs_remove(priv->path_dentry);
 }
 
 int ipoib_register_debugfs(void)



Re: [PATCH v2] blktrace: Fix potential deadlock between delete & sysfs ops

2017-08-18 Thread Waiman Long
On 08/17/2017 05:30 PM, Steven Rostedt wrote:
> On Thu, 17 Aug 2017 17:10:07 -0400
> Steven Rostedt  wrote:
>
>
>> Instead of playing games with taking the lock, the only way this race
>> is hit, is if the partition is being deleted and the sysfs attribute is
>> being read at the same time, correct? In that case, just return
>> -ENODEV, and be done with it.
> Never mind, that won't work. Too bad there's not a mutex_lock_timeout()
> that we could use in a loop. It would solve the issue of forward
> progress with RT tasks, and will break after a timeout in case of
> deadlock.
>
> -- Steve

I think it will be useful to have mutex_timed_lock(). RT-mutex does have
a timed version, so I guess it shouldn't be hard to implement one for
mutex. I can take a shot at trying to do that.

Thanks,
Longman




[PATCH 3.16 028/134] IB/IPoIB: ibX: failed to create mcg debug file

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Shamir Rabinovitch 

commit 771a52584096c45e4565e8aabb596eece9d73d61 upstream.

When udev renames the netdev devices, the ipoib debugfs entries do not
get renamed. As a result, if a subsequent probe of an ipoib device reuses
the name, creating a debugfs entry for the new device fails.

Also, move ipoib_create_debug_files and ipoib_delete_debug_files into the
ipoib event handling in order to avoid any race condition between them.

Fixes: 1732b0ef3b3a ([IPoIB] add path record information in debugfs)
Signed-off-by: Vijay Kumar 
Signed-off-by: Shamir Rabinovitch 
Reviewed-by: Mark Bloch 
Signed-off-by: Doug Ledford 
Signed-off-by: Ben Hutchings 
---
 drivers/infiniband/ulp/ipoib/ipoib_fs.c   |  3 +++
 drivers/infiniband/ulp/ipoib/ipoib_main.c | 44 +++
 drivers/infiniband/ulp/ipoib/ipoib_vlan.c |  3 ---
 3 files changed, 42 insertions(+), 8 deletions(-)

--- a/drivers/infiniband/ulp/ipoib/ipoib_fs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_fs.c
@@ -281,8 +281,11 @@ void ipoib_delete_debug_files(struct net
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
 
+   WARN_ONCE(!priv->mcg_dentry, "null mcg debug file\n");
+   WARN_ONCE(!priv->path_dentry, "null path debug file\n");
debugfs_remove(priv->mcg_dentry);
debugfs_remove(priv->path_dentry);
+   priv->mcg_dentry = priv->path_dentry = NULL;
 }
 
 int ipoib_register_debugfs(void)
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -98,6 +98,33 @@ static struct ib_client ipoib_client = {
.remove = ipoib_remove_one
 };
 
+#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
+static int ipoib_netdev_event(struct notifier_block *this,
+ unsigned long event, void *ptr)
+{
+   struct netdev_notifier_info *ni = ptr;
+   struct net_device *dev = ni->dev;
+
+   if (dev->netdev_ops->ndo_open != ipoib_open)
+   return NOTIFY_DONE;
+
+   switch (event) {
+   case NETDEV_REGISTER:
+   ipoib_create_debug_files(dev);
+   break;
+   case NETDEV_CHANGENAME:
+   ipoib_delete_debug_files(dev);
+   ipoib_create_debug_files(dev);
+   break;
+   case NETDEV_UNREGISTER:
+   ipoib_delete_debug_files(dev);
+   break;
+   }
+
+   return NOTIFY_DONE;
+}
+#endif
+
 int ipoib_open(struct net_device *dev)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
@@ -1313,8 +1340,6 @@ void ipoib_dev_cleanup(struct net_device
 
ASSERT_RTNL();
 
-   ipoib_delete_debug_files(dev);
-
/* Delete any child interfaces first */
list_for_each_entry_safe(cpriv, tcpriv, &priv->child_intfs, list) {
/* Stop GC on child */
@@ -1620,8 +1645,6 @@ static struct net_device *ipoib_add_port
goto register_failed;
}
 
-   ipoib_create_debug_files(priv->dev);
-
if (ipoib_cm_add_mode_attr(priv->dev))
goto sysfs_failed;
if (ipoib_add_pkey_attr(priv->dev))
@@ -1636,7 +1659,6 @@ static struct net_device *ipoib_add_port
return priv->dev;
 
 sysfs_failed:
-   ipoib_delete_debug_files(priv->dev);
unregister_netdev(priv->dev);
 
 register_failed:
@@ -1727,6 +1749,12 @@ static void ipoib_remove_one(struct ib_d
kfree(dev_list);
 }
 
+#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
+static struct notifier_block ipoib_netdev_notifier = {
+   .notifier_call = ipoib_netdev_event,
+};
+#endif
+
 static int __init ipoib_init_module(void)
 {
int ret;
@@ -1776,6 +1804,9 @@ static int __init ipoib_init_module(void
if (ret)
goto err_client;
 
+#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
+   register_netdevice_notifier(&ipoib_netdev_notifier);
+#endif
return 0;
 
 err_client:
@@ -1793,6 +1824,9 @@ err_fs:
 
 static void __exit ipoib_cleanup_module(void)
 {
+#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
+   unregister_netdevice_notifier(&ipoib_netdev_notifier);
+#endif
ipoib_netlink_fini();
ib_unregister_client(&ipoib_client);
ib_sa_unregister_client(&ipoib_sa_client);
--- a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
@@ -86,8 +86,6 @@ int __ipoib_vlan_add(struct ipoib_dev_pr
 
priv->parent = ppriv->dev;
 
-   ipoib_create_debug_files(priv->dev);
-
/* RTNL childs don't need proprietary sysfs entries */
if (type == IPOIB_LEGACY_CHILD) {
if (ipoib_cm_add_mode_attr(priv->dev))
@@ -109,7 +107,6 @@ int __ipoib_vlan_add(struct ipoib_dev_pr
 
 sysfs_failed:
result = -ENOMEM;
-   ipoib_delete_debug_files(priv->dev);
unregister_netdevice(priv->dev);
 
 register_failed:



[PATCH 3.16 029/134] [media] gspca: konica: add missing endpoint sanity check

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit aa58fedb8c7b6cf2f05941d238495f9e2f29655c upstream.

Make sure to check the number of endpoints to avoid accessing memory
beyond the endpoint array should a device lack the expected endpoints.

Note that, as far as I can tell, the gspca framework has already made
sure there is at least one endpoint in the current alternate setting so
there should be no risk for a NULL-pointer dereference here.

Fixes: b517af722860 ("V4L/DVB: gspca_konica: New gspca subdriver for
konica chipset using cams")

Cc: Hans de Goede 
Signed-off-by: Johan Hovold 
Signed-off-by: Hans Verkuil 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Ben Hutchings 
---
 drivers/media/usb/gspca/konica.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/media/usb/gspca/konica.c
+++ b/drivers/media/usb/gspca/konica.c
@@ -188,6 +188,9 @@ static int sd_start(struct gspca_dev *gs
return -EIO;
}
 
+   if (alt->desc.bNumEndpoints < 2)
+   return -ENODEV;
+
packet_size = le16_to_cpu(alt->endpoint[0].desc.wMaxPacketSize);
 
n = gspca_dev->cam.cam_mode[gspca_dev->curr_mode].priv;
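
The sanity check amounts to comparing the endpoint count against the index before touching the endpoint array. A reduced sketch, assuming simplified descriptor types (struct endpoint, struct altsetting and read_max_packet_size() are hypothetical and much smaller than the real USB structures):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced endpoint descriptor. */
struct endpoint {
    unsigned short max_packet_size;
};

/* Hypothetical, reduced alternate setting: a count plus the endpoint array. */
struct altsetting {
    unsigned char num_endpoints;
    const struct endpoint *endpoint;
};

/*
 * Read the max packet size of endpoint 'index', refusing to index past
 * the number of endpoints the device actually reported.
 */
int read_max_packet_size(const struct altsetting *alt, unsigned int index,
                         unsigned short *out)
{
    if (index >= alt->num_endpoints)
        return -ENODEV; /* device lacks the expected endpoint */

    *out = alt->endpoint[index].max_packet_size;
    return 0;
}
```

A device reporting zero endpoints is rejected before its (possibly NULL) endpoint array is dereferenced.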



[PATCH 3.16 033/134] [media] dib0700: fix NULL-deref at probe

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit d5823511c0f8719a39e72ede1bce65411ac653b7 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.

Fixes: c4018fa2e4c0 ("[media] dib0700: fix RC support on Hauppauge
Nova-TD")

Cc: Mauro Carvalho Chehab 
Signed-off-by: Johan Hovold 
Signed-off-by: Hans Verkuil 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Ben Hutchings 
---
 drivers/media/usb/dvb-usb/dib0700_core.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/media/usb/dvb-usb/dib0700_core.c
+++ b/drivers/media/usb/dvb-usb/dib0700_core.c
@@ -769,6 +769,9 @@ int dib0700_rc_setup(struct dvb_usb_devi
 
/* Starting in firmware 1.20, the RC info is provided on a bulk pipe */
 
+   if (intf->altsetting[0].desc.bNumEndpoints < rc_ep + 1)
+   return -ENODEV;
+
purb = usb_alloc_urb(0, GFP_KERNEL);
if (purb == NULL) {
err("rc usb alloc urb failed");



[PATCH 3.16 030/134] [media] s5p-mfc: Fix unbalanced call to clock management

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Marek Szyprowski 

commit a5cb00eb4223458250b55daf03ac7ea5f424d601 upstream.

Clock should be turned off after calling s5p_mfc_init_hw() from the
watchdog worker, like it is already done in the s5p_mfc_open() which also
calls this function.

Fixes: af93574678108 ("[media] MFC: Add MFC 5.1 V4L2 driver")

Signed-off-by: Marek Szyprowski 
Signed-off-by: Sylwester Nawrocki 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Ben Hutchings 
---
 drivers/media/platform/s5p-mfc/s5p_mfc.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/media/platform/s5p-mfc/s5p_mfc.c
+++ b/drivers/media/platform/s5p-mfc/s5p_mfc.c
@@ -169,6 +169,7 @@ static void s5p_mfc_watchdog_worker(stru
}
s5p_mfc_clock_on();
ret = s5p_mfc_init_hw(dev);
+   s5p_mfc_clock_off();
if (ret)
mfc_err("Failed to reinit FW\n");
}



[PATCH 3.16 022/134] pinctrl: sh-pfc: r8a7791: Fix SCIF2 pinmux data

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Sergei Shtylyov 

commit 58439280f84e6b39fd7d61f25ab30489c1aaf0a9 upstream.

PINMUX_IPSR_MSEL() macro invocation for the TX2 signal has apparently wrong
1st argument -- most probably a result of cut&paste programming...

Fixes: 508845196238 ("pinctrl: sh-pfc: r8a7791 PFC support")
Signed-off-by: Sergei Shtylyov 
Signed-off-by: Geert Uytterhoeven 
[bwh: Backported to 3.16:
 - Use PINMUX_IPSR_MODSEL_DATA() instead of PINMUX_IPSR_MSEL()
 - Adjust context]
Signed-off-by: Ben Hutchings 
---
 drivers/pinctrl/sh-pfc/pfc-r8a7791.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/pinctrl/sh-pfc/pfc-r8a7791.c
+++ b/drivers/pinctrl/sh-pfc/pfc-r8a7791.c
@@ -1092,7 +1092,7 @@ static const u16 pinmux_data[] = {
PINMUX_IPSR_MODSEL_DATA(IP6_5_3, FMIN_E, SEL_FM_4),
PINMUX_IPSR_DATA(IP6_7_6, AUDIO_CLKOUT),
PINMUX_IPSR_MODSEL_DATA(IP6_7_6, MSIOF1_SS1_B, SEL_SOF1_1),
-   PINMUX_IPSR_MODSEL_DATA(IP6_5_3, TX2, SEL_SCIF2_0),
+   PINMUX_IPSR_MODSEL_DATA(IP6_7_6, TX2, SEL_SCIF2_0),
PINMUX_IPSR_MODSEL_DATA(IP6_7_6, SCIFA2_TXD, SEL_SCIFA2_0),
PINMUX_IPSR_DATA(IP6_9_8, IRQ0),
PINMUX_IPSR_MODSEL_DATA(IP6_9_8, SCIFB1_RXD_D, SEL_SCIFB1_3),



[PATCH 3.16 025/134] PCI: dwc: Fix uninitialized variable in dw_handle_msi_irq()

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Dan Carpenter 

commit 1b497e6493c49bbb55c89f53562f7f853495e90d upstream.

The bug is that "val" is unsigned long but we only initialize 32 bits of
it.  Then we test "if (val)" and that might be true not because we set the
bits but because some were never initialized.

Fixes: f342d940ee0e ("PCI: exynos: Add support for MSI")
Signed-off-by: Dan Carpenter 
Signed-off-by: Bjorn Helgaas 
[bwh: Backported to 3.16: adjust filename, context]
Signed-off-by: Ben Hutchings 
---
 drivers/pci/host/pcie-designware.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -158,19 +158,20 @@ static struct irq_chip dw_msi_irq_chip =
 /* MSI int handler */
 irqreturn_t dw_handle_msi_irq(struct pcie_port *pp)
 {
-   unsigned long val;
+   u32 val;
int i, pos, irq;
irqreturn_t ret = IRQ_NONE;
 
for (i = 0; i < MAX_MSI_CTRLS; i++) {
dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_STATUS + i * 12, 4,
-   (u32 *)&val);
+   &val);
if (!val)
continue;
 
ret = IRQ_HANDLED;
pos = 0;
-   while ((pos = find_next_bit(&val, 32, pos)) != 32) {
+   while ((pos = find_next_bit((unsigned long *) &val, 32,
+   pos)) != 32) {
irq = irq_find_mapping(pp->irq_domain, i * 32 + pos);
dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + i * 12,
4, 1 << pos);
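
The bug can be reproduced in user space. The sketch below assumes a 64-bit unsigned long, modelled here as uint64_t; partial_fill() and pending_bits() are hypothetical helpers, not part of the driver. The first shows why writing 32 bits into a wider variable leaves stale bits behind, the second walks the set bits of a properly sized 32-bit status word.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Write a 32-bit register value into a 64-bit variable whose previous
 * contents are 'stale' (standing in for uninitialized stack memory).
 * Only 4 of the 8 bytes are overwritten.
 */
uint64_t partial_fill(uint64_t stale, uint32_t reg)
{
    uint64_t val = stale;

    memcpy(&val, &reg, sizeof(reg));
    return val;
}

/*
 * Collect the positions of the set bits of a 32-bit status word into
 * out[] and return how many were found.
 */
int pending_bits(uint32_t status, int out[32])
{
    int n = 0;

    for (int pos = 0; pos < 32; pos++)
        if (status & (UINT32_C(1) << pos))
            out[n++] = pos;
    return n;
}
```

With stale contents of all ones and a register value of zero, partial_fill() still returns a non-zero value, so a test like "if (val)" would be taken even though no status bit is set.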



[PATCH 3.16 024/134] PCI: dwc: Unindent dw_handle_msi_irq() loop

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Bjorn Helgaas 

commit dbe4a09e8bbcf88809a8394d6a359d8cebd22a86 upstream.

Use "continue" to skip rest of the loop when possible to save an indent
level.  No functional change intended.

Suggested-by: walter harms 
Signed-off-by: Bjorn Helgaas 
[bwh: Backported to 3.16: adjust filename, context]
Signed-off-by: Ben Hutchings 
---
 drivers/pci/host/pcie-designware.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -165,18 +165,17 @@ irqreturn_t dw_handle_msi_irq(struct pci
for (i = 0; i < MAX_MSI_CTRLS; i++) {
dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_STATUS + i * 12, 4,
(u32 *)&val);
-   if (val) {
-   ret = IRQ_HANDLED;
-   pos = 0;
-   while ((pos = find_next_bit(&val, 32, pos)) != 32) {
-   irq = irq_find_mapping(pp->irq_domain,
-   i * 32 + pos);
-   dw_pcie_wr_own_conf(pp,
-   PCIE_MSI_INTR0_STATUS + i * 12,
-   4, 1 << pos);
-   generic_handle_irq(irq);
-   pos++;
-   }
+   if (!val)
+   continue;
+
+   ret = IRQ_HANDLED;
+   pos = 0;
+   while ((pos = find_next_bit(&val, 32, pos)) != 32) {
+   irq = irq_find_mapping(pp->irq_domain, i * 32 + pos);
+   dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + i * 12,
+   4, 1 << pos);
+   generic_handle_irq(irq);
+   pos++;
}
}
 



Re: [kernel-hardening] [RFC] memory allocations in genalloc

2017-08-18 Thread Laura Abbott
On 08/17/2017 09:26 AM, Igor Stoppa wrote:
> Foreword:
> If I should direct this message to someone else, please let me know.
> I couldn't get a clear idea, by looking at both MAINTAINERS and git blame.
> 
> 
> 
> Hi,
> 
> I'm currently trying to convert the SE Linux policy db into using a
> protectable memory allocator (pmalloc) that I have developed.
> 
> Such allocator is based on genalloc: I had come up with an
> implementation that was pretty similar to what genalloc already does, so
> it was pointed out to me that I could have a look at it.
> 
> And, indeed, it seemed a perfect choice.
> 
> But ... when freeing memory, genalloc wants that the caller also states
> how large each specific memory allocation is.
> 
> This, per se, is not an issue, although genalloc doesn't seem to check
> if the memory being freed is really matching a previous allocation request.
> 
> However, this design doesn't sit well with the use case I have in mind.
> 
> In particular, when the SE Linux policy db is populated, the creation of
> one or more specific entry of the db might fail.
> In this case, the memory previously allocated for said entry, is
> released with kfree, which doesn't need to know the size of the chunk
> being freed.
> 
> I would like to add similar capability to genalloc.
> 
> genalloc already uses bitmaps, to track what words are allocated (1) and
> which are free (0)
> 
> What I would like to do is to add another bitmap, which would track the
> beginning of each individual allocation (1 on the first allocation unit
> of each allocation, 0 otherwise).
> 
> Such enhancement would enable also the detection of calls to free with
> incorrect / misaligned addresses - right now it is possible to
> successfully free a memory area that overlaps the interface of two
> adjacent allocations, without fully covering either of them.
> 
> Would this change be acceptable?
> Is there any better way to achieve what I want?
> 

In general, I don't see anything wrong with wanting to let gen_pool_free
not take a size. It's hard to say anything more without a patch to review.
My biggest concern would be keeping existing behavior and managing two
bitmaps locklessly.
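
For discussion, the two-bitmap idea could look roughly like the sketch below. It is a simplified model and not genalloc code: struct pool, pool_alloc() and pool_free() are hypothetical, the "bitmaps" are plain bool arrays, the pool is a fixed 64 units, and nothing here is lockless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_UNITS 64

/* Hypothetical pool with two maps, one entry per allocation unit. */
struct pool {
    bool used[POOL_UNITS];  /* true = unit is allocated */
    bool start[POOL_UNITS]; /* true = unit is the first of an allocation */
};

/* First-fit allocation of 'units' contiguous units; returns base or -1. */
int pool_alloc(struct pool *p, size_t units)
{
    if (units == 0 || units > POOL_UNITS)
        return -1;

    for (size_t base = 0; base + units <= POOL_UNITS; base++) {
        size_t i = 0;

        while (i < units && !p->used[base + i])
            i++;
        if (i < units)
            continue; /* range is partly in use, try the next base */

        for (i = 0; i < units; i++)
            p->used[base + i] = true;
        p->start[base] = true;
        return (int)base;
    }
    return -1;
}

/*
 * Free without a size: 'base' must be the first unit of an allocation.
 * The extent is recovered by walking until a free unit or the start of
 * the next allocation. Returns 0 on success, -1 on a bad address.
 */
int pool_free(struct pool *p, size_t base)
{
    size_t i = base;

    if (base >= POOL_UNITS || !p->used[base] || !p->start[base])
        return -1;

    p->start[base] = false;
    do {
        p->used[i++] = false;
    } while (i < POOL_UNITS && p->used[i] && !p->start[i]);
    return 0;
}
```

In this model a free at an address in the middle of an allocation, or a second free of the same address, is rejected instead of silently clearing units that belong to a neighbour.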


> 
> ---
> 
> I have also a question wrt the use of spinlocks in genalloc.
> Why a spinlock?
> 
> Freeing a chunk of memory previously allocated with vmalloc requires
> invoking vfree_atomic, instead of vfree, because the list of chunks is
> walked with the spinlock held, and vfree can sleep.
> 
> Why not using a mutex?
> 

From the git history, gen_pool used to use a reader/writer lock and
was switched to be lockless so it could be used in NMI contexts; see
7f184275aa30 ("lib, Make gen_pool memory allocator lockless").
This looks to be an intentional choice, presumably so regions can be
added in atomic contexts. Again, if you have a specific patch or
proposal this would be easier to review.

Thanks,
Laura


> 
> --
> TIA, igor
> 



Re: [PATCH v3] livepatch: add (un)patch callbacks

2017-08-18 Thread Petr Mladek
On Wed 2017-08-16 15:17:04, Joe Lawrence wrote:
> Provide livepatch modules a klp_object (un)patching notification
> mechanism.  Pre and post-(un)patch callbacks allow livepatch modules to
> setup or synchronize changes that would be difficult to support in only
> patched-or-unpatched code contexts.
> 
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 194991ef9347..500dc9b2b361 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -138,6 +154,71 @@ struct klp_patch {
>func->old_name || func->new_func || func->old_sympos; \
>func++)
>  
> +/**
> + * klp_is_object_loaded() - is klp_object currently loaded?
> + * @obj: klp_object pointer
> + *
> + * Return: true if klp_object is loaded (always true for vmlinux)
> + */
> +static inline bool klp_is_object_loaded(struct klp_object *obj)
> +{
> + return !obj->name || obj->mod;
> +}
> +
> +/**
> + * klp_pre_patch_callback - execute before klp_object is patched
> + * @obj: invoke callback for this klp_object
> + *
> + * Return: status from callback
> + *
> + * Callers should ensure obj->patched is *not* set.
> + */
> +static inline int klp_pre_patch_callback(struct klp_object *obj)
> +{
> + if (obj->callbacks.pre_patch)
> + return (*obj->callbacks.pre_patch)(obj);
> + return 0;
> +}
> +
> +/**
> + * klp_post_patch_callback() - execute after klp_object is patched
> + * @obj: invoke callback for this klp_object
> + *
> + * Callers should ensure obj->patched is set.
> + */
> +static inline void klp_post_patch_callback(struct klp_object *obj)
> +{
> + if (obj->callbacks.post_patch)
> + (*obj->callbacks.post_patch)(obj);
> +}
> +
> +/**
> + * klp_pre_unpatch_callback() - execute before klp_object is unpatched
> + *  and is active across all tasks
> + * @obj: invoke callback for this klp_object
> + *
> + * Callers should ensure obj->patched is set.
> + */
> +static inline void klp_pre_unpatch_callback(struct klp_object *obj)
> +{
> + if (obj->callbacks.pre_unpatch)
> + (*obj->callbacks.pre_unpatch)(obj);
> +}
> +
> +/**
> + * klp_post_unpatch_callback() - execute after klp_object is unpatched,
> + *   all code has been restored and no tasks
> + *   are running patched code
> + * @obj: invoke callback for this klp_object
> + *
> + * Callers should ensure obj->patched is *not* set.
> + */
> +static inline void klp_post_unpatch_callback(struct klp_object *obj)
> +{
> + if (obj->callbacks.post_unpatch)
> + (*obj->callbacks.post_unpatch)(obj);
> +}

I guess that we do not want to make these functions usable
outside livepatch code. Therefore these inline functions should go
to kernel/livepatch/core.h or so.

> +
>  int klp_register_patch(struct klp_patch *);
>  int klp_unregister_patch(struct klp_patch *);
>  int klp_enable_patch(struct klp_patch *);
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index b9628e43c78f..ddb23e18a357 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -878,6 +890,8 @@ int klp_module_coming(struct module *mod)
>   goto err;
>   }
>  
> + klp_post_patch_callback(obj);

This should be called only if (patch != klp_transition_patch).
Otherwise, it would be called too early.

> +
>   break;
>   }
>   }
> @@ -929,7 +943,10 @@ void klp_module_going(struct module *mod)
>   if (patch->enabled || patch == klp_transition_patch) {
>   pr_notice("reverting patch '%s' on unloading 
> module '%s'\n",
> patch->mod->name, obj->mod->name);
> +
> + klp_pre_unpatch_callback(obj);

Also the pre_unpatch() callback should be called only
if (patch != klp_transition_patch). Otherwise, it should have
already been called. It is not the current case but see below.


>   klp_unpatch_object(obj);
> + klp_post_unpatch_callback(obj);
>   }
>  
>   klp_free_object_loaded(obj);
> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 52c4e907c14b..0eed0df6e6d9 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -257,6 +257,7 @@ int klp_patch_object(struct klp_object *obj)
>   klp_for_each_func(obj, func) {
>   ret = klp_patch_func(func);
>   if (ret) {
> + klp_pre_unpatch_callback(obj);

This looks strange (somehow asymmetric). IMHO, it should not be
needed. klp_pre_unpatch_callback() should revert changes done
by klp_post_patch_callback(), which has not run yet.

>   klp_unpatch_object(obj);
>   return ret;
>   }
> @@ -271,6 +272,8 @@ void klp_unpatch_o

Re: [PATCH] KVM/x86: Increase max vcpu number to 352

2017-08-18 Thread Konrad Rzeszutek Wilk
On Tue, Aug 15, 2017 at 06:13:29PM +0200, Radim Krčmář wrote:
> (Missed this mail before my last reply.)
> 
> 2017-08-15 10:10-0400, Konrad Rzeszutek Wilk:
> > On Tue, Aug 15, 2017 at 11:00:04AM +0800, Lan Tianyu wrote:
> > > On 2017年08月12日 03:35, Konrad Rzeszutek Wilk wrote:
> > > > Migration with 352 CPUs all being busy dirtying memory and also poking
> > > > at various I/O ports (say all of them dirtying the VGA) is no problem?
> > > 
> > > This depends on what kind of workload is running during migration. I
> > > think this may affect service down time since there maybe a lot of dirty
> > > memory data to transfer after stopping vcpus. This also depends on how
> > > user sets "migrate_set_downtime" for qemu. But I think increasing vcpus
> > > will break migration function.
> > 
> > OK, so let me take a step back.
> > 
> > I see this nice 'supported' CPU count that is exposed in kvm module.
> > 
> > Then there is QEMU throwing out a warning if you crank up the CPU count
> > above that number.
> 
> I find the range between "recommended max" and "hard max" VCPU count
> confusing at best ... IIUC, it was there because KVM internals had
> problems with scaling and we will hit more in the future because some
> loops still are linear on VCPU count.

Is that documented somewhere? There are some folks who would be
interested in looking at that if it were known what exactly to look for.

> 
> The exposed value doesn't say whether migration will work, because that
> is a userspace thing and we're not aware of bottlenecks on the KVM side.
> 
> > Red Hat's web-pages talk about CPU count as well.
> > 
> > And I am assuming all of those are around what has been tested and
> > what has shown to work. And one of those test-cases surely must
> > be migration.
> 
> Right, Red Hat will only allow/support what it has tested, even if
> upstream has a practically unlimited count.  I think the upstream number
> used to be raised by Red Hat, which is why upstream isn't at the hard
> implementation limit ...

Aim for the sky! Perhaps then let's crank it up to 4096 upstream and let
each vendor/distro/cloud decide the right number based on their
testing.

And also have more folks report issues as they try running, say,
these huge vCPU guests?

> 
> > Ergo, if the vCPU count increase will break migration, then it is
> > a regression.
> 
> Raising the limit would not break existing guests, but I would rather
> avoid adding higher VCPU count as a feature that disables migration.
> 
> > Or a fix/work needs to be done to support a higher CPU count for
> > migrating?
> 
> Post-copy migration should handle higher CPU count and it is the default
> fallback on QEMU.  Asking the question on a userspace list would yield
> better answers, though.
> 
> Thanks.


[PATCH 3.16 026/134] ath9k_htc: fix NULL-deref at probe

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit ebeb36670ecac36c179b5fb5d5c88ff03ba191ec upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory beyond the endpoint array should a
malicious device lack the expected endpoints.

Fixes: 36bcce430657 ("ath9k_htc: Handle storage devices")
Signed-off-by: Johan Hovold 
Signed-off-by: Kalle Valo 
Signed-off-by: Ben Hutchings 
---
 drivers/net/wireless/ath/ath9k/hif_usb.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/net/wireless/ath/ath9k/hif_usb.c
+++ b/drivers/net/wireless/ath/ath9k/hif_usb.c
@@ -1145,6 +1145,9 @@ static int send_eject_command(struct usb
u8 bulk_out_ep;
int r;
 
+   if (iface_desc->desc.bNumEndpoints < 2)
+   return -ENODEV;
+
/* Find bulk out endpoint */
for (r = 1; r >= 0; r--) {
endpoint = &iface_desc->endpoint[r].desc;
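
For illustration, a minimal userspace sketch of why the check is needed:
the lookup loop indexes endpoint[1] and endpoint[0], so an interface
with fewer than two endpoints must be rejected before the loop runs.
The types and names here (ep_desc, iface_model, find_bulk_out) are made
up for this sketch and are not the driver's own structures.

```c
#include <assert.h>
#include <stddef.h>

#define DIR_IN    0x80  /* direction bit in the endpoint address */
#define XFER_MASK 0x03  /* transfer-type bits in the attributes */
#define XFER_BULK 0x02

/* Hypothetical, simplified endpoint and interface descriptors. */
struct ep_desc {
	unsigned char addr;
	unsigned char attrs;
};

struct iface_model {
	unsigned char num_endpoints;
	const struct ep_desc *endpoint;
};

/* Returns the bulk-out endpoint address, or -1 if there is none. */
int find_bulk_out(const struct iface_model *d)
{
	int r;

	assert(d != NULL);

	/* The loop below reads endpoint[1] and endpoint[0]. */
	if (d->num_endpoints < 2)
		return -1;

	for (r = 1; r >= 0; r--) {
		const struct ep_desc *ep = &d->endpoint[r];

		if (!(ep->addr & DIR_IN) &&
		    (ep->attrs & XFER_MASK) == XFER_BULK)
			return ep->addr;
	}
	return -1;
}
```

A device that reports a single endpoint is refused up front instead of
having memory beyond its endpoint array examined.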



[PATCH 3.16 017/134] staging: iio: tsl2x7x_core: Fix standard deviation calculation

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Eva Rachel Retuya 

commit cf6c77323a96fc40309cc8a4921ef206cccdd961 upstream.

Standard deviation is calculated as the square root of the variance,
where the variance is sample_sum divided by length. Correct the
computation of statP->stddev accordingly.

Fixes: 3c97c08b5735 ("staging: iio: add TAOS tsl2x7x driver")
Reported-by: Abhiram Balasubramanian 
Signed-off-by: Eva Rachel Retuya 
Signed-off-by: Jonathan Cameron 
Signed-off-by: Ben Hutchings 
---
 drivers/staging/iio/light/tsl2x7x_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/staging/iio/light/tsl2x7x_core.c
+++ b/drivers/staging/iio/light/tsl2x7x_core.c
@@ -849,7 +849,7 @@ void tsl2x7x_prox_calculate(int *data, i
tmp = data[i] - statP->mean;
sample_sum += tmp * tmp;
}
-   statP->stddev = int_sqrt((long)sample_sum)/length;
+   statP->stddev = int_sqrt((long)sample_sum / length);
 }
 
 /**
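
A small userspace sketch of the difference between the old and the new
computation. isqrt() is a simple stand-in for the kernel's int_sqrt(),
and the helper names (sum_sq_dev, stddev_old, stddev_new) are invented
for this example only.

```c
#include <assert.h>

/* Floor of the square root; a simple stand-in for int_sqrt(). */
static unsigned long isqrt(unsigned long x)
{
	unsigned long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/* Sum of squared deviations from the integer mean of data[0..length-1]. */
static long sum_sq_dev(const int *data, int length)
{
	long sum = 0, sq = 0;
	int i, mean;

	assert(length > 0);
	for (i = 0; i < length; i++)
		sum += data[i];
	mean = sum / length;

	for (i = 0; i < length; i++) {
		long tmp = data[i] - mean;

		sq += tmp * tmp;
	}
	return sq;
}

/* Before the fix: sqrt(sample_sum) / length. */
static unsigned long stddev_old(const int *data, int length)
{
	return isqrt((unsigned long)sum_sq_dev(data, length)) / length;
}

/* After the fix: sqrt(sample_sum / length), the root of the variance. */
static unsigned long stddev_new(const int *data, int length)
{
	return isqrt((unsigned long)(sum_sq_dev(data, length) / length));
}
```

For the samples {2, 4, 4, 4, 5, 5, 7, 9} the mean is 5 and the sum of
squared deviations is 32. The old formula gives int_sqrt(32) / 8 = 5 / 8
= 0, while the corrected one gives int_sqrt(32 / 8) = int_sqrt(4) = 2.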



[PATCH 3.16 023/134] pinctrl: sh-pfc: r8a7791: Fix IPSR comment typos

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Sergei Shtylyov 

commit 0cbdc11482d72ad164e33ef7cc57b01e8b61e40d upstream.

The IPSR field names in the comments have been fat-fingered in a couple
places --  fix those silly typos...

Fixes: 508845196238 ("pinctrl: sh-pfc: r8a7791 PFC support")
Signed-off-by: Sergei Shtylyov 
Signed-off-by: Geert Uytterhoeven 
Signed-off-by: Ben Hutchings 
---
 drivers/pinctrl/sh-pfc/pfc-r8a7791.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/pinctrl/sh-pfc/pfc-r8a7791.c
+++ b/drivers/pinctrl/sh-pfc/pfc-r8a7791.c
@@ -4960,7 +4960,7 @@ static const struct pinmux_cfg_reg pinmu
},
{ PINMUX_CFG_REG_VAR("IPSR2", 0xE6060028, 32,
 2, 3, 2, 2, 2, 2, 3, 3, 3, 3, 2, 2, 3) {
-   /* IP2_31_20 [2] */
+   /* IP2_31_30 [2] */
0, 0, 0, 0,
/* IP2_29_27 [3] */
FN_EX_CS3_N, FN_ATADIR0_N, FN_MSIOF2_TXD,
@@ -4980,7 +4980,7 @@ static const struct pinmux_cfg_reg pinmu
/* IP2_15_13 [3] */
FN_A24, FN_DREQ2, FN_IO3, FN_TX1, FN_SCIFA1_TXD,
0, 0, 0,
-   /* IP2_12_0 [3] */
+   /* IP2_12_10 [3] */
FN_A23, FN_IO2, FN_BPFCLK_B, FN_RX0, FN_SCIFA0_RXD,
0, 0, 0,
/* IP2_9_7 [3] */
@@ -5291,7 +5291,7 @@ static const struct pinmux_cfg_reg pinmu
/* IP10_24_22 [3] */
FN_VI0_R1, FN_VI2_DATA2, FN_GLO_I1_B, FN_TS_SCK0_C, FN_ATAG1_N,
0, 0, 0,
-   /* IP10_21_29 [3] */
+   /* IP10_21_19 [3] */
FN_VI0_R0, FN_VI2_DATA1, FN_GLO_I0_B,
FN_TS_SDATA0_C, FN_ATACS11_N,
0, 0, 0,



[PATCH 3.16 020/134] pinctrl: sh-pfc: r8a7791: Add missing HSCIF1 pinmux data

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Sergei Shtylyov 

commit da7a692fbbab07f4e9798b5b52798f6e3256dd8f upstream.

The R8A7791 PFC driver was apparently based on the preliminary revisions
of the user's manual, which omitted the HSCIF1 group E signals in the
IPSR4 register description. This would cause HSCIF1's probe to fail with
messages like the ones below:

sh-pfc e606.pfc: cannot locate data/mark enum_id for mark 1989
sh-sci e62c8000.serial: Error applying setting, reverse things back
sh-sci: probe of e62c8000.serial failed with error -22

Add the necessary PINMUX_IPSR_MSEL() invocations for the HSCK1_E,
HCTS1#_E, and HRTS1#_E signals...

Fixes: 508845196238 ("pinctrl: sh-pfc: r8a7791 PFC support")
Signed-off-by: Sergei Shtylyov 
Signed-off-by: Geert Uytterhoeven 
[bwh: Backported to 3.16:
 - Use PINMUX_IPSR_MODSEL_DATA() instead of PINMUX_IPSR_MSEL()
 - Adjust context]
Signed-off-by: Ben Hutchings 
---
 drivers/pinctrl/sh-pfc/pfc-r8a7791.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/pinctrl/sh-pfc/pfc-r8a7791.c
+++ b/drivers/pinctrl/sh-pfc/pfc-r8a7791.c
@@ -999,14 +999,17 @@ static const u16 pinmux_data[] = {
PINMUX_IPSR_MODSEL_DATA(IP4_12_10, SCL2, SEL_IIC2_0),
PINMUX_IPSR_MODSEL_DATA(IP4_12_10, GPS_CLK_B, SEL_GPS_1),
PINMUX_IPSR_MODSEL_DATA(IP4_12_10, GLO_Q0_D, SEL_GPS_3),
+   PINMUX_IPSR_MODSEL_DATA(IP4_12_10, HSCK1_E, SEL_HSCIF1_4),
PINMUX_IPSR_DATA(IP4_15_13, SSI_WS2),
PINMUX_IPSR_MODSEL_DATA(IP4_15_13, SDA2, SEL_IIC2_0),
PINMUX_IPSR_MODSEL_DATA(IP4_15_13, GPS_SIGN_B, SEL_GPS_1),
PINMUX_IPSR_MODSEL_DATA(IP4_15_13, RX2_E, SEL_SCIF2_4),
PINMUX_IPSR_MODSEL_DATA(IP4_15_13, GLO_Q1_D, SEL_GPS_3),
+   PINMUX_IPSR_MODSEL_DATA(IP4_15_13, HCTS1_N_E, SEL_HSCIF1_4),
PINMUX_IPSR_DATA(IP4_18_16, SSI_SDATA2),
PINMUX_IPSR_MODSEL_DATA(IP4_18_16, GPS_MAG_B, SEL_GPS_1),
PINMUX_IPSR_MODSEL_DATA(IP4_18_16, TX2_E, SEL_SCIF2_4),
+   PINMUX_IPSR_MODSEL_DATA(IP4_18_16, HRTS1_N_E, SEL_HSCIF1_4),
PINMUX_IPSR_DATA(IP4_19, SSI_SCK34),
PINMUX_IPSR_DATA(IP4_20, SSI_WS34),
PINMUX_IPSR_DATA(IP4_21, SSI_SDATA3),



[PATCH 3.16 021/134] pinctrl: sh-pfc: r8a7791: Add missing DVC_MUTE signal

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Sergei Shtylyov 

commit 3908632fb829d73317c64c3d04f584b49f62e4ae upstream.

The R8A7791 PFC driver was apparently based on the preliminary revisions
of the user's manual, which omitted the DVC_MUTE signal altogether in
the PFC section. The modern manual has the signal described, so just add
the necessary data to the driver...

Fixes: 508845196238 ("pinctrl: sh-pfc: r8a7791 PFC support")
Signed-off-by: Sergei Shtylyov 
Signed-off-by: Geert Uytterhoeven 
[bwh: Backported to 3.16:
 - Use PINMUX_IPSR_DATA() instead of PINMUX_IPSR_GPSR()
 - Adjust context]
Signed-off-by: Ben Hutchings 
---
 drivers/pinctrl/sh-pfc/pfc-r8a7791.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/drivers/pinctrl/sh-pfc/pfc-r8a7791.c
+++ b/drivers/pinctrl/sh-pfc/pfc-r8a7791.c
@@ -192,7 +192,7 @@ enum {
 
/* IPSR6 */
FN_AUDIO_CLKB, FN_STP_OPWM_0_B, FN_MSIOF1_SCK_B,
-   FN_SCIF_CLK, FN_BPFCLK_E,
+   FN_SCIF_CLK, FN_DVC_MUTE, FN_BPFCLK_E,
FN_AUDIO_CLKC, FN_SCIFB0_SCK_C, FN_MSIOF1_SYNC_B, FN_RX2,
FN_SCIFA2_RXD, FN_FMIN_E,
FN_AUDIO_CLKOUT, FN_MSIOF1_SS1_B, FN_TX2, FN_SCIFA2_TXD,
@@ -562,7 +562,7 @@ enum {
 
/* IPSR6 */
AUDIO_CLKB_MARK, STP_OPWM_0_B_MARK, MSIOF1_SCK_B_MARK,
-   SCIF_CLK_MARK, BPFCLK_E_MARK,
+   SCIF_CLK_MARK, DVC_MUTE_MARK, BPFCLK_E_MARK,
AUDIO_CLKC_MARK, SCIFB0_SCK_C_MARK, MSIOF1_SYNC_B_MARK, RX2_MARK,
SCIFA2_RXD_MARK, FMIN_E_MARK,
AUDIO_CLKOUT_MARK, MSIOF1_SS1_B_MARK, TX2_MARK, SCIFA2_TXD_MARK,
@@ -1082,6 +1082,7 @@ static const u16 pinmux_data[] = {
PINMUX_IPSR_MODSEL_DATA(IP6_2_0, STP_OPWM_0_B, SEL_SSP_1),
PINMUX_IPSR_MODSEL_DATA(IP6_2_0, MSIOF1_SCK_B, SEL_SOF1_1),
PINMUX_IPSR_MODSEL_DATA(IP6_2_0, SCIF_CLK, SEL_SCIF_0),
+   PINMUX_IPSR_DATA(IP6_2_0, DVC_MUTE),
PINMUX_IPSR_MODSEL_DATA(IP6_2_0, BPFCLK_E, SEL_FM_4),
PINMUX_IPSR_DATA(IP6_5_3, AUDIO_CLKC),
PINMUX_IPSR_MODSEL_DATA(IP6_5_3, SCIFB0_SCK_C, SEL_SCIFB_2),
@@ -5148,7 +5149,7 @@ static const struct pinmux_cfg_reg pinmu
0, 0,
/* IP6_2_0 [3] */
FN_AUDIO_CLKB, FN_STP_OPWM_0_B, FN_MSIOF1_SCK_B,
-   FN_SCIF_CLK, 0, FN_BPFCLK_E,
+   FN_SCIF_CLK, FN_DVC_MUTE, FN_BPFCLK_E,
0, 0, }
},
{ PINMUX_CFG_REG_VAR("IPSR7", 0xE606003C, 32,



[PATCH 3.16 016/134] [media] mceusb: fix NULL-deref at probe

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit 03eb2a557ed552e920a0942b774aaf931596eec1 upstream.

Make sure to check for the required out endpoint to avoid dereferencing
a NULL-pointer in mce_request_packet should a malicious device lack such
an endpoint. Note that this path is hit during probe.

Fixes: 66e89522aff7 ("V4L/DVB: IR: add mceusb IR receiver driver")

Signed-off-by: Johan Hovold 
Signed-off-by: Sean Young 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Ben Hutchings 
---
 drivers/media/rc/mceusb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/media/rc/mceusb.c
+++ b/drivers/media/rc/mceusb.c
@@ -1299,8 +1299,8 @@ static int mceusb_dev_probe(struct usb_i
}
}
}
-   if (ep_in == NULL) {
-   dev_dbg(&intf->dev, "inbound and/or endpoint not found");
+   if (!ep_in || !ep_out) {
+   dev_dbg(&intf->dev, "required endpoints not found\n");
return -ENODEV;
}
 



[PATCH 3.16 018/134] USB: Proper handling of Race Condition when two USB class drivers try to call init_usb_class simultaneously

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Ajay Kaher 

commit 2f86a96be0ccb1302b7eee7855dbee5ce4dc5dfb upstream.

There is a race condition when two USB class drivers try to call
init_usb_class at the same time, which leads to a crash.
code path: probe->usb_register_dev->init_usb_class

To solve this, mutex locking has been added in init_usb_class() and
destroy_usb_class().

As pointed out by Alan, removed "if (usb_class)" test from destroy_usb_class()
because usb_class can never be NULL there.

Signed-off-by: Ajay Kaher 
Acked-by: Alan Stern 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Ben Hutchings 
---
 drivers/usb/core/file.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/drivers/usb/core/file.c
+++ b/drivers/usb/core/file.c
@@ -26,6 +26,7 @@
 #define MAX_USB_MINORS 256
 static const struct file_operations *usb_minors[MAX_USB_MINORS];
 static DECLARE_RWSEM(minor_rwsem);
+static DEFINE_MUTEX(init_usb_class_mutex);
 
 static int usb_open(struct inode *inode, struct file *file)
 {
@@ -108,8 +109,9 @@ static void release_usb_class(struct kre
 
 static void destroy_usb_class(void)
 {
-   if (usb_class)
-   kref_put(&usb_class->kref, release_usb_class);
+   mutex_lock(&init_usb_class_mutex);
+   kref_put(&usb_class->kref, release_usb_class);
+   mutex_unlock(&init_usb_class_mutex);
 }
 
 int usb_major_init(void)
@@ -171,7 +173,10 @@ int usb_register_dev(struct usb_interfac
if (intf->minor >= 0)
return -EADDRINUSE;
 
+   mutex_lock(&init_usb_class_mutex);
retval = init_usb_class();
+   mutex_unlock(&init_usb_class_mutex);
+
if (retval)
return retval;
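
A minimal userspace sketch of the locking pattern, with a pthread mutex
standing in for the kernel mutex and a plain counter standing in for the
kref. The names usb_class_model, register_dev, deregister_dev and
class_refcount are assumptions made for this example, not kernel API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the refcounted usb_class object. */
struct usb_class_model {
	int refcount;
};

static struct usb_class_model *usb_class;
static pthread_mutex_t init_usb_class_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Create the class on first use, otherwise take another reference.
 * Unsafe on its own: two callers could both observe usb_class == NULL.
 */
static int init_usb_class(void)
{
	if (usb_class) {
		usb_class->refcount++;
		return 0;
	}
	usb_class = calloc(1, sizeof(*usb_class));
	if (!usb_class)
		return -1;
	usb_class->refcount = 1;
	return 0;
}

/* Like the patched usb_register_dev(): serialize check-and-create. */
int register_dev(void)
{
	int retval;

	pthread_mutex_lock(&init_usb_class_mutex);
	retval = init_usb_class();
	pthread_mutex_unlock(&init_usb_class_mutex);
	return retval;
}

/* Like the patched destroy_usb_class(): drop a reference under the lock. */
void deregister_dev(void)
{
	pthread_mutex_lock(&init_usb_class_mutex);
	assert(usb_class && usb_class->refcount > 0);
	if (--usb_class->refcount == 0) {
		free(usb_class);
		usb_class = NULL;
	}
	pthread_mutex_unlock(&init_usb_class_mutex);
}

/* Current reference count, or 0 if the class does not exist. */
int class_refcount(void)
{
	int n;

	pthread_mutex_lock(&init_usb_class_mutex);
	n = usb_class ? usb_class->refcount : 0;
	pthread_mutex_unlock(&init_usb_class_mutex);
	return n;
}
```

Because creation and teardown take the same mutex, a second caller
either sees the fully created object or creates it itself, never a
half-initialized one.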
 



[PATCH 3.16 019/134] cdc-acm: fix possible invalid access when processing notification

2017-08-18 Thread Ben Hutchings
3.16.47-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Tobias Herzog 

commit 1bb9914e1730417d530de9ed37e59efdc647146b upstream.

Notifications may only be 8 bytes long. Accessing the 9th and
10th byte of unimplemented/unknown notifications may be insecure.
Also check the length of known notifications before accessing anything
behind the 8th byte.

Signed-off-by: Tobias Herzog 
Acked-by: Oliver Neukum 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Ben Hutchings 
---
 drivers/usb/class/cdc-acm.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -312,6 +312,12 @@ static void acm_ctrl_irq(struct urb *urb
break;
 
case USB_CDC_NOTIFY_SERIAL_STATE:
+   if (le16_to_cpu(dr->wLength) != 2) {
+   dev_dbg(&acm->control->dev,
+   "%s - malformed serial state\n", __func__);
+   break;
+   }
+
newctrl = get_unaligned_le16(data);
 
if (!acm->clocal && (acm->ctrlin & ~newctrl & ACM_CTRL_DCD)) {
@@ -348,11 +354,10 @@ static void acm_ctrl_irq(struct urb *urb
 
default:
dev_dbg(&acm->control->dev,
-   "%s - unknown notification %d received: index %d "
-   "len %d data0 %d data1 %d\n",
+   "%s - unknown notification %d received: index %d len %d\n",
__func__,
-   dr->bNotificationType, dr->wIndex,
-   dr->wLength, data[0], data[1]);
+   dr->bNotificationType, dr->wIndex, dr->wLength);
+
break;
}
 exit:
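
A hedged userspace sketch of the idea: only touch bytes past the 8-byte
notification header after checking that the notification really carries
them. parse_serial_state() and the NOTIFY_* macros are invented for this
example; the additional check against the received buffer length is an
assumption of the sketch and is not part of the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NOTIFY_HDR_LEN      8     /* fixed notification header size */
#define NOTIFY_SERIAL_STATE 0x20  /* value of USB_CDC_NOTIFY_SERIAL_STATE */

/* Read a little-endian 16-bit value from an unaligned buffer. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Returns the 16-bit serial state, or -1 if the notification is not a
 * well-formed serial state notification.
 * Header layout: bmRequestType, bNotificationType, wValue, wIndex,
 * wLength (the last three little-endian, 2 bytes each).
 */
int parse_serial_state(const uint8_t *buf, size_t len)
{
	assert(buf != NULL);

	if (len < NOTIFY_HDR_LEN)
		return -1;
	if (buf[1] != NOTIFY_SERIAL_STATE)
		return -1;
	/* wLength must announce exactly two data bytes. */
	if (get_le16(buf + 6) != 2)
		return -1;
	/* Assumption of this sketch: also require the bytes to be present. */
	if (len < NOTIFY_HDR_LEN + 2)
		return -1;

	return get_le16(buf + NOTIFY_HDR_LEN);
}
```

A bare 8-byte notification is rejected here without bytes 9 and 10 ever
being read.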



[PATCH 3.2 30/59] PCI: Freeze PME scan before suspending devices

2017-08-18 Thread Ben Hutchings
3.2.92-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Lukas Wunner 

commit ea00353f36b64375518662a8ad15e39218a1f324 upstream.

Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790)
crashes during suspend tests.  Geert Uytterhoeven managed to reproduce the
issue on an M2-W Koelsch board (r8a7791):

  It occurs when the PME scan runs, once per second.  During PME scan, the
  PCI host bridge (rcar-pci) registers are accessed while its module clock
  has already been disabled, leading to the crash.

One reproducer is to configure s2ram to use "s2idle" instead of "deep"
suspend:

  # echo 0 > /sys/module/printk/parameters/console_suspend
  # echo s2idle > /sys/power/mem_sleep
  # echo mem > /sys/power/state

Another reproducer is to write either "platform" or "processors" to
/sys/power/pm_test.  The crash does not happen (or is less likely to
happen) during full system suspend ("core" or "none") because system
suspend also disables timers, and thus the workqueue handling PME scans
no longer runs.  Geert believes the issue may still happen in the small
window between disabling module clocks and disabling timers:

  # echo 0 > /sys/module/printk/parameters/console_suspend
  # echo platform > /sys/power/pm_test    # Or "processors"
  # echo mem > /sys/power/state

(Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.)

Rafael Wysocki agrees that PME scans should be suspended before the host
bridge registers become inaccessible.  To that end, queue the task on a
workqueue that gets frozen before devices suspend.

Rafael notes however that as a result, some wakeup events may be missed if
they are delivered via PME from a device without working IRQ (which hence
must be polled) and occur after the workqueue has been frozen.  If that
turns out to be an issue in practice, it may be possible to solve it by
calling pci_pme_list_scan() once directly from one of the host bridge's
pm_ops callbacks.

Stacktrace for posterity:

  PM: Syncing filesystems ... [   38.566237] done.
  PM: Preparing system for sleep (mem)
  Freezing user space processes ... [   38.579813] (elapsed 0.001 seconds) done.
  Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
  PM: Suspending system (mem)
  PM: suspend of devices complete after 152.456 msecs
  PM: late suspend of devices complete after 2.809 msecs
  PM: noirq suspend of devices complete after 29.863 msecs
  suspend debug: Waiting for 5 second(s).
  Unhandled fault: asynchronous external abort (0x1211) at 0x
  pgd = c0003000
  [] *pgd=8040004003, *pmd=
  Internal error: : 1211 [#1] SMP ARM
  Modules linked in:
  CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted
  4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383
  Hardware name: Generic R8A7791 (Flattened Device Tree)
  Workqueue: events pci_pme_list_scan
  task: eb56e140 task.stack: eb58e000
  PC is at pci_generic_config_read+0x64/0x6c
  LR is at rcar_pci_cfg_base+0x64/0x84
  pc : []lr : []psr: 600d0093
  sp : eb58fe98  ip : c041d750  fp : 0008
  r10: c0e2283c  r9 :   r8 : 600d0013
  r7 : 0008  r6 : eb58fed6  r5 : 0002  r4 : eb58feb4
  r3 :   r2 : 0044  r1 : 0008  r0 : 
  Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
  Control: 30c5387d  Table: 6a9f6c80  DAC: 
  Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210)
  Stack: (0xeb58fe98 to 0xeb59)
  fe80:   0002 0044
  fea0: eb6f5800 c041d9b0 eb58feb4 0008 0044  eb78a000 eb78a000
  fec0: 0044  eb9aff00 c0424bf0 eb78a000  eb78a000 c0e22830
  fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc
  ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100
  ff20: eb55f398 c02366c4 eb56e140 eb5631c0  eb55f380 c023641c 
  ff40:    c023a928 cd105598  40506a34 eb55f380
  ff60:   dead4ead   eb58ff74 eb58ff74 
  ff80:  dead4ead   eb58ff90 eb58ff90 eb58ffac eb5631c0
  ffa0: c023a844   c0206d68    
  ffc0:        
  ffe0:     0013  3a81336c 10ccd1dd
  [] (pci_generic_config_read) from []
  (pci_bus_read_config_word+0x58/0x80)
  [] (pci_bus_read_config_word) from []
  (pci_check_pme_status+0x34/0x78)
  [] (pci_check_pme_status) from []
  (pci_pme_wakeup+0x28/0x54)
  [] (pci_pme_wakeup) from [] (pci_pme_list_scan+0x58/0xb4)
  [] (pci_pme_list_scan) from []
  (process_one_work+0x1bc/0x308)
  [] (process_one_work) from [] (worker_thread+0x2a8/0x3e0)
  [] (worker_thread) from [] (kthread+0xe4/0xfc)
  [] (kthread) from [] (ret_from_fork+0x14/0x2c)
  Code: ea00 e5903000 f57ff04f e3a0 (e5843000)
  ---[ e

[PATCH v11 0/6] Add RAS virtualization support for armv8 SEA and SEI

2017-08-18 Thread Dongjiu Geng
On the ARMv8 platform, the main processor hardware error notification
types are Synchronous External Abort (SEA) and SError Interrupt (SEI).
For ARMv8 SEA/SEI, KVM or the host kernel will deliver SIGBUS or use
another interface to notify user space. After user space gets the
notification, it will record the CPER to simulate GHES for the guest OS
and inject an exception (SEA/SEI) to KVM.

This patch series has two parts: one part handles the synchronous
external abort (SEA) and SError Interrupt (SEI) exceptions; the other
part generates the APEI table when the guest OS boots and dynamically
records CPER for the guest OS about generic hardware errors. Currently
userspace only handles memory section hardware errors. Before QEMU
records the CPER, it needs to check the ACK value written by the guest
OS to avoid a read-write race condition. In the simulated
APEI/GHESV2/CPER table, the maximum number of error sources is 11,
classified by notification type; for now only the SEA/SEI notification
type error sources are enabled, to avoid OS boot warnings.


The overall solution was discussed here before:
https://patchwork.kernel.org/patch/9633105/

Below is the APEI/GHESV2/CPER table layout; the maximum number of error
sources is 11:

etc/acpi/tables:
  HEST
    GHES0 ... GHES10, each containing:
      error_status_address
      read_ack_register
      read_ack_preserve
      read_ack_write

etc/hardware_errors:
  address registers:
    status_address0 ... status_address10
    ack_value0 ... ack_value10
  Error Status Data Block 0 ... Error Status Data Block 10,
    each holding CPER records

Links between the two (N = 0 ... 10):
  GHESN.error_status_address -> status_addressN -> Error Status Data Block N
  GHESN.read_ack_register    -> ack_valueN


--
How to test SEA/SEI recovery in the guest OS:

1. In the guest OS, trigger a SEA or SEI.
2. The error log below is then printed by the memory failure handling.
3. The memory failure handling then recovers from the error.

For example, the kernel log looks like this:
[   21.101216] Synchronous External Abort: synchronous external abort (0x9610) at 0xff8008064018
[   21.104969] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 8
[   21.106918] {1}[Hardware Error]: event severity: recoverable
[   21.109027] {1}[Hardware Error]:  Error 0, type: recoverable
[   21.110362] {1}[Hardware Error]:   section_type: memory error
[   21.111705] {1}[Hardware Error]:   physical_address: 0x7a20
[   21.113255] {1}[Hardware Error]:   e
