[PATCH 3.4 099/125] xen/pciback: Do not install an IRQ handler for MSI interrupts.

2016-10-12 Thread lizf
From: Konrad Rzeszutek Wilk 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit a396f3a210c3a61e94d6b87ec05a75d0be2a60d0 upstream.

Otherwise a guest can subvert the generic MSI code to trigger
a BUG_ON condition during MSI interrupt freeing:

    for (i = 0; i < entry->nvec_used; i++)
        BUG_ON(irq_has_action(entry->irq + i));

The Xen PCI backend installs an IRQ handler (request_irq) for
dev->irq whenever the guest writes PCI_COMMAND_MEMORY
(or PCI_COMMAND_IO) to the PCI_COMMAND register. This is
done in case the device uses legacy interrupts and the GSI line
is shared by the backend devices.

To subvert the backend, the guest needs to make the backend
change dev->irq from the GSI to the MSI interrupt line,
make the backend allocate an interrupt handler, and then command
the backend to free the MSI interrupt and hit the BUG_ON.

Since the backend only calls 'request_irq' when the guest
writes to the PCI_COMMAND register, the guest needs to call
XEN_PCI_OP_enable_msi before any other operation. This will
cause the generic MSI code to set up an MSI entry and
populate dev->irq with the new PIRQ value.

Then the guest can write PCI_COMMAND_MEMORY to PCI_COMMAND
and cause the backend to set up an IRQ handler for dev->irq
(which now holds the MSI PIRQ instead of the GSI value). See
'xen_pcibk_control_isr'.

Then the guest disables the MSI with XEN_PCI_OP_disable_msi,
which ends up triggering the BUG_ON condition in 'free_msi_irqs'
because there is still an IRQ handler for entry->irq (dev->irq).

Note that this cannot be done using MSI-X, as the generic
code does not overwrite dev->irq with the MSI-X PIRQ values.

The patch inhibits setting up the IRQ handler if MSI or
MSI-X (for symmetry reasons) has been enabled successfully.
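
The sequence described above can be modelled in userspace as a rough
sketch. Nothing here is kernel code: struct model_dev, the model_*
helpers and the IRQ numbers 16 and 90 are hypothetical stand-ins,
chosen only to show why the guard in xen_pcibk_control_isr avoids the
BUG_ON.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the pci_dev fields that matter
 * here; this is not the kernel's real structure. */
struct model_dev {
	int irq;            /* GSI by default, MSI PIRQ once MSI is enabled */
	bool msi_enabled;
	bool msix_enabled;
	bool isr_installed; /* models a successful request_irq() on dev->irq */
};

/* Models XEN_PCI_OP_enable_msi: the generic MSI code overwrites dev->irq. */
static void model_enable_msi(struct model_dev *dev, int msi_pirq)
{
	dev->irq = msi_pirq;
	dev->msi_enabled = true;
}

/* Models the guest writing PCI_COMMAND_MEMORY. With the guard enabled,
 * no handler is installed while MSI or MSI-X is active. */
static void model_control_isr(struct model_dev *dev, bool guard)
{
	if (guard && (dev->msi_enabled || dev->msix_enabled))
		return;
	dev->isr_installed = true;
}

/* Models XEN_PCI_OP_disable_msi. Returns false where the kernel would
 * hit BUG_ON(irq_has_action(...)) in free_msi_irqs(). */
static bool model_disable_msi(struct model_dev *dev, int gsi)
{
	bool ok = !dev->isr_installed;

	dev->msi_enabled = false;
	dev->irq = gsi;
	return ok;
}

/* Runs the three guest steps in order; true means no BUG_ON was hit. */
bool model_attack_survives(bool guard)
{
	struct model_dev dev = { .irq = 16 };   /* 16: made-up GSI */

	model_enable_msi(&dev, 90);             /* 90: made-up MSI PIRQ */
	model_control_isr(&dev, guard);
	return model_disable_msi(&dev, 16);
}
```

Calling model_attack_survives(false) reproduces the failing order of
operations, while model_attack_survives(true) shows the guarded path.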

P.S.
When Xen PCIBack sets up the device for guest consumption, it
ends up writing 0 to PCI_COMMAND (see xen_pcibk_reset_device).
The XSA-120 addendum patch removed that; however, when upstreaming said
addendum we found that it caused issues with upstream qemu. That
has now been fixed in upstream qemu.

This is part of XSA-157

Reviewed-by: David Vrabel 
Signed-off-by: Konrad Rzeszutek Wilk 
Signed-off-by: Zefan Li 
---
 drivers/xen/xen-pciback/pciback_ops.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/xen/xen-pciback/pciback_ops.c b/drivers/xen/xen-pciback/pciback_ops.c
index f7ce4de..90bc022 100644
--- a/drivers/xen/xen-pciback/pciback_ops.c
+++ b/drivers/xen/xen-pciback/pciback_ops.c
@@ -69,6 +69,13 @@ static void xen_pcibk_control_isr(struct pci_dev *dev, int reset)
enable ? "enable" : "disable");
 
if (enable) {
+   /*
+* The MSI or MSI-X should not have an IRQ handler. Otherwise
+* if the guest terminates we BUG_ON in free_msi_irqs.
+*/
+   if (dev->msi_enabled || dev->msix_enabled)
+   goto out;
+
rc = request_irq(dev_data->irq,
xen_pcibk_guest_interrupt, IRQF_SHARED,
dev_data->irq_name, dev);
-- 
1.9.1



[PATCH 3.4 070/125] usb: xhci: fix config fail of FS hub behind a HS hub with MTT

2016-10-12 Thread lizf
From: Chunfeng Yun 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 096b110a3dd3c868e4610937c80d2e3f3357c1a9 upstream.

If a full-speed hub connects to a high-speed hub which
supports MTT, the MTT field of its slot context is set
to 1 when the xHCI driver sets up an xHCI virtual device in
xhci_setup_addressable_virt_dev(). Once the USB core fetches its
hub descriptor and needs to update the xHC's internal data
structures for the device, the HUB field of its slot context
is set to 1 as well. Because MTT is still set from before,
the Configure Endpoint command fails. In this case we should
clear MTT to 0 for a full-speed hub, according
to section 6.2.2.
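
As a rough illustration of the bit handling, here is a userspace
sketch. The SKETCH_DEV_* bit positions are assumptions made for the
example (check xhci.h for the real DEV_HUB and DEV_MTT definitions),
and the little-endian conversion done by cpu_to_le32() in the driver
is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; not taken from the driver headers. */
#define SKETCH_DEV_MTT (UINT32_C(1) << 25)
#define SKETCH_DEV_HUB (UINT32_C(1) << 26)

/* Returns the slot context dev_info word that a hub update would leave
 * behind, given the word already written during virtual device setup. */
uint32_t sketch_hub_dev_info(uint32_t dev_info, bool tt_multi, bool full_speed)
{
	dev_info |= SKETCH_DEV_HUB;
	if (tt_multi)
		dev_info |= SKETCH_DEV_MTT;
	else if (full_speed)
		dev_info &= ~SKETCH_DEV_MTT; /* drop MTT left over from setup */
	return dev_info;
}
```

A full-speed hub that enters with MTT already set leaves with only the
HUB bit set, which is the state the commit message asks for.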

Signed-off-by: Chunfeng Yun 
Signed-off-by: Mathias Nyman 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Zefan Li 
---
 drivers/usb/host/xhci.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 88be7a5..95ac4cf 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4123,8 +4123,16 @@ int xhci_update_hub_device(struct usb_hcd *hcd, struct usb_device *hdev,
ctrl_ctx->add_flags |= cpu_to_le32(SLOT_FLAG);
slot_ctx = xhci_get_slot_ctx(xhci, config_cmd->in_ctx);
slot_ctx->dev_info |= cpu_to_le32(DEV_HUB);
+   /*
+* refer to section 6.2.2: MTT should be 0 for full speed hub,
+* but it may be already set to 1 when setup an xHCI virtual
+* device, so clear it anyway.
+*/
if (tt->multi)
slot_ctx->dev_info |= cpu_to_le32(DEV_MTT);
+   else if (hdev->speed == USB_SPEED_FULL)
+   slot_ctx->dev_info &= cpu_to_le32(~DEV_MTT);
+
if (xhci->hci_version > 0x95) {
xhci_dbg(xhci, "xHCI version %x needs hub "
"TT think time and number of ports\n",
-- 
1.9.1



[PATCH 3.4 105/125] parisc: Fix syscall restarts

2016-10-12 Thread lizf
From: Helge Deller 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 71a71fb5374a23be36a91981b5614590b9e722c3 upstream.

On parisc, syscalls which are interrupted by signals sometimes failed to
restart and instead returned -ENOSYS, which in the worst case led to
userspace crashes.
A similar problem existed on MIPS and was fixed by commit e967ef02
("MIPS: Fix restart of indirect syscalls").

On parisc the current syscall restart code assumes that all syscall
callers load the syscall number in the delay slot of the ble
instruction. That's how it is done, e.g., in the unistd.h header file:
	ble 0x100(%sr2, %r0)
	ldi #syscall_nr, %r20
Because of that assumption the current code never restored %r20 before
returning to userspace.

This assumption does not hold, at least, for code which uses the glibc
syscall() function, which instead uses this syntax:
	ble 0x100(%sr2, %r0)
	copy regX, %r20
where regX depends on how the compiler optimizes the code and register
usage.

This patch fixes this problem by adding code to analyze how the syscall
number is loaded in the delay branch and, if needed, copy the syscall
number to regX prior to returning to userspace for the syscall restart.
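
The opcode analysis can be sketched in userspace roughly as follows.
The mask and match constants and the SKETCH_NOP value are assumptions
derived from the PA-RISC encodings of "ldi imm,%r20", "nop" and
"copy %rX,%r20"; sketch_delay_slot_source is a hypothetical helper,
not a function from the patch.

```c
#include <stdint.h>

#define SKETCH_NOP 0x08000240u  /* assumed encoding of "or %r0,%r0,%r0" */

/* Classifies the instruction found in the delay slot of the ble.
 * Returns -1 if nothing must be restored (ldi or nop), the source
 * register number for "copy %rX,%r20", or -2 for anything else. */
int sketch_delay_slot_source(uint32_t opcode)
{
	/* "ldi imm,%r20": the immediate lives in the low 16 bits */
	if ((opcode & 0xffff0000u) == 0x34140000u)
		return -1;

	if (opcode == SKETCH_NOP)
		return -1;

	/* "copy %rX,%r20": X is encoded in bits 16..20 */
	if ((opcode & 0xffe0ffffu) == 0x08000254u)
		return (int)((opcode >> 16) & 31);

	return -2;
}
```

When a register number is returned, the restart path would copy the
saved syscall number from %r20 into that register before returning to
userspace, so that the re-executed copy instruction reloads %r20.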

Signed-off-by: Helge Deller 
Cc: Mathieu Desnoyers 
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li 
---
 arch/parisc/kernel/signal.c | 67 +++--
 1 file changed, 52 insertions(+), 15 deletions(-)

diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
index 12c1ed3..c626855 100644
--- a/arch/parisc/kernel/signal.c
+++ b/arch/parisc/kernel/signal.c
@@ -468,6 +468,55 @@ handle_signal(unsigned long sig, siginfo_t *info, struct k_sigaction *ka,
return 1;
 }
 
+/*
+ * Check how the syscall number gets loaded into %r20 within
+ * the delay branch in userspace and adjust as needed.
+ */
+
+static void check_syscallno_in_delay_branch(struct pt_regs *regs)
+{
+   u32 opcode, source_reg;
+   u32 __user *uaddr;
+   int err;
+
+   /* Usually we don't have to restore %r20 (the system call number)
+* because it gets loaded in the delay slot of the branch external
+* instruction via the ldi instruction.
+* In some cases a register-to-register copy instruction might have
+* been used instead, in which case we need to copy the syscall
+* number into the source register before returning to userspace.
+*/
+
+   /* A syscall is just a branch, so all we have to do is fiddle the
+* return pointer so that the ble instruction gets executed again.
+*/
+   regs->gr[31] -= 8; /* delayed branching */
+
+   /* Get assembler opcode of code in delay branch */
+   uaddr = (unsigned int *) ((regs->gr[31] & ~3) + 4);
+   err = get_user(opcode, uaddr);
+   if (err)
+   return;
+
+   /* Check if delay branch uses "ldi int,%r20" */
+   if ((opcode & 0xffff0000) == 0x34140000)
+   return; /* everything ok, just return */
+
+   /* Check if delay branch uses "nop" */
+   if (opcode == INSN_NOP)
+   return;
+
+   /* Check if delay branch uses "copy %rX,%r20" */
+   if ((opcode & 0xffe0ffff) == 0x08000254) {
+   source_reg = (opcode >> 16) & 31;
+   regs->gr[source_reg] = regs->gr[20];
+   return;
+   }
+
+   pr_warn("syscall restart: %s (pid %d): unexpected opcode 0x%08x\n",
+   current->comm, task_pid_nr(current), opcode);
+}
+
 static inline void
 syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
 {
@@ -489,10 +538,7 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
}
/* fallthrough */
case -ERESTARTNOINTR:
-   /* A syscall is just a branch, so all
-* we have to do is fiddle the return pointer.
-*/
-   regs->gr[31] -= 8; /* delayed branching */
+   check_syscallno_in_delay_branch(regs);
/* Preserve original r28. */
regs->gr[28] = regs->orig_r28;
break;
@@ -543,18 +589,9 @@ insert_restart_trampoline(struct pt_regs *regs)
}
case -ERESTARTNOHAND:
case -ERESTARTSYS:
-   case -ERESTARTNOINTR: {
-   /* Hooray for delayed branching.  We don't
-* have to restore %r20 (the system call
-* number) because it gets loaded in the delay
-* slot of the branch external instruction.
-*/
-   regs->gr[31] -= 8;
-   /* Preserve original r28. */
-   regs->gr[28] = regs->orig_r28;
-
+   case -ERESTARTNOINTR:
+   check_syscallno_in_delay_branch(regs);
return;
-   }
default:
break;
}
-- 
1.9.1



[PATCH 3.4 081/125] mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress

2016-10-12 Thread lizf
From: Michal Hocko 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 373ccbe5927034b55bdc80b0f8b54d6e13fe8d12 upstream.

Tetsuo Handa has reported that the system might basically livelock in
an OOM condition without triggering the OOM killer.

The issue is caused by an internal dependency of direct reclaim on
vmstat counter updates (via zone_reclaimable), which are performed from
workqueue context.  If all the current workers get assigned to an
allocation request, though, they will be looping inside the allocator
trying to reclaim memory, but zone_reclaimable can see stale numbers, so
it will consider a zone reclaimable even though it has been scanned way
too much.  The WQ concurrency logic will not consider this situation a
congested workqueue because it relies on the worker having to sleep in
such a situation.  This also means that it doesn't try to spawn new
workers or invoke the rescuer thread if one is assigned to the
queue.

In order to fix this issue we need to do two things.  First, we have to
let the wq concurrency code know that we are in trouble, so we have to do a
short sleep.  In order to avoid the issues handled by 0e093d99763e
("writeback: do not sleep on the congestion queue if there are no
congested BDIs or if significant congestion is not being encountered in
the current zone"), we limit the sleep to worker threads, which are
the ones of interest anyway.

The second thing to do is to create a dedicated workqueue for vmstat and
mark it WQ_MEM_RECLAIM to note that it participates in the reclaim and to
have a spare worker thread for it.
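
The first half of the fix boils down to a per-task decision, which the
following userspace sketch tries to capture. SKETCH_PF_WQ_WORKER and
its value are placeholders for the kernel's PF_WQ_WORKER task flag, and
the enum only names the two actions instead of performing them.

```c
/* Placeholder for the kernel's PF_WQ_WORKER flag; value is illustrative. */
#define SKETCH_PF_WQ_WORKER 0x20u

enum sketch_yield {
	SKETCH_COND_RESCHED, /* yield only if needed, never sleep */
	SKETCH_SHORT_SLEEP   /* sleep briefly so the WQ code sees a sleeper */
};

/* Picks what an uncongested wait should do for a task with these flags. */
enum sketch_yield sketch_uncongested_yield(unsigned int task_flags)
{
	if (task_flags & SKETCH_PF_WQ_WORKER)
		return SKETCH_SHORT_SLEEP;
	return SKETCH_COND_RESCHED;
}
```

Restricting the sleep to workqueue workers keeps ordinary tasks on the
cheap cond_resched() path.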

Signed-off-by: Michal Hocko 
Reported-by: Tetsuo Handa 
Cc: Tejun Heo 
Cc: Cristopher Lameter 
Cc: Joonsoo Kim 
Cc: Arkadiusz Miskiewicz 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li 
---
 mm/backing-dev.c | 19 ---
 mm/vmstat.c  |  6 --
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index dd8e2aa..3f54b7d 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -843,8 +843,9 @@ EXPORT_SYMBOL(congestion_wait);
  * jiffies for either a BDI to exit congestion of the given @sync queue
  * or a write to complete.
  *
- * In the absence of zone congestion, cond_resched() is called to yield
- * the processor if necessary but otherwise does not sleep.
+ * In the absence of zone congestion, a short sleep or a cond_resched is
+ * performed to yield the processor and to allow other subsystems to make
+ * a forward progress.
  *
  * The return value is 0 if the sleep is for the full timeout. Otherwise,
  * it is the number of jiffies that were still remaining when the function
@@ -864,7 +865,19 @@ long wait_iff_congested(struct zone *zone, int sync, long timeout)
 */
if (atomic_read(&nr_bdi_congested[sync]) == 0 ||
!zone_is_reclaim_congested(zone)) {
-   cond_resched();
+
+   /*
+* Memory allocation/reclaim might be called from a WQ
+* context and the current implementation of the WQ
+* concurrency control doesn't recognize that a particular
+* WQ is congested if the worker thread is looping without
+* ever sleeping. Therefore we have to do a short sleep
+* here rather than calling cond_resched().
+*/
+   if (current->flags & PF_WQ_WORKER)
+   schedule_timeout(1);
+   else
+   cond_resched();
 
/* In case we scheduled, work out time remaining */
ret = timeout - (jiffies - start);
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 7db1b9b..e89c0f6 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1139,13 +1139,14 @@ static const struct file_operations proc_vmstat_file_operations = {
 #endif /* CONFIG_PROC_FS */
 
 #ifdef CONFIG_SMP
+static struct workqueue_struct *vmstat_wq;
 static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
 int sysctl_stat_interval __read_mostly = HZ;
 
 static void vmstat_update(struct work_struct *w)
 {
refresh_cpu_vm_stats(smp_processor_id());
-   schedule_delayed_work(&__get_cpu_var(vmstat_work),
+   queue_delayed_work(vmstat_wq, &__get_cpu_var(vmstat_work),
round_jiffies_relative(sysctl_stat_interval));
 }
 
@@ -1154,7 +1155,7 @@ static void __cpuinit start_cpu_timer(int cpu)
struct delayed_work *work = &per_cpu(vmstat_work, cpu);
 
INIT_DELAYED_WORK_DEFERRABLE(work, vmstat_update);
-   schedule_delayed_work_on(cpu, work, __round_jiffies_relative(HZ, cpu));
+   queue_delayed_work_on(cpu, vmstat_wq, work, __round_jiffies_relative(HZ, cpu));
 }
 
 /*
@@ -1204,6 +1205,7 @@ static int __init setup_vmstat(void)
 
register_cpu_notifier(&vmstat_notifier);
 
+   vmstat_wq = alloc_workque

[PATCH 3.4 077/125] ses: Fix problems with simple enclosures

2016-10-12 Thread lizf
From: James Bottomley 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 3417c1b5cb1fdc10261dbed42b05cc93166a78fd upstream.

Simple enclosure implementations (mostly USB) are allowed to return only
page 8 to every diagnostic query.  That really confuses our
implementation because we assume the return is the page we asked for and
end up doing incorrect offsets based on bogus information leading to
accesses outside of allocated ranges.  Fix that by checking the page
code of the return and giving an error if it isn't the one we asked for.
This should fix reported bugs with USB storage by simply refusing to
attach to enclosures that behave like this.  It's also good defensive
practise now that we're starting to see more USB enclosures.
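
The check described above might look roughly like this in isolation.
sketch_check_diag_page is a hypothetical helper, not the driver's
function: it takes the result of the transport call as a parameter
(0 meaning success) and assumes, as the commit message implies, that
the page code is the first byte of the returned buffer.

```c
#include <errno.h>
#include <stddef.h>

/* Validates a received diagnostic page against the page that was
 * requested. transfer_ret is the result of the (hypothetical)
 * transport call, with 0 meaning success. */
int sketch_check_diag_page(int transfer_ret, const unsigned char *buf,
			   size_t len, int page_code)
{
	if (transfer_ret)
		return transfer_ret;  /* transport failed, nothing to inspect */
	if (len == 0)
		return -EINVAL;       /* no page code byte to look at */
	if (buf[0] == page_code)
		return 0;             /* got the page we asked for */
	return -EINVAL;               /* e.g. device answered with page 8 */
}
```

Refusing the wrong page early means later offset calculations never
run on data laid out for a different page.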

Reported-by: Andrea Gelmini 
Reviewed-by: Ewan D. Milne 
Reviewed-by: Tomas Henzl 
Signed-off-by: James Bottomley 
Signed-off-by: Zefan Li 
---
 drivers/scsi/ses.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index eba183c..b3051fe 100644
--- a/drivers/scsi/ses.c
+++ b/drivers/scsi/ses.c
@@ -70,6 +70,7 @@ static int ses_probe(struct device *dev)
 static int ses_recv_diag(struct scsi_device *sdev, int page_code,
 void *buf, int bufflen)
 {
+   int ret;
unsigned char cmd[] = {
RECEIVE_DIAGNOSTIC,
1,  /* Set PCV bit */
@@ -78,9 +79,26 @@ static int ses_recv_diag(struct scsi_device *sdev, int page_code,
bufflen & 0xff,
0
};
+   unsigned char recv_page_code;
 
-   return scsi_execute_req(sdev, cmd, DMA_FROM_DEVICE, buf, bufflen,
+   ret =  scsi_execute_req(sdev, cmd, DMA_FROM_DEVICE, buf, bufflen,
NULL, SES_TIMEOUT, SES_RETRIES, NULL);
+   if (unlikely(!ret))
+   return ret;
+
+   recv_page_code = ((unsigned char *)buf)[0];
+
+   if (likely(recv_page_code == page_code))
+   return ret;
+
+   /* successful diagnostic but wrong page code.  This happens to some
+* USB devices, just print a message and pretend there was an error */
+
+   sdev_printk(KERN_ERR, sdev,
+   "Wrong diagnostic page; asked for %d got %u\n",
+   page_code, recv_page_code);
+
+   return -EINVAL;
 }
 
 static int ses_send_diag(struct scsi_device *sdev, int page_code,
-- 
1.9.1



[PATCH 3.4 109/125] ftrace/scripts: Fix incorrect use of sprintf in recordmcount

2016-10-12 Thread lizf
From: Colin Ian King 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 713a3e4de707fab49d5aa4bceb77db1058572a7b upstream.

Fix build warning:

scripts/recordmcount.c:589:4: warning: format not a string
literal and no format arguments [-Wformat-security]
sprintf("%s: failed\n", file);
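
For context, sprintf() takes the destination buffer as its first
argument, so the call above treats the format literal as the buffer
and 'file' as the format string. A minimal sketch of the intended
reporting, with sketch_report_failure being a made-up wrapper name:

```c
#include <stdio.h>

/* Writes "<file>: failed\n" to the given stream and returns whatever
 * fprintf() returns (characters written, or a negative value on error). */
int sketch_report_failure(FILE *out, const char *file)
{
	return fprintf(out, "%s: failed\n", file);
}
```

Passing stderr as the stream matches what the fix does in main().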

Fixes: a50bd43935586 ("ftrace/scripts: Have recordmcount copy the object file")
Link: http://lkml.kernel.org/r/1451516801-16951-1-git-send-email-colin.k...@canonical.com

Cc: Li Bin 
Cc: Russell King 
Cc: Will Deacon 
Signed-off-by: Colin Ian King 
Signed-off-by: Steven Rostedt 
Signed-off-by: Zefan Li 
---
 scripts/recordmcount.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c
index 0970379..0d5ae4a 100644
--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -546,7 +546,7 @@ main(int argc, char *argv[])
do_file(file);
break;
case SJ_FAIL:/* error in do_file or below */
-   sprintf("%s: failed\n", file);
+   fprintf(stderr, "%s: failed\n", file);
++n_error;
break;
case SJ_SUCCEED:/* premature success */
-- 
1.9.1



[PATCH 3.4 106/125] ipv6/addrlabel: fix ip6addrlbl_get()

2016-10-12 Thread lizf
From: Andrey Ryabinin 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit e459dfeeb64008b2d23bdf600f03b3605dbb8152 upstream.

ip6addrlbl_get() has never worked. If ip6addrlbl_hold() succeeds,
ip6addrlbl_get() exits with '-ESRCH'. If ip6addrlbl_hold() fails,
ip6addrlbl_get() uses an ip6addrlbl_entry pointer that is about to be freed.

Fix this by inverting the ip6addrlbl_hold() check.
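
A userspace sketch of the corrected pattern, using C11 atomics. The
sketch_* names are hypothetical, and the sketch assumes that the hold
operation returns nonzero on success (in the style of
atomic_inc_not_zero()), which is what the inverted check implies.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_entry {
	atomic_int refcnt;
};

/* Takes a reference unless the count has already dropped to zero.
 * Returns true on success. */
bool sketch_hold(struct sketch_entry *e)
{
	int old = atomic_load(&e->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&e->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns p with a reference held, or NULL if p is being freed. */
struct sketch_entry *sketch_get(struct sketch_entry *p)
{
	if (p && !sketch_hold(p))
		p = NULL;
	return p;
}

/* Demo helper: refcount after a get, or -1 if the get was refused. */
int sketch_try_get(int initial_refcnt)
{
	struct sketch_entry e;

	atomic_init(&e.refcnt, initial_refcnt);
	return sketch_get(&e) ? atomic_load(&e.refcnt) : -1;
}
```

With the check the wrong way round, a live entry would be discarded
and a dying one would be used, which is the behaviour described above.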

Fixes: 2a8cc6c89039 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.")
Signed-off-by: Andrey Ryabinin 
Reviewed-by: Cong Wang 
Acked-by: YOSHIFUJI Hideaki 
Signed-off-by: David S. Miller 
Signed-off-by: Zefan Li 
---
 net/ipv6/addrlabel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/addrlabel.c b/net/ipv6/addrlabel.c
index 2d8ddba..c8c6a12 100644
--- a/net/ipv6/addrlabel.c
+++ b/net/ipv6/addrlabel.c
@@ -558,7 +558,7 @@ static int ip6addrlbl_get(struct sk_buff *in_skb, struct nlmsghdr* nlh,
 
rcu_read_lock();
p = __ipv6_addr_label(net, addr, ipv6_addr_type(addr), ifal->ifal_index);
-   if (p && ip6addrlbl_hold(p))
+   if (p && !ip6addrlbl_hold(p))
p = NULL;
lseq = ip6addrlbl_table.seq;
rcu_read_unlock();
-- 
1.9.1
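
The inverted check can be illustrated with a small userspace model. This is a sketch only: it assumes the hold helper returns non-zero when it manages to take a reference (as the commit message implies), and it uses C11 atomics in place of the kernel's atomic helpers. The names entry_hold() and lookup_and_hold() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct entry {
	atomic_int refcnt;
};

/* Take a reference unless the count has already dropped to zero.
 * Returns true on success, false if the entry is on its way out. */
static bool entry_hold(struct entry *e)
{
	int old = atomic_load(&e->refcnt);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&e->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Mirrors the fixed condition: drop the pointer only when the hold FAILED. */
static struct entry *lookup_and_hold(struct entry *p)
{
	if (p && !entry_hold(p))
		p = NULL;
	return p;
}
```

With the pre-patch condition (no negation), a successful hold would discard the pointer and a failed hold would keep it, which is exactly the pair of failures described in the commit message.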



[PATCH 3.4 110/125] net: possible use after free in dst_release

2016-10-12 Thread lizf
From: Francesco Ruggeri <frugg...@aristanetworks.com>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 07a5d38453599052aff0877b16bb9c1585f08609 upstream.

dst_release should not access dst->flags after decrementing
__refcnt to 0. The dst_entry may be in dst_busy_list and
dst_gc_task may dst_destroy it before dst_release gets a chance
to access dst->flags.

Fixes: d69bbf88c8d0 ("net: fix a race in dst_release()")
Fixes: 27b75c95f10d ("net: avoid RCU for NOCACHE dst")
Signed-off-by: Francesco Ruggeri <frugg...@arista.com>
Acked-by: Eric Dumazet <eduma...@google.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 net/core/dst.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/dst.c b/net/core/dst.c
index 54ba1eb..48cff89 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -269,10 +269,11 @@ void dst_release(struct dst_entry *dst)
 {
if (dst) {
int newrefcnt;
+   unsigned short nocache = dst->flags & DST_NOCACHE;
 
newrefcnt = atomic_dec_return(&dst->__refcnt);
WARN_ON(newrefcnt < 0);
-   if (!newrefcnt && unlikely(dst->flags & DST_NOCACHE)) {
+   if (!newrefcnt && unlikely(nocache)) {
dst = dst_destroy(dst);
if (dst)
__dst_free(dst);
-- 
1.9.1
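
The ordering rule behind this fix (read everything you need from the object before dropping your reference) can be sketched in userspace as follows. This is a simplified model using C11 atomics, not the kernel code; struct obj, OBJ_NOCACHE and obj_release() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define OBJ_NOCACHE 0x1	/* hypothetical flag, stands in for DST_NOCACHE */

struct obj {
	atomic_int refcnt;
	unsigned short flags;
};

/* Drop one reference. Returns true when the caller dropped the last
 * reference and is responsible for destroying the object. */
static bool obj_release(struct obj *o)
{
	unsigned short nocache;
	int newrefcnt;

	assert(o != NULL);
	/* Snapshot the flag while we still hold a reference. Once the
	 * count reaches zero another thread may free the object, so
	 * o->flags must not be touched after the decrement. */
	nocache = o->flags & OBJ_NOCACHE;
	newrefcnt = atomic_fetch_sub(&o->refcnt, 1) - 1;

	return newrefcnt == 0 && nocache;
}
```

The design choice is the same as in the patch: a local copy costs one load and removes any access to the object after the point where ownership may have been lost.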



[PATCH 3.4 073/125] 9p: ->evict_inode() should kick out ->i_data, not ->i_mapping

2016-10-12 Thread lizf
From: Al Viro <v...@zeniv.linux.org.uk>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 4ad78628445d26e5e9487b2e8f23274ad7b0f5d3 upstream.

For block devices the pagecache is associated with the inode
on bdevfs, not with the aliasing ones on the mountable filesystems.
The latter have their own ->i_data empty and ->i_mapping pointing
to the (unique per major/minor) bdevfs inode.  That guarantees
cache coherence between all block device inodes with the same
device number.

Eviction of an alias inode has no business trying to evict the
pages belonging to bdevfs one; moreover, ->i_mapping is only
safe to access when the thing is opened.  At the time of
->evict_inode() the victim is definitely *not* opened.  We are
about to kill the address space embedded into struct inode
(inode->i_data) and that's what we need to empty of any pages.

9p instance tries to empty inode->i_mapping instead, which is
both unsafe and bogus - if we have several device nodes with
the same device number in different places, closing one of them
should not try to empty the (shared) page cache.

Fortunately, other instances in the tree are OK; they are
evicting from &inode->i_data instead, as 9p one should.

Reported-by: "Suzuki K. Poulose" <suzuki.poul...@arm.com>
Tested-by: "Suzuki K. Poulose" <suzuki.poul...@arm.com>
Signed-off-by: Al Viro <v...@zeniv.linux.org.uk>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 fs/9p/vfs_inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index c9b32dc..116e43f 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -447,9 +447,9 @@ void v9fs_evict_inode(struct inode *inode)
 {
struct v9fs_inode *v9inode = V9FS_I(inode);
 
-   truncate_inode_pages(inode->i_mapping, 0);
+   truncate_inode_pages(&inode->i_data, 0);
end_writeback(inode);
-   filemap_fdatawrite(inode->i_mapping);
+   filemap_fdatawrite(&inode->i_data);
 
 #ifdef CONFIG_9P_FSCACHE
v9fs_cache_inode_put_cookie(inode);
-- 
1.9.1
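
The distinction between the embedded ->i_data and the ->i_mapping pointer can be modelled in a few lines of userspace C. This is a toy sketch under the assumptions stated in the commit message (an alias inode has an empty ->i_data and an ->i_mapping that points at the bdevfs inode's ->i_data); the *_model names are hypothetical and nothing here is kernel API.

```c
#include <assert.h>
#include <stddef.h>

struct mapping_model {
	unsigned long nrpages;
};

struct inode_model {
	struct mapping_model i_data;		/* embedded, dies with the inode */
	struct mapping_model *i_mapping;	/* may point at another inode's i_data */
};

/* Evict only what the inode itself owns. This stands in for
 * truncate_inode_pages(&inode->i_data, 0) in the patched code. */
static void evict_inode_model(struct inode_model *inode)
{
	assert(inode != NULL);
	inode->i_data.nrpages = 0;
}
```

Evicting through inode->i_mapping instead would zero the shared page cache of the bdevfs inode, which is the behaviour the patch removes.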



[PATCH 3.4 079/125] ses: fix additional element traversal bug

2016-10-12 Thread lizf
From: James Bottomley 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 5e1033561da1152c57b97ee84371dba2b3d64c25 upstream.

KASAN found that our additional element processing scripts drop off
the end of the VPD page into unallocated space.  The reason is that
not every element has additional information but our traversal
routines think they do, leading to them expecting far more additional
information than is present.  Fix this by adding a gate to the
traversal routine so that it only processes elements that are expected
to have additional information (list is in SES-2 section 6.1.13.1:
Additional Element Status diagnostic page overview)

Reported-by: Pavel Tikhomirov 
Tested-by: Pavel Tikhomirov 
Signed-off-by: James Bottomley 
Signed-off-by: Zefan Li 
---
 drivers/scsi/ses.c        | 10 +++++++++-
 include/linux/enclosure.h |  4 ++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index b3051fe..3643bbf 100644
--- a/drivers/scsi/ses.c
+++ b/drivers/scsi/ses.c
@@ -454,7 +454,15 @@ static void ses_enclosure_data_process(struct enclosure_device *edev,
if (desc_ptr)
desc_ptr += len;
 
-   if (addl_desc_ptr)
+   if (addl_desc_ptr &&
+   /* only find additional descriptions for specific devices */
+   (type_ptr[0] == ENCLOSURE_COMPONENT_DEVICE ||
+type_ptr[0] == ENCLOSURE_COMPONENT_ARRAY_DEVICE ||
+type_ptr[0] == ENCLOSURE_COMPONENT_SAS_EXPANDER ||
+/* these elements are optional */
+type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
+type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
+type_ptr[0] == ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
addl_desc_ptr += addl_desc_ptr[1] + 2;
 
}
diff --git a/include/linux/enclosure.h b/include/linux/enclosure.h
index 9a33c5f..f6c229e 100644
--- a/include/linux/enclosure.h
+++ b/include/linux/enclosure.h
@@ -29,7 +29,11 @@
 /* A few generic types ... taken from ses-2 */
 enum enclosure_component_type {
ENCLOSURE_COMPONENT_DEVICE = 0x01,
+   ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS = 0x07,
+   ENCLOSURE_COMPONENT_SCSI_TARGET_PORT = 0x14,
+   ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT = 0x15,
ENCLOSURE_COMPONENT_ARRAY_DEVICE = 0x17,
+   ENCLOSURE_COMPONENT_SAS_EXPANDER = 0x18,
 };
 
 /* ses-2 common element status */
-- 
1.9.1
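
The gate added by this patch can be sketched as a standalone predicate plus a bounded walk. The element type values are copied from the enclosure.h hunk above; the function names element_has_addl_info() and walk_addl() are hypothetical, and the explicit length check in the walk is an extra safety measure of this sketch, not something the patch adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum enclosure_component_type {
	ENCLOSURE_COMPONENT_DEVICE = 0x01,
	ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS = 0x07,
	ENCLOSURE_COMPONENT_SCSI_TARGET_PORT = 0x14,
	ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT = 0x15,
	ENCLOSURE_COMPONENT_ARRAY_DEVICE = 0x17,
	ENCLOSURE_COMPONENT_SAS_EXPANDER = 0x18,
};

/* True for element types that may carry an additional element descriptor. */
static bool element_has_addl_info(unsigned char type)
{
	switch (type) {
	case ENCLOSURE_COMPONENT_DEVICE:
	case ENCLOSURE_COMPONENT_ARRAY_DEVICE:
	case ENCLOSURE_COMPONENT_SAS_EXPANDER:
	case ENCLOSURE_COMPONENT_SCSI_TARGET_PORT:
	case ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT:
	case ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS:
		return true;
	default:
		return false;
	}
}

/* Walk the additional-descriptor buffer, consuming one descriptor per
 * gated element. Each descriptor is (addl[off + 1] + 2) bytes long.
 * Returns the number of bytes consumed. */
static size_t walk_addl(const unsigned char *types, size_t ntypes,
			const unsigned char *addl, size_t addl_len)
{
	size_t off = 0;

	assert(types != NULL && addl != NULL);
	for (size_t i = 0; i < ntypes; i++) {
		if (!element_has_addl_info(types[i]))
			continue;	/* no descriptor for this element */
		if (off + 2 > addl_len)
			break;		/* sketch-only bound check */
		off += (size_t)addl[off + 1] + 2;
	}
	return off;
}
```

Without the predicate, the walk would consume a descriptor for every element and run past the end of the buffer whenever some elements have none.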



[PATCH 3.4 107/125] ocfs2: fix BUG when calculate new backup super

2016-10-12 Thread lizf
From: Joseph Qi <joseph...@huawei.com>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 5c9ee4cbf2a945271f25b89b137f2c03bbc3be33 upstream.

When resizing, ocfs2 first extends the last gd.  If a backup super
should be placed in that gd, it calculates the new backup super and
updates the corresponding value.

But it currently doesn't consider the case where the backup super has
already been done.  In that case it still sets the bit in the gd bitmap
and then decrements bg_free_bits_count, which leads to a corrupted gd
and triggers the BUG in ocfs2_block_group_set_bits:

BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);

So check whether the backup super is done and then do the updates.

Signed-off-by: Joseph Qi <joseph...@huawei.com>
Reviewed-by: Jiufei Xue <xuejiu...@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyi...@huawei.com>
Cc: Mark Fasheh <mfas...@suse.de>
Cc: Joel Becker <jl...@evilplan.org>
Signed-off-by: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 fs/ocfs2/resize.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/ocfs2/resize.c b/fs/ocfs2/resize.c
index ec55add..2ca64fa 100644
--- a/fs/ocfs2/resize.c
+++ b/fs/ocfs2/resize.c
@@ -56,11 +56,12 @@ static u16 ocfs2_calc_new_backup_super(struct inode *inode,
   int new_clusters,
   u32 first_new_cluster,
   u16 cl_cpg,
+  u16 old_bg_clusters,
   int set)
 {
int i;
u16 backups = 0;
-   u32 cluster;
+   u32 cluster, lgd_cluster;
u64 blkno, gd_blkno, lgd_blkno = le64_to_cpu(gd->bg_blkno);
 
for (i = 0; i < OCFS2_MAX_BACKUP_SUPERBLOCKS; i++) {
@@ -73,6 +74,12 @@ static u16 ocfs2_calc_new_backup_super(struct inode *inode,
else if (gd_blkno > lgd_blkno)
break;
 
+   /* check if already done backup super */
+   lgd_cluster = ocfs2_blocks_to_clusters(inode->i_sb, lgd_blkno);
+   lgd_cluster += old_bg_clusters;
+   if (lgd_cluster >= cluster)
+   continue;
+
if (set)
ocfs2_set_bit(cluster % cl_cpg,
  (unsigned long *)gd->bg_bitmap);
@@ -101,6 +108,7 @@ static int ocfs2_update_last_group_and_inode(handle_t *handle,
u16 chain, num_bits, backups = 0;
u16 cl_bpc = le16_to_cpu(cl->cl_bpc);
u16 cl_cpg = le16_to_cpu(cl->cl_cpg);
+   u16 old_bg_clusters;
 
trace_ocfs2_update_last_group_and_inode(new_clusters,
first_new_cluster);
@@ -114,6 +122,7 @@ static int ocfs2_update_last_group_and_inode(handle_t *handle,
 
group = (struct ocfs2_group_desc *)group_bh->b_data;
 
+   old_bg_clusters = le16_to_cpu(group->bg_bits) / cl_bpc;
/* update the group first. */
num_bits = new_clusters * cl_bpc;
le16_add_cpu(&group->bg_bits, num_bits);
@@ -129,7 +138,7 @@ static int ocfs2_update_last_group_and_inode(handle_t *handle,
 group,
 new_clusters,
 first_new_cluster,
-cl_cpg, 1);
+cl_cpg, old_bg_clusters, 1);
le16_add_cpu(&group->bg_free_bits_count, -1 * backups);
}
 
@@ -169,7 +178,7 @@ out_rollback:
group,
new_clusters,
first_new_cluster,
-   cl_cpg, 0);
+   cl_cpg, old_bg_clusters, 0);
le16_add_cpu(&group->bg_free_bits_count, backups);
le16_add_cpu(&group->bg_bits, -1 * num_bits);
le16_add_cpu(&group->bg_free_bits_count, -1 * num_bits);
-- 
1.9.1
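
The "already done" check can be sketched in isolation. This is a simplified model: it mirrors the comparison from the hunk above (lgd_cluster + old_bg_clusters >= cluster) but leaves out the per-group filtering that the real loop also performs. The names backup_already_done() and count_new_backups() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A backup super whose cluster lies within the group's old extent was
 * accounted for before the resize and must not be counted again. */
static bool backup_already_done(uint32_t group_first_cluster,
				uint16_t old_bg_clusters,
				uint32_t backup_cluster)
{
	uint32_t lgd_cluster = group_first_cluster + old_bg_clusters;

	return lgd_cluster >= backup_cluster;
}

/* Count only the backup supers that the resize newly brings into the group. */
static uint16_t count_new_backups(uint32_t group_first_cluster,
				  uint16_t old_bg_clusters,
				  const uint32_t *backup_clusters, size_t n)
{
	uint16_t backups = 0;

	assert(backup_clusters != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (backup_already_done(group_first_cluster, old_bg_clusters,
					backup_clusters[i]))
			continue;
		backups++;
	}
	return backups;
}
```

Counting a backup twice is what drives bg_free_bits_count below the real number of free bits and eventually trips the BUG_ON quoted in the commit message.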



[PATCH 3.4 074/125] crypto: skcipher - Copy iv from desc even for 0-len walks

2016-10-12 Thread lizf
From: "Jason A. Donenfeld" <ja...@zx2c4.com>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 70d906bc17500edfa9bdd8c8b7e59618c7911613 upstream.

Some ciphers actually support encrypting zero length plaintexts. For
example, many AEAD modes support this. The resulting ciphertext for
those winds up being only the authentication tag, which is a result of
the key, the iv, the additional data, and the fact that the plaintext
had zero length. The blkcipher constructors won't copy the IV to the
right place, however, when using a zero length input, resulting in
some significant problems when ciphers call their initialization
routines, only to find that the ->iv parameter is uninitialized. One
such example of this would be using chacha20poly1305 with a zero length
input, which then calls chacha20, which calls the key setup routine,
which eventually OOPSes due to the uninitialized ->iv member.

Signed-off-by: Jason A. Donenfeld <ja...@zx2c4.com>
Signed-off-by: Herbert Xu <herb...@gondor.apana.org.au>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 crypto/ablkcipher.c | 2 +-
 crypto/blkcipher.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/crypto/ablkcipher.c b/crypto/ablkcipher.c
index 4a9c499..1ef1428 100644
--- a/crypto/ablkcipher.c
+++ b/crypto/ablkcipher.c
@@ -280,12 +280,12 @@ static int ablkcipher_walk_first(struct ablkcipher_request *req,
if (WARN_ON_ONCE(in_irq()))
return -EDEADLK;
 
+   walk->iv = req->info;
walk->nbytes = walk->total;
if (unlikely(!walk->total))
return 0;
 
walk->iv_buffer = NULL;
-   walk->iv = req->info;
if (unlikely(((unsigned long)walk->iv & alignmask))) {
int err = ablkcipher_copy_iv(walk, tfm, alignmask);
if (err)
diff --git a/crypto/blkcipher.c b/crypto/blkcipher.c
index 0a1ebea..34e5d65 100644
--- a/crypto/blkcipher.c
+++ b/crypto/blkcipher.c
@@ -329,12 +329,12 @@ static int blkcipher_walk_first(struct blkcipher_desc *desc,
if (WARN_ON_ONCE(in_irq()))
return -EDEADLK;
 
+   walk->iv = desc->info;
walk->nbytes = walk->total;
if (unlikely(!walk->total))
return 0;
 
walk->buffer = NULL;
-   walk->iv = desc->info;
if (unlikely(((unsigned long)walk->iv & alignmask))) {
int err = blkcipher_copy_iv(walk, tfm, alignmask);
if (err)
-- 
1.9.1
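
The effect of moving the assignment can be shown with a small userspace model of the walk setup. This is a sketch under the assumption that only the ordering matters here; struct walk_model and walk_first_model() are hypothetical and omit the buffer and alignment handling of the real functions.

```c
#include <assert.h>
#include <stddef.h>

struct walk_model {
	const unsigned char *iv;
	unsigned int total;
	unsigned int nbytes;
};

/* Model of the patched ordering: the IV pointer is set before the
 * zero-length early return, so callers always see a valid ->iv. */
static int walk_first_model(struct walk_model *walk,
			    const unsigned char *req_iv)
{
	assert(walk != NULL);
	walk->iv = req_iv;
	walk->nbytes = walk->total;
	if (walk->total == 0)
		return 0;	/* zero-length walk: ->iv is already set */

	/* Buffer setup and IV alignment handling would follow here. */
	return 0;
}
```

With the pre-patch ordering the assignment sat below the early return, so a zero-length request left ->iv at whatever value the structure happened to contain.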



[PATCH 3.4 077/125] ses: Fix problems with simple enclosures

2016-10-12 Thread lizf
From: James Bottomley 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 3417c1b5cb1fdc10261dbed42b05cc93166a78fd upstream.

Simple enclosure implementations (mostly USB) are allowed to return only
page 8 to every diagnostic query.  That really confuses our
implementation because we assume the return is the page we asked for and
end up doing incorrect offsets based on bogus information leading to
accesses outside of allocated ranges.  Fix that by checking the page
code of the return and giving an error if it isn't the one we asked for.
This should fix reported bugs with USB storage by simply refusing to
attach to enclosures that behave like this.  It's also good defensive
practise now that we're starting to see more USB enclosures.

Reported-by: Andrea Gelmini 
Reviewed-by: Ewan D. Milne 
Reviewed-by: Tomas Henzl 
Signed-off-by: James Bottomley 
Signed-off-by: Zefan Li 
---
 drivers/scsi/ses.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index eba183c..b3051fe 100644
--- a/drivers/scsi/ses.c
+++ b/drivers/scsi/ses.c
@@ -70,6 +70,7 @@ static int ses_probe(struct device *dev)
 static int ses_recv_diag(struct scsi_device *sdev, int page_code,
 void *buf, int bufflen)
 {
+   int ret;
unsigned char cmd[] = {
RECEIVE_DIAGNOSTIC,
1,  /* Set PCV bit */
@@ -78,9 +79,26 @@ static int ses_recv_diag(struct scsi_device *sdev, int page_code,
bufflen & 0xff,
0
};
+   unsigned char recv_page_code;
 
-   return scsi_execute_req(sdev, cmd, DMA_FROM_DEVICE, buf, bufflen,
+   ret =  scsi_execute_req(sdev, cmd, DMA_FROM_DEVICE, buf, bufflen,
NULL, SES_TIMEOUT, SES_RETRIES, NULL);
+   if (unlikely(!ret))
+   return ret;
+
+   recv_page_code = ((unsigned char *)buf)[0];
+
+   if (likely(recv_page_code == page_code))
+   return ret;
+
+   /* successful diagnostic but wrong page code.  This happens to some
+* USB devices, just print a message and pretend there was an error */
+
+   sdev_printk(KERN_ERR, sdev,
+   "Wrong diagnostic page; asked for %d got %u\n",
+   page_code, recv_page_code);
+
+   return -EINVAL;
 }
 
 static int ses_send_diag(struct scsi_device *sdev, int page_code,
-- 
1.9.1
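
The page-code check described in the commit message can be sketched as a standalone userspace function. It covers only the comparison of the returned page code against the requested one, not the SCSI command itself; check_diag_page() is a hypothetical name, and the explicit length check is an addition of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Return 0 if the first byte of the response is the page we asked for,
 * -EINVAL otherwise (including an empty response). */
static int check_diag_page(const unsigned char *buf, size_t len, int page_code)
{
	assert(buf != NULL);
	if (len < 1)
		return -EINVAL;
	if (buf[0] == page_code)
		return 0;

	fprintf(stderr, "Wrong diagnostic page; asked for %d got %u\n",
		page_code, (unsigned int)buf[0]);
	return -EINVAL;
}
```

A simple enclosure that answers every query with page 8 passes this check only when page 8 was requested, so any parsing that depends on another page's layout is never reached.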



[PATCH 3.4 104/125] KEYS: Fix race between read and revoke

2016-10-12 Thread lizf
From: David Howells 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit b4a1b4f5047e4f54e194681125c74c0aa64d637d upstream.

This fixes CVE-2015-7550.

There's a race between keyctl_read() and keyctl_revoke().  If the revoke
happens between keyctl_read() checking the validity of a key and the key's
semaphore being taken, then the key type read method will see a revoked key.

This causes a problem for the user-defined key type because it assumes in
its read method that there will always be a payload in a non-revoked key
and doesn't check for a NULL pointer.

Fix this by making keyctl_read() check the validity of a key after taking
semaphore instead of before.

I think the bug was introduced with the original keyrings code.

This was discovered by a multithreaded test program generated by syzkaller
(http://github.com/google/syzkaller).  Here's a cleaned up version:

#include <sys/types.h>
#include <keyutils.h>
#include <pthread.h>
void *thr0(void *arg)
{
key_serial_t key = (unsigned long)arg;
keyctl_revoke(key);
return 0;
}
void *thr1(void *arg)
{
key_serial_t key = (unsigned long)arg;
char buffer[16];
keyctl_read(key, buffer, 16);
return 0;
}
int main()
{
key_serial_t key = add_key("user", "%", "foo", 3, KEY_SPEC_USER_KEYRING);
pthread_t th[5];
pthread_create(&th[0], 0, thr0, (void *)(unsigned long)key);
pthread_create(&th[1], 0, thr1, (void *)(unsigned long)key);
pthread_create(&th[2], 0, thr0, (void *)(unsigned long)key);
pthread_create(&th[3], 0, thr1, (void *)(unsigned long)key);
pthread_join(th[0], 0);
pthread_join(th[1], 0);
pthread_join(th[2], 0);
pthread_join(th[3], 0);
return 0;
}

Build as:

cc -o keyctl-race keyctl-race.c -lkeyutils -lpthread

Run as:

while keyctl-race; do :; done

as it may need several iterations to crash the kernel.  The crash can be
summarised as:

BUG: unable to handle kernel NULL pointer dereference at 
0010
IP: [] user_read+0x56/0xa3
...
Call Trace:
 [] keyctl_read_key+0xb6/0xd7
 [] SyS_keyctl+0x83/0xe0
 [] entry_SYSCALL_64_fastpath+0x12/0x6f

Reported-by: Dmitry Vyukov 
Signed-off-by: David Howells 
Tested-by: Dmitry Vyukov 
Signed-off-by: James Morris 
Signed-off-by: Zefan Li 
---
 security/keys/keyctl.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index dfc8c22..0ba68b1 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -702,16 +702,16 @@ long keyctl_read_key(key_serial_t keyid, char __user *buffer, size_t buflen)
 
/* the key is probably readable - now try to read it */
 can_read_key:
-   ret = key_validate(key);
-   if (ret == 0) {
-   ret = -EOPNOTSUPP;
-   if (key->type->read) {
-   /* read the data with the semaphore held (since we
-* might sleep) */
-   down_read(&key->sem);
+   ret = -EOPNOTSUPP;
+   if (key->type->read) {
+   /* Read the data with the semaphore held (since we might sleep)
+* to protect against the key being updated or revoked.
+*/
+   down_read(&key->sem);
+   ret = key_validate(key);
+   if (ret == 0)
ret = key->type->read(key, buffer, buflen);
-   up_read(&key->sem);
-   }
+   up_read(&key->sem);
}
 
 error2:
-- 
1.9.1



[PATCH 3.4 080/125] parisc iommu: fix panic due to trying to allocate too large region

2016-10-12 Thread lizf
From: Mikulas Patocka 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit e46e31a3696ae2d66f32c207df3969613726e636 upstream.

When using the Promise TX2+ SATA controller on PA-RISC, the system often
crashes with a kernel panic. For example, just writing data with the dd
utility will make it crash.

Kernel panic - not syncing: drivers/parisc/sba_iommu.c: I/O MMU @ a000 is out of mapping resources

CPU: 0 PID: 18442 Comm: mkspadfs Not tainted 4.4.0-rc2 #2
Backtrace:
 [<4021497c>] show_stack+0x14/0x20
 [<40410bf0>] dump_stack+0x88/0x100
 [<4023978c>] panic+0x124/0x360
 [<40452c18>] sba_alloc_range+0x698/0x6a0
 [<40453150>] sba_map_sg+0x260/0x5b8
 [<0c18dbb4>] ata_qc_issue+0x264/0x4a8 [libata]
 [<0c19535c>] ata_scsi_translate+0xe4/0x220 [libata]
 [<0c19a93c>] ata_scsi_queuecmd+0xbc/0x320 [libata]
 [<40499bbc>] scsi_dispatch_cmd+0xfc/0x130
 [<4049da34>] scsi_request_fn+0x6e4/0x970
 [<403e95a8>] __blk_run_queue+0x40/0x60
 [<403e9d8c>] blk_run_queue+0x3c/0x68
 [<4049a534>] scsi_run_queue+0x2a4/0x360
 [<4049be68>] scsi_end_request+0x1a8/0x238
 [<4049de84>] scsi_io_completion+0xfc/0x688
 [<40493c74>] scsi_finish_command+0x17c/0x1d0

The cause of the crash is not exhaustion of the IOMMU space; there are
plenty of free pages. The function sba_alloc_range is called with size
0x11000, thus the pages_needed variable is 0x11. The function
sba_search_bitmap is called with bits_wanted 0x11 and the boundary size is
0x10 (because dma_get_seg_boundary(dev) returns 0xffff).

The function sba_search_bitmap attempts to allocate 17 pages that must not
cross a 16-page boundary - it can't satisfy this requirement
(iommu_is_span_boundary always returns true) and fails even if there are
many free entries in the IOMMU space.

How did it happen that we try to allocate 17 pages that must not cross a
16-page boundary? The cause is in the function iommu_coalesce_chunks. This
function tries to coalesce adjacent entries in the scatterlist. The
function does several checks to decide whether it may coalesce one entry
with the next; one of those checks is this:

	if (startsg->length + dma_len > max_seg_size)
		break;

When it finishes coalescing adjacent entries, it allocates the mapping:

	sg_dma_len(contig_sg) = dma_len;
	dma_len = ALIGN(dma_len + dma_offset, IOVP_SIZE);
	sg_dma_address(contig_sg) =
		PIDE_FLAG
		| (iommu_alloc_range(ioc, dev, dma_len) << IOVP_SHIFT)
		| dma_offset;

It is possible that (startsg->length + dma_len > max_seg_size) is false
(we are just near the 0x10000 max_seg_size boundary), so the function
decides to coalesce this entry with the next entry. When the coalescing
succeeds, the function performs
	dma_len = ALIGN(dma_len + dma_offset, IOVP_SIZE);
And now, because of the non-zero dma_offset, dma_len is greater than 0x10000.
iommu_alloc_range (a pointer to sba_alloc_range) is called and it attempts
to allocate 17 pages for a device that must not cross a 16-page boundary.

To fix the bug, we must make sure that dma_len after addition of
dma_offset and alignment doesn't cross the segment boundary. I.e. change
	if (startsg->length + dma_len > max_seg_size)
		break;
to
	if (ALIGN(dma_len + dma_offset + startsg->length, IOVP_SIZE) > max_seg_size)
		break;

This patch makes this change (it precalculates max_seg_boundary at the
beginning of the function iommu_coalesce_chunks). I also added a check
that the mapping length doesn't exceed dma_get_seg_boundary(dev) (it is
not needed for Promise TX2+ SATA, but it may be needed for other devices
that have dma_get_seg_boundary lower than dma_get_max_seg_size).
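
The arithmetic can be checked with a small user-space sketch. It assumes a 4 KiB IOVP_SIZE and made-up lengths (dma_len 0xf000, dma_offset 0x800, a next entry of 0x1000 bytes); the toy_* names are hypothetical and only mirror the two forms of the check quoted above.

```c
#include <stdbool.h>

#define TOY_IOVP_SIZE 0x1000UL	/* assumed 4 KiB IOMMU page */
#define TOY_ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Old check: ignores dma_offset and the rounding done afterwards. */
static bool toy_old_check_stops(unsigned long dma_len,
				unsigned long next_len,
				unsigned long max_seg_size)
{
	return next_len + dma_len > max_seg_size;
}

/* New check: tests the size that will actually be allocated. */
static bool toy_new_check_stops(unsigned long dma_len,
				unsigned long dma_offset,
				unsigned long next_len,
				unsigned long max_seg_size)
{
	return TOY_ALIGN(dma_len + dma_offset + next_len,
			 TOY_IOVP_SIZE) > max_seg_size;
}

/* Number of IOMMU pages requested for a finished chunk. */
static unsigned long toy_pages_needed(unsigned long dma_len,
				      unsigned long dma_offset)
{
	return TOY_ALIGN(dma_len + dma_offset, TOY_IOVP_SIZE) / TOY_IOVP_SIZE;
}
```

With max_seg_size 0x10000, the old check lets 0xf000 + 0x1000 through, and the resulting 0x10000-byte chunk at offset 0x800 rounds up to 0x11000 bytes, which is 17 pages. The new check sees ALIGN(0x10800) = 0x11000 > 0x10000 and stops coalescing one entry earlier.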

Signed-off-by: Mikulas Patocka 
Signed-off-by: Helge Deller 
Signed-off-by: Zefan Li 
---
 drivers/parisc/iommu-helpers.h | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/parisc/iommu-helpers.h b/drivers/parisc/iommu-helpers.h
index 8c33491..c6aa388 100644
--- a/drivers/parisc/iommu-helpers.h
+++ b/drivers/parisc/iommu-helpers.h
@@ -104,7 +104,11 @@ iommu_coalesce_chunks(struct ioc *ioc, struct device *dev,
 	struct scatterlist *contig_sg;	   /* contig chunk head */
 	unsigned long dma_offset, dma_len; /* start/len of DMA stream */
 	unsigned int n_mappings = 0;
-	unsigned int max_seg_size = dma_get_max_seg_size(dev);
+	unsigned int max_seg_size = min(dma_get_max_seg_size(dev),
+					(unsigned)DMA_CHUNK_SIZE);
+	unsigned int max_seg_boundary = dma_get_seg_boundary(dev) + 1;
+	if (max_seg_boundary)	/* check if the addition above didn't overflow */
+		max_seg_size = min(max_seg_size, max_seg_boundary);
 
 	while (nents > 0) {
 
@@ -139,14 +143,11 @@ iommu_coalesce_chunks(struct ioc *ioc, struct device *dev,
 
/*
   

[PATCH 3.4 078/125] vgaarb: fix signal handling in vga_get()

2016-10-12 Thread lizf
From: "Kirill A. Shutemov" 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 9f5bd30818c42c6c36a51f93b4df75a2ea2bd85e upstream.

There are a few defects in vga_get() related to signal handling:

  - we shouldn't check for pending signals in the TASK_UNINTERRUPTIBLE
    case;

  - if we find a pending signal we must remove ourselves from the wait
    queue and change the task state back to running;

  - -ERESTARTSYS is more appropriate, I guess.
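
The three points can be summarised in a toy model of one pass through the wait loop. This is only a sketch: struct toy_waiter and toy_wait_step() are hypothetical, and the value 512 mirrors the kernel-internal ERESTARTSYS code.

```c
#define TOY_ERESTARTSYS 512	/* same value as the kernel's ERESTARTSYS */

enum toy_state { TOY_RUNNING, TOY_INTERRUPTIBLE, TOY_UNINTERRUPTIBLE };

/* Hypothetical stand-in for the task state plus its wait queue entry. */
struct toy_waiter {
	enum toy_state state;
	int on_queue;
};

/*
 * One iteration of the wait loop. Returns 0 when the caller would go on
 * to schedule(), or a negative code when the wait is abandoned.
 */
static int toy_wait_step(struct toy_waiter *w, int interruptible,
			 int signal_pending)
{
	w->on_queue = 1;
	w->state = interruptible ? TOY_INTERRUPTIBLE : TOY_UNINTERRUPTIBLE;

	/* Signals only matter for an interruptible wait. */
	if (interruptible && signal_pending) {
		w->state = TOY_RUNNING;	/* undo the state change */
		w->on_queue = 0;	/* leave the wait queue */
		return -TOY_ERESTARTSYS;
	}
	return 0;
}
```

An uninterruptible caller with a pending signal keeps waiting, while an interruptible caller leaves with its state and queue membership restored.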

Signed-off-by: Kirill A. Shutemov 
Reviewed-by: David Herrmann 
Signed-off-by: Dave Airlie 
Signed-off-by: Zefan Li 
---
 drivers/gpu/vga/vgaarb.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
index 111d956..6a46d6e 100644
--- a/drivers/gpu/vga/vgaarb.c
+++ b/drivers/gpu/vga/vgaarb.c
@@ -381,8 +381,10 @@ int vga_get(struct pci_dev *pdev, unsigned int rsrc, int interruptible)
 		set_current_state(interruptible ?
 				  TASK_INTERRUPTIBLE :
 				  TASK_UNINTERRUPTIBLE);
-		if (signal_pending(current)) {
-			rc = -EINTR;
+		if (interruptible && signal_pending(current)) {
+			__set_current_state(TASK_RUNNING);
+			remove_wait_queue(&vga_wait_queue, &wait);
+			rc = -ERESTARTSYS;
 			break;
 		}
 		schedule();
-- 
1.9.1



[PATCH 3.4 083/125] tty: Fix GPF in flush_to_ldisc()

2016-10-12 Thread lizf
From: Peter Hurley <pe...@hurleysoftware.com>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 9ce119f318ba1a07c29149301f1544b6c4bea52a upstream.

A line discipline which does not define a receive_buf() method can
cause a GPF if data is ever received [1]. Oddly, this was known
to the author of n_tracesink in 2011, but never fixed.

[1] GPF report
BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<  (null)>]   (null)
PGD 3752d067 PUD 37a7b067 PMD 0
Oops: 0010 [#1] SMP KASAN
Modules linked in:
CPU: 2 PID: 148 Comm: kworker/u10:2 Not tainted 4.4.0-rc2+ #51
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: events_unbound flush_to_ldisc
task: 88006da94440 ti: 88006db6 task.ti: 88006db6
RIP: 0010:[<>]  [<  (null)>]   (null)
RSP: 0018:88006db67b50  EFLAGS: 00010246
RAX: 0102 RBX: 88003ab32f88 RCX: 0102
RDX:  RSI: 88003ab330a6 RDI: 88003aabd388
RBP: 88006db67c48 R08: 88003ab32f9c R09: 88003ab31fb0
R10: 88003ab32fa8 R11:  R12: dc00
R13: 88006db67c20 R14: 863df820 R15: 88003ab31fb8
FS:  () GS:88006dc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2:  CR3: 37938000 CR4: 06e0
Stack:
 829f46f1 88006da94bf8 88006da94bf8 
 88003ab31fb0 88003aabd438 88003ab31ff8 88006430fd90
 88003ab32f9c ed0007557a87 11000db6cf78 88003ab32078
Call Trace:
 [] process_one_work+0x8f1/0x17a0 kernel/workqueue.c:2030
 [] worker_thread+0xd4/0x1180 kernel/workqueue.c:2162
 [] kthread+0x1cf/0x270 drivers/block/aoe/aoecmd.c:1302
 [] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
Code:  Bad RIP value.
RIP  [<  (null)>]   (null)
 RSP 
CR2: 
---[ end trace a587f8947e54d6ea ]---
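
The report above is what a call through a NULL function pointer looks like. A minimal user-space sketch of the guard, with a hypothetical struct toy_ldisc_ops standing in for the line discipline operations table:

```c
#include <stddef.h>

/* Hypothetical ops table; receive_buf is optional and may be NULL. */
struct toy_ldisc_ops {
	void (*receive_buf)(const unsigned char *buf, int count, int *seen);
};

/* Example handler that just counts the bytes it was given. */
static void toy_count_bytes(const unsigned char *buf, int count, int *seen)
{
	(void)buf;
	*seen += count;
}

/* Delivers data only if the discipline provides a receive_buf method. */
static int toy_flush(const struct toy_ldisc_ops *ops,
		     const unsigned char *buf, int count)
{
	int seen = 0;

	if (ops->receive_buf)
		ops->receive_buf(buf, count, &seen);
	return seen;
}
```

A discipline without the method silently drops the data instead of jumping to address zero.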

Reported-by: Dmitry Vyukov <dvyu...@google.com>
Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 drivers/tty/tty_buffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 4f02f9c..3f59d6c 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -443,7 +443,8 @@ static void flush_to_ldisc(struct work_struct *work)
 			flag_buf = head->flag_buf_ptr + head->read;
 			head->read += count;
 			spin_unlock_irqrestore(&tty->buf.lock, flags);
-			disc->ops->receive_buf(tty, char_buf,
+			if (disc->ops->receive_buf)
+				disc->ops->receive_buf(tty, char_buf,
 							flag_buf, count);
 			spin_lock_irqsave(&tty->buf.lock, flags);
 		}
-- 
1.9.1



[PATCH 3.4 103/125] USB: fix invalid memory access in hub_activate()

2016-10-12 Thread lizf
From: Alan Stern <st...@rowland.harvard.edu>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit e50293ef9775c5f1cf3fcc093037dd6a8c5684ea upstream.

Commit 8520f38099cc ("USB: change hub initialization sleeps to
delayed_work") changed the hub_activate() routine to make part of it
run in a workqueue.  However, the commit failed to take a reference to
the usb_hub structure or to lock the hub interface while doing so.  As
a result, if a hub is plugged in and quickly unplugged before the work
routine can run, the routine will try to access memory that has been
deallocated.  Or, if the hub is unplugged while the routine is
running, the memory may be deallocated while it is in active use.

This patch fixes the problem by taking a reference to the usb_hub at
the start of hub_activate() and releasing it at the end (when the work
is finished), and by locking the hub interface while the work routine
is running.  It also adds a check at the start of the routine to see
if the hub has already been disconnected, in which case nothing should be
done.
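
The reference-counting half of the fix can be modelled in user space. This sketch leaves out the device lock, and struct toy_hub and the toy_* functions are hypothetical; it only shows why the extra reference keeps the structure alive until the deferred stage has run.

```c
/* Hypothetical stand-in for struct usb_hub. */
struct toy_hub {
	int refs;		/* models hub->kref */
	int disconnected;
	int freed;		/* set when the last reference is dropped */
	int init_steps;		/* how many init stages actually ran */
};

static void toy_hub_put(struct toy_hub *hub)
{
	if (--hub->refs == 0)
		hub->freed = 1;
}

/* First stage: take a reference that the deferred stage will release. */
static void toy_activate_first(struct toy_hub *hub)
{
	hub->refs++;
	hub->init_steps++;
	/* the remaining work would be queued here */
}

/* Deferred stage: bail out if the hub went away in the meantime. */
static void toy_activate_deferred(struct toy_hub *hub)
{
	if (hub->disconnected) {
		toy_hub_put(hub);
		return;
	}
	hub->init_steps++;
	toy_hub_put(hub);
}

/* Disconnect drops the reference held since probe. */
static void toy_disconnect(struct toy_hub *hub)
{
	hub->disconnected = 1;
	toy_hub_put(hub);
}
```

If the hub is unplugged between the two stages, the structure survives until the deferred stage drops its reference, and that stage does no further initialization.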

Signed-off-by: Alan Stern <st...@rowland.harvard.edu>
Reported-by: Alexandru Cornea <alexandru.cor...@intel.com>
Tested-by: Alexandru Cornea <alexandru.cor...@intel.com>
Fixes: 8520f38099cc ("USB: change hub initialization sleeps to delayed_work")
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
[lizf: Backported to 3.4: add forward declaration of hub_release()]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 drivers/usb/core/hub.c | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 62ea924..e0ad5dc 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -156,6 +156,7 @@ EXPORT_SYMBOL_GPL(ehci_cf_port_reset_rwsem);
 #define HUB_DEBOUNCE_STABLE 100
 
 
+static void hub_release(struct kref *kref);
 static int usb_reset_and_verify_device(struct usb_device *udev);
 
 static inline char *portspeed(struct usb_hub *hub, int portstatus)
@@ -797,10 +798,20 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 	unsigned delay;
 
 	/* Continue a partial initialization */
-	if (type == HUB_INIT2)
-		goto init2;
-	if (type == HUB_INIT3)
+	if (type == HUB_INIT2 || type == HUB_INIT3) {
+		device_lock(hub->intfdev);
+
+		/* Was the hub disconnected while we were waiting? */
+		if (hub->disconnected) {
+			device_unlock(hub->intfdev);
+			kref_put(&hub->kref, hub_release);
+			return;
+		}
+		if (type == HUB_INIT2)
+			goto init2;
 		goto init3;
+	}
+	kref_get(&hub->kref);
 
 	/* The superspeed hub except for root hub has to use Hub Depth
 	 * value as an offset into the route string to locate the bits
@@ -990,6 +1001,7 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 			PREPARE_DELAYED_WORK(&hub->init_work, hub_init_func3);
 			schedule_delayed_work(&hub->init_work,
 					msecs_to_jiffies(delay));
+			device_unlock(hub->intfdev);
 			return;		/* Continues at init3: below */
 		} else {
 			msleep(delay);
@@ -1010,6 +1022,11 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 	/* Allow autosuspend if it was suppressed */
 	if (type <= HUB_INIT3)
 		usb_autopm_put_interface_async(to_usb_interface(hub->intfdev));
+
+	if (type == HUB_INIT2 || type == HUB_INIT3)
+		device_unlock(hub->intfdev);
+
+	kref_put(&hub->kref, hub_release);
 }
 
 /* Implement the continuations for the delays above */
-- 
1.9.1



[PATCH 3.4 075/125] rfkill: copy the name into the rfkill struct

2016-10-12 Thread lizf
From: Johannes Berg 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit b7bb110008607a915298bf0f47d25886ecb94477 upstream.

Some users of rfkill, like NFC and cfg80211, use a dynamic name when
allocating rfkill, in those cases dev_name(). Therefore, the pointer
passed to rfkill_alloc() might not be valid forever. I specifically
found a case where the rfkill name was quite obviously an invalid
pointer (or at least garbage) after the wiphy had been renamed.

Fix this by making a copy of the rfkill name in rfkill_alloc().
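
The copy relies on a flexible array member at the end of the structure. A user-space sketch of the same allocation pattern, using calloc() in place of kzalloc() and a hypothetical struct toy_rfkill:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cut-down structure; the name lives in the same allocation. */
struct toy_rfkill {
	int type;
	char name[];	/* flexible array member, must be last */
};

/* Allocates the structure plus room for a private copy of the name. */
static struct toy_rfkill *toy_rfkill_alloc(const char *name, int type)
{
	struct toy_rfkill *r = calloc(1, sizeof(*r) + strlen(name) + 1);

	if (!r)
		return NULL;
	r->type = type;
	strcpy(r->name, name);	/* safe: the buffer was sized from strlen(name) */
	return r;
}
```

Because the name is copied, later changes to the caller's buffer (for example a device rename) no longer affect the stored string, and a single free() releases both the structure and the name.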

Signed-off-by: Johannes Berg 
Signed-off-by: Zefan Li 
---
 net/rfkill/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/rfkill/core.c b/net/rfkill/core.c
index f974961..feef1a45 100644
--- a/net/rfkill/core.c
+++ b/net/rfkill/core.c
@@ -51,7 +51,6 @@
 struct rfkill {
 	spinlock_t		lock;
 
-	const char		*name;
 	enum rfkill_type	type;
 
 	unsigned long		state;
@@ -75,6 +74,7 @@ struct rfkill {
 	struct delayed_work	poll_work;
 	struct work_struct	uevent_work;
 	struct work_struct	sync_work;
+	char			name[];
 };
 #define to_rfkill(d)	container_of(d, struct rfkill, dev)
 
@@ -849,14 +849,14 @@ struct rfkill * __must_check rfkill_alloc(const char *name,
 	if (WARN_ON(type == RFKILL_TYPE_ALL || type >= NUM_RFKILL_TYPES))
 		return NULL;
 
-	rfkill = kzalloc(sizeof(*rfkill), GFP_KERNEL);
+	rfkill = kzalloc(sizeof(*rfkill) + strlen(name) + 1, GFP_KERNEL);
 	if (!rfkill)
 		return NULL;
 
 	spin_lock_init(&rfkill->lock);
 	INIT_LIST_HEAD(&rfkill->node);
 	rfkill->type = type;
-	rfkill->name = name;
+	strcpy(rfkill->name, name);
 	rfkill->ops = ops;
 	rfkill->data = ops_data;
 
-- 
1.9.1



[PATCH 3.4 108/125] mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()

2016-10-12 Thread lizf
From: Andrew Banman 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 5f0f2887f4de9508dcf438deab28f1de8070c271 upstream.

test_pages_in_a_zone() does not account for the possibility of missing
sections in the given pfn range.  pfn_valid_within always returns 1 when
CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing
sections to pass the test, leading to a kernel oops.

Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check
for missing sections before proceeding into the zone-check code.
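
The idea of checking section presence before touching any page can be shown with a simplified user-space model. This is not the loop from the patch: it walks page by page, uses a made-up section size of 4 pages, and struct toy_mem and its arrays are hypothetical.

```c
#define TOY_PAGES_PER_SECTION 4UL	/* made-up section size */

/* Hypothetical memory map: per-section presence and per-pfn zone id. */
struct toy_mem {
	const int *section_present;
	const int *zone_of_pfn;
};

/* Returns 1 if all pages in present sections of [start, end) share a zone. */
static int toy_pages_in_one_zone(const struct toy_mem *m,
				 unsigned long start_pfn,
				 unsigned long end_pfn)
{
	unsigned long pfn = start_pfn;
	int zone = -1;

	while (pfn < end_pfn) {
		unsigned long sec = pfn / TOY_PAGES_PER_SECTION;
		unsigned long sec_end = (sec + 1) * TOY_PAGES_PER_SECTION;

		/* Skip a missing section without looking at its pages. */
		if (!m->section_present[sec]) {
			pfn = sec_end;
			continue;
		}
		for (; pfn < sec_end && pfn < end_pfn; pfn++) {
			int z = m->zone_of_pfn[pfn];

			if (zone != -1 && z != zone)
				return 0;
			zone = z;
		}
	}
	return 1;
}
```

In this model a hole in the middle of the range is skipped entirely, so whatever the page entries of a missing section contain never influences the result.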

This also prevents a crash from offlining memory devices with missing
sections.  Despite this, it may be a good idea to keep the related patch
'[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with
missing sections' because missing sections in a memory block may lead to
other problems not covered by the scope of this fix.

Signed-off-by: Andrew Banman 
Acked-by: Alex Thorlton 
Cc: Russ Anderson 
Cc: Alex Thorlton 
Cc: Yinghai Lu 
Cc: Greg KH 
Cc: Seth Jennings 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Zefan Li 
---
 mm/memory_hotplug.c | 31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 09d87b7..223232a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -716,23 +716,30 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
  */
 static int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn)
 {
-	unsigned long pfn;
+	unsigned long pfn, sec_end_pfn;
 	struct zone *zone = NULL;
 	struct page *page;
 	int i;
-	for (pfn = start_pfn;
+	for (pfn = start_pfn, sec_end_pfn = SECTION_ALIGN_UP(start_pfn);
 	     pfn < end_pfn;
-	     pfn += MAX_ORDER_NR_PAGES) {
-		i = 0;
-		/* This is just a CONFIG_HOLES_IN_ZONE check.*/
-		while ((i < MAX_ORDER_NR_PAGES) && !pfn_valid_within(pfn + i))
-			i++;
-		if (i == MAX_ORDER_NR_PAGES)
+	     pfn = sec_end_pfn + 1, sec_end_pfn += PAGES_PER_SECTION) {
+		/* Make sure the memory section is present first */
+		if (!present_section_nr(pfn_to_section_nr(pfn)))
 			continue;
-		page = pfn_to_page(pfn + i);
-		if (zone && page_zone(page) != zone)
-			return 0;
-		zone = page_zone(page);
+		for (; pfn < sec_end_pfn && pfn < end_pfn;
+		     pfn += MAX_ORDER_NR_PAGES) {
+			i = 0;
+			/* This is just a CONFIG_HOLES_IN_ZONE check.*/
+			while ((i < MAX_ORDER_NR_PAGES) &&
+					!pfn_valid_within(pfn + i))
+				i++;
+			if (i == MAX_ORDER_NR_PAGES)
+				continue;
+			page = pfn_to_page(pfn + i);
+			if (zone && page_zone(page) != zone)
+				return 0;
+			zone = page_zone(page);
+		}
 	}
 	return 1;
 }
-- 
1.9.1



[PATCH 3.4 075/125] rfkill: copy the name into the rfkill struct

2016-10-12 Thread lizf
From: Johannes Berg 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit b7bb110008607a915298bf0f47d25886ecb94477 upstream.

Some users of rfkill, like NFC and cfg80211, use a dynamic name when
allocating rfkill, in those cases dev_name(). Therefore, the pointer
passed to rfkill_alloc() might not be valid forever, I specifically
found the case that the rfkill name was quite obviously an invalid
pointer (or at least garbage) when the wiphy had been renamed.

Fix this by making a copy of the rfkill name in rfkill_alloc().

Signed-off-by: Johannes Berg 
Signed-off-by: Zefan Li 
---
 net/rfkill/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/rfkill/core.c b/net/rfkill/core.c
index f974961..feef1a45 100644
--- a/net/rfkill/core.c
+++ b/net/rfkill/core.c
@@ -51,7 +51,6 @@
 struct rfkill {
spinlock_t  lock;
 
-   const char  *name;
enum rfkill_type type;
 
unsigned long   state;
@@ -75,6 +74,7 @@ struct rfkill {
struct delayed_work poll_work;
struct work_struct  uevent_work;
struct work_struct  sync_work;
+   char name[];
 };
 #define to_rfkill(d)   container_of(d, struct rfkill, dev)
 
@@ -849,14 +849,14 @@ struct rfkill * __must_check rfkill_alloc(const char *name,
if (WARN_ON(type == RFKILL_TYPE_ALL || type >= NUM_RFKILL_TYPES))
return NULL;
 
-   rfkill = kzalloc(sizeof(*rfkill), GFP_KERNEL);
+   rfkill = kzalloc(sizeof(*rfkill) + strlen(name) + 1, GFP_KERNEL);
if (!rfkill)
return NULL;
 
spin_lock_init(&rfkill->lock);
INIT_LIST_HEAD(&rfkill->node);
rfkill->type = type;
-   rfkill->name = name;
+   strcpy(rfkill->name, name);
rfkill->ops = ops;
rfkill->data = ops_data;
 
-- 
1.9.1
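
Illustration (not part of the patch): a minimal userspace sketch of the
technique the rfkill fix relies on, namely storing a private copy of the name
in a flexible array member that is allocated together with the struct. The
struct and function names here are hypothetical stand-ins, and calloc() takes
the place of kzalloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct rfkill; only the name handling is modeled. */
struct radio_switch {
	int type;
	char name[];	/* flexible array member: storage follows the struct */
};

/* Allocate the struct plus room for a private copy of name in one block. */
static struct radio_switch *radio_switch_alloc(const char *name, int type)
{
	struct radio_switch *sw;

	assert(name != NULL);

	sw = calloc(1, sizeof(*sw) + strlen(name) + 1);
	if (!sw)
		return NULL;

	sw->type = type;
	strcpy(sw->name, name);	/* fits: the buffer was sized from strlen(name) */
	return sw;
}
```

Because the copy lives inside the allocation, the caller may later rename or
free its own buffer without affecting the stored name, and a single free()
releases both the struct and the name.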



[PATCH 3.4 108/125] mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()

2016-10-12 Thread lizf
From: Andrew Banman 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 5f0f2887f4de9508dcf438deab28f1de8070c271 upstream.

test_pages_in_a_zone() does not account for the possibility of missing
sections in the given pfn range.  pfn_valid_within always returns 1 when
CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing
sections to pass the test, leading to a kernel oops.

Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check
for missing sections before proceeding into the zone-check code.

This also prevents a crash from offlining memory devices with missing
sections.  Despite this, it may be a good idea to keep the related patch
'[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with
missing sections' because missing sections in a memory block may lead to
other problems not covered by the scope of this fix.

Signed-off-by: Andrew Banman 
Acked-by: Alex Thorlton 
Cc: Russ Anderson 
Cc: Alex Thorlton 
Cc: Yinghai Lu 
Cc: Greg KH 
Cc: Seth Jennings 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Zefan Li 
---
 mm/memory_hotplug.c | 31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 09d87b7..223232a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -716,23 +716,30 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
  */
 static int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn)
 {
-   unsigned long pfn;
+   unsigned long pfn, sec_end_pfn;
struct zone *zone = NULL;
struct page *page;
int i;
-   for (pfn = start_pfn;
+   for (pfn = start_pfn, sec_end_pfn = SECTION_ALIGN_UP(start_pfn);
 pfn < end_pfn;
-pfn += MAX_ORDER_NR_PAGES) {
-   i = 0;
-   /* This is just a CONFIG_HOLES_IN_ZONE check.*/
-   while ((i < MAX_ORDER_NR_PAGES) && !pfn_valid_within(pfn + i))
-   i++;
-   if (i == MAX_ORDER_NR_PAGES)
+pfn = sec_end_pfn + 1, sec_end_pfn += PAGES_PER_SECTION) {
+   /* Make sure the memory section is present first */
+   if (!present_section_nr(pfn_to_section_nr(pfn)))
continue;
-   page = pfn_to_page(pfn + i);
-   if (zone && page_zone(page) != zone)
-   return 0;
-   zone = page_zone(page);
+   for (; pfn < sec_end_pfn && pfn < end_pfn;
+pfn += MAX_ORDER_NR_PAGES) {
+   i = 0;
+   /* This is just a CONFIG_HOLES_IN_ZONE check.*/
+   while ((i < MAX_ORDER_NR_PAGES) &&
+   !pfn_valid_within(pfn + i))
+   i++;
+   if (i == MAX_ORDER_NR_PAGES)
+   continue;
+   page = pfn_to_page(pfn + i);
+   if (zone && page_zone(page) != zone)
+   return 0;
+   zone = page_zone(page);
+   }
}
return 1;
 }
-- 
1.9.1
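
Illustration (not part of the patch): a toy userspace model of the idea
described in the commit message above, which is to walk the pfn range one
section at a time and skip sections that are not present before looking at any
page inside them. The section size, the presence table and the per-pfn zone
table are all made-up values, and the inner loop steps page by page rather
than in MAX_ORDER_NR_PAGES strides, so this is a sketch of the control flow
only.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGES_PER_SECTION 4UL	/* toy value; real sections are far larger */
#define NR_SECTIONS 4UL

/* Toy memory model (hypothetical): section presence and a zone id per pfn. */
static const bool section_present[NR_SECTIONS] = { true, false, true, true };
static const int zone_of_pfn[NR_SECTIONS * PAGES_PER_SECTION] = {
	0, 0, 0, 0,	/* section 0 */
	9, 9, 9, 9,	/* section 1: missing, must never be inspected */
	0, 0, 0, 0,	/* section 2 */
	0, 0, 1, 1,	/* section 3: straddles two zones */
};

/* Return 1 if every present page in [start_pfn, end_pfn) is in one zone. */
static int pages_in_one_zone(unsigned long start_pfn, unsigned long end_pfn)
{
	int zone = -1;
	unsigned long pfn = start_pfn;

	assert(end_pfn <= NR_SECTIONS * PAGES_PER_SECTION);

	while (pfn < end_pfn) {
		unsigned long sec = pfn / PAGES_PER_SECTION;
		unsigned long sec_end = (sec + 1) * PAGES_PER_SECTION;

		/* Make sure the section is present before touching its pages. */
		if (!section_present[sec]) {
			pfn = sec_end;
			continue;
		}
		for (; pfn < sec_end && pfn < end_pfn; pfn++) {
			if (zone != -1 && zone_of_pfn[pfn] != zone)
				return 0;
			zone = zone_of_pfn[pfn];
		}
	}
	return 1;
}
```

With this layout, a range that covers sections 0 to 2 passes even though
section 1 is missing, while any range that reaches the last two pages of
section 3 fails because a second zone appears.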



[PATCH 3.4 082/125] mm: hugetlb: call huge_pte_alloc() only if ptep is null

2016-10-12 Thread lizf
From: Naoya Horiguchi <n-horigu...@ah.jp.nec.com>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 0d777df5d8953293be090d9ab5a355db893e8357 upstream.

Currently at the beginning of hugetlb_fault(), we call huge_pte_offset()
and check whether the obtained *ptep is a migration/hwpoison entry or
not.  And if not, then we get to call huge_pte_alloc().  This is racy
because the *ptep could turn into migration/hwpoison entry after the
huge_pte_offset() check.  This race results in BUG_ON in
huge_pte_alloc().

We don't have to call huge_pte_alloc() when huge_pte_offset()
returns non-NULL, so let's fix this bug by moving the code into the else
block.

Note that the *ptep could turn into a migration/hwpoison entry after
this block, but that's not a problem because we have another
!pte_present check later (we never go into hugetlb_no_page() in that
case.)

Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
Signed-off-by: Naoya Horiguchi <n-horigu...@ah.jp.nec.com>
Acked-by: Hillf Danton <hillf...@alibaba-inc.com>
Acked-by: David Rientjes <rient...@google.com>
Cc: Hugh Dickins <hu...@google.com>
Cc: Dave Hansen <dave.han...@intel.com>
Cc: Mel Gorman <mgor...@suse.de>
Cc: Joonsoo Kim <iamjoonsoo@lge.com>
Cc: Mike Kravetz <mike.krav...@oracle.com>
Signed-off-by: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 mm/hugetlb.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e622aab..416cbfd 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2835,12 +2835,12 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
return VM_FAULT_HWPOISON_LARGE |
   VM_FAULT_SET_HINDEX(h - hstates);
+   } else {
+   ptep = huge_pte_alloc(mm, address, huge_page_size(h));
+   if (!ptep)
+   return VM_FAULT_OOM;
}
 
-   ptep = huge_pte_alloc(mm, address, huge_page_size(h));
-   if (!ptep)
-   return VM_FAULT_OOM;
-
/*
 * Serialize hugepage allocation and instantiation, so that we don't
 * get spurious allocation failures if two CPUs race to instantiate
-- 
1.9.1
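
Illustration (not part of the patch): a small userspace sketch of the control
flow the hugetlb fix establishes, where the allocation helper is only called
when the lookup found nothing, and an existing entry is inspected instead of
being re-allocated. The slot table, the helper names and the result codes are
hypothetical; slot_alloc() cannot fail in this toy model, so the OOM branch is
shown only for shape.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 4

enum fault_result { FAULT_OK, FAULT_RETRY_LATER, FAULT_OOM };

/* Hypothetical table: a negative value marks a special (e.g. migration) entry. */
static int slots[NSLOTS];
static int slot_used[NSLOTS];
static int alloc_calls;		/* counts how often the allocator ran */

static int *slot_lookup(int idx)
{
	return slot_used[idx] ? &slots[idx] : NULL;
}

static int *slot_alloc(int idx)
{
	alloc_calls++;
	slot_used[idx] = 1;
	return &slots[idx];
}

static enum fault_result handle_fault(int idx)
{
	int *p;

	assert(idx >= 0 && idx < NSLOTS);

	p = slot_lookup(idx);
	if (p) {
		/* Entry exists: check it, but never call the allocator. */
		if (*p < 0)
			return FAULT_RETRY_LATER;
	} else {
		/* Nothing there yet: this is the only path that allocates. */
		p = slot_alloc(idx);
		if (!p)
			return FAULT_OOM;
	}

	*p = 1;		/* stand-in for the rest of the fault handling */
	return FAULT_OK;
}
```

A second fault on the same slot leaves alloc_calls unchanged, and a slot
holding a special entry is reported back without the allocator ever seeing it.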



[PATCH 3.4 076/125] dm btree: fix bufio buffer leaks in dm_btree_del() error path

2016-10-12 Thread lizf
From: Joe Thornber 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit ed8b45a3679eb49069b094c0711b30833f27c734 upstream.

If dm_btree_del()'s call to push_frame() fails, e.g. due to
btree_node_validator finding invalid metadata, the dm_btree_del() error
path must unlock all frames (which have active dm-bufio buffers) that
were pushed onto the del_stack.

Otherwise, dm_bufio_client_destroy() will BUG_ON() because dm-bufio
buffers have leaked, e.g.:
  device-mapper: bufio: leaked buffer 3, hold count 1, list 0

Signed-off-by: Joe Thornber 
Signed-off-by: Mike Snitzer 
Signed-off-by: Zefan Li 
---
 drivers/md/persistent-data/dm-btree.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/md/persistent-data/dm-btree.c b/drivers/md/persistent-data/dm-btree.c
index 77c615e..c948acf 100644
--- a/drivers/md/persistent-data/dm-btree.c
+++ b/drivers/md/persistent-data/dm-btree.c
@@ -230,6 +230,16 @@ static void pop_frame(struct del_stack *s)
dm_tm_unlock(s->tm, f->b);
 }
 
+static void unlock_all_frames(struct del_stack *s)
+{
+   struct frame *f;
+
+   while (unprocessed_frames(s)) {
+   f = s->spine + s->top--;
+   dm_tm_unlock(s->tm, f->b);
+   }
+}
+
 int dm_btree_del(struct dm_btree_info *info, dm_block_t root)
 {
int r;
@@ -285,9 +295,13 @@ int dm_btree_del(struct dm_btree_info *info, dm_block_t root)
f->current_child = f->nr_children;
}
}
-
 out:
+   if (r) {
+   /* cleanup all frames of del_stack */
+   unlock_all_frames(s);
+   }
kfree(s);
+
return r;
 }
 EXPORT_SYMBOL_GPL(dm_btree_del);
-- 
1.9.1
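
Illustration (not part of the patch): a userspace sketch of the cleanup rule
the dm-btree fix enforces, which is that every frame pushed onto the stack
holds a lock, so an error in the middle of the walk must release all frames
still on the stack. The lock counter, the block numbers and walk() are
hypothetical; a negative block number stands in for a validator failure.

```c
#include <assert.h>

#define MAX_DEPTH 8

struct frame { int block; };

struct del_stack {
	int top;			/* index of the top frame, -1 when empty */
	struct frame spine[MAX_DEPTH];
};

static int locks_held;	/* hypothetical counter standing in for buffer holds */

static void lock_block(int block)   { (void)block; locks_held++; }
static void unlock_block(int block) { (void)block; locks_held--; }

/* Lock a block and push it; fails for invalid blocks or a full stack. */
static int push_frame(struct del_stack *s, int block)
{
	if (block < 0 || s->top + 1 >= MAX_DEPTH)
		return -1;
	lock_block(block);
	s->spine[++s->top].block = block;
	return 0;
}

/* Release every frame that is still on the stack. */
static void unlock_all_frames(struct del_stack *s)
{
	while (s->top >= 0)
		unlock_block(s->spine[s->top--].block);
}

static int walk(const int *blocks, int n)
{
	struct del_stack s = { .top = -1 };
	int r = 0;

	for (int i = 0; i < n; i++) {
		r = push_frame(&s, blocks[i]);
		if (r)
			break;
	}

	/*
	 * In this toy model nothing is popped during the walk, so cleanup
	 * runs on both paths. In the patch only the error path needs it,
	 * because a successful walk has already popped every frame.
	 */
	unlock_all_frames(&s);
	return r;
}
```

After walk() returns, locks_held is back to zero whether or not a push failed
part of the way through, which is the invariant whose violation produced the
"leaked buffer" message quoted in the commit message.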



[PATCH 3.4 000/125] 3.4.113-rc1 review

2016-10-12 Thread lizf
From: Zefan Li 

This is the start of the stable review cycle for the 3.4.113 release.
There are 125 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Fri Oct 14 12:32:05 UTC 2016.
Anything received after that time might be too late.

A combined patch relative to 3.4.112 will be posted as an additional
response to this.  A shortlog and diffstat can be found below.

thanks,

Zefan Li



Aaro Koskinen (1):
  broadcom: fix PHY_ID_BCM5481 entry in the id table

Al Viro (2):
  fix sysvfs symlinks
  9p: ->evict_inode() should kick out ->i_data, not ->i_mapping

Alan Stern (1):
  USB: fix invalid memory access in hub_activate()

Aleksander Morgado (1):
  USB: serial: option: add support for Novatel MiFi USB620L

Alexey Khoroshilov (1):
  USB: whci-hcd: add check for dma mapping error

Andrew Banman (1):
  mm/memory_hotplug.c: check for missing sections in
test_pages_in_a_zone()

Andrey Ryabinin (1):
  ipv6/addrlabel: fix ip6addrlbl_get()

Anson Huang (1):
  ARM: 8471/1: need to save/restore arm register(r11) when it is
corrupted

Arnd Bergmann (1):
  ARM: pxa: remove incorrect __init annotation on pxa27x_set_pwrmode

Ben Hutchings (1):
  USB: ti_usb_3410_502: Fix ID table size

Bjørn Mork (1):
  USB: option: add XS Stick W100-2 from 4G Systems

Boris BREZILLON (1):
  mtd: mtdpart: fix add_mtd_partitions error path

Borislav Petkov (1):
  x86/cpu: Call verify_cpu() after having entered long mode too

Chen Yu (1):
  ACPI: Use correct IRQ when uninstalling ACPI interrupt handler

Christoph Hellwig (1):
  scsi: restart list search after unlock in scsi_remove_target

Chunfeng Yun (1):
  usb: xhci: fix config fail of FS hub behind a HS hub with MTT

Clemens Ladisch (3):
  ALSA: usb-audio: add packet size quirk for the Medeli DD305
  ALSA: usb-audio: prevent CH345 multiport output SysEx corruption
  ALSA: usb-audio: work around CH345 input SysEx corruption

Colin Ian King (1):
  ftrace/scripts: Fix incorrect use of sprintf in recordmcount

Daeho Jeong (1):
  ext4, jbd2: ensure entering into panic after recording an error in
superblock

Dan Carpenter (4):
  mwifiex: fix mwifiex_rdeeprom_read()
  devres: fix a for loop bounds check
  mISDN: fix a loop count
  USB: ipaq.c: fix a timeout loop

Dave Airlie (1):
  drm/radeon: fix hotplug race at startup

David Howells (2):
  FS-Cache: Handle a write to the page immediately beyond the EOF marker
  KEYS: Fix race between read and revoke

David Turner (1):
  ext4: Fix handling of extended tv_sec

David Vrabel (3):
  xen: Add RING_COPY_REQUEST()
  xen-netback: don't use last request to determine minimum Tx credit
  xen-netback: use RING_COPY_REQUEST() throughout

David Woodhouse (1):
  iommu/vt-d: Fix ATSR handling for Root-Complex integrated endpoints

Dmitry Tunin (1):
  Bluetooth: ath3k: Add support of AR3012 0cf3:817b device

Dmitry V. Levin (1):
  x86/signal: Fix restart_syscall number for x32 tasks

Eric Dumazet (5):
  net: fix a race in dst_release()
  tcp: md5: fix lockdep annotation
  af_unix: fix a fatal race with bit fields
  udp: properly support MSG_PEEK with truncated buffers
  tcp: make challenge acks less predictable

Filipe Manana (1):
  Btrfs: fix race leading to BUG_ON when running delalloc for nodatacow

Francesco Ruggeri (1):
  net: possible use after free in dst_release

Helge Deller (1):
  parisc: Fix syscall restarts

Herbert Xu (2):
  crypto: algif_hash - Only export and import on sockets with data
  net: Fix skb csum races when peeking

James Bottomley (2):
  ses: Fix problems with simple enclosures
  ses: fix additional element traversal bug

Jan Kara (3):
  vfs: Make sendfile(2) killable even better
  vfs: Avoid softlockups with sendfile(2)
  jbd2: Fix unreclaimed pages after truncate in data=journal mode

Jason A. Donenfeld (1):
  crypto: skcipher - Copy iv from desc even for 0-len walks

Jeff Layton (1):
  nfs: if we have no valid attrs, then don't declare the attribute cache
valid

Jiri Slaby (1):
  usblp: do not set TASK_INTERRUPTIBLE before lock

Joe Thornber (1):
  dm btree: fix bufio buffer leaks in dm_btree_del() error path

Johan Hovold (1):
  spi: fix parent-device reference leak

Johannes Berg (3):
  mac80211: fix driver RSSI event calculations
  mac80211: mesh: fix call_rcu() usage
  rfkill: copy the name into the rfkill struct

John Stultz (1):
  time: Prevent early expiry of hrtimers[CLOCK_REALTIME] at the leap
second edge

Joseph Qi (1):
  ocfs2: fix BUG when calculate new backup super

Karl Heiss (1):
  sctp: Prevent soft lockup when sctp_accept() is called during a
timeout event

Kees Cook (1):
  mac: validate mac_partition is within sector

Kinglong Mee (2):
  FS-Cache: Increase reference of parent after registering, netfs
success
  FS-Cache: Don't override netfs's primary_index if registering failed

Kirill A. Shutemov (1):
  vgaarb: fix signal handling in vga_get()


[PATCH 3.4 034/125] usb: musb: core: fix order of arguments to ulpi write callback

2016-10-12 Thread lizf
From: Uwe Kleine-König 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 705e63d2b29c8bbf091119084544d353bda70393 upstream.

There is a bit of a mess in the order of arguments to the ulpi write
callback. There is

int ulpi_write(struct ulpi *ulpi, u8 addr, u8 val)

in drivers/usb/common/ulpi.c;

struct usb_phy_io_ops {
...
int (*write)(struct usb_phy *x, u32 val, u32 reg);
}

in include/linux/usb/phy.h.

The callback registered by the musb driver has to comply with the latter,
but up to now had "offset" first, which effectively made the function
broken for correct users. So flip the order and, while at it, also
switch to the parameter names of struct usb_phy_io_ops's write.

Fixes: ffb865b1e460 ("usb: musb: add ulpi access operations")
Signed-off-by: Uwe Kleine-König 
Signed-off-by: Felipe Balbi 
Signed-off-by: Zefan Li 
---
 drivers/usb/musb/musb_core.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c
index d3481c4..db25165 100644
--- a/drivers/usb/musb/musb_core.c
+++ b/drivers/usb/musb/musb_core.c
@@ -131,7 +131,7 @@ static inline struct musb *dev_to_musb(struct device *dev)
 /*-*/
 
 #ifndef CONFIG_BLACKFIN
-static int musb_ulpi_read(struct usb_phy *phy, u32 offset)
+static int musb_ulpi_read(struct usb_phy *phy, u32 reg)
 {
void __iomem *addr = phy->io_priv;
int i = 0;
@@ -150,7 +150,7 @@ static int musb_ulpi_read(struct usb_phy *phy, u32 offset)
 * ULPICarKitControlDisableUTMI after clearing POWER_SUSPENDM.
 */
 
-   musb_writeb(addr, MUSB_ULPI_REG_ADDR, (u8)offset);
+   musb_writeb(addr, MUSB_ULPI_REG_ADDR, (u8)reg);
musb_writeb(addr, MUSB_ULPI_REG_CONTROL,
MUSB_ULPI_REG_REQ | MUSB_ULPI_RDN_WR);
 
@@ -175,7 +175,7 @@ out:
return ret;
 }
 
-static int musb_ulpi_write(struct usb_phy *phy, u32 offset, u32 data)
+static int musb_ulpi_write(struct usb_phy *phy, u32 val, u32 reg)
 {
void __iomem *addr = phy->io_priv;
int i = 0;
@@ -190,8 +190,8 @@ static int musb_ulpi_write(struct usb_phy *phy, u32 offset, u32 data)
power &= ~MUSB_POWER_SUSPENDM;
musb_writeb(addr, MUSB_POWER, power);
 
-   musb_writeb(addr, MUSB_ULPI_REG_ADDR, (u8)offset);
-   musb_writeb(addr, MUSB_ULPI_REG_DATA, (u8)data);
+   musb_writeb(addr, MUSB_ULPI_REG_ADDR, (u8)reg);
+   musb_writeb(addr, MUSB_ULPI_REG_DATA, (u8)val);
musb_writeb(addr, MUSB_ULPI_REG_CONTROL, MUSB_ULPI_REG_REQ);
 
while (!(musb_readb(addr, MUSB_ULPI_REG_CONTROL)
-- 
1.9.1
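
Illustration (not part of the patch): a userspace sketch of why the argument
order matters for a callback like the one fixed above. Both parameters have
the same integer type, so the compiler cannot catch a swapped (reg, val)
implementation; only the convention of the ops table defines which is which.
The struct and function names are hypothetical miniatures, not the kernel's
usb_phy types.

```c
#include <assert.h>
#include <stdint.h>

struct phy;

/* Hypothetical ops table whose write callback takes (val, reg) in that order. */
struct phy_io_ops {
	int (*write)(struct phy *x, uint32_t val, uint32_t reg);
};

struct phy {
	const struct phy_io_ops *ops;
	uint8_t regs[256];	/* fake register file */
};

/* Implementation that follows the ops-table order: value first, register second. */
static int demo_phy_write(struct phy *x, uint32_t val, uint32_t reg)
{
	assert(reg < sizeof(x->regs));
	x->regs[reg] = (uint8_t)val;
	return 0;
}

static const struct phy_io_ops demo_ops = { .write = demo_phy_write };

/* Generic caller: passes arguments exactly as the ops table documents them. */
static int phy_write(struct phy *x, uint32_t val, uint32_t reg)
{
	return x->ops->write(x, val, reg);
}
```

Writing value 0x5a to register 0x04 through phy_write() lands in regs[0x04].
An implementation declared as (reg, val) would compile just as cleanly and
store 0x04 into regs[0x5a] instead.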



[PATCH 3.4 091/125] ftrace/scripts: Have recordmcount copy the object file

2016-10-12 Thread lizf
From: "Steven Rostedt (Red Hat)" 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit a50bd43935586420fb75f4558369eb08566fac5e upstream.

Russell King found that he had weird side effects when compiling the kernel
with hard linked ccache. The reason was that recordmcount modified the
kernel in place via mmap, and when a file gets modified twice by
recordmcount, it will complain about it. To fix this issue, Russell wrote a
patch that checked if the file was hard linked more than once and would
unlink it if it was.

Linus Torvalds was not happy with the fact that recordmcount does this
in-place modification. So instead of doing the unlink only if the file has
two or more hard links, recordmcount now does the unlink every time it
changes something. In other words, it always writes out a fresh copy of the
file if a change was made, and leaves the file alone otherwise.

Signed-off-by: Steven Rostedt 
Signed-off-by: Zefan Li 
---
 scripts/recordmcount.c | 145 +
 1 file changed, 110 insertions(+), 35 deletions(-)

diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c
index 4eb047a..0970379 100644
--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -35,12 +35,17 @@
 
 static int fd_map; /* File descriptor for file being modified. */
 static int mmap_failed; /* Boolean flag. */
-static void *ehdr_curr; /* current ElfXX_Ehdr *  for resource cleanup */
 static char gpfx;  /* prefix for global symbol name (sometimes '_') */
 static struct stat sb; /* Remember .st_size, etc. */
 static jmp_buf jmpenv; /* setjmp/longjmp per-file error escape */
 static const char *altmcount;  /* alternate mcount symbol name */
 static int warn_on_notrace_sect; /* warn when section has mcount not being recorded */
+static void *file_map; /* pointer of the mapped file */
+static void *file_end; /* pointer to the end of the mapped file */
+static int file_updated; /* flag to state file was changed */
+static void *file_ptr; /* current file pointer location */
+static void *file_append; /* added to the end of the file */
+static size_t file_append_size; /* how much is added to end of file */
 
 /* setjmp() return values */
 enum {
@@ -54,10 +59,14 @@ static void
 cleanup(void)
 {
if (!mmap_failed)
-   munmap(ehdr_curr, sb.st_size);
+   munmap(file_map, sb.st_size);
else
-   free(ehdr_curr);
-   close(fd_map);
+   free(file_map);
+   file_map = NULL;
+   free(file_append);
+   file_append = NULL;
+   file_append_size = 0;
+   file_updated = 0;
 }
 
 static void __attribute__((noreturn))
@@ -79,12 +88,22 @@ succeed_file(void)
 static off_t
 ulseek(int const fd, off_t const offset, int const whence)
 {
-   off_t const w = lseek(fd, offset, whence);
-   if (w == (off_t)-1) {
-   perror("lseek");
+   switch (whence) {
+   case SEEK_SET:
+   file_ptr = file_map + offset;
+   break;
+   case SEEK_CUR:
+   file_ptr += offset;
+   break;
+   case SEEK_END:
+   file_ptr = file_map + (sb.st_size - offset);
+   break;
+   }
+   if (file_ptr < file_map) {
+   fprintf(stderr, "lseek: seek before file\n");
fail_file();
}
-   return w;
+   return file_ptr - file_map;
 }
 
 static size_t
@@ -101,12 +120,38 @@ uread(int const fd, void *const buf, size_t const count)
 static size_t
 uwrite(int const fd, void const *const buf, size_t const count)
 {
-   size_t const n = write(fd, buf, count);
-   if (n != count) {
-   perror("write");
-   fail_file();
+   size_t cnt = count;
+   off_t idx = 0;
+
+   file_updated = 1;
+
+   if (file_ptr + count >= file_end) {
+   off_t aoffset = (file_ptr + count) - file_end;
+
+   if (aoffset > file_append_size) {
+   file_append = realloc(file_append, aoffset);
+   file_append_size = aoffset;
+   }
+   if (!file_append) {
+   perror("write");
+   fail_file();
+   }
+   if (file_ptr < file_end) {
+   cnt = file_end - file_ptr;
+   } else {
+   cnt = 0;
+   idx = aoffset - count;
+   }
}
-   return n;
+
+   if (cnt)
+   memcpy(file_ptr, buf, cnt);
+
+   if (cnt < count)
+   memcpy(file_append + idx, buf + cnt, count - cnt);
+
+   file_ptr += count;
+   return count;
 }
 
 static void *
@@ -163,9 +208,7 @@ static int make_nop_x86(void *map, size_t const offset)
  */
 static void *mmap_file(char const *fname)
 {
-   void *addr;
-
-   fd_map = open(fname, O_RDWR);
+   fd_map = open(fname, O_RDONLY);
 

[PATCH 3.4 117/125] ipv6: update ip6_rt_last_gc every time GC is run

2016-10-12 Thread lizf
From: Michal Kubeček 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 49a18d86f66d33a20144ecb5a34bba0d1856b260 upstream.

As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should
hold the last time garbage collector was run so that we should
update it whenever fib6_run_gc() calls fib6_clean_all(), not only
if we got there from ip6_dst_gc().

Signed-off-by: Michal Kubecek 
Signed-off-by: David S. Miller 
Signed-off-by: Zefan Li 
---
 net/ipv6/ip6_fib.c | 6 +-
 net/ipv6/route.c   | 4 +---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index fc5ce6e..e6b7a00 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1595,6 +1595,8 @@ static DEFINE_SPINLOCK(fib6_gc_lock);
 
 void fib6_run_gc(unsigned long expires, struct net *net, bool force)
 {
+   unsigned long now;
+
if (force) {
spin_lock_bh(&fib6_gc_lock);
} else if (!spin_trylock_bh(&fib6_gc_lock)) {
@@ -1607,10 +1609,12 @@ void fib6_run_gc(unsigned long expires, struct net *net, bool force)
gc_args.more = icmp6_dst_gc();
 
fib6_clean_all(net, fib6_age, 0, NULL);
+   now = jiffies;
+   net->ipv6.ip6_rt_last_gc = now;
 
if (gc_args.more)
mod_timer(&net->ipv6.ip6_fib_timer,
- round_jiffies(jiffies
+ round_jiffies(now
+ net->ipv6.sysctl.ip6_rt_gc_interval));
else
del_timer(&net->ipv6.ip6_fib_timer);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 7ab7f8a..28957ba 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1230,7 +1230,6 @@ static void icmp6_clean_all(int (*func)(struct rt6_info *rt, void *arg),
 
 static int ip6_dst_gc(struct dst_ops *ops)
 {
-   unsigned long now = jiffies;
struct net *net = container_of(ops, struct net, ipv6.ip6_dst_ops);
int rt_min_interval = net->ipv6.sysctl.ip6_rt_gc_min_interval;
int rt_max_size = net->ipv6.sysctl.ip6_rt_max_size;
@@ -1240,13 +1239,12 @@ static int ip6_dst_gc(struct dst_ops *ops)
int entries;
 
entries = dst_entries_get_fast(ops);
-   if (time_after(rt_last_gc + rt_min_interval, now) &&
+   if (time_after(rt_last_gc + rt_min_interval, jiffies) &&
entries <= rt_max_size)
goto out;
 
net->ipv6.ip6_rt_gc_expire++;
fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net, entries > rt_max_size);
-   net->ipv6.ip6_rt_last_gc = now;
entries = dst_entries_get_slow(ops);
if (entries < ops->gc_thresh)
net->ipv6.ip6_rt_gc_expire = rt_gc_timeout>>1;
-- 
1.9.1
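
Illustration (not part of the patch): a userspace sketch of the design choice
made above, which is to record the "last GC" timestamp inside the function
that actually runs the collection, so that every caller updates it and the
rate-limited entry point sees runs triggered from elsewhere (for example from
a timer). The struct and function names are hypothetical, time is a plain
counter, and jiffies wraparound (handled by time_after() in the kernel) is
ignored.

```c
#include <assert.h>
#include <stdbool.h>

struct gc_state {
	unsigned long last_gc;		/* time of the most recent collection */
	unsigned long min_interval;	/* minimum spacing for on-demand runs */
	unsigned long runs;
};

/* Every caller goes through here, so the timestamp is recorded here. */
static void run_gc(struct gc_state *st, unsigned long now)
{
	st->runs++;
	st->last_gc = now;
}

/* Rate-limited entry point: skips the run if one happened recently. */
static bool maybe_run_gc(struct gc_state *st, unsigned long now)
{
	assert(now >= st->last_gc);	/* toy model: no wraparound */

	if (now - st->last_gc < st->min_interval)
		return false;
	run_gc(st, now);
	return true;
}
```

If a timer-driven call to run_gc() happens at time 100 with a minimum interval
of 10, an on-demand request at time 105 is skipped and one at time 110 runs.
Had only maybe_run_gc() recorded the timestamp, the request at 105 would have
triggered a redundant collection.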



[PATCH 3.4 112/125] USB: ti_usb_3410_502: Fix ID table size

2016-10-12 Thread lizf
From: Ben Hutchings 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


Commit 35a2fbc941ac ("USB: serial: ti_usb_3410_5052: new device id for
Abbot strip port cable") failed to update the size of the
ti_id_table_3410 array.  This doesn't need to be fixed upstream
following commit d7ece6515e12 ("USB: ti_usb_3410_5052: remove
vendor/product module parameters") but should be fixed in stable
branches older than 3.12.

Backports of commit c9d09dc7ad10 ("USB: serial: ti_usb_3410_5052: add
Abbott strip port ID to combined table as well.") similarly failed to
update the size of the ti_id_table_combined array.

Signed-off-by: Ben Hutchings 
Signed-off-by: Zefan Li 
---
 drivers/usb/serial/ti_usb_3410_5052.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/serial/ti_usb_3410_5052.c b/drivers/usb/serial/ti_usb_3410_5052.c
index 2575779..974c4fa 100644
--- a/drivers/usb/serial/ti_usb_3410_5052.c
+++ b/drivers/usb/serial/ti_usb_3410_5052.c
@@ -164,7 +164,7 @@ static unsigned int product_5052_count;
 /* the array dimension is the number of default entries plus */
 /* TI_EXTRA_VID_PID_COUNT user defined entries plus 1 terminating */
 /* null entry */
-static struct usb_device_id ti_id_table_3410[15+TI_EXTRA_VID_PID_COUNT+1] = {
+static struct usb_device_id ti_id_table_3410[16+TI_EXTRA_VID_PID_COUNT+1] = {
{ USB_DEVICE(TI_VENDOR_ID, TI_3410_PRODUCT_ID) },
{ USB_DEVICE(TI_VENDOR_ID, TI_3410_EZ430_ID) },
{ USB_DEVICE(MTS_VENDOR_ID, MTS_GSM_NO_FW_PRODUCT_ID) },
@@ -190,7 +190,7 @@ static struct usb_device_id ti_id_table_5052[5+TI_EXTRA_VID_PID_COUNT+1] = {
{ USB_DEVICE(TI_VENDOR_ID, TI_5052_FIRMWARE_PRODUCT_ID) },
 };
 
-static struct usb_device_id ti_id_table_combined[19+2*TI_EXTRA_VID_PID_COUNT+1] = {
+static struct usb_device_id ti_id_table_combined[20+2*TI_EXTRA_VID_PID_COUNT+1] = {
{ USB_DEVICE(TI_VENDOR_ID, TI_3410_PRODUCT_ID) },
{ USB_DEVICE(TI_VENDOR_ID, TI_3410_EZ430_ID) },
{ USB_DEVICE(MTS_VENDOR_ID, MTS_GSM_NO_FW_PRODUCT_ID) },
-- 
1.9.1



[PATCH 3.4 097/125] xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled

2016-10-12 Thread lizf
From: Konrad Rzeszutek Wilk 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 56441f3c8e5bd45aab10dd9f8c505dd4bec03b0d upstream.

The guest sequence of:

 a) XEN_PCI_OP_enable_msi
 b) XEN_PCI_OP_enable_msi
 c) XEN_PCI_OP_disable_msi

results in hitting a BUG_ON condition in the msi.c code.

The MSI code uses a dev->msi_list to which it adds MSI entries.
Under the above conditions a BUG_ON() can be hit. The device
passed in to the guest MUST have MSI capability.

Step a) adds the entry to the dev->msi_list and sets msi_enabled.
Step b) adds a second entry, but adding it to SysFS fails (duplicate entry),
so all of the entries are deleted from msi_list and the call returns (with
msi_enabled still set). In step c), pci_disable_msi passes the msi_enabled
check and hits:

BUG_ON(list_empty(dev_to_msi_list(&dev->dev)));

and blows up.

The patch adds a simple check in the XEN_PCI_OP_enable_msi to guard
against that. The check for msix_enabled is not strictly necessary.

This is part of XSA-157.
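
The guard can be modelled in userspace to see how it breaks the a), b), c) sequence. This is only a sketch under stated assumptions: struct fake_dev, fake_pci_enable_msi() and guarded_enable_msi() are hypothetical stand-ins, not the pciback or PCI core API; only the error codes and the order of the checks mirror the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two struct pci_dev flags the patch tests. */
struct fake_dev {
	bool msi_enabled;
	bool msix_enabled;
	int enable_calls;  /* how often the "generic MSI code" was reached */
};

/* Stand-in for pci_enable_msi(): always succeeds in this model. */
static int fake_pci_enable_msi(struct fake_dev *dev)
{
	dev->enable_calls++;
	dev->msi_enabled = true;
	return 0;
}

/* Same ordering of checks as the patched xen_pcibk_enable_msi(). */
static int guarded_enable_msi(struct fake_dev *dev)
{
	if (dev->msi_enabled)
		return -EALREADY;
	if (dev->msix_enabled)
		return -ENXIO;
	return fake_pci_enable_msi(dev);
}
```

With the guard in place, the second enable request never reaches the generic MSI code, so the list of MSI entries cannot be emptied behind a still-set msi_enabled flag.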

Reviewed-by: David Vrabel 
Reviewed-by: Jan Beulich 
Signed-off-by: Konrad Rzeszutek Wilk 
Signed-off-by: Zefan Li 
---
 drivers/xen/xen-pciback/pciback_ops.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/pciback_ops.c b/drivers/xen/xen-pciback/pciback_ops.c
index a751a66..1ab998c 100644
--- a/drivers/xen/xen-pciback/pciback_ops.c
+++ b/drivers/xen/xen-pciback/pciback_ops.c
@@ -143,7 +143,12 @@ int xen_pcibk_enable_msi(struct xen_pcibk_device *pdev,
if (unlikely(verbose_request))
printk(KERN_DEBUG DRV_NAME ": %s: enable MSI\n", pci_name(dev));
 
-   status = pci_enable_msi(dev);
+   if (dev->msi_enabled)
+   status = -EALREADY;
+   else if (dev->msix_enabled)
+   status = -ENXIO;
+   else
+   status = pci_enable_msi(dev);
 
if (status) {
pr_warn_ratelimited(DRV_NAME ": %s: error enabling MSI for guest %u: err %d\n",
-- 
1.9.1



[PATCH 3.4 116/125] sctp: Prevent soft lockup when sctp_accept() is called during a timeout event

2016-10-12 Thread lizf
From: Karl Heiss 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 635682a14427d241bab7bbdeebb48a7d7b91638e upstream.

A case can occur when sctp_accept() is called by the user during
a heartbeat timeout event after the 4-way handshake.  Since
sctp_assoc_migrate() changes both assoc->base.sk and assoc->ep, the
bh_sock_lock in sctp_generate_heartbeat_event() will be taken with
the listening socket but released with the new association socket.
The result is a deadlock on any future attempts to take the listening
socket lock.

Note that this race can occur with other SCTP timeouts that take
the bh_lock_sock() in the event sctp_accept() is called.

 BUG: soft lockup - CPU#9 stuck for 67s! [swapper:0]
 ...
 RIP: 0010:[]  [] _spin_lock+0x1e/0x30
 RSP: 0018:880028323b20  EFLAGS: 0206
 RAX: 0002 RBX: 880028323b20 RCX: 
 RDX:  RSI: 880028323be0 RDI: 8804632c4b48
 RBP: 8100bb93 R08:  R09: 
 R10: 880610662280 R11: 0100 R12: 880028323aa0
 R13: 8804383c3880 R14: 880028323a90 R15: 81534225
 FS:  () GS:88002832() knlGS:
 CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
 CR2: 006df528 CR3: 01a85000 CR4: 06e0
 DR0:  DR1:  DR2: 
 DR3:  DR6: 0ff0 DR7: 0400
 Process swapper (pid: 0, threadinfo 880616b7, task 880616b6cab0)
 Stack:
 880028323c40 a01c2582 880614cfb020 
  0100 0014383a6c44 8804383c3880 880614e93c00
  880614e93c00  8804632c4b00 8804383c38b8
 Call Trace:
 
 [] ? sctp_rcv+0x492/0xa10 [sctp]
 [] ? nf_iterate+0x69/0xb0
 [] ? ip_local_deliver_finish+0x0/0x2d0
 [] ? nf_hook_slow+0x76/0x120
 [] ? ip_local_deliver_finish+0x0/0x2d0
 [] ? ip_local_deliver_finish+0xdd/0x2d0
 [] ? ip_local_deliver+0x98/0xa0
 [] ? ip_rcv_finish+0x12d/0x440
 [] ? ip_rcv+0x275/0x350
 [] ? __netif_receive_skb+0x4ab/0x750
 ...

With lockdep debugging:

 =
 [ BUG: bad unlock balance detected! ]
 -
 CslRx/12087 is trying to release lock (slock-AF_INET) at:
 [] sctp_generate_timeout_event+0x40/0xe0 [sctp]
 but there are no more locks to release!

 other info that might help us debug this:
 2 locks held by CslRx/12087:
 #0:  (&asoc->timers[i]){+.-...}, at: [] run_timer_softirq+0x16f/0x3e0
 #1:  (slock-AF_INET){+.-...}, at: [] sctp_generate_timeout_event+0x23/0xe0 [sctp]

Ensure the socket taken is also the same one that is released by
saving a copy of the socket before entering the timeout event
critical section.
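
The unbalanced lock/unlock and the fix can be shown with a small single-threaded model. It is a sketch only: the fake_* types and the lock counters are hypothetical, and the call to migrate() in the middle of the handler simulates what in the kernel happens concurrently when sctp_accept() runs during the timeout.

```c
#include <assert.h>

/* Hypothetical stand-ins; lock_depth counts lock minus unlock calls. */
struct fake_sock {
	int lock_depth;
};

struct fake_assoc {
	struct fake_sock *sk;  /* plays the role of asoc->base.sk */
};

static void lock_sock_model(struct fake_sock *sk)   { sk->lock_depth++; }
static void unlock_sock_model(struct fake_sock *sk) { sk->lock_depth--; }

/* Stands in for sctp_assoc_migrate(): re-points the association. */
static void migrate(struct fake_assoc *a, struct fake_sock *newsk)
{
	a->sk = newsk;
}

/* Pre-patch pattern: a->sk is read again at unlock time. */
static void timeout_buggy(struct fake_assoc *a, struct fake_sock *newsk)
{
	lock_sock_model(a->sk);
	migrate(a, newsk);           /* simulated race with accept */
	unlock_sock_model(a->sk);    /* releases the wrong socket */
}

/* Patched pattern: the socket is saved once and used for both calls. */
static void timeout_fixed(struct fake_assoc *a, struct fake_sock *newsk)
{
	struct fake_sock *sk = a->sk;

	lock_sock_model(sk);
	migrate(a, newsk);
	unlock_sock_model(sk);
}
```

In the buggy variant the listening socket stays locked and the new socket is unlocked without having been locked, which corresponds to the soft lockup and the "bad unlock balance" report quoted above.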

Signed-off-by: Karl Heiss 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.2:
 - Net namespaces are not used
 - Keep using sctp_bh_{,un}lock_sock()
 - Adjust context]
Signed-off-by: Ben Hutchings 
Signed-off-by: Zefan Li 
---
 net/sctp/sm_sideeffect.c | 34 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 5fa033a..06c75b1 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -249,11 +249,12 @@ void sctp_generate_t3_rtx_event(unsigned long peer)
int error;
struct sctp_transport *transport = (struct sctp_transport *) peer;
struct sctp_association *asoc = transport->asoc;
+   struct sock *sk = asoc->base.sk;
 
/* Check whether a task is in the sock.  */
 
-   sctp_bh_lock_sock(asoc->base.sk);
-   if (sock_owned_by_user(asoc->base.sk)) {
+   sctp_bh_lock_sock(sk);
+   if (sock_owned_by_user(sk)) {
SCTP_DEBUG_PRINTK("%s:Sock is busy.\n", __func__);
 
/* Try again later.  */
@@ -276,10 +277,10 @@ void sctp_generate_t3_rtx_event(unsigned long peer)
   transport, GFP_ATOMIC);
 
if (error)
-   asoc->base.sk->sk_err = -error;
+   sk->sk_err = -error;
 
 out_unlock:
-   sctp_bh_unlock_sock(asoc->base.sk);
+   sctp_bh_unlock_sock(sk);
sctp_transport_put(transport);
 }
 
@@ -289,10 +290,11 @@ out_unlock:
 static void sctp_generate_timeout_event(struct sctp_association *asoc,
sctp_event_timeout_t timeout_type)
 {
+   struct sock *sk = asoc->base.sk;
int error = 0;
 
-   sctp_bh_lock_sock(asoc->base.sk);
-   if (sock_owned_by_user(asoc->base.sk)) {
+   sctp_bh_lock_sock(sk);
+   if (sock_owned_by_user(sk)) {
SCTP_DEBUG_PRINTK("%s:Sock is busy: timer %d\n",
  __func__,
  timeout_type);
@@ -316,10 +318,10 @@ static 

[PATCH 3.4 118/125] ipv6: don't call fib6_run_gc() until routing is ready

2016-10-12 Thread lizf
From: Michal Kubeček 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 2c861cc65ef4604011a0082e4dcdba2819aa191a upstream.

When loading the ipv6 module, ndisc_init() is called before
ip6_route_init(). As the former registers a handler calling
fib6_run_gc(), this opens a window to run the garbage collector
before necessary data structures are initialized. If a network
device is initialized in this window, adding MAC address to it
triggers a NETDEV_CHANGEADDR event, leading to a crash in
fib6_clean_all().

Take the event handler registration out of ndisc_init() into a
separate function ndisc_late_init() and move it after
ip6_route_init().
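
The ordering and the matching unwind path can be sketched in plain C. Everything here is a hypothetical model (route_init, late_init, stack_init and the two flags are invented for illustration); it only mirrors the shape of the change: register the notifier after routing is set up, and tear it down before routing on the error path.

```c
#include <assert.h>

static int route_ready;          /* set once "routing" is initialized */
static int notifier_registered;  /* set once the event handler is live */

static int route_init(void)     { route_ready = 1; return 0; }
static void route_cleanup(void) { route_ready = 0; }

/* Models ndisc_late_init(): only valid once routing is ready. */
static int late_init(int fail)
{
	if (fail)
		return -1;
	assert(route_ready);  /* the handler may run from here on */
	notifier_registered = 1;
	return 0;
}

static void late_cleanup(void)  { notifier_registered = 0; }

/* Models the init sequence with goto-based unwinding in reverse order. */
static int stack_init(int fail_late, int fail_after)
{
	int err;

	err = route_init();
	if (err)
		goto route_fail;
	err = late_init(fail_late);
	if (err)
		goto late_fail;
	if (fail_after) {            /* some later init step fails */
		err = -1;
		goto after_fail;
	}
	return 0;

after_fail:
	late_cleanup();
late_fail:
	route_cleanup();
route_fail:
	return err;
}
```

The labels are arranged so that a failure at any step undoes exactly the steps that already succeeded, which is why the patch adds both ndisc_late_cleanup() and a new ndisc_late_fail label.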

Signed-off-by: Michal Kubecek 
Signed-off-by: David S. Miller 
Signed-off-by: Zefan Li 
---
 include/net/ndisc.h |  2 ++
 net/ipv6/af_inet6.c |  6 ++++++
 net/ipv6/ndisc.c    | 18 +++++++++++-------
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/include/net/ndisc.h b/include/net/ndisc.h
index 6f9c25a..cd205e9 100644
--- a/include/net/ndisc.h
+++ b/include/net/ndisc.h
@@ -117,7 +117,9 @@ static inline struct neighbour *__ipv6_neigh_lookup(struct neigh_table *tbl, str
 }
 
 extern int ndisc_init(void);
+extern int ndisc_late_init(void);
 
+extern void ndisc_late_cleanup(void);
 extern void ndisc_cleanup(void);
 
 extern int ndisc_rcv(struct sk_buff *skb);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 5300ef3..8ddb56f 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -1161,6 +1161,9 @@ static int __init inet6_init(void)
err = ip6_route_init();
if (err)
goto ip6_route_fail;
+   err = ndisc_late_init();
+   if (err)
+   goto ndisc_late_fail;
err = ip6_flowlabel_init();
if (err)
goto ip6_flowlabel_fail;
@@ -1221,6 +1224,8 @@ ipv6_exthdrs_fail:
 addrconf_fail:
ip6_flowlabel_cleanup();
 ip6_flowlabel_fail:
+   ndisc_late_cleanup();
+ndisc_late_fail:
ip6_route_cleanup();
 ip6_route_fail:
 #ifdef CONFIG_PROC_FS
@@ -1288,6 +1293,7 @@ static void __exit inet6_exit(void)
ipv6_exthdrs_exit();
addrconf_cleanup();
ip6_flowlabel_cleanup();
+   ndisc_late_cleanup();
ip6_route_cleanup();
 #ifdef CONFIG_PROC_FS
 
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index e235b4c..02e6568 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1867,24 +1867,28 @@ int __init ndisc_init(void)
if (err)
goto out_unregister_pernet;
 #endif
-   err = register_netdevice_notifier(&ndisc_netdev_notifier);
-   if (err)
-   goto out_unregister_sysctl;
 out:
return err;
 
-out_unregister_sysctl:
 #ifdef CONFIG_SYSCTL
-   neigh_sysctl_unregister(&nd_tbl.parms);
 out_unregister_pernet:
-#endif
unregister_pernet_subsys(&ndisc_net_ops);
goto out;
+#endif
 }
 
-void ndisc_cleanup(void)
+int __init ndisc_late_init(void)
+{
+   return register_netdevice_notifier(&ndisc_netdev_notifier);
+}
+
+void ndisc_late_cleanup(void)
 {
unregister_netdevice_notifier(&ndisc_netdev_notifier);
+}
+
+void ndisc_cleanup(void)
+{
 #ifdef CONFIG_SYSCTL
neigh_sysctl_unregister(&nd_tbl.parms);
 #endif
-- 
1.9.1



[PATCH 3.4 111/125] af_unix: fix a fatal race with bit fields

2016-10-12 Thread lizf
From: Eric Dumazet 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 60bc851ae59bfe99be6ee89d6bc50008c85ec75d upstream.

Using bit fields is dangerous on ppc64/sparc64, as the compiler [1]
uses 64bit instructions to manipulate them.
If the 64bit word includes any atomic_t or spinlock_t, we can lose
critical concurrent changes.

This is happening in af_unix, where unix_sk(sk)->gc_candidate/
gc_maybe_cycle/lock share the same 64bit word.

This leads to fatal deadlock, as one/several cpus spin forever
on a spinlock that will never be available again.

A safer way would be to use a long to store flags.
This way we are sure the compiler/arch won't do bad things.

As we own unix_gc_lock spinlock when clearing or setting bits,
we can use the non atomic __set_bit()/__clear_bit().

recursion_level can share the same 64bit location with the spinlock,
as it is set only with this spinlock held.

[1] bug fixed in gcc-4.8.0 :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080
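
The flag layout can be sketched in userspace C. This is a hypothetical model, not the kernel's bitops: set_flag(), clear_flag() and test_flag() are plain, non-atomic helpers written for illustration, and struct fake_unix_sock only imitates the relevant fields. The point is that each update is a load and store of the flags word alone, so a neighbouring lock field is never rewritten.

```c
#include <assert.h>
#include <stdbool.h>

#define GC_CANDIDATE   0  /* bit numbers, as in the patch */
#define GC_MAYBE_CYCLE 1

/* Hypothetical, simplified stand-in for the tail of struct unix_sock. */
struct fake_unix_sock {
	int lock;                      /* plays the role of spinlock_t */
	unsigned char recursion_level;
	unsigned long gc_flags;        /* flags live in their own word */
};

/* Non-atomic helpers; the caller is assumed to hold the GC lock. */
static void set_flag(int nr, unsigned long *flags)
{
	*flags |= 1UL << nr;
}

static void clear_flag(int nr, unsigned long *flags)
{
	*flags &= ~(1UL << nr);
}

static bool test_flag(int nr, const unsigned long *flags)
{
	return (*flags >> nr) & 1UL;
}
```

Non-atomic helpers are sufficient here for the same reason the patch gives for __set_bit()/__clear_bit(): all writers already serialize on one lock.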

Reported-by: Ambrose Feinstein 
Signed-off-by: Eric Dumazet 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Signed-off-by: David S. Miller 
Cc: hejianet 
Signed-off-by: Zefan Li 
---
 include/net/af_unix.h |  5 +++--
 net/unix/garbage.c    | 12 ++++++------
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index d29a576..f3cbf1c 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -56,9 +56,10 @@ struct unix_sock {
    struct list_head link;
    atomic_long_t inflight;
    spinlock_t lock;
-   unsigned int gc_candidate : 1;
-   unsigned int gc_maybe_cycle : 1;
    unsigned char recursion_level;
+   unsigned long gc_flags;
+#define UNIX_GC_CANDIDATE 0
+#define UNIX_GC_MAYBE_CYCLE 1
    struct socket_wq peer_wq;
    wait_queue_t peer_wake;
 };
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index b6f4b99..00d3e56 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -185,7 +185,7 @@ static void scan_inflight(struct sock *x, void (*func)(struct unix_sock *),
 * have been added to the queues after
 * starting the garbage collection
 */
-   if (u->gc_candidate) {
+   if (test_bit(UNIX_GC_CANDIDATE, &u->gc_flags)) {
hit = true;
func(u);
}
@@ -254,7 +254,7 @@ static void inc_inflight_move_tail(struct unix_sock *u)
 * of the list, so that it's checked even if it was already
 * passed over
 */
-   if (u->gc_maybe_cycle)
+   if (test_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags))
list_move_tail(&u->link, &gc_candidates);
 }
 
@@ -315,8 +315,8 @@ void unix_gc(void)
BUG_ON(total_refs < inflight_refs);
if (total_refs == inflight_refs) {
list_move_tail(&u->link, &gc_candidates);
-   u->gc_candidate = 1;
-   u->gc_maybe_cycle = 1;
+   __set_bit(UNIX_GC_CANDIDATE, &u->gc_flags);
+   __set_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags);
}
}
 
@@ -344,7 +344,7 @@ void unix_gc(void)
 
if (atomic_long_read(&u->inflight) > 0) {
list_move_tail(&u->link, &not_cycle_list);
-   u->gc_maybe_cycle = 0;
+   __clear_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags);
scan_children(&u->sk, inc_inflight_move_tail, NULL);
}
}
@@ -356,7 +356,7 @@ void unix_gc(void)
 */
while (!list_empty(&not_cycle_list)) {
u = list_entry(not_cycle_list.next, struct unix_sock, link);
-   u->gc_candidate = 0;
+   __clear_bit(UNIX_GC_CANDIDATE, &u->gc_flags);
list_move_tail(&u->link, &gc_inflight_list);
}
 
-- 
1.9.1



[PATCH 3.4 091/125] ftrace/scripts: Have recordmcount copy the object file

2016-10-12 Thread lizf
From: "Steven Rostedt (Red Hat)" 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit a50bd43935586420fb75f4558369eb08566fac5e upstream.

Russell King found that he had weird side effects when compiling the kernel
with hard linked ccache. The reason was that recordmcount modified the
kernel in place via mmap, and when a file gets modified twice by
recordmcount, it will complain about it. To fix this issue, Russell wrote a
patch that checked if the file was hard linked more than once and would
unlink it if it was.

Linus Torvalds was not happy with the fact that recordmcount does this
in-place modification. With this patch, instead of doing the unlink only if
the file has two or more hard links, recordmcount does the unlink every time.
In other words, it always works on a copy: the modified contents are written
out to a new file, and only if a change was actually made.
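
The "modify a private copy, write out only if changed" idea can be sketched with an in-memory buffer. This is a simplified, hypothetical model (struct image, image_write() and image_needs_writeout() are invented names, and the fixed 64-byte buffer replaces the mapped file plus append area used by the patch); it only shows the bookkeeping, not the actual file I/O.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory image of an object file. */
struct image {
	char data[64];  /* private copy of the file contents */
	size_t size;    /* current logical size */
	size_t pos;     /* current "file pointer" */
	int updated;    /* set as soon as anything is written */
};

/* Write into the private copy instead of the file; may extend the size. */
static size_t image_write(struct image *img, const void *buf, size_t count)
{
	if (img->pos + count > sizeof(img->data))
		return 0;  /* model only: no growth beyond the buffer */
	memcpy(img->data + img->pos, buf, count);
	img->pos += count;
	if (img->pos > img->size)
		img->size = img->pos;
	img->updated = 1;
	return count;
}

/* A fresh file only needs to be written out if something changed. */
static int image_needs_writeout(const struct image *img)
{
	return img->updated;
}
```

Because the original file is never written through the mapping, other hard links to it (for example in a ccache directory) keep their unmodified contents.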

Signed-off-by: Steven Rostedt 
Signed-off-by: Zefan Li 
---
 scripts/recordmcount.c | 145 +
 1 file changed, 110 insertions(+), 35 deletions(-)

diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c
index 4eb047a..0970379 100644
--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -35,12 +35,17 @@
 
 static int fd_map; /* File descriptor for file being modified. */
 static int mmap_failed; /* Boolean flag. */
-static void *ehdr_curr; /* current ElfXX_Ehdr *  for resource cleanup */
 static char gpfx;  /* prefix for global symbol name (sometimes '_') */
 static struct stat sb; /* Remember .st_size, etc. */
 static jmp_buf jmpenv; /* setjmp/longjmp per-file error escape */
 static const char *altmcount;  /* alternate mcount symbol name */
 static int warn_on_notrace_sect; /* warn when section has mcount not being recorded */
+static void *file_map; /* pointer of the mapped file */
+static void *file_end; /* pointer to the end of the mapped file */
+static int file_updated; /* flag to state file was changed */
+static void *file_ptr; /* current file pointer location */
+static void *file_append; /* added to the end of the file */
+static size_t file_append_size; /* how much is added to end of file */
 
 /* setjmp() return values */
 enum {
@@ -54,10 +59,14 @@ static void
 cleanup(void)
 {
if (!mmap_failed)
-   munmap(ehdr_curr, sb.st_size);
+   munmap(file_map, sb.st_size);
else
-   free(ehdr_curr);
-   close(fd_map);
+   free(file_map);
+   file_map = NULL;
+   free(file_append);
+   file_append = NULL;
+   file_append_size = 0;
+   file_updated = 0;
 }
 
 static void __attribute__((noreturn))
@@ -79,12 +88,22 @@ succeed_file(void)
 static off_t
 ulseek(int const fd, off_t const offset, int const whence)
 {
-   off_t const w = lseek(fd, offset, whence);
-   if (w == (off_t)-1) {
-   perror("lseek");
+   switch (whence) {
+   case SEEK_SET:
+   file_ptr = file_map + offset;
+   break;
+   case SEEK_CUR:
+   file_ptr += offset;
+   break;
+   case SEEK_END:
+   file_ptr = file_map + (sb.st_size - offset);
+   break;
+   }
+   if (file_ptr < file_map) {
+   fprintf(stderr, "lseek: seek before file\n");
fail_file();
}
-   return w;
+   return file_ptr - file_map;
 }
 
 static size_t
@@ -101,12 +120,38 @@ uread(int const fd, void *const buf, size_t const count)
 static size_t
 uwrite(int const fd, void const *const buf, size_t const count)
 {
-   size_t const n = write(fd, buf, count);
-   if (n != count) {
-   perror("write");
-   fail_file();
+   size_t cnt = count;
+   off_t idx = 0;
+
+   file_updated = 1;
+
+   if (file_ptr + count >= file_end) {
+   off_t aoffset = (file_ptr + count) - file_end;
+
+   if (aoffset > file_append_size) {
+   file_append = realloc(file_append, aoffset);
+   file_append_size = aoffset;
+   }
+   if (!file_append) {
+   perror("write");
+   fail_file();
+   }
+   if (file_ptr < file_end) {
+   cnt = file_end - file_ptr;
+   } else {
+   cnt = 0;
+   idx = aoffset - count;
+   }
}
-   return n;
+
+   if (cnt)
+   memcpy(file_ptr, buf, cnt);
+
+   if (cnt < count)
+   memcpy(file_append + idx, buf + cnt, count - cnt);
+
+   file_ptr += count;
+   return count;
 }
 
 static void *
@@ -163,9 +208,7 @@ static int make_nop_x86(void *map, size_t const offset)
  */
 static void *mmap_file(char const *fname)
 {
-   void *addr;
-
-   fd_map = open(fname, O_RDWR);
+   fd_map = open(fname, O_RDONLY);
if (fd_map < 0 || fstat(fd_map, &sb) < 0) {
  

[PATCH 3.4 117/125] ipv6: update ip6_rt_last_gc every time GC is run

2016-10-12 Thread lizf
From: Michal Kubeček 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 49a18d86f66d33a20144ecb5a34bba0d1856b260 upstream.

As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should
hold the time the garbage collector was last run, so we should
update it whenever fib6_run_gc() calls fib6_clean_all(), not only
if we got there from ip6_dst_gc().
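
The effect of moving the timestamp update can be modelled as follows. This is a sketch with hypothetical names (struct gc_state, run_gc, maybe_gc); after() follows the same wrap-safe comparison idea as the kernel's time_after(), and the tick values passed in play the role of jiffies.

```c
#include <assert.h>
#include <limits.h>

/* Wrap-safe "a is later than b", in the style of time_after(a, b). */
static int after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical model of the per-namespace GC bookkeeping. */
struct gc_state {
	unsigned long last_gc;  /* plays the role of ip6_rt_last_gc */
	unsigned long runs;
};

/* Models fib6_run_gc(): every run records its own timestamp. */
static void run_gc(struct gc_state *s, unsigned long now)
{
	s->runs++;
	s->last_gc = now;
}

/* Models ip6_dst_gc(): skip if GC ran recently and we are under the limit. */
static int maybe_gc(struct gc_state *s, unsigned long now,
		    unsigned long min_interval, int over_limit)
{
	if (after(s->last_gc + min_interval, now) && !over_limit)
		return 0;
	run_gc(s, now);
	return 1;
}
```

Since the timestamp is written inside run_gc(), a run triggered from any other caller (in the kernel, for instance, the periodic timer) also pushes back the next rate-limited run.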

Signed-off-by: Michal Kubecek 
Signed-off-by: David S. Miller 
Signed-off-by: Zefan Li 
---
 net/ipv6/ip6_fib.c | 6 +++++-
 net/ipv6/route.c   | 4 +---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index fc5ce6e..e6b7a00 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1595,6 +1595,8 @@ static DEFINE_SPINLOCK(fib6_gc_lock);
 
 void fib6_run_gc(unsigned long expires, struct net *net, bool force)
 {
+   unsigned long now;
+
if (force) {
spin_lock_bh(&fib6_gc_lock);
} else if (!spin_trylock_bh(&fib6_gc_lock)) {
@@ -1607,10 +1609,12 @@ void fib6_run_gc(unsigned long expires, struct net *net, bool force)
gc_args.more = icmp6_dst_gc();
 
fib6_clean_all(net, fib6_age, 0, NULL);
+   now = jiffies;
+   net->ipv6.ip6_rt_last_gc = now;
 
if (gc_args.more)
mod_timer(&net->ipv6.ip6_fib_timer,
- round_jiffies(jiffies
+ round_jiffies(now
+ net->ipv6.sysctl.ip6_rt_gc_interval));
else
del_timer(&net->ipv6.ip6_fib_timer);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 7ab7f8a..28957ba 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1230,7 +1230,6 @@ static void icmp6_clean_all(int (*func)(struct rt6_info *rt, void *arg),
 
 static int ip6_dst_gc(struct dst_ops *ops)
 {
-   unsigned long now = jiffies;
struct net *net = container_of(ops, struct net, ipv6.ip6_dst_ops);
int rt_min_interval = net->ipv6.sysctl.ip6_rt_gc_min_interval;
int rt_max_size = net->ipv6.sysctl.ip6_rt_max_size;
@@ -1240,13 +1239,12 @@ static int ip6_dst_gc(struct dst_ops *ops)
int entries;
 
entries = dst_entries_get_fast(ops);
-   if (time_after(rt_last_gc + rt_min_interval, now) &&
+   if (time_after(rt_last_gc + rt_min_interval, jiffies) &&
entries <= rt_max_size)
goto out;
 
net->ipv6.ip6_rt_gc_expire++;
fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net, entries > rt_max_size);
-   net->ipv6.ip6_rt_last_gc = now;
entries = dst_entries_get_slow(ops);
if (entries < ops->gc_thresh)
net->ipv6.ip6_rt_gc_expire = rt_gc_timeout>>1;
-- 
1.9.1



[PATCH 3.4 116/125] sctp: Prevent soft lockup when sctp_accept() is called during a timeout event

2016-10-12 Thread lizf
From: Karl Heiss 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 635682a14427d241bab7bbdeebb48a7d7b91638e upstream.

A case can occur when sctp_accept() is called by the user during
a heartbeat timeout event after the 4-way handshake.  Since
sctp_assoc_migrate() changes both assoc->base.sk and assoc->ep, the
bh_sock_lock in sctp_generate_heartbeat_event() will be taken with
the listening socket but released with the new association socket.
The result is a deadlock on any future attempts to take the listening
socket lock.

Note that this race can occur with other SCTP timeouts that take
the bh_lock_sock() in the event sctp_accept() is called.

 BUG: soft lockup - CPU#9 stuck for 67s! [swapper:0]
 ...
 RIP: 0010:[]  [] _spin_lock+0x1e/0x30
 RSP: 0018:880028323b20  EFLAGS: 0206
 RAX: 0002 RBX: 880028323b20 RCX: 
 RDX:  RSI: 880028323be0 RDI: 8804632c4b48
 RBP: 8100bb93 R08:  R09: 
 R10: 880610662280 R11: 0100 R12: 880028323aa0
 R13: 8804383c3880 R14: 880028323a90 R15: 81534225
 FS:  () GS:88002832() knlGS:
 CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
 CR2: 006df528 CR3: 01a85000 CR4: 06e0
 DR0:  DR1:  DR2: 
 DR3:  DR6: 0ff0 DR7: 0400
 Process swapper (pid: 0, threadinfo 880616b7, task 880616b6cab0)
 Stack:
 880028323c40 a01c2582 880614cfb020 
  0100 0014383a6c44 8804383c3880 880614e93c00
  880614e93c00  8804632c4b00 8804383c38b8
 Call Trace:
 
 [] ? sctp_rcv+0x492/0xa10 [sctp]
 [] ? nf_iterate+0x69/0xb0
 [] ? ip_local_deliver_finish+0x0/0x2d0
 [] ? nf_hook_slow+0x76/0x120
 [] ? ip_local_deliver_finish+0x0/0x2d0
 [] ? ip_local_deliver_finish+0xdd/0x2d0
 [] ? ip_local_deliver+0x98/0xa0
 [] ? ip_rcv_finish+0x12d/0x440
 [] ? ip_rcv+0x275/0x350
 [] ? __netif_receive_skb+0x4ab/0x750
 ...

With lockdep debugging:

 =
 [ BUG: bad unlock balance detected! ]
 -
 CslRx/12087 is trying to release lock (slock-AF_INET) at:
 [] sctp_generate_timeout_event+0x40/0xe0 [sctp]
 but there are no more locks to release!

 other info that might help us debug this:
 2 locks held by CslRx/12087:
 #0:  (&asoc->timers[i]){+.-...}, at: [] run_timer_softirq+0x16f/0x3e0
 #1:  (slock-AF_INET){+.-...}, at: [] sctp_generate_timeout_event+0x23/0xe0 [sctp]

Ensure the socket taken is also the same one that is released by
saving a copy of the socket before entering the timeout event
critical section.

Signed-off-by: Karl Heiss 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.2:
 - Net namespaces are not used
 - Keep using sctp_bh_{,un}lock_sock()
 - Adjust context]
Signed-off-by: Ben Hutchings 
Signed-off-by: Zefan Li 
---
 net/sctp/sm_sideeffect.c | 34 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 5fa033a..06c75b1 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -249,11 +249,12 @@ void sctp_generate_t3_rtx_event(unsigned long peer)
int error;
struct sctp_transport *transport = (struct sctp_transport *) peer;
struct sctp_association *asoc = transport->asoc;
+   struct sock *sk = asoc->base.sk;
 
/* Check whether a task is in the sock.  */
 
-   sctp_bh_lock_sock(asoc->base.sk);
-   if (sock_owned_by_user(asoc->base.sk)) {
+   sctp_bh_lock_sock(sk);
+   if (sock_owned_by_user(sk)) {
SCTP_DEBUG_PRINTK("%s:Sock is busy.\n", __func__);
 
/* Try again later.  */
@@ -276,10 +277,10 @@ void sctp_generate_t3_rtx_event(unsigned long peer)
   transport, GFP_ATOMIC);
 
if (error)
-   asoc->base.sk->sk_err = -error;
+   sk->sk_err = -error;
 
 out_unlock:
-   sctp_bh_unlock_sock(asoc->base.sk);
+   sctp_bh_unlock_sock(sk);
sctp_transport_put(transport);
 }
 
@@ -289,10 +290,11 @@ out_unlock:
 static void sctp_generate_timeout_event(struct sctp_association *asoc,
sctp_event_timeout_t timeout_type)
 {
+   struct sock *sk = asoc->base.sk;
int error = 0;
 
-   sctp_bh_lock_sock(asoc->base.sk);
-   if (sock_owned_by_user(asoc->base.sk)) {
+   sctp_bh_lock_sock(sk);
+   if (sock_owned_by_user(sk)) {
SCTP_DEBUG_PRINTK("%s:Sock is busy: timer %d\n",
  __func__,
  timeout_type);
@@ -316,10 +318,10 @@ static void sctp_generate_timeout_event(struct 
sctp_association *asoc,
   (void 

[PATCH 3.4 118/125] ipv6: don't call fib6_run_gc() until routing is ready

2016-10-12 Thread lizf
From: Michal Kubeček 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 2c861cc65ef4604011a0082e4dcdba2819aa191a upstream.

When loading the ipv6 module, ndisc_init() is called before
ip6_route_init(). As the former registers a handler calling
fib6_run_gc(), this opens a window to run the garbage collector
before necessary data structures are initialized. If a network
device is initialized in this window, adding MAC address to it
triggers a NETDEV_CHANGEADDR event, leading to a crash in
fib6_clean_all().

Take the event handler registration out of ndisc_init() into a
separate function ndisc_late_init() and move it after
ip6_route_init().

Signed-off-by: Michal Kubecek 
Signed-off-by: David S. Miller 
Signed-off-by: Zefan Li 
---
 include/net/ndisc.h |  2 ++
 net/ipv6/af_inet6.c |  6 ++
 net/ipv6/ndisc.c| 18 +++---
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/include/net/ndisc.h b/include/net/ndisc.h
index 6f9c25a..cd205e9 100644
--- a/include/net/ndisc.h
+++ b/include/net/ndisc.h
@@ -117,7 +117,9 @@ static inline struct neighbour *__ipv6_neigh_lookup(struct 
neigh_table *tbl, str
 }
 
 extern int ndisc_init(void);
+extern int ndisc_late_init(void);
 
+extern voidndisc_late_cleanup(void);
 extern voidndisc_cleanup(void);
 
 extern int ndisc_rcv(struct sk_buff *skb);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 5300ef3..8ddb56f 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -1161,6 +1161,9 @@ static int __init inet6_init(void)
err = ip6_route_init();
if (err)
goto ip6_route_fail;
+   err = ndisc_late_init();
+   if (err)
+   goto ndisc_late_fail;
err = ip6_flowlabel_init();
if (err)
goto ip6_flowlabel_fail;
@@ -1221,6 +1224,8 @@ ipv6_exthdrs_fail:
 addrconf_fail:
ip6_flowlabel_cleanup();
 ip6_flowlabel_fail:
+   ndisc_late_cleanup();
+ndisc_late_fail:
ip6_route_cleanup();
 ip6_route_fail:
 #ifdef CONFIG_PROC_FS
@@ -1288,6 +1293,7 @@ static void __exit inet6_exit(void)
ipv6_exthdrs_exit();
addrconf_cleanup();
ip6_flowlabel_cleanup();
+   ndisc_late_cleanup();
ip6_route_cleanup();
 #ifdef CONFIG_PROC_FS
 
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index e235b4c..02e6568 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1867,24 +1867,28 @@ int __init ndisc_init(void)
if (err)
goto out_unregister_pernet;
 #endif
-   err = register_netdevice_notifier(_netdev_notifier);
-   if (err)
-   goto out_unregister_sysctl;
 out:
return err;
 
-out_unregister_sysctl:
 #ifdef CONFIG_SYSCTL
-   neigh_sysctl_unregister(_tbl.parms);
 out_unregister_pernet:
-#endif
unregister_pernet_subsys(_net_ops);
goto out;
+#endif
 }
 
-void ndisc_cleanup(void)
+int __init ndisc_late_init(void)
+{
+   return register_netdevice_notifier(_netdev_notifier);
+}
+
+void ndisc_late_cleanup(void)
 {
unregister_netdevice_notifier(&ndisc_netdev_notifier);
+}
+
+void ndisc_cleanup(void)
+{
 #ifdef CONFIG_SYSCTL
neigh_sysctl_unregister(&nd_tbl.parms);
 #endif
-- 
1.9.1



[PATCH 3.4 111/125] af_unix: fix a fatal race with bit fields

2016-10-12 Thread lizf
From: Eric Dumazet 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 60bc851ae59bfe99be6ee89d6bc50008c85ec75d upstream.

Using bit fields is dangerous on ppc64/sparc64, as the compiler [1]
uses 64bit instructions to manipulate them.
If the 64bit word includes any atomic_t or spinlock_t, we can lose
critical concurrent changes.

This is happening in af_unix, where unix_sk(sk)->gc_candidate/
gc_maybe_cycle/lock share the same 64bit word.

This leads to fatal deadlock, as one/several cpus spin forever
on a spinlock that will never be available again.

A safer way is to store the flags in a long.
This way we are sure the compiler/arch won't do bad things.

As we own unix_gc_lock spinlock when clearing or setting bits,
we can use the non atomic __set_bit()/__clear_bit().

recursion_level can share the same 64bit location with the spinlock,
as it is set only with this spinlock held.

[1] bug fixed in gcc-4.8.0 :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080
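
For illustration only (not part of the patch): a minimal user-space sketch
of the "flags in a long" layout described above. The demo_* names are
hypothetical stand-ins for the kernel's __set_bit()/__clear_bit()/test_bit();
like them, the set and clear helpers are non-atomic and assume the caller
already holds the lock that protects gc_flags.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Bit numbers, mirroring UNIX_GC_CANDIDATE / UNIX_GC_MAYBE_CYCLE. */
#define DEMO_GC_CANDIDATE	0
#define DEMO_GC_MAYBE_CYCLE	1

/* Hypothetical stand-in for the relevant part of struct unix_sock. */
struct demo_sock {
	int lock;			/* stand-in for spinlock_t */
	unsigned char recursion_level;
	unsigned long gc_flags;		/* own word, never shares storage with lock */
};

/* Non-atomic: the caller must hold the lock protecting *addr. */
static void demo_set_bit(int nr, unsigned long *addr)
{
	assert(nr >= 0 && nr < (int)(sizeof(*addr) * CHAR_BIT));
	*addr |= 1UL << nr;
}

/* Non-atomic: the caller must hold the lock protecting *addr. */
static void demo_clear_bit(int nr, unsigned long *addr)
{
	assert(nr >= 0 && nr < (int)(sizeof(*addr) * CHAR_BIT));
	*addr &= ~(1UL << nr);
}

static bool demo_test_bit(int nr, const unsigned long *addr)
{
	assert(nr >= 0 && nr < (int)(sizeof(*addr) * CHAR_BIT));
	return (*addr >> nr) & 1UL;
}
```

Because every flag update is a read-modify-write of gc_flags alone, a
compiler that widens the access to 64 bits can no longer touch the
neighbouring lock word.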

Reported-by: Ambrose Feinstein 
Signed-off-by: Eric Dumazet 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Signed-off-by: David S. Miller 
Cc: hejianet 
Signed-off-by: Zefan Li 
---
 include/net/af_unix.h |  5 +++--
 net/unix/garbage.c| 12 ++--
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index d29a576..f3cbf1c 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -56,9 +56,10 @@ struct unix_sock {
struct list_head link;
atomic_long_t   inflight;
spinlock_t  lock;
-   unsigned int gc_candidate : 1;
-   unsigned int gc_maybe_cycle : 1;
unsigned char   recursion_level;
+   unsigned long   gc_flags;
+#define UNIX_GC_CANDIDATE  0
+#define UNIX_GC_MAYBE_CYCLE 1
struct socket_wq peer_wq;
wait_queue_t peer_wake;
 };
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index b6f4b99..00d3e56 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -185,7 +185,7 @@ static void scan_inflight(struct sock *x, void 
(*func)(struct unix_sock *),
 * have been added to the queues after
 * starting the garbage collection
 */
-   if (u->gc_candidate) {
+   if (test_bit(UNIX_GC_CANDIDATE, &u->gc_flags)) {
hit = true;
func(u);
}
@@ -254,7 +254,7 @@ static void inc_inflight_move_tail(struct unix_sock *u)
 * of the list, so that it's checked even if it was already
 * passed over
 */
-   if (u->gc_maybe_cycle)
+   if (test_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags))
list_move_tail(&u->link, &gc_candidates);
 }
 
@@ -315,8 +315,8 @@ void unix_gc(void)
BUG_ON(total_refs < inflight_refs);
if (total_refs == inflight_refs) {
list_move_tail(&u->link, &gc_candidates);
-   u->gc_candidate = 1;
-   u->gc_maybe_cycle = 1;
+   __set_bit(UNIX_GC_CANDIDATE, &u->gc_flags);
+   __set_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags);
}
}
 
@@ -344,7 +344,7 @@ void unix_gc(void)
 
if (atomic_long_read(&u->inflight) > 0) {
list_move_tail(&u->link, &not_cycle_list);
-   u->gc_maybe_cycle = 0;
+   __clear_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags);
scan_children(&u->sk, inc_inflight_move_tail, NULL);
}
}
@@ -356,7 +356,7 @@ void unix_gc(void)
 */
while (!list_empty(&not_cycle_list)) {
u = list_entry(not_cycle_list.next, struct unix_sock, link);
-   u->gc_candidate = 0;
+   __clear_bit(UNIX_GC_CANDIDATE, &u->gc_flags);
list_move_tail(&u->link, &gc_inflight_list);
}
 
-- 
1.9.1



[PATCH 3.4 034/125] usb: musb: core: fix order of arguments to ulpi write callback

2016-10-12 Thread lizf
From: Uwe Kleine-König 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 705e63d2b29c8bbf091119084544d353bda70393 upstream.

There is a bit of a mess in the order of arguments to the ulpi write
callback. There is

int ulpi_write(struct ulpi *ulpi, u8 addr, u8 val)

in drivers/usb/common/ulpi.c;

struct usb_phy_io_ops {
...
int (*write)(struct usb_phy *x, u32 val, u32 reg);
}

in include/linux/usb/phy.h.

The callback registered by the musb driver has to comply with the latter,
but up to now had "offset" first which effectively made the function
broken for correct users. So flip the order and while at it also
switch to the parameter names of struct usb_phy_io_ops's write.
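
For illustration only (not part of the patch): a small user-space sketch of
a write callback that follows the (val, reg) order of struct usb_phy_io_ops.
The demo_* types and the 256-byte register file are assumptions made for
the example; only the argument order is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_phy;

/* Same argument order as struct usb_phy_io_ops: value first, register second. */
struct demo_phy_io_ops {
	int (*read)(struct demo_phy *x, uint32_t reg);
	int (*write)(struct demo_phy *x, uint32_t val, uint32_t reg);
};

struct demo_phy {
	uint8_t regs[256];			/* hypothetical register file */
	const struct demo_phy_io_ops *io_ops;
};

static int demo_ulpi_read(struct demo_phy *x, uint32_t reg)
{
	assert(x != NULL);
	return x->regs[(uint8_t)reg];
}

/* "reg" selects the register, "val" is the byte stored there. */
static int demo_ulpi_write(struct demo_phy *x, uint32_t val, uint32_t reg)
{
	assert(x != NULL);
	x->regs[(uint8_t)reg] = (uint8_t)val;
	return 0;
}

static const struct demo_phy_io_ops demo_ops = {
	.read = demo_ulpi_read,
	.write = demo_ulpi_write,
};
```

A caller that writes 0x5a to register 0x04 passes (0x5a, 0x04); with the
old "offset first" callback the same call would have written 0x04 to
register 0x5a.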

Fixes: ffb865b1e460 ("usb: musb: add ulpi access operations")
Signed-off-by: Uwe Kleine-König 
Signed-off-by: Felipe Balbi 
Signed-off-by: Zefan Li 
---
 drivers/usb/musb/musb_core.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c
index d3481c4..db25165 100644
--- a/drivers/usb/musb/musb_core.c
+++ b/drivers/usb/musb/musb_core.c
@@ -131,7 +131,7 @@ static inline struct musb *dev_to_musb(struct device *dev)
 /*-*/
 
 #ifndef CONFIG_BLACKFIN
-static int musb_ulpi_read(struct usb_phy *phy, u32 offset)
+static int musb_ulpi_read(struct usb_phy *phy, u32 reg)
 {
void __iomem *addr = phy->io_priv;
int i = 0;
@@ -150,7 +150,7 @@ static int musb_ulpi_read(struct usb_phy *phy, u32 offset)
 * ULPICarKitControlDisableUTMI after clearing POWER_SUSPENDM.
 */
 
-   musb_writeb(addr, MUSB_ULPI_REG_ADDR, (u8)offset);
+   musb_writeb(addr, MUSB_ULPI_REG_ADDR, (u8)reg);
musb_writeb(addr, MUSB_ULPI_REG_CONTROL,
MUSB_ULPI_REG_REQ | MUSB_ULPI_RDN_WR);
 
@@ -175,7 +175,7 @@ out:
return ret;
 }
 
-static int musb_ulpi_write(struct usb_phy *phy, u32 offset, u32 data)
+static int musb_ulpi_write(struct usb_phy *phy, u32 val, u32 reg)
 {
void __iomem *addr = phy->io_priv;
int i = 0;
@@ -190,8 +190,8 @@ static int musb_ulpi_write(struct usb_phy *phy, u32 offset, 
u32 data)
power &= ~MUSB_POWER_SUSPENDM;
musb_writeb(addr, MUSB_POWER, power);
 
-   musb_writeb(addr, MUSB_ULPI_REG_ADDR, (u8)offset);
-   musb_writeb(addr, MUSB_ULPI_REG_DATA, (u8)data);
+   musb_writeb(addr, MUSB_ULPI_REG_ADDR, (u8)reg);
+   musb_writeb(addr, MUSB_ULPI_REG_DATA, (u8)val);
musb_writeb(addr, MUSB_ULPI_REG_CONTROL, MUSB_ULPI_REG_REQ);
 
while (!(musb_readb(addr, MUSB_ULPI_REG_CONTROL)
-- 
1.9.1



[PATCH 3.4 051/125] USB: option: add XS Stick W100-2 from 4G Systems

2016-10-12 Thread lizf
From: Bjørn Mork 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 638148e20c7f8f6e95017fdc13bce8549a6925e0 upstream.

Thomas reports
"
4gsystems sells two total different LTE-surfsticks under the same name.
..
The newer version of XS Stick W100 is from "omega"
..
Under windows the driver switches to the same ID, and uses MI03\6 for
network and MI01\6 for modem.
..
echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id
echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id

T:  Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#=  4 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1c9e ProdID=9b01 Rev=02.32
S:  Manufacturer=USB Modem
S:  Product=USB Modem
S:  SerialNumber=
C:  #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I:  If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage

Now all important things are there:

wwp0s29f7u2i3 (net), ttyUSB2 (at), cdc-wdm0 (qmi), ttyUSB1 (at)

There is also ttyUSB0, but it is not usable, at least not for at.

The device works well with qmi and ModemManager-NetworkManager.
"

Reported-by: Thomas Schäfer 
Signed-off-by: Bjørn Mork 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Zefan Li 
---
 drivers/usb/serial/option.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index d5febd4..1852ca6 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -352,6 +352,7 @@ static void option_instat_callback(struct urb *urb);
 /* This is the 4G XS Stick W14 a.k.a. Mobilcom Debitel Surf-Stick *
  * It seems to contain a Qualcomm QSC6240/6290 chipset*/
 #define FOUR_G_SYSTEMS_PRODUCT_W14 0x9603
+#define FOUR_G_SYSTEMS_PRODUCT_W100 0x9b01
 
 /* iBall 3.5G connect wireless modem */
 #define IBALL_3_5G_CONNECT 0x9605
@@ -525,6 +526,11 @@ static const struct option_blacklist_info 
four_g_w14_blacklist = {
.sendsetup = BIT(0) | BIT(1),
 };
 
+static const struct option_blacklist_info four_g_w100_blacklist = {
+   .sendsetup = BIT(1) | BIT(2),
+   .reserved = BIT(3),
+};
+
 static const struct option_blacklist_info alcatel_x200_blacklist = {
.sendsetup = BIT(0) | BIT(1),
.reserved = BIT(4),
@@ -1621,6 +1627,9 @@ static const struct usb_device_id option_ids[] = {
{ USB_DEVICE(LONGCHEER_VENDOR_ID, FOUR_G_SYSTEMS_PRODUCT_W14),
  .driver_info = (kernel_ulong_t)&four_g_w14_blacklist
},
+   { USB_DEVICE(LONGCHEER_VENDOR_ID, FOUR_G_SYSTEMS_PRODUCT_W100),
+ .driver_info = (kernel_ulong_t)&four_g_w100_blacklist
+   },
{ USB_DEVICE_INTERFACE_CLASS(LONGCHEER_VENDOR_ID, 
SPEEDUP_PRODUCT_SU9800, 0xff) },
{ USB_DEVICE(LONGCHEER_VENDOR_ID, ZOOM_PRODUCT_4597) },
{ USB_DEVICE(LONGCHEER_VENDOR_ID, IBALL_3_5G_CONNECT) },
-- 
1.9.1



[PATCH 3.4 120/125] Fix incomplete backport of commit 423f04d63cf4

2016-10-12 Thread lizf
From: Zefan Li 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


Signed-off-by: Zefan Li 
---
 drivers/md/raid1.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index a548eed..a4d994f 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1272,11 +1272,8 @@ static void error(struct mddev *mddev, struct md_rdev 
*rdev)
set_bit(Blocked, &rdev->flags);
spin_lock_irqsave(&conf->device_lock, flags);
if (test_and_clear_bit(In_sync, &rdev->flags)) {
-   unsigned long flags;
-   spin_lock_irqsave(&conf->device_lock, flags);
mddev->degraded++;
set_bit(Faulty, &rdev->flags);
-   spin_unlock_irqrestore(&conf->device_lock, flags);
/*
 * if recovery is running, make sure it aborts.
 */
-- 
1.9.1



[PATCH 3.4 123/125] Revert "USB: Add OTG PET device to TPL"

2016-10-12 Thread lizf
From: Zefan Li 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


This reverts commit 97fa724b23c3dd22e9c0979ad0e9d260cc6d545d.

Conflicts:
drivers/usb/core/quirks.c

Signed-off-by: Zefan Li 
---
 drivers/usb/core/otg_whitelist.h | 5 -
 drivers/usb/core/quirks.c| 4 
 2 files changed, 9 deletions(-)

diff --git a/drivers/usb/core/otg_whitelist.h b/drivers/usb/core/otg_whitelist.h
index 2753cec..e8cdce5 100644
--- a/drivers/usb/core/otg_whitelist.h
+++ b/drivers/usb/core/otg_whitelist.h
@@ -59,11 +59,6 @@ static int is_targeted(struct usb_device *dev)
 le16_to_cpu(dev->descriptor.idProduct) == 0xbadd))
return 0;
 
-   /* OTG PET device is always targeted (see OTG 2.0 ECN 6.4.2) */
-   if ((le16_to_cpu(dev->descriptor.idVendor) == 0x1a0a &&
-le16_to_cpu(dev->descriptor.idProduct) == 0x0200))
-   return 1;
-
/* NOTE: can't use usb_match_id() since interface caches
 * aren't set up yet. this is cut/paste from that code.
 */
diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
index 32e08dc..90f04a8 100644
--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -184,10 +184,6 @@ static const struct usb_device_id 
usb_interface_quirk_list[] = {
{ USB_VENDOR_AND_INTERFACE_INFO(0x046d, USB_CLASS_VIDEO, 1, 0),
  .driver_info = USB_QUIRK_RESET_RESUME },
 
-   /* Protocol and OTG Electrical Test Device */
-   { USB_DEVICE(0x1a0a, 0x0200), .driver_info =
-   USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL },
-
{ }  /* terminating entry must be last */
 };
 
-- 
1.9.1



[PATCH 3.4 040/125] ip6mr: call del_timer_sync() in ip6mr_free_table()

2016-10-12 Thread lizf
From: WANG Cong 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 7ba0c47c34a1ea5bc7a24ca67309996cce0569b5 upstream.

We need to wait for any in-flight timers, since we
are going to free the mrtable right after it.

Cc: Hannes Frederic Sowa 
Signed-off-by: Cong Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Zefan Li 
---
 net/ipv6/ip6mr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index abe1f76..84cf871 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -333,7 +333,7 @@ static struct mr6_table *ip6mr_new_table(struct net *net, 
u32 id)
 
 static void ip6mr_free_table(struct mr6_table *mrt)
 {
-   del_timer(&mrt->ipmr_expire_timer);
+   del_timer_sync(&mrt->ipmr_expire_timer);
mroute_clean_tables(mrt);
kfree(mrt);
 }
-- 
1.9.1



[PATCH 3.4 096/125] xen/pciback: Save xen_pci_op commands before processing it

2016-10-12 Thread lizf
From: Konrad Rzeszutek Wilk 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 8135cf8b092723dbfcc611fe6fdcb3a36c9951c5 upstream.

Double fetch vulnerabilities happen when a variable is
fetched twice from shared memory but a security check is only
performed on the first fetch.

The xen_pcibk_do_op function performs a switch statement on the op->cmd
value, which is stored in shared memory. Depending on the compiler
optimization performed, this can result in a double fetch
vulnerability.

This patch fixes it by saving the xen_pci_op command before
processing it. We also use 'barrier' to make sure that the
compiler does not perform any optimization.
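
For illustration only (not part of the patch): a user-space sketch of the
"copy once, then work on the private copy" pattern described above. The
demo_* names and command values are hypothetical, and C11
atomic_signal_fence() is used here as a stand-in for the kernel's
barrier(); it is a compiler-only fence.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum { DEMO_CMD_READ, DEMO_CMD_WRITE };

/* Hypothetical request layout living in memory shared with a peer. */
struct demo_op {
	unsigned int cmd;
	unsigned int value;
	int err;
};

static int demo_do_op(volatile struct demo_op *shared)
{
	struct demo_op op;

	assert(shared != NULL);

	/* Fetch each field exactly once into private memory. */
	op.cmd = shared->cmd;
	op.value = shared->value;
	op.err = 0;
	/* Compiler barrier: keep the copy above ahead of the checks below. */
	atomic_signal_fence(memory_order_seq_cst);

	/* Validation and dispatch only ever look at the private copy. */
	switch (op.cmd) {
	case DEMO_CMD_READ:
		op.value = 42;	/* arbitrary demo result */
		break;
	case DEMO_CMD_WRITE:
		break;
	default:
		op.err = -1;
		break;
	}

	/* Publish only the result fields back to shared memory. */
	shared->err = op.err;
	shared->value = op.value;
	return op.err;
}
```

Since the switch reads op.cmd from the private copy, a peer that rewrites
shared->cmd after the copy cannot change which branch is taken.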

This is part of XSA155.

Reviewed-by: Konrad Rzeszutek Wilk 
Signed-off-by: Jan Beulich 
Signed-off-by: David Vrabel 
Signed-off-by: Konrad Rzeszutek Wilk 
Signed-off-by: Zefan Li 
---
 drivers/xen/xen-pciback/pciback.h |  1 +
 drivers/xen/xen-pciback/pciback_ops.c | 15 ++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/pciback.h 
b/drivers/xen/xen-pciback/pciback.h
index a7def01..7a642e3 100644
--- a/drivers/xen/xen-pciback/pciback.h
+++ b/drivers/xen/xen-pciback/pciback.h
@@ -37,6 +37,7 @@ struct xen_pcibk_device {
struct xen_pci_sharedinfo *sh_info;
unsigned long flags;
struct work_struct op_work;
+   struct xen_pci_op op;
 };
 
 struct xen_pcibk_dev_data {
diff --git a/drivers/xen/xen-pciback/pciback_ops.c 
b/drivers/xen/xen-pciback/pciback_ops.c
index d52703c..a751a66 100644
--- a/drivers/xen/xen-pciback/pciback_ops.c
+++ b/drivers/xen/xen-pciback/pciback_ops.c
@@ -297,9 +297,11 @@ void xen_pcibk_do_op(struct work_struct *data)
container_of(data, struct xen_pcibk_device, op_work);
struct pci_dev *dev;
struct xen_pcibk_dev_data *dev_data = NULL;
-   struct xen_pci_op *op = &pdev->sh_info->op;
+   struct xen_pci_op *op = &pdev->op;
int test_intx = 0;
 
+   *op = pdev->sh_info->op;
+   barrier();
dev = xen_pcibk_get_pci_dev(pdev, op->domain, op->bus, op->devfn);
 
if (dev == NULL)
@@ -341,6 +343,17 @@ void xen_pcibk_do_op(struct work_struct *data)
if ((dev_data->enable_intx != test_intx))
xen_pcibk_control_isr(dev, 0 /* no reset */);
}
+   pdev->sh_info->op.err = op->err;
+   pdev->sh_info->op.value = op->value;
+#ifdef CONFIG_PCI_MSI
+   if (op->cmd == XEN_PCI_OP_enable_msix && op->err == 0) {
+   unsigned int i;
+
+   for (i = 0; i < op->value; i++)
+   pdev->sh_info->op.msix_entries[i].vector =
+   op->msix_entries[i].vector;
+   }
+#endif
/* Tell the driver domain that we're done. */
wmb();
clear_bit(_XEN_PCIF_active, (unsigned long *)&pdev->sh_info->flags);
-- 
1.9.1



[PATCH 3.4 058/125] sched/core: Clear the root_domain cpumasks in init_rootdomain()

2016-10-12 Thread lizf
From: Xunlei Pang <xlp...@redhat.com>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 8295c69925ad53ec32ca54ac9fc194ff21bc40e2 upstream.

root_domain::rto_mask allocated through alloc_cpumask_var()
contains garbage data, which may cause problems. For instance,
when doing pull_rt_task(), it may do useless iterations if
rto_mask retains some extra garbage bits. Worse still, this
violates the isolated domain rule for clustered scheduling
using cpuset, because tasks (with all the cpus allowed) that
belong to one root domain can be pulled away into another
root domain.

The patch cleans the garbage by using zalloc_cpumask_var()
instead of alloc_cpumask_var() for root_domain::rto_mask
allocation, thereby addressing the issues.

Do the same thing for root_domain's other cpumask members:
dlo_mask, span, and online.
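
For illustration only (not part of the patch): a user-space sketch of a
zero-initialised bitmask allocation, loosely analogous to
zalloc_cpumask_var(). The demo_* helpers are hypothetical; calloc() plays
the role of the zeroing allocator.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_LONGS(nbits) \
	(((nbits) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Allocate a mask of nbits bits with every bit cleared. */
static unsigned long *demo_zalloc_mask(size_t nbits)
{
	return calloc(DEMO_LONGS(nbits), sizeof(unsigned long));
}

static void demo_mask_set(unsigned long *mask, size_t bit)
{
	assert(mask != NULL);
	mask[bit / DEMO_BITS_PER_LONG] |= 1UL << (bit % DEMO_BITS_PER_LONG);
}

/* Count the set bits among the first nbits bits. */
static size_t demo_mask_weight(const unsigned long *mask, size_t nbits)
{
	size_t bit, weight = 0;

	assert(mask != NULL);
	for (bit = 0; bit < nbits; bit++)
		if (mask[bit / DEMO_BITS_PER_LONG] &
		    (1UL << (bit % DEMO_BITS_PER_LONG)))
			weight++;
	return weight;
}
```

A freshly allocated mask has weight 0, so a loop over its set bits does no
work until something is explicitly added.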

Signed-off-by: Xunlei Pang <xlp...@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Mike Galbraith <efa...@gmx.de>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Link: 
http://lkml.kernel.org/r/1449057179-29321-1-git-send-email-xlp...@redhat.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
[lizf: there's no rd->dlo_mask, so remove the change to it]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 kernel/sched/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 609a226..e29d800 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5931,11 +5931,11 @@ static int init_rootdomain(struct root_domain *rd)
 {
memset(rd, 0, sizeof(*rd));
 
-   if (!alloc_cpumask_var(&rd->span, GFP_KERNEL))
+   if (!zalloc_cpumask_var(&rd->span, GFP_KERNEL))
goto out;
-   if (!alloc_cpumask_var(&rd->online, GFP_KERNEL))
+   if (!zalloc_cpumask_var(&rd->online, GFP_KERNEL))
goto free_span;
-   if (!alloc_cpumask_var(&rd->rto_mask, GFP_KERNEL))
+   if (!zalloc_cpumask_var(&rd->rto_mask, GFP_KERNEL))
goto free_online;
 
if (cpupri_init(&rd->cpupri) != 0)
-- 
1.9.1



[PATCH 3.4 090/125] scripts: recordmcount: break hardlinks

2016-10-12 Thread lizf
From: Russell King 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit dd39a26538e37f6c6131e829a4a510787e43c783 upstream.

recordmcount edits the file in-place, which can cause problems when
using ccache in hardlink mode.  Arrange for recordmcount to break a
hardlinked object.
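
For illustration only (not part of the patch): a stand-alone POSIX sketch
of breaking a hard link by re-creating the file under the same name.
demo_break_hardlink() is a hypothetical helper; unlike the patch it reads
the file itself and adds O_EXCL when re-creating it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Give fname its own inode if it is hard-linked. Returns 0 on success. */
static int demo_break_hardlink(const char *fname)
{
	struct stat sb;
	char *buf;
	size_t size, done = 0;
	ssize_t n;
	int fd;

	assert(fname != NULL);
	fd = open(fname, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &sb) < 0) {
		close(fd);
		return -1;
	}
	if (sb.st_nlink == 1) {		/* nothing to break */
		close(fd);
		return 0;
	}

	/* Read the whole file into memory before removing the name. */
	size = (size_t)sb.st_size;
	buf = malloc(size ? size : 1);
	if (!buf) {
		close(fd);
		return -1;
	}
	while (done < size) {
		n = read(fd, buf + done, size - done);
		if (n <= 0) {
			free(buf);
			close(fd);
			return -1;
		}
		done += (size_t)n;
	}
	close(fd);

	/* Drop this name, then create a fresh inode with the same mode. */
	if (unlink(fname) < 0) {
		free(buf);
		return -1;
	}
	fd = open(fname, O_RDWR | O_CREAT | O_EXCL, sb.st_mode & 07777);
	if (fd < 0) {
		free(buf);
		return -1;
	}
	for (done = 0; done < size; done += (size_t)n) {
		n = write(fd, buf + done, size - done);
		if (n <= 0) {
			free(buf);
			close(fd);
			return -1;
		}
	}
	free(buf);
	return close(fd);
}
```

After the call, in-place edits to fname no longer show up through the
other link, which keeps its original contents.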

Link: http://lkml.kernel.org/r/e1a7mvt-et...@rmk-pc.arm.linux.org.uk

Signed-off-by: Russell King 
Signed-off-by: Steven Rostedt 
Signed-off-by: Zefan Li 
---
 scripts/recordmcount.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c
index ee52cb8..4eb047a 100644
--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -182,6 +182,20 @@ static void *mmap_file(char const *fname)
addr = umalloc(sb.st_size);
uread(fd_map, addr, sb.st_size);
}
+   if (sb.st_nlink != 1) {
+   /* file is hard-linked, break the hard link */
+   close(fd_map);
+   if (unlink(fname) < 0) {
+   perror(fname);
+   fail_file();
+   }
+   fd_map = open(fname, O_RDWR | O_CREAT, sb.st_mode);
+   if (fd_map < 0) {
+   perror(fname);
+   fail_file();
+   }
+   uwrite(fd_map, addr, sb.st_size);
+   }
return addr;
 }
 
-- 
1.9.1



[PATCH 3.4 113/125] net: Fix skb csum races when peeking

2016-10-12 Thread lizf
From: Herbert Xu 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


[ Upstream commit 89c22d8c3b278212eef6a8cc66b570bc840a6f5a ]

When we calculate the checksum on the recv path, we store the
result in the skb as an optimisation in case we need the checksum
again down the line.

This is in fact bogus for the MSG_PEEK case as this is done without
any locking.  So multiple threads can peek and then store the result
to the same skb, potentially resulting in bogus skb states.

This patch fixes this by only storing the result if the skb is not
shared.  This preserves the optimisations for the few cases where
it can be done safely due to locking or other reasons, e.g., SIOCINQ.
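
For illustration only (not part of the patch): a user-space sketch of
"cache the verification result only when the buffer is not shared". The
demo_* names, the plain int reference count and the toy 8-bit checksum are
assumptions for the example; the kernel's skb_shared() tests an atomic
user count.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_CSUM_NONE, DEMO_CSUM_UNNECESSARY };

/* Hypothetical, much-reduced stand-in for struct sk_buff. */
struct demo_buf {
	int users;			/* reference count */
	int csum_state;			/* cached verification result */
	const unsigned char *data;
	size_t len;
};

static int demo_shared(const struct demo_buf *b)
{
	return b->users != 1;
}

/* Toy checksum: the byte sum modulo 256 must be zero for valid data. */
static unsigned int demo_checksum_complete(struct demo_buf *b)
{
	unsigned int sum = 0;
	size_t i;

	assert(b != NULL);
	for (i = 0; i < b->len; i++)
		sum += b->data[i];
	sum &= 0xff;

	/* Only write back to the buffer when nobody else can see it. */
	if (sum == 0 && !demo_shared(b))
		b->csum_state = DEMO_CSUM_UNNECESSARY;
	return sum;
}
```

A shared buffer is still verified on every call; it just never has its
cached state written by a caller that does not own it exclusively.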

Signed-off-by: Herbert Xu 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Ben Hutchings 
Signed-off-by: Zefan Li 
---
 net/core/datagram.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/datagram.c b/net/core/datagram.c
index ba96ad9..bc412ca 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -695,7 +695,8 @@ __sum16 __skb_checksum_complete_head(struct sk_buff *skb, 
int len)
if (likely(!sum)) {
if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE))
netdev_rx_csum_fault(skb->dev);
-   skb->ip_summed = CHECKSUM_UNNECESSARY;
+   if (!skb_shared(skb))
+   skb->ip_summed = CHECKSUM_UNNECESSARY;
}
return sum;
 }
-- 
1.9.1



[PATCH 3.4 055/125] vfs: Avoid softlockups with sendfile(2)

2016-10-12 Thread lizf
From: Jan Kara 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit c2489e07c0a71a56fb2c84bc0ee66cddfca7d068 upstream.

The following test program from Dmitry can cause softlockups or RCU
stalls as it copies 1GB from tmpfs into eventfd and we don't have any
scheduling point at that path in sendfile(2) implementation:

int r1 = eventfd(0, 0);
int r2 = memfd_create("", 0);
unsigned long n = 1<<30;
fallocate(r2, 0, 0, n);
sendfile(r1, r2, 0, n);

Add cond_resched() into __splice_from_pipe() to fix the problem.

CC: Dmitry Vyukov 
Signed-off-by: Jan Kara 
Signed-off-by: Al Viro 
Signed-off-by: Zefan Li 
---
 fs/splice.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/splice.c b/fs/splice.c
index 4e2309e..8b97331 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -935,6 +935,7 @@ ssize_t __splice_from_pipe(struct pipe_inode_info *pipe, struct splice_desc *sd,
 
 	splice_from_pipe_begin(sd);
 	do {
+		cond_resched();
 		ret = splice_from_pipe_next(pipe, sd);
 		if (ret > 0)
 			ret = splice_from_pipe_feed(pipe, sd, actor);
-- 
1.9.1



[PATCH 3.4 041/125] net: ip6mr: fix static mfc/dev leaks on table destruction

2016-10-12 Thread lizf
From: Nikolay Aleksandrov 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 4c6980462f32b4f282c5d8e5f7ea8070e2937725 upstream.

Similar to ipv4, when destroying an mrt table the static mfc entries and
the static devices are kept, which leads to devices that can never be
destroyed (because of refcnt taken) and leaked memory. Make sure that
everything is cleaned up on netns destruction.
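
The cleanup rule can be sketched in plain C as follows.  This is only an
illustration under assumed names: demo_entry, DEMO_STATIC and
demo_clean_table() are hypothetical and stand in for the vif/mfc entries,
the VIFF_STATIC/MFC_STATIC flags and mroute_clean_tables().

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_STATIC 0x1u	/* stands in for VIFF_STATIC / MFC_STATIC */

struct demo_entry {
	bool in_use;
	unsigned int flags;
};

/*
 * Remove entries from a table.  With all == false, static entries are
 * kept (the socket-close case); with all == true, every entry is
 * removed (the table-destruction case).  Returns the number removed.
 */
static size_t demo_clean_table(struct demo_entry *tbl, size_t n, bool all)
{
	size_t i, removed = 0;

	for (i = 0; i < n; i++) {
		if (!tbl[i].in_use)
			continue;
		if (!all && (tbl[i].flags & DEMO_STATIC))
			continue;
		tbl[i].in_use = false;
		removed++;
	}
	return removed;
}
```

Passing all == true on the destruction path is what stops static entries
from outliving the table that owns them.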

Fixes: 8229efdaef1e ("netns: ip6mr: enable namespace support in ipv6 multicast forwarding code")
CC: Benjamin Thery 
Signed-off-by: Nikolay Aleksandrov 
Reviewed-by: Cong Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Zefan Li 
---
 net/ipv6/ip6mr.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 84cf871..c5fa9df 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -117,7 +117,7 @@ static int __ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
 			       struct mfc6_cache *c, struct rtmsg *rtm);
 static int ip6mr_rtm_dumproute(struct sk_buff *skb,
 			       struct netlink_callback *cb);
-static void mroute_clean_tables(struct mr6_table *mrt);
+static void mroute_clean_tables(struct mr6_table *mrt, bool all);
 static void ipmr_expire_process(unsigned long arg);
 
 #ifdef CONFIG_IPV6_MROUTE_MULTIPLE_TABLES
@@ -334,7 +334,7 @@ static struct mr6_table *ip6mr_new_table(struct net *net, u32 id)
 static void ip6mr_free_table(struct mr6_table *mrt)
 {
 	del_timer_sync(&mrt->ipmr_expire_timer);
-	mroute_clean_tables(mrt);
+	mroute_clean_tables(mrt, true);
 	kfree(mrt);
 }
 
@@ -1472,7 +1472,7 @@ static int ip6mr_mfc_add(struct net *net, struct mr6_table *mrt,
  * Close the multicast socket, and clear the vif tables etc
  */
 
-static void mroute_clean_tables(struct mr6_table *mrt)
+static void mroute_clean_tables(struct mr6_table *mrt, bool all)
 {
 	int i;
 	LIST_HEAD(list);
@@ -1482,8 +1482,9 @@ static void mroute_clean_tables(struct mr6_table *mrt)
 	 *	Shut down all active vif entries
 	 */
 	for (i = 0; i < mrt->maxvif; i++) {
-		if (!(mrt->vif6_table[i].flags & VIFF_STATIC))
-			mif6_delete(mrt, i, &list);
+		if (!all && (mrt->vif6_table[i].flags & VIFF_STATIC))
+			continue;
+		mif6_delete(mrt, i, &list);
 	}
 	unregister_netdevice_many(&list);
 
@@ -1492,7 +1493,7 @@ static void mroute_clean_tables(struct mr6_table *mrt)
 	 */
 	for (i = 0; i < MFC6_LINES; i++) {
 		list_for_each_entry_safe(c, next, &mrt->mfc6_cache_array[i], list) {
-			if (c->mfc_flags & MFC_STATIC)
+			if (!all && (c->mfc_flags & MFC_STATIC))
 				continue;
 			write_lock_bh(&mrt_lock);
 			list_del(&c->list);
@@ -1546,7 +1547,7 @@ int ip6mr_sk_done(struct sock *sk)
 			net->ipv6.devconf_all->mc_forwarding--;
 			write_unlock_bh(&mrt_lock);
 
-			mroute_clean_tables(mrt);
+			mroute_clean_tables(mrt, false);
 			err = 0;
 			break;
 		}
-- 
1.9.1



[PATCH 3.4 036/125] mac80211: mesh: fix call_rcu() usage

2016-10-12 Thread lizf
From: Johannes Berg 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit c2e703a55245bfff3db53b1f7cbe59f1ee8a4339 upstream.

When using call_rcu(), the called function may be delayed quite
significantly, and without a matching rcu_barrier() there's no
way to be sure it has finished.
Therefore, global state that could be gone/freed/reused should
never be touched in the callback.

Fix this in mesh by moving the atomic_dec() into the caller;
that's not really a problem since we already unlinked the path
and it will be destroyed anyway.
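
A minimal userspace sketch of that ordering, assuming a hand-rolled
deferred list in place of call_rcu(): the owner's counter is adjusted at
delete time, and the deferred callback touches nothing but the node it
frees.  All demo_* names are hypothetical.

```c
#include <stdlib.h>

/* Longer-lived state that may be torn down before callbacks run. */
struct demo_owner {
	int npaths;
};

struct demo_node {
	struct demo_owner *owner;
	struct demo_node *next_deferred;
};

static struct demo_node *demo_deferred;	/* pending "callbacks" */
static int demo_reclaimed;		/* how many nodes were freed */

static struct demo_node *demo_node_add(struct demo_owner *owner)
{
	struct demo_node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->owner = owner;
	owner->npaths++;
	return n;
}

/* Deferred callback: only the node's own memory is touched here. */
static void demo_node_reclaim(struct demo_node *n)
{
	free(n);
	demo_reclaimed++;
}

/* Caller side: update the owner's counter now, defer only the free. */
static void demo_node_del(struct demo_node *n)
{
	n->owner->npaths--;
	n->next_deferred = demo_deferred;
	demo_deferred = n;
}

/* Stand-in for the grace period ending at some later, unknown time. */
static void demo_run_deferred(void)
{
	while (demo_deferred) {
		struct demo_node *n = demo_deferred;

		demo_deferred = n->next_deferred;
		demo_node_reclaim(n);
	}
}
```

Because demo_node_reclaim() never dereferences n->owner, the owner can be
freed between demo_node_del() and demo_run_deferred() without harm.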

This fixes a crash Jouni observed when running certain tests in
a certain order, in which the mesh interface was torn down, the
memory reused for a function pointer (work struct) and running
that then crashed since the pointer had been decremented by 1,
resulting in an invalid instruction byte stream.

Fixes: eb2b9311fd00 ("mac80211: mesh path table implementation")
Reported-by: Jouni Malinen 
Signed-off-by: Johannes Berg 
Signed-off-by: Zefan Li 
---
 net/mac80211/mesh_pathtbl.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/mac80211/mesh_pathtbl.c b/net/mac80211/mesh_pathtbl.c
index 49aaefd..7ed81ee 100644
--- a/net/mac80211/mesh_pathtbl.c
+++ b/net/mac80211/mesh_pathtbl.c
@@ -757,10 +757,8 @@ void mesh_plink_broken(struct sta_info *sta)
 static void mesh_path_node_reclaim(struct rcu_head *rp)
 {
 	struct mpath_node *node = container_of(rp, struct mpath_node, rcu);
-	struct ieee80211_sub_if_data *sdata = node->mpath->sdata;
 
 	del_timer_sync(&node->mpath->timer);
-	atomic_dec(&sdata->u.mesh.mpaths);
 	kfree(node->mpath);
 	kfree(node);
 }
@@ -768,8 +766,9 @@ static void mesh_path_node_reclaim(struct rcu_head *rp)
 /* needs to be called with the corresponding hashwlock taken */
 static void __mesh_path_del(struct mesh_table *tbl, struct mpath_node *node)
 {
-	struct mesh_path *mpath;
-	mpath = node->mpath;
+	struct mesh_path *mpath = node->mpath;
+	struct ieee80211_sub_if_data *sdata = node->mpath->sdata;
+
 	spin_lock(&mpath->state_lock);
 	mpath->flags |= MESH_PATH_RESOLVING;
 	if (mpath->is_gate)
@@ -777,6 +776,7 @@ static void __mesh_path_del(struct mesh_table *tbl, struct mpath_node *node)
 	hlist_del_rcu(&node->list);
 	call_rcu(&node->rcu, mesh_path_node_reclaim);
 	spin_unlock(&mpath->state_lock);
+	atomic_dec(&sdata->u.mesh.mpaths);
 	atomic_dec(&tbl->entries);
 }
 
-- 
1.9.1



[PATCH 3.4 092/125] xen: Add RING_COPY_REQUEST()

2016-10-12 Thread lizf
From: David Vrabel 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 454d5d882c7e412b840e3c99010fe81a9862f6fb upstream.

Using RING_GET_REQUEST() on a shared ring is easy to use incorrectly
(i.e., by not considering that the other end may alter the data in the
shared ring while it is being inspected).  Safe usage of a request
generally requires taking a local copy.

Provide a RING_COPY_REQUEST() macro to use instead of
RING_GET_REQUEST() and an open-coded memcpy().  This takes care of
ensuring that the copy is done correctly regardless of any possible
compiler optimizations.

Use a volatile source to prevent the compiler from reordering or
omitting the copy.
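
To illustrate the pattern outside the kernel, here is a small sketch
that snapshots a slot through a volatile source and then validates only
the local copy.  It is not the ring.h macro: demo_ring, demo_req,
demo_copy_request() and demo_consume() are hypothetical names, and the
two-field request layout is made up for the example.

```c
#include <stdint.h>

#define DEMO_RING_SIZE 8u	/* power of two, so the index can be masked */

struct demo_req {
	uint16_t id;
	uint16_t size;
};

struct demo_ring {
	struct demo_req ring[DEMO_RING_SIZE];	/* writable by the other end */
};

/*
 * Snapshot one slot.  Reading through a volatile pointer makes the
 * compiler perform each field read once, into the local copy.
 */
static void demo_copy_request(const struct demo_ring *shared,
			      unsigned int idx, struct demo_req *req)
{
	const volatile struct demo_req *src =
		&shared->ring[idx & (DEMO_RING_SIZE - 1)];

	req->id = src->id;
	req->size = src->size;
}

/* Validate and hand out a request using the local snapshot only. */
static int demo_consume(const struct demo_ring *shared, unsigned int idx,
			uint16_t max_size, struct demo_req *out)
{
	struct demo_req req;

	demo_copy_request(shared, idx, &req);
	if (req.size > max_size)
		return -1;
	*out = req;	/* later writes to the shared slot cannot change this */
	return 0;
}
```

The check and the use both see the same values, which is the property a
pointer into the shared ring cannot give.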

This is part of XSA155.

Signed-off-by: David Vrabel 
Signed-off-by: Konrad Rzeszutek Wilk 
Signed-off-by: Zefan Li 
---
 include/xen/interface/io/ring.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
index 7d28aff..7dc685b 100644
--- a/include/xen/interface/io/ring.h
+++ b/include/xen/interface/io/ring.h
@@ -181,6 +181,20 @@ struct __name##_back_ring {						\
 #define RING_GET_REQUEST(_r, _idx)					\
     (&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].req))
 
+/*
+ * Get a local copy of a request.
+ *
+ * Use this in preference to RING_GET_REQUEST() so all processing is
+ * done on a local copy that cannot be modified by the other end.
+ *
+ * Note that https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 may cause this
+ * to be ineffective where _req is a struct which consists of only bitfields.
+ */
+#define RING_COPY_REQUEST(_r, _idx, _req) do {				\
+	/* Use volatile to force the copy into _req. */			\
+	*(_req) = *(volatile typeof(_req))RING_GET_REQUEST(_r, _idx);	\
+} while (0)
+
 #define RING_GET_RESPONSE(_r, _idx)					\
     (&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].rsp))
 
-- 
1.9.1



[PATCH 3.4 057/125] wan/x25: Fix use-after-free in x25_asy_open_tty()

2016-10-12 Thread lizf
From: Peter Hurley 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit ee9159ddce14bc1dec9435ae4e3bd3153e783706 upstream.

The N_X25 line discipline may access the previous line discipline's closed
and already-freed private data on open [1].

The tty->disc_data field _never_ refers to valid data on entry to the
line discipline's open() method. Rather, the ldisc is expected to
initialize that field for its own use for the lifetime of the instance
(ie. from open() to close() only).

[1]
[  634.336761] ==================================================================
[  634.338226] BUG: KASAN: use-after-free in x25_asy_open_tty+0x13d/0x490 at addr 8800a743efd0
[  634.339558] Read of size 4 by task syzkaller_execu/8981
[  634.340359] =============================================================================
[  634.341598] BUG kmalloc-512 (Not tainted): kasan: bad access detected
...
[  634.405018] Call Trace:
[  634.405277] dump_stack (lib/dump_stack.c:52)
[  634.405775] print_trailer (mm/slub.c:655)
[  634.406361] object_err (mm/slub.c:662)
[  634.406824] kasan_report_error (mm/kasan/report.c:138 mm/kasan/report.c:236)
[  634.409581] __asan_report_load4_noabort (mm/kasan/report.c:279)
[  634.411355] x25_asy_open_tty (drivers/net/wan/x25_asy.c:559 (discriminator 1))
[  634.413997] tty_ldisc_open.isra.2 (drivers/tty/tty_ldisc.c:447)
[  634.414549] tty_set_ldisc (drivers/tty/tty_ldisc.c:567)
[  634.415057] tty_ioctl (drivers/tty/tty_io.c:2646 drivers/tty/tty_io.c:2879)
[  634.423524] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
[  634.427491] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
[  634.427945] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:188)

Reported-and-tested-by: Sasha Levin 
Signed-off-by: Peter Hurley 
Signed-off-by: David S. Miller 
Signed-off-by: Zefan Li 
---
 drivers/net/wan/x25_asy.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/wan/x25_asy.c b/drivers/net/wan/x25_asy.c
index d7a65e1..dadf085 100644
--- a/drivers/net/wan/x25_asy.c
+++ b/drivers/net/wan/x25_asy.c
@@ -546,16 +546,12 @@ static void x25_asy_receive_buf(struct tty_struct *tty,
 
 static int x25_asy_open_tty(struct tty_struct *tty)
 {
-	struct x25_asy *sl = tty->disc_data;
+	struct x25_asy *sl;
 	int err;
 
 	if (tty->ops->write == NULL)
 		return -EOPNOTSUPP;
 
-	/* First make sure we're not already connected. */
-	if (sl && sl->magic == X25_ASY_MAGIC)
-		return -EEXIST;
-
 	/* OK.  Find a free X.25 channel to use. */
 	sl = x25_asy_alloc();
 	if (sl == NULL)
-- 
1.9.1



[PATCH 3.4 062/125] USB: cp210x: Remove CP2110 ID from compatibility list

2016-10-12 Thread lizf
From: Konstantin Shkolnyy 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 7c90e610b60cd1ed6abafd806acfaedccbbe52d1 upstream.

CP2110 ID (0x10c4, 0xea80) doesn't belong here because it's a HID
and completely different from CP210x devices.

Signed-off-by: Konstantin Shkolnyy 
Signed-off-by: Johan Hovold 
Signed-off-by: Zefan Li 
---
 drivers/usb/serial/cp210x.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c
index 7a04e2c..b48444b 100644
--- a/drivers/usb/serial/cp210x.c
+++ b/drivers/usb/serial/cp210x.c
@@ -138,7 +138,6 @@ static const struct usb_device_id id_table[] = {
 	{ USB_DEVICE(0x10C4, 0xEA60) }, /* Silicon Labs factory default */
 	{ USB_DEVICE(0x10C4, 0xEA61) }, /* Silicon Labs factory default */
 	{ USB_DEVICE(0x10C4, 0xEA70) }, /* Silicon Labs factory default */
-	{ USB_DEVICE(0x10C4, 0xEA80) }, /* Silicon Labs factory default */
 	{ USB_DEVICE(0x10C4, 0xEA71) }, /* Infinity GPS-MIC-1 Radio Monophone */
 	{ USB_DEVICE(0x10C4, 0xF001) }, /* Elan Digital Systems USBscope50 */
 	{ USB_DEVICE(0x10C4, 0xF002) }, /* Elan Digital Systems USBwave12 */
-- 
1.9.1



[PATCH 3.4 094/125] xen-netback: use RING_COPY_REQUEST() throughout

2016-10-12 Thread lizf
From: David Vrabel <david.vra...@citrix.com>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 68a33bfd8403e4e22847165d149823a2e0e67c9c upstream.

Instead of open-coding memcpy()s and directly accessing Tx and Rx
requests, use the new RING_COPY_REQUEST() that ensures the local copy
is correct.

This is more than is strictly necessary for guest Rx requests since
only the id and gref fields are used and it is harmless if the
frontend modifies these.

This is part of XSA155.

Reviewed-by: Wei Liu <wei.l...@citrix.com>
Signed-off-by: David Vrabel <david.vra...@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
[lizf: Backported to 3.4:
 - adjust context
 - s/queue/vif/g]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 drivers/net/xen-netback/netback.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 25d4c31..37bcc56 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -398,17 +398,17 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
 						struct netrx_pending_operations *npo)
 {
 	struct netbk_rx_meta *meta;
-	struct xen_netif_rx_request *req;
+	struct xen_netif_rx_request req;
 
-	req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
+	RING_COPY_REQUEST(&vif->rx, vif->rx.req_cons++, &req);
 
 	meta = npo->meta + npo->meta_prod++;
 	meta->gso_size = 0;
 	meta->size = 0;
-	meta->id = req->id;
+	meta->id = req.id;
 
 	npo->copy_off = 0;
-	npo->copy_gref = req->gref;
+	npo->copy_gref = req.gref;
 
 	return meta;
 }
@@ -510,7 +510,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
 	struct xenvif *vif = netdev_priv(skb->dev);
 	int nr_frags = skb_shinfo(skb)->nr_frags;
 	int i;
-	struct xen_netif_rx_request *req;
+	struct xen_netif_rx_request req;
 	struct netbk_rx_meta *meta;
 	unsigned char *data;
 	int head = 1;
@@ -520,14 +520,14 @@ static int netbk_gop_skb(struct sk_buff *skb,
 
 	/* Set up a GSO prefix descriptor, if necessary */
 	if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
-		req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
+		RING_COPY_REQUEST(&vif->rx, vif->rx.req_cons++, &req);
 		meta = npo->meta + npo->meta_prod++;
 		meta->gso_size = skb_shinfo(skb)->gso_size;
 		meta->size = 0;
-		meta->id = req->id;
+		meta->id = req.id;
 	}
 
-	req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
+	RING_COPY_REQUEST(&vif->rx, vif->rx.req_cons++, &req);
 	meta = npo->meta + npo->meta_prod++;
 
 	if (!vif->gso_prefix)
@@ -536,9 +536,9 @@ static int netbk_gop_skb(struct sk_buff *skb,
 		meta->gso_size = 0;
 
 	meta->size = 0;
-	meta->id = req->id;
+	meta->id = req.id;
 	npo->copy_off = 0;
-	npo->copy_gref = req->gref;
+	npo->copy_gref = req.gref;
 
 	data = skb->data;
 	while (data < skb_tail_pointer(skb)) {
@@ -882,7 +882,7 @@ static void netbk_tx_err(struct xenvif *vif,
 		make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR);
 		if (cons == end)
 			break;
-		txp = RING_GET_REQUEST(&vif->tx, cons++);
+		RING_COPY_REQUEST(&vif->tx, cons++, txp);
 	} while (1);
 	vif->tx.req_cons = cons;
 	xen_netbk_check_rx_xenvif(vif);
@@ -943,8 +943,7 @@ static int netbk_count_requests(struct xenvif *vif,
 			drop_err = -E2BIG;
 		}
 
-		memcpy(txp, RING_GET_REQUEST(&vif->tx, cons + slots),
-		       sizeof(*txp));
+		RING_COPY_REQUEST(&vif->tx, cons + slots, txp);
 
 		/* If the guest submitted a frame >= 64 KiB then
 		 * first->size overflowed and following slots will
@@ -1226,8 +1225,7 @@ static int xen_netbk_get_extras(struct xenvif *vif,
 			return -EBADR;
 		}
 
-		memcpy(&extra, RING_GET_REQUEST(&vif->tx, cons),
-		       sizeof(extra));
+		RING_COPY_REQUEST(&vif->tx, cons, &extra);
 		if (unlikely(!extra.type ||
 			     extra.type >= XEN_NETIF_EXTRA_TYPE_MAX)) {
 			vif->tx.req_cons = ++cons;
@@ -1422,7 +1420,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 
 		idx = vif->tx.req_cons;
 		rmb(); /* Ensure that we see the request before we copy it. */
-		memcpy(&txreq, RING_GET_REQUEST(&vif->tx, idx), sizeof(txreq));
+		RING_COPY_REQUEST(&vif->tx, idx, &txreq);
 
 		/* Credit-based scheduling. */
 		if (txreq.size > vif->remaining_credit &&
-- 
1.9.1



[PATCH 3.4 094/125] xen-netback: use RING_COPY_REQUEST() throughout

2016-10-12 Thread lizf
From: David Vrabel 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 68a33bfd8403e4e22847165d149823a2e0e67c9c upstream.

Instead of open-coding memcpy()s and directly accessing Tx and Rx
requests, use the new RING_COPY_REQUEST() that ensures the local copy
is correct.

This is more than is strictly necessary for guest Rx requests since
only the id and gref fields are used and it is harmless if the
frontend modifies these.

This is part of XSA155.

Reviewed-by: Wei Liu 
Signed-off-by: David Vrabel 
Signed-off-by: Konrad Rzeszutek Wilk 
[lizf: Backported to 3.4:
 - adjust context
 - s/queue/vif/g]
Signed-off-by: Zefan Li 
---
 drivers/net/xen-netback/netback.c | 30 ++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 25d4c31..37bcc56 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -398,17 +398,17 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
struct netrx_pending_operations *npo)
 {
struct netbk_rx_meta *meta;
-   struct xen_netif_rx_request *req;
+   struct xen_netif_rx_request req;
 
-   req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
+   RING_COPY_REQUEST(&vif->rx, vif->rx.req_cons++, &req);
 
meta = npo->meta + npo->meta_prod++;
meta->gso_size = 0;
meta->size = 0;
-   meta->id = req->id;
+   meta->id = req.id;
 
npo->copy_off = 0;
-   npo->copy_gref = req->gref;
+   npo->copy_gref = req.gref;
 
return meta;
 }
@@ -510,7 +510,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
struct xenvif *vif = netdev_priv(skb->dev);
int nr_frags = skb_shinfo(skb)->nr_frags;
int i;
-   struct xen_netif_rx_request *req;
+   struct xen_netif_rx_request req;
struct netbk_rx_meta *meta;
unsigned char *data;
int head = 1;
@@ -520,14 +520,14 @@ static int netbk_gop_skb(struct sk_buff *skb,
 
/* Set up a GSO prefix descriptor, if necessary */
if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
-   req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
+   RING_COPY_REQUEST(&vif->rx, vif->rx.req_cons++, &req);
meta = npo->meta + npo->meta_prod++;
meta->gso_size = skb_shinfo(skb)->gso_size;
meta->size = 0;
-   meta->id = req->id;
+   meta->id = req.id;
}
 
-   req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
+   RING_COPY_REQUEST(&vif->rx, vif->rx.req_cons++, &req);
meta = npo->meta + npo->meta_prod++;
 
if (!vif->gso_prefix)
@@ -536,9 +536,9 @@ static int netbk_gop_skb(struct sk_buff *skb,
meta->gso_size = 0;
 
meta->size = 0;
-   meta->id = req->id;
+   meta->id = req.id;
npo->copy_off = 0;
-   npo->copy_gref = req->gref;
+   npo->copy_gref = req.gref;
 
data = skb->data;
while (data < skb_tail_pointer(skb)) {
@@ -882,7 +882,7 @@ static void netbk_tx_err(struct xenvif *vif,
make_tx_response(vif, txp, XEN_NETIF_RSP_ERROR);
if (cons == end)
break;
-   txp = RING_GET_REQUEST(&vif->tx, cons++);
+   RING_COPY_REQUEST(&vif->tx, cons++, txp);
} while (1);
vif->tx.req_cons = cons;
xen_netbk_check_rx_xenvif(vif);
@@ -943,8 +943,7 @@ static int netbk_count_requests(struct xenvif *vif,
drop_err = -E2BIG;
}
 
-   memcpy(txp, RING_GET_REQUEST(&vif->tx, cons + slots),
-  sizeof(*txp));
+   RING_COPY_REQUEST(&vif->tx, cons + slots, txp);
 
/* If the guest submitted a frame >= 64 KiB then
 * first->size overflowed and following slots will
@@ -1226,8 +1225,7 @@ static int xen_netbk_get_extras(struct xenvif *vif,
return -EBADR;
}
 
-   memcpy(&extra, RING_GET_REQUEST(&vif->tx, cons),
-  sizeof(extra));
+   RING_COPY_REQUEST(&vif->tx, cons, &extra);
if (unlikely(!extra.type ||
 extra.type >= XEN_NETIF_EXTRA_TYPE_MAX)) {
vif->tx.req_cons = ++cons;
@@ -1422,7 +1420,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 
idx = vif->tx.req_cons;
rmb(); /* Ensure that we see the request before we copy it. */
-   memcpy(&txreq, RING_GET_REQUEST(&vif->tx, idx), sizeof(txreq));
+   RING_COPY_REQUEST(&vif->tx, idx, &txreq);
 
/* Credit-based scheduling. */
if (txreq.size > vif->remaining_credit &&
-- 
1.9.1



[PATCH 3.4 060/125] fix sysvfs symlinks

2016-10-12 Thread lizf
From: Al Viro <v...@zeniv.linux.org.uk>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 0ebf7f10d67a70e120f365018f1c5fce9ddc567d upstream.

The thing got broken back in 2002 - sysvfs does *not* have inline
symlinks; even short ones have bodies stored in the first block
of file.  sysv_symlink() handles that correctly; unfortunately,
attempting to look an existing symlink up will end up confusing
them for inline symlinks, and interpret the block number containing
the body as the body itself.

Nobody has noticed until now, which says something about the level
of testing sysvfs gets ;-/

Signed-off-by: Al Viro <v...@zeniv.linux.org.uk>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 fs/sysv/inode.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
index 3da5ce2..fcdd63c 100644
--- a/fs/sysv/inode.c
+++ b/fs/sysv/inode.c
@@ -176,14 +176,8 @@ void sysv_set_inode(struct inode *inode, dev_t rdev)
inode->i_fop = &sysv_dir_operations;
inode->i_mapping->a_ops = &sysv_aops;
} else if (S_ISLNK(inode->i_mode)) {
-   if (inode->i_blocks) {
-   inode->i_op = &sysv_symlink_inode_operations;
-   inode->i_mapping->a_ops = &sysv_aops;
-   } else {
-   inode->i_op = &sysv_fast_symlink_inode_operations;
-   nd_terminate_link(SYSV_I(inode)->i_data, inode->i_size,
-   sizeof(SYSV_I(inode)->i_data) - 1);
-   }
+   inode->i_op = &sysv_symlink_inode_operations;
+   inode->i_mapping->a_ops = &sysv_aops;
} else
init_special_inode(inode, inode->i_mode, rdev);
 }
-- 
1.9.1



[PATCH 3.4 063/125] ext4: Fix handling of extended tv_sec

2016-10-12 Thread lizf
From: David Turner 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit a4dad1ae24f850410c4e60f22823cba1289b8d52 upstream.

In ext4, the bottom two bits of {a,c,m}time_extra are used to extend
the {a,c,m}time fields, deferring the year 2038 problem to the year
2446.

When decoding these extended fields, for times whose bottom 32 bits
would represent a negative number, sign extension causes the 64-bit
extended timestamp to be negative as well, which is not what's
intended.  This patch corrects that issue, so that the only negative
{a,c,m}times are those between 1901 and 1970 (as per 32-bit signed
timestamps).

Some older kernels might have written pre-1970 dates with 1,1 in the
extra bits.  This patch treats those incorrectly-encoded dates as
pre-1970, instead of post-2311, until kernel 4.20 is released.
Hopefully by then e2fsck will have fixed up the bad data.

Also add a comment explaining the encoding of ext4's extra {a,c,m}time
bits.
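
As a rough illustration of the arithmetic described above, the following
user-space sketch encodes the two epoch bits and decodes them again, and
contrasts that with the old OR-based decode. The demo_* names are
hypothetical, int64_t stands in for a 64-bit tv_sec, and the legacy "1,1"
workaround is deliberately not modeled. The casts to int32_t assume the
usual two's-complement wrap-around, which is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_EPOCH_BITS 2
#define DEMO_EPOCH_MASK ((1u << DEMO_EPOCH_BITS) - 1)

/* Low 32 bits, as stored in the on-disk {a,c,m}time field. */
static uint32_t demo_low32(int64_t tv_sec)
{
	return (uint32_t)tv_sec;
}

/*
 * Epoch bits: the number of 2^32 steps between tv_sec and the
 * sign-extended value of its low 32 bits.
 */
static uint32_t demo_encode_epoch(int64_t tv_sec)
{
	int64_t low = (int32_t)(uint32_t)tv_sec;

	/* Only the range 1901-12-13..2446-05-10 is representable. */
	assert(tv_sec >= -0x80000000LL && tv_sec <= 0x37fffffffLL);
	return (uint32_t)((tv_sec - low) >> 32) & DEMO_EPOCH_MASK;
}

/* Fixed decode: sign-extend the low word, then ADD the epoch. */
static int64_t demo_decode(uint32_t low32, uint32_t epoch)
{
	int64_t tv_sec = (int32_t)low32;

	return tv_sec + ((int64_t)(epoch & DEMO_EPOCH_MASK) << 32);
}

/* Old decode: OR-ing into a sign-extended value leaves it negative. */
static int64_t demo_decode_old(uint32_t low32, uint32_t epoch)
{
	uint64_t bits = (uint64_t)(int64_t)(int32_t)low32;

	bits |= (uint64_t)(epoch & DEMO_EPOCH_MASK) << 32;
	return (int64_t)bits;
}
```

For tv_sec = 0x80000000 (2038-01-19) the low word is negative as a signed
32-bit value, so the epoch bits become 1; adding 1 << 32 on decode yields
the original time, while OR-ing it in leaves the result negative.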

Signed-off-by: David Turner 
Signed-off-by: Theodore Ts'o 
Reported-by: Mark Harris 
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23732
Signed-off-by: Zefan Li 
---
 fs/ext4/ext4.h | 51 ---
 1 file changed, 44 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b9cdb6d..aedf75f 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -704,19 +705,55 @@ struct move_extent {
<= (EXT4_GOOD_OLD_INODE_SIZE +  \
(einode)->i_extra_isize))   \
 
+/*
+ * We use an encoding that preserves the times for extra epoch "00":
+ *
+ * extra  msb of                         adjust for signed
+ * epoch  32-bit                         32-bit tv_sec to
+ * bits   time    decoded 64-bit tv_sec  64-bit tv_sec      valid time range
+ * 0 0    1    -0x80000000..-0x00000001  0x000000000 1901-12-13..1969-12-31
+ * 0 0    0    0x000000000..0x07fffffff  0x000000000 1970-01-01..2038-01-19
+ * 0 1    1    0x080000000..0x0ffffffff  0x100000000 2038-01-19..2106-02-07
+ * 0 1    0    0x100000000..0x17fffffff  0x100000000 2106-02-07..2174-02-25
+ * 1 0    1    0x180000000..0x1ffffffff  0x200000000 2174-02-25..2242-03-16
+ * 1 0    0    0x200000000..0x27fffffff  0x200000000 2242-03-16..2310-04-04
+ * 1 1    1    0x280000000..0x2ffffffff  0x300000000 2310-04-04..2378-04-22
+ * 1 1    0    0x300000000..0x37fffffff  0x300000000 2378-04-22..2446-05-10
+ *
+ * Note that previous versions of the kernel on 64-bit systems would
+ * incorrectly use extra epoch bits 1,1 for dates between 1901 and
+ * 1970.  e2fsck will correct this, assuming that it is run on the
+ * affected filesystem before 2242.
+ */
+
 static inline __le32 ext4_encode_extra_time(struct timespec *time)
 {
-   return cpu_to_le32((sizeof(time->tv_sec) > 4 ?
-  (time->tv_sec >> 32) & EXT4_EPOCH_MASK : 0) |
-  ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK));
+   u32 extra = sizeof(time->tv_sec) > 4 ?
+   ((time->tv_sec - (s32)time->tv_sec) >> 32) & EXT4_EPOCH_MASK : 0;
+   return cpu_to_le32(extra | (time->tv_nsec << EXT4_EPOCH_BITS));
 }
 
 static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra)
 {
-   if (sizeof(time->tv_sec) > 4)
-  time->tv_sec |= (__u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK)
-  << 32;
-   time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
+   if (unlikely(sizeof(time->tv_sec) > 4 &&
+   (extra & cpu_to_le32(EXT4_EPOCH_MASK)))) {
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4,20,0)
+   /* Handle legacy encoding of pre-1970 dates with epoch
+* bits 1,1.  We assume that by kernel version 4.20,
+* everyone will have run fsck over the affected
+* filesystems to correct the problem.  (This
+* backwards compatibility may be removed before this
+* time, at the discretion of the ext4 developers.)
+*/
+   u64 extra_bits = le32_to_cpu(extra) & EXT4_EPOCH_MASK;
+   if (extra_bits == 3 && ((time->tv_sec) & 0x80000000) != 0)
+   extra_bits = 0;
+   time->tv_sec += extra_bits << 32;
+#else
+   time->tv_sec += (u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK) << 32;
+#endif
+   }
+   time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
 }
 
 #define EXT4_INODE_SET_XTIME(xtime, inode, raw_inode) \
-- 
1.9.1



[PATCH 3.4 028/125] x86/cpu: Call verify_cpu() after having entered long mode too

2016-10-12 Thread lizf
From: Borislav Petkov <b...@suse.de>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 04633df0c43d710e5f696b06539c100898678235 upstream.

When we get loaded by a 64-bit bootloader, the kernel entry point is
startup_64 in head_64.S. We don't trust any and all bootloaders because
some will fiddle with CPU configuration so we go ahead and massage each
CPU into sanity again.

For example, some Dell BIOSes have an XD disable feature which sets
IA32_MISC_ENABLE[34] and disables NX. This might be some dumb workaround
for other OSes but Linux sure doesn't need it.

A similar thing is present in the Surface 3 firmware - see
https://bugzilla.kernel.org/show_bug.cgi?id=106051 - which sets this bit
only on the BSP:

  # rdmsr -a 0x1a0
  400850089
  850089
  850089
  850089

I know, right?!

There's not even an off switch in there.

So fix all those cases by sanitizing the 64-bit entry point too. For
that, make verify_cpu() callable in 64-bit mode also.
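
To make the rdmsr output above easier to read, here is a small user-space
sketch that tests and clears bit 34 in the quoted values. The demo_* names
are hypothetical, and nothing here touches a real MSR; it only shows the
bit arithmetic on the numbers from the report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position named in the text above: IA32_MISC_ENABLE[34]. */
#define DEMO_XD_DISABLE_BIT 34

/* True if the XD-disable bit is set in a raw IA32_MISC_ENABLE value. */
static bool demo_xd_disabled(uint64_t misc_enable)
{
	return ((misc_enable >> DEMO_XD_DISABLE_BIT) & 1u) != 0;
}

/* Return the value with the XD-disable bit cleared. */
static uint64_t demo_clear_xd_disable(uint64_t misc_enable)
{
	return misc_enable & ~(UINT64_C(1) << DEMO_XD_DISABLE_BIT);
}
```

With the values shown, 0x400850089 (the BSP) has the bit set and 0x850089
(the other CPUs) does not; clearing the bit in the first value gives the
second.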

Requested-and-debugged-by: "H. Peter Anvin" <h...@zytor.com>
Reported-and-tested-by: Bastien Nocera <bugzi...@hadess.net>
Signed-off-by: Borislav Petkov <b...@suse.de>
Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1446739076-21303-1-git-send-email...@alien8.de
Signed-off-by: Thomas Gleixner <t...@linutronix.de>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 arch/x86/kernel/head_64.S|  8 
 arch/x86/kernel/verify_cpu.S | 12 +++-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 40f4eb3..59d0eac 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -45,6 +45,9 @@ L3_START_KERNEL = pud_index(__START_KERNEL_map)
.globl startup_64
 startup_64:
 
+   /* Sanitize CPU configuration */
+   call verify_cpu
+
/*
 * At this point the CPU runs in 64bit mode CS.L = 1 CS.D = 1,
 * and someone has loaded an identity mapped page table
@@ -160,6 +163,9 @@ ENTRY(secondary_startup_64)
 * after the boot processor executes this code.
 */
 
+   /* Sanitize CPU configuration */
+   call verify_cpu
+
/* Enable PAE mode and PGE */
movl    $(X86_CR4_PAE | X86_CR4_PGE), %eax
movq    %rax, %cr4
@@ -253,6 +259,8 @@ ENTRY(secondary_startup_64)
pushq   %rax    # target address in negative space
lretq
 
+#include "verify_cpu.S"
+
/* SMP bootup changes these two */
__REFDATA
.align  8
diff --git a/arch/x86/kernel/verify_cpu.S b/arch/x86/kernel/verify_cpu.S
index b9242ba..4cf401f 100644
--- a/arch/x86/kernel/verify_cpu.S
+++ b/arch/x86/kernel/verify_cpu.S
@@ -34,10 +34,11 @@
 #include 
 
 verify_cpu:
-   pushfl  # Save caller passed flags
-   pushl   $0  # Kill any dangerous flags
-   popfl
+   pushf   # Save caller passed flags
+   push    $0  # Kill any dangerous flags
+   popf
 
+#ifndef __x86_64__
pushfl  # standard way to check for cpuid
popl    %eax
movl    %eax,%ebx
@@ -48,6 +49,7 @@ verify_cpu:
popl    %eax
cmpl    %eax,%ebx
jz  verify_cpu_no_longmode  # cpu has no cpuid
+#endif
 
movl    $0x0,%eax   # See if cpuid 1 is implemented
cpuid
@@ -130,10 +132,10 @@ verify_cpu_sse_test:
jmp verify_cpu_sse_test # try again
 
 verify_cpu_no_longmode:
-   popfl   # Restore caller passed flags
+   popf    # Restore caller passed flags
movl $1,%eax
ret
 verify_cpu_sse_ok:
-   popfl   # Restore caller passed flags
+   popf    # Restore caller passed flags
xorl %eax, %eax
ret
-- 
1.9.1



[PATCH 3.4 095/125] xen-blkback: only read request operation from shared ring once

2016-10-12 Thread lizf
From: Roger Pau Monné <roger@citrix.com>

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 1f13d75ccb806260079e0679d55d9253e370ec8a upstream.

A compiler may load a switch statement value multiple times, which could
be bad when the value is in memory shared with the frontend.

When converting a non-native request to a native one, ensure that
src->operation is only loaded once by using READ_ONCE().

This is part of XSA155.
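
A minimal user-space sketch of the idea, under stated assumptions: the
struct layouts, the opcode values and the DEMO_READ_ONCE_U8() helper are
invented for illustration and are not the blkif definitions or the
kernel's ACCESS_ONCE()/READ_ONCE().

```c
#include <stdint.h>

/* Hypothetical opcodes; the real BLKIF_OP_* values are not used here. */
enum { DEMO_OP_READ = 0, DEMO_OP_WRITE = 1 };

/* Hypothetical request as seen in memory shared with the frontend. */
struct demo_shared_req {
	uint8_t operation;
};

/* Hypothetical private copy owned by the backend. */
struct demo_local_req {
	uint8_t operation;
};

/* Force a single load of a uint8_t object through a volatile access. */
#define DEMO_READ_ONCE_U8(x) (*(const volatile uint8_t *)&(x))

/* Returns 0 for a known operation, -1 otherwise. */
static int demo_get_req(struct demo_local_req *dst,
			const struct demo_shared_req *src)
{
	/* Load the shared field exactly once into private memory ... */
	dst->operation = DEMO_READ_ONCE_U8(src->operation);

	/* ... and dispatch on the private copy, never on src again. */
	switch (dst->operation) {
	case DEMO_OP_READ:
	case DEMO_OP_WRITE:
		return 0;
	default:
		return -1;
	}
}
```

Switching on dst->operation rather than src->operation means the value
that was stored and the value that was dispatched on cannot differ.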

Signed-off-by: Roger Pau Monné <roger@citrix.com>
Signed-off-by: David Vrabel <david.vra...@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
[lizf: Backported to 3.4:
 - adjust context
 - call ACCESS_ONCE instead of READ_ONCE]
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 drivers/block/xen-blkback/common.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index 933adc5..47e5b65 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -256,8 +256,8 @@ static inline void blkif_get_x86_32_req(struct blkif_request *dst,
struct blkif_x86_32_request *src)
 {
int i, n = BLKIF_MAX_SEGMENTS_PER_REQUEST;
-   dst->operation = src->operation;
-   switch (src->operation) {
+   dst->operation = ACCESS_ONCE(src->operation);
+   switch (dst->operation) {
case BLKIF_OP_READ:
case BLKIF_OP_WRITE:
case BLKIF_OP_WRITE_BARRIER:
@@ -292,8 +292,8 @@ static inline void blkif_get_x86_64_req(struct blkif_request *dst,
struct blkif_x86_64_request *src)
 {
int i, n = BLKIF_MAX_SEGMENTS_PER_REQUEST;
-   dst->operation = src->operation;
-   switch (src->operation) {
+   dst->operation = ACCESS_ONCE(src->operation);
+   switch (dst->operation) {
case BLKIF_OP_READ:
case BLKIF_OP_WRITE:
case BLKIF_OP_WRITE_BARRIER:
-- 
1.9.1



[PATCH 3.4 061/125] fuse: break infinite loop in fuse_fill_write_pages()

2016-10-12 Thread lizf
From: Roman Gushchin 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 3ca8138f014a913f98e6ef40e939868e1e9ea876 upstream.

I got a report about an unkillable task eating CPU. Further
investigation shows that the problem is in the fuse_fill_write_pages()
function. If the iov's first segment has zero length, we get an infinite
loop, because we never reach the iov_iter_advance() call.

Fix this by calling iov_iter_advance() before repeating an attempt to
copy data from userspace.

A similar problem is described in 124d3b7041f ("fix writev regression:
pan hanging unkillable and un-straceable"). If a zero-length segment
is followed by a segment with an invalid address,
iov_iter_fault_in_readable() checks only the first segment (zero-length),
iov_iter_copy_from_user_atomic() skips it, fails at the second and
returns zero -> goto again without skipping the zero-length segment.

The patch calls iov_iter_advance() before goto again: we'll skip the zero-length
segment at the second iteration and iov_iter_fault_in_readable() will detect the
invalid address.
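
The failure mode can be modeled in user space. The sketch below is a toy
model, not the fuse code: the demo_* types and helpers are hypothetical
stand-ins for the iov_iter calls named above, and a retry budget replaces
the real infinite loop so that the unfixed ordering can be observed.

```c
#include <stddef.h>

struct demo_seg { size_t len; int valid; };
struct demo_iter { const struct demo_seg *seg; size_t nr_segs; size_t off; };

/* Stand-in for the fault-in check: looks at the current segment only. */
static int demo_fault_in(const struct demo_iter *it)
{
	if (it->seg->len - it->off == 0)
		return 0;			/* nothing to check */
	return it->seg->valid ? 0 : -1;
}

/* Stand-in for the atomic copy: skips empty segments, 0 on a bad one. */
static size_t demo_copy(const struct demo_iter *it)
{
	const struct demo_seg *s = it->seg;
	size_t n = it->nr_segs, off = it->off;

	while (n && s->len - off == 0) {
		s++; n--; off = 0;
	}
	return (n && s->valid) ? s->len - off : 0;
}

/* Stand-in for advance: consumes bytes and steps past empty segments. */
static void demo_advance(struct demo_iter *it, size_t bytes)
{
	while (it->nr_segs && (bytes || it->seg->len - it->off == 0)) {
		size_t take = it->seg->len - it->off;

		if (take > bytes)
			take = bytes;
		it->off += take;
		bytes -= take;
		if (it->off == it->seg->len) {
			it->seg++; it->nr_segs--; it->off = 0;
		}
	}
}

/* Bytes copied, -1 on a detected fault, -2 if the retry budget ran out. */
static long demo_fill(struct demo_iter *it, int advance_first, int max_tries)
{
	long count = 0;

	while (it->nr_segs) {
		size_t tmp;

		if (max_tries-- == 0)
			return -2;		/* would spin forever */
		if (demo_fault_in(it))
			return count ? count : -1;
		tmp = demo_copy(it);
		if (advance_first)
			demo_advance(it, tmp);	/* patched ordering */
		if (!tmp)
			continue;		/* "goto again" */
		if (!advance_first)
			demo_advance(it, tmp);	/* original ordering */
		count += (long)tmp;
	}
	return count;
}
```

With a zero-length segment followed by an invalid one, the original
ordering exhausts the retry budget, while the patched ordering steps past
the empty segment and reports the fault on the next pass.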

Special thanks to Konstantin Khlebnikov, who helped a lot with the commit
description.

Cc: Andrew Morton 
Cc: Maxim Patlasov 
Cc: Konstantin Khlebnikov 
Signed-off-by: Roman Gushchin 
Signed-off-by: Miklos Szeredi 
Fixes: ea9b9907b82a ("fuse: implement perform_write")
Signed-off-by: Zefan Li 
---
 fs/fuse/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index e4f1f1a..951457a 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -846,6 +846,7 @@ static ssize_t fuse_fill_write_pages(struct fuse_req *req,
 
mark_page_accessed(page);
 
+   iov_iter_advance(ii, tmp);
if (!tmp) {
unlock_page(page);
page_cache_release(page);
@@ -857,7 +858,6 @@ static ssize_t fuse_fill_write_pages(struct fuse_req *req,
req->pages[req->num_pages] = page;
req->num_pages++;
 
-   iov_iter_advance(ii, tmp);
count += tmp;
pos += tmp;
offset += tmp;
-- 
1.9.1



[PATCH 3.4 045/125] iio: lpc32xx_adc: fix warnings caused by enabling unprepared clock

2016-10-12 Thread lizf
From: Vladimir Zapolskiy 

3.4.113-rc1 review patch.  If anyone has any objections, please let me know.

--


commit 01bb70ae0b98d266fa3e860482c7ce22fa482a6e upstream.

If the common clock framework is configured, the driver generates a warning,
which is fixed by this change:

root@devkit3250:~# cat /sys/bus/iio/devices/iio\:device0/in_voltage0_raw
------------[ cut here ]------------
WARNING: CPU: 0 PID: 724 at drivers/clk/clk.c:727 clk_core_enable+0x2c/0xa4()
Modules linked in: sc16is7xx snd_soc_uda1380
CPU: 0 PID: 724 Comm: cat Not tainted 4.3.0-rc2+ #198
Hardware name: LPC32XX SoC (Flattened Device Tree)
Backtrace:
[<>] (dump_backtrace) from [<>] (show_stack+0x18/0x1c)
[<>] (show_stack) from [<>] (dump_stack+0x20/0x28)
[<>] (dump_stack) from [<>] (warn_slowpath_common+0x90/0xb8)
[<>] (warn_slowpath_common) from [<>] (warn_slowpath_null+0x24/0x2c)
[<>] (warn_slowpath_null) from [<>] (clk_core_enable+0x2c/0xa4)
[<>] (clk_core_enable) from [<>] (clk_enable+0x24/0x38)
[<>] (clk_enable) from [<>] (lpc32xx_read_raw+0x38/0x80)
[<>] (lpc32xx_read_raw) from [<>] (iio_read_channel_info+0x70/0x94)
[<>] (iio_read_channel_info) from [<>] (dev_attr_show+0x28/0x4c)
[<>] (dev_attr_show) from [<>] (sysfs_kf_seq_show+0x8c/0xf0)
[<>] (sysfs_kf_seq_show) from [<>] (kernfs_seq_show+0x2c/0x30)
[<>] (kernfs_seq_show) from [<>] (seq_read+0x1c8/0x440)
[<>] (seq_read) from [<>] (kernfs_fop_read+0x38/0x170)
[<>] (kernfs_fop_read) from [<>] (do_readv_writev+0x16c/0x238)
[<>] (do_readv_writev) from [<>] (vfs_readv+0x50/0x58)
[<>] (vfs_readv) from [<>] (default_file_splice_read+0x1a4/0x308)
[<>] (default_file_splice_read) from [<>] (do_splice_to+0x78/0x84)
[<>] (do_splice_to) from [<>] (splice_direct_to_actor+0xc8/0x1cc)
[<>] (splice_direct_to_actor) from [<>] (do_splice_direct+0xa0/0xb8)
[<>] (do_splice_direct) from [<>] (do_sendfile+0x1a8/0x30c)
[<>] (do_sendfile) from [<>] (SyS_sendfile64+0x104/0x10c)
[<>] (SyS_sendfile64) from [<>] (ret_fast_syscall+0x0/0x38)

Signed-off-by: Vladimir Zapolskiy 
Signed-off-by: Jonathan Cameron 
Signed-off-by: Zefan Li 
---
 drivers/staging/iio/adc/lpc32xx_adc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/iio/adc/lpc32xx_adc.c b/drivers/staging/iio/adc/lpc32xx_adc.c
index dfc9033..37ca387 100644
--- a/drivers/staging/iio/adc/lpc32xx_adc.c
+++ b/drivers/staging/iio/adc/lpc32xx_adc.c
@@ -75,7 +75,7 @@ static int lpc32xx_read_raw(struct iio_dev *indio_dev,
 
if (mask == 0) {
mutex_lock(&indio_dev->mlock);
-   clk_enable(info->clk);
+   clk_prepare_enable(info->clk);
/* Measurement setup */
__raw_writel(AD_INTERNAL | (chan->address) | AD_REFp | AD_REFm,
LPC32XX_ADC_SELECT(info->adc_base));
@@ -83,7 +83,7 @@ static int lpc32xx_read_raw(struct iio_dev *indio_dev,
__raw_writel(AD_PDN_CTRL | AD_STROBE,
LPC32XX_ADC_CTRL(info->adc_base));
wait_for_completion(&info->completion); /* set by ISR */
-   clk_disable(info->clk);
+   clk_disable_unprepare(info->clk);
*val = info->value;
mutex_unlock(&indio_dev->mlock);
 
-- 
1.9.1


