[RFC PATCH 2/3 V3] livedump: Add write protection management

2012-10-10 Thread YOSHIDA Masanori
This patch makes it possible to write-protect pages in kernel space and to
install a handler function that is called every time a page fault occurs
on a protected page. The write protection is set up in the stop-machine
state so that all pages are protected consistently.

Write protection and fault handling are processed in the following
order:

(1) Initialization phase
  - Sets up data structures for write protection management.
  - Splits all large pages in kernel space into 4K pages, since livedump
    can currently handle only 4K pages. In the future, this step (page
    splitting) should be eliminated.
(2) Write protection phase
  - Stops machine.
  - Handles sensitive pages.
    (Sensitive pages are described below.)
  - Sets up write protection.
  - Resumes machine.
(3) Page fault exception handling
  - Calls the handler function before unprotecting the faulted page.
(4) Sweep phase
  - Calls the handler function on the remaining pages.
(5) Uninitialization phase
  - Cleans up all data structures for write protection management.

This patch exports the following 4 ioctl operations.
- Ioctl to invoke initialization phase
- Ioctl to invoke write protection phase
- Ioctl to invoke sweep phase
- Ioctl to invoke uninitialization phase
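
As an illustration only (this is not part of the patch), the four
operations could be driven from user space roughly as sketched below. The
command numbers 1 to 4, the LIVEDUMP_IOC_* names, and the idea of passing
a device path such as /dev/livedump are assumptions; only the
_IO(0xff, x) encoding pattern is taken from patch 1/3.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Encoding pattern taken from patch 1/3. */
#define LIVEDUMP_IOC(x) _IO(0xff, x)

/* Hypothetical command numbers; the real values are not stated here. */
#define LIVEDUMP_IOC_INIT   LIVEDUMP_IOC(1)
#define LIVEDUMP_IOC_START  LIVEDUMP_IOC(2)
#define LIVEDUMP_IOC_SWEEP  LIVEDUMP_IOC(3)
#define LIVEDUMP_IOC_UNINIT LIVEDUMP_IOC(4)

/*
 * Runs init -> start (write protect) -> sweep, then uninit.
 * Returns 0 if every step succeeded, -1 otherwise.
 */
static int livedump_run(const char *devpath)
{
    static const unsigned long cmds[] = {
        LIVEDUMP_IOC_INIT, LIVEDUMP_IOC_START, LIVEDUMP_IOC_SWEEP,
    };
    int fd = open(devpath, O_RDWR);
    int ret = 0;
    size_t i;

    if (fd < 0)
        return -1;

    for (i = 0; i < sizeof(cmds) / sizeof(cmds[0]); i++) {
        if (ioctl(fd, cmds[i], 0) < 0) {
            ret = -1;
            break;
        }
    }

    /* Always try to release resources, even after a failure. */
    if (ioctl(fd, LIVEDUMP_IOC_UNINIT, 0) < 0)
        ret = -1;

    close(fd);
    return ret;
}
```

A caller would pass the path of the misc device node, for example
livedump_run("/dev/livedump"), assuming the node is created under that
name.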

The processing states are as follows. Transitions are possible only in this order.
- STATE_UNINIT
- STATE_INITED
- STATE_STARTED (= write protection already set up)
- STATE_SWEPT

However, this order is tracked by a plain integer variable. Therefore, to
be exact, this code is not yet safe against concurrent operations.
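
One way the ordering could be made safe against concurrent callers is a
compare-and-exchange on the state variable. The sketch below uses C11
atomics in user space purely for illustration; a kernel version would
presumably use the kernel's own atomic helpers or a lock. The function
name livedump_state_advance and the wrap from STATE_SWEPT back to
STATE_UNINIT are assumptions.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum livedump_state {
    STATE_UNINIT,
    STATE_INITED,
    STATE_STARTED,  /* write protection already set up */
    STATE_SWEPT,
};

static _Atomic int livedump_state = STATE_UNINIT;

/*
 * Advance from 'from' to the next state in the fixed order. Wrapping
 * from STATE_SWEPT to STATE_UNINIT is an assumption about what the
 * uninitialization phase does. Returns false, changing nothing, if the
 * current state is not 'from'. If two callers race on the same
 * transition, exactly one of them succeeds.
 */
static bool livedump_state_advance(enum livedump_state from)
{
    int expected = from;
    int next = (from == STATE_SWEPT) ? STATE_UNINIT : from + 1;

    return atomic_compare_exchange_strong(&livedump_state, &expected, next);
}
```

Each ioctl would call this with the state it expects to leave and return
an error when the call fails, so out-of-order or concurrent requests are
rejected rather than silently interleaved.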

The livedump module has to acquire a consistent memory image of kernel
space. Therefore, write protection is set up while updates to memory state
are suspended. To do so, livedump currently uses stop_machine.

A livedump page fault (LPF) that occurs during LPF handling results in
nested LPF handling. Since the LPF handler uses spinlocks, this situation
may cause a deadlock. Therefore, any page that can be updated during LPF
handling must not be write-protected. For the same reason, any page that
can be updated during NMI handling must not be write-protected: an NMI can
happen during LPF handling, so an LPF during NMI handling also results in
nested LPF handling. I call such pages, which must not be write-protected,
"sensitive pages". For the sensitive pages, the handler function is called
during the stop-machine state, and they are never write-protected.

The sensitive pages are the following:

- Kernel/Exception/Interrupt stacks
- Page table structure
- All task_struct
- ".data" section of kernel
- per_cpu areas

Pages that are not updated don't cause a page fault, and so the handler
function is not invoked for them. To handle these pages, the livedump
module finally needs to call the handler function on each of them.
I call this phase "sweep"; it is triggered by an ioctl operation.
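
The interplay between the fault handler and the sweep can be modelled in
user space as below. This is a toy simulation under simplifying
assumptions, not the patch's implementation: pages are tiny arrays, a
"fault" is an explicit check in write_byte(), sensitive pages are not
modelled, and all names are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define NPAGES  8
#define PAGE_SZ 16  /* tiny pages keep the example small */

static unsigned char memory[NPAGES][PAGE_SZ];  /* "live" memory */
static unsigned char dump[NPAGES][PAGE_SZ];    /* snapshot target */
static bool protected_page[NPAGES];

/* Handler: save the page contents as they were at protection time. */
static void handle_page(int pfn)
{
    memcpy(dump[pfn], memory[pfn], PAGE_SZ);
}

/* Write protection phase: applied to all pages at one instant. */
static void protect_all(void)
{
    for (int i = 0; i < NPAGES; i++)
        protected_page[i] = true;
}

/* A write to a protected page "faults": dump first, then unprotect. */
static void write_byte(int pfn, int off, unsigned char val)
{
    if (protected_page[pfn]) {
        handle_page(pfn);
        protected_page[pfn] = false;
    }
    memory[pfn][off] = val;
}

/* Sweep phase: dump every page that never faulted. Returns the count. */
static int sweep(void)
{
    int swept = 0;

    for (int i = 0; i < NPAGES; i++) {
        if (protected_page[i]) {
            handle_page(i);
            protected_page[i] = false;
            swept++;
        }
    }
    return swept;
}
```

After protect_all(), every page is dumped exactly once, either at its
first write or during sweep(), so the dump reflects memory as it was at
protection time even though writes continue.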

Signed-off-by: YOSHIDA Masanori 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: Prarit Bhargava 
Cc: Andy Lutomirski 
Cc: "Eric W. Biederman" 
Cc: Srikar Dronamraju 
Cc: linux-kernel@vger.kernel.org
---

 arch/x86/Kconfig |   16 +
 arch/x86/include/asm/wrprotect.h |   45 +++
 arch/x86/mm/Makefile |2 
 arch/x86/mm/fault.c  |7 
 arch/x86/mm/wrprotect.c  |  548 ++
 kernel/livedump.c|   46 +++
 tools/livedump/livedump  |   32 ++
 7 files changed, 695 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/wrprotect.h
 create mode 100644 arch/x86/mm/wrprotect.c
 create mode 100755 tools/livedump/livedump

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 39c0813..e3b4e33 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1734,9 +1734,23 @@ config CMDLINE_OVERRIDE
  This is used to work around broken boot loaders.  This should
  be set to 'N' under normal conditions.
 
+config WRPROTECT
+   bool "Write protection on kernel space"
+   depends on X86_64
+   ---help---
+ Set this option to 'Y' to allow the kernel to write protect
+ its own memory space and to handle page fault caused by the
+ write protection.
+
+ This feature regularly causes small overhead on kernel.
+ Once this feature is activated, it causes much more overhead
+ on kernel.
+
+ If in doubt, say N.
+
 config LIVEDUMP
bool "Live Dump support"
-   depends on X86_64
+   depends on WRPROTECT
---help---
  Set this option to 'Y' to allow the kernel support to acquire
  a consistent snapshot of kernel space without stopping system.
diff --git a/arch/x86/include/asm/wrprotect.h b/arch/x86/include/asm/wrprotect.h
new file mode 100644
index 000..f674998
--- /dev/null
+++ b/arch/x86/include/asm/wrprotect.h
@@ -0,0 +1,45 @@
+/* wrprotect.h - Kernel space write protection support
+ * Copyright (C) 2012 Hitachi, Ltd.
+ * 

[RFC PATCH 3/3 V3] livedump: Add memory dumping functionality

2012-10-10 Thread YOSHIDA Masanori
This patch implements memory dumping of kernel space. Faulting pages are
temporarily pushed into a kfifo, and they are popped and dumped by a
kthread dedicated to livedump. At the moment, the only supported target is
a block device such as /dev/sdb.

Memory dumping is executed as follows:
(1) The handler function is invoked and:
  - It pops a buffer page from the kfifo "pool".
  - It copies a faulting page into the buffer page.
  - It pushes the buffer page into the kfifo "pend".
(2) The kthread pops the buffer page from the kfifo "pend" and submits
    a bio to dump it.
(3) The endio callback returns the buffer page to the kfifo "pool".

At step (1), if the kfifo "pool" is empty, processing depends on whether
the handler function is called in the sweep phase or not. If it is in the
sweep phase, the handler function waits until a buffer page in the kfifo
"pool" becomes available. If not, livedump simply fails.
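
The buffer flow between the two FIFOs can be sketched in user space as
follows. This is a simplified model, not the patch's code: a small ring
of pointers stands in for kfifo, a callback stands in for bio submission,
and every name here is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 64  /* small stand-in for a 4K page */
#define NBUF    2   /* stand-in for MEMDUMP_KFIFO_SIZE */

struct fifo {
    void *slot[NBUF];
    size_t head, tail;  /* free-running; count = head - tail */
};

static bool fifo_put(struct fifo *f, void *p)
{
    if (f->head - f->tail == NBUF)
        return false;
    f->slot[f->head++ % NBUF] = p;
    return true;
}

static void *fifo_get(struct fifo *f)
{
    return (f->head == f->tail) ? NULL : f->slot[f->tail++ % NBUF];
}

static struct fifo pool, pend;
static unsigned char bufs[NBUF][PAGE_SZ];

/* Put every buffer page into "pool"; "pend" starts empty. */
static void memdump_setup(void)
{
    memset(&pool, 0, sizeof(pool));
    memset(&pend, 0, sizeof(pend));
    for (size_t i = 0; i < NBUF; i++)
        fifo_put(&pool, bufs[i]);
}

/* Step (1) outside the sweep phase: must not wait for a buffer. */
static bool handler_nowait(const void *page)
{
    void *buf = fifo_get(&pool);

    if (!buf)
        return false;  /* pool empty: the dump fails */
    memcpy(buf, page, PAGE_SZ);
    return fifo_put(&pend, buf);
}

/* Steps (2) and (3): drain "pend", "submit", and recycle the buffers. */
static size_t writer_drain(void (*submit)(const void *buf))
{
    size_t n = 0;
    void *buf;

    while ((buf = fifo_get(&pend)) != NULL) {
        if (submit)
            submit(buf);
        fifo_put(&pool, buf);  /* the role endio plays in the patch */
        n++;
    }
    return n;
}
```

A sweep-phase caller would differ only in the empty-pool case: instead
of returning false it would wait until the writer has returned a buffer
to "pool" and then retry.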

Signed-off-by: YOSHIDA Masanori 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Andrew Morton 
Cc: "Eric W. Biederman" 
Cc: Al Viro 
Cc: linux-kernel@vger.kernel.org
---

 kernel/Makefile   |2 
 kernel/livedump-memdump.c |  445 +
 kernel/livedump-memdump.h |   32 +++
 kernel/livedump.c |   24 ++
 tools/livedump/livedump   |   16 +-
 5 files changed, 508 insertions(+), 11 deletions(-)
 create mode 100644 kernel/livedump-memdump.c
 create mode 100644 kernel/livedump-memdump.h

diff --git a/kernel/Makefile b/kernel/Makefile
index c8bd09b..e009578 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -110,7 +110,7 @@ obj-$(CONFIG_USER_RETURN_NOTIFIER) += user-return-notifier.o
 obj-$(CONFIG_PADATA) += padata.o
 obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
 obj-$(CONFIG_JUMP_LABEL) += jump_label.o
-obj-$(CONFIG_LIVEDUMP) += livedump.o
+obj-$(CONFIG_LIVEDUMP) += livedump.o livedump-memdump.o
 
 $(obj)/configs.o: $(obj)/config_data.h
 
diff --git a/kernel/livedump-memdump.c b/kernel/livedump-memdump.c
new file mode 100644
index 000..13a9413
--- /dev/null
+++ b/kernel/livedump-memdump.c
@@ -0,0 +1,445 @@
+/* livedump-memdump.c - Live Dump's memory dumping management
+ * Copyright (C) 2012 Hitachi, Ltd.
+ * Author: YOSHIDA Masanori 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+ * MA  02110-1301, USA.
+ */
+
+#include "livedump-memdump.h"
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MEMDUMP_KFIFO_SIZE 16384 /* in pages */
+#define SECTOR_SHIFT 9
+static const char THREAD_NAME[] = "livedump";
+static struct block_device *memdump_bdev;
+
+/* State machine */
+enum MEMDUMP_STATE {
+   _MEMDUMP_INIT,
+   MEMDUMP_INACTIVE = _MEMDUMP_INIT,
+   MEMDUMP_ACTIVATING,
+   MEMDUMP_ACTIVE,
+   MEMDUMP_INACTIVATING,
+   _MEMDUMP_OVERFLOW,
+};
+
+static struct memdump_state {
+   atomic_t val;
+   atomic_t count;
+   spinlock_t lock;
+} __aligned(PAGE_SIZE) memdump_state = {
+   ATOMIC_INIT(_MEMDUMP_INIT),
+   ATOMIC_INIT(0),
+   __SPIN_LOCK_INITIALIZER(memdump_state.lock),
+};
+
+/* memdump_state_inc
+ *
+ * Increments ACTIVE state refcount.
+ * The refcount must be zero to transit to next state (INACTIVATING).
+ */
+static bool memdump_state_inc(void)
+{
+   bool ret;
+
+   spin_lock(&memdump_state.lock);
+   ret = (atomic_read(&memdump_state.val) == MEMDUMP_ACTIVE);
+   if (ret)
+      atomic_inc(&memdump_state.count);
+   spin_unlock(&memdump_state.lock);
+   return ret;
+}
+
+/* memdump_state_dec
+ *
+ * Decrements ACTIVE state refcount
+ */
+static void memdump_state_dec(void)
+{
+   atomic_dec(&memdump_state.count);
+}
+
+/* memdump_state_transit
+ *
+ * Transit to next state.
+ * If current state isn't assumed state, transition fails.
+ */
+static bool memdump_state_transit(enum MEMDUMP_STATE assumed)
+{
+   bool ret;
+
+   spin_lock(&memdump_state.lock);
+   ret = (atomic_read(&memdump_state.val) == assumed &&
+      atomic_read(&memdump_state.count) == 0);
+   if (ret) {
+      atomic_inc(&memdump_state.val);
+      if (atomic_read(&memdump_state.val) == _MEMDUMP_OVERFLOW)
+         atomic_set(&memdump_state.val, _MEMDUMP_INIT);
+   }
+   spin_unlock(&memdump_state.lock);
+   return ret;
+}
+
+static void memdump_state_transit_back(void)
+{
+   atomic_dec(&memdump_state.val);
+}
+
+/* Request queue */
+
+/*
+ * Request 

[RFC PATCH 1/3 V3] livedump: Add the new misc device "livedump"

2012-10-10 Thread YOSHIDA Masanori
Introduces the new misc device "livedump".
This device will be used as the interface between livedump and user space.
Right now, the device has only an empty ioctl operation.

***ATTENTION PLEASE***
I think debugfs is more suitable for this feature, but currently livedump
uses the misc device for simplicity. This will be fixed in the future.

Signed-off-by: YOSHIDA Masanori 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: Peter Zijlstra 
Cc: Andrew Morton 
Cc: "Eric W. Biederman" 
Cc: Al Viro 
Cc: linux-kernel@vger.kernel.org
---

 arch/x86/Kconfig  |   15 +++
 kernel/Makefile   |1 +
 kernel/livedump.c |   73 +
 3 files changed, 89 insertions(+)
 create mode 100644 kernel/livedump.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 50a1d1f..39c0813 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1734,6 +1734,21 @@ config CMDLINE_OVERRIDE
  This is used to work around broken boot loaders.  This should
  be set to 'N' under normal conditions.
 
+config LIVEDUMP
+   bool "Live Dump support"
+   depends on X86_64
+   ---help---
+ Set this option to 'Y' to allow the kernel support to acquire
+ a consistent snapshot of kernel space without stopping system.
+
+ This feature regularly causes small overhead on kernel.
+
+ Once this feature is initialized by its special ioctl, it
+ allocates huge memory for itself and causes much more overhead
+ on kernel.
+
+ If in doubt, say N.
+
 endmenu
 
 config ARCH_ENABLE_MEMORY_HOTPLUG
diff --git a/kernel/Makefile b/kernel/Makefile
index c0cc67a..c8bd09b 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -110,6 +110,7 @@ obj-$(CONFIG_USER_RETURN_NOTIFIER) += user-return-notifier.o
 obj-$(CONFIG_PADATA) += padata.o
 obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
 obj-$(CONFIG_JUMP_LABEL) += jump_label.o
+obj-$(CONFIG_LIVEDUMP) += livedump.o
 
 $(obj)/configs.o: $(obj)/config_data.h
 
diff --git a/kernel/livedump.c b/kernel/livedump.c
new file mode 100644
index 000..409f7ed
--- /dev/null
+++ b/kernel/livedump.c
@@ -0,0 +1,73 @@
+/* livedump.c - Live Dump's main
+ * Copyright (C) 2012 Hitachi, Ltd.
+ * Author: YOSHIDA Masanori 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+ * MA  02110-1301, USA.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#define DEVICE_NAME "livedump"
+
+#define LIVEDUMP_IOC(x) _IO(0xff, x)
+
+static long livedump_ioctl(
+   struct file *file, unsigned int cmd, unsigned long arg)
+{
+   switch (cmd) {
+   default:
+   return -ENOIOCTLCMD;
+   }
+}
+
+static const struct file_operations livedump_fops = {
+   .unlocked_ioctl = livedump_ioctl,
+};
+static struct miscdevice livedump_misc = {
+   .minor = MISC_DYNAMIC_MINOR,
+   .name = DEVICE_NAME,
+   .fops = &livedump_fops,
+};
+
+static int livedump_exit(struct notifier_block *_, unsigned long __, void *___)
+{
+   misc_deregister(&livedump_misc);
+   return NOTIFY_DONE;
+}
+static struct notifier_block livedump_nb = {
+   .notifier_call = livedump_exit
+};
+
+static int __init livedump_init(void)
+{
+   int ret;
+
+   ret = misc_register(&livedump_misc);
+   if (WARN_ON(ret))
+      return ret;
+
+   ret = register_reboot_notifier(&livedump_nb);
+   if (WARN_ON(ret)) {
+      livedump_exit(NULL, 0, NULL);
+      return ret;
+   }
+
+   return 0;
+}
+device_initcall(livedump_init);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC PATCH 0/3 V3] introduce: livedump

2012-10-10 Thread YOSHIDA Masanori
The following series introduces the new memory dumping mechanism Live Dump,
which lets users obtain a consistent memory dump without stopping a running
system.


Changes in V3:
 - The patchset is rebased onto v3.6.
 - crash-6.1.0 is required (which was 6.0.6 previously).
 - Notifier-call-chain in do_page_fault is replaced with the callback
   dedicated for livedump.
 - The patchset implements the feature of dumping to disk.
   This version only supports block device as target device.

V2 is here: https://lkml.org/lkml/2012/5/25/104

ToDo:
 - Large page support
   Currently livedump can dump only 4K pages, and so it splits all
   large pages in kernel space in advance. This causes significant TLB
   overhead.
 - Other target device support
   Currently livedump can dump only to a block device. In practice,
   dumping to a normal file is necessary.
 - Other space/area support
   Currently livedump write-protects only the kernel's straight mapping
   area. Pages in user space or the vmap area cannot be dumped
   consistently.
 - Other CPU architecture support
   Currently livedump supports only x86-64.


Background:
This mechanism is especially useful when very important systems are
consolidated onto a single machine via virtualization. Assume a KVM host
runs multiple important VMs and one of them fails; the other VMs have to
keep running. At the same time, however, an administrator may want to
obtain a memory dump of not only the failed guest but also the host,
because the cause of the failure may lie not in the guest but in the host
or the hardware under it.


Mechanism overview:
Live Dump is based on the copy-on-write technique. Basically, processing
is performed in the following order.
(1) Suspends processing of all CPUs.
(2) Makes the pages you want to dump read-only.
(3) Resumes all CPUs.
(4) On a page fault, dumps the faulting page.
(5) Finally, dumps the remaining pages that were not updated.

The kthread named "livedump" is in charge of dumping to disk. It has a
queue that receives dump requests from livedump's page fault handler. If
the queue ever becomes full, livedump simply fails, since livedump's page
fault handler can never sleep to wait for space.


This series consists of 3 patches.

The 1st patch introduces the "livedump" misc device.

The 2nd patch introduces write protection management. This enables users
to turn on write protection on kernel space and to install a hook function
that is called every time a page fault occurs on a protected page.

The 3rd patch introduces the memory dumping feature. This patch installs
the function that dumps the content of the protected page on a page fault.


***How to test***
To test this patch, you have to apply the attached patch to the source code
of crash[1]. The patch applies to version 6.1.0 of crash. In addition, you
have to configure your kernel with CONFIG_DEBUG_INFO turned on.

[1]crash, http://people.redhat.com/anderson/crash-6.1.0.tar.gz

First, run the script tools/livedump/livedump as follows.
 # livedump dump 

At this point, the whole memory image has been saved. You can then analyze
the image by running the patched crash as follows.
 # crash  System.map vmlinux.o

The following command releases all livedump resources.
 # livedump release

---

YOSHIDA Masanori (3):
  livedump: Add memory dumping functionality
  livedump: Add write protection management
  livedump: Add the new misc device "livedump"


 arch/x86/Kconfig |   29 ++
 arch/x86/include/asm/wrprotect.h |   45 +++
 arch/x86/mm/Makefile |2 
 arch/x86/mm/fault.c  |7 
 arch/x86/mm/wrprotect.c  |  548 ++
 kernel/Makefile  |1 
 kernel/livedump-memdump.c|  445 +++
 kernel/livedump-memdump.h|   32 ++
 kernel/livedump.c|  133 +
 tools/livedump/livedump  |   38 +++
 10 files changed, 1280 insertions(+)
 create mode 100644 arch/x86/include/asm/wrprotect.h
 create mode 100644 arch/x86/mm/wrprotect.c
 create mode 100644 kernel/livedump-memdump.c
 create mode 100644 kernel/livedump-memdump.h
 create mode 100644 kernel/livedump.c
 create mode 100755 tools/livedump/livedump

-- 
YOSHIDA Masanori
Linux Technology Center
Yokohama Research Laboratory
Hitachi, Ltd.
diff --git a/filesys.c b/filesys.c
index cc78f7d..21ddb12 100755
--- a/filesys.c
+++ b/filesys.c
@@ -168,6 +168,7 @@ memory_source_init(void)
return;
 
if (!STREQ(pc->live_memsrc, "/dev/mem") &&
+   !STRNEQ(pc->live_memsrc, "/dev/sd") &&
 STREQ(pc->live_memsrc, pc->memory_device)) {
if (memory_driver_init())
return;
@@ -188,6 +189,11 @@ memory_source_init(void)
strerror(errno));
} else

RE: [PATCH] usb: remove CONFIG_USB_MUSB_HOST etc

2012-10-10 Thread Manjunathappa, Prakash
Hi,
On Mon, Oct 08, 2012 at 18:47:07, Constantine Shulyupin wrote:
> From: Constantine Shulyupin 
> 
> Remove USB configuration in arch/arm/mach-davinci/usb.c accordingly 
> CONFIG_USB_MUSB_OTG CONFIG_USB_MUSB_PERIPHERAL CONFIG_USB_MUSB_HOST 
> and set MUSB_OTG configuration by default
> because this configuration options are removed from Kconfig.
> 
> Signed-off-by: Constantine Shulyupin 
>  
> ---
>  arch/arm/mach-davinci/usb.c |6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/arch/arm/mach-davinci/usb.c b/arch/arm/mach-davinci/usb.c
> index f77b953..34509ff 100644
> --- a/arch/arm/mach-davinci/usb.c
> +++ b/arch/arm/mach-davinci/usb.c
> @@ -42,14 +42,8 @@ static struct musb_hdrc_config musb_config = {
>  };
>  
>  static struct musb_hdrc_platform_data usb_data = {
> -#if defined(CONFIG_USB_MUSB_OTG)
>   /* OTG requires a Mini-AB connector */
>   .mode   = MUSB_OTG,
> -#elif defined(CONFIG_USB_MUSB_PERIPHERAL)
> - .mode   = MUSB_PERIPHERAL,
> -#elif defined(CONFIG_USB_MUSB_HOST)
> - .mode   = MUSB_HOST,
> -#endif
>   .clock  = "usb",
>   .config = &musb_config,
>  };

Tested it on DM6446-EVM for host mode with MSC thumb drive and gadget 
mode with g-ether. It works.

Acked-by: Manjunathappa, Prakash 

Thanks,
Prakash

> -- 
> 1.7.9.5
> 
> ___
> Davinci-linux-open-source mailing list
> davinci-linux-open-sou...@linux.davincidsp.com
> http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source
> 



Re: REGRESSION: usbdevfs: Use-scatter-gather-lists-for-large-bulk-transfers

2012-10-10 Thread Henrik Rydberg
On Wed, Oct 10, 2012 at 10:34:59PM +0200, Peter Stuge wrote:
> Hej Henrik,
> 
> Henrik Rydberg wrote:
> > commit 3d97ff63f8997761f12c8fbe8082996c6eeaba1a
> > Author: Hans de Goede 
> > Date:   Wed Jul 4 09:18:03 2012 +0200
> > 
> > usbdevfs: Use scatter-gather lists for large bulk transfers
> > 
> > breaks an usb programming cable over here. The problem is reported as
> > "bulk tranfer failed" [sic] by the tool, and bisection leads to this
> > commit. Reverting on top of 3.6 solves it for me.
> > 
> > I am happy to test alternatives.
> 
> In order to make full use of the new kernel commit you also need
> changes in libusb, if the tool uses libusb, but I agree that the
> kernel change must under no circumstance cause existing userland
> software to regress.

Indeed.

> What is the programming cable and software that uses it?

The programmer is impact, using libusbx-1.0.14-1. The device runs the
xusbdfwu firmware. The (usb 2.0) bulk endpoint says wMaxPacketSize
1x512 bytes, and bInterval 0.

The patch is pretty generic, so I am surprised the problem has not
shown up earlier.

Henrik


Re: acpi : acpi_bus_trim() stops removing devices when failing to remove the device

2012-10-10 Thread Yasuaki Ishimatsu

Hi Toshi,

2012/10/10 22:01, Toshi Kani wrote:

On Wed, 2012-10-10 at 10:07 +0900, Yasuaki Ishimatsu wrote:
  :

if (acpi_drv) {
if (acpi_drv->ops.notify)
acpi_device_remove_notify_handler(acpi_dev);


THIS CALL


-   if (acpi_drv->ops.remove)
-   acpi_drv->ops.remove(acpi_dev, acpi_dev->removal_type);
+   if (acpi_drv->ops.remove) {
+   ret = acpi_drv->ops.remove(acpi_dev,
+  acpi_dev->removal_type);
+   if (ret)


Hi Yasuaki,

Shouldn't the notify handler be reinstalled here if it was removed by
the acpi_device_remove_notify_handler() above?


I do not reinstall the notify handler.
The function has not been removed on linux-3.6. And the patch is created
on linux-3.6. So the function remains in the patch.


Umm... I am not sure what you meant.  Let me clarify my comment.  When
acpi_drv->ops.remove() failed, I thought we would need to roll-back the
procedure done by the acpi_device_remove_notify_handler() call, which I
indicated as "THIS CALL" above.  So, in this error path, don't we need
something like below?

if (acpi_drv->ops.notify)
acpi_device_install_notify_handler(acpi_dev)


I understood what you said.  I'll update it.

Thanks,
Yasuaki Ishimatsu



Thanks,
-Toshi
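
The roll-back being discussed can be illustrated with a small user-space
model. The structures and function names below are hypothetical stand-ins,
not the ACPI code itself; the sketch only shows the pattern of undoing the
notify-handler removal when remove() fails.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the structures in the discussion. */
struct mock_device {
    bool notify_installed;
    bool bound;
};

struct mock_driver {
    bool has_notify;
    int (*remove)(struct mock_device *dev);
};

/*
 * Unbind a driver. If remove() fails, reinstall the notify handler so
 * the device is left exactly as it was before the call.
 */
static int mock_unbind(struct mock_driver *drv, struct mock_device *dev)
{
    int ret;

    if (drv->has_notify)
        dev->notify_installed = false;  /* corresponds to "THIS CALL" */

    if (drv->remove) {
        ret = drv->remove(dev);
        if (ret) {
            if (drv->has_notify)
                dev->notify_installed = true;  /* roll back */
            return ret;
        }
    }

    dev->bound = false;
    return 0;
}

/* Two example remove() callbacks for exercising both paths. */
static int remove_fails(struct mock_device *dev)
{
    (void)dev;
    return -EBUSY;
}

static int remove_ok(struct mock_device *dev)
{
    (void)dev;
    return 0;
}
```

With remove_fails, the device keeps both its binding and its notify
handler; with remove_ok, both are dropped.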









Re: [PATCH 1/4] dmaengine: dw_dmac: use helper macro module_platform_driver()

2012-10-10 Thread Mika Westerberg
On Wed, Oct 10, 2012 at 04:42:00PM +0300, Felipe Balbi wrote:
> Hi,
> 
> On Wed, Oct 10, 2012 at 03:52:40PM +0300, Andy Shevchenko wrote:
> > On Wed, Oct 10, 2012 at 3:40 PM, Felipe Balbi  wrote:
> > > On Wed, Oct 10, 2012 at 12:21:04PM +0300, Andy Shevchenko wrote:
> > >> On Wed, Oct 10, 2012 at 12:08 PM, viresh kumar  
> > >> wrote:
> > >> > On Wed, Oct 10, 2012 at 2:34 PM, Andy Shevchenko
> > >> >  wrote:
> > >> >> On Tue, 2012-10-02 at 14:41 +0300, Andy Shevchenko wrote:
> > >> >>> From: Heikki Krogerus 
> > >> >>>
> > >> >>> Since v3.2 we have nice macro to define the platform driver's init 
> > >> >>> and exit
> > >> >>> calls. This patch simplifies the dw_dmac driver by using that macro.
> > >> >>
> > >> >> Actually we can't do this. It will break initialization of some other
> > >> >> drivers.
> > >> >
> > >> > why?
> > >>
> > >> We have spi, i2c and hsuart devices connected to the DMA controller.
> > >> In case we would like to use DMA we have to have the dw_dmac loaded
> > >> before them. Currently we have spi driver on subsys_initcall level,
> > >> and Mika, who is developing it, will change to module_init_call level.
> > >> However, it will just hide the potential issue. He also tried to use
> > >> deferred module loading, but we don't know if it's good solution or
> > >> not, and that solution requires something to stop deferring at some
> > >> moment.
> > >>
> > >> Might be we missed something and there is a better solution.
> > >
> > > if they can only work with DMA, they should return -EPROBE_DEFER so
> > > their probe() function can be called after DMA driver has finished
> > > probing.
> > 
> > They could work either with DMA or via PIO mode.
> > How does the driver know when to stop to return -EPROBE_DEFER?
> 
> Why would you even allow to work as PIO-only ? Who would even want to
> use the driver as PIO only ?

Think about SPI or I2C, if we don't have DMA available we are still able to
use the driver (and the bus) instead of just failing.

> In any case, you can add a Kconfig choice like WHATEVER_PIO_ONLY and
> only return -EPROBE_DEFER ifndef WHATEVER_PIO_ONLY.

Why would we add more Kconfig options for things that can be checked at
runtime? Distro makers need to select that option anyway, so it doesn't gain
anything except confusing users.
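
A runtime check of this kind could look roughly like the user-space model
below. It is only a sketch of the idea: the mock_* types, the
request_chan callback, and the probe function are hypothetical and do not
correspond to the real SPI or dmaengine interfaces.

```c
#include <stddef.h>

struct mock_dma_chan { int id; };

enum xfer_mode { XFER_PIO, XFER_DMA };

struct mock_spi {
    enum xfer_mode mode;
    struct mock_dma_chan *chan;
};

/*
 * Hypothetical probe(): take a DMA channel if one is available right
 * now, otherwise fall back to PIO instead of failing or deferring.
 */
static int mock_spi_probe(struct mock_spi *spi,
                          struct mock_dma_chan *(*request_chan)(void))
{
    spi->chan = request_chan ? request_chan() : NULL;
    spi->mode = spi->chan ? XFER_DMA : XFER_PIO;
    return 0;
}

/* Two example channel providers: DMA driver loaded vs. not loaded. */
static struct mock_dma_chan the_chan = { .id = 0 };

static struct mock_dma_chan *chan_ready(void)
{
    return &the_chan;
}

static struct mock_dma_chan *chan_missing(void)
{
    return NULL;
}
```

This model leaves the ordering question from earlier in the thread open:
if the DMA driver probes later, a device that already chose PIO stays in
PIO unless something re-checks for a channel afterwards.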


[PATCH RFC 2/2] [x86] Optimize copy_page by re-arranging instruction sequence and saving register

2012-10-10 Thread ling . ma
From: Ma Ling 

Load and store operations account for about 35% and 10% of instructions,
respectively, in most industry benchmarks. A fetched 16-byte aligned code
block contains about 4 instructions, implying about 1.4 (0.35 * 4) loads
and 0.4 (0.10 * 4) stores per fetch.
Modern CPUs support 2 loads and 1 store per cycle, so store throughput is
the bottleneck for memcpy or copy_page, and some smaller CPUs support only
one memory operation per cycle. So it is enough to issue one load and one
store instruction per cycle, and we can save registers.

In this patch we also re-arrange the instruction sequence to improve
performance. Performance on Atom improves by about 11% and 9% in the
hot-cache and cold-cache cases, respectively.
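
As a rough C model of the new loop structure (an illustration, not the
patch itself), each 64-byte block is copied as two groups of four loads
followed by four stores, so only four temporaries are needed. The function
name copy_page_model is hypothetical, both pointers are assumed to be
8-byte aligned, and a compiler is free to reorder or vectorize this code,
so it says nothing about the measured speedup.

```c
#include <stdint.h>

#define PAGE_BYTES 4096

/*
 * Copy one page in 64-byte blocks using only four temporaries:
 * load four quadwords, store them, then repeat for the second half
 * of the block. This mirrors the load/store grouping in the patch.
 */
static void copy_page_model(void *to, const void *from)
{
    uint64_t *d = to;
    const uint64_t *s = from;

    for (int i = 0; i < PAGE_BYTES / 64; i++) {
        uint64_t a, b, c, e;

        a = s[0]; b = s[1]; c = s[2]; e = s[3];
        d[0] = a; d[1] = b; d[2] = c; d[3] = e;

        a = s[4]; b = s[5]; c = s[6]; e = s[7];
        d[4] = a; d[5] = b; d[6] = c; d[7] = e;

        s += 8;
        d += 8;
    }
}
```

The previous version loaded all eight quadwords before storing any of
them, which is why it needed eight registers and had to save and restore
%rbx and %r12.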

Signed-off-by: Ma Ling 

---
 arch/x86/lib/copy_page_64.S |  103 +-
 1 files changed, 42 insertions(+), 61 deletions(-)

diff --git a/arch/x86/lib/copy_page_64.S b/arch/x86/lib/copy_page_64.S
index 3da5527..13c97f4 100644
--- a/arch/x86/lib/copy_page_64.S
+++ b/arch/x86/lib/copy_page_64.S
@@ -20,76 +20,57 @@ ENDPROC(copy_page_rep)
 
 ENTRY(copy_page)
CFI_STARTPROC
-   subq $2*8,   %rsp
-   CFI_ADJUST_CFA_OFFSET 2*8
-   movq %rbx,   (%rsp)
-   CFI_REL_OFFSET rbx, 0
-   movq %r12,   1*8(%rsp)
-   CFI_REL_OFFSET r12, 1*8
+   mov $(4096/64)-5, %ecx
 
-   movl $(4096/64)-5,   %ecx
-   .p2align 4
 .Loop64:
-   dec %rcx
-
-   movq 0x8*0(%rsi), %rax
-   movq 0x8*1(%rsi), %rbx
-   movq 0x8*2(%rsi), %rdx
-   movq 0x8*3(%rsi), %r8
-   movq 0x8*4(%rsi), %r9
-   movq 0x8*5(%rsi), %r10
-   movq 0x8*6(%rsi), %r11
-   movq 0x8*7(%rsi), %r12
-
    prefetcht0 5*64(%rsi)
-
-   movq %rax, 0x8*0(%rdi)
-   movq %rbx, 0x8*1(%rdi)
-   movq %rdx, 0x8*2(%rdi)
-   movq %r8,  0x8*3(%rdi)
-   movq %r9,  0x8*4(%rdi)
-   movq %r10, 0x8*5(%rdi)
-   movq %r11, 0x8*6(%rdi)
-   movq %r12, 0x8*7(%rdi)
-
-   leaq 64 (%rsi), %rsi
-   leaq 64 (%rdi), %rdi
-
+   decb %cl
+
+   movq 0x8*0(%rsi), %r10
+   movq 0x8*1(%rsi), %rax
+   movq 0x8*2(%rsi), %r8
+   movq 0x8*3(%rsi), %r9
+   movq %r10, 0x8*0(%rdi)
+   movq %rax, 0x8*1(%rdi)
+   movq %r8, 0x8*2(%rdi)
+   movq %r9, 0x8*3(%rdi)
+
+   movq 0x8*4(%rsi), %r10
+   movq 0x8*5(%rsi), %rax
+   movq 0x8*6(%rsi), %r8
+   movq 0x8*7(%rsi), %r9
+   leaq 64(%rsi), %rsi
+   movq %r10, 0x8*4(%rdi)
+   movq %rax, 0x8*5(%rdi)
+   movq %r8, 0x8*6(%rdi)
+   movq %r9, 0x8*7(%rdi)
+   leaq 64(%rdi), %rdi
    jnz .Loop64
 
-   movl $5, %ecx
-   .p2align 4
+   mov $5, %dl
 .Loop2:
-   decl %ecx
-
-   movq 0x8*0(%rsi), %rax
-   movq 0x8*1(%rsi), %rbx
-   movq 0x8*2(%rsi), %rdx
-   movq 0x8*3(%rsi), %r8
-   movq 0x8*4(%rsi), %r9
-   movq 0x8*5(%rsi), %r10
-   movq 0x8*6(%rsi), %r11
-   movq 0x8*7(%rsi), %r12
-
-   movq %rax, 0x8*0(%rdi)
-   movq %rbx, 0x8*1(%rdi)
-   movq %rdx, 0x8*2(%rdi)
-   movq %r8,  0x8*3(%rdi)
-   movq %r9,  0x8*4(%rdi)
-   movq %r10, 0x8*5(%rdi)
-   movq %r11, 0x8*6(%rdi)
-   movq %r12, 0x8*7(%rdi)
-
-   leaq 64(%rdi), %rdi
+   decb %dl
+   movq 0x8*0(%rsi), %r10
+   movq 0x8*1(%rsi), %rax
+   movq 0x8*2(%rsi), %r8
+   movq 0x8*3(%rsi), %r9
+   movq %r10, 0x8*0(%rdi)
+   movq %rax, 0x8*1(%rdi)
+   movq %r8, 0x8*2(%rdi)
+   movq %r9, 0x8*3(%rdi)
+
+   movq 0x8*4(%rsi), %r10
+   movq 0x8*5(%rsi), %rax
+   movq 0x8*6(%rsi), %r8
+   movq 0x8*7(%rsi), %r9
    leaq 64(%rsi), %rsi
+   movq %r10, 0x8*4(%rdi)
+   movq %rax, 0x8*5(%rdi)
+   movq %r8, 0x8*6(%rdi)
+   movq %r9, 0x8*7(%rdi)
+   leaq 64(%rdi), %rdi
    jnz .Loop2
 
-   movq (%rsp), %rbx
-   CFI_RESTORE rbx
-   movq 1*8(%rsp), %r12
-   CFI_RESTORE r12
-   addq $2*8, %rsp
-   CFI_ADJUST_CFA_OFFSET -2*8
ret
 .Lcopy_page_end:
CFI_ENDPROC
-- 
1.6.5.2



[PATCH RFC 1/2] [x86] Modify comments and clean up code.

2012-10-10 Thread ling . ma
From: Ma Ling 

Modern CPUs use fast-string instructions to accelerate copies by combining
data into 128-bit chunks, so this patch updates the comments and code style
accordingly.

Signed-off-by: Ma Ling 

---
 arch/x86/lib/copy_page_64.S |  119 +--
 1 files changed, 59 insertions(+), 60 deletions(-)

diff --git a/arch/x86/lib/copy_page_64.S b/arch/x86/lib/copy_page_64.S
index 6b34d04..3da5527 100644
--- a/arch/x86/lib/copy_page_64.S
+++ b/arch/x86/lib/copy_page_64.S
@@ -5,91 +5,90 @@
 #include 
 
ALIGN
-copy_page_c:
+copy_page_rep:
CFI_STARTPROC
-   movl $4096/8,%ecx
-   rep movsq
+   movl $4096/8, %ecx
+   rep movsq
ret
CFI_ENDPROC
-ENDPROC(copy_page_c)
+ENDPROC(copy_page_rep)
 
-/* Don't use streaming store because it's better when the target
-   ends up in cache. */
-   
-/* Could vary the prefetch distance based on SMP/UP */
+/*
+   Don't use streaming copy unless cpu indicate X86_FEATURE_REP_GOOD
+   Could vary the prefetch distance based on SMP/UP
+*/
 
 ENTRY(copy_page)
CFI_STARTPROC
-   subq$2*8,%rsp
+   subq$2*8,   %rsp
CFI_ADJUST_CFA_OFFSET 2*8
-   movq%rbx,(%rsp)
+   movq%rbx,   (%rsp)
CFI_REL_OFFSET rbx, 0
-   movq%r12,1*8(%rsp)
+   movq%r12,   1*8(%rsp)
CFI_REL_OFFSET r12, 1*8
 
-   movl$(4096/64)-5,%ecx
+   movl$(4096/64)-5,   %ecx
.p2align 4
 .Loop64:
-   dec %rcx
+   dec %rcx
 
-   movq(%rsi), %rax
-   movq  8 (%rsi), %rbx
-   movq 16 (%rsi), %rdx
-   movq 24 (%rsi), %r8
-   movq 32 (%rsi), %r9
-   movq 40 (%rsi), %r10
-   movq 48 (%rsi), %r11
-   movq 56 (%rsi), %r12
+   movq0x8*0(%rsi), %rax
+   movq0x8*1(%rsi), %rbx
+   movq0x8*2(%rsi), %rdx
+   movq0x8*3(%rsi), %r8
+   movq0x8*4(%rsi), %r9
+   movq0x8*5(%rsi), %r10
+   movq0x8*6(%rsi), %r11
+   movq0x8*7(%rsi), %r12
 
prefetcht0 5*64(%rsi)
 
-   movq %rax,(%rdi)
-   movq %rbx,  8 (%rdi)
-   movq %rdx, 16 (%rdi)
-   movq %r8,  24 (%rdi)
-   movq %r9,  32 (%rdi)
-   movq %r10, 40 (%rdi)
-   movq %r11, 48 (%rdi)
-   movq %r12, 56 (%rdi)
+   movq%rax, 0x8*0(%rdi)
+   movq%rbx, 0x8*1(%rdi)
+   movq%rdx, 0x8*2(%rdi)
+   movq%r8,  0x8*3(%rdi)
+   movq%r9,  0x8*4(%rdi)
+   movq%r10, 0x8*5(%rdi)
+   movq%r11, 0x8*6(%rdi)
+   movq%r12, 0x8*7(%rdi)
 
-   leaq 64 (%rsi), %rsi
-   leaq 64 (%rdi), %rdi
+   leaq 64 (%rsi), %rsi
+   leaq 64 (%rdi), %rdi
 
-   jnz .Loop64
+   jnz .Loop64
 
-   movl $5,%ecx
+   movl $5, %ecx
.p2align 4
 .Loop2:
-   decl   %ecx
-
-   movq (%rsi), %rax
-   movq  8 (%rsi), %rbx
-   movq 16 (%rsi), %rdx
-   movq 24 (%rsi), %r8
-   movq 32 (%rsi), %r9
-   movq 40 (%rsi), %r10
-   movq 48 (%rsi), %r11
-   movq 56 (%rsi), %r12
-
-   movq %rax,(%rdi)
-   movq %rbx,  8 (%rdi)
-   movq %rdx, 16 (%rdi)
-   movq %r8,  24 (%rdi)
-   movq %r9,  32 (%rdi)
-   movq %r10, 40 (%rdi)
-   movq %r11, 48 (%rdi)
-   movq %r12, 56 (%rdi)
-
-   leaq 64(%rdi),%rdi
-   leaq 64(%rsi),%rsi
-
+   decl %ecx
+
+   movq 0x8*0(%rsi), %rax
+   movq 0x8*1(%rsi), %rbx
+   movq 0x8*2(%rsi), %rdx
+   movq 0x8*3(%rsi), %r8
+   movq 0x8*4(%rsi), %r9
+   movq 0x8*5(%rsi), %r10
+   movq 0x8*6(%rsi), %r11
+   movq 0x8*7(%rsi), %r12
+
+   movq %rax, 0x8*0(%rdi)
+   movq %rbx, 0x8*1(%rdi)
+   movq %rdx, 0x8*2(%rdi)
+   movq %r8,  0x8*3(%rdi)
+   movq %r9,  0x8*4(%rdi)
+   movq %r10, 0x8*5(%rdi)
+   movq %r11, 0x8*6(%rdi)
+   movq %r12, 0x8*7(%rdi)
+
+   leaq 64(%rdi), %rdi
+   leaq 64(%rsi), %rsi
jnz .Loop2
 
-   movq (%rsp),%rbx
+   movq (%rsp), %rbx
CFI_RESTORE rbx
-   movq 1*8(%rsp),%r12
+   movq 1*8(%rsp), %r12
CFI_RESTORE r12
-   addq $2*8,%rsp
+   addq $2*8, %rsp
CFI_ADJUST_CFA_OFFSET -2*8
ret
 .Lcopy_page_end:
@@ -103,7 +102,7 @@ ENDPROC(copy_page)
 
.section .altinstr_replacement,"ax"
 1: .byte 0xeb  /* jmp  */
-   .byte (copy_page_c - copy_page) - (2f - 1b) /* offset */
+   .byte (copy_page_rep - copy_page) - (2f - 1b)   /* offset */
 2:
.previous
.section .altinstructions,"a"
-- 
1.6.5.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: [PATCH 4/4] mtd: nand: omap2: Add data correction support

2012-10-10 Thread Philip, Avinash
On Wed, Oct 10, 2012 at 22:38:06, Ivan Djelic wrote:
> On Tue, Oct 09, 2012 at 01:36:50PM +0100, Philip, Avinash wrote:
> (...)
> > > There are at least 2 potential problems when reading an erased page
> > > with bitflips:
> > > 
> > > 1. bitflip in data area and no bitflip in spare area (all 0xff)
> > > Your code will not perform any ECC correction.
> > > UBIFS does not like finding bitflips in empty pages, see for instance
> > > http://lists.infradead.org/pipermail/linux-mtd/2012-March/040328.html.
> > 
> > In case of error correction using ELM, the syndrome vector is calculated
> > after reading the data area and the OOB area. So handling of erased pages
> > requires a software workaround.
> > I am planning something as follows.
> > 
> > I will first check the calculated ECC, which would be zero for error-free
> > pages. Then I would check for 0xFF in the OOB area (erased page) by counting
> > the zero bits in the OOB area. If it is all 0xFF (the zero-bit count is
> > zero), I would count the zero bits in the data area and set the entire page
> > to 0xFF if that count is less than max bit flips (8 or 4).
> > 
> > This logic is implemented in fsmc_nand.c
> > 
> > See commit
> > mtd: fsmc: Newly erased page read algorithm implemented
> > 
> > > 
> > > 2. bitflip in ECC bytes in spare area
> > > Your code will report an uncorrectable error upon reading; if this happens
> > > while reading a partially programmed UBI block, I guess you will lose data.
> > 
> > In case of uncorrectable errors due to bit flips in the spare area,
> > I can check whether the number of zero bits in data area + OOB area
> > is less than max bit flips (8 or 4) and, if so, set the entire
> > page to 0xFF.
> > 
> 
> OK, sounds reasonable.
> Another simple strategy could use the fact that you add a 14th zero byte to
> the 13 BCH bytes for RBL compatibility:

RBL compatibility (the 14th byte) applies only to the BCH8 ECC scheme.

So I am planning to add an extra byte (0) for the BCH4 ECC scheme. With this,
we can use the same approach for both the BCH4 and BCH8 ECC schemes.

If I understood correctly, the software BCH ECC scheme modifies the calculated
ECC data to handle bit flips in erased pages.

If that is the only reason, could the same logic be used there as well, i.e.
remove the modification of the calculated ECC in the software ECC case and
instead add an extra byte (0) in the spare area to handle erased pages?

Please let me know if I am missing something.

> 
> Upon reading:
>  - if this 14th byte is zero (*) => page was programmed: perform ECC
>correction as usual
>  - else, page was not programmed: do not perform ECC, read entire data+spare
>area, and set it to 0xff if less than 8 or 4 (max bitflips) zero bits
>were found
> 
> (*) for robustness to bitflip in 14th byte, replace condition
> "14th byte is zero" by e.g. "14th byte has less than 4 bits set to 1".
> 
> What do you think ?

This seems logically good.

Thanks
Avinash

> 
> BR,
> --
> Ivan
> 
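
The strategy Ivan describes above (marker byte check, then zero-bit counting)
can be sketched in plain C roughly as follows. This is only an illustration
under stated assumptions, not code from the omap2 driver: the helper names,
the buffer layout and the return convention are all hypothetical, and the
"less than max bitflips" comparison is taken literally from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of bits set to 1 in one byte. */
static unsigned int bits_set(uint8_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= (uint8_t)(v - 1);	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Number of bits that read back as 0 in a buffer. */
static unsigned int count_zero_bits(const uint8_t *buf, size_t len)
{
	unsigned int zeros = 0;
	size_t i;

	for (i = 0; i < len; i++)
		zeros += bits_set((uint8_t)~buf[i]);
	return zeros;
}

/*
 * Hypothetical helper (not a real MTD API).
 * Returns  0 if the page looks programmed (caller runs ECC as usual),
 *          1 if the page was treated as erased and reset to 0xff,
 *         -1 if it looks erased but has too many zero bits.
 */
static int check_erased_page(uint8_t *data, size_t data_len,
			     uint8_t *oob, size_t oob_len,
			     size_t marker_idx, unsigned int max_bitflips)
{
	unsigned int zeros;

	assert(marker_idx < oob_len);

	/* "marker byte has less than 4 bits set to 1" => page was programmed */
	if (bits_set(oob[marker_idx]) < 4)
		return 0;

	zeros = count_zero_bits(data, data_len) + count_zero_bits(oob, oob_len);
	if (zeros >= max_bitflips)
		return -1;

	/* Few enough flips: report a clean erased page. */
	memset(data, 0xff, data_len);
	memset(oob, 0xff, oob_len);
	return 1;
}
```

With a 16-byte OOB buffer, marker_idx would be 13 for the 14th byte, and
max_bitflips would be 8 or 4 depending on the BCH scheme.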



RE: [PATCH] fix x2apic defect that Linux kernel doesn't mask 8259A interrupt during the time window between changing VT-d table base address and initializing these VT-d entries(smpboot.c and apic.c )

2012-10-10 Thread H. Peter Anvin
OSV = Operating System Vendor, i.e. Red Hat, SUSE etc.

"Zhang, Lin-Bao (Linux Kernel R)"  wrote:

>
>> -Original Message-
>> From: Suresh Siddha [mailto:suresh.b.sid...@intel.com]
>> > > Can you please apply the appended patch to 2.6.32 kernel and see if
>> > > the issue you mentioned gets fixed? If so, we can ask the -stable
>> > > and OSV's teams to pick up this fix.
>> > Yes , it can resolve current issue.
>> 
>> Thanks for testing it out.
>> 
>You are welcome!
>
>> I will add the appropriate changelog and send the patch out (to 2.6.32
>> stable and OSV kernels) with your "Tested-by:" if you are ok.
>> 
>Sure, it is my pleasure. Please go ahead!
>BTW, what are OSV kernels? I can't find the meaning by searching Google.
>Once your patch has been included in the 2.6 git tree, kindly inform me. Thanks.
>
>
>> thanks,
>> suresh

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.


[PATCH 2/2]suppress "Device nodeX does not have a release() function" warning

2012-10-10 Thread Yasuaki Ishimatsu
When unregister_node() is called, device_release() prints the following
message:

"Device 'node2' does not have a release() function, it is broken and must
be fixed."

The reason is that the node's device struct does not have a release() function.

So the patch registers node_device_release() as the device's release()
function to suppress the warning message. Additionally, the patch adds a
memset() to register_node() to initialize the node struct, because the node
struct is part of the node_devices[] array and cannot be freed by
node_device_release(). Without it, a reused node struct would contain stale
data.

CC: David Rientjes 
CC: Jiang Liu 
Cc: Minchan Kim 
CC: Andrew Morton 
CC: KOSAKI Motohiro 
Signed-off-by: Yasuaki Ishimatsu 
Signed-off-by: Wen Congyang 
---
 drivers/base/node.c |   11 +++
 1 file changed, 11 insertions(+)

Index: linux-3.6/drivers/base/node.c
===
--- linux-3.6.orig/drivers/base/node.c  2012-10-11 10:04:02.149758748 +0900
+++ linux-3.6/drivers/base/node.c   2012-10-11 10:20:34.111806931 +0900
@@ -252,6 +252,14 @@ static inline void hugetlb_register_node
 static inline void hugetlb_unregister_node(struct node *node) {}
 #endif
 
+static void node_device_release(struct device *dev)
+{
+#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HUGETLBFS)
+   struct node *node_dev = to_node(dev);
+
+   flush_work(&node_dev->node_work);
+#endif
+}
 
 /*
  * register_node - Setup a sysfs device for a node.
@@ -263,8 +271,11 @@ int register_node(struct node *node, int
 {
int error;
 
+   memset(node, 0, sizeof(*node));
+
node->dev.id = num;
node->dev.bus = &node_subsys;
+   node->dev.release = node_device_release;
error = device_register(&node->dev);
 
if (!error){



[PATCH 1/2]suppress "Device memoryX does not have a release() function" warning

2012-10-10 Thread Yasuaki Ishimatsu
When remove_memory_block() is called, device_release() prints the following
message:

"Device 'memory528' does not have a release() function, it is broken and must
be fixed."

The reason is that memory_block's device struct does not have a release()
function.

So the patch registers memory_block_release() as the device's release()
function to suppress the warning message. Additionally, the patch moves
kfree(mem) into the release function, since the release function is meant to
be the place where a memory_block struct is freed.

CC: David Rientjes 
CC: Jiang Liu 
Cc: Minchan Kim 
CC: Andrew Morton 
CC: KOSAKI Motohiro 
CC: Wen Congyang 
Signed-off-by: Yasuaki Ishimatsu 
---
 drivers/base/memory.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: linux-3.6/drivers/base/memory.c
===
--- linux-3.6.orig/drivers/base/memory.c 2012-10-11 11:37:33.404668048 +0900
+++ linux-3.6/drivers/base/memory.c 2012-10-11 11:38:27.865672989 +0900
@@ -70,6 +70,13 @@ void unregister_memory_isolate_notifier(
 }
 EXPORT_SYMBOL(unregister_memory_isolate_notifier);
 
+static void memory_block_release(struct device *dev)
+{
+   struct memory_block *mem = container_of(dev, struct memory_block, dev);
+
+   kfree(mem);
+}
+
 /*
  * register_memory - Setup a sysfs device for a memory block
  */
@@ -80,6 +87,7 @@ int register_memory(struct memory_block 
 
memory->dev.bus = &memory_subsys;
memory->dev.id = memory->start_section_nr / sections_per_block;
+   memory->dev.release = memory_block_release;
 
error = device_register(&memory->dev);
return error;
@@ -630,7 +638,6 @@ int remove_memory_block(unsigned long no
mem_remove_simple_file(mem, phys_device);
mem_remove_simple_file(mem, removable);
unregister_memory(mem);
-   kfree(mem);
} else
kobject_put(&mem->dev.kobj);
 
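
The ownership rule behind this patch (the object is freed from the release
callback once the last reference is dropped, not by the caller that
unregisters it) can be illustrated with a small userspace sketch. All names
below (mini_device, mini_get, mini_put and so on) are made up for this
illustration; the real driver core uses struct device and kobject reference
counting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified "device" with a release callback. */
struct mini_device {
	int refcount;
	void (*release)(struct mini_device *dev);
};

/* The containing object, analogous to struct memory_block. */
struct mini_memory_block {
	unsigned long start_section_nr;
	struct mini_device dev;
};

static int mini_release_calls;	/* only here so the sketch can be checked */

/* Release callback: recover the container and free it. */
static void mini_memory_block_release(struct mini_device *dev)
{
	struct mini_memory_block *mem =
		container_of(dev, struct mini_memory_block, dev);

	mini_release_calls++;
	free(mem);
}

static struct mini_memory_block *mini_memory_block_create(unsigned long nr)
{
	struct mini_memory_block *mem = calloc(1, sizeof(*mem));

	if (!mem)
		return NULL;
	mem->start_section_nr = nr;
	mem->dev.refcount = 1;
	mem->dev.release = mini_memory_block_release;
	return mem;
}

static void mini_get(struct mini_device *dev)
{
	dev->refcount++;
}

/* Drop one reference; the last put triggers the release callback. */
static void mini_put(struct mini_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0 && dev->release)
		dev->release(dev);
}
```

Because the free happens in the callback, a holder that still has a
reference after "unregister" never sees freed memory.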



[PATCH v2 0/2] Suppress "Device does not have a release() function" warning

2012-10-10 Thread Yasuaki Ishimatsu
This patch-set updates patches [1] and [2]
  [1] memory-hotplug: add memory_block_release
  [2] memory-hotplug: add node_device_release
from the following patch-set:

https://lkml.org/lkml/2012/9/27/39

So the patch-set version is v2.

v1 -> v2
[PATCH 1/2]
- Change subject to Suppress "Device memoryX does not have a release()
  function" warning.
- Add detailed information to the description.
- Change function name from release_memory_block() to memory_block_release(),
  because other device release() functions are named _release().
[PATCH 2/2]
- Change subject to Suppress "Device nodeX does not have a release() function"
  warning.
- Add detailed information to the description.
- Remove the memset() that initialized a node struct from node_device_release().
- Add a memset() to register_node() to initialize a node struct.




[PATCH] backlight: lm3639: Return proper error in lm3639_bled_mode_store error paths

2012-10-10 Thread Axel Lin
Signed-off-by: Axel Lin 
---
 drivers/video/backlight/lm3639_bl.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/backlight/lm3639_bl.c b/drivers/video/backlight/lm3639_bl.c
index c6915c6..585949b 100644
--- a/drivers/video/backlight/lm3639_bl.c
+++ b/drivers/video/backlight/lm3639_bl.c
@@ -206,11 +206,11 @@ static ssize_t lm3639_bled_mode_store(struct device *dev,
 
 out:
dev_err(pchip->dev, "%s:i2c access fail to register\n", __func__);
-   return size;
+   return ret;
 
 out_input:
dev_err(pchip->dev, "%s:input conversion fail\n", __func__);
-   return size;
+   return ret;
 
 }
 
-- 
1.7.9.5
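
For readers unfamiliar with the convention this patch restores: a sysfs-style
store callback returns the number of bytes consumed on success and a negative
errno value on failure, so returning size from an error path hides the error
from the writer. A minimal userspace sketch follows; demo_mode_store and its
parameters are hypothetical and are not the lm3639 code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical store-style helper: parse a decimal mode value.
 * Returns size on success, a negative errno value on failure.
 */
static ssize_t demo_mode_store(const char *buf, size_t size,
			       unsigned int *mode)
{
	char *end;
	unsigned long val;
	int ret;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || end == buf || (*end != '\0' && *end != '\n')) {
		ret = -EINVAL;
		goto out_input;
	}

	*mode = (unsigned int)val;
	return (ssize_t)size;

out_input:
	/* Propagate the error instead of pretending the write succeeded. */
	return ret;
}
```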





Re: dtc: import latest upstream dtc

2012-10-10 Thread David Gibson
On Wed, Oct 10, 2012 at 03:42:33PM -1000, Mitch Bradley wrote:
> On 10/10/2012 1:16 PM, David Gibson wrote:
> > On Wed, Oct 10, 2012 at 10:33:31AM -0500, Rob Herring wrote:
> >> On 10/10/2012 10:16 AM, Stephen Warren wrote:
> >>> On 10/10/2012 01:24 AM, David Gibson wrote:
>  On Tue, Oct 09, 2012 at 10:43:50PM -0600, Warner Losh wrote:
> > On Oct 9, 2012, at 6:04 PM, Scott Wood wrote:
> > [snip]
> >>> That's probably a reasonable idea, although I imagined that people would
> >>> actually split out the portions of any header file they wanted to use
> >>> with dtc, so that any headers included by *.dts would only include
> >>> #defines. Those headers could be used by both dtc and other .h files (or
> >>> .c files).
> >>
> >> Used by what other files? kernel files? We ultimately want to split out
> >> dts files from the kernel, so whatever we add needs to be self
> >> contained. I don't see this as a huge issue though because the whole
> >> point of the DT data is to move that information out of the kernel. If
> >> it is needed in both places, then something is wrong.
> > 
> > People get very hung up on this idea of having the DT move device
> > information out of the kernel, but that was never really the
> > motivation behind it.  Or at least, not the only or foremost
> > motivation.
> > 
> > The DT provides a consistent, flexible way of describing device
> > information.  That allows the core runtime of the kernel to operate the
> > same way, regardless of how the DT information was obtained.  The DT
> > could come from firmware, but it could also come from an intermediate
> > bootloader or from early kernel code.  All are perfectly acceptable
> > options depending on the constraints of the platform.
> > 
> > The idea of firmware supplying the DT is much touted, but while it's a
> > theoretically nice idea, I think it's frequently a bad idea for
> > practical reasons.  Those being, in essence that a) firmware usually
> > sucks, b) it's usually harder (or at least no easier) to replace
> > firmware with a fixed version than the kernel/bootwrapper and c)
> > firmware usually *really* sucks.
> 
> Gee, it sounds like you want firmware to suck.  Beating on the "firmware
> sucks" drum is sort of a self-fulfilling prophecy, discouraging talented
> programmers from doing firmware.  Who would want to work on something
> that "everyone knows sucks"?

At this point it's already fulfilled.  Unfortunately, it really
doesn't matter how many more nice firmwares appear, once you have to
support the shitty ones - which we already do - the damage is done.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson


RE: [RFC PATCH 06/06] input/rmi4: F11 - 2D touch interface

2012-10-10 Thread Christopher Heiny
Linus Walleij wrote:
> On Sat, Oct 6, 2012 at 6:10 AM, Christopher Heiny 
> wrote:
> 
> So looking closer at this one since we will use it. Maybe it's in such a
> good shape now that I should be able to actually test it with the hardware?

Well, it's been possible to test at least since the patch submitted last 
December.  The code might have been ugly, but it was working.

> 
> (...)
> 
> > diff --git a/drivers/input/rmi4/rmi_f11.c b/drivers/input/rmi4/rmi_f11.c

[snip previously mentioned stuff]

> 
> > +#define F11_CTRL_SENSOR_MAX_X_POS_OFFSET   6
> > +#define F11_CTRL_SENSOR_MAX_Y_POS_OFFSET   8
> > +
> > +#define F11_CEIL(x, y) (((x) + ((y)-1)) / (y))
> 
> Use existing kernel macros in 
> 
> In this case:
> #define F11_CEIL(x, y) DIV_ROUND_UP(x, y)
> 
> Or just use DIV_ROUND_UP() directly in the code, your choice.

Using it directly is simpler.
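
As a quick illustration of why the local macro is redundant, the sketch below
compares it against a userspace copy of DIV_ROUND_UP(). The macro is redefined
locally only so the example builds outside the kernel, and
bytes_for_finger_bits() is a hypothetical helper, not a function from
rmi_f11.c.

```c
#include <assert.h>

/* Local copy for a userspace build; same arithmetic as the kernel macro. */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* The driver-local macro from the quoted patch. */
#define F11_CEIL(x, y)		(((x) + ((y) - 1)) / (y))

/* Hypothetical helper: bytes needed to hold one bit per finger. */
static unsigned int bytes_for_finger_bits(unsigned int nr_fingers)
{
	return DIV_ROUND_UP(nr_fingers, 8u);
}

/* Check that both macros agree for a range of inputs. */
static void check_macros_agree(void)
{
	unsigned int x;

	for (x = 0; x < 64; x++)
		assert(F11_CEIL(x, 8u) == DIV_ROUND_UP(x, 8u));
}
```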

> 
> > +#define MAX_NAME_LENGTH 256
> 
> Really? Are you sure there is not a null terminator or length byte
> included so it's actually 255?

We were assuming 255 + terminator.  Perhaps a better name such as 
NAME_BUFFER_SIZE would be clearer?


> 
> (...)
> 
> > +static int sensor_debug_open(struct inode *inodep, struct file *filp)
> > +{
> > +   struct sensor_debugfs_data *data;
> > +   struct f11_2d_sensor *sensor = inodep->i_private;
> > +   struct rmi_function_container *fc = sensor->fc;
> > +
> > +   data = devm_kzalloc(&fc->dev, sizeof(struct sensor_debugfs_data),
> > +   GFP_KERNEL);
> 
> Again I may have led you astray. Check if this leaks memory; in that
> case use kzalloc()/kfree(). Sorry.

No problem - will correct it.

> 
> (...)
> 
> > +static int f11_debug_open(struct inode *inodep, struct file *filp)
> > +{
> > +   struct f11_debugfs_data *data;
> > +   struct rmi_function_container *fc = inodep->i_private;
> > +
> > +   data = devm_kzalloc(&fc->dev, sizeof(struct f11_debugfs_data),
> > +   GFP_KERNEL);
> 
> Ditto.
> 
> (...)
> 
> > +static void rmi_f11_abs_pos_report(struct f11_data *f11,
> > +  struct f11_2d_sensor *sensor,
> > +  u8 finger_state, u8 n_finger)
> 
> (...)
> 
> > +   if (axis_align->flip_y)
> > +   y = max(sensor->max_y - y, 0);
> > +
> > +   /*
> > +   ** here checking if X offset or y offset are specified is
> > +   **  redundant.  We just add the offsets or, clip the values
> > +   **
> > +   ** note: offsets need to be done before clipping occurs,
> > +   ** or we could get funny values that are outside
> > +   ** clipping boundaries.
> > +   */
> 
> This is a weird commenting style, what's wrong with a single star?
> (No big deal but it stands out...)

It's probably just someone's editor settings.  We can tidy it up.

> 
> (...)
> 
> > +static int f11_allocate_control_regs(struct rmi_device *rmi_dev,
> > +   struct f11_2d_device_query *device_query,
> > +   struct f11_2d_sensor_query *sensor_query,
> > +   struct f11_2d_ctrl *ctrl,
> > +   u16 ctrl_base_addr) {
> > +
> > +   struct rmi_driver_data *driver_data =
> > +   dev_get_drvdata(&rmi_dev->dev);
> > +   struct rmi_function_container *fc = driver_data->f01_container;
> > +
> > +   ctrl->ctrl0_9 = devm_kzalloc(&fc->dev, sizeof(union f11_2d_ctrl0_9),
> > +   GFP_KERNEL);
> 
> If this is called from .probe() only, this is correct.
> 
> So the rule is: use devm_* for anything that is allocated at .probe()
> and released on .remove(). Any other dynamic buffers etc need to
> use common kzalloc()/kfree().

OK - we'll review to make sure that rule is followed, and change as required.

[snip a bunch of the same]

> 
> > +
> > +   return f11_read_control_regs(rmi_dev, ctrl, ctrl_base_addr);
> 
> Hey why are you ending with a call to that function?
> The function name gets misleading.
> 
> Instead call both functions in succession at the call site on
> .probe().

OK.


> 
> (...)
> 
> > +static int f11_device_init(struct rmi_function_container *fc)
> > +{
> > +   int rc;
> > +
> > +   rc = rmi_f11_initialize(fc);
> > +   if (rc < 0)
> > +   goto err_free_data;
> > +
> > +   rc = rmi_f11_register_devices(fc);
> > +   if (rc < 0)
> > +   goto err_free_data;
> > +
> > +   rc = rmi_f11_create_sysfs(fc);
> > +   if (rc < 0)
> > +   goto err_free_data;
> > +
> > +   return 0;
> > +
> > +err_free_data:
> > +   rmi_f11_free_memory(fc);
> > +
> > +   return rc;
> > +}
> > +
> > +static void rmi_f11_free_memory(struct rmi_function_container *fc)
> > +{
> > +   struct f11_data *f11 = fc->data;
> > +   int i;
> > +
> > +   if (f11) {
> > +   for (i = 0; i < f11->dev_query.nbr_of_sensors + 1; i++)
> > +  

Re: [PATCH 05/16] vfs: bogus warnings in fs/namei.c

2012-10-10 Thread Al Viro
On Tue, Oct 09, 2012 at 01:07:19PM +, Arnd Bergmann wrote:

> Update: I could actually reproduce the problem now, but it only happens when
> building with 'gcc -Os' (i.e. CONFIG_CC_OPTIMIZE_FOR_SIZE). It does happen
> with both gcc-4.6 and with gcc-4.8, and on both x86-64 and ARM. An alternative
> patch that would also make it go away is the variant below, but I think that's
> even worse than the first version I suggested because it makes the binary
> output slightly worse by adding an unnecessary initialization when building
> with 'make -s'.

I can live with that, provided that you give it sane commit message and
your s-o-b.


RE: [RFC PATCH 05/06] input/rmi4: F01 - device control

2012-10-10 Thread Christopher Heiny
Linus Walleij wrote:
> On Sat, Oct 6, 2012 at 6:10 AM, Christopher Heiny  
> wrote:
> > RMI Function 01 implements basic device control and power management
> > behaviors for the RMI4 sensor.  Since the last patch, we've decoupled
> > rmi_f01.c implementation from rmi_driver.c, so rmi_f01.c acts as a
> > standard driver module to handle F01 devices on the RMI bus.
> > 
> > Like other modules, a number of attributes have been moved from sysfs to
> > debugfs, depending on their expected use.
> > 
> > 
> > rmi_f01.h exports definitions that we expect to be used by other
> > functionality in the future (such as firmware reflash).
> > 
> > 
> > Signed-off-by: Christopher Heiny 
> > 
> > Cc: Dmitry Torokhov 
> > Cc: Linus Walleij 
> > Cc: Naveen Kumar Gaddipati 
> > Cc: Joeri de Gram 
> > 
> > ---
> 
> There is liberal whitespacing above. (No big deal, but anyway.)
> 
> (...)
> 
> > +/**
> > + * @reset - set this bit to force a firmware reset of the sensor.
> > + */
> > +union f01_device_commands {
> > +   struct {
> > +   bool reset:1;
> > +   u8 reserved:7;
> > +   } __attribute__((__packed__));
> > +   u8 reg;
> > +};
> 
> I'm still scared by these unions. I see what you're doing but my
> preferred style of driver writing is to use a simple u8 if you just treat
> it the right way with some |= and &= ...
> 
> #include 
> 
> #define F01_RESET BIT(0)
> 
> u8 my_command = F01_RESET;
> 
> send(&my_command);
> 
> I will not insist on this because it's a bit about programming style.
> For memory-mapped devices we usually do it my way, but this
> is more like some protocol and I know protocols like to do things
> with structs and stuff so no big deal.

That's a good summary of what we're trying to do.  Our original version did 
more of the traditional mask+shift approach to manipulating the fields in the 
various registers, but in the case of complicated functions such as F11 this 
rapidly became unreadable.  We found the unions worked a lot better - the code 
was more readable and less error prone.  For consistency we decided to apply 
them throughout the code.
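
To make the two styles under discussion concrete, here is a small side-by-side
sketch. It assumes a GCC-compatible compiler (anonymous struct inside a union,
uint8_t bitfields and the packed attribute are extensions or
implementation-defined), BIT() is redefined locally for a userspace build, and
the two helper functions are hypothetical. Note that the position of the
bitfield inside the byte is chosen by the compiler's ABI, which is the usual
argument for explicit masks.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n)		(1u << (n))
#define F01_RESET	BIT(0)

/* Union/bitfield style, mirroring the quoted patch. */
union f01_device_commands {
	struct {
		uint8_t reset:1;
		uint8_t reserved:7;
	} __attribute__((__packed__));
	uint8_t reg;
};

/* Mask style: build the command byte with explicit bit operations. */
static uint8_t f01_cmd_mask_reset(void)
{
	uint8_t cmd = 0;

	cmd |= F01_RESET;
	return cmd;
}

/* Union style: set the named field, then read back the raw register. */
static uint8_t f01_cmd_union_reset(void)
{
	union f01_device_commands cmd = { .reg = 0 };

	cmd.reset = 1;
	return cmd.reg;
}
```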

> 
> > +#ifdef CONFIG_RMI4_DEBUG
> > +struct f01_debugfs_data {
> > +   bool done;
> > +   struct rmi_function_container *fc;
> > +};
> > +
> > +static int f01_debug_open(struct inode *inodep, struct file *filp)
> > +{
> > +   struct f01_debugfs_data *data;
> > +   struct rmi_function_container *fc = inodep->i_private;
> > +
> > +   data = devm_kzalloc(&fc->dev, sizeof(struct f01_debugfs_data),
> > +   GFP_KERNEL);
> 
> Wait, you probably did this because I requested it, but I was maybe
> wrong?
> 
> Will this not re-allocate a chunk every time you look at a debugfs
> file? So it leaks memory?
> 
> In that case common kzalloc() and kfree() is the way to go, as it
> is for dynamic buffers. Sorry for screwing things up for you.

No problem - we'll fix it.  Or unfix it.  Or something like that. :-)


> 
> > +   for (i = 0; i < f01->irq_count && *local_buf != 0;
> > +i++, local_buf += 2) {
> > +   int irq_shift;
> > +   int interrupt_enable;
> > +   int result;
> > +
> > +   irq_reg = i / 8;
> > +   irq_shift = i % 8;
> 
> Please stop doing these arithmetics-turned-maths things.
> 
> irq_reg = i >> 8;
> irq_shift = i & 0xFF;

See note on this in a previous email.
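
For reference, with eight interrupt bits per register the shift-and-mask forms
equivalent to i / 8 and i % 8 are i >> 3 and i & 7 for non-negative i (a shift
by 8 and a mask of 0xFF would divide by 256 instead). A minimal sketch, with
hypothetical helper names and a plain byte array standing in for the
interrupt-enable registers:

```c
#include <assert.h>

/* Register index holding interrupt bit 'irq' (same as irq / 8). */
static unsigned int irq_reg_index(unsigned int irq)
{
	return irq >> 3;
}

/* Bit position inside that register (same as irq % 8). */
static unsigned int irq_bit_shift(unsigned int irq)
{
	return irq & 7u;
}

/* Test one interrupt-enable bit in an array of 8-bit registers. */
static int irq_is_enabled(const unsigned char *regs, unsigned int irq)
{
	return (regs[irq_reg_index(irq)] >> irq_bit_shift(irq)) & 1;
}

/* Check the equivalence with division and modulo for a range of values. */
static void check_irq_math(void)
{
	unsigned int i;

	for (i = 0; i < 64; i++) {
		assert(irq_reg_index(i) == i / 8);
		assert(irq_bit_shift(i) == i % 8);
	}
}
```

An optimizing compiler normally emits the same code for both spellings when
the operand is unsigned, so the choice is mostly about readability.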

> 
> (...)
> 
> > +static ssize_t rmi_fn_01_interrupt_enable_show(struct device *dev,
> > +   struct device_attribute *attr, char *buf)
> > +{
> > +   struct rmi_function_container *fc;
> > +   struct f01_data *data;
> > +   int i, len, total_len = 0;
> > +   char *current_buf = buf;
> > +
> > +   fc = to_rmi_function_container(dev);
> > +   data = fc->data;
> > +   /* loop through each irq value and copy its
> > +* string representation into buf */
> > +   for (i = 0; i < data->irq_count; i++) {
> > +   int irq_reg;
> > +   int irq_shift;
> > +   int interrupt_enable;
> > +
> > +   irq_reg = i / 8;
> > +   irq_shift = i % 8;
> 
> Ditto.
> 
> (...)
> 
> > +static int f01_probe(struct device *dev);
> 
> Do you really need to forward-declare this?

It's a leftover from the process of eliminating the roll-your-own bus
implementation and moving the other code around.  (The same applies to
similar code in rmi_f11.c.)

> 
> (...)
> 
> > +static struct rmi_function_handler function_handler = {
> > +   .driver = {
> > +   .owner = THIS_MODULE,
> > +   .name = "rmi_f01",
> > +   .bus = &rmi_bus_type,
> > +   .probe = f01_probe,
> > +   .remove = f01_remove_device,
> > +   },
> > +   .func = 0x01,
> > +   .config = rmi_f01_config,
> > +   .attention = rmi_f01_attention,
> > +
> > +#ifdef CONFIG_PM
> > +   .suspend = rmi_f01_suspend,
> > +   .resume = rmi_f01_resume,
> 

[PATCH 1/9] perf python: add ui stubs file

2012-10-10 Thread David Ahern
stderr based implementations of ui_ functions for the python
library. Needed for patch 3 - consolidating open counters method.

Signed-off-by: David Ahern 
Cc: Arnaldo Carvalho de Melo 
Cc: Ingo Molnar 
Cc: Frederic Weisbecker 
Cc: Peter Zijlstra 
---
 tools/perf/util/python-ext-sources |1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/python-ext-sources b/tools/perf/util/python-ext-sources
index 2133628..8a45370 100644
--- a/tools/perf/util/python-ext-sources
+++ b/tools/perf/util/python-ext-sources
@@ -19,3 +19,4 @@ util/debugfs.c
 util/rblist.c
 util/strlist.c
 ../../lib/rbtree.c
+util/ui_stubs.c
-- 
1.7.10.1



[PATCH 3/9] perf evlist: introduce open counters method

2012-10-10 Thread David Ahern
Superset of the open counters code in perf-top and perf-record -
combining retry handling and error handling. Should be functionally
equivalent.

Signed-off-by: David Ahern 
Cc: Arnaldo Carvalho de Melo 
Cc: Ingo Molnar 
Cc: Frederic Weisbecker 
Cc: Peter Zijlstra 
---
 tools/perf/util/evlist.c |  127 +-
 tools/perf/util/evlist.h |3 ++
 2 files changed, 129 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index a41dc4a..bce2f58 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -15,7 +15,7 @@
 #include "evlist.h"
 #include "evsel.h"
 #include 
-
+#include "debug.h"
 #include "parse-events.h"
 
 #include 
@@ -838,3 +838,128 @@ size_t perf_evlist__fprintf(struct perf_evlist *evlist, FILE *fp)
 
return printed + fprintf(fp, "\n");;
 }
+
+int perf_evlist__open_counters(struct perf_evlist *evlist,
+  struct perf_record_opts *opts)
+{
+   struct perf_evsel *pos;
+   int rc = 0;
+
+   list_for_each_entry(pos, &evlist->entries, node) {
+   struct perf_event_attr *attr = &pos->attr;
+
+   /*
+* Carried over from perf-record:
+* Check if parse_single_tracepoint_event has already asked for
+* PERF_SAMPLE_TIME.
+*
+* XXX this is kludgy but short term fix for problems introduced by
+* eac23d1c that broke 'perf script' by having different sample_types
+* when using multiple tracepoint events when we use a perf binary
+* that tries to use sample_id_all on an older kernel.
+*
+* We need to move counter creation to perf_session, support
+* different sample_types, etc.
+*/
+   bool time_needed = attr->sample_type & PERF_SAMPLE_TIME;
+
+fallback_missing_features:
+   if (opts->exclude_guest_missing)
+   attr->exclude_guest = attr->exclude_host = 0;
+retry_sample_id:
+   attr->sample_id_all = opts->sample_id_all_missing ? 0 : 1;
+try_again:
+   if (perf_evsel__open(pos, evlist->cpus, evlist->threads) < 0) {
+   int err = errno;
+
+   if (err == EPERM || err == EACCES) {
+   ui__error_paranoid();
+   rc = -err;
+   goto out;
+   } else if (err ==  ENODEV && opts->target.cpu_list) {
+   pr_err("No such device - did you specify"
+  " an out-of-range profile CPU?\n");
+   rc = -err;
+   goto out;
+   } else if (err == EINVAL) {
+   if (!opts->exclude_guest_missing &&
+   (attr->exclude_guest || attr->exclude_host)) {
+   pr_debug("Old kernel, cannot exclude "
+"guest or host samples.\n");
+   opts->exclude_guest_missing = true;
+   goto fallback_missing_features;
+   } else if (!opts->sample_id_all_missing) {
+   /*
+* Old kernel, no attr->sample_id_type_all field
+*/
+   opts->sample_id_all_missing = true;
+   if (!opts->sample_time &&
+   !opts->raw_samples &&
+   !time_needed)
+   attr->sample_type &= ~PERF_SAMPLE_TIME;
+   goto retry_sample_id;
+   }
+   }
+
+   /*
+* If it's cycles then fall back to hrtimer
+* based cpu-clock-tick sw counter, which
+* is always available even if no PMU support:
+*
+* PPC returns ENXIO until 2.6.37 (behavior changed
+* with commit b0a873e).
+*/
+   if ((err == ENOENT || err == ENXIO) &&
+   (attr->type == PERF_TYPE_HARDWARE) &&
+   (attr->config == PERF_COUNT_HW_CPU_CYCLES)) {
+
+   if (verbose)
+   ui__warning("Cycles event not supported,\n"
+   "trying to fall back to cpu-clock-ticks\n");
+
+   attr->type = PERF_TYPE_SOFTWARE;
+   

[PATCH 6/9] perf stat: move user options to perf_record_opts

2012-10-10 Thread David Ahern
This is required for perf-stat to use perf_evlist__open_counters.
Also move opts to a stack variable.

Signed-off-by: David Ahern 
Cc: Arnaldo Carvalho de Melo 
Cc: Ingo Molnar 
Cc: Frederic Weisbecker 
Cc: Peter Zijlstra 
---
 tools/perf/builtin-stat.c |  161 +++--
 1 file changed, 97 insertions(+), 64 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 93b9011..9727d217 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -66,12 +66,8 @@
 
 static struct perf_evlist  *evsel_list;
 
-static struct perf_target  target = {
-   .uid= UINT_MAX,
-};
 
 static int run_count   =  1;
-static bool no_inherit  = false;
 static bool scale   =  true;
 static bool no_aggr = false;
 static pid_t   child_pid   = -1;
@@ -81,7 +77,6 @@ static bool   big_num =  true;
 static int big_num_opt =  -1;
 static const char  *csv_sep= NULL;
 static bool csv_output  = false;
-static bool group   = false;
 static FILE *output = NULL;
 
 static volatile int done = 0;
@@ -102,14 +97,16 @@ static void perf_evsel__free_stat_priv(struct perf_evsel *evsel)
evsel->priv = NULL;
 }
 
-static inline struct cpu_map *perf_evsel__cpus(struct perf_evsel *evsel)
+static inline struct cpu_map *perf_evsel__cpus(struct perf_evsel *evsel,
+  struct perf_target *target)
 {
-   return (evsel->cpus && !target.cpu_list) ? evsel->cpus : evsel_list->cpus;
+   return (evsel->cpus && !target->cpu_list) ? evsel->cpus : evsel_list->cpus;
 }
 
-static inline int perf_evsel__nr_cpus(struct perf_evsel *evsel)
+static inline int perf_evsel__nr_cpus(struct perf_evsel *evsel,
+ struct perf_target *target)
 {
-   return perf_evsel__cpus(evsel)->nr;
+   return perf_evsel__cpus(evsel, target)->nr;
 }
 
 static struct stats runtime_nsecs_stats[MAX_NR_CPUS];
@@ -126,8 +123,10 @@ static struct stats runtime_dtlb_cache_stats[MAX_NR_CPUS];
 static struct stats walltime_nsecs_stats;
 
 static int create_perf_stat_counter(struct perf_evsel *evsel,
-   struct perf_evsel *first)
+   struct perf_evsel *first,
+   struct perf_record_opts *opts)
 {
+   struct perf_target *target = &opts->target;
struct perf_event_attr *attr = &evsel->attr;
bool exclude_guest_missing = false;
int ret;
@@ -136,20 +135,22 @@ static int create_perf_stat_counter(struct perf_evsel *evsel,
attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
PERF_FORMAT_TOTAL_TIME_RUNNING;
 
-   attr->inherit = !no_inherit;
+   attr->inherit = !opts->no_inherit;
 
 retry:
if (exclude_guest_missing)
evsel->attr.exclude_guest = evsel->attr.exclude_host = 0;
 
-   if (perf_target__has_cpu(&target)) {
-   ret = perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));
+   if (perf_target__has_cpu(target)) {
+   ret = perf_evsel__open_per_cpu(evsel,
+  perf_evsel__cpus(evsel, target));
if (ret)
goto check_ret;
return 0;
}
 
-   if (!perf_target__has_task(&target) && (!group || evsel == first)) {
+   if (!perf_target__has_task(target) &&
+   (!opts->group || evsel == first)) {
attr->disabled = 1;
attr->enable_on_exec = 1;
}
@@ -218,13 +219,15 @@ static void update_shadow_stats(struct perf_evsel *counter, u64 *count)
  * Read out the results of a single counter:
  * aggregate counts across CPUs in system-wide mode
  */
-static int read_counter_aggr(struct perf_evsel *counter)
+static int read_counter_aggr(struct perf_evsel *counter,
+struct perf_record_opts *opts)
 {
+   struct perf_target *target = &opts->target;
struct perf_stat *ps = counter->priv;
u64 *count = counter->counts->aggr.values;
int i;
 
-   if (__perf_evsel__read(counter, perf_evsel__nr_cpus(counter),
+   if (__perf_evsel__read(counter, perf_evsel__nr_cpus(counter, target),
   evsel_list->threads->nr, scale) < 0)
return -1;
 
@@ -248,12 +251,14 @@ static int read_counter_aggr(struct perf_evsel *counter)
  * Read out the results of a single counter:
  * do not aggregate counts across CPUs in system-wide mode
  */
-static int 

[PATCH 7/9] perf evlist: add stat unique code to open_counters method

2012-10-10 Thread David Ahern
Mainly the addition is an argument to keep going for some open
failures.

Signed-off-by: David Ahern 
Cc: Arnaldo Carvalho de Melo 
Cc: Ingo Molnar 
Cc: Frederic Weisbecker 
Cc: Peter Zijlstra 
---
 tools/perf/builtin-record.c |2 +-
 tools/perf/builtin-top.c|2 +-
 tools/perf/util/evlist.c|   16 ++--
 tools/perf/util/evlist.h|3 ++-
 4 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b9dcc01..663ccc8 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -234,7 +234,7 @@ static int perf_record__open(struct perf_record *rec)
if (opts->group)
perf_evlist__set_leader(evlist);
 
-   rc = perf_evlist__open_counters(evlist, opts);
+   rc = perf_evlist__open_counters(evlist, opts, false);
if (rc != 0)
goto out;
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 2ffc32e..2c3b3c7 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -921,7 +921,7 @@ static void perf_top__start_counters(struct perf_top *top)
attr->inherit = !top->opts.no_inherit;
}
 
-   if (perf_evlist__open_counters(evlist, &top->opts) != 0)
+   if (perf_evlist__open_counters(evlist, &top->opts, false) != 0)
goto out_err;
 
if (perf_evlist__mmap(evlist, top->opts.mmap_pages, false) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index bce2f58..fa0daac 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -840,7 +840,8 @@ size_t perf_evlist__fprintf(struct perf_evlist *evlist, FILE *fp)
 }
 
 int perf_evlist__open_counters(struct perf_evlist *evlist,
-  struct perf_record_opts *opts)
+  struct perf_record_opts *opts,
+  bool continue_on_fail)
 {
struct perf_evsel *pos;
int rc = 0;
@@ -872,6 +873,16 @@ try_again:
if (perf_evsel__open(pos, evlist->cpus, evlist->threads) < 0) {
int err = errno;
 
+   if (continue_on_fail &&
+   (err == EINVAL || err == ENOSYS || err == ENXIO ||
+err == ENOENT || err == EOPNOTSUPP)) {
+   if (verbose)
+   ui__warning("%s event is not supported by the kernel.\n",
+   perf_evsel__name(pos));
+   pos->supported = false;
+   continue;
+   }
+
if (err == EPERM || err == EACCES) {
ui__error_paranoid();
rc = -err;
@@ -958,7 +969,8 @@ try_again:
pr_err("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
rc = -err;
goto out;
-   }
+   } else
+   pos->supported = true;
}
 out:
return rc;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 270e546..0747b6f 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -137,5 +137,6 @@ static inline struct perf_evsel *perf_evlist__last(struct perf_evlist *evlist)
 size_t perf_evlist__fprintf(struct perf_evlist *evlist, FILE *fp);
 
 int perf_evlist__open_counters(struct perf_evlist *evlist,
-  struct perf_record_opts *opts);
+  struct perf_record_opts *opts,
+  bool continue_on_fail);
 #endif /* __PERF_EVLIST_H */
-- 
1.7.10.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
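
For illustration only, the keep-going decision added in the util/evlist.c hunk above can be looked at in isolation. The sketch below is a user-space stand-in, not code from the patch: the helper name open_error_is_skippable is hypothetical, and it only mirrors the errno list that the hunk tests when continue_on_fail is set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not in the patch): returns true when a failed open
 * only means "this event is not supported here", so the caller may mark the
 * event unsupported and continue with the next one.
 */
static bool open_error_is_skippable(int err, bool continue_on_fail)
{
	assert(err > 0);	/* errno values are positive */

	if (!continue_on_fail)
		return false;	/* record/top behaviour: every failure is fatal */

	switch (err) {
	case EINVAL:
	case ENOSYS:
	case ENXIO:
	case ENOENT:
	case EOPNOTSUPP:
		return true;	/* same list as the continue_on_fail branch */
	default:
		return false;	/* e.g. EPERM/EACCES still abort the loop */
	}
}
```

With this shape, perf-record and perf-top (which pass false) keep their existing behaviour, while a caller that passes true would skip unsupported events and still stop on permission errors.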


[PATCH 8/9] perf stat: move to perf_evlist__open_counters

2012-10-10 Thread David Ahern
Removes a lot of duplicated code by moving to the common
open method.

Signed-off-by: David Ahern 
Cc: Arnaldo Carvalho de Melo 
Cc: Ingo Molnar 
Cc: Frederic Weisbecker 
Cc: Peter Zijlstra 
---
 tools/perf/builtin-stat.c |  103 ++---
 1 file changed, 22 insertions(+), 81 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9727d217..affbada 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -122,55 +122,6 @@ static struct stats runtime_itlb_cache_stats[MAX_NR_CPUS];
 static struct stats runtime_dtlb_cache_stats[MAX_NR_CPUS];
 static struct stats walltime_nsecs_stats;
 
-static int create_perf_stat_counter(struct perf_evsel *evsel,
-   struct perf_evsel *first,
-   struct perf_record_opts *opts)
-{
-   struct perf_target *target = &opts->target;
-   struct perf_event_attr *attr = &evsel->attr;
-   bool exclude_guest_missing = false;
-   int ret;
-
-   if (scale)
-   attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
-   PERF_FORMAT_TOTAL_TIME_RUNNING;
-
-   attr->inherit = !opts->no_inherit;
-
-retry:
-   if (exclude_guest_missing)
-   evsel->attr.exclude_guest = evsel->attr.exclude_host = 0;
-
-   if (perf_target__has_cpu(target)) {
-   ret = perf_evsel__open_per_cpu(evsel,
-  perf_evsel__cpus(evsel, target));
-   if (ret)
-   goto check_ret;
-   return 0;
-   }
-
-   if (!perf_target__has_task(target) &&
-   (!opts->group || evsel == first)) {
-   attr->disabled = 1;
-   attr->enable_on_exec = 1;
-   }
-
-   ret = perf_evsel__open_per_thread(evsel, evsel_list->threads);
-   if (!ret)
-   return 0;
-   /* fall through */
-check_ret:
-   if (ret && errno == EINVAL) {
-   if (!exclude_guest_missing &&
-   (evsel->attr.exclude_guest || evsel->attr.exclude_host)) {
-   pr_debug("Old kernel, cannot exclude "
-"guest or host samples.\n");
-   exclude_guest_missing = true;
-   goto retry;
-   }
-   }
-   return ret;
-}
 
 /*
  * Does the counter have nsecs as a unit?
@@ -277,6 +228,7 @@ static int run_perf_stat(int argc __maybe_unused,
unsigned long long t0, t1;
struct perf_evsel *counter, *first;
struct cpu_map *cmap;
+   struct perf_target *target = &opts->target;
int status = 0;
int child_ready_pipe[2], go_pipe[2];
const bool forks = (argc > 0);
@@ -320,7 +272,7 @@ static int run_perf_stat(int argc __maybe_unused,
exit(-1);
}
 
-   if (perf_target__none(&opts->target))
+   if (perf_target__none(target))
evsel_list->threads->map[0] = child_pid;
 
/*
@@ -339,38 +291,27 @@ static int run_perf_stat(int argc __maybe_unused,
first = perf_evlist__first(evsel_list);
 
list_for_each_entry(counter, &evsel_list->entries, node) {
-   if (create_perf_stat_counter(counter, first, opts) < 0) {
-   /*
-* PPC returns ENXIO for HW counters until 2.6.37
-* (behavior changed with commit b0a873e).
-*/
-   if (errno == EINVAL || errno == ENOSYS ||
-   errno == ENOENT || errno == EOPNOTSUPP ||
-   errno == ENXIO) {
-   if (verbose)
-   ui__warning("%s event is not supported by the kernel.\n",
-   perf_evsel__name(counter));
-   counter->supported = false;
-   continue;
-   }
-
-   if (errno == EPERM || errno == EACCES) {
-   error("You may not have permission to collect %sstats.\n"
- "\t Consider tweaking"
- " /proc/sys/kernel/perf_event_paranoid or running as root.",
- opts->target.system_wide ? "system-wide " : "");
-   } else {
-   error("open_counter returned with %d (%s). "
- "/bin/dmesg may provide additional information.\n",
-  errno, strerror(errno));
-   }
-   if (child_pid != -1)
-   kill(child_pid, SIGTERM);
-
-   pr_err("Not all events could be opened.\n");
-   return -1;
+   struct 

[PATCH 9/9] perf evsel: remove perf_evsel__open_per_cpu

2012-10-10 Thread David Ahern
No longer needed with perf-stat converted to perf_evlist__open_counters.

Signed-off-by: David Ahern 
Cc: Arnaldo Carvalho de Melo 
Cc: Ingo Molnar 
Cc: Frederic Weisbecker 
Cc: Peter Zijlstra 
---
 tools/perf/util/evsel.c |6 --
 tools/perf/util/evsel.h |2 --
 2 files changed, 8 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ffdd94e..ab3d1c8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -774,12 +774,6 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
return __perf_evsel__open(evsel, cpus, threads);
 }
 
-int perf_evsel__open_per_cpu(struct perf_evsel *evsel,
-struct cpu_map *cpus)
-{
-   return __perf_evsel__open(evsel, cpus, &empty_thread_map.map);
-}
-
 int perf_evsel__open_per_thread(struct perf_evsel *evsel,
struct thread_map *threads)
 {
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 3ead0d5..bf32de4 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -121,8 +121,6 @@ void perf_evsel__close_fd(struct perf_evsel *evsel, int ncpus, int nthreads);
 int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
   const char *filter);
 
-int perf_evsel__open_per_cpu(struct perf_evsel *evsel,
-struct cpu_map *cpus);
 int perf_evsel__open_per_thread(struct perf_evsel *evsel,
struct thread_map *threads);
 int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
-- 
1.7.10.1



[PATCH 4/9] perf top: use the new perf_evlist__open_counters method

2012-10-10 Thread David Ahern
Remove open counters code with all the retry and error handling in
favor of the new perf_evlist__open_counters method which is based
on the top code.

Signed-off-by: David Ahern 
Cc: Arnaldo Carvalho de Melo 
Cc: Ingo Molnar 
Cc: Frederic Weisbecker 
Cc: Peter Zijlstra 
---
 tools/perf/builtin-top.c |   70 ++
 1 file changed, 3 insertions(+), 67 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 33c3825..2ffc32e 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -919,75 +919,11 @@ static void perf_top__start_counters(struct perf_top *top)
attr->mmap = 1;
attr->comm = 1;
attr->inherit = !top->opts.no_inherit;
-fallback_missing_features:
-   if (top->opts.exclude_guest_missing)
-   attr->exclude_guest = attr->exclude_host = 0;
-retry_sample_id:
-   attr->sample_id_all = top->opts.sample_id_all_missing ? 0 : 1;
-try_again:
-   if (perf_evsel__open(counter, top->evlist->cpus,
-top->evlist->threads) < 0) {
-   int err = errno;
-
-   if (err == EPERM || err == EACCES) {
-   ui__error_paranoid();
-   goto out_err;
-   } else if (err == EINVAL) {
-   if (!top->opts.exclude_guest_missing &&
-   (attr->exclude_guest || attr->exclude_host)) {
-   pr_debug("Old kernel, cannot exclude "
-"guest or host samples.\n");
-   top->opts.exclude_guest_missing = true;
-   goto fallback_missing_features;
-   } else if (!top->opts.sample_id_all_missing) {
-   /*
-* Old kernel, no attr->sample_id_type_all field
-*/
-   top->opts.sample_id_all_missing = true;
-   goto retry_sample_id;
-   }
-   }
-   /*
-* If it's cycles then fall back to hrtimer
-* based cpu-clock-tick sw counter, which
-* is always available even if no PMU support:
-*/
-   if ((err == ENOENT || err == ENXIO) &&
-   (attr->type == PERF_TYPE_HARDWARE) &&
-   (attr->config == PERF_COUNT_HW_CPU_CYCLES)) {
-
-   if (verbose)
-   ui__warning("Cycles event not supported,\n"
-   "trying to fall back to cpu-clock-ticks\n");
-
-   attr->type = PERF_TYPE_SOFTWARE;
-   attr->config = PERF_COUNT_SW_CPU_CLOCK;
-   if (counter->name) {
-   free(counter->name);
-   counter->name = NULL;
-   }
-   goto try_again;
-   }
-
-   if (err == ENOENT) {
-   ui__error("The %s event is not supported.\n",
- perf_evsel__name(counter));
-   goto out_err;
-   } else if (err == EMFILE) {
-   ui__error("Too many events are opened.\n"
-   "Try again after reducing the number of events\n");
-   goto out_err;
-   }
-
-   ui__error("The sys_perf_event_open() syscall "
-   "returned with %d (%s).  /bin/dmesg "
-   "may provide additional information.\n"
-   "No CONFIG_PERF_EVENTS=y kernel support "
-   "configured?\n", err, strerror(err));
-   goto out_err;
-   }
}
 
+   if (perf_evlist__open_counters(evlist, &top->opts) != 0)
+   goto out_err;
+
if (perf_evlist__mmap(evlist, top->opts.mmap_pages, false) < 0) {
ui__error("Failed to mmap with %d (%s)\n",
errno, strerror(errno));
-- 
1.7.10.1



[PATCH 5/9] perf record: use the new perf_evlist__open_counters method

2012-10-10 Thread David Ahern
Remove open counters code with all the retry and error handling in
favor of the new perf_evlist__open_counters method which is based
on the existing code.

Signed-off-by: David Ahern 
Cc: Arnaldo Carvalho de Melo 
Cc: Ingo Molnar 
Cc: Frederic Weisbecker 
Cc: Peter Zijlstra 
---
 tools/perf/builtin-record.c |  109 +--
 1 file changed, 2 insertions(+), 107 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 73b5d7f..b9dcc01 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -224,7 +224,6 @@ static bool perf_evlist__equal(struct perf_evlist *evlist,
 
 static int perf_record__open(struct perf_record *rec)
 {
-   struct perf_evsel *pos;
struct perf_evlist *evlist = rec->evlist;
struct perf_session *session = rec->session;
struct perf_record_opts *opts = &rec->opts;
@@ -235,113 +234,9 @@ static int perf_record__open(struct perf_record *rec)
if (opts->group)
perf_evlist__set_leader(evlist);
 
-   list_for_each_entry(pos, &evlist->entries, node) {
-   struct perf_event_attr *attr = &pos->attr;
-   /*
-* Check if parse_single_tracepoint_event has already asked for
-* PERF_SAMPLE_TIME.
-*
-* XXX this is kludgy but short term fix for problems introduced by
-* eac23d1c that broke 'perf script' by having different sample_types
-* when using multiple tracepoint events when we use a perf binary
-* that tries to use sample_id_all on an older kernel.
-*
-* We need to move counter creation to perf_session, support
-* different sample_types, etc.
-*/
-   bool time_needed = attr->sample_type & PERF_SAMPLE_TIME;
-
-fallback_missing_features:
-   if (opts->exclude_guest_missing)
-   attr->exclude_guest = attr->exclude_host = 0;
-retry_sample_id:
-   attr->sample_id_all = opts->sample_id_all_missing ? 0 : 1;
-try_again:
-   if (perf_evsel__open(pos, evlist->cpus, evlist->threads) < 0) {
-   int err = errno;
-
-   if (err == EPERM || err == EACCES) {
-   ui__error_paranoid();
-   rc = -err;
-   goto out;
-   } else if (err ==  ENODEV && opts->target.cpu_list) {
-   pr_err("No such device - did you specify"
-  " an out-of-range profile CPU?\n");
-   rc = -err;
-   goto out;
-   } else if (err == EINVAL) {
-   if (!opts->exclude_guest_missing &&
-   (attr->exclude_guest || attr->exclude_host)) {
-   pr_debug("Old kernel, cannot exclude "
-"guest or host samples.\n");
-   opts->exclude_guest_missing = true;
-   goto fallback_missing_features;
-   } else if (!opts->sample_id_all_missing) {
-   /*
-* Old kernel, no attr->sample_id_type_all field
-*/
-   opts->sample_id_all_missing = true;
-   if (!opts->sample_time && !opts->raw_samples && !time_needed)
-   attr->sample_type &= ~PERF_SAMPLE_TIME;
-
-   goto retry_sample_id;
-   }
-   }
-
-   /*
-* If it's cycles then fall back to hrtimer
-* based cpu-clock-tick sw counter, which
-* is always available even if no PMU support.
-*
-* PPC returns ENXIO until 2.6.37 (behavior changed
-* with commit b0a873e).
-*/
-   if ((err == ENOENT || err == ENXIO)
-   && attr->type == PERF_TYPE_HARDWARE
-   && attr->config == PERF_COUNT_HW_CPU_CYCLES) {
-
-   if (verbose)
-   ui__warning("The cycles event is not supported, "
-   "trying to fall back to cpu-clock-ticks\n");
-   attr->type = PERF_TYPE_SOFTWARE;
-   attr->config = PERF_COUNT_SW_CPU_CLOCK;
-   if (pos->name) {
-  

[PATCH 2/9] perf top: make use of perf_record_opts

2012-10-10 Thread David Ahern
Changes top code to use the perf_record_opts struct. Stepping stone to
consolidating the open counters code.

Signed-off-by: David Ahern 
Cc: Arnaldo Carvalho de Melo 
Cc: Ingo Molnar 
Cc: Frederic Weisbecker 
Cc: Peter Zijlstra 
---
 tools/perf/builtin-top.c |   84 --
 tools/perf/util/top.c|   20 +--
 tools/perf/util/top.h|9 +
 3 files changed, 54 insertions(+), 59 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index fb9da71..33c3825 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -591,7 +591,7 @@ static void *display_thread_tui(void *arg)
 * via --uid.
 */
list_for_each_entry(pos, &top->evlist->entries, node)
-   pos->hists.uid_filter_str = top->target.uid_str;
+   pos->hists.uid_filter_str = top->opts.target.uid_str;
 
perf_evlist__tui_browse_hists(top->evlist, help,
  perf_top__sort_new_samples,
@@ -891,7 +891,7 @@ static void perf_top__start_counters(struct perf_top *top)
struct perf_evsel *counter;
struct perf_evlist *evlist = top->evlist;
 
-   if (top->group)
+   if (top->opts.group)
perf_evlist__set_leader(evlist);
 
list_for_each_entry(counter, &evlist->entries, node) {
@@ -899,10 +899,10 @@ static void perf_top__start_counters(struct perf_top *top)
 
attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID;
 
-   if (top->freq) {
+   if (top->opts.freq) {
attr->sample_type |= PERF_SAMPLE_PERIOD;
attr->freq= 1;
-   attr->sample_freq = top->freq;
+   attr->sample_freq = top->opts.freq;
}
 
if (evlist->nr_entries > 1) {
@@ -910,7 +910,7 @@ static void perf_top__start_counters(struct perf_top *top)
attr->read_format |= PERF_FORMAT_ID;
}
 
-   if (perf_target__has_cpu(&top->target))
+   if (perf_target__has_cpu(&top->opts.target))
attr->sample_type |= PERF_SAMPLE_CPU;
 
if (symbol_conf.use_callchain)
@@ -918,12 +918,12 @@ static void perf_top__start_counters(struct perf_top *top)
 
attr->mmap = 1;
attr->comm = 1;
-   attr->inherit = top->inherit;
+   attr->inherit = !top->opts.no_inherit;
 fallback_missing_features:
-   if (top->exclude_guest_missing)
+   if (top->opts.exclude_guest_missing)
attr->exclude_guest = attr->exclude_host = 0;
 retry_sample_id:
-   attr->sample_id_all = top->sample_id_all_missing ? 0 : 1;
+   attr->sample_id_all = top->opts.sample_id_all_missing ? 0 : 1;
 try_again:
if (perf_evsel__open(counter, top->evlist->cpus,
 top->evlist->threads) < 0) {
@@ -933,17 +933,17 @@ try_again:
ui__error_paranoid();
goto out_err;
} else if (err == EINVAL) {
-   if (!top->exclude_guest_missing &&
+   if (!top->opts.exclude_guest_missing &&
(attr->exclude_guest || attr->exclude_host)) {
pr_debug("Old kernel, cannot exclude "
 "guest or host samples.\n");
-   top->exclude_guest_missing = true;
+   top->opts.exclude_guest_missing = true;
goto fallback_missing_features;
-   } else if (!top->sample_id_all_missing) {
+   } else if (!top->opts.sample_id_all_missing) {
/*
* Old kernel, no attr->sample_id_type_all field
 */
-   top->sample_id_all_missing = true;
+   top->opts.sample_id_all_missing = true;
goto retry_sample_id;
}
}
@@ -988,7 +988,7 @@ try_again:
}
}
 
-   if (perf_evlist__mmap(evlist, top->mmap_pages, false) < 0) {
+   if (perf_evlist__mmap(evlist, top->opts.mmap_pages, false) < 0) {
ui__error("Failed to mmap with %d (%s)\n",
errno, strerror(errno));
goto out_err;
@@ -1034,7 +1034,7 @@ static int __cmd_top(struct perf_top *top)
if (ret)
goto out_delete;
 
-   if (perf_target__has_task(&top->target))
+   if (perf_target__has_task(&top->opts.target))

[PATCH 0/9] perf: consolidate all the open counters loops

2012-10-10 Thread David Ahern
ACME was a little slow today (ACME Component Mgmt Env that is) so managed
to add perf-stat to the list and do a decent amount of testing. This
consolidates all of the open counters loops into a single common one.

David Ahern (9):
  perf python: add ui stubs file
  perf top: make use of perf_record_opts
  perf evlist: introduce open counters method
  perf top: use the new perf_evlist__open_counters method
  perf record: use the new perf_evlist__open_counters method
  perf stat: move user options to perf_record_opts
  perf evlist: add stat unique code to open_counters method
  perf stat: move to perf_evlist__open_counters
  perf evsel: remove perf_evsel__open_per_cpu

 tools/perf/builtin-record.c|  109 +---
 tools/perf/builtin-stat.c  |  240 
 tools/perf/builtin-top.c   |  142 ++---
 tools/perf/util/evlist.c   |  139 -
 tools/perf/util/evlist.h   |4 +
 tools/perf/util/evsel.c|6 -
 tools/perf/util/evsel.h|2 -
 tools/perf/util/python-ext-sources |1 +
 tools/perf/util/top.c  |   20 +--
 tools/perf/util/top.h  |9 +-
 10 files changed, 303 insertions(+), 369 deletions(-)

-- 
1.7.10.1
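
For illustration only, one piece of logic shared by the loops being consolidated is the cycles fallback: when opening the hardware cycles event fails with ENOENT or ENXIO, the attribute is rewritten to the software cpu-clock event and the open is retried. The sketch below is a user-space stand-in under stated assumptions: struct demo_attr and the DEMO_* constants are hypothetical substitutes for struct perf_event_attr and the PERF_TYPE_* / PERF_COUNT_* values, and the freeing of the cached event name done by the original loops is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the perf_event_attr fields used by the fallback. */
enum demo_type { DEMO_TYPE_HARDWARE, DEMO_TYPE_SOFTWARE };
enum demo_config { DEMO_HW_CPU_CYCLES, DEMO_SW_CPU_CLOCK };

struct demo_attr {
	enum demo_type type;
	enum demo_config config;
};

/*
 * Returns true if attr was switched from hardware cycles to the software
 * cpu-clock counter, meaning the caller should retry the open.
 */
static bool fall_back_to_cpu_clock(struct demo_attr *attr, int err)
{
	assert(attr != NULL);

	if ((err == ENOENT || err == ENXIO) &&
	    attr->type == DEMO_TYPE_HARDWARE &&
	    attr->config == DEMO_HW_CPU_CYCLES) {
		attr->type = DEMO_TYPE_SOFTWARE;
		attr->config = DEMO_SW_CPU_CLOCK;
		return true;
	}
	return false;	/* any other event or error: no fallback */
}
```

Because the attribute is rewritten in place, a second failure on the software event no longer matches the hardware test, so the retry cannot loop indefinitely.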



RE: [RFC PATCH 04/06] input/rmi4: Config files and makefiles

2012-10-10 Thread Christopher Heiny
Linus Walleij wrote:
> On Sat, Oct 6, 2012 at 6:10 AM, Christopher Heiny 
> wrote:
> 
> (...)
> 
> > diff --git a/drivers/input/rmi4/Kconfig b/drivers/input/rmi4/Kconfig
> 
> (...)
> 
> > +config RMI4_DEBUG
> > +   bool "RMI4 Debugging"
> > +   depends on RMI4_BUS
> 
> select DEBUG_FS
> 
> This has been illustrated many times in the review. You definitely
> have code depending on debugfs when this is selected.

Agreed.


RE: [RFC PATCH 03/06] input/rmi4: I2C physical interface

2012-10-10 Thread Christopher Heiny
Linus Walleij wrote:
> On Sat, Oct 6, 2012 at 6:10 AM, Christopher Heiny  
> wrote:
> > The I2C physical driver is not extensively changed in terms of
> > functionality since the previous patch.  Management of the attention GPIO
> > has been moved to rmi_driver.c (see previous email), and most of the
> > debug related interfaces have been moved from sysfs to debugfs.  Control
> > of the debug features has been moved from compile-time to runtime
> > switches available via debugfs.
> > 
> > The core I2C functionality was previously ACKed by Jean Delvare.  I don't
> > believe that portion of the code has changed much since then, but we'd
> > appreciate a second glance at this.
> 
> The above commit blurb looks more like a changelog than a description
> of the actual patch. Nothing wrong with that but begin by describing
> the patch first.

Good point.  I was describing the patch, but not from the correct point of 
view. :-)

[snip some items covered in a previous email]

> 
> > +static int setup_debugfs(struct rmi_device *rmi_dev, struct rmi_i2c_data *data);
> > +static void teardown_debugfs(struct rmi_i2c_data *data);
> 
> Why do you need to forward-declare these? Can't you just move them
> up above the functions using them?

Probably.  We'll do that if possible.

> 
> > +struct i2c_debugfs_data {
> > +   bool done;
> 
> Done with what? ... needs some doc.

OK.

> 
> > +   struct rmi_i2c_data *i2c_data;
> > +};
> 
> (...)
> 
> > +static int __devinit rmi_i2c_probe(struct i2c_client *client,
> > + const struct i2c_device_id *id)
> 
> (...)
> 
> > +   rmi_phys = kzalloc(sizeof(struct rmi_phys_device), GFP_KERNEL);
> 
> (...)
> 
> > +   data = kzalloc(sizeof(struct rmi_i2c_data), GFP_KERNEL);
> 
> Can you use devm_kzalloc(&client->dev, ...) for these so you don't
> need to free() them explicitly?

Hmm.  That looks like a merge regression - I'm pretty sure we implemented 
devm_kzalloc there.

> 
> (...)
> 
> > +static int __devexit rmi_i2c_remove(struct i2c_client *client)
> > +{
> > +   struct rmi_phys_device *phys = i2c_get_clientdata(client);
> > +   struct rmi_device_platform_data *pd = client->dev.platform_data;
> > +
> > +   /* Can I remove this disable_device */
> > +   /*disable_device(phys); */
> 
> So just delete these two lines then?

Yes.


RE: [RFC PATCH 02/06] input/rmi4: Core files

2012-10-10 Thread Christopher Heiny
On Thursday, October 11, 2012 02:21:53 AM you wrote:
> On Sat, Oct 6, 2012 at 6:09 AM, Christopher Heiny  
> wrote:
> > rmi_bus.c implements the basic functionality of the RMI bus.  This file is
> > greatly simplified compared to the previous patch - we've switched from
> > "do it yourself" device/driver binding to using device_type to distinguish
> > between the two kinds of devices on the bus (sensor devices and function
> > specific devices) and using the standard bus implementation to manage
> > devices and drivers.
> 
> So I think you really want Greg KH to look at this bus implementation
> now. Please include Greg on future mailings...
> 
> It looks much improved from previous versions, and sorry if I am now
> adding even more comments, but it's because you cleared out some
> noise that was disturbing my perception so I can cleanly review
> the architecture of this thing now. (I'm impressed by your work and
> new high-speed turnaround time!)

Thanks for the praise - it means a lot to me and my team.

We'll cc Greg on the next patch drop.

[snip some items covered in a previous email]

> 
> (...)
> 
> > diff --git a/drivers/input/rmi4/rmi_driver.c
> > b/drivers/input/rmi4/rmi_driver.c

[snip some items covered in a previous email]

> 
> > +#define DELAY_NAME "delay"
> 
> This is only used in one place, why not just use the string
> "delay" there?
> 
> (...)
> 
> > +   if (IS_ENABLED(CONFIG_RMI4_SPI) && !strncmp("spi", info->proto, 3)) {
> > +   data->debugfs_delay = debugfs_create_file(DELAY_NAME,
> > +   RMI_RW_ATTR, rmi_dev->debugfs_root, rmi_dev,
> > +   _fops);
> 
> i.e. there.

That's a left-over.  We'll consolidate it.

> 
> (...)
> 
> > +/* Useful helper functions for u8* */
> > +
> > +static bool u8_is_any_set(u8 *target, int size)
> > +{
> > +   int i;
> > +   /* We'd like to use find_first_bit, but it ALWAYS returns 1,
> > +   *  no matter what we pass it.  So we have to do this the hard way.
> > +   *  return find_first_bit((long unsigned int *)  target, size) != 0;
> > +   */
> > +   for (i = 0; i < size; i++) {
> > +   if (target[i])
> > +   return true;
> > +   }
> > +   return false;
> > +}
> 
> Instead of:
> 
> if (u8_is_any_set(foo, 128)) {}
> 
> Why can't you use:
> 
> if (!bitmap_empty(foo, 128*8)) {}
> 
> ?
> 
> If you look at the implementation in the <linux/bitmap.h> header
> and __bitmap_empty() in lib/bitmap.c you will realize that this
> function is already optimized like this (and I actually don't think
> the RMI4 code is performance-critical for these functions anyway,
> but prove me wrong!)

We'll give !bitmap_empty a try.
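
For illustration only: the kernel's find_first_bit() returns the index of the first set bit, or a value of at least size when no bit is set, and its size argument counts bits, not bytes. An "any bit set" test therefore compares the result against the bit count instead of against zero. The user-space sketch below shows that convention on a plain byte array. The demo_* names are hypothetical, and the bit numbering (least significant bit of byte 0 is bit 0) is an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a "find first set bit" search over a byte
 * array: returns the index of the first set bit, or nbits if none is set.
 */
static size_t demo_find_first_bit(const uint8_t *bytes, size_t nbits)
{
	assert(bytes != NULL);

	for (size_t i = 0; i < nbits; i++) {
		if (bytes[i / CHAR_BIT] & (1u << (i % CHAR_BIT)))
			return i;
	}
	return nbits;	/* "not found" is signalled by the size itself */
}

/* Any bit is set exactly when the search stops before the end. */
static bool demo_any_bit_set(const uint8_t *bytes, size_t nbytes)
{
	size_t nbits = nbytes * CHAR_BIT;

	return demo_find_first_bit(bytes, nbits) < nbits;
}
```

Note that an array whose very first bit is set yields index 0, so a "!= 0" comparison on the search result would misreport it as empty.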

> 
> > +
> > +/** This is here because all those casts made for some ugly code.
> > + */
> > +static void u8_and(u8 *dest, u8 *target1, u8 *target2, int nbits)
> > +{
> > +   bitmap_and((long unsigned int *) dest,
> > +  (long unsigned int *) target1,
> > +  (long unsigned int *) target2,
> > +  nbits);
> > +}
> 
> Hm, getting rid of unreadable casts is a valid case.
> 
> I'll be OK with this but maybe the real solution is to introduce such
> helpers into <linux/bitmap.h>?

Hmmm.  We'll give that some thought.  Though I'd like to get the RMI4 driver 
nailed down, just to keep the area of change small.  Once we've got all the 
kinks worked out here, we'll look at bitmap.h helpers.

> 
> (...)
> 
> > +static int process_interrupt_requests(struct rmi_device *rmi_dev)
> > +{
> > +   struct rmi_driver_data *data = dev_get_drvdata(&rmi_dev->dev);
> > +   struct device *dev = &rmi_dev->dev;
> > +   struct rmi_function_container *entry;
> > +   u8 irq_status[data->num_of_irq_regs];
> 
> Looking at this...
> 
> What does the data->num_of_irq_regs actually contain?
> 
> I just fear that it is something constant like always 2 or always 4,
> so there is actually, in reality, a 16 or 32 bit register hiding in there.
> 
> In that case what you should do is to represent it as a u16 or u32 here,
> just or the bits into a status word, and then walk over that status
> word with something like ffs(bitword); ...

Nope, it's not constant.  In theory, an RMI4-based sensor can have up to 128 
functions (in practice, it's far fewer), and each function can have as many as 
7 interrupts.  So the number of IRQ registers can vary from RMI4 sensor to RMI4 
sensor, and needs to be computed during the scan of the product descriptor 
table.
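
For illustration only, the arithmetic behind a variable register count can be sketched as follows. The assumptions here are not stated in the thread: each interrupt source takes one status bit, status registers are 8 bits wide, and the per-function counts have already been collected during the descriptor scan. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BITS_PER_IRQ_REG 8	/* assumption: 8-bit status registers */
#define DEMO_MAX_IRQS_PER_FUNC 7	/* upper bound quoted in the reply above */

/*
 * irq_counts[i] is the number of interrupt sources reported by function i.
 * Returns how many status registers are needed to hold one bit per source.
 */
static unsigned int demo_num_irq_regs(const unsigned char *irq_counts,
				      size_t nfuncs)
{
	unsigned int total = 0;

	assert(irq_counts != NULL || nfuncs == 0);

	for (size_t i = 0; i < nfuncs; i++) {
		assert(irq_counts[i] <= DEMO_MAX_IRQS_PER_FUNC);
		total += irq_counts[i];
	}
	/* Round up to whole registers. */
	return (total + DEMO_BITS_PER_IRQ_REG - 1) / DEMO_BITS_PER_IRQ_REG;
}
```

Under these assumptions the theoretical maximum of 128 functions with 7 interrupts each needs 896 bits, that is 112 registers, which is why a fixed u16 or u32 status word would not be enough in general.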

> 
> (...)
> 
> > +static int standard_resume(struct rmi_device *rmi_dev)
> 
> Standard eh?
> 
> At least prefix with rmi4_*...

Ooops - we excised the Android based stuff, but forgot to change that function 
name.


> 
> > +static int rmi_driver_suspend(struct device *dev)
> > +{
> > +   struct rmi_device *rmi_dev = to_rmi_device(dev);
> > +   return standard_suspend(rmi_dev);
> > +}
> > +
> > +static int rmi_driver_resume(struct device *dev)
> > +{
> > +   struct 

RE: [RFC PATCH 01/06] input/rmi4: Public header and documentation

2012-10-10 Thread Christopher Heiny
Mark Brown wrote:
> On Tue, Oct 09, 2012 at 09:43:13AM +0200, Linus Walleij wrote:
> > On Sat, Oct 6, 2012 at 6:09 AM, Christopher Heiny  
> > wrote:
> > > + * @cs_assert - For systems where the SPI subsystem does not control the CS/SSB
> > > + * line, or where such control is broken, you can provide a custom routine to
> > > + * handle a GPIO as CS/SSB.  This routine will be called at the beginning and
> > > + * end of each SPI transaction.  The RMI SPI implementation will wait
> > > + * pre_delay_us after this routine returns before starting the SPI transfer;
> > > + * and post_delay_us after completion of the SPI transfer(s) before calling it
> > > + * with assert==FALSE.
> > 
> > Hm hm, can you describe the case where this happens?
> > 
> > Usually we don't avoid fixes for broken drivers by duct-taping
> > solutions into other drivers, instead we fix the SPI driver.
> > 
> > I can think of systems where CS is asserted not by using
> > GPIO but by poking some special register for example, which
> > is a valid reason for including this, but working around broken
> > SPI drivers is not a valid reason to include this.
> > 
> > (Paging Mark about it.)
> 
> Yeah, this seems silly - by this logic we'd have to go round implementing
> manual /CS control in every single SPI client driver which isn't
> terribly sensible.  The driver should just assume that the SPI
> controller does what it's told.  As you say if there's an issue the
> relevant controller driver should take care of things.
> 
> We should also have generic support in the SPI framework for GPIO based
> /CS, there's enough drivers open coding this already either due to
> hardware limitations or to support extra chip selects.
> 
> The ability of SPI hardware and driver authors to get /CS right is
> pretty depressing :/

You will get no argument at all from me on that point.  I'll even add board 
layout engineers to the list ("it wasn't convenient to run CS, so we just used 
a different pin.  You can just mux it, right?").  Basically this feature exists 
to help get prototype systems up and running while the SPI 
hardware/driver/layout matures.

If this feature is a deal-breaker, we can take it out.  In the absence of a 
generic GPIO implementation for CS, though, I'd much rather leave it in.  Once 
generic GPIO CS arrives, we'll remove it pretty quickly.  
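
For readers following the thread, the call order described in the `cs_assert` kerneldoc can be modelled in plain userspace C. This is only a sketch under stated assumptions: `spi_model`, `spi_model_xfer` and the tracing stubs are hypothetical names, the delay and transfer hooks are injected purely so the ordering is observable, and only `cs_assert`, `pre_delay_us` and `post_delay_us` come from the patch under review.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model; only cs_assert, pre_delay_us and post_delay_us come from the patch. */
struct spi_model {
	void (*cs_assert)(void *cs_data, bool asserted);  /* NULL when the controller drives CS */
	void *cs_data;
	unsigned int pre_delay_us;
	unsigned int post_delay_us;
	void (*delay_us)(void *cs_data, unsigned int us); /* injected so the order is observable */
	int (*transfer)(void *cs_data, const void *tx, void *rx, size_t len);
};

/* Assert CS, wait, transfer, wait, deassert CS; skip the CS steps if no routine is given. */
static int spi_model_xfer(const struct spi_model *m, const void *tx, void *rx, size_t len)
{
	int ret;

	if (m->cs_assert) {
		m->cs_assert(m->cs_data, true);
		m->delay_us(m->cs_data, m->pre_delay_us);
	}
	ret = m->transfer(m->cs_data, tx, rx, len);
	if (m->cs_assert) {
		m->delay_us(m->cs_data, m->post_delay_us);
		m->cs_assert(m->cs_data, false);
	}
	return ret;
}

/* Tracing stubs: each appends a token to a caller-supplied 64-byte event buffer. */
static void trace(void *events, const char *tok)
{
	char *s = events;
	size_t used = strlen(s);

	snprintf(s + used, 64 - used, "%s", tok);
}

static void trace_cs(void *events, bool asserted)
{
	trace(events, asserted ? "CS+ " : "CS- ");
}

static void trace_delay(void *events, unsigned int us)
{
	char tok[24];

	snprintf(tok, sizeof(tok), "wait%u ", us);
	trace(events, tok);
}

static int trace_xfer(void *events, const void *tx, void *rx, size_t len)
{
	(void)tx; (void)rx; (void)len;
	trace(events, "xfer ");
	return 0;
}
```

Wiring the stubs into a `struct spi_model` with delays of 10 and 20 microseconds and calling `spi_model_xfer()` once records the sequence assert, pre-delay, transfer, post-delay, deassert. A generic GPIO chip select in the SPI core would make this per-driver hook unnecessary, which is the point of the objection above.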

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] fix x2apic defect that Linux kernel doesn't mask 8259A interrupt during the time window between changing VT-d table base address and initializing these VT-d entries(smpboot.c and apic.c )

2012-10-10 Thread Zhang, Lin-Bao (Linux Kernel R)

> -----Original Message-----
> From: Suresh Siddha [mailto:suresh.b.sid...@intel.com]
> > > Can you please apply the appended patch to 2.6.32 kernel and see if
> > > the issue you mentioned gets fixed? If so, we can ask the -stable
> > > and OSV's teams to pick up this fix.
> > > Yes, it can resolve the current issue.
> 
> Thanks for testing it out.
> 
You are welcome!

> I will add the appropriate changelog and send the patch out (to 2.6.32 stable
> and OSV kernels) with your "Tested-by:" if you are ok.
> 
Sure, it is my pleasure. Please go ahead!
BTW, what are OSV kernels? I can't find the meaning by searching Google.
Once your patch has been included in the 2.6 git tree, kindly inform me. Thanks.

> thanks,
> suresh



RE: [RFC PATCH 01/06] input/rmi4: Public header and documentation

2012-10-10 Thread Christopher Heiny
Linus Walleij wrote:
> On Sat, Oct 6, 2012 at 6:09 AM, Christopher Heiny wrote:
> > As requested in the feedback from the previous patch, we've documented the
> > debugfs and sysfs attributes in files in
> > Documentation/ABI/testing.  There's two files, one for debugfs and one
> > for sysfs.
> 
> This is a massive improvement! At least as far as I've read... If you fix the
> below remarks I think I'm ready to accept this file, but that's just me and
> doesn't say anything about what Dmitry et al will comment on...

Thanks!  See my comments below.

> 
> (...)
> 
> > +  The RMI4 driver implementation exposes a set of informational and control
> > +  parameters via debugs.  These parameters are those that typically are only
> s/debugs/debugfs
> 
> (...)
> 
> > +  comms_debug - (rw) Write 1 to this dump information about register
> > +  reads and writes to the console.  Write 0 to this to turn
> > +  this feature off.  WARNING: Imposes a major performance
> > +  penalty when turned on.
> > +  irq_debug - (rw) Write 1 to this dump information about interrupts
> > +  to the console.  Write 0 to this to turn this feature off.
> > +  WARNIG: Imposes a major performance penalty when turned on.
> Hm. Usually we control dynamic debug prints by standard kernel
> frameworks, can you tell what is wrong with this and why you need
> a custom mechanism? See the following:
> Documentation/dynamic-debug-howto.txt
> http://lwn.net/Articles/434833/

The current arrangement was arrived at after some discussion with customers.  
Originally we went with the Kconfig based approach you suggested in August.  
However, the response from our guinea pigs, um, very helpful test customers, 
was "AAAggh! Too complicated and too static!!"  As a result we explored 
alternatives.  The dynamic debug interface was considered, but it is usually 
disabled in our customers' kernel configurations, even during development.  In 
the end, we arrived at some simple debugfs on/off switches for the more verbose 
features (like comms_debug and irq_debug, above).

If this is a deal-breaker, I can go back to the customers and see if they are 
willing to consider enabling dynamic debug during prototyping and development.

> 
> (...)
> 
> > +++ b/Documentation/ABI/testing/sysfs-rmi4
> 
> (...)
> 
> > +  chargerinput ... (rw) User space programs can use this to tell the
> > +  sensor that the system is plugged into an external power
> > +  source (as opposed to running on batteries).  This allows
> > +  the sensor firmware to make necessary adjustments for the
> > +  current capacitence regime.  Write 1 to this when the
> > +  system is using external power, write 0 to this when the
> > +  system is running on batteries.  See spec for full details.
> I remember discussing in-kernel notifiers for this. I don't
> really see the point in tunnelling a notification from the drivers/power
> subsystem to the drivers/input subsystem through userspace for
> no good.
> 
> It's no blocker though, I don't expect you to fix this as part of
> this driver submission.
> 
> Maybe Anton can comment?

Hmmm.  I agree that it'd be good to avoid looping through userspace.  But

I found ways to notify the kernel that the charger is plugged/unplugged, but 
that's only useful if you're a battery charger device driver.  I also found 
ways for userspace to get notification of charger events.  I didn't spot any 
way for in-kernel drivers to get notification of such events.  Perhaps I'm not 
looking in the right places, though - can you provide a pointer?

> 
> (...)
> 
> > +  interrupt_enable ... (ro) This represents the current RMI4 interrupt
> > +  mask (F01_RMI_Ctrl1 registers).  See spec for full details.
> What does the userspace have to do with this stuff? Seems way
> too low-level, but whatever.

It's primarily used in hardware prototyping and bring up.  Perhaps it belongs 
in debugfs in that case.

> 
> (...)
> 
> > +  sleepmode ... (rw) Controls power management on the device.  Writing
> > +  0 to this parameter puts the device into its normal operating
> > +  mode.  Writing 1 to this parameter fully disables touch
> > +  sensors and similar inputs - no touch data will be reported
> > +  from the device in this mode.  Writing 2 or 3 to this device
> > +  may or may not have an effect, depending on the particular
> > +  device - see the product specification for your sensor for
> > +  details.
> 
> Usually power management is controlled from kernelspace, but no
> big deal, maybe userspace knows even more details in some
> cases.

Well, in some cases userspace does think it knows more :-).  This one should 

Re: [PATCH 0/2] struct pid-ify autofs4

2012-10-10 Thread Ian Kent
On Mon, 2012-09-24 at 19:56 -0700, Eric W. Biederman wrote:
> Ian Kent  writes:
> 
> > On Mon, 2012-09-24 at 15:34 +0200, Miklos Szeredi wrote:
> >> Ian Kent  writes:
> >> 
> >> > On Fri, 2012-09-21 at 17:44 +0200, Miklos Szeredi wrote:
> >> >> Miklos Szeredi  writes:
> >> >> 
> >> >> > These two patches change autofs4 to store struct pid pointers instead
> >> >> > of pid_t values.
> >> >> >
> >> >> > Fixed various issues with the previous post.  Not tested, handle with
> >> >> > care!
> >> >> 
> >> >> Customer gave positive test results.
> >> >
> >> > For what exactly, there's no problem description in these patches?
> >> 
> >> From what I understand (and I'm not an expert by any means) is that
> >> autofs doesn't work if containers are used.  The first patch fixes this.
> >
> > Yeah, the problem with that is that "autofs doesn't work if containers
> > are used" is ill defined since there are use cases where it does, I
> > believe. At the very least, ill defined in my view of things.
> 
> An easy complaint is that task->pid and task->tgid are deprecated fields
> in the task structure.  Things that use pids should use struct pid
> values instead.

Yes, we do need to fix that.

> 
> The tricky part of using struct pid values is that there are times when
> you need to interact with userspace.  And the question is which pid
> namespace is your userspace process in so that you can convert
> to and from the proper pid namespace.
> 
> The pgrp option on the mount of autofs is buggy because the pid
> namespace of the process group is not captured at the time of mount
> and so userspace could think it is talking about one process group
> while autofs is talking about another process group.  This is a
> practical problem if the process that mounts autofs is not running
> in the initial pid namespace.

Yep.

> 
> There is a second question.  What happens if the oz_pgrp exists
> and then pids wrap around and another process uses the same process
> group number.  Currently the autofs code will treat the new process
> group like the old one leading to unexpected behavior.  Which I believe
> will be autofs mounts not happening when desired.

OK, but I think you're saying a process in another namespace could then
use that process group number or the daemon would need to be SIGKILLed
in the initial namespace.

> 
> Another problem is what happens when a process triggers an automount.
> Today we will report the pid and the tgid in the initial pid namespace
> of the process that triggered the mount.
> 
> So what I can see is that today if the process that mounts autofs
> (aka the process at the other end of the autofs pipe) is not in
> the initial pid namespace things will go awry, as autofs will report
> pid values that make no sense to anyone.

Yep, I get that too.

> 
> I would like to say the patches fix that problem (and they come close)
> however they still translate everything into the initial pid namespace.

I think it's a little more difficult to do than it appears since
multiple namespace behavior is not defined for autofs.

For example, suppose we had the situation where the correct namespace
was always used and another instance of automount was run within a
namespace using different maps. When the namespace is created we could
get autofs mounts from the initial namespace which the daemon won't know
how to handle so they would need to be umounted before the new daemon is
started. Since that isn't automated as part of namespace creation it
almost certainly would lead to complaints and lots of confusion.

At the moment it seems best to restrict interactions to a single daemon
in the initial namespace until specific behavioral requirements are
clear.

> 
> > But I can't even sensibly discuss it because of the lack of specified
> > use cases and requirements for each. So, there's a chance this will
> > break another case that does work.
> 
> That seems to be a reasonable concern.  Education so that the
> differences in the code are comprehensible.
> 
> I am also curious about which case people were seeing problems with as
> these patches were reported to be tested and to have fixed a customer
> problem.  The only case that is particularly clear to me that these
> patches will fix is the case of process group wrap around, but process
> group wrap around should be exceedingly rare.

This comment arose because of a bug report regarding lxc not working
with automount. It had an accompanying patch that had a couple of hunks
from an unrelated patch and I think most of what is in the patches here.
My initial reaction was caution since I don't want to break what now
functions OK with what was a suspect patch.

If that change is done in RHEL it would have to be a back port of these
patches since it is clearer what they are trying to achieve and I
believe they would not break existing function.

> 
> To handle mounts made outside of the initial pid namespace it appears
> that all that is needed is to capture the pid 

Re: linux-next: manual merge of the kvm-ppc tree with the powerpc-merge tree

2012-10-10 Thread Tabi Timur-B04825
On Wed, Oct 10, 2012 at 9:47 PM, Stephen Rothwell  wrote:

> Commit 549d62d889b4 ("KVM: PPC: use definitions in epapr header
> for hcalls") from the kvm-ppc tree added an include of asm/epapr_hcall.h
> to the user visible part of asm/kvm_para.h so asm/epapr_hcall.h became a
> user visible header file.

Any real user-space code that tries to call any of the functions in
epapr_hcall.h will cause an exception.

Claiming that kernel header files that KVM needs are suddenly
user-space header files doesn't make much sense to me, but I guess
it's not my decision.

-- 
Timur Tabi
Linux kernel developer at Freescale


Re: udev breakages -

2012-10-10 Thread Eric W. Biederman
Greg KH  writes:

> On Thu, Oct 04, 2012 at 10:29:51AM -0700, Eric W. Biederman wrote:
>> There are still quite a few interesting cases that devtmpfs does not
>> even think about supporting.  Cases that were reported when devtmpfs was
>> being reviewed. 
>
> Care to refresh my memory?

Anyone who wants something besides the default policy: containers,
chroots, anyone who doesn't want /dev/console to be c 5 1.

>> Additionally the devtmpfs maintainership has not dealt with legitimate
>> concerns any better than this firmware issue has been dealt with.  I
>> still haven't even heard a productive suggestion back on the whole
>> /dev/ptmx mess.
>
> I don't know how to handle the /dev/ptmx issue properly from within
> devtmpfs, does anyone?  Proposals are always welcome, the last time this
> came up a week or so ago, I don't recall seeing any proposals, just a
> general complaint.

The proposal at that time was to work around the silliness with a little
kernel magic.

To recap for those who haven't watched closely.  devpts now has a ptmx
device node and it would be very nice if we were to use that device
node instead of /dev/ptmx.

Basically it would be nice to tell udev to not create /dev/ptmx, and
instead to make /dev/ptmx a symlink to /dev/pts/ptmx.

I got to looking at the problem and if I don't worry about systemd and
just look at older versions of udev that are out there in the wild it
turns out the following udev configuration line does exactly what is
needed.  It creates a symlink from /dev/ptmx to /dev/pts/ptmx.  And if
on the odd chance devpts is not mounted it creates /dev/pts/ptmx as
well.

KERNEL=="ptmx" NAME:="pts/ptmx" SYMLINK="ptmx"

Does assigning to NAME to specify the device naming policy work in
systemd-udev or has that capability been ripped out?

Thinking about it: since systemd-udev no longer supports changing the
device name, and likely no longer even supports assigning to NAME for
the purpose of changing the target of the symlink, I expect what we
want to do is:

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 147d1a4..7dc5bed 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -377,6 +377,7 @@ static int devtmpfsd(void *p)
 		goto out;
 	sys_chdir("/.."); /* will traverse into overmounted root */
 	sys_chroot(".");
+	sys_symlink("pts/ptmx", "ptmx");
 	complete(&setup_done);
 	while (1) {
 		spin_lock(&req_lock);



Eric


Re: [PATCH] init_module: update to modern interfaces

2012-10-10 Thread Rusty Russell
"Michael Kerrisk (man-pages)"  writes:
> [CC widened, so that some more review might come in. Rusty?]

Sure.

Looks good, but:

> .B EBUSY
> The module's initialization routine failed.

Possibly.  You should mention that the individual module's
initialization routine can return other errors as appropriate.

> .BR EINVAL " (Linux 2.4 and earlier)"
> Some
> .I image
> slot is filled in incorrectly,
> .I image\->name
> does not correspond to the original module name, some
> .I image\->deps
> entry does not correspond to a loaded module,
> or some other similar inconsistency.
> .TP

Why document this?

> .B ENOEXEC
> The ELF image in
> .I module_image
> is too small or has corrupted segments.

Or is not an ELF image, or wrong arch...

> .TP
> .B EPERM
> The caller was not privileged
> (did not have the
> .B CAP_SYS_MODULE
> capability),
> or module loading is disabled
> (see
> .IR /proc/sys/kernel/modules_disabled
> in
> .BR proc (5)).
> .SH "CONFORMING TO"
> .BR init_module ()
> is Linux-specific.
> .SH NOTES
> Glibc does not provide a wrapper for this system call; call it using
> .BR syscall (2).
>
> Information about currently loaded modules can be found in
> .IR /proc/modules
> and in the file trees under the per-module subdirectories under
> .IR /sys/module .
>
> See the Linux kernel source file
> .I include/linux/module.h
> for some useful background information.
> .SS Linux 2.4 and earlier
> .PP
> In Linux 2.4 and earlier, this system call was rather different:
>
> .B "#include "
>
> .BI "int init_module(const char *" name ", struct module *" image );
>
> This version of the system call
> loads the relocated module image pointed to by
> .I image
> into kernel space and runs the module's
> .I init
> function.
> The caller is responsible for providing the relocated image (since
> Linux 2.6, the
> .BR init_module ()
> system call does the relocation).
> .PP
> The module image begins with a module structure and is followed by
> code and data as appropriate.
> The module structure is defined as follows:
> .PP
> .in +4n
> .nf
> struct module {
>     unsigned long         size_of_struct;
>     struct module        *next;
>     const char           *name;
>     unsigned long         size;
>     long                  usecount;
>     unsigned long         flags;
>     unsigned int          nsyms;
>     unsigned int          ndeps;
>     struct module_symbol *syms;
>     struct module_ref    *deps;
>     struct module_ref    *refs;
>     int                 (*init)(void);
>     void                (*cleanup)(void);
>     const struct exception_table_entry *ex_table_start;
>     const struct exception_table_entry *ex_table_end;
> #ifdef __alpha__
>     unsigned long gp;
> #endif
> };
> .fi
> .in
> .PP
> All of the pointer fields, with the exception of
> .I next
> and
> .IR refs ,
> are expected to point within the module body and be
> initialized as appropriate for kernel space, that is, relocated with
> the rest of the module.

You might want to note that the 2.4 syscall can be detected by calling
query_module(): 2.6 and above give ENOSYS.
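
A minimal sketch of that probe follows. It assumes a Linux system whose C library headers still define `SYS_query_module` (some architectures never had the call, hence the `#ifdef`), and `have_query_module` is a hypothetical helper name. Note that a sandbox or seccomp filter may answer with an error other than ENOSYS, in which case this simple check cannot tell the difference.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Returns 1 if query_module() appears to be implemented (suggesting the
 * 2.4-style module interface), 0 if the kernel reports ENOSYS, and -1 if
 * the headers for this architecture do not define the syscall number.
 */
static int have_query_module(void)
{
#ifdef SYS_query_module
	long rc;

	errno = 0;
	/* which == 0 with NULL arguments: we only ask whether the call exists. */
	rc = syscall(SYS_query_module, NULL, 0, NULL, 0, NULL);
	if (rc == -1 && errno == ENOSYS)
		return 0;
	return 1;
#else
	return -1;
#endif
}
```

A caller would choose the 2.4-style `init_module(name, image)` path only when this returns 1, and the current interface otherwise.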

Cheers,
Rusty.


Re: [Request for review] Revised delete_module(2) manual page

2012-10-10 Thread Rusty Russell
"Michael Kerrisk (man-pages)"  writes:

> Hello Kees, Rusty,
>
> The current delete_module(2) page is severely out of date (basically,
> its content corresponds to 2.4 days, and was even pretty thin in
> covering that). So, I took a shot at revising the page to Linux 2.6+
> reality. Would it be possible that you could review it?

OK.  Main suggestion is that I discussed with Lucas removing the
!O_NONBLOCK case.  It's not supported by modprobe -r, and almost
unheard-of for rmmod (it's --wait).

In practice, people want the unload-or-fail semantics, or the
force-unload semantics.

> Otherwise, by default,
> .BR delete_module ()
> marks a module so that no new references are permitted.
> If the module's reference count
> (i.e., the number of processes currently using the module) is nonzero,
> it then places the caller in an uninterruptible sleep
> state until all reference count is zero,
> at which point the call unblocks.
> When the reference count reaches zero, the module is unloaded.

So this should be inverted:

Otherwise (assuming O_NONBLOCK, see flags below), if the
module's reference count (i.e., the number of processes
currently using the module) is nonzero, the call fails.

> The
> .IR flags
> argument can be used to modify the behavior of the system call.

It is usually set to O_NONBLOCK, which may be required in future kernel
versions (see NOTES).

> The following values can be ORed in this argument:
> .TP
> .B O_TRUNC
> .\"   KMOD_REMOVE_FORCE in kmod library
> Force unloading of the module, even if the following conditions are true:
> .RS
> .IP * 3
> The module has no
> .I exit
> function.
> By default, attempting to unload a module that has no
> .I exit
> function fails.
> .IP *
> The reference count for (i.e., the number of processes currently using)
> this module is nonzero.
...
> .IP
> Using this flag taints the kernel (TAINT_FORCED_RMMOD).
> .IP
> .IR "Using this flag is dangerous!"
> If the kernel was not built with
> .BR CONFIG_MODULE_FORCE_UNLOAD ,
> this flag is silently ignored.

NOTES:

If O_NONBLOCK is not set, then the kernel may enter uninterruptible
sleep until the module reference count reaches zero.  This is not
generally desirable, so this flag may be compulsory in future kernel
configurations.
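
The unload-or-fail semantics described above might be used from C roughly as follows. This is a sketch, not the kmod implementation: `unload_module_or_fail` is a hypothetical helper, and the call goes through syscall(2) on the assumption that the C library provides no `delete_module()` wrapper.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Ask the kernel to unload a module without sleeping on its reference count.
 * Returns 0 on success, -1 with errno set on failure (for example EWOULDBLOCK
 * if the module is still in use, ENOENT if no such module is loaded, EPERM if
 * the caller lacks CAP_SYS_MODULE).
 */
static int unload_module_or_fail(const char *name)
{
	return (int)syscall(SYS_delete_module, name, O_NONBLOCK);
}
```

Passing `O_NONBLOCK | O_TRUNC` instead would request the forced unload discussed earlier, with the tainting and safety caveats that come with it.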

Cheers,
Rusty.


[GIT PULL] ACPI & Thermal patches for 3.7-merge

2012-10-10 Thread Len Brown
Hi Linus,

Please pull these ACPI & Thermal patches.

The generic Linux thermal layer is gaining some
new capabilities (generic cooling via cpufreq)
and some new customers (ARM).

Also, an ACPI EC bug fix plus a regression fix.

thanks!
Len Brown, Intel Open Source Technology Center

The following changes since commit f5a246eab9a268f51ba8189ea5b098a1bfff200e:

  Merge tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound (2012-10-09 07:07:14 +0900)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git release

for you to fetch changes up to d1d4a81b842db21b144ffd2334ca5eee3eb740f3:

  Merge branches 'fixes-for-37', 'ec' and 'thermal' into release (2012-10-09 01:47:35 -0400)



Amit Daniel Kachhap (6):
  thermal: add generic cpufreq cooling implementation
  hwmon: exynos4: move thermal sensor driver to driver/thermal directory
  thermal: exynos5: add exynos5250 thermal sensor driver support
  thermal: exynos: register the tmu sensor with the kernel thermal layer
  ARM: exynos: add thermal sensor driver platform data support
  thermal: exynos: Use devm_* functions

Eduardo Valentin (1):
  Fix a build error.

Feng Tang (2):
  ACPI: EC: Make the GPE storm threshold a module parameter
  ACPI: EC: Add a quirk for CLEVO M720T/M730T laptop

Guenter Roeck (2):
  thermal: fix potential out-of-bounds memory access
  thermal: Fix potential NULL pointer accesses

Jonghwa Lee (1):
  Thermal: Fix bug on cpu_cooling, cooling device's id conflict problem.

Kuninori Morimoto (1):
  thermal: add Renesas R-Car thermal sensor support

Len Brown (2):
  Merge branch 'release' of git://git.kernel.org/.../rzhang/linux into thermal
  Merge branches 'fixes-for-37', 'ec' and 'thermal' into release

Sachin Kamat (1):
  thermal: Exynos: Fix NULL pointer dereference in exynos_unregister_thermal()

Srivatsa S. Bhat (1):
  ACPI idle, CPU hotplug: Fix NULL pointer dereference during hotplug

Wei Yongjun (2):
  cpuidle / ACPI: fix potential NULL pointer dereference
  tools/power/acpi/acpidump: remove duplicated include from acpidump.c

Zhang Rui (13):
  Thermal: Introduce multiple cooling states support
  Thermal: Introduce cooling states range support
  Thermal: set upper and lower limits
  Thermal: Introduce .get_trend() callback.
  Thermal: Remove tc1/tc2 in generic thermal layer.
  Thermal: Introduce thermal_zone_trip_update()
  Thermal: rename structure thermal_cooling_device_instance to thermal_instance
  Thermal: Rename thermal_zone_device.cooling_devices
  Thermal: Rename thermal_instance.node to thermal_instance.tz_node.
  Thermal: List thermal_instance in thermal_cooling_device.
  Thermal: Introduce simple arbitrator for setting device cooling state
  Thermal: Unify the code for both active and passive cooling
  Thermal: Introduce locking for cdev.thermal_instances list.

 Documentation/thermal/cpu-cooling-api.txt  |  32 +
 .../{hwmon/exynos4_tmu => thermal/exynos_thermal}  |  35 +-
 Documentation/thermal/sysfs-api.txt|   9 +-
 drivers/acpi/ec.c  |  30 +-
 drivers/acpi/processor_idle.c  |   3 +-
 drivers/acpi/thermal.c |  93 +-
 drivers/cpuidle/cpuidle.c  |   2 +-
 drivers/hwmon/Kconfig  |  10 -
 drivers/hwmon/Makefile |   1 -
 drivers/hwmon/exynos4_tmu.c| 518 ---
 drivers/platform/x86/acerhdf.c |   5 +-
 drivers/platform/x86/intel_mid_thermal.c   |   2 +-
 drivers/power/power_supply_core.c  |   2 +-
 drivers/staging/omap-thermal/omap-thermal-common.c |   5 +-
 drivers/thermal/Kconfig|  26 +
 drivers/thermal/Makefile   |   5 +-
 drivers/thermal/cpu_cooling.c  | 449 ++
 drivers/thermal/exynos_thermal.c   | 997 +
 drivers/thermal/rcar_thermal.c | 260 ++
 drivers/thermal/spear_thermal.c|   2 +-
 drivers/thermal/thermal_sys.c  | 321 ---
 include/linux/cpu_cooling.h|  58 ++
 .../{exynos4_tmu.h => exynos_thermal.h}|  47 +-
 include/linux/thermal.h|  28 +-
 tools/power/acpi/acpidump.c|   1 -
 25 files changed, 2205 insertions(+), 736 deletions(-)
 create mode 100644 Documentation/thermal/cpu-cooling-api.txt
 rename Documentation/{hwmon/exynos4_tmu => thermal/exynos_thermal} (71%)
 delete mode 100644 drivers/hwmon/exynos4_tmu.c
 create mode 100644 drivers/thermal/cpu_cooling.c
 create mode 100644 drivers/thermal/exynos_thermal.c
 create mode 100644 

Re: [PATCH 00/16] f2fs: introduce flash-friendly file system

2012-10-10 Thread Namjae Jeon
2012/10/10 Jaegeuk Kim :

>>
>> I mean that every volume is placed inside any partition (MTD or GPT).
>> Every partition begins from any physical sector. So, as I can understand,
>> f2fs volume can begin from physical sector that is laid inside physical
>> erase block. Thereby, in such case of formatting the f2fs's operation
>> units will be unaligned in relation of physical erase blocks, from my
>> point of view. Maybe, I misunderstand something but it can lead to
>> additional FTL operations and performance degradation, from my point of
>> view.
>
> I think mkfs already calculates the offset to align that.
I think this answer is not what he wants.
If you don't use a partition table such as a DOS partition table or GPT, I
think that it is possible to align using mkfs.
But if we have to account for the partition table space in storage, I don't
understand how it could be aligned using mkfs.
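
To make the concern concrete, here is a small arithmetic sketch. It assumes the absolute byte offset of the partition start and the erase block size are both known; the helper names are hypothetical and nothing here is taken from the f2fs tools.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a volume starting at absolute byte offset `start` sits on an erase-block boundary. */
static bool start_is_aligned(uint64_t start, uint64_t erase_block)
{
	return erase_block != 0 && start % erase_block == 0;
}

/* Bytes to skip so that the first filesystem unit begins on the next boundary. */
static uint64_t align_padding(uint64_t start, uint64_t erase_block)
{
	uint64_t rem = erase_block ? start % erase_block : 0;

	return rem ? erase_block - rem : 0;
}
```

For example, a partition starting at sector 63 (32256 bytes) on a device with an assumed 4 MiB erase block is unaligned and would need 4162048 bytes of padding. The padding depends on the absolute start offset, so mkfs can only compensate if it learns where the partition begins on the underlying device.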

Thanks.
> Thanks,
>


Re: [RFC PATCH 02/06] input/rmi4: Core files

2012-10-10 Thread Joe Perches
On Thu, 2012-10-11 at 02:49 +, Christopher Heiny wrote:
> Joe Perches wrote:
[]
> > > + list_for_each_entry(entry, &data->rmi_functions.list, list)
> > > + if (entry->irq_mask)
> > > + process_one_interrupt(entry, irq_status,
> > > +   data);
> > 
> > style nit, it'd be nicer with braces.
> 
> I agree with you, but checkpatch.pl doesn't. :-(

Sure it does.

$ cat t.c
{
	list_for_each_entry(entry, &data->rmi_functions.list, list) {
		if (entry->irq_mask)
			process_one_interrupt(entry, irq_status, data);
	}
}
$ ./scripts/checkpatch.pl --strict -f t.c
total: 0 errors, 0 warnings, 0 checks, 7 lines checked

t.c has no obvious style problems and is ready for submission.




linux-next: Tree for Oct 11

2012-10-10 Thread Stephen Rothwell
Hi all,

Do not add stuff destined for v3.8 to your linux-next included branches
until after v3.7-rc1 is released.

Changes since 20121010:

Conflicts are migrating as trees are merged by Linus.

Linus' tree still had its build failure for which I reverted a commit (in
my fixes tree).

The kvm-ppc tree gained conflicts against the powerpc-merge tree.



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc,
sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 204 trees (counting Linus' and 26 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (2474542 Merge tag 'for-3.7-rc1' of git://gitorious.org/linux-pwm/linux-pwm)
Merging fixes/master (c058c71 Revert "memory-hotplug: suppress "Trying to free nonexistent resource " warning")
Merging kbuild-current/rc-fixes (b1e0d8b kbuild: Fix gcc -x syntax)
Merging arm-current/fixes (846a136 ARM: vfp: fix saving d16-d31 vfp registers on v6+ kernels)
Merging m68k-current/for-linus (f82735d m68k: Use PTR_RET rather than if(IS_ERR(...)) + PTR_ERR)
Merging powerpc-merge/merge (fd3bc66 Merge tag 'disintegrate-powerpc-20121009' into merge)
Merging sparc/master (2474542 Merge tag 'for-3.7-rc1' of git://gitorious.org/linux-pwm/linux-pwm)
Merging net/master (8545768 Merge tag 'master-2012-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless)
Merging sound-current/for-linus (5d037f9 Merge tag 'asoc-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus)
Merging pci-current/for-linus (0ff9514 PCI: Don't print anything while decoding is disabled)
Merging wireless/master (c3e7724 mac80211: use ieee80211_free_txskb to fix possible skb leaks)
Merging driver-core.current/driver-core-linus (5698bd7 Linux 3.6-rc6)
Merging tty.current/tty-linus (b70936d tty: serial: sccnxp: Fix bug with unterminated platform_id list)
Merging usb.current/usb-linus (ecefbd9 Merge tag 'kvm-3.7-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm)
Merging staging.current/staging-linus (5698bd7 Linux 3.6-rc6)
Merging char-misc.current/char-misc-linus (fea7a08 Linux 3.6-rc3)
Merging input-current/for-linus (dde3ada Merge branch 'next' into for-linus)
Merging md-current/for-linus (80b4812 md/raid10: fix "enough" function for detecting if array is failed.)
Merging audit-current/for-linus (c158a35 audit: no leading space in audit_log_d_path prefix)
Merging crypto-current/master (c9f97a2 crypto: x86/glue_helper - fix storing of new IV in CBC encryption)
Merging ide/master (9974e43 ide: fix generic_ide_suspend/resume Oops)
Merging dwmw2/master (244dc4e Merge git://git.infradead.org/users/dwmw2/random-2.6)
Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to inline functions)
Merging irqdomain-current/irqdomain/merge (15e06bf irqdomain: Fix debugfs formatting)
Merging devicetree-current/devicetree/merge (4e8383b of: release node fix for of_parse_phandle_with_args)
Merging spi-current/spi/merge (d1c185b of/spi: Fix SPI module loading by using proper "spi:" modalias prefixes.)
Merging gpio-current/gpio/merge (96b7064 gpio/tca6424: merge I2C transactions, remove cast)
Merging asm-generic/master (c37d615 Merge branch 'disintegrate-asm-generic' of 

Re: [PATCH v5] create sun sysfs file

2012-10-10 Thread Len Brown
v5 applied (with typo fixed).
In the future, it would be better if your patches apply to the latest
upstream kernel.  (_STR made changes in same place as _SUN)

thanks,
Len Brown, Intel Open Source Technology Center

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/5] aio: Refactor aio_read_evt, use cmxchg(), fix bug

2012-10-10 Thread Kent Overstreet
On Wed, Oct 10, 2012 at 02:43:15PM -0700, Zach Brown wrote:
> > True. But that could be solved with a separate interface that either
> > doesn't use a context to submit a call synchronously, or uses an
> > implicit per thread context.
> 
> Sure, but why bother if we can make the one submission interface fast
> enough to satisfy quick callers?  Less is more, and all that.

Very true, if it's possible. I'm just still skeptical.

> > I don't have a _strong_ opinion there, but my intuition is that we
> > shouldn't be creating new types of handles without a good reason. I
> > don't think the annoyances are for the most part particular to file
> > descriptors, I think they tend to be applicable to handles in general and
> > at least with file descriptors they're known and solved.
> 
> I strongly disagree.  That descriptors are an expensive limited
> resources is a perfectly good reason to not make them required to access
> the ring.

What's so special about aio vs. epoll, and now signalfd/eventfd/timerfd
etc.? 

> > That would be awesome, though for it to be worthwhile there couldn't be
> > any kernel notion of a context at all and I'm not sure if that's
> > practical. But the idea hadn't occurred to me before and I'm sure you've
> > thought about it more than I have... hrm.
> > 
> > Oh hey, that's what acall does :P
> 
> :)
> 
> > For completions though you really want the ringbuffer pinned... what do
> > you do about that?
> 
> I don't think the kernel has to mandate that, no.  The code has to deal
> with completions faulting, but they probably won't.  In acall it
> happened that completions always came from threads that could block so
> its coping mechanism was to just use put_user() :).

Yeah, but that means the completion has to be delivered from process
context. That's not what aio does today, and it'd be a real performance
regression.

I don't know of a way around that myself.

> If userspace wants them rings locked, they can mlock() the memory.
> 
> Think about it from another angle: the current mechanism of creating an
> aio ring is a way to allocate pinned memory outside of the usual mlock
> accounting.  This could be abused, so aio grew an additional tunable to
> limit the number of total entries in rings in the system.
> 
> By putting the ring in normal user memory we avoid that problem
> entirely.

No different from any other place the kernel allocates memory on behalf
of userspace... it needs a general solution, not a bunch of special case
solutions (though since the general solution is memcg you might argue
the cure is worse than the disease... :P)


Re: linux-next: manual merge of the kvm-ppc tree with the powerpc-merge tree

2012-10-10 Thread Stephen Rothwell
On Thu, 11 Oct 2012 01:47:13 + Tabi Timur-B04825  
wrote:
>
> On Wed, Oct 10, 2012 at 8:18 PM, Stephen Rothwell  
> wrote:
> 
> > >  arch/powerpc/include/asm/epapr_hcalls.h      |  511 --
> > >  arch/powerpc/include/uapi/asm/Kbuild         |    1 +
> > >  arch/powerpc/include/uapi/asm/epapr_hcalls.h |  511 ++
> 
> What is include/uapi?  epapr_hcalls.h is not a user-space header file.
>  I don't remember seeing the original patch which moved it, so I don't
> know where this comes from.

Commit 549d62d889b4 ("KVM: PPC: use definitions in epapr header
for hcalls") from the kvm-ppc tree added an include of asm/epapr_hcalls.h
to the user visible part of asm/kvm_para.h, so asm/epapr_hcalls.h became a
user visible header file.

The UAPI changes are moving the user visible parts of exported header files
into a separate include directory called uapi.

-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/



RE: [RFC PATCH 02/06] input/rmi4: Core files

2012-10-10 Thread Christopher Heiny
Joe Perches wrote:
> On Fri, 2012-10-05 at 21:09 -0700, Christopher Heiny wrote:
> []
> 
> Just some trivial comments:

Thanks - see below for responses.

> > diff --git a/drivers/input/rmi4/rmi_driver.c
> > b/drivers/input/rmi4/rmi_driver.c
> []
> 
> > @@ -0,0 +1,1529 @@
> 
> []
> 
> > +static ssize_t delay_write(struct file *filp, const char __user *buffer,
> > +size_t size, loff_t *offset) {
> > + struct driver_debugfs_data *data = filp->private_data;
> > + struct rmi_device_platform_data *pdata =
> > + data->rmi_dev->phys->dev->platform_data;
> > + int retval;
> > + char local_buf[size];
> > + unsigned int new_read_delay;
> > + unsigned int new_write_delay;
> > + unsigned int new_block_delay;
> > + unsigned int new_pre_delay;
> > + unsigned int new_post_delay;
> > +
> > + retval = copy_from_user(local_buf, buffer, size);
> > + if (retval)
> > + return -EFAULT;
> > +
> > + retval = sscanf(local_buf, "%u %u %u %u %u", &new_read_delay,
> > + &new_write_delay, &new_block_delay,
> > + &new_pre_delay, &new_post_delay);
> > + if (retval != 5) {
> > + dev_err(&data->rmi_dev->dev,
> > + "Incorrect number of values provided for delay.");
> > + return -EINVAL;
> > + }
> > + if (new_read_delay < 0) {
> 
> These are unnecessary tests as unsigned values are never < 0.

Right.  Thought we'd taken care of most of the silliness like that, but 
obviously missed some.  We'll recheck the codebase.

[snip]


> > +static ssize_t phys_read(struct file *filp, char __user *buffer,
> > + size_t size, loff_t *offset) {
> > + struct driver_debugfs_data *data = filp->private_data;
> > + struct rmi_phys_info *info = &data->rmi_dev->phys->info;
> > + int retval;
> > + char local_buf[size];
> 
> size comes from where?  possible stack overflow?

H.  Good point.  We'll look at this.
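
The stack concern can be illustrated outside the kernel. Both debugfs handlers quoted above declare `char local_buf[size]`, where `size` comes straight from the caller, so the stack usage is unbounded. The following is only a userspace sketch of one possible remedy, a fixed-size buffer with an explicit length check; the cap `MAX_DELAY_INPUT` and the helper `parse_delays()` are hypothetical names, not part of the RMI4 driver.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cap on accepted input; a real driver may choose differently. */
#define MAX_DELAY_INPUT 64

/*
 * Parse five unsigned delay values from a caller-supplied buffer that is
 * not necessarily NUL-terminated. Returns 0 on success, -1 on bad input.
 */
static int parse_delays(const char *buf, size_t size, unsigned int out[5])
{
	char local_buf[MAX_DELAY_INPUT];	/* fixed size, never a VLA */

	if (size == 0 || size >= sizeof(local_buf))
		return -1;			/* reject instead of overflowing */

	memcpy(local_buf, buf, size);
	local_buf[size] = '\0';			/* sscanf() needs a terminator */

	if (sscanf(local_buf, "%u %u %u %u %u",
		   &out[0], &out[1], &out[2], &out[3], &out[4]) != 5)
		return -1;

	/* No "< 0" checks here: unsigned values cannot be negative. */
	return 0;
}
```

In kernel code the copy would be copy_from_user() rather than memcpy(), but the bounding logic would look much the same.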

[snip]

> []
> 
> > + list_for_each_entry(entry, &data->rmi_functions.list, list)
> > + if (entry->irq_mask)
> > + process_one_interrupt(entry, irq_status,
> > +   data);
> 
> style nit, it'd be nicer with braces.

I agree with you, but checkpatch.pl doesn't. :-(

> 
> > diff --git a/drivers/input/rmi4/rmi_driver.h
> > b/drivers/input/rmi4/rmi_driver.h
> []
> 
> > @@ -0,0 +1,438 @@
> > 
> > +
> > +#define tricat(x, y, z) tricat_(x, y, z)
> > +
> > +#define tricat_(x, y, z) x##y##z
> 
> I think these tricat macros are merely obfuscating
> and don't need to be used.

tricat is used internally by another collection of macros that helps generate 
sysfs files.  In particular, it's used to generate the RMI4 register name 
symbols.
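
For readers unfamiliar with the two-level idiom behind tricat, here is a small standalone sketch. The extra macro layer forces arguments that are themselves macros to expand before `##` pastes them. The names `REG_PREFIX`, `REG_SUFFIX` and the generated symbols are made up for illustration and are not taken from the RMI4 code.

```c
#define tricat_(x, y, z) x##y##z
/* The extra level lets macro arguments expand before they are pasted. */
#define tricat(x, y, z) tricat_(x, y, z)

/* Hypothetical name fragments, used only in this sketch. */
#define REG_PREFIX rmi_
#define REG_SUFFIX _reg

/* Expands to: static const int rmi_ctrl_reg = 42; */
static const int tricat(REG_PREFIX, ctrl, REG_SUFFIX) = 42;

/* Direct use pastes the macro names themselves: REG_PREFIXctrlREG_SUFFIX. */
static const int tricat_(REG_PREFIX, ctrl, REG_SUFFIX) = 7;

static int read_expanded(void)
{
	return rmi_ctrl_reg;
}

static int read_unexpanded(void)
{
	return REG_PREFIXctrlREG_SUFFIX;
}
```

Whether that indirection is worth the readability cost is exactly the point under discussion in this thread.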


Re: Scheduler queues for less os-jitter?

2012-10-10 Thread Mike Galbraith
On Wed, 2012-10-10 at 20:13 +0200, Uwaysi Bin Kareem wrote: 
> I was just wondering, have you considered this?
> 
> If daemons are contributing to os-jitter, wouldn`t having them all on  
> their own queue reduce jitter? So people could have the stuff like in  
> Ubuntu they want, without affecting jitter, or needing stuff like Tiny  
> Core, for tiny jitter?
> 
> So you get (simplified) something like mainapp - process1 in queue 2,  
> mainapp - process2 in queue 2, mainapp - process 3 in queue 2, etc.
> 
> Or is that already batch maybe, lol.

You could try SCHED_AUTOGROUP, or create whatever task groups manually,
or use systemd to do that for you.  Like everything else having anything
to do with scheduling, all are double edged swords, so may help, may
hurt.

-Mike



Re: [PATCH 2/5] aio: kiocb_cancel()

2012-10-10 Thread Kent Overstreet
On Wed, Oct 10, 2012 at 07:03:56AM -0400, Theodore Ts'o wrote:
> On Tue, Oct 09, 2012 at 02:37:00PM -0700, Kent Overstreet wrote:
> > > Honestly: I wouldn't bother.  Nothing of consequence uses cancel.
> > > 
> > > I have an RFC patch series that tears it out.  Let me polish that up and
> > > send it out, I'll cc: you.
> > 
> > Even better :)
> > 
> > I've been looking at aio locking the past few days, and I was getting
> > ready to write something up about cancellation to the lists.
> 
> I can definitely think of use cases (inside of Google) where we could
> really use aio_cancel.  The issue is it's hard to use it safely (i.e., we
> need to know whether the file system has already extended the file
> before we can know whether or not we can safely cancel the I/O).
> 
> > Short version, supporting cancellation without global synchronization is
> > possible but it'd require help from the allocator.
> 
> Well, we would need some kind of flag to indicate whether cancellation
> is possible, yes.  But if we're doing AIO to a raw disk, or we're
> talking about a read request, we wouldn't need any help from the
> file system's block allocator.
> 
> And maybe the current way of doing things isn't the best way.  But it
> would be nice if we didn't completely give up on the functionality of
> aio_cancel.

I thought about this more, and I think ripping out cancellation is a bad
idea in the long term too.

With what aio is capable of now (O_DIRECT), it's certainly arguable that
cancellation doesn't give you much and isn't worth the hassle of
implementing.

But, if we ever fix aio (and I certainly hope we do) and implement
something capable of doing arbitrary IO asynchronously, then we really
are going to need cancellation, because of io to sockets/pipes that can
be blocked for an unbounded amount of time.

Even if it turns out programs don't ever need cancellation in practice,
the real kicker is checkpoint/restore - for checkpoint/restore we've got
to enumerate all the outstanding operations or wait for them to complete
(for the ones that do complete in bounded time), and that's the hard
part of cancellation right there.

Zach might object here and point out that if we implement asynchronous
operations with kernel threads, we can implement cancelling by just
killing the thread. But this is only relevant if we _only_ implement
asynchronous operations with threads in aio v2, and I personally think
that's a lousy idea.

The reason is there's a lot of things that need to be fixed with aio in
aio v2, so aio v2 really needs to be a complete replacement for existing
aio users. It won't be if we start using kernel threads where we weren't
before - kernel threads are cheap but they aren't free, we'll regress on
performance and that means the new interfaces probably won't get used at
all by the existing users.

This isn't a big deal - we can implement aio v2 as a generic async
syscall mechanism, and then the default implementation for any given
syscall will just be the thread pool stuff but use optimized async
implementations for cases where it's available - this'll let us make use
of much of the existing infrastructure.

But it does mean we need a cancellation mechanism that isn't tied to
threads - i.e. does basically what the existing aio cancellation does.

So, IMO we should bite the bullet and do cancellation right now.

Also, I don't think there's anything really terrible about the existing
cancellation mechanism (unlike retries, that code is... wtf), it's just
problematic for locking/performance. But that's fixable.

It is definitely possible to implement cancellation such that it doesn't
cost anything when it's not being used, but we do need some help from
the allocator.

Say we had a way of iterating over all the slabs in the kiocb cache:
if we pass a constructor to kmem_cache_create() we can guarantee that
all the refcounts are initialized to 0. Combine that with
SLAB_DESTROY_BY_RCU, and we can safely do a try_get_kiocb() when we're
iterating over all the kiocbs looking for the one we want.
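
The "try get" step can be sketched in userspace C11. This is only an illustration of the increment-unless-zero pattern, assuming a refcount of 0 means "free or being torn down"; `struct obj`, `try_get()` and `put()` are hypothetical names, and the sketch says nothing about how the slab iteration itself would work.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct obj {
	atomic_int refcount;	/* 0 means free or being torn down */
};

/* Take a reference only if the object is still live (refcount > 0). */
static bool try_get(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old > 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously taken. */
static void put(struct obj *o)
{
	atomic_fetch_sub(&o->refcount, 1);
}
```

Because a freed object always reads as refcount 0, a lookup that races with a free simply fails instead of resurrecting the object.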

The missing piece is a way to iterate over all those slabs, I haven't
been able to find any exposed interface for that.

If we could implement such a mechanism, that would be the way to go (and
I don't think it'd be useful for just aio, this sort of thing comes up
elsewhere. Anything that has to time out operations in particular). 

The other option would be to use a simpler dedicated allocator - I have
a simple fast percpu allocator I wrote awhile back for some driver code,
that allocates out of a fixed sized array (I needed to be able to
reference objects by a 16 bit id, i.e. index in the array). I could
easily change that to allocate pages (i.e. slabs) lazily... then we'd
have one of these allocator structs per kioctx so they'd get freed when
the kioctx goes away and we'd have less to look through when we want to
cancel something.

This'd probably be the shortest path, but - while it's no slower than
the existing slab allocators, it 

[ 14/84] xfrm_user: fix info leak in copy_to_user_auth()

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Mathias Krause 

[ Upstream commit 4c87308bdea31a7b4828a51f6156e6f721a1fcc9 ]

copy_to_user_auth() fails to initialize the remainder of alg_name and
therefore discloses up to 54 bytes of heap memory via netlink to
userland.

Use strncpy() instead of strcpy() to fill the trailing bytes of alg_name
with null bytes.

Signed-off-by: Mathias Krause 
Acked-by: Steffen Klassert 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/xfrm/xfrm_user.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -742,7 +742,7 @@ static int copy_to_user_auth(struct xfrm
return -EMSGSIZE;
 
algo = nla_data(nla);
-   strcpy(algo->alg_name, auth->alg_name);
+   strncpy(algo->alg_name, auth->alg_name, sizeof(algo->alg_name));
memcpy(algo->alg_key, auth->alg_key, (auth->alg_key_len + 7) / 8);
algo->alg_key_len = auth->alg_key_len;
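
The difference between the two calls can be reproduced in userspace. This sketch assumes a 64-byte name field and uses a hypothetical `struct algo_msg`; it only demonstrates that strncpy() zero-fills the remainder of the destination while strcpy() leaves whatever was there before.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical message layout; the 64-byte size is an assumption. */
struct algo_msg {
	char alg_name[64];
};

/* Fill the fixed-size name field; strncpy() zero-pads the remainder. */
static void fill_name(struct algo_msg *msg, const char *name)
{
	/* Note: no terminator is written if name fills the whole field. */
	strncpy(msg->alg_name, name, sizeof(msg->alg_name));
}

/* Count bytes after the name that still hold stale (non-zero) data. */
static size_t stale_bytes(const struct algo_msg *msg)
{
	size_t i = 0, stale = 0;

	while (i < sizeof(msg->alg_name) && msg->alg_name[i] != '\0')
		i++;				/* skip the name itself */
	for (; i < sizeof(msg->alg_name); i++)
		if (msg->alg_name[i] != '\0')
			stale++;
	return stale;
}
```

Filling the buffer with a marker byte before each call makes the leak visible: after fill_name() no marker bytes remain, after a plain strcpy() most of them do.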
 




[ 07/84] kernel/sys.c: call disable_nonboot_cpus() in kernel_restart()

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Shawn Guo 

commit f96972f2dc6365421cf2366ebd61ee4cf060c8d5 upstream.

As kernel_power_off() calls disable_nonboot_cpus(), we may also want to
have kernel_restart() call disable_nonboot_cpus().  Doing so can help
machines that require boot cpu be the last alive cpu during reboot to
survive with kernel restart.

This fixes one reboot issue seen on imx6q (Cortex-A9 Quad).  The machine
requires that the restart routine be run on the primary cpu rather than
secondary ones.  Otherwise, the secondary core running the restart
routine will fail to come online after reboot.

Signed-off-by: Shawn Guo 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/sys.c |1 +
 1 file changed, 1 insertion(+)

--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -334,6 +334,7 @@ void kernel_restart_prepare(char *cmd)
 void kernel_restart(char *cmd)
 {
kernel_restart_prepare(cmd);
+   disable_nonboot_cpus();
if (!cmd)
printk(KERN_EMERG "Restarting system.\n");
else




[ 09/84] workqueue: add missing smp_wmb() in process_one_work()

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Tejun Heo 

commit 959d1af8cffc8fd38ed53e8be1cf4ab8782f9c00 upstream.

WORK_STRUCT_PENDING is used to claim ownership of a work item and
process_one_work() releases it before starting execution.  When
someone else grabs PENDING, all pre-release updates to the work item
should be visible and all updates made by the new owner should happen
afterwards.

Grabbing PENDING uses test_and_set_bit() and thus has a full barrier;
however, clearing doesn't have a matching wmb.  Given the preceding
spin_unlock and use of clear_bit, I don't believe this can be a
problem on an actual machine and there hasn't been any related report
but it still is theoretically possible for clear_pending to permeate
upwards and happen before work->entry update.

Add an explicit smp_wmb() before work_clear_pending().

Signed-off-by: Tejun Heo 
Cc: Oleg Nesterov 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/workqueue.c |2 ++
 1 file changed, 2 insertions(+)

--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1868,7 +1868,9 @@ __acquires(&gcwq->lock)
 
spin_unlock_irq(&gcwq->lock);
 
+   smp_wmb();  /* paired with test_and_set_bit(PENDING) */
work_clear_pending(work);
+
lock_map_acquire_read(&cwq->wq->lockdep_map);
lock_map_acquire(&lockdep_map);
trace_workqueue_execute_start(work);
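
The ordering requirement can be approximated in portable C11. This is an analogy only: C11 has no exact equivalent of smp_wmb(), so a release fence (which is somewhat stronger) stands in for it, and `struct work_item`, `claim()` and `release()` are hypothetical names rather than workqueue APIs.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct work_item {
	int payload;		/* stands in for work->entry and friends */
	atomic_bool pending;	/* stands in for WORK_STRUCT_PENDING */
};

/* Claim ownership; the seq_cst exchange acts as a full barrier. */
static bool claim(struct work_item *w)
{
	return !atomic_exchange(&w->pending, true);
}

/* Release ownership: order earlier writes before the flag clear. */
static void release(struct work_item *w)
{
	/* Roughly where the patch places smp_wmb(). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&w->pending, false, memory_order_relaxed);
}
```

Without the fence, the relaxed store that clears the flag could in principle become visible before earlier updates to the item, which is the reordering the commit message describes.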




[ 11/84] xfrm_user: return error pointer instead of NULL

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Mathias Krause 

[ Upstream commit 864745d291b5ba80ea0bd0edcbe67273de368836 ]

When dump_one_state() returns an error, e.g. because of a too small
buffer to dump the whole xfrm state, xfrm_state_netlink() returns NULL
instead of an error pointer. But its callers expect an error pointer
and therefore continue to operate on a NULL skbuff.

This could lead to a privilege escalation (execution of user code in
kernel context) if the attacker has CAP_NET_ADMIN and is able to map
address 0.

Signed-off-by: Mathias Krause 
Acked-by: Steffen Klassert 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/xfrm/xfrm_user.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -862,6 +862,7 @@ static struct sk_buff *xfrm_state_netlin
 {
struct xfrm_dump_info info;
struct sk_buff *skb;
+   int err;
 
skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
if (!skb)
@@ -872,9 +873,10 @@ static struct sk_buff *xfrm_state_netlin
info.nlmsg_seq = seq;
info.nlmsg_flags = 0;
 
-   if (dump_one_state(x, 0, &info)) {
+   err = dump_one_state(x, 0, &info);
+   if (err) {
kfree_skb(skb);
-   return NULL;
+   return ERR_PTR(err);
}
 
return skb;
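
For context, the error-pointer convention that the callers rely on can be sketched in userspace. The three helpers below are simplified re-implementations written for this illustration (the kernel's own live in include/linux/err.h), and `get_buffer()` is a hypothetical stand-in for a function like xfrm_state_netlink().

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static char storage[64];

/* Return a valid buffer, or an encoded errno on failure (never NULL). */
static void *get_buffer(int err)
{
	if (err)
		return ERR_PTR(err);
	return storage;
}
```

The key property is that IS_ERR(NULL) is false, so a callee that returns NULL on failure slips past a caller that only checks IS_ERR().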




[ 26/84] 8021q: fix mac_len recomputation in vlan_untag()

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Antonio Quartulli 

[ Upstream commit 5316cf9a5197eb80b2800e1acadde287924ca975 ]

skb_reset_mac_len() relies on the value of the skb->network_header pointer,
therefore we must wait for such pointer to be recalculated before computing
the new mac_len value.

Signed-off-by: Antonio Quartulli 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/8021q/vlan_core.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -106,7 +106,6 @@ static struct sk_buff *vlan_reorder_head
return NULL;
memmove(skb->data - ETH_HLEN, skb->data - VLAN_ETH_HLEN, 2 * ETH_ALEN);
skb->mac_header += VLAN_HLEN;
-   skb_reset_mac_len(skb);
return skb;
 }
 
@@ -173,6 +172,8 @@ struct sk_buff *vlan_untag(struct sk_buf
 
skb_reset_network_header(skb);
skb_reset_transport_header(skb);
+   skb_reset_mac_len(skb);
+
return skb;
 
 err_free:
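
The ordering problem is easy to see in a simplified model that tracks header positions as plain offsets. `struct pkt` and the two untag helpers are hypothetical and only mimic the arithmetic; a real sk_buff is considerably more involved.

```c
#define VLAN_HLEN 4

/* Simplified stand-in for the relevant sk_buff fields. */
struct pkt {
	unsigned int mac_header;	/* offset of the link-layer header */
	unsigned int network_header;	/* offset of the network-layer header */
	unsigned int mac_len;
};

/* Mirrors the idea of skb_reset_mac_len(): derived from two offsets. */
static void reset_mac_len(struct pkt *p)
{
	p->mac_len = p->network_header - p->mac_header;
}

/* Old order: mac_len is computed while network_header is still stale. */
static void untag_old(struct pkt *p)
{
	p->mac_header += VLAN_HLEN;
	reset_mac_len(p);
	p->network_header += VLAN_HLEN;
}

/* Fixed order: both offsets are updated before mac_len is derived. */
static void untag_new(struct pkt *p)
{
	p->mac_header += VLAN_HLEN;
	p->network_header += VLAN_HLEN;
	reset_mac_len(p);
}
```

Starting from a 14-byte Ethernet header, the old order yields a mac_len of 10 while the fixed order yields 14.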




[ 04/84] ACPI: run _OSC after ACPI_FULL_INITIALIZATION

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Lin Ming 

commit fc54ab72959edbf229b65ac74b2f122d799ca002 upstream.

The _OSC method may exist in module level code,
so it must be called after ACPI_FULL_INITIALIZATION

On some new platforms with Zero-Power-Optical-Disk-Drive (ZPODD)
support, this fix is necessary to save power.

Signed-off-by: Lin Ming 
Tested-by: Aaron Lu 
Signed-off-by: Len Brown 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/acpi/bus.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -944,8 +944,6 @@ static int __init acpi_bus_init(void)
status = acpi_ec_ecdt_probe();
/* Ignore result. Not having an ECDT is not fatal. */
 
-   acpi_bus_osc_support();
-
status = acpi_initialize_objects(ACPI_FULL_INITIALIZATION);
if (ACPI_FAILURE(status)) {
printk(KERN_ERR PREFIX "Unable to initialize ACPI objects\n");
@@ -953,6 +951,12 @@ static int __init acpi_bus_init(void)
}
 
/*
+* _OSC method may exist in module level code,
+* so it must be run after ACPI_FULL_INITIALIZATION
+*/
+   acpi_bus_osc_support();
+
+   /*
 * _PDC control method may load dynamic SSDT tables,
 * and we need to install the table handler before that.
 */




[ 28/84] tcp: flush DMA queue before sk_wait_data if rcv_wnd is zero

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Michal Kubeček

[ Upstream commit 15c041759bfcd9ab0a4e43f1c16e2644977d0467 ]

If recv() syscall is called for a TCP socket so that
  - IOAT DMA is used
  - MSG_WAITALL flag is used
  - requested length is bigger than sk_rcvbuf
  - enough data has already arrived to bring rcv_wnd to zero
then when tcp_recvmsg() gets to calling sk_wait_data(), receive
window can be still zero while sk_async_wait_queue exhausts
enough space to keep it zero. As this queue isn't cleaned until
the tcp_service_net_dma() call, sk_wait_data() cannot receive
any data and blocks forever.

If zero receive window and non-empty sk_async_wait_queue is
detected before calling sk_wait_data(), process the queue first.

Signed-off-by: Michal Kubecek 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/ipv4/tcp.c |   10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1592,8 +1592,14 @@ int tcp_recvmsg(struct kiocb *iocb, stru
}
 
 #ifdef CONFIG_NET_DMA
-   if (tp->ucopy.dma_chan)
-   dma_async_memcpy_issue_pending(tp->ucopy.dma_chan);
+   if (tp->ucopy.dma_chan) {
+   if (tp->rcv_wnd == 0 &&
+   !skb_queue_empty(&sk->sk_async_wait_queue)) {
+   tcp_service_net_dma(sk, true);
+   tcp_cleanup_rbuf(sk, copied);
+   } else
+   dma_async_memcpy_issue_pending(tp->ucopy.dma_chan);
+   }
 #endif
if (copied >= target) {
/* Do not sleep, just process backlog. */




[ 83/84] mtd: omap2: fix omap_nand_remove segfault

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Andreas Bießmann 

commit 7d9b110269253b1d5858cfa57d68dfc7bf50dd77 upstream.

Do not kfree() the mtd_info; it is handled in the mtd subsystem and
already freed by nand_release(). Instead kfree() the struct
omap_nand_info allocated in omap_nand_probe which was not freed before.

This patch fixes the following error when unloading the omap2 module:

---8<---
~ $ rmmod omap2
[ cut here ]
kernel BUG at mm/slab.c:3126!
Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
Modules linked in: omap2(-)
CPU: 0Not tainted  (3.6.0-rc3-00230-g155e36d-dirty #3)
PC is at cache_free_debugcheck+0x2d4/0x36c
LR is at kfree+0xc8/0x2ac
pc : []lr : []psr: 200d0193
sp : c521fe08  ip : c0e8ef90  fp : c521fe5c
r10: bf0001fc  r9 : c521e000  r8 : c0d99c8c
r7 : c661ebc0  r6 : c065d5a4  r5 : c65c4060  r4 : c78005c0
r3 :   r2 : 1000  r1 : c65c4000  r0 : 0001
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 86694019  DAC: 0015
Process rmmod (pid: 549, stack limit = 0xc521e2f0)
Stack: (0xc521fe08 to 0xc522)
fe00:   c008a874 c00bf44c c515c6d0 200d0193 c65c4860 c515c240
fe20: c521fe3c c521fe30 c008a9c0 c008a854 c521fe5c c65c4860 c78005c0 bf0001fc
fe40: c780ff40 a00d0113 c521e000  c521fe84 c521fe60 c0112efc c01122d8
fe60: c65c4860 c0673778 c06737ac  00070013  c521fe9c c521fe88
fe80: bf0001fc c0112e40 c0673778 bf001ca8 c521feac c521fea0 c02ca11c bf0001ac
fea0: c521fec4 c521feb0 c02c82c4 c02ca100 c0673778 bf001ca8 c521fee4 c521fec8
fec0: c02c8dd8 c02c8250  bf001ca8 bf001ca8 c0804ee0 c521ff04 c521fee8
fee0: c02c804c c02c8d20 bf001924  bf001ca8 c521e000 c521ff1c c521ff08
ff00: c02c950c c02c7fbc bf001d48  c521ff2c c521ff20 c02ca3a4 c02c94b8
ff20: c521ff3c c521ff30 bf001938 c02ca394 c521ffa4 c521ff40 c009beb4 bf001930
ff40: c521ff6c 70616d6f b6fe0032 c0014f84 70616d6f b6fe0032 0081 60070010
ff60: c521ff84 c521ff70 c008e1f4 c00bf328 0001a004 70616d6f c521ff94 0021ff88
ff80: c008e368 0001a004 70616d6f b6fe0032 0081 c0015028  c521ffa8
ffa0: c0014dc0 c009bcd0 0001a004 70616d6f bec2ab38 0880 bec2ab38 0880
ffc0: 0001a004 70616d6f b6fe0032 0081 0319  b6fe1000 
ffe0: bec2ab30 bec2ab20 00019f00 b6f539c0 60070010 bec2ab38  
Backtrace:
[] (cache_free_debugcheck+0x0/0x36c) from [] 
(kfree+0xc8/0x2ac)
[] (kfree+0x0/0x2ac) from [] (omap_nand_remove+0x5c/0x64 
[omap2])
[] (omap_nand_remove+0x0/0x64 [omap2]) from [] 
(platform_drv_remove+0x28/0x2c)
 r5:bf001ca8 r4:c0673778
[] (platform_drv_remove+0x0/0x2c) from [] 
(__device_release_driver+0x80/0xdc)
[] (__device_release_driver+0x0/0xdc) from [] 
(driver_detach+0xc4/0xc8)
 r5:bf001ca8 r4:c0673778
[] (driver_detach+0x0/0xc8) from [] 
(bus_remove_driver+0x9c/0x104)
 r6:c0804ee0 r5:bf001ca8 r4:bf001ca8 r3:
[] (bus_remove_driver+0x0/0x104) from [] 
(driver_unregister+0x60/0x80)
 r6:c521e000 r5:bf001ca8 r4: r3:bf001924
[] (driver_unregister+0x0/0x80) from [] 
(platform_driver_unregister+0x1c/0x20)
 r5: r4:bf001d48
[] (platform_driver_unregister+0x0/0x20) from [] 
(omap_nand_driver_exit+0x14/0x1c [omap2])
[] (omap_nand_driver_exit+0x0/0x1c [omap2]) from [] 
(sys_delete_module+0x1f0/0x2ec)
[] (sys_delete_module+0x0/0x2ec) from [] 
(ret_fast_syscall+0x0/0x48)
 r8:c0015028 r7:0081 r6:b6fe0032 r5:70616d6f r4:0001a004
Code: e1a5 eb0d9172 e7f001f2 e7f001f2 (e7f001f2)
---[ end trace 6a30b24d8c0cc2ee ]---
Segmentation fault
--->8---

This error was introduced in 67ce04bf2746f8a1f8c2a104b313d20c63f68378 which
was the first commit of this driver.

Signed-off-by: Andreas Bießmann 
Signed-off-by: Artem Bityutskiy 
Signed-off-by: David Woodhouse 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/mtd/nand/omap2.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/mtd/nand/omap2.c
+++ b/drivers/mtd/nand/omap2.c
@@ -1139,7 +1139,7 @@ static int omap_nand_remove(struct platf
/* Release NAND device, its internal structures and partitions */
nand_release(&info->mtd);
iounmap(info->nand.IO_ADDR_R);
-   kfree(&info->mtd);
+   kfree(info);
return 0;
 }
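
Why freeing the embedded member is wrong can be shown with a small userspace model. The struct layout here is hypothetical (the only assumption that matters is that the embedded member is not the first field), and `info_alloc()` and `info_free_from_mtd()` are illustrative names, not driver functions.

```c
#include <stddef.h>
#include <stdlib.h>

struct mtd {
	int size;
};

/* Hypothetical layout: the embedded member is not the first field. */
struct nand_info {
	void *pdev;
	struct mtd mtd;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct nand_info *info_alloc(void)
{
	return calloc(1, sizeof(struct nand_info));
}

/* Free the allocation that owns the embedded mtd, never &info->mtd itself. */
static void info_free_from_mtd(struct mtd *m)
{
	free(container_of(m, struct nand_info, mtd));
}
```

The address of the member lies inside the allocation rather than at its start, so passing it to the allocator's free routine hands back a pointer that was never allocated.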
 




Re: [GIT PULL] extcon fixes for Linux 3.6

2012-10-10 Thread Greg KH
On Thu, Oct 11, 2012 at 11:02:47AM +0900, Chanwoo Choi wrote:
> On 10/11/2012 10:40 AM, Greg KH wrote:
> > On Thu, Oct 11, 2012 at 10:31:19AM +0900, Chanwoo Choi wrote:
> >> Hi Greg,
> >>
> >> Please pull extcon fixes for Linux 3.6 from:
> >>
> >> git://git.infradead.org/users/kmpark/linux-samsung extcon-for-next
> > 
> > Linux 3.6 has been released already, I can't take any patches for it. :)
> > 
> > Are these for 3.7 instead?
> 
> Sorry, The extcon patches are for 3.7.
> 
> > And is your gpg key in the public servers so that I "know" this is
> > really the set of patches you want me to pull, and that you are "you"?
> > 
> > If you are at LinuxCon Korea the next two days, I'll be glad to sign
> > your key to solve this issue...
> > 
> 
> OK, Myungjoo Ham and I will attend LinuxCon Korea tomorrow
> and will talk to you about the gpg key issue.

Ok, I'll hold off on pulling this until after I talk this over with you
in person.

> ps. I will create gpg key and register in public server today.

That will be great.  Then you can get a kernel.org git tree as well.

thanks,

greg k-h


[ 71/84] r8169: 8168c and later require bit 0x20 to be set in Config2 for PME signaling.

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Francois Romieu 

commit d387b427c973974dd619a33549c070ac5d0e089f upstream.

The new 84xx stopped flying below the radars.

Signed-off-by: Francois Romieu 
Cc: Hayes Wang 
Reviewed-by: Jonathan Nieder 
Acked-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/r8169.c |6 ++
 1 file changed, 6 insertions(+)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -299,6 +299,8 @@ enum rtl_registers {
Config0 = 0x51,
Config1 = 0x52,
Config2 = 0x53,
+#define PME_SIGNAL (1 << 5)/* 8168c and later */
+
Config3 = 0x54,
Config4 = 0x55,
Config5 = 0x56,
@@ -1249,6 +1251,10 @@ static void __rtl8169_set_wol(struct rtl
RTL_W8(Config1, options);
break;
default:
+   options = RTL_R8(Config2) & ~PME_SIGNAL;
+   if (wolopts)
+   options |= PME_SIGNAL;
+   RTL_W8(Config2, options);
break;
}
 




[ 72/84] r8169: fix unsigned int wraparound with TSO

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Julien Ducourthial 

commit 477206a018f902895bfcd069dd820bfe94c187b1 upstream.

The r8169 may get stuck or show bad behaviour after activating TSO :
the net_device is not stopped when it has no more TX descriptors.
This problem comes from TX_BUFS_AVAIL which may reach -1 when all
transmit descriptors are in use. The patch simply tries to keep positive
values.
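
The arithmetic can be reproduced in userspace. The following is a minimal
sketch, not the driver code: it assumes a ring of 64 descriptors and
free-running 32-bit counters, and the struct and function names are
illustrative stand-ins for the macros touched by this patch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_TX_DESC 64u /* assumed ring size for this sketch */

/* Hypothetical stand-in for the two counters the macros read. */
struct tx_ring {
	uint32_t dirty_tx; /* next descriptor to be reclaimed */
	uint32_t cur_tx;   /* next descriptor to be filled */
};

/* Old macro: wraps to 0xffffffff when every descriptor is in use. */
uint32_t tx_buffs_avail_old(const struct tx_ring *r)
{
	return r->dirty_tx + NUM_TX_DESC - r->cur_tx - 1;
}

/* New macro: number of free slots, from 0 to NUM_TX_DESC. */
uint32_t tx_slots_avail(const struct tx_ring *r)
{
	return r->dirty_tx + NUM_TX_DESC - r->cur_tx;
}

/* An skb with nr_frags fragments needs nr_frags + 1 slots. */
int tx_frags_ready_for(const struct tx_ring *r, uint32_t nr_frags)
{
	return tx_slots_avail(r) >= nr_frags + 1;
}

/* Full ring: the old test sees a huge value, the new one sees 0 slots. */
void tx_ring_self_check(void)
{
	struct tx_ring full = { .dirty_tx = 100, .cur_tx = 100 + NUM_TX_DESC };

	assert(tx_buffs_avail_old(&full) == UINT32_MAX);
	assert(!(tx_buffs_avail_old(&full) < 3)); /* old "ring full" check misses it */
	assert(!tx_frags_ready_for(&full, 3));    /* new check reports no room */
}
```

With the old expression a full ring looks like a ring with about four
billion free entries, so the queue is never stopped; the new expression
never goes below zero.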

Tested with 8111d(onboard) on a D510MO, and with 8111e(onboard) on a
Zotac 890GXITX.

Signed-off-by: Julien Ducourthial 
Acked-by: Francois Romieu 
Signed-off-by: David S. Miller 
Reviewed-by: Jonathan Nieder 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/r8169.c |   16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -58,8 +58,12 @@
 #define R8169_MSG_DEFAULT \
(NETIF_MSG_DRV | NETIF_MSG_PROBE | NETIF_MSG_IFUP | NETIF_MSG_IFDOWN)
 
-#define TX_BUFFS_AVAIL(tp) \
-   (tp->dirty_tx + NUM_TX_DESC - tp->cur_tx - 1)
+#define TX_SLOTS_AVAIL(tp) \
+   (tp->dirty_tx + NUM_TX_DESC - tp->cur_tx)
+
+/* A skbuff with nr_frags needs nr_frags+1 entries in the tx queue */
+#define TX_FRAGS_READY_FOR(tp,nr_frags) \
+   (TX_SLOTS_AVAIL(tp) >= (nr_frags + 1))
 
 /* Maximum number of multicast addresses to filter (vs. Rx-all-multicast).
The RTL chips use a 64 element hash table based on the Ethernet CRC. */
@@ -4924,7 +4928,7 @@ static netdev_tx_t rtl8169_start_xmit(st
u32 opts[2];
int frags;
 
-   if (unlikely(TX_BUFFS_AVAIL(tp) < skb_shinfo(skb)->nr_frags)) {
+   if (unlikely(!TX_FRAGS_READY_FOR(tp, skb_shinfo(skb)->nr_frags))) {
netif_err(tp, drv, dev, "BUG! Tx Ring full when queue awake!\n");
goto err_stop_0;
}
@@ -4972,10 +4976,10 @@ static netdev_tx_t rtl8169_start_xmit(st
 
RTL_W8(TxPoll, NPQ);
 
-   if (TX_BUFFS_AVAIL(tp) < MAX_SKB_FRAGS) {
+   if (!TX_FRAGS_READY_FOR(tp, MAX_SKB_FRAGS)) {
netif_stop_queue(dev);
smp_mb();
-   if (TX_BUFFS_AVAIL(tp) >= MAX_SKB_FRAGS)
+   if (TX_FRAGS_READY_FOR(tp, MAX_SKB_FRAGS))
netif_wake_queue(dev);
}
 
@@ -5077,7 +5081,7 @@ static void rtl8169_tx_interrupt(struct
tp->dirty_tx = dirty_tx;
smp_mb();
if (netif_queue_stopped(dev) &&
-   (TX_BUFFS_AVAIL(tp) >= MAX_SKB_FRAGS)) {
+   TX_FRAGS_READY_FOR(tp, MAX_SKB_FRAGS)) {
netif_wake_queue(dev);
}
/*




[ 74/84] revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages"

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: KOSAKI Motohiro 

commit 8d34694c1abf29df1f3c7317936b7e3e2e308d9b upstream.

Commit 05f144a0d5c2 ("mm: mempolicy: Let vma_merge and vma_split handle
vma->vm_policy linkages") removed vma->vm_policy updates code but it is
the purpose of mbind_range().  Now, mbind_range() is virtually a no-op
and while it does not allow memory corruption it is not the right fix.
This patch is a revert.

[mgor...@suse.de: Edited changelog]
Signed-off-by: KOSAKI Motohiro 
Signed-off-by: Mel Gorman 
Cc: Christoph Lameter 
Cc: Josh Boyer 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 mm/mempolicy.c |   41 -
 1 file changed, 24 insertions(+), 17 deletions(-)

--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -606,6 +606,27 @@ check_range(struct mm_struct *mm, unsign
return first;
 }
 
+/* Apply policy to a single VMA */
+static int policy_vma(struct vm_area_struct *vma, struct mempolicy *new)
+{
+   int err = 0;
+   struct mempolicy *old = vma->vm_policy;
+
+   pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
+vma->vm_start, vma->vm_end, vma->vm_pgoff,
+vma->vm_ops, vma->vm_file,
+vma->vm_ops ? vma->vm_ops->set_policy : NULL);
+
+   if (vma->vm_ops && vma->vm_ops->set_policy)
+   err = vma->vm_ops->set_policy(vma, new);
+   if (!err) {
+   mpol_get(new);
+   vma->vm_policy = new;
+   mpol_put(old);
+   }
+   return err;
+}
+
 /* Step 2: apply policy to a range and do splits. */
 static int mbind_range(struct mm_struct *mm, unsigned long start,
   unsigned long end, struct mempolicy *new_pol)
@@ -645,23 +666,9 @@ static int mbind_range(struct mm_struct
if (err)
goto out;
}
-
-   /*
-* Apply policy to a single VMA. The reference counting of
-* policy for vma_policy linkages has already been handled by
-* vma_merge and split_vma as necessary. If this is a shared
-* policy then ->set_policy will increment the reference count
-* for an sp node.
-*/
-   pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
-   vma->vm_start, vma->vm_end, vma->vm_pgoff,
-   vma->vm_ops, vma->vm_file,
-   vma->vm_ops ? vma->vm_ops->set_policy : NULL);
-   if (vma->vm_ops && vma->vm_ops->set_policy) {
-   err = vma->vm_ops->set_policy(vma, new_pol);
-   if (err)
-   goto out;
-   }
+   err = policy_vma(vma, new_pol);
+   if (err)
+   goto out;
}
 
  out:




[ 76/84] mempolicy: fix a race in shared_policy_replace()

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Mel Gorman 

commit b22d127a39ddd10d93deee3d96e643657ad53a49 upstream.

shared_policy_replace() use of sp_alloc() is unsafe.  1) sp_node cannot
be dereferenced if sp->lock is not held and 2) another thread can modify
sp_node between spin_unlock for allocating a new sp node and next
spin_lock.  The bug was introduced before 2.6.12-rc2.

Kosaki's original patch for this problem was to allocate an sp node and
policy within shared_policy_replace and initialise it when the lock is
reacquired.  I was not keen on this approach because it partially
duplicates sp_alloc().  As the paths where sp->lock is taken are not that
performance critical this patch converts sp->lock to sp->mutex so it can
sleep when calling sp_alloc().

[kosaki.motoh...@jp.fujitsu.com: Original patch]
Signed-off-by: Mel Gorman 
Acked-by: KOSAKI Motohiro 
Reviewed-by: Christoph Lameter 
Cc: Josh Boyer 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 include/linux/mempolicy.h |2 +-
 mm/mempolicy.c|   37 -
 2 files changed, 17 insertions(+), 22 deletions(-)

--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -188,7 +188,7 @@ struct sp_node {
 
 struct shared_policy {
struct rb_root root;
-   spinlock_t lock;
+   struct mutex mutex;
 };
 
 void mpol_shared_policy_init(struct shared_policy *sp, struct mempolicy *mpol);
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2021,7 +2021,7 @@ int __mpol_equal(struct mempolicy *a, st
  */
 
 /* lookup first element intersecting start-end */
-/* Caller holds sp->lock */
+/* Caller holds sp->mutex */
 static struct sp_node *
 sp_lookup(struct shared_policy *sp, unsigned long start, unsigned long end)
 {
@@ -2085,13 +2085,13 @@ mpol_shared_policy_lookup(struct shared_
 
if (!sp->root.rb_node)
return NULL;
-   spin_lock(&sp->lock);
+   mutex_lock(&sp->mutex);
sn = sp_lookup(sp, idx, idx+1);
if (sn) {
mpol_get(sn->policy);
pol = sn->policy;
}
-   spin_unlock(&sp->lock);
+   mutex_unlock(&sp->mutex);
return pol;
 }
 
@@ -2131,10 +2131,10 @@ static struct sp_node *sp_alloc(unsigned
 static int shared_policy_replace(struct shared_policy *sp, unsigned long start,
 unsigned long end, struct sp_node *new)
 {
-   struct sp_node *n, *new2 = NULL;
+   struct sp_node *n;
+   int ret = 0;
 
-restart:
-   spin_lock(&sp->lock);
+   mutex_lock(&sp->mutex);
n = sp_lookup(sp, start, end);
/* Take care of old policies in the same range. */
while (n && n->start < end) {
@@ -2147,16 +2147,14 @@ restart:
} else {
/* Old policy spanning whole new range. */
if (n->end > end) {
+   struct sp_node *new2;
+   new2 = sp_alloc(end, n->end, n->policy);
if (!new2) {
-   spin_unlock(&sp->lock);
-   new2 = sp_alloc(end, n->end, n->policy);
-   if (!new2)
-   return -ENOMEM;
-   goto restart;
+   ret = -ENOMEM;
+   goto out;
}
n->end = start;
sp_insert(sp, new2);
-   new2 = NULL;
break;
} else
n->end = start;
@@ -2167,12 +2165,9 @@ restart:
}
if (new)
sp_insert(sp, new);
-   spin_unlock(&sp->lock);
-   if (new2) {
-   mpol_put(new2->policy);
-   kmem_cache_free(sn_cache, new2);
-   }
-   return 0;
+out:
+   mutex_unlock(&sp->mutex);
+   return ret;
 }
 
 /**
@@ -2190,7 +2185,7 @@ void mpol_shared_policy_init(struct shar
int ret;
 
sp->root = RB_ROOT; /* empty tree == default mempolicy */
-   spin_lock_init(&sp->lock);
+   mutex_init(&sp->mutex);
 
if (mpol) {
struct vm_area_struct pvma;
@@ -2256,7 +2251,7 @@ void mpol_free_shared_policy(struct shar
 
if (!p->root.rb_node)
return;
-   spin_lock(&p->lock);
+   mutex_lock(&p->mutex);
next = rb_first(&p->root);
while (next) {
n = rb_entry(next, struct sp_node, nd);
@@ -2265,7 +2260,7 @@ void mpol_free_shared_policy(struct shar
mpol_put(n->policy);
kmem_cache_free(sn_cache, n);
}
-   spin_unlock(&p->lock);
+   mutex_unlock(&p->mutex);
 }
 
 /* assumes fs == KERNEL_DS */



[ 68/84] r8169: missing barriers.

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Francois Romieu 

commit 1e874e041fc7c222cbd85b20c4406070be1f687a upstream.

Signed-off-by: Francois Romieu 
Cc: Hayes Wang 
Reviewed-by: Jonathan Nieder 
Acked-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/r8169.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4957,7 +4957,7 @@ static netdev_tx_t rtl8169_start_xmit(st
 
if (TX_BUFFS_AVAIL(tp) < MAX_SKB_FRAGS) {
netif_stop_queue(dev);
-   smp_rmb();
+   smp_mb();
if (TX_BUFFS_AVAIL(tp) >= MAX_SKB_FRAGS)
netif_wake_queue(dev);
}
@@ -5058,7 +5058,7 @@ static void rtl8169_tx_interrupt(struct
 
if (tp->dirty_tx != dirty_tx) {
tp->dirty_tx = dirty_tx;
-   smp_wmb();
+   smp_mb();
if (netif_queue_stopped(dev) &&
(TX_BUFFS_AVAIL(tp) >= MAX_SKB_FRAGS)) {
netif_wake_queue(dev);
@@ -5069,7 +5069,6 @@ static void rtl8169_tx_interrupt(struct
 * of start_xmit activity is detected (if it is not detected,
 * it is slow enough). -- FR
 */
-   smp_rmb();
if (tp->cur_tx != dirty_tx)
RTL_W8(TxPoll, NPQ);
}




[ 77/84] mempolicy: fix refcount leak in mpol_set_shared_policy()

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: KOSAKI Motohiro 

commit 63f74ca21f1fad36d075e063f06dcc6d39fe86b2 upstream.

When shared_policy_replace() fails to allocate, new->policy is not freed
correctly by mpol_set_shared_policy().  The problem is that the shared
mempolicy code directly calls kmem_cache_free() in multiple places where
it is easy to make a mistake.

This patch creates an sp_free wrapper function and uses it. The bug was
introduced pre-git age (IOW, before 2.6.12-rc2).
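
The idea of a single release helper can be sketched in userspace. This is
only an illustration under assumptions: struct policy and struct node are
hypothetical stand-ins for struct mempolicy and struct sp_node, and
malloc()/free() replace the slab cache.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for struct mempolicy and struct sp_node. */
struct policy {
	int refcnt;
};

struct node {
	struct policy *policy;
};

int policies_freed; /* counts policies whose last reference was dropped */

/* Drop one reference; free the policy when the count reaches zero. */
void policy_put(struct policy *p)
{
	if (--p->refcnt == 0) {
		free(p);
		policies_freed++;
	}
}

/* Allocate a node that owns one reference on its policy. */
struct node *node_alloc(void)
{
	struct node *n = malloc(sizeof(*n));
	struct policy *p = malloc(sizeof(*p));

	if (!n || !p) {
		free(n);
		free(p);
		return NULL;
	}
	p->refcnt = 1;
	n->policy = p;
	return n;
}

/* The one place that releases a node: drop the policy reference, then the node. */
void node_free(struct node *n)
{
	policy_put(n->policy);
	free(n);
}

/* Usage: an error path frees an unused node without leaking its policy. */
void node_free_self_check(void)
{
	struct node *n = node_alloc();

	assert(n != NULL);
	node_free(n);
	assert(policies_freed == 1);
}
```

Freeing only the node, as the old error path did, would leave the policy
reference held forever; routing every release through one helper removes
that class of mistake.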

[mgor...@suse.de: Edited changelog]
Signed-off-by: KOSAKI Motohiro 
Signed-off-by: Mel Gorman 
Reviewed-by: Christoph Lameter 
Cc: Josh Boyer 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 mm/mempolicy.c |   15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2095,12 +2095,17 @@ mpol_shared_policy_lookup(struct shared_
return pol;
 }
 
+static void sp_free(struct sp_node *n)
+{
+   mpol_put(n->policy);
+   kmem_cache_free(sn_cache, n);
+}
+
 static void sp_delete(struct shared_policy *sp, struct sp_node *n)
 {
pr_debug("deleting %lx-l%lx\n", n->start, n->end);
rb_erase(&n->nd, &sp->root);
-   mpol_put(n->policy);
-   kmem_cache_free(sn_cache, n);
+   sp_free(n);
 }
 
 static struct sp_node *sp_alloc(unsigned long start, unsigned long end,
@@ -2239,7 +2244,7 @@ int mpol_set_shared_policy(struct shared
}
err = shared_policy_replace(info, vma->vm_pgoff, vma->vm_pgoff+sz, new);
if (err && new)
-   kmem_cache_free(sn_cache, new);
+   sp_free(new);
return err;
 }
 
@@ -2256,9 +2261,7 @@ void mpol_free_shared_policy(struct shar
while (next) {
n = rb_entry(next, struct sp_node, nd);
next = rb_next(&n->nd);
-   rb_erase(&n->nd, &p->root);
-   mpol_put(n->policy);
-   kmem_cache_free(sn_cache, n);
+   sp_delete(p, n);
}
mutex_unlock(&p->mutex);
 }




[ 75/84] mempolicy: remove mempolicy sharing

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: KOSAKI Motohiro 

commit 869833f2c5c6e4dd09a5378cfc665ffb4615e5d2 upstream.

Dave Jones' system call fuzz testing tool "trinity" triggered the
following bug error with slab debugging enabled


=
BUG numa_policy (Not tainted): Poison overwritten

-

INFO: 0x880146498250-0x880146498250. First byte 0x6a instead of 0x6b
INFO: Allocated in mpol_new+0xa3/0x140 age=46310 cpu=6 pid=32154
 __slab_alloc+0x3d3/0x445
 kmem_cache_alloc+0x29d/0x2b0
 mpol_new+0xa3/0x140
 sys_mbind+0x142/0x620
 system_call_fastpath+0x16/0x1b

INFO: Freed in __mpol_put+0x27/0x30 age=46268 cpu=6 pid=32154
 __slab_free+0x2e/0x1de
 kmem_cache_free+0x25a/0x260
 __mpol_put+0x27/0x30
 remove_vma+0x68/0x90
 exit_mmap+0x118/0x140
 mmput+0x73/0x110
 exit_mm+0x108/0x130
 do_exit+0x162/0xb90
 do_group_exit+0x4f/0xc0
 sys_exit_group+0x17/0x20
 system_call_fastpath+0x16/0x1b

INFO: Slab 0xea0005192600 objects=27 used=27 fp=0x  (null) flags=0x204080
INFO: Object 0x880146498250 @offset=592 fp=0x88014649b9d0

The problem is that the structure is being prematurely freed due to a
reference count imbalance. In the following case mbind(addr, len) should
replace the memory policies of both vma1 and vma2, so they come to
share the same mempolicy, and the new mempolicy will have the
MPOL_F_SHARED flag.

  +-------------------+-------------------+
  |       vma1        |    vma2(shmem)    |
  +-------------------+-------------------+
  |                                       |
 addr                                 addr+len

alloc_pages_vma() uses get_vma_policy() and mpol_cond_put() pair for
maintaining the mempolicy reference count.  The current rule is that
get_vma_policy() only increments refcount for shmem VMA and
mpol_cond_put() only decrements refcount if the policy has
MPOL_F_SHARED.

In above case, vma1 is not shmem vma and vma->policy has MPOL_F_SHARED!
The reference count will be decreased even though it was not increased
whenever alloc_page_vma() is called.  This has been broken since commit
[52cd3b07: mempolicy: rework mempolicy Reference Counting] in 2008.
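
The imbalance can be modelled in a few lines of userspace C. This is a
sketch of the rule described above, not kernel code: the structs, the
POL_F_SHARED flag and the function names are hypothetical stand-ins for
get_vma_policy(), mpol_cond_put() and MPOL_F_SHARED.

```c
#include <assert.h>
#include <stdbool.h>

#define POL_F_SHARED 0x1u /* stand-in for MPOL_F_SHARED */

struct policy {
	int refcnt;
	unsigned int flags;
};

struct vma {
	bool is_shmem;
	struct policy *policy;
};

/* Models the rule described above: only shmem VMAs take a reference. */
struct policy *get_policy(struct vma *v)
{
	if (v->is_shmem)
		v->policy->refcnt++;
	return v->policy;
}

/* Models the conditional put: only policies flagged as shared are dropped. */
void cond_put(struct policy *p)
{
	if (p->flags & POL_F_SHARED)
		p->refcnt--;
}

/* One allocation on the VMA: get, use, conditional put. Returns the refcount. */
int alloc_on(struct vma *v)
{
	struct policy *p = get_policy(v);

	cond_put(p);
	return p->refcnt;
}

/* One shared policy referenced by a non-shmem vma1 and a shmem vma2. */
void refcount_self_check(void)
{
	struct policy shared = { .refcnt = 2, .flags = POL_F_SHARED };
	struct vma vma1 = { .is_shmem = false, .policy = &shared };
	struct vma vma2 = { .is_shmem = true,  .policy = &shared };

	assert(alloc_on(&vma2) == 2); /* shmem: get and put balance */
	assert(alloc_on(&vma1) == 1); /* non-shmem: put without get */
	assert(alloc_on(&vma1) == 0); /* would now be freed while still in use */
}
```

Each allocation through vma1 loses one reference, so the count reaches
zero while both VMAs still point at the policy.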

There is another serious bug with the sharing of memory policies.
Currently, mempolicy rebind logic (it is called from cpuset rebinding)
ignores the refcount of the mempolicy and overrides it forcibly.  Thus, any
mempolicy sharing may cause mempolicy corruption.  The bug was
introduced by commit [68860ec1: cpusets: automatic numa mempolicy
rebinding].

Ideally, the shared policy handling would be rewritten to either
properly handle COW of the policy structures or at least reference count
MPOL_F_SHARED based exclusively on information within the policy.
However, this patch takes the easier approach of disabling any policy
sharing between VMAs.  Each new range allocated with sp_alloc will
allocate a new policy, set the reference count to 1 and drop the
reference count of the old policy.  This increases the memory footprint
but is not expected to be a major problem as mbind() is unlikely to be
used for fine-grained ranges.  It is also inefficient because it means
we allocate a new policy even in cases where mbind_range() could use the
new_policy passed to it.  However, it is more straight-forward and the
change should be invisible to the user.

[mgor...@suse.de: Edited changelog]
Reported-by: Dave Jones 
Cc: Christoph Lameter 
Reviewed-by: Christoph Lameter 
Signed-off-by: KOSAKI Motohiro 
Signed-off-by: Mel Gorman 
Cc: Josh Boyer 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 mm/mempolicy.c |   52 ++--
 1 file changed, 38 insertions(+), 14 deletions(-)

--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -606,24 +606,39 @@ check_range(struct mm_struct *mm, unsign
return first;
 }
 
-/* Apply policy to a single VMA */
-static int policy_vma(struct vm_area_struct *vma, struct mempolicy *new)
+/*
+ * Apply policy to a single VMA
+ * This must be called with the mmap_sem held for writing.
+ */
+static int vma_replace_policy(struct vm_area_struct *vma,
+   struct mempolicy *pol)
 {
-   int err = 0;
-   struct mempolicy *old = vma->vm_policy;
+   int err;
+   struct mempolicy *old;
+   struct mempolicy *new;
 
pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
 vma->vm_start, vma->vm_end, vma->vm_pgoff,
 vma->vm_ops, vma->vm_file,
 vma->vm_ops ? vma->vm_ops->set_policy : NULL);
 
-   if (vma->vm_ops && vma->vm_ops->set_policy)
+   new = mpol_dup(pol);
+   if (IS_ERR(new))
+

[ 73/84] r8169: call netif_napi_del at errpaths and at driver unload

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Devendra Naga 

commit ad1be8d345416a794dea39761a374032aa471a76 upstream.

When register_netdev fails, the init'ed NAPIs by netif_napi_add must be
deleted with netif_napi_del, and also when driver unloads, it should
delete the NAPI before unregistering netdevice using unregister_netdev.

Signed-off-by: Devendra Naga 
Signed-off-by: David S. Miller 
Reviewed-by: Jonathan Nieder 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/r8169.c |3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -3706,6 +3706,7 @@ out:
return rc;
 
 err_out_msi_4:
+   netif_napi_del(&tp->napi);
rtl_disable_msi(pdev, tp);
iounmap(ioaddr);
 err_out_free_res_3:
@@ -3731,6 +3732,8 @@ static void __devexit rtl8169_remove_one
 
cancel_delayed_work_sync(&tp->task);
 
+   netif_napi_del(&tp->napi);
+
unregister_netdev(dev);
 
rtl_release_firmware(tp);




[ 69/84] r8169: runtime resume before shutdown.

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: françois romieu 

commit 2a15cd2ff488a9fdb55e5e34060f499853b27c77 upstream.

With runtime PM, if the ethernet cable is disconnected, the device is
transitioned to D3 state to conserve energy. If the system is shut down
in this state, any register accesses in rtl_shutdown are dropped on
the floor. As the device was programmed by .runtime_suspend() to wake
on link changes, it is thus brought back up as soon as the link recovers.

Resuming every suspended device through the driver core would slow things
down and it is not clear how many devices really need it now.

Original report and D0 transition patch by Sameer Nanda. Patch has been
changed to comply with advice from Rafael J. Wysocki and the PM folks.

Reported-by: Sameer Nanda 
Signed-off-by: Francois Romieu 
Cc: Rafael J. Wysocki 
Cc: Hayes Wang 
Cc: Alan Stern 
Acked-by: Rafael J. Wysocki 
Signed-off-by: David S. Miller 
Reviewed-by: Jonathan Nieder 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/r8169.c |5 +
 1 file changed, 5 insertions(+)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -5570,6 +5570,9 @@ static void rtl_shutdown(struct pci_dev
struct net_device *dev = pci_get_drvdata(pdev);
struct rtl8169_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
+   struct device *d = &pdev->dev;
+
+   pm_runtime_get_sync(d);
 
rtl8169_net_suspend(dev);
 
@@ -5598,6 +5601,8 @@ static void rtl_shutdown(struct pci_dev
pci_wake_from_d3(pdev, true);
pci_set_power_state(pdev, PCI_D3hot);
}
+
+   pm_runtime_put_noidle(d);
 }
 
 static struct pci_driver rtl8169_pci_driver = {




[ 70/84] r8169: Config1 is read-only on 8168c and later.

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Francois Romieu 

commit 851e60221926a53344b4227879858bef841b0477 upstream.

Suggested by Hayes.

Signed-off-by: Francois Romieu 
Cc: Hayes Wang 
Reviewed-by: Jonathan Nieder 
Acked-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/r8169.c |   15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1223,7 +1223,6 @@ static void __rtl8169_set_wol(struct rtl
u16 reg;
u8  mask;
} cfg[] = {
-   { WAKE_ANY,   Config1, PMEnable },
{ WAKE_PHY,   Config3, LinkUp },
{ WAKE_MAGIC, Config3, MagicPacket },
{ WAKE_UCAST, Config5, UWF },
@@ -1231,16 +1230,28 @@ static void __rtl8169_set_wol(struct rtl
{ WAKE_MCAST, Config5, MWF },
{ WAKE_ANY,   Config5, LanWake }
};
+   u8 options;
 
RTL_W8(Cfg9346, Cfg9346_Unlock);
 
for (i = 0; i < ARRAY_SIZE(cfg); i++) {
-   u8 options = RTL_R8(cfg[i].reg) & ~cfg[i].mask;
+   options = RTL_R8(cfg[i].reg) & ~cfg[i].mask;
if (wolopts & cfg[i].opt)
options |= cfg[i].mask;
RTL_W8(cfg[i].reg, options);
}
 
+   switch (tp->mac_version) {
+   case RTL_GIGA_MAC_VER_01 ... RTL_GIGA_MAC_VER_17:
+   options = RTL_R8(Config1) & ~PMEnable;
+   if (wolopts)
+   options |= PMEnable;
+   RTL_W8(Config1, options);
+   break;
+   default:
+   break;
+   }
+
RTL_W8(Cfg9346, Cfg9346_Lock);
 }
 




[ 84/84] mtd: omap2: fix module loading

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Andreas Bießmann 

commit 4d3d688da8e7016f15483e9319b41311e1db9515 upstream.

Unloading the omap2 nand driver failed to release the memory region, which
made it impossible to request the region again when loading the driver
later on.

This patch fixes following error when loading omap2 module after unloading:
---8<---
~ $ rmmod omap2
~ $ modprobe omap2
[   37.420928] omap2-nand: probe of omap2-nand.0 failed with error -16
~ $
--->8---

This error was introduced in 67ce04bf2746f8a1f8c2a104b313d20c63f68378 which
was the first commit of this driver.
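
The failure mode can be modelled without any hardware. The sketch below is
an assumption-laden illustration, not the driver: a single flag stands in
for the kernel's memory-region bookkeeping, and all function names are
hypothetical. On Linux, the -16 in the log above is -EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical one-entry model of the memory-region bookkeeping. */
static bool region_claimed;

/* Claim the region; fails with -EBUSY if somebody still holds it. */
int request_region_model(void)
{
	if (region_claimed)
		return -EBUSY;
	region_claimed = true;
	return 0;
}

void release_region_model(void)
{
	region_claimed = false;
}

/* Probe claims the region, as the driver does on load. */
int probe_model(void)
{
	return request_region_model();
}

/* Remove optionally releases the region; 'false' models the old driver. */
void remove_model(bool release)
{
	if (release)
		release_region_model();
}

void region_self_check(void)
{
	assert(probe_model() == 0);      /* first load succeeds */
	remove_model(false);             /* unload without releasing the region */
	assert(probe_model() == -EBUSY); /* reload fails */

	release_region_model();          /* reset the model */
	assert(probe_model() == 0);
	remove_model(true);              /* unload with the release in place */
	assert(probe_model() == 0);      /* reload succeeds */
}
```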

Signed-off-by: Andreas Bießmann 
Signed-off-by: Artem Bityutskiy 
Signed-off-by: David Woodhouse 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/mtd/nand/omap2.c |1 +
 1 file changed, 1 insertion(+)

--- a/drivers/mtd/nand/omap2.c
+++ b/drivers/mtd/nand/omap2.c
@@ -1139,6 +1139,7 @@ static int omap_nand_remove(struct platf
/* Release NAND device, its internal structures and partitions */
nand_release(&info->mtd);
iounmap(info->nand.IO_ADDR_R);
+   release_mem_region(info->phys_base, NAND_IO_SIZE);
kfree(info);
return 0;
 }




[ 31/84] net: small bug on rxhash calculation

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Chema Gonzalez 

[ Upstream commit 6862234238e84648c305526af2edd98badcad1e0 ]

In the current rxhash calculation function, while the
sorting of the ports/addrs is coherent (you get the
same rxhash for packets sharing the same 4-tuple, in
both directions), ports and addrs are sorted
independently. This implies packets from a connection
between the same addresses but crossed ports hash to
the same rxhash.

For example, traffic between A=S:l and B=L:s is hashed
(in both directions) from {L, S, {s, l}}. The same
rxhash is obtained for packets between C=S:s and D=L:l.

This patch ensures that you either swap both addrs and ports,
or you swap none. Traffic between A and B, and traffic
between C and D, get their rxhash from different sources
({L, S, {l, s}} for A<->B, and {L, S, {s, l}} for C<->D)
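
The two orderings can be compared in a small userspace sketch. It is an
illustration only: struct flow_key and the function names are hypothetical,
and the canonical tuples are compared directly instead of being fed to
jhash_3words().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key: addresses and ports as seen in one packet. */
struct flow_key {
	uint32_t addr1, addr2; /* source, destination address */
	uint16_t port1, port2; /* source, destination port */
};

/* Old behaviour: addresses and ports are sorted independently. */
struct flow_key canon_old(struct flow_key k)
{
	if (k.addr2 < k.addr1) {
		uint32_t t = k.addr1; k.addr1 = k.addr2; k.addr2 = t;
	}
	if (k.port2 < k.port1) {
		uint16_t t = k.port1; k.port1 = k.port2; k.port2 = t;
	}
	return k;
}

/* Patched behaviour: swap addresses and ports together, or not at all. */
struct flow_key canon_new(struct flow_key k)
{
	if (k.addr2 < k.addr1 ||
	    (k.addr2 == k.addr1 && k.port2 < k.port1)) {
		uint32_t ta = k.addr1; k.addr1 = k.addr2; k.addr2 = ta;
		uint16_t tp = k.port1; k.port1 = k.port2; k.port2 = tp;
	}
	return k;
}

int key_equal(struct flow_key a, struct flow_key b)
{
	return a.addr1 == b.addr1 && a.addr2 == b.addr2 &&
	       a.port1 == b.port1 && a.port2 == b.port2;
}

void rxhash_self_check(void)
{
	/* S < L are addresses, s < l are ports, as in the changelog. */
	const uint32_t S = 1, L = 2;
	const uint16_t s = 10, l = 20;
	struct flow_key a_to_b = { S, L, l, s }; /* A=S:l -> B=L:s */
	struct flow_key b_to_a = { L, S, s, l }; /* reverse direction */
	struct flow_key c_to_d = { S, L, s, l }; /* C=S:s -> D=L:l */

	assert(key_equal(canon_old(a_to_b), canon_old(c_to_d)));  /* collision */
	assert(key_equal(canon_new(a_to_b), canon_new(b_to_a)));  /* symmetric */
	assert(!key_equal(canon_new(a_to_b), canon_new(c_to_d))); /* distinct */
}
```

The patched ordering keeps the property that both directions of a flow map
to the same tuple, while the two crossed-port connections no longer collide.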

The patch is co-written with Eric Dumazet 

Signed-off-by: Chema Gonzalez 
Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/core/dev.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2559,16 +2559,17 @@ __u32 __skb_get_rxhash(struct sk_buff *s
poff = proto_ports_offset(ip_proto);
if (poff >= 0) {
nhoff += ihl * 4 + poff;
-   if (pskb_may_pull(skb, nhoff + 4)) {
+   if (pskb_may_pull(skb, nhoff + 4))
ports.v32 = * (__force u32 *) (skb->data + nhoff);
-   if (ports.v16[1] < ports.v16[0])
-   swap(ports.v16[0], ports.v16[1]);
-   }
}
 
/* get a consistent hash (same value on both flow directions) */
-   if (addr2 < addr1)
+   if (addr2 < addr1 ||
+   (addr2 == addr1 &&
+ports.v16[1] < ports.v16[0])) {
swap(addr1, addr2);
+   swap(ports.v16[0], ports.v16[1]);
+   }
 
hash = jhash_3words(addr1, addr2, ports.v32, hashrnd);
if (!hash)




[ 18/84] xfrm_user: dont copy esn replay window twice for new states

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Mathias Krause 

[ Upstream commit e3ac104d41a97b42316915020ba228c505447d21 ]

The ESN replay window was already fully initialized in
xfrm_alloc_replay_state_esn(). No need to copy it again.

Signed-off-by: Mathias Krause 
Cc: Steffen Klassert 
Acked-by: Steffen Klassert 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/xfrm/xfrm_user.c |9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -442,10 +442,11 @@ static void copy_from_user_state(struct
  * somehow made shareable and move it to xfrm_state.c - JHS
  *
 */
-static void xfrm_update_ae_params(struct xfrm_state *x, struct nlattr **attrs)
+static void xfrm_update_ae_params(struct xfrm_state *x, struct nlattr **attrs,
+ int update_esn)
 {
struct nlattr *rp = attrs[XFRMA_REPLAY_VAL];
-   struct nlattr *re = attrs[XFRMA_REPLAY_ESN_VAL];
+   struct nlattr *re = update_esn ? attrs[XFRMA_REPLAY_ESN_VAL] : NULL;
struct nlattr *lt = attrs[XFRMA_LTIME_VAL];
struct nlattr *et = attrs[XFRMA_ETIMER_THRESH];
struct nlattr *rt = attrs[XFRMA_REPLAY_THRESH];
@@ -555,7 +556,7 @@ static struct xfrm_state *xfrm_state_con
goto error;
 
/* override default values from above */
-   xfrm_update_ae_params(x, attrs);
+   xfrm_update_ae_params(x, attrs, 0);
 
return x;
 
@@ -1801,7 +1802,7 @@ static int xfrm_new_ae(struct sk_buff *s
goto out;
 
spin_lock_bh(&x->lock);
-   xfrm_update_ae_params(x, attrs);
+   xfrm_update_ae_params(x, attrs, 1);
spin_unlock_bh(&x->lock);
 
c.event = nlh->nlmsg_type;




[ 20/84] net: ethernet: davinci_cpdma: decrease the desc count when cleaning up the remaining packets

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: htbegin 

[ Upstream commit ffb5ba90017505a19e238e986e6d33f09e4df765 ]

chan->count is used by rx channel. If the desc count is not updated by
the clean up loop in cpdma_chan_stop, the value written to the rxfree
register in cpdma_chan_start will be incorrect.

Signed-off-by: Tao Hou 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/davinci_cpdma.c |1 +
 1 file changed, 1 insertion(+)

--- a/drivers/net/davinci_cpdma.c
+++ b/drivers/net/davinci_cpdma.c
@@ -849,6 +849,7 @@ int cpdma_chan_stop(struct cpdma_chan *c
 
next_dma = desc_read(desc, hw_next);
chan->head = desc_from_phys(pool, next_dma);
+   chan->count--;
chan->stats.teardown_dequeue++;
 
/* issue callback without locks held */




[ 22/84] netxen: check for root bus in netxen_mask_aer_correctable

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Nikolay Aleksandrov 

[ Upstream commit e4d1aa40e363ed3e0486aeeeb0d173f7f822737e ]

Add a check if pdev->bus->self == NULL (root bus). When attaching
a netxen NIC to a VM it can be on the root bus and the guest would
crash in netxen_mask_aer_correctable() because of a NULL pointer
dereference if CONFIG_PCIEAER is present.

Signed-off-by: Nikolay Aleksandrov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/netxen/netxen_nic_main.c |4 
 1 file changed, 4 insertions(+)

--- a/drivers/net/netxen/netxen_nic_main.c
+++ b/drivers/net/netxen/netxen_nic_main.c
@@ -1288,6 +1288,10 @@ static void netxen_mask_aer_correctable(
struct pci_dev *root = pdev->bus->self;
u32 aer_pos;
 
+   /* root bus? */
+   if (!root)
+   return;
+
if (adapter->ahw.board_type != NETXEN_BRDTYPE_P3_4_GB_MM &&
adapter->ahw.board_type != NETXEN_BRDTYPE_P3_10G_TP)
return;




[ 15/84] xfrm_user: fix info leak in copy_to_user_state()

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Mathias Krause 

[ Upstream commit f778a636713a435d3a922c60b1622a91136560c1 ]

The memory reserved to dump the xfrm state includes the padding bytes of
struct xfrm_usersa_info added by the compiler for alignment (7 for
amd64, 3 for i386). Add an explicit memset(0) before filling the buffer
to avoid the info leak.

Signed-off-by: Mathias Krause 
Acked-by: Steffen Klassert 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/xfrm/xfrm_user.c |1 +
 1 file changed, 1 insertion(+)

--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -689,6 +689,7 @@ out:
 
 static void copy_to_user_state(struct xfrm_state *x, struct xfrm_usersa_info *p)
 {
+   memset(p, 0, sizeof(*p));
    memcpy(&p->id, &x->id, sizeof(p->id));
    memcpy(&p->sel, &x->sel, sizeof(p->sel));
    memcpy(&p->lft, &x->lft, sizeof(p->lft));
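
The padding issue can be reproduced outside the kernel. The sketch below uses a hypothetical struct record (not xfrm_usersa_info) whose layout leaves padding after the first member on common ABIs. It poisons the buffer, fills it member by member after a memset, and counts how many poison bytes survive. It assumes, as the kernel does, that member stores leave padding bytes alone.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct; on most ABIs the compiler pads after 'tag'. */
struct record {
	char tag;		/* 1 byte, followed by padding */
	unsigned long value;	/* aligned member */
};

/* Fill 'out' so that every byte, including padding, is defined. */
void fill_record(struct record *out, char tag, unsigned long value)
{
	memset(out, 0, sizeof(*out));	/* clears the padding bytes too */
	out->tag = tag;
	out->value = value;
}

/* Count bytes in 'out' that still hold the poison pattern. */
size_t count_poison(const struct record *out, unsigned char poison)
{
	const unsigned char *bytes = (const unsigned char *)out;
	size_t i, n = 0;

	for (i = 0; i < sizeof(*out); i++)
		if (bytes[i] == poison)
			n++;
	return n;
}
```

Without the memset, the bytes between tag and value would keep whatever the buffer held before, which is the kind of stale data that would otherwise be copied to user space.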




[ 23/84] net-sched: sch_cbq: avoid infinite loop

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Eric Dumazet 

[ Upstream commit bdfc87f7d1e253e0a61e2fc6a75ea9d76f7fc03a ]

It's possible to set up a bad cbq configuration leading to
an infinite loop in cbq_classify():

DEV_OUT=eth0
ICMP="match ip protocol 1 0xff"
U32="protocol ip u32"
DST="match ip dst"
tc qdisc add dev $DEV_OUT root handle 1: cbq avpkt 1000 \
bandwidth 100mbit
tc class add dev $DEV_OUT parent 1: classid 1:1 cbq \
rate 512kbit allot 1500 prio 5 bounded isolated
tc filter add dev $DEV_OUT parent 1: prio 3 $U32 \
$ICMP $DST 192.168.3.234 flowid 1:

Reported-by: Denys Fedoryschenko 
Tested-by: Denys Fedoryschenko 
Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/sched/sch_cbq.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -250,10 +250,11 @@ cbq_classify(struct sk_buff *skb, struct
else if ((cl = defmap[res.classid & TC_PRIO_MAX]) == NULL)
cl = defmap[TC_PRIO_BESTEFFORT];
 
-   if (cl == NULL || cl->level >= head->level)
+   if (cl == NULL)
goto fallback;
}
-
+   if (cl->level >= head->level)
+   goto fallback;
 #ifdef CONFIG_NET_CLS_ACT
switch (result) {
case TC_ACT_QUEUED:
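
The termination argument can be sketched in isolation. The code below is a simplified model, not cbq_classify() itself: struct node and descend() are hypothetical, and "next" stands in for whatever class the filter returns. The loop is only guaranteed to end if every step moves to a strictly lower level, so the level check has to run on every iteration, whichever way the class was found.

```c
#include <stddef.h>

/* Hypothetical class node: level 0 is a leaf. */
struct node {
	int level;
	const struct node *next;	/* class selected by the "filter" */
};

/* Walk down from 'head'; return the leaf reached, or NULL for fallback. */
const struct node *descend(const struct node *head)
{
	for (;;) {
		const struct node *cl = head->next;

		if (cl == NULL)
			return NULL;
		/* No progress toward a leaf: bail out instead of looping. */
		if (cl->level >= head->level)
			return NULL;
		if (cl->level == 0)
			return cl;
		head = cl;
	}
}
```

A filter whose flowid points back at the class it is attached to corresponds to a node with next == itself. With the unconditional level check this falls back immediately rather than spinning.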




[ 44/84] SCSI: zfcp: restore refcount check on port_remove

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Steffen Maier 

commit d99b601b63386f3395dc26a699ae703a273d9982 upstream.

Upstream commit f3450c7b917201bb49d67032e9f60d5125675d6a
"[SCSI] zfcp: Replace local reference counting with common kref"
accidentally dropped a reference count check before tearing down
zfcp_ports that are potentially in use by zfcp_units.
Even remote ports in use can be removed, leaving behind
unreachable garbage zfcp_port objects with zfcp_units attached.
Thus units won't come back even after a manual port_rescan.
The kref of zfcp_port->dev.kobj is already used by the driver core.
We cannot re-use it to track the number of zfcp_units.
Re-introduce our own counter for units per port
and check on port_remove.

Signed-off-by: Steffen Maier 
Reviewed-by: Heiko Carstens 
Signed-off-by: James Bottomley 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/s390/scsi/zfcp_aux.c   |1 +
 drivers/s390/scsi/zfcp_def.h   |1 +
 drivers/s390/scsi/zfcp_ext.h   |1 +
 drivers/s390/scsi/zfcp_sysfs.c |   18 --
 drivers/s390/scsi/zfcp_unit.c  |   36 ++--
 5 files changed, 45 insertions(+), 12 deletions(-)

--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -518,6 +518,7 @@ struct zfcp_port *zfcp_port_enqueue(stru
 
rwlock_init(&port->unit_list_lock);
INIT_LIST_HEAD(&port->unit_list);
+   atomic_set(&port->units, 0);
 
INIT_WORK(&port->gid_pn_work, zfcp_fc_port_did_lookup);
INIT_WORK(&port->test_link_work, zfcp_fc_link_test_work);
--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -204,6 +204,7 @@ struct zfcp_port {
    struct zfcp_adapter    *adapter;       /* adapter used to access port */
    struct list_head       unit_list;      /* head of logical unit list */
    rwlock_t               unit_list_lock; /* unit list lock */
+   atomic_t               units;          /* zfcp_unit count */
    atomic_t               status;         /* status of this remote port */
    u64                    wwnn;           /* WWNN if known */
    u64                    wwpn;           /* WWPN */
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -158,6 +158,7 @@ extern void zfcp_scsi_dif_sense_error(st
 extern struct attribute_group zfcp_sysfs_unit_attrs;
 extern struct attribute_group zfcp_sysfs_adapter_attrs;
 extern struct attribute_group zfcp_sysfs_port_attrs;
+extern struct mutex zfcp_sysfs_port_units_mutex;
 extern struct device_attribute *zfcp_sysfs_sdev_attrs[];
 extern struct device_attribute *zfcp_sysfs_shost_attrs[];
 
--- a/drivers/s390/scsi/zfcp_sysfs.c
+++ b/drivers/s390/scsi/zfcp_sysfs.c
@@ -227,6 +227,8 @@ static ssize_t zfcp_sysfs_port_rescan_st
 static ZFCP_DEV_ATTR(adapter, port_rescan, S_IWUSR, NULL,
 zfcp_sysfs_port_rescan_store);
 
+DEFINE_MUTEX(zfcp_sysfs_port_units_mutex);
+
 static ssize_t zfcp_sysfs_port_remove_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
@@ -249,6 +251,16 @@ static ssize_t zfcp_sysfs_port_remove_st
else
retval = 0;
 
+   mutex_lock(&zfcp_sysfs_port_units_mutex);
+   if (atomic_read(&port->units) > 0) {
+   retval = -EBUSY;
+   mutex_unlock(&zfcp_sysfs_port_units_mutex);
+   goto out;
+   }
+   /* port is about to be removed, so no more unit_add */
+   atomic_set(&port->units, -1);
+   mutex_unlock(&zfcp_sysfs_port_units_mutex);
+
write_lock_irq(&adapter->port_list_lock);
list_del(&port->list);
write_unlock_irq(&adapter->port_list_lock);
@@ -289,12 +301,14 @@ static ssize_t zfcp_sysfs_unit_add_store
 {
struct zfcp_port *port = container_of(dev, struct zfcp_port, dev);
u64 fcp_lun;
+   int retval;
 
if (strict_strtoull(buf, 0, (unsigned long long *) &fcp_lun))
return -EINVAL;
 
-   if (zfcp_unit_add(port, fcp_lun))
-   return -EINVAL;
+   retval = zfcp_unit_add(port, fcp_lun);
+   if (retval)
+   return retval;
 
return count;
 }
--- a/drivers/s390/scsi/zfcp_unit.c
+++ b/drivers/s390/scsi/zfcp_unit.c
@@ -104,7 +104,7 @@ static void zfcp_unit_release(struct dev
 {
struct zfcp_unit *unit = container_of(dev, struct zfcp_unit, dev);
 
-   put_device(&unit->port->dev);
+   atomic_dec(&unit->port->units);
kfree(unit);
 }
 
@@ -119,16 +119,27 @@ static void zfcp_unit_release(struct dev
 int zfcp_unit_add(struct zfcp_port *port, u64 fcp_lun)
 {
struct zfcp_unit *unit;
+   int retval = 0;
+
+   mutex_lock(&zfcp_sysfs_port_units_mutex);
+   if (atomic_read(&port->units) == -1) {
+   /* port is already gone */
+   retval = -ENODEV;
+   goto out;
+   }
 
unit = zfcp_unit_find(port, fcp_lun);
if (unit) {
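
The counter protocol described in the changelog can be modelled in user space. This is a sketch under simplifying assumptions: struct port_model and the three functions are hypothetical, a pthread mutex replaces the kernel mutex, and the counter is a plain int always accessed under that lock (the patch itself uses an atomic_t and decrements it outside the mutex). The value -1 marks a port whose removal has started.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical user-space model of a port and its unit counter. */
struct port_model {
	int units;	/* >= 0: live unit count, -1: removal in progress */
};

static pthread_mutex_t units_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models unit_add: refuse once removal has started. */
int port_unit_add(struct port_model *port)
{
	int ret = 0;

	pthread_mutex_lock(&units_lock);
	if (port->units == -1)
		ret = -ENODEV;		/* port is already gone */
	else
		port->units++;
	pthread_mutex_unlock(&units_lock);
	return ret;
}

/* Models the unit release callback. */
void port_unit_release(struct port_model *port)
{
	pthread_mutex_lock(&units_lock);
	port->units--;
	pthread_mutex_unlock(&units_lock);
}

/* Models port_remove: refuse while units exist, else block further adds. */
int port_remove(struct port_model *port)
{
	int ret = 0;

	pthread_mutex_lock(&units_lock);
	if (port->units > 0)
		ret = -EBUSY;
	else
		port->units = -1;	/* no more unit_add from here on */
	pthread_mutex_unlock(&units_lock);
	return ret;
}
```

With one unit attached, port_remove() returns -EBUSY. After the unit is released it succeeds, and any later port_unit_add() on that port returns -ENODEV.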

[ 45/84] SCSI: zfcp: only access zfcp_scsi_dev for valid scsi_device

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Martin Peschke 

commit d436de8ce25f53a8a880a931886821f632247943 upstream.

__scsi_remove_device (e.g. due to dev_loss_tmo) calls
zfcp_scsi_slave_destroy which in turn sends a close LUN FSF request to
the adapter. After 30 seconds without response,
zfcp_erp_timeout_handler kicks the ERP thread failing the close LUN
ERP action. zfcp_erp_wait in zfcp_erp_lun_shutdown_wait and thus
zfcp_scsi_slave_destroy returns and then scsi_device is no longer
valid. Sometime later the response to the close LUN FSF request may
finally come in. However, commit
b62a8d9b45b971a67a0f8413338c230e3117dff5
"[SCSI] zfcp: Use SCSI device data zfcp_scsi_dev instead of zfcp_unit"
introduced a number of attempts to unconditionally access struct
zfcp_scsi_dev through struct scsi_device causing a use-after-free.
This leads to an Oops due to kernel page fault in one of:
zfcp_fsf_abort_fcp_command_handler, zfcp_fsf_open_lun_handler,
zfcp_fsf_close_lun_handler, zfcp_fsf_req_trace,
zfcp_fsf_fcp_handler_common.
Move dereferencing of zfcp private data zfcp_scsi_dev allocated in
scsi_device via scsi_transport_reserve_device after the check for
potentially aborted FSF request and thus no longer valid scsi_device.
Only then assign sdev_to_zfcp(sdev) to the local auto variable struct
zfcp_scsi_dev *zfcp_sdev.

Signed-off-by: Martin Peschke 
Signed-off-by: Steffen Maier 
Signed-off-by: James Bottomley 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/s390/scsi/zfcp_fsf.c |   19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -771,12 +771,14 @@ out:
 static void zfcp_fsf_abort_fcp_command_handler(struct zfcp_fsf_req *req)
 {
struct scsi_device *sdev = req->data;
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
+   struct zfcp_scsi_dev *zfcp_sdev;
union fsf_status_qual *fsq = &req->qtcb->header.fsf_status_qual;
 
if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
return;
 
+   zfcp_sdev = sdev_to_zfcp(sdev);
+
switch (req->qtcb->header.fsf_status) {
case FSF_PORT_HANDLE_NOT_VALID:
if (fsq->word[0] == fsq->word[1]) {
@@ -1730,13 +1732,15 @@ static void zfcp_fsf_open_lun_handler(st
 {
struct zfcp_adapter *adapter = req->adapter;
struct scsi_device *sdev = req->data;
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
+   struct zfcp_scsi_dev *zfcp_sdev;
struct fsf_qtcb_header *header = &req->qtcb->header;
struct fsf_qtcb_bottom_support *bottom = &req->qtcb->bottom.support;
 
if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
return;
 
+   zfcp_sdev = sdev_to_zfcp(sdev);
+
atomic_clear_mask(ZFCP_STATUS_COMMON_ACCESS_DENIED |
  ZFCP_STATUS_COMMON_ACCESS_BOXED |
  ZFCP_STATUS_LUN_SHARED |
@@ -1847,11 +1851,13 @@ out:
 static void zfcp_fsf_close_lun_handler(struct zfcp_fsf_req *req)
 {
struct scsi_device *sdev = req->data;
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
+   struct zfcp_scsi_dev *zfcp_sdev;
 
if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
return;
 
+   zfcp_sdev = sdev_to_zfcp(sdev);
+
switch (req->qtcb->header.fsf_status) {
case FSF_PORT_HANDLE_NOT_VALID:
zfcp_erp_adapter_reopen(zfcp_sdev->port->adapter, 0, "fscuh_1");
@@ -1941,7 +1947,7 @@ static void zfcp_fsf_req_trace(struct zf
 {
struct fsf_qual_latency_info *lat_in;
struct latency_cont *lat = NULL;
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scsi->device);
+   struct zfcp_scsi_dev *zfcp_sdev;
struct zfcp_blk_drv_data blktrc;
int ticks = req->adapter->timer_ticks;
 
@@ -1956,6 +1962,7 @@ static void zfcp_fsf_req_trace(struct zf
 
if (req->adapter->adapter_features & FSF_FEATURE_MEASUREMENT_DATA &&
!(req->status & ZFCP_STATUS_FSFREQ_ERROR)) {
+   zfcp_sdev = sdev_to_zfcp(scsi->device);
blktrc.flags |= ZFCP_BLK_LAT_VALID;
blktrc.channel_lat = lat_in->channel_lat * ticks;
blktrc.fabric_lat = lat_in->fabric_lat * ticks;
@@ -1993,12 +2000,14 @@ static void zfcp_fsf_fcp_handler_common(
 {
struct scsi_cmnd *scmnd = req->data;
struct scsi_device *sdev = scmnd->device;
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
+   struct zfcp_scsi_dev *zfcp_sdev;
struct fsf_qtcb_header *header = &req->qtcb->header;
 
if (unlikely(req->status & ZFCP_STATUS_FSFREQ_ERROR))
return;
 
+   zfcp_sdev = sdev_to_zfcp(sdev);
+
switch (header->fsf_status) {
case FSF_HANDLE_MISMATCH:
case FSF_PORT_HANDLE_NOT_VALID:
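
The pattern applied in every handler is the same: copy the pointer early if convenient, but dereference it only after the error flag has been checked. A minimal sketch follows. The types, the REQ_ERROR flag and handle_response() are hypothetical stand-ins for the zfcp structures and ZFCP_STATUS_FSFREQ_ERROR.

```c
#include <stddef.h>

#define REQ_ERROR 0x1u	/* hypothetical stand-in for the FSF error flag */

struct dev_priv {
	int handled;
};

struct device_model {
	struct dev_priv *priv;
};

struct request_model {
	unsigned int status;
	struct device_model *data;	/* may be stale when REQ_ERROR is set */
};

/* Returns 1 if the private data was touched, 0 if the request was skipped. */
int handle_response(struct request_model *req)
{
	struct device_model *sdev = req->data;	/* copying a pointer is harmless */
	struct dev_priv *priv;

	if (req->status & REQ_ERROR)
		return 0;	/* sdev may already be freed: do not dereference */

	priv = sdev->priv;	/* only now is the device known to be valid */
	priv->handled++;
	return 1;
}
```

A request flagged with REQ_ERROR returns before sdev is ever followed, so a dangling or NULL data pointer is never touched on that path.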



[ 47/84] ext4: online defrag is not supported for journaled files

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Dmitry Monakhov 

commit f066055a3449f0e5b0ae4f3ceab4445bead47638 upstream.

Proper block swap for inodes with full journaling enabled is
a truly non-obvious task. To be on the safe side, let's
explicitly disable it for now.

Signed-off-by: Dmitry Monakhov 
Signed-off-by: "Theodore Ts'o" 
Signed-off-by: Greg Kroah-Hartman 

---
 fs/ext4/move_extent.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -1209,7 +1209,12 @@ ext4_move_extents(struct file *o_filp, s
orig_inode->i_ino, donor_inode->i_ino);
return -EINVAL;
}
-
+   /* TODO: This is non obvious task to swap blocks for inodes with full
+  jornaling enabled */
+   if (ext4_should_journal_data(orig_inode) ||
+   ext4_should_journal_data(donor_inode)) {
+   return -EINVAL;
+   }
/* Protect orig and donor inodes against a truncate */
ret1 = mext_inode_double_lock(orig_inode, donor_inode);
if (ret1 < 0)




[ 50/84] ASoC: wm9712: Fix name of Capture Switch

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Mark Brown 

commit 689185b78ba6fbe0042f662a468b5565909dff7a upstream.

Help UIs associate it with the matching gain control.

Signed-off-by: Mark Brown 
Signed-off-by: Greg Kroah-Hartman 

---
 sound/soc/codecs/wm9712.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/sound/soc/codecs/wm9712.c
+++ b/sound/soc/codecs/wm9712.c
@@ -144,7 +144,7 @@ SOC_SINGLE("Playback Attenuate (-6dB) Sw
 SOC_SINGLE("Bass Volume", AC97_MASTER_TONE, 8, 15, 1),
 SOC_SINGLE("Treble Volume", AC97_MASTER_TONE, 0, 15, 1),
 
-SOC_SINGLE("Capture ADC Switch", AC97_REC_GAIN, 15, 1, 1),
+SOC_SINGLE("Capture Switch", AC97_REC_GAIN, 15, 1, 1),
 SOC_ENUM("Capture Volume Steps", wm9712_enum[6]),
 SOC_DOUBLE("Capture Volume", AC97_REC_GAIN, 8, 0, 63, 1),
 SOC_SINGLE("Capture ZC Switch", AC97_REC_GAIN, 7, 1, 0),




[ 37/84] netrom: copy_datagram_iovec can fail

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Alan Cox 

[ Upstream commit 6cf5c951175abcec4da470c50565cc0afe6cd11d ]

Check for an error from this and if so bail properly.

Signed-off-by: Alan Cox 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/netrom/af_netrom.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1170,7 +1170,12 @@ static int nr_recvmsg(struct kiocb *iocb
msg->msg_flags |= MSG_TRUNC;
}
 
-   skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+   er = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+   if (er < 0) {
+   skb_free_datagram(sk, skb);
+   release_sock(sk);
+   return er;
+   }
 
if (sax != NULL) {
sax->sax25_family = AF_NETROM;




[ 39/84] aoe: assert AoE packets marked as requiring no checksum

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Ed Cashin 

[ Upstream commit 8babe8cc6570ed896b7b596337eb8fe730c3ff45 ]

In order for the network layer to see that AoE requires
no checksumming in a generic way, the packets must be
marked as requiring no checksum, so we make this requirement
explicit with the assertion.

Signed-off-by: Ed Cashin 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/block/aoe/aoecmd.c |1 +
 1 file changed, 1 insertion(+)

--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -35,6 +35,7 @@ new_skb(ulong len)
skb_reset_mac_header(skb);
skb_reset_network_header(skb);
skb->protocol = __constant_htons(ETH_P_AOE);
+   skb_checksum_none_assert(skb);
}
return skb;
 }




[ 41/84] SCSI: zfcp: Make trace record tags unique

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Steffen Maier 

commit 0100998dbfe6dfcd90a6e912ca7ed6f255d48f25 upstream.

Duplicate fssrh_2 from a54ca0f62f953898b05549391ac2a8a4dad6482b
"[SCSI] zfcp: Redesign of the debug tracing for HBA records."
complicates distinction of generic status read response from
local link up.
Duplicate fsscth1 from 2c55b750a884b86dea8b4cc5f15e1484cc47a25c
"[SCSI] zfcp: Redesign of the debug tracing for SAN records."
complicates distinction of good common transport response from
invalid port handle.

Signed-off-by: Steffen Maier 
Reviewed-by: Martin Peschke 
Signed-off-by: James Bottomley 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/s390/scsi/zfcp_fsf.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -219,7 +219,7 @@ static void zfcp_fsf_status_read_handler
return;
}
 
-   zfcp_dbf_hba_fsf_uss("fssrh_2", req);
+   zfcp_dbf_hba_fsf_uss("fssrh_4", req);
 
switch (sr_buf->status_type) {
case FSF_STATUS_READ_PORT_CLOSED:
@@ -885,7 +885,7 @@ static void zfcp_fsf_send_ct_handler(str
 
switch (header->fsf_status) {
 case FSF_GOOD:
-   zfcp_dbf_san_res("fsscth1", req);
+   zfcp_dbf_san_res("fsscth2", req);
ct->status = 0;
break;
 case FSF_SERVICE_CLASS_NOT_SUPPORTED:




[ 42/84] SCSI: zfcp: Do not wakeup while suspended

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Steffen Maier 

commit cb45214960bc989af8b911ebd77da541c797717d upstream.

If the mapping of FCP device bus ID and corresponding subchannel
is modified while the Linux image is suspended, the resume of FCP
devices can fail. During resume, zfcp gets callbacks from cio regarding
the modified subchannels but they can be arbitrarily mixed with the
restore/resume callback. Since the cio callbacks would trigger
adapter recovery, zfcp could wake up before the resume callback.
Therefore, ignore the cio callbacks regarding subchannels while
suspended. We can safely do so, since zfcp does not itself deal
with subchannels. For problem determination purposes, we still trace the
ignored callback events.

The following kernel messages could be seen on resume:

kernel: : parent  should not be sleeping

As part of adapter reopen recovery, zfcp performs auto port scanning
which can erroneously try to register new remote ports with
scsi_transport_fc and the device core code complains about the parent
(adapter) still sleeping.

kernel: zfcp.3dff9c: :\
 Setting up the QDIO connection to the FCP adapter failed

kernel: zfcp.574d43: :\
 ERP cannot recover an error on the FCP device

In such cases, the adapter gave up recovery and remained blocked along
with its child objects: remote ports and LUNs/scsi devices. Even the
adapter shutdown as part of giving up recovery failed because the ccw
device state remained disconnected. Later, the corresponding remote
ports ran into dev_loss_tmo. As a result, the LUNs were erroneously
not available again after resume.

Even a manually triggered adapter recovery (e.g. sysfs attribute
failed, or device offline/online via sysfs) could not recover the
adapter due to the remaining disconnected state of the corresponding
ccw device.

Signed-off-by: Steffen Maier 
Signed-off-by: James Bottomley 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/s390/scsi/zfcp_ccw.c |   73 +--
 drivers/s390/scsi/zfcp_dbf.c |   20 +++
 drivers/s390/scsi/zfcp_dbf.h |1 
 drivers/s390/scsi/zfcp_def.h |1 
 drivers/s390/scsi/zfcp_ext.h |1 
 5 files changed, 86 insertions(+), 10 deletions(-)

--- a/drivers/s390/scsi/zfcp_ccw.c
+++ b/drivers/s390/scsi/zfcp_ccw.c
@@ -38,17 +38,23 @@ void zfcp_ccw_adapter_put(struct zfcp_ad
spin_unlock_irqrestore(&zfcp_ccw_adapter_ref_lock, flags);
 }
 
-static int zfcp_ccw_activate(struct ccw_device *cdev)
-
+/**
+ * zfcp_ccw_activate - activate adapter and wait for it to finish
+ * @cdev: pointer to belonging ccw device
+ * @clear: Status flags to clear.
+ * @tag: s390dbf trace record tag
+ */
+static int zfcp_ccw_activate(struct ccw_device *cdev, int clear, char *tag)
 {
struct zfcp_adapter *adapter = zfcp_ccw_adapter_by_cdev(cdev);
 
if (!adapter)
return 0;
 
+   zfcp_erp_clear_adapter_status(adapter, clear);
zfcp_erp_set_adapter_status(adapter, ZFCP_STATUS_COMMON_RUNNING);
zfcp_erp_adapter_reopen(adapter, ZFCP_STATUS_COMMON_ERP_FAILED,
-   "ccresu2");
+   tag);
zfcp_erp_wait(adapter);
flush_work(&adapter->scan_work);
 
@@ -163,26 +169,29 @@ static int zfcp_ccw_set_online(struct cc
BUG_ON(!zfcp_reqlist_isempty(adapter->req_list));
adapter->req_no = 0;
 
-   zfcp_ccw_activate(cdev);
+   zfcp_ccw_activate(cdev, 0, "ccsonl1");
zfcp_ccw_adapter_put(adapter);
return 0;
 }
 
 /**
- * zfcp_ccw_set_offline - set_offline function of zfcp driver
+ * zfcp_ccw_offline_sync - shut down adapter and wait for it to finish
  * @cdev: pointer to belonging ccw device
+ * @set: Status flags to set.
+ * @tag: s390dbf trace record tag
  *
  * This function gets called by the common i/o layer and sets an adapter
  * into state offline.
  */
-static int zfcp_ccw_set_offline(struct ccw_device *cdev)
+static int zfcp_ccw_offline_sync(struct ccw_device *cdev, int set, char *tag)
 {
struct zfcp_adapter *adapter = zfcp_ccw_adapter_by_cdev(cdev);
 
if (!adapter)
return 0;
 
-   zfcp_erp_adapter_shutdown(adapter, 0, "ccsoff1");
+   zfcp_erp_set_adapter_status(adapter, set);
+   zfcp_erp_adapter_shutdown(adapter, 0, tag);
zfcp_erp_wait(adapter);
 
zfcp_ccw_adapter_put(adapter);
@@ -190,6 +199,18 @@ static int zfcp_ccw_set_offline(struct c
 }
 
 /**
+ * zfcp_ccw_set_offline - set_offline function of zfcp driver
+ * @cdev: pointer to belonging ccw device
+ *
+ * This function gets called by the common i/o layer and sets an adapter
+ * into state offline.
+ */
+static int zfcp_ccw_set_offline(struct ccw_device *cdev)
+{
+   return zfcp_ccw_offline_sync(cdev, 0, "ccsoff1");
+}
+
+/**
  * zfcp_ccw_notify - ccw notify function
  * @cdev: pointer to belonging ccw device
  * @event: indicates if adapter was detached or attached
@@ -206,6 

[ 62/84] r8169: remove erroneous processing of always set bit.

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Francois Romieu 

commit e03f33af79f0772156e1a1a1e36bdddf8012b2e4 upstream.

When set, RxFOVF (resp. RxBOVF) is always 1 (resp. 0).

Signed-off-by: Francois Romieu 
Cc: Hayes 
Reviewed-by: Jonathan Nieder 
Acked-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/r8169.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -388,6 +388,7 @@ enum rtl_register_content {
RxOK    = 0x0001,
 
/* RxStatusDesc */
+   RxBOVF  = (1 << 24),
RxFOVF  = (1 << 23),
RxRWT   = (1 << 22),
RxRES   = (1 << 21),
@@ -666,6 +667,7 @@ struct rtl8169_private {
struct mii_if_info mii;
struct rtl8169_counters counters;
u32 saved_wolopts;
+   u32 opts1_mask;
 
const struct firmware *fw;
 #define RTL_FIRMWARE_UNKNOWN   ERR_PTR(-EAGAIN);
@@ -3442,6 +3444,9 @@ rtl8169_init_one(struct pci_dev *pdev, c
tp->intr_event = cfg->intr_event;
tp->napi_event = cfg->napi_event;
 
+   tp->opts1_mask = (tp->mac_version != RTL_GIGA_MAC_VER_01) ?
+   ~(RxBOVF | RxFOVF) : ~0;
+
init_timer(&tp->timer);
tp->timer.data = (unsigned long) dev;
tp->timer.function = rtl8169_phy_timer;
@@ -4920,7 +4925,7 @@ static int rtl8169_rx_interrupt(struct n
u32 status;
 
rmb();
-   status = le32_to_cpu(desc->opts1);
+   status = le32_to_cpu(desc->opts1) & tp->opts1_mask;
 
if (status & DescOwn)
break;
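
A small stand-alone sketch of the masking may help. The bit positions match the enum entries in the hunk above. The function names and the is_first_gen flag (standing in for the RTL_GIGA_MAC_VER_01 test) are assumptions made for illustration.

```c
#include <stdint.h>

#define RX_BOVF (1u << 24)
#define RX_FOVF (1u << 23)

/* is_first_gen models tp->mac_version == RTL_GIGA_MAC_VER_01. */
uint32_t rx_opts1_mask(int is_first_gen)
{
	return is_first_gen ? ~0u : ~(RX_BOVF | RX_FOVF);
}

/* Apply the per-chip mask to the raw descriptor status word. */
uint32_t rx_status(uint32_t opts1, uint32_t mask)
{
	return opts1 & mask;
}
```

On every chip except the first generation, the two overflow bits are dropped from the status before any error handling looks at it. The first generation sees the word unchanged.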




[ 64/84] r8169: expand received packet length indication.

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Francois Romieu 

commit deb9d93c89d311714a60809b28160e538e1cbb43 upstream.

8168d and above allow jumbo frames beyond 8k. Bump the received
packet length check before enabling jumbo frames on these chipsets.

Frame length indication covers bits 0..13 of the first Rx descriptor
32 bits for the 8169 and 8168. I only have authoritative documentation
for the allowed use of the extra (13) bit with the 8169 and 8168c.
Realtek's drivers use the same mask for the 816x and the fast ethernet
only 810x.

Signed-off-by: Francois Romieu 
Reviewed-by: Jonathan Nieder 
Acked-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/r8169.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -5137,7 +5137,7 @@ static int rtl8169_rx_interrupt(struct n
} else {
struct sk_buff *skb;
dma_addr_t addr = le64_to_cpu(desc->addr);
-   int pkt_size = (status & 0x1FFF) - 4;
+   int pkt_size = (status & 0x3fff) - 4;
 
/*
 * The driver does not support incoming fragmented
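
A worked example of the mask change, as a sketch: rx_pkt_size() is a hypothetical helper that mirrors the expression in the hunk, with the length mask passed in so both variants can be compared. The 4 bytes subtracted are the trailing CRC.

```c
#include <stdint.h>

/* Payload size from the descriptor status: length field minus 4-byte CRC. */
int rx_pkt_size(uint32_t status, uint32_t len_mask)
{
	return (int)(status & len_mask) - 4;
}
```

For a 9004-byte frame (9000 bytes plus CRC), the 13-bit mask 0x1fff drops bit 13 and yields 808, while the 14-bit mask 0x3fff yields the expected 9000. Frames below 8192 bytes give the same result with either mask.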


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[ 66/84] r8169: Rx FIFO overflow fixes.

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Francois Romieu 

commit 811fd3010cf512f2e23e6c4c912aad54516dc706 upstream.

Realtek has specified that the post 8168c gigabit chips and the post
8105e fast ethernet chips recover automatically from a Rx FIFO overflow.
The driver does not need to clear the RxFIFOOver bit of IntrStatus and
should rather avoid touching it.

The implementation deserves some explanation:
1. events outside of the intr_event bit mask are now ignored. It enforces
   a no-processing policy for the events that either should not be there
   or should be ignored.

2. RxFIFOOver was already ignored in rtl_cfg_infos[RTL_CFG_1] for the
   whole 8168 line of chips with two exceptions:
   - RTL_GIGA_MAC_VER_22 since b5ba6d12bdac21bc0620a5089e0f24e362645efd
 ("use RxFIFO overflow workaround for 8168c chipset.").
 This one should now be correctly handled.
   - RTL_GIGA_MAC_VER_11 (8168b) which requires a different Rx FIFO
 overflow processing.

   Though it does not conform to Realtek suggestion above, the updated
   driver includes no change for RTL_GIGA_MAC_VER_12 and RTL_GIGA_MAC_VER_17.
   Both are 8168b. RTL_GIGA_MAC_VER_12 is common and a bit old so I'd rather
   wait for experimental evidence that the change suggested by Realtek really
   helps or does not hurt in unexpected ways.

   Removed case statements in rtl8169_interrupt are only 8168 relevant.

3. RxFIFOOver is masked for post 8105e 810x chips, namely the sole 8105e
   (RTL_GIGA_MAC_VER_30) itself.

Signed-off-by: Francois Romieu 
Cc: hayeswang 
Signed-off-by: David S. Miller 
Reviewed-by: Jonathan Nieder 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/r8169.c |   54 ++++++++++++++++++++++++------------------------------
 1 file changed, 24 insertions(+), 30 deletions(-)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1088,17 +1088,21 @@ static u8 rtl8168d_efuse_read(void __iom
return value;
 }
 
-static void rtl8169_irq_mask_and_ack(void __iomem *ioaddr)
+static void rtl8169_irq_mask_and_ack(struct rtl8169_private *tp)
 {
-   RTL_W16(IntrMask, 0x0000);
+   void __iomem *ioaddr = tp->mmio_addr;
 
-   RTL_W16(IntrStatus, 0xffff);
+   RTL_W16(IntrMask, 0x0000);
+   RTL_W16(IntrStatus, tp->intr_event);
+   RTL_R8(ChipCmd);
 }
 
-static void rtl8169_asic_down(void __iomem *ioaddr)
+static void rtl8169_asic_down(struct rtl8169_private *tp)
 {
+   void __iomem *ioaddr = tp->mmio_addr;
+
RTL_W8(ChipCmd, 0x00);
-   rtl8169_irq_mask_and_ack(ioaddr);
+   rtl8169_irq_mask_and_ack(tp);
RTL_R16(CPlusCmd);
 }
 
@@ -3817,7 +3821,7 @@ static void rtl8169_hw_reset(struct rtl8
void __iomem *ioaddr = tp->mmio_addr;
 
/* Disable interrupts */
-   rtl8169_irq_mask_and_ack(ioaddr);
+   rtl8169_irq_mask_and_ack(tp);
 
if (tp->mac_version == RTL_GIGA_MAC_VER_27 ||
tp->mac_version == RTL_GIGA_MAC_VER_28 ||
@@ -4284,8 +4288,7 @@ static void rtl_hw_start_8168(struct net
RTL_W16(IntrMitigate, 0x5151);
 
/* Work around for RxFIFO overflow. */
-   if (tp->mac_version == RTL_GIGA_MAC_VER_11 ||
-   tp->mac_version == RTL_GIGA_MAC_VER_22) {
+   if (tp->mac_version == RTL_GIGA_MAC_VER_11) {
tp->intr_event |= RxFIFOOver | PCSTimeout;
tp->intr_event &= ~RxOverflow;
}
@@ -4467,6 +4470,11 @@ static void rtl_hw_start_8101(struct net
void __iomem *ioaddr = tp->mmio_addr;
struct pci_dev *pdev = tp->pci_dev;
 
+   if (tp->mac_version >= RTL_GIGA_MAC_VER_30) {
+   tp->intr_event &= ~RxFIFOOver;
+   tp->napi_event &= ~RxFIFOOver;
+   }
+
if (tp->mac_version == RTL_GIGA_MAC_VER_13 ||
tp->mac_version == RTL_GIGA_MAC_VER_16) {
int cap = tp->pcie_cap;
@@ -4738,7 +4746,7 @@ static void rtl8169_wait_for_quiescence(
/* Wait for any pending NAPI task to complete */
napi_disable(&tp->napi);
 
-   rtl8169_irq_mask_and_ack(ioaddr);
+   rtl8169_irq_mask_and_ack(tp);
 
tp->intr_mask = 0xffff;
RTL_W16(IntrMask, tp->intr_event);
@@ -5200,13 +5208,17 @@ static irqreturn_t rtl8169_interrupt(int
 */
status = RTL_R16(IntrStatus);
while (status && status != 0xffff) {
+   status &= tp->intr_event;
+   if (!status)
+   break;
+
handled = 1;
 
/* Handle all of the error cases first. These will reset
 * the chip, so just exit the loop.
 */
if (unlikely(!netif_running(dev))) {
-   rtl8169_asic_down(ioaddr);
+   rtl8169_asic_down(tp);
break;
}
 
@@ -5214,27 +5226,9 @@ static irqreturn_t rtl8169_interrupt(int
switch (tp->mac_version) {
/* Work around for 
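
Point 1 of the changelog (events outside the intr_event mask are ignored) can be sketched as a pure function. pending_events() is a hypothetical helper and the bit values are illustrative. An all-ones read is treated as "device not there", as in the loop condition above.

```c
#include <stdint.h>

/* Illustrative event bits. */
#define EV_RX_OK	0x0001u
#define EV_TX_OK	0x0004u
#define EV_RX_FIFO_OVER	0x0040u

/* Return only the events the driver has chosen to process. */
uint16_t pending_events(uint16_t raw_status, uint16_t intr_event)
{
	if (raw_status == 0xffff)	/* all ones: nothing to trust */
		return 0;
	return (uint16_t)(raw_status & intr_event);
}
```

With RxFIFOOver removed from the mask, a status word carrying only that bit yields no work at all, and one carrying it together with RxOK is handled as a plain receive event.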

[ 52/84] mm: thp: fix pmd_present for split_huge_page and PROT_NONE with THP

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Andrea Arcangeli 

commit 027ef6c87853b0a9df53175063028edb4950d476 upstream.

In many places !pmd_present has been converted to pmd_none.  For pmds
that's equivalent and pmd_none is quicker so using pmd_none is better.

However (unless we delete pmd_present) we should provide an accurate
pmd_present too.  This will avoid the risk of code thinking the pmd is non
present because it's under __split_huge_page_map, see the pmd_mknotpresent
there and the comment above it.

If the page has been mprotected as PROT_NONE, it would also lead to a
pmd_present false negative in the same way as the race with
split_huge_page.

Because the PSE bit stays on at all times (both during split_huge_page and
when the _PAGE_PROTNONE bit gets set), we could check for the PSE bit alone,
but checking the PROTNONE bit too is still a good reminder that pmd_present
must always take PROT_NONE into account.

This explains a non-reproducible BUG_ON that was rarely reported on the
lists.

The same issue is in pmd_large, it would go wrong with both PROT_NONE and
if it races with split_huge_page.

Signed-off-by: Andrea Arcangeli 
Acked-by: Rik van Riel 
Cc: Johannes Weiner 
Cc: Hugh Dickins 
Cc: Mel Gorman 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/x86/include/asm/pgtable.h |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -146,8 +146,7 @@ static inline unsigned long pmd_pfn(pmd_
 
 static inline int pmd_large(pmd_t pte)
 {
-   return (pmd_flags(pte) & (_PAGE_PSE | _PAGE_PRESENT)) ==
-   (_PAGE_PSE | _PAGE_PRESENT);
+   return pmd_flags(pte) & _PAGE_PSE;
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -415,7 +414,13 @@ static inline int pte_hidden(pte_t pte)
 
 static inline int pmd_present(pmd_t pmd)
 {
-   return pmd_flags(pmd) & _PAGE_PRESENT;
+   /*
+* Checking for _PAGE_PSE is needed too because
+* split_huge_page will temporarily clear the present bit (but
+* the _PAGE_PSE flag will remain set at all times while the
+* _PAGE_PRESENT bit is clear).
+*/
+   return pmd_flags(pmd) & (_PAGE_PRESENT | _PAGE_PROTNONE | _PAGE_PSE);
 }
 
 static inline int pmd_none(pmd_t pmd)
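
The effect of the widened flag test can be checked with a small userspace sketch. The flag values below mirror the usual x86 bit positions but are assumptions made for illustration, and the `sketch_` names are hypothetical; this is not the kernel header.

```c
/* Illustrative flag bits (assumed for this sketch, modelled on x86). */
#define SKETCH_PAGE_PRESENT  (1UL << 0)
#define SKETCH_PAGE_PSE      (1UL << 7)
#define SKETCH_PAGE_PROTNONE (1UL << 8)

/* Pre-patch test: only the present bit is considered. */
int sketch_pmd_present_old(unsigned long flags)
{
	return (flags & SKETCH_PAGE_PRESENT) != 0;
}

/* Post-patch test: present, PROT_NONE or PSE all count as present. */
int sketch_pmd_present_new(unsigned long flags)
{
	return (flags & (SKETCH_PAGE_PRESENT | SKETCH_PAGE_PROTNONE |
			 SKETCH_PAGE_PSE)) != 0;
}
```

A huge pmd caught in the middle of a split has PSE set and the present bit cleared, so the old test reports it as not present while the new one does not. An all-zero pmd is still reported as not present by both.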


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[ 53/84] ALSA: aloop - add locking to timer access

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Omair Mohammed Abdullah 

commit d4f1e48bd11e3df6a26811f7a1f06c4225d92f7d upstream.

When the loopback timer handler is running, calling del_timer() (for STOP
trigger) will not wait for the handler to complete before deactivating the
timer. The timer gets rescheduled in the handler as usual. Then a subsequent
START trigger will try to start the timer using add_timer() with a timer pending
leading to a kernel panic.

Serialize the calls to add_timer() and del_timer() using a spin lock to avoid
this.

Signed-off-by: Omair Mohammed Abdullah 
Signed-off-by: Vinod Koul 
Signed-off-by: Takashi Iwai 
Signed-off-by: Greg Kroah-Hartman 

---
 sound/drivers/aloop.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/sound/drivers/aloop.c
+++ b/sound/drivers/aloop.c
@@ -119,6 +119,7 @@ struct loopback_pcm {
unsigned int period_size_frac;
unsigned long last_jiffies;
struct timer_list timer;
+   spinlock_t timer_lock;
 };
 
 static struct platform_device *devices[SNDRV_CARDS];
@@ -169,6 +170,7 @@ static void loopback_timer_start(struct
unsigned long tick;
unsigned int rate_shift = get_rate_shift(dpcm);
 
+   spin_lock(&dpcm->timer_lock);
if (rate_shift != dpcm->pcm_rate_shift) {
dpcm->pcm_rate_shift = rate_shift;
dpcm->period_size_frac = frac_pos(dpcm, dpcm->pcm_period_size);
@@ -181,12 +183,15 @@ static void loopback_timer_start(struct
tick = (tick + dpcm->pcm_bps - 1) / dpcm->pcm_bps;
dpcm->timer.expires = jiffies + tick;
add_timer(&dpcm->timer);
+   spin_unlock(&dpcm->timer_lock);
 }
 
 static inline void loopback_timer_stop(struct loopback_pcm *dpcm)
 {
+   spin_lock(&dpcm->timer_lock);
del_timer(&dpcm->timer);
dpcm->timer.expires = 0;
+   spin_unlock(&dpcm->timer_lock);
 }
 
 #define CABLE_VALID_PLAYBACK   (1 << SNDRV_PCM_STREAM_PLAYBACK)
@@ -658,6 +663,7 @@ static int loopback_open(struct snd_pcm_
dpcm->substream = substream;
setup_timer(&dpcm->timer, loopback_timer_function,
(unsigned long)dpcm);
+   spin_lock_init(&dpcm->timer_lock);
 
cable = loopback->cables[substream->number][dev];
if (!cable) {
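
The serialization idea can be modelled outside the kernel. The sketch below is a hypothetical userspace stand-in, not the driver: the `sketch_` types, the `pending` flag and the C11 atomic spin lock are all assumptions, and unlike the patch it reports a double start instead of relying on the lock alone.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct timer_list. */
struct sketch_timer {
	bool pending;           /* true while the timer is queued */
	unsigned long expires;
};

struct sketch_pcm {
	struct sketch_timer timer;
	atomic_bool timer_lock; /* plays the role of spinlock_t timer_lock */
};

static void sketch_lock(atomic_bool *l)
{
	/* Spin until the previous value was false, i.e. we took the lock. */
	while (atomic_exchange_explicit(l, true, memory_order_acquire))
		;
}

static void sketch_unlock(atomic_bool *l)
{
	atomic_store_explicit(l, false, memory_order_release);
}

void sketch_pcm_init(struct sketch_pcm *p)
{
	p->timer.pending = false;
	p->timer.expires = 0;
	atomic_init(&p->timer_lock, false);
}

/* Returns false if the timer was already queued when start was requested. */
bool sketch_timer_start(struct sketch_pcm *p, unsigned long now,
			unsigned long tick)
{
	bool ok;

	sketch_lock(&p->timer_lock);
	ok = !p->timer.pending;
	if (ok) {
		p->timer.expires = now + tick;
		p->timer.pending = true;
	}
	sketch_unlock(&p->timer_lock);
	return ok;
}

void sketch_timer_stop(struct sketch_pcm *p)
{
	sketch_lock(&p->timer_lock);
	p->timer.pending = false;
	p->timer.expires = 0;
	sketch_unlock(&p->timer_lock);
}
```

Because start and stop both update the timer state under the same lock, neither can observe a half-updated timer from the other.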




[ 35/84] ipv6: mip6: fix mip6_mh_filter()

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Eric Dumazet 

[ Upstream commit 96af69ea2a83d292238bdba20e4508ee967cf8cb ]

mip6_mh_filter() should not modify its input, or else its caller
would need to recompute ipv6_hdr() if skb->head is reallocated.

Use skb_header_pointer() instead of pskb_may_pull()

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/ipv6/mip6.c |   20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

--- a/net/ipv6/mip6.c
+++ b/net/ipv6/mip6.c
@@ -84,28 +84,30 @@ static int mip6_mh_len(int type)
 
 static int mip6_mh_filter(struct sock *sk, struct sk_buff *skb)
 {
-   struct ip6_mh *mh;
+   struct ip6_mh _hdr;
+   const struct ip6_mh *mh;
 
-   if (!pskb_may_pull(skb, (skb_transport_offset(skb)) + 8) ||
-   !pskb_may_pull(skb, (skb_transport_offset(skb) +
-((skb_transport_header(skb)[1] + 1) << 3))))
+   mh = skb_header_pointer(skb, skb_transport_offset(skb),
+   sizeof(_hdr), &_hdr);
+   if (!mh)
return -1;
 
-   mh = (struct ip6_mh *)skb_transport_header(skb);
+   if (((mh->ip6mh_hdrlen + 1) << 3) > skb->len)
+   return -1;
 
if (mh->ip6mh_hdrlen < mip6_mh_len(mh->ip6mh_type)) {
LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH message too short: %d vs >=%d\n",
   mh->ip6mh_hdrlen, mip6_mh_len(mh->ip6mh_type));
-   mip6_param_prob(skb, 0, ((&mh->ip6mh_hdrlen) -
-skb_network_header(skb)));
+   mip6_param_prob(skb, 0, offsetof(struct ip6_mh, ip6mh_hdrlen) +
+   skb_network_header_len(skb));
return -1;
}
 
if (mh->ip6mh_proto != IPPROTO_NONE) {
LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH invalid payload proto = %d\n",
   mh->ip6mh_proto);
-   mip6_param_prob(skb, 0, ((&mh->ip6mh_proto) -
-skb_network_header(skb)));
+   mip6_param_prob(skb, 0, offsetof(struct ip6_mh, ip6mh_proto) +
+   skb_network_header_len(skb));
return -1;
}
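
The read-only access pattern the patch switches to can be sketched in plain C. This is a minimal model assuming a packet made of a linear head plus one fragment; `struct sketch_pkt` and `sketch_header_pointer()` are hypothetical names and only imitate the shape of skb_header_pointer(), not its implementation.

```c
#include <stddef.h>

/* Hypothetical packet: a linear head followed by one extra fragment. */
struct sketch_pkt {
	const unsigned char *head;
	size_t head_len;
	const unsigned char *frag;
	size_t frag_len;
};

/*
 * Return a pointer to len bytes starting at offset without modifying
 * the packet. Bytes that lie entirely in the linear head are returned
 * in place; otherwise they are copied into the caller's scratch buffer.
 * Returns NULL when the packet is too short.
 */
const void *sketch_header_pointer(const struct sketch_pkt *p, size_t offset,
				  size_t len, void *scratch)
{
	size_t total = p->head_len + p->frag_len;
	unsigned char *out = scratch;
	size_t i;

	if (offset > total || len > total - offset)
		return NULL;
	if (offset + len <= p->head_len)
		return p->head + offset;

	for (i = 0; i < len; i++) {
		size_t pos = offset + i;

		out[i] = pos < p->head_len ? p->head[pos]
					   : p->frag[pos - p->head_len];
	}
	return scratch;
}
```

Since the packet is never reallocated, pointers the caller computed earlier (such as the IPv6 header) stay valid, which is the property the commit message asks for.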
 




[ 34/84] ipv6: raw: fix icmpv6_filter()

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--


From: Eric Dumazet 

[ Upstream commit 1b05c4b50edbddbdde715c4a7350629819f6655e ]

icmpv6_filter() should not modify its input, or else its caller
would need to recompute ipv6_hdr() if skb->head is reallocated.

Use skb_header_pointer() instead of pskb_may_pull() and
change the prototype to make clear both sk and skb are const.

Also, if icmpv6 header cannot be found, do not deliver the packet,
as we do in IPv4.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/ipv6/raw.c |   21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -106,21 +106,20 @@ found:
  * 0 - deliver
  * 1 - block
  */
-static __inline__ int icmpv6_filter(struct sock *sk, struct sk_buff *skb)
+static int icmpv6_filter(const struct sock *sk, const struct sk_buff *skb)
 {
-   struct icmp6hdr *icmph;
-   struct raw6_sock *rp = raw6_sk(sk);
+   struct icmp6hdr *_hdr;
+   const struct icmp6hdr *hdr;
 
-   if (pskb_may_pull(skb, sizeof(struct icmp6hdr))) {
-   __u32 *data = &rp->filter.data[0];
-   int bit_nr;
+   hdr = skb_header_pointer(skb, skb_transport_offset(skb),
+sizeof(_hdr), &_hdr);
+   if (hdr) {
+   const __u32 *data = &raw6_sk(sk)->filter.data[0];
+   unsigned int type = hdr->icmp6_type;
 
-   icmph = (struct icmp6hdr *) skb->data;
-   bit_nr = icmph->icmp6_type;
-
-   return (data[bit_nr >> 5] & (1 << (bit_nr & 31))) != 0;
+   return (data[type >> 5] & (1U << (type & 31))) != 0;
}
-   return 0;
+   return 1;
 }
 
 #if defined(CONFIG_IPV6_MIP6) || defined(CONFIG_IPV6_MIP6_MODULE)
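
The bitmap lookup at the heart of the filter can be tried in isolation. The sketch below assumes a 256-bit filter stored as eight 32-bit words, one bit per ICMPv6 type; the `sketch_` names are hypothetical and the helper that sets a bit is added only so the lookup has something to test.

```c
#include <stdint.h>

/* One bit per ICMPv6 type: 256 types in 8 words of 32 bits. */
struct sketch_icmp6_filter {
	uint32_t data[8];
};

/* Mark a type as blocked. */
void sketch_filter_block(struct sketch_icmp6_filter *f, uint8_t type)
{
	f->data[type >> 5] |= 1U << (type & 31);
}

/* 1 = block, 0 = deliver, following the comment above the patched function. */
int sketch_filter_test(const struct sketch_icmp6_filter *f, uint8_t type)
{
	return (f->data[type >> 5] & (1U << (type & 31))) != 0;
}
```

The word index is `type >> 5` and the bit index is `type & 31`. Using `1U` keeps the shift well defined for bit 31, where shifting a signed `1` would overflow.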




[ 48/84] ext4: always set i_op in ext4_mknod()

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Bernd Schubert 

commit 6a08f447facb4f9e29fcc30fb68060bb5a0d21c2 upstream.

ext4_special_inode_operations have their own ifdef CONFIG_EXT4_FS_XATTR
to mask those methods. And ext4_iget also always sets it, so there is
an inconsistency.

Signed-off-by: Bernd Schubert 
Signed-off-by: "Theodore Ts'o" 
Signed-off-by: Greg Kroah-Hartman 

---
 fs/ext4/namei.c |    2 --
 1 file changed, 2 deletions(-)

--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1799,9 +1799,7 @@ retry:
err = PTR_ERR(inode);
if (!IS_ERR(inode)) {
init_special_inode(inode, inode->i_mode, rdev);
-#ifdef CONFIG_EXT4_FS_XATTR
inode->i_op = &ext4_special_inode_operations;
-#endif
err = ext4_add_nondir(handle, dentry, inode);
}
ext4_journal_stop(handle);




Re: [PATCH net-next? V2] pktgen: Use simpler test for non-zero ipv6 address

2012-10-10 Thread Cong Wang
On Thu, Oct 11, 2012 at 3:23 AM, Joe Perches  wrote:
> Reduces object size and should be slightly faster.
>
> allyesconfig:
> $ size net/core/pktgen.o*
>    text    data     bss     dec     hex filename
>   52284    4321   11840   68445   10b5d net/core/pktgen.o.new
>   52310    4293   11848   68451   10b63 net/core/pktgen.o.old
>
> Signed-off-by: Joe Perches 

Looks good.

This should go to -net, net-next is not open currently.

Thanks.


[ 56/84] drm/radeon: only adjust default clocks on NI GPUs

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Alex Deucher 

commit 2e3b3b105ab3bb5b6a37198da4f193cd13781d13 upstream.

SI asics store voltage information differently so we
don't have a way to deal with it properly yet.

Signed-off-by: Alex Deucher 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/gpu/drm/radeon/radeon_pm.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/radeon/radeon_pm.c
+++ b/drivers/gpu/drm/radeon/radeon_pm.c
@@ -535,7 +535,9 @@ void radeon_pm_suspend(struct radeon_dev
 void radeon_pm_resume(struct radeon_device *rdev)
 {
/* set up the default clocks if the MC ucode is loaded */
-   if (ASIC_IS_DCE5(rdev) && rdev->mc_fw) {
+   if ((rdev->family >= CHIP_BARTS) &&
+   (rdev->family <= CHIP_CAYMAN) &&
+   rdev->mc_fw) {
if (rdev->pm.default_vddc)
radeon_atom_set_voltage(rdev, rdev->pm.default_vddc,
SET_VOLTAGE_TYPE_ASIC_VDDC);
@@ -590,7 +592,9 @@ int radeon_pm_init(struct radeon_device
radeon_pm_print_states(rdev);
radeon_pm_init_profile(rdev);
/* set up the default clocks if the MC ucode is loaded */
-   if (ASIC_IS_DCE5(rdev) && rdev->mc_fw) {
+   if ((rdev->family >= CHIP_BARTS) &&
+   (rdev->family <= CHIP_CAYMAN) &&
+   rdev->mc_fw) {
if (rdev->pm.default_vddc)
radeon_atom_set_voltage(rdev, rdev->pm.default_vddc,
SET_VOLTAGE_TYPE_ASIC_VDDC);




[ 54/84] ALSA: usb - disable broken hw volume for Tenx TP6911

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: David Henningsson 

commit c10514394ef9e8de93a4ad8c8904d71dcd82c122 upstream.

While going through Ubuntu bugs, I discovered this patch being
posted and a confirmation that the patch works as expected.

Finding out how the hw volume really works would be preferable
to just disabling the broken one, but this would be better than
nothing.

Credit: sndfnsdfin (qawsnews)
BugLink: https://bugs.launchpad.net/bugs/559939
Signed-off-by: David Henningsson 
Signed-off-by: Takashi Iwai 
Signed-off-by: Greg Kroah-Hartman 

---
 sound/usb/mixer.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/sound/usb/mixer.c
+++ b/sound/usb/mixer.c
@@ -1246,6 +1246,13 @@ static int parse_audio_feature_unit(stru
/* disable non-functional volume control */
master_bits &= ~UAC_CONTROL_BIT(UAC_FU_VOLUME);
break;
+   case USB_ID(0x1130, 0xf211):
+   snd_printk(KERN_INFO
+  "usbmixer: volume control quirk for Tenx TP6911 Audio Headset\n");
+   /* disable non-functional volume control */
+   channels = 0;
+   break;
+
}
if (channels > 0)
first_ch_bits = snd_usb_combine_bytes(bmaControls + csize, csize);




[ 57/84] drm/radeon: Add MSI quirk for gateway RS690

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Alex Deucher 

commit 3a6d59df80897cc87812b6826d70085905bed013 upstream.

Fixes another system on:
https://bugs.freedesktop.org/show_bug.cgi?id=37679

Signed-off-by: Alex Deucher 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/gpu/drm/radeon/radeon_irq_kms.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/drivers/gpu/drm/radeon/radeon_irq_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_irq_kms.c
@@ -143,6 +143,12 @@ static bool radeon_msi_ok(struct radeon_
(rdev->pdev->subsystem_device == 0x01fd))
return true;
 
+   /* Gateway RS690 only seems to work with MSIs. */
+   if ((rdev->pdev->device == 0x791f) &&
+   (rdev->pdev->subsystem_vendor == 0x107b) &&
+   (rdev->pdev->subsystem_device == 0x0185))
+   return true;
+
/* RV515 seems to have MSI issues where it loses
 * MSI rearms occasionally. This leads to lockups and freezes.
 * disable it by default.




[ 59/84] rcu: Fix day-one dyntick-idle stall-warning bug

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: "Paul E. McKenney" 

commit a10d206ef1a83121ab7430cb196e0376a7145b22 upstream.

Each grace period is supposed to have at least one callback waiting
for that grace period to complete.  However, if CONFIG_NO_HZ=n, an
extra callback-free grace period is no big problem -- it will chew up
a tiny bit of CPU time, but it will complete normally.  In contrast,
CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to
sleep indefinitely, in turn indefinitely delaying completion of the
callback-free grace period.  Given that nothing is waiting on this grace
period, this is also not a problem.

That is, unless RCU CPU stall warnings are also enabled, as they are
in recent kernels.  In this case, if a CPU wakes up after at least one
minute of inactivity, an RCU CPU stall warning will result.  The reason
that no one noticed until quite recently is that most systems have enough
OS noise that they will never remain absolutely idle for a full minute.
But there are some embedded systems with cut-down userspace configurations
that consistently get into this situation.

All this begs the question of exactly how a callback-free grace period
gets started in the first place.  This can happen due to the fact that
CPUs do not necessarily agree on which grace period is in progress.
If a CPU still believes that the grace period that just completed is
still ongoing, it will believe that it has callbacks that need to wait for
another grace period, never mind the fact that the grace period that they
were waiting for just completed.  This CPU can therefore erroneously
decide to start a new grace period.  Note that this can happen in
TREE_RCU and TREE_PREEMPT_RCU even on a single-CPU system:  Deadlock
considerations mean that the CPU that detected the end of the grace
period is not necessarily officially informed of this fact for some time.

Once this CPU notices that the earlier grace period completed, it will
invoke its callbacks.  It then won't have any callbacks left.  If no
other CPU has any callbacks, we now have a callback-free grace period.

This commit therefore makes CPUs check more carefully before starting a
new grace period.  This new check relies on an array of tail pointers
into each CPU's list of callbacks.  If the CPU is up to date on which
grace periods have completed, it checks to see if any callbacks follow
the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
follow the RCU_WAIT_TAIL segment.  The reason that this works is that
the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
as soon as the CPU is officially notified that the old grace period
has ended.

This change is to cpu_needs_another_gp(), which is called in a number
of places.  The only one that really matters is in rcu_start_gp(), where
the root rcu_node structure's ->lock is held, which prevents any
other CPU from starting or completing a grace period, so that the
comparison that determines whether the CPU is missing the completion
of a grace period is stable.

Reported-by: Becky Bruce 
Reported-by: Subodh Nijsure 
Reported-by: Paul Walmsley 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
Tested-by: Paul Walmsley 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/rcutree.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -283,7 +283,9 @@ cpu_has_callbacks_ready_to_invoke(struct
 static int
 cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
 {
-   return *rdp->nxttail[RCU_DONE_TAIL] && !rcu_gp_in_progress(rsp);
+   return *rdp->nxttail[RCU_DONE_TAIL +
+ACCESS_ONCE(rsp->completed) != rdp->completed] &&
+  !rcu_gp_in_progress(rsp);
 }
 
 /*
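
The tail-pointer check described in the commit message can be modelled in userspace. The sketch below is a simplified model, not the kernel code: the `sk_` names, the one-step move of all queued callbacks into the wait segment, and passing the global completed counter as a plain argument are assumptions made for illustration. The comparison is parenthesized explicitly here.

```c
#include <stddef.h>

enum { SK_DONE_TAIL, SK_WAIT_TAIL, SK_NEXT_READY_TAIL, SK_NEXT_TAIL,
       SK_NEXT_SIZE };

struct sk_cb {
	struct sk_cb *next;
};

/* Per-CPU callback list split into segments by an array of tail pointers. */
struct sk_rdp {
	struct sk_cb *nxtlist;
	struct sk_cb **nxttail[SK_NEXT_SIZE];
	unsigned long completed; /* last grace period this CPU saw complete */
};

void sk_init(struct sk_rdp *rdp, unsigned long completed)
{
	int i;

	rdp->nxtlist = NULL;
	for (i = 0; i < SK_NEXT_SIZE; i++)
		rdp->nxttail[i] = &rdp->nxtlist;
	rdp->completed = completed;
}

/* Queue a new callback at the very end of the list. */
void sk_enqueue(struct sk_rdp *rdp, struct sk_cb *cb)
{
	cb->next = NULL;
	*rdp->nxttail[SK_NEXT_TAIL] = cb;
	rdp->nxttail[SK_NEXT_TAIL] = &cb->next;
}

/* Simplification: everything queued so far now waits for the current GP. */
void sk_move_all_to_wait(struct sk_rdp *rdp)
{
	rdp->nxttail[SK_WAIT_TAIL] = rdp->nxttail[SK_NEXT_TAIL];
	rdp->nxttail[SK_NEXT_READY_TAIL] = rdp->nxttail[SK_NEXT_TAIL];
}

/* Old check: any callback after the done segment asks for a new GP. */
int sk_needs_gp_old(const struct sk_rdp *rdp, int gp_in_progress)
{
	return *rdp->nxttail[SK_DONE_TAIL] != NULL && !gp_in_progress;
}

/* New check: a CPU with a stale view skips the wait segment as well. */
int sk_needs_gp_new(const struct sk_rdp *rdp, unsigned long rsp_completed,
		    int gp_in_progress)
{
	int stale = (rsp_completed != rdp->completed);

	return *rdp->nxttail[SK_DONE_TAIL + stale] != NULL && !gp_in_progress;
}
```

With one callback in the wait segment and a CPU that has not yet noticed the grace period ending, the old check requests a new grace period and the new check does not. Once the CPU is up to date, or a further callback is queued behind the wait segment, both checks request one.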




[ 79/84] CPU hotplug, cpusets, suspend: Don't modify cpusets during suspend/resume

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: "Srivatsa S. Bhat" 

commit d35be8bab9b0ce44bed4b9453f86ebf64062721e upstream.

In the event of CPU hotplug, the kernel modifies the cpusets' cpus_allowed
masks as and when necessary to ensure that the tasks belonging to the cpusets
have some place (online CPUs) to run on. And regular CPU hotplug is
destructive in the sense that the kernel doesn't remember the original cpuset
configurations set by the user, across hotplug operations.

However, suspend/resume (which uses CPU hotplug) is a special case in which
the kernel has the responsibility to restore the system (during resume), to
exactly the same state it was in before suspend.

In order to achieve that, do the following:

1. Don't modify cpusets during suspend/resume. At all.
   In particular, don't move the tasks from one cpuset to another, and
   don't modify any cpuset's cpus_allowed mask. So, simply ignore cpusets
   during the CPU hotplug operations that are carried out in the
   suspend/resume path.

2. However, cpusets and sched domains are related. We just want to avoid
   altering cpusets alone. So, to keep the sched domains updated, build
   a single sched domain (containing all active cpus) during each of the
   CPU hotplug operations carried out in s/r path, effectively ignoring
   the cpusets' cpus_allowed masks.

   (Since userspace is frozen while doing all this, it will go unnoticed.)

3. During the last CPU online operation during resume, build the sched
   domains by looking up the (unaltered) cpusets' cpus_allowed masks.
   That will bring back the system to the same original state as it was in
   before suspend.

Ultimately, this will not only solve the cpuset problem related to suspend
resume (ie., restores the cpusets to exactly what it was before suspend, by
not touching it at all) but also speeds up suspend/resume because we avoid
running cpuset update code for every CPU being offlined/onlined.

Signed-off-by: Srivatsa S. Bhat 
Signed-off-by: Peter Zijlstra 
Cc: Linus Torvalds 
Cc: Andrew Morton 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20120524141611.3692.20155.st...@srivatsabhat.in.ibm.com
Signed-off-by: Ingo Molnar 
Signed-off-by: Preeti U Murthy 
Signed-off-by: Greg Kroah-Hartman 


---
 kernel/cpuset.c |    3 +++
 kernel/sched.c  |   40 ++++++++++++++++++++++++++++++++++++----
 2 files changed, 39 insertions(+), 4 deletions(-)

--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -2080,6 +2080,9 @@ static void scan_for_empty_cpusets(struc
  * (of no affect) on systems that are actively using CPU hotplug
  * but making no active use of cpusets.
  *
+ * The only exception to this is suspend/resume, where we don't
+ * modify cpusets at all.
+ *
  * This routine ensures that top_cpuset.cpus_allowed tracks
  * cpu_active_mask on each CPU hotplug (cpuhp) event.
  *
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -,34 +,66 @@ int __init sched_create_sysfs_power_savi
 }
 #endif /* CONFIG_SCHED_MC || CONFIG_SCHED_SMT */
 
+static int num_cpus_frozen; /* used to mark begin/end of suspend/resume */
+
 /*
  * Update cpusets according to cpu_active mask.  If cpusets are
  * disabled, cpuset_update_active_cpus() becomes a simple wrapper
  * around partition_sched_domains().
+ *
+ * If we come here as part of a suspend/resume, don't touch cpusets because we
+ * want to restore it back to its original state upon resume anyway.
  */
 static int cpuset_cpu_active(struct notifier_block *nfb, unsigned long action,
 void *hcpu)
 {
-   switch (action & ~CPU_TASKS_FROZEN) {
+   switch (action) {
+   case CPU_ONLINE_FROZEN:
+   case CPU_DOWN_FAILED_FROZEN:
+
+   /*
+* num_cpus_frozen tracks how many CPUs are involved in suspend
+* resume sequence. As long as this is not the last online
+* operation in the resume sequence, just build a single sched
+* domain, ignoring cpusets.
+*/
+   num_cpus_frozen--;
+   if (likely(num_cpus_frozen)) {
+   partition_sched_domains(1, NULL, NULL);
+   break;
+   }
+
+   /*
+* This is the last CPU online operation. So fall through and
+* restore the original sched domains by considering the
+* cpuset configurations.
+*/
+
case CPU_ONLINE:
case CPU_DOWN_FAILED:
cpuset_update_active_cpus();
-   return NOTIFY_OK;
+   break;
default:
return NOTIFY_DONE;
}
+   return NOTIFY_OK;
 }
 
 static int cpuset_cpu_inactive(struct notifier_block *nfb, unsigned long action,
   void *hcpu)
 {
-   switch (action & ~CPU_TASKS_FROZEN) {
+   switch (action) {
case CPU_DOWN_PREPARE:
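
The counting scheme from steps 1 to 3 of the description can be sketched as a small state machine. The offline side of the diff is cut off in this posting, so the sketch assumes it increments the counter once per frozen offline; the `sk_`/`SK_` names and the result codes are hypothetical stand-ins for the notifier actions and for the calls to partition_sched_domains() and cpuset_update_active_cpus().

```c
enum sk_action {
	SK_CPU_ONLINE,
	SK_CPU_ONLINE_FROZEN,
	SK_CPU_DOWN_PREPARE,
	SK_CPU_DOWN_PREPARE_FROZEN,
	SK_CPU_OTHER
};

enum sk_result {
	SK_IGNORED,        /* event not handled here */
	SK_SINGLE_DOMAIN,  /* one sched domain built, cpusets untouched */
	SK_CPUSET_REBUILD  /* sched domains rebuilt from the cpusets */
};

struct sk_state {
	int num_cpus_frozen; /* CPUs taken down by the current suspend */
};

/* Assumed offline side: count each frozen offline, leave cpusets alone. */
enum sk_result sk_cpu_inactive(struct sk_state *s, enum sk_action action)
{
	switch (action) {
	case SK_CPU_DOWN_PREPARE:
		return SK_CPUSET_REBUILD;
	case SK_CPU_DOWN_PREPARE_FROZEN:
		s->num_cpus_frozen++;
		return SK_SINGLE_DOMAIN;
	default:
		return SK_IGNORED;
	}
}

/* Online side: only the last frozen online consults the cpusets again. */
enum sk_result sk_cpu_active(struct sk_state *s, enum sk_action action)
{
	switch (action) {
	case SK_CPU_ONLINE_FROZEN:
		s->num_cpus_frozen--;
		if (s->num_cpus_frozen)
			return SK_SINGLE_DOMAIN;
		/* fall through */
	case SK_CPU_ONLINE:
		return SK_CPUSET_REBUILD;
	default:
		return SK_IGNORED;
	}
}
```

Taking three CPUs down during suspend and bringing them back during resume yields a single-domain rebuild for every event except the final online, which restores the domains from the unmodified cpusets.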

[ 82/84] mtd: nand: Use the mirror BBT descriptor when reading its version

2012-10-10 Thread Greg Kroah-Hartman
3.0-stable review patch.  If anyone has any objections, please let me know.

--

From: Shmulik Ladkani 

commit 7bb9c75436212813b38700c34df4bbb6eb82debe upstream.

The code responsible for reading the version of the mirror bbt was
incorrectly using the descriptor of the main bbt.

Pass the mirror bbt descriptor to 'scan_read_raw' when reading the
version of the mirror bbt.

Signed-off-by: Shmulik Ladkani 
Acked-by: Sebastian Andrzej Siewior 
Signed-off-by: Artem Bityutskiy 
Signed-off-by: David Woodhouse 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/mtd/nand/nand_bbt.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/mtd/nand/nand_bbt.c
+++ b/drivers/mtd/nand/nand_bbt.c
@@ -429,7 +429,7 @@ static int read_abs_bbts(struct mtd_info
/* Read the mirror version, if available */
if (md && (md->options & NAND_BBT_VERSION)) {
scan_read_raw(mtd, buf, (loff_t)md->pages[0] << this->page_shift,
- mtd->writesize, td);
+ mtd->writesize, md);
md->version[0] = buf[bbt_get_ver_offs(mtd, md)];
printk(KERN_DEBUG "Bad block table at page %d, version 0x%02X\n",
   md->pages[0], md->version[0]);



