Re: snd-usb: delay: estimated 0, actual 352

2012-09-06 Thread Markus Trippelsdorf
On 2012.09.06 at 09:08 +0200, Daniel Mack wrote:
 On 06.09.2012 08:53, Markus Trippelsdorf wrote:
  On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote:
  At Thu, 06 Sep 2012 08:33:30 +0200,
  Daniel Mack wrote:
 
  On 06.09.2012 08:02, Markus Trippelsdorf wrote:
  On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote:
  
  Sound fixes for 3.6-rc5
 
  There is nothing scary here; it contains only small fixes for HD-audio and
  USB-audio:
  - EPSS regression fix and GPIO fix for HD-audio IDT codecs
  - A series of USB-audio regression fixes that are found since 3.5 kernel
 
  
  Daniel Mack (4):
ALSA: snd-usb: Fix URB cancellation at stream start
ALSA: snd-usb: restore delay information
   
  The commit fbcfbf5f above causes the following lines to be printed
  whenever I start a new song:
 
  Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which this
  patch (fbcfbf5f) brings back now.
 
  delay: estimated 0, actual 352
  delay: estimated 353, actual 705
 
  (44.1 * 8 = 352.8)
 
  This happens with a USB DAC that identifies itself as C-Media USB
  Headphone Set.
 
  And you didn't see these lines with 3.4?
 
  Maybe the difference of start condition?
 
  Markus, does the patch below fix anything?
  
  Unfortunately no.
  However reverting the following fixes the problem:
  
  commit 245baf983cc39524cce39c24d01b276e6e653c9e
  Author: Daniel Mack zon...@gmail.com
  Date:   Thu Aug 30 18:52:30 2012 +0200
  
  ALSA: snd-usb: fix calls to next_packet_size
  
 
 No, this one certainly fixes a problem and does the right thing by
 restoring the original code.
 
 If you hadn't stated that you didn't see the same effect with 3.4(!),
 before the refactoring done in 3.5, I would believe the device is simply
 slightly off in its feedback rate and the tighter delay code complains
 about it while compensating, just as it did before.
 
 Are there any more than these two lines? And is audio working at all? Is
 it distorted in any way?

There are only these two lines (printed whenever sound starts). Audio is
working just fine with no distortions.

I did see similar lines before when the system load was very high
(happened during make check when building glibc).

Here is what Pierre-Louis wrote in November 2011:

»This was supposed to be an informational message, I thought it was only
enabled for debug. Regular users don't really need to know.«

-- 
Markus
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
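
For reference, the arithmetic quoted in the thread above, (44.1 * 8 = 352.8), can be sketched in C. This is only an illustration and not code from the snd-usb driver: the helper name frames_for_ms() is hypothetical, and reading the 8 as eight milliseconds of audio at 44.1 kHz is an assumption.

```c
/*
 * Hypothetical helper, not part of snd-usb: number of whole audio
 * frames played in `ms` milliseconds at `rate_hz` frames per second.
 */
static unsigned int frames_for_ms(unsigned int rate_hz, unsigned int ms)
{
	/* Integer division truncates, so 352.8 becomes 352. */
	return (unsigned int)(((unsigned long long)rate_hz * ms) / 1000ULL);
}
```

Under that assumption, frames_for_ms(44100, 8) gives 352 and frames_for_ms(44100, 16) gives 705, which lines up with the two "actual" values in the reported log lines.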


[PATCH RESEND]mm/ia64: fix a node distance bug

2012-09-06 Thread wujianguo
From: Jianguo Wu wujian...@huawei.com

The ia64 architecture has the following definitions:
extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
#define node_distance(from,to) (numa_slit[(from) * num_online_nodes() + (to)])

num_online_nodes() is not a constant; its value changes after a node is
hot-removed or hot-added.

In practice, I found that the node distance is wrong after offlining
a node on an IA64 platform. For example, on a system with 4 nodes:
node distances:
node   0   1   2   3
  0:  10  21  21  32
  1:  21  10  32  21
  2:  21  32  10  21
  3:  32  21  21  10

linux-drf:/sys/devices/system/node/node0 # cat distance
10  21  21  32
linux-drf:/sys/devices/system/node/node1 # cat distance
21  10  32  21

After offlining node2:
linux-drf:/sys/devices/system/node/node0 # cat distance
10 21 32
linux-drf:/sys/devices/system/node/node1 # cat distance
32 21 32    (expected value is: 21  10  21)


Signed-off-by: Jianguo Wu wujian...@huawei.com
Signed-off-by: Jiang Liu jiang@huawei.com
---
 arch/ia64/include/asm/numa.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/ia64/include/asm/numa.h b/arch/ia64/include/asm/numa.h
index 6a8a27c..2e27ef1 100644
--- a/arch/ia64/include/asm/numa.h
+++ b/arch/ia64/include/asm/numa.h
@@ -59,7 +59,7 @@ extern struct node_cpuid_s node_cpuid[NR_CPUS];
  */

 extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
-#define node_distance(from,to) (numa_slit[(from) * num_online_nodes() + (to)])
+#define node_distance(from,to) (numa_slit[(from) * MAX_NUMNODES + (to)])

 extern int paddr_to_nid(unsigned long paddr);

-- 1.7.6.1 .
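
The indexing problem described in the patch above can be reproduced in a small user-space sketch. It is not the kernel code: SKETCH_MAX_NODES, sketch_slit and both helper functions are hypothetical names, and the table is assumed to have been filled with a row stride of 4, matching the 4-node example.

```c
/*
 * Illustrative only: a 4-node distance table laid out row by row.
 * SKETCH_MAX_NODES stands in for MAX_NUMNODES; it is not the
 * kernel's value.
 */
#define SKETCH_MAX_NODES 4

static const unsigned char sketch_slit[SKETCH_MAX_NODES * SKETCH_MAX_NODES] = {
	10, 21, 21, 32,
	21, 10, 32, 21,
	21, 32, 10, 21,
	32, 21, 21, 10,
};

/* Old macro's behaviour: the row stride follows the online node count. */
static unsigned char distance_variable_stride(int from, int to, int online_nodes)
{
	return sketch_slit[from * online_nodes + to];
}

/* Patched behaviour: the row stride is a compile-time constant. */
static unsigned char distance_fixed_stride(int from, int to)
{
	return sketch_slit[from * SKETCH_MAX_NODES + to];
}
```

With three nodes online (node2 removed), reading node1's distances to nodes 0, 1 and 3 through the variable stride yields 32, 21, 32, while the fixed stride yields the expected 21, 10, 21.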


Re: [PATCH v2 3/3] memory-hotplug: bug fix race between isolation and allocation

2012-09-06 Thread Yasuaki Ishimatsu
Hi Minchan,

2012/09/06 14:16, Minchan Kim wrote:
 As shown below, memory hotplug has a race between page isolation
 and page allocation, so it can hit the BUG_ON in __offline_isolated_pages.
 
        CPU A                           CPU B
 
 start_isolate_page_range
 set_migratetype_isolate
 spin_lock_irqsave(zone->lock)
 
                                 free_hot_cold_page(Page A)
                                 /* without zone->lock */
                                 migratetype = get_pageblock_migratetype(Page A);
                                 /*
                                  * Page could be moved into MIGRATE_MOVABLE
                                  * of per_cpu_pages
                                  */
                                 list_add_tail(&page->lru, &pcp->lists[migratetype]);
 
 set_pageblock_isolate
 move_freepages_block
 drain_all_pages
 
                                 /* Page A could be in MIGRATE_MOVABLE of free_list. */
 
 check_pages_isolated
 __test_page_isolated_in_pageblock
 /*
  * We can't catch freed page which
  * is free_list[MIGRATE_MOVABLE]
  */
 if (PageBuddy(page A))
         pfn += 1 << page_order(page A);
 
                                 /* So, Page A could be allocated */
 
 __offline_isolated_pages
 /*
  * BUG_ON hit or offline page
  * which is used by someone
  */
 BUG_ON(!PageBuddy(page A));
 
 This patch checks the page's migratetype in the freelist in
 __test_page_isolated_in_pageblock, so that function can now detect a page
 affected by the race above and fail the memory offlining.
 
 Signed-off-by: Minchan Kim minc...@kernel.org
 ---
   mm/page_isolation.c |5 -
   1 file changed, 4 insertions(+), 1 deletion(-)
 
 diff --git a/mm/page_isolation.c b/mm/page_isolation.c
 index 87a7929..7ba7405 100644
 --- a/mm/page_isolation.c
 +++ b/mm/page_isolation.c
 @@ -193,8 +193,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
                        continue;
                }
                page = pfn_to_page(pfn);
 -              if (PageBuddy(page))
 +              if (PageBuddy(page)) {
 +                      if (get_freepage_migratetype(page) != MIGRATE_ISOLATE)
 +                              break;
                        pfn += 1 << page_order(page);
 +              }

                else if (page_count(page) == 0 &&
                        get_freepage_migratetype(page) == MIGRATE_ISOLATE)

When this if statement runs, the page may be in use by someone.
In that case, page->index may hold some number. If that number is the same as
MIGRATE_ISOLATE, the code goes wrong.

Thanks,
Yasuaki Ishimatsu

   pfn += 1;
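
The loop that the quoted hunk modifies can be modelled in user space as follows. This is a rough sketch under stated assumptions, not the kernel function: struct sketch_page and its fields are hypothetical stand-ins for PageBuddy(), page_order(), page_count() and get_freepage_migratetype(), and no locking is modelled.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_migratetype { SKETCH_MOVABLE, SKETCH_ISOLATE };

/* Hypothetical stand-in for the per-page state the check looks at. */
struct sketch_page {
	bool buddy;                           /* models PageBuddy() */
	unsigned int order;                   /* models page_order() */
	int refcount;                         /* models page_count() */
	enum sketch_migratetype freepage_mt;  /* models get_freepage_migratetype() */
};

/* Returns true when every page in [pfn, end_pfn) looks isolated. */
static bool sketch_range_isolated(const struct sketch_page *pages,
				  size_t pfn, size_t end_pfn)
{
	while (pfn < end_pfn) {
		const struct sketch_page *page = &pages[pfn];

		if (page->buddy) {
			/* The added check: a free page on a non-isolate list fails. */
			if (page->freepage_mt != SKETCH_ISOLATE)
				break;
			pfn += (size_t)1 << page->order;
		} else if (page->refcount == 0 &&
			   page->freepage_mt == SKETCH_ISOLATE) {
			pfn += 1;
		} else {
			break;
		}
	}
	return pfn >= end_pfn;
}
```

In this model, a buddy page that still sits on a MOVABLE free list makes the range check fail instead of being skipped over, which is the behaviour the patch description asks for.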
 




Re: [PATCH] gpio: em: Use irq_data_get_irq_chip_data() at appropriate places

2012-09-06 Thread Linus Walleij
On Tue, Sep 4, 2012 at 3:58 PM, Axel Lin axel@gmail.com wrote:

 Then we can remove irq_to_priv() function.

 Signed-off-by: Axel Lin axel@gmail.com

Thanks, applied.

Yours,
Linus Walleij


[PATCH RESEND] arm/dts: AM33XX: Add SPI device tree data

2012-09-06 Thread Philip, Avinash
Add McSPI data node to AM33XX device tree file. The McSPI module (and
therefore the driver) is reused from OMAP4.

Signed-off-by: Philip, Avinash avinashphi...@ti.com
---
Resending the patch because the ARM and OMAP mailing lists were not copied.

:100644 100644 bb31bff... 6b469bd... M  arch/arm/boot/dts/am33xx.dtsi
 arch/arm/boot/dts/am33xx.dtsi |   25 +
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
index bb31bff..6b469bd 100644
--- a/arch/arm/boot/dts/am33xx.dtsi
+++ b/arch/arm/boot/dts/am33xx.dtsi
@@ -210,5 +210,30 @@
interrupt-parent = intc;
interrupts = 91;
};
+
+   spi0: spi@4803 {
+   compatible = ti,omap4-mcspi;
+   #address-cells = 1;
+   #size-cells = 0;
+   reg = 0x483 0x400;
+   interrupt-parent = intc;
+   interrupt = 65;
+   ti,spi-num-cs = 2;
+   ti,hwmods = spi0;
+   status = disabled;
+
+   };
+
+   spi1: spi@481a {
+   compatible = ti,omap4-mcspi;
+   #address-cells = 1;
+   #size-cells = 0;
+   reg = 0x481a 0x400;
+   interrupt-parent = intc;
+   interrupt = 125;
+   ti,spi-num-cs = 2;
+   ti,hwmods = spi1;
+   status = disabled;
+   };
};
 };
-- 
1.7.1



[PATCH] dma: ipu: Drop unused spinlock

2012-09-06 Thread Jean Delvare
I was checking why this spinlock was never initialized, but it turns
out it's not used anywhere, so we can drop it.

Signed-off-by: Jean Delvare kh...@linux-fr.org
Cc: Vinod Koul vinod.k...@intel.com
Cc: Dan Williams d...@fb.com
---
I can't even build-test this.

 drivers/dma/ipu/ipu_irq.c |1 -
 1 file changed, 1 deletion(-)

--- linux-3.6-rc4.orig/drivers/dma/ipu/ipu_irq.c2012-08-04 
21:49:26.0 +0200
+++ linux-3.6-rc4/drivers/dma/ipu/ipu_irq.c 2012-09-06 09:13:31.034228670 
+0200
@@ -45,7 +45,6 @@ static void ipu_write_reg(struct ipu *ip
 struct ipu_irq_bank {
unsigned intcontrol;
unsigned intstatus;
-   spinlock_t  lock;
struct ipu  *ipu;
 };
 


-- 
Jean Delvare


Re: [PATCH -v3 14/14] x86, mm: Map ISA area with connected ram range at the same time

2012-09-06 Thread Pekka Enberg
On Wed, Sep 5, 2012 at 1:02 AM, Pekka Enberg penb...@kernel.org wrote:
  How significant is the speed gain? The isa_done flag makes code flow
  more difficult to follow.

On Wed, 5 Sep 2012, Yinghai Lu wrote:
 Not really much.
 
 when booting system:
 memmap=16m$128m memmap=16m$512m memmap=16m$256m memmap=16m$768m 
 memmap=16m$1024m
 
 with the patch
 [0.00] init_memory_mapping: [mem 0x-0x07ff]
 [0.00]  [mem 0x-0x07ff] page 2M
 [0.00] init_memory_mapping: [mem 0x0900-0x0fff]
 [0.00]  [mem 0x0900-0x0fff] page 2M
 [0.00] init_memory_mapping: [mem 0x1100-0x1fff]
 [0.00]  [mem 0x1100-0x1fff] page 2M
 [0.00] init_memory_mapping: [mem 0x2100-0x2fff]
 [0.00]  [mem 0x2100-0x2fff] page 2M
 [0.00] init_memory_mapping: [mem 0x3100-0x3fff]
 [0.00]  [mem 0x3100-0x3fff] page 2M
 [0.00] init_memory_mapping: [mem 0x4100-0x7fffdfff]
 [0.00]  [mem 0x4100-0x7fdf] page 2M
 [0.00]  [mem 0x7fe0-0x7fffdfff] page 4k
 
 otherwise will have
 
 [0.00] init_memory_mapping: [mem 0x-0x000f]
 [0.00]  [mem 0x-0x000f] page 4k
 [0.00] init_memory_mapping: [mem 0x0010-0x07ff]
 [0.00]  [mem 0x0010-0x001f] page 4k
 [0.00]  [mem 0x0020-0x07ff] page 2M
 [0.00] init_memory_mapping: [mem 0x0900-0x0fff]
 [0.00]  [mem 0x0900-0x0fff] page 2M
 [0.00] init_memory_mapping: [mem 0x1100-0x1fff]
 [0.00]  [mem 0x1100-0x1fff] page 2M
 [0.00] init_memory_mapping: [mem 0x2100-0x2fff]
 [0.00]  [mem 0x2100-0x2fff] page 2M
 [0.00] init_memory_mapping: [mem 0x3100-0x3fff]
 [0.00]  [mem 0x3100-0x3fff] page 2M
 [0.00] init_memory_mapping: [mem 0x4100-0x7fffdfff]
 [0.00]  [mem 0x4100-0x7fdf] page 2M
 [0.00]  [mem 0x7fe0-0x7fffdfff] page 4k

OK. Is there any other reason than performance to do this?

Pekka


Re: [PATCH 1/3] w1: mxc_w1: Adapt the clock name to the new clock framework

2012-09-06 Thread Sascha Hauer
Hi Fabio,

On Wed, Sep 05, 2012 at 07:01:18PM -0300, Fabio Estevam wrote:
 From: Fabio Estevam fabio.este...@freescale.com
 
 With the new i.mx clock framework the mxc_w1 clock is registered as:
 
 clk_register_clkdev(clk[owire_gate], NULL, "mxc_w1.0");
 
 So we do not need to pass the "owire" string and can use NULL instead.
 
 Signed-off-by: Fabio Estevam fabio.este...@freescale.com
 ---
  drivers/w1/masters/mxc_w1.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/drivers/w1/masters/mxc_w1.c b/drivers/w1/masters/mxc_w1.c
 index 1cc61a7..14f0f66 100644
 --- a/drivers/w1/masters/mxc_w1.c
 +++ b/drivers/w1/masters/mxc_w1.c
 @@ -117,7 +117,7 @@ static int __devinit mxc_w1_probe(struct platform_device *pdev)
        if (!mdev)
                return -ENOMEM;
  
 -      mdev->clk = clk_get(&pdev->dev, "owire");
 +      mdev->clk = clk_get(&pdev->dev, NULL);
        if (!mdev->clk) {

You can sell this patch better if you fix the wrong error check here and
'by the way' adjust the lookup string.
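
The "wrong error check" refers to testing the clk_get() result against NULL, although clk_get() reports failure through an error-encoded pointer that is meant to be tested with IS_ERR(). The convention can be modelled in user space roughly as below. The sketch_ names are hypothetical and this is not the kernel's err.h, only an illustration of why a NULL test misses such a failure.

```c
#include <errno.h>
#include <stdint.h>

/* Errno values are assumed to fit in the top 4095 addresses. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True when the pointer falls in the range reserved for error codes. */
static inline int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

An error pointer built this way is never NULL, so a check of the form `if (!ptr)` lets the failure through, while `sketch_is_err(ptr)` catches it.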

Sascha

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |


linux-next: Tree for Sept 6

2012-09-06 Thread Stephen Rothwell
Hi all,

Changes since 20120905:

New tree: arm64

The powerpc tree gained a build failure for which I reverted 3 commits.

The net-next tree lost its build failure.

The trivial tree gained a conflict against the powerpc tree.

The spi-mb tree gained a build failure so I used the version from
next-20120905.

The driver-core tree gained a build failure (from an interaction with the
workqueues tree) for which I applied a merge fix patch.

The tty tree gained a build failure for which I applied a patch.

The staging tree lost its build failure.

The arm-soc tree gained a conflict against the usb tree.



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use git pull
to do so as that will try to merge the new linux-next release with the
old one.  You should use git fetch as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc,
sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 196 trees (counting Linus' and 26 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (5b716ac Merge branch 'for-next' of 
git://git.samba.org/sfrench/cifs-2.6)
Merging fixes/master (9023a40 Merge tag 'mmc-fixes-for-3.5-rc4' of 
git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
Merging kbuild-current/rc-fixes (6c7080a firmware: fix directory creation rule 
matching with make 3.82)
Merging arm-current/fixes (36418c5 ARM: 7499/1: mm: Fix vmalloc overlap check 
for !HIGHMEM)
Merging m68k-current/for-linus (3be7184 m68k: Add missing RCU idle APIs on idle 
loop)
Merging powerpc-merge/merge (636802e powerpc: Don't use __put_user() in 
patch_instruction)
Merging sparc/master (6dab7ed Merge branch 'fixes' of 
git://git.linaro.org/people/rmk/linux-arm)
Merging net/master (d90c92f ibmveth: Fix alignment of rx queue bug)
Merging sound-current/for-linus (2e4a263 ALSA: snd-usb: fix cross-interface 
streaming devices)
Merging pci-current/for-linus (0ff9514 PCI: Don't print anything while decoding 
is disabled)
Merging wireless/master (f107238 libertas sdio: fix suspend when interface is 
down)
Merging driver-core.current/driver-core-linus (fea7a08 Linux 3.6-rc3)
Merging tty.current/tty-linus (7be0670 tty: serial: imx: don't reinit clock in 
imx_setup_ufcr())
Merging usb.current/usb-linus (92fc7a8 USB: add device quirk for Joss Optical 
touchboard)
Merging staging.current/staging-linus (6d7d979 staging: zcache: fix cleancache 
race condition with shrinker)
Merging char-misc.current/char-misc-linus (fea7a08 Linux 3.6-rc3)
Merging input-current/for-linus (6f4d038 Input: wacom - add support for EMR on 
Cintiq 24HD touch)
Merging md-current/for-linus (58e94ae md/raid1: close some possible races on 
write errors during resync)
Merging audit-current/for-linus (c158a35 audit: no leading space in 
audit_log_d_path prefix)
Merging crypto-current/master (ce026cb crypto: caam - fix possible deadlock 
condition)
Merging ide/master (9974e43 ide: fix generic_ide_suspend/resume Oops)
Merging dwmw2/master (244dc4e Merge 
git://git.infradead.org/users/dwmw2/random-2.6)
Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to 
inline functions)
Merging irqdomain-current/irqdomain/merge (15e06bf irqdomain: Fix debugfs 
formatting)
Merging devicetree-current/devicetree/merge (4e8383b of: release node fix for 
of_parse_phandle_with_args)
Merging spi-current/spi/merge (d1c185b of/spi: Fix SPI module loading by using 
proper spi: modalias prefixes.)
Merging gpio-current/gpio/merge 

Re: [PATCH] Chinese translation of Documentation/gpio.txt

2012-09-06 Thread Linus Walleij
2012/9/5 Dong Aisheng b29...@freescale.com:

Thanks for your help, Dong. Wei, can you please check Dong's
comments and submit a version with his ACK? I'll then apply it.

Yours,
Linus Walleij


Re: [PATCH] gpio: sx150x: Use irq_data_get_irq_chip_data() at appropriate places

2012-09-06 Thread Linus Walleij
On Tue, Sep 4, 2012 at 4:06 PM, Axel Lin axel@gmail.com wrote:

 Signed-off-by: Axel Lin axel@gmail.com

Thanks, applied!

Yours,
Linus Walleij


Re: [RFC v9 PATCH 20/21] memory-hotplug: clear hwpoisoned flag when onlining pages

2012-09-06 Thread andywu106建国
2012/9/5 we...@cn.fujitsu.com

 From: Wen Congyang we...@cn.fujitsu.com

 hwpoisoned may be set when we offline a page through the sysfs interface
 /sys/devices/system/memory/soft_offline_page or
 /sys/devices/system/memory/hard_offline_page. If we don't clear
 this flag when onlining pages, the page can't be freed and will
 not be in the free list, so we can't offline these pages again. We
 should therefore clear this flag when onlining pages.

 CC: David Rientjes rient...@google.com
 CC: Jiang Liu liu...@gmail.com
 CC: Len Brown len.br...@intel.com
 CC: Benjamin Herrenschmidt b...@kernel.crashing.org
 CC: Paul Mackerras pau...@samba.org
 CC: Christoph Lameter c...@linux.com
 Cc: Minchan Kim minchan@gmail.com
 CC: Andrew Morton a...@linux-foundation.org
 CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
 CC: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
 Signed-off-by: Wen Congyang we...@cn.fujitsu.com
 ---
  mm/memory_hotplug.c |5 +
  1 files changed, 5 insertions(+), 0 deletions(-)

 diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
 index 270c249..140c080 100644
 --- a/mm/memory_hotplug.c
 +++ b/mm/memory_hotplug.c
 @@ -661,6 +661,11 @@ EXPORT_SYMBOL_GPL(__online_page_increment_counters);

  void __online_page_free(struct page *page)
  {
 +#ifdef CONFIG_MEMORY_FAILURE
 +   /* The page may be marked HWPoisoned by soft/hard offline page */
 +   ClearPageHWPoison(page);

Hi Congyang,
I think you should decrease the mce_bad_pages counter here:
atomic_long_sub(1, &mce_bad_pages);


 +#endif
 +
 ClearPageReserved(page);
 init_page_count(page);
 __free_page(page);
 --
 1.7.1

 --
 To unsubscribe, send a message with 'unsubscribe linux-mm' in
 the body to majord...@kvack.org.  For more info on Linux MM,
 see: http://www.linux-mm.org/ .
 Don't email: a href=mailto:d...@kvack.org; em...@kvack.org /a
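
The suggestion made in the reply above, to keep the bad-page counter in step with the cleared flag, might look roughly like this in a user-space model using C11 atomics. All names here are hypothetical. Decrementing only when the flag was actually set (a test-and-clear) is an assumption of this sketch; the thread itself only proposes the decrement.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page with a hardware-poison flag. */
struct sketch_page {
	atomic_bool hwpoison;
};

/* Hypothetical stand-in for the mce_bad_pages counter. */
static atomic_long sketch_bad_pages;

/* Clear the poison flag when onlining; adjust the counter only if it was set. */
static void sketch_online_page(struct sketch_page *page)
{
	if (atomic_exchange(&page->hwpoison, false))
		atomic_fetch_sub(&sketch_bad_pages, 1);
}
```

Using an exchange rather than an unconditional clear keeps the counter from going negative when a page that was never poisoned is onlined.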


Re: [PATCH] OMAP GPIO - don't wake from suspend unless requested.

2012-09-06 Thread Shilimkar, Santosh
On Thu, Sep 6, 2012 at 12:32 PM, NeilBrown ne...@suse.de wrote:
 On Thu, 6 Sep 2012 11:18:09 +0530 Shilimkar, Santosh
 santosh.shilim...@ti.com wrote:

 On Thu, Sep 6, 2012 at 8:35 AM, NeilBrown ne...@suse.de wrote:
  On Mon, 3 Sep 2012 22:59:06 -0700 Shilimkar, Santosh
  santosh.shilim...@ti.com wrote:

  After thinking a bit more on this, the problem seems to come
  mainly from the gpio device being runtime suspended a bit earlier than
  it should be. A similar issue is seen with the i2c driver as well. The i2c issue
  was discussed with Rafael at LPC last week. The idea is to move
  the pm_runtime_enable/disable() calls entirely up to the
  _late/_early stage of device suspend/resume.
  Will update this thread once I have a further update.
 
  This won't be late enough.  IRQCHIP_MASK_ON_SUSPEND takes effect after all
  the _late callbacks have been called.
  I, too, spoke to Rafael about this in San Diego.  He seemed to agree with me
  that the interrupt needs to be masked in the ->suspend callback.  Any later
  is too late.
 
 Thanks for information about your discussion. Will wait for the patch then.

 Regards
 santosh

 I already sent a patch - that is what started this thread :-)

 I include it below.
 You said "The patch doesn't seems to be correct" but didn't expand on why.
 Do you still think it is not correct?  I wouldn't be surprised if there is
 some case that it doesn't handle quite right, but it seems right to me.

Sorry, I thought you were talking about moving the "Checking wakeup interrupts"
bit earlier, as discussed in the follow-up on the alternate suggested patch to make
use of IRQCHIP_MASK_ON_SUSPEND.

I think we need to fix the issue seen with the 'IRQCHIP_MASK_ON_SUSPEND'
patch. That is at least the expected way to manage suspend and wakeup
irq masks for an IRQCHIP.

Regards
Santosh


Re: [PATCH v2 3/3] memory-hotplug: bug fix race between isolation and allocation

2012-09-06 Thread Minchan Kim
Hello Yasuaki,

On Thu, Sep 06, 2012 at 04:17:54PM +0900, Yasuaki Ishimatsu wrote:
 Hi Minchan,
 
 2012/09/06 14:16, Minchan Kim wrote:
  As shown below, memory hotplug has a race between page isolation
  and page allocation, so it can hit the BUG_ON in __offline_isolated_pages.
  
         CPU A                           CPU B
  
  start_isolate_page_range
  set_migratetype_isolate
  spin_lock_irqsave(zone->lock)
  
                                  free_hot_cold_page(Page A)
                                  /* without zone->lock */
                                  migratetype = get_pageblock_migratetype(Page A);
                                  /*
                                   * Page could be moved into MIGRATE_MOVABLE
                                   * of per_cpu_pages
                                   */
                                  list_add_tail(&page->lru, &pcp->lists[migratetype]);
  
  set_pageblock_isolate
  move_freepages_block
  drain_all_pages
  
                                  /* Page A could be in MIGRATE_MOVABLE of free_list. */
  
  check_pages_isolated
  __test_page_isolated_in_pageblock
  /*
   * We can't catch freed page which
   * is free_list[MIGRATE_MOVABLE]
   */
  if (PageBuddy(page A))
          pfn += 1 << page_order(page A);
  
                                  /* So, Page A could be allocated */
  
  __offline_isolated_pages
  /*
   * BUG_ON hit or offline page
   * which is used by someone
   */
  BUG_ON(!PageBuddy(page A));
  
  This patch checks the page's migratetype in the freelist in
  __test_page_isolated_in_pageblock, so that function can now detect a page
  affected by the race above and fail the memory offlining.
  
  Signed-off-by: Minchan Kim minc...@kernel.org
  ---
mm/page_isolation.c |5 -
1 file changed, 4 insertions(+), 1 deletion(-)
  
  diff --git a/mm/page_isolation.c b/mm/page_isolation.c
  index 87a7929..7ba7405 100644
  --- a/mm/page_isolation.c
  +++ b/mm/page_isolation.c
  @@ -193,8 +193,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
                         continue;
                 }
                 page = pfn_to_page(pfn);
  -              if (PageBuddy(page))
  +              if (PageBuddy(page)) {
  +                      if (get_freepage_migratetype(page) != MIGRATE_ISOLATE)
  +                              break;
                         pfn += 1 << page_order(page);
  +              }
 
                 else if (page_count(page) == 0 &&
                         get_freepage_migratetype(page) == MIGRATE_ISOLATE)
 
 When this if statement runs, the page may be in use by someone.

I can't understand your point.
We already hold zone->lock, so the allocator and this function should be atomic
when the page is in free_list.
If I am missing something, could you elaborate?

 In that case, page->index may hold some number. If that number is the same as
 MIGRATE_ISOLATE, the code goes wrong.
 
 Thanks,
 Yasuaki Ishimatsu
 
  pfn += 1;
  
 
 

-- 
Kind regards,
Minchan Kim


[PATCH] JFS: use list_move instead of list_del/list_add

2012-09-06 Thread Wei Yongjun
From: Wei Yongjun yongjun_...@trendmicro.com.cn

Use list_move() instead of list_del() + list_add().

spatch with a semantic match was used to find this problem.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun yongjun_...@trendmicro.com.cn
---
 fs/jfs/jfs_txnmgr.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/jfs/jfs_txnmgr.c b/fs/jfs/jfs_txnmgr.c
index bb8b661..5fcc02e 100644
--- a/fs/jfs/jfs_txnmgr.c
+++ b/fs/jfs/jfs_txnmgr.c
@@ -2977,12 +2977,9 @@ int jfs_sync(void *arg)
 * put back on the anon_list.
 */
 
-   /* Take off anon_list */
-   list_del(&jfs_ip->anon_inode_list);
-
-   /* Put on anon_list2 */
-   list_add(&jfs_ip->anon_inode_list,
-            &TxAnchor.anon_list2);
+   /* Move from anon_list to anon_list2 */
+   list_move(&jfs_ip->anon_inode_list,
+             &TxAnchor.anon_list2);
 
TXN_UNLOCK();
iput(ip);
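
The equivalence the patch relies on, that a move is a delete followed by an add, can be shown with a minimal user-space doubly linked list. This is a sketch only: the sketch_list type and helpers are hypothetical and merely imitate the shape of the kernel's circular list API.

```c
/* Hypothetical circular doubly linked list node, in the style of list_head. */
struct sketch_list {
	struct sketch_list *next, *prev;
};

/* An empty list is a head that points at itself. */
static void sketch_list_init(struct sketch_list *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert `entry` right after `head`. */
static void sketch_list_add(struct sketch_list *entry, struct sketch_list *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink `entry` from whatever list it is on. */
static void sketch_list_del(struct sketch_list *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Move `entry` from its current list to right after `head`. */
static void sketch_list_move(struct sketch_list *entry, struct sketch_list *head)
{
	sketch_list_del(entry);
	sketch_list_add(entry, head);
}
```

A single move call states the intent directly, which is the readability gain the patch is after; the resulting list state is the same as with the two separate calls.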



Re: snd-usb: delay: estimated 0, actual 352

2012-09-06 Thread Takashi Iwai
At Thu, 6 Sep 2012 09:17:57 +0200,
Markus Trippelsdorf wrote:
 
 On 2012.09.06 at 09:08 +0200, Daniel Mack wrote:
  On 06.09.2012 08:53, Markus Trippelsdorf wrote:
   On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote:
   At Thu, 06 Sep 2012 08:33:30 +0200,
   Daniel Mack wrote:
  
   On 06.09.2012 08:02, Markus Trippelsdorf wrote:
   On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote:
   
   Sound fixes for 3.6-rc5
  
   There is nothing scary here; it contains only small fixes for HD-audio and
   USB-audio:
   - EPSS regression fix and GPIO fix for HD-audio IDT codecs
   - A series of USB-audio regression fixes that are found since 3.5 
   kernel
  
   
   Daniel Mack (4):
 ALSA: snd-usb: Fix URB cancellation at stream start
 ALSA: snd-usb: restore delay information
    
   The commit fbcfbf5f above causes the following lines to be printed
   whenever I start a new song:
  
   Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which this
   patch (fbcfbf5f) brings back now.
  
   delay: estimated 0, actual 352
   delay: estimated 353, actual 705
  
   (44.1 * 8 = 352.8)
  
   This happens with a USB DAC that identifies itself as C-Media USB
   Headphone Set.
  
   And you didn't see these lines with 3.4?
  
   Maybe the difference of start condition?
  
   Markus, does the patch below fix anything?
   
   Unfortunately no.
   However reverting the following fixes the problem:
   
   commit 245baf983cc39524cce39c24d01b276e6e653c9e
   Author: Daniel Mack zon...@gmail.com
   Date:   Thu Aug 30 18:52:30 2012 +0200
   
   ALSA: snd-usb: fix calls to next_packet_size
   
  
  No, this one certainly fixes a problem and does the right thing by
  restoring the original code.
  
  If you hadn't stated that you didn't see the same effect with 3.4(!),
  before the refactoring done in 3.5, I would believe the device is simply
  slightly off in its feedback rate and the tighter delay code complains
  about it while compensating, just as it did before.
  
  Are there any more than these two lines? And is audio working at all? Is
  it distorted in any way?
 
 There are only these two lines (printed whenever sound starts). Audio is
 working just fine with no distortions.
 
 I did see similar lines before when the system load was very high
 (happened during make check when building glibc).
 
 Here is what Pierre-Louis wrote in November 2011:
 
 »This was supposed to be an informational message, I thought it was only
 enabled for debug. Regular users don't really need to know.«

I guess the problem is that the new endpoint scheme doesn't count the
last_delay update unless the stream is triggered.  In the old code,
retire_playback_urb() is always called, even before the trigger(START) is
set.  And there, retire_playback_urb() does nothing but update the
delay information.

In the new code, retire_playback_urb is set only at
snd_usb_substream_playback_trigger().  Thus, at the very first shot,
the delay accounting gets confused.


Takashi


Re: [Patch 0/1]drm_irq: Introducing the irq_thread support

2012-09-06 Thread Daniel Vetter
On Thu, Sep 06, 2012 at 12:42:05AM +, Liu, Chuansheng wrote:
  This possibly ought to be submitted in parallel with the code that uses it so that
  the whole proposal can be evaluated as one thing ?
  
  Alan
 
 Patch is here, thanks.
 
 From: liu chuansheng chuansheng@intel.com
 Subject: [PATCH] drm_irq: Introducing the irq_thread support
 
 For some GPUs, the irq handler needs 1ms to handle the irq action,
 and that delays irq handling for the whole system.
 
 This patch adds irq thread support, which makes the drm_irq
 interface more flexible.
 
 The changes include:
 1/ Change request_irq to request_threaded_irq, and add the new callback
    irq_handler_t;
 2/ Normally we need IRQF_ONESHOT flag support for an irq thread, so add
    this option in drm_irq;
 
 Cc: Shi Yang yang.a@intel.com
 Signed-off-by: liu chuansheng chuansheng@intel.com

Nacked-by: Daniel Vetter daniel.vet...@ffwll.ch

I _really_ hate when we add random special cases for random strange
drivers to core code - the usual end result is that in a few years we'll
have a maze of special-cases only used by one driver each. And nope,
cleaning that up isn't _that_ much fun ...

So just do all this in your own driver's code (and maybe set
dev->irq_enabled yourself so that wait_vblank still works).

Yours, Daniel

 ---
  drivers/gpu/drm/drm_irq.c |8 ++--
  include/drm/drmP.h|2 ++
  2 files changed, 8 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
 index 03f16f3..bc105fe 100644
 --- a/drivers/gpu/drm/drm_irq.c
 +++ b/drivers/gpu/drm/drm_irq.c
 @@ -345,13 +345,17 @@ int drm_irq_install(struct drm_device *dev)
 if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
 sh_flags = IRQF_SHARED;
  
 +   if (drm_core_check_feature(dev, DRIVER_IRQ_ONESHOT))
 +   sh_flags |= IRQF_ONESHOT;
 +
 if (dev->devname)
 irqname = dev->devname;
 else
 irqname = dev->driver->name;
  
 -   ret = request_irq(drm_dev_to_irq(dev), dev->driver->irq_handler,
 - sh_flags, irqname, dev);
 +   ret = request_threaded_irq(drm_dev_to_irq(dev),
 +   dev->driver->irq_handler, dev->driver->irq_handler_t,
 +   sh_flags, irqname, dev);
  
 if (ret < 0) {
 mutex_lock(&dev->struct_mutex);
 diff --git a/include/drm/drmP.h b/include/drm/drmP.h
 index d6b67bb..b28be4b 100644
 --- a/include/drm/drmP.h
 +++ b/include/drm/drmP.h
 @@ -152,6 +152,7 @@ int drm_err(const char *func, const char *format, ...);
  #define DRIVER_GEM 0x1000
  #define DRIVER_MODESET 0x2000
  #define DRIVER_PRIME   0x4000
 +#define DRIVER_IRQ_ONESHOT 0x8000
  
  #define DRIVER_BUS_PCI 0x1
  #define DRIVER_BUS_PLATFORM 0x2
 @@ -872,6 +873,7 @@ struct drm_driver {
 /* these have to be filled in */
  
 irqreturn_t(*irq_handler) (DRM_IRQ_ARGS);
 +   irqreturn_t(*irq_handler_t) (DRM_IRQ_ARGS);
 void (*irq_preinstall) (struct drm_device *dev);
 int (*irq_postinstall) (struct drm_device *dev);
 void (*irq_uninstall) (struct drm_device *dev);
 -- 
 1.7.0.4

-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


[PATCH] gpio-ich: Add missing spinlock init

2012-09-06 Thread Jean Delvare
As reported by CONFIG_DEBUG_SPINLOCK=y.

Signed-off-by: Jean Delvare kh...@linux-fr.org
Cc: Peter Tyser pty...@xes-inc.com
Cc: Grant Likely grant.lik...@secretlab.ca
Cc: Linus Walleij linus.wall...@linaro.org
Cc: sta...@vger.kernel.org [v3.5+]
---
 drivers/gpio/gpio-ich.c |1 +
 1 file changed, 1 insertion(+)

--- linux-3.6-rc4.orig/drivers/gpio/gpio-ich.c  2012-09-04 13:34:03.0 +0200
+++ linux-3.6-rc4/drivers/gpio/gpio-ich.c   2012-09-06 08:08:57.571210424 +0200
@@ -390,6 +390,7 @@ static int __devinit ichx_gpio_probe(str
 return -ENODEV;
 }
 
+   spin_lock_init(&ichx_priv.lock);
 res_base = platform_get_resource(pdev, IORESOURCE_IO, ICH_RES_GPIO);
 ichx_priv.use_gpio = ich_info->use_gpio;
 err = ichx_gpio_request_regions(res_base, pdev->name,


-- 
Jean Delvare


Re: [PATCH] virtio-blk: Fix kconfig option

2012-09-06 Thread Kent Overstreet
On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote:
 Kent Overstreet koverstr...@google.com writes:
 
  CONFIG_VIRTIO isn't exposed, everything else is supposed to select it
  instead.
 
 This is a slight mis-understanding.  It's supposed to be selected by
 the particular driver, probably virtio_pci in your case.

So are you saying virtio-blk depends on virtio-pci? If so, the kconfig
should have that.

As is, VIRTIO_BLK just has:
depends on EXPERIMENTAL && VIRTIO

which is flat out broken.


[PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works

2012-09-06 Thread Paolo Bonzini
VIRTIO_BALLOON_F_MUST_TELL_HOST cannot be used properly because it is a
negative feature: it tells you that silent deflate is not supported.
Right now, QEMU refuses migration if the target does not support all the
features that were negotiated.  But then:

- a migration from non-MUST_TELL_HOST to MUST_TELL_HOST will succeed,
which is wrong;

- a migration from MUST_TELL_HOST to non-MUST_TELL_HOST will fail, which
is useless.

Add instead a new feature VIRTIO_BALLOON_F_SILENT_DEFLATE, and deprecate
VIRTIO_BALLOON_F_MUST_TELL_HOST since it is never actually used.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 virtio-spec.lyx | 36 +---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index 7a073f4..1a25a18 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -6238,6 +6238,8 @@ bits
 
 \begin_deeper
 \begin_layout Description
+
+\change_deleted 1531152142 1346917221
 VIRTIO_BALLOON_F_MUST_TELL_HOST
 \begin_inset space ~
 \end_inset
@@ -6251,6 +6253,20 @@ VIRTIO_BALLOON_F_STATS_VQ
 \end_inset
 
 (1) A virtqueue for reporting guest memory statistics is present.
+\change_inserted 1531152142 1346917193
+
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1531152142 1346917219
+VIRTIO_BALLOON_F_SILENT_DEFLATE
+\begin_inset space ~
+\end_inset
+
+(2) Host does not need to be told before pages from the balloon are used.
+\change_unchanged
+
 \end_layout
 
 \end_deeper
@@ -6401,9 +6417,23 @@ The driver constructs an array of addresses of memory 
pages it has previously
 \end_layout
 
 \begin_layout Enumerate
-If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is set, the guest may not
- use these requested pages until that descriptor in the deflateq has been
- used by the device.
+If the VIRTIO_BALLOON_F_
+\change_deleted 1531152142 1346917234
+MUST_TELL_HOST
+\change_inserted 1531152142 1346917237
+SILENT_DEFLATE
+\change_unchanged
+ feature is 
+\change_inserted 1531152142 1346917241
+not 
+\change_unchanged
+set, the guest may not use these requested pages until that descriptor in
+ the deflateq has been used by the device.
+
+\change_inserted 1531152142 1346917253
+ If it is set, the guest may choose to not use the deflateq at all.
+\change_unchanged
+
 \end_layout
 
 \begin_layout Enumerate
-- 
1.7.11.2
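
The migration argument in the commit message above can be sketched as a
subset test on feature bitmasks.  This is a minimal illustration under
stated assumptions, not QEMU's code: migration_allowed() is a
hypothetical name, and the rule "every negotiated bit must be supported
by the target" is taken from the description above.  The bit numbers
follow the spec text in the patch (MUST_TELL_HOST as bit 0, STATS_VQ as
bit 1, the proposed SILENT_DEFLATE as bit 2).

```c
#include <stdbool.h>
#include <stdint.h>

#define F_MUST_TELL_HOST  (1u << 0)
#define F_STATS_VQ        (1u << 1)
#define F_SILENT_DEFLATE  (1u << 2)	/* proposed in this patch */

/*
 * Hypothetical check: migration is refused if any negotiated feature
 * is missing on the target.
 */
static bool migration_allowed(uint32_t negotiated, uint32_t target)
{
	return (negotiated & ~target) == 0;
}
```

With the negative bit, migration_allowed(0, F_MUST_TELL_HOST) is true
although the guest may deflate silently on a target that forbids it,
while migration_allowed(F_MUST_TELL_HOST, 0) is false although the move
would be harmless.  With the positive bit the results flip to the useful
ones: migration_allowed(F_SILENT_DEFLATE, 0) is false and
migration_allowed(0, F_SILENT_DEFLATE) is true.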



Re: [PATCH] drm/exynos: fix double call of drm_prime_(init/destroy)_file_private

2012-09-06 Thread Paul Menzel
Dear Inki Dae,


Am Donnerstag, den 06.09.2012, 11:35 +0900 schrieb InKi Dae:

 2012/9/6 Mandeep Singh Baines m...@chromium.org:
  The double invocations are incorrect but seem to be safe so I don't
  think this will fix any bugs.
 
  Before:
 
  [7.639366] drm_prime_init_file ee3675d0
  [7.639377] drm_prime_init_file ee3675d0
  [7.639507] drm_prime_destroy_file ee3675d0
  [7.639518] drm_prime_destroy_file ee3675d0
  [7.639802] drm_prime_init_file ee372390
  [7.639810] drm_prime_init_file ee372390
  [8.473316] drm_prime_init_file ee356390
  [8.473331] drm_prime_init_file ee356390
 
  After:
 
  [6.363842] drm_prime_init_file edc2e5d0
  [6.363994] drm_prime_destroy_file edc2e5d0
  [6.364260] drm_prime_init_file edc2e750
  [8.004837] drm_prime_init_file ee36ded0
 
  Signed-off-by: Mandeep Singh Baines m...@chromium.org
  CC: Stéphane Marchesin marc...@chromium.org
  CC: Pawel Osciak posc...@google.com
  CC: Inki Dae inki@samsung.com
  CC: Joonyoung Shim jy0922.s...@samsung.com
  CC: Seung-Woo Kim sw0312@samsung.com
  CC: Kyungmin Park kyungmin.p...@samsung.com
  CC: David Airlie airl...@linux.ie
  CC: dri-de...@lists.freedesktop.org
 
 remove all CCs

I guess they were generated by some script. So they should be fine, no?

Mandeep, if you put CC in here, those people should actually be CCed. `git
send-email` should take care of that, but I do not see everyone in the CC
field. Or does `git send-email` use the blind carbon copy (BCC) field?

 and can you send it again using text mode?

At least to the list it was sent in plain text mode.

 your patch is messed up when I try to get patch file.

Everything is fine on my side. Especially since Mandeep used `git
send-email` which should do everything correctly.

 Thanks.
 Inki Dae

In your From address your name is written InKi with capital K. Which one
is correct?


Thanks,

Paul




Re: [PATCH] OMAP GPIO - don't wake from suspend unless requested.

2012-09-06 Thread NeilBrown
On Thu, 6 Sep 2012 12:57:46 +0530 Shilimkar, Santosh
santosh.shilim...@ti.com wrote:

 On Thu, Sep 6, 2012 at 12:32 PM, NeilBrown ne...@suse.de wrote:
  On Thu, 6 Sep 2012 11:18:09 +0530 Shilimkar, Santosh
  santosh.shilim...@ti.com wrote:
 
  On Thu, Sep 6, 2012 at 8:35 AM, NeilBrown ne...@suse.de wrote:
   On Mon, 3 Sep 2012 22:59:06 -0700 Shilimkar, Santosh
   santosh.shilim...@ti.com wrote:
 
   After thinking bit more on this, the problem seems to be coming
   mainly because the gpio device is runtime suspended bit early than
   it should be. Similar issue seen with i2c driver as well. The i2c issue
   was discussed with Rafael at LPC last week. The idea is to move
   the pm_runtime_enable/disable() calls entirely up to the
   _late/_early stage of device suspend/resume.
   Will update this thread once I have further update.
  
   This won't be late enough.  IRQCHIP_MASK_ON_SUSPEND takes effect after all
   the _late callbacks have been called.
   I, too, spoke to Rafael about this in San Diego.  He seemed to agree with me
   that the interrupt needs to be masked in the ->suspend callback.  Any later
   is too late.
  
  Thanks for information about your discussion. Will wait for the patch then.
 
  Regards
  santosh
 
  I already sent a patch - that is what started this thread :-)
 
  I include it below.
  You said "The patch doesn't seems to be correct" but didn't expand on why.
  Do you still think it is not correct?  I wouldn't be surprised if there is
  some case that it doesn't handle quite right, but it seems right to me.
 
 Sorry, I thought you were talking about moving the Checking wakeup interrupts
 bit earlier, as discussed in the follow-up to the alternate suggested patch to
 make use of IRQCHIP_MASK_ON_SUSPEND.
 
 I think we need to fix the issue seen with ' IRQCHIP_MASK_ON_SUSPEND'
 patch. That is at least the expected way to manage suspend and wakeup
 irq masks for a IRQCHIP.

That is what I thought at first too.  But when talking to Rafael he said that
IRQCHIP_MASK_ON_SUSPEND was intended mainly for clock interrupts.  For other
less fundamental interrupts, doing the mask/unmask in suspend/resume
callbacks is sufficient and simpler... and actually works.

IRQCHIP_MASK_ON_SUSPEND is currently used by precisely two drivers:

   arch/arm/mach-omap2/omap-wakeupgen.c
and 
   drivers/mfd/pm8xxx-irq.c

which suggests that it isn't widely used and quite possibly doesn't actually
work in general.
The pm8xxx-irq doesn't seem to do runtime pm, so maybe it manages to work for
that reason.
The omap-wakeupgen code is beyond my current understanding, but it seems like
it might be the sort of special case that IRQCHIP_MASK_ON_SUSPEND is intended
for...

Maybe we need to start a new thread and pester Rafael or Thomas Gleixner
to either explain what is intended for this case, or to fix 
IRQCHIP_MASK_ON_SUSPEND so that it can be used in general.

NeilBrown
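
The approach described above, masking in the suspend callback and
unmasking in resume, can be modelled in user space with a few bitmasks.
This is only a sketch under stated assumptions: struct gpio_bank_model
and both functions are hypothetical names and do not come from the OMAP
GPIO driver, and real code would write hardware registers instead of
struct fields.

```c
#include <stdint.h>

/* Hypothetical model of one GPIO bank's interrupt state. */
struct gpio_bank_model {
	uint32_t irq_enabled;	/* lines currently unmasked */
	uint32_t wake_enabled;	/* lines for which wakeup was requested */
	uint32_t saved;		/* copy of irq_enabled taken at suspend */
};

/* At suspend: keep only the lines that asked to wake the system. */
static void model_suspend(struct gpio_bank_model *bank)
{
	bank->saved = bank->irq_enabled;
	bank->irq_enabled &= bank->wake_enabled;
}

/* At resume: restore whatever was enabled before suspend. */
static void model_resume(struct gpio_bank_model *bank)
{
	bank->irq_enabled = bank->saved;
}
```

In this model a bank with irq_enabled = 0x0f and wake_enabled = 0x05
keeps only 0x05 unmasked while suspended and returns to 0x0f afterwards.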




Re: [alsa-devel] [PATCH] ASoC: ams-delta: fix card initalization failure

2012-09-06 Thread Mark Brown
On Sat, Sep 01, 2012 at 11:09:18AM +0200, Janusz Krzysztofik wrote:

 I see your point, however for now I can see no better way of referencing 
 the data (of type struct snd_soc_card) than passing it to 
 snd_soc_register_card(). But for this to work, I would have to register 
 successfully an ams-delta specific platform device first, not the soc-
 audio. This, even if still done from the sound/soc/omap/ams-delta.c, not 
 from an arch board file, would require now not existing ams-delta ASoC 
 platform driver probe/remove callbacks at least. I'm still not convinced 
 if such modification would be acceptable in the middle of the rc cycle.

 If there is a simpler, less intrusive way to do this, then sorry, I 
 still can't see it.

Like I already said just make it a static variable.


Re: [RFC] module: signature infrastructure

2012-09-06 Thread Rusty Russell
Lucas De Marchi lucas.de.mar...@gmail.com writes:
 Sorry to come up with this suggestion only now (and after you have
 already talked to me at LPC). Only after seeing this implementation I
 thought about the implications of having the module signed in this
 format.
...
 I'm worried about performance here. Module loading can take a fair
 amount of boot time. It may not be critical for servers or desktops
 that we rarely boot, but it is for embedded uses.
...
 Letting it in be32 is the simplest solution IMO. It's way simpler than
 the loop above.
...
 Scanning the module is the least of our issues since we've just copied
 it and we're about to SHA it.

 Yeah, but I don't think we need to scan it one more time. On every
 boot. For every module

Regretfully, I don't have Linus' talent for flamage.

There's no measurable performance impact.  Scanning 1k takes about
5usec; we've wasted about enough time on this subject to load a billion
kernel modules.

I know this.  Not because I'm brilliant, but because I *measured* it.  I
even pulled out my original module signature signing check code, and
that was both faster and simpler.  See below.

Your assertion that the length-prepended version is way simpler is
wrong.  Again, I know this because I *read the code*:


https://git.kernel.org/?p=linux/kernel/git/kasatkin/linux-digsig.git;a=commitdiff;h=19eef6e4e529ccf2b3a6ab5c19dd40f2eaf8fcaf

Don't send any more lazy, unthoughtful mails to the list.  It's
disrespectful and makes me grumpy.

Rusty.
PS.  Pushed updated version to my kernel.org linux.git/module-signing branch.

#ifdef CONFIG_MODULE_SIG
static int module_sig_check(struct load_info *info,
			    const void *mod, unsigned long *len)
{
	int err = 0;
	const unsigned long markerlen = strlen(MODULE_SIG_STRING);
	const void *p = mod, *end = mod + *len;

	/* Poor man's memmem. */
	while ((p = memchr(p, MODULE_SIG_STRING[0], end - p))) {
		if (p + markerlen > end)
			break;

		if (memcmp(p, MODULE_SIG_STRING, markerlen) == 0) {
			const void *sig = p + markerlen;
			/* Truncate module up to signature. */
			*len = p - mod;
			err = mod_verify_sig(mod, *len,
					     sig, end - sig,
					     &info->sig_ok);
			break;
		}
		p++;
	}

	/* Not having a signature is only an error if we're strict. */
	if (!err && !info->sig_ok && sig_enforce)
		err = -EKEYREJECTED;
	return err;
}
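
The "poor man's memmem" scan above can be tried outside the kernel.  The
following is a user-space sketch of the same memchr-then-memcmp idea,
not the kernel function itself: find_marker() is a hypothetical name,
and any marker string passed to it (for example "~SIG~") is a
placeholder, not the real value of MODULE_SIG_STRING.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the offset of the first occurrence of
 * `marker` in buf[0..len), or -1 if it does not occur.
 */
static long find_marker(const void *buf, size_t len, const char *marker)
{
	const size_t markerlen = strlen(marker);
	const unsigned char *start = buf;
	const unsigned char *p = start;
	const unsigned char *end = start + len;

	if (markerlen == 0)
		return -1;

	/* Jump to each candidate first byte, then compare the rest. */
	while ((p = memchr(p, marker[0], (size_t)(end - p))) != NULL) {
		if ((size_t)(end - p) < markerlen)
			break;	/* not enough bytes left for a match */
		if (memcmp(p, marker, markerlen) == 0)
			return (long)(p - start);
		p++;
	}
	return -1;
}
```

For example, find_marker("abc~SIG~xyz", 11, "~SIG~") returns 3, and a
buffer that ends in a truncated marker such as "ab~SI" returns -1.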


Re: [PATCH v2 2/2] virtio-ring: Allocate indirect buffers from cache when possible

2012-09-06 Thread Rusty Russell
Michael S. Tsirkin m...@redhat.com writes:
 Yes without checksum net core always linearizes packets, so yes it is
 screwed.
 For -net, skb always allocates space for 17 frags + linear part so
 it seems sane to do same in virtio core, and allocate, for -net,
 up to max_frags + 1 from cache.
 We can adjust it: no _SG - 2 otherwise 18.

But I thought it used individual buffers these days?

Cheers,
Rusty.


Re: [PATCH] KVM: VMX: invalidate vpid for invlpg instruction

2012-09-06 Thread Avi Kivity
On 09/06/2012 12:54 AM, Davidlohr Bueso wrote:
 On Mon, 2012-09-03 at 12:11 +0300, Avi Kivity wrote:
 On 09/03/2012 02:27 AM, Davidlohr Bueso wrote:
  On Fri, 2012-08-31 at 14:37 -0300, Marcelo Tosatti wrote:
  On Fri, Aug 31, 2012 at 06:10:48PM +0200, Davidlohr Bueso wrote:
   For processors that support VPIDs we should invalidate the page table entry
   specified by the linear address. For this purpose add support for individual
   address invalidations.
  
  Not necessary - a single context invalidation is performed through
  KVM_REQ_TLB_FLUSH.
  
  Since vpid_sync_context() supports both single and all-context vpid
  invalidations, wouldn't it make sense to also add individual address
  ones as well, supporting further granularity?
 
 It might.  Do you have benchmarks supporting this?
 
 
 I ran two benchmarks: Java Dacapo[1] Sunflow (renders a set of images
 using ray tracing) and a vanilla 3.2 kernel build (with 1 job and -j8).
 
 The host configuration is an Intel i7-2635QM (4 cores + HT) with 4Gb RAM
 running Linus's latest and only running standard system daemons. For KVM
 I disabled EPT.

That's not very interesting.  In all real machines, if you have VPID you
also have EPT.  Intel are unlikely to produce a processor without EPT.

 The guest configuration is a 64bit 4 core 4Gb RAM, running Linux 3.2
 (debian) and only running the benchmark.
 
 All results represent the mean of 5 runs, with time(1).

The results are impressive, but lack real-world relevance.
Individual-address invalidation isn't very useful with EPT, since we let
the guest issue INVLPG itself and otherwise don't bother with guest page
tables.

Individual-address INVEPT would probably be more useful, but there is no
such instruction variant.

-- 
error compiling committee.c: too many arguments to function


Re: [RFC v2] memory-hotplug: remove MIGRATE_ISOLATE from free_area->free_list

2012-09-06 Thread Lai Jiangshan
On 09/06/2012 10:53 AM, Minchan Kim wrote:
 Normally, the MIGRATE_ISOLATE type is used for memory-hotplug.
 But it is an ironic type: the isolated pages sit as free pages in
 free_area->free_list[MIGRATE_ISOLATE], so people may think of them as
 allocatable pages, yet they are *never* allocatable.
 This ends up confusing the NR_FREE_PAGES vmstat, making it
 inaccurate, so code that depends on that vmstat
 can reach a wrong decision.
 
 There was already a report about it.[1]
 [1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem
 
 Then there was another report about a different problem.[2]
 [2] http://www.spinics.net/lists/linux-mm/msg41251.html
 
 I believe it can cause problems in the future, too.
 So I hope to remove this ironic type with another design.
 
 I hope this patch solves it so that we can revert [1] and do not need [2].
 
 * Changelog v1
  * Fixes from Michal's many suggestions
 
 Cc: Michal Nazarewicz min...@mina86.com
 Cc: Mel Gorman m...@csn.ul.ie
 Cc: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
 Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
 Cc: Wen Congyang we...@cn.fujitsu.com
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Signed-off-by: Minchan Kim minc...@kernel.org
 ---

 @@ -180,30 +287,35 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
   * all pages in [start_pfn...end_pfn) must be in the same zone.
   * zone->lock must be held before call this.
   *
 - * Returns 1 if all pages in the range are isolated.
 + * Returns true if all pages in the range are isolated.
   */
 -static int
 -__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
 +static bool
 +__test_page_isolated_in_pageblock(unsigned long start_pfn, unsigned long end_pfn)
  {
 + unsigned long pfn, next_pfn;
   struct page *page;
  
 - while (pfn < end_pfn) {
 - if (!pfn_valid_within(pfn)) {
 - pfn++;
 - continue;
 - }
 - page = pfn_to_page(pfn);
 - if (PageBuddy(page))
 - pfn += 1 << page_order(page);
 - else if (page_count(page) == 0 &&
 - page_private(page) == MIGRATE_ISOLATE)
 - pfn += 1;
 - else
 - break;
 + list_for_each_entry(page, &isolated_pages, lru) {

 + if (&page->lru == &isolated_pages)
 + return false;

What does this line mean?

 + pfn = page_to_pfn(page);
 + if (pfn >= end_pfn)
 + return false;
 + if (pfn >= start_pfn)
 + goto found;
 + }
 + return false;
 +
 + list_for_each_entry_continue(page, &isolated_pages, lru) {
 + if (page_to_pfn(page) != next_pfn)
 + return false;

Where is next_pfn initialized?

 +found:
 + pfn = page_to_pfn(page);
 + next_pfn = pfn + (1UL << page_order(page));
 + if (next_pfn >= end_pfn)
 + return true;
   }
 - if (pfn < end_pfn)
 - return 0;
 - return 1;
 + return false;
  }
  
  int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
 @@ -211,7 +323,7 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
   unsigned long pfn, flags;
   struct page *page;
   struct zone *zone;
 - int ret;
 + bool ret;
  
   /*
* Note: pageblock_nr_page != MAX_ORDER. Then, chunks of free page
 diff --git a/mm/vmstat.c b/mm/vmstat.c
 index df7a674..bb59ff7 100644
 --- a/mm/vmstat.c
 +++ b/mm/vmstat.c
 @@ -616,7 +616,6 @@ static char * const migratetype_names[MIGRATE_TYPES] = {
  #ifdef CONFIG_CMA
   CMA,
  #endif
 - Isolate,
  };
  
  static void *frag_start(struct seq_file *m, loff_t *pos)



Re: [RFC v2] memory-hotplug: remove MIGRATE_ISOLATE from free_area->free_list

2012-09-06 Thread Minchan Kim
Hello Lai,

On Thu, Sep 06, 2012 at 04:14:51PM +0800, Lai Jiangshan wrote:
 On 09/06/2012 10:53 AM, Minchan Kim wrote:
  Normally, the MIGRATE_ISOLATE type is used for memory-hotplug.
  But it is an ironic type: the isolated pages sit as free pages in
  free_area->free_list[MIGRATE_ISOLATE], so people may think of them as
  allocatable pages, yet they are *never* allocatable.
  This ends up confusing the NR_FREE_PAGES vmstat, making it
  inaccurate, so code that depends on that vmstat
  can reach a wrong decision.
  
  There was already a report about it.[1]
  [1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem
  
  Then there was another report about a different problem.[2]
  [2] http://www.spinics.net/lists/linux-mm/msg41251.html
  
  I believe it can cause problems in the future, too.
  So I hope to remove this ironic type with another design.
  
  I hope this patch solves it so that we can revert [1] and do not need [2].
  
  * Changelog v1
   * Fixes from Michal's many suggestions
  
  Cc: Michal Nazarewicz min...@mina86.com
  Cc: Mel Gorman m...@csn.ul.ie
  Cc: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
  Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
  Cc: Wen Congyang we...@cn.fujitsu.com
  Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
  Signed-off-by: Minchan Kim minc...@kernel.org
  ---
 
  @@ -180,30 +287,35 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
    * all pages in [start_pfn...end_pfn) must be in the same zone.
    * zone->lock must be held before call this.
    *
  - * Returns 1 if all pages in the range are isolated.
  + * Returns true if all pages in the range are isolated.
    */
  -static int
  -__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
  +static bool
  +__test_page_isolated_in_pageblock(unsigned long start_pfn, unsigned long end_pfn)
   {
  +   unsigned long pfn, next_pfn;
  struct page *page;
   
  -   while (pfn < end_pfn) {
  -   if (!pfn_valid_within(pfn)) {
  -   pfn++;
  -   continue;
  -   }
  -   page = pfn_to_page(pfn);
  -   if (PageBuddy(page))
  -   pfn += 1 << page_order(page);
  -   else if (page_count(page) == 0 &&
  -   page_private(page) == MIGRATE_ISOLATE)
  -   pfn += 1;
  -   else
  -   break;
  +   list_for_each_entry(page, &isolated_pages, lru) {
 
  +   if (&page->lru == &isolated_pages)
  +   return false;
 
 What does this line mean?

I just copied it from Michal's code, but it seems unnecessary.
I will remove it in the next spin.

 
  +   pfn = page_to_pfn(page);
  +   if (pfn >= end_pfn)
  +   return false;
  +   if (pfn >= start_pfn)
  +   goto found;
  +   }
  +   return false;
  +
  +   list_for_each_entry_continue(page, &isolated_pages, lru) {
  +   if (page_to_pfn(page) != next_pfn)
  +   return false;
 
 Where is next_pfn initialized?

by goto found

 
  +found:
  +   pfn = page_to_pfn(page);
  +   next_pfn = pfn + (1UL << page_order(page));
  +   if (next_pfn >= end_pfn)
  +   return true;
  }
  -   if (pfn < end_pfn)
  -   return 0;
  -   return 1;
  +   return false;
   }
   
   int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
  @@ -211,7 +323,7 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
  unsigned long pfn, flags;
  struct page *page;
  struct zone *zone;
  -   int ret;
  +   bool ret;
   
  /*
   * Note: pageblock_nr_page != MAX_ORDER. Then, chunks of free page
  diff --git a/mm/vmstat.c b/mm/vmstat.c
  index df7a674..bb59ff7 100644
  --- a/mm/vmstat.c
  +++ b/mm/vmstat.c
  @@ -616,7 +616,6 @@ static char * const migratetype_names[MIGRATE_TYPES] = {
   #ifdef CONFIG_CMA
  CMA,
   #endif
  -   Isolate,
   };
   
   static void *frag_start(struct seq_file *m, loff_t *pos)
 

-- 
Kind regards,
Minchan Kim
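
The "by goto found" answer relies on the first loop jumping into the
body of the second one, so next_pfn is always assigned before it is
compared.  The sketch below shows the same two-phase check without the
goto, which makes the initialization explicit.  It is a user-space
illustration under stated assumptions, not the kernel function:
struct chunk and range_is_covered() are hypothetical names, an array
sorted by pfn stands in for the isolated_pages list, and the stripped
comparison operators in the quoted patch are assumed to be ">=".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a free page block of 2^order pages. */
struct chunk {
	unsigned long pfn;
	unsigned int order;
};

/*
 * Return true if the chunks, sorted by pfn, run without a gap from
 * the first chunk at or after start_pfn up to at least end_pfn.
 */
static bool range_is_covered(const struct chunk *chunks, size_t n,
			     unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long next_pfn;
	size_t i;

	/* Phase 1: find the first chunk at or after start_pfn. */
	for (i = 0; i < n; i++) {
		if (chunks[i].pfn >= end_pfn)
			return false;
		if (chunks[i].pfn >= start_pfn)
			break;
	}
	if (i == n)
		return false;

	/* next_pfn is initialized here, before any comparison uses it. */
	next_pfn = chunks[i].pfn + (1UL << chunks[i].order);
	if (next_pfn >= end_pfn)
		return true;

	/* Phase 2: every following chunk must start where the last ended. */
	for (i++; i < n; i++) {
		if (chunks[i].pfn != next_pfn)
			return false;
		next_pfn = chunks[i].pfn + (1UL << chunks[i].order);
		if (next_pfn >= end_pfn)
			return true;
	}
	return false;
}
```

Chunks at pfn 0, 4 and 8 with orders 2, 2 and 3 cover [0, 16), while
dropping the middle chunk leaves a gap and the check fails.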


Re: [PATCH v2] pwm i.MX: add devicetree support

2012-09-06 Thread Shawn Guo
On Wed, Sep 05, 2012 at 03:35:19PM +0200, Sascha Hauer wrote:
 
 Changes since v1:
 
 - Add devicetree binding documentation
 - Merge 5/9 and 9/9
 - fix #pwm-cells (must be 2 instead of 3)
 - fix wrong name in MODULE_DEVICE_TABLE
 - drop platform based probing while introducing devicetree based probe
 
 
 Philipp Zabel (2):
   pwm i.MX: add devicetree support
   pwm i.MX: fix clock lookup
 
 Sascha Hauer (6):
   pwm i.MX: factor out SoC specific functions
   pwm i.MX: remove unnecessary if in pwm_[en|dis]able
   pwm i.MX: add functions to enable/disable pwm.
   pwm i.MX: Use module_platform_driver
   pwm i.MX: use per clock unconditionally
   ARM i.MX53: Add pwm support
 
For the series,

Reviewed-by: Shawn Guo shawn@linaro.org

  Documentation/devicetree/bindings/pwm/imx-pwm.txt |   17 ++
  arch/arm/boot/dts/imx53.dtsi  |   14 ++
  arch/arm/mach-imx/clk-imx51-imx53.c   |4 +
  drivers/pwm/pwm-imx.c |  275 ++---
  4 files changed, 214 insertions(+), 96 deletions(-)
  create mode 100644 Documentation/devicetree/bindings/pwm/imx-pwm.txt
 


Re: linux-next: build failure after merge of the final tree (powerpc tree related)

2012-09-06 Thread Ananth N Mavinakayanahalli
On Thu, Sep 06, 2012 at 05:11:53PM +1000, Stephen Rothwell wrote:
 Hi all,
 
 After merging the final tree, today's linux-next build (powerpc allyesconfig)
 failed like this:
 
 In file included from drivers/atm/fore200e.c:70:0:
 drivers/atm/fore200e.h:263:3: error: redefinition of typedef 'opcode_t' with different type
 arch/powerpc/include/asm/probes.h:25:13: note: previous declaration of 'opcode_t' was here
 
 Caused by commit 7118e7e648e0 (powerpc: Consolidate {k,u}probe
 definitions) from the powerpc tree.
 
 I have reverted that commit (and the two following:
 41ab5266c362  powerpc: Add trap_nr to thread_struct
 8b7b80b9ebb4  powerpc: Uprobes port to powerpc)
 for today.

Hi Stephen,

I have just posted a patch [1] to fix the issue.

Ananth

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2012-September/100813.html



Re: snd-usb: delay: estimated 0, actual 352

2012-09-06 Thread Takashi Iwai
At Thu, 06 Sep 2012 09:35:26 +0200,
Takashi Iwai wrote:
 
 At Thu, 6 Sep 2012 09:17:57 +0200,
 Markus Trippelsdorf wrote:
  
  On 2012.09.06 at 09:08 +0200, Daniel Mack wrote:
   On 06.09.2012 08:53, Markus Trippelsdorf wrote:
On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote:
At Thu, 06 Sep 2012 08:33:30 +0200,
Daniel Mack wrote:
   
On 06.09.2012 08:02, Markus Trippelsdorf wrote:
On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote:

Sound fixes for 3.6-rc5
   
There are nothing scaring, contains only small fixes for HD-audio 
and
USB-audio:
- EPSS regression fix and GPIO fix for HD-audio IDT codecs
- A series of USB-audio regression fixes that are found since 3.5 
kernel
   

Daniel Mack (4):
  ALSA: snd-usb: Fix URB cancellation at stream start
  ALSA: snd-usb: restore delay information
 
The commit fbcfbf5f above causes the following lines to be printed
whenever I start a new song:
   
Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which this
patch (fbcfbf5f) brings back now.
   
delay: estimated 0, actual 352
delay: estimated 353, actual 705
   
(44.1 * 8 = 352.8)
   
This happens with an USB-DAC that identifies itself as C-Media USB
Headphone Set.
   
And you didn't you see these lines with 3.4?
   
Maybe the difference of start condition?
   
Markus, does the patch below fix anything?

Unfortunately no.
However reverting the following fixes the problem:

commit 245baf983cc39524cce39c24d01b276e6e653c9e
Author: Daniel Mack zon...@gmail.com
Date:   Thu Aug 30 18:52:30 2012 +0200

ALSA: snd-usb: fix calls to next_packet_size

   
   No, this one certainly fixes a problem and does the right thing by
   restoring the original code.
   
   If you wouldn't state that you didn't see the same effect with 3.4(!),
   before the refactoring done in 3.5, I would believe the device is simply
   slightly off in its feedback rate and the tighter delay code complains
   about it while compensating, just as it did before.
   
   Are there any more than these two lines? And is audio working at all? Is
   it distorted in any way?
  
  There are only these two lines (printed whenever sound starts). Audio is
  working just fine with no distortions.
  
  I did see similar lines before when the system load was very high
  (happened during make check when building glibc).
  
  Here is what Pierre-Louis wrote in November 2011:
  
  »This was supposed to be an informational message, I thought it was only
  enabled for debug. Regular users don't really need to know.«
 
 I guess the problem is that the new endpoint scheme doesn't count the
 last_delay update unless the stream is triggered.  In the old code,
 retire_playback_urb() is always called, even before trigger(START) is
 set, and at that point it does nothing but update the delay
 information.
 
 In the new code, retire_playback_urb is set only in
 snd_usb_substream_playback_trigger().  Thus at the very first shot,
 the delay accounting gets confused.
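
The ordering problem described above can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions: the ex_* names are hypothetical, the accounting in the real driver is more involved, and the value 352 is borrowed from the log line purely for illustration. It shows that bookkeeping done in a completion callback is skipped for every completion that arrives before the callback is installed.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (not the snd-usb types). */
struct ex_endpoint;
typedef void (*ex_retire_fn)(struct ex_endpoint *ep, int frames);

struct ex_endpoint {
	ex_retire_fn retire;	/* called when a URB completes, if set */
	int last_delay;		/* frames queued but not yet accounted as played */
};

static void ex_retire_playback(struct ex_endpoint *ep, int frames)
{
	ep->last_delay -= frames;
}

/* Assumption of this toy model: queueing a URB always adds to the delay. */
static void ex_submit_urb(struct ex_endpoint *ep, int frames)
{
	ep->last_delay += frames;
}

/* A completion is only accounted for when the callback is installed. */
static void ex_complete_urb(struct ex_endpoint *ep, int frames)
{
	if (ep->retire)
		ep->retire(ep, frames);
}

/* Returns the delay left over after one URB was submitted and completed
 * before the stream was triggered. */
int ex_delay_after_first_urb(int install_at_prepare)
{
	struct ex_endpoint ep = { NULL, 0 };

	if (install_at_prepare)
		ep.retire = ex_retire_playback;	/* callback set at prepare time */

	ex_submit_urb(&ep, 352);
	ex_complete_urb(&ep, 352);		/* completes before trigger(START) */

	ep.retire = ex_retire_playback;		/* callback set at trigger time */
	return ep.last_delay;
}
```

With the callback installed only at trigger time, the model keeps a stale 352 frames; with it installed at prepare time, the count returns to 0.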

In short, a patch like the one below may fix the issue (note: completely
untested!)


Takashi

---

diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index fd5e982..928a4f7 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -528,6 +528,9 @@ static int snd_usb_hw_free(struct snd_pcm_substream *substream)
 	return snd_pcm_lib_free_vmalloc_buffer(substream);
 }
 
+static void retire_playback_urb(struct snd_usb_substream *subs,
+				struct urb *urb);
+
 /*
  * prepare callback
  *
@@ -561,8 +564,10 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
 
 	/* for playback, submit the URBs now; otherwise, the first hwptr_done
 	 * updates for all URBs would happen at the same time when starting */
-	if (subs->direction == SNDRV_PCM_STREAM_PLAYBACK)
+	if (subs->direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		subs->data_endpoint->retire_data_urb = retire_playback_urb;
 		return start_endpoints(subs, 1);
+	}
 
 	return 0;
 }
@@ -1190,7 +1195,6 @@ static int snd_usb_substream_playback_trigger(struct snd_pcm_substream *substrea
 	case SNDRV_PCM_TRIGGER_START:
 	case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
 		subs->data_endpoint->prepare_data_urb = prepare_playback_urb;
-		subs->data_endpoint->retire_data_urb = retire_playback_urb;
 		subs->running = 1;
 		return 0;
 	case SNDRV_PCM_TRIGGER_STOP:
@@ -1199,7 +1203,6 @@ static int snd_usb_substream_playback_trigger(struct snd_pcm_substream *substrea
 		return 0;
 	case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
 		subs->data_endpoint->prepare_data_urb 

Re: [PATCH 2/2] mm: support MIGRATE_DISCARD

2012-09-06 Thread Mel Gorman
On Thu, Sep 06, 2012 at 02:31:12PM +0900, Minchan Kim wrote:
 Hi Mel,
 
 On Wed, Sep 05, 2012 at 11:56:11AM +0100, Mel Gorman wrote:
  On Wed, Sep 05, 2012 at 05:11:13PM +0900, Minchan Kim wrote:
   This patch introduces a MIGRATE_DISCARD mode in migration.
   It drops *clean cache pages* instead of migrating them, so that
   migration latency can be reduced by avoiding (memcpy + page remapping).
   It's useful for CMA because migration latency matters more there
   than evicting the working set of background processes. In addition, it
   needs fewer free pages as migration targets, so it can avoid reclaiming
   memory to get free pages, which is another factor that increases latency.
   
  
  Bah, this was released while I was reviewing the older version. I did
  not read this one as closely but I see the enum problems have gone away
  at least. I'd still prefer if CMA had an additional helper to discard
  some pages with shrink_page_list() and migrate the remaining pages with
  migrate_pages(). That would remove the need to add a MIGRATE_DISCARD
  migrate mode at all.
 
 I am not convinced by your point. What's the benefit of separating
 reclaim and migration? Just removing the MIGRATE_DISCARD mode?

Maintainability. There are reclaim functions and there are migration
functions. Your patch takes migrate_pages() and makes it partially a
reclaim function mixing up the responsibilities of migrate.c and vmscan.c.

 I don't think it's bad, because my implementation is very simple (maybe
 it's much simpler than separating reclaim and migration) and
 could be used by others like memory-hotplug in the future.

They could also have used the helper function from CMA that takes a list
of pages, reclaims some and migrates other.

 If you're not strongly against it, I would like to insist on my
 implementation.
 

I'm not very strongly against it but I'm also very unhappy.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drm/exynos: fix double call of drm_prime_(init/destroy)_file_private

2012-09-06 Thread InKi Dae
Hi,

2012/9/6 Paul Menzel paulepan...@users.sourceforge.net:
 Dear Inki Dae,


 On Thursday, 06.09.2012, at 11:35 +0900, InKi Dae wrote:

 2012/9/6 Mandeep Singh Baines m...@chromium.org:
  The double invocations are incorrect but seem to be safe so I don't
  think this will fix any bugs.
 
  Before:
 
  [7.639366] drm_prime_init_file ee3675d0
  [7.639377] drm_prime_init_file ee3675d0
  [7.639507] drm_prime_destroy_file ee3675d0
  [7.639518] drm_prime_destroy_file ee3675d0
  [7.639802] drm_prime_init_file ee372390
  [7.639810] drm_prime_init_file ee372390
  [8.473316] drm_prime_init_file ee356390
  [8.473331] drm_prime_init_file ee356390
 
  After:
 
  [6.363842] drm_prime_init_file edc2e5d0
  [6.363994] drm_prime_destroy_file edc2e5d0
  [6.364260] drm_prime_init_file edc2e750
  [8.004837] drm_prime_init_file ee36ded0
 
  Signed-off-by: Mandeep Singh Baines m...@chromium.org
  CC: Stéphane Marchesin marc...@chromium.org
  CC: Pawel Osciak posc...@google.com
  CC: Inki Dae inki@samsung.com
  CC: Joonyoung Shim jy0922.s...@samsung.com
  CC: Seung-Woo Kim sw0312@samsung.com
  CC: Kyungmin Park kyungmin.p...@samsung.com
  CC: David Airlie airl...@linux.ie
  CC: dri-de...@lists.freedesktop.org

 remove all CCs

 I guess they were generated by some script. So they should be fine, no?

 Mandeep, if you put CC lines in here, those people should actually be CCed.
 `git send-email` should take care of that, but I do not see everyone in the CC
 field. Or does `git send-email` use the blind carbon copy (BCC) field?

 and can you send it again using text mode?

 At least to the list it was sent in plain text mode.

 your patch is messed up when I try to get patch file.

 Everything is fine on my side. Especially since Mandeep used `git
 send-email` which should do everything correctly.


your patch was encoded with 'Content-Transfer-Encoding: base64' so
please use 7bit ascii like 'Content-Transfer-Encoding: 7bit'

 Thanks.
 Inki Dae

 In your From address your name is written InKi with capital K. Which one
 is correct?


Inki is correct :)

Thanks.
Inki Dae


 Thanks,

 Paul

 ___
 dri-devel mailing list
 dri-de...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel



Re: [PATCH 1/6] unicore32: pwm: Properly remap memory-mapped registers

2012-09-06 Thread guanxuetao
 Instead of writing to the timer controller registers by dereferencing a
 pointer to the memory location, properly remap the memory region with a
 call to ioremap_nocache() and access the registers using writel().

 Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
 ---
  arch/unicore32/kernel/pwm.c | 25 ++---
  1 file changed, 22 insertions(+), 3 deletions(-)

 diff --git a/arch/unicore32/kernel/pwm.c b/arch/unicore32/kernel/pwm.c
 index 4615d51..410b786 100644
 --- a/arch/unicore32/kernel/pwm.c
 +++ b/arch/unicore32/kernel/pwm.c
 @@ -23,10 +23,16 @@
  #include <asm/div64.h>
  #include <mach/hardware.h>

 +#define PWCR 0x00
 +#define DCCR 0x04
 +#define PCR  0x08
I think old register names could be used here by some small modifications.
Please see arch/unicore32/include/mach/regs-ost.h
We can avoid ioremap and use writel/readl directly on these registers.

Guan

 +
  struct pwm_device {
  	struct list_head	node;
  	struct platform_device	*pdev;

 +	void __iomem		*base;
 +
  	const char		*label;
  	struct clk		*clk;
  	int			clk_enabled;
 @@ -69,9 +75,11 @@ int pwm_config(struct pwm_device *pwm, int duty_ns, int period_ns)
  	 * before writing to the registers
  	 */
  	clk_enable(pwm->clk);
 -	OST_PWMPWCR = prescale;
 -	OST_PWMDCCR = pv - dc;
 -	OST_PWMPCR  = pv;
 +
 +	writel(prescale, pwm->base + PWCR);
 +	writel(pv - dc, pwm->base + DCCR);
 +	writel(pv, pwm->base + PCR);
 +
  	clk_disable(pwm->clk);

  	return 0;
 @@ -190,10 +198,19 @@ static struct pwm_device *pwm_probe(struct platform_device *pdev,
  		goto err_free_clk;
  	}

 +	pwm->base = ioremap_nocache(r->start, resource_size(r));
 +	if (pwm->base == NULL) {
 +		dev_err(&pdev->dev, "failed to remap memory resource\n");
 +		ret = -EADDRNOTAVAIL;
 +		goto err_release_mem;
 +	}
 +
  	__add_pwm(pwm);
  	platform_set_drvdata(pdev, pwm);
  	return pwm;

 +err_release_mem:
 +	release_mem_region(r->start, resource_size(r));
  err_free_clk:
  	clk_put(pwm->clk);
  err_free:
 @@ -224,6 +241,8 @@ static int __devexit pwm_remove(struct platform_device *pdev)
  	list_del(&pwm->node);
  	mutex_unlock(&pwm_lock);

 +	iounmap(pwm->base);
 +
  	r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
  	release_mem_region(r->start, resource_size(r));

 --
 1.7.12




Re: [RFC v9 PATCH 20/21] memory-hotplug: clear hwpoisoned flag when onlining pages

2012-09-06 Thread Wen Congyang
At 09/06/2012 03:27 PM, andywu106建国 Wrote:
 2012/9/5 we...@cn.fujitsu.com

 From: Wen Congyang we...@cn.fujitsu.com

 The hwpoisoned flag may be set when we offline a page through the sysfs
 interface /sys/devices/system/memory/soft_offline_page or
 /sys/devices/system/memory/hard_offline_page. If we don't clear
 this flag when onlining pages, the page can't be freed and will
 not be on the free list, so we can't offline these pages again. So we
 should clear this flag when onlining pages.

 CC: David Rientjes rient...@google.com
 CC: Jiang Liu liu...@gmail.com
 CC: Len Brown len.br...@intel.com
 CC: Benjamin Herrenschmidt b...@kernel.crashing.org
 CC: Paul Mackerras pau...@samba.org
 CC: Christoph Lameter c...@linux.com
 Cc: Minchan Kim minchan@gmail.com
 CC: Andrew Morton a...@linux-foundation.org
 CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
 CC: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
 Signed-off-by: Wen Congyang we...@cn.fujitsu.com
 ---
  mm/memory_hotplug.c |5 +
  1 files changed, 5 insertions(+), 0 deletions(-)

 diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
 index 270c249..140c080 100644
 --- a/mm/memory_hotplug.c
 +++ b/mm/memory_hotplug.c
 @@ -661,6 +661,11 @@ EXPORT_SYMBOL_GPL(__online_page_increment_counters);

  void __online_page_free(struct page *page)
  {
 +#ifdef CONFIG_MEMORY_FAILURE
 +   /* The page may be marked HWPoisoned by soft/hard offline page */
 +   ClearPageHWPoison(page);
 
 Hi Congyang,
 I think you should also decrease the mce_bad_pages counter here:
 atomic_long_sub(1, &mce_bad_pages);

Yes, thanks for pointing it out.

Thanks
Wen Congyang

 

 +#endif
 +
 ClearPageReserved(page);
 init_page_count(page);
 __free_page(page);
 --
 1.7.1
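
The review comment above (clear the flag and also adjust the bad-page counter) can be sketched in userspace with C11 atomics. This is a minimal illustration under assumptions, not the kernel code: the ex_* names are hypothetical, and decrementing only when the flag was actually set is a design choice of this sketch so that the counter cannot drop below the number of poisoned pages.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical page descriptor with a single "poisoned" flag. */
struct ex_page {
	atomic_bool hwpoison;
};

/* Hypothetical global counter of poisoned pages. */
atomic_long ex_bad_pages;

void ex_page_init(struct ex_page *page)
{
	atomic_init(&page->hwpoison, false);
}

/* Mark a page as poisoned (as an offline operation might do). */
void ex_poison_page(struct ex_page *page)
{
	if (!atomic_exchange(&page->hwpoison, true))
		atomic_fetch_add(&ex_bad_pages, 1);
}

/* Onlining: clear the flag and keep the counter in step with it. */
void ex_online_page(struct ex_page *page)
{
	/* atomic_exchange returns the old value, so the counter is only
	 * decremented for pages that really were marked poisoned. */
	if (atomic_exchange(&page->hwpoison, false))
		atomic_fetch_sub(&ex_bad_pages, 1);
}

long ex_bad_page_count(void)
{
	return atomic_load(&ex_bad_pages);
}
```

Onlining the same page twice leaves the counter unchanged the second time, which is the property an unconditional decrement would not have.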

 --
 To unsubscribe, send a message with 'unsubscribe linux-mm' in
 the body to majord...@kvack.org.  For more info on Linux MM,
 see: http://www.linux-mm.org/ .
 Don't email: a href=mailto:d...@kvack.org; em...@kvack.org /a
 



Re: [PATCH 4/6] unicore32: Add common clock support

2012-09-06 Thread guanxuetao
 This commit adds support for the common clock framework to the Unicore32
 architecture.

 Signed-off-by: Thierry Reding thierry.red...@avionic-design.de

This patch can't work.
Could you split it into several smaller patches, so that I can check it
out?

Thanks,
Guan Xuetao

 ---
  arch/unicore32/Kconfig  |   1 +
  arch/unicore32/include/asm/clkdev.h |  26 ++
  arch/unicore32/kernel/clock.c   | 560
 
  3 files changed, 339 insertions(+), 248 deletions(-)
  create mode 100644 arch/unicore32/include/asm/clkdev.h

 diff --git a/arch/unicore32/Kconfig b/arch/unicore32/Kconfig
 index b0a4743..46b3a15 100644
 --- a/arch/unicore32/Kconfig
 +++ b/arch/unicore32/Kconfig
 @@ -14,6 +14,7 @@ config UNICORE32
   select GENERIC_IRQ_SHOW
   select ARCH_WANT_FRAME_POINTERS
   select GENERIC_IOMAP
 + select COMMON_CLK
   help
 UniCore-32 is 32-bit Instruction Set Architecture,
 including a series of low-power-consumption RISC chip
 diff --git a/arch/unicore32/include/asm/clkdev.h
 b/arch/unicore32/include/asm/clkdev.h
 new file mode 100644
 index 000..201645d
 --- /dev/null
 +++ b/arch/unicore32/include/asm/clkdev.h
 @@ -0,0 +1,26 @@
 +/*
 + *  based on arch/arm/include/asm/clkdev.h
 + *
 + *  Copyright (C) 2008 Russell King.
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License version 2 as
 + * published by the Free Software Foundation.
 + *
 + * Helper for the clk API to assist looking up a struct clk.
 + */
 +
 +#ifndef __ASM_CLKDEV_H
 +#define __ASM_CLKDEV_H
 +
 +#include linux/slab.h
 +
 +#define __clk_get(clk)   ({ 1; })
 +#define __clk_put(clk)   do { } while (0)
 +
 +static inline struct clk_lookup_alloc *__clkdev_alloc(size_t size)
 +{
 + return kzalloc(size, GFP_KERNEL);
 +}
 +
 +#endif
 diff --git a/arch/unicore32/kernel/clock.c b/arch/unicore32/kernel/clock.c
 index 18d4563..197f885 100644
 --- a/arch/unicore32/kernel/clock.c
 +++ b/arch/unicore32/kernel/clock.c
 @@ -17,223 +17,50 @@
  #include linux/errno.h
  #include linux/err.h
  #include linux/string.h
 -#include linux/clk.h
 +#include linux/clk-provider.h
  #include linux/mutex.h
  #include linux/delay.h
  #include linux/io.h
 +#include linux/slab.h

  #include mach/hardware.h

 -/*
 - * Very simple clock implementation
 - */
 -struct clk {
 - struct list_headnode;
 - unsigned long   rate;
 - const char  *name;
 -};
 -
 -static struct clk clk_ost_clk = {
 - .name   = OST_CLK,
 - .rate   = CLOCK_TICK_RATE,
 -};
 -
 -static struct clk clk_mclk_clk = {
 - .name   = MAIN_CLK,
 -};
 -
 -static struct clk clk_bclk32_clk = {
 - .name   = BUS32_CLK,
 +struct clk_uc {
 + struct clk_hw hw;
  };

 -static struct clk clk_ddr_clk = {
 - .name   = DDR_CLK,
 -};
 -
 -static struct clk clk_vga_clk = {
 - .name   = VGA_CLK,
 -};
 -
 -static LIST_HEAD(clocks);
 -static DEFINE_MUTEX(clocks_mutex);
 -
 -struct clk *clk_get(struct device *dev, const char *id)
 -{
 - struct clk *p, *clk = ERR_PTR(-ENOENT);
 -
 - mutex_lock(clocks_mutex);
 - list_for_each_entry(p, clocks, node) {
 - if (strcmp(id, p-name) == 0) {
 - clk = p;
 - break;
 - }
 - }
 - mutex_unlock(clocks_mutex);
 -
 - return clk;
 -}
 -EXPORT_SYMBOL(clk_get);
 -
 -void clk_put(struct clk *clk)
 -{
 -}
 -EXPORT_SYMBOL(clk_put);
 -
 -int clk_enable(struct clk *clk)
 -{
 - return 0;
 -}
 -EXPORT_SYMBOL(clk_enable);
 -
 -void clk_disable(struct clk *clk)
 +static inline struct clk_uc *to_clk_uc(struct clk_hw *hw)
  {
 + return container_of(hw, struct clk_uc, hw);
  }
 -EXPORT_SYMBOL(clk_disable);
 -
 -unsigned long clk_get_rate(struct clk *clk)
 -{
 - return clk-rate;
 -}
 -EXPORT_SYMBOL(clk_get_rate);
 -
 -struct {
 - unsigned long rate;
 - unsigned long cfg;
 - unsigned long div;
 -} vga_clk_table[] = {
 - {.rate =  25175000, .cfg = 0x2001, .div = 0x9},
 - {.rate =  3150, .cfg = 0x2001, .div = 0x7},
 - {.rate =  4000, .cfg = 0x3801, .div = 0x9},
 - {.rate =  4950, .cfg = 0x3801, .div = 0x7},
 - {.rate =  6500, .cfg = 0x2c01, .div = 0x4},
 - {.rate =  7875, .cfg = 0x2400, .div = 0x7},
 - {.rate = 10800, .cfg = 0x2c01, .div = 0x2},
 - {.rate = 10650, .cfg = 0x3c01, .div = 0x3},
 - {.rate =  5065, .cfg = 0x00106400, .div = 0x9},
 - {.rate =  6150, .cfg = 0x00106400, .div = 0xa},
 - {.rate =  8550, .cfg = 0x2800, .div = 0x6},
 -};
 -
 -struct {
 - unsigned long mrate;
 - unsigned long prate;
 -} mclk_clk_table[] = {
 - {.mrate = 5, .prate = 0x00109801},
 - {.mrate = 52500, .prate = 0x00104C00},
 - {.mrate = 55000, .prate = 0x00105000},
 - {.mrate = 

Re: [PATCH v2 20/31] arm64: User access library function

2012-09-06 Thread Catalin Marinas
On Wed, Sep 05, 2012 at 10:05:34PM +0100, Russell King - ARM Linux wrote:
 On Wed, Sep 05, 2012 at 10:01:37PM +0100, Catalin Marinas wrote:
  There are indeed a few KB gain in code size but that's probably coming
  from the exception table since otherwise you just replace a bl with
  ldrt. It depends on what the compiler does as well, the arm code has
  some carefully chosen registers when calling the __get_user_x function.
 
 It's more than that - it's not just the ldr but also a zeroing of a
 temporary register to hold the error code should the instruction fault.
 So it's not only the exception tables but also an increase in the
 main path - and that's where you benefit from having it out of line and
 thereby a hotter i-cache.

On 32-bit we have __get_user() inline and get_user() out of line. What
was the history behind this?

  If you do the access_ok inline and the __get_user_x separately, the size
  increase is even greater (at least in the arm64 case it can get to over
  20KB). I think x86 does the access_ok check out of line.
 
 Please talk to Will about get_user() and put_user().  Afterwards you
 will definitely want to keep them out of line on 64-bit ARM.

As I said, I already made the change to always inline get_user/put_user
with some penalty in the Image size but it makes the code cleaner. I'm
not entirely convinced of the performance gain/loss especially on ARMv8
cores with physically tagged caches. There is room for optimisation when
I get real silicon.

-- 
Catalin


[PATCH v2] drivers/media/platform/s5p-tv/sdo_drv.c: fix error return code

2012-09-06 Thread Peter Senna Tschudin
From: Peter Senna Tschudin peter.se...@gmail.com

Convert a nonnegative error return code to a negative one, as returned
elsewhere in the function.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}

// </smpl>

Signed-off-by: Peter Senna Tschudin peter.se...@gmail.com

---
 drivers/media/platform/s5p-tv/sdo_drv.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/media/platform/s5p-tv/sdo_drv.c b/drivers/media/platform/s5p-tv/sdo_drv.c
index ad68bbe..58cf56d 100644
--- a/drivers/media/platform/s5p-tv/sdo_drv.c
+++ b/drivers/media/platform/s5p-tv/sdo_drv.c
@@ -369,6 +369,7 @@ static int __devinit sdo_probe(struct platform_device *pdev)
 	sdev->fout_vpll = clk_get(dev, "fout_vpll");
 	if (IS_ERR_OR_NULL(sdev->fout_vpll)) {
 		dev_err(dev, "failed to get clock 'fout_vpll'\n");
+		ret = -ENXIO;
 		goto fail_dacphy;
 	}
 	dev_info(dev, "fout_vpll.rate = %lu\n", clk_get_rate(sclk_vpll));
@@ -377,11 +378,13 @@ static int __devinit sdo_probe(struct platform_device *pdev)
 	sdev->vdac = devm_regulator_get(dev, "vdd33a_dac");
 	if (IS_ERR_OR_NULL(sdev->vdac)) {
 		dev_err(dev, "failed to get regulator 'vdac'\n");
+		ret = -ENXIO;
 		goto fail_fout_vpll;
 	}
 	sdev->vdet = devm_regulator_get(dev, "vdet");
 	if (IS_ERR_OR_NULL(sdev->vdet)) {
 		dev_err(dev, "failed to get regulator 'vdet'\n");
+		ret = -ENXIO;
 		goto fail_fout_vpll;
 	}
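
The class of bug addressed by this patch can be shown with a small userspace sketch. The names below (ex_get_resource, ex_probe) are hypothetical and only mimic the shape of a probe function with goto-based cleanup; the point is that each failure branch has to set a negative error code before jumping, because ret may still hold 0 from an earlier successful step.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical resource lookup: returns NULL on failure. */
static const char *ex_get_resource(int available)
{
	return available ? "resource" : NULL;
}

/* Probe-style function: 0 on success, negative errno value on failure. */
int ex_probe(int first_ok, int second_ok)
{
	int ret = 0;

	if (!ex_get_resource(first_ok)) {
		ret = -ENXIO;
		goto fail;
	}

	if (!ex_get_resource(second_ok)) {
		/* Without this assignment ret would still be 0 here and the
		 * caller would see success although the lookup failed. */
		ret = -ENXIO;
		goto fail_release_first;
	}

	return 0;

fail_release_first:
	/* the first resource would be released here */
fail:
	return ret;
}
```

Dropping the second assignment makes ex_probe(1, 0) return 0, which is the kind of nonnegative return on an error path that the semantic match is designed to flag.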
 



Re: [PATCH] virtio-blk: Fix kconfig option

2012-09-06 Thread Michael S. Tsirkin
On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote:
 On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote:
  Kent Overstreet koverstr...@google.com writes:
  
   CONFIG_VIRTIO isn't exposed, everything else is supposed to select it
   instead.
  
  This is a slight mis-understanding.  It's supposed to be selected by
  the particular driver, probably virtio_pci in your case.
 
 So are you saying virtio-blk depends on virtio-pci? If so, the kconfig
 should have that.
 
 As is, VIRTIO_BLK just has:
   depends on EXPERIMENTAL && VIRTIO
 
 which is flat out broken.

I don't think anything is broken.
Can you show an example of a broken configuration?

-- 
MST


Re: [PATCH] OMAP GPIO - don't wake from suspend unless requested.

2012-09-06 Thread Shilimkar, Santosh
On Thu, Sep 6, 2012 at 1:21 PM, NeilBrown ne...@suse.de wrote:
 On Thu, 6 Sep 2012 12:57:46 +0530 Shilimkar, Santosh
 santosh.shilim...@ti.com wrote:

 On Thu, Sep 6, 2012 at 12:32 PM, NeilBrown ne...@suse.de wrote:
  On Thu, 6 Sep 2012 11:18:09 +0530 Shilimkar, Santosh
  santosh.shilim...@ti.com wrote:
 
  On Thu, Sep 6, 2012 at 8:35 AM, NeilBrown ne...@suse.de wrote:
   On Mon, 3 Sep 2012 22:59:06 -0700 Shilimkar, Santosh
   santosh.shilim...@ti.com wrote:
 
   After thinking a bit more on this, the problem seems to be coming
   mainly because the gpio device is runtime suspended a bit earlier than
   it should be. A similar issue is seen with the i2c driver as well. The i2c
   issue was discussed with Rafael at LPC last week. The idea is to move
   the pm_runtime_enable/disable() calls entirely up to the
   _late/_early stage of device suspend/resume.
   I will update this thread once I have a further update.
  
   This won't be late enough.  IRQCHIP_MASK_ON_SUSPEND takes effect after
   all the _late callbacks have been called.
   I, too, spoke to Rafael about this in San Diego.  He seemed to agree
   with me that the interrupt needs to be masked in the ->suspend callback.
   Any later is too late.
  
  Thanks for the information about your discussion. I will wait for the
  patch then.
 
  Regards
  santosh
 
  I already sent a patch - that is what started this thread :-)
 
  I include it below.
  You said "The patch doesn't seems to be correct" but didn't expand on why.
  Do you still think it is not correct?  I wouldn't be surprised if there is
  some case that it doesn't handle quite right, but it seems right to me.
 
 Sorry, I thought you were talking about moving the "Checking wakeup interrupts"
 bit earlier, as discussed in the follow-up on the alternative patch suggested to
 make use of IRQCHIP_MASK_ON_SUSPEND.

 I think we need to fix the issue seen with the 'IRQCHIP_MASK_ON_SUSPEND'
 patch. That is at least the expected way to manage suspend and wakeup
 irq masks for an IRQCHIP.

 That is what I thought at first too.  But when talking to Rafael he said that
 IRQCHIP_MASK_ON_SUSPEND was intended mainly for clock interrupts.  For other
 less fundamental interrupts, doing the mask/unmask in suspend/resume
 callbacks is sufficient and simpler... and actually works.

That is not how I understand the use of IRQCHIP_MASK_ON_SUSPEND.
I thought it could be used by any IRQ chip for masking or setting wakeup on
the interrupt lines managed by that chip for suspend. Maybe I am wrong.

 IRQCHIP_MASK_ON_SUSPEND is currently used by precisely two drivers:

arch/arm/mach-omap2/omap-wakeupgen.c
 and
drivers/mfd/pm8xxx-irq.c

 which suggests that it isn't widely used and quite possibly doesn't actually
 work in general.
I have seen a lot more platforms use it in downstream kernels. I have also
recently seen patches on the linux-arm list for a couple of platforms.


 Maybe we need to start a new thread and pester Rafael or Thomas Gleixner
 to either explain what is intended for this case, or to fix
 IRQCHIP_MASK_ON_SUSPEND so that it can be used in general.

Sounds like a good idea. Since you already had a discussion with Rafael,
you can probably describe the issue better.

Regards
Santosh


Re: [PATCH v2 2/2] virtio-ring: Allocate indirect buffers from cache when possible

2012-09-06 Thread Michael S. Tsirkin
On Thu, Sep 06, 2012 at 05:27:23PM +0930, Rusty Russell wrote:
 Michael S. Tsirkin m...@redhat.com writes:
  Yes, without checksum the net core always linearizes packets, so yes, it is
  screwed.
  For -net, skb always allocates space for 17 frags + linear part, so
  it seems sane to do the same in the virtio core and allocate, for -net,
  up to max_frags + 1 from cache.
  We can adjust it: no _SG -> 2, otherwise 18.
 
 But I thought it used individual buffers these days?

Yes for receive, no for transmit. That's probably why
we should have the threshold per vq, not per device, BTW.

-- 
MST


Re: [PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works

2012-09-06 Thread Michael S. Tsirkin
On Thu, Sep 06, 2012 at 09:46:50AM +0200, Paolo Bonzini wrote:
 VIRTIO_BALLOON_F_MUST_TELL_HOST cannot be used properly because it is a
 negative feature: it tells you that silent deflate is not supported.
 Right now, QEMU refuses migration if the target does not support all the
 features that were negotiated.  But then:
 
 - a migration from non-MUST_TELL_HOST to MUST_TELL_HOST will succeed,
 which is wrong;
 
 - a migration from MUST_TELL_HOST to non-MUST_TELL_HOST will fail, which
 is useless.
 
 Add instead a new feature VIRTIO_BALLOON_F_SILENT_DEFLATE, and deprecate
 VIRTIO_BALLOON_F_MUST_TELL_HOST since it is never actually used.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com

Frankly I think it's a qemu migration bug. I don't see why
we need to tweak spec: just teach qemu to be smarter
during migration.

Can you show a scenario with old driver/new hypervisor or
new driver/old hypervisor that fails?

 ---
  virtio-spec.lyx | 36 +---
  1 file changed, 33 insertions(+), 3 deletions(-)
 
 diff --git a/virtio-spec.lyx b/virtio-spec.lyx
 index 7a073f4..1a25a18 100644
 --- a/virtio-spec.lyx
 +++ b/virtio-spec.lyx
 @@ -6238,6 +6238,8 @@ bits
  
  \begin_deeper
  \begin_layout Description
 +
 +\change_deleted 1531152142 1346917221
  VIRTIO_BALLOON_F_MUST_TELL_HOST
  \begin_inset space ~
  \end_inset
 @@ -6251,6 +6253,20 @@ VIRTIO_BALLOON_F_STATS_VQ
  \end_inset
  
  (1) A virtqueue for reporting guest memory statistics is present.
 +\change_inserted 1531152142 1346917193
 +
 +\end_layout
 +
 +\begin_layout Description
 +
 +\change_inserted 1531152142 1346917219
 +VIRTIO_BALLOON_F_SILENT_DEFLATE
 +\begin_inset space ~
 +\end_inset
 +
 +(2) Host does not need to be told before pages from the balloon are used.
 +\change_unchanged
 +
  \end_layout
  
  \end_deeper
 @@ -6401,9 +6417,23 @@ The driver constructs an array of addresses of memory 
 pages it has previously
  \end_layout
  
  \begin_layout Enumerate
 -If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is set, the guest may not
 - use these requested pages until that descriptor in the deflateq has been
 - used by the device.
 +If the VIRTIO_BALLOON_F_
 +\change_deleted 1531152142 1346917234
 +MUST_TELL_HOST
 +\change_inserted 1531152142 1346917237
 +SILENT_DEFLATE
 +\change_unchanged
 + feature is 
 +\change_inserted 1531152142 1346917241
 +not 
 +\change_unchanged
 +set, the guest may not use these requested pages until that descriptor in
 + the deflateq has been used by the device.
 +
 +\change_inserted 1531152142 1346917253
 + If it is set, the guest may choose to not use the deflateq at all.
 +\change_unchanged
 +
  \end_layout
  
  \begin_layout Enumerate
 -- 
 1.7.11.2
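
The migration argument in the patch description can be checked with a small sketch. It assumes the rule stated in the message (the target must offer every feature that was negotiated on the source) and uses hypothetical EX_* names; bit 0 for MUST_TELL_HOST and bit 2 for the proposed SILENT_DEFLATE follow the numbering shown in the spec diff. This is an illustration of the reasoning only, not QEMU's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Negative feature: when set, silent deflate is NOT allowed. */
#define EX_F_MUST_TELL_HOST	(1u << 0)
/* Proposed positive feature: when set, silent deflate is allowed. */
#define EX_F_SILENT_DEFLATE	(1u << 2)

/* Migration is accepted only if the target offers every negotiated bit. */
bool ex_migration_allowed(uint32_t negotiated, uint32_t target)
{
	return (negotiated & ~target) == 0;
}
```

Under this rule a guest that negotiated without MUST_TELL_HOST (and may deflate silently) is accepted by a target that requires being told, while a guest that negotiated MUST_TELL_HOST is refused by a target that would have been fine either way. With the positive SILENT_DEFLATE bit the same subset check gives the intended answers in both directions.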


Re: [PATCH v2] powerpc: fix personality handling in ppc64_personality()

2012-09-06 Thread Jiri Kosina
On Thu, 6 Sep 2012, Benjamin Herrenschmidt wrote:

  actually commit 7256a5d2da56 seems to contain the correct PER_LINUX 
  handling, so seems like you picked the right one :)
  
 
 Odd, they looked different around the use of PER_MASK when I looked but

The original patch had

personality = ~PER_LINUX | PER_LINUX32;

Which is bogus, exactly because ~PER_LINUX is -1.

I then used

personality = (personality & ~PER_MASK) | PER_LINUX32;

which is correct and perhaps a little bit more descriptive, and that is 
what you have merged, so all is fine.
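
A short userspace sketch of why the mask matters. The constants are redefined here with an EX_ prefix so that the example builds outside the kernel; their values are the ones from the kernel's personality header, and the function names are hypothetical. Since PER_LINUX is 0, its complement has every bit set, so an expression built from ~PER_LINUX cannot select PER_LINUX32 while preserving the flag bits, whether it is assigned or and-ed into personality.

```c
/* Values as in the kernel's personality header, redefined for userspace. */
#define EX_PER_MASK		0x00ffUL
#define EX_PER_LINUX		0x0000UL
#define EX_PER_LINUX32		0x0008UL
#define EX_ADDR_NO_RANDOMIZE	0x0040000UL

/* The mask from the first version: all bits set, because PER_LINUX is 0. */
unsigned long ex_bogus_mask(void)
{
	return ~EX_PER_LINUX | EX_PER_LINUX32;
}

/* The merged form: replace only the personality type, keep the flag bits. */
unsigned long ex_set_linux32(unsigned long personality)
{
	return (personality & ~EX_PER_MASK) | EX_PER_LINUX32;
}
```

For example, ex_set_linux32(EX_ADDR_NO_RANDOMIZE | EX_PER_LINUX) keeps the ADDR_NO_RANDOMIZE flag and switches the low byte to PER_LINUX32.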

 I was tired & jet lagged, so I might have just had a brain fail...

Probably just missed that the first patch used PER_LINUX and the second 
one PER_MASK, or whatever.

Anyway, thanks.

-- 
Jiri Kosina
SUSE Labs


Re: [PATCH v3] kconfig: replace 'oldnoconfig' with 'olddefconfig', and keep the old name as an alias

2012-09-06 Thread Adam Lee
On Sat, Sep 01, 2012 at 01:05:17AM +0800, Adam Lee wrote:
 As 67d34a6a391369269a2e5dba8a5f42cc4cd50231 said, 'oldnoconfig' doesn't
 set new symbols to 'n', but instead sets them to their default values.
 
 So this patch replaces 'oldnoconfig' with 'olddefconfig' to stop
 confusing people, and keeps the old name 'oldnoconfig' as an alias,
 because people already depend on its behavior under the
 counter-intuitive name.
 
 v3: use a better way and add comments about the alias in conf.c

Hi, Michal

How about this version? It replaces the name and keeps the old one as an
alias. It seems everyone will be happy, and nobody has to put up with the
counter-intuitive name.

-- 
Regards,
Adam Lee
http://adam8157.info


Re: [RFC v2] memory-hotplug: remove MIGRATE_ISOLATE from free_area->free_list

2012-09-06 Thread Lai Jiangshan
On 09/06/2012 04:18 PM, Minchan Kim wrote:
 Hello Lai,
 
 On Thu, Sep 06, 2012 at 04:14:51PM +0800, Lai Jiangshan wrote:
 On 09/06/2012 10:53 AM, Minchan Kim wrote:
 Normally, the MIGRATE_ISOLATE type is used for memory-hotplug.
 But it's an ironic type, because the isolated pages exist
 as free pages in free_area->free_list[MIGRATE_ISOLATE], so people
 can think of them as allocatable pages, but they are *never* allocatable.
 It ends up confusing the NR_FREE_PAGES vmstat, making it
 inaccurate, so some places which depend on that vmstat
 could reach a wrong decision.

 There was already a report about it.[1]
 [1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem

 Then there was another report about a different problem.[2]
 [2] http://www.spinics.net/lists/linux-mm/msg41251.html

 I believe it can cause problems in the future, too,
 so I hope to remove this ironic type with another design.

 I hope this patch solves it, so that we can revert [1] and don't need [2].

 * Changelog v1
  * Fix from Michal's many suggestion

 Cc: Michal Nazarewicz min...@mina86.com
 Cc: Mel Gorman m...@csn.ul.ie
 Cc: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
 Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
 Cc: Wen Congyang we...@cn.fujitsu.com
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Signed-off-by: Minchan Kim minc...@kernel.org
 ---

 @@ -180,30 +287,35 @@ int undo_isolate_page_range(unsigned long start_pfn, 
 unsigned long end_pfn,
   * all pages in [start_pfn...end_pfn) must be in the same zone.
   * zone-lock must be held before call this.
   *
 - * Returns 1 if all pages in the range are isolated.
 + * Returns true if all pages in the range are isolated.
   */
 -static int
 -__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
 +static bool
 +__test_page_isolated_in_pageblock(unsigned long start_pfn, unsigned long 
 end_pfn)
  {
 +   unsigned long pfn, next_pfn;
 struct page *page;
  
 -   while (pfn < end_pfn) {
 -   if (!pfn_valid_within(pfn)) {
 -   pfn++;
 -   continue;
 -   }
 -   page = pfn_to_page(pfn);
 -   if (PageBuddy(page))
 -   pfn += 1 << page_order(page);
 -   else if (page_count(page) == 0 &&
 -   page_private(page) == MIGRATE_ISOLATE)
 -   pfn += 1;
 -   else
 -   break;
 +   list_for_each_entry(page, &isolated_pages, lru) {

 +   if (&page->lru == &isolated_pages)
 +   return false;

 what's the mean of this line?
 
 I just copied it from Michal's code but It seem to be not needed.
 I will remove it in next spin.
 

 +   pfn = page_to_pfn(page);
 +   if (pfn >= end_pfn)
 +   return false;



 +   if (pfn >= start_pfn)
 +   goto found;

this test is wrong.

if ((pfn <= start_pfn) && (start_pfn < pfn + (1UL << page_order(page))))
goto found;


 +   }
 +   return false;
 +
 +   list_for_each_entry_continue(page, &isolated_pages, lru) {
 +   if (page_to_pfn(page) != next_pfn)
 +   return false;

 where is next_pfn init-ed? 
 
 by goto found

don't goto inner label.

move the found label up:

+
+found:
+   next_pfn = page_to_pfn(page);
+   list_for_each_entry_from(page, &isolated_pages, lru) {
+   if (page_to_pfn(page) != next_pfn)
+   return false;
+   pfn = page_to_pfn(page);
+   next_pfn = pfn + (1UL << page_order(page));
+   if (next_pfn >= end_pfn)
+   return true;
}


Re: [RFC v2] memory-hotplug: remove MIGRATE_ISOLATE from free_area->free_list

2012-09-06 Thread Lai Jiangshan
On 09/06/2012 04:18 PM, Minchan Kim wrote:
 Hello Lai,
 
 On Thu, Sep 06, 2012 at 04:14:51PM +0800, Lai Jiangshan wrote:
 On 09/06/2012 10:53 AM, Minchan Kim wrote:
 Normally, the MIGRATE_ISOLATE type is used for memory-hotplug.
 But it is an ironic type, because the isolated pages sit as free pages
 in free_area->free_list[MIGRATE_ISOLATE], so people can mistake them
 for allocatable pages although they are *never* allocatable.
 This confuses the NR_FREE_PAGES vmstat and makes it inaccurate, so
 code that depends on that vmstat can reach the wrong decision.

 There were already report about it.[1]
 [1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem

 Then, there was other report which is other problem.[2]
 [2] http://www.spinics.net/lists/linux-mm/msg41251.html

 I believe it can make problems in future, too.
 So I hope removing such irony type by another design.

 I hope this patch solves it and let's revert [1] and doesn't need [2].

 * Changelog v1
  * Fix from Michal's many suggestion

 Cc: Michal Nazarewicz min...@mina86.com
 Cc: Mel Gorman m...@csn.ul.ie
 Cc: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
 Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
 Cc: Wen Congyang we...@cn.fujitsu.com
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Signed-off-by: Minchan Kim minc...@kernel.org
 ---

 @@ -180,30 +287,35 @@ int undo_isolate_page_range(unsigned long start_pfn, 
 unsigned long end_pfn,
   * all pages in [start_pfn...end_pfn) must be in the same zone.
   * zone-lock must be held before call this.
   *
 - * Returns 1 if all pages in the range are isolated.
 + * Returns true if all pages in the range are isolated.
   */
 -static int
 -__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
 +static bool
 +__test_page_isolated_in_pageblock(unsigned long start_pfn, unsigned long 
 end_pfn)
  {
 +   unsigned long pfn, next_pfn;
 struct page *page;
  
 -   while (pfn < end_pfn) {
 -   if (!pfn_valid_within(pfn)) {
 -   pfn++;
 -   continue;
 -   }
 -   page = pfn_to_page(pfn);
 -   if (PageBuddy(page))
 -   pfn += 1 << page_order(page);
 -   else if (page_count(page) == 0 &&
 -   page_private(page) == MIGRATE_ISOLATE)
 -   pfn += 1;
 -   else
 -   break;
 +   list_for_each_entry(page, &isolated_pages, lru) {

 +   if (&page->lru == &isolated_pages)
 +   return false;

 what's the mean of this line?
 
 I just copied it from Michal's code but It seem to be not needed.
 I will remove it in next spin.
 

 +   pfn = page_to_pfn(page);
 +   if (pfn >= end_pfn)
 +   return false;



 +   if (pfn >= start_pfn)
 +   goto found;

this test is wrong.

use this:

if ((pfn <= start_pfn) && (start_pfn < pfn + (1UL << page_order(page))))
goto found;

if (pfn > start_pfn)
return false;


 +   }
 +   return false;
 +
 +   list_for_each_entry_continue(page, &isolated_pages, lru) {
 +   if (page_to_pfn(page) != next_pfn)
 +   return false;

 where is next_pfn init-ed? 
 
 by goto found

don't goto inner label.

move the found label up:

+
+found:
+   next_pfn = page_to_pfn(page);
+   list_for_each_entry_from(page, &isolated_pages, lru) {
+   if (page_to_pfn(page) != next_pfn)
+   return false;
+   pfn = page_to_pfn(page);
+   next_pfn = pfn + (1UL << page_order(page));
+   if (next_pfn >= end_pfn)
+   return true;
}
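
For illustration only, the check suggested above can be sketched in
userspace C. This is not the kernel code: struct iso_block and
range_is_isolated() are hypothetical names, a plain array stands in for the
isolated_pages list, and the sketch assumes the blocks are sorted by
ascending pfn (the early "return false" in the suggestion relies on the
same ordering).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a free buddy block on the isolated list. */
struct iso_block {
	unsigned long pfn;	/* first page frame number of the block */
	unsigned int order;	/* the block spans 1UL << order pages */
};

/*
 * Return true if the blocks, sorted by ascending pfn, cover every pfn in
 * [start_pfn, end_pfn) without a hole.
 */
static bool range_is_isolated(const struct iso_block *blk, size_t n,
			      unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long next_pfn;
	size_t i;

	assert(start_pfn < end_pfn);

	/* Step 1: find the block that contains start_pfn. */
	for (i = 0; i < n; i++) {
		unsigned long first = blk[i].pfn;
		unsigned long last = first + (1UL << blk[i].order);

		if (first <= start_pfn && start_pfn < last)
			break;
		if (first > start_pfn)
			return false;	/* sorted list: start_pfn sits in a hole */
	}
	if (i == n)
		return false;

	/* Step 2: each following block must start where the previous ended. */
	next_pfn = blk[i].pfn;
	for (; i < n; i++) {
		if (blk[i].pfn != next_pfn)
			return false;
		next_pfn = blk[i].pfn + (1UL << blk[i].order);
		if (next_pfn >= end_pfn)
			return true;
	}
	return false;
}
```

Initialising next_pfn before the second loop is what moving the "found"
label above the loop achieves: the first iteration then re-checks the block
that was just found instead of jumping into the middle of the loop body.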


Re: [PATCH 2/2] mm: support MIGRATE_DISCARD

2012-09-06 Thread Mel Gorman
On Thu, Sep 06, 2012 at 09:29:35AM +0100, Mel Gorman wrote:
 On Thu, Sep 06, 2012 at 02:31:12PM +0900, Minchan Kim wrote:
  Hi Mel,
  
  On Wed, Sep 05, 2012 at 11:56:11AM +0100, Mel Gorman wrote:
   On Wed, Sep 05, 2012 at 05:11:13PM +0900, Minchan Kim wrote:
This patch introduces MIGRATE_DISCARD mode in migration.
It drops *clean cache pages* instead of migrating them, so that
migration latency can be reduced by avoiding (memcpy + page remapping).
It's useful for CMA because migration latency matters more than
eviction of the background processes' working set. In addition, it needs
fewer free pages as migration targets, so it can avoid memory reclaim
to get free pages, which is another factor that increases latency.

   
   Bah, this was released while I was reviewing the older version. I did
   not read this one as closely but I see the enum problems have gone away
   at least. I'd still prefer if CMA had an additional helper to discard
   some pages with shrink_page_list() and migrate the remaining pages with
   migrate_pages(). That would remove the need to add a MIGRATE_DISCARD
   migrate mode at all.
  
  I am not convinced with your point. What's the benefit on separating
  reclaim and migration? For just removing MIGRATE_DISCARD mode?
 
 Maintainability. There are reclaim functions and there are migration
 functions. Your patch takes migrate_pages() and makes it partially a
 reclaim function mixing up the responsibilities of migrate.c and vmscan.c.
 
  I don't think it's bad, because my implementation is very simple (maybe
  much simpler than separating reclaim and migration) and
  could be used by others like memory-hotplug in future.
 
 They could also have used the helper function from CMA that takes a list
 of pages, reclaims some and migrates other.
 

I also do not accept that your approach is inherently simpler than what I
proposed to you. This is not tested at all but it should be functionally
similar to both your patches except that it keeps the responsibility for
reclaim in vmscan.c

Your diffstats are

8 files changed, 39 insertions(+), 36 deletions(-)
3 files changed, 46 insertions(+), 4 deletions(-)

Mine is

 3 files changed, 32 insertions(+), 5 deletions(-)

Fewer files changed and fewer lines inserted.
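
To illustrate the split of responsibilities, here is a minimal userspace
sketch. It is not the prototype below: struct fake_page and
discard_clean_pages() are hypothetical, and the real shrink_page_list()
checks far more than "file backed and not dirty". The point is only that a
reclaim-side helper filters the list first, and whatever remains is handed
to migration unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only what the sketch needs. */
struct fake_page {
	bool file_backed;	/* page cache page, as opposed to anonymous */
	bool dirty;		/* needs writeback before it can be dropped */
	bool discarded;		/* set once the sketch has "reclaimed" it */
};

/*
 * Drop clean page cache pages from pages[0..n) and compact the rest to the
 * front of the array. Returns how many pages are left to migrate and stores
 * the number of discarded pages in *nr_reclaimed.
 */
static size_t discard_clean_pages(struct fake_page **pages, size_t n,
				  size_t *nr_reclaimed)
{
	size_t kept = 0;
	size_t i;

	assert(pages != NULL && nr_reclaimed != NULL);

	*nr_reclaimed = 0;
	for (i = 0; i < n; i++) {
		struct fake_page *page = pages[i];

		if (page->file_backed && !page->dirty) {
			page->discarded = true;	/* cheap: no copy, no remap */
			(*nr_reclaimed)++;
		} else {
			pages[kept++] = page;	/* still has to be migrated */
		}
	}
	return kept;
}
```

A caller would run this once over the isolated list and then pass the first
"kept" entries to its migration routine, so the migration code never needs
to know that discarding happened.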

---8---
mm: cma: Discard clean pages during contiguous allocation instead of migration

This patch drops clean cache pages instead of migrating them during
alloc_contig_range() to minimise allocation latency by reducing the amount
of migration that is necessary. It's useful for CMA because latency of
migration is more important than evicting the background processes' working
set.

Prototype-not-signed-off-but-feel-free-to-pick-up-and-test
---
 mm/internal.h   |1 +
 mm/page_alloc.c |2 ++
 mm/vmscan.c |   34 +-
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index b8c91b3..6d4bdf9 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -356,3 +356,4 @@ extern unsigned long vm_mmap_pgoff(struct file *, unsigned 
long,
 unsigned long, unsigned long);
 
 extern void set_pageblock_order(void);
+unsigned long reclaim_clean_pages_from_list(struct list_head *page_list);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c66fb87..977bdb2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5670,6 +5670,8 @@ static int __alloc_contig_migrate_range(unsigned long 
start, unsigned long end)
break;
}
 
+   reclaim_clean_pages_from_list(&cc.migratepages);
+
ret = migrate_pages(&cc.migratepages,
__alloc_contig_migrate_alloc,
0, false, MIGRATE_SYNC);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8d01243..ccf7bc2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -703,7 +703,7 @@ static unsigned long shrink_page_list(struct list_head 
*page_list,
goto keep;
 
VM_BUG_ON(PageActive(page));
-   VM_BUG_ON(page_zone(page) != zone);
+   VM_BUG_ON(zone && page_zone(page) != zone);
 
sc->nr_scanned++;
 
@@ -817,7 +817,9 @@ static unsigned long shrink_page_list(struct list_head 
*page_list,
 * except we already have the page isolated
 * and know it's dirty
 */
-   inc_zone_page_state(page, NR_VMSCAN_IMMEDIATE);
+   if (zone)
+   inc_zone_page_state(page,
+   NR_VMSCAN_IMMEDIATE);
SetPageReclaim(page);
 
goto keep_locked;
@@ -947,7 +949,7 @@ keep:
 * back off and wait for congestion to clear because further reclaim
 * will encounter the same 

Re: [RFC PATCH 0/3] target: try satisfying memory requests with higher-order allocations

2012-09-06 Thread Paolo Bonzini
Il 06/09/2012 03:58, Nicholas A. Bellinger ha scritto:
 This patch series fixes this problem by using higher-order allocations
 to build the data scatterlist.  The problem is that iscsi assumes that the
 scatterlist consists of single pages, which is not true anymore.  So
 patch 2 has to introduce some relatively complicated changes to
 iscsi_map_iovec and iscsi_unmap_iovec.
 
 So enabling multi-page per SGL support is a feature that has been
 dormant within target core for a long time.  It's about time that we
 start taking advantage of it again.  ;)

Yeah, I noticed some preparation for it in tcm_fc/tfc_io.c, though too
late (they look a lot like my iscsi changes, it would have saved me some
time!).

While this is obviously not to be taken lightly, I disagree with making
this a per-fabric choice.  With a properly organized (and bisectable)
series, it should be relatively easy to review and to get right.  I
looked a bit more closely now and there are no changes needed to other
targets (actually there is a change needed in tcm_qla2xxx, but the code
is currently disabled).

There are however changes to transport_kmap_data_sg needed and a few
other places.

I definitely agree with your other comments, including making max_order
a DEF_DEV_ATTRIB.  In addition, the default max_order should be capped
based on queue_max_sectors(q) if applicable, to avoid hitting this scenario:

   /*
* XXX: if the length the device accepts is shorter than the
*  length of the S/G list entry this will cause and
*  endless loop.  Better hope no driver uses huge pages.
*/

Paolo

 While doing this, I noticed something strange in iscsit_do_crypto_hash_sg.
 Patch 1 adds a warning about it.

 
 M, looks like a separate bug with DataDigest enabled.  
 
 The approach may be completely wrong and it needs more testing anyway.
 Please review!

 
 Adding my comments inline.
 
 Thanks Paolo!
 
 --nab
 
 Paolo

 Paolo Bonzini (3):
   tcm_iscsi: warn on incorrect precondition for iscsit_do_crypto_hash_sg
   tcm_iscsi: support multiple sizes in the scatterlist
   target: try satisfying memory requests with contiguous blocks

  drivers/target/iscsi/iscsi_target.c  |  106 
 +-
  drivers/target/iscsi/iscsi_target_core.h |2 +-
  drivers/target/target_core_transport.c   |   58 ++---
  3 files changed, 138 insertions(+), 28 deletions(-)

 
 



[PATCH RESEND] fs: Build sys_stat64() and friends if __ARCH_WANT_COMPAT_STAT64

2012-09-06 Thread Catalin Marinas
On AArch64, we want the sys_stat64() and related functions for compat
support but do not need the generic struct stat64, enabled automatically
if __ARCH_WANT_STAT64.

Signed-off-by: Catalin Marinas catalin.mari...@arm.com
Acked-by: Arnd Bergmann a...@arndb.de
Cc: Alexander Viro v...@zeniv.linux.org.uk
Cc: Andrew Morton a...@linux-foundation.org
---
 fs/stat.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/stat.c b/fs/stat.c
index b6ff118..6126c5d 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -326,7 +326,7 @@ SYSCALL_DEFINE3(readlink, const char __user *, path, char 
__user *, buf,
 
 
 /* -- LFS-64 --- */
-#ifdef __ARCH_WANT_STAT64
+#if defined(__ARCH_WANT_STAT64) || defined(__ARCH_WANT_COMPAT_STAT64)
 
 #ifndef INIT_STRUCT_STAT64_PADDING
 #  define INIT_STRUCT_STAT64_PADDING(st) memset(&st, 0, sizeof(st))
@@ -415,7 +415,7 @@ SYSCALL_DEFINE4(fstatat64, int, dfd, const char __user *, 
filename,
return error;
return cp_new_stat64(stat, statbuf);
 }
-#endif /* __ARCH_WANT_STAT64 */
+#endif /* __ARCH_WANT_STAT64 || __ARCH_WANT_COMPAT_STAT64 */
 
 /* Caller is here responsible for sufficient locking (ie. inode->i_lock) */
 void __inode_add_bytes(struct inode *inode, loff_t bytes)



Re: PCI/e1000 BUG: unable to handle kernel paging request at 0ffff163

2012-09-06 Thread Fengguang Wu
On Wed, Sep 05, 2012 at 11:41:04AM -0700, Yinghai Lu wrote:
 On Tue, Sep 4, 2012 at 11:51 PM, Fengguang Wu fengguang...@intel.com wrote:
  Yinghai,
 
  There are many kernel paging errors showing up in tree:
 
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git 
  for-pci-for-each-res-addon-v2
 
  The below summary shows that
 
  1) it's a reliably reproducible bug
  2) all paging fault happens at address 0163 and in some e1000 functions
 
  I'll try to bisect if the root cause is not obvious to you.  (Cannot
  do so for now because there are 3 bisections on the way and I cannot
  afford more..)
 
 thanks, will check that...

Yinghai, I'm very sorry that it's a false report...

The root cause is memory corruption by the isdnloop driver:

== [9.345694] isdnloop-ISDN-driver Rev 1.11.6.7
== [9.347484] isdnloop: (loop0) virtual card added
[9.348444] bus: 'usb': driver_probe_device: matched device 1-1:2.0 
with driver cdc_acm
[9.349773] bus: 'usb': really_probe: probing driver cdc_acm with 
device 1-1:2.0
[9.350967] cdc_acm 1-1:2.0: This device cannot do calls on its own. 
It is not a modem.
[9.353255] cdc_acm 1-1:2.0: ttyACM0: USB ACM device
[9.354137] BUG: unable to handle kernel paging request at 0163
[9.355214] IP: [0163] 0x162
[9.355869] *pde = 

Which was recently fixed by

commit 77f00f6324cb97cf1df6f9c4aaeea6ada23abdb2
Author: Wu Fengguang fengguang...@intel.com
Commit: David S. Miller da...@davemloft.net
CommitDate: Fri Aug 3 16:53:22 2012 -0700

isdnloop: fix and simplify isdnloop_init()

Fix a buffer overflow bug by removing the revision and printk.

Thanks,
Fengguang


Re: [PATCH 1/4] clk: Provide option for clk_get_rate to issue hw for new rate

2012-09-06 Thread Ulf Hansson
Hi Mike,

Thanks for your input, and sorry for my late reply!

On 31 August 2012 21:29, Mike Turquette mturque...@ti.com wrote:
 Quoting Ulf Hansson (2012-08-31 05:21:28)
 From: Ulf Hansson ulf.hans...@linaro.org

 By using CLK_GET_RATE_NOCACHE flag, we tell the clk_get_rate API to
 issue the hw for an updated clock rate. This can be used for a clock
 which rate may be updated without a client necessary modifying it.


 I'm glad to see this.  We discussed whether the default behavior should
 be cached or from the hardware at length some time back, so having a
 flag to support the non-default is great.

 Signed-off-by: Ulf Hansson ulf.hans...@linaro.org
 ---
  drivers/clk/clk.c|   43 
 +++---
  include/linux/clk-provider.h |1 +
  2 files changed, 25 insertions(+), 19 deletions(-)

 diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
 index efdfd00..d9cbae0 100644
 --- a/drivers/clk/clk.c
 +++ b/drivers/clk/clk.c
 @@ -558,25 +558,6 @@ int clk_enable(struct clk *clk)
  EXPORT_SYMBOL_GPL(clk_enable);

  /**
 - * clk_get_rate - return the rate of clk
 - * @clk: the clk whose rate is being returned
 - *
 - * Simply returns the cached rate of the clk.  Does not query the hardware. 
  If
 - * clk is NULL then returns 0.
 - */
 -unsigned long clk_get_rate(struct clk *clk)
 -{
 -   unsigned long rate;
 -
 -   mutex_lock(&prepare_lock);
 -   rate = __clk_get_rate(clk);
 -   mutex_unlock(&prepare_lock);
 -
 -   return rate;
 -}
 -EXPORT_SYMBOL_GPL(clk_get_rate);
 -
 -/**
   * __clk_round_rate - round the given rate for a clk
   * @clk: round the rate of this clock
   *
 @@ -702,6 +683,30 @@ static void __clk_recalc_rates(struct clk *clk, 
 unsigned long msg)
  }

  /**
 + * clk_get_rate - return the rate of clk
 + * @clk: the clk whose rate is being returned
 + *
 + * Simply returns the cached rate of the clk, unless CLK_GET_RATE_NOCACHE 
 flag
 + * is set, which means a recalc_rate will be issued.
 + * If clk is NULL then returns 0.
 + */
 +unsigned long clk_get_rate(struct clk *clk)
 +{
 +   unsigned long rate;
 +
 +   mutex_lock(&prepare_lock);
 +
 +   if (clk && (clk->flags & CLK_GET_RATE_NOCACHE))
 +   __clk_recalc_rates(clk, 0);

 This is a bit subtle.  Calling __clk_recalc_rates will walk the subtree
 of children recalculating rates as well as firing off notifiers.  Is
 this what you want?  If your clock changes rates behind your back AND
 has children then this is probably the right thing to do.  However you
 might be better off with:

 if (clk && (clk->flags & CLK_GET_RATE_NOCACHE))
 rate = clk->ops->recalc_rate(clk->hw, clk->parent->rate);

 This doesn't update children or fire off notifiers.  What is best for
 your platform?

For my platform, ux500, and for the clock connected to this
patch series, your suggestion above is enough. (Well, some additional
error handling is needed in your code proposal though :-) )

The reason I used __clk_recalc_rates was that I think it could make
sense to have a more generic approach, though I am not sure it is
needed, as you mention. Additionally, calling __clk_recalc_rates with
0 as the notification argument should prevent notifications from
being sent, right?

So basically, I wanted the clock rates for the children to be updated
as well as the parent clock rate, but no notifications.

I can happily update the patch according to your proposal if you still
think it is the best way to do it, just tell me again then. :-)
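
To make the "additional error handling" concrete, here is a rough userspace
sketch of the non-recursive variant. It is only an assumption of what that
could look like, not the patch: the structs are cut-down stand-ins,
recalc_rate takes the clk itself rather than clk->hw, the flag value is
arbitrary, locking is left out, and fake_hw_rate and fake_recalc_rate() are
made up to simulate a rate that changes behind the framework's back.

```c
#include <assert.h>
#include <stddef.h>

/* Arbitrary bit for this sketch; the real value lives in clk-provider.h. */
#define CLK_GET_RATE_NOCACHE	(1UL << 0)

struct clk;

struct clk_ops {
	unsigned long (*recalc_rate)(struct clk *clk,
				     unsigned long parent_rate);
};

struct clk {
	const struct clk_ops *ops;
	struct clk *parent;
	unsigned long flags;
	unsigned long rate;	/* cached rate */
};

/* Simulated hardware register that can change without the framework. */
static unsigned long fake_hw_rate;

static unsigned long fake_recalc_rate(struct clk *clk,
				      unsigned long parent_rate)
{
	assert(clk != NULL);
	(void)parent_rate;	/* a real driver would derive the rate from it */
	return fake_hw_rate;
}

static const struct clk_ops fake_ops = { .recalc_rate = fake_recalc_rate };

static unsigned long clk_get_rate_sketch(struct clk *clk)
{
	unsigned long parent_rate;

	if (!clk)
		return 0;

	if ((clk->flags & CLK_GET_RATE_NOCACHE) &&
	    clk->ops && clk->ops->recalc_rate) {
		/* Root clocks have no parent, so do not dereference blindly. */
		parent_rate = clk->parent ? clk->parent->rate : 0;
		clk->rate = clk->ops->recalc_rate(clk, parent_rate);
	}
	return clk->rate;
}
```

Note that this variant refreshes only the clock that was asked about;
children keep their cached rates, which is exactly the trade-off discussed
above.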

Kind regards
Ulf Hansson


[PATCH 1/2] mm: Fixup obsolete PG_buddy flag in error_states[]

2012-09-06 Thread Li Haifeng
PG_buddy, an abandoned flag, indicated that page(s) were free
and in the buddy allocator. So the comment should say "pages in
buddy system" instead of "PG_buddy pages".

Signed-off-by: Haifeng Li omy...@gmail.com
---
 mm/memory-failure.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ab1e714..2873498 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -762,7 +762,8 @@ static struct page_state {
{ reserved, reserved,   "reserved kernel",  me_kernel },
/*
 * free pages are specially detected outside this table:
-* PG_buddy pages only make a small fraction of all free pages.
+* pages in buddy system only make a small fraction of all
+* free pages.
 */

/*
--
1.7.5.4


Re: [PATCH v2 3/3] memory-hotplug: bug fix race between isolation and allocation

2012-09-06 Thread Yasuaki Ishimatsu

Hi, Minchan,

2012/09/06 16:30, Minchan Kim wrote:

Hello Yasuaki,

On Thu, Sep 06, 2012 at 04:17:54PM +0900, Yasuaki Ishimatsu wrote:

Hi Minchan,

2012/09/06 14:16, Minchan Kim wrote:

Like below, memory-hotplug makes race between page-isolation
and page-allocation so it can hit BUG_ON in __offline_isolated_pages.

CPU A                                   CPU B

start_isolate_page_range
set_migratetype_isolate
spin_lock_irqsave(&zone->lock)

                                        free_hot_cold_page(Page A)
                                        /* without zone->lock */
                                        migratetype = get_pageblock_migratetype(Page A);
                                        /*
                                         * Page could be moved into MIGRATE_MOVABLE
                                         * of per_cpu_pages
                                         */
                                        list_add_tail(&page->lru, &pcp->lists[migratetype]);

set_pageblock_isolate
move_freepages_block
drain_all_pages

                                        /* Page A could be in MIGRATE_MOVABLE of free_list. */

check_pages_isolated
__test_page_isolated_in_pageblock
/*
 * We can't catch freed page which
 * is free_list[MIGRATE_MOVABLE]
 */
if (PageBuddy(page A))
        pfn += 1 << page_order(page A);

                                        /* So, Page A could be allocated */

__offline_isolated_pages
/*
 * BUG_ON hit or offline page
 * which is used by someone
 */
BUG_ON(!PageBuddy(page A));

This patch checks the page's migratetype in the freelist in
__test_page_isolated_in_pageblock, so that function can now detect a page
hit by the above race and fail the memory offlining.

Signed-off-by: Minchan Kim minc...@kernel.org
---
   mm/page_isolation.c |5 -
   1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 87a7929..7ba7405 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -193,8 +193,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, 
unsigned long end_pfn)
continue;
}
page = pfn_to_page(pfn);
-   if (PageBuddy(page))
+   if (PageBuddy(page)) {
+   if (get_freepage_migratetype(page) != MIGRATE_ISOLATE)
+   break;
pfn += 1 << page_order(page);
+   }



else if (page_count(page) == 0 &&
get_freepage_migratetype(page) == MIGRATE_ISOLATE)


When this if statement is evaluated, the page may already be in use by someone.


I can't understand your point.
We already hold zone->lock, so the allocator and this function should be
atomic with respect to each other while the page is in free_list.
If I am missing something, could you elaborate?


According to your description, the page might already have been allocated
by someone at this point. In that case page->index may have been set for
its intended purpose, so page->index could happen to equal the
MIGRATE_ISOLATE value.

Thanks,
Yasuaki Ishimatsu   




In this case, page->index may hold some number. If that number is the same
as MIGRATE_ISOLATE, the code goes wrong.

Thanks,
Yasuaki Ishimatsu


pfn += 1;











[PATCH 2/2] mm: Fixup abandoned PG_buddy for private in struct page

2012-09-06 Thread Li Haifeng
PG_buddy, an abandoned flag, indicated that page(s) were free
and in the buddy allocator. When page(s) are in the buddy allocator,
_mapcount equals PAGE_BUDDY_MAPCOUNT_VALUE. So the comment should say
"_mapcount equals PAGE_BUDDY_MAPCOUNT_VALUE" instead of
"PG_buddy is set".

Signed-off-by: Haifeng Li omy...@gmail.com
---
 include/linux/mm_types.h |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 704a626..49d9247 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -126,7 +126,8 @@ struct page {
 * if PagePrivate set; used for
 * swp_entry_t if PageSwapCache;
 * indicates order in the buddy
-* system if PG_buddy is set.
+* system if _mapcount equals
+* PAGE_BUDDY_MAPCOUNT_VALUE.
 */
 #if USE_SPLIT_PTLOCKS
spinlock_t ptl;
--
1.7.5.4


CONFIG_NO_HZ + CONFIG_CPU_IDLE freeze the system (Was Re: [PATCH] acpi : remove power from acpi_processor_cx structure)

2012-09-06 Thread Daniel Lezcano
On 09/06/2012 09:54 AM, Daniel Lezcano wrote:
 On 09/05/2012 03:41 PM, Rafael J. Wysocki wrote:
 On Saturday, September 01, 2012, Rafael J. Wysocki wrote:
 On Friday, August 31, 2012, Daniel Lezcano wrote:
 On 07/24/2012 11:06 PM, Konrad Rzeszutek Wilk wrote:
 On Tue, Jul 24, 2012 at 11:12:29PM +0200, Daniel Lezcano wrote:
 Remove the power field as it is not used.

 Signed-off-by: Daniel Lezcano daniel.lezc...@linaro.org
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Acked.
 Hi Rafael,

 I did not see this patch going in. Is it possible to merge it ?
 I think so.  I'll take care of it when I get back from LinuxCon/Plumbers 
 Conf.
 (early next week).
 Applied to the linux-next branch of the linux-pm.git tree as v3.7 material.
 Thanks Rafael.

 Are there any other patches you want me to consider for v3.7?
 Yes please, I have the per cpu latencies ready to be submitted but I
 want to do extra testing before. Unfortunately, the linux-pm-next hangs
 at boot time on my intel dual core (not related to the patchset).

 I am git bisecting right now.

I found the culprit. It is not related to the linux-pm tree but to
net-next.
The following patch introduced the issue.

commit 6bdb7fe31046ac50b47e83c35cd6c6b6160a475d
Author: Amerigo Wang amw...@redhat.com
Date:   Fri Aug 10 01:24:50 2012 +

netpoll: re-enable irq in poll_napi()
   
napi->poll() needs IRQ enabled, so we have to re-enable IRQ before
calling it.
   
Cc: David Miller da...@davemloft.net
Signed-off-by: Cong Wang amw...@redhat.com
Signed-off-by: David S. Miller da...@davemloft.net

AFAICS, it has been fixed by commit
072a9c48600409d72aeb0d5b29fbb75861a06631 which is not yet in linux-pm-next.

I ran into this issue because NETCONSOLE is set; disabling it allowed
me to go further.

Unfortunately I am now facing random freezes of the system, which
seem to be related to CONFIG_NO_HZ=y and CONFIG_CPU_IDLE=y.

Disabling either one of them makes the freezes disappear.

Is it a known issue?

Thanks in advance
  -- Daniel






Re: [PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works

2012-09-06 Thread Paolo Bonzini
Il 06/09/2012 10:47, Michael S. Tsirkin ha scritto:
  - a migration from non-MUST_TELL_HOST to MUST_TELL_HOST will succeed,
  which is wrong;
  
  - a migration from MUST_TELL_HOST to non-MUST_TELL_HOST will fail, which
  is useless.
  
  Add instead a new feature VIRTIO_BALLOON_F_SILENT_DEFLATE, and deprecate
  VIRTIO_BALLOON_F_MUST_TELL_HOST since it is never actually used.
  
  Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 Frankly I think it's a qemu migration bug. I don't see why
 we need to tweak spec: just teach qemu to be smarter
 during migration.

Of course you can just teach QEMU to be smarter, but that would be a
one-off hack for the only ill-defined feature that says something is
_not_ supported.  Since in practice all virtio_balloon-enabled
hypervisors supported silent deflate (so the bit was always zero), and
no client used it (so it was never checked), it's easier to just reverse
the direction.

In fact, it's not clear how the driver should use the feature.  My guess
is that, if it wants to use silent deflate, it tries to negotiate
VIRTIO_BALLOON_F_MUST_TELL_HOST, and can use silent deflate if
negotiation fails.  This is against the logic of all other features.

 Can you show a scenario with old driver/new hypervisor or
 new driver/old hypervisor that fails?

Suppose the driver tried to negotiate the feature as above.  We then
have the two scenarios above.

In the harmless but annoying scenario, the source hypervisor doesn't
support silent deflate, so VIRTIO_BALLOON_F_MUST_TELL_HOST has been
negotiated successfully.  The destination hypervisor supports silent
deflate, so it does _not_ include the feature.  It sees that the guest
requests VIRTIO_BALLOON_F_MUST_TELL_HOST, and fails migration.

In the incorrect scenario, you are migrating to an older hypervisor.
The source hypervisor is newer and supports silent deflate, so
VIRTIO_BALLOON_F_MUST_TELL_HOST was _not_ negotiated.  The destination
hypervisor does not support silent deflate.  However, the guest is not
requesting VIRTIO_BALLOON_F_MUST_TELL_HOST, and migration succeeds.
The next time the guest tries to use a page from the balloon, badness
happens, because the hypervisor does not expect it.
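
The two scenarios, and why reversing the direction of the bit fixes them,
can be written down as a small sketch. Everything here is an assumption
made for illustration: the bit positions are arbitrary, the helper names
are invented, the guest is assumed to acknowledge every bit its source host
offered, and migration is modelled by the usual rule that the destination
must offer every feature the guest has acknowledged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are illustrative only. */
#define F_MUST_TELL_HOST	(1u << 0)	/* old, negative feature */
#define F_SILENT_DEFLATE	(1u << 1)	/* proposed, positive feature */

/* Migration is accepted when the destination offers all acked features. */
static bool migration_allowed(uint32_t guest_acked, uint32_t dest_offered)
{
	return (guest_acked & ~dest_offered) == 0;
}

/* Feature word a host offers under the old definition. */
static uint32_t host_features_old(bool supports_silent_deflate)
{
	return supports_silent_deflate ? 0 : F_MUST_TELL_HOST;
}

/* Feature word a host offers under the proposed definition. */
static uint32_t host_features_new(bool supports_silent_deflate)
{
	return supports_silent_deflate ? F_SILENT_DEFLATE : 0;
}

static void check_scenarios(void)
{
	/* Old bit, old host to new host: safe, yet migration is refused. */
	assert(!migration_allowed(host_features_old(false),
				  host_features_old(true)));
	/* Old bit, new host to old host: unsafe, yet migration succeeds. */
	assert(migration_allowed(host_features_old(true),
				 host_features_old(false)));
	/* New bit, old host to new host: safe and allowed. */
	assert(migration_allowed(host_features_new(false),
				 host_features_new(true)));
	/* New bit, new host to old host: unsafe and refused. */
	assert(!migration_allowed(host_features_new(true),
				  host_features_new(false)));
}
```

With the positive bit, the generic "destination must be a superset" rule
gives the right answer in both directions without any balloon-specific
special case in the migration code.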

Paolo


Re: [PATCH 3/3] memory-hotplug: bug fix race between isolation and allocation

2012-09-06 Thread Mel Gorman
On Thu, Sep 06, 2012 at 01:49:03PM +0900, Minchan Kim wrote:
   __offline_isolated_pages
   /*
* BUG_ON hit or offline page
* which is used by someone
*/
   BUG_ON(!PageBuddy(page A));
   
  
  offline_page calling BUG_ON because someone allocated the page is
  ridiculous. I did not spot where that check is but it should be changed. The
  correct action is to retry the isolation.
 
 It is where __offline_isolated_pges.
 
 ..
 while (pfn < end_pfn) {
 if (!pfn_valid(pfn)) {
 pfn++;
 continue;
 }
 page = pfn_to_page(pfn);
 BUG_ON(page_count(page));
 BUG_ON(!PageBuddy(page));  HERE
 order = page_order(page);
 ...
 
 The comment on offline_isolated_pages says the following:
 
 We cannot do rollback at this point
 
 So if the comment is true, BUG_ON does make sense to me.

It's massive overkill. I see no reason why it cannot return EBUSY all the
way back up to offline_pages() and retry with the migration step.  It would
both remove that BUG_ON and improve reliability of memory hot-remove.
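
A minimal userspace sketch of that idea, under loud assumptions: the
toy_* names and struct toy_page are hypothetical stand-ins, and the
"migration step" here simply frees one busy page per call to model partial
progress.  It only illustrates returning -EBUSY and retrying instead of
hitting BUG_ON; it is not the kernel's offline_pages() logic.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only tracks "is it free?". */
struct toy_page {
	bool in_buddy;
};

/* Returns 0 if every page is free, -EBUSY otherwise (no BUG_ON). */
int toy_offline_isolated_pages(const struct toy_page *pages, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!pages[i].in_buddy)
			return -EBUSY;
	return 0;
}

/* Toy migration step: frees at most one busy page per call. */
void toy_migrate_pages(struct toy_page *pages, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!pages[i].in_buddy) {
			pages[i].in_buddy = true;
			return;
		}
	}
}

/* Retry migration + offline until it succeeds or retries run out. */
int toy_offline_pages(struct toy_page *pages, size_t n, int max_retries)
{
	int attempt;
	int ret = -EBUSY;

	for (attempt = 0; attempt <= max_retries && ret == -EBUSY; attempt++) {
		toy_migrate_pages(pages, n);
		ret = toy_offline_isolated_pages(pages, n);
	}
	return ret;
}
```

The caller sees -EBUSY only after the retry budget is exhausted, so a page
that was allocated in the race window costs one more pass rather than a
crash.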

 But I don't see why we can't retry it as I look through the code.
 Anyway, it's another story which isn't related to this patch.
 

True.

  
   Signed-off-by: Minchan Kim minc...@kernel.org
  
  At no point in the changelog do you actually say what the patch does :/
 
 Argh, I will do.
 
  
   ---
mm/page_isolation.c |5 -
1 file changed, 4 insertions(+), 1 deletion(-)
   
   diff --git a/mm/page_isolation.c b/mm/page_isolation.c
   index acf65a7..4699d1f 100644
   --- a/mm/page_isolation.c
   +++ b/mm/page_isolation.c
   @@ -196,8 +196,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, 
   unsigned long end_pfn)
 continue;
 }
 page = pfn_to_page(pfn);
   - if (PageBuddy(page))
   + if (PageBuddy(page)) {
   + if (get_page_migratetype(page) != MIGRATE_ISOLATE)
   + break;
 pfn += 1 << page_order(page);
   + }
  
  It is possible the page is moved to the MIGRATE_ISOLATE list between when
  the page was freed to the buddy allocator and this check was made. The
  page->index information is stale and the impact is that the hotplug
  operation fails when it could have succeeded. That said, I think it is a
  very unlikely race that will never happen in practice.
 
 I understand you mean move_freepages which I have missed. Right?

Yes.

 Then, I will fix it, too.
 
  
  More importantly, the effect of this path is that EBUSY gets bubbled all
  the way up and the hotplug operation fails. This is fine but as the page
  is free at the time this problem is detected you also have the option
  of moving the PageBuddy page to the MIGRATE_ISOLATE list at this time
  if you take the zone lock. This will mean you need to change the name of
  test_pages_isolated() of course.
 
 Sorry, I can't get your point. Could you elaborate it more?

You detect a PageBuddy page but it's on the wrong list. Instead of returning
and failing memory-hotremove, move the free page to the correct list at
the time it is detected.

 Is it related to this patch?

No, it's not important and was a suggestion on how it could be made
better. However, retrying hot-remove would be even better again. I'm not
suggesting it be done as part of this series.

-- 
Mel Gorman
SUSE Labs


Re: [PATCH] virtio-blk: Fix kconfig option

2012-09-06 Thread Kent Overstreet
On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote:
 On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote:
  On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote:
   Kent Overstreet koverstr...@google.com writes:
   
CONFIG_VIRTIO isn't exposed, everything else is supposed to select it
instead.
   
   This is a slight mis-understanding.  It's supposed to be selected by
   the particular driver, probably virtio_pci in your case.
  
  So are you saying virtio-blk depends on virtio-pci? If so, the kconfig
  should have that.
  
  As is, VIRTIO_BLK just has:
  depends on EXPERIMENTAL && VIRTIO
  
  which is flat out broken.
 
 I don't think anything is broken.
 Can you show an example of a broken configuration?

Do you not understand the difference between depends and selects? Or did
you not read my original mail?

Flip off everything in drivers -> virtio

Now go to drivers -> block and try to turn on virtio-blk.

It's not listed!

Now go back to drivers -> virtio and turn on (randomly) balloon.

Go back to drivers -> block, and now you can turn on virtio-blk!

Do you see what's wrong with this picture?


Re: linux-next: manual merge of the arm-soc tree with the usb tree

2012-09-06 Thread Roland Stigge
On 09/06/2012 07:42 AM, Stephen Rothwell wrote:
 Today's linux-next merge of the arm-soc tree got a conflict in 
 drivers/usb/host/Kconfig between commit 952230d774bb (usb: ohci:
 Fix Kconfig dependency on USB_ISP1301) from the usb tree and
 commit d684f05f2d55 (ARM: mach-pnx4008: Remove architecture) from
 the arm-soc tree.
 
 I fixed it up (see below) and can carry the fix as necessary (no
 action required).

Thanks - this little conflict was expected when merging pnx4008
removal and the isp1301 dependency fix.

And the fixup is correct.

Roland


Re: [RFC PATCH] [media] rc: filter out not allowed protocols when decoding

2012-09-06 Thread Changbin Du
Sean, many thanks for your help. I know much more about the IR framework
now. I'll try to work out a patch to remove allowed_protocols.

Thanks again!
[Du, Changbin]

2012/9/4 Sean Young s...@mess.org:
 On Tue, Sep 04, 2012 at 11:06:07AM +0800, Changbin Du wrote:
     mutex_lock(&ir_raw_handler_lock);
   - list_for_each_entry(handler, &ir_raw_handler_list, list)
   -         handler->decode(raw->dev, ev);
   + list_for_each_entry(handler, &ir_raw_handler_list, list) {
   +         /* use all protocol by default */
   +         if (raw->dev->allowed_protos == RC_TYPE_UNKNOWN ||
   +             raw->dev->allowed_protos & handler->protocols)
   +                 handler->decode(raw->dev, ev);
   + }
 
  Each IR protocol decoder already checks whether it is enabled or not;
  should it not be so that only allowed protocols can be enabled rather
  than checking both enabled_protocols and allowed_protocols?
 
  Just from reading store_protocols it looks like decoders which aren't
  in allowed_protocols can be enabled, which makes no sense. Also in
  ir_raw_event_register all protocols are enabled rather than the
  allowed ones.
 
 
  Lastly I don't know why raw ir drivers should dictate which protocols
  can be enabled. Would it not be better to remove it entirely?


 I agree with you. I just thought that the only thing a decoder should care
 about is its decoding logic, not decoder management. My idea is:
  1) use enabled_protocols to select decoders in ir_raw.c, but not
 placed in decoders to do the judgement.
  2) remove  allowed_protocols or just use it to set the default
 decoder (also should rename allowed_protocols  to default_protocol).

 The default decoder should be the one set by the rc keymap.

 I also have a question:
  Is there a requirement that more than one decoder be enabled for an
 IR device at the same time?

 Yes, you want to be able to use multiple remotes on the IR device (which
 you can do as long as the scancodes don't overlap, I think), and the
 lirc device is implemented as a decoder, so you might want to see the
 raw IR as well as have it decoded.

 And would that lead to an issue where each decoder may decode the
 same pulse sequence to different events since their protocols are
 different?

 At the moment, no. David Hardeman has sent a patch for this:

 http://patchwork.linuxtv.org/patch/11388/


 Sean


Re: [PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works

2012-09-06 Thread Michael S. Tsirkin
On Thu, Sep 06, 2012 at 11:24:02AM +0200, Paolo Bonzini wrote:
 Il 06/09/2012 10:47, Michael S. Tsirkin ha scritto:
   - a migration from non-MUST_TELL_HOST to MUST_TELL_HOST will succeed,
   which is wrong;
   
   - a migration from MUST_TELL_HOST to non-MUST_TELL_HOST will fail, which
   is useless.
   
   Add instead a new feature VIRTIO_BALLOON_F_SILENT_DEFLATE, and deprecate
   VIRTIO_BALLOON_F_MUST_TELL_HOST since it is never actually used.
   
   Signed-off-by: Paolo Bonzini pbonz...@redhat.com
  Frankly I think it's a qemu migration bug. I don't see why
  we need to tweak spec: just teach qemu to be smarter
  during migration.
 
 Of course you can just teach QEMU to be smarter, but that would be a
 one-off hack for the only ill-defined feature that says something is
 _not_ supported.  Since in practice all virtio_balloon-enabled
 hypervisors supported silent deflate (so the bit was always zero), and
 no client used it (so it was never checked), it's easier to just reverse
 the direction.
 
 In fact, it's not clear how the driver should use the feature.  My guess
 is that, if it wants to use silent deflate, it tries to negotiate
 VIRTIO_BALLOON_F_MUST_TELL_HOST, and can use silent deflate if
 negotiation fails.  This is against the logic of all other features.

Let's take a step back from the implementation details.
You are trying to add a new feature bit, after all.
Why? Why is silent deflate useful? This is what is
missing in all this discussion. If it is not useful
we do not need a bit for it.

  Can you show a scenario with old driver/new hypervisor or
  new driver/old hypervisor that fails?
 
 Suppose the driver tried to negotiate the feature as above.  We then
 have the two scenarios above.
 
 In the harmless but annoying scenario, the source hypervisor doesn't
 support silent deflate, so VIRTIO_BALLOON_F_MUST_TELL_HOST has been
 negotiated successfully.  The destination hypervisor supports silent
 deflate, so it does _not_ include the feature.  It sees that the guest
 requests VIRTIO_BALLOON_F_MUST_TELL_HOST, and fails migration.
 
 In the incorrect scenario, you are migrating to an older hypervisor.
 The source hypervisor is newer and supports silent deflate, so
 VIRTIO_BALLOON_F_MUST_TELL_HOST was _not_ negotiated.  The destination
 hypervisor does not support silent deflate.  However, the guest is not
 requesting VIRTIO_BALLOON_F_MUST_TELL_HOST, and migration succeeds.
 Next time the guest tries to use a page from the balloon, badness
 happens, because the hypervisor does not expect it.
 
 Paolo

Sorry this is not the example I asked for.  Please give an example
without migration.

Migration is qemu's problem: it is hypervisor's job to
make sure guest sees no change during migration.
It should be able to do this with any hardware it emulates,
there should be no need to change hardware to make it
migrateable somehow.

-- 
MST


Re: snd-usb: delay: estimated 0, actual 352

2012-09-06 Thread Markus Trippelsdorf
On 2012.09.06 at 10:21 +0200, Takashi Iwai wrote:
 At Thu, 06 Sep 2012 09:35:26 +0200,
 Takashi Iwai wrote:
 
 In short, a patch like below may fix the issue (note: completely
 untested!)

No it doesn't, unfortunately...

-- 
Markus


Re: [PATCH] virtio-blk: Fix kconfig option

2012-09-06 Thread Michael S. Tsirkin
On Thu, Sep 06, 2012 at 02:25:12AM -0700, Kent Overstreet wrote:
 On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote:
  On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote:
   On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote:
Kent Overstreet koverstr...@google.com writes:

 CONFIG_VIRTIO isn't exposed, everything else is supposed to select it
 instead.

This is a slight mis-understanding.  It's supposed to be selected by
the particular driver, probably virtio_pci in your case.
   
   So are you saying virtio-blk depends on virtio-pci? If so, the kconfig
   should have that.
   
   As is, VIRTIO_BLK just has:
 depends on EXPERIMENTAL && VIRTIO
   
   which is flat out broken.
  
  I don't think anything is broken.
  Can you show an example of a broken configuration?
 
 Do you not understand the difference between depends and selects?
 Or did you not read my original mail?
 Flip off everything in drivers -> virtio
 
 Now go to drivers -> block and try to turn on virtio-blk.
 
 It's not listed!

Yes. Because you disabled all virtio backends.
It does not make sense to have any frontends.

 Now go back to drivers -> virtio and turn on (randomly) balloon.
 
 Go back to drivers -> block, and now you can turn on virtio-blk!
 
 Do you see what's wrong with this picture?

Yes. You got unlucky with your random guess.
It's a bug in balloon kconfig: it should not
select virtio.
I sent a patch to fix that yesterday.

-- 
MST


Re: [PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works

2012-09-06 Thread Paolo Bonzini
Il 06/09/2012 11:44, Michael S. Tsirkin ha scritto:
 In fact, it's not clear how the driver should use the feature.  My guess
 is that, if it wants to use silent deflate, it tries to negotiate
 VIRTIO_BALLOON_F_MUST_TELL_HOST, and can use silent deflate if
 negotiation fails.  This is against the logic of all other features.
 
 Let's take a step back from the implementation details.
 You are trying to add a new feature bit, after all.
 Why? Why is silent deflate useful? This is what is
 missing in all this discussion. If it is not useful
 we do not need a bit for it.

It is useful because it lets guests inflate the balloon aggressively,
and then use ballooned-out pages even in places where the guest OS
cannot sleep, such as kmalloc(GFP_ATOMIC).

 Can you show a scenario with old driver/new hypervisor or
 new driver/old hypervisor that fails?

 Sorry this is not the example I asked for.  Please give an example
 without migration.
 
 Migration is qemu's problem: it is hypervisor's job to
 make sure guest sees no change during migration.

Quoting my message: Of course you can just teach QEMU to be smarter,
but that would be a one-off hack for the only ill-defined feature that
says something is _not_ supported.

Currently migration works the same way for all virtio devices, and
assumes that features are defined only in the positive direction:
drivers request features if they want to use them, devices provide
features to say they support something.

Instead, in the case of this feature, the driver requests it before
relying on its lack (which is odd); the device provides it if it does not
support something (which is wrong).  You can see that this just cannot
provide backwards-compatibility in the device; it happens to work only
because the feature was there in the first version of the spec.
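
For contrast, here is a minimal sketch of the same generic check applied
to a positively-defined bit such as VIRTIO_BALLOON_F_SILENT_DEFLATE.  The
bit number is a placeholder, since none is assigned in this thread, and the
helper names are illustrative rather than QEMU or kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit number: an assumption made only for this sketch. */
#define SILENT_DEFLATE_BIT 2
#define F_SILENT_DEFLATE (UINT32_C(1) << SILENT_DEFLATE_BIT)

/* Negotiated features: what the device offers AND the driver requests. */
uint32_t negotiate(uint32_t device_offers, uint32_t driver_requests)
{
	return device_offers & driver_requests;
}

/* Assumed generic check: refuse if the guest uses a bit the destination lacks. */
bool migration_allowed(uint32_t negotiated, uint32_t dest_offers)
{
	return (negotiated & ~dest_offers) == 0;
}

/* The driver may deflate silently only if the bit was negotiated. */
bool may_deflate_silently(uint32_t negotiated)
{
	return (negotiated & F_SILENT_DEFLATE) != 0;
}
```

Here a guest that negotiated the bit cannot be moved to a device that
lacks it, and a guest that never got the bit migrates freely and keeps
telling the host, so the generic rule gives the right answer in both
directions without a special case.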

 It should be able to do this with any hardware it emulates,
 there should be no need to change hardware to make it
 migrateable somehow.

Of course, but if we can fix the hardware with no bad effects, let's do
that instead.

Paolo


Re: [PATCH] virtio-blk: Fix kconfig option

2012-09-06 Thread Kent Overstreet
On Thu, Sep 06, 2012 at 12:49:56PM +0300, Michael S. Tsirkin wrote:
 On Thu, Sep 06, 2012 at 02:25:12AM -0700, Kent Overstreet wrote:
  On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote:
   On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote:
On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote:
 Kent Overstreet koverstr...@google.com writes:
 
  CONFIG_VIRTIO isn't exposed, everything else is supposed to select 
  it
  instead.
 
 This is a slight mis-understanding.  It's supposed to be selected by
 the particular driver, probably virtio_pci in your case.

So are you saying virtio-blk depends on virtio-pci? If so, the kconfig
should have that.

As is, VIRTIO_BLK just has:
depends on EXPERIMENTAL && VIRTIO

which is flat out broken.
   
   I don't think anything is broken.
   Can you show an example of a broken configuration?
  
  Do you not understand the difference between depends and selects?
  Or did you not read my original mail?
  Flip off everything in drivers -> virtio
  
  Now go to drivers -> block and try to turn on virtio-blk.
  
  It's not listed!
 
 Yes. Because you disabled all virtio backends.
 It does not make sense to have any frontends.

How's a user - or even another kernel developer who isn't familiar with
virtio - supposed to know that?

I still don't know what exactly a virtio backend is - the term isn't
even mentioned anywhere that I've seen.

Whatever it is though virtio-blk should be depending on _that_, not a
config option that _isn't exposed in the menu_!

  Now go back to drivers -> virtio and turn on (randomly) balloon.
  
  Go back to drivers -> block, and now you can turn on virtio-blk!
  
  Do you see what's wrong with this picture?
 
 Yes. You got unlucky with your random guess.
 It's a bug in balloon kconfig: it should not
 select virtio.
 I sent a patch to fix that yesterday.

Then it's also a bug in the comments at the top of
drivers/virtio/Kconfig.

And besides that, how the _hell_ is a user supposed to know to turn on
VIRTIO_PCI before VIRTIO_BLK? It's not documented anywhere (if that is
what's supposed to happen! I still don't know) and even if it was
documented, having one kconfig option depend on something that's exposed
in a _completely different menu_ is just made of fail.


[PATCH] x86, 32-bit: Fix invalid stack address while in softirq

2012-09-06 Thread Robert Richter
(Resending patch with [PATCH] in subject line and updated cc list.)

On 06.09.12 09:30:37, wyang1 wrote:
 Robert,
 
 I agree with what you said; my patch is more like a workaround.
 
  So the proper fix I see is to fix kernel_stack_pointer() to return a
  valid stack in case of an empty stack while in softirq. Something like
  the patch below. Maybe it must be optimized a bit. I tested the patch
  over night with no issues found. Please test it too.
 
 I also tested the following patch over night. it is fine.:-)

Wei,

thanks for testing.

Ingo,

please take a look at this. Not sure if Linus wants to look at this too
and if we need more optimization here.

Thanks,

-Robert


From 8e7c16913b1fcfc63f7b24337551aacc7153c334 Mon Sep 17 00:00:00 2001
From: Robert Richter robert.rich...@amd.com
Date: Mon, 3 Sep 2012 20:54:48 +0200
Subject: [PATCH] x86, 32-bit: Fix invalid stack address while in softirq

In 32 bit the stack address provided by kernel_stack_pointer() may
point to an invalid range causing NULL pointer access or page faults
while in NMI (see trace below). This happens if called in softirq
context and if the stack is empty. The address at &regs->sp is then
out of range.

Fixing this by checking if regs and &regs->sp are in the same stack
context. Otherwise return the previous stack pointer stored in struct
thread_info.
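
The "same stack context" test can be sketched in userspace as follows.
This is an illustration only and not the patch itself: TOY_THREAD_SIZE
(8 KiB, size-aligned stacks) is an assumption, the toy_* names are
hypothetical, and the addresses passed in are plain integers rather than
real kernel pointers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stack size: a power of two, with stacks aligned to their size. */
#define TOY_THREAD_SIZE UINT64_C(8192)

/* Base address of the stack that contains addr. */
uint64_t toy_stack_base(uint64_t addr)
{
	return addr & ~(TOY_THREAD_SIZE - 1);
}

/* Two addresses share a stack context if they have the same base. */
bool toy_same_stack_context(uint64_t a, uint64_t b)
{
	return toy_stack_base(a) == toy_stack_base(b);
}

/*
 * Return sp_addr if it lies in the same stack as regs_addr; otherwise
 * fall back to the previously saved stack pointer.
 */
uint64_t toy_kernel_stack_pointer(uint64_t regs_addr, uint64_t sp_addr,
				  uint64_t previous_sp)
{
	if (toy_same_stack_context(regs_addr, sp_addr))
		return sp_addr;
	return previous_sp;
}
```

When the register frame sits at the very top of an otherwise empty stack,
the address of its sp field falls just past the end of that stack, so the
two bases differ and the saved pointer is returned instead.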

 BUG: unable to handle kernel NULL pointer dereference at 000a
 IP: [c1004237] print_context_stack+0x6e/0x8d
 *pde = 
 Oops:  [#1] SMP
 Modules linked in:
 Pid: 4434, comm: perl Not tainted 3.6.0-rc3-oprofile-i386-standard-g4411a05 #4 
Hewlett-Packard HP xw9400 Workstation/0A1Ch
 EIP: 0060:[c1004237] EFLAGS: 00010093 CPU: 0
 EIP is at print_context_stack+0x6e/0x8d
 EAX: e000 EBX: 000a ECX: f4435f94 EDX: 000a
 ESI: f4435f94 EDI: f4435f94 EBP: f5409ec0 ESP: f5409ea0
  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
 CR0: 8005003b CR2: 000a CR3: 34ac9000 CR4: 07d0
 DR0:  DR1:  DR2:  DR3: 
 DR6: 0ff0 DR7: 0400
 Process perl (pid: 4434, ti=f5408000 task=f5637850 task.ti=f4434000)
 Stack:
  03e8 e000 1ffc f4e39b00  000a f4435f94 c155198c
  f5409ef0 c1003723 c155198c f5409f04  f5409edc  
  f5409ee8 f4435f94 f5409fc4 0001 f5409f1c c12dce1c  c155198c
 Call Trace:
  [c1003723] dump_trace+0x7b/0xa1
  [c12dce1c] x86_backtrace+0x40/0x88
  [c12db712] ? oprofile_add_sample+0x56/0x84
  [c12db731] oprofile_add_sample+0x75/0x84
  [c12ddb5b] op_amd_check_ctrs+0x46/0x260
  [c12dd40d] profile_exceptions_notify+0x23/0x4c
  [c1395034] nmi_handle+0x31/0x4a
  [c1029dc5] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [c13950ed] do_nmi+0xa0/0x2ff
  [c1029dc5] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [c13949e5] nmi_stack_correct+0x28/0x2d
  [c1029dc5] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [c1003603] ? do_softirq+0x4b/0x7f
  IRQ
  [c102a06f] irq_exit+0x35/0x5b
  [c1018f56] smp_apic_timer_interrupt+0x6c/0x7a
  [c1394746] apic_timer_interrupt+0x2a/0x30
 Code: 89 fe eb 08 31 c9 8b 45 0c ff 55 ec 83 c3 04 83 7d 10 00 74 0c 3b 5d 10 
73 26 3b 5d e4 73 0c eb 1f 3b 5d f0 76 1a 3b 5d e8 73 15 8b 13 89 d0 89 55 e0 
e8 ad 42 03 00 85 c0 8b 55 e0 75 a6 eb cc
 EIP: [c1004237] print_context_stack+0x6e/0x8d SS:ESP 0068:f5409ea0
 CR2: 000a
 ---[ end trace 62afee3481b00012 ]---
 Kernel panic - not syncing: Fatal exception in interrupt

Reported-by: Yang Wei wei.y...@windriver.com
Cc: sta...@vger.kernel.org
Signed-off-by: Robert Richter robert.rich...@amd.com
---
 arch/x86/include/asm/ptrace.h |   15 ---
 arch/x86/kernel/ptrace.c  |   21 +
 arch/x86/oprofile/backtrace.c |2 +-
 3 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index dcfde52..19f16eb 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -205,21 +205,14 @@ static inline bool user_64bit_mode(struct pt_regs *regs)
 }
 #endif
 
-/*
- * X86_32 CPUs don't save ss and esp if the CPU is already in kernel mode
- * when it traps.  The previous stack will be directly underneath the saved
- * registers, and 'sp/ss' won't even have been saved. Thus the '&regs->sp'.
- *
- * This is valid only for kernel mode traps.
- */
-static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
-{
 #ifdef CONFIG_X86_32
-   return (unsigned long)(&regs->sp);
+extern unsigned long kernel_stack_pointer(struct pt_regs *regs);
 #else
+static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
+{
    return regs->sp;
-#endif
 }
+#endif
 
 #define GET_IP(regs) ((regs)->ip)
 #define GET_FP(regs) ((regs)->bp)
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index c4c6a5c..5a9a8c9 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -165,6 +165,27 @@ static inline bool invalid_selector(u16 value)
 
 #define FLAG_MASK  

[PATCH] UDF: Fix incorrect error handling in udf_direct_IO()

2012-09-06 Thread Ian Abbott
My recent patch to add DIRECT_IO support to the UDF filesystem handler
contains a mistake in the error recovery if blockdev_direct_IO() fails.
The test `rw && WRITE` should be `rw & WRITE`.  Fix it.

Signed-off-by: Ian Abbott abbo...@mev.co.uk
---
 fs/udf/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/udf/inode.c b/fs/udf/inode.c
index b905448..41d5830 100644
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -156,7 +156,7 @@ static ssize_t udf_direct_IO(int rw, struct kiocb *iocb,
 
ret = blockdev_direct_IO(rw, iocb, inode, iov, offset, nr_segs,
  udf_get_block);
-   if (unlikely(ret < 0 && (rw && WRITE)))
+   if (unlikely(ret < 0 && (rw & WRITE)))
udf_write_failed(mapping, offset + iov_length(iov, nr_segs));
return ret;
 }
-- 
1.7.12
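
A small standalone sketch of why the two tests differ.  TOY_READ and
TOY_WRITE mirror the kernel convention that bit 0 of rw marks a write;
TOY_FLAG_SYNC is a hypothetical extra flag bit used only to show a nonzero
rw value that is still a read.

```c
#include <stdbool.h>

#define TOY_READ      0
#define TOY_WRITE     1
/* Hypothetical extra flag carried in rw alongside the direction bit. */
#define TOY_FLAG_SYNC (1 << 4)

/* Logical AND: true for ANY nonzero rw, even when the write bit is clear. */
bool is_write_buggy(int rw)
{
	return rw && TOY_WRITE;
}

/* Bitwise AND: true only when the write bit itself is set. */
bool is_write_fixed(int rw)
{
	return (rw & TOY_WRITE) != 0;
}
```

For a plain TOY_READ or TOY_WRITE both helpers agree; they diverge as soon
as a read carries any other flag bit, which is when the buggy form would
wrongly take the write-failure path.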



Re: [PATCH 1/3] btrfs: remove unnecessary -ENOMEM BUG_ON check in extent-tree.c/exclude_super_stripes

2012-09-06 Thread David Sterba
On Thu, Sep 06, 2012 at 02:40:41PM +0800, Wang Sheng-Hui wrote:
 The memory allocation failure is BUG_ON in add_excluded_extent (following
 the code path) and btrfs_rmap_block. No need to BUG_ON -ENOMEM inside
 exclude_super_stripes itself.

No please.

 Its return value is always 0, and useless for its callers. Make it return
 void instead of 0.

btrfs_rmap_block itself contains a BUG_ON:

int btrfs_rmap_block(struct btrfs_mapping_tree *map_tree,
                     u64 chunk_start, u64 physical, u64 devid,
                     u64 **logical, int *naddrs, int *stripe_len)
{
        struct extent_map_tree *em_tree = &map_tree->map_tree;
        struct extent_map *em;
        struct map_lookup *map;
        u64 *buf;
        u64 bytenr;
        u64 length;
        u64 stripe_nr;
        int i, j, nr = 0;

        read_lock(&em_tree->lock);
        em = lookup_extent_mapping(em_tree, chunk_start, 1);
        read_unlock(&em_tree->lock);

        BUG_ON(!em || em->start != chunk_start);

And this should be turned into a 'return error', thus giving a non-zero return
code that should be handled in the callers.
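
A rough userspace sketch of that shape, with loud assumptions: the toy_*
names and struct toy_extent_map are hypothetical, and -EIO is only a
placeholder for whichever error code would actually be chosen.  It shows
the failed lookup being reported to the caller and propagated, rather than
asserted.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct extent_map. */
struct toy_extent_map {
	uint64_t start;
};

/* Returns 0 on success, -EIO if no mapping starts at chunk_start. */
int toy_rmap_block(const struct toy_extent_map *em, uint64_t chunk_start)
{
	/* was: BUG_ON(!em || em->start != chunk_start); */
	if (em == NULL || em->start != chunk_start)
		return -EIO;
	return 0;
}

/* The caller checks and propagates the error, so it cannot be void. */
int toy_exclude_super_stripes(const struct toy_extent_map *em,
			      uint64_t chunk_start)
{
	int ret = toy_rmap_block(em, chunk_start);

	if (ret)
		return ret;
	/* ... the real work would follow here ... */
	return 0;
}
```

Once the callee can fail this way, the caller's return value carries real
information, which is the objection to making it void now.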

Eg. this patch attempts to do that
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg15470.html

but has not been merged due to incorrect fix inside exclude_super_stripes
(introduced in the patch).

The same objection for return code cleanups will hold for any function that
returns 0 but is full of BUG_ONs.


david


[PATCH 1/1] [SCSI] scsi_debug: Add removable parameter

2012-09-06 Thread Martin Pitt
Add removable module parameter to set the removable attribute of any
subsequently created debug block device. It is a writable driver option, so
that you can switch between removable and fixed media block devices in
between the add_host calls.

This is useful for being able to test the different behaviour/required
privileges in e. g. the udisks test suite.

Signed-off-by: Martin Pitt martin.p...@ubuntu.com
Acked-By: David Zeuthen zeut...@gmail.com
---
 drivers/scsi/scsi_debug.c |   30 +++---
 1 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 182d5a5..57fbd5a 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -109,6 +109,7 @@ static const char * scsi_debug_version_date = "20100324";
 #define DEF_OPT_BLKS 64
 #define DEF_PHYSBLK_EXP 0
 #define DEF_PTYPE   0
+#define DEF_REMOVABLE false
 #define DEF_SCSI_LEVEL   5/* INQUIRY, byte2 [5-SPC-3] */
 #define DEF_SECTOR_SIZE 512
 #define DEF_UNMAP_ALIGNMENT 0
@@ -193,11 +194,11 @@ static unsigned int scsi_debug_unmap_granularity = 
DEF_UNMAP_GRANULARITY;
 static unsigned int scsi_debug_unmap_max_blocks = DEF_UNMAP_MAX_BLOCKS;
 static unsigned int scsi_debug_unmap_max_desc = DEF_UNMAP_MAX_DESC;
 static unsigned int scsi_debug_write_same_length = DEF_WRITESAME_LENGTH;
+static bool scsi_debug_removable = DEF_REMOVABLE;
 
 static int scsi_debug_cmnd_count = 0;
 
 #define DEV_READONLY(TGT)  (0)
-#define DEV_REMOVEABLE(TGT)(0)
 
 static unsigned int sdebug_store_sectors;
 static sector_t sdebug_capacity;   /* in sectors */
@@ -919,7 +920,7 @@ static int resp_inquiry(struct scsi_cmnd * scp, int target,
return ret;
}
/* drops through here for a standard inquiry */
-   arr[1] = DEV_REMOVEABLE(target) ? 0x80 : 0; /* Removable disk */
+   arr[1] = scsi_debug_removable ? 0x80 : 0;   /* Removable disk */
arr[2] = scsi_debug_scsi_level;
arr[3] = 2;/* response_data_format==2 */
arr[4] = SDEBUG_LONG_INQ_SZ - 5;
@@ -1211,7 +1212,7 @@ static int resp_format_pg(unsigned char * p, int 
pcontrol, int target)
p[11] = sdebug_sectors_per & 0xff;
p[12] = (scsi_debug_sector_size >> 8) & 0xff;
p[13] = scsi_debug_sector_size & 0xff;
-   if (DEV_REMOVEABLE(target))
+   if (scsi_debug_removable)
p[20] |= 0x20; /* should agree with INQUIRY */
if (1 == pcontrol)
memset(p + 2, 0, sizeof(format_pg) - 2);
@@ -2754,6 +2755,7 @@ module_param_named(opt_blks, scsi_debug_opt_blks, int, 
S_IRUGO);
 module_param_named(opts, scsi_debug_opts, int, S_IRUGO | S_IWUSR);
 module_param_named(physblk_exp, scsi_debug_physblk_exp, int, S_IRUGO);
 module_param_named(ptype, scsi_debug_ptype, int, S_IRUGO | S_IWUSR);
+module_param_named(removable, scsi_debug_removable, bool, S_IRUGO | S_IWUSR);
 module_param_named(scsi_level, scsi_debug_scsi_level, int, S_IRUGO);
 module_param_named(sector_size, scsi_debug_sector_size, int, S_IRUGO);
 module_param_named(unmap_alignment, scsi_debug_unmap_alignment, int, S_IRUGO);
@@ -2796,6 +2798,7 @@ MODULE_PARM_DESC(opt_blks, "optimal transfer length in 
block (def=64)");
 MODULE_PARM_DESC(opts, "1->noise, 2->medium_err, 4->timeout, 
8->recovered_err... (def=0)");
 MODULE_PARM_DESC(physblk_exp, "physical block exponent (def=0)");
 MODULE_PARM_DESC(ptype, "SCSI peripheral type(def=0[disk])");
+MODULE_PARM_DESC(removable, "claim to have removable media (def=0)");
 MODULE_PARM_DESC(scsi_level, "SCSI level to simulate(def=5[SPC-3])");
 MODULE_PARM_DESC(sector_size, "logical block size in bytes (def=512)");
 MODULE_PARM_DESC(unmap_alignment, "lowest aligned thin provisioning lba 
(def=0)");
@@ -3205,6 +3208,25 @@ static ssize_t sdebug_map_show(struct device_driver 
*ddp, char *buf)
 }
 DRIVER_ATTR(map, S_IRUGO, sdebug_map_show, NULL);
 
+static ssize_t sdebug_removable_show(struct device_driver *ddp,
+char *buf)
+{
+   return scnprintf(buf, PAGE_SIZE, "%d\n", scsi_debug_removable ? 1 : 0);
+}
+static ssize_t sdebug_removable_store(struct device_driver *ddp,
+ const char *buf, size_t count)
+{
+   int n;
+
+   if ((count > 0) && (1 == sscanf(buf, "%d", &n)) && (n >= 0)) {
+       scsi_debug_removable = (n > 0);
+   return count;
+   }
+   return -EINVAL;
+}
+DRIVER_ATTR(removable, S_IRUGO | S_IWUSR, sdebug_removable_show,
+   sdebug_removable_store);
+
 
 /* Note: The following function creates attribute files in the
/sys/bus/pseudo/drivers/scsi_debug directory. The advantage of these
@@ -3230,6 +3252,7 @@ static int do_create_driverfs_files(void)
ret |= driver_create_file(&sdebug_driverfs_driver, 
&driver_attr_num_tgts);
ret |= driver_create_file(&sdebug_driverfs_driver, &driver_attr_ptype);
ret |= driver_create_file(&sdebug_driverfs_driver, &driver_attr_opts);
+   ret |= 

[PATCH 0/1] Option for scsi_debug to fake removable devices

2012-09-06 Thread Martin Pitt
Hello all,

I already re-sent this 1.5 months ago, but did not get any answer back
then; I guess it got lost in the noise by now. So, patiently retrying
again.

For the purposes of automatically testing udisks and gvfs automounting
I would like to add a parameter to scsi_debug to control the
removable attribute of the created block device. With that, we can
test system-internal and removable drives, as well as CD-ROMs (which
scsi_debug can already emulate). udisks requires different privileges
for mounting system-internal drives vs.  removable/hotpluggable
drives. This will also allow us to write system integration tests for
gvfs, which will exercise the whole stack including the actual polkit
configuration in a VM.
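
The write side of such a switch can be sketched in userspace as follows.
This is an illustration under assumptions, not the driver code: the name
toy_removable_store and its bool out-parameter are hypothetical.  It
accepts a non-negative integer, treats anything above zero as "removable",
and rejects everything else with -EINVAL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse buf as a decimal integer.  On success store (n > 0) in *removable
 * and return count; on empty, non-numeric or negative input return -EINVAL
 * and leave *removable untouched.
 */
long toy_removable_store(bool *removable, const char *buf, size_t count)
{
	int n;

	if (count > 0 && sscanf(buf, "%d", &n) == 1 && n >= 0) {
		*removable = (n > 0);
		return (long)count;
	}
	return -EINVAL;
}
```

A test harness could write "1" before creating one fake host and "0"
before the next, to get a removable and a fixed device side by side.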

I wrote a simple kernel patch for this (against linux-next), and
tested this quite thoroughly.

I ran the style checker, and it reports two problems:

------------ 8< ------------
WARNING: line over 80 characters
#109: FILE: drivers/scsi/scsi_debug.c:3255:
+	ret |= driver_create_file(&sdebug_driverfs_driver, &driver_attr_removable);

WARNING: Prefer pr_err(... to printk(KERN_ERR, ...
#126: FILE: drivers/scsi/scsi_debug.c:3353:
+	printk(KERN_ERR "scsi_debug_init: removable must be 0 or 1\n");
------------ 8< ------------

But as the existing code uses this style in the adjacent lines, I
favored consistency over fixing those. If the latter is desired, I'd
rather send a separate patch with just the style cleanup for the whole
file.

I got an ack from David Zeuthen (the primary udisks maintainer)
already, noted so in the patch.

Thank you in advance for considering,

Martin

-- 
Martin Pitt| http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)




Re: [PATCH RFC tip/core/rcu] Add callback-free CPUs

2012-09-06 Thread Peter Zijlstra
On Wed, 2012-09-05 at 16:44 -0700, Paul E. McKenney wrote:

 I was excited by this possibility when you first mentioned it, but
 the low-OS-jitter fans are going to need the grace-period computation
 to be offloaded as well. 

Sure, but it seems to me pulling the grace period machinery out is a
much harder feat and should be a patch (series) on its own. Also..

  So if I use your (admittedly much simpler)
 approach, I get to rewrite it when Frederic's adaptive-ticks work goes
 in. 

I don't see how Frederic's work affects any of this; that would simply
put RCU into an extended quiescent state (aka idle) while in userspace. In
that state the grace period machinery is stopped altogether, so it
doesn't matter who would have run it.

  Given that this is probably happening relatively soon, it would be
 better if I just did the implementation that will be needed long-term,
 rather than rewriting.
 
 Though I am sure that people will be sad about fewer RCU patches.  ;-)

Always...

Now thinking about this grace machinery stuff a little more, would it be
possible to stick the entire state machine in a kthread and replace all
current hooks, like the tick and rcu_read_unlock_special with a message
passing construct such that they pass their event on to the kthread?

That way you could run the entire state thing from a kthread with random
affinity, all 'per-cpu' data would still be fine since only the one
kthread will access it, even though locality might suffer somewhat.

This would also not suffer from having to keep one CPU special, the
ugly bouncing, etc.
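
To make the shape of that idea concrete, here is a minimal single-threaded model in userspace C. It is only a sketch under loud assumptions: the event names, post_event() and drain_events() are all hypothetical, and real code would need proper synchronisation between the hooks and the kthread, which is left out here. The point is just that the hooks only enqueue, and a single consumer owns all of the state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types standing in for the tick and
 * rcu_read_unlock_special() hooks. */
enum gp_event { EV_TICK, EV_UNLOCK_SPECIAL };

#define QUEUE_SIZE 8	/* power of two, so index wraparound stays consistent */

struct event_queue {
	enum gp_event slots[QUEUE_SIZE];
	size_t head;	/* next slot to read */
	size_t tail;	/* next slot to write */
};

/* State that only the consumer (the "kthread") ever touches. */
struct gp_state {
	unsigned long ticks_seen;
	unsigned long unlocks_seen;
};

/* Hook side: record the event and return; no state machine work here.
 * Returns 0 on success, -1 if the queue is full. */
static int post_event(struct event_queue *q, enum gp_event ev)
{
	if (q->tail - q->head == QUEUE_SIZE)
		return -1;
	q->slots[q->tail % QUEUE_SIZE] = ev;
	q->tail++;
	return 0;
}

/* Consumer side: drain all pending events and advance the state.
 * Returns the number of events handled. */
static size_t drain_events(struct event_queue *q, struct gp_state *st)
{
	size_t handled = 0;

	while (q->head != q->tail) {
		enum gp_event ev = q->slots[q->head % QUEUE_SIZE];

		q->head++;
		if (ev == EV_TICK)
			st->ticks_seen++;
		else
			st->unlocks_seen++;
		handled++;
	}
	return handled;
}
```

Because drain_events() is the only function that writes struct gp_state, it does not matter which CPU the consumer runs on, which is the property described above.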


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] virtio-blk: Fix kconfig option

2012-09-06 Thread Michael S. Tsirkin
On Thu, Sep 06, 2012 at 03:02:48AM -0700, Kent Overstreet wrote:
 On Thu, Sep 06, 2012 at 12:49:56PM +0300, Michael S. Tsirkin wrote:
  On Thu, Sep 06, 2012 at 02:25:12AM -0700, Kent Overstreet wrote:
   On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote:
On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote:
 On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote:
  Kent Overstreet koverstr...@google.com writes:
  
   CONFIG_VIRTIO isn't exposed, everything else is supposed to 
   select it
   instead.
  
  This is a slight mis-understanding.  It's supposed to be selected by
  the particular driver, probably virtio_pci in your case.
 
 So are you saying virtio-blk depends on virtio-pci? If so, the kconfig
 should have that.
 
 As is, VIRTIO_BLK just has:
   depends on EXPERIMENTAL && VIRTIO
 
 which is flat out broken.

I don't think anything is broken.
Can you show an example of a broken configuration?
   
   Do you not understand the difference between depends an selects?
   Or did you not read my original mail?
   Flip off everything in drivers -> virtio
   
   Now go to drivers -> block and try to turn on virtio-blk.
   
   It's not listed!
  
  Yes. Because you disabled all virtio backends.
  It does not make sense to have any frontends.
 
 How's a user - or even another kernel developer who isn't familiar with
 virtio - supposed to know that?
 
 I still don't know what exactly a virtio backend is - the term isn't
 even mentioned anywhere that I've seen.
 
 Whatever it is though virtio-blk should be depending on _that_, not a
 config option that _isn't exposed in the menu_!
 
   Now go back to drivers -> virtio and turn on (randomly) balloon.
   
   Go back to drivers -> block, and now you can turn on virtio-blk!
   
   Do you see what's wrong with this picture?
  
  Yes. You got unlucky with your random guess.
  It's a bug in balloon kconfig: it should not
  select virtio.
  I sent a patch to fix that yesterday.
 
 Then it's also a bug in the comments at the top of
 drivers/virtio/Kconfig.
 
 And besides that, how the _hell_ is a user supposed to know to turn on
 VIRTIO_PCI before VIRTIO_BLK? It's not documented anywhere (if that is
 what's supposed to happen! I still don't know)

Well, what kind of device do you have? Tell us :)
If it's a virtio pci device,
you need to enable virtio-pci and virtio-blk.

 and even if it was
 documented, having one kconfig option depend on something that's exposed
 in a _completely different menu_ is just made of fail.

Fine, but why pick on virtio?
This is extremely common in kconfig.
For example, a ton of network drivers depend
on PCI, it's exactly the same thing.


-- 
MST


Re: [PATCH 2/3] btrfs: remove unnecessary -ENOMEM BUG_ON check in extent-tree.c/btrfs_alloc_logged_file_extent

2012-09-06 Thread David Sterba
On Thu, Sep 06, 2012 at 02:41:02PM +0800, Wang Sheng-Hui wrote:
 The memory allocation failure is BUG_ON in add_excluded_extent
 (following the code path). No need to BUG_ON -ENOMEM inside
 btrfs_alloc_logged_file_extent.

This indirectly calls __set_extent_bit that does BUG_ON on memory
allocation failures. This type of error condition is hard to fix, as it
usually needs to do non-trivial cleanups in the function before
returning -ENOMEM, so the easiest way to handle it is to do BUG_ON and
then separately deal with it in another patch (so it does not mix with
the original patch).

Your patches remove (from my POV) useful marks that we have an error
condition to handle, not to hide it.

So, NAK from me for anything that looks like this. I'm of course glad to
look at patches that replace the BUG_ON with proper error handling :)
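
For illustration, "proper error handling" here means unwinding whatever was already set up and returning the error to the caller. Below is a minimal userspace sketch of that pattern, not btrfs code: record_extent(), model_alloc() and the failure knob are all hypothetical names made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator with a failure knob so the error path can be
 * exercised: -1 never fails, N >= 0 fails after N successful calls. */
static int allocs_until_failure = -1;
static int live_allocs;

static void *model_alloc(size_t size)
{
	void *p;

	if (allocs_until_failure == 0)
		return NULL;
	if (allocs_until_failure > 0)
		allocs_until_failure--;
	p = malloc(size);
	if (p)
		live_allocs++;
	return p;
}

static void model_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* Instead of BUG_ON(ret), clean up what was allocated and propagate
 * -ENOMEM so the caller can decide what to do. */
static int record_extent(void)
{
	void *state, *entry;
	int ret = 0;

	state = model_alloc(64);
	if (!state)
		return -ENOMEM;

	entry = model_alloc(64);
	if (!entry) {
		ret = -ENOMEM;
		goto out_free_state;
	}

	/* ... real work would use state and entry here ... */

	model_free(entry);
out_free_state:
	model_free(state);
	return ret;
}
```

The non-trivial part in real code is that every caller up the chain has to cope with the returned error as well, which is why this usually ends up as a separate patch.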


david

 
 Signed-off-by: Wang Sheng-Hui shh...@gmail.com
 ---
  fs/btrfs/extent-tree.c |6 ++
  1 files changed, 2 insertions(+), 4 deletions(-)
 
 diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
 index 95492cc..9b9a6fa 100644
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -6207,8 +6207,7 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
   mutex_lock(&caching_ctl->mutex);
  
   if (start >= caching_ctl->progress) {
 - ret = add_excluded_extent(root, start, num_bytes);
 - BUG_ON(ret); /* -ENOMEM */
 + add_excluded_extent(root, start, num_bytes);
   } else if (start + num_bytes <= caching_ctl->progress) {
   ret = btrfs_remove_free_space(block_group,
 start, num_bytes);
 @@ -6222,8 +6221,7 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
   start = caching_ctl->progress;
   num_bytes = ins->objectid + ins->offset -
   caching_ctl->progress;
 - ret = add_excluded_extent(root, start, num_bytes);
 - BUG_ON(ret); /* -ENOMEM */
 + add_excluded_extent(root, start, num_bytes);
   }
  
   mutex_unlock(&caching_ctl->mutex);


Re: snd-usb: delay: estimated 0, actual 352

2012-09-06 Thread Daniel Mack
On 06.09.2012 09:35, Takashi Iwai wrote:
 At Thu, 6 Sep 2012 09:17:57 +0200,
 Markus Trippelsdorf wrote:

 On 2012.09.06 at 09:08 +0200, Daniel Mack wrote:
 On 06.09.2012 08:53, Markus Trippelsdorf wrote:
 On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote:
 At Thu, 06 Sep 2012 08:33:30 +0200,
 Daniel Mack wrote:

 On 06.09.2012 08:02, Markus Trippelsdorf wrote:
 On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote:
 
 Sound fixes for 3.6-rc5

 There are nothing scaring, contains only small fixes for HD-audio and
 USB-audio:
 - EPSS regression fix and GPIO fix for HD-audio IDT codecs
 - A series of USB-audio regression fixes that are found since 3.5 
 kernel

 
 Daniel Mack (4):
   ALSA: snd-usb: Fix URB cancellation at stream start
   ALSA: snd-usb: restore delay information
  
 The commit fbcfbf5f above causes the following lines to be printed
 whenever I start a new song:

 Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which this
 patch (fbcfbf5f) brings back now.

 delay: estimated 0, actual 352
 delay: estimated 353, actual 705

 (44.1 * 8 = 352.8)

 This happens with an USB-DAC that identifies itself as C-Media USB
 Headphone Set.

 And you didn't you see these lines with 3.4?

 Maybe the difference of start condition?

 Markus, does the patch below fix anything?

 Unfortunately no.
 However reverting the following fixes the problem:

 commit 245baf983cc39524cce39c24d01b276e6e653c9e
 Author: Daniel Mack zon...@gmail.com
 Date:   Thu Aug 30 18:52:30 2012 +0200

 ALSA: snd-usb: fix calls to next_packet_size


 No, this one certainly fixes a problem and does the right thing by
 restoring the original code.

 If you wouldn't state that you didn't see the same effect with 3.4(!),
 before the refactoring done in 3.5, I would believe the device is simply
 slightly off in its feedback rate and the tighter delay code complains
 about it while compensating, just as it did before.

 Are there any more than these two lines? And is audio working at all? Is
 it distorted in any way?

 There are only these two lines (printed whenever sound starts). Audio is
 working just fine with no distortions.

 I did see similar lines before when the system load was very high
 (happend during make check when building glibc).

 Here is what Pierre-Louis wrote in November 2011:

 »This was supposed to be an informational message, I thought it was only
 enabled for debug. Regular users don't really need to know.«
 
 I guess the problem is that the new endpoint scheme doesn't count the
 last_delay update unless the stream is triggered.  In the old code,
 retire_playback_urb is always called even before the trigger(START) is
 set.  And, there retire_playback_urb() does nothing but updating the
 delay information.
 
 In the new code, retire_playback_urb is set only at
 snd_usb_substream_playback_trigger().  Thus at the very first shot,
 the delay account got confused.

In that case, I'd say we can also safely remove the debug output then.
Let's wait for Pierre-Louis' judgement here.


Daniel



Re: snd-usb: delay: estimated 0, actual 352

2012-09-06 Thread Markus Trippelsdorf
On 2012.09.06 at 12:25 +0200, Daniel Mack wrote:
 On 06.09.2012 09:35, Takashi Iwai wrote:
  At Thu, 6 Sep 2012 09:17:57 +0200,
  Markus Trippelsdorf wrote:
 
  On 2012.09.06 at 09:08 +0200, Daniel Mack wrote:
  On 06.09.2012 08:53, Markus Trippelsdorf wrote:
  On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote:
  At Thu, 06 Sep 2012 08:33:30 +0200,
  Daniel Mack wrote:
 
  On 06.09.2012 08:02, Markus Trippelsdorf wrote:
  On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote:
  
  Sound fixes for 3.6-rc5
 
  There are nothing scaring, contains only small fixes for HD-audio and
  USB-audio:
  - EPSS regression fix and GPIO fix for HD-audio IDT codecs
  - A series of USB-audio regression fixes that are found since 3.5 
  kernel
 
  
  Daniel Mack (4):
ALSA: snd-usb: Fix URB cancellation at stream start
ALSA: snd-usb: restore delay information
   
  The commit fbcfbf5f above causes the following lines to be printed
  whenever I start a new song:
 
  Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which this
  patch (fbcfbf5f) brings back now.
 
  delay: estimated 0, actual 352
  delay: estimated 353, actual 705
 
  (44.1 * 8 = 352.8)
 
  This happens with an USB-DAC that identifies itself as C-Media USB
  Headphone Set.
 
  And you didn't you see these lines with 3.4?
 
  Maybe the difference of start condition?
 
  Markus, does the patch below fix anything?
 
  Unfortunately no.
  However reverting the following fixes the problem:
 
  commit 245baf983cc39524cce39c24d01b276e6e653c9e
  Author: Daniel Mack zon...@gmail.com
  Date:   Thu Aug 30 18:52:30 2012 +0200
 
  ALSA: snd-usb: fix calls to next_packet_size
 
 
  No, this one certainly fixes a problem and does the right thing by
  restoring the original code.
 
  If you wouldn't state that you didn't see the same effect with 3.4(!),
  before the refactoring done in 3.5, I would believe the device is simply
  slightly off in its feedback rate and the tighter delay code complains
  about it while compensating, just as it did before.
 
  Are there any more than these two lines? And is audio working at all? Is
  it distorted in any way?
 
  There are only these two lines (printed whenever sound starts). Audio is
  working just fine with no distortions.
 
  I did see similar lines before when the system load was very high
  (happend during make check when building glibc).
 
  Here is what Pierre-Louis wrote in November 2011:
 
  »This was supposed to be an informational message, I thought it was only
  enabled for debug. Regular users don't really need to know.«
  
  I guess the problem is that the new endpoint scheme doesn't count the
  last_delay update unless the stream is triggered.  In the old code,
  retire_playback_urb is always called even before the trigger(START) is
  set.  And, there retire_playback_urb() does nothing but updating the
  delay information.
  
  In the new code, retire_playback_urb is set only at
  snd_usb_substream_playback_trigger().  Thus at the very first shot,
  the delay account got confused.
 
 In that case, I'd say we can also safely remove the debug output then.
 Let's wait for Pierre-Louis' judgement here.

v3.5 and v3.6-rc4 with commit fbcfbf5f67 (restore delay information)
applied on top are both fine.

-- 
Markus


Re: [PATCH] virtio-blk: Fix kconfig option

2012-09-06 Thread Kent Overstreet
On Thu, Sep 06, 2012 at 01:18:43PM +0300, Michael S. Tsirkin wrote:
 On Thu, Sep 06, 2012 at 03:02:48AM -0700, Kent Overstreet wrote:
  On Thu, Sep 06, 2012 at 12:49:56PM +0300, Michael S. Tsirkin wrote:
   On Thu, Sep 06, 2012 at 02:25:12AM -0700, Kent Overstreet wrote:
On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote:
 On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote:
  On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote:
   Kent Overstreet koverstr...@google.com writes:
   
CONFIG_VIRTIO isn't exposed, everything else is supposed to 
select it
instead.
   
   This is a slight mis-understanding.  It's supposed to be selected 
   by
   the particular driver, probably virtio_pci in your case.
  
  So are you saying virtio-blk depends on virtio-pci? If so, the 
  kconfig
  should have that.
  
  As is, VIRTIO_BLK just has:
  depends on EXPERIMENTAL && VIRTIO
  
  which is flat out broken.
 
 I don't think anything is broken.
 Can you show an example of a broken configuration?

Do you not understand the difference between depends an selects?
Or did you not read my original mail?
Flip off everything in drivers -> virtio

Now go to drivers -> block and try to turn on virtio-blk.

It's not listed!
   
   Yes. Because you disabled all virtio backends.
   It does not make sense to have any frontends.
  
  How's a user - or even another kernel developer who isn't familiar with
  virtio - supposed to know that?
  
  I still don't know what exactly a virtio backend is - the term isn't
  even mentioned anywhere that I've seen.
  
  Whatever it is though virtio-blk should be depending on _that_, not a
  config option that _isn't exposed in the menu_!
  
Now go back to drivers -> virtio and turn on (randomly) balloon.

Go back to drivers -> block, and now you can turn on virtio-blk!

Do you see what's wrong with this picture?
   
   Yes. You got unlucky with your random guess.
   It's a bug in balloon kconfig: it should not
   select virtio.
   I sent a patch to fix that yesterday.
  
  Then it's also a bug in the comments at the top of
  drivers/virtio/Kconfig.
  
  And besides that, how the _hell_ is a user supposed to know to turn on
  VIRTIO_PCI before VIRTIO_BLK? It's not documented anywhere (if that is
  what's supposed to happen! I still don't know)
 
 Well, what kind of device do you have? Tell us :)
 If it's a virtio pci device,
 you need to enable virtio-pci and virtio-blk.

I run qemu with -drive if=virtio. You tell me!

Better yet, tell me how the user is supposed to figure it out!

 
  and even if it was
  documented, having one kconfig option depend on something that's exposed
  in a _completely different menu_ is just made of fail.
 
 Fine, but why pick on virtio?
 This is extremely common in kconfig.
 For example, a ton of network drivers depend
 on PCI, it's exactly the same thing.

Never noticed where CONFIG_PCI is exposed in bus options?

Nope, not the same thing.


Re: [PATCH can-next v6] can: add tx/rx LED trigger support

2012-09-06 Thread Kurt Van Dijck
On Tue, Sep 04, 2012 at 10:15:53PM +0200, Fabio Baltieri wrote:
 On Tue, Sep 04, 2012 at 09:11:28AM +0200, Kurt Van Dijck wrote:
  On Mon, Sep 03, 2012 at 10:54:49PM +0200, Oliver Hartkopp wrote:
   On 03.09.2012 20:29, Fabio Baltieri wrote:
   
   
[...]
   The name of the device can only be changed when the interface is down.
   Is it possible to put some scripting around it to detach and attach the 
   leds
   to the interfaces on ifup/ifdown triggers?
  
  Are the led triggers available for using while the netdev is down then?
 
 Sure!  On embedded systems triggers are usually attached to actual LEDs
 at probe time using default_trigger field of struct led_classdev, and
 that can be specified both in machine files or in device tree.

I also think that LED triggers should be available.
I asked the question because detaching & attaching LEDs to interfaces would
indeed break that.

BTW, I tried to send a patch on Tuesday (my first $ git send-email) using
netdev notifiers: did you receive it, and what do you think of it?

Kurt


[GIT] kbuild rc fixes for v3.6 (v2)

2012-09-06 Thread Michal Marek
Hi Linus,

there are two fixes that should go into 3.6. The link-vmlinux.sh one is
obvious. The other one fixes make firmware_install with certain
configurations, where a file in the toplevel firmware tree gets
installed first, and $(INSTALL_FW_PATH)/$$(dir file) results in
/lib/firmware/./, which confuses make 3.82 for some reason.

v2: This time with the correct URL.

Thanks,
Michal


The following changes since commit 0d7614f09c1ebdbaa1599a5aba7593f147bf96ee:

  Linux 3.6-rc1 (2012-08-02 16:38:10 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild.git rc-fixes

for you to fetch changes up to 6c7080a61fc7b46b3ac8573952b5a3e9d5f68bc4:

  firmware: fix directory creation rule matching with make 3.82 (2012-08-30 16:27:13 +0200)


Mark Asselstine (1):
  firmware: fix directory creation rule matching with make 3.82

Michal Marek (1):
  link-vmlinux.sh: Fix stray echo in error message

 scripts/Makefile.fwinst |2 +-
 scripts/link-vmlinux.sh |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)


Re: [GIT] kbuild rc fixes for v3.6

2012-09-06 Thread Michal Marek
On 6.9.2012 07:19, Stephen Rothwell wrote:
 On Wed, 5 Sep 2012 20:32:00 -0700 Linus Torvalds 
 torva...@linux-foundation.org wrote:

 On Mon, Sep 3, 2012 at 12:18 PM, Michal Marek mma...@suse.cz wrote:
 are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6.git 
 ..BRANCH.NOT.VERIFIED..

 There's something wrong with that repo

  fatal: Could not read from remote repository.

   Please make sure you have the correct access rights
   and the repository exists.

 plus that BRANCH.NOT.VERIFIED thing looks bad too (and is probably related).

Oops.


 The kbuild-current tree in linux-next is
 
 git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild.git rc-fixes
 
 and it has those 2 commits in it.  So script update needed, I guess.

I did update my script, but the last pull request was sent from a
machine that had the old copy :).

Michal


Re: [PATCH 10/11 V5] workqueue: unbind/rebind without manager_mutex

2012-09-06 Thread Lai Jiangshan
On 09/06/2012 04:04 AM, Tejun Heo wrote:
 Hello, Lai.
 
 On Wed, Sep 05, 2012 at 06:37:47PM +0800, Lai Jiangshan wrote:
 gcwq_unbind_fn() unbind manager by ->manager pointer.

 rebinding-manger, unbinding/rebinding newly created worker are done by
 other place. so we don't need manager_mutex any more.

 Also change the comment of @bind accordingly.
 
 Please don't scatter small prep patches like this.  Each piece in
 isolation doesn't make much sense to me and the patch descriptions
 don't help much.  Please collect the prep patches and explain in more
 detail.

There are 4 different tasks: unbinding and rebinding, each for the
manager and for newly created workers (newbies).

Each patch handles one task. If I collect them into one patch, it will be
hard to explain which code does which task.

 
 In general, I'm not sure about this approach.  I'd really like the
 hotplug logic to be contained in hotplug logic proper as much as
 possible.  This scatters around hotplug handling to usual code paths
 and seems too invasive for 3.6-fixes.

I don't expect this to be fixed in 3.6; no approach is simple.

 
 Also, can you please talk to me before going ahead and sending me
 completely new 10 patch series every other day?  You're taking
 disproportionate amount of my time and I can't continue to do this.
 Please discuss with me or at least explain the high-level approach in
 the head message in detail.  Going through the patch series to figure
 out high-level design which is constantly flipping is rather
 inefficient and unfortunately your patch descriptions aren't too
 helpful.  :(
 

My English is not good, so I prefer to attach code when I present an idea
(and the code can prove the idea). I admit that my changelogs and comments
are always bad.


I have 4 ideas/approaches for the hotplug vs. manage_workers() bug.
They all came to my mind last week.
NOTE: this V5 patchset is my approach 2.

(listed in the order they came to my mind)
Approach 1  V3 patchset     non_manager_role_manager_mutex_unlock()
Approach 2  V5 patchset     rebinding the manager and unbinding/rebinding
                            newbies are done outside; no manager_mutex
                            for hotplug
Approach 3  unimplemented   move unbind/rebind to worker_thread and
                            handle them as POOL_MANAGE_WORKERS
Approach 4  V4 patchset     manage_workers_slowpath()

Approaches 2 and 3 were partially implemented last week, but approach 2
was quickly finished yesterday.
Approach 3 is too complicated to finish.


Approach 1: the simplest. After it, we can use manager_mutex anywhere as
needed, but we need to use non_manager_role_manager_mutex_unlock() to
unlock.

Approach 2: the binding of the manager and of newly created workers is
handled outside of the hotplug code, so the hotplug code doesn't need
manager_mutex. manager_mutex is a typical protect-the-code pattern, which
is not good: we should always use locks to protect data instead of code.
Although the Linux kernel has many locks that are only used for protecting
code, I think we should reduce them where possible; the removal of the big
kernel lock is an example. This approach also has fewer lines of code, but
it touches 2 places outside of the hotplug code and adds logic/paths.
GOOD to me: it disallows manager_mutex (for the future), and it is not too
much code.

Approach 3: complicated. It makes the call site and context of
unbind/rebind the same as those of manage_workers(). BAD: we are not free
to use manager_mutex in the future when needed, and I encountered some
other problems. (The approach you suggested will also have some of the
problems I encountered.)

Approach 4: the problem comes from manage_workers(), so just add
manage_workers_slowpath() to fix it inside manage_workers(). It fixes the
problem in only 1 block of code. After it, we can use manager_mutex
anywhere as needed. It has more lines of code, but they are all in one
place. GOOD: the cleanest approach.

Thanks
Lai



Re: [PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works

2012-09-06 Thread Michael S. Tsirkin
On Thu, Sep 06, 2012 at 11:57:22AM +0200, Paolo Bonzini wrote:
 Il 06/09/2012 11:44, Michael S. Tsirkin ha scritto:
  In fact, it's not clear how the driver should use the feature.  My guess
  is that, if it wants to use silent deflate, it tries to negotiate
  VIRTIO_BALLOON_F_MUST_TELL_HOST, and can use silent deflate if
  negotiation fails.  This is against the logic of all other features.
  
  Let's take a step back from the implementation details.
  You are trying to add a new feature bit, after all.
  Why? Why is silent deflate useful? This is what is
  missing in all this discussion. If it is not useful
  we do not need a bit for it.
 
 It is useful because it lets guests inflate the balloon aggressively,
 and then use ballooned-out pages even in places where the guest OS
 cannot sleep, such as kmalloc(GFP_ATOMIC).

Interesting.
Do you intend to develop a driver patch using this?  I'd like to see how
that works.  Because if not, IMO it's best to wait until someone asks
for it.

  Can you show a scenario with old driver/new hypervisor or
  new driver/old hypervisor that fails?
 
  Sorry this is not the example I asked for.  Please give and example
  without migration.
  
  Migration is qemu's problem: it is hypervisor's job to
  make sure guest sees no change during migration.
 
 Quoting my message: Of course you can just teach QEMU to be smarter,
 but that would be a one-off hack for the only ill-defined feature that
 says something is _not_ supported.

 Currently migration works the same way for all virtio devices,
 and
 assumes that features are defined only in the positive direction:
 drivers request features if they want to use it, devices provide
 features to say they support something.

Well, this approach is buggy. If I reread features after migration, what
do I see? Something changed, right? So this is a bug. Migration should
not change hardware. And it is not a one-off thing; it is
fundamental for any hardware.

Fix that in qemu, and the problem goes away without spec changes.

 Instead, in the case of this feature, the driver requests it before
 relying on its lack (which is odd);

Which code in driver do you refer to?

 the device provides if they do not
 support something (which is wrong).

Not support?
It just seems to be asking guest to tell it about deflates.
If guest acks the bit, we know it will. If it does not,
it will not.

  You can see that this just cannot
 provide backwards-compatibility in the device;

Sorry I do not understand this meta argument.
There should be an example where a driver and device
fail to work together. And without migration: as
I showed migration is simply broken atm for
an unrelated reason. Otherwise all's well.

 it happens to work only
 because the feature was there in the first version of the spec.

This is how we do compatibility in virtio. If we want the driver to do
something, we add a feature and it can ack; if it does, we know it will
do what we want.  Another example is the network announce bit.  If the
driver acks it, we know we do not need to send gratuitous ARP from qemu.
You are saying it is also broken?
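
As a rough sketch of that negotiation rule (an illustration only, not the driver or qemu code): the device offers a set of feature bits, the driver acks the ones it understands, and both sides then act on the intersection. The helper names negotiate() and must_tell_host() are hypothetical; the bit number is the one defined for the balloon device.

```c
#include <assert.h>
#include <stdint.h>

/* Feature bit number for the balloon device. */
#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0

/* Hypothetical helper: the negotiated set is what the device offers
 * AND the driver acks. */
static uint32_t negotiate(uint32_t device_offers, uint32_t driver_acks)
{
	return device_offers & driver_acks;
}

/* Hypothetical helper: non-zero if the guest has promised to tell the
 * host about deflates. */
static int must_tell_host(uint32_t negotiated)
{
	return (negotiated >> VIRTIO_BALLOON_F_MUST_TELL_HOST) & 1u;
}
```

If the driver does not ack the bit, or the device never offers it, must_tell_host() is 0 and the host cannot assume it will be told.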

  It should be able to do this with any hardware it emulates,
  there should be no need to change hardware to make it
  migrateable somehow.
 
 Of course, but if we can fix the hardware with no bad effects, let's do
 that instead.
 
 Paolo

Don't fix what is not broken. We get to carry compatibility
in both driver and host for a long time for each feature.

Note: adding
new features adds zero value in this respect - it will not
allow simplifying the hypervisor.
-- 
MST


Re: [PATCH] virtio-blk: Fix kconfig option

2012-09-06 Thread Michael S. Tsirkin
On Thu, Sep 06, 2012 at 03:31:44AM -0700, Kent Overstreet wrote:
 On Thu, Sep 06, 2012 at 01:18:43PM +0300, Michael S. Tsirkin wrote:
  On Thu, Sep 06, 2012 at 03:02:48AM -0700, Kent Overstreet wrote:
   On Thu, Sep 06, 2012 at 12:49:56PM +0300, Michael S. Tsirkin wrote:
On Thu, Sep 06, 2012 at 02:25:12AM -0700, Kent Overstreet wrote:
 On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote:
  On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote:
   On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote:
Kent Overstreet koverstr...@google.com writes:

 CONFIG_VIRTIO isn't exposed, everything else is supposed to 
 select it
 instead.

This is a slight mis-understanding.  It's supposed to be 
selected by
the particular driver, probably virtio_pci in your case.
   
   So are you saying virtio-blk depends on virtio-pci? If so, the 
   kconfig
   should have that.
   
   As is, VIRTIO_BLK just has:
 depends on EXPERIMENTAL && VIRTIO
   
   which is flat out broken.
  
  I don't think anything is broken.
  Can you show an example of a broken configuration?
 
 Do you not understand the difference between depends an selects?
 Or did you not read my original mail?
 Flip off everything in drivers -> virtio
 
 Now go to drivers -> block and try to turn on virtio-blk.
 
 It's not listed!

Yes. Because you disabled all virtio backends.
It does not make sense to have any frontends.
   
   How's a user - or even another kernel developer who isn't familiar with
   virtio - supposed to know that?
   
   I still don't know what exactly a virtio backend is - the term isn't
   even mentioned anywhere that I've seen.
   
   Whatever it is though virtio-blk should be depending on _that_, not a
   config option that _isn't exposed in the menu_!
   
 Now go back to drivers -> virtio and turn on (randomly) balloon.
 
 Go back to drivers -> block, and now you can turn on virtio-blk!
 
 Do you see what's wrong with this picture?

Yes. You got unlucky with your random guess.
It's a bug in balloon kconfig: it should not
select virtio.
I sent a patch to fix that yesterday.
   
   Then it's also a bug in the comments at the top of
   drivers/virtio/Kconfig.
   
   And besides that, how the _hell_ is a user supposed to know to turn on
   VIRTIO_PCI before VIRTIO_BLK? It's not documented anywhere (if that is
   what's supposed to happen! I still don't know)
  
  Well, what kind of device do you have? Tell us :)
  If it's a virtio pci device,
  you need to enable virtio-pci and virtio-blk.
 
 I run qemu with -drive if=virtio. You tell me!

-drive if= is a compatibility option. qemu makes
an effort to guess what it is you want to do.
Result is usually correct but it means people building
their own kernels get confused.

For x86 kvm the modern equivalent is:

-device virtio-blk-pci,drive=foobar -drive if=none,...

If you use this you get what you asked for :).

Yes this usage is not documented anywhere, but this is
not guest driver's problem.

 Better yet, tell me how the user is supposed to figure it out!

As usual when you do not know which driver to select.
Boot a distro kernel and look around.
Where is your virtio device? On a pci bus?
There you are.

  
   and even if it was
   documented, having one kconfig option depend on something that's exposed
   in a _completely different menu_ is just made of fail.
  
  Fine, but why pick on virtio?
  This is extremely common in kconfig.
  For example, a ton of network drivers depend
  on PCI, it's exactly the same thing.
 
 Never noticed where CONFIG_PCI is exposed in bus options?

I see it:

CONFIG_PCI:
  │ Find out whether you have a PCI motherboard. PCI is the name of a │  
  │ bus system, i.e. the way the CPU talks to the other stuff inside │  
  │ your box. Other bus systems are ISA, EISA, MicroChannel (MCA) or │  
  │ VESA. If you have PCI, say Y, otherwise N.  │  

 Nope, not the same thing.

You just happen to know what PCI is but not what VIRTIO PCI is.
This is fair enough, but not sure how to help in this case.
Your patch won't help though.

-- 
MST


[PATCH RFC] mm/swap: automatic tuning for swapin readahead

2012-09-06 Thread Konstantin Khlebnikov
This patch adds a simple tracker for swapin readahead effectiveness and tunes the
readahead cluster depending on it. It manages an internal state [0..1024] and scales
the readahead order between 0 and the value of sysctl vm.page-cluster (3 by default).
Swapout and readahead misses decrease the state; swapin and readahead hits increase it:

 Swapin  +1 [page fault, shmem, etc... ]
 Swapout -10
 Readahead hit   +10
 Readahead miss  -1 [removing from swapcache unused readahead page]

If the system is under serious memory pressure, swapin readahead is useless because
pages in swap are highly fragmented and a cache hit is mostly impossible. In this
case swapin readahead only leads to unnecessary memory allocations. But readahead
helps to read all swapped pages back into memory once the system recovers from
memory pressure.
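
The scoring scheme described above can be tried out in userspace. The following is only a minimal sketch of the idea, not the kernel code from this patch: `struct swap_ra`, `clamp_int()` and `swap_ra_init()` are hypothetical names introduced for illustration, and the atomic handling of the real patch is left out.

```c
#include <assert.h>

#define SWAP_RA_BITS 10
#define SWAP_RA_MAX  (1 << SWAP_RA_BITS)   /* upper bound of the state: 1024 */

/* Weights taken from the table in the patch description. */
enum swap_ra_event {
    SWAP_RA_SWAPIN  = +1,
    SWAP_RA_SWAPOUT = -10,
    SWAP_RA_HIT     = +10,
    SWAP_RA_MISS    = -1,
};

/* Hypothetical userspace stand-in for the global state in the patch. */
struct swap_ra {
    int state;         /* 0..SWAP_RA_MAX, higher means readahead pays off */
    int page_cluster;  /* maximum readahead order (vm.page-cluster) */
    int cluster;       /* current readahead order, 0..page_cluster */
};

static int clamp_int(int v, int lo, int hi)
{
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/* Apply one event and rescale the readahead order to the new state. */
static void swap_ra_update(struct swap_ra *ra, int delta)
{
    ra->state = clamp_int(ra->state + delta, 0, SWAP_RA_MAX);
    ra->cluster = (ra->page_cluster * ra->state) >> SWAP_RA_BITS;
}

/* Start just below the maximum, as the patch does with (1 << 10) - 1. */
static void swap_ra_init(struct swap_ra *ra, int page_cluster)
{
    assert(page_cluster >= 0);
    ra->page_cluster = page_cluster;
    ra->state = SWAP_RA_MAX - 1;
    ra->cluster = 0;
    swap_ra_update(ra, 0);
}
```

With page_cluster = 3 the order starts at 2, because (3 * 1023) >> 10 = 2, and reaches 3 only when the state hits 1024. A burst of swapouts drives both the state and the order down to 0, which disables readahead until swapins and hits raise the state again.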

This patch is inspired by a patch from Shaohua Li:
http://www.spinics.net/lists/linux-mm/msg41128.html
My version uses system-wide state rather than per-VMA counters.

Signed-off-by: Konstantin Khlebnikov khlebni...@openvz.org
Cc: Shaohua Li s...@kernel.org
Cc: Rik van Riel r...@redhat.com
Cc: Minchan Kim minc...@kernel.org
---
 include/linux/page-flags.h |1 +
 mm/swap_state.c|   42 +-
 2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index b5d1384..3657cdc 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -231,6 +231,7 @@ PAGEFLAG(MappedToDisk, mappedtodisk)
 /* PG_readahead is only used for file reads; PG_reclaim is only for writes */
 PAGEFLAG(Reclaim, reclaim) TESTCLEARFLAG(Reclaim, reclaim)
 PAGEFLAG(Readahead, reclaim)   /* Reminder to do async read-ahead */
+TESTCLEARFLAG(Readahead, reclaim)
 
 #ifdef CONFIG_HIGHMEM
 /*
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 0cb36fb..d6c7a88 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -53,12 +53,31 @@ static struct {
unsigned long find_total;
 } swap_cache_info;
 
+#define SWAP_RA_BITS   10
+
+static atomic_t swap_ra_state = ATOMIC_INIT((1 << SWAP_RA_BITS) - 1);
+static int swap_ra_cluster = 1;
+
+static void swap_ra_update(int delta)
+{
+   int old_state, new_state;
+
+   old_state = atomic_read(&swap_ra_state);
+   new_state = clamp(old_state + delta, 0, 1 << SWAP_RA_BITS);
+   if (old_state != new_state) {
+   atomic_set(&swap_ra_state, new_state);
+   swap_ra_cluster = (page_cluster * new_state) >> SWAP_RA_BITS;
+   }
+}
+
 void show_swap_cache_info(void)
 {
printk("%lu pages in swap cache\n", total_swapcache_pages);
-   printk("Swap cache stats: add %lu, delete %lu, find %lu/%lu\n",
+   printk("Swap cache stats: add %lu, delete %lu, find %lu/%lu,"
+" readahead %d/%d\n",
swap_cache_info.add_total, swap_cache_info.del_total,
-   swap_cache_info.find_success, swap_cache_info.find_total);
+   swap_cache_info.find_success, swap_cache_info.find_total,
+   1 << swap_ra_cluster, atomic_read(&swap_ra_state));
printk("Free swap  = %ldkB\n", nr_swap_pages << (PAGE_SHIFT - 10));
printk("Total swap = %lukB\n", total_swap_pages << (PAGE_SHIFT - 10));
 }
@@ -112,6 +131,8 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, 
gfp_t gfp_mask)
if (!error) {
error = __add_to_swap_cache(page, entry);
radix_tree_preload_end();
+   /* FIXME weird place */
+   swap_ra_update(-10); /* swapout, decrease readahead */
}
return error;
 }
@@ -132,6 +153,8 @@ void __delete_from_swap_cache(struct page *page)
total_swapcache_pages--;
__dec_zone_page_state(page, NR_FILE_PAGES);
INC_CACHE_INFO(del_total);
+   if (TestClearPageReadahead(page))
+   swap_ra_update(-1); /* readahead miss */
 }
 
 /**
@@ -265,8 +288,11 @@ struct page * lookup_swap_cache(swp_entry_t entry)
 
page = find_get_page(&swapper_space, entry.val);
 
-   if (page)
+   if (page) {
INC_CACHE_INFO(find_success);
+   if (TestClearPageReadahead(page))
+   swap_ra_update(+10); /* readahead hit */
+   }
 
INC_CACHE_INFO(find_total);
return page;
@@ -374,11 +400,14 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t 
gfp_mask,
struct vm_area_struct *vma, unsigned long addr)
 {
struct page *page;
-   unsigned long offset = swp_offset(entry);
+   unsigned long entry_offset = swp_offset(entry);
+   unsigned long offset = entry_offset;
unsigned long start_offset, end_offset;
-   unsigned long mask = (1UL << page_cluster) - 1;
+   unsigned long mask = (1UL << swap_ra_cluster) - 1;
struct blk_plug plug;
 
+   swap_ra_update(+1); /* swapin, increase readahead */
+
/* Read a page_cluster sized and aligned cluster around offset. */

Re: [PATCH can-next v6] can: add tx/rx LED trigger support

2012-09-06 Thread Fabio Baltieri
Hi Kurt,

On Thu, Sep 6, 2012 at 12:33 PM, Kurt Van Dijck kurt.van.di...@eia.be wrote:
 On Tue, Sep 04, 2012 at 10:15:53PM +0200, Fabio Baltieri wrote:
 On Tue, Sep 04, 2012 at 09:11:28AM +0200, Kurt Van Dijck wrote:
 [...]
   The name of the device can only be changed when the interface is down.
   Is it possible to put some scripting around it to detach and attach the 
   leds
   to the interfaces on ifup/ifdown triggers?
 
  Are the led triggers available for using while the netdev is down then?

 Sure!  On embedded systems triggers are usually attached to actual LEDs
 at probe time using default_trigger field of struct led_classdev, and
 that can be specified both in machine files or in device tree.

 I also think that led triggers should be available.

Right, that's why I think the only way is to use device name.

 I asked the question because detach & attach leds to interfaces would
 indeed break that.

Sure? I think that the trigger would be set again on reattach, as
default_trigger is checked both in led_cdev probe and
trigger_register, see:

http://lxr.free-electrons.com/source/drivers/leds/led-triggers.c#L180

I'll try that tonight.

 btw, I tried to send a patch tuesday (my first $ git send-email) using
 netdev notifiers: did you receive it, and what do you think of it?

Sure, I got it!  I was planning to try that this weekend but I can
give you some comments earlier tonight... sorry for the delay!

Fabio

-- 
Fabio Baltieri


[RFC v2 PATCH 00/21] KVM: x86: CPU isolation and direct interrupts delivery to guests

2012-09-06 Thread Tomoki Sekiyama
This RFC patch series provides a facility to dedicate CPUs to KVM guests
and enable the guests to handle interrupts from passed-through PCI devices
directly (without VM exit and relay by the host).

With this feature, we can improve the throughput and response time of the device
and the host's CPU usage by reducing the overhead of interrupt handling.
This is good for applications using very high throughput/frequent
interrupt devices (e.g. 10GbE NIC).
Real-time applications also benefit from the CPU isolation feature, which
reduces interference from host kernel tasks and scheduling delay.

An overview of this patch series was presented at CloudOpen 2012.
The slides are available at:
http://events.linuxfoundation.org/images/stories/pdf/lcna_co2012_sekiyama.pdf

* Changes from v1 ( https://lkml.org/lkml/2012/6/28/30 )
 - SMP guest is supported
 - Direct EOI is added, which eliminates VM exit on EOI
 - Direct local APIC timer access from guests is added, which passes through the
   physical timer of a dedicated CPU to the guest.
 - Rebased on v3.6-rc4

* How to test
 - Create a guest VM with 1 CPU and some PCI passthrough devices (which
   supports MSI/MSI-X).
   No VGA display will be better...
 - Apply the patch at the end of this mail to qemu-kvm.
   (This patch is just for simple testing, and dedicated CPU ID for the
guest is hard-coded.)
 - Run the guest once to ensure the PCI passthrough works correctly.
 - Make the specified CPU offline.
 # echo 0 > /sys/devices/system/cpu/cpu3/online
 - Launch qemu-kvm with -no-kvm-pit option.
   The offlined CPU is booted as a slave CPU and the guest runs on that CPU.

* To-do
 - Enable slave CPUs to handle access fault
 - Support AMD SVM
 - Support non-Linux guests

---

Tomoki Sekiyama (21):
  x86: request TLB flush to slave CPU using NMI
  KVM: Pass-through local APIC timer of on slave CPUs to guest VM
  KVM: Enable direct EOI for directly routed interrupts to guests
  KVM: route assigned devices' MSI/MSI-X directly to guests on slave CPUs
  KVM: add kvm_arch_vcpu_prevent_run to prevent VM ENTER when NMI is 
received
  KVM: vmx: Add definitions PIN_BASED_PREEMPTION_TIMER
  KVM: add tracepoint on enabling/disabling direct interrupt delivery
  KVM: Directly handle interrupts by guests without VM EXIT on slave CPUs
  x86/apic: IRQ vector remapping on slave for slave CPUs
  x86/apic: Enable external interrupt routing to slave CPUs
  KVM: no exiting from guest when slave CPU halted
  KVM: proxy slab operations for slave CPUs on online CPUs
  KVM: Go back to online CPU on VM exit by external interrupt
  KVM: Add KVM_GET_SLAVE_CPU and KVM_SET_SLAVE_CPU to vCPU ioctl
  KVM: handle page faults of slave guests on online CPUs
  KVM: Add facility to run guests on slave CPUs
  KVM: Enable/Disable virtualization on slave CPUs are activated/dying
  x86: Avoid RCU warnings on slave CPUs
  x86: Support hrtimer on slave CPUs
  x86: Add a facility to use offlined CPUs as slave CPUs
  x86: Split memory hotplug function from cpu_up() as cpu_memory_up()


 arch/x86/Kconfig  |   10 +
 arch/x86/include/asm/apic.h   |   10 +
 arch/x86/include/asm/irq.h|   15 +
 arch/x86/include/asm/kvm_host.h   |   59 +
 arch/x86/include/asm/tlbflush.h   |5 
 arch/x86/include/asm/vmx.h|3 
 arch/x86/kernel/apic/apic.c   |   11 +
 arch/x86/kernel/apic/io_apic.c|  111 -
 arch/x86/kernel/apic/x2apic_cluster.c |8 -
 arch/x86/kernel/cpu/common.c  |5 
 arch/x86/kernel/smp.c |2 
 arch/x86/kernel/smpboot.c |  264 ++-
 arch/x86/kvm/irq.c|  136 
 arch/x86/kvm/lapic.c  |   56 +
 arch/x86/kvm/lapic.h  |2 
 arch/x86/kvm/mmu.c|   63 -
 arch/x86/kvm/mmu.h|4 
 arch/x86/kvm/trace.h  |   19 ++
 arch/x86/kvm/vmx.c|  180 +++
 arch/x86/kvm/x86.c|  387 +++--
 arch/x86/kvm/x86.h|9 +
 arch/x86/mm/tlb.c |   94 
 drivers/iommu/intel_irq_remapping.c   |   32 ++-
 include/linux/cpu.h   |   36 +++
 include/linux/cpumask.h   |   26 ++
 include/linux/kvm.h   |4 
 include/linux/kvm_host.h  |2 
 kernel/cpu.c  |   83 +--
 kernel/hrtimer.c  |   14 +
 kernel/irq/manage.c   |4 
 kernel/irq/migration.c|2 
 kernel/irq/proc.c |2 
 kernel/rcutree.c  |   14 +
 kernel/smp.c  |9 +
 virt/kvm/assigned-dev.c   |8 +
 virt/kvm/async_pf.c   |   17 +
 virt/kvm/kvm_main.c   |   32 +++
 37 

[RFC v2 PATCH 08/21] KVM: Add KVM_GET_SLAVE_CPU and KVM_SET_SLAVE_CPU to vCPU ioctl

2012-09-06 Thread Tomoki Sekiyama
Add an interface to set/get slave CPU dedicated to the vCPUs.

By calling ioctl with KVM_GET_SLAVE_CPU, users can get the slave CPU id
for the vCPU. -1 is returned if a slave CPU is not set.

By calling ioctl with KVM_SET_SLAVE_CPU, users can dedicate the specified
slave CPU to the vCPU. The CPU must be offlined before calling ioctl.
The CPU is activated as slave CPU for the vCPU when the correct id is set.
The slave CPU is freed and offlined by setting -1 as slave CPU id.

Whether getting/setting slave CPUs are supported by KVM or not can be
known by checking KVM_CAP_SLAVE_CPU.
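
The acceptance rules for a new slave CPU id can be modeled in userspace as follows. This is a sketch under stated assumptions, not the kernel code: `struct cpu_model`, `NR_CPUS_MODEL` and `check_slave_cpu()` are hypothetical names, and the model only mirrors the argument checks, not the actual CPU bring-up or teardown.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 8   /* hypothetical stand-in for nr_cpu_ids */

/* Hypothetical snapshot of which CPUs are online or already slaves. */
struct cpu_model {
    bool online[NR_CPUS_MODEL];
    bool slave[NR_CPUS_MODEL];
};

/*
 * Return 0 if `slave` may become the dedicated CPU of a vCPU whose
 * current slave CPU is `old` (-1 means none), or -EINVAL otherwise.
 * A negative `slave` asks to release the current slave CPU.
 */
static int check_slave_cpu(const struct cpu_model *m, int slave, int old)
{
    assert(old < NR_CPUS_MODEL);

    if (slave >= NR_CPUS_MODEL)
        return -EINVAL;               /* no such CPU */
    if (slave >= 0 && m->online[slave])
        return -EINVAL;               /* must be offlined first */
    if (slave >= 0 && slave != old && m->slave[slave])
        return -EINVAL;               /* already used as a slave CPU */
    return 0;
}
```

In this model, re-setting the id that is already assigned is accepted, while an online CPU or a CPU that is already a slave for another vCPU is rejected.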

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/include/asm/kvm_host.h |2 +
 arch/x86/kvm/vmx.c  |7 +
 arch/x86/kvm/x86.c  |   58 +++
 include/linux/kvm.h |4 +++
 4 files changed, 71 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8dc1a0a..0ea04c9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -718,6 +718,8 @@ struct kvm_x86_ops {
int (*check_intercept)(struct kvm_vcpu *vcpu,
   struct x86_instruction_info *info,
   enum x86_intercept_stage stage);
+
+   void (*set_slave_mode)(struct kvm_vcpu *vcpu, bool slave);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c5db714..7bbfa01 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1698,6 +1698,11 @@ static void skip_emulated_instruction(struct kvm_vcpu 
*vcpu)
vmx_set_interrupt_shadow(vcpu, 0);
 }
 
+static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
+{
+   /* Nothing */
+}
+
 /*
  * KVM wants to inject page-faults which it got to the guest. This function
  * checks whether in a nested guest, we need to inject them to L1 or L2.
@@ -7344,6 +7349,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
.set_tdp_cr3 = vmx_set_cr3,
 
.check_intercept = vmx_check_intercept,
+
+   .set_slave_mode = vmx_set_slave_mode,
 };
 
 static int __init vmx_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 579c41c..b62f59c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2183,6 +2183,9 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_GET_TSC_KHZ:
case KVM_CAP_PCI_2_3:
case KVM_CAP_KVMCLOCK_CTRL:
+#ifdef CONFIG_SLAVE_CPU
+   case KVM_CAP_SLAVE_CPU:
+#endif
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
@@ -2657,6 +2660,48 @@ static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
return 0;
 }
 
+#ifdef CONFIG_SLAVE_CPU
+/* vcpu currently running on each slave CPU */
+static DEFINE_PER_CPU(struct kvm_vcpu *, slave_vcpu);
+
+static int kvm_arch_vcpu_ioctl_set_slave_cpu(struct kvm_vcpu *vcpu,
+int slave, int set_slave_mode)
+{
+   int old = vcpu->arch.slave_cpu;
+   int r = -EINVAL;
+
+   if (slave >= nr_cpu_ids || (slave >= 0 && cpu_online(slave)))
+   goto out;
+   if (slave >= 0 && slave != old && cpu_slave(slave))
+   goto out; /* new slave cpu must be offlined */
+
+   if (old >= 0 && slave != old) {
+   BUG_ON(old >= nr_cpu_ids || !cpu_slave(old));
+   per_cpu(slave_vcpu, old) = NULL;
+   r = slave_cpu_down(old);
+   if (r) {
+   pr_err("kvm: slave_cpu_down %d failed\n", old);
+   goto out;
+   }
+   }
+
+   if (slave >= 0) {
+   r = slave_cpu_up(slave);
+   if (r)
+   goto out;
+   BUG_ON(!cpu_slave(slave));
+   per_cpu(slave_vcpu, slave) = vcpu;
+   }
+
+   vcpu->arch.slave_cpu = slave;
+   if (set_slave_mode && kvm_x86_ops->set_slave_mode)
+   kvm_x86_ops->set_slave_mode(vcpu, slave >= 0);
+out:
+   return r;
+}
+
+#endif
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 unsigned int ioctl, unsigned long arg)
 {
@@ -2937,6 +2982,16 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
r = kvm_set_guest_paused(vcpu);
goto out;
}
+#ifdef CONFIG_SLAVE_CPU
+   case KVM_SET_SLAVE_CPU: {
+   r = kvm_arch_vcpu_ioctl_set_slave_cpu(vcpu, (int)arg, 1);
+   goto out;
+   }
+   case KVM_GET_SLAVE_CPU: {
+   r = vcpu->arch.slave_cpu;
+   goto out;
+   }
+#endif
default:
r = -EINVAL;
}
@@ -6154,6 +6209,9 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
 {
kvmclock_reset(vcpu);

Re: [PATCH] serial_core: fix sizeof(pointer)

2012-09-06 Thread Alan Cox
On Thu, 6 Sep 2012 10:27:51 +0800
Fengguang Wu fengguang...@intel.com wrote:

 sizeof when applied to a pointer typed expression gives the
 size of the pointer.
 
 Generated by: scripts/coccinelle/misc/noderef.cocci
 
 Signed-off-by: Fengguang Wu fengguang...@intel.com


Oops.. yes typo on my part

Signed-off-by: Alan Cox a...@linux.intel.com
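
For readers unfamiliar with this class of bug, here is a small standalone illustration. It is not the serial_core code; `struct port_stats` and the helper names are hypothetical and exist only to show the difference between sizeof(p) and sizeof(*p).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure, larger than a pointer on common platforms. */
struct port_stats {
    unsigned long rx, tx, frame, overrun, parity, brk, buf_overrun;
};

/* What the typo computes: the size of the pointer itself. */
static size_t size_via_pointer(const struct port_stats *p)
{
    return sizeof(p);
}

/* What was intended: the size of the object the pointer refers to. */
static size_t size_via_deref(const struct port_stats *p)
{
    return sizeof(*p);
}

/* Correct way to clear the whole structure through a pointer. */
static void clear_stats(struct port_stats *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));
}
```

With sizeof(p), a memset() or memcpy() would touch only the first 4 or 8 bytes of the structure, which is the kind of mistake the noderef.cocci script reports.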


[RFC v2 PATCH 09/21] KVM: Go back to online CPU on VM exit by external interrupt

2012-09-06 Thread Tomoki Sekiyama
If the slave CPU receives an interrupt while running a guest, the current
implementation must first go back to an online CPU to handle the interrupt.

This behavior will be replaced by a later patch, which introduces a direct
interrupt handling mechanism for the guest.

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/kvm/vmx.c  |1 +
 arch/x86/kvm/x86.c  |6 ++
 3 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0ea04c9..af68ffb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -358,6 +358,7 @@ struct kvm_vcpu_arch {
int sipi_vector;
u64 ia32_misc_enable_msr;
bool tpr_access_reporting;
+   bool interrupted;
 
 #ifdef CONFIG_SLAVE_CPU
/* slave cpu dedicated to this vcpu */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7bbfa01..d99bee6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4408,6 +4408,7 @@ static int handle_exception(struct kvm_vcpu *vcpu)
 
 static int handle_external_interrupt(struct kvm_vcpu *vcpu)
 {
+   vcpu->arch.interrupted = true;
++vcpu->stat.irq_exits;
return 1;
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b62f59c..db0be81 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5566,6 +5566,12 @@ static void __vcpu_enter_guest_slave(void *_arg)
break;
 
/* determine if slave cpu can handle the exit alone */
+   if (vcpu->arch.interrupted) {
+   vcpu->arch.interrupted = false;
+   arg->ret = LOOP_ONLINE;
+   break;
+   }
+
r = vcpu_post_run(vcpu, arg->task, arg->apf_pending);
 
if (r == LOOP_SLAVE &&




[RFC v2 PATCH 15/21] KVM: add tracepoint on enabling/disabling direct interrupt delivery

2012-09-06 Thread Tomoki Sekiyama
Add trace event kvm_set_direct_interrupt to trace enabling/disabling
direct interrupt delivery on slave CPUs. The event logs the guest RIP and
whether the feature is enabled.

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/kvm/trace.h |   18 ++
 arch/x86/kvm/vmx.c   |2 ++
 arch/x86/kvm/x86.c   |1 +
 3 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index a71faf7..6081be7 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -551,6 +551,24 @@ TRACE_EVENT(kvm_pv_eoi,
TP_printk("apicid %x vector %d", __entry->apicid, __entry->vector)
 );
 
+TRACE_EVENT(kvm_set_direct_interrupt,
+   TP_PROTO(struct kvm_vcpu *vcpu, bool enabled),
+   TP_ARGS(vcpu, enabled),
+
+   TP_STRUCT__entry(
+   __field(unsigned long,  guest_rip   )
+   __field(bool,   enabled )
+   ),
+
+   TP_fast_assign(
+   __entry->guest_rip  = kvm_rip_read(vcpu);
+   __entry->enabled= enabled;
+   ),
+
+   TP_printk("rip 0x%lx enabled %d",
+__entry->guest_rip, __entry->enabled)
+);
+
 /*
  * Tracepoint for nested VMRUN
  */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 605abea..6dc59c8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1719,6 +1719,8 @@ static void vmx_set_direct_interrupt(struct kvm_vcpu 
*vcpu, bool enabled)
else
vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
  PIN_BASED_EXT_INTR_MASK);
+
+   trace_kvm_set_direct_interrupt(vcpu, enabled);
 }
 
 static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b7d28df..1449187 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6936,3 +6936,4 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_intr_vmexit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_invlpga);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_skinit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_intercepts);
+EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_set_direct_interrupt);




[RFC v2 PATCH 13/21] x86/apic: IRQ vector remapping on slave for slave CPUs

2012-09-06 Thread Tomoki Sekiyama
Add a facility to use an IRQ vector on slave CPUs that differs from the one
used on online CPUs.

When an alternative vector for an IRQ is registered by remap_slave_vector_irq()
and the IRQ affinity is set only to slave CPUs, the device is configured
to use the alternative vector.

The current patch only supports MSI on Intel CPUs with the IOMMU IRQ remapper.

This is intended for routing interrupts directly to KVM guests
running on slave CPUs, which do not cause a VM exit on external
interrupts.
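
The selection rule can be sketched in userspace as below. This is only a model with hypothetical names (`struct remap_model`, `first_cpu()`, `remapped_vector()`); CPU masks are plain bit masks here instead of struct cpumask, and the per-CPU table is a simple two-dimensional array.

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_EXTERNAL_VECTOR 0x20   /* first non-exception vector on x86 */
#define MODEL_NR_CPUS 4              /* hypothetical sizes for the model */
#define MODEL_NR_IRQS 16

struct remap_model {
    unsigned online_mask;                         /* bit n set: CPU n is online */
    uint8_t remap[MODEL_NR_CPUS][MODEL_NR_IRQS];  /* 0 means "no remap" */
};

/* Lowest CPU number in the mask, or -1 if the mask is empty. */
static int first_cpu(unsigned mask)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            return cpu;
    return -1;
}

/*
 * Return the vector the device should be programmed with: the remapped
 * one only if every target CPU is a slave (none is online) and a valid
 * alternative vector was registered for this IRQ.
 */
static uint8_t remapped_vector(const struct remap_model *m, uint8_t vector,
                               unsigned irq, unsigned mask)
{
    int cpu;
    uint8_t alt;

    assert(irq < MODEL_NR_IRQS);

    if (vector < FIRST_EXTERNAL_VECTOR || (mask & m->online_mask))
        return vector;

    cpu = first_cpu(mask);
    if (cpu < 0)
        return vector;

    alt = m->remap[cpu][irq];
    return alt >= FIRST_EXTERNAL_VECTOR ? alt : vector;
}
```

The point of the rule is that an affinity mask containing even one online CPU keeps the normal vector, so host interrupt handling is never affected by the remap table.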

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/include/asm/irq.h  |   15 
 arch/x86/kernel/apic/io_apic.c  |   68 ++-
 drivers/iommu/intel_irq_remapping.c |2 +
 3 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
index ba870bb..84756f7 100644
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -41,4 +41,19 @@ extern int vector_used_by_percpu_irq(unsigned int vector);
 
 extern void init_ISA_irqs(void);
 
+#ifdef CONFIG_SLAVE_CPU
+extern void remap_slave_vector_irq(int irq, int vector,
+  const struct cpumask *mask);
+extern void revert_slave_vector_irq(int irq, const struct cpumask *mask);
+extern u8 get_remapped_slave_vector(u8 vector, unsigned int irq,
+   const struct cpumask *mask);
+#else
+static inline u8 get_remapped_slave_vector(u8 vector, unsigned int irq,
+  const struct cpumask *mask)
+{
+   return vector;
+}
+#endif
+
+
 #endif /* _ASM_X86_IRQ_H */
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 0cd2682..167b001 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1266,6 +1266,69 @@ void __setup_vector_irq(int cpu)
raw_spin_unlock(&vector_lock);
 }
 
+#ifdef CONFIG_SLAVE_CPU
+
+/* vector table remapped on slave cpus, indexed by IRQ */
+static DEFINE_PER_CPU(u8[NR_IRQS], slave_vector_remap_tbl) = {
+   [0 ... NR_IRQS - 1] = 0,
+};
+
+void remap_slave_vector_irq(int irq, int vector, const struct cpumask *mask)
+{
+   int cpu;
+   unsigned long flags;
+
+   raw_spin_lock_irqsave(&vector_lock, flags);
+   for_each_cpu(cpu, mask) {
+   BUG_ON(!cpu_slave(cpu));
+   per_cpu(slave_vector_remap_tbl, cpu)[irq] = vector;
+   per_cpu(vector_irq, cpu)[vector] = irq;
+   }
+   raw_spin_unlock_irqrestore(&vector_lock, flags);
+}
+EXPORT_SYMBOL_GPL(remap_slave_vector_irq);
+
+void revert_slave_vector_irq(int irq, const struct cpumask *mask)
+{
+   int cpu;
+   u8 vector;
+   unsigned long flags;
+
+   raw_spin_lock_irqsave(&vector_lock, flags);
+   for_each_cpu(cpu, mask) {
+   BUG_ON(!cpu_slave(cpu));
+   vector = per_cpu(slave_vector_remap_tbl, cpu)[irq];
+   if (vector) {
+   per_cpu(vector_irq, cpu)[vector] = -1;
+   per_cpu(slave_vector_remap_tbl, cpu)[irq] = 0;
+   }
+   }
+   raw_spin_unlock_irqrestore(&vector_lock, flags);
+}
+EXPORT_SYMBOL_GPL(revert_slave_vector_irq);
+
+/* If all targets CPUs are slave, returns remapped vector */
+u8 get_remapped_slave_vector(u8 vector, unsigned int irq,
+const struct cpumask *mask)
+{
+   u8 slave_vector;
+
+   if (vector < FIRST_EXTERNAL_VECTOR ||
+   cpumask_intersects(mask, cpu_online_mask))
+   return vector;
+
+   slave_vector = per_cpu(slave_vector_remap_tbl,
+  cpumask_first(mask))[irq];
+   if (slave_vector >= FIRST_EXTERNAL_VECTOR)
+   vector = slave_vector;
+
+   pr_info("slave vector remap: irq: %d => vector: %d\n", irq, vector);
+
+   return vector;
+}
+
+#endif
+
 static struct irq_chip ioapic_chip;
 
 #ifdef CONFIG_X86_32
@@ -3133,6 +3196,7 @@ static int
 msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
 {
struct irq_cfg *cfg = data->chip_data;
+   int vector = cfg->vector;
struct msi_msg msg;
unsigned int dest;
 
@@ -3141,8 +3205,10 @@ msi_set_affinity(struct irq_data *data, const struct 
cpumask *mask, bool force)
 
__get_cached_msi_msg(data->msi_desc, &msg);
 
+   vector = get_remapped_slave_vector(vector, data->irq, mask);
+
msg.data &= ~MSI_DATA_VECTOR_MASK;
-   msg.data |= MSI_DATA_VECTOR(cfg->vector);
+   msg.data |= MSI_DATA_VECTOR(vector);
msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
msg.address_lo |= MSI_ADDR_DEST_ID(dest);
 
diff --git a/drivers/iommu/intel_irq_remapping.c 
b/drivers/iommu/intel_irq_remapping.c
index df38334..471d23f 100644
--- 

Re: [Ping^3] Re: [PATCH] sg_io: allow UNMAP and WRITE SAME without CAP_SYS_RAWIO

2012-09-06 Thread Ric Wheeler

On 09/06/2012 02:31 AM, Paolo Bonzini wrote:

Il 05/09/2012 22:18, Ric Wheeler ha scritto:

Hi Paolo,

Both of these commands are destructive. WRITE_SAME (if done without the
discard bits set) can also take a very long time to be destructive and
tie up the storage.

FORMAT_UNIT has the same characteristics and yet it is allowed (btw, I
don't think WRITE SAME slowness is limited to the case where a real
write is requested; discarding can be just as slow).

Also, the two new commands are anyway restricted to programs that have
write access to the disk.  If you have read-only access, you won't be
able to issue any destructive command (there is one exception, START
STOP UNIT is allowed even with read-only capability and is somewhat
destructive).

Honestly, the only reason why these two commands weren't included, is
that the current whitelist is heavily tailored towards CD/DVD burning.
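
The policy being debated can be written down as a small model. This is a sketch, not the kernel's command filter: the two lists are deliberately tiny, `verify_command()` and `in_list()` are hypothetical names, and placing UNMAP and WRITE SAME in the write list reflects what the patch proposes rather than the current state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SCSI opcodes used in this thread, plus READ(10)/WRITE(10) for contrast. */
enum {
    OP_FORMAT_UNIT     = 0x04,
    OP_START_STOP_UNIT = 0x1b,
    OP_READ_10         = 0x28,
    OP_WRITE_10        = 0x2a,
    OP_WRITE_SAME_10   = 0x41,
    OP_UNMAP           = 0x42,
    OP_WRITE_SAME_16   = 0x93,
};

/* Commands allowed to anyone who can open the device. */
static const uint8_t read_ok[] = { OP_READ_10, OP_START_STOP_UNIT };

/* Commands that additionally require the device to be opened for writing. */
static const uint8_t write_ok[] = {
    OP_WRITE_10, OP_FORMAT_UNIT,
    OP_UNMAP, OP_WRITE_SAME_10, OP_WRITE_SAME_16,   /* proposed additions */
};

static bool in_list(const uint8_t *list, size_t n, uint8_t opcode)
{
    assert(list != NULL);
    for (size_t i = 0; i < n; i++)
        if (list[i] == opcode)
            return true;
    return false;
}

/* Return 0 if the command may be issued, -EPERM otherwise. */
static int verify_command(uint8_t opcode, bool has_write_perm, bool cap_sys_rawio)
{
    if (cap_sys_rawio)
        return 0;   /* privileged callers bypass the whitelist */
    if (in_list(read_ok, sizeof(read_ok), opcode))
        return 0;
    if (has_write_perm && in_list(write_ok, sizeof(write_ok), opcode))
        return 0;
    return -EPERM;
}
```

In this model a read-only opener can still issue START STOP UNIT but not UNMAP, and any opcode outside both lists needs CAP_SYS_RAWIO, which is the trade-off the two sides are arguing about.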


Hi Paolo,

I assume that FORMAT_UNIT is for CD/DVD needs - not sure what a S-ATA disk would 
do with that. If it is destructive, we should probably think about how to make 
it more secure and see how many applications we would break.





I think that restricting them to CAP_SYS_RAWIO seems reasonable - better
to vet and give the appropriate apps the needed capability than to
widely open up the safety check?

CAP_SYS_RAWIO is so wide in its scope, that anything that requires it is
insecure.

Paolo


I don't see allowing anyone who can open the device to zero the data as better 
though :)


Ric





Re: [RFC v2 PATCH 01/21] x86: Split memory hotplug function from cpu_up() as cpu_memory_up()

2012-09-06 Thread Avi Kivity
On 09/06/2012 02:27 PM, Tomoki Sekiyama wrote:
 Split memory hotplug function from cpu_up() as cpu_memory_up(), which will
 be used for assigning memory area to off-lined cpus at following patch
 in this series.
 

Can you post a summary containing both the general outline for people
reading this for the first time, or who have forgotten it, and the list
of changes from v1?

-- 
error compiling committee.c: too many arguments to function


[RFC v2 PATCH 17/21] KVM: add kvm_arch_vcpu_prevent_run to prevent VM ENTER when NMI is received

2012-09-06 Thread Tomoki Sekiyama
Since NMI cannot be disabled around VM enter, there is a race between
receiving an NMI to kick a guest and entering the guest on slave CPUs. If the
NMI is received just before entering the VM, then after the NMI handler is
invoked, execution continues into the guest and the effect of the NMI is lost.

This patch adds kvm_arch_vcpu_prevent_run(), which causes VM exit right
after VM enter. The NMI handler uses this to ensure the execution of the
guest is cancelled after NMI.

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/include/asm/kvm_host.h |6 ++
 arch/x86/kvm/vmx.c  |   42 ++-
 arch/x86/kvm/x86.c  |   31 +
 3 files changed, 78 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 65242a6..624e5ad 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -429,6 +429,9 @@ struct kvm_vcpu_arch {
void *insn;
int insn_len;
} page_fault;
+
+   bool prevent_run;
+   bool prevent_needed;
 #endif
 
int halt_request; /* real mode on Intel only */
@@ -681,6 +684,7 @@ struct kvm_x86_ops {
 
void (*run)(struct kvm_vcpu *vcpu);
int (*handle_exit)(struct kvm_vcpu *vcpu);
+   void (*prevent_run)(struct kvm_vcpu *vcpu, int prevent);
void (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
@@ -1027,4 +1031,6 @@ int kvm_pmu_read_pmc(struct kvm_vcpu *vcpu, unsigned pmc, 
u64 *data);
 void kvm_handle_pmu_event(struct kvm_vcpu *vcpu);
 void kvm_deliver_pmi(struct kvm_vcpu *vcpu);
 
+int kvm_arch_vcpu_run_prevented(struct kvm_vcpu *vcpu);
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2130cbd..39a4cb4 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1713,6 +1713,9 @@ static inline void vmx_clear_hlt(struct kvm_vcpu *vcpu)
 
 static void vmx_set_direct_interrupt(struct kvm_vcpu *vcpu, bool enabled)
 {
+#ifdef CONFIG_SLAVE_CPU
+   void *msr_bitmap;
+
if (enabled)
vmcs_clear_bits(PIN_BASED_VM_EXEC_CONTROL,
PIN_BASED_EXT_INTR_MASK);
@@ -1721,6 +1724,7 @@ static void vmx_set_direct_interrupt(struct kvm_vcpu 
*vcpu, bool enabled)
  PIN_BASED_EXT_INTR_MASK);
 
trace_kvm_set_direct_interrupt(vcpu, enabled);
+#endif
 }
 
 static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
@@ -4458,7 +4462,7 @@ static int handle_external_interrupt(struct kvm_vcpu 
*vcpu)
 
 static int handle_preemption_timer(struct kvm_vcpu *vcpu)
 {
-   /* Nothing */
+   kvm_arch_vcpu_run_prevented(vcpu);
return 1;
 }
 
@@ -6052,6 +6056,10 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
}
 
if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
+#ifdef CONFIG_SLAVE_CPU
+   if (vcpu->arch.prevent_run)
+   return kvm_arch_vcpu_run_prevented(vcpu);
+#endif
vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
vcpu->run->fail_entry.hardware_entry_failure_reason
= exit_reason;
@@ -6059,6 +6067,10 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
}
 
if (unlikely(vmx->fail)) {
+#ifdef CONFIG_SLAVE_CPU
+   if (vcpu->arch.prevent_run)
+   return kvm_arch_vcpu_run_prevented(vcpu);
+#endif
vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
vcpu->run->fail_entry.hardware_entry_failure_reason
= vmcs_read32(VM_INSTRUCTION_ERROR);
@@ -6275,6 +6287,21 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
msrs[i].host);
 }
 
+/*
+ * Make VMRESUME fail using preemption timer with timer value = 0.
+ * On processors that don't support the preemption timer, VMRESUME will fail
+ * by internal error.
+ */
+static void vmx_prevent_run(struct kvm_vcpu *vcpu, int prevent)
+{
+   if (prevent)
+   vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
+ PIN_BASED_PREEMPTION_TIMER);
+   else
+   vmcs_clear_bits(PIN_BASED_VM_EXEC_CONTROL,
+   PIN_BASED_PREEMPTION_TIMER);
+}
+
 #ifdef CONFIG_X86_64
 #define R r
 #define Q q
@@ -6326,6 +6353,13 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 
atomic_switch_perf_msrs(vmx);
 
+#ifdef CONFIG_SLAVE_CPU
+   barrier();  /* Avoid vmcs modification by NMI before here */
+   vcpu->arch.prevent_needed = 1;
+   if (vcpu->arch.prevent_run)

[RFC v2 PATCH 20/21] KVM: Pass-through local APIC timer of on slave CPUs to guest VM

2012-09-06 Thread Tomoki Sekiyama
Provide direct control of local APIC timer of slave CPUs to the guest.
The timer interrupt does not cause VM exit if direct interrupt delivery is
enabled. To handle the timer correctly, this makes the guest occupy the
local APIC timer.

If the host supports x2apic, this exposes TMICT and TMCCT to the guest so
that it can start the timer and read the timer count without a VM exit.
Otherwise, it sets the APIC registers to the specified values. LVTT is not
passed through, to avoid modifying the timer interrupt vector.

Guest timer interrupt vector remapping is not currently supported, so the
guest must use the same vector as the host.
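
For readers unfamiliar with the x2apic register layout: in x2apic mode each
local APIC register is reached through an MSR whose number is 0x800 plus the
xAPIC MMIO offset shifted right by four. The sketch below computes the MSR
numbers for the registers named above and encodes the pass-through policy from
this description. The helper names x2apic_msr() and timer_msr_passed_through()
are made up for illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define APIC_BASE_MSR 0x800u   /* first x2apic MSR */
#define APIC_LVTT     0x320u   /* LVT timer register (xAPIC offset) */
#define APIC_TMICT    0x380u   /* timer initial count */
#define APIC_TMCCT    0x390u   /* timer current count */

/* Map an xAPIC MMIO register offset to its x2apic MSR number. */
static uint32_t x2apic_msr(uint32_t reg_offset)
{
    assert((reg_offset & 0xfu) == 0);   /* registers are 16-byte aligned */
    return APIC_BASE_MSR + (reg_offset >> 4);
}

/* Policy from the description: TMICT and TMCCT are exposed to the guest,
 * LVTT (and everything else) is not. Hypothetical helper. */
static bool timer_msr_passed_through(uint32_t msr)
{
    return msr == x2apic_msr(APIC_TMICT) || msr == x2apic_msr(APIC_TMCCT);
}
```

With these definitions TMICT maps to MSR 0x838 and TMCCT to MSR 0x839, while
LVTT (MSR 0x832) stays intercepted.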

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/include/asm/apic.h |4 +++
 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/kernel/apic/apic.c |   11 ++
 arch/x86/kernel/smpboot.c   |   30 ++
 arch/x86/kvm/lapic.c|   45 +++
 arch/x86/kvm/lapic.h|2 ++
 arch/x86/kvm/vmx.c  |6 +
 arch/x86/kvm/x86.c  |3 +++
 include/linux/cpu.h |5 
 kernel/hrtimer.c|2 +-
 10 files changed, 108 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index d37ae5c..66e1155 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -44,6 +44,8 @@ static inline void generic_apic_probe(void)
 
 #ifdef CONFIG_X86_LOCAL_APIC
 
+struct clock_event_device;
+
 extern unsigned int apic_verbosity;
 extern int local_apic_timer_c2_ok;
 
@@ -245,6 +247,8 @@ extern void init_apic_mappings(void);
 void register_lapic_address(unsigned long address);
 extern void setup_boot_APIC_clock(void);
 extern void setup_secondary_APIC_clock(void);
+extern void override_local_apic_timer(int cpu,
+   void (*handler)(struct clock_event_device *));
 extern int APIC_init_uniprocessor(void);
 extern int apic_force_enable(unsigned long addr);
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f43680e..a95bb62 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1035,6 +1035,7 @@ int kvm_arch_vcpu_run_prevented(struct kvm_vcpu *vcpu);
 
 #ifdef CONFIG_SLAVE_CPU
 void kvm_get_slave_cpu_mask(struct kvm *kvm, struct cpumask *mask);
+struct kvm_vcpu *get_slave_vcpu(int cpu);
 
 struct kvm_assigned_dev_kernel;
 extern void assign_slave_msi(struct kvm *kvm,
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 24deb30..90ed84a 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -901,6 +901,17 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs 
*regs)
set_irq_regs(old_regs);
 }
 
+void override_local_apic_timer(int cpu,
+  void (*handler)(struct clock_event_device *))
+{
+   unsigned long flags;
+
+   local_irq_save(flags);
+   per_cpu(lapic_events, cpu).event_handler = handler;
+   local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(override_local_apic_timer);
+
 int setup_profiling_timer(unsigned int multiplier)
 {
return -EINVAL;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 45dfc1d..ba7c99b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -133,6 +133,7 @@ static void __ref remove_cpu_from_maps(int cpu);
 #ifdef CONFIG_SLAVE_CPU
 /* Notify slave cpu up and down */
 static RAW_NOTIFIER_HEAD(slave_cpu_chain);
+struct notifier_block *slave_timer_nb;
 
 int register_slave_cpu_notifier(struct notifier_block *nb)
 {
@@ -140,6 +141,13 @@ int register_slave_cpu_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(register_slave_cpu_notifier);
 
+int register_slave_cpu_timer_notifier(struct notifier_block *nb)
+{
+   slave_timer_nb = nb;
+   return register_slave_cpu_notifier(nb);
+}
+EXPORT_SYMBOL(register_slave_cpu_timer_notifier);
+
 void unregister_slave_cpu_notifier(struct notifier_block *nb)
 {
raw_notifier_chain_unregister(&slave_cpu_chain, nb);
@@ -155,6 +163,8 @@ static int slave_cpu_notify(unsigned long val, int cpu)
 
return notifier_to_errno(ret);
 }
+
+static void slave_cpu_disable_timer(int cpu);
 #endif
 
 /*
@@ -1013,10 +1023,30 @@ int __cpuinit slave_cpu_up(unsigned int cpu)
 
cpu_maps_update_done();
 
+   /* Timer may be used only in starting the slave CPU */
+   slave_cpu_disable_timer(cpu);
+
return ret;
 }
 EXPORT_SYMBOL(slave_cpu_up);
 
+static void __slave_cpu_disable_timer(void *hcpu)
+{
+   int cpu = (long)hcpu;
+
+   pr_info("Disabling timer on slave cpu %d\n", cpu);
+   BUG_ON(!slave_timer_nb);
+   slave_timer_nb->notifier_call(slave_timer_nb, CPU_SLAVE_DYING, hcpu);
+}
+

Re: [RFC v2 PATCH 01/21] x86: Split memory hotplug function from cpu_up() as cpu_memory_up()

2012-09-06 Thread Avi Kivity
On 09/06/2012 02:31 PM, Avi Kivity wrote:
 On 09/06/2012 02:27 PM, Tomoki Sekiyama wrote:
 Split memory hotplug function from cpu_up() as cpu_memory_up(), which will
 be used for assigning memory area to off-lined cpus at following patch
 in this series.
 
 
 Can you post a summary containing both the general outline for people
 reading this for the first time, or who have forgotten it, and the list
 of changes from v1?
 

Never mind, I see it was posted, just that I wasn't copied on it.

-- 
error compiling committee.c: too many arguments to function


[RFC v2 PATCH 14/21] KVM: Directly handle interrupts by guests without VM EXIT on slave CPUs

2012-09-06 Thread Tomoki Sekiyama
Let guests handle interrupts on slave CPUs directly, without a VM exit.
This reduces the host CPU time spent forwarding interrupts of assigned
PCI devices to guests. It also avoids the cost of the VM exit and
shortens the guests' response time to those interrupts.

When a slave CPU is dedicated to a vCPU, exiting on external interrupts is
disabled. Unfortunately, exits can only be enabled or disabled for all
external interrupts at once (NMIs excepted); they cannot be switched per
IRQ number or per vector. Thus, to keep IPIs from online CPUs from being
delivered to guests, this patch modifies kvm_vcpu_kick() to use an NMI
for guests on slave CPUs.
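
As a rough illustration of the all-or-nothing control described above, the
sketch below toggles the external-interrupt exiting bit on a plain integer
instead of a VMCS field. PIN_BASED_EXT_INTR_MASK is the real bit name. The
functions here are stand-alone models with hypothetical signatures, and
kick_method_for() only restates the IPI-versus-NMI rule from this description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PIN_BASED_EXT_INTR_MASK 0x00000001u

/* True if an external interrupt would cause a VM exit. */
static bool external_interrupt_exits(uint32_t pin_ctl)
{
    return (pin_ctl & PIN_BASED_EXT_INTR_MASK) != 0;
}

/* Model of the toggle: direct delivery means the exiting bit is clear.
 * The bit covers every external interrupt vector at once. */
static uint32_t set_direct_interrupt(uint32_t pin_ctl, bool enabled)
{
    uint32_t out = enabled ? (pin_ctl & ~PIN_BASED_EXT_INTR_MASK)
                           : (pin_ctl | PIN_BASED_EXT_INTR_MASK);

    assert(external_interrupt_exits(out) == !enabled);
    return out;
}

enum kick_method { KICK_IPI, KICK_NMI };

/* With exits disabled an ordinary IPI would land in the guest, so a vCPU
 * on a slave CPU has to be kicked with an NMI instead (hypothetical helper). */
static enum kick_method kick_method_for(bool vcpu_on_slave_cpu)
{
    return vcpu_on_slave_cpu ? KICK_NMI : KICK_IPI;
}
```

The other bits of the control word are left untouched, which matches what
vmcs_set_bits() and vmcs_clear_bits() do in the diff below.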

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/kvm/lapic.c|5 +
 arch/x86/kvm/vmx.c  |   19 ++
 arch/x86/kvm/x86.c  |   41 +++
 include/linux/kvm_host.h|1 +
 virt/kvm/kvm_main.c |5 +++--
 6 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5ce89f1..65242a6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -725,6 +725,7 @@ struct kvm_x86_ops {
   struct x86_instruction_info *info,
   enum x86_intercept_stage stage);
 
+   void (*set_direct_interrupt)(struct kvm_vcpu *vcpu, bool enabled);
void (*set_slave_mode)(struct kvm_vcpu *vcpu, bool slave);
 };
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index ce87878..73f57f3 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -601,6 +601,9 @@ static int apic_set_eoi(struct kvm_lapic *apic)
kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
}
kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
+   if (vcpu_has_slave_cpu(apic->vcpu) &&
+   kvm_x86_ops->set_direct_interrupt)
+   kvm_x86_ops->set_direct_interrupt(apic->vcpu, 1);
return vector;
 }
 
@@ -1569,6 +1572,8 @@ int kvm_lapic_enable_pv_eoi(struct kvm_vcpu *vcpu, u64 
data)
u64 addr = data & ~KVM_MSR_ENABLED;
if (!IS_ALIGNED(addr, 4))
return 1;
+   if (vcpu_has_slave_cpu(vcpu))
+   return 1;
 
vcpu->arch.pv_eoi.msr_val = data;
if (!pv_eoi_enabled(vcpu))
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 03a2d02..605abea 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1711,6 +1711,16 @@ static inline void vmx_clear_hlt(struct kvm_vcpu *vcpu)
 #endif
 }
 
+static void vmx_set_direct_interrupt(struct kvm_vcpu *vcpu, bool enabled)
+{
+   if (enabled)
+   vmcs_clear_bits(PIN_BASED_VM_EXEC_CONTROL,
+   PIN_BASED_EXT_INTR_MASK);
+   else
+   vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
+ PIN_BASED_EXT_INTR_MASK);
+}
+
 static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
 {
/* Don't intercept the guest's halt on slave CPU */
@@ -1721,6 +1731,8 @@ static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, 
bool slave)
vmcs_set_bits(CPU_BASED_VM_EXEC_CONTROL,
  CPU_BASED_HLT_EXITING);
}
+
+   vmx_set_direct_interrupt(vcpu, slave);
 }
 
 /*
@@ -1776,6 +1788,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, 
unsigned nr,
 
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
vmx_clear_hlt(vcpu);
+   if (vcpu_has_slave_cpu(vcpu))
+   vmx_set_direct_interrupt(vcpu, 0);
 }
 
 static bool vmx_rdtscp_supported(void)
@@ -4147,6 +4161,8 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu)
intr |= INTR_TYPE_EXT_INTR;
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr);
vmx_clear_hlt(vcpu);
+   if (vcpu_has_slave_cpu(vcpu))
+   vmx_set_direct_interrupt(vcpu, 0);
 }
 
 static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
@@ -4179,6 +4195,8 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR);
vmx_clear_hlt(vcpu);
+   if (vcpu_has_slave_cpu(vcpu))
+   vmx_set_direct_interrupt(vcpu, 0);
 }
 
 static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
@@ -7374,6 +7392,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 
.check_intercept = vmx_check_intercept,
 
+   .set_direct_interrupt = vmx_set_direct_interrupt,
.set_slave_mode = vmx_set_slave_mode,
 };
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a6b2521..b7d28df 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -63,6 +63,7 @@
 #include <asm/pvclock.h>
 #include 

[RFC v2 PATCH 12/21] x86/apic: Enable external interrupt routing to slave CPUs

2012-09-06 Thread Tomoki Sekiyama
Enable the APIC to handle interrupts on slave CPUs, and enable interrupt
routing to slave CPUs through the IRQ affinity setting.

Because slave CPUs that run a KVM guest handle external interrupts directly
in the vCPUs, the guest's vector/IRQ mapping differs from the host's. An
interrupt therefore has to be routed either to online CPUs or to slave CPUs.

With this patch, if the specified affinity contains any online CPUs, the
affinity is applied to the online CPUs only. If every specified CPU is a
slave CPU, the IRQ is routed to the slave CPUs.
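
The selection rule in the last paragraph can be written down as a small
function on bitmasks. This is a sketch only: cpumask_model and
effective_affinity() are hypothetical stand-ins for struct cpumask and for the
checks the patch spreads over several files, and 64 CPUs are assumed for
simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per CPU; a stand-in for struct cpumask (hypothetical). */
typedef uint64_t cpumask_model;

/* Returns the CPUs an IRQ is actually routed to:
 *  - if the request names any online CPU, only the online ones are used;
 *  - otherwise the slave CPUs named in the request are used. */
static cpumask_model effective_affinity(cpumask_model requested,
                                        cpumask_model online,
                                        cpumask_model slave)
{
    cpumask_model on;

    assert((online & slave) == 0);   /* a CPU is online or slave, not both */

    on = requested & online;
    if (on)
        return on;
    return requested & slave;
}
```

For example, with CPUs 0-1 online and CPUs 2-3 slave, a request for CPUs 1-2
is narrowed to CPU 1, while a request for CPUs 2-3 is honoured as is.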

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/include/asm/apic.h   |6 ++---
 arch/x86/kernel/apic/io_apic.c|   43 -
 arch/x86/kernel/apic/x2apic_cluster.c |8 +++---
 drivers/iommu/intel_irq_remapping.c   |   30 +++
 kernel/irq/manage.c   |4 ++-
 kernel/irq/migration.c|2 +-
 kernel/irq/proc.c |2 +-
 7 files changed, 67 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index f342612..d37ae5c 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -535,7 +535,7 @@ extern void generic_bigsmp_probe(void);
 static inline const struct cpumask *default_target_cpus(void)
 {
 #ifdef CONFIG_SMP
-   return cpu_online_mask;
+   return cpu_online_or_slave_mask;
 #else
return cpumask_of(0);
 #endif
@@ -543,7 +543,7 @@ static inline const struct cpumask 
*default_target_cpus(void)
 
 static inline const struct cpumask *online_target_cpus(void)
 {
-   return cpu_online_mask;
+   return cpu_online_or_slave_mask;
 }
 
 DECLARE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_bios_cpu_apicid);
@@ -602,7 +602,7 @@ flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask,
 {
unsigned long cpu_mask = cpumask_bits(cpumask)[0] &
 cpumask_bits(andmask)[0] &
-cpumask_bits(cpu_online_mask)[0] &
+cpumask_bits(cpu_online_or_slave_mask)[0] &
 APIC_ALL_CPUS;
 
if (likely(cpu_mask)) {
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index c265593..0cd2682 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1125,7 +1125,7 @@ __assign_irq_vector(int irq, struct irq_cfg *cfg, const 
struct cpumask *mask)
/* Only try and allocate irqs on cpus that are present */
err = -ENOSPC;
cpumask_clear(cfg->old_domain);
-   cpu = cpumask_first_and(mask, cpu_online_mask);
+   cpu = cpumask_first_and(mask, cpu_online_or_slave_mask);
while (cpu < nr_cpu_ids) {
int new_cpu, vector, offset;
 
@@ -1158,14 +1158,14 @@ next:
if (unlikely(current_vector == vector)) {
cpumask_or(cfg->old_domain, cfg->old_domain, tmp_mask);
cpumask_andnot(tmp_mask, mask, cfg->old_domain);
-   cpu = cpumask_first_and(tmp_mask, cpu_online_mask);
+   cpu = cpumask_first_and(tmp_mask, 
cpu_online_or_slave_mask);
continue;
}
 
if (test_bit(vector, used_vectors))
goto next;
 
-   for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask)
+   for_each_cpu_and(new_cpu, tmp_mask, cpu_online_or_slave_mask)
if (per_cpu(vector_irq, new_cpu)[vector] != -1)
goto next;
/* Found one! */
@@ -1175,7 +1175,7 @@ next:
cfg->move_in_progress = 1;
cpumask_copy(cfg->old_domain, cfg->domain);
}
-   for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask)
+   for_each_cpu_and(new_cpu, tmp_mask, cpu_online_or_slave_mask)
per_cpu(vector_irq, new_cpu)[vector] = irq;
cfg->vector = vector;
cpumask_copy(cfg->domain, tmp_mask);
@@ -1204,7 +1204,7 @@ static void __clear_irq_vector(int irq, struct irq_cfg 
*cfg)
BUG_ON(!cfg->vector);
 
vector = cfg->vector;
-   for_each_cpu_and(cpu, cfg->domain, cpu_online_mask)
+   for_each_cpu_and(cpu, cfg->domain, cpu_online_or_slave_mask)
per_cpu(vector_irq, cpu)[vector] = -1;
 
cfg->vector = 0;
@@ -1212,7 +1212,7 @@ static void __clear_irq_vector(int irq, struct irq_cfg 
*cfg)
 
if (likely(!cfg->move_in_progress))
return;
-   for_each_cpu_and(cpu, cfg->old_domain, cpu_online_mask) {
+   for_each_cpu_and(cpu, cfg->old_domain, cpu_online_or_slave_mask) {
for (vector = 

[RFC v2 PATCH 11/21] KVM: no exiting from guest when slave CPU halted

2012-09-06 Thread Tomoki Sekiyama
Avoid exiting from a guest on a slave CPU even when the HLT instruction is
executed. Since the slave CPU is dedicated to a vCPU, exiting on HLT is
not required, and avoiding the VM exit improves the guest's performance.

This is a partial revert of

10166744b80a (KVM: VMX: remove yield_on_hlt)
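
The effect of the change can be modelled outside the kernel as below. This is
a simplified sketch: struct vmcs_model and inject_event() are hypothetical,
and the two VMCS fields are replaced by plain struct members.
CPU_BASED_HLT_EXITING and the GUEST_ACTIVITY_* names are the ones used in the
diff, and slave_cpu >= 0 is assumed to mean that the vCPU owns a slave CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPU_BASED_HLT_EXITING 0x00000080u

enum { GUEST_ACTIVITY_ACTIVE = 0, GUEST_ACTIVITY_HLT = 1 };

/* Hypothetical stand-in for the VMCS fields touched by the patch. */
struct vmcs_model {
    uint32_t cpu_ctl;    /* models CPU_BASED_VM_EXEC_CONTROL */
    uint32_t activity;   /* models GUEST_ACTIVITY_STATE */
    int slave_cpu;       /* >= 0 when a slave CPU is dedicated to the vCPU */
};

/* On a slave CPU the guest's HLT is not intercepted. */
static void set_slave_mode(struct vmcs_model *m, bool slave)
{
    assert(m != NULL);
    if (slave)
        m->cpu_ctl &= ~CPU_BASED_HLT_EXITING;
    else
        m->cpu_ctl |= CPU_BASED_HLT_EXITING;
}

/* Model of an event injection: since HLT no longer exits, the vCPU may
 * still be recorded as halted, so the state is reset before it resumes. */
static void inject_event(struct vmcs_model *m)
{
    assert(m != NULL);
    if (m->slave_cpu >= 0 && m->activity == GUEST_ACTIVITY_HLT)
        m->activity = GUEST_ACTIVITY_ACTIVE;
}
```

The reset in inject_event() mirrors vmx_clear_hlt(), which the diff calls from
the exception, IRQ and NMI injection paths.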

Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/kvm/vmx.c |   25 -
 1 files changed, 24 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d99bee6..03a2d02 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1698,9 +1698,29 @@ static void skip_emulated_instruction(struct kvm_vcpu 
*vcpu)
vmx_set_interrupt_shadow(vcpu, 0);
 }
 
+static inline void vmx_clear_hlt(struct kvm_vcpu *vcpu)
+{
+#ifdef CONFIG_SLAVE_CPU
+   /* Ensure that we clear the HLT state in the VMCS.  We don't need to
+* explicitly skip the instruction because if the HLT state is set,
+* then the instruction is already executing and RIP has already been
+* advanced. */
+   if (vcpu->arch.slave_cpu >= 0 &&
+   vmcs_read32(GUEST_ACTIVITY_STATE) == GUEST_ACTIVITY_HLT)
+   vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
+#endif
+}
+
 static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
 {
-   /* Nothing */
+   /* Don't intercept the guest's halt on slave CPU */
+   if (slave) {
+   vmcs_clear_bits(CPU_BASED_VM_EXEC_CONTROL,
+   CPU_BASED_HLT_EXITING);
+   } else {
+   vmcs_set_bits(CPU_BASED_VM_EXEC_CONTROL,
+ CPU_BASED_HLT_EXITING);
+   }
 }
 
 /*
@@ -1755,6 +1775,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, 
unsigned nr,
intr_info |= INTR_TYPE_HARD_EXCEPTION;
 
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
+   vmx_clear_hlt(vcpu);
 }
 
 static bool vmx_rdtscp_supported(void)
@@ -4125,6 +4146,7 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu)
} else
intr |= INTR_TYPE_EXT_INTR;
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr);
+   vmx_clear_hlt(vcpu);
 }
 
 static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
@@ -4156,6 +4178,7 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
}
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR);
+   vmx_clear_hlt(vcpu);
 }
 
 static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)




[RFC v2 PATCH 04/21] x86: Avoid RCU warnings on slave CPUs

2012-09-06 Thread Tomoki Sekiyama
Initialize RCU-related variables to avoid warnings about RCU usage while
slave CPUs are running the specified functions. Also notify the RCU
subsystem before a slave CPU enters the idle state.

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/kernel/smpboot.c |4 
 kernel/rcutree.c  |   14 ++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index e8cfe377..45dfc1d 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -382,6 +382,8 @@ notrace static void __cpuinit start_slave_cpu(void *unused)
f = per_cpu(slave_cpu_func, cpu);
per_cpu(slave_cpu_func, cpu).func = NULL;
 
+   rcu_note_context_switch(cpu);
+
if (!f.func) {
native_safe_halt();
continue;
@@ -1005,6 +1007,8 @@ int __cpuinit slave_cpu_up(unsigned int cpu)
if (IS_ERR(idle))
return PTR_ERR(idle);
 
+   slave_cpu_notify(CPU_SLAVE_UP_PREPARE, cpu);
+
ret = __native_cpu_up(cpu, idle, 1);
 
cpu_maps_update_done();
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f280e54..31a7c8c 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -2589,6 +2589,9 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block 
*self,
switch (action) {
case CPU_UP_PREPARE:
case CPU_UP_PREPARE_FROZEN:
+#ifdef CONFIG_SLAVE_CPU
+   case CPU_SLAVE_UP_PREPARE:
+#endif
rcu_prepare_cpu(cpu);
rcu_prepare_kthreads(cpu);
break;
@@ -2603,6 +2606,9 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block 
*self,
break;
case CPU_DYING:
case CPU_DYING_FROZEN:
+#ifdef CONFIG_SLAVE_CPU
+   case CPU_SLAVE_DYING:
+#endif
/*
 * The whole machine is stopped except this CPU, so we can
 * touch any data without introducing corruption. We send the
@@ -2616,6 +2622,9 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block 
*self,
case CPU_DEAD_FROZEN:
case CPU_UP_CANCELED:
case CPU_UP_CANCELED_FROZEN:
+#ifdef CONFIG_SLAVE_CPU
+   case CPU_SLAVE_DEAD:
+#endif
for_each_rcu_flavor(rsp)
rcu_cleanup_dead_cpu(cpu, rsp);
break;
@@ -2797,6 +2806,10 @@ static void __init rcu_init_geometry(void)
rcu_num_nodes -= n;
 }
 
+static struct notifier_block __cpuinitdata rcu_slave_nb = {
+   .notifier_call = rcu_cpu_notify,
+};
+
 void __init rcu_init(void)
 {
int cpu;
@@ -2814,6 +2827,7 @@ void __init rcu_init(void)
 * or the scheduler are operational.
 */
cpu_notifier(rcu_cpu_notify, 0);
+   register_slave_cpu_notifier(&rcu_slave_nb);
for_each_online_cpu(cpu)
rcu_cpu_notify(NULL, CPU_UP_PREPARE, (void *)(long)cpu);
check_cpu_stall_init();




[RFC v2 PATCH 16/21] KVM: vmx: Add definitions PIN_BASED_PREEMPTION_TIMER

2012-09-06 Thread Tomoki Sekiyama
Add the definitions needed to use PIN_BASED_PREEMPTION_TIMER.

When PIN_BASED_PREEMPTION_TIMER is enabled, the guest exits with
reason=EXIT_REASON_PREEMPTION_TIMER when the counter specified in
VMX_PREEMPTION_TIMER_VALUE reaches 0.
This patch also adds a dummy handler for EXIT_REASON_PREEMPTION_TIMER,
which simply returns to VM execution.

These are currently intended only for preventing entry into the guest on a
slave CPU when vmx_prevent_run(vcpu, 1) is called.
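
The way the new exit reason plugs into the handler table can be illustrated
with a stand-alone sketch. The exit-reason numbers are the ones defined in the
diff below. The table here contains only two entries, dispatch_exit() is a
hypothetical helper, and returning 0 for an unhandled reason is a choice made
for this model rather than what vmx_handle_exit() does.

```c
#include <assert.h>
#include <stddef.h>

#define EXIT_REASON_APIC_ACCESS      44
#define EXIT_REASON_PREEMPTION_TIMER 52
#define EXIT_REASON_WBINVD           54

typedef int (*exit_handler_t)(void *vcpu);

/* Dummy handler: returning 1 means "go back to the guest". */
static int handle_preemption_timer(void *vcpu)
{
    (void)vcpu;
    return 1;
}

static int handle_wbinvd(void *vcpu)
{
    (void)vcpu;
    return 1;
}

/* Sparse table indexed by exit reason; unlisted slots are NULL. */
static exit_handler_t exit_handlers[] = {
    [EXIT_REASON_PREEMPTION_TIMER] = handle_preemption_timer,
    [EXIT_REASON_WBINVD]           = handle_wbinvd,
};

/* Look up and call the handler for an exit reason (hypothetical helper). */
static int dispatch_exit(unsigned int reason, void *vcpu)
{
    size_t n = sizeof(exit_handlers) / sizeof(exit_handlers[0]);

    assert(n == EXIT_REASON_WBINVD + 1);   /* sized by the largest index */
    if (reason < n && exit_handlers[reason] != NULL)
        return exit_handlers[reason](vcpu);
    return 0;   /* no handler in this model */
}
```

Designated initializers keep the table readable while leaving every reason
without a handler as a NULL slot that the dispatcher can detect.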

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---

 arch/x86/include/asm/vmx.h |3 +++
 arch/x86/kvm/trace.h   |1 +
 arch/x86/kvm/vmx.c |7 +++
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 74fcb96..6899aaa 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -66,6 +66,7 @@
 #define PIN_BASED_EXT_INTR_MASK 0x0001
 #define PIN_BASED_NMI_EXITING   0x0008
 #define PIN_BASED_VIRTUAL_NMIS  0x0020
+#define PIN_BASED_PREEMPTION_TIMER  0x0040
 
 #define VM_EXIT_SAVE_DEBUG_CONTROLS 0x0002
 #define VM_EXIT_HOST_ADDR_SPACE_SIZE0x0200
@@ -196,6 +197,7 @@ enum vmcs_field {
GUEST_INTERRUPTIBILITY_INFO = 0x4824,
GUEST_ACTIVITY_STATE= 0X4826,
GUEST_SYSENTER_CS   = 0x482A,
+   VMX_PREEMPTION_TIMER_VALUE  = 0x482E,
HOST_IA32_SYSENTER_CS   = 0x4c00,
CR0_GUEST_HOST_MASK = 0x6000,
CR4_GUEST_HOST_MASK = 0x6002,
@@ -280,6 +282,7 @@ enum vmcs_field {
 #define EXIT_REASON_APIC_ACCESS 44
 #define EXIT_REASON_EPT_VIOLATION   48
 #define EXIT_REASON_EPT_MISCONFIG   49
+#define EXIT_REASON_PREEMPTION_TIMER   52
 #define EXIT_REASON_WBINVD 54
 #define EXIT_REASON_XSETBV 55
 #define EXIT_REASON_INVPCID58
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 6081be7..fc350f3 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -218,6 +218,7 @@ TRACE_EVENT(kvm_apic,
{ EXIT_REASON_APIC_ACCESS,  "APIC_ACCESS" }, \
{ EXIT_REASON_EPT_VIOLATION,"EPT_VIOLATION" }, \
{ EXIT_REASON_EPT_MISCONFIG,"EPT_MISCONFIG" }, \
+   { EXIT_REASON_PREEMPTION_TIMER, "PREEMPTION_TIMER" }, \
{ EXIT_REASON_WBINVD,   "WBINVD" }
 
 #define SVM_EXIT_REASONS \
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6dc59c8..2130cbd 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4456,6 +4456,12 @@ static int handle_external_interrupt(struct kvm_vcpu 
*vcpu)
return 1;
 }
 
+static int handle_preemption_timer(struct kvm_vcpu *vcpu)
+{
+   /* Nothing */
+   return 1;
+}
+
 static int handle_triple_fault(struct kvm_vcpu *vcpu)
 {
vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
@@ -5768,6 +5774,7 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu 
*vcpu) = {
[EXIT_REASON_VMON]= handle_vmon,
[EXIT_REASON_TPR_BELOW_THRESHOLD] = handle_tpr_below_threshold,
[EXIT_REASON_APIC_ACCESS] = handle_apic_access,
+   [EXIT_REASON_PREEMPTION_TIMER]= handle_preemption_timer,
[EXIT_REASON_WBINVD]  = handle_wbinvd,
[EXIT_REASON_XSETBV]  = handle_xsetbv,
[EXIT_REASON_TASK_SWITCH] = handle_task_switch,



