Re: snd-usb: delay: estimated 0, actual 352
On 2012.09.06 at 09:08 +0200, Daniel Mack wrote:
> On 06.09.2012 08:53, Markus Trippelsdorf wrote:
> > On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote:
> > > At Thu, 06 Sep 2012 08:33:30 +0200, Daniel Mack wrote:
> > > > On 06.09.2012 08:02, Markus Trippelsdorf wrote:
> > > > > On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote:
> > > > > > Sound fixes for 3.6-rc5
> > > > > >
> > > > > > There is nothing scary; this contains only small fixes for
> > > > > > HD-audio and USB-audio:
> > > > > > - EPSS regression fix and GPIO fix for HD-audio IDT codecs
> > > > > > - A series of USB-audio regression fixes that are found since
> > > > > >   3.5 kernel
> > > > > >
> > > > > > Daniel Mack (4):
> > > > > >       ALSA: snd-usb: Fix URB cancellation at stream start
> > > > > >       ALSA: snd-usb: restore delay information
> > > > >
> > > > > The commit fbcfbf5f above causes the following lines to be printed
> > > > > whenever I start a new song:
> > > >
> > > > Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which
> > > > this patch (fbcfbf5f) brings back now.
> > > >
> > > > > delay: estimated 0, actual 352
> > > > > delay: estimated 353, actual 705
> > > > >
> > > > > (44.1 * 8 = 352.8)
> > > > >
> > > > > This happens with a USB-DAC that identifies itself as
> > > > > C-Media USB Headphone Set.
> > > >
> > > > And you didn't see these lines with 3.4?
> > >
> > > Maybe the difference of start condition?
> > > Markus, does the patch below fix anything?
> >
> > Unfortunately no.
> > However reverting the following fixes the problem:
> >
> > commit 245baf983cc39524cce39c24d01b276e6e653c9e
> > Author: Daniel Mack <zon...@gmail.com>
> > Date:   Thu Aug 30 18:52:30 2012 +0200
> >
> >     ALSA: snd-usb: fix calls to next_packet_size
>
> No, this one certainly fixes a problem and does the right thing by
> restoring the original code.
>
> If you hadn't stated that you didn't see the same effect with 3.4(!),
> before the refactoring done in 3.5, I would believe the device is simply
> slightly off in its feedback rate and the tighter delay code complains
> about it while compensating, just as it did before.
>
> Are there any more than these two lines? And is audio working at all? Is
> it distorted in any way?

There are only these two lines (printed whenever sound starts). Audio is
working just fine with no distortions.

I did see similar lines before when the system load was very high
(happened during make check when building glibc).

Here is what Pierre-Louis wrote in November 2011:
»This was supposed to be an informational message, I thought it was only
enabled for debug. Regular users don't really need to know.«

--
Markus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
[PATCH RESEND]mm/ia64: fix a node distance bug
From: Jianguo Wu <wujian...@huawei.com>

The ia64 architecture has the following definitions:

extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
#define node_distance(from,to) (numa_slit[(from) * num_online_nodes() + (to)])

num_online_nodes() is not a constant; its value changes after a node is
hot-removed or hot-added. In practice, I found that the node distance is
wrong after offlining a node on an IA64 platform.

For example, a system has 4 nodes:

node distances:
node   0   1   2   3
  0:  10  21  21  32
  1:  21  10  32  21
  2:  21  32  10  21
  3:  32  21  21  10

linux-drf:/sys/devices/system/node/node0 # cat distance
10 21 21 32
linux-drf:/sys/devices/system/node/node1 # cat distance
21 10 32 21

After offlining node2:

linux-drf:/sys/devices/system/node/node0 # cat distance
10 21 32
linux-drf:/sys/devices/system/node/node1 # cat distance
32 21 32      (expected value is: 21 10 21)

Signed-off-by: Jianguo Wu <wujian...@huawei.com>
Signed-off-by: Jiang Liu <jiang@huawei.com>
---
 arch/ia64/include/asm/numa.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/ia64/include/asm/numa.h b/arch/ia64/include/asm/numa.h
index 6a8a27c..2e27ef1 100644
--- a/arch/ia64/include/asm/numa.h
+++ b/arch/ia64/include/asm/numa.h
@@ -59,7 +59,7 @@ extern struct node_cpuid_s node_cpuid[NR_CPUS];
  */

 extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
-#define node_distance(from,to) (numa_slit[(from) * num_online_nodes() + (to)])
+#define node_distance(from,to) (numa_slit[(from) * MAX_NUMNODES + (to)])

 extern int paddr_to_nid(unsigned long paddr);
--
1.7.6.1
Re: [PATCH v2 3/3] memory-hotplug: bug fix race between isolation and allocation
Hi Minchan,

2012/09/06 14:16, Minchan Kim wrote:
> Like below, memory-hotplug makes race between page-isolation
> and page-allocation so it can hit BUG_ON in __offline_isolated_pages.
>
> CPU A                                   CPU B
>
> start_isolate_page_range
> set_migratetype_isolate
> spin_lock_irqsave(zone->lock)
>
>                                         free_hot_cold_page(Page A)
>                                         /* without zone->lock */
>                                         migratetype = get_pageblock_migratetype(Page A);
>                                         /*
>                                          * Page could be moved into MIGRATE_MOVABLE
>                                          * of per_cpu_pages
>                                          */
>                                         list_add_tail(&page->lru, &pcp->lists[migratetype]);
>
> set_pageblock_isolate
> move_freepages_block
> drain_all_pages
>
>                                         /* Page A could be in MIGRATE_MOVABLE of free_list. */
>
> check_pages_isolated
> __test_page_isolated_in_pageblock
> /*
>  * We can't catch freed page which
>  * is free_list[MIGRATE_MOVABLE]
>  */
> if (PageBuddy(page A))
>         pfn += 1 << page_order(page A);
>
>                                         /* So, Page A could be allocated */
>
> __offline_isolated_pages
> /*
>  * BUG_ON hit or offline page
>  * which is used by someone
>  */
> BUG_ON(!PageBuddy(page A));
>
> This patch checks page's migratetype in freelist in
> __test_page_isolated_in_pageblock. So now
> __test_page_isolated_in_pageblock can check the page caused by above race
> and can fail of memory offlining.
>
> Signed-off-by: Minchan Kim <minc...@kernel.org>
> ---
>  mm/page_isolation.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 87a7929..7ba7405 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -193,8 +193,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
>                          continue;
>                  }
>                  page = pfn_to_page(pfn);
> -                if (PageBuddy(page))
> +                if (PageBuddy(page)) {
> +                        if (get_freepage_migratetype(page) != MIGRATE_ISOLATE)
> +                                break;
>                          pfn += 1 << page_order(page);
> +                }
>                  else if (page_count(page) == 0 &&
>                          get_freepage_migratetype(page) == MIGRATE_ISOLATE)

When do the if statement, the page may be used by someone.
In this case, page->index may have some number. If the number is same as
MIGRATE_ISOLATE, the code goes wrong.

Thanks,
Yasuaki Ishimatsu

>                          pfn += 1;
Re: [PATCH] gpio: em: Use irq_data_get_irq_chip_data() at appropriate places
On Tue, Sep 4, 2012 at 3:58 PM, Axel Lin <axel@gmail.com> wrote:

> Then we can remove irq_to_priv() function.
>
> Signed-off-by: Axel Lin <axel@gmail.com>

Thanks, applied.

Yours,
Linus Walleij
[PATCH RESEND] arm/dts: AM33XX: Add SPI device tree data
Add McSPI data node to AM33XX device tree file. The McSPI module (and so
the driver) is reused from OMAP4.

Signed-off-by: Philip, Avinash <avinashphi...@ti.com>
---
Resending patch because ARM OMAP mailing list was not copied.

:100644 100644 bb31bff... 6b469bd... M  arch/arm/boot/dts/am33xx.dtsi
 arch/arm/boot/dts/am33xx.dtsi |   25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
index bb31bff..6b469bd 100644
--- a/arch/arm/boot/dts/am33xx.dtsi
+++ b/arch/arm/boot/dts/am33xx.dtsi
@@ -210,5 +210,30 @@
                         interrupt-parent = <&intc>;
                         interrupts = <91>;
                 };
+
+                spi0: spi@4803 {
+                        compatible = "ti,omap4-mcspi";
+                        #address-cells = <1>;
+                        #size-cells = <0>;
+                        reg = <0x483 0x400>;
+                        interrupt-parent = <&intc>;
+                        interrupt = <65>;
+                        ti,spi-num-cs = <2>;
+                        ti,hwmods = "spi0";
+                        status = "disabled";
+
+                };
+
+                spi1: spi@481a {
+                        compatible = "ti,omap4-mcspi";
+                        #address-cells = <1>;
+                        #size-cells = <0>;
+                        reg = <0x481a 0x400>;
+                        interrupt-parent = <&intc>;
+                        interrupt = <125>;
+                        ti,spi-num-cs = <2>;
+                        ti,hwmods = "spi1";
+                        status = "disabled";
+                };
         };
 };
--
1.7.1
[PATCH] dma: ipu: Drop unused spinlock
I was checking why this spinlock was never initialized, but it turns
out it's not used anywhere, so we can drop it.

Signed-off-by: Jean Delvare <kh...@linux-fr.org>
Cc: Vinod Koul <vinod.k...@intel.com>
Cc: Dan Williams <d...@fb.com>
---
I can't even build-test this.

 drivers/dma/ipu/ipu_irq.c |    1 -
 1 file changed, 1 deletion(-)

--- linux-3.6-rc4.orig/drivers/dma/ipu/ipu_irq.c        2012-08-04 21:49:26.0 +0200
+++ linux-3.6-rc4/drivers/dma/ipu/ipu_irq.c     2012-09-06 09:13:31.034228670 +0200
@@ -45,7 +45,6 @@ static void ipu_write_reg(struct ipu *ip
 struct ipu_irq_bank {
         unsigned int    control;
         unsigned int    status;
-        spinlock_t      lock;
         struct ipu      *ipu;
 };

--
Jean Delvare
Re: [PATCH -v3 14/14] x86, mm: Map ISA area with connected ram range at the same time
On Wed, Sep 5, 2012 at 1:02 AM, Pekka Enberg <penb...@kernel.org> wrote:
> > How significant is the speed gain? The isa_done flag makes code flow
> > more difficult to follow.

On Wed, 5 Sep 2012, Yinghai Lu wrote:
> Not really much.
>
> when booting system:
> memmap=16m$128m memmap=16m$512m memmap=16m$256m memmap=16m$768m memmap=16m$1024m
>
> with the patch
>
> [0.00] init_memory_mapping: [mem 0x-0x07ff]
> [0.00]  [mem 0x-0x07ff] page 2M
> [0.00] init_memory_mapping: [mem 0x0900-0x0fff]
> [0.00]  [mem 0x0900-0x0fff] page 2M
> [0.00] init_memory_mapping: [mem 0x1100-0x1fff]
> [0.00]  [mem 0x1100-0x1fff] page 2M
> [0.00] init_memory_mapping: [mem 0x2100-0x2fff]
> [0.00]  [mem 0x2100-0x2fff] page 2M
> [0.00] init_memory_mapping: [mem 0x3100-0x3fff]
> [0.00]  [mem 0x3100-0x3fff] page 2M
> [0.00] init_memory_mapping: [mem 0x4100-0x7fffdfff]
> [0.00]  [mem 0x4100-0x7fdf] page 2M
> [0.00]  [mem 0x7fe0-0x7fffdfff] page 4k
>
> otherwise will have
>
> [0.00] init_memory_mapping: [mem 0x-0x000f]
> [0.00]  [mem 0x-0x000f] page 4k
> [0.00] init_memory_mapping: [mem 0x0010-0x07ff]
> [0.00]  [mem 0x0010-0x001f] page 4k
> [0.00]  [mem 0x0020-0x07ff] page 2M
> [0.00] init_memory_mapping: [mem 0x0900-0x0fff]
> [0.00]  [mem 0x0900-0x0fff] page 2M
> [0.00] init_memory_mapping: [mem 0x1100-0x1fff]
> [0.00]  [mem 0x1100-0x1fff] page 2M
> [0.00] init_memory_mapping: [mem 0x2100-0x2fff]
> [0.00]  [mem 0x2100-0x2fff] page 2M
> [0.00] init_memory_mapping: [mem 0x3100-0x3fff]
> [0.00]  [mem 0x3100-0x3fff] page 2M
> [0.00] init_memory_mapping: [mem 0x4100-0x7fffdfff]
> [0.00]  [mem 0x4100-0x7fdf] page 2M
> [0.00]  [mem 0x7fe0-0x7fffdfff] page 4k

OK. Is there any other reason than performance to do this?

                        Pekka
Re: [PATCH 1/3] w1: mxc_w1: Adapt the clock name to the new clock framework
Hi Fabio,

On Wed, Sep 05, 2012 at 07:01:18PM -0300, Fabio Estevam wrote:
> From: Fabio Estevam <fabio.este...@freescale.com>
>
> With the new i.mx clock framework the mxc_w1 clock is registered as:
>
> clk_register_clkdev(clk[owire_gate], NULL, "mxc_w1.0");
>
> So we do not need to pass "owire" string and can use NULL instead.
>
> Signed-off-by: Fabio Estevam <fabio.este...@freescale.com>
> ---
>  drivers/w1/masters/mxc_w1.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/w1/masters/mxc_w1.c b/drivers/w1/masters/mxc_w1.c
> index 1cc61a7..14f0f66 100644
> --- a/drivers/w1/masters/mxc_w1.c
> +++ b/drivers/w1/masters/mxc_w1.c
> @@ -117,7 +117,7 @@ static int __devinit mxc_w1_probe(struct platform_device *pdev)
>          if (!mdev)
>                  return -ENOMEM;
>
> -        mdev->clk = clk_get(&pdev->dev, "owire");
> +        mdev->clk = clk_get(&pdev->dev, NULL);
>          if (!mdev->clk) {

You can sell this patch better if you fix the wrong error check here and
'by the way' adjust the lookup string.

Sascha

--
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-     |
linux-next: Tree for Sept 6
Hi all,

Changes since 20120905:

New tree: arm64

The powerpc tree gained a build failure for which I reverted 3 commits.

The net-next tree lost its build failure.

The trivial tree gained a conflict against the powerpc tree.

The spi-mb tree gained a build failure so I used the version from
next-20120905.

The driver-core tree gained a build failure (from an interaction with the
workqueues tree) for which I applied a merge fix patch.

The tty tree gained a build failure for which I applied a patch.

The staging tree lost its build failure.

The arm-soc tree gained a conflict against the usb tree.

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one. You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source. There are also quilt-import.log and merge.log files
in the Next directory. Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc,
sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 196 trees (counting Linus' and 26 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next . If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds. And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ . Thanks to Frank Seidel.

--
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (5b716ac Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6)
Merging fixes/master (9023a40 Merge tag 'mmc-fixes-for-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
Merging kbuild-current/rc-fixes (6c7080a firmware: fix directory creation rule matching with make 3.82)
Merging arm-current/fixes (36418c5 ARM: 7499/1: mm: Fix vmalloc overlap check for !HIGHMEM)
Merging m68k-current/for-linus (3be7184 m68k: Add missing RCU idle APIs on idle loop)
Merging powerpc-merge/merge (636802e powerpc: Don't use __put_user() in patch_instruction)
Merging sparc/master (6dab7ed Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm)
Merging net/master (d90c92f ibmveth: Fix alignment of rx queue bug)
Merging sound-current/for-linus (2e4a263 ALSA: snd-usb: fix cross-interface streaming devices)
Merging pci-current/for-linus (0ff9514 PCI: Don't print anything while decoding is disabled)
Merging wireless/master (f107238 libertas sdio: fix suspend when interface is down)
Merging driver-core.current/driver-core-linus (fea7a08 Linux 3.6-rc3)
Merging tty.current/tty-linus (7be0670 tty: serial: imx: don't reinit clock in imx_setup_ufcr())
Merging usb.current/usb-linus (92fc7a8 USB: add device quirk for Joss Optical touchboard)
Merging staging.current/staging-linus (6d7d979 staging: zcache: fix cleancache race condition with shrinker)
Merging char-misc.current/char-misc-linus (fea7a08 Linux 3.6-rc3)
Merging input-current/for-linus (6f4d038 Input: wacom - add support for EMR on Cintiq 24HD touch)
Merging md-current/for-linus (58e94ae md/raid1: close some possible races on write errors during resync)
Merging audit-current/for-linus (c158a35 audit: no leading space in audit_log_d_path prefix)
Merging crypto-current/master (ce026cb crypto: caam - fix possible deadlock condition)
Merging ide/master (9974e43 ide: fix generic_ide_suspend/resume Oops)
Merging dwmw2/master (244dc4e Merge git://git.infradead.org/users/dwmw2/random-2.6)
Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to inline functions)
Merging irqdomain-current/irqdomain/merge (15e06bf irqdomain: Fix debugfs formatting)
Merging devicetree-current/devicetree/merge (4e8383b of: release node fix for of_parse_phandle_with_args)
Merging spi-current/spi/merge (d1c185b of/spi: Fix SPI module loading by using proper spi: modalias prefixes.)
Merging gpio-current/gpio/merge
Re: [PATCH] Chinese translation of Documentation/gpio.txt
2012/9/5 Dong Aisheng <b29...@freescale.com>:

Thanks for your help Dong,

Wei, can you please check Dong's comments and submit a version with his
ACK, and I'll apply it.

Yours,
Linus Walleij
Re: [PATCH] gpio: sx150x: Use irq_data_get_irq_chip_data() at appropriate places
On Tue, Sep 4, 2012 at 4:06 PM, Axel Lin <axel@gmail.com> wrote:

> Signed-off-by: Axel Lin <axel@gmail.com>

Thanks, applied!

Yours,
Linus Walleij
Re: [RFC v9 PATCH 20/21] memory-hotplug: clear hwpoisoned flag when onlining pages
2012/9/5 <we...@cn.fujitsu.com>
> From: Wen Congyang <we...@cn.fujitsu.com>
>
> hwpoisoned may be set when we offline a page by the sysfs interface
> /sys/devices/system/memory/soft_offline_page or
> /sys/devices/system/memory/hard_offline_page. If we don't clear
> this flag when onlining pages, this page can't be freed, and will
> not be in the free list. So we can't offline these pages again. So we
> should clear this flag when onlining pages.
>
> CC: David Rientjes <rient...@google.com>
> CC: Jiang Liu <liu...@gmail.com>
> CC: Len Brown <len.br...@intel.com>
> CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
> CC: Paul Mackerras <pau...@samba.org>
> CC: Christoph Lameter <c...@linux.com>
> Cc: Minchan Kim <minchan@gmail.com>
> CC: Andrew Morton <a...@linux-foundation.org>
> CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
> CC: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
> Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
> ---
>  mm/memory_hotplug.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 270c249..140c080 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -661,6 +661,11 @@ EXPORT_SYMBOL_GPL(__online_page_increment_counters);
>
>  void __online_page_free(struct page *page)
>  {
> +#ifdef CONFIG_MEMORY_FAILURE
> +        /* The page may be marked HWPoisoned by soft/hard offline page */
> +        ClearPageHWPoison(page);

Hi Congyang,

I think you should decrease the mce_bad_pages counter here:

        atomic_long_sub(1, &mce_bad_pages);

> +#endif
> +
>          ClearPageReserved(page);
>          init_page_count(page);
>          __free_page(page);
> --
> 1.7.1
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"d...@kvack.org"> em...@kvack.org </a>
Re: [PATCH] OMAP GPIO - don't wake from suspend unless requested.
On Thu, Sep 6, 2012 at 12:32 PM, NeilBrown <ne...@suse.de> wrote:
> On Thu, 6 Sep 2012 11:18:09 +0530 "Shilimkar, Santosh"
> <santosh.shilim...@ti.com> wrote:
>
> > On Thu, Sep 6, 2012 at 8:35 AM, NeilBrown <ne...@suse.de> wrote:
> > > On Mon, 3 Sep 2012 22:59:06 -0700 "Shilimkar, Santosh"
> > > <santosh.shilim...@ti.com> wrote:
> > > > After thinking bit more on this, the problem seems to be
> > > > coming mainly because the gpio device is runtime suspended
> > > > bit early than it should be. Similar issue seen with i2c driver
> > > > as well. The i2c issue was discussed with Rafael at LPC last
> > > > week. The idea is to move the pm_runtime_enable/disable()
> > > > calls entirely up to the _late/_early stage of device
> > > > suspend/resume.
> > > > Will update this thread once I have further update.
> > >
> > > This won't be late enough. IRQCHIP_MASK_ON_SUSPEND takes effect
> > > after all the _late callbacks have been called.
> > > I, too, spoke to Rafael about this in San Diego. He seemed to agree
> > > with me that the interrupt needs to be masked in the ->suspend
> > > callback. Any later is too late.
> >
> > Thanks for information about your discussion. Will wait for the
> > patch then.
> >
> > Regards
> > santosh
>
> I already sent a patch - that is what started this thread :-)
>
> I include it below.
> You said "The patch doesn't seems to be correct" but didn't expand on
> why. Do you still think it is not correct? I wouldn't be surprised if
> there is some case that it doesn't handle quite right, but it seems
> right to me.

Sorry, I thought you were talking about moving the "Checking wakeup
interrupts" bit early, as discussed in the follow-up on the alternate
suggested patch to make use of IRQCHIP_MASK_ON_SUSPEND.

I think we need to fix the issue seen with the 'IRQCHIP_MASK_ON_SUSPEND'
patch. That is at least the expected way to manage suspend and wakeup
irq masks for an IRQCHIP.

Regards
Santosh
Re: [PATCH v2 3/3] memory-hotplug: bug fix race between isolation and allocation
Hello Yasuaki,

On Thu, Sep 06, 2012 at 04:17:54PM +0900, Yasuaki Ishimatsu wrote:
> Hi Minchan,
>
> 2012/09/06 14:16, Minchan Kim wrote:
> > Like below, memory-hotplug makes race between page-isolation
> > and page-allocation so it can hit BUG_ON in __offline_isolated_pages.
> >
> > CPU A                                   CPU B
> >
> > start_isolate_page_range
> > set_migratetype_isolate
> > spin_lock_irqsave(zone->lock)
> >
> >                                         free_hot_cold_page(Page A)
> >                                         /* without zone->lock */
> >                                         migratetype = get_pageblock_migratetype(Page A);
> >                                         /*
> >                                          * Page could be moved into MIGRATE_MOVABLE
> >                                          * of per_cpu_pages
> >                                          */
> >                                         list_add_tail(&page->lru, &pcp->lists[migratetype]);
> >
> > set_pageblock_isolate
> > move_freepages_block
> > drain_all_pages
> >
> >                                         /* Page A could be in MIGRATE_MOVABLE of free_list. */
> >
> > check_pages_isolated
> > __test_page_isolated_in_pageblock
> > /*
> >  * We can't catch freed page which
> >  * is free_list[MIGRATE_MOVABLE]
> >  */
> > if (PageBuddy(page A))
> >         pfn += 1 << page_order(page A);
> >
> >                                         /* So, Page A could be allocated */
> >
> > __offline_isolated_pages
> > /*
> >  * BUG_ON hit or offline page
> >  * which is used by someone
> >  */
> > BUG_ON(!PageBuddy(page A));
> >
> > This patch checks page's migratetype in freelist in
> > __test_page_isolated_in_pageblock. So now
> > __test_page_isolated_in_pageblock can check the page caused by above
> > race and can fail of memory offlining.
> >
> > Signed-off-by: Minchan Kim <minc...@kernel.org>
> > ---
> >  mm/page_isolation.c |    5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> > index 87a7929..7ba7405 100644
> > --- a/mm/page_isolation.c
> > +++ b/mm/page_isolation.c
> > @@ -193,8 +193,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
> >                          continue;
> >                  }
> >                  page = pfn_to_page(pfn);
> > -                if (PageBuddy(page))
> > +                if (PageBuddy(page)) {
> > +                        if (get_freepage_migratetype(page) != MIGRATE_ISOLATE)
> > +                                break;
> >                          pfn += 1 << page_order(page);
> > +                }
> >                  else if (page_count(page) == 0 &&
> >                          get_freepage_migratetype(page) == MIGRATE_ISOLATE)
>
> When do the if statement, the page may be used by someone.

I can't understand your point.
We already hold zone->lock so that allocator and this function should be
atomic when the page is in free_list. If I miss something, could you
elaborate it more?

> In this case, page->index may have some number. If the number is same as
> MIGRATE_ISOLATE, the code goes wrong.
>
> Thanks,
> Yasuaki Ishimatsu
>
> >                          pfn += 1;

--
Kind regards,
Minchan Kim
[PATCH] JFS: use list_move instead of list_del/list_add
From: Wei Yongjun <yongjun_...@trendmicro.com.cn>

Use list_move() instead of list_del() + list_add().

spatch with a semantic match was used to find this problem.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun <yongjun_...@trendmicro.com.cn>
---
 fs/jfs/jfs_txnmgr.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/jfs/jfs_txnmgr.c b/fs/jfs/jfs_txnmgr.c
index bb8b661..5fcc02e 100644
--- a/fs/jfs/jfs_txnmgr.c
+++ b/fs/jfs/jfs_txnmgr.c
@@ -2977,12 +2977,9 @@ int jfs_sync(void *arg)
                                  * put back on the anon_list.
                                  */

-                                /* Take off anon_list */
-                                list_del(&jfs_ip->anon_inode_list);
-
-                                /* Put on anon_list2 */
-                                list_add(&jfs_ip->anon_inode_list,
-                                         &TxAnchor.anon_list2);
+                                /* Move from anon_list to anon_list2 */
+                                list_move(&jfs_ip->anon_inode_list,
+                                          &TxAnchor.anon_list2);

                                 TXN_UNLOCK();
                                 iput(ip);
Re: snd-usb: delay: estimated 0, actual 352
At Thu, 6 Sep 2012 09:17:57 +0200, Markus Trippelsdorf wrote:
> On 2012.09.06 at 09:08 +0200, Daniel Mack wrote:
> > On 06.09.2012 08:53, Markus Trippelsdorf wrote:
> > > On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote:
> > > > At Thu, 06 Sep 2012 08:33:30 +0200, Daniel Mack wrote:
> > > > > On 06.09.2012 08:02, Markus Trippelsdorf wrote:
> > > > > > On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote:
> > > > > > > Sound fixes for 3.6-rc5
> > > > > > >
> > > > > > > There is nothing scary; this contains only small fixes for
> > > > > > > HD-audio and USB-audio:
> > > > > > > - EPSS regression fix and GPIO fix for HD-audio IDT codecs
> > > > > > > - A series of USB-audio regression fixes that are found
> > > > > > >   since 3.5 kernel
> > > > > > >
> > > > > > > Daniel Mack (4):
> > > > > > >       ALSA: snd-usb: Fix URB cancellation at stream start
> > > > > > >       ALSA: snd-usb: restore delay information
> > > > > >
> > > > > > The commit fbcfbf5f above causes the following lines to be
> > > > > > printed whenever I start a new song:
> > > > >
> > > > > Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8
> > > > > which this patch (fbcfbf5f) brings back now.
> > > > >
> > > > > > delay: estimated 0, actual 352
> > > > > > delay: estimated 353, actual 705
> > > > > >
> > > > > > (44.1 * 8 = 352.8)
> > > > > >
> > > > > > This happens with a USB-DAC that identifies itself as
> > > > > > C-Media USB Headphone Set.
> > > > >
> > > > > And you didn't see these lines with 3.4?
> > > >
> > > > Maybe the difference of start condition?
> > > > Markus, does the patch below fix anything?
> > >
> > > Unfortunately no.
> > > However reverting the following fixes the problem:
> > >
> > > commit 245baf983cc39524cce39c24d01b276e6e653c9e
> > > Author: Daniel Mack <zon...@gmail.com>
> > > Date:   Thu Aug 30 18:52:30 2012 +0200
> > >
> > >     ALSA: snd-usb: fix calls to next_packet_size
> >
> > No, this one certainly fixes a problem and does the right thing by
> > restoring the original code.
> >
> > If you hadn't stated that you didn't see the same effect with
> > 3.4(!), before the refactoring done in 3.5, I would believe the device
> > is simply slightly off in its feedback rate and the tighter delay code
> > complains about it while compensating, just as it did before.
> >
> > Are there any more than these two lines? And is audio working at all?
> > Is it distorted in any way?
>
> There are only these two lines (printed whenever sound starts). Audio is
> working just fine with no distortions.
>
> I did see similar lines before when the system load was very high
> (happened during make check when building glibc).
>
> Here is what Pierre-Louis wrote in November 2011:
> »This was supposed to be an informational message, I thought it was only
> enabled for debug. Regular users don't really need to know.«

I guess the problem is that the new endpoint scheme doesn't count the
last_delay update unless the stream is triggered.

In the old code, retire_playback_urb is always called even before the
trigger(START) is set. And there, retire_playback_urb() does nothing
but update the delay information.

In the new code, retire_playback_urb is set only at
snd_usb_substream_playback_trigger(). Thus at the very first shot, the
delay accounting gets confused.


Takashi
Re: [Patch 0/1]drm_irq: Introducing the irq_thread support
On Thu, Sep 06, 2012 at 12:42:05AM +, Liu, Chuansheng wrote:
> > This possibly ought to be submitted in parallel with the code that uses
> > it so that the whole proposal can be evaluated as one thing ?
> >
> > Alan
> Patch is here, thanks.
>
> From: liu chuansheng <chuansheng@intel.com>
> Subject: [PATCH] drm_irq: Introducing the irq_thread support
>
> For some GPUs, the irq handler needs 1ms to handle the irq action.
> And it will delay the whole system irq handler.
>
> This patch adds irq thread support, which makes the drm_irq interface
> more flexible.
>
> The changes include:
> 1/ Change the request_irq to request_threaded_irq, and add new callback
>    irq_handler_t;
> 2/ Normally we need IRQF_ONESHOT flag support for irq thread, so add this
>    option in drm_irq;
>
> Cc: Shi Yang <yang.a@intel.com>
> Signed-off-by: liu chuansheng <chuansheng@intel.com>

Nacked-by: Daniel Vetter <daniel.vet...@ffwll.ch>

I _really_ hate when we add random special cases for random strange
drivers to core code - the usual end result is that in a few years we'll
have a maze of special-cases only used by one driver each. And nope,
cleaning that up isn't _that_ much fun ...

So just do all this in your own driver's code (and maybe set
dev->irq_enabled yourself so that wait_vblank still works).

Yours, Daniel

> ---
>  drivers/gpu/drm/drm_irq.c |    8 ++++++--
>  include/drm/drmP.h        |    2 ++
>  2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index 03f16f3..bc105fe 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -345,13 +345,17 @@ int drm_irq_install(struct drm_device *dev)
>          if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
>                  sh_flags = IRQF_SHARED;
>
> +        if (drm_core_check_feature(dev, DRIVER_IRQ_ONESHOT))
> +                sh_flags |= IRQF_ONESHOT;
> +
>          if (dev->devname)
>                  irqname = dev->devname;
>          else
>                  irqname = dev->driver->name;
>
> -        ret = request_irq(drm_dev_to_irq(dev), dev->driver->irq_handler,
> -                          sh_flags, irqname, dev);
> +        ret = request_threaded_irq(drm_dev_to_irq(dev),
> +                        dev->driver->irq_handler, dev->driver->irq_handler_t,
> +                        sh_flags, irqname, dev);
>
>          if (ret < 0) {
>                  mutex_lock(&dev->struct_mutex);
> diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> index d6b67bb..b28be4b 100644
> --- a/include/drm/drmP.h
> +++ b/include/drm/drmP.h
> @@ -152,6 +152,7 @@ int drm_err(const char *func, const char *format, ...);
>  #define DRIVER_GEM         0x1000
>  #define DRIVER_MODESET     0x2000
>  #define DRIVER_PRIME       0x4000
> +#define DRIVER_IRQ_ONESHOT 0x8000
>
>  #define DRIVER_BUS_PCI 0x1
>  #define DRIVER_BUS_PLATFORM 0x2
> @@ -872,6 +873,7 @@ struct drm_driver {
>          /* these have to be filled in */
>
>          irqreturn_t(*irq_handler) (DRM_IRQ_ARGS);
> +        irqreturn_t(*irq_handler_t) (DRM_IRQ_ARGS);
>          void (*irq_preinstall) (struct drm_device *dev);
>          int (*irq_postinstall) (struct drm_device *dev);
>          void (*irq_uninstall) (struct drm_device *dev);
> --
> 1.7.0.4
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

--
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
[PATCH] gpio-ich: Add missing spinlock init
As reported by CONFIG_DEBUG_SPINLOCK=y.

Signed-off-by: Jean Delvare kh...@linux-fr.org
Cc: Peter Tyser pty...@xes-inc.com
Cc: Grant Likely grant.lik...@secretlab.ca
Cc: Linus Walleij linus.wall...@linaro.org
Cc: sta...@vger.kernel.org [v3.5+]
---
 drivers/gpio/gpio-ich.c |1 +
 1 file changed, 1 insertion(+)

--- linux-3.6-rc4.orig/drivers/gpio/gpio-ich.c	2012-09-04 13:34:03.0 +0200
+++ linux-3.6-rc4/drivers/gpio/gpio-ich.c	2012-09-06 08:08:57.571210424 +0200
@@ -390,6 +390,7 @@ static int __devinit ichx_gpio_probe(str
 		return -ENODEV;
 	}
 
+	spin_lock_init(&ichx_priv.lock);
 	res_base = platform_get_resource(pdev, IORESOURCE_IO, ICH_RES_GPIO);
 	ichx_priv.use_gpio = ich_info->use_gpio;
 	err = ichx_gpio_request_regions(res_base, pdev->name,

--
Jean Delvare
Re: [PATCH] virtio-blk: Fix kconfig option
On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote:
> Kent Overstreet koverstr...@google.com writes:
> > CONFIG_VIRTIO isn't exposed, everything else is supposed to select it
> > instead.
>
> This is a slight misunderstanding. It's supposed to be selected by the
> particular driver, probably virtio_pci in your case.

So are you saying virtio-blk depends on virtio-pci? If so, the kconfig should have that.

As is, VIRTIO_BLK just has:

	depends on EXPERIMENTAL && VIRTIO

which is flat out broken.
[PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works
VIRTIO_BALLOON_F_MUST_TELL_HOST cannot be used properly because it is a negative feature: it tells you that silent deflate is not supported. Right now, QEMU refuses migration if the target does not support all the features that were negotiated. But then:

- a migration from non-MUST_TELL_HOST to MUST_TELL_HOST will succeed, which is wrong;
- a migration from MUST_TELL_HOST to non-MUST_TELL_HOST will fail, which is useless.

Add instead a new feature VIRTIO_BALLOON_F_SILENT_DEFLATE, and deprecate VIRTIO_BALLOON_F_MUST_TELL_HOST since it is never actually used.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 virtio-spec.lyx | 36 +---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index 7a073f4..1a25a18 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -6238,6 +6238,8 @@ bits
 
 \begin_deeper
 \begin_layout Description
+
+\change_deleted 1531152142 1346917221
 VIRTIO_BALLOON_F_MUST_TELL_HOST
 \begin_inset space ~
 \end_inset
@@ -6251,6 +6253,20 @@ VIRTIO_BALLOON_F_STATS_VQ
 \end_inset
 
 (1) A virtqueue for reporting guest memory statistics is present.
+\change_inserted 1531152142 1346917193
+
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1531152142 1346917219
+VIRTIO_BALLOON_F_SILENT_DEFLATE
+\begin_inset space ~
+\end_inset
+
+(2) Host does not need to be told before pages from the balloon are used.
+\change_unchanged
+
 \end_layout
 
 \end_deeper
@@ -6401,9 +6417,23 @@ The driver constructs an array of addresses of memory pages it has previously
 \end_layout
 
 \begin_layout Enumerate
-If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is set, the guest may not
- use these requested pages until that descriptor in the deflateq has been
- used by the device.
+If the VIRTIO_BALLOON_F_
+\change_deleted 1531152142 1346917234
+MUST_TELL_HOST
+\change_inserted 1531152142 1346917237
+SILENT_DEFLATE
+\change_unchanged
+ feature is
+\change_inserted 1531152142 1346917241
+not
+\change_unchanged
+set, the guest may not use these requested pages until that descriptor in
+ the deflateq has been used by the device.
+
+\change_inserted 1531152142 1346917253
+ If it is set, the guest may choose to not use the deflateq at all.
+\change_unchanged
+
 \end_layout
 
 \begin_layout Enumerate
--
1.7.11.2
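The migration argument in the commit message can be checked with a small userspace sketch. It assumes only the rule stated above (migration is refused unless the target offers every negotiated feature bit). `migration_allowed()` and the `BALLOON_F_*` macro names are hypothetical, not QEMU code; bit 0 is the virtio spec's value for MUST_TELL_HOST and bit 2 is the value proposed in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0: MUST_TELL_HOST in the virtio spec. Bit 2: value proposed by the patch. */
#define BALLOON_F_MUST_TELL_HOST  (1u << 0)
#define BALLOON_F_SILENT_DEFLATE  (1u << 2)

/*
 * Hypothetical model of the check described in the commit message:
 * migration is allowed only if the target host offers every feature
 * that was negotiated on the source.
 */
static bool migration_allowed(uint32_t negotiated, uint32_t target_features)
{
	return (negotiated & ~target_features) == 0;
}

/*
 * Negative encoding (MUST_TELL_HOST):
 *   negotiated = 0, target = MUST_TELL_HOST -> allowed, although the guest
 *     may deflate silently on a host that forbids it (wrong);
 *   negotiated = MUST_TELL_HOST, target = 0 -> refused, although the target
 *     is the more permissive host (useless).
 *
 * Positive encoding (SILENT_DEFLATE):
 *   negotiated = SILENT_DEFLATE, target = 0 -> refused (correct);
 *   negotiated = 0, target = SILENT_DEFLATE -> allowed (safe).
 */
```

With a positive feature bit, the generic subset check gives the right answer in both directions without any balloon-specific logic.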
Re: [PATCH] drm/exynos: fix double call of drm_prime_(init/destroy)_file_private
Dear Inki Dae,

On Thursday, 06.09.2012 at 11:35 +0900, InKi Dae wrote:
> 2012/9/6 Mandeep Singh Baines m...@chromium.org:
> > The double invocations are incorrect but seem to be safe so I don't
> > think this will fix any bugs.
> >
> > Before:
> >
> > [7.639366] drm_prime_init_file ee3675d0
> > [7.639377] drm_prime_init_file ee3675d0
> > [7.639507] drm_prime_destroy_file ee3675d0
> > [7.639518] drm_prime_destroy_file ee3675d0
> > [7.639802] drm_prime_init_file ee372390
> > [7.639810] drm_prime_init_file ee372390
> > [8.473316] drm_prime_init_file ee356390
> > [8.473331] drm_prime_init_file ee356390
> >
> > After:
> >
> > [6.363842] drm_prime_init_file edc2e5d0
> > [6.363994] drm_prime_destroy_file edc2e5d0
> > [6.364260] drm_prime_init_file edc2e750
> > [8.004837] drm_prime_init_file ee36ded0
> >
> > Signed-off-by: Mandeep Singh Baines m...@chromium.org
> > CC: Stéphane Marchesin marc...@chromium.org
> > CC: Pawel Osciak posc...@google.com
> > CC: Inki Dae inki@samsung.com
> > CC: Joonyoung Shim jy0922.s...@samsung.com
> > CC: Seung-Woo Kim sw0312@samsung.com
> > CC: Kyungmin Park kyungmin.p...@samsung.com
> > CC: David Airlie airl...@linux.ie
> > CC: dri-de...@lists.freedesktop.org
>
> remove all CCs

I guess they were generated by some script. So they should be fine, no?

Mandeep, if you put CC in here, those people should actually be CCed. `git send-email` should take care of that, but I do not see everyone in the CC field. Or does `git send-email` use the blind carbon copy (BCC) field?

> and can you send it again using text mode?

At least to the list it was sent in plain text mode.

> your patch is messed up when I try to get patch file.

Everything is fine on my side. Especially since Mandeep used `git send-email`, which should do everything correctly.

> Thanks.
> Inki Dae

In your From address your name is written InKi with capital K. Which one is correct?

Thanks,

Paul
Re: [PATCH] OMAP GPIO - don't wake from suspend unless requested.
On Thu, 6 Sep 2012 12:57:46 +0530 Shilimkar, Santosh santosh.shilim...@ti.com wrote: On Thu, Sep 6, 2012 at 12:32 PM, NeilBrown ne...@suse.de wrote: On Thu, 6 Sep 2012 11:18:09 +0530 Shilimkar, Santosh santosh.shilim...@ti.com wrote: On Thu, Sep 6, 2012 at 8:35 AM, NeilBrown ne...@suse.de wrote: On Mon, 3 Sep 2012 22:59:06 -0700 Shilimkar, Santosh santosh.shilim...@ti.com wrote: After thinking bit more on this, the problem seems to be coming mainly because the gpio device is runtime suspended bit early than it should be. Similar issue seen with i2c driver as well. The i2c issue was discussed with Rafael at LPC last week. The idea is to move the pm_runtime_enable/disable() calls entirely up to the _late/_early stage of device suspend/resume. Will update this thread once I have further update. This won't be late enough. IRQCHIP_MASK_ON_SUSPEND takes effect after all the _late callbacks have been called. I, too, spoke to Rafael about this in San Diego. He seemed to agree with me that the interrupt needs to be masked in the -suspend callback. any later is too late. Thanks for information about your discussion. Will wait for the patch then. Regards santosh I already sent a patch - that is what started this thread :-) I include it below. You said The patch doesn't seems to be correct but didn't expand on why. Do you still think it is not correct? I wouldn't be surprised if there is some case that it doesn't handle quite right, but it seems right to me. Sorry I though you were talking about moving the Checking wakeup interrupts bit early as discussed on the follow up of alternate suggested patch to make use of IRQCHIP_MASK_ON_SUSPEND. I think we need to fix the issue seen with ' IRQCHIP_MASK_ON_SUSPEND' patch. That is at least the expected way to manage suspend and wakeup irq masks for a IRQCHIP. That is what I thought at first too. But when talking to Rafael he said that IRQCHIP_MASK_ON_SUSPEND was intended mainly for clock interrupts. 
For other less fundamental interrupts, doing the mask/unmask in suspend/resume callbacks is sufficient and simpler... and actually works. IRQCHIP_MASK_ON_SUSPEND is currently used by precisely two drivers: arch/arm/mach-omap2/omap-wakeupgen.c and drivers/mfd/pm8xxx-irq.c which suggests that it isn't widely used and quite possibly doesn't actually work in general. The pm8xxx-irq doesn't seem to do runtime pm, so maybe it manages to work for that reason. The omap-wakeupgen code is beyond my current understanding, but it seems like it might be the sort of special case that IRQCHIP_MASK_ON_SUSPEND is intended for... Maybe we need to start a new thread and pester Rafael or Thomas Gleixner to either explain what is intended for this case, or to fix IRQCHIP_MASK_ON_SUSPEND so that it can be used in general. NeilBrown
Re: [alsa-devel] [PATCH] ASoC: ams-delta: fix card initalization failure
On Sat, Sep 01, 2012 at 11:09:18AM +0200, Janusz Krzysztofik wrote:
> I see your point, however for now I can see no better way of referencing
> the data (of type struct snd_soc_card) than passing it to
> snd_soc_register_card(). But for this to work, I would have to register
> successfully an ams-delta specific platform device first, not the
> soc-audio. This, even if still done from the sound/soc/omap/ams-delta.c,
> not from an arch board file, would require now not existing ams-delta
> ASoC platform driver probe/remove callbacks at least. I'm still not
> convinced if such modification would be acceptable in the middle of the
> rc cycle. If there is a simpler, less intrusive way to do this, then
> sorry, I still can't see it.

Like I already said just make it a static variable.
Re: [RFC] module: signature infrastructure
Lucas De Marchi lucas.de.mar...@gmail.com writes:
> Sorry to come up with this suggestion only now (and after you have
> already talked to me at LPC). Only after seeing this implementation I
> thought about the implications of having the module signed in this
> format.
...
> I'm worried about performance here. Module loading can take a fair
> amount of boot time. It may not be critical for servers or desktops that
> we rarely boot, but it is for embedded uses.
...
> Letting it in be32 is the simplest solution IMO. it's way simpler than
> the loop above.
...
>> Scanning the module is the least of our issues since we've just copied
>> it and we're about to SHA it.
>
> Yeah, but I don't think we need to scan it one more time. On every boot.
> For every module

Regretfully, I don't have Linus' talent for flamage.

There's no measurable performance impact. Scanning 1k takes about 5usec; we've wasted about enough time on this subject to load a billion kernel modules.

I know this. Not because I'm brilliant, but because I *measured* it. I even pulled out my original module signature signing check code, and that was both faster and simpler. See below.

Your assertion that the length-prepended version is way simpler is wrong. Again, I know this because I *read the code*:

https://git.kernel.org/?p=linux/kernel/git/kasatkin/linux-digsig.git;a=commitdiff;h=19eef6e4e529ccf2b3a6ab5c19dd40f2eaf8fcaf

Don't send any more lazy, unthoughtful mails to the list. It's disrespectful and makes me grumpy.

Rusty.

PS. Pushed updated version to my kernel.org linux.git/module-signing branch.

#ifdef CONFIG_MODULE_SIG
static int module_sig_check(struct load_info *info,
			    const void *mod, unsigned long *len)
{
	int err = 0;
	const unsigned long markerlen = strlen(MODULE_SIG_STRING);
	const void *p = mod, *end = mod + *len;

	/* Poor man's memmem. */
	while ((p = memchr(p, MODULE_SIG_STRING[0], end - p))) {
		if (p + markerlen > end)
			break;

		if (memcmp(p, MODULE_SIG_STRING, markerlen) == 0) {
			const void *sig = p + markerlen;
			/* Truncate module up to signature. */
			*len = p - mod;
			err = mod_verify_sig(mod, *len, sig, end - sig,
					     &info->sig_ok);
			break;
		}
		p++;
	}

	/* Not having a signature is only an error if we're strict. */
	if (!err && !info->sig_ok && sig_enforce)
		err = -EKEYREJECTED;

	return err;
}
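For readers who want to try the "poor man's memmem" outside the kernel, here is a minimal userspace sketch of the same scan. It is not the kernel function: `find_sig_marker()` and the `SIG_MARKER` value are made up for illustration (the real MODULE_SIG_STRING value is not shown in this thread), and it only locates the marker, it verifies nothing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical marker; stands in for MODULE_SIG_STRING. */
#define SIG_MARKER "~SIG~"

/*
 * Return the offset of the first SIG_MARKER in buf[0..len), or -1 if the
 * marker does not occur. Uses memchr() to jump to each candidate first
 * byte, then memcmp() to confirm the full marker.
 */
static long find_sig_marker(const void *buf, size_t len)
{
	const size_t markerlen = strlen(SIG_MARKER);
	const unsigned char *start = buf, *p = buf, *end = start + len;

	while (p < end && (p = memchr(p, SIG_MARKER[0], (size_t)(end - p)))) {
		/* Not enough bytes left for a whole marker: stop. */
		if ((size_t)(end - p) < markerlen)
			break;
		if (memcmp(p, SIG_MARKER, markerlen) == 0)
			return (long)(p - start);
		p++;	/* false candidate, keep scanning after it */
	}
	return -1;
}
```

Everything before the returned offset corresponds to the truncated module and everything after the marker to the signature, which is how the kernel snippet above splits `mod` and `sig`.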
Re: [PATCH v2 2/2] virtio-ring: Allocate indirect buffers from cache when possible
Michael S. Tsirkin m...@redhat.com writes:
> Yes without checksum net core always linearizes packets, so yes it is
> screwed. For -net, skb always allocates space for 17 frags + linear part
> so it seems sane to do same in virtio core, and allocate, for -net, up
> to max_frags + 1 from cache. We can adjust it: no _SG -> 2 otherwise 18.

But I thought it used individual buffers these days?

Cheers,
Rusty.
Re: [PATCH] KVM: VMX: invalidate vpid for invlpg instruction
On 09/06/2012 12:54 AM, Davidlohr Bueso wrote:
> On Mon, 2012-09-03 at 12:11 +0300, Avi Kivity wrote:
>> On 09/03/2012 02:27 AM, Davidlohr Bueso wrote:
>>> On Fri, 2012-08-31 at 14:37 -0300, Marcelo Tosatti wrote:
>>>> On Fri, Aug 31, 2012 at 06:10:48PM +0200, Davidlohr Bueso wrote:
>>>>> For processors that support VPIDs we should invalidate the page
>>>>> table entry specified by the linear address. For this purpose add
>>>>> support for individual address invalidations.
>>>>
>>>> Not necessary - a single context invalidation is performed through
>>>> KVM_REQ_TLB_FLUSH.
>>>
>>> Since vpid_sync_context() supports both single and all-context vpid
>>> invalidations, wouldn't it make sense to also add individual address
>>> ones as well, supporting further granularity?
>>
>> It might. Do you have benchmarks supporting this?
>
> I ran two benchmarks: Java Dacapo[1] Sunflow (renders a set of images
> using ray tracing) and a vanilla 3.2 kernel build (with 1 job and -j8).
>
> The host configuration is an Intel i7-2635QM (4 cores + HT) with 4Gb RAM
> running Linus's latest and only running standard system daemons. For KVM
> I disabled EPT.

That's not very interesting. In all real machines, if you have VPID you also have EPT. Intel are unlikely to produce a processor without EPT.

> The guest configuration is a 64bit 4 core 4Gb RAM, running Linux 3.2
> (debian) and only running the benchmark. All results represent the mean
> of 5 runs, with time(1).

The results are impressive, but lack real-world relevance.

Individual-address invalidation isn't very useful with EPT, since we let the guest issue INVLPG itself and otherwise don't bother with guest page tables. Individual-address INVEPT would probably be more useful, but there is no such instruction variant.

--
error compiling committee.c: too many arguments to function
Re: [RFC v2] memory-hotplug: remove MIGRATE_ISOLATE from free_area-free_list
On 09/06/2012 10:53 AM, Minchan Kim wrote: Normally, MIGRATE_ISOLATE type is used for memory-hotplug. But it's irony type because the pages isolated would exist as free page in free_area-free_list[MIGRATE_ISOLATE] so people can think of it as allocatable pages but it is *never* allocatable. It ends up confusing NR_FREE_PAGES vmstat so it would be totally not accurate so some of place which depend on such vmstat could reach wrong decision by the context. There were already report about it.[1] [1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem Then, there was other report which is other problem.[2] [2] http://www.spinics.net/lists/linux-mm/msg41251.html I believe it can make problems in future, too. So I hope removing such irony type by another design. I hope this patch solves it and let's revert [1] and doesn't need [2]. * Changelog v1 * Fix from Michal's many suggestion Cc: Michal Nazarewicz min...@mina86.com Cc: Mel Gorman m...@csn.ul.ie Cc: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com Cc: Wen Congyang we...@cn.fujitsu.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Signed-off-by: Minchan Kim minc...@kernel.org --- @@ -180,30 +287,35 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn, * all pages in [start_pfn...end_pfn) must be in the same zone. * zone-lock must be held before call this. * - * Returns 1 if all pages in the range are isolated. + * Returns true if all pages in the range are isolated. 
*/ -static int -__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn) +static bool +__test_page_isolated_in_pageblock(unsigned long start_pfn, unsigned long end_pfn) { + unsigned long pfn, next_pfn; struct page *page; - while (pfn end_pfn) { - if (!pfn_valid_within(pfn)) { - pfn++; - continue; - } - page = pfn_to_page(pfn); - if (PageBuddy(page)) - pfn += 1 page_order(page); - else if (page_count(page) == 0 - page_private(page) == MIGRATE_ISOLATE) - pfn += 1; - else - break; + list_for_each_entry(page, isolated_pages, lru) { + if (page-lru == isolated_pages) + return false; what's the mean of this line? + pfn = page_to_pfn(page); + if (pfn = end_pfn) + return false; + if (pfn = start_pfn) + goto found; + } + return false; + + list_for_each_entry_continue(page, isolated_pages, lru) { + if (page_to_pfn(page) != next_pfn) + return false; where is next_pfn init-ed? +found: + pfn = page_to_pfn(page); + next_pfn = pfn + (1UL page_order(page)); + if (next_pfn = end_pfn) + return true; } - if (pfn end_pfn) - return 0; - return 1; + return false; } int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn) @@ -211,7 +323,7 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn) unsigned long pfn, flags; struct page *page; struct zone *zone; - int ret; + bool ret; /* * Note: pageblock_nr_page != MAX_ORDER. Then, chunks of free page diff --git a/mm/vmstat.c b/mm/vmstat.c index df7a674..bb59ff7 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -616,7 +616,6 @@ static char * const migratetype_names[MIGRATE_TYPES] = { #ifdef CONFIG_CMA CMA, #endif - Isolate, }; static void *frag_start(struct seq_file *m, loff_t *pos) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC v2] memory-hotplug: remove MIGRATE_ISOLATE from free_area-free_list
Hello Lai, On Thu, Sep 06, 2012 at 04:14:51PM +0800, Lai Jiangshan wrote: On 09/06/2012 10:53 AM, Minchan Kim wrote: Normally, MIGRATE_ISOLATE type is used for memory-hotplug. But it's irony type because the pages isolated would exist as free page in free_area-free_list[MIGRATE_ISOLATE] so people can think of it as allocatable pages but it is *never* allocatable. It ends up confusing NR_FREE_PAGES vmstat so it would be totally not accurate so some of place which depend on such vmstat could reach wrong decision by the context. There were already report about it.[1] [1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem Then, there was other report which is other problem.[2] [2] http://www.spinics.net/lists/linux-mm/msg41251.html I believe it can make problems in future, too. So I hope removing such irony type by another design. I hope this patch solves it and let's revert [1] and doesn't need [2]. * Changelog v1 * Fix from Michal's many suggestion Cc: Michal Nazarewicz min...@mina86.com Cc: Mel Gorman m...@csn.ul.ie Cc: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com Cc: Wen Congyang we...@cn.fujitsu.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Signed-off-by: Minchan Kim minc...@kernel.org --- @@ -180,30 +287,35 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn, * all pages in [start_pfn...end_pfn) must be in the same zone. * zone-lock must be held before call this. * - * Returns 1 if all pages in the range are isolated. + * Returns true if all pages in the range are isolated. 
*/ -static int -__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn) +static bool +__test_page_isolated_in_pageblock(unsigned long start_pfn, unsigned long end_pfn) { + unsigned long pfn, next_pfn; struct page *page; - while (pfn end_pfn) { - if (!pfn_valid_within(pfn)) { - pfn++; - continue; - } - page = pfn_to_page(pfn); - if (PageBuddy(page)) - pfn += 1 page_order(page); - else if (page_count(page) == 0 - page_private(page) == MIGRATE_ISOLATE) - pfn += 1; - else - break; + list_for_each_entry(page, isolated_pages, lru) { + if (page-lru == isolated_pages) + return false; what's the mean of this line? I just copied it from Michal's code but It seem to be not needed. I will remove it in next spin. + pfn = page_to_pfn(page); + if (pfn = end_pfn) + return false; + if (pfn = start_pfn) + goto found; + } + return false; + + list_for_each_entry_continue(page, isolated_pages, lru) { + if (page_to_pfn(page) != next_pfn) + return false; where is next_pfn init-ed? by goto found +found: + pfn = page_to_pfn(page); + next_pfn = pfn + (1UL page_order(page)); + if (next_pfn = end_pfn) + return true; } - if (pfn end_pfn) - return 0; - return 1; + return false; } int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn) @@ -211,7 +323,7 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn) unsigned long pfn, flags; struct page *page; struct zone *zone; - int ret; + bool ret; /* * Note: pageblock_nr_page != MAX_ORDER. Then, chunks of free page diff --git a/mm/vmstat.c b/mm/vmstat.c index df7a674..bb59ff7 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -616,7 +616,6 @@ static char * const migratetype_names[MIGRATE_TYPES] = { #ifdef CONFIG_CMA CMA, #endif - Isolate, }; static void *frag_start(struct seq_file *m, loff_t *pos) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
-- Kind regards, Minchan Kim
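Since the review question above is about where `next_pfn` gets its first value, here is a simplified userspace model of the coverage test with that initialisation made explicit. It is a sketch, not the patch: it walks a sorted array instead of the `isolated_pages` list, `struct chunk` and `range_isolated()` are hypothetical names, and unlike the loop in the patch it requires the range to be covered starting exactly at `start_pfn`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one isolated free page: first pfn and buddy order. */
struct chunk {
	unsigned long pfn;
	unsigned int order;
};

/*
 * Return true if the chunks (sorted by ascending pfn, non-overlapping)
 * cover every pfn in [start_pfn, end_pfn) without a gap.
 */
static bool range_isolated(const struct chunk *c, size_t n,
			   unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long next_pfn = start_pfn;	/* first pfn not yet covered */
	size_t i;

	for (i = 0; i < n && next_pfn < end_pfn; i++) {
		unsigned long chunk_end = c[i].pfn + (1UL << c[i].order);

		if (chunk_end <= next_pfn)
			continue;	/* chunk lies entirely before the range */
		if (c[i].pfn > next_pfn)
			return false;	/* gap: next_pfn is not isolated */
		next_pfn = chunk_end;
	}
	return next_pfn >= end_pfn;
}
```

Initialising `next_pfn` up front removes the need to jump into the middle of the second loop, at the cost of one extra comparison per chunk that lies below the range.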
Re: [PATCH v2] pwm i.MX: add devicetree support
On Wed, Sep 05, 2012 at 03:35:19PM +0200, Sascha Hauer wrote:
> Changes since v1:
>
> - Add devicetree binding documentation
> - Merge 5/9 and 9/9
> - fix #pwm-cells (must be 2 instead of 3)
> - fix wrong name in MODULE_DEVICE_TABLE
> - drop platform based probing while introducing devicetree based probe
>
> Philipp Zabel (2):
>       pwm i.MX: add devicetree support
>       pwm i.MX: fix clock lookup
>
> Sascha Hauer (6):
>       pwm i.MX: factor out SoC specific functions
>       pwm i.MX: remove unnecessary if in pwm_[en|dis]able
>       pwm i.MX: add functions to enable/disable pwm.
>       pwm i.MX: Use module_platform_driver
>       pwm i.MX: use per clock unconditionally
>       ARM i.MX53: Add pwm support

For the series,

Reviewed-by: Shawn Guo shawn@linaro.org

>  Documentation/devicetree/bindings/pwm/imx-pwm.txt | 17 ++
>  arch/arm/boot/dts/imx53.dtsi | 14 ++
>  arch/arm/mach-imx/clk-imx51-imx53.c |4 +
>  drivers/pwm/pwm-imx.c | 275 ++---
>  4 files changed, 214 insertions(+), 96 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/pwm/imx-pwm.txt
Re: linux-next: build failure after merge of the final tree (powerpc tree related)
On Thu, Sep 06, 2012 at 05:11:53PM +1000, Stephen Rothwell wrote:
> Hi all,
>
> After merging the final tree, today's linux-next build (powerpc
> allyesconfig) failed like this:
>
> In file included from drivers/atm/fore200e.c:70:0:
> drivers/atm/fore200e.h:263:3: error: redefinition of typedef 'opcode_t' with different type
> arch/powerpc/include/asm/probes.h:25:13: note: previous declaration of 'opcode_t' was here
>
> Caused by commit 7118e7e648e0 (powerpc: Consolidate {k,u}probe
> definitions) from the powerpc tree.
>
> I have reverted that commit (and the two following:
> 41ab5266c362 powerpc: Add trap_nr to thread_struct
> 8b7b80b9ebb4 powerpc: Uprobes port to powerpc
> ) for today.

Hi Stephen,

I have just posted a patch [1] to fix the issue.

Ananth

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2012-September/100813.html
Re: snd-usb: delay: estimated 0, actual 352
At Thu, 06 Sep 2012 09:35:26 +0200, Takashi Iwai wrote: At Thu, 6 Sep 2012 09:17:57 +0200, Markus Trippelsdorf wrote: On 2012.09.06 at 09:08 +0200, Daniel Mack wrote: On 06.09.2012 08:53, Markus Trippelsdorf wrote: On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote: At Thu, 06 Sep 2012 08:33:30 +0200, Daniel Mack wrote: On 06.09.2012 08:02, Markus Trippelsdorf wrote: On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote: Sound fixes for 3.6-rc5 There are nothing scaring, contains only small fixes for HD-audio and USB-audio: - EPSS regression fix and GPIO fix for HD-audio IDT codecs - A series of USB-audio regression fixes that are found since 3.5 kernel Daniel Mack (4): ALSA: snd-usb: Fix URB cancellation at stream start ALSA: snd-usb: restore delay information The commit fbcfbf5f above causes the following lines to be printed whenever I start a new song: Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which this patch (fbcfbf5f) brings back now. delay: estimated 0, actual 352 delay: estimated 353, actual 705 (44.1 * 8 = 352.8) This happens with an USB-DAC that identifies itself as C-Media USB Headphone Set. And you didn't you see these lines with 3.4? Maybe the difference of start condition? Markus, does the patch below fix anything? Unfortunately no. However reverting the following fixes the problem: commit 245baf983cc39524cce39c24d01b276e6e653c9e Author: Daniel Mack zon...@gmail.com Date: Thu Aug 30 18:52:30 2012 +0200 ALSA: snd-usb: fix calls to next_packet_size No, this one certainly fixes a problem and does the right thing by restoring the original code. If you wouldn't state that you didn't see the same effect with 3.4(!), before the refactoring done in 3.5, I would believe the device is simply slightly off in its feedback rate and the tighter delay code complains about it while compensating, just as it did before. Are there any more than these two lines? And is audio working at all? Is it distorted in any way? 
There are only these two lines (printed whenever sound starts). Audio is working just fine with no distortions. I did see similar lines before when the system load was very high (happend during make check when building glibc). Here is what Pierre-Louis wrote in November 2011: »This was supposed to be an informational message, I thought it was only enabled for debug. Regular users don't really need to know.« I guess the problem is that the new endpoint scheme doesn't count the last_delay update unless the stream is triggered. In the old code, retire_playback_urb is always called even before the trigger(START) is set. And, there retire_playback_urb() does nothing but updating the delay information. In the new code, retire_playback_urb is set only at snd_usb_substream_playback_trigger(). Thus at the very first shot, the delay account got confused. In short, a patch like below may fix the issue (note: completely untested!) Takashi --- diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index fd5e982..928a4f7 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -528,6 +528,9 @@ static int snd_usb_hw_free(struct snd_pcm_substream *substream) return snd_pcm_lib_free_vmalloc_buffer(substream); } +static void retire_playback_urb(struct snd_usb_substream *subs, + struct urb *urb); + /* * prepare callback * @@ -561,8 +564,10 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream) /* for playback, submit the URBs now; otherwise, the first hwptr_done * updates for all URBs would happen at the same time when starting */ - if (subs-direction == SNDRV_PCM_STREAM_PLAYBACK) + if (subs-direction == SNDRV_PCM_STREAM_PLAYBACK) { + subs-data_endpoint-retire_data_urb = retire_playback_urb; return start_endpoints(subs, 1); + } return 0; } @@ -1190,7 +1195,6 @@ static int snd_usb_substream_playback_trigger(struct snd_pcm_substream *substrea case SNDRV_PCM_TRIGGER_START: case SNDRV_PCM_TRIGGER_PAUSE_RELEASE: subs-data_endpoint-prepare_data_urb = prepare_playback_urb; - 
subs-data_endpoint-retire_data_urb = retire_playback_urb; subs-running = 1; return 0; case SNDRV_PCM_TRIGGER_STOP: @@ -1199,7 +1203,6 @@ static int snd_usb_substream_playback_trigger(struct snd_pcm_substream *substrea return 0; case SNDRV_PCM_TRIGGER_PAUSE_PUSH: subs-data_endpoint-prepare_data_urb
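As a side note on the numbers in the log lines quoted above: one reading of the parenthetical "44.1 * 8 = 352.8" is that 8 means eight milliseconds of audio at 44.1 kHz. That interpretation is an assumption, and `frames_in_ms()` below is a hypothetical helper, not driver code. Under it, the two logged "actual" values are exactly one and two such periods, rounded down to whole frames:

```c
/*
 * Whole audio frames played in `ms` milliseconds at `rate_hz`
 * (integer division, so the result is rounded down).
 */
static unsigned long frames_in_ms(unsigned long rate_hz, unsigned long ms)
{
	return rate_hz * ms / 1000;
}

/*
 * frames_in_ms(44100, 8)  is 352, matching "estimated 0, actual 352";
 * frames_in_ms(44100, 16) is 705, matching "estimated 353, actual 705".
 */
```

That is consistent with the explanation above that the delay accounting misses one period's worth of frames at stream start, though the arithmetic alone does not prove it.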
Re: [PATCH 2/2] mm: support MIGRATE_DISCARD
On Thu, Sep 06, 2012 at 02:31:12PM +0900, Minchan Kim wrote: Hi Mel, On Wed, Sep 05, 2012 at 11:56:11AM +0100, Mel Gorman wrote: On Wed, Sep 05, 2012 at 05:11:13PM +0900, Minchan Kim wrote: This patch introudes MIGRATE_DISCARD mode in migration. It drops *clean cache pages* instead of migration so that migration latency could be reduced by avoiding (memcpy + page remapping). It's useful for CMA because latency of migration is very important rather than eviction of background processes's workingset. In addition, it needs less free pages for migration targets so it could avoid memory reclaiming to get free pages, which is another factor increase latency. Bah, this was released while I was reviewing the older version. I did not read this one as closely but I see the enum problems have gone away at least. I'd still prefer if CMA had an additional helper to discard some pages with shrink_page_list() and migrate the remaining pages with migrate_pages(). That would remove the need to add a MIGRATE_DISCARD migrate mode at all. I am not convinced with your point. What's the benefit on separating reclaim and migration? For just removing MIGRATE_DISCARD mode? Maintainability. There are reclaim functions and there are migration functions. Your patch takes migrate_pages() and makes it partially a reclaim function mixing up the responsibilities of migrate.c and vmscan.c. I don't think it's not bad because my implementation is very simple(maybe it's much simpler than separating reclaim and migration) and could be used by others like memory-hotplug in future. They could also have used the helper function from CMA that takes a list of pages, reclaims some and migrates other. If you're not strong against with me, I would like to insist on my implementation. I'm not very strongly against it but I'm also very unhappy. 
-- Mel Gorman SUSE Labs
Re: [PATCH] drm/exynos: fix double call of drm_prime_(init/destroy)_file_private
Hi, 2012/9/6 Paul Menzel paulepan...@users.sourceforge.net: Dear Inki Dae, Am Donnerstag, den 06.09.2012, 11:35 +0900 schrieb InKi Dae: 2012/9/6 Mandeep Singh Baines m...@chromium.org: The double invocations are incorrect but seem to be safe so I don't think this will fix any bugs. Before: [7.639366] drm_prime_init_file ee3675d0 [7.639377] drm_prime_init_file ee3675d0 [7.639507] drm_prime_destroy_file ee3675d0 [7.639518] drm_prime_destroy_file ee3675d0 [7.639802] drm_prime_init_file ee372390 [7.639810] drm_prime_init_file ee372390 [8.473316] drm_prime_init_file ee356390 [8.473331] drm_prime_init_file ee356390 After: [6.363842] drm_prime_init_file edc2e5d0 [6.363994] drm_prime_destroy_file edc2e5d0 [6.364260] drm_prime_init_file edc2e750 [8.004837] drm_prime_init_file ee36ded0 Signed-off-by: Mandeep Singh Baines m...@chromium.org CC: Stéphane Marchesin marc...@chromium.org CC: Pawel Osciak posc...@google.com CC: Inki Dae inki@samsung.com CC: Joonyoung Shim jy0922.s...@samsung.com CC: Seung-Woo Kim sw0312@samsung.com CC: Kyungmin Park kyungmin.p...@samsung.com CC: David Airlie airl...@linux.ie CC: dri-de...@lists.freedesktop.org remove all CCs I guess they were generated by some script. So they should be fine, no? Mandeep, if you put CC in here those people should be CCed in real. `git send-email` should take care of that but I do not see everyone in the CC field. Or does `git send-email` use blind carbon copy (BCC) field? and can you send it again using text mode? At least to the list it was send in plain text mode. your patch is messed up when I try to get patch file. Everything is fine on my side. Especially since Mandeep used `git send-email` which should do everything correctly. your patch was encoded with 'Content-Transfer-Encoding: base64' so please use 7bit ascii like 'Content-Transfer-Encoding: 7bit' Thanks. Inki Dae In your From address your name is written InKi with capital K. Which one is correct? Inki is correct :) Thanks. 
Inki Dae Thanks, Paul ___ dri-devel mailing list dri-de...@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 1/6] unicore32: pwm: Properly remap memory-mapped registers
Instead of writing to the timer controller registers by dereferencing a pointer to the memory location, properly remap the memory region with a call to ioremap_nocache() and access the registers using writel().

Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
---
 arch/unicore32/kernel/pwm.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/unicore32/kernel/pwm.c b/arch/unicore32/kernel/pwm.c
index 4615d51..410b786 100644
--- a/arch/unicore32/kernel/pwm.c
+++ b/arch/unicore32/kernel/pwm.c
@@ -23,10 +23,16 @@
 #include <asm/div64.h>
 #include <mach/hardware.h>
+#define PWCR 0x00
+#define DCCR 0x04
+#define PCR 0x08

I think the old register names could be used here with some small modifications. Please see arch/unicore32/include/mach/regs-ost.h. We can avoid ioremap and use writel/readl directly on these registers.

Guan

+
 struct pwm_device {
 	struct list_head	node;
 	struct platform_device *pdev;
+	void __iomem		*base;
+
 	const char *label;
 	struct clk *clk;
 	int clk_enabled;
@@ -69,9 +75,11 @@ int pwm_config(struct pwm_device *pwm, int duty_ns, int period_ns)
 	 * before writing to the registers
 	 */
 	clk_enable(pwm->clk);
-	OST_PWMPWCR = prescale;
-	OST_PWMDCCR = pv - dc;
-	OST_PWMPCR = pv;
+
+	writel(prescale, pwm->base + PWCR);
+	writel(pv - dc, pwm->base + DCCR);
+	writel(pv, pwm->base + PCR);
+
 	clk_disable(pwm->clk);
 	return 0;
@@ -190,10 +198,19 @@ static struct pwm_device *pwm_probe(struct platform_device *pdev,
 		goto err_free_clk;
 	}
+	pwm->base = ioremap_nocache(r->start, resource_size(r));
+	if (pwm->base == NULL) {
+		dev_err(&pdev->dev, "failed to remap memory resource\n");
+		ret = -EADDRNOTAVAIL;
+		goto err_release_mem;
+	}
+
 	__add_pwm(pwm);
 	platform_set_drvdata(pdev, pwm);
 	return pwm;
+err_release_mem:
+	release_mem_region(r->start, resource_size(r));
 err_free_clk:
 	clk_put(pwm->clk);
 err_free:
@@ -224,6 +241,8 @@ static int __devexit pwm_remove(struct platform_device *pdev)
 	list_del(&pwm->node);
 	mutex_unlock(&pwm_lock);
+	iounmap(pwm->base);
+
 	r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	release_mem_region(r->start, resource_size(r));
--
1.7.12
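The base-plus-offset register access that the patch introduces can be tried out in userspace with a mock. This is only a sketch under stated assumptions: `mock_writel()` is a stand-in for the kernel's `writel()`, `struct pwm_regs_demo` and `pwm_program()` are made-up names, and the "registers" are an ordinary array rather than remapped I/O memory. The offsets `PWCR`, `DCCR` and `PCR` are the ones defined in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets from the patch. */
#define PWCR 0x00
#define DCCR 0x04
#define PCR  0x08

/* Userspace stand-in for writel(value, addr); not the kernel function. */
static void mock_writel(uint32_t value, volatile void *addr)
{
    *(volatile uint32_t *)addr = value;
}

/* Hypothetical device state: only the mapped base address. */
struct pwm_regs_demo {
    volatile uint8_t *base;
};

/*
 * Mirrors the three writes in pwm_config(): prescaler, duty cycle
 * (period value minus duty count), then the period value itself.
 */
static void pwm_program(struct pwm_regs_demo *pwm, uint32_t prescale,
                        uint32_t pv, uint32_t dc)
{
    mock_writel(prescale, pwm->base + PWCR);
    mock_writel(pv - dc, pwm->base + DCCR);
    mock_writel(pv, pwm->base + PCR);
}
```

Because every access goes through `base`, the same code works for any instance of the block, which is the property the fixed-address `OST_PWM*` macros lack; Guan's suggestion keeps the old names but still routes the access through `writel`/`readl`.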
Re: [RFC v9 PATCH 20/21] memory-hotplug: clear hwpoisoned flag when onlining pages
At 09/06/2012 03:27 PM, andywu106建国 Wrote: 2012/9/5 we...@cn.fujitsu.com From: Wen Congyang we...@cn.fujitsu.com

The hwpoisoned flag may be set when we offline a page through the sysfs interface /sys/devices/system/memory/soft_offline_page or /sys/devices/system/memory/hard_offline_page. If we don't clear this flag when onlining pages, the page can't be freed and will not be in the free list, so we can't offline these pages again. We should therefore clear this flag when onlining pages.

CC: David Rientjes rient...@google.com
CC: Jiang Liu liu...@gmail.com
CC: Len Brown len.br...@intel.com
CC: Benjamin Herrenschmidt b...@kernel.crashing.org
CC: Paul Mackerras pau...@samba.org
CC: Christoph Lameter c...@linux.com
Cc: Minchan Kim minchan@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
CC: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
---
 mm/memory_hotplug.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 270c249..140c080 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -661,6 +661,11 @@ EXPORT_SYMBOL_GPL(__online_page_increment_counters);
 void __online_page_free(struct page *page)
 {
+#ifdef CONFIG_MEMORY_FAILURE
+	/* The page may be marked HWPoisoned by soft/hard offline page */
+	ClearPageHWPoison(page);

Hi Congyang, I think you should decrease the mce_bad_pages counter here: atomic_long_sub(1, &mce_bad_pages);

Yes, thanks for pointing it out.

Thanks
Wen Congyang

+#endif
+
 	ClearPageReserved(page);
 	init_page_count(page);
 	__free_page(page);
--
1.7.1

-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Re: [PATCH 4/6] unicore32: Add common clock support
This commit adds support for the common clock framework to the Unicore32 architecture. Signed-off-by: Thierry Reding thierry.red...@avionic-design.de This patch can't work. Could you disintegrate it into several small patches, so I could check it out. Thanks, Guan Xuetao --- arch/unicore32/Kconfig | 1 + arch/unicore32/include/asm/clkdev.h | 26 ++ arch/unicore32/kernel/clock.c | 560 3 files changed, 339 insertions(+), 248 deletions(-) create mode 100644 arch/unicore32/include/asm/clkdev.h diff --git a/arch/unicore32/Kconfig b/arch/unicore32/Kconfig index b0a4743..46b3a15 100644 --- a/arch/unicore32/Kconfig +++ b/arch/unicore32/Kconfig @@ -14,6 +14,7 @@ config UNICORE32 select GENERIC_IRQ_SHOW select ARCH_WANT_FRAME_POINTERS select GENERIC_IOMAP + select COMMON_CLK help UniCore-32 is 32-bit Instruction Set Architecture, including a series of low-power-consumption RISC chip diff --git a/arch/unicore32/include/asm/clkdev.h b/arch/unicore32/include/asm/clkdev.h new file mode 100644 index 000..201645d --- /dev/null +++ b/arch/unicore32/include/asm/clkdev.h @@ -0,0 +1,26 @@ +/* + * based on arch/arm/include/asm/clkdev.h + * + * Copyright (C) 2008 Russell King. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * Helper for the clk API to assist looking up a struct clk. 
+ */ + +#ifndef __ASM_CLKDEV_H +#define __ASM_CLKDEV_H + +#include linux/slab.h + +#define __clk_get(clk) ({ 1; }) +#define __clk_put(clk) do { } while (0) + +static inline struct clk_lookup_alloc *__clkdev_alloc(size_t size) +{ + return kzalloc(size, GFP_KERNEL); +} + +#endif diff --git a/arch/unicore32/kernel/clock.c b/arch/unicore32/kernel/clock.c index 18d4563..197f885 100644 --- a/arch/unicore32/kernel/clock.c +++ b/arch/unicore32/kernel/clock.c @@ -17,223 +17,50 @@ #include linux/errno.h #include linux/err.h #include linux/string.h -#include linux/clk.h +#include linux/clk-provider.h #include linux/mutex.h #include linux/delay.h #include linux/io.h +#include linux/slab.h #include mach/hardware.h -/* - * Very simple clock implementation - */ -struct clk { - struct list_headnode; - unsigned long rate; - const char *name; -}; - -static struct clk clk_ost_clk = { - .name = OST_CLK, - .rate = CLOCK_TICK_RATE, -}; - -static struct clk clk_mclk_clk = { - .name = MAIN_CLK, -}; - -static struct clk clk_bclk32_clk = { - .name = BUS32_CLK, +struct clk_uc { + struct clk_hw hw; }; -static struct clk clk_ddr_clk = { - .name = DDR_CLK, -}; - -static struct clk clk_vga_clk = { - .name = VGA_CLK, -}; - -static LIST_HEAD(clocks); -static DEFINE_MUTEX(clocks_mutex); - -struct clk *clk_get(struct device *dev, const char *id) -{ - struct clk *p, *clk = ERR_PTR(-ENOENT); - - mutex_lock(clocks_mutex); - list_for_each_entry(p, clocks, node) { - if (strcmp(id, p-name) == 0) { - clk = p; - break; - } - } - mutex_unlock(clocks_mutex); - - return clk; -} -EXPORT_SYMBOL(clk_get); - -void clk_put(struct clk *clk) -{ -} -EXPORT_SYMBOL(clk_put); - -int clk_enable(struct clk *clk) -{ - return 0; -} -EXPORT_SYMBOL(clk_enable); - -void clk_disable(struct clk *clk) +static inline struct clk_uc *to_clk_uc(struct clk_hw *hw) { + return container_of(hw, struct clk_uc, hw); } -EXPORT_SYMBOL(clk_disable); - -unsigned long clk_get_rate(struct clk *clk) -{ - return clk-rate; -} 
-EXPORT_SYMBOL(clk_get_rate); - -struct { - unsigned long rate; - unsigned long cfg; - unsigned long div; -} vga_clk_table[] = { - {.rate = 25175000, .cfg = 0x2001, .div = 0x9}, - {.rate = 3150, .cfg = 0x2001, .div = 0x7}, - {.rate = 4000, .cfg = 0x3801, .div = 0x9}, - {.rate = 4950, .cfg = 0x3801, .div = 0x7}, - {.rate = 6500, .cfg = 0x2c01, .div = 0x4}, - {.rate = 7875, .cfg = 0x2400, .div = 0x7}, - {.rate = 10800, .cfg = 0x2c01, .div = 0x2}, - {.rate = 10650, .cfg = 0x3c01, .div = 0x3}, - {.rate = 5065, .cfg = 0x00106400, .div = 0x9}, - {.rate = 6150, .cfg = 0x00106400, .div = 0xa}, - {.rate = 8550, .cfg = 0x2800, .div = 0x6}, -}; - -struct { - unsigned long mrate; - unsigned long prate; -} mclk_clk_table[] = { - {.mrate = 5, .prate = 0x00109801}, - {.mrate = 52500, .prate = 0x00104C00}, - {.mrate = 55000, .prate = 0x00105000}, - {.mrate =
Re: [PATCH v2 20/31] arm64: User access library function
On Wed, Sep 05, 2012 at 10:05:34PM +0100, Russell King - ARM Linux wrote: On Wed, Sep 05, 2012 at 10:01:37PM +0100, Catalin Marinas wrote: There are indeed a few KB gain in code size but that's probably coming from the exception table since otherwise you just replace a bl with ldrt. It depends on what the compiler does as well, the arm code has some carefully chosen registers when calling the __get_user_x function. It's more than that - it's not just the ldr but also a zeroing of a temporary register to hold the error code should the instruction fault. So it's not only the exception tables but also an increase in the main path - and that's where you benefit from having it out of line and thereby a hotter i-cache. On 32-bit we have __get_user() inline and get_user() out of line. What was the history behind this? If you do the access_ok inline and the __get_user_x separately, the size increase is even greater (at least in the arm64 case it can get to over 20KB). I think x86 does the access_ok check out of line. Please talk to Will about get_user() and put_user(). Afterwards you will definitely want to keep them out of line on 64-bit ARM. As I said, I already made the change to always inline get_user/put_user with some penalty in the Image size but it makes the code cleaner. I'm not entirely convinced of the performance gain/loss especially on ARMv8 cores with physically tagged caches. There is room for optimisation when I get real silicon. -- Catalin
[PATCH v2] drivers/media/platform/s5p-tv/sdo_drv.c: fix error return code
From: Peter Senna Tschudin peter.se...@gmail.com

Convert a nonnegative error return code to a negative one, as returned elsewhere in the function.

A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}
// </smpl>

Signed-off-by: Peter Senna Tschudin peter.se...@gmail.com
---
 drivers/media/platform/s5p-tv/sdo_drv.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/media/platform/s5p-tv/sdo_drv.c b/drivers/media/platform/s5p-tv/sdo_drv.c
index ad68bbe..58cf56d 100644
--- a/drivers/media/platform/s5p-tv/sdo_drv.c
+++ b/drivers/media/platform/s5p-tv/sdo_drv.c
@@ -369,6 +369,7 @@ static int __devinit sdo_probe(struct platform_device *pdev)
 	sdev->fout_vpll = clk_get(dev, "fout_vpll");
 	if (IS_ERR_OR_NULL(sdev->fout_vpll)) {
 		dev_err(dev, "failed to get clock 'fout_vpll'\n");
+		ret = -ENXIO;
 		goto fail_dacphy;
 	}
 	dev_info(dev, "fout_vpll.rate = %lu\n", clk_get_rate(sclk_vpll));
@@ -377,11 +378,13 @@ static int __devinit sdo_probe(struct platform_device *pdev)
 	sdev->vdac = devm_regulator_get(dev, "vdd33a_dac");
 	if (IS_ERR_OR_NULL(sdev->vdac)) {
 		dev_err(dev, "failed to get regulator 'vdac'\n");
+		ret = -ENXIO;
 		goto fail_fout_vpll;
 	}
 	sdev->vdet = devm_regulator_get(dev, "vdet");
 	if (IS_ERR_OR_NULL(sdev->vdet)) {
 		dev_err(dev, "failed to get regulator 'vdet'\n");
+		ret = -ENXIO;
 		goto fail_fout_vpll;
 	}
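The class of bug this patch addresses can be reproduced outside the kernel. The sketch below is an illustration only: `get_resource()`, `probe_buggy()` and `probe_fixed()` are hypothetical names, and a plain pointer check stands in for `IS_ERR_OR_NULL()`. It shows how an error path that jumps to the cleanup label without assigning `ret` ends up reporting success.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a lookup such as clk_get(); NULL means failure. */
static void *get_resource(int available)
{
    static int token;
    return available ? &token : NULL;
}

/* Error path forgets to set ret, so the stale 0 is returned. */
static int probe_buggy(int clk_ok)
{
    int ret = 0;                    /* earlier steps succeeded */
    void *clk = get_resource(clk_ok);

    if (!clk)
        goto fail;                  /* ret is still 0 here */
    return 0;
fail:
    return ret;
}

/* Same flow with the fix from the patch: assign a negative errno first. */
static int probe_fixed(int clk_ok)
{
    int ret = 0;
    void *clk = get_resource(clk_ok);

    if (!clk) {
        ret = -ENXIO;
        goto fail;
    }
    return 0;
fail:
    return ret;
}
```

A caller that checks `probe_buggy(0) < 0` never notices the failure, which is exactly what the semantic match above is written to flag.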
Re: [PATCH] virtio-blk: Fix kconfig option
On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote: On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote: Kent Overstreet koverstr...@google.com writes: CONFIG_VIRTIO isn't exposed, everything else is supposed to select it instead. This is a slight misunderstanding. It's supposed to be selected by the particular driver, probably virtio_pci in your case. So are you saying virtio-blk depends on virtio-pci? If so, the kconfig should have that. As is, VIRTIO_BLK just has: depends on EXPERIMENTAL && VIRTIO which is flat out broken. I don't think anything is broken. Can you show an example of a broken configuration? -- MST
Re: [PATCH] OMAP GPIO - don't wake from suspend unless requested.
On Thu, Sep 6, 2012 at 1:21 PM, NeilBrown ne...@suse.de wrote: On Thu, 6 Sep 2012 12:57:46 +0530 Shilimkar, Santosh santosh.shilim...@ti.com wrote: On Thu, Sep 6, 2012 at 12:32 PM, NeilBrown ne...@suse.de wrote: On Thu, 6 Sep 2012 11:18:09 +0530 Shilimkar, Santosh santosh.shilim...@ti.com wrote: On Thu, Sep 6, 2012 at 8:35 AM, NeilBrown ne...@suse.de wrote: On Mon, 3 Sep 2012 22:59:06 -0700 Shilimkar, Santosh santosh.shilim...@ti.com wrote: After thinking a bit more on this, the problem seems to be coming mainly because the gpio device is runtime suspended a bit earlier than it should be. A similar issue is seen with the i2c driver as well. The i2c issue was discussed with Rafael at LPC last week. The idea is to move the pm_runtime_enable/disable() calls entirely up to the _late/_early stage of device suspend/resume. Will update this thread once I have a further update. This won't be late enough. IRQCHIP_MASK_ON_SUSPEND takes effect after all the _late callbacks have been called. I, too, spoke to Rafael about this in San Diego. He seemed to agree with me that the interrupt needs to be masked in the ->suspend callback. Any later is too late. Thanks for the information about your discussion. Will wait for the patch then. Regards santosh I already sent a patch - that is what started this thread :-) I include it below. You said "The patch doesn't seems to be correct" but didn't expand on why. Do you still think it is not correct? I wouldn't be surprised if there is some case that it doesn't handle quite right, but it seems right to me. Sorry, I thought you were talking about moving the "Checking wakeup interrupts" bit early, as discussed in the follow-up to the alternate suggested patch to make use of IRQCHIP_MASK_ON_SUSPEND. I think we need to fix the issue seen with the 'IRQCHIP_MASK_ON_SUSPEND' patch. That is at least the expected way to manage suspend and wakeup irq masks for an IRQCHIP. That is what I thought at first too.
But when talking to Rafael he said that IRQCHIP_MASK_ON_SUSPEND was intended mainly for clock interrupts. For other less fundamental interrupts, doing the mask/unmask in suspend/resume callbacks is sufficient and simpler... and actually works. That is not how I understand the use of IRQCHIP_MASK_ON_SUSPEND. I thought it could be used for any IRQ chip for masking or setting wakeup on interrupt lines managed by that chip for suspend. Maybe I am wrong. IRQCHIP_MASK_ON_SUSPEND is currently used by precisely two drivers: arch/arm/mach-omap2/omap-wakeupgen.c and drivers/mfd/pm8xxx-irq.c which suggests that it isn't widely used and quite possibly doesn't actually work in general. I have seen a lot more platforms use it in downstream kernels. I have also recently seen patches on the linux-arm list for a couple of platforms. Maybe we need to start a new thread and pester Rafael or Thomas Gleixner to either explain what is intended for this case, or to fix IRQCHIP_MASK_ON_SUSPEND so that it can be used in general. Sounds like a good idea. Since you already had the discussion with Rafael, you can probably describe the issue better. Regards Santosh
Re: [PATCH v2 2/2] virtio-ring: Allocate indirect buffers from cache when possible
On Thu, Sep 06, 2012 at 05:27:23PM +0930, Rusty Russell wrote: Michael S. Tsirkin m...@redhat.com writes: Yes without checksum net core always linearizes packets, so yes it is screwed. For -net, skb always allocates space for 17 frags + linear part so it seems sane to do same in virtio core, and allocate, for -net, up to max_frags + 1 from cache. We can adjust it: no _SG -> 2, otherwise 18. But I thought it used individual buffers these days? Yes for receive, no for transmit. That's probably why we should have the threshold per vq, not per device, BTW. -- MST
Re: [PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works
On Thu, Sep 06, 2012 at 09:46:50AM +0200, Paolo Bonzini wrote:

VIRTIO_BALLOON_F_MUST_TELL_HOST cannot be used properly because it is a negative feature: it tells you that silent deflate is not supported. Right now, QEMU refuses migration if the target does not support all the features that were negotiated. But then:
- a migration from non-MUST_TELL_HOST to MUST_TELL_HOST will succeed, which is wrong;
- a migration from MUST_TELL_HOST to non-MUST_TELL_HOST will fail, which is useless.
Add instead a new feature VIRTIO_BALLOON_F_SILENT_DEFLATE, and deprecate VIRTIO_BALLOON_F_MUST_TELL_HOST since it is never actually used.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com

Frankly I think it's a qemu migration bug. I don't see why we need to tweak spec: just teach qemu to be smarter during migration. Can you show a scenario with old driver/new hypervisor or new driver/old hypervisor that fails?

---
 virtio-spec.lyx | 36 +---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index 7a073f4..1a25a18 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -6238,6 +6238,8 @@ bits
 \begin_deeper
 \begin_layout Description
+
+\change_deleted 1531152142 1346917221
 VIRTIO_BALLOON_F_MUST_TELL_HOST
 \begin_inset space ~
 \end_inset
@@ -6251,6 +6253,20 @@ VIRTIO_BALLOON_F_STATS_VQ
 \end_inset
 (1) A virtqueue for reporting guest memory statistics is present.
+\change_inserted 1531152142 1346917193
+
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1531152142 1346917219
+VIRTIO_BALLOON_F_SILENT_DEFLATE
+\begin_inset space ~
+\end_inset
+
+(2) Host does not need to be told before pages from the balloon are used.
+\change_unchanged
+
 \end_layout
 \end_deeper
@@ -6401,9 +6417,23 @@ The driver constructs an array of addresses of memory pages it has previously
 \end_layout
 \begin_layout Enumerate
-If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is set, the guest may not
- use these requested pages until that descriptor in the deflateq has been
- used by the device.
+If the VIRTIO_BALLOON_F_
+\change_deleted 1531152142 1346917234
+MUST_TELL_HOST
+\change_inserted 1531152142 1346917237
+SILENT_DEFLATE
+\change_unchanged
+ feature is
+\change_inserted 1531152142 1346917241
+not
+\change_unchanged
+set, the guest may not use these requested pages until that descriptor in
+ the deflateq has been used by the device.
+
+\change_inserted 1531152142 1346917253
+ If it is set, the guest may choose to not use the deflateq at all.
+\change_unchanged
+
 \end_layout
 \begin_layout Enumerate
--
1.7.11.2
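Why a negative feature bit interacts badly with the "target must offer every negotiated feature" rule can be checked with a few lines of C. This is a sketch, not QEMU code: `migration_allowed()` is a hypothetical model of the rule described in the message, bits 0 and 1 are the existing balloon feature bits, and bit 2 for `VIRTIO_BALLOON_F_SILENT_DEFLATE` is the value proposed in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_BALLOON_F_MUST_TELL_HOST (1u << 0)
#define VIRTIO_BALLOON_F_STATS_VQ       (1u << 1)
#define VIRTIO_BALLOON_F_SILENT_DEFLATE (1u << 2)  /* proposed in this patch */

/*
 * Hypothetical model of the migration rule quoted above: refuse unless
 * every feature that was negotiated on the source is offered by the target.
 */
static bool migration_allowed(uint32_t negotiated, uint32_t target_features)
{
    return (negotiated & ~target_features) == 0;
}
```

With the negative bit, a source that negotiated nothing (the guest may deflate silently) is allowed onto a target that requires being told, while a source that negotiated `MUST_TELL_HOST` is refused by a more permissive target. With the positive `SILENT_DEFLATE` bit, the same rule refuses exactly the unsafe direction.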
Re: [PATCH v2] powerpc: fix personality handling in ppc64_personality()
On Thu, 6 Sep 2012, Benjamin Herrenschmidt wrote: actually commit 7256a5d2da56 seems to contain the correct PER_LINUX handling, so seems like you picked the right one :) Odd, they looked different around the use of PER_MASK when I looked but The original patch had personality = ~PER_LINUX | PER_LINUX32; Which is bogus, exactly because ~PER_LINUX is -1. I then used personality = (personality & ~PER_MASK) | PER_LINUX32; which is correct and perhaps a little bit more descriptive, and that is what you have merged, so all is fine. I was tired & jet lagged, so I might have just had a brain fail... Probably just missed that the first patch used PER_LINUX and the second one PER_MASK, or whatever. Anyway, thanks. -- Jiri Kosina SUSE Labs
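The arithmetic behind "~PER_LINUX is -1" is easy to verify in userspace. The constants below match include/linux/personality.h (`PER_LINUX` is 0, `PER_LINUX32` is 0x0008, `PER_MASK` is 0x00ff, `ADDR_NO_RANDOMIZE` is 0x0040000); the function names `bogus_mask()` and `to_linux32()` are made up for this sketch, and the first expression is taken as it appears in the archived message.

```c
#include <assert.h>

/* Values as defined in include/linux/personality.h. */
enum {
    PER_LINUX   = 0x0000,
    PER_LINUX32 = 0x0008,
    PER_MASK    = 0x00ff
};
#define ADDR_NO_RANDOMIZE 0x0040000UL   /* a flag bit outside PER_MASK */

/* Right-hand side quoted from the original patch: every bit is set. */
static long bogus_mask(void)
{
    return ~PER_LINUX | PER_LINUX32;    /* ~0 is -1, so the OR changes nothing */
}

/* The merged form: clear the persona byte, keep the flags, set LINUX32. */
static unsigned long to_linux32(unsigned long personality)
{
    return (personality & ~(unsigned long)PER_MASK) | PER_LINUX32;
}
```

Since `PER_LINUX` is zero, inverting it cannot single out the persona bits; masking with `~PER_MASK` is what actually isolates the low byte while preserving flags such as `ADDR_NO_RANDOMIZE`.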
Re: [PATCH v3] kconfig: replace 'oldnoconfig' with 'olddefconfig', and keep the old name as an alias
On Sat, Sep 01, 2012 at 01:05:17AM +0800, Adam Lee wrote: As 67d34a6a391369269a2e5dba8a5f42cc4cd50231 said, 'oldnoconfig' doesn't set new symbols to 'n', but instead sets them to their default values. So, this patch replaces 'oldnoconfig' with 'olddefconfig' to stop confusing people, and keeps the old name 'oldnoconfig' as an alias, because people already depend on its behavior under the counter-intuitive name. v3: use a better way and add comments about the alias in conf.c Hi, Michal How about this version? I replace it and keep the old one as an alias; it seems everyone will be happy and nobody has to put up with the counter-intuitive name. -- Regards, Adam Lee http://adam8157.info
Re: [RFC v2] memory-hotplug: remove MIGRATE_ISOLATE from free_area->free_list
On 09/06/2012 04:18 PM, Minchan Kim wrote: Hello Lai, On Thu, Sep 06, 2012 at 04:14:51PM +0800, Lai Jiangshan wrote: On 09/06/2012 10:53 AM, Minchan Kim wrote:

Normally, the MIGRATE_ISOLATE type is used for memory-hotplug. But it's an ironic type because the isolated pages would exist as free pages in free_area->free_list[MIGRATE_ISOLATE], so people can think of them as allocatable pages, but they are *never* allocatable. It ends up confusing the NR_FREE_PAGES vmstat so that it is not accurate at all, and some places which depend on that vmstat could reach the wrong decision. There was already a report about it.[1]

[1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem

Then, there was another report about a different problem.[2]

[2] http://www.spinics.net/lists/linux-mm/msg41251.html

I believe it can cause problems in the future, too. So I hope to remove this ironic type with another design. I hope this patch solves it so that we can revert [1] and do not need [2].

* Changelog v1
 * Fix from Michal's many suggestions

Cc: Michal Nazarewicz min...@mina86.com
Cc: Mel Gorman m...@csn.ul.ie
Cc: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
Cc: Wen Congyang we...@cn.fujitsu.com
Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Signed-off-by: Minchan Kim minc...@kernel.org
---
@@ -180,30 +287,35 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
  * all pages in [start_pfn...end_pfn) must be in the same zone.
  * zone->lock must be held before call this.
  *
- * Returns 1 if all pages in the range are isolated.
+ * Returns true if all pages in the range are isolated.
  */
-static int
-__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
+static bool
+__test_page_isolated_in_pageblock(unsigned long start_pfn, unsigned long end_pfn)
 {
+	unsigned long pfn, next_pfn;
 	struct page *page;
-	while (pfn < end_pfn) {
-		if (!pfn_valid_within(pfn)) {
-			pfn++;
-			continue;
-		}
-		page = pfn_to_page(pfn);
-		if (PageBuddy(page))
-			pfn += 1 << page_order(page);
-		else if (page_count(page) == 0 &&
-			 page_private(page) == MIGRATE_ISOLATE)
-			pfn += 1;
-		else
-			break;
+	list_for_each_entry(page, &isolated_pages, lru) {
+		if (&page->lru == &isolated_pages)
+			return false;

What is the meaning of this line?

I just copied it from Michal's code but it seems to be not needed. I will remove it in the next spin.

+		pfn = page_to_pfn(page);
+		if (pfn >= end_pfn)
+			return false;
+		if (pfn >= start_pfn)
+			goto found;

This test is wrong.

	if ((pfn <= start_pfn) && (start_pfn < pfn + (1UL << page_order(page))))
		goto found;

+	}
+	return false;
+
+	list_for_each_entry_continue(page, &isolated_pages, lru) {
+		if (page_to_pfn(page) != next_pfn)
+			return false;

Where is next_pfn initialized?

By goto found.

Don't goto an inner label. Move the found label up:

+
+found:
+	next_pfn = page_to_pfn(page);
+	list_for_each_entry_from(page, &isolated_pages, lru) {
+		if (page_to_pfn(page) != next_pfn)
+			return false;
+		pfn = page_to_pfn(page);
+		next_pfn = pfn + (1UL << page_order(page));
+		if (next_pfn >= end_pfn)
+			return true;
 	}
Re: [RFC v2] memory-hotplug: remove MIGRATE_ISOLATE from free_area->free_list
On 09/06/2012 04:18 PM, Minchan Kim wrote: Hello Lai, On Thu, Sep 06, 2012 at 04:14:51PM +0800, Lai Jiangshan wrote: On 09/06/2012 10:53 AM, Minchan Kim wrote:

Normally, the MIGRATE_ISOLATE type is used for memory-hotplug. But it's an ironic type because the isolated pages would exist as free pages in free_area->free_list[MIGRATE_ISOLATE], so people can think of them as allocatable pages, but they are *never* allocatable. It ends up confusing the NR_FREE_PAGES vmstat so that it is not accurate at all, and some places which depend on that vmstat could reach the wrong decision. There was already a report about it.[1]

[1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem

Then, there was another report about a different problem.[2]

[2] http://www.spinics.net/lists/linux-mm/msg41251.html

I believe it can cause problems in the future, too. So I hope to remove this ironic type with another design. I hope this patch solves it so that we can revert [1] and do not need [2].

* Changelog v1
 * Fix from Michal's many suggestions

Cc: Michal Nazarewicz min...@mina86.com
Cc: Mel Gorman m...@csn.ul.ie
Cc: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
Cc: Wen Congyang we...@cn.fujitsu.com
Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Signed-off-by: Minchan Kim minc...@kernel.org
---
@@ -180,30 +287,35 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
  * all pages in [start_pfn...end_pfn) must be in the same zone.
  * zone->lock must be held before call this.
  *
- * Returns 1 if all pages in the range are isolated.
+ * Returns true if all pages in the range are isolated.
  */
-static int
-__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
+static bool
+__test_page_isolated_in_pageblock(unsigned long start_pfn, unsigned long end_pfn)
 {
+	unsigned long pfn, next_pfn;
 	struct page *page;
-	while (pfn < end_pfn) {
-		if (!pfn_valid_within(pfn)) {
-			pfn++;
-			continue;
-		}
-		page = pfn_to_page(pfn);
-		if (PageBuddy(page))
-			pfn += 1 << page_order(page);
-		else if (page_count(page) == 0 &&
-			 page_private(page) == MIGRATE_ISOLATE)
-			pfn += 1;
-		else
-			break;
+	list_for_each_entry(page, &isolated_pages, lru) {
+		if (&page->lru == &isolated_pages)
+			return false;

What is the meaning of this line?

I just copied it from Michal's code but it seems to be not needed. I will remove it in the next spin.

+		pfn = page_to_pfn(page);
+		if (pfn >= end_pfn)
+			return false;
+		if (pfn >= start_pfn)
+			goto found;

This test is wrong. Use this:

	if ((pfn <= start_pfn) && (start_pfn < pfn + (1UL << page_order(page))))
		goto found;
	if (pfn > start_pfn)
		return false;

+	}
+	return false;
+
+	list_for_each_entry_continue(page, &isolated_pages, lru) {
+		if (page_to_pfn(page) != next_pfn)
+			return false;

Where is next_pfn initialized?

By goto found.

Don't goto an inner label. Move the found label up:

+
+found:
+	next_pfn = page_to_pfn(page);
+	list_for_each_entry_from(page, &isolated_pages, lru) {
+		if (page_to_pfn(page) != next_pfn)
+			return false;
+		pfn = page_to_pfn(page);
+		next_pfn = pfn + (1UL << page_order(page));
+		if (next_pfn >= end_pfn)
+			return true;
 	}
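The check Lai asks for (find the isolated block that contains `start_pfn`, then require each following block to start exactly where the previous one ended) can be modelled in userspace. This is a sketch under assumptions that the thread does not confirm: the isolated list is taken to be sorted by pfn, the comparison operators are reconstructed from context because the archive dropped them, and `struct blk`, `block_contains()` and `range_isolated()` are made-up names operating on a plain array instead of a kernel list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy isolated free block: 2^order pages starting at pfn. */
struct blk {
    unsigned long pfn;
    unsigned int order;
};

/* True if the block [pfn, pfn + 2^order) contains start_pfn. */
static bool block_contains(unsigned long pfn, unsigned int order,
                           unsigned long start_pfn)
{
    return pfn <= start_pfn && start_pfn < pfn + (1UL << order);
}

/*
 * True if the blocks (assumed sorted by pfn) cover [start_pfn, end_pfn)
 * without a gap.
 */
static bool range_isolated(const struct blk *b, size_t n,
                           unsigned long start_pfn, unsigned long end_pfn)
{
    unsigned long next_pfn;
    size_t i = 0;

    /* Step 1: locate the block that contains start_pfn. */
    while (i < n && !block_contains(b[i].pfn, b[i].order, start_pfn)) {
        if (b[i].pfn > start_pfn)
            return false;           /* walked past it: start_pfn is not isolated */
        i++;
    }
    if (i == n)
        return false;

    /* Step 2: every following block must begin where the last one ended. */
    next_pfn = b[i].pfn;
    for (; i < n; i++) {
        if (b[i].pfn != next_pfn)
            return false;
        next_pfn = b[i].pfn + (1UL << b[i].order);
        if (next_pfn >= end_pfn)
            return true;
    }
    return false;
}
```

A test of the form `pfn >= start_pfn` would skip a block that begins before `start_pfn` but still covers it, which is the case `block_contains()` handles.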
Re: [PATCH 2/2] mm: support MIGRATE_DISCARD
On Thu, Sep 06, 2012 at 09:29:35AM +0100, Mel Gorman wrote: On Thu, Sep 06, 2012 at 02:31:12PM +0900, Minchan Kim wrote: Hi Mel, On Wed, Sep 05, 2012 at 11:56:11AM +0100, Mel Gorman wrote: On Wed, Sep 05, 2012 at 05:11:13PM +0900, Minchan Kim wrote: This patch introudes MIGRATE_DISCARD mode in migration. It drops *clean cache pages* instead of migration so that migration latency could be reduced by avoiding (memcpy + page remapping). It's useful for CMA because latency of migration is very important rather than eviction of background processes's workingset. In addition, it needs less free pages for migration targets so it could avoid memory reclaiming to get free pages, which is another factor increase latency. Bah, this was released while I was reviewing the older version. I did not read this one as closely but I see the enum problems have gone away at least. I'd still prefer if CMA had an additional helper to discard some pages with shrink_page_list() and migrate the remaining pages with migrate_pages(). That would remove the need to add a MIGRATE_DISCARD migrate mode at all. I am not convinced with your point. What's the benefit on separating reclaim and migration? For just removing MIGRATE_DISCARD mode? Maintainability. There are reclaim functions and there are migration functions. Your patch takes migrate_pages() and makes it partially a reclaim function mixing up the responsibilities of migrate.c and vmscan.c. I don't think it's not bad because my implementation is very simple(maybe it's much simpler than separating reclaim and migration) and could be used by others like memory-hotplug in future. They could also have used the helper function from CMA that takes a list of pages, reclaims some and migrates other. I also do not accept that your approach is inherently simpler than what I proposed to you. 
This is not tested at all but it should be functionally similar to both your patches except that it keeps the responsibility for reclaim in vmscan.c Your diffstats are 8 files changed, 39 insertions(+), 36 deletions(-) 3 files changed, 46 insertions(+), 4 deletions(-) Mine is 3 files changed, 32 insertions(+), 5 deletions(-) Fewer files changed and fewer lines inserted. ---8--- mm: cma: Discard clean pages during contiguous allocation instead of migration This patch drops clean cache pages instead of migration during alloc_contig_range() to minimise allocation latency by reducing the amount of migration is necessary. It's useful for CMA because latency of migration is more important than evicting the background processes working set. Prototype-not-signed-off-but-feel-free-to-pick-up-and-test --- mm/internal.h |1 + mm/page_alloc.c |2 ++ mm/vmscan.c | 34 +- 3 files changed, 32 insertions(+), 5 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index b8c91b3..6d4bdf9 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -356,3 +356,4 @@ extern unsigned long vm_mmap_pgoff(struct file *, unsigned long, unsigned long, unsigned long); extern void set_pageblock_order(void); +unsigned long reclaim_clean_pages_from_list(struct list_head *page_list); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c66fb87..977bdb2 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5670,6 +5670,8 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end) break; } + reclaim_clean_pages_from_list(cc.migratepages); + ret = migrate_pages(cc.migratepages, __alloc_contig_migrate_alloc, 0, false, MIGRATE_SYNC); diff --git a/mm/vmscan.c b/mm/vmscan.c index 8d01243..ccf7bc2 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -703,7 +703,7 @@ static unsigned long shrink_page_list(struct list_head *page_list, goto keep; VM_BUG_ON(PageActive(page)); - VM_BUG_ON(page_zone(page) != zone); + VM_BUG_ON(zone page_zone(page) != zone); sc-nr_scanned++; @@ -817,7 +817,9 @@ static 
unsigned long shrink_page_list(struct list_head *page_list, * except we already have the page isolated * and know it's dirty */ - inc_zone_page_state(page, NR_VMSCAN_IMMEDIATE); + if (zone) + inc_zone_page_state(page, + NR_VMSCAN_IMMEDIATE); SetPageReclaim(page); goto keep_locked; @@ -947,7 +949,7 @@ keep: * back off and wait for congestion to clear because further reclaim * will encounter the same
Re: [RFC PATCH 0/3] target: try satisfying memory requests with higher-order allocations
Il 06/09/2012 03:58, Nicholas A. Bellinger ha scritto: This patch series fixes this problem by using higher-order allocations to build the data scatterlist. The problem is that iscsi assumes that the scatterlist consists of single pages, which is not true anymore. So patch 2 has to introduce some relatively complicated changes to iscsi_map_iovec and iscsi_unmap_iovec. So enabling multi-page per SGL support is a feature that has been dormant within target core for a long time. It's about time that we start taking advantage of it again. ;) Yeah, I noticed some preparation for it in tcm_fc/tfc_io.c, though too late (they look a lot like my iscsi changes, it would have saved me some time!). While this is obviously not to be taken lightly, I disagree with making this a per-fabric choice. With a properly organized (and bisectable) series, it should be relatively easy to review and to get right. I looked a bit more closely now and there are no changes needed to other targets (actually there is a change needed in tcm_qla2xxx, but the code is currently disabled). There are however changes to transport_kmap_data_sg needed and a few other places. I definitely agree with your other comments, including making max_order a DEF_DEV_ATTRIB. In addition, the default max_order should be capped based on queue_max_sectors(q) if applicable, to avoid hitting this scenario: /* * XXX: if the length the device accepts is shorter than the * length of the S/G list entry this will cause and * endless loop. Better hope no driver uses huge pages. */ Paolo While doing this, I noticed something strange in iscsit_do_crypto_hash_sg. Patch 1 adds a warning about it. M, looks like a separate bug with DataDigest enabled. The approach may be completely wrong and it needs more testing anyway. Please review! Adding my comments inline. Thanks Paolo! 
--nab Paolo Paolo Bonzini (3): tcm_iscsi: warn on incorrect precondition for iscsit_do_crypto_hash_sg tcm_iscsi: support multiple sizes in the scatterlist target: try satisfying memory requests with contiguous blocks drivers/target/iscsi/iscsi_target.c | 106 +- drivers/target/iscsi/iscsi_target_core.h |2 +- drivers/target/target_core_transport.c | 58 ++--- 3 files changed, 138 insertions(+), 28 deletions(-) -- To unsubscribe from this list: send the line unsubscribe target-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH RESEND] fs: Build sys_stat64() and friends if __ARCH_WANT_COMPAT_STAT64
On AArch64, we want the sys_stat64() and related functions for compat support but do not need the generic struct stat64, enabled automatically if __ARCH_WANT_STAT64.

Signed-off-by: Catalin Marinas catalin.mari...@arm.com
Acked-by: Arnd Bergmann a...@arndb.de
Cc: Alexander Viro v...@zeniv.linux.org.uk
Cc: Andrew Morton a...@linux-foundation.org
---
 fs/stat.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/stat.c b/fs/stat.c
index b6ff118..6126c5d 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -326,7 +326,7 @@ SYSCALL_DEFINE3(readlink, const char __user *, path, char __user *, buf,
 
 
 /* -- LFS-64 --- */
-#ifdef __ARCH_WANT_STAT64
+#if defined(__ARCH_WANT_STAT64) || defined(__ARCH_WANT_COMPAT_STAT64)
 
 #ifndef INIT_STRUCT_STAT64_PADDING
 # define INIT_STRUCT_STAT64_PADDING(st) memset(&st, 0, sizeof(st))
@@ -415,7 +415,7 @@ SYSCALL_DEFINE4(fstatat64, int, dfd, const char __user *, filename,
 		return error;
 	return cp_new_stat64(&stat, statbuf);
 }
-#endif /* __ARCH_WANT_STAT64 */
+#endif /* __ARCH_WANT_STAT64 || __ARCH_WANT_COMPAT_STAT64 */
 
 /* Caller is here responsible for sufficient locking (ie. inode->i_lock) */
 void __inode_add_bytes(struct inode *inode, loff_t bytes)
Re: PCI/e1000 BUG: unable to handle kernel paging request at 0ffff163
On Wed, Sep 05, 2012 at 11:41:04AM -0700, Yinghai Lu wrote:
> On Tue, Sep 4, 2012 at 11:51 PM, Fengguang Wu fengguang...@intel.com wrote:
> > Yinghai,
> >
> > There are many kernel paging errors showing up in tree:
> >
> >     git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-for-each-res-addon-v2
> >
> > The below summary shows that
> >
> > 1) it's a reliably reproducible bug
> > 2) all paging faults happen at address 0163 and in some e1000 functions
> >
> > I'll try to bisect if the root cause is not obvious to you. (Cannot do so for now because there are 3 bisections on the way and I cannot afford more..)
>
> thanks, will check that...

Yinghai, I'm very sorry that it's a false report... The root cause is memory corruption by the isdnloop driver:

== [9.345694] isdnloop-ISDN-driver Rev 1.11.6.7
== [9.347484] isdnloop: (loop0) virtual card added
[9.348444] bus: 'usb': driver_probe_device: matched device 1-1:2.0 with driver cdc_acm
[9.349773] bus: 'usb': really_probe: probing driver cdc_acm with device 1-1:2.0
[9.350967] cdc_acm 1-1:2.0: This device cannot do calls on its own. It is not a modem.
[9.353255] cdc_acm 1-1:2.0: ttyACM0: USB ACM device
[9.354137] BUG: unable to handle kernel paging request at 0163
[9.355214] IP: [0163] 0x162
[9.355869] *pde =

Which was recently fixed by

commit 77f00f6324cb97cf1df6f9c4aaeea6ada23abdb2
Author: Wu Fengguang fengguang...@intel.com
Commit: David S. Miller da...@davemloft.net
CommitDate: Fri Aug 3 16:53:22 2012 -0700

    isdnloop: fix and simplify isdnloop_init()

    Fix a buffer overflow bug by removing the revision and printk.

Thanks,
Fengguang
Re: [PATCH 1/4] clk: Provide option for clk_get_rate to issue hw for new rate
Hi Mike, Thanks for your input, and sorry for my late reply! On 31 August 2012 21:29, Mike Turquette mturque...@ti.com wrote: Quoting Ulf Hansson (2012-08-31 05:21:28) From: Ulf Hansson ulf.hans...@linaro.org By using CLK_GET_RATE_NOCACHE flag, we tell the clk_get_rate API to issue the hw for an updated clock rate. This can be used for a clock which rate may be updated without a client necessary modifying it. I'm glad to see this. We discussed whether the default behavior should be cached or from the hardware at length some time back, so having a flag to support the non-default is great. Signed-off-by: Ulf Hansson ulf.hans...@linaro.org --- drivers/clk/clk.c| 43 +++--- include/linux/clk-provider.h |1 + 2 files changed, 25 insertions(+), 19 deletions(-) diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c index efdfd00..d9cbae0 100644 --- a/drivers/clk/clk.c +++ b/drivers/clk/clk.c @@ -558,25 +558,6 @@ int clk_enable(struct clk *clk) EXPORT_SYMBOL_GPL(clk_enable); /** - * clk_get_rate - return the rate of clk - * @clk: the clk whose rate is being returned - * - * Simply returns the cached rate of the clk. Does not query the hardware. If - * clk is NULL then returns 0. - */ -unsigned long clk_get_rate(struct clk *clk) -{ - unsigned long rate; - - mutex_lock(prepare_lock); - rate = __clk_get_rate(clk); - mutex_unlock(prepare_lock); - - return rate; -} -EXPORT_SYMBOL_GPL(clk_get_rate); - -/** * __clk_round_rate - round the given rate for a clk * @clk: round the rate of this clock * @@ -702,6 +683,30 @@ static void __clk_recalc_rates(struct clk *clk, unsigned long msg) } /** + * clk_get_rate - return the rate of clk + * @clk: the clk whose rate is being returned + * + * Simply returns the cached rate of the clk, unless CLK_GET_RATE_NOCACHE flag + * is set, which means a recalc_rate will be issued. + * If clk is NULL then returns 0. 
+ */ +unsigned long clk_get_rate(struct clk *clk) +{ + unsigned long rate; + + mutex_lock(prepare_lock); + + if (clk (clk-flags CLK_GET_RATE_NOCACHE)) + __clk_recalc_rates(clk, 0); This is a bit subtle. Calling __clk_recalc_rates will walk the subtree of children recalculating rates as well as firing off notifiers. Is this what you want? If your clock changes rates behind your back AND has chilren then this is probably the right thing to do. However you might be better off with: if (clk (clk-flags CLK_GET_RATE_NOCACHE)) rate = clk-ops-recalc_rate(clk-hw, clk-parent-rate); This doesn't update children or fire off notifiers. What is best for your platform? For my platform, ux500 and for the clock connected to this patchseries, your suggesting above is enough. (Well some additional error handling is needed in your code proposal though :-) ) The reason for why I used __clk_recalc_rates was because I think it could make sense to have a more generic approach, not sure if it is needed as you mention. Additionally, using __clk_recalc_rates with 0 as the notification argument, should prevent notifications from happen, right? So basically, I wanted the clock rates for the children to be updated as well as the parent clock rate, but no notifications. I can happily update the patch according to your proposal if you still think it is the best way to do it, just tell me again then. :-) Kind regards Ulf Hansson -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
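The narrower alternative discussed above (re-read only this clock's rate when the flag is set, without walking children or firing notifiers, and with the extra NULL checks Ulf mentions) can be sketched in userspace C. This is a minimal model under stated assumptions rather than the clk framework itself: struct clk_model, clk_model_get_rate() and half_parent_rate() are hypothetical, locking is left out, and the bit value chosen for CLK_GET_RATE_NOCACHE is an assumption.

```c
#include <stddef.h>

/* Flag name taken from the patch; the bit position here is an assumption. */
#define CLK_GET_RATE_NOCACHE (1UL << 0)

/* Hypothetical, heavily reduced stand-in for struct clk. */
struct clk_model {
    unsigned long flags;
    unsigned long cached_rate;
    unsigned long parent_rate;
    /* Re-reads the rate from "hardware"; may be NULL. */
    unsigned long (*recalc_rate)(const struct clk_model *clk,
                                 unsigned long parent_rate);
};

/*
 * Return the cached rate, unless CLK_GET_RATE_NOCACHE is set, in which
 * case the rate is recalculated first. Children are not updated and no
 * notifiers fire in this model. A NULL clock yields 0.
 */
unsigned long clk_model_get_rate(struct clk_model *clk)
{
    if (clk == NULL)
        return 0;

    if ((clk->flags & CLK_GET_RATE_NOCACHE) && clk->recalc_rate != NULL)
        clk->cached_rate = clk->recalc_rate(clk, clk->parent_rate);

    return clk->cached_rate;
}

/* Example callback: pretend the hardware divides the parent rate by two. */
unsigned long half_parent_rate(const struct clk_model *clk,
                               unsigned long parent_rate)
{
    (void)clk;
    return parent_rate / 2;
}
```

With the flag clear, a clock whose cached rate is 100 keeps reporting 100 whatever the callback would return; with the flag set and a parent rate of 48, the same call reports 24.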
[PATCH 1/2] mm: Fixup obsolete PG_buddy flag in error_states[]
PG_buddy, an abandoned flag, indicated that pages are free and in the buddy allocator. Update the comment to say "pages in buddy system" instead of "PG_buddy pages".

Signed-off-by: Haifeng Li omy...@gmail.com
---
 mm/memory-failure.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ab1e714..2873498 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -762,7 +762,8 @@ static struct page_state {
 	{ reserved,	reserved,	"reserved kernel",	me_kernel },
 	/*
 	 * free pages are specially detected outside this table:
-	 * PG_buddy pages only make a small fraction of all free pages.
+	 * pages in buddy system only make a small fraction of all
+	 * free pages.
 	 */
 
 	/*
--
1.7.5.4
Re: [PATCH v2 3/3] memory-hotplug: bug fix race between isolation and allocation
Hi, Minchan, 2012/09/06 16:30, Minchan Kim wrote: Hello Yasuaki, On Thu, Sep 06, 2012 at 04:17:54PM +0900, Yasuaki Ishimatsu wrote: Hi Minchan, 2012/09/06 14:16, Minchan Kim wrote: Like below, memory-hotplug makes race between page-isolation and page-allocation so it can hit BUG_ON in __offline_isolated_pages. CPU A CPU B start_isolate_page_range set_migratetype_isolate spin_lock_irqsave(zone-lock) free_hot_cold_page(Page A) /* without zone-lock */ migratetype = get_pageblock_migratetype(Page A); /* * Page could be moved into MIGRATE_MOVABLE * of per_cpu_pages */ list_add_tail(page-lru, pcp-lists[migratetype]); set_pageblock_isolate move_freepages_block drain_all_pages /* Page A could be in MIGRATE_MOVABLE of free_list. */ check_pages_isolated __test_page_isolated_in_pageblock /* * We can't catch freed page which * is free_list[MIGRATE_MOVABLE] */ if (PageBuddy(page A)) pfn += 1 page_order(page A); /* So, Page A could be allocated */ __offline_isolated_pages /* * BUG_ON hit or offline page * which is used by someone */ BUG_ON(!PageBuddy(page A)); This patch checks page's migratetype in freelist in __test_page_isolated_in_pageblock. So now __test_page_isolated_in_pageblock can check the page caused by above race and can fail of memory offlining. Signed-off-by: Minchan Kim minc...@kernel.org --- mm/page_isolation.c |5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/mm/page_isolation.c b/mm/page_isolation.c index 87a7929..7ba7405 100644 --- a/mm/page_isolation.c +++ b/mm/page_isolation.c @@ -193,8 +193,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn) continue; } page = pfn_to_page(pfn); - if (PageBuddy(page)) + if (PageBuddy(page)) { + if (get_freepage_migratetype(page) != MIGRATE_ISOLATE) + break; pfn += 1 page_order(page); + } else if (page_count(page) == 0 get_freepage_migratetype(page) == MIGRATE_ISOLATE) When do the if statement, the page may be used by someone. I can't understand your point. 
We already hold zone-lock so that allocator and this function should be atomic when the page is in free_list. If I miss something, could you elaborate it more? According to your description, the page might be allocated by someone at this point. So some value might be set the page-index by the intended purpose. Thus page-index has the potential to become MIGRATE_ISOLATE value. Thanks, Yasuaki Ishimatsu In this case, page-index may have some number. If the number is same as MIGRATE_ISOLATE, the code goes worng. Thanks, Yasuaki Ishimatsu pfn += 1; -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: a href=mailto:d...@kvack.org; em...@kvack.org /a -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
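The check that the patch adds can be modelled in userspace C to make the discussion easier to follow. This is a sketch under stated assumptions, not the kernel function: struct page_model, its fields and range_pages_isolated() are hypothetical stand-ins, the pfn_valid_within() test is omitted, and the caller is assumed to hold the equivalent of the zone lock.

```c
#include <stdbool.h>

/* Hypothetical reduced migrate types. */
enum migratetype_model { MT_MOVABLE, MT_ISOLATE };

/* Hypothetical stand-in for the struct page fields the check reads. */
struct page_model {
    bool buddy;                          /* heads a free buddy block */
    unsigned int order;                  /* block spans 1UL << order pages */
    int count;                           /* reference count */
    enum migratetype_model freepage_mt;  /* free list the page was put on */
};

/*
 * Walk [pfn, end_pfn) and return true only if every page is free and
 * accounted to the isolate list. pages[] is indexed by pfn.
 */
bool range_pages_isolated(const struct page_model *pages,
                          unsigned long pfn, unsigned long end_pfn)
{
    while (pfn < end_pfn) {
        const struct page_model *page = &pages[pfn];

        if (page->buddy) {
            /* New in the patch: a buddy page on the wrong list fails. */
            if (page->freepage_mt != MT_ISOLATE)
                break;
            pfn += 1UL << page->order;
        } else if (page->count == 0 && page->freepage_mt == MT_ISOLATE) {
            pfn += 1;
        } else {
            break;
        }
    }
    return pfn >= end_pfn;
}
```

The concern raised in the reply corresponds to the second branch: the model keeps the migrate type in a dedicated field, whereas the thread notes that the kernel stores it in a field that can be reused once the page has been allocated.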
[PATCH 2/2] mm: Fixup abandoned PG_buddy for private in struct page
PG_buddy, an abandoned flag, indicated that pages are free and in the buddy allocator. Pages in the buddy allocator now have _mapcount equal to PAGE_BUDDY_MAPCOUNT_VALUE, so the comment should say "_mapcount equals PAGE_BUDDY_MAPCOUNT_VALUE" instead of "PG_buddy is set".

Signed-off-by: Haifeng Li omy...@gmail.com
---
 include/linux/mm_types.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 704a626..49d9247 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -126,7 +126,8 @@ struct page {
 					 * if PagePrivate set; used for
 					 * swp_entry_t if PageSwapCache;
 					 * indicates order in the buddy
-					 * system if PG_buddy is set.
+					 * system if _mapcount equals
+					 * PAGE_BUDDY_MAPCOUNT_VALUE.
 					 */
 #if USE_SPLIT_PTLOCKS
 		spinlock_t ptl;
--
1.7.5.4
CONFIG_NO_HZ + CONFIG_CPU_IDLE freeze the system (Was Re: [PATCH] acpi : remove power from acpi_processor_cx structure)
On 09/06/2012 09:54 AM, Daniel Lezcano wrote: On 09/05/2012 03:41 PM, Rafael J. Wysocki wrote: On Saturday, September 01, 2012, Rafael J. Wysocki wrote: On Friday, August 31, 2012, Daniel Lezcano wrote: On 07/24/2012 11:06 PM, Konrad Rzeszutek Wilk wrote: On Tue, Jul 24, 2012 at 11:12:29PM +0200, Daniel Lezcano wrote: Remove the power field as it is not used. Signed-off-by: Daniel Lezcano daniel.lezc...@linaro.org Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Acked. Hi Rafael, I did not see this patch going in. Is it possible to merge it ? I think so. I'll take care of it when I get back from LinuxCon/Plumbers Conf. (early next week). Applied to the linux-next branch of the linux-pm.git tree as v3.7 material. Thanks Rafael. Are there any other patches you want me to consider for v3.7? Yes please, I have the per cpu latencies ready to be submitted but I want to do extra testing before. Unfortunately, the linux-pm-next hangs at boot time on my intel dual core (not related to the patchset). I am git bisecting right now. I found the culprit. This is not related to the linux-pm tree but with net-next. The following patch introduced the issue. commit 6bdb7fe31046ac50b47e83c35cd6c6b6160a475d Author: Amerigo Wang amw...@redhat.com Date: Fri Aug 10 01:24:50 2012 + netpoll: re-enable irq in poll_napi() napi-poll() needs IRQ enabled, so we have to re-enable IRQ before calling it. Cc: David Miller da...@davemloft.net Signed-off-by: Cong Wang amw...@redhat.com Signed-off-by: David S. Miller da...@davemloft.net AFAICS, it has been fixed by commit 072a9c48600409d72aeb0d5b29fbb75861a06631 which is not yet in linux-pm-next. I fall into this issue because NETCONSOLE is set, disabling it allowed me to go further. Unfortunately I am facing to some random freeze on the system which seems to be related to CONFIG_NO_HZ=y and CONFIG_CPU_IDLE=y. Disabling one of them, make the freezes to disappear. Is it a known issue ? 
Thanks in advance -- Daniel
Re: [PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works
On 06/09/2012 10:47, Michael S. Tsirkin wrote:
> > - a migration from non-MUST_TELL_HOST to MUST_TELL_HOST will succeed, which is wrong;
> >
> > - a migration from MUST_TELL_HOST to non-MUST_TELL_HOST will fail, which is useless.
> >
> > Add instead a new feature VIRTIO_BALLOON_F_SILENT_DEFLATE, and deprecate VIRTIO_BALLOON_F_MUST_TELL_HOST since it is never actually used.
> >
> > Signed-off-by: Paolo Bonzini pbonz...@redhat.com
>
> Frankly I think it's a qemu migration bug. I don't see why we need to tweak spec: just teach qemu to be smarter during migration.

Of course you can just teach QEMU to be smarter, but that would be a one-off hack for the only ill-defined feature that says something is _not_ supported. Since in practice all virtio_balloon-enabled hypervisors supported silent deflate (so the bit was always zero), and no client used it (so it was never checked), it's easier to just reverse the direction.

In fact, it's not clear how the driver should use the feature. My guess is that, if it wants to use silent deflate, it tries to negotiate VIRTIO_BALLOON_F_MUST_TELL_HOST, and can use silent deflate if negotiation fails. This is against the logic of all other features.

> Can you show a scenario with old driver/new hypervisor or new driver/old hypervisor that fails?

Suppose the driver tried to negotiate the feature as above. We then have the two scenarios above.

In the harmless but annoying scenario, the source hypervisor doesn't support silent deflate, so VIRTIO_BALLOON_F_MUST_TELL_HOST has been negotiated successfully. The destination hypervisor supports silent deflate, so it does _not_ include the feature. It sees that the guest requests VIRTIO_BALLOON_F_MUST_TELL_HOST, and fails migration.

In the incorrect scenario, you are migrating to an older hypervisor. The source hypervisor is newer and supports silent deflate, so VIRTIO_BALLOON_F_MUST_TELL_HOST was _not_ negotiated. The destination hypervisor does not support silent deflate. However, the guest is not requesting VIRTIO_BALLOON_F_MUST_TELL_HOST, and migration succeeds. Next time the guest tries to use a page from the balloon, badness happens, because the hypervisor does not expect it.

Paolo
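The two migration scenarios described in this message can be checked with a small userspace C model. It rests on assumptions that the thread only implies: negotiation is taken to be the intersection of what the host offers and what the driver asks for, and the destination is taken to reject migration when the guest holds a feature it does not offer. negotiate() and migration_accepted() are hypothetical helpers, and the bit number used for the proposed VIRTIO_BALLOON_F_SILENT_DEFLATE is made up for illustration.

```c
#include <stdbool.h>

#define F_MUST_TELL_HOST (1u << 0)  /* models VIRTIO_BALLOON_F_MUST_TELL_HOST */
#define F_SILENT_DEFLATE (1u << 2)  /* proposed bit; this number is an assumption */

/* Negotiated features: what the host offers and the driver also wants. */
unsigned int negotiate(unsigned int host_features, unsigned int driver_features)
{
    return host_features & driver_features;
}

/*
 * Assumed migration rule: the destination accepts the guest only if it
 * offers every feature the guest has negotiated on the source.
 */
bool migration_accepted(unsigned int guest_features,
                        unsigned int dest_host_features)
{
    return (guest_features & ~dest_host_features) == 0;
}
```

Under this model, a guest that negotiated F_MUST_TELL_HOST on an old host is rejected by a new host that offers nothing (harmless but annoying), while a guest that negotiated nothing on a new host is accepted by an old host that expects to be told (the incorrect case). With the positive F_SILENT_DEFLATE bit the outcomes flip: moving to a host without silent deflate is rejected, and moving to a host with it is accepted.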
Re: [PATCH 3/3] memory-hotplug: bug fix race between isolation and allocation
On Thu, Sep 06, 2012 at 01:49:03PM +0900, Minchan Kim wrote: __offline_isolated_pages /* * BUG_ON hit or offline page * which is used by someone */ BUG_ON(!PageBuddy(page A)); offline_page calling BUG_ON because someone allocated the page is ridiculous. I did not spot where that check is but it should be changed. The correct action is to retry the isolation. It is where __offline_isolated_pges. .. while (pfn end_pfn) { if (!pfn_valid(pfn)) { pfn++; continue; } page = pfn_to_page(pfn); BUG_ON(page_count(page)); BUG_ON(!PageBuddy(page)); HERE order = page_order(page); ... Comment of offline_isolated_pages says following as. We cannot do rollback at this point So if the comment is true, BUG_ON does make sense to me. It's massive overkill. I see no reason why it cannot return EBUSY all the way back up to offline_pages() and retry with the migration step. It would both remove that BUG_ON and improve reliability of memory hot-remove. But I don't see why we can't retry it as I look thorugh code. Anyway, It's another story which isn't related to this patch. True. Signed-off-by: Minchan Kim minc...@kernel.org At no point in the changelog do you actually say what he patch does :/ Argh, I will do. --- mm/page_isolation.c |5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/mm/page_isolation.c b/mm/page_isolation.c index acf65a7..4699d1f 100644 --- a/mm/page_isolation.c +++ b/mm/page_isolation.c @@ -196,8 +196,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn) continue; } page = pfn_to_page(pfn); - if (PageBuddy(page)) + if (PageBuddy(page)) { + if (get_page_migratetype(page) != MIGRATE_ISOLATE) + break; pfn += 1 page_order(page); + } It is possible the page is moved to the MIGRATE_ISOLATE list between when the page was freed to the buddy allocator and this check was made. The page-index information is stale and the impact is that the hotplug operation fails when it could have succeeded. 
That said, I think it is a very unlikely race that will never happen in practice. I understand you mean move_freepages which I have missed. Right? Yes. Then, I will fix it, too. More importantly, the effect of this path is that EBUSY gets bubbled all the way up and the hotplug operations fails. This is fine but as the page is free at the time this problem is detected you also have the option of moving the PageBuddy page to the MIGRATE_ISOLATE list at this time if you take the zone lock. This will mean you need to change the name of test_pages_isolated() of course. Sorry, I can't get your point. Could you elaborate it more? You detect a PageBuddy page but it's on the wrong list. Instead of returning and failing memory-hotremove, move the free page to the correct list at the time it is detected. Is it related to this patch? No, it's not important and was a suggestion on how it could be made better. However, retrying hot-remove would be even better again. I'm not suggesting it be done as part of this series. -- Mel Gorman SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] virtio-blk: Fix kconfig option
On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote:
> On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote:
> > On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote:
> > > Kent Overstreet koverstr...@google.com writes:
> > > > CONFIG_VIRTIO isn't exposed, everything else is supposed to select it instead.
> > >
> > > This is a slight mis-understanding. It's supposed to be selected by the particular driver, probably virtio_pci in your case.
> >
> > So are you saying virtio-blk depends on virtio-pci? If so, the kconfig should have that. As is, VIRTIO_BLK just has:
> >
> >     depends on EXPERIMENTAL && VIRTIO
> >
> > which is flat out broken.
>
> I don't think anything is broken. Can you show an example of a broken configuration?

Do you not understand the difference between depends and selects? Or did you not read my original mail?

Flip off everything in drivers -> virtio.

Now go to drivers -> block and try to turn on virtio-blk. It's not listed!

Now go back to drivers -> virtio and turn on (randomly) balloon. Go back to drivers -> block, and now you can turn on virtio-blk!

Do you see what's wrong with this picture?
Re: linux-next: manual merge of the arm-soc tree with the usb tree
On 09/06/2012 07:42 AM, Stephen Rothwell wrote:
> Today's linux-next merge of the arm-soc tree got a conflict in drivers/usb/host/Kconfig between commit 952230d774bb ("usb: ohci: Fix Kconfig dependency on USB_ISP1301") from the usb tree and commit d684f05f2d55 ("ARM: mach-pnx4008: Remove architecture") from the arm-soc tree.
>
> I fixed it up (see below) and can carry the fix as necessary (no action required).

Thanks - this little conflict was expected when merging pnx4008 removal and the isp1301 dependency fix. And the fixup is correct.

Roland
Re: [RFC PATCH] [media] rc: filter out not allowed protocols when decoding
Sean , many thanks for your help. I know much more about IR framwork now. I'll try to work out a patch to remove allowed_protocols. Thanks again! [Du, Changbin] 2012/9/4 Sean Young s...@mess.org: On Tue, Sep 04, 2012 at 11:06:07AM +0800, Changbin Du wrote: mutex_lock(ir_raw_handler_lock); - list_for_each_entry(handler, ir_raw_handler_list, list) - handler-decode(raw-dev, ev); + list_for_each_entry(handler, ir_raw_handler_list, list) { + /* use all protocol by default */ + if (raw-dev-allowed_protos == RC_TYPE_UNKNOWN || + raw-dev-allowed_protos handler-protocols) + handler-decode(raw-dev, ev); + } Each IR protocol decoder already checks whether it is enabled or not; should it not be so that only allowed protocols can be enabled rather than checking both enabled_protocols and allowed_protocols? Just from reading store_protocols it looks like decoders which aren't in allowed_protocols can be enabled, which makes no sense. Also ir_raw_event_register all protocols are enabled rather than the allowed ones. Lastely I don't know why raw ir drivers should dictate which protocols can be enabled. Would it not be better to remove it entirely? I agree with you. I just thought that the only thing a decoder should care is its decoding logic, but not including decoder management. My idaea is: 1) use enabled_protocols to select decoders in ir_raw.c, but not placed in decoders to do the judgement. 2) remove allowed_protocols or just use it to set the default decoder (also should rename allowed_protocols to default_protocol). The default decoder should be the one set by the rc keymap. I also have a question: Is there a requirement that one more decoders are enabled for a IR device at the same time? Yes, you want to be able to multiple remotes on the IR device (which you can do as long as the scancodes don't overlap, I think), and the lirc device is implemented as a decoder, so you might want to see the raw IR as well as have it decoded. 
And will that lead to an issue where each decoder may decode the same pulse sequence into different events, since their protocols differ?

At the moment, no. David Hardeman has sent a patch for this: http://patchwork.linuxtv.org/patch/11388/

Sean
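The idea floated in this thread, selecting decoders by protocol mask in the dispatch loop rather than inside each decoder, can be sketched in userspace C. This is a minimal model and not the rc-core code: the PROTO_* bits, struct decoder_model, dispatch_sample() and ignore_sample() are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical protocol bits; the kernel uses its own RC_TYPE_* values. */
#define PROTO_RC5  (1UL << 0)
#define PROTO_NEC  (1UL << 1)
#define PROTO_LIRC (1UL << 2)

/* Hypothetical stand-in for a raw IR handler. */
struct decoder_model {
    unsigned long protocols;            /* protocols this decoder handles */
    int (*decode)(unsigned int sample); /* consumes one raw sample */
};

/*
 * Pass the sample to every decoder whose protocol mask intersects the
 * enabled set and return how many decoders were called. Several decoders
 * may be enabled at once, as the thread notes for multiple remotes and
 * the lirc pseudo-decoder.
 */
int dispatch_sample(const struct decoder_model *decoders, size_t n,
                    unsigned long enabled_protocols, unsigned int sample)
{
    int called = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if ((decoders[i].protocols & enabled_protocols) == 0)
            continue;   /* decoder not enabled for this device */
        decoders[i].decode(sample);
        called++;
    }
    return called;
}

/* Example decoder callback that accepts and discards the sample. */
int ignore_sample(unsigned int sample)
{
    (void)sample;
    return 0;
}
```

Keeping the mask test in one loop means a decoder only contains decoding logic, which is the separation Changbin describes; whether that fits the real handler list is for the maintainers to judge.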
Re: [PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works
On Thu, Sep 06, 2012 at 11:24:02AM +0200, Paolo Bonzini wrote: Il 06/09/2012 10:47, Michael S. Tsirkin ha scritto: - a migration from non-MUST_TELL_HOST to MUST_TELL_HOST will succeed, which is wrong; - a migration from MUST_TELL_HOST to non-MUST_TELL_HOST will fail, which is useless. Add instead a new feature VIRTIO_BALLOON_F_SILENT_DEFLATE, and deprecate VIRTIO_BALLOON_F_MUST_TELL_HOST since it is never actually used. Signed-off-by: Paolo Bonzini pbonz...@redhat.com Frankly I think it's a qemu migration bug. I don't see why we need to tweak spec: just teach qemu to be smarter during migration. Of course you can just teach QEMU to be smarter, but that would be a one-off hack for the only ill-defined feature that says something is _not_ supported. Since in practice all virtio_balloon-enbled hypervisors supported silent deflate (so the bit was always zero), and no client used it (so it was never checked), it's easier to just reverse the direction. In fact, it's not clear how the driver should use the feature. My guess is that, if it wants to use silent deflate, it tries to negotiate VIRTIO_BALLOON_F_MUST_TELL_HOST, and can use silent deflate if negotiation fails. This is against the logic of all other features. Let's take a step back from the implementation details. You are trying to add a new feature bit, after all. Why? Why is silent deflate useful? This is what is missing in all this discussion. If it is not useful we do not need a bit for it. Can you show a scenario with old driver/new hypervisor or new driver/old hypervisor that fails? Suppose the driver tried to negotiate the feature as above. We then have the two scenarios above. In the harmless but annoying scenario, the source hypervisor doesn't support silent deflate, so VIRTIO_BALLOON_F_MUST_TELL_HOST has been negotiated successfully. The destination hypervisor supports silent deflate, so it does _not_ include the feature. 
It sees that the guest requests VIRTIO_BALLOON_F_MUST_TELL_HOST, and fails migration. In the incorrect scenario, you are migrating to an older hypervisor. The source hypervisor is newer and supports silent deflate, so VIRTIO_BALLOON_F_MUST_TELL_HOST was _not_ negotiated. The destination hypervisor does not support silent deflate. However, the guest is not requesting VIRTIO_BALLOON_F_MUST_TELL_HOST, and migration succeeds. Next time the guest tries to use a page from the balloon, badness happens, because the hypervisor does not expect it. Paolo Sorry, this is not the example I asked for. Please give an example without migration. Migration is qemu's problem: it is the hypervisor's job to make sure the guest sees no change during migration. It should be able to do this with any hardware it emulates; there should be no need to change hardware to make it migratable somehow. -- MST
Re: snd-usb: delay: estimated 0, actual 352
On 2012.09.06 at 10:21 +0200, Takashi Iwai wrote: At Thu, 06 Sep 2012 09:35:26 +0200, Takashi Iwai wrote: In short, a patch like below may fix the issue (note: completely untested!) No it doesn't, unfortunately... -- Markus
Re: [PATCH] virtio-blk: Fix kconfig option
On Thu, Sep 06, 2012 at 02:25:12AM -0700, Kent Overstreet wrote: On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote: On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote: Kent Overstreet koverstr...@google.com writes: CONFIG_VIRTIO isn't exposed, everything else is supposed to select it instead. This is a slight misunderstanding. It's supposed to be selected by the particular driver, probably virtio_pci in your case. So are you saying virtio-blk depends on virtio-pci? If so, the kconfig should have that. As is, VIRTIO_BLK just has: depends on EXPERIMENTAL && VIRTIO which is flat out broken. I don't think anything is broken. Can you show an example of a broken configuration? Do you not understand the difference between depends and selects? Or did you not read my original mail? Flip off everything in drivers -> virtio. Now go to drivers -> block and try to turn on virtio-blk. It's not listed! Yes. Because you disabled all virtio backends. It does not make sense to have any frontends. Now go back to drivers -> virtio and turn on (randomly) balloon. Go back to drivers -> block, and now you can turn on virtio-blk! Do you see what's wrong with this picture? Yes. You got unlucky with your random guess. It's a bug in balloon kconfig: it should not select virtio. I sent a patch to fix that yesterday. -- MST
Re: [PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works
Il 06/09/2012 11:44, Michael S. Tsirkin ha scritto: In fact, it's not clear how the driver should use the feature. My guess is that, if it wants to use silent deflate, it tries to negotiate VIRTIO_BALLOON_F_MUST_TELL_HOST, and can use silent deflate if negotiation fails. This is against the logic of all other features. Let's take a step back from the implementation details. You are trying to add a new feature bit, after all. Why? Why is silent deflate useful? This is what is missing in all this discussion. If it is not useful we do not need a bit for it. It is useful because it lets guests inflate the balloon aggressively, and then use ballooned-out pages even in places where the guest OS cannot sleep, such as kmalloc(GFP_ATOMIC). Can you show a scenario with old driver/new hypervisor or new driver/old hypervisor that fails? Sorry this is not the example I asked for. Please give and example without migration. Migration is qemu's problem: it is hypervisor's job to make sure guest sees no change during migration. Quoting my message: Of course you can just teach QEMU to be smarter, but that would be a one-off hack for the only ill-defined feature that says something is _not_ supported. Currently migration works the same way for all virtio devices, and assumes that features are defined only in the positive direction: drivers request features if they want to use it, devices provide features to say they support something. Instead, in the case of this feature, the driver requests it before relying on its lack (which is odd); the device provides if they do not support something (which is wrong). You can see that this just cannot provide backwards-compatibility in the device; it happens to work only because the feature was there in the first version of the spec. It should be able to do this with any hardware it emulates, there should be no need to change hardware to make it migrateable somehow. 
Of course, but if we can fix the hardware with no bad effects, let's do that instead. Paolo
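The asymmetry Paolo describes can be sketched with a toy model of feature negotiation. This is only an illustration under stated assumptions: the helper names, the rule "the destination must offer every feature bit the guest negotiated", and the bit position used for the proposed VIRTIO_BALLOON_F_SILENT_DEFLATE are hypothetical. Only VIRTIO_BALLOON_F_MUST_TELL_HOST being feature bit 0 is taken from the existing spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit 0 in the virtio-balloon spec. */
#define VIRTIO_BALLOON_F_MUST_TELL_HOST  (1u << 0)
/* Hypothetical bit position for the proposed feature; the thread does not fix one. */
#define VIRTIO_BALLOON_F_SILENT_DEFLATE  (1u << 2)

/* A feature is negotiated only if the driver asks for it AND the device offers it. */
static uint32_t negotiate(uint32_t driver_wants, uint32_t device_offers)
{
    return driver_wants & device_offers;
}

/*
 * Assumed generic migration rule: every feature the guest negotiated on the
 * source must also be offered by the destination.
 */
static bool migration_allowed(uint32_t negotiated, uint32_t dest_offers)
{
    return (negotiated & ~dest_offers) == 0;
}
```

With the "negative" bit, a host lacking silent deflate offers MUST_TELL_HOST and a host supporting it offers nothing, so the generic rule refuses the harmless old-to-new migration and allows the unsafe new-to-old one. With a "positive" SILENT_DEFLATE bit the same rule gives the intended answers in both directions.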
Re: [PATCH] virtio-blk: Fix kconfig option
On Thu, Sep 06, 2012 at 12:49:56PM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 02:25:12AM -0700, Kent Overstreet wrote: On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote: On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote: Kent Overstreet koverstr...@google.com writes: CONFIG_VIRTIO isn't exposed, everything else is supposed to select it instead. This is a slight mis-understanding. It's supposed to be selected by the particular driver, probably virtio_pci in your case. So are you saying virtio-blk depends on virtio-pci? If so, the kconfig should have that. As is, VIRTIO_BLK just has: depends on EXPERIMENTAL VIRTIO which is flat out broken. I don't think anything is broken. Can you show an example of a broken configuration? Do you not understand the difference between depends an selects? Or did you not read my original mail? Flip off everything in drivers - virtio Now go to drivers - block and try to turn on virtio-blk. It's not listed! Yes. Because you disabled all virtio backends. It does not make sense to have any frontends. How's a user - or even another kernel developer who isn't familiar with virtio - supposed to know that? I still don't know what exactly a virtio backend is - the term isn't even mentioned anywhere that I've seen. Whatever it is though virtio-blk should be depending on _that_, not a config option that _isn't exposed in the menu_! Now go back to drivers - virtio and turn on (randomly) balloon. Go back to drivers - block, and now you can turn on virtio-blk! Do you see what's wrong with this picture? Yes. You got unlucky with your random guess. It's a bug in balloon kconfig: it should not select virtio. I sent a patch to fix that yesterday. Then it's also a bug in the comments at the top of drivers/virtio/Kconfig. And besides that, how the _hell_ is a user supposed to know to turn on VIRTIO_PCI before VIRTIO_BLK? 
It's not documented anywhere (if that is what's supposed to happen! I still don't know) and even if it was documented, having one kconfig option depend on something that's exposed in a _completely different menu_ is just made of fail.
[PATCH] x86, 32-bit: Fix invalid stack address while in softirq
(Resending patch with [PATCH] in subject line and updated cc list.)

On 06.09.12 09:30:37, wyang1 wrote: Robert, I agree with what you said; my patch is more like a workaround. So the proper fix I see is to fix kernel_stack_pointer() to return a valid stack in case of an empty stack while in softirq. Something like the patch below. Maybe it must be optimized a bit. I tested the patch over night with no issues found. Please test it too. I also tested the following patch over night. It is fine. :-)

Wei, thanks for testing. Ingo, please take a look at this. Not sure if Linus wants to look at this too and if we need more optimization here.

Thanks,
-Robert

From 8e7c16913b1fcfc63f7b24337551aacc7153c334 Mon Sep 17 00:00:00 2001
From: Robert Richter robert.rich...@amd.com
Date: Mon, 3 Sep 2012 20:54:48 +0200
Subject: [PATCH] x86, 32-bit: Fix invalid stack address while in softirq

In 32 bit the stack address provided by kernel_stack_pointer() may point to an invalid range causing NULL pointer access or page faults while in NMI (see trace below). This happens if called in softirq context and if the stack is empty. The address at &regs->sp is then out of range.

Fixing this by checking if regs and &regs->sp are in the same stack context. Otherwise return the previous stack pointer stored in struct thread_info.

 BUG: unable to handle kernel NULL pointer dereference at 000a
 IP: [c1004237] print_context_stack+0x6e/0x8d
 *pde =
 Oops: [#1] SMP
 Modules linked in:
 Pid: 4434, comm: perl Not tainted 3.6.0-rc3-oprofile-i386-standard-g4411a05 #4 Hewlett-Packard HP xw9400 Workstation/0A1Ch
 EIP: 0060:[c1004237] EFLAGS: 00010093 CPU: 0
 EIP is at print_context_stack+0x6e/0x8d
 EAX: e000 EBX: 000a ECX: f4435f94 EDX: 000a
 ESI: f4435f94 EDI: f4435f94 EBP: f5409ec0 ESP: f5409ea0
  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
 CR0: 8005003b CR2: 000a CR3: 34ac9000 CR4: 07d0
 DR0: DR1: DR2: DR3:
 DR6: 0ff0 DR7: 0400
 Process perl (pid: 4434, ti=f5408000 task=f5637850 task.ti=f4434000)
 Stack:
  03e8 e000 1ffc f4e39b00 000a f4435f94 c155198c f5409ef0 c1003723 c155198c f5409f04 f5409edc f5409ee8 f4435f94 f5409fc4 0001 f5409f1c c12dce1c c155198c
 Call Trace:
  [c1003723] dump_trace+0x7b/0xa1
  [c12dce1c] x86_backtrace+0x40/0x88
  [c12db712] ? oprofile_add_sample+0x56/0x84
  [c12db731] oprofile_add_sample+0x75/0x84
  [c12ddb5b] op_amd_check_ctrs+0x46/0x260
  [c12dd40d] profile_exceptions_notify+0x23/0x4c
  [c1395034] nmi_handle+0x31/0x4a
  [c1029dc5] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [c13950ed] do_nmi+0xa0/0x2ff
  [c1029dc5] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [c13949e5] nmi_stack_correct+0x28/0x2d
  [c1029dc5] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [c1003603] ? do_softirq+0x4b/0x7f
  IRQ
  [c102a06f] irq_exit+0x35/0x5b
  [c1018f56] smp_apic_timer_interrupt+0x6c/0x7a
  [c1394746] apic_timer_interrupt+0x2a/0x30
 Code: 89 fe eb 08 31 c9 8b 45 0c ff 55 ec 83 c3 04 83 7d 10 00 74 0c 3b 5d 10 73 26 3b 5d e4 73 0c eb 1f 3b 5d f0 76 1a 3b 5d e8 73 15 8b 13 89 d0 89 55 e0 e8 ad 42 03 00 85 c0 8b 55 e0 75 a6 eb cc
 EIP: [c1004237] print_context_stack+0x6e/0x8d SS:ESP 0068:f5409ea0
 CR2: 000a
 ---[ end trace 62afee3481b00012 ]---
 Kernel panic - not syncing: Fatal exception in interrupt

Reported-by: Yang Wei wei.y...@windriver.com
Cc: sta...@vger.kernel.org
Signed-off-by: Robert Richter robert.rich...@amd.com
---
 arch/x86/include/asm/ptrace.h | 15 ---
 arch/x86/kernel/ptrace.c | 21 +
 arch/x86/oprofile/backtrace.c |2 +-
 3 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index dcfde52..19f16eb 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -205,21 +205,14 @@ static inline bool user_64bit_mode(struct pt_regs *regs)
 }
 #endif
 
-/*
- * X86_32 CPUs don't save ss and esp if the CPU is already in kernel mode
- * when it traps. The previous stack will be directly underneath the saved
- * registers, and 'sp/ss' won't even have been saved. Thus the '&regs->sp'.
- *
- * This is valid only for kernel mode traps.
- */
-static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
-{
 #ifdef CONFIG_X86_32
-	return (unsigned long)(&regs->sp);
+extern unsigned long kernel_stack_pointer(struct pt_regs *regs);
 #else
+static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
+{
 	return regs->sp;
-#endif
 }
+#endif
 
 #define GET_IP(regs) ((regs)->ip)
 #define GET_FP(regs) ((regs)->bp)
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index c4c6a5c..5a9a8c9 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -165,6 +165,27 @@ static inline bool invalid_selector(u16 value)
 
 #define FLAG_MASK
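The idea of "checking if regs and &regs->sp are in the same stack context" can be sketched in userspace C. This is not the code from the patch (whose ptrace.c hunk is cut off above); it is a minimal illustration assuming kernel stacks are THREAD_SIZE-aligned blocks of 8 KiB, so that two addresses belong to the same stack exactly when they agree after masking off the low bits. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 8 KiB, THREAD_SIZE-aligned kernel stacks on 32-bit x86. */
#define THREAD_SIZE 8192u

/* True if both 32-bit addresses fall inside the same aligned stack block. */
static bool same_stack_context(uint32_t regs_addr, uint32_t sp_addr)
{
    uint32_t mask = ~(THREAD_SIZE - 1u);   /* 0xffffe000 */

    return (regs_addr & mask) == (sp_addr & mask);
}
```

Using addresses from the oops purely as sample values: 0xf5409ea0 and 0xf5409ec0 both mask to 0xf5408000 (the `ti` value in the trace), while 0xf4435f94 masks to 0xf4434000 (`task.ti`), so the check would report a different stack for the latter.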
[PATCH] UDF: Fix incorrect error handling in udf_direct_IO()
My recent patch to add DIRECT_IO support to the UDF filesystem handler contains a mistake in the error recovery if blockdev_direct_IO() fails. The test `rw && WRITE` should be `rw & WRITE`. Fix it.

Signed-off-by: Ian Abbott abbo...@mev.co.uk
---
 fs/udf/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/udf/inode.c b/fs/udf/inode.c
index b905448..41d5830 100644
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -156,7 +156,7 @@ static ssize_t udf_direct_IO(int rw, struct kiocb *iocb,
 
 	ret = blockdev_direct_IO(rw, iocb, inode, iov, offset, nr_segs,
 				  udf_get_block);
-	if (unlikely(ret < 0 && (rw && WRITE)))
+	if (unlikely(ret < 0 && (rw & WRITE)))
 		udf_write_failed(mapping, offset + iov_length(iov, nr_segs));
 	return ret;
 }
-- 
1.7.12
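The difference between a logical test (`rw && WRITE`) and a bitwise test (`rw & WRITE`) can be sketched in userspace C. The constants below are illustrative stand-ins, not the kernel's definitions: DEMO_WRITE plays the role of the direction bit and DEMO_READAHEAD is a hypothetical extra flag that may be set on a read.

```c
#include <stdbool.h>

/* Illustrative values only; not taken from kernel headers. */
#define DEMO_WRITE      0x1   /* direction bit: set for writes */
#define DEMO_READAHEAD  0x2   /* hypothetical extra flag on a read request */

/* Buggy form: true whenever rw is non-zero, regardless of the write bit. */
static bool is_write_logical(int rw)
{
    return rw && DEMO_WRITE;
}

/* Intended form: true only if the write bit itself is set. */
static bool is_write_bitwise(int rw)
{
    return (rw & DEMO_WRITE) != 0;
}
```

For a plain read (`rw == 0`) and a plain write (`rw == DEMO_WRITE`) both forms agree, which is why such a bug can go unnoticed; they diverge as soon as a read carries any other flag bit.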
Re: [PATCH 1/3] btrfs: remove unnecessary -ENOMEM BUG_ON check in extent-tree.c/exclude_super_stripes
On Thu, Sep 06, 2012 at 02:40:41PM +0800, Wang Sheng-Hui wrote:
The memory allocation failure is BUG_ON in add_excluded_extent (following the code path) and btrfs_rmap_block. No need to BUG_ON -ENOMEM inside exclude_super_stripes itself.

No please.

Its return value is always 0, and useless for its callers. Set it as void instead 0-returned.

btrfs_rmap_block itself contains a BUG_ON:

	int btrfs_rmap_block(struct btrfs_mapping_tree *map_tree,
			     u64 chunk_start, u64 physical, u64 devid,
			     u64 **logical, int *naddrs, int *stripe_len)
	{
		struct extent_map_tree *em_tree = &map_tree->map_tree;
		struct extent_map *em;
		struct map_lookup *map;
		u64 *buf;
		u64 bytenr;
		u64 length;
		u64 stripe_nr;
		int i, j, nr = 0;

		read_lock(&em_tree->lock);
		em = lookup_extent_mapping(em_tree, chunk_start, 1);
		read_unlock(&em_tree->lock);

		BUG_ON(!em || em->start != chunk_start);

And this should be turned into a 'return error', thus giving a non-zero return code that should be handled in the callers. E.g. this patch attempts to do that http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg15470.html but has not been merged due to an incorrect fix inside exclude_super_stripes (introduced in the patch).

The same objection for return code cleanups will hold for any function that returns 0 but is full of BUG_ONs.

david
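The "turn the BUG_ON into a 'return error'" pattern can be sketched in userspace C. Everything here is a hypothetical stand-in: `demo_extent`, `demo_rmap()` and `demo_exclude()` are not btrfs functions, and -ENOENT is an arbitrary choice of error code for the illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a looked-up mapping. */
struct demo_extent {
    unsigned long long start;
};

/*
 * Instead of aborting when the lookup result is missing or does not match,
 * report the condition to the caller with a negative errno value.
 */
static int demo_rmap(const struct demo_extent *em, unsigned long long chunk_start)
{
    if (em == NULL || em->start != chunk_start)
        return -ENOENT;   /* arbitrary error code chosen for this sketch */
    return 0;
}

/* The caller now has a meaningful return value to check and propagate. */
static int demo_exclude(const struct demo_extent *em, unsigned long long chunk_start)
{
    int ret = demo_rmap(em, chunk_start);

    if (ret)
        return ret;
    return 0;
}
```

Once the inner function can fail this way, the outer function's return value is no longer "always 0", which is the reason given above for not converting it to void.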
[PATCH 1/1] [SCSI] scsi_debug: Add removable parameter
Add removable module parameter to set the removable attribute of any subsequently created debug block device. It is a writable driver option, so that you can switch between removable and fixed media block devices in between the add_host calls.

This is useful for being able to test the different behaviour/required privileges in e. g. the udisks test suite.

Signed-off-by: Martin Pitt martin.p...@ubuntu.com
Acked-By: David Zeuthen zeut...@gmail.com
---
 drivers/scsi/scsi_debug.c | 30 +++---
 1 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 182d5a5..57fbd5a 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -109,6 +109,7 @@ static const char * scsi_debug_version_date = "20100324";
 #define DEF_OPT_BLKS 64
 #define DEF_PHYSBLK_EXP 0
 #define DEF_PTYPE 0
+#define DEF_REMOVABLE false
 #define DEF_SCSI_LEVEL 5	/* INQUIRY, byte2 [5->SPC-3] */
 #define DEF_SECTOR_SIZE 512
 #define DEF_UNMAP_ALIGNMENT 0
@@ -193,11 +194,11 @@ static unsigned int scsi_debug_unmap_granularity = DEF_UNMAP_GRANULARITY;
 static unsigned int scsi_debug_unmap_max_blocks = DEF_UNMAP_MAX_BLOCKS;
 static unsigned int scsi_debug_unmap_max_desc = DEF_UNMAP_MAX_DESC;
 static unsigned int scsi_debug_write_same_length = DEF_WRITESAME_LENGTH;
+static bool scsi_debug_removable = DEF_REMOVABLE;
 
 static int scsi_debug_cmnd_count = 0;
 
 #define DEV_READONLY(TGT) (0)
-#define DEV_REMOVEABLE(TGT)	(0)
 
 static unsigned int sdebug_store_sectors;
 static sector_t sdebug_capacity;	/* in sectors */
@@ -919,7 +920,7 @@ static int resp_inquiry(struct scsi_cmnd * scp, int target,
 		return ret;
 	}
 	/* drops through here for a standard inquiry */
-	arr[1] = DEV_REMOVEABLE(target) ? 0x80 : 0;	/* Removable disk */
+	arr[1] = scsi_debug_removable ? 0x80 : 0;	/* Removable disk */
 	arr[2] = scsi_debug_scsi_level;
 	arr[3] = 2;	/* response_data_format==2 */
 	arr[4] = SDEBUG_LONG_INQ_SZ - 5;
@@ -1211,7 +1212,7 @@ static int resp_format_pg(unsigned char * p, int pcontrol, int target)
 	p[11] = sdebug_sectors_per & 0xff;
 	p[12] = (scsi_debug_sector_size >> 8) & 0xff;
 	p[13] = scsi_debug_sector_size & 0xff;
-	if (DEV_REMOVEABLE(target))
+	if (scsi_debug_removable)
 		p[20] |= 0x20; /* should agree with INQUIRY */
 	if (1 == pcontrol)
 		memset(p + 2, 0, sizeof(format_pg) - 2);
@@ -2754,6 +2755,7 @@ module_param_named(opt_blks, scsi_debug_opt_blks, int, S_IRUGO);
 module_param_named(opts, scsi_debug_opts, int, S_IRUGO | S_IWUSR);
 module_param_named(physblk_exp, scsi_debug_physblk_exp, int, S_IRUGO);
 module_param_named(ptype, scsi_debug_ptype, int, S_IRUGO | S_IWUSR);
+module_param_named(removable, scsi_debug_removable, bool, S_IRUGO | S_IWUSR);
 module_param_named(scsi_level, scsi_debug_scsi_level, int, S_IRUGO);
 module_param_named(sector_size, scsi_debug_sector_size, int, S_IRUGO);
 module_param_named(unmap_alignment, scsi_debug_unmap_alignment, int, S_IRUGO);
@@ -2796,6 +2798,7 @@ MODULE_PARM_DESC(opt_blks, "optimal transfer length in block (def=64)");
 MODULE_PARM_DESC(opts, "1->noise, 2->medium_err, 4->timeout, 8->recovered_err... (def=0)");
 MODULE_PARM_DESC(physblk_exp, "physical block exponent (def=0)");
 MODULE_PARM_DESC(ptype, "SCSI peripheral type(def=0[disk])");
+MODULE_PARM_DESC(removable, "claim to have removable media (def=0)");
 MODULE_PARM_DESC(scsi_level, "SCSI level to simulate(def=5[SPC-3])");
 MODULE_PARM_DESC(sector_size, "logical block size in bytes (def=512)");
 MODULE_PARM_DESC(unmap_alignment, "lowest aligned thin provisioning lba (def=0)");
@@ -3205,6 +3208,25 @@ static ssize_t sdebug_map_show(struct device_driver *ddp, char *buf)
 }
 DRIVER_ATTR(map, S_IRUGO, sdebug_map_show, NULL);
 
+static ssize_t sdebug_removable_show(struct device_driver *ddp,
+				     char *buf)
+{
+	return scnprintf(buf, PAGE_SIZE, "%d\n", scsi_debug_removable ? 1 : 0);
+}
+static ssize_t sdebug_removable_store(struct device_driver *ddp,
+				      const char *buf, size_t count)
+{
+	int n;
+
+	if ((count > 0) && (1 == sscanf(buf, "%d", &n)) && (n >= 0)) {
+		scsi_debug_removable = (n > 0);
+		return count;
+	}
+	return -EINVAL;
+}
+DRIVER_ATTR(removable, S_IRUGO | S_IWUSR, sdebug_removable_show,
+	    sdebug_removable_store);
+
 
 /* Note: The following function creates attribute files in the
    /sys/bus/pseudo/drivers/scsi_debug directory. The advantage of these
@@ -3230,6 +3252,7 @@ static int do_create_driverfs_files(void)
 	ret |= driver_create_file(&sdebug_driverfs_driver, &driver_attr_num_tgts);
 	ret |= driver_create_file(&sdebug_driverfs_driver, &driver_attr_ptype);
 	ret |= driver_create_file(&sdebug_driverfs_driver, &driver_attr_opts);
+	ret |=
[PATCH 0/1] Option for scsi_debug to fake removable devices
Hello all,

I already re-sent this 1.5 months ago, but did not get any answer back then; I guess it got lost in the noise by now. So, patiently retrying again.

For the purposes of automatically testing udisks and gvfs automounting I would like to add a parameter to scsi_debug to control the removable attribute of the created block device. With that, we can test system-internal and removable drives, as well as CD-ROMs (which scsi_debug can already emulate). udisks requires different privileges for mounting system-internal drives vs. removable/hotpluggable drives. This will also allow us to write system integration tests for gvfs, which will exercise the whole stack including the actual polkit configuration in a VM.

I wrote a simple kernel patch for this (against linux-next), and tested this quite thoroughly. I ran the style checker, and it reports two problems:

--- 8< ---
WARNING: line over 80 characters
#109: FILE: drivers/scsi/scsi_debug.c:3255:
+	ret |= driver_create_file(&sdebug_driverfs_driver, &driver_attr_removable);

WARNING: Prefer pr_err(... to printk(KERN_ERR, ...
#126: FILE: drivers/scsi/scsi_debug.c:3353:
+	printk(KERN_ERR "scsi_debug_init: removable must be 0 or 1\n");
--- 8< ---

But as the existing code uses this style in the adjacent lines, I favored consistency over fixing those. If the latter is desired, I'd rather send a separate patch with just the style cleanup for the whole file.

I got an ack from David Zeuthen (the primary udisks maintainer) already, noted so in the patch.

Thank you in advance for considering,

Martin
-- 
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
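The parsing rule for such a writable on/off parameter can be sketched in userspace C. This is a minimal sketch, not the driver code: `demo_removable` and `demo_removable_store()` are hypothetical names, and it assumes the intended rule is "accept a non-negative integer, treat any value above zero as removable, reject everything else with -EINVAL".

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver-wide 'removable' setting. */
static bool demo_removable;

/*
 * Mimics a sysfs-style store callback: returns the number of bytes consumed
 * on success, or a negative errno value if the input is not acceptable.
 */
static long demo_removable_store(const char *buf, size_t count)
{
    int n;

    if (count > 0 && sscanf(buf, "%d", &n) == 1 && n >= 0) {
        demo_removable = (n > 0);
        return (long)count;
    }
    return -EINVAL;
}
```

Under these assumptions, writing "1" or "0" flips the flag and reports the whole write as consumed, while negative numbers, non-numeric text and empty writes leave the flag untouched.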
Re: [PATCH RFC tip/core/rcu] Add callback-free CPUs
On Wed, 2012-09-05 at 16:44 -0700, Paul E. McKenney wrote: I was excited by this possibility when you first mentioned it, but the low-OS-jitter fans are going to need the grace-period computation to be offloaded as well. Sure, but it seems to me pulling the grace period machinery out is a much harder feat and should be a patch (series) on its own. Also.. So if I use your (admittedly much simpler) approach, I get to rewrite it when Frederic's adaptive-ticks work goes in. I don't see how Frederic's work affects any of this, that would simply put RCU into extended quiescent state (aka. idle) while in userspace. In that state the grace period machinery is stopped altogether, so it doesn't matter who would've run it. Given that this is probably happening relatively soon, it would be better if I just did the implementation that will be needed long-term, rather than rewriting. Though I am sure that people will be sad about fewer RCU patches. ;-) Always... Now thinking about this grace machinery stuff a little more, would it be possible to stick the entire state machine in a kthread and replace all current hooks, like the tick and rcu_read_unlock_special, with a message passing construct such that they pass their event on to the kthread? That way you could run the entire state thing from a kthread with random affinity; all 'per-cpu' data would still be fine since only the one kthread will access it, even though locality might suffer somewhat. This would also not suffer from having to keep one cpu special and the ugly bouncing etc..
Re: [PATCH] virtio-blk: Fix kconfig option
On Thu, Sep 06, 2012 at 03:02:48AM -0700, Kent Overstreet wrote: On Thu, Sep 06, 2012 at 12:49:56PM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 02:25:12AM -0700, Kent Overstreet wrote: On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote: On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote: Kent Overstreet koverstr...@google.com writes: CONFIG_VIRTIO isn't exposed, everything else is supposed to select it instead. This is a slight mis-understanding. It's supposed to be selected by the particular driver, probably virtio_pci in your case. So are you saying virtio-blk depends on virtio-pci? If so, the kconfig should have that. As is, VIRTIO_BLK just has: depends on EXPERIMENTAL VIRTIO which is flat out broken. I don't think anything is broken. Can you show an example of a broken configuration? Do you not understand the difference between depends an selects? Or did you not read my original mail? Flip off everything in drivers - virtio Now go to drivers - block and try to turn on virtio-blk. It's not listed! Yes. Because you disabled all virtio backends. It does not make sense to have any frontends. How's a user - or even another kernel developer who isn't familiar with virtio - supposed to know that? I still don't know what exactly a virtio backend is - the term isn't even mentioned anywhere that I've seen. Whatever it is though virtio-blk should be depending on _that_, not a config option that _isn't exposed in the menu_! Now go back to drivers - virtio and turn on (randomly) balloon. Go back to drivers - block, and now you can turn on virtio-blk! Do you see what's wrong with this picture? Yes. You got unlucky with your random guess. It's a bug in balloon kconfig: it should not select virtio. I sent a patch to fix that yesterday. Then it's also a bug in the comments at the top of drivers/virtio/Kconfig. 
And besides that, how the _hell_ is a user supposed to know to turn on VIRTIO_PCI before VIRTIO_BLK? It's not documented anywhere (if that is what's supposed to happen! I still don't know) Well, what kind of device do you have? Tell us :) If it's a virtio pci device, you need to enable virtio-pci and virtio-blk. and even if it was documented, having one kconfig option depend on something that's exposed in a _completely different menu_ is just made of fail. Fine, but why pick on virtio? This is extremely common in kconfig. For example, a ton of network drivers depend on PCI, it's exactly the same thing. -- MST
Re: [PATCH 2/3] btrfs: remove unnecessary -ENOMEM BUG_ON check in extent-tree.c/btrfs_alloc_logged_file_extent
On Thu, Sep 06, 2012 at 02:41:02PM +0800, Wang Sheng-Hui wrote:
The memory allocation failure is BUG_ON in add_excluded_extent (following the code path). No need to BUG_ON -ENOMEM inside btrfs_alloc_logged_file_extent.

This indirectly calls __set_extent_bit that does BUG_ON on memory allocation failures. This type of error condition is hard to fix, as it usually needs to do non-trivial cleanups in the function before returning -ENOMEM, so the easiest way to handle it is to do BUG_ON and then separately deal with it in another patch (so it does not mix with the original patch).

Your patches remove (from my POV) useful marks that we have an error condition to handle, not to hide it. So, NAK from me for anything that looks like this. I'm of course glad to look at patches that replace the BUG_ON with proper error handling :)

david

Signed-off-by: Wang Sheng-Hui shh...@gmail.com
---
 fs/btrfs/extent-tree.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 95492cc..9b9a6fa 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6207,8 +6207,7 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
 		mutex_lock(&caching_ctl->mutex);
 
 		if (start >= caching_ctl->progress) {
-			ret = add_excluded_extent(root, start, num_bytes);
-			BUG_ON(ret); /* -ENOMEM */
+			add_excluded_extent(root, start, num_bytes);
 		} else if (start + num_bytes <= caching_ctl->progress) {
 			ret = btrfs_remove_free_space(block_group,
 						      start, num_bytes);
@@ -6222,8 +6221,7 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
 			start = caching_ctl->progress;
 			num_bytes = ins->objectid + ins->offset -
 				    caching_ctl->progress;
-			ret = add_excluded_extent(root, start, num_bytes);
-			BUG_ON(ret); /* -ENOMEM */
+			add_excluded_extent(root, start, num_bytes);
 		}
 
 		mutex_unlock(&caching_ctl->mutex);
Re: snd-usb: delay: estimated 0, actual 352
On 06.09.2012 09:35, Takashi Iwai wrote: At Thu, 6 Sep 2012 09:17:57 +0200, Markus Trippelsdorf wrote: On 2012.09.06 at 09:08 +0200, Daniel Mack wrote: On 06.09.2012 08:53, Markus Trippelsdorf wrote: On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote: At Thu, 06 Sep 2012 08:33:30 +0200, Daniel Mack wrote: On 06.09.2012 08:02, Markus Trippelsdorf wrote: On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote: Sound fixes for 3.6-rc5 There are nothing scaring, contains only small fixes for HD-audio and USB-audio: - EPSS regression fix and GPIO fix for HD-audio IDT codecs - A series of USB-audio regression fixes that are found since 3.5 kernel Daniel Mack (4): ALSA: snd-usb: Fix URB cancellation at stream start ALSA: snd-usb: restore delay information The commit fbcfbf5f above causes the following lines to be printed whenever I start a new song: Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which this patch (fbcfbf5f) brings back now. delay: estimated 0, actual 352 delay: estimated 353, actual 705 (44.1 * 8 = 352.8) This happens with an USB-DAC that identifies itself as C-Media USB Headphone Set. And you didn't you see these lines with 3.4? Maybe the difference of start condition? Markus, does the patch below fix anything? Unfortunately no. However reverting the following fixes the problem: commit 245baf983cc39524cce39c24d01b276e6e653c9e Author: Daniel Mack zon...@gmail.com Date: Thu Aug 30 18:52:30 2012 +0200 ALSA: snd-usb: fix calls to next_packet_size No, this one certainly fixes a problem and does the right thing by restoring the original code. If you wouldn't state that you didn't see the same effect with 3.4(!), before the refactoring done in 3.5, I would believe the device is simply slightly off in its feedback rate and the tighter delay code complains about it while compensating, just as it did before. Are there any more than these two lines? And is audio working at all? Is it distorted in any way? 
There are only these two lines (printed whenever sound starts). Audio is working just fine with no distortions. I did see similar lines before when the system load was very high (happened during make check when building glibc). Here is what Pierre-Louis wrote in November 2011: »This was supposed to be an informational message, I thought it was only enabled for debug. Regular users don't really need to know.« I guess the problem is that the new endpoint scheme doesn't count the last_delay update unless the stream is triggered. In the old code, retire_playback_urb is always called even before the trigger(START) is set. And, there retire_playback_urb() does nothing but update the delay information. In the new code, retire_playback_urb is set only at snd_usb_substream_playback_trigger(). Thus at the very first shot, the delay accounting gets confused. In that case, I'd say we can also safely remove the debug output then. Let's wait for Pierre-Louis' judgement here. Daniel
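The numbers in the two log lines line up with Markus's parenthetical "44.1 * 8 = 352.8". A small sketch of that arithmetic, assuming the 8 stands for 8 ms of audio at a 44.1 kHz sample rate (the thread does not say so explicitly) and that the reported values are truncated to whole frames; `frames_in_ms()` is a hypothetical helper:

```c
/*
 * Number of whole audio frames contained in `ms` milliseconds at `rate_hz`.
 * Integer division truncates the fractional frame.
 */
static unsigned int frames_in_ms(unsigned int rate_hz, unsigned int ms)
{
    return rate_hz * ms / 1000u;
}
```

Under these assumptions 8 ms at 44100 Hz gives 352 frames and 16 ms gives 705 frames, matching the "actual" values in the two messages, so each message would be off by roughly one 8 ms batch of frames.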
Re: snd-usb: delay: estimated 0, actual 352
On 2012.09.06 at 12:25 +0200, Daniel Mack wrote: On 06.09.2012 09:35, Takashi Iwai wrote: At Thu, 6 Sep 2012 09:17:57 +0200, Markus Trippelsdorf wrote: On 2012.09.06 at 09:08 +0200, Daniel Mack wrote: On 06.09.2012 08:53, Markus Trippelsdorf wrote: On 2012.09.06 at 08:48 +0200, Takashi Iwai wrote: At Thu, 06 Sep 2012 08:33:30 +0200, Daniel Mack wrote: On 06.09.2012 08:02, Markus Trippelsdorf wrote: On 2012.09.04 at 16:40 +0200, Takashi Iwai wrote: Sound fixes for 3.6-rc5 There are nothing scaring, contains only small fixes for HD-audio and USB-audio: - EPSS regression fix and GPIO fix for HD-audio IDT codecs - A series of USB-audio regression fixes that are found since 3.5 kernel Daniel Mack (4): ALSA: snd-usb: Fix URB cancellation at stream start ALSA: snd-usb: restore delay information The commit fbcfbf5f above causes the following lines to be printed whenever I start a new song: Copied Pierre-Louis Bossart - he wrote the code in 294c4fb8 which this patch (fbcfbf5f) brings back now. delay: estimated 0, actual 352 delay: estimated 353, actual 705 (44.1 * 8 = 352.8) This happens with an USB-DAC that identifies itself as C-Media USB Headphone Set. And you didn't you see these lines with 3.4? Maybe the difference of start condition? Markus, does the patch below fix anything? Unfortunately no. However reverting the following fixes the problem: commit 245baf983cc39524cce39c24d01b276e6e653c9e Author: Daniel Mack zon...@gmail.com Date: Thu Aug 30 18:52:30 2012 +0200 ALSA: snd-usb: fix calls to next_packet_size No, this one certainly fixes a problem and does the right thing by restoring the original code. If you wouldn't state that you didn't see the same effect with 3.4(!), before the refactoring done in 3.5, I would believe the device is simply slightly off in its feedback rate and the tighter delay code complains about it while compensating, just as it did before. Are there any more than these two lines? And is audio working at all? Is it distorted in any way? 
There are only these two lines (printed whenever sound starts). Audio is working just fine with no distortions. I did see similar lines before when the system load was very high (happened during make check when building glibc). Here is what Pierre-Louis wrote in November 2011: »This was supposed to be an informational message, I thought it was only enabled for debug. Regular users don't really need to know.«

I guess the problem is that the new endpoint scheme doesn't count the last_delay update unless the stream is triggered. In the old code, retire_playback_urb is always called even before the trigger(START) is set. And there, retire_playback_urb() does nothing but update the delay information. In the new code, retire_playback_urb is set only at snd_usb_substream_playback_trigger(). Thus at the very first shot, the delay accounting gets confused.

In that case, I'd say we can also safely remove the debug output then. Let's wait for Pierre-Louis' judgement here.

v3.5 and v3.6-rc4 with commit fbcfbf5f67 (restore delay information) applied on top are both fine.

-- Markus
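A note on the numbers in the log: the "(44.1 * 8 = 352.8)" remark in the quoted text suggests that the "actual" values are simply the frames queued in whole packets. The sketch below reproduces 352 and 705 under the assumption of 1 ms packets at 44.1 kHz, counted 8 and 16 packets at a time. The function name is hypothetical and not part of snd-usb.

```c
/*
 * Hypothetical helper, not taken from the driver: number of whole audio
 * frames contained in `packets` packets of 1 ms each at `rate_hz`.
 * The 1 ms packet length is an assumption based on the "44.1 * 8" remark.
 */
unsigned int frames_after_packets(unsigned int rate_hz, unsigned int packets)
{
	/* rate_hz / 1000 frames per millisecond; truncate the fraction last */
	return (unsigned int)((unsigned long long)rate_hz * packets / 1000ULL);
}
```

With rate_hz = 44100, 8 packets give 352 frames and 16 packets give 705, which lines up with the two "actual" values reported above.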
Re: [PATCH] virtio-blk: Fix kconfig option
On Thu, Sep 06, 2012 at 01:18:43PM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 03:02:48AM -0700, Kent Overstreet wrote: On Thu, Sep 06, 2012 at 12:49:56PM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 02:25:12AM -0700, Kent Overstreet wrote: On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote: On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote: Kent Overstreet koverstr...@google.com writes: CONFIG_VIRTIO isn't exposed, everything else is supposed to select it instead. This is a slight mis-understanding. It's supposed to be selected by the particular driver, probably virtio_pci in your case. So are you saying virtio-blk depends on virtio-pci? If so, the kconfig should have that. As is, VIRTIO_BLK just has: depends on EXPERIMENTAL VIRTIO which is flat out broken. I don't think anything is broken. Can you show an example of a broken configuration? Do you not understand the difference between depends an selects? Or did you not read my original mail? Flip off everything in drivers - virtio Now go to drivers - block and try to turn on virtio-blk. It's not listed! Yes. Because you disabled all virtio backends. It does not make sense to have any frontends. How's a user - or even another kernel developer who isn't familiar with virtio - supposed to know that? I still don't know what exactly a virtio backend is - the term isn't even mentioned anywhere that I've seen. Whatever it is though virtio-blk should be depending on _that_, not a config option that _isn't exposed in the menu_! Now go back to drivers - virtio and turn on (randomly) balloon. Go back to drivers - block, and now you can turn on virtio-blk! Do you see what's wrong with this picture? Yes. You got unlucky with your random guess. It's a bug in balloon kconfig: it should not select virtio. I sent a patch to fix that yesterday. 
Then it's also a bug in the comments at the top of drivers/virtio/Kconfig. And besides that, how the _hell_ is a user supposed to know to turn on VIRTIO_PCI before VIRTIO_BLK? It's not documented anywhere (if that is what's supposed to happen! I still don't know) Well, what kind of device do you have? Tell us :) If it's a virtio pci device, you need to enable virtio-pci and virtio-blk. I run qemu with -drive if=virtio. You tell me! Better yet, tell me how the user is supposed to figure it out! and even if it was documented, having one kconfig option depend on something that's exposed in a _completely different menu_ is just made of fail. Fine, but why pick on virtio? This is extremely common in kconfig. For example, a ton of network drivers depend on PCI, it's exactly the same thing. Never noticed where CONFIG_PCI is exposed in bus options? Nope, not the same thing.
Re: [PATCH can-next v6] can: add tx/rx LED trigger support
On Tue, Sep 04, 2012 at 10:15:53PM +0200, Fabio Baltieri wrote: On Tue, Sep 04, 2012 at 09:11:28AM +0200, Kurt Van Dijck wrote: On Mon, Sep 03, 2012 at 10:54:49PM +0200, Oliver Hartkopp wrote: On 03.09.2012 20:29, Fabio Baltieri wrote: [...] The name of the device can only be changed when the interface is down. Is it possible to put some scripting around it to detach and attach the leds to the interfaces on ifup/ifdown triggers? Are the led triggers available for using while the netdev is down then? Sure! On embedded systems triggers are usually attached to actual LEDs at probe time using default_trigger field of struct led_classdev, and that can be specified both in machine files or in device tree. I also think that led triggers should be available. I asked the question because detach attach leds to interfaces would indeed break that. btw, I tried to send a patch tuesday (my first $ git send-email) using netdev notifiers: did you receive it, and what do you think of it? Kurt
[GIT] kbuild rc fixes for v3.6 (v2)
Hi Linus,

there are two fixes that should go into 3.6. The link-vmlinux.sh one is obvious. The other one fixes make firmware_install with certain configurations, where a file in the toplevel firmware tree gets installed first, and $(INSTALL_FW_PATH)/$$(dir file) results in /lib/firmware/./, which confuses make 3.82 for some reason.

v2: This time with the correct URL.

Thanks,
Michal

The following changes since commit 0d7614f09c1ebdbaa1599a5aba7593f147bf96ee:

  Linux 3.6-rc1 (2012-08-02 16:38:10 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild.git rc-fixes

for you to fetch changes up to 6c7080a61fc7b46b3ac8573952b5a3e9d5f68bc4:

  firmware: fix directory creation rule matching with make 3.82 (2012-08-30 16:27:13 +0200)

Mark Asselstine (1):
      firmware: fix directory creation rule matching with make 3.82

Michal Marek (1):
      link-vmlinux.sh: Fix stray echo in error message

 scripts/Makefile.fwinst |    2 +-
 scripts/link-vmlinux.sh |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
Re: [GIT] kbuild rc fixes for v3.6
On 6.9.2012 07:19, Stephen Rothwell wrote: On Wed, 5 Sep 2012 20:32:00 -0700 Linus Torvalds torva...@linux-foundation.org wrote: On Mon, Sep 3, 2012 at 12:18 PM, Michal Marek mma...@suse.cz wrote: are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6.git ..BRANCH.NOT.VERIFIED.. There's something wrong with that repo fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. plus that BRANCH.NOT.VERIFIED thing looks bad too (and is probably related). Oops. The kbuild-current tree in linux-next is git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild.git rc-fixes and it has those 2 commits in it. So script update needed, I guess. I did update my script, but the last pull request was sent from a machine that had the old copy :). Michal
Re: [PATCH 10/11 V5] workqueue: unbind/rebind without manager_mutex
On 09/06/2012 04:04 AM, Tejun Heo wrote:

Hello, Lai.

On Wed, Sep 05, 2012 at 06:37:47PM +0800, Lai Jiangshan wrote:

gcwq_unbind_fn() unbinds the manager via the ->manager pointer. Rebinding the manager and unbinding/rebinding newly created workers are done elsewhere, so we don't need manager_mutex any more. Also change the comment of @bind accordingly.

Please don't scatter small prep patches like this. Each piece in isolation doesn't make much sense to me and the patch descriptions don't help much. Please collect the prep patches and explain in more detail.

There are 4 different tasks: unbind/rebind of manager/newbie. 1 task for 1 patch. If I collect them into one patch, it will be hard to explain which code does which task.

In general, I'm not sure about this approach. I'd really like the hotplug logic to be contained in hotplug logic proper as much as possible. This scatters around hotplug handling to usual code paths and seems too invasive for 3.6-fixes.

I don't expect to fix it in 3.6. No approach is simple.

Also, can you please talk to me before going ahead and sending me completely new 10 patch series every other day? You're taking disproportionate amount of my time and I can't continue to do this. Please discuss with me or at least explain the high-level approach in the head message in detail. Going through the patch series to figure out high-level design which is constantly flipping is rather inefficient and unfortunately your patch descriptions aren't too helpful. :(

I'm not good at English, so I prefer to attach code when I show my idea (and the code can prove the idea). I admit that my changelogs and comments are always bad.

I have 4 ideas/approaches for the bug of hotplug vs. manage_workers(). They all came to my mind last week. NOTE: this V5 patch is my approach 2.

(listed in the order they came to my mind)

Approach 1   V3 patchset      non_manager_role_manager_mutex_unlock()
Approach 2   V5 patchset      rebind manager, unbind/rebind newbie are done outside; no manage mutex for hotplug
Approach 3   un-implemented   move unbind/rebind to worker_thread and handle them as POOL_MANAGE_WORKERS
Approach 4   V4 patchset      manage_workers_slowpath()

Approaches 2 and 3 were partially implemented last week, but approach 2 was quickly finished yesterday. Approach 3 is too complicated to finish.

Approach 1: the simplest. After it, we can use manage_mutex anywhere as needed, but we need to use non_manager_role_manager_mutex_unlock() to unlock.

Approach 2: the binding of the manager and of newly created workers is handled outside of the hotplug code, thus the hotplug code doesn't need manage_mutex. manage_mutex is a typical protect-code pattern, which is not good. We should always use a lock to protect data instead of protecting code. Although there are many locks in the Linux kernel which are only used for protecting code, I think we can reduce them as far as possible. The removal of the BIG-KERNEL-LOCK is an example. This approach also has fewer lines of code, but it touches 2 places outside of the hotplug code and the logic/paths are increasing. GOOD to me: disallows manage_mutex (for the future), not too much code.

Approach 3: complicated. Makes the call site and context of unbind/rebind the same as manage_workers(). BAD: we can't freely use manage_mutex in the future when needed, and it encounters some other problems (your suggested approach will also have some of the problems I encountered).

Approach 4: the problem comes from manage_workers(), so just add manage_workers_slowpath() to fix it inside manage_workers(). It fixes the problem in only 1 bulk of code. After it, we can use manage_mutex anywhere as needed. It has more lines of code, but they are all in one place. GOOD: the cleanest approach.

Thanks
Lai
Re: [PATCH] virtio-balloon spec: provide a version of the silent deflate feature that works
On Thu, Sep 06, 2012 at 11:57:22AM +0200, Paolo Bonzini wrote: Il 06/09/2012 11:44, Michael S. Tsirkin ha scritto: In fact, it's not clear how the driver should use the feature. My guess is that, if it wants to use silent deflate, it tries to negotiate VIRTIO_BALLOON_F_MUST_TELL_HOST, and can use silent deflate if negotiation fails. This is against the logic of all other features. Let's take a step back from the implementation details. You are trying to add a new feature bit, after all. Why? Why is silent deflate useful? This is what is missing in all this discussion. If it is not useful we do not need a bit for it. It is useful because it lets guests inflate the balloon aggressively, and then use ballooned-out pages even in places where the guest OS cannot sleep, such as kmalloc(GFP_ATOMIC). Interesting. Do you intend to develop a driver patch using this? I'd like to see how that works. Because if not, IMO it's best to wait until someone asks for it. Can you show a scenario with old driver/new hypervisor or new driver/old hypervisor that fails? Sorry this is not the example I asked for. Please give and example without migration. Migration is qemu's problem: it is hypervisor's job to make sure guest sees no change during migration. Quoting my message: Of course you can just teach QEMU to be smarter, but that would be a one-off hack for the only ill-defined feature that says something is _not_ supported. Currently migration works the same way for all virtio devices, and assumes that features are defined only in the positive direction: drivers request features if they want to use it, devices provide features to say they support something. Well this approach is buggy. If I reread features after migration what do I see? Something changed right? So this is a bug. Migration should not change hardware. And it is not a one off thing it is fundamental for any hardware. Fix that in qemu, and the problem goes away without spec changes. 
Instead, in the case of this feature, the driver requests it before relying on its lack (which is odd); Which code in driver do you refer to? the device provides if they do not support something (which is wrong). Not support? It just seems to be asking guest to tell it about deflates. If guest acks the bit, we know it will. If it does not, it will not. You can see that this just cannot provide backwards-compatibility in the device; Sorry I do not understand this meta argument. There should be an example where a driver and device fail to work together. And without migration: as I showed migration is simply broken atm for an unrelated reason. Otherwise all's well. it happens to work only because the feature was there in the first version of the spec. This is how we do compatibility in virtio. If we want driver to do something, we add a feature and it can ack, if it does we know it will do what we want. Another example is network announce bit. If driver acks it, we know we do not need to send gratuitous arp from qemu. You are saying it is also broken? It should be able to do this with any hardware it emulates, there should be no need to change hardware to make it migrateable somehow. Of course, but if we can fix the hardware with no bad effects, let's do that instead. Paolo Don't fix what is not broken. We get to carry compatibility in both driver and host for a long time for each feature. Note: adding new features adds zero value in this respect - it will not allow simplifying the hypervisor. -- MST
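The negotiation both sides are arguing about can be sketched in a few lines of C. This is a minimal illustration, not code from any driver or from QEMU: the helper names are hypothetical, a 32-bit feature word is assumed, and the only value taken from the virtio specification is the bit number of VIRTIO_BALLOON_F_MUST_TELL_HOST (bit 0). It encodes the reading discussed in the thread: the device offers the bit, the driver may ack it, and silent deflate is only possible when the bit did not end up negotiated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number as defined by the virtio balloon device specification. */
#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0

/* Hypothetical helper: a feature is in effect only if offered and acked. */
uint32_t virtio_negotiate(uint32_t device_features, uint32_t driver_acked)
{
	return device_features & driver_acked;
}

/* Hypothetical helper: the guest must report deflates if the bit is set. */
bool balloon_must_tell_host(uint32_t negotiated)
{
	return (negotiated & (1u << VIRTIO_BALLOON_F_MUST_TELL_HOST)) != 0;
}

/* Silent deflate is the absence of the negotiated bit, which is the
 * "negative" feature semantics criticized in the thread. */
bool balloon_may_deflate_silently(uint32_t negotiated)
{
	return !balloon_must_tell_host(negotiated);
}
```

The sketch makes the asymmetry visible: every other feature enables behaviour when the bit is set, while here the interesting behaviour is unlocked by the bit being clear.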
Re: [PATCH] virtio-blk: Fix kconfig option
On Thu, Sep 06, 2012 at 03:31:44AM -0700, Kent Overstreet wrote: On Thu, Sep 06, 2012 at 01:18:43PM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 03:02:48AM -0700, Kent Overstreet wrote: On Thu, Sep 06, 2012 at 12:49:56PM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 02:25:12AM -0700, Kent Overstreet wrote: On Thu, Sep 06, 2012 at 11:44:03AM +0300, Michael S. Tsirkin wrote: On Thu, Sep 06, 2012 at 12:41:13AM -0700, Kent Overstreet wrote: On Tue, Sep 04, 2012 at 03:53:53PM +0930, Rusty Russell wrote: Kent Overstreet koverstr...@google.com writes: CONFIG_VIRTIO isn't exposed, everything else is supposed to select it instead. This is a slight mis-understanding. It's supposed to be selected by the particular driver, probably virtio_pci in your case. So are you saying virtio-blk depends on virtio-pci? If so, the kconfig should have that. As is, VIRTIO_BLK just has: depends on EXPERIMENTAL VIRTIO which is flat out broken. I don't think anything is broken. Can you show an example of a broken configuration? Do you not understand the difference between depends an selects? Or did you not read my original mail? Flip off everything in drivers - virtio Now go to drivers - block and try to turn on virtio-blk. It's not listed! Yes. Because you disabled all virtio backends. It does not make sense to have any frontends. How's a user - or even another kernel developer who isn't familiar with virtio - supposed to know that? I still don't know what exactly a virtio backend is - the term isn't even mentioned anywhere that I've seen. Whatever it is though virtio-blk should be depending on _that_, not a config option that _isn't exposed in the menu_! Now go back to drivers - virtio and turn on (randomly) balloon. Go back to drivers - block, and now you can turn on virtio-blk! Do you see what's wrong with this picture? Yes. You got unlucky with your random guess. It's a bug in balloon kconfig: it should not select virtio. I sent a patch to fix that yesterday. 
Then it's also a bug in the comments at the top of drivers/virtio/Kconfig. And besides that, how the _hell_ is a user supposed to know to turn on VIRTIO_PCI before VIRTIO_BLK? It's not documented anywhere (if that is what's supposed to happen! I still don't know) Well, what kind of device do you have? Tell us :) If it's a virtio pci device, you need to enable virtio-pci and virtio-blk. I run qemu with -drive if=virtio. You tell me! -drive if= is a compatibility option. qemu makes an effort to guess what it is you want to do. Result is usually correct but it means people building their own kernels get confused. For x86 kvm the modern equivalent is: -device virtio-blk-pci,drive=foobar -drive if=no,... If you use this you get what you asked for :). Yes this usage is not documented anywhere, but this is not guest driver's problem. Better yet, tell me how the user is supposed to figure it out! As usual when you do not know which driver to select. Boot a distro kernel and look around. Where is your virtio device? On a pci bus? There you are. and even if it was documented, having one kconfig option depend on something that's exposed in a _completely different menu_ is just made of fail. Fine, but why pick on virtio? This is extremely common in kconfig. For example, a ton of network drivers depend on PCI, it's exactly the same thing. Never noticed where CONFIG_PCI is exposed in bus options? I see it:

CONFIG_PCI:
Find out whether you have a PCI motherboard. PCI is the name of a bus system, i.e. the way the CPU talks to the other stuff inside your box. Other bus systems are ISA, EISA, MicroChannel (MCA) or VESA. If you have PCI, say Y, otherwise N.

Nope, not the same thing. You just happen to know what PCI is but not what VIRTIO PCI is. This is fair enough, but not sure how to help in this case. Your patch won't help though.
-- MST
[PATCH RFC] mm/swap: automatic tuning for swapin readahead
This patch adds a simple tracker for swapin readahead effectiveness and tunes the readahead cluster depending on it. It manages an internal state [0..1024] and scales the readahead order between 0 and the value from sysctl vm.page-cluster (3 by default). Swapouts and readahead misses decrease the state, swapins and readahead hits increase it:

 Swapin          +1   [page fault, shmem, etc... ]
 Swapout         -10
 Readahead hit   +10
 Readahead miss  -1   [removing from swapcache unused readahead page]

If the system is under serious memory pressure, swapin readahead is useless, because pages in swap are highly fragmented and a cache hit is mostly impossible. In this case swapin only leads to unnecessary memory allocations. But readahead helps to read all swapped pages back to memory if the system recovers from memory pressure.

This patch is inspired by a patch from Shaohua Li
http://www.spinics.net/lists/linux-mm/msg41128.html
My version uses system-wide state rather than per-VMA counters.

Signed-off-by: Konstantin Khlebnikov khlebni...@openvz.org
Cc: Shaohua Li s...@kernel.org
Cc: Rik van Riel r...@redhat.com
Cc: Minchan Kim minc...@kernel.org
---
 include/linux/page-flags.h |    1 +
 mm/swap_state.c            |   42 +-
 2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index b5d1384..3657cdc 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -231,6 +231,7 @@ PAGEFLAG(MappedToDisk, mappedtodisk)
 /* PG_readahead is only used for file reads; PG_reclaim is only for writes */
 PAGEFLAG(Reclaim, reclaim) TESTCLEARFLAG(Reclaim, reclaim)
 PAGEFLAG(Readahead, reclaim)		/* Reminder to do async read-ahead */
+TESTCLEARFLAG(Readahead, reclaim)
 
 #ifdef CONFIG_HIGHMEM
 /*
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 0cb36fb..d6c7a88 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -53,12 +53,31 @@ static struct {
 	unsigned long find_total;
 } swap_cache_info;
 
+#define SWAP_RA_BITS	10
+
+static atomic_t swap_ra_state = ATOMIC_INIT((1 << SWAP_RA_BITS) - 1);
+static int swap_ra_cluster = 1;
+
+static void swap_ra_update(int delta)
+{
+	int old_state, new_state;
+
+	old_state = atomic_read(&swap_ra_state);
+	new_state = clamp(old_state + delta, 0, 1 << SWAP_RA_BITS);
+	if (old_state != new_state) {
+		atomic_set(&swap_ra_state, new_state);
+		swap_ra_cluster = (page_cluster * new_state) >> SWAP_RA_BITS;
+	}
+}
+
 void show_swap_cache_info(void)
 {
 	printk("%lu pages in swap cache\n", total_swapcache_pages);
-	printk("Swap cache stats: add %lu, delete %lu, find %lu/%lu\n",
+	printk("Swap cache stats: add %lu, delete %lu, find %lu/%lu,"
+		" readahead %d/%d\n",
 		swap_cache_info.add_total, swap_cache_info.del_total,
-		swap_cache_info.find_success, swap_cache_info.find_total);
+		swap_cache_info.find_success, swap_cache_info.find_total,
+		1 << swap_ra_cluster, atomic_read(&swap_ra_state));
 	printk("Free swap = %ldkB\n", nr_swap_pages << (PAGE_SHIFT - 10));
 	printk("Total swap = %lukB\n", total_swap_pages << (PAGE_SHIFT - 10));
 }
@@ -112,6 +131,8 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp_mask)
 	if (!error) {
 		error = __add_to_swap_cache(page, entry);
 		radix_tree_preload_end();
+		/* FIXME weird place */
+		swap_ra_update(-10); /* swapout, decrease readahead */
 	}
 	return error;
 }
@@ -132,6 +153,8 @@ void __delete_from_swap_cache(struct page *page)
 	total_swapcache_pages--;
 	__dec_zone_page_state(page, NR_FILE_PAGES);
 	INC_CACHE_INFO(del_total);
+	if (TestClearPageReadahead(page))
+		swap_ra_update(-1); /* readahead miss */
 }
 
 /**
@@ -265,8 +288,11 @@ struct page * lookup_swap_cache(swp_entry_t entry)
 
 	page = find_get_page(&swapper_space, entry.val);
 
-	if (page)
+	if (page) {
 		INC_CACHE_INFO(find_success);
+		if (TestClearPageReadahead(page))
+			swap_ra_update(+10); /* readahead hit */
+	}
 
 	INC_CACHE_INFO(find_total);
 	return page;
@@ -374,11 +400,14 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
 			struct vm_area_struct *vma, unsigned long addr)
 {
 	struct page *page;
-	unsigned long offset = swp_offset(entry);
+	unsigned long entry_offset = swp_offset(entry);
+	unsigned long offset = entry_offset;
 	unsigned long start_offset, end_offset;
-	unsigned long mask = (1UL << page_cluster) - 1;
+	unsigned long mask = (1UL << swap_ra_cluster) - 1;
 	struct blk_plug plug;
 
+	swap_ra_update(+1); /* swapin, increase readahead */
+
 	/* Read a page_cluster sized and aligned cluster around offset. */
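To get a feel for how the counter from this patch behaves, here is a small userspace model of the swap_ra_update() logic. It is only a sketch: the struct and function names are made up for illustration, atomics are left out, and page_cluster is passed in as a plain parameter instead of coming from the sysctl.

```c
/* Userspace model of the readahead state counter; not kernel code. */
#define SWAP_RA_BITS 10
#define SWAP_RA_MAX  (1 << SWAP_RA_BITS)

struct swap_ra_model {
	int state;        /* effectiveness score, kept in [0, SWAP_RA_MAX] */
	int page_cluster; /* upper bound for the order (vm.page-cluster) */
	int cluster;      /* current readahead order */
};

int swap_ra_clamp(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Same initial values as the patch: state 1023, cluster order 1. */
void swap_ra_model_init(struct swap_ra_model *m, int page_cluster)
{
	m->state = SWAP_RA_MAX - 1;
	m->page_cluster = page_cluster;
	m->cluster = 1;
}

/* delta: +1 swapin, -10 swapout, +10 readahead hit, -1 readahead miss */
void swap_ra_model_update(struct swap_ra_model *m, int delta)
{
	int new_state = swap_ra_clamp(m->state + delta, 0, SWAP_RA_MAX);

	if (new_state != m->state) {
		m->state = new_state;
		/* scale the order linearly with the state */
		m->cluster = (m->page_cluster * new_state) >> SWAP_RA_BITS;
	}
}
```

With page_cluster = 3, the full order of 3 is only reached at state 1024; a single swapout from there (state 1014) already drops the order to 2, and a sustained run of swapouts drives both state and order to 0, which matches the "readahead is useless under serious memory pressure" reasoning in the description.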
Re: [PATCH can-next v6] can: add tx/rx LED trigger support
Hi Kurt, On Thu, Sep 6, 2012 at 12:33 PM, Kurt Van Dijck kurt.van.di...@eia.be wrote: On Tue, Sep 04, 2012 at 10:15:53PM +0200, Fabio Baltieri wrote: On Tue, Sep 04, 2012 at 09:11:28AM +0200, Kurt Van Dijck wrote: [...] The name of the device can only be changed when the interface is down. Is it possible to put some scripting around it to detach and attach the leds to the interfaces on ifup/ifdown triggers? Are the led triggers available for using while the netdev is down then? Sure! On embedded systems triggers are usually attached to actual LEDs at probe time using default_trigger field of struct led_classdev, and that can be specified both in machine files or in device tree. I also think that led triggers should be available. Right, that's why I think the only way is to use device name. I asked the question because detach attach leds to interfaces would indeed break that. Sure? I think that the trigger would be set again on reattach, as default_trigger is checked both in led_cdev probe and trigger_register, see: http://lxr.free-electrons.com/source/drivers/leds/led-triggers.c#L180 I'll try that tonight. btw, I tried to send a patch tuesday (my first $ git send-email) using netdev notifiers: did you receive it, and what do you think of it? Sure, I got it! I was planning to try that this weekend but I can give you some comments earlier tonight... sorry for the delay! Fabio -- Fabio Baltieri
[RFC v2 PATCH 00/21] KVM: x86: CPU isolation and direct interrupts delivery to guests
This RFC patch series provides a facility to dedicate CPUs to KVM guests and enable the guests to handle interrupts from passed-through PCI devices directly (without VM exit and relay by the host).

With this feature, we can improve throughput and response time of the device and the host's CPU usage by reducing the overhead of interrupt handling. This is good for applications using very high throughput/frequent interrupt devices (e.g. 10GbE NIC). Real-time applications also benefit from the CPU isolation feature, which reduces interference from host kernel tasks and scheduling delay.

The overview of this patch series is presented in CloudOpen 2012. The slides are available at:
http://events.linuxfoundation.org/images/stories/pdf/lcna_co2012_sekiyama.pdf

* Changes from v1 ( https://lkml.org/lkml/2012/6/28/30 )
 - SMP guest is supported
 - Direct EOI is added, which eliminates VM exit on EOI
 - Direct local APIC timer access from guests is added, which passes through the physical timer of a dedicated CPU to the guest.
 - Rebased on v3.6-rc4

* How to test
 - Create a guest VM with 1 CPU and some PCI passthrough devices (which support MSI/MSI-X). No VGA display will be better...
 - Apply the patch at the end of this mail to qemu-kvm. (This patch is just for simple testing, and the dedicated CPU ID for the guest is hard-coded.)
 - Run the guest once to ensure the PCI passthrough works correctly.
 - Make the specified CPU offline.
     # echo 0 > /sys/devices/system/cpu/cpu3/online
 - Launch qemu-kvm with the -no-kvm-pit option. The offlined CPU is booted as a slave CPU and the guest runs on that CPU.
* To-do
 - Enable slave CPUs to handle access fault
 - Support AMD SVM
 - Support non-Linux guests

---

Tomoki Sekiyama (21):
      x86: request TLB flush to slave CPU using NMI
      KVM: Pass-through local APIC timer of on slave CPUs to guest VM
      KVM: Enable direct EOI for directly routed interrupts to guests
      KVM: route assigned devices' MSI/MSI-X directly to guests on slave CPUs
      KVM: add kvm_arch_vcpu_prevent_run to prevent VM ENTER when NMI is received
      KVM: vmx: Add definitions PIN_BASED_PREEMPTION_TIMER
      KVM: add tracepoint on enabling/disabling direct interrupt delivery
      KVM: Directly handle interrupts by guests without VM EXIT on slave CPUs
      x86/apic: IRQ vector remapping on slave for slave CPUs
      x86/apic: Enable external interrupt routing to slave CPUs
      KVM: no exiting from guest when slave CPU halted
      KVM: proxy slab operations for slave CPUs on online CPUs
      KVM: Go back to online CPU on VM exit by external interrupt
      KVM: Add KVM_GET_SLAVE_CPU and KVM_SET_SLAVE_CPU to vCPU ioctl
      KVM: handle page faults of slave guests on online CPUs
      KVM: Add facility to run guests on slave CPUs
      KVM: Enable/Disable virtualization on slave CPUs are activated/dying
      x86: Avoid RCU warnings on slave CPUs
      x86: Support hrtimer on slave CPUs
      x86: Add a facility to use offlined CPUs as slave CPUs
      x86: Split memory hotplug function from cpu_up() as cpu_memory_up()

 arch/x86/Kconfig                      |   10 +
 arch/x86/include/asm/apic.h           |   10 +
 arch/x86/include/asm/irq.h            |   15 +
 arch/x86/include/asm/kvm_host.h       |   59 +
 arch/x86/include/asm/tlbflush.h       |    5
 arch/x86/include/asm/vmx.h            |    3
 arch/x86/kernel/apic/apic.c           |   11 +
 arch/x86/kernel/apic/io_apic.c        |  111 -
 arch/x86/kernel/apic/x2apic_cluster.c |    8 -
 arch/x86/kernel/cpu/common.c          |    5
 arch/x86/kernel/smp.c                 |    2
 arch/x86/kernel/smpboot.c             |  264 ++-
 arch/x86/kvm/irq.c                    |  136
 arch/x86/kvm/lapic.c                  |   56 +
 arch/x86/kvm/lapic.h                  |    2
 arch/x86/kvm/mmu.c                    |   63 -
 arch/x86/kvm/mmu.h                    |    4
 arch/x86/kvm/trace.h                  |   19 ++
 arch/x86/kvm/vmx.c                    |  180 +++
 arch/x86/kvm/x86.c                    |  387 +++--
 arch/x86/kvm/x86.h                    |    9 +
 arch/x86/mm/tlb.c                     |   94
 drivers/iommu/intel_irq_remapping.c   |   32 ++-
 include/linux/cpu.h                   |   36 +++
 include/linux/cpumask.h               |   26 ++
 include/linux/kvm.h                   |    4
 include/linux/kvm_host.h              |    2
 kernel/cpu.c                          |   83 +--
 kernel/hrtimer.c                      |   14 +
 kernel/irq/manage.c                   |    4
 kernel/irq/migration.c                |    2
 kernel/irq/proc.c                     |    2
 kernel/rcutree.c                      |   14 +
 kernel/smp.c                          |    9 +
 virt/kvm/assigned-dev.c               |    8 +
 virt/kvm/async_pf.c                   |   17 +
 virt/kvm/kvm_main.c                   |   32 +++
 37
[RFC v2 PATCH 08/21] KVM: Add KVM_GET_SLAVE_CPU and KVM_SET_SLAVE_CPU to vCPU ioctl
Add an interface to set/get the slave CPU dedicated to a vCPU.

By calling ioctl with KVM_GET_SLAVE_CPU, users can get the slave CPU id
for the vCPU. -1 is returned if a slave CPU is not set.

By calling ioctl with KVM_SET_SLAVE_CPU, users can dedicate the specified
slave CPU to the vCPU. The CPU must be offlined before calling ioctl.
The CPU is activated as a slave CPU for the vCPU when a valid id is set.
The slave CPU is freed and offlined by setting -1 as the slave CPU id.

Whether getting/setting slave CPUs is supported by KVM can be determined
by checking KVM_CAP_SLAVE_CPU.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama...@hitachi.com>
Cc: Avi Kivity <a...@redhat.com>
Cc: Marcelo Tosatti <mtosa...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
---

 arch/x86/include/asm/kvm_host.h |    2 +
 arch/x86/kvm/vmx.c              |    7 +
 arch/x86/kvm/x86.c              |   58 +++
 include/linux/kvm.h             |    4 +++
 4 files changed, 71 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8dc1a0a..0ea04c9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -718,6 +718,8 @@ struct kvm_x86_ops {
 	int (*check_intercept)(struct kvm_vcpu *vcpu,
 			       struct x86_instruction_info *info,
 			       enum x86_intercept_stage stage);
+
+	void (*set_slave_mode)(struct kvm_vcpu *vcpu, bool slave);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c5db714..7bbfa01 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1698,6 +1698,11 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	vmx_set_interrupt_shadow(vcpu, 0);
 }
 
+static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
+{
+	/* Nothing */
+}
+
 /*
  * KVM wants to inject page-faults which it got to the guest. This function
  * checks whether in a nested guest, we need to inject them to L1 or L2.
@@ -7344,6 +7349,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.set_tdp_cr3 = vmx_set_cr3,
 
 	.check_intercept = vmx_check_intercept,
+
+	.set_slave_mode = vmx_set_slave_mode,
 };
 
 static int __init vmx_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 579c41c..b62f59c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2183,6 +2183,9 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_GET_TSC_KHZ:
 	case KVM_CAP_PCI_2_3:
 	case KVM_CAP_KVMCLOCK_CTRL:
+#ifdef CONFIG_SLAVE_CPU
+	case KVM_CAP_SLAVE_CPU:
+#endif
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -2657,6 +2660,48 @@ static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+#ifdef CONFIG_SLAVE_CPU
+/* vcpu currently running on each slave CPU */
+static DEFINE_PER_CPU(struct kvm_vcpu *, slave_vcpu);
+
+static int kvm_arch_vcpu_ioctl_set_slave_cpu(struct kvm_vcpu *vcpu,
+					     int slave, int set_slave_mode)
+{
+	int old = vcpu->arch.slave_cpu;
+	int r = -EINVAL;
+
+	if (slave >= nr_cpu_ids || (slave >= 0 && cpu_online(slave)))
+		goto out;
+	if (slave >= 0 && slave != old && cpu_slave(slave))
+		goto out; /* new slave cpu must be offlined */
+
+	if (old >= 0 && slave != old) {
+		BUG_ON(old >= nr_cpu_ids || !cpu_slave(old));
+		per_cpu(slave_vcpu, old) = NULL;
+		r = slave_cpu_down(old);
+		if (r) {
+			pr_err("kvm: slave_cpu_down %d failed\n", old);
+			goto out;
+		}
+	}
+
+	if (slave >= 0) {
+		r = slave_cpu_up(slave);
+		if (r)
+			goto out;
+		BUG_ON(!cpu_slave(slave));
+		per_cpu(slave_vcpu, slave) = vcpu;
+	}
+
+	vcpu->arch.slave_cpu = slave;
+	if (set_slave_mode && kvm_x86_ops->set_slave_mode)
+		kvm_x86_ops->set_slave_mode(vcpu, slave >= 0);
+out:
+	return r;
+}
+
+#endif
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg)
 {
@@ -2937,6 +2982,16 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = kvm_set_guest_paused(vcpu);
 		goto out;
 	}
+#ifdef CONFIG_SLAVE_CPU
+	case KVM_SET_SLAVE_CPU: {
+		r = kvm_arch_vcpu_ioctl_set_slave_cpu(vcpu, (int)arg, 1);
+		goto out;
+	}
+	case KVM_GET_SLAVE_CPU: {
+		r = vcpu->arch.slave_cpu;
+		goto out;
+	}
+#endif
 	default:
 		r = -EINVAL;
 	}
@@ -6154,6 +6209,9 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
 {
 	kvmclock_reset(vcpu);
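The argument checks at the top of kvm_arch_vcpu_ioctl_set_slave_cpu() can be read as a small decision table. What follows is only a userspace model of those checks, reading the comparisons as `>=` and `&&`; it is not part of the patch. The function name model_check_slave() and the two bitmask parameters (standing in for cpu_online() and cpu_slave()) are hypothetical, and the model assumes at most 64 CPUs.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Model of the argument checks only; no CPU is brought up or down.
 * Returns 0 if the request would pass the checks, -EINVAL otherwise.
 * slave == -1 means "release the slave CPU currently assigned".
 * online_mask / slave_mask: bit n set means CPU n is online / is a slave CPU
 * (hypothetical stand-ins for cpu_online() and cpu_slave()).
 */
static int model_check_slave(int slave, int old, int nr_cpu_ids,
			     uint64_t online_mask, uint64_t slave_mask)
{
	assert(nr_cpu_ids > 0 && nr_cpu_ids <= 64);

	if (slave >= nr_cpu_ids)
		return -EINVAL;		/* id out of range */
	if (slave >= 0 && (online_mask & (UINT64_C(1) << slave)))
		return -EINVAL;		/* target CPU must not be online */
	if (slave >= 0 && slave != old &&
	    (slave_mask & (UINT64_C(1) << slave)))
		return -EINVAL;		/* already active as a slave CPU */
	return 0;
}
```

For instance, with CPUs 0 and 1 online (`online_mask == 0x3`) and four CPU ids, model_check_slave(3, -1, 4, 0x3, 0x0) passes, while asking for CPU 1 is rejected because it is still online.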
Re: [PATCH] serial_core: fix sizeof(pointer)
On Thu, 6 Sep 2012 10:27:51 +0800
Fengguang Wu <fengguang...@intel.com> wrote:

> sizeof when applied to a pointer typed expression gives the size of the
> pointer.
>
> Generated by: scripts/coccinelle/misc/noderef.cocci
>
> Signed-off-by: Fengguang Wu <fengguang...@intel.com>

Oops.. yes typo on my part

Signed-off-by: Alan Cox <a...@linux.intel.com>
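The class of bug that the noderef.cocci check reports can be shown outside the kernel. The sketch below is not the serial_core code (the patch body is not quoted in this reply); struct port_info and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure, standing in for whatever the real patch touched. */
struct port_info {
	char name[64];
	int line;
	unsigned long flags;
};

/* The typo: sizeof applied to the pointer yields the pointer's size. */
static size_t size_of_pointer(struct port_info *p)
{
	return sizeof(p);
}

/* The intent: sizeof applied to the dereferenced pointer yields the
 * size of the structure it points to. */
static size_t size_of_object(struct port_info *p)
{
	return sizeof(*p);
}

/* Clears the whole structure because the length is sizeof(*p). */
static void clear_port(struct port_info *p)
{
	assert(p != NULL);
	memset(p, 0, sizeof(*p));
}
```

With the pointer form, a memset() or copy of that length would touch only the first few bytes of the structure, which is why the check flags it.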
[RFC v2 PATCH 09/21] KVM: Go back to online CPU on VM exit by external interrupt
If the slave CPU receives an interrupt while running a guest, the current
implementation must first go back to an online CPU to handle the interrupt.

This behavior will be replaced by a later patch, which introduces a
mechanism for the guest to handle interrupts directly.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama...@hitachi.com>
Cc: Avi Kivity <a...@redhat.com>
Cc: Marcelo Tosatti <mtosa...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
---

 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/vmx.c              |    1 +
 arch/x86/kvm/x86.c              |    6 ++
 3 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0ea04c9..af68ffb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -358,6 +358,7 @@ struct kvm_vcpu_arch {
 	int sipi_vector;
 	u64 ia32_misc_enable_msr;
 	bool tpr_access_reporting;
+	bool interrupted;
 
 #ifdef CONFIG_SLAVE_CPU
 	/* slave cpu dedicated to this vcpu */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7bbfa01..d99bee6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4408,6 +4408,7 @@ static int handle_exception(struct kvm_vcpu *vcpu)
 
 static int handle_external_interrupt(struct kvm_vcpu *vcpu)
 {
+	vcpu->arch.interrupted = true;
 	++vcpu->stat.irq_exits;
 	return 1;
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b62f59c..db0be81 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5566,6 +5566,12 @@ static void __vcpu_enter_guest_slave(void *_arg)
 			break;
 
 		/* determine if slave cpu can handle the exit alone */
+		if (vcpu->arch.interrupted) {
+			vcpu->arch.interrupted = false;
+			arg->ret = LOOP_ONLINE;
+			break;
+		}
+
 		r = vcpu_post_run(vcpu, arg->task, arg->apf_pending);
 		if (r == LOOP_SLAVE
[RFC v2 PATCH 15/21] KVM: add tracepoint on enabling/disabling direct interrupt delivery
Add trace event kvm_set_direct_interrupt to trace enabling/disabling of
direct interrupt delivery on slave CPUs. The event logs the guest rip
and whether the feature is enabled.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama...@hitachi.com>
Cc: Avi Kivity <a...@redhat.com>
Cc: Marcelo Tosatti <mtosa...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
---

 arch/x86/kvm/trace.h |   18 ++
 arch/x86/kvm/vmx.c   |    2 ++
 arch/x86/kvm/x86.c   |    1 +
 3 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index a71faf7..6081be7 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -551,6 +551,24 @@ TRACE_EVENT(kvm_pv_eoi,
 	TP_printk("apicid %x vector %d", __entry->apicid, __entry->vector)
 );
 
+TRACE_EVENT(kvm_set_direct_interrupt,
+	TP_PROTO(struct kvm_vcpu *vcpu, bool enabled),
+	TP_ARGS(vcpu, enabled),
+
+	TP_STRUCT__entry(
+		__field(unsigned long,	guest_rip	)
+		__field(bool,		enabled		)
+	),
+
+	TP_fast_assign(
+		__entry->guest_rip	= kvm_rip_read(vcpu);
+		__entry->enabled	= enabled;
+	),
+
+	TP_printk("rip 0x%lx enabled %d",
+		  __entry->guest_rip, __entry->enabled)
+);
+
 /*
  * Tracepoint for nested VMRUN
  */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 605abea..6dc59c8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1719,6 +1719,8 @@ static void vmx_set_direct_interrupt(struct kvm_vcpu *vcpu, bool enabled)
 	else
 		vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
 			      PIN_BASED_EXT_INTR_MASK);
+
+	trace_kvm_set_direct_interrupt(vcpu, enabled);
 }
 
 static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b7d28df..1449187 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6936,3 +6936,4 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_intr_vmexit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_invlpga);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_skinit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_intercepts);
+EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_set_direct_interrupt);
[RFC v2 PATCH 13/21] x86/apic: IRQ vector remapping on slave for slave CPUs
Add a facility to use an IRQ vector on slave CPUs that differs from the
one used on online CPUs.

When an alternative vector for an IRQ is registered by
remap_slave_vector_irq() and the IRQ affinity is set only to slave CPUs,
the device is configured to use the alternative vector.

The current patch only supports MSI and Intel CPUs with the IRQ remapper
of the IOMMU.

This is intended for routing interrupts directly to a KVM guest running
on slave CPUs, which do not cause VM EXIT on external interrupts.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama...@hitachi.com>
Cc: Avi Kivity <a...@redhat.com>
Cc: Marcelo Tosatti <mtosa...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
---

 arch/x86/include/asm/irq.h          |   15
 arch/x86/kernel/apic/io_apic.c      |   68 ++-
 drivers/iommu/intel_irq_remapping.c |    2 +
 3 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
index ba870bb..84756f7 100644
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -41,4 +41,19 @@ extern int vector_used_by_percpu_irq(unsigned int vector);
 
 extern void init_ISA_irqs(void);
 
+#ifdef CONFIG_SLAVE_CPU
+extern void remap_slave_vector_irq(int irq, int vector,
+				   const struct cpumask *mask);
+extern void revert_slave_vector_irq(int irq, const struct cpumask *mask);
+extern u8 get_remapped_slave_vector(u8 vector, unsigned int irq,
+				    const struct cpumask *mask);
+#else
+static inline u8 get_remapped_slave_vector(u8 vector, unsigned int irq,
+					   const struct cpumask *mask)
+{
+	return vector;
+}
+#endif
+
+
 #endif /* _ASM_X86_IRQ_H */
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 0cd2682..167b001 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1266,6 +1266,69 @@ void __setup_vector_irq(int cpu)
 	raw_spin_unlock(&vector_lock);
 }
 
+#ifdef CONFIG_SLAVE_CPU
+
+/* vector table remapped on slave cpus, indexed by IRQ */
+static DEFINE_PER_CPU(u8[NR_IRQS], slave_vector_remap_tbl) = {
+	[0 ... NR_IRQS - 1] = 0,
+};
+
+void remap_slave_vector_irq(int irq, int vector, const struct cpumask *mask)
+{
+	int cpu;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&vector_lock, flags);
+	for_each_cpu(cpu, mask) {
+		BUG_ON(!cpu_slave(cpu));
+		per_cpu(slave_vector_remap_tbl, cpu)[irq] = vector;
+		per_cpu(vector_irq, cpu)[vector] = irq;
+	}
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+}
+EXPORT_SYMBOL_GPL(remap_slave_vector_irq);
+
+void revert_slave_vector_irq(int irq, const struct cpumask *mask)
+{
+	int cpu;
+	u8 vector;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&vector_lock, flags);
+	for_each_cpu(cpu, mask) {
+		BUG_ON(!cpu_slave(cpu));
+		vector = per_cpu(slave_vector_remap_tbl, cpu)[irq];
+		if (vector) {
+			per_cpu(vector_irq, cpu)[vector] = -1;
+			per_cpu(slave_vector_remap_tbl, cpu)[irq] = 0;
+		}
+	}
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+}
+EXPORT_SYMBOL_GPL(revert_slave_vector_irq);
+
+/* If all targets CPUs are slave, returns remapped vector */
+u8 get_remapped_slave_vector(u8 vector, unsigned int irq,
+			     const struct cpumask *mask)
+{
+	u8 slave_vector;
+
+	if (vector < FIRST_EXTERNAL_VECTOR ||
+	    cpumask_intersects(mask, cpu_online_mask))
+		return vector;
+
+	slave_vector = per_cpu(slave_vector_remap_tbl,
+			       cpumask_first(mask))[irq];
+	if (slave_vector >= FIRST_EXTERNAL_VECTOR)
+		vector = slave_vector;
+
+	pr_info("slave vector remap: irq: %d => vector: %d\n", irq, vector);
+
+	return vector;
+}
+
+#endif
+
 static struct irq_chip ioapic_chip;
 
 #ifdef CONFIG_X86_32
@@ -3133,6 +3196,7 @@ static int
 msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
 {
 	struct irq_cfg *cfg = data->chip_data;
+	int vector = cfg->vector;
 	struct msi_msg msg;
 	unsigned int dest;
 
@@ -3141,8 +3205,10 @@ msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
 
 	__get_cached_msi_msg(data->msi_desc, &msg);
 
+	vector = get_remapped_slave_vector(vector, data->irq, mask);
+
 	msg.data &= ~MSI_DATA_VECTOR_MASK;
-	msg.data |= MSI_DATA_VECTOR(cfg->vector);
+	msg.data |= MSI_DATA_VECTOR(vector);
 	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
 	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
 
diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index df38334..471d23f 100644
---
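The selection rule in get_remapped_slave_vector() can be exercised in userspace with plain bitmasks. This is a sketch under stated assumptions, not kernel code: the per-CPU table becomes a small two-dimensional array, cpumasks become 32-bit masks, and every MODEL_* constant and model_* name is hypothetical. The value 0x20 for the first external vector matches x86, but is used here only as a model constant.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS			8
#define MODEL_NR_IRQS			16
#define MODEL_FIRST_EXTERNAL_VECTOR	0x20

/* Stand-in for the per-CPU slave_vector_remap_tbl; 0 means "no remap". */
static uint8_t model_remap_tbl[MODEL_NR_CPUS][MODEL_NR_IRQS];

/* Lowest CPU number set in mask, or -1 if the mask is empty. */
static int model_first_cpu(uint32_t mask)
{
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	return -1;
}

/*
 * Returns the vector the device should be programmed with: the remapped
 * one only when no target CPU is online and a remap entry is registered.
 */
static uint8_t model_remapped_vector(uint8_t vector, unsigned int irq,
				     uint32_t mask, uint32_t online_mask)
{
	int cpu;
	uint8_t slave_vector;

	assert(irq < MODEL_NR_IRQS);

	if (vector < MODEL_FIRST_EXTERNAL_VECTOR || (mask & online_mask))
		return vector;

	cpu = model_first_cpu(mask);
	if (cpu < 0)
		return vector;	/* empty mask: guard added for the model only */

	slave_vector = model_remap_tbl[cpu][irq];
	if (slave_vector >= MODEL_FIRST_EXTERNAL_VECTOR)
		vector = slave_vector;
	return vector;
}
```

With CPUs 0 to 3 online and a remap entry of 0x41 for IRQ 5 on CPU 4, an affinity mask containing only CPU 4 yields 0x41, while a mask that also includes CPU 0 keeps the host vector.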
Re: [Ping^3] Re: [PATCH] sg_io: allow UNMAP and WRITE SAME without CAP_SYS_RAWIO
On 09/06/2012 02:31 AM, Paolo Bonzini wrote:
> On 05/09/2012 22:18, Ric Wheeler wrote:
>> Hi Paolo,
>>
>> Both of these commands are destructive. WRITE_SAME (if done without the
>> discard bits set) can also take a very long time to be destructive and
>> tie up the storage.
>
> FORMAT_UNIT has the same characteristics and yet it is allowed (btw, I
> don't think WRITE SAME slowness is limited to the case where a real
> write is requested; discarding can be just as slow).
>
> Also, the two new commands are anyway restricted to programs that have
> write access to the disk. If you have read-only access, you won't be
> able to issue any destructive command (there is one exception, START
> STOP UNIT is allowed even with read-only capability and is somewhat
> destructive).
>
> Honestly, the only reason why these two commands weren't included, is
> that the current whitelist is heavily tailored towards CD/DVD burning.

Hi Paolo,

I assume that FORMAT_UNIT is for CD/DVD needs - not sure what a S-ATA disk would do with that. If it is destructive, we should probably think about how to make it more secure and see how many applications we would break.

>> I think that restricting them to CAP_SYS_RAWIO seems reasonable - better
>> to vet and give the appropriate apps the needed capability than to
>> widely open up the safety check?
>
> CAP_SYS_RAWIO is so wide in its scope, that anything that requires it is
> insecure.
>
> Paolo

I don't see allowing anyone who can open the device to zero the data as better though :)

Ric
Re: [RFC v2 PATCH 01/21] x86: Split memory hotplug function from cpu_up() as cpu_memory_up()
On 09/06/2012 02:27 PM, Tomoki Sekiyama wrote:
> Split memory hotplug function from cpu_up() as cpu_memory_up(), which will
> be used for assigning memory area to off-lined cpus at following patch
> in this series.

Can you post a summary containing both the general outline for people reading this for the first time, or who have forgotten it, and the list of changes from v1?

-- 
error compiling committee.c: too many arguments to function
[RFC v2 PATCH 17/21] KVM: add kvm_arch_vcpu_prevent_run to prevent VM ENTER when NMI is received
Since NMI cannot be disabled around VM enter, there is a race between
receiving an NMI to kick a guest and entering the guest on slave CPUs.
If the NMI is received just before entering the VM, then after the NMI
handler is invoked, the CPU continues entering the guest and the effect
of the NMI is lost.

This patch adds kvm_arch_vcpu_prevent_run(), which causes VM exit right
after VM enter. The NMI handler uses this to ensure that execution of the
guest is cancelled after the NMI.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama...@hitachi.com>
Cc: Avi Kivity <a...@redhat.com>
Cc: Marcelo Tosatti <mtosa...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
---

 arch/x86/include/asm/kvm_host.h |    6 ++
 arch/x86/kvm/vmx.c              |   42 ++-
 arch/x86/kvm/x86.c              |   31 +
 3 files changed, 78 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 65242a6..624e5ad 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -429,6 +429,9 @@ struct kvm_vcpu_arch {
 		void *insn;
 		int insn_len;
 	} page_fault;
+
+	bool prevent_run;
+	bool prevent_needed;
 #endif
 
 	int halt_request; /* real mode on Intel only */
@@ -681,6 +684,7 @@ struct kvm_x86_ops {
 
 	void (*run)(struct kvm_vcpu *vcpu);
 	int (*handle_exit)(struct kvm_vcpu *vcpu);
+	void (*prevent_run)(struct kvm_vcpu *vcpu, int prevent);
 	void (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
 	void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
 	u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
@@ -1027,4 +1031,6 @@ int kvm_pmu_read_pmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data);
 void kvm_handle_pmu_event(struct kvm_vcpu *vcpu);
 void kvm_deliver_pmi(struct kvm_vcpu *vcpu);
 
+int kvm_arch_vcpu_run_prevented(struct kvm_vcpu *vcpu);
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2130cbd..39a4cb4 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1713,6 +1713,9 @@ static inline void vmx_clear_hlt(struct kvm_vcpu *vcpu)
 
 static void vmx_set_direct_interrupt(struct kvm_vcpu *vcpu, bool enabled)
 {
+#ifdef CONFIG_SLAVE_CPU
+	void *msr_bitmap;
+
 	if (enabled)
 		vmcs_clear_bits(PIN_BASED_VM_EXEC_CONTROL,
 				PIN_BASED_EXT_INTR_MASK);
@@ -1721,6 +1724,7 @@ static void vmx_set_direct_interrupt(struct kvm_vcpu *vcpu, bool enabled)
 			      PIN_BASED_EXT_INTR_MASK);
 
 	trace_kvm_set_direct_interrupt(vcpu, enabled);
+#endif
 }
 
 static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
@@ -4458,7 +4462,7 @@ static int handle_external_interrupt(struct kvm_vcpu *vcpu)
 
 static int handle_preemption_timer(struct kvm_vcpu *vcpu)
 {
-	/* Nothing */
+	kvm_arch_vcpu_run_prevented(vcpu);
 	return 1;
 }
 
@@ -6052,6 +6056,10 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 	}
 
 	if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
+#ifdef CONFIG_SLAVE_CPU
+		if (vcpu->arch.prevent_run)
+			return kvm_arch_vcpu_run_prevented(vcpu);
+#endif
 		vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
 		vcpu->run->fail_entry.hardware_entry_failure_reason
 			= exit_reason;
@@ -6059,6 +6067,10 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 	}
 
 	if (unlikely(vmx->fail)) {
+#ifdef CONFIG_SLAVE_CPU
+		if (vcpu->arch.prevent_run)
+			return kvm_arch_vcpu_run_prevented(vcpu);
+#endif
 		vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
 		vcpu->run->fail_entry.hardware_entry_failure_reason
 			= vmcs_read32(VM_INSTRUCTION_ERROR);
@@ -6275,6 +6287,21 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 					msrs[i].host);
 }
 
+/*
+ * Make VMRESUME fail using preemption timer with timer value = 0.
+ * On processors that doesn't support preemption timer, VMRESUME will fail
+ * by internal error.
+ */
+static void vmx_prevent_run(struct kvm_vcpu *vcpu, int prevent)
+{
+	if (prevent)
+		vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
+			      PIN_BASED_PREEMPTION_TIMER);
+	else
+		vmcs_clear_bits(PIN_BASED_VM_EXEC_CONTROL,
+				PIN_BASED_PREEMPTION_TIMER);
+}
+
 #ifdef CONFIG_X86_64
 #define R "r"
 #define Q "q"
@@ -6326,6 +6353,13 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	atomic_switch_perf_msrs(vmx);
 
+#ifdef CONFIG_SLAVE_CPU
+	barrier();	/* Avoid vmcs modification by NMI before here */
+	vcpu->arch.prevent_needed = 1;
+	if (vcpu->arch.prevent_run)
[RFC v2 PATCH 20/21] KVM: Pass-through local APIC timer of on slave CPUs to guest VM
Provide direct control of the local APIC timer of slave CPUs to the guest.
The timer interrupt does not cause VM exit if direct interrupt delivery is
enabled. To handle the timer correctly, this makes the guest occupy the
local APIC timer.

If the host supports x2apic, this exposes TMICT and TMCCT to the guest in
order to allow guests to start the timer and to read the timer count
without VM exit. Otherwise, it sets APIC registers to specified values.

LVTT is not passed through, to avoid modifying the timer interrupt vector.
Currently guest timer interrupt vector remapping is not supported, and the
guest must use the same vector as the host.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama...@hitachi.com>
Cc: Avi Kivity <a...@redhat.com>
Cc: Marcelo Tosatti <mtosa...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
---

 arch/x86/include/asm/apic.h     |    4 +++
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kernel/apic/apic.c     |   11 ++
 arch/x86/kernel/smpboot.c       |   30 ++
 arch/x86/kvm/lapic.c            |   45 +++
 arch/x86/kvm/lapic.h            |    2 ++
 arch/x86/kvm/vmx.c              |    6 +
 arch/x86/kvm/x86.c              |    3 +++
 include/linux/cpu.h             |    5
 kernel/hrtimer.c                |    2 +-
 10 files changed, 108 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index d37ae5c..66e1155 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -44,6 +44,8 @@ static inline void generic_apic_probe(void)
 
 #ifdef CONFIG_X86_LOCAL_APIC
 
+struct clock_event_device;
+
 extern unsigned int apic_verbosity;
 extern int local_apic_timer_c2_ok;
 
@@ -245,6 +247,8 @@ extern void init_apic_mappings(void);
 void register_lapic_address(unsigned long address);
 extern void setup_boot_APIC_clock(void);
 extern void setup_secondary_APIC_clock(void);
+extern void override_local_apic_timer(int cpu,
+	void (*handler)(struct clock_event_device *));
 extern int APIC_init_uniprocessor(void);
 extern int apic_force_enable(unsigned long addr);
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f43680e..a95bb62 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1035,6 +1035,7 @@ int kvm_arch_vcpu_run_prevented(struct kvm_vcpu *vcpu);
 
 #ifdef CONFIG_SLAVE_CPU
 void kvm_get_slave_cpu_mask(struct kvm *kvm, struct cpumask *mask);
+struct kvm_vcpu *get_slave_vcpu(int cpu);
 
 struct kvm_assigned_dev_kernel;
 extern void assign_slave_msi(struct kvm *kvm,
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 24deb30..90ed84a 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -901,6 +901,17 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs)
 	set_irq_regs(old_regs);
 }
 
+void override_local_apic_timer(int cpu,
+	void (*handler)(struct clock_event_device *))
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	per_cpu(lapic_events, cpu).event_handler = handler;
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(override_local_apic_timer);
+
 int setup_profiling_timer(unsigned int multiplier)
 {
 	return -EINVAL;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 45dfc1d..ba7c99b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -133,6 +133,7 @@ static void __ref remove_cpu_from_maps(int cpu);
 #ifdef CONFIG_SLAVE_CPU
 /* Notify slave cpu up and down */
 static RAW_NOTIFIER_HEAD(slave_cpu_chain);
+struct notifier_block *slave_timer_nb;
 
 int register_slave_cpu_notifier(struct notifier_block *nb)
 {
@@ -140,6 +141,13 @@ int register_slave_cpu_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(register_slave_cpu_notifier);
 
+int register_slave_cpu_timer_notifier(struct notifier_block *nb)
+{
+	slave_timer_nb = nb;
+	return register_slave_cpu_notifier(nb);
+}
+EXPORT_SYMBOL(register_slave_cpu_timer_notifier);
+
 void unregister_slave_cpu_notifier(struct notifier_block *nb)
 {
 	raw_notifier_chain_unregister(&slave_cpu_chain, nb);
@@ -155,6 +163,8 @@ static int slave_cpu_notify(unsigned long val, int cpu)
 
 	return notifier_to_errno(ret);
 }
+
+static void slave_cpu_disable_timer(int cpu);
 #endif
 
 /*
@@ -1013,10 +1023,30 @@ int __cpuinit slave_cpu_up(unsigned int cpu)
 
 	cpu_maps_update_done();
 
+	/* Timer may be used only in starting the slave CPU */
+	slave_cpu_disable_timer(cpu);
+
 	return ret;
 }
 EXPORT_SYMBOL(slave_cpu_up);
 
+static void __slave_cpu_disable_timer(void *hcpu)
+{
+	int cpu = (long)hcpu;
+
+	pr_info("Disabling timer on slave cpu %d\n", cpu);
+	BUG_ON(!slave_timer_nb);
+	slave_timer_nb->notifier_call(slave_timer_nb, CPU_SLAVE_DYING, hcpu);
+}
+
Re: [RFC v2 PATCH 01/21] x86: Split memory hotplug function from cpu_up() as cpu_memory_up()
On 09/06/2012 02:31 PM, Avi Kivity wrote:
> On 09/06/2012 02:27 PM, Tomoki Sekiyama wrote:
>> Split memory hotplug function from cpu_up() as cpu_memory_up(), which will
>> be used for assigning memory area to off-lined cpus at following patch
>> in this series.
>
> Can you post a summary containing both the general outline for people
> reading this for the first time, or who have forgotten it, and the list
> of changes from v1?

Never mind, I see it was posted, just that I wasn't copied on it.

-- 
error compiling committee.c: too many arguments to function
[RFC v2 PATCH 14/21] KVM: Directly handle interrupts by guests without VM EXIT on slave CPUs
Make interrupts on slave CPUs handled by guests without VM EXIT. This
reduces the host CPU usage needed to transfer interrupts of assigned PCI
devices from the host to guests. It also reduces the cost of VM EXIT and
speeds up the guests' response to interrupts.

When a slave CPU is dedicated to a vCPU, exit on external interrupts is
disabled. Unfortunately, exits can only be enabled or disabled for all
external interrupts except NMIs; they cannot be switched per IRQ number
or vector. Thus, to avoid IPIs from online CPUs being delivered to guests,
this patch modifies kvm_vcpu_kick() to use NMI for guests on slave CPUs.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama...@hitachi.com>
Cc: Avi Kivity <a...@redhat.com>
Cc: Marcelo Tosatti <mtosa...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
---

 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/lapic.c            |    5 +
 arch/x86/kvm/vmx.c              |   19 ++
 arch/x86/kvm/x86.c              |   41 +++
 include/linux/kvm_host.h        |    1 +
 virt/kvm/kvm_main.c             |    5 +++--
 6 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5ce89f1..65242a6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -725,6 +725,7 @@ struct kvm_x86_ops {
 			       struct x86_instruction_info *info,
 			       enum x86_intercept_stage stage);
 
+	void (*set_direct_interrupt)(struct kvm_vcpu *vcpu, bool enabled);
 	void (*set_slave_mode)(struct kvm_vcpu *vcpu, bool slave);
 };
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index ce87878..73f57f3 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -601,6 +601,9 @@ static int apic_set_eoi(struct kvm_lapic *apic)
 		kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
 	}
 	kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
+	if (vcpu_has_slave_cpu(apic->vcpu) &&
+	    kvm_x86_ops->set_direct_interrupt)
+		kvm_x86_ops->set_direct_interrupt(apic->vcpu, 1);
 	return vector;
 }
 
@@ -1569,6 +1572,8 @@ int kvm_lapic_enable_pv_eoi(struct kvm_vcpu *vcpu, u64 data)
 	u64 addr = data & ~KVM_MSR_ENABLED;
 	if (!IS_ALIGNED(addr, 4))
 		return 1;
+	if (vcpu_has_slave_cpu(vcpu))
+		return 1;
 
 	vcpu->arch.pv_eoi.msr_val = data;
 	if (!pv_eoi_enabled(vcpu))
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 03a2d02..605abea 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1711,6 +1711,16 @@ static inline void vmx_clear_hlt(struct kvm_vcpu *vcpu)
 #endif
 }
 
+static void vmx_set_direct_interrupt(struct kvm_vcpu *vcpu, bool enabled)
+{
+	if (enabled)
+		vmcs_clear_bits(PIN_BASED_VM_EXEC_CONTROL,
+				PIN_BASED_EXT_INTR_MASK);
+	else
+		vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
+			      PIN_BASED_EXT_INTR_MASK);
+}
+
 static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
 {
 	/* Don't intercept the guest's halt on slave CPU */
@@ -1721,6 +1731,8 @@ static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
 		vmcs_set_bits(CPU_BASED_VM_EXEC_CONTROL,
 			      CPU_BASED_HLT_EXITING);
 	}
+
+	vmx_set_direct_interrupt(vcpu, slave);
 }
 
 /*
@@ -1776,6 +1788,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
 
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
 	vmx_clear_hlt(vcpu);
+	if (vcpu_has_slave_cpu(vcpu))
+		vmx_set_direct_interrupt(vcpu, 0);
 }
 
 static bool vmx_rdtscp_supported(void)
@@ -4147,6 +4161,8 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu)
 		intr |= INTR_TYPE_EXT_INTR;
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr);
 	vmx_clear_hlt(vcpu);
+	if (vcpu_has_slave_cpu(vcpu))
+		vmx_set_direct_interrupt(vcpu, 0);
 }
 
 static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
@@ -4179,6 +4195,8 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
 			INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR);
 	vmx_clear_hlt(vcpu);
+	if (vcpu_has_slave_cpu(vcpu))
+		vmx_set_direct_interrupt(vcpu, 0);
 }
 
 static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
@@ -7374,6 +7392,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 
 	.check_intercept = vmx_check_intercept,
 
+	.set_direct_interrupt = vmx_set_direct_interrupt,
 	.set_slave_mode = vmx_set_slave_mode,
 };
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a6b2521..b7d28df 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -63,6 +63,7 @@
 #include <asm/pvclock.h>
 #include
[RFC v2 PATCH 12/21] x86/apic: Enable external interrupt routing to slave CPUs
Enable the APIC to handle interrupts on slave CPUs, and enable interrupt
routing to slave CPUs by setting IRQ affinity.

As slave CPUs which run a KVM guest handle external interrupts directly in
the vCPUs, the guest's vector/IRQ mapping is different from the host's.
That requires interrupts to be routed to either online CPUs or slave CPUs.

In this patch, if online CPUs are contained in the specified affinity
settings, the affinity settings will be applied only to online CPUs. If
every specified CPU is a slave, the IRQ will be routed to slave CPUs.

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---
 arch/x86/include/asm/apic.h           |  6 ++---
 arch/x86/kernel/apic/io_apic.c        | 43 -
 arch/x86/kernel/apic/x2apic_cluster.c |  8 +++---
 drivers/iommu/intel_irq_remapping.c   | 30 +++
 kernel/irq/manage.c                   |  4 ++-
 kernel/irq/migration.c                |  2 +-
 kernel/irq/proc.c                     |  2 +-
 7 files changed, 67 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index f342612..d37ae5c 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -535,7 +535,7 @@ extern void generic_bigsmp_probe(void);
 static inline const struct cpumask *default_target_cpus(void)
 {
 #ifdef CONFIG_SMP
-	return cpu_online_mask;
+	return cpu_online_or_slave_mask;
 #else
 	return cpumask_of(0);
 #endif
@@ -543,7 +543,7 @@ static inline const struct cpumask *default_target_cpus(void)
 
 static inline const struct cpumask *online_target_cpus(void)
 {
-	return cpu_online_mask;
+	return cpu_online_or_slave_mask;
 }
 
 DECLARE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_bios_cpu_apicid);
@@ -602,7 +602,7 @@ flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask,
 {
 	unsigned long cpu_mask = cpumask_bits(cpumask)[0] &
 				 cpumask_bits(andmask)[0] &
-				 cpumask_bits(cpu_online_mask)[0] &
+				 cpumask_bits(cpu_online_or_slave_mask)[0] &
 				 APIC_ALL_CPUS;
 
 	if (likely(cpu_mask)) {
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index c265593..0cd2682 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1125,7 +1125,7 @@ __assign_irq_vector(int irq, struct irq_cfg *cfg, const struct cpumask *mask)
 	/* Only try and allocate irqs on cpus that are present */
 	err = -ENOSPC;
 	cpumask_clear(cfg->old_domain);
-	cpu = cpumask_first_and(mask, cpu_online_mask);
+	cpu = cpumask_first_and(mask, cpu_online_or_slave_mask);
 	while (cpu < nr_cpu_ids) {
 		int new_cpu, vector, offset;
 
@@ -1158,14 +1158,14 @@ next:
 		if (unlikely(current_vector == vector)) {
 			cpumask_or(cfg->old_domain, cfg->old_domain, tmp_mask);
 			cpumask_andnot(tmp_mask, mask, cfg->old_domain);
-			cpu = cpumask_first_and(tmp_mask, cpu_online_mask);
+			cpu = cpumask_first_and(tmp_mask, cpu_online_or_slave_mask);
 			continue;
 		}
 
 		if (test_bit(vector, used_vectors))
 			goto next;
 
-		for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask)
+		for_each_cpu_and(new_cpu, tmp_mask, cpu_online_or_slave_mask)
 			if (per_cpu(vector_irq, new_cpu)[vector] != -1)
 				goto next;
 		/* Found one! */
@@ -1175,7 +1175,7 @@ next:
 			cfg->move_in_progress = 1;
 			cpumask_copy(cfg->old_domain, cfg->domain);
 		}
-		for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask)
+		for_each_cpu_and(new_cpu, tmp_mask, cpu_online_or_slave_mask)
 			per_cpu(vector_irq, new_cpu)[vector] = irq;
 		cfg->vector = vector;
 		cpumask_copy(cfg->domain, tmp_mask);
@@ -1204,7 +1204,7 @@ static void __clear_irq_vector(int irq, struct irq_cfg *cfg)
 	BUG_ON(!cfg->vector);
 
 	vector = cfg->vector;
-	for_each_cpu_and(cpu, cfg->domain, cpu_online_mask)
+	for_each_cpu_and(cpu, cfg->domain, cpu_online_or_slave_mask)
 		per_cpu(vector_irq, cpu)[vector] = -1;
 
 	cfg->vector = 0;
@@ -1212,7 +1212,7 @@ static void __clear_irq_vector(int irq, struct irq_cfg *cfg)
 	if (likely(!cfg->move_in_progress))
 		return;
 
-	for_each_cpu_and(cpu, cfg->old_domain, cpu_online_mask) {
+	for_each_cpu_and(cpu, cfg->old_domain, cpu_online_or_slave_mask) {
 		for (vector =
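The affinity rule stated in the commit message (prefer online CPUs; fall back to slave CPUs only when no requested CPU is online) can be illustrated with a small standalone sketch. This is not code from the patch: `cpu_bits_t`, `effective_affinity()` and `affinity_example()` are hypothetical names, and a plain 64-bit bitmap stands in for the kernel's `struct cpumask`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit CPU bitmap standing in for the kernel's struct cpumask. */
typedef uint64_t cpu_bits_t;

/*
 * Pick the CPUs an IRQ may target, following the rule in the commit message:
 * if the requested affinity contains any online CPU, use only the online
 * ones; otherwise fall back to the requested slave CPUs.
 */
cpu_bits_t effective_affinity(cpu_bits_t requested, cpu_bits_t online,
			      cpu_bits_t slave)
{
	cpu_bits_t on = requested & online;

	if (on)
		return on;
	return requested & slave;
}

/* Usage: CPUs 0-1 are online, CPUs 2-3 are slaves (illustrative layout). */
void affinity_example(void)
{
	/* Request CPUs 1 and 2: the online CPU 1 wins, slave CPU 2 is dropped. */
	assert(effective_affinity(0x6, 0x3, 0xC) == 0x2);
	/* Request only CPUs 2 and 3: every requested CPU is a slave. */
	assert(effective_affinity(0xC, 0x3, 0xC) == 0xC);
}
```

The sketch only models the selection policy; the actual patch implements it across the APIC, IO-APIC and IRQ core paths listed in the diffstat.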
[RFC v2 PATCH 11/21] KVM: no exiting from guest when slave CPU halted
Avoid exiting from a guest on a slave CPU even if the HLT instruction is
executed. Since the slave CPU is dedicated to a vCPU, exiting on HLT is not
required, and avoiding the VM exit will improve the guest's performance.

This is a partial revert of

        10166744b80a (KVM: VMX: remove yield_on_hlt)

Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---
 arch/x86/kvm/vmx.c | 25 -
 1 files changed, 24 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d99bee6..03a2d02 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1698,9 +1698,29 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	vmx_set_interrupt_shadow(vcpu, 0);
 }
 
+static inline void vmx_clear_hlt(struct kvm_vcpu *vcpu)
+{
+#ifdef CONFIG_SLAVE_CPU
+	/* Ensure that we clear the HLT state in the VMCS.  We don't need to
+	 * explicitly skip the instruction because if the HLT state is set,
+	 * then the instruction is already executing and RIP has already been
+	 * advanced. */
+	if (vcpu->arch.slave_cpu >= 0 &&
+	    vmcs_read32(GUEST_ACTIVITY_STATE) == GUEST_ACTIVITY_HLT)
+		vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
+#endif
+}
+
 static void vmx_set_slave_mode(struct kvm_vcpu *vcpu, bool slave)
 {
-	/* Nothing */
+	/* Don't intercept the guest's halt on slave CPU */
+	if (slave) {
+		vmcs_clear_bits(CPU_BASED_VM_EXEC_CONTROL,
+				CPU_BASED_HLT_EXITING);
+	} else {
+		vmcs_set_bits(CPU_BASED_VM_EXEC_CONTROL,
+			      CPU_BASED_HLT_EXITING);
+	}
 }
 
 /*
@@ -1755,6 +1775,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
 		intr_info |= INTR_TYPE_HARD_EXCEPTION;
 
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
+	vmx_clear_hlt(vcpu);
 }
 
 static bool vmx_rdtscp_supported(void)
@@ -4125,6 +4146,7 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu)
 	} else
 		intr |= INTR_TYPE_EXT_INTR;
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr);
+	vmx_clear_hlt(vcpu);
 }
 
 static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
@@ -4156,6 +4178,7 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
 	}
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
 			INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR);
+	vmx_clear_hlt(vcpu);
 }
 
 static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
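The two pieces of this patch, toggling HLT exiting when a vCPU enters or leaves slave mode and waking a halted guest before an event is injected, can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions: `struct toy_vcpu` and the `toy_*`/`TOY_*` names are hypothetical, two plain fields stand in for the VMCS accesses (`vmcs_read32()`, `vmcs_write32()`, `vmcs_set_bits()`, `vmcs_clear_bits()`), and the constant values are illustrative rather than copied from the headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the VMX constants used in the patch. */
#define TOY_HLT_EXITING     (1u << 7)
#define TOY_ACTIVITY_ACTIVE 0u
#define TOY_ACTIVITY_HLT    1u

/* Hypothetical model of the per-vCPU state touched by the patch. */
struct toy_vcpu {
	uint32_t exec_control;   /* models CPU_BASED_VM_EXEC_CONTROL */
	uint32_t activity_state; /* models GUEST_ACTIVITY_STATE */
	int slave_cpu;           /* >= 0 when bound to a slave CPU */
};

/* On a slave CPU, stop intercepting HLT; otherwise intercept it again. */
void toy_set_slave_mode(struct toy_vcpu *v, bool slave)
{
	if (slave)
		v->exec_control &= ~TOY_HLT_EXITING;
	else
		v->exec_control |= TOY_HLT_EXITING;
}

/* Before injecting an event, move a halted slave-CPU guest back to active. */
void toy_clear_hlt(struct toy_vcpu *v)
{
	if (v->slave_cpu >= 0 && v->activity_state == TOY_ACTIVITY_HLT)
		v->activity_state = TOY_ACTIVITY_ACTIVE;
}

/* Usage: a halted guest on slave CPU 3 is woken before injection. */
void hlt_example(void)
{
	struct toy_vcpu v = { TOY_HLT_EXITING, TOY_ACTIVITY_HLT, 3 };

	toy_set_slave_mode(&v, true);
	assert((v.exec_control & TOY_HLT_EXITING) == 0);
	toy_clear_hlt(&v);
	assert(v.activity_state == TOY_ACTIVITY_ACTIVE);
}
```

Without the wake-up step, a guest that halted with HLT exiting disabled would stay in the HLT activity state across the injection, which is why the patch calls `vmx_clear_hlt()` from each injection path.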
[RFC v2 PATCH 04/21] x86: Avoid RCU warnings on slave CPUs
Initialize RCU-related variables to avoid warnings about RCU usage while
slave CPUs are running specified functions. Also notify the RCU subsystem
before a slave CPU enters the idle state.

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---
 arch/x86/kernel/smpboot.c |  4
 kernel/rcutree.c          | 14 ++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index e8cfe377..45dfc1d 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -382,6 +382,8 @@ notrace static void __cpuinit start_slave_cpu(void *unused)
 		f = per_cpu(slave_cpu_func, cpu);
 		per_cpu(slave_cpu_func, cpu).func = NULL;
 
+		rcu_note_context_switch(cpu);
+
 		if (!f.func) {
 			native_safe_halt();
 			continue;
@@ -1005,6 +1007,8 @@ int __cpuinit slave_cpu_up(unsigned int cpu)
 	if (IS_ERR(idle))
 		return PTR_ERR(idle);
 
+	slave_cpu_notify(CPU_SLAVE_UP_PREPARE, cpu);
+
 	ret = __native_cpu_up(cpu, idle, 1);
 
 	cpu_maps_update_done();
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f280e54..31a7c8c 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -2589,6 +2589,9 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
 	switch (action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
+#ifdef CONFIG_SLAVE_CPU
+	case CPU_SLAVE_UP_PREPARE:
+#endif
 		rcu_prepare_cpu(cpu);
 		rcu_prepare_kthreads(cpu);
 		break;
@@ -2603,6 +2606,9 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
 		break;
 	case CPU_DYING:
 	case CPU_DYING_FROZEN:
+#ifdef CONFIG_SLAVE_CPU
+	case CPU_SLAVE_DYING:
+#endif
 		/*
 		 * The whole machine is stopped except this CPU, so we can
 		 * touch any data without introducing corruption. We send the
@@ -2616,6 +2622,9 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
 	case CPU_DEAD_FROZEN:
 	case CPU_UP_CANCELED:
 	case CPU_UP_CANCELED_FROZEN:
+#ifdef CONFIG_SLAVE_CPU
+	case CPU_SLAVE_DEAD:
+#endif
 		for_each_rcu_flavor(rsp)
 			rcu_cleanup_dead_cpu(cpu, rsp);
 		break;
@@ -2797,6 +2806,10 @@ static void __init rcu_init_geometry(void)
 		rcu_num_nodes -= n;
 }
 
+static struct notifier_block __cpuinitdata rcu_slave_nb = {
+	.notifier_call = rcu_cpu_notify,
+};
+
 void __init rcu_init(void)
 {
 	int cpu;
@@ -2814,6 +2827,7 @@ void __init rcu_init(void)
 	 * or the scheduler are operational.
 	 */
 	cpu_notifier(rcu_cpu_notify, 0);
+	register_slave_cpu_notifier(&rcu_slave_nb);
 	for_each_online_cpu(cpu)
 		rcu_cpu_notify(NULL, CPU_UP_PREPARE, (void *)(long)cpu);
 	check_cpu_stall_init();
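The rcutree.c hunks route the new slave-CPU events through the same branches as the existing hotplug events, so RCU prepares and cleans up a slave CPU exactly as it would a regular one. A minimal sketch of that mapping follows, assuming hypothetical `toy_*` enums in place of the kernel's `CPU_*` notifier actions; the enumerator values and the returned action codes are illustrative only.

```c
#include <assert.h>

/* Hypothetical event codes; the kernel uses CPU_* notifier action values. */
enum toy_cpu_event {
	TOY_UP_PREPARE,
	TOY_ONLINE,
	TOY_DYING,
	TOY_DEAD,
	TOY_SLAVE_UP_PREPARE,
	TOY_SLAVE_DYING,
	TOY_SLAVE_DEAD,
};

/* What the RCU notifier would do for a given event, in this toy model. */
enum toy_rcu_action {
	TOY_RCU_NONE,
	TOY_RCU_PREPARE,
	TOY_RCU_DYING,
	TOY_RCU_CLEANUP,
};

/* Slave events share the case labels of their regular counterparts. */
enum toy_rcu_action toy_rcu_notify(enum toy_cpu_event ev)
{
	switch (ev) {
	case TOY_UP_PREPARE:
	case TOY_SLAVE_UP_PREPARE:
		return TOY_RCU_PREPARE;
	case TOY_DYING:
	case TOY_SLAVE_DYING:
		return TOY_RCU_DYING;
	case TOY_DEAD:
	case TOY_SLAVE_DEAD:
		return TOY_RCU_CLEANUP;
	default:
		return TOY_RCU_NONE;
	}
}

/* Usage: a slave CPU coming up is prepared like a regular CPU. */
void rcu_notify_example(void)
{
	assert(toy_rcu_notify(TOY_SLAVE_UP_PREPARE) ==
	       toy_rcu_notify(TOY_UP_PREPARE));
}
```

Sharing the case labels keeps a single code path per lifecycle stage, which is the design the patch follows by adding `CPU_SLAVE_*` labels next to the existing ones.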
[RFC v2 PATCH 16/21] KVM: vmx: Add definitions PIN_BASED_PREEMPTION_TIMER
Add some definitions to use PIN_BASED_PREEMPTION_TIMER.

When PIN_BASED_PREEMPTION_TIMER is enabled, the guest will exit with
reason=EXIT_REASON_PREEMPTION_TIMER when the counter specified in
VMX_PREEMPTION_TIMER_VALUE becomes 0.
This patch also adds a dummy handler for EXIT_REASON_PREEMPTION_TIMER,
which just returns to VM execution immediately.

These are currently intended to be used only to avoid entering the guest on
a slave CPU when vmx_prevent_run(vcpu, 1) is called.

Signed-off-by: Tomoki Sekiyama tomoki.sekiyama...@hitachi.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
---
 arch/x86/include/asm/vmx.h | 3 +++
 arch/x86/kvm/trace.h       | 1 +
 arch/x86/kvm/vmx.c         | 7 +++
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 74fcb96..6899aaa 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -66,6 +66,7 @@
 #define PIN_BASED_EXT_INTR_MASK                 0x0001
 #define PIN_BASED_NMI_EXITING                   0x0008
 #define PIN_BASED_VIRTUAL_NMIS                  0x0020
+#define PIN_BASED_PREEMPTION_TIMER              0x0040
 
 #define VM_EXIT_SAVE_DEBUG_CONTROLS             0x0002
 #define VM_EXIT_HOST_ADDR_SPACE_SIZE            0x0200
@@ -196,6 +197,7 @@ enum vmcs_field {
 	GUEST_INTERRUPTIBILITY_INFO     = 0x4824,
 	GUEST_ACTIVITY_STATE            = 0X4826,
 	GUEST_SYSENTER_CS               = 0x482A,
+	VMX_PREEMPTION_TIMER_VALUE      = 0x482E,
 	HOST_IA32_SYSENTER_CS           = 0x4c00,
 	CR0_GUEST_HOST_MASK             = 0x6000,
 	CR4_GUEST_HOST_MASK             = 0x6002,
@@ -280,6 +282,7 @@ enum vmcs_field {
 #define EXIT_REASON_APIC_ACCESS         44
 #define EXIT_REASON_EPT_VIOLATION       48
 #define EXIT_REASON_EPT_MISCONFIG       49
+#define EXIT_REASON_PREEMPTION_TIMER    52
 #define EXIT_REASON_WBINVD              54
 #define EXIT_REASON_XSETBV              55
 #define EXIT_REASON_INVPCID             58
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 6081be7..fc350f3 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -218,6 +218,7 @@ TRACE_EVENT(kvm_apic,
 	{ EXIT_REASON_APIC_ACCESS,		"APIC_ACCESS" }, \
 	{ EXIT_REASON_EPT_VIOLATION,		"EPT_VIOLATION" }, \
 	{ EXIT_REASON_EPT_MISCONFIG,		"EPT_MISCONFIG" }, \
+	{ EXIT_REASON_PREEMPTION_TIMER,		"PREEMPTION_TIMER" }, \
 	{ EXIT_REASON_WBINVD,			"WBINVD" }
 
 #define SVM_EXIT_REASONS \
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6dc59c8..2130cbd 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4456,6 +4456,12 @@ static int handle_external_interrupt(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static int handle_preemption_timer(struct kvm_vcpu *vcpu)
+{
+	/* Nothing */
+	return 1;
+}
+
 static int handle_triple_fault(struct kvm_vcpu *vcpu)
 {
 	vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
@@ -5768,6 +5774,7 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[EXIT_REASON_VMON]                    = handle_vmon,
 	[EXIT_REASON_TPR_BELOW_THRESHOLD]     = handle_tpr_below_threshold,
 	[EXIT_REASON_APIC_ACCESS]             = handle_apic_access,
+	[EXIT_REASON_PREEMPTION_TIMER]        = handle_preemption_timer,
 	[EXIT_REASON_WBINVD]                  = handle_wbinvd,
 	[EXIT_REASON_XSETBV]                  = handle_xsetbv,
 	[EXIT_REASON_TASK_SWITCH]             = handle_task_switch,
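The dummy handler works because exit handlers are looked up in a table indexed by exit reason, and a return value of 1 asks the run loop to re-enter the guest. A minimal standalone sketch of that dispatch pattern is shown below. The `toy_*` names are hypothetical, the handlers take no vCPU argument for brevity, and treating an unregistered reason as "return 0" is a simplifying assumption; only the exit-reason numbers 52 and 54 are taken from the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Exit-reason numbers as they appear in the diff above. */
#define TOY_EXIT_PREEMPTION_TIMER 52
#define TOY_EXIT_WBINVD           54
#define TOY_EXIT_MAX              64 /* table size chosen for illustration */

typedef int (*toy_exit_handler)(void);

/* Dummy handler: nothing to do, return 1 to resume the guest. */
static int toy_handle_preemption_timer(void)
{
	return 1;
}

static int toy_handle_wbinvd(void)
{
	return 1;
}

/* Designated initializers leave every unlisted slot NULL. */
static const toy_exit_handler toy_handlers[TOY_EXIT_MAX] = {
	[TOY_EXIT_PREEMPTION_TIMER] = toy_handle_preemption_timer,
	[TOY_EXIT_WBINVD]           = toy_handle_wbinvd,
};

/* Returns 1 to resume the guest, 0 when no handler is registered. */
int toy_dispatch_exit(unsigned int reason)
{
	if (reason < TOY_EXIT_MAX && toy_handlers[reason] != NULL)
		return toy_handlers[reason]();
	return 0;
}

/* Usage: the preemption-timer exit is handled and the guest resumes. */
void dispatch_example(void)
{
	assert(toy_dispatch_exit(TOY_EXIT_PREEMPTION_TIMER) == 1);
}
```

Registering even a do-nothing handler matters in this pattern: without a table entry, the exit would fall into the unhandled path instead of resuming the guest.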