Re: [PATCH] can: pcan_usb_core: fix memory leak on failure paths in peak_usb_start()
On 06/09/2013 08:56, Marc Kleine-Budde wrote:
> On 09/06/2013 08:52 AM, Stephane Grosjean wrote:
>> Tx and rx urbs are not deallocated if something goes wrong in peak_usb_start().
>> The patch fixes error handling to deallocate all the resources.
>>
>> Found by Linux Driver Verification project (linuxtesting.org).
>>
>> Signed-off-by: Alexey Khoroshilov
>> Acked-by: Stephane Grosjean
>
> Tnx,
> Marc
>
> BTW: A simple reply to the original patch with your Acked-by is sufficient.

Ok, thx Marc. I'll keep it in mind for next time (if any ;-))

Stéphane
--
PEAK-System Technik GmbH, Otto-Roehm-Strasse 69, D-64293 Darmstadt
Geschaeftsleitung: A.Gach/U.Wilhelm,St.Nr.:007/241/13586 FA Darmstadt
HRB-9183 Darmstadt, Ust.IdNr.:DE 202220078, WEE-Reg.-Nr.: DE39305391
Tel.+49 (0)6151-817320 / Fax:+49 (0)6151-817329, i...@peak-system.com

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ARM: OMAP2+: am335x-bone*: add DT for BeagleBone Black
On 9/6/2013 12:03 PM, Koen Kooi wrote: The BeagleBone Black is basically a regular BeagleBone with eMMC and HDMI added, so create a common dtsi both can use. MMC support for AM335x still isn't in, so only the LDO change has been added. Signed-off-by: Koen Kooi --- .../{am335x-bone.dts => am335x-bone-common.dtsi} | 3 - arch/arm/boot/dts/am335x-bone.dts | 256 + arch/arm/boot/dts/am335x-boneblack.dts | 18 ++ 3 files changed, 19 insertions(+), 258 deletions(-) copy arch/arm/boot/dts/{am335x-bone.dts => am335x-bone-common.dtsi} (99%) create mode 100644 arch/arm/boot/dts/am335x-boneblack.dts How did you test am335x-boneblack.dtb? where are the Makefile changes for boneblack? diff --git a/arch/arm/boot/dts/am335x-bone.dts b/arch/arm/boot/dts/am335x-bone-common.dtsi similarity index 99% copy from arch/arm/boot/dts/am335x-bone.dts copy to arch/arm/boot/dts/am335x-bone-common.dtsi index d318987..2f66ded 100644 --- a/arch/arm/boot/dts/am335x-bone.dts +++ b/arch/arm/boot/dts/am335x-bone-common.dtsi @@ -5,9 +5,6 @@ * it under the terms of the GNU General Public License version 2 as * published by the Free Software Foundation. 
*/ -/dts-v1/; - -#include "am33xx.dtsi" / { model = "TI AM335x BeagleBone"; diff --git a/arch/arm/boot/dts/am335x-bone.dts b/arch/arm/boot/dts/am335x-bone.dts index d318987..7993c48 100644 --- a/arch/arm/boot/dts/am335x-bone.dts +++ b/arch/arm/boot/dts/am335x-bone.dts @@ -8,258 +8,4 @@ /dts-v1/; #include "am33xx.dtsi" - -/ { - model = "TI AM335x BeagleBone"; - compatible = "ti,am335x-bone", "ti,am33xx"; - - cpus { - cpu@0 { - cpu0-supply = <&dcdc2_reg>; - }; - }; - - memory { - device_type = "memory"; - reg = <0x8000 0x1000>; /* 256 MB */ - }; - - am33xx_pinmux: pinmux@44e10800 { - pinctrl-names = "default"; - pinctrl-0 = <&clkout2_pin>; - - user_leds_s0: user_leds_s0 { - pinctrl-single,pins = < - 0x54 (PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* gpmc_a5.gpio1_21 */ - 0x58 (PIN_OUTPUT_PULLUP | MUX_MODE7)/* gpmc_a6.gpio1_22 */ - 0x5c (PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* gpmc_a7.gpio1_23 */ - 0x60 (PIN_OUTPUT_PULLUP | MUX_MODE7)/* gpmc_a8.gpio1_24 */ - >; - }; - - i2c0_pins: pinmux_i2c0_pins { - pinctrl-single,pins = < - 0x188 (PIN_INPUT_PULLUP | MUX_MODE0)/* i2c0_sda.i2c0_sda */ - 0x18c (PIN_INPUT_PULLUP | MUX_MODE0)/* i2c0_scl.i2c0_scl */ - >; - }; - - uart0_pins: pinmux_uart0_pins { - pinctrl-single,pins = < - 0x170 (PIN_INPUT_PULLUP | MUX_MODE0)/* uart0_rxd.uart0_rxd */ - 0x174 (PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* uart0_txd.uart0_txd */ - >; - }; - - clkout2_pin: pinmux_clkout2_pin { - pinctrl-single,pins = < - 0x1b4 (PIN_OUTPUT_PULLDOWN | MUX_MODE3) /* xdma_event_intr1.clkout2 */ - >; - }; - - cpsw_default: cpsw_default { - pinctrl-single,pins = < - /* Slave 1 */ - 0x110 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxerr.mii1_rxerr */ - 0x114 (PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* mii1_txen.mii1_txen */ - 0x118 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxdv.mii1_rxdv */ - 0x11c (PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* mii1_txd3.mii1_txd3 */ - 0x120 (PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* mii1_txd2.mii1_txd2 */ - 0x124 (PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* mii1_txd1.mii1_txd1 */ - 0x128 
(PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* mii1_txd0.mii1_txd0 */ - 0x12c (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_txclk.mii1_txclk */ - 0x130 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxclk.mii1_rxclk */ - 0x134 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxd3.mii1_rxd3 */ - 0x138 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxd2.mii1_rxd2 */ - 0x13c (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxd1.mii1_rxd1 */ - 0x140 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxd0.mii1_rxd0 */ - >; - }; - - cpsw_sl
Re: [PATCH v9 12/13] KVM: PPC: Add support for IOMMU in-kernel handling
On Thu, Sep 05, 2013 at 02:05:09PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2013-09-03 at 13:53 +0300, Gleb Natapov wrote:
> > > Or supporting all IOMMU links (and leaving emulated stuff as is) in one
> > > "device" is the last thing I have to do and then you'll ack the patch?
> > >
> > I am concerned more about API here. Internal implementation details I
> > leave to powerpc experts :)
>
> So Gleb, I want to step in for a bit here.
>
> While I understand that the new KVM device API is all nice and shiny and that
> this whole thing should probably have been KVM devices in the first place (had
> they existed or had we been told back then), the point is, the API for handling
> HW IOMMUs that Alexey is trying to add is an extension of an existing mechanism
> used for emulated IOMMUs.
>
> The internal data structure is shared, and fundamentally, by forcing him to
> use that new KVM device for the "new stuff", we create an oddball API with
> an ioctl for one type of iommu and a KVM device for the other, which makes
> the implementation a complete mess in the kernel (and you should care :-)
>
Is it an unfixable mess? Even if Alexey does what you suggested earlier?

  - Convert *both* existing TCE objects to the new KVM_CREATE_DEVICE, and have
    some backward compat code for the old one.

The point is that an implementation can usually be changed, but changing an API is much harder.

> So for something completely new, I would tend to agree with you. However, I
> still think that for this specific case, we should just plonk-in the original
> ioctl proposed by Alexey and be done with it.
>
Do you think this is the last extension to the IOMMU code, or will we see more and use the same justification to keep adding ioctls?

--
			Gleb.
Re: [PATCH] can: pcan_usb_core: fix memory leak on failure paths in peak_usb_start()
On 09/06/2013 08:52 AM, Stephane Grosjean wrote:
> Tx and rx urbs are not deallocated if something goes wrong in
> peak_usb_start().
> The patch fixes error handling to deallocate all the resources.
>
> Found by Linux Driver Verification project (linuxtesting.org).
>
> Signed-off-by: Alexey Khoroshilov
> Acked-by: Stephane Grosjean

Tnx,
Marc

BTW: A simple reply to the original patch with your Acked-by is sufficient.

--
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-     |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |
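The leak described in the commit message (buffers allocated early in a start routine are not released when a later step fails) is usually fixed with a goto-based unwind. The following is a minimal userspace sketch of that pattern, not the actual pcan_usb_core change: `ctx_start()`, `ctx_stop()`, `tracked_alloc()`, `NUM_BUFS` and the failure-injection counters are all hypothetical names, and plain `malloc()` stands in for urb allocation.

```c
#include <stdlib.h>

#define NUM_BUFS 4	/* hypothetical number of rx and tx buffers */

struct dev_ctx {
	void *rx[NUM_BUFS];
	void *tx[NUM_BUFS];
};

static int alloc_count;		/* live allocations, to observe leaks */
static int alloc_calls;		/* number of allocation attempts so far */
static int fail_at = -1;	/* inject a failure on this attempt, -1 = never */

/* Hypothetical allocator wrapper that can be told to fail. */
static void *tracked_alloc(size_t n)
{
	void *p;

	if (alloc_calls++ == fail_at)
		return NULL;
	p = malloc(n);
	if (p)
		alloc_count++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		free(p);
		alloc_count--;
	}
}

/* Allocate all rx, then all tx buffers; undo everything on any failure. */
int ctx_start(struct dev_ctx *c)
{
	int i, j = 0;

	for (i = 0; i < NUM_BUFS; i++) {
		c->rx[i] = tracked_alloc(64);
		if (!c->rx[i])
			goto err_free_rx;
	}
	for (j = 0; j < NUM_BUFS; j++) {
		c->tx[j] = tracked_alloc(64);
		if (!c->tx[j])
			goto err_free_tx;
	}
	return 0;

err_free_tx:
	/* Release only the tx buffers that were actually allocated. */
	while (j-- > 0) {
		tracked_free(c->tx[j]);
		c->tx[j] = NULL;
	}
err_free_rx:
	/* Fall through: rx buffers are released in reverse order. */
	while (i-- > 0) {
		tracked_free(c->rx[i]);
		c->rx[i] = NULL;
	}
	return -1;
}

/* Normal teardown for the success case. */
void ctx_stop(struct dev_ctx *c)
{
	int i;

	for (i = 0; i < NUM_BUFS; i++) {
		tracked_free(c->tx[i]);
		c->tx[i] = NULL;
		tracked_free(c->rx[i]);
		c->rx[i] = NULL;
	}
}
```

The labels are ordered so that a later failure falls through the earlier cleanup steps, which keeps each error path from having to repeat the full release sequence.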
Re: [PATCH] VMCI: fix to pass correct device identity to free_irq()
On Fri, Sep 06, 2013 at 02:39:28PM +0800, Wei Yongjun wrote:
> From: Wei Yongjun
>
> free_irq() expects the same device identity that was passed to
> corresponding request_irq(), otherwise the IRQ is not freed.
>
> Signed-off-by: Wei Yongjun

Acked-by: Dmitry Torokhov

> ---
>  drivers/misc/vmw_vmci/vmci_guest.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/misc/vmw_vmci/vmci_guest.c
> b/drivers/misc/vmw_vmci/vmci_guest.c
> index b3a2b76..c98b03b 100644
> --- a/drivers/misc/vmw_vmci/vmci_guest.c
> +++ b/drivers/misc/vmw_vmci/vmci_guest.c
> @@ -649,7 +649,7 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
>  	return 0;
>
>  err_free_irq:
> -	free_irq(vmci_dev->irq, &vmci_dev);
> +	free_irq(vmci_dev->irq, vmci_dev);
>  	tasklet_kill(&vmci_dev->datagram_tasklet);
>  	tasklet_kill(&vmci_dev->bm_tasklet);
>
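To illustrate why the one-character change matters, here is a small userspace sketch of a handler table that matches on the (irq, dev_id) pair. It is only a model under stated assumptions: `mock_request_irq()`, `mock_free_irq()`, `mock_count_registered()` and `struct mock_vmci_dev` are hypothetical stand-ins, not the kernel's genirq implementation, and the return values are chosen for demonstration.

```c
#include <stddef.h>

#define MOCK_MAX_IRQS 4

/* One registered handler: the IRQ number plus the cookie given at request time. */
struct mock_irq_slot {
	int irq;
	void *dev_id;
	int in_use;
};

static struct mock_irq_slot mock_slots[MOCK_MAX_IRQS];

/* Hypothetical device structure, standing in for the driver's private data. */
struct mock_vmci_dev {
	int irq;
};

/* Register a handler; returns 0 on success, -1 if the table is full. */
int mock_request_irq(int irq, void *dev_id)
{
	for (size_t i = 0; i < MOCK_MAX_IRQS; i++) {
		if (!mock_slots[i].in_use) {
			mock_slots[i].irq = irq;
			mock_slots[i].dev_id = dev_id;
			mock_slots[i].in_use = 1;
			return 0;
		}
	}
	return -1;
}

/*
 * Release a handler. The entry is found by comparing both the IRQ number
 * and the dev_id pointer, so a different pointer value matches nothing.
 * Returns 1 if an entry was released, 0 otherwise.
 */
int mock_free_irq(int irq, void *dev_id)
{
	for (size_t i = 0; i < MOCK_MAX_IRQS; i++) {
		if (mock_slots[i].in_use && mock_slots[i].irq == irq &&
		    mock_slots[i].dev_id == dev_id) {
			mock_slots[i].in_use = 0;
			return 1;
		}
	}
	return 0;
}

/* Number of handlers still registered. */
int mock_count_registered(void)
{
	int n = 0;

	for (size_t i = 0; i < MOCK_MAX_IRQS; i++)
		n += mock_slots[i].in_use;
	return n;
}
```

With a local `struct mock_vmci_dev *vmci_dev`, passing `&vmci_dev` hands over the address of the pointer variable rather than the address of the device, so in this model the lookup fails and the handler stays registered; passing `vmci_dev` releases it.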
[PATCH V3] PCI: exynos: add support for MSI
This patch adds support for Message Signaled Interrupt in the Exynos PCIe driver using Synopsys designware PCIe core IP. Signed-off-by: Siva Reddy Kallam Signed-off-by: Srikanth T Shivanand Signed-off-by: Jingoo Han Cc: Pratyush Anand Cc: Mohit KUMAR --- Changes since v2: - fixed MAX_MSI_CTRLS because MAX_MSI_IRQS is 32 only - used __get_free_pages() to allocate msi_data - used one msi_data and msi_irq_in_use per one RC - used irq_domain to represent the MSI controller - removed msi-base irq number from device tree because this is not a hardware property. Changes since v1: - removed unnecessary exynos_pcie_clear_irq_level() - updated the bindings documentation - used new msi_chip infrastructure - removed ARCH_SUPPORTS_MSI - replaced #ifdef guards with IS_ENABLED(CONFIG_PCI_MSI) drivers/pci/host/pci-exynos.c | 44 +++ drivers/pci/host/pcie-designware.c | 240 drivers/pci/host/pcie-designware.h | 14 +++ 3 files changed, 298 insertions(+) diff --git a/drivers/pci/host/pci-exynos.c b/drivers/pci/host/pci-exynos.c index 94e096b..f062aca 100644 --- a/drivers/pci/host/pci-exynos.c +++ b/drivers/pci/host/pci-exynos.c @@ -48,6 +48,7 @@ struct exynos_pcie { #define PCIE_IRQ_SPECIAL 0x008 #define PCIE_IRQ_EN_PULSE 0x00c #define PCIE_IRQ_EN_LEVEL 0x010 +#define IRQ_MSI_ENABLE (0x1 << 2) #define PCIE_IRQ_EN_SPECIAL0x014 #define PCIE_PWR_RESET 0x018 #define PCIE_CORE_RESET0x01c @@ -342,9 +343,36 @@ static irqreturn_t exynos_pcie_irq_handler(int irq, void *arg) return IRQ_HANDLED; } +static irqreturn_t exynos_pcie_msi_irq_handler(int irq, void *arg) +{ + struct pcie_port *pp = arg; + + dw_handle_msi_irq(pp); + + return IRQ_HANDLED; +} + +static void exynos_pcie_msi_init(struct pcie_port *pp) +{ + u32 val; + struct exynos_pcie *exynos_pcie = to_exynos_pcie(pp); + + dw_pcie_msi_init(pp); + + /* enable MSI interrupt */ + val = exynos_elb_readl(exynos_pcie, PCIE_IRQ_EN_LEVEL); + val |= IRQ_MSI_ENABLE; + exynos_elb_writel(exynos_pcie, val, PCIE_IRQ_EN_LEVEL); + return; +} + static void 
exynos_pcie_enable_interrupts(struct pcie_port *pp) { exynos_pcie_enable_irq_pulse(pp); + + if (IS_ENABLED(CONFIG_PCI_MSI)) + exynos_pcie_msi_init(pp); + return; } @@ -430,6 +458,22 @@ static int add_pcie_port(struct pcie_port *pp, struct platform_device *pdev) return ret; } + if (IS_ENABLED(CONFIG_PCI_MSI)) { + pp->msi_irq = platform_get_irq(pdev, 0); + if (!pp->msi_irq) { + dev_err(&pdev->dev, "failed to get msi irq\n"); + return -ENODEV; + } + + ret = devm_request_irq(&pdev->dev, pp->msi_irq, + exynos_pcie_msi_irq_handler, + IRQF_SHARED, "exynos-pcie", pp); + if (ret) { + dev_err(&pdev->dev, "failed to request msi irq\n"); + return ret; + } + } + pp->root_bus_nr = -1; pp->ops = &exynos_pcie_host_ops; diff --git a/drivers/pci/host/pcie-designware.c b/drivers/pci/host/pcie-designware.c index c10e9ac..8963017 100644 --- a/drivers/pci/host/pcie-designware.c +++ b/drivers/pci/host/pcie-designware.c @@ -11,8 +11,11 @@ * published by the Free Software Foundation. */ +#include +#include #include #include +#include #include #include #include @@ -142,6 +145,204 @@ int dw_pcie_wr_own_conf(struct pcie_port *pp, int where, int size, return ret; } +static struct irq_chip dw_msi_irq_chip = { + .name = "PCI-MSI", + .irq_enable = unmask_msi_irq, + .irq_disable = mask_msi_irq, + .irq_mask = mask_msi_irq, + .irq_unmask = unmask_msi_irq, +}; + +/* MSI int handler */ +void dw_handle_msi_irq(struct pcie_port *pp) +{ + unsigned long val; + int i, pos; + + for (i = 0; i < MAX_MSI_CTRLS; i++) { + dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_STATUS + i * 12, 4, + (u32 *)&val); + if (val) { + pos = 0; + while ((pos = find_next_bit(&val, 32, pos)) != 32) { + generic_handle_irq(pp->msi_irq_start + + (i * 32) + pos); + pos++; + } + } + dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + i * 12, 4, val); + } +} + +void dw_pcie_msi_init(struct pcie_port *pp) +{ + pp->msi_data = __get_free_pages(GFP_KERNEL, 0); + + /* program the msi_data */ + dw_pcie_wr_own_conf(pp, PCIE_MSI_ADDR_LO, 4, + 
virt_to_phys((void *)pp->msi_data));
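The dispatch loop in dw_handle_msi_irq() above maps each set bit of a per-controller status word to an interrupt number of the form start + controller * 32 + bit. A minimal userspace sketch of just that mapping follows; `collect_msi_irqs()` and its parameters are hypothetical, and the status words are passed in as an array instead of being read from hardware registers.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_BITS_PER_CTRL 32	/* one 32-bit status word per MSI controller */

/*
 * Walk nctrl status words and record, for every set bit, the interrupt
 * number irq_start + ctrl * 32 + bit. At most cap entries are written
 * to out; the return value is the number of entries stored.
 */
size_t collect_msi_irqs(const uint32_t *status, int nctrl, int irq_start,
			int *out, size_t cap)
{
	size_t n = 0;
	int i, pos;

	for (i = 0; i < nctrl; i++) {
		uint32_t val = status[i];

		for (pos = 0; pos < MOCK_BITS_PER_CTRL; pos++) {
			if (!(val & (UINT32_C(1) << pos)))
				continue;
			if (n < cap)
				out[n++] = irq_start + i * MOCK_BITS_PER_CTRL + pos;
		}
	}
	return n;
}
```

The sketch scans bits one by one for clarity; the patch uses find_next_bit() for the same purpose and calls generic_handle_irq() instead of filling an array.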
Re: [PATCH 2/2] fsl: set wakeup sources
Sorry, linux-kernel subscribers: this was meant for internal team review, and linux-kernel was CCed by mistake. Please ignore this mail. On 09/06/2013 02:46 PM, hongbo.zh...@freescale.com wrote: From: Hongbo Zhang Some devices can work as wakeup sources and should remain powered on during system deep sleep. This patch adds an interface for configuring each device's power supply status during deep sleep. Signed-off-by: Hongbo Zhang --- arch/powerpc/boot/dts/fsl/qoriq-power.dtsi | 73 arch/powerpc/sysdev/fsl_rcpm.c | 43 2 files changed, 116 insertions(+) create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-power.dtsi diff --git a/arch/powerpc/boot/dts/fsl/qoriq-power.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-power.dtsi new file mode 100644 index 000..c5c2ba0 --- /dev/null +++ b/arch/powerpc/boot/dts/fsl/qoriq-power.dtsi @@ -0,0 +1,73 @@ +/* + * QorIQ Power Management device tree stub + * + * Copyright 2013 Freescale Semiconductor Inc. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Freescale Semiconductor nor the + * names of its contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License ("GPL") as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. 
+ * + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +/* IPPDEXPCR: IP Power Down EXcePtion Control Register */ +rcpm-power@e2140 { + compatible = "fsl,rcpm-ippdexpcr"; + reg = <0xe2140 0x4>; + + mac1_1_power: soc-power@0 { + fsl,ippdexpcr-mask = <0x8000>; + }; + mac1_2_power: soc-power@1 { + fsl,ippdexpcr-mask = <0x4000>; + }; + mac1_3_power: soc-power@2 { + fsl,ippdexpcr-mask = <0x2000>; + }; + mac1_4_power: soc-power@3 { + fsl,ippdexpcr-mask = <0x1000>; + }; + mac1_5_power: soc-power@4 { + fsl,ippdexpcr-mask = <0x0800>; + }; + sdhc_power: soc-power@24 { + fsl,ippdexpcr-mask = <0x0080>; + }; + gpio_power: soc-power@25 { + fsl,ippdexpcr-mask = <0x0040>; + }; + usb1_power: soc-power@26 { + fsl,ippdexpcr-mask = <0x0020>; + }; + usb2_power: soc-power@27 { + fsl,ippdexpcr-mask = <0x0010>; + }; + fman1_power: soc-power@28 { + fsl,ippdexpcr-mask = <0x0008>; + }; + sap_power: soc-power@31 { + fsl,ippdexpcr-mask = <0x0001>; + }; +}; diff --git a/arch/powerpc/sysdev/fsl_rcpm.c b/arch/powerpc/sysdev/fsl_rcpm.c index ecf43a2..bc21aea 100644 --- a/arch/powerpc/sysdev/fsl_rcpm.c +++ b/arch/powerpc/sysdev/fsl_rcpm.c @@ -23,6 +23,49 @@ struct ccsr_rcpm __iomem *rcpm1_regs; struct ccsr_rcpm_v2 __iomem *rcpm2_regs; +/** + * fsl_rcpm_set_wake - enable/disable device 
working as wakeup source + * @dev: device affected + * @enable: true for keeping power on for this device during deep sleep + * false otherwise + * + * return 0 on success, return -EINVAL if the device cannot wake up system + * and -ENODEV if RCPM unavailable + */ +int fsl_rcpm_set_wake(struct device *dev, bool enable) +{ + int ret = 0; + struct device_node *pw_np; + u32 pw_mask; + + if (!rcpm2_regs) { + dev_err(dev, "%s: RCPM is unavailable\n", __func__); + return -ENODEV; + } + + if (enable && !device_may_wakeup(dev)) + return -EINVAL; + +
Re: [PATCH v4 3/5] clk: dt: binding for basic multiplexer clock
Hi, Chirping in my thoughts below. On 09/05/2013 11:30 PM, Stephen Warren wrote: On 09/05/2013 12:29 PM, Mike Turquette wrote: On Wed, Sep 4, 2013 at 11:36 AM, Stephen Warren wrote: On 09/03/2013 05:22 PM, Mike Turquette wrote: Quoting Stephen Warren (2013-08-30 14:37:46) On 08/30/2013 02:33 PM, Mike Turquette wrote: ... The clock _data_ seems to always have some churn to it. Moving it out to DT reduces that churn from Linux. My concern above is not about kernel data size. That sounds like the opposite of what we should be doing. It's fine for kernel code/data to change; that's a natural part of development. Obviously, we should minimize churn, through thorough review, domain knowledge, etc. And with the "clock mapping" style bindings we'll end up changing both the DT binding definition and the kernel. Not great. What's a "clock mapping" style binding? I guess that means the style where you have a single DT node that provides multiple clocks, rather than one DT node per clock? If the kernel driver changes its internal data, I don't see why that would have any impact at all on the DT binding definition. We should be able to use one DT binding definition with arbitrary drivers. Yes, I'm referring to a single node providing multiple clocks. As an example see the Exynos 5420 binding: Documentation/devicetree/bindings/clock/exynos5420-clock.txt The clock id's are stored as part of the binding definition resulting in a mapping scheme that can be fragile. The mapping shouldn't be fragile if e.g. include/dt-bindings/clock/exynos5420.h were used to define the values. That way, both the Exynos clock driver and Exynos DT files could both include the header, and would always be in sync. There have already been patches to fix the id's assigned in the binding, which isn't supposed to happen because it's a stable interface. That's definitely a real problem. The values should be stable. Preferably, the values should be derived from some aspect of the HW, and hence be stable. 
For example, many clock IDs on Tegra are derived from the clock's bit index within the peripheral clock enable registers. Although I must admit we have a bit of a mess in the Tegra clocks w.r.t. mis-using clock IDs for reset IDs and hence there are some peripheral clock IDS that don't map 1:1 with the register, and there are other clocks which aren't peripheral clocksthat we've assigned arbitrary IDs to rather than some HW-derived ID. Alternatively, perhaps a register address unique to the clock could be used. If new values are added, the additions should all happen in a single tree, and hence can be co-ordinated, thus avoiding any merge-conflicts. Even ignoring HW-derived clock IDs, people writing DT bindings simply need to get used to bindings being an ABI, and put extra effort into making sure the list of clocks is accurate and complete. Finally, while it's true that a DT binding definition is an ABI, and perhaps DT content isn't (so if there's a DT content bug it can simply be fixed), if DT is wrong because of insufficient thought about its content, it's still wrong, and the system doesn't work correctly. Whether we edit a kernel clock driver or a DT file to solve a problem, there was still a problem. Placing the data into DT doesn't make it any less likely there will be a problem if sufficient care isn't taken when thinking about the clock structure. If clock phandles are created by individual nodes in DT then the binding definition need never be updated due to merge conflicts or renaming which plagues the mapping scenario. That's true. But if we take that approach, shouldn't we just ban #clock-cells? The only case #clock-cells would still be legitimate would be an array of identical clocks represented by a single node, and even then the argument could be extended so say: just write out a node for each clock in the array, just like if the clocks weren't in an array or were different types. 
And I'll respond to your points below but the whole "relocate the problem to DT" argument is simply not my main point. What I want to do is increase the usefulness of DT by allowing register-level details into the binding which can Can you expand upon why a DT that encodes register-level details is more useful? I can't see why there would be any difference in usefulness. Sure. The usefulness comes out of the fact that we do not need to maintain data synchronization across dts and clock provider drivers. Only the clock IDs. That's a very small amount of information. And synchronizing the two simply means including a header file that defines the IDs in both places. This is *exactly* why I created the include/dt-bindings/ directory, to house such header files. The data lives in one place and only one place. We absolutely need a phandle to a clock in DT link clock consumer devices to their input clocks, so there is no question that should be in DT. Since we're already doing that, why not do away with trying to keep d
Re: [PATCH v14 0/6] LSM: Multiple concurrent LSMs
On 9/5/2013 11:48 AM, Kees Cook wrote: > On Mon, Aug 26, 2013 at 7:29 PM, Casey Schaufler > wrote: >> On 8/6/2013 3:36 PM, Kees Cook wrote: >>> On Tue, Aug 6, 2013 at 3:25 PM, Casey Schaufler >>> wrote: On 8/5/2013 11:30 PM, Kees Cook wrote: > On Thu, Jul 25, 2013 at 11:52 PM, Casey Schaufler > wrote: >> The /proc/*/attr interfaces are given to one LSM. This can be >> done by setting CONFIG_SECURITY_PRESENT. Additional interfaces >> have been created in /proc/*/attr so that each LSM has its own >> named interfaces. The name of the presenting LSM can be read from > For me, this is one problem that was bothering me, but it was a cosmetic > one that I'd mentioned before: I really disliked the /proc/$pid/attr > interface being named "$lsm.$file". I feel it's important to build > directories in attr/ for each LSM. So, I spent time to figure out a way to > do this. This patch changes the interface to /proc/$pid/attr/$lsm/$file > instead, which I feel has a much more appealing organizational structure. I will confess that the reason I went with .current instead of /current was that the former was easier to implement. >>> Yeah, that's totally fine. It wasn't very obvious (to me) how to >>> implement this initially, so no problem at all. I'm glad there was >>> something more than bug fixes I could contribute to this series. :) >> Oh dear. I'm rebasing for 3.12 and the macros don't generate compiling >> code any longer. It seems that, among other things, readdir is no longer >> a member of file_operations. > Looks like f0c3b5093addc8bfe9fe3a5b01acb7ec7969eafa is what touched > fs/proc/base.c and it should just need a few tweaks from "readdir" > becoming "iterate", and the prototype changing. > > So it should just require bump the macros a little. Let's see if gmail > eats my paste... 
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 4c80ffd..f670349 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2358,17 +2358,17 @@ static const struct file_operations
> proc_pid_attr_operat
> };
>
> #define LSM_DIR_OPS(LSM) \
> -static int proc_##LSM##_attr_dir_readdir(struct file * filp, \
> -void * dirent, filldir_t filldir) \
> +static int proc_##LSM##_attr_dir_iterate(struct file * filp, \
> +struct dir_context *ctx) \
> { \
> - return proc_pident_readdir(filp, dirent, filldir, \
> + return proc_pident_readdir(filp, ctx, \
>LSM##_attr_dir_stuff, \
>ARRAY_SIZE(LSM##_attr_dir_stuff)); \
> } \
> \
> static const struct file_operations proc_##LSM##_attr_dir_ops = { \
> .read = generic_read_dir, \
> - .readdir= proc_##LSM##_attr_dir_readdir, \
> + .iterate= proc_##LSM##_attr_dir_iterate, \
> .llseek = default_llseek, \
> }; \
> \
>
>
> Do you have the rest of the series already ported to 3.12?
>
> -Kees
>
Yes, but I did it last week before my holiday started, and have not updated since. I will become active again upon my return. I hope to have the 3.12 version posted before the Security Summit.
[PATCH 2/2] fsl: set wakeup sources
From: Hongbo Zhang Some devices can work as wakeup sources and should remain powered on during system deep sleep. This patch adds an interface for configuring each device's power supply status during deep sleep. Signed-off-by: Hongbo Zhang --- arch/powerpc/boot/dts/fsl/qoriq-power.dtsi | 73 arch/powerpc/sysdev/fsl_rcpm.c | 43 2 files changed, 116 insertions(+) create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-power.dtsi diff --git a/arch/powerpc/boot/dts/fsl/qoriq-power.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-power.dtsi new file mode 100644 index 000..c5c2ba0 --- /dev/null +++ b/arch/powerpc/boot/dts/fsl/qoriq-power.dtsi @@ -0,0 +1,73 @@ +/* + * QorIQ Power Management device tree stub + * + * Copyright 2013 Freescale Semiconductor Inc. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Freescale Semiconductor nor the + * names of its contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License ("GPL") as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. + * + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. 
IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +/* IPPDEXPCR: IP Power Down EXcePtion Control Register */ +rcpm-power@e2140 { + compatible = "fsl,rcpm-ippdexpcr"; + reg = <0xe2140 0x4>; + + mac1_1_power: soc-power@0 { + fsl,ippdexpcr-mask = <0x8000>; + }; + mac1_2_power: soc-power@1 { + fsl,ippdexpcr-mask = <0x4000>; + }; + mac1_3_power: soc-power@2 { + fsl,ippdexpcr-mask = <0x2000>; + }; + mac1_4_power: soc-power@3 { + fsl,ippdexpcr-mask = <0x1000>; + }; + mac1_5_power: soc-power@4 { + fsl,ippdexpcr-mask = <0x0800>; + }; + sdhc_power: soc-power@24 { + fsl,ippdexpcr-mask = <0x0080>; + }; + gpio_power: soc-power@25 { + fsl,ippdexpcr-mask = <0x0040>; + }; + usb1_power: soc-power@26 { + fsl,ippdexpcr-mask = <0x0020>; + }; + usb2_power: soc-power@27 { + fsl,ippdexpcr-mask = <0x0010>; + }; + fman1_power: soc-power@28 { + fsl,ippdexpcr-mask = <0x0008>; + }; + sap_power: soc-power@31 { + fsl,ippdexpcr-mask = <0x0001>; + }; +}; diff --git a/arch/powerpc/sysdev/fsl_rcpm.c b/arch/powerpc/sysdev/fsl_rcpm.c index ecf43a2..bc21aea 100644 --- a/arch/powerpc/sysdev/fsl_rcpm.c +++ b/arch/powerpc/sysdev/fsl_rcpm.c @@ -23,6 +23,49 @@ struct ccsr_rcpm __iomem *rcpm1_regs; struct ccsr_rcpm_v2 __iomem *rcpm2_regs; +/** + * fsl_rcpm_set_wake - enable/disable device working as wakeup source + * @dev: device affected + * @enable: true for keeping power on for this device during deep sleep + * false otherwise + * + * return 0 on success, return -EINVAL if the device cannot wake up system + * and -ENODEV if RCPM 
unavailable + */ +int fsl_rcpm_set_wake(struct device *dev, bool enable) +{ + int ret = 0; + struct device_node *pw_np; + u32 pw_mask; + + if (!rcpm2_regs) { + dev_err(dev, "%s: RCPM is unavailable\n", __func__); + return -ENODEV; + } + + if (enable && !device_may_wakeup(dev)) + return -EINVAL; + + pw_np = of_parse_phandle(dev->of_node, "fsl,rcpm-handle", 0); + if (!pw_np) + return -EINVAL; + + if (of_property_read_u32(pw_np, "fsl,ippdexpcr-mask", &pw_mask)) { +
[PATCH v2 2/4] ab8500-charger: Remove redundant break
Each of the if-else blocks has a break statement. Remove the additional one which is unreachable.

Signed-off-by: Sachin Kamat
---
No changes since v1.
---
 drivers/power/ab8500_charger.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/power/ab8500_charger.c b/drivers/power/ab8500_charger.c
index 0d355a9..453141e 100644
--- a/drivers/power/ab8500_charger.c
+++ b/drivers/power/ab8500_charger.c
@@ -766,7 +766,6 @@ static int ab8500_charger_max_usb_curr(struct ab8500_charger *di,
 			ret = -ENXIO;
 			break;
 		}
-		break;
 	case USB_STAT_CARKIT_1:
 	case USB_STAT_CARKIT_2:
 	case USB_STAT_ACA_DOCK_CHARGER:
--
1.7.9.5
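As a rough illustration of why the extra statement is dead code, the sketch below has a switch case whose if and else branches both end in `break`, so control can never reach anything placed after the if-else within that case. `classify_link()` and its constants are hypothetical and unrelated to the ab8500 driver's actual states.

```c
/* Hypothetical states, for illustration only. */
enum mock_link_stat {
	MOCK_STAT_HOST,
	MOCK_STAT_CARKIT,
};

/*
 * Both branches of the if-else in the MOCK_STAT_HOST case leave the
 * switch themselves, so a further "break;" after the closing brace of
 * the else block could never execute.
 */
int classify_link(enum mock_link_stat stat, int vbus_ok)
{
	int ret = 0;

	switch (stat) {
	case MOCK_STAT_HOST:
		if (vbus_ok) {
			ret = 1;
			break;
		} else {
			ret = -1;
			break;
		}
		/* a "break;" here would be unreachable */
	case MOCK_STAT_CARKIT:
		ret = 2;
		break;
	}
	return ret;
}
```

Because neither branch falls out of the if-else, the first case also cannot fall through into the second one.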
[PATCH v2 4/4] pm2301-charger: Staticize pm2xxx_charger_die_therm_mngt
pm2xxx_charger_die_therm_mngt is used only in this file. Make it static.

Signed-off-by: Sachin Kamat
Acked-by: Lee Jones
---
No changes since v1.
---
 drivers/power/pm2301_charger.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/pm2301_charger.c b/drivers/power/pm2301_charger.c
index e55d809..b871ba4 100644
--- a/drivers/power/pm2301_charger.c
+++ b/drivers/power/pm2301_charger.c
@@ -205,7 +205,7 @@ static int pm2xxx_charger_batt_therm_mngt(struct pm2xxx_charger *pm2, int val)
 }

-int pm2xxx_charger_die_therm_mngt(struct pm2xxx_charger *pm2, int val)
+static int pm2xxx_charger_die_therm_mngt(struct pm2xxx_charger *pm2, int val)
 {
 	queue_work(pm2->charger_wq, &pm2->check_main_thermal_prot_work);

--
1.7.9.5
[PATCH v2 1/4] ab8500-charger: Check return value of regulator_enable
Check the return value of regulator_enable to silence warnings of the
following type:

drivers/power/ab8500_charger.c:1390:20: warning: ignoring return value of
‘regulator_enable’, declared with attribute warn_unused_result [-Wunused-result]

Signed-off-by: Sachin Kamat
Cc: Lee Jones
---
Compile tested.
Changes since v1:
* converted dev_err and return to dev_warn as suggested by Lee Jones.
---
 drivers/power/ab8500_charger.c |   20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/power/ab8500_charger.c b/drivers/power/ab8500_charger.c
index a4c4a10..0d355a9 100644
--- a/drivers/power/ab8500_charger.c
+++ b/drivers/power/ab8500_charger.c
@@ -1387,8 +1387,14 @@ static int ab8500_charger_ac_en(struct ux500_charger *charger,
 		 * the GPADC module independant of the AB8500 chargers
 		 */
 		if (!di->vddadc_en_ac) {
-			regulator_enable(di->regu);
-			di->vddadc_en_ac = true;
+			ret = regulator_enable(di->regu);
+			if (ret) {
+				dev_warn(di->dev,
+					"Failed to enable regulator\n");
+				di->vddadc_en_ac = false;
+			} else {
+				di->vddadc_en_ac = true;
+			}
 		}
 		/* Check if the requested voltage or current is valid */
@@ -1556,8 +1562,14 @@ static int ab8500_charger_usb_en(struct ux500_charger *charger,
 		 * the GPADC module independant of the AB8500 chargers
 		 */
 		if (!di->vddadc_en_usb) {
-			regulator_enable(di->regu);
-			di->vddadc_en_usb = true;
+			ret = regulator_enable(di->regu);
+			if (ret) {
+				dev_warn(di->dev,
+					"Failed to enable regulator\n");
+				di->vddadc_en_usb = false;
+			} else {
+				di->vddadc_en_usb = true;
+			}
 		}
 		/* Enable USB charging */
--
1.7.9.5
[PATCH v2 3/4] pm2301-charger: Check return value of regulator_enable
Check the return value of regulator_enable to silence the following
warning:

drivers/power/pm2301_charger.c:725:20: warning: ignoring return value of
‘regulator_enable’, declared with attribute warn_unused_result [-Wunused-result]

Signed-off-by: Sachin Kamat
Cc: Lee Jones
---
Compile tested.
Changes since v1:
* converted dev_err and return to dev_warn as suggested by Lee Jones.
---
 drivers/power/pm2301_charger.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/power/pm2301_charger.c b/drivers/power/pm2301_charger.c
index ffa10ed..e55d809 100644
--- a/drivers/power/pm2301_charger.c
+++ b/drivers/power/pm2301_charger.c
@@ -722,8 +722,14 @@ static int pm2xxx_charger_ac_en(struct ux500_charger *charger,
 		dev_dbg(pm2->dev, "Enable AC: %dmV %dmA\n", vset, iset);
 		if (!pm2->vddadc_en_ac) {
-			regulator_enable(pm2->regu);
-			pm2->vddadc_en_ac = true;
+			ret = regulator_enable(pm2->regu);
+			if (ret) {
+				dev_warn(pm2->dev,
+					"Failed to enable vddadc regulator\n");
+				pm2->vddadc_en_ac = false;
+			} else {
+				pm2->vddadc_en_ac = true;
+			}
 		}
 		ret = pm2xxx_charging_init(pm2);
--
1.7.9.5
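Editor's note on patches 1/4 and 3/4: the warning comes from the
warn_unused_result attribute on the callee. Below is a minimal userspace
sketch of the same pattern, assuming GCC or Clang. fake_regulator_enable()
and struct fake_charger are hypothetical stand-ins, not the kernel regulator
API, and unlike the real drivers the sketch simply returns the result to its
caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a callee whose result must not be ignored. */
__attribute__((warn_unused_result))
static int fake_regulator_enable(int should_fail)
{
	return should_fail ? -5 : 0;	/* -5 is an arbitrary error value */
}

struct fake_charger {
	bool vddadc_en_ac;
};

/* Mirrors the patched logic: warn on failure, set the flag only on success. */
static int charger_ac_enable(struct fake_charger *c, int should_fail)
{
	int ret = 0;

	if (!c->vddadc_en_ac) {
		ret = fake_regulator_enable(should_fail);
		if (ret) {
			fprintf(stderr, "Failed to enable regulator\n");
			c->vddadc_en_ac = false;
		} else {
			c->vddadc_en_ac = true;
		}
	}
	return ret;
}
```

Calling fake_regulator_enable() as a bare statement, as the pre-patch code
did with regulator_enable(), is what makes the compiler emit the
-Wunused-result warning.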
[PATCH] perf kvm: fix sample_type manipulation
Manipulating the sample_type of an evsel requires the use of:

	perf_evsel__set_sample_bit()
and
	perf_evsel__reset_sample_bit()

Manipulating the sample type of an evlist requires the id position to be
recalculated.

Signed-off-by: Adrian Hunter
---
 tools/perf/builtin-kvm.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 47b3540..0b7c5a9 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1165,16 +1165,16 @@ static int kvm_live_open_events(struct perf_kvm_stat *kvm)
 		struct perf_event_attr *attr = &pos->attr;
 		/* make sure these *are* set */
-		attr->sample_type |= PERF_SAMPLE_TID;
-		attr->sample_type |= PERF_SAMPLE_TIME;
-		attr->sample_type |= PERF_SAMPLE_CPU;
-		attr->sample_type |= PERF_SAMPLE_RAW;
+		perf_evsel__set_sample_bit(pos, TID);
+		perf_evsel__set_sample_bit(pos, TIME);
+		perf_evsel__set_sample_bit(pos, CPU);
+		perf_evsel__set_sample_bit(pos, RAW);
 		/* make sure these are *not*; want as small a sample as possible */
-		attr->sample_type &= ~PERF_SAMPLE_PERIOD;
-		attr->sample_type &= ~PERF_SAMPLE_IP;
-		attr->sample_type &= ~PERF_SAMPLE_CALLCHAIN;
-		attr->sample_type &= ~PERF_SAMPLE_ADDR;
-		attr->sample_type &= ~PERF_SAMPLE_READ;
+		perf_evsel__reset_sample_bit(pos, PERIOD);
+		perf_evsel__reset_sample_bit(pos, IP);
+		perf_evsel__reset_sample_bit(pos, CALLCHAIN);
+		perf_evsel__reset_sample_bit(pos, ADDR);
+		perf_evsel__reset_sample_bit(pos, READ);
 		attr->mmap = 0;
 		attr->comm = 0;
 		attr->task = 0;
@@ -1188,6 +1188,8 @@ static int kvm_live_open_events(struct perf_kvm_stat *kvm)
 		attr->disabled = 1;
 	}
+	perf_evlist__set_id_pos(evlist);
+
 	err = perf_evlist__open(evlist);
 	if (err < 0) {
 		printf("Couldn't create the events: %s\n", strerror(errno));
--
1.7.11.7
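Editor's note: the reason to go through helpers rather than OR-ing bits
into attr->sample_type directly is that other state is derived from the
bitmask and has to stay in step with it. The sketch below illustrates the
idea only. struct evsel_sketch, the SK_* bit names and the one 8-byte word
per bit size rule are assumptions made for this example, not the perf
implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit names for the sketch. */
#define SK_SAMPLE_IP	(1ULL << 0)
#define SK_SAMPLE_TID	(1ULL << 1)
#define SK_SAMPLE_TIME	(1ULL << 2)

struct evsel_sketch {
	uint64_t sample_type;		/* bitmask of requested fields */
	unsigned int sample_size;	/* derived: bytes occupied by those fields */
};

/* Set a bit and grow the derived size, but only if the bit was clear. */
static void sk_set_sample_bit(struct evsel_sketch *e, uint64_t bit)
{
	if (!(e->sample_type & bit)) {
		e->sample_type |= bit;
		e->sample_size += sizeof(uint64_t);
	}
}

/* Clear a bit and shrink the derived size, but only if the bit was set. */
static void sk_reset_sample_bit(struct evsel_sketch *e, uint64_t bit)
{
	if (e->sample_type & bit) {
		e->sample_type &= ~bit;
		e->sample_size -= sizeof(uint64_t);
	}
}
```

A raw `|=` or `&= ~` on the mask would leave sample_size stale, which is
the class of inconsistency the patch avoids.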
Re: [PATCH v2 4/4] mm/zswap: use GFP_NOIO instead of GFP_KERNEL
On 09/06/2013 01:16 PM, Weijie Yang wrote:
> To avoid zswap store and reclaim functions called recursively,
> use GFP_NOIO instead of GFP_KERNEL
>

The reason for using GFP_KERNEL in the writeback path is that we want to
try our best to move those pages from zswap to the real swap device.

I think it would be better to keep the GFP_KERNEL flag and find some other
way to skip zswap/zswap_frontswap_store() while zswap writeback is in
progress.

What I can think of currently is adding a mutex to zswap: take that mutex
when zswap writeback happens, and check the mutex in
zswap_frontswap_store().

> Signed-off-by: Weijie Yang
> ---
>  mm/zswap.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index cc40e6a..3d05ed8 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -427,7 +427,7 @@ static int zswap_get_swap_cache_page(swp_entry_t entry,
>  		 * Get a new page to read into from swap.
>  		 */
>  		if (!new_page) {
> -			new_page = alloc_page(GFP_KERNEL);
> +			new_page = alloc_page(GFP_NOIO);
>  			if (!new_page)
>  				break; /* Out of memory */
>  		}
> @@ -435,7 +435,7 @@ static int zswap_get_swap_cache_page(swp_entry_t entry,
>  		/*
>  		 * call radix_tree_preload() while we can wait.
>  		 */
> -		err = radix_tree_preload(GFP_KERNEL);
> +		err = radix_tree_preload(GFP_NOIO);
>  		if (err)
>  			break;
>
> @@ -636,7 +636,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
>  	}
>
>  	/* allocate entry */
> -	entry = zswap_entry_cache_alloc(GFP_KERNEL);
> +	entry = zswap_entry_cache_alloc(GFP_NOIO);
>  	if (!entry) {
>  		zswap_reject_kmemcache_fail++;
>  		ret = -ENOMEM;
>

--
Regards,
-Bob
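Editor's note: the alternative suggested in this reply (keep GFP_KERNEL,
but have the store path back off while writeback is running) could look
roughly like the userspace sketch below. It is only an illustration of the
control flow. It swaps the proposed mutex for a C11 atomic flag to stay
short, and sk_store(), sk_writeback() and the counter are hypothetical
names, not zswap code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool sk_writeback_busy;	/* true while writeback is running */
static int sk_rejected_stores;		/* stores refused because of the guard */

static void sk_writeback(void);

/* Store path: refuse to run while writeback is in progress. */
static int sk_store(bool alloc_triggers_reclaim)
{
	if (atomic_load(&sk_writeback_busy)) {
		sk_rejected_stores++;
		return -1;
	}
	/* Simulate an allocation that falls into reclaim and writeback. */
	if (alloc_triggers_reclaim)
		sk_writeback();
	return 0;
}

/* Writeback path: mark itself busy, then re-enter the store path. */
static void sk_writeback(void)
{
	bool expected = false;

	if (!atomic_compare_exchange_strong(&sk_writeback_busy, &expected, true))
		return;			/* someone else is already writing back */
	(void)sk_store(false);		/* the recursive call is rejected by the guard */
	atomic_store(&sk_writeback_busy, false);
}
```

Note that a global guard like this also rejects unrelated stores from other
contexts while writeback runs. Whether that is acceptable is part of the
design question raised in the thread.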
Re: linux-next: problem fetching the watchdog tree
Hi Stephen,

> Fetching the wireless tree yesterday and today produced this error:
>
> fatal: unable to connect to www.linux-watchdog.org:
> www.linux-watchdog.org[0: 83.149.101.17]: errno=Connection refused

Strange. I had a git zombie process, got rid of it 2 days ago and
restarted git, but apparently it didn't do anything anymore.
I just restarted it and saw a pull coming in again. So it is fixed now.

Thanks for pointing it out.

Wim.
[PATCH] VMCI: fix to pass correct device identity to free_irq()
From: Wei Yongjun

free_irq() expects the same device identity that was passed to the
corresponding request_irq(), otherwise the IRQ is not freed.

Signed-off-by: Wei Yongjun
---
 drivers/misc/vmw_vmci/vmci_guest.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/vmw_vmci/vmci_guest.c b/drivers/misc/vmw_vmci/vmci_guest.c
index b3a2b76..c98b03b 100644
--- a/drivers/misc/vmw_vmci/vmci_guest.c
+++ b/drivers/misc/vmw_vmci/vmci_guest.c
@@ -649,7 +649,7 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
 	return 0;
 err_free_irq:
-	free_irq(vmci_dev->irq, &vmci_dev);
+	free_irq(vmci_dev->irq, vmci_dev);
 	tasklet_kill(&vmci_dev->datagram_tasklet);
 	tasklet_kill(&vmci_dev->bm_tasklet);
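Editor's note: the bug is a pointer-identity mismatch, since &vmci_dev (the
address of the local pointer variable) is not the value that was
registered. The userspace sketch below models a handler table keyed by
(irq, dev_id) to show why the lookup misses. sk_request_irq(),
sk_free_irq() and struct sk_dev are hypothetical and only imitate the
matching rule described in the changelog.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_HANDLERS 4

struct sk_irq_slot {
	int irq;
	void *dev_id;	/* identity cookie supplied at registration */
	int used;
};

static struct sk_irq_slot sk_slots[SK_MAX_HANDLERS];

/* Register a handler identity; returns 0 on success, -1 if the table is full. */
static int sk_request_irq(int irq, void *dev_id)
{
	for (size_t i = 0; i < SK_MAX_HANDLERS; i++) {
		if (!sk_slots[i].used) {
			sk_slots[i].irq = irq;
			sk_slots[i].dev_id = dev_id;
			sk_slots[i].used = 1;
			return 0;
		}
	}
	return -1;
}

/* Free only on an exact (irq, dev_id) match; returns 1 if freed, 0 if not. */
static int sk_free_irq(int irq, void *dev_id)
{
	for (size_t i = 0; i < SK_MAX_HANDLERS; i++) {
		if (sk_slots[i].used && sk_slots[i].irq == irq &&
		    sk_slots[i].dev_id == dev_id) {
			sk_slots[i].used = 0;
			return 1;
		}
	}
	return 0;
}

struct sk_dev {
	int irq;
};
```

With a `struct sk_dev *vmci_dev` registered via
`sk_request_irq(vmci_dev->irq, vmci_dev)`, the call
`sk_free_irq(vmci_dev->irq, &vmci_dev)` finds nothing, while
`sk_free_irq(vmci_dev->irq, vmci_dev)` releases the slot.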
Re: [PATCH 1/2] SDT markers listing by perf
Hi Hemant,

On Wed, 04 Sep 2013 23:07:57 +0530, Hemant wrote:
> On 09/04/2013 12:12 PM, Namhyung Kim wrote:
>> On Tue, 03 Sep 2013 13:06:55 +0530, Hemant Kumar wrote:
>>> +	/*
>>> +	 * Look for Section type = SHT_NOTE, flags = no SHF_ALLOC
>>> +	 * and name = .note.stapsdt
>>> +	 */
>>> +	scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
>>> +	if (scn == NULL) {
>>> +		pr_err("%s section not found!\n", NOTE_SCN);
>>> +		goto out_end;
>>> +	}
>>> +
>>> +	if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
>>> +		goto out_end;
>>> +
>>> +	data = elf_getdata(scn, NULL);
>>> +
>>> +	/* Get the notes */
>>> +	for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
>>> +					      &desc_off)) > 0; offset = next) {
>>> +		tmp = populate_note(&elf, (const char *)((long)(data->d_buf) +
>>> +						(long)desc_off),
>>> +				    nhdr.n_descsz, nhdr.n_type);
>> Shouldn't we check the name of note being "stapsdt" as well as version
>> (type) 3?
>
> Since, we are already fetching the section NOTE_SCN (".note.stapsdt")
> and then we check for the type being SHT_NOTE and SHF_ALLOC, is it
> required to do the same for the individual notes?

I don't know. For now the section seems to contain only SDT notes whose
name is "stapsdt" and whose type is 3, but that could change in the
future.

Thanks,
Namhyung
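Editor's note: the per-note check discussed above is small. The sketch
below shows one way to write it, assuming the note name and its size have
already been extracted from the note header. sk_is_sdt_note() and the SK_*
constants are hypothetical names, and the values ("stapsdt", type 3) are
the ones quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_SDT_NOTE_NAME "stapsdt"
#define SK_SDT_NOTE_TYPE 3u

/*
 * Return nonzero if a note looks like an SDT marker note.
 * namesz is expected to count the terminating NUL, as ELF note
 * name sizes do.
 */
static int sk_is_sdt_note(const char *name, size_t namesz, unsigned int type)
{
	return type == SK_SDT_NOTE_TYPE &&
	       namesz == sizeof(SK_SDT_NOTE_NAME) &&
	       memcmp(name, SK_SDT_NOTE_NAME, sizeof(SK_SDT_NOTE_NAME)) == 0;
}
```

In the quoted loop, a check like this would sit before the call to
populate_note(), skipping any note that does not match.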
[PATCH] ARM: OMAP2+: am335x-bone*: add DT for BeagleBone Black
The BeagleBone Black is basically a regular BeagleBone with eMMC and HDMI added, so create a common dtsi both can use. MMC support for AM335x still isn't in, so only the LDO change has been added. Signed-off-by: Koen Kooi --- .../{am335x-bone.dts => am335x-bone-common.dtsi} | 3 - arch/arm/boot/dts/am335x-bone.dts | 256 + arch/arm/boot/dts/am335x-boneblack.dts | 18 ++ 3 files changed, 19 insertions(+), 258 deletions(-) copy arch/arm/boot/dts/{am335x-bone.dts => am335x-bone-common.dtsi} (99%) create mode 100644 arch/arm/boot/dts/am335x-boneblack.dts diff --git a/arch/arm/boot/dts/am335x-bone.dts b/arch/arm/boot/dts/am335x-bone-common.dtsi similarity index 99% copy from arch/arm/boot/dts/am335x-bone.dts copy to arch/arm/boot/dts/am335x-bone-common.dtsi index d318987..2f66ded 100644 --- a/arch/arm/boot/dts/am335x-bone.dts +++ b/arch/arm/boot/dts/am335x-bone-common.dtsi @@ -5,9 +5,6 @@ * it under the terms of the GNU General Public License version 2 as * published by the Free Software Foundation. 
*/ -/dts-v1/; - -#include "am33xx.dtsi" / { model = "TI AM335x BeagleBone"; diff --git a/arch/arm/boot/dts/am335x-bone.dts b/arch/arm/boot/dts/am335x-bone.dts index d318987..7993c48 100644 --- a/arch/arm/boot/dts/am335x-bone.dts +++ b/arch/arm/boot/dts/am335x-bone.dts @@ -8,258 +8,4 @@ /dts-v1/; #include "am33xx.dtsi" - -/ { - model = "TI AM335x BeagleBone"; - compatible = "ti,am335x-bone", "ti,am33xx"; - - cpus { - cpu@0 { - cpu0-supply = <&dcdc2_reg>; - }; - }; - - memory { - device_type = "memory"; - reg = <0x8000 0x1000>; /* 256 MB */ - }; - - am33xx_pinmux: pinmux@44e10800 { - pinctrl-names = "default"; - pinctrl-0 = <&clkout2_pin>; - - user_leds_s0: user_leds_s0 { - pinctrl-single,pins = < - 0x54 (PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* gpmc_a5.gpio1_21 */ - 0x58 (PIN_OUTPUT_PULLUP | MUX_MODE7)/* gpmc_a6.gpio1_22 */ - 0x5c (PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* gpmc_a7.gpio1_23 */ - 0x60 (PIN_OUTPUT_PULLUP | MUX_MODE7)/* gpmc_a8.gpio1_24 */ - >; - }; - - i2c0_pins: pinmux_i2c0_pins { - pinctrl-single,pins = < - 0x188 (PIN_INPUT_PULLUP | MUX_MODE0)/* i2c0_sda.i2c0_sda */ - 0x18c (PIN_INPUT_PULLUP | MUX_MODE0)/* i2c0_scl.i2c0_scl */ - >; - }; - - uart0_pins: pinmux_uart0_pins { - pinctrl-single,pins = < - 0x170 (PIN_INPUT_PULLUP | MUX_MODE0)/* uart0_rxd.uart0_rxd */ - 0x174 (PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* uart0_txd.uart0_txd */ - >; - }; - - clkout2_pin: pinmux_clkout2_pin { - pinctrl-single,pins = < - 0x1b4 (PIN_OUTPUT_PULLDOWN | MUX_MODE3) /* xdma_event_intr1.clkout2 */ - >; - }; - - cpsw_default: cpsw_default { - pinctrl-single,pins = < - /* Slave 1 */ - 0x110 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxerr.mii1_rxerr */ - 0x114 (PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* mii1_txen.mii1_txen */ - 0x118 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxdv.mii1_rxdv */ - 0x11c (PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* mii1_txd3.mii1_txd3 */ - 0x120 (PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* mii1_txd2.mii1_txd2 */ - 0x124 (PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* mii1_txd1.mii1_txd1 */ - 0x128 
(PIN_OUTPUT_PULLDOWN | MUX_MODE0) /* mii1_txd0.mii1_txd0 */ - 0x12c (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_txclk.mii1_txclk */ - 0x130 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxclk.mii1_rxclk */ - 0x134 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxd3.mii1_rxd3 */ - 0x138 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxd2.mii1_rxd2 */ - 0x13c (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxd1.mii1_rxd1 */ - 0x140 (PIN_INPUT_PULLUP | MUX_MODE0)/* mii1_rxd0.mii1_rxd0 */ - >; - }; - - cpsw_sleep: cpsw_sleep { - pinctrl-single,pins = < - /* Slave 1 reset value */ -
Re: [PATCH v2 2/4] mm/zswap: bugfix: memory leak when invalidate and reclaim occur concurrently
On 09/06/2013 01:16 PM, Weijie Yang wrote:
> Consider the following scenario:
> thread 0: reclaim entry x (get refcount, but not call zswap_get_swap_cache_page)
> thread 1: call zswap_frontswap_invalidate_page to invalidate entry x.
> 	finished, entry x and its zbud is not freed as its refcount != 0
> 	now, the swap_map[x] = 0
> thread 0: now call zswap_get_swap_cache_page
> 	swapcache_prepare return -ENOENT because entry x is not used any more
> 	zswap_get_swap_cache_page return ZSWAP_SWAPCACHE_NOMEM
> 	zswap_writeback_entry do nothing except put refcount
> Now, the memory of zswap_entry x and its zpage leak.
>
> Modify:
> - check the refcount in fail path, free memory if it is not referenced.
> - use ZSWAP_SWAPCACHE_FAIL instead of ZSWAP_SWAPCACHE_NOMEM as the fail path
>   can be not only caused by nomem but also by invalidate.
>
> Signed-off-by: Weijie Yang

Reviewed-by: Bob Liu

> ---
>  mm/zswap.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index cbd9578..1be7b90 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -387,7 +387,7 @@ static void zswap_free_entry(struct zswap_tree *tree, struct zswap_entry *entry)
>  enum zswap_get_swap_ret {
>  	ZSWAP_SWAPCACHE_NEW,
>  	ZSWAP_SWAPCACHE_EXIST,
> -	ZSWAP_SWAPCACHE_NOMEM
> +	ZSWAP_SWAPCACHE_FAIL,
>  };
>
>  /*
> @@ -401,9 +401,9 @@ enum zswap_get_swap_ret {
>   * added to the swap cache, and returned in retpage.
>   *
>   * If success, the swap cache page is returned in retpage
> - * Returns 0 if page was already in the swap cache, page is not locked
> - * Returns 1 if the new page needs to be populated, page is locked
> - * Returns <0 on error
> + * Returns ZSWAP_SWAPCACHE_EXIST if page was already in the swap cache
> + * Returns ZSWAP_SWAPCACHE_NEW if the new page needs to be populated, page is locked
> + * Returns ZSWAP_SWAPCACHE_FAIL on error
>   */
>  static int zswap_get_swap_cache_page(swp_entry_t entry,
>  				struct page **retpage)
> @@ -475,7 +475,7 @@ static int zswap_get_swap_cache_page(swp_entry_t entry,
>  	if (new_page)
>  		page_cache_release(new_page);
>  	if (!found_page)
> -		return ZSWAP_SWAPCACHE_NOMEM;
> +		return ZSWAP_SWAPCACHE_FAIL;
>  	*retpage = found_page;
>  	return ZSWAP_SWAPCACHE_EXIST;
>  }
> @@ -529,11 +529,11 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
>
>  	/* try to allocate swap cache page */
>  	switch (zswap_get_swap_cache_page(swpentry, &page)) {
> -	case ZSWAP_SWAPCACHE_NOMEM: /* no memory */
> +	case ZSWAP_SWAPCACHE_FAIL: /* no memory or invalidate happened */
>  		ret = -ENOMEM;
>  		goto fail;
>
> -	case ZSWAP_SWAPCACHE_EXIST: /* page is unlocked */
> +	case ZSWAP_SWAPCACHE_EXIST:
>  		/* page is already in the swap cache, ignore for now */
>  		page_cache_release(page);
>  		ret = -EEXIST;
> @@ -591,7 +591,12 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
>
>  fail:
>  	spin_lock(&tree->lock);
> -	zswap_entry_put(entry);
> +	refcount = zswap_entry_put(entry);
> +	if (refcount <= 0) {
> +		/* invalidate happened, consider writeback as success */
> +		zswap_free_entry(tree, entry);
> +		ret = 0;
> +	}
>  	spin_unlock(&tree->lock);
>  	return ret;
>  }
>

--
Regards,
-Bob
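Editor's note: the leak scenario and its fix can be replayed in a few lines
of userspace C. This is a single-threaded sketch with the two "threads"
interleaved by hand and all locking omitted. struct sk_entry and the sk_*
helpers are hypothetical stand-ins for the zswap entry and its get/put/free
functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct sk_entry {
	int refcount;
};

static int sk_freed;	/* number of entries actually released */

static void sk_entry_get(struct sk_entry *e)
{
	e->refcount++;
}

/* Drop one reference and report how many remain. */
static int sk_entry_put(struct sk_entry *e)
{
	return --e->refcount;
}

static void sk_free_entry(struct sk_entry *e)
{
	free(e);
	sk_freed++;
}

/* Invalidate: drop the tree's reference, free only if nobody else holds one. */
static void sk_invalidate(struct sk_entry *e)
{
	if (sk_entry_put(e) <= 0)
		sk_free_entry(e);
}

/* Patched writeback fail path: the last holder frees the entry. */
static int sk_writeback_fail(struct sk_entry *e)
{
	int ret = -ENOMEM;

	if (sk_entry_put(e) <= 0) {
		/* invalidate happened meanwhile, treat writeback as done */
		sk_free_entry(e);
		ret = 0;
	}
	return ret;
}
```

Before the patch, the fail path called the put without looking at the
result, so an entry invalidated in the meantime was never freed.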
Re: [PATCH v2 3/4] mm/zswap: avoid unnecessary page scanning
On 09/06/2013 01:16 PM, Weijie Yang wrote:
> add SetPageReclaim before __swap_writepage so that page can be moved to the
> tail of the inactive list, which can avoid unnecessary page scanning as this
> page was reclaimed by swap subsystem before.
>
> Signed-off-by: Weijie Yang

Reviewed-by: Bob Liu

> ---
>  mm/zswap.c |    3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 1be7b90..cc40e6a 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -556,6 +556,9 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
>  		SetPageUptodate(page);
>  	}
>
> +	/* move it to the tail of the inactive list after end_writeback */
> +	SetPageReclaim(page);
> +
>  	/* start writeback */
>  	__swap_writepage(page, &wbc, end_swap_bio_write);
>  	page_cache_release(page);
>

--
Regards,
-Bob
Re: [PATCH v2 1/4] mm/zswap: bugfix: memory leak when re-swapon
On 09/06/2013 01:16 PM, Weijie Yang wrote:
> zswap_tree is not freed when swapoff, and it got re-kmalloc in swapon,
> so memory-leak occurs.
>
> Modify: free memory of zswap_tree in zswap_frontswap_invalidate_area().
>
> Signed-off-by: Weijie Yang

Reviewed-by: Bob Liu

> ---
>  mm/zswap.c |    4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index deda2b6..cbd9578 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -816,6 +816,10 @@ static void zswap_frontswap_invalidate_area(unsigned type)
>  	}
>  	tree->rbroot = RB_ROOT;
>  	spin_unlock(&tree->lock);
> +
> +	zbud_destroy_pool(tree->pool);
> +	kfree(tree);
> +	zswap_trees[type] = NULL;
>  }
>
>  static struct zbud_ops zswap_zbud_ops = {
>

--
Regards,
-Bob
Re: [PATCH 2/2] cpufreq: serialize calls to __cpufreq_governor()
On 09/05/2013 06:24 AM, Stephen Boyd wrote:
> On 09/04/13 17:26, Rafael J. Wysocki wrote:
>> On Wednesday, September 04, 2013 04:50:01 PM Stephen Boyd wrote:
>>> On 09/04/13 16:55, Rafael J. Wysocki wrote:
>>>> Well, I'm not sure when Viresh is going to be back.
>>>>
>>>> Srivatsa, can you please resend this patch with a proper changelog?
>>> I haven't had a chance to try this out yet, but I was just thinking
>>> about this patch. How is it going to work? If one task opens the file
>>> and another task is taking down the CPU wouldn't we deadlock in the
>>> CPU_DOWN notifier waiting for the kobject to be released? Task 1 will
>>> grab the kobject reference and sleep on the hotplug mutex and task 2
>>> will put the kobject and wait for the completion, but it won't happen.
>>> At least I think that's what would happen.
>> Do you mean the completion in sysfs_deactivate()? Yes, we can deadlock
>> there.
>
> I mean the complete in cpufreq_sysfs_release(). I don't think that will
> ever be called because the kobject is held by the task calling store()
> which is waiting on the hotplug lock to be released.
>
>>
>> Well, I guess the Srivatsa's patch may be salvaged by making it do a "trylock"
>> version of get_online_cpus(), but then I wonder if there's no better way.
>
> I think the real solution is to remove the kobject first if the CPU
> going down is the last user of that policy. Once the completion is done
> we can stop the governor and clean up state. For the case where there
> are still CPUs using the policy why can't we cancel that CPU's work and
> do nothing else? Even in the case where we have to move the cpufreq
> directory do we need to do a STOP/START/LIMITS sequence? I hope we can
> get away with just moving the directory and canceling that CPUs work then.
>

Conceptually, I agree that your idea of not allowing any process to take a
new reference to the kobject while we are taking the CPU offline is a sound
solution.

However, I am reluctant to go down that path, because handling the CPU
hotplug sequence in the suspend/resume path might get even more tricky if
we implement the changes that you propose. Just recently we managed to
rework the cpufreq CPU hotplug handling to retain the sysfs file
permissions around suspend/resume, and doing that was not at all simple.
Adding more quirks and complexity to the kobject handling in that path
will only make things even more challenging, IMHO. That's the reason I'm
trying to think of ways to avoid touching that fragile code, and instead
solve this problem in some other way, without compromising on the
robustness of the solution.

So here is my new proposal, as a replacement to this patch [2/2]:

We note that, at the CPU_DOWN_PREPARE stage, the CPU is not yet marked
offline, whereas by the time we handle CPU_POST_DEAD, the CPU is removed
from the cpu_online_mask, and also the cpu_hotplug lock is dropped.

So, let us split up __cpu_remove_dev() into 2 parts, say:

	__cpu_remove_prepare() - invoked during CPU_DOWN_PREPARE
	__cpu_remove_finish()  - invoked during CPU_POST_DEAD

In the former function, we stop the governors, so that
policy->governor_enabled gets set to false and patch [1/2] returns -EBUSY
to any subsequent ->store() requests. Also, we do everything except the
kobject cleanup.

In the latter function, we do the remaining work, particularly the part
where we wait for the kobject refcount to drop to zero and the subsequent
cleanup.

And the ->store() functions will be modified to look like this:

	store()
	{
		get_online_cpus();

		if (!cpu_online(cpu))
			goto out;

		/* Body of the function */
	out:
		put_online_cpus();
	}

That way, if a task tries to write to a cpufreq file during CPU offline,
it will get blocked on get_online_cpus(), and will continue after CPU_DEAD
(since we release the lock here). Then it will notice that the cpu is
offline, and hence will return silently, thus dropping the kobject refcnt.

So, when the cpufreq core comes back at the CPU_POST_DEAD stage to clean up
the kobject, it won't encounter any problems.

Any thoughts on this approach? I'll try to code it up and post the patch
later today.

Thank you!

Regards,
Srivatsa S. Bhat
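Editor's note: the control flow of the proposed ->store() wrapper can be
exercised in userspace as follows. This is only a sketch of the "take the
lock, re-check online, always release" shape. The sk_* names are
hypothetical, the hotplug lock is reduced to a reader counter, and the
error value returned for an offline CPU is an assumption, since the
proposal only says the function returns without doing the work.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_NR_CPUS 4

static bool sk_cpu_online[SK_NR_CPUS] = { true, true, true, true };
static int sk_hotplug_readers;		/* stand-in for the hotplug lock read side */
static int sk_stored_value[SK_NR_CPUS];

static void sk_get_online_cpus(void)
{
	sk_hotplug_readers++;
}

static void sk_put_online_cpus(void)
{
	sk_hotplug_readers--;
}

/* Write a value for a CPU, but only if it is still online under the lock. */
static int sk_store(int cpu, int value)
{
	int ret = -1;	/* assumed error value for "CPU went away" */

	sk_get_online_cpus();

	if (!sk_cpu_online[cpu])
		goto out;

	/* Body of the function */
	sk_stored_value[cpu] = value;
	ret = 0;
out:
	sk_put_online_cpus();
	return ret;
}
```

The point of the single exit label is that the lock is released on both
paths, so a writer that lost the race with CPU offline backs out and drops
its reference instead of holding up the later cleanup stage.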
Re: [PATCH v9 12/13] KVM: PPC: Add support for IOMMU in-kernel handling
On 09/06/2013 04:01 PM, Gleb Natapov wrote: > On Fri, Sep 06, 2013 at 09:38:21AM +1000, Alexey Kardashevskiy wrote: >> On 09/06/2013 04:10 AM, Gleb Natapov wrote: >>> On Wed, Sep 04, 2013 at 02:01:28AM +1000, Alexey Kardashevskiy wrote: On 09/03/2013 08:53 PM, Gleb Natapov wrote: > On Mon, Sep 02, 2013 at 01:14:29PM +1000, Alexey Kardashevskiy wrote: >> On 09/01/2013 10:06 PM, Gleb Natapov wrote: >>> On Wed, Aug 28, 2013 at 06:50:41PM +1000, Alexey Kardashevskiy wrote: This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT and H_STUFF_TCE requests targeted an IOMMU TCE table without passing them to user space which saves time on switching to user space and back. Both real and virtual modes are supported. The kernel tries to handle a TCE request in the real mode, if fails it passes the request to the virtual mode to complete the operation. If it a virtual mode handler fails, the request is passed to user space. The first user of this is VFIO on POWER. Trampolines to the VFIO external user API functions are required for this patch. This adds a "SPAPR TCE IOMMU" KVM device to associate a logical bus number (LIOBN) with an VFIO IOMMU group fd and enable in-kernel handling of map/unmap requests. The device supports a single attribute which is a struct with LIOBN and IOMMU fd. When the attribute is set, the device establishes the connection between KVM and VFIO. Tests show that this patch increases transmission speed from 220MB/s to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card). 
Signed-off-by: Paul Mackerras Signed-off-by: Alexey Kardashevskiy --- Changes: v9: * KVM_CAP_SPAPR_TCE_IOMMU ioctl to KVM replaced with "SPAPR TCE IOMMU" KVM device * release_spapr_tce_table() is not shared between different TCE types * reduced the patch size by moving VFIO external API trampolines to separate patche * moved documentation from Documentation/virtual/kvm/api.txt to Documentation/virtual/kvm/devices/spapr_tce_iommu.txt v8: * fixed warnings from check_patch.pl 2013/07/11: * removed multiple #ifdef IOMMU_API as IOMMU_API is always enabled for KVM_BOOK3S_64 * kvmppc_gpa_to_hva_and_get also returns host phys address. Not much sense for this here but the next patch for hugepages support will use it more. 2013/07/06: * added realmode arch_spin_lock to protect TCE table from races in real and virtual modes * POWERPC IOMMU API is changed to support real mode * iommu_take_ownership and iommu_release_ownership are protected by iommu_table's locks * VFIO external user API use rewritten * multiple small fixes 2013/06/27: * tce_list page is referenced now in order to protect it from accident invalidation during H_PUT_TCE_INDIRECT execution * added use of the external user VFIO API 2013/06/05: * changed capability number * changed ioctl number * update the doc article number 2013/05/20: * removed get_user() from real mode handlers * kvm_vcpu_arch::tce_tmp usage extended. 
Now real mode handler puts there translated TCEs, tries realmode_get_page() on those and if it fails, it passes control over the virtual mode handler which tries to finish the request handling * kvmppc_lookup_pte() now does realmode_get_page() protected by BUSY bit on a page * The only reason to pass the request to user mode now is when the user mode did not register TCE table in the kernel, in all other cases the virtual mode handler is expected to do the job --- .../virtual/kvm/devices/spapr_tce_iommu.txt| 37 +++ arch/powerpc/include/asm/kvm_host.h| 4 + arch/powerpc/kvm/book3s_64_vio.c | 310 - arch/powerpc/kvm/book3s_64_vio_hv.c| 122 arch/powerpc/kvm/powerpc.c | 1 + include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c| 5 + 7 files changed, 477 insertions(+), 3 deletions(-) create mode 100644 Documentation/virtual/kvm/devices/spapr_tce_iommu.txt diff --git a/Documentation/virtual/kvm/devices/spapr_tce_iommu.txt b/Documentation/virtual/kvm/devices/spapr_
Re: [PATCH v9 12/13] KVM: PPC: Add support for IOMMU in-kernel handling
On Fri, Sep 06, 2013 at 09:38:21AM +1000, Alexey Kardashevskiy wrote: > On 09/06/2013 04:10 AM, Gleb Natapov wrote: > > On Wed, Sep 04, 2013 at 02:01:28AM +1000, Alexey Kardashevskiy wrote: > >> On 09/03/2013 08:53 PM, Gleb Natapov wrote: > >>> On Mon, Sep 02, 2013 at 01:14:29PM +1000, Alexey Kardashevskiy wrote: > On 09/01/2013 10:06 PM, Gleb Natapov wrote: > > On Wed, Aug 28, 2013 at 06:50:41PM +1000, Alexey Kardashevskiy wrote: > >> This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT > >> and H_STUFF_TCE requests targeted an IOMMU TCE table without passing > >> them to user space which saves time on switching to user space and > >> back. > >> > >> Both real and virtual modes are supported. The kernel tries to > >> handle a TCE request in the real mode, if fails it passes the request > >> to the virtual mode to complete the operation. If it a virtual mode > >> handler fails, the request is passed to user space. > >> > >> The first user of this is VFIO on POWER. Trampolines to the VFIO > >> external > >> user API functions are required for this patch. > >> > >> This adds a "SPAPR TCE IOMMU" KVM device to associate a logical bus > >> number (LIOBN) with an VFIO IOMMU group fd and enable in-kernel > >> handling > >> of map/unmap requests. The device supports a single attribute which is > >> a struct with LIOBN and IOMMU fd. When the attribute is set, the device > >> establishes the connection between KVM and VFIO. > >> > >> Tests show that this patch increases transmission speed from 220MB/s > >> to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card). 
> >> > >> Signed-off-by: Paul Mackerras > >> Signed-off-by: Alexey Kardashevskiy > >> > >> --- > >> > >> Changes: > >> v9: > >> * KVM_CAP_SPAPR_TCE_IOMMU ioctl to KVM replaced with "SPAPR TCE IOMMU" > >> KVM device > >> * release_spapr_tce_table() is not shared between different TCE types > >> * reduced the patch size by moving VFIO external API > >> trampolines to separate patche > >> * moved documentation from Documentation/virtual/kvm/api.txt to > >> Documentation/virtual/kvm/devices/spapr_tce_iommu.txt > >> > >> v8: > >> * fixed warnings from check_patch.pl > >> > >> 2013/07/11: > >> * removed multiple #ifdef IOMMU_API as IOMMU_API is always enabled > >> for KVM_BOOK3S_64 > >> * kvmppc_gpa_to_hva_and_get also returns host phys address. Not much > >> sense > >> for this here but the next patch for hugepages support will use it > >> more. > >> > >> 2013/07/06: > >> * added realmode arch_spin_lock to protect TCE table from races > >> in real and virtual modes > >> * POWERPC IOMMU API is changed to support real mode > >> * iommu_take_ownership and iommu_release_ownership are protected by > >> iommu_table's locks > >> * VFIO external user API use rewritten > >> * multiple small fixes > >> > >> 2013/06/27: > >> * tce_list page is referenced now in order to protect it from accident > >> invalidation during H_PUT_TCE_INDIRECT execution > >> * added use of the external user VFIO API > >> > >> 2013/06/05: > >> * changed capability number > >> * changed ioctl number > >> * update the doc article number > >> > >> 2013/05/20: > >> * removed get_user() from real mode handlers > >> * kvm_vcpu_arch::tce_tmp usage extended. 
Now real mode handler puts > >> there > >> translated TCEs, tries realmode_get_page() on those and if it fails, it > >> passes control over the virtual mode handler which tries to finish > >> the request handling > >> * kvmppc_lookup_pte() now does realmode_get_page() protected by BUSY > >> bit > >> on a page > >> * The only reason to pass the request to user mode now is when the > >> user mode > >> did not register TCE table in the kernel, in all other cases the > >> virtual mode > >> handler is expected to do the job > >> --- > >> .../virtual/kvm/devices/spapr_tce_iommu.txt| 37 +++ > >> arch/powerpc/include/asm/kvm_host.h| 4 + > >> arch/powerpc/kvm/book3s_64_vio.c | 310 > >> - > >> arch/powerpc/kvm/book3s_64_vio_hv.c| 122 > >> arch/powerpc/kvm/powerpc.c | 1 + > >> include/linux/kvm_host.h | 1 + > >> virt/kvm/kvm_main.c| 5 + > >> 7 files changed, 477 insertions(+), 3 deletions(-) > >> create mode 100644 > >> Documentation/virtual/kvm/devices/spapr_tce_iommu.txt > >> > >> diff --git a/Documentation/virtual/kvm/devices/spapr_tce_iommu.txt > >> b/Documentation/virtual/kvm/devices/spapr_tce_iommu.txt > >> new file mode 100644 > >
[PATCH] mmc: sdhci: add support for pre_req and post_req
This patch adds support for the non-blocking mmc request function to the
sdhci driver. (commit: aa8b683a7d392271ed349c6ab9f36b8c313794b7)

pre_req() runs dma_map_sg(), post_req() runs dma_unmap_sg().
If pre_req() is not called before sdhci_request(), dma_map_sg() will be
issued before starting the transfer. Using pre_req() is optional.
If pre_req() is issued, post_req() must be called as well.

benchmark results:
ARM CA9 1GHz, UHS DDR50 mode

Before:
dd if=/dev/mmcblk0p15 of=/dev/null bs=64k count=1024
67108864 bytes (64.0MB) copied, 1.188846 seconds, 53.8MB/s

After:
dd if=/dev/mmcblk0p15 of=/dev/null bs=64k count=1024
67108864 bytes (64.0MB) copied, 0.993098 seconds, 64.4MB/s

Signed-off-by: Chanho Min
---
 drivers/mmc/host/sdhci.c  | 96 +++--
 include/linux/mmc/sdhci.h |  6 +++
 2 files changed, 90 insertions(+), 12 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 2ea429c..0465a9a 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -465,6 +465,42 @@ static void sdhci_set_adma_desc(u8 *desc, u32 addr, int len, unsigned cmd)
 	dataddr[0] = cpu_to_le32(addr);
 }

+static int sdhci_pre_dma_transfer(struct sdhci_host *host,
+				  struct mmc_data *data,
+				  struct sdhci_next *next)
+{
+	int sg_count = 0;
+
+	if (!next && data->host_cookie &&
+	    data->host_cookie != host->next_data.cookie) {
+		pr_warn("[%s] invalid cookie: data->host_cookie %d"
+			" host->next_data.cookie %d\n",
+			__func__, data->host_cookie, host->next_data.cookie);
+		data->host_cookie = 0;
+	}
+
+	/* Check if next job is already prepared */
+	if (next ||
+	    (!next && data->host_cookie != host->next_data.cookie)) {
+		sg_count = dma_map_sg(mmc_dev(host->mmc), data->sg,
+				data->sg_len,
+				(data->flags & MMC_DATA_READ) ?
+				DMA_FROM_DEVICE :
+				DMA_TO_DEVICE);
+	} else {
+		sg_count = host->next_data.sg_count;
+		host->next_data.sg_count = 0;
+	}
+
+	if (next) {
+		next->sg_count = sg_count;
+		data->host_cookie = ++next->cookie < 0 ? 1 : next->cookie;
+	} else
+		host->sg_count = sg_count;
+
+	return sg_count;
+}
+
 static int sdhci_adma_table_pre(struct sdhci_host *host,
 	struct mmc_data *data)
 {
@@ -502,8 +538,8 @@ static int sdhci_adma_table_pre(struct sdhci_host *host,
 		goto fail;
 	BUG_ON(host->align_addr & 0x3);

-	host->sg_count = dma_map_sg(mmc_dev(host->mmc),
-		data->sg, data->sg_len, direction);
+	host->sg_count = sdhci_pre_dma_transfer(host, data, NULL);
+
 	if (host->sg_count == 0)
 		goto unmap_align;

@@ -643,9 +679,10 @@ static void sdhci_adma_table_post(struct sdhci_host *host,
 			}
 		}
 	}
-
-	dma_unmap_sg(mmc_dev(host->mmc), data->sg,
-		data->sg_len, direction);
+	if (!data->host_cookie) {
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+			data->sg_len, direction);
+	}
 }

 static u8 sdhci_calc_timeout(struct sdhci_host *host, struct mmc_command *cmd)
@@ -824,12 +861,8 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_command *cmd)
 			}
 		} else {
 			int sg_cnt;
+			sg_cnt = sdhci_pre_dma_transfer(host, data, NULL);

-			sg_cnt = dma_map_sg(mmc_dev(host->mmc),
-					data->sg, data->sg_len,
-					(data->flags & MMC_DATA_READ) ?
-						DMA_FROM_DEVICE :
-						DMA_TO_DEVICE);
 			if (sg_cnt == 0) {
 				/*
 				 * This only happens when someone fed
@@ -928,9 +961,12 @@ static void sdhci_finish_data(struct sdhci_host *host)
 		if (host->flags & SDHCI_USE_ADMA)
 			sdhci_adma_table_post(host, data);
 		else {
-			dma_unmap_sg(mmc_dev(host->mmc), data->sg,
-				data->sg_len, (data->flags & MMC_DATA_READ) ?
+			if (!data->host_cookie) {
+				dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+					data->sg_len,
+					(data->flags & MMC_DATA_READ) ?
 					DMA_FROM_DEVICE : DMA_TO_DEVICE);
+			}
 		}
[REPOST PATCH 2/4] slab: introduce helper functions to get/set free object
In the following patches, the way free objects are read from and written
to the freelist changes so that simple casting no longer works.
Therefore, introduce helper functions.

Signed-off-by: Joonsoo Kim

diff --git a/mm/slab.c b/mm/slab.c
index 9d4bad5..a0e49bb 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2545,9 +2545,15 @@ static struct freelist *alloc_slabmgmt(struct kmem_cache *cachep,
 	return freelist;
 }

-static inline unsigned int *slab_freelist(struct page *page)
+static inline unsigned int get_free_obj(struct page *page, unsigned int idx)
 {
-	return (unsigned int *)(page->freelist);
+	return ((unsigned int *)page->freelist)[idx];
+}
+
+static inline void set_free_obj(struct page *page,
+					unsigned int idx, unsigned int val)
+{
+	((unsigned int *)(page->freelist))[idx] = val;
 }

 static void cache_init_objs(struct kmem_cache *cachep,
@@ -2592,7 +2598,7 @@ static void cache_init_objs(struct kmem_cache *cachep,
 		if (cachep->ctor)
 			cachep->ctor(objp);
 #endif
-		slab_freelist(page)[i] = i;
+		set_free_obj(page, i, i);
 	}
 }

@@ -2611,7 +2617,7 @@ static void *slab_get_obj(struct kmem_cache *cachep, struct page *page,
 {
 	void *objp;

-	objp = index_to_obj(cachep, page, slab_freelist(page)[page->active]);
+	objp = index_to_obj(cachep, page, get_free_obj(page, page->active));
 	page->active++;
 #if DEBUG
 	WARN_ON(page_to_nid(virt_to_page(objp)) != nodeid);
@@ -2632,7 +2638,7 @@ static void slab_put_obj(struct kmem_cache *cachep, struct page *page,

 	/* Verify double free bug */
 	for (i = page->active; i < cachep->num; i++) {
-		if (slab_freelist(page)[i] == objnr) {
+		if (get_free_obj(page, i) == objnr) {
 			printk(KERN_ERR "slab: double free detected in cache "
 					"'%s', objp %p\n", cachep->name, objp);
 			BUG();
@@ -2640,7 +2646,7 @@ static void slab_put_obj(struct kmem_cache *cachep, struct page *page,
 	}
 #endif
 	page->active--;
-	slab_freelist(page)[page->active] = objnr;
+	set_free_obj(page, page->active, objnr);
 }

 /*
@@ -4214,7 +4220,7 @@ static void handle_slab(unsigned long *n, struct kmem_cache *c,

 		for (j = page->active; j < c->num; j++) {
 			/* Skip freed item */
-			if (slab_freelist(page)[j] == i) {
+			if (get_free_obj(page, j) == i) {
 				active = false;
 				break;
 			}
-- 
1.7.9.5
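The reason for hiding the freelist behind accessors can be sketched in user space. The block below is only an illustration under stated assumptions: `struct fake_page` is a hypothetical stand-in for the relevant fields of `struct page`, and the byte-wide branch reflects where the series says it is heading (byte sized indexes), not what this patch alone does. With accessors, callers never see the element width.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of struct page used here. */
struct fake_page {
	void *freelist;        /* index array; element width depends on nr_objs */
	unsigned int nr_objs;  /* number of objects in the slab */
};

/* A one-byte index holds 0..255, so it can address up to 256 objects. */
static bool can_byte_index(unsigned int nr_objs)
{
	return nr_objs <= 256;
}

/* Read the idx-th freelist entry, whatever width it is stored in. */
static unsigned int get_free_obj(const struct fake_page *page, unsigned int idx)
{
	assert(idx < page->nr_objs);
	if (can_byte_index(page->nr_objs))
		return ((const unsigned char *)page->freelist)[idx];
	return ((const unsigned int *)page->freelist)[idx];
}

/* Write the idx-th freelist entry, whatever width it is stored in. */
static void set_free_obj(struct fake_page *page, unsigned int idx,
			 unsigned int val)
{
	assert(idx < page->nr_objs);
	if (can_byte_index(page->nr_objs))
		((unsigned char *)page->freelist)[idx] = (unsigned char)val;
	else
		((unsigned int *)page->freelist)[idx] = val;
}
```

A caller such as slab_put_obj() would keep calling set_free_obj(page, page->active, objnr) unchanged when the storage width changes, which is presumably the point of introducing the helpers first.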
[REPOST PATCH 1/4] slab: factor out calculate nr objects in cache_estimate
This logic is not simple to understand, so factoring it out into a
separate function helps readability. Additionally, we can use this change
in the following patch, which lets the freelist use a differently sized
index according to the number of objects.

Signed-off-by: Joonsoo Kim

diff --git a/mm/slab.c b/mm/slab.c
index f3868fe..9d4bad5 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -565,9 +565,31 @@ static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
 	return cachep->array[smp_processor_id()];
 }

-static size_t slab_mgmt_size(size_t nr_objs, size_t align)
+static int calculate_nr_objs(size_t slab_size, size_t buffer_size,
+				size_t idx_size, size_t align)
 {
-	return ALIGN(nr_objs * sizeof(unsigned int), align);
+	int nr_objs;
+	size_t freelist_size;
+
+	/*
+	 * Ignore padding for the initial guess. The padding
+	 * is at most @align-1 bytes, and @buffer_size is at
+	 * least @align. In the worst case, this result will
+	 * be one greater than the number of objects that fit
+	 * into the memory allocation when taking the padding
+	 * into account.
+	 */
+	nr_objs = slab_size / (buffer_size + idx_size);
+
+	/*
+	 * This calculated number will be either the right
+	 * amount, or one greater than what we want.
+	 */
+	freelist_size = slab_size - nr_objs * buffer_size;
+	if (freelist_size < ALIGN(nr_objs * idx_size, align))
+		nr_objs--;
+
+	return nr_objs;
 }

 /*
@@ -600,28 +622,12 @@ static void cache_estimate(unsigned long gfporder, size_t buffer_size,
 		nr_objs = slab_size / buffer_size;

 	} else {
-		/*
-		 * Ignore padding for the initial guess. The padding
-		 * is at most @align-1 bytes, and @buffer_size is at
-		 * least @align. In the worst case, this result will
-		 * be one greater than the number of objects that fit
-		 * into the memory allocation when taking the padding
-		 * into account.
-		 */
-		nr_objs = (slab_size) / (buffer_size + sizeof(unsigned int));
-
-		/*
-		 * This calculated number will be either the right
-		 * amount, or one greater than what we want.
-		 */
-		if (slab_mgmt_size(nr_objs, align) + nr_objs*buffer_size
-		       > slab_size)
-			nr_objs--;
-
-		mgmt_size = slab_mgmt_size(nr_objs, align);
+		nr_objs = calculate_nr_objs(slab_size, buffer_size,
+					sizeof(unsigned int), align);
+		mgmt_size = ALIGN(nr_objs * sizeof(unsigned int), align);
 	}
 	*num = nr_objs;
-	*left_over = slab_size - nr_objs*buffer_size - mgmt_size;
+	*left_over = slab_size - (nr_objs * buffer_size) - mgmt_size;
 }

 #if DEBUG
-- 
1.7.9.5
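For readers who want to try the estimate outside the kernel, here is a minimal user-space sketch of the factored-out calculation. It follows the logic shown in the patch; `align_up()` is a local helper standing in for the kernel's ALIGN() macro, and the slab sizes and alignments used below are illustrative assumptions rather than values taken from a running system.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Estimate how many objects fit in a slab when each object also needs
 * one freelist index of idx_size bytes stored inside the slab.
 */
static int calculate_nr_objs(size_t slab_size, size_t buffer_size,
			     size_t idx_size, size_t align)
{
	int nr_objs;
	size_t freelist_size;

	assert(buffer_size > 0 && align > 0 && (align & (align - 1)) == 0);

	/* Initial guess ignores the padding added by alignment. */
	nr_objs = (int)(slab_size / (buffer_size + idx_size));

	/* The guess is either right or one too many; check what is left. */
	freelist_size = slab_size - (size_t)nr_objs * buffer_size;
	if (freelist_size < align_up((size_t)nr_objs * idx_size, align))
		nr_objs--;

	return nr_objs;
}
```

With a 4096 byte slab, 32 byte objects and an assumed alignment of 8, calculate_nr_objs(4096, 32, sizeof(unsigned int), 8) yields 113 on a platform where unsigned int is 4 bytes. A deliberately tight case such as calculate_nr_objs(190, 64, 4, 64) exercises the decrement branch and yields 1.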
Re: [PATCH 0/4] slab: implement byte sized indexes for the freelist of a slab
On Thu, Sep 05, 2013 at 02:33:56PM +, Christoph Lameter wrote:
> On Thu, 5 Sep 2013, Joonsoo Kim wrote:
>
> > I think that all patchsets deserve to be merged, since it reduces memory usage and
> > also improves performance. :)
>
> Could you clean things up etc and the repost the patchset? This time do
> *not* do this as a response to an earlier email but start the patchset
> with new thread id. I think some people are not seeing this patchset.

Okay. I just did that.

Thanks.
[REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
Currently, the freelist of a slab consists of unsigned int sized indexes.
Most slabs have fewer than 256 objects, since the page order is
restricted to at most 1 in the default configuration. For example,
consider a slab consisting of 32 byte sized objects on two contiguous
pages. In this case, 256 objects are possible, and this number fits
byte sized indexes. 256 objects is the maximum possible value in the
default configuration, since 32 bytes is the minimum object size in
SLAB (8192 / 32 = 256). Therefore, if we use byte sized indexes, we can
save 3 bytes for each object.

This introduces one likely branch to the functions used for
setting/getting objects to/from the freelist, but we may get more
benefits from this change.

Signed-off-by: Joonsoo Kim

diff --git a/mm/slab.c b/mm/slab.c
index a0e49bb..bd366e5 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -565,8 +565,16 @@ static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
 	return cachep->array[smp_processor_id()];
 }

-static int calculate_nr_objs(size_t slab_size, size_t buffer_size,
-				size_t idx_size, size_t align)
+static inline bool can_byte_index(int nr_objs)
+{
+	if (likely(nr_objs <= (sizeof(unsigned char) << 8)))
+		return true;
+
+	return false;
+}
+
+static int __calculate_nr_objs(size_t slab_size, size_t buffer_size,
+				unsigned int idx_size, size_t align)
 {
 	int nr_objs;
 	size_t freelist_size;
@@ -592,6 +600,29 @@ static int calculate_nr_objs(size_t slab_size, size_t buffer_size,
 	return nr_objs;
 }

+static int calculate_nr_objs(size_t slab_size, size_t buffer_size,
+							size_t align)
+{
+	int nr_objs;
+	int byte_nr_objs;
+
+	nr_objs = __calculate_nr_objs(slab_size, buffer_size,
+					sizeof(unsigned int), align);
+	if (!can_byte_index(nr_objs))
+		return nr_objs;
+
+	byte_nr_objs = __calculate_nr_objs(slab_size, buffer_size,
+					sizeof(unsigned char), align);
+	/*
+	 * nr_objs can be larger when using byte index,
+	 * so that it cannot be indexed by byte index.
+	 */
+	if (can_byte_index(byte_nr_objs))
+		return byte_nr_objs;
+	else
+		return nr_objs;
+}
+
 /*
  * Calculate the number of objects and left-over bytes for a given buffer size.
  */
@@ -618,13 +649,18 @@ static void cache_estimate(unsigned long gfporder, size_t buffer_size,
 	 * correct alignment when allocated.
 	 */
 	if (flags & CFLGS_OFF_SLAB) {
-		mgmt_size = 0;
 		nr_objs = slab_size / buffer_size;
+		mgmt_size = 0;

 	} else {
-		nr_objs = calculate_nr_objs(slab_size, buffer_size,
-					sizeof(unsigned int), align);
-		mgmt_size = ALIGN(nr_objs * sizeof(unsigned int), align);
+		nr_objs = calculate_nr_objs(slab_size, buffer_size, align);
+		if (can_byte_index(nr_objs)) {
+			mgmt_size =
+				ALIGN(nr_objs * sizeof(unsigned char), align);
+		} else {
+			mgmt_size =
+				ALIGN(nr_objs * sizeof(unsigned int), align);
+		}
 	}
 	*num = nr_objs;
 	*left_over = slab_size - (nr_objs * buffer_size) - mgmt_size;
@@ -2012,7 +2048,10 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
 			 * looping condition in cache_grow().
 			 */
 			offslab_limit = size;
-			offslab_limit /= sizeof(unsigned int);
+			if (can_byte_index(num))
+				offslab_limit /= sizeof(unsigned char);
+			else
+				offslab_limit /= sizeof(unsigned int);

 			if (num > offslab_limit)
 				break;
@@ -2253,8 +2292,13 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 	if (!cachep->num)
 		return -E2BIG;

-	freelist_size =
-		ALIGN(cachep->num * sizeof(unsigned int), cachep->align);
+	if (can_byte_index(cachep->num)) {
+		freelist_size = ALIGN(cachep->num * sizeof(unsigned char),
+								cachep->align);
+	} else {
+		freelist_size = ALIGN(cachep->num * sizeof(unsigned int),
+								cachep->align);
+	}

 	/*
 	 * If the slab has been placed off-slab, and we have enough space then
@@ -2267,7 +2311,10 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 	if (flags & CFLGS_OFF_SLAB) {
[REPOST PATCH 4/4] slab: make more slab management structure off the slab
Now that the freelist used for slab management has shrunk, the on-slab
management structure can waste a lot of space when the slab's objects
are large.

Consider a 128 byte sized slab. If on-slab is used, 31 objects can be in
the slab. The size of the freelist for this case would be 31 bytes, so
97 bytes, that is, more than 75% of the object size, are wasted.

In the 64 byte sized slab case, no space is wasted if we use on-slab.
So set the off-slab threshold to 128 bytes.

Signed-off-by: Joonsoo Kim

diff --git a/mm/slab.c b/mm/slab.c
index bd366e5..d01a2f0 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2277,7 +2277,7 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 	 * it too early on. Always use on-slab management when
 	 * SLAB_NOLEAKTRACE to avoid recursive calls into kmemleak)
 	 */
-	if ((size >= (PAGE_SIZE >> 3)) && !slab_early_init &&
+	if ((size >= (PAGE_SIZE >> 5)) && !slab_early_init &&
 	    !(flags & SLAB_NOLEAKTRACE))
 		/*
 		 * Size is large, assume best to place the slab management obj
-- 
1.7.9.5
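The numbers in the commit message can be reproduced with a small user-space sketch. It assumes a 4096 byte page, one-byte freelist indexes, and no extra alignment adjustment; the function names are made up for this illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Objects that fit on-slab when each one also needs a one-byte index. */
static size_t onslab_nr_objs(size_t slab_size, size_t obj_size)
{
	assert(obj_size > 0);
	return slab_size / (obj_size + 1);
}

/* Bytes left over after placing the objects and their byte indexes. */
static size_t onslab_wasted_bytes(size_t slab_size, size_t obj_size)
{
	size_t nr = onslab_nr_objs(slab_size, obj_size);

	return slab_size - nr * obj_size - nr;
}

/* Object size at or above which the management data moves off-slab. */
static size_t offslab_threshold(size_t page_size, unsigned int shift)
{
	return page_size >> shift;
}
```

Under these assumptions, a 128 byte object size gives 31 objects and 97 unused bytes, while a 64 byte object size leaves a single byte unused. The threshold moves from 4096 >> 3 = 512 bytes to 4096 >> 5 = 128 bytes.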
[REPOST PATCH 0/4] slab: implement byte sized indexes for the freelist of a slab
* THIS IS JUST REPOSTED ACCORDING TO MAINTAINER'S REQUEST *

* Changes from original post
Correct the position of the results.
Attach more results about cache-misses and elapsed time on a hackbench test.

-
This patchset implements byte sized indexes for the freelist of a slab.

Currently, the freelist of a slab consists of unsigned int sized indexes.
Most slabs have fewer than 256 objects, so much space is wasted.
To reduce this overhead, this patchset implements byte sized indexes for
the freelist of a slab. With it, we can save 3 bytes for each object.

This introduces one likely branch to the functions used for
setting/getting objects to/from the freelist, but we may get more
benefits from this change.

Below are some numbers from 'cat /proc/slabinfo' related to my previous
posting and this patchset.

* Before *
# name          <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables [snip...]
kmalloc-512       527    600    512    8    1 : tunables   54  270 : slabdata     75     75      0
kmalloc-256       210    210    256   15    1 : tunables  120  600 : slabdata     14     14      0
kmalloc-192      1040   1040    192   20    1 : tunables  120  600 : slabdata     52     52      0
kmalloc-96        750    750    128   30    1 : tunables  120  600 : slabdata     25     25      0
kmalloc-64       2773   2773     64   59    1 : tunables  120  600 : slabdata     47     47      0
kmalloc-128       660    690    128   30    1 : tunables  120  600 : slabdata     23     23      0
kmalloc-32      11200  11200     32  112    1 : tunables  120  600 : slabdata    100    100      0
kmem_cache        197    200    192   20    1 : tunables  120  600 : slabdata     10     10      0

* After my previous posting (overload struct slab over struct page) *
# name          <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables [snip...]
kmalloc-512       525    640    512    8    1 : tunables   54  270 : slabdata     80     80      0
kmalloc-256       210    210    256   15    1 : tunables  120  600 : slabdata     14     14      0
kmalloc-192      1016   1040    192   20    1 : tunables  120  600 : slabdata     52     52      0
kmalloc-96        560    620    128   31    1 : tunables  120  600 : slabdata     20     20      0
kmalloc-64       2148   2280     64   60    1 : tunables  120  600 : slabdata     38     38      0
kmalloc-128       647    682    128   31    1 : tunables  120  600 : slabdata     22     22      0
kmalloc-32      11360  11413     32  113    1 : tunables  120  600 : slabdata    101    101      0
kmem_cache        197    200    192   20    1 : tunables  120  600 : slabdata     10     10      0

kmem_caches consisting of objects less than or equal to 128 bytes have
one more object in a slab. You can see it at objperslab.

We can improve further with this patchset.

* My previous posting + this patchset *
# name          <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables [snip...]
kmalloc-512       521    648    512    8    1 : tunables   54  270 : slabdata     81     81      0
kmalloc-256       208    208    256   16    1 : tunables  120  600 : slabdata     13     13      0
kmalloc-192      1029   1029    192   21    1 : tunables  120  600 : slabdata     49     49      0
kmalloc-96        529    589    128   31    1 : tunables  120  600 : slabdata     19     19      0
kmalloc-64       2142   2142     64   63    1 : tunables  120  600 : slabdata     34     34      0
kmalloc-128       660    682    128   31    1 : tunables  120  600 : slabdata     22     22      0
kmalloc-32      11716  11780     32  124    1 : tunables  120  600 : slabdata     95     95      0
kmem_cache        197    210    192   21    1 : tunables  120  600 : slabdata     10     10      0

kmem_caches consisting of objects less than or equal to 256 bytes have
one or more objects than before. In the case of kmalloc-32, we have 11
more objects, so 352 bytes (11 * 32) are saved, which is roughly a 9%
saving of memory. Of course, this percentage decreases as the number of
objects in a slab decreases.

Here are the performance results on my 4 cpus machine.

* Before *

 Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):

       238,309,671 cache-misses                 ( +-  0.40% )

      12.010172090 seconds time elapsed         ( +-  0.21% )

* After my previous posting *

 Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):

       229,945,138 cache-misses                 ( +-  0.23% )

      11.627897174 seconds time elapsed         ( +-  0.14% )

* My previous posting + this patchset *

 Performance counter stats for 'perf b
Re: [PATCH V4] regulator: palmas: add support for external control of rails
On Thursday 05 September 2013 09:04 PM, Mark Brown wrote:
> On Thu, Sep 05, 2013 at 08:27:24PM +0530, Laxman Dewangan wrote:
>> On Thursday 05 September 2013 08:04 PM, Lee Jones wrote:
>>> It won't go in until v3.12 now, but I have applied the patch.
>> Thanks Lee for taking care.
> If it's going to wait for v3.12 there's no point applying it to MFD as
> the dependency will be in mainline after the merge window.

Agreed, it should go through the regulator tree if it is for v3.12. If
there is any issue applying the patch, I will rebase onto that branch
and resend at that time.

Thanks,
Laxman
[PATCH] perf tools: Free strlist in strlist__delete()
From: Namhyung Kim

It seems a strlist is never freed after it is allocated. AFAICS every
strlist is allocated dynamically, so just free it in the _delete()
function.

Signed-off-by: Namhyung Kim
---
 tools/perf/util/strlist.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/strlist.c b/tools/perf/util/strlist.c
index eabdce0a2daa..11593d899eb2 100644
--- a/tools/perf/util/strlist.c
+++ b/tools/perf/util/strlist.c
@@ -155,8 +155,10 @@ out_error:

 void strlist__delete(struct strlist *slist)
 {
-	if (slist != NULL)
+	if (slist != NULL) {
 		rblist__delete(&slist->rblist);
+		free(slist);
+	}
 }

 struct str_node *strlist__entry(const struct strlist *slist, unsigned int idx)
-- 
1.7.11.7
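The ownership rule behind the patch (a `_delete()` function releases both the contents and the dynamically allocated container itself) can be sketched with a toy list. Everything here is hypothetical: `toy_list` and its functions are invented for illustration and say nothing about how perf's real strlist or rblist manage memory internally.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked container, not perf's strlist. */
struct toy_node {
	struct toy_node *next;
};

struct toy_list {
	struct toy_node *head;
	size_t nr_entries;
};

/* The container itself lives on the heap, so someone must free it. */
static struct toy_list *toy_list__new(void)
{
	return calloc(1, sizeof(struct toy_list));
}

static int toy_list__add(struct toy_list *list)
{
	struct toy_node *node;

	assert(list != NULL);
	node = malloc(sizeof(*node));
	if (node == NULL)
		return -1;
	node->next = list->head;
	list->head = node;
	list->nr_entries++;
	return 0;
}

/* Release every entry and then the container; NULL is accepted. */
static void toy_list__delete(struct toy_list *list)
{
	if (list == NULL)
		return;
	while (list->head != NULL) {
		struct toy_node *node = list->head;

		list->head = node->next;
		free(node);
	}
	free(list);
}
```

If toy_list__delete() stopped after the loop, every toy_list__new() would leak one container, which is the kind of leak the commit message describes.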
Re: [RFC PATCH v3 09/35] mm: Track the freepage migratetype of pages accurately
On 09/04/2013 01:53 PM, Yasuaki Ishimatsu wrote:
> (2013/09/03 17:45), Srivatsa S. Bhat wrote:
>> On 09/03/2013 12:08 PM, Yasuaki Ishimatsu wrote:
>>> (2013/08/30 22:16), Srivatsa S. Bhat wrote:
>>>> Due to the region-wise ordering of the pages in the buddy allocator's
>>>> free lists, whenever we want to delete a free pageblock from a free list
>>>> (for ex: when moving blocks of pages from one list to the other), we need
>>>> to be able to tell the buddy allocator exactly which migratetype it belongs
>>>> to. For that purpose, we can use the page's freepage migratetype (which is
>>>> maintained in the page's ->index field).
>>>>
>>>> So, while splitting up higher order pages into smaller ones as part of buddy
>>>> operations, keep the new head pages updated with the correct freepage
>>>> migratetype information (because we depend on tracking this info accurately,
>>>> as outlined above).
>>>>
>>>> Signed-off-by: Srivatsa S. Bhat
>>>> ---
>>>>
>>>>  mm/page_alloc.c | 7 +++++++
>>>>  1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 398b62c..b4b1275 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -947,6 +947,13 @@ static inline void expand(struct zone *zone, struct page *page,
>>>>  		add_to_freelist(&page[size], &area->free_list[migratetype]);
>>>>  		area->nr_free++;
>>>>  		set_page_order(&page[size], high);
>>>> +
>>>> +		/*
>>>> +		 * Freepage migratetype is tracked using the index field of the
>>>> +		 * first page of the block. So we need to update the new first
>>>> +		 * page, when changing the page order.
>>>> +		 */
>>>> +		set_freepage_migratetype(&page[size], migratetype);
>>>>  	}
>>>>  }
>>>>
>>>
>>> Is this patch a bug fix patch?
>>> If so, I want you to split the patch from the patch-set.
>>>
>>
>> No, it's not a bug-fix. We need to take care of this only when using the
>> sorted-buddy design to maintain the freelists, which is introduced only in
>> this patchset. So mainline doesn't need this patch.
>>
>> In mainline, we can delete a page from a buddy freelist by simply calling
>> list_del() by passing a pointer to page->lru. It doesn't matter which freelist
>> the page was belonging to. However, in the sorted-buddy design introduced
>> in this patchset, we also need to know which particular freelist we are
>> deleting that page from, because apart from breaking the ->lru link from
>> the linked-list, we also need to update certain other things such as the
>> region->page_block pointer etc, which are part of that particular freelist.
>> Thus, it becomes essential to know which freelist we are deleting the page
>> from. And for that, we need this patch to maintain that information accurately
>> even during buddy operations such as splitting buddy pages in expand().
>
> I may be wrong because I do not know this part clearly.
>
> Original code is here:
>
> ---
> static inline void expand(struct zone *zone, struct page *page,
>         int low, int high, struct free_area *area,
>         int migratetype)
> {
> ...
>                 list_add(&page[size].lru, &area->free_list[migratetype]);
>                 area->nr_free++;
>                 set_page_order(&page[size], high);
> ---
>
> It seems that migratetype of page[size] page is changed. So even if not
> applying your patch, I think migratetype of the page should be changed.
>

Hmm, thinking about this a bit more, I agree with you. Although it's not a
bug-fix for mainline, it is certainly good to have, since it makes things
more consistent by tracking the freepage migratetype properly for pages split
during buddy expansion. I'll separate this patch from the series and post it
as a stand-alone patch. Thank you!

Regards,
Srivatsa S. Bhat
[PATCH v3 06/20] mm, hugetlb: return a reserved page to a reserved pool if failed
If we fail with a reserved page, just calling put_page() is not sufficient,
because put_page() invokes free_huge_page() as its last step, and that
function doesn't know whether a page comes from a reserved pool or not.
So it doesn't do anything related to the reserve count. This leaves the
reserve count lower than it needs to be, because the reserve count was
already decreased in dequeue_huge_page_vma(). This patch fixes this
situation.

In this patch, PagePrivate() is used for tracking reservation.
When reserved pages are dequeued from the reserved pool, the Private flag
is set on the hugepage until it is properly mapped. On the page returning
path, if there is a hugepage with the Private flag set, it is considered
to be one returned in an error path, so we should restore one reserve
count in order to preserve that user's reserved hugepage.

Using the Private flag is safe for the hugepage, because it doesn't use
the LRU mechanism, so there is no other user of this page except us.
Therefore we can use this flag safely.

Signed-off-by: Joonsoo Kim
---
Replenishing commit message only.

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6c8eec2..3f834f1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -572,6 +572,7 @@ retry_cpuset:
 				if (!vma_has_reserves(vma, chg))
 					break;

+				SetPagePrivate(page);
 				h->resv_huge_pages--;
 				break;
 			}
@@ -626,15 +627,20 @@ static void free_huge_page(struct page *page)
 	int nid = page_to_nid(page);
 	struct hugepage_subpool *spool =
 		(struct hugepage_subpool *)page_private(page);
+	bool restore_reserve;

 	set_page_private(page, 0);
 	page->mapping = NULL;
 	BUG_ON(page_count(page));
 	BUG_ON(page_mapcount(page));
+	restore_reserve = PagePrivate(page);

 	spin_lock(&hugetlb_lock);
 	hugetlb_cgroup_uncharge_page(hstate_index(h),
 				     pages_per_huge_page(h), page);
+	if (restore_reserve)
+		h->resv_huge_pages++;
+
 	if (h->surplus_huge_pages_node[nid] && huge_page_order(h) < MAX_ORDER) {
 		/* remove the page from active list */
 		list_del(&page->lru);
@@ -2616,6 +2622,8 @@ retry_avoidcopy:
 	spin_lock(&mm->page_table_lock);
 	ptep = huge_pte_offset(mm, address & huge_page_mask(h));
 	if (likely(pte_same(huge_ptep_get(ptep), pte))) {
+		ClearPagePrivate(new_page);
+
 		/* Break COW */
 		huge_ptep_clear_flush(vma, address, ptep);
 		set_huge_pte_at(mm, address, ptep,
@@ -2727,6 +2735,7 @@ retry:
 					goto retry;
 				goto out;
 			}
+			ClearPagePrivate(page);

 			spin_lock(&inode->i_lock);
 			inode->i_blocks += blocks_per_huge_page(h);
@@ -2773,8 +2782,10 @@ retry:
 	if (!huge_pte_none(huge_ptep_get(ptep)))
 		goto backout;

-	if (anon_rmap)
+	if (anon_rmap) {
+		ClearPagePrivate(page);
 		hugepage_add_new_anon_rmap(page, vma, address);
+	}
 	else
 		page_dup_rmap(page);
 	new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE)
-- 
1.7.9.5
[PATCH v3 03/20] mm, hugetlb: fix subpool accounting handling
There is a case where we attempt to allocate a hugepage with chg = 0 and
avoid_reserve = 1. Although chg = 0 means that it has a reserved hugepage,
we wouldn't use it, since avoid_reserve = 1 means that we don't want
to allocate a hugepage from a reserved pool. This happens when the parent
process that created a MAP_PRIVATE mapping is about to perform a COW due to
a shared page count and it attempts to satisfy the allocation without using
the existing reserves.

In this case, we would not dequeue a reserved hugepage and, instead, try
to allocate a new hugepage. Therefore, we should check the subpool counter
for a new hugepage. This patch implements it.

Reviewed-by: Aneesh Kumar K.V
Signed-off-by: Joonsoo Kim
---
Replenishing commit message and adding reviewed-by tag.

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 12b6581..ea1ae0a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1144,13 +1144,14 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
 	chg = vma_needs_reservation(h, vma, addr);
 	if (chg < 0)
 		return ERR_PTR(-ENOMEM);
-	if (chg)
-		if (hugepage_subpool_get_pages(spool, chg))
+	if (chg || avoid_reserve)
+		if (hugepage_subpool_get_pages(spool, 1))
 			return ERR_PTR(-ENOSPC);

 	ret = hugetlb_cgroup_charge_cgroup(idx, pages_per_huge_page(h), &h_cg);
 	if (ret) {
-		hugepage_subpool_put_pages(spool, chg);
+		if (chg || avoid_reserve)
+			hugepage_subpool_put_pages(spool, 1);
 		return ERR_PTR(-ENOSPC);
 	}
 	spin_lock(&hugetlb_lock);
@@ -1162,7 +1163,8 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
 			hugetlb_cgroup_uncharge_cgroup(idx,
 						       pages_per_huge_page(h),
 						       h_cg);
-			hugepage_subpool_put_pages(spool, chg);
+			if (chg || avoid_reserve)
+				hugepage_subpool_put_pages(spool, 1);
 			return ERR_PTR(-ENOSPC);
 		}
 		spin_lock(&hugetlb_lock);
-- 
1.7.9.5
[PATCH v2 1/4] mm/zswap: bugfix: memory leak when re-swapon
zswap_tree is not freed on swapoff, and it is kmalloc'ed again on swapon,
so a memory leak occurs.

Fix: free the memory of zswap_tree in zswap_frontswap_invalidate_area().

Signed-off-by: Weijie Yang
---
 mm/zswap.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/zswap.c b/mm/zswap.c
index deda2b6..cbd9578 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -816,6 +816,10 @@ static void zswap_frontswap_invalidate_area(unsigned type)
 	}
 	tree->rbroot = RB_ROOT;
 	spin_unlock(&tree->lock);
+
+	zbud_destroy_pool(tree->pool);
+	kfree(tree);
+	zswap_trees[type] = NULL;
 }

 static struct zbud_ops zswap_zbud_ops = {
-- 
1.7.10.4
[PATCH v2 2/4] mm/zswap: bugfix: memory leak when invalidate and reclaim occur concurrently
Consider the following scenario:

thread 0: reclaims entry x (gets the refcount, but has not yet called zswap_get_swap_cache_page)
thread 1: calls zswap_frontswap_invalidate_page to invalidate entry x.
	When it finishes, entry x and its zbud are not freed because the refcount != 0.
	Now swap_map[x] = 0.
thread 0: now calls zswap_get_swap_cache_page
	swapcache_prepare returns -ENOENT because entry x is not used any more
	zswap_get_swap_cache_page returns ZSWAP_SWAPCACHE_NOMEM
	zswap_writeback_entry does nothing except put the refcount

Now the memory of zswap_entry x and its zpage is leaked.

Modify:
- check the refcount in the fail path, and free the memory if it is no longer referenced.
- use ZSWAP_SWAPCACHE_FAIL instead of ZSWAP_SWAPCACHE_NOMEM, as the fail path can be caused not only by nomem but also by invalidate.

Signed-off-by: Weijie Yang
---
 mm/zswap.c |   21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index cbd9578..1be7b90 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -387,7 +387,7 @@ static void zswap_free_entry(struct zswap_tree *tree, struct zswap_entry *entry)
 enum zswap_get_swap_ret {
 	ZSWAP_SWAPCACHE_NEW,
 	ZSWAP_SWAPCACHE_EXIST,
-	ZSWAP_SWAPCACHE_NOMEM
+	ZSWAP_SWAPCACHE_FAIL,
 };
 
 /*
@@ -401,9 +401,9 @@ enum zswap_get_swap_ret {
  * added to the swap cache, and returned in retpage.
  *
  * If success, the swap cache page is returned in retpage
- * Returns 0 if page was already in the swap cache, page is not locked
- * Returns 1 if the new page needs to be populated, page is locked
- * Returns <0 on error
+ * Returns ZSWAP_SWAPCACHE_EXIST if page was already in the swap cache
+ * Returns ZSWAP_SWAPCACHE_NEW if the new page needs to be populated, page is locked
+ * Returns ZSWAP_SWAPCACHE_FAIL on error
  */
 static int zswap_get_swap_cache_page(swp_entry_t entry,
 				struct page **retpage)
@@ -475,7 +475,7 @@ static int zswap_get_swap_cache_page(swp_entry_t entry,
 	if (new_page)
 		page_cache_release(new_page);
 	if (!found_page)
-		return ZSWAP_SWAPCACHE_NOMEM;
+		return ZSWAP_SWAPCACHE_FAIL;
 	*retpage = found_page;
 	return ZSWAP_SWAPCACHE_EXIST;
 }
@@ -529,11 +529,11 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
 
 	/* try to allocate swap cache page */
 	switch (zswap_get_swap_cache_page(swpentry, &page)) {
-	case ZSWAP_SWAPCACHE_NOMEM: /* no memory */
+	case ZSWAP_SWAPCACHE_FAIL: /* no memory or invalidate happened */
 		ret = -ENOMEM;
 		goto fail;
 
-	case ZSWAP_SWAPCACHE_EXIST: /* page is unlocked */
+	case ZSWAP_SWAPCACHE_EXIST: /* page is already in the swap cache, ignore for now */
 		page_cache_release(page);
 		ret = -EEXIST;
@@ -591,7 +591,12 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
 
 fail:
 	spin_lock(&tree->lock);
-	zswap_entry_put(entry);
+	refcount = zswap_entry_put(entry);
+	if (refcount <= 0) {
+		/* invalidate happened, consider writeback as success */
+		zswap_free_entry(tree, entry);
+		ret = 0;
+	}
 	spin_unlock(&tree->lock);
 	return ret;
 }
--
1.7.10.4
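The scenario in this patch comes down to who drops the last reference. The sketch below models it in userspace under assumed names: struct entry, entry_put(), entry_free(), model_invalidate() and model_writeback_fail() are hypothetical, and locking is left out entirely (the real code holds tree->lock around the put).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct zswap_entry: a refcount and a marker. */
struct entry {
	int refcount;
	bool freed;
};

/* Drop one reference and return the remaining count. */
static int entry_put(struct entry *e)
{
	return --e->refcount;
}

static void entry_free(struct entry *e)
{
	e->freed = true;
}

/* Model of invalidate: drop the tree's reference, free only if it was the last. */
static void model_invalidate(struct entry *e)
{
	if (entry_put(e) <= 0)
		entry_free(e);
}

/*
 * Model of the fixed fail path in writeback: if our put was the last one,
 * an invalidate already ran, so free the entry and report success.
 */
static int model_writeback_fail(struct entry *e, int err)
{
	if (entry_put(e) <= 0) {
		entry_free(e);
		return 0;
	}
	return err;
}
```

Before the fix, the fail path only performed the put, so an entry whose tree reference had already been dropped by invalidate was never freed.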
[PATCH v2 3/4] mm/zswap: avoid unnecessary page scanning
Add SetPageReclaim before __swap_writepage so that the page can be moved to the tail of the inactive list. This avoids unnecessary page scanning, as this page was already reclaimed by the swap subsystem.

Signed-off-by: Weijie Yang
---
 mm/zswap.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/zswap.c b/mm/zswap.c
index 1be7b90..cc40e6a 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -556,6 +556,9 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
 		SetPageUptodate(page);
 	}
 
+	/* move it to the tail of the inactive list after end_writeback */
+	SetPageReclaim(page);
+
 	/* start writeback */
 	__swap_writepage(page, &wbc, end_swap_bio_write);
 	page_cache_release(page);
--
1.7.10.4
[PATCH v2 4/4] mm/zswap: use GFP_NOIO instead of GFP_KERNEL
To avoid the zswap store and reclaim functions being called recursively, use GFP_NOIO instead of GFP_KERNEL.

Signed-off-by: Weijie Yang
---
 mm/zswap.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index cc40e6a..3d05ed8 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -427,7 +427,7 @@ static int zswap_get_swap_cache_page(swp_entry_t entry,
 		 * Get a new page to read into from swap.
 		 */
 		if (!new_page) {
-			new_page = alloc_page(GFP_KERNEL);
+			new_page = alloc_page(GFP_NOIO);
 			if (!new_page)
 				break; /* Out of memory */
 		}
@@ -435,7 +435,7 @@ static int zswap_get_swap_cache_page(swp_entry_t entry,
 		/*
 		 * call radix_tree_preload() while we can wait.
 		 */
-		err = radix_tree_preload(GFP_KERNEL);
+		err = radix_tree_preload(GFP_NOIO);
 		if (err)
 			break;
 
@@ -636,7 +636,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
 	}
 
 	/* allocate entry */
-	entry = zswap_entry_cache_alloc(GFP_KERNEL);
+	entry = zswap_entry_cache_alloc(GFP_NOIO);
 	if (!entry) {
 		zswap_reject_kmemcache_fail++;
 		ret = -ENOMEM;
--
1.7.10.4
[PATCH v2 0/4] mm/zswap bugfix: memory leaks and other problems
This patch series fixes a few bugs in zswap, based on Linux-3.11.

v1 --> v2
- free memory in zswap_frontswap_invalidate_area (in patch 1)
- fix whitespace corruption (line wrapping)

Corresponding mail thread: https://lkml.org/lkml/2013/8/18/59

The issues fixed/optimized are:
1. memory leaks when re-swapon
2. memory leaks when invalidate and reclaim occur concurrently
3. avoid unnecessary page scanning
4. use GFP_NOIO instead of GFP_KERNEL to avoid the zswap store and reclaim functions being called recursively

Issues discussed in that mail thread that are NOT fixed, as they happen rarely or are not a big problem:

1. a "theoretical race condition" when reclaiming a page
When a handle is allocated from zbud, zbud considers the handle to be validly used by the upper layer (zswap) and a candidate for reclaim. But zswap still has to initialize it, such as setting the swapentry and adding it to the rbtree, so there is a race condition, such as:
thread 0: obtain handle x from zbud_alloc
thread 1: zbud_reclaim_page is called
thread 1: callback zswap_writeback_entry to reclaim handle x
thread 1: get swpentry from handle x (it is a random value now)
thread 1: bad thing may happen
thread 0: initialize handle x with swapentry

2. frontswap_map bitmap not cleared after zswap reclaim
Frontswap uses the frontswap_map bitmap to track pages in the "backend" implementation. When zswap reclaims a page, the corresponding bitmap record is not cleared.

 mm/zswap.c |   34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)
Re: [PATCH v3 0/2] ext4: increase mbcache scalability
On 2013-09-05, at 3:49 AM, Thavatchai Makphaibulchoke wrote:
> On 09/05/2013 02:35 AM, Theodore Ts'o wrote:
>> How did you gather these results? The mbcache is only used if you
>> are using extended attributes, and only if the extended attributes don't fit
>> in the inode's extra space.
>>
>> I checked aim7, and it doesn't do any extended attribute operations.
>> So why are you seeing differences? Are you doing something like
>> deliberately using 128 byte inodes (which is not the default inode
>> size), and then enabling SELinux, or some such?
>
> No, I did not do anything special, including changing an inode's size. I just
> used the profile data, which indicated mb_cache module as one of the
> bottleneck. Please see below for perf data from one of the new_fserver run,
> which also shows some mb_cache activities.
>
> |--3.51%-- __mb_cache_entry_find
> |          mb_cache_entry_find_first
> |          ext4_xattr_cache_find
> |          ext4_xattr_block_set
> |          ext4_xattr_set_handle
> |          ext4_initxattrs
> |          security_inode_init_security
> |          ext4_init_security

Looks like this is some large security xattr, or enough smaller xattrs to exceed the ~120 bytes of in-inode xattr storage. How big is the SELinux xattr (assuming that is what it is)?

> Looks like it's a bit harder to disable mbcache than I thought.
> I ended up adding code to collect the statistics.
>
> With selinux enabled, for new_fserver workload of aim7, there
> are a total of 0x7e054201 ext4_xattr_cache_find() calls
> that result in a hit and 0xc100 calls that are not.
> The number does not seem to favor the complete disabling of
> mbcache in this case.

This is about a 65% hit rate, which seems reasonable.

You could try a few different things here:
- disable selinux completely (boot with "selinux=0" on the kernel command line) and see how much faster it is
- format your ext4 filesystem with larger inodes (-I 512) and see if this is an improvement or not. That depends on the size of the selinux xattrs and whether they will fit into the extra 256 bytes of xattr space these larger inodes will give you. The performance might also be worse, since there will be more data to read/write for each inode, but it would avoid seeking to the xattr blocks.

Cheers, Andreas
linux-next: build warning after merge of the ceph tree
Hi Sage,

After merging the ceph tree, today's linux-next build (x86_64 allmodconfig) produced this warning:

In file included from fs/ceph/super.h:4:0,
                 from fs/ceph/cache.c:26:
include/linux/ceph/ceph_debug.h:4:0: warning: "pr_fmt" redefined [enabled by default]
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 ^
In file included from include/linux/kernel.h:13:0,
                 from include/asm-generic/bug.h:13,
                 from arch/x86/include/asm/bug.h:38,
                 from include/linux/bug.h:4,
                 from include/linux/thread_info.h:11,
                 from include/linux/preempt.h:9,
                 from include/linux/spinlock.h:50,
                 from include/linux/wait.h:7,
                 from include/linux/fs.h:6,
                 from include/linux/fscache.h:21,
                 from fs/ceph/cache.c:24:
include/linux/printk.h:206:0: note: this is the location of the previous definition
 #define pr_fmt(fmt) fmt
 ^

Probably introduced by commit cb0963fcf836 ("ceph: use fscache as a local presisent cache"). pr_fmt needs to be defined before printk.h gets included.

--
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
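The ordering requirement can be shown in a single userspace file. This is a sketch, not kernel code: the #ifndef block stands in for the default definition in include/linux/printk.h (which, to my knowledge, is guarded the same way), KBUILD_MODNAME is defined by hand here although the kernel build normally supplies it, and model_pr_info() is a hypothetical replacement for pr_info() that formats into a buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for KBUILD_MODNAME, normally provided by the kernel build. */
#define KBUILD_MODNAME "ceph"

/* Step 1: the source file defines pr_fmt before the default can be seen. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/*
 * Step 2: stand-in for the default in printk.h. Because pr_fmt already
 * exists, this block is skipped and no "redefined" warning can occur.
 * If step 1 came after this block instead, the compiler would warn.
 */
#ifndef pr_fmt
#define pr_fmt(fmt) fmt
#endif

/* Hypothetical userspace replacement for pr_info(): formats into buf. */
#define model_pr_info(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), __VA_ARGS__)
```

In the warning above the order is reversed: fs/ceph/cache.c pulls in printk.h at line 24 and only reaches ceph_debug.h at line 26.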
Re: [PATCH] ethernet/arc/arc_emac: optimize the Tx/Tx-reclaim paths a bit
From: Vineet Gupta
Date: Fri, 6 Sep 2013 04:24:39 +

> On 09/05/2013 11:54 PM, David Miller wrote:
>> You should keep the check in the transmit queueing code as a BUG check,
>> almost every driver has code of the form (using NIU as an example): ...
>> Otherwise queue management bugs are incredibly hard to diagnose.
>>
>> I'm not applying this patch.
>
> The check is already there for current BD. What I removed was checking for next BD
> too (please see below). IMHO this is useless since it will be done in next
> iteration anyways. In my tests, the next check never got hit, so it was waste of
> cycles.
>
> static int arc_emac_tx(struct sk_buff *skb, struct net_device *ndev)
> {
> 	if (unlikely((le32_to_cpu(*info) & OWN_MASK) == FOR_EMAC)) {
> 		netif_stop_queue(ndev);
> 		return NETDEV_TX_BUSY;
> 	}
>
> 	...
> 	*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;
>
> -	/* Get "info" of the next BD */
> -	info = &priv->txbd[*txbd_curr].info;
> -
> -	/* Check if if Tx BD ring is full - next BD is still owned by EMAC */
> -	if (unlikely((le32_to_cpu(*info) & OWN_MASK) == FOR_EMAC))
> -		netif_stop_queue(ndev);
>
> OTOH, I do see a slight stats update issue - if the queue is stopped (but pkt not
> dropped) we are failing to increment tx_errors. But that would be a separate
> patch.

It is exactly the correct thing to do.

The driver should _NEVER_ return NETDEV_TX_BUSY under normal circumstances. The queue should always be stopped by the ->ndo_start_xmit() method when it fills the queue.

Again, when ->ndo_start_xmit() is invoked, it should never see the queue full. When that happens it is a bug.

You are deleting exactly the correct part of this function, what it is doing right now is precisely the correct way to manage netif queue state.

The only valid change you can make here is to make the:

	if (unlikely((le32_to_cpu(*info) & OWN_MASK) == FOR_EMAC)) {
		netif_stop_queue(ndev);
		return NETDEV_TX_BUSY;
	}

print out an error message and increment tx_errors.
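The queue discipline described in this reply can be sketched with a small ring model. It is a userspace illustration under assumed names, not the arc_emac driver: struct tx_ring, model_xmit() and model_reclaim() are hypothetical, and owned_by_hw[] plays the role of the OWN_MASK/FOR_EMAC ownership bit.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_BD_NUM 4

enum xmit_ret { XMIT_OK, XMIT_BUSY };

/* Hypothetical model of a driver's Tx state. */
struct tx_ring {
	bool owned_by_hw[TX_BD_NUM];
	unsigned int curr;
	bool queue_stopped;
	unsigned long tx_errors;
};

static enum xmit_ret model_xmit(struct tx_ring *r)
{
	/* Must never trigger if the queue is stopped in time: treat as a bug. */
	if (r->owned_by_hw[r->curr]) {
		r->queue_stopped = true;
		r->tx_errors++;
		return XMIT_BUSY;
	}

	r->owned_by_hw[r->curr] = true;	/* hand the descriptor to the hardware */
	r->curr = (r->curr + 1) % TX_BD_NUM;

	/* Ring is now full: stop the queue before the stack calls us again. */
	if (r->owned_by_hw[r->curr])
		r->queue_stopped = true;

	return XMIT_OK;
}

/* Model of Tx reclaim: the hardware is done with one descriptor. */
static void model_reclaim(struct tx_ring *r, unsigned int idx)
{
	r->owned_by_hw[idx] = false;
	r->queue_stopped = false;
}
```

The second check in model_xmit() is the part the patch wanted to delete; with it in place the busy branch is only reachable through a queue management bug, which is why it is worth an error message and a tx_errors increment.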
Re: [PATCH 7/9] KGDB/KDB: add new system NMI entry code to KDB
On 09/05/2013 05:50 PM, Mike Travis wrote:
> This patch adds a new "KDB_REASON" code (KDB_REASON_SYSTEM_NMI). This
> is purely cosmetic to distinguish it from the other various reasons that
> NMI may occur and are usually after an error occurred. Also the dumping
> of registers is not done to more closely match what is displayed when KDB
> is entered manually via the sysrq 'g' key.

This patch is not quite right. See below.

>
> Signed-off-by: Mike Travis
> Reviewed-by: Dimitri Sivanich
> Reviewed-by: Hedi Berriche
> ---
>  include/linux/kdb.h             |    1 +
>  include/linux/kgdb.h            |    1 +
>  kernel/debug/debug_core.c       |    5 +++++
>  kernel/debug/kdb/kdb_debugger.c |    5 ++++-
>  kernel/debug/kdb/kdb_main.c     |    3 +++
>  5 files changed, 14 insertions(+), 1 deletion(-)
>
> --- linux.orig/include/linux/kdb.h
> +++ linux/include/linux/kdb.h
> @@ -109,6 +109,7 @@ typedef enum {
>  	KDB_REASON_RECURSE,	/* Recursive entry to kdb;
>  				 * regs probably valid */
>  	KDB_REASON_SSTEP,	/* Single Step trap. - regs valid */
> +	KDB_REASON_SYSTEM_NMI,	/* In NMI due to SYSTEM cmd; regs valid */
>  } kdb_reason_t;
>
>  extern int kdb_trap_printk;
> --- linux.orig/include/linux/kgdb.h
> +++ linux/include/linux/kgdb.h
> @@ -52,6 +52,7 @@ extern int kgdb_connected;
>  extern int kgdb_io_module_registered;
>
>  extern atomic_t kgdb_setting_breakpoint;
> +extern atomic_t kgdb_system_nmi;

We don't need extra atomics. You should add another variable to the kgdb_state, which is processor specific in this case. Better yet, just set the ks->err_code properly in your kgdb_nmicallin() or in the origination call to kgdb_nmicallback() from your NMI handler (remember, I still have the question pending whether we actually need kgdb_nmicallin() in the first place).

You already did the work of adding another NMI type to the enum. We just need to use the ks->err_code variable as well.
> extern atomic_t kgdb_cpu_doing_single_step; > > extern struct task_struct*kgdb_usethread; > --- linux.orig/kernel/debug/debug_core.c > +++ linux/kernel/debug/debug_core.c > @@ -125,6 +125,7 @@ static atomic_t masters_in_kgdb; > static atomic_t slaves_in_kgdb; > static atomic_t kgdb_break_tasklet_var; > atomic_t kgdb_setting_breakpoint; > +atomic_t kgdb_system_nmi; > > struct task_struct *kgdb_usethread; > struct task_struct *kgdb_contthread; > @@ -760,7 +761,11 @@ int kgdb_nmicallin(int cpu, int trapnr, > > /* Indicate there are slaves waiting */ > kgdb_info[cpu].send_ready = send_ready; > + > + /* Use new reason code "SYSTEM_NMI" */ > + atomic_inc(&kgdb_system_nmi); > kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER); > + atomic_dec(&kgdb_system_nmi); > kgdb_do_roundup = save_kgdb_do_roundup; > kgdb_info[cpu].send_ready = NULL; > > --- linux.orig/kernel/debug/kdb/kdb_debugger.c > +++ linux/kernel/debug/kdb/kdb_debugger.c > @@ -69,7 +69,10 @@ int kdb_stub(struct kgdb_state *ks) > if (atomic_read(&kgdb_setting_breakpoint)) > reason = KDB_REASON_KEYBOARD; > > - if (in_nmi()) > + if (atomic_read(&kgdb_system_nmi)) > + reason = KDB_REASON_SYSTEM_NMI; This would get changed to if (ks->err == KDB_REASON_SYSNMI && ks->signo == SIGTRAP) Cheers, Jason. 
> + > + else if (in_nmi()) > reason = KDB_REASON_NMI; > > for (i = 0, bp = kdb_breakpoints; i < KDB_MAXBPT; i++, bp++) { > --- linux.orig/kernel/debug/kdb/kdb_main.c > +++ linux/kernel/debug/kdb/kdb_main.c > @@ -1200,6 +1200,9 @@ static int kdb_local(kdb_reason_t reason > instruction_pointer(regs)); > kdb_dumpregs(regs); > break; > + case KDB_REASON_SYSTEM_NMI: > + kdb_printf("due to System NonMaskable Interrupt\n"); > + break; > case KDB_REASON_NMI: > kdb_printf("due to NonMaskable Interrupt @ " > kdb_machreg_fmt "\n", > > -- > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management.
On Thu, Sep 05, 2013 at 04:41:55PM -0700, Sudeep Dutt wrote:
> +What:		/sys/class/mic/mic(x)/firmware
> +Date:		August 2013
> +KernelVersion:	3.11
> +Contact:	Sudeep Dutt
> +Description:
> +		When read, this sysfs entry provides the path name under
> +		/lib/firmware/ where the firmware image to be booted on the
> +		card can be found. The entry can be written to change the
> +		firmware image location under /lib/firmware/.

I don't understand, is the path under the HOST device, or the Client device's disk? Why do you need to change the path on the HOST? What's wrong with the existing firmware path selection we have in the kernel?

thanks,

greg k-h
Re: [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management.
On Thu, Sep 05, 2013 at 04:41:55PM -0700, Sudeep Dutt wrote:
> +What:		/sys/class/mic/mic(x)/cmdline
> +Date:		August 2013
> +KernelVersion:	3.11
> +Contact:	Sudeep Dutt
> +Description:
> +		An Intel MIC device runs a Linux OS during its operation. Before
> +		booting this card OS, it is possible to pass kernel command line
> +		options to configure various features in it, similar to
> +		self-bootable machines. When read, this entry provides
> +		information about the current kernel command line options set to
> +		boot the card OS. This entry can be written to change the
> +		existing kernel command line options. Typically, the user would
> +		want to read the current command line options, append new ones
> +		or modify existing ones and then write the whole kernel command
> +		line back to this entry.

Is a PAGE_SIZE value going to be big enough for your command line? I know some embedded systems have horribly long command lines, hopefully this will be enough for you.

thanks,

greg k-h
Re: [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management.
Again, very minor fixups for later (I can even do them...)

> +static DEVICE_ATTR(state, S_IRUGO|S_IWUSR, mic_show_state, mic_store_state);

DEVICE_ATTR_RW() please. Same for the other attributes you create in this patch.

thanks,

greg k-h
Re: [PATCH v2 1/1] dcache: Translating dentry into pathname without taking rename_lock
On Thu, Sep 5, 2013 at 7:01 PM, Waiman Long wrote:
>
> I am sorry that I misunderstand what you said. I will do what you and Al
> advise me to do.

I'm sorry I shouted at you. I was getting a bit frustrated there..

Linus
Re: [PATCH RESEND v3 1/7] Intel MIC Host Driver for X100 family.
On Thu, Sep 05, 2013 at 04:41:31PM -0700, Sudeep Dutt wrote:
>  drivers/misc/mic/common/mic_device.h | 37 +++
>  drivers/misc/mic/host/mic_device.h   | 109 +

Two different files, with the same name? You are asking for trouble in the future, getting them confused :)

Please try to pick a unique name, especially when you later do things like:

> +#include "../common/mic_device.h"
> +#include "mic_device.h"

Which just looks odd.

Again, not a big deal, follow-on patch can fix this.

thanks,

greg k-h
Re: [PATCH RESEND v3 1/7] Intel MIC Host Driver for X100 family.
Very minor nits, you can change this in a future add-on patch:

> +static DEVICE_ATTR(family, S_IRUGO, mic_show_family, NULL);

This should use DEVICE_ATTR_RO(), so that we don't have to audit the permissions of your DEVICE_ATTR() files.

> +static DEVICE_ATTR(stepping, S_IRUGO, mic_show_stepping, NULL);

Same here.

> +static struct attribute *mic_default_attrs[] = {
> +	&dev_attr_family.attr,
> +	&dev_attr_stepping.attr,
> +
> +	NULL
> +};
> +
> +static struct attribute_group mic_attr_group = {
> +	.attrs = mic_default_attrs,
> +};
> +
> +static const struct attribute_group *__mic_attr_group[] = {
> +	&mic_attr_group,
> +	NULL
> +};

These last two structures can be replaced with:

	ATTRIBUTE_GROUPS(mic_default);

> +void mic_sysfs_init(struct mic_device *mdev)
> +{
> +	mdev->attr_group = __mic_attr_group;
> +}

This is "odd", why not just export the data structure and reference it in the other code? The pci core does this, and so do other busses.

Anyway, it's not a big deal, just a bit strange to me.

thanks,

greg k-h
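What the two suggested macros save can be seen in a userspace mock. The sketch below is not the kernel's definition: the struct layouts are cut down, show() takes only a buffer (the kernel callback takes more arguments), and the macros only mirror the naming rules as I understand them, namely that DEVICE_ATTR_RO(name) expects a name_show() function and a fixed read-only mode, and ATTRIBUTE_GROUPS(name) expects a name_attrs[] array and generates name_group and name_groups. The strings returned by the show functions are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified userspace stand-ins for the kernel's sysfs types. */
struct attribute {
	const char *name;
	unsigned int mode;
};

struct device_attribute {
	struct attribute attr;
	int (*show)(char *buf);	/* cut-down signature, for the mock only */
};

struct attribute_group {
	struct attribute **attrs;
};

/* Mock of the naming rule: needs <name>_show(), mode is always 0444. */
#define DEVICE_ATTR_RO(_name) \
	struct device_attribute dev_attr_##_name = { \
		.attr = { .name = #_name, .mode = 0444 }, \
		.show = _name##_show, \
	}

/* Mock of the naming rule: needs <name>_attrs[], emits _group and _groups. */
#define ATTRIBUTE_GROUPS(_name) \
	static const struct attribute_group _name##_group = { \
		.attrs = _name##_attrs, \
	}; \
	static const struct attribute_group *_name##_groups[] = { \
		&_name##_group, \
		NULL, \
	}

static int family_show(char *buf)
{
	return sprintf(buf, "x100\n");	/* placeholder value */
}
static DEVICE_ATTR_RO(family);

static int stepping_show(char *buf)
{
	return sprintf(buf, "A0\n");	/* placeholder value */
}
static DEVICE_ATTR_RO(stepping);

static struct attribute *mic_default_attrs[] = {
	&dev_attr_family.attr,
	&dev_attr_stepping.attr,
	NULL,
};
ATTRIBUTE_GROUPS(mic_default);
```

Note that under this convention the callbacks would have to be named family_show() and stepping_show() rather than mic_show_family() and mic_show_stepping() as in the patch, and the permissions no longer appear at the call site, which is the auditing benefit mentioned above.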
Re: [PATCH 0/8] ceph: fscache support & upstream changes
David, After running this for a day on some loaded machines I ran into what looks like an old issue with the new code. I remember you saw an issue that manifested it self in a similar way a while back. [13837253.462779] FS-Cache: Assertion failed [13837253.462782] 3 == 5 is false [13837253.462807] [ cut here ] [13837253.462811] kernel BUG at fs/fscache/operation.c:414! [13837253.462815] invalid opcode: [#1] SMP [13837253.462820] Modules linked in: cachefiles microcode auth_rpcgss oid_registry nfsv4 nfs lockd ceph sunrpc libceph fscache raid10 raid456 async_pq async_xor async_memcpy async_raid6_recov async_tx raid1 raid0 multipath linear btrfs raid6_pq lzo_compress xor zlib_deflate libcrc32c [13837253.462851] CPU: 1 PID: 1848 Comm: kworker/1:2 Not tainted 3.11.0-rc5-virtual #55 [13837253.462870] Workqueue: ceph-revalidate ceph_revalidate_work [ceph] [13837253.462875] task: 8804251f16f0 ti: 8804047fa000 task.ti: 8804047fa000 [13837253.462879] RIP: e030:[] [] fscache_put_operation+0x2ad/0x330 [fscache] [13837253.462893] RSP: e02b:8804047fbd58 EFLAGS: 00010296 [13837253.462896] RAX: 000f RBX: 880424049d80 RCX: 0006 [13837253.462901] RDX: 0007 RSI: 0007 RDI: 8804047f0218 [13837253.462906] RBP: 8804047fbd68 R08: R09: [13837253.462910] R10: 0108 R11: 0107 R12: 8804251cf928 [13837253.462915] R13: 8804253c7370 R14: R15: [13837253.462923] FS: 7f5c56e43700() GS:88044350() knlGS: [13837253.462928] CS: e033 DS: ES: CR0: 8005003b [13837253.462932] CR2: 7fc08b7ee000 CR3: 0004259a4000 CR4: 2660 [13837253.462936] Stack: [13837253.462939] 880424049d80 8804251cf928 8804047fbda8 a016def1 [13837253.462946] 88042b462b20 88040701c750 88040701c730 88040701c3f0 [13837253.462953] 0003 8804047fbde8 a025ba3f [13837253.462959] Call Trace: [13837253.462966] [] __fscache_check_consistency+0x1a1/0x2c0 [fscache] [13837253.462977] [] ceph_revalidate_work+0x8f/0x120 [ceph] [13837253.462987] [] process_one_work+0x179/0x490 [13837253.462992] [] worker_thread+0x11b/0x370 [13837253.462998] [] ? 
manage_workers.isra.21+0x2e0/0x2e0 [13837253.463004] [] kthread+0xc0/0xd0 [13837253.463011] [] ? perf_trace_xen_mmu_pmd_clear+0x50/0xc0 [13837253.463017] [] ? flush_kthread_worker+0xb0/0xb0 [13837253.463024] [] ret_from_fork+0x7c/0xb0 [13837253.463029] [] ? flush_kthread_worker+0xb0/0xb0 [13837253.463033] Code: 31 c0 e8 5d e6 3e e1 48 c7 c7 04 8e 17 a0 31 c0 e8 4f e6 3e e1 8b 73 40 ba 05 00 00 00 48 c7 c7 62 8e 17 a0 31 c0 e8 39 e6 3e e1 <0f> 0b 65 48 8b 34 25 80 c7 00 00 48 c7 c7 4f 8e 17 a0 48 81 c6 [13837253.463071] RIP [] fscache_put_operation+0x2ad/0x330 [fscache] [13837253.463079] RSP [13837253.463085] ---[ end trace 2972d68e8efd961e ]--- [13837253.463130] BUG: unable to handle kernel paging request at ffd8 [13837253.463136] IP: [] kthread_data+0x11/0x20 [13837253.463142] PGD 1a0f067 PUD 1a11067 PMD 0 [13837253.463146] Oops: [#2] SMP [13837253.463150] Modules linked in: cachefiles microcode auth_rpcgss oid_registry nfsv4 nfs lockd ceph sunrpc libceph fscache raid10 raid456 async_pq async_xor async_memcpy async_raid6_recov async_tx raid1 raid0 multipath linear btrfs raid6_pq lzo_compress xor zlib_deflate libcrc32c [13837253.463176] CPU: 1 PID: 1848 Comm: kworker/1:2 Tainted: G D 3.11.0-rc5-virtual #55 [13837253.463190] task: 8804251f16f0 ti: 8804047fa000 task.ti: 8804047fa000 [13837253.463194] RIP: e030:[] [] kthread_data+0x11/0x20 [13837253.463201] RSP: e02b:8804047fba00 EFLAGS: 00010046 [13837253.463204] RAX: RBX: RCX: 81c30d00 [13837253.463209] RDX: 0001 RSI: 0001 RDI: 8804251f16f0 [13837253.463213] RBP: 8804047fba18 R08: 27bf1216 R09: [13837253.463217] R10: 88044360cec0 R11: 000e R12: 0001 [13837253.463222] R13: 8804251f1ac8 R14: 88042c498000 R15: [13837253.463228] FS: 7f5c56e43700() GS:88044350() knlGS: [13837253.463233] CS: e033 DS: ES: CR0: 8005003b [13837253.463237] CR2: 0028 CR3: 0004259a4000 CR4: 2660 [13837253.463241] Stack: [13837253.463243] 8107c3d6 880443513fc0 0001 8804047fba98 [13837253.463249] 81568308 0003 8804251f1ce8 8804251f16f0 
[13837253.463255] 8804047fbfd8 8804047fbfd8 8804047fbfd8 8804047fba78 [13837253.463261] Call Trace: [13837253.463265] [] ? wq_worker_sleeping+0x16/0x90 [13837253.463272] [] __schedule+0x5c8/0x820 [13837253.463276] [] schedule+0x29/0x70 [13837253.662186] [] do_exit+0x6e0/0xa60 [138
Re: [PATCH 2/3] thermal: samsung: change base_common to more meaningful base_second
On Wed, Sep 4, 2013 at 9:53 AM, Naveen Krishna Chatradhi wrote: > On Exynos5440 and Exynos5420 there are registers common > across the TMU channels. > > To support that, we introduced a ADDRESS_MULTIPLE flag in the > driver and the 2nd set of register base and size are provided > in the "reg" property of the node. > > As per Amit's suggestion, this patch changes the base_common > to base_second and SHARED_MEMORY to ADDRESS_MULTIPLE. > > Signed-off-by: Naveen Krishna Chatradhi The changes look good. For all the 3 patches in the series, Acked-by: Amit Daniel Kachhap Reviewed-by: Amit Daniel Kachhap Thanks, Amit Daniel > --- > Changes since v2: > Changed the flag name from SHARED_MEMORY to ADDRESS_MULTIPLE. > https://lkml.org/lkml/2013/8/1/38 > > .../devicetree/bindings/thermal/exynos-thermal.txt |4 ++-- > drivers/thermal/samsung/exynos_tmu.c | 12 ++-- > drivers/thermal/samsung/exynos_tmu.h |4 ++-- > drivers/thermal/samsung/exynos_tmu_data.c |2 +- > 4 files changed, 11 insertions(+), 11 deletions(-) > > diff --git a/Documentation/devicetree/bindings/thermal/exynos-thermal.txt > b/Documentation/devicetree/bindings/thermal/exynos-thermal.txt > index 284f530..116cca0 100644 > --- a/Documentation/devicetree/bindings/thermal/exynos-thermal.txt > +++ b/Documentation/devicetree/bindings/thermal/exynos-thermal.txt > @@ -11,8 +11,8 @@ > - reg : Address range of the thermal registers. For soc's which has multiple > instances of TMU and some registers are shared across all TMU's like > interrupt related then 2 set of register has to supplied. First set > - belongs to each instance of TMU and second set belongs to common TMU > - registers. > + belongs to each instance of TMU and second set belongs to second set > + of common TMU registers. 
> - interrupts : Should contain interrupt for thermal system > - clocks : The main clock for TMU device > - clock-names : Thermal system clock name > diff --git a/drivers/thermal/samsung/exynos_tmu.c > b/drivers/thermal/samsung/exynos_tmu.c > index d201ed8..3a55caf 100644 > --- a/drivers/thermal/samsung/exynos_tmu.c > +++ b/drivers/thermal/samsung/exynos_tmu.c > @@ -41,7 +41,7 @@ > * @id: identifier of the one instance of the TMU controller. > * @pdata: pointer to the tmu platform/configuration data > * @base: base address of the single instance of the TMU controller. > - * @base_common: base address of the common registers of the TMU controller. > + * @base_second: base address of the common registers of the TMU controller. > * @irq: irq number of the TMU controller. > * @soc: id of the SOC type. > * @irq_work: pointer to the irq work structure. > @@ -56,7 +56,7 @@ struct exynos_tmu_data { > int id; > struct exynos_tmu_platform_data *pdata; > void __iomem *base; > - void __iomem *base_common; > + void __iomem *base_second; > int irq; > enum soc_type soc; > struct work_struct irq_work; > @@ -297,7 +297,7 @@ skip_calib_data: > } > /*Clear the PMIN in the common TMU register*/ > if (reg->tmu_pmin && !data->id) > - writel(0, data->base_common + reg->tmu_pmin); > + writel(0, data->base_second + reg->tmu_pmin); > out: > clk_disable(data->clk); > mutex_unlock(&data->lock); > @@ -451,7 +451,7 @@ static void exynos_tmu_work(struct work_struct *work) > > /* Find which sensor generated this interrupt */ > if (reg->tmu_irqstatus) { > - val_type = readl(data->base_common + reg->tmu_irqstatus); > + val_type = readl(data->base_second + reg->tmu_irqstatus); > if (!((val_type >> data->id) & 0x1)) > goto out; > } > @@ -582,7 +582,7 @@ static int exynos_map_dt_data(struct platform_device > *pdev) > * Check if the TMU shares some registers and then try to map the > * memory of common registers. 
> */ > - if (!TMU_SUPPORTS(pdata, SHARED_MEMORY)) > + if (!TMU_SUPPORTS(pdata, ADDRESS_MULTIPLE)) > return 0; > > if (of_address_to_resource(pdev->dev.of_node, 1, &res)) { > @@ -590,7 +590,7 @@ static int exynos_map_dt_data(struct platform_device > *pdev) > return -ENODEV; > } > > - data->base_common = devm_ioremap(&pdev->dev, res.start, > + data->base_second = devm_ioremap(&pdev->dev, res.start, > resource_size(&res)); > if (!data->base) { > dev_err(&pdev->dev, "Failed to ioremap memory\n"); > diff --git a/drivers/thermal/samsung/exynos_tmu.h > b/drivers/thermal/samsung/exynos_tmu.h > index 7c6c34a..ebd2ec1 100644 > --- a/drivers/thermal/samsung/exynos_tmu.h > +++ b/drivers/thermal/samsung/exynos_tmu.h > @@ -59,7 +59,7 @@ enum soc_type { > * state(active/idle) can be checked. > * TMU_S
Re: [PATCH 5/9] KGDB/KDB: add support for external NMI handler to call KGDB/KDB.
On 09/05/2013 05:50 PM, Mike Travis wrote:
> This patch adds a kgdb_nmicallin() interface that can be used by
> external NMI handlers to call the KGDB/KDB handler. The primary need
> for this is for those types of NMI interrupts where all the CPUs
> have already received the NMI signal. Therefore no send_IPI(NMI)
> is required, and in fact it will cause a 2nd unhandled NMI to occur.
> This generates the "Dazed and Confuzed" messages.
>
> Since all the CPUs are getting the NMI at roughly the same time, it's not
> guaranteed that the first CPU that hits the NMI handler will manage to
> enter KGDB and set the dbg_master_lock before the slaves start entering.

It should have been ok to have more than one master if this was some kind of watchdog. The raw spin lock for the dbg_master_lock should have ensured that only a single CPU is in fact the master. If it is the case that we cannot send a nested IPI at this point, the UV machine type should have replaced the kgdb_roundup_cpus() routine with something that will work, such as looking at the exception type on the way in and perhaps skipping the IPI send.

Also, if there is no possibility of restarting the machine from this state, it would have been possible to simply turn off kgdb_do_roundup in the custom kgdb_roundup_cpus().

The patch you created looks like it will work, but it comes at the cost of some complexity because you are also checking on the state of "kgdb_info[cpu].send_ready" in some other location in the NMI handler. It might be better to consider not sending a nested NMI if all the CPUs are going to enter anyway in the master state.

>
> The new argument "send_ready" was added for KGDB to signal the NMI handler
> to release the slave CPUs for entry into KGDB.
> > Signed-off-by: Mike Travis > Reviewed-by: Dimitri Sivanich > Reviewed-by: Hedi Berriche > --- > include/linux/kgdb.h |1 + > kernel/debug/debug_core.c | 41 + > kernel/debug/debug_core.h |1 + > 3 files changed, 43 insertions(+) > > --- linux.orig/include/linux/kgdb.h > +++ linux/include/linux/kgdb.h > @@ -310,6 +310,7 @@ extern int > kgdb_handle_exception(int ex_vector, int signo, int err_code, >struct pt_regs *regs); > extern int kgdb_nmicallback(int cpu, void *regs); > +extern int kgdb_nmicallin(int cpu, int trapnr, void *regs, atomic_t > *snd_rdy); > extern void gdbstub_exit(int status); > > extern intkgdb_single_step; > --- linux.orig/kernel/debug/debug_core.c > +++ linux/kernel/debug/debug_core.c > @@ -578,6 +578,10 @@ return_normal: > /* Signal the other CPUs to enter kgdb_wait() */ > if ((!kgdb_single_step) && kgdb_do_roundup) > kgdb_roundup_cpus(flags); > + > +/* If optional send ready pointer, signal CPUs to proceed */ > +if (kgdb_info[cpu].send_ready) > +atomic_set(kgdb_info[cpu].send_ready, 1); > #endif > > /* > @@ -729,6 +733,43 @@ int kgdb_nmicallback(int cpu, void *regs > return 0; > } > #endif > +return 1; > +} > + > +int kgdb_nmicallin(int cpu, int trapnr, void *regs, atomic_t *send_ready) > +{ > +#ifdef CONFIG_SMP > +if (!kgdb_io_ready(0)) > +return 1; > + > +if (kgdb_info[cpu].enter_kgdb == 0) { > +struct kgdb_state kgdb_var; > +struct kgdb_state *ks = &kgdb_var; > +int save_kgdb_do_roundup = kgdb_do_roundup; > + > +memset(ks, 0, sizeof(struct kgdb_state)); > +ks->cpu= cpu; > +ks->ex_vector= trapnr; > +ks->signo= SIGTRAP; > +ks->err_code= 0; > +ks->kgdb_usethreadid= 0; > +ks->linux_regs= regs; > + > +/* Do not broadcast NMI */ > +kgdb_do_roundup = 0; > + > +/* Indicate there are slaves waiting */ > +kgdb_info[cpu].send_ready = send_ready; > +kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER); This is the one part of the patch I don't quite understand. Why does the kgdb_nmicallin() desire to be the master core? 
The circumstances under which this is called are not obvious. Is it some kind of watchdog where you really do want to enter the debugger, or is it more to deal with nested slave interrupts where the roundup could have hung on this hardware? If it is the latter, I would have thought this should be a slave and not the master. Perhaps a comment in the code can clear this up?

Thanks,
Jason.
Re: [PATCH] ethernet/arc/arc_emac: optimize the Tx/Tx-reclaim paths a bit
Hi David, On 09/05/2013 11:54 PM, David Miller wrote: > From: Vineet Gupta > Date: Wed, 4 Sep 2013 18:33:11 +0530 > >> This came out of staring at code due to recent performance fix. >> >> * TX BD reclaim can call netif_wake_queue() once, outside the loop if >> one/more BDs were freed, NO need to do this each iteration. >> >> * TX need not look at next BD to stop the netif queue. It rather be done >> in the next tx call, when it actually fails as the queue seldom gets >> full but the check nevertheless needs to be done for each packet Tx. >> Profiled this under heavy traffic (big tar file cp, LMBench betworking >> tests) and saw not a single hit to that code. >> >> Signed-off-by: Vineet Gupta > You should keep the check in the transmit queueing code as a BUG check, > almost every driver has code of the form (using NIU as an example): > > if (niu_tx_avail(rp) <= (skb_shinfo(skb)->nr_frags + 1)) { > netif_tx_stop_queue(txq); > dev_err(np->device, "%s: BUG! Tx ring full when queue > awake!\n", dev->name); > rp->tx_errors++; > return NETDEV_TX_BUSY; > } > > and arc_emac should too. > > Otherwise queue management bugs are incredibly hard to diagnose. > > I'm not applying this patch. The check is already there for current BD. What I removed was checking for next BD too (please see below). IMHO this is useless since it will be done in next iteration anyways. In my tests, the next check never got hit, so it was waste of cycles. static int arc_emac_tx(struct sk_buff *skb, struct net_device *ndev) { if (unlikely((le32_to_cpu(*info) & OWN_MASK) == FOR_EMAC)) { netif_stop_queue(ndev); return NETDEV_TX_BUSY; } ... 
	*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;

-	/* Get "info" of the next BD */
-	info = &priv->txbd[*txbd_curr].info;
-
-	/* Check if if Tx BD ring is full - next BD is still owned by EMAC */
-	if (unlikely((le32_to_cpu(*info) & OWN_MASK) == FOR_EMAC))
-		netif_stop_queue(ndev);

OTOH, I do see a slight stats update issue - if the queue is stopped (but pkt not dropped) we are failing to increment tx_errors. But that would be a separate patch.

-Vineet
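The argument above (checking only the current BD on entry to the transmit path is enough, and the reclaim path can wake the queue once after its loop) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_ring`, `toy_tx`, `toy_reclaim` and `TOY_BD_NUM` are hypothetical names, the `done` argument stands in for the hardware handing descriptors back, and none of this is the arc_emac code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BD_NUM 4	/* hypothetical ring size, not the driver's TX_BD_NUM */

enum toy_owner { TOY_FOR_CPU = 0, TOY_FOR_EMAC = 1 };

struct toy_ring {
	enum toy_owner owner[TOY_BD_NUM];	/* who owns each descriptor */
	size_t curr;				/* next descriptor to fill */
	size_t dirty;				/* next descriptor to reclaim */
	bool queue_stopped;			/* stands in for netif_stop_queue() */
	unsigned int wakeups;			/* counts simulated queue wakeups */
};

/* Transmit one packet: returns 0 on success, -1 if the ring is full. */
static int toy_tx(struct toy_ring *r)
{
	assert(r != NULL);

	/* Only the current BD is checked; a full ring is caught right here. */
	if (r->owner[r->curr] == TOY_FOR_EMAC) {
		r->queue_stopped = true;
		return -1;
	}

	r->owner[r->curr] = TOY_FOR_EMAC;
	r->curr = (r->curr + 1) % TOY_BD_NUM;
	return 0;
}

/* Reclaim up to `done` completed descriptors, waking the queue once. */
static void toy_reclaim(struct toy_ring *r, size_t done)
{
	size_t freed = 0;

	assert(r != NULL);

	while (freed < done && r->owner[r->dirty] == TOY_FOR_EMAC) {
		r->owner[r->dirty] = TOY_FOR_CPU;
		r->dirty = (r->dirty + 1) % TOY_BD_NUM;
		freed++;
	}

	/* Single wakeup outside the loop, only if something was freed. */
	if (freed > 0 && r->queue_stopped) {
		r->queue_stopped = false;
		r->wakeups++;
	}
}
```

In this model, filling all four descriptors never stops the queue; the stop happens on the fifth `toy_tx()` call, which is the behaviour described for the patched driver.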
Re: [PATCH] net: stmmac: fix bad merge conflict resolution
Hi all,

On Thu, 05 Sep 2013 22:58:17 -0400 (EDT) David Miller wrote:
>
> From: Olof Johansson
> Date: Thu, 5 Sep 2013 18:01:41 -0700
>
> > Merge commit 06c54055bebf919249aa1eb68312887c3cfe77b4 did a bad conflict
> > resolution accidentally leaving out a closing brace. Add it back.
> >
> > Signed-off-by: Olof Johansson
> > ---
> >
> > This breaks a handful of defconfigs on ARM, so it'd be good to see it
> > applied pretty quickly. Thanks!
>
> Looks like Linus applied this, thanks Olof.

And I cherry-picked it into linux-next for today.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
[PATCH 0/5] Squashfs: extra sanity checks and sanity check fixes
Hi,

Following on from the "Squashfs: sanity check information from disk" patch from Dan Carpenter, I have added a couple more sanity checks, and fixed a couple of existing sanity checks (including the patch from Dan Carpenter).

These sanity checks mainly exist to trap maliciously corrupted filesystems either through using a deliberately modified mksquashfs, or where the user has deliberately chosen to generate uncompressed metadata and then corrupted it.

Normally metadata in Squashfs filesystems is compressed, which means corruption (either accidental or malicious) is detected when trying to decompress the metadata. So corrupted data does not normally get as far as the code paths in question here.

Phillip Lougher (5):
  Squashfs: fix corruption check in get_dir_index_using_name()
  Squashfs: fix corruption checks in squashfs_lookup()
  Squashfs: fix corruption checks in squashfs_readdir()
  Squashfs: add corruption check in get_dir_index_using_offset()
  Squashfs: add corruption check for type in squashfs_readdir()

 fs/squashfs/dir.c         | 17 +
 fs/squashfs/namei.c       | 7 +++
 fs/squashfs/squashfs_fs.h | 5 -
 3 files changed, 20 insertions(+), 9 deletions(-)

-- 
1.8.3.2
[PATCH 2/5] Squashfs: fix corruption checks in squashfs_lookup()
The dir_count and size fields when read from disk are sanity checked for correctness. However, the sanity checks only check the values are not greater than expected. As dir_count and size were incorrectly defined as signed ints, this can lead to corrupted values appearing as negative which are not trapped. Signed-off-by: Phillip Lougher --- fs/squashfs/namei.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/squashfs/namei.c b/fs/squashfs/namei.c index 342a5aa..67cad77 100644 --- a/fs/squashfs/namei.c +++ b/fs/squashfs/namei.c @@ -147,7 +147,8 @@ static struct dentry *squashfs_lookup(struct inode *dir, struct dentry *dentry, struct squashfs_dir_entry *dire; u64 block = squashfs_i(dir)->start + msblk->directory_table; int offset = squashfs_i(dir)->offset; - int err, length, dir_count, size; + int err, length; + unsigned int dir_count, size; TRACE("Entered squashfs_lookup [%llx:%x]\n", block, offset); -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
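As a standalone illustration of the problem this patch describes, the sketch below shows how an upper-bound-only check lets a negative value through when the variable is signed, and how the same comparison rejects it once the variable is unsigned. The helper names and `TOY_NAME_LEN` are hypothetical; the limit is assumed to play the role of SQUASHFS_NAME_LEN and this is not the squashfs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NAME_LEN 256	/* assumed stand-in for SQUASHFS_NAME_LEN */

/* Signed variant: only "too large" is checked, so negatives are accepted. */
static bool toy_size_ok_signed(int32_t size)
{
	return !(size > TOY_NAME_LEN);
}

/* Unsigned variant: a corrupted "negative" value becomes a huge number. */
static bool toy_size_ok_unsigned(uint32_t size)
{
	return !(size > TOY_NAME_LEN);
}
```

With a corrupted value of -16, `toy_size_ok_signed(-16)` accepts it, while `toy_size_ok_unsigned((uint32_t)-16)` sees 4294967280 and rejects it, which is the effect of changing the declarations to unsigned int.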
[PATCH 4/5] Squashfs: add corruption check in get_dir_index_using_offset()
We read the size (of the name) field from disk. This value should be sanity checked for correctness to avoid blindly reading huge amounts of unnecessary data from disk on corruption. Note, here we're not actually reading the name into a buffer, but skipping it, and so corruption doesn't cause buffer overflow, merely lots of unnecessary amounts of data to be read. Signed-off-by: Phillip Lougher --- fs/squashfs/dir.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/fs/squashfs/dir.c b/fs/squashfs/dir.c index 1192084..bd7155b 100644 --- a/fs/squashfs/dir.c +++ b/fs/squashfs/dir.c @@ -54,6 +54,7 @@ static int get_dir_index_using_offset(struct super_block *sb, { struct squashfs_sb_info *msblk = sb->s_fs_info; int err, i, index, length = 0; + unsigned int size; struct squashfs_dir_index dir_index; TRACE("Entered get_dir_index_using_offset, i_count %d, f_pos %lld\n", @@ -81,8 +82,14 @@ static int get_dir_index_using_offset(struct super_block *sb, */ break; + size = le32_to_cpu(dir_index.size) + 1; + + /* size should never be larger than SQUASHFS_NAME_LEN */ + if (size > SQUASHFS_NAME_LEN) + break; + err = squashfs_read_metadata(sb, NULL, &index_start, - &index_offset, le32_to_cpu(dir_index.size) + 1); + &index_offset, size); if (err < 0) break; -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/5] Squashfs: fix corruption checks in squashfs_readdir()
The dir_count and size fields when read from disk are sanity checked for correctness. However, the sanity checks only check the values are not greater than expected. As dir_count and size were incorrectly defined as signed ints, this can lead to corrupted values appearing as negative which are not trapped. Signed-off-by: Phillip Lougher --- fs/squashfs/dir.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/fs/squashfs/dir.c b/fs/squashfs/dir.c index f7f527b..1192084 100644 --- a/fs/squashfs/dir.c +++ b/fs/squashfs/dir.c @@ -105,9 +105,8 @@ static int squashfs_readdir(struct file *file, struct dir_context *ctx) struct inode *inode = file_inode(file); struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info; u64 block = squashfs_i(inode)->start + msblk->directory_table; - int offset = squashfs_i(inode)->offset, length, dir_count, size, - type, err; - unsigned int inode_number; + int offset = squashfs_i(inode)->offset, length, type, err; + unsigned int inode_number, dir_count, size; struct squashfs_dir_header dirh; struct squashfs_dir_entry *dire; -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 5/5] Squashfs: add corruption check for type in squashfs_readdir()
We read the type field from disk. This value should be sanity
checked for correctness to avoid an out of bounds access when
reading the squashfs_filetype_table array.

Signed-off-by: Phillip Lougher
---
 fs/squashfs/dir.c         | 7 +--
 fs/squashfs/squashfs_fs.h | 5 -
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/squashfs/dir.c b/fs/squashfs/dir.c
index bd7155b..d8c2d74 100644
--- a/fs/squashfs/dir.c
+++ b/fs/squashfs/dir.c
@@ -112,8 +112,8 @@ static int squashfs_readdir(struct file *file, struct dir_context *ctx)
 	struct inode *inode = file_inode(file);
 	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
 	u64 block = squashfs_i(inode)->start + msblk->directory_table;
-	int offset = squashfs_i(inode)->offset, length, type, err;
-	unsigned int inode_number, dir_count, size;
+	int offset = squashfs_i(inode)->offset, length, err;
+	unsigned int inode_number, dir_count, size, type;
 	struct squashfs_dir_header dirh;
 	struct squashfs_dir_entry *dire;
 
@@ -206,6 +206,9 @@ static int squashfs_readdir(struct file *file, struct dir_context *ctx)
 				((short) le16_to_cpu(dire->inode_number));
 			type = le16_to_cpu(dire->type);
 
+			if (type > SQUASHFS_MAX_DIR_TYPE)
+				goto failed_read;
+
 			if (!dir_emit(ctx, dire->name, size,
 					inode_number,
 					squashfs_filetype_table[type]))
diff --git a/fs/squashfs/squashfs_fs.h b/fs/squashfs/squashfs_fs.h
index 9e2349d..4b2beda 100644
--- a/fs/squashfs/squashfs_fs.h
+++ b/fs/squashfs/squashfs_fs.h
@@ -87,7 +87,7 @@
 #define SQUASHFS_COMP_OPTS(flags)	SQUASHFS_BIT(flags, \
 						SQUASHFS_COMP_OPT)
 
-/* Max number of types and file types */
+/* Inode types including extended types */
 #define SQUASHFS_DIR_TYPE		1
 #define SQUASHFS_REG_TYPE		2
 #define SQUASHFS_SYMLINK_TYPE		3
@@ -103,6 +103,9 @@
 #define SQUASHFS_LFIFO_TYPE		13
 #define SQUASHFS_LSOCKET_TYPE		14
 
+/* Max type value stored in directory entry */
+#define SQUASHFS_MAX_DIR_TYPE		7
+
 /* Xattr types */
 #define SQUASHFS_XATTR_USER		0
 #define SQUASHFS_XATTR_TRUSTED		1
-- 
1.8.3.2
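The check added by this patch follows the usual pattern of validating an on-disk index before using it to subscript a table. A minimal userspace sketch of that pattern is shown here; `toy_filetype_table`, `toy_lookup_type` and the character codes in the table are hypothetical, and only the limit of 7 is taken from the SQUASHFS_MAX_DIR_TYPE definition in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DIR_TYPE 7	/* mirrors SQUASHFS_MAX_DIR_TYPE from the patch */

/* Hypothetical table with one entry per valid type value, 0..7. */
static const char toy_filetype_table[TOY_MAX_DIR_TYPE + 1] = {
	'?', 'd', 'f', 'l', 'b', 'c', 'p', 's'
};

/* Returns 0 and stores the table entry, or -1 if type is out of range. */
static int toy_lookup_type(unsigned int type, char *out)
{
	assert(out != NULL);

	/* type is unsigned, so one comparison covers every bad value. */
	if (type > TOY_MAX_DIR_TYPE)
		return -1;

	*out = toy_filetype_table[type];
	return 0;
}
```

Because the value is unsigned, a corrupted 16-bit field such as 0xFFFF is rejected by the same comparison that rejects 8, and the table is never read past its last element.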
[PATCH 1/5] Squashfs: fix corruption check in get_dir_index_using_name()
Patch "Squashfs: sanity check information from disk" from Dan Carpenter adds a missing check for corruption in the "size" field while reading the directory index from disk. It, however, sets err to -EINVAL, this value is not used later, and so setting it is completely redundant. So remove it. Errors in reading the index are deliberately non-fatal. If we get an error in reading the index we just return the part of the index we have managed to read - the index isn't essential, just quicker. Signed-off-by: Phillip Lougher --- fs/squashfs/namei.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/squashfs/namei.c b/fs/squashfs/namei.c index f866d42..342a5aa 100644 --- a/fs/squashfs/namei.c +++ b/fs/squashfs/namei.c @@ -104,10 +104,8 @@ static int get_dir_index_using_name(struct super_block *sb, size = le32_to_cpu(index->size) + 1; - if (size > SQUASHFS_NAME_LEN) { - err = -EINVAL; + if (size > SQUASHFS_NAME_LEN) break; - } err = squashfs_read_metadata(sb, index->name, &index_start, &index_offset, size); -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: ftrace 'failed to modify' bug when loading reiserfs.ko
On Thu, Sep 05, 2013 at 09:51:54PM -0400, Steven Rostedt wrote:
 > On Thu, 5 Sep 2013 21:48:59 -0400
 > Dave Jones wrote:
 >
 > > On Thu, Sep 05, 2013 at 09:44:55PM -0400, Steven Rostedt wrote:
 > > > On Thu, 5 Sep 2013 21:34:55 -0400
 > > > Dave Jones wrote:
 > > >
 > > > > On Thu, Sep 05, 2013 at 09:28:34PM -0400, Steven Rostedt wrote:
 > > > > Did you change a config option, or update your gcc?
 > >
 > > Yeah, changed CONFIG_DEBUG_KOBJECT, which rebuilt the world.
 >
 > Still doesn't explain why it gave you that splat there.
 >
 > Do you still have that binary module, and can you show me what's at
 > reiserfs_init_bitmap_cache+0x0 with objdump?

I didn't, but it turns out I can recreate this. A little convoluted but..

disable DEBUG_KOBJECT_RELEASE
build, install and boot into kernel
enable DEBUG_KOBJECT_RELEASE
build kernel
install -> boom

28b0 :
	return bh;
}

int reiserfs_init_bitmap_cache(struct super_block *sb)
{
    28b0:	e8 00 00 00 00          callq  28b5
    28b5:	55                      push   %rbp
/* Don't trust REISERFS_SB(sb)->s_bmap_nr, it's a u16
 * which overflows on large file systems. */
static inline __u32 reiserfs_bmap_count(struct super_block *sb)
{
	return (SB_BLOCK_COUNT(sb) - 1) / (sb->s_blocksize * 8) + 1;
    28b6:	31 d2                   xor    %edx,%edx
    28b8:	48 89 e5                mov    %rsp,%rbp
    28bb:	41 54                   push   %r12
    28bd:	53                      push   %rbx
    28be:	48 89 fb                mov    %rdi,%rbx
    28c1:	48 8b 87 50 07 00 00    mov    0x750(%rdi),%rax
    28c8:	48 8b 77 18             mov    0x18(%rdi),%rsi
    28cc:	48 8b 40 08             mov    0x8(%rax),%rax
    28d0:	48 8d 0c f5 00 00 00    lea    0x0(,%rsi,8),%rcx
    28d7:	00
    28d8:	8b 00                   mov    (%rax),%eax
    28da:	83 e8 01                sub    $0x1,%eax
    28dd:	48 f7 f1                div    %rcx
[PATCH v2] mfd: rtsx: Modify rts5249_optimize_phy
From: Wei WANG In some platforms, specially Thinkpad series, rts5249 won't be initialized properly. So we need adjust some phy parameters to improve the compatibility issue. Signed-off-by: Wei WANG --- drivers/mfd/rts5249.c| 35 -- include/linux/mfd/rtsx_pci.h | 43 ++ 2 files changed, 76 insertions(+), 2 deletions(-) diff --git a/drivers/mfd/rts5249.c b/drivers/mfd/rts5249.c index 3b835f5..7653638 100644 --- a/drivers/mfd/rts5249.c +++ b/drivers/mfd/rts5249.c @@ -130,13 +130,44 @@ static int rts5249_optimize_phy(struct rtsx_pcr *pcr) { int err; - err = rtsx_pci_write_phy_register(pcr, PHY_REG_REV, 0xFE46); + err = rtsx_pci_write_phy_register(pcr, PHY_REG_REV, REG_REV_RESV | + RXIDLE_LATCHED | P1_EN | RXIDLE_EN | RX_PWST | + CLKREQ_DLY_TIMER_1_0 | STOP_CLKRD | STOP_CLKWR); if (err < 0) return err; msleep(1); - return rtsx_pci_write_phy_register(pcr, PHY_BPCR, 0x05C0); + err = rtsx_pci_write_phy_register(pcr, PHY_BPCR, IBRXSEL | IBTXSEL | + IB_FILTER | CMIRROR_EN); + if (err < 0) + return err; + err = rtsx_pci_write_phy_register(pcr, PHY_PCR, FORCE_CODE | + OOBS_CALI_50 | OOBS_VCM_08 | OOBS_SEN_90 | RSSI_EN); + if (err < 0) + return err; + err = rtsx_pci_write_phy_register(pcr, PHY_RCR2, EMPHASE_EN | NADJR | + CDR_CP_10 | CDR_SR_2 | FREQSEL_12 | CPADJEN | + CDR_SC_8 | CALIB_LATE); + if (err < 0) + return err; + err = rtsx_pci_write_phy_register(pcr, PHY_FLD4, FLDEN_SEL | REQ_REF | + RXAMP_OFF | REQ_ADDA | BER_COUNT | + BER_TIMER | BER_CHK_EN); + if (err < 0) + return err; + err = rtsx_pci_write_phy_register(pcr, PHY_RDR, RXDSEL_1_9); + if (err < 0) + return err; + err = rtsx_pci_write_phy_register(pcr, PHY_RCR1, ADP_TIME | VCO_COARSE); + if (err < 0) + return err; + err = rtsx_pci_write_phy_register(pcr, PHY_FLD3, TIMER_4 | TIMER_6 | + RXDELINK); + if (err < 0) + return err; + return rtsx_pci_write_phy_register(pcr, PHY_TUNE, TUNEREF_1_0 | + VBGSEL_1252 | SDBUS_33 | TUNED18 | TUNED12); } static int rts5249_turn_on_led(struct rtsx_pcr *pcr) diff --git 
a/include/linux/mfd/rtsx_pci.h b/include/linux/mfd/rtsx_pci.h index d1382df..de20538 100644 --- a/include/linux/mfd/rtsx_pci.h +++ b/include/linux/mfd/rtsx_pci.h @@ -719,16 +719,41 @@ /* Phy register */ #define PHY_PCR0x00 +#define FORCE_CODE0xB000 +#define OOBS_CALI_50 0x0800 +#define OOBS_VCM_08 0x0200 +#define OOBS_SEN_90 0x0040 +#define RSSI_EN 0x0002 #define PHY_RCR0 0x01 #define PHY_RCR1 0x02 +#define ADP_TIME 0x0100 +#define VCO_COARSE0x001F #define PHY_RCR2 0x03 +#define EMPHASE_EN0x8000 +#define NADJR 0x4000 +#define CDR_CP_10 0x0400 +#define CDR_SR_2 0x0100 +#define FREQSEL_120x0040 +#define CPADJEN 0x0020 +#define CDR_SC_8 0x0008 +#define CALIB_LATE0x0002 #define PHY_RTCR 0x04 #define PHY_RDR0x05 +#define RXDSEL_1_90x4000 #define PHY_TCR0 0x06 #define PHY_TCR1 0x07 #define PHY_TUNE 0x08 +#define TUNEREF_1_0 0x4000 +#define VBGSEL_1252 0x0C00 +#define SDBUS_33 0x0200 +#define TUNED18 0x01C0 +#define TUNED12 0X0020 #define PHY_IMR0x09 #define PHY_BPCR 0x0A +#define IBRXSEL 0x0400 +#define IBTXSEL 0x0100 +#define IB_FILTER 0x0080 +#define CMIRROR_EN0x0040 #define PHY_BIST 0x0B #define PHY_RAW_L 0x0C #define PHY_RAW_H 0x0D @@ -744,11 +769,29 @@ #define PHY_BRNR2 0x17 #define PHY_BENR 0x18 #define PHY_REG_REV0x19 +#define REG_REV_RESV 0xE000 +#define RXIDLE_LATCHED0x1000 +#define P1_EN 0x0800 +#define RXIDLE_EN 0x0400 +#define CLKREQ_DLY_TIMER_1_0 0x0040 +#define STO
Re: [PATCH] perf mem: add priv level filtering support
> But my worry here is about consistency accross tools for the single > letter options, so perhaps if you could use: > > -U collect only user level samples > -K collect only kernel level samples Support for this would be nice for perf stat too, to use with the implicit events (using by -d, soon -T etc.) -Andi -- a...@linux.intel.com -- Speaking for myself only -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] mfd: rtsx: Modify rts5249_optimize_phy
From: Wei WANG

v2:
Name those newly added register values

Wei WANG (1):
  mfd: rtsx: Modify rts5249_optimize_phy

 drivers/mfd/rts5249.c        | 35 --
 include/linux/mfd/rtsx_pci.h | 43 ++
 2 files changed, 76 insertions(+), 2 deletions(-)

-- 
1.7.9.5
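One way to sanity check the naming done in v2 is to confirm that the named bits reproduce the magic number they replace. The sketch below does this for PHY_BPCR only, using the four bit values listed in the rtsx_pci.h hunk of the patch and the old literal 0x05C0 from the removed line. The helper `bpcr_value()` is hypothetical. The PHY_REG_REV value is not checked because its definitions are cut off in this archive copy.

```c
#include <assert.h>
#include <stdint.h>

/* Bit names and values copied from the PHY_BPCR definitions in the patch. */
#define IBRXSEL		0x0400
#define IBTXSEL		0x0100
#define IB_FILTER	0x0080
#define CMIRROR_EN	0x0040

/* Compile-time check (C11) that the named bits equal the old literal. */
static_assert((IBRXSEL | IBTXSEL | IB_FILTER | CMIRROR_EN) == 0x05C0,
	      "named PHY_BPCR bits must match the previous magic value");

/* Hypothetical helper returning the value written to PHY_BPCR. */
static uint16_t bpcr_value(void)
{
	return (uint16_t)(IBRXSEL | IBTXSEL | IB_FILTER | CMIRROR_EN);
}
```

A check of this kind makes it easy to see that replacing a literal with named flags did not change the value actually written to the register.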
[PATCH 1/4] perf, x86: Avoid checkpointed counters causing excessive TSX aborts v5
From: Andi Kleen With checkpointed counters there can be a situation where the counter is overflowing, aborts the transaction, is set back to a non overflowing checkpoint, causes interupt. The interrupt doesn't see the overflow because it has been checkpointed. This is then a spurious PMI, typically with a ugly NMI message. It can also lead to excessive aborts. Avoid this problem by: - Using the full counter width for counting counters (earlier patch) - Forbid sampling for checkpointed counters. It's not too useful anyways, checkpointing is mainly for counting. The check is approximate (to still handle KVM), but should catch the majority of cases. - On a PMI always set back checkpointed counters to zero. v2: Add unlikely. Add comment v3: Allow large sampling periods with CP for KVM v4: Use event_is_checkpointed. Use EOPNOTSUPP. (Stephane Eranian) v5: Remove comment. Signed-off-by: Andi Kleen --- arch/x86/kernel/cpu/perf_event_intel.c | 37 ++ 1 file changed, 37 insertions(+) diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c index a45d8d4..91e3f8c 100644 --- a/arch/x86/kernel/cpu/perf_event_intel.c +++ b/arch/x86/kernel/cpu/perf_event_intel.c @@ -1134,6 +1134,11 @@ static void intel_pmu_enable_event(struct perf_event *event) __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE); } +static inline bool event_is_checkpointed(struct perf_event *event) +{ + return (event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0; +} + /* * Save and restart an expired event. Called by NMI contexts, * so it has to be careful about preempting normal event ops: @@ -1141,6 +1146,17 @@ static void intel_pmu_enable_event(struct perf_event *event) int intel_pmu_save_and_restart(struct perf_event *event) { x86_perf_event_update(event); + /* +* For a checkpointed counter always reset back to 0. 
This +* avoids a situation where the counter overflows, aborts the +* transaction and is then set back to shortly before the +* overflow, and overflows and aborts again. +*/ + if (unlikely(event_is_checkpointed(event))) { + /* No race with NMIs because the counter should not be armed */ + wrmsrl(event->hw.event_base, 0); + local64_set(&event->hw.prev_count, 0); + } return x86_perf_event_set_period(event); } @@ -1224,6 +1240,13 @@ again: x86_pmu.drain_pebs(regs); } + /* +* To avoid spurious interrupts with perf stat always reset checkpointed +* counters. +*/ + if (cpuc->events[2] && event_is_checkpointed(cpuc->events[2])) + status |= (1ULL << 2); + for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) { struct perf_event *event = cpuc->events[bit]; @@ -1689,6 +1712,20 @@ static int hsw_hw_config(struct perf_event *event) event->attr.precise_ip > 0)) return -EOPNOTSUPP; + if (event_is_checkpointed(event)) { + /* +* Sampling of checkpointed events can cause situations where +* the CPU constantly aborts because of a overflow, which is +* then checkpointed back and ignored. Forbid checkpointing +* for sampling. +* +* But still allow a long sampling period, so that perf stat +* from KVM works. +*/ + if (event->attr.sample_period > 0 && + event->attr.sample_period < 0x7fff) + return -EOPNOTSUPP; + } return 0; } -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
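The policy added to hsw_hw_config() can be restated as a small pure function, which makes the accepted and rejected cases easy to enumerate. This is a userspace sketch only: `toy_hsw_hw_config` and `TOY_CP_PERIOD_LIMIT` are hypothetical names, and the limit is copied as it appears in this copy of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Threshold as it appears in this copy of the patch. */
#define TOY_CP_PERIOD_LIMIT 0x7fffULL

/*
 * Returns 0 if the configuration is allowed, -EOPNOTSUPP otherwise.
 * A period of 0 means plain counting, which is always allowed.
 */
static int toy_hsw_hw_config(bool checkpointed, uint64_t sample_period)
{
	if (checkpointed &&
	    sample_period > 0 && sample_period < TOY_CP_PERIOD_LIMIT)
		return -EOPNOTSUPP;

	return 0;
}
```

Counting (period 0) and very long periods pass, which matches the stated intent of keeping perf stat usable from KVM, while ordinary sampling periods on a checkpointed event are refused.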
[PATCH 2/4] perf, x86: Report TSX transaction abort cost as weight v3
From: Andi Kleen Use the existing weight reporting facility to report the transaction abort cost, that is the number of cycles wasted in aborts. Haswell reports this in the PEBS record. This was in fact the original user for weight. This is a very useful sort key to concentrate on the most costly aborts and a good metric for TSX tuning. v2: Add Peter's changes with minor modifications. More comments. v3: Adjust white space. Signed-off-by: Andi Kleen --- arch/x86/kernel/cpu/perf_event_intel_ds.c | 55 +++ 1 file changed, 42 insertions(+), 13 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c index 3065c57..d4ed99f 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c @@ -182,16 +182,29 @@ struct pebs_record_nhm { * Same as pebs_record_nhm, with two additional fields. */ struct pebs_record_hsw { - struct pebs_record_nhm nhm; - /* -* Real IP of the event. In the Intel documentation this -* is called eventingrip. -*/ - u64 real_ip; - /* -* TSX tuning information field: abort cycles and abort flags. 
-*/ - u64 tsx_tuning; + u64 flags, ip; + u64 ax, bx, cx, dx; + u64 si, di, bp, sp; + u64 r8, r9, r10, r11; + u64 r12, r13, r14, r15; + u64 status, dla, dse, lat; + u64 real_ip; /* the actual eventing ip */ + u64 tsx_tuning; /* TSX abort cycles and flags */ +}; + +union hsw_tsx_tuning { + struct { + u32 cycles_last_block : 32, + hle_abort : 1, + rtm_abort : 1, + instruction_abort : 1, + non_instruction_abort : 1, + retry : 1, + data_conflict : 1, + capacity_writes : 1, + capacity_reads: 1; + }; + u64 value; }; void init_debug_store_on_cpu(int cpu) @@ -759,16 +772,26 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs) return 0; } +static inline u64 intel_hsw_weight(struct pebs_record_hsw *pebs) +{ + if (pebs->tsx_tuning) { + union hsw_tsx_tuning tsx = { .value = pebs->tsx_tuning }; + return tsx.cycles_last_block; + } + return 0; +} + static void __intel_pmu_pebs_event(struct perf_event *event, struct pt_regs *iregs, void *__pebs) { /* * We cast to pebs_record_nhm to get the load latency data * if extra_reg MSR_PEBS_LD_LAT_THRESHOLD used +* We cast to the biggest PEBS record are careful not +* to access out-of-bounds members. */ struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); - struct pebs_record_nhm *pebs = __pebs; - struct pebs_record_hsw *pebs_hsw = __pebs; + struct pebs_record_hsw *pebs = __pebs; struct perf_sample_data data; struct pt_regs regs; u64 sample_type; @@ -827,7 +850,7 @@ static void __intel_pmu_pebs_event(struct perf_event *event, regs.sp = pebs->sp; if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format >= 2) { - regs.ip = pebs_hsw->real_ip; + regs.ip = pebs->real_ip; regs.flags |= PERF_EFLAGS_EXACT; } else if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(®s)) regs.flags |= PERF_EFLAGS_EXACT; @@ -838,6 +861,12 @@ static void __intel_pmu_pebs_event(struct perf_event *event, x86_pmu.intel_cap.pebs_format >= 1) data.addr = pebs->dla; + /* Only set the TSX weight when no memory weight was requested. 
*/ + if ((event->attr.sample_type & PERF_SAMPLE_WEIGHT) && + !fll && + (x86_pmu.intel_cap.pebs_format >= 2)) + data.weight = intel_hsw_weight(pebs); + if (has_branch_stack(event)) data.br_stack = &cpuc->lbr_stack; -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
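The weight extraction in intel_hsw_weight() reads the cycles_last_block field of the union, which is declared first and is 32 bits wide. A sketch of the same extraction with explicit masks and shifts is given below. It assumes the bitfields are allocated from the least significant bit upwards, as GCC does on x86-64, so the cycle count sits in bits 0 to 31 and rtm_abort in bit 33; bitfield layout is implementation-defined in C, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits: abort cycles (cycles_last_block in the patch's union). */
static uint64_t toy_hsw_weight(uint64_t tsx_tuning)
{
	return tsx_tuning & 0xffffffffULL;
}

/*
 * Bit 33: rtm_abort, assuming the field order of the union in the patch
 * (hle_abort at bit 32, rtm_abort at bit 33) and LSB-first allocation.
 */
static int toy_is_rtm_abort(uint64_t tsx_tuning)
{
	return (int)((tsx_tuning >> 33) & 1);
}
```

A tsx_tuning value of zero yields a weight of zero here as well, so the mask form gives the same result as the explicit non-zero test in the patch.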
[PATCH 3/4] perf, x86: Add Haswell TSX event aliases v6
From: Andi Kleen

Add TSX event aliases, and export them from the kernel to perf.

These are used by perf stat -T and to allow more user friendly access
to events. The events are designed to be fairly generic and may also
apply to other architectures implementing HTM. They all cover common
situations that happen during tuning of transactional code.

For Haswell we have to separate the HLE and RTM events,
as they are separate in the PMU.

This adds the following events.

tx-start	Count start transaction (used by perf stat -T)
tx-commit	Count commit of transaction
tx-abort	Count all aborts
tx-conflict	Count aborts due to conflict with another CPU.
tx-capacity	Count capacity aborts (transaction too large)

Then matching el-* events for HLE

cycles-t	Transactional cycles (used by perf stat -T)
* also exists on POWER8
cycles-ct	Transactional cycles committed (used by perf stat -T)
* according to Michael Ellerman POWER8 has a cycles-transactional-committed,
* perf stat -T handles both cases

Note for useful abort profiling often precise has to be set,
as Haswell can only report the point inside the transaction
with precise=2.

(I had another patchkit to allow exporting precise too, but Vince
Weaver pointed out it violates the ABI, so dropped now)

For some classes of aborts, like conflicts, this is not needed,
as it makes more sense to look at the complete critical section.

This gives a clean set of generalized events to examine transaction
success and aborts. Haswell has additional events for TSX, but those
are more specialized for very specific situations.

v2: Move to new sysfs infrastructure
v3: Use own sysfs functions now
v4: Add tx/el-abort-return for better conflict sampling
v5: Different white space.
v6: Cut down events, rewrite description.
Signed-off-by: Andi Kleen --- arch/x86/kernel/cpu/perf_event_intel.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c index 91e3f8c..da58663 100644 --- a/arch/x86/kernel/cpu/perf_event_intel.c +++ b/arch/x86/kernel/cpu/perf_event_intel.c @@ -2074,7 +2074,34 @@ static __init void intel_nehalem_quirk(void) EVENT_ATTR_STR(mem-loads, mem_ld_hsw, "event=0xcd,umask=0x1,ldlat=3"); EVENT_ATTR_STR(mem-stores, mem_st_hsw, "event=0xd0,umask=0x82") +/* Haswell special events */ +EVENT_ATTR_STR(tx-start,tx_start, "event=0xc9,umask=0x1"); +EVENT_ATTR_STR(tx-commit, tx_commit, "event=0xc9,umask=0x2"); +EVENT_ATTR_STR(tx-abort,tx_abort, "event=0xc9,umask=0x4"); +EVENT_ATTR_STR(tx-capacity, tx_capacity, "event=0x54,umask=0x2"); +EVENT_ATTR_STR(tx-conflict, tx_conflict, "event=0x54,umask=0x1"); +EVENT_ATTR_STR(el-start,el_start, "event=0xc8,umask=0x1"); +EVENT_ATTR_STR(el-commit, el_commit, "event=0xc8,umask=0x2"); +EVENT_ATTR_STR(el-abort,el_abort, "event=0xc8,umask=0x4"); +EVENT_ATTR_STR(el-capacity, el_capacity,"event=0x54,umask=0x2"); +EVENT_ATTR_STR(el-conflict, el_conflict,"event=0x54,umask=0x1"); +EVENT_ATTR_STR(cycles-t,cycles_t, "event=0x3c,in_tx=1"); +EVENT_ATTR_STR(cycles-ct, cycles_ct, + "event=0x3c,in_tx=1,in_tx_cp=1"); + static struct attribute *hsw_events_attrs[] = { + EVENT_PTR(tx_start), + EVENT_PTR(tx_commit), + EVENT_PTR(tx_abort), + EVENT_PTR(tx_capacity), + EVENT_PTR(tx_conflict), + EVENT_PTR(el_start), + EVENT_PTR(el_commit), + EVENT_PTR(el_abort), + EVENT_PTR(el_capacity), + EVENT_PTR(el_conflict), + EVENT_PTR(cycles_t), + EVENT_PTR(cycles_ct), EVENT_PTR(mem_ld_hsw), EVENT_PTR(mem_st_hsw), NULL -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
perf, x86: Add parts of the remaining haswell PMU functionality v5
I hope this version is ok for everyone now. [v2: Added Peter's changes to the PEBS handler] [v3: Addressed Arnaldo's feedback for the perf stat -T change and avoid conflict] [v4: Remove XXX comment in checkpoint patch. Add Arnaldo's ack for tools patch] [v5: Some white space adjustments] Add some more TSX functionality to the basic Haswell PMU. A lot of the infrastructure needed for these patches has been merged earlier, so it is all quite straightforward now. - Add the checkpointed counter workaround. (Parts of this have been already merged earlier) - Add support for reporting PEBS transaction abort cost as weight. This is useful to judge the cost of aborts and concentrate on expensive ones first. (Large parts of this have been already merged earlier, this is just adding the final few lines to the PEBS handler) - Add TSX event aliases, needed for perf stat -T and general usability. (Infrastructure also already in) - Add perf stat -T support to give a user friendly high-level counting frontend for transactions. This version should also be usable for POWER8 eventually. Not included: Support for transaction flags and TSX LBR flags. -Andi
[PATCH 4/4] perf, tools: Add perf stat --transaction v5
From: Andi Kleen Add support to perf stat to print the basic transactional execution statistics: Total cycles, Cycles in Transaction, Cycles in aborted transactions using the in_tx and in_tx_checkpoint qualifiers. Transaction Starts and Elision Starts, to compute the average transaction length. This is a reasonable overview of the success of the transactions. Also support architectures that have a transaction aborted cycles counter like POWER8. Since that is awkward to handle in the kernel abstractly, handle both cases here. Enable with a new --transaction / -T option. This requires measuring these events in a group, since they depend on each other. This is implemented by using TM sysfs events exported by the kernel. v2: Only print the extended statistics when the option is enabled. This avoids negative output when the user specifies the -T events in separate groups. v3: Port to latest tree v4: Remove merge error. Avoid linear walks for comparisons. Check transaction_run earlier. Minor fixes. v5: Move option to avoid conflict. Improve description. Acked-by: Arnaldo Carvalho de Melo Signed-off-by: Andi Kleen --- tools/perf/Documentation/perf-stat.txt | 5 ++ tools/perf/builtin-stat.c | 144 - tools/perf/util/evsel.h| 6 ++ tools/perf/util/pmu.c | 16 tools/perf/util/pmu.h | 1 + 5 files changed, 171 insertions(+), 1 deletion(-) diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index 2fe87fb..40bc65a 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -132,6 +132,11 @@ is a useful mode to detect imbalance between physical cores. To enable this mod use --per-core in addition to -a. (system-wide). The output includes the core number and the number of online logical processors on that physical processor. +-T:: +--transaction:: + +Print statistics of transactional execution if supported. 
+ EXAMPLES diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 352fbd7..6bd90e4 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -46,6 +46,7 @@ #include "util/util.h" #include "util/parse-options.h" #include "util/parse-events.h" +#include "util/pmu.h" #include "util/event.h" #include "util/evlist.h" #include "util/evsel.h" @@ -70,6 +71,41 @@ static void print_counter_aggr(struct perf_evsel *counter, char *prefix); static void print_counter(struct perf_evsel *counter, char *prefix); static void print_aggr(char *prefix); +/* Default events used for perf stat -T */ +static const char * const transaction_attrs[] = { + "task-clock", + "{" + "instructions," + "cycles," + "cpu/cycles-t/," + "cpu/tx-start/," + "cpu/el-start/," + "cpu/cycles-ct/" + "}" +}; + +/* More limited version when the CPU does not have all events. */ +static const char * const transaction_limited_attrs[] = { + "task-clock", + "{" + "instructions," + "cycles," + "cpu/cycles-t/," + "cpu/tx-start/" + "}" +}; + +/* must match transaction_attrs and the beginning limited_attrs */ +enum { + T_TASK_CLOCK, + T_INSTRUCTIONS, + T_CYCLES, + T_CYCLES_IN_TX, + T_TRANSACTION_START, + T_ELISION_START, + T_CYCLES_IN_TX_CP, +}; + static struct perf_evlist *evsel_list; static struct perf_target target = { @@ -90,6 +126,7 @@ static enum aggr_modeaggr_mode = AGGR_GLOBAL; static volatile pid_t child_pid = -1; static boolnull_run= false; static int detailed_run= 0; +static booltransaction_run; static boolbig_num = true; static int big_num_opt = -1; static const char *csv_sep= NULL; @@ -213,7 +250,10 @@ static struct stats runtime_l1_icache_stats[MAX_NR_CPUS]; static struct stats runtime_ll_cache_stats[MAX_NR_CPUS]; static struct stats runtime_itlb_cache_stats[MAX_NR_CPUS]; static struct stats runtime_dtlb_cache_stats[MAX_NR_CPUS]; +static struct stats runtime_cycles_in_tx_stats[MAX_NR_CPUS]; static struct stats walltime_nsecs_stats; +static struct stats 
runtime_transaction_stats[MAX_NR_CPUS]; +static struct stats runtime_elision_stats[MAX_NR_CPUS]; static void perf_stat__reset_stats(struct perf_evlist *evlist) { @@ -235,6 +275,11 @@ static void perf_stat__reset_stats(struct perf_evlist *evlist) memset(runtime_ll_cache_stats, 0, sizeof(runtime_ll_cache_stats)); memset(runtime_itlb_cache_stats, 0, sizeof(runtime_itlb_cache_stats)); memset(runtime_dtlb_cache_stats, 0, sizeof(runtime_dtlb_cache_stats)); + memset(runtime_
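The diff above is cut off before the code that prints the statistics, so the following is only a sketch of how the counters named in the commit message could be combined. It assumes that cycles-t counts cycles spent inside transactions and cycles-ct counts only the cycles of transactions that committed, so their difference approximates aborted cycles. The struct and function names are hypothetical and are not taken from builtin-stat.c.

```c
/* Hypothetical container for one group reading of the -T events. */
struct tx_counts {
	double cycles;          /* "cycles" */
	double cycles_in_tx;    /* "cpu/cycles-t/" */
	double cycles_in_tx_cp; /* "cpu/cycles-ct/" */
	double tx_start;        /* "cpu/tx-start/" */
	double el_start;        /* "cpu/el-start/" */
};

/* Share of all cycles that ran inside a transaction, in percent. */
static double pct_transactional(const struct tx_counts *c)
{
	if (c->cycles == 0.0)
		return 0.0;
	return 100.0 * c->cycles_in_tx / c->cycles;
}

/* Share of all cycles spent in transactions that later aborted. */
static double pct_aborted(const struct tx_counts *c)
{
	double aborted = c->cycles_in_tx - c->cycles_in_tx_cp;

	if (c->cycles == 0.0)
		return 0.0;
	if (aborted < 0.0)	/* guard against skew between counters */
		aborted = 0.0;
	return 100.0 * aborted / c->cycles;
}

/* Average transactional cycles per transaction (or elision) start. */
static double avg_cycles_per_start(double cycles_in_tx, double starts)
{
	if (starts == 0.0)
		return 0.0;
	return cycles_in_tx / starts;
}
```

These ratios only make sense when all events were counted over the same interval, which is presumably why the commit message insists on measuring them in one group.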
Re: soft lockup in sysvipc code.
On Thu, Sep 5, 2013 at 5:50 AM, Dave Jones wrote: > Haven't seen this before. > Tree based on v3.11-3104-gf357a82 > > BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:25479] Can't imagine how it could happen. In my understanding, "soft lockup" happens when code stuck at somewhere with preemption disabled. Look at the code, preemption disabled at: sysvipc_proc_next -> sysvipc_find_ipc -> ipc_lock_by_ptr enabled at: sysvipc_proc_next -> ipc_unlock or sysvipc_proc_stop -> ipc_unlock And I didn't find code may stuck in the path. I may miss something .. Regards, Lin Ming > Modules linked in: sctp snd_seq_dummy fuse dlci rfcomm tun bnep hidp ipt_ULOG > nfnetlink can_raw can_bcm scsi_transport_iscsi nfc caif_socket caif af_802154 > phonet af_rxrpc bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda > crc_ccitt rds af_key rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc > ax25 xfs snd_hda_codec_realtek libcrc32c snd_hda_intel snd_hda_codec > snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd > soundcore pcspkr usb_debug e1000e ptp pps_core > irq event stamp: 1143030 > hardirqs last enabled at (1143029): [] > restore_args+0x0/0x30 > hardirqs last disabled at (1143030): [] > apic_timer_interrupt+0x6a/0x80 > softirqs last enabled at (1143028): [] > __do_softirq+0x198/0x460 > softirqs last disabled at (1143023): [] irq_exit+0x135/0x150 > CPU: 0 PID: 25479 Comm: trinity-child0 Not tainted 3.11.0+ #44 > task: 88022c013f90 ti: 88022bd8c000 task.ti: 88022bd8c000 > RIP: 0010:[] [] > idr_find_slowpath+0x9b/0x150 > RSP: 0018:88022bd8dc88 EFLAGS: 0206 > RAX: 0006 RBX: 000a6c0a RCX: 0008 > RDX: 0008 RSI: 81c41040 RDI: 88022c014668 > RBP: 88022bd8dca0 R08: R09: > R10: 0001 R11: 0001 R12: 88023831a290 > R13: 0001 R14: 88022bd8dbe8 R15: 8802449d > FS: 7fcfcad2c740() GS:88024480() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 7fcfc84cb968 CR3: 0001de93f000 CR4: 001407f0 > DR0: DR1: DR2: > DR3: DR6: fffe0ff0 DR7: 0400 > Stack: > 0260 2dba 81c7e258 
88022bd8dcf8 > 812b1131 88022c013f90 8801d37174c0 88022bd8dd38 > 81c7e2f0 88022bd8dd38 8801e065cec8 880241d86ca8 > Call Trace: > [] sysvipc_find_ipc+0x61/0x300 > [] sysvipc_proc_next+0x46/0xd0 > [] traverse.isra.7+0xc9/0x260 > [] ? lock_release_non_nested+0x308/0x350 > [] seq_read+0x3e1/0x450 > [] ? proc_reg_write+0x80/0x80 > [] proc_reg_read+0x3d/0x80 > [] do_loop_readv_writev+0x63/0x90 > [] do_readv_writev+0x21d/0x240 > [] ? local_clock+0x3f/0x50 > [] ? context_tracking_user_exit+0x46/0x1a0 > [] vfs_readv+0x35/0x60 > [] SyS_preadv+0xa2/0xd0 > [] tracesys+0xdd/0xe2 > Code: 7e 6e 41 8b 84 24 2c 08 00 00 83 eb 08 c1 e0 03 39 c3 0f 85 c1 00 00 00 > 89 d9 44 89 e8 d3 f8 0f b6 c0 48 83 c0 04 4d 8b 64 c4 08 80 b4 d6 ff 85 > c0 74 c4 80 3d f7 2f 9d 00 00 75 bb e8 6e b4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm/mmap.c: Remove unnecessary pgoff assignment
We never access variable pgoff later, so the assignment is redundant. Remove it. Signed-off-by: Zhang Yanfei --- mm/mmap.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index f9c97d1..db44f6a 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1570,7 +1570,6 @@ munmap_back: WARN_ON_ONCE(addr != vma->vm_start); addr = vma->vm_start; - pgoff = vma->vm_pgoff; vm_flags = vma->vm_flags; } else if (vm_flags & VM_SHARED) { if (unlikely(vm_flags & (VM_GROWSDOWN|VM_GROWSUP))) -- 1.7.1
Re: linux-next: back merge of Linus' tree into the vfio tree
Hi Alex, On Thu, 05 Sep 2013 17:14:29 -0600 Alex Williamson wrote: > > On Fri, 2013-09-06 at 09:08 +1000, Stephen Rothwell wrote: > > > > I noticed that you have back merged Linus' tree into yours. Linus > > usually takes a dim view of that - especially when there is no > > explanation in the merge commit message. i.e. you shouldn't do that > > unless you really need to - and then you should explain why you did it. > > Hmm, I was hoping that wouldn't be a problem, especially with no > conflicts in the merge. I did it because the first commit after the > merge in my next tree depends on PCI changes that have already been > merged by Linus. Re-basing is an even bigger sin and I felt it better > to do a merge than ask for two pulls or add an unbuild-able commit to my > next tree. How do you suggest that I resolve this? See above ... you should have said all that in the merge commit message. I guess that you should just own it now and explain it to Linus when you ask him to pull your tree. -- Cheers, Stephen Rothwell s...@canb.auug.org.au
Re: [PATCH 10/11] x86, mem-hotplug: Support initialize page tables from low to high.
Hi Wanpeng, On 09/06/2013 10:16 AM, Wanpeng Li wrote: .. +#ifdef CONFIG_MOVABLE_NODE + unsigned long kernel_end; + + if (movablenode_enable_srat && + memblock.current_order == MEMBLOCK_ORDER_LOW_TO_HIGH) { I think memblock.current_order == MEMBLOCK_ORDER_LOW_TO_HIGH is always true if config MOVABLE_NODE and movablenode_enable_srat == true if PATCH 11/11 is applied. memblock.current_order == MEMBLOCK_ORDER_LOW_TO_HIGH is true here if MOVABLE_NODE is configured, and it will be reset after SRAT is parsed. But movablenode_enable_srat could only be true when users specify movablenode boot option in the kernel commandline. You are right. I mean the change should be: +#ifdef CONFIG_MOVABLE_NODE + unsigned long kernel_end; + + if (movablenode_enable_srat) { It is unnecessary to check memblock.current_order since it is always true if movable_node is configured and movablenode_enable_srat is true. But I think, memblock.current_order is set outside init_mem_mapping(). And the path in the if statement could only be run when current order is from low to high. So I think it is safe to check it here. I prefer to keep it at least in the next version patch-set. If others also think it is unnecessary, I'm OK with removing the checking. :) Thanks. :)
Re: [PATCH] net: stmmac: fix bad merge conflict resolution
From: Olof Johansson Date: Thu, 5 Sep 2013 18:01:41 -0700 > Merge commit 06c54055bebf919249aa1eb68312887c3cfe77b4 did a bad conflict > resolution accidentally leaving out a closing brace. Add it back. > > Signed-off-by: Olof Johansson > --- > > This breaks a handful of defconfigs on ARM, so it'd be good to see it > applied pretty quickly. Thanks! Looks like Linus applied this, thanks Olof.
Re: [GIT] Sparc
From: Sergei Shtylyov Date: Fri, 06 Sep 2013 02:32:51 +0400 > Hello. > > On 09/06/2013 12:44 AM, David Miller wrote: > >> Several bug fixes (from Kirill Tkhai, Geert Uytterhoeven, and Alexey >> Dobriyan) and some support for Fujitsu sparc64x chips (from Allen >> Pais). > >> Please pull, thanks a lot! > > You meant that for 'linux-sparc', not 'linux-ide', right? :-) Yes, sparclinux is the intended destination, and I forwarded it there once I realized my mistake :-)
[git pull] Please pull powerpc.git next branch
Hi Linus !

Here's the powerpc batch for this merge window. Some of the highlights are:

* A bunch of endian fixes ! We don't have full LE support yet in that release but this contains a lot of fixes all over arch/powerpc to use the proper accessors, call the firmware with the right endian mode, etc...
* A few updates to our "powernv" platform (non-virtualized, the one to run KVM on), among others, support for bridging the P8 LPC bus for UARTs, support and some EEH fixes.
* Some mpc51xx clock API cleanups in preparation for a clock API overhaul
* A pile of cleanups of our old math emulation code, including better support for using it to emulate optional FP instructions on embedded chips that otherwise have a HW FPU.
* Some infrastructure in selftest, for powerpc now, but could be generalized, initially used by some tests for our perf instruction counting code.
* A pile of fixes for hotplug on pseries (that was seriously bitrotting)
* The usual slew of freescale embedded updates, new boards, 64-bit hibernation support, e6500 core PMU support, etc...

Cheers, Ben.
The following changes since commit d4e4ab86bcba5a72779c43dc1459f71fea3d89c8: Linux 3.11-rc5 (2013-08-11 18:04:20 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git next for you to fetch changes up to 9f24b0c9ef9b6b1292579c9e2cd7ff07ddc372b7: powerpc: Correct FSCR bit definitions (2013-09-05 17:29:20 +1000) Alistair Popple (4): powerpc: More little endian fixes for prom.c powerpc: More little endian fixes for setup-common.c powerpc: Little endian fixes for legacy_serial.c powerpc: Make NUMA device node code endian safe Andy Fleming (2): powerpc: Add smp_generic_cpu_bootable powerpc: Convert platforms to smp_generic_cpu_bootable Anton Blanchard (29): powerpc: Align p_toc powerpc: Handle unaligned ldbrx/stdbrx powerpc: Wrap MSR macros with parentheses powerpc: Remove SAVE_VSRU and REST_VSRU macros powerpc: Simplify logic in include/uapi/asm/elf.h powerpc/pseries: Simplify H_GET_TERM_CHAR powerpc: Fix a number of sparse warnings powerpc/pci: Don't use bitfield for force_32bit_msi powerpc: Stop using non-architected shared_proc field in lppaca powerpc: Make RTAS device tree accesses endian safe powerpc: Make cache info device tree accesses endian safe powerpc: Make RTAS calls endian safe powerpc: Make logical to real cpu mapping code endian safe powerpc: Add some endian annotations to time and xics code powerpc: Fix some endian issues in xics code powerpc: of_parse_dma_window should take a __be32 *dma_window powerpc: Make device tree accesses in cache info code endian safe powerpc: Make device tree accesses in HVC VIO console endian safe powerpc: Make device tree accesses in VIO subsystem endian safe powerpc: Make OF PCI device tree accesses endian safe powerpc: Make PCI device node device tree accesses endian safe powerpc: Add endian annotations to lppaca, slb_shadow and dtl_entry powerpc: Fix little endian lppaca, slb_shadow and dtl_entry powerpc: Emulate instructions in little endian mode powerpc: Little 
endian SMP IPI demux powerpc/pseries: Fix endian issues in H_GET_TERM_CHAR/H_PUT_TERM_CHAR powerpc: Fix little endian coredumps powerpc: Make rwlocks endian safe powerpc: Never handle VSX alignment exceptions from kernel Benjamin Herrenschmidt (21): Merge remote-tracking branch 'scott/next' into next powerpc/pmac: Early debug output on screen on 64-bit macs powerpc: Better split CONFIG_PPC_INDIRECT_PIO and CONFIG_PPC_INDIRECT_MMIO powerpc/powernv: Update opal.h to add new LPC and XSCOM functions powerpc/powernv: Add helper to get ibm,chip-id of a node powerpc/powernv: Add PIO accessors for Power8 LPC bus powerpc: Cleanup udbg_16550 and add support for LPC PIO-only UARTs powerpc: Check "status" property before adding legacy ISA serial ports powerpc/powernv: Don't crash if there are no OPAL consoles powerpc/powernv: Enable detection of legacy UARTs Revert "powerpc/e500: Update compilation flags with core specific options" powerpc: Make prom_init.c endian safe powerpc/wsp: Fix early debug build Merge remote-tracking branch 'scott/next' into next Merge branch 'merge' into next powerpc/btext: Fix CONFIG_PPC_EARLY_DEBUG_BOOTX on ppc32 powerpc: Don't Oops when accessing /proc/powerpc/lparcfg without hypervisor powerpc/powernv: Return secondary CPUs to firmware on kexec Merge branch 'merge' into next powerpc/pseries: Move lparcfg.c to platforms/pseries Merge remote-tracking branch 'agust/next' into next Catalin Udma (2): powerpc/perf: increase the perf HW events to 6
Re: [PATCH v2 4/4] kernel: add support for init_array constructors
Frantisek Hrbata writes: > This adds the .init_array section as yet another section with constructors. > This > is needed because gcc could add __gcov_init calls to .init_array or .ctors > section, depending on gcc version. > > v2: - reuse mod->ctors for .init_array section for modules, because gcc uses > .ctors or .init_array, but not both at the same time > > Signed-off-by: Frantisek Hrbata Might be nice to document which gcc version changed this, so people can choose whether to cherry-pick this change? Acked-by: Rusty Russell > --- > include/asm-generic/vmlinux.lds.h | 1 + > kernel/module.c | 3 +++ > 2 files changed, 4 insertions(+) > > diff --git a/include/asm-generic/vmlinux.lds.h > b/include/asm-generic/vmlinux.lds.h > index 69732d2..c55d8d9 100644 > --- a/include/asm-generic/vmlinux.lds.h > +++ b/include/asm-generic/vmlinux.lds.h > @@ -468,6 +468,7 @@ > #define KERNEL_CTORS() . = ALIGN(8); \ > VMLINUX_SYMBOL(__ctors_start) = .; \ > *(.ctors) \ > + *(.init_array) \ > VMLINUX_SYMBOL(__ctors_end) = .; > #else > #define KERNEL_CTORS() > diff --git a/kernel/module.c b/kernel/module.c > index 2069158..bbbd953 100644 > --- a/kernel/module.c > +++ b/kernel/module.c > @@ -2760,6 +2760,9 @@ static void find_module_sections(struct module *mod, > struct load_info *info) > #ifdef CONFIG_CONSTRUCTORS > mod->ctors = section_objs(info, ".ctors", > sizeof(*mod->ctors), &mod->num_ctors); > + if (!mod->ctors) > + mod->ctors = section_objs(info, ".init_array", > + sizeof(*mod->ctors), &mod->num_ctors); > #endif > > #ifdef CONFIG_TRACEPOINTS > -- > 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
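The fallback in the kernel/module.c hunk above can be pictured with a small user-space sketch. It assumes, as the changelog states, that a given gcc emits constructors into either .ctors or .init_array but not both. The struct, the counter, and the function names are hypothetical stand-ins and are not the kernel's data structures.

```c
#include <stddef.h>

typedef void (*ctor_fn_t)(void);

/* Hypothetical stand-in for the constructor sections of one module. */
struct section_table {
	ctor_fn_t *ctors;        /* contents of ".ctors", or NULL */
	size_t num_ctors;
	ctor_fn_t *init_array;   /* contents of ".init_array", or NULL */
	size_t num_init_array;
};

/* Demo constructor: only counts how often it was called. */
static int ctor_calls;
static void count_ctor(void)
{
	ctor_calls++;
}

/*
 * Prefer ".ctors"; if the toolchain did not emit it, fall back to
 * ".init_array". Returns the number of constructors that were run.
 */
static size_t run_constructors(const struct section_table *s)
{
	ctor_fn_t *fns = s->ctors;
	size_t n = s->num_ctors;
	size_t i;

	if (!fns) {
		fns = s->init_array;
		n = s->num_init_array;
	}
	for (i = 0; i < n; i++)
		fns[i]();
	return n;
}
```

Reusing one pointer for both sections keeps the consumer of the table unchanged, which matches the v2 note about reusing mod->ctors.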
Re: [PATCH v2 1/1] dcache: Translating dentry into pathname without taking rename_lock
On 09/05/2013 04:42 PM, Linus Torvalds wrote:

On Thu, Sep 5, 2013 at 1:29 PM, Waiman Long wrote: It is not as simple as doing a strncpy().

Yes it damn well is. Stop the f*cking stupid arguments, and instead listen to what I say. Here. Let me bold-face the most important part for you, so that you don't miss it in all the other crap:

MAKE prepend() JUST USE "strncpy()" INSTEAD OF "memcpy()".

Nothing else. Seriously. Your "you can't do it because we copy backwards" arguments are pure and utter garbage, exactly BECAUSE YOU DON'T CHANGE ANY OF THAT. You can actually use the unreliable length variable BUT YOU MUST STILL STOP AT A ZERO. Get it?

You're complicating the whole thing for no good reason. I'm telling you (and HAVE BEEN telling you multiple times) that you cannot use "memcpy()" because the length may not be reliable, so you need to check for zero in the middle and stop early. All your arguments have been totally pointless, because you don't seem to see that simple and fundamental issue. You don't change ANYTHING else. But you damn well not do a "memcpy", you do something that stops when it hits a NUL character. We call that function "strncpy()".

I'd actually prefer to write it out by hand (because somebody could implement "strncpy()" as a questionable function that accesses past the NUL as long as it's within the 'n'), and because I think we might want to do that word-at-a-time version of it, but for a first approximation, just do that one-liner version.

Don't do anything else. Don't do locking. Don't do memchr. Just make sure that you stop at a NUL character, and don't trust the length, because the length may not match the pointer. That was always ALL you needed to do.

Linus

I am sorry that I misunderstood what you said. I will do what you and Al advise me to do.

-Longman
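The advice in this message, written out by hand as Linus says he would prefer, might look roughly like the sketch below. It is a user-space illustration under assumptions: the signature is hypothetical, -1 stands in for the kernel's -ENAMETOOLONG, and the buffer is filled backwards as the thread describes for prepend().

```c
/*
 * Hypothetical sketch: reserve `len` bytes in front of *buffer and copy
 * `name` into them, but stop at the first NUL byte because `len` may be
 * stale and no longer match the string `name` points to.
 */
static int prepend_name(char **buffer, int *buflen, const char *name, int len)
{
	char *p;
	int i;

	*buflen -= len;
	if (*buflen < 0)
		return -1;	/* no room left in the output buffer */

	p = *buffer -= len;	/* the buffer is filled from the end */
	for (i = 0; i < len; i++) {
		char c = name[i];

		if (!c)
			break;	/* do not trust len: stop at the NUL */
		p[i] = c;
	}
	return 0;
}
```

If the loop stops early, the rest of the reserved area keeps whatever it held before. That seems acceptable only because, in the scheme under discussion, a result built from inconsistent data is expected to be discarded and retried.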
Hope and your long-term cooperation
Dear Manager: Glad to write to you. We are manufacturer of Stearic acid from China, We have Zinc stearate, calcium stearate, magnesium stearate, etc. If you need such chemicals, please do not hesitate to contact me. Best Regards, Shijiazhuang Shinearly Chemicals Co.,Ltd No. 105 Yellow River Road, Hightech Zone, Shijiazhuang City, Hebei Province, China Tel: 0086-311-89809275 Fax: 0086-311-67795015
Re: ftrace 'failed to modify' bug when loading reiserfs.ko
On Thu, 5 Sep 2013 21:48:59 -0400 Dave Jones wrote: > On Thu, Sep 05, 2013 at 09:44:55PM -0400, Steven Rostedt wrote: > > On Thu, 5 Sep 2013 21:34:55 -0400 > > Dave Jones wrote: > > > > > On Thu, Sep 05, 2013 at 09:28:34PM -0400, Steven Rostedt wrote: > > Did you change a config option, or update your gcc? > > Yeah, changed CONFIG_DEBUG_KOBJECT, which rebuilt the world. Still doesn't explain why it gave you that splat there. Do you still have that binary module, and can you show me what's at reiserfs_init_bitmap_cache+0x0 with objdump? -- Steve
[for-next][PATCH 4/4] ftrace/rcu: Do not trace debug_lockdep_rcu_enabled()
From: "Steven Rostedt (Red Hat)" The function debug_lockdep_rcu_enabled() is part of the RCU lockdep debugging, and is called very frequently. I found that if I enable a lot of debugging and run the function graph tracer, this function can cause a live lock of the system. We don't usually trace lockdep infrastructure, no need to trace this either. Reviewed-by: Paul E. McKenney Signed-off-by: Steven Rostedt --- kernel/rcupdate.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c index cce6ba8..4f20c6c 100644 --- a/kernel/rcupdate.c +++ b/kernel/rcupdate.c @@ -122,7 +122,7 @@ struct lockdep_map rcu_sched_lock_map = STATIC_LOCKDEP_MAP_INIT("rcu_read_lock_sched", &rcu_sched_lock_key); EXPORT_SYMBOL_GPL(rcu_sched_lock_map); -int debug_lockdep_rcu_enabled(void) +int notrace debug_lockdep_rcu_enabled(void) { return rcu_scheduler_active && debug_locks && current->lockdep_recursion == 0; -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[for-next][PATCH 1/4] tracing: Make tracing_cpumask available for all instances
From: Alexander Z Lam Allow tracer instances to disable tracing by cpu by moving the static global tracing_cpumask into trace_array. Link: http://lkml.kernel.org/r/921622317f239bfc2283cac2242647801ef584f2.1375980149.git@google.com Cc: Vaibhav Nagarnaik Cc: David Sharp Cc: Alexander Z Lam Signed-off-by: Alexander Z Lam Signed-off-by: Steven Rostedt --- kernel/trace/trace.c | 37 - kernel/trace/trace.h |1 + 2 files changed, 21 insertions(+), 17 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 496f94d..7974ba2 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3166,11 +3166,6 @@ static const struct file_operations show_traces_fops = { }; /* - * Only trace on a CPU if the bitmask is set: - */ -static cpumask_var_t tracing_cpumask; - -/* * The tracer itself will not take this lock, but still we want * to provide a consistent cpumask to user-space: */ @@ -3186,11 +3181,12 @@ static ssize_t tracing_cpumask_read(struct file *filp, char __user *ubuf, size_t count, loff_t *ppos) { + struct trace_array *tr = file_inode(filp)->i_private; int len; mutex_lock(&tracing_cpumask_update_lock); - len = cpumask_scnprintf(mask_str, count, tracing_cpumask); + len = cpumask_scnprintf(mask_str, count, tr->tracing_cpumask); if (count - len < 2) { count = -EINVAL; goto out_err; @@ -3208,7 +3204,7 @@ static ssize_t tracing_cpumask_write(struct file *filp, const char __user *ubuf, size_t count, loff_t *ppos) { - struct trace_array *tr = filp->private_data; + struct trace_array *tr = file_inode(filp)->i_private; cpumask_var_t tracing_cpumask_new; int err, cpu; @@ -3228,12 +3224,12 @@ tracing_cpumask_write(struct file *filp, const char __user *ubuf, * Increase/decrease the disabled counter if we are * about to flip a bit in the cpumask: */ - if (cpumask_test_cpu(cpu, tracing_cpumask) && + if (cpumask_test_cpu(cpu, tr->tracing_cpumask) && !cpumask_test_cpu(cpu, tracing_cpumask_new)) { atomic_inc(&per_cpu_ptr(tr->trace_buffer.data, cpu)->disabled); 
ring_buffer_record_disable_cpu(tr->trace_buffer.buffer, cpu); } - if (!cpumask_test_cpu(cpu, tracing_cpumask) && + if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) && cpumask_test_cpu(cpu, tracing_cpumask_new)) { atomic_dec(&per_cpu_ptr(tr->trace_buffer.data, cpu)->disabled); ring_buffer_record_enable_cpu(tr->trace_buffer.buffer, cpu); @@ -3242,7 +3238,7 @@ tracing_cpumask_write(struct file *filp, const char __user *ubuf, arch_spin_unlock(&ftrace_max_lock); local_irq_enable(); - cpumask_copy(tracing_cpumask, tracing_cpumask_new); + cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new); mutex_unlock(&tracing_cpumask_update_lock); free_cpumask_var(tracing_cpumask_new); @@ -3256,9 +3252,10 @@ err_unlock: } static const struct file_operations tracing_cpumask_fops = { - .open = tracing_open_generic, + .open = tracing_open_generic_tr, .read = tracing_cpumask_read, .write = tracing_cpumask_write, + .release= tracing_release_generic_tr, .llseek = generic_file_llseek, }; @@ -5938,6 +5935,11 @@ static int new_instance_create(const char *name) if (!tr->name) goto out_free_tr; + if (!alloc_cpumask_var(&tr->tracing_cpumask, GFP_KERNEL)) + goto out_free_tr; + + cpumask_copy(tr->tracing_cpumask, cpu_all_mask); + raw_spin_lock_init(&tr->start_lock); tr->current_trace = &nop_trace; @@ -5969,6 +5971,7 @@ static int new_instance_create(const char *name) out_free_tr: if (tr->trace_buffer.buffer) ring_buffer_free(tr->trace_buffer.buffer); + free_cpumask_var(tr->tracing_cpumask); kfree(tr->name); kfree(tr); @@ -6098,6 +6101,9 @@ init_tracer_debugfs(struct trace_array *tr, struct dentry *d_tracer) { int cpu; + trace_create_file("tracing_cpumask", 0644, d_tracer, + tr, &tracing_cpumask_fops); + trace_create_file("trace_options", 0644, d_tracer, tr, &tracing_iter_fops); @@ -6147,9 +6153,6 @@ static __init int tracer_init_debugfs(void) init_tracer_debugfs(&global_trace, d_tracer); - trace_create_file("tracing_cpumask", 0644, d_tracer, - &global_trace, &tracing_cpumask_fops); - 
trace_create_file("available_tracers", 0444, d_tracer, &global_trace, &show_traces_fop
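The bookkeeping in tracing_cpumask_write() from the hunks above can be sketched in user space as follows. This assumes a plain unsigned bitmask and an int array in place of the kernel's cpumask and per-CPU atomic counters. NCPUS and the function name are hypothetical.

```c
#define NCPUS 8

/*
 * For every CPU whose bit flips between old_mask and new_mask, adjust a
 * per-CPU "disabled" counter: a bit going 1 -> 0 disables tracing on
 * that CPU, a bit going 0 -> 1 enables it again.
 */
static void apply_cpumask(unsigned int old_mask, unsigned int new_mask,
			  int disabled[NCPUS])
{
	int cpu;

	for (cpu = 0; cpu < NCPUS; cpu++) {
		int was_set = (old_mask >> cpu) & 1;
		int is_set = (new_mask >> cpu) & 1;

		if (was_set && !is_set)
			disabled[cpu]++;
		if (!was_set && is_set)
			disabled[cpu]--;
	}
}
```

With the mask stored per trace_array, as the patch does, each instance would carry its own old_mask, so flipping a bit in one instance leaves the others untouched.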
[for-next][PATCH 3/4] x86-32, ftrace: Fix static ftrace when early microcode is enabled
From: "H. Peter Anvin" Early microcode loading runs C code before paging is enabled on 32 bits. Since ftrace puts a hook into every function, that hook needs to be safe to execute in the pre-paging environment. This is currently true for dynamic ftrace but not for static ftrace. Static ftrace is obsolescent and assumed to not be performance-critical, so we can simply test that the stack pointer falls within the valid range of kernel addresses. Reported-by: Jan Kiszka Tested-by: Jan Kiszka Signed-off-by: H. Peter Anvin Signed-off-by: Steven Rostedt --- arch/x86/kernel/entry_32.S |3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S index 2cfbc3a..f0dcb0c 100644 --- a/arch/x86/kernel/entry_32.S +++ b/arch/x86/kernel/entry_32.S @@ -1176,6 +1176,9 @@ ftrace_restore_flags: #else /* ! CONFIG_DYNAMIC_FTRACE */ ENTRY(mcount) + cmpl $__PAGE_OFFSET, %esp + jb ftrace_stub /* Paging not enabled yet? */ + cmpl $0, function_trace_stop jne ftrace_stub -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
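The two-instruction guard added to mcount can be restated in C. This is a sketch under assumptions: 0xC0000000 is used as an example value for __PAGE_OFFSET (it is configurable), and the function name is hypothetical. The comparison is unsigned because jb branches on "below".

```c
#include <stdbool.h>
#include <stdint.h>

/* Example value only; the real __PAGE_OFFSET depends on the kernel config. */
#define SKETCH_PAGE_OFFSET 0xC0000000u

/*
 * Mirrors the start of the static-ftrace mcount path: bail out if the
 * stack pointer is below the kernel mapping (paging not enabled yet) or
 * if tracing has been stopped. Returning true means the remaining
 * checks of the real mcount would go ahead.
 */
static bool mcount_may_continue(uint32_t esp, int function_trace_stop)
{
	if (esp < SKETCH_PAGE_OFFSET)	/* cmpl $__PAGE_OFFSET, %esp; jb */
		return false;
	if (function_trace_stop != 0)	/* cmpl $0, function_trace_stop; jne */
		return false;
	return true;
}
```

Before paging is enabled the early code runs at physical addresses, so its stack pointer sits below the kernel's virtual base and the hook returns immediately.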
[for-next][PATCH 2/4] ftrace: Fix a slight race in modifying what function callback gets traced
From: "Steven Rostedt (Red Hat)"

There's a slight race when going from a list function to a non list
function. That is, when only one callback is registered to the function
tracer, it gets called directly by the mcount trampoline. But if this
function has filters, it may be called by the wrong functions.

As the list ops callback that handles multiple callbacks that are
registered to ftrace, it also handles what functions they call. While
the transaction is taking place, use the list function always, and
after all the updates are finished (only the functions that should be
traced are being traced), then we can update the trampoline to call
the function directly.

Signed-off-by: Steven Rostedt
---
 kernel/trace/ftrace.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a6d098c..03cf44a 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1978,12 +1978,27 @@ int __weak ftrace_arch_code_modify_post_process(void)
 
 void ftrace_modify_all_code(int command)
 {
+	int update = command & FTRACE_UPDATE_TRACE_FUNC;
+
+	/*
+	 * If the ftrace_caller calls a ftrace_ops func directly,
+	 * we need to make sure that it only traces functions it
+	 * expects to trace. When doing the switch of functions,
+	 * we need to update to the ftrace_ops_list_func first
+	 * before the transition between old and new calls are set,
+	 * as the ftrace_ops_list_func will check the ops hashes
+	 * to make sure the ops are having the right functions
+	 * traced.
+	 */
+	if (update)
+		ftrace_update_ftrace_func(ftrace_ops_list_func);
+
 	if (command & FTRACE_UPDATE_CALLS)
 		ftrace_replace_code(1);
 	else if (command & FTRACE_DISABLE_CALLS)
 		ftrace_replace_code(0);
 
-	if (command & FTRACE_UPDATE_TRACE_FUNC)
+	if (update && ftrace_trace_function != ftrace_ops_list_func)
 		ftrace_update_ftrace_func(ftrace_trace_function);
 
 	if (command & FTRACE_START_FUNC_RET)
-- 
1.7.10.4
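The ordering in this patch can be modelled outside the kernel as a three-step update: point the trampoline at the filtering list function, patch the call sites, and only then switch to the direct callback. The sketch below is a user-space model under that reading. Every name in it is hypothetical and none of it is kernel code; the probe callback stands in for a function that happens to be traced while the call sites are being rewritten.

```c
#include <stddef.h>

typedef void (*sketch_trace_fn)(unsigned long ip);

/* Hypothetical stand-ins for ftrace state, for illustration only. */
static sketch_trace_fn sketch_trampoline_target; /* what the trampoline calls */
static int sketch_list_hits;
static int sketch_direct_hits;

/* Models the list function, which checks filters before calling ops. */
static void sketch_list_func(unsigned long ip)
{
	(void)ip;
	sketch_list_hits++;
}

/* Models a single registered callback that would be called directly. */
static void sketch_direct_func(unsigned long ip)
{
	(void)ip;
	sketch_direct_hits++;
}

enum { SKETCH_UPDATE_CALLS = 1, SKETCH_UPDATE_TRACE_FUNC = 2 };

/* Models call-site patching; probe fires "during" the transition. */
static void sketch_replace_call_sites(void (*probe)(void))
{
	if (probe)
		probe();
}

static void sketch_modify_all_code(int command, sketch_trace_fn wanted,
				   void (*probe)(void))
{
	int update = command & SKETCH_UPDATE_TRACE_FUNC;

	/* Step 1: route everything through the filtering list function. */
	if (update)
		sketch_trampoline_target = sketch_list_func;

	/* Step 2: patch call sites while the list function is in place. */
	if (command & SKETCH_UPDATE_CALLS)
		sketch_replace_call_sites(probe);

	/* Step 3: switch to the direct callback once patching is done. */
	if (update && wanted != sketch_list_func)
		sketch_trampoline_target = wanted;
}

/* A traced function running in the middle of the transition. */
static void sketch_probe(void)
{
	sketch_trampoline_target(0);
}
```

In this model a call that arrives during step 2 reaches the list function and never the direct callback, which is the property the patch description asks for.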
[for-next][PATCH 0/4] tracing: Updated changes for 3.12
I'm holding off on the rcu unsafe changes with perf and function tracing.
We'll still get bug splats with unsafe rcu usage, but we need to work
out a better solution than I was going to push for 3.12. It's too late
to get things smooth, thus we need to wait till 3.13 to get something
that is decent. For now, root needs to be careful in how they trace
functions with perf.

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
for-next

Head SHA1: a0a5a0561f63905fe94c49bc567615829f42ce1e


Alexander Z Lam (1):
      tracing: Make tracing_cpumask available for all instances

H. Peter Anvin (1):
      x86-32, ftrace: Fix static ftrace when early microcode is enabled

Steven Rostedt (Red Hat) (2):
      ftrace: Fix a slight race in modifying what function callback gets traced
      ftrace/rcu: Do not trace debug_lockdep_rcu_enabled()

----
 arch/x86/kernel/entry_32.S |    3 +++
 kernel/rcupdate.c          |    2 +-
 kernel/trace/ftrace.c      |   17 ++++++++++++++++-
 kernel/trace/trace.c       |   37 ++++++++++++++++++++-----------------
 kernel/trace/trace.h       |    1 +
 5 files changed, 41 insertions(+), 19 deletions(-)
Re: ftrace 'failed to modify' bug when loading reiserfs.ko
On Thu, Sep 05, 2013 at 09:44:55PM -0400, Steven Rostedt wrote:
 > On Thu, 5 Sep 2013 21:34:55 -0400
 > Dave Jones wrote:
 >
 > > On Thu, Sep 05, 2013 at 09:28:34PM -0400, Steven Rostedt wrote:
 > > > On Thu, 5 Sep 2013 21:19:24 -0400
 > > > Dave Jones wrote:
 > > >
 > > > > For whatever dumb reason, when running 'make install' on a Fedora system,
 > > > > os-prober tries to figure out what filesystems are needed by loading filesystems,
 > > > > and seeing what sticks.. Today it blew up spectacularly when it got to
 > > > > loading reiserfs.. System wedged entirely afterwards.
 > > >
 > > > Could it be that the reiserfs module was compiled differently than the
 > > > running kernel?
 > >
 > > o... it was probably installing the just-built version over the same '3.11+'
 > > modules tree that was running. This has never been a problem before though..
 > >
 >
 > Did you change a config option, or update your gcc?

Yeah, changed CONFIG_DEBUG_KOBJECT, which rebuilt the world.

	Dave
Re: ftrace 'failed to modify' bug when loading reiserfs.ko
On Thu, 5 Sep 2013 21:34:55 -0400
Dave Jones wrote:

> On Thu, Sep 05, 2013 at 09:28:34PM -0400, Steven Rostedt wrote:
>  > On Thu, 5 Sep 2013 21:19:24 -0400
>  > Dave Jones wrote:
>  >
>  > > For whatever dumb reason, when running 'make install' on a Fedora system,
>  > > os-prober tries to figure out what filesystems are needed by loading filesystems,
>  > > and seeing what sticks.. Today it blew up spectacularly when it got to
>  > > loading reiserfs.. System wedged entirely afterwards.
>  >
>  > Could it be that the reiserfs module was compiled differently than the
>  > running kernel?
>
> o... it was probably installing the just-built version over the same '3.11+'
> modules tree that was running. This has never been a problem before though..
>

Did you change a config option, or update your gcc?

Although, it doesn't really explain why the location would have
something that it doesn't expect. As the mcount/fentry table is created
in the module itself.

-- Steve
[GIT PULL] security subsystem changes for 3.12
Nothing major for this kernel, just maintenance updates.

Please pull.


The following changes since commit 2e032852245b3dcfe5461d7353e34eb6da095ccf:

  Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm (2013-09-05 18:07:32 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next

Casey Schaufler (1):
      Smack: network label match fix

James Morris (2):
      Merge branch 'linus-master'; commit 'v3.11-rc2' into ra-next
      Merge branch 'smack-for-3.12' of git://git.gitorious.org/smack-next/kernel into ra-next

John Johansen (14):
      apparmor: enable users to query whether apparmor is enabled
      apparmor: add a features/policy dir to interface
      apparmor: provide base for multiple profiles to be replaced at once
      apparmor: convert profile lists to RCU based locking
      apparmor: change how profile replacement update is done
      apparmor: update how unconfined is handled
      apparmor: rework namespace free path
      apparmor: make free_profile available outside of policy.c
      apparmor: allow setting any profile into the unconfined state
      apparmor: add interface files for profiles and namespaces
      apparmor: add an optional profile attachment string for profiles
      apparmor: add the profile introspection file to interface
      apparmor: export set of capabilities supported by the apparmor module
      apparmor: add the ability to report a sha1 hash of loaded policy

Rafal Krypa (1):
      Smack: parse multiple rules per write to load2, up to PAGE_SIZE-1 bytes

Tetsuo Handa (2):
      xattr: Constify ->name member of "struct xattr".
      apparmor: remove minimum size check for vmalloc()

Tomasz Stanislawski (2):
      security: smack: fix memleak in smk_write_rules_list()
      security: smack: add a hash table to quicken smk_find_entry()

 fs/ocfs2/xattr.h                          |    2 +-
 include/linux/security.h                  |    8 +-
 include/linux/xattr.h                     |    2 +-
 include/uapi/linux/reiserfs_xattr.h       |    2 +-
 security/apparmor/Kconfig                 |   12 +
 security/apparmor/Makefile                |    7 +-
 security/apparmor/apparmorfs.c            |  636 -
 security/apparmor/capability.c            |    5 +
 security/apparmor/context.c               |   16 +-
 security/apparmor/crypto.c                |   97 +
 security/apparmor/domain.c                |   24 +-
 security/apparmor/include/apparmor.h      |    6 +
 security/apparmor/include/apparmorfs.h    |   40 ++
 security/apparmor/include/audit.h         |    1 -
 security/apparmor/include/capability.h    |    4 +
 security/apparmor/include/context.h       |   15 +-
 security/apparmor/include/crypto.h        |   36 ++
 security/apparmor/include/policy.h        |  218 +++---
 security/apparmor/include/policy_unpack.h |   21 +-
 security/apparmor/lib.c                   |    5 -
 security/apparmor/lsm.c                   |   22 +-
 security/apparmor/policy.c                |  609
 security/apparmor/policy_unpack.c         |  135 +--
 security/apparmor/procattr.c              |    2 +-
 security/capability.c                     |    2 +-
 security/integrity/evm/evm_main.c         |    2 +-
 security/security.c                       |    8 +-
 security/selinux/hooks.c                  |   17 +-
 security/smack/smack.h                    |   13 +-
 security/smack/smack_access.c             |   29 ++-
 security/smack/smack_lsm.c                |   51 ++-
 security/smack/smackfs.c                  |  184 -
 32 files changed, 1675 insertions(+), 556 deletions(-)
 create mode 100644 security/apparmor/crypto.c
 create mode 100644 security/apparmor/include/crypto.h
Re: [PATCH 10/11] x86, mem-hotplug: Support initialize page tables from low to high.
Hi Wanpeng,

Thank you for reviewing. See below, please.

On 09/05/2013 09:30 PM, Wanpeng Li wrote:
..
>> +#ifdef CONFIG_MOVABLE_NODE
>> +	unsigned long kernel_end;
>> +
>> +	if (movablenode_enable_srat &&
>> +	    memblock.current_order == MEMBLOCK_ORDER_LOW_TO_HIGH) {
>
> I think memblock.current_order == MEMBLOCK_ORDER_LOW_TO_HIGH is always
> true if config MOVABLE_NODE and movablenode_enable_srat == true if
> PATCH 11/11 is applied.

memblock.current_order == MEMBLOCK_ORDER_LOW_TO_HIGH is true here if
MOVABLE_NODE is configured, and it will be reset after SRAT is parsed.
But movablenode_enable_srat could only be true when users specify
movablenode boot option in the kernel commandline.

Please refer to patch 9/11.

>> +		kernel_end = round_up(__pa_symbol(_end), PMD_SIZE);
>> +
>> +		memory_map_from_low(kernel_end, end);
>> +		memory_map_from_low(ISA_END_ADDRESS, kernel_end);
>
> Why split ISA_END_ADDRESS ~ end?

The first 5 pages for the page tables are from brk, please refer to
alloc_low_pages(). They are able to map about 2MB memory. And this 2MB
memory will be used to store page tables for the next mapped pages.

Here, we split [ISA_END_ADDRESS, end) into [ISA_END_ADDRESS, _end) and
[_end, end), and map [_end, end) first. This is because memory in
[ISA_END_ADDRESS, _end) may be used, then we have not enough memory for
the next coming page tables. We should map [_end, end) first because
this memory is highly likely unused.

..
> I think the variables sorted by address is:
> ISA_END_ADDRESS -> _end -> real_end -> end

Yes.

>> +	memory_map_from_high(ISA_END_ADDRESS, real_end);
>
> If this is overlap with work done between #ifdef CONFIG_MOVABLE_NODE
> and #endif?

I don't think so. Seeing from my code, if work between
#ifdef CONFIG_MOVABLE_NODE and #endif is done, it will goto out, right ?

Thanks.
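The mapping order described in this reply can be sketched as plain address arithmetic: round the end of the kernel image up to a PMD boundary, map the range above it first, and map the range from ISA_END_ADDRESS up to the kernel end second. The sketch below runs in user space. Its names are hypothetical, and the constants assume a 2 MiB PMD_SIZE and an ISA_END_ADDRESS of 1 MiB.

```c
#include <stdint.h>

/* Assumed values for illustration; the kernel defines the real ones. */
#define SKETCH_PMD_SIZE (UINT64_C(1) << 21)   /* 2 MiB */
#define SKETCH_ISA_END  UINT64_C(0x100000)    /* 1 MiB */

struct sketch_range {
	uint64_t start;
	uint64_t end;
};

/* Rounds x up to a multiple of align; align must be a power of two. */
static uint64_t sketch_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Fills out[0] and out[1] with the two ranges in the order they would
 * be mapped: the likely unused memory above the kernel image first,
 * then the memory between the ISA hole and the end of the kernel.
 */
static void sketch_map_order(uint64_t kernel_end_pa, uint64_t end,
			     struct sketch_range out[2])
{
	uint64_t kernel_end = sketch_round_up(kernel_end_pa, SKETCH_PMD_SIZE);

	out[0].start = kernel_end;
	out[0].end = end;
	out[1].start = SKETCH_ISA_END;
	out[1].end = kernel_end;
}
```

Mapping the upper range first means the page tables needed for later mappings can be allocated from memory that the kernel image does not occupy.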
Re: [PATCH RESEND v3 0/7] Enable Drivers for Intel MIC X100 Coprocessors.
Whitespace neatening... Multiline statement argument alignment. Argument wrapping. Use kmalloc_array instead of kmalloc. --- drivers/misc/mic/card/mic_virtio.c | 17 --- drivers/misc/mic/card/mic_x100.c| 4 +- drivers/misc/mic/host/mic_debugfs.c | 91 ++--- drivers/misc/mic/host/mic_fops.c| 6 +-- drivers/misc/mic/host/mic_intr.c| 37 --- drivers/misc/mic/host/mic_smpt.c| 17 +++ drivers/misc/mic/host/mic_sysfs.c | 18 drivers/misc/mic/host/mic_virtio.c | 34 ++ drivers/misc/mic/host/mic_x100.c| 29 ++-- 9 files changed, 122 insertions(+), 131 deletions(-) diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c index 38275c1..6071aec 100644 --- a/drivers/misc/mic/card/mic_virtio.c +++ b/drivers/misc/mic/card/mic_virtio.c @@ -103,7 +103,7 @@ static void mic_finalize_features(struct virtio_device *vdev) for (i = 0; i < bits; i++) { if (test_bit(i, vdev->features)) iowrite8(ioread8(&out_features[i / 8]) | (1 << (i % 8)), - &out_features[i / 8]); +&out_features[i / 8]); } } @@ -197,10 +197,9 @@ static void mic_notify(struct virtqueue *vq) static void mic_del_vq(struct virtqueue *vq, int n) { struct mic_vdev *mvdev = to_micvdev(vq->vdev); - struct vring *vr = (struct vring *) (vq + 1); + struct vring *vr = (struct vring *)(vq + 1); - free_pages((unsigned long) vr->used, - get_order(mvdev->used_size[n])); + free_pages((unsigned long) vr->used, get_order(mvdev->used_size[n])); vring_del_virtqueue(vq); mic_card_unmap(mvdev->mdev, mvdev->vr[n]); mvdev->vr[n] = NULL; @@ -274,8 +273,8 @@ static struct virtqueue *mic_find_vq(struct virtio_device *vdev, /* Allocate and reassign used ring now */ mvdev->used_size[index] = PAGE_ALIGN(sizeof(__u16) * 3 + sizeof(struct vring_used_elem) * config.num); - used = (void *) __get_free_pages(GFP_KERNEL | __GFP_ZERO, - get_order(mvdev->used_size[index])); + used = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, + get_order(mvdev->used_size[index])); if (!used) { err = -ENOMEM; dev_err(mic_dev(mvdev), "%s %d err %d\n", 
@@ -291,7 +290,7 @@ static struct virtqueue *mic_find_vq(struct virtio_device *vdev, * vring_new_virtqueue() would ensure that * (&vq->vring == (struct vring *) (&vq->vq + 1)); */ - vr = (struct vring *) (vq + 1); + vr = (struct vring *)(vq + 1); vr->used = used; vq->priv = mvdev; @@ -544,7 +543,7 @@ static void mic_scan_devices(struct mic_driver *mdrv, bool remove) if (dev) { if (remove) iowrite8(MIC_VIRTIO_PARAM_DEV_REMOVE, - &dc->config_change); +&dc->config_change); put_device(dev); mic_handle_config_change(d, i, mdrv); ret = mic_remove_device(d, i, mdrv); @@ -559,7 +558,7 @@ static void mic_scan_devices(struct mic_driver *mdrv, bool remove) /* new device */ dev_dbg(mdrv->dev, "%s %d Adding new virtio device %p\n", - __func__, __LINE__, d); + __func__, __LINE__, d); if (!remove) mic_add_device(d, i, mdrv); } diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c index 7cb3469..e54dfcb 100644 --- a/drivers/misc/mic/card/mic_x100.c +++ b/drivers/misc/mic/card/mic_x100.c @@ -66,8 +66,8 @@ void mic_send_intr(struct mic_device *mdev, int doorbell) /* Ensure that the interrupt is ordered w.r.t previous stores. */ wmb(); mic_mmio_write(mw, MIC_X100_SBOX_SDBIC0_DBREQ_BIT, - MIC_X100_SBOX_BASE_ADDRESS + - (MIC_X100_SBOX_SDBIC0 + (4 * doorbell))); + MIC_X100_SBOX_BASE_ADDRESS + + (MIC_X100_SBOX_SDBIC0 + (4 * doorbell))); } /** diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c index e22fb7b..002faa5 100644 --- a/drivers/misc/mic/host/mic_debugfs.c +++ b/drivers/misc/mic/host/mic_debugfs.c @@ -103,7 +103,7 @@ static int mic_smpt_show(struct seq_file *s, void *pos) unsigned long flags; seq_printf(s, "MIC %-2d |%-10s| %-14s %-10s\n", - mdev->id, "SMPT entry", "SW DMA addr", "RefCount"); + mdev->id, "SMPT entry", "SW DMA addr", "RefCount"); seq_puts(s, "\n"); if (mdev->smpt) { @@ -111,8 +111,8 @@ static
Re: ftrace 'failed to modify' bug when loading reiserfs.ko
On Thu, Sep 05, 2013 at 09:28:34PM -0400, Steven Rostedt wrote:
 > On Thu, 5 Sep 2013 21:19:24 -0400
 > Dave Jones wrote:
 >
 > > For whatever dumb reason, when running 'make install' on a Fedora system,
 > > os-prober tries to figure out what filesystems are needed by loading filesystems,
 > > and seeing what sticks.. Today it blew up spectacularly when it got to
 > > loading reiserfs.. System wedged entirely afterwards.
 >
 > Could it be that the reiserfs module was compiled differently than the
 > running kernel?

o... it was probably installing the just-built version over the same '3.11+'
modules tree that was running. This has never been a problem before though..

	Dave
Re: ftrace 'failed to modify' bug when loading reiserfs.ko
On Thu, 5 Sep 2013 21:19:24 -0400 Dave Jones wrote: > For whatever dumb reason, when running 'make install' on a Fedora system, > os-prober tries to figure out what filesystems are needed by loading > filesystems, > and seeing what sticks.. Today it blew up spectacularly when it got to > loading reiserfs.. System wedged entirely afterwards. Could it be that the reiserfs module was compiled differently than the running kernel? > > Dave > > [ cut here ] > WARNING: CPU: 2 PID: 30566 at kernel/trace/ftrace.c:1694 > ftrace_bug+0x25d/0x270() > Modules linked in: reiserfs(+) snd_hda_codec_hdmi snd_hda_codec_realtek > snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm > snd_page_alloc xfs snd_timer libcrc32c snd e1000e ptp usb_debug pps_core > pcspkr soundcore > CPU: 2 PID: 30566 Comm: modprobe Not tainted 3.11.0+ #57 > 81a2809d 88008de19c30 817171e9 > 88008de19c68 81053dad 0010 a02738b0 > 8802419e3518 8801ab16e100 88008de19c78 > Call Trace: > [] dump_stack+0x54/0x74 > [] warn_slowpath_common+0x7d/0xa0 > [] warn_slowpath_null+0x1a/0x20 > [] ftrace_bug+0x25d/0x270 > [] ftrace_process_locs+0x308/0x630 > [] ftrace_module_notify_enter+0x3c/0x40 > [] notifier_call_chain+0x66/0x150 > [] __blocking_notifier_call_chain+0x67/0xc0 > [] blocking_notifier_call_chain+0x16/0x20 > [] load_module+0x1f7d/0x2680 > [] ? store_uevent+0x40/0x40 > [] ? reiserfs_xattr_register_handlers+0xf9f/0xf9f > [reiserfs] > [] ? 
reiserfs_xattr_register_handlers+0xf9f/0xf9f > [reiserfs] > [] SyS_finit_module+0x86/0xb0 > [] tracesys+0xdd/0xe2 > ---[ end trace 956db59f53237fe4 ]--- > ftrace failed to modify [] > reiserfs_init_bitmap_cache+0x0/0x5750 [reiserfs] > actual: 14:00:00:00:00 Hmm, where it expected to see a call to mcount, instead is sees the instruction: 0x14 00 00 00 00 Can you do an objdump of that same binary, and show me what's located at: reiserfs_init_bitmap_cache+0x0 -- Steve > [ cut here ] > WARNING: CPU: 2 PID: 30566 at arch/x86/mm/pageattr.c:677 > __cpa_process_fault+0x91/0xa0() > CPA: called for zero pte. vaddr = a0249000 cpa->vaddr = > a0249000 > Modules linked in: reiserfs(+) snd_hda_codec_hdmi snd_hda_codec_realtek > snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm > snd_page_alloc xfs snd_timer libcrc32c snd e1000e ptp usb_debug pps_core > pcspkr soundcore > CPU: 2 PID: 30566 Comm: modprobe Tainted: GW3.11.0+ #57 > 81a0ba44 88008de19b40 817171e9 88008de19b88 > 88008de19b78 81053dad 88008de19d08 fff2 > a0249000 880238646248 88008de19d08 88008de19bd8 > Call Trace: > [] dump_stack+0x54/0x74 > [] warn_slowpath_common+0x7d/0xa0 > [] ? reiserfs_xattr_register_handlers+0x9f9f/0x29f9f > [reiserfs] > [] warn_slowpath_fmt+0x4c/0x50 > [] ? reiserfs_xattr_register_handlers+0x8f9f/0xf9f > [reiserfs] > [] ? reiserfs_xattr_register_handlers+0x9f9f/0x29f9f > [reiserfs] > [] ? reiserfs_xattr_register_handlers+0x9f9f/0x29f9f > [reiserfs] > [] __cpa_process_fault+0x91/0xa0 > [] __change_page_attr_set_clr+0x392/0xab0 > [] ? 0xa023efff > [] change_page_attr_set_clr+0x123/0x460 > [] ? 0xa023efff > [] set_memory_ro+0x2f/0x40 > [] ? reiserfs_xattr_register_handlers+0x9f9f/0x29f9f > [reiserfs] > [] set_section_ro_nx+0x3a/0x71 > [] load_module+0x1f9e/0x2680 > [] ? store_uevent+0x40/0x40 > [] ? reiserfs_xattr_register_handlers+0xf9f/0xf9f > [reiserfs] > [] ? 
reiserfs_xattr_register_handlers+0xf9f/0xf9f > [reiserfs] > [] SyS_finit_module+0x86/0xb0 > [] tracesys+0xdd/0xe2 > ---[ end trace 956db59f53237fe5 ]--- > Oops: 0003 [#1] SMP > Modules linked in: reiserfs snd_hda_codec_hdmi snd_hda_codec_realtek > snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm > snd_page_alloc xfs snd_timer libcrc32c snd e1000e ptp usb_debug pps_core > pcspkr soundcore > CPU: 1 PID: 30571 Comm: modprobe Tainted: GW3.11.0+ #57 > task: 8801238a ti: 8801ab314000 task.ti: 8801ab314000 > RIP: 0010:[] [] load_module+0x161b/0x2680 > RSP: 0018:8801ab315dc0 EFLAGS: 00010202 > RAX: a009c000 RBX: 8801ab315ef8 RCX: a00c2000 > RDX: a00c2000 RSI: 0055 RDI: a00c3f98 > RBP: 8801ab315ee8 R08: a009fa68 R09: a009c000 > R10: a00c3f98 R11: 0002 R12: a02d2838 > R13: 0001 R14: R15: a02d2820 > FS: 7f6f48b51740() GS:88024580() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: a00c2000 CR3: 0002211e9000 CR4: 001407e0 > DR0: DR1: DR2: > DR3: DR6: fffe0ff0 DR7: 0400 > Sta
[PATCH V2] arm: LLVMLinux: use static inline in ARM ftrace.h
From: Behan Webster

With compilers which follow the C99 standard (like modern versions of gcc and
clang), "extern inline" does the wrong thing (emits code for an externally
linkable version of the inline function). In this case using static inline
and removing the NULL version of return_address in return_address.c does
the right thing.

Signed-off-by: Behan Webster
---
 arch/arm/include/asm/ftrace.h    | 2 +-
 arch/arm/kernel/return_address.c | 5 -----
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h
index f89515a..2bb8cac 100644
--- a/arch/arm/include/asm/ftrace.h
+++ b/arch/arm/include/asm/ftrace.h
@@ -45,7 +45,7 @@ void *return_address(unsigned int);
 
 #else
 
-extern inline void *return_address(unsigned int level)
+static inline void *return_address(unsigned int level)
 {
 	return NULL;
 }
diff --git a/arch/arm/kernel/return_address.c b/arch/arm/kernel/return_address.c
index fafedd8..f6aa84d 100644
--- a/arch/arm/kernel/return_address.c
+++ b/arch/arm/kernel/return_address.c
@@ -63,11 +63,6 @@ void *return_address(unsigned int level)
 #warning "TODO: return_address should use unwind tables"
 #endif
 
-void *return_address(unsigned int level)
-{
-	return NULL;
-}
-
 #endif /* if defined(CONFIG_FRAME_POINTER) && !defined(CONFIG_ARM_UNWIND) / else */
 
 EXPORT_SYMBOL_GPL(return_address);
-- 
1.8.1.2
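For context on the change above: under C99 rules an "extern inline" definition in a header is an external definition, so every file that includes it emits a linkable copy, whereas "static inline" gives each translation unit a private copy and emits no external symbol. A minimal stand-alone sketch of the header-style stub follows. The function name is hypothetical and only mirrors the shape of the fallback in the patch.

```c
#include <stddef.h>

/*
 * Header-style fallback stub. With "static inline" the definition has
 * internal linkage, so including it from several source files cannot
 * produce duplicate external symbols, under either gnu89 or C99 inline
 * semantics.
 */
static inline void *sketch_return_address(unsigned int level)
{
	(void)level;	/* the stub ignores the requested frame depth */
	return NULL;
}
```

Because the stub has internal linkage, no separate out-of-line definition in a .c file is needed for it.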
ftrace 'failed to modify' bug when loading reiserfs.ko
For whatever dumb reason, when running 'make install' on a Fedora system, os-prober tries to figure out what filesystems are needed by loading filesystems, and seeing what sticks.. Today it blew up spectacularly when it got to loading reiserfs.. System wedged entirely afterwards. Dave [ cut here ] WARNING: CPU: 2 PID: 30566 at kernel/trace/ftrace.c:1694 ftrace_bug+0x25d/0x270() Modules linked in: reiserfs(+) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc xfs snd_timer libcrc32c snd e1000e ptp usb_debug pps_core pcspkr soundcore CPU: 2 PID: 30566 Comm: modprobe Not tainted 3.11.0+ #57 81a2809d 88008de19c30 817171e9 88008de19c68 81053dad 0010 a02738b0 8802419e3518 8801ab16e100 88008de19c78 Call Trace: [] dump_stack+0x54/0x74 [] warn_slowpath_common+0x7d/0xa0 [] warn_slowpath_null+0x1a/0x20 [] ftrace_bug+0x25d/0x270 [] ftrace_process_locs+0x308/0x630 [] ftrace_module_notify_enter+0x3c/0x40 [] notifier_call_chain+0x66/0x150 [] __blocking_notifier_call_chain+0x67/0xc0 [] blocking_notifier_call_chain+0x16/0x20 [] load_module+0x1f7d/0x2680 [] ? store_uevent+0x40/0x40 [] ? reiserfs_xattr_register_handlers+0xf9f/0xf9f [reiserfs] [] ? reiserfs_xattr_register_handlers+0xf9f/0xf9f [reiserfs] [] SyS_finit_module+0x86/0xb0 [] tracesys+0xdd/0xe2 ---[ end trace 956db59f53237fe4 ]--- ftrace failed to modify [] reiserfs_init_bitmap_cache+0x0/0x5750 [reiserfs] actual: 14:00:00:00:00 [ cut here ] WARNING: CPU: 2 PID: 30566 at arch/x86/mm/pageattr.c:677 __cpa_process_fault+0x91/0xa0() CPA: called for zero pte. 
vaddr = a0249000 cpa->vaddr = a0249000 Modules linked in: reiserfs(+) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc xfs snd_timer libcrc32c snd e1000e ptp usb_debug pps_core pcspkr soundcore CPU: 2 PID: 30566 Comm: modprobe Tainted: GW3.11.0+ #57 81a0ba44 88008de19b40 817171e9 88008de19b88 88008de19b78 81053dad 88008de19d08 fff2 a0249000 880238646248 88008de19d08 88008de19bd8 Call Trace: [] dump_stack+0x54/0x74 [] warn_slowpath_common+0x7d/0xa0 [] ? reiserfs_xattr_register_handlers+0x9f9f/0x29f9f [reiserfs] [] warn_slowpath_fmt+0x4c/0x50 [] ? reiserfs_xattr_register_handlers+0x8f9f/0xf9f [reiserfs] [] ? reiserfs_xattr_register_handlers+0x9f9f/0x29f9f [reiserfs] [] ? reiserfs_xattr_register_handlers+0x9f9f/0x29f9f [reiserfs] [] __cpa_process_fault+0x91/0xa0 [] __change_page_attr_set_clr+0x392/0xab0 [] ? 0xa023efff [] change_page_attr_set_clr+0x123/0x460 [] ? 0xa023efff [] set_memory_ro+0x2f/0x40 [] ? reiserfs_xattr_register_handlers+0x9f9f/0x29f9f [reiserfs] [] set_section_ro_nx+0x3a/0x71 [] load_module+0x1f9e/0x2680 [] ? store_uevent+0x40/0x40 [] ? reiserfs_xattr_register_handlers+0xf9f/0xf9f [reiserfs] [] ? 
reiserfs_xattr_register_handlers+0xf9f/0xf9f [reiserfs] [] SyS_finit_module+0x86/0xb0 [] tracesys+0xdd/0xe2 ---[ end trace 956db59f53237fe5 ]--- Oops: 0003 [#1] SMP Modules linked in: reiserfs snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc xfs snd_timer libcrc32c snd e1000e ptp usb_debug pps_core pcspkr soundcore CPU: 1 PID: 30571 Comm: modprobe Tainted: GW3.11.0+ #57 task: 8801238a ti: 8801ab314000 task.ti: 8801ab314000 RIP: 0010:[] [] load_module+0x161b/0x2680 RSP: 0018:8801ab315dc0 EFLAGS: 00010202 RAX: a009c000 RBX: 8801ab315ef8 RCX: a00c2000 RDX: a00c2000 RSI: 0055 RDI: a00c3f98 RBP: 8801ab315ee8 R08: a009fa68 R09: a009c000 R10: a00c3f98 R11: 0002 R12: a02d2838 R13: 0001 R14: R15: a02d2820 FS: 7f6f48b51740() GS:88024580() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: a00c2000 CR3: 0002211e9000 CR4: 001407e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Stack: 003fa26b 8801238a 8801ab315e48 8801238a a009c000 a02d2a58 a02d2838 3a80 a009c000 a00c2000 003a94a10969 a00c3f98 Call Trace: [] ? xfs_setattr_nonsize+0x240/0x5d0 [xfs] [] ? xfs_inumbers+0x248/0x420 [xfs] [] ? copy_module_from_fd.isra.48+0x12a/0x190 [] SyS_finit_module+0x86/0xb0 [] tracesys+0xdd/0xe2 Code: 48 83 7a 38 00 78 6a 48 8b 30 44 89 ea 4c 89 d7 48 8d 14 52 4c 89 4c 24 40 41 83 c5 01 48 8d 14 d1 48 89 4c 24 48 4c 89 54 24 58 <48> 89 32 48 8b 70 08 48 89 72 0
Re: [PATCH v4 0/3] cleanup of gpio_pcf857x.c
Hi

> This patch series
> - removes the irq_demux_work
> - Uses devm_request_threaded_irq
> - Call the user handler iff gpio_to_irq is done.
>
> v1 --> v2
> Split v1 to 3 patches
> v2 --> v3
> Remove the unnecessary dts patches.
> v3 --> v4
> Remove gpio->irq (in patch 2)
>
> Note: these patches were made after applying [1].
> [1] - [PATCH v5] gpio: pcf857x: Add OF support - https://lkml.org/lkml/2013/8/27/70
>
> George Cherian (3):
>   gpio: pcf857x: change to devm_request_threaded_irq
>   gpio: pcf857x: remove the irq_demux_work and gpio->irq
>   gpio: pcf857x: call the gpio user handler iff gpio_to_irq is done
>
>  drivers/gpio/gpio-pcf857x.c | 53 ++---
>  1 file changed, 26 insertions(+), 27 deletions(-)

For all patches

Acked-by: Kuninori Morimoto

Best regards
---
Kuninori Morimoto