Re: [PATCH RFCv2 0/6] net: phy: Ethernet PHY powerdown optimization

2013-12-04 Thread Mugunthan V N
On Wednesday 04 December 2013 09:14 PM, Sebastian Hesselbarth wrote:
> This is v2 of an RFC sent earlier [1] to reduce power consumption of network
> PHYs with link that are either unused or the corresponding netdev is down.
>
> In contrast to RFCv1, this now integrates phy_suspend/phy_resume transparent
> to the netdev drivers. Also, phy_suspend now only suspends the PHY if WOL is
> disabled. Moreover, the phy state machine calls phy_suspend on entering
> HALTED state.
>
> Again, a branch with RFCv2 applied to v3.13-rc2 can also be found at
> https://github.com/shesselba/linux-dove.git topic/ethphy-power-rfc-v2
>
> [1] http://lwn.net/Articles/574426/
>
> Sebastian Hesselbarth (6):
>   net: mv643xx_eth: properly start/stop phy device
>   net: phy: marvell: provide genphy suspend/resume
>   net: phy: provide phy_resume/phy_suspend helpers
>   net: phy: resume/suspend PHYs on attach/detach
>   net: phy: suspend unused PHYs on mdio_bus in late_initcall
>   net: phy: suspend phydev when going to HALTED
>
>  drivers/net/ethernet/marvell/mv643xx_eth.c |4 +++-
>  drivers/net/phy/marvell.c  |   22 ++
>  drivers/net/phy/mdio_bus.c |   25 +
>  drivers/net/phy/phy.c  |6 +-
>  drivers/net/phy/phy_device.c   |   27 +++
>  include/linux/phy.h|2 ++
>  6 files changed, 84 insertions(+), 2 deletions(-)
>
> ---
> Cc: David Miller 
> Cc: Florian Fainelli 
> Cc: Mugunthan V N 
> Cc: net...@vger.kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
Apart from Sergei's comment, the patch series looks good to me.

Acked-by: Mugunthan V N 

Regards
Mugunthan V N
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 04/10] usb: dwc3: use quirks to know if a particular platform doesn't have PHY

2013-12-04 Thread Heikki Krogerus
Hi,

On Thu, Dec 05, 2013 at 12:04:46PM +0530, Kishon Vijay Abraham I wrote:
> On Wednesday 04 December 2013 08:10 PM, Heikki Krogerus wrote:
> > On Mon, Nov 25, 2013 at 03:31:24PM +0530, Kishon Vijay Abraham I wrote:
> >> There can be systems which does not have an external phy, so get
> >> phy only if no quirks are added that indicates the PHY is not present.
> >> Introduced two quirk flags to indicate the *absence* of usb2 phy and
> >> usb3 phy. Also remove checking if return value is -ENXIO since it's now
> >> changed to always enable usb_phy layer.
> > 
> > Can you guys explain why is something like this needed? Like with
> > clocks and gpios, the device drivers shouldn't need to care any more
> > if the platform has the phys or not. -ENODEV tells you your platform
> 
> Shouldn't we report if a particular platform needs a PHY and not able to get
> it. How will a user know if a particular controller is not working because
> it's not able to get and initialize the PHYs? Don't you think in such cases
> it's better to fail (and return from probe) because the controller will not
> work anyway without the PHY?

My point is that you do not need to separately tell this to the driver
like you do with the quirks (if you did, then you would need to fix
your framework and not hack the drivers).

Like I said, -ENODEV tells you that there is no phy on this platform
for you, allowing you to safely continue. If your phy driver is not
loaded, the framework already returns -EPROBE_DEFER, right? Any other
error when getting the phy you can consider critical. Those are the
errors telling you that you do need a phy on this platform, but
something actually went wrong when getting it.

Those "quirks" should always be avoided, and I don't see any use for
them here.
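
The error handling described above can be sketched roughly as follows.
This is only an illustration in plain C, not code from dwc3 or the phy
framework: the enum and the function classify_phy_lookup() are
hypothetical names, and EPROBE_DEFER is defined locally because it is a
kernel-internal errno value (517) that userspace headers do not provide.

```c
#include <errno.h>

/* Kernel-internal errno; defined here so the sketch builds in userspace. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical result type: what a controller driver would do next. */
enum phy_lookup_action {
	PHY_USE,		/* lookup succeeded, use the phy */
	PHY_NOT_PRESENT,	/* -ENODEV: no phy on this platform, continue */
	PHY_RETRY_LATER,	/* -EPROBE_DEFER: phy driver not loaded yet */
	PHY_FATAL,		/* any other error: fail the probe */
};

/* Map the error returned by a phy lookup to the action to take. */
static enum phy_lookup_action classify_phy_lookup(int err)
{
	if (err == 0)
		return PHY_USE;
	if (err == -ENODEV)
		return PHY_NOT_PRESENT;
	if (err == -EPROBE_DEFER)
		return PHY_RETRY_LATER;
	return PHY_FATAL;
}
```

With a mapping like this the driver needs no per-platform quirk flag:
the return value of the lookup already carries the information.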

> Thanks
> Kishon

Thanks,

-- 
heikki


Re: [PATCH] watchdog: s3c2410_wdt: Report when the watchdog reset the system

2013-12-04 Thread Leela Krishna Amudala
Hi Guenter Roeck,

On Tue, Dec 3, 2013 at 3:06 AM, Guenter Roeck  wrote:
>
> On Mon, Dec 02, 2013 at 12:47:53PM -0800, Olof Johansson wrote:
> > On Mon, Dec 2, 2013 at 12:21 PM, Guenter Roeck  wrote:
> > > On Mon, Dec 02, 2013 at 10:14:41AM -0800, Doug Anderson wrote:
> > >> A good watchdog driver is supposed to report when it was responsible
> > >> for resetting the system.  Implement this for the s3c2410, at least on
> > >> exynos5250 and exynos5420 where we already have a pointer to the PMU
> > >> registers to read the information.
> > >>
> > >> Signed-off-by: Doug Anderson 
> > >> ---
> > >> This patch is based atop Leela Krishna's recent series that ends with
> > >> (ARM: dts: update watchdog device nodes for Exynos5250 and Exynos5420)
> > >> AKA .
> > >>
> > >>  drivers/watchdog/s3c2410_wdt.c | 42 +++---
> > >>  1 file changed, 39 insertions(+), 3 deletions(-)
> > >>
> > >> diff --git a/drivers/watchdog/s3c2410_wdt.c b/drivers/watchdog/s3c2410_wdt.c
> > >> index 47f4dcf..2c87d37 100644
> > >> --- a/drivers/watchdog/s3c2410_wdt.c
> > >> +++ b/drivers/watchdog/s3c2410_wdt.c
> > >> @@ -62,9 +62,13 @@
> > >>  #define CONFIG_S3C2410_WATCHDOG_ATBOOT   (0)
> > >>  #define CONFIG_S3C2410_WATCHDOG_DEFAULT_TIME (15)
> > >>
> > >> +#define RST_STAT_REG_OFFSET  0x0404
> > >>  #define WDT_DISABLE_REG_OFFSET   0x0408
> > >>  #define WDT_MASK_RESET_REG_OFFSET 0x040c
> > >>  #define QUIRK_NEEDS_PMU_CONFIG   (1 << 0)
> > >> +#define QUIRK_HAS_RST_STAT   (1 << 1)
> > >> +#define QUIRKS_NEED_PMUREG   (QUIRK_NEEDS_PMU_CONFIG | \
> > >> +  QUIRK_HAS_RST_STAT)
> > >>
> > > I am not really happy about the NEED (both of them, really) here.
> > > How about HAS instead ?
> >
> > Ah, I just commented on these things on our internal review site too
> > on this patch. I don't even think a quirk is needed -- just use the
> > presence of a non-0 rst_stat_reg to tell if you need to use regmap.
> >
> Agreed; same is true for the QUIRK_NEEDS_PMU_CONFIG related registers
> as well.
>
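
For reference, the suggestion quoted above (drop the quirk flags and
treat a non-zero register offset as "this variant has the register")
could look roughly like the sketch below. It is a plain-C illustration,
not the actual s3c2410_wdt code: struct wdt_variant, its field names and
both helpers are hypothetical, and the caller supplies the reset-status
bit position because it is not given in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-SoC description. An offset of 0 means "this variant
 * does not have the register", which replaces the QUIRK_* flags.
 */
struct wdt_variant {
	uint32_t disable_reg;		/* e.g. WDT_DISABLE_REG_OFFSET */
	uint32_t mask_reset_reg;	/* e.g. WDT_MASK_RESET_REG_OFFSET */
	uint32_t rst_stat_reg;		/* e.g. RST_STAT_REG_OFFSET */
	uint32_t rst_stat_bit;		/* bit position, SoC specific */
};

/* The PMU regmap is needed only if at least one PMU register is present. */
static bool wdt_needs_pmureg(const struct wdt_variant *v)
{
	return v->disable_reg || v->mask_reset_reg || v->rst_stat_reg;
}

/* Report whether the last reset was caused by the watchdog. */
static bool wdt_reset_by_watchdog(const struct wdt_variant *v,
				  uint32_t rst_stat)
{
	if (!v->rst_stat_reg)
		return false;	/* no status register, cannot tell */
	return (rst_stat & (UINT32_C(1) << v->rst_stat_bit)) != 0;
}
```

One limitation of this scheme is that a register living at offset 0
cannot be described, which is not a concern for the offsets in the
patch (0x0404, 0x0408, 0x040c).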

I introduced the quirks as Tomasz Figa suggested.

Tomasz,
Any comments from you here?


Best wishes,
Leela Krishna


> Guenter
> --
> To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" 
> in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] media: Add BCM2048 radio driver

2013-12-04 Thread Hans Verkuil
On 12/02/2013 09:51 PM, Pali Rohár wrote:
> On Monday 04 November 2013 12:39:44 Hans Verkuil wrote:
>> Hi Pali,
>>
>> On 10/26/2013 10:45 PM, Pali Rohár wrote:
>>> On Saturday 26 October 2013 22:22:09 Hans Verkuil wrote:
> Hans, so can it be added to drivers/staging/media tree?

 Yes, that is an option. It's up to you to decide what you
 want. Note that if no cleanup work is done on the staging
 driver for a long time, then it can be removed again.

 Regards,

 Hans
>>>
>>> Ok, so if you can add it to staging tree. When driver will
>>> be in mainline other developers can look at it too. Now
>>> when driver is hidden, nobody know where to find it... You
>>> can see how upstream development for Nokia N900 HW going
>>> on: http://elinux.org/N900
>>
>> Please check my tree:
>>
>> http://git.linuxtv.org/hverkuil/media_tree.git/shortlog/refs/heads/bcm
>>
>> If you're OK, then I'll queue it for 3.14 (it's too late for
>> 3.13).
>>
>> Regards,
>>
>>  Hans
> 
> Hi, sorry for late reply. I looked into your tree and difference 
> is that you only removed "linux/slab.h" include. So it it is not 
> needed, then it is OK.

I *added* slab.h :-)

Anyway, I've posted the pull request. Please note, if you want to avoid
having this driver be removed again in the future, then you (or someone else)
should work on addressing the issues in the TODO file I added.

Regards,

Hans


[PATCH v3 1/2] cpufreq: tegra: Call tegra_cpufreq_init() specifically in machine code

2013-12-04 Thread Bill Huang
Move the call from module_init to the Tegra machine code so that it is
not called in a multi-platform kernel running on non-Tegra SoCs.

Signed-off-by: Bill Huang 
---
 arch/arm/mach-tegra/tegra.c |2 ++
 drivers/cpufreq/tegra-cpufreq.c |   13 ++---
 include/linux/tegra-soc.h   |   11 ++-
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/arm/mach-tegra/tegra.c b/arch/arm/mach-tegra/tegra.c
index 7336817..14490ad 100644
--- a/arch/arm/mach-tegra/tegra.c
+++ b/arch/arm/mach-tegra/tegra.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -160,6 +161,7 @@ static void __init tegra_dt_init_late(void)
 {
int i;
 
+   tegra_cpufreq_init();
tegra_init_suspend();
tegra_cpuidle_init();
tegra_powergate_debugfs_init();
diff --git a/drivers/cpufreq/tegra-cpufreq.c b/drivers/cpufreq/tegra-cpufreq.c
index 63f0059..ae1c0f1 100644
--- a/drivers/cpufreq/tegra-cpufreq.c
+++ b/drivers/cpufreq/tegra-cpufreq.c
@@ -155,7 +155,7 @@ static struct cpufreq_driver tegra_cpufreq_driver = {
 #endif
 };
 
-static int __init tegra_cpufreq_init(void)
+int __init tegra_cpufreq_init(void)
 {
cpu_clk = clk_get_sys(NULL, "cclk");
if (IS_ERR(cpu_clk))
@@ -177,17 +177,8 @@ static int __init tegra_cpufreq_init(void)
 
return cpufreq_register_driver(&tegra_cpufreq_driver);
 }
-
-static void __exit tegra_cpufreq_exit(void)
-{
-cpufreq_unregister_driver(&tegra_cpufreq_driver);
-   clk_put(emc_clk);
-   clk_put(cpu_clk);
-}
-
+EXPORT_SYMBOL(tegra_cpufreq_init);
 
 MODULE_AUTHOR("Colin Cross ");
 MODULE_DESCRIPTION("cpufreq driver for Nvidia Tegra2");
 MODULE_LICENSE("GPL");
-module_init(tegra_cpufreq_init);
-module_exit(tegra_cpufreq_exit);
diff --git a/include/linux/tegra-soc.h b/include/linux/tegra-soc.h
index 95f611d..a179aa5 100644
--- a/include/linux/tegra-soc.h
+++ b/include/linux/tegra-soc.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2012, NVIDIA CORPORATION.  All rights reserved.
+ * Copyright (c) 2012,2013, NVIDIA CORPORATION.  All rights reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -17,6 +17,15 @@
 #ifndef __LINUX_TEGRA_SOC_H_
 #define __LINUX_TEGRA_SOC_H_
 
+#ifdef CONFIG_ARM_TEGRA_CPUFREQ
+int tegra_cpufreq_init(void);
+#else
+static inline int tegra_cpufreq_init(void)
+{
+   return -EINVAL;
+}
+#endif
+
 u32 tegra_read_chipid(void);
 
 #endif /* __LINUX_TEGRA_SOC_H_ */
-- 
1.7.9.5



Re: [PATCH -next] checkpatch: Warn only on "space before semicolon" at end of line

2013-12-04 Thread Dan Carpenter
Thanks so much.  :)

regards,
dan carpenter



Re: [PATCH 0/4] Convert ACPI fan driver to platform driver

2013-12-04 Thread Aaron Lu
On Thu, Dec 05, 2013 at 12:07:31AM +0100, Rafael J. Wysocki wrote:
> On Tuesday, December 03, 2013 04:28:28 PM Aaron Lu wrote:
> > This patchset converts ACPI fan driver to platform driver. Patch 1-3 are
> > cleanups for existing fan driver and patch 4 does the convertion.
> > 
> > Tested on harris beach.
> > Apply on top of Rafael's linux-next branch.
> > 
> > Aaron Lu (4):
> >   ACPI / fan: remove unused macro for debug
> >   ACPI / fan: remove no need check for device pointer
> >   ACPI / fan: use acpi_device_xxx_power instead of acpi_bus equivelant
> >   ACPI / fan: convert to platform driver
> > 
> >  drivers/acpi/acpi_platform.c |  3 ++
> >  drivers/acpi/device_pm.c |  1 +
> >  drivers/acpi/fan.c   | 88 
> > 
> >  drivers/acpi/internal.h  |  2 -
> >  include/acpi/acpi_bus.h  |  1 +
> >  5 files changed, 36 insertions(+), 59 deletions(-)
> 
> Unfortunately, we need to postpone these conversions, because Matthew Garrett
> has problems with adding more entries to acpi_platform_device_ids[].  He seems
> to be concerned that that list will grow indefinitely and will become
> difficult to maintain eventually.
> 
> For this reason, he would prefer it if we did the following:
> - Figure out the list of ACPI device IDs we need to create PNP devices for
>   via ACPI PNP.

I'm not sure how to determine this. Is it that, as long as the ACPI node
has a PNP ID, we need to create a PNP device for it? And in that case,
do we check only the _HID, or both _HID and _CID?

> - Make ACPI PNP create PNP devices for these IDs only and make the ACPI core
>   create platform devices for all "unassigned" ACPI device objects by default.

Does "unassigned" mean (all ACPI devices) - (ACPI devices that have a
PNP device created already)?

Thanks,
Aaron

> - Do the conversions at that point.
> 
> I'm slightly worried that we'll encounter ordering issues while doing that,
> but this is the only way forward I can see without going straight against
> Matthew's objections, which I'd prefer to avoid.
> 
> Thanks,
> Rafael
> 


[PATCH v3 0/2] Remodel Tegra cpufreq drivers to support Tegra series SoC

2013-12-04 Thread Bill Huang
This patch series remodels the Tegra cpufreq driver to make it easier to
add support for new SoCs. In addition, it adds a probe function to the
driver so that deferred probing can be used to control the init sequence
once we support DVFS.

Changes since v2:

- Fix Kconfig.
- Rebase on top of branch 'cpufreq-next' on
  git://git.linaro.org/people/vireshk/linux.git.

Changes since v1:

- Split up patches.
- Split configuration-time data out of structure "tegra_cpufreq_data".
- Bug fixes.

Bill Huang (2):
  cpufreq: tegra: Call tegra_cpufreq_init() specifically in machine
code
  cpufreq: tegra: Re-model Tegra cpufreq driver

 arch/arm/mach-tegra/tegra.c   |2 +
 drivers/cpufreq/Kconfig.arm   |   12 +++
 drivers/cpufreq/Makefile  |1 +
 drivers/cpufreq/tegra-cpufreq.c   |  188 +
 drivers/cpufreq/tegra-cpufreq.h   |   40 
 drivers/cpufreq/tegra20-cpufreq.c |  136 +++
 include/linux/tegra-soc.h |   11 ++-
 7 files changed, 287 insertions(+), 103 deletions(-)
 create mode 100644 drivers/cpufreq/tegra-cpufreq.h
 create mode 100644 drivers/cpufreq/tegra20-cpufreq.c

-- 
1.7.9.5



[PATCH v3 2/2] cpufreq: tegra: Re-model Tegra cpufreq driver

2013-12-04 Thread Bill Huang
Re-model the Tegra cpufreq driver to support all Tegra series SoCs.

* Make tegra-cpufreq.c a generic Tegra cpufreq driver.
* Move Tegra20-specific code into tegra20-cpufreq.c.
* Bind the Tegra cpufreq driver to a fake device so that deferred probing
  works once the driver needs to get a regulator to support voltage
  scaling (DVFS).

Signed-off-by: Bill Huang 
---
 drivers/cpufreq/Kconfig.arm   |   12 +++
 drivers/cpufreq/Makefile  |1 +
 drivers/cpufreq/tegra-cpufreq.c   |  185 ++---
 drivers/cpufreq/tegra-cpufreq.h   |   40 
 drivers/cpufreq/tegra20-cpufreq.c |  136 +++
 5 files changed, 278 insertions(+), 96 deletions(-)
 create mode 100644 drivers/cpufreq/tegra-cpufreq.h
 create mode 100644 drivers/cpufreq/tegra20-cpufreq.c

diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index ce52ed9..1cc9213 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -225,6 +225,18 @@ config ARM_TEGRA_CPUFREQ
help
  This adds the CPUFreq driver support for TEGRA SOCs.
 
+config ARM_TEGRA20_CPUFREQ
+   bool "NVIDIA TEGRA20"
+   depends on ARM_TEGRA_CPUFREQ && ARCH_TEGRA_2x_SOC
+   default y
+   help
+ This enables Tegra20 cpufreq functionality, it adds
+ Tegra20 CPU frequency ladder and the call back functions
+ to set CPU rate. All the non-SoC dependant codes are
+ controlled by the config ARM_TEGRA20_CPUFREQ.
+
+ If in doubt, say N.
+
 config ARM_VEXPRESS_SPC_CPUFREQ
 tristate "Versatile Express SPC based CPUfreq driver"
 select ARM_BIG_LITTLE_CPUFREQ
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 7494565..331964b 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -74,6 +74,7 @@ obj-$(CONFIG_ARM_SA1100_CPUFREQ)  += sa1100-cpufreq.o
 obj-$(CONFIG_ARM_SA1110_CPUFREQ)   += sa1110-cpufreq.o
 obj-$(CONFIG_ARM_SPEAR_CPUFREQ)    += spear-cpufreq.o
 obj-$(CONFIG_ARM_TEGRA_CPUFREQ)    += tegra-cpufreq.o
+obj-$(CONFIG_ARM_TEGRA20_CPUFREQ)  += tegra20-cpufreq.o
 obj-$(CONFIG_ARM_VEXPRESS_SPC_CPUFREQ) += vexpress-spc-cpufreq.o
 
 
##
diff --git a/drivers/cpufreq/tegra-cpufreq.c b/drivers/cpufreq/tegra-cpufreq.c
index ae1c0f1..5f19c18 100644
--- a/drivers/cpufreq/tegra-cpufreq.c
+++ b/drivers/cpufreq/tegra-cpufreq.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (C) 2010 Google, Inc.
+ * Copyright (c) 2013, NVIDIA Corporation.
  *
  * Author:
  * Colin Cross 
@@ -18,69 +19,17 @@
 
 #include 
 #include 
-#include 
-#include 
 #include 
-#include 
-#include 
 #include 
 #include 
-#include 
-
-static struct cpufreq_frequency_table freq_table[] = {
-   { .frequency = 216000 },
-   { .frequency = 312000 },
-   { .frequency = 456000 },
-   { .frequency = 608000 },
-   { .frequency = 760000 },
-   { .frequency = 816000 },
-   { .frequency = 912000 },
-   { .frequency = 1000000 },
-   { .frequency = CPUFREQ_TABLE_END },
-};
-
-#define NUM_CPUS   2
+#include 
+#include 
+#include 
 
-static struct clk *cpu_clk;
-static struct clk *pll_x_clk;
-static struct clk *pll_p_clk;
-static struct clk *emc_clk;
-
-static int tegra_cpu_clk_set_rate(unsigned long rate)
-{
-   int ret;
+#include "tegra-cpufreq.h"
 
-   /*
-* Take an extra reference to the main pll so it doesn't turn
-* off when we move the cpu off of it
-*/
-   clk_prepare_enable(pll_x_clk);
-
-   ret = clk_set_parent(cpu_clk, pll_p_clk);
-   if (ret) {
-   pr_err("Failed to switch cpu to clock pll_p\n");
-   goto out;
-   }
-
-   if (rate == clk_get_rate(pll_p_clk))
-   goto out;
-
-   ret = clk_set_rate(pll_x_clk, rate);
-   if (ret) {
-   pr_err("Failed to change pll_x to %lu\n", rate);
-   goto out;
-   }
-
-   ret = clk_set_parent(cpu_clk, pll_x_clk);
-   if (ret) {
-   pr_err("Failed to switch cpu to clock pll_x\n");
-   goto out;
-   }
-
-out:
-   clk_disable_unprepare(pll_x_clk);
-   return ret;
-}
+static struct tegra_cpufreq_data *tegra_data;
+static const struct tegra_cpufreq_config *soc_config;
 
 static int tegra_update_cpu_speed(struct cpufreq_policy *policy,
unsigned long rate)
@@ -91,14 +40,10 @@ static int tegra_update_cpu_speed(struct cpufreq_policy *policy,
 * Vote on memory bus frequency based on cpu frequency
 * This sets the minimum frequency, display or avp may request higher
 */
-   if (rate >= 816000)
-   clk_set_rate(emc_clk, 600000000); /* cpu 816 MHz, emc max */
-   else if (rate >= 456000)
-   clk_set_rate(emc_clk, 300000000); /* cpu 456 MHz, emc 150Mhz */
-   else
-   clk_set_rate(emc_clk, 100000000);  /* e

[char-misc-next] mei: add 9 series PCH mei device ids

2013-12-04 Thread Tomas Winkler
And Lynx Point H Refresh and Wildcat Point LP
device ids.

Signed-off-by: Tomas Winkler 
---
V2: remove duplicated  LPT entry

 drivers/misc/mei/hw-me-regs.h | 5 -
 drivers/misc/mei/pci-me.c | 4 +++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/mei/hw-me-regs.h b/drivers/misc/mei/hw-me-regs.h
index 6c0fde5..66f411a 100644
--- a/drivers/misc/mei/hw-me-regs.h
+++ b/drivers/misc/mei/hw-me-regs.h
@@ -109,9 +109,12 @@
 #define MEI_DEV_ID_PPT_2  0x1CBA  /* Panther Point */
 #define MEI_DEV_ID_PPT_3  0x1DBA  /* Panther Point */
 
-#define MEI_DEV_ID_LPT       0x8C3A  /* Lynx Point */
+#define MEI_DEV_ID_LPT_H  0x8C3A  /* Lynx Point H */
 #define MEI_DEV_ID_LPT_W  0x8D3A  /* Lynx Point - Wellsburg */
 #define MEI_DEV_ID_LPT_LP 0x9C3A  /* Lynx Point LP */
+#define MEI_DEV_ID_LPT_HR 0x8CBA  /* Lynx Point H Refresh */
+
+#define MEI_DEV_ID_WPT_LP 0x9CBA  /* Wildcat Point LP */
 /*
  * MEI HW Section
  */
diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
index 3bfae38..7dfaa32 100644
--- a/drivers/misc/mei/pci-me.c
+++ b/drivers/misc/mei/pci-me.c
@@ -76,9 +76,11 @@ static DEFINE_PCI_DEVICE_TABLE(mei_me_pci_tbl) = {
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MEI_DEV_ID_PPT_1)},
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MEI_DEV_ID_PPT_2)},
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MEI_DEV_ID_PPT_3)},
-   {PCI_DEVICE(PCI_VENDOR_ID_INTEL, MEI_DEV_ID_LPT)},
+   {PCI_DEVICE(PCI_VENDOR_ID_INTEL, MEI_DEV_ID_LPT_H)},
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MEI_DEV_ID_LPT_W)},
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MEI_DEV_ID_LPT_LP)},
+   {PCI_DEVICE(PCI_VENDOR_ID_INTEL, MEI_DEV_ID_LPT_HR)},
+   {PCI_DEVICE(PCI_VENDOR_ID_INTEL, MEI_DEV_ID_WPT_LP)},
 
/* required last entry */
{0, }
-- 
1.8.3.1



RE: [char-misc-next] mei: add 9 series PCH mei device ids

2013-12-04 Thread Winkler, Tomas
> >
> > diff --git a/drivers/misc/mei/hw-me-regs.h b/drivers/misc/mei/hw-me-regs.h
> > index 6c0fde5..f83bc80 100644
> > --- a/drivers/misc/mei/hw-me-regs.h
> > +++ b/drivers/misc/mei/hw-me-regs.h
> > @@ -110,8 +110,12 @@
> >  #define MEI_DEV_ID_PPT_3  0x1DBA  /* Panther Point */
> >
> >  #define MEI_DEV_ID_LPT       0x8C3A  /* Lynx Point */
> > +#define MEI_DEV_ID_LPT_H  0x8C3A  /* Lynx Point H */
> 
> Why duplicate this #define?

> And, now that you changed it, why keep the "old" one around, it's no
> longer used anywhere else?
> 
The correct name is LPT_H; I just forgot to remove the old line. Sending V2.

Thanks
Tomas


Re: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:616

2013-12-04 Thread Andrew Morton
On Thu, 05 Dec 2013 15:12:04 +0800 Axel Lin  wrote:

> > 
> > blam.  spin_unlock_irq(&mapping->tree_lock) failed to decrement
> > preempt_count().  What the heck.
> > 
> > What architecture is this?  Please send the full .config.
> > 
> > And exactly which kernel version is in use?
> 
> It's a arm7tdmi SoC (GeneralPlus gpl32700 SoC).
> The code is: current Linus' tree + patches for this SoC.
> The patches for this SoC includes:
> irqchip, clocksource, pinctrl, gpio, uart, spi, sd/mmc host drivers.
> I also apply a out-of-tree sdio wifi driver for mt5931 wifi module.

Beats me, sorry - I don't see anything which could cause this in the
arm spinlock implementation, even if the spinlock's storage got
corrupted.

> I can successfully boot and running busybox if using ext2 as root.
> Thus I don't hit "spin_unlock_irq decrement preempt_count failure" if using 
> ext2 as root.
> The storage is a spi nor flash, so I prefer to use jffs2 but then I got
> the hangup.
> 
> BTW, I got below panic today:
> 
> 467: 0
> 470: 0
> 475: 1
> 485: 1
> 487: 2
> 489: 2
> 491: 1
> 494: 1
> 496: 1
> 498: 1
> BUG: spinlock bad magic on CPU#0, spi0/30

Which is what appears to have happened here.

I assume earlier kernels worked OK with this config?

If so, all I can suggest is a git bisection search :(


[RESEND][BUG][PATCH V3] audit: audit_log_start running on auditd should not stop

2013-12-04 Thread Toshiyuki Okajima
Hi.

Please apply this patch because the problem that audit_receive called by auditd 
process is hung up by other process (systemd) which has already called it is 
fixed.

This patch fixes the problem that auditd hangs up by itself.

Thanks.

---
The backlog cannot be consumed while audit_log_start is running on auditd,
even if audit_log_start calls wait_for_auditd to wait for it to drain.
This is a deadlock, because only auditd can consume the backlog.
Any other process that needs to add to the backlog is then blocked
by the deadlock as well.

So, audit_log_start running on auditd should not stop.

You can see the deadlock with the following reproducer:
 # auditctl -a exit,always -S all
 # reboot

Signed-off-by: Toshiyuki Okajima 
Reviewed-by: gaof...@cn.fujitsu.com
---
 kernel/audit.c |   14 --
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..29cfc94 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1095,7 +1095,8 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
struct audit_buffer *ab = NULL;
struct timespec t;
unsigned int uninitialized_var(serial);
-   int reserve;
+   int reserve = 5; /* Allow atomic callers to go up to five
+   entries over the normal backlog limit */
unsigned long timeout_start = jiffies;
 
if (audit_initialized != AUDIT_INITIALIZED)
@@ -1104,11 +1105,12 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
if (unlikely(audit_filter_type(type)))
return NULL;
 
-   if (gfp_mask & __GFP_WAIT)
-   reserve = 0;
-   else
-   reserve = 5; /* Allow atomic callers to go up to five
-   entries over the normal backlog limit */
+   if (gfp_mask & __GFP_WAIT) {
+   if (audit_pid && audit_pid == current->pid)
+   gfp_mask &= ~__GFP_WAIT;
+   else
+   reserve = 0;
+   }
 
while (audit_backlog_limit
   && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserve) {
-- 
1.5.5.6
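
For readers following the hunk above, the decision it makes can be
sketched in isolation as below. This is a standalone illustration in
plain C, not kernel code: struct backlog_policy and
audit_backlog_policy() are hypothetical names, and the boolean gfp_wait
stands in for the (gfp_mask & __GFP_WAIT) test.

```c
#include <stdbool.h>

/* Hypothetical summary of what audit_log_start may do for a caller. */
struct backlog_policy {
	bool may_wait;	/* may sleep until the backlog drains */
	int reserve;	/* entries allowed over the backlog limit */
};

/*
 * Mirror of the patched logic: callers that cannot sleep keep the
 * reserve of five entries, and so does auditd itself, which must never
 * wait for a backlog that only it can drain.
 */
static struct backlog_policy audit_backlog_policy(bool gfp_wait,
						  int audit_pid,
						  int current_pid)
{
	struct backlog_policy p = { .may_wait = gfp_wait, .reserve = 5 };

	if (gfp_wait) {
		if (audit_pid && audit_pid == current_pid)
			p.may_wait = false;	/* auditd: do not wait on itself */
		else
			p.reserve = 0;		/* ordinary sleeping caller */
	}
	return p;
}
```

Only the ordinary sleeping caller ends up with reserve == 0 and
permission to wait; auditd is treated like an atomic caller.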


Re: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:616

2013-12-04 Thread Axel Lin
On Wed, 2013-12-04 at 13:32 -0800, Andrew Morton wrote:
> On Wed, 04 Dec 2013 16:59:38 +0800 Axel Lin  wrote:
> 
> > > 
> > > Please add a lot more printk's so we can narrow it down further?  I'd
> > > use something like 
> > > 
> > >   printk("%d: %d\n", __LINE__, preempt_count());
> > > 
> > > (note: preempt_count(), not in_atomic())
> > > 
> > > Paste that all over the place so we can see which statement is doing
> > > the wrong thing.  
> > 
> > Below is the code ( to show the line number ):
> > 
> > 459 int add_to_page_cache_locked(struct page *page, struct address_space
> > *mapping,
> >  460 pgoff_t offset, gfp_t gfp_mask)
> >  461 {
> >  462 int error;
> >  463 
> >  464 VM_BUG_ON(!PageLocked(page));
> >  465 VM_BUG_ON(PageSwapBacked(page));
> >  466 
> >  467 printk("%d: %d\n", __LINE__, preempt_count());
> >  468 error = mem_cgroup_cache_charge(page, current->mm,
> >  469 gfp_mask &
> > GFP_RECLAIM_MASK);
> >  470 printk("%d: %d\n", __LINE__, preempt_count());
> >  471 if (error)
> >  472 return error;
> >  473 
> >  474 error = radix_tree_maybe_preload(gfp_mask &
> > ~__GFP_HIGHMEM);
> >  475 printk("%d: %d\n", __LINE__, preempt_count());
> >  476 if (error) {
> >  477 mem_cgroup_uncharge_cache_page(page);
> >  478 return error;
> >  479 }
> >  480 
> >  481 page_cache_get(page);
> >  482 page->mapping = mapping;
> >  483 page->index = offset;
> >  484 
> >  485 printk("%d: %d\n", __LINE__, preempt_count());
> >  486 spin_lock_irq(&mapping->tree_lock);
> >  487 printk("%d: %d\n", __LINE__, preempt_count());
> >  488 error = radix_tree_insert(&mapping->page_tree, offset,
> > page);
> >  489 printk("%d: %d\n", __LINE__, preempt_count());
> >  490 radix_tree_preload_end();
> >  491 printk("%d: %d\n", __LINE__, preempt_count());
> >  492 if (unlikely(error))
> >  493 goto err_insert;
> >  494 printk("%d: %d\n", __LINE__, preempt_count());
> >  495 mapping->nrpages++;
> >  496 printk("%d: %d\n", __LINE__, preempt_count());
> >  497 __inc_zone_page_state(page, NR_FILE_PAGES);
> >  498 printk("%d: %d\n", __LINE__, preempt_count());
> >  499 spin_unlock_irq(&mapping->tree_lock);
> >  500 printk("%d: %d\n", __LINE__, preempt_count());
> >  501 trace_mm_filemap_add_to_page_cache(page);
> >  502 printk("%d: %d\n", __LINE__, preempt_count());
> >  503 return 0;
> >  504 err_insert:
> >  505 page->mapping = NULL;
> >  506 /* Leave page->index set: truncation relies upon it */
> >  507 spin_unlock_irq(&mapping->tree_lock);
> >  508 mem_cgroup_uncharge_cache_page(page);
> >  509 page_cache_release(page);
> >  510 printk("%d: %d\n", __LINE__, preempt_count());
> >  511 return error;
> >  512 }
> > 
> > Below is the output log:
> > 
> > VFS: Mounted root (jffs2 filesystem) on device 31:1.
> > devtmpfs: mounted
> > Freeing unused kernel memory: 92K (003a8000 - 003bf000)
> > 467: 0
> > 470: 0
> > 475: 1
> > 485: 1
> > 487: 2
> > 489: 2
> > 491: 1
> > 494: 1
> > 496: 1
> > 498: 1
> > 500: 0
> > 502: 0
> > 467: 0
> > 470: 0
> > 475: 1
> > 485: 1
> > 487: 2
> > 489: 2
> > 491: 1
> > 494: 1
> > 496: 1
> > 498: 1
> > 500: 0
> > 502: 0
> > 467: 0
> > 470: 0
> > 475: 1
> > 485: 1
> > 487: 2
> > 489: 2
> > 491: 1
> > 494: 1
> > 496: 1
> > 498: 1
> > 500: 1
> 
> blam.  spin_unlock_irq(&mapping->tree_lock) failed to decrement
> preempt_count().  What the heck.
> 
> What architecture is this?  Please send the full .config.
> 
> And exactly which kernel version is in use?

It's an ARM7TDMI SoC (GeneralPlus gpl32700 SoC).
The code is: current Linus' tree + patches for this SoC.
The patches for this SoC include:
irqchip, clocksource, pinctrl, gpio, uart, spi, sd/mmc host drivers.
I also applied an out-of-tree sdio wifi driver for the mt5931 wifi module.

I can successfully boot and run busybox when using ext2 as root, so I
don't hit the "spin_unlock_irq fails to decrement preempt_count" problem
with ext2 as root.
The storage is a SPI NOR flash, so I prefer to use jffs2, but then I get
the hang.

BTW, I got the panic below today:

467: 0
470: 0
475: 1
485: 1
487: 2
489: 2
491: 1
494: 1
496: 1
498: 1
BUG: spinlock bad magic on CPU#0, spi0/30
 lock: 0x1, .magic: 65ea0004, .owner: /0, .owner_cpu: -253386753
CPU: 0 PID: 30 Comm: spi0 Not tainted 3.13.0-rc2-00290-g4b02cef-dirty
#2035
Backtrace: Backtrace:
[] (dump_backtrace+0x0/0x108) from [] (show_stack
+0x18/0x1c)
 r6:01ce9648 r6:01ce9648 r5:0001 r5:0001 r4:43ea0004 r4:43ea0004
r3:00208040 r3:00208040

[] (show_stack+0x0/0x1c) from [<002ad4dc>] (dump_stack
+0x20/0x28)
[<002ad4bc>] (dump_stack+0x0/0x28) from [<002ac1c0>] (spin_dump
+0x80/0x94)
[<002ac140>] (spin_dump+0x0/0x94) from [<002ac200>] (spin_bug+0x2c/0x30)
 r5:0033c47a r5:0033c47a

Re: [PATCH] mutexes: Add CONFIG_DEBUG_MUTEX_FASTPATH=y debug variant to debug SMP races

2013-12-04 Thread Simon Kirby
On Tue, Dec 03, 2013 at 09:52:33AM +0100, Ingo Molnar wrote:

> Indeed: this comes from mutex->count being separate from 
> mutex->wait_lock, and this should affect every architecture that has a 
> mutex->count fast-path implemented (essentially every architecture 
> that matters).
> 
> Such bugs should also magically go away with mutex debugging enabled.

Confirmed: I ran the reproducer with CONFIG_DEBUG_MUTEXES for a few
hours, and never got a single poison overwritten notice.

> I'd expect such bugs to be more prominent with unlucky object 
> size/alignment: if mutex->count lies on a separate cache line from 
> mutex->wait_lock.
> 
> Side note: this might be a valid light weight debugging technique, we 
> could add padding between the two fields to force them into separate 
> cache lines, without slowing it down.
> 
> Simon, would you be willing to try the fairly trivial patch below? 
> Please enable CONFIG_DEBUG_MUTEX_FASTPATH=y. Does your kernel fail 
> faster that way?

I didn't see much of a change other than the incremented poison byte is
now further in due to the padding, and it shows up in kmalloc-256.
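
For readers following along, the padding idea quoted above can be sketched in
plain userspace C. This is only an illustration: struct demo_mutex,
DEMO_CACHE_LINE and the 64-byte line size are assumptions made for the
example, not the layout of the real struct mutex and not what the
CONFIG_DEBUG_MUTEX_FASTPATH patch itself does.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size; real hardware may differ. */
#define DEMO_CACHE_LINE 64

/* Hypothetical stand-in for a mutex: a fast-path counter and a slow-path lock. */
struct demo_mutex {
	int count;                               /* fast-path field */
	char pad[DEMO_CACHE_LINE - sizeof(int)]; /* pushes wait_lock onto the next line */
	int wait_lock;                           /* slow-path field */
};

/* The padding should place wait_lock exactly one assumed cache line in. */
static_assert(offsetof(struct demo_mutex, wait_lock) == DEMO_CACHE_LINE,
	      "wait_lock should start on the next assumed cache line");

/* Returns how many bytes separate the two fields. */
size_t demo_field_distance(void)
{
	return offsetof(struct demo_mutex, wait_lock) -
	       offsetof(struct demo_mutex, count);
}

/* Returns 1 when the fields cannot share a line in a line-aligned object. */
int demo_fields_on_separate_lines(void)
{
	return demo_field_distance() >= DEMO_CACHE_LINE;
}
```

The padding only changes where the second field lands; it adds no work to
the lock or unlock paths.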

I also tried with Linus' udelay() suggestion, below. With this, there
were many occurrences per second.

Simon-

diff --git a/kernel/mutex.c b/kernel/mutex.c
index d24105b..f65e735 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * In the DEBUG case we are using the "NULL fastpath" for mutexes,
@@ -740,6 +741,11 @@ __mutex_unlock_common_slowpath(atomic_t *lock_count, int 
nested)
wake_up_process(waiter->task);
}
 
+   /* udelay a bit if the spinlock isn't contended */
+   if (lock->wait_lock.rlock.raw_lock.tickets.head + 1 ==
+   lock->wait_lock.rlock.raw_lock.tickets.tail)
+   udelay(1);
+
spin_unlock_mutex(&lock->wait_lock, flags);
 }
 


Re: [PATCH v12 09/18] vmscan: shrink slab on memcg pressure

2013-12-04 Thread Vladimir Davydov
On 12/05/2013 09:01 AM, Dave Chinner wrote:
> On Wed, Dec 04, 2013 at 10:31:32AM +0400, Vladimir Davydov wrote:
>> On 12/04/2013 08:51 AM, Dave Chinner wrote:
>>> On Tue, Dec 03, 2013 at 04:15:57PM +0400, Vladimir Davydov wrote:
>>>> On 12/03/2013 02:48 PM, Dave Chinner wrote:
>>>>>> @@ -236,11 +236,17 @@ shrink_slab_node(struct shrink_control *shrinkctl, 
>>>>>> struct shrinker *shrinker,
>>>>>>  return 0;
>>>>>>  
>>>>>>  /*
>>>>>> - * copy the current shrinker scan count into a local variable
>>>>>> - * and zero it so that other concurrent shrinker invocations
>>>>>> - * don't also do this scanning work.
>>>>>> + * Do not touch global counter of deferred objects on memcg 
>>>>>> pressure to
>>>>>> + * avoid isolation issues. Ideally the counter should be 
>>>>>> per-memcg.
>>>>>>   */
>>>>>> -nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
>>>>>> +if (!shrinkctl->target_mem_cgroup) {
>>>>>> +/*
>>>>>> + * copy the current shrinker scan count into a local 
>>>>>> variable
>>>>>> + * and zero it so that other concurrent shrinker 
>>>>>> invocations
>>>>>> + * don't also do this scanning work.
>>>>>> + */
>>>>>> +nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
>>>>>> +}
>>>>> That's ugly. Effectively it means that memcg reclaim is going to be
>>>>> completely ineffective when large numbers of allocations and hence
>>>>> reclaim attempts are done under GFP_NOFS context.
>>>>>
>>>>> The only thing that keeps filesystem caches in balance when there is
>>>>> lots of filesystem work going on (i.e. lots of GFP_NOFS allocations)
>>>>> is the deferral of reclaim work to a context that can do something
>>>>> about it.
>>>> Imagine the situation: a memcg issues a GFP_NOFS allocation and goes to
>>>> shrink_slab() where it defers them to the global counter; then another
>>>> memcg issues a GFP_KERNEL allocation, also goes to shrink_slab() where
>>>> it sees a huge number of deferred objects and starts shrinking them,
>>>> which is not good IMHO.
>>> That's exactly what the deferred mechanism is for - we know we have
>>> to do the work, but we can't do it right now so let someone else do
>>> it who can.
>>>
>>> In most cases, deferral is handled by kswapd, because when a
>>> filesystem workload is causing memory pressure then most allocations
>>> are done in GFP_NOFS conditions. Hence the only memory reclaim that
>>> can make progress here is kswapd.
>>>
>>> Right now, you aren't deferring any of this memory pressure to some
>>> other agent, so it just does not get done. That's a massive problem
>>> - it's a design flaw - and instead I see lots of crazy hacks being
>>> added to do stuff that should simply be deferred to kswapd like is
>>> done for global memory pressure.
>>>
>>> Hell, kswapd should be allowed to walk memcg LRU lists and trim
>>> them, just like it does for the global lists. We only need a single
>>> "deferred work" counter per node for that - just let kswapd
>>> proportion the deferred work over the per-node LRU and the
>>> memcgs
>> Seems I misunderstand :-(
>>
>> Let me try. You mean we have the only nr_deferred counter per-node, and
>> kswapd scans
>>
>> nr_deferred*memcg_kmem_size/total_kmem_size
>>
>> objects in each memcg, right?
>>
>> Then if there were a lot of objects deferred on memcg (not global)
>> pressure due to a memcg issuing a lot of GFP_NOFS allocations, kswapd
>> will reclaim objects from all, even unlimited, memcgs. This looks like
>> an isolation issue :-/
> Which, when you are running out of memory, is much less of an
> issue than not being able to make progress reclaiming memory.
>
> Besides, the "isolation" argument runs both ways. e.g. when there
> isn't memory available, it's entirely possible it's because there is
> actually no free memory, not because we've hit a memcg limit. e.g.
> all the memory has been consumed by an unlimited memcg, and we need to
> reclaim from it so this memcg can make progress.
>
> In those situations we need to reclaim from everyone, not
> just the memcg that can't find free memory to allocate

Agree, on global overcommit we have to reclaim from all. I guess it
would be also nice to balance the reclaim proportionally to memlimit
somehow then.
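
To make the proportional split concrete, here is a small userspace sketch of
the nr_deferred*memcg_kmem_size/total_kmem_size arithmetic discussed above.
The function name and the sample numbers are made up for illustration; this
is not proposed kernel code.

```c
#include <assert.h>

/*
 * Hypothetical helper: the share of a per-node deferred count that would be
 * scanned in one memcg, proportional to that memcg's kmem usage.
 */
unsigned long demo_memcg_scan_share(unsigned long nr_deferred,
				    unsigned long memcg_kmem_size,
				    unsigned long total_kmem_size)
{
	/* Nothing to distribute if no kmem is accounted at all. */
	if (total_kmem_size == 0)
		return 0;

	/* A memcg cannot hold more kmem than the total. */
	assert(memcg_kmem_size <= total_kmem_size);

	/* Multiply first to keep precision; assumes the product fits. */
	return nr_deferred * memcg_kmem_size / total_kmem_size;
}
```

With 1000 deferred objects and a memcg holding 256 of 1024 kmem units, that
memcg would be asked to scan 250 objects whether or not it has a limit set,
which is the isolation concern raised above.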

>> Currently we have a per-node nr_deferred counter for each shrinker. If
>> we add per-memcg reclaim, we have to make it per-memcg per-node, don't we?
> Think about what you just said for a moment. We have how many memcg
> shrinkers?  And we can support how many nodes? And we can support
> how many memcgs? And when we multiply that all together, how much
> memory do we need to track that?

But we could grow nr_deferred dynamically as the number of kmem-active
memcgs grows just like we're going to grow list_lru. Then the overhead
would not be that big, it would be practically 0 if there is no
kmem-

Re: [PATCH/RFC] sh: Always link in helper functions extracted from libgcc

2013-12-04 Thread Nobuhiro Iwamatsu
# Add a...@linux-foundation.org and sta...@vger.kernel.org to CC.

Hi, Geert.

I was just about to create the same patch when I noticed yours by chance.
I think your proposal is good.

But Paul does not maintain the SH tree.
Andrew, could you pick up this patch?

Tested-by: Nobuhiro Iwamatsu 
Reviewed-by: Nobuhiro Iwamatsu 

Best regards,
  Nobuhiro

2013/5/29 Geert Uytterhoeven :
> E.g. landisk_defconfig, which has CONFIG_NTFS_FS=m:
>
> ERROR: "__ashrdi3" [fs/ntfs/ntfs.ko] undefined!
>
> For "lib-y", if no symbols in a compilation unit are referenced by other
> units, the compilation unit will not be included in vmlinux.
> This breaks modules that do reference those symbols.
>
> Use "obj-y" instead to fix this.
>
> Signed-off-by: Geert Uytterhoeven 
> ---
> http://kisskb.ellerman.id.au/kisskb/buildresult/8838077/
>
> This doesn't fix all cases. There are others, e.g. udivsi3.
> This is also not limited to sh, many architectures handle this in the same
> way.
>
> A simple solution is to unconditionally include all helper functions.
> A more complex solution is to make the choice of "lib-y" or "obj-y" depend
> on CONFIG_MODULES:
>   obj-$(CONFIG_MODULES) += ...
>   lib-y($CONFIG_MODULES) += ...
>
> What do you think?
> Thanks for your comments!
>
>  arch/sh/lib/Makefile |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/sh/lib/Makefile b/arch/sh/lib/Makefile
> index 7b95f29..3baff31 100644
> --- a/arch/sh/lib/Makefile
> +++ b/arch/sh/lib/Makefile
> @@ -6,7 +6,7 @@ lib-y  = delay.o memmove.o memchr.o \
>  checksum.o strlen.o div64.o div64-generic.o
>
>  # Extracted from libgcc
> -lib-y += movmem.o ashldi3.o ashrdi3.o lshrdi3.o \
> +obj-y += movmem.o ashldi3.o ashrdi3.o lshrdi3.o \
>  ashlsi3.o ashrsi3.o ashiftrt.o lshrsi3.o \
>  udiv_qrnnd.o
>
> --
> 1.7.0.4
>

-- 
Nobuhiro Iwamatsu


[PATCH] usb: chipidea: fix device tree binding for zevio/nspire usb driver

2013-12-04 Thread dt . tangr
From: Daniel Tang 

The device tree binding chosen for the nspire-usb driver was inappropriate and
should be renamed to lsi,zevio-usb.

References to nspire have been replaced with zevio (the SoC name).

Signed-off-by: Daniel Tang 
---
 .../devicetree/bindings/usb/ci-hdrc-nspire.txt |   17 -
 .../devicetree/bindings/usb/ci-hdrc-zevio.txt  |   17 +
 drivers/usb/chipidea/Makefile  |2 +-
 drivers/usb/chipidea/ci_hdrc_nspire.c  |   72 
 drivers/usb/chipidea/ci_hdrc_zevio.c   |   72 
 5 files changed, 90 insertions(+), 90 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
 create mode 100644 Documentation/devicetree/bindings/usb/ci-hdrc-zevio.txt
 delete mode 100644 drivers/usb/chipidea/ci_hdrc_nspire.c
 create mode 100644 drivers/usb/chipidea/ci_hdrc_zevio.c

diff --git a/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt 
b/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
deleted file mode 100644
index ef1fcbf..000
--- a/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
+++ /dev/null
@@ -1,17 +0,0 @@
-* TI-Nspire USB OTG Controller
-
-Required properties:
-- compatible: Should be "lsi,nspire-usb"
-- reg: Should contain registers location and length
-- interrupts: Should contain controller interrupt
-
-Recommended properies:
-- vbus-supply: regulator for vbus
-
-Examples:
-   usb0: usb@B000 {
-   reg = <0xB000 0x1000>;
-   compatible = "lsi,nspire-usb";
-   interrupts = <8>;
-   vbus-supply = <&vbus_reg>;
-   };
diff --git a/Documentation/devicetree/bindings/usb/ci-hdrc-zevio.txt 
b/Documentation/devicetree/bindings/usb/ci-hdrc-zevio.txt
new file mode 100644
index 000..b6c3b5c
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/ci-hdrc-zevio.txt
@@ -0,0 +1,17 @@
+* TI-Nspire USB OTG Controller
+
+Required properties:
+- compatible: Should be "lsi,zevio-usb"
+- reg: Should contain registers location and length
+- interrupts: Should contain controller interrupt
+
+Recommended properies:
+- vbus-supply: regulator for vbus
+
+Examples:
+   usb0: usb@B000 {
+   reg = <0xB000 0x1000>;
+   compatible = "lsi,zevio-usb";
+   interrupts = <8>;
+   vbus-supply = <&vbus_reg>;
+   };
diff --git a/drivers/usb/chipidea/Makefile b/drivers/usb/chipidea/Makefile
index 245ea4d..7635407 100644
--- a/drivers/usb/chipidea/Makefile
+++ b/drivers/usb/chipidea/Makefile
@@ -10,7 +10,7 @@ ci_hdrc-$(CONFIG_USB_CHIPIDEA_DEBUG)  += debug.o
 # Glue/Bridge layers go here
 
 obj-$(CONFIG_USB_CHIPIDEA) += ci_hdrc_msm.o
-obj-$(CONFIG_USB_CHIPIDEA) += ci_hdrc_nspire.o
+obj-$(CONFIG_USB_CHIPIDEA) += ci_hdrc_zevio.o
 
 # PCI doesn't provide stubs, need to check
 ifneq ($(CONFIG_PCI),)
diff --git a/drivers/usb/chipidea/ci_hdrc_nspire.c 
b/drivers/usb/chipidea/ci_hdrc_nspire.c
deleted file mode 100644
index c5c2dde..000
--- a/drivers/usb/chipidea/ci_hdrc_nspire.c
+++ /dev/null
@@ -1,72 +0,0 @@
-/*
- * Copyright (C) 2013 Daniel Tang 
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2, as
- * published by the Free Software Foundation.
- *
- * Based off drivers/usb/chipidea/ci_hdrc_msm.c
- *
- */
-
-#include 
-#include 
-#include 
-#include 
-
-#include "ci.h"
-
-static struct ci_hdrc_platform_data ci_hdrc_nspire_platdata = {
-   .name   = "ci_hdrc_nspire",
-   .flags  = CI_HDRC_REGS_SHARED,
-   .capoffset  = DEF_CAPOFFSET,
-};
-
-static int ci_hdrc_nspire_probe(struct platform_device *pdev)
-{
-   struct platform_device *ci_pdev;
-
-   dev_dbg(&pdev->dev, "ci_hdrc_nspire_probe\n");
-
-   ci_pdev = ci_hdrc_add_device(&pdev->dev,
-   pdev->resource, pdev->num_resources,
-   &ci_hdrc_nspire_platdata);
-
-   if (IS_ERR(ci_pdev)) {
-   dev_err(&pdev->dev, "ci_hdrc_add_device failed!\n");
-   return PTR_ERR(ci_pdev);
-   }
-
-   platform_set_drvdata(pdev, ci_pdev);
-
-   return 0;
-}
-
-static int ci_hdrc_nspire_remove(struct platform_device *pdev)
-{
-   struct platform_device *ci_pdev = platform_get_drvdata(pdev);
-
-   ci_hdrc_remove_device(ci_pdev);
-
-   return 0;
-}
-
-static const struct of_device_id ci_hdrc_nspire_dt_ids[] = {
-   { .compatible = "lsi,nspire-usb", },
-   { /* sentinel */ }
-};
-
-static struct platform_driver ci_hdrc_nspire_driver = {
-   .probe = ci_hdrc_nspire_probe,
-   .remove = ci_hdrc_nspire_remove,
-   .driver = {
-   .name = "nspire_usb",
-   .owner = THIS_MODULE,
-   .of_match_table = ci_hdrc

Re: [PATCH v3 04/10] usb: dwc3: use quirks to know if a particualr platform doesn't have PHY

2013-12-04 Thread Kishon Vijay Abraham I
Hi,

On Wednesday 04 December 2013 08:10 PM, Heikki Krogerus wrote:
> Hi guys,
> 
> Kishon, sorry I did not see this v3 set.
> 
> On Mon, Nov 25, 2013 at 03:31:24PM +0530, Kishon Vijay Abraham I wrote:
>> There can be systems which does not have an external phy, so get
>> phy only if no quirks are added that indicates the PHY is not present.
>> Introduced two quirk flags to indicate the *absence* of usb2 phy and
>> usb3 phy. Also remove checking if return value is -ENXIO since it's now
>> changed to always enable usb_phy layer.
> 
> Can you guys explain why is something like this needed? Like with
> clocks and gpios, the device drivers shouldn't need to care any more
> if the platform has the phys or not. -ENODEV tells you your platform

Shouldn't we report it when a particular platform needs a PHY and is not able
to get it? How will a user know that a particular controller is not working
because it's not able to get and initialize the PHYs? Don't you think in such
cases it's better to fail (and return from probe), because the controller will
not work anyway without the PHY?
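
A minimal userspace sketch of that behaviour, assuming quirk flags that mark
the absence of a PHY. The flag names, struct demo_platform, demo_get_phy()
and demo_probe() are all hypothetical and are not the dwc3 code; only the
use of -ENODEV for a missing PHY is taken from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical quirk bits marking the *absence* of a PHY. */
#define DEMO_QUIRK_NO_USB2_PHY (1u << 0)
#define DEMO_QUIRK_NO_USB3_PHY (1u << 1)

struct demo_platform {
	unsigned int quirks;  /* set by the platform glue */
	int usb2_phy_present; /* what a PHY lookup would find */
	int usb3_phy_present;
};

/* Stand-in for a PHY lookup: 0 on success, -ENODEV if no PHY is found. */
int demo_get_phy(int present)
{
	return present ? 0 : -ENODEV;
}

/* Probe fails only when a PHY is expected (no quirk) but cannot be found. */
int demo_probe(const struct demo_platform *p)
{
	int ret;

	assert(p != NULL);

	if (!(p->quirks & DEMO_QUIRK_NO_USB2_PHY)) {
		ret = demo_get_phy(p->usb2_phy_present);
		if (ret)
			return ret;
	}
	if (!(p->quirks & DEMO_QUIRK_NO_USB3_PHY)) {
		ret = demo_get_phy(p->usb3_phy_present);
		if (ret)
			return ret;
	}
	return 0;
}
```

In this sketch a platform without a USB3 PHY probes successfully only if it
declares the quirk; otherwise the missing PHY is reported as an error.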

Thanks
Kishon


Re: Need help on Linux PCIe

2013-12-04 Thread Jagan Teki
On Wed, Dec 4, 2013 at 11:35 PM, Bjorn Helgaas  wrote:
> On Wed, Dec 4, 2013 at 10:00 AM, Jagan Teki  wrote:
>> On Wed, Dec 4, 2013 at 8:41 PM, Bjorn Helgaas  wrote:
>>> On Tue, Dec 3, 2013 at 11:20 PM, Jagan Teki  
>>> wrote:
>>>> Thanks for your quick response.
>>>> Please find my comments below.
>>>>
>>>> On Tue, Dec 3, 2013 at 11:09 PM, Bjorn Helgaas  wrote:
>>>>> On Tue, Dec 3, 2013 at 4:24 AM, Jagan Teki  
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I have few question on Linux PCIe subsystem, I am trying to understand
>>>>>> the PCIe on ARM platform.
>>>>>> 1. Compared to PCI, PCIe have an extra port functionalists/services
>>>>>> which is implemented drivers/pci/pcie/* is it true?
>>>>>
>>>>> Yes.
>>>>>
>>>>>> 2. PCIe root complex is same as Host controller drivers in linux 
>>>>>> drivers/host/*
>>>>>
>>>>> Yes.
>>>>>
>>>>>> 3. As individual endpoint drivers are registered to pci_core as
>>>>>> pci_driver_register, then what is the common call for registering
>>>>>> individual HC driver to pci-core?
>>>>>
>>>>> The host controller-PCI core interface is not as mature as the
>>>>> pci_register_driver() interface.  The basic interface is
>>>>> pci_scan_root_bus().  If you skim through the drivers in
>>>>> drivers/pci/host/* and drivers/acpi/pci_root.c, the interface to the
>>>>> PCI core will be fairly obvious.  And you'll learn what the existing
>>>>> practices are in case you need to add or modify something.
>>>>
>>>> OK.
>>>>
>>>> I understand the flow as below - please correct if am wrong.
>>>>
>>>> From low level (hw) - HC driver has a platform registration using
>>>> platform_driver_register() to lower layer
>>>> and then pci_scan_root_bus() --> pci_common_init_dev() registration to
>>>> upper layer as PCI - BIOS and then ends.
>>>
>>> Yes.  Sometime HC drivers use platform_driver_register(); other use
>>> something else depending on how the HC device is enumerated.  For
>>> example, drivers/acpi/pci_root.c uses something else to deal with host
>>> bridges in the ACPI namespace.
>>>
>>>> From upper level (app) - each endpoint driver has
>>>> pci_driver_register() call to PCI Core for lower level
>>>
>>> Yes.
>>>
>>>> and then the upper level registration is based on endpoint().
>>>
>>> I don't know what you mean here (I don't know of a function named
>>> "endpoint()").  But the driver model matches drivers to PCI functions
>>> based on vendor and device IDs.  A Linux "pci_dev" is what the PCI
>>> specs refer to as a "function."
>> Sorry it's typo - added ()
>>
>>>
>>>> What is the connection here for PCI-BIOS and PCI-Core here, does these
>>>> are two different entities means there is no common call for these?
>>>> I see for ARM - "arch/arm/kernel/bios32.c" is PCI-BIOS is it correct?
>>>> does we have separate BIOS codes for architectures?
>>>
>>> The "pcibios_*" functions are architecture-specific things called by
>>> the generic PCI core.  Generally, things specified by the PCI specs
>>> are architecture-independent and should be in the PCI core
>>> (drivers/pci/*).
>>
>> I have some good information to discuss from this thread.
>> Can you please verify this Linux PCIe subsystem stack - comment
>> whether my understanding is correct/not.
>> (I just draw this based on driver calls flow - to accommodate with in
>> the Linux cores)
>> http://jagannadhteki.blog.com/2013/12/04/linux-pcie-subsystem/
>
> Yes, that makes sense.  I wouldn't label the PCIBIOS - PCI core link
> as "pci_bus_add_device()"; pci_bus_add_device() is part of the PCI
> core's generic enumeration code and shouldn't be called by
> arch-specific code.  The link going from PCI core to PCIBIOS is the
> set of "pcibios_*()" functions.  Going from PCIBIOS to the PCI core,
> it's mostly just pci_scan_root_bus().
Yes, I understand your point.
I made a few changes accordingly.
http://jagannadhteki.blog.com/files/2013/12/Linux_PCIe_zynq.png

I am planning to document this subsystem in Documentation/PCI/*
with a brief description of the important blocks. Any comments?

> I also probably wouldn't put in links between VFS and AER, HP, PME,
> and VC.  It's true that there are some sysfs files that influence the
> operation of those PCIe features, but mostly for debugging and
> administration.  They aren't something useful to ordinary user
> programs.
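
As a possible illustration for such documentation, the ID-based matching
described earlier in this thread (drivers are bound to PCI functions by
vendor and device ID) can be sketched in userspace C. All demo_* names and
the wildcard value are invented for this example; the real kernel uses
struct pci_device_id tables registered through pci_register_driver().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wildcard meaning "match any ID". */
#define DEMO_ANY_ID 0xffffu

struct demo_pci_id {
	unsigned int vendor;
	unsigned int device;
};

/*
 * Walk a table terminated by an all-zero sentinel entry and return the
 * first entry matching the given function, or NULL if none matches.
 */
const struct demo_pci_id *demo_match(const struct demo_pci_id *table,
				     unsigned int vendor, unsigned int device)
{
	assert(table != NULL);

	for (; table->vendor || table->device; table++) {
		if ((table->vendor == DEMO_ANY_ID || table->vendor == vendor) &&
		    (table->device == DEMO_ANY_ID || table->device == device))
			return table;
	}
	return NULL;
}
```

A driver would supply one such table, and the core would call the matcher
for each enumerated function to decide whether to invoke the driver's probe.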

-- 
Thanks,
Jagan.

Jagannadha Sutradharudu Teki,
E: jagannadh.t...@gmail.com, P: +91-9676773388
Engineer - System Software Hacker
U-boot - SPI Custodian and Zynq APSOC
Ln: http://www.linkedin.com/in/jaganteki


[PATCHv3] irqchip: Add support for TI-NSPIRE irqchip

2013-12-04 Thread dt . tangr
From: Daniel Tang 

This patch adds support for the interrupt controllers found in some
TI-Nspire models.

FIQ support was taken out to simplify the driver
code and may be added in later. Since Linux on this platform doesn't
really use FIQs, this wasn't really that important in the first
place.

Signed-off-by: Daniel Tang 
Acked-by: Grant Likely 
---
Changes from v1 to v2:
* Converted to use generic IRQ chips.
* Removed FIQ for now to simplify driver code.
* Based against tip/irq/core and uses IRQ domain support for generic
 chips.

Changes from v2 to v3:
* Removed unnecessary locking in interrupt acking function
* Mark zevio_init_irq_base as __init since it's only used at init

 .../interrupt-controller/lsi,zevio-intc.txt|   18 +++
 drivers/irqchip/Makefile   |1 +
 drivers/irqchip/irq-zevio.c|  127 
 3 files changed, 146 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/interrupt-controller/lsi,zevio-intc.txt
 create mode 100644 drivers/irqchip/irq-zevio.c

diff --git 
a/Documentation/devicetree/bindings/interrupt-controller/lsi,zevio-intc.txt 
b/Documentation/devicetree/bindings/interrupt-controller/lsi,zevio-intc.txt
new file mode 100644
index 000..aee38e7
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/lsi,zevio-intc.txt
@@ -0,0 +1,18 @@
+TI-NSPIRE interrupt controller
+
+Required properties:
+- compatible: Compatible property value should be "lsi,zevio-intc".
+
+- reg: Physical base address of the controller and length of memory mapped
+   region.
+
+- interrupt-controller : Identifies the node as an interrupt controller
+
+Example:
+
+interrupt-controller {
+   compatible = "lsi,zevio-intc";
+   interrupt-controller;
+   reg = <0xDC00 0x1000>;
+   #interrupt-cells = <1>;
+};
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index c60b901..292b106 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -20,5 +20,6 @@ obj-$(CONFIG_SIRF_IRQ)+= irq-sirfsoc.o
 obj-$(CONFIG_RENESAS_INTC_IRQPIN)  += irq-renesas-intc-irqpin.o
 obj-$(CONFIG_RENESAS_IRQC) += irq-renesas-irqc.o
 obj-$(CONFIG_VERSATILE_FPGA_IRQ)   += irq-versatile-fpga.o
+obj-$(CONFIG_ARCH_NSPIRE)  += irq-zevio.o
 obj-$(CONFIG_ARCH_VT8500)  += irq-vt8500.o
 obj-$(CONFIG_TB10X_IRQC)   += irq-tb10x.o
diff --git a/drivers/irqchip/irq-zevio.c b/drivers/irqchip/irq-zevio.c
new file mode 100644
index 000..3f52bb7
--- /dev/null
+++ b/drivers/irqchip/irq-zevio.c
@@ -0,0 +1,127 @@
+/*
+ *  linux/drivers/irqchip/irq-zevio.c
+ *
+ *  Copyright (C) 2013 Daniel Tang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include "irqchip.h"
+
+#define IO_STATUS  0x000
+#define IO_RAW_STATUS  0x004
+#define IO_ENABLE  0x008
+#define IO_DISABLE 0x00C
+#define IO_CURRENT 0x020
+#define IO_RESET   0x028
+#define IO_MAX_PRIOTY  0x02C
+
+#define IO_IRQ_BASE0x000
+#define IO_FIQ_BASE0x100
+
+#define IO_INVERT_SEL  0x200
+#define IO_STICKY_SEL  0x204
+#define IO_PRIORITY_SEL0x300
+
+#define MAX_INTRS  32
+#define FIQ_START  MAX_INTRS
+
+static struct irq_domain *zevio_irq_domain;
+static void __iomem *zevio_irq_io;
+
+static void zevio_irq_ack(struct irq_data *irqd)
+{
+   struct irq_chip_generic *gc = irq_data_get_irq_chip_data(irqd);
+   struct irq_chip_regs *regs =
+   &container_of(irqd->chip, struct irq_chip_type, chip)->regs;
+
+   readl(gc->reg_base + regs->ack);
+}
+
+static void __init zevio_init_irq_base(void __iomem *base)
+{
+   /* Disable all interrupts */
+   writel(~0, base + IO_DISABLE);
+
+   /* Accept interrupts of all priorities */
+   writel(0xF, base + IO_MAX_PRIOTY);
+
+   /* Reset existing interrupts */
+   readl(base + IO_RESET);
+}
+
+asmlinkage void __exception_irq_entry zevio_handle_irq(struct pt_regs *regs)
+{
+   int irqnr;
+
+   while (readl(zevio_irq_io + IO_STATUS)) {
+   irqnr = readl(zevio_irq_io + IO_CURRENT);
+   irqnr = irq_find_mapping(zevio_irq_domain, irqnr);
+   handle_IRQ(irqnr, regs);
+   };
+}
+
+static int __init zevio_of_init(struct device_node *node,
+   struct device_node *parent)
+{
+   unsigned int clr = IRQ_NOREQUEST | IRQ_NOPROBE | IRQ_NOAUTOEN;
+   struct irq_chip_generic *gc;
+   int ret;
+
+   if (WARN_ON(zevio_irq_io || zevio_irq_domain))
+   return -EBUSY;
+
+   zevio_irq_io = of_iomap(node, 0);
+   BUG_ON(!zevio_irq_io);
+
+   /* Do not invert interrupt status bits */
+   writel(~0, zevio_irq_io + IO_INVERT_

Re: [PATCH 0/2] ceph: Add clean up if invalid osd reply received

2013-12-04 Thread Sage Weil
Applied both of these to the testing branch.  Thanks!

On Wed, 27 Nov 2013, Li Wang wrote:

> Signed-off-by: Li Wang 
> Signed-off-by: Yunchuan Wen 
> 
> Li Wang (2):
>   ceph: Clean up if error occurred in finish_read()
>   ceph: Add necessary clean up if invalid reply received in
> handle_reply()
> 
>  fs/ceph/addr.c|3 +++
>  net/ceph/osd_client.c |7 +++
>  2 files changed, 10 insertions(+)
> 
> -- 
> 1.7.9.5
> 


[PATCH CFT] ARM:MMP: Enable ARM_PATCH_PHYS_VIRT and ZRELADDR default

2013-12-04 Thread panchaxari
ARM_PATCH_PHYS_VIRT and AUTO_ZRELADDR have been enabled as default configs
for Marvell PXA168/910/MMP2 platforms.

Introducing PHYS_VIRT as a default config enables the phys-to-virt and
virt-to-phys translation functions at boot and module loading time
and enforces dynamic reallocation of memory. The AUTO_ZRELADDR config
enables calculation of the kernel load address at run time.

PHYS_VIRT is mutually exclusive with XIP_KERNEL (XIP_KERNEL is used in
systems with NOR flash devices), and ZRELADDR is mutually exclusive
with ZBOOT_ROM.

CFT: Call For Testing

Requesting maintainers of Marvell PXA168/910/MMP2 platforms to evaluate the
changes on the board and comment, as I don't have the board for testing, and
also requesting an ACK.

Signed-off-by: panchaxari 
Cc: Eric Miao 
Cc: Haojian Zhuang 
Cc: Russell King 
Cc: Linus Walleij 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/Kconfig |2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 79ba1a8..13621ed 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -560,6 +560,8 @@ config ARCH_MMP
bool "Marvell PXA168/910/MMP2"
depends on MMU
select ARCH_REQUIRE_GPIOLIB
+   select ARM_PATCH_PHYS_VIRT
+   select AUTO_ZRELADDR
select CLKDEV_LOOKUP
select GENERIC_ALLOCATOR
select GENERIC_CLOCKEVENTS
-- 
1.7.10.4



Re: [PATCH RFC] mm readahead: Fix the readahead fail in case of empty numa node

2013-12-04 Thread Raghavendra K T

On 12/05/2013 03:18 AM, Andrew Morton wrote:

On Wed, 04 Dec 2013 14:38:11 +0530 Raghavendra K T 
 wrote:


On 12/04/2013 02:11 PM, Andrew Morton wrote:

:
: This patch takes it all out and applies the same upper limit as is used in
: sys_readahead() - half the inactive list.
:
: +/*
: + * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
: + * sensible upper limit.
: + */
: +unsigned long max_sane_readahead(unsigned long nr)
: +{
: +   unsigned long active;
: +   unsigned long inactive;
: +
: +   get_zone_counts(&active, &inactive);
: +   return min(nr, inactive / 2);
: +}



Hi Andrew, Thanks for digging out. So it seems like earlier we had not
even considered free pages?


And one would need to go back further still to understand the rationale
for the sys_readahead() decision and that even predates the BK repo.

iirc the thinking was that we need _some_ limit on readahead size so
the user can't go and do ridiculously large amounts of readahead via
sys_readahead().  But that doesn't make a lot of sense because the user
could do the same thing with plain old read().



True.


So for argument's sake I'm thinking we just kill it altogether and
permit arbitrarily large readahead:

--- a/mm/readahead.c~a
+++ a/mm/readahead.c
@@ -238,13 +238,12 @@ int force_page_cache_readahead(struct ad
  }

  /*
- * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
- * sensible upper limit.
+ * max_sane_readahead() is disabled.  It can later be removed altogether, but
+ * let's keep a skeleton in place for now, in case disabling was the wrong 
call.
   */
  unsigned long max_sane_readahead(unsigned long nr)
  {
-   return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE_FILE)
-   + node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2);
+   return nr;
  }



I had something like below in mind for posting.  But it looks
simple now with your patch.


 unsigned long max_sane_readahead(unsigned long nr)
 {
	int nid;
	unsigned long free_page = 0;

	for_each_node_state(nid, N_MEMORY)
		free_page += node_page_state(nid, NR_INACTIVE_FILE)
				+ node_page_state(nid, NR_FREE_PAGES);

	/*
	 * Readahead onto remote memory is better than no readahead when local
	 * numa node does not have memory. We sanitize readahead size depending
	 * on potential free memory in the whole system.
	 */
	return min(nr, free_page / (2 * nr_node_ids));
 }

Or, if we wanted to avoid iterating over the nodes, we could simply return
something like nr/8 for remote NUMA fault cases.
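
To make the arithmetic of the per-node-averaged variant explicit, here is a
userspace sketch. The struct demo_node array stands in for node_page_state()
and the node iteration, memoryless nodes are simply entries with zero
counts, and the names and numbers are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node counters, in pages. */
struct demo_node {
	unsigned long inactive_file;
	unsigned long free_pages;
};

/*
 * Cap a readahead request at half of the average per-node amount of
 * inactive-file plus free pages, summed over the whole system.
 */
unsigned long demo_max_sane_readahead(unsigned long nr,
				      const struct demo_node *nodes,
				      size_t nr_nodes)
{
	unsigned long free_page = 0;
	unsigned long limit;
	size_t i;

	assert(nodes != NULL && nr_nodes > 0);

	for (i = 0; i < nr_nodes; i++)
		free_page += nodes[i].inactive_file + nodes[i].free_pages;

	limit = free_page / (2 * nr_nodes);
	return nr < limit ? nr : limit;
}
```

With a memoryless node 0 and a node 1 holding 3000 inactive-file and 1000
free pages, the limit is 4000 / (2 * 2) = 1000 pages, so a task running on
node 0 still gets readahead instead of a limit of zero.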




[PATCH] ceph: initialize inode before instantiating dentry

2013-12-04 Thread Yan, Zheng
From: "Yan, Zheng" 

commit b18825a7c8 (Put a small type field into struct dentry::d_flags)
put a type field into struct dentry::d_flags. __d_instantiate() sets the
field by checking inode->i_mode, so we should initialize the inode before
instantiating the dentry when handling an MDS reply.

Fixes: #6930
Signed-off-by: Yan, Zheng 
---
 fs/ceph/inode.c | 114 +---
 1 file changed, 43 insertions(+), 71 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 9a8e396..c1a9367 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -978,7 +978,6 @@ int ceph_fill_trace(struct super_block *sb, struct 
ceph_mds_request *req,
struct ceph_mds_reply_inode *ininfo;
struct ceph_vino vino;
struct ceph_fs_client *fsc = ceph_sb_to_client(sb);
-   int i = 0;
int err = 0;
 
dout("fill_trace %p is_dentry %d is_target %d\n", req,
@@ -1039,6 +1038,29 @@ int ceph_fill_trace(struct super_block *sb, struct 
ceph_mds_request *req,
}
}
 
+   if (rinfo->head->is_target) {
+   vino.ino = le64_to_cpu(rinfo->targeti.in->ino);
+   vino.snap = le64_to_cpu(rinfo->targeti.in->snapid);
+
+   in = ceph_get_inode(sb, vino);
+   if (IS_ERR(in)) {
+   err = PTR_ERR(in);
+   goto done;
+   }
+   req->r_target_inode = in;
+
+   err = fill_inode(in, &rinfo->targeti, NULL,
+   session, req->r_request_started,
+   (le32_to_cpu(rinfo->head->result) == 0) ?
+   req->r_fmode : -1,
+   &req->r_caps_reservation);
+   if (err < 0) {
+   pr_err("fill_inode badness %p %llx.%llx\n",
+   in, ceph_vinop(in));
+   goto done;
+   }
+   }
+
/*
 * ignore null lease/binding on snapdir ENOENT, or else we
 * will have trouble splicing in the virtual snapdir later
@@ -1108,7 +1130,6 @@ int ceph_fill_trace(struct super_block *sb, struct 
ceph_mds_request *req,
 ceph_dentry(req->r_old_dentry)->offset);
 
dn = req->r_old_dentry;  /* use old_dentry */
-   in = dn->d_inode;
}
 
/* null dentry? */
@@ -1130,44 +1151,28 @@ int ceph_fill_trace(struct super_block *sb, struct 
ceph_mds_request *req,
}
 
/* attach proper inode */
-   ininfo = rinfo->targeti.in;
-   vino.ino = le64_to_cpu(ininfo->ino);
-   vino.snap = le64_to_cpu(ininfo->snapid);
-   in = dn->d_inode;
-   if (!in) {
-   in = ceph_get_inode(sb, vino);
-   if (IS_ERR(in)) {
-   pr_err("fill_trace bad get_inode "
-  "%llx.%llx\n", vino.ino, vino.snap);
-   err = PTR_ERR(in);
-   d_drop(dn);
-   goto done;
-   }
+   if (!dn->d_inode) {
+   ihold(in);
dn = splice_dentry(dn, in, &have_lease, true);
if (IS_ERR(dn)) {
err = PTR_ERR(dn);
goto done;
}
req->r_dentry = dn;  /* may have spliced */
-   ihold(in);
-   } else if (ceph_ino(in) == vino.ino &&
-  ceph_snap(in) == vino.snap) {
-   ihold(in);
-   } else {
+   } else if (dn->d_inode && dn->d_inode != in) {
dout(" %p links to %p %llx.%llx, not %llx.%llx\n",
-dn, in, ceph_ino(in), ceph_snap(in),
-vino.ino, vino.snap);
+dn, dn->d_inode, ceph_vinop(dn->d_inode),
+ceph_vinop(in));
have_lease = false;
-   in = NULL;
}
 
if (have_lease)
update_dentry_lease(dn, rinfo->dlease, session,
req->r_request_started);
dout(" final dn %p\n", dn);
-   i++;
-   } else if ((req->r_op == CEPH_MDS_OP_LOOKUPSNAP ||
-  req->r_op == CEPH_MDS_OP_MKSNAP) && !req->r_aborted) {
+   } else if (!req->r_aborted &&
+  (req->r_op == CEPH_MDS_OP_LOOKUPSNAP ||
+   req->r_op == CEPH_MDS_OP_MKSNAP)) {
struct dentry *dn = req->r_dentry;
 
/* fill out a snapdir LOOKUPSNAP dentry */
@@ -1177,52 +1182,15 @@ int ceph_fill_trace(struct super_block *sb, struct 
ceph_mds_reques

Re: [PATCH 1/2] usb: chipidea: fix mistake in device tree binding of nspire-usb to use vendor name 'lsi' instead of SoC name 'zevio'

2013-12-04 Thread Daniel Tang
Hi,

On 04/12/2013, at 11:18 PM, Mark Rutland  wrote:

>> 
>> 
>> Required properties:
>> -- compatible: Should be "zevio,nspire-usb"
>> +- compatible: Should be "lsi,nspire-usb"
> 
> Surely this should be lsi,zevio-usb, matching the lsi,zevio-timer
> binding?

You're right. I'll fix up the patch and send it back in.

> 
> Thanks,
> Mark.

Cheers,
Daniel Tang


Re: [PATCH 1/2] usb: chipidea: fix mistake in device tree binding of nspire-usb to use vendor name 'lsi' instead of SoC name 'zevio'

2013-12-04 Thread Daniel Tang
Hi,

On 05/12/2013, at 12:44 AM, Peter Chen  wrote:

> 
> lsi is vendor name, what are zevio and nspire?
> Usually, the compatible string should be "vendor_name,soc_name-module_name"
> 

Because this port uses documentation from reverse engineering, it's difficult 
to work out what is SoC specific and what is device specific. The SoC is Zevio 
but the driver is written for the TI-Nspire.

If it's usually "vendor_name,soc_name-module_name", I'll fix up this patch with 
zevio instead of nspire (and it'll be more consistent with the other drivers).
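
The "vendor_name,soc_name-module_name" convention discussed above can be
sketched in a few lines of C. This is only an illustration: the helper
format_compatible() is hypothetical and not a kernel API, and the strings
"lsi", "zevio" and "usb" are simply the ones proposed in this thread.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build a device tree compatible string of the form "vendor,soc-module".
 * Hypothetical helper for illustration only; not part of the kernel.
 * Returns 0 on success, -1 if the result does not fit into buf.
 */
static int format_compatible(char *buf, size_t len, const char *vendor,
			     const char *soc, const char *module)
{
	int n = snprintf(buf, len, "%s,%s-%s", vendor, soc, module);

	/* snprintf() returns the length it wanted to write */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Called with ("lsi", "zevio", "usb") and a large enough buffer, the helper
produces "lsi,zevio-usb", which matches the existing lsi,zevio-timer binding
mentioned earlier in the thread.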

>> - reg: Should contain registers location and length
>> - interrupts: Should contain controller interrupt
>> 
>> @@ -11,7 +11,7 @@ Recommended properies:
>> Examples:
>>  usb0: usb@B000 {
>>  reg = <0xB000 0x1000>;
>> -compatible = "zevio,nspire-usb";
>> +compatible = "lsi,nspire-usb";
>>  interrupts = <8>;
>>  vbus-supply = <&vbus_reg>;
>>  };
>> diff --git a/drivers/usb/chipidea/ci_hdrc_nspire.c b/drivers/usb/chipidea/ci_hdrc_nspire.c
>> index 517ce41..c5c2dde 100644
>> --- a/drivers/usb/chipidea/ci_hdrc_nspire.c
>> +++ b/drivers/usb/chipidea/ci_hdrc_nspire.c
>> @@ -52,7 +52,7 @@ static int ci_hdrc_nspire_remove(struct platform_device *pdev)
>> }
>> 
>> static const struct of_device_id ci_hdrc_nspire_dt_ids[] = {
>> -{ .compatible = "zevio,nspire-usb", },
>> +{ .compatible = "lsi,nspire-usb", },
>>  { /* sentinel */ }
>> };
>> 
>> -- 
>> 1.7.10.4
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
> 
> -- 
> 
> Best Regards,
> Peter Chen
> 

Cheers,
Daniel Tang


Re: [RFC PATCH tip 4/5] use BPF in tracing filters

2013-12-04 Thread Alexei Starovoitov
On Wed, Dec 4, 2013 at 4:05 PM, Masami Hiramatsu
 wrote:
> (2013/12/04 10:11), Steven Rostedt wrote:
>> On Wed, 04 Dec 2013 09:48:44 +0900
>> Masami Hiramatsu  wrote:
>>
>>> fetch functions and actions. In that case, we can continue
>>> to use current interface but much faster to trace.
>>> Also, we can see what filter/arguments/actions are set
>>> on each event.
>>
>> There's also the problem that the current filters work with the results
>> of what is written to the buffer, not what is passed in by the trace
>> point, as that isn't even displayed to the user.
>
> Agreed, so I've said I doubt this implementation is a good
> shape to integrate. Ktap style is better, since it just gets
> parameters from perf buffer entry (using event format).

Are you saying we should always store all arguments into the ring buffer
and let the filter run on it?
It's slower, but is it cleaner because it's human readable? Is ktap's
arg1 matching the first argument of the tracepoint better than doing
ctx->regs.di? Sure. si->arg1 is easy to fix.
With the si->arg1 tweak the bpf will become architecture independent. It
will run through the JIT on x86 and through the interpreter everywhere else.
But for kprobes the user has to specify 'var=cpu_register' during probe
creation… how is that better than doing the same in the filter?
I'm open to suggestions on how to improve the usability.

Thanks
Alexei


Re: [PATCH v12 09/18] vmscan: shrink slab on memcg pressure

2013-12-04 Thread Dave Chinner
On Wed, Dec 04, 2013 at 10:31:32AM +0400, Vladimir Davydov wrote:
> On 12/04/2013 08:51 AM, Dave Chinner wrote:
> > On Tue, Dec 03, 2013 at 04:15:57PM +0400, Vladimir Davydov wrote:
> >> On 12/03/2013 02:48 PM, Dave Chinner wrote:
>  @@ -236,11 +236,17 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
>   return 0;
>   
>   /*
>  - * copy the current shrinker scan count into a local variable
>  - * and zero it so that other concurrent shrinker invocations
>  - * don't also do this scanning work.
>  + * Do not touch global counter of deferred objects on memcg pressure to
>  + * avoid isolation issues. Ideally the counter should be per-memcg.
>    */
>  -nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
>  +if (!shrinkctl->target_mem_cgroup) {
>  +/*
>  + * copy the current shrinker scan count into a local variable
>  + * and zero it so that other concurrent shrinker invocations
>  + * don't also do this scanning work.
>  + */
>  +nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
>  +}
> >>> That's ugly. Effectively it means that memcg reclaim is going to be
> >>> completely ineffective when large numbers of allocations and hence
> >>> reclaim attempts are done under GFP_NOFS context.
> >>>
> >>> The only thing that keeps filesystem caches in balance when there is
> >>> lots of filesystem work going on (i.e. lots of GFP_NOFS allocations)
> >>> is the deferral of reclaim work to a context that can do something
> >>> about it.
> >> Imagine the situation: a memcg issues a GFP_NOFS allocation and goes to
> >> shrink_slab() where it defers them to the global counter; then another
> >> memcg issues a GFP_KERNEL allocation, also goes to shrink_slab() where
> >> it sees a huge number of deferred objects and starts shrinking them,
> >> which is not good IMHO.
> > That's exactly what the deferred mechanism is for - we know we have
> > to do the work, but we can't do it right now so let someone else do
> > it who can.
> >
> > In most cases, deferral is handled by kswapd, because when a
> > filesystem workload is causing memory pressure then most allocations
> > are done in GFP_NOFS conditions. Hence the only memory reclaim that
> > can make progress here is kswapd.
> >
> > Right now, you aren't deferring any of this memory pressure to some
> > other agent, so it just does not get done. That's a massive problem
> > - it's a design flaw - and instead I see lots of crazy hacks being
> > added to do stuff that should simply be deferred to kswapd like is
> > done for global memory pressure.
> >
> > Hell, kswapd should be allowed to walk memcg LRU lists and trim
> > them, just like it does for the global lists. We only need a single
> > "deferred work" counter per node for that - just let kswapd
> > proportion the deferred work over the per-node LRU and the
> > memcgs
> 
> Seems I misunderstand :-(
> 
> Let me try. You mean we have the only nr_deferred counter per-node, and
> kswapd scans
> 
> nr_deferred*memcg_kmem_size/total_kmem_size
> 
> objects in each memcg, right?
> 
> Then if there were a lot of objects deferred on memcg (not global)
> pressure due to a memcg issuing a lot of GFP_NOFS allocations, kswapd
> will reclaim objects from all, even unlimited, memcgs. This looks like
> an isolation issue :-/

Which, when you are running out of memory, is much less of an
issue than not being able to make progress reclaiming memory.

Besides, the "isolation" argument runs both ways. e.g. when there
isn't memory available, it's entirely possible it's because there is
actually no free memory, not because we've hit a memcg limit. e.g.
all the memory has been consumed by an unlimited memcg, and we need to
reclaim from it so this memcg can make progress.

In those situations we need to reclaim from everyone, not
just the memcg that can't find free memory to allocate

> Currently we have a per-node nr_deferred counter for each shrinker. If
> we add per-memcg reclaim, we have to make it per-memcg per-node, don't we?

Think about what you just said for a moment. We have how many memcg
shrinkers?  And we can support how many nodes? And we can support
how many memcgs? And when we multiply that all together, how much
memory do we need to track that?
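
The two options discussed above can be put into numbers with a small C
sketch. It is only an illustration under assumptions: both helpers are
hypothetical and not kernel functions. The first implements the
nr_deferred*memcg_kmem_size/total_kmem_size split quoted earlier; the second
does the shrinkers x nodes x memcgs multiplication for per-memcg per-node
counters. Neither guards against 64-bit overflow.

```c
#include <stdint.h>

/*
 * Share of a single per-node deferred count that would be scanned in one
 * memcg, proportional to that memcg's share of kernel memory:
 *     nr_deferred * memcg_kmem_size / total_kmem_size
 * Hypothetical helper; returns 0 when there is no kernel memory at all.
 */
static uint64_t memcg_deferred_share(uint64_t nr_deferred,
				     uint64_t memcg_kmem_size,
				     uint64_t total_kmem_size)
{
	if (total_kmem_size == 0)
		return 0;
	return nr_deferred * memcg_kmem_size / total_kmem_size;
}

/*
 * Bytes needed if every shrinker kept one deferred counter per memcg per
 * node. Hypothetical helper; counter_size is the size of one counter.
 */
static uint64_t per_memcg_counter_bytes(uint64_t shrinkers, uint64_t nodes,
					uint64_t memcgs, uint64_t counter_size)
{
	return shrinkers * nodes * memcgs * counter_size;
}
```

With made-up figures of 50 shrinkers, 64 nodes, 1000 memcgs and 8-byte
counters, per_memcg_counter_bytes() comes to 25,600,000 bytes. These numbers
are not from the thread and only show how quickly the product grows.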

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [OOPS, 3.13-rc2] null ptr in dio_complete()

2013-12-04 Thread Dave Chinner
On Wed, Dec 04, 2013 at 08:41:43PM -0700, Jens Axboe wrote:
> On Thu, Dec 05 2013, Dave Chinner wrote:
> > On Wed, Dec 04, 2013 at 03:17:49PM +1100, Dave Chinner wrote:
> > > On Tue, Dec 03, 2013 at 08:47:12PM -0700, Jens Axboe wrote:
> > > > On Wed, Dec 04 2013, Dave Chinner wrote:
> > > > > On Wed, Dec 04, 2013 at 12:58:38PM +1100, Dave Chinner wrote:
> > > > > > On Wed, Dec 04, 2013 at 08:59:40AM +1100, Dave Chinner wrote:
> > > > > > > Hi Jens,
> > > > > > > 
> > > > > > > Not sure who to direct this to or CC, so I figured you are the
> > > > > > > person to do that. I just had xfstests generic/299 (an AIO/DIO test)
> > > > > > > oops in dio_complete() like so:
> > > > > > > 
> > > 
> > > > > > > [ 9650.590630]  
> > > > > > > [ 9650.590630]  [] dio_complete+0xa3/0x140
> > > > > > > [ 9650.590630]  [] dio_bio_end_aio+0x7a/0x110
> > > > > > > [ 9650.590630]  [] ? dio_bio_end_aio+0x5/0x110
> > > > > > > [ 9650.590630]  [] bio_endio+0x1d/0x30
> > > > > > > [ 9650.590630]  [] blk_mq_complete_request+0x5f/0x120
> > > > > > > [ 9650.590630]  [] __blk_mq_end_io+0x16/0x20
> > > > > > > [ 9650.590630]  [] blk_mq_end_io+0x68/0xd0
> > > > > > > [ 9650.590630]  [] virtblk_done+0x67/0x110
> > > > > > > [ 9650.590630]  [] vring_interrupt+0x35/0x60
> > > .
> > > > > > And I just hit this from running xfs_repair which is doing
> > > > > > multithreaded direct IO directly on /dev/vdc:
> > > > > > 
> > > 
> > > > > > [ 1776.510446] IP: [] blk_account_io_done+0x6a/0x180
> > > 
> > > > > > [ 1776.512577]  [] blk_mq_complete_request+0xb8/0x120
> > > > > > [ 1776.512577]  [] __blk_mq_end_io+0x16/0x20
> > > > > > [ 1776.512577]  [] blk_mq_end_io+0x68/0xd0
> > > > > > [ 1776.512577]  [] virtblk_done+0x67/0x110
> > > > > > [ 1776.512577]  [] vring_interrupt+0x35/0x60
> > > > > > [ 1776.512577]  [] handle_irq_event_percpu+0x54/0x1e0
> > > .
> > > > > > So this is looking like another virtio+blk_mq problem
> > > > > 
> > > > > This one is definitely reproducible. Just hit it again...
> > > > 
> > > > I'll take a look at this. You don't happen to have gdb dumps of the
> > > > lines associated with those crashes? Just to save me some digging
> > > > time...
> > > 
> > > Only this:
> > > 
> > > (gdb) l *(dio_complete+0xa3)
> > > 0x811ddae3 is in dio_complete (fs/direct-io.c:282).
> > > 277 }
> > > 278
> > > 279 aio_complete(dio->iocb, ret, 0);
> > > 280 }
> > > 281
> > > 282 kmem_cache_free(dio_cache, dio);
> > > 283 return ret;
> > > 284 }
> > > 285
> > > 286 static void dio_aio_complete_work(struct work_struct *work)
> > > 
> > > And this:
> > > 
> > > (gdb) l *(blk_account_io_done+0x6a)
> > > 0x81755b6a is in blk_account_io_done (block/blk-core.c:2049).
> > > 2044int cpu;
> > > 2045
> > > 2046cpu = part_stat_lock();
> > > 2047part = req->part;
> > > 2048
> > > 2049part_stat_inc(cpu, part, ios[rw]);
> > > 2050part_stat_add(cpu, part, ticks[rw], duration);
> > > 2051part_round_stats(cpu, part);
> > > 2052part_dec_in_flight(part, rw);
> > > 2053
> > > 
> > > as I've rebuilt the kernel with different patches since the one
> > > running on the machine that is triggering the problem.
> > 
> > Any update on this, Jens? I've hit this blk_account_io_done() panic
> > 10 times in the past 2 hours while trying to do xfs_repair
> > testing
> 
> No, sorry, no updates yet... I haven't had time to look into it today.
> To reproduce tomorrow, can you mail me your exact setup (kvm invocation,
> etc) and how your guest is setup and if there's any special way I need
> to run xfstest or xfs_repair?

The virtio device that I'm hitting is "/mnt/fast-ssd/vm-100TB-sparse.img"
which is a 100TB file on a 160GB XFS filesystem on an SSD, created
with

$ xfs_io -f -c "truncate 100t" -c "extsize 1m" /mnt/fast-ssd/vm-100TB-sparse.img

If I stat it, I get:

$ xfs_io -c stat /mnt/fast-ssd/vm-100TB-sparse.img 
fd.path = "/mnt/fast-ssd/vm-100TB-sparse.img"
fd.flags = non-sync,non-direct,read-write
stat.ino = 131
stat.type = regular file
stat.size = 109951162777600
stat.blocks = 259333400
fsxattr.xflags = 0x800 [--e---]
fsxattr.projid = 0
fsxattr.extsize = 1048576
fsxattr.nextents = 83108
fsxattr.naextents = 0
dioattr.mem = 0x200
dioattr.miniosz = 512
dioattr.maxiosz = 2147483136
$

The VM is run by this script:

$ cat run-vm-4.sh 
#!/bin/sh
sudo /usr/bin/qemu-system-x86_64 \
-no-fd-bootchk \
-localtime \
-boot c \
-serial pty \
-nographic \
-alt-grab \
-smp 16 -m 16384 \
-machine accel=kvm \
-hda /vm-images/vm-4/root.img \
-drive file=/vm-images/vm-4/vm-4-test.img,if=virtio,cache=none \
-drive file=/vm-images/vm-4/vm-4-scratch.img,if=virti

Re: [PATCH V4 00/10] perf: New conditional branch filter

2013-12-04 Thread Michael Ellerman
On Wed, 2013-12-04 at 16:02 +0530, Anshuman Khandual wrote:
>   This patchset is the re-spin of the original branch stack sampling
> patchset which introduced new PERF_SAMPLE_BRANCH_COND branch filter. This
> patchset also enables SW based branch filtering support for book3s powerpc
> platforms which have PMU HW backed branch stack sampling support.
> 
> Summary of code changes in this patchset:
> 
> (1) Introduces a new PERF_SAMPLE_BRANCH_COND branch filter
> (2) Add the "cond" branch filter options in the "perf record" tool
> (3) Enable PERF_SAMPLE_BRANCH_COND in X86 platforms
> (4) Enable PERF_SAMPLE_BRANCH_COND in POWER8 platform 
> (5) Update the documentation regarding "perf record" tool


Hi Arnaldo,

Can you please take just patches 1-5 into the perf tree? And do you mind
putting them in a topic branch so Benh can merge that.

The remaining patches are powerpc specific and still need some more review.

cheers




Re: [RFC PATCH tip 0/5] tracing filters with BPF

2013-12-04 Thread Alexei Starovoitov
> On Tue, Dec 3, 2013 at 4:01 PM, Andi Kleen  wrote:
>>
>> Can you do some performance comparison compared to e.g. ktap?
>> How much faster is it?

Did simple ktap test with 1M alloc_skb/kfree_skb toy test from earlier email:
trace skb:kfree_skb {
if (arg2 == 0x100) {
printf("%x %x\n", arg1, arg2)
}
}
1M skb alloc/free 350315 (usecs)

baseline without any tracing:
1M skb alloc/free 145400 (usecs)

then equivalent bpf test:
void filter(struct bpf_context *ctx)
{
void *loc = (void *)ctx->regs.dx;
if (loc == 0x100) {
struct sk_buff *skb = (struct sk_buff *)ctx->regs.si;
char fmt[] = "skb %p loc %p\n";
bpf_trace_printk(fmt, sizeof(fmt), (long)skb, (long)loc, 0);
}
}
1M skb alloc/free 183214 (usecs)

so with one 'if' condition the difference ktap vs bpf is 350-145 vs 183-145

obviously ktap is an interpreter, so it's not really fair.

To make it really unfair I did:
trace skb:kfree_skb {
if (arg2 == 0x100 || arg2 == 0x200 || arg2 == 0x300 || arg2 == 0x400 ||
arg2 == 0x500 || arg2 == 0x600 || arg2 == 0x700 || arg2 == 0x800 ||
arg2 == 0x900 || arg2 == 0x1000) {
printf("%x %x\n", arg1, arg2)
}
}
1M skb alloc/free 484280 (usecs)

and corresponding bpf:
void filter(struct bpf_context *ctx)
{
void *loc = (void *)ctx->regs.dx;
if (loc == 0x100 || loc == 0x200 || loc == 0x300 || loc == 0x400 ||
loc == 0x500 || loc == 0x600 || loc == 0x700 || loc == 0x800 ||
loc == 0x900 || loc == 0x1000) {
struct sk_buff *skb = (struct sk_buff *)ctx->regs.si;
char fmt[] = "skb %p loc %p\n";
bpf_trace_printk(fmt, sizeof(fmt), (long)skb, (long)loc, 0);
}
}
1M skb alloc/free 185660 (usecs)

the difference is bigger now: 484-145 vs 185-145

9 extra 'if' conditions for bpf is almost nothing, since they
translate into 18 new x86 instructions after JITing, but for
interpreter it's obviously costly.
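
The "350-145 vs 183-145" comparisons above are just traced time minus the
untraced baseline. A minimal C sketch of that arithmetic follows. The helper
names are hypothetical, and the per-event figure assumes that "1M" means
exactly 1,000,000 alloc/free pairs.

```c
#include <stdint.h>

/* Tracing overhead in microseconds: traced run minus untraced baseline. */
static int64_t overhead_usecs(int64_t traced_usecs, int64_t baseline_usecs)
{
	return traced_usecs - baseline_usecs;
}

/*
 * Average overhead per event in nanoseconds (integer division, so the
 * result is truncated). 'events' is the number of events in the run.
 */
static int64_t overhead_ns_per_event(int64_t traced_usecs,
				     int64_t baseline_usecs, int64_t events)
{
	return overhead_usecs(traced_usecs, baseline_usecs) * 1000 / events;
}
```

Fed with the numbers reported above, the single-condition test gives
204915 usecs of overhead for ktap against 37814 usecs for bpf, and the
ten-condition test gives 338880 usecs against 40260 usecs.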

Why 0x100 instead of 0x1? To make sure that compiler doesn't optimize
them into < >
Otherwise it's really really unfair.

ktap is a nice tool. Great job Jovi!
I noticed that it doesn't always clear created kprobes after run and I
see a bunch of .../tracing/events/ktap_kprobes_xxx, but that's a minor
thing.

Thanks
Alexei


Re: [PATCH 01/11] resolve PXA<->8250 serial device address conflict

2013-12-04 Thread Sergei Ianovich
On Wed, 2013-12-04 at 20:35 -0800, Greg Kroah-Hartman wrote:
> On Thu, Dec 05, 2013 at 08:31:36AM +0400, Sergei Ianovich wrote:
> > I'm reading the last message as a confirmation that
> > drivers/tty/serial/pxa.c needs to be rewritten using 8250_core.c.
> 
> Yes, how much work is this really?

Great. It seems the two drivers practically match. I'll prepare and submit
a merge patch.



Re: [PATCH 01/11] resolve PXA<->8250 serial device address conflict

2013-12-04 Thread Greg Kroah-Hartman
On Thu, Dec 05, 2013 at 08:31:36AM +0400, Sergei Ianovich wrote:
> On Wed, 2013-12-04 at 20:12 -0800, Greg Kroah-Hartman wrote:
> > On Mon, Dec 02, 2013 at 04:10:33PM +0200, Heikki Krogerus wrote:
> > > On Mon, Dec 02, 2013 at 02:26:45PM +0400, Sergei Ianovich wrote:
> > > > Who makes the decision which way to go?
> > > 
> > > Greg and Russel make this decision. By having the pxa driver simply
> > > register 8250 ports would probably reduce the code. That's about the
> > > biggest benefit from it.
> > > 
> > > It would still be something nice to have IMO. Ideally all the
> > > 8250/16x50 UARTs should register the ports with 8250_core.c, and not
> > > create complete uart driver on their own.
> > 
> > I agree, this is the best way to resolve this, having a separate uart
> > driver isn't that good at all to be doing, if at all possible.
> 
> I'm reading the last message as a confirmation that
> drivers/tty/serial/pxa.c needs to be rewritten using 8250_core.c.

Yes, how much work is this really?

thanks,

greg k-h


Re: [PATCH 01/11] resolve PXA<->8250 serial device address conflict

2013-12-04 Thread Sergei Ianovich
On Wed, 2013-12-04 at 20:12 -0800, Greg Kroah-Hartman wrote:
> On Mon, Dec 02, 2013 at 04:10:33PM +0200, Heikki Krogerus wrote:
> > On Mon, Dec 02, 2013 at 02:26:45PM +0400, Sergei Ianovich wrote:
> > > Who makes the decision which way to go?
> > 
> > Greg and Russel make this decision. By having the pxa driver simply
> > register 8250 ports would probably reduce the code. That's about the
> > biggest benefit from it.
> > 
> > It would still be something nice to have IMO. Ideally all the
> > 8250/16x50 UARTs should register the ports with 8250_core.c, and not
> > create complete uart driver on their own.
> 
> I agree, this is the best way to resolve this, having a separate uart
> driver isn't that good at all to be doing, if at all possible.

I'm reading the last message as a confirmation that
drivers/tty/serial/pxa.c needs to be rewritten using 8250_core.c.
However, "if at all possible" confuses me, since we have pxa.c in the
tree and it works. Greg, could you please clarify?



Re: [PATCH] usb: core: Abort deauthorization if unsetting configuration fails

2013-12-04 Thread Julius Werner
> Instead, how about changing usb_set_configuration() so that it will
> never fail when the new config is -1?  Except perhaps for -ENODEV
> errors (the device has been disconnected), which
> usb_deauthorize_device() could check for.

Yes, that should work as well. It's really just one autoresume and one
disable_lpm that can fail in that case so it shouldn't be too
intrusive. I would prefer not to special-case ENODEV though, no need
to add more complexity than necessary.

I will write up a new version for that tomorrow.


Re: [char-misc-next] mei: add 9 series PCH mei device ids

2013-12-04 Thread Greg KH
On Tue, Dec 03, 2013 at 01:01:29PM +0200, Tomas Winkler wrote:
> And Lynx Point H Refresh and Wildcat Point LP
> device ids.
> 
> Signed-off-by: Tomas Winkler 
> ---
>  drivers/misc/mei/hw-me-regs.h | 4 
>  drivers/misc/mei/pci-me.c | 4 +++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/mei/hw-me-regs.h b/drivers/misc/mei/hw-me-regs.h
> index 6c0fde5..f83bc80 100644
> --- a/drivers/misc/mei/hw-me-regs.h
> +++ b/drivers/misc/mei/hw-me-regs.h
> @@ -110,8 +110,12 @@
>  #define MEI_DEV_ID_PPT_3  0x1DBA  /* Panther Point */
>  
>  #define MEI_DEV_ID_LPT0x8C3A  /* Lynx Point */
> +#define MEI_DEV_ID_LPT_H  0x8C3A  /* Lynx Point H */

Why duplicate this #define?

And, now that you changed it, why keep the "old" one around, it's no
longer used anywhere else?

thanks,

greg k-h


Re: [PATCH 3/3] usb: phy-generic: Add ULPI VBUS support

2013-12-04 Thread Chris Ruehl



On Wednesday, December 04, 2013 05:49 PM, Heikki Krogerus wrote:

Hi Chris,

On Wed, Dec 04, 2013 at 03:16:21PM +0800, Chris Ruehl wrote:

On Tuesday, December 03, 2013 04:15 PM, Heikki Krogerus wrote:

On Mon, Dec 02, 2013 at 03:05:19PM +0800, Chris Ruehl wrote:

@@ -154,6 +164,27 @@ int usb_phy_gen_create_phy(struct device *dev, struct usb_phy_gen_xceiv *nop,
  {
int err;

+   if (nop->ulpi_vbus > 0) {
+   unsigned int flags = 0;
+
+   if (nop->ulpi_vbus & 0x1)
+   flags |= ULPI_OTG_DRVVBUS;
+   if (nop->ulpi_vbus & 0x2)
+   flags |= ULPI_OTG_DRVVBUS_EXT;
+   if (nop->ulpi_vbus & 0x4)
+   flags |= ULPI_OTG_EXTVBUSIND;
+   if (nop->ulpi_vbus & 0x8)
+   flags |= ULPI_OTG_CHRGVBUS;
+
+   nop->ulpi = otg_ulpi_create(&ulpi_viewport_access_ops, flags);
+   if (!nop->ulpi) {
+   dev_err(dev, "Failed create ULPI Phy\n");
+   return -ENOMEM;
+   }
+   dev_dbg(dev, "Create ULPI Phy\n");
+   nop->ulpi->io_priv = nop->viewport;
+   }


This is so wrong. You are registering one kind of usb phy driver from
an other. Change drivers/usb/phy/ulpi.c to be a platform device. The
whole flag system in it is pretty horrid. While you are at it, change
that so it sets the values based on boolean flags from OF properties
or platform data.

NAK for the whole set.




Heikki,

Thanks for your comments, even if they are not very positive for me. Anyhow,
my intention with the "horrid" path was to reduce kernel code: one
of_read32 vs. four of_boolean. And the logic mentioned is simple.
But that's history.


I should probably explain why I have problems with them. First of all,
things like driving the vbus should be a function that can be called
from upper layers. struct usb_otg has the set_vbus hook for that. You
can call it for example from your host controller's init routine. I'm
assuming you have a host controller since you are driving vbus.


My platform is the Freescale imx27 and the host controller is the ChipIdea,
for which I have already sent some patches. I used set_vbus in the wrong place:

nop->ulpi->otg->set_vbus(nop->ulpi->otg,true); (phy-generic:usb_gen_phy_init())

and now I start to understand where the issue is. I must tell chipidea to init 
the vbus using the platform




You don't need to set the ULPI_OTG_CHRGVBUS. It's used for the VBUS
pulsing of SRP, which btw. is not anymore supported in OTG&EH2.0 spec,
so just don't use that bit even if you want to start SRP.


OK, got it. I tested it right away and yes, my USB still works great even if
I omit the flag. The reason I introduced it was that plat-mxc/isp1504xc.c of
2.6.22 with the freescale patches sets this flag.




The only of_booleans you should need are for the DRV_VBUS_EXT and
USE_EXT_VBUS_IND. In my case I could not use even those. My controller
provides it's own control for them, so even if I set them to my ULPI
phy, the controller would simply override the values.

Secondly, why those silly flags in the first place. Those flags are
just bits in the registers. It would have been much easier and cleaner
to deliver a small struct with default values for the registers
instead.
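
The suggestion above (boolean properties or a small struct instead of an
opaque bitmask) could look roughly like the following C sketch. It is an
illustration only: struct ulpi_vbus_cfg, the OTG_CTRL_* names and the helper
are hypothetical, and the bit positions are my reading of the ULPI OTG
Control register layout, so they should be checked against the ULPI
specification before being relied on. ChrgVbus is left out, following the
remark above that it is not needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed OTG Control register bits; names are illustrative stand-ins. */
#define OTG_CTRL_DRVVBUS	(1u << 5)	/* drive 5V on VBUS */
#define OTG_CTRL_DRVVBUS_EXT	(1u << 6)	/* drive VBUS via external supply */
#define OTG_CTRL_EXTVBUSIND	(1u << 7)	/* use external VBUS indicator */

/* One boolean per feature, e.g. filled from OF properties or platform data. */
struct ulpi_vbus_cfg {
	bool drv_vbus;
	bool drv_vbus_ext;
	bool ext_vbus_ind;
};

/* Translate the boolean configuration into a register value. */
static uint8_t ulpi_vbus_cfg_to_otg_ctrl(const struct ulpi_vbus_cfg *cfg)
{
	uint8_t val = 0;

	if (cfg->drv_vbus)
		val |= OTG_CTRL_DRVVBUS;
	if (cfg->drv_vbus_ext)
		val |= OTG_CTRL_DRVVBUS_EXT;
	if (cfg->ext_vbus_ind)
		val |= OTG_CTRL_EXTVBUSIND;
	return val;
}
```

Compared with the 0x1/0x2/0x4/0x8 mask in the patch, each setting here has a
name at the point where it is configured, which is the readability argument
made above.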


On my way to finding a solution for my board I looked around and found
phy-ulpi.c functions being used in phy-tegra-usb.c, so I didn't mind using
them too.


OK, IC. I have not followed what is happening with USB in linux for a
while.

The whole otg_ulpi_create() thing, and the flags with it, were
originally planned to be used from platform code. It's evil and it
should have never been accepted into upstream kernel. The time it was
introduced I was on vacation and nobody else seemed to care :(. All I
was able to do was to protest afterwards.



Checked!



I accept your NAK and will work on a patch to make phy-ulpi.c
working as platform device.

Last question to you: what don't you like about the patch in my patch set
that adds chip-select gpio support? I ask because you NAKed the whole
set.
I really need the ChipSelect function to make my hardware work!


OK, I did not explain my problem with that patch. I'm sorry about
that. It also looks like I made wrong assumption with it. I thought
that your phy (it was ISP1504, right?) is just like isp1704 that I have
worked with. On isp1704 you only have the chip_sel pin (no reset pin),
so I thought you can not have any reason to add handler for an other
gpio to this driver. After a quick look at isp1504 data sheet, it
looks like you have both reset and chip_sel pins on it, which I guess
are both connected to gpios on your platform.


Yes 1504, and my hardware guys make otg using the chipselect with gpio
and the usbh2 is fixed selected via pull down resistor.



So I don't have a problem with that. Though I'm not sure is this
driver the right place to handle things like these gpios, which are
pretty phy sp

Re: [PATCH 01/11] resolve PXA<->8250 serial device address conflict

2013-12-04 Thread Greg Kroah-Hartman
On Mon, Dec 02, 2013 at 04:10:33PM +0200, Heikki Krogerus wrote:
> Hi,
> 
> On Mon, Dec 02, 2013 at 02:26:45PM +0400, Sergei Ianovich wrote:
> > On Mon, 2013-12-02 at 11:49 +0200, Heikki Krogerus wrote:
> > > On Mon, Dec 02, 2013 at 01:23:58PM +0400, Sergei Ianovich wrote:
> > > > On Mon, 2013-12-02 at 11:02 +0200, Heikki Krogerus wrote:
> > > > > 
> > > > > If drivers/tty/serial/pxa.c was converted to an other probe driver for
> > > > > the 8250, this would not be an issue.
> > > > 
> > > > It seems that my patch is not going to be accepted. However, there is a
> > > > > device which has both PXA ports and an additional 8250 accent chip. As a
> > > > result, there is a device allocation conflict. For the device to be
> > > > usable the conflict needs to be resolved.
> > > > 
> > > > Do you mean that drivers/tty/serial/pxa.c needs to be rewritten to
> > > > support lp8x4x special case?
> > > 
> > > Sorry I was not clear. I was suggesting that drivers/tty/serial/pxa.c
> > > would be converted to drivers/tty/serial/8250/8250_pxa.c since it
> > > looks to me like just an other 16x50 compatible UART. That would fix
> > > the issue with the name conflict. You would then simply register 8250
> > > ports from two probe drivers (drivers/tty/serial/8250/8250_pxa.c and
> > > drivers/tty/serial/8250/8250_lp8x4x.c).
> > > 
> > > Depending on the order you register your platform devices (which you
> > > decide in your platform code), but let's say the pxa gets registered
> > > first and let's say it only has one port. You will then have in your
> > > system /dev/ttyS0 for the pxa port and /dev/ttyS[1-4] for the other
> > > UART.
> > > 
> > > I hope I was able to explain what I mean this time :)
> > 
> > Sorry, I wasn't clear as well. I got it right the first time. You mean
> > pxa.c needs to be merged into 8250. This will solve the conflict in
> > question, and do it the right way. However, this will be a *much* bigger
> > patch, and it will affect everyone on pxa.
> > 
> > Who makes the decision which way to go?
> 
> Greg and Russel make this decision. By having the pxa driver simply
> register 8250 ports would probably reduce the code. That's about the
> biggest benefit from it.
> 
> It would still be something nice to have IMO. Ideally all the
> 8250/16x50 UARTs should register the ports with 8250_core.c, and not
> create complete uart driver on their own.

I agree, this is the best way to resolve this, having a separate uart
driver isn't that good at all to be doing, if at all possible.
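
The /dev/ttyS numbering described in the quoted text (one pxa port first,
then four ports of the other UART) can be modeled with a toy allocator in C.
This is only a sketch under assumptions: struct port_table, MAX_PORTS and
register_ports() are hypothetical, and the real 8250 core has its own rules
for assigning line numbers that are not reproduced here.

```c
#define MAX_PORTS 8	/* arbitrary table size for the illustration */

/* Shared table of serial lines; 'next' is the next free line number. */
struct port_table {
	int next;
};

/*
 * Hand out 'count' consecutive line numbers to one probe driver.
 * Returns the first line number assigned, or -1 if the table is full
 * or the request is invalid.
 */
static int register_ports(struct port_table *t, int count)
{
	int first;

	if (count <= 0 || t->next + count > MAX_PORTS)
		return -1;
	first = t->next;
	t->next += count;
	return first;
}
```

If the pxa probe driver registers one port first and the lp8x4x driver then
registers four, the model hands out line 0 and lines 1 to 4, which
corresponds to /dev/ttyS0 and /dev/ttyS[1-4] in the example above. Because
both drivers draw from one table, the name conflict goes away.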

thanks,

greg k-h
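
A rough userspace model of the naming outcome described in this thread, assuming (as in the example above) that ports are handed out in registration order. The table, register_ports() and port_name() are hypothetical names for illustration only; the real 8250 core has its own matching logic, which is not reproduced here.

```c
#include <stdio.h>

#define MAX_PORTS 8            /* hypothetical size of the shared ttyS table */

struct port_slot {
	const char *owner;     /* probe driver that claimed this line */
};

static struct port_slot port_table[MAX_PORTS];
static int next_line;          /* next free ttyS line number */

/*
 * Claim "count" consecutive lines for "owner" and return the first line
 * number, or -1 if the request does not fit. This stands in for what a
 * shared core does when a probe driver registers its ports.
 */
static int register_ports(const char *owner, int count)
{
	int first = next_line;

	if (count <= 0 || next_line + count > MAX_PORTS)
		return -1;
	for (int i = 0; i < count; i++)
		port_table[next_line++].owner = owner;
	return first;
}

/* Format the device node name for a line, e.g. "/dev/ttyS3". */
static void port_name(int line, char *buf, size_t len)
{
	snprintf(buf, len, "/dev/ttyS%d", line);
}
```

With the pxa probe driver registering one port first and the lp8x4x probe driver registering four ports afterwards, this model yields /dev/ttyS0 for the pxa port and /dev/ttyS1 through /dev/ttyS4 for the other UART, with no name conflict because both go through one table.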


Re: [PATCH 0/6] kexec: A new system call to allow in kernel loading

2013-12-04 Thread Eric W. Biederman
Vivek Goyal  writes:

> Hi Eric,
>
> So you want a separate purgatory code and that purgatory should be self
> contained and should not share any code with rest of the kernel. No
> inclusion of header files, no linking against kernel libraries? That means
> even re-implementing sha256 functions separately (like user space)?

Call it only trivial sharing of code with the rest of the kernel, and only
as much as, say, the kernel decompressor has.

> If code maintenance is a concern, then I think I can reimplement some
> of the functions to calculate sha256 in separate crash files and invoke
> those to reduce code sharing with rest of the kernel. And we should be
> able to link against the kernel and not have to create separate
> relocatable purgatory object and relocate it.

It is both code maintenance and the fact that we have a strong
expectation that where purgatory lives should not be corrupted.
Plus, from what I have seen, maintenance becomes much simpler if there is
a little bit of C code that lives between the two kernels.  At that
point people don't have to grok assembly to be able to touch anything,
and by simply living there it enforces separation of concerns
from the kernel in a way that is trivial and obvious.

Plus we already have all of the code in userspace to do all of this work,
so it is not something you would need to write from scratch, merely
something that you would need to adapt.

> IOW, does purgatory still have to be a relocatable object?

Fundamentally purgatory does need to be a relocatable object.

> I think
> user space had no choice but given the fact that we are implementing
> things in the kernel, I should be able to implement my own hash calculation
> and segment verification code and link it to existing kernel and invoke
> these outside purgatory. 

Doing this outside of purgatory and linking to the rest of the kernel is
almost certainly enough to get someone to perform an obvious cleanup
that will undermine the purpose of the code.  Or perhaps it will be
merely a reference to the GOT table behind our backs in the C code
generated by the compiler that will undermine this checking.

Linking our sanity checks to the rest of the kernel leaves me
profoundly uncomfortable.

> Anyway, we call so many other functions after
> crash to stop cpus, save registers, etc.

There is no other possible place to stop cpus, and save the cpu
registers.  As much as possible that should be the only justification
for the code we run on the kexec on panic code path.

Honestly, calling so many other functions on that code path is a good
reason to see about removing them.  In addition to my other reasons,
adding the hash calculation on that path will likely confuse issues
more than help them.

Eric
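
A minimal sketch of the kind of self-contained check discussed above: code that verifies each loaded segment against a digest recorded at load time and relies on nothing from the rest of the kernel. Everything here is an assumption for illustration. The struct layout and function names are hypothetical, and FNV-1a is used only as a small stand-in for SHA-256 to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one loaded segment of the next kernel. */
struct segment {
	const uint8_t *start;
	size_t len;
	uint64_t expected;     /* digest recorded when the segment was loaded */
};

/* FNV-1a (64-bit), a stand-in for SHA-256 in this sketch only. */
static uint64_t digest(const uint8_t *p, size_t len)
{
	uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 1099511628211ULL;          /* FNV prime */
	}
	return h;
}

/*
 * Return 0 if every segment still matches its recorded digest, otherwise
 * the 1-based index of the first segment that does not.
 */
static int verify_segments(const struct segment *segs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (digest(segs[i].start, segs[i].len) != segs[i].expected)
			return (int)i + 1;
	return 0;
}
```

Because the check uses only its own loop and its own table, there is nothing in it that a cleanup elsewhere in the kernel could quietly change, which is the separation argued for above.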



Re: [PATCH 0/4] perf trace fixes

2013-12-04 Thread David Ahern

On 12/4/13, 7:41 PM, David Ahern wrote:

> Hi Arnaldo:
>
> As I mentioned on IRC perf-trace fails on older kernels -- like RHEL6. This
> set of patches makes it at least usable - though still some problems I am
> hoping you can fix.
>
> Build perf with these patches and run:
>
>    perf trace -- dd if=/dev/zero of=/tmp/zero bs=4096 count=16
>
> you see something like this which is just wrong:
>
>    3.684 ( 0.007 ms): write(buf: 2, count: 140737077958816  ) = 27
>
> there is no fd (should be 1 for the write) and all the values are wrong.
> Perhaps it is an artifact of the older way of doing system call tracing, but
> I see something goofy with the 3.12 kernel as well:
>    5.633 ( 0.004 ms): write(fd: 2, buf: 0x7fff9177fee0, count: 24 ) = 24


Forget this last comment about 3.12; it works fine. That write entry is
the final write by dd:


write(2, "65536 bytes (66 kB) copied", 26) = 26

That one is fine and looking up I see the 4096 lines as expected.

For RHEL6, the problem is there for the 4096 lines:

32.641 ( 0.005 ms): read(buf: 0, count: 27258880) = 4096
32.655 ( 0.009 ms): write(buf: 1, count: 27258880) = 4096

buf and count are wrong. Adding -e write to the perf-trace I get:

33.775 ( 0.031 ms): write(buf: 1, count: 37437440) = 4096

which suggests an off-by-1 error with parsing syscalls versus 
raw_syscalls. I am hoping you have an idea on how to fix that.


David
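
A small model of the symptom above, assuming the problem is that the argument names are taken one field later than the argument values. This is only an illustration of how such a skew would produce the RHEL6 output; it is not perf's code, and format_args(), write_arg_names and name_skew are hypothetical names.

```c
#include <stdio.h>

/* Argument names for write(2) as a trace formatter might know them. */
static const char *const write_arg_names[] = { "fd", "buf", "count" };

/*
 * Format "name: value" pairs into out and return the length written,
 * or -1 if out is too small. name_skew models the suspected bug: with
 * name_skew == 1 the names start one field too late, so the first value
 * (the fd) is printed under "buf" and the last value is dropped.
 */
static int format_args(char *out, size_t len, const unsigned long *vals,
		       int nargs, int name_skew)
{
	int used = 0;

	for (int i = 0; i + name_skew < nargs; i++) {
		int n = snprintf(out + used, len - (size_t)used, "%s%s: %lu",
				 i ? ", " : "",
				 write_arg_names[i + name_skew], vals[i]);

		if (n < 0 || (size_t)n >= len - (size_t)used)
			return -1;
		used += n;
	}
	return used;
}
```

Feeding it the real arguments of one of the 4096-byte writes (fd 1, buffer address 27258880, count 4096) with name_skew set to 1 gives "buf: 1, count: 27258880", the same shape as the RHEL6 lines; with name_skew set to 0 it gives the expected "fd: 1, buf: 27258880, count: 4096".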


Re: [PATCH 0/3] ARM Coresight: Enhance ETM tracing control

2013-12-04 Thread Greg Kroah-Hartman
On Wed, Dec 04, 2013 at 10:49:25PM -0500, Adrien Vergé wrote:
> 2013/12/4 Greg Kroah-Hartman :
> > How much overhead does the existing tracing code have on ARM?  Is ETM
> > still even needed?  Why not just use ETM for the core tracing code
> > instead?
> 
> Coresight ETM is not just faster than /sys/kernel/debug/tracing, it
> provides more detailed and customisable info. For instance, you can
> trace every load, store, instruction fetch, along with the number of
> cycles taken, with almost zero-overhead.

Can't you already do that with the 'perf' tool the kernel provides
without the ETM driver?

> > What's wrong with the in-kernel tracing logic that you can't use that
> > instead of the ETM stuff?
> 
> ETM has a different purpose. Integrating it in
> /sys/kernel/debug/tracing would not take advantage of all its
> features.

What is its purpose then?  At first glance, this seems to be exactly
what 'perf' provides already.  Doesn't perf work on ARM today?

thanks,

greg k-h


[PATCH 1/1] MTD: UBI: avoid program operation on NOR flash after erasure interrupted

2013-12-04 Thread qiwang
Hi Artem:
As we discussed by mail earlier, please check my patch below:

From: Qi Wang 

nor_erase_prepare() is called before erasing a NOR flash block; it programs '0'
into the block to mark it. But programming data into a block whose erasure was
interrupted can cause a program timeout error (several minutes at most), which
could impact other operations on the NOR flash. So UBIFS can read this block
first to avoid an unneeded program operation.

This patch puts a read operation ahead of the write operation in
nor_erase_prepare() and reads out the data.
If the data is already corrupted, there is no need to program any data into
this block; just go on to erase it.

Signed-off-by: Qi Wang 
---
diff --git a/drivers/mtd/ubi/io.c b/drivers/mtd/ubi/io.c
index bf79def..0a6343d 100644
--- a/drivers/mtd/ubi/io.c
+++ b/drivers/mtd/ubi/io.c
@@ -499,6 +499,7 @@ static int nor_erase_prepare(struct ubi_device *ubi, int pnum)
size_t written;
loff_t addr;
uint32_t data = 0;
+   struct ubi_ec_hdr ec_hdr;
/*
 * Note, we cannot generally define VID header buffers on stack,
 * because of the way we deal with these buffers (see the header
@@ -509,49 +510,38 @@ static int nor_erase_prepare(struct ubi_device *ubi, int pnum)
struct ubi_vid_hdr vid_hdr;
 
/*
+* If VID or EC is valid, need to corrupt it before erase operation.  
 * It is important to first invalidate the EC header, and then the VID
 * header. Otherwise a power cut may lead to valid EC header and
 * invalid VID header, in which case UBI will treat this PEB as
 * corrupted and will try to preserve it, and print scary warnings.
 */
addr = (loff_t)pnum * ubi->peb_size;
-   err = mtd_write(ubi->mtd, addr, 4, &written, (void *)&data);
-   if (!err) {
-   addr += ubi->vid_hdr_aloffset;
-   err = mtd_write(ubi->mtd, addr, 4, &written, (void *)&data);
-   if (!err)
-   return 0;
+   err = ubi_io_read_ec_hdr(ubi, pnum, &ec_hdr, 0);
+   if (err != UBI_IO_BAD_HDR_EBADMSG && err != UBI_IO_BAD_HDR &&
+   err != UBI_IO_FF){
+   err1 = mtd_write(ubi->mtd, addr, 4, &written, (void *)&data);
+   if(err1)
+   goto error;
}
 
-   /*
-* We failed to write to the media. This was observed with Spansion
-* S29GL512N NOR flash. Most probably the previously eraseblock erasure
-* was interrupted at a very inappropriate moment, so it became
-* unwritable. In this case we probably anyway have garbage in this
-* PEB.
-*/
-   err1 = ubi_io_read_vid_hdr(ubi, pnum, &vid_hdr, 0);
-   if (err1 == UBI_IO_BAD_HDR_EBADMSG || err1 == UBI_IO_BAD_HDR ||
-   err1 == UBI_IO_FF) {
-   struct ubi_ec_hdr ec_hdr;
-
-   err1 = ubi_io_read_ec_hdr(ubi, pnum, &ec_hdr, 0);
-   if (err1 == UBI_IO_BAD_HDR_EBADMSG || err1 == UBI_IO_BAD_HDR ||
-   err1 == UBI_IO_FF)
-   /*
-* Both VID and EC headers are corrupted, so we can
-* safely erase this PEB and not afraid that it will be
-* treated as a valid PEB in case of an unclean reboot.
-*/
-   return 0;
+   err = ubi_io_read_vid_hdr(ubi, pnum, &vid_hdr, 0);
+   if (err != UBI_IO_BAD_HDR_EBADMSG && err != UBI_IO_BAD_HDR &&
+   err != UBI_IO_FF){
+   addr += ubi->vid_hdr_aloffset;
+   err1 = mtd_write(ubi->mtd, addr, 4, &written, (void *)&data);
+   if (err1)
+   goto error; 
}
+   return 0;
 
+error:
/*
-* The PEB contains a valid VID header, but we cannot invalidate it.
+* The PEB contains a valid VID or EC header, but we cannot invalidate it.
 * Supposedly the flash media or the driver is screwed up, so return an
 * error.
 */
-   ubi_err("cannot invalidate PEB %d, write returned %d read returned %d",
+   ubi_err("cannot invalidate PEB %d, read returned %d write returned %d",
pnum, err, err1);
ubi_dump_flash(ubi, pnum, 0, ubi->peb_size);
return -EIO;
---

I have tested this patch on a Micron NOR flash, part number JS28F512M29EWHA.
If you have any questions, please let me know.
Thank you
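
The control flow of the patched nor_erase_prepare() can be modelled in userspace roughly as below. This is a sketch under stated assumptions, not the driver code: struct fake_peb, the hdr_status values and fake_write_zero() are hypothetical stand-ins for the flash, the UBI_IO_* read results and mtd_write().

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-ins for the header read results used in the patch. */
enum hdr_status { HDR_VALID, HDR_BAD, HDR_BAD_EBADMSG, HDR_FF };

/* Hypothetical model of one physical eraseblock. */
struct fake_peb {
	enum hdr_status ec;    /* state of the EC header */
	enum hdr_status vid;   /* state of the VID header */
	bool write_fails;      /* models a block that times out on program */
	int writes;            /* number of program operations issued */
};

/* A header that already reads as bad or erased needs no invalidation. */
static bool already_unusable(enum hdr_status s)
{
	return s == HDR_BAD || s == HDR_BAD_EBADMSG || s == HDR_FF;
}

/* Model of programming zeroes over a header. */
static int fake_write_zero(struct fake_peb *peb, enum hdr_status *hdr)
{
	if (peb->write_fails)
		return -EIO;
	*hdr = HDR_BAD;
	peb->writes++;
	return 0;
}

/* Read first, program only when needed; EC header before VID header. */
static int erase_prepare(struct fake_peb *peb)
{
	if (!already_unusable(peb->ec) && fake_write_zero(peb, &peb->ec))
		return -EIO;
	if (!already_unusable(peb->vid) && fake_write_zero(peb, &peb->vid))
		return -EIO;
	return 0;
}
```

In this model a block whose headers already read as corrupted or erased is passed straight to erasure without any program operation, which is the case the patch is meant to speed up, while a block with a valid header that cannot be programmed still returns an error.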

Best Regards, 
Qi Wang 王起
ESG APAC AE 
Tel: 86-021-38997158
Mobile: 86-15201958202
Email: qiw...@micron.com
Address: No 601 Fasai Rd, Pudong, Shanghai, China, 200131


Re: [PATCH 0/3] ARM Coresight: Enhance ETM tracing control

2013-12-04 Thread Adrien Vergé
2013/12/4 Greg Kroah-Hartman :
> How much overhead does the existing tracing code have on ARM?  Is ETM
> still even needed?  Why not just use ETM for the core tracing code
> instead?

Coresight ETM is not just faster than /sys/kernel/debug/tracing, it
provides more detailed and customisable info. For instance, you can
trace every load, store, instruction fetch, along with the number of
cycles taken, with almost zero overhead.

> What's wrong with the in-kernel tracing logic that you can't use that
> instead of the ETM stuff?

ETM has a different purpose. Integrating it in
/sys/kernel/debug/tracing would not take advantage of all its
features.


Re: [RFC part2 PATCH 8/9] ACPI / ARM64: Update acpi_register_gsi to register with the core IRQ subsystem

2013-12-04 Thread Arnd Bergmann
On Tuesday 03 December 2013, Hanjun Guo wrote:
> +   /*
> +* ACPI have no bindings to indicate SPI or PPI, so we
> +* use different mappings from DT in ACPI.
> +*
> +* For FDT
> +* PPI interrupt: in the range [0, 15];
> +* SPI interrupt: in the range [0, 987];
> +*
> +* For ACPI, using identity mapping for hwirq:
> +* PPI interrupt: in the range [16, 31];
> +* SPI interrupt: in the range [32, 1019];

This difference might cause endless confusion. Can't you register PPI and SPI as
separate IRQ controllers to have the same number space that we normally have?

Arnd
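
For reference, the relation between the two number spaces in the quoted comment can be written down as a short sketch. The offsets follow directly from the ranges quoted above (PPI 0..15 maps to 16..31, SPI 0..987 maps to 32..1019); the enum and the function name fdt_to_hwirq() are hypothetical and are not taken from the patch.

```c
/* Interrupt type as carried in an FDT interrupt specifier: 0 = SPI, 1 = PPI. */
enum gic_irq_type { GIC_TYPE_SPI = 0, GIC_TYPE_PPI = 1 };

/*
 * Translate an FDT-style (type, number) pair into the hwirq number that
 * the identity-mapped scheme uses directly. Returns -1 if the number is
 * outside the range quoted for that type.
 */
static int fdt_to_hwirq(enum gic_irq_type type, int num)
{
	switch (type) {
	case GIC_TYPE_PPI:
		return (num >= 0 && num <= 15) ? num + 16 : -1;
	case GIC_TYPE_SPI:
		return (num >= 0 && num <= 987) ? num + 32 : -1;
	}
	return -1;
}
```

The sketch makes the source of confusion concrete: the same physical interrupt is "SPI 0" in one description and "32" in the other.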
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] audit: process errors from filter user rules

2013-12-04 Thread Richard Guy Briggs
Errors from filter user rules were previously ignored, and worse, an error on
an AUDIT_NEVER rule disabled logging on that rule.  On -ESTALE, retry up to 5
times.  On error on AUDIT_NEVER rules, log.

Signed-off-by: Richard Guy Briggs 
---
 kernel/audit.c   |2 +-
 kernel/auditfilter.c |   44 +++-
 2 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 4cbc945..c93cf06 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -706,7 +706,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
return 0;
 
err = audit_filter_user(msg_type);
-   if (err == 1) {
+   if (err) { /* match or error */
err = 0;
if (msg_type == AUDIT_USER_TTY) {
err = tty_audit_push_current();
diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index b4c6e03..1a7dfa5 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -1272,8 +1272,8 @@ static int audit_filter_user_rules(struct audit_krule *rule, int type,
break;
}
 
-   if (!result)
-   return 0;
+   if (result <= 0)
+   return result;
}
switch (rule->action) {
case AUDIT_NEVER:*state = AUDIT_DISABLED;   break;
@@ -1286,19 +1286,37 @@ int audit_filter_user(int type)
 {
enum audit_state state = AUDIT_DISABLED;
struct audit_entry *e;
-   int ret = 1;
-
-   rcu_read_lock();
-   list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_USER], list) {
-   if (audit_filter_user_rules(&e->rule, type, &state)) {
-   if (state == AUDIT_DISABLED)
-   ret = 0;
-   break;
+   int rc, count = 0, retry = 0, ret = 1; /* Audit by default */
+#define FILTER_RETRY_LIMIT 5
+
+   do {
+   rcu_read_lock();
+   list_for_each_entry_rcu(e,
+   &audit_filter_list[AUDIT_FILTER_USER],
+   list) {
+   retry = 0;
+   rc = audit_filter_user_rules(&e->rule, type, &state);
+   if (rc > 0) {
+   if (state == AUDIT_DISABLED)
+   ret = 0;
+   break;
+   } else if (rc < 0) {
+   if (rc == -ESTALE && count < FILTER_RETRY_LIMIT) {
+   rcu_read_unlock();
+   count++;
+   retry = 1;
+   cond_resched();
+   } else {
+   ret = rc;
+   }
+   break;
+   }
}
-   }
-   rcu_read_unlock();
+   if (!retry)
+   rcu_read_unlock();
+   } while (retry);
 
-   return ret; /* Audit by default */
+   return ret;
 }
 
 int audit_filter_type(int type)
-- 
1.7.1
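
The bounded retry described in the changelog, stripped of the RCU list walk, amounts to something like the following userspace sketch. The callback type, eval_with_retry() and the flaky_eval() test double are hypothetical names used only for illustration; only FILTER_RETRY_LIMIT and the -ESTALE convention come from the patch.

```c
#include <errno.h>

#define FILTER_RETRY_LIMIT 5

/* Hypothetical rule evaluator: >0 match, 0 no match, <0 error. */
typedef int (*rule_eval_fn)(void *ctx);

/*
 * Evaluate once, then retry while the evaluator reports -ESTALE, for at
 * most FILTER_RETRY_LIMIT retries. Any other result, including other
 * errors, is returned to the caller unchanged.
 */
static int eval_with_retry(rule_eval_fn eval, void *ctx)
{
	int rc, count = 0;

	do {
		rc = eval(ctx);
	} while (rc == -ESTALE && count++ < FILTER_RETRY_LIMIT);

	return rc;
}

/* Test double: returns -ESTALE until *remaining reaches zero, then matches. */
static int flaky_eval(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -ESTALE;
	}
	return 1;
}
```

If the evaluator keeps returning -ESTALE, the caller sees -ESTALE after one initial attempt plus five retries, so the error is reported rather than silently dropped.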



Re: [RFC part3 PATCH 1/2] clocksource / arch_timer: Use ACPI GTDT table to initialize arch timer

2013-12-04 Thread Arnd Bergmann
On Tuesday 03 December 2013, Hanjun Guo wrote:

> +#ifdef CONFIG_ACPI
> +void __init arch_timer_acpi_init(void)
> +{
...
> +}
> +#else
> +void __init arch_timer_acpi_init(void) { return; };
> +#endif
>  

The #else clause is broken in combination with 

> diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
> index 6d26b40..2654edf 100644
> --- a/include/clocksource/arm_arch_timer.h
> +++ b/include/clocksource/arm_arch_timer.h
> @@ -66,6 +66,11 @@ static inline struct timecounter *arch_timer_get_timecounter(void)
>   return NULL;
>  }
>  
> +static inline void arch_timer_acpi_init(void)
> +{
> + return;
> +}
> +
>  #endif
>  

this inline function. Have you build-tested this with CONFIG_ACPI disabled?

Arnd
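
For comparison, the usual way to avoid this kind of clash is to have exactly one stub, in the header, selected by the same config symbol that guards the real definition. The sketch below assumes the stub is keyed on CONFIG_ACPI; the quoted hunk does not show which #ifdef actually surrounds it in arm_arch_timer.h, and acpi_timer_ready is a hypothetical variable added only so the effect is observable.

```c
/* --- what would live in the shared header (hypothetical layout) --- */
#ifdef CONFIG_ACPI
void arch_timer_acpi_init(void);                    /* real version in the .c file */
#else
static inline void arch_timer_acpi_init(void) { }   /* no-op stub, header only */
#endif

/* --- what would live in the driver .c file --- */
static int acpi_timer_ready;    /* illustrative state, not from the patch */

#ifdef CONFIG_ACPI
void arch_timer_acpi_init(void)
{
	acpi_timer_ready = 1;   /* the real GTDT parsing would go here */
}
#endif
/* No #else branch here: the header already supplies the stub. */
```

Built without CONFIG_ACPI defined, callers get the inline no-op and there is no second, non-inline definition of the same name to collide with it.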


[GIT PULL] UPDATED - x86 and EFI fixes for v3.13-rc3

2013-12-04 Thread H. Peter Anvin
Hi Linus,

The following changes since commit 6ce4eac1f600b34f2f7f58f9cd8f0503d79e42ae:

  Linux 3.13-rc1 (2013-11-22 11:30:55 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/urgent

for you to fetch changes up to 2885432aaf15c1b7e65c787bfe7c5fec428296f0:

  x86/apic, doc: Justification for disabling IO APIC before Local APIC (2013-12-04 19:33:21 -0800)

The update is that I added a patch which contains only a comment and
no code changes (see below).

Half of these are EFI-related:

By far the biggest change is to hold off the deletion of a
sysfs entry while a backend scan is in progress.  This is to avoid
calling kmemdup() while under a spinlock.

The other major change is for each entry in the EFI pstore backend to
get a unique identifier, as required by the pstore filesystem proper.

The other changes are:

A fix to the recent consolidation and optimization of using "asm goto"
with read-modify-write operations, which broke the bitops; specifically
in such a way that we could end up generating invalid code.

A build hack to make sure we compile with -mno-sse.  icc, and
most likely future versions of gcc, can generate SSE instructions
unless we tell it not to.

A comment-only patch to a change that was due in part to an unpublished
erratum; now that the erratum is published we want to add a comment
explaining why.


H. Peter Anvin (2):
  x86-64, build: Always pass in -mno-sse
  x86, bitops: Correct the assembly constraints to testing bitops

Madper Xie (1):
  efi-pstore: Make efi-pstore return a unique id

Matt Fleming (1):
  x86/efi: Fix earlyprintk off-by-one bug

Seiji Aguchi (1):
  efivars, efi-pstore: Hold off deletion of sysfs entry until the scan is completed

 arch/x86/Makefile|   8 +-
 arch/x86/include/asm/atomic.h|   4 +-
 arch/x86/include/asm/atomic64_64.h   |   4 +-
 arch/x86/include/asm/bitops.h|   6 +-
 arch/x86/include/asm/local.h |   4 +-
 arch/x86/include/asm/rmwcc.h |   8 +-
 arch/x86/platform/efi/early_printk.c |   2 +-
 drivers/firmware/efi/efi-pstore.c| 163 +++
 drivers/firmware/efi/efivars.c   |  12 ++-
 drivers/firmware/efi/vars.c  |  12 ++-
 include/linux/efi.h  |   4 +
 11 files changed, 190 insertions(+), 37 deletions(-)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 41250fb..eda00f9 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -31,6 +31,9 @@ ifeq ($(CONFIG_X86_32),y)
 
 KBUILD_CFLAGS += -msoft-float -mregparm=3 -freg-struct-return
 
+# Don't autogenerate SSE instructions
+   KBUILD_CFLAGS += -mno-sse
+
 # Never want PIC in a 32-bit kernel, prevent breakage with GCC built
 # with nonstandard options
 KBUILD_CFLAGS += -fno-pic
@@ -57,8 +60,11 @@ else
 KBUILD_AFLAGS += -m64
 KBUILD_CFLAGS += -m64
 
+# Don't autogenerate SSE instructions
+   KBUILD_CFLAGS += -mno-sse
+
# Use -mpreferred-stack-boundary=3 if supported.
-   KBUILD_CFLAGS += $(call cc-option,-mno-sse -mpreferred-stack-boundary=3)
+   KBUILD_CFLAGS += $(call cc-option,-mpreferred-stack-boundary=3)
 
 # FIXME - should be integrated in Makefile.cpu (Makefile_32.cpu)
 cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8)
diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index da31c8b..b17f4f4 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -77,7 +77,7 @@ static inline void atomic_sub(int i, atomic_t *v)
  */
 static inline int atomic_sub_and_test(int i, atomic_t *v)
 {
-   GEN_BINARY_RMWcc(LOCK_PREFIX "subl", v->counter, i, "%0", "e");
+   GEN_BINARY_RMWcc(LOCK_PREFIX "subl", v->counter, "er", i, "%0", "e");
 }
 
 /**
@@ -141,7 +141,7 @@ static inline int atomic_inc_and_test(atomic_t *v)
  */
 static inline int atomic_add_negative(int i, atomic_t *v)
 {
-   GEN_BINARY_RMWcc(LOCK_PREFIX "addl", v->counter, i, "%0", "s");
+   GEN_BINARY_RMWcc(LOCK_PREFIX "addl", v->counter, "er", i, "%0", "s");
 }
 
 /**
diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h
index 3f065c9..46e9052 100644
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -72,7 +72,7 @@ static inline void atomic64_sub(long i, atomic64_t *v)
  */
 static inline int atomic64_sub_and_test(long i, atomic64_t *v)
 {
-   GEN_BINARY_RMWcc(LOCK_PREFIX "subq", v->counter, i, "%0", "e");
+   GEN_BINARY_RMWcc(LOCK_PREFIX "subq", v->counter, "er", i, "%0", "e");
 }
 
 /**
@@ -138,7 +138,7 @@ static inline int atomic64_inc_and_test(atomic64_t *v)
  */
 static inline int atomic64_add_negative(long i, atomic64_t *v)
 {
-   GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v->counter, i, "%0", "s");
+   GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v

Re: [OOPS, 3.13-rc2] null ptr in dio_complete()

2013-12-04 Thread Jens Axboe
On Thu, Dec 05 2013, Dave Chinner wrote:
> On Wed, Dec 04, 2013 at 03:17:49PM +1100, Dave Chinner wrote:
> > On Tue, Dec 03, 2013 at 08:47:12PM -0700, Jens Axboe wrote:
> > > On Wed, Dec 04 2013, Dave Chinner wrote:
> > > > On Wed, Dec 04, 2013 at 12:58:38PM +1100, Dave Chinner wrote:
> > > > > On Wed, Dec 04, 2013 at 08:59:40AM +1100, Dave Chinner wrote:
> > > > > > Hi Jens,
> > > > > > 
> > > > > > Not sure who to direct this to or CC, so I figured you are the
> > > > > > person to do that. I just had xfstests generic/299 (an AIO/DIO test)
> > > > > > oops in dio_complete() like so:
> > > > > > 
> > 
> > > > > > [ 9650.590630]  
> > > > > > [ 9650.590630]  [] dio_complete+0xa3/0x140
> > > > > > [ 9650.590630]  [] dio_bio_end_aio+0x7a/0x110
> > > > > > [ 9650.590630]  [] ? dio_bio_end_aio+0x5/0x110
> > > > > > [ 9650.590630]  [] bio_endio+0x1d/0x30
> > > > > > [ 9650.590630]  [] blk_mq_complete_request+0x5f/0x120
> > > > > > [ 9650.590630]  [] __blk_mq_end_io+0x16/0x20
> > > > > > [ 9650.590630]  [] blk_mq_end_io+0x68/0xd0
> > > > > > [ 9650.590630]  [] virtblk_done+0x67/0x110
> > > > > > [ 9650.590630]  [] vring_interrupt+0x35/0x60
> > .
> > > > > And I just hit this from running xfs_repair which is doing
> > > > > multithreaded direct IO directly on /dev/vdc:
> > > > > 
> > 
> > > > > [ 1776.510446] IP: [] blk_account_io_done+0x6a/0x180
> > 
> > > > > [ 1776.512577]  [] blk_mq_complete_request+0xb8/0x120
> > > > > [ 1776.512577]  [] __blk_mq_end_io+0x16/0x20
> > > > > [ 1776.512577]  [] blk_mq_end_io+0x68/0xd0
> > > > > [ 1776.512577]  [] virtblk_done+0x67/0x110
> > > > > [ 1776.512577]  [] vring_interrupt+0x35/0x60
> > > > > [ 1776.512577]  [] handle_irq_event_percpu+0x54/0x1e0
> > .
> > > > > So this is looking like another virtio+blk_mq problem
> > > > 
> > > > This one is definitely reproducible. Just hit it again...
> > > 
> > > I'll take a look at this. You don't happen to have gdb dumps of the
> > > lines associated with those crashes? Just to save me some digging
> > > time...
> > 
> > Only this:
> > 
> > (gdb) l *(dio_complete+0xa3)
> > 0x811ddae3 is in dio_complete (fs/direct-io.c:282).
> > 277 }
> > 278
> > 279 aio_complete(dio->iocb, ret, 0);
> > 280 }
> > 281
> > 282 kmem_cache_free(dio_cache, dio);
> > 283 return ret;
> > 284 }
> > 285
> > 286 static void dio_aio_complete_work(struct work_struct *work)
> > 
> > And this:
> > 
> > (gdb) l *(blk_account_io_done+0x6a)
> > 0x81755b6a is in blk_account_io_done (block/blk-core.c:2049).
> > 2044    int cpu;
> > 2045
> > 2046    cpu = part_stat_lock();
> > 2047    part = req->part;
> > 2048
> > 2049    part_stat_inc(cpu, part, ios[rw]);
> > 2050    part_stat_add(cpu, part, ticks[rw], duration);
> > 2051    part_round_stats(cpu, part);
> > 2052    part_dec_in_flight(part, rw);
> > 2053
> > 
> > as I've rebuilt the kernel with different patches since the one
> > running on the machine that is triggering the problem.
> 
> Any update on this, Jens? I've hit this blk_account_io_done() panic
> 10 times in the past 2 hours while trying to do xfs_repair
> testing

No, sorry, no updates yet... I haven't had time to look into it today.
To reproduce tomorrow, can you mail me your exact setup (kvm invocation,
etc) and how your guest is setup and if there's any special way I need
to run xfstest or xfs_repair?

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC part1 PATCH 5/7] ARM64 / ACPI: Introduce arm_core.c and its related head file

2013-12-04 Thread Arnd Bergmann
On Tuesday 03 December 2013, Hanjun Guo wrote:
> +static unsigned int gsi_to_irq(unsigned int gsi)
> +{
> +   int irq = irq_create_mapping(NULL, gsi);
> +
> +   return irq;
> +}

I think this could use a comment regarding your plans for IRQ domains.

Do you expect that all ACPI systems would have only a single GIC IRQ
controller and a single domain, or do you plan to add irqdomain code
later?

Arnd


[tip:x86/urgent] x86/apic, doc: Justification for disabling IO APIC before Local APIC

2013-12-04 Thread tip-bot for Fenghua Yu
Commit-ID:  2885432aaf15c1b7e65c787bfe7c5fec428296f0
Gitweb: http://git.kernel.org/tip/2885432aaf15c1b7e65c787bfe7c5fec428296f0
Author: Fenghua Yu 
AuthorDate: Wed, 4 Dec 2013 16:07:49 -0800
Committer:  H. Peter Anvin 
CommitDate: Wed, 4 Dec 2013 19:33:21 -0800

x86/apic, doc: Justification for disabling IO APIC before Local APIC

Since erratum AVR31 in "Intel Atom Processor C2000 Product Family
Specification Update" is now published, I added a justification
comment for disabling IO APIC before Local APIC, as changed in commit:

522e66464467 x86/apic: Disable I/O APIC before shutdown of the local APIC

Signed-off-by: Fenghua Yu 
Link: http://lkml.kernel.org/r/1386202069-51515-1-git-send-email-fenghua...@intel.com
Signed-off-by: H. Peter Anvin 
---
 arch/x86/kernel/reboot.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index da3c599..c752cb4 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -558,6 +558,17 @@ void native_machine_shutdown(void)
 {
/* Stop the cpus and apics */
 #ifdef CONFIG_X86_IO_APIC
+   /*
+* Disabling IO APIC before local APIC is a workaround for
+* erratum AVR31 in "Intel Atom Processor C2000 Product Family
+* Specification Update". In this situation, interrupts that target
+* a Logical Processor whose Local APIC is either in the process of
+* being hardware disabled or software disabled are neither delivered
+* nor discarded. When this erratum occurs, the processor may hang.
+*
+* Even without the erratum, it still makes sense to quiet IO APIC
+* before disabling Local APIC.
+*/
disable_IO_APIC();
 #endif
 


Re: [PATCH] thp: move preallocated PTE page table on move_huge_pmd()

2013-12-04 Thread Andrey Wagin
2013/12/4 Kirill A. Shutemov :
> Andrey Wagin reported crash on VM_BUG_ON() in pgtable_pmd_page_dtor()
> with the following backtrace:
>
>   [] free_pgd_range+0x2bf/0x410
>   [] free_pgtables+0xce/0x120
>   [] unmap_region+0xe0/0x120
>   [] ? move_page_tables+0x526/0x6b0
>   [] do_munmap+0x249/0x360
>   [] move_vma+0x144/0x270
>   [] SyS_mremap+0x3b9/0x510
>   [] system_call_fastpath+0x16/0x1b
>
> The crash can be reproduced with this test case:
>
>   #define _GNU_SOURCE
>   #include 
>   #include 
>   #include 
>
>   #define MB (1024 * 1024UL)
>   #define GB (1024 * MB)
>
>   int main(int argc, char **argv)
>   {
> char *p;
> int i;
>
> p = mmap((void *) GB, 10 * MB, PROT_READ | PROT_WRITE,
> MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
> for (i = 0; i < 10 * MB; i += 4096)
> p[i] = 1;
> mremap(p, 10 * MB, 10 * MB, MREMAP_FIXED | MREMAP_MAYMOVE, 2 * GB);
> return 0;
>   }
>
> Due to split PMD lock, we now store preallocated PTE tables for THP
> pages per-PMD table.  It means we need to move them to other PMD table
> if huge PMD moved there.
>
> Signed-off-by: Kirill A. Shutemov 
> Reported-by: Andrey Vagin 

My tests ran overnight without any problems.  Thanks for
the quick response.

Tested-by: Andrey Vagin 

> ---
>  mm/huge_memory.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index bccd5a628ea6..33a5dc492810 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1481,8 +1481,18 @@ int move_huge_pmd(struct vm_area_struct *vma, struct vm_area_struct *new_vma,
> pmd = pmdp_get_and_clear(mm, old_addr, old_pmd);
> VM_BUG_ON(!pmd_none(*new_pmd));
> set_pmd_at(mm, new_addr, new_pmd, pmd_mksoft_dirty(pmd));
> -   if (new_ptl != old_ptl)
> +   if (new_ptl != old_ptl) {
> +   pgtable_t pgtable;
> +
> +   /*
> +* Move preallocated PTE page table if new_pmd is on
> +* different PMD page table.
> +*/
> +   pgtable = pgtable_trans_huge_withdraw(mm, old_pmd);
> +   pgtable_trans_huge_deposit(mm, new_pmd, pgtable);
> +
> spin_unlock(new_ptl);
> +   }
> spin_unlock(old_ptl);
> }
>  out:
> --
> 1.8.4.4
>
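
The bookkeeping described in the commit message can be sketched in plain userspace C. This is only a toy model, not kernel code: `toy_pmd_table`, `toy_map_huge`, `toy_move_huge_pmd` and `toy_consistent` are hypothetical names, and the two counters stand in for the real per-PMD-table deposit list. It assumes the invariant that every huge PMD has one preallocated PTE table deposited on the PMD table that holds it.

```c
#include <stdbool.h>

/*
 * Hypothetical model of one PMD page table: how many huge PMDs it holds
 * and how many preallocated PTE tables are deposited on it.
 */
struct toy_pmd_table {
	int huge_pmds;
	int deposited;
};

/* Mapping a huge page deposits one preallocated PTE table on the same table. */
static void toy_map_huge(struct toy_pmd_table *t)
{
	t->huge_pmds++;
	t->deposited++;
}

/* Move one huge PMD from src to dst, optionally carrying its deposit along. */
static void toy_move_huge_pmd(struct toy_pmd_table *src,
			      struct toy_pmd_table *dst, bool move_deposit)
{
	src->huge_pmds--;
	dst->huge_pmds++;
	if (src != dst && move_deposit) {
		src->deposited--;	/* models pgtable_trans_huge_withdraw() */
		dst->deposited++;	/* models pgtable_trans_huge_deposit() */
	}
}

/* Invariant assumed by this model: one deposit per huge PMD, per table. */
static bool toy_consistent(const struct toy_pmd_table *t)
{
	return t->huge_pmds == t->deposited;
}
```

Moving a huge PMD with `move_deposit == false` leaves both tables inconsistent in this model, which mirrors the mismatch the VM_BUG_ON() caught at teardown; with `move_deposit == true` both tables stay balanced.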


Re: [PATCH v3 1/2] test: add minimal module for verification testing

2013-12-04 Thread Rusty Russell
Andrew Morton  writes:
> On Thu, 05 Dec 2013 13:12:17 +1030 Rusty Russell wrote:
>
>> Kees Cook  writes:
>> > When doing module loading verification tests (for example, with module
>> > singing, or LSM hooks), it is very handy to have a module that can be
>> 
>> "module singing" sounds like a horrible idea!  Is the author even
>> musical?  I've only heard it said David Howls.
>
> You're such a killjoy.
>
> btw, git log | grep Singed

I had my ego singed off by Linus once, so I completely understand.

Cheers,
Rusty.


Re: [PATCH] x86: mcheck: call put_device on device_register failure

2013-12-04 Thread Chen, Gong
On Wed, Dec 04, 2013 at 07:39:07PM +0100, Levente Kurusa wrote:
> Date: Wed, 04 Dec 2013 19:39:07 +0100
> From: Levente Kurusa 
> To: Borislav Petkov , Ingo Molnar , Thomas
>  Gleixner , Tony Luck , "H. Peter
>  Anvin" , x...@kernel.org, EDAC ,
>  LKML 
> Subject: Re: [PATCH] x86: mcheck: call put_device on device_register failure
> User-Agent: Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20100101
>  Thunderbird/24.1.0
> 
> 2013-12-04 08:38, Chen, Gong:
> > On Tue, Dec 03, 2013 at 06:01:50PM +0100, Borislav Petkov wrote:
> >> Date: Tue, 3 Dec 2013 18:01:50 +0100
> >> From: Borislav Petkov 
> >> To: "Chen, Gong" 
> >> Cc: Levente Kurusa , Ingo Molnar ,
> >>  Thomas Gleixner , Tony Luck , "H.
> >>  Peter Anvin" , x...@kernel.org, EDAC
> >>  , LKML 
> >> Subject: Re: [PATCH] x86: mcheck: call put_device on device_register failure
> >> User-Agent: Mutt/1.5.21 (2010-09-15)
> >>
> >> Can you please fix your
> >>
> >> Mail-Followup-To:
> >>
> >> header? It is impossible to reply to your emails without fiddling with
> >> the To: and Cc: by hand which gets very annoying over time.
> > 
> > I add some configs in my muttrc. Hope it works.
> > 
> >>
> >> On Mon, Dec 02, 2013 at 09:23:30PM -0500, Chen, Gong wrote:
> >>> I have some concerns about it. If device_register() fails, it
> >>> already unwinds all of its error paths automatically, definitely
> >>> including put_device(). So do we really need an extra put_device()
> >>> when it returns failure?
> >>
> >> Do you mean the "done:" label in device_add() which does put_device()
> >> and which gets called by device_register()?
> >>
> > 
> > Not only that. I noticed another put_device() under the label "Error:".
> > 
> 
> That label is called when we failed to add the kobject to its parent.
> It just puts the parent of the device. I don't think it has anything
> to do with us put_device()-ing the actual device too.
> 
OK, you are right. I read some of the kobject-related code and found:

static inline void kref_init(struct kref *kref)
{
	atomic_set(&kref->refcount, 1);
}

The initial refcount is 1, which means that even if we hit an error and
device_add() calls put_device(), we still need an extra put_device() to
bring the refcount to 0 and release the dev object.

BTW, from the comments of device_register:

"NOTE: _Never_ directly free @dev after calling this function, even
 if it returned an error! Always use put_device() to give up the
 reference initialized in this function instead. "

Many callers don't follow this rule. For example:
in arch/arm/common/locomo.c
locomo_init_one_child
...
	ret = device_register(&dev->dev);
	if (ret) {
 out:
		kfree(dev);
	}
...
 
in arch/parisc/kernel/drivers.c
create_tree_node
...
	if (device_register(&dev->dev)) {
		kfree(dev);
		return NULL;
	}
...

etc.

Maybe we need one more patch to fix them all. :-)
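
To make the refcount argument concrete, here is a minimal userspace sketch. It does not use the kernel API: `toy_device`, `toy_alloc`, `toy_get`, `toy_put` and `toy_register` are hypothetical stand-ins for struct device, get_device(), put_device() and device_register(), and the "add" step is assumed to take and drop exactly one reference of its own.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: a refcount plus a release flag. */
struct toy_device {
	int refcount;
	bool *released;		/* set to true when the object is freed */
};

static struct toy_device *toy_alloc(bool *released)
{
	struct toy_device *dev = calloc(1, sizeof(*dev));

	*released = false;
	if (dev)
		dev->released = released;
	return dev;
}

static void toy_get(struct toy_device *dev)
{
	dev->refcount++;
}

/* Drop one reference; free the object when the last one goes away. */
static void toy_put(struct toy_device *dev)
{
	if (--dev->refcount == 0) {
		*dev->released = true;
		free(dev);
	}
}

/*
 * Models the register path: the refcount starts at 1 (as kref_init() sets
 * it), then the "add" step takes a reference and drops it again on its way
 * out. On failure the initial reference is still held by the caller.
 */
static int toy_register(struct toy_device *dev, bool fail)
{
	dev->refcount = 1;	/* initial reference */
	toy_get(dev);		/* reference taken by the add step */
	toy_put(dev);		/* dropped on the add step's exit path */
	return fail ? -1 : 0;
}
```

After a failed `toy_register()` the count in this model is still 1, so only one more `toy_put()` releases the object through its normal release path; freeing it directly would bypass that path.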
> -- 
> Regards,
> Levente Kurusa
> --
> To unsubscribe from this list: send the line "unsubscribe linux-edac" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [PATCH v2] watchdog: Add a sysctl to disable soft lockup detector

2013-12-04 Thread Don Zickus
On Wed, Dec 04, 2013 at 05:55:56PM -0800, Ben Zhang wrote:
> Currently, the soft lockup detector and hard lockup detector
> can be enabled or disabled together via the flag variable
> watchdog_user_enabled. There isn't a way to disable only the
> soft lockup detector while keeping the hard lockup detector
> running.
> 
> The hard lockup detector sometimes does not work on an x86
> machine with multiple cpus when softlockup_panic is set to 0.
> For example:
> 1. Hard lockup occurs on cpu0 ("cli" followed by an infinite loop).
> 2. Soft lockup occurs on cpu1 shortly after because cpu1 tries to
> send a function to cpu0 via smp_call_function_single().
> 3. watchdog_timer_fn() detects the soft lockup on cpu1 and
> dumps the stack. dump_stack() eventually calls touch_nmi_watchdog()
> which sets watchdog_nmi_touch=true for all cpus and sets
> watchdog_touch_ts=0 for cpu1.
> 4. NMI fires on cpu0. watchdog_overflow_callback() sees
> watchdog_nmi_touch=true, so it does not do anything except setting
> watchdog_nmi_touch=false.
> 5. watchdog_timer_fn() is called again on cpu1, it sees
> watchdog_touch_ts=0, so reloads it with the current tick. Thus,
> is_softlockup() returns false, and soft_watchdog_warn is set to false.
> 6. Before NMI can fire on cpu0 again with watchdog_nmi_touch=false,
> watchdog_timer_fn() reports the soft lockup on cpu1 again
> and we go back to #3.

Yup.  This is because touch_nmi_watchdog touches _all_ the cpus instead of
its own.  I tried to fix this 3 years ago, but Andrew wanted me to fix
something else in the panic code first as a trade for his ack in changing
the semantics of touch_nmi_watchdog. ;-p

I doubt this patch still applies but the concept is pretty simple, touch
only the local cpu not all of them.  Should be pretty easy to port.

Not entirely sure if this would be accepted by folks, but I think it would
address your fundamental problem.
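
A rough userspace model of the difference between the two semantics, as a sketch only. `toy_watchdog`, `touch_all`, `touch_local`, `nmi_check` and `simulate` are hypothetical names, and the model is much simpler than kernel/watchdog.c: it assumes cpu1 touches the watchdog once before every NMI on a stuck cpu0, and it reports a hard lockup when the cpu's timer interrupt count has not moved since the previous check.

```c
#include <stdbool.h>

#define NCPUS 2

struct toy_watchdog {
	bool nmi_touch[NCPUS];			/* models per-cpu watchdog_nmi_touch */
	unsigned long hrtimer_irqs[NCPUS];	/* per-cpu timer interrupt count */
	unsigned long saved_irqs[NCPUS];	/* count seen at the previous NMI */
};

/* Old semantics: a touch from any cpu marks every cpu. */
static void touch_all(struct toy_watchdog *wd)
{
	for (int cpu = 0; cpu < NCPUS; cpu++)
		wd->nmi_touch[cpu] = true;
}

/* Proposed semantics: a touch only marks the calling cpu. */
static void touch_local(struct toy_watchdog *wd, int cpu)
{
	wd->nmi_touch[cpu] = true;
}

/* Models the NMI callback: returns true if a hard lockup is reported. */
static bool nmi_check(struct toy_watchdog *wd, int cpu)
{
	if (wd->nmi_touch[cpu]) {
		wd->nmi_touch[cpu] = false;
		return false;
	}
	if (wd->hrtimer_irqs[cpu] == wd->saved_irqs[cpu])
		return true;	/* no timer interrupts since the last NMI */
	wd->saved_irqs[cpu] = wd->hrtimer_irqs[cpu];
	return false;
}

/*
 * cpu0 is stuck with interrupts off, so its timer count never advances.
 * cpu1 dumps its stack (touching the watchdog) before each NMI on cpu0.
 * Returns true if cpu0's lockup is ever reported.
 */
static bool simulate(bool touch_every_cpu, int rounds)
{
	struct toy_watchdog wd = {0};

	for (int i = 0; i < rounds; i++) {
		if (touch_every_cpu)
			touch_all(&wd);
		else
			touch_local(&wd, 1);
		if (nmi_check(&wd, 0))
			return true;
	}
	return false;
}
```

In this model `simulate(true, n)` never reports the lockup on cpu0, while `simulate(false, n)` reports it on the first NMI.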

Cheers,
Don 

---8<
From: Don Zickus 
Date: Thu, 4 Nov 2010 20:53:03 -0400
Subject: [PATCH v3] watchdog:  touch_nmi_watchdog should only touch local cpu not every one

I ran into a scenario where while one cpu was stuck and should have panic'd
because of the NMI watchdog, it didn't.  The reason was another cpu was spewing
stack dumps on to the console.  Upon investigation, I noticed that when writing
to the console and also when dumping the stack, the watchdog is touched.

This causes all the cpus to reset their NMI watchdog flags and the 'stuck' cpu
just spins forever.

This change causes the semantics of touch_nmi_watchdog to be changed slightly.
Previously, I accidentally changed the semantics and we noticed there was a
codepath in which touch_nmi_watchdog could be touched from a preemptible area.
That caused a BUG() to happen when CONFIG_DEBUG_PREEMPT was enabled.  I believe
it was the acpi code.

My attempt here re-introduces the change to have the touch_nmi_watchdog() code
only touch the local cpu instead of all of the cpus.  But instead of using
__get_cpu_var(), I use the __raw_get_cpu_var() version.

This avoids the preemption problem.  However my reasoning wasn't because I was
trying to be lazy.  Instead I rationalized it as, well if preemption is enabled
then interrupts should be enabled too and the NMI watchdog will have no reason
to trigger.  So it won't matter if the wrong cpu is touched because the percpu
interrupt counters the NMI watchdog uses should still be incrementing.

V3: Really remove touch_all_nmi_watchdogs()
V2: Remove touch_all_nmi_watchdogs()

Signed-off-by: Don Zickus 
---
 kernel/watchdog.c |   17 +
 1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index dc8e168..09fddd7 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -141,14 +141,15 @@ void touch_all_softlockup_watchdogs(void)
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
 void touch_nmi_watchdog(void)
 {
-   if (watchdog_enabled) {
-   unsigned cpu;
+   /*
+* Using __raw here because some code paths have
+* preemption enabled.  If preemption is enabled
+* then interrupts should be enabled too, in which
+* case we shouldn't have to worry about the watchdog
+* going off.
+*/
+   __raw_get_cpu_var(watchdog_nmi_touch) = true;
 
-   for_each_present_cpu(cpu) {
-   if (per_cpu(watchdog_nmi_touch, cpu) != true)
-   per_cpu(watchdog_nmi_touch, cpu) = true;
-   }
-   }
touch_softlockup_watchdog();
 }
 EXPORT_SYMBOL(touch_nmi_watchdog);
-- 
1.7.2.3



Re: [PATCH v3 1/2] test: add minimal module for verification testing

2013-12-04 Thread Andrew Morton
On Thu, 05 Dec 2013 13:12:17 +1030 Rusty Russell  wrote:

> Kees Cook  writes:
> > When doing module loading verification tests (for example, with module
> > singing, or LSM hooks), it is very handy to have a module that can be
> 
> "module singing" sounds like a horrible idea!  Is the author even
> musical?  I've only heard it said David Howls.

You're such a killjoy.

btw, git log | grep Singed


Re: lots of brief rcu stalls.

2013-12-04 Thread Paul E. McKenney
On Wed, Dec 04, 2013 at 06:36:05PM -0800, Joe Perches wrote:
> On Wed, 2013-12-04 at 18:18 -0800, Eric Dumazet wrote:
> > On Wed, 2013-12-04 at 16:16 -0800, Paul E. McKenney wrote:
> > > + ULONG_CMP_GE(ACCESS_ONCE(jiffies), rdp->rsp->jiffies_resched)) {
> 
> perhaps time_before_eq
> 
> > jiffies should not need ACCESS_ONCE(), right ?
> > 
> > It is one of the few variables marked with volatile keyword.
> 
> It does seem redundant
> 
> $ git grep -n "ACCESS_ONCE(jiffies)"
> kernel/rcu/torture.c:1354:  jiffies_snap = ACCESS_ONCE(jiffies);
> kernel/rcu/torture.c:1363:  jiffies_snap = ACCESS_ONCE(jiffies);
> kernel/rcu/tree.c:820:  unsigned long j = ACCESS_ONCE(jiffies);
> kernel/rcu/tree.c:975:  j = ACCESS_ONCE(jiffies);

I took care of the first four, plus one more that I have in my local tree.

Thanx, Paul

> kernel/sched/core.c:2324:   unsigned long next, now = ACCESS_ONCE(jiffies);
> kernel/sched/proc.c:536:    unsigned long curr_jiffies = ACCESS_ONCE(jiffies);
> kernel/sched/proc.c:558:    unsigned long curr_jiffies = ACCESS_ONCE(jiffies);
> 
> 
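
For reference, a small userspace sketch of the two wraparound-safe comparisons mentioned in this thread. The functions `ulong_cmp_ge` and `toy_time_after_eq` are hypothetical names modeled on the shape of ULONG_CMP_GE() and time_after_eq(); the second one assumes the usual two's-complement conversion from unsigned long to long.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * True if a is at or after b, treating the counter as circular:
 * the unsigned difference must fall in the lower half of the range.
 */
static bool ulong_cmp_ge(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 >= a - b;
}

/*
 * Same question answered through a signed difference. Assumes that
 * converting an out-of-range unsigned long to long wraps (two's complement).
 */
static bool toy_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}
```

Both give the expected answer when the counter has wrapped between the two samples, where a plain `a >= b` would not.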



Re: Will CPU 0 be forever prohibited from NO_HZ_FULL status?

2013-12-04 Thread Paul E. McKenney
On Thu, Dec 05, 2013 at 02:20:55AM +0100, Frederic Weisbecker wrote:
> On Wed, Dec 04, 2013 at 11:39:57AM -0800, Paul E. McKenney wrote:
> > Hello, Frederic,
> > 
> > Just realized that I could further decrease RT latency of one of my "shut
> > up RCU on NO_HZ_FULL CPUs" patches if I relied on CPU 0 always having
> > a scheduling-clock tick unless the entire system is idle.  The trick
> > is that I could then rely on CPU 0 to detect RCU CPU stall warnings,
> > and remove the checking from the other CPUs.
> > 
> > Thoughts?
> 
> You're right on time as I'm currently working on that :)
> So the plan is to allow timekeeping to be handled by a set of CPUs
> (cpu_housekeeping_mask, which I guess should be ~nohz_full_mask &
> cpu_online_mask). I think it will be better for powersaving. I guess
> you could balance the RCU stall checks in this set of housekeeping CPUs?
> 
> It should be very easy to make the rcu sysidle stuff support that
> housekeeping set. I just looked into it and all we need to do is to turn
> the several "cpu == tick_do_timer" checks into something like
> is_housekeeping_cpu(cpu). And maybe a few easy details, like which CPU
> from the housekeeping set should get the kick IPI; well, the first one
> available should be a good start. Of course I expect some issues with
> cpu hotplug.
> But other than that, RCU sysidle detection is mostly ready to support
> tracking only a given subset of CPUs instead of all of them. That's in
> fact what it already does currently by excluding the fixed boot
> timekeeping CPU.
> 
> So I'm working on that and should have some patches ready soon.

Thank you for the info!  Nice to know that RCU will continue to be able
to rely on there being at least one housekeeping CPU.  ;-)

At that point, tick_nohz_full_cpu() would still be a good way for RCU
to distinguish housekeeping CPUs from working CPUs, correct?

> In fact I just realized that all the sysidle detection infrastructure is
> there and working but we forgot to plug it into the tick engine, and thus
> we are still running with periodic CPU 0 even with
> CONFIG_NO_HZ_FULL_SYSIDLE=y. Anyway, I have a few changes ready to enable
> that, let's hope testing will be ok :)

Indeed!  ;-)

The CONFIG_NO_HZ_FULL_SYSIDLE=y might complicate things a bit.  But I
guess the problem would be a corner case -- the system entered sysidle
mode with a grace period pending, which should eventually wake up the
corresponding grace-period kthread, which might be prevented from ever
running due to high load or something.  If that problem arises, I will
fix it.

So there!  ;-)

Thanx, Paul



Re: [patch 7/8] mm, memcg: allow processes handling oom notifications to access reserves

2013-12-04 Thread Tejun Heo
Hello,

On Wed, Dec 04, 2013 at 05:49:04PM -0800, David Rientjes wrote:
> That's not what this series is addressing, though, and in fact it's quite 
> the opposite.  It acknowledges that userspace oom handlers need to 
> allocate and that anything else would be too difficult to maintain 
> (thereby agreeing with the above), so we must set aside memory that they 
> are exclusively allowed to access.  For the vast majority of users who 
> will not use userspace oom handlers, they can just use the default value 
> of memory.oom_reserve_in_bytes == 0 and they incur absolutely no side-
> effects as a result of this series.

Umm.. without delving into details, aren't you basically creating a
memory cgroup inside a memory cgroup?  Doesn't sound like a
particularly well thought-out plan to me.

> For those who do use userspace oom handlers, like Google, this allows us 
> to set aside memory to allow the userspace oom handlers to kill a process, 
> dump the heap, send a signal, drop caches, etc. when waking up.

Seems kinda obvious.  Put it in a separate cgroup?  You're basically
saying it doesn't want to be under the same memory limit as the
processes that it's looking over.  That's like the definition of being
in a different cgroup.

Thanks.

-- 
tejun


Re: [PATCH] virtio_balloon: update_balloon_size(): update correct field

2013-12-04 Thread Rusty Russell
Luiz Capitulino  writes:
> According to the virtio spec, the device configuration field
> that should be updated after an inflation or deflation
> operation is the 'actual' field, not the 'num_pages' one.
>
> Commit 855e0c5288177bcb193f6f6316952d2490478e1c swapped them
> in update_balloon_size(). Fix it.
>
> Signed-off-by: Luiz Capitulino 

Damn, exactly right.  Good catch.

Applied,
Rusty.

>  drivers/virtio/virtio_balloon.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index c444654..5c4a95b 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -285,7 +285,7 @@ static void update_balloon_size(struct virtio_balloon *vb)
>  {
>   __le32 actual = cpu_to_le32(vb->num_pages);
>  
> - virtio_cwrite(vb->vdev, struct virtio_balloon_config, num_pages,
> + virtio_cwrite(vb->vdev, struct virtio_balloon_config, actual,
> &actual);
>  }
>  
> -- 
> 1.8.1.4
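
The field mix-up is easy to see in a userspace sketch of the config layout. This is an illustration only: `toy_balloon_config`, `toy_cwrite32` and `toy_update_balloon_size` are hypothetical names, the struct assumes the two 32-bit fields named in the patch (`num_pages` first, then `actual`), and endianness conversion is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of the balloon config space: two 32-bit fields. */
struct toy_balloon_config {
	uint32_t num_pages;	/* target size requested from the driver */
	uint32_t actual;	/* size the driver reports back */
};

/* Write one 32-bit value at a byte offset inside the config space. */
static void toy_cwrite32(struct toy_balloon_config *cfg, size_t offset,
			 uint32_t val)
{
	memcpy((unsigned char *)cfg + offset, &val, sizeof(val));
}

/* After inflating or deflating, report the result in 'actual' only. */
static void toy_update_balloon_size(struct toy_balloon_config *cfg,
				    uint32_t pages_in_balloon)
{
	toy_cwrite32(cfg, offsetof(struct toy_balloon_config, actual),
		     pages_in_balloon);
}
```

Passing the offset of `num_pages` instead would overwrite the requested target with the reported size, which is the swap the patch undoes.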


Re: [PATCH v3 1/2] test: add minimal module for verification testing

2013-12-04 Thread Rusty Russell
Kees Cook  writes:
> When doing module loading verification tests (for example, with module
> singing, or LSM hooks), it is very handy to have a module that can be

"module singing" sounds like a horrible idea!  Is the author even
musical?  I've only heard it said David Howls.

But if my ack for the patch helps:

Acked-by: Rusty Russell 

Cheers,
Rusty.


> built on all systems under test, isn't auto-loaded at boot, and has
> no device or similar dependencies. This creates the "test_module.ko"
> module for that purpose, which only reports its load and unload to printk.
>
> Signed-off-by: Kees Cook 
> ---
> v3:
>  - use KBUILD_MODNAME; Rusty Russell
> v2:
>  - use pr_warn, better comment, add headers explicitly, move to lib/; akpm.
> ---
>  lib/Kconfig.debug |   14 ++
>  lib/Makefile  |1 +
>  lib/test_module.c |   33 +
>  3 files changed, 48 insertions(+)
>  create mode 100644 lib/test_module.c
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index db25707aa41b..81882335c625 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1578,6 +1578,20 @@ config DMA_API_DEBUG
> This option causes a performance degredation.  Use only if you want
> to debug device drivers. If unsure, say N.
>  
> +config TEST_MODULE
> + tristate "Test module loading with 'hello world' module"
> + default n
> + depends on m
> + help
> +   This builds the "test_module" module that emits "Hello, world"
> +   on printk when loaded. It is designed to be used for basic
> +   evaluation of the module loading subsystem (for example when
> +   validating module verification). It lacks any extra dependencies,
> +   and will not normally be loaded by the system unless explicitly
> +   requested by name.
> +
> +   If unsure, say N.
> +
>  source "samples/Kconfig"
>  
>  source "lib/Kconfig.kgdb"
> diff --git a/lib/Makefile b/lib/Makefile
> index a459c31e8c6b..b494b9af631c 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -31,6 +31,7 @@ obj-y += string_helpers.o
>  obj-$(CONFIG_TEST_STRING_HELPERS) += test-string_helpers.o
>  obj-y += kstrtox.o
>  obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
> +obj-$(CONFIG_TEST_MODULE) += test_module.o
>  
>  ifeq ($(CONFIG_DEBUG_KOBJECT),y)
>  CFLAGS_kobject.o += -DDEBUG
> diff --git a/lib/test_module.c b/lib/test_module.c
> new file mode 100644
> index ..319b66f1ff61
> --- /dev/null
> +++ b/lib/test_module.c
> @@ -0,0 +1,33 @@
> +/*
> + * This module emits "Hello, world" on printk when loaded.
> + *
> + * It is designed to be used for basic evaluation of the module loading
> + * subsystem (for example when validating module signing/verification). It
> + * lacks any extra dependencies, and will not normally be loaded by the
> + * system unless explicitly requested by name.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include 
> +#include 
> +#include 
> +
> +static int __init test_module_init(void)
> +{
> + pr_warn("Hello, world\n");
> +
> + return 0;
> +}
> +
> +module_init(test_module_init);
> +
> +static void __exit test_module_exit(void)
> +{
> + pr_warn("Goodbye\n");
> +}
> +
> +module_exit(test_module_exit);
> +
> +MODULE_AUTHOR("Kees Cook ");
> +MODULE_LICENSE("GPL");
> -- 
> 1.7.9.5


[PATCH 0/3] audit: remove audit_log_start() contention in AUDIT_USER type calls

2013-12-04 Thread Richard Guy Briggs
There is a race condition between systemd and auditd:

systemd                                |auditd
---------------------------------------+---------------------------------------
...                                    |
-> audit_receive                       |...
   -> mutex_lock(&audit_cmd_mutex)     |-> audit_receive
      ... -> audit_log_start           |   -> mutex_lock(&audit_cmd_mutex)
          -> wait_for_auditd           |      // wait for systemd
             -> schedule_timeout(60*HZ)|

This fix will take care of systemd and anything USING audit.  It still means
that we could race with something configuring audit and auditd shutting down.

The idea of dropping the lock at the top of audit_receive_msg() isn't as clean
as I had hoped, with AUDIT_ADD_RULE, AUDIT_TRIM, AUDIT_MAKE_EQUIV all
potentially allocating additional audit buffers indirectly through
trim_marked().  It may make sense to have trim_marked() send its queue through
a new thread.

Richard Guy Briggs (3):
  selinux: call WARN_ONCE() instead of calling audit_log_start()
  smack: call WARN_ONCE() instead of calling audit_log_start()
  audit: drop audit_cmd_lock in AUDIT_USER family of cases

 kernel/audit.c |2 ++
 security/selinux/ss/services.c |   12 
 security/smack/smack_lsm.c |5 ++---
 3 files changed, 8 insertions(+), 11 deletions(-)



[PATCH 1/3] selinux: call WARN_ONCE() instead of calling audit_log_start()

2013-12-04 Thread Richard Guy Briggs
Two of the conditions in selinux_audit_rule_match() should never happen and
the third indicates a race that should be retried.  Remove the calls to
audit_log() (which call audit_log_start()) and deal with the errors in the
caller, logging only once if the condition is met.  Calling audit_log_start()
in this location makes buffer allocation and locking more complicated in the
calling tree (audit_filter_user()).

Signed-off-by: Richard Guy Briggs 
---
 security/selinux/ss/services.c |   12 
 1 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index b4feecc..f4dda05 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -2938,25 +2938,21 @@ int selinux_audit_rule_match(u32 sid, u32 field, u32 op, void *vrule,
struct selinux_audit_rule *rule = vrule;
int match = 0;
 
-   if (!rule) {
-   audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR,
- "selinux_audit_rule_match: missing rule\n");
+   if (unlikely(!rule)) {
+   WARN_ONCE(1, "selinux_audit_rule_match: missing rule\n");
return -ENOENT;
}
 
read_lock(&policy_rwlock);
 
if (rule->au_seqno < latest_granting) {
-   audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR,
- "selinux_audit_rule_match: stale rule\n");
match = -ESTALE;
goto out;
}
 
ctxt = sidtab_search(&sidtab, sid);
-   if (!ctxt) {
-   audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR,
- "selinux_audit_rule_match: unrecognized SID %d\n",
+   if (unlikely(!ctxt)) {
+   WARN_ONCE(1, "selinux_audit_rule_match: unrecognized SID %d\n",
  sid);
match = -ENOENT;
goto out;
-- 
1.7.1



[PATCH 3/3] audit: drop audit_cmd_lock in AUDIT_USER family of cases

2013-12-04 Thread Richard Guy Briggs
We do not need to hold the audit_cmd_mutex for this family of cases.  The
possible exception to this is the call to audit_filter_user(), so drop the lock
immediately after.  To help in fixing the race we are trying to avoid, make
sure that nothing called by audit_filter_user() calls audit_log_start().  In
particular, watch out for *_audit_rule_match().

This fix will take care of systemd and anything USING audit.  It still means
that we could race with something configuring audit and auditd shutting down.

Signed-off-by: Richard Guy Briggs 
---
 kernel/audit.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 4689012..4cbc945 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -713,6 +713,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
if (err)
break;
}
+   mutex_unlock(&audit_cmd_mutex);
audit_log_common_recv_msg(&ab, msg_type);
if (msg_type != AUDIT_USER_TTY)
audit_log_format(ab, " msg='%.1024s'",
@@ -729,6 +730,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
}
audit_set_pid(ab, NETLINK_CB(skb).portid);
audit_log_end(ab);
+   mutex_lock(&audit_cmd_mutex);
}
break;
case AUDIT_ADD_RULE:
-- 
1.7.1



[PATCH 2/3] smack: call WARN_ONCE() instead of calling audit_log_start()

2013-12-04 Thread Richard Guy Briggs
Remove the call to audit_log() (which calls audit_log_start()) and deal with
the errors in the caller, logging only once if the condition is met.  Calling
audit_log_start() in this location makes buffer allocation and locking more
complicated in the calling tree (audit_filter_user()).

Signed-off-by: Richard Guy Briggs 
---
 security/smack/smack_lsm.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 8825375..185e2e7 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -3615,9 +3615,8 @@ static int smack_audit_rule_match(u32 secid, u32 field, u32 op, void *vrule,
struct smack_known *skp;
char *rule = vrule;
 
-   if (!rule) {
-   audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR,
- "Smack: missing rule\n");
+   if (unlikely(!rule)) {
+   WARN_ONCE(1, "Smack: missing rule\n");
return -ENOENT;
}
 
-- 
1.7.1



Re: lots of brief rcu stalls.

2013-12-04 Thread Paul E. McKenney
On Wed, Dec 04, 2013 at 06:18:30PM -0800, Eric Dumazet wrote:
> On Wed, 2013-12-04 at 16:16 -0800, Paul E. McKenney wrote:
> > +   if (rdp->rsp == rcu_state &&
> > +   ULONG_CMP_GE(ACCESS_ONCE(jiffies), rdp->rsp->jiffies_resched)) {
> > +   rdp->rsp->jiffies_resched += 5;
> > +   resched_cpu(rdp->cpu);
> > +   }
> > +
> > return 0;
> >  }
> 
> jiffies should not need ACCESS_ONCE(), right ?
> 
> It is one of the few variables marked with volatile keyword.

Good point!  I have queued a patch fixing this and four other occurrences
with your Submitted-by.

Thanx, Paul



Re: lots of brief rcu stalls.

2013-12-04 Thread Paul E. McKenney
On Wed, Dec 04, 2013 at 08:22:06PM -0500, Dave Jones wrote:
> On Wed, Dec 04, 2013 at 04:16:14PM -0800, Paul E. McKenney wrote:
>  > On Wed, Dec 04, 2013 at 06:28:38PM -0500, Dave Jones wrote:
>  > > Paul,
>  > > I'm seeing this happening more and more lately...
>  > > 
>  > > [  771.786462] INFO: rcu_preempt detected stalls on CPUs/tasks:
>  > > [  771.786552]   Tasks blocked on level-0 rcu_node (CPUs 0-3):
>  > > [  771.786574]   Tasks blocked on level-0 rcu_node (CPUs 0-3):
>  > > [  771.786595]   (detected by 0, t=6502 jiffies, g=20611, c=20610, q=0)
>  > > [  771.786620] INFO: Stall ended before state dump start
>  > > [  966.724546] INFO: rcu_preempt detected stalls on CPUs/tasks:
>  > > [  966.724854]   Tasks blocked on level-0 rcu_node (CPUs 0-3):
>  > > [  966.724931]   Tasks blocked on level-0 rcu_node (CPUs 0-3):
>  > > [  966.725005]   (detected by 0, t=26007 jiffies, g=20611, c=20610, q=0)
>  > > [  966.725093] INFO: Stall ended before state dump start
>  > > [ 1161.661459] INFO: rcu_preempt detected stalls on CPUs/tasks:
>  > > [ 1161.661763]   Tasks blocked on level-0 rcu_node (CPUs 0-3):
>  > > [ 1161.661840]   Tasks blocked on level-0 rcu_node (CPUs 0-3):
>  > > [ 1161.661915]   (detected by 0, t=45512 jiffies, g=20611, c=20610, q=0)
>  > > [ 1161.662001] INFO: Stall ended before state dump start
>  > > [ 1356.598205] INFO: rcu_preempt detected stalls on CPUs/tasks:
>  > > [ 1356.598513]   Tasks blocked on level-0 rcu_node (CPUs 0-3):
>  > > [ 1356.598590]   Tasks blocked on level-0 rcu_node (CPUs 0-3):
>  > > [ 1356.598664]   (detected by 0, t=65017 jiffies, g=20611, c=20610, q=0)
>  > > [ 1356.598751] INFO: Stall ended before state dump start
>  > > [ 1551.536099] INFO: rcu_preempt detected stalls on CPUs/tasks:
>  > > [ 1551.536408]   Tasks blocked on level-0 rcu_node (CPUs 0-3):
>  > > [ 1551.536485]   Tasks blocked on level-0 rcu_node (CPUs 0-3):
>  > > [ 1551.536559]   (detected by 0, t=84522 jiffies, g=20611, c=20610, q=0)
>  > > [ 1551.536645] INFO: Stall ended before state dump start
>  > > 
>  > > While it's apparently a non-problem, it's pretty noisy.
>  > > Any ideas?
>  > 
>  > Does the following help?
> 
> Seems so so far!

Keeping fingers firmly crossed...

Thanx, Paul



Re: [PATCH 5/5] ARM: msm: Move MSM's DT based hardware to multi-platform support

2013-12-04 Thread Arnd Bergmann
On Wednesday 04 December 2013, Stephen Boyd wrote:
> The DT based MSM platforms can join the multi-platform builds, so
> introduce a DT based ARCH_MSM option. This option allows DT based
> MSM platforms to be built into the multi-platform kernel. Also
> introduce a hidden ARCH_MSM config that both the DT and non-DT
> platform support code select to avoid churn in places that depend
> on CONFIG_ARCH_MSM.
> 
> Cc: Arnd Bergmann 
> Signed-off-by: Stephen Boyd 

Nice!

Acked-by: Arnd Bergmann 


[PATCH 3/4] perf trace: Fix summary percentage when processing files

2013-12-04 Thread David Ahern
Getting a divide by 0 when events are processed from a file:
   perf trace -i perf.data -s
   ...
   dnsmasq (1684), 10 events, inf%, 0.000 msec

The problem is that the event count is not incremented as events are
processed. With this patch:
   perf trace -i perf.data -s
   ...
   dnsmasq (1684), 10 events, 8.9%, 0.000 msec
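
To illustrate where the inf% comes from, here is a sketch that assumes the
summary computes each thread's share as its own event count divided by the
trace-wide count, in floating point.  summary_percent() is a hypothetical
helper and not the perf code.  When the trace-wide count is never
incremented, the division is by 0.0 and the result prints as inf.

```c
#include <math.h>

/* Hypothetical helper: one thread's share of all events, in percent. */
static double summary_percent(unsigned long thread_events,
			      unsigned long total_events)
{
	/* With total_events == 0 this is x / 0.0, i.e. +inf for x > 0. */
	return (double)thread_events / (double)total_events * 100.0;
}
```

Once the total is counted as samples are processed, the same expression
yields an ordinary percentage.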

Signed-off-by: David Ahern 
---
 tools/perf/builtin-trace.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 8f47eaae2f34..0203324fe585 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1770,8 +1770,10 @@ static int trace__process_sample(struct perf_tool *tool,
if (!trace->full_time && trace->base_time == 0)
trace->base_time = sample->time;
 
-   if (handler)
+   if (handler) {
+   ++trace->nr_events;
handler(trace, evsel, sample);
+   }
 
return err;
 }
-- 
1.8.3.4 (Apple Git-47)



[PATCH 2/4] perf trace: Fix crash on RHEL6

2013-12-04 Thread David Ahern
Signed-off-by: David Ahern 
---
 tools/perf/builtin-trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index a7aa771a98e6..8f47eaae2f34 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1455,7 +1455,7 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
 {
size_t printed = 0;
 
-   if (sc->tp_format != NULL) {
+   if ((sc->tp_format != NULL) && (sc->tp_format->format.fields != NULL)) {
struct format_field *field;
u8 bit = 1;
struct syscall_arg arg = {
-- 
1.8.3.4 (Apple Git-47)



[PATCH 4/4] perf trace: Add option to specify machine type

2013-12-04 Thread David Ahern
Perhaps there is a better way to do this; I could not think of one and
I don't see any field in the tracepoint that can be leveraged. So ...

perf-trace autodetects the machine type (e.g., i386, x86_64, etc) via
libaudit. When running 32-bit apps on a 64-bit kernel the wrong machine
type is used to convert syscall numbers to names leading to wrong information
getting displayed to the user. This option allows the user to override
the machine type to use.
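
A small illustration of why the machine type matters, using only the first
few syscall numbers of each ABI.  syscall_name() and the MACH_DEMO_*
constants are hypothetical stand-ins and not the libaudit API.  Number 4 is
write on i386 but stat on x86_64, so decoding a 32-bit task with the 64-bit
table mislabels the call.

```c
#include <stddef.h>

enum demo_machine { MACH_DEMO_X86, MACH_DEMO_X86_64 };

/* First entries of each ABI's syscall table (index == syscall number). */
static const char *const i386_names[] = {
	"restart_syscall", "exit", "fork", "read", "write", "open", "close",
};
static const char *const x86_64_names[] = {
	"read", "write", "open", "close", "stat", "fstat", "lstat",
};

/* Hypothetical lookup: returns NULL for numbers outside the demo tables. */
static const char *syscall_name(enum demo_machine m, unsigned int nr)
{
	const char *const *tbl;
	size_t n;

	if (m == MACH_DEMO_X86) {
		tbl = i386_names;
		n = sizeof(i386_names) / sizeof(i386_names[0]);
	} else {
		tbl = x86_64_names;
		n = sizeof(x86_64_names) / sizeof(x86_64_names[0]);
	}
	return nr < n ? tbl[nr] : NULL;
}
```

The same number looked up under the two machine types gives two different
names, which is the mismatch the override option is meant to work around.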

Signed-off-by: David Ahern 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Frederic Weisbecker 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
---
 tools/perf/builtin-trace.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 0203324fe585..4a78a39b684a 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2274,6 +2274,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
};
const char *output_name = NULL;
const char *ev_qualifier_str = NULL;
+   const char *machine_str = NULL;
const struct option trace_options[] = {
OPT_BOOLEAN(0, "comm", &trace.show_comm,
"show the thread COMM next to its id"),
@@ -2308,6 +2309,8 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
"Show only syscall summary with statistics"),
OPT_BOOLEAN('S', "with-summary", &trace.summary,
"Show all syscalls and summary with statistics"),
+   OPT_STRING('M', NULL, &machine_str, "x86|x86_64",
+"Advanced: machine type for converting system calls: x86, x86_64"),
OPT_END()
};
int err;
@@ -2318,6 +2321,17 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
 
argc = parse_options(argc, argv, trace_options, trace_usage, 0);
 
+   if (machine_str) {
+   if (strcmp(machine_str, "x86") == 0)
+   trace.audit.machine = MACH_X86;
+   else if (strcmp(machine_str, "x86_64") == 0)
+   trace.audit.machine = MACH_86_64;
+   else {
+   pr_err("Invalid machine type\n");
+   return -EINVAL;
+   }
+   }
+
/* summary_only implies summary option, but don't overwrite summary if set */
if (trace.summary_only)
trace.summary = trace.summary_only;
-- 
1.8.3.4 (Apple Git-47)



[PATCH 1/4] perf trace: Add support for syscalls vs raw_syscalls

2013-12-04 Thread David Ahern
Older kernels (e.g., RHEL6) do system call tracing via
syscalls:sys_{enter,exit} rather than raw_syscalls. Update perf-trace to
detect lack of raw_syscalls support and try syscalls.

Signed-off-by: David Ahern 
---
 tools/perf/builtin-trace.c | 28 ++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 56afe339661a..a7aa771a98e6 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -12,6 +12,7 @@
 #include "util/thread_map.h"
 #include "util/stat.h"
 #include "trace-event.h"
+#include "util/parse-events.h"
 
 #include 
 #include 
@@ -173,6 +174,10 @@ static struct perf_evsel *perf_evsel__syscall_newtp(const char *direction, void
 {
struct perf_evsel *evsel = perf_evsel__newtp("raw_syscalls", direction);
 
+   /* older kernel (e.g., RHEL6) use syscalls:{enter,exit} */
+   if (evsel == NULL)
+   evsel = perf_evsel__newtp("syscalls", direction);
+
if (evsel) {
if (perf_evsel__init_syscall_tp(evsel, handler))
goto out_delete;
@@ -1801,10 +1806,11 @@ static int trace__record(int argc, const char **argv)
"-R",
"-m", "1024",
"-c", "1",
-   "-e", "raw_syscalls:sys_enter,raw_syscalls:sys_exit",
+   "-e",
};
 
-   rec_argc = ARRAY_SIZE(record_args) + argc;
+   /* +1 is for the event string below */
+   rec_argc = ARRAY_SIZE(record_args) + 1 + argc;
rec_argv = calloc(rec_argc + 1, sizeof(char *));
 
if (rec_argv == NULL)
@@ -1813,6 +1819,17 @@ static int trace__record(int argc, const char **argv)
for (i = 0; i < ARRAY_SIZE(record_args); i++)
rec_argv[i] = record_args[i];
 
+   /* event string may be different for older kernels - e.g., RHEL6 */
+   if (is_valid_tracepoint("raw_syscalls:sys_enter"))
+   rec_argv[i] = "raw_syscalls:sys_enter,raw_syscalls:sys_exit";
+   else if (is_valid_tracepoint("syscalls:sys_enter"))
+   rec_argv[i] = "syscalls:sys_enter,syscalls:sys_exit";
+   else {
+   pr_err("Neither raw_syscalls nor syscalls events exist.\n");
+   return -1;
+   }
+   i++;
+
for (j = 0; j < (unsigned int)argc; j++, i++)
rec_argv[i] = argv[j];
 
@@ -2048,6 +2065,10 @@ static int trace__replay(struct trace *trace)
 
evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
 "raw_syscalls:sys_enter");
+   /* older kernels have syscalls tp versus raw_syscalls */
+   if (evsel == NULL)
+   evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
+ "syscalls:sys_enter");
if (evsel == NULL) {
pr_err("Data file does not have raw_syscalls:sys_enter event\n");
goto out;
@@ -2061,6 +2082,9 @@ static int trace__replay(struct trace *trace)
 
evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
 "raw_syscalls:sys_exit");
+   if (evsel == NULL)
+   evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
+ "syscalls:sys_exit");
if (evsel == NULL) {
pr_err("Data file does not have raw_syscalls:sys_exit event\n");
goto out;
-- 
1.8.3.4 (Apple Git-47)



[PATCH 0/4] perf trace fixes

2013-12-04 Thread David Ahern
Hi Arnaldo:

As I mentioned on IRC perf-trace fails on older kernels -- like RHEL6. This
set of patches makes it at least usable - though still some problems I am
hoping you can fix.

Build perf with these patches and run:

  perf trace -- dd if=/dev/zero of=/tmp/zero bs=4096 count=16

you see something like this which is just wrong:

  3.684 ( 0.007 ms): write(buf: 2, count: 140737077958816  ) = 27

there is no fd (should be 1 for the write) and all the values are wrong.
Perhaps it is an artifact of the older way of doing system call tracing, but
I see something goofy with the 3.12 kernel as well:
  5.633 ( 0.004 ms): write(fd: 2, buf: 0x7fff9177fee0, count: 24 ) = 24

According to strace that should be a return of 4096.

David Ahern (4):
  perf trace: Add support for syscalls vs raw_syscalls
  perf trace: Fix crash on RHEL6
  perf trace: Fix summary percentage when processing files
  perf trace: Add option to specify machine type

 tools/perf/builtin-trace.c | 48 ++
 1 file changed, 44 insertions(+), 4 deletions(-)

-- 
1.8.3.4 (Apple Git-47)



linux-next: Tree for Dec 5

2013-12-04 Thread Stephen Rothwell
Hi all,

Changes since 20131204:

The akpm-current tree gained a conflict against the modules tree.

Non-merge commits (relative to Linus' tree): 2453
 2661 files changed, 104935 insertions(+), 71055 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final
link) and i386, sparc, sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

I am currently merging 210 trees (counting Linus' and 29 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (8ecffd791448 Merge tag 'gpio-v3.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio)
Merging fixes/master (8ae516aa8b81 Merge tag 'trace-fixes-v3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace)
Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" not depend on vmlinux)
Merging arc-current/for-curr (da990a4f2d5a ARC: [perf] Fix a few thinkos)
Merging arm-current/fixes (11d4bb1bd067 ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation)
Merging m68k-current/for-linus (77a42796786c m68k: Remove deprecated IRQF_DISABLED)
Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2)
Merging powerpc-merge/merge (721cb59e9d95 powerpc/windfarm: Fix XServe G5 fan control Makefile issue)
Merging sparc/master (1de425c7b271 sparc64: Fix build regression)
Merging net/master (988bf4f01e6a Merge branch 'cxgb4')
Merging ipsec/master (dff345c5c85d be2net: call napi_disable() for all event queues)
Merging sound-current/for-linus (0756f09c4946 ALSA: hda - Fix silent output on MacBook Air 2,1)
Merging pci-current/for-linus (4bff6749905d PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev())
Merging wireless/master (a59b40b30f3f Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211)
Merging driver-core.current/driver-core-linus (dc1ccc48159d Linux 3.13-rc2)
Merging tty.current/tty-linus (39434abd942c n_tty: Fix missing newline echo)
Merging usb.current/usb-linus (eee52f9edd0f USB: switch maintainership of chipidea to Peter)
Merging staging.current/staging-linus (55ef003e4ae6 Merge tag 'iio-fixes-for-3.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus)
Merging char-misc.current/char-misc-linus (d0b00d3fb96d Merge tag 'extcon-linus-for-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-linus)
Merging input-current/for-linus (4ef38351d770 Input: usbtouchscreen - separate report and transmit buffer size handling)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe)
Merging crypto-current/master (8ec25c512916 crypto: testmgr - fix sglen in test_aead for case 'dst != src')
Merging ide/master (c2f7d1e103ef ide: pmac: remove unnecessary pci_set_drvdata())
Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff)
Merging sh-current/sh-fixes-for-linus (44033109e99c SH: Convert out[bwl] macros to inline functions)
Merging devicetree-current/devicetree/merge (1931ee143b0a Revert "drivers: of: add initialization code for dma reserved memory")
Merging rr-fixes/fixes (af91706d5dde

Re: [PATCH 1/6] GenWQE PCI support, health monitoring and recovery

2013-12-04 Thread Arnd Bergmann
On Wednesday 04 December 2013, Frank Haverkamp wrote:
> Hi Arnd & Greg,
> 
> please let me know if my following changes are ok:
> 
> Am Dienstag, den 03.12.2013, 15:28 +0100 schrieb Frank Haverkamp:
> 
> > +/* Read/write from/to registers */
> > +struct genwqe_regs_io {
> > +   __u32 num;  /* register offset/address */
> > +   union {
> > +   __u64 val64;
> > +   __u32 val32;
> > +   __u16 define;
> > +   };
> > +};
> 
> Here I am using now:
> 
> struct genwqe_regs_io {
>   __u64 num;  /* register offset/address */
>   union {
>   __u64 val64;
>   __u32 val32;
>   __u16 define;
>   };
> };

This is not a bug anymore, but it seems pointless to use a union
there rather than just a __u64 for the value.

> Here I reordered and resized the members like this:
> 
> struct genwqe_bitstream {
>   __u64 data_addr;/* pointer to image data */
>   __u32 size; /* size of image file */
>   __u32 crc;  /* crc of this image */
>   __u64 target_addr;  /* starting address in Flash */
>   __u32 partition;/* '0', '1', or 'v' */
>   __u32 uid;  /* 1=host/x=dram */
> 
>   __u64 slu_id;   /* informational/sim: SluID */
>   __u64 app_id;   /* informational/sim: AppID */
> 
>   __u16 retc; /* returned from processing */
>   __u16 attn; /* attention code from processing */
>   __u32 progress; /* progress code from processing */
> };

Yes, this is fine.

> > +struct genwqe_debug_data {
> > +   char driver_version[64];
> > +   __u64 slu_unitcfg;
> > +   __u64 app_unitcfg;
> > +
> > +   __u8  ddcb_before[DDCB_LENGTH];
> > +   __u8  ddcb_prev[DDCB_LENGTH];
> > +   __u8  ddcb_finished[DDCB_LENGTH];
> > +};
> > +
> 
> This I hope is ok. DDCB_LENGTH is 256.

Yes.

> 
> Was this already ok? My new version looks as follows:

The old version was wrong.

> struct genwqe_ddcb_cmd {
>   /* START of data copied to/from driver */
>   __u64 next_addr;/* chaining genwqe_ddcb_cmd */
>   __u64 flags;/* reserved */
> 
>   __u8  acfunc;   /* accelerators functional unit */
>   __u8  cmd;  /* command to execute */
>   __u8  asiv_length;  /* used parameter length */
>   __u8  asv_length;   /* length of valid return values  */
>   __u16 cmdopts;  /* command options */
>   __u16 retc; /* return code from processing*/


>   __u16 attn; /* attention code from processing */
>   __u16 vcrc; /* variant crc16 */
>   __u32 progress; /* progress code from processing  */
> 
>   __u64 deque_ts; /* dequeue time stamp */
>   __u64 cmplt_ts; /* completion time stamp */
>   __u64 disp_ts;  /* SW processing start */
> 
>   /* move to end and avoid copy-back */
>   __u64 ddata_addr;   /* collect debug data */
> 
>   /* command specific values */
>   __u8  asv[DDCB_ASV_LENGTH];
> 
>   /* END of data copied from driver */
>   union {
>   struct {
>   __u64 ats;
>   __u8  asiv[DDCB_ASIV_LENGTH_ATS];
>   };
>   /* used for flash update to keep it backward compatible */
>   __u8 __asiv[DDCB_ASIV_LENGTH];
>   };
>   /* END of data copied to driver */
> };
> 
> Trying to group the data in 64bit chunks even nicer than I had it
> before.

Yes, this works, although I would argue that it is too complex to be a nice
interface.

> > +/**
> > + * struct genwqe_mem - Memory pinning/unpinning information
> > + * @addr:  virtual user space address
> > + * @size:  size of the area pin/dma-map/unmap
> > + * direction:  0: read/1: read and write
> > + *
> > + * Avoid pinning and unpinning of memory pages dynamically. Instead
> > + * the idea is to pin the whole buffer space required for DDCB
> > + * opertionas in advance. The driver will reuse this pinning and the
> > + * memory associated with it to setup the sglists for the DDCB
> > + * requests without the need to allocate and free memory or map and
> > + * unmap to get the DMA addresses.
> > + *
> > + * The inverse operation needs to be called after the pinning is not
> > + * needed anymore. The pinnings else the pinnings will get removed
> > + * after the device is closed. Note that pinnings will required
> > + * memory.
> > + */
> > +struct genwqe_mem {
> > +   unsigned long addr;
> > +   unsigned long size;
> > +   int direction;
> > +};
> 
> Was wrong, as already pointed out before. It is now:
> 
> struct g

Re: lots of brief rcu stalls.

2013-12-04 Thread Joe Perches
On Wed, 2013-12-04 at 18:18 -0800, Eric Dumazet wrote:
> On Wed, 2013-12-04 at 16:16 -0800, Paul E. McKenney wrote:
> > +   ULONG_CMP_GE(ACCESS_ONCE(jiffies), rdp->rsp->jiffies_resched)) {

perhaps time_before_eq
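
For reference, both spellings are wraparound-safe comparisons of unsigned
long counters.  The sketch below redeclares simplified userspace versions of
the two kernel macros (the kernel's time_after_eq() additionally type-checks
its arguments), so treat the definitions as an approximation.  The two
reached_*() helpers are hypothetical.

```c
#include <limits.h>

/* Simplified userspace copies of the kernel macros. */
#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
#define time_after_eq(a, b)	((long)((a) - (b)) >= 0)
#define time_before_eq(a, b)	time_after_eq(b, a)

/* "now has reached deadline", written the RCU way. */
static int reached_rcu_style(unsigned long now, unsigned long deadline)
{
	return ULONG_CMP_GE(now, deadline);
}

/* The same test, written with the jiffies helper. */
static int reached_time_style(unsigned long now, unsigned long deadline)
{
	return time_before_eq(deadline, now);
}
```

Both helpers agree even when the counter has wrapped past zero between the
deadline and now, because only the difference of the two values is examined.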

> jiffies should not need ACCESS_ONCE(), right ?
> 
> It is one of the few variables marked with volatile keyword.

It does seem redundant

$ git grep -n "ACCESS_ONCE(jiffies)"
kernel/rcu/torture.c:1354:  jiffies_snap = ACCESS_ONCE(jiffies);
kernel/rcu/torture.c:1363:  jiffies_snap = ACCESS_ONCE(jiffies);
kernel/rcu/tree.c:820:  unsigned long j = ACCESS_ONCE(jiffies);
kernel/rcu/tree.c:975:  j = ACCESS_ONCE(jiffies);
kernel/sched/core.c:2324:   unsigned long next, now = ACCESS_ONCE(jiffies);
kernel/sched/proc.c:536:unsigned long curr_jiffies = ACCESS_ONCE(jiffies);
kernel/sched/proc.c:558:unsigned long curr_jiffies = ACCESS_ONCE(jiffies);




Re: [PATCH 1/6] GenWQE PCI support, health monitoring and recovery

2013-12-04 Thread Arnd Bergmann
On Wednesday 04 December 2013, Frank Haverkamp wrote:
> Am Dienstag, den 03.12.2013, 15:28 +0100 schrieb Frank Haverkamp:
> > + */
> > +struct genwqe_mem {
> > +   unsigned long addr;
> > +   unsigned long size;
> > +   int direction;
> > +};
> > +
> > +#define GENWQE_PIN_MEM   _IOWR(GENWQE_IOC_CODE, 40, struct
> > genwqe_mem *)
> > +#define GENWQE_UNPIN_MEM  _IOWR(GENWQE_IOC_CODE, 41, struct
> > genwqe_mem *)
> > + 
> 
> Before someone comments on the unsigned long and the 32/64 bit issues
> with it. I need to fix that.

Also the extraneous '*' in the definitions.
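
A sketch of both points, assuming a fixed-width layout.  struct
genwqe_mem_fixed and its field widths are illustrative only and are not the
layout that was later submitted.  ARG_SIZE() is a hypothetical stand-in for
the sizeof() that _IOWR() applies to its type argument.  Fixed-width members
give one size for 32-bit and 64-bit user space, while passing a pointer type
encodes the size of a pointer instead of the size of the structure.

```c
#include <stdint.h>

/* Illustrative fixed-width variant; not the final driver ABI. */
struct genwqe_mem_fixed {
	uint64_t addr;		/* user virtual address */
	uint64_t size;		/* length of the area to pin */
	uint64_t direction;	/* 0: read, 1: read and write */
};

/* Stand-in for the argument size that _IOWR(code, nr, type) encodes. */
#define ARG_SIZE(type)	(sizeof(type))

/* Size encoded when the struct type itself is passed. */
static unsigned long encoded_size_struct(void)
{
	return ARG_SIZE(struct genwqe_mem_fixed);
}

/* Size encoded when a pointer type is passed by mistake. */
static unsigned long encoded_size_pointer(void)
{
	return ARG_SIZE(struct genwqe_mem_fixed *);
}
```

With the pointer type, the encoded size changes between 32-bit and 64-bit
builds even though the structure itself does not.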

Arnd


Re: [PATCH v2] Documentation: gpiolib: document new interface

2013-12-04 Thread Rob Landley

On 11/24/2013 12:02:30 AM, Alexandre Courbot wrote:

Hi Rob,

On Sun, Nov 24, 2013 at 8:31 AM, Rob Landley  wrote:
>> > Linus, I hope this can be merged during the -rc cycle of 3.13, since the
>> > gpiod_ interface is going to be introduced there. It would not make much
>> > sense for it to come without its documentation.
>>
>> You're right of course. I'll read through it and apply fixes on top
>> (or squash into your patch.)
>>
>> Formal stuff:
>> Don't we need an 00-INDEX file?
>> (Maybe Rob can tell whether this is desirable.)
>
>
> A 00-INDEX file wouldn't hurt, but it can always be added later. No reason
> to hold up the series for that. (I was using them to generate html indexes
> for kernel.org/doc but after the breakin they eliminated all non-git
> functionality so I haven't been able to update it since. They replaced
> kernel.org/doc/Documentation with a raw git checkout, and I expect them to
> replace kernel.org/doc/menuconfig with a raw git checkout any day now.)
>
> That said, a 00-INDEX file would let you know where to start reading to find
> the file with the intro paragraph at the start of the old file, the bit
> explaining what GPIO is. Here the first file alphabetically is "board.txt",
> and I have no idea why it's named that, given how it starts. (I was sort of
> hoping that somebody who already knows the subsystem would comment before I
> do. I have no way of knowing if this documentation is _right_.)

I actually submitted a patch that introduces a 00-INDEX file
yesterday. It's probably a few other gazillions mails under in your
inbox. ;)


I'm a couple weeks behind on my email. I'll get to it eventually. (My  
time's spread between a few too many projects these days...)


Most documentation goes in through the trees of the people whose  
subsystems it documents. I mostly catch the stuff that falls through  
the cracks. I'm somewhere between a librarian shelving abandoned books  
and a janitor.


...
> But I really don't have time to go through every paragraph like that, and
> was hoping the gpio guys would (or just sign off on it so I don't have
> to)...

I will make another pass and send an update (or a new version of the
patch maybe, since we are still in -rc1 - whatever is more convenient
to Linus). Hopefully early users of the new (oops) interface will send
fixes to the documentation as well, if only to improve my approximate
English.


It's not so much the english, it's that Documentation should be aimed  
at people who _don't_ already know this stuff. Since it tends to be  
written by people who _do_ already know the stuff (kinda hard to do the  
other way, although I've done it), this can be tricky to pull off. You  
have to maintain a "But what if this _wasn't_ a rhetorical question?"  
mindset and emulate a Virtual Newbie. (I suggest qemu for this.)


Still, thanks for taking a stab at it. Imperfect's better than  
obsolete...


Rob
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: lots of brief rcu stalls.

2013-12-04 Thread Eric Dumazet
On Wed, 2013-12-04 at 16:16 -0800, Paul E. McKenney wrote:
> + if (rdp->rsp == rcu_state &&
> + ULONG_CMP_GE(ACCESS_ONCE(jiffies), rdp->rsp->jiffies_resched)) {
> + rdp->rsp->jiffies_resched += 5;
> + resched_cpu(rdp->cpu);
> + }
> +
>   return 0;
>  }

jiffies should not need ACCESS_ONCE(), right ?

It is one of the few variables marked with volatile keyword.
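
The point about volatile can be illustrated in userspace. This is a minimal sketch, assuming the classic kernel definition of ACCESS_ONCE() as a volatile cast (it relies on the GNU typeof extension); fake_jiffies, plain_counter and the read_* helpers are hypothetical stand-ins, not kernel symbols:

```c
/* Classic kernel-style definition: force one real load via a volatile cast. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Stand-in for jiffies, which is declared with the volatile qualifier. */
static volatile unsigned long fake_jiffies = 100;

/* Stand-in for an ordinary shared variable without the qualifier. */
static unsigned long plain_counter = 7;

/* Already a volatile access: the compiler must emit the load as written. */
unsigned long read_jiffies_plain(void)
{
	return fake_jiffies;
}

/* Same load with a redundant volatile cast layered on top. */
unsigned long read_jiffies_once(void)
{
	return ACCESS_ONCE(fake_jiffies);
}

/* Here the cast matters: without it the compiler may cache or refetch. */
unsigned long read_counter_once(void)
{
	return ACCESS_ONCE(plain_counter);
}
```

Both jiffies readers perform the same single volatile load, so the cast should only make a difference for variables like plain_counter that are not declared volatile.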





Re: ARM: gic_arch_extn (Was: [PATCH v3] irqchip: mmp: add dt support for wakeup)

2013-12-04 Thread Thomas Gleixner
Russell,

On Thu, 5 Dec 2013, Russell King - ARM Linux wrote:
> On Thu, Dec 05, 2013 at 01:41:53AM +0100, Thomas Gleixner wrote:
> > @all who feel responsible for gic_arch_extn
> > 
> > On Wed, 4 Dec 2013, Thomas Gleixner wrote:
> > > I'm going to reply in a separate mail on this, because you have
> > > brought this to my attention, but you are not responsible in the first
> > > place for this brainfart.
> > 
> > Who came up with that gic_arch_extn concept in the first place?
> 
> If you'd spend more time reviewing IRQ patches then maybe you'd catch
> this at review time.  So please stop your rediculous whinging when
> most of the problem is your own lack of time.

I'm not a native english speaker, so I want to make sure in the first
place that you meant:

"ridiculous whingeing" 

Assumed that you meant that, let me ridicule you a bit.

The gic_arch_extn concept got merged with:

commit d7ed36a4ea84e3a850f9932e2058ceef987d1acd
Author: Santosh Shilimkar 
Date:   Wed Mar 2 08:03:22 2011 +0100

ARM: 6777/1: gic: Add hooks for architecture specific extensions



Cc: Russell King 
Signed-off-by: Santosh Shilimkar 
Acked-by: Colin Cross 
Tested-by: Colin Cross 
Signed-off-by: Russell King 

---
arch/arm/common/gic.c   |   47 
arch/arm/include/asm/hardware/gic.h |1

The patch in question was never cc'ed to me and you merged it on your
own.

So now you have the chutzpah to blame me for that, just because this
code moved to drivers/irqchip with

 commit 81243e444c6e9d1625073e4a3d3bc244c8a545f0
 Author: Rob Herring 
 Date:   Tue Nov 20 21:21:40 2012 -0600

 irqchip: Move ARM GIC to drivers/irqchip

almost two years later?

The code move neither exempts you from the responsibility of merging
it nor does it imply a retroactive responsibility for me to review all
patches which went into that code prior to the move.

Thanks,

tglx


RE: [f2fs-dev] [PATCH 3/3] f2fs: introduce f2fs_cache_node_page() to add page into node_inode cache

2013-12-04 Thread Chao Yu
> -Original Message-
> From: Chao Yu [mailto:chao2...@samsung.com]
> Sent: Thursday, December 05, 2013 9:55 AM
> To: ???
> Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; 
> linux-f2fs-de...@lists.sourceforge.net
> Subject: [f2fs-dev] [PATCH 3/3] f2fs: introduce f2fs_cache_node_page() to add 
> page into node_inode cache
> 
> This patch introduces f2fs_cache_node_page(); in this function, a page that
> has been read ahead is copied to node_inode's mapping cache.
> This avoids rereading these node pages.
> 
> Signed-off-by: Chao Yu 

Suggested-by: Jaegeuk Kim 

I missed that, my mistake.

> ---
>  fs/f2fs/node.c |   30 ++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 099f06f..5e2588f 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1600,6 +1600,34 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
>   return 0;
>  }
> 
> +/*
> + * f2fs_cache_node_page() copy updated page data to node_inode cache page.
> + */
> +void f2fs_cache_node_page(struct f2fs_sb_info *sbi, struct page *page,
> + nid_t nid)
> +{
> + struct address_space *mapping = sbi->node_inode->i_mapping;
> + struct page *npage;
> +
> + npage = find_get_page(mapping, nid);
> + if (npage && PageUptodate(npage)) {
> + f2fs_put_page(npage, 0);
> + return;
> + }
> + f2fs_put_page(npage, 0);
> +
> + npage = grab_cache_page(mapping, nid);
> + if (!npage)
> + return;
> +
> + memcpy(page_address(npage), page_address(page), PAGE_CACHE_SIZE);
> +
> + SetPageUptodate(npage);
> + f2fs_put_page(npage, 1);
> +
> + return;
> +}
> +
>  int restore_node_summary(struct f2fs_sb_info *sbi,
>   unsigned int segno, struct f2fs_summary_block *sum)
>  {
> @@ -1633,6 +1661,8 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
>   sum_entry->version = 0;
>   sum_entry->ofs_in_node = 0;
>   sum_entry++;
> + f2fs_cache_node_page(sbi, page,
> + le32_to_cpu(rn->footer.nid));
>   } else {
>   err = -EIO;
>   }
> --
> 1.7.9.5
> 
> 



Re: [PATCH v1 9/9] staging: android: binder: Add binder compat layer

2013-12-04 Thread Arve Hjønnevåg
On Wed, Dec 4, 2013 at 2:02 PM, Greg KH  wrote:
> On Wed, Dec 04, 2013 at 01:55:34PM -0800, Colin Cross wrote:
>> On Wed, Dec 4, 2013 at 1:43 PM, Greg KH  wrote:
>> > On Wed, Dec 04, 2013 at 12:46:42PM -0800, Colin Cross wrote:
>> >> On Wed, Dec 4, 2013 at 10:35 AM, Greg KH  
>> >> wrote:
>> >> 
>> >>
>> >> > And finally, is this all really needed?  Why not just fix the structures
>> >> > to be "correct", and then fix userspace to use the correct structures as
>> >> > well, thereby not needing a compat layer at all?
>> >>
>> >> Some of the binder ioctls take userspace pointers.  Are you suggesting
>> >> storing those pointers in a __u64 to avoid having to have a
>> >> compat_ioctl?
>> >
>> > Yes, that's the best way to solve the issue, right?
>>
>> It's the least code, but in exchange you lose all the type safety and
>> warnings when copying in and out of the pointers, as well as sparse
>> checking on the __user attribute.
>
> Not if you make the cast right at the beginning, when you first "touch"
> the data, but yes, it does take some of the type safety away, at the
> expense of simpler code to mess up :)
>
>> That doesn't seem like a good tradeoff to me.  In addition it requires
>> modifying the existing heavily used 32 bit api, which means a
>> mostly-equivalent compat layer added in libbinder to support old
>> kernels.
>
> Wait, I thought that libbinder would have to be changed anyway here, to
> handle 64bit kernels (in both 32 and 64bit userspace).  Since you are
> already changing it, why not just "do it correctly"?
>

Yes libbinder will have to be changed to support calls between 32 bit
and 64 bit processes, so I don't see much value in a patchset that
only supports all 32 bit or all 64 bit processes. If user space is
fixed to use 64 bit pointers on a 64 bit system, then much of the code
added in this patchset becomes useless (and probably harmful as it
appears to prevent 32 bit processes from communicating with 64 bit
processes).

> Or does this patch series mean that no userspace code is changed?  Is
> that a "requirement" here?
>

I don't think we need to support old 32 bit userspace framework code
on a 64 bit system. I think it is more important to not prevent mixed
mode systems.

-- 
Arve Hjønnevåg
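
The tradeoff discussed in this thread (carrying userspace pointers in a __u64 so that 32 bit and 64 bit callers share one layout) can be sketched as follows. This is an illustrative userspace model only: struct demo_write_read and the two helpers are hypothetical and are not the real binder structures or API.

```c
#include <stdint.h>

/* Hypothetical ioctl argument; not the real binder layout. */
struct demo_write_read {
	uint64_t write_size;	/* fixed width on every ABI */
	uint64_t write_buffer;	/* userspace pointer carried as an integer */
};

/* Store a pointer in the fixed-width field (zero-extended on 32 bit). */
static inline uint64_t demo_ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* The single cast back, done once where the data is first touched. */
static inline void *demo_u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

The structure has the same size for 32 bit and 64 bit callers, which is what removes the need for a compat_ioctl translation. The cost raised earlier in the thread also shows up here: the field is a plain integer, so the compiler and sparse can no longer check it as a __user pointer.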


Re: [PATCH V2] arm: mmp: build sram driver alone

2013-12-04 Thread Dan Williams
On Wed, Dec 4, 2013 at 5:36 PM, Qiao Zhou  wrote:
> The sram driver can be used by many chips besides CPU_MMP2, so build
> it on its own. Also select MMP_SRAM for the MMP_TDMA driver.
>
> Reported-by: Dan Williams 
> Signed-off-by: Qiao Zhou 
> ---

Looks good, thanks for fixing it up.

--
Dan


Re: [PATCH v4 07/12] efi: passing kexec necessary efi data via setup_data

2013-12-04 Thread Dave Young
On 12/04/13 at 09:43am, Toshi Kani wrote:
> On Wed, 2013-12-04 at 10:46 +0800, Dave Young wrote:
> > Hi, Toshi
> > 
> > > Oh, I think I now understand what the issue was.  The z420 firmware
> > > updates the SMBIOS table address in the EFI system table to a virtual
> > > address after calling EFI SetVirtualAddressMap.  So, you are passing the
> > > original physical address of the SMBIOS table from the 1st kernel to the
> > > 2nd kernel to put it back to physical.  Is that right? 
> > 
> > Right.
> 
> Hi Dave,
> 
> The z420 firmware is based on some UEFI core that may be used by other
> vendors as well.  Since this handling is totally harmless (just
> redundant), I'd suggest not to have a platform check on this handling.

I have the same worry, so I agree with you.

Thanks
Dave


linux-next: manual merge of the akpm-current tree with the modules tree

2013-12-04 Thread Stephen Rothwell
Hi Andrew,

Today's linux-next merge of the akpm-current tree got a conflict in
kernel/params.c between commit 88a88b320a90 ("params: improve standard
definitions") from the modules tree and commit b23eb499ed40
("kernel-paramsc-improve-standard-definitions-checkpatch-fixes") from the
akpm-current tree.

I fixed it up (using the akpm tree patch that just fixed the whitespace)
and can carry the fix as necessary (no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




[f2fs-dev] [PATCH 3/3] f2fs: introduce f2fs_cache_node_page() to add page into node_inode cache

2013-12-04 Thread Chao Yu
This patch introduces f2fs_cache_node_page(); in this function, a page that
has been read ahead is copied to node_inode's mapping cache.
This avoids rereading these node pages.

Signed-off-by: Chao Yu 
---
 fs/f2fs/node.c |   30 ++
 1 file changed, 30 insertions(+)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 099f06f..5e2588f 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1600,6 +1600,34 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
return 0;
 }
 
+/*
+ * f2fs_cache_node_page() copy updated page data to node_inode cache page.
+ */
+void f2fs_cache_node_page(struct f2fs_sb_info *sbi, struct page *page,
+   nid_t nid)
+{
+   struct address_space *mapping = sbi->node_inode->i_mapping;
+   struct page *npage;
+
+   npage = find_get_page(mapping, nid);
+   if (npage && PageUptodate(npage)) {
+   f2fs_put_page(npage, 0);
+   return;
+   }
+   f2fs_put_page(npage, 0);
+
+   npage = grab_cache_page(mapping, nid);
+   if (!npage)
+   return;
+
+   memcpy(page_address(npage), page_address(page), PAGE_CACHE_SIZE);
+
+   SetPageUptodate(npage);
+   f2fs_put_page(npage, 1);
+
+   return;
+}
+
 int restore_node_summary(struct f2fs_sb_info *sbi,
unsigned int segno, struct f2fs_summary_block *sum)
 {
@@ -1633,6 +1661,8 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
sum_entry->version = 0;
sum_entry->ofs_in_node = 0;
sum_entry++;
+   f2fs_cache_node_page(sbi, page,
+   le32_to_cpu(rn->footer.nid));
} else {
err = -EIO;
}
-- 
1.7.9.5



[f2fs-dev] [PATCH 2/3] f2fs: avoid unneeded page release for correct _count of page

2013-12-04 Thread Chao Yu
In find_fsync_dnodes() and recover_data(), our flow is like this:

->f2fs_submit_page_bio()
-> f2fs_put_page()
-> page_cache_release()  page->_count declined to zero.
->__free_pages()
-> put_page_testzero()  page->_count will be declined again.

We will get a segmentation fault in put_page_testzero() when CONFIG_DEBUG_VM
is on, or hand a bad page with a wrong _count back to the MM.

So let's just release this page.

Signed-off-by: Chao Yu 
---
 fs/f2fs/recovery.c |9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 7dda1f28..d075465 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -145,7 +145,7 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
 
err = f2fs_submit_page_bio(sbi, page, blkaddr, READ_SYNC);
if (err)
-   goto out;
+   return err;
 
lock_page(page);
 
@@ -191,9 +191,10 @@ next:
/* check next segment */
blkaddr = next_blkaddr_of_node(page);
}
+
unlock_page(page);
-out:
__free_pages(page, 0);
+
return err;
 }
 
@@ -388,7 +389,7 @@ static int recover_data(struct f2fs_sb_info *sbi,
 
err = f2fs_submit_page_bio(sbi, page, blkaddr, READ_SYNC);
if (err)
-   goto out;
+   return err;
 
lock_page(page);
 
@@ -412,8 +413,8 @@ next:
/* check next segment */
blkaddr = next_blkaddr_of_node(page);
}
+
unlock_page(page);
-out:
__free_pages(page, 0);
 
if (!err)
-- 
1.7.9.5
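
The double release described in the commit message can be modeled in userspace. This is a simplified sketch that only follows the flow as the commit message states it; struct demo_page and the demo_* functions are hypothetical stand-ins for struct page, put_page_testzero() and the two error paths.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the reference count is modeled. */
struct demo_page {
	int count;
};

/* Models put_page_testzero(): drop one reference, report whether it hit zero. */
static bool demo_put_testzero(struct demo_page *page)
{
	return --page->count == 0;
}

/* Old error path: the failed submit drops the last reference, then "goto out" drops again. */
int demo_old_error_path(void)
{
	struct demo_page page = { .count = 1 };	/* as handed out by alloc_page() */

	demo_put_testzero(&page);	/* release inside the failed submit */
	demo_put_testzero(&page);	/* second release via __free_pages() */
	return page.count;
}

/* New error path: return right after the failed submit, no second release. */
int demo_new_error_path(void)
{
	struct demo_page page = { .count = 1 };

	demo_put_testzero(&page);	/* release inside the failed submit */
	return page.count;
}
```

In this model the old path ends with a count of -1, which corresponds to the bad _count the commit message warns about, while the new path ends at 0.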



[PATCH v2] watchdog: Add a sysctl to disable soft lockup detector

2013-12-04 Thread Ben Zhang
Currently, the soft lockup detector and hard lockup detector
can be enabled or disabled together via the flag variable
watchdog_user_enabled. There isn't a way to disable only the
soft lockup detector while keeping the hard lockup detector
running.

The hard lockup detector sometimes does not work on an x86
machine with multiple cpus when softlockup_panic is set to 0.
For example:
1. Hard lockup occurs on cpu0 ("cli" followed by an infinite loop).
2. Soft lockup occurs on cpu1 shortly after because cpu1 tries to
send a function to cpu0 via smp_call_function_single().
3. watchdog_timer_fn() detects the soft lockup on cpu1 and
dumps the stack. dump_stack() eventually calls touch_nmi_watchdog()
which sets watchdog_nmi_touch=true for all cpus and sets
watchdog_touch_ts=0 for cpu1.
4. NMI fires on cpu0. watchdog_overflow_callback() sees
watchdog_nmi_touch=true, so it does not do anything except setting
watchdog_nmi_touch=false.
5. watchdog_timer_fn() is called again on cpu1, it sees
watchdog_touch_ts=0, so reloads it with the current tick. Thus,
is_softlockup() returns false, and soft_watchdog_warn is set to false.
6. Before NMI can fire on cpu0 again with watchdog_nmi_touch=false,
watchdog_timer_fn() reports the soft lockup on cpu1 again
and we go back to #3.

The machine stays locked up and the log shows repeated reports of
soft lockup on cpu1. Therefore, we need a way to disable the soft
lockup check so that the hard lockup detector can reboot the machine.
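
The cycle in steps 3 to 6 can be sketched as a toy loop. This is a deliberately simplified model with a fixed interleaving (every soft lockup report lands before the next NMI, as in the scenario above); the function and variable names are hypothetical and the real code in kernel/watchdog.c is more involved.

```c
#include <stdbool.h>

/*
 * Toy model of steps 3-6. Returns the round in which the NMI handler on
 * cpu0 would reach its hard lockup check, or -1 if it never does within
 * max_rounds.
 */
int demo_rounds_until_hardlockup_check(bool soft_detector_enabled, int max_rounds)
{
	bool nmi_touch = false;	/* models watchdog_nmi_touch for cpu0 */

	for (int round = 1; round <= max_rounds; round++) {
		/* Steps 3 and 6: cpu1 reports a soft lockup; the stack dump touches the NMI watchdog. */
		if (soft_detector_enabled)
			nmi_touch = true;

		/* Step 4: the NMI on cpu0 sees the touch flag, clears it and returns early. */
		if (nmi_touch) {
			nmi_touch = false;
			continue;
		}

		/* Flag was clear: the handler proceeds to the hard lockup check. */
		return round;
	}
	return -1;
}
```

With the soft lockup detector enabled the model never reaches the hard lockup check; with it disabled the check is reached on the first NMI, which is the behaviour the new knob is meant to allow.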


* Existing boot options for the watchdog:
nmi_watchdog=panic/nopanic/0
softlockup_panic=0/1
nowatchdog
nosoftlockup

* Variables modified by the boot options:
int watchdog_user_enabled;
unsigned int softlockup_panic;
unsigned int hardlockup_panic;

* Existing sysctls at /proc/sys/kernel/... for the watchdog:
nmi_watchdog=0/1
watchdog=0/1
softlockup_panic=0/1
watchdog_thresh=0~60

* Variables modified by the sysctls:
int watchdog_user_enabled;
unsigned int softlockup_panic;
int watchdog_thresh;


This patch adds a new boot option softlockup_detector_enable
and a sysctl at /proc/sys/kernel/softlockup_detector_enable to
allow disabling only the soft lockup detector.

softlockup_detector_enable=1:
This is the default. The soft lockup detector is enabled.
When a soft lockup is detected, a warning message with
debug info is printed. The kernel may be configured to
panic in this case via the sysctl kernel.softlockup_panic.

softlockup_detector_enable=0:
The soft lockup detector is disabled. Warning message is
not printed on soft lockup. The kernel does not panic on
soft lockup regardless of the value of kernel.softlockup_panic.
Note kernel.softlockup_detector_enable does not affect
the hard lockup detector.

Signed-off-by: Ben Zhang 
---
 Documentation/kernel-parameters.txt | 11 +++
 Documentation/sysctl/kernel.txt | 20 
 include/linux/sched.h   |  3 ++-
 kernel/sysctl.c |  9 +
 kernel/watchdog.c   | 15 +++
 5 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 50680a5..5678ac3 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2980,6 +2980,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1: Fast pin select (default)
2: ATC IRMode
 
+   softlockup_detector_enable=
+   [KNL] Should the soft-lockup detector be enabled. If
+   the soft-lockup detector is disabled, no warning
+   message is printed on soft lockup, and the kernel does
+   not panic on soft lockup regardless of the value of
+   softlockup_panic. softlockup_detector_enable does not
+   affect the hard lockup detector.
+   If this parameter is not present, the soft-lockup
+   detector is enabled by default.
+   Format: 
+
softlockup_panic=
[KNL] Should the soft-lockup detector generate panics.
Format: 
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 26b7ee4..209212e 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -70,6 +70,7 @@ show up in /proc/sys/kernel:
 - shmall
 - shmmax  [ sysv ipc ]
 - shmmni
+- softlockup_detector_enable
 - stop-a  [ SPARC only ]
 - sysrq   ==> Documentation/sysrq.txt
 - tainted
@@ -718,6 +719,25 @@ without users and with a dead originative process will be destroyed.
 
 ==
 
+softlockup_detector_enable:
+
+Should the soft-lockup detector be enabled.
+
+softlockup_detector_enable=1:
+This is the default. The soft lockup detector is

[PATCH] perf tools: fix bug in usage of the basename() function

2013-12-04 Thread Stephane Eranian

The basename() implementation varies a lot between systems.
The Linux man page says: "basename may modify the content of the path,
so it may be desirable to pass a copy when calling the function".
On some other systems, the returned address may come from an internal
buffer which can be reused in subsequent calls, thus the results should
also be copied.

The dso__set_basename() function was not doing this causing problems
on some systems with wrong library names being shown by perf report,
such as on Android systems.

This patch fixes the problem.
Thanks to Ben Cheng for tracking down the problem.

Patch relative to tip.git at commit 631d5ea.

Reported-by: Ben Cheng 
Signed-off-by: Stephane Eranian 
---
 tools/perf/util/dso.c |   29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index af4c687c..d186ace 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -404,7 +404,34 @@ void dso__set_short_name(struct dso *dso, const char *name)
 
 static void dso__set_basename(struct dso *dso)
 {
-   dso__set_short_name(dso, basename(dso->long_name));
+   char *lname, *base;
+
+   /*
+* basename may modify path buffer, so we must pass
+* a copy.
+*/
+   lname = strdup(dso->long_name);
+   if (!lname)
+   return;
+
+   /*
+* basename may return pointer to internal
+* storage which is reused in subsequent calls
+* so copy the result
+*/
+   base = strdup(basename(lname));
+
+   free(lname);
+
+   if (!base)
+   return;
+
+   if (dso->sname_alloc)
+   free((char *)dso->short_name);
+   else
+   dso->sname_alloc = 1;
+
+   dso__set_short_name(dso, base);
 }
 
 int dso__name_len(const struct dso *dso)
-- 
1.7.9.5
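
The copy-in, copy-out pattern from the commit message can be shown as a standalone userspace helper. This is a sketch under assumptions: demo_strdup() and demo_basename_dup() are hypothetical names, the POSIX basename() from <libgen.h> is used, and the string copy is written with malloc()/memcpy() only to avoid depending on feature-test macros; the patch itself uses strdup().

```c
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Small local replacement for strdup(); returns NULL on allocation failure. */
static char *demo_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Returns a heap-allocated base name of path, or NULL on allocation
 * failure. The caller owns the result and must free() it.
 */
char *demo_basename_dup(const char *path)
{
	char *copy, *base;

	/* basename() may modify its argument, so hand it a private copy. */
	copy = demo_strdup(path);
	if (!copy)
		return NULL;

	/* The result may point into copy or into static storage: copy it out. */
	base = demo_strdup(basename(copy));

	free(copy);
	return base;
}
```

Because the result is copied before the scratch buffer is freed, the helper behaves the same whether basename() returns a pointer into its argument or into internal storage.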



[f2fs-dev] [PATCH 1/3] f2fs: use inner macro GFP_F2FS_ZERO for simplification

2013-12-04 Thread Chao Yu
Use the internal macro GFP_F2FS_ZERO instead of GFP_NOFS | __GFP_ZERO to
simplify the code.

Signed-off-by: Chao Yu 
---
 fs/f2fs/node.c |2 +-
 fs/f2fs/recovery.c |2 +-
 fs/f2fs/super.c|2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 0855168..099f06f 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1577,7 +1577,7 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
 
for (; page_idx < start + nrpages; page_idx++) {
/* alloc temporal page for read node summary info*/
-   page = alloc_page(GFP_NOFS | __GFP_ZERO);
+   page = alloc_page(GFP_F2FS_ZERO);
if (!page) {
struct page *tmp;
list_for_each_entry_safe(page, tmp, pages, lru) {
diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index c209b86..7dda1f28 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -377,7 +377,7 @@ static int recover_data(struct f2fs_sb_info *sbi,
blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
 
/* read node page */
-   page = alloc_page(GFP_NOFS | __GFP_ZERO);
+   page = alloc_page(GFP_F2FS_ZERO);
if (!page)
return -ENOMEM;
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index dd55074..22b07c3 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -332,7 +332,7 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb)
 {
struct f2fs_inode_info *fi;
 
-   fi = kmem_cache_alloc(f2fs_inode_cachep, GFP_NOFS | __GFP_ZERO);
+   fi = kmem_cache_alloc(f2fs_inode_cachep, GFP_F2FS_ZERO);
if (!fi)
return NULL;
 
-- 
1.7.9.5



Re: [patch 7/8] mm, memcg: allow processes handling oom notifications to access reserves

2013-12-04 Thread David Rientjes
On Wed, 4 Dec 2013, Johannes Weiner wrote:

> > Now that a per-process flag is available, define it for processes that
> > handle userspace oom notifications.  This is an optimization to avoid
> > maintaining a list of such processes attached to a memcg at any given time
> > and iterating it at charge time.
> > 
> > This flag gets set whenever a process has registered for an oom
> > notification and is cleared whenever it unregisters.
> > 
> > When memcg reclaim has failed to free any memory, it is necessary for
> > userspace oom handlers to be able to dip into reserves to pagefault text,
> > allocate kernel memory to read the "tasks" file, allocate heap, etc.
> 
> The task handling the OOM of a memcg can obviously not be part of that
> same memcg.
> 

Not without memory.oom_reserve_in_bytes that this series adds, that's 
true.  Michal expressed interest in the idea of memcg oom reserves in the 
past, so I thought I'd share the series.

> On Tue, 3 Dec 2013 at 15:35:48 +0800, Li Zefan wrote:
> > On Mon, 2 Dec 2013 at 11:44:06 -0500, Johannes Weiner wrote:
> > > On Fri, Nov 29, 2013 at 03:05:25PM -0500, Tejun Heo wrote:
> > > > Whoa, so we support oom handler inside the memcg that it handles?
> > > > Does that work reliably?  Changing the above detail in this patch
> > > > isn't difficult (and we'll later need to update kernfs too) but
> > > > supporting such setup properly would be a *lot* of commitment and I'm
> > > > very doubtful we'd be able to achieve that by just carefully avoiding
> > > > memory allocation in the operations that userland oom handler uses -
> > > > that set is destined to expand over time, extremely fragile and will
> > > > be hellish to maintain.
> > > > 

It works reliably with this patch series, yes.  I'm not sure what change 
this is referring to that would avoid memory allocation for userspace oom 
handlers, and I'd agree that it would be difficult to maintain a 
no-allocation policy for a subset of processes that are destined to handle 
oom handlers.

That's not what this series is addressing, though, and in fact it's quite 
the opposite.  It acknowledges that userspace oom handlers need to 
allocate and that anything else would be too difficult to maintain 
(thereby agreeing with the above), so we must set aside memory that they 
are exclusively allowed to access.  For the vast majority of users who 
will not use userspace oom handlers, they can just use the default value 
of memory.oom_reserve_in_bytes == 0 and they incur absolutely no side-
effects as a result of this series.

For those who do use userspace oom handlers, like Google, this allows us 
to set aside memory to allow the userspace oom handlers to kill a process, 
dump the heap, send a signal, drop caches, etc. when waking up.
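
The reserve idea described above can be modeled in a few lines. This is a hypothetical userspace sketch, not the implementation from the series: only the name memory.oom_reserve_in_bytes comes from this thread, while struct demo_memcg, demo_try_charge() and the charging rule are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a memcg; all fields are in bytes. */
struct demo_memcg {
	uint64_t limit;		/* hard limit for ordinary tasks */
	uint64_t oom_reserve;	/* models memory.oom_reserve_in_bytes */
	uint64_t usage;		/* currently charged */
};

/*
 * Try to charge bytes to the group. Only a task flagged as an oom handler
 * may dip into the reserve on top of the limit.
 */
bool demo_try_charge(struct demo_memcg *cg, uint64_t bytes, bool is_oom_handler)
{
	uint64_t ceiling = cg->limit;

	if (is_oom_handler)
		ceiling += cg->oom_reserve;

	/* Written to avoid unsigned underflow in the comparison. */
	if (bytes > ceiling || cg->usage > ceiling - bytes)
		return false;

	cg->usage += bytes;
	return true;
}
```

With oom_reserve left at 0 the handler flag changes nothing in this model, which matches the claim that users who do not set the reserve see no side effects.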

> > > > So, I'm not at all excited about committing to this guarantee.  This
> > > > one is an easy one but it looks like the first step onto dizzying
> > > > slippery slope.
> > > > 
> > > > Am I misunderstanding something here?  Are you and Johannes firm on
> > > > supporting this?
> > >
> > > Handling a memcg OOM from userspace running inside that OOM memcg is
> > > completely crazy.  I mean, think about this for just two seconds...
> > > Really?
> > >
> > > I get that people are doing it right now, and if you can get away with
> > > it for now, good for you.  But you have to be aware how crazy this is
> > > and if it breaks you get to keep the pieces and we are not going to
> > > accommodate this in the kernel.  Fix your crazy userspace.
> > 

The rest of this email communicates only one thing: someone thinks it's 
crazy.  And I agree it would be crazy if we don't allow that class of 
process to have access to a pre-defined amount of memory to handle the 
situation, which this series adds.


Re: [RFC PATCH v3 07/19] smp, hexagon: kill SMP single function call interrupt

2013-12-04 Thread rkuo
On Thu, Dec 05, 2013 at 12:12:58AM +0800, Jiang Liu wrote:
> Commit 9a46ad6d6df3b54 "smp: make smp_call_function_many() use logic
> similar to smp_call_function_single()" has unified the way to handle
> single and multiple cross-CPU function calls. Now only one interrupt
> is needed for architecture specific code to support generic SMP function
> call interfaces, so kill the redundant single function call interrupt.
> 
> Cc: Andrew Morton 
> Cc: Shaohua Li 
> Cc: Peter Zijlstra 
> Cc: Ingo Molnar 
> Cc: Steven Rostedt 
> Cc: Jiri Kosina 
> Cc: Richard Kuo 
> Cc: linux-hexa...@vger.kernel.org
> Signed-off-by: Jiang Liu 
> ---
>  arch/hexagon/include/asm/smp.h | 1 -
>  arch/hexagon/kernel/smp.c  | 6 +-
>  2 files changed, 1 insertion(+), 6 deletions(-)

Seems to work fine for Hexagon.  Thanks!


Acked-by: Richard Kuo 


-- 

Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH 0/2 V1] arm-mmp-build-sram-driver-alone

2013-12-04 Thread Qiao Zhou

On 12/05/2013 03:17 AM, Dan Williams wrote:
> On Wed, Dec 4, 2013 at 3:24 AM, Qiao Zhou  wrote:
>> On 12/04/2013 07:17 PM, Haojian Zhuang wrote:
>>>
>>> Dan indicated that you could pack these two patches into one. Whatever
>>> it's also OK to use two patches.
>>
>> Misunderstood it... Thanks for correcting.
>
> Please combine the patches for two reasons:
> 1/ patch1 by itself makes the problem worse it prevents the mmp_tdma
> driver from building even if CPU_MMP2 is selected.
> 2/ patch2 does not have a changelog and is the only user of the
> enabling in patch1

Dan, I updated the patch according to your suggestions. please help take
a look again. Thanks.


--

Best Regards
Qiao


[PATCH V2] arm-mmp-build-sram-driver-alone

2013-12-04 Thread Qiao Zhou
V1 -> V2:
combine the two patches together according to Dan's suggestion below.

>Please combine the patches for two reasons:
>1/ patch1 by itself makes the problem worse it prevents the mmp_tdma
>driver from building even if CPU_MMP2 is selected.
>2/ patch2 does not have a changelog and is the only user of the
>enabling in patch1

V0 -> V1:
No need for help text for MMP_SRAM in Kconfig and move it into MMP_TDMA
text in Kconfig.


Qiao Zhou (1):
  arm: mmp: build sram driver alone

 arch/arm/mach-mmp/Kconfig  |3 +++
 arch/arm/mach-mmp/Makefile |3 ++-
 drivers/dma/Kconfig|2 ++
 3 files changed, 7 insertions(+), 1 deletions(-)



[PATCH V2] arm: mmp: build sram driver alone

2013-12-04 Thread Qiao Zhou
The sram driver can be used by many chips besides CPU_MMP2, so build
it on its own. Also select MMP_SRAM for the MMP_TDMA driver.

Reported-by: Dan Williams 
Signed-off-by: Qiao Zhou 
---
 arch/arm/mach-mmp/Kconfig  |3 +++
 arch/arm/mach-mmp/Makefile |3 ++-
 drivers/dma/Kconfig|2 ++
 3 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/arm/mach-mmp/Kconfig b/arch/arm/mach-mmp/Kconfig
index ebdda83..ebdba87 100644
--- a/arch/arm/mach-mmp/Kconfig
+++ b/arch/arm/mach-mmp/Kconfig
@@ -136,4 +136,7 @@ config USB_EHCI_MV_U2O
help
  Enables support for OTG controller which can be switched to host mode.
 
+config MMP_SRAM
+   bool
+
 endif
diff --git a/arch/arm/mach-mmp/Makefile b/arch/arm/mach-mmp/Makefile
index 9b702a1..98f0f63 100644
--- a/arch/arm/mach-mmp/Makefile
+++ b/arch/arm/mach-mmp/Makefile
@@ -7,7 +7,8 @@ obj-y   += common.o devices.o time.o
 # SoC support
 obj-$(CONFIG_CPU_PXA168)   += pxa168.o
 obj-$(CONFIG_CPU_PXA910)   += pxa910.o
-obj-$(CONFIG_CPU_MMP2) += mmp2.o sram.o
+obj-$(CONFIG_CPU_MMP2) += mmp2.o
+obj-$(CONFIG_MMP_SRAM) += sram.o
 
 ifeq ($(CONFIG_COMMON_CLK), )
 obj-y  += clock.o
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index dd2874e..599f0ae 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -288,9 +288,11 @@ config MMP_TDMA
bool "MMP Two-Channel DMA support"
depends on ARCH_MMP
select DMA_ENGINE
+   select MMP_SRAM
help
  Support the MMP Two-Channel DMA engine.
  This engine used for MMP Audio DMA and pxa910 SQU.
+ It needs sram driver under mach-mmp.
 
  Say Y here if you enabled MMP ADMA, otherwise say N.
 
-- 
1.7.0.4



Re: [PATCH] aio: clean up aio ring in the fail path

2013-12-04 Thread Gu Zheng
On 12/05/2013 09:14 AM, Gu Zheng wrote:

> Hi Dave,
> 
> On 12/04/2013 10:54 PM, Dave Jones wrote:
> 
>> On Wed, Dec 04, 2013 at 06:19:06PM +0800, Gu Zheng wrote:
>>  > Clean up the aio ring file in the fail path of aio_setup_ring
>>  > and ioctx_alloc. And maybe it can fix the GPF issue reported by
>>  > Dave Jones:
>>  > https://lkml.org/lkml/2013/11/25/898
>>  > 
>>  > 
>>  > Signed-off-by: Gu Zheng 
>>  > ---
>>  >  fs/aio.c |8 ++--
>>  >  1 files changed, 6 insertions(+), 2 deletions(-)
>>  > 
>>  > diff --git a/fs/aio.c b/fs/aio.c
>>  > index 08159ed..6efb7f6 100644
>>  > --- a/fs/aio.c
>>  > +++ b/fs/aio.c
>>  > @@ -367,8 +367,10 @@ static int aio_setup_ring(struct kioctx *ctx)
>>  >   if (nr_pages > AIO_RING_PAGES) {
>>  >   ctx->ring_pages = kcalloc(nr_pages, sizeof(struct page *),
>>  > GFP_KERNEL);
>>  > - if (!ctx->ring_pages)
>>  > + if (!ctx->ring_pages) {
>>  > + put_aio_ring_file(ctx);
>>  >   return -ENOMEM;
>>  > + }
>>  >   }
>>  >  
>>
>>   CC  fs/aio.o
>> fs/aio.c: In function ‘aio_setup_ring’:
>> fs/aio.c:363:4: error: implicit declaration of function ‘put_aio_ring_file’ 
>> [-Werror=implicit-function-declaration]
>> put_aio_ring_file(ctx);
>>
>>
>> Is this dependent on another patch?
> 
> It's applied on 3.12-rc2. Please ignore the previous defective patch I sent

Sorry, s/3.12-rc2/3.13-rc2/

> before, and try again.
> 
> Regards,
> Gu
> 
>>
>>  Dave
>>
>>
> 
> 
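
For readers following the thread, the fail-path cleanup the quoted patch is after can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct ring_ctx, put_ring_file(), setup_ring(), teardown_ring() and the fail_pages switch are hypothetical stand-ins, not the kernel's kioctx or aio code. The idea is that once the first resource has been acquired, every later error return has to release it first.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel context; not struct kioctx. */
struct ring_ctx {
	int *ring_file;    /* stands in for the aio ring file */
	void **ring_pages; /* stands in for ctx->ring_pages */
};

static int files_released; /* counts how often the file was released */

/* Release the ring file if it is still held. */
static void put_ring_file(struct ring_ctx *ctx)
{
	if (ctx->ring_file) {
		free(ctx->ring_file);
		ctx->ring_file = NULL;
		files_released++;
	}
}

/*
 * Acquire the file first, then the page array. fail_pages is an
 * illustrative switch that simulates the allocation failing.
 */
static int setup_ring(struct ring_ctx *ctx, size_t nr_pages, int fail_pages)
{
	ctx->ring_file = malloc(sizeof(*ctx->ring_file));
	if (!ctx->ring_file)
		return -ENOMEM;

	ctx->ring_pages = fail_pages ? NULL : calloc(nr_pages, sizeof(void *));
	if (!ctx->ring_pages) {
		/* Fail path: undo the earlier step before returning. */
		put_ring_file(ctx);
		return -ENOMEM;
	}
	return 0;
}

/* Normal teardown releases both resources. */
static void teardown_ring(struct ring_ctx *ctx)
{
	free(ctx->ring_pages);
	ctx->ring_pages = NULL;
	put_ring_file(ctx);
}
```

Without the put_ring_file() call in the error branch, a failed setup_ring() would leave ring_file allocated with nothing left to free it.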




[PATCH] iommu: add missing include

2013-12-04 Thread Brian Norris
Fix a warning in of_iommu.c:

drivers/iommu/of_iommu.c:38:5: warning: no previous prototype for 
'of_get_dma_window' [-Wmissing-prototypes]

Signed-off-by: Brian Norris 
Cc: Hiroshi DOYU 
---
 drivers/iommu/of_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index ee249bc959f8..e550ccb7634e 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /**
  * of_get_dma_window - Parse *dma-window property and returns 0 if found.
-- 
1.8.5
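
As a side note on the warning class itself: with -Wmissing-prototypes, GCC warns when a non-static function is defined without a prototype having been seen earlier in the translation unit. A minimal sketch follows, assuming a hypothetical function demo_get_window() whose declaration would normally live in a shared header; neither the name nor the signature is taken from the kernel.

```c
/*
 * In a real tree this declaration would sit in a shared header
 * (a hypothetical demo_window.h) that both the defining file and
 * its callers include. It is inlined here to keep the sketch
 * self-contained.
 */
int demo_get_window(const int *cells, int count, int *start, int *size);

/*
 * Because a prototype is visible above, defining this non-static
 * function no longer triggers -Wmissing-prototypes, and the compiler
 * can check that the definition matches what callers were promised.
 */
int demo_get_window(const int *cells, int count, int *start, int *size)
{
	if (!cells || count < 2)
		return -1;

	*start = cells[0];
	*size = cells[1];
	return 0;
}
```

Dropping the declaration and compiling with -Wmissing-prototypes should reproduce a warning of the same shape as the one quoted in the commit message.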


