date:20131203

Re: [PATCH 1/2] arm: mmp: build sram driver alone

2013-12-03 Thread Dan Williams

On Tue, Dec 3, 2013 at 11:05 PM, Qiao Zhou  wrote:
> sram driver can be used by many chips besides CPU_MMP2, and so build
> it alone.
>
> Reported-by: Dan Williams 
> Signed-off-by: Qiao Zhou 
> ---
>  arch/arm/mach-mmp/Kconfig  |5 +
>  arch/arm/mach-mmp/Makefile |3 ++-
>  2 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/arch/arm/mach-mmp/Kconfig b/arch/arm/mach-mmp/Kconfig
> index ebdda83..6a6597c 100644
> --- a/arch/arm/mach-mmp/Kconfig
> +++ b/arch/arm/mach-mmp/Kconfig
> @@ -136,4 +136,9 @@ config USB_EHCI_MV_U2O
> help
>   Enables support for OTG controller which can be switched to host 
> mode.
>
> +config MMP_SRAM
> +   bool
> +   help
> + Select code specific to sram.
> +

No need for help text for something that is selected, and might as
well squash this with patch 2.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] x86: mcheck: call put_device on device_register failure

2013-12-03 Thread Chen, Gong

On Tue, Dec 03, 2013 at 06:01:50PM +0100, Borislav Petkov wrote:
> Date: Tue, 3 Dec 2013 18:01:50 +0100
> From: Borislav Petkov 
> To: "Chen, Gong" 
> Cc: Levente Kurusa , Ingo Molnar ,
>  Thomas Gleixner , Tony Luck , "H.
>  Peter Anvin" , x...@kernel.org, EDAC
>  , LKML 
> Subject: Re: [PATCH] x86: mcheck: call put_device on device_register failure
> User-Agent: Mutt/1.5.21 (2010-09-15)
> 
> Can you please fix your
> 
> Mail-Followup-To:
> 
> header? It is impossible to reply to your emails without fiddling with
> the To: and Cc: by hand which gets very annoying over time.

I added some settings to my muttrc. Hope it works.

> 
> On Mon, Dec 02, 2013 at 09:23:30PM -0500, Chen, Gong wrote:
> > I have some concerns about it. If device_register() fails, it
> > automatically unwinds everything it has done so far, definitely
> > including put_device(). So do we really need an extra put_device() when
> > it returns failure?
> 
> Do you mean the "done:" label in device_add() which does put_device()
> and which gets called by device_register()?
> 

Not only that. I also noticed another put_device() under the "Error:" label.
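
For reference, the kernel-doc comment on device_register() says the caller must still use put_device() to give up the reference taken during initialization, even when registration returns an error. The following is only a user-space toy model of that counting rule, not the driver-core code: struct toy_device and every toy_* function are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct device: just a reference count. */
struct toy_device {
	int refcount;
	bool registered;
	bool released;	/* set when the last reference is dropped */
};

void toy_get(struct toy_device *dev)
{
	dev->refcount++;
}

void toy_put(struct toy_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->released = true;	/* a real release() would free here */
}

/* Models the initialization step: the caller starts with one reference. */
void toy_initialize(struct toy_device *dev)
{
	dev->refcount = 1;
	dev->registered = false;
	dev->released = false;
}

/* Models the add step: it takes a working reference and drops only that
 * one on exit, on both the success and the failure path. */
int toy_add(struct toy_device *dev, bool fail)
{
	int err = 0;

	toy_get(dev);
	if (fail)
		err = -1;
	else
		dev->registered = true;
	toy_put(dev);
	return err;
}

int toy_register(struct toy_device *dev, bool fail)
{
	toy_initialize(dev);
	return toy_add(dev, fail);
}
```

In this model a failed toy_register() leaves the count at one, so the object is only released once the caller issues its own toy_put().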



How to change erasesize of a partition

2013-12-03 Thread Suki Buryani

Hi,

I have made a few changes to my MTD partitions, and I want to change the
erasesize of one partition. Can anyone tell me where I should look to change
the erasesize of mtd3?

Thanks


Re: [PATCH] Staging: speakup: synth.c: removed a space

2013-12-03 Thread Joe Perches

On Wed, 2013-12-04 at 10:38 +0300, Dan Carpenter wrote:
> On Wed, Dec 04, 2013 at 06:35:15AM +0200, Aldo Iljazi wrote:
> >  Samuel Thibault wrote:
> > 
> > > Err, I'd rather make it really visible that the for loop doesn't have
> > > its first statement?
> > 
> > Wouldn't it be better if you add a comment there? So it would follow the
> > coding style?
> 
> No.  Adding obvious comments is more annoying than the space.
> 
> This seems like a small bug in checkpatch.pl.  Joe, the problem is this
> code:
> 
>   for ( ; x; x++) {
> 
> It's complaining about the space character before the semicolon.

Shrug.  checkpatch isn't and can't be perfect.

I think it's for things like

for (;;)

and that's better than

for ( ; ; )
or
for ( ;; )
or
for ( ;;)



Re: [PATCHv6 05/13] iommu/core: add ops->{bound,unbind}_driver()

2013-12-03 Thread Hiroshi Doyu

On Mon, 25 Nov 2013 14:49:37 +0100
Hiroshi Doyu  wrote:

> Hi Joerg,
> 
> Do you have some time to review this patch along with the following ones?
> 
>   [PATCHv6 02/13] iommu/of: introduce a global iommu device list
>   http://lists.linuxfoundation.org/pipermail/iommu/2013-November/007050.html
> 
>   [PATCHv6 03/13] iommu/of: check if dependee iommu is ready or not
>   http://lists.linuxfoundation.org/pipermail/iommu/2013-November/007051.html

Any chance to get some feedback on them?

> With those patches, I'm now trying to populate IOMMU master devices
> after an IOMMU device is populated. [PATCHv6 02/13] was originally
> proposed by Thierry. I'm a bit wary of adding a new IOMMU API like
> this, but I think the new {bound,unbind}_driver() is the right
> time to populate the dependent devices, rather than add_device(),
> since even after add_device() a device may not be populated yet. I'm not
> sure how this affects the existing IOMMUs.
> 
> It would be nice if you could give some feedback on this.
> 
> On Thu, 21 Nov 2013 14:40:41 +0100
> Hiroshi Doyu  wrote:
> 
> > ops->{bound,unbind}_driver() functions are called at
> > BUS_NOTIFY_{BOUND,UNBIND}_DRIVER respectively.
> > 
> > This is necessary to control the device population order. IOMMU master
> > devices depend on an IOMMU device instantiation. IOMMU master devices
> > can be registered to an IOMMU only after it's successfully
> > populated. This IOMMU registration is done via
> > ops->bound_driver(). Currently this population can be deferred if
> > the IOMMU device it depends on hasn't yet been populated in driver core.
> > This cannot be done via ops->add_device() since after add_device() the
> > device's population/instantiation can still be deferred via probe().
> > 
> > Signed-off-by: Hiroshi Doyu 
> > ---
> > v6:
> > New for v6.
> > ---
> >  drivers/iommu/iommu.c | 13 +++--
> >  include/linux/iommu.h |  4 
> >  2 files changed, 15 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index efc..5469d36 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -540,14 +540,23 @@ static int iommu_bus_notifier(struct notifier_block 
> > *nb,
> >  * ADD/DEL call into iommu driver ops if provided, which may
> >  * result in ADD/DEL notifiers to group->notifier
> >  */
> > -   if (action == BUS_NOTIFY_ADD_DEVICE) {
> > +   switch (action) {
> > +   case BUS_NOTIFY_ADD_DEVICE:
> > if (ops->add_device)
> > return ops->add_device(dev);
> > -   } else if (action == BUS_NOTIFY_DEL_DEVICE) {
> > +   case BUS_NOTIFY_DEL_DEVICE:
> > if (ops->remove_device && dev->iommu_group) {
> > ops->remove_device(dev);
> > return 0;
> > }
> > +   case BUS_NOTIFY_BOUND_DRIVER:
> > +   if (ops->bound_driver)
> > +   ops->bound_driver(dev);
> > +   break;
> > +   case BUS_NOTIFY_UNBIND_DRIVER:
> > +   if (ops->unbind_driver)
> > +   ops->unbind_driver(dev);
> > +   break;
> > }
> >  
> > /*
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index a444c79..a0e92be 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -96,6 +96,8 @@ enum iommu_attr {
> >   * @domain_has_cap: domain capabilities query
> >   * @add_device: add device to iommu grouping
> >   * @remove_device: remove device from iommu grouping
> > + * @bound_driver: called at BUS_NOTIFY_BOUND_DRIVER
> > + * @unbind_driver: called at BUS_NOTIFY_UNBIND_DRIVER
> >   * @domain_get_attr: Query domain attributes
> >   * @domain_set_attr: Change domain attributes
> >   * @pgsize_bitmap: bitmap of supported page sizes
> > @@ -114,6 +116,8 @@ struct iommu_ops {
> >   unsigned long cap);
> > int (*add_device)(struct device *dev);
> > void (*remove_device)(struct device *dev);
> > +   int (*bound_driver)(struct device *dev);
> > +   void (*unbind_driver)(struct device *dev);
> > int (*device_group)(struct device *dev, unsigned int *groupid);
> > int (*domain_get_attr)(struct iommu_domain *domain,
> >enum iommu_attr attr, void *data);
> > -- 
> > 1.8.1.5
> > 
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
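
The notifier dispatch described in the commit message can be sketched in user-space C as below. This is a hedged illustration, not the iommu core: the toy_* types, enum values and callbacks are hypothetical. One deliberate difference from the hunk as quoted: every case ends in an explicit break, because the quoted switch appears to fall through from the ADD and DEL cases into the next case when a callback is absent.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the bus notifier actions. */
enum toy_action {
	TOY_ADD_DEVICE,
	TOY_DEL_DEVICE,
	TOY_BOUND_DRIVER,
	TOY_UNBIND_DRIVER,
};

struct toy_dev {
	int has_group;
	int removed;
	int bound;
};

struct toy_ops {
	int (*add_device)(struct toy_dev *dev);
	void (*remove_device)(struct toy_dev *dev);
	int (*bound_driver)(struct toy_dev *dev);
	void (*unbind_driver)(struct toy_dev *dev);
};

/* Example callbacks used to observe which hook ran. */
void toy_remove(struct toy_dev *dev) { dev->removed = 1; }
int toy_bound(struct toy_dev *dev) { dev->bound = 1; return 0; }
void toy_unbind(struct toy_dev *dev) { dev->bound = 0; }

/* Dispatch one action to the optional callback; each case is terminated
 * so a missing callback never runs the next case by accident. */
int toy_notify(const struct toy_ops *ops, enum toy_action action,
	       struct toy_dev *dev)
{
	switch (action) {
	case TOY_ADD_DEVICE:
		if (ops->add_device)
			return ops->add_device(dev);
		break;
	case TOY_DEL_DEVICE:
		if (ops->remove_device && dev->has_group)
			ops->remove_device(dev);
		break;
	case TOY_BOUND_DRIVER:
		if (ops->bound_driver)
			return ops->bound_driver(dev);
		break;
	case TOY_UNBIND_DRIVER:
		if (ops->unbind_driver)
			ops->unbind_driver(dev);
		break;
	}
	return 0;
}
```

With add_device left NULL, an ADD notification in this sketch is a no-op rather than a removal.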

Re: [PATCH 2/2] perf probe: Allow user to specify address within executable

2013-12-03 Thread Masami Hiramatsu

(2013/12/04 10:44), David Ahern wrote:
> On 12/3/13, 6:22 PM, Masami Hiramatsu wrote:
> 
>>> I figured out what you meant by uprobe_events interface yesterday. If I
>>> have to go to that interface for even 1 function I would do it for all
>>> -- from a user perspective it is just simpler to have 1 command to setup
>>> probes. I would prefer that 1 command be perf-probe.
>>
>> Yeah, but in that case, why don't you ask us to add sym->binding == STB_LOCAL
>> in filter_available_functions? :)
> 
> I did in a separate email -- you said because there can be multiple
> local functions with the same name.

Yeah, and this still seems like a kind of workaround to me. The best way
to meet your requirement is to support DWARF for userspace tracing.
OK, I'll try it.

> But local functions is not the only
> use case I need.

What would you like to do with perf probe? Accessing userspace by direct
address is not a good approach, because userspace is relocatable...

> For now I will carry the patch locally. At this point I am 20 patches
> deep and have probably another 20 to go. What's one more. I'll come back
> to this when I have more time.

Do you have a public git repository for that? And could you tell
us what you would like to do before sending the patch? We can help you
find the best way.

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: [PATCH] Staging: speakup: synth.c: removed a space

2013-12-03 Thread Dan Carpenter

On Wed, Dec 04, 2013 at 06:35:15AM +0200, Aldo Iljazi wrote:
>  Samuel Thibault wrote:
> 
> > Err, I'd rather make it really visible that the for loop doesn't have
> > its first statement?
> 
> Wouldn't it be better if you add a comment there? So it would follow the
> coding style?

No.  Adding obvious comments is more annoying than the space.

This seems like a small bug in checkpatch.pl.  Joe, the problem is this
code:

for ( ; x; x++) {

It's complaining about the space character before the semicolon.

regards,
dan carpenter

Re: [PATCH -tip v4 0/6] kprobes: introduce NOKPROBE_SYMBOL() and fixes crash bugs

2013-12-03 Thread Masami Hiramatsu

(2013/12/04 11:54), Sandeepa Prabhu wrote:
> On 4 December 2013 06:58, Masami Hiramatsu
>  wrote:
>> Hi,
>> Here is version 4 of the NOKPROBE_SYMBOL series.
>>
>> In this version, I removed the cleanup patches and
>> add bugfixes I've found, since those bugs will be
>> critical.
>> Rest of the cleanup and visible blacklists will be
>> proposed later in another series.
>>
>> Oh, just one new thing, I added a new RFC patch which
>> removes the dependency of notify_die() from kprobes
>> miss-hit/recovery path. Since the notify_die() involves
>> locking and lockdep code which invokes a lot of heavy
>> printk functions etc. This helped me to minimize the
>> blacklist and provides more stability for kprobes.
>> Actually, most of int3 handlers are already called
>> from do_int3 directly, I think this change is acceptable
>> too.
>>
>> Here is the updates about NOKPROBE_SYMBOL().
>>  - Now _ASM_NOKPROBE() macro is introduced for assembly
>>symbols on x86.
>>  - Rename kprobe_blackpoint to kprobe_blacklist_entry
>>and simplify it. Also NOKPROBE_SYMBOL() macro just
>>saves the address of non-probe-able symbols.
>>
>> ---
>>
>> Masami Hiramatsu (6):
> 
>>   kprobes: Prohibit probing on .entry.text code
>>   kprobes: Introduce NOKPROBE_SYMBOL() macro for blacklist
> Hi Masami,
> Is it a good idea to split the "arch/x86" code from the generic kernel changes?
> Then we just need to take above two patches for verifying it on arm64
> or other platforms.

Yeah, it can be.
However I think you can apply it without any problem on arm64 tree too,
since it "just adds" an asm macro in arch/x86/include/asm/asm.h.
It should not have any effect for other arch. Could you try it? :)

Thank you,


-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: [PATCH 3/4] Regulators: TPS65218: Add Regulator driver for TPS65218 PMIC

2013-12-03 Thread Manish Badarkhe

Hi Keerthy,

> +   rdev = regulator_register([id], );

Can you make use of "devm_regulator_register" instead?

> +   if (IS_ERR(rdev)) {
> +   dev_err(tps->dev, "failed to register %s regulator\n",
> +   pdev->name);
> +   return PTR_ERR(rdev);
> +   }
> +
> +   /* Save regulator */
> +   tps->rdev[id] = rdev;
> +
> +   return 0;
> +}


Best Regards
Manish Badarkhe

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-03 Thread Vladimir Davydov

On 12/04/2013 02:38 AM, Glauber Costa wrote:
>> In memcg_update_kmem_limit() we do the whole process of limit
>> initialization under a mutex so the situation we need protection from in
>> tcp_update_limit() is impossible. BTW once set, the 'activated' flag is
>> never cleared and never checked alone, only along with the 'active'
>> flag, that's why I doubt we need it at all.
> The updates are protected by a mutex, but the readers are not, and should not.
> So we can still be patching the readers, and the double-flag was
> initially used so
> we can make sure that both flags are only set after the static branches are 
> in.
>
> Note that one of the flags is set inside memcg_update_cache_sizes(). After 
> that,
> we call static_key_slow_inc(). At this point, someone whose code is
> patched in could
> start accounting, but it shouldn't - because not all sites are patched in.
>
> So after the update is done, we set the other flag, and now everybody
> will start going
> through.
>
> Could you do something clever with just one flag? Probably yes. But I
> doubt it would
> be that much cleaner, this is just the way that patching sites work.

Thank you for spending your time to listen to me.

Let me try to explain what is bothering me.

We have two state bits for each memcg, 'active' and 'activated'. There
are two scenarios where the bits can be modified:

1) The kmem limit is set on a memcg for the first time -
memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
which sets the 'activated' bit on success, then update static branching,
then set the 'active' bit. All three actions are done atomically in
respect to other tasks setting the limit due to the set_limit_mutex.
After both bits are set, they never get cleared for the memcg.

2) When a subgroup of a kmem-active cgroup is created -
memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
then increase static branching refcounter, then call
memcg_update_cache_sizes() for the new memcg, which may clear the
'activated' bit on failure. After successful execution, the state bits
never get cleared for the new memcg.

In scenario 2 there is no need bothering about the flags setting order,
because we don't have any tasks in the cgroup yet - the tasks can be
moved in only after css_online finishes when we have both of the bits
set and the static branching enabled. Actually, we already do not bother
about it, because we have both bits set before the cgroup is fully
initialized (memcg_update_cache_sizes() is called).

Let's look at scenario 1. There we have two bits - 'activated' and
'active' - the latter is always set after the former and after the
static branching is enabled. Moreover, none of the bits is cleared once
it's set. That said, checking if both bits are set - I mean
memcg_can_account_kmem() - is equivalent to checking if the 'active' bit
is set. Next, the 'activated' bit is never checked alone, only along
with the 'active' bit in memcg_can_account_kmem() - I do not count
(!memcg->kmem_account_flags) check in memcg_update_kmem_limit(), because
it is done under the set_limit_mutex. What's the deal having it then?

Thanks and sorry for disturbing you.
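
The equivalence argued in scenario 1 can be checked mechanically with a small sketch. It assumes only what the message states: 'activated' is always set before 'active' and neither bit is ever cleared. The bit positions and all function names below are hypothetical, chosen for illustration rather than taken from memcontrol.c.

```c
#include <stdbool.h>

/* Hypothetical bit positions for the two state flags. */
enum {
	KMEM_ACTIVE	= 1 << 0,
	KMEM_ACTIVATED	= 1 << 1,
};

/* Two-flag check, modelled on the description of memcg_can_account_kmem(). */
bool can_account_two_flags(unsigned int flags)
{
	return (flags & KMEM_ACTIVE) && (flags & KMEM_ACTIVATED);
}

/* Proposed single-flag check. */
bool can_account_one_flag(unsigned int flags)
{
	return (flags & KMEM_ACTIVE) != 0;
}

/* Scenario 1 ordering: step 0 sets nothing, step 1 sets 'activated',
 * step 2 additionally sets 'active'. Bits are never cleared. */
unsigned int flags_at_step(int step)
{
	unsigned int flags = 0;

	if (step >= 1)
		flags |= KMEM_ACTIVATED;
	if (step >= 2)
		flags |= KMEM_ACTIVE;
	return flags;
}

/* True if both checks give the same answer in every reachable state. */
bool checks_agree_on_reachable_states(void)
{
	for (int step = 0; step <= 2; step++) {
		unsigned int f = flags_at_step(step);

		if (can_account_two_flags(f) != can_account_one_flag(f))
			return false;
	}
	return true;
}
```

The two checks differ only in the state where 'active' is set without 'activated', which the ordering above never produces.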

RE: [PATCH 1/3 v5] usb: chipidea: Reallocate regmap only if lpm is detected

2013-12-03 Thread Peter Chen



 
> usb: chipidea: Reallocate regmap only if lpm is detected
> 
> The regmap only needs to reallocate if the hw_read on the CAP register
> shows
> lpm is used. Therefore the if() statement check the change.
> 
> Signed-off-by: Chris Ruehl 
> Acked-by: Peter Chen 
> ---
>  drivers/usb/chipidea/core.c |7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/chipidea/core.c b/drivers/usb/chipidea/core.c
> index 5d8981c..9a5ef20 100644
> --- a/drivers/usb/chipidea/core.c
> +++ b/drivers/usb/chipidea/core.c
> @@ -208,7 +208,8 @@ static int hw_device_init(struct ci_hdrc *ci, void
> __iomem *base)
>   reg = hw_read(ci, CAP_HCCPARAMS, HCCPARAMS_LEN) >>
>   __ffs(HCCPARAMS_LEN);
>   ci->hw_bank.lpm  = reg;
> - hw_alloc_regmap(ci, !!reg);
> + if (reg)
> + hw_alloc_regmap(ci, !!reg);
>   ci->hw_bank.size = ci->hw_bank.op - ci->hw_bank.abs;
>   ci->hw_bank.size += OP_LAST;
>   ci->hw_bank.size /= sizeof(u32);
> --
> 1.7.10.4
> 

Applied, Thanks.
Please do not add subject to commit log next time :)

Peter
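
The expression in the hunk above, a masked register value shifted right by __ffs() of the mask, is the usual way to pull a bit field down to bit 0. Below is a minimal user-space sketch of that idiom. It assumes the read helper returns the register value already ANDed with the mask; lowest_set_bit() is a hypothetical stand-in for the kernel's __ffs() built on the GCC/Clang __builtin_ctz, and field_get() is an invented name.

```c
#include <stdint.h>

/* Index of the lowest set bit. The mask must be non-zero, since
 * __builtin_ctz(0) is undefined. */
unsigned int lowest_set_bit(uint32_t mask)
{
	return (unsigned int)__builtin_ctz(mask);
}

/* Extract the field selected by mask and shift it down to bit 0. */
uint32_t field_get(uint32_t reg, uint32_t mask)
{
	return (reg & mask) >> lowest_set_bit(mask);
}
```

For a single-bit mask the result is 0 or 1, which is why the patch can test it directly with if (reg).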


Re: [PATCHv3] usb: chipidea: add support for USB OTG controller on TI-NSPIRE

2013-12-03 Thread Peter Chen

On Mon, Nov 25, 2013 at 02:53:37PM +1100, dt.ta...@gmail.com wrote:
> From: Daniel Tang 
> 
> The USB controller in TI-NSPIRE calculators are based off either Freescale's
> USB OTG controller or the USB controller found in the IMX233, both of which
> are Chipidea compatible.
> 
> This patch adds a device tree binding for the controller.
> 
> Signed-off-by: Daniel Tang 
> ---
> 
> Changelog v3:
>  * Removed redundant module aliases
> 
> Changelog v2:
>  * Rename ci13xxx to ci_hdrc
>  * Fixed alignment issues
> 
> .../devicetree/bindings/usb/ci-hdrc-nspire.txt | 17 +
>  drivers/usb/chipidea/Makefile  |  1 +
>  drivers/usb/chipidea/ci_hdrc_nspire.c  | 72 
> ++
>  3 files changed, 90 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
>  create mode 100644 drivers/usb/chipidea/ci_hdrc_nspire.c
> 
> diff --git a/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt 
> b/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
> new file mode 100644
> index 000..5ba8e90
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
> @@ -0,0 +1,17 @@
> +* TI-Nspire USB OTG Controller
> +
> +Required properties:
> +- compatible: Should be "zevio,nspire-usb"
> +- reg: Should contain registers location and length
> +- interrupts: Should contain controller interrupt
> +
> +Recommended properies:
> +- vbus-supply: regulator for vbus
> +
> +Examples:
> + usb0: usb@B000 {
> + reg = <0xB000 0x1000>;
> + compatible = "zevio,nspire-usb";
> + interrupts = <8>;
> + vbus-supply = <_reg>;
> + };
> diff --git a/drivers/usb/chipidea/Makefile b/drivers/usb/chipidea/Makefile
> index a99d980..245ea4d 100644
> --- a/drivers/usb/chipidea/Makefile
> +++ b/drivers/usb/chipidea/Makefile
> @@ -10,6 +10,7 @@ ci_hdrc-$(CONFIG_USB_CHIPIDEA_DEBUG)+= debug.o
>  # Glue/Bridge layers go here
> 
>  obj-$(CONFIG_USB_CHIPIDEA)   += ci_hdrc_msm.o
> +obj-$(CONFIG_USB_CHIPIDEA)   += ci_hdrc_nspire.o
> 
>  # PCI doesn't provide stubs, need to check
>  ifneq ($(CONFIG_PCI),)
> diff --git a/drivers/usb/chipidea/ci_hdrc_nspire.c 
> b/drivers/usb/chipidea/ci_hdrc_nspire.c
> new file mode 100644
> index 000..517ce41
> --- /dev/null
> +++ b/drivers/usb/chipidea/ci_hdrc_nspire.c
> @@ -0,0 +1,72 @@
> +/*
> + *   Copyright (C) 2013 Daniel Tang 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + * Based off drivers/usb/chipidea/ci_hdrc_msm.c
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "ci.h"
> +
> +static struct ci_hdrc_platform_data ci_hdrc_nspire_platdata = {
> + .name   = "ci_hdrc_nspire",
> + .flags  = CI_HDRC_REGS_SHARED,
> + .capoffset  = DEF_CAPOFFSET,
> +};
> +
> +static int ci_hdrc_nspire_probe(struct platform_device *pdev)
> +{
> + struct platform_device *ci_pdev;
> +
> + dev_dbg(&pdev->dev, "ci_hdrc_nspire_probe\n");
> +
> + ci_pdev = ci_hdrc_add_device(&pdev->dev,
> + pdev->resource, pdev->num_resources,
> + &ci_hdrc_nspire_platdata);
> +
> + if (IS_ERR(ci_pdev)) {
> + dev_err(&pdev->dev, "ci_hdrc_add_device failed!\n");
> + return PTR_ERR(ci_pdev);
> + }
> +
> + platform_set_drvdata(pdev, ci_pdev);
> +
> + return 0;
> +}
> +
> +static int ci_hdrc_nspire_remove(struct platform_device *pdev)
> +{
> + struct platform_device *ci_pdev = platform_get_drvdata(pdev);
> +
> + ci_hdrc_remove_device(ci_pdev);
> +
> + return 0;
> +}
> +
> +static const struct of_device_id ci_hdrc_nspire_dt_ids[] = {
> + { .compatible = "zevio,nspire-usb", },
> + { /* sentinel */ }
> +};
> +
> +static struct platform_driver ci_hdrc_nspire_driver = {
> + .probe = ci_hdrc_nspire_probe,
> + .remove = ci_hdrc_nspire_remove,
> + .driver = {
> + .name = "nspire_usb",
> + .owner = THIS_MODULE,
> + .of_match_table = ci_hdrc_nspire_dt_ids,
> + },
> +};
> +
> +MODULE_DEVICE_TABLE(of, ci_hdrc_nspire_dt_ids);
> +module_platform_driver(ci_hdrc_nspire_driver);
> +
> +MODULE_LICENSE("GPL v2");
> --
> 1.8.1.3
> 
> 

Applied, Thanks.

-- 

Best Regards,
Peter Chen


[PATCH] fix del_timer() misuse for ->s_err_report

2013-12-03 Thread Al Viro

That thing should be del_timer_sync(); consider what happens
if ext4_put_super() call of del_timer() happens to come just as it's
getting run on another CPU.  Since that timer reschedules itself
to run next day, you are pretty much guaranteed that you'll end up
with kfree'd scheduled timer, with usual fun consequences.  AFAICS,
that's -stable fodder all way back to 2010... [the second del_timer_sync()
is almost certainly not needed, but it doesn't hurt either]

Signed-off-by: Al Viro 
---
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index c977f4e..9d70c0c 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -792,7 +792,7 @@ static void ext4_put_super(struct super_block *sb)
}
 
ext4_es_unregister_shrinker(sbi);
-   del_timer(&sbi->s_err_report);
+   del_timer_sync(&sbi->s_err_report);
ext4_release_system_zone(sb);
ext4_mb_release(sb);
ext4_ext_release(sb);
@@ -4184,7 +4184,7 @@ failed_mount_wq:
}
 failed_mount3:
ext4_es_unregister_shrinker(sbi);
-   del_timer(&sbi->s_err_report);
+   del_timer_sync(&sbi->s_err_report);
if (sbi->s_flex_groups)
ext4_kvfree(sbi->s_flex_groups);
percpu_counter_destroy(&sbi->s_freeclusters_counter);

Re: [PATCH] drivers: staging: ft1000: ft1000-usb: initialize 'status' with STATUS_SUCCESS in request_code_segment()

2013-12-03 Thread Chen Gang


Oh, someone else has already fixed this (they found it before I did), and
the fix is in the next-20131203 tree, so this patch is obsolete.

The related git commit is "8aced95 staging: ft1000: fix use of
potentially uninitialized variable"

Thanks.

On 11/27/2013 05:27 PM, Chen Gang wrote:
> On 11/27/2013 05:18 PM, Josh Triplett wrote:
>> On Wed, Nov 27, 2013 at 11:01:18AM +0800, Chen Gang wrote:
>>> If "!bool_case", it returns unexpected value instead of STATUS_SUCCESS,
>>> so need fix it, the related warning (with allmodconfig under hexagon):
>>>
>>> CC [M]  drivers/staging/ft1000/ft1000-usb/ft1000_download.o
>>>   drivers/staging/ft1000/ft1000-usb/ft1000_download.c: In function 
>>> 'request_code_segment':
>>>   drivers/staging/ft1000/ft1000-usb/ft1000_download.c:581:6: warning: 
>>> 'status' may be used uninitialized in this function [-Wuninitialized]
>>>
>>>
>>> Signed-off-by: Chen Gang 
>>
>> Reviewed-by: Josh Triplett 
>>
> 
> Thanks.  :-)
> 
>>>  .../staging/ft1000/ft1000-usb/ft1000_download.c|2 +-
>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/staging/ft1000/ft1000-usb/ft1000_download.c 
>>> b/drivers/staging/ft1000/ft1000-usb/ft1000_download.c
>>> index 68ded17..15f3062 100644
>>> --- a/drivers/staging/ft1000/ft1000-usb/ft1000_download.c
>>> +++ b/drivers/staging/ft1000/ft1000-usb/ft1000_download.c
>>> @@ -578,7 +578,7 @@ static int request_code_segment(struct ft1000_usb 
>>> *ft1000dev, u16 **s_file,
>>>  u8 **c_file, const u8 *endpoint, bool boot_case)
>>>  {
>>> long word_length;
>>> -   int status;
>>> +   int status = STATUS_SUCCESS;
>>>  
>>> /*DEBUG("FT1000:REQUEST_CODE_SEGMENT\n");i*/
>>> word_length = get_request_value(ft1000dev);
>>> -- 
>>> 1.7.7.6
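
The warning quoted above is the classic case of a return value that is assigned on only one branch. A minimal sketch of the pattern and the one-line fix follows. It is not the ft1000 code: toy_request_segment(), its parameters and the TOY_STATUS_* values are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical status codes for illustration. */
#define TOY_STATUS_SUCCESS	0
#define TOY_STATUS_FAILURE	(-1)

/* status is only assigned inside the boot_case branch, so it is given a
 * defined value up front; without the initializer the non-boot path
 * would return an indeterminate value. */
int toy_request_segment(bool boot_case, bool transfer_ok)
{
	int status = TOY_STATUS_SUCCESS;

	if (boot_case)
		status = transfer_ok ? TOY_STATUS_SUCCESS : TOY_STATUS_FAILURE;

	return status;
}
```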
> 
> 


-- 
Chen Gang

Re: [PATCH 3/3] usb: phy-generic: Add ULPI VBUS support

2013-12-03 Thread Chris Ruehl


On Tuesday, December 03, 2013 04:15 PM, Heikki Krogerus wrote:
> On Mon, Dec 02, 2013 at 03:05:19PM +0800, Chris Ruehl wrote:
>> @@ -154,6 +164,27 @@ int usb_phy_gen_create_phy(struct device *dev, struct usb_phy_gen_xceiv *nop,
>>   {
>> int err;
>>
>> +   if (nop->ulpi_vbus > 0) {
>> +   unsigned int flags = 0;
>> +
>> +   if (nop->ulpi_vbus & 0x1)
>> +   flags |= ULPI_OTG_DRVVBUS;
>> +   if (nop->ulpi_vbus & 0x2)
>> +   flags |= ULPI_OTG_DRVVBUS_EXT;
>> +   if (nop->ulpi_vbus & 0x4)
>> +   flags |= ULPI_OTG_EXTVBUSIND;
>> +   if (nop->ulpi_vbus & 0x8)
>> +   flags |= ULPI_OTG_CHRGVBUS;
>> +
>> +   nop->ulpi = otg_ulpi_create(&ulpi_viewport_access_ops, flags);
>> +   if (!nop->ulpi) {
>> +   dev_err(dev, "Failed create ULPI Phy\n");
>> +   return -ENOMEM;
>> +   }
>> +   dev_dbg(dev, "Create ULPI Phy\n");
>> +   nop->ulpi->io_priv = nop->viewport;
>> +   }
>
> This is so wrong. You are registering one kind of usb phy driver from
> an other. Change drivers/usb/phy/ulpi.c to be a platform device. The
> whole flag system in it is pretty horrid. While you are at it, change
> that so it sets the values based on boolean flags from OF properties
> or platform data.
>
> NAK for the whole set.
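
The review comment above asks for named boolean settings instead of a magic bitmask. A minimal sketch of that shape follows, assuming each boolean would come from one OF property or platform-data field. The struct, the function and the TOY_* flag values are all hypothetical; the real ULPI_OTG_* constants live in the kernel's ULPI header and are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define TOY_OTG_DRVVBUS		(1u << 0)
#define TOY_OTG_DRVVBUS_EXT	(1u << 1)
#define TOY_OTG_EXTVBUSIND	(1u << 2)
#define TOY_OTG_CHRGVBUS	(1u << 3)

/* One named boolean per feature, instead of bits packed into an integer. */
struct toy_ulpi_cfg {
	bool drv_vbus;
	bool drv_vbus_ext;
	bool ext_vbus_ind;
	bool chrg_vbus;
};

/* Translate the named booleans into the flag word the PHY code consumes. */
unsigned int toy_ulpi_flags(const struct toy_ulpi_cfg *cfg)
{
	unsigned int flags = 0;

	if (cfg->drv_vbus)
		flags |= TOY_OTG_DRVVBUS;
	if (cfg->drv_vbus_ext)
		flags |= TOY_OTG_DRVVBUS_EXT;
	if (cfg->ext_vbus_ind)
		flags |= TOY_OTG_EXTVBUSIND;
	if (cfg->chrg_vbus)
		flags |= TOY_OTG_CHRGVBUS;
	return flags;
}
```

The trade-off is a few more lines of parsing in exchange for settings that are self-describing at the point of configuration.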




Heikki,

Thanks for your comments, even if they are not very positive for me.
My intention with the "horrid" approach was to reduce kernel code by using
one of_read32 instead of four of_boolean calls, and the logic mentioned is
simple. But that's history.

While looking for a solution for my board, I looked around and found
phy-ulpi.c functions being used in phy-tegra-usb.c, so I didn't mind using
them too.

I accept your NAK and will work on a patch to make phy-ulpi.c work as a
platform device.

One last question: what don't you like about the patch in my patch set that
adds chip-select GPIO support? I ask because you NAKed the whole set.

I really need the ChipSelect function to make my hardware work!

Chris

--
GTSYS Limited RFID Technology
A01 24/F Gold King Industrial Bld
35-41 Tai Lin Pai Road, Kwai Chung, Hong Kong
Fax (852) 8167 4060 - Tel (852) 3598 9488

Disclaimer: http://www.gtsys.com.hk/email/classified.html

Re: [PATCH v3] watchdog: mpc8xxx_wdt convert to watchdog core

2013-12-03 Thread Guenter Roeck


On 12/03/2013 10:32 PM, Christophe Leroy wrote:
> Convert mpc8xxx_wdt.c to the new watchdog API.
>
> Signed-off-by: Christophe Leroy

Reviewed-by: Guenter Roeck


Re: [PATCH v2] ARM: tegra: switch FUSE clock on before usage

2013-12-03 Thread Alexandre Courbot

Hi Stephen,

On Fri, Nov 22, 2013 at 10:35 AM, Alex Courbot  wrote:
> On 11/22/2013 05:30 AM, Stephen Warren wrote:
>>
>> On 11/20/2013 07:40 PM, Alexandre Courbot wrote:
>>>
>>> FUSE clock is enabled by most bootloaders, but we cannot expect it to be
>>> on in all contexts (e.g. kexec).
>>>
>>> Ensure the FUSE clock is enabled before any of its registers is touched.
>>> Since FUSE is touched very early during system boot (before the clock
>>> devices are registered), directly manipulate the clock register bit in
>>> case the clock device cannot be acquired.
>>
>>
>> This looks reasonable to me. I'll apply it soon after -rc1.
>
>
> Thanks. Be careful as I noticed I misformatted my commit message. The part
> after the "--" will not be stripped by git am as I intended it to be. I
> understood what I did wrong and will hopefully not make that mistake again.
>
> Sorry for the inconvenience.

I am not seeing this in your tree, have you applied it somewhere already?

Re: [PATCH] devtmpfs: Calling delete_path() only when necessary

2013-12-03 Thread Greg Kroah-Hartman

On Wed, Dec 04, 2013 at 02:44:14PM +0800, Axel Lin wrote:
> 2013/12/4 Rob Landley :
> > On 11/16/2013 02:15:23 AM, Axel Lin wrote:
> >>
> >> The deleted variable is always 1 in current code.
> >> Initialize deleted variable to be 0, so delete_path() will be called only
> >> when
> >> necessary.
> >>
> >> Signed-off-by: Axel Lin 
> >
> >
> > I'm not seeing this in linux-next, or a reply on the web archive. Assuming
> > nobody's objected to this, you might want to forward it to
> > triv...@kernel.org.
> >
> > That said, you could describe what it _does_ a little more?
> 
> I was expecting Greg to pick up this patch.
> 
> I thought the description is pretty clear.
> What the patch does is changing the init value of deleted variable to 0.
> The intention of this change is to avoid unnecessary delete_path() call.
> 
> Hi Greg,
> Would you pick up this patch?
> If a re-send or a v2 is required, please just let me know.

It's in my queue to get to, sorry, it's huge and slowly going down, it's
not lost...

greg k-h

[PATCH 1/2] arm: mmp: build sram driver alone

2013-12-03 Thread Qiao Zhou

sram driver can be used by many chips besides CPU_MMP2, and so build
it alone.

Reported-by: Dan Williams 
Signed-off-by: Qiao Zhou 
---
 arch/arm/mach-mmp/Kconfig  |5 +
 arch/arm/mach-mmp/Makefile |3 ++-
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/arm/mach-mmp/Kconfig b/arch/arm/mach-mmp/Kconfig
index ebdda83..6a6597c 100644
--- a/arch/arm/mach-mmp/Kconfig
+++ b/arch/arm/mach-mmp/Kconfig
@@ -136,4 +136,9 @@ config USB_EHCI_MV_U2O
help
  Enables support for OTG controller which can be switched to host mode.
 
+config MMP_SRAM
+   bool
+   help
+ Select code specific to sram.
+
 endif
diff --git a/arch/arm/mach-mmp/Makefile b/arch/arm/mach-mmp/Makefile
index 9b702a1..98f0f63 100644
--- a/arch/arm/mach-mmp/Makefile
+++ b/arch/arm/mach-mmp/Makefile
@@ -7,7 +7,8 @@ obj-y   += common.o devices.o time.o
 # SoC support
 obj-$(CONFIG_CPU_PXA168)   += pxa168.o
 obj-$(CONFIG_CPU_PXA910)   += pxa910.o
-obj-$(CONFIG_CPU_MMP2) += mmp2.o sram.o
+obj-$(CONFIG_CPU_MMP2) += mmp2.o
+obj-$(CONFIG_MMP_SRAM) += sram.o
 
 ifeq ($(CONFIG_COMMON_CLK), )
 obj-y  += clock.o
-- 
1.7.0.4


[PATCH 2/2] dma: mmp-tdma: select sram driver

2013-12-03 Thread Qiao Zhou

Reported-by: Dan Williams 
Signed-off-by: Qiao Zhou 
---
 drivers/dma/Kconfig |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index dd2874e..4e41b69 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -288,6 +288,7 @@ config MMP_TDMA
bool "MMP Two-Channel DMA support"
depends on ARCH_MMP
select DMA_ENGINE
+   select MMP_SRAM
help
  Support the MMP Two-Channel DMA engine.
  This engine used for MMP Audio DMA and pxa910 SQU.
-- 
1.7.0.4


Re: [PATCH 0/3] ARM Coresight: Enhance ETM tracing control

2013-12-03 Thread Greg Kroah-Hartman

On Tue, Dec 03, 2013 at 11:39:21PM -0500, Adrien Vergé wrote:
> Usage of ETM tracing facility is currently very limited: user can
> only start/stop tracing. This set of patches enables management of
> address combinations and PIDs that trigger tracing.
> 
> ETM management was done via sysfs entries (trace_info,
> trace_running...), this code adds trace_addrrange and trace_pid to
> let the user read/write custom values.

I have lots to say about this from a sysfs point of view, but first, why
is it in sysfs at all?  Shouldn't all of this be in debugfs?

greg k-h

Re: [PATCH RESEND v10 0/7] cpufreq:boost: CPU Boost mode support

2013-12-03 Thread Lukasz Majewski

Hi Rafael,

> This patch series introduces support for CPU overclocking technique
> called Boost.
> 
> It is a follow up of a LAB governor proposal. Boost is a LAB
> component:
> http://thread.gmane.org/gmane.linux.kernel/1484746/match=cpufreq
> 
> Boost unifies hardware based solution (e.g. Intel Nehalem) with
> software oriented one (like the one done at Exynos).
> For this reason cpufreq/freq_table code has been reorganized to
> include common code.
> 
> Important design decisions:
> - Boost related code is compiled-in unconditionally to cpufreq core
> and disabled by default. The cpufreq_driver is responsible for
> setting boost_supported flag and providing set_boost callback(if HW
> support is needed). For software managed boost, special Kconfig flag -
>   CONFIG_CPU_FREQ_BOOST_SW has been defined. It will be selected only
>   when a target platform has thermal framework properly configured.
> 
> - struct cpufreq_driver has been extended with boost related fields:
> -- boost_supported - when driver supports boosting
> -- boost_enabled - boost state
> -- set_boost - callback to function, which is necessary to
>enable/disable boost
> 
> - Boost sysfs attribute (/sys/devices/system/cpu/cpufreq/boost) is
> visible _only_ when cpufreq driver supports Boost.
> 
> - No special spin_lock for Boost was created. The one from cpufreq
> core was reused.
> 
> - The Boost code doesn't rely on any policy. When boost state is
> changed, then the policy list is iterated and proper adjustments are
> done.
> 
> - To improve safety level, the thermal framework is also extended to
> disable software boosting, when thermal trip point is reached. After
> cooling down the boost can be enabled again. This emulates behaviour
> similar to HW managed boost (like x86)
> 
> Tested at HW:
>Exynos 4412 3.13-rc2 Linux
>Intel Core i7-3770 3.13-rc2 Linux
> 
> Above patches were posted on top of kernel_pm/bleeding-edge
> (SHA1: 9483a9f69d5c8f83f1723361bf8340ddfb6475b4)
> 

Rafael, could you pull patches 1 to 6 of this series? They are related to
the cpufreq core and were already accepted by Viresh in late August this
year. This would facilitate my further cpufreq work.

As for the last patch, which is related to thermal: it seems that more
discussion NOT related to cpufreq will be ongoing.

I would prefer to add it as a separate patch to the thermal subtree.



> 
> Lukasz Majewski (7):
>   cpufreq: Add boost frequency support in core
>   cpufreq:acpi:x86: Adjust the acpi-cpufreq.c code to work with common
> boost solution
>   cpufreq:boost:Kconfig: Provide support for software managed BOOST
>   cpufreq:exynos:Extend Exynos cpufreq driver to support boost
> framework
>   Documentation:cpufreq:boost: Update BOOST documentation
>   cpufreq:exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ
>   thermal:exynos:boost: Automatic enable/disable of BOOST feature (at
> Exynos4412)
> 
>  Documentation/cpu-freq/boost.txt          |   26 +++
>  drivers/cpufreq/Kconfig                   |    4 +
>  drivers/cpufreq/Kconfig.arm               |   15 
>  drivers/cpufreq/acpi-cpufreq.c            |   86 +++--
>  drivers/cpufreq/cpufreq.c                 |  118 -
>  drivers/cpufreq/exynos-cpufreq.c          |    3 +
>  drivers/cpufreq/exynos4x12-cpufreq.c      |    2 +-
>  drivers/cpufreq/freq_table.c              |   56 --
>  drivers/thermal/samsung/exynos_tmu_data.c |   47 
>  include/linux/cpufreq.h                   |   24 ++
>  10 files changed, 302 insertions(+), 79 deletions(-)
> 



-- 
-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

Re: [PATCH] devtmpfs: Calling delete_path() only when necessary

2013-12-03 Thread Axel Lin

2013/12/4 Rob Landley :
> On 11/16/2013 02:15:23 AM, Axel Lin wrote:
>>
>> The deleted variable is always 1 in current code.
>> Initialize deleted variable to be 0, so delete_path() will be called only
>> when
>> necessary.
>>
>> Signed-off-by: Axel Lin 
>
>
> I'm not seeing this in linux-next, or a reply on the web archive. Assuming
> nobody's objected to this, you might want to forward it to
> triv...@kernel.org.
>
> That said, you could describe what it _does_ a little more?

I was expecting Greg to pick up this patch.

I thought the description is pretty clear.
What the patch does is changing the init value of deleted variable to 0.
The intention of this change is to avoid unnecessary delete_path() call.
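
A simplified userspace model of the change described above, given only as a sketch. The names remove_node(), delete_path_stub() and the unlink_ok flag are hypothetical and stand in for the real devtmpfs code, which is not reproduced here. The only point illustrated is that the initial value of "deleted" decides whether the path cleanup runs when nothing was actually unlinked.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Counts how often the (stubbed) path cleanup was invoked. */
static int delete_path_calls;

/* Hypothetical stand-in for delete_path(); it only records the call. */
static void delete_path_stub(const char *nodename)
{
	(void)nodename;
	delete_path_calls++;
}

/*
 * Simplified model of a node removal. deleted_init is the initial value
 * of the "deleted" flag: 1 models the old behaviour, 0 the patched one.
 * unlink_ok models whether the node was really removed.
 */
static void remove_node(const char *nodename, bool unlink_ok, int deleted_init)
{
	int deleted = deleted_init;

	assert(nodename != NULL);

	if (unlink_ok)
		deleted = 1;

	/* Only nodes living in a subdirectory have a path to clean up. */
	if (deleted && strchr(nodename, '/'))
		delete_path_stub(nodename);
}
```

With deleted_init set to 1 the cleanup runs even when unlink_ok is false; with 0 it runs only after a successful removal.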

Hi Greg,
Would you pick up this patch?
If a re-send or a v2 is required, please just let me know.

Thanks,
Axel

Re: [PATCHv2 1/2] New Driver for IOSF-SB MBI access on Intel SOCs

2013-12-03 Thread Andi Kleen

"David E. Box"  writes:
>  
> +config X86_IOSF_MBI
> + tristate "IOSF-SB MailBox Interface access support for Intel
> SOCs"

This is only implicitly used by other drivers, right?

Please make it not user visible (drop the string after tristate), as users 
will not know when to enable it.

The other drivers using it should always select it instead.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only

[PATCH v3] watchdog: mpc8xxx_wdt convert to watchdog core

2013-12-03 Thread Christophe Leroy

Convert mpc8xxx_wdt.c to the new watchdog API.

Signed-off-by: Christophe Leroy 

diff -ur a/drivers/watchdog/mpc8xxx_wdt.c b/drivers/watchdog/mpc8xxx_wdt.c
--- a/drivers/watchdog/mpc8xxx_wdt.c2013-05-11 22:57:46.0 +0200
+++ b/drivers/watchdog/mpc8xxx_wdt.c2013-11-30 16:14:53.803495472 +0100
@@ -72,9 +72,7 @@
  * to 0
  */
 static int prescale = 1;
-static unsigned int timeout_sec;
 
-static unsigned long wdt_is_open;
 static DEFINE_SPINLOCK(wdt_spinlock);
 
 static void mpc8xxx_wdt_keepalive(void)
@@ -86,39 +84,23 @@
spin_unlock(&wdt_spinlock);
 }
 
+static struct watchdog_device mpc8xxx_wdt_dev;
 static void mpc8xxx_wdt_timer_ping(unsigned long arg);
-static DEFINE_TIMER(wdt_timer, mpc8xxx_wdt_timer_ping, 0, 0);
+static DEFINE_TIMER(wdt_timer, mpc8xxx_wdt_timer_ping, 0,
+   (unsigned long)&mpc8xxx_wdt_dev);
 
 static void mpc8xxx_wdt_timer_ping(unsigned long arg)
 {
+   struct watchdog_device *w = (struct watchdog_device *)arg;
+
mpc8xxx_wdt_keepalive();
/* We're pinging it twice faster than needed, just to be sure. */
-   mod_timer(&wdt_timer, jiffies + HZ * timeout_sec / 2);
-}
-
-static void mpc8xxx_wdt_pr_warn(const char *msg)
-{
-   pr_crit("%s, expect the %s soon!\n", msg,
-   reset ? "reset" : "machine check exception");
+   mod_timer(&wdt_timer, jiffies + HZ * w->timeout / 2);
 }
 
-static ssize_t mpc8xxx_wdt_write(struct file *file, const char __user *buf,
-size_t count, loff_t *ppos)
-{
-   if (count)
-   mpc8xxx_wdt_keepalive();
-   return count;
-}
-
-static int mpc8xxx_wdt_open(struct inode *inode, struct file *file)
+static int mpc8xxx_wdt_start(struct watchdog_device *w)
 {
u32 tmp = SWCRR_SWEN;
-   if (test_and_set_bit(0, &wdt_is_open))
-   return -EBUSY;
-
-   /* Once we start the watchdog we can't stop it */
-   if (nowayout)
-   __module_get(THIS_MODULE);
 
/* Good, fire up the show */
if (prescale)
@@ -132,59 +114,37 @@
 
del_timer_sync(&wdt_timer);
 
-   return nonseekable_open(inode, file);
+   return 0;
 }
 
-static int mpc8xxx_wdt_release(struct inode *inode, struct file *file)
+static int mpc8xxx_wdt_ping(struct watchdog_device *w)
 {
-   if (!nowayout)
-   mpc8xxx_wdt_timer_ping(0);
-   else
-   mpc8xxx_wdt_pr_warn("watchdog closed");
-   clear_bit(0, &wdt_is_open);
+   mpc8xxx_wdt_keepalive();
return 0;
 }
 
-static long mpc8xxx_wdt_ioctl(struct file *file, unsigned int cmd,
-   unsigned long arg)
+static int mpc8xxx_wdt_stop(struct watchdog_device *w)
 {
-   void __user *argp = (void __user *)arg;
-   int __user *p = argp;
-   static const struct watchdog_info ident = {
-   .options = WDIOF_KEEPALIVEPING,
-   .firmware_version = 1,
-   .identity = "MPC8xxx",
-   };
-
-   switch (cmd) {
-   case WDIOC_GETSUPPORT:
-   return copy_to_user(argp, &ident, sizeof(ident)) ? -EFAULT : 0;
-   case WDIOC_GETSTATUS:
-   case WDIOC_GETBOOTSTATUS:
-   return put_user(0, p);
-   case WDIOC_KEEPALIVE:
-   mpc8xxx_wdt_keepalive();
-   return 0;
-   case WDIOC_GETTIMEOUT:
-   return put_user(timeout_sec, p);
-   default:
-   return -ENOTTY;
-   }
+   mod_timer(&wdt_timer, jiffies);
+   return 0;
 }
 
-static const struct file_operations mpc8xxx_wdt_fops = {
-   .owner  = THIS_MODULE,
-   .llseek = no_llseek,
-   .write  = mpc8xxx_wdt_write,
-   .unlocked_ioctl = mpc8xxx_wdt_ioctl,
-   .open   = mpc8xxx_wdt_open,
-   .release= mpc8xxx_wdt_release,
+static struct watchdog_info mpc8xxx_wdt_info = {
+   .options = WDIOF_KEEPALIVEPING,
+   .firmware_version = 1,
+   .identity = "MPC8xxx",
 };
 
-static struct miscdevice mpc8xxx_wdt_miscdev = {
-   .minor  = WATCHDOG_MINOR,
-   .name   = "watchdog",
-   .fops   = &mpc8xxx_wdt_fops,
+static struct watchdog_ops mpc8xxx_wdt_ops = {
+   .owner = THIS_MODULE,
+   .start = mpc8xxx_wdt_start,
+   .ping = mpc8xxx_wdt_ping,
+   .stop = mpc8xxx_wdt_stop,
+};
+
+static struct watchdog_device mpc8xxx_wdt_dev = {
+   .info = &mpc8xxx_wdt_info,
+   .ops = &mpc8xxx_wdt_ops,
 };
 
 static const struct of_device_id mpc8xxx_wdt_match[];
@@ -196,6 +156,7 @@
const struct mpc8xxx_wdt_type *wdt_type;
u32 freq = fsl_get_sys_freq();
bool enabled;
+   unsigned int timeout_sec;
 
match = of_match_device(mpc8xxx_wdt_match, &ofdev->dev);
if (!match)
@@ -222,6 +183,7 @@
else
timeout_sec = timeout / freq;
 
+   mpc8xxx_wdt_dev.timeout = timeout_sec;
 #ifdef MODULE
ret = mpc8xxx_wdt_init_late();
if (ret)
@@ -237,7 +199,7 @@
 * userspace handles it.
 */
if (enabled)
-

Re: [PATCH v12 09/18] vmscan: shrink slab on memcg pressure

2013-12-03 Thread Vladimir Davydov

On 12/04/2013 08:51 AM, Dave Chinner wrote:
> On Tue, Dec 03, 2013 at 04:15:57PM +0400, Vladimir Davydov wrote:
>> On 12/03/2013 02:48 PM, Dave Chinner wrote:
>>>> @@ -236,11 +236,17 @@ shrink_slab_node(struct shrink_control *shrinkctl,
>>>> struct shrinker *shrinker,
>>>>    return 0;
>>>>
>>>>    /*
>>>> -   * copy the current shrinker scan count into a local variable
>>>> -   * and zero it so that other concurrent shrinker invocations
>>>> -   * don't also do this scanning work.
>>>> +   * Do not touch global counter of deferred objects on memcg pressure to
>>>> +   * avoid isolation issues. Ideally the counter should be per-memcg.
>>>>     */
>>>> -  nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
>>>> +  if (!shrinkctl->target_mem_cgroup) {
>>>> +  /*
>>>> +   * copy the current shrinker scan count into a local variable
>>>> +   * and zero it so that other concurrent shrinker invocations
>>>> +   * don't also do this scanning work.
>>>> +   */
>>>> +  nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
>>>> +  }
>>> That's ugly. Effectively it means that memcg reclaim is going to be
>>> completely ineffective when large numbers of allocations and hence
>>> reclaim attempts are done under GFP_NOFS context.
>>>
>>> The only thing that keeps filesystem caches in balance when there is
>>> lots of filesystem work going on (i.e. lots of GFP_NOFS allocations)
>>> is the deferral of reclaim work to a context that can do something
>>> about it.
>> Imagine the situation: a memcg issues a GFP_NOFS allocation and goes to
>> shrink_slab() where it defers them to the global counter; then another
>> memcg issues a GFP_KERNEL allocation, also goes to shrink_slab() where
>> it sees a huge number of deferred objects and starts shrinking them,
>> which is not good IMHO.
> That's exactly what the deferred mechanism is for - we know we have
> to do the work, but we can't do it right now so let someone else do
> it who can.
>
> In most cases, deferral is handled by kswapd, because when a
> filesystem workload is causing memory pressure then most allocations
> are done in GFP_NOFS conditions. Hence the only memory reclaim that
> can make progress here is kswapd.
>
> Right now, you aren't deferring any of this memory pressure to some
> other agent, so it just does not get done. That's a massive problem
> - it's a design flaw - and instead I see lots of crazy hacks being
> added to do stuff that should simply be deferred to kswapd like is
> done for global memory pressure.
>
> Hell, kswapd should be allowed to walk memcg LRU lists and trim
> them, just like it does for the global lists. We only need a single
> "deferred work" counter per node for that - just let kswapd
> proportion the deferred work over the per-node LRU and the
> memcgs

Seems I misunderstand :-(

Let me try. You mean we have a single nr_deferred counter per node, and
kswapd scans

nr_deferred*memcg_kmem_size/total_kmem_size

objects in each memcg, right?

Then if there were a lot of objects deferred on memcg (not global)
pressure due to a memcg issuing a lot of GFP_NOFS allocations, kswapd
will reclaim objects from all, even unlimited, memcgs. This looks like
an isolation issue :-/
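
As a rough illustration of the proportional split and of the isolation concern, here is a sketch only. The helper memcg_scan_share() is hypothetical and is not a kernel function; it simply evaluates nr_deferred*memcg_kmem_size/total_kmem_size. Note that the result depends only on the size of a memcg, not on which memcg caused the work to be deferred.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of deferred objects a single memcg would be
 * asked to scan if one per-node counter were split in proportion to the
 * kmem size of each memcg. Overflow of the multiplication is ignored in
 * this sketch.
 */
static uint64_t memcg_scan_share(uint64_t nr_deferred,
				 uint64_t memcg_kmem_size,
				 uint64_t total_kmem_size)
{
	assert(total_kmem_size > 0);
	assert(memcg_kmem_size <= total_kmem_size);

	return nr_deferred * memcg_kmem_size / total_kmem_size;
}
```

For example, if a small memcg A (a quarter of all kmem) deferred 1000 objects through GFP_NOFS allocations, a memcg B holding the remaining three quarters would be asked to scan 750 of them even though it deferred nothing.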

Currently we have a per-node nr_deferred counter for each shrinker. If
we add per-memcg reclaim, we have to make it per-memcg per-node, don't we?

Thanks.

Re: [PATCH V6 2/2] arm64: perf: add support for percpu pmu interrupt

2013-12-03 Thread Vinayak Kale

On Tue, Dec 3, 2013 at 7:20 PM, Will Deacon  wrote:
> On Mon, Dec 02, 2013 at 09:34:03AM +, Vinayak Kale wrote:
>>  static void
>> +armpmu_disable_percpu_irq(void *data)
>> +{
>> + struct arm_pmu *armpmu = data;
>> + struct platform_device *pmu_device = armpmu->plat_device;
>> + int irq = platform_get_irq(pmu_device, 0);
>> +
>> + cpumask_test_and_clear_cpu(smp_processor_id(), &armpmu->active_irqs);
>
> Why not just cpumask_clear_cpu?

Yes, that would have served the purpose. It was due to a dumb copy/paste
from the non-percpu counterpart.
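
The difference can be sketched with a userspace model. This is an illustration under stated assumptions: the mask_*() helpers below are hypothetical, operate on a single unsigned long, and are not atomic, unlike the kernel cpumask API. When the previous state of the bit is not used, the plain clear is enough.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Number of CPUs this toy mask can represent. */
#define MASK_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Set the bit for one CPU. */
static void mask_set_cpu(unsigned int cpu, unsigned long *mask)
{
	assert(cpu < MASK_BITS);
	*mask |= 1UL << cpu;
}

/* Clear the bit for one CPU; the previous state is not reported. */
static void mask_clear_cpu(unsigned int cpu, unsigned long *mask)
{
	assert(cpu < MASK_BITS);
	*mask &= ~(1UL << cpu);
}

/* Clear the bit and report whether it was set before. */
static bool mask_test_and_clear_cpu(unsigned int cpu, unsigned long *mask)
{
	bool was_set;

	assert(cpu < MASK_BITS);
	was_set = (*mask >> cpu) & 1UL;
	*mask &= ~(1UL << cpu);
	return was_set;
}
```

The test-and-clear form is only worth its cost when the caller branches on the returned value, as the non-percpu release loop does.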

>
>> + disable_percpu_irq(irq);
>> +}
>> +
>> +static void
>>  armpmu_release_hardware(struct arm_pmu *armpmu)
>>  {
>> - int i, irq, irqs;
>> + int irq;
>> + unsigned int i, irqs;
>>   struct platform_device *pmu_device = armpmu->plat_device;
>>
>>   irqs = min(pmu_device->num_resources, num_possible_cpus());
>> + if (!irqs)
>> + return;
>>
>> - for (i = 0; i < irqs; ++i) {
>> - if (!cpumask_test_and_clear_cpu(i, &armpmu->active_irqs))
>> - continue;
>> - irq = platform_get_irq(pmu_device, i);
>> - if (irq >= 0)
>> - free_irq(irq, armpmu);
>> + irq = platform_get_irq(pmu_device, 0);
>> + if (irq <= 0)
>> + return;
>> +
>> + if (irq_is_percpu(irq)) {
>> + on_each_cpu(armpmu_disable_percpu_irq, armpmu, 1);
>> + free_percpu_irq(irq, &cpu_hw_events);
>> + } else {
>> + for (i = 0; i < irqs; ++i) {
>> + if (!cpumask_test_and_clear_cpu(i, &armpmu->active_irqs))
>> + continue;
>> + irq = platform_get_irq(pmu_device, i);
>> + if (irq > 0)
>> + free_irq(irq, armpmu);
>> + }
>>   }
>>  }
>>
>> +static void
>> +armpmu_enable_percpu_irq(void *data)
>> +{
>> + struct arm_pmu *armpmu = data;
>> + struct platform_device *pmu_device = armpmu->plat_device;
>> + int irq = platform_get_irq(pmu_device, 0);
>> +
>> + enable_percpu_irq(irq, IRQ_TYPE_NONE);
>> + cpumask_set_cpu(smp_processor_id(), &armpmu->active_irqs);
>
> Hmm, wouldn't it make more sense to pass the irq in data, then deal with the
> mask in the caller? (since the mask will *always* be updated by each CPU).
>
> Similarly for the disable path.

Okay.

Re: [PATCH v9] x86, apic, kexec, Documentation: Add disable_cpu_apic kernel parameter

2013-12-03 Thread HATAYAMA Daisuke


(2013/12/04 12:08), HATAYAMA Daisuke wrote:
> (2013/12/04 0:25), Vivek Goyal wrote:
>> On Tue, Dec 03, 2013 at 10:32:26AM +0900, HATAYAMA Daisuke wrote:
>>
>> [..]
>>> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
>>> index 50680a5..dd77bec 100644
>>> --- a/Documentation/kernel-parameters.txt
>>> +++ b/Documentation/kernel-parameters.txt
>>> @@ -774,6 +774,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>>>   disable=[IPV6]
>>>   See Documentation/networking/ipv6.txt.
>>>
>>> +disable_cpu_apicid= [X86,APIC,KEXEC,SMP]
>>
>> Hi Hatayama,
>>
>> We are almost there. A minor nit. Why have we specified KEXEC here. This
>> parameter disabled_cpu_apicid does not seem to depend on CONFIG_KEXEC?
>>
>> Jerry, this patch looks good to me. Does it work on your system?
>>
>
> Because primary user for the option is currently kexec/kdump only.
>
> I referred to acpi_rsdp description:
>
>  acpi_rsdp=  [ACPI,EFI,KEXEC]
>  Pass the RSDP address to the kernel, mostly used
>  on machines running EFI runtime service to boot the
>  second kernel for kdump.
>



Indo-san, who introduced acpi_rsdp and the KEXEC tag, told me the historical reason
why the KEXEC tag was introduced. disable_cpu_apicid is generic, at least in the
current version, so the tag doesn't need to be specified here.

I'll post a new version soon.

--
Thanks.
HATAYAMA, Daisuke


Re: [PATCH v2] checkpatch: add DT compatible string documentation checks

2013-12-03 Thread Joe Perches

On Tue, 2013-12-03 at 22:17 -0600, Rob Herring wrote:
> From: Rob Herring 
> 
> This adds a simple check that any compatible strings in DeviceTree dts
> files are present in Documentation/devicetree/bindings. Vendor prefixes
> are also checked for existing in vendor-prefixes.txt These should be
> temporary checks until we have more sophisticated binding schema checking.
> 
> Signed-off-by: Rob Herring 
> Cc: Grant Likely 
> Cc: Andy Whitcroft 
> Cc: Joe Perches 
> ---
> v2:
> - Add vendor string checking against vendor-prefixes.txt
> - Add '_', '.' and '+' as valid compatible string characters
> - Use 'grep -E' instead of egrep

Some more trivial notes:

> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -2034,6 +2034,29 @@ sub process {
>"Use of $flag is deprecated, please use 
> \`$replacement->{$flag} instead.\n" . $herecurr) if ($replacement->{$flag});
>   }
>  
> +# check for DT compatible documentation
> + if ($realfile =~ /\.dts/ && $rawline =~ /\+\s*compatible\s*=/) {

this should probably be $rawline =~ /^\+\s*compatible...
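
The effect of the anchor can be checked outside of checkpatch. The following is a sketch in C using POSIX extended regular expressions, where [[:space:]] stands in for Perl's \s; the helper line_matches() and the two pattern constants are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Pattern as posted: matches "+ compatible =" anywhere in the line. */
static const char *const UNANCHORED =
	"\\+[[:space:]]*compatible[[:space:]]*=";

/* Suggested form: the '+' must be the first character of the line. */
static const char *const ANCHORED =
	"^\\+[[:space:]]*compatible[[:space:]]*=";

/* Returns true when the POSIX extended regex 'pattern' matches 'line'. */
static bool line_matches(const char *pattern, const char *line)
{
	regex_t re;
	bool hit;
	int rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);

	assert(rc == 0);
	hit = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return hit;
}
```

An added line such as "+\tcompatible = ..." matches both patterns, while a context line that merely contains "+compatible =" further in matches only the unanchored one.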

> + my @compats = $rawline =~ 
> /\"([a-zA-Z0-9\-\,\.\+_]+)\"/g;
> +
> + foreach my $compat (@compats) {
> + my $compat2 = $compat;
> + my $dt_path = 
> "Documentation/devicetree/bindings/";
> + $compat2 =~ s/\,[a-z]*\-/\,<\.\*>\-/;
> + `grep -Erq "$compat|$compat2" $dt_path`;
> + if ( $? >> 8 ) {
> + WARN("UNDOCUMENTED_DT_BINDING",
> +  "DT compatible string \"$compat\" 
> appears un-documented -- check $dt_path\n" . $herecurr);
> + }
> + my $vendor = $compat;
> + $vendor =~ s/^([a-zA-Z0-9]+)\,.*/$1/;
> + `grep -Eq "$vendor" 
> "${dt_path}vendor-prefixes.txt"`;

It may be simpler to read as:

my $vendor_path = $dt_path . "vendor-prefixes.txt";
`grep -Eq $vendor $vendor_path`;

> + if ( $? >> 8 ) {
> + WARN("UNDOCUMENTED_DT_VENDOR",
> +  "DT compatible string vendor 
> \"$vendor\" appears un-documented -- check ${dt_path}vendor-prefixes.txt\n" . 
> $herecurr);

 "DT compatible string vendor 
\"$vendor\" appears un-documented -- check $vendor_path\n" . $herecurr)

Also I suggest using the same message type
instead of 2 distinct ones.

Maybe "UNDOCUMENTED_DT_STRING"


Re: [Xen-devel] [PATCH v2 0/2] xen: vnuma introduction for pv guest

2013-12-03 Thread Elena Ufimtseva

On Tue, Dec 3, 2013 at 7:35 PM, Elena Ufimtseva  wrote:
> On Tue, Nov 19, 2013 at 1:29 PM, Dario Faggioli
>  wrote:
>> On mar, 2013-11-19 at 10:38 -0500, Konrad Rzeszutek Wilk wrote:
>>> On Mon, Nov 18, 2013 at 03:25:48PM -0500, Elena Ufimtseva wrote:
>>> > The patchset introduces vnuma to paravirtualized Xen guests
>>> > running as domU.
>>> > Xen subop hypercall is used to retrieve vnuma topology information.
>>> > Based on the retrieved topology from Xen, NUMA number of nodes,
>>> > memory ranges, distance table and cpumask is being set.
>>> > If initialization is incorrect, sets 'dummy' node and unsets
>>> > nodemask.
>>> > vNUMA topology is constructed by Xen toolstack. Xen patchset is
>>> > available at https://git.gitorious.org/xenvnuma/xenvnuma.git:v3.
>>>
>>> Yeey!
>>>
>> :-)
>>
>>> One question - I know you had questions about the
>>> PROT_GLOBAL | ~PAGE_PRESENT being set on PTEs that are going to
>>> be harvested for AutoNUMA balancing.
>>>
>>> And that the hypercall to set such PTE entry disallows the
>>> PROT_GLOBAL (it strips it off)? That means that when the
>>> Linux page system kicks in (as it has ~PAGE_PRESENT) the
>>> Linux pagehandler won't see the PROT_GLOBAL (as it has
>>> been filtered out). Which means that the AutoNUMA code won't
>>> kick in.
>>>
>>> (see http://article.gmane.org/gmane.comp.emulators.xen.devel/174317)
>>>
>>> Was that problem ever answered?
>>>
>> I think the issue is a twofold one.
>>
>> If I remember correctly (Elena, please, correct me if I'm wrong) Elena
>> was seeing _crashes_ with both vNUMA and AutoNUMA enabled for the guest.
>> That's what pushed her to investigate the issue, and led to what you're
>> summing up above.
>>
>> However, it appears the crash was due to something completely unrelated
>> to Xen and vNUMA, was affecting baremetal too, and got fixed, which
>> means the crash is now gone.
>>
>> It remains to be seen (I think) whether that also means that AutoNUMA
>> works. In fact, chatting about this in Edinburgh, Elena managed to
>> convince me pretty badly that we should --as part of the vNUMA support--
>> do something about this, in order to make it work. At that time I
>> thought we should be doing something to avoid the system to go ka-boom,
>> but as I said, even now that it does not crash anymore, she was so
>> persuasive that I now find it quite hard to believe that we really don't
>> need to do anything. :-P
>
> Yes, you were right Dario :) See at the end. pv guests do not crash,
> but they have user space memory corruption.
> Ok, so I will try to understand what again had happened during this
> weekend.
> Meanwhile posting patches for Xen.
>
>>
>> I guess, as soon as we get the chance, we should see if this actually
>> works, i.e., in addition to seeing the proper topology and not crashing,
>> verify that AutoNUMA in the guest is actually doing is job.
>>
>> What do you think? Again, Elena, please chime in and explain how things
>> are, if I got something wrong. :-)
>>
>
> Oh guys, I feel really bad about not replying to these emails... Somehow these
> replies all got deleted.. weird.
>
> Ok, about that automatic balancing. At the moment of the last patch
> automatic numa balancing seem to
> work, but after rebasing on the top of 3.12-rc2 I see similar issues.
> I will try to figure out what commits broke and will contact Ingo
> Molnar and Mel Gorman.
>
> Konrad,
> as of PROT_GLOBAL flag, I will double check once more to exclude
> errors from my side.
> Last time I was able to have numa_balancing working without any
> modifications from hypervisor side.
> But again, I want to double check this; some experiments might only have
> appeared to be good :)
>
>
>> Regards,
>> Dario
>>
>> --
>> <> (Raistlin Majere)
>> -
>> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
>> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
>>
>

As of now I have patch v4 ready for review. I am not sure whether it would be
more beneficial to post it for review or to look closer at the current problem.
The issue I am seeing right now is different from what was happening before.
The corruption happens on the change_prot_numa path:

[ 6638.021439]  pfn 45e602, highest_memmap_pfn - 14ddd7
[ 6638.021444] BUG: Bad page map in process dd  pte:80045e602166
pmd:abf1a067
[ 6638.021449] addr:7f4fda2d8000 vm_flags:00100073
anon_vma:8800abf77b90 mapping:  (null) index:7f4fda2d8
[ 6638.021457] CPU: 1 PID: 1033 Comm: dd Tainted: GB   W3.13.0-rc2+ #10
[ 6638.021462]   7f4fda2d8000 813ca5b1
88010d68deb8
[ 6638.021471]  810f2c88 abf1a067 80045e602166

[ 6638.021482]  0045e602 88010d68deb8 7f4fda2d8000
80045e602166
[ 6638.021492] Call Trace:
[ 6638.021497]  [] ? dump_stack+0x41/0x51
[ 6638.021503]  [] ? print_bad_pte+0x19d/0x1c9
[ 6638.021509]  [] ? vm_normal_page+0x94/0xb3
[ 6638.021519]  [] ?

Re: Need help on Linux PCIe

2013-12-03 Thread Jagan Teki

Thanks for your quick response.
Please find my comments below.

On Tue, Dec 3, 2013 at 11:09 PM, Bjorn Helgaas  wrote:
> On Tue, Dec 3, 2013 at 4:24 AM, Jagan Teki  wrote:
>> Hi,
>>
>> I have few question on Linux PCIe subsystem, I am trying to understand
>> the PCIe on ARM platform.
>> 1. Compared to PCI, PCIe has extra port functionalities/services,
>> which are implemented in drivers/pci/pcie/*; is that true?
>
> Yes.
>
>> 2. PCIe root complex is same as Host controller drivers in linux 
>> drivers/host/*
>
> Yes.
>
>> 3. As individual endpoint drivers are registered to pci_core as
>> pci_driver_register, then what is the common call for registering
>> individual HC driver to pci-core?
>
> The host controller-PCI core interface is not as mature as the
> pci_register_driver() interface.  The basic interface is
> pci_scan_root_bus().  If you skim through the drivers in
> drivers/pci/host/* and drivers/acpi/pci_root.c, the interface to the
> PCI core will be fairly obvious.  And you'll learn what the existing
> practices are in case you need to add or modify something.

OK.

I understand the flow as below; please correct me if I am wrong.

From the low level (hw): the HC driver has a platform registration using
platform_driver_register() to the lower layer, and then a
pci_scan_root_bus() --> pci_common_init_dev() registration to the
upper layer as PCI-BIOS, and there it ends.

From the upper level (app): each endpoint driver has a
pci_register_driver() call to the PCI core for the lower level, and the
upper level registration depends on the endpoint.

What is the connection between PCI-BIOS and PCI core here? Are these
two different entities, meaning there is no common call between them?
For ARM, I see that arch/arm/kernel/bios32.c is the PCI-BIOS; is that correct?
Do we have separate BIOS code for each architecture?

-- 
Thanks,
Jagan.

Jagannadha Sutradharudu Teki,
E: jagannadh.t...@gmail.com, P: +91-9676773388
Engineer - System Software Hacker
U-boot - SPI Custodian and Zynq APSOC
Ln: http://www.linkedin.com/in/jaganteki

Re: [PATCH v2 3/3] nohz_full: update cpu load fix in nohz_full

2013-12-03 Thread Alex Shi

On 12/03/2013 08:35 PM, Alex Shi wrote:
> We are not always 0 when update nohz cpu load, after nohz_full enabled.
> But current code still treat the cpu as idle. that is incorrect.
> Fix it to use correct cpu_load.

Frederic, could you give some comments?

> 
> Signed-off-by: Alex Shi 
> ---
>  kernel/sched/proc.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/proc.c b/kernel/sched/proc.c
> index 16f5a30..f1441f0 100644
> --- a/kernel/sched/proc.c
> +++ b/kernel/sched/proc.c
> @@ -568,8 +568,14 @@ void update_cpu_load_nohz(void)
>   /*
>* We were idle, this means load 0, the current load might be
>* !0 due to remote wakeups and the sort.
> +  * or we may has only one task and in NO_HZ_FULL, then still use
> +  * normal cpu load.
>*/
> - __update_cpu_load(this_rq, 0, pending_updates);
> + if (this_rq->cfs.h_nr_running) {
> + unsigned load = get_rq_runnable_load(this_rq);
> + __update_cpu_load(this_rq, load, pending_updates);
> + } else
> + __update_cpu_load(this_rq, 0, pending_updates);
>   }
>   raw_spin_unlock(&this_rq->lock);
>  }
> 


-- 
Thanks
Alex

Re: [PATCH 3/4] usb: chipidea: msm: Initialize offset of the capability registers

2013-12-03 Thread Peter Chen

On Mon, Nov 11, 2013 at 03:35:36PM +0200, Ivan T. Ivanov wrote:
> From: "Ivan T. Ivanov" 
> 

The commit log is needed.

> Signed-off-by: Ivan T. Ivanov 
> ---
>  drivers/usb/chipidea/ci_hdrc_msm.c |1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/usb/chipidea/ci_hdrc_msm.c 
> b/drivers/usb/chipidea/ci_hdrc_msm.c
> index 747d6c1..e9624f3 100644
> --- a/drivers/usb/chipidea/ci_hdrc_msm.c
> +++ b/drivers/usb/chipidea/ci_hdrc_msm.c
> @@ -47,6 +47,7 @@ static void ci_hdrc_msm_notify_event(struct ci_hdrc *ci, 
> unsigned event)
>  
>  static struct ci_hdrc_platform_data ci_hdrc_msm_platdata = {
>   .name   = "ci_hdrc_msm",
> + .capoffset  = DEF_CAPOFFSET,
>   .flags  = CI_HDRC_REGS_SHARED |
> CI_HDRC_REQUIRE_TRANSCEIVER |
> CI_HDRC_DISABLE_STREAMING,
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 

Best Regards,
Peter Chen


Re: [PATCH 0/3] makedumpfile: hugepage filtering for vmcore dump

2013-12-03 Thread Atsushi Kumagai

On 2013/12/03 18:06:13, kexec  wrote:
> >> This is a suggestion from different point of view...
> >>
> >> In general, data on crash dump can be corrupted. Thus, order contained in 
> >> a page
> >> descriptor can also be corrupted. For example, if the corrupted value were 
> >> a huge
> >> number, wide range of pages after buddy page would be filtered falsely.
> >>
> >> So, actually we should sanity check data in crash dump before using them 
> >> for application
> >> level feature. I've picked up order contained in page descriptor, so there 
> >> would be other
> >> data used in makedumpfile that are not checked.
> > 
> > What you said is reasonable, but how will you do such sanity check ?
> > Certain standard values are necessary for sanity check, how will
> > you prepare such values ?
> > (Get them from kernel source and hard-code them in makedumpfile ?)
> > 
> >> Unlike diskdump, we no longer need to care about kernel/hardware level 
> >> data integrity
> >> outside of user-land, but we still care about data its own integrity.
> >>
> >> On the other hand, if we do it, we might face some difficulty, for 
> >> example, hardness of
> >> maintenance or performance bottleneck; it might be the reason why we don't 
> >> see sanity
> >> check in makedumpfile now.
> > 
> > There are many values which should be checked, e.g. page.flags, page._count,
> > page.mapping, list_head.next and so on.
> > If we introduce sanity check for them, the issues you mentioned will be 
> > appear
> > distinctly.
> > 
> > So I think makedumpfile has to trust crash dump in practice.
> > 
> 
> Yes, I don't mean such drastic checking; I understand the difficulty
> because I often handle/write this kind of code; I don't want to fight
> tremendously many dependencies...
> 
> So we need to concentrate on things that can affect makedumpfile's
> behavior significantly, e.g. an infinite loop caused by broken linked
> list objects, a buffer overrun caused by large values from broken data,
> etc. We should be able to deal with them by carefully handling dump data
> against makedumpfile's runtime data structures, e.g., buffer size.
> 
> Is it OK to consider this is a policy of makedumpfile for data corruption?

Right. 
Of course, if there is a very simple and effective check for dump data,
then we can take it.
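
As a rough illustration of the kind of simple checks discussed in this
thread (rejecting an implausible order, and bounding a list walk so that a
corrupted list cannot loop forever), here is a minimal C sketch. The names
EXAMPLE_MAX_ORDER, order_is_sane, struct example_node and bounded_list_len
are hypothetical and are not taken from makedumpfile; the limit of 11 is
only an assumed placeholder.

```c
#include <stddef.h>

/* Assumed upper bound for illustration; the real limit depends on the kernel. */
#define EXAMPLE_MAX_ORDER 11

/* Return 1 if an order value read from a dump is plausible, 0 otherwise. */
static int order_is_sane(unsigned long order)
{
	return order < EXAMPLE_MAX_ORDER;
}

/* Hypothetical singly linked node standing in for a list read from a dump. */
struct example_node {
	struct example_node *next;
};

/*
 * Walk a list built from untrusted data, giving up after max_nodes steps.
 * Returns the number of nodes visited, or -1 if the limit was reached
 * before the end of the list (e.g. because the list is cyclic).
 */
static long bounded_list_len(const struct example_node *head, long max_nodes)
{
	long n = 0;

	while (head) {
		if (n >= max_nodes)
			return -1;
		n++;
		head = head->next;
	}
	return n;
}
```

A caller would pick max_nodes from its own runtime limits (for example a
buffer size) rather than from anything stored in the dump.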


Thanks
Atsushi Kumagai

> -- 
> Thanks.
> HATAYAMA, Daisuke
> 
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 

Re: [patch] mm: memcg: do not declare OOM from __GFP_NOFAIL allocations

2013-12-03 Thread David Rientjes

On Wed, 4 Dec 2013, Johannes Weiner wrote:

> However, the GFP_NOFS | __GFP_NOFAIL task stuck in the page allocator
> may hold filesystem locks that could prevent a third party from
> freeing memory and/or exiting, so we can not guarantee that only the
> __GFP_NOFAIL task is getting stuck, it might well trap other tasks.
> The same applies to open-coded GFP_NOFS allocation loops of course
> unless they cycle the filesystem locks while looping.
> 

Yup.  I think we should do this:

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2631,6 +2631,11 @@ rebalance:
pages_reclaimed)) {
/* Wait for some write requests to complete then retry */
wait_iff_congested(preferred_zone, BLK_RW_ASYNC, HZ/50);
+
+   /* Allocations that cannot fail must allocate from somewhere */
+   if (gfp_mask & __GFP_NOFAIL)
+   alloc_flags |= ALLOC_HARDER;
+
goto rebalance;
} else {
/*

so that it gets the same behavior as GFP_ATOMIC and is allowed to allocate 
from memory reserves (although not enough to totally deplete memory).  We 
need to leave some memory reserves around in case another process with 
__GFP_FS invokes the oom killer and the victim needs memory to exit since 
the GFP_NOFS | __GFP_NOFAIL failure wasn't only because reclaim was 
limited due to !__GFP_FS.

The only downside of this is that it might become harder in the future to 
ever make a case to remove __GFP_NOFAIL entirely since the behavior of the 
page allocator is changed with this and it's not equivalent to coding the 
retry directly in the caller.

On a tangent, GFP_NOWAIT | __GFP_NOFAIL and GFP_ATOMIC | __GFP_NOFAIL 
actually allow allocations to fail.  Nothing currently does that, but I 
wonder if we should do this for correctness:

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2535,17 +2535,19 @@ rebalance:
}
}

-   /* Atomic allocations - we can't balance anything */
-   if (!wait)
-   goto nopage;
-
-   /* Avoid recursion of direct reclaim */
-   if (current->flags & PF_MEMALLOC)
-   goto nopage;
-
-   /* Avoid allocations with no watermarks from looping endlessly */
-   if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
-   goto nopage;
+   if (likely(!(gfp_mask & __GFP_NOFAIL))) {
+   /* Atomic allocations - we can't balance anything */
+   if (!wait)
+   goto nopage;
+
+   /* Avoid recursion of direct reclaim */
+   if (current->flags & PF_MEMALLOC)
+   goto nopage;
+
+   /* Avoid allocations without watermarks from looping forever */
+   if (test_thread_flag(TIF_MEMDIE))
+   goto nopage;
+   }

/*
 * Try direct compaction. The first pass is asynchronous. Subsequent

It can be likely() because the __GFP_NOFAIL restart from the first patch 
above will likely now succeed since there's access to memory reserves and 
we never actually get here but once for __GFP_NOFAIL.  Thoughts on either 
patch?

Re: [PATCH 1/4] usb: chipidea: msm: Add device tree binding information

2013-12-03 Thread Peter Chen

On Mon, Nov 11, 2013 at 03:35:34PM +0200, Ivan T. Ivanov wrote:
> From: "Ivan T. Ivanov" 
> 

Please add something to the commit log.

> Signed-off-by: Ivan T. Ivanov 
> Cc: devicet...@vger.kernel.org
> ---
>  .../devicetree/bindings/usb/msm-hsusb.txt  |   16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/usb/msm-hsusb.txt 
> b/Documentation/devicetree/bindings/usb/msm-hsusb.txt
> index 5ea26c6..0a85eba 100644
> --- a/Documentation/devicetree/bindings/usb/msm-hsusb.txt
> +++ b/Documentation/devicetree/bindings/usb/msm-hsusb.txt
> @@ -15,3 +15,19 @@ Example EHCI controller device node:
>   usb-phy = <_otg>;
>   };
>  
> +CI13xxx (Chipidea) USB controllers
> +

We have already renamed ci13xxx to ci_hdrc.

> +Required properties:
> +- compatible:should contain "qcom,ci-hdrc"
> +- reg:   offset and length of the register set in the 
> memory map
> +- interrupts:interrupt-specifier for the controller interrupt.
> +- usb-phy:   phandle for the PHY device
> +- dr_mode:   Sould be "peripheral"
> +

Please keep alignment for "reg"

Peter

> + gadget@f9a55000 {
> + compatible = "qcom,ci-hdrc";
> + reg = <0xf9a55000 0x400>;
> + dr_mode = "peripheral";
> + interrupts = <0 134 0>;
> + usb-phy = <_otg>;
> + };
> \ No newline at end of file
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 

Best Regards,
Peter Chen


RE: [PATCH] usb: phy-tegra-usb.c: wrong pointer check for remap UTMI

2013-12-03 Thread Venu Byravarasu

Hi Stephen,

Initially Chris sent this patch to the linux-usb list alone, and I acked it there.
Please check http://marc.info/?l=linux-usb&m=138475663023376&w=1

Then he resent the patch to the linux-kernel list with my ack added.

Thanks,
Venu


> -----Original Message-----
> From: Stephen Warren [mailto:swar...@wwwdotorg.org]
> Sent: Wednesday, December 04, 2013 9:29 AM
> To: Chris Ruehl; ba...@ti.com
> Cc: gre...@linuxfoundation.org; thierry.red...@gmail.com; linux-
> u...@vger.kernel.org; linux-te...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Venu Byravarasu
> Subject: Re: [PATCH] usb: phy-tegra-usb.c: wrong pointer check for remap
> UTMI
> 
> On 12/03/2013 07:02 PM, Chris Ruehl wrote:
> > usb: phy-tegra-usb.c: wrong pointer check for remap UTMI
> >
> > A wrong pointer was used to test the result of devm_ioremap()
> >
> > Signed-off-by: Chris Ruehl 
> > Acked-by: Venu Byravarasu 
> 
> Out of curiosity, when did that ack happen? I didn't see it. But anyway,
> 
> Acked-by: Stephen Warren 


Re: [PATCH 4/4] usb: chipidea: msm: Use USB PHY API to control PHY state

2013-12-03 Thread Peter Chen

On Mon, Nov 11, 2013 at 04:36:09PM +0200, Ivan T. Ivanov wrote:
> 
> Hi Peter,
> 
> On Mon, 2013-11-11 at 21:59 +0800, Peter Chen wrote: 
> > On Mon, Nov 11, 2013 at 03:35:37PM +0200, Ivan T. Ivanov wrote:
> > > From: "Ivan T. Ivanov" 
> > > 
> > > PHY drivers keep track of the current state of the hardware,
> > > so don't change PHY settings under it.
> > > 
> > > Signed-off-by: Ivan T. Ivanov 
> > > ---
> > >  drivers/usb/chipidea/ci_hdrc_msm.c |9 ++---
> > >  1 file changed, 2 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/drivers/usb/chipidea/ci_hdrc_msm.c 
> > > b/drivers/usb/chipidea/ci_hdrc_msm.c
> > > index e9624f3..338b209 100644
> > > --- a/drivers/usb/chipidea/ci_hdrc_msm.c
> > > +++ b/drivers/usb/chipidea/ci_hdrc_msm.c
> > > @@ -20,13 +20,11 @@
> > >  static void ci_hdrc_msm_notify_event(struct ci_hdrc *ci, unsigned event)
> > >  {
> > >   struct device *dev = ci->gadget.dev.parent;
> > > - int val;
> > >  
> > >   switch (event) {
> > >   case CI_HDRC_CONTROLLER_RESET_EVENT:
> > >   dev_dbg(dev, "CI_HDRC_CONTROLLER_RESET_EVENT received\n");
> > > - writel(0, USB_AHBBURST);
> > > - writel(0, USB_AHBMODE);
> > > + usb_phy_init(ci->transceiver);
> > 
> > It will reset the PHY,  but your comment is "don't change PHY settings 
> > under it"
> 
> :-). This function is exported by PHY drivers, so they will know how
> to handle this change.
> 
> > 
> > >   break;
> > >   case CI_HDRC_CONTROLLER_STOPPED_EVENT:
> > >   dev_dbg(dev, "CI_HDRC_CONTROLLER_STOPPED_EVENT received\n");
> > > @@ -34,10 +32,7 @@ static void ci_hdrc_msm_notify_event(struct ci_hdrc 
> > > *ci, unsigned event)
> > >* Put the transceiver in non-driving mode. Otherwise host
> > >* may not detect soft-disconnection.
> > >*/
> > > - val = usb_phy_io_read(ci->transceiver, ULPI_FUNC_CTRL);
> > > - val &= ~ULPI_FUNC_CTRL_OPMODE_MASK;
> > > - val |= ULPI_FUNC_CTRL_OPMODE_NONDRIVING;
> > > - usb_phy_io_write(ci->transceiver, val, ULPI_FUNC_CTRL);
> > > + usb_phy_notify_disconnect(ci->transceiver, USB_SPEED_UNKNOWN);
> > 
> > Where you have implemented .notify_disconnect?
> > I have not found it at your phy driver.
> 
> Yep, I will post PHY driver changes shortly. Meanwhile this should
> not break existing board file based platforms, because not of them
> could be compiled (HTC Dream, Halibut Board) and DT based platforms 
> are sill work in progress.
> 

Hi Ivan, I am going to apply this msm chipidea patchset, but the change
in this file behaves differently from the original code. Have you
tested it on existing platforms?

-- 

Best Regards,
Peter Chen


RE: [f2fs-dev] [PATCH v2] f2fs: refactor bio-related operations

2013-12-03 Thread Chao Yu

Hi,

Comments below.

> -----Original Message-----
> From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> Sent: Monday, December 02, 2013 4:27 PM
> To: linux-fsde...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH v2] f2fs: refactor bio-related operations
> 
> Change log from v1:
>  o remove redundant codes
> 
> From a480dfc915490f4bca7275f6fbb44fa34aa00eaa Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim 
> Date: Sat, 30 Nov 2013 12:51:14 +0900
> Subject: [PATCH] f2fs: refactor bio-related operations
> Cc: linux-fsde...@vger.kernel.org, linux-kernel@vger.kernel.org, 
> linux-f2fs-de...@lists.sourceforge.net
> 
> This patch integrates redundant bio operations on read and write IOs.
> 
> 1. Move bio-related codes to the top of data.c.
> 2. Replace f2fs_submit_bio with f2fs_submit_merged_bio, which handles read
>bios additionally.
> 3. Introduce __submit_merged_bio to submit the merged bio.
> 4. Change f2fs_readpage to f2fs_submit_page_bio.
> 5. Introduce f2fs_submit_page_mbio to integrate previous submit_read_page and
>submit_write_page.

[snip]

> +static void __submit_merged_bio(struct f2fs_sb_info *sbi,
> + struct f2fs_bio_info *io,
> + enum page_type type, bool sync, int rw)
> +{
> + enum page_type btype = PAGE_TYPE_OF_BIO(type);
> +
> + if (!io->bio)
> + return;
> +
> + if (btype == META)
> + rw |= REQ_META;
> +
> + if (is_read_io(rw)) {
> + if (sync)
> + rw |= READ_SYNC;
> + submit_bio(rw, io->bio);
> + trace_f2fs_submit_read_bio(sbi->sb, rw, type, io->bio);
> + io->bio = NULL;
> + return;
> + }
> +
> + if (sync)
> + rw |= WRITE_SYNC;

rw = WRITE_SYNC; ?

> + if (type >= META_FLUSH)
> + rw |= WRITE_FLUSH_FUA;

rw = WRITE_FLUSH_FUA; ?

> +
> + /*
> +  * META_FLUSH is only from the checkpoint procedure, and we should wait
> +  * this metadata bio for FS consistency.
> +  */
> + if (type == META_FLUSH) {
> + DECLARE_COMPLETION_ONSTACK(wait);
> + io->bio->bi_private = &wait;
> + submit_bio(rw, io->bio);
> + wait_for_completion(&wait);
> + } else {
> + submit_bio(rw, io->bio);
> + }
> + trace_f2fs_submit_write_bio(sbi->sb, rw, btype, io->bio);
> + io->bio = NULL;
> +}

[snip]

Thanks,
Yu


RE: [PATCHv6 1/4] pwm: Add Freescale FTM PWM driver support

2013-12-03 Thread Li Xiubo

> > > > +static int fsl_pwm_parse_clk_ps(struct fsl_pwm_chip *fpc)
> > > > +{
> > > > +   int ret;
> > > > +   struct of_phandle_args clkspec;
> > > > +   struct device_node *np = fpc->chip.dev->of_node;
> > > > +
> > > > +   fpc->sys_clk = devm_clk_get(fpc->chip.dev, "ftm0");
> > > > +   if (IS_ERR(fpc->sys_clk)) {
> > > > +   ret = PTR_ERR(fpc->sys_clk);
> > > > +   dev_err(fpc->chip.dev,
> > > > +   "failed to get \"ftm0\" clock %d\n", ret);
> > > > +   return ret;
> > > > +   }
> > > > +
> > > > +   fpc->counter_clk = devm_clk_get(fpc->chip.dev, "ftm0_counter");
> > > > +   if (IS_ERR(fpc->counter_clk)) {
> > > > +   ret = PTR_ERR(fpc->counter_clk);
> > > > +   dev_err(fpc->chip.dev,
> > > > +   "failed to get \"ftm0_counter\" clock %d\n",
> > > > +   ret);
> > > > +   return ret;
> > > > +   }
> > > > +
> > > > +   ret = of_parse_phandle_with_args(np, "clocks", "#clock-cells", 1,
> > > > +   &clkspec);
> > > > +   if (ret)
> > > > +   return ret;
> > > > +
> > > > +   fpc->counter_clk_select = clkspec.args[0];
> > >
> > > This isn't at all pretty. But given that once you have access to a
> > > struct clk there's no way to identify it, I don't know of a better
> > > alternative.
> >
> > Hi Mike,
> >
> > I've seen this crop up a number of times now, to varying degrees of
> > gravity. In this particular case, the driver needs to know the type of a
> > clock because it needs to program this hardware differently depending on
> > which clock feeds the counter. Since there is no way to obtain any kind
> > of identifying information from a struct clk, drivers need to rely on
> > hacks like this and manually reach into the device tree to obtain that
> > information.
> 
> Which property of the clock is the consumer concerned with in this case?
> 

The "ftm0_counter" clock.


> From a quick look at the driver it looks like there are actually a
> number of different input lines to the device that share the clock-name
> "ftm0_counter", though they are actually separate and each has a
> different divider. Have I got that right?
> 

Yes mostly, there are three different input lines wired to the counter
clock, but they share only one divider. And between the three lines and
the divider there is one mux inside of the FTM IP block.
The three clock sources are:
"ftm0",
"ftm0-fix",
"ftm0-ext".
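             _________________________________________________________
            |                                                         |
            |   +---------+       FTM Module                          |
ftm0 -------|-->+         +                                           |
            |   +         +     +-----------+     +------------+      |
ftm0-fix ---|-->+   MUX   +---->+  divider  +---->+  counter   +      |
            |   +         +     +-----------+     +------------+      |
ftm0-ext ---|-->+         +                                           |
            |   +---------+                                           |
            |_________________________________________________________|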



> If that's the case, having a unique clock-names value for each of those
> lines would be the solution I'd expect. Then you just have to list the
> one(s) that are wired up, and the driver can figure out the appropriate
> line to use either by requesting by name until it finds a match or
> inspecting the clock-names property.
> 

Hi Kumar,
In the list archives for mails of "[RFC][PATCHv5 4/4] Documentation: Add
device tree bindings for Freescale FTM PWM.",we have discussed about this.
Is there any different with your suggestions or ideas? 


> Is there some other property of the parent that we care about here?
> 

As far as I know, not yet.

--
Best Regards,
 


Re: [PATCH 3/4] Regulators: TPS65218: Add Regulator driver for TPS65218 PMIC

2013-12-03 Thread Keerthy


Thanks for the review Mark.

On Tuesday 03 December 2013 08:46 PM, Mark Brown wrote:
> On Tue, Dec 03, 2013 at 12:13:24PM +0530, Keerthy wrote:
>
> > +static int tps65218_ldo1_dcdc3_vsel_to_uv(unsigned int vsel)
> > +{
> > +   int uV = 0;
> > +
> > +   if (vsel <= 26)
> > +   uV = vsel * 25000 + 90;
> > +   else
> > +   uV = (vsel - 26) * 5 + 155;
> > +
> > +   return uV;
> > +}
>
> Use regulator_map_voltage_linear_range() (and similarly for most of the
> other functions).

Ok.
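
For readers following the review comment, the idea behind a linear-range
helper is to describe each selector range as data instead of open-coding the
arithmetic. The sketch below is a self-contained user-space illustration of
that idea only; struct sel_range, example_ranges and sel_to_uV are
hypothetical names, this is not the kernel's regulator API, and the voltages
are made-up values rather than TPS65218 datasheet figures.

```c
#include <stddef.h>

/* Hypothetical stand-in for one linear selector range (not a kernel struct). */
struct sel_range {
	unsigned int min_sel;	/* first selector covered by this range */
	unsigned int max_sel;	/* last selector covered by this range */
	int min_uV;		/* voltage at min_sel, in microvolts */
	int step_uV;		/* voltage increment per selector step */
};

/* Made-up example values, not taken from any datasheet. */
static const struct sel_range example_ranges[] = {
	{ 0, 9, 1000000, 10000 },
	{ 10, 15, 1100000, 50000 },
};

/* Return the voltage for a selector, or -1 if no range covers it. */
static int sel_to_uV(const struct sel_range *r, size_t n, unsigned int sel)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (sel >= r[i].min_sel && sel <= r[i].max_sel)
			return r[i].min_uV +
			       (int)(sel - r[i].min_sel) * r[i].step_uV;
	}
	return -1;
}
```

With a table like this, adding or correcting a range means editing one data
entry, and every regulator can share the same lookup code.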


> > +static const struct of_device_id tps65218_of_match[] = {
> > +   TPS65218_OF_MATCH("ti,tps65218-dcdc1", tps65218_pmic_regs[0]),
> > +   TPS65218_OF_MATCH("ti,tps65218-dcdc2", tps65218_pmic_regs[1]),
> > +   TPS65218_OF_MATCH("ti,tps65218-dcdc3", tps65218_pmic_regs[2]),
> > +   TPS65218_OF_MATCH("ti,tps65218-dcdc4", tps65218_pmic_regs[3]),
> > +   TPS65218_OF_MATCH("ti,tps65218-dcdc5", tps65218_pmic_regs[4]),
> > +   TPS65218_OF_MATCH("ti,tps65218-dcdc6", tps65218_pmic_regs[5]),
> > +   TPS65218_OF_MATCH("ti,tps65218-ldo1", tps65218_pmic_regs[6]),
> > +};
> > +MODULE_DEVICE_TABLE(of, tps65218_of_match);
>
> Indexing into another array by magic number like this is both error
> prone and hard to read.  Either use defined constants or individual
> variables for the things being referenced.

Okay.
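
One possible shape for the "defined constants" suggestion is an enum used
both to size the table and to index it through designated initializers. The
sketch below is a stand-alone illustration under that assumption; every name
in it (example_regulator_id, example_regs, example_compatible and the
"example," strings) is hypothetical and not part of the TPS65218 driver.

```c
#include <stddef.h>

/* Hypothetical regulator identifiers, for illustration only. */
enum example_regulator_id {
	EXAMPLE_DCDC1,
	EXAMPLE_DCDC2,
	EXAMPLE_LDO1,
	EXAMPLE_NUM_REGULATORS,
};

struct example_reg_info {
	const char *compatible;
};

/*
 * Designated initializers tie each entry to its identifier, so reordering
 * the enum cannot silently shift which entry a constant refers to.
 */
static const struct example_reg_info example_regs[EXAMPLE_NUM_REGULATORS] = {
	[EXAMPLE_DCDC1] = { "example,dcdc1" },
	[EXAMPLE_DCDC2] = { "example,dcdc2" },
	[EXAMPLE_LDO1]  = { "example,ldo1" },
};

/* Look up an entry by name; returns NULL for an out-of-range identifier. */
static const char *example_compatible(enum example_regulator_id id)
{
	if ((unsigned int)id >= EXAMPLE_NUM_REGULATORS)
		return NULL;
	return example_regs[id].compatible;
}
```

A reference such as example_regs[EXAMPLE_LDO1] then documents itself, where
a bare index like [6] does not.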

Regards,
Keerthy

Re: [RFC v2 0/4] Add basic support for ASV

2013-12-03 Thread Sachin Kamat

Hi Abhilash,

On 3 December 2013 20:16, Abhilash Kesavan  wrote:
> Hi Yadwinder and Sachin,

> CC'ing Doug and Andrew who have also worked on ASV.
>
> I tested these patches on a 5250 Chromebook after modifying the
> cpufreq code and a few other changes for booting the board. The driver
> is retrieving the ASV fused group correctly. The behavior on an
> unfused SMDK5250 is also fine.
> I have a few minor comments on the patches.
>

Thank you for testing and reviewing the patchset.
Will incorporate your comments in the next version.

-- 
With warm regards,
Sachin

[GIT PULL] ARM: SoC fixes for 3.13-rc

2013-12-03 Thread Olof Johansson

Hi Linus,

The following changes since commit a31ab44ef5d07c6707df4a9ad2c8affd2d62ff4b:

  ARM: bcm2835: add missing #xxx-cells to I2C nodes (2013-11-25 21:56:00 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git tags/fixes-for-linus

for you to fetch changes up to a5c6e87a7b224bdbf57875a9da8f340f5a6abc5a:

  arm: dts: socfpga: Change some clocks of gate-clk type to perip-clk (2013-12-03 14:19:53 -0800)


ARM: SoC fixes for 3.13-rc

Another batch of fixes for ARM SoCs for 3.13. The diffstat is large,
mostly because of:

- Another set of fixes to fix regressions caused by moving OMAP from board
  files to DT. Tony thinks this was the last major set of fixes, with
  maybe just a few small patches to follow.
- More fixes for Marvell platforms, most dealing with misdescribed PCIe
  hardware, i.e. incorrect number of busses on some SoCs, etc. The line
  delta adds up due to various ranges moving around when this is fixed.

But there's also:

- Some smaller tweaks to defconfigs to make more boards bootable in my
  test setup for better coverage.
- There are also a few other smaller fixes, a short series for at91, a couple
  of reverts for ux500, etc.


Andi Shyti (1):
  u8500_defconfig: allow creation and mounting of devtmpfs

Arnaud Ebalard (2):
  ARM: mvebu: second PCIe unit of Armada XP mv78230 is only x1 capable
  ARM: mvebu: fix second and third PCIe unit of Armada XP mv78260

Balaji T K (2):
  ARM: dts: omap4-panda-common: Fix pin muxing for wl12xx
  ARM: dts: omap4-sdp: Fix pin muxing for wl12xx

Brent Taylor (1):
  ARM: at91: fixed unresolved symbol "at91_pm_set_standby" when built 
without CONFIG_PM

Daniel Lezcano (1):
  ARM: ux500: u8500_defconfig: add missing cpuidle option

Dinh Nguyen (2):
  arm: socfpga: Enable ARM_TWD for socfpga
  arm: dts: socfpga: Change some clocks of gate-clk type to perip-clk

Enric Balletbo i Serra (7):
  ARM: dts: omap3-igep: Fix bus-width for mmc1
  ARM: dts: omap3-igep: Add support for LBEE1USJYC WiFi connected to SDIO
  ARM: dts: omap3-igep: Update to use the TI AM/DM37x processor
  ARM: dts: AM33XX BASE0033: add pinmux and hdmi node to enable display
  ARM: dts: AM33XX BASE0033: add pinmux and user led support
  ARM: dts: AM33XX BASE0033: add 32KBit EEPROM support
  ARM: dts: AM33XX IGEP0033: add USB support

Florian Vaussard (1):
  ARM: dts: Fix the name of supplies for smsc911x shared by OMAP

Gregory CLEMENT (1):
  ARM: mvebu: use the virtual CPU registers to access coherency registers

Jarkko Nikula (1):
  ARM: dts: omap3-beagle: Add omap-twl4030 audio support

Javier Martinez Canillas (4):
  ARM: OMAP2+: dss-common: change IGEP's DVI DDC i2c bus
  ARM: dts: omap3-igep0020: Add pinmux setup for i2c devices
  ARM: dts: omap3-igep0020: Add pinmuxing for DVI output
  ARM: dts: omap3-igep0020: name twl4030 VPLL2 regulator as vdds_dsi

Joel Fernandes (1):
  ARM: OMAP2+: Disable POSTED mode for errata i103 and i767

Linus Walleij (2):
  Revert "ARM: ux500: Remove AUXDATA relating to SDI (MMC) clock-name 
bindings"
  Revert "ARM: ux500: Stop passing MMC's platform data for Device Tree 
boots"

Ludovic Desroches (1):
  ARM: at91: sama5d3: reduce TWI internal clock frequency

Nicolas Ferre (1):
  ARM: at91: add usart3 alias to dtsi

Olof Johansson (9):
  Merge tag 'ux500-fixes-v3.13-1' of 
git://git.kernel.org/.../linusw/linux-stericsson into fixes
  Merge tag 'ux500-defconfig-v3.13-rcs' of 
git://git.kernel.org/.../linusw/linux-stericsson into fixes
  Merge tag 'omap-for-v3.13/fixes-against-rc1-take2' of 
git://git.kernel.org/.../tmlind/linux-omap into fixes
  Merge tag 'mvebu-dt-fixes-3.13' of git://git.infradead.org/linux-mvebu 
into fixes
  Merge tag 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixes
  ARM: multi_v7_defconfig: enable network for BeagleBone Black
  Merge tag 'omap-for-v3.13/more-dt-regressions' of 
git://git.kernel.org/.../tmlind/linux-omap into fixes
  ARM: sunxi_defconfig: enable NFS, TMPFS, PRINTK_TIME and nfsroot support
  ARM: multi_v7_defconfig: enable SDHCI_BCM_KONA and MMC_BLOCK_MINORS=16

Rajendra Nayak (1):
  ARM: OMAP2+: Powerdomain: Fix unchecked dereference of arch_pwrdm

Roger Quadros (1):
  ARM: dts: omap3-beagle: Fix USB host on beagle boards (for 3.13)

Thomas Petazzoni (1):
  ARM: mvebu: re-enable PCIe on Armada 370 DB

Tony Lindgren (5):
  ARM: OMAP2+: Fix more missing data for omap3.dtsi file
  ARM: OMAP2+: Add fixed regulator to omap2plus_defconfig
  ARM: OMAP2+: Fix eMMC on n900 with device tree
  mmc: omap: Fix DMA configuration to not rely on device id
  mmc: omap: Fix I2C dependency and make driver usable with device tree

[PATCH] locking: Giving mutex warning more precisely in case of !lock->owner

2013-12-03 Thread Chuansheng Liu


With mutex debugging enabled, an unbalanced mutex_unlock() call still
produces a warning like the one below:
[  364.208284] DEBUG_LOCKS_WARN_ON(lock->owner != current)

But in that case the cause is the unbalanced mutex_unlock() call, and
lock->owner is NULL.

We can improve this case to give the warning below instead:
 DEBUG_LOCKS_WARN_ON(lock->owner == NULL)

Signed-off-by: Liu, Chuansheng 
---
 kernel/locking/mutex-debug.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
index 7e3443f..b2a1b96 100644
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -75,7 +75,12 @@ void debug_mutex_unlock(struct mutex *lock)
return;
 
DEBUG_LOCKS_WARN_ON(lock->magic != lock);
-   DEBUG_LOCKS_WARN_ON(lock->owner != current);
+
+   if (!lock->owner)
+   DEBUG_LOCKS_WARN_ON(lock->owner == NULL);
+   else
+   DEBUG_LOCKS_WARN_ON(lock->owner != current);
+
DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
mutex_clear_owner(lock);
 }
-- 
1.7.9.5




Re: [PATCH 2/2] ARM: tegra: convert device tree file of Dalmore to use pinctrl defines

2013-12-03 Thread Laxman Dewangan


On Wednesday 04 December 2013 01:39 AM, Stephen Warren wrote:
> On 12/02/2013 06:55 AM, Laxman Dewangan wrote:
> > Signed-off-by: Laxman Dewangan 
>
> Patch description?
>
> BTW, did you compile all the Tegra DT files before and after this
> change, and make sure that there's zero difference between them (i.e.
> they're identical byte-for-byte when compiled)? I don't feel like
> manually double-checking this entire patch...

I made changes for Dalmore only. I have not touched the Tegra30 and 
Tegra20 platform dts files.

For Dalmore, I compared the binary before and after this change, as well 
as the dts generated back from the dtb using dtc.

They are the same.

Re: [PATCH 1/2] ARM: dts: tegra: Header file for pinctrl constants

2013-12-03 Thread Laxman Dewangan


On Wednesday 04 December 2013 01:38 AM, Stephen Warren wrote:
> On 12/02/2013 11:04 PM, Laxman Dewangan wrote:
> > On Monday 02 December 2013 07:55 PM, Thierry Reding wrote:
> > > * PGP Signed by an unknown key
> > >
> > > On Mon, Dec 02, 2013 at 07:25:01PM +0530, Laxman Dewangan wrote:
> > > > +
> > > > +/* Schmitt enable/disable */
> > > > +#define TEGRA_PIN_DRIVE_SCHMITT_DISABLE 0
> > > > +#define TEGRA_PIN_DRIVE_SCHMITT_ENABLE 1
> > >
> > > These are all boolean, so I wonder if perhaps we should be simply
> > > defining a single pair and reuse that in different contexts:
> > >
> > >  #define TEGRA_PIN_DISABLE 0
> > >  #define TEGRA_PIN_ENABLE 1
> > >
> > > The property names should provide enough context for them to be used
> > > unambiguously.
> >
> > I can make a generic ENABLE/DISABLE macro as you suggested, but the
> > datasheet says 0 = NORMAL, 1 = TRISTATE, which is why I kept the names
> > close to the datasheet.
>
> That documentation is relative to a specific field, whereas the
> namespace for #defines is global. Hence, we may have to name #defines
> using stricter rules than the TRM's field values, in order to make them
> unambiguous.

I have sent V2 patches that take care of this.
Please review them.

Thanks,
Laxman

[PATCH v4] arm64: support single-step and breakpoint handler hooks

2013-12-03 Thread Sandeepa Prabhu

AArch64 Single Stepping and Breakpoint debug exceptions will be
used by multiple debug frameworks like kprobes & kgdb.

This patch implements the hooks for those frameworks to register
their own handlers for handling breakpoint and single step events.

Reworked the debug exception handler in entry.S: do_dbg to route
software breakpoint (BRK64) exception to do_debug_exception()

Signed-off-by: Sandeepa Prabhu 
Signed-off-by: Deepak Saxena 
Acked-by: Will Deacon 
---
 arch/arm64/include/asm/debug-monitors.h | 21 
 arch/arm64/kernel/debug-monitors.c  | 88 -
 arch/arm64/kernel/entry.S   |  2 +
 3 files changed, 110 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index a2232d0..6231479 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -62,6 +62,27 @@ struct task_struct;
 
 #define DBG_ARCH_ID_RESERVED   0   /* In case of ptrace ABI updates. */
 
+#define DBG_HOOK_HANDLED   0
+#define DBG_HOOK_ERROR 1
+
+struct step_hook {
+   struct list_head node;
+   int (*fn)(struct pt_regs *regs, unsigned int esr);
+};
+
+void register_step_hook(struct step_hook *hook);
+void unregister_step_hook(struct step_hook *hook);
+
+struct break_hook {
+   struct list_head node;
+   u32 esr_val;
+   u32 esr_mask;
+   int (*fn)(struct pt_regs *regs, unsigned int esr);
+};
+
+void register_break_hook(struct break_hook *hook);
+void unregister_break_hook(struct break_hook *hook);
+
 u8 debug_monitors_arch(void);
 
 void enable_debug_monitors(enum debug_el el);
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index 4ae6857..636ba8b 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -187,6 +187,48 @@ static void clear_regs_spsr_ss(struct pt_regs *regs)
regs->pstate = spsr;
 }
 
+/* EL1 Single Step Handler hooks */
+static LIST_HEAD(step_hook);
+DEFINE_RWLOCK(step_hook_lock);
+
+void register_step_hook(struct step_hook *hook)
+{
+   write_lock(&step_hook_lock);
+   list_add(&hook->node, &step_hook);
+   write_unlock(&step_hook_lock);
+}
+
+void unregister_step_hook(struct step_hook *hook)
+{
+   write_lock(&step_hook_lock);
+   list_del(&hook->node);
+   write_unlock(&step_hook_lock);
+}
+
+/*
+ * Call registered single step handlers
+ * There is no Syndrome info to check for determining the handler.
+ * So we call all the registered handlers, until the right handler is
+ * found which returns zero.
+ */
+static int call_step_hook(struct pt_regs *regs, unsigned int esr)
+{
+   struct step_hook *hook;
+   int retval = DBG_HOOK_ERROR;
+
+   read_lock(&step_hook_lock);
+
+   list_for_each_entry(hook, &step_hook, node) {
+       retval = hook->fn(regs, esr);
+       if (retval == DBG_HOOK_HANDLED)
+           break;
+   }
+
+   read_unlock(&step_hook_lock);
+
+   return retval;
+}
+
 static int single_step_handler(unsigned long addr, unsigned int esr,
   struct pt_regs *regs)
 {
@@ -214,7 +256,9 @@ static int single_step_handler(unsigned long addr, unsigned int esr,
 */
user_rewind_single_step(current);
} else {
-   /* TODO: route to KGDB */
+   if (call_step_hook(regs, esr) == DBG_HOOK_HANDLED)
+   return 0;
+
pr_warning("Unexpected kernel single-step exception at EL1\n");
/*
 * Re-enable stepping since we know that we will be
@@ -226,11 +270,53 @@ static int single_step_handler(unsigned long addr, unsigned int esr,
return 0;
 }
 
+/*
+ * Breakpoint handler is re-entrant as another breakpoint can
+ * hit within breakpoint handler, especially in kprobes.
+ * Use reader/writer locks instead of plain spinlock.
+ */
+static LIST_HEAD(break_hook);
+DEFINE_RWLOCK(break_hook_lock);
+
+void register_break_hook(struct break_hook *hook)
+{
+   write_lock(&break_hook_lock);
+   list_add(&hook->node, &break_hook);
+   write_unlock(&break_hook_lock);
+}
+
+void unregister_break_hook(struct break_hook *hook)
+{
+   write_lock(&break_hook_lock);
+   list_del(&hook->node);
+   write_unlock(&break_hook_lock);
+}
+
+static int call_break_hook(struct pt_regs *regs, unsigned int esr)
+{
+   struct break_hook *hook;
+   int (*fn)(struct pt_regs *regs, unsigned int esr) = NULL;
+
+   read_lock(&break_hook_lock);
+   list_for_each_entry(hook, &break_hook, node)
+       if ((esr & hook->esr_mask) == hook->esr_val)
+           fn = hook->fn;
+   read_unlock(&break_hook_lock);
+
+   return fn ? fn(regs, esr) : DBG_HOOK_ERROR;
+}
+
 static int brk_handler(unsigned long addr, unsigned int esr,
   struct pt_regs *regs)
 {
siginfo_t info;
 
+   if (call_break_hook(regs, esr) == DBG_HOOK_HANDLED)
+   return 0;
+
+

[PATCH v4] ARM64 breakpoint and single step exception hooks

2013-12-03 Thread Sandeepa Prabhu

This patch adds support for breakpoint and single-step exception hooks.

The v3 version of this patch was published and reviewed with the arm64 kgdb
and kprobes patch series [1] and [2].

[1] http://lwn.net/Articles/570648/
[2] https://lwn.net/Articles/571063/

Changes v3 -> v4:
 - Incorporated review comments:
   http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/207372.html
   - Removed unnecessary comments
   - Added comments for breakpoint re-entrancy & rw locks
 - Rebased on top of
   git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64.git
   Branch: upstream
   commit ID: dc1ccc48159d63eca5089e507c82c7d22ef60839 (Linux 3.13-rc2)
 - CCing Jason Wessel, since the arm64 kgdb patchset is dependent on this.

Sandeepa Prabhu (1):
  arm64: support single-step and breakpoint handler hooks

 arch/arm64/include/asm/debug-monitors.h | 21 
 arch/arm64/kernel/debug-monitors.c  | 88 -
 arch/arm64/kernel/entry.S   |  2 +
 3 files changed, 110 insertions(+), 1 deletion(-)

-- 
1.8.1.2
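
For readers following the hook interface in this series, the mask/value
matching done by call_break_hook() can be sketched in userspace roughly as
below. This is only an illustration, not the kernel code: the list is a plain
singly linked list, locking and pt_regs are left out, and the demo_* handlers
and the ESR values are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DBG_HOOK_HANDLED 0
#define DBG_HOOK_ERROR   1

/* Userspace stand-in for struct break_hook; "next" replaces list_head. */
struct break_hook {
	struct break_hook *next;
	uint32_t esr_val;
	uint32_t esr_mask;
	int (*fn)(uint32_t esr);
};

static struct break_hook *break_hooks;	/* head of the hook list */

static void register_break_hook(struct break_hook *hook)
{
	assert(hook != NULL && hook->fn != NULL);
	hook->next = break_hooks;	/* insert at the head of the list */
	break_hooks = hook;
}

/*
 * Pick the handler whose (mask, value) pair matches esr. If several hooks
 * match, the last match in list order wins, as in the patch. The handler
 * is called after the walk, not from inside it.
 */
static int call_break_hook(uint32_t esr)
{
	int (*fn)(uint32_t esr) = NULL;
	struct break_hook *hook;

	for (hook = break_hooks; hook != NULL; hook = hook->next)
		if ((esr & hook->esr_mask) == hook->esr_val)
			fn = hook->fn;

	return fn ? fn(esr) : DBG_HOOK_ERROR;
}

/* Hypothetical handlers and ESR encodings, for illustration only. */
static int demo_kgdb_brk(uint32_t esr)   { (void)esr; return DBG_HOOK_HANDLED; }
static int demo_reject_brk(uint32_t esr) { (void)esr; return DBG_HOOK_ERROR; }

static struct break_hook demo_kgdb_hook = {
	.esr_val = 0xf2000400u, .esr_mask = 0xffffffffu, .fn = demo_kgdb_brk,
};
static struct break_hook demo_reject_hook = {
	.esr_val = 0xf2000800u, .esr_mask = 0xffffffffu, .fn = demo_reject_brk,
};
```

A caller would register both demo hooks and then pass an ESR value to
call_break_hook(); an ESR that matches no hook falls through to
DBG_HOOK_ERROR, which is the case where brk_handler() continues with its
default handling.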


Re: [PATCH] ARM: tegra: convert device tree files to use key defines

2013-12-03 Thread Laxman Dewangan


On Wednesday 04 December 2013 01:34 AM, Stephen Warren wrote:
> On 12/02/2013 06:09 AM, Laxman Dewangan wrote:
> > Use key code macros for all key codes referenced for keys.
> >
> > For tegra20-seaboard.dts and tegra20-harmony.dts:
> >    The key comment for key (16th row and 1st column) is KEY_KPSLASH but
> >    code is 0x004e which is the key code for KEY_KPPLUS. As another key
> >    already exists with KEY_KPPLUS, I am assuming the key code is wrong and
> >    the comment is fine. With this assumption, I am keeping the key code as
> >    KEY_KPSLASH.
>
> This looks reasonable, and I'll apply it soon. What is the patch based
> on? Note that I recently sent a patch to fix the sort order of DT nodes
> in all Tegra DT files, which I'll apply early since it's a cleanup, and
> IIRC some of the KBC nodes may have been moved by that patch, so this
> may conflict. I'll see if I can rebase while applying it.

I worked on linux-next of 20131202.

If there is any issue, I can generate a new patch based on the branch
that you suggest.



RE: [RFC part1 PATCH 5/7] ARM64 / ACPI: Introduce arm_core.c and its related head file

2013-12-03 Thread Zheng, Lv

> From: linux-acpi-ow...@vger.kernel.org 
> [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Hanjun Guo
> Sent: Wednesday, December 04, 2013 12:37 AM
> 
> introduce arm_core.c and its related head file, after this patch,
> we can get ACPI tables from BIOS on ARM64 now.
> 
> Signed-off-by: Al Stone 
> Signed-off-by: Graeme Gregory 
> Signed-off-by: Hanjun Guo 
> ---
>  arch/arm64/include/asm/acpi.h |   57 +++
>  arch/arm64/kernel/setup.c |8 ++
>  drivers/acpi/Makefile |2 +
>  drivers/acpi/plat/Makefile|1 +
>  drivers/acpi/plat/arm-core.c  |  219 
> +
>  5 files changed, 287 insertions(+)
>  create mode 100644 drivers/acpi/plat/Makefile
>  create mode 100644 drivers/acpi/plat/arm-core.c
> 
> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
> index c186f5b..e9444e4 100644
> --- a/arch/arm64/include/asm/acpi.h
> +++ b/arch/arm64/include/asm/acpi.h
> @@ -19,6 +19,43 @@
>  #ifndef _ASM_ARM_ACPI_H
>  #define _ASM_ARM_ACPI_H
> 
> +#include 
> +
> +#include 
> +
> +#define COMPILER_DEPENDENT_INT64 long long
> +#define COMPILER_DEPENDENT_UINT64 unsigned long long
> +
> +/*
> + * Calling conventions:
> + *
> + * ACPI_SYSTEM_XFACE        - Interfaces to host OS (handlers, threads)
> + * ACPI_EXTERNAL_XFACE      - External ACPI interfaces
> + * ACPI_INTERNAL_XFACE      - Internal ACPI interfaces
> + * ACPI_INTERNAL_VAR_XFACE  - Internal variable-parameter list interfaces
> + */
> +#define ACPI_SYSTEM_XFACE
> +#define ACPI_EXTERNAL_XFACE
> +#define ACPI_INTERNAL_XFACE
> +#define ACPI_INTERNAL_VAR_XFACE
> +
> +/* Asm macros */
> +#define ACPI_FLUSH_CPU_CACHE() flush_cache_all()

Well, you may need to check the following environments defined in 
 is sufficient for ARM targets:
#define ACPI_USE_SYSTEM_CLIBRARY
#define ACPI_USE_DO_WHILE_0
#define ACPI_MUTEX_TYPE ACPI_BINARY_SEMAPHORE

#define ACPI_MACHINE_WIDTH  BITS_PER_LONG
Will this zap IO addresses on ARM32 platforms?

And following default settings in  and  is 
sufficient for ARM targets:
#if defined (__IA64__) || defined (__ia64__)
#define ACPI_MISALIGNMENT_NOT_SUPPORTED
#endif
Will this cause any exceptions on ARM by executing ACPICA name functions?

#if ACPI_MACHINE_WIDTH == 64
#define ACPI_USE_NATIVE_DIVIDE  /* Use compiler native 64-bit divide */
#endif
I think you may see build breakage on ARM32 as you haven't implemented the
following ACPICA macros for ARM:
#define ACPI_DIV_64_BY_32(n_hi, n_lo, d32, q32, r32)
#define ACPI_SHIFT_RIGHT_64(n_hi, n_lo)
Have you tested this yet?
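
To make the concern about the two macros concrete, here is a rough sketch in
portable C of what they are expected to compute, written as functions rather
than macros. It assumes the contract of the existing x86 versions: a 64-bit
dividend passed as two 32-bit halves, a 32-bit quotient and remainder, and a
right shift by exactly one bit. The demo_* names are hypothetical, and the
real ACPICA headers should be checked before relying on any of this.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Divide the 64-bit value (n_hi:n_lo) by d32, storing a 32-bit quotient and
 * remainder. Like an x86 "divl" based version, this assumes the quotient
 * fits in 32 bits, i.e. n_hi < d32.
 */
static void demo_div_64_by_32(uint32_t n_hi, uint32_t n_lo, uint32_t d32,
			      uint32_t *q32, uint32_t *r32)
{
	uint64_t n = ((uint64_t)n_hi << 32) | n_lo;

	assert(d32 != 0 && n_hi < d32);
	*q32 = (uint32_t)(n / d32);
	*r32 = (uint32_t)(n % d32);
}

/* Shift the 64-bit value (n_hi:n_lo) right by one bit, in place. */
static void demo_shift_right_64(uint32_t *n_hi, uint32_t *n_lo)
{
	/* bit 0 of the high half becomes bit 31 of the low half */
	*n_lo = (*n_lo >> 1) | (*n_hi << 31);
	*n_hi >>= 1;
}
```

In kernel code on a 32-bit target the plain 64-bit "/" and "%" would probably
have to be replaced by do_div(), since the kernel does not link the compiler's
64-bit division helpers; the sketch ignores that.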

I'm not sure if all global lock code blocks are not referenced by 
ACPI_REDUCED_HARDWARE and I'm not sure what mechanism is implemented on ARM 
ACPI platforms to offer the synchronization mechanism between firmware and 
OSPM.  So you may need to implement the following synchronization protocol in 
:
#define ACPI_ACQUIRE_GLOBAL_LOCK(facs, Acq)
#define ACPI_RELEASE_GLOBAL_LOCK(facs, Acq)

I only reviewed the ACPICA stuffs in , I didn't take a look at your 
Linux ACPI stuff in .  You may need more instructions on the 
porting issues from Linux ACPI guys.

Thanks and best regards
-Lv

> +
> +/* Basic configuration for ACPI */
> +#ifdef   CONFIG_ACPI
> +extern int acpi_disabled;
> +extern int acpi_noirq;
> +extern int acpi_pci_disabled;
> +extern int acpi_strict;
> +
> +static inline void disable_acpi(void)
> +{
> + acpi_disabled = 1;
> + acpi_pci_disabled = 1;
> + acpi_noirq = 1;
> +}
> +
>  static inline bool arch_has_acpi_pdc(void)
>  {
>   return false;   /* always false for now */
> @@ -29,4 +66,24 @@ static inline void arch_acpi_set_pdc_bits(u32 *buf)
>   return;
>  }
> 
> +static inline void acpi_noirq_set(void) { acpi_noirq = 1; }
> +static inline void acpi_disable_pci(void)
> +{
> + acpi_pci_disabled = 1;
> + acpi_noirq_set();
> +}
> +
> +/* FIXME: this function should be moved to topology.h when it's ready */
> +void arch_fix_phys_package_id(int num, u32 slot);
> +
> +/* temporarily define -1 to make the ACPI core compilable */
> +#define cpu_physical_id(cpu) -1
> +
> +#else /* !CONFIG_ACPI */
> +#define acpi_disabled 1      /* ACPI sometimes enabled on ARM */
> +#define acpi_noirq 1         /* ACPI sometimes enabled on ARM */
> +#define acpi_pci_disabled 1  /* ACPI PCI sometimes enabled on ARM */
> +#define acpi_strict 1        /* no ACPI spec workarounds on ARM */
> +#endif
> +
>  #endif /*_ASM_ARM_ACPI_H*/
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index bd9bbd0..8199360 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -41,6 +41,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include 
>  #include 
> @@ -225,6 +226,13 @@ void __init setup_arch(char **cmdline_p)
> 
>   arm64_memblock_init();
> 
> + /*
> +  * Parse the ACPI tables for possible boot-time configuration
>

Re: [PATCH 7/10] ACPI / hotplug: Move container-specific code out of the core

2013-12-03 Thread Yasuaki Ishimatsu


(2013/12/03 22:15), Rafael J. Wysocki wrote:

On Tuesday, December 03, 2013 11:46:24 AM Yasuaki Ishimatsu wrote:

(2013/11/29 22:08), Rafael J. Wysocki wrote:

On Friday, November 29, 2013 11:36:55 AM Yasuaki Ishimatsu wrote:

Hi Rafael,


Hi,


Replying to this mail may be wrong.


OK, so this particular patch doesn't break things any more?


Yes.




Do you remember the following patch of yours?
http://lkml.org/lkml/2013/2/23/97

I want to add an autoeject variable to the acpi_hotplug_profile structure and
set autoeject to "false" for container devices.


Then after the series the $subject patch belongs to it will work almost the
same way as /sys/firmware/acpi/container/enabled (hot add will still work after
patch [4/10] if "enabled" is 0), but only for containers.


Currently, I have a problem ejecting a container device. Since linux-3.12,
the container device is removed by acpi_scan_hot_remove().

I think this has two problems.

 1. It fails easily
    My container device has a CPU device and a memory device, and the maximum
    memory size is 3 TB. In my environment, hot-removing the container device
    fails when offlining memory if the memory is in use by an application.
    I think that when offlining memory, we must retry the offline several
    times.


Yes, that's correct.  But then you can try to offline the memory upfront
and only remove the container after that has been successful.


 2. It cannot cooperate with userland applications
    When hot-removing CPUs and memory on a container device, we need to take
    care of userland applications. Before linux-3.12, the container device only
    notified udev of KOBJ_OFFLINE. So by using udev, applications bound to the
    CPU or node being removed could rebind before the container device was
    hot-removed.
    Currently, KOBJ_OFFLINE is still notified to udev, but
    acpi_scan_hot_remove() runs at the same time to hot-remove the container
    device. So by the time applications react to the removal, the devices may
    already have been deleted.





So the expectation is that the container will refuse to offline, but instead
it will emit KOBJ_OFFLINE so that user space can do some cleanup and offline
it through the "eject" attribute, right?


Yes, that's right.




I don't know what devices are on the hotpluggable container devices of other
vendors. At least, my container device cannot be hot-removed correctly.
So I want to add an autoeject variable to acpi_hotplug_profile so that the
user can set the parameter to "true" or "false".


I have a different idea.

Why don't we create a bus type for containers in analogy with CPUs and memory
and make it support offline.  Then, the container scan handler will create a
"physical" container device under that bus type and the new bus type code will
implement the logic you need (that is, it will have a sysfs flag that will
cause the offline to fail emitting a uevent of some sort if set and will allow
the offline to happen when unset).  That "physical" container device will go
away (again, via the container scan handler) during container removal.




The eject work flow can be:
(1) an eject event occurs,
(2) the container "physical" device fails offline in acpi_scan_hot_remove()
emitting, say, KOBJ_CHANGE for the "physical" device,
(3) user space notices the KOBJ_CHANGE and does the cleanup as needed,
(4) user space changes the "physical" container device flag controlling
offline to 0,
(5) user space uses the sysfs "eject" attribute of the ACPI container object
to finally eject the container,
(6) the offline in acpi_scan_hot_remove() is now successful, because the
flag controlling it has been set to 0 in step (4),
(7) the "physical" container device goes away before executing _EJ0,
(8) the container is ejected.

Of course, if the flag controlling container offline is 0 to start with, step
(6) will now occur directly after (1), so whoever wants containers to be
hot-removed automatically may just clear that flag for all of them on boot.
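
The workflow above can be modelled as a very small state machine. The sketch
below is a userspace illustration only, assuming a single per-container flag
that blocks offline; the demo_* names are hypothetical and nothing here is
existing kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_uevent { DEMO_UEVENT_NONE, DEMO_UEVENT_CHANGE };

struct demo_container {
	bool block_offline;		/* hypothetical sysfs flag; step (4) clears it */
	bool present;			/* false once the container has been ejected */
	enum demo_uevent last_uevent;	/* last event sent to user space */
};

/* Model of one eject attempt: returns true if the container was ejected. */
static bool demo_eject(struct demo_container *c)
{
	assert(c != NULL);

	if (!c->present)
		return false;

	if (c->block_offline) {
		/* step (2): offline fails and user space is told to clean up */
		c->last_uevent = DEMO_UEVENT_CHANGE;
		return false;
	}

	/* steps (6)-(8): offline succeeds, the device goes away, _EJ0 runs */
	c->present = false;
	return true;
}
```

With the flag set, the first demo_eject() call only records the uevent; after
user space clears the flag, the second call completes the eject. A container
whose flag starts cleared is ejected on the first attempt.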

How does that sound?



The above ideas are almost O.K. I want the kernel to notify user space with
KOBJ_OFFLINE. Even if user space catches "KOBJ_CHANGE", the user doesn't know
whether the notification means offline or not.





It is easy to figure out, though.  Since the KOBJ_CHANGE will be emitted for
container devices only in that situation, user space can see that (1) it is
from a container and (2) it is KOBJ_CHANGE, so it must mean "container offline
has been attempted".

My concern with using KOBJ_OFFLINE for that is that device_offline() emits it
too on success and it may be easily confused with the one emitted on failure
for containers.


I have no objection.

Thanks,
Yasuaki Ishimatsu



Thanks,
Rafael





Re: [patch 7/8] mm, memcg: allow processes handling oom notifications to access reserves

2013-12-03 Thread Johannes Weiner

On Tue, Dec 03, 2013 at 09:20:17PM -0800, David Rientjes wrote:
> Now that a per-process flag is available, define it for processes that
> handle userspace oom notifications.  This is an optimization to avoid
> maintaining a list of such processes attached to a memcg at any given time
> and iterating it at charge time.
> 
> This flag gets set whenever a process has registered for an oom
> notification and is cleared whenever it unregisters.
> 
> When memcg reclaim has failed to free any memory, it is necessary for
> userspace oom handlers to be able to dip into reserves to pagefault text,
> allocate kernel memory to read the "tasks" file, allocate heap, etc.

The task handling the OOM of a memcg can obviously not be part of that
same memcg.

I've said this many times in the past, but here is the most recent
thread from Tejun, me, and Li on this topic:

---

On Tue, 3 Dec 2013 at 15:35:48 +0800, Li Zefan wrote:
> On Mon, 2 Dec 2013 at 11:44:06 -0500, Johannes Weiner wrote:
> > On Fri, Nov 29, 2013 at 03:05:25PM -0500, Tejun Heo wrote:
> > > Whoa, so we support oom handler inside the memcg that it handles?
> > > Does that work reliably?  Changing the above detail in this patch
> > > isn't difficult (and we'll later need to update kernfs too) but
> > > supporting such setup properly would be a *lot* of commitment and I'm
> > > very doubtful we'd be able to achieve that by just carefully avoiding
> > > > memory allocation in the operations that userland oom handler uses -
> > > that set is destined to expand over time, extremely fragile and will
> > > be hellish to maintain.
> > > 
> > > > So, I'm not at all excited about committing to this guarantee.  This
> > > one is an easy one but it looks like the first step onto dizzying
> > > slippery slope.
> > > 
> > > Am I misunderstanding something here?  Are you and Johannes firm on
> > > supporting this?
> >
> > Handling a memcg OOM from userspace running inside that OOM memcg is
> > completely crazy.  I mean, think about this for just two seconds...
> > Really?
> >
> > I get that people are doing it right now, and if you can get away with
> > it for now, good for you.  But you have to be aware how crazy this is
> > and if it breaks you get to keep the pieces and we are not going to
> > > accommodate this in the kernel.  Fix your crazy userspace.
> 
> +1

---


Re: [patch] mm: memcg: do not declare OOM from __GFP_NOFAIL allocations

2013-12-03 Thread Johannes Weiner

On Wed, Dec 04, 2013 at 03:34:17PM +1100, Dave Chinner wrote:
> On Tue, Dec 03, 2013 at 10:01:01PM -0500, Johannes Weiner wrote:
> > On Tue, Dec 03, 2013 at 03:40:13PM -0800, David Rientjes wrote:
> > > On Tue, 3 Dec 2013, Johannes Weiner wrote:
> > > I believe the page allocator would be susceptible to the same deadlock if 
> > > nothing else on the system can reclaim memory and that belief comes from 
> > > code inspection that shows __GFP_NOFAIL is not guaranteed to ever succeed 
> > > in the page allocator as their charges now are (with your patch) in 
> > > memcg.  
> > > I do not have an example of such an incident.
> > 
> > Me neither.
> 
> Is this the sort of thing that you expect to see when GFP_NOFS |
> GFP_NOFAIL type allocations continually fail?
> 
> http://oss.sgi.com/archives/xfs/2013-12/msg00095.html
> 
> XFS doesn't use GFP_NOFAIL, it does its own loop with GFP_NOWARN in
> kmem_alloc() so that if we get stuck for more than 100 attempts to
> allocate it throws a warning. i.e. only when we really are stuck and
> reclaim is not making any progress.
> 
> This specific case is due to memory fragmentation preventing a 64k
> memory allocation (due to the filesystem being configured with a 64k
> directory block size), but GFP_NOFS | GFP_NOFAIL allocations happen
> *all the time* in filesystems.

Yes, the question is whether this in itself is a practical problem,
regardless of whether you use __GFP_NOFAIL or a manual loop.
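
For reference, the kind of manual loop being discussed can be sketched in
userspace as below. It only illustrates the pattern (retry forever, warn
periodically); it is not the XFS kmem_alloc() code, and the allocator
callback, the demo_* names and the warn-every-100 policy are assumptions made
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define DEMO_WARN_EVERY 100	/* warn once per this many failed attempts */

/* Hypothetical allocator hook so that failures can be simulated. */
typedef void *(*demo_alloc_fn)(size_t size, void *ctx);

/*
 * Retry until the allocation succeeds, warning every DEMO_WARN_EVERY
 * failures; *warnings counts the warnings emitted. Like any "must not
 * fail" loop, this never returns if try_alloc never succeeds.
 */
static void *demo_alloc_nofail(demo_alloc_fn try_alloc, void *ctx,
			       size_t size, unsigned int *warnings)
{
	unsigned int retries = 0;
	void *ptr;

	assert(try_alloc != NULL && warnings != NULL);
	*warnings = 0;

	while ((ptr = try_alloc(size, ctx)) == NULL) {
		if (++retries % DEMO_WARN_EVERY == 0) {
			fprintf(stderr,
				"possible memory allocation deadlock (%u retries)\n",
				retries);
			(*warnings)++;
		}
		/* a real implementation would back off here before retrying */
	}
	return ptr;
}

/* Test allocator: fails *ctx times, then falls through to malloc(). */
static void *demo_flaky_alloc(size_t size, void *ctx)
{
	unsigned int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return NULL;
	}
	return malloc(size);
}
```

The loop makes the failure visible but does nothing to resolve it: if nothing
else frees memory, the caller spins just as a __GFP_NOFAIL allocation would.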

> > > > > > So, my question again: why not bypass the per-zone min watermarks in
> > > > > > the page allocator?
> > > > 
> > > > I don't even know what your argument is supposed to be.  The fact that
> > > > we don't do it in the page allocator means that there can't be a bug
> > > > in memcg?
> > > > 
> > > 
> > > I'm asking if we should allow GFP_NOFS | __GFP_NOFAIL allocations in the 
> > > page allocator to bypass per-zone min watermarks after reclaim has failed 
> > > since the oom killer cannot be called in such a context so that the page 
> > > allocator is not susceptible to the same deadlock without a complete 
> > > depletion of memory reserves?
> > 
> > Yes, I think so.
> 
> There be dragons. If memcgs deadlock in low memory conditions in
> the presence of GFP_NOFS | GFP_NOFAIL allocations, then we need to
> make the memcg reclaim design more robust, not work around it by
> allowing filesystems to drain critical memory reserves needed for
> other situations

The problems in the page allocator and memcg are entirely unrelated.
What we do in the memcg does not affect the page allocator and vice
versa.  However, they are problems of the same type, so we are trying
to find out whether both instances can have the same solution:

If GFP_NOFS | __GFP_NOFAIL allocations can not make forward progress
in direct reclaim, they are screwed: can't reclaim memory, can't OOM
kill, can't return NULL.  They are essentially stuck until a third
party intervenes.  This applies to both the page allocator and memcg.

In memcg, I fixed it by allowing the __GFP_NOFAIL task to bypass the
user-defined memory limit after reclaim fails.

David asks whether we should do the equivalent in the page allocator
and allow __GFP_NOFAIL allocations to dip into the emergency memory
reserves for the same reason.

I suggested that the situations are not entirely the same.  A memcg
might only have one or two tasks and so third party intervention to
reduce memory usage in the memcg can be unlikely to impossible,
whereas in the case of the page allocator, the likelihood of any task
in the system releasing or reclaiming memory is higher.

However, the GFP_NOFS | __GFP_NOFAIL task stuck in the page allocator
may hold filesystem locks that could prevent a third party from
freeing memory and/or exiting, so we can not guarantee that only the
__GFP_NOFAIL task is getting stuck, it might well trap other tasks.
The same applies to open-coded GFP_NOFS allocation loops of course
unless they cycle the filesystem locks while looping.

[patch 5/8] res_counter: remove interface for locked charging and uncharging

2013-12-03 Thread David Rientjes

The res_counter_{charge,uncharge}_locked() variants are not used in the
kernel outside of the resource counter code itself, so remove the
interface.

Signed-off-by: David Rientjes 
---
 Documentation/cgroups/resource_counter.txt | 14 ++
 include/linux/res_counter.h|  6 +-
 kernel/res_counter.c   | 23 ---
 3 files changed, 15 insertions(+), 28 deletions(-)

diff --git a/Documentation/cgroups/resource_counter.txt b/Documentation/cgroups/resource_counter.txt
--- a/Documentation/cgroups/resource_counter.txt
+++ b/Documentation/cgroups/resource_counter.txt
@@ -76,24 +76,14 @@ to work with it.
limit_fail_at parameter is set to the particular res_counter element
where the charging failed.
 
- d. int res_counter_charge_locked
-   (struct res_counter *rc, unsigned long val, bool force)
-
-   The same as res_counter_charge(), but it must not acquire/release the
-   res_counter->lock internally (it must be called with res_counter->lock
-   held). The force parameter indicates whether we can bypass the limit.
-
- e. u64 res_counter_uncharge[_locked]
-   (struct res_counter *rc, unsigned long val)
+ d. u64 res_counter_uncharge(struct res_counter *rc, unsigned long val)
 
When a resource is released (freed) it should be de-accounted
from the resource counter it was accounted to.  This is called
"uncharging". The return value of this function indicate the amount
of charges still present in the counter.
 
-   The _locked routines imply that the res_counter->lock is taken.
-
- f. u64 res_counter_uncharge_until
+ e. u64 res_counter_uncharge_until
(struct res_counter *rc, struct res_counter *top,
 unsinged long val)
 
diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
--- a/include/linux/res_counter.h
+++ b/include/linux/res_counter.h
@@ -104,15 +104,13 @@ void res_counter_init(struct res_counter *counter, struct res_counter *parent);
  *   units, e.g. numbers, bytes, Kbytes, etc
  *
  * returns 0 on success and <0 if the counter->usage will exceed the
- * counter->limit _locked call expects the counter->lock to be taken
+ * counter->limit
  *
  * charge_nofail works the same, except that it charges the resource
  * counter unconditionally, and returns < 0 if the after the current
  * charge we are over limit.
  */
 
-int __must_check res_counter_charge_locked(struct res_counter *counter,
-  unsigned long val, bool force);
 int __must_check res_counter_charge(struct res_counter *counter,
unsigned long val, struct res_counter **limit_fail_at);
 int res_counter_charge_nofail(struct res_counter *counter,
@@ -125,12 +123,10 @@ int res_counter_charge_nofail(struct res_counter *counter,
  * @val: the amount of the resource
  *
  * these calls check for usage underflow and show a warning on the console
- * _locked call expects the counter->lock to be taken
  *
  * returns the total charges still present in @counter.
  */
 
-u64 res_counter_uncharge_locked(struct res_counter *counter, unsigned long val);
 u64 res_counter_uncharge(struct res_counter *counter, unsigned long val);
 
 u64 res_counter_uncharge_until(struct res_counter *counter,
diff --git a/kernel/res_counter.c b/kernel/res_counter.c
--- a/kernel/res_counter.c
+++ b/kernel/res_counter.c
@@ -22,8 +22,18 @@ void res_counter_init(struct res_counter *counter, struct res_counter *parent)
counter->parent = parent;
 }
 
-int res_counter_charge_locked(struct res_counter *counter, unsigned long val,
- bool force)
+static u64 res_counter_uncharge_locked(struct res_counter *counter,
+  unsigned long val)
+{
+   if (WARN_ON(counter->usage < val))
+   val = counter->usage;
+
+   counter->usage -= val;
+   return counter->usage;
+}
+
+static int res_counter_charge_locked(struct res_counter *counter,
+unsigned long val, bool force)
 {
int ret = 0;
 
@@ -86,15 +96,6 @@ int res_counter_charge_nofail(struct res_counter *counter, unsigned long val,
return __res_counter_charge(counter, val, limit_fail_at, true);
 }
 
-u64 res_counter_uncharge_locked(struct res_counter *counter, unsigned long val)
-{
-   if (WARN_ON(counter->usage < val))
-   val = counter->usage;
-
-   counter->usage -= val;
-   return counter->usage;
-}
-
 u64 res_counter_uncharge_until(struct res_counter *counter,
   struct res_counter *top,
   unsigned long val)
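
As a reading aid for the two helpers that become static in this patch, their
accounting rules can be modelled in userspace as follows. This is a sketch
under assumptions: the charge path follows the documentation text quoted above
(a forced charge is applied even when it exceeds the limit, and still reports
the overrun), locking and parent counters are ignored, and the demo_* names
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_counter {
	uint64_t usage;	/* currently charged amount */
	uint64_t limit;	/* usage may not exceed this unless forced */
};

/* Returns 0 on success, <0 if the charge exceeds (or would exceed) the limit. */
static int demo_charge(struct demo_counter *c, uint64_t val, bool force)
{
	int ret = 0;

	assert(c != NULL);
	if (c->usage + val > c->limit) {
		ret = -1;		/* stand-in for -ENOMEM */
		if (!force)
			return ret;	/* refuse: usage is left unchanged */
	}
	c->usage += val;
	return ret;
}

/* Returns the charges still present; never lets usage underflow. */
static uint64_t demo_uncharge(struct demo_counter *c, uint64_t val)
{
	assert(c != NULL);
	if (c->usage < val)	/* the kernel version warns here via WARN_ON() */
		val = c->usage;
	c->usage -= val;
	return c->usage;
}
```

An uncharge larger than the current usage is clamped to zero instead of
wrapping around, which is the behaviour the WARN_ON() in the patch guards.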

[patch 8/8] mm, memcg: add memcg oom reserve documentation

2013-12-03 Thread David Rientjes

Add documentation on memcg oom reserves to
Documentation/cgroups/memory.txt and give an example of its usage and
recommended best practices.

Signed-off-by: David Rientjes 
---
 Documentation/cgroups/memory.txt | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -71,6 +71,7 @@ Brief summary of control files.
 (See sysctl's vm.swappiness)
  memory.move_charge_at_immigrate # set/show controls of moving charges
  memory.oom_control # set/show oom controls.
+ memory.oom_reserve_in_bytes    # set/show limit of oom memory reserves
  memory.numa_stat   # show the number of memory usage per numa node
 
  memory.kmem.limit_in_bytes  # set/show hard limit for kernel memory
@@ -772,6 +773,31 @@ At reading, current status of OOM is shown.
 under_oom        0 or 1 (if 1, the memory cgroup is under OOM, tasks may
 be stopped.)
 
+Processes that handle oom conditions in their own memcgs or their child
+memcgs may need to allocate memory themselves to do anything useful,
+including pagefaulting their text or allocating kernel memory to read the
+memcg "tasks" file.  For this reason, memory.oom_reserve_in_bytes is
+provided; it specifies how much memory processes waiting on
+memory.oom_control can allocate above the memcg limit.
+
+The memcg that the oom handler is attached to is charged for the memory
+that it allocates against its own memory.oom_reserve_in_bytes.  This
+memory is therefore only available to processes that are waiting for
+a notification.
+
+For example, if you do
+
+   # echo 2m > memory.oom_reserve_in_bytes
+
+then any process attached to this memcg that is waiting on memcg oom
+notifications anywhere on the system can allocate an additional 2MB
+above memory.limit_in_bytes.
+
+You may still consider doing mlockall(MCL_FUTURE) for processes that
+are waiting on oom notifications to keep this value as minimal as
+possible, or allow it to be large enough so that their text can still
+be pagefaulted in under oom conditions when the value is known.
+
 11. Memory Pressure
 
 The pressure level notifications can be used to monitor the memory

[patch 7/8] mm, memcg: allow processes handling oom notifications to access reserves

2013-12-03 Thread David Rientjes

Now that a per-process flag is available, define it for processes that
handle userspace oom notifications.  This is an optimization to avoid
maintaining a list of such processes attached to a memcg at any given time
and iterating it at charge time.

This flag gets set whenever a process has registered for an oom
notification and is cleared whenever it unregisters.

When memcg reclaim has failed to free any memory, it is necessary for
userspace oom handlers to be able to dip into reserves to pagefault text,
allocate kernel memory to read the "tasks" file, allocate heap, etc.

System oom conditions are not addressed at this time, but the same per-
process flag can be used in the page allocator to determine if access
should be given to userspace oom handlers to per-zone memory reserves at
a later time once there is consensus.

Signed-off-by: David Rientjes 
---
 include/linux/sched.h |  1 +
 mm/memcontrol.c   | 47 ++-
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1695,6 +1695,7 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut,
 #define PF_SPREAD_SLAB 0x0200  /* Spread some slab caches over cpuset */
 #define PF_NO_SETAFFINITY 0x0400   /* Userland is not allowed to meddle with cpus_allowed */
 #define PF_MCE_EARLY 0x0800  /* Early kill for mce process policy */
+#define PF_OOM_HANDLER 0x1000  /* Userspace process handling oom conditions */
 #define PF_MUTEX_TESTER 0x2000  /* Thread belongs to the rt mutex tester */
 #define PF_FREEZER_SKIP 0x4000  /* Freezer should not count it as freezable */
 #define PF_SUSPEND_TASK 0x8000  /* this thread called freeze_processes and should not be frozen */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2590,6 +2590,33 @@ enum {
CHARGE_WOULDBLOCK,  /* GFP_WAIT wasn't set and no enough res. */
 };
 
+/*
+ * Processes handling oom conditions are allowed to utilize memory reserves so
+ * that they may handle the condition.
+ */
+static int mem_cgroup_oom_handler_charge(struct mem_cgroup *memcg,
+unsigned long csize,
+struct mem_cgroup **mem_over_limit)
+{
+   struct res_counter *fail_res;
+   int ret;
+
+   ret = res_counter_charge_nofail_max(&memcg->res, csize, &fail_res,
+   memcg->oom_reserve);
+   if (!ret && do_swap_account) {
+   ret = res_counter_charge_nofail_max(&memcg->memsw, csize,
+   &fail_res,
+   memcg->oom_reserve);
+   if (ret) {
+   res_counter_uncharge(&memcg->res, csize);
+   *mem_over_limit = mem_cgroup_from_res_counter(fail_res,
+ memsw);
+
+   }
+   }
+   return !ret ? CHARGE_OK : CHARGE_NOMEM;
+}
+
 static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
unsigned int nr_pages, unsigned int min_pages,
bool invoke_oom)
@@ -2649,6 +2676,13 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
if (mem_cgroup_wait_acct_move(mem_over_limit))
return CHARGE_RETRY;
 
+   if (current->flags & PF_OOM_HANDLER) {
+   ret = mem_cgroup_oom_handler_charge(memcg, csize,
+   &mem_over_limit);
+   if (ret == CHARGE_OK)
+   return CHARGE_OK;
+   }
+
if (invoke_oom)
mem_cgroup_oom(mem_over_limit, gfp_mask, get_order(csize));
 
@@ -2696,7 +2730,8 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 || fatal_signal_pending(current)))
goto bypass;
 
-   if (unlikely(task_in_memcg_oom(current)))
+   if (unlikely(task_in_memcg_oom(current)) &&
+   !(current->flags & PF_OOM_HANDLER))
goto bypass;
 
/*
@@ -5825,6 +5860,11 @@ static int mem_cgroup_oom_register_event(struct cgroup_subsys_state *css,
if (!event)
return -ENOMEM;
 
+   /*
+* Setting PF_OOM_HANDLER before taking memcg_oom_lock ensures it is
+* set before getting added to memcg->oom_notify.
+*/
+   current->flags |= PF_OOM_HANDLER;
spin_lock(&memcg_oom_lock);
 
event->eventfd = eventfd;
@@ -5856,6 +5896,11 @@ static void mem_cgroup_oom_unregister_event(struct cgroup_subsys_state *css,
}
}
 
+   /*
+* Clearing PF_OOM_HANDLER before dropping memcg_oom_lock ensures it is
+* cleared before receiving another

[patch 6/8] res_counter: add interface for maximum nofail charge

2013-12-03 Thread David Rientjes

For memcg oom reserves, we'll need a resource counter interface that will
not fail when exceeding the memcg limit like res_counter_charge_nofail,
but only to a ceiling.

This patch adds res_counter_charge_nofail_max() that will exceed the
resource counter but only to a maximum defined value.  If it fails to
charge the resource, it returns -ENOMEM.
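
To make the ceiling semantics concrete, here is a minimal userspace sketch
of a single-counter charge that follows the same branch order as the patch
below. It is a sketch under assumptions, not the kernel implementation:
counter_model and counter_charge() are hypothetical names, locking and the
parent hierarchy are left out, and arithmetic overflow is ignored. As in the
patch, callers that do not force the charge are assumed to pass ULONG_MAX.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical userspace stand-in for one level of a resource counter. */
struct counter_model {
	unsigned long usage;
	unsigned long limit;
	unsigned long failcnt;
};

/*
 * Charge val to the counter.
 *   force == 0:                  fail with -ENOMEM at the limit.
 *   force != 0, max == ULONG_MAX: always charge, but report -ENOMEM
 *                                 when the limit is exceeded.
 *   force != 0, finite max:      charge and return 0 while the overage
 *                                 stays within max; otherwise fail
 *                                 without charging.
 */
static int counter_charge(struct counter_model *c, unsigned long val,
			  int force, unsigned long max)
{
	int ret = 0;

	if (c->usage + val > c->limit) {
		c->failcnt++;
		if (max == ULONG_MAX)
			ret = -ENOMEM;
		if (!force)
			return ret;
		if (c->usage + val - c->limit > max)
			return -ENOMEM;
	}

	c->usage += val;
	return ret;
}
```

With limit 100, usage 95 and max 8, a forced charge of 10 succeeds (overage
5), while a second forced charge of 10 is refused because the overage would
reach 15.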

Signed-off-by: David Rientjes 
---
 include/linux/res_counter.h | 10 +-
 kernel/res_counter.c| 27 +--
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
--- a/include/linux/res_counter.h
+++ b/include/linux/res_counter.h
@@ -107,14 +107,22 @@ void res_counter_init(struct res_counter *counter, struct res_counter *parent);
  * counter->limit
  *
  * charge_nofail works the same, except that it charges the resource
- * counter unconditionally, and returns < 0 if the after the current
+ * counter unconditionally, and returns < 0 if after the current
  * charge we are over limit.
+ *
+ * charge_nofail_max is the same as charge_nofail, except that the
+ * resource counter usage can only exceed the limit by the max
+ * difference.  Unlike charge_nofail, charge_nofail_max returns < 0
+ * only if the current charge fails because of the max difference.
  */
 
 int __must_check res_counter_charge(struct res_counter *counter,
unsigned long val, struct res_counter **limit_fail_at);
 int res_counter_charge_nofail(struct res_counter *counter,
unsigned long val, struct res_counter **limit_fail_at);
+int res_counter_charge_nofail_max(struct res_counter *counter,
+   unsigned long val, struct res_counter **limit_fail_at,
+   unsigned long max);
 
 /*
  * uncharge - tell that some portion of the resource is released
diff --git a/kernel/res_counter.c b/kernel/res_counter.c
--- a/kernel/res_counter.c
+++ b/kernel/res_counter.c
@@ -33,15 +33,19 @@ static u64 res_counter_uncharge_locked(struct res_counter *counter,
 }
 
 static int res_counter_charge_locked(struct res_counter *counter,
-unsigned long val, bool force)
+unsigned long val, bool force,
+unsigned long max)
 {
int ret = 0;
 
if (counter->usage + val > counter->limit) {
counter->failcnt++;
-   ret = -ENOMEM;
+   if (max == ULONG_MAX)
+   ret = -ENOMEM;
if (!force)
return ret;
+   if (counter->usage + val - counter->limit > max)
+   return -ENOMEM;
}
 
counter->usage += val;
@@ -51,7 +55,8 @@ static int res_counter_charge_locked(struct res_counter *counter,
 }
 
 static int __res_counter_charge(struct res_counter *counter, unsigned long val,
-   struct res_counter **limit_fail_at, bool force)
+   struct res_counter **limit_fail_at, bool force,
+   unsigned long max)
 {
int ret, r;
unsigned long flags;
@@ -62,7 +67,7 @@ static int __res_counter_charge(struct res_counter *counter, unsigned long val,
local_irq_save(flags);
for (c = counter; c != NULL; c = c->parent) {
spin_lock(&c->lock);
-   r = res_counter_charge_locked(c, val, force);
+   r = res_counter_charge_locked(c, val, force, max);
spin_unlock(&c->lock);
if (r < 0 && !ret) {
ret = r;
@@ -87,13 +92,23 @@ static int __res_counter_charge(struct res_counter *counter, unsigned long val,
 int res_counter_charge(struct res_counter *counter, unsigned long val,
struct res_counter **limit_fail_at)
 {
-   return __res_counter_charge(counter, val, limit_fail_at, false);
+   return __res_counter_charge(counter, val, limit_fail_at, false,
+   ULONG_MAX);
 }
 
 int res_counter_charge_nofail(struct res_counter *counter, unsigned long val,
  struct res_counter **limit_fail_at)
 {
-   return __res_counter_charge(counter, val, limit_fail_at, true);
+   return __res_counter_charge(counter, val, limit_fail_at, true,
+   ULONG_MAX);
+}
+
+int res_counter_charge_nofail_max(struct res_counter *counter,
+ unsigned long val,
+ struct res_counter **limit_fail_at,
+ unsigned long max)
+{
+   return __res_counter_charge(counter, val, limit_fail_at, true, max);
 }
 
 u64 res_counter_uncharge_until(struct res_counter *counter,

[patch 3/8] mm, mempolicy: remove per-process flag

2013-12-03 Thread David Rientjes

PF_MEMPOLICY is an unnecessary optimization for CONFIG_SLAB users.
There's no significant performance degradation to checking
current->mempolicy rather than current->flags & PF_MEMPOLICY in the
allocation path, especially since this is considered unlikely().

Per-process flags are a scarce resource so we should free them up
whenever possible and make them available.  We'll be using it shortly for
memcg oom reserves.

Signed-off-by: David Rientjes 
---
 include/linux/mempolicy.h |  5 -
 include/linux/sched.h |  1 -
 kernel/fork.c |  1 -
 mm/mempolicy.c| 31 ---
 mm/slab.c |  4 ++--
 5 files changed, 2 insertions(+), 40 deletions(-)

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -143,7 +143,6 @@ extern void numa_policy_init(void);
 extern void mpol_rebind_task(struct task_struct *tsk, const nodemask_t *new,
enum mpol_rebind_step step);
 extern void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new);
-extern void mpol_fix_fork_child_flag(struct task_struct *p);
 
 extern struct zonelist *huge_zonelist(struct vm_area_struct *vma,
unsigned long addr, gfp_t gfp_flags,
@@ -266,10 +265,6 @@ static inline void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
 {
 }
 
-static inline void mpol_fix_fork_child_flag(struct task_struct *p)
-{
-}
-
 static inline struct zonelist *huge_zonelist(struct vm_area_struct *vma,
unsigned long addr, gfp_t gfp_flags,
struct mempolicy **mpol, nodemask_t **nodemask)
diff --git a/include/linux/sched.h b/include/linux/sched.h
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1695,7 +1695,6 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut,
 #define PF_SPREAD_SLAB 0x0200  /* Spread some slab caches over cpuset */
 #define PF_NO_SETAFFINITY 0x0400   /* Userland is not allowed to meddle with cpus_allowed */
 #define PF_MCE_EARLY 0x0800  /* Early kill for mce process policy */
-#define PF_MEMPOLICY   0x1000  /* Non-default NUMA mempolicy */
 #define PF_MUTEX_TESTER 0x2000  /* Thread belongs to the rt mutex tester */
 #define PF_FREEZER_SKIP 0x4000  /* Freezer should not count it as freezable */
 #define PF_SUSPEND_TASK 0x8000  /* this thread called freeze_processes and should not be frozen */
diff --git a/kernel/fork.c b/kernel/fork.c
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1261,7 +1261,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
p->mempolicy = NULL;
goto bad_fork_cleanup_cgroup;
}
-   mpol_fix_fork_child_flag(p);
 #endif
 #ifdef CONFIG_CPUSETS
p->cpuset_mem_spread_rotor = NUMA_NO_NODE;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -796,36 +796,6 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
return err;
 }
 
-/*
- * Update task->flags PF_MEMPOLICY bit: set iff non-default
- * mempolicy.  Allows more rapid checking of this (combined perhaps
- * with other PF_* flag bits) on memory allocation hot code paths.
- *
- * If called from outside this file, the task 'p' should -only- be
- * a newly forked child not yet visible on the task list, because
- * manipulating the task flags of a visible task is not safe.
- *
- * The above limitation is why this routine has the funny name
- * mpol_fix_fork_child_flag().
- *
- * It is also safe to call this with a task pointer of current,
- * which the static wrapper mpol_set_task_struct_flag() does,
- * for use within this file.
- */
-
-void mpol_fix_fork_child_flag(struct task_struct *p)
-{
-   if (p->mempolicy)
-   p->flags |= PF_MEMPOLICY;
-   else
-   p->flags &= ~PF_MEMPOLICY;
-}
-
-static void mpol_set_task_struct_flag(void)
-{
-   mpol_fix_fork_child_flag(current);
-}
-
 /* Set the process memory policy */
 static long do_set_mempolicy(unsigned short mode, unsigned short flags,
 nodemask_t *nodes)
@@ -862,7 +832,6 @@ static long do_set_mempolicy(unsigned short mode, unsigned short flags,
}
old = current->mempolicy;
current->mempolicy = new;
-   mpol_set_task_struct_flag();
if (new && new->mode == MPOL_INTERLEAVE &&
nodes_weight(new->v.nodes))
current->il_next = first_node(new->v.nodes);
diff --git a/mm/slab.c b/mm/slab.c
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3027,7 +3027,7 @@ out:
 
 #ifdef CONFIG_NUMA
 /*
- * Try allocating on another node if PF_SPREAD_SLAB|PF_MEMPOLICY.
+ * Try allocating on another node if PF_SPREAD_SLAB or a mempolicy is set.
  *
  * If we are in_interrupt, then process context, including cpusets and
  * mempolicy, may not apply and

[patch 2/8] mm, mempolicy: rename slab_node for clarity

2013-12-03 Thread David Rientjes

slab_node() is actually a mempolicy function, so rename it to
mempolicy_slab_node() to make it clearer that it is used for processes with
mempolicies.

At the same time, cleanup its code by saving numa_mem_id() in a local
variable (since we require a node with memory, not just any node) and
remove an obsolete comment that assumes the mempolicy is actually passed
into the function.

Signed-off-by: David Rientjes 
---
 include/linux/mempolicy.h |  2 +-
 mm/mempolicy.c| 15 ++-
 mm/slab.c |  4 ++--
 mm/slub.c |  2 +-
 4 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -151,7 +151,7 @@ extern struct zonelist *huge_zonelist(struct vm_area_struct *vma,
 extern bool init_nodemask_of_mempolicy(nodemask_t *mask);
 extern bool mempolicy_nodemask_intersects(struct task_struct *tsk,
const nodemask_t *mask);
-extern unsigned slab_node(void);
+extern unsigned int mempolicy_slab_node(void);
 
 extern enum zone_type policy_zone;
 
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1783,21 +1783,18 @@ static unsigned interleave_nodes(struct mempolicy *policy)
 /*
  * Depending on the memory policy provide a node from which to allocate the
  * next slab entry.
- * @policy must be protected by freeing by the caller.  If @policy is
- * the current task's mempolicy, this protection is implicit, as only the
- * task can change it's policy.  The system default policy requires no
- * such protection.
  */
-unsigned slab_node(void)
+unsigned int mempolicy_slab_node(void)
 {
struct mempolicy *policy;
+   int node = numa_mem_id();
 
if (in_interrupt())
-   return numa_node_id();
+   return node;
 
policy = current->mempolicy;
if (!policy || policy->flags & MPOL_F_LOCAL)
-   return numa_node_id();
+   return node;
 
switch (policy->mode) {
case MPOL_PREFERRED:
@@ -1817,11 +1814,11 @@ unsigned slab_node(void)
struct zonelist *zonelist;
struct zone *zone;
enum zone_type highest_zoneidx = gfp_zone(GFP_KERNEL);
-   zonelist = &NODE_DATA(numa_node_id())->node_zonelists[0];
+   zonelist = &NODE_DATA(node)->node_zonelists[0];
(void)first_zones_zonelist(zonelist, highest_zoneidx,
&policy->v.nodes,
&zone);
-   return zone ? zone->node : numa_node_id();
+   return zone ? zone->node : node;
}
 
default:
diff --git a/mm/slab.c b/mm/slab.c
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3042,7 +3042,7 @@ static void *alternate_node_alloc(struct kmem_cache *cachep, gfp_t flags)
if (cpuset_do_slab_mem_spread() && (cachep->flags & SLAB_MEM_SPREAD))
nid_alloc = cpuset_slab_spread_node();
else if (current->mempolicy)
-   nid_alloc = slab_node();
+   nid_alloc = mempolicy_slab_node();
if (nid_alloc != nid_here)
return cache_alloc_node(cachep, flags, nid_alloc);
return NULL;
@@ -3074,7 +3074,7 @@ static void *fallback_alloc(struct kmem_cache *cache, gfp_t flags)
 
 retry_cpuset:
cpuset_mems_cookie = get_mems_allowed();
-   zonelist = node_zonelist(slab_node(), flags);
+   zonelist = node_zonelist(mempolicy_slab_node(), flags);
 
 retry:
/*
diff --git a/mm/slub.c b/mm/slub.c
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1663,7 +1663,7 @@ static void *get_any_partial(struct kmem_cache *s, gfp_t flags,
 
do {
cpuset_mems_cookie = get_mems_allowed();
-   zonelist = node_zonelist(slab_node(), flags);
+   zonelist = node_zonelist(mempolicy_slab_node(), flags);
for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
struct kmem_cache_node *n;
 

[patch 4/8] mm, memcg: add tunable for oom reserves

2013-12-03 Thread David Rientjes

Userspace needs a way to define the amount of memory reserves that
processes handling oom conditions may utilize.  This patch adds a per-
memcg oom reserve field and file, memory.oom_reserve_in_bytes, to
manipulate its value.

If currently utilized memory reserves are attempted to be reduced by
writing a smaller value to memory.oom_reserve_in_bytes, it will fail with
-EBUSY until some memory is uncharged.
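
The -EBUSY rule can be illustrated with a small userspace sketch. It is only
a model under assumptions: memcg_model and resize_oom_reserve() are
hypothetical names, the fields are plain integers, and the res.lock that
protects the real update is omitted.

```c
#include <errno.h>

/* Hypothetical model of the three values involved in the resize check. */
struct memcg_model {
	unsigned long long limit;       /* models memory.limit_in_bytes */
	unsigned long long usage;       /* current charge, may exceed limit */
	unsigned long long oom_reserve; /* models memory.oom_reserve_in_bytes */
};

/*
 * Refuse to shrink the reserve below the amount that is already in use,
 * i.e. below the part of usage that lies above the limit.
 */
static int resize_oom_reserve(struct memcg_model *m,
			      unsigned long long new_reserve)
{
	if (m->usage > m->limit && m->usage - m->limit > new_reserve)
		return -EBUSY;

	m->oom_reserve = new_reserve;
	return 0;
}
```

With a limit of 100 and usage of 103, three units of reserve are in use, so
writing 2 fails with -EBUSY while writing 3 succeeds.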

Signed-off-by: David Rientjes 
---
 mm/memcontrol.c | 53 +
 1 file changed, 53 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -274,6 +274,9 @@ struct mem_cgroup {
/* OOM-Killer disable */
int oom_kill_disable;
 
+   /* reserves for handling oom conditions, protected by res.lock */
+   unsigned long long  oom_reserve;
+
/* set when res.limit == memsw.limit */
boolmemsw_is_minimum;
 
@@ -5893,6 +5896,51 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
return 0;
 }
 
+static int mem_cgroup_resize_oom_reserve(struct mem_cgroup *memcg,
+unsigned long long new_limit)
+{
+   struct res_counter *res = &memcg->res;
+   u64 limit, usage;
+   int ret = 0;
+
+   spin_lock(&res->lock);
+   limit = res->limit;
+   usage = res->usage;
+
+   if (usage > limit && usage - limit > new_limit) {
+   ret = -EBUSY;
+   goto out;
+   }
+
+   memcg->oom_reserve = new_limit;
+out:
+   spin_unlock(&res->lock);
+   return ret;
+}
+
+static u64 mem_cgroup_oom_reserve_read(struct cgroup_subsys_state *css,
+  struct cftype *cft)
+{
+   return mem_cgroup_from_css(css)->oom_reserve;
+}
+
+static int mem_cgroup_oom_reserve_write(struct cgroup_subsys_state *css,
+   struct cftype *cft, const char *buffer)
+{
+   struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+   unsigned long long val;
+   int ret;
+
+   if (mem_cgroup_is_root(memcg))
+   return -EINVAL;
+
+   ret = res_counter_memparse_write_strategy(buffer, &val);
+   if (ret)
+   return ret;
+
+   return mem_cgroup_resize_oom_reserve(memcg, val);
+}
+
 #ifdef CONFIG_MEMCG_KMEM
 static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 {
@@ -6024,6 +6072,11 @@ static struct cftype mem_cgroup_files[] = {
.private = MEMFILE_PRIVATE(_OOM_TYPE, OOM_CONTROL),
},
{
+   .name = "oom_reserve_in_bytes",
+   .read_u64 = mem_cgroup_oom_reserve_read,
+   .write_string = mem_cgroup_oom_reserve_write,
+   },
+   {
.name = "pressure_level",
.register_event = vmpressure_register_event,
.unregister_event = vmpressure_unregister_event,

[patch 1/8] fork: collapse copy_flags into copy_process

2013-12-03 Thread David Rientjes

copy_flags() does not use the clone_flags formal and can be collapsed
into copy_process() for cleaner code.
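
The equivalence behind this cleanup can be checked with a short userspace
sketch that compares the helper form with the inlined form. The flag values
and the names F_SUPERPRIV, F_WQ_WORKER, F_FORKNOEXEC, flags_via_helper() and
flags_inline() are hypothetical and chosen only for illustration; they are
not the kernel's PF_* values.

```c
/* Hypothetical flag bits, not the kernel's PF_* values. */
#define F_SUPERPRIV  0x1u
#define F_WQ_WORKER  0x2u
#define F_FORKNOEXEC 0x4u

/* Shape of the removed helper: work on a local copy, then store it. */
static unsigned int flags_via_helper(unsigned int flags)
{
	unsigned int new_flags = flags;

	new_flags &= ~(F_SUPERPRIV | F_WQ_WORKER);
	new_flags |= F_FORKNOEXEC;
	return new_flags;
}

/* Shape of the inlined replacement: update the flags word in place. */
static unsigned int flags_inline(unsigned int flags)
{
	flags &= ~(F_SUPERPRIV | F_WQ_WORKER);
	flags |= F_FORKNOEXEC;
	return flags;
}
```

Both forms clear the same two bits and set the same one, so the result is
identical for every input.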

Signed-off-by: David Rientjes 
---
 kernel/fork.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1066,15 +1066,6 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
return 0;
 }
 
-static void copy_flags(unsigned long clone_flags, struct task_struct *p)
-{
-   unsigned long new_flags = p->flags;
-
-   new_flags &= ~(PF_SUPERPRIV | PF_WQ_WORKER);
-   new_flags |= PF_FORKNOEXEC;
-   p->flags = new_flags;
-}
-
 SYSCALL_DEFINE1(set_tid_address, int __user *, tidptr)
 {
current->clear_child_tid = tidptr;
@@ -1223,7 +1214,8 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 
p->did_exec = 0;
delayacct_tsk_init(p);  /* Must remain after dup_task_struct() */
-   copy_flags(clone_flags, p);
+   p->flags &= ~(PF_SUPERPRIV | PF_WQ_WORKER);
+   p->flags |= PF_FORKNOEXEC;
INIT_LIST_HEAD(&p->children);
INIT_LIST_HEAD(&p->sibling);
rcu_copy_process(p);

Re: [PATCHv4] ARM: tegra: add gpiod_lookup table for paz00

2013-12-03 Thread Alex Courbot


On 12/03/2013 09:49 PM, Heikki Krogerus wrote:

This makes it possible to request the gpio descriptors in
rfkill_gpio driver regardless of the platform.

Signed-off-by: Heikki Krogerus 
---
  arch/arm/mach-tegra/board-paz00.c | 11 +++
  1 file changed, 11 insertions(+)

Changes since v3:
-
Rebased on top of Alexandre's patch "gpio: better lookup method for
platform GPIOs".

diff --git a/arch/arm/mach-tegra/board-paz00.c b/arch/arm/mach-tegra/board-paz00.c
index 06f0240..e4dec9f 100644
--- a/arch/arm/mach-tegra/board-paz00.c
+++ b/arch/arm/mach-tegra/board-paz00.c
@@ -18,6 +18,7 @@
   */

  #include 
+#include 
  #include 
  #include "board.h"

@@ -36,7 +37,17 @@ static struct platform_device wifi_rfkill_device = {
},
  };

+static struct gpiod_lookup_table wifi_gpio_lookup = {
+   .dev_id = "rfkill_gpio",
+   .table = {
+   GPIO_LOOKUP_IDX("tegra-gpio", 25, NULL, 0, 0),
+   GPIO_LOOKUP_IDX("tegra-gpio", 85, NULL, 1, 0),


nit: the flags could be changed to GPIO_ACTIVE_HIGH (which evaluates to 
0) to be more explicit.


Re: [PATCH v2] watchdog: mpc8xxx_wdt convert to watchdog core

2013-12-03 Thread Guenter Roeck


On 12/03/2013 05:31 AM, Christophe Leroy wrote:

Convert mpc8xxx_wdt.c to the new watchdog API.

Signed-off-by: Christophe Leroy 

diff -ur a/drivers/watchdog/mpc8xxx_wdt.c b/drivers/watchdog/mpc8xxx_wdt.c
--- a/drivers/watchdog/mpc8xxx_wdt.c2013-05-11 22:57:46.0 +0200
+++ b/drivers/watchdog/mpc8xxx_wdt.c2013-11-30 16:14:53.803495472 +0100
@@ -72,53 +72,36 @@
   * to 0
   */
  static int prescale = 1;
-static unsigned int timeout_sec;

-static unsigned long wdt_is_open;
  static DEFINE_SPINLOCK(wdt_spinlock);

-static void mpc8xxx_wdt_keepalive(void)
+static int mpc8xxx_wdt_ping(struct watchdog_device *w)
  {
/* Ping the WDT */
spin_lock(&wdt_spinlock);
out_be16(&wd_base->swsrr, 0x556c);
out_be16(&wd_base->swsrr, 0xaa39);
spin_unlock(&wdt_spinlock);
+   return 0;
  }



Ok, now I understand a bit better.

I think it would be better to keep the original mpc8xxx_wdt_keepalive()
function and add

static int mpc8xxx_wdt_ping(struct watchdog_device *w)
{
	mpc8xxx_wdt_keepalive();
	return 0;
}

since the parameter is not used. Then you can call mpc8xxx_wdt_keepalive()
from mpc8xxx_wdt_timer_ping(), and you don't have to add the dummy argument.

Otherwise looks good.

Note there is a problem in the probe function with
u32 freq = fsl_get_sys_freq();
and
if (!freq || freq == -1)
 ^^^

but fixing that would be a different patch. 

Guenter


+static struct watchdog_device mpc8xxx_wdt_dev;
  static void mpc8xxx_wdt_timer_ping(unsigned long arg);
-static DEFINE_TIMER(wdt_timer, mpc8xxx_wdt_timer_ping, 0, 0);
+static DEFINE_TIMER(wdt_timer, mpc8xxx_wdt_timer_ping, 0,
+   (unsigned long)&mpc8xxx_wdt_dev);

  static void mpc8xxx_wdt_timer_ping(unsigned long arg)
  {
-   mpc8xxx_wdt_keepalive();
-   /* We're pinging it twice faster than needed, just to be sure. */
-   mod_timer(&wdt_timer, jiffies + HZ * timeout_sec / 2);
-}
+   struct watchdog_device *w = (struct watchdog_device *)arg;

-static void mpc8xxx_wdt_pr_warn(const char *msg)
-{
-   pr_crit("%s, expect the %s soon!\n", msg,
-   reset ? "reset" : "machine check exception");
-}
-
-static ssize_t mpc8xxx_wdt_write(struct file *file, const char __user *buf,
-size_t count, loff_t *ppos)
-{
-   if (count)
-   mpc8xxx_wdt_keepalive();
-   return count;
+   mpc8xxx_wdt_ping(w);
+   /* We're pinging it twice faster than needed, just to be sure. */
+   mod_timer(&wdt_timer, jiffies + HZ * w->timeout / 2);
  }

-static int mpc8xxx_wdt_open(struct inode *inode, struct file *file)
+static int mpc8xxx_wdt_start(struct watchdog_device *w)
  {
u32 tmp = SWCRR_SWEN;
-   if (test_and_set_bit(0, &wdt_is_open))
-   return -EBUSY;
-
-   /* Once we start the watchdog we can't stop it */
-   if (nowayout)
-   __module_get(THIS_MODULE);

/* Good, fire up the show */
if (prescale)
@@ -132,59 +115,31 @@

del_timer_sync(&wdt_timer);

-   return nonseekable_open(inode, file);
+   return 0;
  }

-static int mpc8xxx_wdt_release(struct inode *inode, struct file *file)
+static int mpc8xxx_wdt_stop(struct watchdog_device *w)
  {
-   if (!nowayout)
-   mpc8xxx_wdt_timer_ping(0);
-   else
-   mpc8xxx_wdt_pr_warn("watchdog closed");
-   clear_bit(0, &wdt_is_open);
+   mod_timer(&wdt_timer, jiffies);
return 0;
  }

-static long mpc8xxx_wdt_ioctl(struct file *file, unsigned int cmd,
-   unsigned long arg)
-{
-   void __user *argp = (void __user *)arg;
-   int __user *p = argp;
-   static const struct watchdog_info ident = {
-   .options = WDIOF_KEEPALIVEPING,
-   .firmware_version = 1,
-   .identity = "MPC8xxx",
-   };
-
-   switch (cmd) {
-   case WDIOC_GETSUPPORT:
-   return copy_to_user(argp, &ident, sizeof(ident)) ? -EFAULT : 0;
-   case WDIOC_GETSTATUS:
-   case WDIOC_GETBOOTSTATUS:
-   return put_user(0, p);
-   case WDIOC_KEEPALIVE:
-   mpc8xxx_wdt_keepalive();
-   return 0;
-   case WDIOC_GETTIMEOUT:
-   return put_user(timeout_sec, p);
-   default:
-   return -ENOTTY;
-   }
-}
+static struct watchdog_info mpc8xxx_wdt_info = {
+   .options = WDIOF_KEEPALIVEPING,
+   .firmware_version = 1,
+   .identity = "MPC8xxx",
+};

-static const struct file_operations mpc8xxx_wdt_fops = {
-   .owner  = THIS_MODULE,
-   .llseek = no_llseek,
-   .write  = mpc8xxx_wdt_write,
-   .unlocked_ioctl = mpc8xxx_wdt_ioctl,
-   .open   = mpc8xxx_wdt_open,
-   .release= mpc8xxx_wdt_release,
+static struct watchdog_ops mpc8xxx_wdt_ops = {
+   .owner = THIS_MODULE,
+   .start = mpc8xxx_wdt_start,
+   .ping = mpc8xxx_wdt_ping,
+   .stop =

[PATCH] Update x86_msi.restore_msi_irqs API param

2013-12-03 Thread DuanZhenzhong


Change x86_msi.restore_msi_irqs(struct pci_dev *dev, int irq) to
x86_msi.restore_msi_irqs(struct pci_dev *dev).

As its name suggests, restore_msi_irqs restores multiple MSI-X irqs, so the
'int irq' parameter is unnecessary. Dropping it also makes the code look
consistent between virtual machines and bare metal.

The dom0 MSI-X restore code can also be optimized, since Xen has only a
single hypercall that restores all MSI-X vectors at once.
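
The shape of the new interface, a per-device call that walks the device's
MSI list and reuses the old per-irq restore, can be sketched in userspace as
below. This is an illustration under assumptions and not the kernel code:
msi_entry_model, dev_model, restore_one_irq() and restore_all_irqs() are
hypothetical names, the list is a plain singly linked list, and "restoring"
an entry is reduced to setting a marker.

```c
#include <stddef.h>

/* Hypothetical stand-in for one MSI descriptor on a device. */
struct msi_entry_model {
	int irq;
	int restored;
	struct msi_entry_model *next;
};

/* Hypothetical stand-in for a device with a list of MSI descriptors. */
struct dev_model {
	struct msi_entry_model *msi_list;
};

/* Per-irq variant: find the matching entry and restore only that one. */
static void restore_one_irq(struct dev_model *dev, int irq)
{
	struct msi_entry_model *e;

	for (e = dev->msi_list; e != NULL; e = e->next) {
		if (e->irq == irq) {
			e->restored = 1;
			return;
		}
	}
}

/* Per-device variant: the caller no longer needs to know any irq number. */
static void restore_all_irqs(struct dev_model *dev)
{
	struct msi_entry_model *e;

	for (e = dev->msi_list; e != NULL; e = e->next)
		restore_one_irq(dev, e->irq);
}
```

A backend that can restore every vector with one operation, as the commit
message says of Xen, could implement the per-device call directly instead of
looping.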

Tested-by: Sucheta Chakraborty 
Signed-off-by: Zhenzhong Duan 
Acked-by: Konrad Rzeszutek Wilk 
---
arch/x86/include/asm/pci.h  |2 +-
arch/x86/include/asm/x86_init.h |2 +-
arch/x86/kernel/x86_init.c  |4 ++--
arch/x86/pci/xen.c  |2 +-
drivers/pci/msi.c   |   19 ++-
include/linux/msi.h |4 ++--
6 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 947b5c4..0de52c5 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -104,7 +104,7 @@ extern void pci_iommu_alloc(void);
struct msi_desc;
int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
void native_teardown_msi_irq(unsigned int irq);
-void native_restore_msi_irqs(struct pci_dev *dev, int irq);
+void native_restore_msi_irqs(struct pci_dev *dev);
int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
  unsigned int irq_base, unsigned int irq_offset);
#else
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 0f1be11..e45e4da 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -181,7 +181,7 @@ struct x86_msi_ops {
   u8 hpet_id);
void (*teardown_msi_irq)(unsigned int irq);
void (*teardown_msi_irqs)(struct pci_dev *dev);
-   void (*restore_msi_irqs)(struct pci_dev *dev, int irq);
+   void (*restore_msi_irqs)(struct pci_dev *dev);
int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 021783b..e48b674 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -136,9 +136,9 @@ void arch_teardown_msi_irq(unsigned int irq)
x86_msi.teardown_msi_irq(irq);
}

-void arch_restore_msi_irqs(struct pci_dev *dev, int irq)
+void arch_restore_msi_irqs(struct pci_dev *dev)
{
-   x86_msi.restore_msi_irqs(dev, irq);
+   x86_msi.restore_msi_irqs(dev);
}
u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
{
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 5eee495..103e702 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -337,7 +337,7 @@ out:
return ret;
}

-static void xen_initdom_restore_msi_irqs(struct pci_dev *dev, int irq)
+static void xen_initdom_restore_msi_irqs(struct pci_dev *dev)
{
int ret = 0;

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 3fcd67a..ed7310d 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -116,7 +116,7 @@ void __weak arch_teardown_msi_irqs(struct pci_dev *dev)
return default_teardown_msi_irqs(dev);
}

-void default_restore_msi_irqs(struct pci_dev *dev, int irq)
+void default_restore_msi_irq(struct pci_dev *dev, int irq)
{
struct msi_desc *entry;

@@ -134,9 +134,9 @@ void default_restore_msi_irqs(struct pci_dev *dev, int irq)
write_msi_msg(irq, &entry->msg);
}

-void __weak arch_restore_msi_irqs(struct pci_dev *dev, int irq)
+void __weak arch_restore_msi_irqs(struct pci_dev *dev)
{
-   return default_restore_msi_irqs(dev, irq);
+   return default_restore_msi_irqs(dev);
}

static void msi_set_enable(struct pci_dev *dev, int enable)
@@ -262,6 +262,15 @@ void unmask_msi_irq(struct irq_data *data)
msi_set_mask_bit(data, 0);
}

+void default_restore_msi_irqs(struct pci_dev *dev)
+{
+   struct msi_desc *entry;
+
+   list_for_each_entry(entry, &dev->msi_list, list) {
+   default_restore_msi_irq(dev, entry->irq);
+   }
+}
+
void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
{
BUG_ON(entry->dev->current_state != PCI_D0);
@@ -430,7 +439,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)

pci_intx_for_msi(dev, 0);
msi_set_enable(dev, 0);
-   arch_restore_msi_irqs(dev, dev->irq);
+   arch_restore_msi_irqs(dev);

pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
msi_mask_irq(entry, msi_capable_mask(control), entry->masked);
@@ -455,8 +464,8 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
control |= PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL;
pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, control);

+   arch_restore_msi_irqs(dev);
list_for_each_entry(entry, &dev->msi_list, list) {
-   arch_restore_msi_irqs(dev, entry->irq);

Re: [PATCH] PCI: Introduce two new MSI infrastructure calls for masking/unmasking.

2013-12-03 Thread DuanZhenzhong


Konrad Rzeszutek Wilk wrote:

On Fri, Nov 08, 2013 at 09:44:09AM +0800, Zhenzhong Duan wrote:
  

On 2013-11-07 07:51, Bjorn Helgaas wrote:


[+cc Thomas, Ingo, Peter, x86 list]

On Wed, Nov 6, 2013 at 2:16 PM, Konrad Rzeszutek Wilk
 wrote:
  

Certain platforms do not allow writes in the MSI-X bars
to setup or tear down vector values. To combat against
the generic code trying to write to that and either silently
being ignored or crashing due to the pagetables being marked r/o
this patch introduces a platform over-write.

Note that we keep two separate, non-weak, functions
default_mask_msi_irqs() and default_mask_msix_irqs() for the
behavior of the arch_mask_msi_irqs() and arch_mask_msix_irqs(),
as the default behavior is needed by x86 PCI code.

For Xen, which does not allow the guest to write to MSI-X
tables - as the hypervisor is solely responsible for setting
the vector values - we implement two nops.

CC: Bjorn Helgaas 
CC: Sucheta Chakraborty 
CC: Zhenzhong Duan 
Signed-off-by: Konrad Rzeszutek Wilk 


I think this is safe, and I'd like to squeeze it into the v3.13 merge
window next week, since it supersedes three patches Zhenzhong has been
trying to get in since July [1], and this patch is much simpler to
understand.
  

This patch could replace the first two.
I think the third patch of mine is still needed as it does a
different thing.
It optimizes restore path in dom0.



I tried to rebase it on top of this patch but it ended up that
you still need the two arguments (for restore_... operation). 


But perhaps there is a better way. If you can rebase on top
of this patch - and send it out - that would be great!
  

Ok, I'll send one rebased on your patch later.

--
Regards
zhenzhong
--
Oracle Building, No.24 Building, Zhongguancun Software Park
Haidian District, Beijing 100193, China

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v12 09/18] vmscan: shrink slab on memcg pressure

2013-12-03 Thread Dave Chinner

On Tue, Dec 03, 2013 at 04:15:57PM +0400, Vladimir Davydov wrote:
> On 12/03/2013 02:48 PM, Dave Chinner wrote:
> >> @@ -236,11 +236,17 @@ shrink_slab_node(struct shrink_control *shrinkctl, 
> >> struct shrinker *shrinker,
> >>return 0;
> >>  
> >>/*
> >> -   * copy the current shrinker scan count into a local variable
> >> -   * and zero it so that other concurrent shrinker invocations
> >> -   * don't also do this scanning work.
> >> +   * Do not touch global counter of deferred objects on memcg pressure to
> >> +   * avoid isolation issues. Ideally the counter should be per-memcg.
> >> */
> >> -  nr = atomic_long_xchg(>nr_deferred[nid], 0);
> >> +  if (!shrinkctl->target_mem_cgroup) {
> >> +  /*
> >> +   * copy the current shrinker scan count into a local variable
> >> +   * and zero it so that other concurrent shrinker invocations
> >> +   * don't also do this scanning work.
> >> +   */
> >> +  nr = atomic_long_xchg(>nr_deferred[nid], 0);
> >> +  }
> > That's ugly. Effectively it means that memcg reclaim is going to be
> > completely ineffective when large numbers of allocations and hence
> > reclaim attempts are done under GFP_NOFS context.
> >
> > The only thing that keeps filesystem caches in balance when there is
> > lots of filesystem work going on (i.e. lots of GFP_NOFS allocations)
> > is the deferal of reclaim work to a context that can do something
> > about it.
> 
> Imagine the situation: a memcg issues a GFP_NOFS allocation and goes to
> shrink_slab() where it defers them to the global counter; then another
> memcg issues a GFP_KERNEL allocation, also goes to shrink_slab() where
> it sees a huge number of deferred objects and starts shrinking them,
> which is not good IMHO.

That's exactly what the deferred mechanism is for - we know we have
to do the work, but we can't do it right now so let someone else do
it who can.

In most cases, deferral is handled by kswapd, because when a
filesystem workload is causing memory pressure then most allocations
are done in GFP_NOFS conditions. Hence the only memory reclaim that
can make progress here is kswapd.
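
A minimal userspace sketch of the deferral idea described above, not the
kernel implementation. The counter and the function name shrink_or_defer()
are hypothetical; the real shrinker code also caps and batches the scan
count, which is left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-node "deferred work" counter. */
static atomic_long nr_deferred;

/*
 * Returns the number of objects this caller scans.
 * A caller that cannot reclaim (think GFP_NOFS) adds its share to the
 * deferred counter and scans nothing. A caller that can reclaim
 * (think kswapd) takes the whole backlog plus its own share.
 */
static long shrink_or_defer(long nr_to_scan, bool can_reclaim)
{
	assert(nr_to_scan >= 0);

	if (!can_reclaim) {
		atomic_fetch_add(&nr_deferred, nr_to_scan);
		return 0;
	}

	/* Claim the backlog and zero it so concurrent callers do not repeat it. */
	return nr_to_scan + atomic_exchange(&nr_deferred, 0);
}
```

Two restricted callers followed by one unrestricted caller leave the
unrestricted caller with all three shares, which is the behaviour the
memcg patch removes when it skips the counter.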

Right now, you aren't deferring any of this memory pressure to some
other agent, so it just does not get done. That's a massive problem
- it's a design flaw - and instead I see lots of crazy hacks being
added to do stuff that should simply be deferred to kswapd like is
done for global memory pressure.

Hell, kswapd should be allowed to walk memcg LRU lists and trim
them, just like it does for the global lists. We only need a single
"deferred work" counter per node for that - just let kswapd
proportion the deferred work over the per-node LRU and the
memcgs.

> I understand that nr_deferred is necessary, but
> I think it should be per-memcg. What do you think about moving it to
> list_lru?

It's part of the shrinker state that is used to calculate how much
work the shrinker needs to do. We can't hold it in the LRU, because
there is no guarantee that a shrinker actually uses a list_lru, and
shrinkers can be memcg aware without using list_lru infrastructure.

So, no, moving it to the list_lru is not a solution.

> > So, if the memcg can't make progress, why wouldn't you defer the
> > work to the global scan? Or can't a global scan trim memcg LRUs?
> > And if it can't, then isn't that a major design flaw? Why not just
> > allow kswapd to walk memcg LRUs in the background?
> >
> > /me just looked at patch 13
> >
> > Yeah, this goes some way to explaining why something like patch 13
> > is necessary - slab shrinkers are not keeping up with page cache
> > reclaim because of GFP_NOFS allocations, and so the page cache
> > empties only leaving slab caches to be trimmed
> >
> >
> >> +static unsigned long
> >> +shrink_slab_memcg(struct shrink_control *shrinkctl, struct shrinker 
> >> *shrinker,
> >> +unsigned long fraction, unsigned long denominator)
> > what's this function got to do with memcgs? Why did you rename it
> > from the self explanitory shrink_slab_one() name that Glauber gave
> > it?
> 
> When I sent the previous version, Johannes Weiner disliked the name that
> was why I renamed it, now you don't like the new name and ask for the
> old one :-) But why do you think that shrink_slab_one() is
> self-explanatory while shrink_slab_memcg() is not? I mean
> shrink_slab_memcg() means "shrink slab accounted to a memcg" just like

But it's not shrinking a slab accounted to a memcg - the memcg can
be null. All it's doing is executing a shrinker scan...

> shrink_slab_node() means "shrink slab on the node" while seeing
> shrink_slab_one() I would ask "one what?".

It's running a shrinker scan on *one* shrinker. It doesn't matter if
the shrinker is memcg aware, or numa aware, it's just running one
shrinker.

So, while shrink_slab_one() might not be the best name, it's
certainly more correct than shrink_slab_memcg(). Perhaps it would be
better named

Re: sysfs: use a separate locking class for open files depending on mmap

2013-12-03 Thread Dave Jones

On Tue, Dec 03, 2013 at 05:15:43PM -0500, Tejun Heo wrote:
 > Hello,
 > 
 > Can you please test whether this patch makes the lockdep warning go
 > away?
 > 
 > Thanks a lot!
 
been running a few hours now, looks good.

Tested-by: Dave Jones 

Dave


Re: [PATCH v1 1/2] of: irq: Ignore disabled intc's when searching map

2013-12-03 Thread Peter Crosthwaite

Ping!

On Thu, Nov 28, 2013 at 5:26 PM, Peter Crosthwaite
 wrote:
> When searching the interrupt map, if a matched parent is disabled, just
> ignore it and move on with the search.
>
> This allows for specifying connection of a single device IRQ to
> multiple interrupt controllers via the interrupt map schema. This change
> allows for selection of the active interrupt controller via the already
> existing status = "disabled" mechanism.
>
> Signed-off-by: Peter Crosthwaite 
> Acked-by: Michal Simek 
> ---
>  drivers/of/irq.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/of/irq.c b/drivers/of/irq.c
> index 786b0b4..22e414b 100644
> --- a/drivers/of/irq.c
> +++ b/drivers/of/irq.c
> @@ -217,6 +217,9 @@ int of_irq_parse_raw(const __be32 *addr, struct 
> of_phandle_args *out_irq)
> goto fail;
> }
>
> +   if (!of_device_is_available(newpar))
> +   match = 0;
> +
> /* Get #interrupt-cells and #address-cells of new
>  * parent
>  */
> --
> 1.8.4.4
>

[PATCH 3/3] ARM Coresight: Add PID control support for ETM tracing

2013-12-03 Thread Adrien Vergé

In the same manner as for enabling tracing, an entry is created in
sysfs to set the PID that triggers tracing. This change requires
CONFIG_PID_IN_CONTEXTIDR to be set when using on-chip ETM.

Signed-off-by: Adrien Vergé 
Cc: Russell King 
Cc: Ben Dooks 
Cc: Will Deacon 
Cc: Dietmar Eggemann 
Cc: Andrew Morton 
Cc: "zhangwei(Jovi)" 
Cc: Greg Kroah-Hartman 
Cc: Randy Dunlap 
---
 arch/arm/Kconfig.debug|  1 +
 arch/arm/include/asm/hardware/coresight.h |  3 ++
 arch/arm/kernel/etm.c | 73 ---
 3 files changed, 70 insertions(+), 7 deletions(-)

diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index 5765abf..fef32e15 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -1130,6 +1130,7 @@ config EARLY_PRINTK
 config OC_ETM
  bool "On-chip ETM and ETB"
  depends on ARM_AMBA
+ select PID_IN_CONTEXTIDR
  help
   Enables the on-chip embedded trace macrocell and embedded trace
   buffer driver that will allow you to collect traces of the
diff --git a/arch/arm/include/asm/hardware/coresight.h
b/arch/arm/include/asm/hardware/coresight.h
index 8c50cf6..009cdf9 100644
--- a/arch/arm/include/asm/hardware/coresight.h
+++ b/arch/arm/include/asm/hardware/coresight.h
@@ -98,6 +98,9 @@
 #define ETMR_ADDRCOMP_VAL(x) (0x40 + (x) * 4)
 #define ETMR_ADDRCOMP_ACC_TYPE(x) (0x80 + (x) * 4)

+#define ETMR_CTXIDCOMP_VAL(x) (0x1b0 + (x) * 4)
+#define ETMR_CTXIDCOMP_MASK (0x1bc)
+
 /* ETM status register, "ETM Architecture", 3.3.2 */
 #define ETMR_STATUS (0x10)
 #define ETMST_OVERFLOW BIT(0)
diff --git a/arch/arm/kernel/etm.c b/arch/arm/kernel/etm.c
index a72382b..18afed1 100644
--- a/arch/arm/kernel/etm.c
+++ b/arch/arm/kernel/etm.c
@@ -40,12 +40,14 @@ struct tracectx {
  void __iomem *etm_regs;
  unsigned long flags;
  int naddrcmppairs;
+ int nctxidcmp;
  int etm_portsz;
  struct device *dev;
  struct clk *emu_clk;
  struct mutex mutex;
  unsigned long addrrange_start;
  unsigned long addrrange_end;
+ long pid;
 };

 static struct tracectx tracer;
@@ -59,14 +61,18 @@ static inline bool trace_isrunning(struct tracectx *t)
  * Setups ETM to trace only when:
  *   - address between start and end
  * or address not between start and end (if exclude)
+ *   - in user-space when process id equals pid,
+ * in kernel-space (if pid == 0),
+ * always (if pid == -1)
  *   - trace executed instructions
  * or trace loads and stores (if data)
  */
-static int etm_setup_address_range(struct tracectx *t, int n,
- unsigned long start, unsigned long end, int exclude, int data)
+static int etm_setup(struct tracectx *t, int n,
+ unsigned long start, unsigned long end, int exclude,
+ long pid,
+ int data)
 {
- u32 flags = ETMAAT_ARM | ETMAAT_IGNCONTEXTID | ETMAAT_NSONLY | \
-ETMAAT_NOVALCMP;
+ u32 flags = ETMAAT_ARM | ETMAAT_NSONLY | ETMAAT_NOVALCMP;

  if (n < 1 || n > t->naddrcmppairs)
  return -EINVAL;
@@ -75,6 +81,19 @@ static int etm_setup_address_range(struct tracectx *t, int n,
  * to bits in a word */
  n--;

+ if (pid < 0) {
+ /* ignore Context ID */
+ flags |= ETMAAT_IGNCONTEXTID;
+ } else {
+ flags |= ETMAAT_VALUE1;
+
+ /* Set up the first Context ID comparator.
+   Process ID is found in the 24 first bits of Context ID
+   (provided by CONFIG_PID_IN_CONTEXTIDR) */
+ etm_writel(t, pid << 8, ETMR_CTXIDCOMP_VAL(0));
+ etm_writel(t, 0xff, ETMR_CTXIDCOMP_MASK);
+ }
+
  if (data)
  flags |= ETMAAT_DLOADSTORE;
  else
@@ -124,8 +143,10 @@ static int trace_start(struct tracectx *t)
  return -EFAULT;
  }

- etm_setup_address_range(t, 1, t->addrrange_start, t->addrrange_end,
- 0, 0);
+ etm_setup(t, 1,
+  t->addrrange_start, t->addrrange_end, 0,
+  t->pid,
+  0);
  etm_writel(t, 0, ETMR_TRACEENCTRL2);
  etm_writel(t, 0, ETMR_TRACESSCTRL);
  etm_writel(t, 0x6f, ETMR_TRACEENEVT);
@@ -488,6 +509,7 @@ static ssize_t trace_info_show(struct kobject *kobj,

  return sprintf(buf, "Trace buffer len: %d\n"
  "Addr comparator pairs: %d\n"
+ "Ctx ID comparators: %d\n"
  "ETBR_WRITEADDR:\t%08x\n"
  "ETBR_READADDR:\t%08x\n"
  "ETBR_STATUS:\t%08x\n"
@@ -496,6 +518,7 @@ static ssize_t trace_info_show(struct kobject *kobj,
  "ETMR_STATUS:\t%08x\n",
  datalen,
  tracer.naddrcmppairs,
+ tracer.nctxidcmp,
  etb_wa,
  etb_ra,
  etb_st,
@@ -572,6 +595,35 @@ static ssize_t trace_addrrange_store(struct kobject *kobj,
 static struct kobj_attribute trace_addrrange_attr =
  __ATTR(trace_addrrange, 0644, trace_addrrange_show, trace_addrrange_store);

+static ssize_t trace_pid_show(struct kobject *kobj,
+  struct kobj_attribute *attr,
+  char *buf)
+{
+ return sprintf(buf, "%ld\n", tracer.pid);
+}
+
+static ssize_t trace_pid_store(struct kobject *kobj,
+   struct kobj_attribute *attr,
+   const char *buf, size_t n)
+{
+ long pid;
+
+ if (tracer.flags & TRACER_RUNNING)
+ return -EBUSY;
+
+ if (sscanf(buf, "%li", &pid) != 1)
+ return -EINVAL;
+
+ mutex_lock(&tracer.mutex);
+ tracer.pid = pid;
+ mutex_unlock(&tracer.mutex);
+
+ return n;
+}
+
+static struct kobj_attribute

[PATCH 2/3] ARM Coresight: Add address control support for ETM tracing

2013-12-03 Thread Adrien Vergé

In the same manner as for enabling tracing, an entry is created
in sysfs to set the address range that triggers tracing.

Signed-off-by: Adrien Vergé 
Cc: Russell King 
Cc: Ben Dooks 
Cc: Will Deacon 
Cc: Dietmar Eggemann 
Cc: Andrew Morton 
Cc: "zhangwei(Jovi)" 
Cc: Greg Kroah-Hartman 
Cc: Randy Dunlap 
---
 arch/arm/kernel/etm.c | 53 ---
 1 file changed, 50 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kernel/etm.c b/arch/arm/kernel/etm.c
index bd7e8e4..a72382b 100644
--- a/arch/arm/kernel/etm.c
+++ b/arch/arm/kernel/etm.c
@@ -44,6 +44,8 @@ struct tracectx {
  struct device *dev;
  struct clk *emu_clk;
  struct mutex mutex;
+ unsigned long addrrange_start;
+ unsigned long addrrange_end;
 };

 static struct tracectx tracer;
@@ -53,6 +55,13 @@ static inline bool trace_isrunning(struct tracectx *t)
  return !!(t->flags & TRACER_RUNNING);
 }

+/*
+ * Setups ETM to trace only when:
+ *   - address between start and end
+ * or address not between start and end (if exclude)
+ *   - trace executed instructions
+ * or trace loads and stores (if data)
+ */
 static int etm_setup_address_range(struct tracectx *t, int n,
  unsigned long start, unsigned long end, int exclude, int data)
 {
@@ -115,8 +124,8 @@ static int trace_start(struct tracectx *t)
  return -EFAULT;
  }

- etm_setup_address_range(t, 1, (unsigned long)_stext,
- (unsigned long)_etext, 0, 0);
+ etm_setup_address_range(t, 1, t->addrrange_start, t->addrrange_end,
+ 0, 0);
  etm_writel(t, 0, ETMR_TRACEENCTRL2);
  etm_writel(t, 0, ETMR_TRACESSCTRL);
  etm_writel(t, 0x6f, ETMR_TRACEENEVT);
@@ -532,6 +541,37 @@ static ssize_t trace_mode_store(struct kobject *kobj,
 static struct kobj_attribute trace_mode_attr =
  __ATTR(trace_mode, 0644, trace_mode_show, trace_mode_store);

+static ssize_t trace_addrrange_show(struct kobject *kobj,
+struct kobj_attribute *attr,
+char *buf)
+{
+ return sprintf(buf, "%08lx - %08lx\n", tracer.addrrange_start,
+   tracer.addrrange_end);
+}
+
+static ssize_t trace_addrrange_store(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ const char *buf, size_t n)
+{
+ unsigned long start, end;
+
+ if (tracer.flags & TRACER_RUNNING)
+ return -EBUSY;
+
+ if (sscanf(buf, "%08lx - %08lx", &start, &end) != 2)
+ return -EINVAL;
+
+ mutex_lock(&tracer.mutex);
+ tracer.addrrange_start = start;
+ tracer.addrrange_end = end;
+ mutex_unlock(&tracer.mutex);
+
+ return n;
+}
+
+static struct kobj_attribute trace_addrrange_attr =
+ __ATTR(trace_addrrange, 0644, trace_addrrange_show, trace_addrrange_store);
+
 static int etm_probe(struct amba_device *dev, const struct amba_id *id)
 {
  struct tracectx *t = &tracer;
@@ -559,6 +599,8 @@ static int etm_probe(struct amba_device *dev,
const struct amba_id *id)
  t->dev = >dev;
  t->flags = TRACER_CYCLE_ACC;
  t->etm_portsz = 1;
+ t->addrrange_start = (unsigned long) _stext;
+ t->addrrange_end = (unsigned long) _etext;

  etm_unlock(t);
  (void)etm_readl(t, ETMMR_PDSR);
@@ -574,7 +616,7 @@ static int etm_probe(struct amba_device *dev,
const struct amba_id *id)
  if (ret)
  goto out_unmap;

- /* failing to create any of these two is not fatal */
+ /* failing to create any of these three is not fatal */
  ret = sysfs_create_file(&dev->dev.kobj, &trace_info_attr.attr);
  if (ret)
  dev_dbg(&dev->dev, "Failed to create trace_info in sysfs\n");
@@ -583,6 +625,10 @@ static int etm_probe(struct amba_device *dev,
const struct amba_id *id)
  if (ret)
  dev_dbg(&dev->dev, "Failed to create trace_mode in sysfs\n");

+ ret = sysfs_create_file(&dev->dev.kobj, &trace_addrrange_attr.attr);
+ if (ret)
+ dev_dbg(&dev->dev, "Failed to create trace_addrrange in sysfs\n");
+
  dev_dbg(t->dev, "ETM AMBA driver initialized.\n");

 out:
@@ -612,6 +658,7 @@ static int etm_remove(struct amba_device *dev)
  sysfs_remove_file(&dev->dev.kobj, &trace_running_attr.attr);
  sysfs_remove_file(&dev->dev.kobj, &trace_info_attr.attr);
  sysfs_remove_file(&dev->dev.kobj, &trace_mode_attr.attr);
+ sysfs_remove_file(&dev->dev.kobj, &trace_addrrange_attr.attr);

  return 0;
 }
-- 
1.8.3.1

[PATCH 1/3] ARM Coresight: Rename 'comparator' to 'address comparator' in ETM

2013-12-03 Thread Adrien Vergé

Since there are different types of comparators, and other kinds will
be used (such as Context ID comparators), rename them properly.

Signed-off-by: Adrien Vergé 
Cc: Russell King 
Cc: Ben Dooks 
Cc: Will Deacon 
Cc: Dietmar Eggemann 
Cc: Andrew Morton 
Cc: "zhangwei(Jovi)" 
Cc: Greg Kroah-Hartman 
Cc: Randy Dunlap 
---
 arch/arm/include/asm/hardware/coresight.h |  4 ++--
 arch/arm/kernel/etm.c | 19 ++-
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/arm/include/asm/hardware/coresight.h
b/arch/arm/include/asm/hardware/coresight.h
index ad774f3..8c50cf6 100644
--- a/arch/arm/include/asm/hardware/coresight.h
+++ b/arch/arm/include/asm/hardware/coresight.h
@@ -95,8 +95,8 @@
 #define ETMAAT_NSONLY (1 << 10)
 #define ETMAAT_SONLY (2 << 10)

-#define ETMR_COMP_VAL(x) (0x40 + (x) * 4)
-#define ETMR_COMP_ACC_TYPE(x) (0x80 + (x) * 4)
+#define ETMR_ADDRCOMP_VAL(x) (0x40 + (x) * 4)
+#define ETMR_ADDRCOMP_ACC_TYPE(x) (0x80 + (x) * 4)

 /* ETM status register, "ETM Architecture", 3.3.2 */
 #define ETMR_STATUS (0x10)
diff --git a/arch/arm/kernel/etm.c b/arch/arm/kernel/etm.c
index 8ff0ecd..bd7e8e4 100644
--- a/arch/arm/kernel/etm.c
+++ b/arch/arm/kernel/etm.c
@@ -39,7 +39,7 @@ struct tracectx {
  void __iomem *etb_regs;
  void __iomem *etm_regs;
  unsigned long flags;
- int ncmppairs;
+ int naddrcmppairs;
  int etm_portsz;
  struct device *dev;
  struct clk *emu_clk;
@@ -59,7 +59,7 @@ static int etm_setup_address_range(struct tracectx *t, int n,
  u32 flags = ETMAAT_ARM | ETMAAT_IGNCONTEXTID | ETMAAT_NSONLY | \
 ETMAAT_NOVALCMP;

- if (n < 1 || n > t->ncmppairs)
+ if (n < 1 || n > t->naddrcmppairs)
  return -EINVAL;

  /* comparators and ranges are numbered starting with 1 as opposed
@@ -72,12 +72,12 @@ static int etm_setup_address_range(struct tracectx
*t, int n,
  flags |= ETMAAT_IEXEC;

  /* first comparator for the range */
- etm_writel(t, flags, ETMR_COMP_ACC_TYPE(n * 2));
- etm_writel(t, start, ETMR_COMP_VAL(n * 2));
+ etm_writel(t, flags, ETMR_ADDRCOMP_ACC_TYPE(n * 2));
+ etm_writel(t, start, ETMR_ADDRCOMP_VAL(n * 2));

  /* second comparator is right next to it */
- etm_writel(t, flags, ETMR_COMP_ACC_TYPE(n * 2 + 1));
- etm_writel(t, end, ETMR_COMP_VAL(n * 2 + 1));
+ etm_writel(t, flags, ETMR_ADDRCOMP_ACC_TYPE(n * 2 + 1));
+ etm_writel(t, end, ETMR_ADDRCOMP_VAL(n * 2 + 1));

  flags = exclude ? ETMTE_INCLEXCL : 0;
  etm_writel(t, flags | (1 << n), ETMR_TRACEENCTRL);
@@ -477,7 +477,8 @@ static ssize_t trace_info_show(struct kobject *kobj,
  etm_st = etm_readl(&tracer, ETMR_STATUS);
  etm_lock(&tracer);

- return sprintf(buf, "Trace buffer len: %d\nComparator pairs: %d\n"
+ return sprintf(buf, "Trace buffer len: %d\n"
+ "Addr comparator pairs: %d\n"
  "ETBR_WRITEADDR:\t%08x\n"
  "ETBR_READADDR:\t%08x\n"
  "ETBR_STATUS:\t%08x\n"
@@ -485,7 +486,7 @@ static ssize_t trace_info_show(struct kobject *kobj,
  "ETMR_CTRL:\t%08x\n"
  "ETMR_STATUS:\t%08x\n",
  datalen,
- tracer.ncmppairs,
+ tracer.naddrcmppairs,
  etb_wa,
  etb_ra,
  etb_st,
@@ -564,7 +565,7 @@ static int etm_probe(struct amba_device *dev,
const struct amba_id *id)
  /* dummy first read */
  (void)etm_readl(&tracer, ETMMR_OSSRR);

- t->ncmppairs = etm_readl(t, ETMR_CONFCODE) & 0xf;
+ t->naddrcmppairs = etm_readl(t, ETMR_CONFCODE) & 0xf;
  etm_writel(t, 0x440, ETMR_CTRL);
  etm_lock(t);

-- 
1.8.3.1

[PATCH 0/3] ARM Coresight: Enhance ETM tracing control

2013-12-03 Thread Adrien Vergé

Usage of the ETM tracing facility is currently very limited: the user
can only start and stop tracing. This set of patches enables management
of the address ranges and PIDs that trigger tracing.

ETM management is already done via sysfs entries (trace_info,
trace_running...). This code adds trace_addrrange and trace_pid to
let the user read/write custom values.
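
A minimal userspace sketch of the two values these entries carry, assuming
the formats used in patches 2/3 and 3/3: an address range written as
"start - end" in hex, and a PID placed above the low 8 bits of the Context
ID (the comparator mask 0xff makes the hardware ignore those low bits). The
helper names parse_addrrange() and ctxid_for_pid() are hypothetical and do
not appear in the series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Parse "start - end" in hex, the format trace_addrrange_show() prints. */
static int parse_addrrange(const char *buf, unsigned long *start,
			   unsigned long *end)
{
	assert(buf && start && end);

	if (sscanf(buf, "%lx - %lx", start, end) != 2)
		return -1;
	return 0;
}

/*
 * Context ID comparator value for a PID, following patch 3/3: the PID
 * is shifted above the low 8 bits, which the comparator mask ignores.
 * Negative PIDs mean "ignore Context ID" in the patch and are not
 * handled here.
 */
static uint32_t ctxid_for_pid(long pid)
{
	assert(pid >= 0);
	return (uint32_t)pid << 8;
}
```

For example, PID 1234 (0x4d2) gives a comparator value of 0x4d200.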

This series of patches applies to v3.13-rc2.

Signed-off-by: Adrien Vergé 
Cc: Russell King 
Cc: Ben Dooks 
Cc: Will Deacon 
Cc: Dietmar Eggemann 
Cc: Andrew Morton 
Cc: "zhangwei(Jovi)" 
Cc: Greg Kroah-Hartman 
Cc: Randy Dunlap 
---
Adrien Vergé (3):
  ARM Coresight: Rename 'comparator' to 'address comparator' in ETM
  ARM Coresight: Add address control support for ETM tracing
  ARM Coresight: Add PID control support for ETM tracing

 arch/arm/Kconfig.debug|   1 +
 arch/arm/include/asm/hardware/coresight.h |   7 +-
 arch/arm/kernel/etm.c | 139 ++
 3 files changed, 129 insertions(+), 18 deletions(-)

-- 
1.8.3.1

Re: [PATCH] Staging: speakup: synth.c: removed a space

2013-12-03 Thread Aldo Iljazi

 Samuel Thibault wrote:

> Err, I'd rather make it really visible that the for loop doesn't have
> its first statement?

Wouldn't it be better if you add a comment there? So it would follow the
coding style?
-- 
Aldo Iljazi

Re: [patch] mm: memcg: do not declare OOM from __GFP_NOFAIL allocations

2013-12-03 Thread Dave Chinner

On Tue, Dec 03, 2013 at 10:01:01PM -0500, Johannes Weiner wrote:
> On Tue, Dec 03, 2013 at 03:40:13PM -0800, David Rientjes wrote:
> > On Tue, 3 Dec 2013, Johannes Weiner wrote:
> > I believe the page allocator would be susceptible to the same deadlock if 
> > nothing else on the system can reclaim memory and that belief comes from 
> > code inspection that shows __GFP_NOFAIL is not guaranteed to ever succeed 
> > in the page allocator as their charges now are (with your patch) in memcg.  
> > I do not have an example of such an incident.
> 
> Me neither.

Is this the sort of thing that you expect to see when GFP_NOFS |
GFP_NOFAIL type allocations continually fail?

http://oss.sgi.com/archives/xfs/2013-12/msg00095.html

XFS doesn't use GFP_NOFAIL; it does its own loop with GFP_NOWARN in
kmem_alloc() so that if we get stuck for more than 100 attempts to
allocate, it throws a warning, i.e. only when we really are stuck and
reclaim is not making any progress.
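
A minimal userspace sketch of that kind of retry loop, not the XFS code
itself. The names alloc_retry(), alloc_fn and flaky_alloc() are
hypothetical, and warning again after every further 100 failures is an
assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical allocator hook so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t size);

/*
 * Retry until the allocation succeeds, warning each time another 100
 * attempts have failed. *warnings counts the warnings printed.
 */
static void *alloc_retry(alloc_fn try_alloc, size_t size,
			 unsigned int *warnings)
{
	unsigned int retries = 0;
	void *ptr;

	assert(try_alloc != NULL && warnings != NULL);

	for (;;) {
		ptr = try_alloc(size);
		if (ptr)
			return ptr;
		if (++retries % 100 == 0) {
			fprintf(stderr,
				"possible memory allocation deadlock (%u attempts)\n",
				retries);
			(*warnings)++;
		}
		/* A kernel implementation would back off here before retrying. */
	}
}

/* Test allocator: fails a fixed number of times, then calls malloc(). */
static unsigned int failures_left;

static void *flaky_alloc(size_t size)
{
	if (failures_left > 0) {
		failures_left--;
		return NULL;
	}
	return malloc(size);
}
```

The caller never sees a failure; the only visible sign of trouble is the
warning, which is why it fires only when reclaim has stopped making
progress for a long time.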

This specific case is due to memory fragmentation preventing a 64k
memory allocation (due to the filesystem being configured with a 64k
directory block size), but GFP_NOFS | GFP_NOFAIL allocations happen
*all the time* in filesystems.

> > > > So, my question again: why not bypass the per-zone min watermarks in 
> > > > the 
> > > > page allocator?
> > > 
> > > I don't even know what your argument is supposed to be.  The fact that
> > > we don't do it in the page allocator means that there can't be a bug
> > > in memcg?
> > > 
> > 
> > I'm asking if we should allow GFP_NOFS | __GFP_NOFAIL allocations in the 
> > page allocator to bypass per-zone min watermarks after reclaim has failed 
> > since the oom killer cannot be called in such a context so that the page 
> > allocator is not susceptible to the same deadlock without a complete 
> > depletion of memory reserves?
> 
> Yes, I think so.

There be dragons. If memcgs deadlock in low memory conditions in
the presence of GFP_NOFS | GFP_NOFAIL allocations, then we need to
make the memcg reclaim design more robust, not work around it by
allowing filesystems to drain critical memory reserves needed for
other situations.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

Re: [OOPS, 3.13-rc2] null ptr in dio_complete()

2013-12-03 Thread Dave Chinner

On Tue, Dec 03, 2013 at 08:47:12PM -0700, Jens Axboe wrote:
> On Wed, Dec 04 2013, Dave Chinner wrote:
> > On Wed, Dec 04, 2013 at 12:58:38PM +1100, Dave Chinner wrote:
> > > On Wed, Dec 04, 2013 at 08:59:40AM +1100, Dave Chinner wrote:
> > > > Hi Jens,
> > > > 
> > > > Not sure who to direct this to or CC, so I figured you are the
> > > > person to do that. I just had xfstests generic/299 (an AIO/DIO test)
> > > > oops in dio_complete() like so:
> > > > 

> > > > [ 9650.590630]  
> > > > [ 9650.590630]  [] dio_complete+0xa3/0x140
> > > > [ 9650.590630]  [] dio_bio_end_aio+0x7a/0x110
> > > > [ 9650.590630]  [] ? dio_bio_end_aio+0x5/0x110
> > > > [ 9650.590630]  [] bio_endio+0x1d/0x30
> > > > [ 9650.590630]  [] blk_mq_complete_request+0x5f/0x120
> > > > [ 9650.590630]  [] __blk_mq_end_io+0x16/0x20
> > > > [ 9650.590630]  [] blk_mq_end_io+0x68/0xd0
> > > > [ 9650.590630]  [] virtblk_done+0x67/0x110
> > > > [ 9650.590630]  [] vring_interrupt+0x35/0x60
.
> > > And I just hit this from running xfs_repair which is doing
> > > multithreaded direct IO directly on /dev/vdc:
> > > 

> > > [ 1776.510446] IP: [] blk_account_io_done+0x6a/0x180

> > > [ 1776.512577]  [] blk_mq_complete_request+0xb8/0x120
> > > [ 1776.512577]  [] __blk_mq_end_io+0x16/0x20
> > > [ 1776.512577]  [] blk_mq_end_io+0x68/0xd0
> > > [ 1776.512577]  [] virtblk_done+0x67/0x110
> > > [ 1776.512577]  [] vring_interrupt+0x35/0x60
> > > [ 1776.512577]  [] handle_irq_event_percpu+0x54/0x1e0
.
> > > So this is looking like another virtio+blk_mq problem
> > 
> > This one is definitely reproducable. Just hit it again...
> 
> I'll take a look at this. You don't happen to have gdb dumps of the
> lines associated with those crashes? Just to save me some digging
> time...

Only this:

(gdb) l *(dio_complete+0xa3)
0x811ddae3 is in dio_complete (fs/direct-io.c:282).
277 }
278
279 aio_complete(dio->iocb, ret, 0);
280 }
281
282 kmem_cache_free(dio_cache, dio);
283 return ret;
284 }
285
286 static void dio_aio_complete_work(struct work_struct *work)

And this:

(gdb) l *(blk_account_io_done+0x6a)
0x81755b6a is in blk_account_io_done (block/blk-core.c:2049).
2044		int cpu;
2045
2046		cpu = part_stat_lock();
2047		part = req->part;
2048
2049		part_stat_inc(cpu, part, ios[rw]);
2050		part_stat_add(cpu, part, ticks[rw], duration);
2051		part_round_stats(cpu, part);
2052		part_dec_in_flight(part, rw);
2053

as I've rebuilt the kernel with different patches since the one
running on the machine that is triggering the problem.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

[PATCH v2] checkpatch: add DT compatible string documentation checks

2013-12-03 Thread Rob Herring

From: Rob Herring 

This adds a simple check that any compatible strings in DeviceTree dts
files are present in Documentation/devicetree/bindings. Vendor prefixes
are also checked against vendor-prefixes.txt. These should be
temporary checks until we have more sophisticated binding schema checking.

Signed-off-by: Rob Herring 
Cc: Grant Likely 
Cc: Andy Whitcroft 
Cc: Joe Perches 
---
v2:
- Add vendor string checking against vendor-prefixes.txt
- Add '_', '.' and '+' as valid compatible string characters
- Use 'grep -E' instead of egrep

 scripts/checkpatch.pl | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9c98100..5eea031 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2034,6 +2034,29 @@ sub process {
 "Use of $flag is deprecated, please use 
\`$replacement->{$flag} instead.\n" . $herecurr) if ($replacement->{$flag});
}
 
+# check for DT compatible documentation
+   if ($realfile =~ /\.dts/ && $rawline =~ /\+\s*compatible\s*=/) {
+   my @compats = $rawline =~ 
/\"([a-zA-Z0-9\-\,\.\+_]+)\"/g;
+
+   foreach my $compat (@compats) {
+   my $compat2 = $compat;
+   my $dt_path = 
"Documentation/devicetree/bindings/";
+   $compat2 =~ s/\,[a-z]*\-/\,<\.\*>\-/;
+   `grep -Erq "$compat|$compat2" $dt_path`;
+   if ( $? >> 8 ) {
+   WARN("UNDOCUMENTED_DT_BINDING",
+"DT compatible string \"$compat\" 
appears un-documented -- check $dt_path\n" . $herecurr);
+   }
+   my $vendor = $compat;
+   $vendor =~ s/^([a-zA-Z0-9]+)\,.*/$1/;
+   `grep -Eq "$vendor" 
"${dt_path}vendor-prefixes.txt"`;
+   if ( $? >> 8 ) {
+   WARN("UNDOCUMENTED_DT_VENDOR",
+"DT compatible string vendor 
\"$vendor\" appears un-documented -- check ${dt_path}vendor-prefixes.txt\n" . 
$herecurr);
+   }
+   }
+   }
+
 # check we are in a valid source file if not then ignore this hunk
next if ($realfile !~ /\.(h|c|s|S|pl|sh)$/);
 
-- 
1.8.3.2


Re: [PATCH 6/8] perf sched: Introduce timehist command

2013-12-03 Thread David Ahern


On 11/28/13, 6:58 PM, David Ahern wrote:

On 11/28/13, 5:48 PM, Namhyung Kim wrote:

Do we really need to look up the callchain to find out an idle thread?

---8<---

It seems every idle/swapper thread for each cpu has a pid of 0.




I knew I had this code in there for a reason

Older kernels (e.g., RHEL6) show init as the idle task for cpus != 0. 
So, to be robust across kernel versions the idle check needs to do more 
than just looking at the swapper thread as the incoming or outgoing 
task. It needs to walk the first few frames of the callstack looking for 
a known idle symbol.


David
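
The more robust idle check described above can be sketched as follows. This is a minimal illustration in Python, not the perf code: the symbol list and the three-frame depth are assumptions picked for the example, and is_idle_sample() is a hypothetical helper.

```python
# Assumed set of idle-loop symbols; a real tool would tune this per kernel.
IDLE_SYMBOLS = frozenset({
    "cpu_idle",
    "cpu_startup_entry",
    "arch_cpu_idle",
    "default_idle",
    "native_safe_halt",
})


def is_idle_sample(comm, callchain, max_depth=3):
    """Decide whether a sched sample belongs to an idle task.

    comm      -- task name from the sample
    callchain -- symbol names, innermost frame first
    max_depth -- how many leading frames to inspect (assumed value)
    """
    # Fast path: newer kernels name every per-cpu idle task swapper[/N].
    if comm == "swapper" or comm.startswith("swapper/"):
        return True
    # Fallback: older kernels may report another comm, so look for a
    # known idle symbol near the top of the callstack.
    return any(sym in IDLE_SYMBOLS for sym in callchain[:max_depth])
```

The depth limit keeps the walk cheap and avoids matching an idle symbol that only appears deep in an unrelated stack.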


Re: [PATCH] usb: phy-tegra-usb.c: wrong pointer check for remap UTMI

2013-12-03 Thread Stephen Warren

On 12/03/2013 07:02 PM, Chris Ruehl wrote:
> usb: phy-tegra-usb.c: wrong pointer check for remap UTMI
> 
> A wrong pointer was used to test the result of devm_ioremap()
> 
> Signed-off-by: Chris Ruehl 
> Acked-by: Venu Byravarasu 

Out of curiosity, when did that ack happen? I didn't see it. But anyway,

Acked-by: Stephen Warren 


[PATCH 3/3] ARM: dts: bcm281xx: define real clocks

2013-12-03 Thread Alex Elder

Replace the "fake" clocks defined in the "bcm11351.dtsi" device tree
file with real definitions backed by the new BCM281xx clock driver.

Signed-off-by: Alex Elder 
Reviewed-by: Matt Porter 
Reviewed-by: Tim Kryger 
---
 arch/arm/boot/dts/bcm11351.dtsi |  222 ---
 1 file changed, 161 insertions(+), 61 deletions(-)

diff --git a/arch/arm/boot/dts/bcm11351.dtsi b/arch/arm/boot/dts/bcm11351.dtsi
index 1246885..b3a0535 100644
--- a/arch/arm/boot/dts/bcm11351.dtsi
+++ b/arch/arm/boot/dts/bcm11351.dtsi
@@ -14,6 +14,8 @@
 #include 
 #include 

+#include "dt-bindings/clock/bcm281xx.h"
+
 #include "skeleton.dtsi"

 / {
@@ -43,7 +45,7 @@
compatible = "brcm,bcm11351-dw-apb-uart", "snps,dw-apb-uart";
status = "disabled";
reg = <0x3e00 0x1000>;
-   clocks = <_clk>;
+   clocks = <_ccu BCM281XX_SLAVE_CCU_UARTB>;
interrupts = ;
reg-shift = <2>;
reg-io-width = <4>;
@@ -53,7 +55,7 @@
compatible = "brcm,bcm11351-dw-apb-uart", "snps,dw-apb-uart";
status = "disabled";
reg = <0x3e001000 0x1000>;
-   clocks = <_clk>;
+   clocks = <_ccu BCM281XX_SLAVE_CCU_UARTB2>;
interrupts = ;
reg-shift = <2>;
reg-io-width = <4>;
@@ -63,7 +65,7 @@
compatible = "brcm,bcm11351-dw-apb-uart", "snps,dw-apb-uart";
status = "disabled";
reg = <0x3e002000 0x1000>;
-   clocks = <_clk>;
+   clocks = <_ccu BCM281XX_SLAVE_CCU_UARTB3>;
interrupts = ;
reg-shift = <2>;
reg-io-width = <4>;
@@ -73,7 +75,7 @@
compatible = "brcm,bcm11351-dw-apb-uart", "snps,dw-apb-uart";
status = "disabled";
reg = <0x3e003000 0x1000>;
-   clocks = <_clk>;
+   clocks = <_ccu BCM281XX_SLAVE_CCU_UARTB4>;
interrupts = ;
reg-shift = <2>;
reg-io-width = <4>;
@@ -95,7 +97,7 @@
compatible = "brcm,kona-timer";
reg = <0x35006000 0x1000>;
interrupts = ;
-   clocks = <_timer_clk>;
+   clocks = <_ccu BCM281XX_AON_CCU_HUB_TIMER>;
};

gpio: gpio@35003000 {
@@ -118,7 +120,7 @@
compatible = "brcm,kona-sdhci";
reg = <0x3f18 0x1>;
interrupts = ;
-   clocks = <_clk>;
+   clocks = <_ccu BCM281XX_MASTER_CCU_SDIO1>;
status = "disabled";
};

@@ -126,7 +128,7 @@
compatible = "brcm,kona-sdhci";
reg = <0x3f19 0x1>;
interrupts = ;
-   clocks = <_clk>;
+   clocks = <_ccu BCM281XX_MASTER_CCU_SDIO2>;
status = "disabled";
};

@@ -134,7 +136,7 @@
compatible = "brcm,kona-sdhci";
reg = <0x3f1a 0x1>;
interrupts = ;
-   clocks = <_clk>;
+   clocks = <_ccu BCM281XX_MASTER_CCU_SDIO3>;
status = "disabled";
};

@@ -142,99 +144,137 @@
compatible = "brcm,kona-sdhci";
reg = <0x3f1b 0x1>;
interrupts = ;
-   clocks = <_clk>;
+   clocks = <_ccu BCM281XX_MASTER_CCU_SDIO4>;
status = "disabled";
};

clocks {
-   bsc1_clk: bsc1 {
-   compatible = "fixed-clock";
-   clock-frequency = <1300>;
-   #clock-cells = <0>;
-   };
-
-   bsc2_clk: bsc2 {
-   compatible = "fixed-clock";
-   clock-frequency = <1300>;
-   #clock-cells = <0>;
-   };
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges;

-   bsc3_clk: bsc3 {
-   compatible = "fixed-clock";
-   clock-frequency = <1300>;
-   #clock-cells = <0>;
+   root_ccu: root_ccu {
+   compatible = "brcm,bcm11351-root-ccu";
+   reg = <0x35001000 0x0f00>;
+   #clock-cells = <1>;
+   clock-output-names = "frac_1m";
};

-   pmu_bsc_clk: pmu_bsc {
-   compatible = "fixed-clock";
-   clock-frequency = <1300>;
-   #clock-cells = <0>;
+   hub_ccu: hub_ccu {
+   compatible = "brcm,bcm11351-hub-ccu";
+   reg = <0x3400 0x0f00>;
+   #clock-cells = <1>;
+   clock-output-names = "tmon_1m";
};

-

[PATCH 1/3] clk: bcm281xx: define kona clock binding

2013-12-03 Thread Alex Elder

Document the device tree binding for Broadcom Kona architecture
clock control units and clocks.  Kona device nodes are represented
with compatible strings having "bcm11351" in their name.

Kona clocks are managed by "clock control units" (CCUs).  Each CCU
has a device tree node, and within that node are defined the names
of the clocks provided by the CCU.

The BCM281xx family of SoCs use Kona CCUs and clocks.

Signed-off-by: Alex Elder 
Reviewed-by: Matt Porter 
Reviewed-by: Tim Kryger 
---
 .../devicetree/bindings/clock/bcm-kona-clock.txt   |   95
 1 file changed, 95 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/clock/bcm-kona-clock.txt

diff --git a/Documentation/devicetree/bindings/clock/bcm-kona-clock.txt b/Documentation/devicetree/bindings/clock/bcm-kona-clock.txt
new file mode 100644
index 000..0cafd6a
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/bcm-kona-clock.txt
@@ -0,0 +1,95 @@
+Broadcom Kona Family Clocks
+
+This binding is associated with Broadcom SoCs having "Kona" style
+clock control units (CCUs).  A CCU is a clock provider that manages
+a set of clock signals.  Each CCU is represented by a node in the
+device tree.
+
+This binding uses the common clock binding:
+Documentation/devicetree/bindings/clock/clock-bindings.txt
+
+Many source clocks are described using the "fixed-clock" binding:
+Documentation/devicetree/bindings/clock/fixed-clock.txt
+
+Required properties:
+- compatible
+   Shall have a value "brcm,bcm11351--ccu", where
+identifies the particular CCU (see below).
+- reg
+   Shall define the base and range of the address space
+   containing clock control registers
+- #clock-cells
+   Shall have the value <1>
+- clock-output-names
+   Shall be an ordered list of strings defining the names of
+   the clocks provided by the CCU.
+
+Clock consumers refer to a particular clock supplied by a CCU using
+a phandle and specifier pair, using the phandle for the CCU device
+tree node and the clock's symbolic specifier.  The clock specifier
+is a CCU-unique 0-based index value.
+
+
+BCM281XX family SoCs use Kona CCUs.  The following table defines
+the set of CCUs and clock specifiers for BCM281XX clocks.  The
+compatible string for the CCU uses the name in the "CCU" column
+below as its  value.
+
+CCU     Clock        Type  Index  Specifier
+---     -----        ----  -----  ---------
+root    frac_1m      peri    0    BCM281XX_ROOT_CCU_FRAC_1M
+
+aon     hub_timer    peri    0    BCM281XX_AON_CCU_HUB_TIMER
+aon     pmu_bsc      peri    1    BCM281XX_AON_CCU_PMU_BSC
+aon     pmu_bsc_var  peri    2    BCM281XX_AON_CCU_PMU_BSC_VAR
+
+hub     tmon_1m      peri    0    BCM281XX_HUB_CCU_TMON_1M
+
+master  sdio1        peri    0    BCM281XX_MASTER_CCU_SDIO1
+master  sdio2        peri    1    BCM281XX_MASTER_CCU_SDIO2
+master  sdio3        peri    2    BCM281XX_MASTER_CCU_SDIO3
+master  sdio4        peri    3    BCM281XX_MASTER_CCU_SDIO4
+master  dmac         peri    4    BCM281XX_MASTER_CCU_DMAC
+master  usb_ic       peri    5    BCM281XX_MASTER_CCU_USB_IC
+master  hsic2_48m    peri    6    BCM281XX_MASTER_CCU_HSIC_48M
+master  hsic2_12m    peri    7    BCM281XX_MASTER_CCU_HSIC_12M
+
+slave   uartb        peri    0    BCM281XX_SLAVE_CCU_UARTB
+slave   uartb2       peri    1    BCM281XX_SLAVE_CCU_UARTB2
+slave   uartb3       peri    2    BCM281XX_SLAVE_CCU_UARTB3
+slave   uartb4       peri    3    BCM281XX_SLAVE_CCU_UARTB4
+slave   ssp0         peri    4    BCM281XX_SLAVE_CCU_SSP0
+slave   ssp2         peri    5    BCM281XX_SLAVE_CCU_SSP2
+slave   bsc1         peri    6    BCM281XX_SLAVE_CCU_BSC1
+slave   bsc2         peri    7    BCM281XX_SLAVE_CCU_BSC2
+slave   bsc3         peri    8    BCM281XX_SLAVE_CCU_BSC3
+slave   pwm          peri    9    BCM281XX_SLAVE_CCU_PWM
+
+
+Device tree example:
+
+   clocks {
+   slave_ccu: slave_ccu {
+   compatible = "brcm,bcm11351-slave-ccu";
+   reg = <0x3e011000 0x0f00>;
+   #clock-cells = <1>;
+   clock-output-names = "uartb",
+"uartb2",
+"uartb3",
+"uartb4";
+   };
+   ref_crystal_clk: ref_crystal {
+   #clock-cells = <0>;
+   compatible = "fixed-clock";
+   clock-frequency = <2600>;
+   };
+   };
+   uart@3e002000 {
+   compatible = "brcm,bcm11351-dw-apb-uart", "snps,dw-apb-uart";
+   status = "disabled";
+   reg = <0x3e002000 0x1000>;
+
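
As a rough illustration of how a consumer's phandle-plus-specifier pair maps to a clock under this binding, here is a small Python model. The clock lists are copied from the table in the binding text; resolve_clock() and ccu_compatible() are hypothetical helpers, not part of the driver.

```python
# Clock names per CCU, in specifier (0-based index) order, as listed
# in the binding document's table.
CCU_CLOCKS = {
    "root": ["frac_1m"],
    "aon": ["hub_timer", "pmu_bsc", "pmu_bsc_var"],
    "hub": ["tmon_1m"],
    "master": ["sdio1", "sdio2", "sdio3", "sdio4",
               "dmac", "usb_ic", "hsic2_48m", "hsic2_12m"],
    "slave": ["uartb", "uartb2", "uartb3", "uartb4", "ssp0",
              "ssp2", "bsc1", "bsc2", "bsc3", "pwm"],
}


def ccu_compatible(ccu):
    """Build the compatible string for a CCU, e.g. 'brcm,bcm11351-slave-ccu'."""
    if ccu not in CCU_CLOCKS:
        raise KeyError("unknown CCU: " + ccu)
    return "brcm,bcm11351-" + ccu + "-ccu"


def resolve_clock(ccu, specifier):
    """Return the clock name selected by a CCU-unique 0-based specifier."""
    names = CCU_CLOCKS[ccu]
    if not 0 <= specifier < len(names):
        raise IndexError("specifier %d out of range for %s" % (specifier, ccu))
    return names[specifier]
```

Because the specifier is only an index into the CCU's clock-output-names list, the order of that list in the device tree has to match the table.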

[PATCH 0/3] clk: bcm281xx: define Broadcom kona clocks

2013-12-03 Thread Alex Elder

This series adds support for Kona clock control units (CCUs) and
clocks, used by Broadcom BCM281xx family SoCs.  Kona CCUs are
represented by nodes in the device tree, and the names of the clocks
provided by a CCU are included in its node.  Implementation details
of those clocks are defined in a C file separate from the code that
implements the functionality used by most of the common clock
framework.

This series depends on:
"Update Kona drivers to use clocks"
https://lkml.org/lkml/2013/11/14/450

This series (along with a version of those prerequisite patches) is
available in the "review/bcm-kona-clocks" branch of this git
repository:
git://git.linaro.org/people/elder/linux.git

-Alex

Alex Elder (3):
  clk: bcm281xx: define kona clock binding
  clk: bcm281xx: add initial clock framework support
  ARM: dts: bcm281xx: define real clocks

 .../devicetree/bindings/clock/bcm-kona-clock.txt   |   95 ++
 arch/arm/boot/dts/bcm11351.dtsi|  222 +++--
 drivers/clk/Kconfig|1 +
 drivers/clk/Makefile   |1 +
 drivers/clk/bcm/Kconfig|8 +
 drivers/clk/bcm/Makefile   |3 +
 drivers/clk/bcm/clk-bcm281xx.c |  416 
 drivers/clk/bcm/clk-kona-setup.c   |  774 +++
 drivers/clk/bcm/clk-kona.c | 1033
 drivers/clk/bcm/clk-kona.h |  416 
 include/dt-bindings/clock/bcm281xx.h   |   65 ++
 11 files changed, 2973 insertions(+), 61 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/clock/bcm-kona-clock.txt
 create mode 100644 drivers/clk/bcm/Kconfig
 create mode 100644 drivers/clk/bcm/Makefile
 create mode 100644 drivers/clk/bcm/clk-bcm281xx.c
 create mode 100644 drivers/clk/bcm/clk-kona-setup.c
 create mode 100644 drivers/clk/bcm/clk-kona.c
 create mode 100644 drivers/clk/bcm/clk-kona.h
 create mode 100644 include/dt-bindings/clock/bcm281xx.h

-- 1.7.9.5


Re: [OOPS, 3.13-rc2] null ptr in dio_complete()

2013-12-03 Thread Jens Axboe

On Wed, Dec 04 2013, Dave Chinner wrote:
> On Wed, Dec 04, 2013 at 12:58:38PM +1100, Dave Chinner wrote:
> > On Wed, Dec 04, 2013 at 08:59:40AM +1100, Dave Chinner wrote:
> > > Hi Jens,
> > > 
> > > Not sure who to direct this to or CC, so I figured you are the
> > > person to do that. I just had xfstests generic/299 (an AIO/DIO test)
> > > oops in dio_complete() like so:
> > > 
> > > [ 9650.586724] general protection fault:  [#1] SMP DEBUG_PAGEALLOC
> > > [ 9650.590131] Modules linked in:
> > > [ 9650.590630] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-rc2-dgc+ 
> > > #73
> > > [ 9650.590630] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> > > [ 9650.590630] task: 81f35480 ti: 81f24000 task.ti: 
> > > 81f24000
> > > [ 9650.590630] RIP: 0010:[]  [] 
> > > aio_complete+0xb1/0x1e0
> > > [ 9650.590630] RSP: 0018:88007f803cf8  EFLAGS: 00010003
> > > [ 9650.590630] RAX: 0086 RBX: 8800688ae000 RCX: 
> > > 00d6d6d6
> > > [ 9650.590630] RDX: 6b6b6b6b6b6b6b6b RSI: 1000 RDI: 
> > > 8800688ae1c4
> > > [ 9650.590630] RBP: 88007f803d28 R08:  R09: 
> > > 6b6b6b6c
> > > [ 9650.590630] R10: ff79a1a35779b009 R11: 0010 R12: 
> > > 88006953a540
> > > [ 9650.590630] R13: 1000 R14:  R15: 
> > > 8800688ae1c4
> > > [ 9650.590630] FS:  () GS:88007f80() 
> > > knlGS:
> > > [ 9650.590630] CS:  0010 DS:  ES:  CR0: 8005003b
> > > [ 9650.590630] CR2: 7ffa9c48448f CR3: 28061000 CR4: 
> > > 06f0
> > > [ 9650.590630] Stack:
> > > [ 9650.590630]  88007f803d18 1000 880068b80800 
> > > 1000
> > > [ 9650.590630]  0001 31f6d000 88007f803d68 
> > > 811ddae3
> > > [ 9650.590630]  0086 880068b80800  
> > > 880068b80828
> > > [ 9650.590630] Call Trace:
> > > [ 9650.590630]  
> > > [ 9650.590630]  [] dio_complete+0xa3/0x140
> > > [ 9650.590630]  [] dio_bio_end_aio+0x7a/0x110
> > > [ 9650.590630]  [] ? dio_bio_end_aio+0x5/0x110
> > > [ 9650.590630]  [] bio_endio+0x1d/0x30
> > > [ 9650.590630]  [] blk_mq_complete_request+0x5f/0x120
> > > [ 9650.590630]  [] __blk_mq_end_io+0x16/0x20
> > > [ 9650.590630]  [] blk_mq_end_io+0x68/0xd0
> > > [ 9650.590630]  [] virtblk_done+0x67/0x110
> > > [ 9650.590630]  [] vring_interrupt+0x35/0x60
> > > [ 9650.590630]  [] handle_irq_event_percpu+0x54/0x1e0
> > > [ 9650.590630]  [] handle_irq_event+0x48/0x70
> > > [ 9650.590630]  [] ? kvm_guest_apic_eoi_write+0x5/0x50
> > > [ 9650.590630]  [] handle_edge_irq+0x77/0x110
> > > [ 9650.590630]  [] handle_irq+0xbf/0x150
> > > [ 9650.590630]  [] ? handle_irq+0x5/0x150
> > > [ 9650.590630]  [] ? 
> > > atomic_notifier_call_chain+0x16/0x20
> > > [ 9650.590630]  [] do_IRQ+0x5a/0xe0
> > > [ 9650.590630]  [] common_interrupt+0x6d/0x6d
> > > [ 9650.590630]  
> > > [ 9650.590630]  [] ? native_safe_halt+0x6/0x10
> > > [ 9650.590630]  [] ? default_idle+0x5/0xc0
> > > [ 9650.590630]  [] default_idle+0x1f/0xc0
> > > [ 9650.590630]  [] arch_cpu_idle+0x26/0x30
> > > [ 9650.590630]  [] cpu_startup_entry+0x81/0x240
> > > [ 9650.590630]  [] rest_init+0x77/0x80
> > > [ 9650.590630]  [] start_kernel+0x3cd/0x3da
> > > [ 9650.590630]  [] ? repair_env_string+0x5e/0x5e
> > > [ 9650.590630]  [] ? early_idt_handlers+0x117/0x120
> > > [ 9650.590630]  [] x86_64_start_reservations+0x2a/0x2c
> > > [ 9650.590630]  [] x86_64_start_kernel+0x146/0x155
> > > [ 9650.590630] Code: e8 05 b5 8f 00 44 8b 8b c0 01 00 00 45 31 c0 48 8b 
> > > 93 98 00 00 00 41 83 c1 01 44 3b 8b 80 00 00 00 44 89 c9 45 0f 42 c1 c1 
> > > e9 07 <48> 8b 0c ca 65
> > > [ 9650.590630] RIP  [] aio_complete+0xb1/0x1e0
> > > [ 9650.590630]  RSP 
> > > 
> > > I'm not sure if it is related to blk_mq, virtio, dio or bio changes
> > > (or even something else), but I haven't seen anything like this
> > > before. I've only seen it once so far (haven't rerun the test yet at
> > > all).
> > 
> > And I just hit this from running xfs_repair which is doing
> > multithreaded direct IO directly on /dev/vdc:
> > 
> > [ 1776.508599] BUG: unable to handle kernel NULL pointer dereference at 
> > 0328
> > [ 1776.510446] IP: [] blk_account_io_done+0x6a/0x180
> > [ 1776.511762] PGD 38e75b067 PUD 41987d067 PMD 0
> > [ 1776.512577] Oops:  [#1] SMP
> > [ 1776.512577] Modules linked in:
> > [ 1776.512577] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-rc2-dgc+ #75
> > [ 1776.512577] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> > [ 1776.512577] task: 81f37480 ti: 81f26000 task.ti: 
> > 81f26000
> > [ 1776.512577] RIP: 0010:[]  [] 
> > blk_account_io_done+0x6a/0x180
> > [ 1776.512577] RSP: :88011bc03da8  EFLAGS: 00010046
> > [ 1776.512577] RAX: 1000 RBX:  RCX: 
> > 
> > [ 1776.512577] RDX:  RSI:

Re: [OOPS, 3.13-rc2] null ptr in dio_complete()

2013-12-03 Thread Dave Chinner

On Wed, Dec 04, 2013 at 12:58:38PM +1100, Dave Chinner wrote:
> On Wed, Dec 04, 2013 at 08:59:40AM +1100, Dave Chinner wrote:
> > Hi Jens,
> > 
> > Not sure who to direct this to or CC, so I figured you are the
> > person to do that. I just had xfstests generic/299 (an AIO/DIO test)
> > oops in dio_complete() like so:
> > 
> > [ 9650.586724] general protection fault:  [#1] SMP DEBUG_PAGEALLOC
> > [ 9650.590131] Modules linked in:
> > [ 9650.590630] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-rc2-dgc+ #73
> > [ 9650.590630] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> > [ 9650.590630] task: 81f35480 ti: 81f24000 task.ti: 
> > 81f24000
> > [ 9650.590630] RIP: 0010:[]  [] 
> > aio_complete+0xb1/0x1e0
> > [ 9650.590630] RSP: 0018:88007f803cf8  EFLAGS: 00010003
> > [ 9650.590630] RAX: 0086 RBX: 8800688ae000 RCX: 
> > 00d6d6d6
> > [ 9650.590630] RDX: 6b6b6b6b6b6b6b6b RSI: 1000 RDI: 
> > 8800688ae1c4
> > [ 9650.590630] RBP: 88007f803d28 R08:  R09: 
> > 6b6b6b6c
> > [ 9650.590630] R10: ff79a1a35779b009 R11: 0010 R12: 
> > 88006953a540
> > [ 9650.590630] R13: 1000 R14:  R15: 
> > 8800688ae1c4
> > [ 9650.590630] FS:  () GS:88007f80() 
> > knlGS:
> > [ 9650.590630] CS:  0010 DS:  ES:  CR0: 8005003b
> > [ 9650.590630] CR2: 7ffa9c48448f CR3: 28061000 CR4: 
> > 06f0
> > [ 9650.590630] Stack:
> > [ 9650.590630]  88007f803d18 1000 880068b80800 
> > 1000
> > [ 9650.590630]  0001 31f6d000 88007f803d68 
> > 811ddae3
> > [ 9650.590630]  0086 880068b80800  
> > 880068b80828
> > [ 9650.590630] Call Trace:
> > [ 9650.590630]  
> > [ 9650.590630]  [] dio_complete+0xa3/0x140
> > [ 9650.590630]  [] dio_bio_end_aio+0x7a/0x110
> > [ 9650.590630]  [] ? dio_bio_end_aio+0x5/0x110
> > [ 9650.590630]  [] bio_endio+0x1d/0x30
> > [ 9650.590630]  [] blk_mq_complete_request+0x5f/0x120
> > [ 9650.590630]  [] __blk_mq_end_io+0x16/0x20
> > [ 9650.590630]  [] blk_mq_end_io+0x68/0xd0
> > [ 9650.590630]  [] virtblk_done+0x67/0x110
> > [ 9650.590630]  [] vring_interrupt+0x35/0x60
> > [ 9650.590630]  [] handle_irq_event_percpu+0x54/0x1e0
> > [ 9650.590630]  [] handle_irq_event+0x48/0x70
> > [ 9650.590630]  [] ? kvm_guest_apic_eoi_write+0x5/0x50
> > [ 9650.590630]  [] handle_edge_irq+0x77/0x110
> > [ 9650.590630]  [] handle_irq+0xbf/0x150
> > [ 9650.590630]  [] ? handle_irq+0x5/0x150
> > [ 9650.590630]  [] ? atomic_notifier_call_chain+0x16/0x20
> > [ 9650.590630]  [] do_IRQ+0x5a/0xe0
> > [ 9650.590630]  [] common_interrupt+0x6d/0x6d
> > [ 9650.590630]  
> > [ 9650.590630]  [] ? native_safe_halt+0x6/0x10
> > [ 9650.590630]  [] ? default_idle+0x5/0xc0
> > [ 9650.590630]  [] default_idle+0x1f/0xc0
> > [ 9650.590630]  [] arch_cpu_idle+0x26/0x30
> > [ 9650.590630]  [] cpu_startup_entry+0x81/0x240
> > [ 9650.590630]  [] rest_init+0x77/0x80
> > [ 9650.590630]  [] start_kernel+0x3cd/0x3da
> > [ 9650.590630]  [] ? repair_env_string+0x5e/0x5e
> > [ 9650.590630]  [] ? early_idt_handlers+0x117/0x120
> > [ 9650.590630]  [] x86_64_start_reservations+0x2a/0x2c
> > [ 9650.590630]  [] x86_64_start_kernel+0x146/0x155
> > [ 9650.590630] Code: e8 05 b5 8f 00 44 8b 8b c0 01 00 00 45 31 c0 48 8b 93 
> > 98 00 00 00 41 83 c1 01 44 3b 8b 80 00 00 00 44 89 c9 45 0f 42 c1 c1 e9 07 
> > <48> 8b 0c ca 65
> > [ 9650.590630] RIP  [] aio_complete+0xb1/0x1e0
> > [ 9650.590630]  RSP 
> > 
> > I'm not sure if it is related to blk_mq, virtio, dio or bio changes
> > (or even something else), but I haven't seen anything like this
> > before. I've only seen it once so far (haven't rerun the test yet at
> > all).
> 
> And I just hit this from running xfs_repair which is doing
> multithreaded direct IO directly on /dev/vdc:
> 
> [ 1776.508599] BUG: unable to handle kernel NULL pointer dereference at 
> 0328
> [ 1776.510446] IP: [] blk_account_io_done+0x6a/0x180
> [ 1776.511762] PGD 38e75b067 PUD 41987d067 PMD 0
> [ 1776.512577] Oops:  [#1] SMP
> [ 1776.512577] Modules linked in:
> [ 1776.512577] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-rc2-dgc+ #75
> [ 1776.512577] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [ 1776.512577] task: 81f37480 ti: 81f26000 task.ti: 
> 81f26000
> [ 1776.512577] RIP: 0010:[]  [] 
> blk_account_io_done+0x6a/0x180
> [ 1776.512577] RSP: :88011bc03da8  EFLAGS: 00010046
> [ 1776.512577] RAX: 1000 RBX:  RCX: 
> 
> [ 1776.512577] RDX:  RSI: 0002 RDI: 
> 8800dba59fc0
> [ 1776.512577] RBP: 88011bc03db8 R08:  R09: 
> 88021a828928
> [ 1776.512577] R10:  R11: 0001 R12: 
> 
> [ 1776.512577] R13:  R14:

Re: [patch 1/2] mm, memcg: avoid oom notification when current needs access to memory reserves

2013-12-03 Thread Johannes Weiner

On Tue, Dec 03, 2013 at 03:50:41PM -0800, David Rientjes wrote:
> On Tue, 3 Dec 2013, Michal Hocko wrote:
> 
> > OK, as it seems that the notification part is too controversial, how
> > would you like the following? It reverts the notification part and still
> > solves the fault on exit path. I will prepare the full patch with the
> > changelog if this looks reasonable:
> 
> Um, no, that's not satisfactory because it obviously does the check after 
> mem_cgroup_oom_notify().  There is absolutely no reason why userspace 
> should be woken up when current simply needs access to memory reserves to 
> exit.  You can already get such notification by memory thresholds at the 
> memcg limit.
> 
> I'll repeat: Section 10 of Documentation/cgroups/memory.txt specifies what 
> userspace should do when waking up; one of those options is not "check if 
> the memcg is still actually oom in a short period of time once a charging 
> task with a pending SIGKILL or in the exit path has been able to exit."  
> Users of this interface typically also disable the memcg oom killer 
> through the same file, it's ludicrous to put the responsibility on 
> userspace to determine if the wakeup is actionable and requires it to 
> intervene in one of the methods listed in section 10.

Kind of a bummer that you haven't read anything I wrote...

But here is a patch that defers wakeups until we know for sure that
userspace action is required:

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f1a0ae6..cc6adac 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2254,8 +2254,17 @@ bool mem_cgroup_oom_synchronize(bool handle)
 
locked = mem_cgroup_oom_trylock(memcg);
 
+#if 0
+   /*
+* XXX: An unrelated task in the group might exit at any time,
+* making the OOM kill unnecessary.  We don't want to wake up
+* the userspace handler unless we are certain it needs to
+* intervene, so disable notifications until we solve the
+* halting problem.
+*/
if (locked)
mem_cgroup_oom_notify(memcg);
+#endif
 
if (locked && !memcg->oom_kill_disable) {
mem_cgroup_unmark_under_oom(memcg);

[BUG] Re: [PATCH v10 1/3] aerdrv: Trace Event for AER

2013-12-03 Thread rui wang

Resending adding Mauro's new Email address...


On 1/17/13, Lance Ortiz  wrote:
> This header file will define a new trace event that will be triggered when
> an AER event occurs.  The following data will be provided to the trace
> event.
>
> char * dev_name - The name of the slot where the device resides
>   ([domain:]bus:device.function).
>
> u32 status - Either the correctable or uncorrectable register
>  indicating what error or errors have been seen.
>
> u8 severity - error severity 0:NONFATAL 1:FATAL 2:CORRECTED
>
> The trace event will also provide a trace string that may look like:
>
> ":05:00.0 PCIe Bus Error:severity=Uncorrected (Non-Fatal), Poisoned
> TLP"
>
> v1-v2 Move header from include/ras/aer_event.h to
> include/trace/events/ras.h
> v3-v4 Cleaned up comments and commit header
> v4-v5 More cleanup remove () from if statement in print.
>   Renamed string define to be more specific.
> v5-v6 change TRACE_SYSTEM define to be ras and not aer.
>
> Signed-off-by: Lance Ortiz 
> Acked-by: Mauro Carvalho Chehab 
> Acked-by: Tony Luck 
> ---
>
>  include/trace/events/ras.h |   77
>  1 files changed, 77 insertions(+), 0 deletions(-)
>  create mode 100644 include/trace/events/ras.h
>
> diff --git a/include/trace/events/ras.h b/include/trace/events/ras.h
> new file mode 100644
> index 000..88b8783
> --- /dev/null
> +++ b/include/trace/events/ras.h
> @@ -0,0 +1,77 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM ras
> +
> +#if !defined(_TRACE_AER_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_AER_H
> +
> +#include 
> +#include 
> +
> +
> +/*
> + * PCIe AER Trace event
> + *
> + * These events are generated when hardware detects a corrected or
> + * uncorrected event on a PCIe device. The event report has
> + * the following structure:
> + *
> + * char * dev_name - The name of the slot where the device resides
> + *   ([domain:]bus:device.function).
> + * u32 status -  Either the correctable or uncorrectable register
> + *   indicating what error or errors have been seen
> + * u8 severity - error severity 0:NONFATAL 1:FATAL 2:CORRECTED
> + */
> +
> +#define aer_correctable_errors   \
> + {BIT(0),"Receiver Error"},  \
> + {BIT(6),"Bad TLP"}, \
> + {BIT(7),"Bad DLLP"},\
> + {BIT(8),"RELAY_NUM Rollover"},  \
> + {BIT(12),   "Replay Timer Timeout"},\
> + {BIT(13),   "Advisory Non-Fatal"}
> +
> +#define aer_uncorrectable_errors \
> + {BIT(4),"Data Link Protocol"},  \
> + {BIT(12),   "Poisoned TLP"},\
> + {BIT(13),   "Flow Control Protocol"},   \
> + {BIT(14),   "Completion Timeout"},  \
> + {BIT(15),   "Completer Abort"}, \
> + {BIT(16),   "Unexpected Completion"},   \
> + {BIT(17),   "Receiver Overflow"},   \
> + {BIT(18),   "Malformed TLP"},   \
> + {BIT(19),   "ECRC"},\
> + {BIT(20),   "Unsupported Request"}
> +
> +TRACE_EVENT(aer_event,
> + TP_PROTO(const char *dev_name,
> +  const u32 status,
> +  const u8 severity),
> +
> + TP_ARGS(dev_name, status, severity),
> +
> + TP_STRUCT__entry(
> + __string(   dev_name,   dev_name)
> + __field(u32,status  )
> + __field(u8, severity)
> + ),
> +
> + TP_fast_assign(
> + __assign_str(dev_name, dev_name);
> + __entry->status = status;
> + __entry->severity   = severity;
> + ),
> +
> + TP_printk("%s PCIe Bus Error: severity=%s, %s\n",
> + __get_str(dev_name),
> + __entry->severity == HW_EVENT_ERR_CORRECTED ? "Corrected" :
> + __entry->severity == HW_EVENT_ERR_FATAL ?
> + "Fatal" : "Uncorrected",
> + __entry->severity == HW_EVENT_ERR_CORRECTED ?
> + __print_flags(__entry->status, "|", aer_correctable_errors) :
> + __print_flags(__entry->status, "|", aer_uncorrectable_errors))
> +);

Here's a bug causing inconsistency between dmesg and the trace event output.
When dmesg says "severity=Corrected", the trace event says
"severity=Fatal". What happens is that HW_EVENT_ERR_CORRECTED is
defined in edac.h:

enum hw_event_mc_err_type {
HW_EVENT_ERR_CORRECTED,
HW_EVENT_ERR_UNCORRECTED,
HW_EVENT_ERR_FATAL,
HW_EVENT_ERR_INFO,
};

while aer_print_error() uses aer_error_severity_string[] defined as:

static const char *aer_error_severity_string[] = {
"Uncorrected (Non-Fatal)",
"Uncorrected (Fatal)",
"Corrected"
};

In this case dmesg is correct because
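
To make the mismatch concrete, here is a small Python model of the two decoders quoted above. It is only a sketch: the AER severity numbering is taken from the commit message ("0:NONFATAL 1:FATAL 2:CORRECTED"), the enum order from the edac.h excerpt, and the function names are hypothetical.

```python
# enum hw_event_mc_err_type, in the order quoted from edac.h.
(HW_EVENT_ERR_CORRECTED,
 HW_EVENT_ERR_UNCORRECTED,
 HW_EVENT_ERR_FATAL,
 HW_EVENT_ERR_INFO) = range(4)

# AER severity values as described in the commit message.
AER_NONFATAL, AER_FATAL, AER_CORRECTED = 0, 1, 2

# Table used by aer_print_error(), indexed by the AER severity.
AER_ERROR_SEVERITY_STRING = [
    "Uncorrected (Non-Fatal)",
    "Uncorrected (Fatal)",
    "Corrected",
]


def dmesg_severity(severity):
    """What dmesg prints: a direct index into the AER string table."""
    return AER_ERROR_SEVERITY_STRING[severity]


def trace_severity(severity):
    """What TP_printk prints: the same value compared against EDAC enums."""
    if severity == HW_EVENT_ERR_CORRECTED:
        return "Corrected"
    if severity == HW_EVENT_ERR_FATAL:
        return "Fatal"
    return "Uncorrected"


def compare_all():
    """Return (severity, dmesg text, trace text) for each AER severity."""
    return [(s, dmesg_severity(s), trace_severity(s))
            for s in (AER_NONFATAL, AER_FATAL, AER_CORRECTED)]
```

With these definitions a corrected AER error (severity 2) is reported as "Corrected" by the dmesg path and as "Fatal" by the trace path, which is the inconsistency reported here.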

Re: [RFC PATCH tip 0/5] tracing filters with BPF

2013-12-03 Thread Alexei Starovoitov

On Tue, Dec 3, 2013 at 4:01 PM, Andi Kleen  wrote:
> Alexei Starovoitov  writes:
>
> Can you do some performance comparison compared to e.g. ktap?
> How much faster is it?

imo the most interesting ktap scripts (like kmalloc-top.kp) need
tables and timers.
tables are almost ready for prime time, but timers I prefer to keep
out of kernel.
I would like bpf filter to fill tables with interesting data in kernel
up to predefined limit
and periodically read and clear the tables from userspace.
This way I will be able to do nettop.stp, iotop.stp like programs.
So I'm still thinking what should be clean kernel/user interface for
bpf-defined tables.
Format of keys and elements of the table is defined within bpf program.
During load of bpf program, the tables are allocated and bpf program
can now lookup/update into them. At the same time corresponding
userspace program can read tables of this particular bpf program over
netlink.
Creating its own debugfs files for every filter feels too slow and
feature limited, since files are all or nothing interface. Netlink
access to bpf tables feels cleaner. Userspace will use libmnl to
access them. Other ideas?
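
To illustrate the fill-in-kernel, read-and-clear-from-userspace pattern I have in mind, here is a toy Python model. It is not a proposed interface: BoundedTable and its methods are made-up names, and the netlink transport is left out entirely.

```python
class BoundedTable:
    """Toy model of a filter-owned table with a fixed entry limit."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = {}

    def update(self, key, delta=1):
        """Filter side: add delta to key's counter.

        Returns False (and drops the update) when the key is new and
        the table has already reached its predefined limit.
        """
        if key not in self._data and len(self._data) >= self.max_entries:
            return False
        self._data[key] = self._data.get(key, 0) + delta
        return True

    def read_and_clear(self):
        """Userspace side: take a snapshot, then empty the table."""
        snapshot = dict(self._data)
        self._data.clear()
        return snapshot
```

A top-style tool would call read_and_clear() on a timer in userspace, which is why the timer itself does not need to live in the kernel.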

In the mean time I'll do some simple
  trace probe:xx { print }
performance test…

> While it sounds interesting, I would strongly advise to make this
> capability only available to root. Traditionally lots of complex byte
> code languages which were designed to be "safe" and verifiable weren't
> really. e.g. i managed to crash things with "safe" systemtap multiple
> times. And we all know what happened to Java.
>
> So the likelihood of this having some hole somewhere (either in
> the byte code or in some library function) is high.

Tracing filters are for root only today and should stay this way.
As far as safety of bpf… hard to argue systemtap point ;)
Though existing bpf is generally accepted to be safe.
extended bpf needs time to prove itself.

Re: [PATCH v9] x86, apic, kexec, Documentation: Add disable_cpu_apic kernel parameter

2013-12-03 Thread HATAYAMA Daisuke


(2013/12/04 0:25), Vivek Goyal wrote:

On Tue, Dec 03, 2013 at 10:32:26AM +0900, HATAYAMA Daisuke wrote:

[..]


diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 50680a5..dd77bec 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -774,6 +774,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
disable=[IPV6]
See Documentation/networking/ipv6.txt.

+   disable_cpu_apicid= [X86,APIC,KEXEC,SMP]


Hi Hatayama,

We are almost there. A minor nit: why have we specified KEXEC here? The
disable_cpu_apicid parameter does not seem to depend on CONFIG_KEXEC.

Jerry, this patch looks good to me. Does it work on your system?



Because the primary user of the option is currently kexec/kdump only.

I referred to acpi_rsdp description:

acpi_rsdp=  [ACPI,EFI,KEXEC]
Pass the RSDP address to the kernel, mostly used
on machines running EFI runtime service to boot the
second kernel for kdump.

--
Thanks.
HATAYAMA, Daisuke


Re: [PATCH v8 1/2] PWM: atmel-pwm: add PWM controller driver

2013-12-03 Thread Bo Shen


Hi Thierry,

On 12/03/2013 05:43 PM, Thierry Reding wrote:

On Tue, Dec 03, 2013 at 11:09:12AM +0800, Bo Shen wrote:

On 12/02/2013 06:59 PM, Thierry Reding wrote:

On Mon, Nov 18, 2013 at 05:13:21PM +0800, Bo Shen wrote:

[...]

diff --git a/drivers/pwm/pwm-atmel.c b/drivers/pwm/pwm-atmel.c

[...]

+   /* Calculate the period cycles */
+   while (div > PWM_MAX_PRD) {
+   div = clk_rate / (1 << pres);
+   div = div * period_ns;
+   /* 1/Hz = 1 ns */


I don't think that comment is needed.


I was asked to add this.
I think keeping it won't hurt; what do you think?


I think it's obvious that you're converting from nanoseconds because of
the _ns suffix in period_ns. But if somebody requested this and everyone
else thinks it's useful, I'm okay with keeping it.
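
For reference, the period calculation in the quoted hunk can be sketched like this in Python. It assumes a 16-bit period register (PWM_MAX_PRD = 0xFFFF) and a maximum prescaler exponent of 10; neither limit is stated in this thread, and period_cycles() is a hypothetical name.

```python
NSEC_PER_SEC = 1_000_000_000
PWM_MAX_PRD = 0xFFFF   # assumed width of the period register
PRD_MAX_PRES = 10      # assumed largest prescaler exponent


def period_cycles(clk_rate_hz, period_ns):
    """Find the smallest prescaler whose period count fits the register.

    Returns (pres, prd) where the channel clock is clk_rate / 2**pres
    and prd is the number of channel-clock cycles in period_ns.
    """
    for pres in range(PRD_MAX_PRES + 1):
        # Divide the clock first, as the quoted loop does, then convert
        # the period from nanoseconds to cycles.
        prd = (clk_rate_hz >> pres) * period_ns // NSEC_PER_SEC
        if prd <= PWM_MAX_PRD:
            return pres, prd
    raise ValueError("period too long for the available prescalers")
```

With a 133 MHz clock and a 1 ms period, prescalers 0 and 1 give 133000 and 66500 cycles, both too large, so the search settles on prescaler 2 with 33250 cycles.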


+   if (test_bit(PWMF_ENABLED, &pwm->flags)) {
+   atmel_pwm_ch_writel(atmel_pwm, pwm->hwpwm, PWMV2_CDTYUPD, dty);
+   } else {
+   atmel_pwm_ch_writel(atmel_pwm, pwm->hwpwm, PWMV2_CDTY, dty);
+   atmel_pwm_ch_writel(atmel_pwm, pwm->hwpwm, PWMV2_CPRD, prd);
+   }
+}


Neither version 1 nor version 2 seem to be able to change the period
while the channel is enabled. Perhaps that should be checked for in
atmel_pwm_config() and an error (-EBUSY) returned?


The period is configured in the device tree, or in platform data on
non-device-tree systems. Nothing updates the period afterwards, so there
is no code to update it. Am I right? If not, please point it out.


The .config() operation allows the period to be specified. Just because
nobody currently changes it at runtime doesn't mean it can't be done.

It is also possible that whoever wrote the device tree or platform data
didn't know that the period must be the same for all PWM channels and
therefore might use different values. If you check for this, at least
they'll notice. If you don't check for it, then they may end up having
the period reconfigured behind their backs, which may cause parts of
their setup to behave unexpectedly.


Thanks for this information.
I will add code to handle changing the period.


Thierry



Best Regards,
Bo Shen
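
The check suggested in the review (refuse a period change while the channel is running) could look roughly like the sketch below. This is a user-space model, not the atmel-pwm driver: struct demo_pwm_channel and demo_pwm_config() are hypothetical names, and the real .config() callback has a different signature.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-channel state. */
struct demo_pwm_channel {
    bool enabled;       /* models the PWMF_ENABLED flag test in the patch */
    uint32_t period_ns; /* period currently programmed, 0 if never set */
};

/*
 * Accept a duty-cycle update at any time, but reject a period change on an
 * enabled channel with -EBUSY instead of silently reprogramming it.
 */
static int demo_pwm_config(struct demo_pwm_channel *ch,
                           uint32_t duty_ns, uint32_t period_ns)
{
    assert(ch != NULL);

    if (period_ns == 0 || duty_ns > period_ns)
        return -EINVAL;

    if (ch->enabled && period_ns != ch->period_ns)
        return -EBUSY;

    ch->period_ns = period_ns;
    /* A real driver would write the duty and period registers here. */
    return 0;
}
```

Returning an error makes a mismatched device tree or platform data entry visible to whoever wrote it, which is the point raised in the review.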

Re: [patch] mm: memcg: do not declare OOM from __GFP_NOFAIL allocations

2013-12-03 Thread Johannes Weiner

On Tue, Dec 03, 2013 at 03:40:13PM -0800, David Rientjes wrote:
> On Tue, 3 Dec 2013, Johannes Weiner wrote:
> 
> > > > Spin on which level? The whole point of this change was to not spin
> > > > forever because the caller might sit on top of other locks which might
> > > > prevent somebody else from dying although it has been killed.
> > > 
> > > See my question about the non-memcg page allocator behavior below.
> > 
> > No, please answer the question.
> > 
> 
> The question would be answered below, by having consistency in allocation 
> and charging paths between both the page allocator and memcg.
> 
> > > I'm not quite sure how significant of a point this is, though, because it 
> > > depends on the caller doing the __GFP_NOFAIL allocations that allow the 
> > > bypass.  If you're doing
> > > 
> > >   for (i = 0; i < 1 << 20; i++)
> > >   page[i] = alloc_page(GFP_NOFS | __GFP_NOFAIL);
> > 
> > Hyperbole serves no one.
> > 
> 
> Since this bypasses all charges to the root memcg in oom conditions as a 
> result of your patch, how do you ensure the "leakage" is contained to a 
> small amount of memory?  Are we currently just trusting the users of 
> __GFP_NOFAIL that they aren't allocating a large amount of memory?

Yes, as answered in my first reply to you:

---

> Ah, this is because of 3168ecbe1c04 ("mm: memcg: use proper memcg in limit 
> bypass") which just bypasses all of these allocations and charges the root 
> memcg.  So if allocations want to bypass memcg isolation they just have to 
> be __GFP_NOFAIL?

I don't think we have another option.

---

Is there a specific reason you keep repeating the same questions?

> > > I'm referring to the generic non-memcg page allocator behavior.  Forget 
> > > memcg for a moment.  What is the behavior in the _page_allocator_ for 
> > > GFP_NOFS | __GFP_NOFAIL?  Do we spin forever if reclaim fails or do we 
> > > bypass the per-zone min watermarks to allow it to allocate because "it 
> > > needs to succeed, it may be holding filesystem locks"?
> > > 
> > > It's already been acknowledged in this thread that no bypassing is done 
> > > in the page allocator and it just spins.  There's some handwaving saying 
> > > that since the entire system is oom that there is a greater chance that 
> > > memory will be freed by something else, but that's just handwaving and is 
> > > certainly not guaranteed.
> > 
> > Do you have another explanation of why this deadlock is not triggering
> > in the global case?  It's pretty obvious that there is a deadlock that
> > can not be resolved unless some unrelated task intervenes, just read
> > __alloc_pages_slowpath().
> > 
> > But we had a concrete bug report for memcg where there was no other
> > task to intervene.  One was stuck in the OOM killer waiting for the
> > victim to exit, the victim was stuck on locks that the killer held.
> > 
> 
> I believe the page allocator would be susceptible to the same deadlock if 
> nothing else on the system can reclaim memory and that belief comes from 
> code inspection that shows __GFP_NOFAIL is not guaranteed to ever succeed 
> in the page allocator as their charges now are (with your patch) in memcg.  
> I do not have an example of such an incident.

Me neither.

> > > So, my question again: why not bypass the per-zone min watermarks in the 
> > > page allocator?
> > 
> > I don't even know what your argument is supposed to be.  The fact that
> > we don't do it in the page allocator means that there can't be a bug
> > in memcg?
> > 
> 
> I'm asking if we should allow GFP_NOFS | __GFP_NOFAIL allocations in the 
> page allocator to bypass per-zone min watermarks after reclaim has failed 
> since the oom killer cannot be called in such a context so that the page 
> allocator is not susceptible to the same deadlock without a complete 
> depletion of memory reserves?

Yes, I think so.

> It's not an argument, it's a question.  Relax.

Right.
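
For readers skimming the thread, the behaviour under discussion can be modelled in a few lines. This is a toy model only, not kernel code: the struct, enum and function names are made up, and "DEMO_NOMEM" stands in for the whole reclaim and OOM path that a normal allocation would enter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup's page counter. */
struct demo_group {
    unsigned long usage;
    unsigned long limit;
};

enum demo_charge {
    DEMO_CHARGED,  /* charge fit under the group's limit */
    DEMO_BYPASSED, /* limit hit, nofail request charged to root instead */
    DEMO_NOMEM     /* limit hit, caller enters reclaim / OOM handling */
};

static enum demo_charge demo_try_charge(struct demo_group *grp,
                                        struct demo_group *root,
                                        unsigned long pages, bool nofail)
{
    assert(grp != NULL && root != NULL);

    if (grp->usage + pages <= grp->limit) {
        grp->usage += pages;
        return DEMO_CHARGED;
    }

    /* A nofail request must not fail or wait on an OOM kill: bypass. */
    if (nofail) {
        root->usage += pages;
        return DEMO_BYPASSED;
    }

    return DEMO_NOMEM;
}
```

In this model nothing bounds root->usage, which is the "leakage" concern raised above: the amount bypassed is limited only by how much the nofail callers choose to allocate.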

linux-next: Tree for Dec 4

2013-12-03 Thread Stephen Rothwell

Hi all,

Changes since 20131203:

The staging tree gained a conflict against the staging.current tree.

Non-merge commits (relative to Linus' tree): 2328
 2507 files changed, 92621 insertions(+), 65164 deletions(-)


I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final
link) and i386, sparc, sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

I am currently merging 210 trees (counting Linus' and 29 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (dea4f48a0a30 Merge branch 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds)
Merging fixes/master (8ae516aa8b81 Merge tag 'trace-fixes-v3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace)
Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" not depend on vmlinux)
Merging arc-current/for-curr (da990a4f2d5a ARC: [perf] Fix a few thinkos)
Merging arm-current/fixes (11d4bb1bd067 ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation)
Merging m68k-current/for-linus (77a42796786c m68k: Remove deprecated IRQF_DISABLED)
Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2)
Merging powerpc-merge/merge (721cb59e9d95 powerpc/windfarm: Fix XServe G5 fan control Makefile issue)
Merging sparc/master (1de425c7b271 sparc64: Fix build regression)
Merging net/master (988bf4f01e6a Merge branch 'cxgb4')
Merging ipsec/master (dff345c5c85d be2net: call napi_disable() for all event queues)
Merging sound-current/for-linus (0202e99c6910 ALSA: hda/realtek - Independent of model for HP)
Merging pci-current/for-linus (4bff6749905d PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev())
Merging wireless/master (a59b40b30f3f Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211)
Merging driver-core.current/driver-core-linus (dc1ccc48159d Linux 3.13-rc2)
Merging tty.current/tty-linus (39434abd942c n_tty: Fix missing newline echo)
Merging usb.current/usb-linus (eee52f9edd0f USB: switch maintainership of chipidea to Peter)
Merging staging.current/staging-linus (55ef003e4ae6 Merge tag 'iio-fixes-for-3.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus)
Merging char-misc.current/char-misc-linus (d0b00d3fb96d Merge tag 'extcon-linus-for-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-linus)
Merging input-current/for-linus (4ef38351d770 Input: usbtouchscreen - separate report and transmit buffer size handling)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe)
Merging crypto-current/master (8ec25c512916 crypto: testmgr - fix sglen in test_aead for case 'dst != src')
Merging ide/master (c2f7d1e103ef ide: pmac: remove unnecessary pci_set_drvdata())
Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff)
Merging sh-current/sh-fixes-for-linus (44033109e99c SH: Convert out[bwl] macros to inline functions)
Merging devicetree-current/devicetree/merge (1931ee143b0a Revert "drivers: of: add initialization code for dma reserved memory")
Merging rr-fixes/fixes (f6537f2f0eba scripts/kallsyms: filter symbols not in kernel address space)
Merging mfd-fixes/

Re: [PATCH] devtmpfs: Calling delete_path() only when necessary

2013-12-03 Thread Rob Landley


On 11/16/2013 02:15:23 AM, Axel Lin wrote:

The deleted variable is always 1 in current code.
Initialize deleted variable to be 0, so delete_path() will be called only
when necessary.

Signed-off-by: Axel Lin 


I'm not seeing this in linux-next, or a reply on the web archive.  
Assuming nobody's objected to this, you might want to forward it to  
triv...@kernel.org.


That said, you could describe what it _does_ a little more?

Thanks,

Rob
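
As a rough illustration of what the patch description claims, the sketch below models the flag pattern in user space. It is not the devtmpfs code: demo_handle_remove(), demo_delete_path() and the node_removed parameter are hypothetical, and the conditions under which the real code sets the flag are not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counts how often the stand-in for delete_path() runs. */
static int demo_delete_path_calls;

static void demo_delete_path(const char *nodename)
{
    (void)nodename;
    demo_delete_path_calls++;
}

/*
 * Hypothetical model of the removal path. With the flag starting at 0,
 * the directory cleanup only runs when the node was actually removed.
 * Per the patch description, the flag previously started at 1, so the
 * cleanup ran unconditionally.
 */
static void demo_handle_remove(const char *nodename, bool node_removed)
{
    int deleted = 0;

    assert(nodename != NULL);

    if (node_removed)
        deleted = 1;

    if (deleted)
        demo_delete_path(nodename);
}
```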

Re: [PATCH -tip v4 0/6] kprobes: introduce NOKPROBE_SYMBOL() and fixes crash bugs

2013-12-03 Thread Sandeepa Prabhu

On 4 December 2013 06:58, Masami Hiramatsu wrote:
> Hi,
> Here is version 4 of the NOKPROBE_SYMBOL series.
>
> In this version, I removed the cleanup patches and
> add bugfixes I've found, since those bugs will be
> critical.
> Rest of the cleanup and visible blacklists will be
> proposed later in another series.
>
> Oh, just one new thing: I added a new RFC patch which
> removes the dependency on notify_die() from the kprobes
> miss-hit/recovery path, since notify_die() involves
> locking and lockdep code which invokes a lot of heavy
> printk functions etc. This helped me to minimize the
> blacklist and provides more stability for kprobes.
> Actually, most int3 handlers are already called
> from do_int3 directly, so I think this change is acceptable
> too.
>
> Here are the updates to NOKPROBE_SYMBOL().
>  - Now _ASM_NOKPROBE() macro is introduced for assembly
>symbols on x86.
>  - Rename kprobe_blackpoint to kprobe_blacklist_entry
>and simplify it. Also NOKPROBE_SYMBOL() macro just
>saves the address of non-probe-able symbols.
>
> ---
>
> Masami Hiramatsu (6):

>   kprobes: Prohibit probing on .entry.text code
>   kprobes: Introduce NOKPROBE_SYMBOL() macro for blacklist
Hi Masami,
Is it a good idea to split the "arch/x86" code from the generic kernel changes?
Then we would only need to take the above two patches to verify it on arm64
or other platforms.

Thanks,
Sandeepa
>   [BUGFIX] kprobes/x86: Prohibit probing on debug_stack_*
>   [BUGFIX] x86: Prohibit probing on native_set_debugreg
>   [BUGFIX] x86: Prohibit probing on thunk functions and restore
>   [RFC] kprobes/x86: Call exception handlers directly from do_int3/do_debug
>
>
>  Documentation/kprobes.txt |   16 +
>  arch/x86/include/asm/asm.h|7 ++
>  arch/x86/include/asm/kprobes.h|2 +
>  arch/x86/kernel/cpu/common.c  |4 +
>  arch/x86/kernel/entry_32.S|   33 ---
>  arch/x86/kernel/entry_64.S|   20 ---
>  arch/x86/kernel/kprobes/core.c|   32 --
>  arch/x86/kernel/paravirt.c|5 ++
>  arch/x86/kernel/traps.c   |   10 +++
>  arch/x86/lib/thunk_32.S   |3 +
>  arch/x86/lib/thunk_64.S   |3 +
>  include/asm-generic/vmlinux.lds.h |9 +++
>  include/linux/kprobes.h   |   21 ++-
>  kernel/kprobes.c  |  113 
> -
>  kernel/sched/core.c   |1
>  15 files changed, 147 insertions(+), 132 deletions(-)
>
> --
> Masami HIRAMATSU
> IT Management Research Dept. Linux Technology Center
> Hitachi, Ltd., Yokohama Research Laboratory
> E-mail: masami.hiramatsu...@hitachi.com
>
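
The cover letter says the macro "just saves the address of non-probe-able symbols". A user-space sketch of that technique follows, assuming GCC or Clang with an ELF linker that provides __start_/__stop_ symbols for custom sections. The macro DEMO_NOKPROBE, the section name demo_nokprobe and the lookup helper are hypothetical and are not the definitions from the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Record a function's address in a dedicated section (GCC/Clang, ELF). */
#define DEMO_NOKPROBE(fn) \
    static const uintptr_t demo_nokprobe_##fn \
    __attribute__((used, section("demo_nokprobe"))) = (uintptr_t)(fn)

/* The linker defines these for sections whose name is a C identifier. */
extern const uintptr_t __start_demo_nokprobe[];
extern const uintptr_t __stop_demo_nokprobe[];

static volatile int demo_counter;

/* Two ordinary functions; only the first one is blacklisted. */
static void demo_entry_handler(void) { demo_counter += 1; }
static void demo_ordinary(void) { demo_counter += 2; }

DEMO_NOKPROBE(demo_entry_handler);

/* Walk the recorded addresses to decide whether fn may be probed. */
static bool demo_is_blacklisted(void (*fn)(void))
{
    const uintptr_t *p;

    assert(fn != NULL);

    for (p = __start_demo_nokprobe; p < __stop_demo_nokprobe; p++) {
        if (*p == (uintptr_t)fn)
            return true;
    }
    return false;
}
```

Keeping the annotation next to the function definition means the blacklist cannot drift out of date the way a hand-maintained central table can.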

Re: [PATCHv2 2/2] ACPI/platform: Add ACPI ID for Intel MBI device

2013-12-03 Thread Matthew Garrett

On Tue, Dec 03, 2013 at 06:44:52PM -0800, David E. Box wrote:
> On Wed, Dec 04, 2013 at 02:21:30AM +0000, Matthew Garrett wrote:
> > Well sure, but why do you need to be a platform device at all? This 
> > functionality was intended for cases where we already have a driver for 
> > the part that enumerated it via some other mechanism. If the driver's 
> > only intended for ACPI systems then why not just be an ACPI device?
> > 
> 
> It was my understanding that with ACPI 5.0 it was becoming more common to use
> ACPI IDs exclusively for device enumeration. I originally wrote this as an
> acpi_bus driver but Rafael advised me that the model is being phased out and
> suggested the platform model instead.

If you're not adding ACPI support to an existing platform driver, you 
shouldn't be adding entries to acpi_platform.c. I'm not actually happy 
that I merged the ideapad-laptop patch that did the same thing - I'm 
inclined to revert it, because this really is an ugly way to do things.

Rafael, why did we convert the AC driver this way? It means we have to 
keep track of ACPI IDs in multiple places, which is worth it when it 
avoids having to write a pile of new code (such as the sdhci case) but 
doesn't seem to provide benefits otherwise.

-- 
Matthew Garrett | mj...@srcf.ucam.org
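
The maintenance cost mentioned above (tracking ACPI IDs in more than one place) can be shown with a small user-space model. Everything here is hypothetical: the tables, the "DEMO" IDs and the function names are invented, and the real kernel structures and matching code differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct demo_id {
    const char *id; /* NULL terminates a table */
};

/* Table 1: central list deciding which ACPI nodes get a platform device. */
static const struct demo_id demo_platform_ids[] = {
    { "DEMO0001" },
    { "DEMO0002" },
    { NULL }
};

/* Table 2: the driver's own match table. */
static const struct demo_id demo_driver_ids[] = {
    { "DEMO0002" },
    { "DEMO0003" },
    { NULL }
};

static bool demo_id_in_table(const struct demo_id *table, const char *hid)
{
    assert(table != NULL && hid != NULL);

    for (; table->id != NULL; table++) {
        if (strcmp(table->id, hid) == 0)
            return true;
    }
    return false;
}

/* The driver only binds when the ID is present in BOTH tables. */
static bool demo_driver_will_bind(const char *hid)
{
    return demo_id_in_table(demo_platform_ids, hid) &&
           demo_id_in_table(demo_driver_ids, hid);
}
```

In this model "DEMO0003" is listed by the driver but never bound, because nobody added it to the central list; that is the kind of silent mismatch two tables invite.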

Re: [V3 00/10] perf: New conditional branch filter

2013-12-03 Thread Michael Ellerman

On Wed, 2013-10-16 at 12:26 +0530, Anshuman Khandual wrote:
> This patchset is the re-spin of the original branch stack sampling
> patchset which introduced the new PERF_SAMPLE_BRANCH_COND branch filter. This
> patchset also enables SW based branch filtering support for book3s powerpc
> platforms which have PMU HW backed branch stack sampling support.
> 
> Summary of code changes in this patchset:
> 
> (1) Introduces a new PERF_SAMPLE_BRANCH_COND branch filter
> (2) Add the "cond" branch filter options in the "perf record" tool
> (3) Enable PERF_SAMPLE_BRANCH_COND in X86 platforms
> (4) Enable PERF_SAMPLE_BRANCH_COND in POWER8 platform 
> (5) Update the documentation regarding "perf record" tool

Can you please address my comments and then resend patches 1-5. And make sure
you send them to the perf maintainers.

Those three touch the generic code, powerpc and x86, so we'll get those merged
first, and then focus on the remaining patches, which are powerpc specific.

cheers



Re: [PATCH] x86: Add check for number of available vectors before CPU down

2013-12-03 Thread rui wang

On 11/20/13, Prarit Bhargava  wrote:
> Second try at this ...
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791
>
> When a cpu is downed on a system, the irqs on the cpu are assigned to other
> cpus.  It is possible, however, that when a cpu is downed there aren't
> enough free vectors on the remaining cpus to account for the vectors from
> the cpu that is being downed.
>
> This results in an interesting "overflow" condition where irqs are
> "assigned" to a CPU but are not handled.
>
> For example, when downing cpus on a 1-64 logical processor system:
>
> 
> [  232.021745] smpboot: CPU 61 is now offline
> [  238.480275] smpboot: CPU 62 is now offline
> [  245.991080] [ cut here ]
> [  245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264
> dev_watchdog+0x246/0x250()
> [  246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out
> [  246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support
> sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp
> mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas
> scsi_transport_sas
> [  246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14
> [  246.044451] Hardware name: Intel Corporation S4600LH
> ../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
> [  246.057371]  0009 88081fa03d40 8164fbf6
> 88081fa0ee48
> [  246.065728]  88081fa03d90 88081fa03d80 81054ecc
> 88081fa13040
> [  246.074073]   88200cce 0040
> 
> [  246.082430] Call Trace:
> [  246.085174][] dump_stack+0x46/0x58
> [  246.091633]  [] warn_slowpath_common+0x8c/0xc0
> [  246.098352]  [] warn_slowpath_fmt+0x46/0x50
> [  246.104786]  [] dev_watchdog+0x246/0x250
> [  246.110923]  [] ?
> dev_deactivate_queue.constprop.31+0x80/0x80
> [  246.119097]  [] call_timer_fn+0x3a/0x110
> [  246.125224]  [] ? update_process_times+0x6f/0x80
> [  246.132137]  [] ?
> dev_deactivate_queue.constprop.31+0x80/0x80
> [  246.140308]  [] run_timer_softirq+0x1f0/0x2a0
> [  246.146933]  [] __do_softirq+0xe0/0x220
> [  246.152976]  [] call_softirq+0x1c/0x30
> [  246.158920]  [] do_softirq+0x55/0x90
> [  246.164670]  [] irq_exit+0xa5/0xb0
> [  246.170227]  [] smp_apic_timer_interrupt+0x4a/0x60
> [  246.177324]  [] apic_timer_interrupt+0x6a/0x70
> [  246.184041][] ? cpuidle_enter_state+0x5b/0xe0
> [  246.191559]  [] ? cpuidle_enter_state+0x57/0xe0
> [  246.198374]  [] cpuidle_idle_call+0xbd/0x200
> [  246.204900]  [] arch_cpu_idle+0xe/0x30
> [  246.210846]  [] cpu_startup_entry+0xd0/0x250
> [  246.217371]  [] rest_init+0x77/0x80
> [  246.223028]  [] start_kernel+0x3ee/0x3fb
> [  246.229165]  [] ? repair_env_string+0x5e/0x5e
> [  246.235787]  [] x86_64_start_reservations+0x2a/0x2c
> [  246.242990]  [] x86_64_start_kernel+0xf8/0xfc
> [  246.249610] ---[ end trace fb74fdef54d79039 ]---
> [  246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx
> timeout
> [  246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter
> Last login: Mon Nov 11 08:35:14 from 10.18.17.119
> [root@(none) ~]# [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
> [  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow
> Control: RX/TX
> [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
> [  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow
> Control: RX/TX
>
> (last lines keep repeating.  ixgbe driver is dead until module reload.)
>
> If the downed cpu has more vectors than are free on the remaining cpus on
> the system, it is possible that some vectors are "orphaned" even though they
> are assigned to a cpu.  In this case, since the ixgbe driver had a watchdog,
> the watchdog fired and notified that something was wrong.
>
> This patch adds a function, check_vectors(), to compare the number of
> vectors on the CPU going down with the number of vectors available on
> the system.  If there aren't enough vectors for the CPU to go down, an
> error is returned and propagated back to userspace.
>
> Signed-off-by: Prarit Bhargava 
> Cc: x...@kernel.org
> ---
>  arch/x86/include/asm/irq.h |1 +
>  arch/x86/kernel/irq.c  |   33 +
>  arch/x86/kernel/smpboot.c  |6 ++
>  3 files changed, 40 insertions(+)
>
> diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
> index 0ea10f27..dfd7372 100644
> --- a/arch/x86/include/asm/irq.h
> +++ b/arch/x86/include/asm/irq.h
> @@ -25,6 +25,7 @@ extern void irq_ctx_init(int cpu);
>
>  #ifdef CONFIG_HOTPLUG_CPU
>  #include 
> +extern int check_vectors(void);
>  extern void fixup_irqs(void);
>  extern void irq_force_complete_move(int);
>  #endif
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 22d0687..ea5fa19 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -262,6 +262,39 @@ __visible void
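
The quoted hunk is cut off in this archive, so the body of check_vectors() is not shown. Independently of the patch, the comparison its description outlines can be modelled as below. This is a user-space sketch under simplifying assumptions: struct demo_cpu and demo_check_vectors() are hypothetical, every vector is assumed movable to any online CPU (real IRQ affinity is ignored), and -ENOSPC is an arbitrary choice of error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU vector accounting; assumes used <= capacity. */
struct demo_cpu {
    bool online;
    unsigned int used;     /* vectors currently assigned to this CPU */
    unsigned int capacity; /* vectors this CPU can hold in total */
};

/*
 * Return 0 if the vectors in use on CPU 'down' fit into the free slots of
 * the other online CPUs, or -ENOSPC if taking it offline would orphan some.
 */
static int demo_check_vectors(const struct demo_cpu *cpus, size_t ncpus,
                              size_t down)
{
    unsigned int free_elsewhere = 0;
    size_t i;

    assert(cpus != NULL && down < ncpus);

    for (i = 0; i < ncpus; i++) {
        if (i == down || !cpus[i].online)
            continue;
        free_elsewhere += cpus[i].capacity - cpus[i].used;
    }

    return cpus[down].used > free_elsewhere ? -ENOSPC : 0;
}
```

Running the check before the CPU goes down lets the request fail cleanly instead of leaving a device such as the ixgbe adapter in the log above without a working interrupt.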

Re: [PATCH v4 07/12] efi: passing kexec necessary efi data via setup_data

2013-12-03 Thread Dave Young

Hi, Toshi

> Oh, I think I now understand what the issue was.  The z420 firmware
> updates the SMBIOS table address in the EFI system table to a virtual
> address after calling EFI SetVirtualAddressMap.  So, you are passing the
> original physical address of the SMBIOS table from the 1st kernel to the
> 2nd kernel to put it back to physical.  Is that right? 

Right.

Thanks
Dave

Re: [PATCHv2 2/2] ACPI/platform: Add ACPI ID for Intel MBI device

2013-12-03 Thread David E. Box

On Wed, Dec 04, 2013 at 02:21:30AM +0000, Matthew Garrett wrote:
> On Tue, Dec 03, 2013 at 06:17:03PM -0800, David E. Box wrote:
> > This is per the requirement in Documentation/acpi/enumeration.txt:
> > 
> > "Currently the kernel is not able to automatically determine from which ACPI
> > device it should make the corresponding platform device so we need to add
> > the ACPI device explicitly to acpi_platform_device_ids list defined in
> > drivers/acpi/acpi_platform.c"
> 
> Well sure, but why do you need to be a platform device at all? This 
> functionality was intended for cases where we already have a driver for 
> the part that enumerated it via some other mechanism. If the driver's 
> only intended for ACPI systems then why not just be an ACPI device?
> 

It was my understanding that with ACPI 5.0 it was becoming more common to use
ACPI IDs exclusively for device enumeration. I originally wrote this as an
acpi_bus driver but Rafael advised me that the model is being phased out and
suggested the platform model instead.

> -- 
> Matthew Garrett | mj...@srcf.ucam.org

[GIT PULL] Squashfs bug fixes for 3.13

2013-12-03 Thread Phillip Lougher


Hi Linus,

Please consider pulling the following Squashfs bug fix.

Thanks

Phillip

The following changes since commit ed4f381ec15e5f11724cdbc68cffd2c22d1eaebd:

  Squashfs: Check stream is not NULL in decompressor_multi.c (2013-11-20 03:59:20 +0000)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next.git tags/squashfs-fixes

for you to fetch changes up to 6d565409503f4e1f74ac30de14e8c91a2b826cd8:

  Squashfs: fix failure to unlock pages on decompress error (2013-11-24 01:02:50 +0000)


Just a single bug fix to the new "directly decompress into the
page cache" code.


Phillip Lougher (1):
  Squashfs: fix failure to unlock pages on decompress error

 fs/squashfs/file_direct.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

[ANNOUNCE] 3.2.53-rt75

2013-12-03 Thread Steven Rostedt


Dear RT Folks,

I'm pleased to announce the 3.2.53-rt75 stable release.


This release is just an update to the new stable 3.2.53 version
and no RT specific changes have been made.


You can get this release via the git tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git

  branch: v3.2-rt
  Head SHA1: 31851e0bb2263c249a09a95e10eca720b5cdc4dd


Or to build 3.2.53-rt75 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.2.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.2.53.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.2/patch-3.2.53-rt75.patch.xz




Enjoy,

-- Steve
