Re: [PATCH] iwl4965: Enable checking of format strings

2015-02-12 Thread Mark Rustad

On 2/12/15 2:20 AM, Rasmus Villemoes wrote:
> Rather weak arguments, but I have three of them :-)

Yes, weak. All three.

> (1) If I'm reading some code and spot a non-constant format
> argument, I sometimes track back to see how e.g. fmt_value is
> defined. If I then see it's a macro, I immediately think "ok, the
> compiler is doing type-checking". If it is a const char[], I have
> to remember that gcc also does it in that case (as opposed to for
> example const char*const).

GCC should check in both cases. The case you are replacing was not
const char * const, but only const char *. Still, the compiler really
should check either form. Theoretically the pointer in the latter case
could be changed, but the initial constant value is a good indication
of what the parameters are expected to be. There is no real reason for
the compiler not to check it.
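
To make the three spellings under discussion concrete, here is a minimal sketch. The names FMT_MACRO, fmt_array, fmt_ptr and format_value are made up for illustration and do not come from the iwl4965 patch. Building it with -Wformat (and optionally -Wformat-nonliteral) is one way to see which forms a given compiler version actually checks; the sketch itself does not assume either answer.

```c
#include <stdio.h>

/* Three ways to spell the same format string (illustrative names). */
#define FMT_MACRO "%10u\n"
static const char fmt_array[] = "%10u\n";
static const char *const fmt_ptr = "%10u\n";

/*
 * Format value into buf using one of the three spellings:
 *   which == 0: macro (string literal at the call site)
 *   which == 1: const char[] array
 *   otherwise:  const char *const pointer
 * Returns whatever snprintf() returns.
 */
int format_value(char *buf, size_t len, int which, unsigned int value)
{
	switch (which) {
	case 0:
		return snprintf(buf, len, FMT_MACRO, value);
	case 1:
		return snprintf(buf, len, fmt_array, value);
	default:
		return snprintf(buf, len, fmt_ptr, value);
	}
}
```

All three calls produce the same output at run time; the only difference the thread is concerned with is what the compiler can verify at build time.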

> (2) The names of these variables themselves may end up wasting a
> few bytes in the image.

Maybe in a debug image, but they should be stripped from any normal
image. Really not a factor.

> (3) gcc/the linker doesn't merge identical const char[] arrays
> across translation units. It also doesn't consider their tails for
> merging with string literals. So although these specific strings
> are unlikely to appear elsewhere, a string such as "%10u\n" or
> "max\n" couldn't be merged with one of the above.

I haven't checked, but there is no theoretical reason that const char
[] items could not be merged exactly as the literals are. Considering
the boundaries the compiler guys push on optimization, doing such
merging would be tame by comparison (speculative stores make me crazy).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/21] Introduce ACPI for ARM64 based on ACPI 5.1

2015-02-12 Thread Hanjun Guo

Hi Jonathan,

On 2015-02-13 08:50, Jonathan (Zhixiong) Zhang wrote:

On 02/02/2015 05:45 AM, Hanjun Guo wrote:

From: Al Stone 

Two more documentation files are also being added:
(1) A verbatim copy of the "Why ACPI on ARM?" blog posting by Grant Likely,
 which is also summarized in arm-acpi.txt, and

(2) A section by section review of the ACPI spec (acpi_object_usage.txt)
 to note recommendations and prohibitions on the use of the numerous
 ACPI tables and objects.  This sets out the current expectations of
 the firmware by Linux very explicitly (or as explicitly as I can, for
 now).


[snip]

+ERST   Section 18.5 (signature == "ERST")
+   == Error Record Serialization Table ==
+   Must be supplied if RAS support is provided by the platform.  It
+   is recommended this table be supplied.


The above text related to ERST table could lead to misunderstanding.
Following is what the ACPI spec (section 18.5) says:
"The error record serialization feature is used to save and retrieve
hardware error information to and from a persistent store. OSPM interacts
with the platform through a platform interface. On UEFI-based platforms, the
UEFI runtime variable services can be used to carry out error record
persistence operations. On non-UEFI based platforms, the ACPI solution
described below is used."


Thanks for the reminder; it is well documented in the spec, as you
mentioned here.



When RAS support is provided by the platform, ERST table may not be supplied
when it is UEFI-based and when UEFI run time service provides the ability to
save and retrieve hardware error information to and from a persistent store
(UEFI spec section 7.2.3). Therefore, following text might be more accurate:
" On a platform that supports RAS, this table must be supplied if it is not
UEFI-based; if it is UEFI-based, this table may be supplied, consult your
firmware vendor if you are not sure.


We can scan all the ACPI tables to see if we have one, so we just need
to scan the tables and then we will know whether the ERST table is available.


When this table is not present, UEFI
run time service will be utilized to save and retrieve hardware error
information to and from a persistent store."
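
The selection rule in the proposed wording can be sketched as a small userspace model. This is only an illustration: the enum, the function names and the flat array of 4-byte signatures are assumptions made for the example, and it does not use the kernel's ACPI table API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Where hardware error records would be persisted (illustrative enum). */
enum error_store {
	STORE_ERST,      /* ACPI ERST table is present */
	STORE_UEFI_VARS, /* fall back to UEFI runtime variable services */
	STORE_NONE       /* neither mechanism is available */
};

/* ACPI table signatures are 4 ASCII bytes with no terminating NUL. */
static bool has_table(const char (*sigs)[4], size_t n, const char *want)
{
	for (size_t i = 0; i < n; i++) {
		if (memcmp(sigs[i], want, 4) == 0)
			return true;
	}
	return false;
}

/* Scan the table list once; prefer ERST, else UEFI if the platform has it. */
enum error_store pick_error_store(const char (*sigs)[4], size_t n,
				  bool uefi_based)
{
	if (has_table(sigs, n, "ERST"))
		return STORE_ERST;
	return uefi_based ? STORE_UEFI_VARS : STORE_NONE;
}
```

With this model, a UEFI-based platform that lists only FACP and APIC would end up with STORE_UEFI_VARS, which matches the fallback described in the proposed text.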


Other than that, the comments look fine to me :)

Thanks
Hanjun


Re: [PATCH 02/13] clk: mediatek: Add initial common clock support for Mediatek SoCs.

2015-02-12 Thread Tomasz Figa
Hi,

Let me add some suggestions inline.

On Mon, Feb 9, 2015 at 7:47 PM, Sascha Hauer  wrote:
> From: James Liao 
>
> This patch adds common clock support for Mediatek SoCs, including plls,
> muxes and clock gates.

[snip]

> +static int mtk_cg_enable(struct clk_hw *hw)
> +{
> +   mtk_cg_clr_bit(hw);
> +
> +   return 0;
> +}
> +
> +static void mtk_cg_disable(struct clk_hw *hw)
> +{
> +   mtk_cg_set_bit(hw);
> +}
> +
> +static int mtk_cg_enable_inv(struct clk_hw *hw)
> +{
> +   mtk_cg_set_bit(hw);
> +
> +   return 0;
> +}
> +
> +static void mtk_cg_disable_inv(struct clk_hw *hw)
> +{
> +   mtk_cg_clr_bit(hw);
> +}

Instead of duplicating the ops, couldn't you add a flag or something
to mtk_clk_gate struct and then act appropriately in the ops? Also,
see below.

> +
> +const struct clk_ops mtk_clk_gate_ops_setclr = {
> +   .is_enabled = mtk_cg_bit_is_cleared,
> +   .enable = mtk_cg_enable,
> +   .disable= mtk_cg_disable,
> +};
> +
> +const struct clk_ops mtk_clk_gate_ops_setclr_inv = {
> +   .is_enabled = mtk_cg_bit_is_set,
> +   .enable = mtk_cg_enable_inv,
> +   .disable= mtk_cg_disable_inv,
> +};
> +
> +struct clk *mtk_clk_register_gate(
> +   const char *name,
> +   const char *parent_name,
> +   struct regmap *regmap,
> +   int set_ofs,
> +   int clr_ofs,
> +   int sta_ofs,
> +   u8 bit,
> +   const struct clk_ops *ops)

Instead of passing the ops here you would have some flags or even just
a single bool inverted. Then the ops struct could be made static.
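
A rough userspace sketch of the single-flag idea, to show that one set of ops can cover both polarities. The struct, the "inverted" field and the function names are hypothetical; the register is modelled as a plain integer rather than regmap set/clr writes.

```c
#include <stdbool.h>
#include <stdint.h>

/**
 * struct gate_model - userspace stand-in for a clock gate (illustration only)
 * @sta: models the gate status register
 * @bit: bit position of this gate within @sta
 * @inverted: hypothetical flag; true if a set bit means "clock enabled"
 */
struct gate_model {
	uint32_t sta;
	uint8_t bit;
	bool inverted;
};

/* Set or clear the gate bit; stands in for the set/clr register writes. */
static void gate_write_bit(struct gate_model *g, bool set)
{
	if (set)
		g->sta |= UINT32_C(1) << g->bit;
	else
		g->sta &= ~(UINT32_C(1) << g->bit);
}

/* Normal gates are enabled by clearing the bit, inverted ones by setting it. */
int gate_enable(struct gate_model *g)
{
	gate_write_bit(g, g->inverted);
	return 0;
}

void gate_disable(struct gate_model *g)
{
	gate_write_bit(g, !g->inverted);
}

bool gate_is_enabled(const struct gate_model *g)
{
	bool bit_set = (g->sta >> g->bit) & 1u;

	return bit_set == g->inverted;
}
```

With the polarity stored in the gate itself, a single ops table would be enough and would not need to be exported.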

Also it would be nice to have a kerneldoc-style comment documenting
arguments of this function. Same thing applies to other structs added
by this and related patches and non-static functions.

Also, regarding CodingStyle: I believe it is not kernel coding style to put
every argument on its own line when several of them would fit on one line.
The same applies to other functions added by this and related patches
using this convention.

> +{
> +   struct mtk_clk_gate *cg;
> +   struct clk *clk;
> +   struct clk_init_data init;
> +
> +   cg = kzalloc(sizeof(*cg), GFP_KERNEL);
> +   if (!cg)
> +   return ERR_PTR(-ENOMEM);
> +
> +   init.name = name;
> +   init.flags = CLK_SET_RATE_PARENT;
> +   init.parent_names = parent_name ? &parent_name : NULL;
> +   init.num_parents = parent_name ? 1 : 0;
> +   init.ops = ops;
> +
> +   cg->regmap = regmap;
> +   cg->set_ofs = set_ofs;
> +   cg->clr_ofs = clr_ofs;
> +   cg->sta_ofs = sta_ofs;
> +   cg->bit = bit;
> +
> +   cg->hw.init = &init;
> +
> +   clk = clk_register(NULL, &cg->hw);
> +   if (IS_ERR(clk))
> +   kfree(cg);
> +
> +   return clk;
> +}
> diff --git a/drivers/clk/mediatek/clk-gate.h b/drivers/clk/mediatek/clk-gate.h
> new file mode 100644
> index 000..a44dcbf
> --- /dev/null
> +++ b/drivers/clk/mediatek/clk-gate.h
> @@ -0,0 +1,49 @@
> +/*
> + * Copyright (c) 2014 MediaTek Inc.
> + * Author: James Liao 

Might not be necessary, but maybe the other people (all or some of
them) from signed-off-by should be added to this and other copyright
statements?

> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#ifndef __DRV_CLK_GATE_H
> +#define __DRV_CLK_GATE_H
> +
> +/*
> + * This is a private header file. DO NOT include it except clk-*.c.
> + */

I believe the above comment is unnecessary, because the file is
already located in drivers/clk/mediatek.

> +#include 
> +#include 
> +#include 
> +
> +struct mtk_clk_gate {

It would be nice to have a kerneldoc-style comment describing fields
of this struct.

> +   struct clk_hw   hw;
> +   struct regmap   *regmap;
> +   int set_ofs;
> +   int clr_ofs;
> +   int sta_ofs;
> +   u8  bit;
> +};
> +
> +#define to_clk_gate(_hw) container_of(_hw, struct mtk_clk_gate, hw)

I believe static inline is preferred to macros for such helpers, due
to increased type safety.
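
For illustration, this is roughly what the static inline form looks like in a self-contained userspace model. The struct names carry a _model suffix because they are stand-ins, and offsetof() arithmetic replaces the kernel's container_of() so the example builds outside the kernel tree.

```c
#include <stddef.h>

/* Stand-ins for the kernel structures (illustration only). */
struct clk_hw_model {
	int dummy;
};

struct mtk_clk_gate_model {
	int set_ofs;
	struct clk_hw_model hw;
};

/*
 * Typed replacement for the macro: the compiler rejects any argument
 * that is not a struct clk_hw_model pointer, which a macro would not.
 */
static inline struct mtk_clk_gate_model *to_clk_gate(struct clk_hw_model *hw)
{
	return (struct mtk_clk_gate_model *)
		((char *)hw - offsetof(struct mtk_clk_gate_model, hw));
}
```

Passing, say, a pointer to the outer struct by mistake becomes a compile-time diagnostic here instead of a silent miscalculation.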

> +
> +extern const struct clk_ops mtk_clk_gate_ops_setclr;
> +extern const struct clk_ops mtk_clk_gate_ops_setclr_inv;
> +
> +struct clk *mtk_clk_register_gate(
> +   const char *name,
> +   const char *parent_name,
> +   struct regmap *regmap,
> +   int set_ofs,
> +   int clr_ofs,
> +   int sta_ofs,
> +   u8 bit,
> +   const struct clk_ops *ops);
> +
> +#endif /* 

Re: [PATCH 1/1] Staging: dgnc: dgnc_tty: code style improvements

2015-02-12 Thread Dan Carpenter
On Thu, Feb 12, 2015 at 11:18:50PM -0800, tolga ceylan wrote:
> Just noticed this warning in all dgnc_* files:
> 
>  *NOTE TO LINUX KERNEL HACKERS:  DO NOT REFORMAT THIS CODE!
>  *
>  *This is shared code between Digi's CVS archive and the
>  *Linux Kernel sources.
>  *Changing the source just for reformatting needlessly breaks
>  *our CVS diff history.
>  *
>  *Send any bug fixes/changes to:  Eng.Linux at digi dot com.
>  *Thank you.
> 
> Seems unusual. Also, get_maintainer.pl does not return any digi dot com
> addresses.


You can delete that comment.

regards,
dan carpenter



Re: [PATCH 10/11] ARM: sti: always enable RESET_CONTROLLER

2015-02-12 Thread Patrice Chotard

Hi Arnd

On 02/12/2015 08:42 PM, Arnd Bergmann wrote:

A lot of the sti device drivers require reset controller support,
but do not all have individual 'depends on RESET_CONTROLLER'
statements. Using 'select' here once avoids a lot of build errors
resulting from this.

Signed-off-by: Arnd Bergmann 
Cc: Maxime Coquelin 
Cc: Srinivas Kandagatla 
Cc: Patrice Chotard 
---
  arch/arm/mach-sti/Kconfig | 1 +
  1 file changed, 1 insertion(+)

diff --git a/arch/arm/mach-sti/Kconfig b/arch/arm/mach-sti/Kconfig
index 8825bc9e2553..3b1ac463a494 100644
--- a/arch/arm/mach-sti/Kconfig
+++ b/arch/arm/mach-sti/Kconfig
@@ -13,6 +13,7 @@ menuconfig ARCH_STI
select ARM_ERRATA_775420
select PL310_ERRATA_753970 if CACHE_L2X0
select PL310_ERRATA_769419 if CACHE_L2X0
+   select RESET_CONTROLLER
help
  Include support for STiH41x SOCs like STiH415/416 using the device tree
  for discovery


Acked-by: Patrice Chotard 

Thanks



Re: [PATCH v2 2/3] clk: Add __clk_hw_set_clk helper function

2015-02-12 Thread Javier Martinez Canillas
Hello Stephen,

On 02/12/2015 08:55 PM, Stephen Boyd wrote:
> On 02/12/15 05:58, Javier Martinez Canillas wrote:
>> After the clk API change to return a per-user clock instance, both the
>> struct clk_core and struct clk pointers from the hw clock needs to be
>> assigned to clock that share the same state.
>>
>> In the future the struct clk_core will be removed and this is going to
> 
> s/clk_core/clk/?
>

hrmm, I thought that the plan was to eventually merge clk_core into clk_hw.

In any case, if I got it backwards then I guess that the commit message
could be fixed up by Mike when the patches are applied?

>> change again so to avoid having to change the assignments twice in all
>> the drivers, add a helper function to have an indirection level.
>>
>> Signed-off-by: Javier Martinez Canillas 
>> ---
> 
> Reviewed-by: Stephen Boyd 
> 

Thanks a lot for your review.
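
For readers following the thread, here is a minimal userspace sketch of the kind of indirection helper the quoted commit message describes. The struct is a stand-in and the function body is an assumption about what such a helper does, not a copy of the patch.

```c
/* Opaque stand-ins; only pointers to these are ever handled here. */
struct clk;
struct clk_core;

/* Model of the hw clock, holding both pointers that must stay in sync. */
struct clk_hw_model {
	struct clk_core *core;
	struct clk *clk;
};

/*
 * Copy both pointers in one place, so drivers sharing state between two
 * hw clocks do not have to be touched again if the fields change.
 */
static inline void clk_hw_set_clk_model(struct clk_hw_model *dst,
					const struct clk_hw_model *src)
{
	dst->clk = src->clk;
	dst->core = src->core;
}
```

The point of the indirection is that a later change to the struct layout only needs to update the helper, not every driver that used to assign the fields by hand.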

Best regards,
Javier


[PATCH] mmc: dw_mmc: rockchip: add support MMC_CAP_RUNTIME_RESUME capability

2015-02-12 Thread Addy Ke
To support HS200 and UHS mode, mmc core will call init_card() to
execute tuning:
- sdio: init_card can be executed at runtime resume.
- sd and mmc: init_card can be executed at resume or runtime resume,
  which depends on MMC_CAP_RUNTIME_RESUME capability.

On the rk3288 SoC, the host gets a DRTO (data read timeout) interrupt when
it sends the command to read tuning data. Each timeout takes more than 111ms:
drto_ms = drto_clks * 1000 / bus_hz = 111ms.

The total tuning time is then more than 400ms.

So we should add the MMC_CAP_RUNTIME_RESUME capability to execute tuning
at runtime resume. Only then can we pass the resume test.
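
The timeout arithmetic above can be checked with a few lines of C. The function name is made up for the example. The inputs in the note below (a 24-bit timeout count of 0xFFFFFF and a 150 MHz bus clock) are assumptions that happen to reproduce the 111ms figure; the commit message itself does not state them.

```c
#include <stdint.h>

/*
 * Convert a timeout expressed in card clock cycles to milliseconds:
 *   drto_ms = drto_clks * 1000 / bus_hz
 * A 64-bit intermediate avoids overflow of drto_clks * 1000.
 */
uint32_t drto_ms(uint32_t drto_clks, uint32_t bus_hz)
{
	if (bus_hz == 0)
		return 0; /* avoid division by zero on an unset clock */

	return (uint32_t)(((uint64_t)drto_clks * 1000u) / bus_hz);
}
```

Under those assumed inputs, drto_ms(0xFFFFFF, 150000000) evaluates to 111 with integer division, so a handful of timeouts during tuning is enough to exceed 400ms.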

Signed-off-by: Addy Ke 
---
 drivers/mmc/host/dw_mmc-rockchip.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/mmc/host/dw_mmc-rockchip.c b/drivers/mmc/host/dw_mmc-rockchip.c
index e2a726a..e5f57b5 100644
--- a/drivers/mmc/host/dw_mmc-rockchip.c
+++ b/drivers/mmc/host/dw_mmc-rockchip.c
@@ -76,12 +76,20 @@ static int dw_mci_rockchip_init(struct dw_mci *host)
return 0;
 }
 
+/* Common capabilities of RK3288 SoC */
+static unsigned long dw_mci_rk3288_dwmmc_caps[4] = {
+   MMC_CAP_RUNTIME_RESUME, /* emmc */
+   MMC_CAP_RUNTIME_RESUME, /* sdmmc */
+   0, /* sdio0 */
+   0, /* sdio1 */
+};
 static const struct dw_mci_drv_data rk2928_drv_data = {
.prepare_command= dw_mci_rockchip_prepare_command,
.init   = dw_mci_rockchip_init,
 };
 
 static const struct dw_mci_drv_data rk3288_drv_data = {
+   .caps   = dw_mci_rk3288_dwmmc_caps,
.prepare_command= dw_mci_rockchip_prepare_command,
.set_ios= dw_mci_rk3288_set_ios,
.setup_clock= dw_mci_rk3288_setup_clock,
-- 
1.8.3.2




Re: [PATCH v2 3/3] clk: Replace explicit clk assignment with __clk_hw_set_clk

2015-02-12 Thread Javier Martinez Canillas
Hello Stephen,

On 02/12/2015 08:55 PM, Stephen Boyd wrote:
> On 02/12/15 05:58, Javier Martinez Canillas wrote:
>>
>> The changes were made using the following coccinelle semantic patch:
>>
>> @i@
>> @@
>>
>> @depends on i@
>> identifier dst;
>> @@
>>
>> - dst->clk = hw->clk;
>> + __clk_hw_set_clk(dst, hw);
>>
>> @depends on i@
>> identifier dst;
>> @@
>>
>> - dst->hw.clk = hw->clk;
>> + __clk_hw_set_clk(&dst->hw, hw);
>>
>> Fixes: 035a61c314eb3 ("clk: Make clk API return per-user struct clk instances")
>> Signed-off-by: Javier Martinez Canillas 
> 
> Reviewed-by: Stephen Boyd 
> 

Thanks a lot for your review.

> Did you run this on all files that include clk-provider.h? I hope there
> aren't similar situations in arch/arm/ for example.
> 

Yes, I did run spatch against all the files that include clk-provider.h,
but the only matches were in the drivers/clk files changed by $subject.

Best regards,
Javier


Re: [PATCH 1/1] Staging: dgnc: dgnc_tty: code style improvements

2015-02-12 Thread tolga ceylan

On 02/12/2015 10:20 PM, Joe Perches wrote:

On Thu, 2015-02-12 at 21:58 -0800, Tolga Ceylan wrote:

On Wed, Feb 11, 2015 at 2:36 AM, Dan Carpenter  wrote:

That looks kind of uglier than before.  Please run your patch through
scripts/checkpatch.pl --strict.

[]

Running with --strict helped, but now I'm also getting warnings for
camel case usage.


You can use --strict --ignore=camelcase


If I try to fix camel case, then
the patch will get much larger spanning many dgnc_* files. I can
proceed with this if you think it is valuable/acceptable.


I suggest not.




Just noticed this warning in all dgnc_* files:

 *  NOTE TO LINUX KERNEL HACKERS:  DO NOT REFORMAT THIS CODE!
 *
 *  This is shared code between Digi's CVS archive and the
 *  Linux Kernel sources.
 *  Changing the source just for reformatting needlessly breaks
 *  our CVS diff history.
 *
 *  Send any bug fixes/changes to:  Eng.Linux at digi dot com.
 *  Thank you.

Seems unusual. Also, get_maintainer.pl does not return any digi dot com
addresses.


[PATCH v2] perf: fix building error in x86_64

2015-02-12 Thread He Kuang
When building with ARCH=x86_64, perf fails to compile with the following error:

tests/builtin-test.o:(.data+0x158): undefined reference to `test__perf_time_to_tsc'
collect2: error: ld returned 1 exit status
Makefile.perf:632: recipe for target 'perf' failed
...

This is caused by commit c6e5e9fbc3ea1 ("perf tools: Fix building error
in x86_64 when dwarf unwind is on"): the ARCH test in Makefile.perf
conflicts with the __x86_64__ check in tests/builtin-test.c.
On x86/x86_64 platforms, ARCH should always be overridden to x86, while
IS_64_BIT reflects the actual architecture.

Signed-off-by: He Kuang 
---
 tools/perf/config/Makefile.arch | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/config/Makefile.arch b/tools/perf/config/Makefile.arch
index ff95a68..836ba44 100644
--- a/tools/perf/config/Makefile.arch
+++ b/tools/perf/config/Makefile.arch
@@ -29,3 +29,7 @@ ifeq ($(LP64), 1)
 else
   IS_64_BIT := 0
 endif
+
+ifeq ($(ARCH), x86_64)
+  override ARCH := x86
+endif
-- 
2.2.0.33.gc18b867



Re: [PATCH v5] perf: Use monotonic clock as a source for timestamps

2015-02-12 Thread Adrian Hunter
On 12/02/15 12:28, Peter Zijlstra wrote:
> On Thu, Feb 12, 2015 at 12:04:54PM +0200, Adrian Hunter wrote:
>> On 11/02/15 18:12, Peter Zijlstra wrote:
>>>
>>> How about something like the below? I _think_ it should mostly work for
>>> x86, where the tsc is a 64bit wide cycle counter.
>>
>> It would have to be based on CLOCK_MONOTONIC_RAW not CLOCK_MONOTONIC 
> 
> Why?

In the CLOCK_MONOTONIC case, the components of the calculation (mult and
shift etc) are subject to change, so the calculation becomes increasingly
inaccurate the greater the time between reading those values and reading
TSC or capturing perf events.
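
To illustrate why a changing mult matters, here is a small sketch of a mult/shift cycle-to-nanosecond conversion, using the quotient/remainder split so the multiplication does not overflow for large cycle counts. The function names and the numbers in the note below are made up for the example; real mult and shift values come from the kernel.

```c
#include <stdint.h>

/*
 * Convert a cycle count to nanoseconds with a mult/shift pair:
 *   ns = (cyc * mult) >> shift
 * computed in two parts to keep the intermediate products small.
 */
uint64_t cycles_to_ns(uint64_t cyc, uint32_t mult, uint16_t shift)
{
	uint64_t quot = cyc >> shift;
	uint64_t rem = cyc & (((uint64_t)1 << shift) - 1);

	return quot * mult + ((rem * mult) >> shift);
}

/*
 * Error introduced by converting with a stale mult (mult_then) when the
 * kernel has since moved to mult_now. The error grows with cyc.
 */
int64_t conversion_drift_ns(uint64_t cyc, uint32_t mult_then,
			    uint32_t mult_now, uint16_t shift)
{
	return (int64_t)cycles_to_ns(cyc, mult_now, shift) -
	       (int64_t)cycles_to_ns(cyc, mult_then, shift);
}
```

With shift = 10, a one-unit change in mult (1024 to 1025) already shifts a 1,000,000-cycle interval by 976 ns in this model, and the error scales with the length of the interval.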

Accuracy is important for matching sideband events against Intel PT. e.g.
did a mmap event happen before or after a given TSC timestamp.

Adding another sample value (Pawel's patch 3) is more accurate and simpler
to understand. It just needs to be extended to allow TSC.

> 
>> and you would have to check the clocksource is TSC.
> 
> It implicitly does that; it has that sched_clock_stable() thing, but
> yeah I suppose someone could change the clocksource even though the tsc
> is stable.
> 
> Not using TSC when its available is quite crazy though.. but sure.
> 
>> Why is CLOCK_MONOTONIC preferred anyway - I would have thought any
>> adjustment would skew performance timings?
> 
> Because you can do inter-machine stuff with MONOTONIC and that's
> entirely impossible with MONO_RAW.

Ok, the man page does not make it sound as bad as that.



[PULL] modules-next

2015-02-12 Thread Rusty Russell
The following changes since commit f8de05ca38b7bce4079b52002a6817e9582e3e01:

  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux (2015-01-23 06:53:06 +1200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux.git tags/modules-next-for-linus

for you to fetch changes up to 9cc019b8c94fa59e02fd82f15f7b7d689e35c190:

  module: Replace over-engineered nested sleep (2015-02-11 15:02:04 +1030)


Trivial cleanups, mainly.

Cheers,
Rusty.


Andrey Tsyvarev (1):
  kernel/module.c: Free lock-classes if parse_args failed

Marcel Holtmann (1):
  module: Remove double spaces in module verification taint message

Peter Zijlstra (2):
  module: Annotate nested sleep in resolve_symbol()
  module: Replace over-engineered nested sleep

Rabin Vincent (1):
  module: set ksymtab/kcrctab* section addresses to 0x0

 kernel/module.c   | 47 ++-
 scripts/module-common.lds | 20 ++--
 2 files changed, 28 insertions(+), 39 deletions(-)


Re: [PATCH v2] usb: dwc2: Register interrupt handler only once gadget is correctly initialized

2015-02-12 Thread Romain Perier
No problem

Regards,
Romain

2015-02-13 3:47 GMT+01:00 John Youn :
> On 2/12/2015 4:42 AM, Romain Perier wrote:
>> ping
>>
>> 2015-02-06 17:50 GMT+01:00 Romain Perier :
>>> Don't register interrupt handler before usb gadget is correctly initialized.
>>> For some embedded platforms which don't have a usb-phy, it crashes the driver
>>> because an interrupt is emitted with non-initialized hardware.
>>> According to devm_request_irq documentation, an interrupt can be emitted
>>> at any time once the interrupt is registered, so we have to care about driver
>>> and hardware initialization.
>>>
>>> Signed-off-by: Romain Perier 
>>> ---
>>>
>>> Changes for v2: fix typos in commit log
>>>
>>>  drivers/usb/dwc2/platform.c | 17 +
>>>  1 file changed, 9 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
>>> index ae095f0..b26cf8c 100644
>>> --- a/drivers/usb/dwc2/platform.c
>>> +++ b/drivers/usb/dwc2/platform.c
>>> @@ -196,14 +196,6 @@ static int dwc2_driver_probe(struct platform_device *dev)
>>> return irq;
>>> }
>>>
>>> -   dev_dbg(hsotg->dev, "registering common handler for irq%d\n",
>>> -   irq);
>>> -   retval = devm_request_irq(hsotg->dev, irq,
>>> - dwc2_handle_common_intr, IRQF_SHARED,
>>> - dev_name(hsotg->dev), hsotg);
>>> -   if (retval)
>>> -   return retval;
>>> -
>>> res = platform_get_resource(dev, IORESOURCE_MEM, 0);
>>> hsotg->regs = devm_ioremap_resource(&dev->dev, res);
>>> if (IS_ERR(hsotg->regs))
>>> @@ -237,6 +229,15 @@ static int dwc2_driver_probe(struct platform_device *dev)
>>> retval = dwc2_gadget_init(hsotg, irq);
>>> if (retval)
>>> return retval;
>>> +
>>> +dev_dbg(hsotg->dev, "registering common handler for irq%d\n",
>>> +irq);
>>> +retval = devm_request_irq(hsotg->dev, irq,
>>> +dwc2_handle_common_intr, IRQF_SHARED,
>>> +dev_name(hsotg->dev), hsotg);
>>> +if (retval)
>>> +return retval;
>>> +
>>> retval = dwc2_hcd_init(hsotg, irq, params);
>>> if (retval)
>>> return retval;
>
> Hi,
>
> I'm going to be away until Wednesday, Feb 18. I'll take a look at
> this and other pending dwc2 patches at that time.
>
> Regards,
> John
>
>


[PATCH 1/1] Staging: iio: meter: ade7854-i2c: code style improvements

2015-02-12 Thread Tolga Ceylan
Code reformatting based on checkpatch.pl with --strict:
1) Lines over 80 characters were fixed
2) "Alignment should match open parenthesis" cases corrected
3) Comparison to NULL rewritten as !indio_dev

Signed-off-by: Tolga Ceylan 
---
 drivers/staging/iio/meter/ade7854-i2c.c | 39 +
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/iio/meter/ade7854-i2c.c b/drivers/staging/iio/meter/ade7854-i2c.c
index 5b33c7f..fcb4714 100644
--- a/drivers/staging/iio/meter/ade7854-i2c.c
+++ b/drivers/staging/iio/meter/ade7854-i2c.c
@@ -16,8 +16,8 @@
 #include "ade7854.h"
 
 static int ade7854_i2c_write_reg_8(struct device *dev,
-   u16 reg_address,
-   u8 value)
+  u16 reg_address,
+  u8 value)
 {
int ret;
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
@@ -35,8 +35,8 @@ static int ade7854_i2c_write_reg_8(struct device *dev,
 }
 
 static int ade7854_i2c_write_reg_16(struct device *dev,
-   u16 reg_address,
-   u16 value)
+   u16 reg_address,
+   u16 value)
 {
int ret;
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
@@ -55,8 +55,8 @@ static int ade7854_i2c_write_reg_16(struct device *dev,
 }
 
 static int ade7854_i2c_write_reg_24(struct device *dev,
-   u16 reg_address,
-   u32 value)
+   u16 reg_address,
+   u32 value)
 {
int ret;
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
@@ -76,8 +76,8 @@ static int ade7854_i2c_write_reg_24(struct device *dev,
 }
 
 static int ade7854_i2c_write_reg_32(struct device *dev,
-   u16 reg_address,
-   u32 value)
+   u16 reg_address,
+   u32 value)
 {
int ret;
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
@@ -98,8 +98,8 @@ static int ade7854_i2c_write_reg_32(struct device *dev,
 }
 
 static int ade7854_i2c_read_reg_8(struct device *dev,
-   u16 reg_address,
-   u8 *val)
+ u16 reg_address,
+ u8 *val)
 {
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ade7854_state *st = iio_priv(indio_dev);
@@ -124,8 +124,8 @@ out:
 }
 
 static int ade7854_i2c_read_reg_16(struct device *dev,
-   u16 reg_address,
-   u16 *val)
+  u16 reg_address,
+  u16 *val)
 {
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ade7854_state *st = iio_priv(indio_dev);
@@ -150,8 +150,8 @@ out:
 }
 
 static int ade7854_i2c_read_reg_24(struct device *dev,
-   u16 reg_address,
-   u32 *val)
+  u16 reg_address,
+  u32 *val)
 {
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ade7854_state *st = iio_priv(indio_dev);
@@ -176,8 +176,8 @@ out:
 }
 
 static int ade7854_i2c_read_reg_32(struct device *dev,
-   u16 reg_address,
-   u32 *val)
+  u16 reg_address,
+  u32 *val)
 {
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ade7854_state *st = iio_priv(indio_dev);
@@ -195,21 +195,22 @@ static int ade7854_i2c_read_reg_32(struct device *dev,
if (ret)
goto out;
 
-   *val = (st->rx[0] << 24) | (st->rx[1] << 16) | (st->rx[2] << 8) | st->rx[3];
+   *val = (st->rx[0] << 24) | (st->rx[1] << 16)
+   | (st->rx[2] << 8) | st->rx[3];
 out:
mutex_unlock(&st->buf_lock);
return ret;
 }
 
 static int ade7854_i2c_probe(struct i2c_client *client,
-   const struct i2c_device_id *id)
+const struct i2c_device_id *id)
 {
int ret;
struct ade7854_state *st;
struct iio_dev *indio_dev;
 
indio_dev = devm_iio_device_alloc(&client->dev, sizeof(*st));
-   if (indio_dev == NULL)
+   if (!indio_dev)
return -ENOMEM;
st = iio_priv(indio_dev);
i2c_set_clientdata(client, indio_dev);
-- 
2.3.0



[PATCH V4] x86 spinlock: Fix memory corruption on completing completions

2015-02-12 Thread Raghavendra K T
Paravirt spinlock clears slowpath flag after doing unlock.
As explained by Linus currently it does:
prev = *lock;
add_smp(&lock->tickets.head, TICKET_LOCK_INC);

/* add_smp() is a full mb() */

if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
__ticket_unlock_slowpath(lock, prev);

which is *exactly* the kind of things you cannot do with spinlocks,
because after you've done the "add_smp()" and released the spinlock
for the fast-path, you can't access the spinlock any more.  Exactly
because a fast-path lock might come in, and release the whole data
structure.

Linus suggested that we should not do any writes to lock after unlock(),
and we can move slowpath clearing to fastpath lock.

So this patch implements the fix with:
1. Moving slowpath flag to head (Oleg):
Unlocked locks don't care about the slowpath flag; therefore we can keep
it set after the last unlock, and clear it again on the first (try)lock.
-- this removes the write after unlock. note that keeping slowpath flag would
result in unnecessary kicks.
By moving the slowpath flag from the tail to the head ticket we also avoid
the need to access both the head and tail tickets on unlock.

2. use xadd to avoid read/write after unlock that checks the need for
unlock_kick (Linus):
We further avoid the need for a read-after-release by using xadd;
the prev head value will include the slowpath flag and indicate if we
need to do PV kicking of suspended spinners -- on modern chips xadd
isn't (much) more expensive than an add + load.
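
The xadd idea in point 2 can be sketched in portable C11, where atomic_fetch_add plays the role of xadd. This is a simplified model and not the patch itself: the struct, the constants (flag in bit 0, ticket increment of 2) and the function name are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SLOWPATH_FLAG 1u /* assumed: flag lives in bit 0 of head */
#define MODEL_LOCK_INC      2u /* assumed: tickets advance in steps of 2 */

struct ticket_lock_model {
	_Atomic uint16_t head;
	_Atomic uint16_t tail;
};

/*
 * Release the lock with a single fetch-and-add. The returned old head
 * value already tells us whether the slowpath flag was set, so the lock
 * memory is never read or written again after the release.
 * Returns true if the caller should kick a suspended waiter.
 */
bool model_unlock_needs_kick(struct ticket_lock_model *lock)
{
	uint16_t prev = atomic_fetch_add(&lock->head, MODEL_LOCK_INC);

	return (prev & MODEL_SLOWPATH_FLAG) != 0;
}
```

The key property is that the decision to kick is made from the value returned by the releasing instruction, not from a second access to memory that another CPU may already have freed.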

Result:
 setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest)
 benchmark  overcommit  %improve
 kernbench  1x          -0.13
 kernbench  2x           0.02
 dbench     1x          -1.77
 dbench     2x          -0.63

[Jeremy: hinted missing TICKET_LOCK_INC for kick]
[Oleg: Moving slowpath flag to head, ticket_equals idea]
[PeterZ: Detailed changelog]

Reported-by: Sasha Levin 
Suggested-by: Linus Torvalds 
Signed-off-by: Raghavendra K T 
---
 arch/x86/include/asm/spinlock.h | 91 -
 arch/x86/kernel/kvm.c   | 10 +++--
 arch/x86/xen/spinlock.c | 10 +++--
 3 files changed, 56 insertions(+), 55 deletions(-)

potential TODO:
 * The whole patch could be split into: 1. move slowpath flag
 2. fix memory corruption in completion problem ??

Changes since V3:
  - Detailed changelog (PeterZ)
  - Replace ACCESS_ONCE with READ_ONCE (oleg)
  - Add xen changes (Oleg)
  - Correct break logic in unlock_wait() (Oleg)

Changes since V2:
  - Move the slowpath flag to head, this enables xadd usage in unlock code
    and in turn we can get rid of read/write after unlock (Oleg)
  - usage of ticket_equals (Oleg)

Changes since V1:
  - Add missing TICKET_LOCK_INC before unlock kick (fixes hang in overcommit: 
Jeremy).
  - Remove SLOWPATH_FLAG clearing in fast lock. (Jeremy)
  - clear SLOWPATH_FLAG in arch_spin_value_unlocked during comparison.
 Note: The current implementation is still based on avoiding writes after unlock.
  We could still have a potential invalid memory read. (Sasha)

 Result:
 setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest)
base = 3_19_rc7

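3_19_rc7_spinfix_v3
+-----+------------+----------+------------+----------+-----------+
 kernbench (Time taken in sec, lower is better)
+-----+------------+----------+------------+----------+-----------+
       base         %stdev     patched      %stdev     %improve
+-----+------------+----------+------------+----------+-----------+
 1x      54.2300     3.0652      54.3008     4.0366    -0.13056
 2x      90.1883     5.5509      90.1650     6.4336     0.02583
+-----+------------+----------+------------+----------+-----------+
+-----+------------+----------+------------+----------+-----------+
 dbench (Throughput, higher is better)
+-----+------------+----------+------------+----------+-----------+
       base         %stdev     patched      %stdev     %improve
+-----+------------+----------+------------+----------+-----------+
 1x    7029.9188     2.5952    6905.0712     4.4737    -1.77595
 2x    3254.2075    14.8291    3233.7137    26.8784    -0.62976
+-----+------------+----------+------------+----------+-----------+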

 (These are the results I got with the patches. I believe there may
 be some small overhead from xadd etc., but overall it looks fine;
 a thorough test may still be needed.)

diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 625660f..646a1a3 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -46,7 +46,7 @@ static __always_inline bool static_key_false(struct 
static_key *key);
 
 static inline void __ticket_enter_slowpath(arch_spinlock_t *lock)
 {
-   set_bit(0, (volatile unsigned long *)&lock->tickets.tail);
+   set_bit(0, (volatile unsigned long *)&lock->tickets.head);
 }
 
 #else  /* !CONFIG_PARAVIRT_SPINLOCKS */
@@ -60,10 +60,30 @@ static inline void __ticket_unlock_kick(arch_spinlock_t 
*lock,
 }
 
 #endif /* CONFIG_PARAVIRT_SPINLOCKS 

Re: [RFC 07/16] mm/page_isolation: watch out zone range overlap

2015-02-12 Thread Gioh Kim

> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index c8778f7..883e78d 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -210,8 +210,8 @@ int undo_isolate_page_range(unsigned long start_pfn, 
> unsigned long end_pfn,
>* Returns 1 if all pages in the range are isolated.
>*/
>   static int
> -__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
> -   bool skip_hwpoisoned_pages)
> +__test_page_isolated_in_pageblock(struct zone *zone, unsigned long pfn,
> + unsigned long end_pfn, bool skip_hwpoisoned_pages)
>   {
>   struct page *page;
>   
> @@ -221,6 +221,9 @@ __test_page_isolated_in_pageblock(unsigned long pfn, 
> unsigned long end_pfn,
>   continue;
>   }
>   page = pfn_to_page(pfn);
> + if (page_zone(page) != zone)
> + break;
> +
>   if (PageBuddy(page)) {
>   /*
>* If race between isolatation and allocation happens,
> @@ -281,7 +284,7 @@ int test_pages_isolated(unsigned long start_pfn, unsigned 
> long end_pfn,
>   /* Check all pages are free or marked as ISOLATED */
>   zone = page_zone(page);
>   spin_lock_irqsave(&zone->lock, flags);
> - ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn,
> + ret = __test_page_isolated_in_pageblock(zone, start_pfn, end_pfn,
>   skip_hwpoisoned_pages);
>   spin_unlock_irqrestore(&zone->lock, flags);
>   return ret ? 0 : -EBUSY;
> 

What about checking the zone in test_pages_isolated() instead?
That check would run a little earlier and without taking the zone lock.

@@ -273,8 +273,14 @@ int test_pages_isolated(unsigned long start_pfn, unsigned 
long end_pfn,
 * are not aligned to pageblock_nr_pages.
 * Then we just check migratetype first.
 */
+
+   zone = page_zone(__first_valid_page(start_pfn, pageblock_nr_pages));
+
for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
page = __first_valid_page(pfn, pageblock_nr_pages);
+
+   if (page_zone(page) != zone)
+   break;
if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
break;
}
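
To make the idea concrete outside the kernel, here is a small userspace
model of the loop above. It is only a sketch: struct mock_page, its zone_id
field and first_zone_mismatch() are invented for illustration, and a
negative zone_id stands in for "no valid page found". The model assumes
the page lookup can fail and skips such entries before comparing zones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the owning zone id.
 * A negative id models a hole where no valid page exists. */
struct mock_page {
	int zone_id;
};

/*
 * Walk [start, end) in steps of 'step' and return the first index whose
 * page belongs to a different zone than the first valid page seen.
 * Returns 'end' if the whole range stays within one zone.
 */
static size_t first_zone_mismatch(const struct mock_page *pages,
				  size_t start, size_t end, size_t step)
{
	int zone = -1;	/* unknown until the first valid page is seen */
	size_t i;

	assert(pages != NULL && step > 0);
	for (i = start; i < end; i += step) {
		if (pages[i].zone_id < 0)
			continue;	/* hole: nothing to compare */
		if (zone < 0)
			zone = pages[i].zone_id;
		else if (pages[i].zone_id != zone)
			return i;	/* range crosses into another zone */
	}
	return end;
}
```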
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] Staging: dgnc: dgnc_tty: code style improvements

2015-02-12 Thread Joe Perches
On Thu, 2015-02-12 at 21:58 -0800, Tolga Ceylan wrote:
> On Wed, Feb 11, 2015 at 2:36 AM, Dan Carpenter  
> wrote:
> > That looks kind of uglier than before.  Please run your patch through
> > scripts/checkpatch.pl --strict.
[]
> Running with --strict helped, but now I'm also getting warnings for
> camel case usage. 

You can use --strict --ignore=camelcase

> If I try to fix camel case, then
> the patch will get much larger spanning many dgnc_* files. I can
> proceed with this if you think it is valuable/acceptable.

I suggest not.




Re: [PATCH 1/3] cpuset: initialize effective masks when clone_children is enabled

2015-02-12 Thread Serge E. Hallyn
Quoting Zefan Li (lize...@huawei.com):
> If clone_children is enabled, effective masks won't be initialized
> due to the bug:
> 
>   # mount -t cgroup -o cpuset xxx /mnt
>   # echo 1 > cgroup.clone_children
>   # mkdir /mnt/tmp
>   # cat /mnt/tmp/
>   # cat cpuset.effective_cpus
> 
>   # cat cpuset.cpus
>   0-15
> 
> And then this cpuset won't constrain the tasks in it.
> 
> Neither the bug nor the fix has any effect on the unified hierarchy, as
> there's no clone_children flag there any more.
> 
> Reported-by: Christian Brauner 
> Reported-by: Serge Hallyn 

Thanks - this gives me the correct output in /proc/self/status and
cpuset.cpus.  (I didn't do a stress test, but that seems unlikely to
be broken.)

Tested-by: Serge Hallyn 

> Cc:  # 3.17+
> Signed-off-by: Zefan Li 
> ---
>  kernel/cpuset.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 64b257f..7e9d711 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1992,7 +1992,9 @@ static int cpuset_css_online(struct cgroup_subsys_state 
> *css)
>  
>   spin_lock_irq(&callback_lock);
>   cs->mems_allowed = parent->mems_allowed;
> + cs->effective_mems = parent->mems_allowed;
>   cpumask_copy(cs->cpus_allowed, parent->cpus_allowed);
> + cpumask_copy(cs->effective_cpus, parent->cpus_allowed);
>   spin_unlock_irq(&callback_lock);
>  out_unlock:
>   mutex_unlock(&cpuset_mutex);
> -- 
> 1.8.0.2
> 


Re: [PATCH 1/1] Staging: dgnc: dgnc_tty: code style improvements

2015-02-12 Thread Tolga Ceylan
On Wed, Feb 11, 2015 at 2:36 AM, Dan Carpenter  wrote:
> That looks kind of uglier than before.  Please run your patch through
> scripts/checkpatch.pl --strict.
>
> regards,
> dan carpenter
>

Running with --strict helped, but now I'm also getting warnings for
camel case usage. If I try to fix camel case, then
the patch will get much larger spanning many dgnc_* files. I can
proceed with this if you think it is valuable/acceptable.

Regards,
Tolga


[RFC PATCH v3 24/26] early kprobes: core logic to support early kprobe on ftrace.

2015-02-12 Thread Wang Nan
This is the main patch to support early kprobes on ftrace.

It uses the previously introduced ftrace update notification chain to
fix possible ftrace code modification failures.

For early kprobes on ftrace, ftrace_notifier_call() is registered on the
ftrace update notifier to receive ftrace code conversion failures.

When registering an early kprobe, check_kprobe_address_safe() checks
whether the address is an ftrace entry, and ftrace_process_loc_early()
converts that instruction to a NOP before ftrace is initialized. The
previous ftrace patches make this checking and modification possible.

While ftrace is doing the NOP conversion, x86 gets a chance to adjust the
probed NOP instruction through arch_fix_ftrace_early_kprobe().

When ftrace tries to enable the probed ftrace entry, the NOP instruction
is restored. There are two different situations. Case 1: ftrace is
enabled by the ftrace_filter= option. In this case the early kprobe
stops working until kprobes are fully initialized. Case 2: ftrace events
are registered while early kprobes are being converted to normal
kprobes. Events can be lost, but in case 2 the window should be small.

After kprobes are fully initialized, early kprobes on ftrace are
converted to normal kprobes on ftrace by first restoring ftrace and then
registering ftrace events on them. The conversion is split into two
parts. The first part does some checking and converts the kprobes on
ftrace. The second part is wrapped in stop_machine() to avoid losing
events during list manipulation. kprobes_initialized is also set in
stop_machine() context to avoid losing events.

Signed-off-by: Wang Nan 
---
 include/linux/kprobes.h |   1 +
 kernel/kprobes.c| 247 +++-
 2 files changed, 225 insertions(+), 23 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index e615402..8f4d344 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -131,6 +131,7 @@ struct kprobe {
   */
 #define KPROBE_FLAG_FTRACE 8 /* probe is using ftrace */
 #define KPROBE_FLAG_EARLY  16 /* early kprobe */
+#define KPROBE_FLAG_RESTORED   32 /* temporarily restored to its original insn */
 
 /* Has this kprobe gone ? */
 static inline int kprobe_gone(struct kprobe *p)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 0bbb510..edac74b 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -2239,11 +2240,24 @@ static int __init init_kprobes(void)
if (!err)
err = register_module_notifier(&kprobe_module_nb);
 
-   convert_early_kprobes();
-   kprobes_initialized = (err == 0);
-
-   if (!err)
+   if (!err) {
+   /*
+* Let convert_early_kprobes() set kprobes_initialized
+* to 1 in stop_machine() context. Otherwise we may lose
+* events from kprobes on ftrace that fire in the gap.
+*
+* kprobe_ftrace_handler() uses get_kprobe() to retrieve
+* the kprobe being triggered, which depends on
+* kprobes_is_early() to determine the hlist used for
+* searching. convert_early_kprobes() relinks early
+* kprobes to the normal hlist. If an event is raised after
+* that but before kprobes_initialized is set, get_kprobe()
+* will search the wrong list.
+*/
+   convert_early_kprobes();
init_test_probes();
+   }
+
return err;
 }
 
@@ -2540,11 +2554,127 @@ EXPORT_SYMBOL_GPL(jprobe_return);
 void __weak arch_fix_ftrace_early_kprobe(struct optimized_kprobe *p)
 {
 }
+
+static int restore_optimized_kprobe(struct optimized_kprobe *op)
+{
+   /* If it already restored, pass it to other. */
+   if (op->kp.flags & KPROBE_FLAG_RESTORED)
+   return NOTIFY_DONE;
+
+   get_online_cpus();
+   mutex_lock(_mutex);
+   arch_restore_optimized_kprobe(op);
+   mutex_unlock(_mutex);
+   put_online_cpus();
+
+   op->kp.flags |= KPROBE_FLAG_RESTORED;
+   return NOTIFY_STOP;
+}
+
+static int ftrace_notifier_call(struct notifier_block *nb,
+   unsigned long val, void *param)
+{
+   struct ftrace_update_notifier_info *info = param;
+   struct optimized_kprobe *op;
+   struct dyn_ftrace *rec;
+   struct kprobe *kp;
+   int enable;
+   void *addr;
+   int ret = NOTIFY_DONE;
+
+   if (!info || !info->rec || !info->rec->ip)
+   return NOTIFY_DONE;
+
+   rec = info->rec;
+   enable = info->enable;
+   addr = (void *)rec->ip;
+
+   mutex_lock(_mutex);
+   kp = get_kprobe(addr);
+   mutex_unlock(_mutex);
+
+   if (!kp || !kprobe_aggrprobe(kp))
+   return NOTIFY_DONE;
+
+   op = container_of(kp, struct optimized_kprobe, kp);
+   /*
+* Ftrace is trying to convert ftrace entries to nop
+* instruction. This 

linux-next: Tree for Feb 13

2015-02-12 Thread Stephen Rothwell
Hi all,

Please do not add any material destined for v3.21 to your linux-next
included trees until after v3.20-rc1 has been released.

Changes since 20150212:

The mips tree gained a conflict against Linus' tree.

Non-merge commits (relative to Linus' tree): 4418
 4081 files changed, 183203 insertions(+), 92585 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (this fails its final link) and i386, sparc, sparc64 and arm
defconfig.

Below is a summary of the state of the merge.

I am currently merging 206 trees (counting Linus' and 30 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (8494bcf5b7c4 Merge branch 'for-3.20/drivers' of 
git://git.kernel.dk/linux-block)
Merging fixes/master (b94d525e58dc Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging kbuild-current/rc-fixes (a16c5f99a28c kbuild: Fix removal of the 
debian/ directory)
Merging arc-current/for-curr (2ce7598c9a45 Linux 3.17-rc4)
Merging arm-current/fixes (8e6480667246 ARM: 8299/1: mm: ensure local active 
ASID is marked as allocated on rollover)
Merging m68k-current/for-linus (4436820a98cd m68k/defconfig: Enable Ethernet 
bridging)
Merging metag-fixes/fixes (ffe6902b66aa asm-generic: remove _STK_LIM_MAX)
Merging mips-fixes/mips-fixes (1795cd9b3a91 Linux 3.16-rc5)
Merging powerpc-merge/merge (31345e1a071e powerpc/pci: Remove unused 
force_32bit_msi quirk)
Merging powerpc-merge-mpe/fixes (c59c961ca511 Merge branch 'drm-fixes' of 
git://people.freedesktop.org/~airlied/linux)
Merging sparc/master (66d0f7ec9f10 sparc32: destroy_context() and switch_mm() 
needs to disable interrupts.)
Merging net/master (9672723973f1 bridge: netfilter: Move sysctl-specific error 
code inside #ifdef)
Merging ipsec/master (ac37e2515c1a xfrm: release dst_orig in case of error in 
xfrm_lookup())
Merging sound-current/for-linus (0b444af8daf9 ALSA: seq: potential out of 
bounds in do_control())
Merging pci-current/for-linus (feb28979c137 of/pci: Remove duplicate kfree in 
of_pci_get_host_bridge_resources())
Merging wireless-drivers/master (aeb2d2a4c0ae rtlwifi: Remove logging statement 
that is no longer needed)
Merging driver-core.current/driver-core-linus (26bc420b59a3 Linux 3.19-rc6)
Merging tty.current/tty-linus (ec6f34e5b552 Linux 3.19-rc5)
Merging usb.current/usb-linus (e36f014edff7 Linux 3.19-rc7)
Merging usb-gadget-fixes/fixes (0df8fc37f6e4 usb: phy: never defer probe in 
non-OF case)
Merging usb-serial-fixes/usb-linus (a6f0331236fa USB: cp210x: add ID for 
RUGGEDCOM USB Serial Console)
Merging staging.current/staging-linus (e36f014edff7 Linux 3.19-rc7)
Merging char-misc.current/char-misc-linus (e36f014edff7 Linux 3.19-rc7)
Merging input-current/for-linus (4ba24fef3eb3 Merge branch 'next' into 
for-linus)
Merging crypto-current/master (3e14dcf7cb80 crypto: add missing crypto module 
aliases)
Merging ide/master (f96fe225677b Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging devicetree-current/devicetree/merge (6b1271de3723 of/unittest: Overlays 
with sub-devices tests)
Merging rr-fixes/fixes (dc4515ea26d6 scsi: always increment reference count)
Merging vfio-fixes/for-linus (7c2e211f3c95 vfio-pci: Fix the check on pci 
device type in vfio_pci_probe())
Merging kselftest-fixes/fixes (f5db310d77ef selftests/vm: fix link error for 
transhuge-stress test)
Merging drm-intel-fixes/for-linux-next-fixes (bfa76d495765 Linux 3.19)
Merging asm-generic/master (643165c8bbc8 Merge tag 'uaccess_for_upstream' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost into asm-generic)
Merging arc/for-next (091f56be10ef ARC

[RFC PATCH v3 04/26] ftrace: don't update record flags if code modification fail.

2015-02-12 Thread Wang Nan
x86 and the common ftrace_replace_code() behave differently.

On x86, rec->flags is updated only when (almost) all the work is done. In
the common code, rec->flags is updated before the code modification and is
never restored when the code modification fails.

This patch ensures that rec->flags keeps its original value if
ftrace_replace_code() fails. A later patch will correct that function
for x86.
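
The ordering described above (decide first, modify the code, and only then
record the new state) can be modelled in plain C. This is a hedged
userspace sketch, not kernel code: struct rec_sketch, modify_fn,
replace_code_sketch() and the two stub callbacks are all hypothetical
names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record; 'flags' plays the role of rec->flags. */
struct rec_sketch {
	unsigned long flags;
};

/* Hypothetical code-modification callback: 0 on success, nonzero on failure. */
typedef int (*modify_fn)(struct rec_sketch *rec);

/*
 * Work out the new flags without storing them, attempt the modification,
 * and commit the flags only if the modification succeeded.
 */
static int replace_code_sketch(struct rec_sketch *rec,
			       unsigned long new_flags, modify_fn modify)
{
	assert(rec != NULL && modify != NULL);
	if (rec->flags == new_flags)
		return 0;		/* nothing to do */
	if (modify(rec) != 0)
		return -1;		/* failure: flags left untouched */
	rec->flags = new_flags;		/* commit only after success */
	return 0;
}

/* Stub callbacks standing in for a successful and a failing modification. */
static int modify_ok(struct rec_sketch *rec)   { (void)rec; return 0; }
static int modify_fail(struct rec_sketch *rec) { (void)rec; return 1; }
```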

Signed-off-by: Wang Nan 
---
 kernel/trace/ftrace.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 45e5cb1..6c6cbb1 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2254,23 +2254,30 @@ __ftrace_replace_code(struct dyn_ftrace *rec, int 
enable)
/* This needs to be done before we call ftrace_update_record */
ftrace_old_addr = ftrace_get_addr_curr(rec);
 
-   ret = ftrace_update_record(rec, enable);
+   ret = ftrace_test_record(rec, enable);
 
switch (ret) {
case FTRACE_UPDATE_IGNORE:
return 0;
 
case FTRACE_UPDATE_MAKE_CALL:
-   return ftrace_make_call(rec, ftrace_addr);
+   ret = ftrace_make_call(rec, ftrace_addr);
+   break;
 
case FTRACE_UPDATE_MAKE_NOP:
-   return ftrace_make_nop(NULL, rec, ftrace_old_addr);
+   ret = ftrace_make_nop(NULL, rec, ftrace_old_addr);
+   break;
 
case FTRACE_UPDATE_MODIFY_CALL:
-   return ftrace_modify_call(rec, ftrace_old_addr, ftrace_addr);
+   ret = ftrace_modify_call(rec, ftrace_old_addr, ftrace_addr);
+   break;
}
 
-   return -1; /* unknow ftrace bug */
+   if (ret)
+   return -1; /* unknow ftrace bug */
+
+   ftrace_update_record(rec, enable);
+   return 0;
 }
 
 void __weak ftrace_replace_code(int enable)
-- 
1.8.4



[RFC PATCH v3 07/26] ftrace: allow search ftrace addr before ftrace fully inited.

2015-02-12 Thread Wang Nan
This patch enables ftrace_location() to be used before ftrace_init().
The first user should be early kprobes, which can insert kprobes into
kernel code even before setup_arch() finishes. This patch gives it a
chance to determine whether it is probing an ftrace entry and allows it
to do some special treatment.

ftrace_cmp_ips_insn() is introduced to make the behavior of the early
ftrace_location() consistent with the normal ftrace_location(). With the
existing ftrace_cmp_ips(), searching for an address in the middle of an
instruction fails, which is inconsistent with the ftrace_cmp_recs() used
by the normal ftrace_location().

With this and the previous patch, ftrace_location() can now be called
during and after setup_arch().
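
For readers who want to try the range-matching comparator outside the
kernel, here is a minimal userspace sketch built on the standard bsearch().
SKETCH_INSN_SIZE, cmp_ip_in_insn() and lookup_entry() are illustrative
names and the instruction size of 5 is an assumption; the kernel uses
MCOUNT_INSN_SIZE, and the patch returns ftrace_call_adjust(ip) rather than
the entry start.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed instruction length for this sketch only. */
#define SKETCH_INSN_SIZE 5

/*
 * bsearch() comparator: the key matches an entry when it falls anywhere
 * in [entry, entry + SKETCH_INSN_SIZE), not only on the first byte.
 */
static int cmp_ip_in_insn(const void *key, const void *elem)
{
	const unsigned long *ip = key;
	const unsigned long *start = elem;

	if (*ip >= *start + SKETCH_INSN_SIZE)
		return 1;
	if (*ip < *start)
		return -1;
	return 0;
}

/* Return the start address of the entry containing ip, or 0 if none does. */
static unsigned long lookup_entry(const unsigned long *sorted, size_t n,
				  unsigned long ip)
{
	const unsigned long *hit;

	assert(sorted != NULL);
	hit = bsearch(&ip, sorted, n, sizeof(sorted[0]), cmp_ip_in_insn);
	return hit ? *hit : 0;
}
```

The table must already be sorted, which is why the patch bails out when
kernel_mcount_sorted is not set.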

Signed-off-by: Wang Nan 
---
 kernel/trace/ftrace.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a75cfbe..fc0c1aa 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1539,6 +1539,8 @@ static unsigned long ftrace_location_range(unsigned long 
start, unsigned long en
return 0;
 }
 
+static unsigned long ftrace_search_mcount_ip(unsigned long ip);
+
 /**
  * ftrace_location - return true if the ip giving is a traced location
  * @ip: the instruction pointer to check
@@ -1550,6 +1552,9 @@ static unsigned long ftrace_location_range(unsigned long 
start, unsigned long en
  */
 unsigned long ftrace_location(unsigned long ip)
 {
+   if (unlikely(!ftrace_pages_start))
+   return ftrace_search_mcount_ip(ip);
+
return ftrace_location_range(ip, ip);
 }
 
@@ -4733,6 +4738,18 @@ static int ftrace_cmp_ips(const void *a, const void *b)
return 0;
 }
 
+static int ftrace_cmp_ips_insn(const void *a, const void *b)
+{
+   const unsigned long *ipa = a;
+   const unsigned long *ipb = b;
+
+   if (*ipa >= *ipb + MCOUNT_INSN_SIZE)
+   return 1;
+   if (*ipa < *ipb)
+   return -1;
+   return 0;
+}
+
 static void ftrace_swap_ips(void *a, void *b, int size)
 {
unsigned long *ipa = a;
@@ -4770,6 +4787,27 @@ static void ftrace_sort_mcount_area(unsigned long 
*start, unsigned long *end)
kernel_mcount_sorted = true;
 }
 
+static unsigned long ftrace_search_mcount_ip(unsigned long ip)
+{
+   extern unsigned long __start_mcount_loc[];
+   extern unsigned long __stop_mcount_loc[];
+
+   unsigned long *mcount_start = __start_mcount_loc;
+   unsigned long *mcount_end = __stop_mcount_loc;
+   unsigned long count = mcount_end - mcount_start;
+   unsigned long *retval;
+
+   if (!kernel_mcount_sorted)
+   return 0;
+
+   retval = bsearch(&ip, mcount_start, count,
+   sizeof(unsigned long), ftrace_cmp_ips_insn);
+   if (!retval)
+   return 0;
+
+   return ftrace_call_adjust(ip);
+}
+
 static int ftrace_process_locs(struct module *mod,
   unsigned long *start,
   unsigned long *end)
-- 
1.8.4



[RFC PATCH v3 05/26] ftrace/x86: Ensure rec->flags no change when failure occures.

2015-02-12 Thread Wang Nan
Don't change rec->flags if code modification fails.

Signed-off-by: Wang Nan 
---
 arch/x86/kernel/ftrace.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 8b7b0a5..7bdba65 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -497,6 +497,7 @@ static int finish_update(struct dyn_ftrace *rec, int enable)
 {
unsigned long ftrace_addr;
int ret;
+   unsigned long old_flags = rec->flags;
 
ret = ftrace_update_record(rec, enable);
 
@@ -509,14 +510,18 @@ static int finish_update(struct dyn_ftrace *rec, int 
enable)
case FTRACE_UPDATE_MODIFY_CALL:
case FTRACE_UPDATE_MAKE_CALL:
/* converting nop to call */
-   return finish_update_call(rec, ftrace_addr);
+   ret = finish_update_call(rec, ftrace_addr);
+   break;
 
case FTRACE_UPDATE_MAKE_NOP:
/* converting a call to a nop */
-   return finish_update_nop(rec);
+   ret = finish_update_nop(rec);
+   break;
}
 
-   return 0;
+   if (ret)
+   rec->flags = old_flags;
+   return ret;
 }
 
 static void do_sync_core(void *data)
-- 
1.8.4



[RFC PATCH v3 17/26] early kprobes: introduces macros for allocing early kprobe resources.

2015-02-12 Thread Wang Nan
Introduces macros to generate common early kprobe related resource
allocators.

All early kprobe related resources are statically allocated at link
time, one per early kprobe slot. For each type of resource, a bitmap
is used to track allocation. __DEFINE_EKPROBE_ALLOC_OPS defines the alloc
and free handlers for them. The range of the resource and the bitmap
must be provided for allocating and freeing. DEFINE_EKPROBE_ALLOC_OPS
also defines the bitmap and the array used by it.
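
The scheme described above, a fixed array of slots plus a bitmap that
tracks which ones are taken, can be sketched without macros in userspace
C. Everything named here (slot_pool, slot_alloc(), slot_free(), the
SKETCH_* constants) is hypothetical, and unlike the kernel's set_bit() and
clear_bit() this version is not atomic.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_SLOTS      4	/* hypothetical slot count */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITMAP_LONGS \
	((SKETCH_NR_SLOTS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

struct slot_pool {
	int slots[SKETCH_NR_SLOTS];			/* statically sized resource array */
	unsigned long bitmap[SKETCH_BITMAP_LONGS];	/* bit i set => slot i in use */
};

/* Hand out the first free slot, or NULL when every slot is taken. */
static int *slot_alloc(struct slot_pool *p)
{
	size_t i;

	assert(p != NULL);
	for (i = 0; i < SKETCH_NR_SLOTS; i++) {
		unsigned long mask = 1UL << (i % SKETCH_BITS_PER_LONG);
		unsigned long *word = &p->bitmap[i / SKETCH_BITS_PER_LONG];

		if (!(*word & mask)) {
			*word |= mask;
			return &p->slots[i];
		}
	}
	return NULL;
}

/* Return 1 if 'slot' belonged to the pool and was released, 0 otherwise. */
static int slot_free(struct slot_pool *p, int *slot)
{
	uintptr_t lo, hi, v;
	size_t i;

	assert(p != NULL);
	lo = (uintptr_t)&p->slots[0];
	hi = (uintptr_t)&p->slots[SKETCH_NR_SLOTS];
	v = (uintptr_t)slot;
	if (v < lo || v >= hi)
		return 0;	/* not ours: let the caller free it another way */
	i = (size_t)(slot - p->slots);
	p->bitmap[i / SKETCH_BITS_PER_LONG] &= ~(1UL << (i % SKETCH_BITS_PER_LONG));
	return 1;
}
```

Returning 0 for a foreign pointer mirrors the range check in the free
handler, so a caller can fall back to its normal free path.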

Signed-off-by: Wang Nan 
---
 include/linux/kprobes.h | 78 +
 1 file changed, 78 insertions(+)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 8d2e754..cd7a2a5 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -270,6 +270,84 @@ extern void show_registers(struct pt_regs *regs);
 extern void kprobes_inc_nmissed_count(struct kprobe *p);
 extern bool arch_within_kprobe_blacklist(unsigned long addr);
 
+#ifdef CONFIG_EARLY_KPROBES
+
+#define NR_EARLY_KPROBES_SLOTS CONFIG_NR_EARLY_KPROBES_SLOTS
+#define ALIGN_UP(v, a) (((v) + ((a) - 1)) & ~((a) - 1))
+#define EARLY_KPROBES_BITMAP_SZ ALIGN_UP(NR_EARLY_KPROBES_SLOTS, BITS_PER_LONG)
+
+#define __ek_in_range(v, s, e) (((v) >= (s)) && ((v) < (e)))
+#define __ek_buf_sz(s, e)  ((void *)(e) - (void *)(s))
+#define __ek_elem_sz_b(s, e)   (__ek_buf_sz(s, e) / NR_EARLY_KPROBES_SLOTS)
+#define __ek_elem_sz(s, e) (__ek_elem_sz_b(s, e) / sizeof(s[0]))
+#define __ek_elem_idx(v, s, e) (__ek_buf_sz(s, v) / __ek_elem_sz_b(s, e))
+#define __ek_get_elem(i, s, e) (&((s)[__ek_elem_sz(s, e) * (i)]))
+#define __DEFINE_EKPROBE_ALLOC_OPS(__t, __name)   \
+static inline __t *__ek_alloc_##__name(__t *__s, __t *__e, unsigned long *__b)\
+{  \
+   int __i = find_next_zero_bit(__b, NR_EARLY_KPROBES_SLOTS, 0);   \
+   if (__i >= NR_EARLY_KPROBES_SLOTS)  \
+   return NULL;\
+   set_bit(__i, __b);  \
+   return __ek_get_elem(__i, __s, __e);\
+}  \
+static inline int __ek_free_##__name(__t *__v, __t *__s, __t *__e, unsigned long *__b) \
+{  \
+   if (!__ek_in_range(__v, __s, __e))  \
+   return 0;   \
+   clear_bit(__ek_elem_idx(__v, __s, __e), __b);   \
+   return 1;   \
+}
+
+#define __DEFINE_EKPROBE_AREA(__t, __name, __static)   \
+__static __t __ek_##__name##_slots[NR_EARLY_KPROBES_SLOTS];\
+__static unsigned long __ek_##__name##_bitmap[EARLY_KPROBES_BITMAP_SZ];
+
+#define DEFINE_EKPROBE_ALLOC_OPS(__t, __name, __static)   \
+__DEFINE_EKPROBE_AREA(__t, __name, __static)   \
+__DEFINE_EKPROBE_ALLOC_OPS(__t, __name)   \
+static inline __t *ek_alloc_##__name(void) \
+{  \
+   return __ek_alloc_##__name(&((__ek_##__name##_slots)[0]),   \
+   &((__ek_##__name##_slots)[NR_EARLY_KPROBES_SLOTS]),\
+   (__ek_##__name##_bitmap));  \
+}  \
+static inline int ek_free_##__name(__t *__s)   \
+{  \
+   return __ek_free_##__name(__s, &((__ek_##__name##_slots)[0]),   \
+   &((__ek_##__name##_slots)[NR_EARLY_KPROBES_SLOTS]),\
+   (__ek_##__name##_bitmap));  \
+}
+
+
+#else
+#define __DEFINE_EKPROBE_ALLOC_OPS(__t, __name)   \
+static inline __t *__ek_alloc_##__name(__t *__s, __t *__e, unsigned long *__b)\
+{  \
+   return NULL;\
+}  \
+static inline int __ek_free_##__name(__t *__v, __t *__s, __t *__e, unsigned long *__b)\
+{  \
+   return 0;   \
+}
+
+#define __DEFINE_EKPROBE_AREA(__t, __name, __static)   \
+__static __t __ek_##__name##_slots[0]; \
+__static unsigned long __ek_##__name##_bitmap[0];
+
+#define DEFINE_EKPROBE_ALLOC_OPS(__t, __name, __static)   \

[RFC PATCH v3 16/26] early kprobes: x86: introduce early kprobes related code area.

2015-02-12 Thread Wang Nan
This patch introduces EARLY_KPROBES_CODES_AREA into x86 vmlinux for
early kprobes.

Signed-off-by: Wang Nan 
---
 arch/x86/include/asm/insn.h|  7 ---
 arch/x86/include/asm/kprobes.h | 47 +++---
 arch/x86/kernel/vmlinux.lds.S  |  2 ++
 3 files changed, 45 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index 47f29b1..ea6f318 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -20,6 +20,9 @@
  * Copyright (C) IBM Corporation, 2009
  */
 
+#define MAX_INSN_SIZE  16
+
+#ifndef __ASSEMBLY__
 /* insn_attr_t is defined in inat.h */
 #include 
 
@@ -69,8 +72,6 @@ struct insn {
const insn_byte_t *next_byte;
 };
 
-#define MAX_INSN_SIZE  16
-
 #define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6)
 #define X86_MODRM_REG(modrm) (((modrm) & 0x38) >> 3)
 #define X86_MODRM_RM(modrm) ((modrm) & 0x07)
@@ -197,5 +198,5 @@ static inline int insn_offset_immediate(struct insn *insn)
 {
return insn_offset_displacement(insn) + insn->displacement.nbytes;
 }
-
+#endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_INSN_H */
diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 4421b5d..6a6066a 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -21,23 +21,54 @@
  *
  * See arch/x86/kernel/kprobes.c for x86 kprobes history.
  */
-#include 
-#include 
-#include 
-#include 
 
 #define  __ARCH_WANT_KPROBES_INSN_SLOT
 
-struct pt_regs;
-struct kprobe;
+#include 
+#include 
 
-typedef u8 kprobe_opcode_t;
 #define BREAKPOINT_INSTRUCTION 0xcc
 #define RELATIVEJUMP_OPCODE 0xe9
 #define RELATIVEJUMP_SIZE 5
 #define RELATIVECALL_OPCODE 0xe8
 #define RELATIVE_ADDR_SIZE 4
 #define MAX_STACK_SIZE 64
+#define MAX_OPTIMIZED_LENGTH (MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
+
+#ifdef __ASSEMBLY__
+
+#define KPROBE_OPCODE_SIZE 1
+#define MAX_OPTINSN_SIZE ((optprobe_template_end - optprobe_template_entry) + \
+   MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
+
+#ifdef CONFIG_EARLY_KPROBES
+# define EARLY_KPROBES_CODES_AREA  \
+   . = ALIGN(8);   \
+   VMLINUX_SYMBOL(__early_kprobes_start) = .;  \
+   VMLINUX_SYMBOL(__early_kprobes_code_area_start) = .;\
+   . = . + MAX_OPTINSN_SIZE * CONFIG_NR_EARLY_KPROBES_SLOTS;   \
+   VMLINUX_SYMBOL(__early_kprobes_code_area_end) = .;  \
+   . = ALIGN(8);   \
+   VMLINUX_SYMBOL(__early_kprobes_insn_slot_start) = .;\
+   . = . + MAX_INSN_SIZE * KPROBE_OPCODE_SIZE *\
+   CONFIG_NR_EARLY_KPROBES_SLOTS;  \
+   VMLINUX_SYMBOL(__early_kprobes_insn_slot_end) = .;  \
+   VMLINUX_SYMBOL(__early_kprobes_end) = .;
+#else
+# define EARLY_KPROBES_CODES_AREA
+#endif
+
+#else
+
+#include 
+#include 
+
+
+struct pt_regs;
+struct kprobe;
+
+typedef u8 kprobe_opcode_t;
+#define KPROBE_OPCODE_SIZE sizeof(kprobe_opcode_t)
 #define MIN_STACK_SIZE(ADDR)  \
(((MAX_STACK_SIZE) < (((unsigned long)current_thread_info()) + \
  THREAD_SIZE - (unsigned long)(ADDR)))\
@@ -52,7 +83,6 @@ extern __visible kprobe_opcode_t optprobe_template_entry;
 extern __visible kprobe_opcode_t optprobe_template_val;
 extern __visible kprobe_opcode_t optprobe_template_call;
 extern __visible kprobe_opcode_t optprobe_template_end;
-#define MAX_OPTIMIZED_LENGTH (MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
 #define MAX_OPTINSN_SIZE   \
(((unsigned long)&optprobe_template_end -   \
  (unsigned long)&optprobe_template_entry) +\
@@ -117,4 +147,5 @@ extern int kprobe_exceptions_notify(struct notifier_block 
*self,
unsigned long val, void *data);
 extern int kprobe_int3_handler(struct pt_regs *regs);
 extern int kprobe_debug_handler(struct pt_regs *regs);
+#endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_KPROBES_H */
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 00bf300..69f3f0e 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #undef i386 /* in case the preprocessor is a 32bit one */
 
@@ -100,6 +101,7 @@ SECTIONS
SCHED_TEXT
LOCK_TEXT
KPROBES_TEXT
+   EARLY_KPROBES_CODES_AREA
ENTRY_TEXT
IRQENTRY_TEXT
*(.fixup)
-- 
1.8.4



[RFC PATCH v3 18/26] early kprobes: allows __alloc_insn_slot() from early kprobes slots.

2015-02-12 Thread Wang Nan
Introduces early_slots_start/end and a bitmap in struct kprobe_insn_cache,
then uses the previously introduced macro to generate the allocator. This
patch makes get/free_insn_slot() and get/free_optinsn_slot() transparent
to early kprobes.

Signed-off-by: Wang Nan 
---
 include/linux/kprobes.h | 40 
 kernel/kprobes.c| 14 ++
 2 files changed, 54 insertions(+)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index cd7a2a5..6100678 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -319,6 +319,17 @@ static inline int ek_free_##__name(__t *__s)   
\
(__ek_##__name##_bitmap));  \
 }
 
+/*
+ * Start and end of early kprobes area, including code area and
+ * insn_slot area.
+ */
+extern char __early_kprobes_start[];
+extern char __early_kprobes_end[];
+
+extern kprobe_opcode_t __early_kprobes_code_area_start[];
+extern kprobe_opcode_t __early_kprobes_code_area_end[];
+extern kprobe_opcode_t __early_kprobes_insn_slot_start[];
+extern kprobe_opcode_t __early_kprobes_insn_slot_end[];
 
 #else
 #define __DEFINE_EKPROBE_ALLOC_OPS(__t, __name)
\
@@ -348,6 +359,8 @@ static inline int ek_free_##__name(__t *__s)
\
 
 #endif
 
+__DEFINE_EKPROBE_ALLOC_OPS(kprobe_opcode_t, opcode)
+
 struct kprobe_insn_cache {
struct mutex mutex;
void *(*alloc)(void);   /* allocate insn page */
@@ -355,8 +368,35 @@ struct kprobe_insn_cache {
struct list_head pages; /* list of kprobe_insn_page */
size_t insn_size;   /* size of instruction slot */
int nr_garbage;
+#ifdef CONFIG_EARLY_KPROBES
+# define slots_start(c)((c)->early_slots_start)
+# define slots_end(c)  ((c)->early_slots_end)
+# define slots_bitmap(c)   ((c)->early_slots_bitmap)
+   kprobe_opcode_t *early_slots_start;
+   kprobe_opcode_t *early_slots_end;
+   unsigned long early_slots_bitmap[EARLY_KPROBES_BITMAP_SZ];
+#else
+# define slots_start(c)NULL
+# define slots_end(c)  NULL
+# define slots_bitmap(c)   NULL
+#endif
 };
 
+static inline kprobe_opcode_t *
+__get_insn_slot_early(struct kprobe_insn_cache *c)
+{
+   return __ek_alloc_opcode(slots_start(c),
+   slots_end(c), slots_bitmap(c));
+}
+
+static inline int
+__free_insn_slot_early(struct kprobe_insn_cache *c,
+   kprobe_opcode_t *slot)
+{
+   return __ek_free_opcode(slot, slots_start(c),
+   slots_end(c), slots_bitmap(c));
+}
+
 extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c);
 extern void __free_insn_slot(struct kprobe_insn_cache *c,
 kprobe_opcode_t *slot, int dirty);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 647c95a..fa1e422 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -143,6 +143,10 @@ struct kprobe_insn_cache kprobe_insn_slots = {
.pages = LIST_HEAD_INIT(kprobe_insn_slots.pages),
.insn_size = MAX_INSN_SIZE,
.nr_garbage = 0,
+#ifdef CONFIG_EARLY_KPROBES
+   .early_slots_start = __early_kprobes_insn_slot_start,
+   .early_slots_end = __early_kprobes_insn_slot_end,
+#endif
 };
 static int collect_garbage_slots(struct kprobe_insn_cache *c);
 
@@ -155,6 +159,9 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache 
*c)
struct kprobe_insn_page *kip;
kprobe_opcode_t *slot = NULL;
 
+   if (kprobes_is_early())
+   return __get_insn_slot_early(c);
+
mutex_lock(&c->mutex);
  retry:
list_for_each_entry(kip, &c->pages, list) {
@@ -255,6 +262,9 @@ void __free_insn_slot(struct kprobe_insn_cache *c,
 {
struct kprobe_insn_page *kip;
 
+   if (unlikely(__free_insn_slot_early(c, slot)))
+   return;
+
mutex_lock(&c->mutex);
list_for_each_entry(kip, &c->pages, list) {
long idx = ((long)slot - (long)kip->insns) /
@@ -286,6 +296,10 @@ struct kprobe_insn_cache kprobe_optinsn_slots = {
.pages = LIST_HEAD_INIT(kprobe_optinsn_slots.pages),
/* .insn_size is initialized later */
.nr_garbage = 0,
+#ifdef CONFIG_EARLY_KPROBES
+   .early_slots_start = __early_kprobes_code_area_start,
+   .early_slots_end = __early_kprobes_code_area_end,
+#endif
 };
 #endif
 #endif
-- 
1.8.4



[RFC PATCH v3 19/26] early kprobes: prohibit probing at early kprobe reserved area.

2015-02-12 Thread Wang Nan
Put the early kprobe reserved area into the kprobe blacklist.

Signed-off-by: Wang Nan 
---
 kernel/kprobes.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index fa1e422..b83c406 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1358,6 +1358,13 @@ static bool within_kprobe_blacklist(unsigned long addr)
 
if (arch_within_kprobe_blacklist(addr))
return true;
+
+#ifdef CONFIG_EARLY_KPROBES
+   if (addr >= (unsigned long)__early_kprobes_start &&
+   addr < (unsigned long)__early_kprobes_end)
+   return true;
+#endif
+
/*
 * If there exists a kprobe_blacklist, verify and
 * fail any probe registration in the prohibited area
-- 
1.8.4



[RFC PATCH v3 26/26] kprobes: enable 'ekprobe=' cmdline option for early kprobes.

2015-02-12 Thread Wang Nan
This patch shows a very rough usage of early kprobes. Adding kernel
cmdline options such as 'ekprobe=__alloc_pages_nodemask' or
'ekprobe=0xc00f3c2c' installs early kprobes. When a probed
instruction is hit, a message is printed.
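
The parameter handling can be sketched in userspace roughly as follows. This
is only an illustration under assumptions: parse_probe_target(),
demo_lookup() and the demo_syms table (including its address) are
hypothetical, and strtoull() stands in for the kernel's kstrtoull() and
kallsyms_lookup_name().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct sym {
    const char *name;
    unsigned long long addr;
};

/* Hypothetical symbol table with a made-up address. */
static const struct sym demo_syms[] = {
    { "__alloc_pages_nodemask", 0xc0001000ULL },
};

/* Return the address of a known symbol, or 0 if the name is unknown. */
static unsigned long long demo_lookup(const char *name)
{
    for (size_t i = 0; i < sizeof(demo_syms) / sizeof(demo_syms[0]); i++)
        if (strcmp(demo_syms[i].name, name) == 0)
            return demo_syms[i].addr;
    return 0;
}

/*
 * Accept either "0x<hex>" or a symbol name.
 * Returns 0 and stores the address on success, -1 on a bad parameter.
 */
static int parse_probe_target(const char *p, unsigned long long *addr)
{
    assert(addr != NULL);

    if (p == NULL || *p == '\0')
        return -1;

    if (p[0] == '0' && p[1] == 'x') {
        char *end;

        errno = 0;
        *addr = strtoull(p, &end, 16);
        /* Reject overflow and trailing characters that are not hex digits. */
        if (errno != 0 || *end != '\0')
            return -1;
        return 0;
    }

    *addr = demo_lookup(p);
    return *addr ? 0 : -1;
}
```

Both forms from the description above resolve to an address here, and
anything else is rejected before a probe would be registered.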

This patch is only a sample. I'll drop it in the future.

Signed-off-by: Wang Nan 
---
 kernel/kprobes.c | 71 
 1 file changed, 71 insertions(+)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index edac74b..278b2511 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2835,10 +2835,81 @@ convert_early_kprobes(void)
free_aggr_kprobe(>kp);
}
 };
+
+static int early_kprobe_pre_handler(struct kprobe *p, struct pt_regs *regs)
+{
+   const char *sym = NULL;
+   char *modname, namebuf[KSYM_NAME_LEN];
+   unsigned long offset = 0;
+
+   sym = kallsyms_lookup((unsigned long)p->addr, NULL,
+   &offset, &modname, namebuf);
+   if (sym)
+   pr_info("Hit early kprobe at %s+0x%lx%s%s\n",
+   sym, offset,
+   (modname ? " " : ""),
+   (modname ? modname : ""));
+   else
+   pr_info("Hit early kprobe at %p\n", p->addr);
+   return 0;
+}
+
+DEFINE_EKPROBE_ALLOC_OPS(struct kprobe, early_kprobe_setup, static);
+static int __init early_kprobe_setup(char *p)
+{
+   unsigned long long addr;
+   struct kprobe *kp;
+   int len = strlen(p);
+   int err;
+
+   if (len <= 0) {
+   pr_err("early kprobe: wrong param: %s\n", p);
+   return 0;
+   }
+
+   if ((p[0] == '0') && (p[1] == 'x')) {
+   err = kstrtoull(p, 16, &addr);
+   if (err) {
+   pr_err("early kprobe: wrong address: %p\n", p);
+   return 0;
+   }
+   } else {
+   addr = kallsyms_lookup_name(p);
+   if (!addr) {
+   pr_err("early kprobe: wrong symbol: %s\n", p);
+   return 0;
+   }
+   }
+
+   if ((addr < (unsigned long)_text) ||
+   (addr >= (unsigned long)_etext))
+   pr_err("early kprobe: address of %p out of range\n", p);
+
+   kp = ek_alloc_early_kprobe_setup();
+   if (kp == NULL) {
+   pr_err("early kprobe: no enough early kprobe slot\n");
+   return 0;
+   }
+   kp->addr = (void *)(unsigned long)(addr);
+   kp->pre_handler = early_kprobe_pre_handler;
+   err = register_kprobe(kp);
+   if (err) {
+   pr_err("early kprobe: register early kprobe %s failed\n", p);
+   ek_free_early_kprobe_setup(kp);
+   }
+   return 0;
+}
 #else
 static inline int register_early_kprobe(struct kprobe *p) { return -ENOSYS; }
 static inline void convert_early_kprobes(void)
 {
kprobes_initialized = 1;
 }
+
+static int __init early_kprobe_setup(char *p)
+{
+   return 0;
+}
 #endif
+
+early_param("ekprobe", early_kprobe_setup);
-- 
1.8.4



[RFC PATCH v3 15/26] early kprobes: x86: directly modify code.

2015-02-12 Thread Wang Nan
When registering early kprobes, SMP has not been enabled yet, so the
synchronization done by text_poke_bp() is not required. A simple
memcpy() is enough.
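
As a rough userspace illustration of the direct-patching idea (a sketch, not
the kernel code): build the 5-byte x86 "jmp rel32" instruction and copy it
over the target bytes. encode_jmp_rel32() and patch_text_single_cpu() are
hypothetical helper names, and the sketch leaves out the TLB flush and
sync_core() that the patch performs after the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JMP_REL32_OPCODE 0xe9 /* x86 near jump with a 32-bit displacement */
#define JMP_REL32_SIZE   5

/* Encode "jmp rel32" located at address from and jumping to address to. */
static void encode_jmp_rel32(uint8_t buf[JMP_REL32_SIZE],
                             uint64_t from, uint64_t to)
{
    /* The displacement is relative to the end of the jump instruction. */
    uint32_t rel = (uint32_t)(to - (from + JMP_REL32_SIZE));

    buf[0] = JMP_REL32_OPCODE;
    buf[1] = (uint8_t)(rel & 0xff); /* little-endian displacement */
    buf[2] = (uint8_t)((rel >> 8) & 0xff);
    buf[3] = (uint8_t)((rel >> 16) & 0xff);
    buf[4] = (uint8_t)((rel >> 24) & 0xff);
}

/* With a single CPU running and no concurrent execution, a copy suffices. */
static void patch_text_single_cpu(uint8_t *text, const uint8_t *insn,
                                  size_t len)
{
    assert(text != NULL && insn != NULL);
    memcpy(text, insn, len);
}
```

Once other CPUs may execute the patched bytes concurrently, a plain copy
like this is no longer safe, which is why the normal path keeps using
text_poke_bp().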

Signed-off-by: Wang Nan 
---
 arch/x86/kernel/kprobes/opt.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 0dd8d08..21847ab 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "common.h"
 
@@ -397,8 +398,15 @@ void arch_optimize_kprobes(struct list_head *oplist)
insn_buf[0] = RELATIVEJUMP_OPCODE;
*(s32 *)(&insn_buf[1]) = rel;
 
-   text_poke_bp(op->kp.addr, insn_buf, RELATIVEJUMP_SIZE,
-op->optinsn.insn);
+   if (unlikely(kprobes_is_early())) {
+   BUG_ON(!(op->kp.flags & KPROBE_FLAG_EARLY));
+   memcpy(op->kp.addr, insn_buf, RELATIVEJUMP_SIZE);
+   local_flush_tlb();
+   sync_core();
+   } else {
+   text_poke_bp(op->kp.addr, insn_buf, RELATIVEJUMP_SIZE,
+op->optinsn.insn);
+   }
 
list_del_init(&op->list);
}
-- 
1.8.4



[RFC PATCH v3 22/26] early kprobes: introduce arch_fix_ftrace_early_kprobe().

2015-02-12 Thread Wang Nan
This patch is for further use. arch_fix_ftrace_early_kprobe() will be
called when ftrace tries to convert ftrace entries to nops and fails. On
x86 it should adjust the saved nop instruction, because at early probing
time kprobes does not know which nop ftrace will choose.

Signed-off-by: Wang Nan 
---
 arch/x86/kernel/kprobes/opt.c | 31 +++
 include/linux/kprobes.h   |  5 +
 kernel/kprobes.c  |  6 ++
 3 files changed, 42 insertions(+)

diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 21847ab..f3ea954 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -456,3 +456,34 @@ int setup_detour_execution(struct kprobe *p, struct 
pt_regs *regs, int reenter)
return 0;
 }
 NOKPROBE_SYMBOL(setup_detour_execution);
+
+#ifdef CONFIG_EARLY_KPROBES
+void arch_fix_ftrace_early_kprobe(struct optimized_kprobe *op)
+{
+   const unsigned char *correct_nop5 = ideal_nops[NOP_ATOMIC5];
+   struct kprobe *list_p;
+
+   u32 mask = KPROBE_FLAG_EARLY |
+   KPROBE_FLAG_OPTIMIZED |
+   KPROBE_FLAG_FTRACE;
+
+   if ((op->kp.flags & mask) != mask)
+   return;
+
+   /*
+* For early kprobe on ftrace, use right nop instruction.
+* See x86 ftrace_make_nop and ftrace_nop_replace. Note that
+* ideal_nops used by ftrace_nop_replace is set up after early
+* kprobe registration.
+*/
+
+   memcpy(&op->kp.opcode, correct_nop5, sizeof(kprobe_opcode_t));
+   memcpy(op->optinsn.copied_insn, correct_nop5 + INT3_SIZE,
+   RELATIVE_ADDR_SIZE);
+
+   /* Fix all kprobes connected to it */
+   list_for_each_entry_rcu(list_p, &op->kp.list, list)
+   memcpy(&list_p->opcode, correct_nop5, sizeof(kprobe_opcode_t));
+
+}
+#endif
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 0c64df8..e483f1b 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -459,6 +459,11 @@ struct early_kprobe_slot {
 extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
  struct ftrace_ops *ops, struct pt_regs *regs);
 extern int arch_prepare_kprobe_ftrace(struct kprobe *p);
+
+#ifdef CONFIG_EARLY_KPROBES
+extern void arch_fix_ftrace_early_kprobe(struct optimized_kprobe *p);
+#endif
+
 #endif
 
 int arch_check_ftrace_location(struct kprobe *p);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 131a71a..0bbb510 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2536,6 +2536,12 @@ EXPORT_SYMBOL_GPL(jprobe_return);
 
 #ifdef CONFIG_EARLY_KPROBES
 
+#ifdef CONFIG_KPROBES_ON_FTRACE
+void __weak arch_fix_ftrace_early_kprobe(struct optimized_kprobe *p)
+{
+}
+#endif
+
 static int register_early_kprobe(struct kprobe *p)
 {
struct early_kprobe_slot *slot;
-- 
1.8.4



[RFC PATCH v3 21/26] early kprobes: add CONFIG_EARLY_KPROBES option.

2015-02-12 Thread Wang Nan
Enable early kprobes in Kconfig.

Signed-off-by: Wang Nan 
---
 arch/Kconfig | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 05d7a8a..32e9f4a 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -46,6 +46,21 @@ config KPROBES
  for kernel debugging, non-intrusive instrumentation and testing.
  If in doubt, say "N".
 
+config EARLY_KPROBES
+   bool "Enable kprobes at very early booting stage"
+   depends on KPROBES && OPTPROBES
+   def_bool y
+   help
+ Enable kprobe at very early booting stage.
+
+config NR_EARLY_KPROBES_SLOTS
+   int "Number of possible early kprobes"
+   range 1 64
+   default 16
+   depends on EARLY_KPROBES
+   help
+ Number of early kprobes slots.
+
 config JUMP_LABEL
bool "Optimize very unlikely/likely branches"
depends on HAVE_ARCH_JUMP_LABEL
-- 
1.8.4



[RFC PATCH v3 20/26] early kprobes: core logic of early kprobes.

2015-02-12 Thread Wang Nan
This patch is the main logic of early kprobes.

If register_kprobe() is called before kprobes_initialized is set, an
early kprobe is allocated. It tries to use the existing OPTPROBE
mechanism to replace the target instruction with a branch instead of a
breakpoint, because interrupt handlers may not have been initialized yet.

All resources required by early kprobes are allocated statically.
CONFIG_NR_EARLY_KPROBES_SLOTS controls the number of possible early
kprobes.

Signed-off-by: Wang Nan 
---
 include/linux/kprobes.h |   4 ++
 kernel/kprobes.c| 150 ++--
 2 files changed, 148 insertions(+), 6 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 6100678..0c64df8 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -450,6 +450,10 @@ extern int proc_kprobes_optimization_handler(struct 
ctl_table *table,
 size_t *length, loff_t *ppos);
 #endif
 
+struct early_kprobe_slot {
+   struct optimized_kprobe op;
+};
+
 #endif /* CONFIG_OPTPROBES */
 #ifdef CONFIG_KPROBES_ON_FTRACE
 extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index b83c406..131a71a 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -77,6 +77,10 @@ int kprobes_is_early(void)
 static struct hlist_head kprobe_table[KPROBE_TABLE_SIZE];
 static struct hlist_head kretprobe_inst_table[KPROBE_TABLE_SIZE];
 
+#ifdef CONFIG_EARLY_KPROBES
+static HLIST_HEAD(early_kprobe_hlist);
+#endif
+
 /* NOTE: change this value only with kprobe_mutex held */
 static bool kprobes_all_disarmed;
 
@@ -87,6 +91,8 @@ static struct {
raw_spinlock_t lock cacheline_aligned_in_smp;
 } kretprobe_table_locks[KPROBE_TABLE_SIZE];
 
+DEFINE_EKPROBE_ALLOC_OPS(struct early_kprobe_slot, early_kprobe, static)
+
 static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
 {
return &(kretprobe_table_locks[hash].lock);
@@ -326,7 +332,12 @@ struct kprobe *get_kprobe(void *addr)
struct hlist_head *head;
struct kprobe *p;
 
-   head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)];
+#ifdef CONFIG_EARLY_KPROBES
+   if (kprobes_is_early())
+   head = &early_kprobe_hlist;
+   else
+#endif
+   head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)];
hlist_for_each_entry_rcu(p, head, hlist) {
if (p->addr == addr)
return p;
@@ -386,11 +397,14 @@ NOKPROBE_SYMBOL(opt_pre_handler);
 static void free_aggr_kprobe(struct kprobe *p)
 {
struct optimized_kprobe *op;
+   struct early_kprobe_slot *ep;
 
op = container_of(p, struct optimized_kprobe, kp);
arch_remove_optimized_kprobe(op);
arch_remove_kprobe(p);
-   kfree(op);
+   ep = container_of(op, struct early_kprobe_slot, op);
+   if (likely(!ek_free_early_kprobe(ep)))
+   kfree(op);
 }
 
 /* Return true(!0) if the kprobe is ready for optimization. */
@@ -607,9 +621,15 @@ static void optimize_kprobe(struct kprobe *p)
struct optimized_kprobe *op;
 
/* Check if the kprobe is disabled or not ready for optimization. */
-   if (!kprobe_optready(p) || !kprobes_allow_optimization ||
-   (kprobe_disabled(p) || kprobes_all_disarmed))
-   return;
+   if (unlikely(kprobes_is_early())) {
+   BUG_ON(!(p->flags & KPROBE_FLAG_EARLY));
+   if (!kprobe_optready(p) || kprobe_disabled(p))
+   return;
+   } else {
+   if (!kprobe_optready(p) || !kprobes_allow_optimization ||
+   (kprobe_disabled(p) || kprobes_all_disarmed))
+   return;
+   }
 
/* Both of break_handler and post_handler are not supported. */
if (p->break_handler || p->post_handler)
@@ -631,7 +651,10 @@ static void optimize_kprobe(struct kprobe *p)
list_del_init(&op->list);
else {
list_add(&op->list, &optimizing_list);
-   kick_kprobe_optimizer();
+   if (kprobes_is_early())
+   arch_optimize_kprobes(&optimizing_list);
+   else
+   kick_kprobe_optimizer();
}
 }
 
@@ -1505,6 +1528,8 @@ out:
return ret;
 }
 
+static int register_early_kprobe(struct kprobe *p);
+
 int register_kprobe(struct kprobe *p)
 {
int ret;
@@ -1518,6 +1543,14 @@ int register_kprobe(struct kprobe *p)
return PTR_ERR(addr);
p->addr = addr;
 
+   if (unlikely(kprobes_is_early())) {
+   p->flags |= KPROBE_FLAG_EARLY;
+   return register_early_kprobe(p);
+   }
+
+   WARN(p->flags & KPROBE_FLAG_EARLY,
+   "register early kprobe after kprobes initialized\n");
+
ret = check_kprobe_rereg(p);
if (ret)
return ret;
@@ -2156,6 +2189,8 @@ static struct notifier_block kprobe_module_nb = {
 extern unsigned long 

[RFC PATCH v3 12/26] early kprobes: Add a KPROBE_FLAG_EARLY for early kprobe.

2015-02-12 Thread Wang Nan
Introduce KPROBE_FLAG_EARLY for further use. KPROBE_FLAG_EARLY
indicates that a kprobe is installed at a very early stage, so its
resources should be allocated statically.

Signed-off-by: Wang Nan 
---
 include/linux/kprobes.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index e1c8307..8d2e754 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -130,6 +130,7 @@ struct kprobe {
   * this flag is only for optimized_kprobe.
   */
 #define KPROBE_FLAG_FTRACE 8 /* probe is using ftrace */
+#define KPROBE_FLAG_EARLY  16 /* early kprobe */
 
 /* Has this kprobe gone ? */
 static inline int kprobe_gone(struct kprobe *p)
-- 
1.8.4



[RFC PATCH v3 02/26] kprobes: make kprobes/enabled work correctly for optimized kprobes.

2015-02-12 Thread Wang Nan
debugfs/kprobes/enabled doesn't work correctly on optimized kprobes.
Masami Hiramatsu has a test report on x86_64 platform:

https://lkml.org/lkml/2015/1/19/274

This patch forces a kprobe to be unoptimized if kprobes_all_disarmed
is set. It also checks the flag in the unregistering path, to skip the
unneeded disarming process when kprobes are globally disarmed.

Signed-off-by: Wang Nan 
Signed-off-by: Masami Hiramatsu 
Cc: Ingo Molnar 
Signed-off-by: Andrew Morton 
---
 kernel/kprobes.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index c397900..c90e417 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -869,7 +869,8 @@ static void __disarm_kprobe(struct kprobe *p, bool reopt)
 {
struct kprobe *_p;
 
-   unoptimize_kprobe(p, false);/* Try to unoptimize */
+   /* Try to unoptimize */
+   unoptimize_kprobe(p, kprobes_all_disarmed);
 
if (!kprobe_queued(p)) {
arch_disarm_kprobe(p);
@@ -1571,7 +1572,13 @@ static struct kprobe *__disable_kprobe(struct kprobe *p)
 
/* Try to disarm and disable this/parent probe */
if (p == orig_p || aggr_kprobe_disabled(orig_p)) {
-   disarm_kprobe(orig_p, true);
+   /*
+* If kprobes_all_disarmed is set, orig_p
+* should have already been disarmed, so
+* skip the unneeded disarming process.
+*/
+   if (!kprobes_all_disarmed)
+   disarm_kprobe(orig_p, true);
orig_p->flags |= KPROBE_FLAG_DISABLED;
}
}
-- 
1.8.4



[RFC PATCH v3 01/26] kprobes: set kprobes_all_disarmed earlier to enable re-optimization.

2015-02-12 Thread Wang Nan
In original code, the probed instruction doesn't get optimized after

echo 0 > /sys/kernel/debug/kprobes/enabled
echo 1 > /sys/kernel/debug/kprobes/enabled

This is because the original code checks kprobes_all_disarmed in
optimize_kprobe(), but this flag is only turned off after calling that
function. Therefore, optimize_kprobe() sees
kprobes_all_disarmed == true and skips the optimization.

This patch simply turns off kprobes_all_disarmed earlier to enable
optimization.
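
The ordering problem can be reproduced with a small userspace model. This is
a sketch only: struct probe, arm_all_old() and arm_all_fixed() are
hypothetical stand-ins for the kernel structures and for arm_all_kprobes()
before and after this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct probe {
    bool armed;
    bool optimized;
};

/* Models kprobes_all_disarmed. */
static bool all_disarmed = true;

/* Models optimize_kprobe(): refuses to work while globally disarmed. */
static void optimize_probe(struct probe *p)
{
    if (all_disarmed)
        return;
    p->optimized = true;
}

/* Models arm_kprobe(): arming also tries to optimize. */
static void arm_probe(struct probe *p)
{
    assert(p != NULL);
    p->armed = true;
    optimize_probe(p);
}

/* Old order: the flag is cleared only after every probe was armed. */
static void arm_all_old(struct probe *probes, size_t n)
{
    for (size_t i = 0; i < n; i++)
        arm_probe(&probes[i]);
    all_disarmed = false; /* too late, optimization was already skipped */
}

/* Fixed order: clear the flag first, then arm. */
static void arm_all_fixed(struct probe *probes, size_t n)
{
    all_disarmed = false;
    for (size_t i = 0; i < n; i++)
        arm_probe(&probes[i]);
}
```

With the old order every probe ends up armed but not optimized; with the
fixed order the same loop optimizes them.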

Signed-off-by: Wang Nan 
Signed-off-by: Masami Hiramatsu 
Cc: Ingo Molnar 
Signed-off-by: Andrew Morton 
---
 kernel/kprobes.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 2ca272f..c397900 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2320,6 +2320,12 @@ static void arm_all_kprobes(void)
if (!kprobes_all_disarmed)
goto already_enabled;
 
+   /*
+* optimize_kprobe() called by arm_kprobe() checks
+* kprobes_all_disarmed, so set kprobes_all_disarmed before
+* arm_kprobe.
+*/
+   kprobes_all_disarmed = false;
/* Arming kprobes doesn't optimize kprobe itself */
for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
head = &kprobe_table[i];
@@ -2328,7 +2334,6 @@ static void arm_all_kprobes(void)
arm_kprobe(p);
}
 
-   kprobes_all_disarmed = false;
printk(KERN_INFO "Kprobes globally enabled\n");
 
 already_enabled:
-- 
1.8.4



[RFC PATCH v3 10/26] ftrace: x86: try to fix ftrace when ftrace_replace_code.

2015-02-12 Thread Wang Nan
On x86, if ftrace_replace_code() fails to add a breakpoint, try to fix
the failure instead of calling ftrace_bug() right away.

Only one fix attempt is made, at add_breakpoints(). A failure at any
other stage is reported as a bug directly.

Signed-off-by: Wang Nan 
---
 arch/x86/kernel/ftrace.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 7bdba65..c869138 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -553,8 +553,16 @@ void ftrace_replace_code(int enable)
rec = ftrace_rec_iter_record(iter);
 
ret = add_breakpoints(rec, enable);
-   if (ret)
-   goto remove_breakpoints;
+   if (ret) {
+   /*
+* Don't trigger ftrace_bug here. Let the
+* remove_breakpoints procedure do it.
+*/
+   ret = __ftrace_tryfix_bug(ret, enable, rec,
+   add_breakpoints(rec, enable), false);
+   if (ret)
+   goto remove_breakpoints;
+   }
count++;
}
 
-- 
1.8.4



[RFC PATCH v3 00/26] Early kprobe: enable kprobes at very early booting stage.

2015-02-12 Thread Wang Nan
I feel very sorry for the people who reviewed my v2 patch series yesterday
at https://lkml.org/lkml/2015/2/12/234, because I didn't provide enough
information in the commit logs. This v3 patch series adds those missing
commit messages. There are also 2 small fixes on top of v2:

 1. Fix ftrace_sort_mcount_area. The original patch doesn't work for modules.
 2. Wrap the setting of kprobes_initialized in stop_machine() context.

Wang Nan (26):
  kprobes: set kprobes_all_disarmed earlier to enable re-optimization.
  kprobes: make kprobes/enabled work correctly for optimized kprobes.
  kprobes: x86: mark 2 bytes NOP as boostable.
  ftrace: don't update record flags if code modification fail.
  ftrace/x86: Ensure rec->flags no change when failure occurs.
  ftrace: sort ftrace entries earlier.
  ftrace: allow search ftrace addr before ftrace fully inited.
  ftrace: enable make ftrace nop before ftrace_init().
  ftrace: allow fixing code update failure by notifier chain.
  ftrace: x86: try to fix ftrace when ftrace_replace_code.
  early kprobes: introduce kprobe_is_early for further early kprobe use.
  early kprobes: Add a KPROBE_FLAG_EARLY for early kprobe.
  early kprobes: ARM: directly modify code.
  early kprobes: ARM: introduce early kprobes related code area.
  early kprobes: x86: directly modify code.
  early kprobes: x86: introduce early kprobes related code area.
  early kprobes: introduces macros for allocing early kprobe resources.
  early kprobes: allows __alloc_insn_slot() from early kprobes slots.
  early kprobes: prohibit probing at early kprobe reserved area.
  early kprobes: core logic of early kprobes.
  early kprobes: add CONFIG_EARLY_KPROBES option.
  early kprobes: introduce arch_fix_ftrace_early_kprobe().
  early kprobes: x86: arch_restore_optimized_kprobe().
  early kprobes: core logic to support early kprobe on ftrace.
  early kprobes: introduce kconfig option to support early kprobe on
ftrace.
  kprobes: enable 'ekprobe=' cmdline option for early kprobes.

 arch/Kconfig  |  15 ++
 arch/arm/include/asm/kprobes.h|  31 ++-
 arch/arm/kernel/vmlinux.lds.S |   2 +
 arch/arm/probes/kprobes/opt-arm.c |  12 +-
 arch/x86/include/asm/insn.h   |   7 +-
 arch/x86/include/asm/kprobes.h|  47 +++-
 arch/x86/kernel/ftrace.c  |  23 +-
 arch/x86/kernel/kprobes/core.c|   2 +-
 arch/x86/kernel/kprobes/opt.c |  69 +-
 arch/x86/kernel/vmlinux.lds.S |   2 +
 include/linux/ftrace.h|  37 +++
 include/linux/kprobes.h   | 132 +++
 init/main.c   |   1 +
 kernel/kprobes.c  | 479 +-
 kernel/trace/ftrace.c | 157 +++--
 15 files changed, 969 insertions(+), 47 deletions(-)

-- 
1.8.4



[RFC PATCH v3 14/26] early kprobes: ARM: introduce early kprobes related code area.

2015-02-12 Thread Wang Nan
In ARM's vmlinux.lds, introduce a code area inside the text section.
The executable area used by early kprobes will be allocated from there.
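
The layout that the linker script macro in this patch reserves can be worked
through in plain C. This is only an arithmetic sketch: early_area_layout()
and its struct are hypothetical, and the sizes passed in are inputs for
illustration rather than the real template size.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of a, where a is a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

struct early_area_layout {
    size_t code_area_start;
    size_t code_area_end;
    size_t insn_slot_start;
    size_t insn_slot_end;
};

/*
 * Mirror the steps of EARLY_KPROBES_CODES_AREA: align to 8, reserve
 * optinsn_size bytes per slot, align to 8 again, then reserve
 * insn_words * opcode_size bytes per slot.
 */
static struct early_area_layout
early_area_layout(size_t base, size_t optinsn_size, size_t insn_words,
                  size_t opcode_size, size_t nr_slots)
{
    struct early_area_layout l;

    assert(nr_slots > 0);
    l.code_area_start = ALIGN_UP(base, 8);
    l.code_area_end = l.code_area_start + optinsn_size * nr_slots;
    l.insn_slot_start = ALIGN_UP(l.code_area_end, 8);
    l.insn_slot_end = l.insn_slot_start + insn_words * opcode_size * nr_slots;
    return l;
}
```

For example, with a made-up base of 0x1004, 100 bytes per optimized
instruction buffer, 2 opcode words of 4 bytes each and 3 slots, the code
area covers 4104..4404 and the instruction slots cover 4408..4432.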

Signed-off-by: Wang Nan 
---
 arch/arm/include/asm/kprobes.h | 31 +--
 arch/arm/kernel/vmlinux.lds.S  |  2 ++
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kprobes.h b/arch/arm/include/asm/kprobes.h
index 3ea9be5..0a4421e 100644
--- a/arch/arm/include/asm/kprobes.h
+++ b/arch/arm/include/asm/kprobes.h
@@ -17,16 +17,42 @@
 #define _ARM_KPROBES_H
 
 #include 
-#include 
-#include 
 
 #define __ARCH_WANT_KPROBES_INSN_SLOT
 #define MAX_INSN_SIZE  2
 
+#ifdef __ASSEMBLY__
+
+#define KPROBE_OPCODE_SIZE 4
+#define MAX_OPTINSN_SIZE (optprobe_template_end - optprobe_template_entry)
+
+#ifdef CONFIG_EARLY_KPROBES
+#define EARLY_KPROBES_CODES_AREA   \
+   . = ALIGN(8);   \
+   VMLINUX_SYMBOL(__early_kprobes_start) = .;  \
+   VMLINUX_SYMBOL(__early_kprobes_code_area_start) = .;\
+   . = . + MAX_OPTINSN_SIZE * CONFIG_NR_EARLY_KPROBES_SLOTS;   \
+   VMLINUX_SYMBOL(__early_kprobes_code_area_end) = .;  \
+   . = ALIGN(8);   \
+   VMLINUX_SYMBOL(__early_kprobes_insn_slot_start) = .;\
+   . = . + MAX_INSN_SIZE * KPROBE_OPCODE_SIZE * 
CONFIG_NR_EARLY_KPROBES_SLOTS;\
+   VMLINUX_SYMBOL(__early_kprobes_insn_slot_end) = .;  \
+   VMLINUX_SYMBOL(__early_kprobes_end) = .;
+
+#else
+#define EARLY_KPROBES_CODES_AREA
+#endif
+
+#else
+
+#include 
+#include 
+
 #define flush_insn_slot(p) do { } while (0)
 #define kretprobe_blacklist_size   0
 
 typedef u32 kprobe_opcode_t;
+#define KPROBE_OPCODE_SIZE sizeof(kprobe_opcode_t)
 struct kprobe;
 #include 
 
@@ -83,4 +109,5 @@ struct arch_optimized_insn {
 */
 };
 
+#endif /* __ASSEMBLY__ */
 #endif /* _ARM_KPROBES_H */
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 9351f7f..6fa2b85 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -11,6 +11,7 @@
 #ifdef CONFIG_ARM_KERNMEM_PERMS
 #include 
 #endif
+#include 

 #define PROC_INFO  \
. = ALIGN(4);   \
@@ -108,6 +109,7 @@ SECTIONS
SCHED_TEXT
LOCK_TEXT
KPROBES_TEXT
+   EARLY_KPROBES_CODES_AREA
IDMAP_TEXT
 #ifdef CONFIG_MMU
*(.fixup)
-- 
1.8.4



[RFC PATCH v3 13/26] early kprobes: ARM: directly modify code.

2015-02-12 Thread Wang Nan
For early kprobes, we can simply patch the text because we are in a
relatively simple environment.

Signed-off-by: Wang Nan 
---
 arch/arm/probes/kprobes/opt-arm.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm/probes/kprobes/opt-arm.c 
b/arch/arm/probes/kprobes/opt-arm.c
index bcdecc2..43446df 100644
--- a/arch/arm/probes/kprobes/opt-arm.c
+++ b/arch/arm/probes/kprobes/opt-arm.c
@@ -330,8 +330,18 @@ void __kprobes arch_optimize_kprobes(struct list_head 
*oplist)
 * Similar to __arch_disarm_kprobe, operations which
 * removing breakpoints must be wrapped by stop_machine
 * to avoid racing.
+*
+* If this function is called before kprobes initialized,
+* the kprobe should be an early kprobe, the instruction
+* is not armed with breakpoint. There should be only
+* one core now, so directly __patch_text is enough.
 */
-   kprobes_remove_breakpoint(op->kp.addr, insn);
+   if (unlikely(kprobes_is_early())) {
+   BUG_ON(!(op->kp.flags & KPROBE_FLAG_EARLY));
+   __patch_text(op->kp.addr, insn);
+   } else {
+   kprobes_remove_breakpoint(op->kp.addr, insn);
+   }
 
list_del_init(&op->list);
}
-- 
1.8.4



[RFC PATCH v3 00/26] Early kprobe: enable kprobes at very early booting stage.

2015-02-12 Thread Wang Nan
I fell very sorry for people who reviewed my v2 patch series yesterday
at https://lkml.org/lkml/2015/2/12/234 because I didn't provide enough
information in commit log. This v3 patch series add those missing
commit messages. There are also 2 small fix based on v2:

 1. Fixes ftrace_sort_mcount_area. Original patch doesn't work for module.
 2. Wraps setting of kprobes_initialized in stop_machine() context. 

Wang Nan (26):
  kprobes: set kprobes_all_disarmed earlier to enable re-optimization.
  kprobes: makes kprobes/enabled works correctly for optimized kprobes.
  kprobes: x86: mark 2 bytes NOP as boostable.
  ftrace: don't update record flags if code modification fail.
  ftrace/x86: Ensure rec->flags no change when failure occures.
  ftrace: sort ftrace entries earlier.
  ftrace: allow search ftrace addr before ftrace fully inited.
  ftrace: enable make ftrace nop before ftrace_init().
  ftrace: allow fixing code update failure by notifier chain.
  ftrace: x86: try to fix ftrace when ftrace_replace_code.
  early kprobes: introduce kprobe_is_early for futher early kprobe use.
  early kprobes: Add an KPROBE_FLAG_EARLY for early kprobe.
  early kprobes: ARM: directly modify code.
  early kprobes: ARM: introduce early kprobes related code area.
  early kprobes: x86: directly modify code.
  early kprobes: x86: introduce early kprobes related code area.
  early kprobes: introduces macros for allocing early kprobe resources.
  early kprobes: allows __alloc_insn_slot() from early kprobes slots.
  early kprobes: perhibit probing at early kprobe reserved area.
  early kprobes: core logic of eraly kprobes.
  early kprobes: add CONFIG_EARLY_KPROBES option.
  early kprobes: introduce arch_fix_ftrace_early_kprobe().
  early kprobes: x86: arch_restore_optimized_kprobe().
  early kprobes: core logic to support early kprobe on ftrace.
  early kprobes: introduce kconfig option to support early kprobe on ftrace.
  kprobes: enable 'ekprobe=' cmdline option for early kprobes.

 arch/Kconfig  |  15 ++
 arch/arm/include/asm/kprobes.h|  31 ++-
 arch/arm/kernel/vmlinux.lds.S |   2 +
 arch/arm/probes/kprobes/opt-arm.c |  12 +-
 arch/x86/include/asm/insn.h   |   7 +-
 arch/x86/include/asm/kprobes.h|  47 +++-
 arch/x86/kernel/ftrace.c  |  23 +-
 arch/x86/kernel/kprobes/core.c|   2 +-
 arch/x86/kernel/kprobes/opt.c |  69 +-
 arch/x86/kernel/vmlinux.lds.S |   2 +
 include/linux/ftrace.h|  37 +++
 include/linux/kprobes.h   | 132 +++
 init/main.c   |   1 +
 kernel/kprobes.c  | 479 +-
 kernel/trace/ftrace.c | 157 +++--
 15 files changed, 969 insertions(+), 47 deletions(-)

-- 
1.8.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC PATCH v3 06/26] ftrace: sort ftrace entries earlier.

2015-02-12 Thread Wang Nan
By extracting the mcount sorting code and running it earlier, further
patches will be able to determine whether an address is on an ftrace
entry by using bsearch().

ftrace_sort_mcount_area() will be called before, during and after
ftrace_init() (the latter on module insertion). Ensure it sorts the
kernel mcount table only once.

Signed-off-by: Wang Nan 
---
 include/linux/ftrace.h |  2 ++
 init/main.c|  1 +
 kernel/trace/ftrace.c  | 38 --
 3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 1da6029..8db315a 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -701,8 +701,10 @@ static inline void __ftrace_enabled_restore(int enabled)
 
 #ifdef CONFIG_FTRACE_MCOUNT_RECORD
 extern void ftrace_init(void);
+extern void ftrace_init_early(void);
 #else
 static inline void ftrace_init(void) { }
+static inline void ftrace_init_early(void) { }
 #endif
 
 /*
diff --git a/init/main.c b/init/main.c
index 6f0f1c5f..eaafc3e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -517,6 +517,7 @@ asmlinkage __visible void __init start_kernel(void)
boot_cpu_init();
page_address_init();
pr_notice("%s", linux_banner);
+   ftrace_init_early();
setup_arch(&command_line);
mm_init_cpumask(&init_mm);
setup_command_line(command_line);
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 6c6cbb1..a75cfbe 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1169,6 +1169,7 @@ struct ftrace_page {
 
 static struct ftrace_page  *ftrace_pages_start;
 static struct ftrace_page  *ftrace_pages;
+static bool kernel_mcount_sorted = false;
 
 static bool __always_inline ftrace_hash_empty(struct ftrace_hash *hash)
 {
@@ -4743,6 +4744,32 @@ static void ftrace_swap_ips(void *a, void *b, int size)
*ipb = t;
 }
 
+static void ftrace_sort_mcount_area(unsigned long *start, unsigned long *end)
+{
+   extern unsigned long __start_mcount_loc[];
+   extern unsigned long __stop_mcount_loc[];
+
+   unsigned long count;
+   bool is_kernel_mcount;
+
+   count = end - start;
+   if (!count)
+   return;
+
+   is_kernel_mcount =
+   (start == __start_mcount_loc) &&
+   (end == __stop_mcount_loc);
+
+   if (is_kernel_mcount && kernel_mcount_sorted)
+   return;
+
+   sort(start, count, sizeof(*start),
+   ftrace_cmp_ips, ftrace_swap_ips);
+
+   if (is_kernel_mcount)
+   kernel_mcount_sorted = true;
+}
+
 static int ftrace_process_locs(struct module *mod,
   unsigned long *start,
   unsigned long *end)
@@ -4761,8 +4788,7 @@ static int ftrace_process_locs(struct module *mod,
if (!count)
return 0;
 
-   sort(start, count, sizeof(*start),
-ftrace_cmp_ips, ftrace_swap_ips);
+   ftrace_sort_mcount_area(start, end);
 
start_pg = ftrace_allocate_pages(count);
if (!start_pg)
@@ -4965,6 +4991,14 @@ void __init ftrace_init(void)
ftrace_disabled = 1;
 }
 
+void __init ftrace_init_early(void)
+{
+   extern unsigned long __start_mcount_loc[];
+   extern unsigned long __stop_mcount_loc[];
+
+   ftrace_sort_mcount_area(__start_mcount_loc, __stop_mcount_loc);
+}
+
 /* Do nothing if arch does not support this */
 void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 {
-- 
1.8.4



[RFC PATCH v3 11/26] early kprobes: introduce kprobe_is_early for futher early kprobe use.

2015-02-12 Thread Wang Nan
The following early kprobe patches will enable registering kprobes very
early, even before the kprobe system is initialized. kprobes_is_early()
can be used to check whether we are working with early kprobes.

Signed-off-by: Wang Nan 
---
 include/linux/kprobes.h | 2 ++
 kernel/kprobes.c| 6 ++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 1ab5475..e1c8307 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -50,6 +50,8 @@
 #define KPROBE_REENTER 0x0004
 #define KPROBE_HIT_SSDONE  0x0008
 
+extern int kprobes_is_early(void);
+
 #else /* CONFIG_KPROBES */
 typedef int kprobe_opcode_t;
 struct arch_specific_insn {
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index c90e417..647c95a 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -68,6 +68,12 @@
 #endif
 
 static int kprobes_initialized;
+
+int kprobes_is_early(void)
+{
+   return !kprobes_initialized;
+}
+
 static struct hlist_head kprobe_table[KPROBE_TABLE_SIZE];
 static struct hlist_head kretprobe_inst_table[KPROBE_TABLE_SIZE];
 
-- 
1.8.4



[RFC PATCH v3 25/26] early kprobes: introduce kconfig option to support early kprobe on ftrace.

2015-02-12 Thread Wang Nan
On platforms (like x86) that support CONFIG_KPROBES_ON_FTRACE, make
early kprobes depend on it so that we are able to probe function entries.

Signed-off-by: Wang Nan 
---
 arch/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 32e9f4a..79f809d 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -48,7 +48,7 @@ config KPROBES
 
 config EARLY_KPROBES
bool "Enable kprobes at very early booting stage"
-   depends on KPROBES && OPTPROBES
+   depends on KPROBES && OPTPROBES && (KPROBES_ON_FTRACE || !HAVE_KPROBES_ON_FTRACE)
def_bool y
help
  Enable kprobe at very early booting stage.
-- 
1.8.4



[RFC PATCH v3 23/26] early kprobes: x86: arch_restore_optimized_kprobe().

2015-02-12 Thread Wang Nan
arch_restore_optimized_kprobe() can be used to temporarily restore the
probed instruction. It effectively disables the optimized kprobe but
keeps the related data structures. It uses stop_machine() to enforce
atomicity.

Signed-off-by: Wang Nan 
---
 arch/x86/kernel/kprobes/opt.c | 26 ++
 include/linux/kprobes.h   |  1 +
 2 files changed, 27 insertions(+)

diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index f3ea954..12332c2 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -486,4 +487,29 @@ void arch_fix_ftrace_early_kprobe(struct optimized_kprobe *op)
memcpy(_p->opcode, correct_nop5, sizeof(kprobe_opcode_t));
 
 }
+
+static int do_restore_kprobe(void *p)
+{
+   struct optimized_kprobe *op = p;
+   u8 insn_buf[RELATIVEJUMP_SIZE];
+
+   memcpy(insn_buf, &op->kp.opcode, sizeof(kprobe_opcode_t));
+   memcpy(insn_buf + INT3_SIZE,
+   op->optinsn.copied_insn,
+   RELATIVE_ADDR_SIZE);
+   text_poke(op->kp.addr, insn_buf, RELATIVEJUMP_SIZE);
+   return 0;
+}
+
+void arch_restore_optimized_kprobe(struct optimized_kprobe *op)
+{
+   u32 mask = KPROBE_FLAG_EARLY |
+   KPROBE_FLAG_OPTIMIZED |
+   KPROBE_FLAG_FTRACE;
+
+   if ((op->kp.flags & mask) != mask)
+   return;
+
+   stop_machine(do_restore_kprobe, op, NULL);
+}
 #endif
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index e483f1b..e615402 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -462,6 +462,7 @@ extern int arch_prepare_kprobe_ftrace(struct kprobe *p);
 
 #ifdef CONFIG_EARLY_KPROBES
 extern void arch_fix_ftrace_early_kprobe(struct optimized_kprobe *p);
+extern void arch_restore_optimized_kprobe(struct optimized_kprobe *p);
 #endif
 
 #endif
-- 
1.8.4



[RFC PATCH v3 08/26] ftrace: enable make ftrace nop before ftrace_init().

2015-02-12 Thread Wang Nan
This patch is for early kprobes.

Ftrace converts ftrace entries to nops during initialization, which
conflicts with early kprobes if one of them probes an ftrace entry
before that conversion. On x86, an ftrace entry is a 'call' instruction,
which happens to be unboostable.

This patch provides ftrace_process_loc_early() to allow early kprobes to
convert the target instruction before ftrace_init() is called.
ftrace_process_loc_early() may only be called before ftrace_init().

However, on x86 this patch alone is not enough. Because ideal_nop() is
updated during setup_arch(), we cannot ensure that
ftrace_process_loc_early() chooses the same nop as normal ftrace. I'll
use another mechanism to solve this problem.

Signed-off-by: Wang Nan 
---
 include/linux/ftrace.h |  5 +
 kernel/trace/ftrace.c  | 18 ++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 8db315a..d37ccd8a 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -702,9 +702,14 @@ static inline void __ftrace_enabled_restore(int enabled)
 #ifdef CONFIG_FTRACE_MCOUNT_RECORD
 extern void ftrace_init(void);
 extern void ftrace_init_early(void);
+extern int ftrace_process_loc_early(unsigned long ip);
 #else
 static inline void ftrace_init(void) { }
 static inline void ftrace_init_early(void) { }
+static inline int ftrace_process_loc_early(unsigned long __unused)
+{
+   return 0;
+}
 #endif
 
 /*
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index fc0c1aa..e39e72a 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5037,6 +5037,24 @@ void __init ftrace_init_early(void)
ftrace_sort_mcount_area(__start_mcount_loc, __stop_mcount_loc);
 }
 
+int __init ftrace_process_loc_early(unsigned long addr)
+{
+   unsigned long ip;
+   struct dyn_ftrace fake_rec;
+   int ret;
+
+   BUG_ON(ftrace_pages_start);
+
+   ip = ftrace_location(addr);
+   if (ip != addr)
+   return -EINVAL;
+
+   memset(&fake_rec, '\0', sizeof(fake_rec));
+   fake_rec.ip = ip;
+   ret = ftrace_make_nop(NULL, &fake_rec, MCOUNT_ADDR);
+   return ret;
+}
+
 /* Do nothing if arch does not support this */
 void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 {
-- 
1.8.4



[RFC PATCH v3 03/26] kprobes: x86: mark 2 bytes NOP as boostable.

2015-02-12 Thread Wang Nan
Currently, x86 kprobes is unable to boost nops with a 2-byte opcode
(0x0f 0x1f), such as:

nopl 0x0(%rax,%rax,1)

which is 0x0f 0x1f 0x44 0x00 0x00.

Such a nop is exactly 5 bytes long, which is enough to hold a relative
jmp instruction. Boosting it should obviously be safe.

This patch enables boosting such nops by simply updating the
twobyte_is_boostable[] array.

Signed-off-by: Wang Nan 
Acked-by: Masami Hiramatsu 
---
 arch/x86/kernel/kprobes/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 98f654d..6a1146e 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -84,7 +84,7 @@ static volatile u32 twobyte_is_boostable[256 / 32] = {
/*  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  */
/*  --  */
W(0x00, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0) | /* 00 */
-   W(0x10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 10 */
+   W(0x10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1) , /* 10 */
W(0x20, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* 20 */
W(0x30, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 30 */
W(0x40, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 40 */
-- 
1.8.4



[RFC PATCH v3 09/26] ftrace: allow fixing code update failure by notifier chain.

2015-02-12 Thread Wang Nan
This patch introduces a notifier chain (ftrace_update_notifier_list) and
ftrace_tryfix_bug(). The goal of this patch is to give other
subsystems a chance to fix the code if they alter ftrace entries before
ftrace_init().

Such subsystems should register a callback with
register_ftrace_update_notifier(). Ftrace will trigger the callback via
ftrace_tryfix_bug() when it fails to alter an ftrace entry, instead of
directly firing an ftrace_bug(). It wraps the failure information in
a struct ftrace_update_notifier_info, from which a subscriber can
determine what ftrace was trying to do.

A subscriber of the notifier chain should return NOTIFY_STOP if it can
deal with the problem, or NOTIFY_DONE to pass it on to the others. By
setting info->retry it can tell ftrace to retry the failed operation.

Signed-off-by: Wang Nan 
---
 include/linux/ftrace.h | 30 ++
 kernel/trace/ftrace.c  | 46 --
 2 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index d37ccd8a..98da86d 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -283,6 +283,21 @@ int ftrace_arch_code_modify_post_process(void);
 struct dyn_ftrace;
 
 void ftrace_bug(int err, struct dyn_ftrace *rec);
+int ftrace_tryfix(int failed, int enable, struct dyn_ftrace *rec);
+
+#define __ftrace_tryfix_bug(__failed, __enable, __rec, __retry, __trigger)\
+   ({  \
+   int __fix_ret = ftrace_tryfix((__failed), (__enable), (__rec));\
+   __fix_ret = (__fix_ret == -EAGAIN) ?\
+   ({ __retry; }) :\
+   __fix_ret;  \
+   if (__fix_ret && (__trigger))   \
+   ftrace_bug(__failed, __rec);\
+   __fix_ret;  \
+   })
+
+#define ftrace_tryfix_bug(__failed, __enable, __rec, __retry)  \
+   __ftrace_tryfix_bug(__failed, __enable, __rec, __retry, true)
 
 struct seq_file;
 
@@ -699,10 +714,20 @@ static inline void __ftrace_enabled_restore(int enabled)
 # define trace_preempt_off(a0, a1) do { } while (0)
 #endif
 
+struct ftrace_update_notifier_info {
+   struct dyn_ftrace *rec;
+   int errno;
+   int enable;
+
+   /* Filled by subscriber */
+   bool retry;
+};
+
 #ifdef CONFIG_FTRACE_MCOUNT_RECORD
 extern void ftrace_init(void);
 extern void ftrace_init_early(void);
 extern int ftrace_process_loc_early(unsigned long ip);
+extern int register_ftrace_update_notifier(struct notifier_block *nb);
 #else
 static inline void ftrace_init(void) { }
 static inline void ftrace_init_early(void) { }
@@ -710,6 +735,11 @@ static inline int ftrace_process_loc_early(unsigned long __unused)
 {
return 0;
 }
+
+static inline int register_ftrace_update_notifier(struct notifier_block *__unused)
+{
+   return 0;
+}
 #endif
 
 /*
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index e39e72a..d75b823 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -112,6 +112,7 @@ ftrace_func_t ftrace_trace_function __read_mostly = ftrace_stub;
 ftrace_func_t ftrace_pid_function __read_mostly = ftrace_stub;
 static struct ftrace_ops global_ops;
 static struct ftrace_ops control_ops;
+static ATOMIC_NOTIFIER_HEAD(ftrace_update_notifier_list);
 
 static void ftrace_ops_recurs_func(unsigned long ip, unsigned long parent_ip,
   struct ftrace_ops *op, struct pt_regs *regs);
@@ -1971,6 +1972,28 @@ void ftrace_bug(int failed, struct dyn_ftrace *rec)
}
 }
 
+int ftrace_tryfix(int failed, int enable, struct dyn_ftrace *rec)
+{
+   int notify_result = NOTIFY_DONE;
+   struct ftrace_update_notifier_info info = {
+   .rec = rec,
+   .errno = failed,
+   .enable = enable,
+   .retry = false,
+   };
+
+   notify_result = atomic_notifier_call_chain(
+   &ftrace_update_notifier_list,
+   0, &info);
+
+   if (notify_result != NOTIFY_STOP)
+   return failed;
+
+   if (info.retry)
+   return -EAGAIN;
+   return 0;
+}
+
 static int ftrace_check_record(struct dyn_ftrace *rec, int enable, int update)
 {
unsigned long flag = 0UL;
@@ -2298,9 +2321,12 @@ void __weak ftrace_replace_code(int enable)
do_for_each_ftrace_rec(pg, rec) {
failed = __ftrace_replace_code(rec, enable);
if (failed) {
-   ftrace_bug(failed, rec);
-   /* Stop processing */
-   return;
+   failed = ftrace_tryfix_bug(failed, enable, rec,
+   __ftrace_replace_code(rec, enable));
+
+   /* Stop 

Re: [RESEND Patch V2 1/4] xen: build infrastructure for generating hypercall depending symbols

2015-02-12 Thread Juergen Gross

 ##   ###  # #   #
 # #   #   ###  # #
 # #   #   # #   #  #
 ###   #  #  #  #  
 # #   #   # #  # #
 # #   ###  # #
 ####  # #   #

David still wants a comment from the x86 maintainers...

Juergen

On 01/21/2015 08:49 AM, Juergen Gross wrote:

Today there are several places in the kernel which build tables
containing one entry for each possible Xen hypercall. Create an
infrastructure to be able to generate these tables at build time.

Based-on-patch-by: Jan Beulich 
Signed-off-by: Juergen Gross 
Reviewed-by: David Vrabel 
---
  arch/x86/syscalls/Makefile |  9 +
  scripts/xen-hypercalls.sh  | 12 
  2 files changed, 21 insertions(+)
  create mode 100644 scripts/xen-hypercalls.sh

diff --git a/arch/x86/syscalls/Makefile b/arch/x86/syscalls/Makefile
index 3323c27..a55abb9 100644
--- a/arch/x86/syscalls/Makefile
+++ b/arch/x86/syscalls/Makefile
@@ -19,6 +19,9 @@ quiet_cmd_syshdr = SYSHDR  $@
  quiet_cmd_systbl = SYSTBL  $@
cmd_systbl = $(CONFIG_SHELL) '$(systbl)' $< $@

+quiet_cmd_hypercalls = HYPERCALLS $@
+  cmd_hypercalls = $(CONFIG_SHELL) '$<' $@ $(filter-out $<,$^)
+
  syshdr_abi_unistd_32 := i386
  $(uapi)/unistd_32.h: $(syscall32) $(syshdr)
$(call if_changed,syshdr)
@@ -47,10 +50,16 @@ $(out)/syscalls_32.h: $(syscall32) $(systbl)
  $(out)/syscalls_64.h: $(syscall64) $(systbl)
$(call if_changed,systbl)

+$(out)/xen-hypercalls.h: $(srctree)/scripts/xen-hypercalls.sh
+   $(call if_changed,hypercalls)
+
+$(out)/xen-hypercalls.h: $(srctree)/include/xen/interface/xen*.h
+
  uapisyshdr-y  += unistd_32.h unistd_64.h unistd_x32.h
  syshdr-y  += syscalls_32.h
  syshdr-$(CONFIG_X86_64)   += unistd_32_ia32.h unistd_64_x32.h
  syshdr-$(CONFIG_X86_64)   += syscalls_64.h
+syshdr-$(CONFIG_XEN)   += xen-hypercalls.h

  targets   += $(uapisyshdr-y) $(syshdr-y)

diff --git a/scripts/xen-hypercalls.sh b/scripts/xen-hypercalls.sh
new file mode 100644
index 000..676d922
--- /dev/null
+++ b/scripts/xen-hypercalls.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+out="$1"
+shift
+in="$@"
+
+for i in $in; do
+   eval $CPP $LINUXINCLUDE -dD -imacros "$i" -x c /dev/null
+done | \
+awk '$1 == "#define" && $2 ~ /__HYPERVISOR_[a-z][a-z_0-9]*/ { v[$3] = $2 }
+   END {   print "/* auto-generated by scripts/xen-hypercall.sh */"
+   for (i in v) if (!(v[i] in v))
+   print "HYPERCALL("substr(v[i], 14)")"}' | sort -u >$out





[LKP] [mmu_gather] fb7332a9fed: +5.4% tlbflush.mem_acc_time_thread_ms -6.2% will-it-scale.per_thread_ops

2015-02-12 Thread Huang Ying
FYI, we noticed the below changes on

commit fb7332a9fedfd62b1ba6530c86f39f0fa38afd49 ("mmu_gather: move minimal range calculations into generic code")


testbox/testcase/testparams: ivb42/will-it-scale/performance-brk1

63648dd20fa0780a  fb7332a9fedfd62b1ba6530c86
----------------  --------------------------
         %stddev     %change         %stddev
             \          |                \
    733067 ±  0%      -6.2%     687433 ±  0%  will-it-scale.per_thread_ops
   3728462 ±  0%      -5.7%    3516927 ±  0%  will-it-scale.per_process_ops
      0.46 ±  0%      -3.3%       0.44 ±  0%  will-it-scale.scalability
    140435 ± 22%     -75.2%      34877 ± 32%  sched_debug.cpu#19.ttwu_count
        85 ± 32%    +144.6%        208 ± 30%  sched_debug.cfs_rq[8]:/.blocked_load_avg
    892437 ± 34%     -55.5%     397177 ± 16%  sched_debug.cpu#5.nr_switches
    445593 ± 34%     -55.5%     198460 ± 16%  sched_debug.cpu#5.sched_goidle
        92 ± 29%    +136.7%        219 ± 29%  sched_debug.cfs_rq[8]:/.tg_load_contrib
    374967 ± 14%     -56.3%     163891 ± 35%  sched_debug.cpu#5.ttwu_count
    924309 ± 32%     -51.0%     452960 ±  9%  sched_debug.cpu#5.sched_count
        90 ± 41%     +78.2%        161 ± 18%  sched_debug.cfs_rq[30]:/.tg_load_contrib
    205529 ± 15%     -38.3%     126863 ± 32%  sched_debug.cpu#40.ttwu_count
      1152 ± 33%     +50.4%       1734 ± 22%  sched_debug.cpu#13.ttwu_local
      3.44 ± 38%     +86.6%       6.42 ± 18%  perf-profile.cpu-cycles.rwsem_spin_on_owner.rwsem_down_write_failed.call_rwsem_down_write_failed.sys_brk.system_call_fastpath
         6 ± 39%    +100.0%         13 ±  9%  sched_debug.cpu#2.cpu_load[0]
      1.03 ± 11%     +46.6%       1.52 ± 12%  perf-profile.cpu-cycles.find_vma.do_munmap.sys_brk.system_call_fastpath.brk
      0.76 ± 32%     +65.2%       1.26 ± 14%  perf-profile.cpu-cycles.up_write.vma_adjust.vma_merge.do_brk.sys_brk
    538886 ± 38%     +70.3%     917481 ± 20%  sched_debug.cpu#26.ttwu_count
      0.76 ± 21%     +69.6%       1.28 ± 14%  perf-profile.cpu-cycles.find_vma.sys_brk.system_call_fastpath.brk
        16 ±  5%     +25.8%         20 ± 18%  sched_debug.cpu#30.cpu_load[1]
      2224 ± 11%     +21.8%       2709 ± 16%  sched_debug.cpu#41.curr->pid
      3.94 ±  9%     -30.5%       2.74 ± 17%  perf-profile.cpu-cycles._raw_spin_lock.try_to_wake_up.wake_up_process.__rwsem_do_wake.rwsem_wake
        28 ± 14%     -32.1%         19 ± 14%  sched_debug.cfs_rq[25]:/.load
        16 ±  4%     +25.0%         20 ± 12%  sched_debug.cpu#32.cpu_load[2]
        17 ± 16%     +20.3%         20 ± 10%  sched_debug.cfs_rq[34]:/.runnable_load_avg
    180505 ± 26%     -43.4%     102128 ± 18%  sched_debug.cpu#44.ttwu_count
      2135 ±  7%     +28.7%       2747 ± 22%  sched_debug.cpu#44.curr->pid
     13.14 ± 10%     +20.4%      15.82 ±  5%  perf-profile.cpu-cycles.call_rwsem_down_write_failed.sys_brk.system_call_fastpath.brk
     13.05 ± 10%     +20.4%      15.71 ±  5%  perf-profile.cpu-cycles.rwsem_down_write_failed.call_rwsem_down_write_failed.sys_brk.system_call_fastpath.brk
      2.30 ± 10%     +24.1%       2.86 ±  7%  perf-profile.cpu-cycles.vma_adjust.vma_merge.do_brk.sys_brk.system_call_fastpath
      1.70 ±  6%     -13.2%       1.47 ± 11%  perf-profile.cpu-cycles.clockevents_program_event.tick_program_event.__hrtimer_start_range_ns.hrtimer_start_range_ns.tick_nohz_restart
      5512 ±  1%     +27.7%       7040 ± 22%  sched_debug.cfs_rq[20]:/.exec_clock
        17 ±  5%     -30.9%         11 ± 32%  sched_debug.cpu#40.load
      2.73 ±  9%     +21.0%       3.30 ±  7%  perf-profile.cpu-cycles.vma_merge.do_brk.sys_brk.system_call_fastpath.brk
    505131 ± 13%     +26.4%     638526 ±  7%  sched_debug.cpu#32.ttwu_count
      1.09 ±  7%     -14.9%       0.93 ± 11%  perf-profile.cpu-cycles._raw_spin_unlock_irqrestore.rwsem_wake.call_rwsem_wake.sys_brk.system_call_fastpath
        16 ±  2%     +13.8%         18 ±  8%  sched_debug.cpu#32.cpu_load[3]
      1.73 ±  6%     -12.5%       1.52 ± 11%  perf-profile.cpu-cycles.tick_program_event.__hrtimer_start_range_ns.hrtimer_start_range_ns.tick_nohz_restart.tick_nohz_idle_exit
      1.89 ±  5%     -12.2%       1.66 ±  1%  perf-profile.cpu-cycles.set_next_entity.pick_next_task_fair.__sched_text_start.schedule_preempt_disabled.cpu_startup_entry
     17.50 ±  5%     -14.4%      14.98 ±  6%  perf-profile.cpu-cycles.try_to_wake_up.wake_up_process.__rwsem_do_wake.rwsem_wake.call_rwsem_wake
     18.55 ±  5%     -13.8%      16.00 ±  6%  perf-profile.cpu-cycles.wake_up_process.__rwsem_do_wake.rwsem_wake.call_rwsem_wake.sys_brk
       229 ±  6%     -10.5%        205 ±  0%  sched_debug.cfs_rq[2]:/.tg_runnable_contrib
     10557 ±  6%     -10.2%       9478 ±  0%  sched_debug.cfs_rq[2]:/.avg->runnable_avg_sum
     18.66 ±  5%     -13.4%      16.16 ±  5%  perf-profile.cpu-cycles.__rwsem_do_wake.rwsem_wake.call_rwsem_wake.sys_brk.system_call_fastpath
    745968 ±  4%      +9.1%     813977 ±  5%

[PATCH] virtio: don't set VIRTIO_CONFIG_S_DRIVER_OK twice.

2015-02-12 Thread Rusty Russell
I noticed this with the console device.  It's not *wrong*, just a bit
weird.

Signed-off-by: Rusty Russell 

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index b9f70dfc4751..5ce2aa48fc6e 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -236,7 +236,10 @@ static int virtio_dev_probe(struct device *_d)
if (err)
goto err;
 
-   add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
+   /* If probe didn't do it, mark device DRIVER_OK ourselves. */
+   if (!(dev->config->get_status(dev) & VIRTIO_CONFIG_S_DRIVER_OK))
+   virtio_device_ready(dev);
+
if (drv->scan)
drv->scan(dev);
 


[LKP] [mutex] 6aa15f5a2fe: -9.2% will-it-scale.per_process_ops

2015-02-12 Thread Huang Ying
FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git locking/core
commit 6aa15f5a2febe058056180786bb39513ad5ae70d ("mutex: In mutex_spin_on_owner(), return true when owner changes")


testbox/testcase/testparams: wsm/will-it-scale/performance-writeseek3

afffc6c1805d98e0  6aa15f5a2febe058056180786b
----------------  --------------------------
         %stddev     %change         %stddev
             \          |                \
  27329774 ±  5%     -98.7%     350559 ±  4%  will-it-scale.time.voluntary_context_switches
      1401 ±  4%    +340.4%       6172 ±  9%  will-it-scale.time.involuntary_context_switches
       402 ±  7%    +157.9%       1036 ±  0%  will-it-scale.time.system_time
       141 ±  6%    +146.3%        347 ±  0%  will-it-scale.time.percent_of_cpu_this_job_got
     28.29 ±  4%     -25.7%      21.01 ±  1%  will-it-scale.time.user_time
        73 ±  0%      -9.2%     706114 ±  7%  will-it-scale.per_process_ops
   4995546 ± 11%     -99.4%      31990 ± 29%  sched_debug.cpu#11.sched_count
    332497 ±  9%     -87.9%      40257 ±  4%  softirqs.SCHED
      0.96 ± 20%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.hrtimer_try_to_cancel.hrtimer_cancel.tick_nohz_restart.tick_nohz_idle_exit.cpu_startup_entry
      0.99 ± 14%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.hrtimer_cancel.tick_nohz_restart.tick_nohz_idle_exit.cpu_startup_entry.start_secondary
      1.40 ±  3%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.dequeue_entity.dequeue_task_fair.dequeue_task.deactivate_task.__schedule
      1.36 ± 15%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.get_nohz_timer_target.__hrtimer_start_range_ns.hrtimer_start.tick_nohz_stop_sched_tick.__tick_nohz_idle_enter
      1.62 ±  4%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.pick_next_task_fair.__schedule.schedule_preempt_disabled.cpu_startup_entry.start_secondary
      1.69 ±  2%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.dequeue_task_fair.dequeue_task.deactivate_task.__schedule.schedule_preempt_disabled
     15.54 ± 32%    +351.5%      70.18 ±  2%  perf-profile.cpu-cycles.mutex_optimistic_spin.__mutex_lock_slowpath.mutex_lock.generic_file_write_iter.new_sync_write
      1.94 ±  3%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.dequeue_task.deactivate_task.__schedule.schedule_preempt_disabled.__mutex_lock_slowpath
      1.96 ±  4%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.deactivate_task.__schedule.schedule_preempt_disabled.__mutex_lock_slowpath.mutex_lock
      2.08 ± 11%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles._raw_spin_lock.try_to_wake_up.wake_up_process.__mutex_unlock_slowpath.mutex_unlock
     20.14 ± 10%     -82.3%       3.56 ± 34%  perf-profile.cpu-cycles.start_secondary
     20.04 ± 10%     -82.2%       3.56 ± 34%  perf-profile.cpu-cycles.cpu_startup_entry.start_secondary
      2.67 ±  8%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles._raw_spin_unlock_irqrestore.__hrtimer_start_range_ns.hrtimer_start_range_ns.tick_nohz_restart.tick_nohz_idle_exit
   4495470 ±  7%     -99.4%      25868 ± 12%  sched_debug.cpu#9.nr_switches
   4496190 ±  7%     -99.4%      26052 ± 12%  sched_debug.cpu#9.sched_count
  40599.18 ± 41%    -100.0%       0.00 ±  0%  sched_debug.cfs_rq[6]:/.max_vruntime
      3.30 ±  8%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.__hrtimer_start_range_ns.hrtimer_start_range_ns.tick_nohz_restart.tick_nohz_idle_exit.cpu_startup_entry
      3.33 ±  6%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.hrtimer_start_range_ns.tick_nohz_restart.tick_nohz_idle_exit.cpu_startup_entry.start_secondary
   2247155 ±  7%     -99.4%      12447 ± 14%  sched_debug.cpu#9.sched_goidle
  40599.18 ± 41%    -100.0%       0.00 ±  0%  sched_debug.cfs_rq[6]:/.MIN_vruntime
      3.99 ±  6%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.__schedule.schedule_preempt_disabled.__mutex_lock_slowpath.mutex_lock.generic_file_write_iter
   2448071 ±  2%     -99.4%      13482 ±  9%  sched_debug.cpu#6.ttwu_count
   2554386 ±  2%     -99.5%      13020 ±  8%  sched_debug.cpu#6.sched_goidle
   5111673 ±  2%     -99.5%      26787 ±  8%  sched_debug.cpu#6.sched_count
   2527550 ±  6%     -99.5%      12538 ± 14%  sched_debug.cpu#9.ttwu_count
   5109913 ±  2%     -99.5%      26531 ±  8%  sched_debug.cpu#6.nr_switches
     54085 ± 37%    +845.7%     511463 ± 40%  sched_debug.cfs_rq[5]:/.min_vruntime
      8424 ± 16%    +560.9%      55678 ± 42%  sched_debug.cfs_rq[5]:/.exec_clock
   1201871 ± 19%     -83.0%     204673 ± 49%  sched_debug.cpu#5.sched_count
        10 ± 28%    +490.5%         62 ± 31%  sched_debug.cpu#5.cpu_load[4]
  93687275 ± 19%     -94.0%    5611171 ± 33%  cpuidle.C1-NHM.time
    558262 ± 18%     -96.2%      21169 ±  6%  cpuidle.C1-NHM.usage
 9.831e+08 ± 15%     -98.8%   11473350 ± 25%  cpuidle.C3-NHM.time
   3976387 ± 13%     -99.3%      26803 ±  6%  cpuidle.C3-NHM.usage
   1390481 ±  7%

Re: [PATCH] perf: fix building error in x86_64

2015-02-12 Thread Namhyung Kim
On Thu, Feb 12, 2015 at 04:56:44PM +0800, Hekuang wrote:
> 
> On 2015/2/12 16:07, Namhyung Kim wrote:
> >Hi,
> >
> >On Wed, Feb 11, 2015 at 10:01:08AM +0800, He Kuang wrote:
> >>When building with ARCH=x86_64, perf fails to compile with the following error:
> >>
> >>tests/builtin-test.o:(.data+0x158): undefined reference to `test__perf_time_to_tsc'
> >>collect2: error: ld returned 1 exit status
> >>Makefile.perf:632: recipe for target 'perf' failed
> >>...
> >>
> >>This is caused by commit c6e5e9fbc3ea1 ("perf tools: Fix building error
> >>in x86_64 when dwarf unwind is on"): the ARCH test in Makefile.perf
> >>conflicts with the __x86_64__ test in tests/builtin-test.c.
> >>On x86/x86_64 platforms, ARCH should always be overridden to x86, while
> >>IS_64_BIT stands for the actual architecture.
> >>
> >>Signed-off-by: He Kuang 
> >>---
> >>  tools/perf/config/Makefile.arch | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >>diff --git a/tools/perf/config/Makefile.arch b/tools/perf/config/Makefile.arch
> >>index ff95a68..8c6214d 100644
> >>--- a/tools/perf/config/Makefile.arch
> >>+++ b/tools/perf/config/Makefile.arch
> >>@@ -14,7 +14,7 @@ ifeq ($(RAW_ARCH),i386)
> >>  endif
> >>  ifeq ($(RAW_ARCH),x86_64)
> >>-  ARCH ?= x86
> >>+  override ARCH := x86
> >Hmm.. wouldn't it (again) break cross build then?
> >
> >Thanks,
> >Namhyung
> >
> 
> 
> I've tested both the 'make ARCH=x86' and 'make ARCH=x86_64' cases after a
> 'make clean'.
> 
> The issue was originally caused by the IS_X86_64 flag being wrongly cleared
> when ARCH=x86, which was already fixed by separating IS_64_BIT and ARCH in
> commit c6e5e9fbc3ea1 ("perf tools: Fix building error in x86_64 when
> dwarf unwind is on").
> 
> The only remaining problem is that we should override ARCH to x86, to stay
> compatible with 'ifeq ($(ARCH),x86)'.
> >>ifneq (, $(findstring m32,$(CFLAGS)))
> >>  RAW_ARCH := x86_32

Have you tried a cross build like 'make ARCH=arm' also?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mfd: stw481x: Remove unused fields from struct stw481x

2015-02-12 Thread Axel Lin
The mutex lock is not used at all, remove it.
The *vmmc_regulator is not necessary, use a local variable in
stw481x_vmmc_regulator_probe() instead.

Signed-off-by: Axel Lin 
---
 drivers/regulator/stw481x-vmmc.c | 8 
 include/linux/mfd/stw481x.h  | 4 
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/regulator/stw481x-vmmc.c b/drivers/regulator/stw481x-vmmc.c
index 89025f5..7d2ae3e 100644
--- a/drivers/regulator/stw481x-vmmc.c
+++ b/drivers/regulator/stw481x-vmmc.c
@@ -56,6 +56,7 @@ static int stw481x_vmmc_regulator_probe(struct 
platform_device *pdev)
 {
struct stw481x *stw481x = dev_get_platdata(&pdev->dev);
struct regulator_config config = { };
+   struct regulator_dev *rdev;
int ret;
 
/* First disable the external VMMC if it's active */
@@ -75,12 +76,11 @@ static int stw481x_vmmc_regulator_probe(struct 
platform_device *pdev)
  pdev->dev.of_node,
  &vmmc_regulator);
 
-   stw481x->vmmc_regulator = devm_regulator_register(&pdev->dev,
-   &vmmc_regulator, &config);
-   if (IS_ERR(stw481x->vmmc_regulator)) {
+   rdev = devm_regulator_register(&pdev->dev, &vmmc_regulator, &config);
+   if (IS_ERR(rdev)) {
dev_err(&pdev->dev,
"error initializing STw481x VMMC regulator\n");
-   return PTR_ERR(stw481x->vmmc_regulator);
+   return PTR_ERR(rdev);
}
 
dev_info(&pdev->dev, "initialized STw481x VMMC regulator\n");
diff --git a/include/linux/mfd/stw481x.h b/include/linux/mfd/stw481x.h
index eda1215..833074b 100644
--- a/include/linux/mfd/stw481x.h
+++ b/include/linux/mfd/stw481x.h
@@ -41,15 +41,11 @@
 
 /**
  * struct stw481x - state holder for the Stw481x drivers
- * @mutex: mutex to serialize I2C accesses
  * @i2c_client: corresponding I2C client
- * @regulator: regulator device for regulator children
  * @map: regmap handle to access device registers
  */
 struct stw481x {
-   struct mutex        lock;
struct i2c_client   *client;
-   struct regulator_dev *vmmc_regulator;
struct regmap   *map;
 };
 
-- 
1.9.1





Re: RAID1 might_sleep() warning on 3.19-rc7

2015-02-12 Thread NeilBrown
On Tue, 10 Feb 2015 10:29:36 +0100 Peter Zijlstra 
wrote:

> On Tue, Feb 10, 2015 at 01:50:17PM +1100, NeilBrown wrote:
> > On Mon, 9 Feb 2015 10:10:00 +0100 Peter Zijlstra  
> > wrote:
> > > > However, when io_schedule() explicitly calls blk_flush_plug(), then
> > > > @from_schedule=false variant is used, and the unplug functions are 
> > > > allowed to
> > > > allocate memory and block and maybe even call mempool_alloc() which 
> > > > might
> > > > call io_schedule().
> > > > 
> > > > This shouldn't be a problem as blk_flush_plug() spliced out the plug 
> > > > list, so
> > > > any recursive call will find an empty list and do nothing.
> > > 
> > > Unless, something along the way stuck something back on, right? So
> > > should we stick an:
> > > 
> > >   WARN_ON(current->in_iowait);
> > > 
> > > somewhere near where things are added to this plug list? (and move the
> > > blk_flush_plug() call inside of where that's actually true of course).
> > 
> > No, I don't think so.
> > 
> > It is certainly possible that some request on plug->cb_list could add
> > something to plug->list - which is processed after ->cb_list.
> > 
> > I think the best way to think about this is that the *problem* was that a
> > wait_event loop could spin without making any progress.   So any time that
> > clear forward progress is made it is safe sleep without necessitating the
> > warning.  Hence sched_annotate_sleep() is reasonable.
> > blk_flush_plug() with definitely have dispatched some requests if it
> > might_sleep(), so the sleep is OK.
> 
> Well, yes, but you forget that this gets us back into recursion land.
> io_schedule() calling io_schedule() calling io_schedule() and *boom*
> stack overflow -> dead machine.
> 
> We must either guarantee io_schedule() will never call io_schedule() or
> that io_schedule() itself will not add new work to the current plug such
> that calling io_schedule() itself will not recurse on the blk stuff.
> 
> Pick either option, but pick one.

I choose ... Buzz Lightyear !!!

Sorry, got carried away there.  Uhhmm.  I think I pick a/  (But I expect I'll
find a goat... ho hum).

Does this look credible?

Thanks,
NeilBrown


From: NeilBrown 
Date: Fri, 13 Feb 2015 15:49:17 +1100
Subject: [PATCH] sched: prevent recursion in io_schedule()

io_schedule() calls blk_flush_plug() which, depending on the
contents of current->plug, can initiate arbitrary blk-io requests.

Note that this contrasts with blk_schedule_flush_plug() which requires
all non-trivial work to be handed off to a separate thread.

This makes it possible for io_schedule() to recurse, and initiating
block requests could possibly call mempool_alloc() which, in times of
memory pressure, uses io_schedule().

Apart from any stack usage issues, io_schedule() will not behave
correctly when called recursively as delayacct_blkio_start() does
not allow for repeated calls.

So:
 - use in_iowait to detect recursion.  Set it earlier, and restore
   it to the old value.
 - move the call to "raw_rq" after the call to blk_flush_plug().
   As this is some sort of per-cpu thing, we want some chance that
   we are on the right CPU
 - When io_schedule() is called recursively, use blk_schedule_flush_plug()
   which cannot further recurse.
 - as this makes io_schedule() a lot more complex and as io_schedule()
   must match io_schedule_timeout(), put all the changes in
   io_schedule_timeout() and make io_schedule a simple wrapper for that.

Signed-off-by: NeilBrown 
Cc: Jens Axboe 
Cc: Peter Zijlstra 

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1f37fe7f77a4..90f3de8bc7ca 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4420,30 +4420,27 @@ EXPORT_SYMBOL_GPL(yield_to);
  */
 void __sched io_schedule(void)
 {
-   struct rq *rq = raw_rq();
-
-   delayacct_blkio_start();
-   atomic_inc(&rq->nr_iowait);
-   blk_flush_plug(current);
-   current->in_iowait = 1;
-   schedule();
-   current->in_iowait = 0;
-   atomic_dec(&rq->nr_iowait);
-   delayacct_blkio_end();
+   io_schedule_timeout(MAX_SCHEDULE_TIMEOUT);
 }
 EXPORT_SYMBOL(io_schedule);
 
 long __sched io_schedule_timeout(long timeout)
 {
-   struct rq *rq = raw_rq();
+   struct rq *rq;
long ret;
+   int old_iowait = current->in_iowait;
+
+   current->in_iowait = 1;
+   if (old_iowait)
+   blk_schedule_flush_plug(current);
+   else
+   blk_flush_plug(current);
 
delayacct_blkio_start();
+   rq = raw_rq();
atomic_inc(&rq->nr_iowait);
-   blk_flush_plug(current);
-   current->in_iowait = 1;
ret = schedule_timeout(timeout);
-   current->in_iowait = 0;
+   current->in_iowait = old_iowait;
atomic_dec(&rq->nr_iowait);
delayacct_blkio_end();
return ret;
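
[Editor's sketch] The save-and-restore guard described in the changelog above can be modelled outside the kernel. What follows is a minimal user-space sketch under stated assumptions, not the kernel code: struct task, flush_inline(), flush_deferred() and io_wait() are hypothetical stand-ins for current, blk_flush_plug(), blk_schedule_flush_plug() and io_schedule_timeout(), and the sleep itself is left out.

```c
#include <stdbool.h>

/* Hypothetical model of the per-task state the patch relies on. */
struct task {
    bool in_iowait;  /* models current->in_iowait */
    int  pending;    /* work items queued on the plug */
    int  deferred;   /* items handed off instead of being run inline */
    int  depth;      /* current nesting of io_wait() */
    int  max_depth;  /* deepest nesting observed */
};

static void io_wait(struct task *t);

/* Inline flush: each item may itself need to wait for I/O. */
static void flush_inline(struct task *t)
{
    while (t->pending > 0) {
        t->pending--;
        io_wait(t);  /* models an allocation falling back to io_schedule() */
    }
}

/* Deferred flush: hand everything off, never re-enter io_wait(). */
static void flush_deferred(struct task *t)
{
    t->deferred += t->pending;
    t->pending = 0;
}

static void io_wait(struct task *t)
{
    bool old = t->in_iowait;  /* remember whether we are already nested */

    t->depth++;
    if (t->depth > t->max_depth)
        t->max_depth = t->depth;

    t->in_iowait = true;      /* set before flushing so nesting is visible */
    if (old)
        flush_deferred(t);    /* nested call: must not recurse further */
    else
        flush_inline(t);

    /* the real function would sleep here */

    t->in_iowait = old;       /* restore rather than unconditionally clear */
    t->depth--;
}
```

With three queued items, the outer call flushes inline, the first nested call sees the flag already set and defers the remaining two, so the nesting in this model never goes deeper than two levels.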




Re: [PATCH v3 2/2] remoteproc: add support to handle internal memories

2015-02-12 Thread Ohad Ben-Cohen
On Thu, Feb 12, 2015 at 10:54 PM, Suman Anna  wrote:
> My original motivation was that it would only need to be added on
> firmwares requiring support for loading into internal memories,
> otherwise, these are something left to be managed by the software
> running on the remote processor completely, and MPU will not even touch
> them.

Sure. But even if you guys will use this interface correctly, this
patch essentially exposes ioremap to user space, which is something we
generally want to avoid.

> So, let me know if this is a NAK. If so, we have two options - one to go
> the sram node model where each of them have to be defined separately,
> and have a specific property in the rproc nodes to be able to get the
> gen_pool handles. The other one is simply to define these as  and
> use devm_ioremap_resource() (so use DT for defining the regions instead
> of a resource table entry).

Any approach where these regions are defined explicitly really sounds
better. If you could look into these two alternatives that would be
great.

Thanks,
Ohad.


[PATCH RFC v9.5 09/20] drm/dsi: Add a helper to get bits per pixel of MIPI DSI pixel format

2015-02-12 Thread Liu Ying
Signed-off-by: Liu Ying 
---
v9->v9.5:
 * Add kernel-doc for the new helper function to address Daniel Vetter's
   comment.

v8->v9:
 * Rebase onto the imx-drm/next branch of Philipp Zabel's open git repository.

v7->v8:
 * None.

v6->v7:
 * None.

v5->v6:
 * Address the over 80 characters in one line warning reported by the
   checkpatch.pl script.

v4->v5:
 * None.

v3->v4:
 * None.

v2->v3:
 * None.

v1->v2:
 * Thierry Reding suggested that the mipi_dsi_pixel_format_to_bpp() function
   could be placed at the common DRM MIPI DSI driver.
   This patch is newly added.

 include/drm/drm_mipi_dsi.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/drm/drm_mipi_dsi.h b/include/drm/drm_mipi_dsi.h
index f1d8d0d..cabc910 100644
--- a/include/drm/drm_mipi_dsi.h
+++ b/include/drm/drm_mipi_dsi.h
@@ -163,6 +163,28 @@ static inline struct mipi_dsi_device 
*to_mipi_dsi_device(struct device *dev)
return container_of(dev, struct mipi_dsi_device, dev);
 }
 
+/**
+ * mipi_dsi_pixel_format_to_bpp() - get bits per pixel for a mipi dsi
+ *pixel format
+ * @fmt: mipi dsi pixel format
+ *
+ * Return: The bits per pixel value for the mipi dsi pixel format on success or
+ *a negative error code on failure.
+ */
+static inline int mipi_dsi_pixel_format_to_bpp(enum mipi_dsi_pixel_format fmt)
+{
+   switch (fmt) {
+   case MIPI_DSI_FMT_RGB888:
+   case MIPI_DSI_FMT_RGB666:
+   return 24;
+   case MIPI_DSI_FMT_RGB666_PACKED:
+   return 18;
+   case MIPI_DSI_FMT_RGB565:
+   return 16;
+   }
+   return -EINVAL;
+}
+
 struct mipi_dsi_device *of_find_mipi_dsi_device_by_node(struct device_node 
*np);
 int mipi_dsi_attach(struct mipi_dsi_device *dsi);
 int mipi_dsi_detach(struct mipi_dsi_device *dsi);
-- 
2.1.0
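
[Editor's sketch] The helper added by this patch maps a pixel format to a bit depth and returns a negative error code for anything it does not recognise, so callers have to check the sign before using the value. A minimal stand-alone sketch of that calling pattern follows. The enum names here are hypothetical stand-ins for the MIPI_DSI_FMT_* constants, and line_bytes() is a hypothetical caller that does not appear in the patch.

```c
#include <errno.h>

/* Hypothetical stand-ins for enum mipi_dsi_pixel_format. */
enum pixel_format {
    FMT_RGB888,
    FMT_RGB666,
    FMT_RGB666_PACKED,
    FMT_RGB565,
};

/* Same mapping as in the patch: bits per pixel, or -EINVAL if unknown. */
static int pixel_format_to_bpp(enum pixel_format fmt)
{
    switch (fmt) {
    case FMT_RGB888:
    case FMT_RGB666:
        return 24;
    case FMT_RGB666_PACKED:
        return 18;
    case FMT_RGB565:
        return 16;
    }
    return -EINVAL;
}

/*
 * Hypothetical caller: bytes needed for one line of `width` pixels,
 * rounded up to whole bytes. Propagates the negative error code.
 */
static long line_bytes(enum pixel_format fmt, unsigned int width)
{
    int bpp = pixel_format_to_bpp(fmt);

    if (bpp < 0)
        return bpp;
    return ((long)width * bpp + 7) / 8;
}
```

Returning the error through the same int keeps the helper usable as a static inline in a header, at the cost of every caller having to test for a negative result.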



Re: [PATCH RESEND] ARM: DMA: Fix kzalloc flags in __iommu_alloc_buffer()

2015-02-12 Thread Alexandre Courbot

On 02/13/2015 12:32 PM, Will Deacon wrote:

On Wed, Feb 11, 2015 at 09:01:41AM +, Alexandre Courbot wrote:

There doesn't seem to be any valid reason to allocate the pages array
with the same flags as the buffer itself. Doing so can eventually lead
to the following safeguard in mm/slab.c to be hit:

BUG_ON(flags & GFP_SLAB_BUG_MASK);


nit: I can't actually spot this BUG_ON in the kernel.


I have been trying to push this patch for so long that the line in 
question changed in the meantime. :) It is now


if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
BUG();
}

in cache_grow, line 2593 of mm/slab.c.




This happens when buffers are allocated with __GFP_DMA32 or
__GFP_HIGHMEM.

Fix this by allocating the pages array with GFP_KERNEL to follow what is
done elsewhere in this file. Using GFP_KERNEL in __iommu_alloc_buffer()
is safe because atomic allocations are handled by __iommu_alloc_atomic().

Signed-off-by: Alexandre Courbot 
Cc: Arnd Bergmann 
Cc: Marek Szyprowski 
Cc: Russell King 
Acked-by: Marek Szyprowski 
---
  arch/arm/mm/dma-mapping.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 903dba0..170a116 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1106,7 +1106,7 @@ static struct page **__iommu_alloc_buffer(struct device 
*dev, size_t size,
int i = 0;

if (array_size <= PAGE_SIZE)
-   pages = kzalloc(array_size, gfp);
+   pages = kzalloc(array_size, GFP_KERNEL);
else
pages = vzalloc(array_size);
if (!pages)
--
2.3.0


Looks sensible to me:

   Acked-by: Will Deacon 


Thanks! I will amend the commit message and resend.


Re: [PATCH 1/3] hwmon: (applesmc) Allow format checking

2015-02-12 Thread Guenter Roeck

On 02/12/2015 06:15 AM, Rasmus Villemoes wrote:

Currently gcc and other tools can't check the format strings. It's
easy to fix by letting fan_speed_fmt simply hold what is different
between the strings (and renaming it appropriately). While at it, we
can also eliminate some wasted space and an extra level of indirection
by making it an array of char[4] instead of char*.

Signed-off-by: Rasmus Villemoes 


Saving a few bytes with the added cost of harder to understand code.
Not really sure if that is worth it. I'll need to see a Tested-by:
for this patch.

Also, please fix the checkpatch warnings.

Thanks,
Guenter


---
  drivers/hwmon/applesmc.c | 16 
  1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 0af63da6b603..0c950e1b03f3 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -84,12 +84,12 @@
  #define TEMP_SENSOR_TYPE  "sp78"

  /* List of keys used to read/write fan speeds */
-static const char *const fan_speed_fmt[] = {
-   "F%dAc",  /* actual speed */
-   "F%dMn",  /* minimum speed (rw) */
-   "F%dMx",  /* maximum speed */
-   "F%dSf",  /* safe speed - not all models */
-   "F%dTg",  /* target speed (manual: rw) */
+static const char fan_speed_suffix[][4] = {
+   "Ac", /* actual speed */
+   "Mn", /* minimum speed (rw) */
+   "Mx", /* maximum speed */
+   "Sf", /* safe speed - not all models */
+   "Tg", /* target speed (manual: rw) */
  };

  #define INIT_TIMEOUT_MSECS  5000  /* wait up to 5s for device init ... */
@@ -811,7 +811,7 @@ static ssize_t applesmc_show_fan_speed(struct device *dev,
char newkey[5];
u8 buffer[2];

-   sprintf(newkey, fan_speed_fmt[to_option(attr)], to_index(attr));
+   sprintf(newkey, "F%d%s", to_index(attr), fan_speed_suffix[to_option(attr)]);

ret = applesmc_read_key(newkey, buffer, 2);
speed = ((buffer[0] << 8 | buffer[1]) >> 2);
@@ -834,7 +834,7 @@ static ssize_t applesmc_store_fan_speed(struct device *dev,
if (kstrtoul(sysfsbuf, 10, &speed) < 0 || speed >= 0x4000)
return -EINVAL; /* Bigger than a 14-bit value */

-   sprintf(newkey, fan_speed_fmt[to_option(attr)], to_index(attr));
+   sprintf(newkey, "F%d%s", to_index(attr), fan_speed_suffix[to_option(attr)]);

buffer[0] = (speed >> 6) & 0xff;
buffer[1] = (speed << 2) & 0xff;
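
[Editor's sketch] The point of the patch is that a literal format string lets the compiler check the arguments, while a format picked from a table at run time does not. A minimal user-space sketch of the rewritten pattern follows. make_fan_key() is a hypothetical helper, not a function from the driver, and the bounds check on the option index is an addition for the sketch.

```c
#include <stdio.h>

/* Only the part that differs between the keys, as in the patch. */
static const char fan_speed_suffix[][4] = {
    "Ac", /* actual speed */
    "Mn", /* minimum speed */
    "Mx", /* maximum speed */
    "Sf", /* safe speed */
    "Tg", /* target speed */
};

/*
 * Hypothetical helper: build a key such as "F0Mn" into `key`.
 * The format is a string literal, so -Wformat can verify that
 * `index` matches %d and the suffix matches %s.
 * Returns the snprintf() result, or -1 for an unknown option.
 */
static int make_fan_key(char *key, size_t len, int index, unsigned int option)
{
    if (option >= sizeof(fan_speed_suffix) / sizeof(fan_speed_suffix[0]))
        return -1;
    return snprintf(key, len, "F%d%s", index, fan_speed_suffix[option]);
}
```

Storing the suffixes as char[4] rather than char * also avoids one pointer per entry, which is the space saving the changelog mentions.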





Re: [PATCH 3/3] hwmon: (ibmpex) Allow format string checking

2015-02-12 Thread Guenter Roeck

On 02/12/2015 06:15 AM, Rasmus Villemoes wrote:

The only difference between the three power_sensor_name_templates is
whether there is a suffix of "", "_lowest" or "_highest". We might as
well pull those into an array and use a literal format string,
allowing gcc to do type checking of the arguments to
sprintf. Incidentally, the same three suffixes are used in the
temp_sensor_name_templates case, so we end up eliminating one static
array.

Signed-off-by: Rasmus Villemoes 


Applied to -next after fixing 'line over 80 characters' checkpatch warning.

Thanks,
Guenter



Re: [PATCH 2/3] hwmon: (coretemp) Allow format checking

2015-02-12 Thread Guenter Roeck

On 02/12/2015 06:15 AM, Rasmus Villemoes wrote:

By extracting the only part that differs we can allow static checking
of the format string, and possibly save a little .rodata.

Signed-off-by: Rasmus Villemoes 


Applied to -next after fixing multi-line alignment.

Thanks,
Guenter



Re: [GIT PULL] x86/apic updates for v3.20

2015-02-12 Thread Jiang Liu
On 2015/2/13 10:50, Linus Torvalds wrote:
> On Thu, Feb 12, 2015 at 6:08 PM, Linus Torvalds
>  wrote:
>>
>> Jiang, Joerg - that commit seems to cause a lockup at suspend time for
>> me. Now, I haven't verified by reverting it from top-of-git yet, but
>> the bisection seemed to be pretty stable. I'll try the revert next (it
>> doesn't revert cleanly, but I can undo it by hand).
> 
> Confirmed. Reverting 5fcee53ce705 makes the pixel suspend cleanly again.
Hi Linus,
Sorry for the trouble. Seems there are conflicts between x2apic
and suspend on Chromebook. With commit 5fcee53ce705 applied, x2apic
may be enabled on Chromebook, which in turn may cause suspend failure.
Could you please help to revert the change first? I will try to find
a Chromebook laptop and do more investigation.
Regards!
Gerry

> 
>  Linus
> 


RE: [E1000-devel] [PATCH 1/3] ixgbe, ixgbevf: Add new mbox API to enable MC promiscuous mode

2015-02-12 Thread Hiroshi Shimamoto
> > > -Original Message-
> > > From: Hiroshi Shimamoto [mailto:h-shimam...@ct.jp.nec.com]
> > > Sent: Monday, February 09, 2015 6:29 PM
> > > To: Kirsher, Jeffrey T
> > > Cc: Alexander Duyck; Skidmore, Donald C; Bjørn Mork; e1000-
> > > de...@lists.sourceforge.net; net...@vger.kernel.org; Choi, Sy Jong; linux-
> > > ker...@vger.kernel.org; David Laight; Hayato Momma
> > > Subject: RE: [E1000-devel] [PATCH 1/3] ixgbe, ixgbevf: Add new mbox API to
> > > enable MC promiscuous mode
> > >
> > > > > > > Can you please fix up your patches based on my tree:
> > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/queue.git
> > > > > >
> > > > > > Yes. I haven't noticed your tree.
> > > > > > Will resend patches against it.
> > > > > >
> > > > >
> > > > > I encountered an issue with your tree, the commit id is below.
> > > > >
> > > > > $ git log | head
> > > > > commit e6f1649780f8f5a87299bf6af04453f93d1e3d5e
> > > > > Author: Rasmus Villemoes 
> > > > > Date:   Fri Jan 23 20:43:14 2015 -0800
> > > > >
> > > > > ethernet: fm10k: Actually drop 4 bits
> > > > >
> > > > > The comment explains the intention, but vid has type u16. Before 
> > > > > the
> > > > > inner shift, it is promoted to int, which has plenty of space for 
> > > > > all
> > > > > vid's bits, so nothing is dropped. Use a simple mask instead.
> > > > >
> > > > >
> > > > > I use the kernel from your tree in both host and guest.
> > > > >
> > > > > Assign an IPv6 for VF in guest.
> > > > > # ip -6 addr add 2001:db8::18:1/64 dev ens0
> > > > >
> > > > > Send ping packet from other server to the VM.
> > > > > # ping6  2001:db8::18:1 -I eth0
> > > > >
> > > > > The following message was shown.
> > > > > ixgbevf :00:08.0: partial checksum but l4 proto=3a!
> > > > >
> > > > > If I did the same operation in the host, I saw the same error
> > > > > message in host too.
> > > > > ixgbe :2d:00.0: partial checksum but l4 proto=3a!
> > > > >
> > > > > Do you have any idea about that?
> > > >
> > > > Ah, sorry about that, try this tree again:
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/queue.git
> > > >
> > > > That patch was dropped in favor of a patch that Matthew Vick put
> > > > together (and recently got pushed upstream).  So my queue no longer
> > > > has that patch in the queue, since it got dropped.
> > >
> > > I still see the same error, the head id is the below
> > >
> > > $ git log | head
> > > commit a072afb0b45904022b76deef3b770ee9a93cb13a
> > > Author: Nicholas Krause 
> > > Date:   Mon Feb 9 00:27:00 2015 -0800
> > >
> > > igb: Remove outdated fix me comment in the
> > > function,gb_acquire_swfw_sync_i210
> > >
> > >
> > > thanks,
> > > Hiroshi
> >
> > I'm having our validation see if they can recreate the same issue
> > internally.  When they get back to me I'll let you know what we found.
> 
> We did bisect, and the below looks the culprit;
> 
> 32dce968dd987adfb0c00946d78dad9154f64759 is the first bad commit
> commit 32dce968dd987adfb0c00946d78dad9154f64759
> Author: Vlad Yasevich 
> Date:   Sat Jan 31 10:40:18 2015 -0500
> 
> ipv6: Allow for partial checksums on non-ufo packets
> 
> Currently, if we are not doing UFO on the packet, all UDP
> packets will start with CHECKSUM_NONE and thus perform full
> checksum computations in software even if the device supports
> IPv6 checksum offloading.
> 
> Let's start with CHECKSUM_PARTIAL if the device
> supports it and we are sending only a single packet at
> or below mtu size.
> 
> Signed-off-by: Vladislav Yasevich 
> Signed-off-by: David S. Miller 
> 
> :04 04 4437eaf7e944f5a6136ebf668a256fee688fda3d 
> fade8da998d35c8da97a15f0556949ad371e5347 M  net

When I reverted the commit, the issue was solved.

thanks,
Hiroshi


Re: [PATCH v3 2/2] sched/rt: Add check_preempt_equal_prio() logic in pick_next_task_rt()

2015-02-12 Thread Steven Rostedt
On Fri, 13 Feb 2015 11:55:11 +0800
Xunlei Pang  wrote:

> > RT1 just got pushed behind RT3 and it is now not the next one to run.
> > RT2 will get this rq, RT3 will be pushed off, but say there's no more
> > rq's available to run RT1.
> >
> > You just broke FIFO.
> 
> Yes, I've also thought of this point before.
> 
> If this is a problem, we may have the same thing happening in
> current check_preempt_equal_prio() code:
> When a pinned waking task preempts the current successfully,
> because it thinks current is migratable via cpupri_find().
> 
> But when resched happens, things may change, i.e. current
> becomes non-migratable, so the waking task gets running, while
> the previous running task gets stuck. See, it also broke FIFO.

It breaks FIFO if the state of the system changes before the current
task found another queue to run on, sure, and that probably should be
fixed. And technically, that case does not break FIFO from a state
point of view. Think of the timing, if that task was able to migrate to
another CPU, but suddenly it could not, that means the CPU it was going
to migrate to had a higher priority task that started to run on that
CPU. It still fits the FIFO design. That's because if that task
succeeded to migrate to that CPU, just before the high priority task
ran, that high priority task would have bumped it anyway.

Now if it couldn't migrate because a same priority task started, then,
well yeah, it broke FIFO, and maybe that should be fixed.

But your patch breaks FIFO if the system is in just one
particular state. That's much worse, and it shouldn't be added.

-- Steve


Re: [PATCH RFC v9 09/20] drm/dsi: Add a helper to get bits per pixel of MIPI DSI pixel format

2015-02-12 Thread Liu Ying
On Thu, Feb 12, 2015 at 10:26:42AM +0100, Daniel Vetter wrote:
> On Thu, Feb 12, 2015 at 02:01:32PM +0800, Liu Ying wrote:
> > Signed-off-by: Liu Ying 
> > ---
> > v8->v9:
> >  * Rebase onto the imx-drm/next branch of Philipp Zabel's open git 
> > repository.
> > 
> > v7->v8:
> >  * None.
> > 
> > v6->v7:
> >  * None.
> > 
> > v5->v6:
> >  * Address the over 80 characters in one line warning reported by the
> >checkpatch.pl script.
> > 
> > v4->v5:
> >  * None.
> > 
> > v3->v4:
> >  * None.
> > 
> > v2->v3:
> >  * None.
> > 
> > v1->v2:
> >  * Thierry Reding suggested that the mipi_dsi_pixel_format_to_bpp() function
> >could be placed at the common DRM MIPI DSI driver.
> >This patch is newly added.
> > 
> >  include/drm/drm_mipi_dsi.h | 14 ++
> >  1 file changed, 14 insertions(+)
> > 
> > diff --git a/include/drm/drm_mipi_dsi.h b/include/drm/drm_mipi_dsi.h
> > index f1d8d0d..3662021 100644
> > --- a/include/drm/drm_mipi_dsi.h
> > +++ b/include/drm/drm_mipi_dsi.h
> > @@ -163,6 +163,20 @@ static inline struct mipi_dsi_device 
> > *to_mipi_dsi_device(struct device *dev)
> > return container_of(dev, struct mipi_dsi_device, dev);
> >  }
> >  
> > +static inline int mipi_dsi_pixel_format_to_bpp(enum mipi_dsi_pixel_format 
> > fmt)
> 
> Kerneldoc seems to be missing for this one.

I'll add it.  Thanks for pointing this out.

Regards,
Liu Ying

> -Daniel
> 
> > +{
> > +   switch (fmt) {
> > +   case MIPI_DSI_FMT_RGB888:
> > +   case MIPI_DSI_FMT_RGB666:
> > +   return 24;
> > +   case MIPI_DSI_FMT_RGB666_PACKED:
> > +   return 18;
> > +   case MIPI_DSI_FMT_RGB565:
> > +   return 16;
> > +   }
> > +   return -EINVAL;
> > +}
> > +
> >  struct mipi_dsi_device *of_find_mipi_dsi_device_by_node(struct device_node 
> > *np);
> >  int mipi_dsi_attach(struct mipi_dsi_device *dsi);
> >  int mipi_dsi_detach(struct mipi_dsi_device *dsi);
> > -- 
> > 2.1.0
> > 
> > ___
> > dri-devel mailing list
> > dri-de...@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch


[PATCH] fix platform_no_drv_owner.cocci warnings

2015-02-12 Thread kbuild test robot
drivers/clk/qcom/clk-rpm.c:262:3-8: No need to set .owner here. The core will do it.

 Remove .owner field if calls are used which set it automatically

Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci

Signed-off-by: Fengguang Wu 
---

 clk-rpm.c |1 -
 1 file changed, 1 deletion(-)

--- a/drivers/clk/qcom/clk-rpm.c
+++ b/drivers/clk/qcom/clk-rpm.c
@@ -259,7 +259,6 @@ static struct platform_driver rpm_clk_dr
.probe  = rpm_clk_probe,
.driver = {
.name   = "qcom-rpm-clk",
-   .owner  = THIS_MODULE,
.of_match_table = of_match_ptr(clk_rpm_of_match),
},
 };


Re: [PATCH] usb: dwc3: dwc3-omap: Fix disable IRQ

2015-02-12 Thread George Cherian


On 02/12/2015 11:52 PM, Felipe Balbi wrote:

On Thu, Feb 12, 2015 at 11:13:16AM +0530, George Cherian wrote:

>In the wrapper the IRQ disable should be done by writing 1's to the
>IRQ*_CLR register. Existing code is broken because it instead writes
>zeros to IRQ*_SET register.
>
>Fix this by adding functions dwc3_omap_write_irqmisc_clr() and
>dwc3_omap_write_irq0_clr() which do the right thing.
>
>Signed-off-by: George Cherian

please resend with:

Fixes: 72246da40f37 (usb: Introduce DesignWare USB3 DRD Driver)
Cc:  # v3.2+

Done!!


[PATCH v2] usb: dwc3: dwc3-omap: Fix disable IRQ

2015-02-12 Thread George Cherian
In the wrapper the IRQ disable should be done by writing 1's to the
IRQ*_CLR register. Existing code is broken because it instead writes
zeros to IRQ*_SET register.

Fix this by adding functions dwc3_omap_write_irqmisc_clr() and
dwc3_omap_write_irq0_clr() which do the right thing.

Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver")
Cc:  # v3.2+
Signed-off-by: George Cherian 
---
 drivers/usb/dwc3/dwc3-omap.c | 30 --
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/dwc3/dwc3-omap.c b/drivers/usb/dwc3/dwc3-omap.c
index 172d64e..52e0c4e 100644
--- a/drivers/usb/dwc3/dwc3-omap.c
+++ b/drivers/usb/dwc3/dwc3-omap.c
@@ -205,6 +205,18 @@ static void dwc3_omap_write_irq0_set(struct dwc3_omap 
*omap, u32 value)
omap->irq0_offset, value);
 }
 
+static void dwc3_omap_write_irqmisc_clr(struct dwc3_omap *omap, u32 value)
+{
+   dwc3_omap_writel(omap->base, USBOTGSS_IRQENABLE_CLR_MISC +
+   omap->irqmisc_offset, value);
+}
+
+static void dwc3_omap_write_irq0_clr(struct dwc3_omap *omap, u32 value)
+{
+   dwc3_omap_writel(omap->base, USBOTGSS_IRQENABLE_CLR_0 -
+   omap->irq0_offset, value);
+}
+
 static void dwc3_omap_set_mailbox(struct dwc3_omap *omap,
enum omap_dwc3_vbus_id_status status)
 {
@@ -345,9 +357,23 @@ static void dwc3_omap_enable_irqs(struct dwc3_omap *omap)
 
 static void dwc3_omap_disable_irqs(struct dwc3_omap *omap)
 {
+   u32 reg;
+
/* disable all IRQs */
-   dwc3_omap_write_irqmisc_set(omap, 0x00);
-   dwc3_omap_write_irq0_set(omap, 0x00);
+   reg = USBOTGSS_IRQO_COREIRQ_ST;
+   dwc3_omap_write_irq0_clr(omap, reg);
+
+   reg = (USBOTGSS_IRQMISC_OEVT |
+   USBOTGSS_IRQMISC_DRVVBUS_RISE |
+   USBOTGSS_IRQMISC_CHRGVBUS_RISE |
+   USBOTGSS_IRQMISC_DISCHRGVBUS_RISE |
+   USBOTGSS_IRQMISC_IDPULLUP_RISE |
+   USBOTGSS_IRQMISC_DRVVBUS_FALL |
+   USBOTGSS_IRQMISC_CHRGVBUS_FALL |
+   USBOTGSS_IRQMISC_DISCHRGVBUS_FALL |
+   USBOTGSS_IRQMISC_IDPULLUP_FALL);
+
+   dwc3_omap_write_irqmisc_clr(omap, reg);
 }
 
 static u64 dwc3_omap_dma_mask = DMA_BIT_MASK(32);
-- 
1.8.3.1
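
[Editor's sketch] The changelog implies a register pair in which writing ones to the SET register enables interrupts, writing ones to the CLR register disables them, and zero bits are ignored by both. Assuming the hardware behaves that way, the bug and the fix can be modelled in a few lines of user-space C. struct irq_enable, irqenable_set() and irqenable_clr() are hypothetical names for this model and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical model of one interrupt-enable mask behind a SET/CLR pair. */
struct irq_enable {
    uint32_t mask;  /* currently enabled interrupt bits */
};

/* Write to the SET register: 1 bits enable, 0 bits leave state unchanged. */
static void irqenable_set(struct irq_enable *r, uint32_t value)
{
    r->mask |= value;
}

/* Write to the CLR register: 1 bits disable, 0 bits leave state unchanged. */
static void irqenable_clr(struct irq_enable *r, uint32_t value)
{
    r->mask &= ~value;
}
```

In this model, writing 0x00 through irqenable_set() changes nothing, which matches the breakage described above. Disabling requires writing the enabled bits through irqenable_clr(), as the patch does with its new helpers.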



Re: [PATCH v3 2/2] sched/rt: Add check_preempt_equal_prio() logic in pick_next_task_rt()

2015-02-12 Thread Xunlei Pang
Hi Steve,

On 13 February 2015 at 11:55, Xunlei Pang  wrote:
> Hi steve,
>
> On 13 February 2015 at 08:04, Steven Rostedt  wrote:
>> On Sun,  8 Feb 2015 23:51:26 +0800
>> Xunlei Pang  wrote:
>>
>>> check_preempt_curr() doesn't call sched_class::check_preempt_curr
>>> when the class of current is a higher level.
>>
>> The above sentence does not make sense.
>>
>>> So if there is a DL
>>> task running when doing this for RT, check_preempt_equal_prio()
>>
>> Doing what for RT?
>>
>>> will definitely miss, which may result in some response latency
>>
>> Miss what?
>
> Sorry, this may lack some information I need to further explain in detail.
>
>>
>>> for this RT task if it is pinned and there're some same-priority
>>> migratable rt tasks already queued.
>>>
>>> We should do the similar thing in select_task_rq_rt() when first
>>> picking rt tasks after running out of DL tasks.
>>>
>>> This patch tackles the issue by peeking the next rt task(RT1), and
>>> if find RT1 migratable, just requeue it to the tail of the rq using
>>> requeue_task_rt(rq, p, 0). In this way:
>>> - If there do have another rt task(RT2) with the same priority as
>>>   RT1, RT2 will finally be picked as the running task. While RT1
>>>   will be pushed onto another cpu via RT1's post_schedule(), as
>>>   RT1 is migratable. The difference from check_preempt_equal_prio()
>>>   here is that we just don't care whether RT2 is migratable.
>>>
>>> - Otherwise, if there's no rt task with the same priority as RT1,
>>>   RT1 will still be picked as the running task after the requeuing.
>>
>> What happens if there's three RT tasks of the same prio, RT1 is ready
>> to run and is migratable, RT2 is pinned, RT3 is migratable
>>
>> RT1 just got pushed behind RT3 and it is now not the next one to run.
>> RT2 will get this rq, RT3 will be pushed off, but say there's no more
>> rq's available to run RT1.
>>
>> You just broke FIFO.
>
> Yes, I've also thought of this point before.
>
> If this is a problem, we may have the same thing happening in
> current check_preempt_equal_prio() code:
> When a pinned waking task preempts the current successfully,
> because it thinks current is migratable via cpupri_find().
>
> But when resched happens, things may change, i.e. current
> becomes non-migratable, so the waking task gets running, while
> the previous running task gets stuck. See, it also broke FIFO.

Aside from this, please ignore this patch: waking RT tasks are also
pushed via task_woken_rt() when current is a DL task, which I missed before.

Thanks,
Xunlei


Re: [PATCH v2 2/2] leds: add Qualcomm PM8941 WLED driver

2015-02-12 Thread Stephen Boyd
On 02/12/15 20:28, Ivan T. Ivanov wrote:
> On Thu, 2015-02-12 at 20:07 -0800, Stephen Boyd wrote:
>> On 01/29/15 04:48, Ivan T. Ivanov wrote:
>>> Otherwise it looks good. Driver is loaded and device is detected
>>> properly (i have added readings for type and subtype registers).
>>> Do you know where I can measure result from changing brightness
>>> sysfs entry. I am using 8074 dragonboard?
>> Does the backlight turn on? From what I can tell it controls the
>> backlight, but it may be that nothing's getting displayed so it won't be
>> noticeable.
>>
> Yes, I can not see visual changes. That is why I have asked where
> I could hook the probe and measure current change or wherever.
> BL_WLEDx signals go out from J2, but somehow I was unable to 
> locate it on the board, will try to look harder :-)
>
>

When the screen is "off" in android I can still turn on the brightness
to max in sysfs and see the screen glow. Hopefully we don't need
something else to make that work.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH v2 2/2] leds: add Qualcomm PM8941 WLED driver

2015-02-12 Thread Ivan T. Ivanov

On Thu, 2015-02-12 at 20:07 -0800, Stephen Boyd wrote:
> On 01/29/15 04:48, Ivan T. Ivanov wrote:
> > Otherwise it looks good. Driver is loaded and device is detected
> > properly (i have added readings for type and subtype registers).
> > Do you know where I can measure result from changing brightness
> > sysfs entry. I am using 8074 dragonboard?
> 
> Does the backlight turn on? From what I can tell it controls the
> backlight, but it may be that nothing's getting displayed so it won't be
> noticeable.
> 

Yes, I can not see visual changes. That is why I have asked where
I could hook the probe and measure current change or wherever.
BL_WLEDx signals go out from J2, but somehow I was unable to 
locate it on the board, will try to look harder :-)

Regards,
Ivan


Re: [PATCH v2 2/2] leds: add Qualcomm PM8941 WLED driver

2015-02-12 Thread Stephen Boyd
On 01/23/15 16:54, Bjorn Andersson wrote:
> +
> +static int pm8941_wled_set(struct led_classdev *cdev,
> +enum led_brightness value)
> +{
> + struct pm8941_wled *wled;
> + u8 ctrl = 0;
> + u16 val;
> + int rc;
> + int i;
> +
> + wled = container_of(cdev, struct pm8941_wled, cdev);
> +
> + if (value != 0)
> + ctrl = PM8941_WLED_REG_MOD_EN_BIT;
> +
> + val = value * PM8941_WLED_REG_VAL_MAX / LED_FULL;
> +
> + rc = regmap_update_bits(wled->regmap,
> + wled->addr + PM8941_WLED_REG_MOD_EN,
> + PM8941_WLED_REG_MOD_EN_MASK, ctrl);
> + if (rc)
> + return rc;
> +
> + for (i = 0; i < wled->cfg.num_strings; ++i) {
> + u8 v[2] = { val & 0xff, (val >> 8) & 0xf };
> +
> + rc = regmap_bulk_write(wled->regmap,
> + wled->addr + PM8941_WLED_REG_VAL_BASE + 2 * i,
> + v, 2);
> + if (rc)
> + return rc;
> + }
> +
> + rc = regmap_update_bits(wled->regmap,
> + wled->addr + PM8941_WLED_REG_SYNC,
> + PM8941_WLED_REG_SYNC_MASK, PM8941_WLED_REG_SYNC_ALL);
> + if (rc)
> + return rc;
> +
> + rc = regmap_update_bits(wled->regmap,
> + wled->addr + PM8941_WLED_REG_SYNC,
> + PM8941_WLED_REG_SYNC_MASK, PM8941_WLED_REG_SYNC_CLEAR);
> + return rc;
> +}

This doesn't seem to do anything for the OVP spike mentioned in this
patch[1]. Do you see that problem on your device? I imagine the PMIC is
the same.

[1]
https://www.codeaurora.org/cgit/quic/la/kernel/msm/commit/?id=fef9e15072562f0f28dc7066dcdd69388df81ed3

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH v2 2/2] leds: add Qualcomm PM8941 WLED driver

2015-02-12 Thread Stephen Boyd
On 01/29/15 04:48, Ivan T. Ivanov wrote:
>
> Otherwise it looks good. Driver is loaded and device is detected
> properly (i have added readings for type and subtype registers).
> Do you know where I can measure result from changing brightness 
> sysfs entry. I am using 8074 dragonboard?

Does the backlight turn on? From what I can tell it controls the
backlight, but it may be that nothing's getting displayed so it won't be
noticeable.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH v2 2/2] leds: add Qualcomm PM8941 WLED driver

2015-02-12 Thread Stephen Boyd
On 01/23/15 16:54, Bjorn Andersson wrote:
> +
> +static int pm8941_wled_configure(struct pm8941_wled *wled, struct device *dev)
> +{
> + struct pm8941_wled_config *cfg = &wled->cfg;
> + u32 val;
> + int rc;
> + int i;
> +
> + const struct {
> + const char *name;
> + u32 *val_ptr;
> + const struct pm8941_wled_var_cfg *cfg;
> + } u32_opts[] = {
> + {
> + "qcom,current-boost-limit",
> + &cfg->i_boost_limit,
> + .cfg = &pm8941_wled_i_boost_limit_cfg,
> + },
> + {
> + "qcom,current-limit",
> + &cfg->i_limit,
> + .cfg = &pm8941_wled_i_limit_cfg,
> + },
> + {
> + "qcom,ovp",
> + &cfg->ovp,
> + .cfg = &pm8941_wled_ovp_cfg,
> + },
> + {
> + "qcom,switching-freq",
> + &cfg->switch_freq,
> + .cfg = &pm8941_wled_switch_freq_cfg,
> + },
> + {
> + "qcom,num-strings",
> + &cfg->num_strings,
> + .cfg = &pm8941_wled_num_strings_cfg,
> + },
> + };
> + const struct {
> + const char *name;
> + bool *val_ptr;
> + } bool_opts[] = {
> + { "qcom,cs-out", &cfg->cs_out_en, },
> + { "qcom,ext-gen", &cfg->ext_gen, },
> + { "qcom,cabc", &cfg->cabc_en, },
> + };
> +
> + rc = of_property_read_u32(dev->of_node, "reg", &val);
> + if (rc || val > 0x) {
> + dev_err(dev, "invalid IO resources\n");
> + return rc ? rc : -EINVAL;
> + }
> + wled->addr = val;
> +
> + rc = of_property_read_string(dev->of_node, "label", &wled->cdev.name);
> + if (rc)
> + wled->cdev.name = dev->of_node->name;
> +
> + wled->cdev.default_trigger = of_get_property(dev->of_node,
> + "linux,default-trigger", NULL);
> +
> + *cfg = pm8941_wled_config_defaults;
> + for (i = 0; i < ARRAY_SIZE(u32_opts); ++i) {
> + u32 sel, c;
> + int j, rj;
> +
> + rc = of_property_read_u32(dev->of_node, u32_opts[i].name, &val);
> + if (rc) {
> + if (rc != -EINVAL) {
> + dev_err(dev, "error reading '%s'\n",
> + u32_opts[i].name);
> + return rc;
> + }
> + continue;
> + }
> +
> + sel = UINT_MAX;
> + rj = -1;
> + c = pm8941_wled_values(u32_opts[i].cfg, 0);
> + for (j = 0; c != UINT_MAX; ++j) {
> + if (c <= val && (sel == UINT_MAX || c >= sel)) {
> + sel = c;
> + rj = j;
> + }
> + c = pm8941_wled_values(u32_opts[i].cfg, j + 1);
> + }
> + if (sel == UINT_MAX) {
> + dev_err(dev, "invalid value for '%s'\n",
> + u32_opts[i].name);
> + return rc;

Isn't rc always 0 here? Don't we want to return an error?

Also, I find this code very convoluted given that we loop through a
table and match based on nodes and call function pointers, etc. Why
can't we just have a handful of if statements with of_property_read_u32
in them? That way we don't have to jump through so many hoops, bouncing
all around this file to figure out what's going on. If we did I imagine
we wouldn't have missed out on rc being 0 here.
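
For illustration, the selection loop with an explicit error return could
look roughly like the sketch below. It is a userspace model, not the driver
code: toy_pick_value() and its table argument are made-up names, and the
table simply stands in for whatever pm8941_wled_values() would return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical userspace helper, not the posted driver code: choose the
 * largest entry of table[] that does not exceed the requested value.
 * Returns 0 and stores the choice in *out, or -EINVAL if nothing fits.
 */
static int toy_pick_value(const unsigned int *table, size_t n,
			  unsigned int requested, unsigned int *out)
{
	unsigned int best = 0;
	int found = 0;
	size_t i;

	assert(table != NULL && out != NULL);

	for (i = 0; i < n; i++) {
		if (table[i] <= requested && (!found || table[i] >= best)) {
			best = table[i];
			found = 1;
		}
	}

	/* Report the failure explicitly instead of returning a stale rc of 0. */
	if (!found)
		return -EINVAL;

	*out = best;
	return 0;
}
```

A request smaller than every table entry then fails with -EINVAL and leaves
*out untouched, which is the behaviour the error path above seems to want.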

> +
> +static int pm8941_wled_remove(struct platform_device *pdev)
> +{
> + struct pm8941_wled *wled;
> +
> + wled = platform_get_drvdata(pdev);
> + led_classdev_unregister(&wled->cdev);

Would be nice to have a devm for this one too.

> +
> + return 0;
> +}
> +
> +static const struct of_device_id pm8941_wled_match_table[] = {
> + { .compatible = "qcom,pm8941-wled" },
> + {}
> +};
> +MODULE_DEVICE_TABLE(of, pm8941_wled_match_table);
> +
> +static struct platform_driver pm8941_wled_driver = {
> + .probe  = pm8941_wled_probe,
> + .remove = pm8941_wled_remove,
> + .driver = {
> + .name   = "pm8941-wled",
> + .owner  = THIS_MODULE,

THIS_MODULE should be removed.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH 3/3] cpuset: Fix cpuset sched_relax_domain_level

2015-02-12 Thread Zefan Li
From: Jason Low 

The cpuset.sched_relax_domain_level can control how far we do
immediate load balancing on a system. However, it was found on recent
kernels that echo'ing a value into cpuset.sched_relax_domain_level
did not reduce any immediate load balancing.

The reason this occurred was because the update_domain_attr_tree() traversal
did not update for the "top_cpuset". This resulted in nothing being changed
when modifying the sched_relax_domain_level parameter.

This patch is able to address that problem by having update_domain_attr_tree()
allow updates for the root in the cpuset traversal.
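
To illustrate the effect, here is a small userspace model. It is only a
sketch: struct toy_cpuset and toy_collect_relax_level() are hypothetical
names, cs[0] stands in for the root of the walk (top_cpuset in the failing
case), and it assumes the per-cpuset update keeps the highest
relax_domain_level seen during the traversal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cpuset; -1 means no level was requested. */
struct toy_cpuset {
	int relax_domain_level;
};

/*
 * Walk the root (cs[0]) and its descendants (cs[1..n-1]) and return the
 * highest relax_domain_level found, or -1 if none was set. With
 * skip_root set, the root is ignored, as in the old traversal.
 */
static int toy_collect_relax_level(const struct toy_cpuset *cs, size_t n,
				   bool skip_root)
{
	int level = -1;
	size_t i;

	assert(cs != NULL);

	for (i = 0; i < n; i++) {
		if (skip_root && i == 0)
			continue;
		if (cs[i].relax_domain_level > level)
			level = cs[i].relax_domain_level;
	}

	return level;
}
```

With only the root present and a level written to it, the skipping variant
returns -1 (nothing changes), while the variant that visits the root returns
the value that was written.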

Fixes: fc560a26acce ("cpuset: replace cpuset->stack_list with cpuset_for_each_descendant_pre()")
Cc:  # 3.9+
Signed-off-by: Jason Low 
Signed-off-by: Zefan Li 
---

This is a resend. I forgot to edit the subject when sending this patch...

---
 kernel/cpuset.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 29463c2..9e25599 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -548,9 +548,6 @@ static void update_domain_attr_tree(struct sched_domain_attr *dattr,
 
rcu_read_lock();
cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
-   if (cp == root_cs)
-   continue;
-
/* skip the whole subtree if @cp doesn't have any CPU */
if (cpumask_empty(cp->cpus_allowed)) {
pos_css = css_rightmost_descendant(pos_css);
-- 1.8.0.2



Re: [PATCH v3 2/2] sched/rt: Add check_preempt_equal_prio() logic in pick_next_task_rt()

2015-02-12 Thread Xunlei Pang
Hi steve,

On 13 February 2015 at 08:04, Steven Rostedt  wrote:
> On Sun,  8 Feb 2015 23:51:26 +0800
> Xunlei Pang  wrote:
>
>> check_preempt_curr() doesn't call sched_class::check_preempt_curr
>> when the class of current is a higher level.
>
> The above sentence does not make sense.
>
>> So if there is a DL
>> task running when doing this for RT, check_preempt_equal_prio()
>
> Doing what for RT?
>
>> will definitely miss, which may result in some response latency
>
> Miss what?

Sorry, the changelog lacks some information; I need to explain this in more detail.

>
>> for this RT task if it is pinned and there're some same-priority
>> migratable rt tasks already queued.
>>
>> We should do the similar thing in select_task_rq_rt() when first
>> picking rt tasks after running out of DL tasks.
>>
>> This patch tackles the issue by peeking the next rt task(RT1), and
>> if find RT1 migratable, just requeue it to the tail of the rq using
>> requeue_task_rt(rq, p, 0). In this way:
>> - If there do have another rt task(RT2) with the same priority as
>>   RT1, RT2 will finally be picked as the running task. While RT1
>>   will be pushed onto another cpu via RT1's post_schedule(), as
>>   RT1 is migratable. The difference from check_preempt_equal_prio()
>>   here is that we just don't care whether RT2 is migratable.
>>
>> - Otherwise, if there's no rt task with the same priority as RT1,
>>   RT1 will still be picked as the running task after the requeuing.
>
> What happens if there's three RT tasks of the same prio, RT1 is ready
> to run and is migratable, RT2 is pinned, RT3 is migratable
>
> RT1 just got pushed behind RT3 and it is now not the next one to run.
> RT2 will get this rq, RT3 will be pushed off, but say there's no more
> rq's available to run RT1.
>
> You just broke FIFO.

Yes, I've also thought of this point before.

If this is a problem, the same thing can already happen in the
current check_preempt_equal_prio() code: a pinned waking task
preempts current successfully because cpupri_find() reports that
current is migratable.

But by the time the resched happens, things may have changed, e.g.
current has become non-migratable, so the waking task runs while the
previously running task is stuck. That also breaks FIFO.
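
For reference, Steve's three-task case can be reproduced with a toy queue in
userspace. This is only a model under simplifying assumptions: struct
toy_task and the toy_*() helpers are made up for illustration, and the array
stands in for the same-priority FIFO list on one rq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an RT task queued at a single priority. */
struct toy_task {
	const char *name;
	bool migratable;
};

/*
 * Model of the patch's requeue step: if the head of the queue is
 * migratable, move it to the tail. Returns true if the head moved.
 */
static bool toy_requeue_if_migratable(struct toy_task *q, size_t n)
{
	struct toy_task head;

	assert(q != NULL);

	if (n < 2 || !q[0].migratable)
		return false;

	head = q[0];
	memmove(&q[0], &q[1], (n - 1) * sizeof(q[0]));
	q[n - 1] = head;
	return true;
}

/* Position of the named task in the queue, or -1 if it is not queued. */
static int toy_position(const struct toy_task *q, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(q[i].name, name) == 0)
			return (int)i;

	return -1;
}
```

Starting from {RT1, RT2, RT3} with RT2 pinned, the requeue leaves
{RT2, RT3, RT1}: the pinned RT2 runs first and RT1 now waits behind RT3.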

Thanks,
Xunlei

>
> I'm sorry, I'm thinking this is trying too hard to fix the user's poor
> management of RT tasks.
>
> If you have 2 or more RT tasks of the same prio, you had better be damn
> aware that if one is pinned, it will block the others, even from
> migrating. You should not have pinned tasks of the same prio as those
> that can migrate.
>
> And if your system depends on DL tasks working nicely with RT tasks on
> the same CPU, it's even more broken by design.
>
> -- Steve
>


Re: [PATCH v4 0/5] sched_clock: Optimize and avoid deadlock during read from NMI

2015-02-12 Thread Stephen Boyd
On 02/08/15 04:02, Daniel Thompson wrote:
> This patchset optimizes the generic sched_clock implementation by
> removing branches and significantly reducing the data cache profile. It
> also makes it safe to call sched_clock() from NMI (or FIQ on ARM).
>
> The data cache profile of sched_clock() in the original code is
> somewhere between 2 and 3 (64-byte) cache lines, depending on alignment
> of struct clock_data. After patching, the cache profile for the normal
> case should be a single cacheline.
>
> NMI safety was tested on i.MX6 with perf drowning the system in FIQs and
> using the perf handler to check that sched_clock() returned monotonic
> values. At the same time I forcefully reduced kt_wrap so that
> update_sched_clock() is being called at >1000Hz.
>
> Without the patches the above system is grossly unstable, surviving
> [9K,115K,25K] perf event cycles during three separate runs. With the
> patch I ran for over 9M perf event cycles before getting bored.
>

Looks good to me. For the series:

Reviewed-by: Stephen Boyd 

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH v2] clockevents: Introduce mode specific callbacks

2015-02-12 Thread Preeti U Murthy
On 02/13/2015 06:24 AM, Viresh Kumar wrote:
> It is not possible for the clockevents core to know which modes (other than
> those with a corresponding feature flag) are supported by a particular
> implementation. And drivers are expected to handle transition to all modes
> elegantly, as ->set_mode() would be issued for them unconditionally.
> 
> Now, adding support for a new mode complicates things a bit if we want to use
> the legacy ->set_mode() callback. We need to closely review all clockevents
> drivers to see if they would break on addition of a new mode. And after such
> reviews, it is found that we have to do non-trivial changes to most of the
> drivers [1].
> 
> Introduce mode-specific set_mode_*() callbacks, some of which the drivers may
> or may not implement. A missing callback would clearly convey the message that
> the corresponding mode isn't supported.
> 
> A driver may still choose to keep supporting the legacy ->set_mode() callback,
> but ->set_mode() won't support any new modes beyond RESUME. If a driver wants
> to benefit from a new mode, it is required to migrate to the mode-specific
> callbacks.
> 
> The legacy ->set_mode() callback and the newly introduced mode-specific
> callbacks are mutually exclusive. Only one of them should be supported by the
> driver.
> 
> Sanity check is done at the time of registration to distinguish between
> optional and required callbacks and to make error recovery and handling
> simpler. If the legacy ->set_mode() callback is provided, all mode specific
> ones would be ignored by the core but a warning is thrown if they are present.
> 
> Call sites calling ->set_mode() directly are also updated to use
> __clockevents_set_mode() instead, as ->set_mode() may not be available anymore
> for few drivers.
> 
> [1] https://lkml.org/lkml/2014/12/9/605
> [2] https://lkml.org/lkml/2015/1/23/255
> 
> Suggested-by: Thomas Gleixner  [2]
> Signed-off-by: Viresh Kumar 
> ---
> V1->V2: Stricter sanity checks.
> 
>  include/linux/clockchips.h | 21 +--
>  kernel/time/clockevents.c  | 88 
> --
>  kernel/time/timer_list.c   | 32 +++--
>  3 files changed, 134 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
> index 2e4cb67f6e56..59af26b54d15 100644
> --- a/include/linux/clockchips.h
> +++ b/include/linux/clockchips.h
> @@ -39,6 +39,8 @@ enum clock_event_mode {
>   CLOCK_EVT_MODE_PERIODIC,
>   CLOCK_EVT_MODE_ONESHOT,
>   CLOCK_EVT_MODE_RESUME,
> +
> + /* Legacy ->set_mode() callback doesn't support below modes */
>  };
> 
>  /*
> @@ -81,7 +83,11 @@ enum clock_event_mode {
>   * @mode:operating mode assigned by the management code
>   * @features:features
>   * @retries: number of forced programming retries
> - * @set_mode:set mode function
> + * @set_mode:legacy set mode function, only for modes <= CLOCK_EVT_MODE_RESUME.
> + * @set_mode_periodic:   switch mode to periodic, if !set_mode
> + * @set_mode_oneshot:switch mode to oneshot, if !set_mode
> + * @set_mode_shutdown:   switch mode to shutdown, if !set_mode
> + * @set_mode_resume: resume clkevt device, if !set_mode
>   * @broadcast:   function to broadcast events
>   * @min_delta_ticks: minimum delta value in ticks stored for reconfiguration
>   * @max_delta_ticks: maximum delta value in ticks stored for reconfiguration
> @@ -108,9 +114,20 @@ struct clock_event_device {
>   unsigned intfeatures;
>   unsigned long   retries;
> 
> - void(*broadcast)(const struct cpumask *mask);
> + /*
> +  * Mode transition callback(s): Only one of the two groups should be
> +  * defined:
> +  * - set_mode(), only for modes <= CLOCK_EVT_MODE_RESUME.
> +  * - set_mode_{shutdown|periodic|oneshot|resume}().
> +  */
>   void(*set_mode)(enum clock_event_mode mode,
>   struct clock_event_device *);
> + int (*set_mode_periodic)(struct clock_event_device *);
> + int (*set_mode_oneshot)(struct clock_event_device *);
> + int (*set_mode_shutdown)(struct clock_event_device *);
> + int (*set_mode_resume)(struct clock_event_device *);
> +
> + void(*broadcast)(const struct cpumask *mask);
>   void(*suspend)(struct clock_event_device *);
>   void(*resume)(struct clock_event_device *);
>   unsigned long   min_delta_ticks;
> diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> index 55449909f114..489642b08d64 100644
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -94,6 +94,57 @@ u64 clockevent_delta2ns(unsigned long latch, struct 

Re: [PATCH v3 1/2] sched/rt: Check to push the task when changing its affinity

2015-02-12 Thread Xunlei Pang
On 13 February 2015 at 07:31, Steven Rostedt  wrote:
> On Sun,  8 Feb 2015 23:51:25 +0800
> Xunlei Pang  wrote:
>
>
>> + if (new_weight > 1 &&
>> + rt_task(rq->curr) &&
>> + !test_tsk_need_resched(rq->curr)) {
>> + /*
>> +  * We own p->pi_lock and rq->lock. rq->lock might
>> +  * get released when doing direct pushing, however
>> +  * p->pi_lock is always held, so it's safe to assign
>> +  * the new_mask and new_weight to p below.
>> +  */
>> + if (!task_running(rq, p)) {
>> + cpumask_copy(&p->cpus_allowed, new_mask);
>> + p->nr_cpus_allowed = new_weight;
>> + direct_push = 1;
>> + } else if (cpumask_test_cpu(task_cpu(p), new_mask)) {
>> + cpumask_copy(&p->cpus_allowed, new_mask);
>> + p->nr_cpus_allowed = new_weight;
>> + if (!cpupri_find(&rq->rd->cpupri, p, NULL))
>> + goto update;
>> +
>> + /*
>> +  * At this point, current task gets migratable most
>> +  * likely due to the change of its affinity, let's
>> +  * figure out if we can migrate it.
>> +  *
>> +  * Is there any task with the same priority as that
>> +  * of current task? If found one, we should resched.
>> +  * NOTE: The target may be unpushable.
>> +  */
>> + if (p->prio == rq->rt.highest_prio.next) {
>> + /* One target just in pushable_tasks list. */
>> + requeue_task_rt(rq, p, 0);
>
> What's the purpose of the requeue_task_rt() here?
>
>> + preempt_push = 1;
>> + } else if (rq->rt.rt_nr_total > 1) {
>> + struct task_struct *next;
>> +
>> + requeue_task_rt(rq, p, 0);
>
> And here? It may just be late and I'm tired, but it's not obvious to me.

If we're changing the affinity of the currently running task, and there are
other tasks with the same prio on the same CPU, we do the same thing as
check_preempt_equal_prio(). But yes, this may have the same problem
you pointed out on the 2nd patch.

Thanks,
Xunlei


Re: [PATCH RESEND] ARM: DMA: Fix kzalloc flags in __iommu_alloc_buffer()

2015-02-12 Thread Will Deacon
On Wed, Feb 11, 2015 at 09:01:41AM +, Alexandre Courbot wrote:
> There doesn't seem to be any valid reason to allocate the pages array
> with the same flags as the buffer itself. Doing so can eventually lead
> to the following safeguard in mm/slab.c to be hit:
> 
> BUG_ON(flags & GFP_SLAB_BUG_MASK);

nit: I can't actually spot this BUG_ON in the kernel.

> This happens when buffers are allocated with __GFP_DMA32 or
> __GFP_HIGHMEM.
> 
> Fix this by allocating the pages array with GFP_KERNEL to follow what is
> done elsewhere in this file. Using GFP_KERNEL in __iommu_alloc_buffer()
> is safe because atomic allocations are handled by __iommu_alloc_atomic().
> 
> Signed-off-by: Alexandre Courbot 
> Cc: Arnd Bergmann 
> Cc: Marek Szyprowski 
> Cc: Russell King 
> Acked-by: Marek Szyprowski 
> ---
>  arch/arm/mm/dma-mapping.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 903dba0..170a116 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -1106,7 +1106,7 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
>   int i = 0;
>  
>   if (array_size <= PAGE_SIZE)
> - pages = kzalloc(array_size, gfp);
> + pages = kzalloc(array_size, GFP_KERNEL);
>   else
>   pages = vzalloc(array_size);
>   if (!pages)
> -- 
> 2.3.0

Looks sensible to me:

  Acked-by: Will Deacon 

Will


Re: [PATCH 1/3] cpuset: initialize effective masks when clone_children is enabled

2015-02-12 Thread Zefan Li
From: Jason Low 

The cpuset.sched_relax_domain_level can control how far we do
immediate load balancing on a system. However, it was found on recent
kernels that echo'ing a value into cpuset.sched_relax_domain_level
did not reduce any immediate load balancing.

The reason this occurred was because the update_domain_attr_tree() traversal
did not update for the "top_cpuset". This resulted in nothing being changed
when modifying the sched_relax_domain_level parameter.

This patch is able to address that problem by having update_domain_attr_tree()
allow updates for the root in the cpuset traversal.

Fixes: fc560a26acce ("cpuset: replace cpuset->stack_list with cpuset_for_each_descendant_pre()")
Cc:  # 3.9+
Signed-off-by: Jason Low 
Signed-off-by: Zefan Li 
---
 kernel/cpuset.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 29463c2..9e25599 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -548,9 +548,6 @@ static void update_domain_attr_tree(struct sched_domain_attr *dattr,
 
rcu_read_lock();
cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
-   if (cp == root_cs)
-   continue;
-
/* skip the whole subtree if @cp doesn't have any CPU */
if (cpumask_empty(cp->cpus_allowed)) {
pos_css = css_rightmost_descendant(pos_css);
-- 
1.8.0.2



[PATCH 2/3] cpuset: fix a warning when clearing configured masks in old hierarchy

2015-02-12 Thread Zefan Li
When we clear cpuset.cpus, cpuset.effective_cpus won't be cleared:

  # mount -t cgroup -o cpuset xxx /mnt
  # mkdir /mnt/tmp
  # echo 0 > /mnt/tmp/cpuset.cpus
  # echo > /mnt/tmp/cpuset.cpus
  # cat cpuset.cpus

  # cat cpuset.effective_cpus
  0-15

And a kernel warning in update_cpumasks_hier() is triggered:

 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 4028 at kernel/cpuset.c:894 update_cpumasks_hier+0x471/0x650()

Cc:  # 3.17+
Signed-off-by: Zefan Li 
---
 kernel/cpuset.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 7e9d711..29463c2 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -873,7 +873,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpumask *new_cpus)
 * If it becomes empty, inherit the effective mask of the
 * parent, which is guaranteed to have some CPUs.
 */
-   if (cpumask_empty(new_cpus))
+   if (cgroup_on_dfl(cp->css.cgroup) && cpumask_empty(new_cpus))
cpumask_copy(new_cpus, parent->effective_cpus);
 
/* Skip the whole subtree if the cpumask remains the same. */
@@ -1129,7 +1129,7 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
 * If it becomes empty, inherit the effective mask of the
 * parent, which is guaranteed to have some MEMs.
 */
-   if (nodes_empty(*new_mems))
+   if (cgroup_on_dfl(cp->css.cgroup) && nodes_empty(*new_mems))
*new_mems = parent->effective_mems;
 
/* Skip the whole subtree if the nodemask remains the same. */
-- 
1.8.0.2



[PATCH 1/3] cpuset: initialize effective masks when clone_children is enabled

2015-02-12 Thread Zefan Li
If clone_children is enabled, effective masks won't be initialized
due to the bug:

  # mount -t cgroup -o cpuset xxx /mnt
  # echo 1 > cgroup.clone_children
  # mkdir /mnt/tmp
  # cat /mnt/tmp/
  # cat cpuset.effective_cpus

  # cat cpuset.cpus
  0-15

And then this cpuset won't constrain the tasks in it.

Neither the bug nor the fix has any effect on the unified hierarchy, as
there's no clone_children flag there any more.

Reported-by: Christian Brauner 
Reported-by: Serge Hallyn 
Cc:  # 3.17+
Signed-off-by: Zefan Li 
---
 kernel/cpuset.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 64b257f..7e9d711 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1992,7 +1992,9 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 
spin_lock_irq(&callback_lock);
cs->mems_allowed = parent->mems_allowed;
+   cs->effective_mems = parent->mems_allowed;
cpumask_copy(cs->cpus_allowed, parent->cpus_allowed);
+   cpumask_copy(cs->effective_cpus, parent->cpus_allowed);
spin_unlock_irq(&callback_lock);
 out_unlock:
mutex_unlock(&cpuset_mutex);
-- 
1.8.0.2



[ANNOUNCE] Linux Security Summit 2015, Seattle WA, USA, August 20-21

2015-02-12 Thread James Morris
This is to announce the date & location of the 2015 Linux Security Summit.

LSS 2015 will be co-located with LinuxCon North America, in Seattle WA, 
USA, on 20 and 21 August.

As with previous events, LSS 2015 will be open to all registered LinuxCon 
attendees.

Please see the event web site for further details:

   http://events.linuxfoundation.org/events/linux-security-summit

A CFP will be announced soon.


- James (on behalf of the Program Committee)

#linuxsecuritysummit


Re: [PATCH 1/4] mm: cma: add currently allocated CMA buffers list to debugfs

2015-02-12 Thread Joonsoo Kim
On Fri, Feb 13, 2015 at 01:15:41AM +0300, Stefan Strogin wrote:
>  static int cma_debugfs_get(void *data, u64 *val)
>  {
>   unsigned long *p = data;
> @@ -125,6 +221,52 @@ static int cma_alloc_write(void *data, u64 val)
>  
>  DEFINE_SIMPLE_ATTRIBUTE(cma_alloc_fops, NULL, cma_alloc_write, "%llu\n");
>  
> +static int cma_buffers_read(struct file *file, char __user *userbuf,
> + size_t count, loff_t *ppos)
> +{
> + struct cma *cma = file->private_data;
> + struct cma_buffer *cmabuf;
> + struct stack_trace trace;
> + char *buf;
> + int ret, n = 0;
> +
> + if (*ppos < 0 || !count)
> + return -EINVAL;
> +
> + buf = kmalloc(count, GFP_KERNEL);
> + if (!buf)
> + return -ENOMEM;

Is count bounded to a size that kmalloc() can reasonably satisfy?
If it can exceed the page size, vmalloc() would be a better fit here.

Thanks.


Re: [PATCH 1/2] x86: entry_64.S: always allocate complete "struct pt_regs"

2015-02-12 Thread Travis
I'll read this later and get back to you.

On Feb 12, 2015 7:54 PM, Denys Vlasenko  wrote:
>
> On Thu, Feb 12, 2015 at 11:29 PM, Andy Lutomirski  
> wrote: 
> >> Thanks! 
> >> The renaming of macros caught the bug at compile time, as intended. 
> >> 
> >> I'll send an updated patch set v3 in a minute. It will have 
> >> additional patch in front, since that location in code 
> >> also wrongly uses R11 instead of ARGOFFSET. 
> > 
> > If you aren't already, can you base it here: 
> > 
> > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=x86/entry
> >  
>
> Yes, the most recent 3-patch set is on top of that tree. 


Re: [PATCH RFC v9 01/20] clk: divider: Correct parent clk round rate if no bestdiv is normally found

2015-02-12 Thread Travis
Travis liked your message with Boxer for Android.

On Feb 12, 2015 8:58 PM, Liu Ying  wrote:
>
> On Thu, Feb 12, 2015 at 10:06:27PM +0800, Liu Ying wrote: 
> > On Thu, Feb 12, 2015 at 02:41:31PM +0100, Sascha Hauer wrote: 
> > > On Thu, Feb 12, 2015 at 12:56:46PM +, Russell King - ARM Linux wrote: 
> > > > On Thu, Feb 12, 2015 at 01:24:05PM +0100, Sascha Hauer wrote: 
> > > > > On Thu, Feb 12, 2015 at 06:39:45PM +0800, Liu Ying wrote: 
> > > > > > On Thu, Feb 12, 2015 at 10:33:56AM +0100, Sascha Hauer wrote: 
> > > > > > > On Thu, Feb 12, 2015 at 02:01:24PM +0800, Liu Ying wrote: 
> > > > > > > > If no best divider is normally found, we will try to use the 
> > > > > > > > maximum divider. 
> > > > > > > > We should not set the parent clock rate to be 1Hz by force for 
> > > > > > > > being rounded. 
> > > > > > > > Instead, we should take the maximum divider as a base and 
> > > > > > > > calculate a correct 
> > > > > > > > parent clock rate for being rounded. 
> > > > > > > 
> > > > > > > Please add an explanation why you think the current code is wrong 
> > > > > > > and 
> > > > > > > what this actually fixes, maybe an example? 
> > > > > > 
> > > > > > The MIPI DSI panel's pixel clock rate is 26.4MHz and it's derived 
> > > > > > from PLL5 on 
> > > > > > the MX6DL SabreSD board. 
> > > > > > 
> > > > > > These are the clock tree summaries with or without the patch 
> > > > > > applied: 
> > > > > > 1) With the patch applied: 
> > > > > > > pll5_bypass_src                             1    1        2400    0 0
> > > > > > >    pll5                                     1    1   844800048    0 0
> > > > > > >       pll5_bypass                           1    1   844800048    0 0
> > > > > > >          pll5_video                         1    1   844800048    0 0
> > > > > > >             pll5_post_div                   1    1   211200012    0 0
> > > > > > >                pll5_video_div               1    1   211200012    0 0
> > > > > > >                   ipu1_di0_pre_sel          1    1   211200012    0 0
> > > > > > >                      ipu1_di0_pre           1    1        2642    0 0
> > > > > > >                         ipu1_di0_sel        1    1        2642    0 0
> > > > > > >                            ipu1_di0         1    1        2642    0 0
> > > > > > 
> > > > > > 2) Without the patch applied: 
> > > > > > > pll5_bypass_src                             1    1        2400    0 0
> > > > > > >    pll5                                     1    1       64800    0 0
> > > > > > >       pll5_bypass                           1    1       64800    0 0
> > > > > > >          pll5_video                         1    1       64800    0 0
> > > > > > >             pll5_post_div                   1    1       16200    0 0
> > > > > > >                pll5_video_div               1    1        4050    0 0
> > > > > > >                   ipu1_di0_pre_sel          1    1        4050    0 0
> > > > > > >                      ipu1_di0_pre           1    1        2025    0 0
> > > > > > >                         ipu1_di0_sel        1    1        2025    0 0
> > > > > > >                            ipu1_di0         1    1        2025    0 0
> > > > > 
> > > > > This seems to be broken since: 
> > > > > 
> > > > > | commit b11d282dbea27db1788893115dfca8a7856bf205 
> > > > > | Author: Tomi Valkeinen  
> > > > > | Date:   Thu Feb 13 12:03:59 2014 +0200 
> > > > > | 
> > > > > | clk: divider: fix rate calculation for fractional rates 
> > > > > 
> > > > > This patch fixed a case when clk_set_rate(clk_round_rate(rate)) 
> > > > > resulted 
> > > > > in a lower frequency than clk_round_rate(rate) returned. 
> > > > > 
> > > > > Since then the MULT_ROUND_UP in clk_divider_bestdiv() is inconsistent 
> > > > > to 
> > > > > the rest of the divider. Maybe this should be a simple rate * i now, 
> > > > > but 
> > > > > I'm unsure what side effects this has. 
> > > > > 
> > > > > I think your patch only fixes the behaviour in your case by accident, 
> > > > > it's not a correct fix for this issue. 
> > > > 
> > > > Well, it's defined that: 
> > > > 
> > > > new_rate = clk_round_rate(clk, rate); 
> > > > 
> > > > returns the rate which you would get if you did: 
> > > > 
> > > > clk_set_rate(clk, rate); 
> > > > new_rate = clk_get_rate(clk); 
> > > > 
> > > > The reasoning here is that clk_round_rate() gives you a way to query 
> > > > what 
> > > > rate you would get if you were to ask for the rate to be set, without 
> > > > effecting a change in the hardware. 
> > > > 
> > > > The idea that you should call clk_round_rate() first before 
> 

Re: [PATCH 2/4] mm: cma: add functions to get region pages counters

2015-02-12 Thread Joonsoo Kim
On Fri, Feb 13, 2015 at 01:15:42AM +0300, Stefan Strogin wrote:
> From: Dmitry Safonov 
> 
> Here are two functions that provide interface to compute/get used size
> and size of biggest free chunk in cma region.
> Add that information to debugfs.
> 
> Signed-off-by: Dmitry Safonov 
> Signed-off-by: Stefan Strogin 
> ---
>  include/linux/cma.h |  2 ++
>  mm/cma.c| 30 ++
>  mm/cma_debug.c  | 24 
>  3 files changed, 56 insertions(+)
> 
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index 4c2c83c..54a2c4d 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -18,6 +18,8 @@ struct cma;
>  extern unsigned long totalcma_pages;
>  extern phys_addr_t cma_get_base(struct cma *cma);
>  extern unsigned long cma_get_size(struct cma *cma);
> +extern unsigned long cma_get_used(struct cma *cma);
> +extern unsigned long cma_get_maxchunk(struct cma *cma);
>  
>  extern int __init cma_declare_contiguous(phys_addr_t base,
>   phys_addr_t size, phys_addr_t limit,
> diff --git a/mm/cma.c b/mm/cma.c
> index ed269b0..95e8121 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -54,6 +54,36 @@ unsigned long cma_get_size(struct cma *cma)
>   return cma->count << PAGE_SHIFT;
>  }
>  
> +unsigned long cma_get_used(struct cma *cma)
> +{
> + unsigned long ret = 0;
> +
> + mutex_lock(&cma->lock);
> + /* pages counter is smaller than sizeof(int) */
> + ret = bitmap_weight(cma->bitmap, (int)cma->count);
> + mutex_unlock(&cma->lock);
> +
> + return ret;
> +}

This needs to take order_per_bit into account so that it returns the
number of pages rather than the number of bits.

> +
> +unsigned long cma_get_maxchunk(struct cma *cma)
> +{
> + unsigned long maxchunk = 0;
> + unsigned long start, end = 0;
> +
> + mutex_lock(&cma->lock);
> + for (;;) {
> + start = find_next_zero_bit(cma->bitmap, cma->count, end);
> + if (start >= cma->count)
> + break;
> + end = find_next_bit(cma->bitmap, cma->count, start);
> + maxchunk = max(end - start, maxchunk);
> + }
> + mutex_unlock(&cma->lock);
> +
> + return maxchunk;
> +}
> +

Same here.

Thanks.



Re: [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended

2015-02-12 Thread Travis
Sounds good to me!

On Feb 12, 2015 8:03 PM, "Rafael J. Wysocki"  wrote:
>
> On Friday, February 13, 2015 08:53:38 AM John Stultz wrote: 
> > On Wed, Feb 11, 2015 at 12:03 PM, Rafael J. Wysocki  
> > wrote: 
> > > From: Rafael J. Wysocki  
> > > 
> > > Theoretically, ktime_get_mono_fast_ns() may be executed after 
> > > timekeeping has been suspended (or before it is resumed) which 
> > > in turn may lead to undefined behavior, for example, when the 
> > > clocksource read from timekeeping_get_ns() called by it is 
> > > not accessible at that time. 
> > 
> > And the callers of the ktime_get_mono_fast_ns() have to get back a 
> > value? 
>
> Yes, they do. 
>
> > Or can we return an error on timekeeping_suspended like we do 
> > w/ __getnstimeofday64()? 
>
> No, we can't. 
>
> > Also, what exactly is the case when the clocksource being read isn't 
> > accessible? I see this is conditionalized on 
> > CLOCK_SOURCE_SUSPEND_NONSTOP, so is the concern on resume we read the 
> > clocksource and its been reset causing a crazy time value? 
>
> The clocksource's ->suspend method may have been called (during suspend) 
> and depending on what that did we may even crash things theoretically. 
>
> During resume, before the clocksource's ->resume callback, it may just 
> be undefined behavior (random data etc). 
>
> For system suspend as we have today the window is quite narrow, but after 
> patch [4/6] from this series suspend-to-idle may suspend timekeeping and 
> just sit there in idle for extended time (hours even) which broadens the 
> potential exposure quite a bit. 
>
> Of course, it does that with interrupts disabled, but 
> ktime_get_mono_fast_ns() 
> is for NMI, so theoretically, if an NMI happens while we're in 
> suspend-to-idle 
> with timekeeping suspended and the clocksource is not 
> CLOCK_SOURCE_SUSPEND_NONSTOP 
> and the NMI calls ktime_get_mono_fast_ns(), strange and undesirable things 
> may 
> happen. 
>
> > > Prevent that from happening by setting up a dummy readout base for 
> > > the fast timekeeper during timekeeping_suspend() such that it will 
> > > always return the same number of cycles. 
> > > 
> > > After the last timekeeping_update() in timekeeping_suspend() the 
> > > clocksource is read and the result is stored as cycles_at_suspend. 
> > > The readout base from the current timekeeper is copied onto the 
> > > dummy and the ->read pointer of the dummy is set to a routine 
> > > unconditionally returning cycles_at_suspend.  Next, the dummy is 
> > > passed to update_fast_timekeeper(). 
> > > 
> > > Then, ktime_get_mono_fast_ns() will work until the subsequent 
> > > timekeeping_resume() and the proper readout base for the fast 
> > > timekeeper will be restored by the timekeeping_update() called 
> > > right after clearing timekeeping_suspended. 
> > > 
> > > Signed-off-by: Rafael J. Wysocki  
> > > --- 
> > >  kernel/time/timekeeping.c |   22 ++ 
> > >  1 file changed, 22 insertions(+) 
> > > 
> > > Index: linux-pm/kernel/time/timekeeping.c 
> > > === 
> > > --- linux-pm.orig/kernel/time/timekeeping.c 
> > > +++ linux-pm/kernel/time/timekeeping.c 
> > > @@ -1249,9 +1249,23 @@ static void timekeeping_resume(void) 
> > > hrtimers_resume(); 
> > >  } 
> > > 
> > > +/* 
> > > + * Dummy readout base and suspend-time cycles value for the fast 
> > > timekeeper to 
> > > + * work in a consistent way after timekeeping has been suspended if the 
> > > core 
> > > + * timekeeper clocksource is not suspend-nonstop. 
> > > + */ 
> > > +static struct tk_read_base tkr_dummy; 
> > > +static cycle_t cycles_at_suspend; 
> > > + 
> > > +static cycle_t dummy_clock_read(struct clocksource *cs) 
> > > +{ 
> > > +   return cycles_at_suspend; 
> > > +} 
> > > + 
> > >  static int timekeeping_suspend(void) 
> > >  { 
> > > struct timekeeper *tk = &tk_core.timekeeper; 
> > > +   struct clocksource *clock = tk->tkr.clock; 
> > > unsigned long flags; 
> > > struct timespec64   delta, delta_delta; 
> > > static struct timespec64    old_delta; 
> > > @@ -1294,6 +1308,14 @@ static int timekeeping_suspend(void) 
> > > } 
> > > 
> > > timekeeping_update(tk, TK_MIRROR); 
> > > + 
> > > +   if (!(clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP)) { 
> > > +   memcpy(&tkr_dummy, &tk->tkr, sizeof(tkr_dummy)); 
> > > +   cycles_at_suspend = tk->tkr.read(clock); 
> > > +   tkr_dummy.read = dummy_clock_read; 
> > > +   update_fast_timekeeper(&tkr_dummy); 
> > > +   } 
> > 
> > Its a little ugly... though I'm not sure I have a better idea right off. 
> > 
> > thanks 
> > -john 

Re: [PATCH 1/4] mm: cma: add currently allocated CMA buffers list to debugfs

2015-02-12 Thread Joonsoo Kim
On Fri, Feb 13, 2015 at 01:15:41AM +0300, Stefan Strogin wrote:
> /sys/kernel/debug/cma/cma-<N>/buffers contains a list of currently allocated
> CMA buffers for CMA region N when CONFIG_CMA_DEBUGFS is enabled.
> 
> Format is:
> 
>  -  ( kB), allocated by  ()
>  
> 
> Signed-off-by: Stefan Strogin 
> ---
>  include/linux/cma.h |   9 
>  mm/cma.c|   9 
>  mm/cma.h|  16 ++
>  mm/cma_debug.c  | 145 
> +++-
>  4 files changed, 178 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index 9384ba6..4c2c83c 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -28,4 +28,13 @@ extern int cma_init_reserved_mem(phys_addr_t base,
>   struct cma **res_cma);
>  extern struct page *cma_alloc(struct cma *cma, int count, unsigned int align);
>  extern bool cma_release(struct cma *cma, struct page *pages, int count);
> +
> +#ifdef CONFIG_CMA_DEBUGFS
> +extern int cma_buffer_list_add(struct cma *cma, unsigned long pfn, int count);
> +extern void cma_buffer_list_del(struct cma *cma, unsigned long pfn, int count);
> +#else
> +#define cma_buffer_list_add(cma, pfn, count) { }
> +#define cma_buffer_list_del(cma, pfn, count) { }
> +#endif
> +

These could be in mm/cma.h rather than include/linux/cma.h.

>  #endif
> diff --git a/mm/cma.c b/mm/cma.c
> index 2609e20..ed269b0 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -34,6 +34,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  
>  #include "cma.h"
>  
> @@ -125,6 +128,8 @@ static int __init cma_activate_area(struct cma *cma)
>  #ifdef CONFIG_CMA_DEBUGFS
>   INIT_HLIST_HEAD(&cma->mem_head);
>   spin_lock_init(&cma->mem_head_lock);
> + INIT_LIST_HEAD(&cma->buffers_list);
> + mutex_init(&cma->list_lock);
>  #endif
>  
>   return 0;
> @@ -408,6 +413,9 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>   start = bitmap_no + mask + 1;
>   }
>  
> + if (page)
> + cma_buffer_list_add(cma, pfn, count);
> +
>   pr_debug("%s(): returned %p\n", __func__, page);
>   return page;
>  }
> @@ -440,6 +448,7 @@ bool cma_release(struct cma *cma, struct page *pages, int count)
>  
>   free_contig_range(pfn, count);
>   cma_clear_bitmap(cma, pfn, count);
> + cma_buffer_list_del(cma, pfn, count);
>  
>   return true;
>  }
> diff --git a/mm/cma.h b/mm/cma.h
> index 1132d73..98e5f79 100644
> --- a/mm/cma.h
> +++ b/mm/cma.h
> @@ -1,6 +1,8 @@
>  #ifndef __MM_CMA_H__
>  #define __MM_CMA_H__
>  
> +#include 
> +
>  struct cma {
>   unsigned long   base_pfn;
>   unsigned long   count;
> @@ -10,9 +12,23 @@ struct cma {
>  #ifdef CONFIG_CMA_DEBUGFS
>   struct hlist_head mem_head;
>   spinlock_t mem_head_lock;
> + struct list_head buffers_list;
> + struct mutex list_lock;
>  #endif
>  };
>  
> +#ifdef CONFIG_CMA_DEBUGFS
> +struct cma_buffer {
> + unsigned long pfn;
> + unsigned long count;
> + pid_t pid;
> + char comm[TASK_COMM_LEN];
> + unsigned long trace_entries[16];
> + unsigned int nr_entries;
> + struct list_head list;
> +};
> +#endif
> +
>  extern struct cma cma_areas[MAX_CMA_AREAS];
>  extern unsigned cma_area_count;
>  
> diff --git a/mm/cma_debug.c b/mm/cma_debug.c
> index 7e1d325..5acd937 100644
> --- a/mm/cma_debug.c
> +++ b/mm/cma_debug.c
> @@ -2,6 +2,7 @@
>   * CMA DebugFS Interface
>   *
>   * Copyright (c) 2015 Sasha Levin 
> + * Copyright (c) 2015 Stefan Strogin 
>   */
>   
>  
> @@ -10,6 +11,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  #include "cma.h"
>  
> @@ -21,6 +24,99 @@ struct cma_mem {
>  
>  static struct dentry *cma_debugfs_root;
>  
> +/* Must be called under cma->list_lock */
> +static int __cma_buffer_list_add(struct cma *cma, unsigned long pfn, int count)
> +{
> + struct cma_buffer *cmabuf;
> + struct stack_trace trace;
> +
> + cmabuf = kmalloc(sizeof(*cmabuf), GFP_KERNEL);
> + if (!cmabuf) {
> + pr_warn("%s(page %p, count %d): failed to allocate buffer list entry\n",
> + __func__, pfn_to_page(pfn), count);
> + return -ENOMEM;
> + }
> +
> + trace.nr_entries = 0;
> + trace.max_entries = ARRAY_SIZE(cmabuf->trace_entries);
> + trace.entries = &cmabuf->trace_entries[0];
> + trace.skip = 2;
> + save_stack_trace(&trace);
> +
> + cmabuf->pfn = pfn;
> + cmabuf->count = count;
> + cmabuf->pid = task_pid_nr(current);
> + cmabuf->nr_entries = trace.nr_entries;
> + get_task_comm(cmabuf->comm, current);
> +
> + list_add_tail(&cmabuf->list, &cma->buffers_list);
> +
> + return 0;
> +}
> +
> +/**
> + * cma_buffer_list_add() - add a new entry to a list of allocated buffers
> + * @cma: Contiguous memory region for which the allocation is performed.
> + * @pfn: Base PFN of the allocated buffer.
> + * @count:   Number 

Re: [PATCH 07/11] ARM: prima2: do not select SMP_ON_UP

2015-02-12 Thread Barry Song
2015-02-13 3:42 GMT+08:00 Arnd Bergmann :
> The new Atlas7 platform implicitly selects 'CONFIG_SMP_ON_UP',
> which leads to problems if we enable building the platform without
> MMU, as that combination is not allowed and causes a link error:
>
> arch/arm/kernel/built-in.o: In function `c_show':
> :(.text+0x1872): undefined reference to `smp_on_up'
> :(.text+0x1876): undefined reference to `smp_on_up'
> arch/arm/kernel/built-in.o: In function `arch_irq_work_raise':
> :(.text+0x3d48): undefined reference to `smp_on_up'
> :(.text+0x3d4c): undefined reference to `smp_on_up'
> arch/arm/kernel/built-in.o: In function `smp_setup_processor_id':
> :(.init.text+0x180): undefined reference to `smp_on_up'
>
> This removes the 'select' statement.
>
> Signed-off-by: Arnd Bergmann 
> Fixes: 4cba058526a7 ("ARM: sirf: add Atlas7 machine support")
> Cc: Zhiwu Song 
> Cc: Barry Song 
> ---
>  arch/arm/mach-prima2/Kconfig | 1 -
>  1 file changed, 1 deletion(-)

Acked-by: Barry Song 

>
> diff --git a/arch/arm/mach-prima2/Kconfig b/arch/arm/mach-prima2/Kconfig
> index a219dc310d5d..e03d8b5c9ad0 100644
> --- a/arch/arm/mach-prima2/Kconfig
> +++ b/arch/arm/mach-prima2/Kconfig
> @@ -27,7 +27,6 @@ config ARCH_ATLAS7
> select CPU_V7
> select HAVE_ARM_SCU if SMP
> select HAVE_SMP
> -   select SMP_ON_UP if SMP
> help
>Support for CSR SiRFSoC ARM Cortex A7 Platform
>
> --
> 2.1.0.rc2
>

-barry


Re: [PATCH v2] clockevents: Introduce mode specific callbacks

2015-02-12 Thread Viresh Kumar
On 13 February 2015 at 10:11, Rafael J. Wysocki  wrote:
> On Friday, February 13, 2015 08:54:56 AM Viresh Kumar wrote:
>> It is not possible for the clockevents core to know which modes (other than
>> those with a corresponding feature flag) are supported by a particular
>> implementation. And drivers are expected to handle transition to all modes
>> elegantly, as ->set_mode() would be issued for them unconditionally.
>>
>> Now, adding support for a new mode complicates things a bit if we want to use
>> the legacy ->set_mode() callback. We need to closely review all clockevents
>> drivers to see if they would break on addition of a new mode. And after such
>> reviews, it is found that we have to do non-trivial changes to most of the
>> drivers [1].
>>
>> Introduce mode-specific set_mode_*() callbacks, some of which the drivers may
>> or may not implement. A missing callback would clearly convey the message
>> that the corresponding mode isn't supported.
>
> This is not going to fly AFAICS if you don't say what exactly you need it for.

For this: https://lkml.org/lkml/2014/5/9/508


Re: [PATCH 0/4] mm: cma: add some debug information for CMA

2015-02-12 Thread Joonsoo Kim
On Fri, Feb 13, 2015 at 01:15:40AM +0300, Stefan Strogin wrote:
> Hi all.
> 
> Sorry for the long delay. Here is the second attempt to add some facility
> for debugging CMA (the first one was "mm: cma: add /proc/cmainfo" [1]).
> 
> This patch set is based on v3.19 and Sasha Levin's patch set
> "mm: cma: debugfs access to CMA" [2].
> It is also available on git:
> git://github.com/stefanstrogin/cmainfo -b cmainfo-v2
> 
> We want an interface to see a list of currently allocated CMA buffers and
> some useful information about them (like /proc/vmallocinfo but for physically
> contiguous buffers allocated with CMA).
> 
> Here is an example use case when we need it. We want a big (megabytes)
> CMA buffer to be allocated in runtime in default CMA region. If someone
> already uses CMA then the big allocation can fail. If it happens then with
> such an interface we could find who used CMA at the moment of failure, who
> caused fragmentation (possibly ftrace also would be helpful here) and so on.

Hello,

So, I'm not sure that information about allocated CMA buffers is really
needed to solve your problem. You just want to know who uses the default CMA
region, and you can find that out by adding the tracepoints in your 4/4 patch.
Do we really need this custom allocation tracer? What more can we do with
it to solve your problem? Could you be more specific about your problem
and how this custom tracer would help solve it?

> 
> These patches add some files to debugfs when CONFIG_CMA_DEBUGFS is enabled.

If this tracer is justifiable, I think that making it conditional is
better than always enabling it with CONFIG_CMA_DEBUGFS. Some users
don't want this feature even though they enable CONFIG_CMA_DEBUGFS.

Thanks.

> 
> /sys/kernel/debug/cma/cma-<N>/buffers contains a list of currently allocated
> CMA buffers for each CMA region. Stacktrace saved at the moment of allocation
> is used to see who and whence allocated each buffer [3].
> 
> cma/cma-<N>/used and cma/cma-<N>/maxchunk are added to show used size and
> the biggest free chunk in each CMA region.
> 
> Also added trace events for cma_alloc() and cma_release().
> 
> Changes from "mm: cma: add /proc/cmainfo" [1]:
> - Rebased on v3.19 and Sasha Levin's patch set [2].
> - Moved debug code from cma.c to cma_debug.c.
> - Moved cmainfo to debugfs and split it by CMA region.
> - Split 'cmainfo' into 'buffers', 'used' and 'maxchunk'.
> - Used CONFIG_CMA_DEBUGFS instead of CONFIG_CMA_DEBUG.
> - Added trace events for cma_alloc() and cma_release().
> - Don't use seq_files.
> - A small change of debug output in cma_release().
> - cma_buffer_list_del() now supports releasing chunks which ranges don't match
>   allocations. E.g. we have buffer1: [0x0, 0x1], buffer2: [0x2, 0x3], then
>   cma_buffer_list_del(cma, 0x1 /*or 0x0*/, 1 /*(or 2 or 3)*/) should work.
> - Various small changes.
> 
> 
> [1] https://lkml.org/lkml/2014/12/26/95
> 
> [2] https://lkml.org/lkml/2015/1/28/755
> 
> [3] E.g.
> root@debian:/sys/kernel/debug/cma# cat cma-0/buffers
> 0x2f40 - 0x2f417000 (92 kB), allocated by pid 1 (swapper/0)
>  [] cma_alloc+0x1bb/0x200
>  [] dma_alloc_from_contiguous+0x3a/0x40
>  [] dma_generic_alloc_coherent+0x89/0x160
>  [] dmam_alloc_coherent+0xbe/0x100
>  [] ahci_port_start+0xe2/0x210
>  [] ata_host_start.part.28+0xc0/0x1a0
>  [] ata_host_activate+0xd0/0x110
>  [] ahci_host_activate+0x3f/0x170
>  [] ahci_init_one+0x764/0xab0
>  [] pci_device_probe+0x6f/0xd0
>  [] driver_probe_device+0x68/0x210
>  [] __driver_attach+0x79/0x80
>  [] bus_for_each_dev+0x4f/0x80
>  [] driver_attach+0x1e/0x20
>  [] bus_add_driver+0x157/0x200
>  [] driver_register+0x5d/0xf0
> <...> 
> 0x2f41b000 - 0x2f41c000 (4 kB), allocated by pid 1264 (NetworkManager)
>  [] cma_alloc+0x1bb/0x200
>  [] dma_alloc_from_contiguous+0x3a/0x40
>  [] dma_generic_alloc_coherent+0x89/0x160
>  [] e1000_setup_all_tx_resources+0x93/0x540
>  [] e1000_open+0x31/0x120
>  [] __dev_open+0x9f/0x130
>  [] __dev_change_flags+0x8e/0x150
>  [] dev_change_flags+0x28/0x60
>  [] do_setlink+0x2a0/0x760
>  [] rtnl_newlink+0x60b/0x7b0
>  [] rtnetlink_rcv_msg+0x84/0x1f0
>  [] netlink_rcv_skb+0x8e/0xb0
>  [] rtnetlink_rcv+0x21/0x30
>  [] netlink_unicast+0x13a/0x1d0
>  [] netlink_sendmsg+0x240/0x3e0
>  [] do_sock_sendmsg+0xbd/0xe0
> <...>
> 
> 
> Dmitry Safonov (1):
>   mm: cma: add functions to get region pages counters
> 
> Stefan Strogin (3):
>   mm: cma: add currently allocated CMA buffers list to debugfs
>   mm: cma: add number of pages to debug message in cma_release()
>   mm: cma: add trace events to debug physically-contiguous memory
> allocations
> 
>  include/linux/cma.h|  11 +++
>  include/trace/events/cma.h |  57 +++
>  mm/cma.c   |  46 +++-
>  mm/cma.h   |  16 +
>  mm/cma_debug.c | 169 
> -
>  5 files changed, 297 insertions(+), 2 deletions(-)
>  create mode 100644 include/trace/events/cma.h
> 
> -- 
> 2.1.0
> 
> --
> 

Re: Possible Bug in gnet_start_copy_compat for the file,gen_stats.c

2015-02-12 Thread Cong Wang
On Thu, Feb 12, 2015 at 6:33 PM, nick  wrote:
> Greets to Everyone,
> I am wondering after running sparse on the latest mainline tree why we are 
> not unlocking the  spinlock_bh,lock when calling the function,
> gnet_stats_start_copy_compat at the end of this function's body. Unless 
> someone can explain to me why there is a very good reason for not
> unlocking the spinlock_bh,lock at the end of this function I will send in a 
> patch fixing this. I am assuming this is a bug due to us infinitely
> looping in the spinlock_bh and deadlocking unless we exit this lock external 
> to the call to the function,gnet_start_copy_compat.

The last time I looked at this, the unlock was called from somewhere
else; you need to track it down. I don't think anyone has touched it
since then.

Let me know if you still can't find it. And yes, that code is kind of
messy; I _do_ have a patch to clean it up (not sent out yet).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v5 net-next 4/6] virtio-net: add basic interrupt coalescing support

2015-02-12 Thread Rusty Russell
"Michael S. Tsirkin"  writes:
> On Tue, Feb 10, 2015 at 12:02:37PM +1030, Rusty Russell wrote:
>> Jason Wang  writes:
>> > This patch enables the interrupt coalescing setting through ethtool.
>> 
>> The problem is that there's nothing network specific about interrupt
>> coalescing.  I can see other devices wanting exactly the same thing,
>> which means we'd deprecate this in the next virtio standard.
>> 
>> I think the right answer is to extend like we did with
>> vring_used_event(), eg:
>> 
>> 1) Add a new feature VIRTIO_F_RING_COALESCE.
>> 2) Add another 32-bit field after vring_used_event(), eg:
>> #define vring_used_delay(vr) (*(u32 *)((vr)->avail->ring[(vr)->num + 2]))
>> 
>> This loses the ability to coalesce by number of frames, but we can still
>> do number of sg entries, as we do now with used_event, and we could
>> change virtqueue_enable_cb_delayed() to take a precise number if we
>> wanted.
>
> But do we expect delay to be updated dynamically?
> If not, why not stick it in config space?

Hmm, we could update it dynamically (and will, in the case of ethtool).
But it won't be common, so we could append a field to
virtio_pci_common_cfg for PCI.

I think MMIO and CCW would be easy to extend too, but CC'd to check.

>> My feeling is that this should be a v1.0-only feature though
>> (eg. feature bit 33).
>
> Yes, e.g. we can't extend config space for legacy virtio pci.

Thanks,
Rusty.


Re: [PATCH v2] usb: dwc2: Register interrupt handler only once gadget is correctly initialized

2015-02-12 Thread John Youn
On 2/12/2015 4:42 AM, Romain Perier wrote:
> ping
> 
> 2015-02-06 17:50 GMT+01:00 Romain Perier :
>> Don't register interrupt handler before usb gadget is correctly initialized.
>> For some embedded platforms which don't have a usb-phy, it crashes the driver
>> because an interrupt is emitted with non-initialized hardware.
>> According to devm_request_irq documentation, an interrupt can be emitted
>> at any time once the interrupt is registered, so we have to care about driver
>> and hardware initialization.
>>
>> Signed-off-by: Romain Perier 
>> ---
>>
>> Changes for v2: fix typos in commit log
>>
>>  drivers/usb/dwc2/platform.c | 17 +
>>  1 file changed, 9 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
>> index ae095f0..b26cf8c 100644
>> --- a/drivers/usb/dwc2/platform.c
>> +++ b/drivers/usb/dwc2/platform.c
>> @@ -196,14 +196,6 @@ static int dwc2_driver_probe(struct platform_device *dev)
>> return irq;
>> }
>>
>> -   dev_dbg(hsotg->dev, "registering common handler for irq%d\n",
>> -   irq);
>> -   retval = devm_request_irq(hsotg->dev, irq,
>> - dwc2_handle_common_intr, IRQF_SHARED,
>> - dev_name(hsotg->dev), hsotg);
>> -   if (retval)
>> -   return retval;
>> -
>> res = platform_get_resource(dev, IORESOURCE_MEM, 0);
>> hsotg->regs = devm_ioremap_resource(&dev->dev, res);
>> if (IS_ERR(hsotg->regs))
>> @@ -237,6 +229,15 @@ static int dwc2_driver_probe(struct platform_device *dev)
>> retval = dwc2_gadget_init(hsotg, irq);
>> if (retval)
>> return retval;
>> +
>> +dev_dbg(hsotg->dev, "registering common handler for irq%d\n",
>> +irq);
>> +retval = devm_request_irq(hsotg->dev, irq,
>> +dwc2_handle_common_intr, IRQF_SHARED,
>> +dev_name(hsotg->dev), hsotg);
>> +if (retval)
>> +return retval;
>> +
>> retval = dwc2_hcd_init(hsotg, irq, params);
>> if (retval)
>> return retval;

Hi,

I'm going to be away until Wednesday, Feb 18. I'll take a look at
this and other pending dwc2 patches at that time.

Regards,
John




Re: [PATCH RFC v9 01/20] clk: divider: Correct parent clk round rate if no bestdiv is normally found

2015-02-12 Thread Liu Ying
On Thu, Feb 12, 2015 at 10:06:27PM +0800, Liu Ying wrote:
> On Thu, Feb 12, 2015 at 02:41:31PM +0100, Sascha Hauer wrote:
> > On Thu, Feb 12, 2015 at 12:56:46PM +, Russell King - ARM Linux wrote:
> > > On Thu, Feb 12, 2015 at 01:24:05PM +0100, Sascha Hauer wrote:
> > > > On Thu, Feb 12, 2015 at 06:39:45PM +0800, Liu Ying wrote:
> > > > > On Thu, Feb 12, 2015 at 10:33:56AM +0100, Sascha Hauer wrote:
> > > > > > On Thu, Feb 12, 2015 at 02:01:24PM +0800, Liu Ying wrote:
> > > > > > > > If no best divider is normally found, we will try to use the maximum divider.
> > > > > > > > We should not set the parent clock rate to be 1Hz by force for being rounded.
> > > > > > > > Instead, we should take the maximum divider as a base and calculate a correct
> > > > > > > > parent clock rate for being rounded.
> > > > > > > 
> > > > > > > Please add an explanation why you think the current code is wrong and
> > > > > > > what this actually fixes, maybe an example?
> > > > > > 
> > > > > > The MIPI DSI panel's pixel clock rate is 26.4MHz and it's derived from PLL5 on
> > > > > > the MX6DL SabreSD board.
> > > > > 
> > > > > These are the clock tree summaries with or without the patch applied:
> > > > > 1) With the patch applied:
> > > > > pll5_bypass_src   112400  
> > > > > 0 0
> > > > >pll5   11   844800048  
> > > > > 0 0
> > > > >   pll5_bypass 11   844800048  
> > > > > 0 0
> > > > >  pll5_video   11   844800048  
> > > > > 0 0
> > > > > pll5_post_div 11   211200012  
> > > > > 0 0
> > > > >pll5_video_div   11   211200012
> > > > > 0 0
> > > > >   ipu1_di0_pre_sel   11   
> > > > > 211200012   0 0
> > > > >  ipu1_di0_pre   11
> > > > > 26420 0
> > > > > ipu1_di0_sel   11
> > > > > 2642 0 0
> > > > >ipu1_di0   11
> > > > > 2642  0 0
> > > > > 
> > > > > 2) Without the patch applied:
> > > > > pll5_bypass_src   112400  
> > > > > 0 0
> > > > >pll5   11   64800  
> > > > > 0 0
> > > > >   pll5_bypass 11   64800  
> > > > > 0 0
> > > > >  pll5_video   11   64800  
> > > > > 0 0
> > > > > pll5_post_div 11   16200  
> > > > > 0 0
> > > > >pll5_video_div   114050
> > > > > 0 0
> > > > >   ipu1_di0_pre_sel   11
> > > > > 4050   0 0
> > > > >  ipu1_di0_pre   11
> > > > > 20250 0
> > > > > ipu1_di0_sel   11
> > > > > 2025 0 0
> > > > >ipu1_di0   11
> > > > > 2025  0 0
> > > > 
> > > > This seems to be broken since:
> > > > 
> > > > | commit b11d282dbea27db1788893115dfca8a7856bf205
> > > > | Author: Tomi Valkeinen 
> > > > | Date:   Thu Feb 13 12:03:59 2014 +0200
> > > > | 
> > > > | clk: divider: fix rate calculation for fractional rates
> > > > 
> > > > This patch fixed a case when clk_set_rate(clk_round_rate(rate)) resulted
> > > > in a lower frequency than clk_round_rate(rate) returned.
> > > > 
> > > > Since then the MULT_ROUND_UP in clk_divider_bestdiv() is inconsistent to
> > > > the rest of the divider. Maybe this should be a simple rate * i now, but
> > > > I'm unsure what side effects this has.
> > > > 
> > > > I think your patch only fixes the behaviour in your case by accident,
> > > > it's not a correct fix for this issue.
> > > 
> > > Well, it's defined that:
> > > 
> > >   new_rate = clk_round_rate(clk, rate);
> > > 
> > > returns the rate which you would get if you did:
> > > 
> > >   clk_set_rate(clk, rate);
> > >   new_rate = clk_get_rate(clk);
> > > 
> > > The reasoning here is that clk_round_rate() gives you a way to query what
> > > rate you would get if you were to ask for the rate to be set, without
> > > effecting a change in the hardware.
> > > 
> > > The idea that you should call clk_round_rate() first before clk_set_rate()
> > > and pass the returned rounded rate into clk_set_rate() is really idiotic
> > > given that.  Please don't do it, and please remove code which does it, and
> > > in review comment on it.  Thanks.
> > 
> > Tomi's patch is based on the assumption that clk_set_rate(clk_round_rate(rate))
> > is equal to clk_round_rate(rate). So when this assumption is wrong then
> 
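
A toy user-space model may help make the clk_round_rate() contract quoted above concrete. This is only a sketch, not kernel code: struct toy_clk and the toy_* helpers are hypothetical names, the parent rate is assumed fixed, and the divider is a plain integer in 1..max_div. The point it illustrates is that the value predicted by the round_rate helper is the value read back after set_rate with the same request, because both go through the same divider selection.

```c
#include <assert.h>

/* Toy model (hypothetical, not the kernel clk framework): a clock fed by
 * a fixed parent through an integer divider in the range 1..max_div. */
struct toy_clk {
	unsigned long parent_rate;	/* Hz, assumed fixed in this sketch */
	unsigned int max_div;		/* largest divider the "hardware" supports */
	unsigned int div;		/* currently programmed divider */
};

/* Smallest divider whose output does not exceed the requested rate,
 * clamped to 1..max_div (so a very low request may still be exceeded). */
static unsigned int toy_best_div(const struct toy_clk *clk, unsigned long rate)
{
	unsigned long div;

	assert(clk->max_div >= 1);
	if (rate == 0)
		return clk->max_div;

	/* Round the quotient up without risking overflow in an addition. */
	div = clk->parent_rate / rate + (clk->parent_rate % rate != 0);
	if (div < 1)
		div = 1;
	if (div > clk->max_div)
		div = clk->max_div;
	return (unsigned int)div;
}

/* Query only: report the rate a request would produce, touch nothing. */
static unsigned long toy_round_rate(const struct toy_clk *clk, unsigned long rate)
{
	return clk->parent_rate / toy_best_div(clk, rate);
}

/* Program the divider for the requested rate. */
static void toy_set_rate(struct toy_clk *clk, unsigned long rate)
{
	clk->div = toy_best_div(clk, rate);
}

/* Read back the rate implied by the programmed divider. */
static unsigned long toy_get_rate(const struct toy_clk *clk)
{
	return clk->parent_rate / clk->div;
}
```

With a 24 MHz parent and max_div of 8, a 5 MHz request selects divider 5 in this model, so toy_round_rate() reports 4.8 MHz and toy_set_rate() followed by toy_get_rate() reads back the same 4.8 MHz. Passing the rounded value back into toy_set_rate() is therefore redundant here; whether it is also harmless in a real divider depends on the rounding details discussed in the thread.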

Re: [GIT PULL] x86/apic updates for v3.20

2015-02-12 Thread Linus Torvalds
On Thu, Feb 12, 2015 at 6:08 PM, Linus Torvalds
 wrote:
>
> Jiang, Joerg - that commit seems to cause a lockup at suspend time for
> me. Now, I haven't verified by reverting it from top-of-git yet, but
> the bisection seemed to be pretty stable. I'll try the revert next (it
> doesn't revert cleanly, but I can undo it by hand).

Confirmed. Reverting 5fcee53ce705 makes the Pixel suspend cleanly again.

 Linus


Re: [PATCH v5 2/3] mm: cma: allocation trigger

2015-02-12 Thread Joonsoo Kim
On Thu, Feb 12, 2015 at 05:26:47PM -0500, Sasha Levin wrote:
> Provides a userspace interface to trigger a CMA allocation.
> 
> Usage:
> 
>   echo [pages] > alloc
> 
> This would provide testing/fuzzing access to the CMA allocation paths.
> 
> Signed-off-by: Sasha Levin 
> ---
>  mm/cma.c   |6 ++
>  mm/cma.h   |4 
>  mm/cma_debug.c |   56 ++--
>  3 files changed, 64 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/cma.c b/mm/cma.c
> index e093b53..9e3d44a 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -121,6 +121,12 @@ static int __init cma_activate_area(struct cma *cma)
>   } while (--i);
>  
>   mutex_init(&cma->lock);
> +
> +#ifdef CONFIG_CMA_DEBUGFS
> + INIT_HLIST_HEAD(&cma->mem_head);
> + spin_lock_init(&cma->mem_head_lock);
> +#endif
> +
>   return 0;
>  
>  err:
> diff --git a/mm/cma.h b/mm/cma.h
> index 4141887..1132d73 100644
> --- a/mm/cma.h
> +++ b/mm/cma.h
> @@ -7,6 +7,10 @@ struct cma {
>   unsigned long   *bitmap;
>   unsigned int order_per_bit; /* Order of pages represented by one bit */
>   struct mutex lock;
> +#ifdef CONFIG_CMA_DEBUGFS
> + struct hlist_head mem_head;
> + spinlock_t mem_head_lock;
> +#endif
>  };
>  
>  extern struct cma cma_areas[MAX_CMA_AREAS];
> diff --git a/mm/cma_debug.c b/mm/cma_debug.c
> index 3a25413..5bd6863 100644
> --- a/mm/cma_debug.c
> +++ b/mm/cma_debug.c
> @@ -7,9 +7,18 @@
>  
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  
>  #include "cma.h"
>  
> +struct cma_mem {
> + struct hlist_node node;
> + struct page *p;
> + unsigned long n;
> +};
> +
>  static struct dentry *cma_debugfs_root;
>  
>  static int cma_debugfs_get(void *data, u64 *val)
> @@ -23,8 +32,48 @@ static int cma_debugfs_get(void *data, u64 *val)
>  
>  DEFINE_SIMPLE_ATTRIBUTE(cma_debugfs_fops, cma_debugfs_get, NULL, "%llu\n");
>  
> -static void cma_debugfs_add_one(struct cma *cma, int idx)
> +static void cma_add_to_cma_mem_list(struct cma *cma, struct cma_mem *mem)
> +{
> + spin_lock(&cma->mem_head_lock);
> + hlist_add_head(&mem->node, &cma->mem_head);
> + spin_unlock(&cma->mem_head_lock);
> +}
> +
> +static int cma_alloc_mem(struct cma *cma, int count)
> +{
> + struct cma_mem *mem;
> + struct page *p;
> +
> + mem = kzalloc(sizeof(*mem), GFP_KERNEL);
> + if (!mem) 
> + return -ENOMEM;
> +
> + p = cma_alloc(cma, count, CONFIG_CMA_ALIGNMENT);

Alignment is resurrected. Please change it to 0.

Other than that,
Acked-by: Joonsoo Kim 

Thanks.
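
To illustrate the alignment comment above: as far as I understand it, the third argument of cma_alloc() is an alignment expressed as a page order, so 0 means "any page boundary" and a larger value restricts where a range may start. The sketch below is a user-space toy, not the kernel implementation: toy_find_free_run() and the bool array standing in for the CMA bitmap are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a CMA bitmap search (hypothetical helper): return the
 * index of the first run of `count` free pages whose start index is a
 * multiple of (1 << align_order), or -1 if no such run exists. */
static long toy_find_free_run(const bool *used, size_t npages,
			      size_t count, unsigned int align_order)
{
	size_t step, start, i;

	/* Guard the shift below against an out-of-range order. */
	assert(align_order < sizeof(size_t) * 8);
	step = (size_t)1 << align_order;

	if (count == 0 || count > npages)
		return -1;

	/* Only aligned start positions are candidates. */
	for (start = 0; start + count <= npages; start += step) {
		for (i = 0; i < count; i++)
			if (used[start + i])
				break;
		if (i == count)
			return (long)start;
	}
	return -1;
}
```

With only page 0 in use out of eight pages, a two-page request with order 0 can start at page 1, while the same request with order 2 has to skip ahead to page 4. For a debugfs test trigger, order 0 leaves the allocator the most freedom, which is one reading of why 0 is the better choice here.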



Re: [PATCH v5 3/3] mm: cma: release trigger

2015-02-12 Thread Joonsoo Kim
On Thu, Feb 12, 2015 at 05:26:48PM -0500, Sasha Levin wrote:
> Provides a userspace interface to trigger a CMA release.
> 
> Usage:
> 
>   echo [pages] > free
> 
> This would provide testing/fuzzing access to the CMA release paths.
> 
> Signed-off-by: Sasha Levin 

Acked-by: Joonsoo Kim 

