Re: [PATCH v2 2/3] perf tools: Record sampling time for each entry

2013-12-08 Thread Namhyung Kim
Hi David,

On Tue, 03 Dec 2013 09:20:23 -0700, David Ahern wrote:
> On 12/3/13, 2:00 AM, Namhyung Kim wrote:
>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
>> index 20a7c653b74b..ac65fc67972c 100644
>> --- a/tools/perf/util/evsel.h
>> +++ b/tools/perf/util/evsel.h
>> @@ -69,6 +69,7 @@ struct perf_evsel {
>>  struct hists hists;
>>  u64 first_timestamp;
>>  u64 last_timestamp;
>> +u64 *prev_timestamps;
>
>
> Why plural and why dynamically allocated? The allocation only does a
> single u64, not an array.

Nope, it'll be an array if the session was a system-wide one, so plural.

But, I think the current code won't work well if there're multiple
unrelated processes recorded - e.g. perf record -u `id -u` - since it'll
intermix all timestamps between the samples regardless of process.
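To make the array-versus-scalar point concrete, here is a minimal userspace sketch. It assumes one previous-timestamp slot per CPU; the struct and the helper names (ts_state, sample_delta) are hypothetical and not taken from the patch. It also shows the limitation described above: slots are keyed by CPU only, so samples from unrelated processes on the same CPU would still be intermixed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-event state discussed above. */
struct ts_state {
	uint64_t *prev_timestamps;	/* one slot per CPU */
	int nr_cpus;
};

/* Allocate one slot per CPU; a per-task session would pass nr_cpus == 1. */
static int ts_state_init(struct ts_state *st, int nr_cpus)
{
	assert(st && nr_cpus > 0);
	st->prev_timestamps = calloc((size_t)nr_cpus, sizeof(*st->prev_timestamps));
	if (!st->prev_timestamps)
		return -1;
	st->nr_cpus = nr_cpus;
	return 0;
}

/* Time since the previous sample on this CPU, or 0 for the first sample. */
static uint64_t sample_delta(struct ts_state *st, int cpu, uint64_t now)
{
	uint64_t prev;

	assert(cpu >= 0 && cpu < st->nr_cpus);
	prev = st->prev_timestamps[cpu];
	st->prev_timestamps[cpu] = now;
	return prev ? now - prev : 0;
}

static void ts_state_exit(struct ts_state *st)
{
	free(st->prev_timestamps);
	st->prev_timestamps = NULL;
}
```

Keying the slots by task instead of by CPU would need information the sample stream may not carry, which is the reason given above for dropping the patch.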

Hmm.. I don't think there's much we can do about this without help
from the kernel side (PERF_SAMPLE_READ?).  So I'll just drop this unless I
can come up with a better idea.  But patch 1/3 still makes sense and is
worth merging by itself IMHO.

What do you think?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 3/9] phy: Add new Exynos USB PHY driver

2013-12-08 Thread Kishon Vijay Abraham I

Hi,

On Friday 06 December 2013 09:58 PM, Kamil Debski wrote:

Hi Kishon,

Thank you for the review.


From: Kishon Vijay Abraham I [mailto:kis...@ti.com]
Sent: Friday, December 06, 2013 11:59 AM

Hi,

On Thursday 05 December 2013 05:59 PM, Kamil Debski wrote:

Add a new driver for the Exynos USB PHY. The new driver uses the
generic PHY framework. The driver includes support for the Exynos 4x10
and 4x12 SoC families.

Signed-off-by: Kamil Debski 
Signed-off-by: Kyungmin Park 
---




.
.

diff --git a/drivers/phy/Makefile b/drivers/phy/Makefile
index d0caae9..9f4befd 100644
--- a/drivers/phy/Makefile
+++ b/drivers/phy/Makefile
@@ -7,3 +7,6 @@ obj-$(CONFIG_PHY_EXYNOS_DP_VIDEO)   += phy-exynos-dp-video.o

  obj-$(CONFIG_PHY_EXYNOS_MIPI_VIDEO)   += phy-exynos-mipi-video.o
  obj-$(CONFIG_OMAP_USB2)   += phy-omap-usb2.o
  obj-$(CONFIG_TWL4030_USB) += phy-twl4030-usb.o
+obj-$(CONFIG_PHY_SAMSUNG_USB2) += phy-samsung-usb2.o
+obj-$(CONFIG_PHY_EXYNOS4210_USB2)  += phy-exynos4210-usb2.o
+obj-$(CONFIG_PHY_EXYNOS4212_USB2)  += phy-exynos4212-usb2.o
diff --git a/drivers/phy/phy-exynos4210-usb2.c
b/drivers/phy/phy-exynos4210-usb2.c
new file mode 100644
index 000..a02e5c2
--- /dev/null
+++ b/drivers/phy/phy-exynos4210-usb2.c
@@ -0,0 +1,264 @@
+/*
+ * Samsung SoC USB 1.1/2.0 PHY driver - Exynos 4210 support
+ *
+ * Copyright (C) 2013 Samsung Electronics Co., Ltd.
+ * Author: Kamil Debski 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 


You've included most of the above header files in phy-samsung-usb2.h
which you are including below.


I agree that includes in phy-samsung-usb2.h could use a cleanup. On the
other hand, my opinion is that a .c file should include all .h files that
are used in this .c file. Relying on a .h file to include another .h
doesn't seem good to me.


Then remove them from the .h file.



+#include "phy-samsung-usb2.h"
+
+/* Exynos USB PHY registers */
+
+/* PHY power control */
+#define EXYNOS_4210_UPHYPWR                0x0
+
+#define EXYNOS_4210_UPHYPWR_PHY0_SUSPEND   (1 << 0)


use BIT() here and everywhere below.
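For readers unfamiliar with the helper, a small userspace sketch of what the review comment asks for. BIT() is redefined locally here as a stand-in for the kernel macro, and the EXAMPLE_* names and reg_update() are hypothetical, not part of the driver.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's BIT() helper. */
#define BIT(nr)	(1UL << (nr))

/* Hypothetical register bits, for illustration only. */
#define EXAMPLE_PHY0_SUSPEND	BIT(0)	/* instead of (1 << 0) */
#define EXAMPLE_PHY0_PWR	BIT(3)	/* instead of (1 << 3) */

/* Set or clear a single bit in a register value. */
static unsigned long reg_update(unsigned long reg, unsigned long bit, int set)
{
	assert(bit != 0);
	return set ? (reg | bit) : (reg & ~bit);
}
```

The macro form keeps the bit number visible and gives an unsigned long result, which avoids shifting a signed int.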





.
.


+#ifdef CONFIG_PHY_EXYNOS4212_USB2
+   {
+   .compatible = "samsung,exynos4212-usb2-phy",
+   .data = _usb2_phy_config,
+   },
+#endif
+   { },
+};


I think we've had enough discussion about this approach. Let's get the
opinion of others too. Felipe? Greg?


Good idea.


Summary:
We have two drivers PHY_EXYNOS4210_USB2 and PHY_EXYNOS4212_USB2 with
almost similar register map [1] and a samsung helper driver for these
two drivers.


I would not call them separate drivers. It's a single USB 2.0 driver with
the option to include support for various SoCs. This patchset adds:
Exynos 4210, Exynos 4212, Exynos 5250 and S5PCV210. I know that another
person is working on supporting S3C6410.


These two PHY drivers populate the function pointers in the helper
driver. So any phy_ops will first invoke the helper driver which will
then invoke the corresponding phy driver.

[1] -> http://www.diffchecker.com/7yno1uvk
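The indirection described in the summary above (a helper that dispatches to per-SoC callbacks) can be sketched in userspace roughly as follows. This is only an illustration under assumptions: the struct, the compatible strings and the function names are hypothetical and do not come from the patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-SoC configuration that each variant fills in. */
struct soc_phy_config {
	const char *compatible;
	int (*power_on)(void);
};

/* Variant callbacks return an id so the dispatch can be observed. */
static int soc_a_power_on(void) { return 1; }
static int soc_b_power_on(void) { return 2; }

static const struct soc_phy_config configs[] = {
	{ "example,soc-a-usb2-phy", soc_a_power_on },
	{ "example,soc-b-usb2-phy", soc_b_power_on },
};

/* Look up the variant by compatible string, as a match table would. */
static const struct soc_phy_config *phy_match(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(configs) / sizeof(configs[0]); i++)
		if (strcmp(configs[i].compatible, compatible) == 0)
			return &configs[i];
	return NULL;
}

/* Helper-level op: common code would run here, then the variant callback. */
static int helper_power_on(const struct soc_phy_config *cfg)
{
	assert(cfg && cfg->power_on);
	return cfg->power_on();
}
```

Whether each variant lives in its own file or all of them share one file, the dispatch path looks the same; the disagreement in this thread is about how the variant code is organised.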


Come on, this diff only includes the registers part of the file.
The following functions are also different:
- exynos421*_rate_to_clk
- exynos421*_isol
- exynos421*_phy_pwr
- exynos421*_power_on
- exynos421*_power_off


But most of the differences are because your 4212 has additional features 
in HSIC and supports more clock rates.


It seems that the file is too large for the tool. But still this makes a
false impression that only registers are different.


Advantages:
* (more) clean and readable
* helper driver can be used with other PHY drivers which will be added
soon

Disadvantages:
* code duplication


I would say that actually in this case less code is duplicated. Having
separate drivers would mean that most of the phy-samsung-usb2.c file has


I actually meant a single driver for 4210 and 4212.

Your current code has separate drivers for different versions of the 
same IP. If you have a single driver for the different versions, it will 
lead to a lot less code duplication (hint: I've given the exact 'same' 
comment at least twice in this patch). There are quite a few examples in 
the kernel where the same driver is used for multiple versions of the IP.

to be repeated. That is 240 times 4 (and surely more in the future, as
this patchset adds support for 4 SoCs). Which is almost 1000 lines more.



Maybe having a helper driver makes sense when we have other samsung PHY
drivers added but not sure if it's needed here for EXYNOS4210_USB2 and
EXYNOS4212_USB2

Need your inputs on what you think about this.


Yes, I would also welcome other people's opinions.


+
+static int samsung_usb2_phy_probe(struct 

Re: xhci_hcd debugging status, please?

2013-12-08 Thread Oliver Neukum
On Sun, 2013-12-08 at 12:52 +0100, Udo van den Heuvel wrote:
> Hello,
> 
> Can someone please summarise the status of the xhci_hcd debugging I
> found after booting into 3.12.2?
> I did not see these messages before and I do not yet understand the
> added value for a mere user of this exciting xhci technology.
> Can we please have them go away if there's really nothing wrong?
> See below for a small part of the stream of stuff that was started.

You have XHCI debugging on. This is most likely a side effect
of the switch to dynamic debugging in 3.12. But this should be
discussed on linux-usb.

HTH
Oliver




Re: [RFC PATCH] drivers: char: Add a dynamic clock for the trace clock

2013-12-08 Thread Josh Triplett
On Fri, Dec 06, 2013 at 04:34:11PM -0800, Sonny Rao wrote:
> Based on a suggestion from John Stultz.
> 
> This adds a dynamic clock device which can be used with clock_gettime
> to sample the clock source used for time stamping trace events in the
> kernel.  The only use for this clock source is to associate user space
> events with kernel events on a given kernel.  It is explicitly not
> supposed to be used as a generic time source and won't necessarily be
> consistent between kernels.
> 
> Signed-off-by: Sonny Rao 

Reviewed-by: Josh Triplett 
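As a rough illustration of how userspace could sample such a clock: dynamic POSIX clocks are addressed by turning an open file descriptor into a clockid_t. The sketch below assumes the usual fd-to-clockid encoding documented for dynamic clocks in clock_gettime(2); the device path is not named in this message, so it is left to the caller and is purely hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Low three bits used to mark a file-descriptor-based clock id. */
#define SKETCH_CLOCKFD 3

/* Encode an open file descriptor as a dynamic clock id. */
static clockid_t fd_to_clockid(int fd)
{
	assert(fd >= 0);
	return (clockid_t)((~(unsigned int)fd << 3) | SKETCH_CLOCKFD);
}

/*
 * Open the clock device at 'path' (caller-supplied, hypothetical) and read it.
 * Returns 0 on success, -1 if the device cannot be opened or read.
 */
static int read_dynamic_clock(const char *path, struct timespec *ts)
{
	int fd, ret;

	assert(path && ts);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	ret = clock_gettime(fd_to_clockid(fd), ts);
	close(fd);
	return ret;
}
```

As the patch description says, values read this way are only meaningful for correlating user space events with kernel trace events on the same running kernel.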


Re: [PATCH] omap: twl-common: Fix musb-hdrc device name.

2013-12-08 Thread Belisko Marek
Hi Tony,

On Thu, Dec 5, 2013 at 7:43 PM, Tony Lindgren  wrote:
> * Belisko Marek  [131203 01:21]:
>> On Tue, Dec 3, 2013 at 10:08 AM, Belisko Marek  
>> wrote:
>> > Hi,
>> >
>> > On Tue, Dec 3, 2013 at 9:58 AM, Kishon Vijay Abraham I  
>> > wrote:
>> >> Hi,
>> >>
>> >> On Tuesday 03 December 2013 02:03 PM, Marek Belisko wrote:
>> >>> Without this change, booting an omap3 device (gta04) with a board file
>> >>> leads to the following errors:
>> >>>
>> >>> [1.203308] musb-hdrc musb-hdrc.0.auto: unable to find phy
>> >>> [1.209075] HS USB OTG: no transceiver configured
>> >>> [1.214019] musb-hdrc musb-hdrc.0.auto: musb_init_controller failed 
>> >>> with status -517
>> >>>
>> >>> and usb isn't working.
>> >>>
>> >>> This is probably a regression caused by commit 6c27f939
>> >>
>> >> I think a better fix would be to have this merged..
>> >> https://lkml.org/lkml/2013/7/26/91
>> > Yes, I see, but how could this help with the current situation? How do you
>> > then specify the device number?
>> I was too fast with my reply, sorry. I can see the whole series and it is of
>> course the correct solution. But as I said,
>> can we expect it to be merged for 3.13? If not, Tony, can you pick up my patch?
>
> If it's a regression, then let's get it merged for the -rc cycle.
Yes, it is a regression, and without the fix USB on most omap3-based boards
without DT will not work.
>
> So please try to follow up on getting the proper fix merged, meanwhile
> I'll mark this thread as read. If you need this one merged for some
> reason, then please report to get it back to my radar.
>
> Regards,
>
> Tony

BR,

marek

-- 
as simple and primitive as possible
-
Marek Belisko - OPEN-NANDRA
Freelance Developer

Ruska Nova Ves 219 | Presov, 08005 Slovak Republic
Tel: +421 915 052 184
skype: marekwhite
twitter: #opennandra
web: http://open-nandra.com


Re: [PATCH] net: unix: allow set_peek_off to fail

2013-12-08 Thread Pavel Emelyanov
On 12/08/2013 02:26 AM, Sasha Levin wrote:
> unix_dgram_recvmsg() will hold the readlock of the socket until recv
> is complete.
> 
> At the same time, we may try to setsockopt(SO_PEEK_OFF), which will hang until
> unix_dgram_recvmsg() completes (which can take a while) without allowing
> us to break out of it, triggering a hung task spew.
> 
> Instead, allow set_peek_off to fail; this way userspace will not hang.
> 
> Signed-off-by: Sasha Levin 

Acked-by: Pavel Emelyanov 
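A userspace analogue of the idea, for illustration only. The patch body is not quoted in this message, so the mechanism below is an assumption: it uses pthread_mutex_trylock as the non-blocking primitive, and the kernel change may well use a different one. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the socket state guarded by the read lock. */
struct sketch_sock {
	pthread_mutex_t readlock;
	int peek_off;
};

/*
 * Set the peek offset without blocking behind a long-running reader.
 * Returns 0 on success or a negative errno (-EBUSY if the lock is held).
 */
static int sketch_set_peek_off(struct sketch_sock *sk, int val)
{
	int err;

	assert(sk);
	err = pthread_mutex_trylock(&sk->readlock);
	if (err)
		return -err;
	sk->peek_off = val;
	pthread_mutex_unlock(&sk->readlock);
	return 0;
}
```

The caller sees an error and can retry, instead of hanging for as long as the reader holds the lock.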


Re: [RFC PATCH tip 0/5] tracing filters with BPF

2013-12-08 Thread Namhyung Kim
Hi Masami,

On Wed, 04 Dec 2013 10:13:37 +0900, Masami Hiramatsu wrote:
> (2013/12/04 3:26), Alexei Starovoitov wrote:
>> the only inconvenience so far is to know how parameters are getting
>> into registers.
>> on x86-64, arg1 is in rdi, arg2 is in rsi,... I want to improve that
>> after first step is done.
>
> Actually, that part is done by the perf-probe and ftrace dynamic events
> (kernel/trace/trace_probe.c). I think this generic BPF is good for
> re-implementing fetch methods. :)

For implementing fetch methods, it seems that it needs access to user
memory, stack and/or current (task_struct - for utask or vma later) from
the BPF VM as well.  Is that OK from a security perspective?

Anyway, I'll take a look at it later if I have time, but I want to get
the existing/pending implementation merged first. :)

Thanks,
Namhyung


Re: [PATCH] net: wirelesse: wcn36xx: pull allocation outside of critical section

2013-12-08 Thread Eugene Krasnikov
hal_ind_mutex is supposed to protect msg_ind, but with this patch the
allocation will be done outside the critical section.
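A userspace sketch of the pattern under discussion, assuming a pthread mutex in place of hal_ind_mutex and a hand-rolled singly linked queue; all names are hypothetical. The new node is private to the caller until it is linked into the queue, so in this sketch only the queue update sits inside the lock. Unlike the quoted diff, the sketch also performs the second allocation before locking and checks its result.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct msg_node {
	struct msg_node *next;
	size_t len;
	void *msg;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct msg_node *queue_head;
static struct msg_node **queue_tail = &queue_head;

/* Copy buf into a new node and append it; returns 0, or -1 if out of memory. */
static int queue_msg(const void *buf, size_t len)
{
	struct msg_node *node;

	assert(buf && len > 0);

	/* Both allocations and the copy happen before the lock is taken. */
	node = malloc(sizeof(*node));
	if (!node)
		return -1;
	node->msg = malloc(len);
	if (!node->msg) {
		free(node);
		return -1;
	}
	memcpy(node->msg, buf, len);
	node->len = len;
	node->next = NULL;

	/* The lock covers only the update of the shared queue. */
	pthread_mutex_lock(&queue_lock);
	*queue_tail = node;
	queue_tail = &node->next;
	pthread_mutex_unlock(&queue_lock);
	return 0;
}
```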

On Sat, Dec 7, 2013 at 5:13 PM, Michal Nazarewicz  wrote:
>
> This also simplifies flow control as there is now only one if
> condition with a single branch.
> ---
>  drivers/net/wireless/ath/wcn36xx/smd.c | 28 +++-
>  1 file changed, 15 insertions(+), 13 deletions(-)
>
> On Thu, Dec 05 2013, Eugene Krasnikov  wrote:
>> I think the code will look neater if the error case is handled in an "else"
>> statement, something like: [...]
>
> How about the following:
>
> diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c 
> b/drivers/net/wireless/ath/wcn36xx/smd.c
> index 823631c..ebab2db 100644
> --- a/drivers/net/wireless/ath/wcn36xx/smd.c
> +++ b/drivers/net/wireless/ath/wcn36xx/smd.c
> @@ -2056,22 +2056,24 @@ static void wcn36xx_smd_rsp_process(struct wcn36xx 
> *wcn, void *buf, size_t len)
> case WCN36XX_HAL_OTA_TX_COMPL_IND:
> case WCN36XX_HAL_MISSED_BEACON_IND:
> case WCN36XX_HAL_DELETE_STA_CONTEXT_IND:
> -   mutex_lock(>hal_ind_mutex);
> msg_ind = kmalloc(sizeof(*msg_ind), GFP_KERNEL);
> -   if (msg_ind) {
> -   msg_ind->msg_len = len;
> -   msg_ind->msg = kmalloc(len, GFP_KERNEL);
> -   memcpy(msg_ind->msg, buf, len);
> -   list_add_tail(_ind->list, >hal_ind_queue);
> -   queue_work(wcn->hal_ind_wq, >hal_ind_work);
> -   wcn36xx_dbg(WCN36XX_DBG_HAL, "indication arrived\n");
> +   if (!msg_ind) {
> +   /*
> +* FIXME: Do something smarter then just
> +* printing an error.
> +*/
> +   wcn36xx_err("Run out of memory while handling 
> SMD_EVENT (%d)\n",
> +   msg_header->msg_type);
> +   break;
> }
> +   mutex_lock(>hal_ind_mutex);
> +   msg_ind->msg_len = len;
> +   msg_ind->msg = kmalloc(len, GFP_KERNEL);
> +   memcpy(msg_ind->msg, buf, len);
> +   list_add_tail(_ind->list, >hal_ind_queue);
> +   queue_work(wcn->hal_ind_wq, >hal_ind_work);
> +   wcn36xx_dbg(WCN36XX_DBG_HAL, "indication arrived\n");
> mutex_unlock(>hal_ind_mutex);
> -   if (msg_ind)
> -   break;
> -   /* FIXME: Do something smarter then just printing an error. */
> -   wcn36xx_err("Run out of memory while handling SMD_EVENT 
> (%d)\n",
> -   msg_header->msg_type);
> break;
> default:
> wcn36xx_err("SMD_EVENT (%d) not supported\n",
> --
> 1.8.4
>



-- 
Best regards,
Eugene


Re: [PATCH 01/10] net: stmmac: Enable stmmac main clock when probing hardware

2013-12-08 Thread Chen-Yu Tsai
Hi Peppe,

On Mon, Dec 9, 2013 at 3:14 PM, Giuseppe CAVALLARO
 wrote:
> Hello Chen-Yu
>
>
> On 12/6/2013 6:29 PM, Chen-Yu Tsai wrote:
>>
>> Signed-off-by: Chen-Yu Tsai 
>> ---
>>
>> Giuseppe previously stated that the "stmmaceth" clock is the
>> main clock that drives the IP. The stmmac driver does not
>> enable this clock during the probe phase. When the driver is
>> built into the kernel, this is fine because the clock may be
>> on by default, or the boot loader may have enabled it.
>>
>> If stmmac is built as a module, when the module is loaded,
>> the clock may be found unused and disabled by the kernel.
>
>
> the clk_prepare_enable is then called in the open.
> Is it not working for you?
> Do you mean that the stmmac_hw_init fails if you do not move
> the clk_get and clk_prepare_enable on top of the stmmac_dvr_probe?
>

Exactly. The clock needs to be enabled prior to stmmac_dvr_probe.
Otherwise, stmmac_mdio_register will fail to find a usable PHY.

The DWMAC core I am working with does not support chip ID or
HW capability flags. I suspect those would fail, too.

Disabling the clock at the end of stmmac_dvr_probe is to make
sure it plays nice with existing power management callbacks.
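The ordering argued for above (get and enable the clock before touching hardware, disable it again at the end of probe, and unwind in reverse order on failure) can be modelled in userspace like this. It is a sketch with fake resources; none of the fake_* helpers or probe_sketch() exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Fake resources so the ordering can be observed in userspace. */
static int clk_refs;			/* get/put balance */
static int clk_enabled;			/* enable/disable balance */
static bool hw_init_should_fail;	/* test knob */

static int fake_clk_get(void) { clk_refs++; return 0; }
static void fake_clk_put(void) { clk_refs--; }
static void fake_clk_enable(void) { clk_enabled++; }
static void fake_clk_disable(void) { clk_enabled--; }

static int fake_hw_init(void)
{
	/* Hardware registers are only reachable while the clock runs. */
	assert(clk_enabled > 0);
	return hw_init_should_fail ? -1 : 0;
}

static int probe_sketch(void)
{
	int ret;

	ret = fake_clk_get();
	if (ret)
		goto err_clk_get;
	fake_clk_enable();

	ret = fake_hw_init();
	if (ret)
		goto err_hw_init;

	/* Leave the clock off again until the device is opened. */
	fake_clk_disable();
	return 0;

err_hw_init:
	fake_clk_disable();
	fake_clk_put();
err_clk_get:
	return ret;
}
```

On success the clock reference is kept but the clock is off, which matches the stated goal of leaving enable and disable to the existing open and power management paths.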

Chen-Yu

>>
>>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 24
>> +--
>>   1 file changed, 14 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index 8d4ccd3..7da71ed 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -2688,10 +2688,17 @@ struct stmmac_priv *stmmac_dvr_probe(struct device
>> *device,
>> if ((phyaddr >= 0) && (phyaddr <= 31))
>> priv->plat->phy_addr = phyaddr;
>>
>> +   priv->stmmac_clk = clk_get(priv->device, STMMAC_RESOURCE_NAME);
>> +   if (IS_ERR(priv->stmmac_clk)) {
>> +   pr_warn("%s: warning: cannot get CSR clock\n", __func__);
>> +   goto error_clk_get;
>> +   }
>> +   clk_prepare_enable(priv->stmmac_clk);
>> +
>> /* Init MAC and get the capabilities */
>> ret = stmmac_hw_init(priv);
>> if (ret)
>> -   goto error_free_netdev;
>> +   goto error_hw_init;
>>
>> ndev->netdev_ops = _netdev_ops;
>>
>> @@ -2729,12 +2736,6 @@ struct stmmac_priv *stmmac_dvr_probe(struct device
>> *device,
>> goto error_netdev_register;
>> }
>>
>> -   priv->stmmac_clk = clk_get(priv->device, STMMAC_RESOURCE_NAME);
>> -   if (IS_ERR(priv->stmmac_clk)) {
>> -   pr_warn("%s: warning: cannot get CSR clock\n", __func__);
>> -   goto error_clk_get;
>> -   }
>> -
>> /* If a specific clk_csr value is passed from the platform
>>  * this means that the CSR Clock Range selection cannot be
>>  * changed at run-time and it is fixed. Viceversa the driver'll
>> try to
>> @@ -2759,15 +2760,18 @@ struct stmmac_priv *stmmac_dvr_probe(struct device
>> *device,
>> }
>> }
>>
>> +   clk_disable_unprepare(priv->stmmac_clk);
>> +
>> return priv;
>>
>>   error_mdio_register:
>> -   clk_put(priv->stmmac_clk);
>> -error_clk_get:
>> unregister_netdev(ndev);
>>   error_netdev_register:
>> netif_napi_del(>napi);
>> -error_free_netdev:
>> +error_hw_init:
>> +   clk_disable_unprepare(priv->stmmac_clk);
>> +   clk_put(priv->stmmac_clk);
>> +error_clk_get:
>> free_netdev(ndev);
>>
>> return NULL;
>>
>


[PATCH] regulator: pfuze100: Fix address of FABID

2013-12-08 Thread Axel Lin
According to the datasheet, the address of FABID is 0x4. Fix it.

Signed-off-by: Axel Lin 
---
 drivers/regulator/pfuze100-regulator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/regulator/pfuze100-regulator.c 
b/drivers/regulator/pfuze100-regulator.c
index 032df37..8b5e4c7 100644
--- a/drivers/regulator/pfuze100-regulator.c
+++ b/drivers/regulator/pfuze100-regulator.c
@@ -38,7 +38,7 @@
 
 #define PFUZE100_DEVICEID  0x0
 #define PFUZE100_REVID 0x3
-#define PFUZE100_FABID 0x3
+#define PFUZE100_FABID 0x4
 
 #define PFUZE100_SW1ABVOL  0x20
 #define PFUZE100_SW1CVOL   0x2e
-- 
1.8.1.2





Re: [PATCH 1/9] phy: core: Change the way of_phy_get is called

2013-12-08 Thread Kishon Vijay Abraham I

On Friday 06 December 2013 04:22 PM, Kamil Debski wrote:

Hi,


From: Kishon Vijay Abraham I [mailto:kis...@ti.com]
Sent: Friday, December 06, 2013 6:31 AM

Hi,

On Thursday 05 December 2013 05:59 PM, Kamil Debski wrote:

Previously the of_phy_get function took a struct device * and was
declared static. It was impossible to call it from another driver and
thus it was impossible to get phy defined


It was never intended to be called from other drivers. What's up with
the wrapper of of_phy_get, phy_get()/devm_phy_get()? Why isn't that
enough?


Implementing support for multiple phys in the ehci driver is a bit tricky.
Especially when we want to do it right. Please have a look at this part of
the dts file:

+ehci@1258 {
+compatible = "samsung,exynos4210-ehci";
+reg = <0x1258 0x2>;
+interrupts = <0 70 0>;
+clocks = < 304>, < 305>;
+clock-names = "usbhost", "otg";
+status = "disabled";
+#address-cells = <1>;
+#size-cells = <0>;
+port@0 {
+reg = <0>;
+phys = < 1>;
+phy-names = "host";
+status = "disabled";
+};
+port@1 {
+reg = <1>;
+phys = < 2>;
+phy-names = "hsic0";
+status = "disabled";
+};
+port@2 {
+reg = <2>;
+phys = < 3>;
+phy-names = "hsic1";
+status = "disabled";
+};
+};

With the above we have a clear specification of ports and their respective
phys. But to do this properly the ehci driver has to iterate over port
nodes. It is much easier to use devm_of_phy_get by giving the node as its
argument.


I see. There are a couple more things we do in the wrapper that get 
missed when exporting of_phy_get (get_device and try_module_get). You 
might want to re-work that one.


Thanks
Kishon


Re: [PATCH 01/10] net: stmmac: Enable stmmac main clock when probing hardware

2013-12-08 Thread Giuseppe CAVALLARO

Hello Chen-Yu

On 12/6/2013 6:29 PM, Chen-Yu Tsai wrote:

Signed-off-by: Chen-Yu Tsai 
---

Giuseppe previously stated that the "stmmaceth" clock is the
main clock that drives the IP. The stmmac driver does not
enable this clock during the probe phase. When the driver is
built into the kernel, this is fine because the clock may be
on by default, or the boot loader may have enabled it.

If stmmac is built as a module, when the module is loaded,
the clock may be found unused and disabled by the kernel.


the clk_prepare_enable is then called in the open.
Is it not working for you?
Do you mean that the stmmac_hw_init fails if you do not move
the clk_get and clk_prepare_enable on top of the stmmac_dvr_probe?

Peppe



  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 24 +--
  1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 8d4ccd3..7da71ed 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2688,10 +2688,17 @@ struct stmmac_priv *stmmac_dvr_probe(struct device 
*device,
if ((phyaddr >= 0) && (phyaddr <= 31))
priv->plat->phy_addr = phyaddr;

+   priv->stmmac_clk = clk_get(priv->device, STMMAC_RESOURCE_NAME);
+   if (IS_ERR(priv->stmmac_clk)) {
+   pr_warn("%s: warning: cannot get CSR clock\n", __func__);
+   goto error_clk_get;
+   }
+   clk_prepare_enable(priv->stmmac_clk);
+
/* Init MAC and get the capabilities */
ret = stmmac_hw_init(priv);
if (ret)
-   goto error_free_netdev;
+   goto error_hw_init;

ndev->netdev_ops = _netdev_ops;

@@ -2729,12 +2736,6 @@ struct stmmac_priv *stmmac_dvr_probe(struct device 
*device,
goto error_netdev_register;
}

-   priv->stmmac_clk = clk_get(priv->device, STMMAC_RESOURCE_NAME);
-   if (IS_ERR(priv->stmmac_clk)) {
-   pr_warn("%s: warning: cannot get CSR clock\n", __func__);
-   goto error_clk_get;
-   }
-
/* If a specific clk_csr value is passed from the platform
 * this means that the CSR Clock Range selection cannot be
 * changed at run-time and it is fixed. Viceversa the driver'll try to
@@ -2759,15 +2760,18 @@ struct stmmac_priv *stmmac_dvr_probe(struct device 
*device,
}
}

+   clk_disable_unprepare(priv->stmmac_clk);
+
return priv;

  error_mdio_register:
-   clk_put(priv->stmmac_clk);
-error_clk_get:
unregister_netdev(ndev);
  error_netdev_register:
netif_napi_del(>napi);
-error_free_netdev:
+error_hw_init:
+   clk_disable_unprepare(priv->stmmac_clk);
+   clk_put(priv->stmmac_clk);
+error_clk_get:
free_netdev(ndev);

return NULL;





[PATCH 03/18] mm: Clear pmd_numa before invalidating

2013-12-08 Thread Mel Gorman
pmdp_invalidate clears the present bit without taking into account that it
might be in the _PAGE_NUMA bit leaving the PMD in an unexpected state. Clear
pmd_numa before invalidating.

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 mm/pgtable-generic.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index cbb3854..e84cad2 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -191,6 +191,9 @@ pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, 
pmd_t *pmdp)
 void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 pmd_t *pmdp)
 {
+   pmd_t entry = *pmdp;
+   if (pmd_numa(entry))
+   entry = pmd_mknonnuma(entry);
set_pmd_at(vma->vm_mm, address, pmdp, pmd_mknotpresent(*pmdp));
flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
 }
-- 
1.8.4
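The problem described in the changelog can be illustrated with a toy model. The bit layout and the model_* helpers below are hypothetical and are not the kernel's page-table definitions; the only assumption carried over from the changelog is that an entry may be marked present through the NUMA bit rather than the present bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout; not the kernel's real definitions. */
#define MODEL_PRESENT	(UINT64_C(1) << 0)
#define MODEL_NUMA	(UINT64_C(1) << 1)

typedef uint64_t model_pmd_t;

/* In this model an entry counts as present if either bit is set. */
static bool model_present(model_pmd_t e)
{
	return (e & (MODEL_PRESENT | MODEL_NUMA)) != 0;
}

/* Clears only the present bit, like the helper named in the changelog. */
static model_pmd_t model_mknotpresent(model_pmd_t e)
{
	return e & ~MODEL_PRESENT;
}

/* Move the "present" information from the NUMA bit back to the present bit. */
static model_pmd_t model_mknonnuma(model_pmd_t e)
{
	return (e & ~MODEL_NUMA) | MODEL_PRESENT;
}

/* Clear the NUMA marking first, then invalidate the updated value. */
static void model_invalidate(model_pmd_t *pmdp)
{
	model_pmd_t entry;

	assert(pmdp);
	entry = *pmdp;
	if (entry & MODEL_NUMA)
		entry = model_mknonnuma(entry);
	*pmdp = model_mknotpresent(entry);
}
```

Note that the sketch stores the updated local value. In the diff as quoted above, the set_pmd_at() line still passes *pmdp rather than entry to pmd_mknotpresent(), so it may be worth checking that the local variable is actually used.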



Re: [PATCH v3 04/10] usb: dwc3: use quirks to know if a particualr platform doesn't have PHY

2013-12-08 Thread Kishon Vijay Abraham I

Hi,

On Thursday 05 December 2013 01:28 PM, Heikki Krogerus wrote:

Hi,

On Thu, Dec 05, 2013 at 12:04:46PM +0530, Kishon Vijay Abraham I wrote:

On Wednesday 04 December 2013 08:10 PM, Heikki Krogerus wrote:

On Mon, Nov 25, 2013 at 03:31:24PM +0530, Kishon Vijay Abraham I wrote:

There can be systems which do not have an external phy, so get
phy only if no quirks are added that indicate the PHY is not present.
Introduced two quirk flags to indicate the *absence* of usb2 phy and
usb3 phy. Also remove checking if return value is -ENXIO since it's now
changed to always enable usb_phy layer.


Can you guys explain why something like this is needed? Like with
clocks and gpios, the device drivers shouldn't need to care any more
if the platform has the phys or not. -ENODEV tells you your platform


Shouldn't we report it if a particular platform needs a PHY and is not able to
get it? How will a user know if a particular controller is not working because it's
not able to get and initialize the PHYs? Don't you think in such cases it's
better to fail (and return from probe) because the controller will not work
anyway without the PHY?


My point is that you do not need to separately tell this to the driver
like you do with the quirks (if you did, then you would need to fix
your framework and not hack the drivers).

Like I said, ENODEV tells you that there is no phy on this platform
for you, allowing you to safely continue. If your phy driver is not
loaded, the framework already returns EPROBE_DEFER, right. Any other


Right, but that doesn't consider broken dt data. With quirks we'll be able 
to tell if a controller in a particular platform has PHY or not without 
depending on the dt data.

error when getting the phy you can consider critical. They are the
errors telling you that you do need a phy on this platform, but
something actually went wrong when getting it.

Not in all scenarios though :-s
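The error handling proposed in this thread (treat ENODEV as "no phy on this platform", EPROBE_DEFER as "try again later", anything else as a real failure) could look roughly like the sketch below. The enum and function are hypothetical; EPROBE_DEFER is defined locally because userspace errno.h does not provide it. As objected above, this scheme relies on the platform data being correct.

```c
#include <assert.h>
#include <errno.h>

/* The kernel's EPROBE_DEFER value; userspace errno.h does not define it. */
#define SKETCH_EPROBE_DEFER 517

enum phy_action {
	PHY_USE,		/* phy was found, use it */
	PHY_CONTINUE_WITHOUT,	/* no phy on this platform */
	PHY_RETRY_LATER,	/* phy driver not loaded yet */
	PHY_FAIL,		/* a phy is needed but getting it failed */
};

/* Map the result of a phy lookup (0 or a negative errno) to an action. */
static enum phy_action classify_phy_get(int err)
{
	assert(err <= 0);
	switch (err) {
	case 0:
		return PHY_USE;
	case -ENODEV:
		return PHY_CONTINUE_WITHOUT;
	case -SKETCH_EPROBE_DEFER:
		return PHY_RETRY_LATER;
	default:
		return PHY_FAIL;
	}
}
```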

Thanks
Kishon


[PATCH 05/18] mm: numa: Do not clear PTE for pte_numa update

2013-12-08 Thread Mel Gorman
The TLB must be flushed if the PTE is updated but change_pte_range is clearing
the PTE while marking PTEs pte_numa without necessarily flushing the TLB if it
reinserts the same entry. Without the flush, it's conceivable that two processors
have different TLBs for the same virtual address and at the very least it would
generate spurious faults. This patch only unmaps the pages in change_pte_range for
a full protection change.

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 mm/mprotect.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 2666797..0a07e2d 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -52,13 +52,14 @@ static unsigned long change_pte_range(struct vm_area_struct 
*vma, pmd_t *pmd,
pte_t ptent;
bool updated = false;
 
-   ptent = ptep_modify_prot_start(mm, addr, pte);
if (!prot_numa) {
+   ptent = ptep_modify_prot_start(mm, addr, pte);
ptent = pte_modify(ptent, newprot);
updated = true;
} else {
struct page *page;
 
+   ptent = *pte;
page = vm_normal_page(vma, addr, oldpte);
if (page) {
if (!pte_numa(oldpte)) {
@@ -79,7 +80,10 @@ static unsigned long change_pte_range(struct vm_area_struct 
*vma, pmd_t *pmd,
 
if (updated)
pages++;
-   ptep_modify_prot_commit(mm, addr, pte, ptent);
+
+   /* Only !prot_numa always clears the pte */
+   if (!prot_numa)
+   ptep_modify_prot_commit(mm, addr, pte, ptent);
} else if (IS_ENABLED(CONFIG_MIGRATION) && !pte_file(oldpte)) {
swp_entry_t entry = pte_to_swp_entry(oldpte);
 
-- 
1.8.4



[PATCH 07/18] mm: numa: Avoid unnecessary work on the failure path

2013-12-08 Thread Mel Gorman
If a PMD changes during a THP migration then migration aborts but the
failure path is doing more work than is necessary.

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 mm/migrate.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index be787d5..a987525 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1780,7 +1780,8 @@ fail_putback:
putback_lru_page(page);
mod_zone_page_state(page_zone(page),
 NR_ISOLATED_ANON + page_lru, -HPAGE_PMD_NR);
-   goto out_fail;
+
+   goto out_unlock;
}
 
/*
@@ -1854,6 +1855,7 @@ out_dropref:
}
spin_unlock(ptl);
 
+out_unlock:
unlock_page(page);
put_page(page);
return 0;
-- 
1.8.4



[PATCH 04/18] mm: numa: Do not clear PMD during PTE update scan

2013-12-08 Thread Mel Gorman
If the PMD is flushed then a parallel fault in handle_mm_fault() will enter
the pmd_none and do_huge_pmd_anonymous_page() path where it'll attempt
to insert a huge zero page. This is wasteful so the patch avoids clearing
the PMD when setting pmd_numa.

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 mm/huge_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index deae592..5a5da50 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1529,7 +1529,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t 
*pmd,
 */
if (!is_huge_zero_page(page) &&
!pmd_numa(*pmd)) {
-   entry = pmdp_get_and_clear(mm, addr, pmd);
+   entry = *pmd;
entry = pmd_mknuma(entry);
ret = HPAGE_PMD_NR;
}
-- 
1.8.4



[PATCH 09/18] mm: numa: Clear numa hinting information on mprotect

2013-12-08 Thread Mel Gorman
On a protection change it is no longer clear if the page should still be
accessible.  This patch clears the NUMA hinting fault bits on a protection
change.
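
For illustration only, the ordering of the two steps can be modelled in
user space. The bit names and toy_* helpers below are hypothetical stand-ins
and not the kernel's pte/pmd encoding: the NUMA hinting marker is dropped
first and only then are the new protection bits applied.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy page table entry (not the x86 layout). */
#define TOY_PRESENT   (1u << 0)
#define TOY_WRITE     (1u << 1)
#define TOY_NUMA      (1u << 2)
#define TOY_PROT_MASK (TOY_PRESENT | TOY_WRITE)

/* Drop the NUMA hinting marker and make the entry present again. */
static uint32_t toy_mknonnuma(uint32_t entry)
{
    return (entry & ~TOY_NUMA) | TOY_PRESENT;
}

/* Apply new protection bits, clearing any NUMA hinting marker first. */
static uint32_t toy_change_prot(uint32_t entry, uint32_t newprot)
{
    /* In this model newprot may only carry protection bits. */
    assert((newprot & ~TOY_PROT_MASK) == 0);

    if (entry & TOY_NUMA)
        entry = toy_mknonnuma(entry);
    return (entry & ~TOY_PROT_MASK) | newprot;
}
```

With this model an entry never leaves toy_change_prot() with TOY_NUMA still
set, whatever protection is requested.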

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 mm/huge_memory.c | 2 ++
 mm/mprotect.c    | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0f00b96..0ecaba2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1522,6 +1522,8 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
ret = 1;
if (!prot_numa) {
entry = pmdp_get_and_clear(mm, addr, pmd);
+   if (pmd_numa(entry))
+   entry = pmd_mknonnuma(entry);
entry = pmd_modify(entry, newprot);
ret = HPAGE_PMD_NR;
BUG_ON(pmd_write(entry));
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 0a07e2d..eb2f349 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -54,6 +54,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 
if (!prot_numa) {
ptent = ptep_modify_prot_start(mm, addr, pte);
+   if (pte_numa(ptent))
+   ptent = pte_mknonnuma(ptent);
ptent = pte_modify(ptent, newprot);
updated = true;
} else {
-- 
1.8.4



[PATCH 08/18] sched: numa: Skip inaccessible VMAs

2013-12-08 Thread Mel Gorman
Inaccessible VMAs should not be trapping NUMA hinting faults. Skip them.
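
As a rough user-space sketch of the test the scanner applies, assuming
hypothetical TOY_VM_* flags and a toy_vma structure in place of the kernel's
vm_area_struct: a VMA is skipped when none of the read, write or execute
permissions is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for VM_READ, VM_WRITE and VM_EXEC. */
#define TOY_VM_READ  0x1ul
#define TOY_VM_WRITE 0x2ul
#define TOY_VM_EXEC  0x4ul

struct toy_vma {
    unsigned long vm_flags;
};

/* True when the scanner should skip the VMA: no access of any kind allowed. */
static bool toy_vma_inaccessible(const struct toy_vma *vma)
{
    assert(vma != NULL);
    return !(vma->vm_flags & (TOY_VM_READ | TOY_VM_WRITE | TOY_VM_EXEC));
}
```

Flags outside the three permission bits do not make a VMA accessible in this
model, which mirrors the mask used in the hunk below.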

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 kernel/sched/fair.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e8b652e..1ce1615 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1752,6 +1752,13 @@ void task_numa_work(struct callback_head *work)
(vma->vm_file && (vma->vm_flags & (VM_READ|VM_WRITE)) == (VM_READ)))
continue;
 
+   /*
+* Skip inaccessible VMAs to avoid any confusion between
+* PROT_NONE and NUMA hinting ptes
+*/
+   if (!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)))
+   continue;
+
do {
start = max(start, vma->vm_start);
end = ALIGN(start + (pages << PAGE_SHIFT), HPAGE_SIZE);
-- 
1.8.4



[PATCH 10/18] mm: numa: Avoid unnecessary disruption of NUMA hinting during migration

2013-12-08 Thread Mel Gorman
do_huge_pmd_numa_page() handles the case where there is parallel THP
migration.  However, by the time it is checked the NUMA hinting information
has already been disrupted. This patch adds an earlier check with some helpers.

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 include/linux/migrate.h |  9 +
 mm/huge_memory.c        | 22 --
 mm/migrate.c            | 12
 3 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index f5096b5..b7717d7 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -90,10 +90,19 @@ static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
 #endif /* CONFIG_MIGRATION */
 
 #ifdef CONFIG_NUMA_BALANCING
+extern bool pmd_trans_migrating(pmd_t pmd);
+extern void wait_migrate_huge_page(struct anon_vma *anon_vma, pmd_t *pmd);
 extern int migrate_misplaced_page(struct page *page,
  struct vm_area_struct *vma, int node);
 extern bool migrate_ratelimited(int node);
 #else
+static inline bool pmd_trans_migrating(pmd_t pmd)
+{
+   return false;
+}
+static inline void wait_migrate_huge_page(struct anon_vma *anon_vma, pmd_t *pmd)
+{
+}
 static inline int migrate_misplaced_page(struct page *page,
 struct vm_area_struct *vma, int node)
 {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0ecaba2..e3b6a75 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -882,6 +882,10 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
ret = 0;
goto out_unlock;
}
+
+   /* mmap_sem prevents this happening but warn if that changes */
+   WARN_ON(pmd_trans_migrating(pmd));
+
if (unlikely(pmd_trans_splitting(pmd))) {
/* split huge page running from under us */
spin_unlock(src_ptl);
@@ -1299,6 +1303,17 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
if (unlikely(!pmd_same(pmd, *pmdp)))
goto out_unlock;
 
+   /*
+* If there are potential migrations, wait for completion and retry
+* without disrupting NUMA hinting information. Do not relock and
+* check_same as the page may no longer be mapped.
+*/
+   if (unlikely(pmd_trans_migrating(*pmdp))) {
+   spin_unlock(ptl);
+   wait_migrate_huge_page(vma->anon_vma, pmdp);
+   goto out;
+   }
+
page = pmd_page(pmd);
BUG_ON(is_huge_zero_page(page));
page_nid = page_to_nid(page);
@@ -1329,12 +1344,7 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
goto clear_pmdnuma;
}
 
-   /*
-* If there are potential migrations, wait for completion and retry. We
-* do not relock and check_same as the page may no longer be mapped.
-* Furtermore, even if the page is currently misplaced, there is no
-* guarantee it is still misplaced after the migration completes.
-*/
+   /* Migration could have started since the pmd_trans_migrating check */
if (!page_locked) {
spin_unlock(ptl);
wait_on_page_locked(page);
diff --git a/mm/migrate.c b/mm/migrate.c
index a987525..cfb4190 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1655,6 +1655,18 @@ int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
return 1;
 }
 
+bool pmd_trans_migrating(pmd_t pmd)
+{
+   struct page *page = pmd_page(pmd);
+   return PageLocked(page);
+}
+
+void wait_migrate_huge_page(struct anon_vma *anon_vma, pmd_t *pmd)
+{
+   struct page *page = pmd_page(*pmd);
+   wait_on_page_locked(page);
+}
+
 /*
  * Attempt to migrate a misplaced page to the specified destination
  * node. Caller is expected to have an elevated reference count on
-- 
1.8.4



[PATCH 11/18] mm: fix TLB flush race between migration, and change_protection_range

2013-12-08 Thread Mel Gorman
From: Rik van Riel 

There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.

The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.

During that time, a CPU may continue writing to the page.

This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.

When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the process.

This only affects x86, which has a somewhat optimistic pte_accessible. All
other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions (SPARC).

The basic race looks like this:

CPU A                       CPU B                       CPU C

                                                        load TLB entry
make entry PTE/PMD_NUMA
                            fault on entry
                                                        read/write old page
                            start migrating page
                            change PTE/PMD to new page
                                                        read/write old page [*]
flush TLB
                                                        reload TLB from new entry
                                                        read/write new page
                                                        lose data

[*] the old page may belong to a new user at this point!

The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible is aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.

This should fix both NUMA migration and compaction.
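
The decision described above can be sketched in user space as follows. This
is only a model: the TOY_PAGE_* bits, struct toy_mm and its
tlb_flush_pending field are hypothetical names chosen for the example, and
the real x86 bit assignments differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; the real x86 encoding is different. */
#define TOY_PAGE_PRESENT  (1u << 0)
#define TOY_PAGE_PROTNONE (1u << 1)
#define TOY_PAGE_NUMA     (1u << 2)

struct toy_mm {
    /* Set while a batched PTE update has not yet flushed the TLB. */
    bool tlb_flush_pending;
};

/*
 * An entry counts as accessible if it is present, or if it was made
 * PROT_NONE/NUMA but a stale TLB entry may still map it because the
 * flush for this mm has not happened yet.
 */
static bool toy_pte_accessible(const struct toy_mm *mm, uint32_t pte_flags)
{
    assert(mm != NULL);

    if (pte_flags & TOY_PAGE_PRESENT)
        return true;

    if ((pte_flags & (TOY_PAGE_PROTNONE | TOY_PAGE_NUMA)) &&
        mm->tlb_flush_pending)
        return true;

    return false;
}
```

A caller that is about to move the page would treat a true result as "a TLB
flush is still required before the old page can be reused".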

Cc: sta...@vger.kernel.org
Signed-off-by: Rik van Riel 
Signed-off-by: Mel Gorman 
---
 arch/sparc/include/asm/pgtable_64.h |  4 ++--
 arch/x86/include/asm/pgtable.h      | 11 --
 include/asm-generic/pgtable.h       |  2 +-
 include/linux/mm_types.h            | 44 +
 kernel/fork.c                       |  1 +
 mm/huge_memory.c                    |  7 ++
 mm/mprotect.c                       |  2 ++
 mm/pgtable-generic.c                |  5 +++--
 8 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 8358dc1..0f9e945 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -619,7 +619,7 @@ static inline unsigned long pte_present(pte_t pte)
 }
 
 #define pte_accessible pte_accessible
-static inline unsigned long pte_accessible(pte_t a)
+static inline unsigned long pte_accessible(struct mm_struct *mm, pte_t a)
 {
return pte_val(a) & _PAGE_VALID;
 }
@@ -847,7 +847,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
 * SUN4V NOTE: _PAGE_VALID is the same value in both the SUN4U
 * and SUN4V pte layout, so this inline test is fine.
 */
-   if (likely(mm != &init_mm) && pte_accessible(orig))
+   if (likely(mm != &init_mm) && pte_accessible(mm, orig))
tlb_batch_add(mm, addr, ptep, orig, fullmm);
 }
 
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 3d19994..48cab4c 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -452,9 +452,16 @@ static inline int pte_present(pte_t a)
 }
 
 #define pte_accessible pte_accessible
-static inline int pte_accessible(pte_t a)
+static inline bool pte_accessible(struct mm_struct *mm, pte_t a)
 {
-   return pte_flags(a) & _PAGE_PRESENT;
+   if (pte_flags(a) & _PAGE_PRESENT)
+   return true;
+
+   if ((pte_flags(a) & (_PAGE_PROTNONE | _PAGE_NUMA)) &&
+   tlb_flush_pending(mm))
+   return true;
+
+   return false;
 }
 
 static inline int pte_hidden(pte_t pte)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index f330d28..b12079a 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -217,7 +217,7 @@ static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
 #endif
 
 #ifndef pte_accessible
-# define pte_accessible(pte)   ((void)(pte),1)
+# define pte_accessible(mm, pte)   ((void)(pte), 1)
 #endif
 
 #ifndef flush_tlb_fix_spurious_fault
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index bd29941..c122bb1 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -443,6 +443,14 @@ struct mm_struct {
/* numa_scan_seq prevents two threads setting pte_numa */
int numa_scan_seq;
 #endif
+#if defined(CONFIG_NUMA_BALANCING) || defined(CONFIG_COMPACTION)
+   /*
+* An operation with batched TLB flushing is going on. Anything that
+* can move process memory needs to flush 

[PATCH 13/18] mm: numa: Make NUMA-migrate related functions static

2013-12-08 Thread Mel Gorman
numamigrate_update_ratelimit and numamigrate_isolate_page only have callers
in mm/migrate.c. This patch makes them static.

Signed-off-by: Mel Gorman 
---
 mm/migrate.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 5372521..77147bd 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1593,7 +1593,8 @@ bool migrate_ratelimited(int node)
 }
 
 /* Returns true if the node is migrate rate-limited after the update */
-bool numamigrate_update_ratelimit(pg_data_t *pgdat, unsigned long nr_pages)
+static bool numamigrate_update_ratelimit(pg_data_t *pgdat,
+   unsigned long nr_pages)
 {
bool rate_limited = false;
 
@@ -1617,7 +1618,7 @@ bool numamigrate_update_ratelimit(pg_data_t *pgdat, unsigned long nr_pages)
return rate_limited;
 }
 
-int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
+static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
 {
int page_lru;
 
-- 
1.8.4



[PATCH 01/18] mm: numa: Serialise parallel get_user_page against THP migration

2013-12-08 Thread Mel Gorman
Base pages are unmapped and flushed from cache and TLB during normal page
migration and replaced with a migration entry that causes any parallel fault
or gup to block until migration completes. THP does not unmap pages due to
a lack of support for migration entries at a PMD level. This allows races
with get_user_pages and get_user_pages_fast which commit 3f926ab94 ("mm:
Close races between THP migration and PMD numa clearing") made worse by
introducing a pmd_clear_flush().

This patch forces get_user_page (fast and normal) on a pmd_numa page to
go through the slow get_user_page path where it will serialise against THP
migration and properly account for the NUMA hinting fault. On the migration
side the page table lock is taken for each PTE update.
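
A minimal user-space model of that fallback, under the assumption of a toy
entry with only two properties (mapped, NUMA-marked). All toy_* names are
made up for the example; the real fast path walks page tables locklessly and
the slow path takes the fault properly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_entry {
    bool mapped;    /* entry points at a page */
    bool numa;      /* entry carries a NUMA hinting marker */
};

struct toy_stats {
    unsigned long fast_hits;
    unsigned long hinting_faults;
};

/* Lockless fast path: gives up (returns 0) on anything it cannot handle. */
static int toy_gup_fast(const struct toy_entry *e)
{
    if (!e->mapped || e->numa)
        return 0;
    return 1;
}

/* Slow path: handles and accounts the hinting fault before pinning. */
static int toy_gup_slow(struct toy_entry *e, struct toy_stats *st)
{
    if (!e->mapped)
        return 0;
    if (e->numa) {
        e->numa = false;
        st->hinting_faults++;
    }
    return 1;
}

/* Returns the number of pages pinned (0 or 1 in this model). */
static int toy_gup(struct toy_entry *e, struct toy_stats *st)
{
    assert(e != NULL && st != NULL);

    if (toy_gup_fast(e)) {
        st->fast_hits++;
        return 1;
    }
    return toy_gup_slow(e, st);
}
```

The first toy_gup() on a NUMA-marked entry goes through the slow path and
is counted as a hinting fault; later calls on the same entry take the fast
path.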

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 arch/x86/mm/gup.c | 13 +
 mm/huge_memory.c  | 24 
 mm/migrate.c  | 38 +++---
 3 files changed, 60 insertions(+), 15 deletions(-)

diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c
index dd74e46..0596e8e 100644
--- a/arch/x86/mm/gup.c
+++ b/arch/x86/mm/gup.c
@@ -83,6 +83,12 @@ static noinline int gup_pte_range(pmd_t pmd, unsigned long addr,
pte_t pte = gup_get_pte(ptep);
struct page *page;
 
+   /* Similar to the PMD case, NUMA hinting must take slow path */
+   if (pte_numa(pte)) {
+   pte_unmap(ptep);
+   return 0;
+   }
+
if ((pte_flags(pte) & (mask | _PAGE_SPECIAL)) != mask) {
pte_unmap(ptep);
return 0;
@@ -167,6 +173,13 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
if (pmd_none(pmd) || pmd_trans_splitting(pmd))
return 0;
if (unlikely(pmd_large(pmd))) {
+   /*
+* NUMA hinting faults need to be handled in the GUP
+* slowpath for accounting purposes and so that they
+* can be serialised against THP migration.
+*/
+   if (pmd_numa(pmd))
+   return 0;
if (!gup_huge_pmd(pmd, addr, next, write, pages, nr))
return 0;
} else {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index bccd5a6..deae592 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1243,6 +1243,10 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
if ((flags & FOLL_DUMP) && is_huge_zero_pmd(*pmd))
return ERR_PTR(-EFAULT);
 
+   /* Full NUMA hinting faults to serialise migration in fault paths */
+   if ((flags & FOLL_NUMA) && pmd_numa(*pmd))
+   goto out;
+
page = pmd_page(*pmd);
VM_BUG_ON(!PageHead(page));
if (flags & FOLL_TOUCH) {
@@ -1323,23 +1327,27 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
/* If the page was locked, there are no parallel migrations */
if (page_locked)
goto clear_pmdnuma;
+   }
 
-   /*
-* Otherwise wait for potential migrations and retry. We do
-* relock and check_same as the page may no longer be mapped.
-* As the fault is being retried, do not account for it.
-*/
+   /*
+* If there are potential migrations, wait for completion and retry. We
+* do not relock and check_same as the page may no longer be mapped.
+* Furtermore, even if the page is currently misplaced, there is no
+* guarantee it is still misplaced after the migration completes.
+*/
+   if (!page_locked) {
spin_unlock(ptl);
wait_on_page_locked(page);
page_nid = -1;
goto out;
}
 
-   /* Page is misplaced, serialise migrations and parallel THP splits */
+   /*
+* Page is misplaced. Page lock serialises migrations. Acquire anon_vma
+* to serialises splits
+*/
get_page(page);
spin_unlock(ptl);
-   if (!page_locked)
-   lock_page(page);
anon_vma = page_lock_anon_vma_read(page);
 
/* Confirm the PMD did not change while page_table_lock was released */
diff --git a/mm/migrate.c b/mm/migrate.c
index bb94004..2cabbd5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1722,6 +1722,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
struct page *new_page = NULL;
struct mem_cgroup *memcg = NULL;
int page_lru = page_is_file_cache(page);
+   pmd_t orig_entry;
 
/*
 * Rate-limit the amount of data that is being migrated to a node.
@@ -1756,7 +1757,8 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 

[PATCH 14/18] mm: numa: Limit scope of lock for NUMA migrate rate limiting

2013-12-08 Thread Mel Gorman
NUMA migrate rate limiting protects a migration counter and window using
a lock but in some cases this can be a contended lock. It is not
critical that the number of pages be perfect; lost updates are
acceptable. Reduce the scope of this lock so it only covers the window reset.
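
The resulting logic can be sketched in user space as below. This is a
single-threaded model with hypothetical names (struct toy_ratelimit,
toy_update_ratelimit) and plain tick counters instead of jiffies, so it does
not handle counter wraparound the way time_after() does; comments mark where
the kernel version would and would not hold the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_ratelimit {
    unsigned long next_window;   /* tick at which the current window ends */
    unsigned long nr_pages;      /* pages accounted in the current window */
    unsigned long window_ticks;  /* length of one window */
    unsigned long limit_pages;   /* pages allowed per window */
};

/* Returns true if the caller is rate limited after the update. */
static bool toy_update_ratelimit(struct toy_ratelimit *rl,
                                 unsigned long now, unsigned long nr_pages)
{
    assert(rl != NULL);

    if (now > rl->next_window) {
        /* Only this window reset would be done under the lock. */
        rl->nr_pages = 0;
        rl->next_window = now + rl->window_ticks;
    }

    if (rl->nr_pages > rl->limit_pages)
        return true;

    /* Unlocked update: a concurrent caller could lose an increment. */
    rl->nr_pages += nr_pages;
    return false;
}
```

A lost increment only means slightly more pages than the limit may be
migrated in one window, which is the trade-off the changelog accepts.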

Signed-off-by: Mel Gorman 
---
 include/linux/mmzone.h |  5 +
 mm/migrate.c           | 21 -
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index bd791e4..b835d3f 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -758,10 +758,7 @@ typedef struct pglist_data {
int kswapd_max_order;
enum zone_type classzone_idx;
 #ifdef CONFIG_NUMA_BALANCING
-   /*
-* Lock serializing the per destination node AutoNUMA memory
-* migration rate limiting data.
-*/
+   /* Lock serializing the migrate rate limiting window */
spinlock_t numabalancing_migrate_lock;
 
/* Rate limiting time interval */
diff --git a/mm/migrate.c b/mm/migrate.c
index 77147bd..8b560d5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1596,26 +1596,29 @@ bool migrate_ratelimited(int node)
 static bool numamigrate_update_ratelimit(pg_data_t *pgdat,
unsigned long nr_pages)
 {
-   bool rate_limited = false;
-
/*
 * Rate-limit the amount of data that is being migrated to a node.
 * Optimal placement is no good if the memory bus is saturated and
 * all the time is being spent migrating!
 */
-   spin_lock(&pgdat->numabalancing_migrate_lock);
if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) {
+   spin_lock(&pgdat->numabalancing_migrate_lock);
pgdat->numabalancing_migrate_nr_pages = 0;
pgdat->numabalancing_migrate_next_window = jiffies +
msecs_to_jiffies(migrate_interval_millisecs);
+   spin_unlock(&pgdat->numabalancing_migrate_lock);
}
if (pgdat->numabalancing_migrate_nr_pages > ratelimit_pages)
-   rate_limited = true;
-   else
-   pgdat->numabalancing_migrate_nr_pages += nr_pages;
-   spin_unlock(&pgdat->numabalancing_migrate_lock);
-   
-   return rate_limited;
+   return true;
+
+   /*
+* This is an unlocked non-atomic update so errors are possible.
+* The consequences are failing to migrate when we potentiall should
+* have which is not severe enough to warrant locking. If it is ever
+* a problem, it can be converted to a per-cpu counter.
+*/
+   pgdat->numabalancing_migrate_nr_pages += nr_pages;
+   return false;
 }
 
 static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
-- 
1.8.4



[PATCH 12/18] mm: numa: Defer TLB flush for THP migration as long as possible

2013-12-08 Thread Mel Gorman
THP migration can fail for a variety of reasons. Avoid flushing the TLB
to deal with THP migration races until the copy is ready to start.

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 mm/huge_memory.c | 7 ---
 mm/migrate.c     | 6 ++
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e3a5ee2..e3b6a75 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1377,13 +1377,6 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
}
 
/*
-* The page_table_lock above provides a memory barrier
-* with change_protection_range.
-*/
-   if (tlb_flush_pending(mm))
-   flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
-
-   /*
 * Migrate the THP to the requested node, returns with page unlocked
 * and pmd_numa cleared.
 */
diff --git a/mm/migrate.c b/mm/migrate.c
index cfb4190..5372521 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1759,6 +1759,12 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
goto out_fail;
}
 
+   /* PTL provides a memory barrier with change_protection_range */
+   ptl = pmd_lock(mm, pmd);
+   if (tlb_flush_pending(mm))
+   flush_tlb_range(vma, mmun_start, mmun_end);
+   spin_unlock(ptl);
+
/* Prepare a page as a migration target */
__set_page_locked(new_page);
SetPageSwapBacked(new_page);
-- 
1.8.4



[PATCH 18/18] sched: Add tracepoints related to NUMA task migration

2013-12-08 Thread Mel Gorman
This patch adds three tracepoints
 o trace_sched_move_numa   when a task is moved to a node
 o trace_sched_swap_numa   when a task is swapped with another task
 o trace_sched_stick_numa  when a numa-related migration fails

The tracepoints allow the NUMA scheduler activity to be monitored and the
following high-level metrics can be calculated

 o NUMA migrated stuck   nr trace_sched_stick_numa
 o NUMA migrated idle    nr trace_sched_move_numa
 o NUMA migrated swapped nr trace_sched_swap_numa
 o NUMA local swapped    trace_sched_swap_numa src_nid == dst_nid (should never happen)
 o NUMA remote swapped   trace_sched_swap_numa src_nid != dst_nid (should == NUMA migrated swapped)
 o NUMA group swapped    trace_sched_swap_numa src_ngid == dst_ngid
                         Maybe a small number of these are acceptable
                         but a high number would be a major surprise.
                         It would be even worse if bounces are frequent.
 o NUMA avg task migs.   Average number of migrations for tasks
 o NUMA stddev task mig  Self-explanatory
 o NUMA max task migs.   Maximum number of migrations for a single task

In general the intent of the tracepoints is to help diagnose problems
where automatic NUMA balancing appears to be doing an excessive amount of
useless work.
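
As a sketch of the per-task summary metrics, assuming the per-task migration
counts have already been extracted from the trace into an array (the
toy_summarise helper and its struct are hypothetical and not part of any
tracing tool): the code reports the average, the population variance and
the maximum. The standard deviation is the square root of the variance and
is left out only to keep the example free of libm.

```c
#include <assert.h>
#include <stddef.h>

struct toy_mig_stats {
    double avg;          /* average migrations per task */
    double variance;     /* population variance of migrations per task */
    unsigned long max;   /* most migrations seen for a single task */
};

/* Summarise n per-task migration counts; all fields are 0 when n == 0. */
static struct toy_mig_stats toy_summarise(const unsigned long *migs, size_t n)
{
    struct toy_mig_stats s = { 0.0, 0.0, 0 };
    double sum = 0.0, sq = 0.0;
    size_t i;

    assert(migs != NULL || n == 0);
    if (n == 0)
        return s;

    for (i = 0; i < n; i++) {
        sum += (double)migs[i];
        if (migs[i] > s.max)
            s.max = migs[i];
    }
    s.avg = sum / (double)n;

    for (i = 0; i < n; i++) {
        double d = (double)migs[i] - s.avg;
        sq += d * d;
    }
    s.variance = sq / (double)n;
    return s;
}
```

A high maximum combined with a low average would point at a few tasks
bouncing rather than at generally excessive migration.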

Signed-off-by: Mel Gorman 
---
 include/trace/events/sched.h | 68
 kernel/sched/core.c          |  2 ++
 kernel/sched/fair.c          |  6 ++--
 3 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index cf1694c..f0c54e3 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -443,11 +443,7 @@ TRACE_EVENT(sched_process_hang,
 );
 #endif /* CONFIG_DETECT_HUNG_TASK */
 
-/*
- * Tracks migration of tasks from one runqueue to another. Can be used to
- * detect if automatic NUMA balancing is bouncing between nodes
- */
-TRACE_EVENT(sched_move_task,
+DECLARE_EVENT_CLASS(sched_move_task_template,
 
TP_PROTO(struct task_struct *tsk, int src_cpu, int dst_cpu),
 
@@ -478,6 +474,68 @@ TRACE_EVENT(sched_move_task,
__entry->src_cpu, __entry->src_nid,
__entry->dst_cpu, __entry->dst_nid)
 );
+
+/*
+ * Tracks migration of tasks from one runqueue to another. Can be used to
+ * detect if automatic NUMA balancing is bouncing between nodes
+ */
+DEFINE_EVENT(sched_move_task_template, sched_move_task,
+   TP_PROTO(struct task_struct *tsk, int src_cpu, int dst_cpu),
+
+   TP_ARGS(tsk, src_cpu, dst_cpu)
+);
+
+DEFINE_EVENT(sched_move_task_template, sched_move_numa,
+   TP_PROTO(struct task_struct *tsk, int src_cpu, int dst_cpu),
+
+   TP_ARGS(tsk, src_cpu, dst_cpu)
+);
+
+DEFINE_EVENT(sched_move_task_template, sched_stick_numa,
+   TP_PROTO(struct task_struct *tsk, int src_cpu, int dst_cpu),
+
+   TP_ARGS(tsk, src_cpu, dst_cpu)
+);
+
+TRACE_EVENT(sched_swap_numa,
+
+   TP_PROTO(struct task_struct *src_tsk, int src_cpu,
+struct task_struct *dst_tsk, int dst_cpu),
+
+   TP_ARGS(src_tsk, src_cpu, dst_tsk, dst_cpu),
+
+   TP_STRUCT__entry(
+   __field( pid_t, src_pid )
+   __field( pid_t, src_tgid)
+   __field( pid_t, src_ngid)
+   __field( int,   src_cpu )
+   __field( int,   src_nid )
+   __field( pid_t, dst_pid )
+   __field( pid_t, dst_tgid)
+   __field( pid_t, dst_ngid)
+   __field( int,   dst_cpu )
+   __field( int,   dst_nid )
+   ),
+
+   TP_fast_assign(
+   __entry->src_pid= task_pid_nr(src_tsk);
+   __entry->src_tgid   = task_tgid_nr(src_tsk);
+   __entry->src_ngid   = task_numa_group_id(src_tsk);
+   __entry->src_cpu= src_cpu;
+   __entry->src_nid= cpu_to_node(src_cpu);
+   __entry->dst_pid= task_pid_nr(dst_tsk);
+   __entry->dst_tgid   = task_tgid_nr(dst_tsk);
+   __entry->dst_ngid   = task_numa_group_id(dst_tsk);
+   __entry->dst_cpu= dst_cpu;
+   __entry->dst_nid= cpu_to_node(dst_cpu);
+   ),
+
+   TP_printk("src_pid=%d src_tgid=%d src_ngid=%d src_cpu=%d src_nid=%d dst_pid=%d dst_tgid=%d dst_ngid=%d dst_cpu=%d dst_nid=%d",
+   __entry->src_pid, __entry->src_tgid, __entry->src_ngid,
+   __entry->src_cpu, __entry->src_nid,
+   __entry->dst_pid, __entry->dst_tgid, __entry->dst_ngid,
+   __entry->dst_cpu, __entry->dst_nid)
+);
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/core.c 

[PATCH 16/18] mm: numa: Do not automatically migrate KSM pages

2013-12-08 Thread Mel Gorman
KSM pages can be shared between tasks that are not necessarily related
to each other from a NUMA perspective. This patch causes those pages to
be ignored by automatic NUMA balancing so they do not migrate and do not
cause unrelated tasks to be grouped together.

Signed-off-by: Mel Gorman 
---
 mm/mprotect.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9b1be30..c258137 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -63,7 +64,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 
ptent = *pte;
page = vm_normal_page(vma, addr, oldpte);
-   if (page) {
+   if (page && !PageKsm(page)) {
if (!pte_numa(oldpte)) {
ptent = pte_mknuma(ptent);
updated = true;
-- 
1.8.4



[PATCH 15/18] mm: numa: Trace tasks that fail migration due to rate limiting

2013-12-08 Thread Mel Gorman
A low local/remote numa hinting fault ratio is potentially explained by
failed migrations. This patch adds a tracepoint that fires when migration
fails due to migration rate limitation.

Signed-off-by: Mel Gorman 
---
 include/trace/events/migrate.h | 26 ++
 mm/migrate.c                   |  5 -
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h
index ec2a6cc..3075ffb 100644
--- a/include/trace/events/migrate.h
+++ b/include/trace/events/migrate.h
@@ -45,6 +45,32 @@ TRACE_EVENT(mm_migrate_pages,
__print_symbolic(__entry->reason, MIGRATE_REASON))
 );
 
+TRACE_EVENT(mm_numa_migrate_ratelimit,
+
+   TP_PROTO(struct task_struct *p, int dst_nid, unsigned long nr_pages),
+
+   TP_ARGS(p, dst_nid, nr_pages),
+
+   TP_STRUCT__entry(
+   __array(char,   comm,   TASK_COMM_LEN)
+   __field(pid_t,  pid)
+   __field(int,dst_nid)
+   __field(unsigned long,  nr_pages)
+   ),
+
+   TP_fast_assign(
+   memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
+   __entry->pid= p->pid;
+   __entry->dst_nid= dst_nid;
+   __entry->nr_pages   = nr_pages;
+   ),
+
+   TP_printk("comm=%s pid=%d dst_nid=%d nr_pages=%lu",
+   __entry->comm,
+   __entry->pid,
+   __entry->dst_nid,
+   __entry->nr_pages)
+);
 #endif /* _TRACE_MIGRATE_H */
 
 /* This part must be outside protection */
diff --git a/mm/migrate.c b/mm/migrate.c
index 8b560d5..9f53c00 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1608,8 +1608,11 @@ static bool numamigrate_update_ratelimit(pg_data_t *pgdat,
msecs_to_jiffies(migrate_interval_millisecs);
spin_unlock(&pgdat->numabalancing_migrate_lock);
}
-   if (pgdat->numabalancing_migrate_nr_pages > ratelimit_pages)
+   if (pgdat->numabalancing_migrate_nr_pages > ratelimit_pages) {
+   trace_mm_numa_migrate_ratelimit(current, pgdat->node_id,
+   nr_pages);
return true;
+   }
 
/*
 * This is an unlocked non-atomic update so errors are possible.
-- 
1.8.4



[PATCH 17/18] sched: Tracepoint task movement

2013-12-08 Thread Mel Gorman
move_task() is called from move_one_task and move_tasks and is an
approximation of load balancer activity. We should be able to track
tasks that move between CPUs frequently. If the tracepoint included node
information then we could distinguish between in-node and between-node
traffic for load balancer decisions. The tracepoint allows us to track
local migrations, remote migrations and average task migrations.
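
To illustrate how the node fields allow that split, here is a small
user-space sketch. It assumes a hypothetical topology in which CPUs are
numbered contiguously with a fixed number of CPUs per node; the kernel
obtains the node from cpu_to_node() instead, and the toy_* names are made up
for the example.

```c
#include <assert.h>

enum toy_move_kind {
    TOY_MOVE_LOCAL,     /* source and destination CPU share a node */
    TOY_MOVE_REMOTE     /* task crossed a node boundary */
};

/* Hypothetical topology: cpus_per_node consecutive CPUs form one node. */
static int toy_cpu_to_node(int cpu, int cpus_per_node)
{
    assert(cpu >= 0 && cpus_per_node > 0);
    return cpu / cpus_per_node;
}

/* Classify one sched_move_task event by comparing the two node ids. */
static enum toy_move_kind toy_classify_move(int src_cpu, int dst_cpu,
                                            int cpus_per_node)
{
    int src_nid = toy_cpu_to_node(src_cpu, cpus_per_node);
    int dst_nid = toy_cpu_to_node(dst_cpu, cpus_per_node);

    return src_nid == dst_nid ? TOY_MOVE_LOCAL : TOY_MOVE_REMOTE;
}
```

Counting the two kinds separately over a trace gives the local and remote
migration totals mentioned above.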

Signed-off-by: Mel Gorman 
---
 include/trace/events/sched.h | 35 +++
 kernel/sched/fair.c          |  2 ++
 2 files changed, 37 insertions(+)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 04c3084..cf1694c 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -443,6 +443,41 @@ TRACE_EVENT(sched_process_hang,
 );
 #endif /* CONFIG_DETECT_HUNG_TASK */
 
+/*
+ * Tracks migration of tasks from one runqueue to another. Can be used to
+ * detect if automatic NUMA balancing is bouncing between nodes
+ */
+TRACE_EVENT(sched_move_task,
+
+   TP_PROTO(struct task_struct *tsk, int src_cpu, int dst_cpu),
+
+   TP_ARGS(tsk, src_cpu, dst_cpu),
+
+   TP_STRUCT__entry(
+   __field( pid_t, pid )
+   __field( pid_t, tgid)
+   __field( pid_t, ngid)
+   __field( int,   src_cpu )
+   __field( int,   src_nid )
+   __field( int,   dst_cpu )
+   __field( int,   dst_nid )
+   ),
+
+   TP_fast_assign(
+   __entry->pid= task_pid_nr(tsk);
+   __entry->tgid   = task_tgid_nr(tsk);
+   __entry->ngid   = task_numa_group_id(tsk);
+   __entry->src_cpu= src_cpu;
+   __entry->src_nid= cpu_to_node(src_cpu);
+   __entry->dst_cpu= dst_cpu;
+   __entry->dst_nid= cpu_to_node(dst_cpu);
+   ),
+
+   TP_printk("pid=%d tgid=%d ngid=%d src_cpu=%d src_nid=%d dst_cpu=%d dst_nid=%d",
+   __entry->pid, __entry->tgid, __entry->ngid,
+   __entry->src_cpu, __entry->src_nid,
+   __entry->dst_cpu, __entry->dst_nid)
+);
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1ce1615..41021c8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4770,6 +4770,8 @@ static void move_task(struct task_struct *p, struct lb_env *env)
set_task_cpu(p, env->dst_cpu);
activate_task(env->dst_rq, p, 0);
check_preempt_curr(env->dst_rq, p, 0);
+
+   trace_sched_move_task(p, env->src_cpu, env->dst_cpu);
 }
 
 /*
-- 
1.8.4



[PATCH 00/18] NUMA balancing segmentation fault fixes and misc followups v3

2013-12-08 Thread Mel Gorman
Alex Thorlton reported segmentation faults when NUMA balancing is enabled
on large machines. There is no obvious explanation from the console of what
the problem is, but similar problems have been observed by Rik van Riel and
myself if migration was aggressive enough. Alex, this series is against
3.13-rc2, a verification that the fix addresses your problem would be
appreciated.

This series starts with a range of patches aimed at addressing the
segmentation fault problem while offsetting some of the cost to avoid badly
regressing performance in -stable. Those that are cc'd to stable (patches
1-12) should be merged ASAP. The rest of the series is relatively minor
stuff that fell out during the course of development that is ok to wait
for the next merge window but should help with the continued development
of NUMA balancing.

 arch/sparc/include/asm/pgtable_64.h |   4 +-
 arch/x86/include/asm/pgtable.h      |  11 +++-
 arch/x86/mm/gup.c                   |  13 +
 include/asm-generic/pgtable.h       |   2 +-
 include/linux/migrate.h             |   9
 include/linux/mm_types.h            |  44 +++
 include/linux/mmzone.h              |   5 +-
 include/trace/events/migrate.h      |  26 +
 include/trace/events/sched.h        |  93
 kernel/fork.c                       |   1 +
 kernel/sched/core.c                 |   2 +
 kernel/sched/fair.c                 |  15 +-
 mm/huge_memory.c                    |  45
 mm/migrate.c                        | 103
 mm/mprotect.c                       |  15 --
 mm/pgtable-generic.c                |   8 ++-
 16 files changed, 348 insertions(+), 48 deletions(-)

-- 
1.8.4



[PATCH 02/18] mm: numa: Call MMU notifiers on THP migration

2013-12-08 Thread Mel Gorman
MMU notifiers must be called on THP page migration or secondary MMUs will
get very confused.

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 mm/migrate.c | 22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 2cabbd5..be787d5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -1716,12 +1717,13 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
struct page *page, int node)
 {
spinlock_t *ptl;
-   unsigned long haddr = address & HPAGE_PMD_MASK;
pg_data_t *pgdat = NODE_DATA(node);
int isolated = 0;
struct page *new_page = NULL;
struct mem_cgroup *memcg = NULL;
int page_lru = page_is_file_cache(page);
+   unsigned long mmun_start = address & HPAGE_PMD_MASK;
+   unsigned long mmun_end = mmun_start + HPAGE_PMD_SIZE;
pmd_t orig_entry;
 
/*
@@ -1756,10 +1758,12 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
WARN_ON(PageLRU(new_page));
 
/* Recheck the target PMD */
+   mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
ptl = pmd_lock(mm, pmd);
if (unlikely(!pmd_same(*pmd, entry) || page_count(page) != 2)) {
 fail_putback:
spin_unlock(ptl);
+   mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
 
/* Reverse changes made by migrate_page_copy() */
if (TestClearPageActive(new_page))
@@ -1800,15 +1804,16 @@ fail_putback:
 * The SetPageUptodate on the new page and page_add_new_anon_rmap
 * guarantee the copy is visible before the pagetable update.
 */
-   flush_cache_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
-   page_add_new_anon_rmap(new_page, vma, haddr);
-   pmdp_clear_flush(vma, haddr, pmd);
-   set_pmd_at(mm, haddr, pmd, entry);
+   flush_cache_range(vma, mmun_start, mmun_end);
+   page_add_new_anon_rmap(new_page, vma, mmun_start);
+   pmdp_clear_flush(vma, mmun_start, pmd);
+   set_pmd_at(mm, mmun_start, pmd, entry);
+   flush_tlb_range(vma, mmun_start, mmun_end);
update_mmu_cache_pmd(vma, address, );
 
if (page_count(page) != 2) {
-   set_pmd_at(mm, haddr, pmd, orig_entry);
-   flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
+   set_pmd_at(mm, mmun_start, pmd, orig_entry);
+   flush_tlb_range(vma, mmun_start, mmun_end);
update_mmu_cache_pmd(vma, address, );
page_remove_rmap(new_page);
goto fail_putback;
@@ -1823,6 +1828,7 @@ fail_putback:
 */
mem_cgroup_end_migration(memcg, page, new_page, true);
spin_unlock(ptl);
+   mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
 
unlock_page(new_page);
unlock_page(page);
@@ -1843,7 +1849,7 @@ out_dropref:
ptl = pmd_lock(mm, pmd);
if (pmd_same(*pmd, entry)) {
entry = pmd_mknonnuma(entry);
-   set_pmd_at(mm, haddr, pmd, entry);
+   set_pmd_at(mm, mmun_start, pmd, entry);
update_mmu_cache_pmd(vma, address, );
}
spin_unlock(ptl);
-- 
1.8.4



[PATCH 06/18] mm: numa: Ensure anon_vma is locked to prevent parallel THP splits

2013-12-08 Thread Mel Gorman
The anon_vma lock prevents parallel THP splits and any associated complexity
that arises when handling splits during THP migration. This patch checks
if the lock was successfully acquired and bails from THP migration if it
failed for any reason.

Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman 
---
 mm/huge_memory.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5a5da50..0f00b96 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1359,6 +1359,13 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
goto out_unlock;
}
 
+   /* Bail if we fail to protect against THP splits for any reason */
+   if (unlikely(!anon_vma)) {
+   put_page(page);
+   page_nid = -1;
+   goto clear_pmdnuma;
+   }
+
/*
 * Migrate the THP to the requested node, returns with page unlocked
 * and pmd_numa cleared.
-- 
1.8.4



Re: [PATCH 09/14] tools lib traceevent: Get rid of die() in add_right()

2013-12-08 Thread Namhyung Kim
Hi Ilia,

On Mon, 9 Dec 2013 01:28:26 -0500, Ilia Mirkin wrote:
> On Mon, Dec 9, 2013 at 12:34 AM, Namhyung Kim  wrote:
>> Signed-off-by: Namhyung Kim 
>> ---
>>  tools/lib/traceevent/parse-filter.c | 12 +---
>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
>> index 5efe66a682bd..a1ad609a860f 100644
>> --- a/tools/lib/traceevent/parse-filter.c
>> +++ b/tools/lib/traceevent/parse-filter.c
>> @@ -583,12 +583,18 @@ static int add_right(struct filter_arg *op, struct filter_arg *arg,
>> op->str.type = op_type;
>> op->str.field = left->field.field;
>> op->str.val = strdup(str);
>> -   if (!op->str.val)
>> -   die("malloc string");
>> +   if (!op->str.val) {
>> +   show_error(error_str, "Failed to allocate string filter");
>> +   return -1;
>> +   }
>> /*
>>  * Need a buffer to copy data for tests
>>  */
>> -   op->str.buffer = malloc_or_die(op->str.field->size + 1);
>> +   op->str.buffer = malloc(op->str.field->size + 1);
>> +   if (op->str.buffer) {
>
> That should probably be
>
> if (!op->str.buffer)

Argh.. you're right!

>
> Also, should you free op->str.val? Perhaps the surrounding code takes
> care of that.

Yeah, it'll be handled by the caller - process_filter().

Thanks for the quick review!
Namhyung

>
>> +   show_error(error_str, "Failed to allocate string filter");
>> +   return -1;
>> +   }
>> /* Null terminate this buffer */
>> op->str.buffer[op->str.field->size] = 0;
>>
>> --
>> 1.7.11.7
>>
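
The error-handling pattern discussed in this thread can be sketched in
user-space C as follows. This is only an illustration, not the traceevent
code: struct str_filter, str_filter_init() and str_filter_release() are
hypothetical names, and the release helper stands in for the cleanup that
the caller (process_filter()) is said to perform above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the string part of a filter argument. */
struct str_filter {
	char *val;     /* private copy of the string to match */
	char *buffer;  /* scratch buffer for field data, size + 1 bytes */
	size_t size;   /* field size in bytes */
};

/*
 * Return 0 on success, -1 on allocation failure. Nothing is freed here
 * on failure; the caller is expected to call str_filter_release().
 */
int str_filter_init(struct str_filter *f, const char *str, size_t field_size)
{
	size_t len = strlen(str) + 1;

	f->val = NULL;
	f->buffer = NULL;
	f->size = field_size;

	f->val = malloc(len);
	if (!f->val)
		return -1;
	memcpy(f->val, str, len);

	f->buffer = malloc(field_size + 1);
	if (!f->buffer)		/* note the negation: fail when NULL */
		return -1;

	/* Null terminate the scratch buffer */
	f->buffer[field_size] = 0;
	return 0;
}

/* Caller-side cleanup; safe to call after a failed init. */
void str_filter_release(struct str_filter *f)
{
	free(f->val);
	free(f->buffer);
	f->val = NULL;
	f->buffer = NULL;
}
```

Because the init function leaves the partially filled structure intact on
failure, a single release call in the caller covers both the success and
the error path.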


Re: [PATCH] sched: Revert need_resched() to look at TIF_NEED_RESCHED

2013-12-08 Thread Aneesh Kumar K.V
Peter Zijlstra  writes:

> Subject: sched: Revert need_resched() to look at TIF_NEED_RESCHED
> From: Peter Zijlstra 
> Date: Fri Sep 27 17:20:30 CEST 2013
>
> Yuanhan reported a serious throughput regression in his pigz
> benchmark. Using the ftrace patch I found that several idle paths
> need more TLC before we can switch the generic need_resched() over to
> preempt_need_resched.
>
> The preemption paths benefit most from preempt_need_resched and do
> indeed use it; all other need_resched() users don't really care that
> much so reverting need_resched() back to tif_need_resched() is the
> simple and safe solution.
>
> Reported-by: Yuanhan Liu 
> Signed-off-by: Peter Zijlstra 

I guess we still need to fix should_resched(); I am hitting the soft lockup
below with upstream:


BUG: soft lockup - CPU#0 stuck for 23s! [qemu-system-ppc:4394]
Modules linked in:
CPU: 0 PID: 4394 Comm: qemu-system-ppc Not tainted 3.13.0-rc3+ #98
task: c001d0788400 ti: c001dca0 task.ti: c001dca0
NIP: c082dd80 LR: c0081ae0 CTR: c0062ba0
REGS: c001dca02f70 TRAP: 0901   Not tainted  (3.13.0-rc3+)
MSR: 80009032   CR: 24822024  XER: 
CFAR: c0081adc SOFTE: 1
GPR00: c0081ae0 c001dca031f0 c0d67ab0 0001
GPR04: 7102 0001 0189a0d786b7 018c
GPR08: 0001   c0da
GPR12: 0c00 cfef
NIP [c082dd80] ._cond_resched+0x0/0x40
LR [c0081ae0] .kvmppc_prepare_to_enter+0x2a0/0x2e0
Call Trace:
[c001dca031f0] [c0081ae0] .kvmppc_prepare_to_enter+0x2a0/0x2e0 (unreliable)
[c001dca03290] [c008f2cc] .kvmppc_handle_exit_pr+0xec/0xa40
[c001dca03340] [c00918c4] kvm_start_lightweight+0xac/0xbc
[c001dca03510] [c008efe0] .kvmppc_vcpu_run_pr+0x130/0x2a0
[c001dca039e0] [c00855bc] .kvmppc_vcpu_run+0x2c/0x40
[c001dca03a50] [c0082c94] .kvm_arch_vcpu_ioctl_run+0x54/0x1b0
[c001dca03ae0] [c007d5f8] .kvm_vcpu_ioctl+0x478/0x740
[c001dca03ca0] [c0218864] .do_vfs_ioctl+0x4a4/0x760
[c001dca03d80] [c0218b78] .SyS_ioctl+0x58/0xb0
[c001dca03e30] [c0009e58] syscall_exit+0x0/0x98
Instruction dump:
e92d0260 e94911c0 812a0004 5529f07e 5529103e 912a0004 38210080 e8010010
ebc1fff0 ebe1fff8 7c0803a6 4e800020 <7c0802a6> 3860 f8010010 f821ff91 


The revert that Alexander tried included those changes:

http://mid.gmane.org/20131128132641.gp10...@twins.programming.kicks-ass.net


> ---
>  arch/x86/include/asm/preempt.h |8 
>  include/asm-generic/preempt.h  |8 
>  include/linux/sched.h  |5 +
>  3 files changed, 5 insertions(+), 16 deletions(-)
>
> --- a/arch/x86/include/asm/preempt.h
> +++ b/arch/x86/include/asm/preempt.h
> @@ -80,14 +80,6 @@ static __always_inline bool __preempt_co
>  }
>  
>  /*
> - * Returns true when we need to resched -- even if we can not.
> - */
> -static __always_inline bool need_resched(void)
> -{
> - return unlikely(test_preempt_need_resched());
> -}
> -
> -/*
>   * Returns true when we need to resched and can (barring IRQ state).
>   */
>  static __always_inline bool should_resched(void)
> --- a/include/asm-generic/preempt.h
> +++ b/include/asm-generic/preempt.h
> @@ -85,14 +85,6 @@ static __always_inline bool __preempt_co
>  }
>  
>  /*
> - * Returns true when we need to resched -- even if we can not.
> - */
> -static __always_inline bool need_resched(void)
> -{
> - return unlikely(test_preempt_need_resched());
> -}
> -
> -/*
>   * Returns true when we need to resched and can (barring IRQ state).
>   */
>  static __always_inline bool should_resched(void)
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2577,6 +2577,11 @@ static inline bool __must_check current_
>  }
>  #endif
>  
> +static __always_inline bool need_resched(void)
> +{
> + return unlikely(tif_need_resched());
> +}
> +
>  /*
>   * Thread group CPU time accounting.
>   */

-aneesh



Re: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out

2013-12-08 Thread Ethan Zhao
Nick,
You could try 7.3.21-k8-NAPI in tree or the out-of-tree version as
Bjorn mentioned.
Reading and debugging an old version of the driver is not an interesting
thing for anybody to do.

Thanks,
Ethan

On Tue, Dec 3, 2013 at 9:33 PM, Nick Pegg  wrote:
> On Mon, Dec 2, 2013 at 10:51 PM, Ethan Zhao  wrote:
>> Bjorn,
>>Seems not the same bug as  http://sourceforge.net/p/e1000/bugs/367/
>> ,  Nick is not running his kernel on bare metal, per the error log,
>> he runs his kernel as HVM DomU guest or Dom0 on XEN ?  so just a check
>> of NULL will not fix that.
>>
>
> Sorry, I neglected to say in my original email that the kernel is
> running as a Xen Dom0. Per Todd's request, I've opened a bug report on
> sourceforge and will follow up with this issue there:
> https://sourceforge.net/p/e1000/bugs/385/
>
> Thanks,
> Nick


linux-next: Tree for Dec 9

2013-12-08 Thread Stephen Rothwell
Hi all,

Changes since 20131206:

The powerpc tree gained a build failure for which I reverted some commits.

The crypto tree still had its build failure so I used the version from
next-20131205.

The driver-core tree gained a conflict against the driver-core.current
tree.

The usb-gadget tree gained a build failure so I used the version from
next-20131206.

The pinctrl tree gained a build failure so I used the version from
next-20131206.

Non-merge commits (relative to Linus' tree): 2894
 3313 files changed, 117925 insertions(+), 76643 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final
link) and i386, sparc, sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

I am currently merging 210 trees (counting Linus' and 29 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell 

$ git checkout master
$ git reset --hard stable
Merging origin/master (374b105797c3 Linux 3.13-rc3)
Merging fixes/master (8ae516aa8b81 Merge tag 'trace-fixes-v3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace)
Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" not depend on vmlinux)
Merging arc-current/for-curr (da990a4f2d5a ARC: [perf] Fix a few thinkos)
Merging arm-current/fixes (11d4bb1bd067 ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation)
Merging m68k-current/for-linus (77a42796786c m68k: Remove deprecated IRQF_DISABLED)
Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2)
Merging powerpc-merge/merge (721cb59e9d95 powerpc/windfarm: Fix XServe G5 fan control Makefile issue)
Merging sparc/master (1de425c7b271 sparc64: Fix build regression)
Merging net/master (98bfd23cdb30 virtio-net: free bufs correctly on invalid packet length)
Merging ipsec/master (239c78db9c41 net: clear local_df when passing skb between namespaces)
Merging sound-current/for-linus (31660e9084df ALSA: hda - Remove quirk for Dell Vostro 131)
Merging pci-current/for-linus (4bff6749905d PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev())
Merging wireless/master (a59b40b30f3f Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211)
Merging driver-core.current/driver-core-linus (a8b14744429f sysfs: give different locking key to regular and bin files)
Merging tty.current/tty-linus (39434abd942c n_tty: Fix missing newline echo)
Merging usb.current/usb-linus (ea4215893884 Revert "usb: xhci: Link TRB must not occur within a USB payload burst")
Merging staging.current/staging-linus (55ef003e4ae6 Merge tag 'iio-fixes-for-3.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus)
Merging char-misc.current/char-misc-linus (76a9635979e5 mei: add 9 series PCH mei device ids)
Merging input-current/for-linus (95f75e91588a Input: ALPS - add support for DualPoint device on Dell XT2 model)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe)
Merging crypto-current/master (8ec25c512916 crypto: testmgr - fix sglen in test_aead for case 'dst != src')
Merging ide/master (c2f7d1e103ef ide: pmac: remove unnecessary pci_set_drvdata())
Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff)
Merging sh-current/sh-fixes-for-linus (44033109e99c SH: Convert out[bwl] macros to inline functions)
Merging devicetree-current/devicetree/merge (1931ee143b0a Revert "drivers: 

linux-next: build failure after merge of the final tree (powerpc tree related)

2013-12-08 Thread Stephen Rothwell
Hi all,

After merging the final tree, today's linux-next build (powerpc
allyesconfig) failed like this:

arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
arch/powerpc/kernel/exceptions-64s.S:958: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:959: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:983: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:984: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1003: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1013: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1014: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1015: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1016: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1017: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1018: Error: attempt to move .org backwards

Caused by commit 1e9b4507ed98 ("powerpc/book3s: handle machine check in
Linux host").

I have reverted these commits (possibly some of these reverts are
unnecessary):

b63a0ffe35de "powerpc/powernv: Machine check exception handling"
28446de2ce99 "powerpc/powernv: Remove machine check handling in OPAL"
b5ff4211a829 "powerpc/book3s: Queue up and process delayed MCE events"
36df96f8acaf "powerpc/book3s: Decode and save machine check event"
ae744f3432d3 "powerpc/book3s: Flush SLB/TLBs if we get SLB/TLB machine check errors on power8"
e22a22740c1a "powerpc/book3s: Flush SLB/TLBs if we get SLB/TLB machine check errors on power7"
0440705049b0 "powerpc/book3s: Add flush_tlb operation in cpu_spec"
4c703416efc0 "powerpc/book3s: Introduce a early machine check hook in cpu_spec"
1c51089f777b "powerpc/book3s: Return from interrupt if coming from evil context"
1e9b4507ed98 "powerpc/book3s: handle machine check in Linux host"

-- 
Cheers,
Stephen Rothwell 




[PATCH 13/17] tracing/uprobes: Pass 'is_return' to traceprobe_parse_probe_arg()

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Currently uprobes don't pass is_return to the argument parser, so they
cannot make use of the "$retval" fetch method, which only works for
return probes.
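
The effect of the flag can be sketched in plain user-space C. This is a
simplified illustration under stated assumptions, not the kernel parser:
parse_probe_var() and enum fetch_kind are hypothetical names, and only two
variables are handled.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum fetch_kind { FETCH_NONE, FETCH_RETVAL, FETCH_STACK };

/*
 * Parse a "$..." probe variable. Returns 0 and sets *kind on success,
 * -1 if the variable is unknown or not valid for this probe type.
 */
int parse_probe_var(const char *arg, bool is_return, enum fetch_kind *kind)
{
	*kind = FETCH_NONE;

	if (strcmp(arg, "$retval") == 0) {
		/* A return value only exists when probing a function return */
		if (!is_return)
			return -1;
		*kind = FETCH_RETVAL;
		return 0;
	}

	if (strcmp(arg, "$stack") == 0) {
		*kind = FETCH_STACK;
		return 0;
	}

	return -1;
}
```

With is_return hard-coded to false, as in the caller before this patch,
"$retval" is always rejected even for a return probe.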

Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_uprobe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 89dd346865ad..d407a556aa55 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -512,7 +512,7 @@ static int create_trace_uprobe(int argc, char **argv)
 
/* Parse fetch argument */
ret = traceprobe_parse_probe_arg(arg, >tp.size, parg,
-false, false);
+is_return, false);
if (ret) {
pr_info("Parse error at argument[%d]. (%d)\n", i, ret);
goto error;
-- 
1.7.11.7



Re: [PATCH] sched: Remove PREEMPT_NEED_RESCHED from generic code

2013-12-08 Thread Benjamin Herrenschmidt
On Thu, 2013-11-28 at 14:26 +0100, Peter Zijlstra wrote:
> Subject: sched: Remove PREEMPT_NEED_RESCHED from generic code
> 
> While hunting a preemption issue with Alexander, Ben noticed that the
> currently generic PREEMPT_NEED_RESCHED stuff is horribly broken for
> load-store architectures.

Hi Peter !

Are you sending this to Linus ? From what I can tell the bug is still
upstream ...

Cheers,
Ben.




Re: [PATCH] Add: (1) Detection for newer Elantech touchpads, so that kernel doesn't fall-back to default PS/2 driver. (2) Enable hardware version 4 touchpad right click function.

2013-12-08 Thread Dmitry Torokhov
Hi Duson.


On Mon, Dec 09, 2013 at 10:59:50AM +0800, Duson Lin wrote:
> Modify:
> (1) crc_enabled is only supported on v3 and v4 touchpads, so initialize
> crc_enabled to false first and check this hardware flag only when
> hw_version is 3 or 4.

It looks to me there are several fixes rolled up together in this patch:

1. Support for the new hardware signatures (8 and 10, although I already
applied patch for 8)
2. Fix to handle CRC check
3. Changes to report BTN_RIGHT.

Could you please split them up?

Also, I am not sure if we can simply start reporting BTN_RIGHT as
present, even on devices that don't actually have the right button, as
this will interfere with userspace providing emulation for multiple
buttons.

Is it possible to determine whether a given model has a right button or not?

Thanks.

> 
> Signed-off-by: Duson Lin 
> ---
>  drivers/input/mouse/elantech.c |   30 ++
>  drivers/input/mouse/elantech.h |2 +-
>  2 files changed, 15 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/input/mouse/elantech.c b/drivers/input/mouse/elantech.c
> index 8551dca..b3627cf 100644
> --- a/drivers/input/mouse/elantech.c
> +++ b/drivers/input/mouse/elantech.c
> @@ -1,5 +1,5 @@
>  /*
> - * Elantech Touchpad driver (v6)
> + * Elantech Touchpad driver (v7)
>   *
>   * Copyright (C) 2007-2009 Arjan Opmeer 
>   *
> @@ -486,6 +486,7 @@ static void elantech_input_sync_v4(struct psmouse *psmouse)
>   unsigned char *packet = psmouse->packet;
>  
>   input_report_key(dev, BTN_LEFT, packet[0] & 0x01);
> + input_report_key(dev, BTN_RIGHT, packet[0] & 0x02);
>   input_mt_report_pointer_emulation(dev, true);
>   input_sync(dev);
>  }
> @@ -1047,9 +1048,7 @@ static int elantech_set_input_params(struct psmouse *psmouse)
>*/
>   psmouse_warn(psmouse, "couldn't query resolution data.\n");
>   }
> - /* v4 is clickpad, with only one button. */
>   __set_bit(INPUT_PROP_BUTTONPAD, dev->propbit);
> - __clear_bit(BTN_RIGHT, dev->keybit);
>   __set_bit(BTN_TOOL_QUADTAP, dev->keybit);
>   /* For X to recognize me as touchpad. */
>   input_set_abs_params(dev, ABS_X, x_min, x_max, 0, 0);
> @@ -1186,19 +1185,12 @@ static struct attribute_group elantech_attr_group = {
>  
>  static bool elantech_is_signature_valid(const unsigned char *param)
>  {
> - static const unsigned char rates[] = { 200, 100, 80, 60, 40, 20, 10 };
> - int i;
> -
>   if (param[0] == 0)
>   return false;
>  
>   if (param[1] == 0)
>   return true;
>  
> - for (i = 0; i < ARRAY_SIZE(rates); i++)
> - if (param[2] == rates[i])
> - return false;
> -
>   return true;
>  }
>  
> @@ -1298,6 +1290,14 @@ static int elantech_set_properties(struct elantech_data *etd)
>  {
>   /* This represents the version of IC body. */
>   int ver = (etd->fw_version & 0x0f) >> 16;
> + /*
> +  * The signatures of v3 and v4 packets change depending on the
> +  * value of this hardware flag. But the v1 and v2 have not crc
> +  * check mechanism and the same hardware flag are also defined
> +  * as other function. So crc_enabled must be initialized as false 
> +  * first and checking by different hw_version.
> +  */
> + etd->crc_enabled = false;
>  
>   /* Early version of Elan touchpads doesn't obey the rule. */
>   if (etd->fw_version < 0x020030 || etd->fw_version == 0x020600)
> @@ -1309,10 +1309,14 @@ static int elantech_set_properties(struct elantech_data *etd)
>   etd->hw_version = 2;
>   break;
>   case 5:
> + etd->crc_enabled = ((etd->fw_version & 0x4000) == 0x4000);
>   etd->hw_version = 3;
>   break;
>   case 6:
>   case 7:
> + case 8:
> + case 10:
> + etd->crc_enabled = ((etd->fw_version & 0x4000) == 0x4000);
>   etd->hw_version = 4;
>   break;
>   default:
> @@ -1343,12 +1347,6 @@ static int elantech_set_properties(struct elantech_data *etd)
>   etd->reports_pressure = true;
>   }
>  
> - /*
> -  * The signatures of v3 and v4 packets change depending on the
> -  * value of this hardware flag.
> -  */
> - etd->crc_enabled = ((etd->fw_version & 0x4000) == 0x4000);
> -
>   return 0;
>  }
>  
> diff --git a/drivers/input/mouse/elantech.h b/drivers/input/mouse/elantech.h
> index 036a04a..c963ac8 100644
> --- a/drivers/input/mouse/elantech.h
> +++ b/drivers/input/mouse/elantech.h
> @@ -1,5 +1,5 @@
>  /*
> - * Elantech Touchpad driver (v6)
> + * Elantech Touchpad driver (v7)
>   *
>   * Copyright (C) 2007-2009 Arjan Opmeer 
>   *
> -- 
> 1.7.10.4
> 

-- 
Dmitry

Re: [PATCH 09/14] tools lib traceevent: Get rid of die() in add_right()

2013-12-08 Thread Ilia Mirkin
On Mon, Dec 9, 2013 at 12:34 AM, Namhyung Kim  wrote:
> Signed-off-by: Namhyung Kim 
> ---
>  tools/lib/traceevent/parse-filter.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
> index 5efe66a682bd..a1ad609a860f 100644
> --- a/tools/lib/traceevent/parse-filter.c
> +++ b/tools/lib/traceevent/parse-filter.c
> @@ -583,12 +583,18 @@ static int add_right(struct filter_arg *op, struct filter_arg *arg,
> op->str.type = op_type;
> op->str.field = left->field.field;
> op->str.val = strdup(str);
> -   if (!op->str.val)
> -   die("malloc string");
> +   if (!op->str.val) {
> +   show_error(error_str, "Failed to allocate string filter");
> +   return -1;
> +   }
> /*
>  * Need a buffer to copy data for tests
>  */
> -   op->str.buffer = malloc_or_die(op->str.field->size + 1);
> +   op->str.buffer = malloc(op->str.field->size + 1);
> +   if (op->str.buffer) {

That should probably be

if (!op->str.buffer)

Also, should you free op->str.val? Perhaps the surrounding code takes
care of that.

> +   show_error(error_str, "Failed to allocate string filter");
> +   return -1;
> +   }
> /* Null terminate this buffer */
> op->str.buffer[op->str.field->size] = 0;
>
> --
> 1.7.11.7
>


[PATCH 02/17] tracing/probes: Fix basic print type functions

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

The print format of s32 type was "ld" and it's casted to "long".  So
it turned out to print 4294967295 for "-1" on 64-bit systems.  Not
sure whether it worked well on 32-bit systems.

Anyway, it doesn't need to have cast argument at all since it already
casted using type pointer - just get rid of it.  Thanks to Oleg for
pointing that out.

And print 0x prefix for unsigned type as it shows hex numbers.
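
A user-space sketch of the resulting template, for illustration only. It
assumes snprintf() in place of trace_seq_printf() and uses the <inttypes.h>
format macros instead of the kernel's "%Lx"/"%Ld"; DEFINE_PRINT_FUNC and
the print_<type>() names are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint8_t  u8;
typedef uint32_t u32;
typedef uint64_t u64;
typedef int32_t  s32;

/*
 * Generate print_<type>(): format " name=value" into buf by reading the
 * value through a pointer of the matching type. No extra cast argument
 * is needed; integer promotion takes care of the narrow types.
 */
#define DEFINE_PRINT_FUNC(type, fmt)					\
int print_##type(char *buf, size_t len, const char *name,		\
		 const void *data)					\
{									\
	return snprintf(buf, len, " %s=" fmt, name,			\
			*(const type *)data);				\
}

DEFINE_PRINT_FUNC(u8,  "0x%x")
DEFINE_PRINT_FUNC(u32, "0x%" PRIx32)
DEFINE_PRINT_FUNC(u64, "0x%" PRIx64)
DEFINE_PRINT_FUNC(s32, "%" PRId32)
```

Calling print_s32() on a variable holding -1 formats it as "-1", and the
unsigned variants carry the 0x prefix described above.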

Suggested-by: Oleg Nesterov 
Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_probe.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 412e959709b4..430505b08a6f 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -40,23 +40,23 @@ const char *reserved_field_names[] = {
 #define PRINT_TYPE_FMT_NAME(type)  print_type_format_##type
 
 /* Printing  in basic type function template */
-#define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt, cast)  \
+#define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt)   \
 static __kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s,   \
const char *name,   \
-   void *data, void *ent)\
+   void *data, void *ent)  \
 {  \
-   return trace_seq_printf(s, " %s=" fmt, name, (cast)*(type *)data);\
+   return trace_seq_printf(s, " %s=" fmt, name, *(type *)data);\
 }  \
 static const char PRINT_TYPE_FMT_NAME(type)[] = fmt;
 
-DEFINE_BASIC_PRINT_TYPE_FUNC(u8, "%x", unsigned int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%x", unsigned int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "%lx", unsigned long)
-DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%llx", unsigned long long)
-DEFINE_BASIC_PRINT_TYPE_FUNC(s8, "%d", int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d", int)
-DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%ld", long)
-DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%lld", long long)
+DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "0x%x")
+DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "0x%x")
+DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "0x%x")
+DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "0x%Lx")
+DEFINE_BASIC_PRINT_TYPE_FUNC(s8,  "%d")
+DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d")
+DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%d")
+DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%Ld")
 
 static inline void *get_rloc_data(u32 *dl)
 {
-- 
1.7.11.7



[PATCHSET 00/17] tracing/uprobes: Add support for more fetch methods (v8)

2013-12-08 Thread Namhyung Kim
Hello,

This patchset implements memory (address), stack[N], dereference,
bitfield, retval (it needs uretprobe though) and file_offset fetch
methods for uprobes.  It's based on the previous work [1] done by
Hyeoncheol Lee.

Now kprobes and uprobes have their own fetch_type_tables and, in turn,
memory and stack access methods.  The symbol and file_offset fetch
methods are only available to kprobes and uprobes, respectively.
Other fetch methods are shared.

For the file_offset method, it translates the offset argument to a
virtual address in a process.  To do that, it calculates the base mapping
address using the probe address (utask->vaddr) and the probe offset
(tu->offset) and then adds the argument offset.  That information is
carried via utask and a new fetch parameter.

The syntax is '@+offset', where offset is an address relative to the
base address.  For shared libraries, it is simply the st_value of the
symbol in the ELF file.  For an executable, the base load address
(e.g. 0x400000 for x86_64) must be subtracted from the symbol value.
Please see the previous discussion for an example [2].  Note that the
syntax changed from plain '@' to '@+'.  The plain '@addr' syntax is used
for accessing an absolute memory address if you already know the exact
address.
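
As a rough sketch of the arithmetic above: the value to put after '@+'
is the symbol value minus the base load address.  The helper name is
hypothetical, and the 0x400000 base used in the note below is an
assumption taken from the example addresses in this mail.

```c
#include <assert.h>

/*
 * Hypothetical helper: compute the '@+offset' argument for a symbol.
 *
 * st_value  - symbol value as printed by nm(1)
 * load_base - base load address of the binary; 0 for a shared library,
 *             where st_value can be used as-is
 */
static unsigned long symbol_to_file_offset(unsigned long st_value,
					   unsigned long load_base)
{
	/* A symbol below the load base would indicate a wrong base. */
	assert(st_value >= load_base);

	return st_value - load_base;
}
```

For the 'str' symbol in the example below (0x6008ac in an executable
loaded at 0x400000) this gives 0x2008ac, which matches the
'str=@+0x2008ac:string' argument passed to perf probe.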

Many thanks to Oleg, who provided valuable feedback and suggestions.

Patches 1-2 are bug fixes and can be applied independently.
Patch 16 is preparation for patch 17, which implements the
file_offset fetch method.


 * v8 changes:
  - rename tk, tu and tp more consistently (Srikar)
  - change prefix format specifier: %#x -> 0x%x (Masami)
  - convert file_offset_param to uprobe_dispatch_data (Oleg)
  - add more Ack's from Srikar and Masami

 * v7 changes:
  - restructure patches not to break build with !CONFIG_[KU]PROBE_EVENT
  - print 0x prefix for unsigned types
  - add @+file_offset fetch method (Oleg)
  - get rid of uprobe_buffer_mutex (Oleg)
  - pass 'is_return' to uprobes argument parser
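
One visible difference between the two prefix styles mentioned in the
v8 changes is how zero is printed.  The changelog does not state the
reason for the switch, so the following is just a userspace
illustration of standard printf behaviour, with helper names invented
for this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Format with the alternate-form flag: the prefix is added only to non-zero values. */
static void format_alt_form(char *buf, size_t len, unsigned int val)
{
	int n = snprintf(buf, len, "%#x", val);

	assert(n >= 0 && (size_t)n < len);
}

/* Format with a literal prefix: the prefix is always present. */
static void format_literal_prefix(char *buf, size_t len, unsigned int val)
{
	int n = snprintf(buf, len, "0x%x", val);

	assert(n >= 0 && (size_t)n < len);
}
```

For the value 0 the first helper produces "0" while the second
produces "0x0"; for non-zero values both produce the same text.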


[1] https://lkml.org/lkml/2012/11/14/84
[2] https://lkml.org/lkml/2013/11/5/25

A simple example:

  # cat foo.c
  int glob = -1;
  char str[] = "hello uprobe.";

  struct foo {
    unsigned int unused: 2;
    unsigned int foo: 20;
    unsigned int bar: 10;
  } foo = {
    .foo = 5,
  };

  int main(int argc, char *argv[])
  {
    long local = 0x1234;

    return 127;
  }

  # gcc -o foo -g foo.c

  # objdump -d foo | grep -A9 -F '<main>'
  004004b0 <main>:
    4004b0: 55                      push   %rbp
    4004b1: 48 89 e5                mov    %rsp,%rbp
    4004b4: 89 7d ec                mov    %edi,-0x14(%rbp)
    4004b7: 48 89 75 e0             mov    %rsi,-0x20(%rbp)
    4004bb: 48 c7 45 f8 34 12 00    movq   $0x1234,-0x8(%rbp)
    4004c2: 00
    4004c3: b8 7f 00 00 00          mov    $0x7f,%eax
    4004c8: 5d                      pop    %rbp
    4004c9: c3                      retq

  # nm foo | grep -e glob$ -e str -e foo
  006008bc D foo
  006008a8 D glob
  006008ac D str

  # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x6008a8:s32 \
  > str=@+0x2008ac:string bit=@+0x2008bc:b10@2/32 argc=%di:s32 local=-0x8(%bp)'
  Added new event:
probe_foo:foo  (on 0x4c3 with glob=@0x6008a8:s32 str=@+0x2008ac:string 
   bit=@+0x2008bc:b10@2/32 argc=%di:s32 local=-0x8(%bp))

  You can now use it in all perf tools, such as:

  perf record -e probe_foo:foo -aR sleep 1

  # perf record -e probe_foo:foo ./foo
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (~33 samples) ]

  # perf script | grep -v ^#
   foo  2008 [002]  2199.867154: probe_foo:foo (4004c3)
   glob=-1 str="hello uprobe." bit=0x5 argc=1 local=0x1234
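
The 'bit=@+0x2008bc:b10@2/32' argument in the example reads 10 bits
starting at bit 2 of a 32-bit container.  A minimal userspace sketch of
that extraction is shown below.  It assumes the bitfield layout GCC uses
on x86_64, where the first declared field occupies the least significant
bits; the function name is hypothetical and this is not the kernel's
implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract a 'width'-bit field that starts 'offset' bits above the
 * least significant bit of a 32-bit container.
 */
static uint32_t extract_bits32(uint32_t container, unsigned int width,
			       unsigned int offset)
{
	assert(width > 0 && width + offset <= 32);

	/* Shift out the bits above the field ... */
	container <<= 32 - (width + offset);
	/* ... then shift out the bits below it. */
	container >>= 32 - width;

	return container;
}
```

In the example, 'unused' takes bits 0-1 and 'foo' starts at bit 2, so
the container holds 5 << 2 and the extracted value is 5, which is the
'bit=0x5' shown in the perf script output.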


This patchset is based on the current for-next branch of Steven
Rostedt's linux-trace tree.  I also put this on the 'uprobe/fetch-v8'
branch in my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks.
Namhyung


Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Cc: Hemant Kumar 


Hyeoncheol Lee (1):
  tracing/probes: Add fetch{,_size} member into deref fetch method

Namhyung Kim (15):
  tracing/uprobes: Fix documentation of uprobe registration syntax
  tracing/probes: Fix basic print type functions
  tracing/kprobes: Factor out struct trace_probe
  tracing/uprobes: Convert to struct trace_probe
  tracing/kprobes: Move common functions to trace_probe.h
  tracing/probes: Integrate duplicate set_print_fmt()
  tracing/probes: Move fetch function helpers to trace_probe.h
  tracing/probes: Split [ku]probes_fetch_type_table
  tracing/probes: Implement 'stack' fetch method for uprobes
  tracing/probes: Move 'symbol' fetch method to kprobes
  tracing/probes: Implement 'memory' fetch method for uprobes
  tracing/uprobes: Pass 'is_return' to traceprobe_parse_probe_arg()
  

[PATCH 04/17] tracing/uprobes: Convert to struct trace_probe

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Convert struct trace_uprobe to make use of the common trace_probe
structure.

Reviewed-by: Masami Hiramatsu 
Acked-by: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_uprobe.c | 159 ++--
 1 file changed, 79 insertions(+), 80 deletions(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index c77b92d61551..afda3726f288 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -51,22 +51,17 @@ struct trace_uprobe_filter {
  */
 struct trace_uprobe {
struct list_headlist;
-   struct ftrace_event_class   class;
-   struct ftrace_event_callcall;
struct trace_uprobe_filter  filter;
struct uprobe_consumer  consumer;
struct inode*inode;
char*filename;
unsigned long   offset;
unsigned long   nhit;
-   unsigned intflags;  /* For TP_FLAG_* */
-   ssize_t size;   /* trace entry size */
-   unsigned intnr_args;
-   struct probe_argargs[];
+   struct trace_probe  tp;
 };
 
-#define SIZEOF_TRACE_UPROBE(n) \
-   (offsetof(struct trace_uprobe, args) +  \
+#define SIZEOF_TRACE_UPROBE(n) \
+   (offsetof(struct trace_uprobe, tp.args) +   \
(sizeof(struct probe_arg) * (n)))
 
 static int register_uprobe_event(struct trace_uprobe *tu);
@@ -114,13 +109,13 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret)
if (!tu)
return ERR_PTR(-ENOMEM);
 
-   tu->call.class = &tu->class;
-   tu->call.name = kstrdup(event, GFP_KERNEL);
-   if (!tu->call.name)
+   tu->tp.call.class = &tu->tp.class;
+   tu->tp.call.name = kstrdup(event, GFP_KERNEL);
+   if (!tu->tp.call.name)
goto error;
 
-   tu->class.system = kstrdup(group, GFP_KERNEL);
-   if (!tu->class.system)
+   tu->tp.class.system = kstrdup(group, GFP_KERNEL);
+   if (!tu->tp.class.system)
goto error;
 
INIT_LIST_HEAD(&tu->list);
@@ -128,11 +123,11 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret)
if (is_ret)
tu->consumer.ret_handler = uretprobe_dispatcher;
init_trace_uprobe_filter(&tu->filter);
-   tu->call.flags |= TRACE_EVENT_FL_USE_CALL_FILTER;
+   tu->tp.call.flags |= TRACE_EVENT_FL_USE_CALL_FILTER;
return tu;
 
 error:
-   kfree(tu->call.name);
+   kfree(tu->tp.call.name);
kfree(tu);
 
return ERR_PTR(-ENOMEM);
@@ -142,12 +137,12 @@ static void free_trace_uprobe(struct trace_uprobe *tu)
 {
int i;
 
-   for (i = 0; i < tu->nr_args; i++)
-   traceprobe_free_probe_arg(&tu->args[i]);
+   for (i = 0; i < tu->tp.nr_args; i++)
+   traceprobe_free_probe_arg(&tu->tp.args[i]);
 
iput(tu->inode);
-   kfree(tu->call.class->system);
-   kfree(tu->call.name);
+   kfree(tu->tp.call.class->system);
+   kfree(tu->tp.call.name);
kfree(tu->filename);
kfree(tu);
 }
@@ -157,8 +152,8 @@ static struct trace_uprobe *find_probe_event(const char 
*event, const char *grou
struct trace_uprobe *tu;
 
list_for_each_entry(tu, &uprobe_list, list)
-   if (strcmp(tu->call.name, event) == 0 &&
-   strcmp(tu->call.class->system, group) == 0)
+   if (strcmp(tu->tp.call.name, event) == 0 &&
+   strcmp(tu->tp.call.class->system, group) == 0)
return tu;
 
return NULL;
@@ -181,16 +176,16 @@ static int unregister_trace_uprobe(struct trace_uprobe 
*tu)
 /* Register a trace_uprobe and probe_event */
 static int register_trace_uprobe(struct trace_uprobe *tu)
 {
-   struct trace_uprobe *old_tp;
+   struct trace_uprobe *old_tu;
int ret;
 
mutex_lock(&uprobe_lock);
 
/* register as an event */
-   old_tp = find_probe_event(tu->call.name, tu->call.class->system);
-   if (old_tp) {
+   old_tu = find_probe_event(tu->tp.call.name, tu->tp.call.class->system);
+   if (old_tu) {
/* delete old event */
-   ret = unregister_trace_uprobe(old_tp);
+   ret = unregister_trace_uprobe(old_tu);
if (ret)
goto end;
}
@@ -360,34 +355,36 @@ static int create_trace_uprobe(int argc, char **argv)
/* parse arguments */
ret = 0;
for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
+   struct probe_arg *parg = &tu->tp.args[i];
+
/* Increment count for freeing args in error case */
-   tu->nr_args++;
+   tu->tp.nr_args++;
 

[PATCH 03/17] tracing/kprobes: Factor out struct trace_probe

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Some functions can be shared between kprobes and uprobes.  Separate
the common data structure into struct trace_probe and use it from
the shared functions.
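
The layout this patch moves to can be sketched in userspace as below.
All type and macro names here are simplified stand-ins invented for the
illustration.  Note that embedding a structure that ends in a flexible
array member as the last member of another structure is a GNU C
extension, so the sketch assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct arg_sketch {			/* stand-in for struct probe_arg */
	const char *name;
};

struct common_probe_sketch {		/* stand-in for struct trace_probe */
	unsigned int nr_args;
	struct arg_sketch args[];	/* flexible array, must be last */
};

struct specific_probe_sketch {		/* stand-in for struct trace_kprobe */
	unsigned long nhit;
	struct common_probe_sketch tp;	/* common part, must be last */
};

/* Size of the outer structure plus room for n trailing arguments. */
#define SIZEOF_SPECIFIC_PROBE_SKETCH(n)				\
	(offsetof(struct specific_probe_sketch, tp.args) +	\
	 sizeof(struct arg_sketch) * (n))

/* Allocate a zeroed probe with room for nargs arguments. */
static struct specific_probe_sketch *alloc_probe_sketch(unsigned int nargs)
{
	struct specific_probe_sketch *sp;

	sp = calloc(1, SIZEOF_SPECIFIC_PROBE_SKETCH(nargs));
	assert(sp == NULL || sp->tp.nr_args == 0);

	return sp;
}
```

Shared helpers can then take a pointer to the embedded common part
(&sp->tp) without knowing which kind of probe contains it.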

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 548 ++--
 kernel/trace/trace_probe.h  |  20 ++
 2 files changed, 289 insertions(+), 279 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index dae9541ada9e..db922fd74b05 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -27,18 +27,12 @@
 /**
  * Kprobe event core functions
  */
-struct trace_probe {
+struct trace_kprobe {
struct list_headlist;
struct kretproberp; /* Use rp.kp for kprobe use */
unsigned long   nhit;
-   unsigned intflags;  /* For TP_FLAG_* */
const char  *symbol;/* symbol name */
-   struct ftrace_event_class   class;
-   struct ftrace_event_callcall;
-   struct list_headfiles;
-   ssize_t size;   /* trace entry size */
-   unsigned intnr_args;
-   struct probe_argargs[];
+   struct trace_probe  tp;
 };
 
 struct event_file_link {
@@ -46,56 +40,46 @@ struct event_file_link {
struct list_headlist;
 };
 
-#define SIZEOF_TRACE_PROBE(n)  \
-   (offsetof(struct trace_probe, args) +   \
+#define SIZEOF_TRACE_KPROBE(n) \
+   (offsetof(struct trace_kprobe, tp.args) +   \
(sizeof(struct probe_arg) * (n)))
 
 
-static __kprobes bool trace_probe_is_return(struct trace_probe *tp)
+static __kprobes bool trace_kprobe_is_return(struct trace_kprobe *tk)
 {
-   return tp->rp.handler != NULL;
+   return tk->rp.handler != NULL;
 }
 
-static __kprobes const char *trace_probe_symbol(struct trace_probe *tp)
+static __kprobes const char *trace_kprobe_symbol(struct trace_kprobe *tk)
 {
-   return tp->symbol ? tp->symbol : "unknown";
+   return tk->symbol ? tk->symbol : "unknown";
 }
 
-static __kprobes unsigned long trace_probe_offset(struct trace_probe *tp)
+static __kprobes unsigned long trace_kprobe_offset(struct trace_kprobe *tk)
 {
-   return tp->rp.kp.offset;
+   return tk->rp.kp.offset;
 }
 
-static __kprobes bool trace_probe_is_enabled(struct trace_probe *tp)
+static __kprobes bool trace_kprobe_has_gone(struct trace_kprobe *tk)
 {
-   return !!(tp->flags & (TP_FLAG_TRACE | TP_FLAG_PROFILE));
+   return !!(kprobe_gone(&tk->rp.kp));
 }
 
-static __kprobes bool trace_probe_is_registered(struct trace_probe *tp)
-{
-   return !!(tp->flags & TP_FLAG_REGISTERED);
-}
-
-static __kprobes bool trace_probe_has_gone(struct trace_probe *tp)
-{
-   return !!(kprobe_gone(&tp->rp.kp));
-}
-
-static __kprobes bool trace_probe_within_module(struct trace_probe *tp,
-   struct module *mod)
+static __kprobes bool trace_kprobe_within_module(struct trace_kprobe *tk,
+struct module *mod)
 {
int len = strlen(mod->name);
-   const char *name = trace_probe_symbol(tp);
+   const char *name = trace_kprobe_symbol(tk);
return strncmp(mod->name, name, len) == 0 && name[len] == ':';
 }
 
-static __kprobes bool trace_probe_is_on_module(struct trace_probe *tp)
+static __kprobes bool trace_kprobe_is_on_module(struct trace_kprobe *tk)
 {
-   return !!strchr(trace_probe_symbol(tp), ':');
+   return !!strchr(trace_kprobe_symbol(tk), ':');
 }
 
-static int register_probe_event(struct trace_probe *tp);
-static int unregister_probe_event(struct trace_probe *tp);
+static int register_kprobe_event(struct trace_kprobe *tk);
+static int unregister_kprobe_event(struct trace_kprobe *tk);
 
 static DEFINE_MUTEX(probe_lock);
 static LIST_HEAD(probe_list);
@@ -107,42 +91,42 @@ static int kretprobe_dispatcher(struct kretprobe_instance 
*ri,
 /*
  * Allocate new trace_probe and initialize it (including kprobes).
  */
-static struct trace_probe *alloc_trace_probe(const char *group,
+static struct trace_kprobe *alloc_trace_kprobe(const char *group,
 const char *event,
 void *addr,
 const char *symbol,
 unsigned long offs,
 int nargs, bool is_return)
 {
-   struct trace_probe *tp;
+   struct trace_kprobe *tk;
int ret = -ENOMEM;
 
-   tp = kzalloc(SIZEOF_TRACE_PROBE(nargs), GFP_KERNEL);
-   if (!tp)
+   tk = kzalloc(SIZEOF_TRACE_KPROBE(nargs), GFP_KERNEL);
+   if (!tk)
return ERR_PTR(ret);
 
if (symbol) 

[PATCH 07/17] tracing/probes: Move fetch function helpers to trace_probe.h

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Move fetch function helper macros/functions to the header file and
make them external.  This is preparation for supporting the uprobe
fetch table in the next patch.

Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_probe.c | 74 --
 kernel/trace/trace_probe.h | 64 +++
 2 files changed, 77 insertions(+), 61 deletions(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index d8347b01ce89..c26bc9eaa2ac 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -35,19 +35,15 @@ const char *reserved_field_names[] = {
FIELD_STRING_FUNC,
 };
 
-/* Printing function type */
-#define PRINT_TYPE_FUNC_NAME(type) print_type_##type
-#define PRINT_TYPE_FMT_NAME(type)  print_type_format_##type
-
 /* Printing  in basic type function template */
 #define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt)                        \
-static __kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s,   \
+__kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s,  \
const char *name,   \
void *data, void *ent)  \
 {  \
return trace_seq_printf(s, " %s=" fmt, name, *(type *)data);\
 }  \
-static const char PRINT_TYPE_FMT_NAME(type)[] = fmt;
+const char PRINT_TYPE_FMT_NAME(type)[] = fmt;
 
 DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "0x%x")
 DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "0x%x")
@@ -58,23 +54,12 @@ DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d")
 DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%d")
 DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%Ld")
 
-static inline void *get_rloc_data(u32 *dl)
-{
-   return (u8 *)dl + get_rloc_offs(*dl);
-}
-
-/* For data_loc conversion */
-static inline void *get_loc_data(u32 *dl, void *ent)
-{
-   return (u8 *)ent + get_rloc_offs(*dl);
-}
-
 /* For defining macros, define string/string_size types */
 typedef u32 string;
 typedef u32 string_size;
 
 /* Print type function for string type */
-static __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
+__kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
  const char *name,
  void *data, void *ent)
 {
@@ -87,7 +72,7 @@ static __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct 
trace_seq *s,
(const char *)get_loc_data(data, ent));
 }
 
-static const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";
+const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";
 
 #define FETCH_FUNC_NAME(method, type)  fetch_##method##_##type
 /*
@@ -111,7 +96,7 @@ DEFINE_FETCH_##method(u64)
 
 /* Data fetch function templates */
 #define DEFINE_FETCH_reg(type) \
-static __kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, \
+__kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs,        \
void *offset, void *dest)   \
 {  \
*(type *)dest = (type)regs_get_register(regs,   \
@@ -123,7 +108,7 @@ DEFINE_BASIC_FETCH_FUNCS(reg)
 #define fetch_reg_string_size  NULL
 
 #define DEFINE_FETCH_stack(type)   \
-static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
+__kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,  \
  void *offset, void *dest) \
 {  \
*(type *)dest = (type)regs_get_kernel_stack_nth(regs,   \
@@ -135,7 +120,7 @@ DEFINE_BASIC_FETCH_FUNCS(stack)
 #define fetch_stack_string_sizeNULL
 
 #define DEFINE_FETCH_retval(type)  \
-static __kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs,\
+__kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs, \
  void *dummy, void *dest)  \
 {  \
*(type *)dest = (type)regs_return_value(regs);  \
@@ -146,7 +131,7 @@ DEFINE_BASIC_FETCH_FUNCS(retval)
 #define fetch_retval_string_size   NULL
 
 #define DEFINE_FETCH_memory(type)  \
-static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
+__kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
  void *addr, void *dest)   \
 {  

[PATCH 10/17] tracing/probes: Move 'symbol' fetch method to kprobes

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Move existing functions to trace_kprobe.c and add NULL entries to the
uprobes fetch type table.  I don't make them static since some generic
routines like update/free_XXX_fetch_param() require pointers to the
functions.

Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 59 +
 kernel/trace/trace_probe.c  | 59 -
 kernel/trace/trace_probe.h  | 24 ++
 kernel/trace/trace_uprobe.c |  8 ++
 4 files changed, 91 insertions(+), 59 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 4eaecf9bfd66..2886e6da524d 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -88,6 +88,51 @@ static int kprobe_dispatcher(struct kprobe *kp, struct 
pt_regs *regs);
 static int kretprobe_dispatcher(struct kretprobe_instance *ri,
struct pt_regs *regs);
 
+/* Memory fetching by symbol */
+struct symbol_cache {
+   char*symbol;
+   longoffset;
+   unsigned long   addr;
+};
+
+unsigned long update_symbol_cache(struct symbol_cache *sc)
+{
+   sc->addr = (unsigned long)kallsyms_lookup_name(sc->symbol);
+
+   if (sc->addr)
+   sc->addr += sc->offset;
+
+   return sc->addr;
+}
+
+void free_symbol_cache(struct symbol_cache *sc)
+{
+   kfree(sc->symbol);
+   kfree(sc);
+}
+
+struct symbol_cache *alloc_symbol_cache(const char *sym, long offset)
+{
+   struct symbol_cache *sc;
+
+   if (!sym || strlen(sym) == 0)
+   return NULL;
+
+   sc = kzalloc(sizeof(struct symbol_cache), GFP_KERNEL);
+   if (!sc)
+   return NULL;
+
+   sc->symbol = kstrdup(sym, GFP_KERNEL);
+   if (!sc->symbol) {
+   kfree(sc);
+   return NULL;
+   }
+   sc->offset = offset;
+   update_symbol_cache(sc);
+
+   return sc;
+}
+
 /*
  * Kprobes-specific fetch functions
  */
@@ -103,6 +148,20 @@ DEFINE_BASIC_FETCH_FUNCS(stack)
 #define fetch_stack_string NULL
 #define fetch_stack_string_sizeNULL
 
+#define DEFINE_FETCH_symbol(type)  \
+__kprobes void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs, \
+ void *data, void *dest)   \
+{  \
+   struct symbol_cache *sc = data; \
+   if (sc->addr)   \
+   fetch_memory_##type(regs, (void *)sc->addr, dest);  \
+   else\
+   *(type *)dest = 0;  \
+}
+DEFINE_BASIC_FETCH_FUNCS(symbol)
+DEFINE_FETCH_symbol(string)
+DEFINE_FETCH_symbol(string_size)
+
 /* Fetch type information table */
 const struct fetch_type kprobes_fetch_type_table[] = {
/* Special types */
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index ddd14a5cb3ee..56cf63796fc1 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -180,65 +180,6 @@ __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct 
pt_regs *regs,
*(u32 *)dest = len;
 }
 
-/* Memory fetching by symbol */
-struct symbol_cache {
-   char*symbol;
-   longoffset;
-   unsigned long   addr;
-};
-
-static unsigned long update_symbol_cache(struct symbol_cache *sc)
-{
-   sc->addr = (unsigned long)kallsyms_lookup_name(sc->symbol);
-
-   if (sc->addr)
-   sc->addr += sc->offset;
-
-   return sc->addr;
-}
-
-static void free_symbol_cache(struct symbol_cache *sc)
-{
-   kfree(sc->symbol);
-   kfree(sc);
-}
-
-static struct symbol_cache *alloc_symbol_cache(const char *sym, long offset)
-{
-   struct symbol_cache *sc;
-
-   if (!sym || strlen(sym) == 0)
-   return NULL;
-
-   sc = kzalloc(sizeof(struct symbol_cache), GFP_KERNEL);
-   if (!sc)
-   return NULL;
-
-   sc->symbol = kstrdup(sym, GFP_KERNEL);
-   if (!sc->symbol) {
-   kfree(sc);
-   return NULL;
-   }
-   sc->offset = offset;
-   update_symbol_cache(sc);
-
-   return sc;
-}
-
-#define DEFINE_FETCH_symbol(type)  \
-__kprobes void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs, \
- void *data, void *dest)   \
-{  \
-   struct symbol_cache *sc = data; \
-   if (sc->addr)   \
-   fetch_memory_##type(regs, (void 

[PATCH 08/17] tracing/probes: Split [ku]probes_fetch_type_table

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Use a separate fetch_type_table for kprobes and uprobes.  The tables
currently share all fetch methods, but some of them will be implemented
differently later.

This is done so as not to break the build if only one of [ku]probes is
configured (e.g. !CONFIG_KPROBE_EVENT and CONFIG_UPROBE_EVENT).  So I
added '__weak' to the table declaration so that it can be safely
omitted when it is configured out.
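
The idea of giving each prober its own table and passing that table to
the lookup routine can be sketched as follows.  The structure layout,
the names and the NULL-terminated table are assumptions of this
userspace illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct fetch_type_sketch {
	const char *name;	/* type name as written in the probe definition */
	size_t size;		/* size of the fetched value in bytes */
	const char *owner;	/* which prober's table the entry belongs to */
};

/* One table per prober; a NULL name terminates the table (sketch only). */
static const struct fetch_type_sketch kprobes_table_sketch[] = {
	{ "u8", 1, "kprobes" }, { "u16", 2, "kprobes" },
	{ "u32", 4, "kprobes" }, { "u64", 8, "kprobes" },
	{ NULL, 0, NULL },
};

static const struct fetch_type_sketch uprobes_table_sketch[] = {
	{ "u8", 1, "uprobes" }, { "u16", 2, "uprobes" },
	{ "u32", 4, "uprobes" }, { "u64", 8, "uprobes" },
	{ NULL, 0, NULL },
};

/* Look a type up by name in the table supplied by the caller. */
static const struct fetch_type_sketch *
find_fetch_type_sketch(const char *type, const struct fetch_type_sketch *ftbl)
{
	size_t i;

	assert(type && ftbl);

	for (i = 0; ftbl[i].name; i++)
		if (strcmp(type, ftbl[i].name) == 0)
			return &ftbl[i];

	return NULL;
}
```

Because the lookup no longer refers to a single global table, the
common parsing code stays shared while each prober can later point an
entry at its own implementation.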

Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 18 +++
 kernel/trace/trace_probe.c  | 64 +++---
 kernel/trace/trace_probe.h  | 76 ++---
 kernel/trace/trace_uprobe.c | 18 +++
 4 files changed, 126 insertions(+), 50 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 62d6c961bbce..d00ee5ce6ccc 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -88,6 +88,24 @@ static int kprobe_dispatcher(struct kprobe *kp, struct 
pt_regs *regs);
 static int kretprobe_dispatcher(struct kretprobe_instance *ri,
struct pt_regs *regs);
 
+/* Fetch type information table */
+const struct fetch_type kprobes_fetch_type_table[] = {
+   /* Special types */
+   [FETCH_TYPE_STRING] = __ASSIGN_FETCH_TYPE("string", string, string,
+   sizeof(u32), 1, "__data_loc char[]"),
+   [FETCH_TYPE_STRSIZE] = __ASSIGN_FETCH_TYPE("string_size", u32,
+   string_size, sizeof(u32), 0, "u32"),
+   /* Basic types */
+   ASSIGN_FETCH_TYPE(u8,  u8,  0),
+   ASSIGN_FETCH_TYPE(u16, u16, 0),
+   ASSIGN_FETCH_TYPE(u32, u32, 0),
+   ASSIGN_FETCH_TYPE(u64, u64, 0),
+   ASSIGN_FETCH_TYPE(s8,  u8,  1),
+   ASSIGN_FETCH_TYPE(s16, u16, 1),
+   ASSIGN_FETCH_TYPE(s32, u32, 1),
+   ASSIGN_FETCH_TYPE(s64, u64, 1),
+};
+
 /*
  * Allocate new trace_probe and initialize it (including kprobes).
  */
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index c26bc9eaa2ac..68b00a214fcc 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -54,10 +54,6 @@ DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d")
 DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%d")
 DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%Ld")
 
-/* For defining macros, define string/string_size types */
-typedef u32 string;
-typedef u32 string_size;
-
 /* Print type function for string type */
 __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
  const char *name,
@@ -74,7 +70,6 @@ __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq 
*s,
 
 const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";
 
-#define FETCH_FUNC_NAME(method, type)  fetch_##method##_##type
 /*
  * Define macro for basic types - we don't need to define s* types, because
  * we have to care only about bitwidth at recording time.
@@ -359,25 +354,8 @@ free_bitfield_fetch_param(struct bitfield_fetch_param 
*data)
kfree(data);
 }
 
-/* Fetch type information table */
-static const struct fetch_type fetch_type_table[] = {
-   /* Special types */
-   [FETCH_TYPE_STRING] = __ASSIGN_FETCH_TYPE("string", string, string,
-   sizeof(u32), 1, "__data_loc char[]"),
-   [FETCH_TYPE_STRSIZE] = __ASSIGN_FETCH_TYPE("string_size", u32,
-   string_size, sizeof(u32), 0, "u32"),
-   /* Basic types */
-   ASSIGN_FETCH_TYPE(u8,  u8,  0),
-   ASSIGN_FETCH_TYPE(u16, u16, 0),
-   ASSIGN_FETCH_TYPE(u32, u32, 0),
-   ASSIGN_FETCH_TYPE(u64, u64, 0),
-   ASSIGN_FETCH_TYPE(s8,  u8,  1),
-   ASSIGN_FETCH_TYPE(s16, u16, 1),
-   ASSIGN_FETCH_TYPE(s32, u32, 1),
-   ASSIGN_FETCH_TYPE(s64, u64, 1),
-};
-
-static const struct fetch_type *find_fetch_type(const char *type)
+static const struct fetch_type *find_fetch_type(const char *type,
+   const struct fetch_type *ftbl)
 {
int i;
 
@@ -398,21 +376,21 @@ static const struct fetch_type *find_fetch_type(const 
char *type)
 
switch (bs) {
case 8:
-   return find_fetch_type("u8");
+   return find_fetch_type("u8", ftbl);
case 16:
-   return find_fetch_type("u16");
+   return find_fetch_type("u16", ftbl);
case 32:
-   return find_fetch_type("u32");
+   return find_fetch_type("u32", ftbl);
case 64:
-   return find_fetch_type("u64");
+   return find_fetch_type("u64", ftbl);
default:
goto fail;
}
}
 
-   for (i = 0; i < ARRAY_SIZE(fetch_type_table); i++)
-   if 

[PATCH 09/17] tracing/probes: Implement 'stack' fetch method for uprobes

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Use separate methods to fetch from the stack.  Move the existing
functions to trace_kprobe.c and make them static.  Also add a new stack
fetch implementation for uprobes.
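
The DEFINE_FETCH_##method / DEFINE_BASIC_FETCH_FUNCS pattern used in the
diff below generates one function per basic type by token pasting.  A
userspace sketch of the same pattern follows; here the "stack" is just
an array of machine words passed in by the caller, which is an
assumption of this illustration and not how either prober reads the
stack.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

#define FETCH_FUNC_NAME(method, type)	fetch_##method##_##type

/* Sketch-only stack fetch: read the n-th word and truncate it to 'type'. */
#define DEFINE_FETCH_stack(type)					\
static void FETCH_FUNC_NAME(stack, type)(const unsigned long *stack,	\
					 unsigned int n, void *dest)	\
{									\
	assert(stack && dest);						\
	*(type *)dest = (type)stack[n];					\
}

/* Only unsigned types are needed: just the bit width matters here. */
#define DEFINE_BASIC_FETCH_FUNCS(method)	\
DEFINE_FETCH_##method(u8)			\
DEFINE_FETCH_##method(u16)			\
DEFINE_FETCH_##method(u32)			\
DEFINE_FETCH_##method(u64)

/* Expands to fetch_stack_u8(), fetch_stack_u16(), and so on. */
DEFINE_BASIC_FETCH_FUNCS(stack)
```

Each prober can provide its own DEFINE_FETCH_stack body and still reuse
the same DEFINE_BASIC_FETCH_FUNCS expansion.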

Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 15 +++
 kernel/trace/trace_probe.c  | 22 --
 kernel/trace/trace_probe.h  | 14 ++
 kernel/trace/trace_uprobe.c | 41 +
 4 files changed, 66 insertions(+), 26 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index d00ee5ce6ccc..4eaecf9bfd66 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -88,6 +88,21 @@ static int kprobe_dispatcher(struct kprobe *kp, struct 
pt_regs *regs);
 static int kretprobe_dispatcher(struct kretprobe_instance *ri,
struct pt_regs *regs);
 
+/*
+ * Kprobes-specific fetch functions
+ */
+#define DEFINE_FETCH_stack(type)   \
+static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
+ void *offset, void *dest) \
+{  \
+   *(type *)dest = (type)regs_get_kernel_stack_nth(regs,   \
+   (unsigned int)((unsigned long)offset)); \
+}
+DEFINE_BASIC_FETCH_FUNCS(stack)
+/* No string on the stack entry */
+#define fetch_stack_string NULL
+#define fetch_stack_string_sizeNULL
+
 /* Fetch type information table */
 const struct fetch_type kprobes_fetch_type_table[] = {
/* Special types */
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 68b00a214fcc..ddd14a5cb3ee 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -70,16 +70,6 @@ __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq 
*s,
 
 const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";
 
-/*
- * Define macro for basic types - we don't need to define s* types, because
- * we have to care only about bitwidth at recording time.
- */
-#define DEFINE_BASIC_FETCH_FUNCS(method) \
-DEFINE_FETCH_##method(u8)  \
-DEFINE_FETCH_##method(u16) \
-DEFINE_FETCH_##method(u32) \
-DEFINE_FETCH_##method(u64)
-
 #define CHECK_FETCH_FUNCS(method, fn)  \
(((FETCH_FUNC_NAME(method, u8) == fn) ||\
  (FETCH_FUNC_NAME(method, u16) == fn) ||   \
@@ -102,18 +92,6 @@ DEFINE_BASIC_FETCH_FUNCS(reg)
 #define fetch_reg_string   NULL
 #define fetch_reg_string_size  NULL
 
-#define DEFINE_FETCH_stack(type)   \
-__kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,  \
- void *offset, void *dest) \
-{  \
-   *(type *)dest = (type)regs_get_kernel_stack_nth(regs,   \
-   (unsigned int)((unsigned long)offset)); \
-}
-DEFINE_BASIC_FETCH_FUNCS(stack)
-/* No string on the stack entry */
-#define fetch_stack_string NULL
-#define fetch_stack_string_sizeNULL
-
 #define DEFINE_FETCH_retval(type)  \
 __kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs, \
  void *dummy, void *dest)  \
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 99d3aa1f2d8a..23b2d83ee5fb 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -167,10 +167,6 @@ DECLARE_BASIC_FETCH_FUNCS(reg);
 #define fetch_reg_string   NULL
 #define fetch_reg_string_size  NULL
 
-DECLARE_BASIC_FETCH_FUNCS(stack);
-#define fetch_stack_string NULL
-#define fetch_stack_string_sizeNULL
-
 DECLARE_BASIC_FETCH_FUNCS(retval);
 #define fetch_retval_stringNULL
 #define fetch_retval_string_size   NULL
@@ -191,6 +187,16 @@ DECLARE_BASIC_FETCH_FUNCS(bitfield);
 #define fetch_bitfield_string  NULL
 #define fetch_bitfield_string_size NULL
 
+/*
+ * Define macro for basic types - we don't need to define s* types, because
+ * we have to care only about bitwidth at recording time.
+ */
+#define DEFINE_BASIC_FETCH_FUNCS(method) \
+DEFINE_FETCH_##method(u8)  \
+DEFINE_FETCH_##method(u16) \
+DEFINE_FETCH_##method(u32) \
+DEFINE_FETCH_##method(u64)
+
 /* Default (unsigned long) fetch type */
 #define __DEFAULT_FETCH_TYPE(t) u##t
 #define _DEFAULT_FETCH_TYPE(t) __DEFAULT_FETCH_TYPE(t)
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index c66ddc744f12..28caa766edf8 100644
--- a/kernel/trace/trace_uprobe.c
+++ 

[PATCH 06/17] tracing/probes: Integrate duplicate set_print_fmt()

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

The set_print_fmt() functions are implemented almost identically for
[ku]probes.  Move them to a common place and get rid of the duplication.
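
The functions being merged rely on a two-pass snprintf() idiom: a first
call with a zero length only measures the output, and a second call
fills a buffer of exactly that size.  The following is a simplified
userspace sketch of the idiom with made-up names; it passes NULL
instead of an offset into a NULL buffer during the measuring pass.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write the format into buf, or only measure it when len is 0. */
static int build_fmt_sketch(char *buf, int len, const char *const *names, int n)
{
	int pos = 0;
	int i;

/* When len == 0 nothing is written; only the needed length is counted. */
#define LEN_OR_ZERO (len ? len - pos : 0)
#define BUF_OR_NULL (buf ? buf + pos : NULL)

	pos += snprintf(BUF_OR_NULL, LEN_OR_ZERO, "\"(%%lx)");

	for (i = 0; i < n; i++)
		pos += snprintf(BUF_OR_NULL, LEN_OR_ZERO, " %s=%%d", names[i]);

	pos += snprintf(BUF_OR_NULL, LEN_OR_ZERO, "\"");

#undef BUF_OR_NULL
#undef LEN_OR_ZERO

	/* Length of the format, not counting the terminating NUL. */
	return pos;
}

/* Returns a malloc()ed format string, or NULL on allocation failure. */
static char *make_fmt_sketch(const char *const *names, int n)
{
	/* First pass: measure. */
	int len = build_fmt_sketch(NULL, 0, names, n);
	char *fmt = malloc((size_t)len + 1);

	if (!fmt)
		return NULL;

	/* Second pass: actually write the string. */
	build_fmt_sketch(fmt, len + 1, names, n);
	assert(strlen(fmt) == (size_t)len);

	return fmt;
}
```

The caller owns the returned string and is expected to free() it.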

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 63 +
 kernel/trace/trace_probe.c  | 62 
 kernel/trace/trace_probe.h  |  2 ++
 kernel/trace/trace_uprobe.c | 55 +--
 4 files changed, 66 insertions(+), 116 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 0167d4b92431..62d6c961bbce 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -964,67 +964,6 @@ static int kretprobe_event_define_fields(struct 
ftrace_event_call *event_call)
return 0;
 }
 
-static int __set_print_fmt(struct trace_kprobe *tk, char *buf, int len)
-{
-   int i;
-   int pos = 0;
-
-   const char *fmt, *arg;
-
-   if (!trace_kprobe_is_return(tk)) {
-   fmt = "(%lx)";
-   arg = "REC->" FIELD_STRING_IP;
-   } else {
-   fmt = "(%lx <- %lx)";
-   arg = "REC->" FIELD_STRING_FUNC ", REC->" FIELD_STRING_RETIP;
-   }
-
-   /* When len=0, we just calculate the needed length */
-#define LEN_OR_ZERO (len ? len - pos : 0)
-
-   pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);
-
-   for (i = 0; i < tk->tp.nr_args; i++) {
-   pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
-   tk->tp.args[i].name, tk->tp.args[i].type->fmt);
-   }
-
-   pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);
-
-   for (i = 0; i < tk->tp.nr_args; i++) {
-   if (strcmp(tk->tp.args[i].type->name, "string") == 0)
-   pos += snprintf(buf + pos, LEN_OR_ZERO,
-   ", __get_str(%s)",
-   tk->tp.args[i].name);
-   else
-   pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
-   tk->tp.args[i].name);
-   }
-
-#undef LEN_OR_ZERO
-
-   /* return the length of print_fmt */
-   return pos;
-}
-
-static int set_print_fmt(struct trace_kprobe *tk)
-{
-   int len;
-   char *print_fmt;
-
-   /* First: called with 0 length to calculate the needed length */
-   len = __set_print_fmt(tk, NULL, 0);
-   print_fmt = kmalloc(len + 1, GFP_KERNEL);
-   if (!print_fmt)
-   return -ENOMEM;
-
-   /* Second: actually write the @print_fmt */
-   __set_print_fmt(tk, print_fmt, len + 1);
-   tk->tp.call.print_fmt = print_fmt;
-
-   return 0;
-}
-
 #ifdef CONFIG_PERF_EVENTS
 
 /* Kprobe profile handler */
@@ -1175,7 +1114,7 @@ static int register_kprobe_event(struct trace_kprobe *tk)
call->event.funcs = &kprobe_funcs;
call->class->define_fields = kprobe_event_define_fields;
}
-   if (set_print_fmt(tk) < 0)
+   if (set_print_fmt(&tk->tp, trace_kprobe_is_return(tk)) < 0)
return -ENOMEM;
ret = register_ftrace_event(&call->event);
if (!ret) {
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 430505b08a6f..d8347b01ce89 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -837,3 +837,65 @@ out:
 
return ret;
 }
+
+static int __set_print_fmt(struct trace_probe *tp, char *buf, int len,
+  bool is_return)
+{
+   int i;
+   int pos = 0;
+
+   const char *fmt, *arg;
+
+   if (!is_return) {
+   fmt = "(%lx)";
+   arg = "REC->" FIELD_STRING_IP;
+   } else {
+   fmt = "(%lx <- %lx)";
+   arg = "REC->" FIELD_STRING_FUNC ", REC->" FIELD_STRING_RETIP;
+   }
+
+   /* When len=0, we just calculate the needed length */
+#define LEN_OR_ZERO (len ? len - pos : 0)
+
+   pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);
+
+   for (i = 0; i < tp->nr_args; i++) {
+   pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
+   tp->args[i].name, tp->args[i].type->fmt);
+   }
+
+   pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);
+
+   for (i = 0; i < tp->nr_args; i++) {
+   if (strcmp(tp->args[i].type->name, "string") == 0)
+   pos += snprintf(buf + pos, LEN_OR_ZERO,
+   ", __get_str(%s)",
+   tp->args[i].name);
+   else
+   pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
+   tp->args[i].name);
+   }
+
+#undef LEN_OR_ZERO
+
+   /* return the length of print_fmt */
+   return pos;
+}
+
+int set_print_fmt(struct 
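
The two-pass sizing idiom used by __set_print_fmt() in the patch above (call once with a zero length to measure, allocate, then call again to write) can be sketched in plain userspace C. This is only an illustration: format_pairs() and build_pairs() are hypothetical names, and the sketch passes NULL to snprintf() during the sizing pass instead of doing arithmetic on a NULL buffer.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Format " name=value" pairs into buf. When len is 0 nothing is written;
 * the function only returns the length that would be needed (without NUL).
 */
static int format_pairs(char *buf, int len, const char *const *names,
			const int *values, int n)
{
	int pos = 0;
	int i;

	for (i = 0; i < n; i++) {
		/* Sizing pass: NULL buffer and size 0, snprintf only counts. */
		char *out = len ? buf + pos : NULL;
		size_t room = len ? (size_t)(len - pos) : 0;

		pos += snprintf(out, room, " %s=%d", names[i], values[i]);
	}
	return pos;
}

/* Returns a heap-allocated string the caller must free(), or NULL. */
static char *build_pairs(const char *const *names, const int *values, int n)
{
	/* First pass: measure. */
	int len = format_pairs(NULL, 0, names, values, n);
	char *s = malloc((size_t)len + 1);

	if (!s)
		return NULL;
	s[0] = '\0';	/* keeps the result valid when n == 0 */

	/* Second pass: actually write, with room for the terminating NUL. */
	format_pairs(s, len + 1, names, values, n);
	return s;
}
```

The benefit is that the formatting logic lives in one place, so the measured length and the written length cannot drift apart.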

[PATCH 11/17] tracing/probes: Add fetch{,_size} member into deref fetch method

2013-12-08 Thread Namhyung Kim
From: Hyeoncheol Lee 

The deref fetch methods access a memory region but assume that it is
kernel memory, since uprobes do not support them yet.

Add ->fetch and ->fetch_size members so that a proper access method can
be provided for uprobes.
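
As a rough userspace illustration of the idea (not the kernel code itself), the dereference helper can call through a function pointer chosen at parse time instead of a hard-wired memory reader. The names deref_param, read_u32_fn and read_u32_plain below are made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader callback: copy one u32 from addr into dest. */
typedef void (*read_u32_fn)(const void *addr, void *dest);

struct deref_param {
	long offset;		/* byte offset added to the base address */
	read_u32_fn fetch;	/* installed when the argument is parsed */
};

/* Stand-in for one particular way of reading memory. */
static void read_u32_plain(const void *addr, void *dest)
{
	memcpy(dest, addr, sizeof(uint32_t));
}

/* Dereference base + offset through whichever reader was installed. */
static void deref_u32(const struct deref_param *p, const void *base,
		      void *dest)
{
	if (base && p->fetch)
		p->fetch((const char *)base + p->offset, dest);
	else
		memset(dest, 0, sizeof(uint32_t));	/* store 0 on failure */
}
```

A second reader with the same signature (for example one that copies from another address space) could then be plugged in without touching deref_u32().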

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Hyeoncheol Lee 
[namhy...@kernel.org: Split original patch into pieces as requested]
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_probe.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 56cf63796fc1..b4f28bc39959 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -184,6 +184,8 @@ __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
 struct deref_fetch_param {
struct fetch_param  orig;
long offset;
+   fetch_func_t fetch;
+   fetch_func_t fetch_size;
 };
 
 #define DEFINE_FETCH_deref(type)   \
@@ -195,13 +197,26 @@ __kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
call_fetch(&dprm->orig, regs, &addr);   \
if (addr) { \
addr += dprm->offset;   \
-   fetch_memory_##type(regs, (void *)addr, dest);  \
+   dprm->fetch(regs, (void *)addr, dest);  \
} else  \
*(type *)dest = 0;  \
 }
 DEFINE_BASIC_FETCH_FUNCS(deref)
 DEFINE_FETCH_deref(string)
-DEFINE_FETCH_deref(string_size)
+
+__kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
+  void *data, void *dest)
+{
+   struct deref_fetch_param *dprm = data;
+   unsigned long addr;
+
+   call_fetch(&dprm->orig, regs, &addr);
+   if (addr && dprm->fetch_size) {
+   addr += dprm->offset;
+   dprm->fetch_size(regs, (void *)addr, dest);
+   } else
+   *(string_size *)dest = 0;
+}
 
 static __kprobes void update_deref_fetch_param(struct deref_fetch_param *data)
 {
@@ -476,6 +491,9 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
return -ENOMEM;
 
dprm->offset = offset;
+   dprm->fetch = t->fetch[FETCH_MTD_memory];
+   dprm->fetch_size = get_fetch_size_function(t,
+   dprm->fetch, ttbl);
ret = parse_probe_arg(arg, t2, &dprm->orig, is_return,
is_kprobe);
if (ret)
-- 
1.7.11.7



Re: [PATCH V4 09/10] power8, perf: Change BHRB branch filter configuration

2013-12-08 Thread Michael Ellerman
On Wed, 2013-04-12 at 10:32:41 UTC, Anshuman Khandual wrote:
> Powerpc kernel now supports SW based branch filters for book3s systems with
> some specific requirements while dealing with HW supported branch filters in
> order to achieve overall OR semantics prevailing in perf branch stack sampling
> framework. This patch adapts the BHRB branch filter configuration to meet
> those protocols. POWER8 PMU does support 3 branch filters (out of which two
> are getting used in perf branch stack) which are mutually exclusive and cannot
> be ORed with each other. This implies that PMU can only handle one HW based
> branch filter request at any point of time. For all other combinations PMU
> will pass it on to the SW.
> 
> Also the combination of PERF_SAMPLE_BRANCH_ANY_CALL and
> PERF_SAMPLE_BRANCH_COND can now be handled in SW, hence we don't error them
> out anymore.
> 
> diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
> index 03c5b8d..6021349 100644
> --- a/arch/powerpc/perf/power8-pmu.c
> +++ b/arch/powerpc/perf/power8-pmu.c
> @@ -561,7 +561,56 @@ static int power8_generic_events[] = {
>  
>  static u64 power8_bhrb_filter_map(u64 branch_sample_type, u64 *filter_mask)
>  {
> - u64 pmu_bhrb_filter = 0;
> + u64 x, tmp, pmu_bhrb_filter = 0;
> + *filter_mask = 0;
> +
> + /* No branch filter requested */
> + if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY) {
> + *filter_mask = PERF_SAMPLE_BRANCH_ANY;
> + return pmu_bhrb_filter;
> + }
> +
> + /*
> +  * P8 does not support oring of PMU HW branch filters. Hence
> +  * if multiple branch filters are requested which includes filters
> +  * supported in PMU, still go ahead and clear the PMU based HW branch
> +  * filter component as in this case all the filters will be processed
> +  * in SW.

Leading space there.

> +  */
> + tmp = branch_sample_type;
> +
> + /* Remove privilege filters before comparison */
> + tmp &= ~PERF_SAMPLE_BRANCH_USER;
> + tmp &= ~PERF_SAMPLE_BRANCH_KERNEL;
> + tmp &= ~PERF_SAMPLE_BRANCH_HV;
> +
> + for_each_branch_sample_type(x) {
> + /* Ignore privilege requests */
> + if ((x == PERF_SAMPLE_BRANCH_USER) || (x == PERF_SAMPLE_BRANCH_KERNEL) || (x == PERF_SAMPLE_BRANCH_HV))
> + continue;
> +
> + if (!(tmp & x))
> + continue;
> +
> +   /* Supported HW PMU filters */
> + if (tmp & PERF_SAMPLE_BRANCH_ANY_CALL) {
> + tmp &= ~PERF_SAMPLE_BRANCH_ANY_CALL;
> + if (tmp) {
> + pmu_bhrb_filter = 0;
> + *filter_mask = 0;
> + return pmu_bhrb_filter;
> + }
> + }
> +
> + if (tmp & PERF_SAMPLE_BRANCH_COND) {
> + tmp &= ~PERF_SAMPLE_BRANCH_COND;
> + if (tmp) {
> + pmu_bhrb_filter = 0;
> + *filter_mask = 0;
> + return pmu_bhrb_filter;
> + }
> + }
> + }

>  
>   /* BHRB and regular PMU events share the same privilege state
>* filter configuration. BHRB is always recorded along with a
> @@ -570,34 +619,20 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type, u64 *filter_mask)
>* PMU event, we ignore any separate BHRB specific request.
>*/
>  
> - /* No branch filter requested */
> - if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
> - return pmu_bhrb_filter;
> -
> - /* Invalid branch filter options - HW does not support */
> - if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
> - return -1;
> -
> - if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
> - return -1;
> -
> + /* Supported individual branch filters */
>   if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
>   pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
> + *filter_mask |= PERF_SAMPLE_BRANCH_ANY_CALL;
>   return pmu_bhrb_filter;
>   }
>  
>   if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
>   pmu_bhrb_filter |= POWER8_MMCRA_IFM3;
> + *filter_mask |= PERF_SAMPLE_BRANCH_COND;
>   return pmu_bhrb_filter;
>   }
>  
> - /* PMU does not support ANY combination of HW BHRB filters */
> - if ((branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) &&
> - (branch_sample_type & PERF_SAMPLE_BRANCH_COND))
> - return -1;
> -
> - /* Every thing else is unsupported */
> - return -1;
> + return pmu_bhrb_filter;
>  }


As I said in my comments on version 3 which you ignored:

I think it would be clearer if we actually checked for the possibilities we
allow and let everything else fall through, eg:

   

[PATCH 05/17] tracing/kprobes: Move common functions to trace_probe.h

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

The __get_data_size() and store_trace_args() will be used by uprobes
too.  Move them to a common location.
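
For readers unfamiliar with the data-location words that store_trace_args() manipulates, here is a small standalone sketch of how a length and an offset can be packed into one 32-bit value and later rebased. The helper names mirror the ones used in the patch, but the exact bit layout shown here (length in the upper 16 bits, offset in the lower 16) is stated as an assumption for illustration.

```c
#include <stdint.h>

/* Pack a length (upper 16 bits) and a relative offset (lower 16 bits). */
static uint32_t make_data_rloc(uint32_t len, uint32_t roffs)
{
	return (len << 16) | (roffs & 0xffff);
}

/* Extract the length part. */
static uint32_t get_rloc_len(uint32_t dl)
{
	return dl >> 16;
}

/* Extract the offset part. */
static uint32_t get_rloc_offs(uint32_t dl)
{
	return dl & 0xffff;
}

/*
 * Turn an offset relative to the argument slot into one relative to the
 * start of the whole entry by adding the slot's own position.
 */
static uint32_t convert_rloc_to_loc(uint32_t dl, uint32_t offs)
{
	return dl + offs;
}
```

With this layout a single u32 slot in the fixed part of the record is enough to describe where a variable-length string lives and how long it is.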

Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 48 -
 kernel/trace/trace_probe.h  | 48 +
 2 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index db922fd74b05..0167d4b92431 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -740,54 +740,6 @@ static const struct file_operations kprobe_profile_ops = {
.release = seq_release,
 };
 
-/* Sum up total data length for dynamic arraies (strings) */
-static __kprobes int __get_data_size(struct trace_probe *tp,
-struct pt_regs *regs)
-{
-   int i, ret = 0;
-   u32 len;
-
-   for (i = 0; i < tp->nr_args; i++)
-   if (unlikely(tp->args[i].fetch_size.fn)) {
-   call_fetch(&tp->args[i].fetch_size, regs, &len);
-   ret += len;
-   }
-
-   return ret;
-}
-
-/* Store the value of each argument */
-static __kprobes void store_trace_args(int ent_size, struct trace_probe *tp,
-  struct pt_regs *regs,
-  u8 *data, int maxlen)
-{
-   int i;
-   u32 end = tp->size;
-   u32 *dl;    /* Data (relative) location */
-
-   for (i = 0; i < tp->nr_args; i++) {
-   if (unlikely(tp->args[i].fetch_size.fn)) {
-   /*
-* First, we set the relative location and
-* maximum data length to *dl
-*/
-   dl = (u32 *)(data + tp->args[i].offset);
-   *dl = make_data_rloc(maxlen, end - tp->args[i].offset);
-   /* Then try to fetch string or dynamic array data */
-   call_fetch(&tp->args[i].fetch, regs, dl);
-   /* Reduce maximum length */
-   end += get_rloc_len(*dl);
-   maxlen -= get_rloc_len(*dl);
-   /* Trick here, convert data_rloc to data_loc */
-   *dl = convert_rloc_to_loc(*dl,
-ent_size + tp->args[i].offset);
-   } else
-   /* Just fetching data normally */
-   call_fetch(&tp->args[i].fetch, regs,
-  data + tp->args[i].offset);
-   }
-}
-
 /* Kprobe handler */
 static __kprobes void
 __kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs,
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 984e91ed8a44..d384fbd4025c 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -178,3 +178,51 @@ extern ssize_t traceprobe_probes_write(struct file *file,
int (*createfn)(int, char**));
 
 extern int traceprobe_command(const char *buf, int (*createfn)(int, char**));
+
+/* Sum up total data length for dynamic arraies (strings) */
+static inline __kprobes int
+__get_data_size(struct trace_probe *tp, struct pt_regs *regs)
+{
+   int i, ret = 0;
+   u32 len;
+
+   for (i = 0; i < tp->nr_args; i++)
+   if (unlikely(tp->args[i].fetch_size.fn)) {
+   call_fetch(&tp->args[i].fetch_size, regs, &len);
+   ret += len;
+   }
+
+   return ret;
+}
+
+/* Store the value of each argument */
+static inline __kprobes void
+store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs,
+u8 *data, int maxlen)
+{
+   int i;
+   u32 end = tp->size;
+   u32 *dl;    /* Data (relative) location */
+
+   for (i = 0; i < tp->nr_args; i++) {
+   if (unlikely(tp->args[i].fetch_size.fn)) {
+   /*
+* First, we set the relative location and
+* maximum data length to *dl
+*/
+   dl = (u32 *)(data + tp->args[i].offset);
+   *dl = make_data_rloc(maxlen, end - tp->args[i].offset);
+   /* Then try to fetch string or dynamic array data */
+   call_fetch(&tp->args[i].fetch, regs, dl);
+   /* Reduce maximum length */
+   end += get_rloc_len(*dl);
+   maxlen -= get_rloc_len(*dl);
+   /* Trick here, convert data_rloc to data_loc */
+   *dl = convert_rloc_to_loc(*dl,
+ent_size + tp->args[i].offset);
+   } else
+   /* Just fetching data normally */
+   

Re: [PATCH V4 08/10] powerpc, perf: Enable SW filtering in branch stack sampling framework

2013-12-08 Thread Michael Ellerman
On Wed, 2013-04-12 at 10:32:40 UTC, Anshuman Khandual wrote:
> This patch enables SW based post processing of BHRB captured branches
> to be able to meet more user defined branch filtration criteria in perf
> branch stack sampling framework. These changes increase the number of
> branch filters and their valid combinations on any powerpc64 server
> platform with BHRB support. Find the summary of code changes here.
> 
> (1) struct cpu_hw_events
> 
>   Introduced two new variables to track various filter values and mask
> 
>   (a) bhrb_sw_filter  Tracks SW implemented branch filter flags
>   (b) filter_mask Tracks both (SW and HW) branch filter flags

The name 'filter_mask' doesn't mean much to me. I'd rather it was 'bhrb_filter'.


> (2) Event creation
> 
>   Kernel will figure out supported BHRB branch filters through a PMU call
>   back 'bhrb_filter_map'. This function will find out how many of the
>   requested branch filters can be supported in the PMU HW. It will not
>   try to invalidate any branch filter combinations. Event creation will
>   not error out because of lack of HW based branch filters. Meanwhile it
>   will track the overall supported branch filters in the "filter_mask"
>   variable.
> 
>   Once the PMU call back returns kernel will process the user branch
>   filter request against available SW filters while looking at the
>   "filter_mask". During this phase all the branch filters which are still
>   pending from the user requested list will have to be supported in SW
>   failing which the event creation will error out.
> 
> (3) SW branch filter
> 
>   During the BHRB data capture inside the PMU interrupt context, each
>   of the captured 'perf_branch_entry.from' will be checked for compliance
>   with applicable SW branch filters. If the entry does not conform to the
>   filter requirements, it will be discarded from the final perf branch
>   stack buffer.
> 
> (4) Supported SW based branch filters
> 
>   (a) PERF_SAMPLE_BRANCH_ANY_RETURN
>   (b) PERF_SAMPLE_BRANCH_IND_CALL
>   (c) PERF_SAMPLE_BRANCH_ANY_CALL
>   (d) PERF_SAMPLE_BRANCH_COND
> 
>   Please refer to the patch to understand the classification of instructions into
>   these branch filter categories.
> 
> (5) Multiple branch filter semantics
> 
>   Book3S server implementation follows the same OR semantics (as
>   implemented in x86) while dealing with multiple branch filters at any
>   point of time. SW branch filter analysis is carried on the data set
>   captured in the PMU HW. So the resulting set of data (after applying the
>   SW filters) will inherently be an AND with the HW captured set. Hence any
>   combination of HW and SW branch filters will be invalid. HW based branch
>   filters are more efficient and faster compared to SW implemented branch
>   filters. So at first the PMU should decide whether it can support all the
>   requested branch filters itself or not. In case it can support all the
>   branch filters in an OR manner, we don't apply any SW branch filter on
>   top of the HW captured set (which is the final set). This preserves the
>   OR semantic of multiple branch filters as required. But in case where the
>   PMU cannot support all the requested branch filters in an OR manner, it
>   should not apply any of its filters and leave it up to the SW to handle
>   them all. It's the PMU code's responsibility to uphold this protocol to
>   be able to conform to the overall OR semantic of perf branch stack
>   sampling framework.


I'd prefer this level of commentary was in a block comment in the code. It's
much more likely to be seen by a future hacker than here in the commit log.
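
For what it's worth, the protocol described in the quoted section (5) boils down to a small decision: either the hardware handles the whole request on its own, or it handles none of it and software does everything. A minimal standalone sketch follows, assuming (as the POWER8 patch in this series states) that the PMU can apply at most one hardware filter at a time. The F_* bits and plan_filters() are hypothetical names, not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical filter bits, for illustration only. */
#define F_ANY_CALL	(1u << 0)
#define F_COND		(1u << 1)
#define F_ANY_RETURN	(1u << 2)
#define F_IND_CALL	(1u << 3)

struct filter_plan {
	uint32_t hw;	/* filters applied by the PMU */
	uint32_t sw;	/* filters applied in software afterwards */
};

/*
 * hw_capable is the set of filters the PMU implements. Because software
 * filtering runs on what the hardware already captured (an implicit AND),
 * the plan never mixes the two: it is all-HW or all-SW.
 */
static struct filter_plan plan_filters(uint32_t requested, uint32_t hw_capable)
{
	struct filter_plan plan = { 0, 0 };
	/* True when exactly one bit is set in the request. */
	bool single = requested && !(requested & (requested - 1));

	if (single && (requested & hw_capable))
		plan.hw = requested;	/* HW output is the final set */
	else
		plan.sw = requested;	/* HW filtering off, SW does it all */

	return plan;
}
```

Keeping the two sets disjoint is what preserves the OR semantics: a branch is kept if it matches any requested filter, never only if it matches both a HW and a SW one.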


> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 2de7d48..54d39a5 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -48,6 +48,8 @@ struct cpu_hw_events {
>  
>   /* BHRB bits */
>   u64 bhrb_hw_filter; /* BHRB HW branch filter */
> + u64 bhrb_sw_filter; /* BHRB SW branch filter */
> + u64 filter_mask;    /* Branch filter mask */
>   int bhrb_users;
>   void *bhrb_context;
>   struct perf_branch_stack bhrb_stack;
> @@ -400,6 +402,228 @@ static __u64 power_pmu_bhrb_to(u64 addr)
>   return target - (unsigned long)&instr + addr;
>  }
>  
> +/*
> + * Instruction opcode analysis
> + *
> + * Analyse instruction opcodes and classify them
> + * into various branch filter options available.
> + * This follows the standard semantics of OR which
> + * means that instructions which conforms to `any`
> + * of the requested branch filters get picked up.
> + */
> +static bool 

Re: [PATCH V4 10/10] powerpc, perf: Cleanup SW branch filter list look up

2013-12-08 Thread Michael Ellerman
On Wed, 2013-04-12 at 10:32:42 UTC, Anshuman Khandual wrote:
> This patch adds enumeration for all available SW branch filters
> in powerpc book3s code and also streamlines the look for the
> SW branch filter entries while trying to figure out which all
> branch filters can be supported in SW.

This appears to patch code that was only added in 8/10 ?

Was there any reason not to do it the right way from the beginning?

cheers


Re: [PATCH V4 07/10] powerpc, lib: Add new branch instruction analysis support functions

2013-12-08 Thread Michael Ellerman
On Wed, 2013-04-12 at 10:32:39 UTC, Anshuman Khandual wrote:
> Generic powerpc branch instruction analysis support added in the code
> patching library which will help the subsequent patch on SW based
> filtering of branch records in perf. This patch also converts and
> exports some of the existing local static functions through the header
> file to be used else where.
> 
> diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
> index a6f8c7a..8bab417 100644
> --- a/arch/powerpc/include/asm/code-patching.h
> +++ b/arch/powerpc/include/asm/code-patching.h
> @@ -22,6 +22,36 @@
>  #define BRANCH_SET_LINK  0x1
>  #define BRANCH_ABSOLUTE  0x2
>  
> +#define XL_FORM_LR  0x4C000020
> +#define XL_FORM_CTR 0x4C000420
> +#define XL_FORM_TAR 0x4C000460
> +
> +#define BO_ALWAYS    0x02800000
> +#define BO_CTR       0x02000000
> +#define BO_CRBI_OFF  0x00800000
> +#define BO_CRBI_ON   0x01800000
> +#define BO_CRBI_HINT 0x00400000
> +
> +/* Forms of branch instruction */
> +int instr_is_branch_iform(unsigned int instr);
> +int instr_is_branch_bform(unsigned int instr);
> +int instr_is_branch_xlform(unsigned int instr);
> +
> +/* Classification of XL-form instruction */
> +int is_xlform_lr(unsigned int instr);
> +int is_xlform_ctr(unsigned int instr);
> +int is_xlform_tar(unsigned int instr);
> +
> +/* Branch instruction is a call */
> +int is_branch_link_set(unsigned int instr);
> +
> +/* BO field analysis (B-form or XL-form) */
> +int is_bo_always(unsigned int instr);
> +int is_bo_ctr(unsigned int instr);
> +int is_bo_crbi_off(unsigned int instr);
> +int is_bo_crbi_on(unsigned int instr);
> +int is_bo_crbi_hint(unsigned int instr);


I think this is the wrong API.

We end up with all these micro checks, which don't actually encapsulate much,
and don't implement the logic perf needs. If we had another user for this level
of detail then it might make sense, but for a single user I think we're better
off just implementing the semantics it wants.

So that would be something more like:

bool instr_is_return_branch(unsigned int instr);
bool instr_is_conditional_branch(unsigned int instr);
bool instr_is_func_call(unsigned int instr);
bool instr_is_indirect_func_call(unsigned int instr);


These would then encapsulate something like the logic in your 8/10 patch. You
can hopefully also optimise the checking logic in each routine because you know
the exact semantics you're implementing.
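
To make the suggestion concrete, here is one possible standalone reading of those four predicates, decoding the Power ISA branch forms directly (I-form is primary opcode 18, B-form is 16, and the XL-form branches bclr/bcctr/bctar are primary opcode 19 with extended opcodes 16/528/560). The exact semantics perf wants are an assumption here. In particular, treating every branch with LK set as a call, and bclr without LK as a return, is only a first approximation.

```c
#include <stdbool.h>

static unsigned int primary_opcode(unsigned int instr)
{
	return instr >> 26;
}

/* Extended opcode of an XL-form instruction. */
static unsigned int xl_xo(unsigned int instr)
{
	return (instr >> 1) & 0x3ff;
}

/* BO field of a B-form or XL-form branch. */
static unsigned int bo_field(unsigned int instr)
{
	return (instr >> 21) & 0x1f;
}

static bool link_set(unsigned int instr)
{
	return instr & 1;
}

static bool is_xl_branch(unsigned int instr, unsigned int xo)
{
	return primary_opcode(instr) == 19 && xl_xo(instr) == xo;
}

static bool is_branch(unsigned int instr)
{
	unsigned int op = primary_opcode(instr);

	return op == 18 || op == 16 || is_xl_branch(instr, 16) ||
	       is_xl_branch(instr, 528) || is_xl_branch(instr, 560);
}

/* Any branch that sets the link register. */
bool instr_is_func_call(unsigned int instr)
{
	return is_branch(instr) && link_set(instr);
}

/* Branch to LR without setting the link register (blr and friends). */
bool instr_is_return_branch(unsigned int instr)
{
	return is_xl_branch(instr, 16) && !link_set(instr);
}

/* Branch to CTR or TAR with the link register set. */
bool instr_is_indirect_func_call(unsigned int instr)
{
	return (is_xl_branch(instr, 528) || is_xl_branch(instr, 560)) &&
	       link_set(instr);
}

/* B-form or XL-form branch whose BO field is not "branch always". */
bool instr_is_conditional_branch(unsigned int instr)
{
	if (primary_opcode(instr) == 18 || !is_branch(instr))
		return false;
	return (bo_field(instr) & 0x14) != 0x14;
}
```

For example, 0x4E800020 (blr) is reported as a return, 0x48000101 (a bl) as a call, 0x4E800421 (bctrl) as an indirect call and 0x41820008 (a beq) as a conditional branch.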

cheers


[PATCH 15/17] tracing/uprobes: Add support for full argument access methods

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Enable fetching other types of arguments for uprobes.  IOW, we can
access stack, memory, deref, bitfield and retval from uprobes now.

The format of the argument types is the same as for kprobes (but the
@SYMBOL type is not supported for uprobes), i.e.:

  @ADDR   : Fetch memory at ADDR
  $stackN : Fetch Nth entry of stack (N >= 0)
  $stack  : Fetch stack address
  $retval : Fetch return value
  +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address

Note that $retval can only be used with uretprobes.
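
To illustrate the syntax listed above, here is a small standalone classifier that maps an argument string to one of the listed kinds by its prefix. It is a simplified sketch, not the kernel's parse_probe_arg(): the enum and classify_fetcharg() are hypothetical, it includes the pre-existing %REG form, and it only inspects the outer shape of each argument.

```c
#include <ctype.h>
#include <string.h>

enum fetcharg_kind {
	ARG_INVALID,
	ARG_REGISTER,		/* %REG */
	ARG_MEMORY,		/* @ADDR */
	ARG_STACK_ENTRY,	/* $stackN */
	ARG_STACK_ADDR,		/* $stack */
	ARG_RETVAL,		/* $retval */
	ARG_DEREF,		/* +|-offs(FETCHARG) */
};

static enum fetcharg_kind classify_fetcharg(const char *arg)
{
	switch (arg[0]) {
	case '%':
		return arg[1] ? ARG_REGISTER : ARG_INVALID;
	case '@':
		/* Only numeric addresses; @SYMBOL is not accepted here. */
		return isdigit((unsigned char)arg[1]) ? ARG_MEMORY
						      : ARG_INVALID;
	case '$':
		if (strcmp(arg + 1, "retval") == 0)
			return ARG_RETVAL;
		if (strncmp(arg + 1, "stack", 5) == 0) {
			if (arg[6] == '\0')
				return ARG_STACK_ADDR;
			if (isdigit((unsigned char)arg[6]))
				return ARG_STACK_ENTRY;
		}
		return ARG_INVALID;
	case '+':
	case '-':
		/* Require a parenthesised inner argument. */
		if (strchr(arg, '(') && arg[strlen(arg) - 1] == ')')
			return ARG_DEREF;
		return ARG_INVALID;
	default:
		return ARG_INVALID;
	}
}
```

A real parser would additionally recurse into the inner FETCHARG of a deref and reject $retval on non-return probes.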

Original-patch-by: Hyeoncheol Lee 
Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 Documentation/trace/uprobetracer.txt | 25 +
 kernel/trace/trace_probe.c   | 34 ++
 2 files changed, 47 insertions(+), 12 deletions(-)

diff --git a/Documentation/trace/uprobetracer.txt b/Documentation/trace/uprobetracer.txt
index 8f1a8b8956fc..6e5cff263e2b 100644
--- a/Documentation/trace/uprobetracer.txt
+++ b/Documentation/trace/uprobetracer.txt
@@ -31,6 +31,31 @@ Synopsis of uprobe_tracer
 
   FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
+   @ADDR   : Fetch memory at ADDR (ADDR should be in userspace)
+   $stackN : Fetch Nth entry of stack (N >= 0)
+   $stack  : Fetch stack address.
+   $retval : Fetch return value.(*)
+   +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**)
+   NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
+   FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
+  (u8/u16/u32/u64/s8/s16/s32/s64), "string" and bitfield
+  are supported.
+
+  (*) only for return probe.
+  (**) this is useful for fetching a field of data structures.
+
+Types
+-----
+Several types are supported for fetch-args. Uprobe tracer will access memory
+by given type. Prefix 's' and 'u' means those types are signed and unsigned
+respectively. Traced arguments are shown in decimal (signed) or hex (unsigned).
+String type is a special type, which fetches a "null-terminated" string from
+user space.
+Bitfield is another special type, which takes 3 parameters, bit-width, bit-
+offset, and container-size (usually 32). The syntax is;
+
+ b<bit-width>@<bit-offset>/<container-size>
+
 
 Event Profiling
 ---------------
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index d0b4a42dafcf..464ec506ec08 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -254,12 +254,18 @@ fail:
 }
 
 /* Special function : only accept unsigned long */
-static __kprobes void fetch_stack_address(struct pt_regs *regs,
-   void *dummy, void *dest)
+static __kprobes void fetch_kernel_stack_address(struct pt_regs *regs,
+void *dummy, void *dest)
 {
*(unsigned long *)dest = kernel_stack_pointer(regs);
 }
 
+static __kprobes void fetch_user_stack_address(struct pt_regs *regs,
+  void *dummy, void *dest)
+{
+   *(unsigned long *)dest = user_stack_pointer(regs);
+}
+
 static fetch_func_t get_fetch_size_function(const struct fetch_type *type,
fetch_func_t orig_fn,
const struct fetch_type *ftbl)
@@ -304,7 +310,8 @@ int traceprobe_split_symbol_offset(char *symbol, unsigned long *offset)
 #define PARAM_MAX_STACK (THREAD_SIZE / sizeof(unsigned long))
 
 static int parse_probe_vars(char *arg, const struct fetch_type *t,
-   struct fetch_param *f, bool is_return)
+   struct fetch_param *f, bool is_return,
+   bool is_kprobe)
 {
int ret = 0;
unsigned long param;
@@ -316,13 +323,16 @@ static int parse_probe_vars(char *arg, const struct fetch_type *t,
ret = -EINVAL;
} else if (strncmp(arg, "stack", 5) == 0) {
if (arg[5] == '\0') {
-   if (strcmp(t->name, DEFAULT_FETCH_TYPE_STR) == 0)
-   f->fn = fetch_stack_address;
+   if (strcmp(t->name, DEFAULT_FETCH_TYPE_STR))
+   return -EINVAL;
+
+   if (is_kprobe)
+   f->fn = fetch_kernel_stack_address;
else
-   ret = -EINVAL;
+   f->fn = fetch_user_stack_address;
} else if (isdigit(arg[5])) {
ret = kstrtoul(arg + 5, 10, &param);
-   if (ret || param > PARAM_MAX_STACK)
+   if (ret || (is_kprobe && param > PARAM_MAX_STACK))
ret = -EINVAL;
else {

[PATCH 14/17] tracing/uprobes: Fetch args before reserving a ring buffer

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Fetching from user space should be done in a non-atomic context.  So
use a per-cpu buffer and copy its content to the ring buffer
atomically.  Note that we can migrate during accessing user memory
thus use a per-cpu mutex to protect concurrent accesses.

This is needed since we'll be able to fetch args from an user memory
which can be swapped out.  Before that uprobes could fetch args from
registers only which saved in a kernel space.

While at it, use __get_data_size() and store_trace_args() to reduce
code duplication.  And add struct uprobe_cpu_buffer and its helpers as
suggested by Oleg.
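
The enable/disable helpers follow a familiar pattern: the shared scratch buffer is allocated when the first user appears and released when the last one goes away. A single-threaded userspace sketch of that pattern is shown below. It assumes the caller already holds whatever lock serialises enable and disable (the patch asserts this with BUG_ON), and the scratch_* names are invented for the example.

```c
#include <stdlib.h>

#define SCRATCH_SIZE 4096

static void *scratch_buf;
static int scratch_refcnt;

/* Returns 0 on success, -1 if the allocation for the first user fails. */
static int scratch_enable(void)
{
	if (scratch_refcnt++ == 0) {
		scratch_buf = malloc(SCRATCH_SIZE);
		if (!scratch_buf) {
			/* Undo the count so a later enable can retry. */
			scratch_refcnt--;
			return -1;
		}
	}
	return 0;
}

/* Drops one reference; the last user frees the buffer. */
static void scratch_disable(void)
{
	if (--scratch_refcnt == 0) {
		free(scratch_buf);
		scratch_buf = NULL;
	}
}
```

Resetting the pointer to NULL on the last disable lets the fast path cheaply detect that no buffer is available.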

Reviewed-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_uprobe.c | 146 +++-
 1 file changed, 132 insertions(+), 14 deletions(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index d407a556aa55..f86a6a711de9 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -650,21 +650,117 @@ static const struct file_operations uprobe_profile_ops = {
.release = seq_release,
 };
 
+struct uprobe_cpu_buffer {
+   struct mutex mutex;
+   void *buf;
+};
+static struct uprobe_cpu_buffer __percpu *uprobe_cpu_buffer;
+static int uprobe_buffer_refcnt;
+
+static int uprobe_buffer_init(void)
+{
+   int cpu, err_cpu;
+
+   uprobe_cpu_buffer = alloc_percpu(struct uprobe_cpu_buffer);
+   if (uprobe_cpu_buffer == NULL)
+   return -ENOMEM;
+
+   for_each_possible_cpu(cpu) {
+   struct page *p = alloc_pages_node(cpu_to_node(cpu),
+ GFP_KERNEL, 0);
+   if (p == NULL) {
+   err_cpu = cpu;
+   goto err;
+   }
+   per_cpu_ptr(uprobe_cpu_buffer, cpu)->buf = page_address(p);
+   mutex_init(&per_cpu_ptr(uprobe_cpu_buffer, cpu)->mutex);
+   }
+
+   return 0;
+
+err:
+   for_each_possible_cpu(cpu) {
+   if (cpu == err_cpu)
+   break;
+   free_page((unsigned long)per_cpu_ptr(uprobe_cpu_buffer, cpu)->buf);
+   }
+
+   free_percpu(uprobe_cpu_buffer);
+   return -ENOMEM;
+}
+
+static int uprobe_buffer_enable(void)
+{
+   int ret = 0;
+
+   BUG_ON(!mutex_is_locked(&event_mutex));
+
+   if (uprobe_buffer_refcnt++ == 0) {
+   ret = uprobe_buffer_init();
+   if (ret < 0)
+   uprobe_buffer_refcnt--;
+   }
+
+   return ret;
+}
+
+static void uprobe_buffer_disable(void)
+{
+   BUG_ON(!mutex_is_locked(&event_mutex));
+
+   if (--uprobe_buffer_refcnt == 0) {
+   free_percpu(uprobe_cpu_buffer);
+   uprobe_cpu_buffer = NULL;
+   }
+}
+
+static struct uprobe_cpu_buffer *uprobe_buffer_get(void)
+{
+   struct uprobe_cpu_buffer *ucb;
+   int cpu;
+
+   cpu = raw_smp_processor_id();
+   ucb = per_cpu_ptr(uprobe_cpu_buffer, cpu);
+
+   /*
+* Use per-cpu buffers for fastest access, but we might migrate
+* so the mutex makes sure we have sole access to it.
+*/
+   mutex_lock(&ucb->mutex);
+
+   return ucb;
+}
+
+static void uprobe_buffer_put(struct uprobe_cpu_buffer *ucb)
+{
+   mutex_unlock(&ucb->mutex);
+}
+
 static void uprobe_trace_print(struct trace_uprobe *tu,
unsigned long func, struct pt_regs *regs)
 {
struct uprobe_trace_entry_head *entry;
struct ring_buffer_event *event;
struct ring_buffer *buffer;
+   struct uprobe_cpu_buffer *ucb;
void *data;
-   int size, i;
+   int size, dsize, esize;
struct ftrace_event_call *call = &tu->tp.call;
 
-   size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
+   dsize = __get_data_size(&tu->tp, regs);
+   esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
+
+   if (WARN_ON_ONCE(!uprobe_cpu_buffer || tu->tp.size + dsize > PAGE_SIZE))
+   return;
+
+   ucb = uprobe_buffer_get();
+   store_trace_args(esize, &tu->tp, regs, ucb->buf, dsize);
+
+   size = esize + tu->tp.size + dsize;
event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
- size + tu->tp.size, 0, 0);
+ size, 0, 0);
if (!event)
-   return;
+   goto out;
 
entry = ring_buffer_event_data(event);
if (is_ret_probe(tu)) {
@@ -676,13 +772,13 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
data = DATAOF_TRACE_ENTRY(entry, false);
}
 
-   for (i = 0; i < tu->tp.nr_args; i++) {
-   call_fetch(&tu->tp.args[i].fetch, regs,
-  data + tu->tp.args[i].offset);
-   }
+   memcpy(data, ucb->buf, tu->tp.size + dsize);
 
if 

[PATCH 17/17] tracing/uprobes: Add @+file_offset fetch method

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Enable fetching data from a file offset.  Currently this only supports
fetching from the same binary the uprobe is set on.  The file offset is
translated to the proper virtual address in the process.

The syntax is "@+OFFSET", by analogy with normal memory fetching
(@ADDR), which does no address translation.
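
One way to picture the translation: if the probed instruction's file offset and the virtual address at which the probe actually fired are both known, their difference gives the load base, and any other offset in the same file can be rebased with it. The sketch below assumes the target data lives in a mapping that shares this same file-to-address delta. The function name and parameters are hypothetical.

```c
#include <stdint.h>

/*
 * bp_vaddr:   virtual address at which the probe fired
 * bp_offset:  file offset of the probed instruction
 * target_off: file offset of the data to fetch
 */
static uintptr_t file_offset_to_vaddr(uintptr_t bp_vaddr, uintptr_t bp_offset,
				      uintptr_t target_off)
{
	/* Address at which file offset 0 would be mapped. */
	uintptr_t base = bp_vaddr - bp_offset;

	return base + target_off;
}
```

For instance, a probe at file offset 0x4f0 that fires at 0x4004f0 implies a base of 0x400000, so file offset 0x1000 maps to 0x401000.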

Suggested-by: Oleg Nesterov 
Acked-by: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 Documentation/trace/uprobetracer.txt |  1 +
 kernel/trace/trace_kprobe.c  |  8 
 kernel/trace/trace_probe.c   | 13 +++-
 kernel/trace/trace_probe.h   |  2 ++
 kernel/trace/trace_uprobe.c  | 40 
 5 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/Documentation/trace/uprobetracer.txt b/Documentation/trace/uprobetracer.txt
index 6e5cff263e2b..f1cf9a34ad9d 100644
--- a/Documentation/trace/uprobetracer.txt
+++ b/Documentation/trace/uprobetracer.txt
@@ -32,6 +32,7 @@ Synopsis of uprobe_tracer
   FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
@ADDR   : Fetch memory at ADDR (ADDR should be in userspace)
+   @+OFFSET: Fetch memory at OFFSET (OFFSET from same file as PATH)
$stackN : Fetch Nth entry of stack (N >= 0)
$stack  : Fetch stack address.
$retval : Fetch return value.(*)
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 8b32236ae890..1cf8f3375559 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -239,6 +239,14 @@ DEFINE_BASIC_FETCH_FUNCS(symbol)
 DEFINE_FETCH_symbol(string)
 DEFINE_FETCH_symbol(string_size)
 
+/* kprobes don't support file_offset fetch methods */
+#define fetch_file_offset_u8   NULL
+#define fetch_file_offset_u16  NULL
+#define fetch_file_offset_u32  NULL
+#define fetch_file_offset_u64  NULL
+#define fetch_file_offset_string   NULL
+#define fetch_file_offset_string_size  NULL
+
 /* Fetch type information table */
 const struct fetch_type kprobes_fetch_type_table[] = {
/* Special types */
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 464ec506ec08..705c06ba0a8a 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -373,7 +373,7 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
}
break;
 
-   case '@':   /* memory or symbol */
+   case '@':   /* memory, file-offset or symbol */
if (isdigit(arg[1])) {
ret = kstrtoul(arg + 1, 0, &param);
if (ret)
@@ -381,6 +381,17 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
 
f->fn = t->fetch[FETCH_MTD_memory];
f->data = (void *)param;
+   } else if (arg[1] == '+') {
+   /* kprobes don't support file offsets */
+   if (is_kprobe)
+   return -EINVAL;
+
+   ret = kstrtol(arg + 2, 0, &offset);
+   if (ret)
+   break;
+
+   f->fn = t->fetch[FETCH_MTD_file_offset];
+   f->data = (void *)offset;
} else {
/* uprobes don't support symbols */
if (!is_kprobe)
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 385206bbbf8b..d9afeb580cbf 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -106,6 +106,7 @@ enum {
FETCH_MTD_symbol,
FETCH_MTD_deref,
FETCH_MTD_bitfield,
+   FETCH_MTD_file_offset,
FETCH_MTD_END,
 };
 
@@ -217,6 +218,7 @@ ASSIGN_FETCH_FUNC(memory, ftype),   \
 ASSIGN_FETCH_FUNC(symbol, ftype),  \
 ASSIGN_FETCH_FUNC(deref, ftype),   \
 ASSIGN_FETCH_FUNC(bitfield, ftype),\
+ASSIGN_FETCH_FUNC(file_offset, ftype), \
  } \
}
 
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index f86a6a711de9..f5cbed7e7709 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -70,6 +70,11 @@ static int unregister_uprobe_event(struct trace_uprobe *tu);
 static DEFINE_MUTEX(uprobe_lock);
 static LIST_HEAD(uprobe_list);
 
+struct uprobe_dispatch_data {
+   struct trace_uprobe *tu;
+   unsigned long   bp_addr;
+};
+
 static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs 
*regs);
 static int uretprobe_dispatcher(struct uprobe_consumer *con,
unsigned long func, struct pt_regs *regs);
@@ -175,6 +180,29 @@ static __kprobes void FETCH_FUNC_NAME(memory, 
string_size)(struct pt_regs 

[PATCH 12/17] tracing/probes: Implement 'memory' fetch method for uprobes

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

Use a separate method to fetch from memory.  Move the existing functions to
trace_kprobe.c and make them static.  Also add a new memory fetch
implementation for uprobes.

Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c | 77 +++
 kernel/trace/trace_probe.c  | 79 +
 kernel/trace/trace_probe.h  |  4 ---
 kernel/trace/trace_uprobe.c | 52 +
 4 files changed, 130 insertions(+), 82 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 2886e6da524d..8b32236ae890 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -148,6 +148,83 @@ DEFINE_BASIC_FETCH_FUNCS(stack)
 #define fetch_stack_string NULL
 #define fetch_stack_string_sizeNULL
 
+#define DEFINE_FETCH_memory(type)  \
+static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
+ void *addr, void *dest)   \
+{  \
+   type retval;\
+   if (probe_kernel_address(addr, retval)) \
+   *(type *)dest = 0;  \
+   else\
+   *(type *)dest = retval; \
+}
+DEFINE_BASIC_FETCH_FUNCS(memory)
+/*
+ * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
+ * length and relative data location.
+ */
+static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+ void *addr, void *dest)
+{
+   long ret;
+   int maxlen = get_rloc_len(*(u32 *)dest);
+   u8 *dst = get_rloc_data(dest);
+   u8 *src = addr;
+   mm_segment_t old_fs = get_fs();
+
+   if (!maxlen)
+   return;
+
+   /*
+* Try to get string again, since the string can be changed while
+* probing.
+*/
+   set_fs(KERNEL_DS);
+   pagefault_disable();
+
+   do
+   ret = __copy_from_user_inatomic(dst++, src++, 1);
+   while (dst[-1] && ret == 0 && src - (u8 *)addr < maxlen);
+
+   dst[-1] = '\0';
+   pagefault_enable();
+   set_fs(old_fs);
+
+   if (ret < 0) {  /* Failed to fetch string */
+   ((u8 *)get_rloc_data(dest))[0] = '\0';
+   *(u32 *)dest = make_data_rloc(0, get_rloc_offs(*(u32 *)dest));
+   } else {
+   *(u32 *)dest = make_data_rloc(src - (u8 *)addr,
+ get_rloc_offs(*(u32 *)dest));
+   }
+}
+
+/* Return the length of string -- including null terminal byte */
+static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs 
*regs,
+   void *addr, void *dest)
+{
+   mm_segment_t old_fs;
+   int ret, len = 0;
+   u8 c;
+
+   old_fs = get_fs();
+   set_fs(KERNEL_DS);
+   pagefault_disable();
+
+   do {
+   ret = __copy_from_user_inatomic(&c, (u8 *)addr + len, 1);
+   len++;
+   } while (c && ret == 0 && len < MAX_STRING_SIZE);
+
+   pagefault_enable();
+   set_fs(old_fs);
+
+   if (ret < 0)/* Failed to check the length */
+   *(u32 *)dest = 0;
+   else
+   *(u32 *)dest = len;
+}
+
 #define DEFINE_FETCH_symbol(type)  \
 __kprobes void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs, \
  void *data, void *dest)   \
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index b4f28bc39959..d0b4a42dafcf 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -103,83 +103,6 @@ DEFINE_BASIC_FETCH_FUNCS(retval)
 #define fetch_retval_stringNULL
 #define fetch_retval_string_size   NULL
 
-#define DEFINE_FETCH_memory(type)  \
-__kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
- void *addr, void *dest)   \
-{  \
-   type retval;\
-   if (probe_kernel_address(addr, retval)) \
-   *(type *)dest = 0;  \
-   else\
-   *(type *)dest = retval; \
-}
-DEFINE_BASIC_FETCH_FUNCS(memory)
-/*
- * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with 

[PATCH 16/17] uprobes: Allocate ->utask before handler_chain() for tracing handlers

2013-12-08 Thread Namhyung Kim
From: Oleg Nesterov 

uprobe_trace_print() and uprobe_perf_print() need to pass additional
info to the call_fetch() methods, but currently there is no simple way to do this.

current->utask looks like a natural place to hold this info, but we need
to allocate it before handler_chain().

This is a bit unfortunate. Perhaps we will find a better solution later,
but this is simple and should work right now.

Signed-off-by: Oleg Nesterov 
Cc: Srikar Dronamraju 
Cc: Masami Hiramatsu 
Signed-off-by: Namhyung Kim 
---
 kernel/events/uprobes.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 24b7d6ca871b..3cc8e0bb8acf 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1828,6 +1828,10 @@ static void handle_swbp(struct pt_regs *regs)
if (unlikely(!test_bit(UPROBE_COPY_INSN, &uprobe->flags)))
goto out;
 
+   /* Tracing handlers use ->utask to communicate with fetch methods */
+   if (!get_utask())
+   goto out;
+
handler_chain(uprobe, regs);
if (can_skip_sstep(uprobe, regs))
goto out;
-- 
1.7.11.7



[PATCH 01/17] tracing/uprobes: Fix documentation of uprobe registration syntax

2013-12-08 Thread Namhyung Kim
From: Namhyung Kim 

The uprobe syntax requires an offset after the file path, not a symbol.

Reviewed-by: Masami Hiramatsu 
Acked-by: Oleg Nesterov 
Acked-by: Srikar Dronamraju 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 Documentation/trace/uprobetracer.txt | 10 +-
 kernel/trace/trace_uprobe.c  |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/trace/uprobetracer.txt 
b/Documentation/trace/uprobetracer.txt
index d9c3e682312c..8f1a8b8956fc 100644
--- a/Documentation/trace/uprobetracer.txt
+++ b/Documentation/trace/uprobetracer.txt
@@ -19,15 +19,15 @@ user to calculate the offset of the probepoint in the 
object.
 
 Synopsis of uprobe_tracer
 -
-  p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a uprobe
-  r[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a return uprobe 
(uretprobe)
-  -:[GRP/]EVENT  : Clear uprobe or uretprobe 
event
+  p[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a uprobe
+  r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a return uprobe (uretprobe)
+  -:[GRP/]EVENT   : Clear uprobe or uretprobe event
 
   GRP   : Group name. If omitted, "uprobes" is the default value.
   EVENT : Event name. If omitted, the event name is generated based
-  on SYMBOL+offs.
+  on PATH+OFFSET.
   PATH  : Path to an executable or a library.
-  SYMBOL[+offs] : Symbol+offset where the probe is inserted.
+  OFFSET: Offset where the probe is inserted.
 
   FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index b6dcc42ef7f5..c77b92d61551 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -211,7 +211,7 @@ end:
 
 /*
  * Argument syntax:
- *  - Add uprobe: p|r[:[GRP/]EVENT] PATH:SYMBOL [FETCHARGS]
+ *  - Add uprobe: p|r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS]
  *
  *  - Remove uprobe: -:[GRP/]EVENT
  */
-- 
1.7.11.7
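
As an illustration of the corrected PATH:OFFSET syntax, here is a minimal
sketch that only formats a probe definition string. The helper name
format_uprobe(), the event name, the path and the offset are all made up for
the example; it does not write to any tracing file.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of the kernel or perf): format a uprobe
 * definition of the form
 *   p[:[GRP/]EVENT] PATH:OFFSET
 * into buf.  Returns 0 on success, -1 if the buffer is too small.
 */
static int format_uprobe(char *buf, size_t len, const char *event,
			 const char *path, unsigned long offset)
{
	int n;

	if (event)
		n = snprintf(buf, len, "p:%s %s:0x%lx", event, path, offset);
	else
		n = snprintf(buf, len, "p %s:0x%lx", path, offset);

	/* snprintf() reports the length it wanted; detect truncation */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The point is only that a numeric offset into the file follows the colon; a
symbol name in that position is not accepted by the parser.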



Re: [PATCH][RESEND] powerpc: remove unused REDBOOT Kconfig parameter

2013-12-08 Thread Benjamin Herrenschmidt
On Mon, 2013-12-09 at 06:27 +0100, Michael Opdenacker wrote:
> This removes the REDBOOT Kconfig parameter,
> which was no longer used anywhere in the source code
> and Makefiles.

It hasn't been lost :-) It's still in patchwork and it's even in my
queue.

Cheers,
Ben.

> Signed-off-by: Michael Opdenacker 
> ---
>  arch/powerpc/Kconfig| 3 ---
>  arch/powerpc/platforms/83xx/Kconfig | 1 -
>  arch/powerpc/platforms/8xx/Kconfig  | 1 -
>  3 files changed, 5 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index b44b52c0a8f0..70dc283050b5 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -209,9 +209,6 @@ config DEFAULT_UIMAGE
> Used to allow a board to specify it wants a uImage built by default
>   default n
>  
> -config REDBOOT
> - bool
> -
>  config ARCH_HIBERNATION_POSSIBLE
>   bool
>   default y
> diff --git a/arch/powerpc/platforms/83xx/Kconfig 
> b/arch/powerpc/platforms/83xx/Kconfig
> index 670a033264c0..2bdc8c862c46 100644
> --- a/arch/powerpc/platforms/83xx/Kconfig
> +++ b/arch/powerpc/platforms/83xx/Kconfig
> @@ -99,7 +99,6 @@ config SBC834x
>  config ASP834x
>   bool "Analogue & Micro ASP 834x"
>   select PPC_MPC834x
> - select REDBOOT
>   help
> This enables support for the Analogue & Micro ASP 83xx
> board.
> diff --git a/arch/powerpc/platforms/8xx/Kconfig 
> b/arch/powerpc/platforms/8xx/Kconfig
> index 8dec3c0911ad..bd6f1a1cf922 100644
> --- a/arch/powerpc/platforms/8xx/Kconfig
> +++ b/arch/powerpc/platforms/8xx/Kconfig
> @@ -45,7 +45,6 @@ config PPC_EP88XC
>  config PPC_ADDER875
>   bool "Analogue & Micro Adder 875"
>   select CPM1
> - select REDBOOT
>   help
> This enables support for the Analogue & Micro Adder 875
> board.




Re: [PATCH v2 2/2] arm: omap: remove *.auto* from device names given in usb_bind_phy

2013-12-08 Thread Kishon Vijay Abraham I

Hi,

On Saturday 07 December 2013 02:38 AM, Felipe Balbi wrote:

Hi,

On Fri, Dec 06, 2013 at 01:14:38PM +0100, Javier Martinez Canillas wrote:

On Fri, Dec 6, 2013 at 1:06 PM, Kishon Vijay Abraham I  wrote:

Previously MUSB wrapper (OMAP) device used PLATFORM_DEVID_AUTO while creating
MUSB core device. So in usb_bind_phy (binds the controller with the PHY), the
device name of the controller had *.auto* in it. Since with using
PLATFORM_DEVID_AUTO, there is no way to know the exact device name in advance,
the data given in usb_bind_phy became obsolete and usb_get_phy was failing.
So MUSB wrapper was modified not to use PLATFORM_DEVID_AUTO. Corresponding
change is done in board file here.

Signed-off-by: Kishon Vijay Abraham I 
---
  arch/arm/mach-omap2/board-2430sdp.c|2 +-
  arch/arm/mach-omap2/board-3430sdp.c|2 +-
  arch/arm/mach-omap2/board-cm-t35.c |2 +-
  arch/arm/mach-omap2/board-devkit8000.c |2 +-
  arch/arm/mach-omap2/board-ldp.c|2 +-
  arch/arm/mach-omap2/board-omap3beagle.c|2 +-
  arch/arm/mach-omap2/board-omap3logic.c |2 +-
  arch/arm/mach-omap2/board-omap3pandora.c   |2 +-
  arch/arm/mach-omap2/board-omap3stalker.c   |2 +-
  arch/arm/mach-omap2/board-omap3touchbook.c |2 +-
  arch/arm/mach-omap2/board-overo.c  |2 +-
  arch/arm/mach-omap2/board-rx51.c   |2 +-
  12 files changed, 12 insertions(+), 12 deletions(-)



You can drop this patch since board files are being removed for v3.14


if we can drop this patch, the whole series is invalid, since we'll be
using DT phandles to find PHYs going forward, no ?

yeah. But in one of the other threads, Tony seemed ok to take a patch 
that fixes the same issue in mach-omap2/twl-common.c. So it's better to 
confirm with Tony.


Thanks
Kishon


Re: [PATCH] perf tools: Fix bug for perf kvm report without guestmount.

2013-12-08 Thread Dongsheng Yang


With more than 1 VM, the cases you provided are all about 
record-reporting the symbols from __all__ guests. What if I want to 
record-report only one of them?

Example:
Two guests are running different kernels, and I want to 
record-report VM1.


Currently, we can use --guestmount to get the symbols of VM1, but we 
cannot remove the symbols of VM2 from the report. This means the 
percentage shown for each symbol is not relative to VM1 alone.


What I want from introducing guestpid is to record-report the symbols 
only from VM1, and ignore VM2.




Besides, as we provide two usages of perf-kvm in manpage, 
guestmount-way and guest{kallsyms,modules}-way, but the 
guest{kallsyms, modules} is not used 

s/not/only. Sorry for the typo :(

when
there is only one guest is running, the guestmount is used in all 
situations, it seems not reasonable. I think we can provide a guestpid 
to make the guestkallsyms-way can be used in any situation.


This way, user can understand and choose the usage easily.

1. one or more guests you want to record-report, --guestmount-way. It 
provides the full functionality of perf kvm.


2. only one guest you want to record-report, --guestmount is also 
available, and there is another usage for this most used case as a 
shortcut , --guest{kallsyms, modules, pid}.


Then the relationship between the two kinds of usage is more clear, 
right? And in this way, we can make the result of perf-kvm more 
foreseeable to user.



David






Re: [PATCH] perf tools: Fix bug for perf kvm report without guestmount.

2013-12-08 Thread Dongsheng Yang

On 12/09/2013 01:06 PM, Dongsheng Yang wrote:

On 12/08/2013 11:32 PM, David Ahern wrote:

On 12/9/13, 10:12 AM, Dongsheng Yang wrote:

On 12/08/2013 10:42 PM, David Ahern wrote:

On 12/9/13, 8:20 AM, Dongsheng Yang wrote:

How about introducing an option named --guestpid? Then we can make the
usage of perf kvm clearer:
 * perf kvm --guestkallsyms --guestmodules --guestpid [top|record|report]
 This usage is for only one guest and will not resolve the
symbols from other guests.


If there is only 1 guest then there should not be a problem right? You
give perf a single guest kallsyms as the "default" and it works.
--guestpid adds no value in that case.


Yes, if there is only one guest running, the "default" guest is "the"
guest. Then, with my patch in this thread applied, it works well.

But consider this scenario: there are two guests running, but we
need to record-report only one of them.

--guestmount can satisfy this request, but as a shortcut for guestmount,
--guest{kallsyms, modules} does not support it well, right? So, I think
we can discard the default guest, and use guestpid in record-report.


No.

Use cases:
1. one guest
--guestkallsyms and --guestmodules apply to default guest; user 
should supply files that apply to the one guest. Supplying any other 
kallsyms is just nonsense. *NO* other arguments are needed.


2. more than 1 VM, *ALL* VMs running the same kernel
--guestkallsyms and --guestmodules apply to default guest; user 
should supply files that apply to all of guests. No other arguments 
are needed.


3. more than 1 VM, VMs running different kernels. 1+ VMs running the 
same kernel
--guestmount allows user to supply files that apply to all of guests 
based on pid. --guestkallsyms/guestmodules is used for any guest not 
showing up in guestmount.




With more than 1 VM, the cases you provided are all about record-reporting 
the symbols from __all__ guests. What if I want to record-report only one 
of them?

Example:
Two guests are running different kernels, and I want to 
record-report VM1.


Currently, we can use --guestmount to get the symbols of VM1, but we 
cannot remove the symbols of VM2 from the report. This means the percentage 
shown for each symbol is not relative to VM1 alone.


What I want from introducing guestpid is to record-report the symbols 
only from VM1, and ignore VM2.




Besides, as we provide two usages of perf-kvm in manpage, guestmount-way 
and guest{kallsyms,modules}-way, but the guest{kallsyms, modules} is not 
used when
there is only one guest is running, the guestmount is used in all 
situations, it seems not reasonable. I think we can provide a guestpid 
to make the guestkallsyms-way can be used in any situation.


This way, user can understand and choose the usage easily.

1. one or more guests you want to record-report, --guestmount-way. It 
provides the full functionality of perf kvm.


2. only one guest you want to record-report, --guestmount is also 
available, and there is another usage for this most used case as a 
shortcut , --guest{kallsyms, modules, pid}.


Then the relationship between the two kinds of usage is more clear, 
right? And in this way, we can make the result of perf-kvm more 
foreseeable to user.



David






[PATCH 05/14] tools lib traceevent: Get rid of malloc_or_die() in read_token()

2013-12-08 Thread Namhyung Kim
Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-filter.c 
b/tools/lib/traceevent/parse-filter.c
index 35fac1fa376b..e9d17bfcdffd 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -109,7 +109,11 @@ static enum event_type read_token(char **tok)
(strcmp(token, "=") == 0 || strcmp(token, "!") == 0) &&
pevent_peek_char() == '~') {
/* append it */
-   *tok = malloc_or_die(3);
+   *tok = malloc(3);
+   if (*tok == NULL) {
+   free_token(token);
+   return EVENT_ERROR;
+   }
sprintf(*tok, "%c%c", *token, '~');
free_token(token);
/* Now remove the '~' from the buffer */
@@ -1107,6 +,8 @@ process_filter(struct event_format *event, struct 
filter_arg **parg,
break;
case EVENT_NONE:
break;
+   case EVENT_ERROR:
+   goto fail_alloc;
default:
goto fail_print;
}
-- 
1.7.11.7



[PATCH 02/14] tools lib traceevent: Get rid of die in add_filter_type()

2013-12-08 Thread Namhyung Kim
The return value of realloc() should be checked, and the previous
pointer must not be overwritten in case of error.

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c 
b/tools/lib/traceevent/parse-filter.c
index 0fc905c230ad..d9c239933992 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -161,11 +161,13 @@ add_filter_type(struct event_filter *filter, int id)
if (filter_type)
return filter_type;
 
-   filter->event_filters = realloc(filter->event_filters,
-   sizeof(*filter->event_filters) *
-   (filter->filters + 1));
-   if (!filter->event_filters)
-   die("Could not allocate filter");
+   filter_type = realloc(filter->event_filters,
+ sizeof(*filter->event_filters) *
+ (filter->filters + 1));
+   if (!filter_type)
+   return NULL;
+
+   filter->event_filters = filter_type;
 
for (i = 0; i < filter->filters; i++) {
if (filter->event_filters[i].event_id > id)
@@ -1164,6 +1166,12 @@ static int filter_event(struct event_filter *filter,
}
 
filter_type = add_filter_type(filter, event->id);
+   if (filter_type == NULL) {
+   show_error(error_str, "failed to add a new filter: %s",
+  filter_str ? filter_str : "true");
+   return -1;
+   }
+
if (filter_type->filter)
free_arg(filter_type->filter);
filter_type->filter = arg;
@@ -1395,6 +1403,9 @@ static int copy_filter_type(struct event_filter *filter,
arg->boolean.value = 0;
 
filter_type = add_filter_type(filter, event->id);
+   if (filter_type == NULL)
+   return -1;
+
filter_type->filter = arg;
 
free(str);
-- 
1.7.11.7
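
The realloc() rule stated in the changelog above can be sketched in isolation
as follows. This is a minimal standalone example, not the libtraceevent code:
struct entry, struct table and table_add() are hypothetical stand-ins for the
filter table.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the filter table; not the libtraceevent types. */
struct entry {
	int id;
};

struct table {
	struct entry *entries;
	size_t count;
};

/*
 * Grow the table by one slot.  The realloc() result goes into a
 * temporary first, so on failure tbl->entries still points at the
 * old, valid array and nothing is leaked.
 */
static struct entry *table_add(struct table *tbl, int id)
{
	struct entry *tmp;

	tmp = realloc(tbl->entries, sizeof(*tbl->entries) * (tbl->count + 1));
	if (tmp == NULL)
		return NULL;	/* tbl is unchanged; caller reports the error */

	tbl->entries = tmp;
	tbl->entries[tbl->count].id = id;
	return &tbl->entries[tbl->count++];
}
```

Assigning the result straight back to tbl->entries would lose the only
reference to the old array whenever realloc() returns NULL, which is the
situation the patch avoids.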



[PATCH 03/14] tools lib traceevent: Get rid of malloc_or_die() in pevent_filter_alloc()

2013-12-08 Thread Namhyung Kim
It returns NULL when allocation fails so the users should check the
return value from now on.

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-filter.c 
b/tools/lib/traceevent/parse-filter.c
index d9c239933992..21d13a4f9a5f 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -198,7 +198,10 @@ struct event_filter *pevent_filter_alloc(struct pevent 
*pevent)
 {
struct event_filter *filter;
 
-   filter = malloc_or_die(sizeof(*filter));
+   filter = malloc(sizeof(*filter));
+   if (filter == NULL)
+   return NULL;
+
memset(filter, 0, sizeof(*filter));
filter->pevent = pevent;
pevent_ref(pevent);
-- 
1.7.11.7
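
What the new contract means for callers can be sketched like this. It is a
simplified illustration under assumed names: struct my_filter,
my_filter_alloc() and setup_filter() are hypothetical and only mirror the
shape of pevent_filter_alloc() after this change.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified filter object for illustration only. */
struct my_filter {
	void *owner;
	int nr_filters;
};

/* Returns NULL on allocation failure instead of terminating the process. */
static struct my_filter *my_filter_alloc(void *owner)
{
	struct my_filter *filter;

	filter = malloc(sizeof(*filter));
	if (filter == NULL)
		return NULL;

	memset(filter, 0, sizeof(*filter));
	filter->owner = owner;
	return filter;
}

/* A caller now has to check the result before touching it. */
static int setup_filter(void *owner, struct my_filter **out)
{
	struct my_filter *filter = my_filter_alloc(owner);

	if (filter == NULL)
		return -1;

	*out = filter;
	return 0;
}
```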



[f2fs-dev] [PATCH 3/3 V3] f2fs: introduce f2fs_cache_node_page() to add page into node_inode cache

2013-12-08 Thread Chao Yu
This patch introduces f2fs_cache_node_page(). In this function, a page that has
been read ahead is copied to node_inode's mapping cache.
This avoids rereading these node pages.

change log:
 o check validity of page by searching NAT suggested by Jaegeuk Kim.
 o add 'unlikely' for compiler optimization suggested by Jaegeuk Kim.

Suggested-by: Jaegeuk Kim 
Signed-off-by: Chao Yu 
---
 fs/f2fs/node.c |   39 ++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 099f06f..4c6da98 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1600,13 +1600,46 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, 
struct list_head *pages,
return 0;
 }
 
+/*
+ * f2fs_cache_node_page() checks the validity of the input page by searching NAT.
+ * Then, it copies the updated data of a valid page to the node_inode cache.
+ */
+void f2fs_cache_node_page(struct f2fs_sb_info *sbi, struct page *page,
+   nid_t nid, block_t blkaddr)
+{
+   struct address_space *mapping = sbi->node_inode->i_mapping;
+   struct page *npage;
+   struct node_info ni;
+
+   get_node_info(sbi, nid, &ni);
+
+   if (ni.blk_addr != blkaddr)
+   return;
+
+   npage = grab_cache_page(mapping, nid);
+   if (unlikely(!npage))
+   return;
+
+   if (PageUptodate(npage)) {
+   f2fs_put_page(npage, 1);
+   return;
+   }
+
+   memcpy(page_address(npage), page_address(page), PAGE_CACHE_SIZE);
+
+   SetPageUptodate(npage);
+   f2fs_put_page(npage, 1);
+
+   return;
+}
+
 int restore_node_summary(struct f2fs_sb_info *sbi,
unsigned int segno, struct f2fs_summary_block *sum)
 {
struct f2fs_node *rn;
struct f2fs_summary *sum_entry;
struct page *page, *tmp;
-   block_t addr;
+   block_t addr, blkaddr;
int bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
int i, last_offset, nrpages, err = 0;
LIST_HEAD(page_list);
@@ -1624,6 +1657,7 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
if (err)
return err;
 
+   blkaddr = addr;
list_for_each_entry_safe(page, tmp, &page_list, lru) {
 
lock_page(page);
@@ -1633,6 +1667,8 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
sum_entry->version = 0;
sum_entry->ofs_in_node = 0;
sum_entry++;
+   f2fs_cache_node_page(sbi, page,
+   le32_to_cpu(rn->footer.nid), blkaddr);
} else {
err = -EIO;
}
@@ -1640,6 +1676,7 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
list_del(&page->lru);
unlock_page(page);
__free_pages(page, 0);
+   blkaddr++;
}
}
return err;
-- 
1.7.9.5



[PATCH 04/14] tools lib traceevent: Get rid of malloc_or_die() allocate_arg()

2013-12-08 Thread Namhyung Kim
Also check return value and handle it.

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 59 +++--
 1 file changed, 44 insertions(+), 15 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c 
b/tools/lib/traceevent/parse-filter.c
index 21d13a4f9a5f..35fac1fa376b 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -211,12 +211,7 @@ struct event_filter *pevent_filter_alloc(struct pevent 
*pevent)
 
 static struct filter_arg *allocate_arg(void)
 {
-   struct filter_arg *arg;
-
-   arg = malloc_or_die(sizeof(*arg));
-   memset(arg, 0, sizeof(*arg));
-
-   return arg;
+   return calloc(1, sizeof(struct filter_arg));
 }
 
 static void free_arg(struct filter_arg *arg)
@@ -359,6 +354,10 @@ create_arg_item(struct event_format *event, const char 
*token,
struct filter_arg *arg;
 
arg = allocate_arg();
+   if (arg == NULL) {
+   show_error(error_str, "failed to allocate filter arg");
+   return NULL;
+   }
 
switch (type) {
 
@@ -409,8 +408,10 @@ create_arg_op(enum filter_op_type btype)
struct filter_arg *arg;
 
arg = allocate_arg();
-   arg->type = FILTER_ARG_OP;
-   arg->op.type = btype;
+   if (arg) {
+   arg->type = FILTER_ARG_OP;
+   arg->op.type = btype;
+   }
 
return arg;
 }
@@ -421,8 +422,10 @@ create_arg_exp(enum filter_exp_type etype)
struct filter_arg *arg;
 
arg = allocate_arg();
-   arg->type = FILTER_ARG_EXP;
-   arg->op.type = etype;
+   if (arg) {
+   arg->type = FILTER_ARG_EXP;
+   arg->op.type = etype;
+   }
 
return arg;
 }
@@ -433,9 +436,11 @@ create_arg_cmp(enum filter_exp_type etype)
struct filter_arg *arg;
 
arg = allocate_arg();
-   /* Use NUM and change if necessary */
-   arg->type = FILTER_ARG_NUM;
-   arg->op.type = etype;
+   if (arg) {
+   /* Use NUM and change if necessary */
+   arg->type = FILTER_ARG_NUM;
+   arg->op.type = etype;
+   }
 
return arg;
 }
@@ -896,8 +901,10 @@ static struct filter_arg *collapse_tree(struct filter_arg 
*arg)
case FILTER_VAL_FALSE:
free_arg(arg);
arg = allocate_arg();
-   arg->type = FILTER_ARG_BOOLEAN;
-   arg->boolean.value = ret == FILTER_VAL_TRUE;
+   if (arg) {
+   arg->type = FILTER_ARG_BOOLEAN;
+   arg->boolean.value = ret == FILTER_VAL_TRUE;
+   }
}
 
return arg;
@@ -1044,6 +1051,8 @@ process_filter(struct event_format *event, struct 
filter_arg **parg,
switch (op_type) {
case OP_BOOL:
arg = create_arg_op(btype);
+   if (arg == NULL)
+   goto fail_alloc;
if (current_op)
ret = add_left(arg, current_op);
else
@@ -1054,6 +1063,8 @@ process_filter(struct event_format *event, struct 
filter_arg **parg,
 
case OP_NOT:
arg = create_arg_op(btype);
+   if (arg == NULL)
+   goto fail_alloc;
if (current_op)
ret = add_right(current_op, arg, 
error_str);
if (ret < 0)
@@ -1073,6 +1084,8 @@ process_filter(struct event_format *event, struct 
filter_arg **parg,
arg = create_arg_exp(etype);
else
arg = create_arg_cmp(ctype);
+   if (arg == NULL)
+   goto fail_alloc;
 
if (current_op)
ret = add_right(current_op, arg, 
error_str);
@@ -1106,11 +1119,16 @@ process_filter(struct event_format *event, struct 
filter_arg **parg,
current_op = current_exp;
 
current_op = collapse_tree(current_op);
+   if (current_op == NULL)
+   goto fail_alloc;
 
*parg = current_op;
 
return 0;
 
+ fail_alloc:
+   show_error(error_str, "failed to allocate filter arg");
+   goto fail;
  fail_print:
show_error(error_str, "Syntax error");
  fail:
@@ -1141,6 +1159,10 @@ process_event(struct event_format *event, const char 
*filter_str,
/* If parg is NULL, then make it into FALSE */
if (!*parg) {
*parg = allocate_arg();
+   if (*parg == NULL) {
+   show_error(error_str, "failed to allocate filter arg");
+   return 

[PATCH 11/14] tools lib traceevent: Get rid of malloc_or_die() in pevent_filter_add_filter_str()

2013-12-08 Thread Namhyung Kim
Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-filter.c 
b/tools/lib/traceevent/parse-filter.c
index dabae52bbbcb..d8613308c08d 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1319,7 +1319,13 @@ int pevent_filter_add_filter_str(struct event_filter 
*filter,
else
len = strlen(filter_str);
 
-   this_event = malloc_or_die(len + 1);
+   this_event = malloc(len + 1);
+   if (this_event == NULL) {
+   show_error(error_str, "Memory allocation failure");
+   /* This can only happen when events is NULL, but still 
*/
+   free_events(events);
+   return -1;
+   }
memcpy(this_event, filter_str, len);
this_event[len] = 0;
 
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 07/14] tools lib traceevent: Get rid of malloc_or_die() in add_event()

2013-12-08 Thread Namhyung Kim
Make it return an error value, since its only caller, find_event(), can
now handle allocation errors properly.

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c 
b/tools/lib/traceevent/parse-filter.c
index 06e5af9f8fc4..faa10824b87d 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -260,15 +260,19 @@ static void free_arg(struct filter_arg *arg)
free(arg);
 }
 
-static void add_event(struct event_list **events,
+static int add_event(struct event_list **events,
  struct event_format *event)
 {
struct event_list *list;
 
-   list = malloc_or_die(sizeof(*list));
+   list = malloc(sizeof(*list));
+   if (list == NULL)
+   return -1;
+
list->next = *events;
*events = list;
list->event = event;
+   return 0;
 }
 
 static int event_match(struct event_format *event,
@@ -291,6 +295,7 @@ find_event(struct pevent *pevent, struct event_list 
**events,
regex_t ereg;
regex_t sreg;
int match = 0;
+   int fail = 0;
char *reg;
int ret;
int i;
@@ -333,7 +338,10 @@ find_event(struct pevent *pevent, struct event_list 
**events,
event = pevent->events[i];
if (event_match(event, sys_name ? &sreg : NULL, &ereg)) {
match = 1;
-   add_event(events, event);
+   if (add_event(events, event) < 0) {
+   fail = 1;
+   break;
+   }
}
}
 
@@ -343,6 +351,8 @@ find_event(struct pevent *pevent, struct event_list 
**events,
 
if (!match)
return -1;
+   if (fail)
+   return -2;
 
return 0;
 }
-- 
1.7.11.7
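
As an illustration of the pattern in this patch, here is a minimal
standalone sketch in C of a list prepend that reports allocation failure
to its caller instead of aborting.  The struct definitions are simplified
stand-ins and add_all() is a hypothetical caller; neither is the
library's actual code.

```c
#include <stdlib.h>

/* Simplified stand-in for the library's event type (assumption). */
struct event_format { int id; };

struct event_list {
	struct event_list *next;
	struct event_format *event;
};

/* Prepend @event to *@events; return 0 on success, -1 if malloc() fails. */
static int add_event(struct event_list **events, struct event_format *event)
{
	struct event_list *list = malloc(sizeof(*list));

	if (list == NULL)
		return -1;

	list->next = *events;
	list->event = event;
	*events = list;
	return 0;
}

/* Release every list node; the events themselves belong to the caller. */
static void free_events(struct event_list *events)
{
	while (events) {
		struct event_list *next = events->next;

		free(events);
		events = next;
	}
}

/* Hypothetical caller: add all @n events, or roll back and return -1. */
static int add_all(struct event_list **events, struct event_format *evs, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		if (add_event(events, &evs[i]) < 0) {
			free_events(*events);
			*events = NULL;
			return -1;
		}
	}
	return 0;
}
```

The point of the sketch is only that the failure is propagated upward, so
the decision about what to do on out-of-memory stays with the application
rather than the library.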



[PATCH 12/14] tools lib traceevent: Get rid of die() in pevent_filter_clear_trivial()

2013-12-08 Thread Namhyung Kim
Change the function signature to return an error code and no longer
call die().

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/event-parse.h  |  2 +-
 tools/lib/traceevent/parse-filter.c | 21 +++--
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 620c27a72960..6e23f197175f 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -860,7 +860,7 @@ int pevent_event_filtered(struct event_filter *filter,
 
 void pevent_filter_reset(struct event_filter *filter);
 
-void pevent_filter_clear_trivial(struct event_filter *filter,
+int pevent_filter_clear_trivial(struct event_filter *filter,
 enum filter_trivial_type type);
 
 void pevent_filter_free(struct event_filter *filter);
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index d8613308c08d..4d395e8b88bb 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1606,8 +1606,10 @@ int pevent_update_trivial(struct event_filter *dest, struct event_filter *source
  * @type: remove only true, false, or both
  *
  * Removes filters that only contain a TRUE or FALES boolean arg.
+ *
+ * Returns 0 on success and -1 if there was a problem.
  */
-void pevent_filter_clear_trivial(struct event_filter *filter,
+int pevent_filter_clear_trivial(struct event_filter *filter,
 enum filter_trivial_type type)
 {
struct filter_type *filter_type;
@@ -1616,13 +1618,15 @@ void pevent_filter_clear_trivial(struct event_filter *filter,
int i;
 
if (!filter->filters)
-   return;
+   return 0;
 
/*
 * Two steps, first get all ids with trivial filters.
 *  then remove those ids.
 */
for (i = 0; i < filter->filters; i++) {
+   int *new_ids;
+
filter_type = &filter->event_filters[i];
if (filter_type->filter->type != FILTER_ARG_BOOLEAN)
continue;
@@ -1637,19 +1641,24 @@ void pevent_filter_clear_trivial(struct event_filter *filter,
break;
}
 
-   ids = realloc(ids, sizeof(*ids) * (count + 1));
-   if (!ids)
-   die("Can't allocate ids");
+   new_ids = realloc(ids, sizeof(*ids) * (count + 1));
+   if (!new_ids) {
+   free(ids);
+   return -1;
+   }
+
+   ids = new_ids;
ids[count++] = filter_type->event_id;
}
 
if (!count)
-   return;
+   return 0;
 
for (i = 0; i < count; i++)
pevent_filter_remove_event(filter, ids[i]);
 
free(ids);
+   return 0;
 }
 
 /**
-- 
1.7.11.7
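
The realloc() handling in this patch follows a common idiom: keep the
result in a temporary pointer so the old block is not leaked when the
call fails.  Below is a minimal standalone sketch of that idiom in C.
The helper names append_id() and collect_even() are hypothetical and
exist only for this illustration.

```c
#include <stdlib.h>

/*
 * Append @value to the array *@ids, which holds *@count elements.
 * Returns 0 on success.  On failure returns -1 and leaves *@ids and
 * *@count untouched, so the caller still owns the old array.
 */
static int append_id(int **ids, int *count, int value)
{
	int *new_ids = realloc(*ids, sizeof(**ids) * (*count + 1));

	if (new_ids == NULL)
		return -1;

	new_ids[*count] = value;
	*ids = new_ids;
	(*count)++;
	return 0;
}

/*
 * Collect the even values of @src into a newly allocated array.
 * Returns 0 on success, or -1 (after freeing any partial result)
 * if an allocation fails.
 */
static int collect_even(const int *src, int n, int **out, int *out_count)
{
	int *ids = NULL;
	int count = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (src[i] % 2)
			continue;
		if (append_id(&ids, &count, src[i]) < 0) {
			free(ids);
			return -1;
		}
	}

	*out = ids;
	*out_count = count;
	return 0;
}
```

Writing "ids = realloc(ids, ...)" directly would overwrite the only
pointer to the old block with NULL on failure, which is why the patch
introduces new_ids.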



[PATCH 08/14] tools lib traceevent: Get rid of die() in create_arg_item()

2013-12-08 Thread Namhyung Kim
Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index faa10824b87d..5efe66a682bd 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -389,8 +389,11 @@ create_arg_item(struct event_format *event, const char *token,
arg->value.type =
type == EVENT_DQUOTE ? FILTER_STRING : FILTER_CHAR;
arg->value.str = strdup(token);
-   if (!arg->value.str)
-   die("malloc string");
+   if (!arg->value.str) {
+   free_arg(arg);
+   show_error(error_str, "failed to allocate string filter arg");
+   return NULL;
+   }
break;
case EVENT_ITEM:
/* if it is a number, then convert it */
-- 
1.7.11.7



[PATCH 10/14] tools lib traceevent: Get rid of die() in reparent_op_arg()

2013-12-08 Thread Namhyung Kim
To do that, add FILTER_VAL_ERROR to enum filter_vals and make the
function return the error code.  Also pass error_str so that it can
set a proper error message when an error occurs.

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 91 +++--
 1 file changed, 58 insertions(+), 33 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index a1ad609a860f..dabae52bbbcb 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -781,17 +781,21 @@ enum filter_vals {
FILTER_VAL_NORM,
FILTER_VAL_FALSE,
FILTER_VAL_TRUE,
+   FILTER_VAL_ERROR,
 };
 
-void reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child,
- struct filter_arg *arg)
+enum filter_vals
+reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child,
+   struct filter_arg *arg, char **error_str)
 {
struct filter_arg *other_child;
struct filter_arg **ptr;
 
if (parent->type != FILTER_ARG_OP &&
-   arg->type != FILTER_ARG_OP)
-   die("can not reparent other than OP");
+   arg->type != FILTER_ARG_OP) {
+   show_error(error_str, "can not reparent other than OP");
+   return FILTER_VAL_ERROR;
+   }
 
/* Get the sibling */
if (old_child->op.right == arg) {
@@ -800,8 +804,10 @@ void reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child,
} else if (old_child->op.left == arg) {
ptr = &old_child->op.left;
other_child = old_child->op.right;
-   } else
-   die("Error in reparent op, find other child");
+   } else {
+   show_error(error_str, "Error in reparent op, find other child");
+   return FILTER_VAL_ERROR;
+   }
 
/* Detach arg from old_child */
*ptr = NULL;
@@ -812,21 +818,25 @@ void reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child,
*parent = *arg;
/* Free arg without recussion */
free(arg);
-   return;
+   return FILTER_VAL_NORM;
}
 
if (parent->op.right == old_child)
ptr = &parent->op.right;
else if (parent->op.left == old_child)
ptr = &parent->op.left;
-   else
-   die("Error in reparent op");
+   else {
+   show_error(error_str, "Error in reparent op");
+   return FILTER_VAL_ERROR;
+   }
*ptr = arg;
 
free_arg(old_child);
+   return FILTER_VAL_NORM;
 }
 
-enum filter_vals test_arg(struct filter_arg *parent, struct filter_arg *arg)
+enum filter_vals test_arg(struct filter_arg *parent, struct filter_arg *arg,
+ char **error_str)
 {
enum filter_vals lval, rval;
 
@@ -843,63 +853,68 @@ enum filter_vals test_arg(struct filter_arg *parent, struct filter_arg *arg)
return FILTER_VAL_NORM;
 
case FILTER_ARG_EXP:
-   lval = test_arg(arg, arg->exp.left);
+   lval = test_arg(arg, arg->exp.left, error_str);
if (lval != FILTER_VAL_NORM)
return lval;
-   rval = test_arg(arg, arg->exp.right);
+   rval = test_arg(arg, arg->exp.right, error_str);
if (rval != FILTER_VAL_NORM)
return rval;
return FILTER_VAL_NORM;
 
case FILTER_ARG_NUM:
-   lval = test_arg(arg, arg->num.left);
+   lval = test_arg(arg, arg->num.left, error_str);
if (lval != FILTER_VAL_NORM)
return lval;
-   rval = test_arg(arg, arg->num.right);
+   rval = test_arg(arg, arg->num.right, error_str);
if (rval != FILTER_VAL_NORM)
return rval;
return FILTER_VAL_NORM;
 
case FILTER_ARG_OP:
if (arg->op.type != FILTER_OP_NOT) {
-   lval = test_arg(arg, arg->op.left);
+   lval = test_arg(arg, arg->op.left, error_str);
switch (lval) {
case FILTER_VAL_NORM:
break;
case FILTER_VAL_TRUE:
if (arg->op.type == FILTER_OP_OR)
return FILTER_VAL_TRUE;
-   rval = test_arg(arg, arg->op.right);
+   rval = test_arg(arg, arg->op.right, error_str);
if (rval != FILTER_VAL_NORM)
return rval;
 
-   reparent_op_arg(parent, arg, arg->op.right);
-   return FILTER_VAL_NORM;
+   return reparent_op_arg(parent, arg, 

[PATCH 01/14] tools lib traceevent: Get rid of malloc_or_die() in show_error()

2013-12-08 Thread Namhyung Kim
Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 2500e75583fc..0fc905c230ad 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -56,7 +56,21 @@ static void show_error(char **error_str, const char *fmt, ...)
index = pevent_get_input_buf_ptr();
len = input ? strlen(input) : 0;
 
-   error = malloc_or_die(MAX_ERR_STR_SIZE + (len*2) + 3);
+   error = malloc(MAX_ERR_STR_SIZE + (len*2) + 3);
+   if (error == NULL) {
+   /*
+* Maybe it's because len is too long.
+* Retry without the input buffer part.
+*/
+   len = 0;
+
+   error = malloc(MAX_ERR_STR_SIZE);
+   if (error == NULL) {
+   /* no memory */
+   *error_str = "failed to allocate memory";
+   return;
+   }
+   }
 
if (len) {
strcpy(error, input);
-- 
1.7.11.7
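
The fallback strategy in this patch (try the large allocation, retry
with a smaller one, then give up) can be sketched as a standalone C
function.  This is only an illustration under stated assumptions:
build_error(), alloc_fn and small_only_alloc() are hypothetical names,
and the value 256 for MAX_ERR_STR_SIZE is assumed here rather than taken
from the patch.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_ERR_STR_SIZE 256	/* assumed value, for illustration only */

/* Hypothetical allocator hook so the fallback path can be exercised. */
static void *(*alloc_fn)(size_t) = malloc;

/* Demo allocator: pretend any request above MAX_ERR_STR_SIZE fails. */
static void *small_only_alloc(size_t size)
{
	return size > MAX_ERR_STR_SIZE ? NULL : malloc(size);
}

/*
 * Build an error buffer that echoes @input (may be NULL) on its own
 * line before @msg.  If the large allocation fails, retry without the
 * input part.  Returns NULL only if both allocations fail.
 */
static char *build_error(const char *input, const char *msg)
{
	size_t len = input ? strlen(input) : 0;
	char *error = alloc_fn(MAX_ERR_STR_SIZE + len + 2);

	if (error == NULL) {
		/* Maybe the input was too long; retry without it. */
		len = 0;
		error = alloc_fn(MAX_ERR_STR_SIZE);
		if (error == NULL)
			return NULL;
	}

	error[0] = '\0';
	if (len) {
		strcpy(error, input);
		strcat(error, "\n");
	}
	/* Message is truncated to fit the fixed-size part of the buffer. */
	strncat(error, msg, MAX_ERR_STR_SIZE - 1);
	return error;
}
```

Unlike the patch, the sketch returns NULL in the last-resort case rather
than pointing at a string literal.  A caller that later passes the
result to free() has to be able to tell the two cases apart, since a
string literal must not be freed.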



[PATCH 09/14] tools lib traceevent: Get rid of die() in add_right()

2013-12-08 Thread Namhyung Kim
Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 5efe66a682bd..a1ad609a860f 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -583,12 +583,18 @@ static int add_right(struct filter_arg *op, struct filter_arg *arg,
op->str.type = op_type;
op->str.field = left->field.field;
op->str.val = strdup(str);
-   if (!op->str.val)
-   die("malloc string");
+   if (!op->str.val) {
+   show_error(error_str, "Failed to allocate string filter");
+   return -1;
+   }
/*
 * Need a buffer to copy data for tests
 */
-   op->str.buffer = malloc_or_die(op->str.field->size + 1);
+   op->str.buffer = malloc(op->str.field->size + 1);
+   if (!op->str.buffer) {
+   show_error(error_str, "Failed to allocate string filter");
+   return -1;
+   }
/* Null terminate this buffer */
op->str.buffer[op->str.field->size] = 0;
 
-- 
1.7.11.7



[PATCH 13/14] tools lib traceevent: Refactor test_filter() to get rid of die()

2013-12-08 Thread Namhyung Kim
The test_filter() function tests whether a given filter matches a
given record.  However, it doesn't handle error cases properly, so add
a new argument, error_str, to save error info during the test, and also
pass it to the internal test functions.

For now, it just saves the error but does nothing with it.  Maybe it
can be passed back to the user through pevent_filter_match() later.

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/event-parse.h  |   1 +
 tools/lib/traceevent/parse-filter.c | 102 ++--
 2 files changed, 65 insertions(+), 38 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 6e23f197175f..a1d8b2792e3a 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -836,6 +836,7 @@ struct event_filter {
 
 struct event_filter *pevent_filter_alloc(struct pevent *pevent);
 
+#define FILTER_ERROR   -3
 #define FILTER_NONE-2
 #define FILTER_NOEXIST -1
 #define FILTER_MISS0
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 4d395e8b88bb..8a5b7a74b44e 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1698,8 +1698,8 @@ int pevent_filter_event_has_trivial(struct event_filter *filter,
}
 }
 
-static int test_filter(struct event_format *event,
-  struct filter_arg *arg, struct pevent_record *record);
+static int test_filter(struct event_format *event, struct filter_arg *arg,
+  struct pevent_record *record, char **error_str);
 
 static const char *
 get_comm(struct event_format *event, struct pevent_record *record)
@@ -1745,15 +1745,17 @@ get_value(struct event_format *event,
 }
 
 static unsigned long long
-get_arg_value(struct event_format *event, struct filter_arg *arg, struct pevent_record *record);
+get_arg_value(struct event_format *event, struct filter_arg *arg,
+ struct pevent_record *record, char **error_str);
 
 static unsigned long long
-get_exp_value(struct event_format *event, struct filter_arg *arg, struct pevent_record *record)
+get_exp_value(struct event_format *event, struct filter_arg *arg,
+ struct pevent_record *record, char **error_str)
 {
unsigned long long lval, rval;
 
-   lval = get_arg_value(event, arg->exp.left, record);
-   rval = get_arg_value(event, arg->exp.right, record);
+   lval = get_arg_value(event, arg->exp.left, record, error_str);
+   rval = get_arg_value(event, arg->exp.right, record, error_str);
 
switch (arg->exp.type) {
case FILTER_EXP_ADD:
@@ -1788,39 +1790,44 @@ get_exp_value(struct event_format *event, struct filter_arg *arg, struct pevent_
 
case FILTER_EXP_NOT:
default:
-   die("error in exp");
+   if (*error_str == NULL)
+   *error_str = "invalid expression type";
}
return 0;
 }
 
 static unsigned long long
-get_arg_value(struct event_format *event, struct filter_arg *arg, struct pevent_record *record)
+get_arg_value(struct event_format *event, struct filter_arg *arg,
+ struct pevent_record *record, char **error_str)
 {
switch (arg->type) {
case FILTER_ARG_FIELD:
return get_value(event, arg->field.field, record);
 
case FILTER_ARG_VALUE:
-   if (arg->value.type != FILTER_NUMBER)
-   die("must have number field!");
+   if (arg->value.type != FILTER_NUMBER) {
+   if (*error_str == NULL)
+   *error_str = "must have number field!";
+   }
return arg->value.val;
 
case FILTER_ARG_EXP:
-   return get_exp_value(event, arg, record);
+   return get_exp_value(event, arg, record, error_str);
 
default:
-   die("oops in filter");
+   if (*error_str == NULL)
+   *error_str = "invalid numeric argument type";
}
return 0;
 }
 
-static int test_num(struct event_format *event,
-   struct filter_arg *arg, struct pevent_record *record)
+static int test_num(struct event_format *event, struct filter_arg *arg,
+   struct pevent_record *record, char **error_str)
 {
unsigned long long lval, rval;
 
-   lval = get_arg_value(event, arg->num.left, record);
-   rval = get_arg_value(event, arg->num.right, record);
+   lval = get_arg_value(event, arg->num.left, record, error_str);
+   rval = get_arg_value(event, arg->num.right, record, error_str);
 
switch (arg->num.type) {
case FILTER_CMP_EQ:
@@ -1842,7 +1849,8 @@ static int test_num(struct event_format *event,
return lval <= rval;
 
default:
-   /* ?? */
+   if (*error_str == NULL)
+   *error_str = "invalid 

[PATCH 14/14] tools lib traceevent: Get rid of die() in some string conversion functions

2013-12-08 Thread Namhyung Kim
Those functions stringify filter arguments.  As the callers of those
functions handle a NULL string properly, it seems sufficient to return
NULL rather than calling die().

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 51 ++---
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 8a5b7a74b44e..ff95da94eee2 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -2108,7 +2108,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg)
default:
break;
}
-   str = malloc_or_die(6);
+   str = malloc(6);
+   if (str == NULL)
+   break;
if (val)
strcpy(str, "TRUE");
else
@@ -2131,7 +2133,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg)
}
 
len = strlen(left) + strlen(right) + strlen(op) + 10;
-   str = malloc_or_die(len);
+   str = malloc(len);
+   if (str == NULL)
+   break;
snprintf(str, len, "(%s) %s (%s)",
 left, op, right);
break;
@@ -2149,7 +2153,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg)
right_val = 0;
if (right_val >= 0) {
/* just return the opposite */
-   str = malloc_or_die(6);
+   str = malloc(6);
+   if (str == NULL)
+   break;
if (right_val)
strcpy(str, "FALSE");
else
@@ -2157,8 +2163,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg)
break;
}
len = strlen(right) + strlen(op) + 3;
-   str = malloc_or_die(len);
-   snprintf(str, len, "%s(%s)", op, right);
+   str = malloc(len);
+   if (str)
+   snprintf(str, len, "%s(%s)", op, right);
break;
 
default:
@@ -2174,9 +2181,9 @@ static char *val_to_str(struct event_filter *filter, struct filter_arg *arg)
 {
char *str;
 
-   str = malloc_or_die(30);
-
-   snprintf(str, 30, "%lld", arg->value.val);
+   str = malloc(30);
+   if (str)
+   snprintf(str, 30, "%lld", arg->value.val);
 
return str;
 }
@@ -2231,12 +2238,13 @@ static char *exp_to_str(struct event_filter *filter, struct filter_arg *arg)
op = "^";
break;
default:
-   die("oops in exp");
+   break;
}
 
len = strlen(op) + strlen(lstr) + strlen(rstr) + 4;
-   str = malloc_or_die(len);
-   snprintf(str, len, "%s %s %s", lstr, op, rstr);
+   str = malloc(len);
+   if (str)
+   snprintf(str, len, "%s %s %s", lstr, op, rstr);
 out:
free(lstr);
free(rstr);
@@ -2282,9 +2290,9 @@ static char *num_to_str(struct event_filter *filter, struct filter_arg *arg)
op = "<=";
 
len = strlen(lstr) + strlen(op) + strlen(rstr) + 4;
-   str = malloc_or_die(len);
-   sprintf(str, "%s %s %s", lstr, op, rstr);
-
+   str = malloc(len);
+   if (str)
+   sprintf(str, "%s %s %s", lstr, op, rstr);
break;
 
default:
@@ -2322,10 +2330,11 @@ static char *str_to_str(struct event_filter *filter, struct filter_arg *arg)
 
len = strlen(arg->str.field->name) + strlen(op) +
strlen(arg->str.val) + 6;
-   str = malloc_or_die(len);
-   snprintf(str, len, "%s %s \"%s\"",
-arg->str.field->name,
-op, arg->str.val);
+   str = malloc(len);
+   if (str) {
+   snprintf(str, len, "%s %s \"%s\"",
+arg->str.field->name, op, arg->str.val);
+   }
break;
 
default:
@@ -2341,7 +2350,9 @@ static char *arg_to_str(struct event_filter *filter, struct filter_arg *arg)
 
switch (arg->type) {
case FILTER_ARG_BOOLEAN:
-   str = malloc_or_die(6);
+   str = malloc(6);
+   if (str == NULL)
+   return NULL;
if (arg->boolean.value)
strcpy(str, "TRUE");
else
@@ -2380,7 

[PATCH 06/14] tools lib traceevent: Get rid of malloc_or_die() in find_event()

2013-12-08 Thread Namhyung Kim
Make it return -2 to distinguish a memory allocation failure from the
existing -1 return used when no event matches.

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index e9d17bfcdffd..06e5af9f8fc4 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -301,7 +301,10 @@ find_event(struct pevent *pevent, struct event_list **events,
sys_name = NULL;
}
 
-   reg = malloc_or_die(strlen(event_name) + 3);
+   reg = malloc(strlen(event_name) + 3);
+   if (reg == NULL)
+   return -2;
+
sprintf(reg, "^%s$", event_name);
 
ret = regcomp(&ereg, reg, REG_ICASE|REG_NOSUB);
@@ -311,7 +314,12 @@ find_event(struct pevent *pevent, struct event_list **events,
return -1;
 
if (sys_name) {
-   reg = malloc_or_die(strlen(sys_name) + 3);
+   reg = malloc(strlen(sys_name) + 3);
+   if (reg == NULL) {
+   regfree(&ereg);
+   return -2;
+   }
+
sprintf(reg, "^%s$", sys_name);
ret = regcomp(&sreg, reg, REG_ICASE|REG_NOSUB);
free(reg);
@@ -1290,7 +1298,10 @@ int pevent_filter_add_filter_str(struct event_filter *filter,
/* Find this event */
ret = find_event(pevent, &events, strim(sys_name), strim(event_name));
if (ret < 0) {
-   if (event_name)
+   if (ret == -2)
+   show_error(error_str,
+  "Memory allocation failure");
+   else if (event_name)
show_error(error_str,
   "No event found under '%s.%s'",
   sys_name, event_name);
-- 
1.7.11.7
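
To illustrate the return-code convention in this patch, here is a
minimal standalone sketch in C that uses the POSIX regex API the same
way (an anchored, case-insensitive pattern) and keeps "nothing matched"
(-1) separate from "out of memory" (-2).  The names find_name() and
find_name_strerror() are hypothetical, and the sketch folds an invalid
pattern into the -1 case for brevity.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Check whether any of the @n strings in @names matches @pattern as a
 * whole, ignoring case.  Returns 0 if at least one matches, -1 if none
 * matches or the pattern is invalid, and -2 if allocation fails.
 */
static int find_name(const char *pattern, const char **names, int n)
{
	regex_t ereg;
	size_t size = strlen(pattern) + 3;	/* '^' + pattern + '$' + NUL */
	char *reg = malloc(size);
	int match = 0;
	int ret;
	int i;

	if (reg == NULL)
		return -2;

	snprintf(reg, size, "^%s$", pattern);
	ret = regcomp(&ereg, reg, REG_ICASE | REG_NOSUB);
	free(reg);
	if (ret)
		return -1;

	for (i = 0; i < n; i++) {
		if (regexec(&ereg, names[i], 0, NULL, 0) == 0)
			match = 1;
	}
	regfree(&ereg);

	return match ? 0 : -1;
}

/* Map a find_name() return code to a message for the user. */
static const char *find_name_strerror(int ret)
{
	if (ret == -2)
		return "Memory allocation failure";
	if (ret < 0)
		return "No event found";
	return "ok";
}
```

With distinct codes the caller can report an out-of-memory condition
accurately instead of telling the user that the event does not exist.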



[PATCHSET 00/14] tools lib traceevent: Get rid of *die() calls from parse-filter.c

2013-12-08 Thread Namhyung Kim
Hello,

This patchset tries to remove all die() calls in the event filter
parsing code.  The only remaining ones are in trace-seq.c, which
implements the print functions, and I'd like to hear what the best way
to handle an error during printing would be.

I also put these patches on the libtraceevent/die-removal-v1 branch in my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks
Namhyung


Namhyung Kim (14):
  tools lib traceevent: Get rid of malloc_or_die() in show_error()
  tools lib traceevent: Get rid of die in add_filter_type()
  tools lib traceevent: Get rid of malloc_or_die() in
pevent_filter_alloc()
  tools lib traceevent: Get rid of malloc_or_die() allocate_arg()
  tools lib traceevent: Get rid of malloc_or_die() in read_token()
  tools lib traceevent: Get rid of malloc_or_die() in find_event()
  tools lib traceevent: Get rid of malloc_or_die() in add_event()
  tools lib traceevent: Get rid of die() in create_arg_item()
  tools lib traceevent: Get rid of die() in add_right()
  tools lib traceevent: Get rid of die() in reparent_op_arg()
  tools lib traceevent: Get rid of malloc_or_die() in
pevent_filter_add_filter_str()
  tools lib traceevent: Get rid of die() in
pevent_filter_clear_trivial()
  tools lib traceevent: Refactor test_filter() to get rid of die()
  tools lib traceevent: Get rid of die() in some string conversion
functions

 tools/lib/traceevent/event-parse.h  |   3 +-
 tools/lib/traceevent/parse-filter.c | 432 +---
 2 files changed, 303 insertions(+), 132 deletions(-)

-- 
1.7.11.7



[PATCH][RESEND] score: remove unused CPU_SCORE7 Kconfig parameter

2013-12-08 Thread Michael Opdenacker
This removes the CPU_SCORE7 Kconfig parameter,
which is no longer used anywhere in the source code
or Makefiles.

Signed-off-by: Michael Opdenacker 
---
 arch/score/Kconfig | 6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/score/Kconfig b/arch/score/Kconfig
index 305f7ee1f382..2bc03d04f3af 100644
--- a/arch/score/Kconfig
+++ b/arch/score/Kconfig
@@ -23,27 +23,21 @@ choice
 config ARCH_SCORE7
bool "SCORE7 processor"
select SYS_SUPPORTS_32BIT_KERNEL
-   select CPU_SCORE7
select GENERIC_HAS_IOMAP
 
 config MACH_SPCT6600
bool "SPCT6600 series based machines"
select SYS_SUPPORTS_32BIT_KERNEL
-   select CPU_SCORE7
select GENERIC_HAS_IOMAP
 
 config SCORE_SIM
bool "Score simulator"
select SYS_SUPPORTS_32BIT_KERNEL
-   select CPU_SCORE7
select GENERIC_HAS_IOMAP
 endchoice
 
 endmenu
 
-config CPU_SCORE7
-   bool
-
 config NO_DMA
bool
default y
-- 
1.8.3.2



[PATCH][RESEND] score: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
This patch removes the use of the IRQF_DISABLED flag.

It has been a no-op since 2.6.35 and will be removed one day.

Signed-off-by: Michael Opdenacker 
---
 arch/score/kernel/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/score/kernel/time.c b/arch/score/kernel/time.c
index f0a43affb201..24770cd9b473 100644
--- a/arch/score/kernel/time.c
+++ b/arch/score/kernel/time.c
@@ -41,7 +41,7 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 
 static struct irqaction timer_irq = {
.handler = timer_interrupt,
-   .flags = IRQF_DISABLED | IRQF_TIMER,
+   .flags = IRQF_TIMER,
.name = "timer",
 };
 
-- 
1.8.3.2



[PATCH][RESEND] powerpc: remove unused REDBOOT Kconfig parameter

2013-12-08 Thread Michael Opdenacker
This removes the REDBOOT Kconfig parameter,
which is no longer used anywhere in the source code
or Makefiles.

Signed-off-by: Michael Opdenacker 
---
 arch/powerpc/Kconfig| 3 ---
 arch/powerpc/platforms/83xx/Kconfig | 1 -
 arch/powerpc/platforms/8xx/Kconfig  | 1 -
 3 files changed, 5 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b44b52c0a8f0..70dc283050b5 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -209,9 +209,6 @@ config DEFAULT_UIMAGE
  Used to allow a board to specify it wants a uImage built by default
default n
 
-config REDBOOT
-   bool
-
 config ARCH_HIBERNATION_POSSIBLE
bool
default y
diff --git a/arch/powerpc/platforms/83xx/Kconfig b/arch/powerpc/platforms/83xx/Kconfig
index 670a033264c0..2bdc8c862c46 100644
--- a/arch/powerpc/platforms/83xx/Kconfig
+++ b/arch/powerpc/platforms/83xx/Kconfig
@@ -99,7 +99,6 @@ config SBC834x
 config ASP834x
bool "Analogue & Micro ASP 834x"
select PPC_MPC834x
-   select REDBOOT
help
  This enables support for the Analogue & Micro ASP 83xx
  board.
diff --git a/arch/powerpc/platforms/8xx/Kconfig b/arch/powerpc/platforms/8xx/Kconfig
index 8dec3c0911ad..bd6f1a1cf922 100644
--- a/arch/powerpc/platforms/8xx/Kconfig
+++ b/arch/powerpc/platforms/8xx/Kconfig
@@ -45,7 +45,6 @@ config PPC_EP88XC
 config PPC_ADDER875
bool "Analogue & Micro Adder 875"
select CPM1
-   select REDBOOT
help
  This enables support for the Analogue & Micro Adder 875
  board.
-- 
1.8.3.2



[PATCH][RESEND] m32r: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
This patch removes the use of the IRQF_DISABLED flag.

It has been a no-op since 2.6.35 and will be removed one day.

Signed-off-by: Michael Opdenacker 
---
 arch/m32r/kernel/time.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/m32r/kernel/time.c b/arch/m32r/kernel/time.c
index 1a15f81ea1bd..093f2761aa51 100644
--- a/arch/m32r/kernel/time.c
+++ b/arch/m32r/kernel/time.c
@@ -134,7 +134,6 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 
 static struct irqaction irq0 = {
.handler = timer_interrupt,
-   .flags = IRQF_DISABLED,
.name = "MFT2",
 };
 
-- 
1.8.3.2



[PATCH][RESEND] ia64/xen: remove unused NO_IDLE_HZ Kconfig parameter

2013-12-08 Thread Michael Opdenacker
This removes the NO_IDLE_HZ Kconfig parameter,
which is no longer used anywhere in the source code
or Makefiles.

Signed-off-by: Michael Opdenacker 
---
 arch/ia64/xen/Kconfig | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/ia64/xen/Kconfig b/arch/ia64/xen/Kconfig
index 5d8a06b0ddf7..592999efc419 100644
--- a/arch/ia64/xen/Kconfig
+++ b/arch/ia64/xen/Kconfig
@@ -7,7 +7,6 @@ config XEN
default y
depends on PARAVIRT && MCKINLEY && IA64_PAGE_SIZE_16KB
select XEN_XENCOMM
-   select NO_IDLE_HZ
# followings are required to save/restore.
select ARCH_SUSPEND_POSSIBLE
select SUSPEND
@@ -19,7 +18,3 @@ config XEN
 config XEN_XENCOMM
depends on XEN
bool
-
-config NO_IDLE_HZ
-   depends on XEN
-   bool
-- 
1.8.3.2



RE: [f2fs-dev] [PATCH 3/3 V2] f2fs: introduce f2fs_cache_node_page() to add page into node_inode cache

2013-12-08 Thread Jaegeuk Kim
2013-12-09 (Mon), 10:14 +0800, Chao Yu:
> Hi,
> 
> > -Original Message-
> > From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> > Sent: Monday, December 09, 2013 7:37 AM
> > To: Chao Yu
> > Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
> > Subject: Re: [f2fs-dev] [PATCH 3/3 V2] f2fs: introduce f2fs_cache_node_page() to add page into node_inode cache
> > 
> > 2013-12-06 (Fri), 17:10 +0800, Chao Yu:
> > > This patch introduces f2fs_cache_node_page(). In this function, a page
> > > that has been read ahead is copied to node_inode's mapping cache.
> > > This avoids rereading these node pages.
> > >
> > > change log:
> > >  o check validity of grabbed page suggested by Jaegeuk Kim.
> > >
> > > Suggested-by: Jaegeuk Kim 
> > > Signed-off-by: Chao Yu 
> > > ---
> > >  fs/f2fs/node.c |   35 +++
> > >  1 file changed, 35 insertions(+)
> > >
> > > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > > index 099f06f..3ff98fa 100644
> > > --- a/fs/f2fs/node.c
> > > +++ b/fs/f2fs/node.c
> > > @@ -1600,6 +1600,39 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
> > >   return 0;
> > >  }
> > >
> > > +/*
> > > + * f2fs_cache_node_page() copies updated page data to the node_inode cache page.
> > > + */
> > > +void f2fs_cache_node_page(struct f2fs_sb_info *sbi, struct page *page,
> > > + nid_t nid)
> > > +{
> > > + struct address_space *mapping = sbi->node_inode->i_mapping;
> > > + struct page *npage;
> > 
> > 
> > What I meant by validity was to check the block address to figure
> > out whether this node page is up to date.
> > IOW, something like this.
> 
> Yes, you're right.
> 
> So, how about this one?
> ---
>  fs/f2fs/node.c |   39 ++-
>  1 file changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 099f06f..3e7a336 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1600,13 +1600,46 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
>   return 0;
>  }
>  
> +/*
> + * f2fs_cache_node_page() checks the validity of the input page by searching the NAT.
> + * Then it copies the updated data of a valid page to the node_inode cache.
> + */
> +void f2fs_cache_node_page(struct f2fs_sb_info *sbi, struct page *page,
> + nid_t nid, block_t blkaddr)
> +{
> + struct address_space *mapping = sbi->node_inode->i_mapping;
> + struct page *npage;
> + struct node_info ni;
> +
> + get_node_info(sbi, nid, &ni);
> +
> + if (ni.blk_addr != blkaddr)
> + return;
> +
> + npage = grab_cache_page(mapping, nid);
> + if (!npage)

if (unlikely(!npage))

Could you submit a v3?
Thanks,

> + return;
> +
> + if (PageUptodate(npage)) {
> + f2fs_put_page(npage, 1);
> + return;
> + }
> +
> + memcpy(page_address(npage), page_address(page), PAGE_CACHE_SIZE);
> +
> + SetPageUptodate(npage);
> + f2fs_put_page(npage, 1);
> +
> + return;
> +}
> +
>  int restore_node_summary(struct f2fs_sb_info *sbi,
>   unsigned int segno, struct f2fs_summary_block *sum)
>  {
>   struct f2fs_node *rn;
>   struct f2fs_summary *sum_entry;
>   struct page *page, *tmp;
> - block_t addr;
> + block_t addr, blkaddr;
>   int bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
>   int i, last_offset, nrpages, err = 0;
>   LIST_HEAD(page_list);
> @@ -1624,6 +1657,7 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
>   if (err)
>   return err;
>  
> + blkaddr = addr;
>   list_for_each_entry_safe(page, tmp, &page_list, lru) {
>  
>   lock_page(page);
> @@ -1633,6 +1667,8 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
>   sum_entry->version = 0;
>   sum_entry->ofs_in_node = 0;
>   sum_entry++;
> + f2fs_cache_node_page(sbi, page,
> + le32_to_cpu(rn->footer.nid), blkaddr);
>   } else {
>   err = -EIO;
>   }
> @@ -1640,6 +1676,7 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
>   list_del(&page->lru);
>   unlock_page(page);
>   __free_pages(page, 0);
> + blkaddr++;
>   }
>   }
>   return err;

-- 
Jaegeuk Kim
Samsung
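
For illustration, the validity check discussed above can be sketched in
userspace C. This is only a model of the idea, not the f2fs code: struct
sketch_page, SKETCH_PAGE_SIZE and sketch_cache_node_page() are made-up names
standing in for the page cache page, PAGE_CACHE_SIZE and the patched function,
and the NAT lookup is reduced to a block address passed in by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* stand-in for PAGE_CACHE_SIZE */

/* Hypothetical stand-in for a page in the node_inode page cache. */
struct sketch_page {
	int uptodate;
	unsigned char data[SKETCH_PAGE_SIZE];
};

/*
 * Copy src into the cache page only when the NAT still maps the node to the
 * block that was just read and the cache page is not already up to date.
 * Returns 1 if the page was cached, 0 if it was skipped.
 */
static int sketch_cache_node_page(struct sketch_page *npage,
				  const unsigned char *src,
				  uint32_t nat_blkaddr, uint32_t read_blkaddr)
{
	assert(npage && src);

	if (nat_blkaddr != read_blkaddr)
		return 0;	/* stale copy: the NAT points elsewhere */
	if (npage->uptodate)
		return 0;	/* cache already holds current data */

	memcpy(npage->data, src, SKETCH_PAGE_SIZE);
	npage->uptodate = 1;
	return 1;
}
```

The point of the block address comparison is that a node page read from the
summary area is only worth caching when it is still the current version of
that node.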



[PATCH][RESEND] avr32: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
This patch proposes to remove the use of the IRQF_DISABLED flag

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker 
Acked-by: Hans-Christian Egtvedt 
---
 arch/avr32/kernel/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/avr32/kernel/time.c b/arch/avr32/kernel/time.c
index 12f828ad5058..d0f771be9e96 100644
--- a/arch/avr32/kernel/time.c
+++ b/arch/avr32/kernel/time.c
@@ -59,7 +59,7 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 static struct irqaction timer_irqaction = {
.handler= timer_interrupt,
/* Oprofile uses the same irq as the timer, so allow it to be shared */
-   .flags  = IRQF_TIMER | IRQF_DISABLED | IRQF_SHARED,
+   .flags  = IRQF_TIMER | IRQF_SHARED,
.name   = "avr32_comparator",
 };
 
-- 
1.8.3.2



Re: [PATCH] arm: plat-orion: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
Hi Jason,

On 10/17/2013 02:54 PM, Jason Cooper wrote:
> On Sat, Oct 12, 2013 at 05:49:20AM +0200, Michael Opdenacker wrote:
>> This patch proposes to remove the use of the IRQF_DISABLED flag
>>
>> It's a NOOP since 2.6.35 and it will be removed one day.
>>
>> Signed-off-by: Michael Opdenacker 
>> ---
>>  arch/arm/plat-orion/time.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> Applied to mvebu/soc

Oops, I don't see this patch in mainline yet.

Is it still in the tree? Did it just miss the last merge window?

Thanks for your support,

Cheers,

Michael.

-- 
Michael Opdenacker, CEO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
+33 484 258 098



Re: [PATCH] ARM: dts: mxs: Add iio-hwmon to crystalfontz boards

2013-12-08 Thread Shawn Guo
On Fri, Dec 06, 2013 at 09:47:23PM +0100, Alexandre Belloni wrote:
> Signed-off-by: Alexandre Belloni 
> ---
> 
> This won't work until those patches are applied:
>  https://lkml.org/lkml/2013/12/6/676

I won't have this patch in my tree until next cycle.  But if this DTS
change does not cause any regression on my tree, I can still apply it,
and the feature will work when it gets merged into linux-next together
with the driver changes.

>  and 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-December/216963.html
> 
> However, I'm wondering if it wouldn't make sense to add it directly in
> imx28.dtsi

Since it sounds like an SoC configuration rather than a board one, having
the change in imx28.dtsi makes sense to me.

Shawn

> 
>  arch/arm/boot/dts/imx28-cfa10036.dts | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/imx28-cfa10036.dts 
> b/arch/arm/boot/dts/imx28-cfa10036.dts
> index cabb6171a19d..51ebf36dd7b9 100644
> --- a/arch/arm/boot/dts/imx28-cfa10036.dts
> +++ b/arch/arm/boot/dts/imx28-cfa10036.dts
> @@ -114,4 +114,9 @@
>   default-state = "on";
>   };
>   };
> +
> + iio_hwmon {
> + compatible = "iio-hwmon";
> + io-channels = < 8>;
> + };
>  };
> -- 
> 1.8.3.2
> 



[PATCH][RESEND] ARM: w90x900: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
This patch proposes to remove the use of the IRQF_DISABLED flag

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker 
Acked-by: Wan zongshun 
---
 arch/arm/mach-w90x900/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-w90x900/time.c b/arch/arm/mach-w90x900/time.c
index 30fbca844575..9230d3725599 100644
--- a/arch/arm/mach-w90x900/time.c
+++ b/arch/arm/mach-w90x900/time.c
@@ -111,7 +111,7 @@ static irqreturn_t nuc900_timer0_interrupt(int irq, void 
*dev_id)
 
 static struct irqaction nuc900_timer0_irq = {
.name   = "nuc900-timer0",
-   .flags  = IRQF_DISABLED | IRQF_TIMER | IRQF_IRQPOLL,
+   .flags  = IRQF_TIMER | IRQF_IRQPOLL,
.handler= nuc900_timer0_interrupt,
 };
 
-- 
1.8.3.2



Re: [PATCH] perf tools: Fix bug for perf kvm report without guestmount.

2013-12-08 Thread Dongsheng Yang

On 12/08/2013 11:32 PM, David Ahern wrote:
> On 12/9/13, 10:12 AM, Dongsheng Yang wrote:
>> On 12/08/2013 10:42 PM, David Ahern wrote:
>>> On 12/9/13, 8:20 AM, Dongsheng Yang wrote:
>>>> How about introducing an option named --guestpid? Then we can make the
>>>> usage of perf kvm clearer:
>>>>  * perf kvm --guestkallsyms --guestmodules --guestpid
>>>>    [top|record|report]
>>>>    This usage is for only one guest and will not resolve the
>>>>    symbols from other guests.
>>>
>>> If there is only 1 guest then there should not be a problem right? You
>>> give perf a single guest kallsyms as the "default" and it works.
>>> --guestpid adds no value in that case.
>>
>> Yes, if only one guest is running, the "default" guest is "the"
>> guest. Then with my patch in this thread applied, it works well.
>>
>> But consider this scenario: two guests are running, but we
>> need to record-report only one of them.
>>
>> --guestmount can achieve this, but as a shortcut for guestmount,
>> --guest{kallsyms,modules} does not support it well, right? So I think
>> we can discard the default guest and use guestpid in record-report.
>
> No.
>
> Use cases:
> 1. one guest
> --guestkallsyms and --guestmodules apply to default guest; user should
> supply files that apply to the one guest. Supplying any other kallsyms
> is just nonsense. *NO* other arguments are needed.
>
> 2. more than 1 VM, *ALL* VMs running the same kernel
> --guestkallsyms and --guestmodules apply to default guest; user should
> supply files that apply to all of guests. No other arguments are needed.
>
> 3. more than 1 VM, VMs running different kernels. 1+ VMs running the
> same kernel
> --guestmount allows user to supply files that apply to all of guests
> based on pid. --guestkallsyms/guestmodules is used for any guest not
> showing up in guestmount.

With more than 1 VM, the cases you provided are all about record-report
of the symbols from __all__ guests. What if I want to record-report only
one of them?

Example:
There are 2 guests running different kernels, and I want to
record-report VM1.

Currently, we can use --guestmount to get the symbols of VM1, but we
cannot remove the symbols of VM2 from the report. That means the
percentage shown for each symbol is not relative to VM1 alone.

What I want by introducing guestpid is to record-report the symbols
only from VM1, and ignore VM2.

> David
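
To make the two positions concrete, here is a rough userspace sketch in C. It
is not perf code: struct sketch_guest, sketch_pick_kallsyms() and
sketch_sample_selected() are hypothetical names. The first function models the
lookup order in the use cases above (per-pid files from guestmount first, then
the default guest files); the second models the proposed --guestpid as a
filter on which samples are counted at all, with 0 assumed to mean "no
filter".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one guest found under the guestmount directory. */
struct sketch_guest {
	int pid;
	const char *kallsyms;	/* path to that guest's kallsyms copy */
};

/*
 * Symbol source for a sample from guest_pid: a per-pid entry wins, otherwise
 * fall back to the default guest files (which may be NULL if none was given).
 */
static const char *sketch_pick_kallsyms(int guest_pid,
					const struct sketch_guest *mounted,
					size_t nr_mounted,
					const char *default_kallsyms)
{
	size_t i;

	assert(mounted || nr_mounted == 0);

	for (i = 0; i < nr_mounted; i++)
		if (mounted[i].pid == guest_pid)
			return mounted[i].kallsyms;

	return default_kallsyms;
}

/* Proposed filter: with a guestpid set, samples from other guests are dropped. */
static int sketch_sample_selected(int sample_guest_pid, int guestpid)
{
	return guestpid == 0 || sample_guest_pid == guestpid;
}
```

With the filter in place, the total used for percentages would contain VM1
samples only, which is the behaviour asked for above; the lookup function
alone cannot give that, since it resolves symbols but never drops samples.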



[PATCH][RESEND] ARM: spear: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
This patch proposes to remove the use of the IRQF_DISABLED flag

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker 
---
 arch/arm/mach-spear/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-spear/time.c b/arch/arm/mach-spear/time.c
index d449673e40f7..218ba5b67d92 100644
--- a/arch/arm/mach-spear/time.c
+++ b/arch/arm/mach-spear/time.c
@@ -172,7 +172,7 @@ static irqreturn_t spear_timer_interrupt(int irq, void 
*dev_id)
 
 static struct irqaction spear_timer_irq = {
.name = "timer",
-   .flags = IRQF_DISABLED | IRQF_TIMER,
+   .flags = IRQF_TIMER,
.handler = spear_timer_interrupt
 };
 
-- 
1.8.3.2



[PATCH][RESEND] ARM: mmp: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
This patch proposes to remove the use of the IRQF_DISABLED flag

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker 
---
 arch/arm/mach-mmp/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-mmp/time.c b/arch/arm/mach-mmp/time.c
index 7ac41e83cfef..6aacc9b050f5 100644
--- a/arch/arm/mach-mmp/time.c
+++ b/arch/arm/mach-mmp/time.c
@@ -186,7 +186,7 @@ static void __init timer_config(void)
 
 static struct irqaction timer_irq = {
.name   = "timer",
-   .flags  = IRQF_DISABLED | IRQF_TIMER | IRQF_IRQPOLL,
+   .flags  = IRQF_TIMER | IRQF_IRQPOLL,
.handler= timer_interrupt,
.dev_id = ,
 };
-- 
1.8.3.2



[PATCH][RESEND] ARM: misc: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
This patch proposes to remove the use of the IRQF_DISABLED flag
from miscellaneous code in mach-xxx and plat-xxx

This flag is a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker 
---
 arch/arm/mach-ebsa110/core.c | 2 +-
 arch/arm/mach-integrator/integrator_ap.c | 2 +-
 arch/arm/mach-ks8695/time.c  | 2 +-
 arch/arm/mach-netx/time.c| 2 +-
 arch/arm/mach-rpc/dma.c  | 2 +-
 arch/arm/mach-rpc/time.c | 1 -
 arch/arm/mach-sa1100/time.c  | 2 +-
 arch/arm/mach-u300/timer.c   | 2 +-
 arch/arm/plat-iop/time.c | 2 +-
 arch/arm/plat-pxa/dma.c  | 2 +-
 10 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/arch/arm/mach-ebsa110/core.c b/arch/arm/mach-ebsa110/core.c
index 68ac934d4565..8254e716b095 100644
--- a/arch/arm/mach-ebsa110/core.c
+++ b/arch/arm/mach-ebsa110/core.c
@@ -206,7 +206,7 @@ ebsa110_timer_interrupt(int irq, void *dev_id)
 
 static struct irqaction ebsa110_timer_irq = {
.name   = "EBSA110 Timer Tick",
-   .flags  = IRQF_DISABLED | IRQF_TIMER | IRQF_IRQPOLL,
+   .flags  = IRQF_TIMER | IRQF_IRQPOLL,
.handler= ebsa110_timer_interrupt,
 };
 
diff --git a/arch/arm/mach-integrator/integrator_ap.c 
b/arch/arm/mach-integrator/integrator_ap.c
index d50dc2dbfd89..699abfdbc673 100644
--- a/arch/arm/mach-integrator/integrator_ap.c
+++ b/arch/arm/mach-integrator/integrator_ap.c
@@ -368,7 +368,7 @@ static struct clock_event_device integrator_clockevent = {
 
 static struct irqaction integrator_timer_irq = {
.name   = "timer",
-   .flags  = IRQF_DISABLED | IRQF_TIMER | IRQF_IRQPOLL,
+   .flags  = IRQF_TIMER | IRQF_IRQPOLL,
.handler= integrator_timer_interrupt,
.dev_id = &integrator_clockevent,
 };
diff --git a/arch/arm/mach-ks8695/time.c b/arch/arm/mach-ks8695/time.c
index 426c97662f5b..a197874bf382 100644
--- a/arch/arm/mach-ks8695/time.c
+++ b/arch/arm/mach-ks8695/time.c
@@ -122,7 +122,7 @@ static irqreturn_t ks8695_timer_interrupt(int irq, void 
*dev_id)
 
 static struct irqaction ks8695_timer_irq = {
.name   = "ks8695_tick",
-   .flags  = IRQF_DISABLED | IRQF_TIMER,
+   .flags  = IRQF_TIMER,
.handler= ks8695_timer_interrupt,
 };
 
diff --git a/arch/arm/mach-netx/time.c b/arch/arm/mach-netx/time.c
index 6df42e643031..3177c7a40930 100644
--- a/arch/arm/mach-netx/time.c
+++ b/arch/arm/mach-netx/time.c
@@ -99,7 +99,7 @@ netx_timer_interrupt(int irq, void *dev_id)
 
 static struct irqaction netx_timer_irq = {
.name   = "NetX Timer Tick",
-   .flags  = IRQF_DISABLED | IRQF_TIMER | IRQF_IRQPOLL,
+   .flags  = IRQF_TIMER | IRQF_IRQPOLL,
.handler= netx_timer_interrupt,
 };
 
diff --git a/arch/arm/mach-rpc/dma.c b/arch/arm/mach-rpc/dma.c
index 85883b2e0e49..6d3517dc4772 100644
--- a/arch/arm/mach-rpc/dma.c
+++ b/arch/arm/mach-rpc/dma.c
@@ -141,7 +141,7 @@ static int iomd_request_dma(unsigned int chan, dma_t *dma)
struct iomd_dma *idma = container_of(dma, struct iomd_dma, dma);
 
return request_irq(idma->irq, iomd_dma_handle,
-  IRQF_DISABLED, idma->dma.device_id, idma);
+  0, idma->dma.device_id, idma);
 }
 
 static void iomd_free_dma(unsigned int chan, dma_t *dma)
diff --git a/arch/arm/mach-rpc/time.c b/arch/arm/mach-rpc/time.c
index 9a6def14df01..9a5158861ca9 100644
--- a/arch/arm/mach-rpc/time.c
+++ b/arch/arm/mach-rpc/time.c
@@ -75,7 +75,6 @@ ioc_timer_interrupt(int irq, void *dev_id)
 
 static struct irqaction ioc_timer_irq = {
.name   = "timer",
-   .flags  = IRQF_DISABLED,
.handler= ioc_timer_interrupt
 };
 
diff --git a/arch/arm/mach-sa1100/time.c b/arch/arm/mach-sa1100/time.c
index 713c86cd3d64..a98fded8c432 100644
--- a/arch/arm/mach-sa1100/time.c
+++ b/arch/arm/mach-sa1100/time.c
@@ -112,7 +112,7 @@ static struct clock_event_device ckevt_sa1100_osmr0 = {
 
 static struct irqaction sa1100_timer_irq = {
.name   = "ost0",
-   .flags  = IRQF_DISABLED | IRQF_TIMER | IRQF_IRQPOLL,
+   .flags  = IRQF_TIMER | IRQF_IRQPOLL,
.handler= sa1100_ost0_interrupt,
.dev_id = &ckevt_sa1100_osmr0,
 };
diff --git a/arch/arm/mach-u300/timer.c b/arch/arm/mach-u300/timer.c
index 9a5f9fb352ce..f4669c4225c3 100644
--- a/arch/arm/mach-u300/timer.c
+++ b/arch/arm/mach-u300/timer.c
@@ -329,7 +329,7 @@ static irqreturn_t u300_timer_interrupt(int irq, void 
*dev_id)
 
 static struct irqaction u300_timer_irq = {
.name   = "U300 Timer Tick",
-   .flags  = IRQF_DISABLED | IRQF_TIMER | IRQF_IRQPOLL,
+   .flags  = IRQF_TIMER | IRQF_IRQPOLL,
.handler= u300_timer_interrupt,
 };
 
diff --git a/arch/arm/plat-iop/time.c 

Re: Re: Re: [RFC PATCH tip 0/5] tracing filters with BPF

2013-12-08 Thread Masami Hiramatsu
(2013/12/08 1:21), Jovi Zhangwei wrote:
> On Sat, Dec 7, 2013 at 7:58 AM, Masami Hiramatsu
>  wrote:
>> (2013/12/06 14:19), Jovi Zhangwei wrote:
>>> Hi Alexei,
>>>
>>> On Thu, Dec 5, 2013 at 12:40 PM, Alexei Starovoitov  
>>> wrote:
>>>>> On Tue, Dec 3, 2013 at 4:01 PM, Andi Kleen wrote:
>>>>>>
>>>>>> Can you do some performance comparison compared to e.g. ktap?
>>>>>> How much faster is it?
>>>>
>>>> Did simple ktap test with 1M alloc_skb/kfree_skb toy test from earlier
>>>> email:
>>>> trace skb:kfree_skb {
>>>>         if (arg2 == 0x100) {
>>>>                 printf("%x %x\n", arg1, arg2)
>>>>         }
>>>> }
>>>> 1M skb alloc/free 350315 (usecs)
>>>>
>>>> baseline without any tracing:
>>>> 1M skb alloc/free 145400 (usecs)
>>>>
>>>> then equivalent bpf test:
>>>> void filter(struct bpf_context *ctx)
>>>> {
>>>>         void *loc = (void *)ctx->regs.dx;
>>>>         if (loc == 0x100) {
>>>>                 struct sk_buff *skb = (struct sk_buff *)ctx->regs.si;
>>>>                 char fmt[] = "skb %p loc %p\n";
>>>>                 bpf_trace_printk(fmt, sizeof(fmt), (long)skb, (long)loc, 0);
>>>>         }
>>>> }
>>>> 1M skb alloc/free 183214 (usecs)
>>>>
>>>> so with one 'if' condition the difference ktap vs bpf is 350-145 vs 183-145
>>>>
>>>> obviously ktap is an interpreter, so it's not really fair.
>>>>
>>>> To make it really unfair I did:
>>>> trace skb:kfree_skb {
>>>>         if (arg2 == 0x100 || arg2 == 0x200 || arg2 == 0x300 || arg2 == 0x400 ||
>>>>             arg2 == 0x500 || arg2 == 0x600 || arg2 == 0x700 || arg2 == 0x800 ||
>>>>             arg2 == 0x900 || arg2 == 0x1000) {
>>>>                 printf("%x %x\n", arg1, arg2)
>>>>         }
>>>> }
>>>> 1M skb alloc/free 484280 (usecs)
>>>>
>>>> and corresponding bpf:
>>>> void filter(struct bpf_context *ctx)
>>>> {
>>>>         void *loc = (void *)ctx->regs.dx;
>>>>         if (loc == 0x100 || loc == 0x200 || loc == 0x300 || loc == 0x400 ||
>>>>             loc == 0x500 || loc == 0x600 || loc == 0x700 || loc == 0x800 ||
>>>>             loc == 0x900 || loc == 0x1000) {
>>>>                 struct sk_buff *skb = (struct sk_buff *)ctx->regs.si;
>>>>                 char fmt[] = "skb %p loc %p\n";
>>>>                 bpf_trace_printk(fmt, sizeof(fmt), (long)skb, (long)loc, 0);
>>>>         }
>>>> }
>>>> 1M skb alloc/free 185660 (usecs)
>>>>
>>>> the difference is bigger now: 484-145 vs 185-145
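
Spelling out the subtraction in the quoted numbers may help. The helper below
is only a sketch (sketch_overhead_usecs() is a made-up name); it computes the
tracing overhead as the traced run time minus the 145400 usec baseline, using
the figures quoted above.

```c
#include <assert.h>

/* Overhead attributable to tracing: traced run minus untraced baseline. */
static long sketch_overhead_usecs(long traced_usecs, long baseline_usecs)
{
	assert(traced_usecs >= baseline_usecs);
	return traced_usecs - baseline_usecs;
}

/*
 * Applied to the quoted runs (1M skb alloc/free, baseline 145400 usecs):
 *
 *   one condition:   ktap 350315 - 145400 = 204915   bpf 183214 - 145400 = 37814
 *   ten conditions:  ktap 484280 - 145400 = 338880   bpf 185660 - 145400 = 40260
 */
```

So the overhead ratio ktap/bpf is roughly 5.4x with one condition and roughly
8.4x with ten, for this one toy test only.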

>>> There have big differences for compare arg2(in ktap) with direct register
>>> access(ctx->regs.dx).
>>>
>>> The current argument fetching(arg2 in above testcase) implementation in ktap
>>> is very inefficient, see ktap/interpreter/lib_kdebug.c:kp_event_getarg.
>>> The only way to speedup is kernel tracing code change, let external tracing
>>> module access event field not through list lookup. This work is not
>>> started yet. :)
>>
>> I'm not sure why you can't access it directly from ftrace-event buffer.
>> There is just a packed data structure and it is exposed via debugfs.
>> You can decode it and can get an offset/size by using libtraceevent.
>>
> Then it means there need pass the event field info into kernel through trunk,
> it looks strange because the kernel structure is the source of event field 
> info,
> it's like loop-back, and need to engage with libtraceevent in userspace.

No, the static trace events have their own kernel data structures, but
the dynamic events don't. They expose the data format (offset/type)
via debugfs, but do not define a new data structure.
So, I meant it is enough for the script to take an offset and a cast
to the corresponding size.

> (the side effect is it will make compilation slow, and consume more memory,
> sometimes it will process 20K events in one script, like 'trace
> probe:big_dso:*')

I doubt it, since you only need to get the formats for the events that
the script uses.

> So "the only way" which I said is wrong, your approach indeed is another way.
> I just think maybe use array instead of list for event fields would be more
> efficient if list is not must needed. we can check it more in future.

Ah, perhaps I misunderstood the ktap implementation. Does it define dynamic
events right before loading the bytecode? In that case, I recommend changing
the loader to adjust the bytecode after defining the events, so that the
offset information fits the target event format.

e.g.
 1) compile a bytecode with dummy offsets
 2) define new additional dynamic events
 3) get the field offset information from the events
 4) modify the bytecode in memory, replacing the dummy offsets with the correct ones
 5) load the bytecode
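
Step 4 could look roughly like the following userspace C sketch. Everything
here is hypothetical: the instruction layout, the opcodes and the names
(struct sketch_insn, struct sketch_field, sketch_patch_offsets()) are invented
for illustration and are not ktap's bytecode format. The field table is
assumed to have been filled from the event format information in step 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SKETCH_OP_OTHER = 0, SKETCH_OP_LOAD_FIELD = 1 };	/* made-up opcodes */
#define SKETCH_DUMMY_OFFSET 0xffff	/* placeholder emitted by the compiler */

/* Hypothetical instruction: load `size` bytes at `offset` in the event record. */
struct sketch_insn {
	uint8_t op;
	uint8_t size;
	uint16_t offset;	/* dummy until patched */
	uint16_t field_id;	/* index into the script's field-name table */
};

/* One field of the target event, as read from its format description. */
struct sketch_field {
	const char *name;
	uint16_t offset;
	uint8_t size;
};

static const struct sketch_field *
sketch_find_field(const struct sketch_field *fmt, size_t nr_fmt, const char *name)
{
	size_t i;

	for (i = 0; i < nr_fmt; i++)
		if (strcmp(fmt[i].name, name) == 0)
			return &fmt[i];
	return NULL;
}

/*
 * Replace dummy offsets in every field-load instruction with the real offset
 * and size. Returns 0 on success, -1 if a referenced field is unknown (in
 * which case earlier instructions may already have been patched).
 */
static int sketch_patch_offsets(struct sketch_insn *code, size_t nr_insn,
				const char *const *names, size_t nr_names,
				const struct sketch_field *fmt, size_t nr_fmt)
{
	size_t i;

	assert(code && names && fmt);

	for (i = 0; i < nr_insn; i++) {
		const struct sketch_field *f;

		if (code[i].op != SKETCH_OP_LOAD_FIELD)
			continue;
		if (code[i].field_id >= nr_names)
			return -1;
		f = sketch_find_field(fmt, nr_fmt, names[code[i].field_id]);
		if (!f)
			return -1;
		code[i].offset = f->offset;
		code[i].size = f->size;
	}
	return 0;
}
```

After this pass, each field access at run time is a direct load at a fixed
offset, with no per-event lookup by field name.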

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



[PATCH][RESEND] ARM: LPC32xx: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
This patch proposes to remove the use of the IRQF_DISABLED flag

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker 
---
 arch/arm/mach-lpc32xx/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-lpc32xx/timer.c b/arch/arm/mach-lpc32xx/timer.c
index 20eab63d10ba..4e5837299c04 100644
--- a/arch/arm/mach-lpc32xx/timer.c
+++ b/arch/arm/mach-lpc32xx/timer.c
@@ -90,7 +90,7 @@ static irqreturn_t lpc32xx_timer_interrupt(int irq, void 
*dev_id)
 
 static struct irqaction lpc32xx_timer_irq = {
.name   = "LPC32XX Timer Tick",
-   .flags  = IRQF_DISABLED | IRQF_TIMER | IRQF_IRQPOLL,
+   .flags  = IRQF_TIMER | IRQF_IRQPOLL,
.handler= lpc32xx_timer_interrupt,
 };
 
-- 
1.8.3.2



[PATCH][RESEND] ARM: IXP4xx: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
This patch proposes to remove the use of the IRQF_DISABLED flag

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker 
---
 arch/arm/mach-ixp4xx/common.c| 2 +-
 arch/arm/mach-ixp4xx/dsmg600-setup.c | 3 +--
 arch/arm/mach-ixp4xx/fsg-setup.c | 6 ++
 arch/arm/mach-ixp4xx/nas100d-setup.c | 3 +--
 arch/arm/mach-ixp4xx/nslu2-setup.c   | 6 ++
 5 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/arch/arm/mach-ixp4xx/common.c b/arch/arm/mach-ixp4xx/common.c
index 9edaf4734fa8..340b2c9c51f4 100644
--- a/arch/arm/mach-ixp4xx/common.c
+++ b/arch/arm/mach-ixp4xx/common.c
@@ -312,7 +312,7 @@ static irqreturn_t ixp4xx_timer_interrupt(int irq, void 
*dev_id)
 
 static struct irqaction ixp4xx_timer_irq = {
.name   = "timer1",
-   .flags  = IRQF_DISABLED | IRQF_TIMER | IRQF_IRQPOLL,
+   .flags  = IRQF_TIMER | IRQF_IRQPOLL,
.handler= ixp4xx_timer_interrupt,
.dev_id = _ixp4xx,
 };
diff --git a/arch/arm/mach-ixp4xx/dsmg600-setup.c 
b/arch/arm/mach-ixp4xx/dsmg600-setup.c
index 736dc692d540..43ee06d3abe5 100644
--- a/arch/arm/mach-ixp4xx/dsmg600-setup.c
+++ b/arch/arm/mach-ixp4xx/dsmg600-setup.c
@@ -233,8 +233,7 @@ static int __init dsmg600_gpio_init(void)
 
gpio_request(DSMG600_RB_GPIO, "reset button");
if (request_irq(gpio_to_irq(DSMG600_RB_GPIO), _reset_handler,
-   IRQF_DISABLED | IRQF_TRIGGER_LOW,
-   "DSM-G600 reset button", NULL) < 0) {
+   IRQF_TRIGGER_LOW, "DSM-G600 reset button", NULL) < 0) {
 
printk(KERN_DEBUG "Reset Button IRQ %d not available\n",
gpio_to_irq(DSMG600_RB_GPIO));
diff --git a/arch/arm/mach-ixp4xx/fsg-setup.c b/arch/arm/mach-ixp4xx/fsg-setup.c
index 429966b756ed..5c4b0c4a1b37 100644
--- a/arch/arm/mach-ixp4xx/fsg-setup.c
+++ b/arch/arm/mach-ixp4xx/fsg-setup.c
@@ -208,16 +208,14 @@ static void __init fsg_init(void)
platform_add_devices(fsg_devices, ARRAY_SIZE(fsg_devices));
 
if (request_irq(gpio_to_irq(FSG_RB_GPIO), _reset_handler,
-   IRQF_DISABLED | IRQF_TRIGGER_LOW,
-   "FSG reset button", NULL) < 0) {
+   IRQF_TRIGGER_LOW, "FSG reset button", NULL) < 0) {
 
printk(KERN_DEBUG "Reset Button IRQ %d not available\n",
gpio_to_irq(FSG_RB_GPIO));
}
 
if (request_irq(gpio_to_irq(FSG_SB_GPIO), _power_handler,
-   IRQF_DISABLED | IRQF_TRIGGER_LOW,
-   "FSG power button", NULL) < 0) {
+   IRQF_TRIGGER_LOW, "FSG power button", NULL) < 0) {
 
printk(KERN_DEBUG "Power Button IRQ %d not available\n",
gpio_to_irq(FSG_SB_GPIO));
diff --git a/arch/arm/mach-ixp4xx/nas100d-setup.c 
b/arch/arm/mach-ixp4xx/nas100d-setup.c
index 507cb5233537..4e0f762bc651 100644
--- a/arch/arm/mach-ixp4xx/nas100d-setup.c
+++ b/arch/arm/mach-ixp4xx/nas100d-setup.c
@@ -295,8 +295,7 @@ static void __init nas100d_init(void)
pm_power_off = nas100d_power_off;
 
if (request_irq(gpio_to_irq(NAS100D_RB_GPIO), _reset_handler,
-   IRQF_DISABLED | IRQF_TRIGGER_LOW,
-   "NAS100D reset button", NULL) < 0) {
+   IRQF_TRIGGER_LOW, "NAS100D reset button", NULL) < 0) {
 
printk(KERN_DEBUG "Reset Button IRQ %d not available\n",
gpio_to_irq(NAS100D_RB_GPIO));
diff --git a/arch/arm/mach-ixp4xx/nslu2-setup.c 
b/arch/arm/mach-ixp4xx/nslu2-setup.c
index ba5f1cda2a9d..88c025f52d8d 100644
--- a/arch/arm/mach-ixp4xx/nslu2-setup.c
+++ b/arch/arm/mach-ixp4xx/nslu2-setup.c
@@ -265,16 +265,14 @@ static void __init nslu2_init(void)
pm_power_off = nslu2_power_off;
 
if (request_irq(gpio_to_irq(NSLU2_RB_GPIO), _reset_handler,
-   IRQF_DISABLED | IRQF_TRIGGER_LOW,
-   "NSLU2 reset button", NULL) < 0) {
+   IRQF_TRIGGER_LOW, "NSLU2 reset button", NULL) < 0) {
 
printk(KERN_DEBUG "Reset Button IRQ %d not available\n",
gpio_to_irq(NSLU2_RB_GPIO));
}
 
if (request_irq(gpio_to_irq(NSLU2_PB_GPIO), _power_handler,
-   IRQF_DISABLED | IRQF_TRIGGER_HIGH,
-   "NSLU2 power button", NULL) < 0) {
+   IRQF_TRIGGER_HIGH, "NSLU2 power button", NULL) < 0) {
 
printk(KERN_DEBUG "Power Button IRQ %d not available\n",
gpio_to_irq(NSLU2_PB_GPIO));
-- 
1.8.3.2



[PATCH][RESEND] ARM: cns3xxx: remove deprecated IRQF_DISABLED

2013-12-08 Thread Michael Opdenacker
This patch proposes to remove the use of the IRQF_DISABLED flag

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker 
---
 arch/arm/mach-cns3xxx/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-cns3xxx/core.c b/arch/arm/mach-cns3xxx/core.c
index e38b279f402c..384dc859e6c6 100644
--- a/arch/arm/mach-cns3xxx/core.c
+++ b/arch/arm/mach-cns3xxx/core.c
@@ -155,7 +155,7 @@ static irqreturn_t cns3xxx_timer_interrupt(int irq, void 
*dev_id)
 
 static struct irqaction cns3xxx_timer_irq = {
.name   = "timer",
-   .flags  = IRQF_DISABLED | IRQF_TIMER | IRQF_IRQPOLL,
+   .flags  = IRQF_TIMER | IRQF_IRQPOLL,
.handler= cns3xxx_timer_interrupt,
 };
 
-- 
1.8.3.2


