date:20140623

[PATCH 06/11] usb: dwc2/gadget: ensure that all fifos have correct memory buffers

2014-06-23 Thread Robert Baldyga

From: Marek Szyprowski 

Print warning if FIFOs are configured in such a way that they don't fit
into the SPRAM available on the s3c hsotg module.

Signed-off-by: Marek Szyprowski 
Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/core.h   |  1 +
 drivers/usb/dwc2/gadget.c | 15 ++-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 1efd10c..067390e 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -194,6 +194,7 @@ struct s3c_hsotg {
struct regulator_bulk_data supplies[ARRAY_SIZE(s3c_hsotg_supply_names)];
 
u32 phyif;
+   int fifo_mem;
unsigned int dedicated_fifos:1;
unsigned char   num_of_eps;
 
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 95b6dcb..21d21de 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -194,6 +194,8 @@ static void s3c_hsotg_init_fifo(struct s3c_hsotg *hsotg)
for (ep = 1; ep <= 15; ep++) {
val = addr;
val |= size << FIFOSIZE_DEPTH_SHIFT;
+   WARN_ONCE(addr + size > hsotg->fifo_mem,
+ "insufficient fifo memory");
addr += size;
 
writel(val, hsotg->regs + DPTXFSIZN(ep));
@@ -3030,19 +3032,22 @@ static void s3c_hsotg_initep(struct s3c_hsotg *hsotg,
  */
 static void s3c_hsotg_hw_cfg(struct s3c_hsotg *hsotg)
 {
-   u32 cfg2, cfg4;
+   u32 cfg2, cfg3, cfg4;
/* check hardware configuration */
 
cfg2 = readl(hsotg->regs + 0x48);
hsotg->num_of_eps = (cfg2 >> 10) & 0xF;
 
-   dev_info(hsotg->dev, "EPs:%d\n", hsotg->num_of_eps);
+   cfg3 = readl(hsotg->regs + 0x4C);
+   hsotg->fifo_mem = (cfg3 >> 16);
 
cfg4 = readl(hsotg->regs + 0x50);
hsotg->dedicated_fifos = (cfg4 >> 25) & 1;
 
-   dev_info(hsotg->dev, "%s fifos\n",
-hsotg->dedicated_fifos ? "dedicated" : "shared");
+   dev_info(hsotg->dev, "EPs: %d, %s fifos, %d entries in SPRAM\n",
+hsotg->num_of_eps,
+hsotg->dedicated_fifos ? "dedicated" : "shared",
+hsotg->fifo_mem);
 }
 
 /**
@@ -3495,8 +3500,8 @@ static int s3c_hsotg_probe(struct platform_device *pdev)
s3c_hsotg_phy_enable(hsotg);
 
s3c_hsotg_corereset(hsotg);
-   s3c_hsotg_init(hsotg);
s3c_hsotg_hw_cfg(hsotg);
+   s3c_hsotg_init(hsotg);
 
/* hsotg->num_of_eps holds number of EPs other than ep0 */
 
-- 
1.9.1
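
For readers following the series, the check added above can be sketched outside the driver. This is only an illustration: the helper names are hypothetical, and the SPRAM depth in the usage note is made up rather than read from real hardware. It assumes, as the patch does, that the upper 16 bits of the hardware configuration register hold the FIFO memory depth in 32-bit words.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: SPRAM depth in 32-bit words, taken from the upper
 * 16 bits of a GHWCFG3-style register value, as s3c_hsotg_hw_cfg() does. */
uint32_t fifo_mem_words(uint32_t ghwcfg3)
{
	return ghwcfg3 >> 16;
}

/* Lay out `count` TX FIFOs of `size` words each, starting at word `addr`,
 * and report whether every one of them ends within `fifo_mem` words. */
bool tx_fifos_fit(uint32_t addr, uint32_t size, unsigned int count,
		  uint32_t fifo_mem)
{
	for (unsigned int ep = 0; ep < count; ep++) {
		if (addr + size > fifo_mem)
			return false;	/* the driver would WARN_ONCE here */
		addr += size;
	}
	return true;
}
```

With the values used in s3c_hsotg_init_fifo() (start at 2048 + 1024 words, 768 words per FIFO, 15 endpoints) and an assumed SPRAM depth of 8064 words, tx_fifos_fit(3072, 768, 15, 8064) returns false, which is the situation the new warning reports.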

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH -tip v2 2/3] ftrace, kprobes: Support IPMODIFY flag to find IP modify conflict

2014-06-23 Thread Masami Hiramatsu

(2014/06/20 11:48), Steven Rostedt wrote:
> On Tue, 17 Jun 2014 11:04:49 +0000
> Masami Hiramatsu  wrote:
> 
>> Introduce FTRACE_OPS_FL_IPMODIFY to avoid conflict among
>> ftrace users who may modify regs->ip to change the execution
>> path. This also adds the flag to kprobe_ftrace_ops, since
>> ftrace-based kprobes already modifies regs->ip. Thus, if
>> another user modifies the regs->ip on the same function entry,
>> one of them will be broken. So both should add IPMODIFY flag
>> and make sure that ftrace_set_filter_ip() succeeds.
>>
>> Note that currently conflicts of IPMODIFY are detected on the
>> filter hash. It does NOT care about the notrace hash. This means
>> that if you set filter hash all functions and notrace(mask)
>> some of them, the IPMODIFY flag will be applied to all
>> functions.
> 
> Hmm, this worries me. I'm not sure I care about ignoring the notrace
> hash.

Since the notrace hash is not updated atomically (it is disabled ->
updated -> enabled), there could be a small window where only filter
hash is enabled. I considered that.


[...]
>> @@ -317,13 +322,14 @@ extern int ftrace_nr_registered_ops(void);
>>   * from tracing that function.
>>   */
>>  enum {
>> +FTRACE_FL_IPMODIFY  = (1UL << 28),
>>  FTRACE_FL_ENABLED   = (1UL << 29),
>>  FTRACE_FL_REGS  = (1UL << 30),
>>  FTRACE_FL_REGS_EN   = (1UL << 31)
>>  };
>>  
>> -#define FTRACE_FL_MASK  (0x7UL << 29)
>> -#define FTRACE_REF_MAX  ((1UL << 29) - 1)
>> +#define FTRACE_FL_MASK  (0xfUL << 28)
>> +#define FTRACE_REF_MAX  ((1UL << 28) - 1)
> 
> Note, this is going to conflict with my queue for 3.17, as I'm starting
> to write individual trampolines.
> 
> You can take a look at my ftrace/next branch, but be warned, it will
> rebase.

OK, anyway, at least this flag can easily be changed. :)


>>  struct dyn_ftrace {
>>  unsigned long   ip; /* address of mcount call-site */
>> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
>> index 3214289..e52d86f 100644
>> --- a/kernel/kprobes.c
>> +++ b/kernel/kprobes.c
>> @@ -915,7 +915,7 @@ static struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
>>  #ifdef CONFIG_KPROBES_ON_FTRACE
>>  static struct ftrace_ops kprobe_ftrace_ops __read_mostly = {
>>  .func = kprobe_ftrace_handler,
>> -.flags = FTRACE_OPS_FL_SAVE_REGS,
>> +.flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_IPMODIFY,
> 
> We probably should comment somewhere that once you set the IPMODIFY
> flag (or do not set it), it should never change. An ops either has it
> or it doesn't, it can't change its mind. Otherwise it could play havoc
> with the update code below.

Agreed. Hmm, it seems that we currently have no document about ftrace_ops.
This is a kind of undocumented kmodule API. At least we need to add a comment
on the header.

[...]
>> @@ -1593,6 +1607,109 @@ static void ftrace_hash_rec_enable(struct ftrace_ops *ops,
>>  __ftrace_hash_rec_update(ops, filter_hash, 1);
>>  }
>>  
>> +/*
>> + * Try to update IPMODIFY flag on each ftrace_rec. Return 0 if it is OK
>> + * or no-needed to update, -EBUSY if it detects a conflict of the flag
>> + * on a ftrace_rec.
>> + * Note that old_hash and new_hash has below meanings
>> + *  - If the hash is NULL, it hits all recs
>> + *  - If the hash is EMPTY_HASH, it hits nothing
>> + *  - Others hits the recs which match the hash entries.
> 
>  - Anything else hits the recs ...

Oh, thanks!

> 
>> + */
>> +static int __ftrace_hash_update_ipmodify(struct ftrace_ops *ops,
>> + struct ftrace_hash *old_hash,
>> + struct ftrace_hash *new_hash)
>> +{
>> +struct ftrace_page *pg;
>> +struct dyn_ftrace *rec, *end = NULL;
>> +int in_old, in_new;
>> +
>> +/* Only update if the ops has been registered */
>> +if (!(ops->flags & FTRACE_OPS_FL_ENABLED))
>> +return 0;
>> +
>> +if (!(ops->flags & FTRACE_OPS_FL_SAVE_REGS) ||
>> +!(ops->flags & FTRACE_OPS_FL_IPMODIFY))
> 
> Only check the IPMODIFY flag. In the future, I may allow this without
> SAVE_REGS. That is, the function will get a regs pointer that has a
> limited number of regs set. Maybe just ip.

Ah, I see. I'll change that.

>> +return 0;
>> +
>> +/* Update rec->flags */
>> +do_for_each_ftrace_rec(pg, rec) {
>> +/* We need to update only differences of filter_hash */
>> +in_old = !old_hash || ftrace_lookup_ip(old_hash, rec->ip);
>> +in_new = !new_hash || ftrace_lookup_ip(new_hash, rec->ip);
>> +if (in_old == in_new)
>> +continue;
>> +
>> +if (in_new) {
>> +/* New entries must ensure no others are using it */
>> +if (rec->flags & FTRACE_FL_IPMODIFY)
>> +goto rollback;
>> +rec->flags |= FTRACE_FL_IPMODIFY;
>> +} else /* Removed entry */
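
The quoted hunk is cut off here in the archive. As a rough, stand-alone sketch of the idea under discussion (not the kernel code): each record carries an IPMODIFY flag, a NULL hash matches every record, and a conflicting update is rolled back. The record array, the boolean "hash" tables, the -1 return value standing in for -EBUSY, and the flag-clearing in the removed-entry branch are all assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define NREC 4
#define FL_IPMODIFY 0x1u

/* A "hash" here is a plain membership table. NULL means "matches every
 * record", following the convention in the quoted comment. */
bool in_hash(const bool *hash, size_t i)
{
	return hash == NULL || hash[i];
}

/* Move one IPMODIFY user from old_hash to new_hash. Returns 0 on success,
 * or -1 (standing in for -EBUSY) if a record is already owned by someone
 * else; in that case every change made so far is undone. */
int update_ipmodify(unsigned int flags[NREC], const bool *old_hash,
		    const bool *new_hash)
{
	size_t i, j;

	for (i = 0; i < NREC; i++) {
		bool in_old = in_hash(old_hash, i);
		bool in_new = in_hash(new_hash, i);

		if (in_old == in_new)
			continue;	/* only differences need updating */
		if (in_new) {
			if (flags[i] & FL_IPMODIFY)
				goto rollback;	/* someone else owns it */
			flags[i] |= FL_IPMODIFY;
		} else {
			flags[i] &= ~FL_IPMODIFY;	/* removed entry */
		}
	}
	return 0;

rollback:
	/* Undo the changes applied to records before the conflict. */
	for (j = 0; j < i; j++) {
		bool in_old = in_hash(old_hash, j);
		bool in_new = in_hash(new_hash, j);

		if (in_old == in_new)
			continue;
		if (in_new)
			flags[j] &= ~FL_IPMODIFY;
		else
			flags[j] |= FL_IPMODIFY;
	}
	return -1;
}
```

Registering a user corresponds to passing an all-false table as old_hash; unregistering passes it as new_hash.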

[PATCH 2/2] perf symbols: Get kernel start address by symbol name

2014-06-23 Thread Jiri Olsa

From: Simon Que 

The function machine__get_kernel_start_addr() was taking the first symbol
of kallsyms as the start address. This is incorrect in certain cases
where the first symbol is something at 0, while the actual kernel
functions begin at a later point (e.g. 0x80200000).

This patch fixes machine__get_kernel_start_addr() to search for the
symbol "_text" or "_stext", which marks the beginning of kernel mapping.
This was already being done in machine__create_kernel_maps(). Thus, this
patch is just a refactor, to move that code into
machine__get_kernel_start_addr().

Signed-off-by: Simon Que 
Link: http://lkml.kernel.org/r/1402943529-13244-1-git-send-email-s...@chromium.org
Signed-off-by: Jiri Olsa 
---
 tools/perf/util/machine.c | 54 +++
 1 file changed, 22 insertions(+), 32 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0e5fea9..c73e1fc 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -496,18 +496,6 @@ struct process_args {
u64 start;
 };
 
-static int symbol__in_kernel(void *arg, const char *name,
-char type __maybe_unused, u64 start)
-{
-   struct process_args *args = arg;
-
-   if (strchr(name, '['))
-   return 0;
-
-   args->start = start;
-   return 1;
-}
-
 static void machine__get_kallsyms_filename(struct machine *machine, char *buf,
   size_t bufsz)
 {
@@ -517,27 +505,41 @@ static void machine__get_kallsyms_filename(struct machine *machine, char *buf,
scnprintf(buf, bufsz, "%s/proc/kallsyms", machine->root_dir);
 }
 
-/* Figure out the start address of kernel map from /proc/kallsyms */
-static u64 machine__get_kernel_start_addr(struct machine *machine)
+const char *ref_reloc_sym_names[] = {"_text", "_stext", NULL};
+
+/* Figure out the start address of kernel map from /proc/kallsyms.
+ * Returns the name of the start symbol in *symbol_name. Pass in NULL as
+ * symbol_name if it's not that important.
+ */
+static u64 machine__get_kernel_start_addr(struct machine *machine,
+ const char **symbol_name)
 {
char filename[PATH_MAX];
-   struct process_args args;
+   int i;
+   const char *name;
+   u64 addr = 0;
 
machine__get_kallsyms_filename(machine, filename, PATH_MAX);
 
if (symbol__restricted_filename(filename, "/proc/kallsyms"))
return 0;
 
-   if (kallsyms__parse(filename, &args, symbol__in_kernel) <= 0)
-   return 0;
+   for (i = 0; (name = ref_reloc_sym_names[i]) != NULL; i++) {
+   addr = kallsyms__get_function_start(filename, name);
+   if (addr)
+   break;
+   }
+
+   if (symbol_name)
+   *symbol_name = name;
 
-   return args.start;
+   return addr;
 }
 
 int __machine__create_kernel_maps(struct machine *machine, struct dso *kernel)
 {
enum map_type type;
-   u64 start = machine__get_kernel_start_addr(machine);
+   u64 start = machine__get_kernel_start_addr(machine, NULL);
 
for (type = 0; type < MAP__NR_TYPES; ++type) {
struct kmap *kmap;
@@ -852,23 +854,11 @@ static int machine__create_modules(struct machine *machine)
return 0;
 }
 
-const char *ref_reloc_sym_names[] = {"_text", "_stext", NULL};
-
 int machine__create_kernel_maps(struct machine *machine)
 {
struct dso *kernel = machine__get_kernel(machine);
-   char filename[PATH_MAX];
const char *name;
-   u64 addr = 0;
-   int i;
-
-   machine__get_kallsyms_filename(machine, filename, PATH_MAX);
-
-   for (i = 0; (name = ref_reloc_sym_names[i]) != NULL; i++) {
-   addr = kallsyms__get_function_start(filename, name);
-   if (addr)
-   break;
-   }
+   u64 addr = machine__get_kernel_start_addr(machine, &name);
if (!addr)
return -1;
 
-- 
1.8.3.1
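
To make the lookup order concrete, here is a small user-space sketch that works on kallsyms-formatted text held in memory. It is not the perf implementation: find_symbol_addr() is a hypothetical stand-in for kallsyms__get_function_start(), and the input format ("<hex address> <type> <name>" per line) is the usual /proc/kallsyms layout.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for kallsyms__get_function_start(): scan the text
 * line by line and return the address of the symbol called `name`, or 0
 * if it is not present. */
uint64_t find_symbol_addr(const char *kallsyms, const char *name)
{
	const char *line = kallsyms;

	while (line && *line) {
		unsigned long long addr;
		char type;
		char sym[128];

		if (sscanf(line, "%llx %c %127s", &addr, &type, sym) == 3 &&
		    strcmp(sym, name) == 0)
			return addr;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline */
	}
	return 0;
}

/* Try "_text" first, then "_stext". The matching name is reported through
 * symbol_name when the caller asks for it (NULL if neither was found). */
uint64_t kernel_start_addr(const char *kallsyms, const char **symbol_name)
{
	static const char *const names[] = { "_text", "_stext", NULL };
	const char *name = NULL;
	uint64_t addr = 0;

	for (int i = 0; (name = names[i]) != NULL; i++) {
		addr = find_symbol_addr(kallsyms, name);
		if (addr)
			break;
	}
	if (symbol_name)
		*symbol_name = name;
	return addr;
}
```

Because the search is by name, a leading symbol at address 0 no longer influences the result, which is the case the commit message describes.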

[PATCH 11/11] usb: dwc2/gadget: disable clock when it's not needed

2014-06-23 Thread Robert Baldyga

When the device is stopped or suspended, the clock is not needed, so we
can disable it during that time.

Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/gadget.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 2a7c014..0523bc3 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -2884,6 +2884,8 @@ static int s3c_hsotg_udc_start(struct usb_gadget *gadget,
hsotg->gadget.dev.of_node = hsotg->dev->of_node;
hsotg->gadget.speed = USB_SPEED_UNKNOWN;
 
+   clk_enable(hsotg->clk);
+
ret = regulator_bulk_enable(ARRAY_SIZE(hsotg->supplies),
hsotg->supplies);
if (ret) {
@@ -2932,6 +2934,8 @@ static int s3c_hsotg_udc_stop(struct usb_gadget *gadget,
 
regulator_bulk_disable(ARRAY_SIZE(hsotg->supplies), hsotg->supplies);
 
+   clk_disable(hsotg->clk);
+
return 0;
 }
 
@@ -2963,8 +2967,10 @@ static int s3c_hsotg_pullup(struct usb_gadget *gadget, int is_on)
spin_lock_irqsave(&hsotg->lock, flags);
if (is_on) {
s3c_hsotg_phy_enable(hsotg);
+   clk_enable(hsotg->clk);
s3c_hsotg_core_init(hsotg);
} else {
+   clk_disable(hsotg->clk);
s3c_hsotg_phy_disable(hsotg);
}
 
@@ -3640,6 +3646,7 @@ static int s3c_hsotg_suspend(struct platform_device *pdev, pm_message_t state)
 
ret = regulator_bulk_disable(ARRAY_SIZE(hsotg->supplies),
 hsotg->supplies);
+   clk_disable(hsotg->clk);
}
 
return ret;
@@ -3654,6 +3661,8 @@ static int s3c_hsotg_resume(struct platform_device *pdev)
if (hsotg->driver) {
dev_info(hsotg->dev, "resuming usb gadget %s\n",
 hsotg->driver->driver.name);
+
+   clk_enable(hsotg->clk);
ret = regulator_bulk_enable(ARRAY_SIZE(hsotg->supplies),
  hsotg->supplies);
}
-- 
1.9.1

[GIT PULL 0/2] perf/urgent fixes

2014-06-23 Thread Jiri Olsa

hi Ingo,
please consider pulling

thanks,
jirka


The following changes since commit cf230918cda19532e4a5cc4f0d5c82fa7e5e94f6:

  Merge branch 'perf/core' into perf/urgent, to pick up the latest fixes (2014-06-14 14:10:08 +0200)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git tags/perf-urgent-for-mingo

for you to fetch changes up to a93f0e551af9e194db38bfe16001e17a3a1d189a:

  perf symbols: Get kernel start address by symbol name (2014-06-20 09:34:22 +0200)


perf/urgent fixes:

. Fix kernel start address lookup in report code (Simon Que)

. Fix segfault in cumulative.callchain report (Jiri Olsa)

Signed-off-by: Jiri Olsa 


Jiri Olsa (1):
  perf tools: Fix segfault in cumulative.callchain report

Simon Que (1):
  perf symbols: Get kernel start address by symbol name

 tools/perf/ui/browsers/hists.c | 21 
 tools/perf/util/machine.c  | 54 +-
 2 files changed, 38 insertions(+), 37 deletions(-)

[PATCH 09/11] usb: dwc2/gadget: delay enabling irq once hardware is configured properly

2014-06-23 Thread Robert Baldyga

From: Marek Szyprowski 

This patch fixes kernel panics, interrupt storms and similar issues that
occur if the bootloader left the s3c-hsotg module enabled. The interrupt
handler is now requested only after the hardware registers have been
configured properly.

Signed-off-by: Marek Szyprowski 
Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/gadget.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index def4900..3435711 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -3459,13 +3459,6 @@ static int s3c_hsotg_probe(struct platform_device *pdev)
 
hsotg->irq = ret;
 
-   ret = devm_request_irq(&pdev->dev, hsotg->irq, s3c_hsotg_irq, 0,
-   dev_name(dev), hsotg);
-   if (ret < 0) {
-   dev_err(dev, "cannot claim IRQ\n");
-   goto err_clk;
-   }
-
dev_info(dev, "regs %p, irq %d\n", hsotg->regs, hsotg->irq);
 
hsotg->gadget.max_speed = USB_SPEED_HIGH;
@@ -3503,6 +3496,17 @@ static int s3c_hsotg_probe(struct platform_device *pdev)
s3c_hsotg_hw_cfg(hsotg);
s3c_hsotg_init(hsotg);
 
+   ret = devm_request_irq(&pdev->dev, hsotg->irq, s3c_hsotg_irq, 0,
+   dev_name(dev), hsotg);
+   if (ret < 0) {
+   s3c_hsotg_phy_disable(hsotg);
+   clk_disable_unprepare(hsotg->clk);
+   regulator_bulk_disable(ARRAY_SIZE(hsotg->supplies),
+  hsotg->supplies);
+   dev_err(dev, "cannot claim IRQ\n");
+   goto err_clk;
+   }
+
/* hsotg->num_of_eps holds number of EPs other than ep0 */
 
if (hsotg->num_of_eps == 0) {
-- 
1.9.1

[PATCH 10/11] usb: dwc2/gadget: assign TX FIFO dynamically

2014-06-23 Thread Robert Baldyga

Because there is not enough memory for every TX FIFO to be at least 3072
bytes (the maximum single packet size), we create four FIFOs of length 1024
bytes and four of length 3072 bytes, and assign them to endpoints dynamically
according to the maxpacket size of the given endpoint.

Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/core.h   |  1 +
 drivers/usb/dwc2/gadget.c | 48 +--
 2 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 067390e..23f7e86 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -197,6 +197,7 @@ struct s3c_hsotg {
int fifo_mem;
unsigned intdedicated_fifos:1;
unsigned char   num_of_eps;
+   u32 fifo_map;
 
struct dentry   *debug_root;
struct dentry   *debug_file;
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 3435711..2a7c014 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -184,14 +184,26 @@ static void s3c_hsotg_init_fifo(struct s3c_hsotg *hsotg)
 
/* start at the end of the GNPTXFSIZ, rounded up */
addr = 2048 + 1024;
-   size = 768;
 
/*
 * currently we allocate TX FIFOs for all possible endpoints,
 * and assume that they are all the same size.
 */
 
-   for (ep = 1; ep <= 15; ep++) {
+   /* 256*4=1024 bytes FIFO length */
+   size = 256;
+   for (ep = 1; ep <= 4; ep++) {
+   val = addr;
+   val |= size << FIFOSIZE_DEPTH_SHIFT;
+   WARN_ONCE(addr + size > hsotg->fifo_mem,
+ "insufficient fifo memory");
+   addr += size;
+
+   writel(val, hsotg->regs + DPTXFSIZN(ep));
+   }
+   /* 768*4=3072 bytes FIFO length */
+   size = 768;
+   for (ep = 5; ep <= 8; ep++) {
val = addr;
val |= size << FIFOSIZE_DEPTH_SHIFT;
WARN_ONCE(addr + size > hsotg->fifo_mem,
@@ -2440,6 +2452,7 @@ static int s3c_hsotg_ep_enable(struct usb_ep *ep,
u32 epctrl;
u32 mps;
int dir_in;
+   int i, val, size;
int ret = 0;
 
dev_dbg(hsotg->dev,
@@ -2512,17 +2525,8 @@ static int s3c_hsotg_ep_enable(struct usb_ep *ep,
break;
 
case USB_ENDPOINT_XFER_INT:
-   if (dir_in) {
-   /*
-* Allocate our TxFNum by simply using the index
-* of the endpoint for the moment. We could do
-* something better if the host indicates how
-* many FIFOs we are expecting to use.
-*/
-
+   if (dir_in)
hs_ep->periodic = 1;
-   epctrl |= DXEPCTL_TXFNUM(index);
-   }
 
epctrl |= DXEPCTL_EPTYPE_INTERRUPT;
break;
@@ -2536,8 +2540,24 @@ static int s3c_hsotg_ep_enable(struct usb_ep *ep,
 * if the hardware has dedicated fifos, we must give each IN EP
 * a unique tx-fifo even if it is non-periodic.
 */
-   if (dir_in && hsotg->dedicated_fifos)
-   epctrl |= DXEPCTL_TXFNUM(index);
+   if (dir_in && hsotg->dedicated_fifos) {
+   size = hs_ep->ep.maxpacket*hs_ep->mc;
+   for (i = 1; i < 8; ++i) {
+   if (hsotg->fifo_map & (1> FIFOSIZE_DEPTH_SHIFT)*4;
+   if (val < size)
+   continue;
+   hsotg->fifo_map |= 1
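
The hunk above is cut off in the archive, so the following is only a sketch of the first-fit idea stated in the commit message, not a reconstruction of the missing lines. The function name, the size table and the -1 error value are assumptions; only the bitmap in fifo_map and the two FIFO sizes come from the patch.

```c
#include <stdint.h>

#define NUM_TX_FIFOS 8

/* Pick the first unused TX FIFO that can hold `needed` bytes, mark it in
 * the bitmap and return its 1-based index. Returns -1 if none fits.
 * fifo_bytes[i - 1] is the size in bytes of FIFO number i. */
int assign_tx_fifo(uint32_t *fifo_map,
		   const uint32_t fifo_bytes[NUM_TX_FIFOS], uint32_t needed)
{
	for (int i = 1; i <= NUM_TX_FIFOS; i++) {
		if (*fifo_map & (1u << i))
			continue;	/* already assigned to an endpoint */
		if (fifo_bytes[i - 1] < needed)
			continue;	/* too small for this endpoint */
		*fifo_map |= 1u << i;
		return i;
	}
	return -1;
}
```

With four 1024-byte FIFOs followed by four 3072-byte FIFOs, an endpoint that needs 512 bytes gets FIFO 1, while one that needs 3072 bytes skips the small FIFOs and gets FIFO 5. Releasing a FIFO when the endpoint is disabled would clear the corresponding bit again.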

[PATCH 08/11] usb: dwc2/gadget: do not call disconnect method in pullup

2014-06-23 Thread Robert Baldyga

From: Marek Szyprowski 

Calling the disconnect method from pullup can lead to spinlock recursion in
the composite framework. Other UDC drivers also don't call it directly from
their pullup methods.

Signed-off-by: Marek Szyprowski 
Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/gadget.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 2220882..def4900 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -2945,7 +2945,6 @@ static int s3c_hsotg_pullup(struct usb_gadget *gadget, int is_on)
s3c_hsotg_phy_enable(hsotg);
s3c_hsotg_core_init(hsotg);
} else {
-   s3c_hsotg_disconnect(hsotg);
s3c_hsotg_phy_disable(hsotg);
}
 
-- 
1.9.1

[PATCH 07/11] usb: dwc2/gadget: break infinite loop in endpoint disable code

2014-06-23 Thread Robert Baldyga

From: Marek Szyprowski 

This patch fixes possible freeze caused by infinite loop in interrupt
context.

Signed-off-by: Marek Szyprowski 
Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/gadget.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 21d21de..2220882 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -1652,6 +1652,7 @@ static void s3c_hsotg_txfifo_flush(struct s3c_hsotg *hsotg, unsigned int idx)
dev_err(hsotg->dev,
"%s: timeout flushing fifo (GRSTCTL=%08x)\n",
__func__, val);
+   break;
}
 
udelay(1);
-- 
1.9.1
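
The one-line fix is easier to see in a stand-alone polling loop. The sketch below is an assumption-laden model rather than driver code: the register is replaced by a callback, the bit value is illustrative, and the two fake registers exist only so the loop can be exercised without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define TXFFLSH 0x20u	/* illustrative "flush in progress" bit */

/* Hypothetical register read callback. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Poll until the flush bit clears. After max_polls reads the loop gives
 * up and returns false; without leaving the loop at that point it would
 * spin forever on hardware that never clears the bit. */
bool wait_flush_done(read_reg_fn read_reg, void *ctx, int max_polls)
{
	int timeout = max_polls;

	while (1) {
		uint32_t val = read_reg(ctx);

		if ((val & TXFFLSH) == 0)
			return true;
		if (--timeout == 0)
			return false;	/* equivalent of the added break */
	}
}

/* Fake register that never clears the bit. */
uint32_t stuck_reg(void *ctx)
{
	(void)ctx;
	return TXFFLSH;
}

/* Fake register that clears the bit after *ctx further reads. */
uint32_t countdown_reg(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return TXFFLSH;
	}
	return 0;
}
```

Calling wait_flush_done(stuck_reg, NULL, 100) returns false after 100 reads instead of hanging, which matters most when the caller runs in interrupt context.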

[PATCH 1/2] perf tools: Fix segfault in cumulative.callchain report

2014-06-23 Thread Jiri Olsa

When cumulative callchain mode is on, we could get samples with
no actual hits. This breaks the assumption of the annotation
code that each sample has annotation counts allocated, and leads
to a segfault.

Fix this by adding checks for annotation stats.

Acked-by: Namhyung Kim 
Acked-by: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: Corey Ashford 
Cc: David Ahern 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
Cc: Namhyung Kim 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1402821332-12419-1-git-send-email-jo...@kernel.org
Signed-off-by: Jiri Olsa 
---
 tools/perf/ui/browsers/hists.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 52c03fb..04a229a 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -17,6 +17,7 @@
 #include "../util.h"
 #include "../ui.h"
 #include "map.h"
+#include "annotate.h"
 
 struct hist_browser {
struct ui_browser   b;
@@ -1593,13 +1594,18 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
 bi->to.sym->name) > 0)
annotate_t = nr_options++;
} else {
-
if (browser->selection != NULL &&
browser->selection->sym != NULL &&
-   !browser->selection->map->dso->annotate_warned &&
-   asprintf(&options[nr_options], "Annotate %s",
-browser->selection->sym->name) > 0)
-   annotate = nr_options++;
+   !browser->selection->map->dso->annotate_warned) {
+   struct annotation *notes;
+
+   notes = symbol__annotation(browser->selection->sym);
+
+   if (notes->src &&
+   asprintf(&options[nr_options], "Annotate %s",
+browser->selection->sym->name) > 0)
+   annotate = nr_options++;
+   }
}
 
if (thread != NULL &&
@@ -1656,6 +1662,7 @@ retry_popup_menu:
 
if (choice == annotate || choice == annotate_t || choice == annotate_f) {
struct hist_entry *he;
+   struct annotation *notes;
int err;
 do_annotate:
if (!objdump_path && perf_session_env__lookup_objdump(env))
@@ -1679,6 +1686,10 @@ do_annotate:
he->ms.map = he->branch_info->to.map;
}
 
+   notes = symbol__annotation(he->ms.sym);
+   if (!notes->src)
+   continue;
+
/*
 * Don't let this be freed, say, by hists__decay_entry.
 */
-- 
1.8.3.1

Re: [PATCH 1/2] SCHED: remove proliferation of wait_on_bit action functions.

2014-06-23 Thread David Howells


Acked-by: David Howells  (fscache, keys)

Re: [PATCH] ARM: dts: exynos5410: Fill in CPU clock-frequency

2014-06-23 Thread Tarek Dakhran


On 06/22/2014 11:49 PM, Andreas Färber wrote:

It's 1.6 GHz for the Cortex-A15.

Avoids warnings like "/cpus/cpu@0 missing clock-frequency property".

Signed-off-by: Andreas Färber 
---
  arch/arm/boot/dts/exynos5410.dtsi | 4 
  1 file changed, 4 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5410.dtsi b/arch/arm/boot/dts/exynos5410.dtsi
index 3839c26..9d0b8cc 100644
--- a/arch/arm/boot/dts/exynos5410.dtsi
+++ b/arch/arm/boot/dts/exynos5410.dtsi
@@ -28,24 +28,28 @@
device_type = "cpu";
compatible = "arm,cortex-a15";
reg = <0x0>;
+   clock-frequency = <1600000000>;
  


Reviewed-by: Tarek Dakhran

--
Best regards,
Tarek Dakhran

[PATCH 05/11] usb: dwc2/gadget: hide some not really needed debug messages

2014-06-23 Thread Robert Baldyga

From: Marek Szyprowski 

Some DWC2/s3c-hsotg debug messages are of no use to a typical user,
so hide them behind dev_dbg().

Signed-off-by: Marek Szyprowski 
Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/gadget.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 35b4890..95b6dcb 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -2568,7 +2568,7 @@ static int s3c_hsotg_ep_disable(struct usb_ep *ep)
u32 epctrl_reg;
u32 ctrl;
 
-   dev_info(hsotg->dev, "%s(ep %p)\n", __func__, ep);
+   dev_dbg(hsotg->dev, "%s(ep %p)\n", __func__, ep);
 
if (ep == &hsotg->eps[0].ep) {
dev_err(hsotg->dev, "%s: called for ep0\n", __func__);
@@ -2626,7 +2626,7 @@ static int s3c_hsotg_ep_dequeue(struct usb_ep *ep, struct usb_request *req)
struct s3c_hsotg *hs = hs_ep->parent;
unsigned long flags;
 
-   dev_info(hs->dev, "ep_dequeue(%p,%p)\n", ep, req);
+   dev_dbg(hs->dev, "ep_dequeue(%p,%p)\n", ep, req);
 
spin_lock_irqsave(&hs->lock, flags);
 
-- 
1.9.1

[PATCH 04/11] usb: dwc2/gadget: Fix comment text

2014-06-23 Thread Robert Baldyga

From: Andrzej Pietrasiewicz 

Adjust the debug text to the name of the printed variable.

Signed-off-by: Andrzej Pietrasiewicz 
Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/gadget.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index fc27b4c..35b4890 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -2935,7 +2935,7 @@ static int s3c_hsotg_pullup(struct usb_gadget *gadget, int is_on)
struct s3c_hsotg *hsotg = to_hsotg(gadget);
unsigned long flags = 0;
 
-   dev_dbg(hsotg->dev, "%s: is_in: %d\n", __func__, is_on);
+   dev_dbg(hsotg->dev, "%s: is_on: %d\n", __func__, is_on);
 
spin_lock_irqsave(&hsotg->lock, flags);
if (is_on) {
-- 
1.9.1

[PATCH 02/11] usb: dwc2/gadget: fix phy initialization sequence

2014-06-23 Thread Robert Baldyga

From: Kamil Debski 

In the Generic PHY Framework a NULL phy is considered to be a valid phy
thus the "if (hsotg->phy)" check does not give us the information whether
the Generic PHY Framework is used.

In addition to the above this patch also removes phy_init from probe and
phy_exit from remove. This is not necessary when init/exit is done in the
s3c_hsotg_phy_enable/disable functions.

Signed-off-by: Kamil Debski 
Signed-off-by: Marek Szyprowski 
Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/gadget.c | 27 ---
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index ccef3a7..70eab95 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -2748,13 +2748,14 @@ static void s3c_hsotg_phy_enable(struct s3c_hsotg *hsotg)
 
dev_dbg(hsotg->dev, "pdev 0x%p\n", pdev);
 
-   if (hsotg->phy) {
-   phy_init(hsotg->phy);
-   phy_power_on(hsotg->phy);
-   } else if (hsotg->uphy)
+   if (hsotg->uphy)
usb_phy_init(hsotg->uphy);
-   else if (hsotg->plat->phy_init)
+   else if (hsotg->plat && hsotg->plat->phy_init)
hsotg->plat->phy_init(pdev, hsotg->plat->phy_type);
+   else {
+   phy_init(hsotg->phy);
+   phy_power_on(hsotg->phy);
+   }
 }
 
 /**
@@ -2768,13 +2769,14 @@ static void s3c_hsotg_phy_disable(struct s3c_hsotg *hsotg)
 {
struct platform_device *pdev = to_platform_device(hsotg->dev);
 
-   if (hsotg->phy) {
-   phy_power_off(hsotg->phy);
-   phy_exit(hsotg->phy);
-   } else if (hsotg->uphy)
+   if (hsotg->uphy)
usb_phy_shutdown(hsotg->uphy);
-   else if (hsotg->plat->phy_exit)
+   else if (hsotg->plat && hsotg->plat->phy_exit)
hsotg->plat->phy_exit(pdev, hsotg->plat->phy_type);
+   else {
+   phy_power_off(hsotg->phy);
+   phy_exit(hsotg->phy);
+   }
 }
 
 /**
@@ -3489,9 +3491,6 @@ static int s3c_hsotg_probe(struct platform_device *pdev)
if (hsotg->phy && (phy_get_bus_width(phy) == 8))
hsotg->phyif = GUSBCFG_PHYIF8;
 
-   if (hsotg->phy)
-   phy_init(hsotg->phy);
-
/* usb phy enable */
s3c_hsotg_phy_enable(hsotg);
 
@@ -3584,8 +3583,6 @@ static int s3c_hsotg_remove(struct platform_device *pdev)
usb_gadget_unregister_driver(hsotg->driver);
}
 
-   if (hsotg->phy)
-   phy_exit(hsotg->phy);
clk_disable_unprepare(hsotg->clk);
 
return 0;
-- 
1.9.1
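
The reordering in s3c_hsotg_phy_enable/disable can be summarised as a small decision function. This is a sketch with made-up types (struct ctrl, struct plat_data, enum phy_path), assuming only what the commit message states: a NULL generic PHY is valid, so the generic path has to be the fallback rather than the first test.

```c
#include <stddef.h>

enum phy_path { PHY_PATH_USB_PHY, PHY_PATH_PLATFORM, PHY_PATH_GENERIC };

/* Hypothetical, cut-down versions of the driver structures. */
struct plat_data {
	void (*phy_init)(void);
};

struct ctrl {
	void *phy;		/* generic PHY handle; NULL is still valid */
	void *uphy;		/* old style USB PHY, NULL if unused */
	struct plat_data *plat;	/* platform data, NULL if unused */
};

/* Choose the PHY path in the order used after the patch: old style USB
 * PHY first, then platform callbacks, and the generic PHY framework as
 * the fallback, without testing c->phy for NULL. */
enum phy_path select_phy_path(const struct ctrl *c)
{
	if (c->uphy)
		return PHY_PATH_USB_PHY;
	if (c->plat && c->plat->phy_init)
		return PHY_PATH_PLATFORM;
	return PHY_PATH_GENERIC;
}

/* Placeholder callback used only to populate plat_data in examples. */
void dummy_phy_init(void)
{
}
```

A controller with neither uphy nor platform callbacks ends up on the generic path even when its phy pointer is NULL, which the old "if (hsotg->phy)" test could not express.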

[PATCH 03/11] usb: dwc2/gadget: move phy bus legth initialization

2014-06-23 Thread Robert Baldyga

From: Kamil Debski 

This patch moves the part of code that initializes the PHY bus width.
This results in simpler code and removes the need to check whether
the Generic PHY Framework is used.

Signed-off-by: Kamil Debski 
Signed-off-by: Marek Szyprowski 
Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/gadget.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 70eab95..fc27b4c 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -3395,6 +3395,9 @@ static int s3c_hsotg_probe(struct platform_device *pdev)
return -ENOMEM;
}
 
+   /* Set default UTMI width */
+   hsotg->phyif = GUSBCFG_PHYIF16;
+
/*
 * Attempt to find a generic PHY, then look for an old style
 * USB PHY, finally fall back to pdata
@@ -3413,8 +3416,15 @@ static int s3c_hsotg_probe(struct platform_device *pdev)
hsotg->plat = plat;
} else
hsotg->uphy = uphy;
-   } else
+   } else {
hsotg->phy = phy;
+   /*
+* If using the generic PHY framework, check if the PHY bus
+* width is 8-bit and set the phyif appropriately.
+*/
+   if (phy_get_bus_width(phy) == 8)
+   hsotg->phyif = GUSBCFG_PHYIF8;
+   }
 
hsotg->dev = dev;
 
@@ -3481,16 +3491,6 @@ static int s3c_hsotg_probe(struct platform_device *pdev)
goto err_supplies;
}
 
-   /* Set default UTMI width */
-   hsotg->phyif = GUSBCFG_PHYIF16;
-
-   /*
-* If using the generic PHY framework, check if the PHY bus
-* width is 8-bit and set the phyif appropriately.
-*/
-   if (hsotg->phy && (phy_get_bus_width(phy) == 8))
-   hsotg->phyif = GUSBCFG_PHYIF8;
-
/* usb phy enable */
s3c_hsotg_phy_enable(hsotg);
 
-- 
1.9.1
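
The ordering this patch establishes (set the default first, then let a generic PHY override it) can be sketched in plain userspace C. This is only an illustration: demo_phyif, demo_select_phyif() and demo_phyif_usage() are hypothetical names, and the bus width is passed in directly instead of being queried from the PHY framework.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GUSBCFG_PHYIF16 and GUSBCFG_PHYIF8. */
enum demo_phyif {
	DEMO_PHYIF16,
	DEMO_PHYIF8,
};

/*
 * Start from the 16-bit UTMI default and override it only when a
 * generic PHY was found and it reports an 8-bit bus.
 */
static enum demo_phyif demo_select_phyif(bool have_generic_phy, int bus_width)
{
	enum demo_phyif phyif = DEMO_PHYIF16;

	if (have_generic_phy && bus_width == 8)
		phyif = DEMO_PHYIF8;

	return phyif;
}

/* Usage: only a generic PHY with an 8-bit bus changes the default. */
static void demo_phyif_usage(void)
{
	assert(demo_select_phyif(false, 0) == DEMO_PHYIF16);
	assert(demo_select_phyif(true, 16) == DEMO_PHYIF16);
	assert(demo_select_phyif(true, 8) == DEMO_PHYIF8);
}
```

Because the default is assigned before the PHY lookup, the override can live in the branch that already knows a generic PHY exists, which is why the separate hsotg->phy check is no longer needed.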


[PATCH 00/11] usb: dwc2/gadget: fix series

2014-06-23 Thread Robert Baldyga

Hello,

This patchset contains fixes for the dwc2 gadget driver. It touches the PHY,
FIFO configuration and initialization sequence, and adds many other small fixes.

Best regards
Robert Baldyga
Samsung R&D Institute Poland

Andrzej Pietrasiewicz (1):
  usb: dwc2/gadget: Fix comment text

Kamil Debski (3):
  usb: dwc2/gadget: fix phy disable sequence
  usb: dwc2/gadget: fix phy initialization sequence
  usb: dwc2/gadget: move phy bus length initialization

Marek Szyprowski (5):
  usb: dwc2/gadget: hide some not really needed debug messages
  usb: dwc2/gadget: ensure that all fifos have correct memory buffers
  usb: dwc2/gadget: break infinite loop in endpoint disable code
  usb: dwc2/gadget: do not call disconnect method in pullup
  usb: dwc2/gadget: delay enabling irq once hardware is configured
properly

Robert Baldyga (2):
  usb: dwc2/gadget: assign TX FIFO dynamically
  usb: dwc2/gadget: disable clock when it's not needed

 drivers/usb/dwc2/core.h   |   2 +
 drivers/usb/dwc2/gadget.c | 150 --
 2 files changed, 93 insertions(+), 59 deletions(-)

-- 
1.9.1


[PATCH 01/11] usb: dwc2/gadget: fix phy disable sequence

2014-06-23 Thread Robert Baldyga

From: Kamil Debski 

When the driver is removed, s3c_hsotg_phy_disable is called three times
instead of once. This drops the phy reference counter below zero, so
subsequent insertions of the module fail.

This patch removes calls to s3c_hsotg_phy_disable from s3c_hsotg_remove
and s3c_hsotg_udc_stop.

s3c_hsotg_udc_stop is called from udc-core.c only after
usb_gadget_disconnect, which in turn calls s3c_hsotg_pullup, which
already calls s3c_hsotg_phy_disable.

s3c_hsotg_remove must be called only after udc_stop, so there is no
point in disabling phy once again there.

Signed-off-by: Kamil Debski 
Signed-off-by: Marek Szyprowski 
Signed-off-by: Robert Baldyga 
---
 drivers/usb/dwc2/gadget.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index f3c56a2..ccef3a7 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -2898,8 +2898,6 @@ static int s3c_hsotg_udc_stop(struct usb_gadget *gadget,
 
spin_lock_irqsave(&hsotg->lock, flags);
 
-   s3c_hsotg_phy_disable(hsotg);
-
if (!driver)
hsotg->driver = NULL;
 
@@ -3586,7 +3584,6 @@ static int s3c_hsotg_remove(struct platform_device *pdev)
usb_gadget_unregister_driver(hsotg->driver);
}
 
-   s3c_hsotg_phy_disable(hsotg);
if (hsotg->phy)
phy_exit(hsotg->phy);
clk_disable_unprepare(hsotg->clk);
-- 
1.9.1
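
The failure mode described in this patch can be reproduced with a toy reference counter. The sketch below is a userspace model under stated assumptions: struct demo_phy and the demo_* helpers are hypothetical and only mimic the counting behaviour, not the real PHY framework.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PHY whose power state is reference counted. */
struct demo_phy {
	int power_count;
};

static void demo_phy_enable(struct demo_phy *phy)
{
	phy->power_count++;
}

static void demo_phy_disable(struct demo_phy *phy)
{
	phy->power_count--;
}

/* The PHY counts as powered only while the counter is positive. */
static bool demo_phy_is_powered(const struct demo_phy *phy)
{
	return phy->power_count > 0;
}

/* One load/unload cycle: a single enable followed by n_disable disables. */
static int demo_unload_cycle(struct demo_phy *phy, int n_disable)
{
	demo_phy_enable(phy);
	for (int i = 0; i < n_disable; i++)
		demo_phy_disable(phy);
	return phy->power_count;
}

/* Usage: three disables leave the counter at -2, so the next enable fails. */
static void demo_phy_usage(void)
{
	struct demo_phy broken = { .power_count = 0 };
	struct demo_phy fixed = { .power_count = 0 };

	assert(demo_unload_cycle(&broken, 3) == -2);
	demo_phy_enable(&broken);
	assert(!demo_phy_is_powered(&broken));

	assert(demo_unload_cycle(&fixed, 1) == 0);
	demo_phy_enable(&fixed);
	assert(demo_phy_is_powered(&fixed));
}
```

With exactly one disable per enable the counter returns to zero, and a later module insertion powers the PHY as expected.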


Re: [patch 1/4] mm: vmscan: remove remains of kswapd-managed zone->all_unreclaimable

2014-06-23 Thread Michal Hocko

On Fri 20-06-14 12:33:47, Johannes Weiner wrote:
> shrink_zones() has a special branch to skip the all_unreclaimable()
> check during hibernation, because a frozen kswapd can't mark a zone
> unreclaimable.
> 
> But ever since 6e543d5780e3 ("mm: vmscan: fix do_try_to_free_pages()
> livelock"), determining a zone to be unreclaimable is done by directly
> looking at its scan history and no longer relies on kswapd setting the
> per-zone flag.
> 
> Remove this branch and let shrink_zones() check the reclaimability of
> the target zones regardless of hibernation state.
> 
> Signed-off-by: Johannes Weiner 

This code is really tricky :/

But the patch looks good to me.
Acked-by: Michal Hocko 

> ---
>  mm/vmscan.c | 8 
>  1 file changed, 8 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 0f16ffe8eb67..19b5b8016209 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2534,14 +2534,6 @@ out:
>   if (sc->nr_reclaimed)
>   return sc->nr_reclaimed;
>  
> - /*
> -  * As hibernation is going on, kswapd is freezed so that it can't mark
> -  * the zone into all_unreclaimable. Thus bypassing all_unreclaimable
> -  * check.
> -  */
> - if (oom_killer_disabled)
> - return 0;
> -
>   /* Aborted reclaim to try compaction? don't OOM, then */
>   if (aborted_reclaim)
>   return 1;
> -- 
> 2.0.0
> 

-- 
Michal Hocko
SUSE Labs

[regression] fix 32-bit breakage in block device read(2) (was Re: 32-bit bug in iovec iterator changes)

2014-06-23 Thread Al Viro

On Sun, Jun 22, 2014 at 07:50:07AM -0400, Theodore Ts'o wrote:
> On Sun, Jun 22, 2014 at 02:00:32AM +0100, Al Viro wrote:
> > 
> > PS: I agree that it's worth careful commenting, obviously, but
> > before sending it to Linus (*with* comments) I want to get a
> > confirmation that this one-liner actually fixes what Ted is seeing.
> > I have reproduced it here, and that change makes the breakage go
> > away in my testing, but I'd like to make sure that we are seeing the
> > same thing.  Ted?
> 
> Yep, that fixes things.  Thanks for explaining what was going on!

OK, here it is, hopefully with sufficient comments:

blkdev_read_iter() wants to cap the iov_iter by the amount of
data remaining to the end of device.  That's what iov_iter_truncate()
is for (trim iter->count if it's above the given limit).  So far,
so good, but the argument of iov_iter_truncate() is size_t, so on
32bit boxen (in case of a large device) we end up with that upper
limit truncated down to 32 bits *before* comparing it with iter->count.

Easily fixed by making iov_iter_truncate() take a 64bit argument -
it does the right thing after such a change (we only reach the
assignment in there when the current value of iter->count is greater
than the limit, so for any limit that would get truncated by the
assignment we don't reach it at all) and that argument is not the new
value of iter->count - it's an upper limit for such.

The overhead of passing u64 is not an issue - the thing is inlined,
so callers passing size_t won't pay any penalty.

Reported-by: Theodore Tso 
Tested-by: Theodore Tso 
Signed-off-by: Al Viro 
---

diff --git a/include/linux/uio.h b/include/linux/uio.h
index ddfdb53..17ae7e3 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -94,8 +94,20 @@ static inline size_t iov_iter_count(struct iov_iter *i)
return i->count;
 }

-static inline void iov_iter_truncate(struct iov_iter *i, size_t count)
+/*
+ * Cap the iov_iter by given limit; note that the second argument is
+ * *not* the new size - it's upper limit for such.  Passing it a value
+ * greater than the amount of data in iov_iter is fine - it'll just do
+ * nothing in that case.
+ */
+static inline void iov_iter_truncate(struct iov_iter *i, u64 count)
 {
+   /*
+* count doesn't have to fit in size_t - comparison extends both
+* operands to u64 here and any value that would be truncated by
+* conversion in assignement is by definition greater than all
+* values of size_t, including old i->count.
+*/
if (i->count > count)
i->count = count;
 }
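
The truncation described above can be reproduced in userspace by modelling a 32-bit size_t with uint32_t. This is a minimal sketch, not the kernel code: demo_size_t, struct demo_iter and the demo_truncate_* helpers are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* uint32_t stands in for size_t on a 32-bit machine. */
typedef uint32_t demo_size_t;

struct demo_iter {
	demo_size_t count;
};

/* Old behaviour: the limit is narrowed to 32 bits before the comparison. */
static void demo_truncate_narrow(struct demo_iter *i, demo_size_t limit)
{
	if (i->count > limit)
		i->count = limit;
}

/*
 * Fixed behaviour: the limit stays 64-bit, the comparison promotes
 * i->count, and the assignment is only reached when limit < i->count,
 * so the narrowing cast cannot lose bits.
 */
static void demo_truncate_wide(struct demo_iter *i, uint64_t limit)
{
	if (i->count > limit)
		i->count = (demo_size_t)limit;
}

/* Usage: 4 GiB + 512 bytes remain on the device, the read asks for 4096. */
static void demo_truncate_usage(void)
{
	const uint64_t remaining = (1ULL << 32) + 512;
	struct demo_iter narrow = { .count = 4096 };
	struct demo_iter wide = { .count = 4096 };

	demo_truncate_narrow(&narrow, (demo_size_t)remaining);
	demo_truncate_wide(&wide, remaining);

	assert(narrow.count == 512);  /* wrongly shortened */
	assert(wide.count == 4096);   /* left alone, as intended */
}
```

The narrow variant sees only the low 32 bits of the limit (512) and shortens a perfectly valid 4096-byte read, which matches the breakage reported on 32-bit boxes with large devices.
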

Re: [PATCH 1/2] perf/x86: update Haswell PEBS event constraints

2014-06-23 Thread Peter Zijlstra

On Thu, Jun 19, 2014 at 01:40:41PM -0700, Andi Kleen wrote:
> On Thu, Jun 19, 2014 at 10:31:29PM +0200, Stephane Eranian wrote:
> > On Thu, Jun 19, 2014 at 10:18 PM, Andi Kleen  wrote:
> > >> I don't quite understand that.
> > >> You need to know which events support PEBS. You need a table
> > >
> > > We're talking about the kernel allowing things here.
> > > Yes the user still needs to know what supports PEBS, but
> > > that doesn't concern the kernel.
> > >
> > Just need to make sure you don't return bogus information.
> 
> GIGO. We only need to prevent security issues.

> Yes if the user specifies a bogus raw event it will not count.
> That's fine. The important part is just that nothing ever crashes.

Right. But IIRC you were previously arguing that we can in fact crash
the machine with raw PEBS events, as illustrated with the SNB PEBS
cycles 'event'.

Which is where my strict_pebs patch came from; by default only allow the
sanitized known-safe list of events, but allow the system administrator
to disable that test and allow any PEBS event.
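
A rough sketch of that kind of gate, modelled on the draft intel_pmu_hw_config() check posted elsewhere in this thread, is shown here as plain C. The DEMO_* bit positions and the demo_check_precise() helper are assumptions made for the illustration and are not taken from the kernel headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative event-select modifier bits; the positions are assumptions. */
#define DEMO_EVENTSEL_EDGE  (1ULL << 18)
#define DEMO_EVENTSEL_ANY   (1ULL << 21)
#define DEMO_EVENTSEL_INV   (1ULL << 23)
#define DEMO_EVENTSEL_CMASK 0xFF000000ULL

#define DEMO_STRICT_PEBS \
	(DEMO_EVENTSEL_ANY | DEMO_EVENTSEL_CMASK | \
	 DEMO_EVENTSEL_EDGE | DEMO_EVENTSEL_INV)

/*
 * Reject a precise event that sets any modifier bit unless strict mode
 * has been switched off and the caller is privileged.
 */
static int demo_check_precise(uint64_t config, bool strict_pebs, bool is_admin)
{
	if ((config & DEMO_STRICT_PEBS) && (strict_pebs || !is_admin))
		return -EINVAL;
	return 0;
}

/* Usage: plain events always pass; modified ones need strict off + admin. */
static void demo_pebs_usage(void)
{
	const uint64_t plain = 0x00c0;
	const uint64_t inverted = 0x00c0 | DEMO_EVENTSEL_INV;

	assert(demo_check_precise(plain, true, false) == 0);
	assert(demo_check_precise(inverted, true, true) == -EINVAL);
	assert(demo_check_precise(inverted, false, false) == -EINVAL);
	assert(demo_check_precise(inverted, false, true) == 0);
}
```

The default stays restrictive, and the escape hatch requires both an explicit administrator decision and a privileged caller.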


[PATCH 0/2] ARM: tegra: roth: pinmux fixes

2014-06-23 Thread Alexandre Courbot

Two small but important fixes to SHIELD's pinmux configuration.
The use of invalid properties caused the pinmux to not be applied
at all. Also, the setting for the sdmmc clock lines resulted in random
errors or even made it impossible to probe attached devices.

Alexandre Courbot (2):
  ARM: tegra: roth: fix unsupported pinmux properties
  ARM: tegra: roth: enable input on mmc clock pins

 arch/arm/boot/dts/tegra114-roth.dts | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

-- 
2.0.0


[PATCH 1/2] ARM: tegra: roth: fix unsupported pinmux properties

2014-06-23 Thread Alexandre Courbot

The pinmux subsystem complained that the nvidia,low-power-mode property
is not supported by the sdio1, sdio3 and gma drive groups. In addition
gma also does not support nvidia,drive-type. Remove these properties so
the pinmux configuration can properly be applied.

Signed-off-by: Alexandre Courbot 
---
 arch/arm/boot/dts/tegra114-roth.dts | 4 
 1 file changed, 4 deletions(-)

diff --git a/arch/arm/boot/dts/tegra114-roth.dts b/arch/arm/boot/dts/tegra114-roth.dts
index 0b0e8e07d965..a67885250f81 100644
--- a/arch/arm/boot/dts/tegra114-roth.dts
+++ b/arch/arm/boot/dts/tegra114-roth.dts
@@ -730,7 +730,6 @@
nvidia,pins = "drive_sdio1";
nvidia,high-speed-mode = ;
nvidia,schmitt = ;
-   nvidia,low-power-mode = 
;
nvidia,pull-down-strength = <36>;
nvidia,pull-up-strength = <20>;
nvidia,slew-rate-rising = 
;
@@ -740,7 +739,6 @@
nvidia,pins = "drive_sdio3";
nvidia,high-speed-mode = ;
nvidia,schmitt = ;
-   nvidia,low-power-mode = 
;
nvidia,pull-down-strength = <36>;
nvidia,pull-up-strength = <20>;
nvidia,slew-rate-rising = 
;
@@ -750,12 +748,10 @@
nvidia,pins = "drive_gma";
nvidia,high-speed-mode = ;
nvidia,schmitt = ;
-   nvidia,low-power-mode = 
;
nvidia,pull-down-strength = <2>;
nvidia,pull-up-strength = <2>;
nvidia,slew-rate-rising = 
;
nvidia,slew-rate-falling = 
;
-   nvidia,drive-type = <1>;
};
};
};
-- 
2.0.0


[PATCH 2/2] ARM: tegra: roth: enable input on mmc clock pins

2014-06-23 Thread Alexandre Courbot

Input had been disabled by mistake on these pins, leading to issues with
SDIO devices like the Wifi module not being probed or random errors
occurring on the SD card.

Signed-off-by: Alexandre Courbot 
---
 arch/arm/boot/dts/tegra114-roth.dts | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/boot/dts/tegra114-roth.dts b/arch/arm/boot/dts/tegra114-roth.dts
index a67885250f81..ba210c6e189f 100644
--- a/arch/arm/boot/dts/tegra114-roth.dts
+++ b/arch/arm/boot/dts/tegra114-roth.dts
@@ -244,7 +244,7 @@
nvidia,function = "sdmmc1";
nvidia,pull = ;
nvidia,tristate = ;
-   nvidia,enable-input = ;
+   nvidia,enable-input = ;
};
sdmmc1_cmd_pz1 {
nvidia,pins = "sdmmc1_cmd_pz1",
@@ -262,7 +262,7 @@
nvidia,function = "sdmmc3";
nvidia,pull = ;
nvidia,tristate = ;
-   nvidia,enable-input = ;
+   nvidia,enable-input = ;
};
sdmmc3_cmd_pa7 {
nvidia,pins = "sdmmc3_cmd_pa7",
@@ -290,7 +290,7 @@
nvidia,function = "sdmmc4";
nvidia,pull = ;
nvidia,tristate = ;
-   nvidia,enable-input = ;
+   nvidia,enable-input = ;
};
sdmmc4_cmd_pt7 {
nvidia,pins = "sdmmc4_cmd_pt7",
-- 
2.0.0


Re: [patch 2/4] mm: vmscan: rework compaction-ready signaling in direct reclaim

2014-06-23 Thread Michal Hocko

On Fri 20-06-14 16:24:49, Johannes Weiner wrote:
[...]
> From cd48b73fdca9e23aa21f65e9af1f850dbac5ab8e Mon Sep 17 00:00:00 2001
> From: Johannes Weiner 
> Date: Wed, 11 Jun 2014 12:53:59 -0400
> Subject: [patch] mm: vmscan: rework compaction-ready signaling in direct
>  reclaim
> 
> Page reclaim for a higher-order page runs until compaction is ready,
> then aborts and signals this situation through the return value of
> shrink_zones().  This is an oddly specific signal to encode in the
> return value of shrink_zones(), though, and can be quite confusing.
> 
> Introduce sc->compaction_ready and signal the compactability of the
> zones out-of-band to free up the return value of shrink_zones() for
> actual zone reclaimability.
> 
> Signed-off-by: Johannes Weiner 
> Acked-by: Vlastimil Babka 

Very nice. It will help me to get rid of the additional hacks for the
min_limit for memcg. Thanks!

One question below

[...]
> @@ -2500,12 +2492,15 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>   vmpressure_prio(sc->gfp_mask, sc->target_mem_cgroup,
>   sc->priority);
>   sc->nr_scanned = 0;
> - aborted_reclaim = shrink_zones(zonelist, sc);
> + shrink_zones(zonelist, sc);
>  
>   total_scanned += sc->nr_scanned;
>   if (sc->nr_reclaimed >= sc->nr_to_reclaim)
>   goto out;
>  
> + if (sc->compaction_ready)
> + goto out;
> +
>   /*
>* If we're getting trouble reclaiming, start doing
>* writepage even in laptop mode.
> @@ -2526,7 +2521,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>   WB_REASON_TRY_TO_FREE_PAGES);
>   sc->may_writepage = 1;
>   }
> - } while (--sc->priority >= 0 && !aborted_reclaim);
> + } while (--sc->priority >= 0);
>  
>  out:
>   delayacct_freepages_end();

It is not entirely clear to me why we do not need to check and wake up
flusher threads anymore?

-- 
Michal Hocko
SUSE Labs

Re: [PATCH V3 16/16] irqchip: crossbar: allow for quirky hardware with direct hardwiring of GIC

2014-06-23 Thread Sricharan R

Hi Jason,

On Saturday 21 June 2014 08:27 AM, Jason Cooper wrote:
> On Mon, Jun 16, 2014 at 04:53:16PM +0530, Sricharan R wrote:
>> From: Nishanth Menon 
>>
>> On certain platforms such as DRA7, SPIs 0, 1, 2, 3, 5, 6, 10, 131,
>> 132, 133 are direct wired to hardware blocks bypassing crossbar.
>> This quirky implementation is *NOT* supposed to be the expectation
>> of crossbar hardware usage. However, these are already marked in our
>> description of the hardware with SKIP and RESERVED where appropriate.
>>
>> Unfortunately, we need to be able to refer to these hardwired IRQs.
>> So, to request these, crossbar driver can use the existing information
>> from it's table that these SKIP/RESERVED maps are direct wired sources
>> and generic allocation/programming of crossbar should be avoided.
>>
>> Signed-off-by: Nishanth Menon 
>> Signed-off-by: Sricharan R 
>> ---
>>  .../devicetree/bindings/arm/omap/crossbar.txt  |   12 ++--
>>  drivers/irqchip/irq-crossbar.c |   20 ++--
>>  2 files changed, 28 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/arm/omap/crossbar.txt b/Documentation/devicetree/bindings/arm/omap/crossbar.txt
>> index 8210ea4..438ccab 100644
>> --- a/Documentation/devicetree/bindings/arm/omap/crossbar.txt
>> +++ b/Documentation/devicetree/bindings/arm/omap/crossbar.txt
>> @@ -42,8 +42,10 @@ Documentation/devicetree/bindings/arm/gic.txt for further details.
>>  
>>  An interrupt consumer on an SoC using crossbar will use:
>>  interrupts = 
>> -request number shall be between 0 to that described by
>> -"ti,max-crossbar-sources"
>> +When the request number is between 0 to that described by
>> +"ti,max-crossbar-sources", it is assumed to be a crossbar mapping. If the
>> +request_number is greater than "ti,max-crossbar-sources", then it is mapped as a
>> +quirky hardware mapping direct to GIC.
>>  
>>  Example:
>>  device_x@0x4a023000 {
>> @@ -51,3 +53,9 @@ Example:
>>  interrupts = ;
>>  ...
>>  };
>> +
>> +device_y@0x4a033000 {
>> +/* Direct mapped GIC SPI 1 used */
>> +interrupts = ;
> 
> Ideally, I'd like to see a macro here so that it's clear that we crossed
> a magic threshold. eg:
> 
> #define MAX_SOURCES 400
> #define DIRECT_IRQ(irq) (MAX_SOURCES + irq)
> ...
>   interrupts = ;
> 
> and, then:
> 
>   ti,max-crossbar-sources = ;
> 
 OK, that is better for readability. I will add that macro then.

Regards,
 Sricharan
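
The threshold macro suggested in this exchange can be sketched in C as follows. The value 400 is taken from the example in the review; the DEMO_ prefix, demo_is_direct(), demo_direct_spi() and the choice to treat a request equal to the maximum as direct are assumptions for illustration, not the binding's definition.

```c
#include <assert.h>
#include <stdbool.h>

/* Example threshold from the review; real values come from the device tree. */
#define DEMO_MAX_SOURCES     400
#define DEMO_DIRECT_IRQ(irq) (DEMO_MAX_SOURCES + (irq))

/* Requests at or above the threshold bypass the crossbar (assumed boundary). */
static bool demo_is_direct(unsigned int request)
{
	return request >= DEMO_MAX_SOURCES;
}

/* Recover the hardwired GIC SPI number from a direct request. */
static unsigned int demo_direct_spi(unsigned int request)
{
	assert(demo_is_direct(request));
	return request - DEMO_MAX_SOURCES;
}

/* Usage: a normal crossbar source versus a direct-mapped SPI 1. */
static void demo_crossbar_usage(void)
{
	assert(!demo_is_direct(10));
	assert(demo_is_direct(DEMO_DIRECT_IRQ(1)));
	assert(demo_direct_spi(DEMO_DIRECT_IRQ(1)) == 1);
}
```

Spelling the offset as a macro makes it visible at the use site that the number crosses the magic threshold, which is the readability point raised in the review.
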

Re: [PATCH V3 03/16] irqchip: crossbar: introduce ti,irqs-skip to skip

2014-06-23 Thread Sricharan R

On Saturday 21 June 2014 08:03 AM, Jason Cooper wrote:
> Sricharan,
> 
> Your subject line seems truncated:
> 
>   "irqchip: crossbar: introduce ti,irqs-skip to skip"
> 
> maybe "... Introduce DT property to skip hardwired irqs" ?
> 
> Also note that you need to correct the subject line for *every* patch in
> the series wrt capitalization.
> 
> I don't mind correcting it when I apply it, provided that:
> 
Ha, I think this got truncated unintentionally. Sorry, I will fix this.
>  - the patch is otherwise ready
>  - I only have to do it once or twice for the series
>  - I never had a chance to ask since you created a rockstar patch series
>the first time out of the gate (except for capitalization).
> 
> Once I've looked over the whole series, please resend with the subject
> lines corrected.
> 
OK. I will wait for your comments on the rest of the patches and
resend with the capitalization fix mentioned above.

> On Mon, Jun 16, 2014 at 04:53:03PM +0530, Sricharan R wrote:
>> From: Nishanth Menon 
>>
>> When, in the system due to varied reasons, interrupts might be unusable
>> due to hardware behavior, but register maps do exist, then those interrupts
>> should be skipped while mapping irq to crossbars.
>>
>> Signed-off-by: Nishanth Menon 
>> Signed-off-by: Sricharan R 
>> ---
>> [V3] introduced ti,irqs-skip dt property to list the
>>  irqs to be skipped.
>>
>>  .../devicetree/bindings/arm/omap/crossbar.txt  |4 
>>  drivers/irqchip/irq-crossbar.c |   20
>>  2 files changed, 24 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/arm/omap/crossbar.txt b/Documentation/devicetree/bindings/arm/omap/crossbar.txt
>> index fb88585..cfcbd52 100644
>> --- a/Documentation/devicetree/bindings/arm/omap/crossbar.txt
>> +++ b/Documentation/devicetree/bindings/arm/omap/crossbar.txt
>> @@ -17,6 +17,10 @@ Required properties:
>>   so crossbar bar driver should not consider them as free
>>   lines.
>>  
>> +Optional properties:
>> +- ti,irqs-skip: This is similar to "ti,irqs-reserved", but are irq mappings
>> +  which are not supposed to be used for errata or other reasons(virtualization).
> 
> I would specifically mention SoC-specific hard-wiring of irqs here.
> Also the fact that the hardwiring unexpectedly bypasses the crossbar.
OK, that will be easier to understand; I will add that.
> 
>> +
>>  Examples:
>>  crossbar_mpu: @4a02 {
>>  compatible = "ti,irq-crossbar";
> 
> Please include a ti,irqs-skip example here.
> 
ok.
>> diff --git a/drivers/irqchip/irq-crossbar.c b/drivers/irqchip/irq-crossbar.c
>> index 51d4b87..27049de 100644
>> --- a/drivers/irqchip/irq-crossbar.c
>> +++ b/drivers/irqchip/irq-crossbar.c
>> @@ -18,6 +18,7 @@
>>  
>>  #define IRQ_FREE-1
>>  #define IRQ_RESERVED-2
>> +#define IRQ_SKIP-3
>>  #define GIC_IRQ_START   32
>>  
>>  /*
>> @@ -160,6 +161,25 @@ static int __init crossbar_of_init(struct device_node *node)
>>  }
>>  }
>>  
>> +/* Skip the ones marked as skip */
> 
> This comment is redundant, perhaps "Skip irqs hardwired to bypass the
> crossbar."?
ok, will change this.

Regards,
 Sricharan

Re: [PATCH v6 1/9] efi: Use early_mem() instead of early_io()

2014-06-23 Thread Jan Beulich

>>> On 20.06.14 at 23:29,  wrote:
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -298,7 +298,7 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
>   if (table64 >> 32) {
>   pr_cont("\n");
>   pr_err("Table located above 4GB, disabling 
> EFI.\n");
> - early_iounmap(config_tables,
> + early_memunmap(config_tables,
>  efi.systab->nr_tables * sz);
>   return -EINVAL;
>   }
> @@ -314,7 +314,7 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
>   tablep += sz;
>   }
>   pr_cont("\n");
> - early_iounmap(config_tables, efi.systab->nr_tables * sz);
> + early_memunmap(config_tables, efi.systab->nr_tables * sz);
>  
>   set_bit(EFI_CONFIG_TABLES, );
>  

If these two changes are really deemed necessary (there's the
implied assumption currently in place that early_iounmap() can
undo early_memremap() mappings), then ia64 will need a
definition added for early_memunmap() or its build will break.

Jan


Re: [PATCH 1/2] perf/x86: update Haswell PEBS event constraints

2014-06-23 Thread Peter Zijlstra

On Thu, Jun 19, 2014 at 11:00:28AM -0700, Andi Kleen wrote:
> However these days I'm actually thinking of just getting
> rid of the detailed table except for PREC_DIST. All the PEBS
> controls should be noops if the event does not support PEBS

I had something like the below stuck on the 'look more at later' list
and later never really ever happened.

---
 arch/x86/kernel/cpu/perf_event.c  |  3 +++
 arch/x86/kernel/cpu/perf_event.h  |  1 +
 arch/x86/kernel/cpu/perf_event_intel.c| 19 ++--
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 38 +--
 4 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index ae407f7226c8..f42405c9868b 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1541,6 +1541,7 @@ static int __init init_hw_perf_events(void)
pr_cont("%s PMU driver.\n", x86_pmu.name);
 
x86_pmu.attr_rdpmc = 1; /* enable userspace RDPMC usage by default */
+   x86_pmu.attr_strict_pebs = 1;
 
for (quirk = x86_pmu.quirks; quirk; quirk = quirk->next)
quirk->func();
@@ -1855,9 +1856,11 @@ static ssize_t set_attr_rdpmc(struct device *cdev,
 }
 
 static DEVICE_ATTR(rdpmc, S_IRUSR | S_IWUSR, get_attr_rdpmc, set_attr_rdpmc);
+static DEVICE_BOOL_ATTR(strict_pebs, S_IRUSR | S_IWUSR, x86_pmu.attr_strict_pebs);
 
 static struct attribute *x86_pmu_attrs[] = {
&dev_attr_rdpmc.attr,
+   &dev_attr_strict_pebs.attr.attr,
NULL,
 };
 
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 3b2f9bdd974b..a11eeab9b611 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -413,6 +413,7 @@ struct x86_pmu {
 */
int attr_rdpmc_broken;
int attr_rdpmc;
+   boolattr_strict_pebs;
struct attribute **format_attrs;
struct attribute **event_attrs;
 
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index aa333d966886..6e68f3dc9a30 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1729,6 +1729,12 @@ static void intel_pebs_aliases_snb(struct perf_event *event)
}
 }
 
+#define ARCH_PERFMON_STRICT_PEBS   \
+   (ARCH_PERFMON_EVENTSEL_ANY   |  \
+ARCH_PERFMON_EVENTSEL_CMASK |  \
+ARCH_PERFMON_EVENTSEL_EDGE  |  \
+ARCH_PERFMON_EVENTSEL_INV)
+
 static int intel_pmu_hw_config(struct perf_event *event)
 {
int ret = x86_pmu_hw_config(event);
@@ -1736,8 +1742,17 @@ static int intel_pmu_hw_config(struct perf_event *event)
if (ret)
return ret;
 
-   if (event->attr.precise_ip && x86_pmu.pebs_aliases)
-   x86_pmu.pebs_aliases(event);
+   if (event->attr.precise_ip) {
+   if ((event->attr.config & INTEL_ARCH_EVENT_MASK) == 0x)
+   return -EINVAL;
+
+   if ((event->attr.config & ARCH_PERFMON_STRICT_PEBS) &&
+   (x86_pmu.attr_strict_pebs || !capable(CAP_SYS_ADMIN)))
+   return -EINVAL;
+
+   if (x86_pmu.pebs_aliases)
+   x86_pmu.pebs_aliases(event);
+   }
 
if (intel_pmu_needs_lbr_smpl(event)) {
ret = intel_pmu_setup_lbr_filter(event);
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index ae96cfa5eddd..36b1f2afa61c 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -540,6 +540,7 @@ struct event_constraint intel_core2_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* BR_INST_RETIRED.MISPRED */
INTEL_UEVENT_CONSTRAINT(0x1fc7, 0x1), /* SIMD_INST_RETURED.ANY */
INTEL_EVENT_CONSTRAINT(0xcb, 0x1),/* MEM_LOAD_RETIRED.* */
+   INTEL_UEVENT_CONSTRAINT(0x, 0x1), /* generic PEBS mask */
EVENT_CONSTRAINT_END
 };
 
@@ -547,6 +548,7 @@ struct event_constraint intel_atom_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x00c0, 0x1), /* INST_RETIRED.ANY */
INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* MISPREDICTED_BRANCH_RETIRED */
INTEL_EVENT_CONSTRAINT(0xcb, 0x1),/* MEM_LOAD_RETIRED.* */
+   INTEL_UEVENT_CONSTRAINT(0x, 0x1), /* generic PEBS mask */
EVENT_CONSTRAINT_END
 };
 
@@ -573,6 +575,7 @@ struct event_constraint intel_slm_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0xf7c5, 0x1), /* BR_INST_MISP_RETIRED.RETURN_PS */
INTEL_UEVENT_CONSTRAINT(0xfbc5, 0x1), /* BR_INST_MISP_RETIRED.IND_CALL_PS */
INTEL_UEVENT_CONSTRAINT(0xfec5, 0x1), /* BR_INST_MISP_RETIRED.TAKEN_JCC_PS */
+   INTEL_UEVENT_CONSTRAINT(0x, 0x1), /* generic PEBS mask */
EVENT_CONSTRAINT_END
 };
 
@@ -588,6 +591,7 @@ struct event_constraint

Re: [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()

2014-06-23 Thread Vineet Gupta

Hi Peter,

On Monday 23 June 2014 12:24 PM, Peter Zijlstra wrote:
> On Fri, Jun 20, 2014 at 11:21:13AM -0700, Davidlohr Bueso wrote:
>> diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
>> index d99f9b3..8e1bf6b 100644
>> --- a/arch/arc/include/asm/processor.h
>> +++ b/arch/arc/include/asm/processor.h
>> @@ -62,6 +62,8 @@ unsigned long thread_saved_pc(struct task_struct *t);
>>  #define cpu_relax() do { } while (0)
>>  #endif
>>  
>> +#define arch_cpu_relax() cpu_relax()
>> +
>>  #define copy_segments(tsk, mm)  do { } while (0)
>>  #define release_segments(mm)do { } while (0)
> 
> I'm not at all sure that cpu_relax() definition ARC has is valid. We
> rely on cpu_relax() being at least a barrier() all over the place, and
> it doesn't need to be SMP only. You can have a UP wait loop waiting for
> an interrupt for example.
> 
> Vineet?

Over the years we've not had any trouble with !SMP cpu_relax() being a no-op
(the barrier version was only required when we hit a hard hang in our initial
SMP code). UP busy-wait looping would be frowned upon in general.

However, what we have now is just a code optimization quirk for !SMP, since a
compiler barrier will cause gcc to dump out and reload scratch regs, especially
with our deep reg file.

Here's what I get with the current UP kernel when switching to a compiler barrier:

./scripts/bloat-o-meter vmlinux-pre-cpu-relax  vmlinux | head
add/remove: 1/0 grow/shrink: 75/5 up/down: 1218/-32 (1186)
function                                     old     new   delta
path_init                                    708     826    +118
sys_semtimedop                              2540    2640    +100
...
__slab_alloc.isra.constprop                  564     560      -4
deactivate_slab                              886     878      -8

So it doesn't look too bad, although I've not run any performance tests. We can
switch UP to barrier if you feel it is needed semantically.

-Vineet
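
The semantic point under discussion, that a wait loop needs at least a compiler barrier so the polled value is reloaded, can be sketched in userspace C. This is an illustration under stated assumptions: demo_barrier() uses the GCC/Clang empty-asm idiom, and demo_wait_for_flag() with its spin limit is a hypothetical helper, not the ARC cpu_relax().

```c
#include <assert.h>

/* Compiler-only barrier (GCC/Clang idiom): no instruction is emitted. */
#define demo_barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Poll *flag for at most max_spins iterations. The barrier tells the
 * compiler that memory may have changed, so *flag is re-read on every
 * pass instead of being cached in a register.
 */
static int demo_wait_for_flag(const int *flag, int max_spins)
{
	int spins = 0;

	while (!*flag && spins < max_spins) {
		demo_barrier();
		spins++;
	}
	return spins;
}

/* Usage: a set flag returns at once; a clear one spins to the limit. */
static void demo_relax_usage(void)
{
	int set = 1;
	int clear = 0;

	assert(demo_wait_for_flag(&set, 10) == 0);
	assert(demo_wait_for_flag(&clear, 10) == 10);
}
```

The cost of the barrier is the register spilling and reloading around it, which is what the bloat-o-meter numbers in this message measure.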



[PATCH v2] DRBG: simplify ordering of linked list in drbg_ctr_df

2014-06-23 Thread Stephan Mueller

This patch supersedes the initial fix submitted in [1].
Careful analysis of the code shows that the anticipated NULL pointer
dereference is already prevented by drbg_ctr_update, which only invokes
drbg_ctr_df when addtl is not NULL.

This patch is tested with CAVS testing and the test set provided in [2].

[1] https://lkml.org/lkml/2014/6/21/70
[2] http://www.chronox.de/drbg.html

Stephan Mueller (1):
  DRBG: simplify ordering of linked list in drbg_ctr_df

 crypto/drbg.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

-- 
1.9.3



Re: [PATCH 1/2] perf/x86: update Haswell PEBS event constraints

2014-06-23 Thread Peter Zijlstra

On Thu, Jun 19, 2014 at 05:58:28PM +0200, Stephane Eranian wrote:
> + INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
> + /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
> + INTEL_EVENT_CONSTRAINT(0xd2, 0xf),
> + /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
> + INTEL_EVENT_CONSTRAINT(0xd3, 0xf),
>   INTEL_UEVENT_CONSTRAINT(0x04c8, 0xf), /* HLE_RETIRED.Abort */
>   INTEL_UEVENT_CONSTRAINT(0x04c9, 0xf), /* RTM_RETIRED.Abort */

Please keep the event and comment on the same line, screw 80 chars, this
interleaved stuff is impossible to read.

[PATCH v2] DRBG: simplify ordering of linked list in drbg_ctr_df

2014-06-23 Thread Stephan Mueller

As reported by a static code analyzer, the code for the ordering of
the linked list can be simplified.

Reported-by: kbuild test robot 
Signed-off-by: Stephan Mueller 
---
 crypto/drbg.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/crypto/drbg.c b/crypto/drbg.c
index faaa2ce..99fa8f8 100644
--- a/crypto/drbg.c
+++ b/crypto/drbg.c
@@ -516,13 +516,13 @@ static int drbg_ctr_df(struct drbg_state *drbg,
S2.next = addtl;
 
/*
-* splice in addtl between S2 and S4 -- we place S4 at the end of the
-* input data chain
+* Splice in addtl between S2 and S4 -- we place S4 at the end
+* of the input data chain. As this code is only triggered when
+* addtl is not NULL, no NULL checks are necessary.
 */
tempstr = addtl;
-   for (; NULL != tempstr; tempstr = tempstr->next)
-   if (NULL == tempstr->next)
-   break;
+   while (tempstr->next)
+   tempstr = tempstr->next;
tempstr->next = &S4;
 
/* 10.4.2 step 9 */
-- 
1.9.3
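
For illustration, a minimal userspace sketch of the simplified tail walk
follows. The names chain_node and chain_append are hypothetical stand-ins
and are not part of crypto/drbg.c; only the loop shape is taken from the
patch. The assert documents the precondition the patch relies on, namely
that the head of the chain is never NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's string-chain element. */
struct chain_node {
	int id;
	struct chain_node *next;
};

/*
 * Append 'tail' after the last element of the chain starting at 'head'.
 * 'head' must not be NULL, so the cursor itself never needs a NULL check;
 * only cur->next is tested.
 */
static void chain_append(struct chain_node *head, struct chain_node *tail)
{
	struct chain_node *cur = head;

	assert(head != NULL);
	while (cur->next)
		cur = cur->next;
	cur->next = tail;
}
```

Called with a two-element chain, the function leaves the existing links
untouched and hooks the new element onto the last one.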



Re: scsi-mq

2014-06-23 Thread Christoph Hellwig

On Sat, Jun 21, 2014 at 12:52:22AM +, Elliott, Robert (Server Storage) wrote:
> Some of those context switches might be from scsi_end_request(), 
> which always schedules the scsi_requeue_run_queue() function via the
> requeue_work workqueue for scsi-mq.  That causes lots of context 
> switches from a busy application thread (e.g., fio) to a 
> kworker thread.

Jens has been prodding me to fix this up, I'll send a patch soon that avoids
the workqueue unless we need to kick other queues and thus walk the list
of LUNs.


Re: [PATCH 10/14] scsi: only maintain target_blocked if the driver has a target queue limit

2014-06-23 Thread Christoph Hellwig

On Sat, Jun 21, 2014 at 10:10:14PM +, Elliott, Robert (Server Storage) wrote:
> >   not_ready:
> > /*
> >  * lock q, handle tag, requeue req, and decrement device_busy. We
> 
> There's an extra & in that if statement.

Indeed, this crept in during a rebase and a later patch fixes it.  I'll make
sure this one works on its own.


Re: [PATCH v3] lockdep: restrict the use of recursive read_lock with qrwlock

2014-06-23 Thread Peter Zijlstra

On Fri, Jun 20, 2014 at 03:22:46PM -0400, Waiman Long wrote:
> v2->v3:
>  - Add a new read mode (3) for rwlock (used in
>lock_acquire_shared_cond_recursive()) to avoid conflict with other
>use cases of lock_acquire_shared_recursive().
> 
> v1->v2:
>  - Use less conditional & make it easier to read
> 
> Unlike the original unfair rwlock implementation, queued rwlock
> will grant lock according to the chronological sequence of the lock
> requests except when the lock requester is in the interrupt context.
> As a result, recursive read_lock calls will hang the process if there
> is a write_lock call somewhere in between the read_lock calls.
> 
> This patch updates the lockdep implementation to look for recursive
> read_lock calls when queued rwlock is being used.
> 
> Signed-off-by: Waiman Long 

So this Changelog really won't do. This vn->vn+1 nonsense should not be
part of the Changelog proper.

Also, you failed to mention what prompted you to write this patch; did
you find an offending site that now triggers a lockdep warning?

You also fail to mention that the new read state fits, but exhausts, the
storage in held_lock::read.

> ---
>  2 files changed, 19 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
> index 008388f..0a53d88 100644
> --- a/include/linux/lockdep.h
> +++ b/include/linux/lockdep.h
> @@ -481,13 +481,15 @@ static inline void print_irqtrace_events(struct task_struct *curr)
>  #define lock_acquire_exclusive(l, s, t, n, i) lock_acquire(l, s, t, 0, 1, n, i)
>  #define lock_acquire_shared(l, s, t, n, i) lock_acquire(l, s, t, 1, 1, n, i)
>  #define lock_acquire_shared_recursive(l, s, t, n, i) lock_acquire(l, s, t, 2, 1, n, i)
> +#define lock_acquire_shared_cond_recursive(l, s, t, n, i) \
> + lock_acquire(l, s, t, 3, 1, n, i)
>  #define spin_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i)
>  #define spin_acquire_nest(l, s, t, n, i) lock_acquire_exclusive(l, s, t, n, i)
>  #define spin_release(l, n, i) lock_release(l, n, i)
>  
>  #define rwlock_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i)
> -#define rwlock_acquire_read(l, s, t, i) lock_acquire_shared_recursive(l, s, t, NULL, i)
> +#define rwlock_acquire_read(l, s, t, i) lock_acquire_shared_cond_recursive(l, s, t, NULL, i)

Yeah, no. Only the qrwlock has the new cond_recursive thing.

>  #define rwlock_release(l, n, i)  lock_release(l, n, i)
>  
>  #define seqcount_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i)
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index d24e433..7d90ebc 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -67,6 +67,16 @@ module_param(lock_stat, int, 0644);
>  #define lock_stat 0
>  #endif
>  
> +#ifdef CONFIG_QUEUE_RWLOCK
> +/*
> +* Queue rwlock only allows read-after-read recursion of the same lock class
> +* when the latter read is in an interrupt context.
> +*/
> +#define allow_recursive_read in_interrupt()
> +#else
> +#define allow_recursive_read true
> +#endif

That #ifdef is entirely inappropriate, the lockdep implementation should
not depend on this. Furthermore you now added a new read state with
variable semantics, that's crap.

>  /*
>   * lockdep_lock: protects the lockdep graph, the hashes and the
>   *   class/list/hash allocators.
> @@ -1774,6 +1784,12 @@ check_deadlock(struct task_struct *curr, struct 
> held_lock *next,
>   return 2;
>  
>   /*
> +  * Conditionally recursive read-lock check
> +  */
> + if ((read == 3) && prev->read && allow_recursive_read)
> + return 2;
> +
> + /*
>* We're holding the nest_lock, which serializes this lock's
>* nesting behaviour.
>*/
> -- 
> 1.7.1
> 
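
To make the read states under discussion concrete, here is a simplified
userspace model of the same-class recursion check in the quoted
check_deadlock() hunk. It is a sketch only: recursion_ok and the enum names
are hypothetical, in_irq stands in for in_interrupt(), and the assert
assumes a two-bit read field, which is consistent with the remark above
that the new state fits but exhausts held_lock::read.

```c
#include <assert.h>
#include <stdbool.h>

/* Read modes as passed in the quoted lock_acquire() wrappers. */
enum read_mode {
	READ_NONE = 0,            /* exclusive (write) acquisition */
	READ_SHARED = 1,          /* non-recursive read */
	READ_RECURSIVE = 2,       /* recursive read, always allowed */
	READ_COND_RECURSIVE = 3   /* proposed: recursive only in interrupt context */
};

/*
 * Returns true if acquiring in mode 'next' while the same lock class is
 * already held in mode 'prev' is treated as harmless recursion.
 */
static bool recursion_ok(enum read_mode prev, enum read_mode next, bool in_irq)
{
	/* Assumption: the mode is stored in two bits, so 3 is the last value. */
	assert((int)next <= (int)READ_COND_RECURSIVE);

	if (next == READ_RECURSIVE && prev != READ_NONE)
		return true;
	if (next == READ_COND_RECURSIVE && prev != READ_NONE && in_irq)
		return true;
	return false;
}
```

The model shows the point being objected to: the answer for mode 3 depends
on the calling context rather than on the mode alone.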

Re: [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()

2014-06-23 Thread Peter Zijlstra

On Fri, Jun 20, 2014 at 11:21:13AM -0700, Davidlohr Bueso wrote:
> From: Davidlohr Bueso 
> 
> The arch_mutex_cpu_relax() function, introduced by 34b133f, is
> hacky and ugly. It was added a few years ago to address the fact
> that common cpu_relax() calls include yielding on s390, and thus
> impact the optimistic spinning functionality of mutexes. Nowadays
> we use this function well beyond mutexes: rwsem, qrwlock, mcs and
> lockref. Since the macro that defines the call is in the mutex header,
> any users must include mutex.h and the naming is misleading as well.
> 
> This patch (i) renames the call to arch_cpu_relax (for lack of a better
> name), and (ii) defines it in each arch's asm/processor.h local header,
> just like for regular cpu_relax() functions. On all archs, except s390,
> arch_cpu_relax is simply cpu_relax, and thus we can take it out of
> mutex.h. While this can seem redundant or weird, I believe it is a
> good choice as it allows us to move out arch specific logic from generic
> locking primitives and enables future(?) archs to transparently define
> it, similarly to System Z.
> 
> Please note that these changes are only tested on x86-64.

While I like the general idea; does anyone have a better name for this?
So in particular, the difference is that on s390:

 cpu_relax()- yields the vcpu
 arch_{,mutex_}cpu_relax()  - will actually spin-wait
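
A minimal userspace sketch of the distinction, assuming GCC/Clang inline
asm syntax. The macros below are stand-ins, not the kernel definitions, and
spin_on_owner is a hypothetical helper; the point is only that the spinning
variant must stay on the CPU and keep re-reading the condition.

```c
#include <assert.h>
#include <stddef.h>

/* Compiler barrier: forces the loop condition to be re-read from memory. */
#define barrier()		__asm__ __volatile__("" ::: "memory")

/* Stand-ins: on everything but s390 the two would be the same thing. */
#define cpu_relax()		barrier()
#define arch_cpu_relax()	cpu_relax()

/*
 * Spin until *owner becomes 0 or 'budget' iterations have passed.
 * Returns the number of iterations spent spinning.
 */
static unsigned long spin_on_owner(const int *owner, unsigned long budget)
{
	unsigned long spins = 0;

	assert(owner != NULL);
	while (*owner && spins < budget) {
		arch_cpu_relax();	/* spin-wait, never yield the vcpu */
		spins++;
	}
	return spins;
}
```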



Re: [PATCH v3 04/13] mm, compaction: move pageblock checks up from isolate_migratepages_range()

2014-06-23 Thread Zhang Yanfei

On 06/20/2014 11:49 PM, Vlastimil Babka wrote:
> isolate_migratepages_range() is the main function of the compaction scanner,
> called either on a single pageblock by isolate_migratepages() during regular
> compaction, or on an arbitrary range by CMA's __alloc_contig_migrate_range().
> It currently performs two pageblock-wide compaction suitability checks, and
> because of the CMA callpath, it tracks if it crossed a pageblock boundary in
> order to repeat those checks.
> 
> However, closer inspection shows that those checks are always true for CMA:
> - isolation_suitable() is true because CMA sets cc->ignore_skip_hint to true
> - migrate_async_suitable() check is skipped because CMA uses sync compaction
> 
> We can therefore move the checks to isolate_migratepages(), reducing variables
> and simplifying isolate_migratepages_range(). The update_pageblock_skip()
> function also no longer needs set_unsuitable parameter.
> 
> Furthermore, going back to compact_zone() and compact_finished() when
> pageblock is unsuitable is wasteful - the checks are meant to skip pageblocks
> quickly. The patch therefore also introduces a simple loop into
> isolate_migratepages() so that it does not return immediately on pageblock
> checks, but keeps going until isolate_migratepages_range() gets called once.
> Similarly to isolate_freepages(), the function periodically checks if it
> needs to reschedule or abort async compaction.
> 
> Signed-off-by: Vlastimil Babka 
> Cc: Minchan Kim 
> Cc: Mel Gorman 
> Cc: Joonsoo Kim 
> Cc: Michal Nazarewicz 
> Cc: Naoya Horiguchi 
> Cc: Christoph Lameter 
> Cc: Rik van Riel 
> Cc: David Rientjes 

I think this is a good clean-up to make code more clear.

Reviewed-by: Zhang Yanfei 

Only a tiny nit-pick below.

> ---
>  mm/compaction.c | 112 
> +---
>  1 file changed, 59 insertions(+), 53 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 3064a7f..ebe30c9 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -132,7 +132,7 @@ void reset_isolation_suitable(pg_data_t *pgdat)
>   */
>  static void update_pageblock_skip(struct compact_control *cc,
>   struct page *page, unsigned long nr_isolated,
> - bool set_unsuitable, bool migrate_scanner)
> + bool migrate_scanner)
>  {
>   struct zone *zone = cc->zone;
>   unsigned long pfn;
> @@ -146,12 +146,7 @@ static void update_pageblock_skip(struct compact_control 
> *cc,
>   if (nr_isolated)
>   return;
>  
> - /*
> -  * Only skip pageblocks when all forms of compaction will be known to
> -  * fail in the near future.
> -  */
> - if (set_unsuitable)
> - set_pageblock_skip(page);
> + set_pageblock_skip(page);
>  
>   pfn = page_to_pfn(page);
>  
> @@ -180,7 +175,7 @@ static inline bool isolation_suitable(struct 
> compact_control *cc,
>  
>  static void update_pageblock_skip(struct compact_control *cc,
>   struct page *page, unsigned long nr_isolated,
> - bool set_unsuitable, bool migrate_scanner)
> + bool migrate_scanner)
>  {
>  }
>  #endif /* CONFIG_COMPACTION */
> @@ -345,8 +340,7 @@ isolate_fail:
>  
>   /* Update the pageblock-skip if the whole pageblock was scanned */
>   if (blockpfn == end_pfn)
> - update_pageblock_skip(cc, valid_page, total_isolated, true,
> -   false);
> + update_pageblock_skip(cc, valid_page, total_isolated, false);
>  
>   count_compact_events(COMPACTFREE_SCANNED, nr_scanned);
>   if (total_isolated)
> @@ -474,14 +468,12 @@ unsigned long
>  isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
>   unsigned long low_pfn, unsigned long end_pfn, bool unevictable)
>  {
> - unsigned long last_pageblock_nr = 0, pageblock_nr;
>   unsigned long nr_scanned = 0, nr_isolated = 0;
>   struct list_head *migratelist = &cc->migratepages;
>   struct lruvec *lruvec;
>   unsigned long flags;
>   bool locked = false;
>   struct page *page = NULL, *valid_page = NULL;
> - bool set_unsuitable = true;
>   const isolate_mode_t mode = (cc->mode == MIGRATE_ASYNC ?
>   ISOLATE_ASYNC_MIGRATE : 0) |
>   (unevictable ? ISOLATE_UNEVICTABLE : 0);
> @@ -545,28 +537,6 @@ isolate_migratepages_range(struct zone *zone, struct 
> compact_control *cc,
>   if (!valid_page)
>   valid_page = page;
>  
> - /* If isolation recently failed, do not retry */
> - pageblock_nr = low_pfn >> pageblock_order;
> - if (last_pageblock_nr != pageblock_nr) {
> - int mt;
> -
> - last_pageblock_nr = pageblock_nr;
> - if (!isolation_suitable(cc, page))
> - goto

[PATCH] crypto:caam - Configuration for platforms with virtualization enabled in CAAM

2014-06-23 Thread Ruchika Gupta

For platforms with virtualization enabled:

1. The job ring registers can be written to only if the job ring has been
   started, i.e. the STARTR bit in the JRSTART register is 1.

2. For DECOs under direct software control with virtualization enabled, the
   PL, BMT, ICID and SDID values need to be provided. These are provided by
   selecting a job ring in start mode whose parameters are used for the
   DECO access programming.

Signed-off-by: Ruchika Gupta 
---
The current patch uses the 32-bit register comp_parms_ms defined in another
patch. The patch this one depends on is:
 crypto: caam - Correct definition of registers in memory map
 (https://lkml.org/lkml/2014/6/23/3)

 drivers/crypto/caam/ctrl.c   | 39 +++
 drivers/crypto/caam/intern.h |  1 +
 drivers/crypto/caam/regs.h   | 18 --
 3 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index 066a4d4..7acaaa4 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -88,6 +88,14 @@ static inline int run_descriptor_deco0(struct device 
*ctrldev, u32 *desc,
 
/* Set the bit to request direct access to DECO0 */
topregs = (struct caam_full __iomem *)ctrlpriv->ctrl;
+
+   if (ctrlpriv->virt_en == 1)
+   setbits32(>ctrl.deco_rsr, DECORSR_JR0);
+
+   while (!(rd_reg32(>ctrl.deco_rsr) & DECORSR_VALID) &&
+  --timeout)
+   cpu_relax();
+
setbits32(>ctrl.deco_rq, DECORR_RQD0ENABLE);
 
while (!(rd_reg32(>ctrl.deco_rq) & DECORR_DEN0) &&
@@ -130,6 +138,9 @@ static inline int run_descriptor_deco0(struct device 
*ctrldev, u32 *desc,
*status = rd_reg32(>deco.op_status_hi) &
  DECO_OP_STATUS_HI_ERR_MASK;
 
+   if (ctrlpriv->virt_en == 1)
+   clrbits32(>ctrl.deco_rsr, DECORSR_JR0);
+
/* Mark the DECO as free */
clrbits32(>ctrl.deco_rq, DECORR_RQD0ENABLE);
 
@@ -378,6 +389,7 @@ static int caam_probe(struct platform_device *pdev)
 #ifdef CONFIG_DEBUG_FS
struct caam_perfmon *perfmon;
 #endif
+   u32 scfgr, comp_params;
u32 cha_vid_ls;
 
ctrlpriv = devm_kzalloc(>dev, sizeof(struct caam_drv_private),
@@ -412,6 +424,33 @@ static int caam_probe(struct platform_device *pdev)
setbits32(>ctrl.mcr, MCFGR_WDENABLE |
  (sizeof(dma_addr_t) == sizeof(u64) ? MCFGR_LONG_PTR : 0));
 
+   /*
+*  Read the Compile Time parameters and SCFGR to determine
+* if Virtualization is enabled for this platform
+*/
+   comp_params = rd_reg32(>ctrl.perfmon.comp_parms_ms);
+   scfgr = rd_reg32(>ctrl.scfgr);
+
+   ctrlpriv->virt_en = 0;
+   if (comp_params & CTPR_MS_VIRT_EN_INCL) {
+   /* VIRT_EN_INCL = 1 & VIRT_EN_POR = 1 or
+* VIRT_EN_INCL = 1 & VIRT_EN_POR = 0 & SCFGR_VIRT_EN = 1
+*/
+   if ((comp_params & CTPR_MS_VIRT_EN_POR) ||
+   (!(comp_params & CTPR_MS_VIRT_EN_POR) &&
+  (scfgr & SCFGR_VIRT_EN)))
+   ctrlpriv->virt_en = 1;
+   } else {
+   /* VIRT_EN_INCL = 0 && VIRT_EN_POR_VALUE = 1 */
+   if (comp_params & CTPR_MS_VIRT_EN_POR)
+   ctrlpriv->virt_en = 1;
+   }
+
+   if (ctrlpriv->virt_en == 1)
+   setbits32(>ctrl.jrstart, JRSTART_JR0_START |
+ JRSTART_JR1_START | JRSTART_JR2_START |
+ JRSTART_JR3_START);
+
if (sizeof(dma_addr_t) == sizeof(u64))
if (of_device_is_compatible(nprop, "fsl,sec-v5.0"))
dma_set_mask(dev, DMA_BIT_MASK(40));
diff --git a/drivers/crypto/caam/intern.h b/drivers/crypto/caam/intern.h
index 6d85fcc..97363db 100644
--- a/drivers/crypto/caam/intern.h
+++ b/drivers/crypto/caam/intern.h
@@ -82,6 +82,7 @@ struct caam_drv_private {
u8 total_jobrs; /* Total Job Rings in device */
u8 qi_present;  /* Nonzero if QI present in device */
int secvio_irq; /* Security violation interrupt number */
+   int virt_en;/* Virtualization enabled in CAAM */
 
 #defineRNG4_MAX_HANDLES 2
/* RNG4 block */
diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
index 7bb898d..69e3562 100644
--- a/drivers/crypto/caam/regs.h
+++ b/drivers/crypto/caam/regs.h
@@ -176,6 +176,8 @@ struct caam_perfmon {
u32 cha_rev_ls; /* CRNR - CHA Rev No. Least significant half*/
 #define CTPR_MS_QI_SHIFT   25
 #define CTPR_MS_QI_MASK(0x1ull << CTPR_MS_QI_SHIFT)
+#define CTPR_MS_VIRT_EN_INCL   0x0001
+#define CTPR_MS_VIRT_EN_POR0x0002
u32 comp_parms_ms;  /* CTPR - Compile Parameters Register   */
u32 comp_parms_ls;  /* CTPR - Compile Parameters Register   */
u64
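
The virt_en decision in the caam_probe() hunk above can be modelled as a
pure function, which makes the truth table easy to check. This is a sketch
under stated assumptions: the function names are hypothetical, the two CTPR
bit values are copied as they appear in the regs.h hunk, and the
SCFGR_VIRT_EN value is a placeholder because its definition is not visible
in the portion of the patch shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as they appear in the regs.h hunk. */
#define CTPR_MS_VIRT_EN_INCL	0x0001u
#define CTPR_MS_VIRT_EN_POR	0x0002u
/* ASSUMPTION: placeholder bit, the real definition is not shown above. */
#define SCFGR_VIRT_EN		0x8000u

/* Mirrors the nested if/else structure of the patch. */
static bool virt_en_as_patched(uint32_t comp_params, uint32_t scfgr)
{
	bool virt_en = false;

	if (comp_params & CTPR_MS_VIRT_EN_INCL) {
		if ((comp_params & CTPR_MS_VIRT_EN_POR) ||
		    (!(comp_params & CTPR_MS_VIRT_EN_POR) &&
		     (scfgr & SCFGR_VIRT_EN)))
			virt_en = true;
	} else {
		if (comp_params & CTPR_MS_VIRT_EN_POR)
			virt_en = true;
	}
	return virt_en;
}

/* Equivalent condensed form: POR set, or INCL set and SCFGR enables it. */
static bool virt_en_simplified(uint32_t comp_params, uint32_t scfgr)
{
	bool incl = (comp_params & CTPR_MS_VIRT_EN_INCL) != 0;
	bool por = (comp_params & CTPR_MS_VIRT_EN_POR) != 0;
	bool sw = (scfgr & SCFGR_VIRT_EN) != 0;

	return por || (incl && sw);
}
```

Both functions agree on all eight input combinations, so the condensed form
may be a more readable way to express the same check.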

Re: [PATCH] arch,locking: Ciao arch_mutex_cpu_relax()

2014-06-23 Thread Peter Zijlstra

On Fri, Jun 20, 2014 at 11:21:13AM -0700, Davidlohr Bueso wrote:
> diff --git a/arch/arc/include/asm/processor.h 
> b/arch/arc/include/asm/processor.h
> index d99f9b3..8e1bf6b 100644
> --- a/arch/arc/include/asm/processor.h
> +++ b/arch/arc/include/asm/processor.h
> @@ -62,6 +62,8 @@ unsigned long thread_saved_pc(struct task_struct *t);
>  #define cpu_relax()  do { } while (0)
>  #endif
>  
> +#define arch_cpu_relax() cpu_relax()
> +
>  #define copy_segments(tsk, mm)  do { } while (0)
>  #define release_segments(mm)do { } while (0)

I'm not at all sure that cpu_relax() definition ARC has is valid. We
rely on cpu_relax() being at least a barrier() all over the place, and
it doesn't need to be SMP only. You can have a UP wait loop waiting for
an interrupt for example.

Vineet?

Re: [patch 4/4] mm: vmscan: move swappiness out of scan_control

2014-06-23 Thread Minchan Kim

On Fri, Jun 20, 2014 at 12:33:50PM -0400, Johannes Weiner wrote:
> Swappiness is determined for each scanned memcg individually in
> shrink_zone() and is not a parameter that applies throughout the
> reclaim scan.  Move it out of struct scan_control to prevent
> accidental use of a stale value.
> 
> Signed-off-by: Johannes Weiner 
Acked-by: Minchan Kim 

-- 
Kind regards,
Minchan Kim

Re: [patch 3/4] mm: vmscan: remove all_unreclaimable()

2014-06-23 Thread Minchan Kim

On Fri, Jun 20, 2014 at 12:33:49PM -0400, Johannes Weiner wrote:
> Direct reclaim currently calls shrink_zones() to reclaim all members
> of a zonelist, and if that wasn't successful it does another pass
> through the same zonelist to check overall reclaimability.
> 
> Just check reclaimability in shrink_zones() directly and propagate the
> result through the return value.  Then remove all_unreclaimable().
> 
> Signed-off-by: Johannes Weiner 
Acked-by: Minchan Kim 

> ---
>  mm/vmscan.c | 48 +++-
>  1 file changed, 23 insertions(+), 25 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index ed1efb84c542..d0bc1a209746 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2244,9 +2244,10 @@ static inline bool should_continue_reclaim(struct zone 
> *zone,
>   }
>  }
>  
> -static void shrink_zone(struct zone *zone, struct scan_control *sc)
> +static unsigned long shrink_zone(struct zone *zone, struct scan_control *sc)
>  {
>   unsigned long nr_reclaimed, nr_scanned;
> + unsigned long zone_reclaimed = 0;
>  
>   do {
>   struct mem_cgroup *root = sc->target_mem_cgroup;
> @@ -2290,8 +2291,12 @@ static void shrink_zone(struct zone *zone, struct 
> scan_control *sc)
>  sc->nr_scanned - nr_scanned,
>  sc->nr_reclaimed - nr_reclaimed);
>  
> + zone_reclaimed += sc->nr_reclaimed - nr_reclaimed;
> +
>   } while (should_continue_reclaim(zone, sc->nr_reclaimed - nr_reclaimed,
>sc->nr_scanned - nr_scanned, sc));
> +
> + return zone_reclaimed;
>  }
>  
>  /* Returns true if compaction should go ahead for a high-order request */
> @@ -2340,8 +2345,10 @@ static inline bool compaction_ready(struct zone *zone, 
> int order)
>   *
>   * If a zone is deemed to be full of pinned pages then just give it a light
>   * scan then give up on it.
> + *
> + * Returns whether the zones overall are reclaimable or not.
>   */
> -static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> +static bool shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  {
>   struct zoneref *z;
>   struct zone *zone;
> @@ -2354,6 +2361,7 @@ static void shrink_zones(struct zonelist *zonelist, 
> struct scan_control *sc)
>   .gfp_mask = sc->gfp_mask,
>   };
>   enum zone_type requested_highidx = gfp_zone(sc->gfp_mask);
> + bool all_unreclaimable = true;
>  
>   /*
>* If the number of buffer_heads in the machine exceeds the maximum
> @@ -2368,6 +2376,8 @@ static void shrink_zones(struct zonelist *zonelist, 
> struct scan_control *sc)
>  
>   for_each_zone_zonelist_nodemask(zone, z, zonelist,
>   gfp_zone(sc->gfp_mask), sc->nodemask) {
> + unsigned long zone_reclaimed = 0;
> +
>   if (!populated_zone(zone))
>   continue;
>   /*
> @@ -2414,10 +2424,15 @@ static void shrink_zones(struct zonelist *zonelist, 
> struct scan_control *sc)
>   _soft_scanned);
>   sc->nr_reclaimed += nr_soft_reclaimed;
>   sc->nr_scanned += nr_soft_scanned;
> + zone_reclaimed += nr_soft_reclaimed;
>   /* need some check for avoid more shrink_zone() */
>   }
>  
> - shrink_zone(zone, sc);
> + zone_reclaimed += shrink_zone(zone, sc);
> +
> + if (zone_reclaimed ||
> + (global_reclaim(sc) && zone_reclaimable(zone)))
> + all_unreclaimable = false;
>   }
>  
>   /*
> @@ -2439,26 +2454,8 @@ static void shrink_zones(struct zonelist *zonelist, 
> struct scan_control *sc)
>* promoted it to __GFP_HIGHMEM.
>*/
>   sc->gfp_mask = orig_mask;
> -}
>  
> -/* All zones in zonelist are unreclaimable? */
> -static bool all_unreclaimable(struct zonelist *zonelist,
> - struct scan_control *sc)
> -{
> - struct zoneref *z;
> - struct zone *zone;
> -
> - for_each_zone_zonelist_nodemask(zone, z, zonelist,
> - gfp_zone(sc->gfp_mask), sc->nodemask) {
> - if (!populated_zone(zone))
> - continue;
> - if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
> - continue;
> - if (zone_reclaimable(zone))
> - return false;
> - }
> -
> - return true;
> + return !all_unreclaimable;
>  }
>  
>  /*
> @@ -2482,6 +2479,7 @@ static unsigned long do_try_to_free_pages(struct 
> zonelist *zonelist,
>  {
>   unsigned long total_scanned = 0;
>   unsigned long writeback_threshold;
> + bool zones_reclaimable;
>  
>   delayacct_freepages_start();
>  
> @@ -2492,7 +2490,7 @@ static unsigned long do_try_to_free_pages(struct 
> zonelist *zonelist,
>
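
The idea of folding the reclaimability check into the shrink pass can be
sketched in userspace as below. Everything here is a hypothetical model:
struct zone_model and shrink_zones_model are not kernel names, the
'reclaimable' flag stands in for zone_reclaimable(), and 'reclaimed' stands
in for whatever a real shrink pass would free.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-zone model. */
struct zone_model {
	bool populated;
	bool reclaimable;		/* stand-in for zone_reclaimable() */
	unsigned long reclaimed;	/* pages one shrink pass would free */
};

/*
 * Single pass over the zones: accumulate reclaimed pages and, in the same
 * loop, note whether any zone made progress or is still reclaimable.
 * Returns true if the zones overall are reclaimable.
 */
static bool shrink_zones_model(const struct zone_model *zones, size_t nr,
			       bool global_reclaim,
			       unsigned long *nr_reclaimed)
{
	bool all_unreclaimable = true;

	for (size_t i = 0; i < nr; i++) {
		if (!zones[i].populated)
			continue;

		*nr_reclaimed += zones[i].reclaimed;

		if (zones[i].reclaimed ||
		    (global_reclaim && zones[i].reclaimable))
			all_unreclaimable = false;
	}
	return !all_unreclaimable;
}
```

The caller gets the reclaimability answer from the return value and no
longer needs a second walk over the same zonelist.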

Re: [PATCH] sched: Fix potential near-infinite distribute_cfs_runtime loop

2014-06-23 Thread Peter Zijlstra

On Fri, Jun 20, 2014 at 03:21:20PM -0700, Ben Segall wrote:
> distribute_cfs_runtime intentionally only hands out enough runtime to
> bring each cfs_rq to 1 ns of runtime, expecting the cfs_rqs to then take
> the runtime they need only once they actually get to run. However, if
> they get to run sufficiently quickly, the period timer is still in
> distribute_cfs_runtime and no runtime is available, causing them to
> throttle. Then distribute has to handle them again, and this can go on
> until distribute has handed out all of the runtime 1ns at a time, which
> takes far too long.
> 
> Instead allow access to the same runtime that distribute is handing out,
> accepting that corner cases with very low quota may be able to spend the
> entire cfs_b->runtime during distribute_cfs_runtime, meaning that the
> runtime directly handed out by distribute_cfs_runtime was over quota. In
> addition, if a cfs_rq does manage to throttle like this, make sure the
> existing distribute_cfs_runtime no longer loops over it again.
> 
> Signed-off-by: Ben Segall 

Thanks Ben!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v1 2/2] block: virtio-blk: support multi virt queues per virtio-blk device

2014-06-23 Thread Michael S. Tsirkin

On Mon, Jun 23, 2014 at 01:42:51PM +1000, Dave Chinner wrote:
> On Sun, Jun 22, 2014 at 01:24:48PM +0300, Michael S. Tsirkin wrote:
> > On Fri, Jun 20, 2014 at 11:29:40PM +0800, Ming Lei wrote:
> > > @@ -24,8 +26,8 @@ static struct workqueue_struct *virtblk_wq;
> > >  struct virtio_blk
> > >  {
> > >   struct virtio_device *vdev;
> > > - struct virtqueue *vq;
> > > - spinlock_t vq_lock;
> > > + struct virtqueue *vq[MAX_NUM_VQ];
> > > + spinlock_t vq_lock[MAX_NUM_VQ];
> > 
> > array of struct {
> > *vq;
> > spinlock_t lock;
> > }
> > would use more memory but would get us better locality.
> > It might even make sense to add padding to avoid
> > cacheline sharing between two unrelated VQs.
> > Want to try?
> 
> It's still false sharing because the queue objects share cachelines.
> To operate without contention they have to be physically separated
> from each other like so:
> 
> struct vq {
>   struct virtqueue*q;
>   spinlock_t  lock;
> } cacheline_aligned_in_smp;

Exactly, that's what I meant by padding above.

> struct some_other_struct {
>   
>   struct vq   vq[MAX_NUM_VQ];
>   
> };
> 
> This keeps locality to objects within a queue, but separates each
> queue onto its own cacheline
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> da...@fromorbit.com

To reduce the amount of memory wasted, we could add
the lock in the VQ itself.
Wastes 8 bytes of memory for devices which don't need it, but
we can save it elsewhere (e.g. get rid of the list and
the priv pointer).

How's this?  Your patch would go on top.
Care to benchmark it and tell us whether it makes sense?
If yes please let me know and I'll send an official patchset.

-->

virtio-blk: move spinlock to vq itself

Signed-off-by: Michael S. Tsirkin 

--

diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index b46671e..0951b21 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -19,6 +19,7 @@
  * @priv: a pointer for the virtqueue implementation to use.
  * @index: the zero-based ordinal number for this queue.
  * @num_free: number of elements we expect to be able to fit.
+ * @lock: lock for optional use by devices. If used, devices must initialize it.
  *
  * A note on @num_free: with indirect buffers, each buffer needs one
  * element in the queue, otherwise a buffer will need one element per
@@ -31,6 +32,7 @@ struct virtqueue {
struct virtio_device *vdev;
unsigned int index;
unsigned int num_free;
+   spinlock_t lock;
void *priv;
 };
 
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index f63d358..a3cdc19 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -25,7 +25,6 @@ struct virtio_blk
 {
struct virtio_device *vdev;
struct virtqueue *vq;
-   spinlock_t vq_lock;
 
/* The disk structure for the kernel. */
struct gendisk *disk;
@@ -137,7 +136,7 @@ static void virtblk_done(struct virtqueue *vq)
unsigned long flags;
unsigned int len;
 
-   spin_lock_irqsave(>vq_lock, flags);
+   spin_lock_irqsave(>vq->lock, flags);
do {
virtqueue_disable_cb(vq);
while ((vbr = virtqueue_get_buf(vblk->vq, )) != NULL) {
@@ -151,7 +150,7 @@ static void virtblk_done(struct virtqueue *vq)
/* In case queue is stopped waiting for more buffers. */
if (req_done)
blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
-   spin_unlock_irqrestore(>vq_lock, flags);
+   spin_unlock_irqrestore(>vq->lock, flags);
 }
 
 static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
@@ -202,12 +201,12 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, 
struct request *req)
vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
}
 
-   spin_lock_irqsave(>vq_lock, flags);
+   spin_lock_irqsave(>vq->lock, flags);
err = __virtblk_add_req(vblk->vq, vbr, vbr->sg, num);
if (err) {
virtqueue_kick(vblk->vq);
blk_mq_stop_hw_queue(hctx);
-   spin_unlock_irqrestore(>vq_lock, flags);
+   spin_unlock_irqrestore(>vq->lock, flags);
/* Out of mem doesn't actually happen, since we fall back
 * to direct descriptors */
if (err == -ENOMEM || err == -ENOSPC)
@@ -217,7 +216,7 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, 
struct request *req)
 
if (last && virtqueue_kick_prepare(vblk->vq))
notify = true;
-   spin_unlock_irqrestore(>vq_lock, flags);
+   spin_unlock_irqrestore(>vq->lock, flags);
 
if (notify)
virtqueue_notify(vblk->vq);
@@ -551,7 +550,7 @@ static int virtblk_probe(struct virtio_device *vdev)
err = init_vq(vblk);
if (err)
goto out_free_vblk;
-   spin_lock_init(>vq_lock);
+   spin_lock_init(>vq->lock);
 
/* FIXME:
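
For comparison, the cacheline-separated per-queue layout discussed earlier
in this message can be sketched in plain C11. This is a userspace
illustration only: struct vq_slot and struct blk_dev are hypothetical, the
64-byte line size and MAX_NUM_VQ value are assumptions, and alignas is used
here in place of the kernel's alignment annotation.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE_BYTES	64	/* ASSUMPTION: typical cache line size */
#define MAX_NUM_VQ	4	/* ASSUMPTION: illustrative value only */

/* One queue pointer plus its lock, padded out to a full cache line. */
struct vq_slot {
	alignas(CACHELINE_BYTES) void *q;	/* stand-in for struct virtqueue * */
	int lock;				/* stand-in for spinlock_t */
};

/* Device state: each array element starts on its own cache line. */
struct blk_dev {
	int other_state;
	struct vq_slot vq[MAX_NUM_VQ];
};
```

With this layout the pointer and lock of one queue stay together, while
neighbouring queues never share a line. The cost is the padding per slot,
which is the memory overhead the lock-in-virtqueue variant above tries to
avoid.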

Re: [patch 2/4] mm: vmscan: rework compaction-ready signaling in direct reclaim

2014-06-23 Thread Minchan Kim

On Fri, Jun 20, 2014 at 12:33:48PM -0400, Johannes Weiner wrote:
> Page reclaim for a higher-order page runs until compaction is ready,
> then aborts and signals this situation through the return value of
> shrink_zones().  This is an oddly specific signal to encode in the
> return value of shrink_zones(), though, and can be quite confusing.
> 
> Introduce sc->compaction_ready and signal the compactability of the
> zones out-of-band to free up the return value of shrink_zones() for
> actual zone reclaimability.
> 
> Signed-off-by: Johannes Weiner 
Acked-by: Minchan Kim 

Below just one nitpick.

> ---
>  mm/vmscan.c | 67 
> -
>  1 file changed, 31 insertions(+), 36 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 19b5b8016209..ed1efb84c542 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -65,6 +65,9 @@ struct scan_control {
>   /* Number of pages freed so far during a call to shrink_zones() */
>   unsigned long nr_reclaimed;
>  
> + /* One of the zones is ready for compaction */
> + int compaction_ready;
> +
>   /* How many pages shrink_list() should reclaim */
>   unsigned long nr_to_reclaim;
>  
> @@ -2292,15 +2295,11 @@ static void shrink_zone(struct zone *zone, struct 
> scan_control *sc)
>  }
>  
>  /* Returns true if compaction should go ahead for a high-order request */
> -static inline bool compaction_ready(struct zone *zone, struct scan_control 
> *sc)
> +static inline bool compaction_ready(struct zone *zone, int order)
>  {
>   unsigned long balance_gap, watermark;
>   bool watermark_ok;
>  
> - /* Do not consider compaction for orders reclaim is meant to satisfy */
> - if (sc->order <= PAGE_ALLOC_COSTLY_ORDER)
> - return false;
> -
>   /*
>* Compaction takes time to run and there are potentially other
>* callers using the pages just freed. Continue reclaiming until
> @@ -2309,18 +2308,18 @@ static inline bool compaction_ready(struct zone 
> *zone, struct scan_control *sc)
>*/
>   balance_gap = min(low_wmark_pages(zone), DIV_ROUND_UP(
>   zone->managed_pages, KSWAPD_ZONE_BALANCE_GAP_RATIO));
> - watermark = high_wmark_pages(zone) + balance_gap + (2UL << sc->order);
> + watermark = high_wmark_pages(zone) + balance_gap + (2UL << order);
>   watermark_ok = zone_watermark_ok_safe(zone, 0, watermark, 0, 0);
>  
>   /*
>* If compaction is deferred, reclaim up to a point where
>* compaction will have a chance of success when re-enabled
>*/
> - if (compaction_deferred(zone, sc->order))
> + if (compaction_deferred(zone, order))
>   return watermark_ok;
>  
>   /* If compaction is not ready to start, keep reclaiming */
> - if (!compaction_suitable(zone, sc->order))
> + if (!compaction_suitable(zone, order))
>   return false;
>  
>   return watermark_ok;
> @@ -2341,20 +2340,14 @@ static inline bool compaction_ready(struct zone 
> *zone, struct scan_control *sc)
>   *
>   * If a zone is deemed to be full of pinned pages then just give it a light
>   * scan then give up on it.
> - *
> - * This function returns true if a zone is being reclaimed for a costly
> - * high-order allocation and compaction is ready to begin. This indicates to
> - * the caller that it should consider retrying the allocation instead of
> - * further reclaim.
>   */
> -static bool shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> +static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  {
>   struct zoneref *z;
>   struct zone *zone;
>   unsigned long nr_soft_reclaimed;
>   unsigned long nr_soft_scanned;
>   unsigned long lru_pages = 0;
> - bool aborted_reclaim = false;

>   struct reclaim_state *reclaim_state = current->reclaim_state;
>   gfp_t orig_mask;
>   struct shrink_control shrink = {
> @@ -2391,22 +2384,24 @@ static bool shrink_zones(struct zonelist *zonelist, 
> struct scan_control *sc)
>   if (sc->priority != DEF_PRIORITY &&
>   !zone_reclaimable(zone))
>   continue;   /* Let kswapd poll it */
> - if (IS_ENABLED(CONFIG_COMPACTION)) {
> - /*
> -  * If we already have plenty of memory free for
> -  * compaction in this zone, don't free any more.
> -  * Even though compaction is invoked for any
> -  * non-zero order, only frequent costly order
> -  * reclamation is disruptive enough to become a
> -  * noticeable problem, like transparent huge
> -  * page allocations.
> -  */
> - if ((zonelist_zone_idx(z) <=

Re: [PATCH 1/2] slip: Fix deadlock in write_wakeup

2014-06-23 Thread Alexander Stein

On Monday 16 June 2014 19:55:04, Oliver Hartkopp wrote:
> Hello Tyler,
> 
> On 16.06.2014 04:23, Tyler Hall wrote:
> > Use schedule_work() to avoid potentially taking the spinlock in
> > interrupt context.
> > 
> (..)
> 
> > 
> > To deal with these issues, don't grab the lock in the wakeup function by
> > deferring the writeout to a workqueue. Also hold the lock during close
> > when de-assigning the tty pointer to safely disarm the worker and
> > timers.
> > 
> > This bug is easily reproducible on the first transmit when slip is
> > used with the standard 8250 serial driver.
> > 
> 
> looks reasonable. Thanks for your patch!
> Indeed I can't remember ever using the slcan driver with a real serial
> controller hardware with irq line but only via serial-to-USB adapters :-)
> Due to the recent fixes from Andre and Alexander these two drivers got in
> motion again ...
> 
> @Andre/Alexander: Can you please check if slcan still works in your setup. I
> don't have that hardware with me. I only was able to compile it successfully.

Sorry, I don't have access to the serial hardware currently.

Best regards
Alexander

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 0/3] Mark literal strings in init / exit code

2014-06-23 Thread Joe Perches

On Mon, 2014-06-23 at 08:23 +0200, Mathias Krause wrote:
> On 23 June 2014 00:56, Joe Perches  wrote:
> > On Mon, 2014-06-23 at 00:46 +0200, Mathias Krause wrote:
> >> [...] patch 2 adds some syntactical sugar for the most popular use
> >> case, by providing pr_<level> alike macros, namely pi_<level> for __init
> >> code and pe_<level> for __exit code. This hides the use of the marker
> >> macros behind the commonly known printing functions -- with just a
> >> single character changed.
> >>
> >> Patch 3 exemplarily changes all strings and format strings in
> >> arch/x86/kernel/acpi/boot.c to use the new macros. It also addresses a
> >> few styling issues, though. But this already leads to ~1.7 kB of r/o
> >> data moved to the .init.rodata section, marking it for release after
> >> init.
> >>
> >> [...]
> >
> > I once proposed a similar thing.
> >
> > https://lkml.org/lkml/2009/7/21/421
> >
> > Matt Mackall replied
> >
> > https://lkml.org/lkml/2009/7/21/463
> >
> 
> Thanks for the pointers. Have you looked at patches 2 and 3? I don't
> think it makes the printk() case ugly. In fact, using pi_<level>()
> should be no less readable than pr_<level>, no?

I don't think it's particularly less readable, but I
do think using the plug-in mechanism might be a better
option as it would need no manual markings at all.

cheers, Joe

Re: [RFC PATCH 0/3] Mark literal strings in init / exit code

2014-06-23 Thread Mathias Krause

On 23 June 2014 03:30, Joe Perches  wrote:
> On Mon, 2014-06-23 at 00:46 +0200, Mathias Krause wrote:
>> This RFC series tries to address the problem of dangling strings of
>> __init functions after initialization, as well as __exit strings for
>> code not even included in the final kernel image. The code might get
>> freed, but the format strings are not.
>>
>> One solution to the problem might be to declare variables in the code
>> and mark those variables as __initconst. That, though, makes the code
>> ugly, as can be seen, e.g., in drivers/hwmon/w83627ehf.c -- a pile of
>> 'static const char[] __initconst' lines just for the pr_info() call.
>
> Another solution might be, as David Daney suggested, using
> gcc 4.5+ plug-ins to extract these format strings and
> const char * arrays into specific sections automatically.
>
> https://lkml.org/lkml/2009/7/21/483
>
> Seems feasible, but there might be a negative of string
> duplication in multiple sections that would otherwise
> be consolidated into a single object.
>

There is currently no infrastructure for gcc plugins in the kernel
tree. And using plugins might make the kernel even more dependent on a
particular gcc version, as the plugin API changes with every version.
In fact, there is none, besides "use every exported function you can
get your hands on". And that API breaks with each and every new version
of gcc. This would put quite a bigger maintenance burden on such an
approach than providing appropriate wrappers, fixing the obvious
candidates and adding a checkpatch warning.

Thanks,
Mathias

Re: [PATCH tip/core/rcu] Reduce overhead of cond_resched() checks for RCU

2014-06-23 Thread Peter Zijlstra

On Fri, Jun 20, 2014 at 07:59:58PM -0700, Paul E. McKenney wrote:
> Commit ac1bea85781e (Make cond_resched() report RCU quiescent states)
> fixed a problem where a CPU looping in the kernel with but one runnable
> task would give RCU CPU stall warnings, even if the in-kernel loop
> contained cond_resched() calls.  Unfortunately, in so doing, it introduced
> performance regressions in Anton Blanchard's will-it-scale "open1" test.
> The problem appears to be not so much the increased cond_resched() path
> length as an increase in the rate at which grace periods complete, which
> increased per-update grace-period overhead.
> 
> This commit takes a different approach to fixing this bug, mainly by
> moving the RCU-visible quiescent state from cond_resched() to
> rcu_note_context_switch(), and by further reducing the check to a
> simple non-zero test of a single per-CPU variable.  However, this
> approach requires that the force-quiescent-state processing send
> resched IPIs to the offending CPUs.  These will be sent only once
> the grace period has reached an age specified by the boot/sysfs
> parameter rcutree.jiffies_till_sched_qs, or once the grace period
> reaches an age halfway to the point at which RCU CPU stall warnings
> will be emitted, whichever comes first.

Right, and I suppose the force quiescent stuff is triggered from the
tick, which in turn wakes some of these rcu kthreads, which on UP would
cause scheduling themselves.

On the topic of these threads; I recently noticed RCU grew a metric ton
of them, I found some 75 rcu kthreads on my box, wth up with that?

Re: [PATCH v3 02/13] mm, compaction: defer each zone individually instead of preferred zone

2014-06-23 Thread Zhang Yanfei

On 06/20/2014 11:49 PM, Vlastimil Babka wrote:
> When direct sync compaction is often unsuccessful, it may become deferred for
> some time to avoid further useless attempts, both sync and async. Successful
> high-order allocations un-defer compaction, while further unsuccessful
> compaction attempts prolong the compaction deferred period.
> 
> Currently the checking and setting deferred status is performed only on the
> preferred zone of the allocation that invoked direct compaction. But compaction
> itself is attempted on all eligible zones in the zonelist, so the behavior is
> suboptimal and may lead both to scenarios where 1) compaction is attempted
> uselessly, or 2) where it's not attempted despite good chances of succeeding,
> as shown on the examples below:
> 
> 1) A direct compaction with Normal preferred zone failed and set deferred
>compaction for the Normal zone. Another unrelated direct compaction with
>DMA32 as preferred zone will attempt to compact DMA32 zone even though
>the first compaction attempt also included DMA32 zone.
> 
>In another scenario, compaction with Normal preferred zone failed to compact
>Normal zone, but succeeded in the DMA32 zone, so it will not defer
>compaction. In the next attempt, it will try Normal zone which will fail
>again, instead of skipping Normal zone and trying DMA32 directly.
> 
> 2) Kswapd will balance DMA32 zone and reset defer status based on watermarks
>looking good. A direct compaction with preferred Normal zone will skip
>compaction of all zones including DMA32 because Normal was still deferred.
>The allocation might have succeeded in DMA32, but won't.
> 
> This patch makes compaction deferring work on individual zone basis instead of
> preferred zone. For each zone, it checks compaction_deferred() to decide if the
> zone should be skipped. If watermarks fail after compacting the zone,
> defer_compaction() is called. The zone where watermarks passed can still be
> deferred when the allocation attempt is unsuccessful. When allocation is
> successful, compaction_defer_reset() is called for the zone containing the
> allocated page. This approach should approximate calling defer_compaction()
> only on zones where compaction was attempted and did not yield allocated page.
> There might be corner cases but that is inevitable as long as the decision
> to stop compacting does not guarantee that a page will be allocated.
> 
> During testing on a two-node machine with a single very small Normal zone on
> node 1, this patch has improved success rates in stress-highalloc mmtests
> benchmark. The success rates here were previously made worse by commit 3a025760fc
> ("mm: page_alloc: spill to remote nodes before waking kswapd") as kswapd was
> no longer resetting often enough the deferred compaction for the Normal zone,
> and DMA32 zones on both nodes were thus not considered for compaction.
> 
> Signed-off-by: Vlastimil Babka 
> Cc: Minchan Kim 
> Cc: Mel Gorman 
> Cc: Joonsoo Kim 
> Cc: Michal Nazarewicz 
> Cc: Naoya Horiguchi 
> Cc: Christoph Lameter 
> Cc: Rik van Riel 
> Cc: David Rientjes 

Really good.

Reviewed-by: Zhang Yanfei 
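
For readers following the thread, the per-zone deferral described in the quoted changelog can be modelled in userspace roughly as below. This is a simplified sketch, not the code in mm/compaction.c: struct zone_model, zones_to_compact() and MAX_DEFER_SHIFT are names assumed for illustration, and the order-based checks of the real helpers are left out.

```c
#include <stdbool.h>

/* Illustrative cap on the backoff; assumed value, chosen for the sketch. */
#define MAX_DEFER_SHIFT 6

/* Hypothetical per-zone state: each zone carries its own backoff. */
struct zone_model {
	unsigned int considered;  /* checks seen since the last failure */
	unsigned int defer_shift; /* skip until (1 << defer_shift) checks */
};

/* A failed compaction lengthens the backoff of this zone only. */
static void defer_compaction(struct zone_model *z)
{
	z->considered = 0;
	if (z->defer_shift < MAX_DEFER_SHIFT)
		z->defer_shift++;
}

/* Returns true while this zone should still be skipped. */
static bool compaction_deferred(struct zone_model *z)
{
	unsigned int limit = 1U << z->defer_shift;

	if (++z->considered > limit)
		z->considered = limit; /* clamp to avoid overflow */
	return z->considered < limit;
}

/* A successful allocation from this zone clears its backoff. */
static void compaction_defer_reset(struct zone_model *z)
{
	z->considered = 0;
	z->defer_shift = 0;
}

/*
 * Walk all zones and count those on which compaction would be attempted.
 * Only zones that are themselves deferred are skipped, regardless of
 * which zone is the preferred one.
 */
static int zones_to_compact(struct zone_model *zones, int nr)
{
	int attempted = 0;

	for (int i = 0; i < nr; i++)
		if (!compaction_deferred(&zones[i]))
			attempted++;
	return attempted;
}
```

With zones[0] standing for Normal and zones[1] for DMA32, a failure recorded on Normal no longer hides DMA32 from the next attempt, which is scenario 2 from the changelog.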

> ---
>  include/linux/compaction.h |  6 --
>  mm/compaction.c| 29 -
>  mm/page_alloc.c| 33 ++---
>  3 files changed, 46 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
> index 01e3132..76f9beb 100644
> --- a/include/linux/compaction.h
> +++ b/include/linux/compaction.h
> @@ -22,7 +22,8 @@ extern int sysctl_extfrag_handler(struct ctl_table *table, 
> int write,
>  extern int fragmentation_index(struct zone *zone, unsigned int order);
>  extern unsigned long try_to_compact_pages(struct zonelist *zonelist,
>   int order, gfp_t gfp_mask, nodemask_t *mask,
> - enum migrate_mode mode, bool *contended);
> + enum migrate_mode mode, bool *contended, bool *deferred,
> + struct zone **candidate_zone);
>  extern void compact_pgdat(pg_data_t *pgdat, int order);
>  extern void reset_isolation_suitable(pg_data_t *pgdat);
>  extern unsigned long compaction_suitable(struct zone *zone, int order);
> @@ -91,7 +92,8 @@ static inline bool compaction_restarting(struct zone *zone, 
> int order)
>  #else
>  static inline unsigned long try_to_compact_pages(struct zonelist *zonelist,
>   int order, gfp_t gfp_mask, nodemask_t *nodemask,
> - enum migrate_mode mode, bool *contended)
> + enum migrate_mode mode, bool *contended, bool *deferred,
> + struct zone **candidate_zone)
>  {
>   return COMPACT_CONTINUE;
>  }
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 5175019..7c491d0 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1122,13 +1122,15 @@ int sysctl_extfrag_threshold = 500;
>   * @nodemask:

Re: [RFC PATCH 0/3] Mark literal strings in init / exit code

2014-06-23 Thread Mathias Krause

On 23 June 2014 00:56, Joe Perches  wrote:
> On Mon, 2014-06-23 at 00:46 +0200, Mathias Krause wrote:
>> [...] patch 2 adds some syntactical sugar for the most popular use
>> case, by providing pr_<level> alike macros, namely pi_<level> for __init
>> code and pe_<level> for __exit code. This hides the use of the marker
>> macros behind the commonly known printing functions -- with just a
>> single character changed.
>>
>> Patch 3 exemplarily changes all strings and format strings in
>> arch/x86/kernel/acpi/boot.c to use the new macros. It also addresses a
>> few styling issues, though. But this already leads to ~1.7 kB of r/o
>> data moved to the .init.rodata section, marking it for release after
>> init.
>>
>> [...]
>
> I once proposed a similar thing.
>
> https://lkml.org/lkml/2009/7/21/421
>
> Matt Mackall replied
>
> https://lkml.org/lkml/2009/7/21/463
>

Thanks for the pointers. Have you looked at patches 2 and 3? I don't
think it makes the printk() case ugly. In fact, using pi_<level>()
should be no less readable than pr_<level>, no?

Thanks,
Mathias

Re: [PATCH 1/2] ASoC: max98090: Add max98091 compatible string

2014-06-23 Thread Tushar Behera

On 06/21/2014 02:02 AM, Doug Anderson wrote:
> Tushar,
> 
> On Fri, Jun 20, 2014 at 1:03 AM, Tushar Behera  wrote:
>> From: Wonjoon Lee 
>>
>> The MAX98091 CODEC is the same as MAX98090 CODEC, but with an extra
>> microphone. Existing driver for MAX98090 CODEC already has support
>> for MAX98091 CODEC. Adding proper compatible string so that MAX98091
>> CODEC can be specified from device tree.
>>
>> Signed-off-by: Wonjoon Lee 
>> Signed-off-by: Doug Anderson 
>> Signed-off-by: Tushar Behera 
>> ---
>>
>> Picked from https://chromium-review.googlesource.com/#/c/184091/
>>
>>  .../devicetree/bindings/sound/max98090.txt |2 +-
>>  sound/soc/codecs/max98090.c|2 ++
>>  2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/devicetree/bindings/sound/max98090.txt 
>> b/Documentation/devicetree/bindings/sound/max98090.txt
>> index a5e63fa..c454e67 100644
>> --- a/Documentation/devicetree/bindings/sound/max98090.txt
>> +++ b/Documentation/devicetree/bindings/sound/max98090.txt
>> @@ -4,7 +4,7 @@ This device supports I2C only.
>>
>>  Required properties:
>>
>> -- compatible : "maxim,max98090".
>> +- compatible : "maxim,max98090" or "maxim,max98091".
>>
>>  - reg : The I2C address of the device.
>>
>> diff --git a/sound/soc/codecs/max98090.c b/sound/soc/codecs/max98090.c
>> index f5fccc7..4f5534d 100644
>> --- a/sound/soc/codecs/max98090.c
>> +++ b/sound/soc/codecs/max98090.c
>> @@ -2460,12 +2460,14 @@ static const struct dev_pm_ops max98090_pm = {
>>
>>  static const struct i2c_device_id max98090_i2c_id[] = {
>> { "max98090", MAX98090 },
>> +   { "max98091", MAX98091 },
> 
> optional: This would allow you to add some extra error checking in
> max98090_probe() to make sure that the device-tree specified device
> matched the device that was detected.  That could be in a future
> patch, though.
> 
> Reviewed-by: Doug Anderson 
> 

Okay. I will add that in a follow-up patch.

Thanks for reviewing.
-- 
Tushar Behera

Re: [PATCH] usb: misc: usb3503: Update error code in print message

2014-06-23 Thread Tushar Behera

On 06/17/2014 04:54 PM, Marek Szyprowski wrote:
> Hello,
> 
> On 2014-06-17 13:08, Tushar Behera wrote:
>> 'err' is uninitialized, rather print the error code directly.
>>
>> This also fixes following warning.
>> drivers/usb/misc/usb3503.c: In function ‘usb3503_probe’:
>> drivers/usb/misc/usb3503.c:195:11: warning: ‘err’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>>  dev_err(dev, "unable to request refclk (%d)\n", err);
>>
>> Signed-off-by: Tushar Behera 
> 
> Acked-by: Marek Szyprowski 
> 

Greg,

Would you please pick up this patch?

>> ---
>>
>> Based on next-20140616.
>>
>>   drivers/usb/misc/usb3503.c |3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/usb/misc/usb3503.c b/drivers/usb/misc/usb3503.c
>> index f43c619..652855b 100644
>> --- a/drivers/usb/misc/usb3503.c
>> +++ b/drivers/usb/misc/usb3503.c
>> @@ -192,7 +192,8 @@ static int usb3503_probe(struct usb3503 *hub)
>> clk = devm_clk_get(dev, "refclk");
>>   if (IS_ERR(clk) && PTR_ERR(clk) != -ENOENT) {
>> -dev_err(dev, "unable to request refclk (%d)\n", err);
>> +dev_err(dev, "unable to request refclk (%ld)\n",
>> +PTR_ERR(clk));
>>   return PTR_ERR(clk);
>>   }
>>   
> 
> Best regards


-- 
Tushar Behera
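
The fix quoted above relies on the kernel convention of encoding an errno value in the returned pointer, so the error code can be read back from clk itself instead of from a separate variable. A minimal userspace sketch of that convention follows; the three helpers imitate the ones in include/linux/err.h, while fake_clk_get() and probe_refclk() are hypothetical stand-ins for devm_clk_get() and the probe path.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical clock lookup: fails with -fail_with when it is non-zero. */
static void *fake_clk_get(int fail_with)
{
	static int dummy_clk;

	return fail_with ? ERR_PTR(-fail_with) : (void *)&dummy_clk;
}

/* Mirrors the patched check: report PTR_ERR(clk), not a stray 'err'. */
static long probe_refclk(int fail_with)
{
	void *clk = fake_clk_get(fail_with);

	if (IS_ERR(clk) && PTR_ERR(clk) != -ENOENT) {
		fprintf(stderr, "unable to request refclk (%ld)\n",
			PTR_ERR(clk));
		return PTR_ERR(clk);
	}
	return 0; /* clock found, or optional clock absent */
}
```

Because PTR_ERR() returns long, the format specifier has to be %ld, which is why the patch changes the format string as well.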

Re: [patch 12/13] mm: memcontrol: rewrite charge API

2014-06-23 Thread Uwe Kleine-König

Hello,

On Wed, Jun 18, 2014 at 04:40:44PM -0400, Johannes Weiner wrote:
> The memcg charge API charges pages before they are rmapped - i.e. have
> an actual "type" - and so every callsite needs its own set of charge
> and uncharge functions to know what type is being operated on.  Worse,
> uncharge has to happen from a context that is still type-specific,
> rather than at the end of the page's lifetime with exclusive access,
> and so requires a lot of synchronization.
> ...

this patch made it into next-20140623 as 5e49555277df (mm: memcontrol: rewrite
charge API) and it makes efm32_defconfig (ARCH=arm) fail with:

  CC  mm/swap.o
mm/swap.c: In function 'lru_cache_add_active_or_unevictable':
mm/swap.c:719:2: error: implicit declaration of function 'TestSetPageMlocked' [-Werror=implicit-function-declaration]
  if (!TestSetPageMlocked(page)) {
  ^
cc1: some warnings being treated as errors
scripts/Makefile.build:257: recipe for target 'mm/swap.o' failed
make[3]: *** [mm/swap.o] Error 1
Makefile:1471: recipe for target 'mm/swap.o' failed

imx_v4_v5_defconfig works, so probably the thing that makes
efm32_defconfig fail is CONFIG_MMU=n.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |

Re: [patch 1/4] mm: vmscan: remove remains of kswapd-managed zone->all_unreclaimable

2014-06-23 Thread Minchan Kim

On Fri, Jun 20, 2014 at 12:33:47PM -0400, Johannes Weiner wrote:
> shrink_zones() has a special branch to skip the all_unreclaimable()
> check during hibernation, because a frozen kswapd can't mark a zone
> unreclaimable.
> 
> But ever since 6e543d5780e3 ("mm: vmscan: fix do_try_to_free_pages()
> livelock"), determining a zone to be unreclaimable is done by directly
> looking at its scan history and no longer relies on kswapd setting the
> per-zone flag.
> 
> Remove this branch and let shrink_zones() check the reclaimability of
> the target zones regardless of hibernation state.
> 
> Signed-off-by: Johannes Weiner 
Acked-by: Minchan Kim 

It would not be bad to Cc KOSAKI, who was involved in the all_unreclaimable
series several times with me.

> ---
>  mm/vmscan.c | 8 
>  1 file changed, 8 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 0f16ffe8eb67..19b5b8016209 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2534,14 +2534,6 @@ out:
>   if (sc->nr_reclaimed)
>   return sc->nr_reclaimed;
>  
> - /*
> -  * As hibernation is going on, kswapd is freezed so that it can't mark
> -  * the zone into all_unreclaimable. Thus bypassing all_unreclaimable
> -  * check.
> -  */
> - if (oom_killer_disabled)
> - return 0;
> -
>   /* Aborted reclaim to try compaction? don't OOM, then */
>   if (aborted_reclaim)
>   return 1;
> -- 
> 2.0.0
> 

-- 
Kind regards,
Minchan Kim

Re: [RFC][PATCH 3/3] x86: make MP a required-feature on 64-bit

2014-06-23 Thread Andi Kleen

> We probably should just drop the cpu_has_mp macro entirely.  All it is used
> for is printing a warning in amd_k7_smp_check().
> 
> Andi, Borislav -- as far as I can tell, we have *never* enforced this on
> the 64-bit kernel, although we have enforced it on 64-bit processors
> running the 32-bit kernel.  We should either enforce it on both or just
> drop it.  What is your opinion?

Drop it everywhere.

-Andi

Re: [PATCH] serial: samsung: Remove redundant label

2014-06-23 Thread Sachin Kamat

On Mon, Jun 23, 2014 at 11:32 AM, Tushar Behera  wrote:
> probe_err label only returns the error code. This label can be removed
> and the error code can be returned directly.
>
> Signed-off-by: Tushar Behera 
> ---
>  drivers/tty/serial/samsung.c |5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/tty/serial/samsung.c b/drivers/tty/serial/samsung.c
> index c1d3ebd..bf93010 100644
> --- a/drivers/tty/serial/samsung.c
> +++ b/drivers/tty/serial/samsung.c
> @@ -1303,7 +1303,7 @@ static int s3c24xx_serial_probe(struct platform_device 
> *pdev)
>
> ret = s3c24xx_serial_init_port(ourport, pdev);
> if (ret < 0)
> -   goto probe_err;
> +   return ret;
>
> if (!s3c24xx_uart_drv.state) {
> ret = uart_register_driver(&s3c24xx_uart_drv);
> @@ -1335,9 +1335,6 @@ static int s3c24xx_serial_probe(struct platform_device 
> *pdev)
> dev_err(&pdev->dev, "failed to add cpufreq notifier\n");
>
> return 0;
> -
> - probe_err:
> -   return ret;
>  }
>
>  static int s3c24xx_serial_remove(struct platform_device *dev)
> --

Looks good.
Reviewed-by: Sachin Kamat 

Re: [PATCH] usb: gadget: add claimed field in struct usb_ep

2014-06-23 Thread Robert Baldyga

On 06/19/2014 05:08 PM, Felipe Balbi wrote:
> On Mon, Jun 16, 2014 at 10:20:36AM +0200, Robert Baldyga wrote:
>> This field allows marking an ep as claimed in a clearer way. Claiming
>> an endpoint by setting driver_data to a non-null value is a leaky solution
>> and makes the code unreadable.
> 
> how come ? How can it be unreadable ? how can it be leaky ?
> 

What if a gadget does not assign any value to driver_data (just like
Gadget Zero does)? The endpoint will be seen as unused, and autoconfig will
return it more than once. That's what I call a leaky solution.

Whether an endpoint is claimed or not is its internal state and
should not depend on assigning a non-null value to the driver_data field.
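
The problem described above can be reproduced with a small userspace model. This is only a sketch under stated assumptions: struct ep_model and both lookup functions are hypothetical, and whether the real autoconfig code or the caller sets the claimed flag is a choice made here for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical endpoint with both ways of tracking allocation. */
struct ep_model {
	const char *name;
	void *driver_data; /* owned by the function driver, may stay NULL */
	bool claimed;      /* explicit allocation state */
};

/* Old scheme: an endpoint counts as free while driver_data is NULL. */
static struct ep_model *autoconfig_by_driver_data(struct ep_model *eps,
						  size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!eps[i].driver_data)
			return &eps[i];
	return NULL;
}

/* Proposed scheme: the allocator records the claim itself. */
static struct ep_model *autoconfig_by_claimed(struct ep_model *eps,
					      size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!eps[i].claimed) {
			eps[i].claimed = true;
			return &eps[i];
		}
	}
	return NULL;
}
```

A caller that never writes driver_data gets the same endpoint back on every call to the first function, while the second hands out each endpoint once and then returns NULL.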

[PATCH] serial: samsung: Remove redundant label

2014-06-23 Thread Tushar Behera

probe_err label only returns the error code. This label can be removed
and the error code can be returned directly.

Signed-off-by: Tushar Behera 
---
 drivers/tty/serial/samsung.c |5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/tty/serial/samsung.c b/drivers/tty/serial/samsung.c
index c1d3ebd..bf93010 100644
--- a/drivers/tty/serial/samsung.c
+++ b/drivers/tty/serial/samsung.c
@@ -1303,7 +1303,7 @@ static int s3c24xx_serial_probe(struct platform_device 
*pdev)
 
ret = s3c24xx_serial_init_port(ourport, pdev);
if (ret < 0)
-   goto probe_err;
+   return ret;
 
if (!s3c24xx_uart_drv.state) {
ret = uart_register_driver(&s3c24xx_uart_drv);
@@ -1335,9 +1335,6 @@ static int s3c24xx_serial_probe(struct platform_device 
*pdev)
dev_err(&pdev->dev, "failed to add cpufreq notifier\n");
 
return 0;
-
- probe_err:
-   return ret;
 }
 
 static int s3c24xx_serial_remove(struct platform_device *dev)
-- 
1.7.9.5

[PATCH 0/2] serial: amba-pl01x: Clean up patches

2014-06-23 Thread Tushar Behera

The patches are based on next-20140620 and have only been
build-tested.

Tushar Behera (2):
  serial: amba-pl011: Simplify goto statements
  serial: amba-pl010: Use devres APIs

 drivers/tty/serial/amba-pl010.c |   46 ++-
 drivers/tty/serial/amba-pl011.c |   30 +
 2 files changed, 26 insertions(+), 50 deletions(-)

-- 
1.7.9.5

[PATCH 2/2] serial: amba-pl010: Use devres APIs

2014-06-23 Thread Tushar Behera

Migrate to the devres-managed APIs devm_kzalloc, devm_ioremap and
devm_clk_get.

Signed-off-by: Tushar Behera 
---
 drivers/tty/serial/amba-pl010.c |   46 ++-
 1 file changed, 16 insertions(+), 30 deletions(-)

diff --git a/drivers/tty/serial/amba-pl010.c b/drivers/tty/serial/amba-pl010.c
index 971af1e..af8deba 100644
--- a/drivers/tty/serial/amba-pl010.c
+++ b/drivers/tty/serial/amba-pl010.c
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -688,28 +689,22 @@ static int pl010_probe(struct amba_device *dev, const 
struct amba_id *id)
if (amba_ports[i] == NULL)
break;
 
-   if (i == ARRAY_SIZE(amba_ports)) {
-   ret = -EBUSY;
-   goto out;
-   }
+   if (i == ARRAY_SIZE(amba_ports))
+   return -EBUSY;
 
-   uap = kzalloc(sizeof(struct uart_amba_port), GFP_KERNEL);
-   if (!uap) {
-   ret = -ENOMEM;
-   goto out;
-   }
+   uap = devm_kzalloc(&dev->dev, sizeof(struct uart_amba_port),
+  GFP_KERNEL);
+   if (!uap)
+   return -ENOMEM;
 
-   base = ioremap(dev->res.start, resource_size(&dev->res));
-   if (!base) {
-   ret = -ENOMEM;
-   goto free;
-   }
+   base = devm_ioremap(&dev->dev, dev->res.start,
+   resource_size(&dev->res));
+   if (!base)
+   return -ENOMEM;
 
-   uap->clk = clk_get(&dev->dev, NULL);
-   if (IS_ERR(uap->clk)) {
-   ret = PTR_ERR(uap->clk);
-   goto unmap;
-   }
+   uap->clk = devm_clk_get(&dev->dev, NULL);
+   if (IS_ERR(uap->clk))
+   return PTR_ERR(uap->clk);
 
uap->port.dev = &dev->dev;
uap->port.mapbase = dev->res.start;
@@ -727,15 +722,9 @@ static int pl010_probe(struct amba_device *dev, const 
struct amba_id *id)
 
amba_set_drvdata(dev, uap);
ret = uart_add_one_port(&amba_reg, &uap->port);
-   if (ret) {
+   if (ret)
amba_ports[i] = NULL;
-   clk_put(uap->clk);
- unmap:
-   iounmap(base);
- free:
-   kfree(uap);
-   }
- out:
+
return ret;
 }
 
@@ -750,9 +739,6 @@ static int pl010_remove(struct amba_device *dev)
if (amba_ports[i] == uap)
amba_ports[i] = NULL;
 
-   iounmap(uap->port.membase);
-   clk_put(uap->clk);
-   kfree(uap);
return 0;
 }
 
-- 
1.7.9.5
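
As background on why the error labels and the explicit frees in pl010_remove() can go away, the idea behind device-managed resources can be sketched in userspace as below. This is an illustrative model only, not the kernel's devres implementation; struct device_model, devm_add(), devm_kzalloc_model(), devres_release_all_model() and probe_model() are all hypothetical names.

```c
#include <errno.h>
#include <stdlib.h>

/* One registered resource and the callback that releases it. */
struct devres_node {
	void (*release)(void *res);
	void *res;
	struct devres_node *next;
};

/* Hypothetical device: owns a LIFO list of managed resources. */
struct device_model {
	struct devres_node *head;
};

static int released; /* counts release callbacks, for demonstration */

static void release_buf(void *res)
{
	released++;
	free(res);
}

/* Attach a resource to the device so it is released automatically. */
static int devm_add(struct device_model *dev, void (*release)(void *),
		    void *res)
{
	struct devres_node *n = malloc(sizeof(*n));

	if (!n)
		return -ENOMEM;
	n->release = release;
	n->res = res;
	n->next = dev->head;
	dev->head = n;
	return 0;
}

/* Zeroed allocation whose lifetime is tied to the device. */
static void *devm_kzalloc_model(struct device_model *dev, size_t size)
{
	void *p = calloc(1, size);

	if (p && devm_add(dev, release_buf, p)) {
		free(p);
		return NULL;
	}
	return p;
}

/* Called by the (modelled) driver core on probe failure or removal. */
static void devres_release_all_model(struct device_model *dev)
{
	while (dev->head) {
		struct devres_node *n = dev->head;

		dev->head = n->next;
		n->release(n->res); /* newest resource is released first */
		free(n);
	}
}

/* Probe without cleanup labels: every failure path simply returns. */
static int probe_model(struct device_model *dev, int fail_second_step)
{
	char *uap = devm_kzalloc_model(dev, 64);

	if (!uap)
		return -ENOMEM;
	if (fail_second_step)
		return -ENOMEM; /* e.g. the mapping step failed */
	return 0;
}
```

In this model the caller runs devres_release_all_model() after a failed probe, so the early returns in probe_model() leak nothing.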

[PATCH 1/2] serial: amba-pl011: Remove redundant label

2014-06-23 Thread Tushar Behera

The label 'out' is only used to return the error code. We can return the
error code directly and remove 'out' label.

Signed-off-by: Tushar Behera 
---
 drivers/tty/serial/amba-pl011.c |   30 ++
 1 file changed, 10 insertions(+), 20 deletions(-)

diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 0e26dcb..8572f2a 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -1484,7 +1484,7 @@ static int pl011_hwinit(struct uart_port *port)
 */
retval = clk_prepare_enable(uap->clk);
if (retval)
-   goto out;
+   return retval;
 
uap->port.uartclk = clk_get_rate(uap->clk);
 
@@ -1507,8 +1507,6 @@ static int pl011_hwinit(struct uart_port *port)
plat->init();
}
return 0;
- out:
-   return retval;
 }
 
 static void pl011_write_lcr_h(struct uart_amba_port *uap, unsigned int lcr_h)
@@ -2131,32 +2129,24 @@ static int pl011_probe(struct amba_device *dev, const 
struct amba_id *id)
if (amba_ports[i] == NULL)
break;
 
-   if (i == ARRAY_SIZE(amba_ports)) {
-   ret = -EBUSY;
-   goto out;
-   }
+   if (i == ARRAY_SIZE(amba_ports))
+   return -EBUSY;
 
uap = devm_kzalloc(&dev->dev, sizeof(struct uart_amba_port),
   GFP_KERNEL);
-   if (uap == NULL) {
-   ret = -ENOMEM;
-   goto out;
-   }
+   if (uap == NULL)
+   return -ENOMEM;
 
i = pl011_probe_dt_alias(i, &dev->dev);
 
base = devm_ioremap(&dev->dev, dev->res.start,
resource_size(&dev->res));
-   if (!base) {
-   ret = -ENOMEM;
-   goto out;
-   }
+   if (!base)
+   return -ENOMEM;
 
uap->clk = devm_clk_get(&dev->dev, NULL);
-   if (IS_ERR(uap->clk)) {
-   ret = PTR_ERR(uap->clk);
-   goto out;
-   }
+   if (IS_ERR(uap->clk))
+   return PTR_ERR(uap->clk);
 
uap->vendor = vendor;
uap->lcrh_rx = vendor->lcrh_rx;
@@ -2198,7 +2188,7 @@ static int pl011_probe(struct amba_device *dev, const 
struct amba_id *id)
uart_unregister_driver(&amba_reg);
pl011_dma_remove(uap);
}
- out:
+
return ret;
 }
 
-- 
1.7.9.5

[PATCH 3.11 15/93] drm/radeon: avoid segfault on device open when accel is not working.

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Jérôme Glisse <jgli...@redhat.com>

commit 24f47acc78b0ab5e2201f859fe1f693ae90c7c83 upstream.

When accel is not working on a device with virtual address space, radeon
segfaults because the ib buffer is NULL and trying to map it inside the
virtual address space triggers a segfault. This patch only maps the ib
buffer if accel is working.

Signed-off-by: Jérôme Glisse <jgli...@redhat.com>
Reviewed-by: Alex Deucher <alexander.deuc...@amd.com>
Signed-off-by: Christian König <christian.koe...@amd.com>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 drivers/gpu/drm/radeon/radeon_kms.c | 57 +++--
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c
index 82b87108457c..a6fb24a773e8 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -504,28 +504,29 @@ int radeon_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
 
radeon_vm_init(rdev, &fpriv->vm);
 
-   r = radeon_bo_reserve(rdev->ring_tmp_bo.bo, false);
-   if (r) {
-   radeon_vm_fini(rdev, &fpriv->vm);
-   kfree(fpriv);
-   return r;
-   }
+   if (rdev->accel_working) {
+   r = radeon_bo_reserve(rdev->ring_tmp_bo.bo, false);
+   if (r) {
+   radeon_vm_fini(rdev, &fpriv->vm);
+   kfree(fpriv);
+   return r;
+   }
 
-   /* map the ib pool buffer read only into
-* virtual address space */
-   bo_va = radeon_vm_bo_add(rdev, &fpriv->vm,
-rdev->ring_tmp_bo.bo);
-   r = radeon_vm_bo_set_addr(rdev, bo_va, RADEON_VA_IB_OFFSET,
- RADEON_VM_PAGE_READABLE |
- RADEON_VM_PAGE_SNOOPED);
-
-   radeon_bo_unreserve(rdev->ring_tmp_bo.bo);
-   if (r) {
-   radeon_vm_fini(rdev, &fpriv->vm);
-   kfree(fpriv);
-   return r;
-   }
+   /* map the ib pool buffer read only into
+* virtual address space */
+   bo_va = radeon_vm_bo_add(rdev, &fpriv->vm,
+rdev->ring_tmp_bo.bo);
+   r = radeon_vm_bo_set_addr(rdev, bo_va, RADEON_VA_IB_OFFSET,
+ RADEON_VM_PAGE_READABLE |
+ RADEON_VM_PAGE_SNOOPED);
 
+   radeon_bo_unreserve(rdev->ring_tmp_bo.bo);
+   if (r) {
+   radeon_vm_fini(rdev, &fpriv->vm);
+   kfree(fpriv);
+   return r;
+   }
+   }
file_priv->driver_priv = fpriv;
}
return 0;
@@ -550,13 +551,15 @@ void radeon_driver_postclose_kms(struct drm_device *dev,
struct radeon_bo_va *bo_va;
int r;
 
-   r = radeon_bo_reserve(rdev->ring_tmp_bo.bo, false);
-   if (!r) {
-   bo_va = radeon_vm_bo_find(&fpriv->vm,
- rdev->ring_tmp_bo.bo);
-   if (bo_va)
-   radeon_vm_bo_rmv(rdev, bo_va);
-   radeon_bo_unreserve(rdev->ring_tmp_bo.bo);
+   if (rdev->accel_working) {
+   r = radeon_bo_reserve(rdev->ring_tmp_bo.bo, false);
+   if (!r) {
+   bo_va = radeon_vm_bo_find(&fpriv->vm,
+ rdev->ring_tmp_bo.bo);
+   if (bo_va)
+   radeon_vm_bo_rmv(rdev, bo_va);
+   radeon_bo_unreserve(rdev->ring_tmp_bo.bo);
+   }
}
 
radeon_vm_fini(rdev, &fpriv->vm);
-- 
1.9.1


[PATCH 3.11 18/93] nfsd4: warn on finding lockowner without stateid's

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: J. Bruce Fields <bfie...@redhat.com>

commit 27b11428b7de097c42f205beabb1764f4365443b upstream.

The current code assumes a one-to-one lockowner<->lock stateid
correspondence.

Signed-off-by: J. Bruce Fields <bfie...@redhat.com>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 fs/nfsd/nfs4state.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4858accc4c33..85e3686f16fc 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4149,6 +4149,10 @@ static bool same_lockowner_ino(struct nfs4_lockowner *lo, struct inode *inode, c
 
if (!same_owner_str(&lo->lo_owner, owner, clid))
return false;
+   if (list_empty(&lo->lo_owner.so_stateids)) {
+   WARN_ON_ONCE(1);
+   return false;
+   }
lst = list_first_entry(&lo->lo_owner.so_stateids,
   struct nfs4_ol_stateid, st_perstateowner);
return lst->st_file->fi_inode == inode;
-- 
1.9.1
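
The guard added in the hunk above (check for an empty list before taking its first entry) can be illustrated outside the kernel. The following is only a sketch under stated assumptions: `struct node`, `list_init`, `list_add_tail_node`, `list_is_empty`, `first_stateid`, and `same_inode` are hypothetical stand-ins written for this example, not the kernel's `list_head` API, and the integer `inode_id` replaces the real inode pointer comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a circular, doubly linked list head. */
struct node {
    struct node *next;
    struct node *prev;
};

/* Hypothetical stand-in for a lock stateid hanging off an owner. */
struct stateid {
    int inode_id;
    struct node link;   /* embedded list linkage */
};

static void list_init(struct node *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail_node(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static int list_is_empty(const struct node *head)
{
    return head->next == head;
}

/*
 * Return the first stateid on the list, or NULL if the list is empty.
 * Without the emptiness check, head->next would point back at the head
 * itself and the pointer arithmetic below would yield a bogus object.
 */
static struct stateid *first_stateid(struct node *head)
{
    assert(head != NULL);
    if (list_is_empty(head))
        return NULL;
    return (struct stateid *)((char *)head->next -
                              offsetof(struct stateid, link));
}

/* Mirrors the patched logic: an owner without stateids never matches. */
static int same_inode(struct node *head, int inode_id)
{
    struct stateid *st = first_stateid(head);

    if (!st)
        return 0;
    return st->inode_id == inode_id;
}
```

Calling `same_inode()` on a freshly initialized head returns 0 instead of dereferencing the list head as if it were a stateid, which is the failure mode the patch warns about.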


Re: unparseable, undocumented /sys/class/drm/.../pstate

2014-06-23 Thread Ilia Mirkin

On Mon, Jun 23, 2014 at 9:02 AM, Pavel Machek <pa...@ucw.cz> wrote:
> On Sun 2014-06-22 22:12:14, Ilia Mirkin wrote:
>> On Sat, Jun 21, 2014 at 3:45 PM, Greg KH <g...@kroah.com> wrote:
>> > On Sat, Jun 21, 2014 at 02:22:59PM -0400, Ilia Mirkin wrote:
>> >> On Sat, Jun 21, 2014 at 2:02 PM, Pavel Machek <pa...@ucw.cz> wrote:
>> >> > Hi!
>> >> >
>> >> > AFAICT, pstate file will contain something like
>> >> >
>> >> > 07: core 100 MHz memory 123 MHz *
>> >> > 08: core 100-200 MHz memory 123 MHz
>> >> >
>> >> > ...which does not look exactly like one-value-per-file, and I'm pretty
>> >> > sure userspace will get it wrong if it tries to parse it. Plus, I
>> >> > don't see required documentation in Documentation/ABI.
>> >> >
>> >> > Should we disable it for now, so that userspace does not start
>> >> > depending on it and we'll not have to maintain it forever?
>> >> >
>> >> > I guess better interface would be something like
>> >> >
>> >> > pstate/07/core_clock_min
>> >> >           core_clock_max
>> >> >           memory_clock_min
>> >> >           memory_clock_max
>> >> >
>> >> > and then pstate/active containing just the number of active state?
>>
>> Could we just say that the format of this file is one-per-line of
>>
>> level: information-for-the-user
>
> But it is not.

But it is...

> Management tools will want to parse it, sooner or
> later.  What is wrong with solution described above?

It is complex and annoying to the people that will actually use it.

>> And you can echo a level into it to switch to that level? That seems
>> like a reasonable ABI to have... would be happy to throw it into a
>> file somewhere... not sure where though.
>
> Documentation/ABI/testing/

Yes, I got that far. And then I became confused.

  -ilia
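
To make the disagreement concrete, here is a rough sketch of what a userspace tool would have to do with the "level: information-for-the-user" reading of the file. It is based only on the two sample lines quoted in this thread; the trailing `*` as the active-state marker is an assumption taken from that sample, and `parse_pstate_line` is a hypothetical helper, not an existing interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Split one pstate line into its level token and free-form info text.
 *
 * level    receives the text before the first ':' (NUL-terminated)
 * info     is set to the first non-space character after the ':'
 * active   is set when the line ends in '*' (assumed active marker)
 *
 * Returns 0 on success, -1 if the line has no usable "level:" prefix.
 */
static int parse_pstate_line(const char *line, char *level, size_t level_sz,
                             const char **info, bool *active)
{
    const char *colon;
    size_t len, end;

    assert(line && level && info && active);

    colon = strchr(line, ':');
    if (!colon)
        return -1;

    len = (size_t)(colon - line);
    if (len == 0 || len >= level_sz)
        return -1;

    memcpy(level, line, len);
    level[len] = '\0';

    *info = colon + 1;
    while (**info == ' ')
        (*info)++;

    /* Ignore trailing blanks and newline, then look for the marker. */
    end = strlen(line);
    while (end > len + 1 && (line[end - 1] == ' ' || line[end - 1] == '\n'))
        end--;
    *active = end > len + 1 && line[end - 1] == '*';

    return 0;
}
```

The level and the active marker are easy to recover this way, but the clock values remain free-form text (`100 MHz` versus `100-200 MHz`), which is the part of the format a management tool would have to guess at.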

[PATCH 02/21] ftrace: Make ftrace_is_dead available globally

2014-06-23 Thread Jiri Slaby

Kgr wants to test whether ftrace is OK with patching. If not, we just
bail out and will not initialize at all.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
---
 include/linux/ftrace.h | 3 +++
 kernel/trace/trace.h   | 2 --
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index c142816c2801..7ba30d4b4909 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -140,6 +140,8 @@ enum ftrace_tracing_type_t {
 /* Current tracing type, default is FTRACE_TYPE_ENTER */
 extern enum ftrace_tracing_type_t ftrace_tracing_type;
 
+extern int ftrace_is_dead(void);
+
 /**
  * ftrace_stop - stop function tracer.
  *
@@ -241,6 +243,7 @@ static inline int ftrace_nr_registered_ops(void)
return 0;
 }
 static inline void clear_ftrace_function(void) { }
+static inline int ftrace_is_dead(void) { return 0; }
 static inline void ftrace_kill(void) { }
 static inline void ftrace_stop(void) { }
 static inline void ftrace_start(void) { }
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 9258f5a815db..0d96b8990a97 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -824,7 +824,6 @@ static inline int ftrace_trace_task(struct task_struct *task)
 
return test_tsk_trace_trace(task);
 }
-extern int ftrace_is_dead(void);
 int ftrace_create_function_files(struct trace_array *tr,
 struct dentry *parent);
 void ftrace_destroy_function_files(struct trace_array *tr);
@@ -837,7 +836,6 @@ static inline int ftrace_trace_task(struct task_struct *task)
 {
return 1;
 }
-static inline int ftrace_is_dead(void) { return 0; }
 static inline int
 ftrace_create_function_files(struct trace_array *tr,
 struct dentry *parent)
-- 
2.0.0


[PATCH 16/21] kgr: add support for missing functions

2014-06-23 Thread Jiri Slaby

Sometimes we want to patch a function which is in a module that is not
currently loaded. In that case, patching would fail completely. So let
the user decide whether it is fatal when the function to be patched is
not found. If it is not, it is just skipped.  Other functions in the
patch (if any) are still patched in that case.

Note that this approach expects newly loaded modules to be fixed
already. No deferred patching happens on the module load.
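
The skip-or-abort behaviour described above can be sketched in plain C as follows. This is a simplified model, not the kGraft code: the `abort_if_missing` and `applied` fields come from the patch, while `lookup_symbol`, `loaded_symbols`, `apply_one`, and `apply_all` are hypothetical names invented for the illustration, and the symbol table is a fixed array standing in for kallsyms.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct patch_fun {
    const char *name;
    bool abort_if_missing;  /* is a missing symbol fatal? */
    bool applied;           /* set once the function was redirected */
};

/* Hypothetical symbol table: only these names "exist" right now. */
static const char *const loaded_symbols[] = { "sys_open", "sys_close" };

static int lookup_symbol(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(loaded_symbols) / sizeof(loaded_symbols[0]); i++)
        if (strcmp(loaded_symbols[i], name) == 0)
            return 0;
    return -ENOENT;
}

/* Apply one function patch; a tolerated miss is reported as success. */
static int apply_one(struct patch_fun *pf)
{
    int err;

    assert(pf != NULL);
    err = lookup_symbol(pf->name);
    if (err) {
        if (err == -ENOENT && !pf->abort_if_missing)
            return 0;   /* skipped, pf->applied stays false */
        return err;
    }
    pf->applied = true;
    return 0;
}

/* Apply every entry; stop at the first fatal error. */
static int apply_all(struct patch_fun *funs, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int err = apply_one(&funs[i]);

        if (err)
            return err;
    }
    return 0;
}
```

Keeping a separate `applied` flag matters for the later finalize step: entries that were skipped must not be switched to the fast stub, which is why the patch drops the `const` qualifier from `struct kgr_patch_fun`.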

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
---
 include/linux/kgraft.h  | 10 +++---
 kernel/kgraft.c | 26 ++
 samples/kgraft/kgraft_patcher.c |  4 ++--
 3 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/include/linux/kgraft.h b/include/linux/kgraft.h
index 92b642408b6f..4d8665f60cbc 100644
--- a/include/linux/kgraft.h
+++ b/include/linux/kgraft.h
@@ -30,10 +30,13 @@
 
 struct kgr_patch {
bool __percpu *irq_use_new;
-   const struct kgr_patch_fun {
+   struct kgr_patch_fun {
const char *name;
const char *new_name;
 
+   bool abort_if_missing;
+   bool applied;
+
void *new_function;
 
struct ftrace_ops *ftrace_ops_slow;
@@ -51,16 +54,17 @@ struct kgr_loc_caches {
bool __percpu *irq_use_new;
 };
 
-#define KGR_PATCHED_FUNCTION(_name, _new_function) \
+#define KGR_PATCHED_FUNCTION(_name, _new_function, abort) \
static struct ftrace_ops __kgr_patch_ftrace_ops_slow_ ## _name = { \
.flags = FTRACE_OPS_FL_SAVE_REGS, \
}; \
static struct ftrace_ops __kgr_patch_ftrace_ops_fast_ ## _name = { \
.flags = FTRACE_OPS_FL_SAVE_REGS, \
}; \
-   static const struct kgr_patch_fun __kgr_patch_ ## _name = { \
+   static struct kgr_patch_fun __kgr_patch_ ## _name = { \
.name = #_name, \
.new_name = #_new_function, \
+   .abort_if_missing = abort, \
.new_function = _new_function, \
.ftrace_ops_slow = &__kgr_patch_ftrace_ops_slow_ ## _name, \
.ftrace_ops_fast = &__kgr_patch_ftrace_ops_fast_ ## _name, \
diff --git a/kernel/kgraft.c b/kernel/kgraft.c
index 0d04f1cbcd4a..6816da29a6a3 100644
--- a/kernel/kgraft.c
+++ b/kernel/kgraft.c
@@ -28,7 +28,7 @@
 #include <linux/workqueue.h>
 
 static int kgr_patch_code(const struct kgr_patch *patch,
-   const struct kgr_patch_fun *patch_fun, bool final);
+   struct kgr_patch_fun *patch_fun, bool final);
 static void kgr_work_fn(struct work_struct *work);
 
 static struct workqueue_struct *kgr_wq;
@@ -87,7 +87,7 @@ static bool kgr_still_patching(void)
 
 static void kgr_finalize(void)
 {
-   const struct kgr_patch_fun *const *patch_fun;
+   struct kgr_patch_fun *const *patch_fun;
 
for (patch_fun = kgr_patch->patches; *patch_fun; patch_fun++) {
int ret = kgr_patch_code(kgr_patch, *patch_fun, true);
@@ -240,7 +240,7 @@ free_caches:
 }
 
 static int kgr_patch_code(const struct kgr_patch *patch,
-   const struct kgr_patch_fun *patch_fun, bool final)
+   struct kgr_patch_fun *patch_fun, bool final)
 {
struct ftrace_ops *new_ops;
struct kgr_loc_caches *caches;
@@ -250,11 +250,16 @@ static int kgr_patch_code(const struct kgr_patch *patch,
/* Choose between slow and fast stub */
if (!final) {
err = kgr_init_ftrace_ops(patch_fun);
-   if (err)
+   if (err) {
+   if (err == -ENOENT && !patch_fun->abort_if_missing)
+   return 0;
return err;
+   }
pr_debug("kgr: patching %s to slow stub\n", patch_fun->name);
new_ops = patch_fun->ftrace_ops_slow;
} else {
+   if (!patch_fun->applied)
+   return 0;
pr_debug("kgr: patching %s to fast stub\n", patch_fun->name);
new_ops = patch_fun->ftrace_ops_fast;
}
@@ -290,7 +295,9 @@ static int kgr_patch_code(const struct kgr_patch *patch,
/* don't fail: we are only slower */
return 0;
}
-   }
+   } else
+   patch_fun->applied = true;
+
pr_debug("kgr: redirection for %lx (%s) done\n", fentry_loc,
patch_fun->name);
 
@@ -305,7 +312,7 @@ static int kgr_patch_code(const struct kgr_patch *patch,
  */

[PATCH 13/21] kgr: x86: refuse to build without fentry support

2014-06-23 Thread Jiri Slaby

From: Jiri Kosina <jkos...@suse.cz>

The only reliable way for function redirection through ftrace_ops (when
modifying pt_regs->rip in the handler) is fentry.

The alternative -- mcount -- is problematic in several ways. Namely, the
caller's function prologue (which has already been executed by the time
the mcount callsite is reached) is not known to the callee and can be
completely incompatible with it, wreaking havoc on return from the
function.

fentry doesn't suffer from this, as it's located at the very beginning of
the function, even before prologue has been executed, and therefore callee
is the owner of both function prologue and epilogue.

Fixing mcount to properly fix everything up would be non-trivial, and
Steven is not in favor of doing that.

Both kGraft and upstream kernel (patch to be submitted) should error out
when this unsupported and non-working configuration is detected.

According to Michael Matz, the -mfentry gcc option is x86 specific. Other
architectures insert the respective profile calls before the prologue by
default.

Signed-off-by: Jiri Kosina <jkos...@suse.cz>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Michael Matz <m...@suse.de>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
---
 arch/x86/include/asm/kgraft.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/kgraft.h b/arch/x86/include/asm/kgraft.h
index 5e40ba1a0753..6fc57a85d12c 100644
--- a/arch/x86/include/asm/kgraft.h
+++ b/arch/x86/include/asm/kgraft.h
@@ -17,6 +17,10 @@
 #ifndef ASM_KGR_H
 #define ASM_KGR_H
 
+#ifndef CC_USING_FENTRY
+#error Your compiler has to support -mfentry for kGraft to work on x86
+#endif
+
 #include <asm/ptrace.h>
 
 static inline void kgr_set_regs_ip(struct pt_regs *regs, unsigned long ip)
-- 
2.0.0


[PATCH 06/21] kgr: add Documentation

2014-06-23 Thread Jiri Slaby

This is a text provided by Udo and polished.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Udo Seidel <udosei...@gmx.de>
---
 Documentation/kgraft.txt | 44 
 1 file changed, 44 insertions(+)
 create mode 100644 Documentation/kgraft.txt

diff --git a/Documentation/kgraft.txt b/Documentation/kgraft.txt
new file mode 100644
index ..476beaf35c62
--- /dev/null
+++ b/Documentation/kgraft.txt
@@ -0,0 +1,44 @@
+Live Kernel Patching with kGraft
+
+
+Written by Udo Seidel udoseidel at gmx dot de
+Based on the Blog entry by Vojtech Pavlik
+Updated by Jiri Slaby
+
+June 2014
+
+kGraft's development was started by the SUSE Labs. kGraft builds on
+technologies and ideas that are already present in the kernel: ftrace [1] and
+its mcount-based reserved space in function headers [2], the INT3/IPI-NMI
+patching also used in jump labels [3], and RCU-like update of code that does
+not require stopping the kernel.
+
+A kGraft patch is a kernel module and fully relies on the in-kernel module
+loader to link the new code with the kernel. Thanks to all that, the design
+can be nicely minimalistic.
+
+While kGraft is, by choice, limited to replacing whole functions and constants
+they reference, this does not limit the set of code patches that can be
+applied significantly.
+
+Use
+---
+
+1) Build a kernel with CONFIG_KGRAFT enabled
+2) Create a module with a patch
+   * Look at samples/kgraft/kgraft_patcher.c for an example
+3) Insert the module from 2) into the booted kernel from 1)
+4) All processes need to enter the kernel to acknowledge the new state
+   * This can be done e.g. by sending a non-fatal signal to all processes
+   * Read /proc/*/kgr_in_progress to see who still needs to be poked
+5) You should see "kgr succeeded" in dmesg now
+
+Enjoy your patched system!
+
+
+References
+--
+
+[1] Documentation/trace/ftrace.txt
+[2] Documentation/trace/ftrace-design.txt
+[3] Documentation/static-keys.txt
-- 
2.0.0


[PATCH 03/21] kgr: initial code

2014-06-23 Thread Jiri Slaby

From: Jiri Kosina <jkos...@suse.cz>

Provide initial implementation. We are now able to do ftrace-based
runtime patching of the kernel code.

In addition to that, we will provide a kgr_patcher module in the next
patch to test the functionality.

Note that the per-process flag goes away in later patches, where it is
converted to a single bit in thread_info.

Limitations/TODOs:

- rmmod of the module that provides the patch is not possible yet
  (it'd be nice if that'd cause reverse application of the patch)
- x86_64 only

Additional squashes to this patch:
jk: add missing Kconfig.kgr
jk: fixup a header bug
jk: cleanup comments
js: port to new mcount infrastructure
js: order includes
js: fix for non-KGR (prototype and Kconfig fixes)
js: fix potential lock imbalance in kgr_patch_code
js: use insn helper for jmp generation
js: add \n to a printk
jk: externally_visible attribute warning fix
jk: symbol lookup failure handling
jk: fix race between patching and setting a flag (thanks to bpetkov)
js: add more sanity checking
js: handle missing kallsyms gracefully
js: use correct name, not alias
js: fix index in cleanup path
js: clear kgr_in_progress for all syscall paths
js: cleanup
js: do the checking in the process context
js: call kgr_mark_processes outside loop and locks
jk: convert from raw patching to ftrace API
jk: depend on regs-saving ftrace
js: make kgr_init an init_call
js: use correct offset for stub
js: use pr_debug
js: use IS_ENABLED
js: fix potential memory leak
js: change names from kgr -> kGraft
js: fix error handling and return values
js: use bitops to be atomic
jk: helpers for task's kgr_in_progress
js: remove copies of stubs, have only a single instance

Signed-off-by: Jiri Kosina <jkos...@suse.cz>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: Andi Kleen <a...@firstfloor.org>
---
 arch/x86/Kconfig   |   2 +
 arch/x86/include/asm/kgraft.h  |  27 +++
 arch/x86/include/asm/thread_info.h |   1 +
 arch/x86/kernel/asm-offsets.c  |   1 +
 arch/x86/kernel/entry_64.S |   3 +
 include/linux/kgraft.h |  85 +
 kernel/Kconfig.kgraft  |   7 +
 kernel/Makefile|   1 +
 kernel/kgraft.c| 346 +
 9 files changed, 473 insertions(+)
 create mode 100644 arch/x86/include/asm/kgraft.h
 create mode 100644 include/linux/kgraft.h
 create mode 100644 kernel/Kconfig.kgraft
 create mode 100644 kernel/kgraft.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a8f749ef0fdc..90c45b15b08b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -131,6 +131,7 @@ config X86
select HAVE_CC_STACKPROTECTOR
select GENERIC_CPU_AUTOPROBE
select HAVE_ARCH_AUDITSYSCALL
+   select HAVE_KGRAFT
 
 config INSTRUCTION_DECODER
def_bool y
@@ -267,6 +268,7 @@ config FIX_EARLYCON_MEM
 
 source "init/Kconfig"
 source "kernel/Kconfig.freezer"
+source "kernel/Kconfig.kgraft"
 
 menu "Processor type and features"
 
diff --git a/arch/x86/include/asm/kgraft.h b/arch/x86/include/asm/kgraft.h
new file mode 100644
index ..5e40ba1a0753
--- /dev/null
+++ b/arch/x86/include/asm/kgraft.h
@@ -0,0 +1,27 @@
+/*
+ * kGraft Online Kernel Patching
+ *
+ *  Copyright (c) 2013-2014 SUSE
+ *   Authors: Jiri Kosina
+ *   Vojtech Pavlik
+ *   Jiri Slaby
+ */
+
+/*
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ */
+
+#ifndef ASM_KGR_H
+#define ASM_KGR_H
+
+#include <asm/ptrace.h>
+
+static inline void kgr_set_regs_ip(struct pt_regs *regs, unsigned long ip)
+{
+   regs->ip = ip;
+}
+
+#endif
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 854053889d4d..e44c8fda9c43 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -35,6 +35,7 @@ struct thread_info {
void __user *sysenter_return;
unsigned int sig_on_uaccess_error:1;
unsigned int uaccess_err:1;  /* uaccess failed */
+   unsigned long   kgr_in_progress;
 };
 
 #define INIT_THREAD_INFO(tsk)  \
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 9f6b9341950f..0db0437967a2 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -32,6 +32,7 @@ void common(void) {
OFFSET(TI_flags, thread_info, flags);
OFFSET(TI_status, thread_info, status);
OFFSET(TI_addr_limit, thread_info, addr_limit);
+   OFFSET(TI_kgr_in_progress, thread_info, kgr_in_progress);
 
BLANK();
OFFSET(crypto_tfm_ctx_offset, crypto_tfm, __crt_ctx);
diff --git a/arch/x86/kernel/entry_64.S

[PATCH 11/21] kgr: handle irqs

2014-06-23 Thread Jiri Slaby

Introduce a per-cpu flag to check whether we should use the old or new
function in the slow stub. The new function starts being used on a
processor only after a scheduled function sets the flag via
schedule_on_each_cpu. Presumably this happens in process context, when
no irq is running. Also protect the flag setting by disabling
interrupts so that 1) we have a barrier and 2) no interrupt triggers
while the flag is being set (the store should be atomic anyway, as the
flag is a bool).
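
The condition that the slow stub evaluates after this patch can be written out as a small truth table. The sketch below restates only the boolean logic from the `kgr_stub_slow` hunk further down; `use_old_code` and `check_truth_table` are hypothetical names for this illustration, and the three inputs stand in for `in_interrupt()`, `kgr_task_in_progress(current)`, and the per-cpu `irq_use_new` flag.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether the slow stub should redirect to the old function.
 *
 * in_irq            - are we running in interrupt context?
 * task_in_progress  - has the current task not yet migrated?
 * cpu_irq_use_new   - has this CPU been told to use new code in irqs?
 */
static bool use_old_code(bool in_irq, bool task_in_progress,
                         bool cpu_irq_use_new)
{
    return (!in_irq && task_in_progress) ||
           (in_irq && !cpu_irq_use_new);
}

/* Spell out all interesting combinations as executable documentation. */
static void check_truth_table(void)
{
    /* Process context: only the per-task flag matters. */
    assert(use_old_code(false, true, false));
    assert(use_old_code(false, true, true));
    assert(!use_old_code(false, false, false));
    assert(!use_old_code(false, false, true));

    /* Interrupt context: only the per-cpu flag matters. */
    assert(use_old_code(true, false, false));
    assert(use_old_code(true, true, false));
    assert(!use_old_code(true, false, true));
    assert(!use_old_code(true, true, true));
}
```

The point of the split is that an interrupt can hit any task, so the interrupted task's own flag says nothing about which version the handler should see; the per-cpu flag gives interrupt handlers a consistent answer on each processor.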

js: fix fail paths

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
---
 include/linux/kgraft.h  |  6 +++--
 kernel/kgraft.c | 59 -
 samples/kgraft/kgraft_patcher.c |  2 +-
 3 files changed, 51 insertions(+), 16 deletions(-)

diff --git a/include/linux/kgraft.h b/include/linux/kgraft.h
index e87623fe74ad..93bb1c50e079 100644
--- a/include/linux/kgraft.h
+++ b/include/linux/kgraft.h
@@ -18,6 +18,7 @@
 #define LINUX_KGR_H
 
 #include <linux/bitops.h>
+#include <linux/compiler.h>
 #include <linux/ftrace.h>
 #include <linux/sched.h>
 
@@ -28,7 +29,7 @@
 #define KGR_TIMEOUT 30
 
 struct kgr_patch {
-   char reserved;
+   bool __percpu *irq_use_new;
const struct kgr_patch_fun {
const char *name;
const char *new_name;
@@ -47,6 +48,7 @@ struct kgr_patch {
 struct kgr_loc_caches {
unsigned long old;
unsigned long new;
+   bool __percpu *irq_use_new;
 };
 
 #define KGR_PATCHED_FUNCTION(_name, _new_function) \
@@ -67,7 +69,7 @@ struct kgr_loc_caches {
 #define KGR_PATCH(name)        &__kgr_patch_ ## name
 #define KGR_PATCH_END  NULL
 
-extern int kgr_start_patching(const struct kgr_patch *);
+extern int kgr_start_patching(struct kgr_patch *);
 
 static inline void kgr_mark_task_in_progress(struct task_struct *p)
 {
diff --git a/kernel/kgraft.c b/kernel/kgraft.c
index 2fe1d922ebac..ce733f3f2640 100644
--- a/kernel/kgraft.c
+++ b/kernel/kgraft.c
@@ -15,9 +15,11 @@
  */
 
 #include <linux/ftrace.h>
+#include <linux/hardirq.h> /* for in_interrupt() */
 #include <linux/kallsyms.h>
 #include <linux/kgraft.h>
 #include <linux/module.h>
+#include <linux/percpu.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/sort.h>
@@ -25,7 +27,8 @@
 #include <linux/types.h>
 #include <linux/workqueue.h>
 
-static int kgr_patch_code(const struct kgr_patch_fun *patch_fun, bool final);
+static int kgr_patch_code(const struct kgr_patch *patch,
+   const struct kgr_patch_fun *patch_fun, bool final);
 static void kgr_work_fn(struct work_struct *work);
 
 static struct workqueue_struct *kgr_wq;
@@ -52,8 +55,10 @@ static void kgr_stub_slow(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *ops, struct pt_regs *regs)
 {
struct kgr_loc_caches *c = ops->private;
+   bool irq = !!in_interrupt();
 
-   if (kgr_task_in_progress(current)) {
+   if ((!irq && kgr_task_in_progress(current)) ||
+   (irq && !*this_cpu_ptr(c->irq_use_new))) {
pr_info("kgr: slow stub: calling old code at %lx\n",
c->old);
kgr_set_regs_ip(regs, c->old + MCOUNT_INSN_SIZE);
@@ -86,7 +91,7 @@ static void kgr_finalize(void)
const struct kgr_patch_fun *const *patch_fun;
 
for (patch_fun = kgr_patch->patches; *patch_fun; patch_fun++) {
-   int ret = kgr_patch_code(*patch_fun, true);
+   int ret = kgr_patch_code(kgr_patch, *patch_fun, true);
/*
 * In case any of the symbol resolutions in the set
 * has failed, patch all the previously replaced fentry
@@ -96,6 +101,7 @@ static void kgr_finalize(void)
pr_err("kgr: finalize for %s failed, trying to continue\n",
(*patch_fun)->name);
}
+   free_percpu(kgr_patch->irq_use_new);
 }
 
 static void kgr_work_fn(struct work_struct *work)
@@ -167,6 +173,20 @@ static unsigned long kgr_get_fentry_loc(const char *f_name)
return fentry_loc;
 }
 
+static void kgr_handle_irq_cpu(struct work_struct *work)
+{
+   unsigned long flags;
+
+   local_irq_save(flags);
+   *this_cpu_ptr(kgr_patch->irq_use_new) = true;
+   local_irq_restore(flags);
+}
+
+static void kgr_handle_irqs(void)
+{
+   schedule_on_each_cpu(kgr_handle_irq_cpu);
+}
+
 static int kgr_init_ftrace_ops(const struct kgr_patch_fun *patch_fun)
 {
struct kgr_loc_caches *caches;
@@ -220,7 +240,8 @@ free_caches:
return ret;
 }
 
-static int kgr_patch_code(const struct kgr_patch_fun *patch_fun, bool final)
+static int kgr_patch_code(const struct kgr_patch *patch,
+   const struct kgr_patch_fun *patch_fun, bool final)
 {
struct ftrace_ops *new_ops;
struct kgr_loc_caches

[PATCH 09/21] kgr: mark task_safe in some kthreads

2014-06-23 Thread Jiri Slaby

Before we enable kthread support in kGraft, we must make sure all
kthreads explicitly mark themselves as kGraft-safe at some point.

We do this by injecting kgr_task_safe into the freezer test. There, we
assume that kthreads are in some predefined state and can expect
something bad to happen. Hence we switch the kGraft worlds there from
the old one to the new one. The optimal solution would be to convert
most kthreads (those that need not actually be kthreads) to workqueues,
as suggested by Tejun. This is upcoming work that will appear next.
Until we get there, we use the freezer for kGraft in the way presented
here.

Note that there are also some kthreads that do not utilize the freezer,
so we use kgr_task_safe in them explicitly. This happens at locations
that appear to be safe for the kthreads to switch worlds.

The end result after we migrate kthreads (that need not be kthreads)
to workqueues is: only kthreads that contain kgr_task_safe explicitly
(or via some helper) remain, and nothing else.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Acked-by: Greg Kroah-Hartman <gre...@linuxfoundation.org> [devtmpfs]
Acked-by: Paul E. McKenney <paul...@linux.vnet.ibm.com> [rcu]
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: Theodore Ts'o <ty...@mit.edu>
Cc: Dipankar Sarma <dipan...@in.ibm.com>
Cc: Tejun Heo <t...@kernel.org>
---
 drivers/base/devtmpfs.c |  1 +
 drivers/scsi/scsi_error.c   |  2 ++
 drivers/usb/core/hub.c  |  4 ++--
 fs/jbd2/journal.c   |  2 ++
 fs/notify/mark.c|  5 -
 include/linux/freezer.h |  2 ++
 kernel/hung_task.c  |  5 -
 kernel/kthread.c|  3 +++
 kernel/rcu/tree.c   |  6 --
 kernel/rcu/tree_plugin.h| 10 --
 kernel/smpboot.c|  2 ++
 kernel/workqueue.c  |  3 +++
 mm/huge_memory.c|  1 +
 net/bluetooth/rfcomm/core.c |  2 ++
 14 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 25798db14553..c7d52d1b8c9c 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -387,6 +387,7 @@ static int devtmpfsd(void *p)
sys_chroot(".");
complete(&setup_done);
while (1) {
+   kgr_task_safe(current);
spin_lock(&req_lock);
while (requests) {
struct req *req = requests;
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index cbe38e5e7955..28bc61251e2a 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -2153,6 +2153,8 @@ int scsi_error_handler(void *data)
 * disables signal delivery for the created thread.
 */
while (!kthread_should_stop()) {
+   kgr_task_safe(current);
+
set_current_state(TASK_INTERRUPTIBLE);
if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0) ||
shost->host_failed != shost->host_busy) {
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 21b99b4b4082..85a53488ed3f 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -5070,9 +5070,9 @@ static int hub_thread(void *__unused)
 
do {
hub_events();
-   wait_event_freezable(khubd_wait,
+   wait_event_freezable(khubd_wait, ({ kgr_task_safe(current);
!list_empty(&hub_event_list) ||
-   kthread_should_stop());
+   kthread_should_stop(); }));
} while (!kthread_should_stop() || !list_empty(&hub_event_list));
 
pr_debug("%s: khubd exiting\n", usbcore_name);
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 67b8e303946c..1b9c4c2e014a 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -43,6 +43,7 @@
 #include <linux/backing-dev.h>
 #include <linux/bitops.h>
 #include <linux/ratelimit.h>
+#include <linux/sched.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/jbd2.h>
@@ -260,6 +261,7 @@ loop:
write_lock(&journal->j_state_lock);
}
finish_wait(&journal->j_wait_commit, &wait);
+   kgr_task_safe(current);
}
 
jbd_debug(1, "kjournald2 wakes\n");
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index d90deaa08e78..d30a491cacf2 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -82,6 +82,7 @@
 #include <linux/kthread.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
+#include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/srcu.h>
@@ -355,7 +356,9 @@ static int fsnotify_mark_destroy(void *ignored)
fsnotify_put_mark(mark);
}
 
-   wait_event_interruptible(destroy_waitq, !list_empty(&destroy_list));
+   wait_event_interruptible(destroy_waitq, ({
+   kgr_task_safe(current);
+

[PATCH 21/21] kgr: x86: optimize handling of CPU-bound tasks

2014-06-23 Thread Jiri Slaby

From: Jiri Kosina <jkos...@suse.cz>

Processes which are running in userspace at the time of patching can
be immediately marked as migrated to the new universe, as they are
provably outside the kernel and would have their 'in_progress' flag
cleared upon (eventual) kernel entry anyway.

This eliminates the need to send a SIGSTOP/SIGCONT signal (or perform
any kind of alternative handling that would force the tasks to go
through the kernel) to such tasks. This allows the tasks to run
completely undisturbed by the patching.

We do this by looking at the task's stack trace. This is a suboptimal
and perhaps ugly solution, but we have not found any other easy way
that does not interrupt the task's computation (we are aware of IPIs
and of looking at stored regs, for example). If anyone can come up
with an idea how to dig out the process' state (whether running in
user space or not) from task_struct or such, please draw faster and
shoot this one dead.

js: remove unneeded headers
js: cleanup

Signed-off-by: Jiri Kosina <jkos...@suse.cz>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
---
 arch/x86/include/asm/kgraft.h | 30 ++
 kernel/kgraft.c   |  3 +++
 2 files changed, 33 insertions(+)

diff --git a/arch/x86/include/asm/kgraft.h b/arch/x86/include/asm/kgraft.h
index 6fc57a85d12c..3b13738f3665 100644
--- a/arch/x86/include/asm/kgraft.h
+++ b/arch/x86/include/asm/kgraft.h
@@ -22,10 +22,40 @@
 #endif
 
 #include <asm/ptrace.h>
+#include <linux/stacktrace.h>
 
 static inline void kgr_set_regs_ip(struct pt_regs *regs, unsigned long ip)
 {
regs->ip = ip;
 }
 
+#ifdef CONFIG_STACKTRACE
+/*
+ * Tasks which are running in userspace after the patching has been started
+ * can immediately be marked as migrated to the new universe.
+ *
+ * If this function returns non-zero (i.e. also when error happens), the task
+ * needs to be migrated using kgraft lazy mechanism.
+ */
+static inline bool kgr_needs_lazy_migration(struct task_struct *p)
+{
+   unsigned long s[3];
+   struct stack_trace t = {
+   .nr_entries = 0,
+   .skip = 0,
+   .max_entries = 3,
+   .entries = s,
+   };
+
+   save_stack_trace_tsk(p, &t);
+
+   return t.nr_entries > 2;
+}
+#else
+static inline bool kgr_needs_lazy_migration(struct task_struct *p)
+{
+   return true;
+}
+#endif
+
 #endif
diff --git a/kernel/kgraft.c b/kernel/kgraft.c
index c4b1777604e1..f13a4c081bd4 100644
--- a/kernel/kgraft.c
+++ b/kernel/kgraft.c
@@ -150,6 +150,9 @@ static void kgr_handle_processes(void)
 */
wake_up_process(p);
}
+   /* mark tasks wandering in userspace as already migrated */
+   if (!kgr_needs_lazy_migration(p))
+   kgr_task_safe(p);
}
read_unlock(&tasklist_lock);
 }
-- 
2.0.0
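
The stack-depth heuristic in this patch can be modelled outside the kernel. The following is a minimal userspace sketch, not the kernel code: `model_stack_trace` and `model_needs_lazy_migration` are hypothetical names, and it assumes the comparison is `nr_entries > 2`, i.e. that a task sitting in userspace produces at most two trace entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct stack_trace. */
struct model_stack_trace {
	unsigned int nr_entries;   /* entries actually recorded */
	unsigned int max_entries;  /* capacity of the entries array */
	unsigned long *entries;    /* recorded return addresses */
};

/*
 * Assumed decision rule: a trace deeper than two entries is taken to
 * mean the task is inside the kernel and must be migrated lazily.
 */
static bool model_needs_lazy_migration(const struct model_stack_trace *t)
{
	assert(t->nr_entries <= t->max_entries);
	return t->nr_entries > 2;
}
```

Capping max_entries at 3 keeps the walk cheap: the caller only needs to know whether a third entry exists, not what the full trace looks like.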

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] ns: introduce getnspid syscall

2014-06-23 Thread Serge E. Hallyn

Quoting chenhanx...@cn.fujitsu.com (chenhanx...@cn.fujitsu.com):
> Hi
>
> > -Original Message-
> > From: Richard Weinberger [mailto:rich...@nod.at]
> > Sent: Friday, June 20, 2014 7:02 PM
> > To: Chen, Hanxiao/陈 晗霄; contain...@lists.linux-foundation.org;
> > linux-kernel@vger.kernel.org
> > Cc: Eric W. Biederman; Serge Hallyn; Daniel P. Berrange; Oleg Nesterov; Al Viro;
> > David Howells; Pavel Emelyanov; Vasiliy Kulikov; Gotou, Yasunori/五島 康文;
> > linux-...@vger.kernel.org
> > Subject: Re: [PATCH v2] ns: introduce getnspid syscall
> >
> > On 20.06.2014 12:18, Chen Hanxiao wrote:
> > > We need a direct method of getting the pid inside containers.
> > > If some issues occurred inside container guest, host user
> > > could not know which process is in trouble just by guest pid:
> > > the users of container guest only knew the pid inside containers.
> > > This will bring obstacle for trouble shooting.
> > >
> > > int getnspid(pid_t pid, int fd1, int fd2);
> > >
> > > pid: the pid number need to be translated.
> > >
> > > fd: a file descriptor referring to one of
> > > the namespace entries in a /proc/[pid]/ns/pid.
> > > fd1 for destination ns(ns1), where the pid came from.
> > > fd2 for reference ns(ns2), while fd2 = -2 means for current ns.
> > >
> > > return value:
> > > >0 : translated pid in ns1(fd1) seen from ns2(fd2).
> > > <=0: on failure.
> >
> > I don't think that adding a new system call for this is a good solution.
> > We need a more generic way. I bet people are interested in more than just PID
> > numbers.
>
> Could you please give some hints on how to expand this interface?
>
> > I agree with Eric that a procfs solution is more appropriate.
>
> Procfs is a good solution, but syscall is not bad though.

I might be inclined to agree, except that in this case you still
need a mounted procfs anyway to get the /proc/$pid/ns/pid fds.

I'm sorry, I've not been watching this thread, so this probably has been
considered and decided against, but I'm going to ask anyway.  Keeping
in mind both checkpoint-restart and introspection for use in a
setns'd command, why not make it

pid_t getnspid(pid_t query_pid, pid_t observer_pid)

which returns the process id of query_pid as seen from observer_pid's
pidns?
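
To make the proposed semantics concrete, here is a minimal userspace sketch of the translation step: given a task that has one pid per namespace level, return the pid visible from an observer's pid namespace. All names (`model_pidns`, `model_upid`, `model_task`, `model_pid_nr_ns`) are hypothetical, and the data layout is an assumption made for illustration only; it is not the proposed syscall.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_LEVEL 4

/* A pid namespace, identified by its depth in the tree (root is 0). */
struct model_pidns {
	int level;
};

/* The pid number a task has in one particular namespace. */
struct model_upid {
	int nr;
	const struct model_pidns *ns;
};

struct model_task {
	int level;                                  /* level of the task's own pidns */
	struct model_upid numbers[MODEL_MAX_LEVEL]; /* one entry per level 0..level */
};

/* Return the pid of @query as seen from @observer_ns, or 0 if invisible. */
static int model_pid_nr_ns(const struct model_task *query,
			   const struct model_pidns *observer_ns)
{
	assert(query->level >= 0 && query->level < MODEL_MAX_LEVEL);

	/* A namespace deeper than the task's own can never see the task. */
	if (observer_ns->level > query->level)
		return 0;

	const struct model_upid *upid = &query->numbers[observer_ns->level];

	/* Same level but a different namespace: a sibling, so not visible. */
	return upid->ns == observer_ns ? upid->nr : 0;
}
```

In this model, getnspid(query_pid, observer_pid) would first resolve both pids in the caller's namespace and then apply the lookup above with the observer's namespace.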

> Procfs works for me, but that seems could not fit
> Pavel's requirement.
> His opinion is that a syscall is a more generic interface
> than proc files, and  also very helpful.
> And syscall could tell whether a pid lives in a specific pid namespace,
> much convenient than procfs.
>
> Thanks,
> - Chen
>
> ___
> Containers mailing list
> contain...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

[PATCH 05/21] kgr: update Kconfig documentation

2014-06-23 Thread Jiri Slaby

This is based on Udo's text which was augmented in this patch.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Udo Seidel <udosei...@gmx.de>
Cc: Vojtech Pavlik <vojt...@suse.cz>
---
 kernel/Kconfig.kgraft | 3 +++
 samples/Kconfig   | 4 
 2 files changed, 7 insertions(+)

diff --git a/kernel/Kconfig.kgraft b/kernel/Kconfig.kgraft
index f38d82c06580..bead93646071 100644
--- a/kernel/Kconfig.kgraft
+++ b/kernel/Kconfig.kgraft
@@ -5,3 +5,6 @@ config KGRAFT
bool "kGraft infrastructure"
depends on DYNAMIC_FTRACE_WITH_REGS
depends on HAVE_KGRAFT
+   help
+ Select this to enable kGraft online kernel patching. The
+ runtime price is zero, so it is safe to say Y here.
diff --git a/samples/Kconfig b/samples/Kconfig
index b33a397dfc58..12848d1bd8c5 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -58,6 +58,10 @@ config SAMPLE_KDB
 config SAMPLE_KGRAFT_PATCHER
tristate "Build kGraft patcher example -- loadable modules only"
depends on KGRAFT && m
+   help
+ Sample code to replace sys_iopl() and sys_capable() via
+ kGraft. This is only for presentation purposes. It is safe to
+ say Y here.
 
 config SAMPLE_RPMSG_CLIENT
tristate "Build rpmsg client sample -- loadable modules only"
-- 
2.0.0

[PATCH 07/21] kgr: trigger the first check earlier

2014-06-23 Thread Jiri Slaby

Run the first check after 10 seconds, not 30. This speeds up the whole
process in most scenarios.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
---
 kernel/kgraft.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kgraft.c b/kernel/kgraft.c
index 9b832419e0fd..ce2f09f3b544 100644
--- a/kernel/kgraft.c
+++ b/kernel/kgraft.c
@@ -319,7 +319,7 @@ int kgr_start_patching(const struct kgr_patch *patch)
/*
 * give everyone time to exit kernel, and check after a while
 */
-   queue_delayed_work(kgr_wq, &kgr_work, KGR_TIMEOUT * HZ);
+   queue_delayed_work(kgr_wq, &kgr_work, 10 * HZ);
 
return 0;
 }
-- 
2.0.0

[PATCH 18/21] kgr: fix race of stub and patching

2014-06-23 Thread Jiri Slaby

While we are patching, we set up a stub which refers to the
kgr_in_progress flag of a process. The stub can be called as soon as it
is set up, but we set the flag only after patching is done, in
kgr_handle_processes. This is obviously too late, so set the flag
before we start patching, but after we check that no other patching is
in progress -- we would interfere otherwise.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Reported-by: Aravinda Prasad <aravi...@linux.vnet.ibm.com>
---
 kernel/kgraft.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/kernel/kgraft.c b/kernel/kgraft.c
index 6816da29a6a3..89414957cf74 100644
--- a/kernel/kgraft.c
+++ b/kernel/kgraft.c
@@ -124,14 +124,22 @@ static void kgr_work_fn(struct work_struct *work)
mutex_unlock(&kgr_in_progress_lock);
 }
 
-static void kgr_handle_processes(void)
+static void kgr_mark_processes(void)
 {
struct task_struct *p;
 
read_lock(&tasklist_lock);
-   for_each_process(p) {
+   for_each_process(p)
kgr_mark_task_in_progress(p);
+   read_unlock(&tasklist_lock);
+}
 
+static void kgr_handle_processes(void)
+{
+   struct task_struct *p;
+
+   read_lock(&tasklist_lock);
+   for_each_process(p) {
/* wake up kthreads, they will clean the progress flag */
if (!p->mm) {
/*
@@ -333,6 +341,8 @@ int kgr_start_patching(struct kgr_patch *patch)
goto unlock_free;
}
 
+   kgr_mark_processes();
+
for (patch_fun = patch->patches; *patch_fun; patch_fun++) {
ret = kgr_patch_code(patch, *patch_fun, false);
/*
-- 
2.0.0

[PATCH 01/21] ftrace: Add function to find fentry of function

2014-06-23 Thread Jiri Slaby

This is needed for kGraft to find a fentry location to be ftraced.
We use this to find a place where to jump to a new/old code location.

Note that we use an O(n) algorithm to assert correctness (and
simplicity). This algorithm can be further optimized to be O(log(n))
using binary search, but care has to be taken about the first member
of each entries page. I.e. we cannot use 1:1 what is in
ftrace_location_range etc.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
---
 include/linux/ftrace.h |  1 +
 kernel/trace/ftrace.c  | 30 ++
 2 files changed, 31 insertions(+)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 404a686a3644..c142816c2801 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -295,6 +295,7 @@ extern void
 unregister_ftrace_function_probe_func(char *glob, struct ftrace_probe_ops 
*ops);
 extern void unregister_ftrace_function_probe_all(char *glob);
 
+extern unsigned long ftrace_function_to_fentry(unsigned long addr);
 extern int ftrace_text_reserved(const void *start, const void *end);
 
 extern int ftrace_nr_registered_ops(void);
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 5b372e3ed675..f4da441c0125 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1422,6 +1422,36 @@ ftrace_ops_test(struct ftrace_ops *ops, unsigned long 
ip, void *regs)
}   \
}
 
+/**
+ * ftrace_function_to_fentry -- lookup fentry location for a function
+ * @addr: function address to find a fentry in
+ *
+ * Perform a lookup in a list of fentry callsites to find one that fits a
+ * specified function @addr. It returns the corresponding fentry callsite or
+ * zero on failure.
+ */
+unsigned long ftrace_function_to_fentry(unsigned long addr)
+{
+   const struct dyn_ftrace *rec;
+   const struct ftrace_page *pg;
+   unsigned long ret = 0;
+
+   mutex_lock(&ftrace_lock);
+   do_for_each_ftrace_rec(pg, rec) {
+   unsigned long off;
+
+   if (!kallsyms_lookup_size_offset(rec->ip, NULL, &off))
+   continue;
+   if (addr + off == rec->ip) {
+   ret = rec->ip;
+   goto end;
+   }
+   } while_for_each_ftrace_rec()
+end:
+   mutex_unlock(&ftrace_lock);
+
+   return ret;
+}
 
 static int ftrace_cmp_recs(const void *a, const void *b)
 {
-- 
2.0.0
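
The linear lookup described in this patch can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel implementation: `model_rec` is a hypothetical record that already carries the start address of the enclosing symbol, standing in for what kallsyms_lookup_size_offset() would compute from rec->ip.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fentry record: call site plus start of its function. */
struct model_rec {
	unsigned long ip;        /* address of the fentry call site */
	unsigned long sym_start; /* start address of the enclosing function */
};

/*
 * O(n) scan: for every record, compute the call site's offset within its
 * function and check whether @addr plus that offset hits the call site.
 * Returns the fentry address, or 0 when no record belongs to @addr.
 */
static unsigned long model_function_to_fentry(const struct model_rec *recs,
					      size_t n, unsigned long addr)
{
	for (size_t i = 0; i < n; i++) {
		assert(recs[i].ip >= recs[i].sym_start);
		unsigned long off = recs[i].ip - recs[i].sym_start;

		if (addr + off == recs[i].ip)
			return recs[i].ip;
	}
	return 0;
}
```

Because the offset is derived per record, the scan works no matter where inside the function the fentry call sits, which is also why a plain binary search on record addresses cannot be reused unchanged.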

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch 3/4] mm: vmscan: remove all_unreclaimable()

2014-06-23 Thread Mel Gorman

On Fri, Jun 20, 2014 at 12:33:49PM -0400, Johannes Weiner wrote:
> Direct reclaim currently calls shrink_zones() to reclaim all members
> of a zonelist, and if that wasn't successful it does another pass
> through the same zonelist to check overall reclaimability.
>
> Just check reclaimability in shrink_zones() directly and propagate the
> result through the return value.  Then remove all_unreclaimable().
>
> Signed-off-by: Johannes Weiner <han...@cmpxchg.org>
> ---
>  mm/vmscan.c | 48 +++-
>  1 file changed, 23 insertions(+), 25 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index ed1efb84c542..d0bc1a209746 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2244,9 +2244,10 @@ static inline bool should_continue_reclaim(struct zone
> *zone,
>   }
>  }
>
> -static void shrink_zone(struct zone *zone, struct scan_control *sc)
> +static unsigned long shrink_zone(struct zone *zone, struct scan_control *sc)
>  {
>   unsigned long nr_reclaimed, nr_scanned;
> + unsigned long zone_reclaimed = 0;
>
>   do {
>   struct mem_cgroup *root = sc->target_mem_cgroup;
> @@ -2290,8 +2291,12 @@ static void shrink_zone(struct zone *zone, struct
> scan_control *sc)
>  sc->nr_scanned - nr_scanned,
>  sc->nr_reclaimed - nr_reclaimed);
>
> + zone_reclaimed += sc->nr_reclaimed - nr_reclaimed;
> +
>   } while (should_continue_reclaim(zone, sc->nr_reclaimed - nr_reclaimed,
>  sc->nr_scanned - nr_scanned, sc));
> +
> + return zone_reclaimed;
>  }

You do not actually need a counter here because all that matters is that
a page got reclaimed. It could just as easily have been

	bool zone_reclaimable = false;

	...

	if (sc->nr_reclaimed - nr_reclaimed)
		zone_reclaimable = true;

	...

	return zone_reclaimable;

so that zone[s]_reclaimable is always a boolean and not sometimes a boolean
and sometimes a counter.


  
>  /* Returns true if compaction should go ahead for a high-order request */
> @@ -2340,8 +2345,10 @@ static inline bool compaction_ready(struct zone *zone,
> int order)
>   *
>   * If a zone is deemed to be full of pinned pages then just give it a light
>   * scan then give up on it.
> + *
> + * Returns whether the zones overall are reclaimable or not.
>   */

Returns true if a zone was reclaimable

> -static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> +static bool shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  {
>   struct zoneref *z;
>   struct zone *zone;
> @@ -2354,6 +2361,7 @@ static void shrink_zones(struct zonelist *zonelist,
> struct scan_control *sc)
>   .gfp_mask = sc->gfp_mask,
>   };
>   enum zone_type requested_highidx = gfp_zone(sc->gfp_mask);
> + bool all_unreclaimable = true;
>
>   /*
>    * If the number of buffer_heads in the machine exceeds the maximum
> @@ -2368,6 +2376,8 @@ static void shrink_zones(struct zonelist *zonelist,
> struct scan_control *sc)
>
>   for_each_zone_zonelist_nodemask(zone, z, zonelist,
>   gfp_zone(sc->gfp_mask), sc->nodemask) {
> + unsigned long zone_reclaimed = 0;
> +
>   if (!populated_zone(zone))
>   continue;
>   /*
> @@ -2414,10 +2424,15 @@ static void shrink_zones(struct zonelist *zonelist,
> struct scan_control *sc)
>   nr_soft_scanned);
>   sc->nr_reclaimed += nr_soft_reclaimed;
>   sc->nr_scanned += nr_soft_scanned;
> + zone_reclaimed += nr_soft_reclaimed;
>   /* need some check for avoid more shrink_zone() */
>   }
>
> - shrink_zone(zone, sc);
> + zone_reclaimed += shrink_zone(zone, sc);
> +
> + if (zone_reclaimed ||
> + (global_reclaim(sc) && zone_reclaimable(zone)))
> + all_unreclaimable = false;
>   }
>

This is where you don't need the counter as such. It could just as
easily have been

	bool reclaimable = false;

	if (shrink_zone(zone, sc))
		reclaimable = true;

	if (!reclaimable && global_reclaim(sc) && zone_reclaimable(zone))
		reclaimable = true;

	return reclaimable;

It doesn't matter as such, it's just zone_reclaimed is implemented as a
counter but not used as one.
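
The boolean variant suggested above can be exercised in a small userspace model. This is a sketch under assumptions, not mm/vmscan.c: `model_zone`, `model_shrink_zone` and `model_shrink_zones` are hypothetical names, and the `still_reclaimable` field stands in for zone_reclaimable().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-zone state for one reclaim pass. */
struct model_zone {
	unsigned long reclaimed;  /* pages this pass managed to reclaim */
	bool still_reclaimable;   /* stand-in for zone_reclaimable() */
};

/* Boolean result: did this zone give back at least one page? */
static bool model_shrink_zone(const struct model_zone *z)
{
	return z->reclaimed > 0;
}

/* Returns true if any zone in the list was reclaimable. */
static bool model_shrink_zones(const struct model_zone *zones, size_t n,
			       bool global_reclaim)
{
	bool reclaimable = false;

	assert(zones != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (model_shrink_zone(&zones[i]))
			reclaimable = true;
		else if (global_reclaim && zones[i].still_reclaimable)
			reclaimable = true;
	}
	return reclaimable;
}
```

Keeping the value a bool end to end means callers cannot mistake it for a page count, which is the point of the review comment.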

   /*
> @@ -2439,26 +2454,8 @@ static void shrink_zones(struct zonelist *zonelist,
> struct scan_control *sc)
>    * promoted it to __GFP_HIGHMEM.
>    */
>   sc->gfp_mask = orig_mask;
> -}
>
> -/* All zones in zonelist are unreclaimable? */
> -static bool all_unreclaimable(struct zonelist *zonelist,
> - struct scan_control *sc)
> -{
> - struct zoneref *z;
> - struct zone *zone;
> -
> - for_each_zone_zonelist_nodemask(zone, z, zonelist,
> - gfp_zone(sc->gfp_mask), sc->nodemask) {
> - if (!populated_zone(zone))
> - continue;

[PATCH 08/21] kgr: sched.h, introduce kgr_task_safe helper

2014-06-23 Thread Jiri Slaby

To be used from some kthreads.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
---
 include/linux/sched.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 306f4f0c987a..6bc2d63a59c4 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2974,6 +2974,15 @@ static inline void mm_init_owner(struct mm_struct *mm, 
struct task_struct *p)
 }
 #endif /* CONFIG_MEMCG */
 
+#if IS_ENABLED(CONFIG_KGRAFT)
+static inline void kgr_task_safe(struct task_struct *p)
+{
+   clear_bit(0, &task_thread_info(p)->kgr_in_progress);
+}
+#else
+static inline void kgr_task_safe(struct task_struct *p) { }
+#endif /* IS_ENABLED(CONFIG_KGRAFT) */
+
 static inline unsigned long task_rlimit(const struct task_struct *tsk,
unsigned int limit)
 {
-- 
2.0.0
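
What the helper does to the flag word can be illustrated with C11 atomics in userspace. This is only a sketch: `model_thread_info` and the three functions are hypothetical, and atomic_fetch_and()/atomic_fetch_or() merely stand in for the kernel's clear_bit()/set_bit() on bit 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task flag word. */
struct model_thread_info {
	atomic_ulong kgr_in_progress;
};

/* Patching started: the task still has to pass a safe point. */
static void model_mark_in_progress(struct model_thread_info *ti)
{
	assert(ti != NULL);
	atomic_fetch_or(&ti->kgr_in_progress, 1UL);
}

/* Counterpart of kgr_task_safe(): atomically clear bit 0. */
static void model_task_safe(struct model_thread_info *ti)
{
	assert(ti != NULL);
	atomic_fetch_and(&ti->kgr_in_progress, ~1UL);
}

static bool model_in_progress(struct model_thread_info *ti)
{
	return (atomic_load(&ti->kgr_in_progress) & 1UL) != 0;
}
```

A kthread would call the equivalent of model_task_safe() at a point in its main loop where it holds no reference to old code.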

[PATCH 10/21] kgr: kthreads support

2014-06-23 Thread Jiri Slaby

Wake up kthreads so that they cycle through kgr_task_safe, either
by an explicit call to it or implicitly via try_to_freeze. This
ensures that nobody keeps using the old version of the code, so the
kgraft core can push everybody to the new version by switching to the
fast path.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Tejun Heo <t...@kernel.org>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
---
 kernel/kgraft.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/kernel/kgraft.c b/kernel/kgraft.c
index ce2f09f3b544..2fe1d922ebac 100644
--- a/kernel/kgraft.c
+++ b/kernel/kgraft.c
@@ -53,7 +53,7 @@ static void kgr_stub_slow(unsigned long ip, unsigned long 
parent_ip,
 {
struct kgr_loc_caches *c = ops->private;
 
-   if (kgr_task_in_progress(current) && current->mm) {
+   if (kgr_task_in_progress(current)) {
pr_info("kgr: slow stub: calling old code at %lx\n",
c->old);
kgr_set_regs_ip(regs, c->old + MCOUNT_INSN_SIZE);
@@ -71,11 +71,7 @@ static bool kgr_still_patching(void)
 
read_lock(&tasklist_lock);
for_each_process(p) {
-   /*
-* TODO
-*   kernel thread codepaths not supported and silently ignored
-*/
-   if (kgr_task_in_progress(p) && p->mm) {
+   if (kgr_task_in_progress(p)) {
pr_info("pid %d (%s) still in kernel after timeout\n",
p->pid, p->comm);
failed = true;
@@ -123,13 +119,23 @@ static void kgr_work_fn(struct work_struct *work)
mutex_unlock(&kgr_in_progress_lock);
 }
 
-static void kgr_mark_processes(void)
+static void kgr_handle_processes(void)
 {
struct task_struct *p;
 
read_lock(&tasklist_lock);
-   for_each_process(p)
+   for_each_process(p) {
kgr_mark_task_in_progress(p);
+
+   /* wake up kthreads, they will clean the progress flag */
+   if (!p->mm) {
+   /*
+* this is incorrect for kthreads waiting still for
+* their first wake_up.
+*/
+   wake_up_process(p);
+   }
+   }
read_unlock(&tasklist_lock);
 }
 
@@ -274,8 +280,7 @@ static int kgr_patch_code(const struct kgr_patch_fun 
*patch_fun, bool final)
  * kgr_start_patching -- the entry for a kgraft patch
  * @patch: patch to be applied
  *
- * Start patching of code that is neither running in IRQ context nor
- * kernel thread.
+ * Start patching of code that is not running in IRQ context.
  */
 int kgr_start_patching(const struct kgr_patch *patch)
 {
@@ -314,7 +319,7 @@ int kgr_start_patching(const struct kgr_patch *patch)
kgr_patch = patch;
mutex_unlock(&kgr_in_progress_lock);
 
-   kgr_mark_processes();
+   kgr_handle_processes();
 
/*
 * give everyone time to exit kernel, and check after a while
-- 
2.0.0

[PATCH 17/21] kgr: exercise non-present function

2014-06-23 Thread Jiri Slaby

This is to test the newly added functionality: non-fatal patching of
yet unknown functions.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
---
 samples/kgraft/kgraft_patcher.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/samples/kgraft/kgraft_patcher.c b/samples/kgraft/kgraft_patcher.c
index 5d02a908bc26..e96eef840397 100644
--- a/samples/kgraft/kgraft_patcher.c
+++ b/samples/kgraft/kgraft_patcher.c
@@ -63,10 +63,17 @@ static bool new_capable(int cap)
 }
 KGR_PATCHED_FUNCTION(capable, new_capable, true);
 
+static void new_function(unsigned long data)
+{
+   pr_info("kgr-patcher: %s\n", __func__);
+}
+KGR_PATCHED_FUNCTION(unknown_function, new_function, false);
+
 static struct kgr_patch patch = {
.patches = {
KGR_PATCH(SyS_iopl),
KGR_PATCH(capable),
+   KGR_PATCH(unknown_function),
KGR_PATCH_END
}
 };
-- 
2.0.0

[PATCH 14/21] kgr: add procfs interface for per-process 'kgr_in_progress'

2014-06-23 Thread Jiri Slaby

From: Jiri Kosina <jkos...@suse.cz>

Instead of flooding dmesg with data about tasks which haven't yet been
migrated to the new universe, create a 'kgr_in_progress' file in
/proc/<pid>/ so that it's possible to easily script the checks/actions
in userspace.

js: use the kgr helper

Signed-off-by: Jiri Kosina <jkos...@suse.cz>
Signed-off-by: Jiri Slaby <jsl...@suse.cz> [simplification]
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
---
 fs/proc/base.c  | 11 +++
 kernel/kgraft.c |  3 +--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 2d696b0c93bf..60f7b1ce5d1c 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -87,6 +87,7 @@
 #include <linux/slab.h>
 #include <linux/flex_array.h>
 #include <linux/posix-timers.h>
+#include <linux/kgraft.h>
 #ifdef CONFIG_HARDWALL
 #include <asm/hardwall.h>
 #endif
@@ -2106,6 +2107,13 @@ static const struct file_operations 
proc_timers_operations = {
 };
 #endif /* CONFIG_CHECKPOINT_RESTORE */
 
+#if IS_ENABLED(CONFIG_KGRAFT)
+static int proc_pid_kgr_in_progress(struct task_struct *task, char *buffer)
+{
+   return sprintf(buffer, "%d\n", kgr_task_in_progress(task));
+}
+#endif /* IS_ENABLED(CONFIG_KGRAFT) */
+
 static int proc_pident_instantiate(struct inode *dir,
struct dentry *dentry, struct task_struct *task, const void *ptr)
 {
@@ -2638,6 +2646,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 #ifdef CONFIG_CHECKPOINT_RESTORE
REG("timers", S_IRUGO, proc_timers_operations),
 #endif
+#if IS_ENABLED(CONFIG_KGRAFT)
+   INF("kgr_in_progress",  S_IRUSR, proc_pid_kgr_in_progress),
+#endif
 };
 
 static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
diff --git a/kernel/kgraft.c b/kernel/kgraft.c
index ce733f3f2640..0d04f1cbcd4a 100644
--- a/kernel/kgraft.c
+++ b/kernel/kgraft.c
@@ -77,9 +77,8 @@ static bool kgr_still_patching(void)
read_lock(&tasklist_lock);
for_each_process(p) {
if (kgr_task_in_progress(p)) {
-   pr_info("pid %d (%s) still in kernel after timeout\n",
-   p->pid, p->comm);
failed = true;
+   break;
}
}
read_unlock(&tasklist_lock);
-- 
2.0.0
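
A userspace check against the new per-process file could look like the sketch below. The helper names `parse_kgr_flag` and `read_kgr_in_progress` are hypothetical; the only things taken from the patch are the file name kgr_in_progress under /proc/<pid>/ and its single-integer "%d\n" format. Note that the entry is created with S_IRUSR, so the reader is assumed to run as the owner of the task or as root.

```c
#include <assert.h>
#include <stdio.h>

/* Parse the single integer the proc file prints; -1 on malformed input. */
static int parse_kgr_flag(FILE *f)
{
	int v;

	assert(f != NULL);
	return fscanf(f, "%d", &v) == 1 ? v : -1;
}

/* Read /proc/<pid>/kgr_in_progress; -1 if the file is absent or unreadable. */
static int read_kgr_in_progress(long pid)
{
	char path[64];
	FILE *f;
	int v;

	snprintf(path, sizeof(path), "/proc/%ld/kgr_in_progress", pid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	v = parse_kgr_flag(f);
	fclose(f);
	return v;
}
```

A monitoring script would loop over the pids of interest and treat any non-zero result as "not yet migrated"; on a kernel without this patch every call simply returns -1.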

[PATCH 12/21] kgr: add MAINTAINERS entry

2014-06-23 Thread Jiri Slaby

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Jiri Kosina <jkos...@suse.cz>
Cc: Vojtech Pavlik <vojt...@suse.cz>
---
 MAINTAINERS | 9 +
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3f2e171047b9..73733eb50bb3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5239,6 +5239,15 @@ F:   include/linux/kdb.h
 F: include/linux/kgdb.h
 F: kernel/debug/
 
+KGRAFT
+M: Jiri Kosina <jkos...@suse.cz>
+M: Jiri Slaby <jsl...@suse.cz>
+M: Vojtech Pavlik <vojt...@suse.cz>
+F: arch/x86/include/asm/kgraft.h
+F: include/linux/kgraft.h
+F: kernel/kgraft.c
+F: samples/kgraft/
+
 KMEMCHECK
 M: Vegard Nossum <vegar...@ifi.uio.no>
 M: Pekka Enberg <penb...@kernel.org>
-- 
2.0.0

[PATCH 15/21] kgr: make a per-process 'in progress' flag a single bit

2014-06-23 Thread Jiri Slaby

From: Jiri Kosina <jkos...@suse.cz>

Having the per-task 'kgr_in_progress' flag stored as a long is a waste
of space, and manipulating it is likely slower than just performing
single bit operations. Convert the flag to a thread info flag.

Additionally, making the KGR TI_flag part of _TIF_ALLWORK_MASK and
_TIF_WORK_SYSCALL_ENTRY allows for offloading the flag manipulation to
slow code paths.

js: use *_tsk_thread_flag helpers

Signed-off-by: Jiri Kosina <jkos...@suse.cz>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
---
 arch/x86/include/asm/thread_info.h |  7 ---
 arch/x86/kernel/asm-offsets.c  |  1 -
 arch/x86/kernel/entry_64.S | 12 +---
 include/linux/kgraft.h |  5 ++---
 include/linux/sched.h  |  2 +-
 5 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h 
b/arch/x86/include/asm/thread_info.h
index e44c8fda9c43..53df17c72359 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -35,7 +35,6 @@ struct thread_info {
void __user *sysenter_return;
unsigned int sig_on_uaccess_error:1;
unsigned int uaccess_err:1;  /* uaccess failed */
-   unsigned long   kgr_in_progress;
 };
 
 #define INIT_THREAD_INFO(tsk)  \
@@ -88,6 +87,7 @@ struct thread_info {
 #define TIF_IO_BITMAP  22  /* uses I/O bitmap */
 #define TIF_FORCED_TF  24  /* true if TF in eflags artificially */
 #define TIF_BLOCKSTEP  25  /* set when we want DEBUGCTLMSR_BTF */
+#define TIF_KGR_IN_PROGRESS 26  /* kGraft patching running */
 #define TIF_LAZY_MMU_UPDATES   27  /* task is updating the mmu lazily */
 #define TIF_SYSCALL_TRACEPOINT 28  /* syscall tracepoint instrumentation */
 #define TIF_ADDR32 29  /* 32-bit address space on 64 bits */
@@ -112,6 +112,7 @@ struct thread_info {
 #define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
 #define _TIF_FORCED_TF (1 << TIF_FORCED_TF)
 #define _TIF_BLOCKSTEP (1 << TIF_BLOCKSTEP)
+#define _TIF_KGR_IN_PROGRESS   (1 << TIF_KGR_IN_PROGRESS)
 #define _TIF_LAZY_MMU_UPDATES  (1 << TIF_LAZY_MMU_UPDATES)
 #define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
 #define _TIF_ADDR32 (1 << TIF_ADDR32)
@@ -121,7 +122,7 @@ struct thread_info {
 #define _TIF_WORK_SYSCALL_ENTRY\
(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_AUDIT |   \
 _TIF_SECCOMP | _TIF_SINGLESTEP | _TIF_SYSCALL_TRACEPOINT | \
-_TIF_NOHZ)
+_TIF_NOHZ | _TIF_KGR_IN_PROGRESS)
 
 /* work to do in syscall_trace_leave() */
 #define _TIF_WORK_SYSCALL_EXIT \
@@ -137,7 +138,7 @@ struct thread_info {
 /* work to do on any return to user space */
 #define _TIF_ALLWORK_MASK  \
((0x  ~_TIF_SECCOMP) | _TIF_SYSCALL_TRACEPOINT |   \
-   _TIF_NOHZ)
+   _TIF_NOHZ | _TIF_KGR_IN_PROGRESS)
 
 /* Only used for 64 bit */
 #define _TIF_DO_NOTIFY_MASK\
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 0db0437967a2..9f6b9341950f 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -32,7 +32,6 @@ void common(void) {
OFFSET(TI_flags, thread_info, flags);
OFFSET(TI_status, thread_info, status);
OFFSET(TI_addr_limit, thread_info, addr_limit);
-   OFFSET(TI_kgr_in_progress, thread_info, kgr_in_progress);
 
BLANK();
OFFSET(crypto_tfm_ctx_offset, crypto_tfm, __crt_ctx);
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index a7c570abc918..edaa5abd58f9 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -409,7 +409,6 @@ GLOBAL(system_call_after_swapgs)
movq  %rax,ORIG_RAX-ARGOFFSET(%rsp)
movq  %rcx,RIP-ARGOFFSET(%rsp)
CFI_REL_OFFSET rip,RIP-ARGOFFSET
-   movq $0, TI_kgr_in_progress+THREAD_INFO(%rsp,RIP-ARGOFFSET)
testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
jnz tracesys
 system_call_fastpath:
@@ -434,7 +433,6 @@ sysret_check:
LOCKDEP_SYS_EXIT
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
-   movq $0, TI_kgr_in_progress+THREAD_INFO(%rsp,RIP-ARGOFFSET)
movl TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET),%edx
andl %edi,%edx
jnz  sysret_careful
@@ -454,6 +452,9 @@ sysret_check:
/* Handle reschedules */
/* edx: work, edi: workmask */
 sysret_careful:
+#if IS_ENABLED(CONFIG_KGRAFT)
+   andl $~_TIF_KGR_IN_PROGRESS,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
+#endif
bt $TIF_NEED_RESCHED,%edx
jnc sysret_signal
TRACE_IRQS_ON
@@ -517,6 +518,9 @@ sysret_audit:
 
/* Do syscall tracing */
 tracesys:
+#if
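
The idea of folding the flag into the existing work masks can be sketched in userspace. Only the bit number 26 comes from the patch; the MODEL_ names, the second flag and the two functions are hypothetical and exist purely to show that a single mask test on the fast path is enough to divert a task to the slow path, where the bit is then cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_TIF_KGR_IN_PROGRESS_BIT 26 /* bit number used in the patch */
#define MODEL_TIF_KGR_IN_PROGRESS (UINT32_C(1) << MODEL_TIF_KGR_IN_PROGRESS_BIT)
#define MODEL_TIF_OTHER_WORK      (UINT32_C(1) << 0) /* any unrelated work flag */

/* Work mask: the fast path tests this one word and nothing else. */
#define MODEL_WORK_MASK (MODEL_TIF_OTHER_WORK | MODEL_TIF_KGR_IN_PROGRESS)

/* Fast path check: is there any reason to leave the fast path? */
static bool model_needs_slow_path(uint32_t flags)
{
	return (flags & MODEL_WORK_MASK) != 0;
}

/* Slow path: clear the kGraft bit, leaving every other flag untouched. */
static uint32_t model_slow_path_clear(uint32_t flags)
{
	assert(model_needs_slow_path(flags));
	return flags & ~MODEL_TIF_KGR_IN_PROGRESS;
}
```

Tasks that were never marked pay nothing extra, since the mask test is already performed on every syscall entry and return.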

[PATCH 19/21] kgr: expose global 'in_progress' state through procfs

2014-06-23 Thread Jiri Slaby

From: Jiri Kosina <jkos...@suse.cz>

In addition to having a per-process flag that shows which processes have
already been migrated, it's useful to have a global flag that shows
whether a patching operation is currently in progress, without having
to traverse all /proc entries.

js: handle error

Reported-by: Libor Pechacek <lpecha...@suse.cz>
Signed-off-by: Jiri Kosina <jkos...@suse.cz>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
---
 kernel/kgraft.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/kernel/kgraft.c b/kernel/kgraft.c
index 89414957cf74..b427812c12cd 100644
--- a/kernel/kgraft.c
+++ b/kernel/kgraft.c
@@ -26,6 +26,8 @@
 #include <linux/spinlock.h>
 #include <linux/types.h>
 #include <linux/workqueue.h>
+#include <linux/seq_file.h>
+#include <linux/proc_fs.h>
 
 static int kgr_patch_code(const struct kgr_patch *patch,
struct kgr_patch_fun *patch_fun, bool final);
@@ -382,6 +384,25 @@ unlock_free:
 }
 EXPORT_SYMBOL_GPL(kgr_start_patching);
 
+static int kgr_show(struct seq_file *m, void *v)
+{
+   seq_printf(m, "%d\n", kgr_in_progress);
+   return 0;
+}
+
+static int kgr_open(struct inode *inode, struct file *file)
+{
+   return single_open(file, kgr_show, NULL);
+}
+
+static const struct file_operations kgr_fops = {
+   .owner  = THIS_MODULE,
+   .open   = kgr_open,
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   .release= single_release,
+};
+
 static int __init kgr_init(void)
 {
if (ftrace_is_dead()) {
@@ -398,6 +419,9 @@ static int __init kgr_init(void)
kgr_initialized = true;
pr_info("kgr: successfully initialized\n");
 
+   if (!proc_create("kgr_in_progress", 0, NULL, &kgr_fops))
+   pr_warn("kgr: cannot create kgr_in_progress in procfs\n");
+
return 0;
 }
 module_init(kgr_init);
-- 
2.0.0

[PATCH 3.11 14/93] drm/radeon: fix register typo on si

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me know.

--

From: Alex Deucher <alexdeuc...@gmail.com>

commit 4955bb073f1be6dd884b5d10041ba4bade6495bf upstream.

Probably a copy paste typo.

Signed-off-by: Alex Deucher <alexander.deuc...@amd.com>
Signed-off-by: Christian König <christian.koe...@amd.com>
[ luis: backported to 3.11: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 drivers/gpu/drm/radeon/sid.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/sid.h b/drivers/gpu/drm/radeon/sid.h
index 0b55877c26b4..ff027425fe98 100644
--- a/drivers/gpu/drm/radeon/sid.h
+++ b/drivers/gpu/drm/radeon/sid.h
@@ -102,8 +102,8 @@
 #define CG_SPLL_FUNC_CNTL_4 0x60c
 
 #define SPLL_CNTL_MODE  0x618
-#  define SPLL_REFCLK_SEL(x)   ((x) << 8)
-#  define SPLL_REFCLK_SEL_MASK 0xFF00
+#  define SPLL_REFCLK_SEL(x)   ((x) << 26)
+#  define SPLL_REFCLK_SEL_MASK (3 << 26)
 
 #define CG_SPLL_SPREAD_SPECTRUM 0x620
 #define SSEN (1 << 0)
-- 
1.9.1


[PATCH 04/21] kgr: add testing kgraft patch

2014-06-23 Thread Jiri Slaby

This is intended to be a presentation of the kGraft engine, so it is
placed into samples/ directory.

It patches two chosen functions sys_iopl() and sys_capable() to print
a message in addition to the original functionality.

js: fix filename in Makefile (thanks mmarek)

Signed-off-by: Jiri Kosina <jkos...@suse.cz>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Ingo Molnar <mi...@redhat.com>
---
 samples/Kconfig |  4 ++
 samples/Makefile|  3 +-
 samples/kgraft/Makefile |  1 +
 samples/kgraft/kgraft_patcher.c | 92 +
 4 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 samples/kgraft/Makefile
 create mode 100644 samples/kgraft/kgraft_patcher.c

diff --git a/samples/Kconfig b/samples/Kconfig
index 6181c2cc9ca0..b33a397dfc58 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -55,6 +55,10 @@ config SAMPLE_KDB
  Build an example of how to dynamically add the hello
  command to the kdb shell.
 
+config SAMPLE_KGRAFT_PATCHER
+   tristate "Build kGraft patcher example -- loadable modules only"
+   depends on KGRAFT && m
+
 config SAMPLE_RPMSG_CLIENT
tristate "Build rpmsg client sample -- loadable modules only"
depends on RPMSG && m
diff --git a/samples/Makefile b/samples/Makefile
index 1a60c62e2045..a0d1626bd5bb 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -1,4 +1,5 @@
 # Makefile for Linux samples code
 
 obj-$(CONFIG_SAMPLES)  += kobject/ kprobes/ trace_events/ \
-  hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/
+  hw_breakpoint/ kfifo/ kdb/ kgraft/ \
+  hidraw/ rpmsg/ seccomp/
diff --git a/samples/kgraft/Makefile b/samples/kgraft/Makefile
new file mode 100644
index ..888a332c3148
--- /dev/null
+++ b/samples/kgraft/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_SAMPLE_KGRAFT_PATCHER) += kgraft_patcher.o
diff --git a/samples/kgraft/kgraft_patcher.c b/samples/kgraft/kgraft_patcher.c
new file mode 100644
index ..abb0c05bf739
--- /dev/null
+++ b/samples/kgraft/kgraft_patcher.c
@@ -0,0 +1,92 @@
+/*
+ * kgraft_patcher -- just kick the kGraft infrastructure for test
+ *
+ * We patch two (arbitrarily chosen) functions at once...
+ *
+ *  Copyright (c) 2013-2014 SUSE
+ *   Authors: Jiri Kosina
+ *   Vojtech Pavlik
+ *   Jiri Slaby
+ */
+
+/*
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/kgraft.h>
+#include <linux/kallsyms.h>
+#include <linux/sched.h>
+#include <linux/types.h>
+#include <linux/capability.h>
+#include <linux/ptrace.h>
+
+#include <asm/processor.h>
+
+/*
+ * This all should be autogenerated from the patched sources
+ */
+
+asmlinkage long kgr_new_sys_iopl(unsigned int level)
+{
+   struct pt_regs *regs = current_pt_regs();
+   unsigned int old = (regs->flags >> 12) & 3;
+   struct thread_struct *t = &current->thread;
+
+   printk(KERN_DEBUG "kgr-patcher: this is a new sys_iopl()\n");
+
+   if (level > 3)
+   return -EINVAL;
+   /* Trying to gain more privileges? */
+   if (level > old) {
+   if (!capable(CAP_SYS_RAWIO))
+   return -EPERM;
+   }
+   regs->flags = (regs->flags & ~X86_EFLAGS_IOPL) | (level << 12);
+   t->iopl = level << 12;
+   set_iopl_mask(t->iopl);
+
+   return 0;
+}
+KGR_PATCHED_FUNCTION(SyS_iopl, kgr_new_sys_iopl);
+
+static bool new_capable(int cap)
+{
+   printk(KERN_DEBUG "kgr-patcher: this is a new capable()\n");
+
+   return ns_capable(&init_user_ns, cap);
+}
+KGR_PATCHED_FUNCTION(capable, new_capable);
+
+static const struct kgr_patch patch = {
+   .patches = {
+   KGR_PATCH(SyS_iopl),
+   KGR_PATCH(capable),
+   KGR_PATCH_END
+   }
+};
+
+static int __init kgr_patcher_init(void)
+{
+   /* removing not supported */
+   __module_get(THIS_MODULE);
+   kgr_start_patching(&patch);
+   return 0;
+}
+
+static void __exit kgr_patcher_cleanup(void)
+{
+   /* extra care needs to be taken when freeing ftrace_ops->private */
+   pr_err("removing now buggy!\n");
+}
+
+module_init(kgr_patcher_init);
+module_exit(kgr_patcher_cleanup);
+
+MODULE_LICENSE("GPL");
+
-- 
2.0.0


[PATCH 20/21] kgr: rephrase the kGraft failed message

2014-06-23 Thread Jiri Slaby

From: Libor Pechacek <lpecha...@suse.cz>

kGraft not succeeding on the first attempt can hardly be called a
failure.  kGraft is merely waiting for sleeping processes to wake up
and get out of the way.

Signed-off-by: Libor Pechacek <lpecha...@suse.cz>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
---
 kernel/kgraft.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kgraft.c b/kernel/kgraft.c
index b427812c12cd..c4b1777604e1 100644
--- a/kernel/kgraft.c
+++ b/kernel/kgraft.c
@@ -108,7 +108,7 @@ static void kgr_finalize(void)
 static void kgr_work_fn(struct work_struct *work)
 {
if (kgr_still_patching()) {
-   pr_info("kgr failed after timeout (%d), still in degraded mode\n",
+   pr_info("kgr still in progress after timeout (%d)\n",
KGR_TIMEOUT);
/* recheck again later */
queue_delayed_work(kgr_wq, &kgr_work, KGR_TIMEOUT * HZ);
-- 
2.0.0


[PATCH 3.11 13/93] drm/radeon: handle non-VGA class pci devices with ATRM

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Alex Deucher <alexdeuc...@gmail.com>

commit d8ade3526b2aa0505132c404c05a38b73ea15490 upstream.

Newer PX systems have non-VGA pci class dGPUs.  Update
the ATRM fetch method to handle those cases.

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=75401

Signed-off-by: Alex Deucher <alexander.deuc...@amd.com>
Signed-off-by: Christian König <christian.koe...@amd.com>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 drivers/gpu/drm/radeon/radeon_bios.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_bios.c b/drivers/gpu/drm/radeon/radeon_bios.c
index 061b227dae0c..b131520521e4 100644
--- a/drivers/gpu/drm/radeon/radeon_bios.c
+++ b/drivers/gpu/drm/radeon/radeon_bios.c
@@ -196,6 +196,20 @@ static bool radeon_atrm_get_bios(struct radeon_device *rdev)
}
}
 
+   if (!found) {
+   while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_OTHER << 8, pdev)) != NULL) {
+   dhandle = ACPI_HANDLE(&pdev->dev);
+   if (!dhandle)
+   continue;
+
+   status = acpi_get_handle(dhandle, "ATRM", &atrm_handle);
+   if (!ACPI_FAILURE(status)) {
+   found = true;
+   break;
+   }
+   }
+   }
+
if (!found)
return false;
 
-- 
1.9.1


[PATCH 3.11 11/93] drm/gf119-/disp: fix nasty bug which can clobber SOR0's clock setup

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Ben Skeggs <bske...@redhat.com>

commit 0f1d360b2ee3a2a0f510d3f1bcd3f5ebe5d41265 upstream.

Fixes a LVDS bleed issue on Lenovo W530 that can occur under a
number of circumstances.

Signed-off-by: Ben Skeggs <bske...@redhat.com>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 drivers/gpu/drm/nouveau/core/engine/disp/nvd0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/core/engine/disp/nvd0.c b/drivers/gpu/drm/nouveau/core/engine/disp/nvd0.c
index 52dd7a1db729..8f336558c681 100644
--- a/drivers/gpu/drm/nouveau/core/engine/disp/nvd0.c
+++ b/drivers/gpu/drm/nouveau/core/engine/disp/nvd0.c
@@ -678,7 +678,7 @@ exec_clkcmp(struct nv50_disp_priv *priv, int head, int id,
}
 
if (outp == 8)
-   return false;
+   return conf;
 
data = exec_lookup(priv, head, outp, ctrl, dcb, &ver, &hdr, &cnt, &len, &info1);
if (data == 0x)
-- 
1.9.1


[PATCH 3.11 07/93] af_iucv: wrong mapping of sent and confirmed skbs

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Ursula Braun <ursula.br...@de.ibm.com>

commit f5738e2ef88070ef1372e6e718124d88e9abe4ac upstream.

When sending data through IUCV a MESSAGE COMPLETE interrupt
signals that sent data memory can be freed or reused again.
With commit f9c41a62bba3f3f7ef3541b2a025e3371bcbba97
"af_iucv: fix recvmsg by replacing skb_pull() function" the
MESSAGE COMPLETE callback iucv_callback_txdone() identifies
the wrong skb as being confirmed, which leads to data corruption.
This patch fixes the skb mapping logic in iucv_callback_txdone().
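
The effect of the inverted comparison can be illustrated with a small user-space sketch. It assumes the send queue is reduced to a plain array of tags; `find_confirmed` and `find_confirmed_buggy` are hypothetical names for illustration, not functions from af_iucv.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed lookup: index of the first queued entry whose tag equals the
 * tag carried by the MESSAGE COMPLETE notification, or -1 if none. */
static int find_confirmed(const uint32_t *queued_tags, size_t n, uint32_t msg_tag)
{
	assert(queued_tags != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (queued_tags[i] == msg_tag)
			return (int)i;
	}
	return -1;
}

/* Pre-fix lookup: the != comparison stops at the first entry that does
 * NOT match, so an unrelated buffer is treated as confirmed. */
static int find_confirmed_buggy(const uint32_t *queued_tags, size_t n, uint32_t msg_tag)
{
	assert(queued_tags != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (queued_tags[i] != msg_tag)
			return (int)i;
	}
	return -1;
}
```

With queued tags {7, 8, 9} and a confirmation for tag 8, the fixed lookup selects index 1 while the pre-fix lookup selects index 0, which is the kind of mismatch the commit message describes.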

Signed-off-by: Ursula Braun <ursula.br...@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blasc...@de.ibm.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 net/iucv/af_iucv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index c4b7218058b6..1465363a452b 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1829,7 +1829,7 @@ static void iucv_callback_txdone(struct iucv_path *path,
spin_lock_irqsave(&list->lock, flags);
 
while (list_skb != (struct sk_buff *)list) {
-   if (msg->tag != IUCV_SKB_CB(list_skb)->tag) {
+   if (msg->tag == IUCV_SKB_CB(list_skb)->tag) {
this = list_skb;
break;
}
-- 
1.9.1


[PATCH 3.11 09/93] perf: Limit perf_event_attr::sample_period to 63 bits

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Peter Zijlstra <pet...@infradead.org>

commit 0819b2e30ccb93edf04876237b6205eef84ec8d2 upstream.

Vince reported that using a large sample_period (one with bit 63 set)
results in wreckage since while the sample_period is fundamentally
unsigned (negative periods don't make sense) the way we implement
things very much relies on signed logic.

So limit sample_period to 63 bits to avoid tripping over this.
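
A minimal user-space sketch of the check described above, assuming the period is carried in a 64-bit unsigned field. The helper names `sample_period_valid` and `period_as_signed` are hypothetical and do not appear in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the period fits in 63 bits. Bit 63 is the sign bit
 * of a signed 64-bit integer, so any value that sets it would turn
 * negative once signed arithmetic is applied to it. */
static bool sample_period_valid(uint64_t sample_period)
{
	return (sample_period & (1ULL << 63)) == 0;
}

/* After validation the value can be handled as a signed quantity
 * without ever becoming negative. */
static int64_t period_as_signed(uint64_t sample_period)
{
	assert(sample_period_valid(sample_period));
	return (int64_t)sample_period;
}
```

The largest accepted value is therefore `(1ULL << 63) - 1`; anything above it is rejected, mirroring the `-EINVAL` path in the patch.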

Reported-by: Vince Weaver <vincent.wea...@maine.edu>
Signed-off-by: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/n/tip-p25fhunibl4y3qi0zuqmy...@git.kernel.org
Signed-off-by: Thomas Gleixner <t...@linutronix.de>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 kernel/events/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9c511b4296db..fe0c665c54d2 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6816,6 +6816,9 @@ SYSCALL_DEFINE5(perf_event_open,
if (attr.freq) {
if (attr.sample_freq > sysctl_perf_event_sample_rate)
return -EINVAL;
+   } else {
+   if (attr.sample_period & (1ULL << 63))
+   return -EINVAL;
}
 
/*
-- 
1.9.1


[PATCH 3.11 08/93] net: filter: s390: fix JIT address randomization

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Heiko Carstens <heiko.carst...@de.ibm.com>

commit e84d2f8d2ae33c8215429824e1ecf24cbca9645e upstream.

This is the s390 variant of Alexei's JIT bug fix.
(patch description below stolen from Alexei's patch)

bpf_alloc_binary() adds 128 bytes of room to JITed program image
and rounds it up to the nearest page size. If image size is close
to page size (like 4000), it is rounded to two pages:
round_up(4000 + 4 + 128) == 8192
then 'hole' is computed as 8192 - (4000 + 4) = 4188
If prandom_u32() % hole selects a number >= PAGE_SIZE - sizeof(*header)
then kernel will crash during bpf_jit_free():

kernel BUG at arch/x86/mm/pageattr.c:887!
Call Trace:
 [81037285] change_page_attr_set_clr+0x135/0x460
 [81694cc0] ? _raw_spin_unlock_irq+0x30/0x50
 [810378ff] set_memory_rw+0x2f/0x40
 [a01a0d8d] bpf_jit_free_deferred+0x2d/0x60
 [8106bf98] process_one_work+0x1d8/0x6a0
 [8106bf38] ? process_one_work+0x178/0x6a0
 [8106c90c] worker_thread+0x11c/0x370

since bpf_jit_free() does:
  unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
  struct bpf_binary_header *header = (void *)addr;
to compute start address of 'bpf_binary_header'
and header->pages will pass junk to:
  set_memory_rw(addr, header->pages);

Fix it by making sure that &header->image[prandom_u32() % hole] and &header
are in the same page.
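
The arithmetic above can be replayed in a short user-space sketch. It assumes a 4096-byte page and the 4-byte header used in the worked numbers; `SKETCH_PAGE_SIZE` and the function names are hypothetical stand-ins, not the kernel's identifiers.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size for this illustration */

/* Round x up to the next multiple of align (align must be a power of two). */
static size_t round_up_pow2(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Allocation size: program + header + 128 bytes of slack, page aligned. */
static size_t alloc_size(size_t proglen, size_t hdr)
{
	return round_up_pow2(proglen + hdr + 128, SKETCH_PAGE_SIZE);
}

/* Hole before the fix: everything left over after header and program. */
static size_t hole_unclamped(size_t proglen, size_t hdr)
{
	return alloc_size(proglen, hdr) - (proglen + hdr);
}

/* Hole after the fix: clamped so a random offset below it keeps the
 * first instruction inside the same page as the header. */
static size_t hole_clamped(size_t proglen, size_t hdr)
{
	size_t hole = hole_unclamped(proglen, hdr);
	size_t limit = SKETCH_PAGE_SIZE - hdr;

	return hole < limit ? hole : limit;
}
```

For a 4000-byte program the allocation is 8192 bytes and the unclamped hole is 4188, so an offset of 4092 or more would push the code start into the second page; the clamped hole is 4092, which rules that out.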

Fixes: aa2d2c73c21f2 ("s390/bpf,jit: address randomize and write protect jit code")

Reported-by: Alexei Starovoitov <a...@plumgrid.com>
Signed-off-by: Heiko Carstens <heiko.carst...@de.ibm.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 arch/s390/net/bpf_jit_comp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 8ccd6a669804..63c527eb096f 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -811,7 +811,7 @@ static struct bpf_binary_header *bpf_alloc_binary(unsigned int bpfsize,
return NULL;
memset(header, 0, sz);
header->pages = sz / PAGE_SIZE;
-   hole = sz - (bpfsize + sizeof(*header));
+   hole = min(sz - (bpfsize + sizeof(*header)), PAGE_SIZE - sizeof(*header));
/* Insert random number of illegal instructions before BPF code
 * and make sure the first instruction starts at an even address.
 */
-- 
1.9.1


[PATCH 3.11 04/93] net: cpsw: fix null dereference at probe

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Johan Hovold <jhov...@gmail.com>

commit 6954cc1f238199e971ec905c5cc87120806ac981 upstream.

Fix null-pointer dereference at probe when the mdio platform device is
missing (e.g. when it has been disabled in DT).

Signed-off-by: Johan Hovold <jhov...@gmail.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 drivers/net/ethernet/ti/cpsw.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 22a7a4336211..04b39c155c6a 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1548,6 +1548,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
mdio_node = of_find_node_by_phandle(be32_to_cpup(parp));
phyid = be32_to_cpup(parp+1);
mdio = of_find_device_by_node(mdio_node);
+   if (!mdio) {
+   pr_err("Missing mdio platform device\n");
+   return -EINVAL;
+   }
snprintf(slave_data->phy_id, sizeof(slave_data->phy_id),
 PHY_ID_FMT, mdio->name, phyid);
 
-- 
1.9.1


[PATCH 3.11 06/93] mac80211: fix on-channel remain-on-channel

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Johannes Berg <johannes.b...@intel.com>

commit b4b177a5556a686909e643f1e9b6434c10de079f upstream.

Jouni reported that if a remain-on-channel was active on the
same channel as the current operating channel, then the ROC
would start, but any frames transmitted using mgmt-tx on the
same channel would get delayed until after the ROC.

The reason for this is that the ROC starts, but doesn't have
any handling for remain on the same channel, so it stops
the interface queues. The later mgmt-tx then puts the frame
on the interface queues (since it's on the current operating
channel) and thus they get delayed until after the ROC.

To fix this, add some logic to handle remaining on the same
channel specially and not stop the queues etc. in this case.
This not only fixes the bug but also improves behaviour in
this case as data frames etc. can continue to flow.
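
The decision described above can be sketched as a small predicate. This is a simplified user-space model: the kernel compares channel pointers and nl80211 width enums, whereas the `sketch_` types and `roc_is_on_channel` below are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the channel widths named in the patch. */
enum sketch_width {
	SKETCH_WIDTH_5,
	SKETCH_WIDTH_10,
	SKETCH_WIDTH_20,
	SKETCH_WIDTH_40
};

struct sketch_chandef {
	int chan;                 /* channel identifier (a pointer in the kernel) */
	enum sketch_width width;  /* current operating width */
};

/* True when the ROC targets the current operating channel and that
 * channel is at least 20 MHz wide; in that case the queues can keep
 * running and no channel switch is needed. */
static bool roc_is_on_channel(int roc_chan, const struct sketch_chandef *oper)
{
	assert(oper != NULL);
	return roc_chan == oper->chan &&
	       oper->width != SKETCH_WIDTH_5 &&
	       oper->width != SKETCH_WIDTH_10;
}
```

When the predicate is false, the original path still applies: the interface queues are stopped and the hardware is reconfigured for the temporary channel.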

Reported-by: Jouni Malinen <j...@w1.fi>
Tested-by: Jouni Malinen <j...@w1.fi>
Signed-off-by: Johannes Berg <johannes.b...@intel.com>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 net/mac80211/ieee80211_i.h |  1 +
 net/mac80211/offchannel.c  | 27 ---
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 735349bd9a07..18e16d05292b 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -312,6 +312,7 @@ struct ieee80211_roc_work {
 
bool started, abort, hw_begun, notified;
bool to_be_freed;
+   bool on_channel;
 
unsigned long hw_start_time;
 
diff --git a/net/mac80211/offchannel.c b/net/mac80211/offchannel.c
index 11d3f227e11e..e554a246e52c 100644
--- a/net/mac80211/offchannel.c
+++ b/net/mac80211/offchannel.c
@@ -333,7 +333,7 @@ void ieee80211_sw_roc_work(struct work_struct *work)
container_of(work, struct ieee80211_roc_work, work.work);
struct ieee80211_sub_if_data *sdata = roc->sdata;
struct ieee80211_local *local = sdata->local;
-   bool started;
+   bool started, on_channel;
 
mutex_lock(&local->mtx);
 
@@ -354,14 +354,26 @@ void ieee80211_sw_roc_work(struct work_struct *work)
if (!roc->started) {
struct ieee80211_roc_work *dep;
 
-   /* start this ROC */
-   ieee80211_offchannel_stop_vifs(local);
+   WARN_ON(local->use_chanctx);
+
+   /* If actually operating on the desired channel (with at least
+* 20 MHz channel width) don't stop all the operations but still
+* treat it as though the ROC operation started properly, so
+* other ROC operations won't interfere with this one.
+*/
+   roc->on_channel = roc->chan == local->_oper_chandef.chan &&
+ local->_oper_chandef.width != NL80211_CHAN_WIDTH_5 &&
+ local->_oper_chandef.width != NL80211_CHAN_WIDTH_10;
 
-   /* switch channel etc */
+   /* start this ROC */
ieee80211_recalc_idle(local);
 
-   local->tmp_channel = roc->chan;
-   ieee80211_hw_config(local, 0);
+   if (!roc->on_channel) {
+   ieee80211_offchannel_stop_vifs(local);
+
+   local->tmp_channel = roc->chan;
+   ieee80211_hw_config(local, 0);
+   }
 
/* tell userspace or send frame */
ieee80211_handle_roc_started(roc);
@@ -380,9 +392,10 @@ void ieee80211_sw_roc_work(struct work_struct *work)
  finish:
list_del(&roc->list);
started = roc->started;
+   on_channel = roc->on_channel;
ieee80211_roc_notify_destroy(roc, !roc->abort);
 
-   if (started) {
+   if (started && !on_channel) {
ieee80211_flush_queues(local, NULL);
 
local->tmp_channel = NULL;
-- 
1.9.1


[PATCH 3.11 01/93] cfg80211: free sme on connection failures

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Eliad Peller <el...@wizery.com>

commit c1fbb258846dfc425507a093922d2d001e54c3ea upstream.

cfg80211 is notified about connection failures by
__cfg80211_connect_result() call. However, this
function currently does not free cfg80211 sme.

This results in hanging connection attempts in some cases

e.g. when mac80211 authentication attempt is denied,
we have this function call:
ieee80211_rx_mgmt_auth() -> cfg80211_rx_mlme_mgmt() ->
cfg80211_process_auth() -> cfg80211_sme_rx_auth() ->
__cfg80211_connect_result()

but cfg80211_sme_free() never gets called.

Fixes: ceca7b712 ("cfg80211: separate internal SME implementation")
Signed-off-by: Eliad Peller <eliadx.pel...@intel.com>
Signed-off-by: Johannes Berg <johannes.b...@intel.com>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 net/wireless/sme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/wireless/sme.c b/net/wireless/sme.c
index 20e86a95dc4e..2f844eec9c6d 100644
--- a/net/wireless/sme.c
+++ b/net/wireless/sme.c
@@ -242,7 +242,6 @@ void cfg80211_conn_work(struct work_struct *work)
NULL, 0, NULL, 0,
WLAN_STATUS_UNSPECIFIED_FAILURE,
false, NULL);
-   cfg80211_sme_free(wdev);
}
wdev_unlock(wdev);
}
@@ -646,6 +645,7 @@ void __cfg80211_connect_result(struct net_device *dev, const u8 *bssid,
cfg80211_unhold_bss(bss_from_pub(bss));
cfg80211_put_bss(wdev->wiphy, bss);
}
+   cfg80211_sme_free(wdev);
return;
}
 
-- 
1.9.1


[PATCH 3.11 03/93] sched: Use CPUPRI_NR_PRIORITIES instead of MAX_RT_PRIO in cpupri check

2014-06-23 Thread Luis Henriques

3.11.10.12 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Steven Rostedt (Red Hat) <rost...@goodmis.org>

commit 6227cb00cc120f9a43ce8313bb0475ddabcb7d01 upstream.

The check at the beginning of cpupri_find() makes sure that the task_pri
variable does not exceed the cp->pri_to_cpu array length. But that length
is CPUPRI_NR_PRIORITIES not MAX_RT_PRIO, where it will miss the last two
priorities in that array.

As task_pri is computed from convert_prio() which should never be bigger
than CPUPRI_NR_PRIORITIES, the check should cause a panic if it is
hit.
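
A simplified sketch of the difference between the two checks. The constants are assumptions chosen to match the message (an array two entries longer than MAX_RT_PRIO), and the functions only model how many slots the search loop would visit; none of these names exist in the scheduler.

```c
#include <assert.h>

#define SKETCH_MAX_RT_PRIO   100                       /* assumed value */
#define SKETCH_NR_PRIORITIES (SKETCH_MAX_RT_PRIO + 2)  /* assumed array length */

/* Old check: any task_pri at or above MAX_RT_PRIO bails out early,
 * so the last two array slots are never considered. */
static int slots_searched_old(int task_pri)
{
	if (task_pri >= SKETCH_MAX_RT_PRIO)
		return 0;
	return task_pri; /* loop runs for idx in [0, task_pri) */
}

/* New check: only a value outside the array is treated as a bug;
 * assert() stands in for BUG_ON() here. */
static int slots_searched_new(int task_pri)
{
	assert(task_pri < SKETCH_NR_PRIORITIES);
	return task_pri;
}
```

For task_pri values of 100 and 101 the old check searches nothing, while the new one walks the full range below task_pri.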

Reported-by: Mike Galbraith <umgwanakikb...@gmail.com>
Signed-off-by: Steven Rostedt <rost...@goodmis.org>
Signed-off-by: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1397015410.5212.13.ca...@marge.simpson.net
Signed-off-by: Ingo Molnar <mi...@kernel.org>
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>
---
 kernel/sched/cpupri.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c
index 8b836b376d91..3031bac8aa3e 100644
--- a/kernel/sched/cpupri.c
+++ b/kernel/sched/cpupri.c
@@ -70,8 +70,7 @@ int cpupri_find(struct cpupri *cp, struct task_struct *p,
int idx = 0;
int task_pri = convert_prio(p->prio);
 
-   if (task_pri >= MAX_RT_PRIO)
-   return 0;
+   BUG_ON(task_pri >= CPUPRI_NR_PRIORITIES);
 
for (idx = 0; idx < task_pri; idx++) {
struct cpupri_vec *vec  = &cp->pri_to_cpu[idx];
-- 
1.9.1


[PATCH] staging: lirc: fix coding style problems

2014-06-23 Thread Raphael Poggi

This patch fixes some coding style problems.

Signed-off-by: Raphaël Poggi <poggi.r...@gmail.com>
---
 drivers/staging/media/lirc/lirc_imon.c |   13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/media/lirc/lirc_imon.c b/drivers/staging/media/lirc/lirc_imon.c
index a5b62ee..31d5c6b 100644
--- a/drivers/staging/media/lirc/lirc_imon.c
+++ b/drivers/staging/media/lirc/lirc_imon.c
@@ -189,6 +189,7 @@ MODULE_PARM_DESC(debug, "Debug messages: 0=no, 1=yes(default: no)");
 static void free_imon_context(struct imon_context *context)
 {
struct device *dev = context->driver->dev;
+
usb_free_urb(context->tx_urb);
usb_free_urb(context->rx_urb);
lirc_buffer_free(context->driver->rbuf);
@@ -481,8 +482,6 @@ static void usb_tx_callback(struct urb *urb)
/* notify waiters that write has finished */
atomic_set(&context->tx.busy, 0);
complete(&context->tx.finished);
-
-   return;
 }
 
 /**
@@ -547,7 +546,6 @@ static void ir_close(void *data)
}
 
mutex_unlock(&context->ctx_lock);
-   return;
 }
 
 /**
@@ -572,7 +570,6 @@ static void submit_data(struct imon_context *context)
 
lirc_buffer_write(context->driver->rbuf, buf);
wake_up(&context->driver->rbuf->wait_poll);
-   return;
 }
 
 static inline int tv2int(const struct timeval *a, const struct timeval *b)
@@ -626,8 +623,8 @@ static void imon_incoming_packet(struct imon_context *context,
if (debug) {
dev_info(dev, "raw packet: ");
for (i = 0; i < len; ++i)
-   printk("%02x ", buf[i]);
-   printk("\n");
+   dev_dbg(dev, "%02x ", buf[i]);
+   dev_dbg(dev, "\n");
}
 
/*
@@ -656,6 +653,7 @@ static void imon_incoming_packet(struct imon_context *context,
mask = 0x80;
for (bit = 0; bit < 8; ++bit) {
int curr_bit = !(buf[octet] & mask);
+
if (curr_bit != context->rx.prev_bit) {
if (context->rx.count) {
submit_data(context);
@@ -707,8 +705,6 @@ static void usb_rx_callback(struct urb *urb)
}
 
usb_submit_urb(context->rx_urb, GFP_ATOMIC);
-
-   return;
 }
 
 /**
@@ -775,6 +771,7 @@ static int imon_probe(struct usb_interface *interface,
struct usb_endpoint_descriptor *ep;
int ep_dir;
int ep_type;
+
ep = &iface_desc->endpoint[i].desc;
ep_dir = ep->bEndpointAddress & USB_ENDPOINT_DIR_MASK;
ep_type = ep->bmAttributes & USB_ENDPOINT_XFERTYPE_MASK;
-- 
1.7.9.5


[PATCH v7 2/7] Documentation: bindings: add the Berlin SATA PHY

2014-06-23 Thread Antoine Ténart

The Berlin SATA PHY drives the PHY related to the SATA interface. Add
the corresponding documentation.

Signed-off-by: Antoine Ténart <antoine.ten...@free-electrons.com>
---
 .../devicetree/bindings/phy/berlin-sata-phy.txt  | 16 
 1 file changed, 16 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/berlin-sata-phy.txt

diff --git a/Documentation/devicetree/bindings/phy/berlin-sata-phy.txt b/Documentation/devicetree/bindings/phy/berlin-sata-phy.txt
new file mode 100644
index ..c61616e03931
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/berlin-sata-phy.txt
@@ -0,0 +1,16 @@
+Berlin SATA PHY
+---
+
+Required properties:
+- compatible: should be "marvell,berlin2q-sata-phy"
+- phy-cells: from the generic PHY bindings, must be 1
+- reg: address and length of the register
+- clocks: reference to the clock entry
+
+Example:
+   sata_phy: phy@f7e900a0 {
+   compatible = "marvell,berlin2q-sata-phy";
+   reg = <0xf7e900a0 0x200>;
+   clocks = <&chip CLKID_SATA>;
+   #phy-cells = <1>;
+   };
-- 
1.9.1

