date:20150504

Re: question about RCU dynticks_nesting

2015-05-04 Thread Rik van Riel

On 05/04/2015 11:59 AM, Rik van Riel wrote:

> However, currently the RCU code seems to use a much more
> complex counting scheme, with a different increment for
> kernel/task use, and irq use.
> 
> This counter seems to be modeled on the task preempt_counter,
> where we do care about whether we are in task context, irq
> context, or softirq context.
> 
> On the other hand, the RCU code only seems to care about
> whether or not a CPU is in an extended quiescent state,
> or is potentially in an RCU critical section.
> 
> Paul, what is the reason for RCU using a complex counter,
> instead of a simple increment for each potential kernel/RCU
> entry, like rcu_read_lock() does with CONFIG_PREEMPT_RCU
> enabled?

Looking at the code for a while more, I have not found
any reason why the rcu dynticks counter is so complex.

The rdtp->dynticks atomic seems to be used as a serial
number. Even means the cpu is in an rcu extended quiescent
state, odd means it is not.

This test is used to verify whether or not a CPU is
in rcu quiescent state. Presumably the atomic_add_return
is used to add a memory barrier.

(atomic_add_return(0, &rdtp->dynticks) & 0x1)
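
The serial-number use of the counter can be sketched in user-space C11. This is only an illustration under stated assumptions, not the kernel's code: struct dynticks_sketch and the helper names are made up here, and atomic_fetch_add(..., 0) stands in for the kernel's atomic_add_return(0, ...) as a read that is also fully ordered.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU rdtp->dynticks counter. */
struct dynticks_sketch {
	atomic_int dynticks;
};

/* Every transition into or out of the extended quiescent state bumps
 * the counter, so the low bit flips on each transition. */
static void dynticks_transition(struct dynticks_sketch *d)
{
	atomic_fetch_add(&d->dynticks, 1);
}

/* Adding zero leaves the value unchanged, but as a sequentially
 * consistent read-modify-write it also orders surrounding accesses. */
static int dynticks_snapshot(struct dynticks_sketch *d)
{
	return atomic_fetch_add(&d->dynticks, 0);
}

/* The parity test from the mail: true when the low bit is set. */
static bool dynticks_low_bit(int snap)
{
	return (snap & 0x1) != 0;
}

/* Serial-number use: any change since the snapshot means the CPU went
 * through at least one transition in between. */
static bool dynticks_changed_since(struct dynticks_sketch *d, int snap)
{
	return dynticks_snapshot(d) != snap;
}
```

A grace-period check along these lines would take a snapshot per CPU, then later accept the CPU either because the parity already showed a quiescent state or because the counter has moved since.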

> In fact, would we be able to simply use tsk->rcu_read_lock_nesting
> as an indicator of whether or not we should bother waiting on that
> task or CPU when doing synchronize_rcu?

We seem to have two variants of __rcu_read_lock().

One increments current->rcu_read_lock_nesting, the other
calls preempt_disable().

In case of the non-preemptible RCU, we could easily also
increase current->rcu_read_lock_nesting at the same time
we increase the preempt counter, and use that as the
indicator to test whether the cpu is in an extended
rcu quiescent state. That way there would be no extra
overhead at syscall entry or exit at all. The trick
would be getting the preempt count and the rcu read
lock nesting count in the same cache line for each task.
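
A rough user-space sketch of that idea, assuming a 64-byte cache line and a made-up struct task_sketch (the real counters live in kernel structures that are not reproduced here). The read-side entry bumps both counters together, and the nesting count alone tells an observer whether the task may be inside a read-side critical section.

```c
#include <stddef.h>

#define SKETCH_CACHE_LINE 64	/* assumed cache line size */

/* Hypothetical layout: both counters sit next to each other. */
struct task_sketch {
	int preempt_count;
	int rcu_read_lock_nesting;
};

/* Both fields fall within the first cache line of the struct; they share
 * a line only if the struct itself is cache-line aligned. */
_Static_assert(offsetof(struct task_sketch, rcu_read_lock_nesting) +
	       sizeof(int) <= SKETCH_CACHE_LINE,
	       "counters do not fit in one cache line");

static void sketch_rcu_read_lock(struct task_sketch *t)
{
	t->preempt_count++;		/* stands in for preempt_disable() */
	t->rcu_read_lock_nesting++;
}

static void sketch_rcu_read_unlock(struct task_sketch *t)
{
	t->rcu_read_lock_nesting--;
	t->preempt_count--;		/* stands in for preempt_enable() */
}

/* Non-zero while the task is inside at least one read-side section. */
static int sketch_in_read_side(const struct task_sketch *t)
{
	return t->rcu_read_lock_nesting > 0;
}
```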

In case of the preemptible RCU scheme, we would have to
examine the per-task state (under the runqueue lock)
to get the current task info of all CPUs, and in
addition wait for the blkd_tasks list to empty out
when doing a synchronize_rcu().

That does not appear to require special per-cpu
counters; examining the per-cpu rdp and the lists
inside it, with the rnp->lock held if doing any
list manipulation, looks like it would be enough.

However, the current code is a lot more complicated
than that. Am I overlooking something obvious, Paul?
Maybe something non-obvious? :)

-- 
All rights reversed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] staging: unisys: Disable driver for UML

2015-05-04 Thread Richard Weinberger

UML has neither io memory nor cpuid.
Let's disable this driver for UML.

Fixes:
drivers/staging/unisys/common-spar/include/iovmcall_gnuc.h: In function ‘__unisys_vmcall_gnuc’:
drivers/staging/unisys/common-spar/include/iovmcall_gnuc.h:24:2: error: implicit declaration of function ‘cpuid’ [-Werror=implicit-function-declaration]
  cpuid(0x0001, &cpuid_eax, &cpuid_ebx, &cpuid_ecx, &cpuid_edx);
  ^
In file included from drivers/staging/unisys/uislib/uislib.c:33:0:
drivers/staging/unisys/include/uisutils.h: In function ‘dbg_ioremap_cache’:
drivers/staging/unisys/include/uisutils.h:78:2: error: implicit declaration of function ‘ioremap_cache’ [-Werror=implicit-function-declaration]
  new = ioremap_cache(addr, size);
  ^
drivers/staging/unisys/include/uisutils.h:78:6: warning: assignment makes pointer from integer without a cast [enabled by default]
  new = ioremap_cache(addr, size);
  ^
drivers/staging/unisys/include/uisutils.h: In function ‘dbg_ioremap’:
drivers/staging/unisys/include/uisutils.h:89:2: error: implicit declaration of function ‘ioremap’ [-Werror=implicit-function-declaration]
  new = ioremap(addr, size);
  ^
drivers/staging/unisys/include/uisutils.h:89:6: warning: assignment makes pointer from integer without a cast [enabled by default]
  new = ioremap(addr, size);
  ^
drivers/staging/unisys/include/uisutils.h: In function ‘dbg_iounmap’:
drivers/staging/unisys/include/uisutils.h:98:2: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration]
  iounmap(addr);

Signed-off-by: Richard Weinberger 
---
 drivers/staging/unisys/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/unisys/Kconfig b/drivers/staging/unisys/Kconfig
index 19fcb34..a6d6c2a 100644
--- a/drivers/staging/unisys/Kconfig
+++ b/drivers/staging/unisys/Kconfig
@@ -3,7 +3,7 @@
 #
 menuconfig UNISYSSPAR
 	bool "Unisys SPAR driver support"
-	depends on X86_64
+	depends on X86_64 && !UML
 	---help---
 	  Support for the Unisys SPAR drivers
 
-- 
1.8.4.5


Re: [net-next PATCH v2 2/3] net: macb: Add support for jumbo frames

2015-05-04 Thread Jaeden Amero

On 05/04/2015 02:52 AM, Harini Katakam wrote:
> Check for "cdns,zynqmp-gem" compatible string and enable jumbo frame support
> in NWCFG register, update descriptor length masks and registers accordingly.
> Jumbo max length register should be set according to support in SoC; it is
> set to 10240 for Zynq Ultrascale+ MPSoC.
> 
> Signed-off-by: Harini Katakam 
> Reviewed-by: Punnaiah Choudary Kalluri 
> ---
> 
> On v1, Michal commented that I should use macb_config for jumbo parameters
> instead of defining them by reading the compatible string directly.
> I can use .caps for isjumbo. But jumbo-max-length needs to be defined.
> Can I add this to the structure? Any suggestions on how to handle this?
> 
> v2:
> Add constant definition and update SoC name
> 
> ---
>  drivers/net/ethernet/cadence/macb.c |   21 ++---
>  drivers/net/ethernet/cadence/macb.h |8 
>  2 files changed, 26 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index 4104d49..a065283 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -54,6 +54,8 @@
>  #define MACB_MAX_TX_LEN  ((unsigned int)((1 << MACB_TX_FRMLEN_SIZE) - 1))
>  #define GEM_MAX_TX_LEN   ((unsigned int)((1 << GEM_TX_FRMLEN_SIZE) - 1))
>  
> +#define GEM_ZYNQMP_JUMBO_MAX 10240
> +
>  /*
>   * Graceful stop timeouts in us. We should allow up to
>   * 1 frame time (10 Mbits/s, full-duplex, ignoring collisions)
> @@ -782,7 +784,7 @@ static int gem_rx(struct macb *bp, int budget)
>   }
>   /* now everything is ready for receiving packet */
>   bp->rx_skbuff[entry] = NULL;
> - len = MACB_BFEXT(RX_FRMLEN, ctrl);
> + len = ctrl & bp->rx_frm_len_mask;
>  
>   netdev_vdbg(bp->dev, "gem_rx %u (len %u)\n", entry, len);
>  
> @@ -828,7 +830,7 @@ static int macb_rx_frame(struct macb *bp, unsigned int first_frag,
>   struct macb_dma_desc *desc;
>  
>   desc = macb_rx_desc(bp, last_frag);
> - len = MACB_BFEXT(RX_FRMLEN, desc->ctrl);
> + len = desc->ctrl & bp->rx_frm_len_mask;
>  
>   netdev_vdbg(bp->dev, "macb_rx_frame frags %u - %u (len %u)\n",
>   macb_rx_ring_wrap(first_frag),
> @@ -1633,7 +1635,10 @@ static void macb_init_hw(struct macb *bp)
>   config |= MACB_BF(RBOF, NET_IP_ALIGN);  /* Make eth data aligned */
>   config |= MACB_BIT(PAE);        /* PAuse Enable */
>   config |= MACB_BIT(DRFCS);      /* Discard Rx FCS */
> - config |= MACB_BIT(BIG);        /* Receive oversized frames */
> + if (bp->isjumbo)
> + config |= MACB_BIT(JFRAME); /* Enable jumbo frames */
> + else
> + config |= MACB_BIT(BIG);        /* Receive oversized frames */
>   if (bp->dev->flags & IFF_PROMISC)
>   config |= MACB_BIT(CAF);        /* Copy All Frames */
>   else if (macb_is_gem(bp) && bp->dev->features & NETIF_F_RXCSUM)
> @@ -1642,8 +1647,13 @@ static void macb_init_hw(struct macb *bp)
>   config |= MACB_BIT(NBC);        /* No BroadCast */
>   config |= macb_dbw(bp);
>   macb_writel(bp, NCFGR, config);
> + if (bp->isjumbo && bp->jumbo_max_len)
> + gem_writel(bp, JML, bp->jumbo_max_len);
>   bp->speed = SPEED_10;
>   bp->duplex = DUPLEX_HALF;
> + bp->rx_frm_len_mask = MACB_RX_FRMLEN_MASK;
> + if (bp->isjumbo)
> + bp->rx_frm_len_mask = MACB_RX_JFRMLEN_MASK;
>  
>   macb_configure_dma(bp);
>  
> @@ -2762,6 +2772,11 @@ static int macb_probe(struct platform_device *pdev)
>   bp->pclk = pclk;
>   bp->hclk = hclk;
>   bp->tx_clk = tx_clk;
> + if (of_device_is_compatible(pdev->dev.of_node, "cdns,zynqmp-gem")) {
> + bp->isjumbo = 1;
> + bp->jumbo_max_len = GEM_ZYNQMP_JUMBO_MAX;

Could you use the bottom 16 bits of DCFG2 instead of GEM_ZYNQMP_JUMBO_MAX?
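
A minimal sketch of that suggestion, assuming (from the comment above only) that the low 16 bits of the DCFG2 design-configuration register hold the maximum jumbo length. The mask and helper names are hypothetical, and the register read itself is left out.

```c
#include <stdint.h>

/* Assumption taken from the review comment: bits 15:0 of DCFG2 carry
 * the jumbo max length supported by this instance of the IP. */
#define SKETCH_DCFG2_JML_MASK 0xFFFFu

/* Extract the jumbo max length from a raw DCFG2 value. */
static unsigned int sketch_jumbo_max_from_dcfg2(uint32_t dcfg2)
{
	return dcfg2 & SKETCH_DCFG2_JML_MASK;
}
```

With something like this, the probe path would not need a per-SoC constant: the value written to the JML register would come from the hardware's own configuration.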

> + }
> +
>   spin_lock_init(&bp->lock);
>  
>   /* setup capabilities */
> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
> index eb7d76f..e25f77e 100644
> --- a/drivers/net/ethernet/cadence/macb.h
> +++ b/drivers/net/ethernet/cadence/macb.h
> @@ -71,6 +71,7 @@
>  #define GEM_NCFGR    0x0004 /* Network Config */
>  #define GEM_USRIO    0x000c /* User IO */
>  #define GEM_DMACFG   0x0010 /* DMA Configuration */
> +#define GEM_JML      0x0048 /* Jumbo Max Length */
>  #define GEM_HRB      0x0080 /* Hash Bottom */
>  #define GEM_HRT      0x0084 /* Hash Top */
>  #define GEM_SA1B     0x0088 /* Specific1 Bottom */
> @@ -514,6 +515,9 @@ struct macb_dma_desc {
>  #define MACB_RX_BROADCAST_OFFSET 31
>  #define MACB_RX_BROADCAST_SIZE   1
>  
> +#define MACB_RX_FRMLEN_MASK  0xFFF
> +#define MACB_RX_JFRMLEN_MASK

Re: [PATCH] Documentation: tracing: fix grammar

2015-05-04 Thread Steven Rostedt

On Mon,  4 May 2015 19:48:54 +0200
Rabin Vincent  wrote:

> 4a88d44ab17da ("tracing: Remove mentioning of legacy latency_trace file
> from documentation") changed a sentence to refer to only one file
> instead of two, but the sentence still uses "they".  Fix it.
> 
> Signed-off-by: Rabin Vincent 

Acked-by: Steven Rostedt 

Is someone else going to take this in their tree?

-- Steve

> ---
>  Documentation/trace/ftrace.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
> index 572ca92..7ddb1e3 100644
> --- a/Documentation/trace/ftrace.txt
> +++ b/Documentation/trace/ftrace.txt
> @@ -108,8 +108,8 @@ of ftrace. Here is a list of some of the key files:
>   data is read from this file, it is consumed, and
>   will not be read again with a sequential read. The
>   "trace" file is static, and if the tracer is not
> - adding more data,they will display the same
> - information every time they are read.
> + adding more data, it will display the same
> + information every time it is read.
>  
>trace_options:
>  


Re: [PATCH 2/4] x86/mce/amd: Introduce deferred error interrupt handler

2015-05-04 Thread Aravind Gopalakrishnan


On 5/4/2015 1:46 PM, Borislav Petkov wrote:

>> For deferred errors, the workaround is a little different as it
>> applies to only the given family/model right now. If the workaround
>> needs to be applied for future processors, we can extend the family
>> check for those right?
>
> Or, you can do the check for all families as we're behind a CPUID bit
> anyway. This is why CPUID bits are a good thing :-)

Yep. Ok, Will do that.

>> If we setup 'm.addr' in amd_threshold_interrupt() and
>> amd_deferred_error_interrupt() properly, then amd_decode_mce() would
>> actually have some value in m->addr to report.
>>
>> I didn't mean to say HW doesn't provide us the information in the addr
>> and/or the misc registers.
>
> So you can use mce_read_aux(), yeah, you can move it to mce-internal.h

Ok, will do.
Is it ok to grow another patch in a V2 for this instead of fixing it in
this patch since it's a real bug?
That should be helpful when someone wants to look up git logs of why
this was done..

>> The addr, misc registers are still valid for threshold, deferred errors.
>> (Of course, misc is valid only if m->status & MCI_STATUS_MISCV)
>>
>> My point was, in __log_error(), we can read relevant status and addr MSRs to
>> be passed to mce_log() as those are the only pieces of information we use in
>> the decoding chain; and discard the m.misc assignment we do for threshold
>> errors.
>
> But MCx_MISC is important for thresholding errors, it carries the ErrCnt
> and stuff.
>
> So you can pass a parameter to __log_error(..., threshold=true, misc)
> and do
>
> 	if (threshold)
> 		m.misc = misc;
>
> Right?
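
The suggested shape of __log_error() can be sketched in plain C as follows. This is an illustration only: struct mce_sketch is a cut-down stand-in for struct mce, the function name and parameter order are assumptions, and the MSR reads and the mce_log() call are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cut-down, hypothetical stand-in for the fields of struct mce used here. */
struct mce_sketch {
	uint64_t status;
	uint64_t addr;
	uint64_t misc;
};

/* Common logging helper: status and addr are always recorded, misc only
 * for thresholding errors, where it carries the error count. */
static struct mce_sketch log_error_sketch(uint64_t status, uint64_t addr,
					  bool threshold, uint64_t misc)
{
	struct mce_sketch m = { .status = status, .addr = addr, .misc = 0 };

	if (threshold)
		m.misc = misc;

	return m;
}
```

The deferred-error handler would call the helper with threshold set to false, so both interrupt handlers share one code path.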



Yeah, just wanted to keep __log_error() as generic as possible and not 
special case for threshold.

But ok, since MCx_MISC is needed, I'll work it up as you suggested.

Thanks,
-Aravind.

Re: [PATCH] Documentation: tracing: fix grammar

2015-05-04 Thread Jonathan Corbet

On Mon, 4 May 2015 15:03:52 -0400
Steven Rostedt  wrote:

> Is someone else going to take this in their tree?

I'll take it in the docs tree.

jon

Re: [PATCH] blk-mq: fix FUA request hang

2015-05-04 Thread Jens Axboe


On 05/01/2015 10:59 AM, Shaohua Li wrote:

> When a FUA request enters its DATA stage of flush pipeline, the
> request is added to mq requeue list, the request will then be added to
> ctx->rq_list. blk_mq_attempt_merge() might merge the request with a bio.
> Later when the request is finished the flush pipeline, the
> request->__data_len is 0. Then I only saw the bio gets endio called, the
> original request never finish.
>
> Adding REQ_FLUSH_SEQ into REQ_NOMERGE_FLAGS looks an easy fix.
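
A small sketch of why extending the no-merge mask closes the hole. The bit positions and the SK_ names are made up for illustration and the masks are simplified. It also assumes that a request sitting in the DATA stage carries only the flush-sequence bit at that point.

```c
#include <stdbool.h>

/* Hypothetical bit positions; the kernel's real values differ. */
#define SK_REQ_NOMERGE   (1u << 0)
#define SK_REQ_FLUSH     (1u << 1)
#define SK_REQ_FUA       (1u << 2)
#define SK_REQ_FLUSH_SEQ (1u << 3)

/* Simplified mask before the fix, and the mask with the flush-sequence
 * bit added. */
#define SK_NOMERGE_FLAGS_OLD (SK_REQ_NOMERGE | SK_REQ_FLUSH | SK_REQ_FUA)
#define SK_NOMERGE_FLAGS_NEW (SK_NOMERGE_FLAGS_OLD | SK_REQ_FLUSH_SEQ)

/* A request is a merge candidate only if none of the masked bits is set. */
static bool sk_mergeable(unsigned int cmd_flags, unsigned int nomerge_mask)
{
	return (cmd_flags & nomerge_mask) == 0;
}
```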


Thanks, applied.

--
Jens Axboe


Re: [PATCH] Revert "smc91x: retrieve IRQ and trigger flags in a modern way"

2015-05-04 Thread David Miller

From: Linus Walleij 
Date: Mon, 4 May 2015 15:18:39 +0200

> On Tue, Mar 17, 2015 at 8:05 PM, David Miller  wrote:
>> From: Robert Jarzmik 
>> Date: Mon, 16 Mar 2015 22:06:13 +0100
>>
>>> David Miller  writes:
>>>
>>>> From: Robert Jarzmik 
>>>> Date: Thu, 19 Feb 2015 21:48:49 +0100
>>>>
>>>>> Linus has submitted the patch [1]. I'll be watching carefully until -rc4 that
>>>>> this is applied. If it's not, I'll reping you to apply this revert. Until then,
>>>>> you can forget about it, I'll do the follow-up.
>>>>>
>>>>> Does that plan sound good to you ?
>>>>
>>>> It sounds good to me.
>>>
>>> Hi David,
>>>
>>> Unfortunately Linus's patch didn't make it upstream. Therefore I'll ask you to
>>> apply the revert, which is reminded in [1].
>>
>> Revert applied, thanks.
> 
> Since the required patch fixing the actual problem is now upstream,
> can we revert the revert?

Ok, done.

Re: [PATCH 2/4] x86/mce/amd: Introduce deferred error interrupt handler

2015-05-04 Thread Borislav Petkov

On Mon, May 04, 2015 at 02:06:43PM -0500, Aravind Gopalakrishnan wrote:
> Is it ok to grow another patch in a V2 for this instead of fixing
> it in this patch since it's a real bug? That should be helpful when
> someone wants to look up git logs of why this was done..

Yes, a prepatch please.

> Yeah, just wanted to keep __log_error() as generic as possible and not
> special case for threshold.

Not important as it is going to be used in mce_amd.c only anyway. Its
main goal is to avoid code duplication - nothing else.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

Re: [LKP] [mtd] 6b44d910ae7: WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:3547 check_flags+0xae/0x17b()

2015-05-04 Thread Frans Klaver

On Mon, May 4, 2015 at 4:37 AM, Huang Ying  wrote:
> On Tue, 2015-04-28 at 23:37 +0200, Frans Klaver wrote:
>> On Thu, Apr 16, 2015 at 01:27:14PM +0800, Huang Ying wrote:
>> > FYI, we noticed the below changes on
>> >
>> > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>> > commit 6b44d910ae7de5316fcf1fc828ff4a8d48cac5e2 ("mtd: core: set some 
>> > defaults when dev.parent is set")
>> >
>> >
>> > [5.566033] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.566033] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.567490] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.567490] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.568935] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.568935] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.570362] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.570362] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.571786] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.571786] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.573195] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.573195] [nandsim] warning: read_byte: unexpected data output cycle, 
>> > state is STATE_READY return 0x0
>> > [5.574628] nand: device found, Manufacturer ID: 0x98, Chip ID: 0x39
>> > [5.574628] nand: device found, Manufacturer ID: 0x98, Chip ID: 0x39
>> > [5.575662] nand: Toshiba NAND 128MiB 1,8V 8-bit
>> > [5.575662] nand: Toshiba NAND 128MiB 1,8V 8-bit
>> > [5.576417] nand: 128 MiB, SLC, erase size: 16 KiB, page size: 512, OOB 
>> > size: 16
>> > [5.576417] nand: 128 MiB, SLC, erase size: 16 KiB, page size: 512, OOB 
>> > size: 16
>> > [5.577576] flash size: 128 MiB
>> > [5.577576] flash size: 128 MiB
>> > [5.578060] page size: 512 bytes
>> > [5.578060] page size: 512 bytes
>> > [5.578556] OOB area size: 16 bytes
>> > [5.578556] OOB area size: 16 bytes
>> > [5.579085] sector size: 16 KiB
>> > [5.579085] sector size: 16 KiB
>> > [5.579568] pages number: 262144
>> > [5.579568] pages number: 262144
>> > [5.580114] pages per sector: 32
>> > [5.580114] pages per sector: 32
>> > [5.580659] bus width: 8
>> > [5.580659] bus width: 8
>> > [5.581067] bits in sector size: 14
>> > [5.581067] bits in sector size: 14
>> > [5.581605] bits in page size: 9
>> > [5.581605] bits in page size: 9
>> > [5.582102] bits in OOB size: 4
>> > [5.582102] bits in OOB size: 4
>> > [5.582593] flash size with OOB: 135168 KiB
>> > [5.582593] flash size with OOB: 135168 KiB
>> > [5.583235] page address bytes: 4
>> > [5.583235] page address bytes: 4
>> > [5.583749] sector address bytes: 3
>> > [5.583749] sector address bytes: 3
>> > [5.584332] options: 0x42
>> > [5.584332] options: 0x42
>> > [5.586063] Scanning device for bad blocks
>> > [5.586063] Scanning device for bad blocks
>> > [5.609792] ftl_cs: FTL header not found.
>> > [5.609792] ftl_cs: FTL header not found.
>> > [5.612150] Creating 1 MTD partitions on "NAND 128MiB 1,8V 8-bit":
>> > [5.612150] Creating 1 MTD partitions on "NAND 128MiB 1,8V 8-bit":
>> > [5.613131] 0x-0x0800 : "NAND simulator partition 0"
>> > [5.613131] 0x-0x0800 : "NAND simulator partition 0"
>> > [5.614496] BUG: unable to handle kernel
>> > [5.614496] BUG: unable to handle kernel NULL pointer dereferenceNULL 
>> > pointer dereference at 0008
>> >  at 0008
>> > [5.615637] IP:
>> > [5.615637] IP: [<818c8620>] add_mtd_device+0x194/0x313
>> >  [<818c8620>] add_mtd_device+0x194/0x313
>> > [5.616041] *pde = 
>> > [5.616041] *pde = 
>> >
>> > [5.616041] Oops:  [#1]
>> > [5.616041] Oops:  [#1] DEBUG_PAGEALLOC DEBUG_PAGEALLOC
>> >
>> > [5.616041] CPU: 0 PID: 1 Comm: swapper Tainted: GW   
>> > 4.0.0-08945-gcb973ec #3
>> > [5.616041] CPU: 0 PID: 1 Comm: swapper Tainted: GW   
>> > 4.0.0-08945-gcb973ec #3
>> > [5.616041] task: 9468 ti: 94688000 task.ti: 94688000
>> > [5.616041] task: 9468 ti: 94688000 task.ti: 94688000
>> > [5.616041] EIP: 0060:[<818c8620>] EFLAGS: 00010202 CPU: 0
>> > [5.616041] EIP: 0060:[<818c8620>] EFLAGS: 00010202 CPU: 0
>> > [5.616041] EIP is at add_mtd_device+0x194/0x313
>>

Re: [PATCH 2/4] perf tools: Add functions which can get or set perf config variables.

2015-05-04 Thread Jiri Olsa

On Mon, Apr 27, 2015 at 03:34:24PM +0900, Taeung Song wrote:
> This patch consists of functions
> which can get, set specific config variables.
> For the syntax examples,
> 
>perf config [options] [section.subkey[=value] ...]
> 
>display key-value pairs of specific config variables
># perf config report.queue-size report.children

[jolsa@krava perf]$ ./perf config krava.krava
krava.krava=true

?

some comments below

> 
>set specific config variables
># perf config report.queue-size=100M report.children=true
> 
> Signed-off-by: Taeung Song 
> ---
>  tools/perf/Documentation/perf-config.txt |   2 +
>  tools/perf/builtin-config.c  | 276 ++-
>  tools/perf/util/cache.h  |  17 ++
>  tools/perf/util/config.c |  30 +++-
>  4 files changed, 320 insertions(+), 5 deletions(-)

SNIP

> +static int set_spec_config(const char *section_name, const char *subkey,
> +const char *value)
>  {
>   int ret = 0;
> + ret += set_config(section_name, subkey, value);
> + ret += perf_configset_write_in_full();
> +
> + return ret;
> +}
> +
> +static void parse_key(const char *var, const char **section_name, const char 
> **subkey)
> +{
> + char *key = strdup(var);
> +
> + if (!key)
> + die("%s: strdup failed\n", __func__);
> +
> + *section_name = strsep(&key, ".");
> + *subkey = strsep(&key, ".");

should this check the config syntax? could be used for command line check as well

> +}
> +
> +static int collect_config(const char *var, const char *value,
> +   void *cb __maybe_unused)
> +{
> + struct config_section *section_node;
> + const char *section_name, *subkey;

SNIP

> + }
> + for (i = 0; key[i]; i++) {
> + if (i == 0 && !isalpha(key[i++]))
> + goto out_err;
> +
> + switch (key[i]) {
> + case '.':
> + num_dot += 1;
> + if (!isalpha(key[++i]))
> + goto out_err;
> + break;
> + case '=':
> + num_equals += 1;
> + break;
> + default:
> + if (!isalpha(key[i]) && !isalnum(key[i]))
> + goto out_err;

you don't allow '-' in the key report.queue-size, I think we should also support _

also please put the name checks into a separate function
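
A possible shape for such a separate check, as a sketch only. The function names are hypothetical and the accepted syntax is an assumption: exactly one '.', each part starting with a letter and continuing with letters, digits, '-' or '_'.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* One name component (section or subkey): a letter first, then
 * letters, digits, '-' or '_'. */
static bool sketch_valid_name(const char *s, size_t len)
{
	size_t i;

	if (len == 0 || !isalpha((unsigned char)s[0]))
		return false;

	for (i = 1; i < len; i++) {
		unsigned char c = (unsigned char)s[i];

		if (!isalnum(c) && c != '-' && c != '_')
			return false;
	}
	return true;
}

/* A key is "section.subkey" with exactly one dot; a second dot ends up
 * inside the subkey part and is rejected by the name check. */
static bool sketch_valid_key(const char *key)
{
	const char *dot = strchr(key, '.');

	if (!dot)
		return false;

	return sketch_valid_name(key, (size_t)(dot - key)) &&
	       sketch_valid_name(dot + 1, strlen(dot + 1));
}
```

The same helper could then be shared between the config file parser and the command line check.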

> + }
> + }
> +
> + if (num_equals > 1 || num_dot > 1)
> + goto out_err;
> +
> + given_value = strchr(key, '=');
> + if (given_value == NULL || given_value == key)
> + given_value = NULL;

SNIP

>   argc = parse_options(argc, argv, config_options, config_usage,
>PARSE_OPT_STOP_AT_NON_OPTION);
> + if (origin_argc > argc)
> + is_option = true;
> + else
> + is_option = false;
> +
> + if (!is_option && argc >= 0) {
> + switch (argc) {
> + case 0:
> + break;
> + default:
> + for (i = 0; argv[i]; i++) {
> + value = strrchr(argv[i], '=');
> + if (value == NULL || value == argv[i])

hum, so you let through args like '=krava' ?

why don't you completely check the name (assignment string) first
and decide later about the callback
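
One way to check the assignment string first and pick the callback afterwards, sketched with hypothetical names. It rejects an empty name such as '=krava' and more than one '=' (mirroring the num_equals check in the quoted patch). Rejecting an empty value is an extra assumption, and the name syntax itself is not validated here.

```c
#include <string.h>

enum sketch_arg_kind {
	SKETCH_ARG_INVALID,
	SKETCH_ARG_SHOW,	/* "section.subkey" */
	SKETCH_ARG_SET,		/* "section.subkey=value" */
};

/* Classify one command line argument before any callback is chosen. */
static enum sketch_arg_kind sketch_classify_arg(const char *arg)
{
	const char *eq = strchr(arg, '=');

	/* empty argument, or nothing before the '=' */
	if (arg[0] == '\0' || eq == arg)
		return SKETCH_ARG_INVALID;

	if (!eq)
		return SKETCH_ARG_SHOW;

	/* empty value, or a second '=' in the value */
	if (eq[1] == '\0' || strchr(eq + 1, '='))
		return SKETCH_ARG_INVALID;

	return SKETCH_ARG_SET;
}
```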

> + ret = perf_configset_with_option(show_spec_config, argv[i]);
> + else
> + ret = perf_configset_with_option(set_spec_config, argv[i]);
> + if (ret < 0)
> + break;
> + }
> + goto out;
> + }
> + }

SNIP

> @@ -502,6 +501,31 @@ out:
>   return ret;
>  }
>  
> +int perf_configset_write_in_full(void)
> +{
> + struct config_section *section_node;
> + struct config_element *element_node;
> + const char *first_line = "# this file is auto-generated.";

so you parse the whole config, change it and write it back..
hum, I don't see a better way.. and I like the first line ;-)

> + FILE *fp = fopen(config_file_name, "w");
> +
> + if (!fp)
> + return -1;
> +
> + fprintf(fp, "%s\n", first_line);
> + /* overwrite configvariables */
> + list_for_each_entry(section_node, sections, list) {
> + fprintf(fp, "[%s]\n", section_node->name);
> + list_for_each_entry(element_node, &section_node->element_head, 
> list) {
> + if (element_node->value)
> + fprintf(fp, "\t%s = %s\n",
> + element_node->subkey, 
> element_node->value);
> + }
> + }
> + fclose(fp);
> +
> + return 0;
> +}
> +
>  /*
>   * Call this to report error for your va

Re: [PATCH 3/4] perf tools: Add a option 'all' to perf-config.

2015-05-04 Thread Jiri Olsa

On Mon, Apr 27, 2015 at 03:34:25PM +0900, Taeung Song wrote:
> A option 'all' is to display both current config variables and
> all possible config variables with default values.
> The syntax examples are like below
> 
> perf config [options]
> 
> display all perf config with default values.
> # perf config
> or
> # perf config -a | --all

hum, I think we wanted 

  'perf config' to display current config file (same as -l)

  'perf config -a' to display all possible config

jirka

Re: [PATCH 3/4] perf tools: Add a option 'all' to perf-config.

2015-05-04 Thread Jiri Olsa

On Mon, Apr 27, 2015 at 03:34:25PM +0900, Taeung Song wrote:

SNIP

> +static int merge_config(const char *var, const char *value,
> + void *cb __maybe_unused)
> +{
> + const char *section_name, *subkey;
> + parse_key(var, &section_name, &subkey);
> + return set_config(section_name, subkey, value);
> +}
> +
> +static int show_all_config(void)
> +{
> + int ret = 0;
> + struct config_section *section_node;
> + struct config_element *element_node;
> + char *pwd, *all_config;
> +
> + pwd = getenv("PWD");
> + all_config = strdup(mkpath("%s/util/PERFCONFIG-DEFAULT", pwd));

so we could compile defaultconfig directly into perf
via following change

maybe the PERFCONFIG-DEFAULT could be in perf root dir

also I'm not sure if xxd is common enough to be used,
maybe there's some common way used in kernel build already

jirka


---
diff --git a/tools/build/Makefile.build b/tools/build/Makefile.build
index 10df57237a66..888a1d344be2 100644
--- a/tools/build/Makefile.build
+++ b/tools/build/Makefile.build
@@ -41,6 +41,7 @@ include $(build-file)
 
 quiet_cmd_flex  = FLEX $@
 quiet_cmd_bison = BISON$@
+quiet_cmd_xxd   = XXD  $@
 
 # Create directory unless it exists
 quiet_cmd_mkdir = MKDIR$(dir $@)
diff --git a/tools/perf/Build b/tools/perf/Build
index 3c1f4371c95a..614a9013d401 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -43,3 +43,8 @@ libperf-y += ui/
 libperf-y += scripts/
 
 gtk-y += ui/gtk/
+
+$(OUTPUT)builtin-config.o: $(OUTPUT)util/PERFCONFIG-DEFAULT.h
+
+$(OUTPUT)util/PERFCONFIG-DEFAULT.h: $(OUTPUT)util/PERFCONFIG-DEFAULT
+   @$(call echo-cmd,xxd)xxd -i $< > $@
diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index a67aea314180..a3678cfca4dd 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -12,6 +12,7 @@
 #include "util/parse-options.h"
 #include "util/util.h"
 #include "util/debug.h"
+#include "util/PERFCONFIG-DEFAULT.h"
 
 static struct {
bool list_action;

Re: [PATCH 3/4] perf tools: Add a option 'all' to perf-config.

2015-05-04 Thread Jiri Olsa

On Mon, Apr 27, 2015 at 03:34:25PM +0900, Taeung Song wrote:

SNIP

> +
> +static int show_all_config(void)
> +{
> + int ret = 0;
> + struct config_section *section_node;
> + struct config_element *element_node;
> + char *pwd, *all_config;
> +
> + pwd = getenv("PWD");
> + all_config = strdup(mkpath("%s/util/PERFCONFIG-DEFAULT", pwd));
> +
> + if (!all_config) {
> + pr_err("%s: strdup failed\n", __func__);
> + return -1;
> + }
> +
> + sections = zalloc(sizeof(*sections));
> + if (!sections)
> + return -1;
> + INIT_LIST_HEAD(sections);

this code pattern is already there, please place it into a function

also, where's the section freed?

jirka

Re: [PATCH] blk-mq: don't lose requests if a stopped queue restarts

2015-05-04 Thread Jens Axboe


On 05/02/2015 06:31 PM, Shaohua Li wrote:

> Normally if driver is busy to dispatch a request the logic is like below:
> block layer:                        driver:
>     __blk_mq_run_hw_queue
> a.                                  blk_mq_stop_hw_queue
> b.  rq add to ctx->dispatch
>
> later:
> 1.                                  blk_mq_start_hw_queue
> 2.  __blk_mq_run_hw_queue
>
> But it's possible step 1-2 runs between a and b. And since rq isn't in
> ctx->dispatch yet, step 2 will not run rq. The rq might get lost if
> there are no subsequent requests kick in.


Good catch! But the patch introduces a potentially never ending loop in 
__blk_mq_run_hw_queue(). Not sure how we can fully close it, but it 
might be better to punt the re-run after adding the requests back to the 
worker. That would turn a potential busy loop (until requests complete) 
into something with nicer behavior, at least. Ala


	if (!test_bit(BLK_MQ_S_STOPPED, &hctx->state))
		kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
						 &hctx->run_work, 0);
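
The interleaving and the effect of punting the re-run can be replayed in a tiny single-threaded model. Everything here is a made-up simplification (struct hctx_sketch, the sk_ helpers, a counter in place of the dispatch list and of the work item); it only shows the ordering problem, not the real locking.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified hardware queue state. */
struct hctx_sketch {
	bool stopped;
	int dispatch;		/* requests parked for a later run */
	int issued;		/* requests handed to the driver */
	int rerun_scheduled;	/* pending re-runs punted to the worker */
};

/* Steps 2: a queue run issues parked requests unless the queue is stopped. */
static void sk_run_hw_queue(struct hctx_sketch *h)
{
	if (h->stopped)
		return;
	h->issued += h->dispatch;
	h->dispatch = 0;
}

/* Step a: the busy driver stops the queue. */
static void sk_stop(struct hctx_sketch *h)
{
	h->stopped = true;
}

/* Step 1: the driver restarts the queue. */
static void sk_start(struct hctx_sketch *h)
{
	h->stopped = false;
}

/* Step b: park the request; optionally schedule a re-run if the queue
 * was restarted in the meantime. */
static void sk_requeue(struct hctx_sketch *h, bool punt_rerun)
{
	h->dispatch++;
	if (punt_rerun && !h->stopped)
		h->rerun_scheduled++;
}

/* The worker executes any re-runs that were scheduled. */
static void sk_worker(struct hctx_sketch *h)
{
	while (h->rerun_scheduled > 0) {
		h->rerun_scheduled--;
		sk_run_hw_queue(h);
	}
}
```

Replaying a, 1, 2, b without the re-run leaves one request parked forever; with the re-run the worker picks it up. If the driver is still busy on that later run, the request is simply parked again, which is the nicer behavior compared with looping in place.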


--
Jens Axboe


[PATCH 1/1 linux-next] block: loop: use IS_ERR() to check blk_mq_init_queue() return

2015-05-04 Thread Fabian Frederick

blk_mq_init_queue() never returns NULL. There's no need for IS_ERR_OR_NULL()

Signed-off-by: Fabian Frederick 
---
 drivers/block/loop.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ae3fcb4..3fb23e9 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1559,7 +1559,7 @@ static int loop_add(struct loop_device **l, int i)
goto out_free_idr;
 
lo->lo_queue = blk_mq_init_queue(&lo->tag_set);
-   if (IS_ERR_OR_NULL(lo->lo_queue)) {
+   if (IS_ERR(lo->lo_queue)) {
err = PTR_ERR(lo->lo_queue);
goto out_cleanup_tags;
}
-- 
1.9.1
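
For readers unfamiliar with the error-pointer convention behind this change, here is a user-space re-implementation sketch. The sk_ names are made up, and the encoding (error codes stored in the top 4095 values of the address space) is reproduced from memory of the kernel's err.h, so treat it as an approximation.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *sk_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Decode the errno value carried by an error pointer. */
static long sk_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True only for pointers in the reserved top range; NULL is not an error. */
static bool sk_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SK_MAX_ERRNO;
}

static bool sk_is_err_or_null(const void *p)
{
	return !p || sk_is_err(p);
}
```

Note that decoding NULL yields 0, so taking the error path on a NULL pointer and returning the decoded value would report success. When the callee never returns NULL, the plain error check states the contract more precisely.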


Re: [PATCH] watchdog: cadence: Add dependency on HAS_IOMEM

2015-05-04 Thread Guenter Roeck

On Mon, May 04, 2015 at 09:01:25PM +0200, Richard Weinberger wrote:
> Not all architectures have io memory.
> 
> Fixes:
> drivers/built-in.o: In function `cdns_wdt_probe':
> cadence_wdt.c:(.text+0x33b7c9): undefined reference to `devm_ioremap_resource'
> 
> Signed-off-by: Richard Weinberger 

Reviewed-by: Guenter Roeck 

RE: [REGRESSION,BISECTED] 4.1-rc2 radeon audio changes oops the kernel hard

2015-05-04 Thread Mikael Pettersson

Deucher, Alexander writes:
 > > -Original Message-
 > > From: Mikael Pettersson [mailto:mikpeli...@gmail.com]
 > > Sent: Monday, May 04, 2015 11:53 AM
 > > To: linux-kernel@vger.kernel.org
 > > Cc: Deucher, Alexander
 > > Subject: [REGRESSION,BISECTED] 4.1-rc2 radeon audio changes oops the
 > > kernel hard
 > > 
 > > On my Ivy Bridge i7 mobo w/ Radeon graphics, the 4.1-rc2 kernel oopses
 > > hard,
 > > requiring a hard reset:
 > > 
 > > BUG: unable to handle kernel NULL pointer dereference at
 > > 0010
 > > IP: [] radeon_audio_detect+0x5b/0x150 [radeon]
 > > PGD 0
 > > Oops:  [#1] SMP
 > > Modules linked in: af_packet snd_hda_codec_generic snd_hda_intel
 > > snd_hda_controller snd_hda_codec snd_hwdep snd_hda_core snd_seq
 > > snd_seq_device snd_pcm radeon cfbfillrect cfbimgblt cfbcopyarea
 > > i2c_algo_bit backlight r8169 mii coretemp snd_timer drm_kms_helper ttm
 > > snd drm i2c_core xhci_pci xhci_hcd soundcore evdev firmware_class hwmon
 > > hid_generic usbhid hid ehci_pci ehci_hcd sr_mod cdrom usbcore
 > > usb_common ipv6
 > > CPU: 0 PID: 163 Comm: kworker/0:2 Not tainted 4.1.0-rc2 #1
 > > Hardware name: System manufacturer System Product Name/P8Z77-V LE
 > > PLUS, BIOS 0403 05/08/2012
 > > Workqueue: events output_poll_execute [drm_kms_helper]
 > > task: 8806012b1590 ti: 88003796 task.ti: 88003796
 > > RIP: 0010:[]  []
 > > radeon_audio_detect+0x5b/0x150 [radeon]
 > > RSP: 0018:880037963c78  EFLAGS: 00010246
 > > RAX: 880600c92da0 RBX: 880600cbb000 RCX: 0001
 > > RDX:  RSI:  RDI: 880037a3f600
 > > RBP: 880600c92da0 R08: 0001 R09: 0050
 > > R10: 0001 R11: 880603001a80 R12: 0001
 > > R13: 880600c924e0 R14: 880601f84000 R15: 0001
 > > FS:  () GS:88061ec0()
 > > knlGS:
 > > CS:  0010 DS:  ES:  CR0: 80050033
 > > CR2: 0010 CR3: 01478000 CR4: 001407f0
 > > Stack:
 > >  880600cbb000 0001 0001 880601f84000
 > >  a03e7d70 a03157ea 880601f84000 0002
 > >  880600baa200 880600cbb050 880600cbb000 880600e33800
 > > Call Trace:
 > >  [] ? radeon_dvi_detect+0x35a/0x4d0 [radeon]
 > >  [] ?
 > > drm_helper_probe_single_connector_modes_merge_bits+0x2e6/0x490
 > > [drm_kms_helper]
 > >  [] ?
 > > drm_fb_helper_probe_connector_modes.isra.5+0x48/0x70
 > > [drm_kms_helper]
 > >  [] ? drm_fb_helper_hotplug_event+0x55/0xe0
 > > [drm_kms_helper]
 > >  [] ? output_poll_execute+0x7c/0x1a0 [drm_kms_helper]
 > >  [] ? process_one_work+0x130/0x360
 > >  [] ? worker_thread+0x114/0x460
 > >  [] ? __schedule+0x20d/0x660
 > >  [] ? rescuer_thread+0x2f0/0x2f0
 > >  [] ? kthread+0xbc/0xe0
 > >  [] ? kthread_create_on_node+0x170/0x170
 > >  [] ? ret_from_fork+0x42/0x70
 > >  [] ? kthread_create_on_node+0x170/0x170
 > > Code: 8b 45 00 4c 8b ad 58 01 00 00 4c 8b 70 28 49 8b 85 00 01 00 00 48 85 
 > > c0 74
 > > 30 41 83 fc 01 74 38 48 8b 70 10 49 8b 96 c8 24 00 00 <48> 8b 4a 10 48 85 
 > > c9 74
 > > 0e 31 d2 4c 89 f7 ff d1 49 8b 85 00 01
 > > RIP  [] radeon_audio_detect+0x5b/0x150 [radeon]
 > >  RSP 
 > > CR2: 0010
 > > ---[ end trace 5b99e3870bfc7a92 ]---
 > > BUG: unable to handle kernel paging request at ffd8
 > > IP: [] kthread_data+0x7/0x10
 > > PGD 1479067 PUD 147b067 PMD 0
 > > Oops:  [#2] SMP
 > > Modules linked in: af_packet snd_hda_codec_generic snd_hda_intel
 > > snd_hda_controller snd_hda_codec snd_hwdep snd_hda_core snd_seq
 > > snd_seq_device snd_pcm radeon cfbfillrect cfbimgblt cfbcopyarea
 > > i2c_algo_bit backlight r8169 mii coretemp snd_timer drm_kms_helper ttm
 > > snd drm i2c_core xhci_pci xhci_hcd soundcore evdev firmware_class hwmon
 > > hid_generic usbhid hid ehci_pci ehci_hcd sr_mod cdrom usbcore
 > > usb_common ipv6
 > > CPU: 0 PID: 163 Comm: kworker/0:2 Tainted: G  D 4.1.0-rc2 #1
 > > Hardware name: System manufacturer System Product Name/P8Z77-V LE
 > > PLUS, BIOS 0403 05/08/2012
 > > task: 8806012b1590 ti: 88003796 task.ti: 88003796
 > > RIP: 0010:[]  [] kthread_data+0x7/0x10
 > > RSP: 0018:880037963a60  EFLAGS: 00010002
 > > RAX:  RBX:  RCX: 73c2bc6e
 > > RDX:  RSI:  RDI: 8806012b1590
 > > RBP: 8806012b1590 R08: 0001 R09: 0001
 > > R10: ea001804b800 R11: 001a R12: 8806012b1980
 > > R13:  R14: 00014300 R15: 
 > > FS:  () GS:88061ec0()
 > > knlGS:
 > > CS:  0010 DS:  ES:  CR0: 80050033
 > > CR2: 0028 CR3: 01478000 CR4: 001407f0
 > > Stack:
 > >  81051068 88061ec14300 8134c203 
 > >  880037964000 8806012b1878 fff

Re: [PATCH 1/2] net/rds: RDS-TCP: Always create a new rds_sock for an incoming connection.

2015-05-04 Thread Sowmini Varadhan

On (05/04/15 14:47), David Miller wrote:
> 
> I think adding 64K of data to this module just to solve this rare
> issue is excessive.

I chose that number mostly as a heuristic, based on rds_conn_hash[].
Any suggestions for what's reasonable? 8K? Less?
(BTW, I think that should be 32K, or am I miscounting?)

> Furthermore I don't see any locking protecting the hash table nor
> the RDS socket linkage into that table.

Yes, I missed that; I'll fix it in v2.

--Sowmini


RE: [REGRESSION,BISECTED] 4.1-rc2 radeon audio changes oops the kernel hard

2015-05-04 Thread Deucher, Alexander

> -Original Message-
> From: Mikael Pettersson [mailto:mikpeli...@gmail.com]
> Sent: Monday, May 04, 2015 3:27 PM
> To: Deucher, Alexander
> Cc: Mikael Pettersson; linux-kernel@vger.kernel.org
> Subject: RE: [REGRESSION,BISECTED] 4.1-rc2 radeon audio changes oops the
> kernel hard
> 
> Deucher, Alexander writes:
>  > > -Original Message-
>  > > From: Mikael Pettersson [mailto:mikpeli...@gmail.com]
>  > > Sent: Monday, May 04, 2015 11:53 AM
>  > > To: linux-kernel@vger.kernel.org
>  > > Cc: Deucher, Alexander
>  > > Subject: [REGRESSION,BISECTED] 4.1-rc2 radeon audio changes oops the
>  > > kernel hard
>  > >
>  > > On my Ivy Bridge i7 mobo w/ Radeon graphics, the 4.1-rc2 kernel oopses
>  > > hard,
>  > > requiring a hard reset:
>  > >
>  > > BUG: unable to handle kernel NULL pointer dereference at
>  > > 0010
>  > > IP: [] radeon_audio_detect+0x5b/0x150 [radeon]
>  > > PGD 0
>  > > Oops:  [#1] SMP
>  > > Modules linked in: af_packet snd_hda_codec_generic snd_hda_intel
>  > > snd_hda_controller snd_hda_codec snd_hwdep snd_hda_core snd_seq
>  > > snd_seq_device snd_pcm radeon cfbfillrect cfbimgblt cfbcopyarea
>  > > i2c_algo_bit backlight r8169 mii coretemp snd_timer drm_kms_helper
> ttm
>  > > snd drm i2c_core xhci_pci xhci_hcd soundcore evdev firmware_class
> hwmon
>  > > hid_generic usbhid hid ehci_pci ehci_hcd sr_mod cdrom usbcore
>  > > usb_common ipv6
>  > > CPU: 0 PID: 163 Comm: kworker/0:2 Not tainted 4.1.0-rc2 #1
>  > > Hardware name: System manufacturer System Product Name/P8Z77-V
> LE
>  > > PLUS, BIOS 0403 05/08/2012
>  > > Workqueue: events output_poll_execute [drm_kms_helper]
>  > > task: 8806012b1590 ti: 88003796 task.ti: 88003796
>  > > RIP: 0010:[]  []
>  > > radeon_audio_detect+0x5b/0x150 [radeon]
>  > > RSP: 0018:880037963c78  EFLAGS: 00010246
>  > > RAX: 880600c92da0 RBX: 880600cbb000 RCX: 0001
>  > > RDX:  RSI:  RDI: 880037a3f600
>  > > RBP: 880600c92da0 R08: 0001 R09: 0050
>  > > R10: 0001 R11: 880603001a80 R12: 0001
>  > > R13: 880600c924e0 R14: 880601f84000 R15: 0001
>  > > FS:  () GS:88061ec0()
>  > > knlGS:
>  > > CS:  0010 DS:  ES:  CR0: 80050033
>  > > CR2: 0010 CR3: 01478000 CR4: 001407f0
>  > > Stack:
>  > >  880600cbb000 0001 0001 880601f84000
>  > >  a03e7d70 a03157ea 880601f84000 0002
>  > >  880600baa200 880600cbb050 880600cbb000 880600e33800
>  > > Call Trace:
>  > >  [] ? radeon_dvi_detect+0x35a/0x4d0 [radeon]
>  > >  [] ?
>  > > drm_helper_probe_single_connector_modes_merge_bits+0x2e6/0x490
>  > > [drm_kms_helper]
>  > >  [] ?
>  > > drm_fb_helper_probe_connector_modes.isra.5+0x48/0x70
>  > > [drm_kms_helper]
>  > >  [] ? drm_fb_helper_hotplug_event+0x55/0xe0
>  > > [drm_kms_helper]
>  > >  [] ? output_poll_execute+0x7c/0x1a0
> [drm_kms_helper]
>  > >  [] ? process_one_work+0x130/0x360
>  > >  [] ? worker_thread+0x114/0x460
>  > >  [] ? __schedule+0x20d/0x660
>  > >  [] ? rescuer_thread+0x2f0/0x2f0
>  > >  [] ? kthread+0xbc/0xe0
>  > >  [] ? kthread_create_on_node+0x170/0x170
>  > >  [] ? ret_from_fork+0x42/0x70
>  > >  [] ? kthread_create_on_node+0x170/0x170
>  > > Code: 8b 45 00 4c 8b ad 58 01 00 00 4c 8b 70 28 49 8b 85 00 01 00 00 48 
> 85
> c0 74
>  > > 30 41 83 fc 01 74 38 48 8b 70 10 49 8b 96 c8 24 00 00 <48> 8b 4a 10 48 
> 85 c9
> 74
>  > > 0e 31 d2 4c 89 f7 ff d1 49 8b 85 00 01
>  > > RIP  [] radeon_audio_detect+0x5b/0x150 [radeon]
>  > >  RSP 
>  > > CR2: 0010
>  > > ---[ end trace 5b99e3870bfc7a92 ]---
>  > > BUG: unable to handle kernel paging request at ffd8
>  > > IP: [] kthread_data+0x7/0x10
>  > > PGD 1479067 PUD 147b067 PMD 0
>  > > Oops:  [#2] SMP
>  > > Modules linked in: af_packet snd_hda_codec_generic snd_hda_intel
>  > > snd_hda_controller snd_hda_codec snd_hwdep snd_hda_core snd_seq
>  > > snd_seq_device snd_pcm radeon cfbfillrect cfbimgblt cfbcopyarea
>  > > i2c_algo_bit backlight r8169 mii coretemp snd_timer drm_kms_helper
> ttm
>  > > snd drm i2c_core xhci_pci xhci_hcd soundcore evdev firmware_class
> hwmon
>  > > hid_generic usbhid hid ehci_pci ehci_hcd sr_mod cdrom usbcore
>  > > usb_common ipv6
>  > > CPU: 0 PID: 163 Comm: kworker/0:2 Tainted: G  D 4.1.0-rc2 #1
>  > > Hardware name: System manufacturer System Product Name/P8Z77-V
> LE
>  > > PLUS, BIOS 0403 05/08/2012
>  > > task: 8806012b1590 ti: 88003796 task.ti: 88003796
>  > > RIP: 0010:[]  []
> kthread_data+0x7/0x10
>  > > RSP: 0018:880037963a60  EFLAGS: 00010002
>  > > RAX:  RBX:  RCX: 73c2bc6e
>  > > RDX:  RSI:  RDI: 8806012b1590
>  > > RBP: 8806012b1590 R08: 0001 R09:

Re: [PATCH RFC 1/2] crypto: add PKE API

2015-05-04 Thread Tadeusz Struk

On 05/02/2015 05:07 PM, Herbert Xu wrote:
> >>  #define CRYPTO_ALG_TYPE_AHASH   0x000a
> >> +#define CRYPTO_ALG_TYPE_PKE     0x000b
> >>  #define CRYPTO_ALG_TYPE_RNG     0x000c
> >>> Will filling a hole cause a problem with something that got obsoleted?
> >>
> >> I hope not. I checked as far back as 2.6.18 and I don't see any clash.
> >> Herbert, what do you think?
> Indeed you can't use this hole as it'll make you a hash algorithm.

So in this case isn't RNG a hash algorithm as well?
Anyway, would something like this be OK with you?

diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index ee14140..ac18cd3 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -41,7 +41,7 @@
 /*
  * Algorithm masks and types.
  */
-#define CRYPTO_ALG_TYPE_MASK   0x000f
+#define CRYPTO_ALG_TYPE_MASK   0xf00f
 #define CRYPTO_ALG_TYPE_CIPHER 0x0001
 #define CRYPTO_ALG_TYPE_COMPRESS   0x0002
 #define CRYPTO_ALG_TYPE_AEAD   0x0003
@@ -54,6 +54,7 @@
 #define CRYPTO_ALG_TYPE_AHASH  0x000a
 #define CRYPTO_ALG_TYPE_RNG    0x000c
 #define CRYPTO_ALG_TYPE_PCOMPRESS  0x000f
+#define CRYPTO_ALG_TYPE_PKE    0x1001
 
 #define CRYPTO_ALG_TYPE_HASH_MASK  0x000e
 #define CRYPTO_ALG_TYPE_AHASH_MASK 0x000c
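
The "hole" question can be checked with a little mask arithmetic. The sketch below uses only the values quoted in this thread and assumes (this is an assumption, not something stated above) that a type lookup accepts an algorithm when (flags ^ type) & mask is zero. The helper names type_matches() and looks_like_ahash() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as they appear in the quoted hunks. */
#define CRYPTO_ALG_TYPE_AHASH      0x000a
#define CRYPTO_ALG_TYPE_RNG        0x000c
#define CRYPTO_ALG_TYPE_AHASH_MASK 0x000c

/*
 * Assumed matching rule: the algorithm's type bits must equal the
 * requested type in every bit position selected by the mask.
 */
static bool type_matches(uint32_t alg_flags, uint32_t type, uint32_t mask)
{
	return ((alg_flags ^ type) & mask) == 0;
}

/* Would a lookup for an ahash accept an algorithm with these flags? */
static bool looks_like_ahash(uint32_t alg_flags)
{
	return type_matches(alg_flags, CRYPTO_ALG_TYPE_AHASH,
			    CRYPTO_ALG_TYPE_AHASH_MASK);
}
```

Under that rule 0x000b differs from 0x000a only in bit 0, which the 0x000c mask ignores, so it would be taken for a hash. 0x000c (RNG) differs in bit 2, which the mask does cover, so it would not; neither would 0x1001.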



Re: [PATCH v8 2/9] mailbox: Make mbox_chan_ops const

2015-05-04 Thread Suman Anna

On 05/04/2015 12:36 PM, Andrew Bresticker wrote:
> The mailbox controller's channel ops ought to be read-only.  Update
> all the mailbox drivers to make their mbox_chan_ops const as well.
> 
> Signed-off-by: Andrew Bresticker 
> Cc: Jassi Brar 
> Cc: Suman Anna 
> Cc: Ashwin Chaugule 
> Cc: Ley Foon Tan 

Thanks, the new patch looks good.

Acked-by: Suman Anna 

> ---
> Changes from v7:
>  - Constify all drivers' mbox_chan_ops.
> No changes from v5/v6.
> New for v5.
> ---
>  drivers/mailbox/arm_mhu.c  | 2 +-
>  drivers/mailbox/mailbox-altera.c   | 2 +-
>  drivers/mailbox/omap-mailbox.c | 2 +-
>  drivers/mailbox/pcc.c  | 2 +-
>  include/linux/mailbox_controller.h | 2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/mailbox/arm_mhu.c b/drivers/mailbox/arm_mhu.c
> index ac693c6..d9e99f9 100644
> --- a/drivers/mailbox/arm_mhu.c
> +++ b/drivers/mailbox/arm_mhu.c
> @@ -110,7 +110,7 @@ static void mhu_shutdown(struct mbox_chan *chan)
>   free_irq(mlink->irq, chan);
>  }
>  
> -static struct mbox_chan_ops mhu_ops = {
> +static const struct mbox_chan_ops mhu_ops = {
>   .send_data = mhu_send_data,
>   .startup = mhu_startup,
>   .shutdown = mhu_shutdown,
> diff --git a/drivers/mailbox/mailbox-altera.c 
> b/drivers/mailbox/mailbox-altera.c
> index a266265..bb682c9 100644
> --- a/drivers/mailbox/mailbox-altera.c
> +++ b/drivers/mailbox/mailbox-altera.c
> @@ -285,7 +285,7 @@ static void altera_mbox_shutdown(struct mbox_chan *chan)
>   }
>  }
>  
> -static struct mbox_chan_ops altera_mbox_ops = {
> +static const struct mbox_chan_ops altera_mbox_ops = {
>   .send_data = altera_mbox_send_data,
>   .startup = altera_mbox_startup,
>   .shutdown = altera_mbox_shutdown,
> diff --git a/drivers/mailbox/omap-mailbox.c b/drivers/mailbox/omap-mailbox.c
> index 0f332c1..03f8545 100644
> --- a/drivers/mailbox/omap-mailbox.c
> +++ b/drivers/mailbox/omap-mailbox.c
> @@ -604,7 +604,7 @@ static int omap_mbox_chan_send_data(struct mbox_chan 
> *chan, void *data)
>   return ret;
>  }
>  
> -static struct mbox_chan_ops omap_mbox_chan_ops = {
> +static const struct mbox_chan_ops omap_mbox_chan_ops = {
>   .startup= omap_mbox_chan_startup,
>   .send_data  = omap_mbox_chan_send_data,
>   .shutdown   = omap_mbox_chan_shutdown,
> diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> index 7e91d68..26d121d 100644
> --- a/drivers/mailbox/pcc.c
> +++ b/drivers/mailbox/pcc.c
> @@ -198,7 +198,7 @@ static int pcc_send_data(struct mbox_chan *chan, void 
> *data)
>   return 0;
>  }
>  
> -static struct mbox_chan_ops pcc_chan_ops = {
> +static const struct mbox_chan_ops pcc_chan_ops = {
>   .send_data = pcc_send_data,
>  };
>  
> diff --git a/include/linux/mailbox_controller.h 
> b/include/linux/mailbox_controller.h
> index d4cf96f..68c4245 100644
> --- a/include/linux/mailbox_controller.h
> +++ b/include/linux/mailbox_controller.h
> @@ -72,7 +72,7 @@ struct mbox_chan_ops {
>   */
>  struct mbox_controller {
>   struct device *dev;
> - struct mbox_chan_ops *ops;
> + const struct mbox_chan_ops *ops;
>   struct mbox_chan *chans;
>   int num_chans;
>   bool txdone_irq;
> 
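
As a rough illustration of the pattern the patch applies, here is a stand-alone sketch of a constant ops table reached through a pointer-to-const. All names here (chan_ops, controller, demo_send_data, controller_send) are hypothetical and much simpler than the real mailbox structures.

```c
#include <stddef.h>

struct chan;	/* opaque in this sketch */

struct chan_ops {
	int (*send_data)(struct chan *chan, void *data);
};

struct controller {
	const struct chan_ops *ops;	/* read-only view of the ops table */
};

static int demo_send_data(struct chan *chan, void *data)
{
	(void)chan;
	return data ? 0 : -1;
}

/* const lets the toolchain place the table in read-only data. */
static const struct chan_ops demo_ops = {
	.send_data = demo_send_data,
};

/* Callers only ever read through ctrl->ops. */
static int controller_send(const struct controller *ctrl, void *data)
{
	return ctrl->ops->send_data(NULL, data);
}
```

With the pointer declared const, an accidental write such as ctrl->ops->send_data = NULL is rejected at compile time.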


Re: perf: fuzzer triggers NULL pointer derefreence in x86_schedule_events

2015-05-04 Thread Stephane Eranian

On Fri, May 1, 2015 at 5:59 AM, Peter Zijlstra  wrote:
>
> On Thu, Apr 30, 2015 at 03:08:56PM -0400, Vince Weaver wrote:
> >
> > So the perf_fuzzer caught this after about a week of fuzzing on a Haswell
> > machine running a recent git kernel (pre 4.1-rc1 though).
> >
> > We've seen this BUG before and various fixes were applied but apparently
> > it wasn't enough.
> >
> > Sadly it doesn't seem to be reproducible.
> >
> > validate_group() -> x86_pmu.schedule_events() ->  -> variable_test_bit()
> >  (hard to tell which test bit with all the inlining going on).
>
> Assuming you build with debug info addr2line -i can help, but I think I
> found it by comparing the Code section below with my objdump -D output.
>
> Its:
> /* constraint still honored */
> if (!test_bit(hwc->idx, c->idxmsk))
> break;
>
> Which would seem to suggest c is NULL.
>
But then you'd crash in the previous loop, because after
get_event_constraints() you touch c->weight. I think it is more likely
related to the bitmask (idxmsk). But that is always allocated with the
constraint, even with the HT bug workaround. So, most likely, the index
is bogus and you touch memory outside the idxmsk[] array.


>
> Lemme go figure out how that could happen.

[RFC] Design for flag bit outputs from asms

2015-05-04 Thread Richard Henderson

On 05/02/2015 05:39 AM, Peter Zijlstra wrote:
> static inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
> {
>   bool oldbit;
> 
>   asm volatile ("btr %2, %1"
> : "CF" (oldbit), "+m" (*addr)
> : "Ir" (nr));
> 
>   return oldbit;
> }
> 
> Be the far better solution for this? Bug 59615 comment 7 states that
> they actually modeled the flags in the .md file, so the above should be
> possible to implement.
> 
> Now GCC can decide to use "sbb %0, %0" to convert CF into a register
> value or use "jnc" / "jc" for branches, depending on what
> __test_and_clear_bit() was used for.
> 
> We don't have to (ab)use asm goto for these things anymore; furthermore
> I think the above will naturally work with our __builtin_expect() hints,
> whereas the asm goto stuff has a hard time with that (afaik).
> 
> That's not to say output operands for asm goto would not still be useful
> for other things (like your EXTABLE example).
> 

(0) The C level output variable should be an integral type, from bool on up.

The flags are a scarce resource, easily clobbered.  We cannot allow user code
to keep data in the flags.  While x86 does have lahf/sahf, they don't exactly
perform well.  And other targets like arm don't even have that bad option.

Therefore, the language level semantics are that the output is a boolean store
into the variable with a condition specified by a magic constraint.

That said, just like the compiler should be able to optimize

void bar(int y)
{
  int x = (y <= 0);
  if (x) foo();
}

such that we only use a single compare against y, the expectation is that
within a similarly constrained context the compiler will not require two tests
for these boolean outputs.

Therefore:

(1) Each target defines a set of constraint strings,

   E.g. for x86, wherein we're almost out of constraint letters,

 ja   aux carry flag
 jc   carry flag
 jo   overflow flag
 jp   parity flag
 js   sign flag
 jz   zero flag

   E.g. for arm/aarch64 (using "j" here, but other possibilities exist):

 jn   negative flag
 jc   carry flag
 jz   zero flag
 jv   overflow flag

   E.g. for s390x (I've thought less about what's useful here)

 j<m>  where m is a hex digit, and is the mask of CC values
   for which the condition is true; exactly corresponding
   to the M1 field in the branch on condition instruction.

(2) A new target hook post-processes the asm_insn, looking for the
new constraint strings.  The hook expands the condition prescribed
by the string, adjusting the asm_insn as required.

  E.g.

bool x, y, z;
asm ("xyzzy" : "=jc"(x), "=jp"(y), "=jo"(z) : : );

  originally

(parallel [
(set (reg:QI 83 [ x ])
(asm_operands/v:QI ("xyzzy") ("=jc") 0 []
 []
 [] z.c:4))
(set (reg:QI 84 [ y ])
(asm_operands/v:QI ("xyzzy") ("=jp") 1 []
 []
 [] z.c:4))
(set (reg:QI 85 [ z ])
(asm_operands/v:QI ("xyzzy") ("=jo") 2 []
 []
 [] z.c:4))
(clobber (reg:QI 18 fpsr))
(clobber (reg:QI 17 flags))
])

  becomes

(parallel [
(set (reg:CC 17 flags)
(asm_operands/v:CC ("xyzzy") ("=j_") 0 []
 []
 [] z.c:4))
(clobber (reg:QI 18 fpsr))
])
(set (reg:QI 83 [ x ])
 (ne:QI (reg:CCC 17 flags) (const_int 0)))
(set (reg:QI 84 [ y ])
 (ne:QI (reg:CCP 17 flags) (const_int 0)))
(set (reg:QI 85 [ z ])
 (ne:QI (reg:CCO 17 flags) (const_int 0)))

  which ought to assemble to something like

xyzzy
setc  %dl
setp  %cl
seto  %r15l

  Note that rtl level data flow is preserved via the flags hard register,
  and the lifetime of flags would not be extended any further than we would
  for a normal cstore pattern.

  Note that the output constraints are adjusted to a single internal "=j_"
  which would match the flags register in any mode.  We can collapse
  several output flags to a single set of the flags hard register.
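
  For comparison, here is a sketch of the quoted __test_and_clear_bit()
  written with the flag-output spelling that, as far as I know, GCC later
  shipped ("=@ccc" for the carry flag, advertised by the
  __GCC_ASM_FLAG_OUTPUTS__ macro) rather than the "=jc" spelling proposed
  here. The function name and the portable fallback branch are my own
  additions for illustration.

```c
#include <stdbool.h>

/* Non-atomic test-and-clear of bit nr (0..63 here) in *addr. */
static inline bool test_and_clear_bit_sketch(long nr,
					     volatile unsigned long *addr)
{
#if defined(__x86_64__) && defined(__LP64__) && \
    defined(__GCC_ASM_FLAG_OUTPUTS__)
	bool oldbit;

	/* btr copies the old bit into CF; "=@ccc" reads CF as a boolean. */
	__asm__ __volatile__("btrq %2, %1"
			     : "=@ccc" (oldbit), "+m" (*addr)
			     : "Ir" (nr)
			     : "memory");
	return oldbit;
#else
	/* Fallback for other targets or compilers: same result in plain C. */
	unsigned long mask = 1UL << nr;
	bool oldbit = (*addr & mask) != 0;

	*addr &= ~mask;
	return oldbit;
#endif
}
```

  When the result feeds straight into a branch, the compiler is free to
  branch on the carry flag directly instead of materializing it with a
  setc first, which is the point made in the quoted message.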

(3) Note that ppc is both easier and more complicated.

  There we have 8 4-bit registers, although most of the integer
  non-comparisons only write to CR0.  And the vector non-comparisons
  only write to CR1, though of course that's of less interest in the
  context of kernel code.

  For the purposes of cr0, the same scheme could certainly work, although
  the hook would not insert a hard register use, but rather a pseudo to
  be allocated to cr0 (constraint "x").

  That said, it's my understanding that "dot insns", setting cr0 are
  expensive in current processor generations.  There's also a lot less
  of the x86-style "operate and set a flag based on something useful".

Can anyone think of any drawbacks


Re: question about RCU dynticks_nesting

2015-05-04 Thread Paul E. McKenney

On Mon, May 04, 2015 at 03:00:44PM -0400, Rik van Riel wrote:
> On 05/04/2015 11:59 AM, Rik van Riel wrote:
> 
> > However, currently the RCU code seems to use a much more
> > complex counting scheme, with a different increment for
> > kernel/task use, and irq use.
> > 
> > This counter seems to be modeled on the task preempt_counter,
> > where we do care about whether we are in task context, irq
> > context, or softirq context.
> > 
> > On the other hand, the RCU code only seems to care about
> > whether or not a CPU is in an extended quiescent state,
> > or is potentially in an RCU critical section.
> > 
> > Paul, what is the reason for RCU using a complex counter,
> > instead of a simple increment for each potential kernel/RCU
> > entry, like rcu_read_lock() does with CONFIG_PREEMPT_RCU
> > enabled?
> 
> Looking at the code for a while more, I have not found
> any reason why the rcu dynticks counter is so complex.

For the nesting counter, please see my earlier email.

> The rdtp->dynticks atomic seems to be used as a serial
> number. Odd means the cpu is in an rcu quiescent state,
> even means it is not.

Yep.

> This test is used to verify whether or not a CPU is
> in rcu quiescent state. Presumably the atomic_add_return
> is used to add a memory barrier.
> 
>   atomic_add_return(0, &rdtp->dynticks) & 0x1)

Yep.  It is sampled remotely, hence the need for full memory barriers.
It doesn't help to sample the counter if the sampling gets reordered
with the surrounding code.  Ditto for the increments.

By the end of the year, and hopefully much sooner, I expect to have
testing infrastructure capable of detecting ordering bugs in this code.
At which point, I can start experimenting with alternative code sequences.

But full ordering is still required, and cache misses can happen.
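
A userspace model of the sampling pattern discussed above, using C11 atomics in place of the kernel primitives. This is only a sketch: the struct and function names are hypothetical, and the parity convention (odd means quiescent) is taken from the description quoted above rather than checked against the kernel sources; flipping EQS_PARITY is a one-line change if the real code uses the opposite convention.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption taken from the quoted description: odd means quiescent. */
#define EQS_PARITY 1

struct dynticks_model {
	atomic_int dynticks;
};

/* Entering or leaving an extended quiescent state flips the parity. */
static void eqs_transition(struct dynticks_model *d)
{
	atomic_fetch_add(&d->dynticks, 1);	/* seq_cst: fully ordered */
}

/* Remote sample: a fully ordered read-modify-write that adds zero. */
static int sample(struct dynticks_model *d)
{
	return atomic_fetch_add(&d->dynticks, 0);
}

static bool in_eqs(int snap)
{
	return (snap & 1) == EQS_PARITY;
}

/* True if the CPU was quiescent at the snapshot or has moved on since. */
static bool passed_quiescent_state(struct dynticks_model *d, int snap)
{
	return in_eqs(snap) || sample(d) != snap;
}
```

The add-zero in sample() mirrors the atomic_add_return(0, ...) idiom: the value is unchanged, but the operation is ordered against the surrounding code, which a plain load would not guarantee.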

> > In fact, would we be able to simply use tsk->rcu_read_lock_nesting
> > as an indicator of whether or not we should bother waiting on that
> > task or CPU when doing synchronize_rcu?
> 
> We seem to have two variants of __rcu_read_lock().
> 
> One increments current->rcu_read_lock_nesting, the other
> calls preempt_disable().

Yep.  The first is preemptible RCU, the second classic RCU.

> In case of the non-preemptible RCU, we could easily also
> increase current->rcu_read_lock_nesting at the same time
> we increase the preempt counter, and use that as the
> indicator to test whether the cpu is in an extended
> rcu quiescent state. That way there would be no extra
> overhead at syscall entry or exit at all. The trick
> would be getting the preempt count and the rcu read
> lock nesting count in the same cache line for each task.

But in non-preemptible RCU, we have PREEMPT=n, so there is no preempt
counter in production kernels.  Even if there was, we have to sample this
on other CPUs, so the overhead of preempt_disable() and preempt_enable()
would be where kernel entry/exit is, so I expect that this would be a
net loss in overall performance.

> In case of the preemptible RCU scheme, we would have to
> examine the per-task state (under the runqueue lock)
> to get the current task info of all CPUs, and in
> addition wait for the blkd_tasks list to empty out
> when doing a synchronize_rcu().
> 
> That does not appear to require special per-cpu
> counters; examining the per-cpu rdp and the lists
> inside it, with the rnp->lock held if doing any
> list manipulation, looks like it would be enough.
> 
> However, the current code is a lot more complicated
> than that. Am I overlooking something obvious, Paul?
> Maybe something non-obvious? :)

Ummm...  The need to maintain memory ordering when sampling task
state from remote CPUs?

Or am I completely confused about what you are suggesting?

That said, are you chasing a real system-visible performance issue
that you tracked to RCU's dyntick-idle system?

Thanx, Paul


Re: question about RCU dynticks_nesting

2015-05-04 Thread Rik van Riel

On 05/04/2015 02:39 PM, Paul E. McKenney wrote:
> On Mon, May 04, 2015 at 11:59:05AM -0400, Rik van Riel wrote:

>> In fact, would we be able to simply use tsk->rcu_read_lock_nesting
>> as an indicator of whether or not we should bother waiting on that
>> task or CPU when doing synchronize_rcu?
> 
> Depends on exactly what you are asking.  If you are asking if I could add
> a few more checks to preemptible RCU and speed up grace-period detection
> in a number of cases, the answer is very likely "yes".  This is on my
> list, but not particularly high priority.  If you are asking whether
> CPU 0 could access ->rcu_read_lock_nesting of some task running on
> some other CPU, in theory, the answer is "yes", but in practice that
> would require putting full memory barriers in both rcu_read_lock()
> and rcu_read_unlock(), so the real answer is "no".
> 
> Or am I missing your point?

The main question is "how can we greatly reduce the overhead
of nohz_full, by simplifying the RCU extended quiescent state
code called in the syscall fast path, and maybe piggyback on
that to do time accounting for remote CPUs?"

Your memory barrier answer above makes it clear we will still
want to do the RCU stuff at syscall entry & exit time, at least
on x86, where we already have automatic and implicit memory
barriers.

-- 
All rights reversed

Re: [PATCH v2] x86: punit_atom: punit device state debug driver

2015-05-04 Thread Thomas Gleixner

On Mon, 4 May 2015, Srinivas Pandruvada wrote:
> +struct punit_device {
> + char *name;
> + int reg;
> + int sss_pos;
> +};
> +
> +static struct punit_device *punit_device;

So this pointer gets initialized in punit_atom_debug_init() and points
either to punit_device_byt or punit_device_cht.

> +static const struct punit_device punit_device_byt[] = {
> + { "GFX RENDER", PWRGT_STATUS,   RENDER_POS },
...
> + { NULL }
> +};

> +static int punit_dev_state_show(struct seq_file *seq_file, void *unused)
> +{
> + u32 punit_pwr_status;
> + int index;
> + int status;
> +
> + seq_puts(seq_file, "\n\nPUNIT NORTH COMPLEX DEVICES :\n");
> + while (punit_device->name) {
> + status = iosf_mbi_read(PUNIT_PORT, BT_MBI_PMC_READ,
> +punit_device->reg,
> +&punit_pwr_status);
> + if (status)
> + seq_printf(seq_file, "%9s : Read Failed\n",
> +punit_device->name);
> + else  {
> + index = (punit_pwr_status >> punit_device->sss_pos) & 3;
> + seq_printf(seq_file, "%9s : %s\n", punit_device->name,
> +dstates[index]);
> + }
> + punit_device++;

So you happily increment the above pointer. So after the first readout
of that debug file the pointer will point to the end of that
table. And any further readouts will simply fail.

Completely useless and untested crap.

Thanks,

tglx
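
The problem described above can be reproduced in a few lines of userspace C. This is a sketch, not the driver: dev_entry, dev_table and both counting functions are hypothetical, and the second table entry is made up. The point is that walking the table through a local cursor leaves the shared pointer untouched, so repeated reads see the whole table.

```c
#include <stddef.h>

struct dev_entry {
	const char *name;
	int reg;
};

static const struct dev_entry dev_table[] = {
	{ "GFX RENDER", 0 },
	{ "SECOND DEVICE", 1 },	/* made-up entry for illustration */
	{ NULL, 0 },		/* sentinel */
};

/* Shared pointer, set once at init time. */
static const struct dev_entry *devices = dev_table;

/* Buggy: advances the shared pointer, so only the first call works. */
static int count_devices_buggy(void)
{
	int n = 0;

	while (devices->name) {
		n++;
		devices++;
	}
	return n;
}

/* Fixed: iterate with a local cursor and leave the table pointer alone. */
static int count_devices(void)
{
	int n = 0;

	for (const struct dev_entry *d = dev_table; d->name; d++)
		n++;
	return n;
}
```

Calling count_devices() twice yields 2 both times; calling count_devices_buggy() twice yields 2 and then 0, because the shared pointer is left at the sentinel.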

Re: [PATCH 4/5] blk-mq: do limited block plug for multiple queue case

2015-05-04 Thread Shaohua Li

On Fri, May 01, 2015 at 04:16:04PM -0400, Jeff Moyer wrote:
> Shaohua Li  writes:
> 
> > plug is still helpful for workload with IO merge, but it can be harmful
> > otherwise especially with multiple hardware queues, as there is
> > (supposed) no lock contention in this case and plug can introduce
> > latency. For multiple queues, we do limited plug, eg plug only if there
> > is request merge. If a request doesn't have merge with following
> > request, the request will be dispatched immediately.
> >
> > This also fixes a bug. If we directly issue a request and it fails, we
> > use blk_mq_merge_queue_io(). But we already assigned bio to a request in
> > blk_mq_bio_to_request. blk_mq_merge_queue_io shouldn't run
> > blk_mq_bio_to_request again.
> 
> Good catch.  Might've been better to split that out first for easy
> backport to stable kernels, but I won't hold you to that.

It's not a severe bug, but I don't mind. Jens, please let me know if I
should split the patch into 2 patches.
 
> > @@ -1243,6 +1277,10 @@ static void blk_mq_make_request(struct request_queue 
> > *q, struct bio *bio)
> > return;
> > }
> >  
> > +   if (likely(!is_flush_fua) && !blk_queue_nomerges(q) &&
> > +   blk_attempt_plug_merge(q, bio, &request_count))
> > +   return;
> > +
> > rq = blk_mq_map_request(q, bio, &data);
> > if (unlikely(!rq))
> > return;
> 
> After this patch, everything up to this point in blk_mq_make_request and
> blk_sq_make_request is the same.  This can be factored out (in another
> patch) to a common function.

I'll leave this for a separate cleanup if a good function name is found.

> > @@ -1253,38 +1291,38 @@ static void blk_mq_make_request(struct 
> > request_queue *q, struct bio *bio)
> > goto run_queue;
> > }
> >  
> > +   plug = current->plug;
> > /*
> >  * If the driver supports defer issued based on 'last', then
> >  * queue it up like normal since we can potentially save some
> >  * CPU this way.
> >  */
> > -   if (is_sync && !(data.hctx->flags & BLK_MQ_F_DEFER_ISSUE)) {
> > -   struct blk_mq_queue_data bd = {
> > -   .rq = rq,
> > -   .list = NULL,
> > -   .last = 1
> > -   };
> > -   int ret;
> > +   if ((plug || is_sync) && !(data.hctx->flags & BLK_MQ_F_DEFER_ISSUE)) {
> > +   struct request *old_rq = NULL;
> 
> I would add a !blk_queue_nomerges(q) to that conditional.  There's no
> point holding back an I/O when we won't merge it anyway.

Good catch! Fixed.
 
> That brings up another quirk of the current implementation (not your
> patches) that bugs me.
> 
> BLK_MQ_F_SHOULD_MERGE
> QUEUE_FLAG_NOMERGES
> 
> Those two flags are set independently, one via the driver and the other
> via a sysfs file.  So the user could set the nomerges flag to 1 or 2,
> and still potentially get merges (see blk_mq_merge_queue_io).  That's
> something that should be fixed, albeit that can wait.

Agree.

> > blk_mq_bio_to_request(rq, bio);
> >  
> > /*
> > -* For OK queue, we are done. For error, kill it. Any other
> > -* error (busy), just add it to our list as we previously
> > -* would have done
> > +* we do limited pluging. If bio can be merged, do merge.
> > +* Otherwise the existing request in the plug list will be
> > +* issued. So the plug list will have one request at most
> >  */
> > -   ret = q->mq_ops->queue_rq(data.hctx, &bd);
> > -   if (ret == BLK_MQ_RQ_QUEUE_OK)
> > -   goto done;
> > -   else {
> > -   __blk_mq_requeue_request(rq);
> > -
> > -   if (ret == BLK_MQ_RQ_QUEUE_ERROR) {
> > -   rq->errors = -EIO;
> > -   blk_mq_end_request(rq, rq->errors);
> > -   goto done;
> > +   if (plug) {
> > +   if (!list_empty(&plug->mq_list)) {
> > +   old_rq = list_first_entry(&plug->mq_list,
> > +   struct request, queuelist);
> > +   list_del_init(&old_rq->queuelist);
> > }
> > -   }
> > +   list_add_tail(&rq->queuelist, &plug->mq_list);
> > +   } else /* is_sync */
> > +   old_rq = rq;
> > +   blk_mq_put_ctx(data.ctx);
> > +   if (!old_rq)
> > +   return;
> > +   if (!blk_mq_direct_issue_request(old_rq))
> > +   return;
> > +   blk_mq_insert_request(old_rq, false, true, true);
> > +   return;
> > }
> 
> Now there is no way to exit that if block, we always return.  It may be
> worth considering moving that block to its own function, if you can think
> of a good name for it.

I'll leave this for a later work
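
For readers following the hunk above, the "limited plug" rule can be sketched in plain userland C. This is only a model under stated assumptions: struct plug_model and limited_plug() are hypothetical names, the plug is reduced to a single slot, and nothing here is the blk-mq code itself.

```c
#include <assert.h>
#include <stddef.h>

struct request { int id; };

/* One-slot stand-in for plug->mq_list (assumption: one queue only). */
struct plug_model { struct request *held; };

/*
 * Returns the request that has to be issued right now, or NULL if
 * nothing is issued yet.
 *  - no plug (the is_sync case): the new request is issued directly
 *  - plug present: the new request is held back, and whatever was
 *    held before is handed out for issue
 */
static struct request *limited_plug(struct plug_model *plug, struct request *rq)
{
	struct request *old_rq;

	assert(rq != NULL);
	if (!plug)
		return rq;

	old_rq = plug->held;
	plug->held = rq;
	return old_rq;
}
```

So the held request only waits until the next bio shows whether it can merge; the slot never holds more than one request.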

> Other than those minor issues, this looks good to me.

Th

Re: earlycon: no match?

2015-05-04 Thread Robert Schwebel

Hi Peter,

On Mon, May 04, 2015 at 10:01:37AM -0400, Peter Hurley wrote:
> > with 4.1-rc1, my boxes with early console enabled show something like
> > this (the example is vexpress, but it for example also happens on an
> > AM335x board):
> > 
> >   earlycon: no match for ttyAMA0,38400n8
> 
> This shouldn't impact any previous earlycon setup. Are you saying
> you're seeing a regression?

Well, it is a warning, and the system was warning-free on mainline with
the last kernels. People assume something is wrong if they read such a
message, so I'm searching for a way to do it right and get rid of the
warning again.

> How do you have early console enabled, via the command line or via DT?

Neither: the same SD card image runs on qemu (vexpress) and on an
AM335x. It has its primary console on the serial console:

- console=ttyAMA0,38400  (amba-pl011.c, vexpress)
- console=ttyO2,115200n8 (omap-serial.c, AM335x)

There is no "earlycon" on the commandline, and I did nothing
earlycon-related on purpose in the oftree.

My expectation would be to configure the system in a way that I have
everything necessary for earlycon usage compiled into the kernel, so I
can enable it manually from the bootloader whenever I need it (i.e. by
adding 'earlycon' to the kernel commandline, or by modifying the oftree
before it is handed over to the kernel).

> > The box was booted with "console=ttyAMA0,38400n8" on the commandline.
> > If I understand this right, the code in drivers/tty/serial/earlycon.c
> > calls setup_earlycon() with the string above ("ttyAMA0,38400n8") and
> > fails to find that string in the "names" part of the __earlycon_table,
> > because for the pl011 component on vexpress, the early console was
> > registered in drivers/tty/serial/amba-pl011.c with:
> > 
> > OF_EARLYCON_DECLARE(pl011, "arm,pl011", pl011_early_console_setup);
> > ^ name
> > 
> > So isn't that trying to match "ttyAMA0" against "arm,pl011"? I have the
> > feeling that I didn't understand the logic behind that.
> > 
> > Can you elaborate about how this is supposed to work correctly?
> 
> Yeah, I've been meaning to write about this but simply haven't had the
> time yet; apologies for that.
> 
> The facility is hopefully best explained by the existing 8250 exemplar.
> Normally, an 8250 early console is started via command line with a
> command line parameter like:
> 
>   earlycon=uart,io,0x2f8,115200n8

What happens if you don't have this parameter on the kernel commandline,
but use the same port for your serial console? i.e. 'console=ttyS0'?

I would expect the same warning I see on my boxes.
 
> Since 2007, an 8250 early console can also be started via command line
> using console= instead, like:
> 
>   console=uart,io,0x2f8,115200n8

No: "console=..." puts the console on that port, not the early console.
The semantics of console= were always to specify the name of the device
there, so "console=ttyS0...", not "console=uart...", right?
 
> In this alternate form, this early console will go on to become the
> corresponding ttyS console.
> 
> However, that functionality was exclusive to 8250 console/earlycon.
> To get this same behavior for the amba-pl011 console would look
> something like:
> 
> /* drivers/tty/serial/amba-pl011.c */
> 
> /* returns 0 if the console matches; otherwise, non-zero to use default matching */
> static int pl011_console_match(struct console *co, char *name, int idx, char *options)
> {
>   unsigned char iotype;
>   unsigned long addr;
>   int i;
> 
>   /* "pl011" is seen here as the name "pl" plus the index 11 */
>   if (strncmp(name, "pl", 2) != 0 || idx != 11)
>   	return -ENODEV;
> 
>   if (uart_parse_earlycon(options, &iotype, &addr, &options))
>   	return -ENODEV;
> 
>   /* find the port from the addr */
>   for (i = 0; i < ARRAY_SIZE(amba_ports); i++) {
>   	struct uart_port *port;
> 
>   	if (amba_ports[i] == NULL)
>   		continue;
>   	/* assumes the uart_port embedded in struct uart_amba_port */
>   	port = &amba_ports[i]->port;
>   	if (port->mapbase != addr)
>   		continue;
> 
>   	co->index = i;
>   	return pl011_console_setup(co, options);
>   }
> 
>   return -ENODEV;
> }
> 
> ...
> 
> static struct console amba_console = {
>   ...
>   .match  = pl011_console_match,
>   ...
> };
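
The "pl"/11 test in the quoted match function makes more sense once you look at how a console= argument is cut into a name, an index and an options string. Here is a minimal userland sketch of that split, as far as I understand the behaviour: the name ends at the first digit or comma, the index is the decimal number found there, and the options are whatever follows the first comma. struct console_arg and split_console_arg() are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct console_arg {
	char name[16];
	int idx;
	const char *options;	/* points into the input string, or NULL */
};

static void split_console_arg(const char *arg, struct console_arg *out)
{
	const char *comma;
	size_t n = 0;

	assert(arg != NULL && out != NULL);

	/* options: everything after the first comma, if there is one */
	comma = strchr(arg, ',');
	out->options = comma ? comma + 1 : NULL;

	/* name: everything up to the first digit or comma */
	while (arg[n] && arg[n] != ',' && !isdigit((unsigned char)arg[n]) &&
	       n < sizeof(out->name) - 1) {
		out->name[n] = arg[n];
		n++;
	}
	out->name[n] = '\0';

	/* index: decimal number at that position, 0 if there is none */
	out->idx = (int)strtoul(arg + n, NULL, 10);
}
```

With this split, "ttyAMA0,38400n8" gives the name "ttyAMA" with index 0, while a string starting with "pl011," gives the name "pl" with index 11.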

pl011 already has:

--8<--8<--8<--8<--8<--8<--

static void pl011_early_write(struct console *con, const char *s, unsigned n)
{
struct earlycon_device *dev = con->data;

uart_console_write(&dev->port, s, n, pl011_putc);
}

static int __init pl011_early_console_setup(struct earlycon_device *device,
const char *opt)
{
if (!device->port.membase)
return -ENODEV;

device->con->write = pl011_early_write;
return 0;
}
EARLYCON_DECLARE(pl011, pl011_early_console_setup);
OF_EARLYCON_DECLARE(pl011, "arm,pl011", pl011_early_console_setup);

--8<--8<--8<--8<--8<--8<---

Re: [PATCH 1/1] signals: don't abuse __flush_signals() in selinux_bprm_committed_creds()

2015-05-04 Thread Paul Moore

On Monday, May 04, 2015 06:45:58 PM Oleg Nesterov wrote:
> selinux_bprm_committed_creds()->__flush_signals() is not right, we
> shouldn't clear TIF_SIGPENDING unconditionally. There can be other
> reasons for signal_pending(): freezing(), JOBCTL_PENDING_MASK, and
> potentially more.
> 
> Also change this code to check fatal_signal_pending() rather than
> SIGNAL_GROUP_EXIT, it looks a bit better.
> 
> Now we can kill __flush_signals() before it finds another buggy user.

[NOTE: Added the SELinux list to the CC line]

This looks reasonable to me, I'm going to apply it to selinux#next today.
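
To spell out why the unconditional clear is the problem, here is a small userland model. It is a sketch only: struct task_model and its fields are hypothetical stand-ins for TIF_SIGPENDING and the other reasons a task can have signal_pending() set, and recalc_model() only approximates what I understand recalc_sigpending() to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_model {
	int queued_signals;	/* entries in pending + shared_pending */
	bool freezing;		/* the freezer wants this task */
	bool jobctl_pending;	/* stands in for JOBCTL_PENDING_MASK */
	bool sigpending;	/* stands in for TIF_SIGPENDING */
};

/* Recompute the flag from every source that can require it. */
static void recalc_model(struct task_model *t)
{
	assert(t != NULL);
	t->sigpending = t->queued_signals > 0 || t->freezing ||
			t->jobctl_pending;
}

/* Old behaviour: drop the queues and clear the flag no matter what. */
static void flush_unconditional(struct task_model *t)
{
	assert(t != NULL);
	t->queued_signals = 0;
	t->sigpending = false;
}

/* New behaviour: drop the queues, then recompute the flag. */
static void flush_and_recalc(struct task_model *t)
{
	assert(t != NULL);
	t->queued_signals = 0;
	recalc_model(t);
}
```

In the model, a task that is being frozen keeps its flag after flush_and_recalc() but loses it after flush_unconditional().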

> Note: this code looks racy, we can flush a signal which was sent after
> the task SID has been updated.

The whole signal flush thread has started some discussions about how we are 
currently handling this, and if it still makes sense.  Like many things, it 
seemed like a good idea at the time, but after several years we're debating if 
that is still the case.  I expect we'll be changing this code soon.

> Signed-off-by: Oleg Nesterov 
> ---
>  include/linux/sched.h|1 -
>  kernel/signal.c  |   13 -
>  security/selinux/hooks.c |6 --
>  3 files changed, 8 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 8db31ef..eb1ac84 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2345,7 +2345,6 @@ extern void sched_dead(struct task_struct *p);
> 
>  extern void proc_caches_init(void);
>  extern void flush_signals(struct task_struct *);
> -extern void __flush_signals(struct task_struct *);
>  extern void ignore_signals(struct task_struct *);
>  extern void flush_signal_handlers(struct task_struct *, int force_default);
>  extern int dequeue_signal(struct task_struct *tsk, sigset_t *mask,
>  siginfo_t *info);
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 16a3052..837ca7d 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -414,21 +414,16 @@ void flush_sigqueue(struct sigpending *queue)
>  }
> 
>  /*
> - * Flush all pending signals for a task.
> + * Flush all pending signals for this kthread.
>   */
> -void __flush_signals(struct task_struct *t)
> -{
> - clear_tsk_thread_flag(t, TIF_SIGPENDING);
> - flush_sigqueue(&t->pending);
> - flush_sigqueue(&t->signal->shared_pending);
> -}
> -
>  void flush_signals(struct task_struct *t)
>  {
>   unsigned long flags;
> 
>   spin_lock_irqsave(&t->sighand->siglock, flags);
> - __flush_signals(t);
> + clear_tsk_thread_flag(t, TIF_SIGPENDING);
> + flush_sigqueue(&t->pending);
> + flush_sigqueue(&t->signal->shared_pending);
>   spin_unlock_irqrestore(&t->sighand->siglock, flags);
>  }
> 
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 6da7532..6907d11 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -2397,10 +2397,12 @@ static void selinux_bprm_committed_creds(struct linux_binprm *bprm)
>   for (i = 0; i < 3; i++)
>   do_setitimer(i, &itimer, NULL);
>   spin_lock_irq(&current->sighand->siglock);
> - if (!(current->signal->flags & SIGNAL_GROUP_EXIT)) {
> - __flush_signals(current);
> + if (!fatal_signal_pending(current)) {
> + flush_sigqueue(&current->pending);
> + flush_sigqueue(&current->signal->shared_pending);
>   flush_signal_handlers(current, 1);
>   sigemptyset(&current->blocked);
> + recalc_sigpending();
>   }
>   spin_unlock_irq(&current->sighand->siglock);
>   }

-- 
paul moore
www.paul-moore.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] ARM: bcm2835: Use 0x4 prefix for DMA bus addresses to SDRAM.

2015-05-04 Thread Eric Anholt

There exists a tiny MMU, configurable only by the VC (running the
closed firmware), which maps from the ARM's physical addresses to bus
addresses.  These bus addresses determine the caching behavior in the
VC's L1/L2 (note: separate from the ARM's L1/L2) according to the top
2 bits.  The bits in the bus address mean:

From the VideoCore processor:
0x0... L1 and L2 cache allocating and coherent
0x4... L1 non-allocating, but coherent. L2 allocating and coherent
0x8... L1 non-allocating, but coherent. L2 non-allocating, but coherent
0xc... SDRAM alias. Cache is bypassed. Not L1 or L2 allocating or coherent

From the GPU peripherals (note: all peripherals bypass the L1
cache. The ARM will see this view once through the VC MMU):
0x0... Do not use
0x4... L1 non-allocating, and incoherent. L2 allocating and coherent.
0x8... L1 non-allocating, and incoherent. L2 non-allocating, but coherent
0xc... SDRAM alias. Cache is bypassed. Not L1 or L2 allocating or coherent

The 2835 firmware always configures the MMU to turn ARM physical
addresses with 0x0 top bits to 0x4, meaning present in L2 but
incoherent with L1.  However, any bus addresses we were generating in
the kernel to be passed to a device had 0x0 bits.  That would be a
reserved (possibly totally incoherent) value if sent to a GPU
peripheral like USB, or L1 allocating if sent to the VC (like a
firmware property request).  By setting dma-ranges, all of the devices
below it get a dev->dma_pfn_offset, so that dma_alloc_coherent() and
friends return addresses with 0x4 bits and avoid cache incoherency.
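
To make the address arithmetic concrete, here is a minimal C sketch that is not part of the patch. It assumes the "0x4" prefix corresponds to a bus offset of 0x40000000 (top two bits 01) and that SDRAM physical addresses lie below 1 GiB; phys_to_bus() and bus_alias() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define VC_BUS_OFFSET_L2_COHERENT 0x40000000u	/* assumed value of the 0x4 alias */

/* ARM physical address -> bus address handed to a device. */
static uint32_t phys_to_bus(uint32_t phys)
{
	assert(phys < 0x40000000u);	/* assumption: SDRAM below 1 GiB */
	return phys + VC_BUS_OFFSET_L2_COHERENT;
}

/* Top two bits of a bus address: 0 -> 0x0, 1 -> 0x4, 2 -> 0x8, 3 -> 0xc. */
static unsigned int bus_alias(uint32_t bus)
{
	return bus >> 30;
}
```

For example, physical 0x01000000 becomes bus address 0x41000000, which falls in the 0x4 alias of the tables above.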

This matches the behavior in the downstream 2708 kernel (see
BUS_OFFSET in arch/arm/mach-bcm2708/include/mach/memory.h).

Signed-off-by: Eric Anholt 
Cc: popcorn...@gmail.com
---
 arch/arm/boot/dts/bcm2835.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/bcm2835.dtsi b/arch/arm/boot/dts/bcm2835.dtsi
index 5734650..2df1b5c 100644
--- a/arch/arm/boot/dts/bcm2835.dtsi
+++ b/arch/arm/boot/dts/bcm2835.dtsi
@@ -15,6 +15,7 @@
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x7e00 0x2000 0x0200>;
+   dma-ranges = <0x4000 0x 0x1f00>;
 
timer@7e003000 {
compatible = "brcm,bcm2835-system-timer";
-- 
2.1.4


Re: [PATCH 5/5] blk-mq: make plug work for mutiple disks and queues

2015-05-04 Thread Shaohua Li

On Fri, May 01, 2015 at 04:55:39PM -0400, Jeff Moyer wrote:
> Shaohua Li  writes:
> 
> > Last patch makes plug work for multiple queue case. However it only
> > works for single disk case, because it assumes only one request in the
> > plug list. If a task is accessing multiple disks, eg MD/DM, the
> > assumption is wrong. Let blk_attempt_plug_merge() record request from
> > the same queue.
> 
> I understand the desire to be performant, and that's why you
> piggy-backed the same_queue_rq onto this function, but it sure looks
> hackish.  I'd almost rather walk the list a second time instead of add
> warts like this.  To be perfectly clear, this is what I really don't
> like about it: we're relying on the nuance that we will only add a
> single request per queue to the plug_list in the mq case.  When you
> start to look at what happens for the sq case, it doesn't make much
> sense at all, as you'll just return the first entry in the list (last
> one visited walking the list backwards) that has the same request queue.
> That's fine if this code were mq specific, but it isn't.  It will just
> lead to confusion for others reviewing the code, and may trip up anyone
> modifying the mq plugging code.
> 
> I'll leave it up to Jens to decide if this is fit for inclusion.  The
> patch /functionally/ looks fine to me.

I really don't want to walk the list a second time, for performance
reasons. I added some comments in the code; hopefully that makes it better.
 
> > Cc: Jens Axboe 
> > Cc: Christoph Hellwig 
> > Signed-off-by: Shaohua Li 
> > ---
> >  block/blk-core.c | 10 +++---
> >  block/blk-mq.c   | 11 ++-
> >  block/blk.h  |  3 ++-
> >  3 files changed, 15 insertions(+), 9 deletions(-)
> >
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index d51ed61..a5e1574 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -1521,7 +1521,8 @@ bool bio_attempt_front_merge(struct request_queue *q, struct request *req,
> >   * Caller must ensure !blk_queue_nomerges(q) beforehand.
> >   */
> >  bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
> > -   unsigned int *request_count)
> > +   unsigned int *request_count,
> > +   struct request **same_queue_rq)
> >  {
> > struct blk_plug *plug;
> > struct request *rq;
> > @@ -1541,8 +1542,10 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
> > list_for_each_entry_reverse(rq, plug_list, queuelist) {
> > int el_ret;
> >  
> > -   if (rq->q == q)
> > +   if (rq->q == q) {
> > (*request_count)++;
> > +   *same_queue_rq = rq;
> > +   }
> 
> Out of the 3 callers of blk_attempt_plug_merge, only one will use the
> result, yet all of them have to provide the argument.  How about just
> handling NULL in there?

Ok, fixed. Also the updated patch fixed a bug in the patch.
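
The NULL handling discussed above amounts to an optional out-parameter. A userland sketch of the idea, assuming an array in place of the plug list and hypothetical names (count_same_queue(), queue_id):

```c
#include <assert.h>
#include <stddef.h>

struct request { int queue_id; };

/*
 * Walk the list backwards, count the requests that belong to queue_id
 * and, only if the caller asked for it, record the last match visited
 * (which is the first such entry in list order).
 */
static unsigned int count_same_queue(struct request *list, size_t n,
				     int queue_id,
				     struct request **same_queue_rq)
{
	unsigned int count = 0;
	size_t i;

	assert(list != NULL || n == 0);
	for (i = n; i-- > 0; ) {
		if (list[i].queue_id != queue_id)
			continue;
		count++;
		if (same_queue_rq)	/* callers that don't care pass NULL */
			*same_queue_rq = &list[i];
	}
	return count;
}
```

Callers that only need the count pass NULL and never have to declare a dummy pointer.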


From 1218a50d39dbd11d0fbd15547516d1a4a92df0b5 Mon Sep 17 00:00:00 2001
Message-Id: <1218a50d39dbd11d0fbd15547516d1a4a92df0b5.1430766392.git.s...@fb.com>
From: Shaohua Li 
Date: Wed, 29 Apr 2015 16:58:20 -0700
Subject: [PATCH 5/5] blk-mq: make plug work for mutiple disks and queues

Last patch makes plug work for multiple queue case. However it only
works for single disk case, because it assumes only one request in the
plug list. If a task is accessing multiple disks, eg MD/DM, the
assumption is wrong. Let blk_attempt_plug_merge() record request from
the same queue.

Cc: Jens Axboe 
Cc: Christoph Hellwig 
Signed-off-by: Shaohua Li 
---
 block/blk-core.c | 15 ---
 block/blk-mq.c   | 12 
 block/blk.h  |  3 ++-
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index d51ed61..503927e 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1521,7 +1521,8 @@ bool bio_attempt_front_merge(struct request_queue *q, struct request *req,
  * Caller must ensure !blk_queue_nomerges(q) beforehand.
  */
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
-   unsigned int *request_count)
+   unsigned int *request_count,
+   struct request **same_queue_rq)
 {
struct blk_plug *plug;
struct request *rq;
@@ -1541,8 +1542,16 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
list_for_each_entry_reverse(rq, plug_list, queuelist) {
int el_ret;
 
-   if (rq->q == q)
+   if (rq->q == q) {
(*request_count)++;
+   /*
+* Only blk-mq multiple hardware queues case checks the
+* rq in the same queue, there should be only one such
+* rq in a queue
+**/
+   if (same_queue_rq)
+

Re: [PATCH v4 02/20] clk: tegra: periph: add new periph clks and muxes for Tegra210

2015-05-04 Thread Benson Leung

On Mon, May 4, 2015 at 9:37 AM, Rhyland Klein  wrote:
> Tegra210 has significant differences in muxes for peripheral clocks.
> One of the most important changes is that pll_m isn't to be used
> as a source for peripherals. Therefore, we need to define the new
> muxes and new clocks to use those muxes for Tegra210 support.
>
> Signed-off-by: Rhyland Klein 
> ---
>  drivers/clk/tegra/clk-id.h   |   57 +++-
>  drivers/clk/tegra/clk-tegra-periph.c |  257 
> +-
>  2 files changed, 312 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/clk/tegra/clk-id.h b/drivers/clk/tegra/clk-id.h
> index 60738cc954cb..ac6eaba5cc6e 100644
> --- a/drivers/clk/tegra/clk-id.h
> +++ b/drivers/clk/tegra/clk-id.h
> @@ -13,6 +13,7 @@ enum clk_id {
> tegra_clk_amx1,
> tegra_clk_apbdma,
> tegra_clk_apbif,
> +   tegra_clk_ape,
> tegra_clk_audio0,
> tegra_clk_audio0_2x,
> tegra_clk_audio0_mux,
> @@ -38,6 +39,7 @@ enum clk_id {
> tegra_clk_cile,
> tegra_clk_clk_32k,
> tegra_clk_clk72Mhz,
> +   tegra_clk_clk72Mhz_8,
> tegra_clk_clk_m,
> tegra_clk_clk_m_div2,
> tegra_clk_clk_m_div4,
> @@ -51,17 +53,21 @@ enum clk_id {
> tegra_clk_cml1,
> tegra_clk_csi,
> tegra_clk_csite,
> +   tegra_clk_csite_8,
> tegra_clk_csus,
> tegra_clk_cve,
> tegra_clk_dam0,
> tegra_clk_dam1,
> tegra_clk_dam2,
> tegra_clk_d_audio,
> +   tegra_clk_dbgapb,
> tegra_clk_dds,
> tegra_clk_dfll_ref,
> tegra_clk_dfll_soc,
> tegra_clk_disp1,
> +   tegra_clk_disp1_8,
> tegra_clk_disp2,
> +   tegra_clk_disp2_8,
> tegra_clk_dp2,
> tegra_clk_dpaux,
> tegra_clk_dsialp,
> @@ -71,6 +77,7 @@ enum clk_id {
> tegra_clk_dtv,
> tegra_clk_emc,
> tegra_clk_entropy,
> +   tegra_clk_entropy_8,
> tegra_clk_epp,
> tegra_clk_epp_8,
> tegra_clk_extern1,
> @@ -85,12 +92,15 @@ enum clk_id {
> tegra_clk_gr3d_8,
> tegra_clk_hclk,
> tegra_clk_hda,
> +   tegra_clk_hda_8,
> tegra_clk_hda2codec_2x,
> +   tegra_clk_hda2codec_2x_8,
> tegra_clk_hda2hdmi,
> tegra_clk_hdmi,
> tegra_clk_hdmi_audio,
> tegra_clk_host1x,
> tegra_clk_host1x_8,
> +   tegra_clk_host1x_9,
> tegra_clk_i2c1,
> tegra_clk_i2c2,
> tegra_clk_i2c3,
> @@ -110,11 +120,14 @@ enum clk_id {
> tegra_clk_i2s4_sync,
> tegra_clk_isp,
> tegra_clk_isp_8,
> +   tegra_clk_isp_9,
> tegra_clk_ispb,
> tegra_clk_kbc,
> tegra_clk_kfuse,
> tegra_clk_la,
> +   tegra_clk_maud,
> tegra_clk_mipi,
> +   tegra_clk_mipibif,
> tegra_clk_mipi_cal,
> tegra_clk_mpe,
> tegra_clk_mselect,
> @@ -124,11 +137,16 @@ enum clk_id {
> tegra_clk_ndspeed,
> tegra_clk_ndspeed_8,
> tegra_clk_nor,
> +   tegra_clk_nvdec,
> +   tegra_clk_nvenc,
> +   tegra_clk_nvjpg,
> tegra_clk_owr,
> +   tegra_clk_owr_8,
> tegra_clk_pcie,
> tegra_clk_pclk,
> tegra_clk_pll_a,
> tegra_clk_pll_a_out0,
> +   tegra_clk_pll_a1,
> tegra_clk_pll_c,
> tegra_clk_pll_c2,
> tegra_clk_pll_c3,
> @@ -140,8 +158,10 @@ enum clk_id {
> tegra_clk_pll_d_out0,
> tegra_clk_pll_dp,
> tegra_clk_pll_e_out0,
> +   tegra_clk_pll_g_ref,
> tegra_clk_pll_m,
> tegra_clk_pll_m_out1,
> +   tegra_clk_pll_mb,
> tegra_clk_pll_p,
> tegra_clk_pll_p_out1,
> tegra_clk_pll_p_out2,
> @@ -160,52 +180,77 @@ enum clk_id {
> tegra_clk_pll_x,
> tegra_clk_pll_x_out0,
> tegra_clk_pwm,
> +   tegra_clk_qspi,
> tegra_clk_rtc,
> tegra_clk_sata,
> +   tegra_clk_sata_8,
> tegra_clk_sata_cold,
> tegra_clk_sata_oob,
> +   tegra_clk_sata_oob_8,
> tegra_clk_sbc1,
> tegra_clk_sbc1_8,
> +   tegra_clk_sbc1_9,
> tegra_clk_sbc2,
> tegra_clk_sbc2_8,
> +   tegra_clk_sbc2_9,
> tegra_clk_sbc3,
> tegra_clk_sbc3_8,
> +   tegra_clk_sbc3_9,
> tegra_clk_sbc4,
> tegra_clk_sbc4_8,
> +   tegra_clk_sbc4_9,
> tegra_clk_sbc5,
> tegra_clk_sbc5_8,
> tegra_clk_sbc6,
> tegra_clk_sbc6_8,
> tegra_clk_sclk,
> +   tegra_clk_sdmmc_legacy,
> tegra_clk_sdmmc1,
> tegra_clk_sdmmc1_8,
> +   tegra_clk_sdmmc1_9,
> tegra_clk_sdmmc2,
> tegra_clk_sdmmc2_8,
> +   tegra_clk_sdmmc2_9,
> tegra_clk_sdmmc3,
> tegra_clk_sdmmc3_8,
> +   tegra_clk_sdmmc3_9,
> tegra_clk_sdmmc4,
> tegra_clk_sdmmc4_8,
> +   tegra_clk_sdmmc4_9,
> tegra_clk_se,
> tegra_clk_soc_therm,
> +   tegra_clk_soc_therm_8

Re: [PATCH 4/5] blk-mq: do limited block plug for multiple queue case

2015-05-04 Thread Jens Axboe


On 05/04/2015 01:40 PM, Shaohua Li wrote:
> On Fri, May 01, 2015 at 04:16:04PM -0400, Jeff Moyer wrote:
>> Shaohua Li  writes:
>>
>>> plug is still helpful for workload with IO merge, but it can be harmful
>>> otherwise especially with multiple hardware queues, as there is
>>> (supposed) no lock contention in this case and plug can introduce
>>> latency. For multiple queues, we do limited plug, eg plug only if there
>>> is request merge. If a request doesn't have merge with following
>>> request, the request will be dispatched immediately.
>>>
>>> This also fixes a bug. If we directly issue a request and it fails, we
>>> use blk_mq_merge_queue_io(). But we already assigned bio to a request in
>>> blk_mq_bio_to_request. blk_mq_merge_queue_io shouldn't run
>>> blk_mq_bio_to_request again.
>>
>> Good catch.  Might've been better to split that out first for easy
>> backport to stable kernels, but I won't hold you to that.
>
> It's not a severe bug, but I don't mind. Jens, please let me know if I
> should split the patch into 2 patches.


I don't care that much for this particular case. But since one/more of 
the others need respin anyway, might be prudent to split it up in any case.



--
Jens Axboe


Re: [PATCH] blk-mq: don't lose requests if a stopped queue restarts

2015-05-04 Thread Shaohua Li

On Mon, May 04, 2015 at 01:17:19PM -0600, Jens Axboe wrote:
> On 05/02/2015 06:31 PM, Shaohua Li wrote:
> >Normally if driver is busy to dispatch a request the logic is like below:
> >block layer:                         driver:
> >     __blk_mq_run_hw_queue
> >a.                                   blk_mq_stop_hw_queue
> >b.   rq add to ctx->dispatch
> >
> >later:
> >1.                                   blk_mq_start_hw_queue
> >2.   __blk_mq_run_hw_queue
> >
> >But it's possible step 1-2 runs between a and b. And since rq isn't in
> >ctx->dispatch yet, step 2 will not run rq. The rq might get lost if
> >there are no subsequent requests kick in.
> 
> Good catch! But the patch introduces a potentially never ending loop
> in __blk_mq_run_hw_queue(). Not sure how we can fully close it, but
> it might be better to punt the re-run after adding the requests back
> to the worker. That would turn a potential busy loop (until requests
> complete) into something with nicer behavior, at least. Ala
> 
> if (!test_bit(BLK_MQ_S_STOPPED, &hctx->state))
>  kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> &hctx->run_work, 0);

My first version of the patch is like this, but I changed my mind later.
The assumption is driver will stop queue if it's busy to dispatch
request.  If the driver is buggy, we will have the endless loop here.
Should we assume drivers will not do the right thing?

Thanks,
Shaohua

Re: [PATCH v5 3/4] iio: trigger: Introduce IIO hrtimer based trigger

2015-05-04 Thread Lars-Peter Clausen


On 05/04/2015 12:50 PM, Daniel Baluta wrote:
[...]

+IIO_HRTIMER_INFO_ATTR(sampling_frequency, S_IRUGO | S_IWUSR,
+ iio_hrtimer_info_show_sampling_frequency,
+ iio_hrtimer_info_store_sampling_frequency);


I wonder if the sampling frequency should be configurable through the regular
IIO API, just like any other IIO device. But things like min/max sampling
frequency should be configured in configfs.


[...]

+#endif /* CONFIGFS_FS */
+

[...]

+static struct iio_sw_trigger *iio_trig_hrtimer_probe(const char *name)
+{

[...]

+#ifdef CONFIG_CONFIGFS_FS
+   config_group_init_type_name(&trig_info->swt.group, name,
+   &iio_hrtimer_type);
+#endif


This should probably have a helper function in the sw trigger core that
gets stubbed out when CONFIG_CONFIGFS_FS is disabled. Otherwise we'll see the
same #ifdef in every software trigger driver.

[...]

+}
+
+static int iio_trig_hrtimer_remove(struct iio_sw_trigger *swt)
+{
+   struct iio_hrtimer_info *trig_info;
+
+   trig_info = iio_trigger_get_drvdata(swt->trigger);
+
+   hrtimer_cancel(&trig_info->timer);
+
+   iio_trigger_unregister(swt->trigger);
+   iio_trigger_free(swt->trigger);


There is a bit of a race condition here. hrtimer_cancel() should be called 
between unregister and free, otherwise it might be re-armed before it is 
unregistered.



+   kfree(trig_info);
+
+   return 0;
+}
+
+struct iio_sw_trigger_ops iio_trig_hrtimer_ops = {


const


+   .probe  = iio_trig_hrtimer_probe,
+   .remove = iio_trig_hrtimer_remove,
+};

[...]
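
On the ordering comment for iio_trig_hrtimer_remove() above, a tiny userland model may help show the window. It is a sketch under one assumption taken from the review comment: consumers can re-arm the timer only while the trigger is still registered. All names here (struct trig_model, model_*) are hypothetical and none of this is IIO code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct trig_model {
	bool registered;	/* trigger visible to consumers */
	bool timer_armed;	/* hrtimer currently running */
};

/* A consumer re-arms the timer; only possible while registered. */
static bool consumer_rearm(struct trig_model *t)
{
	assert(t != NULL);
	if (!t->registered)
		return false;
	t->timer_armed = true;
	return true;
}

static void model_unregister(struct trig_model *t)
{
	assert(t != NULL);
	t->registered = false;
}

static void model_cancel_timer(struct trig_model *t)
{
	assert(t != NULL);
	t->timer_armed = false;
}

/* Suggested order: unregister first, then cancel, then free. */
static void model_remove(struct trig_model *t)
{
	model_unregister(t);	/* nobody can re-arm after this point */
	model_cancel_timer(t);	/* so this cancel is final */
	/* freeing the trigger and trig_info would follow here */
}
```

With cancel first, a consumer_rearm() that slips in before the unregister leaves the timer armed when the memory is freed; with the order in model_remove() that cannot happen in the model.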


Re: [PATCH 01/10] Add parse_integer() (replacement for simple_strto*())

2015-05-04 Thread Alexey Dobriyan

On Mon, May 04, 2015 at 06:44:42PM +0200, Rasmus Villemoes wrote:
> [I'm merging the subthreads below]
> 
> On Mon, May 04 2015, Alexey Dobriyan  wrote:
> 
> > On Mon, May 4, 2015 at 4:24 PM, Rasmus Villemoes

> >> Is there any reason to disallow "-0"?
> >
> > No! -0 is not accepted because code is copied from kstrtoll()
> > which doesn't accept "-0". It is even in the testsuite:
> >
> >   static void __init test_kstrtoll_fail(void)
> >   {
> >   ...
> > /* negative zero isn't an integer in Linux */
> > {"-0",  0},
> > {"-0",  8},
> > {"-0",  10},
> > {"-0",  16},
> >
> > Frankly I don't even remember why it does that, and
> > no one noticed until now. libc functions accept "-0".
> 
> I think it's odd to accept "+0" but not "-0", but that's probably just
> because I'm a mathematician. Am I right that you just added these test
> cases because of the existing behaviour of kstrtoll? I suppose that
> behaviour is just a historical accident.
> 
> If "-0" is not going to be accepted, I think that deserves a comment
> (with rationale) in the parsing code and not hidden away in the test
> suite.

Again, I honestly do not remember why "-0" was banned.
Let's change it to "+0 -0" for signed case, "+0" for unsigned case.

> >>>  unsigned long long memparse(const char *ptr, char **retptr)
> >>>  {
> >>> - char *endptr;   /* local pointer to end of parsed string */
> >>> + unsigned long long val;
> >>>
> >>> - unsigned long long ret = simple_strtoull(ptr, &endptr, 0);
> >>> -
> >>> - switch (*endptr) {
> >>> + ptr += parse_integer(ptr, 0, &val);
> >>
> >> This seems wrong. simple_strtoull used to "sanitize" the return value
> >> from the (old) _parse_integer, so that endptr still points into the
> >> given string. Unconditionally adding the result from parse_integer may
> >> make ptr point far before the actual string, into who-knows-what.
> >
> > When converting I tried to preserve the amount of error checking done.
> > simple_strtoull() either
> > a) return 0 and not advance pointer, or
> > b) return something and advance pointer.
> >
> 
> Are we talking about the same simple_strtoull? I see
> 
>   cp = _parse_integer_fixup_radix(cp, &base);
>   rv = _parse_integer(cp, base, &result);
>   /* FIXME */
>   cp += (rv & ~KSTRTOX_OVERFLOW);
> 
> so cp is definitely advanced even in case of overflow. And in the case
> of "underflow" (no digits found), the old code does initialize *result
> to 0, while parse_integer by design doesn't write anything.
> 
> > Current code just ignores error case, so do I.
> 
> There's a difference between ignoring an error (which the current code
> does), and ignoring _the possibility_ of an error (which the new code
> does).
> 
> There are lots of callers of memparse(), and I don't think any of them
> are prepared to handle *endp ending up pointing before the passed-in
> string (-EINVAL == -22, -ERANGE == -34). I can easily see how that could
> lead to an infinite loop, maybe worse.

Yeah, a possible bug could become worse, so I'll add error checking.
But, seriously, you're defending this :^)

case Opt_nr_inodes:
===>/* memparse() will accept a K/M/G without a digit */
===>if (!isdigit(*args[0].from))
===>goto bad_val;
pconfig->nr_inodes = memparse(args[0].from, &rest);
break;

memparse() is misdesigned in the same sense strtoul() is misdesigned.
Every "memparse(s, NULL)" user is a bug for example.
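
For the record, the kind of error checking discussed above could look like the userland sketch below. It is only an illustration: parse_u64_dec() is a hypothetical decimal-only stand-in for parse_integer() (characters consumed on success, -EINVAL or -ERANGE on failure), and memparse_checked() is a hypothetical caller, not the kernel's memparse().

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Returns the number of characters consumed, or a negative errno. */
static int parse_u64_dec(const char *s, unsigned long long *val)
{
	unsigned long long acc = 0;
	int n = 0;

	assert(s != NULL && val != NULL);
	while (isdigit((unsigned char)s[n])) {
		unsigned int d = (unsigned int)(s[n] - '0');

		if (acc > (ULLONG_MAX - d) / 10)
			return -ERANGE;	/* acc * 10 + d would overflow */
		acc = acc * 10 + d;
		n++;
	}
	if (n == 0)
		return -EINVAL;		/* no digits at all */
	*val = acc;
	return n;
}

static unsigned long long memparse_checked(const char *ptr, const char **retptr)
{
	unsigned long long val = 0;
	int rv = parse_u64_dec(ptr, &val);

	if (rv < 0) {
		/* never add a negative errno to the pointer */
		if (retptr)
			*retptr = ptr;
		return 0;
	}
	ptr += rv;

	switch (*ptr) {
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		ptr++;
		break;
	default:
		break;
	}
	if (retptr)
		*retptr = ptr;
	return val;
}
```

The point is only that the return value is checked before it is added to ptr, so the end pointer can never land before the start of the string.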

Re: [PATCH] blk-mq: don't lose requests if a stopped queue restarts

2015-05-04 Thread Jens Axboe


On 05/04/2015 01:51 PM, Shaohua Li wrote:
> On Mon, May 04, 2015 at 01:17:19PM -0600, Jens Axboe wrote:
>> On 05/02/2015 06:31 PM, Shaohua Li wrote:
>>> Normally if driver is busy to dispatch a request the logic is like below:
>>> block layer:                         driver:
>>>      __blk_mq_run_hw_queue
>>> a.                                   blk_mq_stop_hw_queue
>>> b.   rq add to ctx->dispatch
>>>
>>> later:
>>> 1.                                   blk_mq_start_hw_queue
>>> 2.   __blk_mq_run_hw_queue
>>>
>>> But it's possible step 1-2 runs between a and b. And since rq isn't in
>>> ctx->dispatch yet, step 2 will not run rq. The rq might get lost if
>>> there are no subsequent requests kick in.
>>
>> Good catch! But the patch introduces a potentially never ending loop
>> in __blk_mq_run_hw_queue(). Not sure how we can fully close it, but
>> it might be better to punt the re-run after adding the requests back
>> to the worker. That would turn a potential busy loop (until requests
>> complete) into something with nicer behavior, at least. Ala
>>
>> if (!test_bit(BLK_MQ_S_STOPPED, &hctx->state))
>>      kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
>>                                       &hctx->run_work, 0);
>
> My first version of the patch is like this, but I changed my mind later.
> The assumption is driver will stop queue if it's busy to dispatch
> request.  If the driver is buggy, we will have the endless loop here.
> Should we assume drivers will not do the right thing?


There's really no contract that says the driver MUST stop the queue for 
busy. It could, legitimately, decide to just always run the queue when 
requests complete.


It might be better to simply force this behavior. If we get a BUSY, stop 
the queue from __blk_mq_run_hw_queue(). And if the bit isn't still set 
on re-add, then we know we need to re-run it. I think that would be a 
cleaner API, less fragile, and harder to get wrong. The down side is 
that now this stop happens implicitly by the core, and the driver must 
now have an asymmetric queue start when it frees the limited resource 
that caused the BUSY return. Either that, or we define a 2nd set of 
start/stop bits, one used exclusively by the driver and one used 
exclusively by blk-mq. Then blk-mq could restart the queue on completion 
of a request, since it would then know that blk-mq was the one that 
stopped it.


--
Jens Axboe


Re: [PATCH] dmaengine: pl300: enable the clock to PL330 dma

2015-05-04 Thread Dinh Nguyen

On 05/04/2015 09:06 AM, Dinh Nguyen wrote:
> +CC Olof
> 
> On 5/4/15 8:50 AM, Krzysztof Kozlowski wrote:
>> 2015-05-04 22:28 GMT+09:00 Dinh Nguyen :
>>> Hi Krzystof,
>>>
>>> On 5/4/15 12:30 AM, Krzysztof Kozlowski wrote:
>>>> 2015-05-04 13:28 GMT+09:00  :
>>>>> From: Dinh Nguyen 
>>>>>
>>>>> Turn on the clock to the PL330 DMA if there is a clock node provided.
>>>>
>>>> Why? There is no explanation in the patch for this important question - 
>>>> why?
>>>>
>>>> Amba bus already does this and provide a wrapper function.
>>>> Additionally that would mess up with runtime PM and clock
>>>> enable/disable.
>>>
>>> I don't see the clock for the DMA getting turned on at all, which is why
>>> after the kernel has booted, the filesystem tries to open up a serial
>>> port using DMA and the system hangs. The failure is seen here:
>>>
>>> http://arm-soc.lixom.net/bootlogs/next/next-20150504/socfpga-arm-multi_v7_defconfig.html
>>
>> Thanks!
>>
>> The amba bus and pl330 should enable the clock and then disable it
>> after probing:
>> static int amba_probe(struct device *dev)
>> {
>> ...
>> ret = amba_get_enable_pclk(pcdev);
>> ...
>>
>> I wonder why do you think it is not enabled at all?
> 
> I've checked it down to the register level that the gate for this clock
> does not get set.
> 
>>
>>>
>>> This only happens with the multi_v7_defconfig, because the PL330 DMA is
>>> getting built into the kernel, while the socfpga_defconfig does not
>>> enable the PL330.
>>
>> It makes sense. If pl330 driver is not enabled then necessary clocks
>> are turned on by bootloader. Probing pl330 effectively disables the
>> clock (if DMA is not used).
>>
>>> The DTS for the socfpga platform looks like this:
>>>
>>> pdma: pdma@ffe01000 {
>>> compatible = "arm,pl330", "arm,primecell";
>>> reg = <0xffe01000 0x1000>;
>>> interrupts = <0 104 4>,
>>> <0 105 4>,
>>> ...
>>> #dma-cells = <1>;
>>> #dma-channels = <8>;
>>> #dma-requests = <32>;
>>> clocks = <&l4_main_clk>;
>>> clock-names = "apb_pclk";
>>> };
>>>
>>> Perhaps I have the wrong designation for clock-names and the amba bus is
>>> not able to pick up the correct clock?
>>
>> I have two ideas:
>> 1. Is this really the clock for the DMA? If DMA is not used then
>> disabling it should be OK.
> 
> Yes, this is the clock for the DMA. Yeah, leaving this clock off is
> fine, until the DMA gets used. Up until v4.0, SoCFPGA was not using the
> DMA at all, but in v4.0, there was a patch to assign the UARTs to their
> DMA channels.
> 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/arch/arm/boot/dts/socfpga.dtsi?id=78c03c7af89721bd8a4428408a8cc7b53972e4b8
> 
>> 2. Disabling the clock may effectively disable its parent or
>> grandparent if there are not more users. Maybe some other driver needs
>> these parents to be enabled? This was the issue for at least one
>> similar error (on Exynos boards).
>>
> 
> I'll check up on these issues. When I was debugging this issue, the
> l4_main_clk was only used by the DMA, so it was not getting turned on by
> any other drivers.
> 

Ah, it looks like perhaps there's a problem with the serial driver and
suspend/resume? If I disable CONFIG_PM, then the DMA seems to work
fine with the debug uart. It appears the DMA is getting suspended and
doesn't get resumed.

Dinh


Re: [PATCH 09/13] KVM: x86: save/load state on SMM switch

2015-05-04 Thread Radim Krčmář

2015-04-30 13:36+0200, Paolo Bonzini:
> The big ugly one.  This patch adds support for switching in and out of
> system management mode, respectively upon receiving KVM_REQ_SMI and upon
> executing a RSM instruction.  Both 32- and 64-bit formats are supported
> for the SMM state save area.
> 
> Signed-off-by: Paolo Bonzini 
> ---
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> +static void rsm_set_desc_flags(struct desc_struct *desc, u16 flags)
> +{
> + desc->g    = (flags >> 15) & 1;
> + desc->d    = (flags >> 14) & 1;
> + desc->l    = (flags >> 13) & 1;
> + desc->avl  = (flags >> 12) & 1;
> + desc->p    = (flags >> 7) & 1;
> + desc->dpl  = (flags >> 5) & 3;
> + desc->s    = (flags >> 4) & 1;
> + desc->type = flags & 15;

I can't find a description of this ... can you point me to a place where
the gap between 'p' and 'avl' is documented?
(Not that it matters unless the guest reads it, but it's a bit weird.)
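
One possible reading, sketched below: if the flags word mirrors bits
40..55 of an ordinary 8-byte x86 segment descriptor, then bits 8..11
would be limit[19:16], which is not an attribute and would explain the
gap. That the SMM state save area uses this layout is my assumption, and
the helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical mirror of the fields set by rsm_set_desc_flags(). */
struct desc_flags {
	unsigned type, s, dpl, p, avl, l, d, g;
};

/*
 * Take bits 40..55 of an 8-byte segment descriptor as a 16-bit word.
 * Assumption: the state-save "flags" word has this same layout.
 */
uint16_t desc_to_flags(uint64_t desc)
{
	return (uint16_t)(desc >> 40);
}

/* Same shifts and masks as the quoted patch. */
struct desc_flags unpack_flags(uint16_t flags)
{
	struct desc_flags f;

	f.g    = (flags >> 15) & 1;
	f.d    = (flags >> 14) & 1;
	f.l    = (flags >> 13) & 1;
	f.avl  = (flags >> 12) & 1;
	f.p    = (flags >> 7) & 1;
	f.dpl  = (flags >> 5) & 3;
	f.s    = (flags >> 4) & 1;
	f.type = flags & 15;
	return f;
}

/* Bits 8..11: limit[19:16] in a descriptor, skipped by the patch. */
unsigned flags_gap(uint16_t flags)
{
	return (flags >> 8) & 0xf;
}
```

For a flat 32-bit code segment descriptor, 0x00cf9a000000ffff, the word
is 0xcf9a and the skipped nibble holds the top four limit bits (0xf).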

Thanks.

Re: [PATCH v5 2/4] iio: core: Introduce IIO configfs support

2015-05-04 Thread Lars-Peter Clausen


On 05/04/2015 12:50 PM, Daniel Baluta wrote:

This creates an IIO configfs subsystem named "iio", with a default "triggers"
group.

Triggers group is used for handling software triggers. To create a new software
trigger one must create a directory inside the trigger directory.

Software trigger name MUST follow the following convention:
* <trigger-type>-<trigger-name>
Where:
* <trigger-type>, specifies the interrupt source (e.g: hrtimer)
* <trigger-name>, specifies the IIO device trigger name

Failing to follow this convention will result in a directory creation error.

E.g, assuming that hrtimer trigger type is registered with IIO software
trigger core:

$ mkdir /config/iio/triggers/hrtimer-instance1
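
For illustration, the naming convention amounts to splitting the
directory name at the first '-'. The following is a userspace sketch
only: split_trigger_name() and SKETCH_NAME_LEN are hypothetical, and the
buffer size is an assumed value, not the patch's MAX_NAME_LEN.

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_NAME_LEN 32 /* assumed size, stand-in for MAX_NAME_LEN */

/*
 * Split a directory name at the first '-' into type and trigger name.
 * Returns 0 on success, -1 if there is no separator.
 * split_trigger_name() is a hypothetical helper, not part of the patch.
 */
int split_trigger_name(const char *dir, char *type, char *name, size_t len)
{
	char buf[SKETCH_NAME_LEN];
	char *sep;

	/* Work on a bounded copy so the caller's string is left untouched. */
	snprintf(buf, sizeof(buf), "%s", dir);

	sep = strchr(buf, '-');
	if (!sep)
		return -1;

	/* Overwrite '-' with a terminator; this separates the two strings. */
	*sep = '\0';
	snprintf(type, len, "%s", buf);
	snprintf(name, len, "%s", sep + 1);
	return 0;
}
```

"hrtimer-instance1" yields type "hrtimer" and name "instance1". Since
only the first '-' is a separator, the trigger name itself may contain
further dashes.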



Nice, short and clean. Looks pretty good. It's a bit of a shame that we 
can't have a per type directory, but if that's how configfs works I guess 
there is not much choice.


[...]

+static struct config_group *trigger_make_group(struct config_group *group,
+  const char *name)
+{
+   char *type_name;
+   char *trigger_name;
+   char buf[MAX_NAME_LEN];
+   struct iio_sw_trigger *t;
+
+   snprintf(buf, MAX_NAME_LEN, "%s", name);
+
+   /* group name should have the form - */
+   type_name = buf;
+   trigger_name = strchr(buf, '-');
+   if (!trigger_name) {
+   pr_err("Unable to locate '-' in %s. Use -.\n", buf);


Do we want to print this side channel message? Makes it pretty easy to spam 
the kernel log with a rogue application.



+   return ERR_PTR(-EINVAL);
+   }
+
+   /* replace - with \0, this nicely separates the two strings */
+   *trigger_name = '\0';
+   trigger_name++;
+
+   t = iio_sw_trigger_create(type_name, trigger_name);
+   if (IS_ERR(t))
+   return ERR_CAST(t);
+
+   config_item_set_name(&t->group.cg_item, name);
+
+   return &t->group;
+}
+
+static void trigger_drop_group(struct config_group *group,
+  struct config_item *item)
+{
+   struct iio_sw_trigger *t = to_iio_sw_trigger(item);
+
+   if (t)


t will never be NULL.


+   iio_sw_trigger_destroy(t);
+   config_item_put(item);
+}



Re: question about RCU dynticks_nesting

2015-05-04 Thread Rik van Riel

On 05/04/2015 03:39 PM, Paul E. McKenney wrote:
> On Mon, May 04, 2015 at 03:00:44PM -0400, Rik van Riel wrote:

>> In case of the non-preemptible RCU, we could easily also
>> increase current->rcu_read_lock_nesting at the same time
>> we increase the preempt counter, and use that as the
>> indicator to test whether the cpu is in an extended
>> rcu quiescent state. That way there would be no extra
>> overhead at syscall entry or exit at all. The trick
>> would be getting the preempt count and the rcu read
>> lock nesting count in the same cache line for each task.
> 
> But in non-preemptible RCU, we have PREEMPT=n, so there is no preempt
> counter in production kernels.  Even if there was, we have to sample this
> on other CPUs, so the overhead of preempt_disable() and preempt_enable()
> would be where kernel entry/exit is, so I expect that this would be a
> net loss in overall performance.

CONFIG_PREEMPT_RCU seems to be independent of CONFIG_PREEMPT.
Not sure why, but they are :)

>> In case of the preemptible RCU scheme, we would have to
>> examine the per-task state (under the runqueue lock)
>> to get the current task info of all CPUs, and in
>> addition wait for the blkd_tasks list to empty out
>> when doing a synchronize_rcu().
>>
>> That does not appear to require special per-cpu
>> counters; examining the per-cpu rdp and the lists
>> inside it, with the rnp->lock held if doing any
>> list manipulation, looks like it would be enough.
>>
>> However, the current code is a lot more complicated
>> than that. Am I overlooking something obvious, Paul?
>> Maybe something non-obvious? :)
> 
> Ummm...  The need to maintain memory ordering when sampling task
> state from remote CPUs?
> 
> Or am I completely confused about what you are suggesting?
> 
> That said, are you chasing a real system-visible performance issue
> that you tracked to RCU's dyntick-idle system?

The goal is to reduce the syscall overhead of nohz_full.

Part of the overhead is in the vtime updates, part of it is
in the way RCU extended quiescent state is tracked.

-- 
All rights reversed

Re: [PATCH] block: pmem: Add dependency on HAS_IOMEM

2015-05-04 Thread Ross Zwisler

On Mon, 2015-05-04 at 20:58 +0200, Richard Weinberger wrote:
> Not all architectures have io memory.
> 
> Fixes:
> drivers/block/pmem.c: In function ‘pmem_alloc’:
> drivers/block/pmem.c:146:2: error: implicit declaration of function 
> ‘ioremap_nocache’ [-Werror=implicit-function-declaration]
>   pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
>   ^
> drivers/block/pmem.c:146:18: warning: assignment makes pointer from integer 
> without a cast [enabled by default]
>   pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
>   ^
> drivers/block/pmem.c:182:2: error: implicit declaration of function ‘iounmap’ 
> [-Werror=implicit-function-declaration]
>   iounmap(pmem->virt_addr);
>   ^
> 
> Signed-off-by: Richard Weinberger 
> ---
>  drivers/block/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
> index eb1fed5..3ccef9e 100644
> --- a/drivers/block/Kconfig
> +++ b/drivers/block/Kconfig
> @@ -406,6 +406,7 @@ config BLK_DEV_RAM_DAX
>  
>  config BLK_DEV_PMEM
>   tristate "Persistent memory block device support"
> + depends on HAS_IOMEM
>   help
> Saying Y here will allow you to use a contiguous range of reserved
> memory as one or more persistent block devices.

Reviewed-by: Ross Zwisler 



Re: question about RCU dynticks_nesting

2015-05-04 Thread Paul E. McKenney

On Mon, May 04, 2015 at 03:39:25PM -0400, Rik van Riel wrote:
> On 05/04/2015 02:39 PM, Paul E. McKenney wrote:
> > On Mon, May 04, 2015 at 11:59:05AM -0400, Rik van Riel wrote:
> 
> >> In fact, would we be able to simply use tsk->rcu_read_lock_nesting
> >> as an indicator of whether or not we should bother waiting on that
> >> task or CPU when doing synchronize_rcu?
> > 
> > Depends on exactly what you are asking.  If you are asking if I could add
> > a few more checks to preemptible RCU and speed up grace-period detection
> > in a number of cases, the answer is very likely "yes".  This is on my
> > list, but not particularly high priority.  If you are asking whether
> > CPU 0 could access ->rcu_read_lock_nesting of some task running on
> > some other CPU, in theory, the answer is "yes", but in practice that
> > would require putting full memory barriers in both rcu_read_lock()
> > and rcu_read_unlock(), so the real answer is "no".
> > 
> > Or am I missing your point?
> 
> The main question is "how can we greatly reduce the overhead
> of nohz_full, by simplifying the RCU extended quiescent state
> code called in the syscall fast path, and maybe piggyback on
> that to do time accounting for remote CPUs?"
> 
> Your memory barrier answer above makes it clear we will still
> want to do the RCU stuff at syscall entry & exit time, at least
> on x86, where we already have automatic and implicit memory
> barriers.

We do need to keep in mind that x86's automatic and implicit memory
barriers do not order prior stores against later loads.

Hmmm...  But didn't earlier performance measurements show that the bulk of
the overhead was the delta-time computations rather than RCU accounting?

Thanx, Paul


Re: [PATCH V4 02/24] perf tools: Add AUX area tracing index

2015-05-04 Thread Jiri Olsa

On Thu, Apr 30, 2015 at 05:37:25PM +0300, Adrian Hunter wrote:
> Add an index of AUX area tracing events within
> a perf.data file.
> 
> perf record uses a special user event
> PERF_RECORD_FINISHED_ROUND to enable sorting of
> events in chunks instead of having to sort all
> events altogether.
> 
> AUX area tracing events contain data that can
> span back to the very beginning of the recording
> period. i.e. they do not obey the rules of
> PERF_RECORD_FINISHED_ROUND.
> 
> By adding an index, AUX area tracing events
> can be found in advance and the
> PERF_RECORD_FINISHED_ROUND approach works as
> usual.
> 
> The index is recorded with the auxtrace feature
> in the perf.data file.  A session reads the index
> but does not process it.  An AUX area
> decoder can queue all the AUX area data
> in advance using auxtrace_queues__process_index()
> or otherwise process the index in some custom
> manner.
> 
> Signed-off-by: Adrian Hunter 

Acked-by: Jiri Olsa 

thanks,
jirka

[PATCH 0.5/4] netconsole: remove unnecessary netconsole_target_get/put() from write_msg()

2015-05-04 Thread Tejun Heo

From 958d3e14720a35c6103668c69d58751b36053d69 Mon Sep 17 00:00:00 2001
From: Tejun Heo 
Date: Mon, 4 May 2015 15:57:54 -0400

write_msg() grabs target_list_lock and walks target_list invoking
netpoll_send_udp() on each target.  Curiously, it protects each
iteration with netconsole_target_get/put() even though it never
releases target_list_lock which protects all the members.

While this doesn't harm anything, it doesn't serve any purpose either.
The items on the list can't go away while target_list_lock is held.
Remove the unnecessary get/put pair.

Signed-off-by: Tejun Heo 
Cc: David Miller 
Cc: Tetsuo Handa 
---
Hello,

If anyone wants the whole series to be reposted, please let me know.
The updated patchset is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git 
review-netconsole-ext-console

Thanks.

 drivers/net/netconsole.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 15731d1..30c0524 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -744,7 +744,6 @@ static void write_msg(struct console *con, const char *msg, 
unsigned int len)

spin_lock_irqsave(&target_list_lock, flags);
list_for_each_entry(nt, &target_list, list) {
-   netconsole_target_get(nt);
if (nt->enabled && netif_running(nt->np.dev)) {
/*
 * We nest this inside the for-each-target loop above
@@ -760,7 +759,6 @@ static void write_msg(struct console *con, const char *msg, 
unsigned int len)
left -= frag;
}
}
-   netconsole_target_put(nt);
}
spin_unlock_irqrestore(&target_list_lock, flags);
 }
-- 
2.1.0


Re: [PATCH V4 03/24] perf tools: Hit all build ids when AUX area tracing

2015-05-04 Thread Jiri Olsa

On Thu, Apr 30, 2015 at 05:37:26PM +0300, Adrian Hunter wrote:
> We need to include all buildids when a perf.data
> file contains AUX area tracing data because we
> do not decode the trace for that purpose because
> it would take too long.
> 
> Signed-off-by: Adrian Hunter 

Acked-by: Jiri Olsa 

thanks,
jirka

[PATCH v3 3/3] netconsole: implement extended console support

2015-05-04 Thread Tejun Heo

From f9530ac244c12a89837736269a1930291a360875 Mon Sep 17 00:00:00 2001
From: Tejun Heo 
Date: Mon, 4 May 2015 15:57:54 -0400

printk logbuf keeps various metadata and optional key=value dictionary
for structured messages, both of which are stripped when messages are
handed to regular console drivers.

It can be useful to have this metadata and dictionary available to
netconsole consumers.  This obviously makes logging via netconsole
more complete and the sequence number in particular is useful in
environments where messages may be lost or reordered in transit -
e.g. when netconsole is used to collect messages in a large cluster
where packets may have to travel congested hops to reach the
aggregator.  The lost and reordered messages can easily be identified
and handled accordingly using the sequence numbers.

printk recently added extended console support which can be selected
by setting CON_EXTENDED flag.  From console driver side, not much
changes.  The only difference is that the text passed to the write
callback is formatted the same way as /dev/kmsg.

This patch implements extended console support for netconsole which
can be enabled by either prepending "+" to a netconsole boot param
entry or echoing 1 to "extended" file in configfs.  When enabled,
netconsole transmits extended log messages with headers identical to
/dev/kmsg output.

There's one complication due to message fragments.  netconsole limits
the maximum message size to 1k and messages longer than that are split
into multiple fragments.  As all extended console messages should
carry matching headers and be uniquely identifiable, each extended
message fragment carries full copy of the metadata and an extra header
field to identify the specific fragment.  The optional header is of
the form "ncfrag=OFF/LEN" where OFF is the byte offset into the
message body and LEN is the total length.
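
As a rough illustration of the fragment header, the sketch below formats
one fragment in userspace. It is not the netconsole code:
format_fragment() is a hypothetical helper, the header string used in
the example is a made-up /dev/kmsg-style placeholder, and placing ncfrag
as the last comma-separated field before ';' is an assumption.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format one fragment as "<header>,ncfrag=<off>/<len>;<chunk>".
 * off is the byte offset into body, len is the total body length, and
 * at most chunk bytes of body are copied. Returns the snprintf() result,
 * or -1 if off lies at or past the end of body.
 * Hypothetical helper; the exact wire layout is an assumption.
 */
int format_fragment(char *out, size_t outsz, const char *header,
		    const char *body, size_t off, size_t chunk)
{
	size_t len = strlen(body);
	size_t n;

	if (off >= len)
		return -1;

	/* The last fragment may be shorter than the chunk size. */
	n = (len - off < chunk) ? len - off : chunk;

	return snprintf(out, outsz, "%s,ncfrag=%zu/%zu;%.*s",
			header, off, len, (int)n, body + off);
}
```

With a deliberately tiny chunk size of 16, the 31-byte body
"the first chunk, the 2nd chunk." gives two fragments tagged
ncfrag=0/31 and ncfrag=16/31. A receiver can reassemble by offset and
knows it is done once it has collected LEN bytes.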

To avoid unnecessarily making printk format extended messages,
extended netconsole is registered with printk only when the first
extended netconsole is configured.

v3: Tweaked documentation to clarify that the example assumes a
lot smaller chunk size.  Updated to apply on top of spurious
target get/put removal in write_msg().

v2: Dropped dynamic unregistration of extended console driver, which
added complexity while not being too beneficial given that most
netconsole configurations are static.  ncfrag updated to use just
byte offset and message length.

Signed-off-by: Tejun Heo 
Cc: Tetsuo Handa 
Cc: David Miller 
---
Hello,

If anyone wants the whole series to be reposted, please let me know.
The updated patchset is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git 
review-netconsole-ext-console

Thanks.

 Documentation/networking/netconsole.txt |  35 +++-
 drivers/net/netconsole.c| 149 +++-
 2 files changed, 181 insertions(+), 3 deletions(-)

diff --git a/Documentation/networking/netconsole.txt 
b/Documentation/networking/netconsole.txt
index a5d574a..30409a3 100644
--- a/Documentation/networking/netconsole.txt
+++ b/Documentation/networking/netconsole.txt
@@ -2,6 +2,7 @@
 started by Ingo Molnar , 2001.09.17
 2.6 port and netpoll api by Matt Mackall , Sep 9 2003
 IPv6 support by Cong Wang , Jan 1 2013
+Extended console support by Tejun Heo , May 1 2015

 Please send bug reports to Matt Mackall 
 Satyam Sharma , and Cong Wang 

@@ -24,9 +25,10 @@ Sender and receiver configuration:
 It takes a string configuration parameter "netconsole" in the
 following format:

- netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]
+ netconsole=[+][src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]

    where
+        +             if present, enable extended console support
         src-port      source for UDP packets (defaults to 6665)
         src-ip        source IP to use (interface address)
         dev           network interface (eth0)
@@ -107,6 +109,7 @@ To remove a target:
 The interface exposes these parameters of a netconsole target to userspace:

     enabled       Is this target currently enabled?   (read-write)
+    extended      Extended mode enabled               (read-write)
     dev_name      Local network interface name        (read-write)
     local_port    Source UDP port to use              (read-write)
     remote_port   Remote agent's UDP port             (read-write)
@@ -132,6 +135,36 @@ You can also update the local interface dynamically. This 
is especially
 useful if you want to use interfaces that have newly come up (and may not
 have existed when netconsole was loaded / initialized).

+Extended console:
+=================
+
+If '+' is prefixed to the configuration line or "extended" config file
+is set to 1, extended console support is enabled. An example boot
+param follows.
+
+ linux netconsole=+@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc
+
+Log messages are transmi

Re: [PATCH] ftrace: Provide trace clock monotonic raw

2015-05-04 Thread Drew Richardson

On Mon, May 04, 2015 at 04:10:05PM +0100, Mathieu Desnoyers wrote:
> - Original Message -
> > Expose the NMI safe accessor to the monotonic raw clock to the
> > tracer. The mono clock was added with commit
> > 1b3e5c0936046e7e023149ddc8946d21c2ea20eb. Although the monotonic raw
> > clock cannot be used to compare time between different machines, it is
> > not perterbed by ntp.
> 
> perterbed -> perturbed

Oops, I'll correct that in the next version.

> > 
> > Signed-off-by: Drew Richardson 
> >
> 
> What is the use-case that justify exposing the "raw fast"
> clock that cannot be handled by the "monotonic fast" clock ?
> 
> Thanks,
> 
> Mathieu

I'm collecting and merging data from perf, with Android Atrace data
(writes to /sys/kernel/debug/tracing/trace_marker) which ends up in
the ftrace stream and other measurements collected from
userspace. Currently the only clock readable from userspace, supported
by perf and by ftrace is CLOCK_MONOTONIC. However this clock is
affected by the incremental adjustments performed by adjtime(3) and
NTP. But I'd prefer to use a clock that is advancing at a consistent
rate, hence CLOCK_MONOTONIC_RAW.
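
For reference, this is roughly how the two clocks are sampled from
userspace. A minimal sketch: clock_ns() and mono_minus_raw_ns() are
hypothetical helper names, and CLOCK_MONOTONIC_RAW is assumed to be
available (it is Linux-specific).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for CLOCK_MONOTONIC_RAW on glibc */
#endif
#include <stdint.h>
#include <time.h>

/* Read a clock in nanoseconds; returns -1 if clock_gettime() fails. */
int64_t clock_ns(clockid_t id)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return -1;
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Offset between the two clocks at roughly the same instant. On a
 * machine whose clock is being slewed, this value changes over time;
 * that change is what the raw clock avoids.
 */
int64_t mono_minus_raw_ns(void)
{
	return clock_ns(CLOCK_MONOTONIC) - clock_ns(CLOCK_MONOTONIC_RAW);
}
```

Sampling mono_minus_raw_ns() periodically shows how far the adjusted
clock has been pulled away from the raw one.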

Thanks,

Drew


> > ---
> >  kernel/trace/trace.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> > index 05330494a0df..458031c31a37 100644
> > --- a/kernel/trace/trace.c
> > +++ b/kernel/trace/trace.c
> > @@ -876,6 +876,7 @@ static struct {
> > { trace_clock_jiffies,  "uptime",   0 },
> > { trace_clock,  "perf", 1 },
> > { ktime_get_mono_fast_ns,   "mono", 1 },
> > +   { ktime_get_raw_fast_ns,"mono_raw", 1 },
> > ARCH_TRACE_CLOCKS
> >  };
> >  
> > --
> > 2.1.4
> > 
> > 
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
> 

Re: [PATCH v2 2/2] staging: ion: chunk_heap: use %pad for printing dma_addr_t's

2015-05-04 Thread Colin Cross

On Mon, May 4, 2015 at 1:22 AM, Dan Carpenter  wrote:
> On Thu, Apr 09, 2015 at 06:10:04PM -0700, Mitchel Humpherys wrote:
>> We're currently using %lu and %ld to print some variables of type
>> dma_addr_t, which results in the following warning when dma_addr_t is
>> 64-bits wide:
>>
>> drivers/staging/android/ion/ion_chunk_heap.c: In function 
>> 'ion_chunk_heap_create':
>> drivers/staging/android/ion/ion_chunk_heap.c:176:2: warning: format 
>> '%lu' expects argument of type 'long unsigned int', but argument 3 has type 
>> 'dma_addr_t' [-Wformat=]
>>   pr_info("%s: base %lu size %zu align %ld\n", __func__, 
>> chunk_heap->base,
>>   ^
>> drivers/staging/android/ion/ion_chunk_heap.c:176:2: warning: format 
>> '%ld' expects argument of type 'long int', but argument 5 has type 
>> 'dma_addr_t' [-Wformat=]
>>
>> Fix this by using %pad as instructed in printk-formats.txt.
>>
>> Signed-off-by: Mitchel Humpherys 
>
> This one was just merged and I was about to email you that it introduces
> some new Smatch warnings, but actually looking at it, it's just wrong.
>
> We want to print "chunk_heap->base" and not "&chunk_heap->base".

This would be correct if base was a dma_addr_t...

> And anyway "&chunk_heap->base" is a regular pointer, not a dma_addr_t.

But it is actually an ion_phys_addr_t, which is currently typedef'd to
unsigned long.  Are you using a local patch that replaces
ion_phys_addr_t with dma_addr_t?

> So please send a new patch that removes the &.

Removing the & is not correct, lib/vsprintf.c will dereference the arg
for %pad or %pap.  I think this patch should just be dropped, the old
%lu was correct for what is in Linus' tree.

> regards,
> dan carpenter
>
>> ---
>>  drivers/staging/android/ion/ion_chunk_heap.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/staging/android/ion/ion_chunk_heap.c 
>> b/drivers/staging/android/ion/ion_chunk_heap.c
>> index 54746157d799..6b3e18aa1c64 100644
>> --- a/drivers/staging/android/ion/ion_chunk_heap.c
>> +++ b/drivers/staging/android/ion/ion_chunk_heap.c
>> @@ -173,8 +173,8 @@ struct ion_heap *ion_chunk_heap_create(struct 
>> ion_platform_heap *heap_data)
>>   chunk_heap->heap.ops = &chunk_heap_ops;
>>   chunk_heap->heap.type = ION_HEAP_TYPE_CHUNK;
>>   chunk_heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE;
>> - pr_debug("%s: base %lu size %zu align %ld\n", __func__, 
>> chunk_heap->base,
>> - heap_data->size, heap_data->align);
>> + pr_debug("%s: base %pad size %zu align %pad\n", __func__,
>> + &chunk_heap->base, heap_data->size, &heap_data->align);
>>
>>   return &chunk_heap->heap;
>>
>> --
>> Qualcomm Innovation Center, Inc.
>> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
>> a Linux Foundation Collaborative Project
>>
>> ___
>> devel mailing list
>> de...@linuxdriverproject.org
>> http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
>
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kernel-team+unsubscr...@android.com.


Re: [PATCH V4 04/24] perf tools: Add build option NO_AUXTRACE to exclude AUX area tracing

2015-05-04 Thread Jiri Olsa

On Thu, Apr 30, 2015 at 05:37:27PM +0300, Adrian Hunter wrote:
> Add build option NO_AUXTRACE to exclude compiling support
> for AUX area tracing. Support for both recording and
> processing is excluded and by implication any future
> additions such as Intel PT and Intel BTS will also not
> be compiled in with this option.
> 
> Signed-off-by: Adrian Hunter 

Acked-by: Jiri Olsa 

thanks,
jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH v2 1/5] devicetree: vendor-prefixes: Add CompuLab to known vendors

2015-05-04 Thread Sebastian Hesselbarth

This adds "compulab" as a vendor-prefix for CompuLab (compulab.co.il),
an Israeli company that builds ARM-based SoMs and CoMs.

Signed-off-by: Sebastian Hesselbarth 
Acked-by: Rob Herring 
---
Cc: Jason Cooper 
Cc: Andrew Lunn 
Cc: Gregory Clement 
Cc: Gabriel Dobato 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 Documentation/devicetree/bindings/vendor-prefixes.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt 
b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 80339192c93e..ff94ac484887 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -49,6 +49,7 @@ cirrus       Cirrus Logic, Inc.
 cloudengines   Cloud Engines, Inc.
 cnm            Chips&Media, Inc.
 cnxt           Conexant Systems, Inc.
+compulab       CompuLab
 cortina        Cortina Systems, Inc.
 cosmic         Cosmic Circuits
 crystalfontz   Crystalfontz America, Inc.
-- 
2.1.0


[PATCH v2 0/5] Add proper support for Compulab CM-A510/SBC-A510

2015-05-04 Thread Sebastian Hesselbarth

This is v2 of the patch set to improve current mainline support for
the Compulab CM-A510 System-on-Module (SoM) and its default Compulab
SBC-A510 base board.

Compared to v1 [1], these are the overall changes:
- Drop i2c-mux-pinctrl rework as Wolfram Sang raised concerns with
  respect to OF_DYNAMIC. Instead we add i2c-mux-pinctrl node by
  default and wait until corresponding driver respects disabled
  sub-bus nodes. This will add 3 additional i2c busses to all
  Dove based boards, but no functional change with respect to
  i2c-0.
- Split patches dealing with arch/arm/boot/dts/Makefile to route
  them through arm-soc directly.

Other patches of v1 have already been taken by MVEBU SoC maintainers
as fixes.

Patches are based on v4.1-rc1 and are intended for v4.2. All patches
contain appropriate Acked-by's from related maintainers. Patches 3
and 5 should go through arm-soc directly, the rest can go through
the mvebu tree.

Sebastian

[1] http://thread.gmane.org/gmane.linux.drivers.devicetree/110324

Sebastian Hesselbarth (5):
  devicetree: vendor-prefixes: Add CompuLab to known vendors
  ARM: dts: dove: Add internal i2c multiplexer node
  ARM: dts: dove: Remove Compulab CM-A510 from Makefile
  ARM: dts: dove: Add proper support for Compulab CM-A510/SBC-A510
  ARM: dts: dove: Add Compulab SBC-A510 to Makefile

 .../devicetree/bindings/vendor-prefixes.txt|   1 +
 arch/arm/boot/dts/Makefile |   4 +-
 arch/arm/boot/dts/dove-cm-a510.dts |  38 
 arch/arm/boot/dts/dove-cm-a510.dtsi| 195 +
 arch/arm/boot/dts/dove-sbc-a510.dts| 182 +++
 arch/arm/boot/dts/dove.dtsi|  40 -
 6 files changed, 418 insertions(+), 42 deletions(-)
 delete mode 100644 arch/arm/boot/dts/dove-cm-a510.dts
 create mode 100644 arch/arm/boot/dts/dove-cm-a510.dtsi
 create mode 100644 arch/arm/boot/dts/dove-sbc-a510.dts

---
Cc: Jason Cooper 
Cc: Andrew Lunn 
Cc: Gregory Clement 
Cc: Gabriel Dobato 
Cc: Arnd Bergmann 
Cc: Olof Johansson 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
-- 
2.1.0


[PATCH v2 5/5] ARM: dts: dove: Add Compulab SBC-A510 to Makefile

2015-05-04 Thread Sebastian Hesselbarth

With reworked device tree files for Compulab CM-A510 SoM and SBC-A510
base board, now add the corresponding board file to Makefile again.

Signed-off-by: Sebastian Hesselbarth 
---
Cc: Jason Cooper 
Cc: Andrew Lunn 
Cc: Gregory Clement 
Cc: Gabriel Dobato 
Cc: Arnd Bergmann 
Cc: Olof Johansson 
Cc: a...@kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index dc24597c4f2e..ba72134f1fd3 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -653,7 +653,8 @@ dtb-$(CONFIG_MACH_DOVE) += \
dove-cubox-es.dtb \
dove-d2plug.dtb \
dove-d3plug.dtb \
-   dove-dove-db.dtb
+   dove-dove-db.dtb \
+   dove-sbc-a510.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += \
mt6589-aquaris5.dtb \
mt6592-evb.dtb \
-- 
2.1.0


Re: [PATCH V4 05/24] perf auxtrace: Add option to synthesize events for transactions

2015-05-04 Thread Jiri Olsa

On Thu, Apr 30, 2015 at 05:37:28PM +0300, Adrian Hunter wrote:
> Add AUX area tracing option 'x' to synthesize events for
> transactions. This will be used by Intel PT to synthesize
> an event record for each TSX start, commit or abort.
> 
> Signed-off-by: Adrian Hunter 

Acked-by: Jiri Olsa 

thanks,
jirka

[PATCH v2 4/5] ARM: dts: dove: Add proper support for Compulab CM-A510/SBC-A510

2015-05-04 Thread Sebastian Hesselbarth

Existing dts file for Compulab CM-A510 was very limited due to missing
hardware. Now that we actually found somebody with that board, properly
rework it to provide a CoM/SoM include and a board file for Compulab's
SBC-A510.

Both the CM-A510 SoM and the SBC-A510 can be configured with different
options, so we only enable a minimum set of options. The actual board
configuration will have to be set by either the bootloader or user.

Although functionally not required, repeat even disabled nodes again
to increase their visibility in the dtsi/dts files.

Signed-off-by: Sebastian Hesselbarth 
Tested-by: Gabriel Dobato 
Acked-by: Gregory CLEMENT 
---
Cc: Jason Cooper 
Cc: Andrew Lunn 
Cc: Gregory Clement 
Cc: Gabriel Dobato 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/dove-cm-a510.dts  |  38 ---
 arch/arm/boot/dts/dove-cm-a510.dtsi | 195 
 arch/arm/boot/dts/dove-sbc-a510.dts | 182 +
 3 files changed, 377 insertions(+), 38 deletions(-)
 delete mode 100644 arch/arm/boot/dts/dove-cm-a510.dts
 create mode 100644 arch/arm/boot/dts/dove-cm-a510.dtsi
 create mode 100644 arch/arm/boot/dts/dove-sbc-a510.dts

diff --git a/arch/arm/boot/dts/dove-cm-a510.dts 
b/arch/arm/boot/dts/dove-cm-a510.dts
deleted file mode 100644
index 50c0d6904497..
--- a/arch/arm/boot/dts/dove-cm-a510.dts
+++ /dev/null
@@ -1,38 +0,0 @@
-/dts-v1/;
-
-#include "dove.dtsi"
-
-/ {
-   model = "Compulab CM-A510";
-   compatible = "compulab,cm-a510", "marvell,dove";
-
-   memory {
-   device_type = "memory";
-   reg = <0x 0x4000>;
-   };
-
-   chosen {
-   bootargs = "console=ttyS0,115200n8 earlyprintk";
-   };
-};
-
-&uart0 { status = "okay"; };
-&uart1 { status = "okay"; };
-&sdio0 { status = "okay"; };
-&sdio1 { status = "okay"; };
-&sata0 { status = "okay"; };
-
-&spi0 {
-   status = "okay";
-
-   /* spi0.0: 4M Flash Winbond W25Q32BV */
-   spi-flash@0 {
-   compatible = "st,w25q32";
-   spi-max-frequency = <2000>;
-   reg = <0>;
-   };
-};
-
-&i2c0 {
- status = "okay";
-};
diff --git a/arch/arm/boot/dts/dove-cm-a510.dtsi 
b/arch/arm/boot/dts/dove-cm-a510.dtsi
new file mode 100644
index ..59b4056b478f
--- /dev/null
+++ b/arch/arm/boot/dts/dove-cm-a510.dtsi
@@ -0,0 +1,195 @@
+/*
+ * Device Tree include for Compulab CM-A510 System-on-Module
+ *
+ * Copyright (C) 2015, Sebastian Hesselbarth 
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This file is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; version 2 of the
+ * License.
+ *
+ * This file is distributed in the hope that it will be useful
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Or, alternatively
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ * obtaining a copy of this software and associated documentation
+ * files (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use
+ * copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following
+ * conditions:
+ *
+ * The above copyright notice and this permission notice shall be
+ * included in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY
+ * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/*
+ * The CM-A510 comes with several optional components:
+ *
+ * Memory options:
+ *  D512: 512M
+ *  D1024: 1G
+ *
+ * NAND options:
+ *  N512: 512M NAND
+ *
+ * Ethernet options:
+ *  E1: PHY RTL8211D on internal GbE (SMI address 0x03)
+ *  E2: Additional ethernet NIC RTL8111D on PCIe1
+ *
+ * Audio options:
+ *  A: TI TLV320AIC23b audio codec (I2C address 0x1a)
+ *
+ * Touchscreen options:
+ *  I: TI TSC2046 touchscreen controller (on SPI1)
+ *
+ * USB options:
+ *  U2: 2 dual-role USB2.0 ports
+ *  U4:

Re: [PATCH V4 06/24] perf tools: Add support for PERF_RECORD_AUX

2015-05-04 Thread Jiri Olsa

On Thu, Apr 30, 2015 at 05:37:29PM +0300, Adrian Hunter wrote:
> Add support for the PERF_RECORD_AUX event type.
> 
> PERF_RECORD_AUX is a new kernel event that records
> when new data lands in the AUX buffer. Currently
> it is assumed that AUX data follows the same ring
> buffer conventions used by the perf events buffer,
> and consequently the AUX event is not processed
> during recording.
> 
> It is processed during session processing so that
> the information in the 'flags' member is made
> available.
> 
> The format of PERF_RECORD_AUX is outlined in the
> linux/perf_events.h header file. The 'flags' are
> also enumerated.
> 
> Intel PT and Intel BTS use the flag named
> PERF_AUX_FLAG_TRUNCATED to determine if data has
> been lost because the buffer became full as
> perf was not able to empty it fast enough.
> 
> Signed-off-by: Adrian Hunter 

Acked-by: Jiri Olsa 

thanks,
jirka

[PATCH v2 3/5] ARM: dts: dove: Remove Compulab CM-A510 from Makefile

2015-05-04 Thread Sebastian Hesselbarth

Prior to reworking the Dove-based Compulab CM-A510 device tree, remove it
from the compiled device tree files.

Signed-off-by: Sebastian Hesselbarth 
---
Cc: Jason Cooper 
Cc: Andrew Lunn 
Cc: Gregory Clement 
Cc: Gabriel Dobato 
Cc: Arnd Bergmann 
Cc: Olof Johansson 
Cc: a...@kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/Makefile | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 86217db2937a..dc24597c4f2e 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -649,7 +649,6 @@ dtb-$(CONFIG_MACH_ARMADA_XP) += \
armada-xp-openblocks-ax3-4.dtb \
armada-xp-synology-ds414.dtb
 dtb-$(CONFIG_MACH_DOVE) += \
-   dove-cm-a510.dtb \
dove-cubox.dtb \
dove-cubox-es.dtb \
dove-d2plug.dtb \
-- 
2.1.0


[PATCH v2 2/5] ARM: dts: dove: Add internal i2c multiplexer node

2015-05-04 Thread Sebastian Hesselbarth

This adds an i2c-mux-pinctrl node to dove.dtsi for the internal i2c
mux found on Dove SoCs. Up to now, we had no board using any of the
two additional i2c busses, so make sure the change does not break
any existing boards.

Therefore, we rename the i2c-controller node label to "i2c" and
enable it by default. Also, the dedicated sub-bus (now "i2c0") is
enabled by default. The two optional sub-busses require additional
external pin-muxing, so disable them by default.

Signed-off-by: Sebastian Hesselbarth 
Acked-by: Gregory CLEMENT 
---
Cc: Jason Cooper 
Cc: Andrew Lunn 
Cc: Gregory Clement 
Cc: Gabriel Dobato 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/dove.dtsi | 40 ++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/dove.dtsi b/arch/arm/boot/dts/dove.dtsi
index 9ad829523a13..38b1f7e6004e 100644
--- a/arch/arm/boot/dts/dove.dtsi
+++ b/arch/arm/boot/dts/dove.dtsi
@@ -33,6 +33,42 @@
marvell,tauros2-cache-features = <0>;
};
 
+   i2c-mux {
+   compatible = "i2c-mux-pinctrl";
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   i2c-parent = <&i2c>;
+
+   pinctrl-names = "i2c0", "i2c1", "i2c2";
+   pinctrl-0 = <&pmx_i2cmux_0>;
+   pinctrl-1 = <&pmx_i2cmux_1>;
+   pinctrl-2 = <&pmx_i2cmux_2>;
+
+   i2c0: i2c@0 {
+   reg = <0>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   status = "okay";
+   };
+
+   i2c1: i2c@1 {
+   reg = <1>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   /* Requires pmx_i2c1 on i2c controller node */
+   status = "disabled";
+   };
+
+   i2c2: i2c@2 {
+   reg = <2>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   /* Requires pmx_i2c2 on i2c controller node */
+   status = "disabled";
+   };
+   };
+
mbus {
compatible = "marvell,dove-mbus", "marvell,mbus", "simple-bus";
#address-cells = <2>;
@@ -123,7 +159,7 @@
status = "disabled";
};
 
-   i2c0: i2c-ctrl@11000 {
+   i2c: i2c-ctrl@11000 {
compatible = "marvell,mv64xxx-i2c";
reg = <0x11000 0x20>;
#address-cells = <1>;
@@ -132,7 +168,7 @@
clock-frequency = <40>;
timeout-ms = <1000>;
clocks = <&core_clk 0>;
-   status = "disabled";
+   status = "okay";
};
 
uart0: serial@12000 {
-- 
2.1.0


Re: [PATCH V4 07/24] perf tools: Add support for PERF_RECORD_ITRACE_START

2015-05-04 Thread Jiri Olsa

On Thu, Apr 30, 2015 at 05:37:30PM +0300, Adrian Hunter wrote:
> Add support for the PERF_RECORD_ITRACE_START event type.
> This event can be used to determine the pid and tid that
> are running when Instruction Tracing starts.  Generally
> that information would come from a sched_switch event
> but, at the start, no sched_switch events may yet have
> been recorded.
> 
> Signed-off-by: Adrian Hunter 

Acked-by: Jiri Olsa 

thanks,
jirka

Re: [PATCH V4 09/24] perf record: Add AUX area tracing Snapshot Mode support

2015-05-04 Thread Jiri Olsa

On Thu, Apr 30, 2015 at 05:37:32PM +0300, Adrian Hunter wrote:
> Add a new option and support for Instruction
> Tracing Snapshot Mode.  When the new option is
> selected, no AUX area tracing data is
> captured until a signal (SIGUSR2) is received.
> 
> Signed-off-by: Adrian Hunter 

Acked-by: Jiri Olsa 

thanks,
jirka

Re: [PATCH v5 1/4] iio: core: Introduce IIO software triggers

2015-05-04 Thread Lars-Peter Clausen


On 05/04/2015 12:50 PM, Daniel Baluta wrote:

A software trigger associates an IIO device trigger with a software
interrupt source (e.g: timer, sysfs). This patch adds the generic
infrastructure for handling software triggers.

Software interrupt sources are kept in iio_trigger_types_list and
registered separately when the associated kernel module is loaded.

Software triggers can be created directly from drivers or from user
space via configfs interface.

Signed-off-by: Daniel Baluta 
---
  drivers/iio/Kconfig   |   8 +++
  drivers/iio/Makefile  |   1 +
  drivers/iio/industrialio-sw-trigger.c | 112 ++
  include/linux/iio/sw_trigger.h|  60 ++
  4 files changed, 181 insertions(+)
  create mode 100644 drivers/iio/industrialio-sw-trigger.c
  create mode 100644 include/linux/iio/sw_trigger.h

diff --git a/drivers/iio/Kconfig b/drivers/iio/Kconfig
index 4011eff..de7f1d9 100644
--- a/drivers/iio/Kconfig
+++ b/drivers/iio/Kconfig
@@ -58,6 +58,14 @@ config IIO_CONSUMERS_PER_TRIGGER
This value controls the maximum number of consumers that a
given trigger may handle. Default is 2.

+config IIO_SW_TRIGGER
+   bool "Enable software triggers support"
+   depends on IIO_TRIGGER
+   help
+Provides IIO core support for software triggers. A software
+trigger can be created via configfs or directly by a driver
+using the API provided.
+
  source "drivers/iio/accel/Kconfig"
  source "drivers/iio/adc/Kconfig"
  source "drivers/iio/amplifiers/Kconfig"
diff --git a/drivers/iio/Makefile b/drivers/iio/Makefile
index 698afc2..df87975 100644
--- a/drivers/iio/Makefile
+++ b/drivers/iio/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_IIO) += industrialio.o
  industrialio-y := industrialio-core.o industrialio-event.o inkern.o
  industrialio-$(CONFIG_IIO_BUFFER) += industrialio-buffer.o
  industrialio-$(CONFIG_IIO_TRIGGER) += industrialio-trigger.o
+industrialio-$(CONFIG_IIO_SW_TRIGGER) += industrialio-sw-trigger.o
  industrialio-$(CONFIG_IIO_BUFFER_CB) += buffer_cb.o

  obj-$(CONFIG_IIO_TRIGGERED_BUFFER) += industrialio-triggered-buffer.o
diff --git a/drivers/iio/industrialio-sw-trigger.c 
b/drivers/iio/industrialio-sw-trigger.c
new file mode 100644
index 000..f22aa63
--- /dev/null
+++ b/drivers/iio/industrialio-sw-trigger.c
@@ -0,0 +1,112 @@
+/*
+ * The Industrial I/O core, software trigger functions
+ *
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+static LIST_HEAD(iio_trigger_types_list);
+static DEFINE_RWLOCK(iio_trigger_types_lock);
+
+static
+struct iio_sw_trigger_type *iio_find_sw_trigger_type(char *name, unsigned len)


This should be const char *name; there are a couple of other places below
where char * should be const char * as well.
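
A minimal userspace sketch of what the const-qualified lookup could look like. The struct below is a hypothetical stand-in for struct iio_sw_trigger_type (a plain singly linked list instead of the kernel list helpers), so this only illustrates the signature change, not the actual patch:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct iio_sw_trigger_type; only the name matters. */
struct sw_trigger_type {
	const char *name;
	struct sw_trigger_type *next;
};

/*
 * Taking "const char *name" documents that the lookup never writes to
 * the string and lets callers pass string literals without a cast.
 */
static struct sw_trigger_type *
find_sw_trigger_type(struct sw_trigger_type *head, const char *name, size_t len)
{
	struct sw_trigger_type *iter;

	for (iter = head; iter; iter = iter->next)
		if (!strncmp(iter->name, name, len))
			return iter;

	return NULL;
}
```

With this signature a caller can write find_sw_trigger_type(head, "sysfs", strlen("sysfs")) directly.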



+{
+   struct iio_sw_trigger_type *t = NULL, *iter;
+
+   list_for_each_entry(iter, &iio_trigger_types_list, list)
+   if (!strncmp(iter->name, name, len)) {
+   t = iter;
+   break;
+   }
+
+   return t;
+}
+
+int iio_register_sw_trigger_type(struct iio_sw_trigger_type *t)


Kernel doc would be nice for the public API


+{
+   struct iio_sw_trigger_type *iter;
+   int ret = 0;
+
+   write_lock(&iio_trigger_types_lock);
+   iter = iio_find_sw_trigger_type(t->name, strlen(t->name));
+   if (iter)
+   ret = -EBUSY;
+   else
+   list_add_tail(&t->list, &iio_trigger_types_list);
+   write_unlock(&iio_trigger_types_lock);
+
+   return ret;
+}
+EXPORT_SYMBOL(iio_register_sw_trigger_type);
+
+int iio_unregister_sw_trigger_type(struct iio_sw_trigger_type *t)


I'd make the return type void. Either it is not registered and unregister is 
a noop, or it is registered and it will be successfully unregistered. Either 
way the operation won't fail.



+{
+   struct iio_sw_trigger_type *iter;
+   int ret = 0;
+
+   write_lock(&iio_trigger_types_lock);
+   iter = iio_find_sw_trigger_type(t->name, strlen(t->name));


Not sure if we need this. unregister should never be called without register 
succeeding before.
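
To illustrate the two comments above, here is a rough userspace sketch of a registry where unregister returns void and does no lookup. The struct, the hand-rolled circular list and the function names are assumptions made for the example; locking is left out entirely, whereas the patch takes iio_trigger_types_lock around both operations:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct iio_sw_trigger_type. */
struct sw_trigger_type {
	const char *name;
	struct sw_trigger_type *prev, *next;
};

/* Circular list head, a userspace stand-in for LIST_HEAD(). */
static struct sw_trigger_type types = { NULL, &types, &types };

/* Registration can fail: a type with the same name may already exist. */
static int register_sw_trigger_type(struct sw_trigger_type *t)
{
	struct sw_trigger_type *iter;

	for (iter = types.next; iter != &types; iter = iter->next)
		if (!strcmp(iter->name, t->name))
			return -EBUSY;

	/* Append at the tail, like list_add_tail(). */
	t->prev = types.prev;
	t->next = &types;
	types.prev->next = t;
	types.prev = t;
	return 0;
}

/*
 * Unregistration has nothing to report: the caller must only pass a
 * type that was registered successfully, so no lookup is needed.
 */
static void unregister_sw_trigger_type(struct sw_trigger_type *t)
{
	t->prev->next = t->next;
	t->next->prev = t->prev;
	t->prev = t->next = t;
}
```

The contract "only unregister what you registered" moves the error handling to the one place where it can actually occur.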



+   if (!iter)
+   ret = -EINVAL;
+   else
+   list_del(&t->list);
+   write_unlock(&iio_trigger_types_lock);
+
+   return ret;
+}
+EXPORT_SYMBOL(iio_unregister_sw_trigger_type);
+
+static
+struct iio_sw_trigger_type *iio_get_sw_trigger_type(char *name)
+{
+   struct iio_sw_trigger_type *t;
+
+   read_lock(&iio_trigger_types_lock);
+   t = iio_find_sw_trigger_type(name, strlen(name));
+   if (t && !try_module_get(t->owner))
+   t = NULL;
+   read_unlock(&iio_trigger_types_lock);
+
+   return t

Re: [PATCH] Input: Fix multitouch support for Type Cover 3

2015-05-04 Thread Benjamin Tissoires

Hi Felipe,

On Sat, May 2, 2015 at 12:35 PM, Felipe  wrote:
> Hey Benjamin,
>
> Did you get a chance to look at the new patch I sent? I included the
> "touchpad" suffix part, but I don't know if I should have.

Yes, I did. Sorry for the lag. I think the code now looks fine.

However, when I tested it, I felt that we need to fix
hid-core/hid-input too, or we would end up showing a lot of unused
input nodes. Also, the LED breakage is rather worrisome, so I'd prefer
we fix hid-input first. I'd also prefer we specifically enable only
the used input nodes in hid-multitouch (something like a bitmask to
say which input nodes are created and used within the mt class).

I still did not have the time to work on this again, so if you want to
have a look at it yourself, you are welcome.

IIRC my findings for the hid-input code were:
- when using MULTI_INPUT, we will create one input node per report id
per report direction (input/output).
  -> We should probably not create 2 input nodes per id, but rather
reuse the previous existing one.
- That means that we should not register the input node when creating
a new one but register them in a batch later when the reports have
been mapped
- It should be fine for most drivers except a few which are expecting
to have the current behavior when probing/working (some FF devices
IIRC).
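
The difference between the two mapping policies in the list above can be sketched with a toy model. This is not hid-input code: the struct, the report IDs and both function names are made up for illustration, and a report is reduced to just its ID and direction:

```c
#include <stdbool.h>
#include <stddef.h>

enum report_dir { REPORT_INPUT, REPORT_OUTPUT };

/* Hypothetical, simplified view of a HID report: only id and direction. */
struct report {
	unsigned int id;
	enum report_dir dir;
};

/* Behaviour as described above: one node per (report id, direction). */
static size_t count_nodes_per_id_and_dir(const struct report *r, size_t n)
{
	size_t i, j, nodes = 0;

	for (i = 0; i < n; i++) {
		bool seen = false;

		for (j = 0; j < i; j++)
			if (r[j].id == r[i].id && r[j].dir == r[i].dir)
				seen = true;
		if (!seen)
			nodes++;
	}
	return nodes;
}

/* Suggested behaviour: reuse the existing node for the same report id. */
static size_t count_nodes_per_id(const struct report *r, size_t n)
{
	size_t i, j, nodes = 0;

	for (i = 0; i < n; i++) {
		bool seen = false;

		for (j = 0; j < i; j++)
			if (r[j].id == r[i].id)
				seen = true;
		if (!seen)
			nodes++;
	}
	return nodes;
}
```

For a device with a keyboard input report and an LED output report sharing one ID, plus a touchpad input report on another ID, the first policy yields three nodes and the second yields two, with the LED ending up on the same node as the keys.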

So the question is: should we add an extra quirk for the new
"MULTI_INPUT" behavior, or fix MULTI_INPUT as the general rule and have a
fix for those drivers which we think may be affected by the new way of
registering inputs?

Cheers,
Benjamin

>
> On Tue, Apr 14, 2015 at 4:51 PM Benjamin Tissoires
>  wrote:
>>
>> On Mon, Apr 13, 2015 at 11:47 AM, Felipe 
>> wrote:
>> > On Mon, Apr 13, 2015 at 11:16 AM Benjamin Tissoires
>> >  wrote:
>> >>
>> >> On Sun, Apr 12, 2015 at 6:04 PM, Felipe 
>> >> wrote:
>> >> > Hi Benjamin,
>> >> >
>> >> > On Sat, Apr 11, 2015 at 11:08 AM, Benjamin Tissoires
>> >> >  wrote:
>> >> >> Hi Felipe,
>> >> >>
>> >> >> On Sat, Apr 11, 2015 at 12:17 AM, Felipe Otamendi
>> >> >>  wrote:
>> >> >>> Make the Type Cover 3 use the hid multitouch driver, which is
>> >> >>> better suited for the touchpad. Also, since it has multiple reports 
>> >> >>> under
>> >> >>> the same interface, allow the generic hid driver to handle 
>> >> >>> non-multitouch
>> >> >>> inputs such as the keyboard's.
>> >> >>
>> >> >> IIRC, the point of having hid-microsoft was to have better support
>> >> >> of
>> >> >> the keyboard special functions and shortcuts. Can you please confirm
>> >> >> that you do not lose any functionality?
>> >> >>
>> >> >
>> >> > I've checked and all the keys work as they used to with the previous
>> >> > patch. The only thing that doesn't work is the led on the Caps Lock
>> >> > key. That's because the output from the keyboard report is being
>> >> > mapped as a different input than the input from the same report
>> >> > because of how inputs are mapped when HID_QUIRK_MULTI_INPUT is
>> >> > enabled.
>> >>
>> >> That is worrisome. It means that there will be a regression with the
>> >> patch.
>> >> If I understand correctly, with hid-microsoft, the Caps Lock LED
>> >> works, and not with hid-multitouch?
>> >>
>> >
>> > With hid-microsoft and hid-input the LED works, but not if you set the
>> > HID_QUIRK_MULTI_INPUT. The hid-multitouch driver uses that quirk by
>> > default but it is needed to get both the keyboard and touchpad as
>> > different inputs so X11 drivers can pick them up independently. Also,
>> > the hid-multitouch driver works well not only because it handles the
>> > touchpad fields correctly but also because it initializes the device
>> > in multitouch mode (Input mode feature report [1]) instead of mouse
>> > mode.
>> > The LED output report is mapped separately because of a combination of
>> > how reports are traversed in hidinput_connect in hid-input.c and how
>> > are mapped to new inputs with the HID_QUIRK_MULTI_INPUT. That part
>> > seems dangerous to modify without breaking compatibility with other
>> > devices. Maybe adding a different quirk? I don't know what the
>> > protocol is in those cases.
>>
>> It took me a while but I finally got your point. hidinput_connect
>> assigned two different input nodes for the input and output reports
>> even if they share the same report ID. X believes there are 2 distinct
>> keyboards and does not change the LED of the one without the LED
>> declared :)
>>
>> This is definitely something we should fix in hid-input.c. IMO, the
>> for loop in hidinput_configure() has been wild for too long and it is
>> really hard to get what it does.
>> I'll try to put something into it.
>>
>> >
>> > [1]
>> > https://msdn.microsoft.com/en-us/library/windows/hardware/dn467314%28v=vs.85%29.aspx
>> >
>> >> Can you share the report descriptors of the device? I might have had
>> >> one, but I can not find it.
>> >>
>> >
>> > Yes, here's the report [2], it is in html.
>> >
>> > [2]
>> > http://htmlpreview.github.io/?https://gist.githubusercontent.com/felipeota/

Re: [PATCH V4 08/24] perf tools: Add AUX area tracing Snapshot Mode

2015-05-04 Thread Jiri Olsa

On Thu, Apr 30, 2015 at 05:37:31PM +0300, Adrian Hunter wrote:
> Add support for making snapshots of
> AUX area tracing data.
> 
> Signed-off-by: Adrian Hunter 


Acked-by: Jiri Olsa 

thanks,
jirka

Re: [PATCH V4 10/24] perf auxtrace: Add Intel PT as an AUX area tracing type

2015-05-04 Thread Jiri Olsa

On Thu, Apr 30, 2015 at 05:37:33PM +0300, Adrian Hunter wrote:
> Add the Intel Processor Trace type
> constant PERF_AUXTRACE_INTEL_PT.
> 
> Signed-off-by: Adrian Hunter 

Acked-by: Jiri Olsa 

thanks,
jirka

Re: [PATCH v4 04/20] clk: tegra: pll: simplify clk_enable_path

2015-05-04 Thread Benson Leung

On Mon, May 4, 2015 at 9:37 AM, Rhyland Klein  wrote:
> Instead of having multiple similar wrapper functions for
> _clk_pll_[enable|disable], we can simplify it to single
> wrappers and use checks to avoid the logic we don't want to use.
>
> Signed-off-by: Rhyland Klein 

Reviewed-by: Benson Leung 

-- 
Benson Leung
Software Engineer, Chrom* OS
ble...@chromium.org

Re: question about RCU dynticks_nesting

2015-05-04 Thread Rik van Riel

On 05/04/2015 04:02 PM, Paul E. McKenney wrote:
> On Mon, May 04, 2015 at 03:39:25PM -0400, Rik van Riel wrote:
>> On 05/04/2015 02:39 PM, Paul E. McKenney wrote:
>>> On Mon, May 04, 2015 at 11:59:05AM -0400, Rik van Riel wrote:
>>
>>>> In fact, would we be able to simply use tsk->rcu_read_lock_nesting
>>>> as an indicator of whether or not we should bother waiting on that
>>>> task or CPU when doing synchronize_rcu?
>>>
>>> Depends on exactly what you are asking.  If you are asking if I could add
>>> a few more checks to preemptible RCU and speed up grace-period detection
>>> in a number of cases, the answer is very likely "yes".  This is on my
>>> list, but not particularly high priority.  If you are asking whether
>>> CPU 0 could access ->rcu_read_lock_nesting of some task running on
>>> some other CPU, in theory, the answer is "yes", but in practice that
>>> would require putting full memory barriers in both rcu_read_lock()
>>> and rcu_read_unlock(), so the real answer is "no".
>>>
>>> Or am I missing your point?
>>
>> The main question is "how can we greatly reduce the overhead
>> of nohz_full, by simplifying the RCU extended quiescent state
>> code called in the syscall fast path, and maybe piggyback on
>> that to do time accounting for remote CPUs?"
>>
>> Your memory barrier answer above makes it clear we will still
>> want to do the RCU stuff at syscall entry & exit time, at least
>> on x86, where we already have automatic and implicit memory
>> barriers.
> 
> We do need to keep in mind that x86's automatic and implicit memory
> barriers do not order prior stores against later loads.
> 
> Hmmm...  But didn't earlier performance measurements show that the bulk of
> the overhead was the delta-time computations rather than RCU accounting?

The bulk of the overhead was disabling and re-enabling
irqs around the calls to rcu_user_exit and rcu_user_enter :)

Of the remaining time, about 2/3 seems to be the vtime
stuff, and the other 1/3 the rcu code.

I suspect it makes sense to optimize both, though the
vtime code may be the easiest :)
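
For reference, the quoted remark that x86 does not order prior stores against later loads is the classic store-buffering pattern. A userspace sketch with C11 atomics follows; it is not kernel code, and the names x, y, cpu0() and cpu1() are made up for illustration. Each side stores to its own flag and then loads the other one. Without the full fence both loads could observe 0 when the two functions run concurrently, even on x86; with the fence at least one side must see the other's store:

```c
#include <stdatomic.h>

/* Two flags, both initially zero. */
static atomic_int x, y;

/* Meant to run on one CPU: publish x, then look at y. */
static int cpu0(void)
{
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	/* Full barrier: orders the store above against the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&y, memory_order_relaxed);
}

/* Meant to run on another CPU: publish y, then look at x. */
static int cpu1(void)
{
	atomic_store_explicit(&y, 1, memory_order_relaxed);
	/* Same full barrier on this side. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&x, memory_order_relaxed);
}
```

This is the kind of ordering a remote reader of ->rcu_read_lock_nesting would need on the read-side fast path, which is why the answer quoted earlier in the thread was "no" in practice.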

-- 
All rights reversed

Re: [RFC] Design for flag bit outputs from asms

2015-05-04 Thread H. Peter Anvin

On 05/04/2015 12:33 PM, Richard Henderson wrote:
> 
> (0) The C level output variable should be an integral type, from bool on up.
> 
> The flags are a scarce resource, easily clobbered.  We cannot allow user code
> to keep data in the flags.  While x86 does have lahf/sahf, they don't exactly
> perform well.  And other targets like arm don't even have that bad option.
> 
> Therefore, the language level semantics are that the output is a boolean store
> into the variable with a condition specified by a magic constraint.
> 
> That said, just like the compiler should be able to optimize
> 
> void bar(int y)
> {
>   int x = (y <= 0);
>   if (x) foo();
> }
> 
> such that we only use a single compare against y, the expectation is that
> within a similarly constrained context the compiler will not require two tests
> for these boolean outputs.
> 
> Therefore:
> 
> (1) Each target defines a set of constraint strings,
> 
>E.g. for x86, wherein we're almost out of constraint letters,
> 
>  ja   aux carry flag
>  jc   carry flag
>  jo   overflow flag
>  jp   parity flag
>  js   sign flag
>  jz   zero flag
> 

I would argue that for x86 what you actually want is to model the
*conditions* that are available on the flags, not the flags themselves.
 There are 16 such conditions, 8 if we discard the inversions.

It is notable that the auxiliary carry flag has no Jcc/SETcc/CMOVcc
instructions; it is only ever consumed by the DAA/DAS instructions which
makes it pointless to try to model it in a compiler any more than, say, IF.

> (2) A new target hook post-processes the asm_insn, looking for the
> new constraint strings.  The hook expands the condition prescribed
> by the string, adjusting the asm_insn as required.
> 
>   E.g.
> 
> bool x, y, z;
> asm ("xyzzy" : "=jc"(x), "=jp"(y), "=jo"(z) : : );

Other than that, this is exactly what would be wonderful to see.
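
As a rough illustration of modelling a condition rather than a raw flag: to my knowledge, later GCC releases document this feature under "Flag Output Operands" with the spelling "=@cc<cond>" on x86, which differs from the "=jc" spelling proposed in this thread. The sketch below assumes that later syntax and the __GCC_ASM_FLAG_OUTPUTS__ feature macro, and falls back to plain C elsewhere; add_carry() is a made-up helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds a and b, stores the 32-bit sum, and returns the carry-out. */
static bool add_carry(uint32_t a, uint32_t b, uint32_t *sum)
{
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && (defined(__x86_64__) || defined(__i386__))
	bool carry;
	uint32_t s = a;

	/* "=@ccc" asks the compiler to read the carry condition after the asm. */
	__asm__("addl %2, %0" : "+r"(s), "=@ccc"(carry) : "r"(b));
	*sum = s;
	return carry;
#else
	/* Portable fallback: unsigned wrap-around means a carry occurred. */
	*sum = a + b;
	return *sum < a;
#endif
}
```

Used as "if (add_carry(a, b, &s))", the compiler is free to branch on the flag directly instead of materialising a boolean first, which is the single-test behaviour the proposal asks for.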

-hpa


Re: [PATCH] x86: Optimize variable_test_bit()

2015-05-04 Thread H. Peter Anvin

On 05/04/2015 11:07 AM, Vladimir Makarov wrote:
>>>
>>>So I could implement the output reloads in LRA, probably for the
>>> next GCC release.  How to enable and mostly use it for multi-target
>>> code like the kernel is another question.
>> Pretty much all inline asm is in per arch code; so one arch having
>> different asm features than another should not be a problem at all.
> Ok, then. I'll try to implement output operands for asm-goto in LRA for
> the next GCC release.
> 
> Of course, if nobody objects to changing asm goto semantics from
> 
>  An 'asm goto' statement cannot have outputs ...
> 
> to
> 
>  An 'asm goto' statement cannot have outputs on some targets ...
> 

A gradual implementation should be fine.

-hpa



Re: [PATCH v4 05/20] clk: tegra: pll: update warning msg

2015-05-04 Thread Benson Leung

On Mon, May 4, 2015 at 9:37 AM, Rhyland Klein  wrote:
> Swap out the generic WARN_ON with a WARN which gives more
> information about what is happening.
>
> Signed-off-by: Rhyland Klein 

Reviewed-by: Benson Leung 


-- 
Benson Leung
Software Engineer, Chrom* OS
ble...@chromium.org

Re: [PATCH] blk-mq: don't lose requests if a stopped queue restarts

2015-05-04 Thread Shaohua Li

On Mon, May 04, 2015 at 01:56:42PM -0600, Jens Axboe wrote:
> On 05/04/2015 01:51 PM, Shaohua Li wrote:
> >On Mon, May 04, 2015 at 01:17:19PM -0600, Jens Axboe wrote:
> >>On 05/02/2015 06:31 PM, Shaohua Li wrote:
> >>>Normally if driver is busy to dispatch a request the logic is like below:
> >>>block layer:   driver:
> >>>   __blk_mq_run_hw_queue
> >>>a. blk_mq_stop_hw_queue
> >>>b. rq add to ctx->dispatch
> >>>
> >>>later:
> >>>1. blk_mq_start_hw_queue
> >>>2. __blk_mq_run_hw_queue
> >>>
> >>>But it's possible step 1-2 runs between a and b. And since rq isn't in
> >>>ctx->dispatch yet, step 2 will not run rq. The rq might get lost if
> >>>there are no subsequent requests kick in.
> >>
> >>Good catch! But the patch introduces a potentially never ending loop
> >>in __blk_mq_run_hw_queue(). Not sure how we can fully close it, but
> >>it might be better to punt the re-run after adding the requests back
> >>to the worker. That would turn a potential busy loop (until requests
> >>complete) into something with nicer behavior, at least. Ala
> >>
> >>if (!test_bit(BLK_MQ_S_STOPPED, &hctx->state))
> >>  kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> >> &hctx->run_work, 0);
> >
> >My first version of the patch is like this, but I changed my mind later.
> >The assumption is driver will stop queue if it's busy to dispatch
> >request.  If the driver is buggy, we will have the endless loop here.
> >Should we assume drivers will not do the right thing?
> 
> There's really no contract that says the driver MUST stop the queue
> for busy. It could, legitimately, decide to just always run the
> queue when requests complete.
> 
> It might be better to simply force this behavior. If we get a BUSY,
> stop the queue from __blk_mq_run_hw_queue(). And if the bit isn't
> still set on re-add, then we know we need to re-run it. I think that
> would be a cleaner API, less fragile, and harder to get wrong. The
> down side is that now this stop happens implicitly by the core, and
> the driver must now have an asymmetric queue start when it frees the
> limited resource that caused the BUSY return. Either that, or we
> define a 2nd set of start/stop bits, one used exclusively by the
> driver and one used exclusively by blk-mq. Then blk-mq could restart
> the queue on completion of a request, since it would then know that
> blk-mq was the one that stopped it.

Agree. I'll make the rerun async for now and leave above as a future
improvement.
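
For readers following along, here is a minimal user-space sketch of the
interleaving discussed above. It is only a toy model, not blk-mq code: the
struct toy_hctx type and the toy_* helpers are hypothetical names made up
for illustration, and a plain counter stands in for the dispatch list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a hardware queue context. */
struct toy_hctx {
	bool stopped;	/* models the BLK_MQ_S_STOPPED bit */
	int dispatch;	/* requests parked on the dispatch list */
	int completed;	/* requests handed to the "driver" */
};

/* Models a queue run: drain the dispatch list unless the queue is stopped. */
static void toy_run_queue(struct toy_hctx *h)
{
	assert(h != NULL);
	if (h->stopped)
		return;
	h->completed += h->dispatch;
	h->dispatch = 0;
}

/*
 * Replays the order a, 1, 2, b from the description.  When rerun is
 * true, the queue is run once more after step b, which is the idea
 * behind the patch.  Returns the number of stranded requests.
 */
static int toy_race(bool rerun)
{
	struct toy_hctx h = { false, 0, 0 };

	h.stopped = true;	/* a. driver stops the queue (busy)  */
	h.stopped = false;	/* 1. driver restarts the queue      */
	toy_run_queue(&h);	/* 2. queue runs, dispatch is empty  */
	h.dispatch += 1;	/* b. rq finally added to dispatch   */

	if (rerun)
		toy_run_queue(&h);

	return h.dispatch;
}
```

In this model toy_race(false) leaves one request stranded, while
toy_race(true) leaves none, because the extra run happens after the
request is back on the list.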


From 3e767da0e9f1044659c605120e09726ffd1aeab0 Mon Sep 17 00:00:00 2001
Message-Id: <3e767da0e9f1044659c605120e09726ffd1aeab0.1430770649.git.s...@fb.com>
From: Shaohua Li 
Date: Fri, 1 May 2015 16:39:39 -0700
Subject: [PATCH] blk-mq: don't lose requests if a stopped queue restarts

Normally, if the driver is too busy to dispatch a request, the logic is as
follows:

block layer:                        driver:
    __blk_mq_run_hw_queue
a.                                  blk_mq_stop_hw_queue
b.  rq add to ctx->dispatch

later:
1.                                  blk_mq_start_hw_queue
2.  __blk_mq_run_hw_queue

But it's possible that steps 1-2 run between a and b. Since rq isn't in
ctx->dispatch yet, step 2 will not run rq. The rq might get lost if no
subsequent requests kick in.

Signed-off-by: Shaohua Li 
---
 block/blk-mq.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ade8a2d..e1a5b9e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -855,6 +855,16 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
spin_lock(&hctx->lock);
list_splice(&rq_list, &hctx->dispatch);
spin_unlock(&hctx->lock);
+   /*
+* the queue is expected stopped with BLK_MQ_RQ_QUEUE_BUSY, but
+* it's possible the queue is stopped and restarted again
+* before this. Queue restart will dispatch requests. And since
+* requests in rq_list aren't added into hctx->dispatch yet,
+* the requests in rq_list might get lost.
+*
+* blk_mq_run_hw_queue() already checks the STOPPED bit
+**/
+   blk_mq_run_hw_queue(hctx, true);
}
 }
 
-- 
1.8.1


Re: [PATCH 1/2] net/rds: RDS-TCP: Always create a new rds_sock for an incoming connection.

2015-05-04 Thread David Miller

From: Sowmini Varadhan 
Date: Mon, 4 May 2015 15:29:08 -0400

> On (05/04/15 14:47), David Miller wrote:
>> 
>> I think adding 64K of data to this module just to solve this rare
>> issue is excessive.
> 
> I'd based that number mostly as a heuristic based on rds_conn_hash[].
> Any suggestions for what's reasonable? 8K? Less?
> (BTW, I think that should be 32K, or am I mis-counting?)

No table at all.

There has to be another way to notice this kind of situation, how
for example does NFS or any other sunrpc using service handle this
case?

Re: earlycon: no match?

2015-05-04 Thread Peter Hurley

On 05/04/2015 03:42 PM, Robert Schwebel wrote:
> Hi Peter,
> 
> On Mon, May 04, 2015 at 10:01:37AM -0400, Peter Hurley wrote:
>>> with 4.1-rc1, my boxes with early console enabled show something like
>>> this (the example is vexpress, but it for example also happens on an
>>> AM335x board):
>>>
>>>   earlycon: no match for ttyAMA0,38400n8
>>
>> This shouldn't impact any previous earlycon setup. Are you saying
>> you're seeing a regression?
> 
> Well, it is a warning, and the system was warning-free on mainline with
> the last kernels. People assume something is wrong if they read such a
> message, so I'm searching for a way to do it right and get rid of the
> warning again.

It's not a warning; it's simply a diagnostic in case the earlycon was
misspelled. Since 2007, 'console=' is an early param synonym for 'earlycon=';
IOW, the message is new but not the behavior.

>> How do you have early console enabled, via the command line or via DT?
> 
> Neither: the same SD card image runs on qemu (vexpress) and on an
> AM335x. It has its primary console on the serial console:
> 
> - console=ttyAMA0,38400  (amba-pl011.c, vexpress)
> - console=ttyO2,115200n8 (omap-serial.c, AM335x)
> 
> There is no "earlycon" on the commandline and nothing earlycon related I
> did on purpose in the oftree.

Ok. In your first email, you said "my boxes with early console enabled",
so I thought you meant that you were starting an earlycon. I see now
you meant enabled, as in built-in (not enabled as in started).

> My expectation would be to configure the system in a way that I have
> everything necessary for earlycon usage compiled into the kernel, so I
> can enable it manually from the bootloader whenever I need it (i.e. by
> adding 'earlycon' to the kernel commandline, or by modifying the oftree
> before it is handed over to the kernel).

Ok.

>>> The box was booted with "console=ttyAMA0,38400n8" on the commandline.
>>> If I understand this right, the code in drivers/tty/serial/earlycon.c
>>> calls setup_earlycon() with the string above ("ttyAMA0,38400n8") and
>>> fails to find that string in the "names" part of the __earlycon_table,
>>> because for the pl011 component on vexpress, the early console was
>>> registered in drivers/tty/serial/amba-pl011.c with:
>>>
>>> OF_EARLYCON_DECLARE(pl011, "arm,pl011", pl011_early_console_setup);
>>> ^ name
>>>
>>> So isn't that trying to match "ttyAMA0" against "arm,pl011"? I have the
>>> feeling that I didn't understand the logic behind that.
>>>
>>> Can you elaborate about how this is supposed to work correctly?

The facility I describe below is to enable earlycon->console handoff.
If you're not interested in that, please disregard.

I provided the description because it wasn't clear to me from your
original email if that was something you were trying to implement and
couldn't get working.

>> Yeah, I've been meaning to write about this but simply haven't had the
>> time yet; apologies for that.
>>
>> The facility is hopefully best explained by the existing 8250 exemplar.
>> Normally, an 8250 early console is started via command line with a
>> command line parameter like:
>>
>>  earlycon=uart,io,0x2f8,115200n8
> 
> What happens if you don't have this parameter on the kernel commandline,
> but use the same port for your serial console? i.e. 'console=ttyS0'?
>
> I would expect the same warning I see on my boxes.

The diagnostic was added in commit 470ca0de69feaba5df215ad804cec1859883a5ed
("serial: earlycon: Enable earlycon without command line param").

Previously, if earlycon failed to start because of an error, such as because the
earlycon name was misspelled, there was no diagnostic.

>> Since 2007, an 8250 early console can also be started via command line
>> using console= instead, like:
>>
>>  console=uart,io,0x2f8,115200n8
> 
> No: "console=..." puts the console on that port, not the early console.
> The semantic for console= was always to specify the name of the device
> there, so "console=ttyS0...", not "console=uart...", right?

From Documentation/kernel-parameters:

console=[KNL] Output console device and options.

uart[8250],io,<addr>[,options]
uart[8250],mmio,<addr>[,options]
uart[8250],mmio32,<addr>[,options]
uart[8250],0x<addr>[,options]
Start an early, polled-mode console on the 8250/16550
UART at the specified I/O port or MMIO address,
switching to the matching ttyS device later.
MMIO inter-register address stride is either 8-bit
(mmio) or 32-bit (mmio32).
If none of [io|mmio|mmio32], <addr> is assumed to be
equivalent to 'mmio'. 'options' are specified in the
same format described for ttyS above; if unspecified,
the h/w is not re-initialized.

This behavior

Re: earlycon: no match?

2015-05-04 Thread Sascha Hauer

On Mon, May 04, 2015 at 10:01:37AM -0400, Peter Hurley wrote:
> Hi Robert,
> 
> On 05/03/2015 05:10 PM, Robert Schwebel wrote:
> > Hi Peter,
> > 
> > with 4.1-rc1, my boxes with early console enabled show something like
> > this (the example is vexpress, but it for example also happens on an
> > AM335x board):
> > 
> >   earlycon: no match for ttyAMA0,38400n8
> 
> This shouldn't impact any previous earlycon setup. Are you saying
> you're seeing a regression?
> 
> How do you have early console enabled, via the command line or via DT?

What happens here is that Robert has console=ttyAMA0,38400n8 in his
command line. In init/main.c we have:

static int __init do_early_param(char *param, char *val, const char *unused)
{
const struct obs_kernel_param *p;

for (p = __setup_start; p < __setup_end; p++) {
if ((p->early && parameq(param, p->str)) ||
(strcmp(param, "console") == 0 &&
 strcmp(p->str, "earlycon") == 0)
) {
if (p->setup_func(val) != 0)
pr_warn("Malformed early option '%s'\n", param);
}
}
/* We accept everything at this stage. */
return 0;
}

This means that param_setup_earlycon() gets called with the arguments
passed to the console= parameter which makes no sense in this context
and leads to the "no match for" message.
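
As a rough illustration of that matching rule, here is a small user-space
model. It is a sketch under assumptions, not the kernel code: the toy_*
names, the two-entry parameter table and the list of known earlycon names
are all hypothetical, and plain strcmp() stands in for parameq().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct obs_kernel_param. */
struct toy_param {
	const char *str;
	int early;
	int (*setup_func)(const char *val);
};

static int earlycon_calls;	/* times the earlycon handler was invoked */
static int no_match_count;	/* times it printed "no match" */

/* Assumed list of registered earlycon names, for illustration only. */
static const char *const toy_earlycon_names[] = { "uart", "pl011" };

static int toy_setup_earlycon(const char *val)
{
	size_t i;

	assert(val != NULL);
	earlycon_calls++;
	for (i = 0; i < sizeof(toy_earlycon_names) / sizeof(toy_earlycon_names[0]); i++) {
		size_t len = strlen(toy_earlycon_names[i]);

		if (strncmp(val, toy_earlycon_names[i], len) == 0)
			return 0;	/* a matching earlycon would start here */
	}
	no_match_count++;
	printf("earlycon: no match for %s\n", val);
	return 0;
}

static int toy_setup_debug(const char *val)
{
	(void)val;
	return 0;
}

static const struct toy_param toy_params[] = {
	{ "earlycon", 1, toy_setup_earlycon },
	{ "debug",    1, toy_setup_debug },
};

/* Same shape as the condition in the quoted do_early_param(). */
static void toy_do_early_param(const char *param, const char *val)
{
	size_t i;

	for (i = 0; i < sizeof(toy_params) / sizeof(toy_params[0]); i++) {
		const struct toy_param *p = &toy_params[i];

		if ((p->early && strcmp(param, p->str) == 0) ||
		    (strcmp(param, "console") == 0 &&
		     strcmp(p->str, "earlycon") == 0))
			p->setup_func(val);
	}
}
```

Calling toy_do_early_param("console", "ttyAMA0,38400n8") reaches the
earlycon handler even though no earlycon= parameter was given, and the
handler finds no entry for a tty device name.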

Sascha

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |

[PATCH v2 0/2] Compile-time stack frame pointer validation

2015-05-04 Thread Josh Poimboeuf

In discussions around the live kernel patching consistency model RFC [1], Peter
and Ingo correctly pointed out that stack traces aren't reliable.  And as Ingo
said, there's no "strong force" which ensures we can rely on them.

So I've been thinking about how to fix that.  My goal is to eventually make
stack traces reliable.  Or at the very least, to be able to detect at runtime
when a given stack trace *might* be unreliable.  But improved stack traces
would broadly benefit the entire kernel, regardless of the outcome of the live
kernel patching consistency model discussions.

This patch set is just the first in a series of proposed stack trace
reliability improvements.  Future proposals will include runtime stack
reliability checking, as well as compile-time and runtime DWARF validations.

As far as I can tell, there are two main obstacles which prevent frame pointer
based stack traces from being reliable:

1) Missing frame pointer logic: currently, most assembly functions don't set up
   the frame pointer.

2) Interrupts: if a function is interrupted before it can save and set up
   the frame pointer, its caller won't show up in the stack trace.
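
To make the effect of a missing frame pointer setup concrete, here is a
toy user-space model of a frame pointer walk. It is only a sketch and not
part of this patch set: struct toy_frame, toy_unwind() and the call chain
start -> main -> asm_func -> leaf are hypothetical, and function names
stand in for return addresses.

```c
#include <assert.h>
#include <stddef.h>

/* One frame record, as left behind by "push %rbp; mov %rsp, %rbp". */
struct toy_frame {
	const struct toy_frame *saved_bp;	/* frame pointer value on entry */
	const char *ret_func;			/* function the return address is in */
};

/* Walk the saved_bp chain, recording one function name per frame. */
static size_t toy_unwind(const struct toy_frame *bp, const char *current,
			 const char **out, size_t max)
{
	size_t n = 0;

	assert(max > 0);
	out[n++] = current;
	for (; bp != NULL && n < max; bp = bp->saved_bp)
		out[n++] = bp->ret_func;
	return n;
}

/*
 * Trace taken inside leaf().  If asm_func skipped its frame setup, the
 * frame pointer still referred to main's frame when leaf saved it, so
 * leaf's record links straight to main's frame.
 */
static size_t toy_trace(int asm_sets_up_fp, const char **out, size_t max)
{
	struct toy_frame main_frame = { NULL, "start" };
	struct toy_frame asm_frame = { &main_frame, "main" };
	struct toy_frame leaf_frame = {
		asm_sets_up_fp ? &asm_frame : &main_frame, "asm_func"
	};

	return toy_unwind(&leaf_frame, "leaf", out, max);
}
```

With the setup in place the walk yields leaf, asm_func, main, start.
Without it the walk yields leaf, asm_func, start: the caller of the
function that skipped the setup silently drops out of the trace.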

This patch set aims to remove the first obstacle by enforcing that all asm
functions honor CONFIG_FRAME_POINTER.  This is done with a new stackvalidate
host tool which is automatically run for every compiled .S file and which
validates that every asm function does the proper frame pointer setup.

Also, to make sure somebody didn't forget to annotate their callable asm code
as a function, flag an error for any return instructions which are hiding
outside of a function.  In almost all cases, return instructions are part of
callable functions and should be annotated as such so that we can validate
their frame pointer usage.  A whitelist mechanism exists for those few return
instructions which are not actually in callable code.

It currently only supports x86_64.  It *almost* supports x86_32, but the
stackvalidate code doesn't yet know how to deal with 32-bit REL
relocations for the return whitelists.  I tried to make the code generic
so that support for other architectures can be plugged in pretty easily.

As a first step, all reported non-compliances result in warnings.  Right
now I'm seeing 200+ warnings.  Once we get them all cleaned up, we can
change the warnings to build errors so the asm code can stay clean.

The patches are based on linux-next.  Patch 1 adds the stackvalidate host tool.
Patch 2 adds some helper macros for asm functions so that they can comply with
stackvalidate.

[1] http://lkml.kernel.org/r/cover.1423499826.git.jpoim...@redhat.com

v2:
- Fixed memory leaks reported by Petr Mladek

Josh Poimboeuf (2):
  x86, stackvalidate: Compile-time stack frame pointer validation
  x86, stackvalidate: Add asm frame pointer setup macros

 MAINTAINERS   |   6 +
 arch/Kconfig  |   4 +
 arch/x86/Kconfig  |   1 +
 arch/x86/Makefile |   6 +-
 arch/x86/include/asm/func.h   |  82 
 lib/Kconfig.debug |  11 ++
 scripts/Makefile  |   1 +
 scripts/Makefile.build|  22 ++-
 scripts/stackvalidate/Makefile|  17 ++
 scripts/stackvalidate/arch-x86.c  | 134 +
 scripts/stackvalidate/arch.h  |  10 +
 scripts/stackvalidate/elf.c   | 352 ++
 scripts/stackvalidate/elf.h   |  56 ++
 scripts/stackvalidate/list.h  | 217 +
 scripts/stackvalidate/stackvalidate.c | 226 ++
 15 files changed, 1142 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/include/asm/func.h
 create mode 100644 scripts/stackvalidate/Makefile
 create mode 100644 scripts/stackvalidate/arch-x86.c
 create mode 100644 scripts/stackvalidate/arch.h
 create mode 100644 scripts/stackvalidate/elf.c
 create mode 100644 scripts/stackvalidate/elf.h
 create mode 100644 scripts/stackvalidate/list.h
 create mode 100644 scripts/stackvalidate/stackvalidate.c

-- 
2.1.0


[PATCH v2 2/2] x86, stackvalidate: Add asm frame pointer setup macros

2015-05-04 Thread Josh Poimboeuf

Add some helper macros for asm functions so that they can comply with
stackvalidate.

The FUNC_ENTER and FUNC_RETURN macros help asm functions save, set up,
and restore frame pointers.

The RET_NOVALIDATE and FILE_NOVALIDATE macros can be used to whitelist
the few locations which need a return instruction outside of a callable
function.

Signed-off-by: Josh Poimboeuf 
---
 arch/x86/include/asm/func.h | 82 +
 1 file changed, 82 insertions(+)
 create mode 100644 arch/x86/include/asm/func.h

diff --git a/arch/x86/include/asm/func.h b/arch/x86/include/asm/func.h
new file mode 100644
index 0000000..ae84196
--- /dev/null
+++ b/arch/x86/include/asm/func.h
@@ -0,0 +1,82 @@
+#ifndef _ASM_X86_FUNC_H
+#define _ASM_X86_FUNC_H
+
+#include 
+#include 
+#include 
+
+.macro FUNC_ENTER_NO_FP name
+   ENTRY(\name)
+   CFI_STARTPROC
+   CFI_DEF_CFA _ASM_SP, __ASM_SEL(4, 8)
+.endm
+
+.macro FUNC_RETURN_NO_FP name
+   CFI_DEF_CFA _ASM_SP, __ASM_SEL(4, 8)
+   __ASM_SIZE(ret)
+   CFI_ENDPROC
+   ENDPROC(\name)
+.endm
+
+#ifdef CONFIG_FRAME_POINTER
+
+.macro FUNC_ENTER_FP name
+   FUNC_ENTER_NO_FP \name
+   __ASM_SIZE(push, _cfi) %_ASM_BP
+   CFI_REL_OFFSET _ASM_BP, 0
+   _ASM_MOV %_ASM_SP, %_ASM_BP
+   CFI_DEF_CFA_REGISTER _ASM_BP
+.endm
+
+.macro FUNC_RETURN_FP name
+   __ASM_SIZE(pop, _cfi) %_ASM_BP
+   CFI_RESTORE _ASM_BP
+   FUNC_RETURN_NO_FP \name
+.endm
+
+/*
+ * Every callable asm function should be bookended with FUNC_ENTER and
+ * FUNC_RETURN.  They do proper frame pointer and DWARF CFI setups in order to
+ * achieve more reliable stack traces.
+ *
+ * For the sake of simplicity and correct DWARF annotations, use of the macros
+ * requires that the return instruction comes at the end of the function.
+ */
+#define FUNC_ENTER(name) FUNC_ENTER_FP name
+#define FUNC_RETURN(name) FUNC_RETURN_FP name
+
+/*
+ * RET_NOVALIDATE tells the stack validation script to whitelist the return
+ * instruction immediately after the macro.  Only use it if you're completely
+ * sure you need a return instruction outside of a callable function.
+ * Otherwise, if the code can be called and you haven't annotated it with
+ * FUNC_ENTER/FUNC_RETURN, it will break stack trace reliability.
+ */
+.macro RET_NOVALIDATE
+   163:
+   .pushsection __stackvalidate_whitelist_ret, "ae"
+   _ASM_ALIGN
+   .long 163b - .
+   .popsection
+.endm
+
+/*
+ * FILE_NOVALIDATE is like RET_NOVALIDATE except it whitelists the entire file.
+ * Use with extreme caution or you will silently break stack traces.
+ */
+.macro FILE_NOVALIDATE
+   .pushsection __stackvalidate_whitelist_file, "ae"
+   .long 0
+   .popsection
+.endm
+
+#else /* !FRAME_POINTER */
+
+#define FUNC_ENTER(name) FUNC_ENTER_NO_FP name
+#define FUNC_RETURN(name) FUNC_RETURN_NO_FP name
+#define RET_NOVALIDATE
+#define FILE_NOVALIDATE
+
+#endif /* FRAME_POINTER */
+
+#endif /* _ASM_X86_FUNC_H */
-- 
2.1.0


[PATCH v2 1/2] x86, stackvalidate: Compile-time stack frame pointer validation

2015-05-04 Thread Josh Poimboeuf

Frame pointer based stack traces aren't always reliable.  One big reason
is that most asm functions don't set up the frame pointer.

Fix that by enforcing that all asm functions honor CONFIG_FRAME_POINTER.
This is done with a new stackvalidate host tool which is automatically
run for every compiled .S file and which validates that every asm
function does the proper frame pointer setup.

Also, to make sure somebody didn't forget to annotate their callable asm code
as a function, flag an error for any return instructions which are hiding
outside of a function.  In almost all cases, return instructions are part of
callable functions and should be annotated as such so that we can validate
their frame pointer usage.  A whitelist mechanism exists for those few return
instructions which are not actually in callable code.

It currently only supports x86_64.  It *almost* supports x86_32, but the
stackvalidate code doesn't yet know how to deal with 32-bit REL
relocations for the return whitelists.  I tried to make the code generic
so that support for other architectures can be plugged in pretty easily.

As a first step, all reported non-compliances result in warnings.  Right
now I'm seeing 200+ warnings.  Once we get them all cleaned up, we can
change the warnings to build errors so the asm code can stay clean.

Signed-off-by: Josh Poimboeuf 
---
 MAINTAINERS   |   6 +
 arch/Kconfig  |   4 +
 arch/x86/Kconfig  |   1 +
 arch/x86/Makefile |   6 +-
 lib/Kconfig.debug |  11 ++
 scripts/Makefile  |   1 +
 scripts/Makefile.build|  22 ++-
 scripts/stackvalidate/Makefile|  17 ++
 scripts/stackvalidate/arch-x86.c  | 134 +
 scripts/stackvalidate/arch.h  |  10 +
 scripts/stackvalidate/elf.c   | 352 ++
 scripts/stackvalidate/elf.h   |  56 ++
 scripts/stackvalidate/list.h  | 217 +
 scripts/stackvalidate/stackvalidate.c | 226 ++
 14 files changed, 1060 insertions(+), 3 deletions(-)
 create mode 100644 scripts/stackvalidate/Makefile
 create mode 100644 scripts/stackvalidate/arch-x86.c
 create mode 100644 scripts/stackvalidate/arch.h
 create mode 100644 scripts/stackvalidate/elf.c
 create mode 100644 scripts/stackvalidate/elf.h
 create mode 100644 scripts/stackvalidate/list.h
 create mode 100644 scripts/stackvalidate/stackvalidate.c

diff --git a/MAINTAINERS b/MAINTAINERS
index b99e5b8..86258e6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9365,6 +9365,12 @@ L:   sta...@vger.kernel.org
 S: Supported
 F: Documentation/stable_kernel_rules.txt
 
+STACK VALIDATION
+M: Josh Poimboeuf 
+S: Supported
+F: scripts/stackvalidate/
+F: arch/x86/include/asm/func.h
+
 STAGING SUBSYSTEM
 M: Greg Kroah-Hartman 
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
diff --git a/arch/Kconfig b/arch/Kconfig
index a65eafb..9e7e388 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -499,6 +499,10 @@ config ARCH_HAS_ELF_RANDOMIZE
  - arch_mmap_rnd()
  - arch_randomize_brk()
 
+config HAVE_STACK_VALIDATION
+   bool
+
+
 #
 # ABI hall of shame
 #
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9cee995..ed41a76 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -144,6 +144,7 @@ config X86
select ACPI_LEGACY_TABLES_LOOKUP if ACPI
select X86_FEATURE_NAMES if PROC_FS
select SRCU
+   select HAVE_STACK_VALIDATION if FRAME_POINTER && X86_64
 
 config INSTRUCTION_DECODER
def_bool y
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 2fda005..72f2f04 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -171,9 +171,13 @@ KBUILD_CFLAGS += $(call cc-option,-mno-avx,)
 KBUILD_CFLAGS += $(mflags-y)
 KBUILD_AFLAGS += $(mflags-y)
 
-archscripts: scripts_basic
+archscripts: scripts_basic $(objtree)/arch/x86/lib/inat-tables.c
$(Q)$(MAKE) $(build)=arch/x86/tools relocs
 
+# this file is needed early by scripts/stackvalidate
+$(objtree)/arch/x86/lib/inat-tables.c:
+   $(Q)$(MAKE) $(build)=arch/x86/lib $@
+
 ###
 # Syscall table generation
 
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 8eaed3dd..b8a6884 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -332,6 +332,17 @@ config FRAME_POINTER
  larger and slower, but it gives very useful debugging information
  in case of kernel bugs. (precise oopses/stacktraces/warnings)
 
+
+config STACK_VALIDATION
+   bool "Enable kernel stack validation"
+   depends on HAVE_STACK_VALIDATION
+   default y
+   help
+ Add compile-time validations which help make kernel stack traces more
+ reliable.  This includes checks to ensure that assembly functions
+ save, update and restore the frame pointer or the back chain pointer.
+
+
 config DEBUG_FORCE_WEAK_PER_C

Re: [PATCH] ARM: bcm2835: Use 0x4 prefix for DMA bus addresses to SDRAM.

2015-05-04 Thread Noralf Trønnes



On 04.05.2015 21:33, Eric Anholt wrote:

There exists a tiny MMU, configurable only by the VC (running the
closed firmware), which maps from the ARM's physical addresses to bus
addresses.  These bus addresses determine the caching behavior in the
VC's L1/L2 (note: separate from the ARM's L1/L2) according to the top
2 bits.  The bits in the bus address mean:

 From the VideoCore processor:
0x0... L1 and L2 cache allocating and coherent
0x4... L1 non-allocating, but coherent. L2 allocating and coherent
0x8... L1 non-allocating, but coherent. L2 non-allocating, but coherent
0xc... SDRAM alias. Cache is bypassed. Not L1 or L2 allocating or coherent

 From the GPU peripherals (note: all peripherals bypass the L1
cache. The ARM will see this view once through the VC MMU):
0x0... Do not use
0x4... L1 non-allocating, and incoherent. L2 allocating and coherent.
0x8... L1 non-allocating, and incoherent. L2 non-allocating, but coherent
0xc... SDRAM alias. Cache is bypassed. Not L1 or L2 allocating or coherent

The 2835 firmware always configures the MMU to turn ARM physical
addresses with 0x0 top bits to 0x4, meaning present in L2 but
incoherent with L1.  However, any bus addresses we were generating in
the kernel to be passed to a device had 0x0 bits.  That would be a
reserved (possibly totally incoherent) value if sent to a GPU
peripheral like USB, or L1 allocating if sent to the VC (like a
firmware property request).  By setting dma-ranges, all of the devices
below it get a dev->dma_pfn_offset, so that dma_alloc_coherent() and
friends return addresses with 0x4 bits and avoid cache incoherency.

This matches the behavior in the downstream 2708 kernel (see
BUS_OFFSET in arch/arm/mach-bcm2708/include/mach/memory.h).

Signed-off-by: Eric Anholt 
Cc: popcorn...@gmail.com
---
  arch/arm/boot/dts/bcm2835.dtsi | 1 +
  1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/bcm2835.dtsi b/arch/arm/boot/dts/bcm2835.dtsi
index 5734650..2df1b5c 100644
--- a/arch/arm/boot/dts/bcm2835.dtsi
+++ b/arch/arm/boot/dts/bcm2835.dtsi
@@ -15,6 +15,7 @@
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x7e000000 0x20000000 0x02000000>;
+   dma-ranges = <0x40000000 0x00000000 0x1f000000>;
  
  		timer@7e003000 {

compatible = "brcm,bcm2835-system-timer";


This was quite a coincidence. I discovered the need for 'dma-ranges'
yesterday while trying to get the downstream bcm2708_fb driver to
work with ARCH_BCM2835. The driver is using the mailbox to get info
about the framebuffer from the firmware. When it failed I discovered
that the bus address was wrong.
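
For reference, the address arithmetic involved can be sketched in a few
lines of C. This is only an illustration of the scheme described in the
quoted commit message, assuming 32-bit addresses and the "0x4" alias in
the top two bits; the toy_* names and macros are hypothetical and this is
not how the kernel applies dma_pfn_offset internally.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BUS_ALIAS_MASK	0xc0000000u	/* top two bits select the alias */
#define TOY_BUS_ALIAS_0x4	0x40000000u	/* the "0x4..." alias from the table */

/* ARM physical SDRAM address -> bus address in the 0x4 alias. */
static uint32_t toy_phys_to_bus(uint32_t phys)
{
	assert((phys & TOY_BUS_ALIAS_MASK) == 0);
	return phys | TOY_BUS_ALIAS_0x4;
}

/* Bus address in any alias -> ARM physical SDRAM address. */
static uint32_t toy_bus_to_phys(uint32_t bus)
{
	return bus & ~TOY_BUS_ALIAS_MASK;
}

/* Alias index 0..3, corresponding to 0x0, 0x4, 0x8 and 0xc. */
static unsigned int toy_bus_alias(uint32_t bus)
{
	return bus >> 30;
}
```

For example, physical address 0x01000000 becomes bus address 0x41000000,
which is alias 1 (the 0x4 row of the table).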

What I don't understand is why mmc and spi work fine with a "wrong"
bus address. Only the framebuffer driver and the vchiq driver fail,
and only when using the mailbox.

Tested-by: Noralf Trønnes 


Regards,
Noralf Trønnes


Re: [RFC] Design for flag bit outputs from asms

2015-05-04 Thread H. Peter Anvin

On 05/04/2015 01:14 PM, H. Peter Anvin wrote:
>>
>> Therefore:
>>
>> (1) Each target defines a set of constraint strings,
>>
>>E.g. for x86, wherein we're almost out of constraint letters,
>>
>>  ja   aux carry flag
>>  jc   carry flag
>>  jo   overflow flag
>>  jp   parity flag
>>  js   sign flag
>>  jz   zero flag
>>
> 
> I would argue that for x86 what you actually want is to model the
> *conditions* that are available on the flags, not the flags themselves.
>  There are 16 such conditions, 8 if we discard the inversions.
> 
> It is notable that the auxiliary carry flag has no Jcc/SETcc/CMOVcc
> instructions; it is only ever consumed by the DAA/DAS instructions which
> makes it pointless to try to model it in a compiler any more than, say, IF.
> 

OK, let me qualify that.  This is only necessary if it is impractical
for gcc to optimize boolean combinations of flags.  If such
optimizations are available then it doesn't matter and is probably
needlessly complex.  For example:

char foo(void)
{
bool zf, sf, of;

asm("xyzzy" : "=jz" (zf), "=js" (sf), "=jo" (of));

return zf || (sf != of);
}

... should compile to ...

xyzzy
setng %al
ret
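
As a sanity check on that equivalence, the following user-space sketch
computes ZF, SF and OF for every 8-bit subtraction and confirms that
zf || (sf != of) is exactly the signed "less than or equal" condition
that setng/setle tests. The toy_* names are hypothetical, and the flag
formulas are the usual textbook ones for a - b, not anything generated
by gcc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_flags {
	bool zf, sf, of;
};

/* Flags after an 8-bit compare that computes a - b. */
static struct toy_flags toy_cmp8(uint8_t a, uint8_t b)
{
	uint8_t r = (uint8_t)(a - b);
	struct toy_flags f;

	f.zf = (r == 0);
	f.sf = (r & 0x80) != 0;
	/* Signed overflow: operands differ in sign and the result's sign differs from a. */
	f.of = (((a ^ b) & (a ^ r)) & 0x80) != 0;
	return f;
}

/* Interpret an 8-bit pattern as a two's complement value. */
static int toy_signed8(unsigned int v)
{
	return v < 128 ? (int)v : (int)v - 256;
}

/* Count operand pairs where zf || (sf != of) disagrees with signed a <= b. */
static unsigned int toy_count_mismatches(void)
{
	unsigned int a, b, bad = 0;

	for (a = 0; a < 256; a++) {
		for (b = 0; b < 256; b++) {
			struct toy_flags f = toy_cmp8((uint8_t)a, (uint8_t)b);
			bool le = f.zf || (f.sf != f.of);

			if (le != (toy_signed8(a) <= toy_signed8(b)))
				bad++;
		}
	}
	return bad;
}

/* Self-check: the combination must match for all 65536 pairs. */
static void toy_self_check(void)
{
	assert(toy_count_mismatches() == 0);
}
```

So a compiler that can recognize this combination of three flag outputs
could, in principle, fold it into a single condition code.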

-hpa



Re: [PATCH] clk: tegra: Fix comments for structure definitions

2015-05-04 Thread Benson Leung

On Mon, Apr 13, 2015 at 9:38 AM, Rhyland Klein  wrote:
> Some fields moved from the tegra_clk_pll struct to
> the tegra_pll_params struct. Update the struct comments
> to reflect where the fields really are.
>
> Signed-off-by: Rhyland Klein 
> ---
>  drivers/clk/tegra/clk.h |   74 
> +++
>  1 file changed, 37 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/clk/tegra/clk.h b/drivers/clk/tegra/clk.h
> index 751a97966354..4eae99a4f32e 100644
> --- a/drivers/clk/tegra/clk.h
> +++ b/drivers/clk/tegra/clk.h
> @@ -171,6 +171,30 @@ struct div_nmp {
>   * @lock_bit_idx:  Bit index for PLL lock status


By the way,

It looks like the kernel doc for this structure hasn't been updated
for some of the fields added or modified since. lock_bit_idx is
actually no longer here (now lock_mask), and the bunch added for 114
(including ext_misc_reg) aren't documented.

-- 
Benson Leung
Software Engineer, Chrom* OS
ble...@chromium.org

Re: [PATCH] rcu: declare rcu_data variables in the section they are defined in

2015-05-04 Thread Paul E. McKenney

On Sun, May 03, 2015 at 12:27:02PM -0700, Josh Triplett wrote:
> On Sun, May 03, 2015 at 05:57:53PM +0800, Nicolas Iooss wrote:
> > Commit 11bbb235c26f ("rcu: Use DEFINE_PER_CPU_SHARED_ALIGNED for
> > rcu_data") replaced DEFINE_PER_CPU by DEFINE_PER_CPU_SHARED_ALIGNED in
> > the definition of rcu_sched and rcu_bh without updating
> > kernel/rcu/tree.h.
> > 
> > This makes clang report a section mismatch (-Wsection warning) when
> > building LLVMLinux because the variables are declared in .data..percpu
> > but defined in .data..percpu..shared_aligned.
> > 
> > Signed-off-by: Nicolas Iooss 
> 
> Good catch.
> Reviewed-by: Josh Triplett 

Agreed, good catch!  But don't we also need to worry about
rcu_preempt_data?  Also, given that tree_trace.c now uses iterators
rather than direct access via the per-CPU variables, wouldn't the
following be more appropriate?  (-Very- lightly tested.)

Thanx, Paul



diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index da9f6adb5ff9..ee86870b1825 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -537,17 +537,6 @@ extern struct list_head rcu_struct_flavors;
 /*
  * RCU implementation internal declarations:
  */
-extern struct rcu_state rcu_sched_state;
-DECLARE_PER_CPU(struct rcu_data, rcu_sched_data);
-
-extern struct rcu_state rcu_bh_state;
-DECLARE_PER_CPU(struct rcu_data, rcu_bh_data);
-
-#ifdef CONFIG_PREEMPT_RCU
-extern struct rcu_state rcu_preempt_state;
-DECLARE_PER_CPU(struct rcu_data, rcu_preempt_data);
-#endif /* #ifdef CONFIG_PREEMPT_RCU */
-
 #ifdef CONFIG_RCU_BOOST
 DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
 DECLARE_PER_CPU(int, rcu_cpu_kthread_cpu);


Re: [RFC] Design for flag bit outputs from asms

2015-05-04 Thread Richard Henderson

On 05/04/2015 01:14 PM, H. Peter Anvin wrote:
> On 05/04/2015 12:33 PM, Richard Henderson wrote:
>>
>> (0) The C level output variable should be an integral type, from bool on up.
>>
>> The flags are a scarce resource, easily clobbered.  We cannot allow user code
>> to keep data in the flags.  While x86 does have lahf/sahf, they don't exactly
>> perform well.  And other targets like arm don't even have that bad option.
>>
>> Therefore, the language level semantics are that the output is a boolean store
>> into the variable with a condition specified by a magic constraint.
>>
>> That said, just like the compiler should be able to optimize
>>
>> void bar(int y)
>> {
>>   int x = (y <= 0);
>>   if (x) foo();
>> }
>>
>> such that we only use a single compare against y, the expectation is that
>> within a similarly constrained context the compiler will not require two tests
>> for these boolean outputs.
>>
>> Therefore:
>>
>> (1) Each target defines a set of constraint strings,
>>
>>E.g. for x86, wherein we're almost out of constraint letters,
>>
>>  ja   aux carry flag
>>  jc   carry flag
>>  jo   overflow flag
>>  jp   parity flag
>>  js   sign flag
>>  jz   zero flag
>>
> 
> I would argue that for x86 what you actually want is to model the
> *conditions* that are available on the flags, not the flags themselves.
>  There are 16 such conditions, 8 if we discard the inversions.

A fair point.  Though honestly, I was hoping that this feature would mostly be
used for conditions that are "weird" -- that is, not normally describable by
arithmetic at all.  Otherwise, why are you using inline asm for it?

> It is notable that the auxiliary carry flag has no Jcc/SETcc/CMOVcc
> instructions; it is only ever consumed by the DAA/DAS instructions which
> makes it pointless to try to model it in a compiler any more than, say, IF.

Oh yeah.  Consider that dropped.


r~

Re: [PATCH 4/5] blk-mq: do limited block plug for multiple queue case

2015-05-04 Thread Shaohua Li

On Mon, May 04, 2015 at 01:46:49PM -0600, Jens Axboe wrote:
> On 05/04/2015 01:40 PM, Shaohua Li wrote:
> >On Fri, May 01, 2015 at 04:16:04PM -0400, Jeff Moyer wrote:
> >>Shaohua Li  writes:
> >>
> >>>plug is still helpful for workload with IO merge, but it can be harmful
> >>>otherwise especially with multiple hardware queues, as there is
> >>>(supposed) no lock contention in this case and plug can introduce
> >>>latency. For multiple queues, we do limited plug, eg plug only if there
> >>>is request merge. If a request doesn't have merge with following
> >>>request, the request will be dispatched immediately.
> >>>
> >>>This also fixes a bug. If we directly issue a request and it fails, we
> >>>use blk_mq_merge_queue_io(). But we already assigned bio to a request in
> >>>blk_mq_bio_to_request. blk_mq_merge_queue_io shouldn't run
> >>>blk_mq_bio_to_request again.
> >>
> >>Good catch.  Might've been better to split that out first for easy
> >>backport to stable kernels, but I won't hold you to that.
> >
> >It's not a severe bug, but I don't mind. Jens, please let me know if I
> >should split the patch into 2 patches.
> 
> I don't care that much for this particular case. But since one/more
> of the others need respin anyway, might be prudent to split it up in
> any case.

ok, done. I'll repost the patch 4/5. Please let me know if I should
repost 1-3.


From 5f8117b38ba423a4c746b05d40d3647c73b89a32 Mon Sep 17 00:00:00 2001
Message-Id: <5f8117b38ba423a4c746b05d40d3647c73b89a32.1430771485.git.s...@fb.com>
From: Shaohua Li 
Date: Mon, 4 May 2015 13:26:55 -0700
Subject: [PATCH] blk-mq: avoid re-initialize request which is failed in direct
 dispatch

If we directly issue a request and it fails, we use
blk_mq_merge_queue_io(). But we already assigned bio to a request in
blk_mq_bio_to_request. blk_mq_merge_queue_io shouldn't run
blk_mq_bio_to_request again.

Signed-off-by: Shaohua Li 
---
 block/blk-mq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e1a5b9e..2a411c9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1294,6 +1294,8 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
blk_mq_end_request(rq, rq->errors);
goto done;
}
+   blk_mq_insert_request(rq, false, true, true);
+   return;
}
}
 
-- 
1.8.1


Re: [PATCH v2 2/2] x86, stackvalidate: Add asm frame pointer setup macros

2015-05-04 Thread H. Peter Anvin

On 05/04/2015 01:23 PM, Josh Poimboeuf wrote:
> + __ASM_SIZE(push, _cfi) %_ASM_BP
> + __ASM_SIZE(pop, _cfi) %_ASM_BP

This seems ridiculous.  push/pop only come in one size per
architecture(*).  Can we make it so that just push_cfi and pop_cfi do
the right things?

-hpa

(*) Intentionally ignoring 16 bit here...



Re: [PATCH] blk-mq: don't lose requests if a stopped queue restarts

2015-05-04 Thread Jens Axboe


On 05/04/2015 02:20 PM, Shaohua Li wrote:
> On Mon, May 04, 2015 at 01:56:42PM -0600, Jens Axboe wrote:
>> On 05/04/2015 01:51 PM, Shaohua Li wrote:
>>> On Mon, May 04, 2015 at 01:17:19PM -0600, Jens Axboe wrote:
>>>> On 05/02/2015 06:31 PM, Shaohua Li wrote:
>>>>> Normally if driver is busy to dispatch a request the logic is like below:
>>>>> block layer:                    driver:
>>>>>     __blk_mq_run_hw_queue
>>>>> a.                              blk_mq_stop_hw_queue
>>>>> b.  rq add to ctx->dispatch
>>>>>
>>>>> later:
>>>>> 1.                              blk_mq_start_hw_queue
>>>>> 2.  __blk_mq_run_hw_queue
>>>>>
>>>>> But it's possible step 1-2 runs between a and b. And since rq isn't in
>>>>> ctx->dispatch yet, step 2 will not run rq. The rq might get lost if
>>>>> there are no subsequent requests kick in.
>>>>
>>>> Good catch! But the patch introduces a potentially never ending loop
>>>> in __blk_mq_run_hw_queue(). Not sure how we can fully close it, but
>>>> it might be better to punt the re-run after adding the requests back
>>>> to the worker. That would turn a potential busy loop (until requests
>>>> complete) into something with nicer behavior, at least. Ala
>>>>
>>>> if (!test_bit(BLK_MQ_S_STOPPED, &hctx->state))
>>>>   kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
>>>>                                    &hctx->run_work, 0);
>>>
>>> My first version of the patch is like this, but I changed my mind later.
>>> The assumption is driver will stop queue if it's busy to dispatch
>>> request.  If the driver is buggy, we will have the endless loop here.
>>> Should we assume drivers will not do the right thing?
>>
>> There's really no contract that says the driver MUST stop the queue
>> for busy. It could, legitimately, decide to just always run the
>> queue when requests complete.
>>
>> It might be better to simply force this behavior. If we get a BUSY,
>> stop the queue from __blk_mq_run_hw_queue(). And if the bit isn't
>> still set on re-add, then we know we need to re-run it. I think that
>> would be a cleaner API, less fragile, and harder to get wrong. The
>> down side is that now this stop happens implicitly by the core, and
>> the driver must now have an asymmetric queue start when it frees the
>> limited resource that caused the BUSY return. Either that, or we
>> define a 2nd set of start/stop bits, one used exclusively by the
>> driver and one used exclusively by blk-mq. Then blk-mq could restart
>> the queue on completion of a request, since it would then know that
>> blk-mq was the one that stopped it.
>
> Agree. I'll make the rerun async for now and leave above as a future
> improvement.


Agree, I will apply this one. Thanks!
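
The re-check discussed in this thread can be modeled outside the kernel. What follows is only a userspace sketch under stated assumptions: struct toy_hctx, toy_requeue_after_busy() and the rerun_scheduled counter are hypothetical stand-ins, not blk-mq API, and the counter merely takes the place of the kblockd_schedule_delayed_work_on() call suggested in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-hardware-queue state. */
struct toy_hctx {
	bool stopped;         /* models the BLK_MQ_S_STOPPED bit */
	int  dispatch_len;    /* requests parked on the dispatch list */
	int  rerun_scheduled; /* models punting a re-run to the worker */
};

/*
 * Called after the driver reported BUSY: put the requests back, then
 * re-check the stopped bit. If the driver already restarted the queue
 * before the requests were re-added, nothing else will run them, so an
 * asynchronous re-run is scheduled instead of looping in place.
 */
static void toy_requeue_after_busy(struct toy_hctx *hctx, int nr_requests)
{
	assert(nr_requests > 0);
	hctx->dispatch_len += nr_requests;
	if (!hctx->stopped)
		hctx->rerun_scheduled++;
}
```

In this model a queue that is still stopped schedules nothing, since the later start will pick the requests up, while a queue that was restarted in the window gets exactly one deferred re-run.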

--
Jens Axboe


Re: earlycon: no match?

2015-05-04 Thread Peter Hurley

On 05/04/2015 04:22 PM, Sascha Hauer wrote:
> On Mon, May 04, 2015 at 10:01:37AM -0400, Peter Hurley wrote:
>> Hi Robert,
>>
>> On 05/03/2015 05:10 PM, Robert Schwebel wrote:
>>> Hi Peter,
>>>
>>> with 4.1-rc1, my boxes with early console enabled show something like
>>> this (the example is vexpress, but it for example also happens on an
>>> AM335x board):
>>>
>>>   earlycon: no match for ttyAMA0,38400n8
>>
>> This shouldn't impact any previous earlycon setup. Are you saying
>> you're seeing a regression?
>>
>> How do you have early console enabled, via the command line or via DT?
> 
> What happens here is that Robert has console=ttyAMA0,38400n8 in his
> command line.

Yeah, thanks, that much is now clear from his second email.

> In init/main.c we have:
> 
> static int __init do_early_param(char *param, char *val, const char *unused)
> {
>   const struct obs_kernel_param *p;
> 
>   for (p = __setup_start; p < __setup_end; p++) {
>   if ((p->early && parameq(param, p->str)) ||
>   (strcmp(param, "console") == 0 &&
>strcmp(p->str, "earlycon") == 0)
>   ) {
>   if (p->setup_func(val) != 0)
>   pr_warn("Malformed early option '%s'\n", param);
>   }
>   }
>   /* We accept everything at this stage. */
>   return 0;
> }
> 
> This means that param_setup_earlycon() gets called with the arguments
> passed to the console= parameter which makes no sense in this context
> and leads to the "no match for" message.

Right, except that "the context" has insufficient information to differentiate
an error, such as a misspelled earlycon name, from any other use.
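
The matching rule in the quoted do_early_param() can be isolated in a few lines. This is a userspace sketch only: struct toy_param and toy_early_match() are hypothetical names, and strcmp() stands in for the kernel's parameq(), so dashes and underscores are not treated as equivalent here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct obs_kernel_param. */
struct toy_param {
	const char *str;   /* registered option name, e.g. "earlycon" */
	bool early;        /* registered as an early parameter? */
};

/*
 * Mirrors the condition in the quoted loop: an entry is selected if it
 * is an early parameter with a matching name, or if the command-line
 * parameter is "console" and the entry is "earlycon".
 */
static bool toy_early_match(const struct toy_param *p, const char *param)
{
	assert(p != NULL && param != NULL);
	return (p->early && strcmp(param, p->str) == 0) ||
	       (strcmp(param, "console") == 0 &&
		strcmp(p->str, "earlycon") == 0);
}
```

With this rule, a plain console= argument selects the earlycon entry, which is why the earlycon setup function sees a value such as ttyAMA0,38400n8 that it cannot tell apart from a misspelled earlycon name.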

Regards,
Peter Hurley


Re: [PATCH v4 06/20] clk: tegra: pll-params: change misc_reg count from 3 -> 6

2015-05-04 Thread Benson Leung

On Mon, May 4, 2015 at 9:37 AM, Rhyland Klein  wrote:
> From: Bill Huang 
>
> New SoCs may have more than 3 MISC registers, so bump up the
> array size and use a #define to be more informative about the value.
>
> Signed-off-by: Bill Huang 
> ---
>  drivers/clk/tegra/clk.h |4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/clk/tegra/clk.h b/drivers/clk/tegra/clk.h
> index 5759b8bfb80e..8e7361886cf9 100644
> --- a/drivers/clk/tegra/clk.h
> +++ b/drivers/clk/tegra/clk.h
> @@ -156,6 +156,8 @@ struct div_nmp {
> u8  override_divp_shift;
>  };
>
> +#define MAX_PLL_MISC_REG_COUNT 6
> +
>  /**
>   * struct clk_pll_params - PLL parameters
>   *
> @@ -213,7 +215,7 @@ struct tegra_clk_pll_params {
> u32 iddq_bit_idx;
> u32 aux_reg;
> u32 dyn_ramp_reg;
> -   u32 ext_misc_reg[3];
> +   u32 ext_misc_reg[MAX_PLL_MISC_REG_COUNT];
> u32 pmc_divnm_reg;
> u32 pmc_divp_reg;
> u32 flags;


Missing kernel doc above for ext_misc_reg and some other surrounding members.

Otherwise,
Reviewed-by: Benson Leung 



-- 
Benson Leung
Software Engineer, Chrom* OS
ble...@chromium.org

Re: [RFC] Design for flag bit outputs from asms

2015-05-04 Thread Linus Torvalds

On Mon, May 4, 2015 at 1:14 PM, H. Peter Anvin  wrote:
>
> I would argue that for x86 what you actually want is to model the
> *conditions* that are available on the flags, not the flags themselves.

Yes. Otherwise it would be a nightmare to try to describe simple
conditions like "le", which is a rather complicated combination of three
of the actual flag bits:

((SF ^^ OF) || ZF) = 1

which would just be ridiculously painful for (a) the user to describe
and (b) for the compiler to recognize once described.

Now, I do admit that most of the cases where you'd use inline asm with
condition codes would probably fall into just simple "test ZF or CF".
But I could certainly imagine other cases.
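
To make the flag combination concrete, here is a small portable C model. It is a sketch under assumptions, not anything proposed in the thread: struct toy_flags, toy_cmp32() and the toy_cond_*() helpers are hypothetical names, and the flags are computed in software the way a 32-bit compare would set them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The four arithmetic flags relevant to the conditions discussed. */
struct toy_flags {
	bool cf, zf, sf, of;
};

/* Flags as a 32-bit "cmp a, b" (that is, a - b) would set them. */
static struct toy_flags toy_cmp32(int32_t a, int32_t b)
{
	uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
	uint32_t r = ua - ub;
	struct toy_flags f;

	f.cf = ua < ub;                              /* unsigned borrow */
	f.zf = r == 0;
	f.sf = (r >> 31) != 0;
	f.of = (((ua ^ ub) & (ua ^ r)) >> 31) != 0;  /* signed overflow */
	return f;
}

/* "le": (SF != OF) or ZF, the three-flag combination quoted above. */
static bool toy_cond_le(struct toy_flags f)
{
	return (f.sf != f.of) || f.zf;
}

/* "be": CF or ZF, the unsigned counterpart. */
static bool toy_cond_be(struct toy_flags f)
{
	return f.cf || f.zf;
}

/* Self-check: each condition agrees with the plain C comparison. */
static void toy_flags_selfcheck(void)
{
	static const int32_t v[] = { INT32_MIN, -1, 0, 1, INT32_MAX };

	for (unsigned i = 0; i < 5; i++)
		for (unsigned j = 0; j < 5; j++) {
			struct toy_flags f = toy_cmp32(v[i], v[j]);

			assert(toy_cond_le(f) == (v[i] <= v[j]));
			assert(toy_cond_be(f) ==
			       ((uint32_t)v[i] <= (uint32_t)v[j]));
		}
}
```

Exposing the condition rather than the individual bits means the user names one thing ("le") and the compiler never has to pattern-match the three-flag expression.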

   Linus

Re: [PATCH 4/5] blk-mq: do limited block plug for multiple queue case

2015-05-04 Thread Jens Axboe


On 05/04/2015 02:33 PM, Shaohua Li wrote:
> On Mon, May 04, 2015 at 01:46:49PM -0600, Jens Axboe wrote:
>> On 05/04/2015 01:40 PM, Shaohua Li wrote:
>>> On Fri, May 01, 2015 at 04:16:04PM -0400, Jeff Moyer wrote:
>>>> Shaohua Li  writes:
>>>>
>>>>> plug is still helpful for workload with IO merge, but it can be harmful
>>>>> otherwise especially with multiple hardware queues, as there is
>>>>> (supposed) no lock contention in this case and plug can introduce
>>>>> latency. For multiple queues, we do limited plug, eg plug only if there
>>>>> is request merge. If a request doesn't have merge with following
>>>>> request, the request will be dispatched immediately.
>>>>>
>>>>> This also fixes a bug. If we directly issue a request and it fails, we
>>>>> use blk_mq_merge_queue_io(). But we already assigned bio to a request in
>>>>> blk_mq_bio_to_request. blk_mq_merge_queue_io shouldn't run
>>>>> blk_mq_bio_to_request again.
>>>>
>>>> Good catch.  Might've been better to split that out first for easy
>>>> backport to stable kernels, but I won't hold you to that.
>>>
>>> It's not a severe bug, but I don't mind. Jens, please let me know if I
>>> should split the patch into 2 patches.
>>
>> I don't care that much for this particular case. But since one/more
>> of the others need respin anyway, might be prudent to split it up in
>> any case.
>
> ok, done. I'll repost the patch 4/5. Please let me know if I should
> repost 1-3.


That's fine, I'll grab 1-3 as-is. Thanks!

--
Jens Axboe


Re: question about RCU dynticks_nesting

2015-05-04 Thread Paul E. McKenney

On Mon, May 04, 2015 at 04:13:50PM -0400, Rik van Riel wrote:
> On 05/04/2015 04:02 PM, Paul E. McKenney wrote:
> > On Mon, May 04, 2015 at 03:39:25PM -0400, Rik van Riel wrote:
> >> On 05/04/2015 02:39 PM, Paul E. McKenney wrote:
> >>> On Mon, May 04, 2015 at 11:59:05AM -0400, Rik van Riel wrote:
> >>
>  In fact, would we be able to simply use tsk->rcu_read_lock_nesting
>  as an indicator of whether or not we should bother waiting on that
>  task or CPU when doing synchronize_rcu?
> >>>
> >>> Depends on exactly what you are asking.  If you are asking if I could add
> >>> a few more checks to preemptible RCU and speed up grace-period detection
> >>> in a number of cases, the answer is very likely "yes".  This is on my
> >>> list, but not particularly high priority.  If you are asking whether
> >>> CPU 0 could access ->rcu_read_lock_nesting of some task running on
> >>> some other CPU, in theory, the answer is "yes", but in practice that
> >>> would require putting full memory barriers in both rcu_read_lock()
> >>> and rcu_read_unlock(), so the real answer is "no".
> >>>
> >>> Or am I missing your point?
> >>
> >> The main question is "how can we greatly reduce the overhead
> >> of nohz_full, by simplifying the RCU extended quiescent state
> >> code called in the syscall fast path, and maybe piggyback on
> >> that to do time accounting for remote CPUs?"
> >>
> >> Your memory barrier answer above makes it clear we will still
> >> want to do the RCU stuff at syscall entry & exit time, at least
> >> on x86, where we already have automatic and implicit memory
> >> barriers.
> > 
> > We do need to keep in mind that x86's automatic and implicit memory
> > barriers do not order prior stores against later loads.
> > 
> > Hmmm...  But didn't earlier performance measurements show that the bulk of
> > the overhead was the delta-time computations rather than RCU accounting?
> 
> The bulk of the overhead was disabling and re-enabling
> irqs around the calls to rcu_user_exit and rcu_user_enter :)

Really???  OK...  How about software irq masking?  (I know, that is
probably a bit of a scary change as well.)

> Of the remaining time, about 2/3 seems to be the vtime
> stuff, and the other 1/3 the rcu code.

OK, worth some thought, then.

> I suspect it makes sense to optimize both, though the
> vtime code may be the easiest :)

Making a crude version that does jiffies (or whatever) instead of
fine-grained computations might give good bang for the buck.  ;-)

Thanx, Paul
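
The remark quoted in this message, that x86 does not order prior stores against later loads, is the store-buffering pattern. A minimal C11 sketch follows, assuming hypothetical names (toy_x, toy_y, toy_cpu0(), toy_cpu1()) and portable <stdatomic.h> fences rather than the kernel's own barrier primitives. Each function is meant to run on a different CPU; without the fence, both could return 0 even on x86, because each store may still sit in its CPU's store buffer when the other side loads.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int toy_x, toy_y;   /* both start at 0 */

/* Intended for one CPU: publish x, then sample y. */
static int toy_cpu0(void)
{
	atomic_store_explicit(&toy_x, 1, memory_order_relaxed);
	/* Full fence: orders the store above before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&toy_y, memory_order_relaxed);
}

/* Intended for another CPU: publish y, then sample x. */
static int toy_cpu1(void)
{
	atomic_store_explicit(&toy_y, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&toy_x, memory_order_relaxed);
}

/* The outcome the two fences rule out: both sides reading 0. */
static void toy_check_outcome(int r0, int r1)
{
	assert(!(r0 == 0 && r1 == 0));
}
```

Sampling another CPU's nesting counter without such a fence on the read-side fast path is the case described as "in theory yes, in practice no", since the fence would have to go into both rcu_read_lock() and rcu_read_unlock().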


Re: [GIT PULL] RAS for 4.2

2015-05-04 Thread Rafael J. Wysocki

On Monday, May 04, 2015 04:02:16 PM Rafael J. Wysocki wrote:
> On Monday, May 04, 2015 03:16:09 PM Borislav Petkov wrote:
> > On Mon, May 04, 2015 at 03:36:16PM +0200, Rafael J. Wysocki wrote:
> > > I'd like to pick this one up if that's not a problem.
> > > 
> > > Traditionally, things like this have gone in through Tony's tree, but if
> > > that's not the case any more, I think ACPI is the next best upstream for it.
> > 
> > It is Tony's tree - ras.git, look at the URL. And Tony's tree goes
> > through tip.
> 
> OK, I missed that part.
> 
> > But I don't care which way it goes. If you wanna take it, simply pull
> > the tag.
> 
> OK
> 
> Ingo, any objections?

OK, pulled into linux-pm/linux-next as 4.2 material, thanks!

Rafael


Re: question about RCU dynticks_nesting

2015-05-04 Thread Paul E. McKenney

On Mon, May 04, 2015 at 03:59:02PM -0400, Rik van Riel wrote:
> On 05/04/2015 03:39 PM, Paul E. McKenney wrote:
> > On Mon, May 04, 2015 at 03:00:44PM -0400, Rik van Riel wrote:
> 
> >> In case of the non-preemptible RCU, we could easily also
> >> increase current->rcu_read_lock_nesting at the same time
> >> we increase the preempt counter, and use that as the
> >> indicator to test whether the cpu is in an extended
> >> rcu quiescent state. That way there would be no extra
> >> overhead at syscall entry or exit at all. The trick
> >> would be getting the preempt count and the rcu read
> >> lock nesting count in the same cache line for each task.
> > 
> > But in non-preemptible RCU, we have PREEMPT=n, so there is no preempt
> > counter in production kernels.  Even if there was, we have to sample this
> > on other CPUs, so the overhead of preempt_disable() and preempt_enable()
> > would be where kernel entry/exit is, so I expect that this would be a
> > net loss in overall performance.
> 
> CONFIG_PREEMPT_RCU seems to be independent of CONFIG_PREEMPT.
> Not sure why, but they are :)

Well, they used to be independent.  But the "depends" clauses force
them.  You cannot have TREE_RCU unless !PREEMPT && SMP.

> >> In case of the preemptible RCU scheme, we would have to
> >> examine the per-task state (under the runqueue lock)
> >> to get the current task info of all CPUs, and in
> >> addition wait for the blkd_tasks list to empty out
> >> when doing a synchronize_rcu().
> >>
> >> That does not appear to require special per-cpu
> >> counters; examining the per-cpu rdp and the lists
> >> inside it, with the rnp->lock held if doing any
> >> list manipulation, looks like it would be enough.
> >>
> >> However, the current code is a lot more complicated
> >> than that. Am I overlooking something obvious, Paul?
> >> Maybe something non-obvious? :)
> > 
> > Ummm...  The need to maintain memory ordering when sampling task
> > state from remote CPUs?
> > 
> > Or am I completely confused about what you are suggesting?
> > 
> > That said, are you chasing a real system-visible performance issue
> > that you tracked to RCU's dyntick-idle system?
> 
> The goal is to reduce the syscall overhead of nohz_full.
> 
> Part of the overhead is in the vtime updates, part of it is
> in the way RCU extended quiescent state is tracked.

OK, as long as it is actual measurements rather than guesswork.

Thanx, Paul


Re: [RFC] Design for flag bit outputs from asms

2015-05-04 Thread H. Peter Anvin

On 05/04/2015 01:35 PM, Linus Torvalds wrote:
> On Mon, May 4, 2015 at 1:14 PM, H. Peter Anvin  wrote:
>>
>> I would argue that for x86 what you actually want is to model the
>> *conditions* that are available on the flags, not the flags themselves.
> 
> Yes. Otherwise it would be a nightmare to try to describe simple
> conditions like "le", which is a rather complicated combination of three
> of the actual flag bits:
> 
> ((SF ^^ OF) || ZF) = 1
> 
> which would just be ridiculously painful for (a) the user to describe
> and (b) for the compiler to recognize once described.
> 
> Now, I do admit that most of the cases where you'd use inline asm with
> condition codes would probably fall into just simple "test ZF or CF".
> But I could certainly imagine other cases.
> 

Yes, although once again I'm more than happy to let gcc do the boolean
optimizations if it already has logic to do so (which it might have/want
for its own reasons.)

-hpa



Re: [PATCH 1/4] sched / idle: Move the default idle call code to a separate function

2015-05-04 Thread Rafael J. Wysocki

On Monday, May 04, 2015 04:22:45 PM Peter Zijlstra wrote:
> On Mon, May 04, 2015 at 03:56:24PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> > 
> > Move the code under the "use_default" label in cpuidle_idle_call()
> > into a separate (new) function.
> > 
> > This just allows the subsequent changes to be more straightforward.
> > 
> > Signed-off-by: Rafael J. Wysocki 
> > ---
> >  kernel/sched/idle.c |   42 +++---
> >  1 file changed, 23 insertions(+), 19 deletions(-)
> > 
> > Index: linux-pm/kernel/sched/idle.c
> > ===
> > --- linux-pm.orig/kernel/sched/idle.c
> > +++ linux-pm/kernel/sched/idle.c
> > @@ -67,6 +67,17 @@ void __weak arch_cpu_idle(void)
> > local_irq_enable();
> >  }
> >  
> > +static void default_idle_call(void) {
> 
> Please put opening brace of function on a new line.

Ah that's weird.  I wonder why I haven't noticed this myself.

Rafael


Re: [RFC] Design for flag bit outputs from asms

2015-05-04 Thread Linus Torvalds

On Mon, May 4, 2015 at 1:33 PM, Richard Henderson  wrote:
>
> A fair point.  Though honestly, I was hoping that this feature would mostly be
> used for conditions that are "weird" -- that is, not normally describable by
> arithmetic at all.  Otherwise, why are you using inline asm for it?

I could easily imagine using some of the combinations for atomic operations.

For example, doing a "lock decl", and wanting to see if the result is
negative or zero. Sure, it would be possible to set *two* booleans (ZF
and SF), but there's a conditional for "BE"..
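
The "decrement and see whether the result is negative or zero" case can be written in portable C11 as below. This is a sketch under assumptions: toy_dec_and_test_le() is a hypothetical helper, and it says nothing about which instructions or condition code a compiler would actually pick. The point is only that one combined condition answers the question, where two separate boolean outputs would each need their own test.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Atomically decrement *v and report whether the counter was at or
 * below 1 beforehand, i.e. whether the decremented value is negative
 * or zero (ignoring wraparound at INT_MIN). Comparing the old value
 * avoids computing "old - 1" in signed arithmetic.
 */
static bool toy_dec_and_test_le(atomic_int *v)
{
	assert(v != NULL);
	return atomic_fetch_sub(v, 1) <= 1;
}
```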

Linus

Re: [PATCH RFC 0/2] crypto: Introduce Public Key Encryption API

2015-05-04 Thread Tadeusz Struk

Hi Horia,
On 05/04/2015 06:16 AM, Horia Geantă wrote:
>> 	int (*sign)(struct pke_request *pkereq);
>> 	int (*verify)(struct pke_request *pkereq);
>> 	int (*encrypt)(struct pke_request *pkereq);
>> 	int (*decrypt)(struct pke_request *pkereq);
> Where would be the proper place for keygen operation?

This will need to be extended to support keygen.

> 
> AFAICT algorithms currently map to primitives + encoding methods, which
> is not flexible. For e.g. current RSA implementation hardcodes the
> PKCS1-v1_5 encoding method, making it hard to add OAEP(+) etc.
> 
> One solution would be to map algorithms to primitives only. Encoding
> methods need to be abstracted somehow, maybe using templates to wrap the
> algorithms.

So far there is only one rsa implementation in kernel and it is only used
by module signing code.
Later we can add templates or simply one can register "oaep-rsa" algorithm.


Re: [Linux-nvdimm] [PATCH v2 08/20] libnd, nd_acpi: regions (block-data-window, persistent memory, volatile memory)

2015-05-04 Thread Toshi Kani

On Tue, 2015-04-28 at 14:24 -0400, Dan Williams wrote:
 :
> +
> +static int nd_acpi_register_region(struct acpi_nfit_desc *acpi_desc,
> + struct nfit_spa *nfit_spa)
> +{
> + static struct nd_mapping nd_mappings[ND_MAX_MAPPINGS];
> + struct acpi_nfit_spa *spa = nfit_spa->spa;
> + struct nfit_memdev *nfit_memdev;
> + struct nd_region_desc ndr_desc;
> + int spa_type, count = 0;
> + struct resource res;
> + u16 spa_index;
> +
> + spa_type = nfit_spa_type(spa);
> + spa_index = spa->spa_index;
> + if (spa_index == 0) {
> + dev_dbg(acpi_desc->dev, "%s: detected invalid spa index\n",
> + __func__);
> + return 0;
> + }
> +
> + memset(&res, 0, sizeof(res));
> + memset(&nd_mappings, 0, sizeof(nd_mappings));
> + memset(&ndr_desc, 0, sizeof(ndr_desc));
> + res.start = spa->spa_base;
> + res.end = res.start + spa->spa_length - 1;
> + ndr_desc.res = &res;
> + ndr_desc.provider_data = nfit_spa;
> + ndr_desc.attr_groups = nd_acpi_region_attribute_groups;
> + list_for_each_entry(nfit_memdev, &acpi_desc->memdevs, list) {
> + struct acpi_nfit_memdev *memdev = nfit_memdev->memdev;
> + struct nd_mapping *nd_mapping;
> + struct nd_dimm *nd_dimm;
> +
> + if (memdev->spa_index != spa_index)
> + continue;

The libnd does not support memdev->flags, which contains "Memory Device
State Flags" defined in Table 5-129 of ACPI 6.0.  In case of major
errors, we should only allow a failed NVDIMM be accessed with read-only
for possible data recovery (or not allow any access when the data is
completely lost), and should not let users operate normally over the
corrupted data until the error is dealt properly.

Can you set memdev->flags to nd_region(_desc) so that the pmem driver
can check the status in nd_pmem_probe()?  nd_pmem_probe() can then set
the disk read-only or fail probing, and log errors accordingly.
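
The policy being requested could look roughly like the sketch below. Everything in it is an assumption made for illustration: the TOY_MEMDEV_* bits are hypothetical and do not reproduce the ACPI flag layout, and toy_probe_policy() is not part of libnd or the pmem driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state bits; the real layout is defined by the ACPI table. */
#define TOY_MEMDEV_DATA_SUSPECT  (1u << 0)  /* data may be recoverable */
#define TOY_MEMDEV_DATA_LOST     (1u << 1)  /* data is completely lost */
#define TOY_MEMDEV_KNOWN_FLAGS   (TOY_MEMDEV_DATA_SUSPECT | TOY_MEMDEV_DATA_LOST)

enum toy_probe_action {
	TOY_PROBE_NORMAL,     /* register the disk read-write */
	TOY_PROBE_READ_ONLY,  /* allow reads for data recovery only */
	TOY_PROBE_FAIL,       /* refuse to probe and log the error */
};

/* Map the state flags handed down with the region to a probe decision. */
static enum toy_probe_action toy_probe_policy(uint16_t flags)
{
	/* Only the two hypothetical bits above are modeled. */
	assert((flags & ~TOY_MEMDEV_KNOWN_FLAGS) == 0);

	if (flags & TOY_MEMDEV_DATA_LOST)
		return TOY_PROBE_FAIL;
	if (flags & TOY_MEMDEV_DATA_SUSPECT)
		return TOY_PROBE_READ_ONLY;
	return TOY_PROBE_NORMAL;
}
```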

Thanks,
-Toshi

Re: [PATCH 4/4] sched / idle: Call default_idle_call() from cpuidle_enter_state()

2015-05-04 Thread Rafael J. Wysocki

On Monday, May 04, 2015 05:04:08 PM Daniel Lezcano wrote:
> On 05/04/2015 03:58 PM, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> >
> > The check of the cpuidle_enter() return value against -EBUSY
> > made in call_cpuidle() will not be necessary any more if
> > cpuidle_enter_state() calls default_idle_call() directly when it
> > is about to return -EBUSY, so make that happen and eliminate the
> > check.
> >
> > Signed-off-by: Rafael J. Wysocki 
>  >
> > ---
> >   drivers/cpuidle/cpuidle.c |4 +++-
> >   drivers/cpuidle/cpuidle.h |2 ++
> >   kernel/sched/idle.c   |   14 ++
> >   3 files changed, 11 insertions(+), 9 deletions(-)
> >
> > Index: linux-pm/drivers/cpuidle/cpuidle.c
> > ===
> > --- linux-pm.orig/drivers/cpuidle/cpuidle.c
> > +++ linux-pm/drivers/cpuidle/cpuidle.c
> > @@ -167,8 +167,10 @@ int cpuidle_enter_state(struct cpuidle_d
> >  * local timer will be shut down.  If a local timer is used from another
> >  * CPU as a broadcast timer, this call may fail if it is not available.
> >  */
> > -   if (broadcast && tick_broadcast_enter())
> > +   if (broadcast && tick_broadcast_enter()) {
> > +   default_idle_call();
> > return -EBUSY;
> > +   }
> >
> > trace_cpu_idle_rcuidle(index, dev->cpu);
> > time_start = ktime_get();
> > Index: linux-pm/drivers/cpuidle/cpuidle.h
> > ===
> > --- linux-pm.orig/drivers/cpuidle/cpuidle.h
> > +++ linux-pm/drivers/cpuidle/cpuidle.h
> > @@ -18,6 +18,8 @@ extern int cpuidle_enter_state(struct cp
> >   /* idle loop */
> >   extern void cpuidle_install_idle_handler(void);
> >   extern void cpuidle_uninstall_idle_handler(void);
> > +/* kernel/sched/idle.c */
> > +extern void default_idle_call(void);
> 
> There is a cyclic dependency introduced with this function.
> 
> idle.c <=> cpuidle.c
> 
> Are we sure we want them to be mutually dependent ?

Well, had I not thought so, I wouldn't have posted the patch in the first
place. :-)

Aesthetics is one thing and wasted cycles is another.  A redundant check
in the idle loop means a whole lot of wasted cycles throughout the life time
of a kernel.
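
The control-flow change can be modeled in userspace as follows. This is a toy sketch, not the kernel code: the toy_* functions and the call counter are hypothetical stand-ins for cpuidle_enter_state(), default_idle_call() and call_cpuidle().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static int toy_default_idle_calls;  /* how often the fallback ran */

static void toy_default_idle_call(void)
{
	toy_default_idle_calls++;
}

/* Models the patched path: run the fallback before returning -EBUSY. */
static int toy_enter_state(bool broadcast, bool broadcast_enter_fails)
{
	if (broadcast && broadcast_enter_fails) {
		toy_default_idle_call();
		return -EBUSY;
	}
	return 0;
}

/*
 * The caller no longer needs its own "was it -EBUSY? then use the
 * default idle routine" check on every pass through the idle loop.
 */
static void toy_call_cpuidle(bool broadcast, bool broadcast_enter_fails)
{
	int ret = toy_enter_state(broadcast, broadcast_enter_fails);

	assert(ret == 0 || ret == -EBUSY);
	(void)ret;
}
```

The price is the one Daniel points out: the low-level function now has to know about the fallback, which in the real patch is what creates the idle.c and cpuidle.c cross-dependency.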

Rafael


Re: [PATCH] ftrace: Provide trace clock monotonic raw

2015-05-04 Thread John Stultz

On Mon, May 4, 2015 at 1:05 PM, Drew Richardson  wrote:
> On Mon, May 04, 2015 at 04:10:05PM +0100, Mathieu Desnoyers wrote:
>> - Original Message -
>> > Expose the NMI safe accessor to the monotonic raw clock to the
>> > tracer. The mono clock was added with commit
>> > 1b3e5c0936046e7e023149ddc8946d21c2ea20eb. Although the monotonic raw
>> > clock cannot be used to compare time between different machines, it is
>> > not perterbed by ntp.
>>
>> perterbed -> perturbed
>
> Oops, I'll correct that in the next version.
>
>> >
>> > Signed-off-by: Drew Richardson 
>> >
>>
>> What is the use-case that justify exposing the "raw fast"
>> clock that cannot be handled by the "monotonic fast" clock ?
>>
>> Thanks,
>>
>> Mathieu
>
> I'm collecting and merging data from perf, with Android Atrace data
> (writes to /sys/kernel/debug/tracing/trace_marker) which ends up in
> the ftrace stream and other measurements collected from
> userspace. Currently the only clock readable from userspace, supported
> by perf and by ftrace is CLOCK_MONOTONIC. However this clock is
> affected by the incremental adjustments performed by adjtime(3) and
> NTP. But I'd prefer to use a clock that is advancing at a consistent
> rate, hence CLOCK_MONOTONIC_RAW.

Well, I'd caution against assuming CLOCK_MONOTONIC_RAW is really a
consistent rate, since w/ thermal changes the oscillator likely will
drift around. But especially during early initialization, ntp can
manipulate the CLOCK_MONOTONIC freq more drastically to align time.

Another more concrete benefit is that since CLOCK_MONOTONIC is
frequency adjusted, it's possible for slight inconsistencies to appear
when using the lock-free ktime_get_mono_fast_ns() accessor that perf
uses. With CLOCK_MONOTONIC_RAW, since there are no frequency
adjustments made, inconsistencies shouldn't occur with the lock-free
accessor.
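
From userspace the two clocks can be compared with clock_gettime(). This is a sketch only: toy_clock_ns() and toy_mono_minus_raw_ns() are hypothetical helpers, CLOCK_MONOTONIC_RAW is assumed to be available (it is Linux-specific), and nothing here touches the ftrace trace clock itself.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for CLOCK_MONOTONIC_RAW under strict modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read a clock as nanoseconds; returns -1 if the clock is unavailable. */
static int64_t toy_clock_ns(clockid_t id)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return -1;
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Offset between the frequency-adjusted and the raw monotonic clock.
 * Sampling this periodically shows how far the adjustments applied to
 * CLOCK_MONOTONIC have moved it relative to the raw clock.
 */
static int64_t toy_mono_minus_raw_ns(void)
{
	int64_t mono = toy_clock_ns(CLOCK_MONOTONIC);
	int64_t raw = toy_clock_ns(CLOCK_MONOTONIC_RAW);

	assert(mono >= 0 && raw >= 0);
	return mono - raw;
}
```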

thanks
-john