date:20140304

Re: [tip:irq/core] powerpc: eeh: Fixup the brown paperbag fallout of the "cleanup"

2014-03-04 Thread Benjamin Herrenschmidt

On Tue, 2014-03-04 at 15:15 -0800, tip-bot for Thomas Gleixner wrote:
> Commit-ID:  57310c3c99eb6fab2ecbd63aa3f7c323341ca77e
> Gitweb: http://git.kernel.org/tip/57310c3c99eb6fab2ecbd63aa3f7c323341ca77e
> Author: Thomas Gleixner 
> AuthorDate: Wed, 5 Mar 2014 00:06:11 +0100
> Committer:  Thomas Gleixner 
> CommitDate: Wed, 5 Mar 2014 00:13:33 +0100
> 
> powerpc: eeh: Fixup the brown paperbag fallout of the "cleanup"
> 
> Commit b8a9a11b9 (powerpc: eeh: Kill another abuse of irq_desc) is
> missing some brackets .
> 
> It's not a good idea to write patches in grumpy mode and then forget
> to at least compile test them or rely on the few eyeballs discussing
> that patch to spot it.

Ouch :-)

Next time you have a series like that, if you want I'll throw it at my
build tester.

Cheers,
Ben.

> Reported-by: fengguang...@intel.com
> Signed-off-by: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: Gavin Shan 
> Cc: Benjamin Herrenschmidt 
> Cc: ppc 
> ---
>  arch/powerpc/kernel/eeh_driver.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 3e1d7de..bb61ca5 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -166,8 +166,9 @@ static void eeh_enable_irq(struct pci_dev *dev)
>*
>*  tglx
>*/
> - if (irqd_irq_disabled(irq_get_irq_data(dev->irq))
> + if (irqd_irq_disabled(irq_get_irq_data(dev->irq)))
>   enable_irq(dev->irq);
> + }
>  }
>  
>  /**


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: linux-next: build failure after merge of the wireless-next tree

2014-03-04 Thread Stephen Rothwell

Hi John,

On Wed, 5 Mar 2014 10:16:27 +1100 Stephen Rothwell wrote:
>
> If I revert commit 161d78555435 "Revert "Staging: rtl8812ae: remove
> modules field of rate_control_ops"", it fails differently:
> 
> In file included from drivers/staging/wlan-ng/p80211netdev.c:91:0:
> drivers/staging/wlan-ng/cfg80211.c: In function 'prism2_scan':
> drivers/staging/wlan-ng/cfg80211.c:419:10: error: implicit declaration of function 'ieee80211_dsss_chan_to_freq' [-Werror=implicit-function-declaration]
>   ieee80211_dsss_chan_to_freq(msg2.dschannel.data)),
>   ^
> drivers/staging/rtl8821ae/rtl8821ae/trx.c: In function 'rtl8821ae_rx_query_desc':
> drivers/staging/rtl8821ae/rtl8821ae/trx.c:619:3: warning: passing argument 1 of 'ieee80211_is_robust_mgmt_frame' from incompatible pointer type [enabled by default]
>if ((ieee80211_is_robust_mgmt_frame(hdr)) &&
>^
> In file included from include/net/mac80211.h:20:0,
>  from drivers/staging/rtl8821ae/rtl8821ae/../wifi.h:38,
>  from drivers/staging/rtl8821ae/rtl8821ae/trx.c:30:
> include/linux/ieee80211.h:2286:20: note: expected 'struct sk_buff *' but argument is of type 'struct ieee80211_hdr *'
>  static inline bool ieee80211_is_robust_mgmt_frame(struct sk_buff *skb)
> ^
> drivers/staging/rtl8821ae/rtl8821ae/dm.c: In function 'rtl8821ae_dm_clear_txpower_tracking_state':
> drivers/staging/rtl8821ae/rtl8821ae/dm.c:487:31: warning: iteration 2u invokes undefined behavior [-Waggressive-loop-optimizations]
>rtldm->bb_swing_idx_ofdm[p] = rtldm->default_ofdm_index;
>^
> drivers/staging/rtl8821ae/rtl8821ae/dm.c:485:2: note: containing loop
>   for (p = RF90_PATH_A; p < MAX_RF_PATH; ++p) {
>   ^
> 
> So please revert 161d78555435, and then fix this other error.

That fix could be as simple as just disabling the affected staging tree
driver and letting Greg know.  Or you could actually fix it.
-- 
Cheers,
Stephen Rothwell 



[PATCH 5/7 v2] staging: cxt1e1: fix checkpatch errors with open brace '{'

2014-03-04 Thread Daeseok Youn


clean up checkpatch.pl error in linux.c:
 ERROR: that open brace { should be on the previous line

Signed-off-by: Daeseok Youn 
---
 drivers/staging/cxt1e1/linux.c |   67 ---
 1 files changed, 21 insertions(+), 46 deletions(-)

diff --git a/drivers/staging/cxt1e1/linux.c b/drivers/staging/cxt1e1/linux.c
index e4541af..478516f 100644
--- a/drivers/staging/cxt1e1/linux.c
+++ b/drivers/staging/cxt1e1/linux.c
@@ -242,8 +242,7 @@ c4_wq_port_cleanup(mpi_t *pi)
 * PORT POINT: cannot call this if WQ is statically allocated w/in
 * structure since it calls kfree(wq);
 */
-   if (pi->wq_port)
-   {
+   if (pi->wq_port) {
destroy_workqueue(pi->wq_port);/* this also calls
* flush_workqueue() */
pi->wq_port = NULL;
@@ -433,15 +432,13 @@ create_chan(struct net_device *ndev, ci_t *ci,
 
/* allocate then fill in private data structure */
priv = OS_kmalloc(sizeof(struct c4_priv));
-   if (!priv)
-   {
+   if (!priv) {
pr_warning("%s: no memory for net_device !\n",
   ci->devname);
return NULL;
}
dev = alloc_hdlcdev(priv);
-   if (!dev)
-   {
+   if (!dev) {
pr_warning("%s: no memory for hdlc_device !\n",
   ci->devname);
OS_kfree(priv);
@@ -459,10 +456,8 @@ create_chan(struct net_device *ndev, ci_t *ci,
*dev->name = 0; /* default ifconfig name = "hdlc" */
 
hi = (hdw_info_t *)ci->hdw_info;
-   if (hi->mfg_info_sts == EEPROM_OK)
-   {
-   switch (hi->promfmt)
-   {
+   if (hi->mfg_info_sts == EEPROM_OK) {
+   switch (hi->promfmt) {
case PROM_FORMAT_TYPE1:
memcpy(dev->dev_addr,
   (FLD_TYPE1 *) (hi->mfg_info.pft1.Serial), 6);
@@ -476,9 +471,7 @@ create_chan(struct net_device *ndev, ci_t *ci,
break;
}
} else
-   {
memset(dev->dev_addr, 0, 6);
-   }
 
hdlc->xmit = c4_linux_xmit;
 
@@ -502,8 +495,7 @@ create_chan(struct net_device *ndev, ci_t *ci,
 
/* needed due to Ioctl calling sequence */
rtnl_lock();
-   if (ret)
-   {
+   if (ret) {
if (cxt1e1_log_level >= LOG_WARN)
pr_info("%s: create_chan[%d] registration error = %d.\n",
ci->devname, cp->channum, ret);
@@ -698,8 +690,7 @@ do_create_chan(struct net_device *ndev, void *data)
if (!dev)
return -EBUSY;
ret = mkret(c4_new_chan(ci, cp.port, cp.channum, dev));
-   if (ret)
-   {
+   if (ret) {
/* needed due to Ioctl calling sequence */
rtnl_unlock();
unregister_hdlc_device(dev);
@@ -805,8 +796,7 @@ do_reset(struct net_device *musycc_dev, void *data)
const struct c4_priv *priv;
int i;
 
-   for (i = 0; i < 128; i++)
-   {
+   for (i = 0; i < 128; i++) {
struct net_device *ndev;
charbuf[sizeof(CHANNAME) + 3];
 
@@ -817,8 +807,7 @@ do_reset(struct net_device *musycc_dev, void *data)
priv = dev_to_hdlc(ndev)->priv;
 
if ((unsigned long) (priv->ci) ==
-   (unsigned long) (netdev_priv(musycc_dev)))
-   {
+   (unsigned long) (netdev_priv(musycc_dev))) {
ndev->flags &= ~IFF_UP;
netif_stop_queue(ndev);
do_deluser(ndev, 1);
@@ -845,10 +834,8 @@ c4_ioctl(struct net_device *ndev, struct ifreq *ifr, int cmd)
void   *data;
int iocmd, iolen;
status_tret;
-   static struct data
-   {
-   union
-   {
+   static struct data {
+   union {
u_int8_t c;
u_int32_t i;
struct sbe_brd_info bip;
@@ -891,8 +878,7 @@ c4_ioctl(struct net_device *ndev, struct ifreq *ifr, int cmd)
return -EFAULT;
 
ret = 0;
-   switch (iocmd)
-   {
+   switch (iocmd) {
case SBE_IOC_PORT_GET:
//pr_info(">> SBE_IOC_PORT_GET Ioctl...\n");
ret = do_get_port(ndev, data);
@@ -975,8 +961,7 @@ c4_add_dev(hdw_info_t *hi, int brdno, unsigned long f0, unsigned long f1,
ci_t   *ci;
 
ndev = alloc_netdev(sizeof(ci_t), SBE_IFACETMPL, c4_setup);
-   if (!ndev)
-   {
+   if (!ndev) {
pr_warning("%s: no memory for struct net_device !\n",
   h

Re: [PATCH 05/48] percpu: Add preemption checks to __this_cpu ops

2014-03-04 Thread Steven Rostedt

On Tue, 4 Mar 2014 14:27:27 -0800
Andrew Morton  wrote:

> On Fri, 14 Feb 2014 14:18:46 -0600 Christoph Lameter  wrote:
> 
> > [Patch depends on another patch in this series that introduces raw_cpu_ops]
> > 
> > We define a check function in order to avoid trouble with the
> > include files. Then the higher level __this_cpu macros are
> > modified to invoke the preemption check.
> > 
> > --- linux.orig/lib/smp_processor_id.c   2014-01-30 14:40:50.936519233 -0600
> > +++ linux/lib/smp_processor_id.c        2014-01-30 14:40:50.936519233 -0600
> > @@ -7,7 +7,7 @@
> >  #include 
> >  #include 
> >  
> > -notrace unsigned int debug_smp_processor_id(void)
> > +notrace static unsigned int check_preemption_disabled(char *what)
> >  {
> > int this_cpu = raw_smp_processor_id();
> >  
> > @@ -38,9 +38,9 @@
> > if (!printk_ratelimit())
> > goto out_enable;
> >  
> > -   printk(KERN_ERR "BUG: using smp_processor_id() in preemptible [%08x] "
> > -   "code: %s/%d\n",
> > -   preempt_count() - 1, current->comm, current->pid);
> > +   printk(KERN_ERR "BUG: using %s in preemptible [%08x] code: %s/%d\n",
> > +   what, preempt_count() - 1, current->comm, current->pid);
> > +
> > print_symbol("caller is %s\n", (long)__builtin_return_address(0));
> > dump_stack();
> 
> I wonder if there's any point in printing __builtin_return_address. 
> Doesn't dump_stack() tell us the same thing?

When frame pointers are enabled, sure. But without frame pointers, I'm
not so sure.

-- Steve
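
To illustrate the frame-pointer point: __builtin_return_address(0) is a
GCC builtin that the compiler resolves for the current frame, so level 0
does not depend on walking a chain of saved frame pointers the way a full
stack dump may. The following is only a userspace sketch; report_caller()
and its message format are hypothetical stand-ins, not the kernel's
check_preemption_disabled().

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in: formats which operation was used and
 * the address this function was called from.
 * noinline keeps a real call frame so the return address is meaningful.
 */
__attribute__((noinline))
static void *report_caller(const char *what, char *buf, size_t len)
{
	/* Level 0 is the return address of this very function. */
	void *caller = __builtin_return_address(0);

	assert(what != NULL && buf != NULL);
	snprintf(buf, len, "BUG: using %s, caller is %p", what, caller);
	return caller;
}
```

Calling report_caller("__this_cpu_read()", buf, sizeof(buf)) fills buf
with a one-line report and returns the caller's address, whether or not
the program was built with -fno-omit-frame-pointer.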

Re: [PATCH 1/1] mm: use macros from compiler.h instead of attribute((...))

2014-03-04 Thread Stephen Rothwell

Hi Andrew,

On Tue, 4 Mar 2014 13:26:04 -0800 Andrew Morton wrote:
>
> On Sun,  2 Mar 2014 19:09:58 +0530 Gideon Israel Dsouza wrote:
> 
> > To increase compiler portability there is  which
> > provides convenience macros for various gcc constructs.  Eg: __weak
> > for __attribute__((weak)).  I've replaced all instances of gcc
> > attributes with the right macro in the memory management
> > (/mm) subsystem.
> > 
> > ...
> >
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -13,6 +13,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> 
> It may be overdoing things a bit to explicitly include compiler.h. 
> It's hard to conceive of any .c file which doesn't already include it.

Stick to Rule 1 :-)

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
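
For readers unfamiliar with the convenience macros under discussion, here
is a small userspace sketch of the idea: a short macro name wraps the GCC
attribute so call sites stay compiler-neutral. The names demo_weak,
demo_page_shift() and demo_page_size() are invented for illustration and
are not taken from the patch or from the kernel headers.

```c
#include <assert.h>

/* Hypothetical userspace analogue of the kernel's __weak macro. */
#define demo_weak __attribute__((weak))

/*
 * Default implementation. A non-weak definition of the same symbol in
 * another object file would replace this one at link time.
 */
demo_weak int demo_page_shift(void)
{
	return 12;
}

/* Ordinary caller; it does not care which definition the linker picked. */
static unsigned long demo_page_size(void)
{
	int shift = demo_page_shift();

	assert(shift > 0 && shift < 31);
	return 1UL << shift;
}
```

With only the weak default linked in, demo_page_size() evaluates to 4096.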



Re: [tip:irq/core] powerpc: eeh: Fixup the brown paperbag fallout of the "cleanup"

2014-03-04 Thread Thomas Gleixner

On Wed, 5 Mar 2014, Benjamin Herrenschmidt wrote:

> On Tue, 2014-03-04 at 15:15 -0800, tip-bot for Thomas Gleixner wrote:
> > Commit-ID:  57310c3c99eb6fab2ecbd63aa3f7c323341ca77e
> > Gitweb: http://git.kernel.org/tip/57310c3c99eb6fab2ecbd63aa3f7c323341ca77e
> > Author: Thomas Gleixner 
> > AuthorDate: Wed, 5 Mar 2014 00:06:11 +0100
> > Committer:  Thomas Gleixner 
> > CommitDate: Wed, 5 Mar 2014 00:13:33 +0100
> > 
> > powerpc: eeh: Fixup the brown paperbag fallout of the "cleanup"
> > 
> > Commit b8a9a11b9 (powerpc: eeh: Kill another abuse of irq_desc) is
> > missing some brackets .
> > 
> > It's not a good idea to write patches in grumpy mode and then forget
> > to at least compile test them or rely on the few eyeballs discussing
> > that patch to spot it.
> 
> Ouch :-)
> 
> Next time you have a series like that, if you want I'll throw it at my
> build tester.

You simply could have been less lazy and picked up the whole ppc
related stuff instead of ignoring it

Thanks,

tglx

Re: [PATCH 5/5] arm64: add early_ioremap support

2014-03-04 Thread Rob Herring

On Tue, Mar 4, 2014 at 2:08 PM, Mark Salter  wrote:
> Add support for early IO or memory mappings which are needed
> before the normal ioremap() is usable. This also adds fixmap
> support for permanent fixed mappings such as that used by the
> earlyprintk device register region.

One minor comment:

> +enum fixed_addresses {
> +   FIX_EARLYCON,

Can you align this name with x86 and rename it to FIX_EARLYCON_MEM_BASE?
Doing that will help enable the earlycon driver in 8250_early.c.

Rob

Re: linux-next: build failure after merge of the mfd-lj tree

2014-03-04 Thread Stephen Rothwell

Hi Lee,

On Tue, 4 Mar 2014 16:06:55 +0800 Lee Jones  wrote:
>
> > > After merging the mfd-lj tree, today's linux-next build (x86_64
> > > allmodconfig) failed like this:
> > > 
> > > drivers/mfd/tps65218: struct i2c_device_id is 32 bytes.  The last of 1 is:
> > > 0x74 0x70 0x73 0x36 0x35 0x32 0x31 0x38 0x00 0x00 0x00 0x00 0x00 0x00 
> > > 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xf0 0x00 0x00 0x00 
> > > 0x00 0x00 0x00 0x00 
> > > FATAL: drivers/mfd/tps65218: struct i2c_device_id is not terminated with a NULL entry!
> > > 
> > > Caused by commit cc493e30e3a1 ("mfd: tps65218: Add driver for the
> > > TPS65218 PMIC").
> 
> -rw-r--r-- 1 lee lee   6320 Mar  4 15:20 tps65218.o
> 
> Hmmm... I'm having trouble reproducing the error. My toolchain
> doesn't complain about the lack of a terminating entry. I have fixed
> the problem and re-pushed the branch though. Would you be kind enough
> to re-test it please?

I assume that the error comes from some post processing that happens late
in the allmodconfig build process.

Your current tree builds fine, thanks.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
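
The failure quoted above is about a device ID table that lacks its empty
terminating entry. As a rough userspace sketch of why the terminator
matters: code that consumes such a table typically walks it until it
reaches an all-zero entry, so a missing sentinel means reading past the
end of the array. struct demo_device_id below is a hypothetical mirror
for illustration only, not the kernel's struct i2c_device_id, and the
table values simply echo the byte dump in the report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an ID table entry. */
struct demo_device_id {
	char name[20];
	unsigned long driver_data;
};

static const struct demo_device_id demo_ids[] = {
	{ "tps65218", 0xf0 },
	{ "", 0 }	/* sentinel: the empty entry that ends the table */
};

/* Count entries by walking until the sentinel's empty name is found. */
static size_t demo_count_ids(const struct demo_device_id *ids)
{
	size_t n = 0;

	assert(ids != NULL);
	while (ids[n].name[0] != '\0')
		n++;
	return n;
}
```

Without the sentinel line, the loop in demo_count_ids() would have no
guaranteed stopping point.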



RE: [PATCH v2] usb: gadget: return the right length in ffs_epfile_io()

2014-03-04 Thread Liu, Chuansheng

Hi Balbi,

> -Original Message-
> From: Felipe Balbi [mailto:ba...@ti.com]
> Sent: Wednesday, March 05, 2014 3:56 AM
> To: Michal Nazarewicz
> Cc: Robert Baldyga; Felipe Balbi; Sergei Shtylyov; Liu, Chuansheng;
> gre...@linuxfoundation.org; linux-...@vger.kernel.org;
> linux-kernel@vger.kernel.org; david.a.co...@linux.intel.com
> Subject: Re: [PATCH v2] usb: gadget: return the right length in ffs_epfile_io()
> 
> On Tue, Mar 04, 2014 at 08:53:40PM +0100, Michal Nazarewicz wrote:
> > >> On 03/04/2014 10:34 AM, Chuansheng Liu wrote:
> > >> >@@ -845,12 +845,14 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
> > >> > * we may end up with more data then user space has
> > >> > * space for.
> > >> > */
> > >> >-   ret = ep->status;
> > >> >-   if (io_data->read && ret > 0 &&
> > >> >-   unlikely(copy_to_user(io_data->buf, data,
> > >> >- min_t(size_t, ret,
> > >> >- io_data->len
> > >> >-   ret = -EFAULT;
> > >> >+   ret = ep->status;
> >
> > On Tue, Mar 04 2014, Felipe Balbi wrote:
> > >>Why the indentation jumped suddenly to the right?
> >
> > > On Tue, Mar 04, 2014 at 08:01:15PM +0300, Sergei Shtylyov wrote:
> > > because it was wrong before ;-)
> >
> > Yep.  It looks like Robert's [2e4c7553: add aio support] introduced an
> > if-else-if-else flow but did not indent the code and I didn't catch it
> > when reviewing that patch.
> 
> it's in my testing/next now, I also fixed the comment indentation which
> was also wrong.
Thanks for your help, and for the comment indentation fix as well :)

Best Regards
Chuansheng

Re: Closing on the CR2 leak bug

2014-03-04 Thread Steven Rostedt

On Tue, 04 Mar 2014 14:34:00 -0800
"H. Peter Anvin"  wrote:

> So we need to get something into x86/urgent for the CR2 bug.
> 
> It seems like a no-brainer to do the hoisting patch, for which I prefer
> the version proposed by Jiri Olsa which reads %cr2 and then passes it to
> __do_page_fault() in a GPR:
> 
> http://lkml.kernel.org/r/20140228160526.gd1...@krava.brq.redhat.com
> 
> As far as I can tell this should fix Vince's problem as well.
> 
> Anyone who objects to this?  Otherwise I will put it in tip:x86/urgent
> tomorrow.
> 
> Do we need any more refinements this late in the -rc cycle?
> 

The only other issue is if perf traces from function context and traces
the trace_do_page_fault() call. But other than that, sure, take Jiri's
patch.

Acked-by: Steven Rostedt 

Please also add the reported/tested-by from Vince Weaver and add a
Link: to the other thread, as it got pretty detailed there too.


 -- Steve

[GIT PULL] EFI mixed mode support

2014-03-04 Thread Matt Fleming

Peter, I've got the mixed mode support in two branches.

'mixed-mode' is a clean topic branch against -rc3, which will generate
conflicts when you merge it with the EFI 'next' branch.

I performed the merge myself in 'mixed-mode-merged' so you can take a
look there to see how I resolved the conflicts.

If you want another option, just let me know.

The following changes since commit 6d0abeca3242a88cab8232e4acd7e2bf088f3bc2:

  Linux 3.14-rc3 (2014-02-16 13:30:25 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git mixed-mode

for you to fetch changes up to 18c46461d9e42d398536055f31f58cdcd2c6347e:

  x86/efi: Re-disable interrupts after calling firmware services (2014-03-04 21:44:00 +)


Matt Fleming (13):
  x86/boot: Cleanup header.S by removing some #ifdefs
  x86, tools: Consolidate #ifdef code
  x86/mm/pageattr: Always dump the right page table in an oops
  x86/efi: Delete dead code when checking for non-native
  efi: Add separate 32-bit/64-bit definitions
  x86/efi: Build our own EFI services pointer table
  x86/efi: Add early thunk code to go from 64-bit to 32-bit
  x86/efi: Split the boot stub into 32/64 code paths
  x86/efi: Firmware agnostic handover entry points
  x86/efi: Add mixed runtime services support
  x86/efi: Wire up CONFIG_EFI_MIXED
  x86/boot: Don't overwrite cr4 when enabling PAE
  x86/efi: Re-disable interrupts after calling firmware services

 arch/x86/Kconfig   |   14 +
 arch/x86/boot/Makefile |2 +-
 arch/x86/boot/compressed/eboot.c   | 1018 +---
 arch/x86/boot/compressed/eboot.h   |   60 ++
 arch/x86/boot/compressed/efi_stub_64.S |   29 +
 arch/x86/boot/compressed/head_32.S |   50 +-
 arch/x86/boot/compressed/head_64.S |  108 +++-
 arch/x86/boot/header.S |   23 +-
 arch/x86/boot/tools/build.c|   76 ++-
 arch/x86/include/asm/efi.h |   38 +-
 arch/x86/include/asm/pgtable_types.h   |2 +
 arch/x86/kernel/setup.c|2 +-
 arch/x86/mm/fault.c|7 +-
 arch/x86/mm/pageattr.c |   12 +-
 arch/x86/platform/efi/Makefile |1 +
 arch/x86/platform/efi/efi.c|  141 +++--
 arch/x86/platform/efi/efi_64.c |  335 ++-
 arch/x86/platform/efi/efi_stub_64.S|  157 +
 arch/x86/platform/efi/efi_thunk_64.S   |   65 ++
 drivers/firmware/efi/efi-stub-helper.c |  148 ++---
 include/linux/efi.h|  252 
 21 files changed, 2116 insertions(+), 424 deletions(-)
 create mode 100644 arch/x86/platform/efi/efi_thunk_64.S

-- 
Matt Fleming, Intel Open Source Technology Center

Re: Closing on the CR2 leak bug

2014-03-04 Thread H. Peter Anvin

On 03/04/2014 03:41 PM, Steven Rostedt wrote:
> 
> The only other issue is if perf traces from function context and traces
> the trace_do_page_fault() call. But other than that, sure, take Jiri's
> patch.
> 

Is there a known codepath on which that can happen?

-hpa


Re: [PATCH 31/48] uv: Replace __get_cpu_var

2014-03-04 Thread Steven Rostedt

On Tue, 4 Mar 2014 15:02:17 -0800
Andrew Morton  wrote:

> On Fri, 14 Feb 2014 14:19:12 -0600 Christoph Lameter  wrote:
> 
> > Use __this_cpu_read instead.
> > 
> > ...
> >
> > --- linux.orig/arch/x86/include/asm/uv/uv_hub.h 2014-02-03 14:16:53.987889372 -0600
> > +++ linux/arch/x86/include/asm/uv/uv_hub.h      2014-02-03 14:16:53.987889372 -0600
> > @@ -618,7 +618,7 @@
> >  };
> >  
> >  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> > -#define uv_cpu_nmi (__get_cpu_var(__uv_cpu_nmi))
> > +#define uv_cpu_nmi __this_cpu_read(_uv_cpu_nmi)
> 
> arch/x86/platform/uv/uv_nmi.c: In function 'uv_check_nmi':
> arch/x86/platform/uv/uv_nmi.c:218: error: '_uv_cpu_nmi' undeclared (first use in this function)
> arch/x86/platform/uv/uv_nmi.c:218: error: (Each undeclared identifier is reported only once
> arch/x86/platform/uv/uv_nmi.c:218: error: for each function it appears in.)
> 
> 
> This?
> 
> --- a/arch/x86/include/asm/uv/uv_hub.h~uv-replace-__get_cpu_var-fix
> +++ a/arch/x86/include/asm/uv/uv_hub.h
> @@ -618,7 +618,7 @@ struct uv_cpu_nmi_s {
>  };
>  
>  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> -#define uv_cpu_nmi   __this_cpu_read(_uv_cpu_nmi)
> +#define uv_cpu_nmi   (*this_cpu_ptr(&__uv_cpu_nmi))

Looks like an extra "_" was added.

-- Steve

>  #define uv_hub_nmi   (uv_cpu_nmi.hub)
>  #define uv_cpu_nmi_per(cpu)  (per_cpu(__uv_cpu_nmi, cpu))
>  #define uv_hub_nmi_per(cpu)  (uv_cpu_nmi_per(cpu).hub)
> _


Re: [PATCH] MAINTAINERS: add maintainers for arm64 acpi

2014-03-04 Thread Catalin Marinas

On Tue, Mar 04, 2014 at 10:59:46AM +, Grant Likely wrote:
> On Tue, Mar 4, 2014 at 6:23 PM, Catalin Marinas wrote:
> > On Tue, Mar 04, 2014 at 02:15:45AM +, Graeme Gregory wrote:
> >> +ACPI ARM64
> >
> > That's a pretty broad statement for a single file. Is it core support,
> > architected peripherals, SoC?
> 
> That's a good point. Graeme, it would be good if you could put some
> text in the patch describing how you propose the maintainership to
> work. Unfortunately the maintainers file doesn't have any kind of
> comments field, otherwise I'd suggest you make those comments directly
> there.

I would actually go for something that can be used as a reference like
Documentation/arm64/acpi.txt. This will be a document that evolves in
time but pretty much sets the direction for what it means to support
ACPI on the arm64 kernel. Yesterday's talk at Linaro Connect about
clocks and voltage regulators on ACPI platforms is a clear example that
needs to be captured in a kernel document.

For DT we have documented bindings and I was too lazy for a broader
arm64 soc.txt document but I plan to send an RFC along the lines of
https://plus.google.com/103785593327310749350/posts/dZF3zf7z2v4 (unless
the arm-soc guys beat me to it ;)).

I really don't see much point in an ACPI ARM64 maintainers entry that
only covers a 200-line file without guidelines on what else is
going into other parts of the kernel related to ARM ACPI.

> Given that ACPI can touch a lot of subsystems I would expect you and
> Hanjun not to be merging much code directly, but being listed in
> maintainers means that you will be kept in the loop when it comes to
> merging ARM ACPI changes. I would also expect that anything that does
> go through you (instead of merely acked) would be merged via Rafael
> and Len's tree.

No issues here.

> >> +M:   Hanjun Guo 
> >> +M:   Graeme Gregory 
> >> +S:   Supported
> >> +L:   linux-a...@vger.kernel.org
> >> +F:   drivers/acpi/plat/arm-core.c
> >
> > This patch should be part of the series introducing the arm-core.c file
> > and it will be ACKed (or NAKed) following review. We can't really commit
> > maintainers to a file which does not exist in mainline and while there is
> > still feedback to be addressed. It's like a blank cheque.
> 
> I agree with merging it with the rest of the series, but comparing it
> to a blank cheque is not appropriate. Merely having an entry in
> MAINTAINERS doesn't immediately confer trust or ability to merge code,
> but it does tell people who to talk to when looking at ACPI on ARM.
> You can bet that neither Linus, Len or Rafael will merge ARM ACPI
> trees from them if you disagree. (And even if they did, you would
> yell, and Linus would revert it).

The point is that I don't have to follow all the developments closely
and feel the need to yell, so I have to rely on (trust) the newly
appointed maintainers to do the right thing. The recent example with me
asking Rafael to drop the GIC patch shouldn't have really happened. It's
not for Rafael to decide on how many acks to be on a patch before being
merged (absolutely no complaints to Rafael here) but rather for the
patch contributors to reach out to relevant kernel developers and ask
for review/ack before sending the patch upstream (especially when there
is an ongoing conversation).

As I already stated, I'm not bothered with the ACPI clean-up patches,
that's up to Rafael/Len to merge. But those involving the ARM IP like
GIC and timers need wider review before going upstream. For something of
such importance to us like ARM ACPI, I really feel uneasy about patches
going into mainline with just a Signed-off-by (hint: your ack adds
significant weight to a patch ;)). Who/how many acks, I leave it to the
ARM64 ACPI maintainers to figure out but it must be non-zero to be able
to build up trust over time.

And there is the wider issue of which platforms go for ACPI and which
stay with DT. Here the key agreement should come from the arm64 and
arm-soc maintainers and captured in a kernel document. That's a kernel
decision based on technical merits and *not* driven by "artificial"
distro policies (sorry jcm).

-- 
Catalin

Re: linux-next: build failure after merge of the wireless-next tree

2014-03-04 Thread Greg KH

On Wed, Mar 05, 2014 at 10:21:09AM +1100, Stephen Rothwell wrote:
> Hi John,
> 
> On Wed, 5 Mar 2014 10:16:27 +1100 Stephen Rothwell wrote:
> >
> > If I revert commit 161d78555435 "Revert "Staging: rtl8812ae: remove
> > modules field of rate_control_ops"", it fails differently:
> > 
> > In file included from drivers/staging/wlan-ng/p80211netdev.c:91:0:
> > drivers/staging/wlan-ng/cfg80211.c: In function 'prism2_scan':
> > drivers/staging/wlan-ng/cfg80211.c:419:10: error: implicit declaration of function 'ieee80211_dsss_chan_to_freq' [-Werror=implicit-function-declaration]
> >   ieee80211_dsss_chan_to_freq(msg2.dschannel.data)),
> >   ^
> > drivers/staging/rtl8821ae/rtl8821ae/trx.c: In function 'rtl8821ae_rx_query_desc':
> > drivers/staging/rtl8821ae/rtl8821ae/trx.c:619:3: warning: passing argument 1 of 'ieee80211_is_robust_mgmt_frame' from incompatible pointer type [enabled by default]
> >if ((ieee80211_is_robust_mgmt_frame(hdr)) &&
> >^
> > In file included from include/net/mac80211.h:20:0,
> >  from drivers/staging/rtl8821ae/rtl8821ae/../wifi.h:38,
> >  from drivers/staging/rtl8821ae/rtl8821ae/trx.c:30:
> > include/linux/ieee80211.h:2286:20: note: expected 'struct sk_buff *' but argument is of type 'struct ieee80211_hdr *'
> >  static inline bool ieee80211_is_robust_mgmt_frame(struct sk_buff *skb)
> > ^
> > drivers/staging/rtl8821ae/rtl8821ae/dm.c: In function 'rtl8821ae_dm_clear_txpower_tracking_state':
> > drivers/staging/rtl8821ae/rtl8821ae/dm.c:487:31: warning: iteration 2u invokes undefined behavior [-Waggressive-loop-optimizations]
> >rtldm->bb_swing_idx_ofdm[p] = rtldm->default_ofdm_index;
> >^
> > drivers/staging/rtl8821ae/rtl8821ae/dm.c:485:2: note: containing loop
> >   for (p = RF90_PATH_A; p < MAX_RF_PATH; ++p) {
> >   ^
> > 
> > So please revert 161d78555435, and then fix this other error.
> 
> That fix could be as simple as just disabling the affected staging tree
> driver and letting Greg know.  Or you could actually fix it.

John and Larry and I have been talking about how to handle some
cross-tree issues, and I should be merging with his tree soon to handle
them, hopefully this will get resolved at that point in time as well.

thanks,

greg k-h

Re: [PATCH 31/48] uv: Replace __get_cpu_var

2014-03-04 Thread Andrew Morton

On Tue, 4 Mar 2014 18:42:10 -0500 Steven Rostedt  wrote:

> > --- a/arch/x86/include/asm/uv/uv_hub.h~uv-replace-__get_cpu_var-fix
> > +++ a/arch/x86/include/asm/uv/uv_hub.h
> > @@ -618,7 +618,7 @@ struct uv_cpu_nmi_s {
> >  };
> >  
> >  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> > -#define uv_cpu_nmi __this_cpu_read(_uv_cpu_nmi)
> > +#define uv_cpu_nmi (*this_cpu_ptr(&__uv_cpu_nmi))
> 
> Looks like an extra "_" was added.

yes, there were two mistakes in that line.
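
The thread does not spell out the second mistake. One plausible reading,
offered here purely as an assumption, is that a read-style accessor hands
back a copy, whereas macros such as uv_hub_nmi expand to a member access
on uv_cpu_nmi and so need it to name the per-CPU object itself. The
userspace sketch below shows that difference with invented names
(demo_storage, demo_read(), demo_ptr()); it does not use the kernel's
per-CPU machinery.

```c
#include <assert.h>

struct demo_nmi {
	int hub;
	int pinging;
};

/* Single slot standing in for one CPU's per-CPU variable. */
static struct demo_nmi demo_storage;

/* Read-style accessor: returns a copy of the object. */
static struct demo_nmi demo_read(void)
{
	return demo_storage;
}

/* Pointer-style accessor: returns the address of the object itself. */
static struct demo_nmi *demo_ptr(void)
{
	return &demo_storage;
}

#define demo_nmi_copy	(demo_read())		/* rvalue, cannot be assigned to */
#define demo_nmi_ref	(*demo_ptr())		/* lvalue, names the real object */
#define demo_hub_ref	(demo_nmi_ref.hub)	/* member access, as uv_hub_nmi does */

/* Writes through the lvalue macro land in the real storage. */
static void demo_set_hub(int value)
{
	demo_hub_ref = value;
}
```

An assignment such as demo_nmi_copy.hub = 1 is rejected by the compiler,
while demo_set_hub(7) updates demo_storage as intended.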

[GIT PULL] EFI urgent fix

2014-03-04 Thread Matt Fleming

Please pull the following fix from Borislav that fixes a boot regression
for SGI UV.

The following changes since commit 09503379dc99535b1bbfa51aa1aeef340f5d82ec:

  x86/efi: Check status field to validate BGRT header (2014-02-14 10:07:15 +)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git tags/efi-urgent

for you to fetch changes up to a5d90c923bcfb9632d998ed06e9569216ad695f3:

  x86/efi: Quirk out SGI UV (2014-03-04 23:43:33 +)


 * Disable the new EFI 1:1 virtual mapping for SGI UV because using it
   causes a crash during boot - Borislav Petkov


Borislav Petkov (1):
  x86/efi: Quirk out SGI UV

 arch/x86/include/asm/efi.h  |  1 +
 arch/x86/kernel/setup.c | 10 ++
 arch/x86/platform/efi/efi.c | 20 
 3 files changed, 23 insertions(+), 8 deletions(-)

-- 
Matt Fleming, Intel Open Source Technology Center

[PATCH] net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls

2014-03-04 Thread Marc Kleine-Budde

On PREEMPT_RT enabled systems the interrupt handlers run as threads at prio 50
(by default). If a high priority userspace process tries to shut down a busy
network interface it might spin in a yield loop waiting for the device to
become idle. With the interrupt thread having a lower priority than the
looping process it might never be scheduled and so result in a deadlock on UP
systems.

With Magic SysRq the following backtrace can be produced:

> test_app R running  0   174168 0x
> [] (__schedule+0x220/0x3fc) from [] (preempt_schedule_irq+0x48/0x80)
> [] (preempt_schedule_irq+0x48/0x80) from [] (svc_preempt+0x8/0x20)
> [] (svc_preempt+0x8/0x20) from [] (local_bh_enable+0x18/0x88)
> [] (local_bh_enable+0x18/0x88) from [] (dev_deactivate_many+0x220/0x264)
> [] (dev_deactivate_many+0x220/0x264) from [] (__dev_close_many+0x64/0xd4)
> [] (__dev_close_many+0x64/0xd4) from [] (__dev_close+0x28/0x3c)
> [] (__dev_close+0x28/0x3c) from [] (__dev_change_flags+0x88/0x130)
> [] (__dev_change_flags+0x88/0x130) from [] (dev_change_flags+0x10/0x48)
> [] (dev_change_flags+0x10/0x48) from [] (do_setlink+0x370/0x7ec)
> [] (do_setlink+0x370/0x7ec) from [] (rtnl_newlink+0x2b4/0x450)
> [] (rtnl_newlink+0x2b4/0x450) from [] (rtnetlink_rcv_msg+0x158/0x1f4)
> [] (rtnetlink_rcv_msg+0x158/0x1f4) from [] (netlink_rcv_skb+0xac/0xc0)
> [] (netlink_rcv_skb+0xac/0xc0) from [] (rtnetlink_rcv+0x18/0x24)
> [] (rtnetlink_rcv+0x18/0x24) from [] (netlink_unicast+0x13c/0x198)
> [] (netlink_unicast+0x13c/0x198) from [] (netlink_sendmsg+0x264/0x2e0)
> [] (netlink_sendmsg+0x264/0x2e0) from [] (sock_sendmsg+0x78/0x98)
> [] (sock_sendmsg+0x78/0x98) from [] (___sys_sendmsg.part.25+0x268/0x278)
> [] (___sys_sendmsg.part.25+0x268/0x278) from [] (__sys_sendmsg+0x48/0x78)
> [] (__sys_sendmsg+0x48/0x78) from [] (ret_fast_syscall+0x0/0x2c)

This patch works around the problem by replacing yield() by msleep(1), giving
the interrupt thread time to finish, similar to other changes contained in the
rt patch set. Using wait_for_completion() instead would probably be a better
solution.

Signed-off-by: Marc Kleine-Budde 
---
Hello,

this patch is intended for -rt, as the problem probably doesn't occur on non rt
systems. However, I put netdev on Cc, as they probably can come up with a
proper solution.

regards,
Marc

 net/sched/sch_generic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index a7f838b..d5b00ad 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -839,7 +839,7 @@ void dev_deactivate_many(struct list_head *head)
/* Wait for outstanding qdisc_run calls. */
list_for_each_entry(dev, head, unreg_list)
while (some_qdisc_is_busy(dev))
-   yield();
+   msleep(1);
 }
 
 void dev_deactivate(struct net_device *dev)
-- 
1.9.0
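
The yield()-versus-msleep() difference discussed in this patch has a userspace
analogue that may make the reasoning easier to follow. Under a real-time
scheduling policy, sched_yield() only lets threads at the same priority run,
whereas a sleep blocks the caller so that lower-priority threads get the CPU.
The sketch below is not kernel code: wait_until_idle(), busy_fn and
busy_countdown() are hypothetical names chosen for illustration, and
nanosleep() stands in for msleep().

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical predicate type: returns true while the resource is busy. */
typedef bool (*busy_fn)(void *ctx);

/*
 * Poll until busy() reports idle, sleeping sleep_ms between polls.
 * Sleeping blocks the caller, so lower-priority threads can run and
 * finish the work the caller is waiting for.
 * Returns the number of sleeps performed.
 */
static unsigned int wait_until_idle(busy_fn busy, void *ctx, long sleep_ms)
{
	struct timespec delay;
	unsigned int sleeps = 0;

	assert(busy && sleep_ms >= 0);
	delay.tv_sec = sleep_ms / 1000;
	delay.tv_nsec = (sleep_ms % 1000) * 1000000L;

	while (busy(ctx)) {
		nanosleep(&delay, NULL);
		sleeps++;
	}
	return sleeps;
}

/* Demo predicate: reports busy for the first *remaining calls. */
static bool busy_countdown(void *ctx)
{
	int *remaining = ctx;

	if (*remaining <= 0)
		return false;
	(*remaining)--;
	return true;
}
```

Calling wait_until_idle(busy_countdown, &n, 1) with n = 3 sleeps three times
before returning. A completion-style wakeup, as the patch description
suggests, would avoid polling altogether.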


Re: [PATCH] MAINTAINERS: add maintainers for arm64 acpi

2014-03-04 Thread Catalin Marinas

On Tue, Mar 04, 2014 at 07:03:18PM +, Graeme Gregory wrote:
> On Tue, Mar 04, 2014 at 10:23:16AM +, Catalin Marinas wrote:
> > On Tue, Mar 04, 2014 at 02:15:45AM +, Graeme Gregory wrote:
> > > +ACPI ARM64
> > 
> > That's a pretty broad statement for a single file. Is it core support,
> > architected peripherals, SoC?
> > 
> Hi Catalin would changing the title to ACPI ARM64 Core Support be better
> in your mind. I do intend for the maintainership to cover just the
> plat/arm-core.c file.

See my reply to Grant. If that's the only thing you guys are aiming for,
who's in charge of the other bits? Face-to-face meeting in 3 hours
anyway, so we can get back here with the conclusion.

-- 
Catalin

Re: [PATCH v5] can: xilinx CAN controller support.

2014-03-04 Thread Sören Brinkmann

Hi Kedar,

On Tue, 2014-03-04 at 06:50PM +0530, Kedareswara rao Appana wrote:
> This patch adds xilinx CAN controller support.
> This driver supports both ZYNQ CANPS and Soft IP
> AXI CAN controller.
> 
[...]
> diff --git a/Documentation/devicetree/bindings/net/can/xilinx_can.txt b/Documentation/devicetree/bindings/net/can/xilinx_can.txt
> new file mode 100644
> index 000..0e57103
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/can/xilinx_can.txt
> @@ -0,0 +1,45 @@
> +Xilinx Axi CAN/Zynq CANPS controller Device Tree Bindings
> +-
> +
> +Required properties:
> +- compatible : Should be "xlnx,zynq-can-1.00.a" for Zynq CAN
> +   controllers and "xlnx,axi-can-1.00.a" for Axi CAN
> +   controllers.
> +- reg: Physical base address and size of the Axi CAN/Zynq
> +   CANPS registers map.
> +- interrupts : Property with a value describing the interrupt
> +   number.
> +- interrupt-parent   : Must be core interrupt controller
> +- clock-names: List of input clock names - "ref_clk", "aper_clk"

Let's reconsider these names. These are rather Zynq specific names. Does
the IP documentation use these as well? The names should match the
naming used for the IP, rather than the SOC. Is this the correct data sheet:
http://www.xilinx.com/support/documentation/ip_documentation/axi_can/v1_03_a/ds791_axi_can.pdf
? According to that the names should rather be 's_axi_aclk' and
'can_clk', IMHO.

Sören



RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed

2014-03-04 Thread Florian Fainelli

Hi all,

I am seeing the following RCU stall messages on a 4-CPU ARMv7
system running 3.14-rc4:

[   42.974327] INFO: rcu_sched detected stalls on CPUs/tasks:
[   42.979839]  (detected by 0, t=2102 jiffies, g=4294967082, c=4294967081, q=516)
[   42.987169] INFO: Stall ended before state dump start

This is happening under the following conditions:

- the attached bumper.c binary alters various kernel thread priorities
  based on the contents of bumpup.cfg
- malloc_crazy is running from an NFS share
- malloc_crazy.c is running in a loop allocating chunks of memory but
  never freeing them

When the priorities are altered, the RCU stalls happen instead of the
OOM killer being invoked. With NFS taken out of the equation I cannot
reproduce the problem, even with the priorities altered.

This "problem" seems to have been there for quite a while now since I
was able to get 3.8.13 to trigger that bug as well, with a slightly
more detailed RCU debugging trace which points the finger at kswapd0.

You should be able to get that reproduced under QEMU with the
Versatile Express platform emulating a Cortex A15 CPU and the attached
files.

Any help or suggestions would be greatly appreciated. Thanks!
-- 
Florian
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
	int i;
	int *buf;
	int max = 3048;

	if (argc == 2)
		max = atoi(argv[1]);

	for(i = 0; i < max; i++) {
		buf = (void *)malloc(1048576);
		if(! buf) {
			printf("malloc returned NULL\n");
			return(0);
		}
		memset(buf, 0x11, 1048576);
		*buf = i;
		printf("%d\n", i);
	}
	printf("finished\n");

	return(1);
}


rcu_test.sh
Description: Bourne shell script


bumpup.cfg
Description: Binary data

/***
 *   Copyright (C) 2010 by BSkyB
 *   Richard Parsons
 * *
 *   This program is free software; you can redistribute it and/or modify  *
 *   it under the terms of the GNU General Public License as published by  *
 *   the Free Software Foundation; either version 2 of the License, or *
 *   (at your option) any later version.   *
 * *
 *   This program is distributed in the hope that it will be useful,   *
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of*
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the *
 *   GNU General Public License for more details.  *
 * *
 *   You should have received a copy of the GNU General Public License *
 *   along with this program; if not, write to the *
 *   Free Software Foundation, Inc.,   *
 *   59 Temple Place - Suite 330, Boston, MA  02111-1307, USA. *
 ***/

/* N.B. */
/* this program is supplied WITHOUT any warranty whatsoever		  */

/*==

This program is used for the mass changing of process priorities by name
The syntax is as follows
	./bumper [cfgfile]

The config file uses the following syntax
#this is a comment
process_name,POLICY,priority

Default is RR , but valid policies are
  RR, FIFO and OTHER

priority is a value from 0..99

e.g.

#Telnet priority
telnetd,RR,37

=*/


#ifdef HAVE_CONFIG_H
#include 
#endif

#include 
#include 
#include 
#include 
#include 
#include "bumper.h"

int main ( int argc, char *argv[] )
{

	if ( argc != 2 )
	{
		printf ( "Error: Usage is %s [config file]\n",argv[0] );
		return EXIT_FAILURE;
	} 
	processList = init_application();
	if ( readInputFile ( argv[1] ) < 0 )
	{
		printf ( "Error: Could not open %s\n",argv[1] );
		return EXIT_FAILURE;
	}
	getProcessID ( processList );
	destory_application(processList);
	return EXIT_SUCCESS;
}

void *init_application ( void )
{
  return NULL;
}

void destory_application(struct params *head)
{
	struct params *next;

	if (head == NULL)
		return;

	next = head;

	while (next != NULL)
	{
		void *nptr = next->next;
		free(next->process);
		free(next);
		next = nptr;
	}
}

int readInputFile ( char *inFile )
{
	FILE *fp;
	int rc = -1;
	char inLine[101];

	if ( ( fp = fopen ( inFile, "r" ) ) == NULL )
		return rc;

	while ( fgets ( inLine, 100, fp ) !=NULL )
	{
		if ( inLine[0] == '#' )
			continue;

		processList = addEntry ( inLine,processList );
	}
	fclose ( fp );
	return 1;
}


void *addEntry ( char *szParameters,struct params *head )
{
	struct params *entry;
	struct params *lastPtr=head;
	void *retPtr=head;
	char process[512];
	char policy[512];
	char prio[512];
	if ( strlen ( szPar

Re: mm: kernel BUG at mm/huge_memory.c:2785!

2014-03-04 Thread Sasha Levin


On 02/27/2014 10:03 AM, Kirill A. Shutemov wrote:

Sasha Levin wrote:

>Hi all,
>
>While fuzzing with trinity inside a KVM tools guest running latest -next
>kernel I've stumbled on the following spew:
>
>[ 1428.146261] kernel BUG at mm/huge_memory.c:2785!

Hm, interesting.

It seems we either failed to split huge page on vma split or it
materialized from under us. I don't see how it can happen:

   - it seems we do the right thing with vma_adjust_trans_huge() in
 __split_vma();
   - we hold ->mmap_sem all the way from vm_munmap(). At least I don't see
 a place where we could drop it;

Andrea, any ideas?


And a somewhat related issue (please correct me if I'm wrong):

[ 2208.713223] kernel BUG at mm/mlock.c:528!
[ 2208.713692] invalid opcode:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 2208.714488] Dumping ftrace buffer:
[ 2208.715209]    (ftrace buffer empty)
[ 2208.715759] Modules linked in:
[ 2208.716206] CPU: 34 PID: 3736 Comm: trinity-c209 Tainted: GW 3.14.0-rc5-next-20140304-sasha-9-geaa4df0 #77
[ 2208.717637] task: 880ff90c8000 ti: 880ff90c6000 task.ti: 880ff90c6000
[ 2208.718742] RIP: 0010:[]  [] munlock_vma_pages_range+0x176/0x1d0
[ 2208.720107] RSP: 0018:880ff90c7e08  EFLAGS: 00010206
[ 2208.720711] RAX: 01ff RBX: 0003f000 RCX: 
[ 2208.721456] RDX: 003f RSI: 8129d92d RDI: 84476115
[ 2208.721456] RBP: 880ff90c7ec8 R08:  R09: 
[ 2208.721456] R10: 0001 R11:  R12: fff2
[ 2208.721456] R13: 880313b41600 R14: 0004 R15: 880ff90c7e94
[ 2208.721456] FS:  7f2bd5330700() GS:88032bc0() knlGS:
[ 2208.721456] CS:  0010 DS:  ES:  CR0: 8005003b
[ 2208.721456] CR2: 02767c90 CR3: 000ffa1d4000 CR4: 06e0
[ 2208.721456] DR0: 7f15fe555000 DR1:  DR2: 
[ 2208.721456] DR3:  DR6: 0ff0 DR7: 0600
[ 2208.721456] Stack:
[ 2208.721456]    0001880ff90c7e38
[ 2208.721456]   880313b41600 f90c7e88
[ 2208.721456]  00ff880ff90c7e58 880815e8cba0 880ff90c7eb8 880313b41600
[ 2208.721456] Call Trace:
[ 2208.721456]  [] do_munmap+0x1d2/0x350
[ 2208.721456]  [] ? down_write+0xa6/0xc0
[ 2208.721456]  [] ? vm_munmap+0x46/0x80
[ 2208.721456]  [] vm_munmap+0x54/0x80
[ 2208.721456]  [] SyS_munmap+0x2c/0x40
[ 2208.721456]  [] tracesys+0xdd/0xe2
[ 2208.721456] Code: fd ff ff 4c 89 e6 48 89 c3 48 8d bd 40 ff ff ff e8 80 fa ff ff eb 2f 66 0f 1f 44 00 00 8b 45 cc 48 89 da 48 c1 ea 0c 85 d0 74 12 <0f> 0b 0f 1f 84 00 00 00 00 00 eb fe 66 0f 1f 44 00 00 ff c0 48
[ 2208.721456] RIP  [] munlock_vma_pages_range+0x176/0x1d0
[ 2208.721456]  RSP 


Thanks,
Sasha

Re: fs: gpf in simple_setattr

2014-03-04 Thread Sasha Levin

On 03/03/2014 04:40 PM, Jan Kara wrote:

On Sat 01-03-14 15:05:21, Sasha Levin wrote:

>ping again?
>
>I've been working on it, but don't see an obvious issue.
>
>It does look like an access to invalid memory easily doable from
>userspace, so it should probably get fixed soon...

   Hum, can you maybe dump the name in dentry passed to simple_setattr()? Or
maybe even the whole path using dentry_path() (but not sure if that will
be workable on half-torn-down fs)? Maybe it will give us a hint at which
filesystem to look...

It's just garbage, this is why I'm having a hard time making any progress with
this bug.

Thanks,
Sasha

Re: [PATCH 00/12] Thunderbolt hotplug support for Apple hardware (testers needed)

2014-03-04 Thread Andreas Noever

On Tue, Mar 4, 2014 at 1:09 AM, Matthew Garrett  wrote:
>
> Actually, turns out there's a much easier way. Can you try this patch?
> I see the Thunderbolt controller after resume, although it doesn't seem
> to be in a working state.
>
> commit 102547d63e2cbbda42a25f650df9a33cf929a385
> Author: Matthew Garrett 
> Date:   Mon Mar 3 18:49:28 2014 -0500
>
> ACPI: Support _OSI("Darwin") correctly
>
> Apple hardware queries _OSI("Darwin") in order to determine whether the
> system is running OS X, and changes firmware behaviour based on the answer.
> The most obvious difference in behaviour is that Thunderbolt hardware is
> forcibly powered down unless the system is running OS X. The obvious solution
> would be to simply add Darwin to the list of supported _OSI strings, but this
> causes problems.
>
> Recent Apple hardware includes two separate methods for checking _OSI
> strings. The first will check whether Darwin is supported, and if so will
> exit. The second will check whether Darwin is supported, but will then
> continue to check for further operating systems. If a further operating
> system is found then later firmware code will assume that the OS is not OS X.
> This results in the unfortunate situation where the Thunderbolt controller is
> available at boot time but remains powered down after suspend.
>
> The easiest way to handle this is to special-case it in the Linux-specific
> OSI handling code. If we see Darwin, we should answer true and then disable
> all other _OSI vendor strings.
>
> Signed-off-by: Matthew Garrett 
>
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index 54a20ff..ef8656c 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -156,6 +156,16 @@ static u32 acpi_osi_handler(acpi_string interface, u32 supported)
> osi_linux.dmi ? " via DMI" : "");
> }
>
> +   if (!strcmp("Darwin", interface)) {
> +   /*
> +* Apple firmware will behave poorly if it receives positive
> +* answers to "Darwin" and any other OS. Respond positively
> +* to Darwin and then disable all other vendor strings.
> +*/
> +   acpi_update_interfaces(ACPI_DISABLE_ALL_VENDOR_STRINGS);
> +   supported = ACPI_UINT32_MAX;
> +   }
> +
> return supported;
>  }
>
>
> --
> Matthew Garrett | mj...@srcf.ucam.org
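
The special case in the quoted patch can be modelled outside the kernel as a
small table of vendor strings plus a query function. This is only a sketch of
the idea: osi_table, struct osi_entry and osi_query() are hypothetical names,
and the table contents are illustrative rather than the list ACPICA actually
carries.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical table of OS vendor strings the OS answers "yes" to. */
struct osi_entry {
	const char *name;
	bool enabled;
};

static struct osi_entry osi_table[] = {
	{ "Windows 2009", true },
	{ "Windows 2012", true },
};

#define OSI_TABLE_LEN (sizeof(osi_table) / sizeof(osi_table[0]))

/* Answer an _OSI-style query, special-casing "Darwin" as the patch describes. */
static bool osi_query(const char *interface)
{
	size_t i;

	assert(interface);
	if (strcmp(interface, "Darwin") == 0) {
		/* Say yes to Darwin, then stop answering yes to anything else. */
		for (i = 0; i < OSI_TABLE_LEN; i++)
			osi_table[i].enabled = false;
		return true;
	}
	for (i = 0; i < OSI_TABLE_LEN; i++)
		if (strcmp(interface, osi_table[i].name) == 0)
			return osi_table[i].enabled;
	return false;
}
```

In this model, firmware that asks about "Darwin" first and another OS
afterwards only ever sees a positive answer for Darwin, which is the behaviour
the patch aims for.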


I believe that the patch has the same effect as passing
acpi_osi=! acpi_osi=Darwin
to the kernel. The problem with that approach is that it changes the
firmware behaviour quite a lot. In particular it prevents Linux from
taking over pci hotplug control:

acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
\_SB_.PCI0:_OSC invalid UUID
_OSC request data:1 1f 0
acpi PNP0A08:00: _OSC failed (AE_ERROR); disabling ASPM

Booting with just acpi_osi=Darwin (or without acpi_osi) gives:
acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
acpi PNP0A08:00: _OSC: platform does not support [PME]
acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug AER PCIeCapability]


The end result is that PCI hotplug events are not handled. If I
unplug the Thunderbolt Ethernet adapter the device is not removed from
lspci.

I would prefer to find a solution that boots without acpi_osi=Darwin
as it seems to trigger quite a lot of ACPI code. My current approach is
to inject a custom OSDW method somewhere into the NHI namespace and to
replace _PTS and _WAK from my driver. I can then wake the controller
with the XRPE method. The last problem is that the PCI code does not
allocate enough (or any) bus numbers below the hotplug ports. I'm
trying to add some quirks to it but the code is not really made for
that...


Andreas

Re: Closing on the CR2 leak bug

2014-03-04 Thread Steven Rostedt

On Tue, 04 Mar 2014 15:44:26 -0800
"H. Peter Anvin"  wrote:

> On 03/04/2014 03:41 PM, Steven Rostedt wrote:
> > 
> > The only other issue is if perf traces from function context and traces
> > the trace_do_page_fault() call. But other than that, sure, take Jiri's
> > patch.
> > 
> 
> Is there a known codepath on which that can happen?

I'm not sure. Jiri, is there a way we can trace userspace from
the ftrace function event? What would be the command to try? I know you
had a patch to prevent it, but is it currently possible?

-- Steve

[PATCH] Add option to build with -O3

2014-03-04 Thread Jon Ringle

Signed-off-by: Jon Ringle 
---
 Makefile |  2 ++
 init/Kconfig | 19 ---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index 78209ee..e7f0b3c 100644
--- a/Makefile
+++ b/Makefile
@@ -581,6 +581,8 @@ all: vmlinux

 ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
 KBUILD_CFLAGS  += -Os $(call cc-disable-warning,maybe-uninitialized,)
+else ifdef CONFIG_CC_OPTIMIZE_FOR_SPEED
+KBUILD_CFLAGS   += -O3
 else
 KBUILD_CFLAGS  += -O2
 endif
diff --git a/init/Kconfig b/init/Kconfig
index 009a797..17d4c62 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1233,13 +1233,26 @@ source "usr/Kconfig"

 endif

+choice
+prompt "Optimize"
+
+config CC_OPTIMIZE_NORMAL
+bool "Optimize Normal (-O2)"
+help
+  Enabling this option will pass "-O2" to gcc
 config CC_OPTIMIZE_FOR_SIZE
-   bool "Optimize for size"
+   bool "Optimize for size (-Os)"
help
- Enabling this option will pass "-Os" instead of "-O2" to gcc
+ Enabling this option will pass "-Os" to gcc
  resulting in a smaller kernel.

- If unsure, say N.
+config CC_OPTIMIZE_FOR_SPEED
+bool "Optimize for speed (-O3)"
+help
+  Enabling this option will pass "-O3" to gcc
+  resulting in a larger kernel (but possibly faster)
+
+endchoice

 config SYSCTL
bool
--
1.8.5.4



[PATCH 2/3] devicetree: bindings: drop pinctrl PMU reg property

2014-03-04 Thread Sebastian Hesselbarth

Marvell Dove's pinctrl requires some PMU regs for muxing PMU
functions to MPP pins. Recently, a discussion started about consolidating
the Power Management Unit (PMU) into a single DT node. As we don't want
any more DT ABI in the way, drop the corresponding reg property from the
pinctrl binding documentation now.

Signed-off-by: Sebastian Hesselbarth 
---
Cc: Rob Herring  
Cc: Pawel Moll  
Cc: Mark Rutland  
Cc: Ian Campbell  
Cc: Kumar Gala  
Cc: Russell King  
Cc: Jason Cooper  
Cc: Andrew Lunn 
Cc: Gregory Clement 
Cc: Linus Walleij 
Cc: devicet...@vger.kernel.org 
Cc: linux-arm-ker...@lists.infradead.org 
Cc: linux-kernel@vger.kernel.org
---
 Documentation/devicetree/bindings/pinctrl/marvell,dove-pinctrl.txt  | 2 +-
 Documentation/devicetree/bindings/pinctrl/marvell,mvebu-pinctrl.txt | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/pinctrl/marvell,dove-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/marvell,dove-pinctrl.txt
index cf52477cc7ee..3b3d465cbfd9 100644
--- a/Documentation/devicetree/bindings/pinctrl/marvell,dove-pinctrl.txt
+++ b/Documentation/devicetree/bindings/pinctrl/marvell,dove-pinctrl.txt
@@ -6,7 +6,7 @@ part and usage.
 Required properties:
 - compatible: "marvell,dove-pinctrl"
 - clocks: (optional) phandle of pdma clock
-- reg: register specifiers of MPP, MPP4, and PMU MPP registers
+- reg: register specifiers of MPP and MPP4 registers
 
 Available mpp pins/groups and functions:
 Note: brackets (x) are not part of the mpp name for marvell,function and given
diff --git a/Documentation/devicetree/bindings/pinctrl/marvell,mvebu-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/marvell,mvebu-pinctrl.txt
index 0c09f4eb2af0..111b6eea72e8 100644
--- a/Documentation/devicetree/bindings/pinctrl/marvell,mvebu-pinctrl.txt
+++ b/Documentation/devicetree/bindings/pinctrl/marvell,mvebu-pinctrl.txt
@@ -37,7 +37,7 @@ uart1: serial@12100 {
 
 pinctrl: pinctrl@d0200 {
compatible = "marvell,dove-pinctrl";
-   reg = <0xd0200 0x14>, <0xd0440 0x04>, <0xd802c 0x08>;
+   reg = <0xd0200 0x14>, <0xd0440 0x04>;
 
pmx_uart1_sw: pmx-uart1-sw {
marvell,pins = "mpp_uart1";
-- 
1.8.5.3


[PATCH 0/3] pinctrl: mvebu: prepare for single PMU node

2014-03-04 Thread Sebastian Hesselbarth

This is a small patch set preparing a discussion and rework for the
Power Management Unit (PMU) found on Marvell Dove SoCs. We are planning
to consolidate the PMU into a single DT node instead of chopping it into
tiny pieces [1].

As we have just taken in patches for the pinctrl driver that grab another
piece of PMU reg space, we would rather drop the corresponding reg property
and wait for v3.15 and a proper PMU node rework.

Patch 1 drops the reg property added for pinctrl and is based on mvebu/dt.

Patch 2 drops the reg property added for pinctrl from the corresponding
binding documentation and example. It is based on recently pulled
mvebu/pinctrl-dove.

Patch 3 reduces a WARN on missing reg properties to dev_warn with FW_BUG,
as we just enforced this situation. The driver will continue to work
properly without the resource and derive it from other resources. As soon
as PMU binding is worked out, we will update the driver to make use of it.

The last patch is optional and Linus Walleij can reject it, if he is already
done for v3.15.

[1] https://lkml.org/lkml/2014/3/3/326

Sebastian Hesselbarth (3):
  ARM: dove: drop pinctrl PMU reg property
  devicetree: bindings: drop pinctrl PMU reg property
  pinctrl: mvebu: silence WARN to dev_warn

 Documentation/devicetree/bindings/pinctrl/marvell,dove-pinctrl.txt  | 2 +-
 Documentation/devicetree/bindings/pinctrl/marvell,mvebu-pinctrl.txt | 2 +-
 arch/arm/boot/dts/dove.dtsi | 3 +--
 drivers/pinctrl/mvebu/pinctrl-dove.c| 3 ++-
 4 files changed, 5 insertions(+), 5 deletions(-)

---
Cc: Rob Herring 
Cc: Pawel Moll 
Cc: Mark Rutland 
Cc: Ian Campbell 
Cc: Kumar Gala 
Cc: Russell King 
Cc: Jason Cooper 
Cc: Andrew Lunn 
Cc: Gregory Clement 
Cc: Linus Walleij 
Cc: devicet...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
-- 
1.8.5.3


[PATCH 3/3] pinctrl: mvebu: silence WARN to dev_warn

2014-03-04 Thread Sebastian Hesselbarth

Pinctrl will WARN on missing DT resources, which is a little bit too
noisy. Use dev_warn with FW_BUG instead.

Signed-off-by: Sebastian Hesselbarth 
---
Cc: Russell King  
Cc: Jason Cooper  
Cc: Andrew Lunn 
Cc: Gregory Clement 
Cc: Linus Walleij 
Cc: devicet...@vger.kernel.org 
Cc: linux-arm-ker...@lists.infradead.org 
Cc: linux-kernel@vger.kernel.org
---
 drivers/pinctrl/mvebu/pinctrl-dove.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pinctrl/mvebu/pinctrl-dove.c b/drivers/pinctrl/mvebu/pinctrl-dove.c
index 9e7ff651c018..3b022178a566 100644
--- a/drivers/pinctrl/mvebu/pinctrl-dove.c
+++ b/drivers/pinctrl/mvebu/pinctrl-dove.c
@@ -832,7 +832,8 @@ static int dove_pinctrl_probe(struct platform_device *pdev)
}
 
/* Warn on any missing DT resource */
-   WARN(fb_res.start, FW_BUG "Missing pinctrl regs in DTB. Please update your firmware.\n");
+   if (fb_res.start)
+   dev_warn(&pdev->dev, FW_BUG "Missing pinctrl regs in DTB. Please update your firmware.\n");
 
return mvebu_pinctrl_probe(pdev);
 }
-- 
1.8.5.3


Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-04 Thread Al Viro

On Tue, Mar 04, 2014 at 01:17:50PM -0800, Linus Torvalds wrote:
> On Tue, Mar 4, 2014 at 12:00 PM, Al Viro  wrote:
> >
> > OK, with the attached set (the first one is essentially unchanged from
> > your first one), it seems to work and produce better code on all targets
> > I've tried.  Comments?
> 
> I'm certainly ok with it. You seem to have left the fput_light()
> function around, though, despite removing fget_[raw_]light(). That
> seems a bit silly, since there is no valid use any more apart from
> net/socket.c that now doesn't balance things properly.

There's also a pile of crap around sockfd_lookup/sockfd_put, related
to that.   Moreover, there's net/compat.c, which probably ought to
have the compat syscalls themselves moved to net/socket.c (under
ifdef CONFIG_COMPAT) and switched to sockfd_lookup_light().
There's l2tp_tunnel_sock_lookup(), which is simply broken - it assumes
that if tunnel->fd still resolves to a socket, that socket must
be l2tp one.  Trivial to drive into BUG_ON(), in queue_work() callback,
no less...  There's bluetooth, assuming that pretty much the same
(that if it got a file descriptor that resolves to a socket, it must
be a bluetooth one).  BTW, I wonder what will happen if one gives
iscsi_sw_tcp_conn_bind() descriptor of a socket of sufficiently
weird sort...

Then there's staging/usbip with its sockfd_to_socket(), which is more or
less parallel to sockfd_lookup().  And open-coded analogs in nbd and
ncpfs...

> > I've also pushed those (on top of old ocfs2 fix) into vfs.git#for-linus,
> > if you prefer to read it that way.  Should propagate in a few...
> 
> Should I pull?
> 
> I also get the feeling that the first patch should likely be marked
> for stable. Hmm?

It should; I'll mark it such when I send a pull request.  I really want
to sort the situation with sockfd_lookup() and friends out, though - at
least to the point where I would understand how painful the fixes
will be.

[tip:x86/urgent] x86, trace: Fix CR2 corruption when tracing page faults

2014-03-04 Thread tip-bot for Jiri Olsa

Commit-ID:  0ac09f9f8cd1fb028a48330edba6023d347d3cea
Gitweb: http://git.kernel.org/tip/0ac09f9f8cd1fb028a48330edba6023d347d3cea
Author: Jiri Olsa 
AuthorDate: Fri, 28 Feb 2014 17:05:26 +0100
Committer:  H. Peter Anvin 
CommitDate: Tue, 4 Mar 2014 16:00:14 -0800

x86, trace: Fix CR2 corruption when tracing page faults

The trace_do_page_fault function triggers a tracepoint
and then handles the actual page fault.

This could lead to an error if the tracepoint itself caused a page
fault: the original cr2 value gets lost and the original
page fault handler kills the current process with SIGSEGV.

This happens if you record page faults with callchain
data; reading the user part of the callchain causes the tracepoint
handler to page fault:

  # perf record -g -e exceptions:page_fault_user ls

Fix this by saving the original cr2 value
and using it after the tracepoint handler is done.

v2: Move the cr2 read before exception_enter, because
it could trigger a tracepoint as well.

Reported-by: Arnaldo Carvalho de Melo 
Reported-by: Vince Weaver 
Tested-by: Vince Weaver 
Acked-by: Steven Rostedt 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Seiji Aguchi 
Signed-off-by: H. Peter Anvin 
Link: http://lkml.kernel.org/r/alpine.deb.2.10.1402211701380.6...@vincent-weaver-1.um.maine.edu
Link: http://lkml.kernel.org/r/20140228160526.gd1...@krava.brq.redhat.com
---
 arch/x86/mm/fault.c | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 6dea040..e7fa28b 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1022,11 +1022,11 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
  * routines.
  */
 static void __kprobes
-__do_page_fault(struct pt_regs *regs, unsigned long error_code)
+__do_page_fault(struct pt_regs *regs, unsigned long error_code,
+   unsigned long address)
 {
struct vm_area_struct *vma;
struct task_struct *tsk;
-   unsigned long address;
struct mm_struct *mm;
int fault;
unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
@@ -1034,9 +1034,6 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code)
tsk = current;
mm = tsk->mm;
 
-   /* Get the faulting address: */
-   address = read_cr2();
-
/*
 * Detect and handle instructions that would cause a page fault for
 * both a tracked kernel page and a userspace page.
@@ -1252,9 +1249,11 @@ dotraplinkage void __kprobes
 do_page_fault(struct pt_regs *regs, unsigned long error_code)
 {
enum ctx_state prev_state;
+   /* Get the faulting address: */
+   unsigned long address = read_cr2();
 
prev_state = exception_enter();
-   __do_page_fault(regs, error_code);
+   __do_page_fault(regs, error_code, address);
exception_exit(prev_state);
 }
 
@@ -1271,9 +1270,16 @@ dotraplinkage void __kprobes
 trace_do_page_fault(struct pt_regs *regs, unsigned long error_code)
 {
enum ctx_state prev_state;
+   /*
+* The exception_enter and tracepoint processing could
+* trigger another page faults (user space callchain
+* reading) and destroy the original cr2 value, so read
+* the faulting address now.
+*/
+   unsigned long address = read_cr2();
 
prev_state = exception_enter();
trace_page_fault_entries(regs, error_code);
-   __do_page_fault(regs, error_code);
+   __do_page_fault(regs, error_code, address);
exception_exit(prev_state);
 }
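
The ordering problem fixed by this commit can be reproduced in miniature
without any hardware. In the sketch below a plain global variable stands in
for CR2 and a function that overwrites it stands in for a tracepoint handler
that faults. All names (fake_cr2, faulting_tracepoint, handle_fault_late_read,
handle_fault_early_read, simulate_fault) are made up for illustration and do
not exist in the kernel.

```c
#include <assert.h>

/* Simulated fault-address register; stands in for CR2 (illustrative only). */
static unsigned long fake_cr2;

/* Simulated tracepoint that itself faults and overwrites the register. */
static void faulting_tracepoint(void)
{
	fake_cr2 = 0xdeadUL; /* address of the nested fault */
}

/* Buggy order: the register is read after the tracepoint has run. */
static unsigned long handle_fault_late_read(void)
{
	faulting_tracepoint();
	return fake_cr2;
}

/* Fixed order: snapshot the register first, then run the tracepoint. */
static unsigned long handle_fault_early_read(void)
{
	unsigned long address = fake_cr2;

	faulting_tracepoint();
	return address;
}

/* Raise a simulated fault at addr and run the given handler. */
static unsigned long simulate_fault(unsigned long addr,
				    unsigned long (*handler)(void))
{
	assert(handler);
	fake_cr2 = addr;
	return handler();
}
```

With a fault at 0x1000, the late-read handler reports the nested fault's
address while the early-read handler reports 0x1000, which mirrors why the
patch reads cr2 before exception_enter() and the tracepoint.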

Re: [PATCH v4 5/6] timerfd: Add support for deferrable timers

2014-03-04 Thread Thomas Gleixner

On Tue, 4 Mar 2014, Andy Lutomirski wrote:
> On Tue, Mar 4, 2014 at 2:11 PM, Thomas Gleixner  wrote:
> > We do not add another random special case syscall for timerfd just
> > because timerfd is linux specific.
> 
> What syscalls?  I can think of exactly two timer interfaces that
> actually accept a clock id and flags: clock_nanosleep and
> timerfd_settime.

Sure, and what you can think of is reality?

 sys_timer_settime() which relies on sys_timer_create() are outside
 your universe, right?

And no. We are not adding timer_list mess back to any of them.

Aside from that, if you want to make the slack thing useful on a per
call basis then you want to add it to a lot of other interfaces like
poll.

And you are completely ignoring the fact that the slack works
completely differently:

A slacked timer still gets enqueued into the main timer queue. It just
relies on the fact that it gets batched with some other expiring
timer. But that's completely different from the deferrable approach.

   start_timer(timer, expiry, slack);

   timer.hard_expiry = expiry + slack;
   timer.soft_expiry = expiry;
   enqueue_timer(timer, timer.hard_expiry);

The enqueueing code puts it into the queue by looking at the
timer.hard_expiry value. And the expiry code looks at the
timer.soft_expiry value to expire a timer early.

Now assume the following:

   start_timer(timer, +100ms, 100s);

So that puts that timer into the hard expiry line of 100.1 sec from
now. So if the cpu is busy and is firing a lot of timers then your
timer could be delayed up to the hard expiry time, i.e. 100.1 seconds
from now, which has completely different semantics than the
deferrable timers.
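
The arithmetic of that example can be written down as a toy model. This
is a sketch only, not the hrtimer code: struct toy_timer and the toy_*()
helpers are made-up names, and times are plain nanosecond counts starting
at now = 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NSEC_PER_MSEC 1000000ULL
#define TOY_NSEC_PER_SEC  1000000000ULL

struct toy_timer {
	uint64_t soft_expiry;	/* earliest point at which the timer may fire */
	uint64_t hard_expiry;	/* latest point; also the key used for queueing */
};

/* Mirrors the start_timer(timer, expiry, slack) pseudocode above. */
static void toy_start_timer(struct toy_timer *timer, uint64_t now,
			    uint64_t expiry, uint64_t slack)
{
	timer->soft_expiry = now + expiry;
	timer->hard_expiry = timer->soft_expiry + slack;
}

/* When the expiry code runs for any reason: may this timer go with the batch? */
static bool toy_can_expire(const struct toy_timer *timer, uint64_t now)
{
	return now >= timer->soft_expiry;
}

/* The only point the queue position itself guarantees. */
static uint64_t toy_worst_case_expiry(const struct toy_timer *timer)
{
	return timer->hard_expiry;
}

/* Usage: start_timer(timer, +100ms, 100s) from the example above. */
void toy_slack_selfcheck(void)
{
	struct toy_timer timer;

	toy_start_timer(&timer, 0, 100 * TOY_NSEC_PER_MSEC, 100 * TOY_NSEC_PER_SEC);
	assert(!toy_can_expire(&timer, 50 * TOY_NSEC_PER_MSEC));
	assert(toy_can_expire(&timer, 100 * TOY_NSEC_PER_MSEC));
	/* 100 ms + 100 s = 100.1 s */
	assert(toy_worst_case_expiry(&timer) == 100100 * TOY_NSEC_PER_MSEC);
}
```

The model only covers the slack case. It says nothing about idle state,
which is exactly the part the deferrable timers care about.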

The deferrable timer is guaranteed to expire (halfways) on time when
the system is active and does not affect the system from going idle,
but it expires right away when the system comes back out of idle.

The slack timers are just a batching mechanism to align expiry times
of non deferrable timers to a common time.

So how do you map those together?

I'm not saying that a per timer slack is useless, but it does not
solve the issue of deferrable timers.

Quite the contrary, it would be simpler to implement the slacked
timers as a special case of the deferrable timers. But hell no, we are
not going to go there.

> > But we cannot do that right now as we cannot whip up several dozen
> > new syscalls just because we want to add slack/deferrable whatever
> > properties.

> Two syscalls, right?

It does not matter at all how many syscalls this affects. We are not
adding any random new syscalls just because we can.

> Once we agree on a solution to the Y2038 issue on 32bit with a unified
> 32/64 bit syscall interface which simply gets rid of the timespec/val
> nonsense and takes a simple u64 nsec value we can add the slack
> property to that without any further inconvenience.

Ignoring this won't get you anywhere.

Thanks,

tglx

[PATCH 1/3] ARM: dove: drop pinctrl PMU reg property

2014-03-04 Thread Sebastian Hesselbarth

Marvell Dove's pinctrl does require some PMU regs for muxing PMU
functions to MPP pins. Recently, a discussion started about consolidating
Power Management Unit (PMU) into a single DT node. As we don't want
any more DT ABI in the way, drop the corresponding reg property from
the pinctrl node now. The driver will derive the registers from existing
reg properties.

Signed-off-by: Sebastian Hesselbarth 
---
Cc: Rob Herring  
Cc: Pawel Moll  
Cc: Mark Rutland  
Cc: Ian Campbell  
Cc: Kumar Gala  
Cc: Russell King  
Cc: Jason Cooper  
Cc: Andrew Lunn 
Cc: Gregory Clement 
Cc: Linus Walleij 
Cc: devicet...@vger.kernel.org 
Cc: linux-arm-ker...@lists.infradead.org 
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/boot/dts/dove.dtsi | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/dove.dtsi b/arch/arm/boot/dts/dove.dtsi
index b6fc27f8ed66..3b891dd20993 100644
--- a/arch/arm/boot/dts/dove.dtsi
+++ b/arch/arm/boot/dts/dove.dtsi
@@ -395,8 +395,7 @@
pinctrl: pin-ctrl@d0200 {
compatible = "marvell,dove-pinctrl";
reg = <0xd0200 0x14>,
- <0xd0440 0x04>,
- <0xd802c 0x08>;
+ <0xd0440 0x04>;
clocks = <&gate_clk 22>;
 
pmx_gpio_0: pmx-gpio-0 {
-- 
1.8.5.3


Re: [PATCH net RESEND] vlan: don't allow to add VLAN on VLAN device

2014-03-04 Thread Ben Hutchings

On Thu, 2014-02-27 at 19:45 -0800, John Fastabend wrote:
> On 2/27/2014 6:43 PM, Ding Tianhong wrote:
> > I run these steps:
> >
> > modprobe 8021q
> > vconfig add eth2 20
> > vconfig add eth2.20 20
> > ifconfig eth2 xx.xx.xx.xx
> >
> > then the Call Trace happened:
> >
> 
> [...]
> 
> > 
> >
> > The reason is that if add vlan on vlan dev, the vlan dev will create
> > vlan_info, then the notification will let the real dev to run
> > dev_set_rx_mode() and hold netif_addr_lock, and then the real dev will
> > call ndo_set_rx_mode(), if the real dev is vlan dev, the
> > ndo_set_rx_mode() will hold netif_addr_lock again, so deadlock
> > happened.
> >
> > Don't allow to add vlan on vlan dev to fix this problem.
> >
> > Signed-off-by: Ding Tianhong 
> > ---
> 
> I'm not sure we can just disable stacked vlans. There might be something
> using them today and they have worked in the past. Let's try to find a
> better fix.

I don't think there's any deadlock possible here.  We try to acquire the
addr_list_lock for eth2.20, then the addr_list_lock for eth2.  We never
try to acquire them in the opposite order.  The fix would involve
telling lockdep about lock ordering between stacked net_devices (I have
no idea how that's done).
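
As a rough illustration of why such an annotation would matter, here is a
toy checker that tracks locks by class rather than by instance. It is a
sketch under that assumption only: struct toy_lock, toy_acquire() and the
subclass field are made up for this example and are not lockdep's data
structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Every lock belongs to a class; subclass is an optional nesting level. */
struct toy_lock {
	int class_id;
	int subclass;
};

#define TOY_MAX_HELD 8

struct toy_held_locks {
	const struct toy_lock *held[TOY_MAX_HELD];
	size_t depth;
};

/*
 * Returns false when the checker would complain: taking a lock whose
 * (class, subclass) pair matches one already held looks like recursive
 * locking, even if the two lock instances are different.
 */
static bool toy_acquire(struct toy_held_locks *ctx, const struct toy_lock *lock)
{
	size_t i;

	for (i = 0; i < ctx->depth; i++) {
		if (ctx->held[i]->class_id == lock->class_id &&
		    ctx->held[i]->subclass == lock->subclass)
			return false;
	}
	if (ctx->depth == TOY_MAX_HELD)
		return false;
	ctx->held[ctx->depth++] = lock;
	return true;
}

/* Usage: the same acquisition order, without and with a nesting level. */
void toy_lock_class_selfcheck(void)
{
	struct toy_lock eth2 = { .class_id = 1, .subclass = 0 };
	struct toy_lock vlan_same = { .class_id = 1, .subclass = 0 };
	struct toy_lock vlan_nested = { .class_id = 1, .subclass = 1 };
	struct toy_held_locks ctx = { .depth = 0 };

	/* eth2.20 then eth2, both in one class: flagged, although harmless. */
	assert(toy_acquire(&ctx, &vlan_same));
	assert(!toy_acquire(&ctx, &eth2));

	/* Same order, but the stacked device carries its own nesting level. */
	ctx.depth = 0;
	assert(toy_acquire(&ctx, &vlan_nested));
	assert(toy_acquire(&ctx, &eth2));
}
```

In this model the acquisition order never changes. Only the extra nesting
level stops the checker from treating two distinct locks as one.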

Ben.

-- 
Ben Hutchings
For every complex problem
there is a solution that is simple, neat, and wrong.



Re: [PATCH] ACPI / hotplug / PCI: Use pci_device_is_present()

2014-03-04 Thread Rafael J. Wysocki

On Tuesday, March 04, 2014 10:23:58 AM Mika Westerberg wrote:
> On Mon, Mar 03, 2014 at 01:19:25AM +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> > 
> > Make the ACPI-based PCI hotplug (ACPIPHP) code use
> > pci_device_is_present() for checking if devices are present instead
> > of open coding the same thing.
> > 
> > Signed-off-by: Rafael J. Wysocki 
> 
> Reviewed-by: Mika Westerberg 

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [PATCH 31/48] uv: Replace __get_cpu_var

2014-03-04 Thread H. Peter Anvin

On 03/04/2014 03:02 PM, Andrew Morton wrote:
> On Fri, 14 Feb 2014 14:19:12 -0600 Christoph Lameter  wrote:
> 
>> Use __this_cpu_read instead.
>>
>>
>> --- linux.orig/arch/x86/include/asm/uv/uv_hub.h  2014-02-03 14:16:53.987889372 -0600
>> +++ linux/arch/x86/include/asm/uv/uv_hub.h   2014-02-03 14:16:53.987889372 -0600
>> @@ -618,7 +618,7 @@
>>  };
>>  
>>  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
>> -#define uv_cpu_nmi  (__get_cpu_var(__uv_cpu_nmi))
>> +#define uv_cpu_nmi  __this_cpu_read(_uv_cpu_nmi)
> 
> arch/x86/platform/uv/uv_nmi.c: In function 'uv_check_nmi':
> arch/x86/platform/uv/uv_nmi.c:218: error: '_uv_cpu_nmi' undeclared (first use in this function)
> arch/x86/platform/uv/uv_nmi.c:218: error: (Each undeclared identifier is reported only once
> arch/x86/platform/uv/uv_nmi.c:218: error: for each function it appears in.)
> 
> 
> This?
> 

More likely just add the missing second underscore.

-hpa


Re: [PATCH] ACPI / hotplug: Rework deferred execution of acpi_device_hotplug()

2014-03-04 Thread Rafael J. Wysocki

On Tuesday, March 04, 2014 12:53:05 PM Toshi Kani wrote:
> On Sat, 2014-03-01 at 20:57 +, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> > 
> > Since the only function executed by acpi_hotplug_execute() is
> > acpi_device_hotplug() and it only is called by the ACPI core,
> > simplify its definition so that it only takes two arguments, the
> > ACPI device object pointer and event code, rename it to
> > acpi_hotplug_schedule() and move its header from acpi_bus.h to
> > the ACPI core's internal header file internal.h.  Modify the
> > definition of acpi_device_hotplug() so that its first argument is
> > an ACPI device object pointer and modify the definition of
> > struct acpi_hp_work accordingly.
> > 
> > Signed-off-by: Rafael J. Wysocki 
> 
> The change looks good to me.  I wonder if acpi_hotplug_schedule() should
> still be in acpi/osl.c after this change, though.

Well, not necessarily. :-)

Still, that'd be a separate patch anyway.

> Acked-by: Toshi Kani 

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread David Miller

From: ebied...@xmission.com (Eric W. Biederman)
Date: Tue, 04 Mar 2014 14:41:16 -0800

> If we really want the ability to always append to the queue of skb's
> is to just have a version of netlink_send_skb that ignores the queued
> limits.  Of course an evil program then could force the generation of
> enough audit records to DOS the kernel, but we seem to be in that
> situation now.  Shrug.

There is never a valid reason to bypass the socket limits.

It protects the system from things going out of control.

Netlink packet sends can fail, and audit should cope with that
event instead of trying to bludgeon it into not happening.

Re: [PATCH V6 Resend 1/5] cpufreq: suspend governors on system suspend/hibernate

2014-03-04 Thread Rafael J. Wysocki

On Tuesday, March 04, 2014 11:00:26 AM Viresh Kumar wrote:
> This patch adds cpufreq suspend/resume calls to dpm_{suspend|resume}() for
> handling suspend/resume of cpufreq governors.
> 
> Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found an issue where tunables
> configuration for clusters/sockets with non-boot CPUs was getting lost after
> suspend/resume, as we were notifying governors with CPUFREQ_GOV_POLICY_EXIT on
> removal of the last cpu for that policy and so deallocating memory for tunables.
> This is fixed by this patch as we don't allow any operation on governors after
> device suspend and before device resume now.
> 
> We could have added these callbacks at dpm_{suspend|resume}_noirq() level but
> the problem here is that most of the devices (i.e. devices with ->suspend()
> callbacks) have already been suspended by now and so if drivers want to change
> frequency before suspending, then it might not be possible for many platforms
> (which depend on other peripherals like i2c, regulators, etc).
> 
> Reported-and-tested-by: Lan Tianyu 
> Reported-by: Jinhyuk Choi 
> Signed-off-by: Viresh Kumar 
> ---
> V5->V6: 1-2-3/7 merged into 1/5
> 
>  drivers/base/power/main.c |   5 +++
>  drivers/cpufreq/cpufreq.c | 111 
> +++---
>  include/linux/cpufreq.h   |   8 
>  3 files changed, 69 insertions(+), 55 deletions(-)
> 
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index 42355e4..86d5e4f 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -866,6 +867,8 @@ void dpm_resume(pm_message_t state)
>   mutex_unlock(&dpm_list_mtx);
>   async_synchronize_full();
>   dpm_show_time(starttime, state, NULL);
> +
> + cpufreq_resume();
>  }
>  
>  /**
> @@ -1434,6 +1437,8 @@ int dpm_suspend(pm_message_t state)
>  
>   might_sleep();
>  
> + cpufreq_suspend();
> +
>   mutex_lock(&dpm_list_mtx);
>   pm_transition = state;
>   async_error = 0;
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 56b7b1b..2e43c08 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -26,7 +26,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  
> @@ -47,6 +47,9 @@ static LIST_HEAD(cpufreq_policy_list);
>  static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
>  #endif
>  
> +/* Flag to suspend/resume CPUFreq governors */
> +static bool cpufreq_suspended;
> +
>  static inline bool has_target(void)
>  {
>   return cpufreq_driver->target_index || cpufreq_driver->target;
> @@ -1576,82 +1579,77 @@ static struct subsys_interface cpufreq_interface = {
>  };
>  
>  /**
> - * cpufreq_bp_suspend - Prepare the boot CPU for system suspend.
> + * cpufreq_suspend() - Suspend CPUFreq governors
>   *
> - * This function is only executed for the boot processor.  The other CPUs
> - * have been put offline by means of CPU hotplug.
> + * Called during system wide Suspend/Hibernate cycles for suspending governors
> + * as some platforms can't change frequency after this point in suspend cycle.
> + * Because some of the devices (like: i2c, regulators, etc) they use for
> + * changing frequency are suspended quickly after this point.
>   */
> -static int cpufreq_bp_suspend(void)
> +void cpufreq_suspend(void)
>  {
> - int ret = 0;
> -
> - int cpu = smp_processor_id();
>   struct cpufreq_policy *policy;
>  
> - pr_debug("suspending cpu %u\n", cpu);
> + if (!cpufreq_driver)
> + return;
>  
> - /* If there's no policy for the boot CPU, we have nothing to do. */
> - policy = cpufreq_cpu_get(cpu);
> - if (!policy)
> - return 0;
> + if (!has_target())
> + return;
>  
> - if (cpufreq_driver->suspend) {
> - ret = cpufreq_driver->suspend(policy);
> - if (ret)
> - printk(KERN_ERR "cpufreq: suspend failed in ->suspend "
> - "step on CPU %u\n", policy->cpu);
> + pr_debug("%s: Suspending Governors\n", __func__);
> +
> + list_for_each_entry(policy, &cpufreq_policy_list, policy_list) {
> + if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
> + pr_err("%s: Failed to stop governor for policy: %p\n",
> + __func__, policy);
> + else if (cpufreq_driver->suspend

So, previously the driver's ->suspend callback was executed with interrupts off
and it was only executed for the boot CPU.

Now, it is going to be executed for all CPUs and with interrupts on.

This sounds potentially dangerous, so did you check all drivers supplying
->suspend if they are fine with that?

And same for ->resume, of course.

> + && cpufreq_driver->suspend(policy))
> + pr_err("%s: Failed to sus

[PATCH RT 1/8] rcutree/rcu_bh_qs: disable irq while calling rcu_preempt_qs()

2014-03-04 Thread Steven Rostedt

3.8.13.14-rt28-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Tiejun Chen 

Any callers to the function rcu_preempt_qs() must disable irqs in
order to protect the assignment to ->rcu_read_unlock_special. In
RT case, rcu_bh_qs() as the wrapper of rcu_preempt_qs() is called
in some scenarios where irq is enabled, like this path,

do_single_softirq()
    |
    + local_irq_enable();
    + handle_softirq()
    |    |
    |    + rcu_bh_qs()
    |    |
    |    + rcu_preempt_qs()
    |
    + local_irq_disable()

So here we'd better disable irqs directly inside rcu_bh_qs() to fix
this, otherwise the kernel may occasionally freeze, as observed. This
approach is also safe for any rcu_bh_qs() callers that may be added
elsewhere in the future.

Cc: stable...@vger.kernel.org
Signed-off-by: Tiejun Chen 
Signed-off-by: Bin Jiang 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 kernel/rcutree.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 7ec834d..6f6d133 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -186,7 +186,12 @@ static void rcu_preempt_qs(int cpu);
 
 void rcu_bh_qs(int cpu)
 {
+   unsigned long flags;
+
+   /* Callers to this function, rcu_preempt_qs(), must disable irqs. */
+   local_irq_save(flags);
rcu_preempt_qs(cpu);
+   local_irq_restore(flags);
 }
 #else
 void rcu_bh_qs(int cpu)
-- 
1.8.5.3



[PATCH RT 4/8] kernel/hrtimer: be non-freezeable in cpu_chill()

2014-03-04 Thread Steven Rostedt

3.8.13.14-rt28-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

Since we replaced msleep() by hrtimer I see now and then (rarely) this:

| [] Waiting for /dev to be fully populated...
| =
| [ BUG: udevd/229 still has locks held! ]
| 3.12.11-rt17 #23 Not tainted
| -
| 1 lock held by udevd/229:
|  #0:  (&type->i_mutex_dir_key#2){+.+.+.}, at: lookup_slow+0x28/0x98
|
| stack backtrace:
| CPU: 0 PID: 229 Comm: udevd Not tainted 3.12.11-rt17 #23
| (unwind_backtrace+0x0/0xf8) from (show_stack+0x10/0x14)
| (show_stack+0x10/0x14) from (dump_stack+0x74/0xbc)
| (dump_stack+0x74/0xbc) from (do_nanosleep+0x120/0x160)
| (do_nanosleep+0x120/0x160) from (hrtimer_nanosleep+0x90/0x110)
| (hrtimer_nanosleep+0x90/0x110) from (cpu_chill+0x30/0x38)
| (cpu_chill+0x30/0x38) from (dentry_kill+0x158/0x1ec)
| (dentry_kill+0x158/0x1ec) from (dput+0x74/0x15c)
| (dput+0x74/0x15c) from (lookup_real+0x4c/0x50)
| (lookup_real+0x4c/0x50) from (__lookup_hash+0x34/0x44)
| (__lookup_hash+0x34/0x44) from (lookup_slow+0x38/0x98)
| (lookup_slow+0x38/0x98) from (path_lookupat+0x208/0x7fc)
| (path_lookupat+0x208/0x7fc) from (filename_lookup+0x20/0x60)
| (filename_lookup+0x20/0x60) from (user_path_at_empty+0x50/0x7c)
| (user_path_at_empty+0x50/0x7c) from (user_path_at+0x14/0x1c)
| (user_path_at+0x14/0x1c) from (vfs_fstatat+0x48/0x94)
| (vfs_fstatat+0x48/0x94) from (SyS_stat64+0x14/0x30)
| (SyS_stat64+0x14/0x30) from (ret_fast_syscall+0x0/0x48)

For now I see no better way than to disable the freezer for the duration of the sleep.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 kernel/hrtimer.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 2f023aa..2e66fbb 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1861,8 +1861,12 @@ void cpu_chill(void)
struct timespec tu = {
.tv_nsec = NSEC_PER_MSEC,
};
+   unsigned int freeze_flag = current->flags & PF_NOFREEZE;
 
+   current->flags |= PF_NOFREEZE;
hrtimer_nanosleep(&tu, NULL, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
+   if (!freeze_flag)
+   current->flags &= ~PF_NOFREEZE;
 }
 EXPORT_SYMBOL(cpu_chill);
 #endif
-- 
1.8.5.3
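
The flag handling in the hunk above follows a common save, set,
conditionally clear pattern. Below is a user-space sketch of just that
pattern. The TOY_PF_NOFREEZE value, struct toy_task and toy_sleep() are
illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PF_NOFREEZE 0x8000u	/* illustrative bit, not from kernel headers */

struct toy_task {
	unsigned int flags;
};

/* Records whether the flag was set while the task was "sleeping". */
static bool toy_nofreeze_seen_in_sleep;

/* Stand-in for the sleep itself; it only observes the flag. */
static void toy_sleep(const struct toy_task *task)
{
	toy_nofreeze_seen_in_sleep = (task->flags & TOY_PF_NOFREEZE) != 0;
}

static void toy_cpu_chill(struct toy_task *task)
{
	/* Remember whether the caller already had the flag set. */
	unsigned int freeze_flag = task->flags & TOY_PF_NOFREEZE;

	task->flags |= TOY_PF_NOFREEZE;
	toy_sleep(task);
	/* Clear the flag only if this function was the one that set it. */
	if (!freeze_flag)
		task->flags &= ~TOY_PF_NOFREEZE;
}

/* Usage: flags come back unchanged in both cases. */
void toy_nofreeze_selfcheck(void)
{
	struct toy_task plain = { .flags = 0x1u };
	struct toy_task nofreeze = { .flags = 0x1u | TOY_PF_NOFREEZE };

	toy_cpu_chill(&plain);
	assert(toy_nofreeze_seen_in_sleep);
	assert(plain.flags == 0x1u);

	toy_cpu_chill(&nofreeze);
	assert(toy_nofreeze_seen_in_sleep);
	assert(nofreeze.flags == (0x1u | TOY_PF_NOFREEZE));
}
```

Clearing the bit unconditionally would silently drop PF_NOFREEZE from a
task that had it set before the call, which is what the saved
freeze_flag avoids.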



[PATCH RT 0/8] Linux 3.8.13.14-rt28-rc1

2014-03-04 Thread Steven Rostedt


Dear RT Folks,

This is the RT stable review cycle of patch 3.8.13.14-rt28-rc1.

Please scream at me if I messed something up. Please test the patches too.

The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).

The pre-releases will not be pushed to the git repository, only the
final release is.

If all goes well, this patch will be converted to the next main release
on 3/7/2014.

Enjoy,

-- Steve


To build 3.8.13.14-rt28-rc1 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.8.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.8.13.14.xz

  
http://www.kernel.org/pub/linux/kernel/projects/rt/3.8/patch-3.8.13.14-rt28-rc1.patch.xz

You can also build from 3.8.13.14-rt27 by applying the incremental patch:

http://www.kernel.org/pub/linux/kernel/projects/rt/3.8/incr/patch-3.8.13.14-rt27-rt28-rc1.patch.xz


Changes from 3.8.13.14-rt27:

---


Nicholas Mc Guire (1):
  net: ip_send_unicast_reply: add missing local serialization

Paul E. McKenney (1):
  rcu: Eliminate softirq processing from rcutree

Sebastian Andrzej Siewior (3):
  Revert "x86: Disable IST stacks for debug/int 3/stack fault for PREEMPT_RT"
  kernel/hrtimer: be non-freezeable in cpu_chill()
  arm/unwind: use a raw_spin_lock

Steven Rostedt (1):
  rt: Make cpu_chill() use hrtimer instead of msleep()

Steven Rostedt (Red Hat) (1):
  Linux 3.8.13.14-rt28-rc1

Tiejun Chen (1):
  rcutree/rcu_bh_qs: disable irq while calling rcu_preempt_qs()


 arch/arm/kernel/unwind.c |  14 ++--
 arch/x86/include/asm/page_64_types.h |  21 ++
 arch/x86/kernel/cpu/common.c |   2 -
 arch/x86/kernel/dumpstack_64.c   |   4 --
 include/linux/delay.h|   2 +-
 kernel/hrtimer.c |  19 +
 kernel/rcutree.c | 119 +++
 kernel/rcutree.h |   3 +-
 kernel/rcutree_plugin.h  | 133 ---
 localversion-rt  |   2 +-
 net/ipv4/ip_output.c |   9 ++-
 11 files changed, 159 insertions(+), 169 deletions(-)

[PATCH RT 5/8] arm/unwind: use a raw_spin_lock

2014-03-04 Thread Steven Rostedt

3.8.13.14-rt28-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

Mostly unwinding is done with irqs enabled; however, SLUB may call it with
irqs disabled while creating a new SLUB cache.

I had a system freeze while loading a module which called
kmem_cache_create() on init. That means SLUB's __slab_alloc() disabled
interrupts and then

->new_slab_objects()
 ->new_slab()
  ->setup_object()
   ->setup_object_debug()
    ->init_tracking()
     ->set_track()
      ->save_stack_trace()
       ->save_stack_trace_tsk()
        ->walk_stackframe()
         ->unwind_frame()
          ->unwind_find_idx()
           =>spin_lock_irqsave(&unwind_lock);

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 arch/arm/kernel/unwind.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/kernel/unwind.c b/arch/arm/kernel/unwind.c
index 00df012..bbafc67 100644
--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -87,7 +87,7 @@ extern const struct unwind_idx __start_unwind_idx[];
 static const struct unwind_idx *__origin_unwind_idx;
 extern const struct unwind_idx __stop_unwind_idx[];
 
-static DEFINE_SPINLOCK(unwind_lock);
+static DEFINE_RAW_SPINLOCK(unwind_lock);
 static LIST_HEAD(unwind_tables);
 
 /* Convert a prel31 symbol to an absolute address */
@@ -195,7 +195,7 @@ static const struct unwind_idx *unwind_find_idx(unsigned long addr)
/* module unwind tables */
struct unwind_table *table;
 
-   spin_lock_irqsave(&unwind_lock, flags);
+   raw_spin_lock_irqsave(&unwind_lock, flags);
list_for_each_entry(table, &unwind_tables, list) {
if (addr >= table->begin_addr &&
addr < table->end_addr) {
@@ -207,7 +207,7 @@ static const struct unwind_idx *unwind_find_idx(unsigned long addr)
break;
}
}
-   spin_unlock_irqrestore(&unwind_lock, flags);
+   raw_spin_unlock_irqrestore(&unwind_lock, flags);
}
 
pr_debug("%s: idx = %p\n", __func__, idx);
@@ -469,9 +469,9 @@ struct unwind_table *unwind_table_add(unsigned long start, unsigned long size,
tab->begin_addr = text_addr;
tab->end_addr = text_addr + text_size;
 
-   spin_lock_irqsave(&unwind_lock, flags);
+   raw_spin_lock_irqsave(&unwind_lock, flags);
list_add_tail(&tab->list, &unwind_tables);
-   spin_unlock_irqrestore(&unwind_lock, flags);
+   raw_spin_unlock_irqrestore(&unwind_lock, flags);
 
return tab;
 }
@@ -483,9 +483,9 @@ void unwind_table_del(struct unwind_table *tab)
if (!tab)
return;
 
-   spin_lock_irqsave(&unwind_lock, flags);
+   raw_spin_lock_irqsave(&unwind_lock, flags);
list_del(&tab->list);
-   spin_unlock_irqrestore(&unwind_lock, flags);
+   raw_spin_unlock_irqrestore(&unwind_lock, flags);
 
kfree(tab);
 }
-- 
1.8.5.3



[PATCH RT 8/8] Linux 3.8.13.14-rt28-rc1

2014-03-04 Thread Steven Rostedt

3.8.13.14-rt28-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: "Steven Rostedt (Red Hat)" 

---
 localversion-rt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index be1e37b..124baf7 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt27
+-rt28-rc1
-- 
1.8.5.3



Re: [PATCH 00/12] Thunderbolt hotplug support for Apple hardware (testers needed)

2014-03-04 Thread Matthew Garrett

On Wed, Mar 05, 2014 at 12:59:54AM +0100, Andreas Noever wrote:

> 
> I believe that the patch has the same effect as passing
> acpi_osi=! acpi_osi=Darwin
> to the kernel. The problem with that approach is that it changes the
> firmware behaviour quite a lot. In particular it prevents Linux from
> taking over pci hotplug control:

There's not really any way around this. The method to power up the chip 
will refuse to run unless the system claims Darwin and nothing else.

> I would prefer to find a solution that boots without acpi_osi=Darwin
> as it seems to trigger quite a lot of ACPI code. My current approach is
> to inject a custom OSDW method somewhere into the NHI namespace and to
> replace _PTS and _WAK from my driver. I can then wake the controller
> with the XRPE method. The last problem is that the PCI code does not
> allocate enough (or any) bus numbers below the hotplug ports. I'm
> trying to add some quirks to it but the code is not really made for
> that...

I don't think that's a workable approach - any change in the firmware 
implementation could break it. If we're going to insert quirks then I 
think it makes more sense to do it in the PCI layer rather than 
injecting things into ACPI.

-- 
Matthew Garrett | mj...@srcf.ucam.org

[PATCH RT 3/8] rt: Make cpu_chill() use hrtimer instead of msleep()

2014-03-04 Thread Steven Rostedt

3.8.13.14-rt28-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Steven Rostedt 

Ulrich Obergfell pointed out that cpu_chill() calls msleep() which is woken
up by the ksoftirqd running the TIMER softirq. But as the cpu_chill() is
called from softirq context, it may block the ksoftirqd() from running, in
which case, it may never wake up the msleep() causing the deadlock.

I checked the vmcore, and irq/74-qla2xxx is stuck in the msleep() call,
running on CPU 8. The one ksoftirqd that is stuck, happens to be the one that
runs on CPU 8, and it is blocked on a lock held by irq/74-qla2xxx. As that
ksoftirqd is the one that will wake up irq/74-qla2xxx, and it happens to be
blocked on a lock that irq/74-qla2xxx holds, we have our deadlock.

The solution is not to convert the cpu_chill() back to a cpu_relax() as that
will re-create a possible live lock that the cpu_chill() fixed earlier, and may
also leave this bug open on other softirqs. The fix is to remove the
dependency on ksoftirqd from cpu_chill(). That is, instead of calling
msleep() that requires ksoftirqd to wake it up, use the
hrtimer_nanosleep() code that does the wakeup from hard irq context.

|Looks to be the lock of the block softirq. I don't have the core dump
|anymore, but from what I could tell the ksoftirqd was blocked on the
|block softirq lock, where the block softirq handler did a msleep
|(called by the qla2xxx interrupt handler).
|
|Looking at trigger_softirq() in block/blk-softirq.c, it can do a
|smp_callfunction() to another cpu to run the block softirq. If that
|happens to be the cpu where the qla2xxx irq handler is doing the block
|softirq and is in a middle of a msleep(), I believe the ksoftirqd will
|try to run the softirq. If it does that, then BOOM, it's deadlocked
|because the ksoftirqd will never run the timer softirq either.

|I should have also stated that it was only one lock that was involved.
|But the lock owner was doing a msleep() that requires a wakeup by
|ksoftirqd to continue. If ksoftirqd happens to be blocked on a lock
|held by the msleep() caller, then you have your deadlock.
|
|It's best not to have any softirqs going to sleep requiring another
|softirq to wake it up. Note, if we ever require a timer softirq to do a
|cpu_chill() it will most definitely hit this deadlock.

Cc: stable...@vger.kernel.org
Found-by: Ulrich Obergfell 
Signed-off-by: Steven Rostedt 
[bigeasy: add the 4 | chapters from email]
Signed-off-by: Sebastian Andrzej Siewior 
---
 include/linux/delay.h |  2 +-
 kernel/hrtimer.c  | 15 +++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/delay.h b/include/linux/delay.h
index e23a7c0..37caab3 100644
--- a/include/linux/delay.h
+++ b/include/linux/delay.h
@@ -53,7 +53,7 @@ static inline void ssleep(unsigned int seconds)
 }
 
 #ifdef CONFIG_PREEMPT_RT_FULL
-# define cpu_chill()   msleep(1)
+extern void cpu_chill(void);
 #else
 # define cpu_chill()   cpu_relax()
 #endif
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index aa5eb4f..2f023aa 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1852,6 +1852,21 @@ SYSCALL_DEFINE2(nanosleep, struct timespec __user *, rqtp,
return hrtimer_nanosleep(&tu, rmtp, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
 }
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+/*
+ * Sleep for 1 ms in hope whoever holds what we want will let it go.
+ */
+void cpu_chill(void)
+{
+   struct timespec tu = {
+   .tv_nsec = NSEC_PER_MSEC,
+   };
+
+   hrtimer_nanosleep(&tu, NULL, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
+}
+EXPORT_SYMBOL(cpu_chill);
+#endif
+
 /*
  * Functions related to boot-time initialization:
  */
-- 
1.8.5.3
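
For readers without a kernel tree at hand, the shape of the new
cpu_chill() (a relative 1 ms sleep against CLOCK_MONOTONIC) has a close
user-space analogue in POSIX clock_nanosleep(). The sketch below is only
that analogue. toy_chill_1ms() is a made-up name, and the wakeup path in
user space is of course not the one discussed in the changelog.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define TOY_NSEC_PER_MSEC 1000000L

/* Sleep for 1 ms relative to now, measured on CLOCK_MONOTONIC. */
static int toy_chill_1ms(void)
{
	struct timespec req = { .tv_sec = 0, .tv_nsec = TOY_NSEC_PER_MSEC };
	struct timespec rem;
	int err;

	/* clock_nanosleep() returns the error number directly, not via errno. */
	while ((err = clock_nanosleep(CLOCK_MONOTONIC, 0, &req, &rem)) == EINTR)
		req = rem;	/* interrupted by a signal: sleep the remainder */

	return err;
}

/* Usage: at least 1 ms must have passed on the same clock. */
void toy_chill_selfcheck(void)
{
	struct timespec before, after;
	long long elapsed_ns;
	int ret;

	ret = clock_gettime(CLOCK_MONOTONIC, &before);
	assert(ret == 0);
	ret = toy_chill_1ms();
	assert(ret == 0);
	ret = clock_gettime(CLOCK_MONOTONIC, &after);
	assert(ret == 0);

	elapsed_ns = (long long)(after.tv_sec - before.tv_sec) * 1000000000LL +
		     (after.tv_nsec - before.tv_nsec);
	assert(elapsed_ns >= TOY_NSEC_PER_MSEC);
}
```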



[PATCH RT 7/8] rcu: Eliminate softirq processing from rcutree

2014-03-04 Thread Steven Rostedt

3.8.13.14-rt28-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: "Paul E. McKenney" 

Running RCU out of softirq is a problem for some workloads that would
like to manage RCU core processing independently of other softirq work,
for example, setting kthread priority.  This commit therefore moves the
RCU core work from softirq to a per-CPU/per-flavor SCHED_OTHER kthread
named rcuc.  The SCHED_OTHER approach avoids the scalability problems
that appeared with the earlier attempt to move RCU core processing
from softirq to kthreads.  That said, kernels built with RCU_BOOST=y
will run the rcuc kthreads at the RCU-boosting priority.

Cc: stable...@vger.kernel.org
Reported-by: Thomas Gleixner 
Tested-by: Mike Galbraith 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 kernel/rcutree.c| 114 -
 kernel/rcutree.h|   3 +-
 kernel/rcutree_plugin.h | 133 +---
 3 files changed, 114 insertions(+), 136 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 6f6d133..6a34f87 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -53,6 +53,11 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
+#include "time/tick-internal.h"
 
 #include "rcutree.h"
 #include 
@@ -127,8 +132,6 @@ EXPORT_SYMBOL_GPL(rcu_scheduler_active);
  */
 static int rcu_scheduler_fully_active __read_mostly;
 
-#ifdef CONFIG_RCU_BOOST
-
 /*
  * Control variables for per-CPU and per-rcu_node kthreads.  These
  * handle all flavors of RCU.
@@ -138,8 +141,6 @@ DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
 DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
 DEFINE_PER_CPU(char, rcu_cpu_has_work);
 
-#endif /* #ifdef CONFIG_RCU_BOOST */
-
 static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu);
 static void invoke_rcu_core(void);
 static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);
@@ -2048,16 +2049,14 @@ __rcu_process_callbacks(struct rcu_state *rsp)
 /*
  * Do RCU core processing for the current CPU.
  */
-static void rcu_process_callbacks(struct softirq_action *unused)
+static void rcu_process_callbacks(void)
 {
struct rcu_state *rsp;
 
if (cpu_is_offline(smp_processor_id()))
return;
-   trace_rcu_utilization("Start RCU core");
for_each_rcu_flavor(rsp)
__rcu_process_callbacks(rsp);
-   trace_rcu_utilization("End RCU core");
 }
 
 /*
@@ -2071,17 +2070,105 @@ static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
 {
if (unlikely(!ACCESS_ONCE(rcu_scheduler_fully_active)))
return;
-   if (likely(!rsp->boost)) {
-   rcu_do_batch(rsp, rdp);
-   return;
-   }
-   invoke_rcu_callbacks_kthread();
+   rcu_do_batch(rsp, rdp);
 }
 
+static void rcu_wake_cond(struct task_struct *t, int status)
+{
+   /*
+* If the thread is yielding, only wake it when this
+* is invoked from idle
+*/
+   if (t && (status != RCU_KTHREAD_YIELDING || is_idle_task(current)))
+   wake_up_process(t);
+}
+
+/*
+ * Wake up this CPU's rcuc kthread to do RCU core processing.
+ */
 static void invoke_rcu_core(void)
 {
-   raise_softirq(RCU_SOFTIRQ);
+   unsigned long flags;
+   struct task_struct *t;
+
+   if (!cpu_online(smp_processor_id()))
+   return;
+   local_irq_save(flags);
+   __this_cpu_write(rcu_cpu_has_work, 1);
+   t = __this_cpu_read(rcu_cpu_kthread_task);
+   if (t != NULL && current != t)
+   rcu_wake_cond(t, __this_cpu_read(rcu_cpu_kthread_status));
+   local_irq_restore(flags);
+}
+
+static void rcu_cpu_kthread_park(unsigned int cpu)
+{
+   per_cpu(rcu_cpu_kthread_status, cpu) = RCU_KTHREAD_OFFCPU;
+}
+
+static int rcu_cpu_kthread_should_run(unsigned int cpu)
+{
+   return __this_cpu_read(rcu_cpu_has_work);
+}
+
+/*
+ * Per-CPU kernel thread that invokes RCU callbacks.  This replaces the
+ * RCU softirq used in flavors and configurations of RCU that do not
+ * support RCU priority boosting.
+ */
+static void rcu_cpu_kthread(unsigned int cpu)
+{
+   unsigned int *statusp = &__get_cpu_var(rcu_cpu_kthread_status);
+   char work, *workp = &__get_cpu_var(rcu_cpu_has_work);
+   int spincnt;
+
+   for (spincnt = 0; spincnt < 10; spincnt++) {
+   trace_rcu_utilization("Start CPU kthread@rcu_wait");
+   local_bh_disable();
+   *statusp = RCU_KTHREAD_RUNNING;
+   this_cpu_inc(rcu_cpu_kthread_loops);
+   local_irq_disable();
+   work = *workp;
+   *workp = 0;
+   local_irq_enable();
+   if (work)
+   rcu_process_callbacks();
+   local_bh_enable();
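
The archived copy of this patch stops here. The core of the loop above is the
"read the has-work flag and clear it in one step, then process" pattern. A
minimal userspace sketch of that pattern follows, assuming a single atomic
flag in place of the per-CPU rcu_cpu_has_work variable and an atomic exchange
in place of the irq-disabled read/clear pair. The names post_work(),
worker_pass() and batches() are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for the per-CPU rcu_cpu_has_work flag (assumption: one "CPU"). */
static atomic_int has_work;
/* Counts how many times the worker actually processed a batch. */
static int batches_run;

/* Analogous to invoke_rcu_core(): only flag that work is pending. */
void post_work(void)
{
    atomic_store(&has_work, 1);
}

/*
 * One pass of the worker loop: fetch and clear the flag in a single
 * atomic step, so a request posted after this point is not lost and
 * is seen on the next pass. Returns 1 if a batch was processed.
 */
int worker_pass(void)
{
    int work = atomic_exchange(&has_work, 0);

    if (work)
        batches_run++;
    return work;
}

int batches(void)
{
    return batches_run;
}
```

Several requests posted before the worker runs collapse into one pass, which
is also how the flag behaves in the patch: it records that work exists, not
how much.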
+

Re: [PATCH] irq-gic: remove file name from heading comment

2014-03-04 Thread Thomas Gleixner

On Wed, 5 Mar 2014, Sergei Shtylyov wrote:
> Hello.
> 
> On 01/15/2014 02:49 AM, Sergei Shtylyov wrote:
> 
> > > File names in the heading comments fell out of favor long ago, and this
> > > one wasn't even changed when the driver was moved from arch/arm/common/,
> > > so remove it at last...
> 
> > > Signed-off-by: Sergei Shtylyov 
> 
> > > ---
> > > The patch is against the 'irq/core' branch of the 'tip.git' repo.
> 
> > Thomas, is there a chance that you merge this patch for 3.14? Is it even
> > the right branch for cleanups (I'm seeing fixes there)?
> 
>Thomas, will you ever reply to me? Who's maintaining the GIC driver, you?
> Or should the patches just go in via arm-soc.git?

That patch is so important^Wtrivial that it really can go via the
obvious triv...@kernel.org

Sigh

[PATCH RT 6/8] net: ip_send_unicast_reply: add missing local serialization

2014-03-04 Thread Steven Rostedt

3.8.13.14-rt28-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Nicholas Mc Guire 

In response to the oops in ip_output.c:ip_send_unicast_reply under high
network load with CONFIG_PREEMPT_RT_FULL=y, reported by Sami Pietikainen,
this patch adds local serialization in ip_send_unicast_reply.

from ip_output.c:
/*
 *  Generic function to send a packet as reply to another packet.
 *  Used to send some TCP resets/acks so far.
 *
 *  Use a fake percpu inet socket to avoid false sharing and contention.
 */
static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = {
...

which was added in commit be9f4a44 in linux-stable. The git log of the
commit which introduced the PER_CPU unicast_sock states:

commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046
Author: Eric Dumazet 
Date:   Thu Jul 19 07:34:03 2012 +

ipv4: tcp: remove per net tcp_sock

tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket
per network namespace.

This leads to bad behavior on multiqueue NICS, because many cpus
contend for the socket lock and once socket lock is acquired, extra
false sharing on various socket fields slow down the operations.

To better resist to attacks, we use a percpu socket. Each cpu can
run without contention, using appropriate memory (local node)


The per-cpu socket thus assumes exclusivity, i.e. serialization per cpu, so
the use of get_cpu_light introduced in
net-use-cpu-light-in-ip-send-unicast-reply.patch, which dropped the
preempt_disable in favor of a migrate_disable, is probably wrong, as this
only handles the referential consistency but not the serialization. To
avoid a preempt_disable here, a local lock is needed.

Fix:
 * add a local lock
 * and re-introduce local serialization

Tested on x86 with high network load using the testcase from Sami Pietikainen
  while : ; do wget -O - ftp://LOCAL_SERVER/empty_file > /dev/null 2>&1; done

Link: http://www.spinics.net/lists/linux-rt-users/msg11007.html
Cc: stable...@vger.kernel.org
Signed-off-by: Nicholas Mc Guire 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 net/ipv4/ip_output.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 90af992..6e75fb1 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -79,6 +79,7 @@
 #include 
 #include 
 #include 
+#include 
 
 int sysctl_ip_default_ttl __read_mostly = IPDEFTTL;
 EXPORT_SYMBOL(sysctl_ip_default_ttl);
@@ -1471,6 +1472,9 @@ static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = {
.uc_ttl = -1,
 };
 
+/* serialize concurrent calls on the same CPU to ip_send_unicast_reply */
+static DEFINE_LOCAL_IRQ_LOCK(unicast_lock);
+
 void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
   __be32 saddr, const struct ip_reply_arg *arg,
   unsigned int len)
@@ -1508,8 +1512,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
if (IS_ERR(rt))
return;
 
-   get_cpu_light();
-   inet = &__get_cpu_var(unicast_sock);
+   inet = &get_locked_var(unicast_lock, unicast_sock);
 
inet->tos = arg->tos;
sk = &inet->sk;
@@ -1533,7 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
ip_push_pending_frames(sk, &fl4);
}
 
-   put_cpu_light();
+   put_locked_var(unicast_lock, unicast_sock);
 
ip_rt_put(rt);
 }
-- 
1.8.5.3
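
The difference between "stay on this CPU" and "be the only user of this CPU's
data" can be sketched in userspace. This is only a model under stated
assumptions: a fixed array stands in for the per-CPU unicast_sock, and an
atomic busy flag stands in for the local lock. slot_tryget() and slot_put()
are hypothetical names; the real get_locked_var() blocks until the lock is
free instead of failing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_SLOTS 4 /* hypothetical number of CPUs */

/* One reply socket stand-in per CPU, plus the lock that guards it. */
struct reply_slot {
    atomic_int busy; /* 0 = free, 1 = in use */
    int tos;         /* example field that concurrent users would clobber */
};

static struct reply_slot slots[NR_SLOTS];

/*
 * Claim the slot of the given CPU. Returns NULL if another task on the
 * same CPU already holds it, which is the case migrate_disable() alone
 * does not exclude.
 */
struct reply_slot *slot_tryget(int cpu)
{
    if (cpu < 0 || cpu >= NR_SLOTS)
        return NULL;
    if (atomic_exchange(&slots[cpu].busy, 1))
        return NULL;
    return &slots[cpu];
}

/* Release a slot obtained from slot_tryget(). */
void slot_put(struct reply_slot *slot)
{
    atomic_store(&slot->busy, 0);
}
```

Slots of different CPUs stay independent, so the lock adds no cross-CPU
contention, which is the property the original per-CPU socket was
introduced for.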



[PATCH RT 2/8] Revert "x86: Disable IST stacks for debug/int 3/stack fault for PREEMPT_RT"

2014-03-04 Thread Steven Rostedt

3.8.13.14-rt28-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

where do I start. Let me explain what is going on here. The code
sequence
| pushf
| pop%edx
| or $0x1,%dh
| push   %edx
| mov$0xe0,%eax
| popf
| sysenter

triggers the bug. On 64bit kernel we see the double fault (with 32bit and
64bit userland) and on 32bit kernel there is no problem. The reporter said
that double fault does not happen on 64bit kernel with 64bit userland and
this is because in that case the VDSO uses the "syscall" interface instead
of "sysenter".

The bug. "popf" loads the flags with the TF bit set which enables
"single stepping" and this leads to a debug exception. Usually on 64bit
we have a special IST stack for the debug exception. Due to patch [0] we
do not use the IST stack but the kernel stack instead. On 64bit the
sysenter instruction starts in kernel with the stack address NULL. The
code sequence above enters the debug exception (TF flag) after the
sysenter instruction was executed which sets the stack pointer to NULL
and we have a fault (it seems that the debug exception saves some bytes
on the stack).
To fix the double fault I'm going to drop patch [0]. It is completely
pointless. In do_debug() and do_stack_segment() we disable preemption
which means the task can't leave the CPU. So it does not matter if we run
on IST or on kernel stack.
There is a patch [1] which drops preempt_disable() call for a 32bit
kernel but not for 64bit so there should be no regression.
And [1] seems valid even for this code sequence. We enter the debug
exception with a 256bytes long per cpu stack and migrate to the kernel
stack before calling do_debug().

[0] x86-disable-debug-stack.patch
[1] fix-rt-int3-x86_32-3.2-rt.patch

Cc: stable...@vger.kernel.org
Reported-by: Brian Silverman 
Cc: Andi Kleen 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 arch/x86/include/asm/page_64_types.h | 21 ++---
 arch/x86/kernel/cpu/common.c |  2 --
 arch/x86/kernel/dumpstack_64.c   |  4 
 3 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 65b85f4..320f7bb 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -14,21 +14,12 @@
 #define IRQ_STACK_ORDER 2
 #define IRQ_STACK_SIZE (PAGE_SIZE << IRQ_STACK_ORDER)
 
-#ifdef CONFIG_PREEMPT_RT_FULL
-# define STACKFAULT_STACK 0
-# define DOUBLEFAULT_STACK 1
-# define NMI_STACK 2
-# define DEBUG_STACK 0
-# define MCE_STACK 3
-# define N_EXCEPTION_STACKS 3  /* hw limit: 7 */
-#else
-# define STACKFAULT_STACK 1
-# define DOUBLEFAULT_STACK 2
-# define NMI_STACK 3
-# define DEBUG_STACK 4
-# define MCE_STACK 5
-# define N_EXCEPTION_STACKS 5  /* hw limit: 7 */
-#endif
+#define STACKFAULT_STACK 1
+#define DOUBLEFAULT_STACK 2
+#define NMI_STACK 3
+#define DEBUG_STACK 4
+#define MCE_STACK 5
+#define N_EXCEPTION_STACKS 5  /* hw limit: 7 */
 
 #define PUD_PAGE_SIZE  (_AC(1, UL) << PUD_SHIFT)
 #define PUD_PAGE_MASK  (~(PUD_PAGE_SIZE-1))
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2636e0f..9c3ab43 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1103,9 +1103,7 @@ DEFINE_PER_CPU(struct task_struct *, fpu_owner_task);
  */
 static const unsigned int exception_stack_sizes[N_EXCEPTION_STACKS] = {
  [0 ... N_EXCEPTION_STACKS - 1]= EXCEPTION_STKSZ,
-#if DEBUG_STACK > 0
  [DEBUG_STACK - 1] = DEBUG_STKSZ
-#endif
 };
 
 static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index f16c07b..b653675 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -21,14 +21,10 @@
(N_EXCEPTION_STACKS + DEBUG_STKSZ/EXCEPTION_STKSZ - 2)
 
 static char x86_stack_ids[][8] = {
-#if DEBUG_STACK > 0
[ DEBUG_STACK-1 ]   = "#DB",
-#endif
[ NMI_STACK-1   ]   = "NMI",
[ DOUBLEFAULT_STACK-1   ]   = "#DF",
-#if STACKFAULT_STACK > 0
[ STACKFAULT_STACK-1]   = "#SS",
-#endif
[ MCE_STACK-1   ]   = "#MC",
 #if DEBUG_STKSZ > EXCEPTION_STKSZ
[ N_EXCEPTION_STACKS ...
-- 
1.8.5.3
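
For readers who want to check the flag arithmetic in the trigger sequence:
"or $0x1,%dh" sets bit 0 of DH, which is bit 8 of EDX, and bit 8 of EFLAGS is
the trap flag (TF). A small sketch in C, assuming nothing beyond that bit
layout; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define X86_EFLAGS_TF (1u << 8) /* trap flag: bit 8 of EFLAGS */

/*
 * Model "or $imm,%dh": DH is bits 8..15 of EDX, so the immediate is
 * shifted up by 8 before being OR-ed into the 32-bit register value.
 */
uint32_t or_imm_into_dh(uint32_t edx, uint8_t imm)
{
    return edx | ((uint32_t)imm << 8);
}

/* Nonzero if loading this value with popf would enable single stepping. */
int single_step_enabled(uint32_t eflags)
{
    return (eflags & X86_EFLAGS_TF) != 0;
}
```

With an example starting value of 0x202, the sequence yields 0x302, so the
popf that follows turns on single stepping right before sysenter.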



[PATCH RT 2/6] Revert "x86: Disable IST stacks for debug/int 3/stack fault for PREEMPT_RT"

2014-03-04 Thread Steven Rostedt

3.4.82-rt101-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

where do I start. Let me explain what is going on here. The code
sequence
| pushf
| pop%edx
| or $0x1,%dh
| push   %edx
| mov$0xe0,%eax
| popf
| sysenter

triggers the bug. On 64bit kernel we see the double fault (with 32bit and
64bit userland) and on 32bit kernel there is no problem. The reporter said
that double fault does not happen on 64bit kernel with 64bit userland and
this is because in that case the VDSO uses the "syscall" interface instead
of "sysenter".

The bug. "popf" loads the flags with the TF bit set which enables
"single stepping" and this leads to a debug exception. Usually on 64bit
we have a special IST stack for the debug exception. Due to patch [0] we
do not use the IST stack but the kernel stack instead. On 64bit the
sysenter instruction starts in kernel with the stack address NULL. The
code sequence above enters the debug exception (TF flag) after the
sysenter instruction was executed which sets the stack pointer to NULL
and we have a fault (it seems that the debug exception saves some bytes
on the stack).
To fix the double fault I'm going to drop patch [0]. It is completely
pointless. In do_debug() and do_stack_segment() we disable preemption
which means the task can't leave the CPU. So it does not matter if we run
on IST or on kernel stack.
There is a patch [1] which drops preempt_disable() call for a 32bit
kernel but not for 64bit so there should be no regression.
And [1] seems valid even for this code sequence. We enter the debug
exception with a 256bytes long per cpu stack and migrate to the kernel
stack before calling do_debug().

[0] x86-disable-debug-stack.patch
[1] fix-rt-int3-x86_32-3.2-rt.patch

Cc: stable...@vger.kernel.org
Reported-by: Brian Silverman 
Cc: Andi Kleen 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 arch/x86/include/asm/page_64_types.h | 21 ++---
 arch/x86/kernel/cpu/common.c |  2 --
 arch/x86/kernel/dumpstack_64.c   |  4 
 3 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 0883ecd..7639dbf 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -14,21 +14,12 @@
 #define IRQ_STACK_ORDER 2
 #define IRQ_STACK_SIZE (PAGE_SIZE << IRQ_STACK_ORDER)
 
-#ifdef CONFIG_PREEMPT_RT_FULL
-# define STACKFAULT_STACK 0
-# define DOUBLEFAULT_STACK 1
-# define NMI_STACK 2
-# define DEBUG_STACK 0
-# define MCE_STACK 3
-# define N_EXCEPTION_STACKS 3  /* hw limit: 7 */
-#else
-# define STACKFAULT_STACK 1
-# define DOUBLEFAULT_STACK 2
-# define NMI_STACK 3
-# define DEBUG_STACK 4
-# define MCE_STACK 5
-# define N_EXCEPTION_STACKS 5  /* hw limit: 7 */
-#endif
+#define STACKFAULT_STACK 1
+#define DOUBLEFAULT_STACK 2
+#define NMI_STACK 3
+#define DEBUG_STACK 4
+#define MCE_STACK 5
+#define N_EXCEPTION_STACKS 5  /* hw limit: 7 */
 
 #define PUD_PAGE_SIZE  (_AC(1, UL) << PUD_SHIFT)
 #define PUD_PAGE_MASK  (~(PUD_PAGE_SIZE-1))
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f743261..cf79302 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1056,9 +1056,7 @@ DEFINE_PER_CPU(struct task_struct *, fpu_owner_task);
  */
 static const unsigned int exception_stack_sizes[N_EXCEPTION_STACKS] = {
  [0 ... N_EXCEPTION_STACKS - 1]= EXCEPTION_STKSZ,
-#if DEBUG_STACK > 0
  [DEBUG_STACK - 1] = DEBUG_STKSZ
-#endif
 };
 
 static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index 9d50b30..17107bd 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -21,14 +21,10 @@
(N_EXCEPTION_STACKS + DEBUG_STKSZ/EXCEPTION_STKSZ - 2)
 
 static char x86_stack_ids[][8] = {
-#if DEBUG_STACK > 0
[ DEBUG_STACK-1 ]   = "#DB",
-#endif
[ NMI_STACK-1   ]   = "NMI",
[ DOUBLEFAULT_STACK-1   ]   = "#DF",
-#if STACKFAULT_STACK > 0
[ STACKFAULT_STACK-1]   = "#SS",
-#endif
[ MCE_STACK-1   ]   = "#MC",
 #if DEBUG_STKSZ > EXCEPTION_STKSZ
[ N_EXCEPTION_STACKS ...
-- 
1.8.5.3
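
The removed "#if DEBUG_STACK > 0" guards existed because the tables are
indexed with "STACK - 1": an index of 0 means "no IST stack", and 0 - 1 is
not a valid array slot. With the revert every index is 1..5 again. A
standalone sketch of that indexing, assuming illustrative stack sizes (one
page and two pages) that are not taken from the patch; stack_size() and
stack_name() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* IST indexes as restored by the revert. */
#define STACKFAULT_STACK   1
#define DOUBLEFAULT_STACK  2
#define NMI_STACK          3
#define DEBUG_STACK        4
#define MCE_STACK          5
#define N_EXCEPTION_STACKS 5

/* Illustrative sizes only (assumption), not copied from the kernel. */
#define EXCEPTION_STKSZ 4096u
#define DEBUG_STKSZ     8192u

/* Size of the stack behind an IST index; 0 means "no IST stack". */
unsigned int stack_size(int ist_index)
{
    if (ist_index < 1 || ist_index > N_EXCEPTION_STACKS)
        return 0;
    return ist_index == DEBUG_STACK ? DEBUG_STKSZ : EXCEPTION_STKSZ;
}

/* Name lookup in the style of x86_stack_ids[], indexed by "index - 1". */
const char *stack_name(int ist_index)
{
    static const char *const names[N_EXCEPTION_STACKS] = {
        [DEBUG_STACK - 1]       = "#DB",
        [NMI_STACK - 1]         = "NMI",
        [DOUBLEFAULT_STACK - 1] = "#DF",
        [STACKFAULT_STACK - 1]  = "#SS",
        [MCE_STACK - 1]         = "#MC",
    };

    if (ist_index < 1 || ist_index > N_EXCEPTION_STACKS)
        return NULL;
    return names[ist_index - 1];
}
```

Had DEBUG_STACK stayed 0, the "[DEBUG_STACK - 1]" initializer would name
index -1, which is why the RT-only configuration needed the guards.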



Re: [PATCH 2/6] mm: add get_pageblock_migratetype_nolock() for cases where locking is undesirable

2014-03-04 Thread Joonsoo Kim

On Tue, Mar 04, 2014 at 01:16:56PM +0100, Vlastimil Babka wrote:
> On 03/04/2014 01:55 AM, Joonsoo Kim wrote:
> >On Mon, Mar 03, 2014 at 02:54:09PM +0100, Vlastimil Babka wrote:
> >>On 03/03/2014 09:22 AM, Joonsoo Kim wrote:
> >>>On Fri, Feb 28, 2014 at 03:15:00PM +0100, Vlastimil Babka wrote:
> In order to prevent race with set_pageblock_migratetype, most of calls to
> get_pageblock_migratetype have been moved under zone->lock. For the remaining
> call sites, the extra locking is undesirable, notably in free_hot_cold_page().
> 
> This patch introduces a _nolock version to be used on these call sites, where
> a wrong value does not affect correctness. The function makes sure that the
> value does not exceed valid migratetype numbers. Such too-high values are
> assumed to be a result of race and caller-supplied fallback value is returned
> instead.
> 
> Signed-off-by: Vlastimil Babka 
> ---
>   include/linux/mmzone.h | 24 
>   mm/compaction.c| 14 +++---
>   mm/memory-failure.c|  3 ++-
>   mm/page_alloc.c| 22 +-
>   mm/vmstat.c|  2 +-
>   5 files changed, 55 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index fac5509..7c3f678 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -75,6 +75,30 @@ enum {
> 
>   extern int page_group_by_mobility_disabled;
> 
> +/*
> + * When called without zone->lock held, a race with set_pageblock_migratetype
> + * may result in bogus values. Use this variant only when this does not affect
> + * correctness, and taking zone->lock would be costly. Values >= MIGRATE_TYPES
> + * are considered to be a result of this race and the value of race_fallback
> + * argument is returned instead.
> + */
> +static inline int get_pageblock_migratetype_nolock(struct page *page,
> + int race_fallback)
> +{
> + int ret = get_pageblock_flags_group(page, PB_migrate, PB_migrate_end);
> +
> + if (unlikely(ret >= MIGRATE_TYPES))
> + ret = race_fallback;
> +
> + return ret;
> +}
> >>>
> >>>Hello, Vlastimil.
> >>>
> >>>First of all, thanks for nice work!
> >>>I have another opinion about this implementation. It can be wrong, so if it
> >>>is wrong, please let me know.
> >>
> >>Thanks, all opinions/reviewing is welcome :)
> >>
> >>>Although this implementation would close the race which triggers the NULL
> >>>dereference, I think that this isn't enough if you plan to add more
> >>>{start,undo}_isolate_page_range() calls.
> >>>
> >>>Consider that there are lots of {start,undo}_isolate_page_range() calls
> >>>on the system without CMA.
> >>>
> >>>The bit representation of the migratetypes is as follows.
> >>>
> >>>MIGRATE_MOVABLE = 010
> >>>MIGRATE_ISOLATE = 100
> >>>
> >>>We could read the following values as the migratetype of a page on a
> >>>movable pageblock if the race occurs.
> >>>
> >>>start_isolate_page_range() case: 010 -> 100
> >>>010, 000, 100
> >>>
> >>>undo_isolate_page_range() case: 100 -> 010
> >>>100, 110, 010
> >>>
> >>>The above implementation prevents us from getting 110, but it can't prevent
> >>>us from getting 000, that is, MIGRATE_UNMOVABLE. If this race occurs in
> >>>free_hot_cold_page(), the page would go onto the unmovable pcp list and then
> >>>be allocated for that migratetype. The result is more fragmented memory.
> >>
> >>Yes, that can happen. But I would expect it to be negligible to
> >>other causes of fragmentation. But I'm not at this moment sure how
> >>often {start,undo}_isolate_page_range() would be called in the end.
> >>Certainly
> >>not as often as in the development patch which is just to see if
> >>that can improve anything. Because it will have its own overhead
> >>(mostly for zone->lock) that might be too large. But good point, I
> >>will try to quantify this.
> >>
> >>>
> >>>Consider another case that system enables CONFIG_CMA,
> >>>
> >>>MIGRATE_MOVABLE = 010
> >>>MIGRATE_ISOLATE = 101
> >>>
> >>>start_isolate_page_range() case: 010 -> 101
> >>>010, 011, 001, 101
> >>>
> >>>undo_isolate_page_range() case: 101 -> 010
> >>>101, 100, 110, 010
> >>>
> >>>This can result in totally different values, and it also causes the problem
> >>>mentioned above. And although this doesn't cause any problem for CMA for
> >>>now, if another migratetype is introduced or some migratetype is removed,
> >>>it can cause a CMA-typed page to go into another migratetype and make CMA
> >>>fail permanently.
> >>
> >>This should actually be no problem for free_hot_cold_page() as any
> >>migratetype >= MIGRATE_PCPTYPES will defer to free_one_page() which
> >>will reread migratetype under zone->lock. So as long as
> >>MIGRATE_PCPTYPES does not include a migratetype with such
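
The archived copy of this message is cut off here. The intermediate values
listed earlier in the thread can be reproduced with a small sketch, assuming
the writer updates the migratetype one bit at a time starting from the lowest
bit (the order that matches the sequences given in the mail) and assuming
MIGRATE_TYPES is 5 when CMA is disabled. observable_values() and
migratetype_nolock() are hypothetical helpers, the latter modelling only the
clamping step of the proposed _nolock function.

```c
#include <assert.h>
#include <stddef.h>

/* Numbering without CMA, as used in the mail: MOVABLE = 010, ISOLATE = 100. */
#define MIGRATE_MOVABLE 2u
#define MIGRATE_ISOLATE 4u
#define MIGRATE_TYPES   5u /* assumed: ISOLATE is the last type without CMA */

/*
 * Record every value a lockless reader could see while a writer changes
 * 'from' into 'to' one bit at a time, lowest bit first. Returns the
 * number of values stored in out[], the starting value included.
 */
size_t observable_values(unsigned from, unsigned to, unsigned nbits,
                         unsigned *out, size_t max)
{
    unsigned cur = from;
    size_t n = 0;

    if (max == 0)
        return 0;
    out[n++] = cur;
    for (unsigned bit = 0; bit < nbits && n < max; bit++) {
        unsigned mask = 1u << bit;
        unsigned next = (cur & ~mask) | (to & mask);

        if (next != cur) {
            cur = next;
            out[n++] = cur;
        }
    }
    return n;
}

/* Clamp out-of-range reads to the caller's fallback. */
unsigned migratetype_nolock(unsigned raw, unsigned race_fallback)
{
    return raw >= MIGRATE_TYPES ? race_fallback : raw;
}
```

For 010 -> 100 this yields 010, 000, 100. The clamp replaces 110 with the
fallback but passes 000 through unchanged, which is the MIGRATE_UNMOVABLE
case raised above.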

[PATCH RT 1/6] rcutree/rcu_bh_qs: disable irq while calling rcu_preempt_qs()

2014-03-04 Thread Steven Rostedt

3.4.82-rt101-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Tiejun Chen 

Any caller of rcu_preempt_qs() must disable irqs in order to protect
the assignment to ->rcu_read_unlock_special. In the RT case,
rcu_bh_qs(), a wrapper around rcu_preempt_qs(), is called in some
scenarios where irqs are enabled, such as this path:

do_single_softirq()
|
+ local_irq_enable();
+ handle_softirq()
||
|+ rcu_bh_qs()
||
|+ rcu_preempt_qs()
|
+ local_irq_disable()

So we had better disable irqs directly inside rcu_bh_qs() to fix
this; otherwise the kernel may occasionally freeze, as observed.
Doing it here also keeps any future rcu_bh_qs() callers safe.

Cc: stable...@vger.kernel.org
Signed-off-by: Tiejun Chen 
Signed-off-by: Bin Jiang 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 kernel/rcutree.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 12ae410..055268b 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -177,7 +177,12 @@ static void rcu_preempt_qs(int cpu);
 
 void rcu_bh_qs(int cpu)
 {
+   unsigned long flags;
+
+   /* Callers to this function, rcu_preempt_qs(), must disable irqs. */
+   local_irq_save(flags);
rcu_preempt_qs(cpu);
+   local_irq_restore(flags);
 }
 #else
 void rcu_bh_qs(int cpu)
-- 
1.8.5.3
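
The save/disable/restore shape of the fix has a rough userspace analogue:
block asynchronous signals around an update and then restore whatever mask
the caller had. This is only an analogy under that assumption, meant for a
single-threaded program; it is not how the kernel masks interrupts, and the
function names are made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* State that must not be touched from a handler while it is updated. */
static int special_state;

/* By contract this must run with "interrupts" (here: signals) blocked. */
static void update_special(void)
{
    special_state++;
}

/*
 * Wrapper in the style of the patched rcu_bh_qs(): save the caller's
 * mask, block everything, do the update, restore the saved mask.
 * Returns the new state, or -1 on error.
 */
int update_special_safely(void)
{
    sigset_t all, saved;

    if (sigfillset(&all) != 0)
        return -1;
    if (sigprocmask(SIG_BLOCK, &all, &saved) != 0)
        return -1;
    update_special();
    if (sigprocmask(SIG_SETMASK, &saved, NULL) != 0)
        return -1;
    return special_state;
}

/* Report whether SIGUSR1 is currently blocked (1), not blocked (0), or -1. */
int sigusr1_blocked(void)
{
    sigset_t cur;

    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, SIGUSR1);
}
```

Restoring the saved mask instead of unconditionally unblocking mirrors
local_irq_save()/local_irq_restore(): the wrapper stays correct whether or
not the caller had already masked things.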



[PATCH RT 6/6] Linux 3.4.82-rt101-rc1

2014-03-04 Thread Steven Rostedt

3.4.82-rt101-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: "Steven Rostedt (Red Hat)" 

---
 localversion-rt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index 79d3e2b..18ffee1 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt100
+-rt101-rc1
-- 
1.8.5.3



[PATCH RT 5/6] arm/unwind: use a raw_spin_lock

2014-03-04 Thread Steven Rostedt

3.4.82-rt101-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

Mostly unwind is done with irqs enabled; however, SLUB may call it with
irqs disabled while creating a new SLUB cache.

I had a system freeze while loading a module which called
kmem_cache_create() on init. That means SLUB's __slab_alloc() disabled
interrupts and then

->new_slab_objects()
 ->new_slab()
  ->setup_object()
   ->setup_object_debug()
->init_tracking()
 ->set_track()
  ->save_stack_trace()
   ->save_stack_trace_tsk()
->walk_stackframe()
 ->unwind_frame()
  ->unwind_find_idx()
   =>spin_lock_irqsave(&unwind_lock);

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 arch/arm/kernel/unwind.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/kernel/unwind.c b/arch/arm/kernel/unwind.c
index 00df012..bbafc67 100644
--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -87,7 +87,7 @@ extern const struct unwind_idx __start_unwind_idx[];
 static const struct unwind_idx *__origin_unwind_idx;
 extern const struct unwind_idx __stop_unwind_idx[];
 
-static DEFINE_SPINLOCK(unwind_lock);
+static DEFINE_RAW_SPINLOCK(unwind_lock);
 static LIST_HEAD(unwind_tables);
 
 /* Convert a prel31 symbol to an absolute address */
@@ -195,7 +195,7 @@ static const struct unwind_idx *unwind_find_idx(unsigned long addr)
/* module unwind tables */
struct unwind_table *table;
 
-   spin_lock_irqsave(&unwind_lock, flags);
+   raw_spin_lock_irqsave(&unwind_lock, flags);
list_for_each_entry(table, &unwind_tables, list) {
if (addr >= table->begin_addr &&
addr < table->end_addr) {
@@ -207,7 +207,7 @@ static const struct unwind_idx *unwind_find_idx(unsigned long addr)
break;
}
}
-   spin_unlock_irqrestore(&unwind_lock, flags);
+   raw_spin_unlock_irqrestore(&unwind_lock, flags);
}
 
pr_debug("%s: idx = %p\n", __func__, idx);
@@ -469,9 +469,9 @@ struct unwind_table *unwind_table_add(unsigned long start, unsigned long size,
tab->begin_addr = text_addr;
tab->end_addr = text_addr + text_size;
 
-   spin_lock_irqsave(&unwind_lock, flags);
+   raw_spin_lock_irqsave(&unwind_lock, flags);
list_add_tail(&tab->list, &unwind_tables);
-   spin_unlock_irqrestore(&unwind_lock, flags);
+   raw_spin_unlock_irqrestore(&unwind_lock, flags);
 
return tab;
 }
@@ -483,9 +483,9 @@ void unwind_table_del(struct unwind_table *tab)
if (!tab)
return;
 
-   spin_lock_irqsave(&unwind_lock, flags);
+   raw_spin_lock_irqsave(&unwind_lock, flags);
list_del(&tab->list);
-   spin_unlock_irqrestore(&unwind_lock, flags);
+   raw_spin_unlock_irqrestore(&unwind_lock, flags);
 
kfree(tab);
 }
-- 
1.8.5.3
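
As background for the conversion: a raw spinlock busy-waits and never sleeps,
which is what makes it usable with irqs disabled on RT. A userspace sketch of
a busy-waiting lock around a range lookup like the one in unwind_find_idx(),
assuming C11 atomics and a made-up table layout; none of the names or
addresses below come from the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Simplified stand-in for struct unwind_table: a half-open address range. */
struct table {
    unsigned long begin_addr;
    unsigned long end_addr;
};

static atomic_int table_lock; /* 0 = free, 1 = held */
static const struct table tables[] = {
    { 0x1000, 0x2000 }, /* hypothetical module text ranges */
    { 0x8000, 0x9000 },
};

/* Spin until the lock is ours; the caller never goes to sleep. */
static void raw_lock(void)
{
    while (atomic_exchange_explicit(&table_lock, 1, memory_order_acquire))
        ;
}

static void raw_unlock(void)
{
    atomic_store_explicit(&table_lock, 0, memory_order_release);
}

/* Return the index of the table covering addr, or -1 if none does. */
int find_table(unsigned long addr)
{
    int found = -1;

    raw_lock();
    for (size_t i = 0; i < sizeof(tables) / sizeof(tables[0]); i++) {
        if (addr >= tables[i].begin_addr && addr < tables[i].end_addr) {
            found = (int)i;
            break;
        }
    }
    raw_unlock();
    return found;
}
```

The critical section is a short bounded walk, which is the kind of section
where spinning is an acceptable cost.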



[PATCH RT 3/6] rt: Make cpu_chill() use hrtimer instead of msleep()

2014-03-04 Thread Steven Rostedt

3.4.82-rt101-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Steven Rostedt 

Ulrich Obergfell pointed out that cpu_chill() calls msleep() which is woken
up by the ksoftirqd running the TIMER softirq. But as the cpu_chill() is
called from softirq context, it may block the ksoftirqd() from running, in
which case, it may never wake up the msleep() causing the deadlock.

I checked the vmcore, and irq/74-qla2xxx is stuck in the msleep() call,
running on CPU 8. The one ksoftirqd that is stuck, happens to be the one that
runs on CPU 8, and it is blocked on a lock held by irq/74-qla2xxx. As that
ksoftirqd is the one that will wake up irq/74-qla2xxx, and it happens to be
blocked on a lock that irq/74-qla2xxx holds, we have our deadlock.

The solution is not to convert the cpu_chill() back to a cpu_relax() as that
will re-create a possible live lock that the cpu_chill() fixed earlier, and may
also leave this bug open on other softirqs. The fix is to remove the
dependency on ksoftirqd from cpu_chill(). That is, instead of calling
msleep() that requires ksoftirqd to wake it up, use the
hrtimer_nanosleep() code that does the wakeup from hard irq context.

|Looks to be the lock of the block softirq. I don't have the core dump
|anymore, but from what I could tell the ksoftirqd was blocked on the
|block softirq lock, where the block softirq handler did a msleep
|(called by the qla2xxx interrupt handler).
|
|Looking at trigger_softirq() in block/blk-softirq.c, it can do a
|smp_callfunction() to another cpu to run the block softirq. If that
|happens to be the cpu where the qla2xxx irq handler is doing the block
|softirq and is in the middle of a msleep(), I believe the ksoftirqd will
|try to run the softirq. If it does that, then BOOM, it's deadlocked
|because the ksoftirqd will never run the timer softirq either.

|I should have also stated that it was only one lock that was involved.
|But the lock owner was doing a msleep() that requires a wakeup by
|ksoftirqd to continue. If ksoftirqd happens to be blocked on a lock
|held by the msleep() caller, then you have your deadlock.
|
|It's best not to have any softirqs going to sleep requiring another
|softirq to wake it up. Note, if we ever require a timer softirq to do a
|cpu_chill() it will most definitely hit this deadlock.

Cc: stable...@vger.kernel.org
Found-by: Ulrich Obergfell 
Signed-off-by: Steven Rostedt 
[bigeasy: add the 4 | chapters from email]
Signed-off-by: Sebastian Andrzej Siewior 
---
 include/linux/delay.h |  2 +-
 kernel/hrtimer.c  | 15 +++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/delay.h b/include/linux/delay.h
index e23a7c0..37caab3 100644
--- a/include/linux/delay.h
+++ b/include/linux/delay.h
@@ -53,7 +53,7 @@ static inline void ssleep(unsigned int seconds)
 }
 
 #ifdef CONFIG_PREEMPT_RT_FULL
-# define cpu_chill()   msleep(1)
+extern void cpu_chill(void);
 #else
 # define cpu_chill()   cpu_relax()
 #endif
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 1314c00..0b1d411 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1845,6 +1845,21 @@ SYSCALL_DEFINE2(nanosleep, struct timespec __user *, rqtp,
return hrtimer_nanosleep(&tu, rmtp, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
 }
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+/*
+ * Sleep for 1 ms in hope whoever holds what we want will let it go.
+ */
+void cpu_chill(void)
+{
+   struct timespec tu = {
+   .tv_nsec = NSEC_PER_MSEC,
+   };
+
+   hrtimer_nanosleep(&tu, NULL, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
+}
+EXPORT_SYMBOL(cpu_chill);
+#endif
+
 /*
  * Functions related to boot-time initialization:
  */
-- 
1.8.5.3



[PATCH RT 4/6] kernel/hrtimer: be non-freezeable in cpu_chill()

2014-03-04 Thread Steven Rostedt

3.4.82-rt101-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

Since we replaced msleep() by hrtimer I see now and then (rarely) this:

| [] Waiting for /dev to be fully populated...
| =
| [ BUG: udevd/229 still has locks held! ]
| 3.12.11-rt17 #23 Not tainted
| -
| 1 lock held by udevd/229:
|  #0:  (&type->i_mutex_dir_key#2){+.+.+.}, at: lookup_slow+0x28/0x98
|
| stack backtrace:
| CPU: 0 PID: 229 Comm: udevd Not tainted 3.12.11-rt17 #23
| (unwind_backtrace+0x0/0xf8) from (show_stack+0x10/0x14)
| (show_stack+0x10/0x14) from (dump_stack+0x74/0xbc)
| (dump_stack+0x74/0xbc) from (do_nanosleep+0x120/0x160)
| (do_nanosleep+0x120/0x160) from (hrtimer_nanosleep+0x90/0x110)
| (hrtimer_nanosleep+0x90/0x110) from (cpu_chill+0x30/0x38)
| (cpu_chill+0x30/0x38) from (dentry_kill+0x158/0x1ec)
| (dentry_kill+0x158/0x1ec) from (dput+0x74/0x15c)
| (dput+0x74/0x15c) from (lookup_real+0x4c/0x50)
| (lookup_real+0x4c/0x50) from (__lookup_hash+0x34/0x44)
| (__lookup_hash+0x34/0x44) from (lookup_slow+0x38/0x98)
| (lookup_slow+0x38/0x98) from (path_lookupat+0x208/0x7fc)
| (path_lookupat+0x208/0x7fc) from (filename_lookup+0x20/0x60)
| (filename_lookup+0x20/0x60) from (user_path_at_empty+0x50/0x7c)
| (user_path_at_empty+0x50/0x7c) from (user_path_at+0x14/0x1c)
| (user_path_at+0x14/0x1c) from (vfs_fstatat+0x48/0x94)
| (vfs_fstatat+0x48/0x94) from (SyS_stat64+0x14/0x30)
| (SyS_stat64+0x14/0x30) from (ret_fast_syscall+0x0/0x48)

For now I see no better way but to disable the freezer for the sleep period.
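
The save/set/restore handling of PF_NOFREEZE in the patch below can be
sketched in plain userspace C. This is only an illustration under
assumptions: struct task_demo, PF_NOFREEZE_DEMO, chill_nofreeze() and
flags_after_chill() are made-up names, the bit value is arbitrary, and the
sleep is a no-op stand-in for hrtimer_nanosleep().

```c
/* Hypothetical flag bit, chosen only for this sketch. */
#define PF_NOFREEZE_DEMO 0x00008000u

/* Hypothetical stand-in for the task's flags word. */
struct task_demo {
	unsigned int flags;
};

/*
 * Set the no-freeze bit around a sleep and clear it again afterwards,
 * but only if it was not already set when we were called.
 */
void chill_nofreeze(struct task_demo *t, void (*sleep_fn)(void))
{
	unsigned int was_set = t->flags & PF_NOFREEZE_DEMO;

	t->flags |= PF_NOFREEZE_DEMO;
	if (sleep_fn)
		sleep_fn();	/* the real code sleeps for 1 ms here */
	if (!was_set)
		t->flags &= ~PF_NOFREEZE_DEMO;
}

/* Convenience wrapper: flags word after one chill, given an initial value. */
unsigned int flags_after_chill(unsigned int initial)
{
	struct task_demo t = { .flags = initial };

	chill_nofreeze(&t, 0);
	return t.flags;
}
```

The point of remembering the old bit is that a task which was already
non-freezable before the call must stay non-freezable after it.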

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 kernel/hrtimer.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 0b1d411..a87d70d 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1854,8 +1854,12 @@ void cpu_chill(void)
struct timespec tu = {
.tv_nsec = NSEC_PER_MSEC,
};
+   unsigned int freeze_flag = current->flags & PF_NOFREEZE;
 
+   current->flags |= PF_NOFREEZE;
hrtimer_nanosleep(&tu, NULL, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
+   if (!freeze_flag)
+   current->flags &= ~PF_NOFREEZE;
 }
 EXPORT_SYMBOL(cpu_chill);
 #endif
-- 
1.8.5.3



[PATCH RT 0/6] Linux 3.4.82-rt101-rc1

2014-03-04 Thread Steven Rostedt


Dear RT Folks,

This is the RT stable review cycle of patch 3.4.82-rt101-rc1.

Please scream at me if I messed something up. Please test the patches too.

The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).

The pre-releases will not be pushed to the git repository, only the
final release is.

If all goes well, this patch will be converted to the next main release
on 3/7/2014.

Enjoy,

-- Steve


To build 3.4.82-rt101-rc1 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.4.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.4.82.xz

  
  http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/patch-3.4.82-rt101-rc1.patch.xz

You can also build from 3.4.82-rt100 by applying the incremental patch:

http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/incr/patch-3.4.82-rt100-rt101-rc1.patch.xz


Changes from 3.4.82-rt100:

---


Sebastian Andrzej Siewior (3):
  Revert "x86: Disable IST stacks for debug/int 3/stack fault for PREEMPT_RT"
  kernel/hrtimer: be non-freezeable in cpu_chill()
  arm/unwind: use a raw_spin_lock

Steven Rostedt (1):
  rt: Make cpu_chill() use hrtimer instead of msleep()

Steven Rostedt (Red Hat) (1):
  Linux 3.4.82-rt101-rc1

Tiejun Chen (1):
  rcutree/rcu_bh_qs: disable irq while calling rcu_preempt_qs()


 arch/arm/kernel/unwind.c | 14 +++---
 arch/x86/include/asm/page_64_types.h | 21 ++---
 arch/x86/kernel/cpu/common.c |  2 --
 arch/x86/kernel/dumpstack_64.c   |  4 
 include/linux/delay.h|  2 +-
 kernel/hrtimer.c | 19 +++
 kernel/rcutree.c |  5 +
 localversion-rt  |  2 +-
 8 files changed, 39 insertions(+), 30 deletions(-)

Re: [PATCH 5/7] staging: cxt1e1: fix checkpatch errors with open brace '{'

2014-03-04 Thread Greg KH

On Tue, Mar 04, 2014 at 11:10:44AM +0900, Daeseok Youn wrote:
> 
> clean up checkpatch.pl error in linux.c:
>  ERROR: that open brace { should be on the previous line
> 
> Signed-off-by: Daeseok Youn 
> ---
>  drivers/staging/cxt1e1/linux.c |   67 ---
>  1 files changed, 21 insertions(+), 46 deletions(-)

As patch 4 can't be applied, I can't apply these either, please resend
the rest of the series when you fix them up.

thanks,

greg k-h

[PATCH RT 0/6] Linux 3.2.55-rt79-rc1

2014-03-04 Thread Steven Rostedt


Dear RT Folks,

This is the RT stable review cycle of patch 3.2.55-rt79-rc1.

Please scream at me if I messed something up. Please test the patches too.

The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).

The pre-releases will not be pushed to the git repository, only the
final release is.

If all goes well, this patch will be converted to the next main release
on 3/7/2014.

Enjoy,

-- Steve


To build 3.2.55-rt79-rc1 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.2.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.2.55.xz

  
  http://www.kernel.org/pub/linux/kernel/projects/rt/3.2/patch-3.2.55-rt79-rc1.patch.xz

You can also build from 3.2.55-rt78 by applying the incremental patch:

http://www.kernel.org/pub/linux/kernel/projects/rt/3.2/incr/patch-3.2.55-rt78-rt79-rc1.patch.xz


Changes from 3.2.55-rt78:

---


Sebastian Andrzej Siewior (3):
  Revert "x86: Disable IST stacks for debug/int 3/stack fault for PREEMPT_RT"
  kernel/hrtimer: be non-freezeable in cpu_chill()
  arm/unwind: use a raw_spin_lock

Steven Rostedt (1):
  rt: Make cpu_chill() use hrtimer instead of msleep()

Steven Rostedt (Red Hat) (1):
  Linux 3.2.55-rt79-rc1

Tiejun Chen (1):
  rcutree/rcu_bh_qs: disable irq while calling rcu_preempt_qs()


 arch/arm/kernel/unwind.c | 14 +++---
 arch/x86/include/asm/page_64_types.h | 21 ++---
 arch/x86/kernel/cpu/common.c |  2 --
 arch/x86/kernel/dumpstack_64.c   |  4 
 include/linux/delay.h|  2 +-
 kernel/hrtimer.c | 19 +++
 kernel/rcutree.c |  5 +
 localversion-rt  |  2 +-
 8 files changed, 39 insertions(+), 30 deletions(-)

[PATCH RT 2/6] Revert "x86: Disable IST stacks for debug/int 3/stack fault for PREEMPT_RT"

2014-03-04 Thread Steven Rostedt

3.2.55-rt79-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

Where do I start? Let me explain what is going on here. The code
sequence
| pushf
| pop    %edx
| or     $0x1,%dh
| push   %edx
| mov    $0xe0,%eax
| popf
| sysenter

triggers the bug. On a 64bit kernel we see the double fault (with 32bit and
64bit userland) and on a 32bit kernel there is no problem. The reporter said
that the double fault does not happen on a 64bit kernel with 64bit userland,
and this is because in that case the VDSO uses the "syscall" interface instead
of "sysenter".

The bug. "popf" loads the flags with the TF bit set which enables
"single stepping" and this leads to a debug exception. Usually on 64bit
we have a special IST stack for the debug exception. Due to patch [0] we
do not use the IST stack but the kernel stack instead. On 64bit the
sysenter instruction starts in kernel with the stack address NULL. The
code sequence above enters the debug exception (TF flag) after the
sysenter instruction was executed which sets the stack pointer to NULL
and we have a fault (it seems that the debug exception saves some bytes
on the stack).
To fix the double fault I'm going to drop patch [0]. It is completely
pointless. In do_debug() and do_stack_segment() we disable preemption
which means the task can't leave the CPU. So it does not matter if we run
on IST or on kernel stack.
There is a patch [1] which drops preempt_disable() call for a 32bit
kernel but not for 64bit so there should be no regression.
And [1] seems valid even for this code sequence. We enter the debug
exception with a 256bytes long per cpu stack and migrate to the kernel
stack before calling do_debug().

[0] x86-disable-debug-stack.patch
[1] fix-rt-int3-x86_32-3.2-rt.patch

Cc: stable...@vger.kernel.org
Reported-by: Brian Silverman 
Cc: Andi Kleen 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 arch/x86/include/asm/page_64_types.h | 21 ++---
 arch/x86/kernel/cpu/common.c |  2 --
 arch/x86/kernel/dumpstack_64.c   |  4 
 3 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 0883ecd..7639dbf 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -14,21 +14,12 @@
 #define IRQ_STACK_ORDER 2
 #define IRQ_STACK_SIZE (PAGE_SIZE << IRQ_STACK_ORDER)
 
-#ifdef CONFIG_PREEMPT_RT_FULL
-# define STACKFAULT_STACK 0
-# define DOUBLEFAULT_STACK 1
-# define NMI_STACK 2
-# define DEBUG_STACK 0
-# define MCE_STACK 3
-# define N_EXCEPTION_STACKS 3  /* hw limit: 7 */
-#else
-# define STACKFAULT_STACK 1
-# define DOUBLEFAULT_STACK 2
-# define NMI_STACK 3
-# define DEBUG_STACK 4
-# define MCE_STACK 5
-# define N_EXCEPTION_STACKS 5  /* hw limit: 7 */
-#endif
+#define STACKFAULT_STACK 1
+#define DOUBLEFAULT_STACK 2
+#define NMI_STACK 3
+#define DEBUG_STACK 4
+#define MCE_STACK 5
+#define N_EXCEPTION_STACKS 5  /* hw limit: 7 */
 
 #define PUD_PAGE_SIZE  (_AC(1, UL) << PUD_SHIFT)
 #define PUD_PAGE_MASK  (~(PUD_PAGE_SIZE-1))
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index edc013e..ca93cc7 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1050,9 +1050,7 @@ DEFINE_PER_CPU(unsigned int, irq_count) = -1;
  */
 static const unsigned int exception_stack_sizes[N_EXCEPTION_STACKS] = {
  [0 ... N_EXCEPTION_STACKS - 1]= EXCEPTION_STKSZ,
-#if DEBUG_STACK > 0
  [DEBUG_STACK - 1] = DEBUG_STKSZ
-#endif
 };
 
 static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index 352beb7..6d728d9 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -21,14 +21,10 @@
(N_EXCEPTION_STACKS + DEBUG_STKSZ/EXCEPTION_STKSZ - 2)
 
 static char x86_stack_ids[][8] = {
-#if DEBUG_STACK > 0
[ DEBUG_STACK-1 ]   = "#DB",
-#endif
[ NMI_STACK-1   ]   = "NMI",
[ DOUBLEFAULT_STACK-1   ]   = "#DF",
-#if STACKFAULT_STACK > 0
[ STACKFAULT_STACK-1]   = "#SS",
-#endif
[ MCE_STACK-1   ]   = "#MC",
 #if DEBUG_STKSZ > EXCEPTION_STKSZ
[ N_EXCEPTION_STACKS ...
-- 
1.8.5.3



[PATCH RT 1/6] rcutree/rcu_bh_qs: disable irq while calling rcu_preempt_qs()

2014-03-04 Thread Steven Rostedt

3.2.55-rt79-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Tiejun Chen 

Any callers to the function rcu_preempt_qs() must disable irqs in
order to protect the assignment to ->rcu_read_unlock_special. In
RT case, rcu_bh_qs() as the wrapper of rcu_preempt_qs() is called
in some scenarios where irq is enabled, like this path,

do_single_softirq()
|
+ local_irq_enable();
+ handle_softirq()
||
|+ rcu_bh_qs()
||
|+ rcu_preempt_qs()
|
+ local_irq_disable()

So here we'd better disable irqs directly inside of rcu_bh_qs() to
fix this, otherwise the kernel may occasionally freeze, as
observed. Doing it this way is also safe for any potential
rcu_bh_qs() usage elsewhere in the future.

Cc: stable...@vger.kernel.org
Signed-off-by: Tiejun Chen 
Signed-off-by: Bin Jiang 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 kernel/rcutree.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 82c2224..6c2ec2d 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -175,7 +175,12 @@ static void rcu_preempt_qs(int cpu);
 
 void rcu_bh_qs(int cpu)
 {
+   unsigned long flags;
+
+   /* Callers to this function, rcu_preempt_qs(), must disable irqs. */
+   local_irq_save(flags);
rcu_preempt_qs(cpu);
+   local_irq_restore(flags);
 }
 #else
 void rcu_bh_qs(int cpu)
-- 
1.8.5.3



[PATCH RT 3/6] rt: Make cpu_chill() use hrtimer instead of msleep()

2014-03-04 Thread Steven Rostedt

3.2.55-rt79-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Steven Rostedt 

Ulrich Obergfell pointed out that cpu_chill() calls msleep() which is woken
up by the ksoftirqd running the TIMER softirq. But as the cpu_chill() is
called from softirq context, it may block the ksoftirqd() from running, in
which case, it may never wake up the msleep() causing the deadlock.

I checked the vmcore, and irq/74-qla2xxx is stuck in the msleep() call,
running on CPU 8. The one ksoftirqd that is stuck, happens to be the one that
runs on CPU 8, and it is blocked on a lock held by irq/74-qla2xxx. As that
ksoftirqd is the one that will wake up irq/74-qla2xxx, and it happens to be
blocked on a lock that irq/74-qla2xxx holds, we have our deadlock.

The solution is not to convert the cpu_chill() back to a cpu_relax() as that
will re-create a possible live lock that the cpu_chill() fixed earlier, and may
also leave this bug open on other softirqs. The fix is to remove the
dependency on ksoftirqd from cpu_chill(). That is, instead of calling
msleep() that requires ksoftirqd to wake it up, use the
hrtimer_nanosleep() code that does the wakeup from hard irq context.
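
For readers without a kernel tree at hand, a rough userspace analogue of
the new cpu_chill() is a 1 ms relative sleep on CLOCK_MONOTONIC. The
sketch below assumes POSIX clock_nanosleep(); chill_1ms() is a made-up
name, and the part that matters for the deadlock (the wakeup coming from
hard irq context instead of ksoftirqd) cannot be shown from userspace.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <time.h>

/*
 * Hypothetical userspace stand-in for cpu_chill(): sleep for 1 ms,
 * relative, on the monotonic clock. Returns 0 on success or an errno
 * value (for example EINTR) as reported by clock_nanosleep().
 */
int chill_1ms(void)
{
	struct timespec ts = {
		.tv_sec = 0,
		.tv_nsec = 1000000L,	/* 1 ms, i.e. NSEC_PER_MSEC */
	};

	return clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, NULL);
}
```

A retry loop that would otherwise spin on a held lock would call
chill_1ms() between attempts, much as the RT code calls cpu_chill().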

|Looks to be the lock of the block softirq. I don't have the core dump
|anymore, but from what I could tell the ksoftirqd was blocked on the
|block softirq lock, where the block softirq handler did a msleep
|(called by the qla2xxx interrupt handler).
|
|Looking at trigger_softirq() in block/blk-softirq.c, it can do a
|smp_callfunction() to another cpu to run the block softirq. If that
|happens to be the cpu where the qla2xxx irq handler is doing the block
|softirq and is in the middle of a msleep(), I believe the ksoftirqd will
|try to run the softirq. If it does that, then BOOM, it's deadlocked
|because the ksoftirqd will never run the timer softirq either.

|I should have also stated that it was only one lock that was involved.
|But the lock owner was doing a msleep() that requires a wakeup by
|ksoftirqd to continue. If ksoftirqd happens to be blocked on a lock
|held by the msleep() caller, then you have your deadlock.
|
|It's best not to have any softirqs going to sleep requiring another
|softirq to wake it up. Note, if we ever require a timer softirq to do a
|cpu_chill() it will most definitely hit this deadlock.

Cc: stable...@vger.kernel.org
Found-by: Ulrich Obergfell 
Signed-off-by: Steven Rostedt 
[bigeasy: add the 4 | chapters from email]
Signed-off-by: Sebastian Andrzej Siewior 
---
 include/linux/delay.h |  2 +-
 kernel/hrtimer.c  | 15 +++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/delay.h b/include/linux/delay.h
index e23a7c0..37caab3 100644
--- a/include/linux/delay.h
+++ b/include/linux/delay.h
@@ -53,7 +53,7 @@ static inline void ssleep(unsigned int seconds)
 }
 
 #ifdef CONFIG_PREEMPT_RT_FULL
-# define cpu_chill()   msleep(1)
+extern void cpu_chill(void);
 #else
 # define cpu_chill()   cpu_relax()
 #endif
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ed0f3a1..a9f8842 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1847,6 +1847,21 @@ SYSCALL_DEFINE2(nanosleep, struct timespec __user *, rqtp,
return hrtimer_nanosleep(&tu, rmtp, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
 }
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+/*
+ * Sleep for 1 ms in hope whoever holds what we want will let it go.
+ */
+void cpu_chill(void)
+{
+   struct timespec tu = {
+   .tv_nsec = NSEC_PER_MSEC,
+   };
+
+   hrtimer_nanosleep(&tu, NULL, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
+}
+EXPORT_SYMBOL(cpu_chill);
+#endif
+
 /*
  * Functions related to boot-time initialization:
  */
-- 
1.8.5.3



[PATCH RT 6/6] Linux 3.2.55-rt79-rc1

2014-03-04 Thread Steven Rostedt

3.2.55-rt79-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: "Steven Rostedt (Red Hat)" 

---
 localversion-rt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index 30758e0..cf494ca 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt78
+-rt79-rc1
-- 
1.8.5.3



[PATCH RT 4/6] kernel/hrtimer: be non-freezeable in cpu_chill()

2014-03-04 Thread Steven Rostedt

3.2.55-rt79-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

Since we replaced msleep() by hrtimer I see now and then (rarely) this:

| [] Waiting for /dev to be fully populated...
| =
| [ BUG: udevd/229 still has locks held! ]
| 3.12.11-rt17 #23 Not tainted
| -
| 1 lock held by udevd/229:
|  #0:  (&type->i_mutex_dir_key#2){+.+.+.}, at: lookup_slow+0x28/0x98
|
| stack backtrace:
| CPU: 0 PID: 229 Comm: udevd Not tainted 3.12.11-rt17 #23
| (unwind_backtrace+0x0/0xf8) from (show_stack+0x10/0x14)
| (show_stack+0x10/0x14) from (dump_stack+0x74/0xbc)
| (dump_stack+0x74/0xbc) from (do_nanosleep+0x120/0x160)
| (do_nanosleep+0x120/0x160) from (hrtimer_nanosleep+0x90/0x110)
| (hrtimer_nanosleep+0x90/0x110) from (cpu_chill+0x30/0x38)
| (cpu_chill+0x30/0x38) from (dentry_kill+0x158/0x1ec)
| (dentry_kill+0x158/0x1ec) from (dput+0x74/0x15c)
| (dput+0x74/0x15c) from (lookup_real+0x4c/0x50)
| (lookup_real+0x4c/0x50) from (__lookup_hash+0x34/0x44)
| (__lookup_hash+0x34/0x44) from (lookup_slow+0x38/0x98)
| (lookup_slow+0x38/0x98) from (path_lookupat+0x208/0x7fc)
| (path_lookupat+0x208/0x7fc) from (filename_lookup+0x20/0x60)
| (filename_lookup+0x20/0x60) from (user_path_at_empty+0x50/0x7c)
| (user_path_at_empty+0x50/0x7c) from (user_path_at+0x14/0x1c)
| (user_path_at+0x14/0x1c) from (vfs_fstatat+0x48/0x94)
| (vfs_fstatat+0x48/0x94) from (SyS_stat64+0x14/0x30)
| (SyS_stat64+0x14/0x30) from (ret_fast_syscall+0x0/0x48)

For now I see no better way but to disable the freezer for the sleep period.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 kernel/hrtimer.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index a9f8842..71acc21 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1856,8 +1856,12 @@ void cpu_chill(void)
struct timespec tu = {
.tv_nsec = NSEC_PER_MSEC,
};
+   unsigned int freeze_flag = current->flags & PF_NOFREEZE;
 
+   current->flags |= PF_NOFREEZE;
hrtimer_nanosleep(&tu, NULL, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
+   if (!freeze_flag)
+   current->flags &= ~PF_NOFREEZE;
 }
 EXPORT_SYMBOL(cpu_chill);
 #endif
-- 
1.8.5.3



[PATCH RT 5/6] arm/unwind: use a raw_spin_lock

2014-03-04 Thread Steven Rostedt

3.2.55-rt79-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior 

Mostly unwind is done with irqs enabled; however, SLUB may call it with
irqs disabled while creating a new SLUB cache.

I had a system freeze while loading a module which called
kmem_cache_create() on init. That means SLUB's __slab_alloc() disabled
interrupts and then

->new_slab_objects()
 ->new_slab()
  ->setup_object()
   ->setup_object_debug()
->init_tracking()
 ->set_track()
  ->save_stack_trace()
   ->save_stack_trace_tsk()
->walk_stackframe()
 ->unwind_frame()
  ->unwind_find_idx()
   =>spin_lock_irqsave(&unwind_lock);

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt 
---
 arch/arm/kernel/unwind.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/kernel/unwind.c b/arch/arm/kernel/unwind.c
index 00df012..bbafc67 100644
--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -87,7 +87,7 @@ extern const struct unwind_idx __start_unwind_idx[];
 static const struct unwind_idx *__origin_unwind_idx;
 extern const struct unwind_idx __stop_unwind_idx[];
 
-static DEFINE_SPINLOCK(unwind_lock);
+static DEFINE_RAW_SPINLOCK(unwind_lock);
 static LIST_HEAD(unwind_tables);
 
 /* Convert a prel31 symbol to an absolute address */
@@ -195,7 +195,7 @@ static const struct unwind_idx *unwind_find_idx(unsigned long addr)
/* module unwind tables */
struct unwind_table *table;
 
-   spin_lock_irqsave(&unwind_lock, flags);
+   raw_spin_lock_irqsave(&unwind_lock, flags);
list_for_each_entry(table, &unwind_tables, list) {
if (addr >= table->begin_addr &&
addr < table->end_addr) {
@@ -207,7 +207,7 @@ static const struct unwind_idx *unwind_find_idx(unsigned long addr)
break;
}
}
-   spin_unlock_irqrestore(&unwind_lock, flags);
+   raw_spin_unlock_irqrestore(&unwind_lock, flags);
}
 
pr_debug("%s: idx = %p\n", __func__, idx);
@@ -469,9 +469,9 @@ struct unwind_table *unwind_table_add(unsigned long start, unsigned long size,
tab->begin_addr = text_addr;
tab->end_addr = text_addr + text_size;
 
-   spin_lock_irqsave(&unwind_lock, flags);
+   raw_spin_lock_irqsave(&unwind_lock, flags);
list_add_tail(&tab->list, &unwind_tables);
-   spin_unlock_irqrestore(&unwind_lock, flags);
+   raw_spin_unlock_irqrestore(&unwind_lock, flags);
 
return tab;
 }
@@ -483,9 +483,9 @@ void unwind_table_del(struct unwind_table *tab)
if (!tab)
return;
 
-   spin_lock_irqsave(&unwind_lock, flags);
+   raw_spin_lock_irqsave(&unwind_lock, flags);
list_del(&tab->list);
-   spin_unlock_irqrestore(&unwind_lock, flags);
+   raw_spin_unlock_irqrestore(&unwind_lock, flags);
 
kfree(tab);
 }
-- 
1.8.5.3



[PATCH] phy: fix compiler array bounds warning on settings[]

2014-03-04 Thread Bjorn Helgaas

With -Werror=array-bounds, gcc v4.7.x warns that in phy_find_valid(), the
settings[] "array subscript is above array bounds", I think because idx is
a signed integer and if the caller supplied idx < 0, we pass the guard but
still reference out of bounds.

Fix this by making idx unsigned here and elsewhere.
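
To illustrate why the unsigned index closes the hole, here is a small
standalone sketch modelled on phy_find_valid(). The table contents and the
names setting_demo, settings_demo, NUM_SETTINGS_DEMO and find_valid_demo()
are invented for the example and are not the driver's real data.

```c
#include <stddef.h>

/* Hypothetical, much shortened stand-in for the phy settings[] table. */
struct setting_demo {
	int speed;
	unsigned int mask;
};

static const struct setting_demo settings_demo[] = {
	{ 1000, 0x4 },
	{  100, 0x2 },
	{   10, 0x1 },
};

#define NUM_SETTINGS_DEMO (sizeof(settings_demo) / sizeof(settings_demo[0]))

/*
 * Return the first index at or after idx whose mask intersects features,
 * or the last index if nothing matches. Because idx is unsigned, a
 * negative value passed by a caller becomes a huge number, fails the
 * "idx < NUM_SETTINGS_DEMO" guard, and the table is never indexed with it.
 */
unsigned int find_valid_demo(unsigned int idx, unsigned int features)
{
	while (idx < NUM_SETTINGS_DEMO && !(settings_demo[idx].mask & features))
		idx++;

	return idx < NUM_SETTINGS_DEMO ? idx : (unsigned int)(NUM_SETTINGS_DEMO - 1);
}
```

With a signed idx, a value of -1 would satisfy the upper-bound guard and
then be used as a subscript; with the unsigned version the same bit
pattern simply falls through to the "last setting" result.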

Signed-off-by: Bjorn Helgaas 
---
 drivers/net/phy/phy.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 19c9eca0ef26..76d96b9ebcdb 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -164,9 +164,9 @@ static const struct phy_setting settings[] = {
  *   of that setting.  Returns the index of the last setting if
  *   none of the others match.
  */
-static inline int phy_find_setting(int speed, int duplex)
+static inline unsigned int phy_find_setting(int speed, int duplex)
 {
-   int idx = 0;
+   unsigned int idx = 0;
 
while (idx < ARRAY_SIZE(settings) &&
   (settings[idx].speed != speed || settings[idx].duplex != duplex))
@@ -185,7 +185,7 @@ static inline int phy_find_setting(int speed, int duplex)
  *   the mask in features.  Returns the index of the last setting
  *   if nothing else matches.
  */
-static inline int phy_find_valid(int idx, u32 features)
+static inline unsigned int phy_find_valid(unsigned int idx, u32 features)
 {
while (idx < MAX_NUM_SETTINGS && !(settings[idx].setting & features))
idx++;
@@ -204,7 +204,7 @@ static inline int phy_find_valid(int idx, u32 features)
 static void phy_sanitize_settings(struct phy_device *phydev)
 {
u32 features = phydev->supported;
-   int idx;
+   unsigned int idx;
 
/* Sanitize settings based on PHY capabilities */
if ((features & SUPPORTED_Autoneg) == 0)
@@ -954,7 +954,8 @@ int phy_init_eee(struct phy_device *phydev, bool clk_stop_enable)
(phydev->interface == PHY_INTERFACE_MODE_RGMII))) {
int eee_lp, eee_cap, eee_adv;
u32 lp, cap, adv;
-   int idx, status;
+   int status;
+   unsigned int idx;
 
/* Read phy status to properly get the right settings */
status = phy_read_status(phydev);


Re: [PATCH 2/6] mm: add get_pageblock_migratetype_nolock() for cases where locking is undesirable

2014-03-04 Thread Joonsoo Kim

On Fri, Feb 28, 2014 at 03:15:00PM +0100, Vlastimil Babka wrote:
> In order to prevent race with set_pageblock_migratetype, most of calls to
> get_pageblock_migratetype have been moved under zone->lock. For the remaining
> call sites, the extra locking is undesirable, notably in free_hot_cold_page().
> 
> This patch introduces a _nolock version to be used on these call sites, where
> a wrong value does not affect correctness. The function makes sure that the
> value does not exceed valid migratetype numbers. Such too-high values are
> assumed to be a result of race and caller-supplied fallback value is returned
> instead.
> 
> Signed-off-by: Vlastimil Babka 
> ---
>  include/linux/mmzone.h | 24 
>  mm/compaction.c| 14 +++---
>  mm/memory-failure.c|  3 ++-
>  mm/page_alloc.c| 22 +-
>  mm/vmstat.c|  2 +-
>  5 files changed, 55 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index fac5509..7c3f678 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -75,6 +75,30 @@ enum {
>  
>  extern int page_group_by_mobility_disabled;
>  
> +/*
> + * When called without zone->lock held, a race with set_pageblock_migratetype
> + * may result in bogus values. Use this variant only when this does not affect
> + * correctness, and taking zone->lock would be costly. Values >= MIGRATE_TYPES
> + * are considered to be a result of this race and the value of race_fallback
> + * argument is returned instead.
> + */
> +static inline int get_pageblock_migratetype_nolock(struct page *page,
> + int race_fallback)
> +{
> + int ret = get_pageblock_flags_group(page, PB_migrate, PB_migrate_end);
> +
> + if (unlikely(ret >= MIGRATE_TYPES))
> + ret = race_fallback;
> +
> + return ret;
> +}

How about the forms below?

get_pageblock_migratetype_locked(struct page *page)
get_pageblock_migratetype(struct page *page, int race_fallback)

Having get_pageblock_migratetype() and a _nolock variant looks error-prone,
because a developer who tries to use get_pageblock_migratetype() may not know
that it needs the lock.

Thanks.

Re: [PATCH v2 1/2] Staging: comedi: introduce {outl,inl}_amcc() and {outl,inl}_iobase() helper functions in hwdrv_apci1564.c

2014-03-04 Thread Greg KH

On Mon, Mar 03, 2014 at 12:27:55PM +0300, Dan Carpenter wrote:
> On Sun, Mar 02, 2014 at 08:52:19PM -0600, Chase Southwood wrote:
> > This patch introduces a few simple outl and inl helper functions to allow
> > several lines which violate the character limit to be shortened
> > appropriately.  It also changes a few macro values which represented
> > offset values from a single unique base value to instead represent the value
> > of that base plus the offset.  This is to simplify the use of these macros
> > in the new helper functions.
> > 
> > Cc: Dan Carpenter 
> > Signed-off-by: Chase Southwood 
> > ---
> > 
> > All right, here's another shot at this.  Dan, I took your outl_amcc idea
> > and did a version for the outl/inl calls based from devpriv->iobase as well.
> > I changed all of the macros which only offset from one base value as you
> > had mentioned as well, and the result is starting to look very good.
> > The only outl/inl calls which still look a little gross (see PATCH v2 2/2) are
> > the ones involving DIGITAL_OP_WATCHDOG, TIMER, or any of the COUNTER macros,
> > just because they use a common set of offset macros so simplifying
> > those calls in the same way as the rest isn't possible.  What are your
> > thoughts on this version of the patchset?
> > 
> > This is version 2 of [PATCH 1/2] Staging: comedi: introduce outl_1564_* and
> > inl_1564_* helper functions in hwdrv_apci1564.c
> > 
> > 2: Changed helper functions from {outl,inl}_1564_*() to
> > {outl,inl}_{amcc,iobase}()
> > 
> > Comments welcome!
> > 
> >  .../comedi/drivers/addi-data/hwdrv_apci1564.c  | 38 +-
> >  1 file changed, 30 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/staging/comedi/drivers/addi-data/hwdrv_apci1564.c b/drivers/staging/comedi/drivers/addi-data/hwdrv_apci1564.c
> > index 2b47fa1..58e301d 100644
> > --- a/drivers/staging/comedi/drivers/addi-data/hwdrv_apci1564.c
> > +++ b/drivers/staging/comedi/drivers/addi-data/hwdrv_apci1564.c
> > @@ -49,25 +49,25 @@ This program is distributed in the hope that it will be useful, but WITHOUT ANY
> >  /* DIGITAL INPUT-OUTPUT DEFINE */
> >  /* Input defines */
> >  #define APCI1564_DIGITAL_IP                     0x04
> > -#define APCI1564_DIGITAL_IP_INTERRUPT_MODE1     4
> > -#define APCI1564_DIGITAL_IP_INTERRUPT_MODE2     8
> > -#define APCI1564_DIGITAL_IP_IRQ                 16
> > +#define APCI1564_DIGITAL_IP_INTERRUPT_MODE1     (0x04 + 0x04)
> > +#define APCI1564_DIGITAL_IP_INTERRUPT_MODE2     (0x04 + 0x08)
> > +#define APCI1564_DIGITAL_IP_IRQ                 (0x04 + 0x10)
> 
> You can't change these without changing the callers.  This bit needs to
> be in patch 2/2.  You should probably just merge both patches anyway
> because presumably this one adds some GCC warnings about unused static
> functions.
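
Dan's point about the callers can be illustrated with a small sketch. It assumes, as the changelog implies, that existing callers add the offset macro to the APCI1564_DIGITAL_IP base themselves. The OLD_/NEW_ prefixed names and both helper functions are hypothetical and exist only for this illustration; they are not taken from the driver.

```c
/* Sketch only: the OLD_/NEW_ names and both helpers are hypothetical. */
#define APCI1564_DIGITAL_IP		0x04

/* Before the patch: an offset relative to APCI1564_DIGITAL_IP. */
#define OLD_DIGITAL_IP_INTERRUPT_MODE1	4
/* After the patch: base and offset folded into one value. */
#define NEW_DIGITAL_IP_INTERRUPT_MODE1	(0x04 + 0x04)

/* An unconverted caller still adds the base itself. */
static unsigned long unconverted_caller(unsigned long iobase, unsigned int reg)
{
	return iobase + APCI1564_DIGITAL_IP + reg;
}

/* A converted caller relies on the macro already containing the base. */
static unsigned long converted_caller(unsigned long iobase, unsigned int reg)
{
	return iobase + reg;
}
```

If only the macro changes, an unconverted caller adds the base twice and lands on the wrong register, which is why the macro change and the caller change belong in the same patch.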

I'm totally confused about this series...

Chase, can you resend any outstanding patches that I haven't applied of
yours, as these different revisions and threads don't make much sense
right now.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Thomas Gleixner

On Tue, 4 Mar 2014, Khalid Aziz wrote:
> be in the right control group. Besides they want to use a common mechanism
> across multiple OSs and pre-emption delay is already in use on other OSs. Good
> idea though.

Well, just because preemption delay is a mechanism exposed by some
other OS does not make it a good idea.

In fact it's a horrible idea.

What you are creating is a crystal ball based form of time bound
priority ceiling with the worst user space interface I've ever seen.

Thanks,

tglx

Re: [PATCH 3/6] mm: add is_migrate_isolate_page_nolock() for cases where locking is undesirable

2014-03-04 Thread Joonsoo Kim

On Fri, Feb 28, 2014 at 03:15:01PM +0100, Vlastimil Babka wrote:
> This patch complements the addition of get_pageblock_migratetype_nolock() for
> the case where is_migrate_isolate_page() cannot be called with zone->lock 
> held.
> A race with set_pageblock_migratetype() may be detected, in which case a 
> caller
> supplied argument is returned.
> 
> Signed-off-by: Vlastimil Babka 
> ---
>  include/linux/page-isolation.h | 24 
>  mm/hugetlb.c   |  2 +-
>  2 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
> index 3fff8e7..f7bd491 100644
> --- a/include/linux/page-isolation.h
> +++ b/include/linux/page-isolation.h
> @@ -2,10 +2,30 @@
>  #define __LINUX_PAGEISOLATION_H
>  
>  #ifdef CONFIG_MEMORY_ISOLATION
> +/*
> + * Should be called only with zone->lock held. In cases where locking 
> overhead
> + * is undesirable, consider the _nolock version.
> + */
>  static inline bool is_migrate_isolate_page(struct page *page)
>  {
>   return get_pageblock_migratetype(page) == MIGRATE_ISOLATE;
>  }
> +/*
> + * When called without zone->lock held, a race with set_pageblock_migratetype
> + * may result in bogus values. The race may be detected, in which case the
> + * value of race_fallback argument is returned. For details, see
> + * get_pageblock_migratetype_nolock().
> + */
> +static inline bool is_migrate_isolate_page_nolock(struct page *page,
> + bool race_fallback)
> +{
> + int migratetype = get_pageblock_migratetype_nolock(page, MIGRATE_TYPES);
> +
> + if (unlikely(migratetype == MIGRATE_TYPES))
> + return race_fallback;
> +
> + return migratetype == MIGRATE_ISOLATE;
> +}
>  static inline bool is_migrate_isolate(int migratetype)
>  {
>   return migratetype == MIGRATE_ISOLATE;
> @@ -15,6 +35,10 @@ static inline bool is_migrate_isolate_page(struct page 
> *page)
>  {
>   return false;
>  }
> +static inline bool is_migrate_isolate_page_nolock(struct page *page)
> +{
> + return false;
> +}
>  static inline bool is_migrate_isolate(int migratetype)
>  {
>   return false;

Nitpick.
You need race_fallback parameter for is_migrate_isolate_page_nolock().
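
A minimal sketch of what the nitpick asks for, assuming the stub for the case without CONFIG_MEMORY_ISOLATION should simply mirror the two-argument signature and keep returning false. The opaque struct page declaration is only there to make the fragment compile on its own.

```c
#include <stdbool.h>

struct page;	/* opaque here; stands in for the kernel's struct page */

/*
 * Stub variant: same signature as the real helper so that callers
 * compile in both configurations. The fallback value is ignored
 * because no page can be isolated in this configuration.
 */
static inline bool is_migrate_isolate_page_nolock(struct page *page,
						  bool race_fallback)
{
	(void)page;
	(void)race_fallback;
	return false;
}
```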


[PATCH] irqchip: Remove asmlinkage from static functions

2014-03-04 Thread Stephen Boyd

LTO patches add __visible to the asmlinkage define, causing
compilation warnings like:

  drivers/irqchip/irq-gic.c:283:1: warning: 'externally_visible'
  attribute have effect only on public objects [-Wattributes]

Drop asmlinkage here to avoid such warnings.

Reported-by: Olof's autobuilder 
Cc: Andi Kleen 
Signed-off-by: Stephen Boyd 
---

Based on next-20140226.

 drivers/irqchip/irq-armada-370-xp.c | 2 +-
 drivers/irqchip/irq-bcm2835.c   | 4 ++--
 drivers/irqchip/irq-gic.c   | 2 +-
 drivers/irqchip/irq-mmp.c   | 6 ++
 drivers/irqchip/irq-moxart.c| 2 +-
 drivers/irqchip/irq-orion.c | 2 +-
 drivers/irqchip/irq-sirfsoc.c   | 2 +-
 drivers/irqchip/irq-sun4i.c | 4 ++--
 drivers/irqchip/irq-vic.c   | 2 +-
 drivers/irqchip/irq-vt8500.c| 3 +--
 drivers/irqchip/irq-zevio.c | 2 +-
 11 files changed, 14 insertions(+), 17 deletions(-)

diff --git a/drivers/irqchip/irq-armada-370-xp.c 
b/drivers/irqchip/irq-armada-370-xp.c
index cd79503abea9..41be897df8d5 100644
--- a/drivers/irqchip/irq-armada-370-xp.c
+++ b/drivers/irqchip/irq-armada-370-xp.c
@@ -410,7 +410,7 @@ static void armada_370_xp_mpic_handle_cascade_irq(unsigned 
int irq,
chained_irq_exit(chip, desc);
 }
 
-static asmlinkage void __exception_irq_entry
+static void __exception_irq_entry
 armada_370_xp_handle_irq(struct pt_regs *regs)
 {
u32 irqstat, irqnr;
diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
index 1693b8e7f26a..5916d6cdafa1 100644
--- a/drivers/irqchip/irq-bcm2835.c
+++ b/drivers/irqchip/irq-bcm2835.c
@@ -95,7 +95,7 @@ struct armctrl_ic {
 };
 
 static struct armctrl_ic intc __read_mostly;
-static asmlinkage void __exception_irq_entry bcm2835_handle_irq(
+static void __exception_irq_entry bcm2835_handle_irq(
struct pt_regs *regs);
 
 static void armctrl_mask_irq(struct irq_data *d)
@@ -196,7 +196,7 @@ static void armctrl_handle_shortcut(int bank, struct 
pt_regs *regs,
handle_IRQ(irq_linear_revmap(intc.domain, irq), regs);
 }
 
-static asmlinkage void __exception_irq_entry bcm2835_handle_irq(
+static void __exception_irq_entry bcm2835_handle_irq(
struct pt_regs *regs)
 {
u32 stat, irq;
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index a917b144895e..63922b9ba6b7 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -279,7 +279,7 @@ static int gic_set_wake(struct irq_data *d, unsigned int on)
 #define gic_set_wake   NULL
 #endif
 
-static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs 
*regs)
+static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
 {
u32 irqstat, irqnr;
struct gic_chip_data *gic = &gic_data[0];
diff --git a/drivers/irqchip/irq-mmp.c b/drivers/irqchip/irq-mmp.c
index 2cb7cd0bc2f5..3c8827fe83f3 100644
--- a/drivers/irqchip/irq-mmp.c
+++ b/drivers/irqchip/irq-mmp.c
@@ -194,8 +194,7 @@ static struct mmp_intc_conf mmp2_conf = {
.conf_mask  = 0x7f,
 };
 
-static asmlinkage void __exception_irq_entry
-mmp_handle_irq(struct pt_regs *regs)
+static void __exception_irq_entry mmp_handle_irq(struct pt_regs *regs)
 {
int irq, hwirq;
 
@@ -207,8 +206,7 @@ mmp_handle_irq(struct pt_regs *regs)
handle_IRQ(irq, regs);
 }
 
-static asmlinkage void __exception_irq_entry
-mmp2_handle_irq(struct pt_regs *regs)
+static void __exception_irq_entry mmp2_handle_irq(struct pt_regs *regs)
 {
int irq, hwirq;
 
diff --git a/drivers/irqchip/irq-moxart.c b/drivers/irqchip/irq-moxart.c
index 5552fc2bf28a..00b3cc908f76 100644
--- a/drivers/irqchip/irq-moxart.c
+++ b/drivers/irqchip/irq-moxart.c
@@ -44,7 +44,7 @@ struct moxart_irq_data {
 
 static struct moxart_irq_data intc;
 
-static asmlinkage void __exception_irq_entry handle_irq(struct pt_regs *regs)
+static void __exception_irq_entry handle_irq(struct pt_regs *regs)
 {
u32 irqstat;
int hwirq;
diff --git a/drivers/irqchip/irq-orion.c b/drivers/irqchip/irq-orion.c
index 8e41be62812e..e25f246cd2fb 100644
--- a/drivers/irqchip/irq-orion.c
+++ b/drivers/irqchip/irq-orion.c
@@ -30,7 +30,7 @@
 
 static struct irq_domain *orion_irq_domain;
 
-static asmlinkage void
+static void
 __exception_irq_entry orion_handle_irq(struct pt_regs *regs)
 {
struct irq_domain_chip_generic *dgc = orion_irq_domain->gc;
diff --git a/drivers/irqchip/irq-sirfsoc.c b/drivers/irqchip/irq-sirfsoc.c
index 3a070c587ed9..581eefe331ae 100644
--- a/drivers/irqchip/irq-sirfsoc.c
+++ b/drivers/irqchip/irq-sirfsoc.c
@@ -47,7 +47,7 @@ sirfsoc_alloc_gc(void __iomem *base, unsigned int irq_start, 
unsigned int num)
ct->regs.mask = SIRFSOC_INT_RISC_MASK0;
 }
 
-static asmlinkage void __exception_irq_entry sirfsoc_handle_irq(struct pt_regs 
*regs)
+static void __exception_irq_entry sirfsoc_handle_irq(struct pt_regs *regs)
 {
void __iomem *base = sirfsoc_irqdomain->host_data;
u32 irqstat, irqnr;
diff --git a/drivers/irqchip/irq-s

Re: [PATCH] Revert "irqchip: irq-dove: Add PMU interrupt controller."

2014-03-04 Thread Russell King - ARM Linux

On Tue, Mar 04, 2014 at 05:32:40AM +, Jason Cooper wrote:
> -static void dove_pmu_irq_handler(unsigned int irq, struct irq_desc *desc)
> -{
> - struct irq_domain *d = irq_get_handler_data(irq);
> - struct irq_chip_generic *gc = irq_get_domain_generic_chip(d, 0);
> - u32 stat = readl_relaxed(gc->reg_base + DOVE_PMU_IRQ_CAUSE) &
> -gc->mask_cache;
> -
> - while (stat) {
> - u32 hwirq = ffs(stat) - 1;
> -
> - generic_handle_irq(irq_find_mapping(d, gc->irq_base + hwirq));
> - stat &= ~(1 << hwirq);
> - }
> -}
> -
> -static void pmu_irq_ack(struct irq_data *d)
> -{
> - struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
> - struct irq_chip_type *ct = irq_data_get_chip_type(d);
> - u32 mask = ~d->mask;
> -
> - /*
> -  * The PMU mask register is not RW0C: it is RW.  This means that
> -  * the bits take whatever value is written to them; if you write
> -  * a '1', you will set the interrupt.
> -  *
> -  * Unfortunately this means there is NO race free way to clear
> -  * these interrupts.
> -  *
> -  * So, let's structure the code so that the window is as small as
> -  * possible.
> -  */
> - irq_gc_lock(gc);
> - mask &= irq_reg_readl(gc->reg_base +  ct->regs.ack);
> - irq_reg_writel(mask, gc->reg_base +  ct->regs.ack);
> - irq_gc_unlock(gc);
> -}

Jason, Thomas,

I've just been giving the above a whirl here with the RTC, and it
doesn't seem to quite work as it should.  Not your problem, because this
is how the code was originally.

Let's say you set an alarm for 10sec time.  When the alarm fires:

- we read the PMU interrupt status, mask it with the mask register,
  and find the RTC pending.
- we call the genirq layer for this interrupt.
- genirq does the mask + ack thing.
- the RTC interrupt handler is called, and the RTC says there's
  an interrupt pending.
- the RTC handler clears the interrupt, and returns.
- genirq unmasks the interrupt, and returns.
- dove_pmu_irq_handler() is re-entered, and again, we find that the
  RTC interrupt is pending.
- follow the above...
- the RTC interrupt handler is called, but this time there's no interrupt
  pending, so returns IRQ_NONE
- genirq unmasks the interrupt, and returns.

The reason this happens is that the attempt to "ack" - rather "clear" the
interrupt the first time around has no effect - the RTC is still asserting
the interrupt, so the write to clear the register results in the bit
remaining set.

The second time around, we've already cleared the RTC interrupt, so this
time, the ack clears the interrupt down properly.

In some ways, this is good news - it shows that the bits in this register
latch '1' when an interrupt is pending, and remain '1' while the block
continues to assert its interrupt signal - but can we say that the other
interrupt functions in this register have that behaviour?

From the spec, it looks like this is probably true of DFSDone as well.
DVSDone - I see no separate status register containing status bits
indicating what the cause of the DVSDone status is.  The thermal bits -
if it's a transitory excursion, may not hold.  Battery fault... we
can guess.

Now, genirq doesn't have a good way to handle this.  I'll also say that
because of the above, I've always been worried about hardware races when
trying to clear down interrupts in this register - I'd much prefer not
to touch it unless absolutely necessary.  So... how about this instead?

	u32 stat = readl_relaxed(gc->reg_base + DOVE_PMC_IRQ_CAUSE) &
		   gc->mask_cache;
	u32 done = ~0;

	while (stat) {
		unsigned hwirq = ffs(stat) - 1;

		stat &= ~(1 << hwirq);
		done &= ~(1 << hwirq);

		generic_handle_irq(irq_find_mapping(domain, hwirq));
	}

	irq_gc_lock(gc);
	done &= readl_relaxed(gc->reg_base + DOVE_PMC_IRQ_CAUSE);
	writel_relaxed(done, gc->reg_base + DOVE_PMC_IRQ_CAUSE);
	irq_gc_unlock(gc);

This results in the RTC alarm test receiving exactly one interrupt for
each alarm expiry, as it should do.  Thoughts?
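
The sequence described above can be modelled in user space. This is a sketch under stated assumptions, not driver code: the register model (a written 0 does not stick while the block still asserts its line), the RTC_BIT position and all function names are hypothetical, chosen only to reproduce the "two handler calls versus one" behaviour.

```c
#include <stdint.h>

#define RTC_BIT (1u << 0)	/* hypothetical bit position */

static uint32_t cause;		/* modelled PMU cause register */
static uint32_t asserted;	/* lines the source blocks still drive */
static int rtc_handled;		/* times the RTC handler found work to do */

/* Modelled RW register: a bit stays set while its line is asserted. */
static void cause_write(uint32_t val)
{
	cause = val | asserted;
}

/* Modelled RTC handler: clears the source if it is pending. */
static void rtc_handler(void)
{
	if (asserted & RTC_BIT) {
		rtc_handled++;
		asserted &= ~RTC_BIT;
	}
}

/* Ack before the handler runs; returns the number of handler calls. */
static int run_ack_first(void)
{
	int calls = 0;

	asserted = cause = RTC_BIT;
	rtc_handled = 0;
	while (cause & RTC_BIT) {
		cause_write(cause & ~RTC_BIT);	/* mask + ack step */
		rtc_handler();
		calls++;
	}
	return calls;
}

/* Handle first, clear the handled bits once at the end. */
static int run_clear_last(void)
{
	int calls = 0;
	uint32_t stat, done = ~0u;

	asserted = cause = RTC_BIT;
	rtc_handled = 0;
	stat = cause;
	while (stat & RTC_BIT) {
		stat &= ~RTC_BIT;
		done &= ~RTC_BIT;
		rtc_handler();
		calls++;
	}
	cause_write(done & cause);
	return calls;
}
```

In this model the ack-first flow enters the handler twice (the second call finds nothing pending), while the clear-last flow enters it once and leaves the cause register clear.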

Another question: ffs(stat) - any reason to use ffs() there rather than
fls(stat) which would result in simpler code?  r1 = ffs(r4 = stat) creates:

 198:   e2641000    rsb r1, r4, #0
 19c:   e1a00006    mov r0, r6
 1a0:   e0011004    and r1, r1, r4
 1a4:   e16f1f11    clz r1, r1
 1a8:   e261101f    rsb r1, r1, #31

whereas fls(stat):

 198:   e16f1f14    clz r1, r4
 19c:   e261101f    rsb r1, r1, #31
 1a0:   e1a00006    mov r0, r6

Kind of a micro-optimisation, but I see no reason to prefer one over the
other except for this - and I think the switch to ffs() was made in the
hope of optimising this code!
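
For what it's worth, both variants visit the same set of pending bits and differ only in the order. A user-space sketch, assuming GCC or Clang builtins; my_ffs(), my_fls() and the two walk helpers are hypothetical stand-ins, not the kernel's implementations:

```c
#include <stdint.h>

/* 1-based index of the least significant set bit, 0 if none. */
static int my_ffs(uint32_t x)
{
	return x ? __builtin_ctz(x) + 1 : 0;
}

/* 1-based index of the most significant set bit, 0 if none. */
static int my_fls(uint32_t x)
{
	return x ? 32 - __builtin_clz(x) : 0;
}

/* Walk the pending bits lowest first; returns the bits visited. */
static uint32_t walk_ffs(uint32_t stat)
{
	uint32_t seen = 0;

	while (stat) {
		unsigned int hwirq = my_ffs(stat) - 1;

		stat &= ~(1u << hwirq);
		seen |= 1u << hwirq;
	}
	return seen;
}

/* Walk the pending bits highest first; returns the bits visited. */
static uint32_t walk_fls(uint32_t stat)
{
	uint32_t seen = 0;

	while (stat) {
		unsigned int hwirq = my_fls(stat) - 1;

		stat &= ~(1u << hwirq);
		seen |= 1u << hwirq;
	}
	return seen;
}
```

The only behavioural difference is dispatch order, which matters only if lower-numbered sources are meant to be serviced first.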

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

Re: [PATCH v4 5/6] timerfd: Add support for deferrable timers

2014-03-04 Thread Andy Lutomirski

On Tue, Mar 4, 2014 at 4:10 PM, Thomas Gleixner  wrote:
> On Tue, 4 Mar 2014, Andy Lutomirski wrote:
>> On Tue, Mar 4, 2014 at 2:11 PM, Thomas Gleixner  wrote:
>> > We do not add another random special case syscall for timerfd just
>> > because timerfd is linux specific.
>>
>> What syscalls?  I can think of exactly two timer interfaces that
>> actually accept a clock id and flags: clock_nanosleep and
>> timerfd_settime.
>
> Sure, and what you can think of is reality?
>
>  sys_timer_settime() which relies on sys_timer_create() are outside
>  your universe, right?
>

Sigh, I forgot about those.  I would argue that there is no real
reason to make timer_create any fancier.  That kind of sucks.

> Aside from that, if you want to make the slack thing useful on a per
> call basis then you want to add it to a lot of other interfaces like
> poll.

Same with deferrable timers.  And things that want MONOTONIC *and*
REALTIME.  Etc.

>
> And you are completely ignoring the fact that the slack works
> completely differently:
>
> A slacked timer still gets enqueued into the main timer queue. It just
> relies on the fact that it gets batched with some other expiring
> timer. But that's completely different to the deferrable approach.
>
>start_timer(timer, expiry, slack);
>
>timer.hard_expiry = expiry + slack;
>timer.soft_expiry = expiry;
>enqueue_timer(timer, timer.hard_expiry);
>
> The enqueueing code puts it into the queue by looking at the
> hard_expiry code. And the expiry code looks at the timer.soft_expiry
> value to expire a timer early.
>
> Now assume the following:
>
>start_timer(timer, +100ms, 100s);
>
> So that puts that timer into the hard expiry line of 100.1 sec from
> now. So if the cpu is busy and is firing a lot of timers then your
> timer could be delayed up to the hard expiry time, i.e. 100.1 seconds
> from now, which has completely different semantics than the
> deferrable timers.

Erk.  I didn't realize that.  Is that really the desired behavior?  I
assumed that a timer with slack would fire at the earliest time after
the soft timeout at which the system wasn't idle.  The idea is to
batch wakeups, right?
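
To make the quoted pseudocode concrete, here is a user-space sketch of the soft/hard expiry split. The struct, the millisecond units and the may_expire() helper are assumptions for illustration only; the real hrtimer code is organised differently.

```c
#include <stdbool.h>
#include <stdint.h>

struct slack_timer {
	uint64_t soft_expiry;	/* earliest time the timer may fire */
	uint64_t hard_expiry;	/* latest time; used as the queue key */
};

/* Arm a timer relative to now_ms, with the given slack in ms. */
static void start_timer(struct slack_timer *t, uint64_t now_ms,
			uint64_t expiry_ms, uint64_t slack_ms)
{
	t->soft_expiry = now_ms + expiry_ms;
	t->hard_expiry = t->soft_expiry + slack_ms;
}

/*
 * The timer is queued at hard_expiry. It can only fire earlier when
 * some other event runs the expiry code after soft_expiry has passed.
 */
static bool may_expire(const struct slack_timer *t, uint64_t now_ms)
{
	return now_ms >= t->soft_expiry;
}
```

With expiry = 100 ms and slack = 100 s, the queue key ends up at 100.1 s, which is the worst-case delay described in the quoted text.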

>
> The deferrable timer is guaranteed to expire (halfways) on time when
> the system is active and does not affect the system from going idle,
> but it expires right away when the system comes back out of idle.
>
> The slack timers are just a batching mechanism to align expiry times
> of non deferrable timers to a common time.
>
> So how do you map those together?

By thinking of what semantics are actually useful for userspace developers.

I think that most userspace developers probably want the semantics
that I thought that timer slack had: I want to do work between time A
and time B.  Before A is too early, but I'm willing to wait until time
B if it improves power consumption.

Presumably, if the kernel chooses *not* to fire the timer just after
time A even if the system is awake, then it's risking an unnecessary
wakeup at time B.

(I admit that I don't really understand the hrtimer code.  I guess
that two indexes on the list of timers would be needed.)

>> > But we cannot do that right now as we cannot whip up several dozen
>> > new syscalls just because we want to add slack/deferrable whatever
>> > properties.
>
>> Two syscalls, right?
>
> It does not matter at all how many syscalls this affects. We are not
> adding any random new syscalls just because we can.
>
>> Once we agree on a solution to the Y2038 issue on 32bit with a unified
>> 32/64 bit syscall interface which simply gets rid of the timespec/val
>> nonsense and takes a simple u64 nsec value we can add the slack
>> property to that without any further inconvenience.
>
> Ignoring this won't get you anywhere.

I'm not entirely sure why per-timer slack can't be added without
simultaneously fixing Y2038 (and presumably leap seconds, too) but a
new flag can be.

--Andy

Re: [PATCH] phy: fix compiler array bounds warning on settings[]

2014-03-04 Thread Florian Fainelli

2014-03-04 16:35 GMT-08:00 Bjorn Helgaas :
> With -Werror=array-bounds, gcc v4.7.x warns that in phy_find_valid(), the
> settings[] "array subscript is above array bounds", I think because idx is
> a signed integer and if the caller supplied idx < 0, we pass the guard but
> still reference out of bounds.
>
> Fix this by making idx unsigned here and elsewhere.
>
> Signed-off-by: Bjorn Helgaas 

Acked-by: Florian Fainelli 
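
The general hazard described in the changelog can be sketched as follows. This assumes a bound that is a plain signed int; whether that matches what gcc actually saw in phy.c is not established here. NUM_SETTINGS and both guard helpers are hypothetical.

```c
#include <stdbool.h>

#define NUM_SETTINGS 4	/* hypothetical bound, a plain int constant */

/* Signed index: a negative value passes the upper-bound check. */
static bool guard_signed(int idx)
{
	return idx < NUM_SETTINGS;
}

/* Unsigned index: the same bit pattern is a huge value and is rejected. */
static bool guard_unsigned(unsigned int idx)
{
	return idx < (unsigned int)NUM_SETTINGS;
}
```

With an unsigned index, a single upper-bound comparison is enough to keep the subscript inside the array.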

> ---
>  drivers/net/phy/phy.c |   11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 19c9eca0ef26..76d96b9ebcdb 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -164,9 +164,9 @@ static const struct phy_setting settings[] = {
>   *   of that setting.  Returns the index of the last setting if
>   *   none of the others match.
>   */
> -static inline int phy_find_setting(int speed, int duplex)
> +static inline unsigned int phy_find_setting(int speed, int duplex)
>  {
> -   int idx = 0;
> +   unsigned int idx = 0;
>
> while (idx < ARRAY_SIZE(settings) &&
>(settings[idx].speed != speed || settings[idx].duplex != 
> duplex))
> @@ -185,7 +185,7 @@ static inline int phy_find_setting(int speed, int duplex)
>   *   the mask in features.  Returns the index of the last setting
>   *   if nothing else matches.
>   */
> -static inline int phy_find_valid(int idx, u32 features)
> +static inline unsigned int phy_find_valid(unsigned int idx, u32 features)
>  {
> while (idx < MAX_NUM_SETTINGS && !(settings[idx].setting & features))
> idx++;
> @@ -204,7 +204,7 @@ static inline int phy_find_valid(int idx, u32 features)
>  static void phy_sanitize_settings(struct phy_device *phydev)
>  {
> u32 features = phydev->supported;
> -   int idx;
> +   unsigned int idx;
>
> /* Sanitize settings based on PHY capabilities */
> if ((features & SUPPORTED_Autoneg) == 0)
> @@ -954,7 +954,8 @@ int phy_init_eee(struct phy_device *phydev, bool 
> clk_stop_enable)
> (phydev->interface == PHY_INTERFACE_MODE_RGMII))) {
> int eee_lp, eee_cap, eee_adv;
> u32 lp, cap, adv;
> -   int idx, status;
> +   int status;
> +   unsigned int idx;
>
> /* Read phy status to properly get the right settings */
> status = phy_read_status(phydev);
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Florian

Re: [PATCH net-next v6 9/9] xen-netback: Aggregate TX unmap operations

2014-03-04 Thread Zoltan Kiss

Despite all my efforts to do renumbering right, this subject still shows 
9/9 instead of 10/10.


On 04/03/14 22:32, Zoltan Kiss wrote:

Unmapping causes TLB flushing, therefore we should make it in the largest
possible batches. However we shouldn't starve the guest for too long. So if
the guest has space for at least two big packets and we don't have at least a
quarter ring to unmap, delay it for at most 1 millisecond.

Signed-off-by: Zoltan Kiss 
---
v4:
- use bool for tx_dealloc_work_todo

v6:
- rebase tx_dealloc_work_todo due to missing ;

  drivers/net/xen-netback/common.h|2 ++
  drivers/net/xen-netback/interface.c |2 ++
  drivers/net/xen-netback/netback.c   |   34 +-
  3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index d1cd8ce..95498c8 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -118,6 +118,8 @@ struct xenvif {
u16 dealloc_ring[MAX_PENDING_REQS];
struct task_struct *dealloc_task;
wait_queue_head_t dealloc_wq;
+   struct timer_list dealloc_delay;
+   bool dealloc_delay_timed_out;

/* Use kthread for guest RX */
struct task_struct *task;
diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index 40aa500..f925af5 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -407,6 +407,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t 
domid,
  .desc = i };
vif->grant_tx_handle[i] = NETBACK_INVALID_HANDLE;
}
+   init_timer(&vif->dealloc_delay);

/*
 * Initialise a dummy MAC address. We choose the numerically
@@ -557,6 +558,7 @@ void xenvif_disconnect(struct xenvif *vif)
}

if (vif->dealloc_task) {
+   del_timer_sync(&vif->dealloc_delay);
kthread_stop(vif->dealloc_task);
vif->dealloc_task = NULL;
}
diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index bb65c7c..c098276 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -135,6 +135,11 @@ static inline pending_ring_idx_t nr_pending_reqs(struct 
xenvif *vif)
vif->pending_prod + vif->pending_cons;
  }

+static inline pending_ring_idx_t nr_free_slots(struct xen_netif_tx_back_ring 
*ring)
+{
+   return ring->nr_ents -   (ring->sring->req_prod - 
ring->rsp_prod_pvt);
+}
+
  bool xenvif_rx_ring_slots_available(struct xenvif *vif, int needed)
  {
RING_IDX prod, cons;
@@ -1932,9 +1937,36 @@ static inline int tx_work_todo(struct xenvif *vif)
return 0;
  }

+static void xenvif_dealloc_delay(unsigned long data)
+{
+   struct xenvif *vif = (struct xenvif *)data;
+
+   vif->dealloc_delay_timed_out = true;
+   wake_up(&vif->dealloc_wq);
+}
+
  static inline bool tx_dealloc_work_todo(struct xenvif *vif)
  {
-   return vif->dealloc_cons != vif->dealloc_prod;
+   if (vif->dealloc_cons != vif->dealloc_prod) {
+   if ((nr_free_slots(&vif->tx) > 2 * XEN_NETBK_LEGACY_SLOTS_MAX) 
&&
+   (vif->dealloc_prod - vif->dealloc_cons < MAX_PENDING_REQS / 4) 
&&
+   !vif->dealloc_delay_timed_out) {
+   if (!timer_pending(&vif->dealloc_delay)) {
+   vif->dealloc_delay.function =
+   xenvif_dealloc_delay;
+   vif->dealloc_delay.data = (unsigned long)vif;
+   mod_timer(&vif->dealloc_delay,
+ jiffies + msecs_to_jiffies(1));
+
+   }
+   return false;
+   }
+   del_timer_sync(&vif->dealloc_delay);
+   vif->dealloc_delay_timed_out = false;
+   return true;
+   }
+
+   return false;
  }

  void xenvif_unmap_frontend_rings(struct xenvif *vif)
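
The batching decision in tx_dealloc_work_todo() can be read as a single predicate. A user-space sketch, with the timer handling left out; the two constant values and the should_delay_unmap() name are assumptions for illustration and should be checked against the driver headers.

```c
#include <stdbool.h>

/* Assumed values for illustration only. */
#define MAX_PENDING_REQS		256
#define XEN_NETBK_LEGACY_SLOTS_MAX	18

/*
 * Delay unmapping only while all three conditions hold:
 *  - the guest still has room for more than two big packets,
 *  - less than a quarter ring is waiting to be unmapped,
 *  - the 1 ms delay timer has not fired yet.
 */
static bool should_delay_unmap(unsigned int free_slots,
			       unsigned int pending_unmaps,
			       bool timed_out)
{
	if (!pending_unmaps)
		return false;	/* nothing to unmap, nothing to delay */

	return free_slots > 2 * XEN_NETBK_LEGACY_SLOTS_MAX &&
	       pending_unmaps < MAX_PENDING_REQS / 4 &&
	       !timed_out;
}
```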




Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed

2014-03-04 Thread Eric Dumazet

On Tue, 2014-03-04 at 15:55 -0800, Florian Fainelli wrote:
> Hi all,
> 
> I am seeing the following RCU stalls messages appearing on an ARMv7
> 4xCPUs system running 3.14-rc4:
> 
> [   42.974327] INFO: rcu_sched detected stalls on CPUs/tasks:
> [   42.979839]  (detected by 0, t=2102 jiffies, g=4294967082,
> c=4294967081, q=516)
> [   42.987169] INFO: Stall ended before state dump start
> 
> this is happening under the following conditions:
> 
> - the attached bumper.c binary alters various kernel thread priorities
> based on the contents of bumpup.cfg and
> - malloc_crazy is running from a NFS share
> - malloc_crazy.c is running in a loop allocating chunks of memory but
> never freeing it
> 
> when the priorities are altered, instead of getting the OOM killer to
> be invoked, the RCU stalls are happening. Taking NFS out of the
> equation does not allow me to reproduce the problem even with the
> priorities altered.
> 
> This "problem" seems to have been there for quite a while now since I
> was able to get 3.8.13 to trigger that bug as well, with a slightly
> more detailed RCU debugging trace which points the finger at kswapd0.
> 
> You should be able to get that reproduced under QEMU with the
> Versatile Express platform emulating a Cortex A15 CPU and the attached
> files.
> 
> Any help or suggestions would be greatly appreciated. Thanks!

Do you have a more complete trace, including stack traces ?





Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Andi Kleen

Thomas Gleixner  writes:

> On Tue, 4 Mar 2014, Khalid Aziz wrote:
>> be in the right control group. Besides they want to use a common mechanism
>> across multiple OSs and pre-emption delay is already in use on other OSs. 
>> Good
>> idea though.
>
> Well, just because preemption delay is a mechanism exposed by some
> other OS does not make it a good idea.
>
> In fact it's a horrible idea.
>
> What you are creating is a crystal ball based form of time bound
> priority ceiling with the worst user space interface I've ever seen.

So how would you solve the user space lock holder preemption 
problem then?

It's a real problem, affecting lots of workloads.

Just saying everything is crap without suggesting anything 
constructive is not really getting us anywhere.

The workarounds I've seen for it are generally far worse
than this. Often people do all kinds of fragile tunings
to address this, which then break on the next kernel
update that does even a minor scheduler change.

futex doesn't solve the problem at all.

The real time scheduler is also a really poor fit for these
workloads and needs a lot of hacks to scale.

The thread swap proposal from plumbers had some potential,
but it's likely very intrusive everywhere and seems
to have died too.

Anything else?

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only

[GIT PULL] tracing: Do not add event files for modules that fail tracepoints

2014-03-04 Thread Steven Rostedt


Linus,

In the past, I've had lots of reports about trace events not working.
Developers would say they put a trace_printk() before and after the trace
event but when they enable it (and the trace event said it was enabled) they
would see the trace_printks but not the trace event.

I was not able to reproduce this, but that's because I wasn't looking at
the right location. Recently, another bug came up that showed the issue.

If your kernel supports signed modules but allows for non-signed modules
to be loaded, then when one is, the kernel will silently set the
MODULE_FORCED taint on the module. Although, this taint happens without
the need for insmod --force or anything of the kind, it labels the
module with that taint anyway.

If this tainted module has tracepoints, the tracepoints will be ignored
because of the MODULE_FORCED taint. But no error message will be
displayed. Worse yet, the event infrastructure will still be created
letting users enable the trace event represented by the tracepoint,
although that event will never actually be enabled. This is because
the tracepoint infrastructure allows for non-existing tracepoints to
be enabled for new modules to arrive and have their tracepoints set.
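
The taint test that decides whether a module's tracepoints are accepted is a plain bitmask check. A user-space sketch of the helper added by the patch; the bit numbers are assumed here and should be checked against the kernel headers of the tree in use, and has_bad_taint() is a hypothetical stand-in name.

```c
#include <stdbool.h>

/* Assumed bit numbers, for illustration only. */
#define TAINT_FORCED_MODULE	1
#define TAINT_CRAP		10
#define TAINT_OOT_MODULE	12

/* Any taint other than "out of tree" or "staging" counts as bad. */
static bool has_bad_taint(unsigned long taints)
{
	return taints & ~((1UL << TAINT_OOT_MODULE) | (1UL << TAINT_CRAP));
}
```

Under these assumptions, an unsigned module loaded on a kernel with module signing picks up the forced-module bit, so it fails this check even though nobody passed --force.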

Although there are several things wrong with the above, this change
only addresses the creation of the trace event files for a module whose
tracepoints were rejected because of its taint. With this change, an
error message about the taint is printed when the module is loaded, and
the trace event infrastructure for that module is not created.

Please pull the latest trace-fixes-v3.14-rc5 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-fixes-v3.14-rc5

Tag SHA1: babb7844ff64d24d7206dd477c0dacc5683d05b7
Head SHA1: 45ab2813d40d88fc575e753c38478de242d03f88


Steven Rostedt (Red Hat) (1):
  tracing: Do not add event files for modules that fail tracepoints


 include/linux/tracepoint.h  |  6 ++
 kernel/trace/trace_events.c | 10 ++
 kernel/tracepoint.c |  7 ++-
 3 files changed, 22 insertions(+), 1 deletion(-)
---
commit 45ab2813d40d88fc575e753c38478de242d03f88
Author: Steven Rostedt (Red Hat) 
Date:   Wed Feb 26 13:37:38 2014 -0500

tracing: Do not add event files for modules that fail tracepoints

If a module fails to add its tracepoints due to module tainting, do not
create the module event infrastructure in the debugfs directory. As the events
will not work and worse yet, they will silently fail, making the user wonder
why the events they enable do not display anything.

Having a warning on module load and the events not visible to the users
will make the cause of the problem much clearer.

Link: http://lkml.kernel.org/r/20140227154923.265882...@goodmis.org

Fixes: 6d723736e472 "tracing/events: add support for modules to TRACE_EVENT"
Acked-by: Mathieu Desnoyers 
Cc: sta...@vger.kernel.org # 2.6.31+
Cc: Rusty Russell 
Signed-off-by: Steven Rostedt 

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index accc497..7159a0a 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -60,6 +60,12 @@ struct tp_module {
unsigned int num_tracepoints;
struct tracepoint * const *tracepoints_ptrs;
 };
+bool trace_module_has_bad_taint(struct module *mod);
+#else
+static inline bool trace_module_has_bad_taint(struct module *mod)
+{
+   return false;
+}
 #endif /* CONFIG_MODULES */
 
 struct tracepoint_iter {
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index e71ffd4..f3989ce 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1777,6 +1777,16 @@ static void trace_module_add_events(struct module *mod)
 {
struct ftrace_event_call **call, **start, **end;
 
+   if (!mod->num_trace_events)
+   return;
+
+   /* Don't add infrastructure for mods without tracepoints */
+   if (trace_module_has_bad_taint(mod)) {
+   pr_err("%s: module has bad taint, not creating trace events\n",
+  mod->name);
+   return;
+   }
+
start = mod->trace_events;
end = mod->trace_events + mod->num_trace_events;
 
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 29f2654..031cc56 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -631,6 +631,11 @@ void tracepoint_iter_reset(struct tracepoint_iter *iter)
 EXPORT_SYMBOL_GPL(tracepoint_iter_reset);
 
 #ifdef CONFIG_MODULES
+bool trace_module_has_bad_taint(struct module *mod)
+{
+   return mod->taints & ~((1 << TAINT_OOT_MODULE) | (1 << TAINT_CRAP));
+}
+
 static int tracepoint_module_coming(struct module *mod)
 {
struct tp_module *tp_mod, *iter;
@@ -641,7 +646,7 @@ static int tracepoint_module_coming(struct module *mod)
 * module headers (for forced load), to make sur

Re: [PATCH 5/7] staging: cxt1e1: fix checkpatch errors with open brace '{'

2014-03-04 Thread DaeSeok Youn

Hi, greg

I have already resent patches 4 and 5. :-)

Patch 4 had a bug, which Dan noticed.

I tried to fix an assignment in an if condition and missed a curly brace in
the inner loop.
So I fixed that bug and resent patch 4. Patch 5 is rebased on top of
the fixed patch 4.

I also tested applying these patches from the mailing list to the
staging-next branch.
Please check.

Thanks.
Daeseok Youn.

2014-03-05 9:35 GMT+09:00 Greg KH :
> On Tue, Mar 04, 2014 at 11:10:44AM +0900, Daeseok Youn wrote:
>>
>> clean up checkpatch.pl error in linux.c:
>>  ERROR: that open brace { should be on the previous line
>>
>> Signed-off-by: Daeseok Youn 
>> ---
>>  drivers/staging/cxt1e1/linux.c |   67 
>> ---
>>  1 files changed, 21 insertions(+), 46 deletions(-)
>
> As patch 4 can't be applied, I can't apply these either, please resend
> the rest of the series when you fix them up.
>
> thanks,
>
> greg k-h

Lennart Poettering and his software, a simple explanation (cartoon):

2014-03-04 Thread Arnold Bird

Lennart Poettering and his software, a simple explanation (cartoon):
archive.rebeccablacktech.com/boards/g/img/0406/33/1393881561558.png

That is all that needs to be said.



Re: [PATCH] cgroup: missing rcu read lock around task_css_set

2014-03-04 Thread Li Zefan

On 2014/3/5 3:47, Tejun Heo wrote:
> On Tue, Mar 04, 2014 at 12:20:45PM -0500, Sasha Levin wrote:
>>> Hrm... there is a PF_EXITING check there already:
>>>
>>> #define task_css_set_check(task, __c)\
>>> rcu_dereference_check((task)->cgroups,\
>>> lockdep_is_held(&cgroup_mutex) ||\
>>> lockdep_is_held(&css_set_rwsem) ||\
>>> ((task)->flags & PF_EXITING) || (__c))
>>>
>>> I see it's not happening on Linus's master so I'll run a bisection to 
>>> figure out what broke it.
>>
>> Hi Tejun,
>>
>> It bisects down to your patch: "cgroup: drop task_lock() protection
>> around task->cgroups". I'll look into it later unless it's obvious
>> to you.
> 
> Hmmm... maybe I'm confused and PF_EXITING is not set there and
> task_lock was what held off the lockdep warning.  Confused
> 

Because this cgroup_exit() is called in a failure path in copy_process().


Re: [RFC][PATCH] clocksource: avoid unnecessary overflow in cyclecounter_cyc2ns()

2014-03-04 Thread John Stultz

On Tue, Mar 4, 2014 at 3:10 PM, Mike Galbraith  wrote:
> On Tue, 2014-03-04 at 14:40 +0800, John Stultz wrote:
>> On Tue, Mar 4, 2014 at 1:38 PM, Mike Galbraith  wrote:
>> > (crap crap crap... M.A.I.N.T.A.I.N.E.R.S _dummy_)
>> >
>> > clocksource: avoid unnecessary overflow in cyclecounter_cyc2ns()
>> >
>> > As per 4cecf6d401a "sched, x86: Avoid unnecessary overflow in sched_clock",
>> > cycles * mult >> shift is overflow prone. so give it the same treatment.
>> >
>> > Cc: Salman Qazi 
>> > Cc: John Stultz ,
>> > Signed-off-by: Mike Galbraith 
>>
>> Thanks for sending this in!  Curious exactly how the issue was being
>> triggered?
>
> Dunno that it is.  This is the result of me rummaging around, looking
> for any excuse what-so-ever for a small and identical group of weird a$$
> boxen running old 2.6.32 kernels (w. 208 day fix!) to manage to hop back
> and forth in time by exactly 208 days.  Grep showed me that function, so
> I scurried off and swiped the fix.

So.. this makes me a bit more hesitant to really queue this,
particularly since the timecounter logic is supposed to periodically
accumulate cycles so you don't run into these overflow issues (the
earlier fix was for sched_clock which didn't do any accumulation).

So, if you're seeing time jump around, that's probably clocksource or
timekeeping related, and not tied to the cyclecounter code. Do you have
any other info about these systems? What clocksource are they using,
etc?

thanks
-john
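
For readers following the overflow discussion above, here is a minimal
userspace sketch of the idea, not the kernel code itself. The function names
cyc2ns_naive() and cyc2ns_split() are made up for illustration, and the sketch
assumes shift is at most 32. The naive form wraps once cycles * mult no longer
fits in 64 bits; the split form handles the high and low parts of cycles
separately and yields the same value as an infinitely wide multiply would.

```c
#include <stdint.h>

/* Naive form: the 64-bit product wraps once cycles * mult >= 2^64. */
static uint64_t cyc2ns_naive(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}

/*
 * Split form (illustrative, assumes shift <= 32):
 *   cycles = quot * 2^shift + rem, with rem < 2^shift
 *   (cycles * mult) >> shift == quot * mult + ((rem * mult) >> shift)
 * rem * mult stays below 2^(shift + 32), so it cannot wrap here.
 */
static uint64_t cyc2ns_split(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	uint64_t quot = cycles >> shift;
	uint64_t rem = cycles & ((1ULL << shift) - 1);

	return quot * mult + ((rem * mult) >> shift);
}
```

With a shift of 10 (the value I believe the x86 sched_clock code used), a
wrapping product makes the naive result repeat every 2^54 ns, which is roughly
208 days. Whether that is what those boxes are hitting is a separate question.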

Re: [PATCH] perf kvm: introduce --list-cmds for use by scripts

2014-03-04 Thread David Ahern


On 3/3/14, 6:26 PM, Ramkumar Ramachandra wrote:

Introduce

   $ perf kvm --list-cmds

to dump a raw list of commands for use by the completion script. In
order to do this, introduce parse_options_subcommand() for handling
subcommands as a special case in the parse-options machinery.

Cc: David Ahern 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Ramkumar Ramachandra 


Looks ok to me. Jiri?

David
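
As a rough illustration of the usage-string idea discussed in this thread,
here is a standalone sketch that builds a "{a|b|c}" subcommand list from a
NULL-terminated array. It is not the perf implementation: build_usage() is a
hypothetical helper, it uses snprintf() instead of perf's strbuf API, and it
leaves out the options placeholder.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "perf <cmd> {sub1|sub2|...}" into buf.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int build_usage(char *buf, size_t size, const char *cmd,
		       const char *const subcommands[])
{
	size_t len;
	int n;

	n = snprintf(buf, size, "perf %s {", cmd);
	if (n < 0 || (size_t)n >= size)
		return -1;
	len = (size_t)n;

	for (int i = 0; subcommands[i]; i++) {
		/* Separate entries with '|', but not before the first one. */
		n = snprintf(buf + len, size - len, "%s%s",
			     i ? "|" : "", subcommands[i]);
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += (size_t)n;
	}

	n = snprintf(buf + len, size - len, "}");
	if (n < 0 || (size_t)n >= size - len)
		return -1;
	return 0;
}
```

Passing a caller-owned buffer like this avoids the { NULL, NULL } usagestr
convention questioned in the patch notes, at the cost of a fixed size limit.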


---
  Does this example justify the creation of parse_options_subcommand()?
  Initializing the usagestr to { NULL, NULL } and passing it to
  parse_options_subcommand() isn't intuitive -- what can we do about
  that?

  tools/perf/builtin-kvm.c| 12 +---
  tools/perf/perf-completion.sh   |  2 +-
  tools/perf/util/parse-options.c | 37 +
  tools/perf/util/parse-options.h |  8 +++-
  4 files changed, 46 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index a735051..21c164b 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1691,17 +1691,15 @@ int cmd_kvm(int argc, const char **argv, const char 
*prefix __maybe_unused)
OPT_END()
};

-
-   const char * const kvm_usage[] = {
-   "perf kvm [] 
{top|record|report|diff|buildid-list|stat}",
-   NULL
-   };
+   const char *const kvm_subcommands[] = { "top", "record", "report", 
"diff",
+   "buildid-list", "stat", NULL };
+   const char *kvm_usage[] = { NULL, NULL };

perf_host  = 0;
perf_guest = 1;

-   argc = parse_options(argc, argv, kvm_options, kvm_usage,
-   PARSE_OPT_STOP_AT_NON_OPTION);
+   argc = parse_options_subcommand(argc, argv, kvm_options, 
kvm_subcommands, kvm_usage,
+   PARSE_OPT_STOP_AT_NON_OPTION);
if (!argc)
usage_with_options(kvm_usage, kvm_options);

diff --git a/tools/perf/perf-completion.sh b/tools/perf/perf-completion.sh
index 496e2ab..ae3a576 100644
--- a/tools/perf/perf-completion.sh
+++ b/tools/perf/perf-completion.sh
@@ -123,7 +123,7 @@ __perf_main ()
__perfcomp_colon "$evts" "$cur"
# List subcommands for 'perf kvm'
elif [[ $prev == "kvm" ]]; then
-   subcmds="top record report diff buildid-list stat"
+   subcmds=$($cmd $prev --list-cmds)
__perfcomp_colon "$subcmds" "$cur"
# List long option names
elif [[ $cur == --* ]];  then
diff --git a/tools/perf/util/parse-options.c b/tools/perf/util/parse-options.c
index d22e3f8..bf48092 100644
--- a/tools/perf/util/parse-options.c
+++ b/tools/perf/util/parse-options.c
@@ -407,7 +407,9 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
if (internal_help && !strcmp(arg + 2, "help"))
return usage_with_options_internal(usagestr, options, 
0);
if (!strcmp(arg + 2, "list-opts"))
-   return PARSE_OPT_LIST;
+   return PARSE_OPT_LIST_OPTS;
+   if (!strcmp(arg + 2, "list-cmds"))
+   return PARSE_OPT_LIST_SUBCMDS;
switch (parse_long_opt(ctx, arg + 2, options)) {
case -1:
return parse_options_usage(usagestr, options, arg + 2, 
0);
@@ -433,25 +435,45 @@ int parse_options_end(struct parse_opt_ctx_t *ctx)
return ctx->cpidx + ctx->argc;
  }

-int parse_options(int argc, const char **argv, const struct option *options,
- const char * const usagestr[], int flags)
+int parse_options_subcommand(int argc, const char **argv, const struct option 
*options,
+   const char *const subcommands[], const char 
*usagestr[], int flags)
  {
struct parse_opt_ctx_t ctx;

perf_header__set_cmdline(argc, argv);

+   /* build usage string if it's not provided */
+   if (subcommands && !usagestr[0]) {
+   struct strbuf buf = STRBUF_INIT;
+
+   strbuf_addf(&buf, "perf %s [] {", argv[0]);
+   for (int i = 0; subcommands[i]; i++) {
+   if (i)
+   strbuf_addstr(&buf, "|");
+   strbuf_addstr(&buf, subcommands[i]);
+   }
+   strbuf_addstr(&buf, "}");
+
+   usagestr[0] = strdup(buf.buf);
+   strbuf_release(&buf);
+   }
+
parse_options_start(&ctx, argc, argv, flags);
switch (parse_options_step(&ctx, options, usagestr)) {
case PARSE_OPT_HELP:
exit(129);
case PARSE_OPT_DONE:
break;
-   case PARSE_OPT_LIST:
+   case PARSE_OPT_LIST_OPTS:
while (options->type != OPTION_END) {
printf("--%s ", options->long_name);
options++;
}
exit(130);
+   case PARSE_OPT_LIST_SUBC

[PATCH] irqchip: Silence sparse warning

2014-03-04 Thread Stephen Boyd

drivers/irqchip/irqchip.c:27:13: warning: symbol 'irqchip_init'
was not declared. Should it be static?

Signed-off-by: Stephen Boyd 
---
 drivers/irqchip/irqchip.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/irqchip/irqchip.c b/drivers/irqchip/irqchip.c
index f496afce29de..3469141f10f6 100644
--- a/drivers/irqchip/irqchip.c
+++ b/drivers/irqchip/irqchip.c
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 
 #include "irqchip.h"
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH v12] gpio: add a driver for the Synopsys DesignWare APB GPIO block

2014-03-04 Thread Linus Walleij

On Wed, Feb 26, 2014 at 7:01 AM, Alan Tull  wrote:

> From: Jamie Iles 
>
> The Synopsys DesignWare block is used in some ARM devices (picoxcell)
> and can be configured to provide multiple banks of GPIO pins.
>
> Signed-off-by: Jamie Iles 
> Signed-off-by: Alan Tull 
> Reviewed-by: Sebastian Hesselbarth 
>
> v12: - Add irq_startup/shutdown
>  - do irq_create_mapping() in probe, irq_find_mapping() in to_irq()
>  - Adjust mappings to show support for 1 gpio per port.
>  - gpio-cells = <1>

Patch applied!

You've done tremendous work on this driver and there is
absolutely nothing controversial about the bindings, so it is
my pleasure to include this driver.

Yours,
Linus Walleij

[PATCH] irqchip: gic: Silence sparse warnings

2014-03-04 Thread Stephen Boyd

drivers/irqchip/irq-gic.c:53:23: warning: duplicate [noderef]
drivers/irqchip/irq-gic.c:651:6: warning: symbol 'gic_raise_softirq' was not declared. Should it be static?
drivers/irqchip/irq-gic.c:872:29: warning: symbol 'gic_irq_domain_ops' was not declared. Should it be static?
drivers/irqchip/irq-gic.c:977:12: warning: symbol 'gic_of_init' was not declared. Should it be static?

Signed-off-by: Stephen Boyd 
---
 drivers/irqchip/irq-gic.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 500e533b9648..a917b144895e 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -50,7 +50,7 @@
 
 union gic_base {
void __iomem *common_base;
-   void __percpu __iomem **percpu_base;
+   void __percpu * __iomem *percpu_base;
 };
 
 struct gic_chip_data {
@@ -648,7 +648,7 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
 #endif
 
 #ifdef CONFIG_SMP
-void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
+static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
int cpu;
unsigned long flags, map = 0;
@@ -869,7 +869,7 @@ static struct notifier_block gic_cpu_notifier = {
 };
 #endif
 
-const struct irq_domain_ops gic_irq_domain_ops = {
+static const struct irq_domain_ops gic_irq_domain_ops = {
.map = gic_irq_domain_map,
.xlate = gic_irq_domain_xlate,
 };
@@ -974,7 +974,8 @@ void __init gic_init_bases(unsigned int gic_nr, int 
irq_start,
 #ifdef CONFIG_OF
 static int gic_cnt __initdata;
 
-int __init gic_of_init(struct device_node *node, struct device_node *parent)
+static int __init
+gic_of_init(struct device_node *node, struct device_node *parent)
 {
void __iomem *cpu_base;
void __iomem *dist_base;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


Re: Question on vhost_has_feature()

2014-03-04 Thread Paul E. McKenney

On Tue, Mar 04, 2014 at 09:56:42AM +0200, Michael S. Tsirkin wrote:
> On Mon, Mar 03, 2014 at 11:44:23AM -0800, Paul E. McKenney wrote:
> > Hello, Michael,
> > 
> > Just curious about the purpose of the rcu_dereference_index_check() in
> > vhost_has_feature().  I don't see how it fits in.  The closest thing
> > I see is the use in handle_rx(), where it selects vq->log or NULL.  But
> > in that case, I would expect the usual RCU usage pattern to wrap an
> > rcu_dereference() around the vq->log.
> > 
> > Enlightenment?
> > 
> > Thanx, Paul
> 
> Hi Paul,
> 
> Yes, it's weird.  As you say the use is in handle_rx.
> The way it's supposed to work is that readers take vq mutex, and
> writers change the value and then take and release
> vq mutex.
> We did it like this because there are thinkably multiple vqs.
> 
> I tried to document it in vhost.h :
> /* Readers use RCU to access memory table pointer
>  * log base pointer and features.
>  * Writers use mutex below.*/
> 
> If this is a problem, it's possible to restructure the code to avoid
> this pattern for vhost_has_feature, pls let me know.

I am not yet sure whether or not it is a problem, it just looked a bit
strange.  ;-)

Thanx, Paul

> Thanks a lot for paying attention!
> 
> 
> -- 
> MST
> 


Re: [PATCH 5/7] staging: cxt1e1: fix checkpatch errors with open brace '{'

2014-03-04 Thread Greg KH

On Wed, Mar 05, 2014 at 09:55:14AM +0900, DaeSeok Youn wrote:
> Hi, greg
> 
> I have already resent patches 4 and 5. :-)
> 
> Patch 4 had a bug, which Dan noticed.
> 
> I tried to fix an assignment in an if condition and missed a curly brace in
> the inner loop.
> So I fixed that bug and resent patch 4. Patch 5 is rebased on top of
> the fixed patch 4.
> 
> I also tested applying these patches from the mailing list to the
> staging-next branch.
> Please check.

I've flushed all of your patches from my queue now, please resend
anything I haven't applied.

thanks,

greg k-h

Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed

2014-03-04 Thread Florian Fainelli

2014-03-04 16:48 GMT-08:00 Eric Dumazet :
> On Tue, 2014-03-04 at 15:55 -0800, Florian Fainelli wrote:
>> Hi all,
>>
>> I am seeing the following RCU stalls messages appearing on an ARMv7
>> 4xCPUs system running 3.14-rc4:
>>
>> [   42.974327] INFO: rcu_sched detected stalls on CPUs/tasks:
>> [   42.979839]  (detected by 0, t=2102 jiffies, g=4294967082,
>> c=4294967081, q=516)
>> [   42.987169] INFO: Stall ended before state dump start
>>
>> this is happening under the following conditions:
>>
>> - the attached bumper.c binary alters various kernel thread priorities
>> based on the contents of bumpup.cfg and
>> - malloc_crazy is running from a NFS share
>> - malloc_crazy.c is running in a loop allocating chunks of memory but
>> never freeing it
>>
>> when the priorities are altered, instead of getting the OOM killer to
>> be invoked, the RCU stalls are happening. Taking NFS out of the
>> equation does not allow me to reproduce the problem even with the
>> priorities altered.
>>
>> This "problem" seems to have been there for quite a while now since I
>> was able to get 3.8.13 to trigger that bug as well, with a slightly
>> more detailed RCU debugging trace which points the finger at kswapd0.
>>
>> You should be able to get that reproduced under QEMU with the
>> Versatile Express platform emulating a Cortex A15 CPU and the attached
>> files.
>>
>> Any help or suggestions would be greatly appreciated. Thanks!
>
> Do you have a more complete trace, including stack traces ?

Attached is what I get out of SysRq-t, which is the only thing I have
(note that the kernel is built with CONFIG_RCU_CPU_STALL_INFO=y):

Thanks!
-- 
Florian
[ 3474.417333] INFO: Stall ended before state dump start
[ 3500.312946] SysRq : Show State
[ 3500.316015]   task PC stack   pid father
[ 3500.321244] init S c04bda98 0 1  0 0x
[ 3500.327640] [] (__schedule) from [] (do_wait+0x220/0x244)
[ 3500.334786] [] (do_wait) from [] (SyS_wait4+0x60/0xc4)
[ 3500.341672] [] (SyS_wait4) from [] (ret_fast_syscall+0x0/0x30)
[ 3500.349247] kthreadd S c04bda98 0 2  0 0x
[ 3500.355635] [] (__schedule) from [] (kthreadd+0x168/0x16c)
[ 3500.362866] [] (kthreadd) from [] (ret_from_fork+0x14/0x3c)
[ 3500.370181] ksoftirqd/0 S c04bda98 0 3  2 0x
[ 3500.376567] [] (__schedule) from [] (smpboot_thread_fn+0xc4/0x17c)
[ 3500.384494] [] (smpboot_thread_fn) from [] (kthread+0xd4/0xec)
[ 3500.392072] [] (kthread) from [] (ret_from_fork+0x14/0x3c)
[ 3500.399300] kworker/0:0 S c04bda98 0 4  2 0x
[ 3500.405691] [] (__schedule) from [] (worker_thread+0x210/0x404)
[ 3500.413357] [] (worker_thread) from [] (kthread+0xd4/0xec)
[ 3500.420588] [] (kthread) from [] (ret_from_fork+0x14/0x3c)
[ 3500.427817] kworker/0:0H S c04bda98 0 5  2 0x
[ 3500.434205] [] (__schedule) from [] (worker_thread+0x210/0x404)
[ 3500.441871] [] (worker_thread) from [] (kthread+0xd4/0xec)
[ 3500.449102] [] (kthread) from [] (ret_from_fork+0x14/0x3c)
[ 3500.456329] kworker/u8:0 S c04bda98 0 6  2 0x
[ 3500.462718] [] (__schedule) from [] (worker_thread+0x210/0x404)
[ 3500.470384] [] (worker_thread) from [] (kthread+0xd4/0xec)
[ 3500.477615] [] (kthread) from [] (ret_from_fork+0x14/0x3c)
[ 3500.484843] rcu_sched   R running  0 7  2 0x
[ 3500.491230] [] (__schedule) from [] (schedule_timeout+0x130/0x1ac)
[ 3500.499157] [] (schedule_timeout) from [] (rcu_gp_kthread+0x26c/0x5f8)
[ 3500.507431] [] (rcu_gp_kthread) from [] (kthread+0xd4/0xec)
[ 3500.514749] [] (kthread) from [] (ret_from_fork+0x14/0x3c)
[ 3500.521977] rcu_bh  S c04bda98 0 8  2 0x
[ 3500.528363] [] (__schedule) from [] (rcu_gp_kthread+0x80/0x5f8)
[ 3500.536028] [] (rcu_gp_kthread) from [] (kthread+0xd4/0xec)

Re: [PATCH net-next v6 9/9] xen-netback: Aggregate TX unmap operations

2014-03-04 Thread David Miller

From: Zoltan Kiss 
Date: Wed, 5 Mar 2014 00:45:49 +

> Despite all my efforts to do renumbering right, this subject still
> shows 9/9 instead of 10/10.

No worries.

Re: [PATCH 2/2] staging :keucr:scsiglue.c : fixed a macros should not be colon terminated issue

2014-03-04 Thread Greg KH

On Sun, Mar 02, 2014 at 06:21:16PM +0530, Keerthimai Janarthanan wrote:
> fixed a coding style issue.
> 
> Signed-off-by: Keerthimai Janarthanan 
> ---
>  drivers/staging/keucr/scsiglue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/keucr/scsiglue.c 
> b/drivers/staging/keucr/scsiglue.c
> index ac3d34d..fce19a4 100644
> --- a/drivers/staging/keucr/scsiglue.c
> +++ b/drivers/staging/keucr/scsiglue.c
> @@ -277,7 +277,7 @@ static int show_info(struct seq_file *m, struct Scsi_Host 
> *host)
>   do { \
>   if (us->fflags & value) \
>   SPRINTF(" " #name); \
> - } while (0);
> + } while (0)

You just broke the build.

Please ALWAYS test your patches, don't break someone else's build with
broken patches...

Please be more careful.
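
For anyone wondering why the trailing semicolon matters either way, here is a
small standalone sketch of the do { } while (0) idiom. BUMP_TWICE() and demo()
are invented for illustration and have nothing to do with the keucr code. The
macro deliberately ends without a semicolon so that the caller supplies it and
the macro behaves like a single statement, even in an if/else.

```c
/*
 * Multi-statement macro wrapped in do { } while (0), with no trailing
 * semicolon in the definition. The caller writes the ';'.
 */
#define BUMP_TWICE(counter) \
	do { \
		(counter)++; \
		(counter)++; \
	} while (0)

static int demo(int flag)
{
	int calls = 0;

	/*
	 * If the macro definition ended in ';', this would expand to
	 * "do { ... } while (0);; else" and fail to compile.
	 */
	if (flag)
		BUMP_TWICE(calls);
	else
		calls = -1;

	return calls;
}
```

The flip side, and presumably what happened with this patch, is that dropping
the semicolon from an existing macro only works if every place that uses the
macro already supplies its own. Any use that relied on the macro's semicolon
stops compiling.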

Re: [PATCH 3.10 00/97] 3.10.33-stable review

2014-03-04 Thread Guenter Roeck


On 03/04/2014 12:03 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 3.10.33 release.
There are 97 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Mar  6 20:03:11 UTC 2014.
Anything received after that time might be too late.



Build results:
total: 126 pass: 121 skipped: 4 fail: 1

qemu tests all passed.

Results are as expected. Details are available at 
http://server.roeck-us.net:8010/builders.

Guenter



Re: [PATCH 3.13 000/172] 3.13.6-stable review

2014-03-04 Thread Guenter Roeck


On 03/04/2014 12:01 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 3.13.6 release.
There are 172 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Mar  6 20:02:29 UTC 2014.
Anything received after that time might be too late.



Build results:
total: 126 pass: 122 skipped: 4 fail: 0

qemu tests all passed.

Results are as expected. Details are available at 
http://server.roeck-us.net:8010/builders.

Guenter


Re: [blk-lib] 6a0608544e5: fileio -77.4% max latency, -5.7% throughput

2014-03-04 Thread Fengguang Wu

On Tue, Mar 04, 2014 at 01:52:25PM -0800, Kent Overstreet wrote:
> On Tue, Mar 04, 2014 at 09:21:30PM +0800, Fengguang Wu wrote:
> > Hi Kent,
> > 
> > FYI, we noticed the below changes on
> > 
> > git://evilpiepirate.org/~kent/linux-bcache.git for-jens
> > commit 6a0608544e5672bd9a044c285119547eae41abe5 ("blk-lib.c: 
> > generic_make_request() handles large bios now")
> > 
> > test case: 
> > snb-drag/sysbench/fileio/600s-100%-1HDD-ext4-64G-1024-seqrewr-sync

snb-drag is the test machine, it's a SNB desktop.

The test command is

mkfs -t ext4 -q /dev/sda2
mount -t ext4 /dev/sda2 /fs/sda2
cd /fs/sda2

for i in $(seq 0 1023)
do
        fallocate -l 67108864 test_file.$i
done

sysbench --test=fileio --max-requests=0 --num-threads=4 --max-time=600 \
        --file-test-mode=seqrewr --file-total-size=68719476736 --file-io-mode=sync \
        --file-num=1024 run


> I'm trying to figure out how to parse this and the graphs - where do I find
> the test? And is there anything you can point me to for the graphs, or is
> that output from that test?
> 
> > 
> > 11541d5f5b7002b  6a0608544e5672bd9a044c285
> > ---------------  -------------------------
> >       1885 ~60%     -77.4%         426 ~ 7%  TOTAL fileio.request_latency_max_ms

The ~XX% numbers are stddev percent.
The [+-]XX% is change percent.
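
To make the change-percent column concrete, here is a tiny sketch of how such
a figure can be recomputed from the two averages in a row. change_percent()
is a name made up for this example, not part of the test tooling, and since
the table prints rounded averages, a recomputed value may differ slightly from
the printed one.

```c
/* Relative change from the base commit's value to the head commit's value. */
static double change_percent(double base, double head)
{
	return (head - base) / base * 100.0;
}
```

For the first row, change_percent(1885, 426) comes out at about -77.4, which
matches the reported max-latency change.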

> >       6258 ~ 0%      -5.7%        5904 ~ 0%  TOTAL fileio.requests_per_sec
> >         26 ~ 1%    +702.3%         211 ~ 8%  TOTAL slabinfo.kmalloc-4096.num_slabs
> >        217 ~ 1%    +682.3%        1697 ~ 8%  TOTAL slabinfo.kmalloc-4096.num_objs
> >         26 ~ 1%    +702.3%         211 ~ 8%  TOTAL slabinfo.kmalloc-4096.active_slabs
> >        211 ~ 2%    +701.5%        1697 ~ 8%  TOTAL slabinfo.kmalloc-4096.active_objs
> >          2 ~ 0%     -50.0%           1 ~ 0%  TOTAL vmstat.procs.b
> >       2236 ~ 0%     +63.6%        3659 ~ 3%  TOTAL slabinfo.kmalloc-256.num_objs
> >       2217 ~ 0%     +63.4%        3623 ~ 3%  TOTAL slabinfo.kmalloc-256.active_objs
> >       3274 ~ 1%     +47.8%        4837 ~ 2%  TOTAL proc-vmstat.nr_slab_unreclaimable
> >      13096 ~ 1%     +47.7%       19350 ~ 2%  TOTAL meminfo.SUnreclaim
> >   62558204 ~ 5%     -30.7%    43379766 ~ 1%  TOTAL cpuidle.C3-SNB.time
> >      91031 ~ 4%     -29.4%       64253 ~ 2%  TOTAL cpuidle.C1E-SNB.usage
> >      34092 ~ 2%     -17.6%       28085 ~ 1%  TOTAL cpuidle.C3-SNB.usage
> >       2656 ~ 2%     -15.0%        2258 ~ 2%  TOTAL proc-vmstat.kswapd_high_wmark_hit_quickly
> >     266339 ~ 2%     +14.2%      304129 ~ 0%  TOTAL cpuidle.C6-SNB.usage
> >      21899 ~ 2%      +9.6%       23992 ~ 2%  TOTAL interrupts.RES
> >         20 ~ 0%      -8.0%          18 ~ 2%  TOTAL time.percent_of_cpu_this_job_got
> >    3430189 ~ 0%      -5.9%     3226430 ~ 0%  TOTAL time.voluntary_context_switches
> >        117 ~ 0%      -6.0%         110 ~ 0%  TOTAL time.system_time
> >      11691 ~ 0%      -5.6%       11042 ~ 0%  TOTAL vmstat.system.cs
> >        711 ~ 0%      +4.4%         742 ~ 0%  TOTAL iostat.sda.await
> >        712 ~ 0%      +4.4%         743 ~ 0%  TOTAL iostat.sda.w_await
> >      99142 ~ 0%      -2.2%       96963 ~ 0%  TOTAL iostat.sda.wkB/s
> >      99257 ~ 0%      -2.2%       97110 ~ 0%  TOTAL vmstat.io.bo
> >        139 ~ 0%      +1.7%         141 ~ 0%  TOTAL iostat.sda.avgqu-sz
> > 
> > 

The graph below shows all samples collected during the bisect:

[*] bisect-good
[O] bisect-bad

From it you can see how stable the change and the bisect are.

> >    fileio.requests_per_sec
> >
> >    [ASCII plot, y-axis 5800 to 6300: bisect-good samples (*) sit near
> >    6250, bisect-bad samples (O) fall between about 5850 and 6000]
> > 
> > 
> > fileio.request_latency_max_ms
> >

Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed

2014-03-04 Thread Florian Fainelli

2014-03-04 17:03 GMT-08:00 Florian Fainelli :
> 2014-03-04 16:48 GMT-08:00 Eric Dumazet :
>> On Tue, 2014-03-04 at 15:55 -0800, Florian Fainelli wrote:
>>> Hi all,
>>>
>>> I am seeing the following RCU stalls messages appearing on an ARMv7
>>> 4xCPUs system running 3.14-rc4:
>>>
>>> [   42.974327] INFO: rcu_sched detected stalls on CPUs/tasks:
>>> [   42.979839]  (detected by 0, t=2102 jiffies, g=4294967082,
>>> c=4294967081, q=516)
>>> [   42.987169] INFO: Stall ended before state dump start
>>>
>>> this is happening under the following conditions:
>>>
>>> - the attached bumper.c binary alters various kernel thread priorities
>>> based on the contents of bumpup.cfg and
>>> - malloc_crazy is running from a NFS share
>>> - malloc_crazy.c is running in a loop allocating chunks of memory but
>>> never freeing it
>>>
>>> when the priorities are altered, instead of getting the OOM killer to
>>> be invoked, the RCU stalls are happening. Taking NFS out of the
>>> equation does not allow me to reproduce the problem even with the
>>> priorities altered.
>>>
>>> This "problem" seems to have been there for quite a while now since I
>>> was able to get 3.8.13 to trigger that bug as well, with a slightly
>>> more detailed RCU debugging trace which points the finger at kswapd0.
>>>
>>> You should be able to get that reproduced under QEMU with the
>>> Versatile Express platform emulating a Cortex A15 CPU and the attached
>>> files.
>>>
>>> Any help or suggestions would be greatly appreciated. Thanks!
>>
>> Do you have a more complete trace, including stack traces ?
>
> Attached is what I get out of SysRq-t, which is the only thing I have
> (note that the kernel is built with CONFIG_RCU_CPU_STALL_INFO=y):

QEMU for Versatile Express w/ 2 CPUs yields something slightly
different than the real HW platform this is happening with, but it
does produce the RCU stall anyway:

[  125.762946] BUG: soft lockup - CPU#1 stuck for 53s! [malloc_crazy:91]
[  125.766841] Modules linked in:
[  125.768389]
[  125.769199] CPU: 1 PID: 91 Comm: malloc_crazy Not tainted 3.14.0-rc4 #39
[  125.769883] task: edbded00 ti: c089c000 task.ti: c089c000
[  125.771743] PC is at load_balance+0x4b0/0x760
[  125.772069] LR is at cpumask_next_and+0x44/0x5c
[  125.772387] pc : []lr : []psr: 6113
[  125.772387] sp : c089db48  ip : 8113  fp : edfd8cf4
[  125.773128] r10: c0de871c  r9 : ed893840  r8 : 
[  125.773452] r7 : c0de8458  r6 : edfd8840  r5 : edfd8840  r4 : ed89389c
[  125.773825] r3 : 12d8  r2 : 8113  r1 : 0023  r0 : 
[  125.774332] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  125.774753] Control: 30c7387d  Table: 80835d40  DAC: 
[  125.775266] CPU: 1 PID: 91 Comm: malloc_test_bcm Not tainted 3.14.0-rc4 #39
[  125.776392] [] (unwind_backtrace) from []
(show_stack+0x10/0x14)
[  125.777026] [] (show_stack) from []
(dump_stack+0x84/0x94)
[  125.777429] [] (dump_stack) from []
(watchdog_timer_fn+0x144/0x17c)
[  125.777865] [] (watchdog_timer_fn) from []
(__run_hrtimer.isra.32+0x54/0xe4)
[  125.778333] [] (__run_hrtimer.isra.32) from []
(hrtimer_interrupt+0x11c/0x2d0)
[  125.778814] [] (hrtimer_interrupt) from []
(arch_timer_handler_virt+0x28/0x30)
[  125.779297] [] (arch_timer_handler_virt) from
[] (handle_percpu_devid_irq+0x6c/0x84)
[  125.779734] [] (handle_percpu_devid_irq) from
[] (generic_handle_irq+0x2c/0x3c)
[  125.780145] [] (generic_handle_irq) from []
(handle_IRQ+0x40/0x90)
[  125.780513] [] (handle_IRQ) from []
(gic_handle_irq+0x2c/0x5c)
[  125.780867] [] (gic_handle_irq) from []
(__irq_svc+0x40/0x50)
[  125.781312] Exception stack(0xc089db00 to 0xc089db48)
[  125.781787] db00:  0023 8113 12d8 ed89389c
edfd8840 edfd8840 c0de8458
[  125.782234] db20:  ed893840 c0de871c edfd8cf4 8113
c089db48 c01db940 c004ff58
[  125.782594] db40: 6113 
[  125.782864] [] (__irq_svc) from []
(load_balance+0x4b0/0x760)
[  125.783215] [] (load_balance) from []
(rebalance_domains+0x154/0x284)
[  125.783595] [] (rebalance_domains) from []
(run_rebalance_domains+0x34/0x164)
[  125.784000] [] (run_rebalance_domains) from []
(__do_softirq+0x110/0x24c)
[  125.784388] [] (__do_softirq) from []
(irq_exit+0xac/0xf4)
[  125.784726] [] (irq_exit) from [] (handle_IRQ+0x44/0x90)
[  125.785059] [] (handle_IRQ) from []
(gic_handle_irq+0x2c/0x5c)
[  125.785412] [] (gic_handle_irq) from []
(__irq_svc+0x40/0x50)
[  125.785742] Exception stack(0xc089dcf8 to 0xc089dd40)
[  125.785983] dce0:
ee4e38c0 
[  125.786360] dd00: 000200da 0001 ee4e38a0 c0de2340 2d201000
edfe3358 c05c0c18 0001
[  125.786737] dd20: c05c0c2c c0e1e180  c089dd40 c0086050
c0086140 4113 
[  125.787120] [] (__irq_svc) from []
(get_page_from_freelist+0x2bc/0x638)
[  125.787507] [] (get_page_from_freelist) from []
(__alloc_pages_nodemask+0x114/0x8f4)
[  125.787949] [] (__alloc_pages_nodemask) from []
(handle_mm_fault+0x9f8/0xcdc)
[  125.788357] [] (handle_mm_faul

Re: [PATCH 1/2] pinctrl: sh-pfc: r8a7790: Add alternative MSIOF pin groups

2014-03-04 Thread Linus Walleij

On Fri, Feb 21, 2014 at 3:53 AM, Geert Uytterhoeven
 wrote:

> From: Geert Uytterhoeven 
>
> Signed-off-by: Geert Uytterhoeven 

Patch applied with Laurent's ACK.

Yours,
Linus Walleij

Re: [PATCH 5/7] staging: cxt1e1: fix checkpatch errors with open brace '{'

2014-03-04 Thread DaeSeok Youn

OK.
I will send the patches again.

Thanks.
Daeseok Youn.

2014-03-05 10:06 GMT+09:00 Greg KH :
> On Wed, Mar 05, 2014 at 09:55:14AM +0900, DaeSeok Youn wrote:
>> Hi, greg
>>
>> I already resent patches 4 and 5. :-)
>>
>> Patch 4 had a bug, which Dan noticed.
>>
>> I tried to fix an assignment in an if condition and missed a curly brace
>> in the inner loop.
>> So I fixed that bug and resent patch 4. Patch 5 is rebased on top of the
>> fixed patch 4.
>>
>> I also tested applying these patches from the mailing list to the
>> staging-next branch.
>> Please check.
>
> I've flushed all of your patches from my queue now, please resend
> anything I haven't applied.
>
> thanks,
>
> greg k-h
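
For readers following the thread, the kind of change described above (moving an assignment out of an if condition inside a loop) can be sketched as below. This is an illustration only, not the cxt1e1 code: find_slot(), sum_slots_before() and sum_slots_after() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pointer to slot i, or NULL if the slot is unused (0). */
static const int *find_slot(const int *slots, size_t n, size_t i)
{
	assert(slots != NULL);
	return (i < n && slots[i] != 0) ? &slots[i] : NULL;
}

/* Before: assignment inside the if condition, which checkpatch flags. */
int sum_slots_before(const int *slots, size_t n)
{
	const int *p;
	size_t i;
	int sum = 0;

	for (i = 0; i < n; i++)
		if ((p = find_slot(slots, n, i)) != NULL)
			sum += *p;
	return sum;
}

/*
 * After: the assignment is its own statement, so the loop body now has
 * two statements and needs braces. Dropping them would leave only the
 * assignment inside the loop.
 */
int sum_slots_after(const int *slots, size_t n)
{
	const int *p;
	size_t i;
	int sum = 0;

	for (i = 0; i < n; i++) {
		p = find_slot(slots, n, i);
		if (p != NULL)
			sum += *p;
	}
	return sum;
}
```

Both versions are meant to behave the same; the point is that the rewritten loop only stays equivalent if the braces are added along with the split.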

Re: [PATCH v3 03/11] perf: provide a common perf_event_nop_0() for use with .event_idx

2014-03-04 Thread Michael Ellerman

On Mon, 2014-03-03 at 23:01 -0800, Cody P Schafer wrote:
> On 03/03/2014 09:19 PM, Michael Ellerman wrote:
> > On Thu, 2014-27-02 at 21:04:56 UTC, Cody P Schafer wrote:
> >> Rather than having every pmu that needs a function that just returns 0 for
> >> .event_idx define their own copy, reuse the one in kernel/events/core.c.
> >>
> >> Rename from perf_swevent_event_idx() because we're no longer using it
> >> for just software events. Naming is based on the perf_pmu_nop_*()
> >> functions.
> >
> > You could just use perf_pmu_nop_int() directly.
> 
> No, .event_idx needs something that takes a (struct perf_event *), 
> perf_pmu_nop_int() takes a (struct pmu *).

Yeah, duh.

cheers
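
The type mismatch Cody points out can be sketched in plain C. The struct definitions below are minimal stand-ins, not the kernel's real ones; example_pmu and example_event_idx() are hypothetical names added for illustration. Only the two function signatures are taken from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types; only the shapes matter here. */
struct pmu;
struct perf_event {
	struct pmu *pmu;
};

struct pmu {
	/* .event_idx is handed the event, not the pmu. */
	int (*event_idx)(struct perf_event *event);
};

/* Takes a struct pmu *, so it does not fit the .event_idx slot. */
int perf_pmu_nop_int(struct pmu *pmu)
{
	(void)pmu;
	return 0;
}

/* Takes a struct perf_event *, matching .event_idx. */
int perf_event_nop_0(struct perf_event *event)
{
	(void)event;
	return 0;
}

/* Hypothetical pmu that has no meaningful event index. */
struct pmu example_pmu = {
	.event_idx = perf_event_nop_0,
	/* Using perf_pmu_nop_int here would be an incompatible pointer type. */
};

/* Hypothetical caller: dispatches through the callback. */
int example_event_idx(struct perf_event *event)
{
	assert(event->pmu->event_idx != NULL);
	return event->pmu->event_idx(event);
}
```

Although both helpers simply return 0, they are distinct function types, which is why a separate perf_event_nop_0() is needed for .event_idx.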



Re: [PATCH v5 7/7] pci: Add support for creating a generic host_bridge from device tree

2014-03-04 Thread Jingoo Han

On Wednesday, March 05, 2014 12:50 AM, Liviu Dudau wrote:
> 
> Several platforms use a rather generic version of parsing
> the device tree to find the host bridge ranges. Move the common code
> into the generic PCI code and use it to create a pci_host_bridge
> structure that can be used by arch code.
> 
> Based on early attempts by Andrew Murray to unify the code.
> Used powerpc and microblaze PCI code as starting point.
> 
> Signed-off-by: Liviu Dudau 
> 
> diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
> index 8708b652..800678a 100644
> --- a/drivers/pci/host-bridge.c
> +++ b/drivers/pci/host-bridge.c

[.]

> + res = kzalloc(sizeof(struct resource), GFP_KERNEL);

This causes a build error with exynos_defconfig (ARM32).
Including 'slab.h' is necessary to fix the build error.

./drivers/pci/host-bridge.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include <linux/slab.h>

 #include "pci.h"


> + if (!res) {
> + err = -ENOMEM;
> + goto bridge_ranges_nomem;
> + }
> +
> + of_pci_range_to_resource(&range, dev, res);
> +
> + if (resource_type(res) == IORESOURCE_IO)
> + *io_base = range.cpu_addr;
> +
> + pci_add_resource_offset(resources, res,
> + res->start - range.pci_addr);
> + }
> +
> + /* Apply architecture specific fixups for the ranges */
> + pcibios_fixup_bridge_ranges(resources);

It also causes a link error with exynos_defconfig, as shown below:

drivers/built-in.o: In function `pci_host_bridge_of_get_ranges':
drivers/pci/host-bridge.c:157: undefined reference to `pcibios_fixup_bridge_ranges'
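
An undefined reference like this typically means that generic code calls a hook that only some architectures define. One common way to avoid it is a weak default that architectures may override. The sketch below shows that technique in userspace C with GCC/Clang attributes. It is an assumption about a possible fix, not what the patch series does: the void return type, the struct list_head stand-in and get_ranges_sketch() are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct list_head; illustrative only. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Weak default: used when no architecture supplies its own definition.
 * A strong definition elsewhere would replace it at link time.
 * (The kernel spells this attribute as the __weak macro.)
 */
__attribute__((weak)) void pcibios_fixup_bridge_ranges(struct list_head *resources)
{
	(void)resources; /* nothing to fix up by default */
}

/* Hypothetical caller standing in for the generic range-parsing code. */
int get_ranges_sketch(struct list_head *resources)
{
	assert(resources != NULL);

	/* Always links now, whether or not an arch override exists. */
	pcibios_fixup_bridge_ranges(resources);
	return 0;
}
```

With the weak default in place, configurations such as exynos_defconfig would link even if the architecture provides no fixup of its own.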

Best regards,
Jingoo Han

