from:"Sabrina Dubroca"

Re: e1000_netpoll(): disable_irq() triggers might_sleep() on linux-next

2014-12-02 Thread Sabrina Dubroca

Hello, sorry for the delay.

2014-10-29, 20:36:03 +0100, Peter Zijlstra wrote:
> On Wed, Oct 29, 2014 at 07:33:00PM +0100, Thomas Gleixner wrote:
> > Yuck. No. You are just papering over the problem.
> > 
> > What happens if you add 'threadirqs' to the kernel command line? Or if
> > the interrupt line is shared with a real threaded interrupt user?
> > 
> > The proper solution is to have a poll_lock for e1000 which serializes
> > the hardware interrupt against netpoll instead of using
> > disable/enable_irq().
> > 
> > In fact that's less expensive than the disable/enable_irq() dance and
> > the chance of contention is pretty low. If done right it will be a
> > NOOP for the CONFIG_NET_POLL_CONTROLLER=n case.
> > 
> 
> OK a little something like so then I suppose.. But I suspect most all
> the network drivers will need this and maybe more, disable_irq() is a
> popular little thing and we 'just' changed semantics on them.
> 
> ---
>  drivers/net/ethernet/intel/e1000/e1000.h  |  2 ++
>  drivers/net/ethernet/intel/e1000/e1000_main.c | 22 +-
>  kernel/irq/manage.c   |  2 +-
>  3 files changed, 20 insertions(+), 6 deletions(-)

I've been running with variants of this patch, things seem ok.

As noted earlier, there are a lot of drivers doing this disable_irq +
irq_handler + enable_irq sequence.  I found about 60.
Many already take a lock in the interrupt handler, and look like we
could just remove the call to disable_irq (example: cp_interrupt,
drivers/net/ethernet/realtek/8139cp.c).

Here's how I modified your patch.  The locking compiles away if
CONFIG_NET_POLL_CONTROLLER=n.

I can work on converting all the drivers from disable_irq to
netpoll_irq_lock, if that's okay with you.

In igb there's also a synchronize_irq() called from the netpoll
controller (in igb_irq_disable()), I think a similar locking scheme
would work.
I also saw a few disable_irq_nosync and disable_percpu_irq. These are
okay?


Thanks.

---

diff --git a/drivers/net/ethernet/intel/e1000/e1000.h 
b/drivers/net/ethernet/intel/e1000/e1000.h
index 69707108d23c..79444125b9bd 100644
--- a/drivers/net/ethernet/intel/e1000/e1000.h
+++ b/drivers/net/ethernet/intel/e1000/e1000.h
@@ -68,6 +68,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define BAR_0  0
 #define BAR_1  1
@@ -323,6 +324,8 @@ struct e1000_adapter {
struct delayed_work watchdog_task;
struct delayed_work fifo_stall_task;
struct delayed_work phy_info_task;
+
+   struct netpoll_irq_lock netpoll_lock;
 };
 
 enum e1000_state_t {
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c 
b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 24f3986cfae2..5749a27e5c5e 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 char e1000_driver_name[] = "e1000";
 static char e1000_driver_string[] = "Intel(R) PRO/1000 Network Driver";
@@ -1313,6 +1314,7 @@ static int e1000_sw_init(struct e1000_adapter *adapter)
e1000_irq_disable(adapter);
 
spin_lock_init(>stats_lock);
+   netpoll_irq_lock_init(>netpoll_lock);
 
set_bit(__E1000_DOWN, >flags);
 
@@ -3751,10 +3753,8 @@ void e1000_update_stats(struct e1000_adapter *adapter)
  * @irq: interrupt number
  * @data: pointer to a network interface device structure
  **/
-static irqreturn_t e1000_intr(int irq, void *data)
+static irqreturn_t __e1000_intr(int irq, struct e1000_adapter *adapter)
 {
-   struct net_device *netdev = data;
-   struct e1000_adapter *adapter = netdev_priv(netdev);
struct e1000_hw *hw = >hw;
u32 icr = er32(ICR);
 
@@ -3796,6 +3796,19 @@ static irqreturn_t e1000_intr(int irq, void *data)
return IRQ_HANDLED;
 }
 
+static irqreturn_t e1000_intr(int irq, void *data)
+{
+   struct net_device *netdev = data;
+   struct e1000_adapter *adapter = netdev_priv(netdev);
+   irqreturn_t ret;
+
+   netpoll_irq_lock(>netpoll_lock);
+   ret = __e1000_intr(irq, adapter);
+   netpoll_irq_unlock(>netpoll_lock);
+
+   return ret;
+}
+
 /**
  * e1000_clean - NAPI Rx polling callback
  * @adapter: board private structure
@@ -5220,9 +5233,7 @@ static void e1000_netpoll(struct net_device *netdev)
 {
struct e1000_adapter *adapter = netdev_priv(netdev);
 
-   disable_irq(adapter->pdev->irq);
e1000_intr(adapter->pdev->irq, netdev);
-   enable_irq(adapter->pdev->irq);
 }
 #endif
 
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index b25ee9ffdbe6..a171f1a50e0e 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -117,4 +117,35 @@ static inline bool netpoll_tx_running(struct net_device 
*dev)
 }
 #endif
 
+struct netpoll_irq_lock {
+#ifdef CONFIG_NET_POLL_CONTROLLER
+   spinlock_t lock;
+#endif
+};
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static inline void netpoll_irq_lock_init(struct netpoll_irq_lock

Re: e1000_netpoll(): disable_irq() triggers might_sleep() on linux-next

2014-12-02 Thread Sabrina Dubroca

Hello, sorry for the delay.

2014-10-29, 20:36:03 +0100, Peter Zijlstra wrote:
 On Wed, Oct 29, 2014 at 07:33:00PM +0100, Thomas Gleixner wrote:
  Yuck. No. You are just papering over the problem.
  
  What happens if you add 'threadirqs' to the kernel command line? Or if
  the interrupt line is shared with a real threaded interrupt user?
  
  The proper solution is to have a poll_lock for e1000 which serializes
  the hardware interrupt against netpoll instead of using
  disable/enable_irq().
  
  In fact that's less expensive than the disable/enable_irq() dance and
  the chance of contention is pretty low. If done right it will be a
  NOOP for the CONFIG_NET_POLL_CONTROLLER=n case.
  
 
 OK a little something like so then I suppose.. But I suspect most all
 the network drivers will need this and maybe more, disable_irq() is a
 popular little thing and we 'just' changed semantics on them.
 
 ---
  drivers/net/ethernet/intel/e1000/e1000.h  |  2 ++
  drivers/net/ethernet/intel/e1000/e1000_main.c | 22 +-
  kernel/irq/manage.c   |  2 +-
  3 files changed, 20 insertions(+), 6 deletions(-)

I've been running with variants of this patch, things seem ok.

As noted earlier, there are a lot of drivers doing this disable_irq +
irq_handler + enable_irq sequence.  I found about 60.
Many already take a lock in the interrupt handler, and look like we
could just remove the call to disable_irq (example: cp_interrupt,
drivers/net/ethernet/realtek/8139cp.c).

Here's how I modified your patch.  The locking compiles away if
CONFIG_NET_POLL_CONTROLLER=n.

I can work on converting all the drivers from disable_irq to
netpoll_irq_lock, if that's okay with you.

In igb there's also a synchronize_irq() called from the netpoll
controller (in igb_irq_disable()), I think a similar locking scheme
would work.
I also saw a few disable_irq_nosync and disable_percpu_irq. These are
okay?


Thanks.

---

diff --git a/drivers/net/ethernet/intel/e1000/e1000.h 
b/drivers/net/ethernet/intel/e1000/e1000.h
index 69707108d23c..79444125b9bd 100644
--- a/drivers/net/ethernet/intel/e1000/e1000.h
+++ b/drivers/net/ethernet/intel/e1000/e1000.h
@@ -68,6 +68,7 @@
 #include linux/mii.h
 #include linux/ethtool.h
 #include linux/if_vlan.h
+#include linux/netpoll.h
 
 #define BAR_0  0
 #define BAR_1  1
@@ -323,6 +324,8 @@ struct e1000_adapter {
struct delayed_work watchdog_task;
struct delayed_work fifo_stall_task;
struct delayed_work phy_info_task;
+
+   struct netpoll_irq_lock netpoll_lock;
 };
 
 enum e1000_state_t {
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c 
b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 24f3986cfae2..5749a27e5c5e 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -32,6 +32,7 @@
 #include linux/prefetch.h
 #include linux/bitops.h
 #include linux/if_vlan.h
+#include linux/netpoll.h
 
 char e1000_driver_name[] = e1000;
 static char e1000_driver_string[] = Intel(R) PRO/1000 Network Driver;
@@ -1313,6 +1314,7 @@ static int e1000_sw_init(struct e1000_adapter *adapter)
e1000_irq_disable(adapter);
 
spin_lock_init(adapter-stats_lock);
+   netpoll_irq_lock_init(adapter-netpoll_lock);
 
set_bit(__E1000_DOWN, adapter-flags);
 
@@ -3751,10 +3753,8 @@ void e1000_update_stats(struct e1000_adapter *adapter)
  * @irq: interrupt number
  * @data: pointer to a network interface device structure
  **/
-static irqreturn_t e1000_intr(int irq, void *data)
+static irqreturn_t __e1000_intr(int irq, struct e1000_adapter *adapter)
 {
-   struct net_device *netdev = data;
-   struct e1000_adapter *adapter = netdev_priv(netdev);
struct e1000_hw *hw = adapter-hw;
u32 icr = er32(ICR);
 
@@ -3796,6 +3796,19 @@ static irqreturn_t e1000_intr(int irq, void *data)
return IRQ_HANDLED;
 }
 
+static irqreturn_t e1000_intr(int irq, void *data)
+{
+   struct net_device *netdev = data;
+   struct e1000_adapter *adapter = netdev_priv(netdev);
+   irqreturn_t ret;
+
+   netpoll_irq_lock(adapter-netpoll_lock);
+   ret = __e1000_intr(irq, adapter);
+   netpoll_irq_unlock(adapter-netpoll_lock);
+
+   return ret;
+}
+
 /**
  * e1000_clean - NAPI Rx polling callback
  * @adapter: board private structure
@@ -5220,9 +5233,7 @@ static void e1000_netpoll(struct net_device *netdev)
 {
struct e1000_adapter *adapter = netdev_priv(netdev);
 
-   disable_irq(adapter-pdev-irq);
e1000_intr(adapter-pdev-irq, netdev);
-   enable_irq(adapter-pdev-irq);
 }
 #endif
 
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index b25ee9ffdbe6..a171f1a50e0e 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -117,4 +117,35 @@ static inline bool netpoll_tx_running(struct net_device 
*dev)
 }
 #endif
 
+struct netpoll_irq_lock {
+#ifdef CONFIG_NET_POLL_CONTROLLER
+   spinlock_t lock;
+#endif

Re: linux-next: manual merge of the block tree with the ext4 tree

2014-11-29 Thread Sabrina Dubroca

Hello,

[adding Jeremiah Mahler to CC]

2014-11-27, 14:53:47 +1100, Stephen Rothwell wrote:
> Hi Jens,
> 
> Today's linux-next merge of the block tree got a conflict in
> fs/fs-writeback.c between commit ef7fdf5e8c87 ("vfs: add support for a
> lazytime mount option") from the ext4 tree and commit 9c6ac78eb352
> ("writeback: fix a subtle race condition in I_DIRTY clearing") from the
> block tree.
> 
> I fixed it up (I took a guess, plese check - see below) and can carry
> the fix as necessary (no action is required).
> 
> -- 
> Cheers,
> Stephen Rothwells...@canb.auug.org.au
> 
> diff --cc fs/fs-writeback.c
> index 3d87174408ae,2d609a5fbfea..
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@@ -482,14 -479,30 +482,30 @@@ __writeback_single_inode(struct inode *
>* write_inode()
>*/
>   spin_lock(>i_lock);
> - /* Clear I_DIRTY_PAGES if we've written out all dirty pages */
> - if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> - inode->i_state &= ~I_DIRTY_PAGES;
> + 
>  -dirty = inode->i_state & I_DIRTY;
>  -inode->i_state &= ~I_DIRTY;
>  +dirty = inode->i_state & I_DIRTY_INODE;
>  +inode->i_state &= ~I_DIRTY_INODE;
> + 
> + /*
> +  * Paired with smp_mb() in __mark_inode_dirty().  This allows
> +  * __mark_inode_dirty() to test i_state without grabbing i_lock -
> +  * either they see the I_DIRTY bits cleared or we see the dirtied
> +  * inode.
> +  *
> +  * I_DIRTY_PAGES is always cleared together above even if @mapping
> +  * still has dirty pages.  The flag is reinstated after smp_mb() if
> +  * necessary.  This guarantees that either __mark_inode_dirty()
> +  * sees clear I_DIRTY_PAGES or we see PAGECACHE_TAG_DIRTY.
> +  */
> + smp_mb();
> + 
> + if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> + inode->i_state |= I_DIRTY_PAGES;
> + 
>   spin_unlock(>i_lock);
> + 
>   /* Don't write the inode if only I_DIRTY_PAGES was set */
>  -if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
>  +if (dirty) {
>   int err = write_inode(inode, wbc);
>   if (ret == 0)
>   ret = err;


I think there's a problem in your fix, Stephen.

I'm getting hangs at boot (strangely -- in QEMU -- only when booting
via grub, not when using -kernel) and during shutdown.  Jeremiah seems
to have the same problem and his bisection led to the merge commit:
https://lkml.org/lkml/2014/11/29/17

The following solves both issues for me.  I think it makes sense given
the #defines from ef7fdf5e8c87, since Tejun intended to clear
I_DIRTY_PAGES.


diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index b70e45f45afa..6b2510d97a0a 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -484,7 +484,7 @@ __writeback_single_inode(struct inode *inode, struct 
writeback_control *wbc)
spin_lock(>i_lock);
 
dirty = inode->i_state & I_DIRTY_INODE;
-   inode->i_state &= ~I_DIRTY_INODE;
+   inode->i_state &= ~I_DIRTY;
 
/*
 * Paired with smp_mb() in __mark_inode_dirty().  This allows


-- 
Thanks,
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: linux-next: manual merge of the block tree with the ext4 tree

2014-11-29 Thread Sabrina Dubroca

Hello,

[adding Jeremiah Mahler to CC]

2014-11-27, 14:53:47 +1100, Stephen Rothwell wrote:
 Hi Jens,
 
 Today's linux-next merge of the block tree got a conflict in
 fs/fs-writeback.c between commit ef7fdf5e8c87 (vfs: add support for a
 lazytime mount option) from the ext4 tree and commit 9c6ac78eb352
 (writeback: fix a subtle race condition in I_DIRTY clearing) from the
 block tree.
 
 I fixed it up (I took a guess, plese check - see below) and can carry
 the fix as necessary (no action is required).
 
 -- 
 Cheers,
 Stephen Rothwells...@canb.auug.org.au
 
 diff --cc fs/fs-writeback.c
 index 3d87174408ae,2d609a5fbfea..
 --- a/fs/fs-writeback.c
 +++ b/fs/fs-writeback.c
 @@@ -482,14 -479,30 +482,30 @@@ __writeback_single_inode(struct inode *
* write_inode()
*/
   spin_lock(inode-i_lock);
 - /* Clear I_DIRTY_PAGES if we've written out all dirty pages */
 - if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
 - inode-i_state = ~I_DIRTY_PAGES;
 + 
  -dirty = inode-i_state  I_DIRTY;
  -inode-i_state = ~I_DIRTY;
  +dirty = inode-i_state  I_DIRTY_INODE;
  +inode-i_state = ~I_DIRTY_INODE;
 + 
 + /*
 +  * Paired with smp_mb() in __mark_inode_dirty().  This allows
 +  * __mark_inode_dirty() to test i_state without grabbing i_lock -
 +  * either they see the I_DIRTY bits cleared or we see the dirtied
 +  * inode.
 +  *
 +  * I_DIRTY_PAGES is always cleared together above even if @mapping
 +  * still has dirty pages.  The flag is reinstated after smp_mb() if
 +  * necessary.  This guarantees that either __mark_inode_dirty()
 +  * sees clear I_DIRTY_PAGES or we see PAGECACHE_TAG_DIRTY.
 +  */
 + smp_mb();
 + 
 + if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
 + inode-i_state |= I_DIRTY_PAGES;
 + 
   spin_unlock(inode-i_lock);
 + 
   /* Don't write the inode if only I_DIRTY_PAGES was set */
  -if (dirty  (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
  +if (dirty) {
   int err = write_inode(inode, wbc);
   if (ret == 0)
   ret = err;


I think there's a problem in your fix, Stephen.

I'm getting hangs at boot (strangely -- in QEMU -- only when booting
via grub, not when using -kernel) and during shutdown.  Jeremiah seems
to have the same problem and his bisection led to the merge commit:
https://lkml.org/lkml/2014/11/29/17

The following solves both issues for me.  I think it makes sense given
the #defines from ef7fdf5e8c87, since Tejun intended to clear
I_DIRTY_PAGES.


diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index b70e45f45afa..6b2510d97a0a 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -484,7 +484,7 @@ __writeback_single_inode(struct inode *inode, struct 
writeback_control *wbc)
spin_lock(inode-i_lock);
 
dirty = inode-i_state  I_DIRTY_INODE;
-   inode-i_state = ~I_DIRTY_INODE;
+   inode-i_state = ~I_DIRTY;
 
/*
 * Paired with smp_mb() in __mark_inode_dirty().  This allows


-- 
Thanks,
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

e1000_netpoll(): disable_irq() triggers might_sleep() on linux-next

2014-10-29 Thread Sabrina Dubroca

commit e22b886a8a43b ("sched/wait: Add might_sleep() checks") included
in today's linux-next added a check that fires on e1000 with netpoll:


BUG: sleeping function called from invalid context at kernel/irq/manage.c:104
in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: systemd
no locks held by systemd/1.
irq event stamp: 10102965
hardirqs last  enabled at (10102965): [] 
vprintk_emit+0x2dd/0x6a0
hardirqs last disabled at (10102964): [] 
vprintk_emit+0x77/0x6a0
softirqs last  enabled at (10102342): [] 
__do_softirq+0x27a/0x6f0
softirqs last disabled at (10102337): [] irq_exit+0x56/0xe0
Preemption disabled at:[] printk_emit+0x31/0x33

CPU: 1 PID: 1 Comm: systemd Not tainted 3.18.0-rc2-next-20141029-dirty #222
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.7.5-20140617_173321-var-lib-archbuild-testing-x86_64-tobias 04/01/2014
 81a82291 88001e743978 817df31d 
  88001e7439a8 8108dfa2 88001e7439a8
 81a82291 0068  88001e7439d8
Call Trace:
 [] dump_stack+0x4f/0x7c
 [] ___might_sleep+0x182/0x2b0
 [] __might_sleep+0x3a/0xc0
 [] synchronize_irq+0x38/0xa0
 [] ? __disable_irq_nosync+0x43/0x70
 [] disable_irq+0x20/0x30
 [] e1000_netpoll+0x23/0x60
 [] netpoll_poll_dev+0x72/0x3a0
 [] ? _raw_spin_trylock+0x73/0x90
 [] ? netpoll_send_skb_on_dev+0x1df/0x2e0
 [] netpoll_send_skb_on_dev+0x1b7/0x2e0
 [] netpoll_send_udp+0x2e3/0x490
 [] ? write_msg+0x51/0x140
 [] write_msg+0xcf/0x140
 [] call_console_drivers.constprop.22+0x13b/0x2a0
 [] console_unlock+0x39d/0x500
 [] ? vprintk_emit+0x31e/0x6a0
 [] vprintk_emit+0x33c/0x6a0
 [] ? might_fault+0x5e/0xc0
 [] printk_emit+0x31/0x33
 [] devkmsg_write+0xbd/0x110
 [] do_iter_readv_writev+0x65/0xa0
 [] do_readv_writev+0xe2/0x290
 [] ? vprintk+0x30/0x30
 [] ? rcu_read_lock_held+0x6d/0x70
 [] ? __fget_light+0xc6/0xd0
 [] vfs_writev+0x3c/0x50
 [] SyS_writev+0x4d/0xe0
 [] system_call_fastpath+0x16/0x1b



I'm able to reproduce it consistently by sending a lot of packets from
that interface while writing to /dev/kmsg with netconsole
enabled. Just writing to /dev/kmsg isn't enough.

# with ping -f or pktgen running
for i in `seq 1 20` ; do echo '' > 
/dev/kmsg ; done


-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

e1000_netpoll(): disable_irq() triggers might_sleep() on linux-next

2014-10-29 Thread Sabrina Dubroca

commit e22b886a8a43b (sched/wait: Add might_sleep() checks) included
in today's linux-next added a check that fires on e1000 with netpoll:


BUG: sleeping function called from invalid context at kernel/irq/manage.c:104
in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: systemd
no locks held by systemd/1.
irq event stamp: 10102965
hardirqs last  enabled at (10102965): [810cbafd] 
vprintk_emit+0x2dd/0x6a0
hardirqs last disabled at (10102964): [810cb897] 
vprintk_emit+0x77/0x6a0
softirqs last  enabled at (10102342): [810666aa] 
__do_softirq+0x27a/0x6f0
softirqs last disabled at (10102337): [81066e86] irq_exit+0x56/0xe0
Preemption disabled at:[817de50d] printk_emit+0x31/0x33

CPU: 1 PID: 1 Comm: systemd Not tainted 3.18.0-rc2-next-20141029-dirty #222
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.7.5-20140617_173321-var-lib-archbuild-testing-x86_64-tobias 04/01/2014
 81a82291 88001e743978 817df31d 
  88001e7439a8 8108dfa2 88001e7439a8
 81a82291 0068  88001e7439d8
Call Trace:
 [817df31d] dump_stack+0x4f/0x7c
 [8108dfa2] ___might_sleep+0x182/0x2b0
 [8108e10a] __might_sleep+0x3a/0xc0
 [810ce358] synchronize_irq+0x38/0xa0
 [810ce633] ? __disable_irq_nosync+0x43/0x70
 [810ce690] disable_irq+0x20/0x30
 [815d7253] e1000_netpoll+0x23/0x60
 [81678d02] netpoll_poll_dev+0x72/0x3a0
 [817e9993] ? _raw_spin_trylock+0x73/0x90
 [8167920f] ? netpoll_send_skb_on_dev+0x1df/0x2e0
 [816791e7] netpoll_send_skb_on_dev+0x1b7/0x2e0
 [816795f3] netpoll_send_udp+0x2e3/0x490
 [815d1f61] ? write_msg+0x51/0x140
 [815d1fdf] write_msg+0xcf/0x140
 [810cadbb] call_console_drivers.constprop.22+0x13b/0x2a0
 [810cb6bd] console_unlock+0x39d/0x500
 [810cbb3e] ? vprintk_emit+0x31e/0x6a0
 [810cbb5c] vprintk_emit+0x33c/0x6a0
 [811a6c6e] ? might_fault+0x5e/0xc0
 [817de50d] printk_emit+0x31/0x33
 [810cbfad] devkmsg_write+0xbd/0x110
 [811f24d5] do_iter_readv_writev+0x65/0xa0
 [811f3b72] do_readv_writev+0xe2/0x290
 [810cbef0] ? vprintk+0x30/0x30
 [810d499d] ? rcu_read_lock_held+0x6d/0x70
 [812142d6] ? __fget_light+0xc6/0xd0
 [811f3dac] vfs_writev+0x3c/0x50
 [811f3eed] SyS_writev+0x4d/0xe0
 [817ea16d] system_call_fastpath+0x16/0x1b



I'm able to reproduce it consistently by sending a lot of packets from
that interface while writing to /dev/kmsg with netconsole
enabled. Just writing to /dev/kmsg isn't enough.

# with ping -f or pktgen running
for i in `seq 1 20` ; do echo ''  
/dev/kmsg ; done


-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

kmemleak: Cannot insert [...] into the object search tree (overlaps existing) (mm: use memblock_alloc_range())

2014-09-02 Thread Sabrina Dubroca

Hello,

2014-08-24, 23:56:03 +0900, Akinobu Mita wrote:
> Replace memblock_find_in_range() and memblock_reserve() with
> memblock_alloc_range().
> 
> Signed-off-by: Akinobu Mita 
> Cc: linux...@kvack.org

This patch is included in linux-next, and when I boot next-20140901,
on a 32-bit build, I get this message:


kmemleak: Cannot insert 0xf6556000 into the object search tree (overlaps 
existing)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0-rc3-next-20140901 #126
Hardware name: Dell Inc. Latitude D830   /0UY141, BIOS A02 
06/07/2007
 f6556000  c1891f64 c16226e8 f5c090c0 c1891f98 c11a934c c17f8768
 f6556000 00200282 f5c090e4 0005 0010 f5c09124  c19984e8
 0002 c19f5800 c1891fb0 c162216d 0020 c19984e8 0002 c19f5800
Call Trace:
 [] dump_stack+0x48/0x69
 [] create_object+0x23c/0x290
 [] early_alloc+0x98/0x120
 [] kmemleak_init+0x129/0x226
 [] start_kernel+0x2d5/0x38d
 [] i386_start_kernel+0x79/0x7d
kmemleak: Kernel memory leak detector disabled
kmemleak: Object 0xf6556000 (size 16777216):
kmemleak:   comm "swapper/0", pid 0, jiffies 4294877296
kmemleak:   min_count = 0
kmemleak:   count = 0
kmemleak:   flags = 0x1
kmemleak:   checksum = 0
kmemleak:   backtrace:
 [] kmemleak_alloc+0xa8/0xb0
 [] memblock_alloc_range_nid+0x46/0x50
 [] memblock_virt_alloc_internal+0x89/0xe7
 [] memblock_virt_alloc_try_nid_nopanic+0x58/0x60
 [] alloc_node_mem_map.constprop.72+0x4b/0x8c
 [] free_area_init_node+0xee/0x3a1
 [] free_area_init_nodes+0x36e/0x380
 [] zone_sizes_init+0x33/0x39
 [] paging_init+0xaa/0xad
 [] native_pagetable_init+0x54/0xe7
 [] setup_arch+0xb21/0xc07
 [] start_kernel+0x79/0x38d
 [] i386_start_kernel+0x79/0x7d
 [] 0x
kmemleak: Early log backtrace:
   [] kmemleak_alloc+0xa8/0xb0
   [] memblock_virt_alloc_internal+0xcc/0xe7
   [] memblock_virt_alloc_try_nid_nopanic+0x58/0x60
   [] alloc_node_mem_map.constprop.72+0x4b/0x8c
   [] free_area_init_node+0xee/0x3a1
   [] free_area_init_nodes+0x36e/0x380
   [] zone_sizes_init+0x33/0x39
   [] paging_init+0xaa/0xad
   [] native_pagetable_init+0x54/0xe7
   [] setup_arch+0xb21/0xc07
   [] start_kernel+0x79/0x38d
   [] i386_start_kernel+0x79/0x7d
   [] 0x



git bisect pointed to this patch:

abc65ff21e61d49269bf8fafd486fff2e3679c21 is the first bad commit
commit abc65ff21e61d49269bf8fafd486fff2e3679c21
Author: Akinobu Mita 
Date:   Mon Sep 1 23:48:54 2014 +0100

mm: use memblock_alloc_range()

Replace memblock_find_in_range() and memblock_reserve() with
the equivalent memblock_alloc_range().

Signed-off-by: Akinobu Mita 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Yinghai Lu 
Signed-off-by: Andrew Morton 


## git bisect log

git bisect start
# good: [a3793b0cedfc0bc3212e5ebf5b79710c910687c4] Merge remote-tracking branch 
'llvmlinux/for-next'
git bisect good a3793b0cedfc0bc3212e5ebf5b79710c910687c4
# bad: [03af78748485f63e8ed21d2e2585b5d1ec862ba6] Add linux-next specific files 
for 20140901
git bisect bad 03af78748485f63e8ed21d2e2585b5d1ec862ba6
# good: [42330182961e380c6ac85c1482ff8115ddc487dd] mempolicy: unexport 
get_vma_policy() and remove its "task" arg
git bisect good 42330182961e380c6ac85c1482ff8115ddc487dd
# bad: [2437e4e8841cafe9c086a98d4b0186196c7e10af] MAINTAINERS: remove non 
existent files
git bisect bad 2437e4e8841cafe9c086a98d4b0186196c7e10af
# bad: [36fb2fa2c928947f728dc8119a7143fe9f61033c] zsmalloc: change return value 
unit of zs_get_total_size_bytes
git bisect bad 36fb2fa2c928947f728dc8119a7143fe9f61033c
# bad: [658f7da49d34bc6187e6cd1ec57933d1a2a76035] mm: introduce dump_vma
git bisect bad 658f7da49d34bc6187e6cd1ec57933d1a2a76035
# skip: [2090938a202a34e6ea28a40a9b98214795546882] mm: introduce common page 
state for ballooned memory
git bisect skip 2090938a202a34e6ea28a40a9b98214795546882
# skip: [843cbba246f248585f88dffec89304545d2f3bde] 
mm-introduce-common-page-state-for-ballooned-memory-fix
git bisect skip 843cbba246f248585f88dffec89304545d2f3bde
# bad: [92a1357eacd714671071871e95bdaf9144aa622a] 
mm-balloon_compaction-general-cleanup-checkpatch-fixes
git bisect bad 92a1357eacd714671071871e95bdaf9144aa622a
# skip: [b19a479c8ae91329771288310701f996bc100947] 
selftests/vm/transhuge-stress: stress test for memory compaction
git bisect skip b19a479c8ae91329771288310701f996bc100947
# bad: [e0e398dffe88d10dcda4e41941d677aa337410e5] mm/balloon_compaction: ignore 
anonymous pages
git bisect bad e0e398dffe88d10dcda4e41941d677aa337410e5
# bad: [15ccb0a452bb0e2f0edb25747110aae73fd9a962] include/linux/migrate.h: 
remove migrate_page #define
git bisect bad 15ccb0a452bb0e2f0edb25747110aae73fd9a962
# bad: [abc65ff21e61d49269bf8fafd486fff2e3679c21] mm: use memblock_alloc_range()
git bisect bad abc65ff21e61d49269bf8fafd486fff2e3679c21
# first bad commit: [abc65ff21e61d49269bf8fafd486fff2e3679c21] mm: use 
memblock_alloc_range()



Thanks,

-- 
Sabrina
--
To unsubscribe from this

[PATCH net v2] ipv6: fix rtnl locking in setsockopt for anycast and multicast

2014-09-02 Thread Sabrina Dubroca

Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST
triggers the assertion in addrconf_join_solict()/addrconf_leave_solict()

ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to
take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with
ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before
calling ipv6_dev_mc_inc/dec.

This patch moves ASSERT_RTNL() up a level in the call stack.

Signed-off-by: Cong Wang 
Signed-off-by: Sabrina Dubroca 
Reported-by: Tommi Rantala 
---
As was said earlier, this should go in stable.

v2:
 - based on net
 - keep dev_get_by_flags_rcu and RCU in ipv6_sock_ac_*
 - remove two ASSERT_RTNL() that are not necessary

Thank you for your help, Hannes!

 net/ipv6/addrconf.c | 15 +--
 net/ipv6/anycast.c  | 12 
 net/ipv6/mcast.c| 14 ++
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 0b239fc1816e..aa0e135b808c 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1690,14 +1690,12 @@ void addrconf_dad_failure(struct inet6_ifaddr *ifp)
addrconf_mod_dad_work(ifp, 0);
 }
 
-/* Join to solicited addr multicast group. */
-
+/* Join to solicited addr multicast group.
+ * caller must hold RTNL */
 void addrconf_join_solict(struct net_device *dev, const struct in6_addr *addr)
 {
struct in6_addr maddr;
 
-   ASSERT_RTNL();
-
if (dev->flags&(IFF_LOOPBACK|IFF_NOARP))
return;
 
@@ -1705,12 +1703,11 @@ void addrconf_join_solict(struct net_device *dev, const 
struct in6_addr *addr)
ipv6_dev_mc_inc(dev, );
 }
 
+/* caller must hold RTNL */
 void addrconf_leave_solict(struct inet6_dev *idev, const struct in6_addr *addr)
 {
struct in6_addr maddr;
 
-   ASSERT_RTNL();
-
if (idev->dev->flags&(IFF_LOOPBACK|IFF_NOARP))
return;
 
@@ -1718,12 +1715,11 @@ void addrconf_leave_solict(struct inet6_dev *idev, 
const struct in6_addr *addr)
__ipv6_dev_mc_dec(idev, );
 }
 
+/* caller must hold RTNL */
 static void addrconf_join_anycast(struct inet6_ifaddr *ifp)
 {
struct in6_addr addr;
 
-   ASSERT_RTNL();
-
if (ifp->prefix_len >= 127) /* RFC 6164 */
return;
ipv6_addr_prefix(, >addr, ifp->prefix_len);
@@ -1732,12 +1728,11 @@ static void addrconf_join_anycast(struct inet6_ifaddr 
*ifp)
ipv6_dev_ac_inc(ifp->idev->dev, );
 }
 
+/* caller must hold RTNL */
 static void addrconf_leave_anycast(struct inet6_ifaddr *ifp)
 {
struct in6_addr addr;
 
-   ASSERT_RTNL();
-
if (ifp->prefix_len >= 127) /* RFC 6164 */
return;
ipv6_addr_prefix(, >addr, ifp->prefix_len);
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 210183244689..45b9d81d91e8 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -77,6 +77,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
pac->acl_next = NULL;
pac->acl_addr = *addr;
 
+   rtnl_lock();
rcu_read_lock();
if (ifindex == 0) {
struct rt6_info *rt;
@@ -137,6 +138,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
 
 error:
rcu_read_unlock();
+   rtnl_unlock();
if (pac)
sock_kfree_s(sk, pac, sizeof(*pac));
return err;
@@ -171,13 +173,17 @@ int ipv6_sock_ac_drop(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
 
spin_unlock_bh(_sk_ac_lock);
 
+   rtnl_lock();
rcu_read_lock();
dev = dev_get_by_index_rcu(net, pac->acl_ifindex);
if (dev)
ipv6_dev_ac_dec(dev, >acl_addr);
rcu_read_unlock();
+   rtnl_unlock();
 
sock_kfree_s(sk, pac, sizeof(*pac));
+   if (!dev)
+   return -ENODEV;
return 0;
 }
 
@@ -198,6 +204,7 @@ void ipv6_sock_ac_close(struct sock *sk)
spin_unlock_bh(_sk_ac_lock);
 
prev_index = 0;
+   rtnl_lock();
rcu_read_lock();
while (pac) {
struct ipv6_ac_socklist *next = pac->acl_next;
@@ -212,6 +219,7 @@ void ipv6_sock_ac_close(struct sock *sk)
pac = next;
}
rcu_read_unlock();
+   rtnl_unlock();
 }
 
 static void aca_put(struct ifacaddr6 *ac)
@@ -233,6 +241,8 @@ int ipv6_dev_ac_inc(struct net_device *dev, const struct 
in6_addr *addr)
struct rt6_info *rt;
int err;
 
+   ASSERT_RTNL();
+
idev = in6_dev_get(dev);
 
if (idev == NULL)
@@ -302,6 +312,8 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct 
in6_addr *addr)
 {
struct ifacaddr6 *aca, *prev_aca;
 
+   ASSERT_RTNL();
+
write_lock_bh(>lock);
prev_aca = NULL;
for (aca = idev->ac_list; aca; aca = aca->aca_next) {
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 617f0958

[PATCH net v2] ipv6: fix rtnl locking in setsockopt for anycast and multicast

2014-09-02 Thread Sabrina Dubroca

Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST
triggers the assertion in addrconf_join_solict()/addrconf_leave_solict()

ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to
take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with
ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before
calling ipv6_dev_mc_inc/dec.

This patch moves ASSERT_RTNL() up a level in the call stack.

Signed-off-by: Cong Wang xiyou.wangc...@gmail.com
Signed-off-by: Sabrina Dubroca s...@queasysnail.net
Reported-by: Tommi Rantala tt.rant...@gmail.com
---
As was said earlier, this should go in stable.

v2:
 - based on net
 - keep dev_get_by_flags_rcu and RCU in ipv6_sock_ac_*
 - remove two ASSERT_RTNL() that are not necessary

Thank you for your help, Hannes!

 net/ipv6/addrconf.c | 15 +--
 net/ipv6/anycast.c  | 12 
 net/ipv6/mcast.c| 14 ++
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 0b239fc1816e..aa0e135b808c 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1690,14 +1690,12 @@ void addrconf_dad_failure(struct inet6_ifaddr *ifp)
addrconf_mod_dad_work(ifp, 0);
 }
 
-/* Join to solicited addr multicast group. */
-
+/* Join to solicited addr multicast group.
+ * caller must hold RTNL */
 void addrconf_join_solict(struct net_device *dev, const struct in6_addr *addr)
 {
struct in6_addr maddr;
 
-   ASSERT_RTNL();
-
if (dev-flags(IFF_LOOPBACK|IFF_NOARP))
return;
 
@@ -1705,12 +1703,11 @@ void addrconf_join_solict(struct net_device *dev, const 
struct in6_addr *addr)
ipv6_dev_mc_inc(dev, maddr);
 }
 
+/* caller must hold RTNL */
 void addrconf_leave_solict(struct inet6_dev *idev, const struct in6_addr *addr)
 {
struct in6_addr maddr;
 
-   ASSERT_RTNL();
-
if (idev-dev-flags(IFF_LOOPBACK|IFF_NOARP))
return;
 
@@ -1718,12 +1715,11 @@ void addrconf_leave_solict(struct inet6_dev *idev, 
const struct in6_addr *addr)
__ipv6_dev_mc_dec(idev, maddr);
 }
 
+/* caller must hold RTNL */
 static void addrconf_join_anycast(struct inet6_ifaddr *ifp)
 {
struct in6_addr addr;
 
-   ASSERT_RTNL();
-
if (ifp-prefix_len = 127) /* RFC 6164 */
return;
ipv6_addr_prefix(addr, ifp-addr, ifp-prefix_len);
@@ -1732,12 +1728,11 @@ static void addrconf_join_anycast(struct inet6_ifaddr 
*ifp)
ipv6_dev_ac_inc(ifp-idev-dev, addr);
 }
 
+/* caller must hold RTNL */
 static void addrconf_leave_anycast(struct inet6_ifaddr *ifp)
 {
struct in6_addr addr;
 
-   ASSERT_RTNL();
-
if (ifp-prefix_len = 127) /* RFC 6164 */
return;
ipv6_addr_prefix(addr, ifp-addr, ifp-prefix_len);
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 210183244689..45b9d81d91e8 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -77,6 +77,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
pac-acl_next = NULL;
pac-acl_addr = *addr;
 
+   rtnl_lock();
rcu_read_lock();
if (ifindex == 0) {
struct rt6_info *rt;
@@ -137,6 +138,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
 
 error:
rcu_read_unlock();
+   rtnl_unlock();
if (pac)
sock_kfree_s(sk, pac, sizeof(*pac));
return err;
@@ -171,13 +173,17 @@ int ipv6_sock_ac_drop(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
 
spin_unlock_bh(ipv6_sk_ac_lock);
 
+   rtnl_lock();
rcu_read_lock();
dev = dev_get_by_index_rcu(net, pac-acl_ifindex);
if (dev)
ipv6_dev_ac_dec(dev, pac-acl_addr);
rcu_read_unlock();
+   rtnl_unlock();
 
sock_kfree_s(sk, pac, sizeof(*pac));
+   if (!dev)
+   return -ENODEV;
return 0;
 }
 
@@ -198,6 +204,7 @@ void ipv6_sock_ac_close(struct sock *sk)
spin_unlock_bh(ipv6_sk_ac_lock);
 
prev_index = 0;
+   rtnl_lock();
rcu_read_lock();
while (pac) {
struct ipv6_ac_socklist *next = pac-acl_next;
@@ -212,6 +219,7 @@ void ipv6_sock_ac_close(struct sock *sk)
pac = next;
}
rcu_read_unlock();
+   rtnl_unlock();
 }
 
 static void aca_put(struct ifacaddr6 *ac)
@@ -233,6 +241,8 @@ int ipv6_dev_ac_inc(struct net_device *dev, const struct 
in6_addr *addr)
struct rt6_info *rt;
int err;
 
+   ASSERT_RTNL();
+
idev = in6_dev_get(dev);
 
if (idev == NULL)
@@ -302,6 +312,8 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct 
in6_addr *addr)
 {
struct ifacaddr6 *aca, *prev_aca;
 
+   ASSERT_RTNL();
+
write_lock_bh(idev-lock);
prev_aca = NULL;
for (aca = idev-ac_list; aca; aca = aca-aca_next) {
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c

kmemleak: Cannot insert [...] into the object search tree (overlaps existing) (mm: use memblock_alloc_range())

2014-09-02 Thread Sabrina Dubroca

Hello,

2014-08-24, 23:56:03 +0900, Akinobu Mita wrote:
 Replace memblock_find_in_range() and memblock_reserve() with
 memblock_alloc_range().
 
 Signed-off-by: Akinobu Mita akinobu.m...@gmail.com
 Cc: linux...@kvack.org

This patch is included in linux-next, and when I boot next-20140901,
on a 32-bit build, I get this message:


kmemleak: Cannot insert 0xf6556000 into the object search tree (overlaps 
existing)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0-rc3-next-20140901 #126
Hardware name: Dell Inc. Latitude D830   /0UY141, BIOS A02 
06/07/2007
 f6556000  c1891f64 c16226e8 f5c090c0 c1891f98 c11a934c c17f8768
 f6556000 00200282 f5c090e4 0005 0010 f5c09124  c19984e8
 0002 c19f5800 c1891fb0 c162216d 0020 c19984e8 0002 c19f5800
Call Trace:
 [c16226e8] dump_stack+0x48/0x69
 [c11a934c] create_object+0x23c/0x290
 [c162216d] early_alloc+0x98/0x120
 [c195b10d] kmemleak_init+0x129/0x226
 [c19399f7] start_kernel+0x2d5/0x38d
 [c19392ab] i386_start_kernel+0x79/0x7d
kmemleak: Kernel memory leak detector disabled
kmemleak: Object 0xf6556000 (size 16777216):
kmemleak:   comm swapper/0, pid 0, jiffies 4294877296
kmemleak:   min_count = 0
kmemleak:   count = 0
kmemleak:   flags = 0x1
kmemleak:   checksum = 0
kmemleak:   backtrace:
 [c1620048] kmemleak_alloc+0xa8/0xb0
 [c19595bd] memblock_alloc_range_nid+0x46/0x50
 [c195965f] memblock_virt_alloc_internal+0x89/0xe7
 [c195978b] memblock_virt_alloc_try_nid_nopanic+0x58/0x60
 [c161fc2f] alloc_node_mem_map.constprop.72+0x4b/0x8c
 [c195623f] free_area_init_node+0xee/0x3a1
 [c1956860] free_area_init_nodes+0x36e/0x380
 [c194b8a5] zone_sizes_init+0x33/0x39
 [c194c112] paging_init+0xaa/0xad
 [c194c169] native_pagetable_init+0x54/0xe7
 [c193c9d9] setup_arch+0xb21/0xc07
 [c193979b] start_kernel+0x79/0x38d
 [c19392ab] i386_start_kernel+0x79/0x7d
 [] 0x
kmemleak: Early log backtrace:
   [c1620048] kmemleak_alloc+0xa8/0xb0
   [c19596a2] memblock_virt_alloc_internal+0xcc/0xe7
   [c195978b] memblock_virt_alloc_try_nid_nopanic+0x58/0x60
   [c161fc2f] alloc_node_mem_map.constprop.72+0x4b/0x8c
   [c195623f] free_area_init_node+0xee/0x3a1
   [c1956860] free_area_init_nodes+0x36e/0x380
   [c194b8a5] zone_sizes_init+0x33/0x39
   [c194c112] paging_init+0xaa/0xad
   [c194c169] native_pagetable_init+0x54/0xe7
   [c193c9d9] setup_arch+0xb21/0xc07
   [c193979b] start_kernel+0x79/0x38d
   [c19392ab] i386_start_kernel+0x79/0x7d
   [] 0x



git bisect pointed to this patch:

abc65ff21e61d49269bf8fafd486fff2e3679c21 is the first bad commit
commit abc65ff21e61d49269bf8fafd486fff2e3679c21
Author: Akinobu Mita akinobu.m...@gmail.com
Date:   Mon Sep 1 23:48:54 2014 +0100

mm: use memblock_alloc_range()

Replace memblock_find_in_range() and memblock_reserve() with
the equivalent memblock_alloc_range().

Signed-off-by: Akinobu Mita akinobu.m...@gmail.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
Cc: Yinghai Lu ying...@kernel.org
Signed-off-by: Andrew Morton a...@linux-foundation.org


## git bisect log

git bisect start
# good: [a3793b0cedfc0bc3212e5ebf5b79710c910687c4] Merge remote-tracking branch 
'llvmlinux/for-next'
git bisect good a3793b0cedfc0bc3212e5ebf5b79710c910687c4
# bad: [03af78748485f63e8ed21d2e2585b5d1ec862ba6] Add linux-next specific files 
for 20140901
git bisect bad 03af78748485f63e8ed21d2e2585b5d1ec862ba6
# good: [42330182961e380c6ac85c1482ff8115ddc487dd] mempolicy: unexport 
get_vma_policy() and remove its task arg
git bisect good 42330182961e380c6ac85c1482ff8115ddc487dd
# bad: [2437e4e8841cafe9c086a98d4b0186196c7e10af] MAINTAINERS: remove non 
existent files
git bisect bad 2437e4e8841cafe9c086a98d4b0186196c7e10af
# bad: [36fb2fa2c928947f728dc8119a7143fe9f61033c] zsmalloc: change return value 
unit of zs_get_total_size_bytes
git bisect bad 36fb2fa2c928947f728dc8119a7143fe9f61033c
# bad: [658f7da49d34bc6187e6cd1ec57933d1a2a76035] mm: introduce dump_vma
git bisect bad 658f7da49d34bc6187e6cd1ec57933d1a2a76035
# skip: [2090938a202a34e6ea28a40a9b98214795546882] mm: introduce common page 
state for ballooned memory
git bisect skip 2090938a202a34e6ea28a40a9b98214795546882
# skip: [843cbba246f248585f88dffec89304545d2f3bde] 
mm-introduce-common-page-state-for-ballooned-memory-fix
git bisect skip 843cbba246f248585f88dffec89304545d2f3bde
# bad: [92a1357eacd714671071871e95bdaf9144aa622a] 
mm-balloon_compaction-general-cleanup-checkpatch-fixes
git bisect bad 92a1357eacd714671071871e95bdaf9144aa622a
# skip: [b19a479c8ae91329771288310701f996bc100947] 
selftests/vm/transhuge-stress: stress test for memory compaction
git bisect skip b19a479c8ae91329771288310701f996bc100947
# bad: [e0e398dffe88d10dcda4e41941d677aa337410e5] mm/balloon_compaction: ignore 
anonymous pages
git bisect bad e0e398dffe88d10dcda4e41941d677aa337410e5
# bad:

[PATCH] ipv6: fix rtnl locking in setsockopt for anycast and multicast

2014-09-01 Thread Sabrina Dubroca

Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST
triggers the assertion in addrconf_join_solict()/addrconf_leave_solict()

ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to
take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with
ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before
calling ipv6_dev_mc_inc/dec.

This patch moves ASSERT_RTNL() up a level in the call stack.

Signed-off-by: Cong Wang 
Signed-off-by: Sabrina Dubroca 
Reported-by: Tommi Rantala 
---
I included Cong's Signed-off-by for the first part of the patch,
I hope that's OK.

This patch is based on -next, but since the assertion can also be
triggered on a current kernel (tested on a 3.16), I think it should
also go in stable.

 include/linux/netdevice.h |  4 ++--
 net/core/dev.c| 11 ++-
 net/ipv6/addrconf.c   | 15 +--
 net/ipv6/anycast.c| 30 +++---
 net/ipv6/mcast.c  | 16 
 5 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 429801370d0c..1ae0e745b1b1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2077,8 +2077,8 @@ void __dev_remove_pack(struct packet_type *pt);
 void dev_add_offload(struct packet_offload *po);
 void dev_remove_offload(struct packet_offload *po);
 
-struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short flags,
-   unsigned short mask);
+struct net_device *dev_get_by_flags(struct net *net, unsigned short flags,
+   unsigned short mask);
 struct net_device *dev_get_by_name(struct net *net, const char *name);
 struct net_device *dev_get_by_name_rcu(struct net *net, const char *name);
 struct net_device *__dev_get_by_name(struct net *net, const char *name);
diff --git a/net/core/dev.c b/net/core/dev.c
index 443b814db05b..8fede6ef4a39 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -897,23 +897,24 @@ struct net_device *dev_getfirstbyhwtype(struct net *net, 
unsigned short type)
 EXPORT_SYMBOL(dev_getfirstbyhwtype);
 
 /**
- * dev_get_by_flags_rcu - find any device with given flags
+ * dev_get_by_flags - find any device with given flags
  * @net: the applicable net namespace
  * @if_flags: IFF_* values
  * @mask: bitmask of bits in if_flags to check
  *
  * Search for any interface with the given flags. Returns NULL if a device
  * is not found or a pointer to the device. Must be called inside
- * rcu_read_lock(), and result refcount is unchanged.
+ * rtnl_lock(), and result refcount is unchanged.
  */
 
-struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short 
if_flags,
+struct net_device *dev_get_by_flags(struct net *net, unsigned short if_flags,
unsigned short mask)
 {
struct net_device *dev, *ret;
 
+   ASSERT_RTNL();
ret = NULL;
-   for_each_netdev_rcu(net, dev) {
+   for_each_netdev(net, dev) {
if (((dev->flags ^ if_flags) & mask) == 0) {
ret = dev;
break;
@@ -921,7 +922,7 @@ struct net_device *dev_get_by_flags_rcu(struct net *net, 
unsigned short if_flags
}
return ret;
 }
-EXPORT_SYMBOL(dev_get_by_flags_rcu);
+EXPORT_SYMBOL(dev_get_by_flags);
 
 /**
  * dev_valid_name - check if name is okay for network device
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 267ce3caee24..7ada65937d23 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1690,14 +1690,12 @@ void addrconf_dad_failure(struct inet6_ifaddr *ifp)
addrconf_mod_dad_work(ifp, 0);
 }
 
-/* Join to solicited addr multicast group. */
-
+/* Join to solicited addr multicast group.
+ * caller must hold RTNL */
 void addrconf_join_solict(struct net_device *dev, const struct in6_addr *addr)
 {
struct in6_addr maddr;
 
-   ASSERT_RTNL();
-
if (dev->flags&(IFF_LOOPBACK|IFF_NOARP))
return;
 
@@ -1705,12 +1703,11 @@ void addrconf_join_solict(struct net_device *dev, const 
struct in6_addr *addr)
ipv6_dev_mc_inc(dev, );
 }
 
+/* caller must hold RTNL */
 void addrconf_leave_solict(struct inet6_dev *idev, const struct in6_addr *addr)
 {
struct in6_addr maddr;
 
-   ASSERT_RTNL();
-
if (idev->dev->flags&(IFF_LOOPBACK|IFF_NOARP))
return;
 
@@ -1718,12 +1715,11 @@ void addrconf_leave_solict(struct inet6_dev *idev, 
const struct in6_addr *addr)
__ipv6_dev_mc_dec(idev, );
 }
 
+/* caller must hold RTNL */
 static void addrconf_join_anycast(struct inet6_ifaddr *ifp)
 {
struct in6_addr addr;
 
-   ASSERT_RTNL();
-
if (ifp->prefix_len >= 127) /* RFC 6164 */
return;
ipv6_addr_prefix(, >addr, ifp->prefix_len);
@@ -1732,12 +1728,11 @@ static void addrconf_jo

[PATCH] ipv6: fix rtnl locking in setsockopt for anycast and multicast

2014-09-01 Thread Sabrina Dubroca

Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST
triggers the assertion in addrconf_join_solict()/addrconf_leave_solict()

ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to
take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with
ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before
calling ipv6_dev_mc_inc/dec.

This patch moves ASSERT_RTNL() up a level in the call stack.

Signed-off-by: Cong Wang xiyou.wangc...@gmail.com
Signed-off-by: Sabrina Dubroca s...@queasysnail.net
Reported-by: Tommi Rantala tt.rant...@gmail.com
---
I included Cong's Signed-off-by for the first part of the patch,
I hope that's OK.

This patch is based on -next, but since the assertion can also be
triggered on a current kernel (tested on a 3.16), I think it should
also go in stable.

 include/linux/netdevice.h |  4 ++--
 net/core/dev.c| 11 ++-
 net/ipv6/addrconf.c   | 15 +--
 net/ipv6/anycast.c| 30 +++---
 net/ipv6/mcast.c  | 16 
 5 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 429801370d0c..1ae0e745b1b1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2077,8 +2077,8 @@ void __dev_remove_pack(struct packet_type *pt);
 void dev_add_offload(struct packet_offload *po);
 void dev_remove_offload(struct packet_offload *po);
 
-struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short flags,
-   unsigned short mask);
+struct net_device *dev_get_by_flags(struct net *net, unsigned short flags,
+   unsigned short mask);
 struct net_device *dev_get_by_name(struct net *net, const char *name);
 struct net_device *dev_get_by_name_rcu(struct net *net, const char *name);
 struct net_device *__dev_get_by_name(struct net *net, const char *name);
diff --git a/net/core/dev.c b/net/core/dev.c
index 443b814db05b..8fede6ef4a39 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -897,23 +897,24 @@ struct net_device *dev_getfirstbyhwtype(struct net *net, 
unsigned short type)
 EXPORT_SYMBOL(dev_getfirstbyhwtype);
 
 /**
- * dev_get_by_flags_rcu - find any device with given flags
+ * dev_get_by_flags - find any device with given flags
  * @net: the applicable net namespace
  * @if_flags: IFF_* values
  * @mask: bitmask of bits in if_flags to check
  *
  * Search for any interface with the given flags. Returns NULL if a device
  * is not found or a pointer to the device. Must be called inside
- * rcu_read_lock(), and result refcount is unchanged.
+ * rtnl_lock(), and result refcount is unchanged.
  */
 
-struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short 
if_flags,
+struct net_device *dev_get_by_flags(struct net *net, unsigned short if_flags,
unsigned short mask)
 {
struct net_device *dev, *ret;
 
+   ASSERT_RTNL();
ret = NULL;
-   for_each_netdev_rcu(net, dev) {
+   for_each_netdev(net, dev) {
if (((dev-flags ^ if_flags)  mask) == 0) {
ret = dev;
break;
@@ -921,7 +922,7 @@ struct net_device *dev_get_by_flags_rcu(struct net *net, 
unsigned short if_flags
}
return ret;
 }
-EXPORT_SYMBOL(dev_get_by_flags_rcu);
+EXPORT_SYMBOL(dev_get_by_flags);
 
 /**
  * dev_valid_name - check if name is okay for network device
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 267ce3caee24..7ada65937d23 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1690,14 +1690,12 @@ void addrconf_dad_failure(struct inet6_ifaddr *ifp)
addrconf_mod_dad_work(ifp, 0);
 }
 
-/* Join to solicited addr multicast group. */
-
+/* Join to solicited addr multicast group.
+ * caller must hold RTNL */
 void addrconf_join_solict(struct net_device *dev, const struct in6_addr *addr)
 {
struct in6_addr maddr;
 
-   ASSERT_RTNL();
-
if (dev-flags(IFF_LOOPBACK|IFF_NOARP))
return;
 
@@ -1705,12 +1703,11 @@ void addrconf_join_solict(struct net_device *dev, const 
struct in6_addr *addr)
ipv6_dev_mc_inc(dev, maddr);
 }
 
+/* caller must hold RTNL */
 void addrconf_leave_solict(struct inet6_dev *idev, const struct in6_addr *addr)
 {
struct in6_addr maddr;
 
-   ASSERT_RTNL();
-
if (idev-dev-flags(IFF_LOOPBACK|IFF_NOARP))
return;
 
@@ -1718,12 +1715,11 @@ void addrconf_leave_solict(struct inet6_dev *idev, 
const struct in6_addr *addr)
__ipv6_dev_mc_dec(idev, maddr);
 }
 
+/* caller must hold RTNL */
 static void addrconf_join_anycast(struct inet6_ifaddr *ifp)
 {
struct in6_addr addr;
 
-   ASSERT_RTNL();
-
if (ifp-prefix_len = 127) /* RFC 6164 */
return;
ipv6_addr_prefix(addr, ifp-addr, ifp-prefix_len);
@@ -1732,12

Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)

2014-08-30 Thread Sabrina Dubroca

2014-08-30, 12:58:21 +0200, Sabrina Dubroca wrote:
>- pndisc_constructor, called from pneigh_lookup -- pneigh_lookup
>  has ASSERT_RTNL(), but pneigh_lookup is called from ip6_forward and
>  ndisc_recv_na

Ah, these have creat = 0, so it's fine. I missed that earlier.

Sorry for the noise.

-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)

2014-08-30 Thread Sabrina Dubroca

Hello,

2014-08-30, 03:51:29 +0200, Hannes Frederic Sowa wrote:
> Hi Sabrina,
> 
> [...]
> 
> Sorry, just had time to look at this.
> 
> The reason is not to have list corruption but that the calls down to
> ndo_set_rx_mode expect rtnl to be locked by the drivers. Filter lists
> are locked by addr_list_lock and that's why I think we never saw any
> problems with that, but drivers expect rtnl locked for those calls.
> 
> But this problem also affects multicast join, so patch seems incomplete
> to me (and for that matter ssm multicast join, too).
> 
> Also rtnl_lock and rcu_read_lock compose in that order, so we don't need
> to change dev_get_by_flags, but as this is the only user it sure is
> possible. RCU locked version is just easier composeable, so I wouldn't
> touch that if needed in future, just also take rcu lock as before.
> 
> So just adding rtnl_lock add appropriate places seems to be ok to me,
> but still need to review parts of the ssm code.
> 
> Also we should move ASSERT_RTNL checks from addrconf_join_solict to
> ipv6_dev_mc_inc/dec.
> 
> Thanks,
> Hannes

Thanks for explaining.

I had a look at what you suggested.


So, for anycast, on top of the previous patch, we'd have:

---
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 210183244689..61dd3046b804 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
 static void aca_put(struct ifacaddr6 *ac)
@@ -233,6 +235,8 @@ int ipv6_dev_ac_inc(struct net_device *dev, const struct 
in6_addr *addr)
struct rt6_info *rt;
int err;
 
+   ASSERT_RTNL();
+
idev = in6_dev_get(dev);
 
if (idev == NULL)
@@ -302,6 +306,8 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct 
in6_addr *addr)
 {
struct ifacaddr6 *aca, *prev_aca;
 
+   ASSERT_RTNL();
+
write_lock_bh(>lock);
prev_aca = NULL;
for (aca = idev->ac_list; aca; aca = aca->aca_next) {
@@ -336,6 +342,8 @@ static int ipv6_dev_ac_dec(struct net_device *dev, const 
struct in6_addr *addr)
 {
struct inet6_dev *idev = __in6_dev_get(dev);
 
+   ASSERT_RTNL();
+
if (idev == NULL)
return -ENODEV;
return __ipv6_dev_ac_dec(idev, addr);

---


And for multicast:
 - locking order in the patch below: rtnl -> rcu -> ipv6_sk_mc_lock
 - ipv6_sock_mc_join: maybe move all the _unlock()'s together at the end of the 
function
 - do we need to modify rcu_dereference_protected in 
ipv6_sock_mc_drop/ipv6_sock_mc_close
 - I had a look at the other codepaths that call ipv6_dev_mc_inc/dec
   - ipv6_mc_destroy_dev, dev_forward_change, ipv6_add_dev,
 addrconf_join_solict -- all take rtnl or already have an
 ASSERT_RTNL()
   - pndisc_destructor, called from pneigh_ifdown/pneigh_delete
   - pndisc_constructor, called from pneigh_lookup -- pneigh_lookup
 has ASSERT_RTNL(), but pneigh_lookup is called from ip6_forward and
 ndisc_recv_na
   - (hope I didn't miss any callers)

As far as I could see, apart maybe from pndisc_constructor, it seems
okay, but I'd like to hear your comments.


Current modifications:

---
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 70881795da96..d73ac1ef65f2 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -172,6 +172,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
mc_lst->next = NULL;
mc_lst->addr = *addr;
 
+   rtnl_lock();
rcu_read_lock();
if (ifindex == 0) {
struct rt6_info *rt;
@@ -185,6 +186,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
 
if (dev == NULL) {
rcu_read_unlock();
+   rtnl_unlock();
sock_kfree_s(sk, mc_lst, sizeof(*mc_lst));
return -ENODEV;
}
@@ -202,6 +204,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
 
if (err) {
rcu_read_unlock();
+   rtnl_unlock();
sock_kfree_s(sk, mc_lst, sizeof(*mc_lst));
return err;
}
@@ -212,6 +215,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
spin_unlock(_sk_mc_lock);
 
rcu_read_unlock();
+   rtnl_unlock();
 
return 0;
 }
@@ -229,6 +233,7 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
if (!ipv6_addr_is_multicast(addr))
return -EINVAL;
 
+   rtnl_lock();
spin_lock(_sk_mc_lock);
for (lnk = >ipv6_mc_list;
 (mc_lst = rcu_dereference_protected(*lnk,
@@ -252,12 +257,15 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
} else
(void) ip6_mc_leave_src(sk, mc_lst, NULL);
rcu_read_unlock();
+   rtnl_unlock();
+
atomic_sub(sizeof(*mc_lst), >sk_omem_alloc);

Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)

2014-08-30 Thread Sabrina Dubroca

2014-08-29, 15:54:48 -0700, Cong Wang wrote:
> [...]
> 
> You are absolutely right here.
> 
> Can I have your Signed-off-by and Tested-by before sending the patch
> formally?
> 
> Thanks!

Sure:

Signed-off-by: Sabrina Dubroca 
Tested-by: Sabrina Dubroca 


Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)

2014-08-30 Thread Sabrina Dubroca

2014-08-29, 15:54:48 -0700, Cong Wang wrote:
 [...]
 
 You are absolutely right here.
 
 Can I have your Signed-off-by and Tested-by before sending the patch
 formally?
 
 Thanks!

Sure:

Signed-off-by: Sabrina Dubroca s...@queasysnail.net
Tested-by: Sabrina Dubroca s...@queasysnail.net


Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)

2014-08-30 Thread Sabrina Dubroca

Hello,

2014-08-30, 03:51:29 +0200, Hannes Frederic Sowa wrote:
 Hi Sabrina,
 
 [...]
 
 Sorry, just had time to look at this.
 
 The reason is not to have list corruption but that the calls down to
 ndo_set_rx_mode expect rtnl to be locked by the drivers. Filter lists
 are locked by addr_list_lock and that's why I think we never saw any
 problems with that, but drivers expect rtnl locked for those calls.
 
 But this problem also affects multicast join, so patch seems incomplete
 to me (and for that matter ssm multicast join, too).
 
 Also rtnl_lock and rcu_read_lock compose in that order, so we don't need
 to change dev_get_by_flags, but as this is the only user it sure is
 possible. RCU locked version is just easier composeable, so I wouldn't
 touch that if needed in future, just also take rcu lock as before.
 
 So just adding rtnl_lock add appropriate places seems to be ok to me,
 but still need to review parts of the ssm code.
 
 Also we should move ASSERT_RTNL checks from addrconf_join_solict to
 ipv6_dev_mc_inc/dec.
 
 Thanks,
 Hannes

Thanks for explaining.

I had a look at what you suggested.


So, for anycast, on top of the previous patch, we'd have:

---
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 210183244689..61dd3046b804 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
 static void aca_put(struct ifacaddr6 *ac)
@@ -233,6 +235,8 @@ int ipv6_dev_ac_inc(struct net_device *dev, const struct 
in6_addr *addr)
struct rt6_info *rt;
int err;
 
+   ASSERT_RTNL();
+
idev = in6_dev_get(dev);
 
if (idev == NULL)
@@ -302,6 +306,8 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct 
in6_addr *addr)
 {
struct ifacaddr6 *aca, *prev_aca;
 
+   ASSERT_RTNL();
+
write_lock_bh(idev-lock);
prev_aca = NULL;
for (aca = idev-ac_list; aca; aca = aca-aca_next) {
@@ -336,6 +342,8 @@ static int ipv6_dev_ac_dec(struct net_device *dev, const 
struct in6_addr *addr)
 {
struct inet6_dev *idev = __in6_dev_get(dev);
 
+   ASSERT_RTNL();
+
if (idev == NULL)
return -ENODEV;
return __ipv6_dev_ac_dec(idev, addr);

---


And for multicast:
 - locking order in the patch below: rtnl - rcu - ipv6_sk_mc_lock
 - ipv6_sock_mc_join: maybe move all the _unlock()'s together at the end of the 
function
 - do we need to modify rcu_dereference_protected in 
ipv6_sock_mc_drop/ipv6_sock_mc_close
 - I had a look at the other codepaths that call ipv6_dev_mc_inc/dec
   - ipv6_mc_destroy_dev, dev_forward_change, ipv6_add_dev,
 addrconf_join_solict -- all take rtnl or already have an
 ASSERT_RTNL()
   - pndisc_destructor, called from pneigh_ifdown/pneigh_delete
   - pndisc_constructor, called from pneigh_lookup -- pneigh_lookup
 has ASSERT_RTNL(), but pneigh_lookup is called from ip6_forward and
 ndisc_recv_na
   - (hope I didn't miss any callers)

As far as I could see, apart maybe from pndisc_constructor, it seems
okay, but I'd like to hear your comments.


Current modifications:

---
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 70881795da96..d73ac1ef65f2 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -172,6 +172,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
mc_lst-next = NULL;
mc_lst-addr = *addr;
 
+   rtnl_lock();
rcu_read_lock();
if (ifindex == 0) {
struct rt6_info *rt;
@@ -185,6 +186,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
 
if (dev == NULL) {
rcu_read_unlock();
+   rtnl_unlock();
sock_kfree_s(sk, mc_lst, sizeof(*mc_lst));
return -ENODEV;
}
@@ -202,6 +204,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
 
if (err) {
rcu_read_unlock();
+   rtnl_unlock();
sock_kfree_s(sk, mc_lst, sizeof(*mc_lst));
return err;
}
@@ -212,6 +215,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
spin_unlock(ipv6_sk_mc_lock);
 
rcu_read_unlock();
+   rtnl_unlock();
 
return 0;
 }
@@ -229,6 +233,7 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
if (!ipv6_addr_is_multicast(addr))
return -EINVAL;
 
+   rtnl_lock();
spin_lock(ipv6_sk_mc_lock);
for (lnk = np-ipv6_mc_list;
 (mc_lst = rcu_dereference_protected(*lnk,
@@ -252,12 +257,15 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
} else
(void) ip6_mc_leave_src(sk, mc_lst, NULL);
rcu_read_unlock();
+   rtnl_unlock();
+
atomic_sub(sizeof(*mc_lst), sk-sk_omem_alloc);
kfree_rcu(mc_lst,

Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)

2014-08-30 Thread Sabrina Dubroca

2014-08-30, 12:58:21 +0200, Sabrina Dubroca wrote:
- pndisc_constructor, called from pneigh_lookup -- pneigh_lookup
  has ASSERT_RTNL(), but pneigh_lookup is called from ip6_forward and
  ndisc_recv_na

Ah, these have creat = 0, so it's fine. I missed that earlier.

Sorry for the noise.

-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)

2014-08-29 Thread Sabrina Dubroca

2014-08-29, 11:14:48 -0700, Cong Wang wrote:
> On Fri, Aug 29, 2014 at 8:26 AM, Tommi Rantala  wrote:
> > [   77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699)
> > [   77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ 
> > #30
> > [   77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> > [   77.299789]  88003d76a618 880026133c50 8238ba79
> > 880037c84520
> > [   77.300829]  880026133c90 820bd52b 
> > 82d86c40
> > [   77.301869]   f76fd1e1 8800382d8000
> > 8800382d8220
> > [   77.302906] Call Trace:
> > [   77.303246]  [] dump_stack+0x4d/0x66
> > [   77.303928]  [] addrconf_join_solict+0x4b/0xb0
> > [   77.304731]  [] ipv6_dev_ac_inc+0x2bb/0x330
> > [   77.305498]  [] ? ac6_seq_start+0x260/0x260
> > [   77.306257]  [] ipv6_sock_ac_join+0x26e/0x360
> > [   77.307046]  [] ? ipv6_sock_ac_join+0x99/0x360
> > [   77.307798]  [] do_ipv6_setsockopt.isra.5+0xa70/0xf20
> 
> 
> I think we should just use rtnl_lock() instead of rcu_read_lock() there,
> it is not a hot path worth optimization.
> 
> Please try the attached patch.

note: it doesn't build as it is now, it needs:

-EXPORT_SYMBOL(dev_get_by_flags_rcu);
+EXPORT_SYMBOL(dev_get_by_flags);


I just tried your patch with a basic test program (open
socket/join/leave/close and open socket/join/close).

I think you need to modify ipv6_sock_ac_close as well, or you can still
trigger the assertion when closing the socket without leaving first.

Modified patch attached.


-- 
Sabrina
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 429801370d0c..1ae0e745b1b1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2077,8 +2077,8 @@ void __dev_remove_pack(struct packet_type *pt);
 void dev_add_offload(struct packet_offload *po);
 void dev_remove_offload(struct packet_offload *po);
 
-struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short flags,
-   unsigned short mask);
+struct net_device *dev_get_by_flags(struct net *net, unsigned short flags,
+   unsigned short mask);
 struct net_device *dev_get_by_name(struct net *net, const char *name);
 struct net_device *dev_get_by_name_rcu(struct net *net, const char *name);
 struct net_device *__dev_get_by_name(struct net *net, const char *name);
diff --git a/net/core/dev.c b/net/core/dev.c
index 443b814db05b..8fede6ef4a39 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -897,23 +897,24 @@ struct net_device *dev_getfirstbyhwtype(struct net *net, 
unsigned short type)
 EXPORT_SYMBOL(dev_getfirstbyhwtype);
 
 /**
- * dev_get_by_flags_rcu - find any device with given flags
+ * dev_get_by_flags - find any device with given flags
  * @net: the applicable net namespace
  * @if_flags: IFF_* values
  * @mask: bitmask of bits in if_flags to check
  *
  * Search for any interface with the given flags. Returns NULL if a device
  * is not found or a pointer to the device. Must be called inside
- * rcu_read_lock(), and result refcount is unchanged.
+ * rtnl_lock(), and result refcount is unchanged.
  */
 
-struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short 
if_flags,
+struct net_device *dev_get_by_flags(struct net *net, unsigned short if_flags,
unsigned short mask)
 {
struct net_device *dev, *ret;
 
+   ASSERT_RTNL();
ret = NULL;
-   for_each_netdev_rcu(net, dev) {
+   for_each_netdev(net, dev) {
if (((dev->flags ^ if_flags) & mask) == 0) {
ret = dev;
break;
@@ -921,7 +922,7 @@ struct net_device *dev_get_by_flags_rcu(struct net *net, 
unsigned short if_flags
}
return ret;
 }
-EXPORT_SYMBOL(dev_get_by_flags_rcu);
+EXPORT_SYMBOL(dev_get_by_flags);
 
 /**
  * dev_valid_name - check if name is okay for network device
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 210183244689..6de5caa26ea4 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -77,7 +77,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
pac->acl_next = NULL;
pac->acl_addr = *addr;
 
-   rcu_read_lock();
+   rtnl_lock();
if (ifindex == 0) {
struct rt6_info *rt;
 
@@ -90,11 +90,11 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
goto error;
} else {
/* router, no matching interface: just pick one */
-   dev = dev_get_by_flags_rcu(net, IFF_UP,
+   dev = dev_get_by_flags(net, IFF_UP,
   IFF_UP | IFF_LOOPBACK);
}
} else
-   dev = dev_get_by_index_rcu(net, ifindex);
+   dev =

Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)

2014-08-29 Thread Sabrina Dubroca

2014-08-29, 11:14:48 -0700, Cong Wang wrote:
 On Fri, Aug 29, 2014 at 8:26 AM, Tommi Rantala tt.rant...@gmail.com wrote:
  [   77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699)
  [   77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ 
  #30
  [   77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  [   77.299789]  88003d76a618 880026133c50 8238ba79
  880037c84520
  [   77.300829]  880026133c90 820bd52b 
  82d86c40
  [   77.301869]   f76fd1e1 8800382d8000
  8800382d8220
  [   77.302906] Call Trace:
  [   77.303246]  [8238ba79] dump_stack+0x4d/0x66
  [   77.303928]  [820bd52b] addrconf_join_solict+0x4b/0xb0
  [   77.304731]  [820b031b] ipv6_dev_ac_inc+0x2bb/0x330
  [   77.305498]  [820b0060] ? ac6_seq_start+0x260/0x260
  [   77.306257]  [820b05fe] ipv6_sock_ac_join+0x26e/0x360
  [   77.307046]  [820b0429] ? ipv6_sock_ac_join+0x99/0x360
  [   77.307798]  [820cdd60] do_ipv6_setsockopt.isra.5+0xa70/0xf20
 
 
 I think we should just use rtnl_lock() instead of rcu_read_lock() there,
 it is not a hot path worth optimization.
 
 Please try the attached patch.

note: it doesn't build as it is now, it needs:

-EXPORT_SYMBOL(dev_get_by_flags_rcu);
+EXPORT_SYMBOL(dev_get_by_flags);


I just tried your patch with a basic test program (open
socket/join/leave/close and open socket/join/close).

I think you need to modify ipv6_sock_ac_close as well, or you can still
trigger the assertion when closing the socket without leaving first.

Modified patch attached.


-- 
Sabrina
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 429801370d0c..1ae0e745b1b1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2077,8 +2077,8 @@ void __dev_remove_pack(struct packet_type *pt);
 void dev_add_offload(struct packet_offload *po);
 void dev_remove_offload(struct packet_offload *po);
 
-struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short flags,
-   unsigned short mask);
+struct net_device *dev_get_by_flags(struct net *net, unsigned short flags,
+   unsigned short mask);
 struct net_device *dev_get_by_name(struct net *net, const char *name);
 struct net_device *dev_get_by_name_rcu(struct net *net, const char *name);
 struct net_device *__dev_get_by_name(struct net *net, const char *name);
diff --git a/net/core/dev.c b/net/core/dev.c
index 443b814db05b..8fede6ef4a39 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -897,23 +897,24 @@ struct net_device *dev_getfirstbyhwtype(struct net *net, 
unsigned short type)
 EXPORT_SYMBOL(dev_getfirstbyhwtype);
 
 /**
- * dev_get_by_flags_rcu - find any device with given flags
+ * dev_get_by_flags - find any device with given flags
  * @net: the applicable net namespace
  * @if_flags: IFF_* values
  * @mask: bitmask of bits in if_flags to check
  *
  * Search for any interface with the given flags. Returns NULL if a device
  * is not found or a pointer to the device. Must be called inside
- * rcu_read_lock(), and result refcount is unchanged.
+ * rtnl_lock(), and result refcount is unchanged.
  */
 
-struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short 
if_flags,
+struct net_device *dev_get_by_flags(struct net *net, unsigned short if_flags,
unsigned short mask)
 {
struct net_device *dev, *ret;
 
+   ASSERT_RTNL();
ret = NULL;
-   for_each_netdev_rcu(net, dev) {
+   for_each_netdev(net, dev) {
if (((dev-flags ^ if_flags)  mask) == 0) {
ret = dev;
break;
@@ -921,7 +922,7 @@ struct net_device *dev_get_by_flags_rcu(struct net *net, 
unsigned short if_flags
}
return ret;
 }
-EXPORT_SYMBOL(dev_get_by_flags_rcu);
+EXPORT_SYMBOL(dev_get_by_flags);
 
 /**
  * dev_valid_name - check if name is okay for network device
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 210183244689..6de5caa26ea4 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -77,7 +77,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
pac-acl_next = NULL;
pac-acl_addr = *addr;
 
-   rcu_read_lock();
+   rtnl_lock();
if (ifindex == 0) {
struct rt6_info *rt;
 
@@ -90,11 +90,11 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const 
struct in6_addr *addr)
goto error;
} else {
/* router, no matching interface: just pick one */
-   dev = dev_get_by_flags_rcu(net, IFF_UP,
+   dev = dev_get_by_flags(net, IFF_UP,
   IFF_UP | IFF_LOOPBACK);
}
} else
-

BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1177

2014-08-27 Thread Sabrina Dubroca

Hello,

While fuzzing with trinity on next-20140827, I ran into this:

[ 2059.161014] BUG: sleeping function called from invalid context at 
arch/x86/mm/fault.c:1177
[ 2059.162968] in_atomic(): 0, irqs_disabled(): 1, pid: 3225, name: trinity-c0
[ 2059.163142] INFO: lockdep is turned off.
[ 2059.163142] irq event stamp: 0
[ 2059.163142] CPU: 0 PID: 3225 Comm: trinity-c0 Not tainted 
3.17.0-rc2-next-20140827 #112
[ 2059.163142] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.7.5-20140617_173321-var-lib-archbuild-testing-x86_64-tobias 04/01/2014
[ 2059.173190] Call Trace:
[ 2059.173190]  [] dump_stack+0x4e/0x7a
[ 2059.173190]  [] __might_sleep+0x182/0x2b0
[ 2059.173190]  [] __do_page_fault+0x114/0x680
[ 2059.173190]  [] ? __alloc_pages_nodemask+0x1eb/0xcf0
[ 2059.173190]  [] ? __radix_tree_preload+0x63/0xf0
[ 2059.173190]  [] trace_do_page_fault+0x45/0x270
[ 2059.173190]  [] do_async_page_fault+0x5b/0x90
[ 2059.173190]  [] async_page_fault+0x28/0x30
[ 2059.173190]  [] ? gup_pte_range+0xb2/0x170
[ 2059.173190]  [] gup_pud_range+0x138/0x210
[ 2059.173190]  [] get_user_pages_fast+0xba/0x1d0
[ 2059.173190]  [] ? __kmalloc+0x2e/0x3c0
[ 2059.173190]  [] iov_iter_get_pages_alloc+0xb2/0x1c0
[ 2059.173190]  [] nfs_direct_read_schedule_iovec+0xbc/0x2e0 
[nfs]
[ 2059.173190]  [] ? nfs_get_lock_context+0x4f/0x120 [nfs]
[ 2059.173190]  [] nfs_file_direct_read+0x1d6/0x2b0 [nfs]
[ 2059.173190]  [] ? do_sync_readv_writev+0x80/0x80
[ 2059.173190]  [] nfs_file_read+0x56/0x90 [nfs]
[ 2059.173190]  [] do_iter_readv_writev+0x62/0x90
[ 2059.173190]  [] compat_do_readv_writev+0xd7/0x260
[ 2059.173190]  [] ? nfs_file_release+0x30/0x30 [nfs]
[ 2059.173190]  [] ? trace_hardirqs_on+0xd/0x10
[ 2059.173190]  [] ? mutex_lock_nested+0x2e5/0x620
[ 2059.173190]  [] ? __fdget_pos+0x49/0x50
[ 2059.173190]  [] ? trace_hardirqs_on+0xd/0x10
[ 2059.173190]  [] ? perf_syscall_enter+0x1c/0x1d0
[ 2059.173190]  [] ? do_setitimer+0x137/0x2f0
[ 2059.173190]  [] compat_readv+0x53/0x70
[ 2059.173190]  [] compat_SyS_readv+0x49/0xb0
[ 2059.173190]  [] ia32_do_call+0x13/0x13
[ 2059.173190] BUG: unable to handle kernel NULL pointer dereference at 
0010
[ 2059.173190] IP: [] gup_pte_range+0xb2/0x170
[ 2059.173190] PGD 79409067 PUD 74b5c067 PMD 0 
[ 2059.173190] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 2059.173190] Modules linked in: sctp crc32c_generic libcrc32c ipx p8023 psnap 
p8022 llc auth_rpcgss nfsv4 9p netconsole e1000 cirrus syscopyarea sysfillrect 
ppdev sysimgblt drm_kms_helper ttm drm psmouse evdev microcode i2c_piix4 
parport_pc serio_raw parport intel_agp button intel_gtt processor pcspkr nfs 
lockd sunrpc ipv6 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic 
pata_acpi ata_piix 9pnet_virtio libata 9pnet scsi_mod
[ 2059.173190] CPU: 0 PID: 3225 Comm: trinity-c0 Not tainted 
3.17.0-rc2-next-20140827 #112
[ 2059.173190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.7.5-20140617_173321-var-lib-archbuild-testing-x86_64-tobias 04/01/2014
[ 2059.173190] task: 88007a5d4350 ti: 880004f28000 task.ti: 
880004f28000
[ 2059.173190] RIP: 0010:[]  [] 
gup_pte_range+0xb2/0x170
[ 2059.173190] RSP: :880004f2ba98  EFLAGS: 00010086
[ 2059.173190] RAX:  RBX: ea0001e2ffc0 RCX: 0207
[ 2059.173190] RDX: 4085a000 RSI: ea00 RDI: 800078bff067
[ 2059.173190] RBP: 880004f2bae8 R08: 0010 R09: 880004f2bb94
[ 2059.173190] R10: 40a0 R11: 8000 R12: 880004e8e2d8
[ 2059.173190] R13: 4085b000 R14: 0007 R15: 3000
[ 2059.173190] FS:  7f6133060700() GS:88007f60() 
knlGS:
[ 2059.173190] CS:  0010 DS:  ES:  CR0: 80050033
[ 2059.173190] CR2: 0010 CR3: 3ab44000 CR4: 001407f0
[ 2059.173190] Stack:
[ 2059.173190]  88007cc00618 88007cc00650 ea0001e4d6a0 
000100140010
[ 2059.173190]  000180140011 880004ebf020 40859fff 
4085a000
[ 2059.173190]  4085a000 880004f2bb94 880004f2bb58 
8106ede8
[ 2059.173190] Call Trace:
[ 2059.173190]  [] gup_pud_range+0x138/0x210
[ 2059.173190]  [] get_user_pages_fast+0xba/0x1d0
[ 2059.173190]  [] ? __kmalloc+0x2e/0x3c0
[ 2059.173190]  [] iov_iter_get_pages_alloc+0xb2/0x1c0
[ 2059.173190]  [] nfs_direct_read_schedule_iovec+0xbc/0x2e0 
[nfs]
[ 2059.173190]  [] ? nfs_get_lock_context+0x4f/0x120 [nfs]
[ 2059.173190]  [] nfs_file_direct_read+0x1d6/0x2b0 [nfs]
[ 2059.173190]  [] ? do_sync_readv_writev+0x80/0x80
[ 2059.173190]  [] nfs_file_read+0x56/0x90 [nfs]
[ 2059.173190]  [] do_iter_readv_writev+0x62/0x90
[ 2059.173190]  [] compat_do_readv_writev+0xd7/0x260
[ 2059.173190]  [] ? nfs_file_release+0x30/0x30 [nfs]
[ 2059.173190]  [] ? trace_hardirqs_on+0xd/0x10
[ 2059.173190]  [] ? mutex_lock_nested+0x2e5/0x620
[ 2059.173190]  [] ? __fdget_pos+0x49/0x50
[ 2059.173190]  [] ? trace_hardirqs_on+0xd/0x10
[ 2059.173190]  [] ?

BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1177

2014-08-27 Thread Sabrina Dubroca

Hello,

While fuzzing with trinity on next-20140827, I ran into this:

[ 2059.161014] BUG: sleeping function called from invalid context at 
arch/x86/mm/fault.c:1177
[ 2059.162968] in_atomic(): 0, irqs_disabled(): 1, pid: 3225, name: trinity-c0
[ 2059.163142] INFO: lockdep is turned off.
[ 2059.163142] irq event stamp: 0
[ 2059.163142] CPU: 0 PID: 3225 Comm: trinity-c0 Not tainted 
3.17.0-rc2-next-20140827 #112
[ 2059.163142] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.7.5-20140617_173321-var-lib-archbuild-testing-x86_64-tobias 04/01/2014
[ 2059.173190] Call Trace:
[ 2059.173190]  [815caba9] dump_stack+0x4e/0x7a
[ 2059.173190]  [810a5b52] __might_sleep+0x182/0x2b0
[ 2059.173190]  [81068a04] __do_page_fault+0x114/0x680
[ 2059.173190]  [81191bbb] ? __alloc_pages_nodemask+0x1eb/0xcf0
[ 2059.173190]  [81327523] ? __radix_tree_preload+0x63/0xf0
[ 2059.173190]  [81068ff5] trace_do_page_fault+0x45/0x270
[ 2059.173190]  [8106063b] do_async_page_fault+0x5b/0x90
[ 2059.173190]  [815d6d58] async_page_fault+0x28/0x30
[ 2059.173190]  [8106ebf2] ? gup_pte_range+0xb2/0x170
[ 2059.173190]  [8106ede8] gup_pud_range+0x138/0x210
[ 2059.173190]  [8106f11a] get_user_pages_fast+0xba/0x1d0
[ 2059.173190]  [811e95ae] ? __kmalloc+0x2e/0x3c0
[ 2059.173190]  [811b50e2] iov_iter_get_pages_alloc+0xb2/0x1c0
[ 2059.173190]  [a02f6dac] nfs_direct_read_schedule_iovec+0xbc/0x2e0 
[nfs]
[ 2059.173190]  [a02f1b5f] ? nfs_get_lock_context+0x4f/0x120 [nfs]
[ 2059.173190]  [a02f77a6] nfs_file_direct_read+0x1d6/0x2b0 [nfs]
[ 2059.173190]  [8120ada0] ? do_sync_readv_writev+0x80/0x80
[ 2059.173190]  [a02edbf6] nfs_file_read+0x56/0x90 [nfs]
[ 2059.173190]  [8120af62] do_iter_readv_writev+0x62/0x90
[ 2059.173190]  [8120b837] compat_do_readv_writev+0xd7/0x260
[ 2059.173190]  [a02edba0] ? nfs_file_release+0x30/0x30 [nfs]
[ 2059.173190]  [810c9b2d] ? trace_hardirqs_on+0xd/0x10
[ 2059.173190]  [815cff65] ? mutex_lock_nested+0x2e5/0x620
[ 2059.173190]  [8122d479] ? __fdget_pos+0x49/0x50
[ 2059.173190]  [810c9b2d] ? trace_hardirqs_on+0xd/0x10
[ 2059.173190]  [8116507c] ? perf_syscall_enter+0x1c/0x1d0
[ 2059.173190]  [810f98b7] ? do_setitimer+0x137/0x2f0
[ 2059.173190]  [8120ba13] compat_readv+0x53/0x70
[ 2059.173190]  [8120cba9] compat_SyS_readv+0x49/0xb0
[ 2059.173190]  [815d7989] ia32_do_call+0x13/0x13
[ 2059.173190] BUG: unable to handle kernel NULL pointer dereference at 
0010
[ 2059.173190] IP: [8106ebf2] gup_pte_range+0xb2/0x170
[ 2059.173190] PGD 79409067 PUD 74b5c067 PMD 0 
[ 2059.173190] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 2059.173190] Modules linked in: sctp crc32c_generic libcrc32c ipx p8023 psnap 
p8022 llc auth_rpcgss nfsv4 9p netconsole e1000 cirrus syscopyarea sysfillrect 
ppdev sysimgblt drm_kms_helper ttm drm psmouse evdev microcode i2c_piix4 
parport_pc serio_raw parport intel_agp button intel_gtt processor pcspkr nfs 
lockd sunrpc ipv6 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic 
pata_acpi ata_piix 9pnet_virtio libata 9pnet scsi_mod
[ 2059.173190] CPU: 0 PID: 3225 Comm: trinity-c0 Not tainted 
3.17.0-rc2-next-20140827 #112
[ 2059.173190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.7.5-20140617_173321-var-lib-archbuild-testing-x86_64-tobias 04/01/2014
[ 2059.173190] task: 88007a5d4350 ti: 880004f28000 task.ti: 
880004f28000
[ 2059.173190] RIP: 0010:[8106ebf2]  [8106ebf2] 
gup_pte_range+0xb2/0x170
[ 2059.173190] RSP: :880004f2ba98  EFLAGS: 00010086
[ 2059.173190] RAX:  RBX: ea0001e2ffc0 RCX: 0207
[ 2059.173190] RDX: 4085a000 RSI: ea00 RDI: 800078bff067
[ 2059.173190] RBP: 880004f2bae8 R08: 0010 R09: 880004f2bb94
[ 2059.173190] R10: 40a0 R11: 8000 R12: 880004e8e2d8
[ 2059.173190] R13: 4085b000 R14: 0007 R15: 3000
[ 2059.173190] FS:  7f6133060700() GS:88007f60() 
knlGS:
[ 2059.173190] CS:  0010 DS:  ES:  CR0: 80050033
[ 2059.173190] CR2: 0010 CR3: 3ab44000 CR4: 001407f0
[ 2059.173190] Stack:
[ 2059.173190]  88007cc00618 88007cc00650 ea0001e4d6a0 
000100140010
[ 2059.173190]  000180140011 880004ebf020 40859fff 
4085a000
[ 2059.173190]  4085a000 880004f2bb94 880004f2bb58 
8106ede8
[ 2059.173190] Call Trace:
[ 2059.173190]  [8106ede8] gup_pud_range+0x138/0x210
[ 2059.173190]  [8106f11a] get_user_pages_fast+0xba/0x1d0
[ 2059.173190]  [811e95ae] ? __kmalloc+0x2e/0x3c0
[ 2059.173190]  [811b50e2] iov_iter_get_pages_alloc+0xb2/0x1c0
[ 2059.173190]  [a02f6dac]

Re: [PATCH][input-led] Defer input led work to workqueue

2014-08-26 Thread Sabrina Dubroca

2014-08-26, 22:02:08 +0200, Sabrina Dubroca wrote:
> [...]
> 
> Tested-by: Sabrina Dubroca 

Forgot to mention, this applies to both versions of the patch.


-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH][input-led] Defer input led work to workqueue

2014-08-26 Thread Sabrina Dubroca

2014-08-26, 14:49:18 -0400, valdis.kletni...@vt.edu wrote:
> On Tue, 26 Aug 2014 03:54:53 +0200, Samuel Thibault said:
> > This changeset defers the second led_trigger_event call into a
> > workqueue, which avoids the nested locking altogether.  This does
> > not prevent the user from shooting himself in the foot by creating a
> > vt::capsl <-> vt-capsl loop, but the only consequence is the workqueue
> > threads eating some CPU until the user breaks the loop, which is not too
> > bad.
> >
> > Signed-off-by: Samuel Thibault 
> >
> > --- a/drivers/input/leds.c
> > +++ b/drivers/input/leds.c
> > @@ -100,13 +100,25 @@ static unsigned long vt_led_registered[B
> 
> I admit having zero understanding of the code, but I can confirm that
> next-20140825 still throws the lockdep whine I was seeing, but the same
> tree with this patch on top of it boots without warning. Soo...
> 
> Tested-By: Valdis Kletnieks 

Same for me.

Tested-by: Sabrina Dubroca 


Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] block: fix error handling in sg_io

2014-08-26 Thread Sabrina Dubroca

Before commit 2cada584b200 ("block: cleanup error handling in sg_io"),
we had ret = 0 before entering the last big if block of sg_io.

Since 2cada584b200, ret = -EFAULT, which breaks hdparm:

/dev/sda:
 setting Advanced Power Management level to 0xc8 (200)
 HDIO_DRIVE_CMD failed: Bad address
 APM_level  = 128

Signed-off-by: Sabrina Dubroca 
Fixes: 2cada584b200 ("block: cleanup error handling in sg_io")
---
 block/scsi_ioctl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 5dd477bfb4bc..9b8eaeca6a79 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -330,6 +330,7 @@ static int sg_io(struct request_queue *q, struct gendisk 
*bd_disk,
if (blk_fill_sghdr_rq(q, rq, hdr, mode))
goto out_free_cdb;
 
+   ret = 0;
if (hdr->iovec_count) {
size_t iov_data_len;
struct iovec *iov = NULL;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

linux-next: hdparm -B broken

2014-08-26 Thread Sabrina Dubroca


Hello,

Since next-20140823:

# hdparm -B 200 /dev/sda

/dev/sda:
 setting Advanced Power Management level to 0xc8 (200)
 HDIO_DRIVE_CMD failed: Bad address
 APM_level  = 128


It looks like before commit 2cada584b200 ("block: cleanup error
handling in sg_io"), we had ret = 0 before entering the last big if
block of sg_io, and now ret = -EFAULT.

The following patch seems to fix the problem:


diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 5dd477bfb4bc..9b8eaeca6a79 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -330,6 +330,7 @@ static int sg_io(struct request_queue *q, struct gendisk 
*bd_disk,
if (blk_fill_sghdr_rq(q, rq, hdr, mode))
goto out_free_cdb;
 
+   ret = 0;
if (hdr->iovec_count) {
size_t iov_data_len;
struct iovec *iov = NULL;



Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

linux-next: hdparm -B value broken

2014-08-26 Thread Sabrina Dubroca


Hello,

Since next-20140823:

# hdparm -B 200 /dev/sda

/dev/sda:
 setting Advanced Power Management level to 0xc8 (200)
 HDIO_DRIVE_CMD failed: Bad address
 APM_level  = 128


It looks like before commit 2cada584b200 (block: cleanup error
handling in sg_io), we had ret = 0 before entering the last big if
block of sg_io, and now ret = -EFAULT.

The following patch seems to fix the problem:


diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 5dd477bfb4bc..9b8eaeca6a79 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -330,6 +330,7 @@ static int sg_io(struct request_queue *q, struct gendisk 
*bd_disk,
if (blk_fill_sghdr_rq(q, rq, hdr, mode))
goto out_free_cdb;
 
+   ret = 0;
if (hdr-iovec_count) {
size_t iov_data_len;
struct iovec *iov = NULL;



Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] block: fix error handling in sg_io

2014-08-26 Thread Sabrina Dubroca

Before commit 2cada584b200 (block: cleanup error handling in sg_io),
we had ret = 0 before entering the last big if block of sg_io.

Since 2cada584b200, ret = -EFAULT, which breaks hdparm:

/dev/sda:
 setting Advanced Power Management level to 0xc8 (200)
 HDIO_DRIVE_CMD failed: Bad address
 APM_level  = 128

Signed-off-by: Sabrina Dubroca s...@queasysnail.net
Fixes: 2cada584b200 (block: cleanup error handling in sg_io)
---
 block/scsi_ioctl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 5dd477bfb4bc..9b8eaeca6a79 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -330,6 +330,7 @@ static int sg_io(struct request_queue *q, struct gendisk 
*bd_disk,
if (blk_fill_sghdr_rq(q, rq, hdr, mode))
goto out_free_cdb;
 
+   ret = 0;
if (hdr-iovec_count) {
size_t iov_data_len;
struct iovec *iov = NULL;
-- 
2.1.0

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH][input-led] Defer input led work to workqueue

2014-08-26 Thread Sabrina Dubroca

2014-08-26, 14:49:18 -0400, valdis.kletni...@vt.edu wrote:
 On Tue, 26 Aug 2014 03:54:53 +0200, Samuel Thibault said:
  This changeset defers the second led_trigger_event call into a
  workqueue, which avoids the nested locking altogether.  This does
  not prevent the user from shooting himself in the foot by creating a
  vt::capsl - vt-capsl loop, but the only consequence is the workqueue
  threads eating some CPU until the user breaks the loop, which is not too
  bad.
 
  Signed-off-by: Samuel Thibault samuel.thiba...@ens-lyon.org
 
  --- a/drivers/input/leds.c
  +++ b/drivers/input/leds.c
  @@ -100,13 +100,25 @@ static unsigned long vt_led_registered[B
 
 I admit having zero understanding of the code, but I can confirm that
 next-20140825 still throws the lockdep whine I was seeing, but the same
 tree with this patch on top of it boots without warning. Soo...
 
 Tested-By: Valdis Kletnieks valdis.kletni...@vt.edu

Same for me.

Tested-by: Sabrina Dubroca s...@queasysnail.net


Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH][input-led] Defer input led work to workqueue

2014-08-26 Thread Sabrina Dubroca

2014-08-26, 22:02:08 +0200, Sabrina Dubroca wrote:
 [...]
 
 Tested-by: Sabrina Dubroca s...@queasysnail.net

Forgot to mention, this applies to both versions of the patch.


-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 3.17-rc1: leds blink workqueue causes sleeping BUGs

2014-08-25 Thread Sabrina Dubroca

2014-08-19, 13:06:07 -0400, valdis.kletni...@vt.edu wrote:
> On Sat, 16 Aug 2014 20:27:01 -0700, Hugh Dickins said:
> > Can we safely revert your 8b37e1bef5a6 ("leds: convert blink timer to
> > workqueue"), or have there been other changes which now depend upon it?
> 
> I suspect there's something else busted.  I hand-reverted that patch, and I 
> *still*
> see the following lockdep whine that looks related (as it talks about
> leddev_list_lock).  next-0811 was OK, looks like next-0815 and -0818 had 
> this

I've seen this one in linux-next too. It seems to be caused by
946a4abbcfac ("input: route kbd LEDs through the generic LEDs layer")

I had a look at the code, led_trigger_event calls vt_led_set, which
calls led_trigger_event again.

CC'ing Samuel Thibault, the author of the commit.


> 
> [2.473044] iTCO_vendor_support: vendor-support=0
> [2.473122] device-mapper: uevent: version 1.0.3
> 
> [2.473145] =
> [2.473177] [ INFO: possible recursive locking detected ]
> [2.473204] 3.17.0-rc1-next-20140818-dirty #274 Not tainted
> [2.473231] -
> [2.473258] kworker/3:0/25 is trying to acquire lock:
> [2.473283]  (>leddev_list_lock){.+.?..}, at: [] 
> led_trigger_event+0x26/0x69
> [2.473341]
> but task is already holding lock:
> [2.473370]  (>leddev_list_lock){.+.?..}, at: [] 
> led_trigger_event+0x26/0x69
> [2.473425]
> other info that might help us debug this:
> [2.473456]  Possible unsafe locking scenario:
> 
> [2.473485]CPU0
> [2.473498]
> [2.473511]   lock(>leddev_list_lock);
> [2.473538]   lock(>leddev_list_lock);
> [2.473565]
>  *** DEADLOCK ***
> 
> [2.473594]  May be due to missing lock nesting notation
> 
> [2.473626] 11 locks held by kworker/3:0/25:
> [2.473648]  #0:  ("events_long"){.+.+.+}, at: [] 
> process_one_work+0x1e9/0x4ad
> [2.473704]  #1:  (serio_event_work){+.+.+.}, at: [] 
> process_one_work+0x1e9/0x4ad
> [2.473760]  #2:  (serio_mutex){+.+.+.}, at: [] 
> serio_handle_event+0x19/0x19c
> [2.473816]  #3:  (>mutex){..}, at: [] 
> __driver_attach+0x2f/0x7e
> [2.473869]  #4:  (>mutex){..}, at: [] 
> __driver_attach+0x47/0x7e
> [2.473922]  #5:  (>drv_mutex){+.+.+.}, at: [] 
> serio_connect_driver+0x21/0x42
> [2.473980]  #6:  (input_mutex){+.+.+.}, at: [] 
> input_register_device+0x2a9/0x381
> [2.474036]  #7:  (vt_led_registered_lock){+.+.+.}, at: 
> [] input_led_connect+0x49/0x1cd
> [2.474095]  #8:  (triggers_list_lock){.+}, at: [] 
> led_trigger_set_default+0x27/0x87
> [2.474154]  #9:  (_cdev->trigger_lock){+.+.+.}, at: 
> [] led_trigger_set_default+0x2f/0x87
> [2.474214]  #10:  (>leddev_list_lock){.+.?..}, at: 
> [] led_trigger_event+0x26/0x69
> [2.474274]
> stack backtrace:
> [2.474297] CPU: 3 PID: 25 Comm: kworker/3:0 Not tainted 
> 3.17.0-rc1-next-20140818-dirty #274
> [2.474338] Hardware name: Dell Inc. Latitude E6530/07Y85M, BIOS A15 
> 06/20/2014
> [2.474374] Workqueue: events_long serio_handle_event
> [2.474401]   880224ceb960 9368ba02 
> 945ff130
> [2.474447]  945ff130 880224ceba20 9307b730 
> 000a0246
> [2.474492]   945ff130 0003 
> 23605381dcae248d
> [2.474538] Call Trace:
> [2.474554]  [] dump_stack+0x51/0xaa
> [2.474581]  [] __lock_acquire+0xd22/0xed1
> [2.474611]  [] ? __lock_acquire+0x68c/0xed1
> [2.474641]  [] ? led_trigger_event+0x26/0x69
> [2.474671]  [] lock_acquire+0xdd/0x16a
> [2.474699]  [] ? lock_acquire+0xdd/0x16a
> [2.474727]  [] ? led_trigger_event+0x26/0x69
> [2.474758]  [] _raw_read_lock+0x30/0x3f
> [2.474786]  [] ? led_trigger_event+0x26/0x69
> [2.474816]  [] led_trigger_event+0x26/0x69
> [2.474846]  [] vt_led_set+0x32/0x34
> [2.474873]  [] __led_set_brightness+0x1b/0x1d
> [2.474904]  [] led_set_brightness+0x37/0x39
> [2.474934]  [] led_trigger_event+0x48/0x69
> [2.474965]  [] kbd_ledstate_trigger_activate+0x47/0x4f
> [2.474999]  [] led_trigger_set+0xfd/0x139
> [2.475028]  [] led_trigger_set_default+0x62/0x87
> [2.475061]  [] led_classdev_register+0x139/0x144
> [2.475093]  [] input_led_connect+0x8a/0x1cd
> [2.475123]  [] input_register_device+0x370/0x381
> [2.475154]  [] atkbd_connect+0x216/0x269
> [2.475182]  [] serio_connect_driver+0x2c/0x42
> [2.475213]  [] serio_driver_probe+0x1b/0x1d
> [2.476508]  [] driver_probe_device+0xda/0x203
> [2.477797]  [] __driver_attach+0x5c/0x7e
> [2.479081]  [] ? __device_attach+0x38/0x38
> [2.480328]  [] bus_for_each_dev+0x6a/0x82
> [2.481604]  [] driver_attach+0x19/0x1b
> [2.482874]  [] serio_handle_event+0x156/0x19c
> [2.483387] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> [2.485373]  [] process_one_work+0x28b/0x4ad
> [

Re: [PATCH v3 5/7] KVM: trace kvm_ple_window grow/shrink

2014-08-25 Thread Sabrina Dubroca

Hello,

2014-08-21, 18:08:09 +0200, Radim Krčmář wrote:
> Tracepoint for dynamic PLE window, fired on every potential change.
> 
> Signed-off-by: Radim Krčmář 
> ---
>  arch/x86/kvm/trace.h | 30 ++
>  arch/x86/kvm/vmx.c   | 10 --
>  arch/x86/kvm/x86.c   |  1 +
>  3 files changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> index e850a7d..1742dfb 100644
> --- a/arch/x86/kvm/trace.h
> +++ b/arch/x86/kvm/trace.h
> @@ -848,6 +848,36 @@ TRACE_EVENT(kvm_track_tsc,
> __print_symbolic(__entry->host_clock, host_clocks))
>  );
>  
> +TRACE_EVENT(kvm_ple_window,
> + TP_PROTO(bool grow, unsigned int vcpu_id, int new, int old),
> + TP_ARGS(grow, vcpu_id, new, old),
> +
> + TP_STRUCT__entry(
> + __field(bool,  grow )
> + __field(unsigned int,   vcpu_id )
> + __field( int,   new )
> + __field( int,   old )
> + ),
> +
> + TP_fast_assign(
> + __entry->grow   = grow;
> + __entry->vcpu_id= vcpu_id;
> + __entry->new= new;
> + __entry->old= old;
> + ),
> +
> + TP_printk("vcpu %u: ple_window %d (%s %d)",
> +   __entry->vcpu_id,
> +   __entry->new,
> +   __entry->grow ? "grow" : "shrink",
> +   __entry->old)
> +);
> +
> +#define trace_kvm_ple_window_grow(vcpu_id, new, old) \
> + trace_kvm_ple_window(true, vcpu_id, new, old)
> +#define trace_kvm_ple_window_shrink(vcpu_id, new, old) \
> + trace_kvm_ple_window(false, vcpu_id, new, old)
> +
>  #endif /* CONFIG_X86_64 */

Looks like these are needed on 32-bit as well.
Today's linux-next doesn't build:

  CC [M]  arch/x86/kvm/x86.o
In file included from include/linux/linkage.h:6:0,
 from include/linux/preempt.h:9,
 from include/linux/preempt_mask.h:4,
 from include/linux/hardirq.h:4,
 from include/linux/kvm_host.h:10,
 from arch/x86/kvm/x86.c:22:
include/linux/tracepoint.h:214:20: error: ‘__tracepoint_kvm_ple_window’ 
undeclared here (not in a function)
  EXPORT_SYMBOL_GPL(__tracepoint_##name)
^
include/linux/export.h:57:16: note: in definition of macro ‘__EXPORT_SYMBOL’
  extern typeof(sym) sym; \
^
include/linux/tracepoint.h:214:2: note: in expansion of macro 
‘EXPORT_SYMBOL_GPL’
  EXPORT_SYMBOL_GPL(__tracepoint_##name)
  ^
arch/x86/kvm/x86.c:7676:1: note: in expansion of macro 
‘EXPORT_TRACEPOINT_SYMBOL_GPL’
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_ple_window);
 ^


and if I comment out the EXPORT_TRACEPOINT_SYMBOL_GPL:

arch/x86/kvm/vmx.c: In function ‘grow_ple_window’:
arch/x86/kvm/vmx.c:5742:2: error: implicit declaration of function 
‘trace_kvm_ple_window_grow’ [-Werror=implicit-function-declaration]
  trace_kvm_ple_window_grow(vcpu->vcpu_id, vmx->ple_window, old);
  ^
arch/x86/kvm/vmx.c: In function ‘shrink_ple_window’:
arch/x86/kvm/vmx.c:5756:2: error: implicit declaration of function 
‘trace_kvm_ple_window_shrink’ [-Werror=implicit-function-declaration]
  trace_kvm_ple_window_shrink(vcpu->vcpu_id, vmx->ple_window, old);
  ^
cc1: some warnings being treated as errors
make[2]: *** [arch/x86/kvm/vmx.o] Error 1



I moved the line

#endif /* CONFIG_X86_64 */

above

TRACE_EVENT(kvm_ple_window,

and it builds.


Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v3 5/7] KVM: trace kvm_ple_window grow/shrink

2014-08-25 Thread Sabrina Dubroca

Hello,

2014-08-21, 18:08:09 +0200, Radim Krčmář wrote:
 Tracepoint for dynamic PLE window, fired on every potential change.
 
 Signed-off-by: Radim Krčmář rkrc...@redhat.com
 ---
  arch/x86/kvm/trace.h | 30 ++
  arch/x86/kvm/vmx.c   | 10 --
  arch/x86/kvm/x86.c   |  1 +
  3 files changed, 39 insertions(+), 2 deletions(-)
 
 diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
 index e850a7d..1742dfb 100644
 --- a/arch/x86/kvm/trace.h
 +++ b/arch/x86/kvm/trace.h
 @@ -848,6 +848,36 @@ TRACE_EVENT(kvm_track_tsc,
 __print_symbolic(__entry-host_clock, host_clocks))
  );
  
 +TRACE_EVENT(kvm_ple_window,
 + TP_PROTO(bool grow, unsigned int vcpu_id, int new, int old),
 + TP_ARGS(grow, vcpu_id, new, old),
 +
 + TP_STRUCT__entry(
 + __field(bool,  grow )
 + __field(unsigned int,   vcpu_id )
 + __field( int,   new )
 + __field( int,   old )
 + ),
 +
 + TP_fast_assign(
 + __entry-grow   = grow;
 + __entry-vcpu_id= vcpu_id;
 + __entry-new= new;
 + __entry-old= old;
 + ),
 +
 + TP_printk(vcpu %u: ple_window %d (%s %d),
 +   __entry-vcpu_id,
 +   __entry-new,
 +   __entry-grow ? grow : shrink,
 +   __entry-old)
 +);
 +
 +#define trace_kvm_ple_window_grow(vcpu_id, new, old) \
 + trace_kvm_ple_window(true, vcpu_id, new, old)
 +#define trace_kvm_ple_window_shrink(vcpu_id, new, old) \
 + trace_kvm_ple_window(false, vcpu_id, new, old)
 +
  #endif /* CONFIG_X86_64 */

Looks like these are needed on 32-bit as well.
Today's linux-next doesn't build:

  CC [M]  arch/x86/kvm/x86.o
In file included from include/linux/linkage.h:6:0,
 from include/linux/preempt.h:9,
 from include/linux/preempt_mask.h:4,
 from include/linux/hardirq.h:4,
 from include/linux/kvm_host.h:10,
 from arch/x86/kvm/x86.c:22:
include/linux/tracepoint.h:214:20: error: ‘__tracepoint_kvm_ple_window’ 
undeclared here (not in a function)
  EXPORT_SYMBOL_GPL(__tracepoint_##name)
^
include/linux/export.h:57:16: note: in definition of macro ‘__EXPORT_SYMBOL’
  extern typeof(sym) sym; \
^
include/linux/tracepoint.h:214:2: note: in expansion of macro 
‘EXPORT_SYMBOL_GPL’
  EXPORT_SYMBOL_GPL(__tracepoint_##name)
  ^
arch/x86/kvm/x86.c:7676:1: note: in expansion of macro 
‘EXPORT_TRACEPOINT_SYMBOL_GPL’
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_ple_window);
 ^


and if I comment out the EXPORT_TRACEPOINT_SYMBOL_GPL:

arch/x86/kvm/vmx.c: In function ‘grow_ple_window’:
arch/x86/kvm/vmx.c:5742:2: error: implicit declaration of function 
‘trace_kvm_ple_window_grow’ [-Werror=implicit-function-declaration]
  trace_kvm_ple_window_grow(vcpu-vcpu_id, vmx-ple_window, old);
  ^
arch/x86/kvm/vmx.c: In function ‘shrink_ple_window’:
arch/x86/kvm/vmx.c:5756:2: error: implicit declaration of function 
‘trace_kvm_ple_window_shrink’ [-Werror=implicit-function-declaration]
  trace_kvm_ple_window_shrink(vcpu-vcpu_id, vmx-ple_window, old);
  ^
cc1: some warnings being treated as errors
make[2]: *** [arch/x86/kvm/vmx.o] Error 1



I moved the line

#endif /* CONFIG_X86_64 */

above

TRACE_EVENT(kvm_ple_window,

and it builds.


Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 3.17-rc1: leds blink workqueue causes sleeping BUGs

2014-08-25 Thread Sabrina Dubroca

2014-08-19, 13:06:07 -0400, valdis.kletni...@vt.edu wrote:
 On Sat, 16 Aug 2014 20:27:01 -0700, Hugh Dickins said:
  Can we safely revert your 8b37e1bef5a6 (leds: convert blink timer to
  workqueue), or have there been other changes which now depend upon it?
 
 I suspect there's something else busted.  I hand-reverted that patch, and I 
 *still*
 see the following lockdep whine that looks related (as it talks about
 leddev_list_lock).  next-0811 was OK, looks like next-0815 and -0818 had 
 this

I've seen this one in linux-next too. It seems to be caused by
946a4abbcfac (input: route kbd LEDs through the generic LEDs layer)

I had a look at the code, led_trigger_event calls vt_led_set, which
calls led_trigger_event again.

CC'ing Samuel Thibault, the author of the commit.


 
 [2.473044] iTCO_vendor_support: vendor-support=0
 [2.473122] device-mapper: uevent: version 1.0.3
 
 [2.473145] =
 [2.473177] [ INFO: possible recursive locking detected ]
 [2.473204] 3.17.0-rc1-next-20140818-dirty #274 Not tainted
 [2.473231] -
 [2.473258] kworker/3:0/25 is trying to acquire lock:
 [2.473283]  (trig-leddev_list_lock){.+.?..}, at: [9344da1b] 
 led_trigger_event+0x26/0x69
 [2.473341]
 but task is already holding lock:
 [2.473370]  (trig-leddev_list_lock){.+.?..}, at: [9344da1b] 
 led_trigger_event+0x26/0x69
 [2.473425]
 other info that might help us debug this:
 [2.473456]  Possible unsafe locking scenario:
 
 [2.473485]CPU0
 [2.473498]
 [2.473511]   lock(trig-leddev_list_lock);
 [2.473538]   lock(trig-leddev_list_lock);
 [2.473565]
  *** DEADLOCK ***
 
 [2.473594]  May be due to missing lock nesting notation
 
 [2.473626] 11 locks held by kworker/3:0/25:
 [2.473648]  #0:  (events_long){.+.+.+}, at: [93054594] 
 process_one_work+0x1e9/0x4ad
 [2.473704]  #1:  (serio_event_work){+.+.+.}, at: [93054594] 
 process_one_work+0x1e9/0x4ad
 [2.473760]  #2:  (serio_mutex){+.+.+.}, at: [933dd7f0] 
 serio_handle_event+0x19/0x19c
 [2.473816]  #3:  (dev-mutex){..}, at: [9332cec3] 
 __driver_attach+0x2f/0x7e
 [2.473869]  #4:  (dev-mutex){..}, at: [9332cedb] 
 __driver_attach+0x47/0x7e
 [2.473922]  #5:  (serio-drv_mutex){+.+.+.}, at: [933dca77] 
 serio_connect_driver+0x21/0x42
 [2.473980]  #6:  (input_mutex){+.+.+.}, at: [933e183a] 
 input_register_device+0x2a9/0x381
 [2.474036]  #7:  (vt_led_registered_lock){+.+.+.}, at: 
 [933e3846] input_led_connect+0x49/0x1cd
 [2.474095]  #8:  (triggers_list_lock){.+}, at: [9344dd91] 
 led_trigger_set_default+0x27/0x87
 [2.474154]  #9:  (led_cdev-trigger_lock){+.+.+.}, at: 
 [9344dd99] led_trigger_set_default+0x2f/0x87
 [2.474214]  #10:  (trig-leddev_list_lock){.+.?..}, at: 
 [9344da1b] led_trigger_event+0x26/0x69
 [2.474274]
 stack backtrace:
 [2.474297] CPU: 3 PID: 25 Comm: kworker/3:0 Not tainted 
 3.17.0-rc1-next-20140818-dirty #274
 [2.474338] Hardware name: Dell Inc. Latitude E6530/07Y85M, BIOS A15 
 06/20/2014
 [2.474374] Workqueue: events_long serio_handle_event
 [2.474401]   880224ceb960 9368ba02 
 945ff130
 [2.474447]  945ff130 880224ceba20 9307b730 
 000a0246
 [2.474492]   945ff130 0003 
 23605381dcae248d
 [2.474538] Call Trace:
 [2.474554]  [9368ba02] dump_stack+0x51/0xaa
 [2.474581]  [9307b730] __lock_acquire+0xd22/0xed1
 [2.474611]  [9307b09a] ? __lock_acquire+0x68c/0xed1
 [2.474641]  [9344da1b] ? led_trigger_event+0x26/0x69
 [2.474671]  [9307bc64] lock_acquire+0xdd/0x16a
 [2.474699]  [9307bc64] ? lock_acquire+0xdd/0x16a
 [2.474727]  [9344da1b] ? led_trigger_event+0x26/0x69
 [2.474758]  [93696391] _raw_read_lock+0x30/0x3f
 [2.474786]  [9344da1b] ? led_trigger_event+0x26/0x69
 [2.474816]  [9344da1b] led_trigger_event+0x26/0x69
 [2.474846]  [933e3751] vt_led_set+0x32/0x34
 [2.474873]  [9344d312] __led_set_brightness+0x1b/0x1d
 [2.474904]  [9344d4a8] led_set_brightness+0x37/0x39
 [2.474934]  [9344da3d] led_trigger_event+0x48/0x69
 [2.474965]  [9330b7fb] kbd_ledstate_trigger_activate+0x47/0x4f
 [2.474999]  [9344dbda] led_trigger_set+0xfd/0x139
 [2.475028]  [9344ddcc] led_trigger_set_default+0x62/0x87
 [2.475061]  [9344d87e] led_classdev_register+0x139/0x144
 [2.475093]  [933e3887] input_led_connect+0x8a/0x1cd
 [2.475123]  [933e1901] input_register_device+0x370/0x381
 [2.475154]  [933e848f] atkbd_connect+0x216/0x269
 [2.475182]  [933dca82]

Re: [PATCH] gpio / ACPI: Don't crash on NULL chip->dev

2014-03-31 Thread Sabrina Dubroca

2014-03-31, 15:16:49 +0300, Mika Westerberg wrote:
> Commit aa92b6f689ac (gpio / ACPI: Allocate ACPI specific data directly in
> acpi_gpiochip_add()) moved ACPI handle checking to acpi_gpiochip_add() but
> forgot to check whether chip->dev is NULL before dereferencing it.
> 
> Since chip->dev pointer is optional we can end up with crash like following:
> 
>  BUG: unable to handle kernel NULL pointer dereference at 0138
>  IP: [] acpi_gpiochip_add+0x13/0x190
>  *pde = 
>  Oops:  [#1] PREEMPT SMP
>  Modules linked in: ssb(+) ...
>  CPU: 0 PID: 512 Comm: modprobe Tainted: GW 
> 3.14.0-rc7-next-20140324-t1 #24
>  Hardware name: Dell Inc. Latitude D830   /0UY141, BIOS A02 
> 06/07/2007
>  task: f5799900 ti: f543e000 task.ti: f543e000
>  EIP: 0060:[] EFLAGS: 00010282 CPU: 0
>  EIP is at acpi_gpiochip_add+0x13/0x190
>  EAX:  EBX: f57824c4 ECX:  EDX: 
>  ESI: f57824c4 EDI: 0010 EBP: f543fc54 ESP: f543fc40
>   DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
>  CR0: 8005003b CR2: 0138 CR3: 355f8000 CR4: 07d0
>  Stack:
>   f543fc5c fd1f7790 f57824c4 00be 0010 f543fc84 c1269f4e f543fc74
>   fd1f78bd 8002 f57822b0 f5782090 fd1f8400 0286 fd1f9994 
>   f5782000 f543fc8c fd1f7e39 f543fcc8 fd1f0bd8 00c0  
>  Call Trace:
>   [] ? ssb_pcie_mdio_write+0xa0/0xd0 [ssb]
>   [] gpiochip_add+0xee/0x300
>   [] ? ssb_pcicore_serdes_workaround+0xfd/0x140 [ssb]
>   [] ssb_gpio_init+0x89/0xa0 [ssb]
>   [] ssb_attach_queued_buses+0xc8/0x2d0 [ssb]
>   [] ssb_bus_register+0x185/0x1f0 [ssb]
>   [] ? ssb_pci_xtal+0x220/0x220 [ssb]
>   [] ssb_bus_pcibus_register+0x2c/0x80 [ssb]
>   [] ssb_pcihost_probe+0x9c/0x110 [ssb]
>   [] pci_device_probe+0x6f/0xc0
>   [] ? sysfs_create_link+0x25/0x40
>   [] driver_probe_device+0x79/0x360
>   [] ? pci_match_device+0xb2/0xc0
>   [] __driver_attach+0x71/0x80
>   [] ? __device_attach+0x40/0x40
>   [] bus_for_each_dev+0x47/0x80
>   [] driver_attach+0x1e/0x20
>   [] ? __device_attach+0x40/0x40
>   [] bus_add_driver+0x157/0x230
>   [] driver_register+0x59/0xe0
>   ...
> 
> Fix this by checking chip->dev pointer against NULL first. Also we can now
> remove redundant check in acpi_gpiochip_request/free_interrupts().
> 
> Reported-by: Sabrina Dubroca 
> Signed-off-by: Mika Westerberg 
> ---
> Sabrina,
> 
> Can you please re-test this and provide your tested-by? I changed the patch
> a bit to remove redundant checks. Just to be sure that I don't accidentally
> break something.
> 
> Thanks.

Everything looks good.

Tested-by: Sabrina Dubroca 


Thanks,
-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] gpio / ACPI: Don't crash on NULL chip-dev

2014-03-31 Thread Sabrina Dubroca

2014-03-31, 15:16:49 +0300, Mika Westerberg wrote:
 Commit aa92b6f689ac (gpio / ACPI: Allocate ACPI specific data directly in
 acpi_gpiochip_add()) moved ACPI handle checking to acpi_gpiochip_add() but
 forgot to check whether chip-dev is NULL before dereferencing it.
 
 Since chip-dev pointer is optional we can end up with crash like following:
 
  BUG: unable to handle kernel NULL pointer dereference at 0138
  IP: [c126c2b3] acpi_gpiochip_add+0x13/0x190
  *pde = 
  Oops:  [#1] PREEMPT SMP
  Modules linked in: ssb(+) ...
  CPU: 0 PID: 512 Comm: modprobe Tainted: GW 
 3.14.0-rc7-next-20140324-t1 #24
  Hardware name: Dell Inc. Latitude D830   /0UY141, BIOS A02 
 06/07/2007
  task: f5799900 ti: f543e000 task.ti: f543e000
  EIP: 0060:[c126c2b3] EFLAGS: 00010282 CPU: 0
  EIP is at acpi_gpiochip_add+0x13/0x190
  EAX:  EBX: f57824c4 ECX:  EDX: 
  ESI: f57824c4 EDI: 0010 EBP: f543fc54 ESP: f543fc40
   DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
  CR0: 8005003b CR2: 0138 CR3: 355f8000 CR4: 07d0
  Stack:
   f543fc5c fd1f7790 f57824c4 00be 0010 f543fc84 c1269f4e f543fc74
   fd1f78bd 8002 f57822b0 f5782090 fd1f8400 0286 fd1f9994 
   f5782000 f543fc8c fd1f7e39 f543fcc8 fd1f0bd8 00c0  
  Call Trace:
   [fd1f7790] ? ssb_pcie_mdio_write+0xa0/0xd0 [ssb]
   [c1269f4e] gpiochip_add+0xee/0x300
   [fd1f78bd] ? ssb_pcicore_serdes_workaround+0xfd/0x140 [ssb]
   [fd1f7e39] ssb_gpio_init+0x89/0xa0 [ssb]
   [fd1f0bd8] ssb_attach_queued_buses+0xc8/0x2d0 [ssb]
   [fd1f0f65] ssb_bus_register+0x185/0x1f0 [ssb]
   [fd1f3120] ? ssb_pci_xtal+0x220/0x220 [ssb]
   [fd1f106c] ssb_bus_pcibus_register+0x2c/0x80 [ssb]
   [fd1f40dc] ssb_pcihost_probe+0x9c/0x110 [ssb]
   [c1276c8f] pci_device_probe+0x6f/0xc0
   [c11bdb55] ? sysfs_create_link+0x25/0x40
   [c131d8b9] driver_probe_device+0x79/0x360
   [c1276512] ? pci_match_device+0xb2/0xc0
   [c131dc51] __driver_attach+0x71/0x80
   [c131dbe0] ? __device_attach+0x40/0x40
   [c131bd87] bus_for_each_dev+0x47/0x80
   [c131d3ae] driver_attach+0x1e/0x20
   [c131dbe0] ? __device_attach+0x40/0x40
   [c131d007] bus_add_driver+0x157/0x230
   [c131e219] driver_register+0x59/0xe0
   ...
 
 Fix this by checking chip-dev pointer against NULL first. Also we can now
 remove redundant check in acpi_gpiochip_request/free_interrupts().
 
 Reported-by: Sabrina Dubroca s...@queasysnail.net
 Signed-off-by: Mika Westerberg mika.westerb...@linux.intel.com
 ---
 Sabrina,
 
 Can you please re-test this and provide your tested-by? I changed the patch
 a bit to remove redundant checks. Just to be sure that I don't accidentally
 break something.
 
 Thanks.

Everything looks good.

Tested-by: Sabrina Dubroca s...@queasysnail.net


Thanks,
-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [bisected] NULL pointer dereference in acpi_gpiochip_add (on modprobe ssb)

2014-03-25 Thread Sabrina Dubroca

2014-03-25, 09:25:30 +0200, Mika Westerberg wrote:
> On Mon, Mar 24, 2014 at 07:31:11PM +0100, Sabrina Dubroca wrote:
> > > Actually gpiolib seems to handle ->dev as optional. Can you try this patch
> > > instead? Thanks.
> > > 
> > > diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c
> > > index bf0f8b476696..642b2bf3360e 100644
> > > --- a/drivers/gpio/gpiolib-acpi.c
> > > +++ b/drivers/gpio/gpiolib-acpi.c
> > > @@ -501,6 +501,9 @@ void acpi_gpiochip_add(struct gpio_chip *chip)
> > >   acpi_handle handle;
> > >   acpi_status status;
> > >  
> > > + if (!chip || !chip->dev)
> > > + return;
> > > +
> > >   handle = ACPI_HANDLE(chip->dev);
> > >   if (!handle)
> > >   return;
> > > @@ -531,6 +534,9 @@ void acpi_gpiochip_remove(struct gpio_chip *chip)
> > >   acpi_handle handle;
> > >   acpi_status status;
> > >  
> > > + if (!chip || !chip->dev)
> > > + return;
> > > +
> > >   handle = ACPI_HANDLE(chip->dev);
> > >   if (!handle)
> > >   return;
> > 
> > Thanks, this patch solves the problem.
> 
> Great thanks for testing. Can I add your tested-by to the patch?
> 
> I'll submit a formal patch for this next week as I'm currently on vacation.

Sure:

Tested-by: Sabrina Dubroca 


Thanks again,

-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [bisected] NULL pointer dereference in acpi_gpiochip_add (on modprobe ssb)

2014-03-25 Thread Sabrina Dubroca

2014-03-25, 09:25:30 +0200, Mika Westerberg wrote:
 On Mon, Mar 24, 2014 at 07:31:11PM +0100, Sabrina Dubroca wrote:
   Actually gpiolib seems to handle -dev as optional. Can you try this patch
   instead? Thanks.
   
   diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c
   index bf0f8b476696..642b2bf3360e 100644
   --- a/drivers/gpio/gpiolib-acpi.c
   +++ b/drivers/gpio/gpiolib-acpi.c
   @@ -501,6 +501,9 @@ void acpi_gpiochip_add(struct gpio_chip *chip)
 acpi_handle handle;
 acpi_status status;

   + if (!chip || !chip-dev)
   + return;
   +
 handle = ACPI_HANDLE(chip-dev);
 if (!handle)
 return;
   @@ -531,6 +534,9 @@ void acpi_gpiochip_remove(struct gpio_chip *chip)
 acpi_handle handle;
 acpi_status status;

   + if (!chip || !chip-dev)
   + return;
   +
 handle = ACPI_HANDLE(chip-dev);
 if (!handle)
 return;
  
  Thanks, this patch solves the problem.
 
 Great thanks for testing. Can I add your tested-by to the patch?
 
 I'll submit a formal patch for this next week as I'm currently on vacation.

Sure:

Tested-by: Sabrina Dubroca s...@queasysnail.net


Thanks again,

-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [bisected] NULL pointer dereference in acpi_gpiochip_add (on modprobe ssb)

2014-03-24 Thread Sabrina Dubroca

2014-03-24, 20:11:12 +0200, Mika Westerberg wrote:
> On Mon, Mar 24, 2014 at 08:00:33PM +0200, Mika Westerberg wrote:
> > On Mon, Mar 24, 2014 at 04:49:44PM +0100, Sabrina Dubroca wrote:
> > > Hi,
> > > 
> > > With next-20140324, I get the BUG below when I modprobe ssb.
> > > I bisected it to aa92b6f689ac
> > > "gpio / ACPI: Allocate ACPI specific data directly in acpi_gpiochip_add()"
> > > 
> > > The device that needs ssb is:
> > > 0c:00.0 Network controller [0280]: Broadcom Corporation BCM4321 
> > > 802.11a/b/g/n [14e4:4328] (rev 03)
> > > Subsystem: Dell Wireless 1500 Draft 802.11n WLAN Mini-card 
> > > [1028:000a]
> > > Kernel driver in use: b43-pci-bridge
> > > Kernel modules: ssb
> > > 
> > > 
> > > [   92.693606] ssb: Found chip with id 0x4321, rev 0x03 and package 0x00
> > > [   92.693649] ssb: Core 0 found: ChipCommon (cc 0x800, rev 0x13, vendor 
> > > 0x4243)
> > > [   92.693675] ssb: Core 1 found: IEEE 802.11 (cc 0x812, rev 0x0C, vendor 
> > > 0x4243)
> > > [   92.693699] ssb: Core 2 found: PCI-E (cc 0x820, rev 0x04, vendor 
> > > 0x4243)
> > > [   92.693723] ssb: Core 3 found: PCI (cc 0x804, rev 0x0D, vendor 0x4243)
> > > [   92.693746] ssb: Core 4 found: USB 1.1 Host (cc 0x817, rev 0x04, 
> > > vendor 0x4243)
> > > [   92.753554] BUG: unable to handle kernel NULL pointer dereference at 
> > > 0138
> > > [   92.753760] IP: [] acpi_gpiochip_add+0x13/0x190
> > > [   92.753901] *pde =  
> > > [   92.753986] Oops:  [#1] PREEMPT SMP 
> > > [   92.754125] Modules linked in: ssb(+) mmc_core netconsole nouveau 
> > > mxm_wmi i2c_algo_bit drm_kms_helper ttm drm joydev mousedev tg3 coretemp 
> > > kvm_intel ptp pcmcia kvm pps_core libphy dell_laptop gpio_ich rfkill 
> > > yenta_socket pcmcia_rsrc intel_agp intel_gtt iTCO_wdt iTCO_vendor_support 
> > > dell_wmi sparse_keymap pcmcia_core evdev agpgart dcdbas snd_hda_codec_idt 
> > > snd_hda_codec_generic microcode psmouse pcspkr i2c_i801 i2c_core 
> > > serio_raw lpc_ich mfd_core acpi_cpufreq ac battery thermal button wmi 
> > > snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm video 
> > > snd_timer shpchp processor snd soundcore nfs lockd sunrpc ext4 crc16 
> > > mbcache jbd2 sd_mod sr_mod cdrom ata_generic pata_acpi ata_piix libata 
> > > scsi_mod firewire_ohci firewire_core crc_itu_t uhci_hcd ehci_pci ehci_hcd 
> > > usbcore usb_common
> > > [   92.756833] CPU: 0 PID: 512 Comm: modprobe Tainted: GW 
> > > 3.14.0-rc7-next-20140324-t1 #24
> > > [   92.756833] Hardware name: Dell Inc. Latitude D830   
> > > /0UY141, BIOS A02 06/07/2007
> > > [   92.756833] task: f5799900 ti: f543e000 task.ti: f543e000
> > > [   92.756833] EIP: 0060:[] EFLAGS: 00010282 CPU: 0
> > > [   92.756833] EIP is at acpi_gpiochip_add+0x13/0x190
> > > [   92.756833] EAX:  EBX: f57824c4 ECX:  EDX: 
> > > [   92.756833] ESI: f57824c4 EDI: 0010 EBP: f543fc54 ESP: f543fc40
> > > [   92.756833]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > > [   92.756833] CR0: 8005003b CR2: 0138 CR3: 355f8000 CR4: 07d0
> > 
> > To me looks like chip->dev is NULL. My understanding is that the GPIO core
> > wants to have it non-NULL.
> 
> Actually gpiolib seems to handle ->dev as optional. Can you try this patch
> instead? Thanks.
> 
> diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c
> index bf0f8b476696..642b2bf3360e 100644
> --- a/drivers/gpio/gpiolib-acpi.c
> +++ b/drivers/gpio/gpiolib-acpi.c
> @@ -501,6 +501,9 @@ void acpi_gpiochip_add(struct gpio_chip *chip)
>   acpi_handle handle;
>   acpi_status status;
>  
> + if (!chip || !chip->dev)
> + return;
> +
>   handle = ACPI_HANDLE(chip->dev);
>   if (!handle)
>   return;
> @@ -531,6 +534,9 @@ void acpi_gpiochip_remove(struct gpio_chip *chip)
>   acpi_handle handle;
>   acpi_status status;
>  
> + if (!chip || !chip->dev)
> + return;
> +
>   handle = ACPI_HANDLE(chip->dev);
>   if (!handle)
>   return;

Thanks, this patch solves the problem.

-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [bisected] NULL pointer dereference in acpi_gpiochip_add (on modprobe ssb)

2014-03-24 Thread Sabrina Dubroca

2014-03-24, 20:11:12 +0200, Mika Westerberg wrote:
 On Mon, Mar 24, 2014 at 08:00:33PM +0200, Mika Westerberg wrote:
  On Mon, Mar 24, 2014 at 04:49:44PM +0100, Sabrina Dubroca wrote:
   Hi,
   
   With next-20140324, I get the BUG below when I modprobe ssb.
   I bisected it to aa92b6f689ac
   gpio / ACPI: Allocate ACPI specific data directly in acpi_gpiochip_add()
   
   The device that needs ssb is:
   0c:00.0 Network controller [0280]: Broadcom Corporation BCM4321 
   802.11a/b/g/n [14e4:4328] (rev 03)
   Subsystem: Dell Wireless 1500 Draft 802.11n WLAN Mini-card 
   [1028:000a]
   Kernel driver in use: b43-pci-bridge
   Kernel modules: ssb
   
   
   [   92.693606] ssb: Found chip with id 0x4321, rev 0x03 and package 0x00
   [   92.693649] ssb: Core 0 found: ChipCommon (cc 0x800, rev 0x13, vendor 
   0x4243)
   [   92.693675] ssb: Core 1 found: IEEE 802.11 (cc 0x812, rev 0x0C, vendor 
   0x4243)
   [   92.693699] ssb: Core 2 found: PCI-E (cc 0x820, rev 0x04, vendor 
   0x4243)
   [   92.693723] ssb: Core 3 found: PCI (cc 0x804, rev 0x0D, vendor 0x4243)
   [   92.693746] ssb: Core 4 found: USB 1.1 Host (cc 0x817, rev 0x04, 
   vendor 0x4243)
   [   92.753554] BUG: unable to handle kernel NULL pointer dereference at 
   0138
   [   92.753760] IP: [c126c2b3] acpi_gpiochip_add+0x13/0x190
   [   92.753901] *pde =  
   [   92.753986] Oops:  [#1] PREEMPT SMP 
   [   92.754125] Modules linked in: ssb(+) mmc_core netconsole nouveau 
   mxm_wmi i2c_algo_bit drm_kms_helper ttm drm joydev mousedev tg3 coretemp 
   kvm_intel ptp pcmcia kvm pps_core libphy dell_laptop gpio_ich rfkill 
   yenta_socket pcmcia_rsrc intel_agp intel_gtt iTCO_wdt iTCO_vendor_support 
   dell_wmi sparse_keymap pcmcia_core evdev agpgart dcdbas snd_hda_codec_idt 
   snd_hda_codec_generic microcode psmouse pcspkr i2c_i801 i2c_core 
   serio_raw lpc_ich mfd_core acpi_cpufreq ac battery thermal button wmi 
   snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm video 
   snd_timer shpchp processor snd soundcore nfs lockd sunrpc ext4 crc16 
   mbcache jbd2 sd_mod sr_mod cdrom ata_generic pata_acpi ata_piix libata 
   scsi_mod firewire_ohci firewire_core crc_itu_t uhci_hcd ehci_pci ehci_hcd 
   usbcore usb_common
   [   92.756833] CPU: 0 PID: 512 Comm: modprobe Tainted: GW 
   3.14.0-rc7-next-20140324-t1 #24
   [   92.756833] Hardware name: Dell Inc. Latitude D830   
   /0UY141, BIOS A02 06/07/2007
   [   92.756833] task: f5799900 ti: f543e000 task.ti: f543e000
   [   92.756833] EIP: 0060:[c126c2b3] EFLAGS: 00010282 CPU: 0
   [   92.756833] EIP is at acpi_gpiochip_add+0x13/0x190
   [   92.756833] EAX:  EBX: f57824c4 ECX:  EDX: 
   [   92.756833] ESI: f57824c4 EDI: 0010 EBP: f543fc54 ESP: f543fc40
   [   92.756833]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
   [   92.756833] CR0: 8005003b CR2: 0138 CR3: 355f8000 CR4: 07d0
  
  To me looks like chip-dev is NULL. My understanding is that the GPIO core
  wants to have it non-NULL.
 
 Actually gpiolib seems to handle -dev as optional. Can you try this patch
 instead? Thanks.
 
 diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c
 index bf0f8b476696..642b2bf3360e 100644
 --- a/drivers/gpio/gpiolib-acpi.c
 +++ b/drivers/gpio/gpiolib-acpi.c
 @@ -501,6 +501,9 @@ void acpi_gpiochip_add(struct gpio_chip *chip)
   acpi_handle handle;
   acpi_status status;
  
 + if (!chip || !chip-dev)
 + return;
 +
   handle = ACPI_HANDLE(chip-dev);
   if (!handle)
   return;
 @@ -531,6 +534,9 @@ void acpi_gpiochip_remove(struct gpio_chip *chip)
   acpi_handle handle;
   acpi_status status;
  
 + if (!chip || !chip-dev)
 + return;
 +
   handle = ACPI_HANDLE(chip-dev);
   if (!handle)
   return;

Thanks, this patch solves the problem.

-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 1/2] ACPICA: Dispatcher: Ignore SyncLevel for auto-serialization mechanism.

2014-03-18 Thread Sabrina Dubroca

Hi,

2014-03-18, 00:50:04 +, Zheng, Lv wrote:
> Hi, Sabrina and Valdis
> 
> Could you give this patch a try to see if this patch is working?
> 
> Thanks and best regards
> -Lv

I applied the two patches. The laptop boots and there is no ACPI
error in the logs, both with auto-serialization enabled and disabled.

Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 1/2] ACPICA: Dispatcher: Ignore SyncLevel for auto-serialization mechanism.

2014-03-18 Thread Sabrina Dubroca

Hi,

2014-03-18, 00:50:04 +, Zheng, Lv wrote:
 Hi, Sabrina and Valdis
 
 Could you give this patch a try to see if this patch is working?
 
 Thanks and best regards
 -Lv

I applied the two patches. The laptop boots and there is no ACPI
error in the logs, both with auto-serialization enabled and disabled.

Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: ACPICA 20140214 auto-serialize weirds my machine...

2014-03-14 Thread Sabrina Dubroca

Hi,

2014-03-14, 12:05:01 -0400, Valdis Kletnieks wrote:
> Surprisingly enough, this hasn't seemed to have bitten many Linux people yet,
> Google only finds a BSD thread (where the same ACPICA code is in use):
> 
> http://comments.gmane.org/gmane.os.dragonfly-bsd.user/1817

I was about to report the same issue on a Dell Latitude D830. Boot
failure, flood of these messages:

ACPI Exception: AE_AML_MUTEX_ORDER, while evaluating GPE method [_L1C] 
(20140214/evgpe-580)
ACPI Error: Method parse/execution failed [\GPE._L1C] (Node f5e0c9a8), 
AE_AML_MUTEX_ORDER (20140214/psparse-536)
ACPI Error: Cannot acquire Mutex for method [SX30], current SyncLevel is too 
large (1) (20140214/dsmethod-362)
ACPI Exception: AE_AML_MUTEX_ORDER, while evaluating GPE method [_L1C] 
(20140214/evgpe-580)

With acpi_no_auto_serialize things look good.


Before I tried that option, a bisection led me to remove
"| AML_CREATE" from commit f56b05bd111b. That made the laptop boot,

things seemed okay, except for the battery not being detected: the
normal battery line was gone ("ACPI: Battery Slot [BAT0] (battery
present)") and there's this instead:

ACPI Error: Cannot acquire Mutex for method [SX45], current SyncLevel is too 
large (1) (20140214/dsmethod-362)
ACPI Error: Method parse/execution failed [\_SB_.BIF_] (Node f5c250d8), 
AE_AML_MUTEX_ORDER (20140214/psparse-536)
ACPI Error: Method parse/execution failed [\_SB_.BAT0._BIF] (Node f5c25168), 
AE_AML_MUTEX_ORDER (20140214/psparse-536)
ACPI Exception: AE_AML_MUTEX_ORDER, Evaluating _BIF (20140214/battery-416)


-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: ACPICA 20140214 auto-serialize weirds my machine...

2014-03-14 Thread Sabrina Dubroca

Hi,

2014-03-14, 12:05:01 -0400, Valdis Kletnieks wrote:
 Surprisingly enough, this hasn't seemed to have bitten many Linux people yet,
 Google only finds a BSD thread (where the same ACPICA code is in use):
 
 http://comments.gmane.org/gmane.os.dragonfly-bsd.user/1817

I was about to report the same issue on a Dell Latitude D830. Boot
failure, flood of these messages:

ACPI Exception: AE_AML_MUTEX_ORDER, while evaluating GPE method [_L1C] 
(20140214/evgpe-580)
ACPI Error: Method parse/execution failed [\GPE._L1C] (Node f5e0c9a8), 
AE_AML_MUTEX_ORDER (20140214/psparse-536)
ACPI Error: Cannot acquire Mutex for method [SX30], current SyncLevel is too 
large (1) (20140214/dsmethod-362)
ACPI Exception: AE_AML_MUTEX_ORDER, while evaluating GPE method [_L1C] 
(20140214/evgpe-580)

With acpi_no_auto_serialize things look good.


Before I tried that option, a bisection led me to remove
| AML_CREATE from commit f56b05bd111b. That made the laptop boot,

things seemed okay, except for the battery not being detected: the
normal battery line was gone (ACPI: Battery Slot [BAT0] (battery
present)) and there's this instead:

ACPI Error: Cannot acquire Mutex for method [SX45], current SyncLevel is too 
large (1) (20140214/dsmethod-362)
ACPI Error: Method parse/execution failed [\_SB_.BIF_] (Node f5c250d8), 
AE_AML_MUTEX_ORDER (20140214/psparse-536)
ACPI Error: Method parse/execution failed [\_SB_.BAT0._BIF] (Node f5c25168), 
AE_AML_MUTEX_ORDER (20140214/psparse-536)
ACPI Exception: AE_AML_MUTEX_ORDER, Evaluating _BIF (20140214/battery-416)


-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [3.14-rc1] cirrus driver problem (qemu)

2014-02-05 Thread Sabrina Dubroca

2014-02-05, 14:50:18 +1000, Dave Airlie wrote:
> On Wed, Feb 5, 2014 at 8:53 AM, Sabrina Dubroca  wrote:
> > 2014-02-04, 13:20:54 +1000, Dave Airlie wrote:
> >> On Tue, Feb 4, 2014 at 1:34 AM, Sabrina Dubroca  
> >> wrote:
> >> > When I boot 3.14-rc1 in qemu, I get the trace below. The console stops
> >> > updating and I don't get a login prompt. I can login, but I can't see
> >> > what I'm doing. I can login normally via SSH.
> >> >
> >> > If I revert the last commit in drivers/gpu/drm/cirrus:
> >> >
> >> > f4b4718b61d1d5a7442a4fd6863ea80c3a10e508 drm: ast,cirrus,mgag200: use 
> >> > drm_can_sleep
> >> >
> >> > the problem is solved.
> >> >
> >>
> >> Hi does the attach patch fix it?
> >>
> >> Dave.
> >
> >
> > Same problem. Didn't you reverse the logic on in_interrupt, compared
> > to the old "if (!in_interrupt())" ? It looks like drm_can_sleep() is
> > false when in_interrupt() is true.
> >
> > I modified your patch as below. Display doesn't freeze, but I still
> > get the warning.
> 
> Oh wow I totally screwed up there, you are right, logic inversion.
> 
> Can you try the attached?
> 
> without the in_interrupt addition.
> 
> Dave.

It works, thanks!
No freeze, no warning.

Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [3.14-rc1] cirrus driver problem (qemu)

2014-02-05 Thread Sabrina Dubroca

2014-02-05, 14:50:18 +1000, Dave Airlie wrote:
 On Wed, Feb 5, 2014 at 8:53 AM, Sabrina Dubroca s...@queasysnail.net wrote:
  2014-02-04, 13:20:54 +1000, Dave Airlie wrote:
  On Tue, Feb 4, 2014 at 1:34 AM, Sabrina Dubroca s...@queasysnail.net 
  wrote:
   When I boot 3.14-rc1 in qemu, I get the trace below. The console stops
   updating and I don't get a login prompt. I can login, but I can't see
   what I'm doing. I can login normally via SSH.
  
   If I revert the last commit in drivers/gpu/drm/cirrus:
  
   f4b4718b61d1d5a7442a4fd6863ea80c3a10e508 drm: ast,cirrus,mgag200: use 
   drm_can_sleep
  
   the problem is solved.
  
 
  Hi does the attach patch fix it?
 
  Dave.
 
 
  Same problem. Didn't you reverse the logic on in_interrupt, compared
  to the old if (!in_interrupt()) ? It looks like drm_can_sleep() is
  false when in_interrupt() is true.
 
  I modified your patch as below. Display doesn't freeze, but I still
  get the warning.
 
 Oh wow I totally screwed up there, you are right, logic inversion.
 
 Can you try the attached?
 
 without the in_interrupt addition.
 
 Dave.

It works, thanks!
No freeze, no warning.

Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [3.14-rc1] cirrus driver problem (qemu)

2014-02-04 Thread Sabrina Dubroca

2014-02-04, 13:20:54 +1000, Dave Airlie wrote:
> On Tue, Feb 4, 2014 at 1:34 AM, Sabrina Dubroca  wrote:
> > When I boot 3.14-rc1 in qemu, I get the trace below. The console stops
> > updating and I don't get a login prompt. I can login, but I can't see
> > what I'm doing. I can login normally via SSH.
> >
> > If I revert the last commit in drivers/gpu/drm/cirrus:
> >
> > f4b4718b61d1d5a7442a4fd6863ea80c3a10e508 drm: ast,cirrus,mgag200: use 
> > drm_can_sleep
> >
> > the problem is solved.
> >
> 
> Hi does the attach patch fix it?
> 
> Dave.


Same problem. Didn't you reverse the logic on in_interrupt, compared
to the old "if (!in_interrupt())" ? It looks like drm_can_sleep() is
false when in_interrupt() is true.

I modified your patch as below. Display doesn't freeze, but I still
get the warning.

Thanks,
Sabrina

---
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 04086c5..6ab14455f 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1696,7 +1696,7 @@ extern int drm_platform_init(struct drm_driver *driver, 
struct platform_device *
 /* returns true if currently okay to sleep */
 static __inline__ bool drm_can_sleep(void)
 {
-   if (in_atomic() || in_dbg_master() || irqs_disabled())
+   if (in_atomic() || in_dbg_master() || !in_interrupt() || 
irqs_disabled())
return false;
return true;
 }

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [3.14-rc1] cirrus driver problem (qemu)

2014-02-04 Thread Sabrina Dubroca

2014-02-04, 13:20:54 +1000, Dave Airlie wrote:
 On Tue, Feb 4, 2014 at 1:34 AM, Sabrina Dubroca s...@queasysnail.net wrote:
  When I boot 3.14-rc1 in qemu, I get the trace below. The console stops
  updating and I don't get a login prompt. I can login, but I can't see
  what I'm doing. I can login normally via SSH.
 
  If I revert the last commit in drivers/gpu/drm/cirrus:
 
  f4b4718b61d1d5a7442a4fd6863ea80c3a10e508 drm: ast,cirrus,mgag200: use 
  drm_can_sleep
 
  the problem is solved.
 
 
 Hi does the attach patch fix it?
 
 Dave.


Same problem. Didn't you reverse the logic on in_interrupt, compared
to the old if (!in_interrupt()) ? It looks like drm_can_sleep() is
false when in_interrupt() is true.

I modified your patch as below. Display doesn't freeze, but I still
get the warning.

Thanks,
Sabrina

---
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 04086c5..6ab14455f 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1696,7 +1696,7 @@ extern int drm_platform_init(struct drm_driver *driver, 
struct platform_device *
 /* returns true if currently okay to sleep */
 static __inline__ bool drm_can_sleep(void)
 {
-   if (in_atomic() || in_dbg_master() || irqs_disabled())
+   if (in_atomic() || in_dbg_master() || !in_interrupt() || 
irqs_disabled())
return false;
return true;
 }

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[3.14-rc1] cirrus driver problem (qemu)

2014-02-03 Thread Sabrina Dubroca

When I boot 3.14-rc1 in qemu, I get the trace below. The console stops
updating and I don't get a login prompt. I can login, but I can't see
what I'm doing. I can login normally via SSH.

If I revert the last commit in drivers/gpu/drm/cirrus:

f4b4718b61d1d5a7442a4fd6863ea80c3a10e508 drm: ast,cirrus,mgag200: use 
drm_can_sleep

the problem is solved.


[1.749341] [ cut here ]
[1.749347] WARNING: CPU: 0 PID: 0 at kernel/locking/mutex.c:856 
mutex_trylock+0x1e5/0x250()
[1.749348] DEBUG_LOCKS_WARN_ON(in_interrupt())
[1.749360] Modules linked in: ppdev cirrus syscopyarea sysfillrect 
sysimgblt drm_kms_helper evdev psmouse microcode serio_raw pcspkr ttm e1000 
parport_pc parport processor button intel_agp drm intel_gtt i2c_piix4 ipv6 ext4 
crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic pata_acpi ata_piix 
9pnet_virtio 9pnet libata scsi_mod
[1.749362] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc1-t1 #34
[1.749364] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[1.749366]  0009 88001fc038c8 814e8456 
88001fc03910
[1.749367]  88001fc03900 8106a0dd 88001d3ff990 
0010
[1.749368]   01e0 88001cc3b000 
88001fc03960
[1.749369] Call Trace:
[1.749372][] dump_stack+0x4d/0x6f
[1.749374]  [] warn_slowpath_common+0x7d/0xa0
[1.749375]  [] warn_slowpath_fmt+0x4c/0x50
[1.749377]  [] mutex_trylock+0x1e5/0x250
[1.749380]  [] cirrus_dirty_update+0x7c/0x2f0 [cirrus]
[1.749381]  [] cirrus_imageblit+0x2f/0x40 [cirrus]
[1.749388]  [] soft_cursor+0x1b4/0x250
[1.749390]  [] bit_cursor+0x613/0x650
[1.749391]  [] ? get_color.isra.15+0x31/0x140
[1.749392]  [] fbcon_cursor+0x13b/0x1c0
[1.749393]  [] ? update_attr.isra.2+0x90/0x90
[1.749398]  [] hide_cursor+0x28/0xa0
[1.749400]  [] vt_console_print+0x398/0x3d0
[1.749405]  [] ? print_prefix+0x6f/0xb0
[1.749407]  [] 
call_console_drivers.constprop.18+0x93/0x110
[1.749409]  [] console_unlock+0x3cf/0x410
[1.749410]  [] vprintk_emit+0x181/0x4f0
[1.749412]  [] printk+0x54/0x56
[1.749414]  [] credit_entropy_bits+0x2ea/0x300
[1.749415]  [] ? mix_pool_bytes.constprop.30+0x56/0xc0
[1.749416]  [] add_timer_randomness+0xee/0x120
[1.749418]  [] add_disk_randomness+0x33/0xb0
[1.749424]  [] blk_update_bidi_request+0x5c/0x80
[1.749426]  [] blk_end_bidi_request+0x1f/0x60
[1.749427]  [] blk_end_request+0x10/0x20
[1.749433]  [] scsi_io_completion+0xa9/0x640 [scsi_mod]
[1.749436]  [] scsi_finish_command+0xa2/0xe0 [scsi_mod]
[1.749440]  [] scsi_softirq_done+0x10e/0x130 [scsi_mod]
[1.749441]  [] blk_done_softirq+0x93/0xb0
[1.749443]  [] __do_softirq+0x105/0x2f0
[1.749444]  [] irq_exit+0x92/0xc0
[1.749446]  [] do_IRQ+0x58/0xf0
[1.749447]  [] common_interrupt+0x6d/0x6d
[1.749450][] ? native_safe_halt+0x6/0x10
[1.749453]  [] default_idle+0x2d/0x110
[1.749454]  [] arch_cpu_idle+0x2e/0x40
[1.749455]  [] cpu_startup_entry+0xa5/0x2e0
[1.749464]  [] ? early_idt_handlers+0x120/0x120
[1.749466]  [] rest_init+0x84/0x90
[1.749467]  [] start_kernel+0x443/0x44e
[1.749468]  [] ? repair_env_string+0x5c/0x5c
[1.749469]  [] x86_64_start_reservations+0x2a/0x2c
[1.749470]  [] x86_64_start_kernel+0x169/0x178
[1.749471] ---[ end trace d478ba7c30908d4d ]---


-- 
Sabrina Dubroca
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[3.14-rc1] cirrus driver problem (qemu)

2014-02-03 Thread Sabrina Dubroca

When I boot 3.14-rc1 in qemu, I get the trace below. The console stops
updating and I don't get a login prompt. I can login, but I can't see
what I'm doing. I can login normally via SSH.

If I revert the last commit in drivers/gpu/drm/cirrus:

f4b4718b61d1d5a7442a4fd6863ea80c3a10e508 drm: ast,cirrus,mgag200: use 
drm_can_sleep

the problem is solved.


[1.749341] [ cut here ]
[1.749347] WARNING: CPU: 0 PID: 0 at kernel/locking/mutex.c:856 
mutex_trylock+0x1e5/0x250()
[1.749348] DEBUG_LOCKS_WARN_ON(in_interrupt())
[1.749360] Modules linked in: ppdev cirrus syscopyarea sysfillrect 
sysimgblt drm_kms_helper evdev psmouse microcode serio_raw pcspkr ttm e1000 
parport_pc parport processor button intel_agp drm intel_gtt i2c_piix4 ipv6 ext4 
crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic pata_acpi ata_piix 
9pnet_virtio 9pnet libata scsi_mod
[1.749362] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc1-t1 #34
[1.749364] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[1.749366]  0009 88001fc038c8 814e8456 
88001fc03910
[1.749367]  88001fc03900 8106a0dd 88001d3ff990 
0010
[1.749368]   01e0 88001cc3b000 
88001fc03960
[1.749369] Call Trace:
[1.749372]  IRQ  [814e8456] dump_stack+0x4d/0x6f
[1.749374]  [8106a0dd] warn_slowpath_common+0x7d/0xa0
[1.749375]  [8106a14c] warn_slowpath_fmt+0x4c/0x50
[1.749377]  [814ebf45] mutex_trylock+0x1e5/0x250
[1.749380]  [a037888c] cirrus_dirty_update+0x7c/0x2f0 [cirrus]
[1.749381]  [a0378b2f] cirrus_imageblit+0x2f/0x40 [cirrus]
[1.749388]  [813188f4] soft_cursor+0x1b4/0x250
[1.749390]  [813181a3] bit_cursor+0x613/0x650
[1.749391]  [81313891] ? get_color.isra.15+0x31/0x140
[1.749392]  [8131442b] fbcon_cursor+0x13b/0x1c0
[1.749393]  [81317b90] ? update_attr.isra.2+0x90/0x90
[1.749398]  [81380fd8] hide_cursor+0x28/0xa0
[1.749400]  [81382428] vt_console_print+0x398/0x3d0
[1.749405]  [810bc0af] ? print_prefix+0x6f/0xb0
[1.749407]  [810bcc03] 
call_console_drivers.constprop.18+0x93/0x110
[1.749409]  [810bd33f] console_unlock+0x3cf/0x410
[1.749410]  [810bd501] vprintk_emit+0x181/0x4f0
[1.749412]  [814e60dd] printk+0x54/0x56
[1.749414]  [81397bca] credit_entropy_bits+0x2ea/0x300
[1.749415]  [81398166] ? mix_pool_bytes.constprop.30+0x56/0xc0
[1.749416]  [8139866e] add_timer_randomness+0xee/0x120
[1.749418]  [81399463] add_disk_randomness+0x33/0xb0
[1.749424]  [8127d4bc] blk_update_bidi_request+0x5c/0x80
[1.749426]  [8127d79f] blk_end_bidi_request+0x1f/0x60
[1.749427]  [8127d7f0] blk_end_request+0x10/0x20
[1.749433]  [a0009db9] scsi_io_completion+0xa9/0x640 [scsi_mod]
[1.749436]  [a00014b2] scsi_finish_command+0xa2/0xe0 [scsi_mod]
[1.749440]  [a0009c1e] scsi_softirq_done+0x10e/0x130 [scsi_mod]
[1.749441]  [81283fb3] blk_done_softirq+0x93/0xb0
[1.749443]  [8106ee15] __do_softirq+0x105/0x2f0
[1.749444]  [8106f2b2] irq_exit+0x92/0xc0
[1.749446]  [814fa798] do_IRQ+0x58/0xf0
[1.749447]  [814f03ad] common_interrupt+0x6d/0x6d
[1.749450]  EOI  [810546d6] ? native_safe_halt+0x6/0x10
[1.749453]  [8102047d] default_idle+0x2d/0x110
[1.749454]  [81020e9e] arch_cpu_idle+0x2e/0x40
[1.749455]  [810bec05] cpu_startup_entry+0xa5/0x2e0
[1.749464]  [81ae2120] ? early_idt_handlers+0x120/0x120
[1.749466]  [814daad4] rest_init+0x84/0x90
[1.749467]  [81ae2fa9] start_kernel+0x443/0x44e
[1.749468]  [81ae296f] ? repair_env_string+0x5c/0x5c
[1.749469]  [81ae25f6] x86_64_start_reservations+0x2a/0x2c
[1.749470]  [81ae2761] x86_64_start_kernel+0x169/0x178
[1.749471] ---[ end trace d478ba7c30908d4d ]---


-- 
Sabrina Dubroca
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: linux-next: build failure after merge of the tip tree

2014-01-18 Thread Sabrina Dubroca

2014-01-18, 13:44:51 +0100, Peter Zijlstra wrote:
> On Sat, Jan 18, 2014 at 10:46:06AM +0100, Mike Galbraith wrote:
> > 
> > I hope it doesn't look quite like that, next-20140117 is -ENOBOOT on
> > Q6600 box.  See below for an alternative.
> 
> Urgh, I see, we call the idle arch_cpu_idle() callback with irqs
> disabled.
> 
> Could something like this work?
> 
>   local_irq_enable();
>   mwait_idle_with_hints(0,0);
> 
> The interrupt enable window is slightly larger, but I'm not immediately
> seeing a problem with that.

next-20140117 doesn't boot on my T7300 laptop either. I've tried the
two fixes, they both work for me.


Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: linux-next: build failure after merge of the tip tree

2014-01-18 Thread Sabrina Dubroca

2014-01-18, 13:44:51 +0100, Peter Zijlstra wrote:
 On Sat, Jan 18, 2014 at 10:46:06AM +0100, Mike Galbraith wrote:
  
  I hope it doesn't look quite like that, next-20140117 is -ENOBOOT on
  Q6600 box.  See below for an alternative.
 
 Urgh, I see, we call the idle arch_cpu_idle() callback with irqs
 disabled.
 
 Could something like this work?
 
   local_irq_enable();
   mwait_idle_with_hints(0,0);
 
 The interrupt enable window is slightly larger, but I'm not immediately
 seeing a problem with that.

next-20140117 doesn't boot on my T7300 laptop either. I've tried the
two fixes, they both work for me.


Thanks,

-- 
Sabrina
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

< 1 2

101 - 152 of 152 matches

Mail list logo