In netpoll_send_skb_on_dev we do not trigger warn so irqs are disabled.
And in ixgbe_poll->ixgbe_qv_unlock_napi with irqs disabled we use
spin_unlock_bh, that is bug as inside it has local_irq_enable, witch
will enable hardware interrupts.

So port patch witch changes locking here to atomic_cmpxchg.

https://jira.sw.ru/browse/PSBM-40330

WARNING:
Dec  1 18:03:10 p9 kernel: netpoll: netconsole: local port 6666
Dec  1 18:03:10 p9 kernel: netpoll: netconsole: local IPv4 address 10.94.2.172
Dec  1 18:03:10 p9 kernel: netpoll: netconsole: interface 'br0'
Dec  1 18:03:10 p9 kernel: netpoll: netconsole: remote port 6666
Dec  1 18:03:10 p9 kernel: netpoll: netconsole: remote IPv4 address 10.29.0.70
Dec  1 18:03:10 p9 kernel: netpoll: netconsole: remote ethernet address 
7c:95:f3:2e:05:64
Dec  1 18:03:10 p9 kernel: ------------[ cut here ]------------
Dec  1 18:03:10 p9 kernel: WARNING: at kernel/softirq.c:162 
local_bh_enable_ip+0x7a/0xa0()
Dec  1 18:03:10 p9 kernel: Modules linked in: netconsole(+) xt_CHECKSUM 
iptable_mangle iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 
xt_conntrack nf_conntrack ipt_REJECT tun ebtable_filter ebtables 
ip6table_filter ip6_tables iptable_filter ip_tables 8021q garp mrp 
intel_powerclamp coretemp kvm_intel kvm snd_pcm snd_timer snd iTCO_wdt 
iTCO_vendor_support crct10dif_pclmul crc32_pclmul crc32c_intel 
ghash_clmulni_intel aesni_intel lrw gf128mul ioatdma glue_helper soundcore 
ablk_helper cryptd mei_me pcspkr ses sb_edac enclosure edac_core mei i2c_i801 
ipmi_si shpchp lpc_ich mfd_core wmi ipmi_msghandler dm_mod ext4 mbcache jbd2 
sd_mod crc_t10dif crct10dif_common mgag200 isci syscopyarea sysfillrect 
sysimgblt drm_kms_helper ttm libsas ixgbe drm igb scsi_transport_sas ahci 
libahci megaraid_sas libata mdio
Dec  1 18:03:10 p9 kernel: ptp pps_core dca i2c_algo_bit i2c_core ip6_vzprivnet 
ip6_vznetstat pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop 
ip_vznetstat ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat vznetdev 
vzmon vzdev bridge stp llc
Dec  1 18:03:10 p9 kernel: CPU: 31 PID: 3649 Comm: modprobe ve: 0 Not tainted 
3.10.0-229.7.2.ovz.9.14-00004-gcc670bd-dirty #2 9.14
Dec  1 18:03:10 p9 kernel: Hardware name: DEPO Computers 
X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
Dec  1 18:03:10 p9 kernel: 00000000000000a2 000000001e836364 ffff883fd0aeb908 
ffffffff816222a3
Dec  1 18:03:10 p9 kernel: ffff883fd0aeb940 ffffffff8107025b ffff883fc90d2040 
ffff883fc90d2368
Dec  1 18:03:10 p9 kernel: 0000000000000005 0000000000000001 00000000fffffe05 
ffff883fd0aeb950
Dec  1 18:03:10 p9 kernel: Call Trace:
Dec  1 18:03:10 p9 kernel: [<ffffffff816222a3>] dump_stack+0x19/0x1b
Dec  1 18:03:10 p9 kernel: [<ffffffff8107025b>] warn_slowpath_common+0x6b/0xa0
Dec  1 18:03:10 p9 kernel: [<ffffffff8107039a>] warn_slowpath_null+0x1a/0x20
Dec  1 18:03:10 p9 kernel: [<ffffffff81079bea>] local_bh_enable_ip+0x7a/0xa0
Dec  1 18:03:10 p9 kernel: [<ffffffff816285cb>] _raw_spin_unlock_bh+0x1b/0x70
Dec  1 18:03:10 p9 kernel: [<ffffffffa02e123d>] ixgbe_poll+0x54d/0x980 [ixgbe]
Dec  1 18:03:10 p9 kernel: [<ffffffff8153d1b6>] netpoll_poll_dev+0x1b6/0xa80
Dec  1 18:03:10 p9 kernel: [<ffffffffa02dcf4b>] ? ixgbe_select_queue+0xbb/0x120 
[ixgbe]
Dec  1 18:03:10 p9 kernel: [<ffffffff8153ddc0>] 
netpoll_send_skb_on_dev+0x340/0x480
Dec  1 18:03:10 p9 kernel: [<ffffffff8119cf78>] ? 
__list_lru_walk_one.isra.3+0x148/0x170
Dec  1 18:03:10 p9 kernel: [<ffffffffa00129d6>] __br_deliver+0x106/0x130 
[bridge]
Dec  1 18:03:10 p9 kernel: [<ffffffffa0012e83>] br_deliver+0x63/0x70 [bridge]
Dec  1 18:03:10 p9 kernel: [<ffffffffa0010250>] br_dev_xmit+0x240/0x2a0 [bridge]
Dec  1 18:03:10 p9 kernel: [<ffffffff8153dcf6>] 
netpoll_send_skb_on_dev+0x276/0x480
Dec  1 18:03:10 p9 kernel: [<ffffffff8153e22b>] netpoll_send_udp+0x27b/0x390
Dec  1 18:03:10 p9 kernel: [<ffffffffa05d383f>] write_msg+0xcf/0x140 
[netconsole]
Dec  1 18:03:10 p9 kernel: [<ffffffff81070f6f>] 
call_console_drivers.constprop.21+0x9f/0xf0
Dec  1 18:03:10 p9 kernel: [<ffffffff810718b3>] console_unlock+0x293/0x410
Dec  1 18:03:10 p9 kernel: [<ffffffff810726de>] register_console+0x14e/0x3e0
Dec  1 18:03:10 p9 kernel: [<ffffffffa05d81a0>] init_netconsole+0x1a0/0x1000 
[netconsole]
Dec  1 18:03:10 p9 kernel: [<ffffffffa05d8000>] ? 0xffffffffa05d7fff
Dec  1 18:03:10 p9 kernel: [<ffffffff810020e8>] do_one_initcall+0xb8/0x230
Dec  1 18:03:10 p9 kernel: [<ffffffff810f3996>] load_module+0x1656/0x1cc0
Dec  1 18:03:10 p9 kernel: [<ffffffff8130c7a0>] ? ddebug_proc_write+0xf0/0xf0
Dec  1 18:03:10 p9 kernel: [<ffffffff810ef6aa>] ? 
copy_module_from_fd.isra.42+0x5a/0x150
Dec  1 18:03:10 p9 kernel: [<ffffffff810f41de>] SyS_finit_module+0xbe/0xf0
Dec  1 18:03:10 p9 kernel: [<ffffffff816318c9>] system_call_fastpath+0x16/0x1b
Dec  1 18:03:10 p9 kernel: ---[ end trace 44dd3910a6b1ff80 ]---
Dec  1 18:03:10 p9 kernel: Tainting kernel with flag 0x9

Port ms commit:
  commit adc810900a703ee78fe88fd65e086d359fec04b2
  Author: Alexander Duyck <alexander.h.du...@intel.com>
  Date:   Sat Jul 26 02:42:44 2014 +0000

    This change addresses several issues in the current ixgbe implementation of
    busy poll sockets.

    First was the fact that it was possible for frames to be delivered out of
    order if they were held in GRO.  This is addressed by flushing the GRO 
buffers
    before releasing the q_vector back to the idle state.

    The other issue was the fact that we were having to take a spinlock on
    changing the state to and from idle.  To resolve this I have replaced the
    state value with an atomic and use atomic_cmpxchg to change the value from
    idle, and a simple atomic set to restore it back to idle after we have
    acquired it.  This allows us to only use a locked operation on acquiring the
    vector without a need for a locked operation to release it.

    Signed-off-by: Alexander Duyck <alexander.h.du...@intel.com>
    Tested-by: Phil Schmitt <phillip.j.schm...@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirs...@intel.com>

Signed-off-by: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h     | 112 ++++++++++-----------------
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c |   5 ++
 2 files changed, 45 insertions(+), 72 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h 
b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 211c30e..00da363 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -371,119 +371,87 @@ struct ixgbe_q_vector {
        char name[IFNAMSIZ + 9];
 
 #ifdef CONFIG_NET_RX_BUSY_POLL
-       unsigned int state;
-#define IXGBE_QV_STATE_IDLE        0
-#define IXGBE_QV_STATE_NAPI       1     /* NAPI owns this QV */
-#define IXGBE_QV_STATE_POLL       2     /* poll owns this QV */
-#define IXGBE_QV_STATE_DISABLED           4     /* QV is disabled */
-#define IXGBE_QV_OWNED (IXGBE_QV_STATE_NAPI | IXGBE_QV_STATE_POLL)
-#define IXGBE_QV_LOCKED (IXGBE_QV_OWNED | IXGBE_QV_STATE_DISABLED)
-#define IXGBE_QV_STATE_NAPI_YIELD  8     /* NAPI yielded this QV */
-#define IXGBE_QV_STATE_POLL_YIELD  16    /* poll yielded this QV */
-#define IXGBE_QV_YIELD (IXGBE_QV_STATE_NAPI_YIELD | IXGBE_QV_STATE_POLL_YIELD)
-#define IXGBE_QV_USER_PEND (IXGBE_QV_STATE_POLL | IXGBE_QV_STATE_POLL_YIELD)
-       spinlock_t lock;
 #endif  /* CONFIG_NET_RX_BUSY_POLL */
+       atomic_t state;
 
        /* for dynamic allocation of rings associated with this q_vector */
        struct ixgbe_ring ring[0] ____cacheline_internodealigned_in_smp;
 };
+
 #ifdef CONFIG_NET_RX_BUSY_POLL
+enum ixgbe_qv_state_t {
+       IXGBE_QV_STATE_IDLE = 0,
+       IXGBE_QV_STATE_NAPI,
+       IXGBE_QV_STATE_POLL,
+       IXGBE_QV_STATE_DISABLE
+};
+
 static inline void ixgbe_qv_init_lock(struct ixgbe_q_vector *q_vector)
 {
-
-       spin_lock_init(&q_vector->lock);
-       q_vector->state = IXGBE_QV_STATE_IDLE;
+       /* reset state to idle */
+       atomic_set(&q_vector->state, IXGBE_QV_STATE_IDLE);
 }
 
 /* called from the device poll routine to get ownership of a q_vector */
 static inline bool ixgbe_qv_lock_napi(struct ixgbe_q_vector *q_vector)
 {
-       int rc = true;
-       spin_lock_bh(&q_vector->lock);
-       if (q_vector->state & IXGBE_QV_LOCKED) {
-               WARN_ON(q_vector->state & IXGBE_QV_STATE_NAPI);
-               q_vector->state |= IXGBE_QV_STATE_NAPI_YIELD;
-               rc = false;
+       int rc = atomic_cmpxchg(&q_vector->state, IXGBE_QV_STATE_IDLE,
+                               IXGBE_QV_STATE_NAPI);
 #ifdef LL_EXTENDED_STATS
+       if (rc != IXGBE_QV_STATE_IDLE)
                q_vector->tx.ring->stats.yields++;
 #endif
-       } else {
-               /* we don't care if someone yielded */
-               q_vector->state = IXGBE_QV_STATE_NAPI;
-       }
-       spin_unlock_bh(&q_vector->lock);
-       return rc;
+
+       return rc == IXGBE_QV_STATE_IDLE;
 }
 
 /* returns true is someone tried to get the qv while napi had it */
-static inline bool ixgbe_qv_unlock_napi(struct ixgbe_q_vector *q_vector)
+static inline void ixgbe_qv_unlock_napi(struct ixgbe_q_vector *q_vector)
 {
-       int rc = false;
-       spin_lock_bh(&q_vector->lock);
-       WARN_ON(q_vector->state & (IXGBE_QV_STATE_POLL |
-                              IXGBE_QV_STATE_NAPI_YIELD));
-
-       if (q_vector->state & IXGBE_QV_STATE_POLL_YIELD)
-               rc = true;
-       /* will reset state to idle, unless QV is disabled */
-       q_vector->state &= IXGBE_QV_STATE_DISABLED;
-       spin_unlock_bh(&q_vector->lock);
-       return rc;
+       WARN_ON(atomic_read(&q_vector->state) != IXGBE_QV_STATE_NAPI);
+
+       /* flush any outstanding Rx frames */
+       if (q_vector->napi.gro_list)
+               napi_gro_flush(&q_vector->napi, false);
+
+       /* reset state to idle */
+       atomic_set(&q_vector->state, IXGBE_QV_STATE_IDLE);
 }
 
 /* called from ixgbe_low_latency_poll() */
 static inline bool ixgbe_qv_lock_poll(struct ixgbe_q_vector *q_vector)
 {
-       int rc = true;
-       spin_lock_bh(&q_vector->lock);
-       if ((q_vector->state & IXGBE_QV_LOCKED)) {
-               q_vector->state |= IXGBE_QV_STATE_POLL_YIELD;
-               rc = false;
+       int rc = atomic_cmpxchg(&q_vector->state, IXGBE_QV_STATE_IDLE,
+                               IXGBE_QV_STATE_POLL);
 #ifdef LL_EXTENDED_STATS
-               q_vector->rx.ring->stats.yields++;
+       if (rc != IXGBE_QV_STATE_IDLE)
+               q_vector->tx.ring->stats.yields++;
 #endif
-       } else {
-               /* preserve yield marks */
-               q_vector->state |= IXGBE_QV_STATE_POLL;
-       }
-       spin_unlock_bh(&q_vector->lock);
-       return rc;
+       return rc == IXGBE_QV_STATE_IDLE;
 }
 
 /* returns true if someone tried to get the qv while it was locked */
-static inline bool ixgbe_qv_unlock_poll(struct ixgbe_q_vector *q_vector)
+static inline void ixgbe_qv_unlock_poll(struct ixgbe_q_vector *q_vector)
 {
-       int rc = false;
-       spin_lock_bh(&q_vector->lock);
-       WARN_ON(q_vector->state & (IXGBE_QV_STATE_NAPI));
-
-       if (q_vector->state & IXGBE_QV_STATE_POLL_YIELD)
-               rc = true;
-       /* will reset state to idle, unless QV is disabled */
-       q_vector->state &= IXGBE_QV_STATE_DISABLED;
-       spin_unlock_bh(&q_vector->lock);
-       return rc;
+       WARN_ON(atomic_read(&q_vector->state) != IXGBE_QV_STATE_POLL);
+
+       /* reset state to idle */
+       atomic_set(&q_vector->state, IXGBE_QV_STATE_IDLE);
 }
 
 /* true if a socket is polling, even if it did not get the lock */
 static inline bool ixgbe_qv_ll_polling(struct ixgbe_q_vector *q_vector)
 {
-       WARN_ON(!(q_vector->state & IXGBE_QV_OWNED));
-       return q_vector->state & IXGBE_QV_USER_PEND;
+       return atomic_read(&q_vector->state) == IXGBE_QV_STATE_POLL;
 }
 
 /* false if QV is currently owned */
 static inline bool ixgbe_qv_disable(struct ixgbe_q_vector *q_vector)
 {
-       int rc = true;
-       spin_lock_bh(&q_vector->lock);
-       if (q_vector->state & IXGBE_QV_OWNED)
-               rc = false;
-       q_vector->state |= IXGBE_QV_STATE_DISABLED;
-       spin_unlock_bh(&q_vector->lock);
-
-       return rc;
+       int rc = atomic_cmpxchg(&q_vector->state, IXGBE_QV_STATE_IDLE,
+                               IXGBE_QV_STATE_DISABLE);
+
+       return rc == IXGBE_QV_STATE_IDLE;
 }
 
 #else /* CONFIG_NET_RX_BUSY_POLL */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index dcd2d1ec..b42e44d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -808,6 +808,11 @@ static int ixgbe_alloc_q_vector(struct ixgbe_adapter 
*adapter,
                       ixgbe_poll, 64);
        napi_hash_add(&q_vector->napi);
 
+#ifdef CONFIG_NET_RX_BUSY_POLL
+       /* initialize busy poll */
+       atomic_set(&q_vector->state, IXGBE_QV_STATE_DISABLE);
+
+#endif
        /* tie q_vector and adapter together */
        adapter->q_vector[v_idx] = q_vector;
        q_vector->adapter = adapter;
-- 
1.9.3

_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel

Reply via email to