Re: [PATCH v3 0/8] nic: thunderx: fix communication races between VF & PF
sorry for occasionally reply to old thread. On Wed, Feb 20, 2019 at 11:02:42AM +, Vadim Lomovtsev wrote: > The ThunderX CN88XX NIC Virtual Function driver uses mailbox interface > to communicate to physical function driver. Each of VF has it's own pair > of mailbox registers to read from and write to. The mailbox registers > has no protection from possible races, so it has to be implemented > at software side. > > After long term testing by loop of 'ip link set up/down' > command it was found that there are two possible scenarios when > race condition appears: > 1. VF receives link change message from PF and VF send RX mode > configuration message to PF in the same time from separate thread. > 2. PF receives RX mode configuration from VF and in the same time, > in separate thread PF detects link status change and sends appropriate > message to particular VF. > > Both cases leads to mailbox data to be rewritten, NIC VF messaging control > data to be updated incorrectly and communication sequence gets broken. > > This patch series is to address race condition with VF & PF communication. > > Changes: > v1 -> v2 > - : correct typo in cover letter subject: 'betwen' -> 'between'; > - move link state polling request task from pf to vf >instead of cheking status of mailbox irq; > v2 -> v3 > - 0003: change return type of nicvf_send_cfg_done() function >from int to void; > - 0007: update subject and remove unused variable 'netdev' >from nicvf_link_status_check_task() function; > > Vadim Lomovtsev (8): > net: thunderx: correct typo in macro name > net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs > to private for each of them. > net: thunderx: make CFG_DONE message to run through generic send-ack > sequence > net: thunderx: add nicvf_send_msg_to_pf result check for > set_rx_mode_task > net: thunderx: rework xcast message structure to make it fit into 64 > bit > net: thunderx: add mutex to protect mailbox from concurrent calls for > same VF > net: thunderx: add LINK_CHANGE message handler at nicpf > net: thunderx: remove link change polling code and info from nicpf > > drivers/net/ethernet/cavium/thunder/nic.h | 14 +- > .../net/ethernet/cavium/thunder/nic_main.c| 149 ++ > .../net/ethernet/cavium/thunder/nicvf_main.c | 130 ++- > .../net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- > .../net/ethernet/cavium/thunder/thunder_bgx.h | 2 +- > 5 files changed, 144 insertions(+), 153 deletions(-) > > -- > 2.17.2
[PATCH v3 6/8] net: thunderx: add mutex to protect mailbox from concurrent calls for same VF
In some cases it could happen that nicvf_send_msg_to_pf() could be called concurrently for the same NIC VF, and thus re-writing mailbox contents and breaking messaging sequence with PF by re-writing NICVF data. This commit is to implement mutex for NICVF to protect mailbox registers and NICVF messaging control data from concurrent access. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h| 2 ++ drivers/net/ethernet/cavium/thunder/nicvf_main.c | 13 ++--- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 227343625e83..86cda3f4b37b 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -329,6 +329,8 @@ struct nicvf { spinlock_t rx_mode_wq_lock; /* workqueue for handling kernel ndo_set_rx_mode() calls */ struct workqueue_struct *nicvf_rx_mode_wq; + /* mutex to protect VF's mailbox contents from concurrent access */ + struct mutexrx_mode_mtx; /* PTP timestamp */ struct cavium_ptp *ptp_clock; diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index da5986ca7bee..2332e3e95e0e 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -124,6 +124,9 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx) { int timeout = NIC_MBOX_MSG_TIMEOUT; int sleep = 10; + int ret = 0; + + mutex_lock(&nic->rx_mode_mtx); nic->pf_acked = false; nic->pf_nacked = false; @@ -136,7 +139,8 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx) netdev_err(nic->netdev, "PF NACK to mbox msg 0x%02x from VF%d\n", (mbx->msg.msg & 0xFF), nic->vf_id); - return -EINVAL; + ret = -EINVAL; + break; } msleep(sleep); if (nic->pf_acked) @@ -146,10 +150,12 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx) netdev_err(nic->netdev, "PF didn't ACK to mbox msg 0x%02x from VF%d\n", (mbx->msg.msg & 0xFF), nic->vf_id); - return -EBUSY; + ret = -EBUSY; + break; } } - return 0; + mutex_unlock(&nic->rx_mode_mtx); + return ret; } /* Checks if VF is able to comminicate with PF @@ -2208,6 +2214,7 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) nic->vf_id); INIT_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task); spin_lock_init(&nic->rx_mode_wq_lock); + mutex_init(&nic->rx_mode_mtx); err = register_netdev(netdev); if (err) { -- 2.17.2
[PATCH v3 5/8] net: thunderx: rework xcast message structure to make it fit into 64 bit
To communicate to PF each of ThunderX NIC VF uses mailbox which is pair of 64 bit registers available to both VFn and PF. This commit is to change the xcast message structure in order to fit it into 64 bit. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h| 6 ++ drivers/net/ethernet/cavium/thunder/nic_main.c | 4 ++-- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 6 +++--- 3 files changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 376a96bce33f..227343625e83 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -577,10 +577,8 @@ struct set_ptp { struct xcast { u8msg; - union { - u8mode; - u64 mac; - } data; + u8mode; + u64 mac:48; }; /* 128 bit shared memory between PF and each VF */ diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 90497a27df18..620dbe082ca0 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -1094,7 +1094,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); bgx_set_dmac_cam_filter(nic->node, bgx, lmac, - mbx.xcast.data.mac, + mbx.xcast.mac, vf < NIC_VF_PER_MBX_REG ? vf : vf - NIC_VF_PER_MBX_REG); break; @@ -1106,7 +1106,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) } bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); - bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.data.mode); + bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.mode); break; default: dev_err(&nic->pdev->dev, diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 45f06504a61b..da5986ca7bee 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1961,7 +1961,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, * its' own LMAC to the filter to accept packets for it. */ mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = 0; + mbx.xcast.mac = 0; if (nicvf_send_msg_to_pf(nic, &mbx) < 0) goto free_mc; } @@ -1971,7 +1971,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, /* now go through kernel list of MACs and add them one by one */ for (idx = 0; idx < mc_addrs->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = mc_addrs->mc[idx]; + mbx.xcast.mac = mc_addrs->mc[idx]; if (nicvf_send_msg_to_pf(nic, &mbx) < 0) goto free_mc; } @@ -1979,7 +1979,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, /* and finally set rx mode for PF accordingly */ mbx.xcast.msg = NIC_MBOX_MSG_SET_XCAST; - mbx.xcast.data.mode = mode; + mbx.xcast.mode = mode; nicvf_send_msg_to_pf(nic, &mbx); free_mc: -- 2.17.2
[PATCH v3 2/8] net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them.
Having one work queue for receive mode configuration ndo_set_rx_mode() call for all VFs results in making each of them wait till the set_rx_mode() call completes for another VF if any of close, set receive mode and change flags calls being already invoked. Potentially this could cause device state change before appropriate call of receive mode configuration completes, so the call itself became meaningless, corrupt data or break configuration sequence. We don't need any delays in NIC VF configuration sequence so having delayed work call with 0 delay has no sense. This commit is to implement one work queue for each NIC VF for set_rx_mode task and to let them work independently and replacing delayed_work with work_struct. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h | 4 ++- .../net/ethernet/cavium/thunder/nicvf_main.c | 30 ++- 2 files changed, 19 insertions(+), 15 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index f4d81765221e..376a96bce33f 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -271,7 +271,7 @@ struct xcast_addr_list { }; struct nicvf_work { - struct delayed_workwork; + struct work_struct work; u8 mode; struct xcast_addr_list *mc; }; @@ -327,6 +327,8 @@ struct nicvf { struct nicvf_work rx_mode_work; /* spinlock to protect workqueue arguments from concurrent access */ spinlock_t rx_mode_wq_lock; + /* workqueue for handling kernel ndo_set_rx_mode() calls */ + struct workqueue_struct *nicvf_rx_mode_wq; /* PTP timestamp */ struct cavium_ptp *ptp_clock; diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 88f8a8fa93cd..abf24e7dff2d 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -68,9 +68,6 @@ module_param(cpi_alg, int, 0444); MODULE_PARM_DESC(cpi_alg, "PFC algorithm (0=none, 1=VLAN, 2=VLAN16, 3=IP Diffserv)"); -/* workqueue for handling kernel ndo_set_rx_mode() calls */ -static struct workqueue_struct *nicvf_rx_mode_wq; - static inline u8 nicvf_netdev_qidx(struct nicvf *nic, u8 qidx) { if (nic->sqs_mode) @@ -1311,6 +1308,9 @@ int nicvf_stop(struct net_device *netdev) struct nicvf_cq_poll *cq_poll = NULL; union nic_mbx mbx = {}; + /* wait till all queued set_rx_mode tasks completes */ + drain_workqueue(nic->nicvf_rx_mode_wq); + mbx.msg.msg = NIC_MBOX_MSG_SHUTDOWN; nicvf_send_msg_to_pf(nic, &mbx); @@ -1418,6 +1418,9 @@ int nicvf_open(struct net_device *netdev) struct nicvf_cq_poll *cq_poll = NULL; union nic_mbx mbx = {}; + /* wait till all queued set_rx_mode tasks completes if any */ + drain_workqueue(nic->nicvf_rx_mode_wq); + netif_carrier_off(netdev); err = nicvf_register_misc_interrupt(nic); @@ -1973,7 +1976,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, static void nicvf_set_rx_mode_task(struct work_struct *work_arg) { struct nicvf_work *vf_work = container_of(work_arg, struct nicvf_work, - work.work); + work); struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); u8 mode; struct xcast_addr_list *mc; @@ -2030,7 +2033,7 @@ static void nicvf_set_rx_mode(struct net_device *netdev) kfree(nic->rx_mode_work.mc); nic->rx_mode_work.mc = mc_list; nic->rx_mode_work.mode = mode; - queue_delayed_work(nicvf_rx_mode_wq, &nic->rx_mode_work.work, 0); + queue_work(nic->nicvf_rx_mode_wq, &nic->rx_mode_work.work); spin_unlock(&nic->rx_mode_wq_lock); } @@ -2187,7 +2190,10 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) INIT_WORK(&nic->reset_task, nicvf_reset_task); - INIT_DELAYED_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task); + nic->nicvf_rx_mode_wq = alloc_ordered_workqueue("nicvf_rx_mode_wq_VF%d", + WQ_MEM_RECLAIM, + nic->vf_id); + INIT_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task); spin_lock_init(&nic->rx_mode_wq_lock); err = register_netdev(netdev); @@ -2228,13 +2234,15 @@ static void nicvf_remove(struct pci_dev *pdev) nic = netdev_priv(netdev); pnetdev = nic->pnicvf->netdev; - cancel_delayed_work_sync(&nic->rx_mode_work.work); - /* Check if this Qset
[PATCH v3 3/8] net: thunderx: make CFG_DONE message to run through generic send-ack sequence
At the end of NIC VF initialization VF sends CFG_DONE message to PF without using nicvf_msg_send_to_pf routine. This potentially could re-write data in mailbox. This commit is to implement common way of sending CFG_DONE message by the same way with other configuration messages by using nicvf_send_msg_to_pf() routine. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic_main.c | 2 +- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 15 --- 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 6c8dcb65ff03..90497a27df18 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -1039,7 +1039,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) case NIC_MBOX_MSG_CFG_DONE: /* Last message of VF config msg sequence */ nic_enable_vf(nic, vf, true); - goto unlock; + break; case NIC_MBOX_MSG_SHUTDOWN: /* First msg in VF teardown sequence */ if (vf >= nic->num_vf_en) diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index abf24e7dff2d..19b58fc3ca41 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -169,6 +169,17 @@ static int nicvf_check_pf_ready(struct nicvf *nic) return 1; } +static void nicvf_send_cfg_done(struct nicvf *nic) +{ + union nic_mbx mbx = {}; + + mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE; + if (nicvf_send_msg_to_pf(nic, &mbx)) { + netdev_err(nic->netdev, + "PF didn't respond to CFG DONE msg\n"); + } +} + static void nicvf_read_bgx_stats(struct nicvf *nic, struct bgx_stats_msg *bgx) { if (bgx->rx) @@ -1416,7 +1427,6 @@ int nicvf_open(struct net_device *netdev) struct nicvf *nic = netdev_priv(netdev); struct queue_set *qs = nic->qs; struct nicvf_cq_poll *cq_poll = NULL; - union nic_mbx mbx = {}; /* wait till all queued set_rx_mode tasks completes if any */ drain_workqueue(nic->nicvf_rx_mode_wq); @@ -1515,8 +1525,7 @@ int nicvf_open(struct net_device *netdev) nicvf_enable_intr(nic, NICVF_INTR_RBDR, qidx); /* Send VF config done msg to PF */ - mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE; - nicvf_write_to_mbx(nic, &mbx); + nicvf_send_cfg_done(nic); return 0; cleanup: -- 2.17.2
[PATCH v3 4/8] net: thunderx: add nicvf_send_msg_to_pf result check for set_rx_mode_task
The rx_set_mode invokes number of messages to be send to PF for receive mode configuration. In case if there any issues we need to stop sending messages and release allocated memory. This commit is to implement check of nicvf_msg_send_to_pf() result. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 19b58fc3ca41..45f06504a61b 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1953,7 +1953,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, /* flush DMAC filters and reset RX mode */ mbx.xcast.msg = NIC_MBOX_MSG_RESET_XCAST; - nicvf_send_msg_to_pf(nic, &mbx); + if (nicvf_send_msg_to_pf(nic, &mbx) < 0) + goto free_mc; if (mode & BGX_XCAST_MCAST_FILTER) { /* once enabling filtering, we need to signal to PF to add @@ -1961,7 +1962,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, */ mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; mbx.xcast.data.mac = 0; - nicvf_send_msg_to_pf(nic, &mbx); + if (nicvf_send_msg_to_pf(nic, &mbx) < 0) + goto free_mc; } /* check if we have any specific MACs to be added to PF DMAC filter */ @@ -1970,9 +1972,9 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, for (idx = 0; idx < mc_addrs->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; mbx.xcast.data.mac = mc_addrs->mc[idx]; - nicvf_send_msg_to_pf(nic, &mbx); + if (nicvf_send_msg_to_pf(nic, &mbx) < 0) + goto free_mc; } - kfree(mc_addrs); } /* and finally set rx mode for PF accordingly */ @@ -1980,6 +1982,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, mbx.xcast.data.mode = mode; nicvf_send_msg_to_pf(nic, &mbx); +free_mc: + kfree(mc_addrs); } static void nicvf_set_rx_mode_task(struct work_struct *work_arg) -- 2.17.2
[PATCH v3 8/8] net: thunderx: remove link change polling code and info from nicpf
Since link change polling routine was moved to nicvf side, we don't need anymore polling function at nicpf side along with link status info for all enabled Vfs as at VF side this info is already tracked. This commit is to remove unnecessary code & fields from nicpf structure. Signed-off-by: Vadim Lomovtsev --- .../net/ethernet/cavium/thunder/nic_main.c| 114 ++ 1 file changed, 12 insertions(+), 102 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 8ab71dae3988..c90252829ed3 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -57,14 +57,8 @@ struct nicpf { #defineNIC_GET_BGX_FROM_VF_LMAC_MAP(map) ((map >> 4) & 0xF) #defineNIC_GET_LMAC_FROM_VF_LMAC_MAP(map) (map & 0xF) u8 *vf_lmac_map; - struct delayed_work dwork; - struct workqueue_struct *check_link; - u8 *link; - u8 *duplex; - u32 *speed; u16 cpi_base[MAX_NUM_VFS_SUPPORTED]; u16 rssi_base[MAX_NUM_VFS_SUPPORTED]; - boolmbx_lock[MAX_NUM_VFS_SUPPORTED]; /* MSI-X */ u8 num_vec; @@ -929,6 +923,10 @@ static void nic_config_timestamp(struct nicpf *nic, int vf, struct set_ptp *ptp) nic_reg_write(nic, NIC_PF_PKIND_0_15_CFG | (pkind_idx << 3), pkind_val); } +/* Get BGX LMAC link status and update corresponding VF + * if there is a change, valid only if internal L2 switch + * is not present otherwise VF link is always treated as up + */ static void nic_link_status_get(struct nicpf *nic, u8 vf) { union nic_mbx mbx = {}; @@ -944,10 +942,6 @@ static void nic_link_status_get(struct nicpf *nic, u8 vf) /* Get interface link status */ bgx_get_lmac_link_state(nic->node, bgx, lmac, &link); - nic->link[vf] = link.link_up; - nic->duplex[vf] = link.duplex; - nic->speed[vf] = link.speed; - /* Send a mbox message to VF with current link status */ mbx.link_status.link_up = link.link_up; mbx.link_status.duplex = link.duplex; @@ -970,8 +964,6 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) int i; int ret = 0; - nic->mbx_lock[vf] = true; - mbx_addr = nic_get_mbx_addr(vf); mbx_data = (u64 *)&mbx; @@ -986,12 +978,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) switch (mbx.msg.msg) { case NIC_MBOX_MSG_READY: nic_mbx_send_ready(nic, vf); - if (vf < nic->num_vf_en) { - nic->link[vf] = 0; - nic->duplex[vf] = 0; - nic->speed[vf] = 0; - } - goto unlock; + return; case NIC_MBOX_MSG_QS_CFG: reg_addr = NIC_PF_QSET_0_127_CFG | (mbx.qs.num << NIC_QS_ID_SHIFT); @@ -1060,7 +1047,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; case NIC_MBOX_MSG_RSS_SIZE: nic_send_rss_size(nic, vf); - goto unlock; + return; case NIC_MBOX_MSG_RSS_CFG: case NIC_MBOX_MSG_RSS_CFG_CONT: nic_config_rss(nic, &mbx.rss_cfg); @@ -1078,19 +1065,19 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; case NIC_MBOX_MSG_ALLOC_SQS: nic_alloc_sqs(nic, &mbx.sqs_alloc); - goto unlock; + return; case NIC_MBOX_MSG_NICVF_PTR: nic->nicvf[vf] = mbx.nicvf.nicvf; break; case NIC_MBOX_MSG_PNICVF_PTR: nic_send_pnicvf(nic, vf); - goto unlock; + return; case NIC_MBOX_MSG_SNICVF_PTR: nic_send_snicvf(nic, &mbx.nicvf); - goto unlock; + return; case NIC_MBOX_MSG_BGX_STATS: nic_get_bgx_stats(nic, &mbx.bgx_stats); - goto unlock; + return; case NIC_MBOX_MSG_LOOPBACK: ret = nic_config_loopback(nic, &mbx.lbk); break; @@ -1099,7 +1086,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; case NIC_MBOX_MSG_PFC: nic_pause_frame(nic, vf, &mbx.pfc); - goto unlock; + return; case NIC_MBOX_MSG_PTP_CFG: nic_config_timestamp(nic, vf, &mbx.ptp); break; @@ -1143,7 +1130,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; } nic_link_status_get
[PATCH v3 7/8] net: thunderx: move link state polling function to VF
Move the link change polling task to VF side in order to prevent races between VF and PF while sending link change message(s). This commit is to implement link change request to be initiated by VF. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h | 2 +- .../net/ethernet/cavium/thunder/nic_main.c| 39 -- .../net/ethernet/cavium/thunder/nicvf_main.c | 52 +-- 3 files changed, 74 insertions(+), 19 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 86cda3f4b37b..62636c1ed141 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -331,7 +331,7 @@ struct nicvf { struct workqueue_struct *nicvf_rx_mode_wq; /* mutex to protect VF's mailbox contents from concurrent access */ struct mutexrx_mode_mtx; - + struct delayed_work link_change_work; /* PTP timestamp */ struct cavium_ptp *ptp_clock; /* Inbound timestamping is on */ diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 620dbe082ca0..8ab71dae3988 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -929,6 +929,35 @@ static void nic_config_timestamp(struct nicpf *nic, int vf, struct set_ptp *ptp) nic_reg_write(nic, NIC_PF_PKIND_0_15_CFG | (pkind_idx << 3), pkind_val); } +static void nic_link_status_get(struct nicpf *nic, u8 vf) +{ + union nic_mbx mbx = {}; + struct bgx_link_status link; + u8 bgx, lmac; + + mbx.link_status.msg = NIC_MBOX_MSG_BGX_LINK_CHANGE; + + /* Get BGX, LMAC indices for the VF */ + bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + + /* Get interface link status */ + bgx_get_lmac_link_state(nic->node, bgx, lmac, &link); + + nic->link[vf] = link.link_up; + nic->duplex[vf] = link.duplex; + nic->speed[vf] = link.speed; + + /* Send a mbox message to VF with current link status */ + mbx.link_status.link_up = link.link_up; + mbx.link_status.duplex = link.duplex; + mbx.link_status.speed = link.speed; + mbx.link_status.mac_type = link.mac_type; + + /* reply with link status */ + nic_send_msg_to_vf(nic, vf, &mbx); +} + /* Interrupt handler to handle mailbox messages from VFs */ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) { @@ -1108,6 +1137,13 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.mode); break; + case NIC_MBOX_MSG_BGX_LINK_CHANGE: + if (vf >= nic->num_vf_en) { + ret = -1; /* NACK */ + break; + } + nic_link_status_get(nic, vf); + goto unlock; default: dev_err(&nic->pdev->dev, "Invalid msg from VF%d, msg 0x%x\n", vf, mbx.msg.msg); @@ -1419,9 +1455,6 @@ static int nic_probe(struct pci_dev *pdev, const struct pci_device_id *ent) goto err_disable_sriov; } - INIT_DELAYED_WORK(&nic->dwork, nic_poll_for_link); - queue_delayed_work(nic->check_link, &nic->dwork, 0); - return 0; err_disable_sriov: diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 2332e3e95e0e..503cfadff4ac 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -242,21 +242,24 @@ static void nicvf_handle_mbx_intr(struct nicvf *nic) break; case NIC_MBOX_MSG_BGX_LINK_CHANGE: nic->pf_acked = true; - nic->link_up = mbx.link_status.link_up; - nic->duplex = mbx.link_status.duplex; - nic->speed = mbx.link_status.speed; - nic->mac_type = mbx.link_status.mac_type; - if (nic->link_up) { - netdev_info(nic->netdev, "Link is Up %d Mbps %s duplex\n", - nic->speed, - nic->duplex == DUPLEX_FULL ? - "Full" : "Half"); - netif_carrier_on(nic->netdev); - netif_tx_start_all_queues(nic->netdev); - } else { - netdev_info(nic->netdev, "Link is Down\n"); - netif_carrier_off(nic->netdev); -
[PATCH v3 0/8] nic: thunderx: fix communication races between VF & PF
The ThunderX CN88XX NIC Virtual Function driver uses mailbox interface to communicate to physical function driver. Each of VF has it's own pair of mailbox registers to read from and write to. The mailbox registers has no protection from possible races, so it has to be implemented at software side. After long term testing by loop of 'ip link set up/down' command it was found that there are two possible scenarios when race condition appears: 1. VF receives link change message from PF and VF send RX mode configuration message to PF in the same time from separate thread. 2. PF receives RX mode configuration from VF and in the same time, in separate thread PF detects link status change and sends appropriate message to particular VF. Both cases leads to mailbox data to be rewritten, NIC VF messaging control data to be updated incorrectly and communication sequence gets broken. This patch series is to address race condition with VF & PF communication. Changes: v1 -> v2 - : correct typo in cover letter subject: 'betwen' -> 'between'; - move link state polling request task from pf to vf instead of cheking status of mailbox irq; v2 -> v3 - 0003: change return type of nicvf_send_cfg_done() function from int to void; - 0007: update subject and remove unused variable 'netdev' from nicvf_link_status_check_task() function; Vadim Lomovtsev (8): net: thunderx: correct typo in macro name net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them. net: thunderx: make CFG_DONE message to run through generic send-ack sequence net: thunderx: add nicvf_send_msg_to_pf result check for set_rx_mode_task net: thunderx: rework xcast message structure to make it fit into 64 bit net: thunderx: add mutex to protect mailbox from concurrent calls for same VF net: thunderx: add LINK_CHANGE message handler at nicpf net: thunderx: remove link change polling code and info from nicpf drivers/net/ethernet/cavium/thunder/nic.h | 14 +- .../net/ethernet/cavium/thunder/nic_main.c| 149 ++ .../net/ethernet/cavium/thunder/nicvf_main.c | 130 ++- .../net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- .../net/ethernet/cavium/thunder/thunder_bgx.h | 2 +- 5 files changed, 144 insertions(+), 153 deletions(-) -- 2.17.2
[PATCH v3 1/8] net: thunderx: correct typo in macro name
Correct STREERING to STEERING at macro name for BGX steering register. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index e337da6ba2a4..673c57b8023f 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -1217,7 +1217,7 @@ static void bgx_init_hw(struct bgx *bgx) /* Disable MAC steering (NCSI traffic) */ for (i = 0; i < RX_TRAFFIC_STEER_RULE_COUNT; i++) - bgx_reg_write(bgx, 0, BGX_CMR_RX_STREERING + (i * 8), 0x00); + bgx_reg_write(bgx, 0, BGX_CMR_RX_STEERING + (i * 8), 0x00); } static u8 bgx_get_lane2sds_cfg(struct bgx *bgx, struct lmac *lmac) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h index cbdd20b9ee6f..5cbc54e9eb19 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h @@ -60,7 +60,7 @@ #define RX_DMACX_CAM_EN BIT_ULL(48) #define RX_DMACX_CAM_LMACID(x)(((u64)x) << 49) #define RX_DMAC_COUNT 32 -#define BGX_CMR_RX_STREERING 0x300 +#define BGX_CMR_RX_STEERING0x300 #define RX_TRAFFIC_STEER_RULE_COUNT 8 #define BGX_CMR_CHAN_MSK_AND 0x450 #define BGX_CMR_BIST_STATUS0x460 -- 2.17.2
Re: [EXT] Re: [PATCH v2 3/8] net: thunderx: make CFG_DONE message to run through generic send-ack sequence
Hi David, On Mon, Feb 18, 2019 at 03:33:10PM -0800, David Miller wrote: > From: Vadim Lomovtsev > Date: Mon, 18 Feb 2019 09:52:14 + > > > @@ -169,6 +169,20 @@ static int nicvf_check_pf_ready(struct nicvf *nic) > > return 1; > > } > > > > +static int nicvf_send_cfg_done(struct nicvf *nic) > > +{ > > + union nic_mbx mbx = {}; > > + > > + mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE; > > + if (nicvf_send_msg_to_pf(nic, &mbx)) { > > + netdev_err(nic->netdev, > > + "PF didn't respond to CFG DONE msg\n"); > > + return 0; > > + } > > + > > + return 1; > > +} > ... > > @@ -1515,8 +1528,7 @@ int nicvf_open(struct net_device *netdev) > > nicvf_enable_intr(nic, NICVF_INTR_RBDR, qidx); > > > > /* Send VF config done msg to PF */ > > - mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE; > > - nicvf_write_to_mbx(nic, &mbx); > > + nicvf_send_cfg_done(nic); > > If the one and only call site doesn't even bother to check the return > value, just make it return void. > > Thanks. Thank you for your time and comments. I'll update patch and re-submit. WBR, Vadim
[PATCH v2 6/8] net: thunderx: add mutex to protect mailbox from concurrent calls for same VF
In some cases it could happen that nicvf_send_msg_to_pf() could be called concurrently for the same NIC VF, and thus re-writing mailbox contents and breaking messaging sequence with PF by re-writing NICVF data. This commit is to implement mutex for NICVF to protect mailbox registers and NICVF messaging control data from concurrent access. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h| 2 ++ drivers/net/ethernet/cavium/thunder/nicvf_main.c | 13 ++--- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 227343625e83..86cda3f4b37b 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -329,6 +329,8 @@ struct nicvf { spinlock_t rx_mode_wq_lock; /* workqueue for handling kernel ndo_set_rx_mode() calls */ struct workqueue_struct *nicvf_rx_mode_wq; + /* mutex to protect VF's mailbox contents from concurrent access */ + struct mutexrx_mode_mtx; /* PTP timestamp */ struct cavium_ptp *ptp_clock; diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 30c7f54b4f17..a05e2989ec76 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -124,6 +124,9 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx) { int timeout = NIC_MBOX_MSG_TIMEOUT; int sleep = 10; + int ret = 0; + + mutex_lock(&nic->rx_mode_mtx); nic->pf_acked = false; nic->pf_nacked = false; @@ -136,7 +139,8 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx) netdev_err(nic->netdev, "PF NACK to mbox msg 0x%02x from VF%d\n", (mbx->msg.msg & 0xFF), nic->vf_id); - return -EINVAL; + ret = -EINVAL; + break; } msleep(sleep); if (nic->pf_acked) @@ -146,10 +150,12 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx) netdev_err(nic->netdev, "PF didn't ACK to mbox msg 0x%02x from VF%d\n", (mbx->msg.msg & 0xFF), nic->vf_id); - return -EBUSY; + ret = -EBUSY; + break; } } - return 0; + mutex_unlock(&nic->rx_mode_mtx); + return ret; } /* Checks if VF is able to comminicate with PF @@ -2211,6 +2217,7 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) nic->vf_id); INIT_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task); spin_lock_init(&nic->rx_mode_wq_lock); + mutex_init(&nic->rx_mode_mtx); err = register_netdev(netdev); if (err) { -- 2.17.2
[PATCH v2 4/8] net: thunderx: add nicvf_send_msg_to_pf result check for set_rx_mode_task
The rx_set_mode invokes number of messages to be send to PF for receive mode configuration. In case if there any issues we need to stop sending messages and release allocated memory. This commit is to implement check of nicvf_msg_send_to_pf() result. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index b0e8a04e0f1e..dbd8862d60d6 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1956,7 +1956,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, /* flush DMAC filters and reset RX mode */ mbx.xcast.msg = NIC_MBOX_MSG_RESET_XCAST; - nicvf_send_msg_to_pf(nic, &mbx); + if (nicvf_send_msg_to_pf(nic, &mbx) < 0) + goto free_mc; if (mode & BGX_XCAST_MCAST_FILTER) { /* once enabling filtering, we need to signal to PF to add @@ -1964,7 +1965,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, */ mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; mbx.xcast.data.mac = 0; - nicvf_send_msg_to_pf(nic, &mbx); + if (nicvf_send_msg_to_pf(nic, &mbx) < 0) + goto free_mc; } /* check if we have any specific MACs to be added to PF DMAC filter */ @@ -1973,9 +1975,9 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, for (idx = 0; idx < mc_addrs->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; mbx.xcast.data.mac = mc_addrs->mc[idx]; - nicvf_send_msg_to_pf(nic, &mbx); + if (nicvf_send_msg_to_pf(nic, &mbx) < 0) + goto free_mc; } - kfree(mc_addrs); } /* and finally set rx mode for PF accordingly */ @@ -1983,6 +1985,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, mbx.xcast.data.mode = mode; nicvf_send_msg_to_pf(nic, &mbx); +free_mc: + kfree(mc_addrs); } static void nicvf_set_rx_mode_task(struct work_struct *work_arg) -- 2.17.2
[PATCH v2 0/8] nic: thunderx: fix communication races between VF & PF
From: Vadim Lomovtsev The ThunderX CN88XX NIC Virtual Function driver uses mailbox interface to communicate to physical function driver. Each of VF has it's own pair of mailbox registers to read from and write to. The mailbox registers has no protection from possible races, so it has to be implemented at software side. After long term testing by loop of 'ip link set up/down' command it was found that there are two possible scenarios when race condition appears: 1. VF receives link change message from PF and VF send RX mode configuration message to PF in the same time from separate thread. 2. PF receives RX mode configuration from VF and in the same time, in separate thread PF detects link status change and sends appropriate message to particular VF. Both cases leads to mailbox data to be rewritten, NIC VF messaging control data to be updated incorrectly and communication sequence gets broken. This patch series is to address race condition with VF & PF communication. Vadim Lomovtsev (8): net: thunderx: correct typo in macro name net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them. net: thunderx: make CFG_DONE message to run through generic send-ack sequence net: thunderx: add nicvf_send_msg_to_pf result check for set_rx_mode_task net: thunderx: rework xcast message structure to make it fit into 64 bit net: thunderx: add mutex to protect mailbox from concurrent calls for same VF net: thunderx: add LINK_CHANGE message handler at nicpf net: thunderx: remove link change polling code and info from nicpf drivers/net/ethernet/cavium/thunder/nic.h | 14 +- .../net/ethernet/cavium/thunder/nic_main.c| 149 ++ .../net/ethernet/cavium/thunder/nicvf_main.c | 133 +++- .../net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- .../net/ethernet/cavium/thunder/thunder_bgx.h | 2 +- 5 files changed, 147 insertions(+), 153 deletions(-) -- 2.17.2
[PATCH v2 3/8] net: thunderx: make CFG_DONE message to run through generic send-ack sequence
At the end of NIC VF initialization VF sends CFG_DONE message to PF without using nicvf_msg_send_to_pf routine. This potentially could re-write data in mailbox. This commit is to implement common way of sending CFG_DONE message by the same way with other configuration messages by using nicvf_send_msg_to_pf() routine. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic_main.c | 2 +- .../net/ethernet/cavium/thunder/nicvf_main.c | 18 +++--- 2 files changed, 16 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 6c8dcb65ff03..90497a27df18 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -1039,7 +1039,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) case NIC_MBOX_MSG_CFG_DONE: /* Last message of VF config msg sequence */ nic_enable_vf(nic, vf, true); - goto unlock; + break; case NIC_MBOX_MSG_SHUTDOWN: /* First msg in VF teardown sequence */ if (vf >= nic->num_vf_en) diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index abf24e7dff2d..b0e8a04e0f1e 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -169,6 +169,20 @@ static int nicvf_check_pf_ready(struct nicvf *nic) return 1; } +static int nicvf_send_cfg_done(struct nicvf *nic) +{ + union nic_mbx mbx = {}; + + mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE; + if (nicvf_send_msg_to_pf(nic, &mbx)) { + netdev_err(nic->netdev, + "PF didn't respond to CFG DONE msg\n"); + return 0; + } + + return 1; +} + static void nicvf_read_bgx_stats(struct nicvf *nic, struct bgx_stats_msg *bgx) { if (bgx->rx) @@ -1416,7 +1430,6 @@ int nicvf_open(struct net_device *netdev) struct nicvf *nic = netdev_priv(netdev); struct queue_set *qs = nic->qs; struct nicvf_cq_poll *cq_poll = NULL; - union nic_mbx mbx = {}; /* wait till all queued set_rx_mode tasks completes if any */ drain_workqueue(nic->nicvf_rx_mode_wq); @@ -1515,8 +1528,7 @@ int nicvf_open(struct net_device *netdev) nicvf_enable_intr(nic, NICVF_INTR_RBDR, qidx); /* Send VF config done msg to PF */ - mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE; - nicvf_write_to_mbx(nic, &mbx); + nicvf_send_cfg_done(nic); return 0; cleanup: -- 2.17.2
[PATCH v2 7/8] net: thunderx: add LINK_CHANGE message handler at nicpf
Move the link change polling task to VF side in order to prevent races between VF and PF while sending link change message(s). This commit is to implement link change request to be initiated by VF. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h | 2 +- .../net/ethernet/cavium/thunder/nic_main.c| 39 -- .../net/ethernet/cavium/thunder/nicvf_main.c | 54 +-- 3 files changed, 76 insertions(+), 19 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 86cda3f4b37b..62636c1ed141 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -331,7 +331,7 @@ struct nicvf { struct workqueue_struct *nicvf_rx_mode_wq; /* mutex to protect VF's mailbox contents from concurrent access */ struct mutexrx_mode_mtx; - + struct delayed_work link_change_work; /* PTP timestamp */ struct cavium_ptp *ptp_clock; /* Inbound timestamping is on */ diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 620dbe082ca0..8ab71dae3988 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -929,6 +929,35 @@ static void nic_config_timestamp(struct nicpf *nic, int vf, struct set_ptp *ptp) nic_reg_write(nic, NIC_PF_PKIND_0_15_CFG | (pkind_idx << 3), pkind_val); } +static void nic_link_status_get(struct nicpf *nic, u8 vf) +{ + union nic_mbx mbx = {}; + struct bgx_link_status link; + u8 bgx, lmac; + + mbx.link_status.msg = NIC_MBOX_MSG_BGX_LINK_CHANGE; + + /* Get BGX, LMAC indices for the VF */ + bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + + /* Get interface link status */ + bgx_get_lmac_link_state(nic->node, bgx, lmac, &link); + + nic->link[vf] = link.link_up; + nic->duplex[vf] = link.duplex; + nic->speed[vf] = link.speed; + + /* Send a mbox message to VF with current link status */ + mbx.link_status.link_up = link.link_up; + mbx.link_status.duplex = link.duplex; + mbx.link_status.speed = link.speed; + mbx.link_status.mac_type = link.mac_type; + + /* reply with link status */ + nic_send_msg_to_vf(nic, vf, &mbx); +} + /* Interrupt handler to handle mailbox messages from VFs */ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) { @@ -1108,6 +1137,13 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.mode); break; + case NIC_MBOX_MSG_BGX_LINK_CHANGE: + if (vf >= nic->num_vf_en) { + ret = -1; /* NACK */ + break; + } + nic_link_status_get(nic, vf); + goto unlock; default: dev_err(&nic->pdev->dev, "Invalid msg from VF%d, msg 0x%x\n", vf, mbx.msg.msg); @@ -1419,9 +1455,6 @@ static int nic_probe(struct pci_dev *pdev, const struct pci_device_id *ent) goto err_disable_sriov; } - INIT_DELAYED_WORK(&nic->dwork, nic_poll_for_link); - queue_delayed_work(nic->check_link, &nic->dwork, 0); - return 0; err_disable_sriov: diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index a05e2989ec76..2ecacd0e1b51 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -245,21 +245,24 @@ static void nicvf_handle_mbx_intr(struct nicvf *nic) break; case NIC_MBOX_MSG_BGX_LINK_CHANGE: nic->pf_acked = true; - nic->link_up = mbx.link_status.link_up; - nic->duplex = mbx.link_status.duplex; - nic->speed = mbx.link_status.speed; - nic->mac_type = mbx.link_status.mac_type; - if (nic->link_up) { - netdev_info(nic->netdev, "Link is Up %d Mbps %s duplex\n", - nic->speed, - nic->duplex == DUPLEX_FULL ? - "Full" : "Half"); - netif_carrier_on(nic->netdev); - netif_tx_start_all_queues(nic->netdev); - } else { - netdev_info(nic->netdev, "Link is Down\n"); - netif_carrier_off(nic->netdev); -
[PATCH v2 8/8] net: thunderx: remove link change polling code and info from nicpf
Since link change polling routine was moved to nicvf side, we don't need anymore polling function at nicpf side along with link status info for all enabled Vfs as at VF side this info is already tracked. This commit is to remove unnecessary code & fields from nicpf structure. Signed-off-by: Vadim Lomovtsev --- .../net/ethernet/cavium/thunder/nic_main.c| 114 ++ 1 file changed, 12 insertions(+), 102 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 8ab71dae3988..c90252829ed3 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -57,14 +57,8 @@ struct nicpf { #defineNIC_GET_BGX_FROM_VF_LMAC_MAP(map) ((map >> 4) & 0xF) #defineNIC_GET_LMAC_FROM_VF_LMAC_MAP(map) (map & 0xF) u8 *vf_lmac_map; - struct delayed_work dwork; - struct workqueue_struct *check_link; - u8 *link; - u8 *duplex; - u32 *speed; u16 cpi_base[MAX_NUM_VFS_SUPPORTED]; u16 rssi_base[MAX_NUM_VFS_SUPPORTED]; - boolmbx_lock[MAX_NUM_VFS_SUPPORTED]; /* MSI-X */ u8 num_vec; @@ -929,6 +923,10 @@ static void nic_config_timestamp(struct nicpf *nic, int vf, struct set_ptp *ptp) nic_reg_write(nic, NIC_PF_PKIND_0_15_CFG | (pkind_idx << 3), pkind_val); } +/* Get BGX LMAC link status and update corresponding VF + * if there is a change, valid only if internal L2 switch + * is not present otherwise VF link is always treated as up + */ static void nic_link_status_get(struct nicpf *nic, u8 vf) { union nic_mbx mbx = {}; @@ -944,10 +942,6 @@ static void nic_link_status_get(struct nicpf *nic, u8 vf) /* Get interface link status */ bgx_get_lmac_link_state(nic->node, bgx, lmac, &link); - nic->link[vf] = link.link_up; - nic->duplex[vf] = link.duplex; - nic->speed[vf] = link.speed; - /* Send a mbox message to VF with current link status */ mbx.link_status.link_up = link.link_up; mbx.link_status.duplex = link.duplex; @@ -970,8 +964,6 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) int i; int ret = 0; - nic->mbx_lock[vf] = true; - mbx_addr = nic_get_mbx_addr(vf); mbx_data = (u64 *)&mbx; @@ -986,12 +978,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) switch (mbx.msg.msg) { case NIC_MBOX_MSG_READY: nic_mbx_send_ready(nic, vf); - if (vf < nic->num_vf_en) { - nic->link[vf] = 0; - nic->duplex[vf] = 0; - nic->speed[vf] = 0; - } - goto unlock; + return; case NIC_MBOX_MSG_QS_CFG: reg_addr = NIC_PF_QSET_0_127_CFG | (mbx.qs.num << NIC_QS_ID_SHIFT); @@ -1060,7 +1047,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; case NIC_MBOX_MSG_RSS_SIZE: nic_send_rss_size(nic, vf); - goto unlock; + return; case NIC_MBOX_MSG_RSS_CFG: case NIC_MBOX_MSG_RSS_CFG_CONT: nic_config_rss(nic, &mbx.rss_cfg); @@ -1078,19 +1065,19 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; case NIC_MBOX_MSG_ALLOC_SQS: nic_alloc_sqs(nic, &mbx.sqs_alloc); - goto unlock; + return; case NIC_MBOX_MSG_NICVF_PTR: nic->nicvf[vf] = mbx.nicvf.nicvf; break; case NIC_MBOX_MSG_PNICVF_PTR: nic_send_pnicvf(nic, vf); - goto unlock; + return; case NIC_MBOX_MSG_SNICVF_PTR: nic_send_snicvf(nic, &mbx.nicvf); - goto unlock; + return; case NIC_MBOX_MSG_BGX_STATS: nic_get_bgx_stats(nic, &mbx.bgx_stats); - goto unlock; + return; case NIC_MBOX_MSG_LOOPBACK: ret = nic_config_loopback(nic, &mbx.lbk); break; @@ -1099,7 +1086,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; case NIC_MBOX_MSG_PFC: nic_pause_frame(nic, vf, &mbx.pfc); - goto unlock; + return; case NIC_MBOX_MSG_PTP_CFG: nic_config_timestamp(nic, vf, &mbx.ptp); break; @@ -1143,7 +1130,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; } nic_link_status_get
[PATCH v2 5/8] net: thunderx: rework xcast message structure to make it fit into 64 bit
To communicate to PF each of ThunderX NIC VF uses mailbox which is pair of 64 bit registers available to both VFn and PF. This commit is to change the xcast message structure in order to fit it into 64 bit. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h| 6 ++ drivers/net/ethernet/cavium/thunder/nic_main.c | 4 ++-- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 6 +++--- 3 files changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 376a96bce33f..227343625e83 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -577,10 +577,8 @@ struct set_ptp { struct xcast { u8msg; - union { - u8mode; - u64 mac; - } data; + u8mode; + u64 mac:48; }; /* 128 bit shared memory between PF and each VF */ diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 90497a27df18..620dbe082ca0 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -1094,7 +1094,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); bgx_set_dmac_cam_filter(nic->node, bgx, lmac, - mbx.xcast.data.mac, + mbx.xcast.mac, vf < NIC_VF_PER_MBX_REG ? vf : vf - NIC_VF_PER_MBX_REG); break; @@ -1106,7 +1106,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) } bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); - bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.data.mode); + bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.mode); break; default: dev_err(&nic->pdev->dev, diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index dbd8862d60d6..30c7f54b4f17 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1964,7 +1964,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, * its' own LMAC to the filter to accept packets for it. */ mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = 0; + mbx.xcast.mac = 0; if (nicvf_send_msg_to_pf(nic, &mbx) < 0) goto free_mc; } @@ -1974,7 +1974,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, /* now go through kernel list of MACs and add them one by one */ for (idx = 0; idx < mc_addrs->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = mc_addrs->mc[idx]; + mbx.xcast.mac = mc_addrs->mc[idx]; if (nicvf_send_msg_to_pf(nic, &mbx) < 0) goto free_mc; } @@ -1982,7 +1982,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, /* and finally set rx mode for PF accordingly */ mbx.xcast.msg = NIC_MBOX_MSG_SET_XCAST; - mbx.xcast.data.mode = mode; + mbx.xcast.mode = mode; nicvf_send_msg_to_pf(nic, &mbx); free_mc: -- 2.17.2
[PATCH v2 2/8] net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them.
Having one work queue for receive mode configuration ndo_set_rx_mode() call for all VFs results in making each of them wait till the set_rx_mode() call completes for another VF if any of close, set receive mode and change flags calls being already invoked. Potentially this could cause device state change before appropriate call of receive mode configuration completes, so the call itself became meaningless, corrupt data or break configuration sequence. We don't need any delays in NIC VF configuration sequence so having delayed work call with 0 delay has no sense. This commit is to implement one work queue for each NIC VF for set_rx_mode task and to let them work independently and replacing delayed_work with work_struct. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h | 4 ++- .../net/ethernet/cavium/thunder/nicvf_main.c | 30 ++- 2 files changed, 19 insertions(+), 15 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index f4d81765221e..376a96bce33f 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -271,7 +271,7 @@ struct xcast_addr_list { }; struct nicvf_work { - struct delayed_workwork; + struct work_struct work; u8 mode; struct xcast_addr_list *mc; }; @@ -327,6 +327,8 @@ struct nicvf { struct nicvf_work rx_mode_work; /* spinlock to protect workqueue arguments from concurrent access */ spinlock_t rx_mode_wq_lock; + /* workqueue for handling kernel ndo_set_rx_mode() calls */ + struct workqueue_struct *nicvf_rx_mode_wq; /* PTP timestamp */ struct cavium_ptp *ptp_clock; diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 88f8a8fa93cd..abf24e7dff2d 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -68,9 +68,6 @@ module_param(cpi_alg, int, 0444); MODULE_PARM_DESC(cpi_alg, "PFC algorithm (0=none, 1=VLAN, 2=VLAN16, 3=IP Diffserv)"); -/* workqueue for handling kernel ndo_set_rx_mode() calls */ -static struct workqueue_struct *nicvf_rx_mode_wq; - static inline u8 nicvf_netdev_qidx(struct nicvf *nic, u8 qidx) { if (nic->sqs_mode) @@ -1311,6 +1308,9 @@ int nicvf_stop(struct net_device *netdev) struct nicvf_cq_poll *cq_poll = NULL; union nic_mbx mbx = {}; + /* wait till all queued set_rx_mode tasks completes */ + drain_workqueue(nic->nicvf_rx_mode_wq); + mbx.msg.msg = NIC_MBOX_MSG_SHUTDOWN; nicvf_send_msg_to_pf(nic, &mbx); @@ -1418,6 +1418,9 @@ int nicvf_open(struct net_device *netdev) struct nicvf_cq_poll *cq_poll = NULL; union nic_mbx mbx = {}; + /* wait till all queued set_rx_mode tasks completes if any */ + drain_workqueue(nic->nicvf_rx_mode_wq); + netif_carrier_off(netdev); err = nicvf_register_misc_interrupt(nic); @@ -1973,7 +1976,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, static void nicvf_set_rx_mode_task(struct work_struct *work_arg) { struct nicvf_work *vf_work = container_of(work_arg, struct nicvf_work, - work.work); + work); struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); u8 mode; struct xcast_addr_list *mc; @@ -2030,7 +2033,7 @@ static void nicvf_set_rx_mode(struct net_device *netdev) kfree(nic->rx_mode_work.mc); nic->rx_mode_work.mc = mc_list; nic->rx_mode_work.mode = mode; - queue_delayed_work(nicvf_rx_mode_wq, &nic->rx_mode_work.work, 0); + queue_work(nic->nicvf_rx_mode_wq, &nic->rx_mode_work.work); spin_unlock(&nic->rx_mode_wq_lock); } @@ -2187,7 +2190,10 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) INIT_WORK(&nic->reset_task, nicvf_reset_task); - INIT_DELAYED_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task); + nic->nicvf_rx_mode_wq = alloc_ordered_workqueue("nicvf_rx_mode_wq_VF%d", + WQ_MEM_RECLAIM, + nic->vf_id); + INIT_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task); spin_lock_init(&nic->rx_mode_wq_lock); err = register_netdev(netdev); @@ -2228,13 +2234,15 @@ static void nicvf_remove(struct pci_dev *pdev) nic = netdev_priv(netdev); pnetdev = nic->pnicvf->netdev; - cancel_delayed_work_sync(&nic->rx_mode_work.work); - /* Check if this Qset
[PATCH v2 1/8] net: thunderx: correct typo in macro name
Correct STREERING to STEERING at macro name for BGX steering register. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index e337da6ba2a4..673c57b8023f 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -1217,7 +1217,7 @@ static void bgx_init_hw(struct bgx *bgx) /* Disable MAC steering (NCSI traffic) */ for (i = 0; i < RX_TRAFFIC_STEER_RULE_COUNT; i++) - bgx_reg_write(bgx, 0, BGX_CMR_RX_STREERING + (i * 8), 0x00); + bgx_reg_write(bgx, 0, BGX_CMR_RX_STEERING + (i * 8), 0x00); } static u8 bgx_get_lane2sds_cfg(struct bgx *bgx, struct lmac *lmac) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h index cbdd20b9ee6f..5cbc54e9eb19 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h @@ -60,7 +60,7 @@ #define RX_DMACX_CAM_EN BIT_ULL(48) #define RX_DMACX_CAM_LMACID(x)(((u64)x) << 49) #define RX_DMAC_COUNT 32 -#define BGX_CMR_RX_STREERING 0x300 +#define BGX_CMR_RX_STEERING0x300 #define RX_TRAFFIC_STEER_RULE_COUNT 8 #define BGX_CMR_CHAN_MSK_AND 0x450 #define BGX_CMR_BIST_STATUS0x460 -- 2.17.2
Re: [PATCH 0/8] nic: thunderx: fix communication races betwen VF & PF
self-NACK here, some emails get's corrupted for some reasons, along with some typos found. sorry for inconvenience. Vadim On Wed, Feb 06, 2019 at 10:13:54AM +0000, Vadim Lomovtsev wrote: > The ThunderX CN88XX NIC Virtual Function driver uses mailbox interface > to communicate to physical function driver. Each of VF has it's own pair > of mailbox registers to read from and write to. The mailbox registers > has no protection from possible races, so it has to be implemented > at software side. > > After long term testing by loop of 'ip link set up/down' > command it was found that there are two possible scenarios when > race condition appears: > 1. VF receives link change message from PF and VF send RX mode > configuration message to PF in the same time from separate thread. > 2. PF receives RX mode configuration from VF and in the same time, > in separate thread PF detects link status change and sends appropriate > message to particular VF. > > Both cases leads to mailbox data to be rewritten, NIC VF messaging control > data to be updated incorrectly and communication sequence gets broken. > > This patch series is to address race condition with VF & PF communication. > > Vadim Lomovtsev (8): > net: thunderx: correct typo in macro name > net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs > to private for each of them. > net: thunderx: make CFG_DONE message to run through generic send-ack > sequence > net: thunderx: add nicvf_send_msg_to_pf result check for > set_rx_mode_task > net: thunderx: rework xcast message structure to make it fit into 64 > bit > net: thunderx: add mutex to protect mailbox from concurrent calls for > same VF > net: thunderx: implement helpers to read mailbox IRQ status > net: thunderx: check status of mailbox IRQ before sending a message > > drivers/net/ethernet/cavium/thunder/nic.h | 12 +-- > .../net/ethernet/cavium/thunder/nic_main.c| 58 +++-- > .../net/ethernet/cavium/thunder/nicvf_main.c | 82 +-- > .../ethernet/cavium/thunder/nicvf_queues.c| 14 > .../ethernet/cavium/thunder/nicvf_queues.h| 1 + > .../net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- > .../net/ethernet/cavium/thunder/thunder_bgx.h | 2 +- > 7 files changed, 112 insertions(+), 59 deletions(-) > > -- > 2.17.2
[PATCH 2/8] net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them.
Having one work queue for receive mode configuration ndo_set_rx_mode() call for all VFs results in making each of them wait till the set_rx_mode() call completes for another VF if any of close, set receive mode and change flags calls being already invoked. Potentially this could cause device state change before appropriate call of receive mode configuration completes, so the call itself became meaningless, corrupt data or break configuration sequence. We don't need any delays in NIC VF configuration sequence so having delayed work call with 0 delay has no sense. This commit is to implement one work queue for each NIC VF for set_rx_mode task and to let them work independently and replacing delayed_work with work_struct. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h | 4 ++- .../net/ethernet/cavium/thunder/nicvf_main.c | 30 ++- 2 files changed, 19 insertions(+), 15 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index f4d81765221e..376a96bce33f 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -271,7 +271,7 @@ struct xcast_addr_list { }; struct nicvf_work { - struct delayed_workwork; + struct work_struct work; u8 mode; struct xcast_addr_list *mc; }; @@ -327,6 +327,8 @@ struct nicvf { struct nicvf_work rx_mode_work; /* spinlock to protect workqueue arguments from concurrent access */ spinlock_t rx_mode_wq_lock; + /* workqueue for handling kernel ndo_set_rx_mode() calls */ + struct workqueue_struct *nicvf_rx_mode_wq; /* PTP timestamp */ struct cavium_ptp *ptp_clock; diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 88f8a8fa93cd..abf24e7dff2d 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -68,9 +68,6 @@ module_param(cpi_alg, int, 0444); MODULE_PARM_DESC(cpi_alg, "PFC algorithm (0=none, 1=VLAN, 2=VLAN16, 3=IP Diffserv)"); -/* workqueue for handling kernel ndo_set_rx_mode() calls */ -static struct workqueue_struct *nicvf_rx_mode_wq; - static inline u8 nicvf_netdev_qidx(struct nicvf *nic, u8 qidx) { if (nic->sqs_mode) @@ -1311,6 +1308,9 @@ int nicvf_stop(struct net_device *netdev) struct nicvf_cq_poll *cq_poll = NULL; union nic_mbx mbx = {}; + /* wait till all queued set_rx_mode tasks completes */ + drain_workqueue(nic->nicvf_rx_mode_wq); + mbx.msg.msg = NIC_MBOX_MSG_SHUTDOWN; nicvf_send_msg_to_pf(nic, &mbx); @@ -1418,6 +1418,9 @@ int nicvf_open(struct net_device *netdev) struct nicvf_cq_poll *cq_poll = NULL; union nic_mbx mbx = {}; + /* wait till all queued set_rx_mode tasks completes if any */ + drain_workqueue(nic->nicvf_rx_mode_wq); + netif_carrier_off(netdev); err = nicvf_register_misc_interrupt(nic); @@ -1973,7 +1976,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, static void nicvf_set_rx_mode_task(struct work_struct *work_arg) { struct nicvf_work *vf_work = container_of(work_arg, struct nicvf_work, - work.work); + work); struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); u8 mode; struct xcast_addr_list *mc; @@ -2030,7 +2033,7 @@ static void nicvf_set_rx_mode(struct net_device *netdev) kfree(nic->rx_mode_work.mc); nic->rx_mode_work.mc = mc_list; nic->rx_mode_work.mode = mode; - queue_delayed_work(nicvf_rx_mode_wq, &nic->rx_mode_work.work, 0); + queue_work(nic->nicvf_rx_mode_wq, &nic->rx_mode_work.work); spin_unlock(&nic->rx_mode_wq_lock); } @@ -2187,7 +2190,10 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) INIT_WORK(&nic->reset_task, nicvf_reset_task); - INIT_DELAYED_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task); + nic->nicvf_rx_mode_wq = alloc_ordered_workqueue("nicvf_rx_mode_wq_VF%d", + WQ_MEM_RECLAIM, + nic->vf_id); + INIT_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task); spin_lock_init(&nic->rx_mode_wq_lock); err = register_netdev(netdev); @@ -2228,13 +2234,15 @@ static void nicvf_remove(struct pci_dev *pdev) nic = netdev_priv(netdev); pnetdev = nic->pnicvf->netdev; - cancel_delayed_work_sync(&nic->rx_mode_work.work); - /* Check if this Qset
[PATCH 4/8] net: thunderx: add nicvf_send_msg_to_pf result check for set_rx_mode_task
The rx_set_mode invokes number of messages to be send to PF for receive mode configuration. In case if there any issues we need to stop sending messages and release allocated memory. This commit is to implement check of nicvf_msg_send_to_pf() result. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index b0e8a04e0f1e..dbd8862d60d6 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1956,7 +1956,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, /* flush DMAC filters and reset RX mode */ mbx.xcast.msg = NIC_MBOX_MSG_RESET_XCAST; - nicvf_send_msg_to_pf(nic, &mbx); + if (nicvf_send_msg_to_pf(nic, &mbx) < 0) + goto free_mc; if (mode & BGX_XCAST_MCAST_FILTER) { /* once enabling filtering, we need to signal to PF to add @@ -1964,7 +1965,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, */ mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; mbx.xcast.data.mac = 0; - nicvf_send_msg_to_pf(nic, &mbx); + if (nicvf_send_msg_to_pf(nic, &mbx) < 0) + goto free_mc; } /* check if we have any specific MACs to be added to PF DMAC filter */ @@ -1973,9 +1975,9 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, for (idx = 0; idx < mc_addrs->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; mbx.xcast.data.mac = mc_addrs->mc[idx]; - nicvf_send_msg_to_pf(nic, &mbx); + if (nicvf_send_msg_to_pf(nic, &mbx) < 0) + goto free_mc; } - kfree(mc_addrs); } /* and finally set rx mode for PF accordingly */ @@ -1983,6 +1985,8 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, mbx.xcast.data.mode = mode; nicvf_send_msg_to_pf(nic, &mbx); +free_mc: + kfree(mc_addrs); } static void nicvf_set_rx_mode_task(struct work_struct *work_arg) -- 2.17.2
[PATCH 1/8] net: thunderx: correct typo in macro name
Correct STREERING to STEERING at macro name for BGX steering register. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index e337da6ba2a4..673c57b8023f 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -1217,7 +1217,7 @@ static void bgx_init_hw(struct bgx *bgx) /* Disable MAC steering (NCSI traffic) */ for (i = 0; i < RX_TRAFFIC_STEER_RULE_COUNT; i++) - bgx_reg_write(bgx, 0, BGX_CMR_RX_STREERING + (i * 8), 0x00); + bgx_reg_write(bgx, 0, BGX_CMR_RX_STEERING + (i * 8), 0x00); } static u8 bgx_get_lane2sds_cfg(struct bgx *bgx, struct lmac *lmac) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h index cbdd20b9ee6f..5cbc54e9eb19 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h @@ -60,7 +60,7 @@ #define RX_DMACX_CAM_EN BIT_ULL(48) #define RX_DMACX_CAM_LMACID(x)(((u64)x) << 49) #define RX_DMAC_COUNT 32 -#define BGX_CMR_RX_STREERING 0x300 +#define BGX_CMR_RX_STEERING0x300 #define RX_TRAFFIC_STEER_RULE_COUNT 8 #define BGX_CMR_CHAN_MSK_AND 0x450 #define BGX_CMR_BIST_STATUS0x460 -- 2.17.2
[PATCH 5/8] net: thunderx: rework xcast message structure to make it fit into 64 bit
To communicate to PF each of ThunderX NIC VF uses mailbox which is pair of 64 bit registers available to both VFn and PF. This commit is to change the xcast message structure in order to fit it into 64 bit. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h| 6 ++ drivers/net/ethernet/cavium/thunder/nic_main.c | 4 ++-- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 6 +++--- 3 files changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 376a96bce33f..227343625e83 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -577,10 +577,8 @@ struct set_ptp { struct xcast { u8msg; - union { - u8mode; - u64 mac; - } data; + u8mode; + u64 mac:48; }; /* 128 bit shared memory between PF and each VF */ diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 90497a27df18..620dbe082ca0 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -1094,7 +1094,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); bgx_set_dmac_cam_filter(nic->node, bgx, lmac, - mbx.xcast.data.mac, + mbx.xcast.mac, vf < NIC_VF_PER_MBX_REG ? vf : vf - NIC_VF_PER_MBX_REG); break; @@ -1106,7 +1106,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) } bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); - bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.data.mode); + bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.mode); break; default: dev_err(&nic->pdev->dev, diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index dbd8862d60d6..30c7f54b4f17 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1964,7 +1964,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, * its' own LMAC to the filter to accept packets for it. */ mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = 0; + mbx.xcast.mac = 0; if (nicvf_send_msg_to_pf(nic, &mbx) < 0) goto free_mc; } @@ -1974,7 +1974,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, /* now go through kernel list of MACs and add them one by one */ for (idx = 0; idx < mc_addrs->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = mc_addrs->mc[idx]; + mbx.xcast.mac = mc_addrs->mc[idx]; if (nicvf_send_msg_to_pf(nic, &mbx) < 0) goto free_mc; } @@ -1982,7 +1982,7 @@ static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs, /* and finally set rx mode for PF accordingly */ mbx.xcast.msg = NIC_MBOX_MSG_SET_XCAST; - mbx.xcast.data.mode = mode; + mbx.xcast.mode = mode; nicvf_send_msg_to_pf(nic, &mbx); free_mc: -- 2.17.2
[PATCH 0/8] nic: thunderx: fix communication races betwen VF & PF
The ThunderX CN88XX NIC Virtual Function driver uses mailbox interface to communicate to physical function driver. Each of VF has it's own pair of mailbox registers to read from and write to. The mailbox registers has no protection from possible races, so it has to be implemented at software side. After long term testing by loop of 'ip link set up/down' command it was found that there are two possible scenarios when race condition appears: 1. VF receives link change message from PF and VF send RX mode configuration message to PF in the same time from separate thread. 2. PF receives RX mode configuration from VF and in the same time, in separate thread PF detects link status change and sends appropriate message to particular VF. Both cases leads to mailbox data to be rewritten, NIC VF messaging control data to be updated incorrectly and communication sequence gets broken. This patch series is to address race condition with VF & PF communication. Vadim Lomovtsev (8): net: thunderx: correct typo in macro name net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them. net: thunderx: make CFG_DONE message to run through generic send-ack sequence net: thunderx: add nicvf_send_msg_to_pf result check for set_rx_mode_task net: thunderx: rework xcast message structure to make it fit into 64 bit net: thunderx: add mutex to protect mailbox from concurrent calls for same VF net: thunderx: implement helpers to read mailbox IRQ status net: thunderx: check status of mailbox IRQ before sending a message drivers/net/ethernet/cavium/thunder/nic.h | 12 +-- .../net/ethernet/cavium/thunder/nic_main.c| 58 +++-- .../net/ethernet/cavium/thunder/nicvf_main.c | 82 +-- .../ethernet/cavium/thunder/nicvf_queues.c| 14 .../ethernet/cavium/thunder/nicvf_queues.h| 1 + .../net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- .../net/ethernet/cavium/thunder/thunder_bgx.h | 2 +- 7 files changed, 112 insertions(+), 59 deletions(-) -- 2.17.2
[PATCH 3/8] net: thunderx: make CFG_DONE message to run through generic send-ack sequence
At the end of NIC VF initialization it send CFG_DONE message to PF without using nicvf_msg_send_to_pf routine. This potentially could re-write data in mailbox. This commit is to implement common way of sending CFG_DONE message by the same way with other configuration messages by using nicvf_send_msg_to_pf() routine. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic_main.c | 2 +- .../net/ethernet/cavium/thunder/nicvf_main.c | 18 +++--- 2 files changed, 16 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 6c8dcb65ff03..90497a27df18 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -1039,7 +1039,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) case NIC_MBOX_MSG_CFG_DONE: /* Last message of VF config msg sequence */ nic_enable_vf(nic, vf, true); - goto unlock; + break; case NIC_MBOX_MSG_SHUTDOWN: /* First msg in VF teardown sequence */ if (vf >= nic->num_vf_en) diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index abf24e7dff2d..b0e8a04e0f1e 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -169,6 +169,20 @@ static int nicvf_check_pf_ready(struct nicvf *nic) return 1; } +static int nicvf_send_cfg_done(struct nicvf *nic) +{ + union nic_mbx mbx = {}; + + mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE; + if (nicvf_send_msg_to_pf(nic, &mbx)) { + netdev_err(nic->netdev, + "PF didn't respond to CFG DONE msg\n"); + return 0; + } + + return 1; +} + static void nicvf_read_bgx_stats(struct nicvf *nic, struct bgx_stats_msg *bgx) { if (bgx->rx) @@ -1416,7 +1430,6 @@ int nicvf_open(struct net_device *netdev) struct nicvf *nic = netdev_priv(netdev); struct queue_set *qs = nic->qs; struct nicvf_cq_poll *cq_poll = NULL; - union nic_mbx mbx = {}; /* wait till all queued set_rx_mode tasks completes if any */ drain_workqueue(nic->nicvf_rx_mode_wq); @@ -1515,8 +1528,7 @@ int nicvf_open(struct net_device *netdev) nicvf_enable_intr(nic, NICVF_INTR_RBDR, qidx); /* Send VF config done msg to PF */ - mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE; - nicvf_write_to_mbx(nic, &mbx); + nicvf_send_cfg_done(nic); return 0; cleanup: -- 2.17.2
[PATCH 6/8] net: thunderx: add mutex to protect mailbox from concurrent calls for same VF
In some cases it could happen that nicvf_send_msg_to_pf() could be called concurrently for the same NIC VF, and thus re-writing mailbox contents and breaking messaging sequence with PF by re-writing NICVF data. This commit is to implement mutex for NICVF to protect mailbox registers and NICVF messaging control data from concurrent access. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h| 2 ++ drivers/net/ethernet/cavium/thunder/nicvf_main.c | 13 ++--- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 227343625e83..86cda3f4b37b 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -329,6 +329,8 @@ struct nicvf { spinlock_t rx_mode_wq_lock; /* workqueue for handling kernel ndo_set_rx_mode() calls */ struct workqueue_struct *nicvf_rx_mode_wq; + /* mutex to protect VF's mailbox contents from concurrent access */ + struct mutexrx_mode_mtx; /* PTP timestamp */ struct cavium_ptp *ptp_clock; diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 30c7f54b4f17..a05e2989ec76 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -124,6 +124,9 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx) { int timeout = NIC_MBOX_MSG_TIMEOUT; int sleep = 10; + int ret = 0; + + mutex_lock(&nic->rx_mode_mtx); nic->pf_acked = false; nic->pf_nacked = false; @@ -136,7 +139,8 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx) netdev_err(nic->netdev, "PF NACK to mbox msg 0x%02x from VF%d\n", (mbx->msg.msg & 0xFF), nic->vf_id); - return -EINVAL; + ret = -EINVAL; + break; } msleep(sleep); if (nic->pf_acked) @@ -146,10 +150,12 @@ int nicvf_send_msg_to_pf(struct nicvf *nic, union nic_mbx *mbx) netdev_err(nic->netdev, "PF didn't ACK to mbox msg 0x%02x from VF%d\n", (mbx->msg.msg & 0xFF), nic->vf_id); - return -EBUSY; + ret = -EBUSY; + break; } } - return 0; + mutex_unlock(&nic->rx_mode_mtx); + return ret; } /* Checks if VF is able to comminicate with PF @@ -2211,6 +2217,7 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) nic->vf_id); INIT_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task); spin_lock_init(&nic->rx_mode_wq_lock); + mutex_init(&nic->rx_mode_mtx); err = register_netdev(netdev); if (err) { -- 2.17.2
[PATCH 8/8] net: thunderx: check status of mailbox IRQ before sending a message
In order to prevent mailbox data re-writing at VF side we need to check if there is an active mailbox IRQ from PF, and if there is no one proceed with sending message to PF. Having spinlock at irq handler and message send routing wont help since by the moment when code flow would reach the irq handler and acquire spinlock the message send routine could be already invoked and thus re-write data in the mailbox. The same is true for PF while sending messages to VF. This commit is to implement mailbox IRQ status check before sending message to VF from PF. Same is for sending message to PF from VF. Signed-off-by: Vadim Lomovtsev --- .../net/ethernet/cavium/thunder/nic_main.c| 39 --- .../net/ethernet/cavium/thunder/nicvf_main.c | 3 ++ 2 files changed, 20 insertions(+), 22 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index a32c1bd75794..e0041692caef 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -64,7 +64,6 @@ struct nicpf { u32 *speed; u16 cpi_base[MAX_NUM_VFS_SUPPORTED]; u16 rssi_base[MAX_NUM_VFS_SUPPORTED]; - boolmbx_lock[MAX_NUM_VFS_SUPPORTED]; /* MSI-X */ u8 num_vec; @@ -954,8 +953,6 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) int i; int ret = 0; - nic->mbx_lock[vf] = true; - mbx_addr = nic_get_mbx_addr(vf); mbx_data = (u64 *)&mbx; @@ -975,7 +972,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) nic->duplex[vf] = 0; nic->speed[vf] = 0; } - goto unlock; + return; case NIC_MBOX_MSG_QS_CFG: reg_addr = NIC_PF_QSET_0_127_CFG | (mbx.qs.num << NIC_QS_ID_SHIFT); @@ -1044,7 +1041,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; case NIC_MBOX_MSG_RSS_SIZE: nic_send_rss_size(nic, vf); - goto unlock; + return; case NIC_MBOX_MSG_RSS_CFG: case NIC_MBOX_MSG_RSS_CFG_CONT: nic_config_rss(nic, &mbx.rss_cfg); @@ -1062,19 +1059,19 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; case NIC_MBOX_MSG_ALLOC_SQS: nic_alloc_sqs(nic, &mbx.sqs_alloc); - goto unlock; + return; case NIC_MBOX_MSG_NICVF_PTR: nic->nicvf[vf] = mbx.nicvf.nicvf; break; case NIC_MBOX_MSG_PNICVF_PTR: nic_send_pnicvf(nic, vf); - goto unlock; + return; case NIC_MBOX_MSG_SNICVF_PTR: nic_send_snicvf(nic, &mbx.nicvf); - goto unlock; + return; case NIC_MBOX_MSG_BGX_STATS: nic_get_bgx_stats(nic, &mbx.bgx_stats); - goto unlock; + return; case NIC_MBOX_MSG_LOOPBACK: ret = nic_config_loopback(nic, &mbx.lbk); break; @@ -1083,7 +1080,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) break; case NIC_MBOX_MSG_PFC: nic_pause_frame(nic, vf, &mbx.pfc); - goto unlock; + return; case NIC_MBOX_MSG_PTP_CFG: nic_config_timestamp(nic, vf, &mbx.ptp); break; @@ -1134,8 +1131,6 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) mbx.msg.msg, vf); nic_mbx_send_nack(nic, vf); } -unlock: - nic->mbx_lock[vf] = false; } static irqreturn_t nic_mbx_intr_handler(int irq, void *nic_irq) @@ -1313,18 +1308,18 @@ static void nic_poll_for_link(struct work_struct *work) if (nic->link[vf] == link.link_up) continue; - if (!nic->mbx_lock[vf]) { - nic->link[vf] = link.link_up; - nic->duplex[vf] = link.duplex; - nic->speed[vf] = link.speed; + nic->link[vf] = link.link_up; + nic->duplex[vf] = link.duplex; + nic->speed[vf] = link.speed; - /* Send a mbox message to VF with current link status */ - mbx.link_status.link_up = link.link_up; - mbx.link_status.duplex = link.duplex; - mbx.link_status.speed = link.speed; - mbx.link_status.mac_type = link.mac_type; + /* Send a mbox message to VF with current link status */ + mbx.link_status.link_up = link.link_up; +
[PATCH 7/8] net: thunderx: implement helpers to read mailbox IRQ status
This commit is to implement routines to read mailbox IRQ status for particular VF at PF side, and for mailbox IRQ status from PF at VF side. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic_main.c | 13 + drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 14 ++ drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 1 + 3 files changed, 28 insertions(+) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 620dbe082ca0..a32c1bd75794 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -104,6 +104,19 @@ static u64 nic_reg_read(struct nicpf *nic, u64 offset) return readq_relaxed(nic->reg_base + offset); } +static int nic_is_mbox_intr_active(struct nicpf *nic, int vf_id) +{ + int ret = 0; + + if (vf_id < NIC_VF_PER_MBX_REG) { + ret = nic_reg_read(nic, NIC_PF_MAILBOX_INT) & BIT_ULL(vf_id); + } else { + ret = nic_reg_read(nic, NIC_PF_MAILBOX_INT + sizeof(u64)) & + BIT_ULL(vf_id - NIC_VF_PER_MBX_REG); + } + return ret; +} + /* PF -> VF mailbox communication APIs */ static void nic_enable_mbx_intr(struct nicpf *nic) { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c index 5b4d3badcb73..e7ee7005657c 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c @@ -1801,6 +1801,20 @@ void nicvf_clear_intr(struct nicvf *nic, int int_type, int q_idx) nicvf_reg_write(nic, NIC_VF_INT, mask); } +/* Check if interrupt is active */ +int nicvf_check_is_intr_active(struct nicvf *nic, int int_type, int q_idx) +{ + u64 mask = nicvf_int_type_to_mask(int_type, q_idx); + + if (!mask) { + netdev_dbg(nic->netdev, + "Failed to read interrupt status: unknown type\n"); + return 0; + } + + return (mask & nicvf_reg_read(nic, NIC_VF_INT)); +} + /* Check if interrupt is enabled */ int nicvf_is_intr_enabled(struct nicvf *nic, int int_type, int q_idx) { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h index 5e9a03cf1b4d..58f6fbe48bce 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h +++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h @@ -358,6 +358,7 @@ void nicvf_enable_intr(struct nicvf *nic, int int_type, int q_idx); void nicvf_disable_intr(struct nicvf *nic, int int_type, int q_idx); void nicvf_clear_intr(struct nicvf *nic, int int_type, int q_idx); int nicvf_is_intr_enabled(struct nicvf *nic, int int_type, int q_idx); +int nicvf_check_is_intr_active(struct nicvf *nic, int int_type, int q_idx); /* Register access APIs */ void nicvf_reg_write(struct nicvf *nic, u64 offset, u64 val); -- 2.17.2
[PATCH v5] net: thunderx: rework mac addresses list to u64 array
From: Vadim Lomovtsev It is too expensive to pass u64 values via linked list, instead allocate array for them by overall number of mac addresses from netdev. This eventually removes multiple kmalloc() calls, aviod memory fragmentation and allow to put single null check on kmalloc return value in order to prevent a potential null pointer dereference. Addresses-Coverity-ID: 1467429 ("Dereference null return value") Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback implementation for VF") Reported-by: Dan Carpenter Signed-off-by: Vadim Lomovtsev --- Changes from v1 to v2: - C99 syntax: update xcast_addr_list struct field mc[0] -> mc[]; Changes from v2 to v3: - update commit description with 'Reported-by: Dan Carpenter'; - update size calculations for mc list to offsetof() call instead of explicit arithmetic; Changes from v3 to v4: - change loop control variable type from u8 to int, accordingly to mc_count size; Changes from v4 to v5: - remove explicit initialization of 'idx' variable as per review comment; --- drivers/net/ethernet/cavium/thunder/nic.h| 7 +- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 28 +--- 2 files changed, 11 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 5fc46c5a4f36..448d1fafc827 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -265,14 +265,9 @@ struct nicvf_drv_stats { struct cavium_ptp; -struct xcast_addr { - struct list_head list; - u64 addr; -}; - struct xcast_addr_list { - struct list_head list; int count; + u64 mc[]; }; struct nicvf_work { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 1e9a31fef729..707db3304396 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg) work.work); struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); union nic_mbx mbx = {}; - struct xcast_addr *xaddr, *next; + int idx; if (!vf_work) return; @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg) /* check if we have any specific MACs to be added to PF DMAC filter */ if (vf_work->mc) { /* now go through kernel list of MACs and add them one by one */ - list_for_each_entry_safe(xaddr, next, -&vf_work->mc->list, list) { + for (idx = 0; idx < vf_work->mc->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = xaddr->addr; + mbx.xcast.data.mac = vf_work->mc->mc[idx]; nicvf_send_msg_to_pf(nic, &mbx); - - /* after receiving ACK from PF release memory */ - list_del(&xaddr->list); - kfree(xaddr); - vf_work->mc->count--; } kfree(vf_work->mc); } @@ -1996,17 +1990,15 @@ static void nicvf_set_rx_mode(struct net_device *netdev) mode |= BGX_XCAST_MCAST_FILTER; /* here we need to copy mc addrs */ if (netdev_mc_count(netdev)) { - struct xcast_addr *xaddr; - - mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); - INIT_LIST_HEAD(&mc_list->list); + mc_list = kmalloc(offsetof(typeof(*mc_list), + mc[netdev_mc_count(netdev)]), + GFP_ATOMIC); + if (unlikely(!mc_list)) + return; + mc_list->count = 0; netdev_hw_addr_list_for_each(ha, &netdev->mc) { - xaddr = kmalloc(sizeof(*xaddr), - GFP_ATOMIC); - xaddr->addr = + mc_list->mc[mc_list->count] = ether_addr_to_u64(ha->addr); - list_add_tail(&xaddr->list, - &mc_list->list); mc_list->count++; } } -- 2.14.3
Re: [PATCH v4] net: thunderx: rework mac addresses list to u64 array
On Sun, Apr 08, 2018 at 12:42:00PM -0400, David Miller wrote: > From: Vadim Lomovtsev > Date: Fri, 6 Apr 2018 12:53:54 -0700 > > > @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct > > *work_arg) > > work.work); > > struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); > > union nic_mbx mbx = {}; > > - struct xcast_addr *xaddr, *next; > > + int idx = 0; > > No need to initialize idx. > > > + for (idx = 0; idx < vf_work->mc->count; idx++) { > > As it is always explicitly initialized at, and only used inside of, > this loop. Ok, will post next version shortly. Thanks for your time. Vadim
[PATCH v4] net: thunderx: rework mac addresses list to u64 array
From: Vadim Lomovtsev It is too expensive to pass u64 values via linked list, instead allocate array for them by overall number of mac addresses from netdev. This eventually removes multiple kmalloc() calls, aviod memory fragmentation and allow to put single null check on kmalloc return value in order to prevent a potential null pointer dereference. Addresses-Coverity-ID: 1467429 ("Dereference null return value") Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback implementation for VF") Reported-by: Dan Carpenter Signed-off-by: Vadim Lomovtsev --- Changes from v1 to v2: - C99 syntax: update xcast_addr_list struct field mc[0] -> mc[]; Changes from v2 to v3: - update commit description with 'Reported-by: Dan Carpenter'; - update size calculations for mc list to offsetof() call instead of explicit arithmetic; Changes from v3 to v4: - change loop control variable type from u8 to int, accordingly to mc_count size; --- drivers/net/ethernet/cavium/thunder/nic.h| 7 +- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 28 +--- 2 files changed, 11 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 5fc46c5..448d1fa 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -265,14 +265,9 @@ struct nicvf_drv_stats { struct cavium_ptp; -struct xcast_addr { - struct list_head list; - u64 addr; -}; - struct xcast_addr_list { - struct list_head list; int count; + u64 mc[]; }; struct nicvf_work { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 1e9a31f..6bd5658 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg) work.work); struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); union nic_mbx mbx = {}; - struct xcast_addr *xaddr, *next; + int idx = 0; if (!vf_work) return; @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg) /* check if we have any specific MACs to be added to PF DMAC filter */ if (vf_work->mc) { /* now go through kernel list of MACs and add them one by one */ - list_for_each_entry_safe(xaddr, next, -&vf_work->mc->list, list) { + for (idx = 0; idx < vf_work->mc->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = xaddr->addr; + mbx.xcast.data.mac = vf_work->mc->mc[idx]; nicvf_send_msg_to_pf(nic, &mbx); - - /* after receiving ACK from PF release memory */ - list_del(&xaddr->list); - kfree(xaddr); - vf_work->mc->count--; } kfree(vf_work->mc); } @@ -1996,17 +1990,15 @@ static void nicvf_set_rx_mode(struct net_device *netdev) mode |= BGX_XCAST_MCAST_FILTER; /* here we need to copy mc addrs */ if (netdev_mc_count(netdev)) { - struct xcast_addr *xaddr; - - mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); - INIT_LIST_HEAD(&mc_list->list); + mc_list = kmalloc(offsetof(typeof(*mc_list), + mc[netdev_mc_count(netdev)]), + GFP_ATOMIC); + if (unlikely(!mc_list)) + return; + mc_list->count = 0; netdev_hw_addr_list_for_each(ha, &netdev->mc) { - xaddr = kmalloc(sizeof(*xaddr), - GFP_ATOMIC); - xaddr->addr = + mc_list->mc[mc_list->count] = ether_addr_to_u64(ha->addr); - list_add_tail(&xaddr->list, - &mc_list->list); mc_list->count++; } } -- 1.8.3.1
Re: [PATCH v3] net: thunderx: rework mac addresses list to u64 array
Self-NACK here, because of https://lkml.org/lkml/2018/4/6/724 Sorry for noise. Vadim On Fri, Apr 06, 2018 at 07:04:43AM -0700, Vadim Lomovtsev wrote: > From: Vadim Lomovtsev > > It is too expensive to pass u64 values via linked list, instead > allocate array for them by overall number of mac addresses from netdev. > > This eventually removes multiple kmalloc() calls, aviod memory > fragmentation and allow to put single null check on kmalloc > return value in order to prevent a potential null pointer dereference. > > Addresses-Coverity-ID: 1467429 ("Dereference null return value") > Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback > implementation for VF") > Reported-by: Dan Carpenter > Signed-off-by: Vadim Lomovtsev > --- > Changes from v1 to v2: > - C99 syntax: update xcast_addr_list struct field mc[0] -> mc[]; > Changes from v2 to v3: > - update commit description with 'Reported-by: Dan Carpenter'; > - update size calculations for mc list to offsetof() call >instead of explicit arithmetic; > --- > drivers/net/ethernet/cavium/thunder/nic.h| 7 +- > drivers/net/ethernet/cavium/thunder/nicvf_main.c | 28 > +--- > 2 files changed, 11 insertions(+), 24 deletions(-) > > diff --git a/drivers/net/ethernet/cavium/thunder/nic.h > b/drivers/net/ethernet/cavium/thunder/nic.h > index 5fc46c5a4f36..448d1fafc827 100644 > --- a/drivers/net/ethernet/cavium/thunder/nic.h > +++ b/drivers/net/ethernet/cavium/thunder/nic.h > @@ -265,14 +265,9 @@ struct nicvf_drv_stats { > > struct cavium_ptp; > > -struct xcast_addr { > - struct list_head list; > - u64 addr; > -}; > - > struct xcast_addr_list { > - struct list_head list; > int count; > + u64 mc[]; > }; > > struct nicvf_work { > diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > index 1e9a31fef729..7d9e58533a83 100644 > --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct > *work_arg) > work.work); > struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); > union nic_mbx mbx = {}; > - struct xcast_addr *xaddr, *next; > + u8 idx = 0; > > if (!vf_work) > return; > @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct work_struct > *work_arg) > /* check if we have any specific MACs to be added to PF DMAC filter */ > if (vf_work->mc) { > /* now go through kernel list of MACs and add them one by one */ > - list_for_each_entry_safe(xaddr, next, > - &vf_work->mc->list, list) { > + for (idx = 0; idx < vf_work->mc->count; idx++) { > mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; > - mbx.xcast.data.mac = xaddr->addr; > + mbx.xcast.data.mac = vf_work->mc->mc[idx]; > nicvf_send_msg_to_pf(nic, &mbx); > - > - /* after receiving ACK from PF release memory */ > - list_del(&xaddr->list); > - kfree(xaddr); > - vf_work->mc->count--; > } > kfree(vf_work->mc); > } > @@ -1996,17 +1990,15 @@ static void nicvf_set_rx_mode(struct net_device > *netdev) > mode |= BGX_XCAST_MCAST_FILTER; > /* here we need to copy mc addrs */ > if (netdev_mc_count(netdev)) { > - struct xcast_addr *xaddr; > - > - mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); > - INIT_LIST_HEAD(&mc_list->list); > + mc_list = kmalloc(offsetof(typeof(*mc_list), > + > mc[netdev_mc_count(netdev)]), > + GFP_ATOMIC); > + if (unlikely(!mc_list)) > + return; > + mc_list->count = 0; > netdev_hw_addr_list_for_each(ha, &netdev->mc) { > - xaddr = kmalloc(sizeof(*xaddr), > -
Re: [PATCH v2] net: thunderx: rework mac addresses list to u64 array
On Fri, Apr 06, 2018 at 11:06:03AM -0400, David Miller wrote: > From: Vadim Lomovtsev > Date: Fri, 6 Apr 2018 04:14:25 -0700 > > > diff --git a/drivers/net/ethernet/cavium/thunder/nic.h > > b/drivers/net/ethernet/cavium/thunder/nic.h > > index 5fc46c5a4f36..448d1fafc827 100644 > > --- a/drivers/net/ethernet/cavium/thunder/nic.h > > +++ b/drivers/net/ethernet/cavium/thunder/nic.h > > @@ -265,14 +265,9 @@ struct nicvf_drv_stats { > > > > struct cavium_ptp; > > > > -struct xcast_addr { > > - struct list_head list; > > - u64 addr; > > -}; > > - > > struct xcast_addr_list { > > - struct list_head list; > > int count; > > + u64 mc[]; > > }; > > > > struct nicvf_work { > > diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > index 1e9a31fef729..a26d8bc92e01 100644 > > --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct > > *work_arg) > > work.work); > > struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); > > union nic_mbx mbx = {}; > > - struct xcast_addr *xaddr, *next; > > + u8 idx = 0; > ^^^ > > > > > if (!vf_work) > > return; > > @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct > > work_struct *work_arg) > > /* check if we have any specific MACs to be added to PF DMAC filter */ > > if (vf_work->mc) { > > /* now go through kernel list of MACs and add them one by one */ > > - list_for_each_entry_safe(xaddr, next, > > -&vf_work->mc->list, list) { > > + for (idx = 0; idx < vf_work->mc->count; idx++) { > > vf_work->mx->count is an 'int' therefore 'idx' should be declared 'int' as > well, > not a 'u8'. My bad, sorry. Will post v4 shortly then. WBR, Vadim
[PATCH v3] net: thunderx: rework mac addresses list to u64 array
From: Vadim Lomovtsev It is too expensive to pass u64 values via linked list, instead allocate array for them by overall number of mac addresses from netdev. This eventually removes multiple kmalloc() calls, aviod memory fragmentation and allow to put single null check on kmalloc return value in order to prevent a potential null pointer dereference. Addresses-Coverity-ID: 1467429 ("Dereference null return value") Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback implementation for VF") Reported-by: Dan Carpenter Signed-off-by: Vadim Lomovtsev --- Changes from v1 to v2: - C99 syntax: update xcast_addr_list struct field mc[0] -> mc[]; Changes from v2 to v3: - update commit description with 'Reported-by: Dan Carpenter'; - update size calculations for mc list to offsetof() call instead of explicit arithmetic; --- drivers/net/ethernet/cavium/thunder/nic.h| 7 +- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 28 +--- 2 files changed, 11 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 5fc46c5a4f36..448d1fafc827 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -265,14 +265,9 @@ struct nicvf_drv_stats { struct cavium_ptp; -struct xcast_addr { - struct list_head list; - u64 addr; -}; - struct xcast_addr_list { - struct list_head list; int count; + u64 mc[]; }; struct nicvf_work { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 1e9a31fef729..7d9e58533a83 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg) work.work); struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); union nic_mbx mbx = {}; - struct xcast_addr *xaddr, *next; + u8 idx = 0; if (!vf_work) return; @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg) /* check if we have any specific MACs to be added to PF DMAC filter */ if (vf_work->mc) { /* now go through kernel list of MACs and add them one by one */ - list_for_each_entry_safe(xaddr, next, -&vf_work->mc->list, list) { + for (idx = 0; idx < vf_work->mc->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = xaddr->addr; + mbx.xcast.data.mac = vf_work->mc->mc[idx]; nicvf_send_msg_to_pf(nic, &mbx); - - /* after receiving ACK from PF release memory */ - list_del(&xaddr->list); - kfree(xaddr); - vf_work->mc->count--; } kfree(vf_work->mc); } @@ -1996,17 +1990,15 @@ static void nicvf_set_rx_mode(struct net_device *netdev) mode |= BGX_XCAST_MCAST_FILTER; /* here we need to copy mc addrs */ if (netdev_mc_count(netdev)) { - struct xcast_addr *xaddr; - - mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); - INIT_LIST_HEAD(&mc_list->list); + mc_list = kmalloc(offsetof(typeof(*mc_list), + mc[netdev_mc_count(netdev)]), + GFP_ATOMIC); + if (unlikely(!mc_list)) + return; + mc_list->count = 0; netdev_hw_addr_list_for_each(ha, &netdev->mc) { - xaddr = kmalloc(sizeof(*xaddr), - GFP_ATOMIC); - xaddr->addr = + mc_list->mc[mc_list->count] = ether_addr_to_u64(ha->addr); - list_add_tail(&xaddr->list, - &mc_list->list); mc_list->count++; } } -- 2.14.3
Re: [PATCH v2] net: thunderx: rework mac addresses list to u64 array
Hi Robin, On Fri, Apr 06, 2018 at 12:48:40PM +0100, Robin Murphy wrote: > On 06/04/18 12:14, Vadim Lomovtsev wrote: > > From: Vadim Lomovtsev > > > > It is too expensive to pass u64 values via linked list, instead > > allocate array for them by overall number of mac addresses from netdev. > > > > This eventually removes multiple kmalloc() calls, aviod memory > > fragmentation and allow to put single null check on kmalloc > > return value in order to prevent a potential null pointer dereference. > > > > Addresses-Coverity-ID: 1467429 ("Dereference null return value") > > Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback > > implementation for VF") > > Signed-off-by: Vadim Lomovtsev > > --- > > Changes from v1 to v2: > > - C99 syntax: update xcast_addr_list struct field mc[0] -> mc[]; > > > > --- > > drivers/net/ethernet/cavium/thunder/nic.h| 7 +- > > drivers/net/ethernet/cavium/thunder/nicvf_main.c | 28 > > +--- > > 2 files changed, 11 insertions(+), 24 deletions(-) > > > > diff --git a/drivers/net/ethernet/cavium/thunder/nic.h > > b/drivers/net/ethernet/cavium/thunder/nic.h > > index 5fc46c5a4f36..448d1fafc827 100644 > > --- a/drivers/net/ethernet/cavium/thunder/nic.h > > +++ b/drivers/net/ethernet/cavium/thunder/nic.h > > @@ -265,14 +265,9 @@ struct nicvf_drv_stats { > > struct cavium_ptp; > > -struct xcast_addr { > > - struct list_head list; > > - u64 addr; > > -}; > > - > > struct xcast_addr_list { > > - struct list_head list; > > int count; > > + u64 mc[]; > > }; > > struct nicvf_work { > > diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > index 1e9a31fef729..a26d8bc92e01 100644 > > --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct > > *work_arg) > > work.work); > > struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); > > union nic_mbx mbx = {}; > > - struct xcast_addr *xaddr, *next; > > + u8 idx = 0; > > if (!vf_work) > > return; > > @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct > > work_struct *work_arg) > > /* check if we have any specific MACs to be added to PF DMAC filter */ > > if (vf_work->mc) { > > /* now go through kernel list of MACs and add them one by one */ > > - list_for_each_entry_safe(xaddr, next, > > -&vf_work->mc->list, list) { > > + for (idx = 0; idx < vf_work->mc->count; idx++) { > > mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; > > - mbx.xcast.data.mac = xaddr->addr; > > + mbx.xcast.data.mac = vf_work->mc->mc[idx]; > > nicvf_send_msg_to_pf(nic, &mbx); > > - > > - /* after receiving ACK from PF release memory */ > > - list_del(&xaddr->list); > > - kfree(xaddr); > > - vf_work->mc->count--; > > } > > kfree(vf_work->mc); > > } > > @@ -1996,17 +1990,15 @@ static void nicvf_set_rx_mode(struct net_device > > *netdev) > > mode |= BGX_XCAST_MCAST_FILTER; > > /* here we need to copy mc addrs */ > > if (netdev_mc_count(netdev)) { > > - struct xcast_addr *xaddr; > > - > > - mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); > > - INIT_LIST_HEAD(&mc_list->list); > > + mc_list = kmalloc(sizeof(*mc_list) + > > + sizeof(u64) * > > netdev_mc_count(netdev), > > FWIW if you really wanted to disambiguate that it's a structure and not just > an array being allocated, then it could be expressed without explicit > arithmetic: > > size = offsetof(typeof(*mc_list), mc[netdev_mc_count(netdev)]); > > but that's probably just a matter of personal preference at this point. > > Robin. > Thanks for reviewing it and for hint. From style perspective, I believe,
Re: [PATCH v2] net: thunderx: rework mac addresses list to u64 array
On Fri, Apr 06, 2018 at 06:47:26AM -0500, Gustavo A. R. Silva wrote: > > > On 04/06/2018 06:43 AM, Vadim Lomovtsev wrote: > > Hi Gustavo, > > > > On Fri, Apr 06, 2018 at 06:36:28AM -0500, Gustavo A. R. Silva wrote: > > > Hi Vadim, > > > > > > On 04/06/2018 06:14 AM, Vadim Lomovtsev wrote: > > > > From: Vadim Lomovtsev > > > > > > > > It is too expensive to pass u64 values via linked list, instead > > > > allocate array for them by overall number of mac addresses from netdev. > > > > > > > > This eventually removes multiple kmalloc() calls, aviod memory > > > > fragmentation and allow to put single null check on kmalloc > > > > return value in order to prevent a potential null pointer dereference. > > > > > > > > Addresses-Coverity-ID: 1467429 ("Dereference null return value") > > > > Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback > > > > implementation for VF") > > > > > > It'd be nice if you add: > > > > > > Reported-by: Gustavo A. R. Silva > > > > Ok, I could do that. > > > > Just to explain .. I didn't do it yet since I get such report from > > Dan Carpenter intially > > (https://www.spinics.net/lists/linux-kernel-janitors/msg40720.html) > > and was working on it when found you patches, so then asking you to give > > me some time to prepare and test update to driver. > > > > Oh I got it. Not big deal. I think in this case you should add Dan's > Reported-by instead. Ok then. > > BTW nice patch. :) > Thank you for reviewing it. WBR, Vadim > Thanks > -- > Gustavo > > > So would it be acceptable put two tags 'Reported-by:' then ? > > > > WBR, > > Vadim > > > > > > > > Thanks > > > -- > > > Gustavo > > > > > > > Signed-off-by: Vadim Lomovtsev > > > > --- > > > > Changes from v1 to v2: > > > >- C99 syntax: update xcast_addr_list struct field mc[0] -> mc[]; > > > > > > > > --- > > > >drivers/net/ethernet/cavium/thunder/nic.h| 7 +- > > > >drivers/net/ethernet/cavium/thunder/nicvf_main.c | 28 > > > > +--- > > > >2 files changed, 11 insertions(+), 24 deletions(-) > > > > > > > > diff --git a/drivers/net/ethernet/cavium/thunder/nic.h > > > > b/drivers/net/ethernet/cavium/thunder/nic.h > > > > index 5fc46c5a4f36..448d1fafc827 100644 > > > > --- a/drivers/net/ethernet/cavium/thunder/nic.h > > > > +++ b/drivers/net/ethernet/cavium/thunder/nic.h > > > > @@ -265,14 +265,9 @@ struct nicvf_drv_stats { > > > >struct cavium_ptp; > > > > -struct xcast_addr { > > > > - struct list_head list; > > > > - u64 addr; > > > > -}; > > > > - > > > >struct xcast_addr_list { > > > > - struct list_head list; > > > > int count; > > > > + u64 mc[]; > > > >}; > > > >struct nicvf_work { > > > > diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > > > b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > > > index 1e9a31fef729..a26d8bc92e01 100644 > > > > --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > > > +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > > > @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct > > > > work_struct *work_arg) > > > > work.work); > > > > struct nicvf *nic = container_of(vf_work, struct nicvf, > > > > rx_mode_work); > > > > union nic_mbx mbx = {}; > > > > - struct xcast_addr *xaddr, *next; > > > > + u8 idx = 0; > > > > if (!vf_work) > > > > return; > > > > @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct > > > > work_struct *work_arg) > > > > /* check if we have any specific MACs to be added to PF DMAC > > > > filter */ > > > > if (vf_work->mc) { > > > > /* now go through kernel list of MACs and add them one > > > > by one */ > > > > - list_for_each_entry_s
Re: [PATCH v2] net: thunderx: rework mac addresses list to u64 array
Hi Gustavo, On Fri, Apr 06, 2018 at 06:36:28AM -0500, Gustavo A. R. Silva wrote: > Hi Vadim, > > On 04/06/2018 06:14 AM, Vadim Lomovtsev wrote: > > From: Vadim Lomovtsev > > > > It is too expensive to pass u64 values via linked list, instead > > allocate array for them by overall number of mac addresses from netdev. > > > > This eventually removes multiple kmalloc() calls, aviod memory > > fragmentation and allow to put single null check on kmalloc > > return value in order to prevent a potential null pointer dereference. > > > > Addresses-Coverity-ID: 1467429 ("Dereference null return value") > > Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback > > implementation for VF") > > It'd be nice if you add: > > Reported-by: Gustavo A. R. Silva Ok, I could do that. Just to explain .. I didn't do it yet since I get such report from Dan Carpenter intially (https://www.spinics.net/lists/linux-kernel-janitors/msg40720.html) and was working on it when found you patches, so then asking you to give me some time to prepare and test update to driver. So would it be acceptable put two tags 'Reported-by:' then ? WBR, Vadim > > Thanks > -- > Gustavo > > > Signed-off-by: Vadim Lomovtsev > > --- > > Changes from v1 to v2: > > - C99 syntax: update xcast_addr_list struct field mc[0] -> mc[]; > > > > --- > > drivers/net/ethernet/cavium/thunder/nic.h| 7 +- > > drivers/net/ethernet/cavium/thunder/nicvf_main.c | 28 > > +--- > > 2 files changed, 11 insertions(+), 24 deletions(-) > > > > diff --git a/drivers/net/ethernet/cavium/thunder/nic.h > > b/drivers/net/ethernet/cavium/thunder/nic.h > > index 5fc46c5a4f36..448d1fafc827 100644 > > --- a/drivers/net/ethernet/cavium/thunder/nic.h > > +++ b/drivers/net/ethernet/cavium/thunder/nic.h > > @@ -265,14 +265,9 @@ struct nicvf_drv_stats { > > struct cavium_ptp; > > -struct xcast_addr { > > - struct list_head list; > > - u64 addr; > > -}; > > - > > struct xcast_addr_list { > > - struct list_head list; > > int count; > > + u64 mc[]; > > }; > > struct nicvf_work { > > diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > index 1e9a31fef729..a26d8bc92e01 100644 > > --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > > @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct > > *work_arg) > > work.work); > > struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); > > union nic_mbx mbx = {}; > > - struct xcast_addr *xaddr, *next; > > + u8 idx = 0; > > if (!vf_work) > > return; > > @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct > > work_struct *work_arg) > > /* check if we have any specific MACs to be added to PF DMAC filter */ > > if (vf_work->mc) { > > /* now go through kernel list of MACs and add them one by one */ > > - list_for_each_entry_safe(xaddr, next, > > -&vf_work->mc->list, list) { > > + for (idx = 0; idx < vf_work->mc->count; idx++) { > > mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; > > - mbx.xcast.data.mac = xaddr->addr; > > + mbx.xcast.data.mac = vf_work->mc->mc[idx]; > > nicvf_send_msg_to_pf(nic, &mbx); > > - > > - /* after receiving ACK from PF release memory */ > > - list_del(&xaddr->list); > > - kfree(xaddr); > > - vf_work->mc->count--; > > } > > kfree(vf_work->mc); > > } > > @@ -1996,17 +1990,15 @@ static void nicvf_set_rx_mode(struct net_device > > *netdev) > > mode |= BGX_XCAST_MCAST_FILTER; > > /* here we need to copy mc addrs */ > > if (netdev_mc_count(netdev)) { > > - struct xcast_addr *xaddr; > > - > > - mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); > > - INIT_LIST_HEAD(&mc_list->list); > > + mc_list = kmalloc(sizeof(*mc_list) +
[PATCH v2] net: thunderx: rework mac addresses list to u64 array
From: Vadim Lomovtsev It is too expensive to pass u64 values via linked list, instead allocate array for them by overall number of mac addresses from netdev. This eventually removes multiple kmalloc() calls, aviod memory fragmentation and allow to put single null check on kmalloc return value in order to prevent a potential null pointer dereference. Addresses-Coverity-ID: 1467429 ("Dereference null return value") Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback implementation for VF") Signed-off-by: Vadim Lomovtsev --- Changes from v1 to v2: - C99 syntax: update xcast_addr_list struct field mc[0] -> mc[]; --- drivers/net/ethernet/cavium/thunder/nic.h| 7 +- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 28 +--- 2 files changed, 11 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 5fc46c5a4f36..448d1fafc827 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -265,14 +265,9 @@ struct nicvf_drv_stats { struct cavium_ptp; -struct xcast_addr { - struct list_head list; - u64 addr; -}; - struct xcast_addr_list { - struct list_head list; int count; + u64 mc[]; }; struct nicvf_work { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 1e9a31fef729..a26d8bc92e01 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg) work.work); struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); union nic_mbx mbx = {}; - struct xcast_addr *xaddr, *next; + u8 idx = 0; if (!vf_work) return; @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg) /* check if we have any specific MACs to be added to PF DMAC filter */ if (vf_work->mc) { /* now go through kernel list of MACs and add them one by one */ - list_for_each_entry_safe(xaddr, next, -&vf_work->mc->list, list) { + for (idx = 0; idx < vf_work->mc->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = xaddr->addr; + mbx.xcast.data.mac = vf_work->mc->mc[idx]; nicvf_send_msg_to_pf(nic, &mbx); - - /* after receiving ACK from PF release memory */ - list_del(&xaddr->list); - kfree(xaddr); - vf_work->mc->count--; } kfree(vf_work->mc); } @@ -1996,17 +1990,15 @@ static void nicvf_set_rx_mode(struct net_device *netdev) mode |= BGX_XCAST_MCAST_FILTER; /* here we need to copy mc addrs */ if (netdev_mc_count(netdev)) { - struct xcast_addr *xaddr; - - mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); - INIT_LIST_HEAD(&mc_list->list); + mc_list = kmalloc(sizeof(*mc_list) + + sizeof(u64) * netdev_mc_count(netdev), + GFP_ATOMIC); + if (unlikely(!mc_list)) + return; + mc_list->count = 0; netdev_hw_addr_list_for_each(ha, &netdev->mc) { - xaddr = kmalloc(sizeof(*xaddr), - GFP_ATOMIC); - xaddr->addr = + mc_list->mc[mc_list->count] = ether_addr_to_u64(ha->addr); - list_add_tail(&xaddr->list, - &mc_list->list); mc_list->count++; } } -- 2.14.3
Re: [PATCH] net: thunderx: rework mac addresses list to u64 array
Hi Christoph, Thank you for your feedback and time. On Thu, Apr 05, 2018 at 08:07:48AM -0700, Christoph Hellwig wrote: > > struct xcast_addr_list { > > - struct list_head list; > > int count; > > + u64 mc[0]; > > Please use the standard C99 syntax here: > > u64 mc[]; Ok, will update. > > > + mc_list = kmalloc(sizeof(*mc_list) + > > + sizeof(u64) * > > netdev_mc_count(netdev), > > + GFP_ATOMIC); > > kmalloc_array(), please. In this case it would require two memory allocation calls to kmalloc() for xcast_addr_list struct and to kmalloc_array() for 'mc' addresses, becasue of different data types and so two null-ptr checks .. this is what I'd like get rid off. My idea of this was to keep number of array elements and themselves within the same memory block/page to reduce number of memory allocation requests, number of allocated pages/blocks and avoid possible memory fragmentation (however, I believe the latter is already handled at the mm layer). WBR, Vadim
Re: [PATCH] net: thunderx: rework mac addresses list to u64 array
[correct Roberts' address] On Thu, Apr 05, 2018 at 07:57:56AM -0700, Vadim Lomovtsev wrote: > From: Vadim Lomovtsev > > It is too expensive to pass u64 values via linked list, instead > allocate array for them by overall number of mac addresses from netdev. > > This eventually removes multiple kmalloc() calls, aviod memory > fragmentation and allow to put single null check on kmalloc > return value in order to prevent a potential null pointer dereference. > > Addresses-Coverity-ID: 1467429 ("Dereference null return value") > Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback > implementation for VF") > Signed-off-by: Vadim Lomovtsev > --- > drivers/net/ethernet/cavium/thunder/nic.h| 7 +- > drivers/net/ethernet/cavium/thunder/nicvf_main.c | 28 > +--- > 2 files changed, 11 insertions(+), 24 deletions(-) > > diff --git a/drivers/net/ethernet/cavium/thunder/nic.h > b/drivers/net/ethernet/cavium/thunder/nic.h > index 5fc46c5a4f36..da052159f014 100644 > --- a/drivers/net/ethernet/cavium/thunder/nic.h > +++ b/drivers/net/ethernet/cavium/thunder/nic.h > @@ -265,14 +265,9 @@ struct nicvf_drv_stats { > > struct cavium_ptp; > > -struct xcast_addr { > - struct list_head list; > - u64 addr; > -}; > - > struct xcast_addr_list { > - struct list_head list; > int count; > + u64 mc[0]; > }; > > struct nicvf_work { > diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > index 1e9a31fef729..a26d8bc92e01 100644 > --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct > *work_arg) > work.work); > struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); > union nic_mbx mbx = {}; > - struct xcast_addr *xaddr, *next; > + u8 idx = 0; > > if (!vf_work) > return; > @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct work_struct > *work_arg) > /* check if we have any specific MACs to be added to PF DMAC filter */ > if (vf_work->mc) { > /* now go through kernel list of MACs and add them one by one */ > - list_for_each_entry_safe(xaddr, next, > - &vf_work->mc->list, list) { > + for (idx = 0; idx < vf_work->mc->count; idx++) { > mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; > - mbx.xcast.data.mac = xaddr->addr; > + mbx.xcast.data.mac = vf_work->mc->mc[idx]; > nicvf_send_msg_to_pf(nic, &mbx); > - > - /* after receiving ACK from PF release memory */ > - list_del(&xaddr->list); > - kfree(xaddr); > - vf_work->mc->count--; > } > kfree(vf_work->mc); > } > @@ -1996,17 +1990,15 @@ static void nicvf_set_rx_mode(struct net_device > *netdev) > mode |= BGX_XCAST_MCAST_FILTER; > /* here we need to copy mc addrs */ > if (netdev_mc_count(netdev)) { > - struct xcast_addr *xaddr; > - > - mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); > - INIT_LIST_HEAD(&mc_list->list); > + mc_list = kmalloc(sizeof(*mc_list) + > + sizeof(u64) * > netdev_mc_count(netdev), > + GFP_ATOMIC); > + if (unlikely(!mc_list)) > + return; > + mc_list->count = 0; > netdev_hw_addr_list_for_each(ha, &netdev->mc) { > - xaddr = kmalloc(sizeof(*xaddr), > - GFP_ATOMIC); > - xaddr->addr = > + mc_list->mc[mc_list->count] = > ether_addr_to_u64(ha->addr); > - list_add_tail(&xaddr->list, > - &mc_list->list); > mc_list->count++; > } > } > -- > 2.14.3 >
[PATCH] net: thunderx: rework mac addresses list to u64 array
From: Vadim Lomovtsev It is too expensive to pass u64 values via linked list, instead allocate array for them by overall number of mac addresses from netdev. This eventually removes multiple kmalloc() calls, aviod memory fragmentation and allow to put single null check on kmalloc return value in order to prevent a potential null pointer dereference. Addresses-Coverity-ID: 1467429 ("Dereference null return value") Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback implementation for VF") Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h| 7 +- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 28 +--- 2 files changed, 11 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 5fc46c5a4f36..da052159f014 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -265,14 +265,9 @@ struct nicvf_drv_stats { struct cavium_ptp; -struct xcast_addr { - struct list_head list; - u64 addr; -}; - struct xcast_addr_list { - struct list_head list; int count; + u64 mc[0]; }; struct nicvf_work { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 1e9a31fef729..a26d8bc92e01 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg) work.work); struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); union nic_mbx mbx = {}; - struct xcast_addr *xaddr, *next; + u8 idx = 0; if (!vf_work) return; @@ -1956,16 +1956,10 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg) /* check if we have any specific MACs to be added to PF DMAC filter */ if (vf_work->mc) { /* now go through kernel list of MACs and add them one by one */ - list_for_each_entry_safe(xaddr, next, -&vf_work->mc->list, list) { + for (idx = 0; idx < vf_work->mc->count; idx++) { mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; - mbx.xcast.data.mac = xaddr->addr; + mbx.xcast.data.mac = vf_work->mc->mc[idx]; nicvf_send_msg_to_pf(nic, &mbx); - - /* after receiving ACK from PF release memory */ - list_del(&xaddr->list); - kfree(xaddr); - vf_work->mc->count--; } kfree(vf_work->mc); } @@ -1996,17 +1990,15 @@ static void nicvf_set_rx_mode(struct net_device *netdev) mode |= BGX_XCAST_MCAST_FILTER; /* here we need to copy mc addrs */ if (netdev_mc_count(netdev)) { - struct xcast_addr *xaddr; - - mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); - INIT_LIST_HEAD(&mc_list->list); + mc_list = kmalloc(sizeof(*mc_list) + + sizeof(u64) * netdev_mc_count(netdev), + GFP_ATOMIC); + if (unlikely(!mc_list)) + return; + mc_list->count = 0; netdev_hw_addr_list_for_each(ha, &netdev->mc) { - xaddr = kmalloc(sizeof(*xaddr), - GFP_ATOMIC); - xaddr->addr = + mc_list->mc[mc_list->count] = ether_addr_to_u64(ha->addr); - list_add_tail(&xaddr->list, - &mc_list->list); mc_list->count++; } } -- 2.14.3
Re: [v2] net: thunderx: nicvf_main: Fix potential NULL pointer dereferences
Hi guys, Please give me couple days to workout solution for this. I'll post patch for this as soon as I done with my testing. WBR, Vadim On Tue, Apr 03, 2018 at 05:04:23PM -0500, Gustavo A. R. Silva wrote: > Add null check on kmalloc() return value in order to prevent > a null pointer dereference. > > Addresses-Coverity-ID: 1467429 ("Dereference null return value") > Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback > implementation for VF") > Signed-off-by: Gustavo A. R. Silva > --- > Changes in v2: > - Add a null check on a second kmalloc a few lines below. Thanks to >Eric Dumazet for pointing this out. > > drivers/net/ethernet/cavium/thunder/nicvf_main.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > index 1e9a31f..f7b5ca5 100644 > --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c > +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c > @@ -1999,10 +1999,14 @@ static void nicvf_set_rx_mode(struct net_device > *netdev) > struct xcast_addr *xaddr; > > mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); > + if (unlikely(!mc_list)) > + return; > INIT_LIST_HEAD(&mc_list->list); > netdev_hw_addr_list_for_each(ha, &netdev->mc) { > xaddr = kmalloc(sizeof(*xaddr), > GFP_ATOMIC); > + if (unlikely(!xaddr)) > + return; > xaddr->addr = > ether_addr_to_u64(ha->addr); > list_add_tail(&xaddr->list,
Re: [PATCH v2 0/7] net: thunderx: implement DMAC filtering support
On Sat, Mar 31, 2018 at 10:07:30PM -0400, David Miller wrote: > From: Vadim Lomovtsev > Date: Fri, 30 Mar 2018 04:59:46 -0700 > > > From: Vadim Lomovtsev > > > > By default CN88XX BGX accepts all incoming multicast and broadcast > > packets and filtering is disabled. The nic driver doesn't provide > > an ability to change such behaviour. > > > > This series is to implement DMAC filtering management for CN88XX > > nic driver allowing user to enable/disable filtering and configure > > specific MAC addresses to filter traffic. > > > > Changes from v1: > > build issues: > > - update code in order to address compiler warnings; > > checkpatch.pl reported issues: > > - update code in order to fit 80 symbols length; > > - update commit descriptions in order to fit 80 symbols length; > > Series applied. Thank you. WBR, Vadim
[PATCH v2 1/7] net: thunderx: move filter register related macro into proper place
From: Vadim Lomovtsev The ThunderX NIC has set of registers which allows to configure filter policy for ingress packets. There are three possible regimes of filtering multicasts, broadcasts and unicasts: accept all, reject all and accept filter allowed only. Current implementation has enum with all of them and two generic macro for enabling filtering et all (CAM_ACCEPT) and enabling/disabling broadcast packets, which also should be corrected in order to represent register bits properly. All these values are private for driver and there is no need to ‘publish’ them via header file. This commit is to move filtering register manipulation values from header file into source with explicit assignment of exact register values to them to be used while register configuring. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 13 + drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 11 --- 2 files changed, 13 insertions(+), 11 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index 91d34ea40e2c..0dd211605eb1 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -24,6 +24,19 @@ #define DRV_NAME "thunder_bgx" #define DRV_VERSION"1.0" +/* RX_DMAC_CTL configuration */ +enum MCAST_MODE { + MCAST_MODE_REJECT = 0x0, + MCAST_MODE_ACCEPT = 0x1, + MCAST_MODE_CAM_FILTER = 0x2, + RSVD = 0x3 +}; + +#define BCAST_ACCEPT BIT(0) +#define CAM_ACCEPTBIT(3) +#define MCAST_MODE_MASK 0x3 +#define BGX_MCAST_MODE(x) (x << 1) + struct lmac { struct bgx *bgx; int dmac; diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h index 5a7567d31138..52439da62c97 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h @@ -205,17 +205,6 @@ #define LMAC_INTR_LINK_UP BIT(0) #define LMAC_INTR_LINK_DOWNBIT(1) -/* RX_DMAC_CTL configuration*/ -enum MCAST_MODE { - MCAST_MODE_REJECT, - MCAST_MODE_ACCEPT, - MCAST_MODE_CAM_FILTER, - RSVD -}; - -#define BCAST_ACCEPT 1 -#define CAM_ACCEPT 1 - void octeon_mdiobus_force_mod_depencency(void); void bgx_lmac_rx_tx_enable(int node, int bgx_idx, int lmacid, bool enable); void bgx_add_dmac_addr(u64 dmac, int node, int bgx_idx, int lmac); -- 2.14.3
[PATCH v2 0/7] net: thunderx: implement DMAC filtering support
From: Vadim Lomovtsev By default CN88XX BGX accepts all incoming multicast and broadcast packets and filtering is disabled. The nic driver doesn't provide an ability to change such behaviour. This series is to implement DMAC filtering management for CN88XX nic driver allowing user to enable/disable filtering and configure specific MAC addresses to filter traffic. Changes from v1: build issues: - update code in order to address compiler warnings; checkpatch.pl reported issues: - update code in order to fit 80 symbols length; - update commit descriptions in order to fit 80 symbols length; Vadim Lomovtsev (7): net: thunderx: move filter register related macro into proper place net: thunderx: add MAC address filter tracking for LMAC net: thunderx: add multicast filter management support net: thunderx: add new messages for handle ndo_set_rx_mode callback net: thunderx: add XCAST messages handlers for PF net: thunderx: add workqueue control structures for handle ndo_set_rx_mode request net: thunderx: add ndo_set_rx_mode callback implementation for VF drivers/net/ethernet/cavium/thunder/nic.h | 29 drivers/net/ethernet/cavium/thunder/nic_main.c| 45 - drivers/net/ethernet/cavium/thunder/nicvf_main.c | 110 +++- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 201 -- drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 19 +- 5 files changed, 374 insertions(+), 30 deletions(-) -- 2.14.3
[PATCH v2 2/7] net: thunderx: add MAC address filter tracking for LMAC
From: Vadim Lomovtsev The ThunderX NIC has two Ethernet Interfaces (BGX) each of them could has up to four Logical MACs configured. Each of BGX has 32 filters to be configured for filtering ingress packets. The number of filters available to particular LMAC is from 8 (if we have four LMACs configured per BGX) up to 32 (in case of only one LMAC is configured per BGX). At the same time the NIC could present up to 128 VFs to OS as network interfaces, each of them kernel will configure with set of MAC addresses for filtering. So to prevent dupes in BGX filter registers from different network interfaces it is required to cache and track all filter configuration requests prior to applying them onto BGX filter registers. This commit is to update LMAC structures with control fields to allocate/releasing filters tracking list along with implementing dmac array allocate/release per LMAC. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 44 +++ 1 file changed, 30 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index 0dd211605eb1..de90e6aa5a4f 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -37,9 +37,18 @@ enum MCAST_MODE { #define MCAST_MODE_MASK 0x3 #define BGX_MCAST_MODE(x) (x << 1) +struct dmac_map { + u64 vf_map; + u64 dmac; +}; + struct lmac { struct bgx *bgx; - int dmac; + /* actual number of DMACs configured */ + u8 dmacs_cfg; + /* overal number of possible DMACs could be configured per LMAC */ + u8 dmacs_count; + struct dmac_map *dmacs; /* DMAC:VFs tracking filter array */ u8 mac[ETH_ALEN]; u8 lmac_type; u8 lane_to_sds; @@ -236,6 +245,19 @@ void bgx_set_lmac_mac(int node, int bgx_idx, int lmacid, const u8 *mac) } EXPORT_SYMBOL(bgx_set_lmac_mac); +static void bgx_flush_dmac_cam_filter(struct bgx *bgx, int lmacid) +{ + struct lmac *lmac = NULL; + u8 idx = 0; + + lmac = &bgx->lmac[lmacid]; + /* reset CAM filters */ + for (idx = 0; idx < lmac->dmacs_count; idx++) + bgx_reg_write(bgx, 0, BGX_CMR_RX_DMACX_CAM + + ((lmacid * lmac->dmacs_count) + idx) * + sizeof(u64), 0); +} + void bgx_lmac_rx_tx_enable(int node, int bgx_idx, int lmacid, bool enable) { struct bgx *bgx = get_bgx(node, bgx_idx); @@ -481,18 +503,6 @@ u64 bgx_get_tx_stats(int node, int bgx_idx, int lmac, int idx) } EXPORT_SYMBOL(bgx_get_tx_stats); -static void bgx_flush_dmac_addrs(struct bgx *bgx, int lmac) -{ - u64 offset; - - while (bgx->lmac[lmac].dmac > 0) { - offset = ((bgx->lmac[lmac].dmac - 1) * sizeof(u64)) + - (lmac * MAX_DMAC_PER_LMAC * sizeof(u64)); - bgx_reg_write(bgx, 0, BGX_CMR_RX_DMACX_CAM + offset, 0); - bgx->lmac[lmac].dmac--; - } -} - /* Configure BGX LMAC in internal loopback mode */ void bgx_lmac_internal_loopback(int node, int bgx_idx, int lmac_idx, bool enable) @@ -925,6 +935,11 @@ static int bgx_lmac_enable(struct bgx *bgx, u8 lmacid) bgx_reg_write(bgx, lmacid, BGX_SMUX_TX_MIN_PKT, 60 + 4); } + /* actual number of filters available to exact LMAC */ + lmac->dmacs_count = (RX_DMAC_COUNT / bgx->lmac_count); + lmac->dmacs = kcalloc(lmac->dmacs_count, sizeof(*lmac->dmacs), + GFP_KERNEL); + /* Enable lmac */ bgx_reg_modify(bgx, lmacid, BGX_CMRX_CFG, CMR_EN); @@ -1011,7 +1026,8 @@ static void bgx_lmac_disable(struct bgx *bgx, u8 lmacid) cfg &= ~CMR_EN; bgx_reg_write(bgx, lmacid, BGX_CMRX_CFG, cfg); - bgx_flush_dmac_addrs(bgx, lmacid); + bgx_flush_dmac_cam_filter(bgx, lmacid); + kfree(lmac->dmacs); if ((lmac->lmac_type != BGX_MODE_XFI) && (lmac->lmac_type != BGX_MODE_XLAUI) && -- 2.14.3
[PATCH v2 5/7] net: thunderx: add XCAST messages handlers for PF
From: Vadim Lomovtsev This commit is to add message handling for ndo_set_rx_mode() callback at PF side. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic_main.c | 45 +++--- 1 file changed, 41 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 7ff66a8194e2..55af04fa03a7 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -21,6 +21,8 @@ #define DRV_NAME "nicpf" #define DRV_VERSION"1.0" +#define NIC_VF_PER_MBX_REG 64 + struct hw_info { u8 bgx_cnt; u8 chans_per_lmac; @@ -1072,6 +1074,40 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) case NIC_MBOX_MSG_PTP_CFG: nic_config_timestamp(nic, vf, &mbx.ptp); break; + case NIC_MBOX_MSG_RESET_XCAST: + if (vf >= nic->num_vf_en) { + ret = -1; /* NACK */ + break; + } + bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + bgx_reset_xcast_mode(nic->node, bgx, lmac, +vf < NIC_VF_PER_MBX_REG ? vf : +vf - NIC_VF_PER_MBX_REG); + break; + + case NIC_MBOX_MSG_ADD_MCAST: + if (vf >= nic->num_vf_en) { + ret = -1; /* NACK */ + break; + } + bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + bgx_set_dmac_cam_filter(nic->node, bgx, lmac, + mbx.xcast.data.mac, + vf < NIC_VF_PER_MBX_REG ? vf : + vf - NIC_VF_PER_MBX_REG); + break; + + case NIC_MBOX_MSG_SET_XCAST: + if (vf >= nic->num_vf_en) { + ret = -1; /* NACK */ + break; + } + bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.data.mode); + break; default: dev_err(&nic->pdev->dev, "Invalid msg from VF%d, msg 0x%x\n", vf, mbx.msg.msg); @@ -1094,7 +1130,7 @@ static irqreturn_t nic_mbx_intr_handler(int irq, void *nic_irq) struct nicpf *nic = (struct nicpf *)nic_irq; int mbx; u64 intr; - u8 vf, vf_per_mbx_reg = 64; + u8 vf; if (irq == pci_irq_vector(nic->pdev, NIC_PF_INTR_ID_MBOX0)) mbx = 0; @@ -1103,12 +1139,13 @@ static irqreturn_t nic_mbx_intr_handler(int irq, void *nic_irq) intr = nic_reg_read(nic, NIC_PF_MAILBOX_INT + (mbx << 3)); dev_dbg(&nic->pdev->dev, "PF interrupt Mbox%d 0x%llx\n", mbx, intr); - for (vf = 0; vf < vf_per_mbx_reg; vf++) { + for (vf = 0; vf < NIC_VF_PER_MBX_REG; vf++) { if (intr & (1ULL << vf)) { dev_dbg(&nic->pdev->dev, "Intr from VF %d\n", - vf + (mbx * vf_per_mbx_reg)); + vf + (mbx * NIC_VF_PER_MBX_REG)); - nic_handle_mbx_intr(nic, vf + (mbx * vf_per_mbx_reg)); + nic_handle_mbx_intr(nic, vf + + (mbx * NIC_VF_PER_MBX_REG)); nic_clear_mbx_intr(nic, vf, mbx); } } -- 2.14.3
[PATCH v2 3/7] net: thunderx: add multicast filter management support
From: Vadim Lomovtsev The ThunderX NIC could be partitioned to up to 128 VFs and thus represented to system. Each VF is mapped to pair BGX:LMAC, and each of VF is configured by kernel individually. Eventually the bunch of VFs could be mapped onto same pair BGX:LMAC and thus could cause several multicast filtering configuration requests to LMAC with the same MAC addresses. This commit is to add ThunderX NIC BGX filtering manipulation routines. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 144 ++ drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 10 +- 2 files changed, 153 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index de90e6aa5a4f..5d08d2aeb172 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -258,6 +258,150 @@ static void bgx_flush_dmac_cam_filter(struct bgx *bgx, int lmacid) sizeof(u64), 0); } +static void bgx_lmac_remove_filters(struct lmac *lmac, u8 vf_id) +{ + int i = 0; + + if (!lmac) + return; + + /* We've got reset filters request from some of attached VF, while the +* others might want to keep their configuration. So in this case lets +* iterate over all of configured filters and decrease number of +* referencies. if some addresses get zero refs remove them from list +*/ + for (i = lmac->dmacs_cfg - 1; i >= 0; i--) { + lmac->dmacs[i].vf_map &= ~BIT_ULL(vf_id); + if (!lmac->dmacs[i].vf_map) { + lmac->dmacs_cfg--; + lmac->dmacs[i].dmac = 0; + lmac->dmacs[i].vf_map = 0; + } + } +} + +static int bgx_lmac_save_filter(struct lmac *lmac, u64 dmac, u8 vf_id) +{ + u8 i = 0; + + if (!lmac) + return -1; + + /* At the same time we could have several VFs 'attached' to some +* particular LMAC, and each VF is represented as network interface +* for kernel. So from user perspective it should be possible to +* manipulate with its' (VF) receive modes. However from PF +* driver perspective we need to keep track of filter configurations +* for different VFs to prevent filter values dupes +*/ + for (i = 0; i < lmac->dmacs_cfg; i++) { + if (lmac->dmacs[i].dmac == dmac) { + lmac->dmacs[i].vf_map |= BIT_ULL(vf_id); + return -1; + } + } + + if (!(lmac->dmacs_cfg < lmac->dmacs_count)) + return -1; + + /* keep it for further tracking */ + lmac->dmacs[lmac->dmacs_cfg].dmac = dmac; + lmac->dmacs[lmac->dmacs_cfg].vf_map = BIT_ULL(vf_id); + lmac->dmacs_cfg++; + return 0; +} + +static int bgx_set_dmac_cam_filter_mac(struct bgx *bgx, int lmacid, + u64 cam_dmac, u8 idx) +{ + struct lmac *lmac = NULL; + u64 cfg = 0; + + /* skip zero addresses as meaningless */ + if (!cam_dmac || !bgx) + return -1; + + lmac = &bgx->lmac[lmacid]; + + /* configure DCAM filtering for designated LMAC */ + cfg = RX_DMACX_CAM_LMACID(lmacid & LMAC_ID_MASK) | + RX_DMACX_CAM_EN | cam_dmac; + bgx_reg_write(bgx, 0, BGX_CMR_RX_DMACX_CAM + + ((lmacid * lmac->dmacs_count) + idx) * sizeof(u64), cfg); + return 0; +} + +void bgx_set_dmac_cam_filter(int node, int bgx_idx, int lmacid, +u64 cam_dmac, u8 vf_id) +{ + struct bgx *bgx = get_bgx(node, bgx_idx); + struct lmac *lmac = NULL; + + if (!bgx) + return; + + lmac = &bgx->lmac[lmacid]; + + if (!cam_dmac) + cam_dmac = ether_addr_to_u64(lmac->mac); + + /* since we might have several VFs attached to particular LMAC +* and kernel could call mcast config for each of them with the +* same MAC, check if requested MAC is already in filtering list and +* updare/prepare list of MACs to be applied later to HW filters +*/ + bgx_lmac_save_filter(lmac, cam_dmac, vf_id); +} +EXPORT_SYMBOL(bgx_set_dmac_cam_filter); + +void bgx_set_xcast_mode(int node, int bgx_idx, int lmacid, u8 mode) +{ + struct bgx *bgx = get_bgx(node, bgx_idx); + struct lmac *lmac = NULL; + u64 cfg = 0; + u8 i = 0; + + if (!bgx) + return; + + lmac = &bgx->lmac[lmacid]; + + cfg = bgx_reg_read(bgx, lmacid, BGX_CMRX_RX_DMAC_CTL); + if (mode & BGX_XCAST_BCAST_ACCEPT) + cfg |= BCAST_ACCEPT; + else + cfg &= ~BCAST_AC
[PATCH v2 4/7] net: thunderx: add new messages for handle ndo_set_rx_mode callback
From: Vadim Lomovtsev The kernel calls ndo_set_rx_mode() callback supplying it will all necessary info, such as device state flags, multicast mac addresses list and so on. Since we have only 128 bits to communicate with PF we need to initiate several requests to PF with small/short operation each based on input data. So this commit implements following PF messages codes along with new data structures for them: NIC_MBOX_MSG_RESET_XCAST to flush all filters configured for this particular network interface (VF) NIC_MBOX_MSG_ADD_MCAST to add new MAC address to DMAC filter registers for this particular network interface (VF) NIC_MBOX_MSG_SET_XCAST to apply filtering configuration to filter control register Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h | 12 1 file changed, 12 insertions(+) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 4cacce5d2b16..069289b4f968 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -403,6 +403,9 @@ struct nicvf { #defineNIC_MBOX_MSG_PTP_CFG0x19/* HW packet timestamp */ #defineNIC_MBOX_MSG_CFG_DONE 0xF0/* VF configuration done */ #defineNIC_MBOX_MSG_SHUTDOWN 0xF1/* VF is being shutdown */ +#defineNIC_MBOX_MSG_RESET_XCAST0xF2/* Reset DCAM filtering mode */ +#defineNIC_MBOX_MSG_ADD_MCAST 0xF3/* Add MAC to DCAM filters */ +#defineNIC_MBOX_MSG_SET_XCAST 0xF4/* Set MCAST/BCAST RX mode */ struct nic_cfg_msg { u8msg; @@ -556,6 +559,14 @@ struct set_ptp { bool enable; }; +struct xcast { + u8msg; + union { + u8mode; + u64 mac; + } data; +}; + /* 128 bit shared memory between PF and each VF */ union nic_mbx { struct { u8 msg; } msg; @@ -576,6 +587,7 @@ union nic_mbx { struct reset_stat_cfg reset_stat; struct pfc pfc; struct set_ptp ptp; + struct xcastxcast; }; #define NIC_NODE_ID_MASK 0x03 -- 2.14.3
[PATCH v2 7/7] net: thunderx: add ndo_set_rx_mode callback implementation for VF
From: Vadim Lomovtsev The ndo_set_rx_mode() is called from atomic context which causes messages response timeouts while VF to PF communication via MSIx. To get rid of that we're copy passed mc list, parse flags and queue handling of kernel request to ordered workqueue. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 110 ++- 1 file changed, 109 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 7d9c5ffbd041..c8a8faaf17e9 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -21,6 +21,7 @@ #include #include #include +#include #include "nic_reg.h" #include "nic.h" @@ -67,6 +68,9 @@ module_param(cpi_alg, int, S_IRUGO); MODULE_PARM_DESC(cpi_alg, "PFC algorithm (0=none, 1=VLAN, 2=VLAN16, 3=IP Diffserv)"); +/* workqueue for handling kernel ndo_set_rx_mode() calls */ +static struct workqueue_struct *nicvf_rx_mode_wq; + static inline u8 nicvf_netdev_qidx(struct nicvf *nic, u8 qidx) { if (nic->sqs_mode) @@ -1919,6 +1923,100 @@ static int nicvf_ioctl(struct net_device *netdev, struct ifreq *req, int cmd) } } +static void nicvf_set_rx_mode_task(struct work_struct *work_arg) +{ + struct nicvf_work *vf_work = container_of(work_arg, struct nicvf_work, + work.work); + struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); + union nic_mbx mbx = {}; + struct xcast_addr *xaddr, *next; + + if (!vf_work) + return; + + /* From the inside of VM code flow we have only 128 bits memory +* available to send message to host's PF, so send all mc addrs +* one by one, starting from flush command in case if kernel +* requests to configure specific MAC filtering +*/ + + /* flush DMAC filters and reset RX mode */ + mbx.xcast.msg = NIC_MBOX_MSG_RESET_XCAST; + nicvf_send_msg_to_pf(nic, &mbx); + + if (vf_work->mode & BGX_XCAST_MCAST_FILTER) { + /* once enabling filtering, we need to signal to PF to add +* its' own LMAC to the filter to accept packets for it. +*/ + mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; + mbx.xcast.data.mac = 0; + nicvf_send_msg_to_pf(nic, &mbx); + } + + /* check if we have any specific MACs to be added to PF DMAC filter */ + if (vf_work->mc) { + /* now go through kernel list of MACs and add them one by one */ + list_for_each_entry_safe(xaddr, next, +&vf_work->mc->list, list) { + mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; + mbx.xcast.data.mac = xaddr->addr; + nicvf_send_msg_to_pf(nic, &mbx); + + /* after receiving ACK from PF release memory */ + list_del(&xaddr->list); + kfree(xaddr); + vf_work->mc->count--; + } + kfree(vf_work->mc); + } + + /* and finally set rx mode for PF accordingly */ + mbx.xcast.msg = NIC_MBOX_MSG_SET_XCAST; + mbx.xcast.data.mode = vf_work->mode; + + nicvf_send_msg_to_pf(nic, &mbx); +} + +static void nicvf_set_rx_mode(struct net_device *netdev) +{ + struct nicvf *nic = netdev_priv(netdev); + struct netdev_hw_addr *ha; + struct xcast_addr_list *mc_list = NULL; + u8 mode = 0; + + if (netdev->flags & IFF_PROMISC) { + mode = BGX_XCAST_BCAST_ACCEPT | BGX_XCAST_MCAST_ACCEPT; + } else { + if (netdev->flags & IFF_BROADCAST) + mode |= BGX_XCAST_BCAST_ACCEPT; + + if (netdev->flags & IFF_ALLMULTI) { + mode |= BGX_XCAST_MCAST_ACCEPT; + } else if (netdev->flags & IFF_MULTICAST) { + mode |= BGX_XCAST_MCAST_FILTER; + /* here we need to copy mc addrs */ + if (netdev_mc_count(netdev)) { + struct xcast_addr *xaddr; + + mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); + INIT_LIST_HEAD(&mc_list->list); + netdev_hw_addr_list_for_each(ha, &netdev->mc) { + xaddr = kmalloc(sizeof(*xaddr), + GFP_ATOMIC); + xaddr->addr = + ether_addr_to_
[PATCH v2 6/7] net: thunderx: add workqueue control structures for handle ndo_set_rx_mode request
From: Vadim Lomovtsev The kernel calls ndo_set_rx_mode() callback from atomic context which causes messaging timeouts between VF and PF (as they’re implemented via MSIx). So in order to handle ndo_set_rx_mode() we need to get rid of it. This commit implements necessary workqueue related structures to let VF queue kernel request processing in non-atomic context later. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h | 17 + 1 file changed, 17 insertions(+) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 069289b4f968..5fc46c5a4f36 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -265,6 +265,22 @@ struct nicvf_drv_stats { struct cavium_ptp; +struct xcast_addr { + struct list_head list; + u64 addr; +}; + +struct xcast_addr_list { + struct list_head list; + int count; +}; + +struct nicvf_work { + struct delayed_workwork; + u8 mode; + struct xcast_addr_list *mc; +}; + struct nicvf { struct nicvf*pnicvf; struct net_device *netdev; @@ -313,6 +329,7 @@ struct nicvf { struct nicvf_pfcpfc; struct tasklet_struct qs_err_task; struct work_struct reset_task; + struct nicvf_work rx_mode_work; /* PTP timestamp */ struct cavium_ptp *ptp_clock; -- 2.14.3
Re: [PATCH 0/7] net: thunderx: implement DMAC filtering support
Hi David, Thanks for feedback. On Tue, Mar 27, 2018 at 01:28:22PM -0400, David Miller wrote: > From: Vadim Lomovtsev > Date: Tue, 27 Mar 2018 08:07:29 -0700 > > > From: Vadim Lomovtsev > > > > By default CN88XX BGX accepts all incoming multicast and broadcast > > packets and filtering is disabled. The nic driver doesn't provide > > an ability to change such behaviour. > > > > This series is to implement DMAC filtering management for CN88XX > > nic driver allowing user to enable/disable filtering and configure > > specific MAC addresses to filter traffic. > > This doesn't even compile: > > drivers/net/ethernet/cavium/thunder/thunder_bgx.c: In function > ‘bgx_lmac_save_filter’: > drivers/net/ethernet/cavium/thunder/thunder_bgx.c:286:3: warning: ‘return’ > with no value, in function returning non-void [-Wreturn-type] >return; >^~ > drivers/net/ethernet/cavium/thunder/thunder_bgx.c:281:12: note: declared here > static int bgx_lmac_save_filter(struct lmac *lmac, u64 dmac, u8 vf_id) > ^~~~ > In file included from drivers/net/ethernet/cavium/thunder/nic.h:15:0, > from drivers/net/ethernet/cavium/thunder/thunder_bgx.c:21: > drivers/net/ethernet/cavium/thunder/thunder_bgx.c: In function > ‘bgx_set_dmac_cam_filter_mac’: > drivers/net/ethernet/cavium/thunder/thunder_bgx.h:61:38: warning: left shift > count >= width of type [-Wshift-count-overflow] > #define RX_DMACX_CAM_LMACID(x) (x << 49) > ^ > drivers/net/ethernet/cavium/thunder/thunder_bgx.c:324:8: note: in expansion > of macro ‘RX_DMACX_CAM_LMACID’ > cfg = RX_DMACX_CAM_LMACID(lmacid & LMAC_ID_MASK) | > ^~~ Will update and repost series as v2. Do you have any other comments ? WBR, Vadim
[PATCH 2/7] net: thunderx: add MAC address filter tracking for LMAC
From: Vadim Lomovtsev The ThunderX NIC has two Ethernet Interfaces (BGX) each of them could has up to four Logical MACs configured. Each of BGX has 32 filters to be configured for filtering ingress packets. The number of filters available to particular LMAC is from 8 (if we have four LMACs configured per BGX) up to 32 (in case of only one LMAC is configured per BGX). At the same time the NIC could present up to 128 VFs to OS as network interfaces, each of them kernel will configure with set of MAC addresses for filtering. So to prevent dupes in BGX filter registers from different network interfaces it is required to cache and track all filter configuration requests prior to applying them onto BGX filter registers. This commit is to update LMAC structures with control fields to allocate/releasing filters tracking list along with implementing dmac array allocate/release per LMAC. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 42 +++ 1 file changed, 28 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index 0dd211605eb1..cf0cc19c03c5 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -37,9 +37,18 @@ enum MCAST_MODE { #define MCAST_MODE_MASK 0x3 #define BGX_MCAST_MODE(x) (x << 1) +struct dmac_map { + u64 vf_map; + u64 dmac; +}; + struct lmac { struct bgx *bgx; - int dmac; + /* actual number of DMACs configured */ + u8 dmacs_cfg; + /* overal number of possible DMACs could be configured per LMAC */ + u8 dmacs_count; + struct dmac_map *dmacs; /* DMAC:VFs tracking filter array */ u8 mac[ETH_ALEN]; u8 lmac_type; u8 lane_to_sds; @@ -236,6 +245,17 @@ void bgx_set_lmac_mac(int node, int bgx_idx, int lmacid, const u8 *mac) } EXPORT_SYMBOL(bgx_set_lmac_mac); +static void bgx_flush_dmac_cam_filter(struct bgx *bgx, int lmacid) +{ + struct lmac *lmac = NULL; + u8 idx = 0; + + lmac = &bgx->lmac[lmacid]; + /* reset CAM filters */ + for (idx = 0; idx < lmac->dmacs_count; idx++) + bgx_reg_write(bgx, 0, BGX_CMR_RX_DMACX_CAM + ((lmacid * lmac->dmacs_count) + idx) * sizeof(u64), 0); +} + void bgx_lmac_rx_tx_enable(int node, int bgx_idx, int lmacid, bool enable) { struct bgx *bgx = get_bgx(node, bgx_idx); @@ -481,18 +501,6 @@ u64 bgx_get_tx_stats(int node, int bgx_idx, int lmac, int idx) } EXPORT_SYMBOL(bgx_get_tx_stats); -static void bgx_flush_dmac_addrs(struct bgx *bgx, int lmac) -{ - u64 offset; - - while (bgx->lmac[lmac].dmac > 0) { - offset = ((bgx->lmac[lmac].dmac - 1) * sizeof(u64)) + - (lmac * MAX_DMAC_PER_LMAC * sizeof(u64)); - bgx_reg_write(bgx, 0, BGX_CMR_RX_DMACX_CAM + offset, 0); - bgx->lmac[lmac].dmac--; - } -} - /* Configure BGX LMAC in internal loopback mode */ void bgx_lmac_internal_loopback(int node, int bgx_idx, int lmac_idx, bool enable) @@ -925,6 +933,11 @@ static int bgx_lmac_enable(struct bgx *bgx, u8 lmacid) bgx_reg_write(bgx, lmacid, BGX_SMUX_TX_MIN_PKT, 60 + 4); } + /* actual number of filters available to exact LMAC */ + lmac->dmacs_count = (RX_DMAC_COUNT / bgx->lmac_count); + lmac->dmacs = kcalloc(lmac->dmacs_count, sizeof(*lmac->dmacs), + GFP_KERNEL); + /* Enable lmac */ bgx_reg_modify(bgx, lmacid, BGX_CMRX_CFG, CMR_EN); @@ -1011,7 +1024,8 @@ static void bgx_lmac_disable(struct bgx *bgx, u8 lmacid) cfg &= ~CMR_EN; bgx_reg_write(bgx, lmacid, BGX_CMRX_CFG, cfg); - bgx_flush_dmac_addrs(bgx, lmacid); + bgx_flush_dmac_cam_filter(bgx, lmacid); + kfree(lmac->dmacs); if ((lmac->lmac_type != BGX_MODE_XFI) && (lmac->lmac_type != BGX_MODE_XLAUI) && -- 2.14.3
[PATCH 3/7] net: thunderx: add multicast filter management support
From: Vadim Lomovtsev The ThunderX NIC could be partitioned to up to 128 VFs and thus represented to system. Each VF is mapped to pair BGX:LMAC, and each of VF is configured by kernel individually. Eventually the bunch of VFs could be mapped onto same pair BGX:LMAC and thus could cause several multicast filtering configuration requests to LMAC with the same MAC addresses. This commit is to add ThunderX NIC BGX filtering manipulation routines. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 141 ++ drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 8 ++ 2 files changed, 149 insertions(+) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index cf0cc19c03c5..52fef3dab0a3 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -256,6 +256,147 @@ static void bgx_flush_dmac_cam_filter(struct bgx *bgx, int lmacid) bgx_reg_write(bgx, 0, BGX_CMR_RX_DMACX_CAM + ((lmacid * lmac->dmacs_count) + idx) * sizeof(u64), 0); } +static void bgx_lmac_remove_filters(struct lmac *lmac, u8 vf_id) +{ + int i = 0; + + if (!lmac) + return; + + /* We've got reset filters request from some of attached VF, while the +* others might want to keep their configuration. So in this case lets +* iterate over all of configured filters and decrease number of +* referencies. if some addresses get zero refs remove them from list +*/ + for (i = lmac->dmacs_cfg - 1; i >= 0; i--) { + lmac->dmacs[i].vf_map &= ~BIT_ULL(vf_id); + if (!lmac->dmacs[i].vf_map) { + lmac->dmacs_cfg--; + lmac->dmacs[i].dmac = 0; + lmac->dmacs[i].vf_map = 0; + } + } +} + +static int bgx_lmac_save_filter(struct lmac *lmac, u64 dmac, u8 vf_id) +{ + u8 i = 0; + + if (!lmac) + return; + + /* At the same time we could have several VFs 'attached' to some +* particular LMAC, and each VF is represented as network interface +* for kernel. So from user perspective it should be possible to +* manipulate with its' (VF) receive modes. However from PF +* driver perspective we need to keep track of filter configurations +* for different VFs to prevent filter values dupes +*/ + for (i = 0; i < lmac->dmacs_cfg; i++) { + if (lmac->dmacs[i].dmac == dmac) { + lmac->dmacs[i].vf_map |= BIT_ULL(vf_id); + return -1; + } + } + + if (!(lmac->dmacs_cfg < lmac->dmacs_count)) + return -1; + + /* keep it for further tracking */ + lmac->dmacs[lmac->dmacs_cfg].dmac = dmac; + lmac->dmacs[lmac->dmacs_cfg].vf_map = BIT_ULL(vf_id); + lmac->dmacs_cfg++; + return 0; +} + +static int bgx_set_dmac_cam_filter_mac(struct bgx *bgx, int lmacid, u64 cam_dmac, u8 idx) +{ + struct lmac *lmac = NULL; + u64 cfg = 0; + + /* skip zero addresses as meaningless */ + if (!cam_dmac || !bgx) + return -1; + + lmac = &bgx->lmac[lmacid]; + + /* configure DCAM filtering for designated LMAC */ + cfg = RX_DMACX_CAM_LMACID(lmacid & LMAC_ID_MASK) | + RX_DMACX_CAM_EN | cam_dmac; + bgx_reg_write(bgx, 0, BGX_CMR_RX_DMACX_CAM + ((lmacid * lmac->dmacs_count) + idx) * sizeof(u64), cfg); + return 0; +} + +void bgx_set_dmac_cam_filter(int node, int bgx_idx, int lmacid, u64 cam_dmac, u8 vf_id) +{ + struct bgx *bgx = get_bgx(node, bgx_idx); + struct lmac *lmac = NULL; + + if (!bgx) + return; + + lmac = &bgx->lmac[lmacid]; + + if (!cam_dmac) + cam_dmac = ether_addr_to_u64(lmac->mac); + + /* since we might have several VFs attached to particular LMAC +* and kernel could call mcast config for each of them with the +* same MAC, check if requested MAC is already in filtering list and +* updare/prepare list of MACs to be applied later to HW filters +*/ + bgx_lmac_save_filter(lmac, cam_dmac, vf_id); +} +EXPORT_SYMBOL(bgx_set_dmac_cam_filter); + +void bgx_set_xcast_mode(int node, int bgx_idx, int lmacid, u8 mode) +{ + struct bgx *bgx = get_bgx(node, bgx_idx); + struct lmac *lmac = NULL; + u64 cfg = 0; + u8 i = 0; + + if (!bgx) + return; + + lmac = &bgx->lmac[lmacid]; + + cfg = bgx_reg_read(bgx, lmacid, BGX_CMRX_RX_DMAC_CTL); + if (mode & BGX_XCAST_BCAST_ACCEPT) + cfg |= BCAST_ACCEPT; + else + cfg &= ~BCAST_ACCEPT; + + /* disable all MCASTs
[PATCH 1/7] net: thunderx: move filter register related macro into proper place
From: Vadim Lomovtsev The ThunderX NIC has set of registers which allows to configure filter policy for ingress packets. There are three possible regimes of filtering multicasts, broadcasts and unicasts: accept all, reject all and accept filter allowed only. Current implementation has enum with all of them and two generic macro for enabling filtering et all (CAM_ACCEPT) and enabling/disabling broadcast packets, which also should be corrected in order to represent register bits properly. All these values are private for driver and there is no need to ‘publish’ them via header file. This commit is to move filtering register manipulation values from header file into source with explicit assignment of exact register values to them to be used while register configuring. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 13 + drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 11 --- 2 files changed, 13 insertions(+), 11 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index 91d34ea40e2c..0dd211605eb1 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -24,6 +24,19 @@ #define DRV_NAME "thunder_bgx" #define DRV_VERSION"1.0" +/* RX_DMAC_CTL configuration */ +enum MCAST_MODE { + MCAST_MODE_REJECT = 0x0, + MCAST_MODE_ACCEPT = 0x1, + MCAST_MODE_CAM_FILTER = 0x2, + RSVD = 0x3 +}; + +#define BCAST_ACCEPT BIT(0) +#define CAM_ACCEPTBIT(3) +#define MCAST_MODE_MASK 0x3 +#define BGX_MCAST_MODE(x) (x << 1) + struct lmac { struct bgx *bgx; int dmac; diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h index 5a7567d31138..52439da62c97 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h @@ -205,17 +205,6 @@ #define LMAC_INTR_LINK_UP BIT(0) #define LMAC_INTR_LINK_DOWNBIT(1) -/* RX_DMAC_CTL configuration*/ -enum MCAST_MODE { - MCAST_MODE_REJECT, - MCAST_MODE_ACCEPT, - MCAST_MODE_CAM_FILTER, - RSVD -}; - -#define BCAST_ACCEPT 1 -#define CAM_ACCEPT 1 - void octeon_mdiobus_force_mod_depencency(void); void bgx_lmac_rx_tx_enable(int node, int bgx_idx, int lmacid, bool enable); void bgx_add_dmac_addr(u64 dmac, int node, int bgx_idx, int lmac); -- 2.14.3
[PATCH 7/7] net: thunderx: add ndo_set_rx_mode callback implementation for VF
From: Vadim Lomovtsev The ndo_set_rx_mode() is called from atomic context which causes messages response timeouts while VF to PF communication via MSIx. To get rid of that we're copy passed mc list, parse flags and queue handling of kernel request to ordered workqueue. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 108 ++- 1 file changed, 107 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 7d9c5ffbd041..0d4bec927623 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -21,6 +21,7 @@ #include #include #include +#include #include "nic_reg.h" #include "nic.h" @@ -67,6 +68,9 @@ module_param(cpi_alg, int, S_IRUGO); MODULE_PARM_DESC(cpi_alg, "PFC algorithm (0=none, 1=VLAN, 2=VLAN16, 3=IP Diffserv)"); +/* workqueue for handling kernel ndo_set_rx_mode() calls */ +static struct workqueue_struct *nicvf_rx_mode_wq; + static inline u8 nicvf_netdev_qidx(struct nicvf *nic, u8 qidx) { if (nic->sqs_mode) @@ -1919,6 +1923,98 @@ static int nicvf_ioctl(struct net_device *netdev, struct ifreq *req, int cmd) } } +static void nicvf_set_rx_mode_task(struct work_struct *work_arg) +{ + struct nicvf_work *vf_work = container_of(work_arg, struct nicvf_work, + work.work); + struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work); + union nic_mbx mbx = {}; + struct xcast_addr *xaddr, *next; + + if (!vf_work) + return; + + /* From the inside of VM code flow we have only 128 bits memory +* available to send message to host's PF, so send all mc addrs +* one by one, starting from flush command in case if kernel +* requests to configure specific MAC filtering +*/ + + /* flush DMAC filters and reset RX mode */ + mbx.xcast.msg = NIC_MBOX_MSG_RESET_XCAST; + nicvf_send_msg_to_pf(nic, &mbx); + + if (vf_work->mode & BGX_XCAST_MCAST_FILTER) { + /* once enabling filtering, we need to signal to PF to add +* its' own LMAC to the filter to accept packets for it. +*/ + mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; + mbx.xcast.data.mac = 0; + nicvf_send_msg_to_pf(nic, &mbx); + } + + /* check if we have any specific MACs to be added to PF DMAC filter */ + if (vf_work->mc) { + /* now go through kernel list of MACs and add them one by one */ + list_for_each_entry_safe(xaddr, next, &vf_work->mc->list, list) { + mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST; + mbx.xcast.data.mac = xaddr->addr; + nicvf_send_msg_to_pf(nic, &mbx); + + /* after receiving ACK from PF release memory */ + list_del(&xaddr->list); + kfree(xaddr); + vf_work->mc->count--; + } + kfree(vf_work->mc); + } + + /* and finally set rx mode for PF accordingly */ + mbx.xcast.msg = NIC_MBOX_MSG_SET_XCAST; + mbx.xcast.data.mode = vf_work->mode; + + nicvf_send_msg_to_pf(nic, &mbx); +} + +static void nicvf_set_rx_mode(struct net_device *netdev) +{ + struct nicvf *nic = netdev_priv(netdev); + struct netdev_hw_addr *ha; + struct xcast_addr_list *mc_list = NULL; + u8 mode = 0; + + if (netdev->flags & IFF_PROMISC) { + mode = BGX_XCAST_BCAST_ACCEPT | BGX_XCAST_MCAST_ACCEPT; + } else { + if (netdev->flags & IFF_BROADCAST) + mode |= BGX_XCAST_BCAST_ACCEPT; + + if (netdev->flags & IFF_ALLMULTI) { + mode |= BGX_XCAST_MCAST_ACCEPT; + } else if (netdev->flags & IFF_MULTICAST) { + mode |= BGX_XCAST_MCAST_FILTER; + /* here we need to copy mc addrs */ + if (netdev_mc_count(netdev)) { + struct xcast_addr *xaddr; + + mc_list = kmalloc(sizeof(*mc_list), GFP_ATOMIC); + INIT_LIST_HEAD(&mc_list->list); + netdev_hw_addr_list_for_each(ha, &netdev->mc) { + xaddr = kmalloc(sizeof(*xaddr), + GFP_ATOMIC); + xaddr->addr = ether_addr_to_u64(ha->addr); +
[PATCH 6/7] net: thunderx: add workqueue control structures for handle ndo_set_rx_mode request
From: Vadim Lomovtsev The kernel calls ndo_set_rx_mode() callback from atomic context which causes messaging timeouts between VF and PF (as they’re implemented via MSIx). So in order to handle ndo_set_rx_mode() we need to get rid of it. This commit implements necessary workqueue related structures to let VF queue kernel request processing in non-atomic context later. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h | 17 + 1 file changed, 17 insertions(+) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 069289b4f968..5fc46c5a4f36 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -265,6 +265,22 @@ struct nicvf_drv_stats { struct cavium_ptp; +struct xcast_addr { + struct list_head list; + u64 addr; +}; + +struct xcast_addr_list { + struct list_head list; + int count; +}; + +struct nicvf_work { + struct delayed_workwork; + u8 mode; + struct xcast_addr_list *mc; +}; + struct nicvf { struct nicvf*pnicvf; struct net_device *netdev; @@ -313,6 +329,7 @@ struct nicvf { struct nicvf_pfcpfc; struct tasklet_struct qs_err_task; struct work_struct reset_task; + struct nicvf_work rx_mode_work; /* PTP timestamp */ struct cavium_ptp *ptp_clock; -- 2.14.3
[PATCH 5/7] net: thunderx: add XCAST messages handlers for PF
From: Vadim Lomovtsev This commit is to add message handling for ndo_set_rx_mode() callback at PF side. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic_main.c | 45 +++--- 1 file changed, 41 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 7ff66a8194e2..55af04fa03a7 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -21,6 +21,8 @@ #define DRV_NAME "nicpf" #define DRV_VERSION"1.0" +#define NIC_VF_PER_MBX_REG 64 + struct hw_info { u8 bgx_cnt; u8 chans_per_lmac; @@ -1072,6 +1074,40 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf) case NIC_MBOX_MSG_PTP_CFG: nic_config_timestamp(nic, vf, &mbx.ptp); break; + case NIC_MBOX_MSG_RESET_XCAST: + if (vf >= nic->num_vf_en) { + ret = -1; /* NACK */ + break; + } + bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + bgx_reset_xcast_mode(nic->node, bgx, lmac, +vf < NIC_VF_PER_MBX_REG ? vf : +vf - NIC_VF_PER_MBX_REG); + break; + + case NIC_MBOX_MSG_ADD_MCAST: + if (vf >= nic->num_vf_en) { + ret = -1; /* NACK */ + break; + } + bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + bgx_set_dmac_cam_filter(nic->node, bgx, lmac, + mbx.xcast.data.mac, + vf < NIC_VF_PER_MBX_REG ? vf : + vf - NIC_VF_PER_MBX_REG); + break; + + case NIC_MBOX_MSG_SET_XCAST: + if (vf >= nic->num_vf_en) { + ret = -1; /* NACK */ + break; + } + bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]); + bgx_set_xcast_mode(nic->node, bgx, lmac, mbx.xcast.data.mode); + break; default: dev_err(&nic->pdev->dev, "Invalid msg from VF%d, msg 0x%x\n", vf, mbx.msg.msg); @@ -1094,7 +1130,7 @@ static irqreturn_t nic_mbx_intr_handler(int irq, void *nic_irq) struct nicpf *nic = (struct nicpf *)nic_irq; int mbx; u64 intr; - u8 vf, vf_per_mbx_reg = 64; + u8 vf; if (irq == pci_irq_vector(nic->pdev, NIC_PF_INTR_ID_MBOX0)) mbx = 0; @@ -1103,12 +1139,13 @@ static irqreturn_t nic_mbx_intr_handler(int irq, void *nic_irq) intr = nic_reg_read(nic, NIC_PF_MAILBOX_INT + (mbx << 3)); dev_dbg(&nic->pdev->dev, "PF interrupt Mbox%d 0x%llx\n", mbx, intr); - for (vf = 0; vf < vf_per_mbx_reg; vf++) { + for (vf = 0; vf < NIC_VF_PER_MBX_REG; vf++) { if (intr & (1ULL << vf)) { dev_dbg(&nic->pdev->dev, "Intr from VF %d\n", - vf + (mbx * vf_per_mbx_reg)); + vf + (mbx * NIC_VF_PER_MBX_REG)); - nic_handle_mbx_intr(nic, vf + (mbx * vf_per_mbx_reg)); + nic_handle_mbx_intr(nic, vf + + (mbx * NIC_VF_PER_MBX_REG)); nic_clear_mbx_intr(nic, vf, mbx); } } -- 2.14.3
[PATCH 4/7] net: thunderx: add new messages for handle ndo_set_rx_mode callback
From: Vadim Lomovtsev The kernel calls ndo_set_rx_mode() callback supplying it will all necessary info, such as device state flags, multicast mac addresses list and so on. Since we have only 128 bits to communicate with PF we need to initiate several requests to PF with small/short operation each based on input data. So this commit implements following PF messages codes along with new data structures for them: NIC_MBOX_MSG_RESET_XCAST to flush all filters configured for this particular network interface (VF) NIC_MBOX_MSG_ADD_MCAST to add new MAC address to DMAC filter registers for this particular network interface (VF) NIC_MBOX_MSG_SET_XCAST to apply filtering configuration to filter control register Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic.h | 12 1 file changed, 12 insertions(+) diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h index 4cacce5d2b16..069289b4f968 100644 --- a/drivers/net/ethernet/cavium/thunder/nic.h +++ b/drivers/net/ethernet/cavium/thunder/nic.h @@ -403,6 +403,9 @@ struct nicvf { #defineNIC_MBOX_MSG_PTP_CFG0x19/* HW packet timestamp */ #defineNIC_MBOX_MSG_CFG_DONE 0xF0/* VF configuration done */ #defineNIC_MBOX_MSG_SHUTDOWN 0xF1/* VF is being shutdown */ +#defineNIC_MBOX_MSG_RESET_XCAST0xF2/* Reset DCAM filtering mode */ +#defineNIC_MBOX_MSG_ADD_MCAST 0xF3/* Add MAC to DCAM filters */ +#defineNIC_MBOX_MSG_SET_XCAST 0xF4/* Set MCAST/BCAST RX mode */ struct nic_cfg_msg { u8msg; @@ -556,6 +559,14 @@ struct set_ptp { bool enable; }; +struct xcast { + u8msg; + union { + u8mode; + u64 mac; + } data; +}; + /* 128 bit shared memory between PF and each VF */ union nic_mbx { struct { u8 msg; } msg; @@ -576,6 +587,7 @@ union nic_mbx { struct reset_stat_cfg reset_stat; struct pfc pfc; struct set_ptp ptp; + struct xcastxcast; }; #define NIC_NODE_ID_MASK 0x03 -- 2.14.3
[PATCH 0/7] net: thunderx: implement DMAC filtering support
From: Vadim Lomovtsev By default CN88XX BGX accepts all incoming multicast and broadcast packets and filtering is disabled. The nic driver doesn't provide an ability to change such behaviour. This series is to implement DMAC filtering management for CN88XX nic driver allowing user to enable/disable filtering and configure specific MAC addresses to filter traffic. Vadim Lomovtsev (7): net: thunderx: move filter register related macro into proper place net: thunderx: add MAC address filter tracking for LMAC net: thunderx: add multicast filter management support net: thunderx: add new messages for handle ndo_set_rx_mode callback net: thunderx: add XCAST messages handlers for PF net: thunderx: add workqueue control structures for handle ndo_set_rx_mode request net: thunderx: add ndo_set_rx_mode callback implementation for VF drivers/net/ethernet/cavium/thunder/nic.h | 29 drivers/net/ethernet/cavium/thunder/nic_main.c| 45 - drivers/net/ethernet/cavium/thunder/nicvf_main.c | 108 +++- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 196 -- drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 17 +- 5 files changed, 366 insertions(+), 29 deletions(-) -- 2.14.3
[PATCH v4] net: ethernet: cavium: Correct Cavium Thunderx NIC driver names accordingly to module name
From: Vadim Lomovtsev It was found that ethtool provides unexisting module name while it queries the specified network device for associated driver information. Then user tries to unload that module by provided module name and fails. This happens because ethtool reads value of DRV_NAME macro, while module name is defined at the driver's Makefile. This patch is to correct Cavium CN88xx Thunder NIC driver names (DRV_NAME macro) 'thunder-nicvf' to 'nicvf' and 'thunder-nic' to 'nicpf', sync bgx and xcv driver names accordingly to their module names. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/nic_main.c | 2 +- drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c | 2 +- drivers/net/ethernet/cavium/thunder/nicvf_main.c| 2 +- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- drivers/net/ethernet/cavium/thunder/thunder_xcv.c | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 8f1dd55b3e08..159b422da7fa 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -18,7 +18,7 @@ #include "q_struct.h" #include "thunder_bgx.h" -#define DRV_NAME "thunder-nic" +#define DRV_NAME "nicpf" #define DRV_VERSION"1.0" struct hw_info { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c index b9ece9cbf98b..07f00558af9c 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c @@ -16,7 +16,7 @@ #include "q_struct.h" #include "thunder_bgx.h" -#define DRV_NAME "thunder-nicvf" +#define DRV_NAME "nicvf" #define DRV_VERSION "1.0" struct nicvf_stat { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 7cb7571f9ad9..0a8f3e3ce637 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -26,7 +26,7 @@ #include "nicvf_queues.h" #include "thunder_bgx.h" -#define DRV_NAME "thunder-nicvf" +#define DRV_NAME "nicvf" #define DRV_VERSION"1.0" /* Supported devices */ diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index 5e5c4d7796b8..79fae7de3404 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -21,7 +21,7 @@ #include "nic.h" #include "thunder_bgx.h" -#define DRV_NAME "thunder-BGX" +#define DRV_NAME "thunder_bgx" #define DRV_VERSION"1.0" struct lmac { diff --git a/drivers/net/ethernet/cavium/thunder/thunder_xcv.c b/drivers/net/ethernet/cavium/thunder/thunder_xcv.c index 578c7f8f11bf..2d5e8dab1f70 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_xcv.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_xcv.c @@ -20,7 +20,7 @@ #include "nic.h" #include "thunder_bgx.h" -#define DRV_NAME "thunder-xcv" +#define DRV_NAME "thunder_xcv" #define DRV_VERSION"1.0" /* Register offsets */ -- 2.14.3
Re: [PATCH v3] net: ethernet: cavium: Correct Cavium Thunderx NIC module and driver names
On Wed, Jan 24, 2018 at 04:27:37PM -0500, David Miller wrote: > From: Vadim Lomovtsev > Date: Mon, 22 Jan 2018 06:13:27 -0800 > > > From: Vadim Lomovtsev > > > > It was found that ethtool provides unexisting module name while > > it queries the specified network device for associated driver > > information. Then user tries to unload that module by provided > > module name and fails. > > > > This happens because ethtool reads value of DRV_NAME macro, > > while module name is defined at the driver's Makefile. > > > > This patch is to correct Cavium CN88xx Thunder NIC driver modules > > names 'nicvf' to 'thunder_nicvf' and 'nicpf' to 'thunder_nicpf' along > > with updating DRV_NAME macro values accordingly. > > > > Signed-off-by: Dean Nelson > > Signed-off-by: Vadim Lomovtsev > > Once your driver has been deployed in a real upstream release you > should never change the driver module name. > > So if you want to fix things, you'll have to fix them the other > way around, by not changing the module name but changing the > strings that ethtool ends up with instead. > > Thank you. Ok, understood. Will update patch and re-send. Thank you. Vadim
[PATCH v3] net: ethernet: cavium: Correct Cavium Thunderx NIC module and driver names
From: Vadim Lomovtsev It was found that ethtool provides unexisting module name while it queries the specified network device for associated driver information. Then user tries to unload that module by provided module name and fails. This happens because ethtool reads value of DRV_NAME macro, while module name is defined at the driver's Makefile. This patch is to correct Cavium CN88xx Thunder NIC driver modules names 'nicvf' to 'thunder_nicvf' and 'nicpf' to 'thunder_nicpf' along with updating DRV_NAME macro values accordingly. Signed-off-by: Dean Nelson Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/Makefile| 10 +- drivers/net/ethernet/cavium/thunder/nic_main.c | 2 +- drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c | 2 +- drivers/net/ethernet/cavium/thunder/nicvf_main.c| 2 +- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- drivers/net/ethernet/cavium/thunder/thunder_xcv.c | 2 +- 6 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/Makefile b/drivers/net/ethernet/cavium/thunder/Makefile index 6b4d4add7353..f5642337d0fe 100644 --- a/drivers/net/ethernet/cavium/thunder/Makefile +++ b/drivers/net/ethernet/cavium/thunder/Makefile @@ -4,9 +4,9 @@ obj-$(CONFIG_THUNDER_NIC_RGX) += thunder_xcv.o obj-$(CONFIG_THUNDER_NIC_BGX) += thunder_bgx.o -obj-$(CONFIG_THUNDER_NIC_PF) += nicpf.o -obj-$(CONFIG_THUNDER_NIC_VF) += nicvf.o +obj-$(CONFIG_THUNDER_NIC_PF) += thunder_nicpf.o +obj-$(CONFIG_THUNDER_NIC_VF) += thunder_nicvf.o -nicpf-y := nic_main.o -nicvf-y := nicvf_main.o nicvf_queues.o -nicvf-y += nicvf_ethtool.o +thunder_nicpf-y := nic_main.o +thunder_nicvf-y := nicvf_main.o nicvf_queues.o +thunder_nicvf-y += nicvf_ethtool.o diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 8f1dd55b3e08..a8c0d2bde9c8 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -18,7 +18,7 @@ #include "q_struct.h" #include "thunder_bgx.h" -#define DRV_NAME "thunder-nic" +#define DRV_NAME "thunder_nicpf" #define DRV_VERSION"1.0" struct hw_info { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c index b9ece9cbf98b..8a3e3b70afda 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c @@ -16,7 +16,7 @@ #include "q_struct.h" #include "thunder_bgx.h" -#define DRV_NAME "thunder-nicvf" +#define DRV_NAME "thunder_nicvf" #define DRV_VERSION "1.0" struct nicvf_stat { diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 7cb7571f9ad9..de38105fc649 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -26,7 +26,7 @@ #include "nicvf_queues.h" #include "thunder_bgx.h" -#define DRV_NAME "thunder-nicvf" +#define DRV_NAME "thunder_nicvf" #define DRV_VERSION"1.0" /* Supported devices */ diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index 5e5c4d7796b8..79fae7de3404 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -21,7 +21,7 @@ #include "nic.h" #include "thunder_bgx.h" -#define DRV_NAME "thunder-BGX" +#define DRV_NAME "thunder_bgx" #define DRV_VERSION"1.0" struct lmac { diff --git a/drivers/net/ethernet/cavium/thunder/thunder_xcv.c b/drivers/net/ethernet/cavium/thunder/thunder_xcv.c index 578c7f8f11bf..2d5e8dab1f70 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_xcv.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_xcv.c @@ -20,7 +20,7 @@ #include "nic.h" #include "thunder_bgx.h" -#define DRV_NAME "thunder-xcv" +#define DRV_NAME "thunder_xcv" #define DRV_VERSION"1.0" /* Register offsets */ -- 2.14.3
Re: [PATCH v2] net: ethernet: cavium: Correct Cavium Thunderx nicvf/nicpf modules names
Self NACK here: modules names has to contain dashes instead of underscores, as it defined at sources (or update modules DRV_NAME definitions whithin the source files with underscores). Shame on me. Vadim On Thu, Jan 18, 2018 at 07:53:09AM -0800, Vadim Lomovtsev wrote: > From: Vadim Lomovtsev > > It was found that ethtool provides unexisting module name while > it queries the specified network device for associated driver > information. > > This patch is to correct Cavium CN88xx Thunder nicvf/nicpf modules > names 'nicvf' to 'thunder_nicvf' and 'nicpf' to 'thunder_nicpf'. > > Signed-off-by: Dean Nelson > Signed-off-by: Vadim Lomovtsev > --- > drivers/net/ethernet/cavium/thunder/Makefile | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/net/ethernet/cavium/thunder/Makefile > b/drivers/net/ethernet/cavium/thunder/Makefile > index 6b4d4add7353..f5642337d0fe 100644 > --- a/drivers/net/ethernet/cavium/thunder/Makefile > +++ b/drivers/net/ethernet/cavium/thunder/Makefile > @@ -4,9 +4,9 @@ > > obj-$(CONFIG_THUNDER_NIC_RGX) += thunder_xcv.o > obj-$(CONFIG_THUNDER_NIC_BGX) += thunder_bgx.o > -obj-$(CONFIG_THUNDER_NIC_PF) += nicpf.o > -obj-$(CONFIG_THUNDER_NIC_VF) += nicvf.o > +obj-$(CONFIG_THUNDER_NIC_PF) += thunder_nicpf.o > +obj-$(CONFIG_THUNDER_NIC_VF) += thunder_nicvf.o > > -nicpf-y := nic_main.o > -nicvf-y := nicvf_main.o nicvf_queues.o > -nicvf-y += nicvf_ethtool.o > +thunder_nicpf-y := nic_main.o > +thunder_nicvf-y := nicvf_main.o nicvf_queues.o > +thunder_nicvf-y += nicvf_ethtool.o > -- > 2.14.3 >
[PATCH v2] net: ethernet: cavium: Correct Cavium Thunderx nicvf/nicpf modules names
From: Vadim Lomovtsev It was found that ethtool provides unexisting module name while it queries the specified network device for associated driver information. This patch is to correct Cavium CN88xx Thunder nicvf/nicpf modules names 'nicvf' to 'thunder_nicvf' and 'nicpf' to 'thunder_nicpf'. Signed-off-by: Dean Nelson Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/Makefile | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/Makefile b/drivers/net/ethernet/cavium/thunder/Makefile index 6b4d4add7353..f5642337d0fe 100644 --- a/drivers/net/ethernet/cavium/thunder/Makefile +++ b/drivers/net/ethernet/cavium/thunder/Makefile @@ -4,9 +4,9 @@ obj-$(CONFIG_THUNDER_NIC_RGX) += thunder_xcv.o obj-$(CONFIG_THUNDER_NIC_BGX) += thunder_bgx.o -obj-$(CONFIG_THUNDER_NIC_PF) += nicpf.o -obj-$(CONFIG_THUNDER_NIC_VF) += nicvf.o +obj-$(CONFIG_THUNDER_NIC_PF) += thunder_nicpf.o +obj-$(CONFIG_THUNDER_NIC_VF) += thunder_nicvf.o -nicpf-y := nic_main.o -nicvf-y := nicvf_main.o nicvf_queues.o -nicvf-y += nicvf_ethtool.o +thunder_nicpf-y := nic_main.o +thunder_nicvf-y := nicvf_main.o nicvf_queues.o +thunder_nicvf-y += nicvf_ethtool.o -- 2.14.3
[PATCH] net: ethernet: cavium: Correct Cacivum Thunderx nicvf/nicpf modules names
From: Vadim Lomovtsev It was found that ethtool provides unexisting module name while it queries the specified network device for associated driver information. This patch is to correct Cavium CN88xx Thunder nicvf/nicpf modules names 'nicvf' to 'thunder_nicvf' and 'nicpf' to 'thunder_nicpf'. Signed-off-by: Dean Nelson Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/Makefile | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/Makefile b/drivers/net/ethernet/cavium/thunder/Makefile index 6b4d4add7353..f5642337d0fe 100644 --- a/drivers/net/ethernet/cavium/thunder/Makefile +++ b/drivers/net/ethernet/cavium/thunder/Makefile @@ -4,9 +4,9 @@ obj-$(CONFIG_THUNDER_NIC_RGX) += thunder_xcv.o obj-$(CONFIG_THUNDER_NIC_BGX) += thunder_bgx.o -obj-$(CONFIG_THUNDER_NIC_PF) += nicpf.o -obj-$(CONFIG_THUNDER_NIC_VF) += nicvf.o +obj-$(CONFIG_THUNDER_NIC_PF) += thunder_nicpf.o +obj-$(CONFIG_THUNDER_NIC_VF) += thunder_nicvf.o -nicpf-y := nic_main.o -nicvf-y := nicvf_main.o nicvf_queues.o -nicvf-y += nicvf_ethtool.o +thunder_nicpf-y := nic_main.o +thunder_nicvf-y := nicvf_main.o nicvf_queues.o +thunder_nicvf-y += nicvf_ethtool.o -- 2.14.3
Re: [PATCH] net: ethernet: cavium: Correct Cacivum Thunderx nicvf/nicpf modules names
Self NACK to this one, subject line typo, sorry. Vadim On Thu, Jan 18, 2018 at 07:42:40AM -0800, Vadim Lomovtsev wrote: > From: Vadim Lomovtsev > > It was found that ethtool provides unexisting module name while > it queries the specified network device for associated driver > information. > > This patch is to correct Cavium CN88xx Thunder nicvf/nicpf modules > names 'nicvf' to 'thunder_nicvf' and 'nicpf' to 'thunder_nicpf'. > > Signed-off-by: Dean Nelson > Signed-off-by: Vadim Lomovtsev > --- > drivers/net/ethernet/cavium/thunder/Makefile | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/net/ethernet/cavium/thunder/Makefile > b/drivers/net/ethernet/cavium/thunder/Makefile > index 6b4d4add7353..f5642337d0fe 100644 > --- a/drivers/net/ethernet/cavium/thunder/Makefile > +++ b/drivers/net/ethernet/cavium/thunder/Makefile > @@ -4,9 +4,9 @@ > > obj-$(CONFIG_THUNDER_NIC_RGX) += thunder_xcv.o > obj-$(CONFIG_THUNDER_NIC_BGX) += thunder_bgx.o > -obj-$(CONFIG_THUNDER_NIC_PF) += nicpf.o > -obj-$(CONFIG_THUNDER_NIC_VF) += nicvf.o > +obj-$(CONFIG_THUNDER_NIC_PF) += thunder_nicpf.o > +obj-$(CONFIG_THUNDER_NIC_VF) += thunder_nicvf.o > > -nicpf-y := nic_main.o > -nicvf-y := nicvf_main.o nicvf_queues.o > -nicvf-y += nicvf_ethtool.o > +thunder_nicpf-y := nic_main.o > +thunder_nicvf-y := nicvf_main.o nicvf_queues.o > +thunder_nicvf-y += nicvf_ethtool.o > -- > 2.14.3 >
Re: [PATCH] net: sunrpc: svcsock: fix NULL-pointer exception
On Fri, Aug 25, 2017 at 06:01:28PM -0400, J. Bruce Fields wrote: > On Fri, Aug 18, 2017 at 06:00:47AM -0400, Vadim Lomovtsev wrote: > > While running nfs/connectathon tests kernel NULL-pointer exception > > has been observed due to races in svcsock.c. > > > > Race is appear when kernel accepts connection by kernel_accept > > (which creates new socket) and start queuing ingress packets > > to new socket. This happanes in ksoftirq context which concurrently > > on a differnt core while new socket setup is not done yet. > > > > The fix is to re-order socket user data init sequence, add NULL-ptr > > check before callback call along with barriers to prevent kernel crash. > > > > Test results: nfs/connectathon reports '0' failed tests for about 200+ > > iterations. > > By the way, is there anything special about your setup that allows you > to reproduce this? There's nothing special about connectathon tests, so > I'm just wondering why we haven't had a lot of reports of this. >From what I have now - nothing special in test setup and/or configuration. I believe it is because or high amount of CPU running. It was found at 32 cores CPU, while simply invoking test by "make run" command. WBR, Vadim > > --b. > > > > > Crash log: > > ---<-snip->--- > > [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual > > address > > [ 6708.647093] pgd = 094e > > [ 6708.650497] [] *pgd=01090003, *pud=01090003, > > *pmd=01080003, *pte= > > [ 6708.660761] Internal error: Oops: 8605 [#1] SMP > > [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log > > nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay > > xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 > > xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag > > udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm > > libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp > > scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core > > nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack > > vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg > > thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd > > grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper > > syscopyarea sysfillrect sysimgblt fb_sys_fops > > [ 6708.736446] ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder > > mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: > > stap_3c300909c5b3f46dcacd49aab3334af_87021] > > [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: GW OE > > 4.11.0-4.el7.aarch64 #1 > > [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 > > 2017 > > [ 6708.767910] task: 810006842e80 task.stack: 81000689c000 > > [ 6708.773822] PC is at 0x0 > > [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc] > > [ 6708.781866] pc : [<>] lr : [] pstate: > > 6145 > > [ 6708.789248] sp : 810ffbad3900 > > [ 6708.792551] x29: 810ffbad3900 x28: 08c73d58 > > [ 6708.797853] x27: x26: 81000bbe1e00 > > [ 6708.803156] x25: 0020 x24: 800f7410bf28 > > [ 6708.808458] x23: 08c63000 x22: 08c63000 > > [ 6708.813760] x21: 800f7410bf28 x20: 81000bbe1e00 > > [ 6708.819063] x19: 810012412400 x18: d82a9df2 > > [ 6708.824365] x17: x16: > > [ 6708.829667] x15: x14: 0001 > > [ 6708.834969] x13: x12: 722e736f622e676e > > [ 6708.840271] x11: f814dd99 x10: > > [ 6708.845573] x9 : 737468722500 x8 : > > [ 6708.850875] x7 : x6 : > > [ 6708.856177] x5 : 0028 x4 : > > [ 6708.861479] x3 : x2 : e500 > > [ 6708.866781] x1 : x0 : 81000bbe1e00 > > [ 6708.872084] > > [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0x81000689c000) > > [ 6708.880341] Stack: (0x810ffbad3900 to 0x8100068a) > > [ 6708.886075] Call trace: > > [ 6708.888513] Exception stack(0x810ffbad3710 to 0x810ffbad3840) > > [ 6708.894942] 3700: 810012412400 > > 0001 > > [ 6708.902759] 3720: 810ffbad
Re: [PATCH v4] net: sunrpc: svcsock: fix NULL-pointer exception
Hi all, Any comments on this ? WBR, Vadim On Mon, Aug 21, 2017 at 07:23:07AM -0400, Vadim Lomovtsev wrote: > While running nfs/connectathon tests kernel NULL-pointer exception > has been observed due to races in svcsock.c. > > Race is appear when kernel accepts connection by kernel_accept > (which creates new socket) and start queuing ingress packets > to new socket. This happens in ksoftirq context which could run > concurrently on a different core while new socket setup is not done yet. > > The fix is to re-order socket user data init sequence and add > write/read barrier calls to be sure that we got proper values > for callback pointers before actually calling them. > > Test results: nfs/connectathon reports '0' failed tests for about 200+ > iterations. > > Crash log: > ---<-snip->--- > [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual > address > [ 6708.647093] pgd = 094e > [ 6708.650497] [] *pgd=01090003, *pud=01090003, > *pmd=01080003, *pte= > [ 6708.660761] Internal error: Oops: 8605 [#1] SMP > [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log > nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay > xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 > xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag > udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm > libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp > scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core nls_koi8_u > nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vfat fat > ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg thunderx_edac > i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc > xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper syscopyarea > sysfillrect sysimgblt fb_sys_fops > [ 6708.736446] ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder > mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: > stap_3c300909c5b3f46dcacd49aab3334af_87021] > [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: GW OE > 4.11.0-4.el7.aarch64 #1 > [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 > 2017 > [ 6708.767910] task: 810006842e80 task.stack: 81000689c000 > [ 6708.773822] PC is at 0x0 > [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc] > [ 6708.781866] pc : [<>] lr : [] pstate: > 6145 > [ 6708.789248] sp : 810ffbad3900 > [ 6708.792551] x29: 810ffbad3900 x28: 08c73d58 > [ 6708.797853] x27: x26: 81000bbe1e00 > [ 6708.803156] x25: 0020 x24: 800f7410bf28 > [ 6708.808458] x23: 08c63000 x22: 08c63000 > [ 6708.813760] x21: 800f7410bf28 x20: 81000bbe1e00 > [ 6708.819063] x19: 810012412400 x18: d82a9df2 > [ 6708.824365] x17: x16: > [ 6708.829667] x15: x14: 0001 > [ 6708.834969] x13: x12: 722e736f622e676e > [ 6708.840271] x11: f814dd99 x10: > [ 6708.845573] x9 : 737468722500 x8 : > [ 6708.850875] x7 : x6 : > [ 6708.856177] x5 : 0028 x4 : > [ 6708.861479] x3 : x2 : e500 > [ 6708.866781] x1 : x0 : 81000bbe1e00 > [ 6708.872084] > [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0x81000689c000) > [ 6708.880341] Stack: (0x810ffbad3900 to 0x8100068a) > [ 6708.886075] Call trace: > [ 6708.888513] Exception stack(0x810ffbad3710 to 0x810ffbad3840) > [ 6708.894942] 3700: 810012412400 > 0001 > [ 6708.902759] 3720: 810ffbad3900 6145 > 800f7930 > [ 6708.910577] 3740: 09274d00 03ea 0015 > 08c63000 > [ 6708.918395] 3760: 810ffbad3830 800f7930 004d > > [ 6708.926212] 3780: 810ffbad3890 080f88dc 800f7930 > 004d > [ 6708.934030] 37a0: 800f7930093c 08c63000 > 0140 > [ 6708.941848] 37c0: 08c2c000 00040b00 81000bbe1e00 > > [ 6708.949665] 37e0: e500 > 0028 > [ 6708.957483] 3800: > 737468722500 > [ 6708.965300] 3820: f814dd99 722e736f622e
[PATCH v4] net: sunrpc: svcsock: fix NULL-pointer exception
_skb+0x38/0x84 [ 6709.034580] [] netif_receive_skb_internal+0x68/0xdc [ 6709.041010] [] napi_gro_receive+0xcc/0x1a8 [ 6709.046690] [] nicvf_cq_intr_handler+0x59c/0x730 [nicvf] [ 6709.053559] [] nicvf_poll+0x38/0xb8 [nicvf] [ 6709.059295] [] net_rx_action+0x2f8/0x464 [ 6709.064771] [] __do_softirq+0x11c/0x308 [ 6709.070164] [] irq_exit+0x12c/0x174 [ 6709.075206] [] __handle_domain_irq+0x78/0xc4 [ 6709.081027] [] gic_handle_irq+0x94/0x190 [ 6709.086501] Exception stack(0x81000689fdf0 to 0x81000689ff20) [ 6709.092929] fde0: 810ff2ec 08c1 [ 6709.100747] fe00: 08c70ef4 0001 810ffbad9b18 [ 6709.108565] fe20: 810ffbad9c70 8100169d3800 810006843ab0 81000689fe80 [ 6709.116382] fe40: 0bd0 df979cd0 183f5913da192500 8a254ce4 [ 6709.124200] fe60: 8a254b78 aaab10339808 8a0c2a50 [ 6709.132018] fe80: df979b10 08d6d450 08c1 08d6d000 [ 6709.139836] fea0: 0054 08cd3dbc [ 6709.147653] fec0: 81000689ff20 [ 6709.155471] fee0: 08085240 81000689ff20 08085244 6145 [ 6709.163289] ff00: 81000689ff10 0813f1e4 0813f238 [ 6709.171107] [] el1_irq+0xb4/0x140 [ 6709.175976] [] arch_cpu_idle+0x44/0x11c [ 6709.181368] [] default_idle_call+0x20/0x30 [ 6709.187020] [] do_idle+0x158/0x1e4 [ 6709.191973] [] cpu_startup_entry+0x2c/0x30 [ 6709.197624] [] secondary_start_kernel+0x13c/0x160 [ 6709.203878] [<01bc71c4>] 0x1bc71c4 [ 6709.207967] Code: bad PC value [ 6709.211061] SMP: stopping secondary CPUs [ 6709.218830] Starting crashdump kernel... [ 6709.222749] Bye! ---<-snip>--- Signed-off-by: Vadim Lomovtsev Reviewed-by: Jeff Layton --- net/sunrpc/svcsock.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 2b720fa..e185001 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -421,6 +421,9 @@ static void svc_data_ready(struct sock *sk) dprintk("svc: socket %p(inet %p), busy=%d\n", svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)); + + /* Refer to svc_setup_socket() for details. */ + rmb(); svsk->sk_odata(sk); if (!test_and_set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags)) svc_xprt_enqueue(&svsk->sk_xprt); @@ -437,6 +440,9 @@ static void svc_write_space(struct sock *sk) if (svsk) { dprintk("svc: socket %p(inet %p), write_space busy=%d\n", svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)); + + /* Refer to svc_setup_socket() for details. */ + rmb(); svsk->sk_owspace(sk); svc_xprt_enqueue(&svsk->sk_xprt); } @@ -760,8 +766,12 @@ static void svc_tcp_listen_data_ready(struct sock *sk) dprintk("svc: socket %p TCP (listen) state change %d\n", sk, sk->sk_state); - if (svsk) + if (svsk) { + /* Refer to svc_setup_socket() for details. */ + rmb(); svsk->sk_odata(sk); + } + /* * This callback may called twice when a new connection * is established as a child socket inherits everything @@ -794,6 +804,8 @@ static void svc_tcp_state_change(struct sock *sk) if (!svsk) printk("svc: socket %p: no user data\n", sk); else { + /* Refer to svc_setup_socket() for details. */ + rmb(); svsk->sk_ostate(sk); if (sk->sk_state != TCP_ESTABLISHED) { set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags); @@ -1381,12 +1393,18 @@ static struct svc_sock *svc_setup_socket(struct svc_serv *serv, return ERR_PTR(err); } - inet->sk_user_data = svsk; svsk->sk_sock = sock; svsk->sk_sk = inet; svsk->sk_ostate = inet->sk_state_change; svsk->sk_odata = inet->sk_data_ready; svsk->sk_owspace = inet->sk_write_space; + /* +* This barrier is necessary in order to prevent race condition +* with svc_data_ready(), svc_listen_data_ready() and others +* when calling callbacks above. +*/ + wmb(); + inet->sk_user_data = svsk; /* Initialize the socket */ if (sock->type == SOCK_DGRAM) -- 1.8.3.1
Re: [PATCH v3] net: sunrpc: svcsock: fix NULL-pointer exception
On Mon, Aug 21, 2017 at 06:42:07AM -0400, Jeff Layton wrote: > On Mon, 2017-08-21 at 04:56 -0400, Vadim Lomovtsev wrote: > > While running nfs/connectathon tests kernel NULL-pointer exception > > has been observed due to races in svcsock.c. > > > > Race is appear when kernel accepts connection by kernel_accept > > (which creates new socket) and start queuing ingress packets > > to new socket. This happens in ksoftirq context which could run > > concurrently on a different core while new socket setup is not done yet. > > > > The fix is to re-order socket user data init sequence and add > > write/read barrier calls to be sure that we got proper values > > for callback pointers before actually calling them. > > > > Test results: nfs/connectathon reports '0' failed tests for about 200+ > > iterations. > > > > Crash log: > > ---<-snip->--- > > [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual > > address > > [ 6708.647093] pgd = 094e > > [ 6708.650497] [] *pgd=01090003, *pud=01090003, > > *pmd=01080003, *pte= > > [ 6708.660761] Internal error: Oops: 8605 [#1] SMP > > [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log > > nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay > > xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 > > xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag > > udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm > > libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp > > scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core > > nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack > > vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg > > thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd > > grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper > > syscopyarea sysfillrect sysimgblt fb_sys_fops > > [ 6708.736446] ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder > > mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: > > stap_3c300909c5b3f46dcacd49aab3334af_87021] > > [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: GW OE > > 4.11.0-4.el7.aarch64 #1 > > [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 > > 2017 > > [ 6708.767910] task: 810006842e80 task.stack: 81000689c000 > > [ 6708.773822] PC is at 0x0 > > [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc] > > [ 6708.781866] pc : [<>] lr : [] pstate: > > 6145 > > [ 6708.789248] sp : 810ffbad3900 > > [ 6708.792551] x29: 810ffbad3900 x28: 08c73d58 > > [ 6708.797853] x27: x26: 81000bbe1e00 > > [ 6708.803156] x25: 0020 x24: 800f7410bf28 > > [ 6708.808458] x23: 08c63000 x22: 08c63000 > > [ 6708.813760] x21: 800f7410bf28 x20: 81000bbe1e00 > > [ 6708.819063] x19: 810012412400 x18: d82a9df2 > > [ 6708.824365] x17: x16: > > [ 6708.829667] x15: x14: 0001 > > [ 6708.834969] x13: x12: 722e736f622e676e > > [ 6708.840271] x11: f814dd99 x10: > > [ 6708.845573] x9 : 737468722500 x8 : > > [ 6708.850875] x7 : x6 : > > [ 6708.856177] x5 : 0028 x4 : > > [ 6708.861479] x3 : x2 : e500 > > [ 6708.866781] x1 : x0 : 81000bbe1e00 > > [ 6708.872084] > > [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0x81000689c000) > > [ 6708.880341] Stack: (0x810ffbad3900 to 0x8100068a) > > [ 6708.886075] Call trace: > > [ 6708.888513] Exception stack(0x810ffbad3710 to 0x810ffbad3840) > > [ 6708.894942] 3700: 810012412400 > > 0001 > > [ 6708.902759] 3720: 810ffbad3900 6145 > > 800f7930 > > [ 6708.910577] 3740: 09274d00 03ea 0015 > > 08c63000 > > [ 6708.918395] 3760: 810ffbad3830 800f7930 004d > > > > [ 6708.926212] 3780: 810ffbad3890 080f88dc 800f7930 > > 004d > > [ 6708.934030] 37a0: 800f7930093c
[PATCH v3] net: sunrpc: svcsock: fix NULL-pointer exception
_skb+0x38/0x84 [ 6709.034580] [] netif_receive_skb_internal+0x68/0xdc [ 6709.041010] [] napi_gro_receive+0xcc/0x1a8 [ 6709.046690] [] nicvf_cq_intr_handler+0x59c/0x730 [nicvf] [ 6709.053559] [] nicvf_poll+0x38/0xb8 [nicvf] [ 6709.059295] [] net_rx_action+0x2f8/0x464 [ 6709.064771] [] __do_softirq+0x11c/0x308 [ 6709.070164] [] irq_exit+0x12c/0x174 [ 6709.075206] [] __handle_domain_irq+0x78/0xc4 [ 6709.081027] [] gic_handle_irq+0x94/0x190 [ 6709.086501] Exception stack(0x81000689fdf0 to 0x81000689ff20) [ 6709.092929] fde0: 810ff2ec 08c1 [ 6709.100747] fe00: 08c70ef4 0001 810ffbad9b18 [ 6709.108565] fe20: 810ffbad9c70 8100169d3800 810006843ab0 81000689fe80 [ 6709.116382] fe40: 0bd0 df979cd0 183f5913da192500 8a254ce4 [ 6709.124200] fe60: 8a254b78 aaab10339808 8a0c2a50 [ 6709.132018] fe80: df979b10 08d6d450 08c1 08d6d000 [ 6709.139836] fea0: 0054 08cd3dbc [ 6709.147653] fec0: 81000689ff20 [ 6709.155471] fee0: 08085240 81000689ff20 08085244 6145 [ 6709.163289] ff00: 81000689ff10 0813f1e4 0813f238 [ 6709.171107] [] el1_irq+0xb4/0x140 [ 6709.175976] [] arch_cpu_idle+0x44/0x11c [ 6709.181368] [] default_idle_call+0x20/0x30 [ 6709.187020] [] do_idle+0x158/0x1e4 [ 6709.191973] [] cpu_startup_entry+0x2c/0x30 [ 6709.197624] [] secondary_start_kernel+0x13c/0x160 [ 6709.203878] [<01bc71c4>] 0x1bc71c4 [ 6709.207967] Code: bad PC value [ 6709.211061] SMP: stopping secondary CPUs [ 6709.218830] Starting crashdump kernel... [ 6709.222749] Bye! ---<-snip>--- Signed-off-by: Vadim Lomovtsev --- net/sunrpc/svcsock.c | 27 +-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 2b720fa..9ec16c5 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -421,6 +421,10 @@ static void svc_data_ready(struct sock *sk) dprintk("svc: socket %p(inet %p), busy=%d\n", svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)); + /* this barrier is necessary to prevent kernel crash + (due to bad CPU-value) caused by races against + svc_setup_socket() while calling sk_odata() callback. */ + rmb(); svsk->sk_odata(sk); if (!test_and_set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags)) svc_xprt_enqueue(&svsk->sk_xprt); @@ -437,6 +441,10 @@ static void svc_write_space(struct sock *sk) if (svsk) { dprintk("svc: socket %p(inet %p), write_space busy=%d\n", svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)); + /* this barrier is necessary to prevent kernel crash + (due to bad CPU-value) caused by races against + svc_setup_socket() while calling sk_owspace() callback. */ + rmb(); svsk->sk_owspace(sk); svc_xprt_enqueue(&svsk->sk_xprt); } @@ -760,8 +768,14 @@ static void svc_tcp_listen_data_ready(struct sock *sk) dprintk("svc: socket %p TCP (listen) state change %d\n", sk, sk->sk_state); - if (svsk) + if (svsk) { + /* this barrier is necessary to prevent kernel crash + (due to bad CPU-value) caused by races against + svc_setup_socket() while calling sk_odata() callback.*/ + rmb(); svsk->sk_odata(sk); + } + /* * This callback may called twice when a new connection * is established as a child socket inherits everything @@ -794,7 +808,12 @@ static void svc_tcp_state_change(struct sock *sk) if (!svsk) printk("svc: socket %p: no user data\n", sk); else { + /* this barrier is necessary to prevent kernel crash + (due to bad CPU-value) caused by races against + svc_setup_socket() while calling sk_ostate() callback. */ + rmb(); svsk->sk_ostate(sk); + if (sk->sk_state != TCP_ESTABLISHED) { set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags); svc_xprt_enqueue(&svsk->sk_xprt); @@ -1381,12 +1400,16 @@ static struct svc_sock *svc_setup_socket(struct svc_serv *serv, return ERR_PTR(err); } - inet->sk_user_data = svsk; svsk->sk_sock = sock; svsk->sk_sk = inet; svsk->sk_osta
Re: [PATCH v2] net: sunrpc: svcsock: fix NULL-pointer exception
Sorry guys, please ignore this - being sent by mistake. Have some typos at comments by copy-paste, will correct and re-send Vadim On Mon, Aug 21, 2017 at 04:44:45AM -0400, Vadim Lomovtsev wrote: > While running nfs/connectathon tests kernel NULL-pointer exception > has been observed due to races in svcsock.c. > > Race is appear when kernel accepts connection by kernel_accept > (which creates new socket) and start queuing ingress packets > to new socket. This happens in ksoftirq context which could run > concurrently on a different core while new socket setup is not done yet. > > The fix is to re-order socket user data init sequence and add > write/read barrier calls to be sure that we got proper values > for callback pointers before actually calling them. > > Test results: nfs/connectathon reports '0' failed tests for about 200+ > iterations. > > Crash log: > ---<-snip->--- > [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual > address > [ 6708.647093] pgd = 094e > [ 6708.650497] [] *pgd=01090003, *pud=01090003, > *pmd=01080003, *pte= > [ 6708.660761] Internal error: Oops: 8605 [#1] SMP > [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log > nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay > xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 > xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag > udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm > libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp > scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core nls_koi8_u > nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vfat fat > ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg thunderx_edac > i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc > xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper syscopyarea > sysfillrect sysimgblt fb_sys_fops > [ 6708.736446] ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder > mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: > stap_3c300909c5b3f46dcacd49aab3334af_87021] > [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: GW OE > 4.11.0-4.el7.aarch64 #1 > [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 > 2017 > [ 6708.767910] task: 810006842e80 task.stack: 81000689c000 > [ 6708.773822] PC is at 0x0 > [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc] > [ 6708.781866] pc : [<>] lr : [] pstate: > 6145 > [ 6708.789248] sp : 810ffbad3900 > [ 6708.792551] x29: 810ffbad3900 x28: 08c73d58 > [ 6708.797853] x27: x26: 81000bbe1e00 > [ 6708.803156] x25: 0020 x24: 800f7410bf28 > [ 6708.808458] x23: 08c63000 x22: 08c63000 > [ 6708.813760] x21: 800f7410bf28 x20: 81000bbe1e00 > [ 6708.819063] x19: 810012412400 x18: d82a9df2 > [ 6708.824365] x17: x16: > [ 6708.829667] x15: x14: 0001 > [ 6708.834969] x13: x12: 722e736f622e676e > [ 6708.840271] x11: f814dd99 x10: > [ 6708.845573] x9 : 737468722500 x8 : > [ 6708.850875] x7 : x6 : > [ 6708.856177] x5 : 0028 x4 : > [ 6708.861479] x3 : x2 : e500 > [ 6708.866781] x1 : x0 : 81000bbe1e00 > [ 6708.872084] > [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0x81000689c000) > [ 6708.880341] Stack: (0x810ffbad3900 to 0x8100068a) > [ 6708.886075] Call trace: > [ 6708.888513] Exception stack(0x810ffbad3710 to 0x810ffbad3840) > [ 6708.894942] 3700: 810012412400 > 0001 > [ 6708.902759] 3720: 810ffbad3900 6145 > 800f7930 > [ 6708.910577] 3740: 09274d00 03ea 0015 > 08c63000 > [ 6708.918395] 3760: 810ffbad3830 800f7930 004d > > [ 6708.926212] 3780: 810ffbad3890 080f88dc 800f7930 > 004d > [ 6708.934030] 37a0: 800f7930093c 08c63000 > 0140 > [ 6708.941848] 37c0: 08c2c000 00040b00 81000bbe1e00 > > [ 6708.949665] 37e0: e500 > 0028 > [ 6708.957483] 3800: >
[PATCH v2] net: sunrpc: svcsock: fix NULL-pointer exception
_skb+0x38/0x84 [ 6709.034580] [] netif_receive_skb_internal+0x68/0xdc [ 6709.041010] [] napi_gro_receive+0xcc/0x1a8 [ 6709.046690] [] nicvf_cq_intr_handler+0x59c/0x730 [nicvf] [ 6709.053559] [] nicvf_poll+0x38/0xb8 [nicvf] [ 6709.059295] [] net_rx_action+0x2f8/0x464 [ 6709.064771] [] __do_softirq+0x11c/0x308 [ 6709.070164] [] irq_exit+0x12c/0x174 [ 6709.075206] [] __handle_domain_irq+0x78/0xc4 [ 6709.081027] [] gic_handle_irq+0x94/0x190 [ 6709.086501] Exception stack(0x81000689fdf0 to 0x81000689ff20) [ 6709.092929] fde0: 810ff2ec 08c1 [ 6709.100747] fe00: 08c70ef4 0001 810ffbad9b18 [ 6709.108565] fe20: 810ffbad9c70 8100169d3800 810006843ab0 81000689fe80 [ 6709.116382] fe40: 0bd0 df979cd0 183f5913da192500 8a254ce4 [ 6709.124200] fe60: 8a254b78 aaab10339808 8a0c2a50 [ 6709.132018] fe80: df979b10 08d6d450 08c1 08d6d000 [ 6709.139836] fea0: 0054 08cd3dbc [ 6709.147653] fec0: 81000689ff20 [ 6709.155471] fee0: 08085240 81000689ff20 08085244 6145 [ 6709.163289] ff00: 81000689ff10 0813f1e4 0813f238 [ 6709.171107] [] el1_irq+0xb4/0x140 [ 6709.175976] [] arch_cpu_idle+0x44/0x11c [ 6709.181368] [] default_idle_call+0x20/0x30 [ 6709.187020] [] do_idle+0x158/0x1e4 [ 6709.191973] [] cpu_startup_entry+0x2c/0x30 [ 6709.197624] [] secondary_start_kernel+0x13c/0x160 [ 6709.203878] [<01bc71c4>] 0x1bc71c4 [ 6709.207967] Code: bad PC value [ 6709.211061] SMP: stopping secondary CPUs [ 6709.218830] Starting crashdump kernel... [ 6709.222749] Bye! ---<-snip>--- Signed-off-by: Vadim Lomovtsev --- net/sunrpc/svcsock.c | 27 +-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 2b720fa..2be967f 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -421,6 +421,10 @@ static void svc_data_ready(struct sock *sk) dprintk("svc: socket %p(inet %p), busy=%d\n", svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)); + /* this barrier is necessary to prevent kernel crash + (due to bad CPU-value) caused by races aginst + svc_setup_socket() while calling sk_odata() callback. */ + rmb(); svsk->sk_odata(sk); if (!test_and_set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags)) svc_xprt_enqueue(&svsk->sk_xprt); @@ -437,6 +441,10 @@ static void svc_write_space(struct sock *sk) if (svsk) { dprintk("svc: socket %p(inet %p), write_space busy=%d\n", svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)); + /* this barrier is necessary to prevent kernel crash + (due to bad CPU-value) caused by races aginst + svc_setup_socket() while calling sk_owspace() callback. */ + rmb(); svsk->sk_owspace(sk); svc_xprt_enqueue(&svsk->sk_xprt); } @@ -760,8 +768,14 @@ static void svc_tcp_listen_data_ready(struct sock *sk) dprintk("svc: socket %p TCP (listen) state change %d\n", sk, sk->sk_state); - if (svsk) + if (svsk) { + /* this barrier is necessary to prevent kernel crash + (due to bad CPU-value) caused by races aginst + svc_setup_socket() while calling sk_odata() callback.*/ + rmb(); svsk->sk_odata(sk); + } + /* * This callback may called twice when a new connection * is established as a child socket inherits everything @@ -794,7 +808,12 @@ static void svc_tcp_state_change(struct sock *sk) if (!svsk) printk("svc: socket %p: no user data\n", sk); else { + /* this barrier is necessary to prevent kernel crash + (due to bad CPU-value) caused by races aginst + svc_setup_socket() while calling sk_odata() callback. */ + rmb(); svsk->sk_ostate(sk); + if (sk->sk_state != TCP_ESTABLISHED) { set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags); svc_xprt_enqueue(&svsk->sk_xprt); @@ -1381,12 +1400,16 @@ static struct svc_sock *svc_setup_socket(struct svc_serv *serv, return ERR_PTR(err); } - inet->sk_user_data = svsk; svsk->sk_sock = sock; svsk->sk_sk = inet; svsk->sk_ostate
Re: [PATCH] net: sunrpc: svcsock: fix NULL-pointer exception
On Fri, Aug 18, 2017 at 07:16:45AM -0400, Jeff Layton wrote: > On Fri, 2017-08-18 at 07:08 -0400, Vadim Lomovtsev wrote: > > Hi Jeff, > > > > On Fri, Aug 18, 2017 at 06:27:32AM -0400, Jeff Layton wrote: > > > On Fri, 2017-08-18 at 06:00 -0400, Vadim Lomovtsev wrote: > > > > While running nfs/connectathon tests kernel NULL-pointer exception > > > > has been observed due to races in svcsock.c. > > > > > > > > Race is appear when kernel accepts connection by kernel_accept > > > > (which creates new socket) and start queuing ingress packets > > > > to new socket. This happanes in ksoftirq context which concurrently > > > > on a differnt core while new socket setup is not done yet. > > > > > > > > The fix is to re-order socket user data init sequence, add NULL-ptr > > > > check before callback call along with barriers to prevent kernel crash. > > > > > > > > Test results: nfs/connectathon reports '0' failed tests for about 200+ > > > > iterations. > > > > > > > > Crash log: > > > > ---<-snip->--- > > > > [ 6708.638984] Unable to handle kernel NULL pointer dereference at > > > > virtual address > > > > [ 6708.647093] pgd = 094e > > > > [ 6708.650497] [] *pgd=01090003, *pud=01090003, > > > > *pmd=01080003, *pte= > > > > [ 6708.660761] Internal error: Oops: 8605 [#1] SMP > > > > [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log > > > > nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay > > > > xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 > > > > xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop > > > > tcp_diag udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser > > > > rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_srpt target_core_mod > > > > ib_srp scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm > > > > ib_core nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 > > > > nf_conntrack vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf > > > > i2c_thunderx sg thunderx_edac i2c_smbus edac_core cavium_rng nfsd > > > > auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c nicvf nicpf ast > > > > i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt > > > > fb_sys_fops > > > > [ 6708.736446] ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder > > > > mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: > > > > stap_3c300909c5b3f46dcacd49aab3334af_87021] > > > > [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: GW OE > > > > 4.11.0-4.el7.aarch64 #1 > > > > [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 > > > > Mar 13 2017 > > > > [ 6708.767910] task: 810006842e80 task.stack: 81000689c000 > > > > [ 6708.773822] PC is at 0x0 > > > > [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc] > > > > [ 6708.781866] pc : [<>] lr : [] > > > > pstate: 6145 > > > > [ 6708.789248] sp : 810ffbad3900 > > > > [ 6708.792551] x29: 810ffbad3900 x28: 08c73d58 > > > > [ 6708.797853] x27: x26: 81000bbe1e00 > > > > [ 6708.803156] x25: 0020 x24: 800f7410bf28 > > > > [ 6708.808458] x23: 08c63000 x22: 08c63000 > > > > [ 6708.813760] x21: 800f7410bf28 x20: 81000bbe1e00 > > > > [ 6708.819063] x19: 810012412400 x18: d82a9df2 > > > > [ 6708.824365] x17: x16: > > > > [ 6708.829667] x15: x14: 0001 > > > > [ 6708.834969] x13: x12: 722e736f622e676e > > > > [ 6708.840271] x11: f814dd99 x10: > > > > [ 6708.845573] x9 : 737468722500 x8 : > > > > [ 6708.850875] x7 : x6 : > > > > [ 6708.856177] x5 : 0028 x4 : > > > > [ 6708.861479] x3 : x2 : e500 > > > > [ 6708.866781] x1 : x0 : 81000bbe1e00 > > > > [ 6708.872084] > > > > [ 6708.873565] Process swapper/84 (pid: 0, stack limit = > > > > 0x81000689c000) > >
Re: [PATCH] net: sunrpc: svcsock: fix NULL-pointer exception
Hi Jeff, On Fri, Aug 18, 2017 at 06:27:32AM -0400, Jeff Layton wrote: > On Fri, 2017-08-18 at 06:00 -0400, Vadim Lomovtsev wrote: > > While running nfs/connectathon tests kernel NULL-pointer exception > > has been observed due to races in svcsock.c. > > > > Race is appear when kernel accepts connection by kernel_accept > > (which creates new socket) and start queuing ingress packets > > to new socket. This happanes in ksoftirq context which concurrently > > on a differnt core while new socket setup is not done yet. > > > > The fix is to re-order socket user data init sequence, add NULL-ptr > > check before callback call along with barriers to prevent kernel crash. > > > > Test results: nfs/connectathon reports '0' failed tests for about 200+ > > iterations. > > > > Crash log: > > ---<-snip->--- > > [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual > > address > > [ 6708.647093] pgd = 094e > > [ 6708.650497] [] *pgd=01090003, *pud=01090003, > > *pmd=01080003, *pte= > > [ 6708.660761] Internal error: Oops: 8605 [#1] SMP > > [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log > > nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay > > xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 > > xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag > > udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm > > libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp > > scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core > > nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack > > vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg > > thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd > > grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper > > syscopyarea sysfillrect sysimgblt fb_sys_fops > > [ 6708.736446] ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder > > mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: > > stap_3c300909c5b3f46dcacd49aab3334af_87021] > > [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: GW OE > > 4.11.0-4.el7.aarch64 #1 > > [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 > > 2017 > > [ 6708.767910] task: 810006842e80 task.stack: 81000689c000 > > [ 6708.773822] PC is at 0x0 > > [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc] > > [ 6708.781866] pc : [<>] lr : [] pstate: > > 6145 > > [ 6708.789248] sp : 810ffbad3900 > > [ 6708.792551] x29: 810ffbad3900 x28: 08c73d58 > > [ 6708.797853] x27: x26: 81000bbe1e00 > > [ 6708.803156] x25: 0020 x24: 800f7410bf28 > > [ 6708.808458] x23: 08c63000 x22: 08c63000 > > [ 6708.813760] x21: 800f7410bf28 x20: 81000bbe1e00 > > [ 6708.819063] x19: 810012412400 x18: d82a9df2 > > [ 6708.824365] x17: x16: > > [ 6708.829667] x15: x14: 0001 > > [ 6708.834969] x13: x12: 722e736f622e676e > > [ 6708.840271] x11: f814dd99 x10: > > [ 6708.845573] x9 : 737468722500 x8 : > > [ 6708.850875] x7 : x6 : > > [ 6708.856177] x5 : 0028 x4 : > > [ 6708.861479] x3 : x2 : e500 > > [ 6708.866781] x1 : x0 : 81000bbe1e00 > > [ 6708.872084] > > [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0x81000689c000) > > [ 6708.880341] Stack: (0x810ffbad3900 to 0x8100068a) > > [ 6708.886075] Call trace: > > [ 6708.888513] Exception stack(0x810ffbad3710 to 0x810ffbad3840) > > [ 6708.894942] 3700: 810012412400 > > 0001 > > [ 6708.902759] 3720: 810ffbad3900 6145 > > 800f7930 > > [ 6708.910577] 3740: 09274d00 03ea 0015 > > 08c63000 > > [ 6708.918395] 3760: 810ffbad3830 800f7930 004d > > > > [ 6708.926212] 3780: 810ffbad3890 080f88dc 800f7930 > > 004d > > [ 6708.934030] 37a0: 800f7930093c 08c63000 > > 014
[PATCH] net: sunrpc: svcsock: fix NULL-pointer exception
rnal+0x68/0xdc [ 6709.041010] [] napi_gro_receive+0xcc/0x1a8 [ 6709.046690] [] nicvf_cq_intr_handler+0x59c/0x730 [nicvf] [ 6709.053559] [] nicvf_poll+0x38/0xb8 [nicvf] [ 6709.059295] [] net_rx_action+0x2f8/0x464 [ 6709.064771] [] __do_softirq+0x11c/0x308 [ 6709.070164] [] irq_exit+0x12c/0x174 [ 6709.075206] [] __handle_domain_irq+0x78/0xc4 [ 6709.081027] [] gic_handle_irq+0x94/0x190 [ 6709.086501] Exception stack(0x81000689fdf0 to 0x81000689ff20) [ 6709.092929] fde0: 810ff2ec 08c1 [ 6709.100747] fe00: 08c70ef4 0001 810ffbad9b18 [ 6709.108565] fe20: 810ffbad9c70 8100169d3800 810006843ab0 81000689fe80 [ 6709.116382] fe40: 0bd0 df979cd0 183f5913da192500 8a254ce4 [ 6709.124200] fe60: 8a254b78 aaab10339808 8a0c2a50 [ 6709.132018] fe80: df979b10 08d6d450 08c1 08d6d000 [ 6709.139836] fea0: 0054 08cd3dbc [ 6709.147653] fec0: 81000689ff20 [ 6709.155471] fee0: 08085240 81000689ff20 08085244 6145 [ 6709.163289] ff00: 81000689ff10 0813f1e4 0813f238 [ 6709.171107] [] el1_irq+0xb4/0x140 [ 6709.175976] [] arch_cpu_idle+0x44/0x11c [ 6709.181368] [] default_idle_call+0x20/0x30 [ 6709.187020] [] do_idle+0x158/0x1e4 [ 6709.191973] [] cpu_startup_entry+0x2c/0x30 [ 6709.197624] [] secondary_start_kernel+0x13c/0x160 [ 6709.203878] [<01bc71c4>] 0x1bc71c4 [ 6709.207967] Code: bad PC value [ 6709.211061] SMP: stopping secondary CPUs [ 6709.218830] Starting crashdump kernel... [ 6709.222749] Bye! ---<-snip>--- Signed-off-by: Vadim Lomovtsev --- net/sunrpc/svcsock.c | 24 ++-- 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 2b720fa..b6496f3 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -421,7 +421,9 @@ static void svc_data_ready(struct sock *sk) dprintk("svc: socket %p(inet %p), busy=%d\n", svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)); - svsk->sk_odata(sk); + rmb(); + if (svsk->sk_odata) + svsk->sk_odata(sk); if (!test_and_set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags)) svc_xprt_enqueue(&svsk->sk_xprt); } @@ -437,7 +439,9 @@ static void svc_write_space(struct sock *sk) if (svsk) { dprintk("svc: socket %p(inet %p), write_space busy=%d\n", svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)); - svsk->sk_owspace(sk); + rmb(); + if (svsk->sk_owspace) + svsk->sk_owspace(sk); svc_xprt_enqueue(&svsk->sk_xprt); } } @@ -760,8 +764,12 @@ static void svc_tcp_listen_data_ready(struct sock *sk) dprintk("svc: socket %p TCP (listen) state change %d\n", sk, sk->sk_state); - if (svsk) - svsk->sk_odata(sk); + if (svsk) { + rmb(); + if (svsk->sk_odata) + svsk->sk_odata(sk); + } + /* * This callback may called twice when a new connection * is established as a child socket inherits everything @@ -794,7 +802,10 @@ static void svc_tcp_state_change(struct sock *sk) if (!svsk) printk("svc: socket %p: no user data\n", sk); else { - svsk->sk_ostate(sk); + rmb(); + if (svsk->sk_ostate) + svsk->sk_ostate(sk); + if (sk->sk_state != TCP_ESTABLISHED) { set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags); svc_xprt_enqueue(&svsk->sk_xprt); @@ -1381,12 +1392,13 @@ static struct svc_sock *svc_setup_socket(struct svc_serv *serv, return ERR_PTR(err); } - inet->sk_user_data = svsk; svsk->sk_sock = sock; svsk->sk_sk = inet; svsk->sk_ostate = inet->sk_state_change; svsk->sk_odata = inet->sk_data_ready; svsk->sk_owspace = inet->sk_write_space; + wmb(); + inet->sk_user_data = svsk; /* Initialize the socket */ if (sock->type == SOCK_DGRAM) -- 1.8.3.1
[PATCH] net: thunderx: acpi: fix LMAC initialization
While probing BGX we requesting appropriate QLM for it's configuration and get LMAC count by that request. Then, while reading configured MAC values from SSDT table we need to save them in proper mapping: BGX[i]->lmac[j].mac = to later provide for initialization stuff. In order to fill such mapping properly we need to add lmac index to be used while acpi initialization since at this moment bgx->lmac_count already contains actual value. Signed-off-by: Vadim Lomovtsev --- drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c index be30ad0..a3f4f83 100644 --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c @@ -47,8 +47,9 @@ struct lmac { struct bgx { u8 bgx_id; struct lmaclmac[MAX_LMAC_PER_BGX]; - int lmac_count; + u8 lmac_count; u8 max_lmac; + u8 acpi_lmac_idx; void __iomem*reg_base; struct pci_dev *pdev; boolis_dlm; @@ -1073,13 +1074,13 @@ static acpi_status bgx_acpi_register_phy(acpi_handle handle, if (acpi_bus_get_device(handle, &adev)) goto out; - acpi_get_mac_address(dev, adev, bgx->lmac[bgx->lmac_count].mac); + acpi_get_mac_address(dev, adev, bgx->lmac[bgx->acpi_lmac_idx].mac); - SET_NETDEV_DEV(&bgx->lmac[bgx->lmac_count].netdev, dev); + SET_NETDEV_DEV(&bgx->lmac[bgx->acpi_lmac_idx].netdev, dev); - bgx->lmac[bgx->lmac_count].lmacid = bgx->lmac_count; + bgx->lmac[bgx->acpi_lmac_idx].lmacid = bgx->acpi_lmac_idx; + bgx->acpi_lmac_idx++; /* move to next LMAC */ out: - bgx->lmac_count++; return AE_OK; } -- 1.8.3.1