pull-request: wireless-drivers 2017-06-20
Hi Dave,

here's a pull request to the net tree with a few important fixes I would
still like to have in 4.12. Please let me know if there are any problems.

Kalle

The following changes since commit dc89481bb4c9af0700423e21c8371379d3d943b1:

  Merge tag 'iwlwifi-for-kalle-2017-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes (2017-06-05 22:21:25 +0300)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git tags/wireless-drivers-for-davem-2017-06-20

for you to fetch changes up to 35abcd4f9f303ac4f10f99b3f7e993e5f2e6fa37:

  brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2() (2017-06-16 11:52:36 +0300)

wireless-drivers fixes for 4.12

Two important fixes for brcmfmac. The rest of the brcmfmac patches are
either code preparation or fix a new build warning.

brcmfmac

* fix a NULL pointer dereference during resume

* fix a NULL pointer dereference with USB devices, a regression from
  v4.12-rc1

Arend Van Spriel (5):
      brcmfmac: add parameter to pass error code in firmware callback
      brcmfmac: use firmware callback upon failure to load
      brcmfmac: unbind all devices upon failure in firmware callback
      brcmfmac: fix brcmf_fws_add_interface() for USB devices
      brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2()

 .../broadcom/brcm80211/brcmfmac/firmware.c         | 35 +++---
 .../broadcom/brcm80211/brcmfmac/firmware.h         |  4 +--
 .../broadcom/brcm80211/brcmfmac/fwsignal.c         |  2 +-
 .../wireless/broadcom/brcm80211/brcmfmac/pcie.c    | 17 +++
 .../wireless/broadcom/brcm80211/brcmfmac/sdio.c    | 18 +++
 .../net/wireless/broadcom/brcm80211/brcmfmac/usb.c |  9 +++---
 6 files changed, 49 insertions(+), 36 deletions(-)
Re: [PATCH 00/51] rtc: stop using rtc deprecated functions
On 20/06/2017 at 15:44:58 +0200, Pavel Machek wrote:
> On Tue 2017-06-20 13:37:22, Steve Twiss wrote:
> > Hi Pavel,
> >
> > On 20 June 2017 14:26, Pavel Machek wrote:
> >
> > > Subject: Re: [PATCH 00/51] rtc: stop using rtc deprecated functions
> > >
> > > On Tue 2017-06-20 14:24:00, Alexandre Belloni wrote:
> > > > On 20/06/2017 at 14:10:11 +0200, Pavel Machek wrote:
> > > > > On Tue 2017-06-20 12:03:48, Alexandre Belloni wrote:
> > > > > > On 20/06/2017 at 11:35:08 +0200, Benjamin Gaignard wrote:
> > > > > > > rtc_time_to_tm() and rtc_tm_to_time() are deprecated because they
> > > > > > > rely on 32bits variables and that will make rtc break in
> > > > > > > y2038/2016.
> > > > > >
> > > > > > Please don't, because this hide the fact that the hardware will not
> > > > > > handle dates in y2038 anyway and as pointed by Russell a few month
> > > > > > ago,
> > > > > > rtc_time_to_tm will be able to catch it but the 64 bit version will
> > > > > > silently ignore it.
> > > > >
> > > > > Reference? Because rtc on PCs stores date in binary coded decimal, so
> > > > > it is likely to break in 2100, not 2038...
> > > >
> > > > I'm not saying it should be done but clearly, that is not the correct
> > > > thing to do for RTCs that are using a single 32 bits register to store
> > > > the time.
> > > > You give one example, I can give you three: armada38x, at91sam9,
> > > > at32ap700x and that just in the beginning of the series.
> > >
> > > I wanted reference to Russell's mail.
> >
> > This is it.
> > https://patchwork.kernel.org/patch/6219401/
>
> Thanks.
>
> Yes, that's argument against changing rtc _drivers_ for hardware that
> can not do better than 32bit. For generic code (such as 44/51 sysfs,
> 51/51 suspend test), the change still makes sense.
>

Yes, we agree on that but I won't cherry pick working patches from a 51
patches series.

--
Alexandre Belloni, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
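To make the point about range checking concrete, here is a rough userspace
sketch, not code from the RTC core. It assumes hardware with a single
unsigned 32-bit seconds register; the helper name rtc32_encode() and its
error convention are made up for illustration. The idea is that a driver
handed a 64-bit time can still refuse values the register cannot hold,
instead of silently truncating them:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): store a 64-bit seconds count
 * in a 32-bit unsigned RTC register, rejecting values that do not fit.
 */
static int rtc32_encode(int64_t secs, uint32_t *reg)
{
	if (secs < 0 || secs > (int64_t)UINT32_MAX)
		return -ERANGE;	/* the hardware cannot represent this time */

	*reg = (uint32_t)secs;
	return 0;
}
```

With a check like this the limit of the hardware stays visible to the
caller even when the surrounding code uses 64-bit time values.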
[PATCH net-next 3/4] s390/diag: add diag26c support
Implement support for the hypervisor diagnose 0x26c
('Access Certain System Information').
It passes a request buffer and a subfunction code, and receives
a response buffer and a return code.
Also add the scaffolding for the 'MAC Services' subfunction.
It may be used by network devices to obtain a hypervisor-managed
MAC address.

Signed-off-by: Julian Wiedmann
Acked-by: Heiko Carstens
---
 arch/s390/include/asm/diag.h | 26 ++
 arch/s390/kernel/diag.c      | 29 +
 2 files changed, 55 insertions(+)

diff --git a/arch/s390/include/asm/diag.h b/arch/s390/include/asm/diag.h
index 8acf482162ed..88162bb5c190 100644
--- a/arch/s390/include/asm/diag.h
+++ b/arch/s390/include/asm/diag.h
@@ -8,6 +8,7 @@
 #ifndef _ASM_S390_DIAG_H
 #define _ASM_S390_DIAG_H
 
+#include
 #include
 
 enum diag_stat_enum {
@@ -24,6 +25,7 @@ enum diag_stat_enum {
 	DIAG_STAT_X224,
 	DIAG_STAT_X250,
 	DIAG_STAT_X258,
+	DIAG_STAT_X26C,
 	DIAG_STAT_X288,
 	DIAG_STAT_X2C4,
 	DIAG_STAT_X2FC,
@@ -225,6 +227,30 @@ struct diag204_x_phys_block {
 	struct diag204_x_phys_cpu cpus[];
 } __packed;
 
+enum diag26c_sc {
+	DIAG26C_MAC_SERVICES = 0x0030
+};
+
+enum diag26c_version {
+	DIAG26C_VERSION2 = 0x0002	/* z/VM 5.4.0 */
+};
+
+#define DIAG26C_GET_MAC0x
+struct diag26c_mac_req {
+	u32	resp_buf_len;
+	u32	resp_version;
+	u16	op_code;
+	u16	devno;
+	u8	res[4];
+};
+
+struct diag26c_mac_resp {
+	u32	version;
+	u8	mac[ETH_ALEN];
+	u8	res[2];
+} __aligned(8);
+
 int diag204(unsigned long subcode, unsigned long size, void *addr);
 int diag224(void *ptr);
+int diag26c(void *req, void *resp, enum diag26c_sc subcode);
 #endif /* _ASM_S390_DIAG_H */
diff --git a/arch/s390/kernel/diag.c b/arch/s390/kernel/diag.c
index ac6abcd3fe6a..349914571772 100644
--- a/arch/s390/kernel/diag.c
+++ b/arch/s390/kernel/diag.c
@@ -38,6 +38,7 @@ static const struct diag_desc diag_map[NR_DIAG_STAT] = {
 	[DIAG_STAT_X224] = { .code = 0x224, .name = "EBCDIC-Name Table" },
 	[DIAG_STAT_X250] = { .code = 0x250, .name = "Block I/O" },
 	[DIAG_STAT_X258] = { .code = 0x258, .name = "Page-Reference Services" },
+	[DIAG_STAT_X26C] = { .code = 0x26c, .name = "Certain System Information" },
 	[DIAG_STAT_X288] = { .code = 0x288, .name = "Time Bomb" },
 	[DIAG_STAT_X2C4] = { .code = 0x2c4, .name = "FTP Services" },
 	[DIAG_STAT_X2FC] = { .code = 0x2fc, .name = "Guest Performance Data" },
@@ -236,3 +237,31 @@ int diag224(void *ptr)
 	return rc;
 }
 EXPORT_SYMBOL(diag224);
+
+/*
+ * Diagnose 26C: Access Certain System Information
+ */
+static inline int __diag26c(void *req, void *resp, enum diag26c_sc subcode)
+{
+	register unsigned long _req asm("2") = (addr_t) req;
+	register unsigned long _resp asm("3") = (addr_t) resp;
+	register unsigned long _subcode asm("4") = subcode;
+	register unsigned long _rc asm("5") = -EOPNOTSUPP;
+
+	asm volatile(
+		"	sam31\n"
+		"	diag	%[rx],%[ry],0x26c\n"
+		"0:	sam64\n"
+		EX_TABLE(0b,0b)
+		: "+d" (_rc)
+		: [rx] "d" (_req), "d" (_resp), [ry] "d" (_subcode)
+		: "cc", "memory");
+	return _rc;
+}
+
+int diag26c(void *req, void *resp, enum diag26c_sc subcode)
+{
+	diag_stat_inc(DIAG_STAT_X26C);
+	return __diag26c(req, resp, subcode);
+}
+EXPORT_SYMBOL(diag26c);
-- 
2.11.2
[PATCH net-next 1/4] s390/qeth: add ipa return codes for bridgeport
From: Kittipon Meesompop

add ipa return codes for Bridgeport (HiperSockets and OSA)
according to system level design.

Signed-off-by: Kittipon Meesompop
Reviewed-by: Julian Wiedmann
Reviewed-by: Ursula Braun
Signed-off-by: Julian Wiedmann
---
 drivers/s390/net/qeth_core_mpc.c | 14 ++
 drivers/s390/net/qeth_core_mpc.h | 18 ++
 drivers/s390/net/qeth_l2_main.c  | 34 +-
 3 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/drivers/s390/net/qeth_core_mpc.c b/drivers/s390/net/qeth_core_mpc.c
index ab9b1376467f..6dd7d05e5693 100644
--- a/drivers/s390/net/qeth_core_mpc.c
+++ b/drivers/s390/net/qeth_core_mpc.c
@@ -170,12 +170,18 @@ static struct ipa_rc_msg qeth_ipa_rc_msg[] = {
 	{IPA_RC_TRACE_ALREADY_ACTIVE,	"trace already active"},
 	{IPA_RC_INVALID_FORMAT,		"invalid format or length"},
 	{IPA_RC_DUP_IPV6_REMOTE, "ipv6 address already registered remote"},
+	{IPA_RC_SBP_IQD_NOT_CONFIGURED, "Not configured for bridgeport"},
 	{IPA_RC_DUP_IPV6_HOME,		"ipv6 address already registered"},
 	{IPA_RC_UNREGISTERED_ADDR,	"Address not registered"},
 	{IPA_RC_NO_ID_AVAILABLE,	"No identifiers available"},
 	{IPA_RC_ID_NOT_FOUND,		"Identifier not found"},
+	{IPA_RC_SBP_IQD_ANO_DEV_PRIMARY, "Primary bridgeport exists already"},
+	{IPA_RC_SBP_IQD_CURRENT_SECOND,	"Bridgeport is currently secondary"},
+	{IPA_RC_SBP_IQD_LIMIT_SECOND, "Limit of secondary bridgeports reached"},
 	{IPA_RC_INVALID_IP_VERSION,	"IP version incorrect"},
+	{IPA_RC_SBP_IQD_CURRENT_PRIMARY, "Bridgeport is currently primary"},
 	{IPA_RC_LAN_FRAME_MISMATCH,	"LAN and frame mismatch"},
+	{IPA_RC_SBP_IQD_NO_QDIO_QUEUES,	"QDIO queues not established"},
 	{IPA_RC_L2_UNSUPPORTED_CMD,	"Unsupported layer 2 command"},
 	{IPA_RC_L2_DUP_MAC,		"Duplicate MAC address"},
 	{IPA_RC_L2_ADDR_TABLE_FULL,	"Layer2 address table full"},
@@ -187,6 +193,14 @@ static struct ipa_rc_msg qeth_ipa_rc_msg[] = {
 	{IPA_RC_L2_INVALID_VLAN_ID,	"L2 invalid vlan id"},
 	{IPA_RC_L2_DUP_VLAN_ID,		"L2 duplicate vlan id"},
 	{IPA_RC_L2_VLAN_ID_NOT_FOUND,	"L2 vlan id not found"},
+	{IPA_RC_SBP_OSA_NOT_CONFIGURED, "Not configured for bridgeport"},
+	{IPA_RC_SBP_OSA_OS_MISMATCH,	"OS mismatch"},
+	{IPA_RC_SBP_OSA_ANO_DEV_PRIMARY, "Primary bridgeport exists already"},
+	{IPA_RC_SBP_OSA_CURRENT_SECOND,	"Bridgeport is currently secondary"},
+	{IPA_RC_SBP_OSA_LIMIT_SECOND, "Limit of secondary bridgeports reached"},
+	{IPA_RC_SBP_OSA_NOT_AUTHD_BY_ZMAN, "Not authorized by zManager"},
+	{IPA_RC_SBP_OSA_CURRENT_PRIMARY, "Bridgeport is currently primary"},
+	{IPA_RC_SBP_OSA_NO_QDIO_QUEUES,	"QDIO queues not established"},
 	{IPA_RC_DATA_MISMATCH,		"Data field mismatch (v4/v6 mixed)"},
 	{IPA_RC_INVALID_MTU_SIZE,	"Invalid MTU size"},
 	{IPA_RC_INVALID_LANTYPE,	"Invalid LAN type"},
diff --git a/drivers/s390/net/qeth_core_mpc.h b/drivers/s390/net/qeth_core_mpc.h
index 45bbea2843bf..912e0107de8f 100644
--- a/drivers/s390/net/qeth_core_mpc.h
+++ b/drivers/s390/net/qeth_core_mpc.h
@@ -142,12 +142,18 @@ enum qeth_ipa_return_codes {
 	IPA_RC_TRACE_ALREADY_ACTIVE	= 0x0005,
 	IPA_RC_INVALID_FORMAT		= 0x0006,
 	IPA_RC_DUP_IPV6_REMOTE		= 0x0008,
+	IPA_RC_SBP_IQD_NOT_CONFIGURED	= 0x000C,
 	IPA_RC_DUP_IPV6_HOME		= 0x0010,
 	IPA_RC_UNREGISTERED_ADDR	= 0x0011,
 	IPA_RC_NO_ID_AVAILABLE		= 0x0012,
 	IPA_RC_ID_NOT_FOUND		= 0x0013,
+	IPA_RC_SBP_IQD_ANO_DEV_PRIMARY	= 0x0014,
+	IPA_RC_SBP_IQD_CURRENT_SECOND	= 0x0018,
+	IPA_RC_SBP_IQD_LIMIT_SECOND	= 0x001C,
 	IPA_RC_INVALID_IP_VERSION	= 0x0020,
+	IPA_RC_SBP_IQD_CURRENT_PRIMARY	= 0x0024,
 	IPA_RC_LAN_FRAME_MISMATCH	= 0x0040,
+	IPA_RC_SBP_IQD_NO_QDIO_QUEUES	= 0x00EB,
 	IPA_RC_L2_UNSUPPORTED_CMD	= 0x2003,
 	IPA_RC_L2_DUP_MAC		= 0x2005,
 	IPA_RC_L2_ADDR_TABLE_FULL	= 0x2006,
@@ -159,6 +165,14 @@ enum qeth_ipa_return_codes {
 	IPA_RC_L2_INVALID_VLAN_ID	= 0x2015,
 	IPA_RC_L2_DUP_VLAN_ID		= 0x2016,
 	IPA_RC_L2_VLAN_ID_NOT_FOUND	= 0x2017,
+	IPA_RC_SBP_OSA_NOT_CONFIGURED	= 0x2B0C,
+	IPA_RC_SBP_OSA_OS_MISMATCH	= 0x2B10,
+	IPA_RC_SBP_OSA_ANO_DEV_PRIMARY	= 0x2B14,
+	IPA_RC_SBP_OSA_CURRENT_SECOND	= 0x2B18,
+	IPA_RC_SBP_OSA_LIMIT_SECOND	= 0x2B1C,
+	IPA_RC_SBP_OSA_NOT_AUTHD_BY_ZMAN = 0x2B20,
+	IPA_RC_SBP_OSA_CURRENT_PRIMARY	= 0x2B24,
+	IPA_RC_SBP_OSA_NO_QDIO_QUEUES	= 0x2BEB,
[PATCH net-next 0/4] s390/net updates, part 2 (v2)
Hi Dave,

thanks for the feedback. Here's an updated patchset that honours
the reverse christmas tree and drops the __packed attribute.
Please apply.

Thanks,
Julian

Julian Wiedmann (3):
  s390/qeth: fix packing buffer statistics
  s390/diag: add diag26c support
  s390/qeth: use diag26c to get MAC address on L2

Kittipon Meesompop (1):
  s390/qeth: add ipa return codes for bridgeport

 arch/s390/include/asm/diag.h      | 26 +
 arch/s390/kernel/diag.c           | 29 +++
 drivers/s390/net/qeth_core.h      |  1 +
 drivers/s390/net/qeth_core_main.c | 78 +++
 drivers/s390/net/qeth_core_mpc.c  | 14 +++
 drivers/s390/net/qeth_core_mpc.h  | 18 +
 drivers/s390/net/qeth_l2_main.c   | 50 +++--
 7 files changed, 190 insertions(+), 26 deletions(-)

-- 
2.11.2
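As a side note on the dropped __packed attribute: assuming the remark refers
to the diag26c request/response buffers from patch 3/4, their layout can be
checked in userspace. This is only a sketch under several assumptions. It
uses <stdint.h> types in place of the kernel's u8/u16/u32, GCC/Clang
attribute syntax in place of __aligned(8), and a hypothetical helper
mac_req_init() whose op_code argument is left to the caller because the
DIAG26C_GET_MAC value is not legible in the posted patch. Setting
resp_buf_len to the size of the response structure is likewise a guess.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6	/* length of an Ethernet MAC address in bytes */

/* Same field order as the request structure in patch 3/4. */
struct diag26c_mac_req {
	uint32_t resp_buf_len;
	uint32_t resp_version;
	uint16_t op_code;
	uint16_t devno;
	uint8_t res[4];
};

/* Same field order as the response structure, 8-byte aligned. */
struct diag26c_mac_resp {
	uint32_t version;
	uint8_t mac[ETH_ALEN];
	uint8_t res[2];
} __attribute__((aligned(8)));

/* Hypothetical helper, not part of the patch: fill in a MAC request. */
static void mac_req_init(struct diag26c_mac_req *req, uint16_t op_code,
			 uint16_t devno)
{
	memset(req, 0, sizeof(*req));
	req->resp_buf_len = sizeof(struct diag26c_mac_resp);	/* assumption */
	req->resp_version = 0x0002;	/* DIAG26C_VERSION2 */
	req->op_code = op_code;
	req->devno = devno;
}
```

Every member sits at its natural alignment, so the request needs no padding
with or without packing, and the alignment attribute only rounds the
response size up to a multiple of 8.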
Re: Repeatable inet6_dump_fib crash in stock 4.12.0-rc4+
On 06/14/2017 03:25 PM, David Ahern wrote:
> On 6/14/17 4:23 PM, Ben Greear wrote:
>> On 06/13/2017 07:27 PM, David Ahern wrote:
>>> Let's try a targeted debug patch. See attached
>>
>> I had to change it to pr_err so it would go to our serial console
>> since the system locked hard on crash,
>> and that appears to be enough to change the timing where we can no longer
>> reproduce the problem.
>
> ok, let's figure out which one is doing that. There are 3 debug
> statements. I suspect fib6_del_route is the one setting the state to
> FWS_U. Can you remove the debug prints in fib6_repair_tree and
> fib6_walk_continue and try again?

We cannot reproduce with just that one printf in the kernel either. It must change the timing
too much to trigger the bug.

Thanks,
Ben

-- 
Ben Greear
Candela Technologies Inc  http://www.candelatech.com
[PATCH v4 net-next 4/7] qed*: qede_roce.[ch] -> qede_rdma.[ch]
From: Michal Kalderon

Once we have iWARP support, the qede portion of the qedr<->qede would
serve all the RDMA protocols - so rename the file to be appropriate
to its function.

While we're at it, we're also moving a couple of inclusions to it into
.h files and adding includes to make sure it contains all type
definitions it requires.

Signed-off-by: Michal Kalderon
Signed-off-by: Yuval Mintz
---
 drivers/infiniband/hw/qedr/main.c                             | 2 +-
 drivers/infiniband/hw/qedr/qedr.h                             | 2 +-
 drivers/net/ethernet/qlogic/qede/Makefile                     | 2 +-
 drivers/net/ethernet/qlogic/qede/qede.h                       | 1 +
 drivers/net/ethernet/qlogic/qede/qede_main.c                  | 1 -
 drivers/net/ethernet/qlogic/qede/{qede_roce.c => qede_rdma.c} | 2 +-
 include/linux/qed/{qede_roce.h => qede_rdma.h}                | 5 +
 7 files changed, 10 insertions(+), 5 deletions(-)
 rename drivers/net/ethernet/qlogic/qede/{qede_roce.c => qede_rdma.c} (99%)
 rename include/linux/qed/{qede_roce.h => qede_rdma.h} (96%)

diff --git a/drivers/infiniband/hw/qedr/main.c b/drivers/infiniband/hw/qedr/main.c
index 5a32b80..714eb0c 100644
--- a/drivers/infiniband/hw/qedr/main.c
+++ b/drivers/infiniband/hw/qedr/main.c
@@ -37,7 +37,7 @@
 #include
 #include
 #include
-#include
+
 #include
 #include
 #include "qedr.h"
diff --git a/drivers/infiniband/hw/qedr/qedr.h b/drivers/infiniband/hw/qedr/qedr.h
index 80333ec..2376019 100644
--- a/drivers/infiniband/hw/qedr/qedr.h
+++ b/drivers/infiniband/hw/qedr/qedr.h
@@ -37,7 +37,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include "qedr_hsi_rdma.h"
diff --git a/drivers/net/ethernet/qlogic/qede/Makefile b/drivers/net/ethernet/qlogic/qede/Makefile
index bc5f7c3..75408fb 100644
--- a/drivers/net/ethernet/qlogic/qede/Makefile
+++ b/drivers/net/ethernet/qlogic/qede/Makefile
@@ -2,4 +2,4 @@ obj-$(CONFIG_QEDE) := qede.o
 
 qede-y := qede_main.o qede_fp.o qede_filter.o qede_ethtool.o qede_ptp.o
 qede-$(CONFIG_DCB) += qede_dcbnl.o
-qede-$(CONFIG_QED_RDMA) += qede_roce.o
+qede-$(CONFIG_QED_RDMA) += qede_rdma.o
diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index 694c09b..2d6b30c 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include
 #include
 #ifdef CONFIG_RFS_ACCEL
 #include
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 37ad799..e9eaa38 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -60,7 +60,6 @@
 #include
 #include
 #include
-#include
 
 #include "qede.h"
 #include "qede_ptp.h"
diff --git a/drivers/net/ethernet/qlogic/qede/qede_roce.c b/drivers/net/ethernet/qlogic/qede/qede_rdma.c
similarity index 99%
rename from drivers/net/ethernet/qlogic/qede/qede_roce.c
rename to drivers/net/ethernet/qlogic/qede/qede_rdma.c
index c0030fb..9837ee2 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_roce.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_rdma.c
@@ -33,7 +33,7 @@
 #include
 #include
 #include
-#include
+#include
 #include "qede.h"
 
 static struct qedr_driver *qedr_drv;
diff --git a/include/linux/qed/qede_roce.h b/include/linux/qed/qede_rdma.h
similarity index 96%
rename from include/linux/qed/qede_roce.h
rename to include/linux/qed/qede_rdma.h
index 3b8dd55..a1a9b81 100644
--- a/include/linux/qed/qede_roce.h
+++ b/include/linux/qed/qede_rdma.h
@@ -32,6 +32,11 @@
 #ifndef QEDE_ROCE_H
 #define QEDE_ROCE_H
 
+#include
+#include
+#include
+#include
+
 struct qedr_dev;
 struct qed_dev;
 struct qede_dev;
-- 
2.9.4
[PATCH v4 net-next 5/7] qed*: Set rdma generic functions prefix
From: Michal Kalderon

Rename the functions common to both iWARP and RoCE to have a prefix of
_rdma_ instead of _roce_.

Signed-off-by: Michal Kalderon
Signed-off-by: Yuval Mintz
---
 drivers/infiniband/hw/qedr/main.c            |   6 +-
 drivers/net/ethernet/qlogic/qede/qede.h      |   4 +-
 drivers/net/ethernet/qlogic/qede/qede_main.c |  12 +--
 drivers/net/ethernet/qlogic/qede/qede_rdma.c | 142 +--
 include/linux/qed/qede_rdma.h                |  37 +++
 5 files changed, 101 insertions(+), 100 deletions(-)

diff --git a/drivers/infiniband/hw/qedr/main.c b/drivers/infiniband/hw/qedr/main.c
index 714eb0c..b5851fd 100644
--- a/drivers/infiniband/hw/qedr/main.c
+++ b/drivers/infiniband/hw/qedr/main.c
@@ -902,7 +902,7 @@ static void qedr_mac_address_change(struct qedr_dev *dev)
  * initialization done before RoCE driver notifies
  * event to stack.
  */
-static void qedr_notify(struct qedr_dev *dev, enum qede_roce_event event)
+static void qedr_notify(struct qedr_dev *dev, enum qede_rdma_event event)
 {
 	switch (event) {
 	case QEDE_UP:
@@ -931,12 +931,12 @@ static struct qedr_driver qedr_drv = {
 
 static int __init qedr_init_module(void)
 {
-	return qede_roce_register_driver(&qedr_drv);
+	return qede_rdma_register_driver(&qedr_drv);
 }
 
 static void __exit qedr_exit_module(void)
 {
-	qede_roce_unregister_driver(&qedr_drv);
+	qede_rdma_unregister_driver(&qedr_drv);
 }
 
 module_init(qedr_init_module);
diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index 2d6b30c..4dfb238 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -154,8 +154,8 @@ struct qede_vlan {
 struct qede_rdma_dev {
 	struct qedr_dev *qedr_dev;
 	struct list_head entry;
-	struct list_head roce_event_list;
-	struct workqueue_struct *roce_wq;
+	struct list_head rdma_event_list;
+	struct workqueue_struct *rdma_wq;
 };
 
 struct qede_ptp;
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index e9eaa38..06ca13d 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -262,7 +262,7 @@ static int qede_netdev_event(struct notifier_block *this, unsigned long event,
 		break;
 	case NETDEV_CHANGEADDR:
 		edev = netdev_priv(ndev);
-		qede_roce_event_changeaddr(edev);
+		qede_rdma_event_changeaddr(edev);
 		break;
 	}
 
@@ -977,7 +977,7 @@ static int __qede_probe(struct pci_dev *pdev, u32 dp_module, u8 dp_level,
 
 	qede_init_ndev(edev);
 
-	rc = qede_roce_dev_add(edev);
+	rc = qede_rdma_dev_add(edev);
 	if (rc)
 		goto err3;
 
@@ -1013,7 +1013,7 @@ static int __qede_probe(struct pci_dev *pdev, u32 dp_module, u8 dp_level,
 	return 0;
 
 err4:
-	qede_roce_dev_remove(edev);
+	qede_rdma_dev_remove(edev);
 err3:
 	free_netdev(edev->ndev);
 err2:
@@ -1064,7 +1064,7 @@ static void __qede_remove(struct pci_dev *pdev, enum qede_remove_mode mode)
 
 	qede_ptp_disable(edev);
 
-	qede_roce_dev_remove(edev);
+	qede_rdma_dev_remove(edev);
 
 	edev->ops->common->set_power_state(cdev, PCI_D0);
 
@@ -1964,7 +1964,7 @@ static void qede_unload(struct qede_dev *edev, enum qede_unload_mode mode,
 
 	edev->state = QEDE_STATE_CLOSED;
 
-	qede_roce_dev_event_close(edev);
+	qede_rdma_dev_event_close(edev);
 
 	/* Close OS Tx */
 	netif_tx_disable(edev->ndev);
@@ -2069,7 +2069,7 @@ static int qede_load(struct qede_dev *edev, enum qede_load_mode mode,
 	link_params.link_up = true;
 	edev->ops->common->set_link(edev->cdev, &link_params);
 
-	qede_roce_dev_event_open(edev);
+	qede_rdma_dev_event_open(edev);
 
 	edev->state = QEDE_STATE_OPEN;
 
diff --git a/drivers/net/ethernet/qlogic/qede/qede_rdma.c b/drivers/net/ethernet/qlogic/qede/qede_rdma.c
index 9837ee2..50b142f 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_rdma.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_rdma.c
@@ -40,12 +40,12 @@ static struct qedr_driver *qedr_drv;
 static LIST_HEAD(qedr_dev_list);
 static DEFINE_MUTEX(qedr_dev_list_lock);
 
-bool qede_roce_supported(struct qede_dev *dev)
+bool qede_rdma_supported(struct qede_dev *dev)
 {
 	return dev->dev_info.common.rdma_supported;
 }
 
-static void _qede_roce_dev_add(struct qede_dev *edev)
+static void _qede_rdma_dev_add(struct qede_dev *edev)
 {
 	if (!qedr_drv)
 		return;
@@ -54,11 +54,11 @@ static void _qede_roce_dev_add(struct qede_dev *edev)
 				 edev->ndev);
 }
 
-static int qede_roce_create_wq(struct qede_dev *edev)
+static int qede_rdma_create_wq(struct qede_dev *edev)
 {
-
[PATCH v4 net-next 3/7] qed: Disable RoCE dpm when DCBx change occurs
If DCBx update occurs while QPs are open, stop sending edpms until all QPs
are closed.

Signed-off-by: Yuval Mintz
---
 drivers/net/ethernet/qlogic/qed/qed_dcbx.c |  8 +++
 drivers/net/ethernet/qlogic/qed/qed_roce.c | 36 ++
 drivers/net/ethernet/qlogic/qed/qed_roce.h |  5 +
 3 files changed, 49 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dcbx.c b/drivers/net/ethernet/qlogic/qed/qed_dcbx.c
index 15b516a..f888045 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dcbx.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dcbx.c
@@ -44,6 +44,7 @@
 #include "qed_hsi.h"
 #include "qed_sp.h"
 #include "qed_sriov.h"
+#include "qed_roce.h"
 #ifdef CONFIG_DCB
 #include
 #endif
@@ -892,6 +893,13 @@ qed_dcbx_mib_update_event(struct qed_hwfn *p_hwfn,
 
 			/* update storm FW with negotiation results */
 			qed_sp_pf_update(p_hwfn);
+
+			/* for roce PFs, we may want to enable/disable DPM
+			 * when DCBx change occurs
+			 */
+			if (p_hwfn->hw_info.personality ==
+			    QED_PCI_ETH_ROCE)
+				qed_roce_dpm_dcbx(p_hwfn, p_ptt);
 		}
 	}
 
diff --git a/drivers/net/ethernet/qlogic/qed/qed_roce.c b/drivers/net/ethernet/qlogic/qed/qed_roce.c
index 4bc2f6c..8419dcc 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_roce.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_roce.c
@@ -162,6 +162,11 @@ static int qed_bmap_test_id(struct qed_hwfn *p_hwfn,
 	return test_bit(id_num, bmap->bitmap);
 }
 
+static bool qed_bmap_is_empty(struct qed_bmap *bmap)
+{
+	return bmap->max_count == find_first_bit(bmap->bitmap, bmap->max_count);
+}
+
 static u32 qed_rdma_get_sb_id(void *p_hwfn, u32 rel_sb_id)
 {
 	/* First sb id for RoCE is after all the l2 sb */
@@ -2638,6 +2643,23 @@ static void *qed_rdma_get_rdma_ctx(struct qed_dev *cdev)
 	return QED_LEADING_HWFN(cdev);
 }
 
+static bool qed_rdma_allocated_qps(struct qed_hwfn *p_hwfn)
+{
+	bool result;
+
+	/* if rdma info has not been allocated, naturally there are no qps */
+	if (!p_hwfn->p_rdma_info)
+		return false;
+
+	spin_lock_bh(&p_hwfn->p_rdma_info->lock);
+	if (!p_hwfn->p_rdma_info->cid_map.bitmap)
+		result = false;
+	else
+		result = !qed_bmap_is_empty(&p_hwfn->p_rdma_info->cid_map);
+	spin_unlock_bh(&p_hwfn->p_rdma_info->lock);
+	return result;
+}
+
 static void qed_rdma_dpm_conf(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt)
 {
 	u32 val;
@@ -2650,6 +2672,20 @@ static void qed_rdma_dpm_conf(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt)
 		   val, p_hwfn->dcbx_no_edpm, p_hwfn->db_bar_no_edpm);
 }
 
+void qed_roce_dpm_dcbx(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt)
+{
+	u8 val;
+
+	/* if any QPs are already active, we want to disable DPM, since their
+	 * context information contains information from before the latest DCBx
+	 * update. Otherwise enable it.
+	 */
+	val = qed_rdma_allocated_qps(p_hwfn) ? true : false;
+	p_hwfn->dcbx_no_edpm = (u8)val;
+
+	qed_rdma_dpm_conf(p_hwfn, p_ptt);
+}
+
 void qed_rdma_dpm_bar(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt)
 {
 	p_hwfn->db_bar_no_edpm = true;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_roce.h b/drivers/net/ethernet/qlogic/qed/qed_roce.h
index 94be3b5..ddd7761 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_roce.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_roce.h
@@ -168,10 +168,15 @@ struct qed_rdma_qp {
 
 #if IS_ENABLED(CONFIG_QED_RDMA)
 void qed_rdma_dpm_bar(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt);
+void qed_roce_dpm_dcbx(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt);
 void qed_roce_async_event(struct qed_hwfn *p_hwfn,
 			  u8 fw_event_code, union rdma_eqe_data *rdma_data);
 #else
 static inline void qed_rdma_dpm_bar(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt) {}
+
+static inline void qed_roce_dpm_dcbx(struct qed_hwfn *p_hwfn,
+				     struct qed_ptt *p_ptt) {}
+
 static inline void qed_roce_async_event(struct qed_hwfn *p_hwfn,
 					u8 fw_event_code,
 					union rdma_eqe_data *rdma_data) {}
-- 
2.9.4
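The emptiness test in qed_bmap_is_empty() relies on find_first_bit()
returning the bitmap size when no bit is set. A small userspace sketch of
that idiom follows; first_set_bit() and bmap_is_empty() are stand-ins
written for this note, not the kernel helpers themselves.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Userspace stand-in for find_first_bit(): index of the first set bit
 * among the first nbits bits, or nbits if none of them is set.
 */
static size_t first_set_bit(const unsigned long *map, size_t nbits)
{
	size_t i;

	for (i = 0; i < nbits; i++)
		if (map[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
			return i;
	return nbits;
}

/* "No bit found" and "bitmap is empty" are the same condition. */
static bool bmap_is_empty(const unsigned long *map, size_t nbits)
{
	return first_set_bit(map, nbits) == nbits;
}
```

Only the first nbits bits are inspected, so a bit set beyond that range
does not make the bitmap count as non-empty.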
[PATCH v4 net-next 7/7] qed: SPQ async callback registration
From: Michal Kalderon

Whenever firmware indicates that there's an async indication it needs
to handle, there's a switch-case where the right functionality is called
based on function's personality and information.

Before iWARP is added [as yet another client], switch over the SPQ into
a callback-registered mechanism, allowing registration of the relevant
event-processing logic based on the function's personality. This allows
us to tidy the code by removing protocol-specifics from a common file.

Signed-off-by: Michal Kalderon
Signed-off-by: Yuval Mintz
---
 drivers/net/ethernet/qlogic/qed/qed_iscsi.c | 24 -
 drivers/net/ethernet/qlogic/qed/qed_roce.c  | 16 ++---
 drivers/net/ethernet/qlogic/qed/qed_roce.h  |  6
 drivers/net/ethernet/qlogic/qed/qed_sp.h    | 17 +
 drivers/net/ethernet/qlogic/qed/qed_spq.c   | 54 -
 drivers/net/ethernet/qlogic/qed/qed_sriov.c | 16 +++--
 drivers/net/ethernet/qlogic/qed/qed_sriov.h | 18 --
 7 files changed, 96 insertions(+), 55 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_iscsi.c b/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
index 5a1ed05..813c77c 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
@@ -62,6 +62,22 @@
 #include "qed_sriov.h"
 #include "qed_reg_addr.h"
 
+static int
+qed_iscsi_async_event(struct qed_hwfn *p_hwfn,
+		      u8 fw_event_code,
+		      u16 echo, union event_ring_data *data, u8 fw_return_code)
+{
+	if (p_hwfn->p_iscsi_info->event_cb) {
+		struct qed_iscsi_info *p_iscsi = p_hwfn->p_iscsi_info;
+
+		return p_iscsi->event_cb(p_iscsi->event_context,
+					 fw_event_code, data);
+	} else {
+		DP_NOTICE(p_hwfn, "iSCSI async completion is not set\n");
+		return -EINVAL;
+	}
+}
+
 struct qed_iscsi_conn {
 	struct list_head list_entry;
 	bool free_on_delete;
@@ -265,6 +281,9 @@ qed_sp_iscsi_func_start(struct qed_hwfn *p_hwfn,
 	p_hwfn->p_iscsi_info->event_context = event_context;
 	p_hwfn->p_iscsi_info->event_cb = async_event_cb;
 
+	qed_spq_register_async_cb(p_hwfn, PROTOCOLID_ISCSI,
+				  qed_iscsi_async_event);
+
 	return qed_spq_post(p_hwfn, p_ent, NULL);
 }
 
@@ -631,7 +650,10 @@ static int qed_sp_iscsi_func_stop(struct qed_hwfn *p_hwfn,
 	p_ramrod = &p_ent->ramrod.iscsi_destroy;
 	p_ramrod->hdr.op_code = ISCSI_RAMROD_CMD_ID_DESTROY_FUNC;
 
-	return qed_spq_post(p_hwfn, p_ent, NULL);
+	rc = qed_spq_post(p_hwfn, p_ent, NULL);
+
+	qed_spq_unregister_async_cb(p_hwfn, PROTOCOLID_ISCSI);
+	return rc;
 }
 
 static void __iomem *qed_iscsi_get_db_addr(struct qed_hwfn *p_hwfn, u32 cid)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_roce.c b/drivers/net/ethernet/qlogic/qed/qed_roce.c
index 7482905..673f80a 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_roce.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_roce.c
@@ -68,12 +68,14 @@
 
 static void qed_roce_free_real_icid(struct qed_hwfn *p_hwfn, u16 icid);
 
-void qed_roce_async_event(struct qed_hwfn *p_hwfn,
-			  u8 fw_event_code, union rdma_eqe_data *rdma_data)
+static int
+qed_roce_async_event(struct qed_hwfn *p_hwfn,
+		     u8 fw_event_code,
+		     u16 echo, union event_ring_data *data, u8 fw_return_code)
 {
 	if (fw_event_code == ROCE_ASYNC_EVENT_DESTROY_QP_DONE) {
 		u16 icid =
-		    (u16)le32_to_cpu(rdma_data->rdma_destroy_qp_data.cid);
+		    (u16)le32_to_cpu(data->rdma_data.rdma_destroy_qp_data.cid);
 
 		/* icid release in this async event can occur only if the icid
 		 * was offloaded to the FW. In case it wasn't offloaded this is
@@ -85,8 +87,10 @@ void qed_roce_async_event(struct qed_hwfn *p_hwfn,
 
 		events->affiliated_event(p_hwfn->p_rdma_info->events.context,
 					 fw_event_code,
-					 &rdma_data->async_handle);
+				     (void *)&data->rdma_data.async_handle);
 	}
+
+	return 0;
 }
 
 static int qed_rdma_bmap_alloc(struct qed_hwfn *p_hwfn,
@@ -686,6 +690,9 @@ static int qed_rdma_setup(struct qed_hwfn *p_hwfn,
 	if (rc)
 		return rc;
 
+	qed_spq_register_async_cb(p_hwfn, PROTOCOLID_ROCE,
+				  qed_roce_async_event);
+
 	return qed_rdma_start_fw(p_hwfn, params, p_ptt);
 }
 
@@ -706,6 +713,7 @@ void qed_roce_stop(struct qed_hwfn *p_hwfn)
 			break;
 		}
 	}
+	qed_spq_unregister_async_cb(p_hwfn, PROTOCOLID_ROCE);
 }
 
 static int qed_rdma_stop(void *rdma_cxt)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_roce.h
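The qed_spq.c side of the registration mechanism is cut off in this copy of
the patch, so the following is only a guess at its general shape: a
per-protocol table of callbacks that replaces the switch-case on the
function's personality. All names here (spq_sketch, spq_register_cb,
spq_dispatch, MAX_PROTOCOLS, count_events) are invented for the
illustration and are not the driver's own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_PROTOCOLS 8	/* hypothetical table size */

typedef int (*async_cb_t)(void *ctx, unsigned char event_code);

/* One callback slot per protocol id. */
struct spq_sketch {
	async_cb_t async_cb[MAX_PROTOCOLS];
};

static int spq_register_cb(struct spq_sketch *spq, unsigned int proto,
			   async_cb_t cb)
{
	if (proto >= MAX_PROTOCOLS || !cb)
		return -EINVAL;
	spq->async_cb[proto] = cb;
	return 0;
}

static void spq_unregister_cb(struct spq_sketch *spq, unsigned int proto)
{
	if (proto < MAX_PROTOCOLS)
		spq->async_cb[proto] = NULL;
}

/* Common code only looks up the slot; it knows nothing protocol-specific. */
static int spq_dispatch(struct spq_sketch *spq, unsigned int proto,
			void *ctx, unsigned char event_code)
{
	if (proto >= MAX_PROTOCOLS || !spq->async_cb[proto])
		return -EINVAL;
	return spq->async_cb[proto](ctx, event_code);
}

/* Example handler: count how many events were delivered. */
static int count_events(void *ctx, unsigned char event_code)
{
	(void)event_code;
	(*(int *)ctx)++;
	return 0;
}
```

In this sketch an event for a protocol without a registered handler is
reported as -EINVAL rather than dropped silently; whether the driver does
the same cannot be told from the part of the patch shown here.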
[PATCH v4 net-next 6/7] qed: Wait for resources before FUNC_CLOSE
From: Michal Kalderon

Driver needs to wait for all resources to return from FW before it can
send the FUNC_CLOSE ramrod.

Signed-off-by: Michal Kalderon
Signed-off-by: Yuval Mintz
---
 drivers/net/ethernet/qlogic/qed/qed_roce.c | 35 +-
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_roce.c b/drivers/net/ethernet/qlogic/qed/qed_roce.c
index 8419dcc..7482905 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_roce.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_roce.c
@@ -372,22 +372,7 @@ static void qed_rdma_bmap_free(struct qed_hwfn *p_hwfn,
 
 static void qed_rdma_resc_free(struct qed_hwfn *p_hwfn)
 {
-	struct qed_bmap *rcid_map = &p_hwfn->p_rdma_info->real_cid_map;
 	struct qed_rdma_info *p_rdma_info = p_hwfn->p_rdma_info;
-	int wait_count = 0;
-
-	/* when destroying a_RoCE QP the control is returned to the user after
-	 * the synchronous part. The asynchronous part may take a little longer.
-	 * We delay for a short while if an async destroy QP is still expected.
-	 * Beyond the added delay we clear the bitmap anyway.
-	 */
-	while (bitmap_weight(rcid_map->bitmap, rcid_map->max_count)) {
-		msleep(100);
-		if (wait_count++ > 20) {
-			DP_NOTICE(p_hwfn, "cid bitmap wait timed out\n");
-			break;
-		}
-	}
 
 	qed_rdma_bmap_free(p_hwfn, &p_hwfn->p_rdma_info->cid_map, 1);
 	qed_rdma_bmap_free(p_hwfn, &p_hwfn->p_rdma_info->pd_map, 1);
@@ -704,6 +689,25 @@ static int qed_rdma_setup(struct qed_hwfn *p_hwfn,
 	return qed_rdma_start_fw(p_hwfn, params, p_ptt);
 }
 
+void qed_roce_stop(struct qed_hwfn *p_hwfn)
+{
+	struct qed_bmap *rcid_map = &p_hwfn->p_rdma_info->real_cid_map;
+	int wait_count = 0;
+
+	/* when destroying a_RoCE QP the control is returned to the user after
+	 * the synchronous part. The asynchronous part may take a little longer.
+	 * We delay for a short while if an async destroy QP is still expected.
+	 * Beyond the added delay we clear the bitmap anyway.
+	 */
+	while (bitmap_weight(rcid_map->bitmap, rcid_map->max_count)) {
+		msleep(100);
+		if (wait_count++ > 20) {
+			DP_NOTICE(p_hwfn, "cid bitmap wait timed out\n");
+			break;
+		}
+	}
+}
+
 static int qed_rdma_stop(void *rdma_cxt)
 {
 	struct qed_hwfn *p_hwfn = (struct qed_hwfn *)rdma_cxt;
@@ -733,6 +737,7 @@ static int qed_rdma_stop(void *rdma_cxt)
 	qed_wr(p_hwfn, p_ptt, PRS_REG_LIGHT_L2_ETHERTYPE_EN,
 	       (ll2_ethertype_en & 0xFFFE));
 
+	qed_roce_stop(p_hwfn);
 	qed_ptt_release(p_hwfn, p_ptt);
 
 	/* Get SPQ entry */
-- 
2.9.4
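The wait loop moved into qed_roce_stop() is a bounded poll: sleep, re-check,
and give up after a fixed number of rounds. The userspace sketch below
mirrors the same loop structure with a fake device so the bound can be
checked. struct fake_hw, fake_msleep() and wait_for_resources() are made up
for the illustration; nothing here touches real hardware or really sleeps.

```c
#include <assert.h>
#include <stdbool.h>

/* Fake device: 'pending' resources drain by one per sleep unless stuck. */
struct fake_hw {
	int pending;
	bool stuck;
	int sleeps;	/* number of simulated msleep(100) calls */
};

static void fake_msleep(struct fake_hw *hw)
{
	hw->sleeps++;
	if (!hw->stuck && hw->pending > 0)
		hw->pending--;
}

/* Same shape as the loop in the patch; returns false on timeout. */
static bool wait_for_resources(struct fake_hw *hw)
{
	int wait_count = 0;

	while (hw->pending > 0) {
		fake_msleep(hw);
		if (wait_count++ > 20)
			return false;	/* the driver logs a notice here */
	}
	return true;
}
```

Because the counter is compared before it is incremented, a loop written
this way sleeps 22 times before giving up, which at 100 ms per sleep is
roughly 2.2 seconds in the worst case.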
[PATCH v4 net-next 2/7] qed: RoCE EDPM to honor PFC
Configure device according to DCBx results so that EDPMs made by RoCE
would honor flow-control.

Signed-off-by: Yuval Mintz
---
 drivers/net/ethernet/qlogic/qed/qed_dcbx.c     | 16
 drivers/net/ethernet/qlogic/qed/qed_reg_addr.h |  6 ++
 2 files changed, 22 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dcbx.c b/drivers/net/ethernet/qlogic/qed/qed_dcbx.c
index e2a62c0..15b516a 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dcbx.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dcbx.c
@@ -896,6 +896,22 @@ qed_dcbx_mib_update_event(struct qed_hwfn *p_hwfn,
 	}
 
 	qed_dcbx_get_params(p_hwfn, &p_hwfn->p_dcbx_info->get, type);
+
+	if (type == QED_DCBX_OPERATIONAL_MIB) {
+		struct qed_dcbx_results *p_data;
+		u16 val;
+
+		/* Configure in NIG which protocols support EDPM and should
+		 * honor PFC.
+		 */
+		p_data = &p_hwfn->p_dcbx_info->results;
+		val = (0x1 << p_data->arr[DCBX_PROTOCOL_ROCE].tc) |
+		      (0x1 << p_data->arr[DCBX_PROTOCOL_ROCE_V2].tc);
+		val <<= NIG_REG_TX_EDPM_CTRL_TX_EDPM_TC_EN_SHIFT;
+		val |= NIG_REG_TX_EDPM_CTRL_TX_EDPM_EN;
+		qed_wr(p_hwfn, p_ptt, NIG_REG_TX_EDPM_CTRL, val);
+	}
+
 	qed_dcbx_aen(p_hwfn, type);
 
 	return rc;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_reg_addr.h b/drivers/net/ethernet/qlogic/qed/qed_reg_addr.h
index 7e4639c..0cdb433 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_reg_addr.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_reg_addr.h
@@ -1564,6 +1564,12 @@
 #define NIG_REG_TSGEN_FREECNT_UPDATE_K2 0x509008UL
 #define CNIG_REG_NIG_PORT0_CONF_K2 0x218200UL
 
+#define NIG_REG_TX_EDPM_CTRL 0x501f0cUL
+#define NIG_REG_TX_EDPM_CTRL_TX_EDPM_EN (0x1 << 0)
+#define NIG_REG_TX_EDPM_CTRL_TX_EDPM_EN_SHIFT 0
+#define NIG_REG_TX_EDPM_CTRL_TX_EDPM_TC_EN (0xff << 1)
+#define NIG_REG_TX_EDPM_CTRL_TX_EDPM_TC_EN_SHIFT 1
+
 #define PRS_REG_SEARCH_GFT 0x1f11bcUL
 #define PRS_REG_CM_HDR_GFT 0x1f11c8UL
 #define PRS_REG_GFT_CAM 0x1f1100UL
-- 
2.9.4
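For reference, the register value built in qed_dcbx_mib_update_event() can
be worked through in isolation. The sketch below repeats the same arithmetic
in userspace; edpm_ctrl_val() is a hypothetical helper, and the two macros
are shortened copies of the NIG_REG_TX_EDPM_CTRL_* definitions from the
patch.

```c
#include <assert.h>
#include <stdint.h>

#define TX_EDPM_EN		(0x1 << 0)	/* global EDPM enable, bit 0 */
#define TX_EDPM_TC_EN_SHIFT	1		/* per-TC enable bits start at bit 1 */

/* Hypothetical helper: value for the EDPM control register. */
static uint16_t edpm_ctrl_val(unsigned int roce_tc, unsigned int roce_v2_tc)
{
	uint16_t val;

	/* one bit per traffic class used by RoCE and RoCE v2 */
	val = (uint16_t)((0x1u << roce_tc) | (0x1u << roce_v2_tc));
	/* move the TC mask into its field, then set the enable bit */
	val = (uint16_t)(val << TX_EDPM_TC_EN_SHIFT);
	val |= TX_EDPM_EN;
	return val;
}
```

With both protocols on traffic class 3 the TC mask is 0x08, which shifts to
0x10 and gives 0x11 once the enable bit is set. With RoCE on TC 3 and
RoCE v2 on TC 5 the mask is 0x28 and the register value is 0x51.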
Re: new dma-mapping tree, was Re: clean up and modularize arch dma_mapping interface V2
On Tue, Jun 20, 2017 at 11:04:00PM +1000, Stephen Rothwell wrote: > git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git#dma-mapping-next > > Contacts: Marek Szyprowski and Kyungmin Park (cc'd) > > I have called your tree dma-mapping-hch for now. The other tree has > not been updated since 4.9-rc1 and I am not sure how general it is. > Marek, Kyungmin, any comments? I'd be happy to join efforts - co-maintainers and reviewers are always welcome.
Re: [PATCH NET] net/hns:bugfix of ethtool -t phy self_test
On Tue, Jun 20, 2017 at 11:05:54AM +0800, l00371289 wrote: > hi, Florian > > On 2017/6/20 5:00, Florian Fainelli wrote: > > On 06/16/2017 02:24 AM, Lin Yun Sheng wrote: > >> This patch fixes the phy loopback self_test failed issue. when > >> Marvell Phy Module is loaded, it will powerdown fiber when doing > >> phy loopback self test, which cause phy loopback self_test fail. > >> > >> Signed-off-by: Lin Yun Sheng> >> --- > >> drivers/net/ethernet/hisilicon/hns/hns_ethtool.c | 16 ++-- > >> 1 file changed, 14 insertions(+), 2 deletions(-) > >> > >> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c > >> b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c > >> index b8fab14..e95795b 100644 > >> --- a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c > >> +++ b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c > >> @@ -288,9 +288,15 @@ static int hns_nic_config_phy_loopback(struct > >> phy_device *phy_dev, u8 en) > > > > The question really is, why is not this properly integrated into the PHY > > driver and PHYLIB such that the only thing the Ethernet MAC driver has > > to call is a function of the PHY driver putting it in self-test? > Do you mean calling phy_dev->drv->resume and phy_dev->drv->suspend > function? No. Florian is saying you should add support for phylib and the drivers to enable/disable loopback. The BMCR loopback bit is pretty much standardised. So you can implement a genphy_loopback(phydev, enable), which most drivers can use. Those that need their own can implement it in their driver. Andrew
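To make the suggestion above concrete, here is a minimal userspace sketch of what a generic loopback helper boils down to: a read-modify-write of the loopback bit in BMCR (register 0, bit 14 in the clause 22 register set). The struct fake_phy type and the fake_* helpers are stand-ins invented for this sketch; a real implementation would go through phylib's register accessors and locking, which are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

#define MII_BMCR      0x00    /* Basic Mode Control Register */
#define BMCR_LOOPBACK 0x4000  /* bit 14: PHY loopback */

/* Hypothetical stand-in for a PHY: just its 32 clause 22 registers. */
struct fake_phy {
	uint16_t regs[32];
};

static uint16_t fake_phy_read(const struct fake_phy *phy, unsigned int reg)
{
	return phy->regs[reg & 0x1f];
}

static void fake_phy_write(struct fake_phy *phy, unsigned int reg, uint16_t val)
{
	phy->regs[reg & 0x1f] = val;
}

/*
 * Sketch of the generic helper discussed in the thread: only the loopback
 * bit changes, every other BMCR bit (speed, duplex, autoneg) is preserved.
 */
static void fake_genphy_loopback(struct fake_phy *phy, bool enable)
{
	uint16_t bmcr = fake_phy_read(phy, MII_BMCR);

	if (enable)
		bmcr |= BMCR_LOOPBACK;
	else
		bmcr &= (uint16_t)~BMCR_LOOPBACK;

	fake_phy_write(phy, MII_BMCR, bmcr);
}
```

The point of the design is that a MAC driver would call one helper and never touch PHY registers itself; a PHY that needs extra steps would supply its own variant in its driver.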
[net-next 04/10] net/mlx5e: Add new profile function update_carrier
From: Erez ShitritUpdating the carrier involves specific HW setting, each profile should use its own function for that. Both IPoIB and VF representor don't need carrier update function, since VF representor has only a logical link to VF and IPoIB manages its own link via ib_core upper layer. Signed-off-by: Erez Shitrit Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 14 ++ drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 1 + drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 2 ++ 4 files changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 49c5fe9fdff0..a223a8e15ece 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -782,6 +782,7 @@ struct mlx5e_profile { void(*enable)(struct mlx5e_priv *priv); void(*disable)(struct mlx5e_priv *priv); void(*update_stats)(struct mlx5e_priv *priv); + void(*update_carrier)(struct mlx5e_priv *priv); int (*max_nch)(struct mlx5_core_dev *mdev); struct { mlx5e_fp_handle_rx_cqe handle_rx_cqe; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 06eb7a8b487c..343be65621db 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -143,7 +143,8 @@ static void mlx5e_update_carrier_work(struct work_struct *work) mutex_lock(>state_lock); if (test_bit(MLX5E_STATE_OPENED, >state)) - mlx5e_update_carrier(priv); + if (priv->profile->update_carrier) + priv->profile->update_carrier(priv); mutex_unlock(>state_lock); } @@ -2598,9 +2599,10 @@ void mlx5e_switch_priv_channels(struct mlx5e_priv *priv, { struct net_device *netdev = priv->netdev; int new_num_txqs; - + int carrier_ok; new_num_txqs = new_chs->num * new_chs->params.num_tc; + carrier_ok = netif_carrier_ok(netdev); 
netif_carrier_off(netdev); if (new_num_txqs < netdev->real_num_tx_queues) @@ -2618,7 +2620,9 @@ void mlx5e_switch_priv_channels(struct mlx5e_priv *priv, mlx5e_refresh_tirs(priv, false); mlx5e_activate_priv_channels(priv); - mlx5e_update_carrier(priv); + /* return carrier back if needed */ + if (carrier_ok) + netif_carrier_on(netdev); } int mlx5e_open_locked(struct net_device *netdev) @@ -2634,7 +2638,8 @@ int mlx5e_open_locked(struct net_device *netdev) mlx5e_refresh_tirs(priv, false); mlx5e_activate_priv_channels(priv); - mlx5e_update_carrier(priv); + if (priv->profile->update_carrier) + priv->profile->update_carrier(priv); mlx5e_timestamp_init(priv); if (priv->profile->update_stats) @@ -4215,6 +4220,7 @@ static const struct mlx5e_profile mlx5e_nic_profile = { .disable = mlx5e_nic_disable, .update_stats = mlx5e_update_ndo_stats, .max_nch = mlx5e_get_max_num_channels, + .update_carrier= mlx5e_update_carrier, .rx_handlers.handle_rx_cqe = mlx5e_handle_rx_cqe, .rx_handlers.handle_rx_cqe_mpwqe = mlx5e_handle_rx_cqe_mpwrq, .max_tc= MLX5E_MAX_NUM_TC, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 01798e1ab667..a39873bd88a6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -916,6 +916,7 @@ static struct mlx5e_profile mlx5e_rep_profile = { .cleanup_tx = mlx5e_cleanup_nic_tx, .update_stats = mlx5e_rep_update_stats, .max_nch= mlx5e_get_rep_max_num_channels, + .update_carrier = NULL, .rx_handlers.handle_rx_cqe = mlx5e_handle_rx_cqe_rep, .rx_handlers.handle_rx_cqe_mpwqe = NULL /* Not supported */, .max_tc = 1, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c index fdeb426d4751..cc9ff4014e5c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c @@ -291,6 +291,7 @@ static const struct 
mlx5e_profile mlx5i_nic_profile = { .disable = NULL, /* mlx5i_disable */ .update_stats = NULL, /* mlx5i_update_stats */ .max_nch = mlx5e_get_max_num_channels, + .update_carrier= NULL, /* no HW update in IB link */ .rx_handlers.handle_rx_cqe = mlx5i_handle_rx_cqe,
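The pattern this patch introduces, an optional per-profile callback that callers must NULL-check, can be sketched in isolation as follows. All demo_* names are made up for illustration and only mimic the shape of struct mlx5e_profile; nothing here is the driver's actual code.

```c
#include <stddef.h>

struct demo_priv;

/* Hypothetical, trimmed-down analogue of struct mlx5e_profile. */
struct demo_profile {
	void (*update_carrier)(struct demo_priv *priv); /* may be NULL */
};

struct demo_priv {
	const struct demo_profile *profile;
	int carrier_updates; /* counts how often the HW query ran */
};

/* Stands in for the NIC profile's real carrier query. */
static void demo_nic_update_carrier(struct demo_priv *priv)
{
	priv->carrier_updates++;
}

/* The NIC profile provides the hook; the representor-like profile opts out. */
static const struct demo_profile demo_nic_profile = {
	.update_carrier = demo_nic_update_carrier,
};
static const struct demo_profile demo_rep_profile = {
	.update_carrier = NULL,
};

/* Callers guard the hook, as the open and work-queue paths do in the patch. */
static void demo_open(struct demo_priv *priv)
{
	if (priv->profile->update_carrier)
		priv->profile->update_carrier(priv);
}
```

A profile that manages link state elsewhere leaves the hook NULL and the open path skips the hardware query.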
[net-next 08/10] net/mlx5e: IPoIB, Get more TX statistics
From: Erez Shitrit Add missing counters (bytes, packet, gso, xmit_more) in TX flow for ipoib traffic. Fixes: 58545449b7b ("net/mlx5e: IPoIB, Xmit flow") Signed-off-by: Erez Shitrit Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c index 354f474322ce..0433d69429f3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c @@ -557,11 +557,16 @@ netdev_tx_t mlx5i_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb, if (skb_is_gso(skb)) { opcode = MLX5_OPCODE_LSO; ihs = mlx5e_txwqe_build_eseg_gso(sq, skb, eseg, &num_bytes); + sq->stats.packets += skb_shinfo(skb)->gso_segs; } else { ihs = mlx5e_calc_min_inline(sq->min_inline_mode, skb); num_bytes = max_t(unsigned int, skb->len, ETH_ZLEN); + sq->stats.packets++; } + sq->stats.bytes += num_bytes; + sq->stats.xmit_more += skb->xmit_more; + ds_cnt = sizeof(*wqe) / MLX5_SEND_WQE_DS; if (ihs) { memcpy(eseg->inline_hdr.start, skb_data, ihs); -- 2.11.0
[net-next 10/10] net/mlx5e: IPoIB, Add ioctl support to IPoIB device driver
From: Feras DaoudAdd ioctl support to IPoIB device driver. For now, this ioctl will support timestamp get and set. Signed-off-by: Feras Daoud Signed-off-by: Eitan Rabin Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 4 ++-- drivers/net/ethernet/mellanox/mlx5/core/en_clock.c| 10 -- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 6 -- drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 16 4 files changed, 26 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 626683f0f487..5d9ace493d85 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -853,8 +853,8 @@ void mlx5e_timestamp_init(struct mlx5e_priv *priv); void mlx5e_timestamp_cleanup(struct mlx5e_priv *priv); void mlx5e_pps_event_handler(struct mlx5e_priv *priv, struct ptp_clock_event *event); -int mlx5e_hwstamp_set(struct net_device *dev, struct ifreq *ifr); -int mlx5e_hwstamp_get(struct net_device *dev, struct ifreq *ifr); +int mlx5e_hwstamp_set(struct mlx5e_priv *priv, struct ifreq *ifr); +int mlx5e_hwstamp_get(struct mlx5e_priv *priv, struct ifreq *ifr); int mlx5e_modify_rx_cqe_compression_locked(struct mlx5e_priv *priv, bool val); int mlx5e_vlan_rx_add_vid(struct net_device *dev, __always_unused __be16 proto, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c index e29494464cae..66f432385dbb 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c @@ -86,9 +86,8 @@ static void mlx5e_timestamp_overflow(struct work_struct *work) schedule_delayed_work(>overflow_work, tstamp->overflow_period); } -int mlx5e_hwstamp_set(struct net_device *dev, struct ifreq *ifr) +int mlx5e_hwstamp_set(struct mlx5e_priv *priv, struct ifreq *ifr) { - struct mlx5e_priv *priv = netdev_priv(dev); struct hwtstamp_config 
config; int err; @@ -130,10 +129,10 @@ int mlx5e_hwstamp_set(struct net_device *dev, struct ifreq *ifr) case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ: case HWTSTAMP_FILTER_NTP_ALL: /* Disable CQE compression */ - netdev_warn(dev, "Disabling cqe compression"); + netdev_warn(priv->netdev, "Disabling cqe compression"); err = mlx5e_modify_rx_cqe_compression_locked(priv, false); if (err) { - netdev_err(dev, "Failed disabling cqe compression err=%d\n", err); + netdev_err(priv->netdev, "Failed disabling cqe compression err=%d\n", err); mutex_unlock(>state_lock); return err; } @@ -151,9 +150,8 @@ int mlx5e_hwstamp_set(struct net_device *dev, struct ifreq *ifr) sizeof(config)) ? -EFAULT : 0; } -int mlx5e_hwstamp_get(struct net_device *dev, struct ifreq *ifr) +int mlx5e_hwstamp_get(struct mlx5e_priv *priv, struct ifreq *ifr) { - struct mlx5e_priv *priv = netdev_priv(dev); struct hwtstamp_config *cfg = >tstamp.hwtstamp_config; if (!MLX5_CAP_GEN(priv->mdev, device_frequency_khz)) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index de1e936fc2be..20ee29a2209e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -3317,11 +3317,13 @@ static int mlx5e_change_mtu(struct net_device *netdev, int new_mtu) static int mlx5e_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) { + struct mlx5e_priv *priv = netdev_priv(dev); + switch (cmd) { case SIOCSHWTSTAMP: - return mlx5e_hwstamp_set(dev, ifr); + return mlx5e_hwstamp_set(priv, ifr); case SIOCGHWTSTAMP: - return mlx5e_hwstamp_get(dev, ifr); + return mlx5e_hwstamp_get(priv, ifr); default: return -EOPNOTSUPP; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c index 58bf0665f50b..1ee5bce85901 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c @@ -43,6 +43,7 @@ 
static int mlx5i_close(struct net_device *netdev); static int mlx5i_dev_init(struct net_device *dev); static void mlx5i_dev_cleanup(struct net_device *dev); static int mlx5i_change_mtu(struct net_device *netdev, int new_mtu); +static int mlx5i_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd); static const struct net_device_ops mlx5i_netdev_ops = { .ndo_open=
[net-next 05/10] net/mlx5e: IPoIB, Change parameters default values
From: Erez Shitrit Add function that sets the default values for ipoib, setting/clearing abilities that IPoIB doesn't support, like RQ size in this case. Signed-off-by: Erez Shitrit Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c index cc9ff4014e5c..45ca869118c0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c @@ -36,6 +36,7 @@ #include "ipoib.h" #define IB_DEFAULT_Q_KEY 0xb1b +#define MLX5I_PARAMS_DEFAULT_LOG_RQ_SIZE 9 static int mlx5i_open(struct net_device *netdev); static int mlx5i_close(struct net_device *netdev); @@ -50,6 +51,19 @@ static const struct net_device_ops mlx5i_netdev_ops = { }; /* IPoIB mlx5 netdev profile */ +static void mlx5i_build_nic_params(struct mlx5_core_dev *mdev, + struct mlx5e_params *params) +{ + /* Override RQ params as IPoIB supports only LINKED LIST RQ for now */ + mlx5e_set_rq_type_params(mdev, params, MLX5_WQ_TYPE_LINKED_LIST); + + /* RQ size in ipoib by default is 512 */ + params->log_rq_size = is_kdump_kernel() ? + MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE : + MLX5I_PARAMS_DEFAULT_LOG_RQ_SIZE; + + params->lro_en = false; +} /* Called directly after IPoIB netdevice was created to initialize SW structs */ static void mlx5i_init(struct mlx5_core_dev *mdev, @@ -65,10 +79,7 @@ static void mlx5i_init(struct mlx5_core_dev *mdev, priv->ppriv = ppriv; mlx5e_build_nic_params(mdev, &priv->channels.params, profile->max_nch(mdev)); - - /* Override RQ params as IPoIB supports only LINKED LIST RQ for now */ - mlx5e_set_rq_type_params(mdev, &priv->channels.params, MLX5_WQ_TYPE_LINKED_LIST); - priv->channels.params.lro_en = false; + mlx5i_build_nic_params(mdev, &priv->channels.params); mutex_init(&priv->state_lock); -- 2.11.0
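The "RQ size in ipoib by default is 512" comment follows from the log2 encoding: the parameter stores the exponent, so 9 means 1 << 9 entries. The sketch below shows that relationship. The demo_* helpers are hypothetical, and the kdump minimum is passed in as a parameter because the value of MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE is not shown in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MLX5I_PARAMS_DEFAULT_LOG_RQ_SIZE 9 /* value taken from the patch */

/*
 * Hypothetical helper mirroring the ternary in mlx5i_build_nic_params().
 * min_log_rq_size is an assumption supplied by the caller, not the
 * driver's real minimum.
 */
static uint8_t demo_pick_log_rq_size(bool kdump, uint8_t min_log_rq_size)
{
	return kdump ? min_log_rq_size : MLX5I_PARAMS_DEFAULT_LOG_RQ_SIZE;
}

/* Converts the stored log2 value into a number of RQ entries. */
static uint32_t demo_rq_entries(uint8_t log_rq_size)
{
	return (uint32_t)1 << log_rq_size;
}
```

In the normal case this yields 512 entries; under kdump the smaller exponent keeps memory use down.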
Re: [PATCH net-next] sctp: handle errors when updating asoc
On Tue, Jun 20, 2017 at 04:05:11PM +0800, Xin Long wrote: > It's a bad thing not to handle errors when updating asoc. The memory > allocation failure in any of the functions called in sctp_assoc_update() > would cause sctp to work unexpectedly. > > This patch is to fix it by aborting the asoc and reporting the error when > any of these functions fails. > > Signed-off-by: Xin Long> --- > include/net/sctp/structs.h | 4 ++-- > net/sctp/associola.c | 25 ++--- > net/sctp/sm_sideeffect.c | 24 +++- > 3 files changed, 39 insertions(+), 14 deletions(-) > > diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h > index 5051317..e26763b 100644 > --- a/include/net/sctp/structs.h > +++ b/include/net/sctp/structs.h > @@ -1953,8 +1953,8 @@ struct sctp_transport *sctp_assoc_is_match(struct > sctp_association *, > const union sctp_addr *, > const union sctp_addr *); > void sctp_assoc_migrate(struct sctp_association *, struct sock *); > -void sctp_assoc_update(struct sctp_association *old, > -struct sctp_association *new); > +int sctp_assoc_update(struct sctp_association *old, > + struct sctp_association *new); > > __u32 sctp_association_get_next_tsn(struct sctp_association *); > > diff --git a/net/sctp/associola.c b/net/sctp/associola.c > index 72b07dd..757be41 100644 > --- a/net/sctp/associola.c > +++ b/net/sctp/associola.c > @@ -1112,8 +1112,8 @@ void sctp_assoc_migrate(struct sctp_association *assoc, > struct sock *newsk) > } > > /* Update an association (possibly from unexpected COOKIE-ECHO processing). 
> */ > -void sctp_assoc_update(struct sctp_association *asoc, > -struct sctp_association *new) > +int sctp_assoc_update(struct sctp_association *asoc, > + struct sctp_association *new) > { > struct sctp_transport *trans; > struct list_head *pos, *temp; > @@ -1124,8 +1124,10 @@ void sctp_assoc_update(struct sctp_association *asoc, > asoc->peer.sack_needed = new->peer.sack_needed; > asoc->peer.auth_capable = new->peer.auth_capable; > asoc->peer.i = new->peer.i; > - sctp_tsnmap_init(>peer.tsn_map, SCTP_TSN_MAP_INITIAL, > - asoc->peer.i.initial_tsn, GFP_ATOMIC); > + > + if (!sctp_tsnmap_init(>peer.tsn_map, SCTP_TSN_MAP_INITIAL, > + asoc->peer.i.initial_tsn, GFP_ATOMIC)) > + return -ENOMEM; > > /* Remove any peer addresses not present in the new association. */ > list_for_each_safe(pos, temp, >peer.transport_addr_list) { > @@ -1169,11 +1171,11 @@ void sctp_assoc_update(struct sctp_association *asoc, > } else { > /* Add any peer addresses from the new association. */ > list_for_each_entry(trans, >peer.transport_addr_list, > - transports) { > - if (!sctp_assoc_lookup_paddr(asoc, >ipaddr)) > - sctp_assoc_add_peer(asoc, >ipaddr, > - GFP_ATOMIC, trans->state); > - } > + transports) > + if (!sctp_assoc_lookup_paddr(asoc, >ipaddr) && > + !sctp_assoc_add_peer(asoc, >ipaddr, > + GFP_ATOMIC, trans->state)) > + return -ENOMEM; > > asoc->ctsn_ack_point = asoc->next_tsn - 1; > asoc->adv_peer_ack_point = asoc->ctsn_ack_point; > @@ -1182,7 +1184,8 @@ void sctp_assoc_update(struct sctp_association *asoc, > sctp_stream_update(>stream, >stream); > > /* get a new assoc id if we don't have one yet. 
*/ > - sctp_assoc_set_id(asoc, GFP_ATOMIC); > + if (sctp_assoc_set_id(asoc, GFP_ATOMIC)) > + return -ENOMEM; > } > > /* SCTP-AUTH: Save the peer parameters from the new associations > @@ -1200,7 +1203,7 @@ void sctp_assoc_update(struct sctp_association *asoc, > asoc->peer.peer_hmacs = new->peer.peer_hmacs; > new->peer.peer_hmacs = NULL; > > - sctp_auth_asoc_init_active_key(asoc, GFP_ATOMIC); > + return sctp_auth_asoc_init_active_key(asoc, GFP_ATOMIC); > } > > /* Update the retran path for sending a retransmitted packet. > diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c > index 7623566..dfe1fcb 100644 > --- a/net/sctp/sm_sideeffect.c > +++ b/net/sctp/sm_sideeffect.c > @@ -818,6 +818,28 @@ static void sctp_cmd_setup_t2(sctp_cmd_seq_t *cmds, > asoc->timeouts[SCTP_EVENT_TIMEOUT_T2_SHUTDOWN] = t->rto; > } > > +static void sctp_cmd_assoc_update(sctp_cmd_seq_t *cmds, > + struct sctp_association *asoc, > + struct sctp_association *new) > +{
Re: [PATCH NET] net/hns:bugfix of ethtool -t phy self_test
> >> The question really is, why is not this properly integrated into the PHY > >> driver and PHYLIB such that the only thing the Ethernet MAC driver has > >> to call is a function of the PHY driver putting it in self-test? > > > > This whole driver pokes various PHY registers, rather than use > > phylib. And it does so without taking the PHY lock. > I will consider using phylib as much as possible, thanks. > > It also assumes it > > is a Marvell PHY and i don't see anywhere it actually verifies this. > When it said Marvell Phy , I meant Marvell Phy with fibre support. > I will send another patch to only setting bit in Fiber Control when > it is a Marvell Phy with fibre support. There is a lot more broken than just that. You really should remove all code which is accessing the PHY, and add support to phylib and the drivers for what you need. Andrew
Re: [PATCH 00/51] rtc: stop using rtc deprecated functions
On Tue 2017-06-20 13:37:22, Steve Twiss wrote: > Hi Pavel, > > On 20 June 2017 14:26, Pavel Machek wrote: > > > Subject: Re: [PATCH 00/51] rtc: stop using rtc deprecated functions > > > > On Tue 2017-06-20 14:24:00, Alexandre Belloni wrote: > > > On 20/06/2017 at 14:10:11 +0200, Pavel Machek wrote: > > > > On Tue 2017-06-20 12:03:48, Alexandre Belloni wrote: > > > > > On 20/06/2017 at 11:35:08 +0200, Benjamin Gaignard wrote: > > > > > > rtc_time_to_tm() and rtc_tm_to_time() are deprecated because they > > > > > > rely on 32bits variables and that will make rtc break in y2038/2016. > > > > > > > > > > Please don't, because this hide the fact that the hardware will not > > > > > handle dates in y2038 anyway and as pointed by Russell a few month > > > > > ago, > > > > > rtc_time_to_tm will be able to catch it but the 64 bit version will > > > > > silently ignore it. > > > > > > > > Reference? Because rtc on PCs stores date in binary coded decimal, so > > > > it is likely to break in 2100, not 2038... > > > > > > I'm not saying it should be done but clearly, that is not the correct > > > thing to do for RTCs that are using a single 32 bits register to store > > > the time. > > > You give one example, I can give you three: armada38x, at91sam9, > > > at32ap700x and that just in the beginning of the series. > > > > I wanted reference to Russell's mail. > > This is it. > https://patchwork.kernel.org/patch/6219401/ Thanks. Yes, that's argument against changing rtc _drivers_ for hardware that can not do better than 32bit. For generic code (such as 44/51 sysfs, 51/51 suspend test), the change still makes sense. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [PATCH net-next] ibmvnic: Return from ibmvnic_resume if not in VNIC_OPEN state
From: John Allen Date: Mon, 19 Jun 2017 11:27:53 -0500 > If the ibmvnic driver is not in the VNIC_OPEN state, return from > ibmvnic_resume callback. If we are not in the VNIC_OPEN state, interrupts > may not be initialized and directly calling the interrupt handler will > cause a crash. > > Signed-off-by: John Allen Applied.
Re: [PATCH net-next 0/4] s390/net updates, part 2 (v2)
From: Julian Wiedmann Date: Tue, 20 Jun 2017 16:00:30 +0200 > thanks for the feedback. Here's an updated patchset that honours > the reverse christmas tree and drops the __packed attribute. Please apply. Series applied.
Re: [Patch net] igmp: add a missing spin_lock_init()
From: Cong Wang Date: Tue, 20 Jun 2017 10:46:27 -0700 > Andrey reported a lockdep warning on non-initialized > spinlock: > > INFO: trying to register non-static key. > the code is fine but needs lockdep annotation. > turning off the locking correctness validator. > CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:16 > dump_stack+0x292/0x395 lib/dump_stack.c:52 > register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755 > ? 0xa000 > __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255 > lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855 > __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135 > _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175 > spin_lock_bh ./include/linux/spinlock.h:304 > ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076 > igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194 > ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736 > > We miss a spin_lock_init() in igmpv3_add_delrec(), probably > because previously we never use it on this code path. Since > we already unlink it from the global mc_tomb list, it is > probably safe not to acquire this spinlock here. It does not > harm to have it although, to avoid conditional locking. > > Fixes: c38b7d327aaf ("igmp: acquire pmc lock for ip_mc_clear_src()") > Reported-by: Andrey Konovalov > Signed-off-by: Cong Wang Applied and queued up for -stable, thank you.
Re: [PATCH 00/51] rtc: stop using rtc deprecated functions
On 20/06/2017 at 22:15:36 +0100, Russell King - ARM Linux wrote: > On Tue, Jun 20, 2017 at 05:07:46PM +0200, Benjamin Gaignard wrote: > > 2017-06-20 15:48 GMT+02:00 Alexandre Belloni: > > >> Yes, that's argument against changing rtc _drivers_ for hardware that > > >> can not do better than 32bit. For generic code (such as 44/51 sysfs, > > >> 51/51 suspend test), the change still makes sense. > > > > What I had in mind when writing those patches was to remove the limitations > > coming from those functions usage, even more since they been marked has > > deprecated. > > I'd say that they should not be marked as deprecated. They're entirely > appropriate for use with hardware that only supports a 32-bit > representation of time. > > It's entirely reasonable to fix the ones that use other representations > that exceed that, but for those which do not, we need to keep using the > 32-bit versions. Doing so actually gives us _more_ flexibility in the > future. > > Consider that at the moment, we define the 32-bit RTC representation to > start at a well known epoch. We _could_ decide that when it wraps to > 0x80000000 seconds, we'll define the lower 0x40000000 seconds to mean > dates in the future - and keep rolling that forward each time we cross > another 0x40000000 seconds. Unless someone invents a real time machine, > we shouldn't need to set a modern RTC back to 1970. > I agree with that but not the android guys. They seem to mandate an RTC that can store time from 01/01/1970. I don't know much more than that because they never cared to explain why that was actually necessary (apart from a laconic "this will result in a bad user experience") I think tglx had a plan for offsetting the time at some point so 32-bit platform can pass 2038 properly. My opinion is that as long as userspace is not ready to handle those dates, it doesn't really matter because it is quite unlikely that anything will be able to continue running anyway.
-- Alexandre Belloni, Free Electrons Embedded Linux and Kernel engineering http://free-electrons.com
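The rolling-window idea quoted in this message can be illustrated with a small sketch. This is one possible way to interpret a wrapping 32-bit seconds counter relative to a movable window start. It is an assumption-laden illustration, not the scheme Russell or the RTC core actually implements, and demo_rtc_expand() and its window_start parameter are invented names.

```c
#include <stdint.h>

/*
 * Expand a raw 32-bit RTC seconds value to 64 bits.
 *
 * window_start is the earliest time (seconds since 1970) that the RTC is
 * assumed to hold. Raw values "behind" the window start are treated as
 * having wrapped, so they land in the future rather than in the past.
 */
static int64_t demo_rtc_expand(uint32_t hw_secs, int64_t window_start)
{
	/* Unsigned subtraction wraps modulo 2^32, which is what we want. */
	uint32_t offset = hw_secs - (uint32_t)window_start;

	return window_start + (int64_t)offset;
}
```

Moving window_start forward from time to time keeps a full 2^32 seconds of range ahead of the oldest date anyone still needs, at the cost of no longer representing dates before the window.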
Re: [PATCH net-next 00/12] nfp: add flower app with representors
On Tue, Jun 20, 2017 at 07:13:43PM +0300, Or Gerlitz wrote: > On Tue, Jun 20, 2017 at 8:51 AM, Simon Horman wrote: > > this series adds a flower app to the NFP driver. > > It initialises four types of netdevs: > > > > * PF netdev - lower-device for communication of packets to device > > * PF representor netdev > > * VF representor netdevs > > * Phys port representor netdevs > > > > The PF netdev acts as a lower-device which sends and receives packets to > > and from the firmware. The representors act as upper-devices. For TX > > representors attach a metadata dst to the skb which is used by the PF > > netdev to prepend metadata to the packet before forwarding the firmware. On > > RX the PF netdev looks up the representor based on the prepended metadata > > received from the firmware and forwards the skb to the representor after > > removing the metadata. > > Hi Simon, Jakub > > Good to have more VF representors around... > > > Control queues are used to send and receive control messages which are > > used to communicate configuration information with the firmware. These > > are in separate vNIC to the queues belonging to the PF netdev. The control > > queues are not exposed to user-space via a netdev or any other means. > > > > Do you have documentation for the control channel or I should look on > earlier commits? I don't believe there is any publicly available documentation other than the code. > The control messages you describe here are also the ones that are used > to load/unload > specific app? In this patchset PORTMOD messages are used for (app-specific) configuration. > > As the name implies this app is targeted at providing offload of TC flower. > > That will be added by follow-up work. This patchset focuses on adding phys > > port and VF representor netdevs to which flower classifiers may be attached.
> > I guess you want to have switch ID so if someone looks on the reps (ip -d) > they can realize they all belong to the same e-switch, we are using > switchdev attribute for that matter. Yes, that is intended to be part of follow-up patches. > Few nits from building from static checker below. Thanks, sorry for letting that through. I'll fix them up ASAP. > Or. > > drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:78:1: warning: > symbol 'nfp_repr_phy_port_get_stats64' was not declared. Should it be > static? > drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:98:1: warning: > symbol 'nfp_repr_vf_get_stats64' was not declared. Should it be > static? > drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:118:1: warning: > symbol 'nfp_repr_pf_get_stats64' was not declared. Should it be > static? > drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:262:40: warning: > incorrect type in assignment (different base types) > drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:262:40:expected > unsigned int [unsigned] [usertype] port_id > drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:262:40:got > restricted __be32 [usertype] > drivers/net/ethernet/netronome/nfp/flower/main.c:116:19: warning: cast > to restricted __be32
Re: [PATCH net-next] net: stmmac: enable TSO for IPv6
From: Niklas Cassel Date: Mon, 19 Jun 2017 18:36:44 +0200 > There is nothing in the IP that prevents us from enabling TSO for IPv6. > > Before patch: > ftp fe80::2aa:bbff:fecc:1336%eth0 > ftp> get /dev/zero > 882512708 bytes received in 00:14 (56.11 MiB/s) > > After patch: > ftp fe80::2aa:bbff:fecc:1336%eth0 > ftp> get /dev/zero > 1203326784 bytes received in 00:12 (94.52 MiB/s) > > Signed-off-by: Niklas Cassel Applied, thanks.
Re: [PATCH] rtnetlink: add IFLA_GROUP to ifla_policy
From: Serhey Popovych Date: Tue, 20 Jun 2017 14:35:23 +0300 > Network interface groups support added while ago, however > there is no IFLA_GROUP attribute description in policy > and netlink message size calculations until now. > > Add IFLA_GROUP attribute to the policy. > > Fixes: cbda10fa97d7 ("net_device: add support for network device groups") > Signed-off-by: Serhey Popovych > --- > v2: Rebased to kernel/git/davem/net.git Applied and queued up for -stable.
[PATCH] [PATCH v2 net-next] bonding: Convert multiple netdev_info messages to netdev_dbg
The bond_options.c file contains several netdev_info messages that clutter kernel output. This patch changes all netdev_info messages to netdev_dbg and adds a netdev debug for the packets per slave parameter. Suggested-by: Joe PerchesSigned-off-by: Michael J Dilmore --- drivers/net/bonding/bond_options.c | 79 +++--- 1 file changed, 40 insertions(+), 39 deletions(-) diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c index 1bcbb89..e3a9af6 100644 --- a/drivers/net/bonding/bond_options.c +++ b/drivers/net/bonding/bond_options.c @@ -721,13 +721,13 @@ static int bond_option_mode_set(struct bonding *bond, const struct bond_opt_value *newval) { if (!bond_mode_uses_arp(newval->value) && bond->params.arp_interval) { - netdev_info(bond->dev, "%s mode is incompatible with arp monitoring, start mii monitoring\n", + netdev_dbg(bond->dev, "%s mode is incompatible with arp monitoring, start mii monitoring\n", newval->string); /* disable arp monitoring */ bond->params.arp_interval = 0; /* set miimon to default value */ bond->params.miimon = BOND_DEFAULT_MIIMON; - netdev_info(bond->dev, "Setting MII monitoring interval to %d\n", + netdev_dbg(bond->dev, "Setting MII monitoring interval to %d\n", bond->params.miimon); } @@ -771,7 +771,7 @@ static int bond_option_active_slave_set(struct bonding *bond, block_netpoll_tx(); /* check to see if we are clearing active */ if (!slave_dev) { - netdev_info(bond->dev, "Clearing current active slave\n"); + netdev_dbg(bond->dev, "Clearing current active slave\n"); RCU_INIT_POINTER(bond->curr_active_slave, NULL); bond_select_active_slave(bond); } else { @@ -782,12 +782,12 @@ static int bond_option_active_slave_set(struct bonding *bond, if (new_active == old_active) { /* do nothing */ - netdev_info(bond->dev, "%s is already the current active slave\n", + netdev_dbg(bond->dev, "%s is already the current active slave\n", new_active->dev->name); } else { if (old_active && (new_active->link == BOND_LINK_UP) && 
bond_slave_is_up(new_active)) { - netdev_info(bond->dev, "Setting %s as active slave\n", + netdev_dbg(bond->dev, "Setting %s as active slave\n", new_active->dev->name); bond_change_active_slave(bond, new_active); } else { @@ -810,17 +810,17 @@ static int bond_option_active_slave_set(struct bonding *bond, static int bond_option_miimon_set(struct bonding *bond, const struct bond_opt_value *newval) { - netdev_info(bond->dev, "Setting MII monitoring interval to %llu\n", + netdev_dbg(bond->dev, "Setting MII monitoring interval to %llu\n", newval->value); bond->params.miimon = newval->value; if (bond->params.updelay) - netdev_info(bond->dev, "Note: Updating updelay (to %d) since it is a multiple of the miimon value\n", + netdev_dbg(bond->dev, "Note: Updating updelay (to %d) since it is a multiple of the miimon value\n", bond->params.updelay * bond->params.miimon); if (bond->params.downdelay) - netdev_info(bond->dev, "Note: Updating downdelay (to %d) since it is a multiple of the miimon value\n", + netdev_dbg(bond->dev, "Note: Updating downdelay (to %d) since it is a multiple of the miimon value\n", bond->params.downdelay * bond->params.miimon); if (newval->value && bond->params.arp_interval) { - netdev_info(bond->dev, "MII monitoring cannot be used with ARP monitoring - disabling ARP monitoring...\n"); + netdev_dbg(bond->dev, "MII monitoring cannot be used with ARP monitoring - disabling ARP monitoring...\n"); bond->params.arp_interval = 0; if (bond->params.arp_validate) bond->params.arp_validate = BOND_ARP_VALIDATE_NONE; @@ -862,7 +862,7 @@ static int bond_option_updelay_set(struct bonding *bond, bond->params.miimon); } bond->params.updelay = value / bond->params.miimon; - netdev_info(bond->dev, "Setting up delay to %d\n", + netdev_dbg(bond->dev, "Setting up delay to %d\n", bond->params.updelay * bond->params.miimon); return 0; @@
Re: [PATCH 00/51] rtc: stop using rtc deprecated functions
Hi!

> >> > This is it.
> >> > https://patchwork.kernel.org/patch/6219401/
> >>
> >> Thanks.
> >>
> >> Yes, that's an argument against changing rtc _drivers_ for hardware that
> >> cannot do better than 32 bits. For generic code (such as 44/51 sysfs,
> >> 51/51 suspend test), the change still makes sense.
>
> What I had in mind when writing those patches was to remove the limitations
> coming from the use of those functions, all the more since they have been
> marked as deprecated.
>
> I agree that this will change nothing about the hardware limitations, but at
> least the limit will not come from the framework.

> > Yes, we agree on that but I won't cherry pick working patches from a 51
> > patches series.

Well, it would actually be nice for you to do the cherry picking. That's something maintainers do, because it is hard for contributors to guess a maintainer's taste.

Anyway, it looks like someone should go through all the RTC drivers and document the limitations of each driver (the future date at which the hardware ceases to be useful). If Benjamin has time to do that, I guess that removes all the objections to the series.

Regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
[Patch net] ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER
In commit 242d3a49a2a1 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf") I assumed NETDEV_REGISTER and NETDEV_UNREGISTER are paired. Unfortunately, as reported by jeffy, netdev_wait_allrefs() can rebroadcast the NETDEV_UNREGISTER event until all refs are gone.

We have to add an additional check to avoid this corner case. For netdev_wait_allrefs(), dev->reg_state is NETREG_UNREGISTERED; for dev_change_net_namespace(), dev->reg_state is NETREG_REGISTERED. So check for dev->reg_state != NETREG_UNREGISTERED.

Fixes: 242d3a49a2a1 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf")
Reported-by: jeffy
Cc: David Ahern
Signed-off-by: Cong Wang
---
 net/ipv6/route.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 7cebd95..322bd62 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3722,7 +3722,11 @@ static int ip6_route_dev_notify(struct notifier_block *this,
 		net->ipv6.ip6_blk_hole_entry->dst.dev = dev;
 		net->ipv6.ip6_blk_hole_entry->rt6i_idev = in6_dev_get(dev);
 #endif
-	} else if (event == NETDEV_UNREGISTER) {
+	} else if (event == NETDEV_UNREGISTER &&
+		   dev->reg_state != NETREG_UNREGISTERED) {
+		/* NETDEV_UNREGISTER could be fired for multiple times by
+		 * netdev_wait_allrefs(). Make sure we only call this once.
+		 */
 		in6_dev_put(net->ipv6.ip6_null_entry->rt6i_idev);
 #ifdef CONFIG_IPV6_MULTIPLE_TABLES
 		in6_dev_put(net->ipv6.ip6_prohibit_entry->rt6i_idev);
-- 
2.5.5
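For anyone following along without the tree at hand, here is a minimal user-space sketch of the guard in the patch above. It is only an illustration: fake_dev, route_dev_notify() and simulate_unregister() are made-up names, not kernel APIs, and it assumes the first NETDEV_UNREGISTER arrives while the device is still unregistering, with later rebroadcasts arriving once the state is NETREG_UNREGISTERED.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's netdev registration states. */
enum reg_state { REG_REGISTERED, REG_UNREGISTERING, REG_UNREGISTERED };
enum event { EV_REGISTER, EV_UNREGISTER };

struct fake_dev {
	enum reg_state reg_state;
	int idev_refs;	/* models the references taken via in6_dev_get() */
};

/* Same shape as the patched notifier: take a reference on REGISTER,
 * drop it on UNREGISTER unless the device is already unregistered.
 */
static void route_dev_notify(struct fake_dev *dev, enum event ev)
{
	if (ev == EV_REGISTER) {
		dev->idev_refs++;
	} else if (ev == EV_UNREGISTER &&
		   dev->reg_state != REG_UNREGISTERED) {
		dev->idev_refs--;
	}
}

/* Simulates one genuine unregister followed by `rebroadcasts` repeated
 * events and returns the final reference count.
 */
int simulate_unregister(int rebroadcasts)
{
	struct fake_dev dev = { REG_REGISTERED, 0 };
	int i;

	assert(rebroadcasts >= 0);
	route_dev_notify(&dev, EV_REGISTER);

	dev.reg_state = REG_UNREGISTERING;
	route_dev_notify(&dev, EV_UNREGISTER);	/* first, genuine event */

	dev.reg_state = REG_UNREGISTERED;
	for (i = 0; i < rebroadcasts; i++)
		route_dev_notify(&dev, EV_UNREGISTER);	/* skipped by the guard */

	return dev.idev_refs;
}
```

Without the reg_state test the count would go negative on every rebroadcast, which is the unbalanced in6_dev_put() the patch is guarding against.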
Re: [oss-drivers] Re: [PATCH net-next 00/12] nfp: add flower app with representors
On Tue, 20 Jun 2017 19:13:43 +0300, Or Gerlitz wrote:
> > Control queues are used to send and receive control messages which are
> > used to communicate configuration information with the firmware. These
> > are in separate vNIC to the queues belonging to the PF netdev. The control
> > queues are not exposed to user-space via a netdev or any other means.
>
> Do you have documentation for the control channel or I should look on
> earlier commits?

Hi Or!

We don't have any docs, the ctrl channel was merged in e5c5180a2302 ("Merge branch 'nfp-ctrl-vNIC'"). The "control channel" is essentially a normal data queue which is specially marked as carrying control messages.

> The control messages you describe here are also the ones that are used
> to load/unload specific app?

No, the app loading, PHY port management and other low-level tasks are handled by management FW. The control messages are an application FW construct. The control messages are transported by the datapath, and since the datapath is entirely under control of apps, the management FW can't depend on it. The apps today also completely reload the PCIe datapath implementation (which is software defined), so we need to use raw memory mappings to communicate with management FW.

The control messages are mostly used for populating tables and reading statistics, because those two need to be fast and low overhead.
Re: [PATCH net] net: stmmac: free an skb first when there are no longer any descriptors using it
From: Niklas Cassel
Date: Tue, 20 Jun 2017 14:32:41 +0200

> When having the skb pointer in the first descriptor, stmmac_tx_clean
> can get called at a moment where the IP has only cleared the own bit
> of the first descriptor, thus freeing the skb, even though there can
> be several descriptors whose buffers point into the same skb.
>
> By simply moving the skb pointer from the first descriptor to the last
> descriptor, a skb will get freed only when the IP has cleared the
> own bit of all the descriptors that are using that skb.
>
> Signed-off-by: Niklas Cassel

Applied and queued up for -stable, thanks.
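A minimal user-space sketch of the ordering problem described in the quoted commit message may help here. It is a model only: fake_desc, fake_skb, tx_clean() and friends are hypothetical names, not the stmmac driver's structures, and it assumes the clean routine walks descriptors in order and stops at the first one still owned by the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4

struct fake_skb { bool freed; };

struct fake_desc {
	bool own;		/* true while the hardware still owns the descriptor */
	struct fake_skb *skb;	/* non-NULL on exactly one descriptor per packet */
};

/* Queue one packet spanning `ndesc` descriptors starting at index 0.
 * `skb_on_last` selects which descriptor carries the skb pointer.
 */
void queue_packet(struct fake_desc *ring, int ndesc,
		  struct fake_skb *skb, bool skb_on_last)
{
	int i;

	assert(ndesc >= 1 && ndesc <= RING_SIZE);
	for (i = 0; i < ndesc; i++) {
		ring[i].own = true;
		ring[i].skb = NULL;
	}
	ring[skb_on_last ? ndesc - 1 : 0].skb = skb;
}

/* Reclaim descriptors in order until one is still owned by hardware. */
void tx_clean(struct fake_desc *ring, int ndesc)
{
	int i;

	for (i = 0; i < ndesc; i++) {
		if (ring[i].own)
			break;
		if (ring[i].skb) {
			ring[i].skb->freed = true;
			ring[i].skb = NULL;
		}
	}
}

/* Returns true if the skb was freed while the hardware had completed
 * only the first of `ndesc` descriptors.
 */
bool freed_too_early(int ndesc, bool skb_on_last)
{
	struct fake_desc ring[RING_SIZE];
	struct fake_skb skb = { false };

	queue_packet(ring, ndesc, &skb, skb_on_last);
	ring[0].own = false;	/* hardware finished descriptor 0 only */
	tx_clean(ring, ndesc);
	return skb.freed;
}
```

With the pointer on the first descriptor the skb is released while later descriptors still reference its buffers; with the pointer on the last descriptor the clean pass stops before reaching it.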
Re: [pull request][net-next 00/10] Mellanox, mlx5 IPoIB updates 2017-06-20
From: Saeed Mahameed
Date: Tue, 20 Jun 2017 17:13:04 +0300

> This series mainly from Erez and Feras includes some updates and
> ethtool/ndos extension to the mlx5 IPoIB netdevice.
>
> for more details please see tag log below.
>
> Please pull and let me know if there's any problem.

Pulled, thanks Saeed.
Re: [PATCH] liquidio: stop using huge static buffer, save 4096k in .data
From: Derek Chickles
Date: Tue, 20 Jun 2017 13:15:34 -0700

> > From: David Miller [mailto:da...@davemloft.net]
> > Sent: Tuesday, June 20, 2017 12:22 PM
> >
> > From: Denys Vlasenko
> > Date: Mon, 19 Jun 2017 21:50:52 +0200
> >
> > > Only compile-tested - I don't have the hardware.
> > >
> > > From code inspection, octeon_pci_write_core_mem() appears to be safe wrt
> > > unaligned source. In any case, u8 fbuf[] was not guaranteed to be aligned
> > > anyway.
> > >
> > > Signed-off-by: Denys Vlasenko
> >
> > Looks good to me but I'll let one of the liquidio guys review this first
> > before I apply it.
>
> Felix is going to try this out this week to confirm. Let's wait for his ack.

This patch works. I tested it with a LiquidIO II adapter.

ACK
Re: [PATCH net-next] sctp: uncork the old asoc before changing to the new one
On Tue, Jun 20, 2017 at 04:01:55PM +0800, Xin Long wrote: > local_cork is used to decide if it should uncork asoc outq after processing > some cmds, and it is set when replying or sending msgs. local_cork should > always have the same value with current asoc q->cork in some way. > > The thing is when changing to a new asoc by cmd SET_ASOC, local_cork may > not be consistent with the current asoc any more. The cmd seqs can be: > > SCTP_CMD_UPDATE_ASSOC (asoc) > SCTP_CMD_REPLY (asoc) > SCTP_CMD_SET_ASOC (new_asoc) > SCTP_CMD_DELETE_TCB (new_asoc) > SCTP_CMD_SET_ASOC (asoc) > SCTP_CMD_REPLY (asoc) > > The 1st REPLY makes OLD asoc q->cork and local_cork both are 1, and the cmd > DELETE_TCB clears NEW asoc q->cork and local_cork. After asoc goes back to > OLD asoc, q->cork is still 1 while local_cork is 0. The 2nd REPLY will not > set local_cork because q->cork is already set and it can't be uncorked and > sent out because of this. > > To keep local_cork consistent with the current asoc q->cork, this patch is > to uncork the old asoc if local_cork is set before changing to the new one. > > Note that the above cmd seqs will be used in the next patch when updating > asoc and handling errors in it. 
>
> Suggested-by: Marcelo Ricardo Leitner
> Signed-off-by: Xin Long
> ---
>  net/sctp/sm_sideeffect.c | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
> index 25384fa..7623566 100644
> --- a/net/sctp/sm_sideeffect.c
> +++ b/net/sctp/sm_sideeffect.c
> @@ -1748,6 +1748,10 @@ static int sctp_cmd_interpreter(sctp_event_t event_type,
>  			break;
>
>  		case SCTP_CMD_SET_ASOC:
> +			if (asoc && local_cork) {
> +				sctp_outq_uncork(&asoc->outqueue, gfp);
> +				local_cork = 0;
> +			}
>  			asoc = cmd->obj.asoc;
>  			break;
>
> --
> 2.1.0
>

Acked-by: Neil Horman
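The command sequence from the quoted commit message can be replayed in a small user-space model. This is a sketch under stated assumptions, not the SCTP state machine: fake_asoc and the cmd_*() helpers are invented for illustration, uncorking is modelled as flushing every queued chunk, and DELETE_TCB is assumed to uncork whenever local_cork is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an association's outqueue; not the kernel struct. */
struct fake_asoc {
	bool cork;	/* models q->cork */
	int pending;	/* chunks queued but not yet sent */
};

static void outq_uncork(struct fake_asoc *a)
{
	a->cork = false;
	a->pending = 0;	/* uncorking flushes everything queued */
}

/* REPLY: cork the queue if it is not corked yet, then queue one chunk. */
static void cmd_reply(struct fake_asoc *a, bool *local_cork)
{
	if (!a->cork) {
		a->cork = true;
		*local_cork = true;
	}
	a->pending++;
}

/* SET_ASOC: with the fix, flush the old association before switching. */
static struct fake_asoc *cmd_set_asoc(struct fake_asoc *cur,
				      struct fake_asoc *next,
				      bool *local_cork, bool with_fix)
{
	if (with_fix && cur && *local_cork) {
		outq_uncork(cur);
		*local_cork = false;
	}
	return next;
}

/* DELETE_TCB: uncork the current association if we corked it. */
static void cmd_delete_tcb(struct fake_asoc *a, bool *local_cork)
{
	if (*local_cork) {
		outq_uncork(a);
		*local_cork = false;
	}
}

/* Replays REPLY, SET_ASOC(new), DELETE_TCB, SET_ASOC(old), REPLY and
 * returns how many chunks are left stuck on the old association.
 */
int stuck_chunks(bool with_fix)
{
	struct fake_asoc old_asoc = { false, 0 };
	struct fake_asoc new_asoc = { false, 0 };
	struct fake_asoc *asoc = &old_asoc;
	bool local_cork = false;

	cmd_reply(asoc, &local_cork);
	asoc = cmd_set_asoc(asoc, &new_asoc, &local_cork, with_fix);
	cmd_delete_tcb(asoc, &local_cork);
	asoc = cmd_set_asoc(asoc, &old_asoc, &local_cork, with_fix);
	cmd_reply(asoc, &local_cork);

	/* End of the interpreter: uncork only if local_cork is set. */
	if (local_cork)
		outq_uncork(asoc);

	assert(asoc == &old_asoc);
	return old_asoc.pending;
}
```

In this model the unpatched sequence leaves both replies queued on the old association because q->cork stays set while local_cork has been cleared, which matches the inconsistency the patch describes.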
Re: [PATCH] net: phy: smsc: fix buffer overflow in memcpy
On Tue, Jun 20, 2017 at 10:40:46PM +0200, Arnd Bergmann wrote:
> The memcpy annotation triggers for a fixed-length buffer copy:
>
> In file included from /git/arm-soc/arch/arm64/include/asm/processor.h:30:0,
>                  from /git/arm-soc/arch/arm64/include/asm/spinlock.h:21,
>                  from /git/arm-soc/include/linux/spinlock.h:87,
>                  from /git/arm-soc/include/linux/seqlock.h:35,
>                  from /git/arm-soc/include/linux/time.h:5,
>                  from /git/arm-soc/include/linux/stat.h:21,
>                  from /git/arm-soc/include/linux/module.h:10,
>                  from /git/arm-soc/drivers/net/phy/smsc.c:20:
> In function 'memcpy',
>     inlined from 'smsc_get_strings' at /git/arm-soc/drivers/net/phy/smsc.c:166:3:
> /git/arm-soc/include/linux/string.h:309:4: error: call to '__read_overflow2'
> declared with attribute error: detected read beyond size of object passed as
> 2nd parameter
>
> Using strncpy instead of memcpy should do the right thing here.

Hi Arnd

You will find this pattern in a number of phy drivers:

bcm-phy-lib.c: memcpy(data + i * ETH_GSTRING_LEN,
marvell.c:     memcpy(data + i * ETH_GSTRING_LEN,
micrel.c:      memcpy(data + i * ETH_GSTRING_LEN,
smsc.c:        memcpy(data + i * ETH_GSTRING_LEN,

They probably all need the same fix.

Andrew
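To make the pattern concrete, here is a small user-space sketch of the strncpy variant suggested in the quoted mail. The stat table and the names hw_stats, get_strings() and stats_count() are invented for illustration and are not taken from any of the drivers listed; GSTRING_LEN is assumed to match the kernel's ETH_GSTRING_LEN of 32. The point is that strncpy stops reading at the source's NUL and zero-pads the rest of the slot, whereas a fixed-length memcpy reads 32 bytes from a shorter string literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GSTRING_LEN 32	/* assumed equal to the kernel's ETH_GSTRING_LEN */

/* Hypothetical stat table; real drivers carry register details as well. */
struct hw_stat {
	const char *string;
};

static const struct hw_stat hw_stats[] = {
	{ "phy_symbol_errors" },
	{ "rx_crc" },
};

size_t stats_count(void)
{
	return sizeof(hw_stats) / sizeof(hw_stats[0]);
}

/* Fill `data` with one fixed-width, zero-padded name per statistic.
 * `data` must hold at least stats_count() * GSTRING_LEN bytes.
 */
void get_strings(char *data)
{
	size_t i;

	assert(data != NULL);
	for (i = 0; i < stats_count(); i++)
		strncpy(data + i * GSTRING_LEN, hw_stats[i].string,
			GSTRING_LEN);
}
```

Note that strncpy does not NUL-terminate when the source is GSTRING_LEN bytes or longer; for fixed-width name slots that is generally acceptable, but it is a design choice worth keeping in mind.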
Re: [PATCH] liquidio: stop using huge static buffer, save 4096k in .data
From: Denys Vlasenko
Date: Mon, 19 Jun 2017 21:50:52 +0200

> Only compile-tested - I don't have the hardware.
>
> From code inspection, octeon_pci_write_core_mem() appears to be safe wrt
> unaligned source. In any case, u8 fbuf[] was not guaranteed to be aligned
> anyway.
>
> Signed-off-by: Denys Vlasenko

Looks good to me but I'll let one of the liquidio guys review this first before I apply it.
Re: [PATCH net-next v2 0/4] ipmr/ip6mr: add Netlink notifications on cache reports
From: Julien Gomes
Date: Mon, 19 Jun 2017 13:44:13 -0700

> Currently, all ipmr/ip6mr cache reports are sent through the
> mroute/mroute6 socket only.
> This forces the use of a single socket for mroute programming, cache
> reports and, regarding ipmr, IGMP messages without Router Alert option
> reception.
>
> The present patches are aiming to send Netlink notifications in addition
> to the existing igmpmsg/mrt6msg to give user programs a way to handle
> cache reports in parallel with multiple sockets other than the
> mroute/mroute6 socket.
>
> Changes in v2:
> - Changed attributes naming from {IPMRA,IP6MRA}_CACHEREPORTA_* to
>   {IPMRA,IP6MRA}_CREPORT_*
> - Improved packet data copy to handle non-linear packets in
>   ipmr/ip6mr cache report Netlink notification creation
> - Added two rtnetlink groups with restricted-binding
> - Changed cache report notified groups from RTNL_{IPV4,IPV6}_MROUTE to
>   the new restricted groups in ipmr/ip6mr

Please address Nikolay's feedback about interface number limits etc.

Thanks.
Re: [PATCH net] sfc: remove duplicate up_write on VF filter_sem
From: Edward Cree
Date: Tue, 20 Jun 2017 13:08:51 +0100

> Somehow two copies of the line 'up_write(&vf->efx->filter_sem);' got into
> efx_ef10_sriov_set_vf_vlan(). This would put the mutex in a bad state and
> cause all subsequent down attempts to hang.
>
> Fixes: 671b53eec2ed ("sfc: Ensure down_write(&filter_sem) and up_write() are
> matched before calling efx_net_open()")
> Signed-off-by: Edward Cree

Applied and queued up for -stable.
[PATCH net-next v3 0/4] ipmr/ip6mr: add Netlink notifications on cache reports
Currently, all ipmr/ip6mr cache reports are sent through the mroute/mroute6 socket only. This forces the use of a single socket for mroute programming, cache reports and, regarding ipmr, IGMP messages without Router Alert option reception.

The present patches are aiming to send Netlink notifications in addition to the existing igmpmsg/mrt6msg to give user programs a way to handle cache reports in parallel with multiple sockets other than the mroute/mroute6 socket.

Changes in v2:
- Changed attributes naming from {IPMRA,IP6MRA}_CACHEREPORTA_* to
  {IPMRA,IP6MRA}_CREPORT_*
- Improved packet data copy to handle non-linear packets in
  ipmr/ip6mr cache report Netlink notification creation
- Added two rtnetlink groups with restricted-binding
- Changed cache report notified groups from RTNL_{IPV4,IPV6}_MROUTE to
  the new restricted groups in ipmr/ip6mr

Changes in v3:
- Put message size calculation for {igmp,mrt6}msg_netlink_event in
  separate functions
- Increased vif id attributes size from u8 to u32

Julien Gomes (4):
  rtnetlink: add NEWCACHEREPORT message type
  rtnetlink: add restricted rtnl groups for ipv4 and ipv6 mroute
  ipmr: add netlink notifications on igmpmsg cache reports
  ip6mr: add netlink notifications on mrt6msg cache reports

 include/uapi/linux/mroute.h    | 12 +++
 include/uapi/linux/mroute6.h   | 12 +++
 include/uapi/linux/rtnetlink.h |  7 +
 net/core/rtnetlink.c           | 13
 net/ipv4/ipmr.c                | 69 ++--
 net/ipv6/ip6mr.c               | 71 --
 security/selinux/nlmsgtab.c    |  3 +-
 7 files changed, 182 insertions(+), 5 deletions(-)

-- 
2.13.1
[PATCH] net: intel: e1000e: add check on e1e_wphy() return value
Check return value from call to e1e_wphy(). This value is being checked during previous calls to function e1e_wphy() and it seems a check was missing here.

Addresses-Coverity-ID: 1226905
Signed-off-by: Gustavo A. R. Silva
---
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index 68ea8b4..d6d4ed7 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -2437,6 +2437,8 @@ static s32 e1000_hv_phy_workarounds_ich8lan(struct e1000_hw *hw)
 		if (hw->phy.revision < 2) {
 			e1000e_phy_sw_reset(hw);
 			ret_val = e1e_wphy(hw, MII_BMCR, 0x3140);
+			if (ret_val)
+				return ret_val;
 		}
 	}
 
-- 
2.5.0
[PATCH net-next v2 05/12] nfp: general representor implementation
Provide infrastructure to create and destroy representors of a given type. Parts based on work by Bert van Leeuwen, Benjamin LaHaise, and Jakub Kicinski. Signed-off-by: Simon HormanReviewed-by: Jakub Kicinski --- drivers/net/ethernet/netronome/nfp/Makefile | 1 + drivers/net/ethernet/netronome/nfp/nfp_app.c | 20 +++ drivers/net/ethernet/netronome/nfp/nfp_app.h | 18 +++ drivers/net/ethernet/netronome/nfp/nfp_net_repr.c | 156 ++ drivers/net/ethernet/netronome/nfp/nfp_net_repr.h | 92 + 5 files changed, 287 insertions(+) create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net_repr.c create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net_repr.h diff --git a/drivers/net/ethernet/netronome/nfp/Makefile b/drivers/net/ethernet/netronome/nfp/Makefile index 5ad9a557f06a..a401113035f5 100644 --- a/drivers/net/ethernet/netronome/nfp/Makefile +++ b/drivers/net/ethernet/netronome/nfp/Makefile @@ -22,6 +22,7 @@ nfp-objs := \ nfp_net_common.o \ nfp_net_ethtool.o \ nfp_net_main.o \ + nfp_net_repr.o \ nfp_netvf_main.o \ nfp_port.o \ bpf/main.o \ diff --git a/drivers/net/ethernet/netronome/nfp/nfp_app.c b/drivers/net/ethernet/netronome/nfp/nfp_app.c index 396b93f54823..c9ccb0f94604 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_app.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_app.c @@ -38,6 +38,7 @@ #include "nfpcore/nfp_nffw.h" #include "nfp_app.h" #include "nfp_main.h" +#include "nfp_net_repr.h" static const struct nfp_app_type *apps[] = { _nic, @@ -68,6 +69,25 @@ struct sk_buff *nfp_app_ctrl_msg_alloc(struct nfp_app *app, unsigned int size) return skb; } +struct nfp_reprs * +nfp_app_reprs_set(struct nfp_app *app, enum nfp_repr_type type, + struct nfp_reprs *reprs) +{ + struct nfp_reprs *old; + + old = rcu_dereference_protected(app->reprs[type], + lockdep_is_held(>pf->lock)); + if (reprs && old) { + old = ERR_PTR(-EBUSY); + goto exit_unlock; + } + + rcu_assign_pointer(app->reprs[type], reprs); + +exit_unlock: + return old; +} + struct nfp_app *nfp_app_alloc(struct 
nfp_pf *pf, enum nfp_app_id id) { struct nfp_app *app; diff --git a/drivers/net/ethernet/netronome/nfp/nfp_app.h b/drivers/net/ethernet/netronome/nfp/nfp_app.h index 0fee14ffa081..af023a0491e7 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_app.h +++ b/drivers/net/ethernet/netronome/nfp/nfp_app.h @@ -36,6 +36,8 @@ #include +#include "nfp_net_repr.h" + struct bpf_prog; struct net_device; struct pci_dev; @@ -73,6 +75,7 @@ extern const struct nfp_app_type app_bpf; * @tc_busy: TC HW offload busy (rules loaded) * @xdp_offload:offload an XDP program * @eswitch_mode_get:get SR-IOV eswitch mode + * @repr_get: get representor netdev */ struct nfp_app_type { enum nfp_app_id id; @@ -100,6 +103,7 @@ struct nfp_app_type { struct bpf_prog *prog); enum devlink_eswitch_mode (*eswitch_mode_get)(struct nfp_app *app); + struct net_device *(*repr_get)(struct nfp_app *app, u32 id); }; /** @@ -108,6 +112,7 @@ struct nfp_app_type { * @pf:backpointer to NFP PF structure * @cpp: pointer to the CPP handle * @ctrl: pointer to ctrl vNIC struct + * @reprs: array of pointers to representors * @type: pointer to const application ops and info */ struct nfp_app { @@ -116,6 +121,7 @@ struct nfp_app { struct nfp_cpp *cpp; struct nfp_net *ctrl; + struct nfp_reprs __rcu *reprs[NFP_REPR_TYPE_MAX + 1]; const struct nfp_app_type *type; }; @@ -231,6 +237,18 @@ static inline int nfp_app_eswitch_mode_get(struct nfp_app *app, u16 *mode) return 0; } +static inline struct net_device *nfp_app_repr_get(struct nfp_app *app, u32 id) +{ + if (unlikely(!app || !app->type->repr_get)) + return NULL; + + return app->type->repr_get(app, id); +} + +struct nfp_reprs * +nfp_app_reprs_set(struct nfp_app *app, enum nfp_repr_type type, + struct nfp_reprs *reprs); + const char *nfp_app_mip_name(struct nfp_app *app); struct sk_buff *nfp_app_ctrl_msg_alloc(struct nfp_app *app, unsigned int size); diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c b/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c new file mode 
100644 index ..8e02f843ae92 --- /dev/null +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c @@ -0,0 +1,156 @@ +/* + * Copyright (C) 2017 Netronome Systems, Inc. + * + * This software is dual licensed under the GNU General License Version 2, + * June 1991 as shown in the file COPYING in the top-level directory of this + * source tree or the BSD 2-Clause License provided below. You have the
[PATCH net-next v2 04/12] nfp: map mac_stats and vf_cfg BARs
If present map mac_stats and vf_cfg BARs. These will be used by representor netdevs to read statistics for phys port and vf representors. Also provide defines describing the layout of the mac_stats area. Similar defines are already present for the cf_cfg area. Based in part on work by Jakub Kicinski. Signed-off-by: Simon HormanReviewed-by: Jakub Kicinski --- drivers/net/ethernet/netronome/nfp/nfp_main.h | 8 ++ drivers/net/ethernet/netronome/nfp/nfp_net_main.c | 116 +++-- drivers/net/ethernet/netronome/nfp/nfp_port.h | 60 +++ .../net/ethernet/netronome/nfp/nfpcore/nfp_nsp.h | 2 + .../ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c | 5 +- 5 files changed, 161 insertions(+), 30 deletions(-) diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.h b/drivers/net/ethernet/netronome/nfp/nfp_main.h index 88724f8d0dcd..aa69d4101eb9 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_main.h +++ b/drivers/net/ethernet/netronome/nfp/nfp_main.h @@ -68,6 +68,10 @@ struct nfp_rtsym_table; * @data_vnic_bar: Pointer to the CPP area for the data vNICs' BARs * @ctrl_vnic_bar: Pointer to the CPP area for the ctrl vNIC's BAR * @qc_area: Pointer to the CPP area for the queues + * @mac_stats_bar: Pointer to the CPP area for the MAC stats + * @mac_stats_mem: Pointer to mapped MAC stats area + * @vf_cfg_bar:Pointer to the CPP area for the VF configuration BAR + * @vf_cfg_mem:Pointer to mapped VF configuration area * @irq_entries: Array of MSI-X entries for all vNICs * @limit_vfs: Number of VFs supported by firmware (~0 for PCI limit) * @num_vfs: Number of SR-IOV VFs enabled @@ -97,6 +101,10 @@ struct nfp_pf { struct nfp_cpp_area *data_vnic_bar; struct nfp_cpp_area *ctrl_vnic_bar; struct nfp_cpp_area *qc_area; + struct nfp_cpp_area *mac_stats_bar; + u8 __iomem *mac_stats_mem; + struct nfp_cpp_area *vf_cfg_bar; + u8 __iomem *vf_cfg_mem; struct msix_entry *irq_entries; diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c index 
bc2bc0886176..eb87e1c08bb1 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c @@ -570,6 +570,79 @@ static void nfp_net_pf_app_stop(struct nfp_pf *pf) nfp_net_pf_app_stop_ctrl(pf); } +static void nfp_net_pci_unmap_mem(struct nfp_pf *pf) +{ + if (pf->vf_cfg_bar) + nfp_cpp_area_release_free(pf->vf_cfg_bar); + if (pf->mac_stats_bar) + nfp_cpp_area_release_free(pf->mac_stats_bar); + nfp_cpp_area_release_free(pf->qc_area); + nfp_cpp_area_release_free(pf->data_vnic_bar); +} + +static int nfp_net_pci_map_mem(struct nfp_pf *pf) +{ + u32 ctrl_bar_sz; + u8 __iomem *mem; + int err; + + ctrl_bar_sz = pf->max_data_vnics * NFP_PF_CSR_SLICE_SIZE; + mem = nfp_net_pf_map_rtsym(pf, "net.ctrl", "_pf%d_net_bar0", + ctrl_bar_sz, >data_vnic_bar); + if (IS_ERR(mem)) { + err = PTR_ERR(mem); + if (!pf->fw_loaded && err == -ENOENT) + err = -EPROBE_DEFER; + return err; + } + + pf->mac_stats_mem = nfp_net_pf_map_rtsym(pf, "net.macstats", +"_mac_stats", +NFP_MAC_STATS_SIZE * +(pf->eth_tbl->max_index + 1), +>mac_stats_bar); + if (IS_ERR(pf->mac_stats_mem)) { + if (PTR_ERR(pf->mac_stats_mem) != -ENOENT) { + err = PTR_ERR(pf->mac_stats_mem); + goto err_unmap_ctrl; + } + pf->mac_stats_mem = NULL; + } + + pf->vf_cfg_mem = nfp_net_pf_map_rtsym(pf, "net.vfcfg", + "_pf%d_net_vf_bar", + NFP_NET_CFG_BAR_SZ * + pf->limit_vfs, >vf_cfg_bar); + if (IS_ERR(pf->vf_cfg_mem)) { + if (PTR_ERR(pf->vf_cfg_mem) != -ENOENT) { + err = PTR_ERR(pf->vf_cfg_mem); + goto err_unmap_mac_stats; + } + pf->vf_cfg_mem = NULL; + } + + mem = nfp_net_map_area(pf->cpp, "net.qc", 0, 0, + NFP_PCIE_QUEUE(0), NFP_QCP_QUEUE_AREA_SZ, + >qc_area); + if (IS_ERR(mem)) { + nfp_err(pf->cpp, "Failed to map Queue Controller area.\n"); + err = PTR_ERR(mem); + goto err_unmap_vf_cfg; + } + + return 0; + +err_unmap_vf_cfg: + if (pf->vf_cfg_bar) +
[PATCH net-next v2 03/12] nfp: move physical port init into a helper
From: Jakub KicinskiMove MAC/PHY port init into a helper to make it easier to reuse it in the representor code. Signed-off-by: Jakub Kicinski Signed-off-by: Simon Horman --- drivers/net/ethernet/netronome/nfp/nfp_app_nic.c | 23 ++ drivers/net/ethernet/netronome/nfp/nfp_port.c| 25 drivers/net/ethernet/netronome/nfp/nfp_port.h| 3 +++ 3 files changed, 34 insertions(+), 17 deletions(-) diff --git a/drivers/net/ethernet/netronome/nfp/nfp_app_nic.c b/drivers/net/ethernet/netronome/nfp/nfp_app_nic.c index 83c65e6291ee..7b966bd3d214 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_app_nic.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_app_nic.c @@ -42,6 +42,8 @@ static int nfp_app_nic_vnic_init_phy_port(struct nfp_pf *pf, struct nfp_app *app, struct nfp_net *nn, unsigned int id) { + int err; + if (!pf->eth_tbl) return 0; @@ -49,26 +51,13 @@ nfp_app_nic_vnic_init_phy_port(struct nfp_pf *pf, struct nfp_app *app, if (IS_ERR(nn->port)) return PTR_ERR(nn->port); - nn->port->eth_id = id; - nn->port->eth_port = nfp_net_find_port(pf->eth_tbl, id); - - /* Check if vNIC has external port associated and cfg is OK */ - if (!nn->port->eth_port) { - nfp_err(app->cpp, - "NSP port entries don't match vNICs (no entry for port #%d)\n", - id); + err = nfp_port_init_phy_port(pf, app, nn->port, id); + if (err) { nfp_port_free(nn->port); - return -EINVAL; - } - if (nn->port->eth_port->override_changed) { - nfp_warn(app->cpp, -"Config changed for port #%d, reboot required before port will be operational\n", -id); - nn->port->type = NFP_PORT_INVALID; - return 1; + return err; } - return 0; + return nn->port->type == NFP_PORT_INVALID; } int nfp_app_nic_vnic_init(struct nfp_app *app, struct nfp_net *nn, diff --git a/drivers/net/ethernet/netronome/nfp/nfp_port.c b/drivers/net/ethernet/netronome/nfp/nfp_port.c index a17410ac01ab..19bceeb82225 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_port.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_port.c @@ -33,6 +33,7 @@ #include +#include 
"nfpcore/nfp_cpp.h" #include "nfpcore/nfp_nsp.h" #include "nfp_app.h" #include "nfp_main.h" @@ -112,6 +113,30 @@ nfp_port_get_phys_port_name(struct net_device *netdev, char *name, size_t len) return 0; } +int nfp_port_init_phy_port(struct nfp_pf *pf, struct nfp_app *app, + struct nfp_port *port, unsigned int id) +{ + port->eth_id = id; + port->eth_port = nfp_net_find_port(pf->eth_tbl, id); + + /* Check if vNIC has external port associated and cfg is OK */ + if (!port->eth_port) { + nfp_err(app->cpp, + "NSP port entries don't match vNICs (no entry for port #%d)\n", + id); + return -EINVAL; + } + if (port->eth_port->override_changed) { + nfp_warn(app->cpp, +"Config changed for port #%d, reboot required before port will be operational\n", +id); + port->type = NFP_PORT_INVALID; + return 0; + } + + return 0; +} + struct nfp_port * nfp_port_alloc(struct nfp_app *app, enum nfp_port_type type, struct net_device *netdev) diff --git a/drivers/net/ethernet/netronome/nfp/nfp_port.h b/drivers/net/ethernet/netronome/nfp/nfp_port.h index 4d1a9b3fed41..fb28c7071987 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_port.h +++ b/drivers/net/ethernet/netronome/nfp/nfp_port.h @@ -104,6 +104,9 @@ nfp_port_alloc(struct nfp_app *app, enum nfp_port_type type, struct net_device *netdev); void nfp_port_free(struct nfp_port *port); +int nfp_port_init_phy_port(struct nfp_pf *pf, struct nfp_app *app, + struct nfp_port *port, unsigned int id); + int nfp_net_refresh_eth_port(struct nfp_port *port); void nfp_net_refresh_port_table(struct nfp_port *port); int nfp_net_refresh_port_table_sync(struct nfp_pf *pf); -- 2.1.4
[PATCH net-next v2 02/12] nfp: devlink add support for getting eswitch mode
From: Jakub Kicinski

Add app callback for reporting eswitch mode. Non-SRIOV apps should not implement this callback, nfp_app code will then respond with -EOPNOTSUPP.

Signed-off-by: Jakub Kicinski
Signed-off-by: Simon Horman
---
 drivers/net/ethernet/netronome/nfp/nfp_app.h     | 15 +++
 drivers/net/ethernet/netronome/nfp/nfp_devlink.c | 18 ++
 2 files changed, 33 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_app.h b/drivers/net/ethernet/netronome/nfp/nfp_app.h
index f5e373fa8c3b..0fee14ffa081 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_app.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_app.h
@@ -34,6 +34,8 @@
 #ifndef _NFP_APP_H
 #define _NFP_APP_H 1
 
+#include
+
 struct bpf_prog;
 struct net_device;
 struct pci_dev;
@@ -70,6 +72,7 @@ extern const struct nfp_app_type app_bpf;
  * @setup_tc:	setup TC ndo
  * @tc_busy:	TC HW offload busy (rules loaded)
  * @xdp_offload:	offload an XDP program
+ * @eswitch_mode_get:	get SR-IOV eswitch mode
  */
 struct nfp_app_type {
 	enum nfp_app_id id;
@@ -95,6 +98,8 @@ struct nfp_app_type {
 	bool (*tc_busy)(struct nfp_app *app, struct nfp_net *nn);
 	int (*xdp_offload)(struct nfp_app *app, struct nfp_net *nn,
 			   struct bpf_prog *prog);
+
+	enum devlink_eswitch_mode (*eswitch_mode_get)(struct nfp_app *app);
 };
 
 /**
@@ -216,6 +221,16 @@ static inline void nfp_app_ctrl_rx(struct nfp_app *app, struct sk_buff *skb)
 	app->type->ctrl_msg_rx(app, skb);
 }
 
+static inline int nfp_app_eswitch_mode_get(struct nfp_app *app, u16 *mode)
+{
+	if (!app->type->eswitch_mode_get)
+		return -EOPNOTSUPP;
+
+	*mode = app->type->eswitch_mode_get(app);
+
+	return 0;
+}
+
 const char *nfp_app_mip_name(struct nfp_app *app);
 
 struct sk_buff *nfp_app_ctrl_msg_alloc(struct nfp_app *app, unsigned int size);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index 2609a0f28e81..6c9f29c2e975 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -149,9 +149,27 @@ nfp_devlink_port_unsplit(struct devlink *devlink, unsigned int port_index)
 	return ret;
 }
 
+static int nfp_devlink_eswitch_mode_get(struct devlink *devlink, u16 *mode)
+{
+	struct nfp_pf *pf = devlink_priv(devlink);
+	int ret;
+
+	mutex_lock(&pf->lock);
+	if (!pf->app) {
+		ret = -EBUSY;
+		goto out;
+	}
+	ret = nfp_app_eswitch_mode_get(pf->app, mode);
+out:
+	mutex_unlock(&pf->lock);
+
+	return ret;
+}
+
 const struct devlink_ops nfp_devlink_ops = {
 	.port_split = nfp_devlink_port_split,
 	.port_unsplit = nfp_devlink_port_unsplit,
+	.eswitch_mode_get = nfp_devlink_eswitch_mode_get,
 };
 
 int nfp_devlink_port_register(struct nfp_app *app, struct nfp_port *port)
-- 
2.1.4
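The optional-callback convention used in this patch (leave the hook NULL and the wrapper answers -EOPNOTSUPP) can be shown in isolation. The following is a stand-alone sketch, not driver code: struct app, struct app_type, the two example app types and the eswitch_mode enum are hypothetical stand-ins for the nfp and devlink definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum devlink_eswitch_mode. */
enum eswitch_mode { ESWITCH_MODE_LEGACY, ESWITCH_MODE_SWITCHDEV };

struct app;

/* Per-app operations; apps without SR-IOV support leave the hook NULL. */
struct app_type {
	enum eswitch_mode (*eswitch_mode_get)(struct app *app);
};

struct app {
	const struct app_type *type;
};

static enum eswitch_mode sriov_mode_get(struct app *app)
{
	(void)app;
	return ESWITCH_MODE_SWITCHDEV;
}

/* One app type without the hook, one with it (both made up). */
const struct app_type app_plain = { NULL };
const struct app_type app_sriov = { sriov_mode_get };

/* Wrapper in the style of the patch: report "not supported" when the
 * hook is absent, otherwise store the mode and return 0.
 */
int app_eswitch_mode_get(struct app *app, uint16_t *mode)
{
	assert(app != NULL && mode != NULL);

	if (!app->type->eswitch_mode_get)
		return -EOPNOTSUPP;

	*mode = (uint16_t)app->type->eswitch_mode_get(app);

	return 0;
}
```

Keeping the NULL check in one wrapper means callers such as the devlink op never need to know which app types implement the hook, and the output parameter is left untouched on failure.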
[PATCH net-next v2 09/12] nfp: add support for tx/rx with metadata portid
Allow tx/rx with metadata port id. This will be used for tx/rx of representor netdevs acting as upper-devices while a pf netdev acts as a lower-device. Signed-off-by: Simon HormanReviewed-by: Jakub Kicinski --- drivers/net/ethernet/netronome/nfp/nfp_net.h | 1 + .../net/ethernet/netronome/nfp/nfp_net_common.c| 57 +++--- 2 files changed, 52 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h index 02fd8d4e253c..96c8ea476c05 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net.h +++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h @@ -318,6 +318,7 @@ struct nfp_meta_parsed { u8 csum_type; u32 hash; u32 mark; + u32 portid; __wsum csum; }; diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c index 2b1ae666..49b8bc937ad8 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c @@ -755,6 +755,26 @@ static void nfp_net_tx_xmit_more_flush(struct nfp_net_tx_ring *tx_ring) tx_ring->wr_ptr_add = 0; } +static int nfp_net_prep_port_id(struct sk_buff *skb) +{ + struct metadata_dst *md_dst = skb_metadata_dst(skb); + unsigned char *data; + + if (likely(!md_dst)) + return 0; + if (unlikely(md_dst->type != METADATA_HW_PORT_MUX)) + return 0; + + if (unlikely(skb_cow_head(skb, 8))) + return -ENOMEM; + + data = skb_push(skb, 8); + put_unaligned_be32(NFP_NET_META_PORTID, data); + put_unaligned_be32(md_dst->u.port_info.port_id, data + 4); + + return 8; +} + /** * nfp_net_tx() - Main transmit entry point * @skb:SKB to transmit @@ -767,6 +787,7 @@ static int nfp_net_tx(struct sk_buff *skb, struct net_device *netdev) struct nfp_net *nn = netdev_priv(netdev); const struct skb_frag_struct *frag; struct nfp_net_tx_desc *txd, txdg; + int f, nr_frags, wr_idx, md_bytes; struct nfp_net_tx_ring *tx_ring; struct nfp_net_r_vector *r_vec; struct nfp_net_tx_buf *txbuf; @@ -774,8 +795,6 @@ 
static int nfp_net_tx(struct sk_buff *skb, struct net_device *netdev) struct nfp_net_dp *dp; dma_addr_t dma_addr; unsigned int fsize; - int f, nr_frags; - int wr_idx; u16 qidx; dp = >dp; @@ -797,6 +816,13 @@ static int nfp_net_tx(struct sk_buff *skb, struct net_device *netdev) return NETDEV_TX_BUSY; } + md_bytes = nfp_net_prep_port_id(skb); + if (unlikely(md_bytes < 0)) { + nfp_net_tx_xmit_more_flush(tx_ring); + dev_kfree_skb_any(skb); + return NETDEV_TX_OK; + } + /* Start with the head skbuf */ dma_addr = dma_map_single(dp->dev, skb->data, skb_headlen(skb), DMA_TO_DEVICE); @@ -815,7 +841,7 @@ static int nfp_net_tx(struct sk_buff *skb, struct net_device *netdev) /* Build TX descriptor */ txd = _ring->txds[wr_idx]; - txd->offset_eop = (nr_frags == 0) ? PCIE_DESC_TX_EOP : 0; + txd->offset_eop = (nr_frags ? 0 : PCIE_DESC_TX_EOP) | md_bytes; txd->dma_len = cpu_to_le16(skb_headlen(skb)); nfp_desc_set_dma_addr(txd, dma_addr); txd->data_len = cpu_to_le16(skb->len); @@ -855,7 +881,7 @@ static int nfp_net_tx(struct sk_buff *skb, struct net_device *netdev) *txd = txdg; txd->dma_len = cpu_to_le16(fsize); nfp_desc_set_dma_addr(txd, dma_addr); - txd->offset_eop = + txd->offset_eop |= (f == nr_frags - 1) ? PCIE_DESC_TX_EOP : 0; } @@ -1450,6 +1476,10 @@ nfp_net_parse_meta(struct net_device *netdev, struct nfp_meta_parsed *meta, meta->mark = get_unaligned_be32(data); data += 4; break; + case NFP_NET_META_PORTID: + meta->portid = get_unaligned_be32(data); + data += 4; + break; case NFP_NET_META_CSUM: meta->csum_type = CHECKSUM_COMPLETE; meta->csum = @@ -1594,6 +1624,7 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget) struct nfp_net_rx_buf *rxbuf; struct nfp_net_rx_desc *rxd; struct nfp_meta_parsed meta; + struct net_device *netdev; dma_addr_t new_dma_addr; void *new_frag; @@ -1672,7 +1703,7 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget) } if (xdp_prog &&
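As a reading aid for the (truncated) patch above, here is a user-space sketch of the 8-byte metadata prepend that nfp_net_prep_port_id() performs and of the matching receive-side parse. Everything in it is illustrative: fake_skb models headroom with a plain offset, META_PORTID is a placeholder because the real NFP_NET_META_PORTID value is not shown in this mail, and the sketch simply fails when headroom is short where the kernel code calls skb_cow_head().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define META_PORTID	5u	/* placeholder field type, see lead-in */
#define META_LEN	8	/* 4 bytes of field type + 4 bytes of port id */

/* Hypothetical buffer model: bytes before `data` are free headroom. */
struct fake_skb {
	unsigned char buf[64];
	size_t data;	/* offset of the first payload byte */
	size_t len;	/* payload length */
};

static void put_be32(uint32_t v, unsigned char *p)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns the number of metadata bytes prepended (0 or 8), or -1 if
 * there is not enough headroom.
 */
int prep_port_id(struct fake_skb *skb, int has_port, uint32_t port_id)
{
	assert(skb != NULL);

	if (!has_port)
		return 0;
	if (skb->data < META_LEN)
		return -1;

	skb->data -= META_LEN;
	skb->len += META_LEN;
	put_be32(META_PORTID, skb->buf + skb->data);
	put_be32(port_id, skb->buf + skb->data + 4);

	return META_LEN;
}

/* Receive-side counterpart: returns 1 and fills *port_id when the
 * buffer starts with a port id field, 0 otherwise.
 */
int parse_port_id(const struct fake_skb *skb, uint32_t *port_id)
{
	const unsigned char *p = skb->buf + skb->data;

	if (skb->len < META_LEN || get_be32(p) != META_PORTID)
		return 0;
	*port_id = get_be32(p + 4);
	return 1;
}
```

The return value doubles as the metadata length, which is how the transmit path in the patch can fold it into the descriptor's offset field.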
[PATCH net-next v2 07/12] nfp: app callbacks for SRIOV
Add app-callbacks for app-specific initialisation of SRIOV.

Disabling SRIOV is brought forward in nfp_pci_remove() so that
nfp_app_sriov_disable is called while the app still exists.

This is intended to be used to implement representor netdevs for
virtual ports.

Signed-off-by: Simon Horman
Reviewed-by: Jakub Kicinski
---
 drivers/net/ethernet/netronome/nfp/nfp_app.h  | 18
 drivers/net/ethernet/netronome/nfp/nfp_main.c | 42 +++
 2 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_app.h b/drivers/net/ethernet/netronome/nfp/nfp_app.h
index af023a0491e7..ff2d43615808 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_app.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_app.h
@@ -75,6 +75,8 @@ extern const struct nfp_app_type app_bpf;
  * @tc_busy:	TC HW offload busy (rules loaded)
  * @xdp_offload:offload an XDP program
  * @eswitch_mode_get:get SR-IOV eswitch mode
+ * @sriov_enable: app-specific sriov initialisation
+ * @sriov_disable: app-specific sriov clean-up
  * @repr_get:	get representor netdev
  */
 struct nfp_app_type {
@@ -102,6 +104,9 @@ struct nfp_app_type {
 	int (*xdp_offload)(struct nfp_app *app, struct nfp_net *nn,
 			   struct bpf_prog *prog);
 
+	int (*sriov_enable)(struct nfp_app *app, int num_vfs);
+	void (*sriov_disable)(struct nfp_app *app);
+
 	enum devlink_eswitch_mode (*eswitch_mode_get)(struct nfp_app *app);
 	struct net_device *(*repr_get)(struct nfp_app *app, u32 id);
 };
@@ -237,6 +242,19 @@ static inline int nfp_app_eswitch_mode_get(struct nfp_app *app, u16 *mode)
 	return 0;
 }
 
+static inline int nfp_app_sriov_enable(struct nfp_app *app, int num_vfs)
+{
+	if (!app || !app->type->sriov_enable)
+		return -EOPNOTSUPP;
+	return app->type->sriov_enable(app, num_vfs);
+}
+
+static inline void nfp_app_sriov_disable(struct nfp_app *app)
+{
+	if (app && app->type->sriov_disable)
+		app->type->sriov_disable(app);
+}
+
 static inline struct net_device *nfp_app_repr_get(struct nfp_app *app, u32 id)
 {
 	if (unlikely(!app || !app->type->repr_get))
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.c b/drivers/net/ethernet/netronome/nfp/nfp_main.c
index 4e59dcb78c36..748e54cc885e 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_main.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_main.c
@@ -54,6 +54,7 @@
 
 #include "nfpcore/nfp6000_pcie.h"
 
+#include "nfp_app.h"
 #include "nfp_main.h"
 #include "nfp_net.h"
 
@@ -97,28 +98,45 @@ static int nfp_pcie_sriov_enable(struct pci_dev *pdev, int num_vfs)
 	struct nfp_pf *pf = pci_get_drvdata(pdev);
 	int err;
 
+	mutex_lock(&pf->lock);
+
 	if (num_vfs > pf->limit_vfs) {
 		nfp_info(pf->cpp, "Firmware limits number of VFs to %u\n",
 			 pf->limit_vfs);
-		return -EINVAL;
+		err = -EINVAL;
+		goto err_unlock;
+	}
+
+	err = nfp_app_sriov_enable(pf->app, num_vfs);
+	if (err) {
+		dev_warn(&pdev->dev, "App specific PCI sriov configuration failed: %d\n",
+			 err);
+		goto err_unlock;
 	}
 
 	err = pci_enable_sriov(pdev, num_vfs);
 	if (err) {
 		dev_warn(&pdev->dev, "Failed to enable PCI sriov: %d\n", err);
-		return err;
+		goto err_app_sriov_disable;
 	}
 
 	pf->num_vfs = num_vfs;
 
 	dev_dbg(&pdev->dev, "Created %d VFs.\n", pf->num_vfs);
 
+	mutex_unlock(&pf->lock);
 	return num_vfs;
+
+err_app_sriov_disable:
+	nfp_app_sriov_disable(pf->app);
+err_unlock:
+	mutex_unlock(&pf->lock);
+	return err;
 #endif
 	return 0;
 }
 
-static int nfp_pcie_sriov_disable(struct pci_dev *pdev)
+static int __nfp_pcie_sriov_disable(struct pci_dev *pdev)
 {
 #ifdef CONFIG_PCI_IOV
 	struct nfp_pf *pf = pci_get_drvdata(pdev);
@@ -132,6 +150,8 @@ static int nfp_pcie_sriov_disable(struct pci_dev *pdev)
 		return -EPERM;
 	}
 
+	nfp_app_sriov_disable(pf->app);
+
 	pf->num_vfs = 0;
 
 	pci_disable_sriov(pdev);
@@ -140,6 +160,18 @@ static int nfp_pcie_sriov_disable(struct pci_dev *pdev)
 	return 0;
 }
 
+static int nfp_pcie_sriov_disable(struct pci_dev *pdev)
+{
+	struct nfp_pf *pf = pci_get_drvdata(pdev);
+	int err;
+
+	mutex_lock(&pf->lock);
+	err = __nfp_pcie_sriov_disable(pdev);
+	mutex_unlock(&pf->lock);
+
+	return err;
+}
+
 static int nfp_pcie_sriov_configure(struct pci_dev *pdev, int num_vfs)
 {
 	if (num_vfs == 0)
@@ -431,11 +463,11 @@ static void nfp_pci_remove(struct pci_dev *pdev)
 	devlink = priv_to_devlink(pf);
 
-	nfp_net_pci_remove(pf);
-
 	nfp_pcie_sriov_disable(pdev);
 	pci_sriov_set_totalvfs(pf->pdev, 0);
 
+	nfp_net_pci_remove(pf);
+
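
The enable path in the patch above combines two patterns: an optional per-app callback that reports -EOPNOTSUPP when it is absent, and goto-based unwinding that undoes the app step if the later PCI step fails. The following is a minimal user-space sketch of that control flow only. The `struct app`, `demo_*` names and the `bus_err` parameter (a stand-in for the result of pci_enable_sriov()) are assumptions made for illustration, and the locking from the patch is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct app;

/* Optional per-app hooks; either pointer may be NULL. */
struct app_type {
	int (*sriov_enable)(struct app *app, int num_vfs);
	void (*sriov_disable)(struct app *app);
};

struct app {
	const struct app_type *type;
	int vf_reprs;	/* hypothetical state touched by the demo hooks */
};

/* Wrapper: a missing app or missing hook is reported as unsupported. */
static int app_sriov_enable(struct app *app, int num_vfs)
{
	if (!app || !app->type->sriov_enable)
		return -EOPNOTSUPP;
	return app->type->sriov_enable(app, num_vfs);
}

/* Wrapper: disabling is a no-op when there is nothing to call. */
static void app_sriov_disable(struct app *app)
{
	if (app && app->type->sriov_disable)
		app->type->sriov_disable(app);
}

/* Demo hooks that only record how many VF representors "exist". */
static int demo_enable(struct app *app, int num_vfs)
{
	assert(num_vfs >= 0);
	app->vf_reprs = num_vfs;
	return 0;
}

static void demo_disable(struct app *app)
{
	app->vf_reprs = 0;
}

static const struct app_type demo_type = {
	.sriov_enable = demo_enable,
	.sriov_disable = demo_disable,
};

/*
 * Same ordering as the patch: check the limit, run the app hook, then
 * the bus-level step. If the bus-level step fails, the app hook is
 * undone before the error is returned.
 */
static int demo_sriov_enable(struct app *app, int num_vfs, int limit_vfs,
			     int bus_err)
{
	int err;

	if (num_vfs > limit_vfs)
		return -EINVAL;

	err = app_sriov_enable(app, num_vfs);
	if (err)
		return err;

	err = bus_err;	/* stand-in for pci_enable_sriov() */
	if (err)
		goto err_app_sriov_disable;

	return num_vfs;

err_app_sriov_disable:
	app_sriov_disable(app);
	return err;
}
```

With `demo_type` installed, a failing bus step leaves `vf_reprs` at 0 again, which is the property the `err_app_sriov_disable` label in the patch provides.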
[PATCH net-next v2 00/12] nfp: add flower app with representors
Hi,

this series adds a flower app to the NFP driver.
It initialises four types of netdevs:

* PF netdev - lower-device for communication of packets to device
* PF representor netdev
* VF representor netdevs
* Phys port representor netdevs

The PF netdev acts as a lower-device which sends and receives packets to
and from the firmware. The representors act as upper-devices. For TX
representors attach a metadata dst to the skb which is used by the PF
netdev to prepend metadata to the packet before forwarding it to the
firmware. On RX the PF netdev looks up the representor based on the
prepended metadata received from the firmware and forwards the skb to the
representor after removing the metadata.

Control queues are used to send and receive control messages which are
used to communicate configuration information with the firmware. These
are in a separate vNIC to the queues belonging to the PF netdev. The
control queues are not exposed to user-space via a netdev or any other
means.

As the name implies this app is targeted at providing offload of TC
flower. That will be added by follow-up work. This patchset focuses on
adding phys port and VF representor netdevs to which flower classifiers
may be attached.

Changes since v1:
* Correct port_id endianness annotations
* Make nfp_repr_*_get_stats64() static
* Include for readq() on 32-bit systems

Jakub Kicinski (3):
  net: store port/representator id in metadata_dst
  nfp: devlink add support for getting eswitch mode
  nfp: move physical port init into a helper

Simon Horman (9):
  nfp: map mac_stats and vf_cfg BARs
  nfp: general representor implementation
  nfp: add stats and xmit helpers for representors
  nfp: app callbacks for SRIOV
  nfp: provide nfp_port to of nfp_net_get_mac_addr()
  nfp: add support for tx/rx with metadata portid
  nfp: add support for control messages for flower app
  nfp: add flower app
  nfp: add VF and PF representors to flower app

 drivers/net/ethernet/netronome/nfp/Makefile        |   3 +
 drivers/net/ethernet/netronome/nfp/flower/cmsg.c   | 159 +
 drivers/net/ethernet/netronome/nfp/flower/cmsg.h   | 116 +++
 drivers/net/ethernet/netronome/nfp/flower/main.c   | 376 +
 drivers/net/ethernet/netronome/nfp/nfp_app.c       |  26 +-
 drivers/net/ethernet/netronome/nfp/nfp_app.h       |  58 +++-
 drivers/net/ethernet/netronome/nfp/nfp_app_nic.c   |  25 +-
 drivers/net/ethernet/netronome/nfp/nfp_devlink.c   |  18 +
 drivers/net/ethernet/netronome/nfp/nfp_main.c      |  42 ++-
 drivers/net/ethernet/netronome/nfp/nfp_main.h      |  11 +-
 drivers/net/ethernet/netronome/nfp/nfp_net.h       |   1 +
 .../net/ethernet/netronome/nfp/nfp_net_common.c    |  57 +++-
 drivers/net/ethernet/netronome/nfp/nfp_net_main.c  | 141 +---
 drivers/net/ethernet/netronome/nfp/nfp_net_repr.c  | 353 +++
 drivers/net/ethernet/netronome/nfp/nfp_net_repr.h  | 120 +++
 drivers/net/ethernet/netronome/nfp/nfp_port.c      |  25 ++
 drivers/net/ethernet/netronome/nfp/nfp_port.h      |  63
 .../net/ethernet/netronome/nfp/nfpcore/nfp_nsp.h   |   2 +
 .../ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c   |   5 +-
 include/net/dst_metadata.h                         |  41 ++-
 net/core/dst.c                                     |  15 +-
 net/core/filter.c                                  |   1 +
 net/ipv4/ip_tunnel_core.c                          |   6 +-
 net/openvswitch/flow_netlink.c                     |   4 +-
 24 files changed, 1575 insertions(+), 93 deletions(-)
 create mode 100644 drivers/net/ethernet/netronome/nfp/flower/cmsg.c
 create mode 100644 drivers/net/ethernet/netronome/nfp/flower/cmsg.h
 create mode 100644 drivers/net/ethernet/netronome/nfp/flower/main.c
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net_repr.c
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net_repr.h

-- 
2.1.4
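
The TX and RX handling described in this cover letter (prepend port metadata on transmit, read and strip it on receive) can be pictured with a small user-space sketch. It is not the driver code. The flat byte buffer, the helper names `meta_push_port_id` and `meta_pull_port_id`, and the `META_TYPE_PORTID` value are assumptions made for illustration; the series itself works on skbs and writes a metadata-type word followed by a big-endian port id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for this sketch; not taken from the driver headers. */
#define META_TYPE_PORTID 5u
#define META_LEN 8u

/* Store a 32-bit value in big-endian byte order. */
static void put_be32(uint32_t v, uint8_t *p)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Load a 32-bit value stored in big-endian byte order. */
static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * TX side: the packet starts at buf + *data_off. Move the start back by
 * 8 bytes and write { type, port_id } there. Returns the number of
 * metadata bytes added, or -1 if there is not enough headroom.
 */
static int meta_push_port_id(uint8_t *buf, size_t *data_off, uint32_t port_id)
{
	assert(buf && data_off);

	if (*data_off < META_LEN)
		return -1;

	*data_off -= META_LEN;
	put_be32(META_TYPE_PORTID, buf + *data_off);
	put_be32(port_id, buf + *data_off + 4);
	return (int)META_LEN;
}

/*
 * RX side: read { type, port_id } from the front of the packet and
 * advance the start past it. Returns 0 on success, -1 if the packet is
 * too short or does not carry port-id metadata.
 */
static int meta_pull_port_id(const uint8_t *buf, size_t *data_off,
			     size_t data_end, uint32_t *port_id)
{
	assert(buf && data_off && port_id);

	if (*data_off > data_end || data_end - *data_off < META_LEN)
		return -1;
	if (get_be32(buf + *data_off) != META_TYPE_PORTID)
		return -1;

	*port_id = get_be32(buf + *data_off + 4);
	*data_off += META_LEN;
	return 0;
}
```

In this sketch the lower device would call `meta_push_port_id()` before handing the buffer to hardware, and use the id returned by `meta_pull_port_id()` to pick the representor that should receive the packet.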
[PATCH net-next v2 08/12] nfp: provide nfp_port to of nfp_net_get_mac_addr()
Provide port rather than vNIC as parameter of nfp_net_get_mac_addr.

This is to allow this function to be used by representor netdevs
where a vNIC may have more than one physical port none of which are
associated with the vNIC.

Signed-off-by: Simon Horman
Reviewed-by: Jakub Kicinski
---
 drivers/net/ethernet/netronome/nfp/nfp_app_nic.c  |  2 +-
 drivers/net/ethernet/netronome/nfp/nfp_main.h     |  3 ++-
 drivers/net/ethernet/netronome/nfp/nfp_net_main.c | 25 +++
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_app_nic.c b/drivers/net/ethernet/netronome/nfp/nfp_app_nic.c
index 7b966bd3d214..c11a6c34e217 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_app_nic.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_app_nic.c
@@ -69,7 +69,7 @@ int nfp_app_nic_vnic_init(struct nfp_app *app, struct nfp_net *nn,
 	if (err)
 		return err < 0 ? err : 0;
 
-	nfp_net_get_mac_addr(app->pf, nn, id);
+	nfp_net_get_mac_addr(app->pf, nn->port, id);
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.h b/drivers/net/ethernet/netronome/nfp/nfp_main.h
index aa69d4101eb9..edc14dc78674 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_main.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_main.h
@@ -58,6 +58,7 @@ struct nfp_hwinfo;
 struct nfp_mip;
 struct nfp_net;
 struct nfp_nsp_identify;
+struct nfp_port;
 struct nfp_rtsym_table;
 
 /**
@@ -147,7 +148,7 @@ void nfp_hwmon_unregister(struct nfp_pf *pf);
 struct nfp_eth_table_port *
 nfp_net_find_port(struct nfp_eth_table *eth_tbl, unsigned int id);
 void
-nfp_net_get_mac_addr(struct nfp_pf *pf, struct nfp_net *nn, unsigned int id);
+nfp_net_get_mac_addr(struct nfp_pf *pf, struct nfp_port *port, unsigned int id);
 
 bool nfp_ctrl_tx(struct nfp_net *nn, struct sk_buff *skb);
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
index eb87e1c08bb1..e16a5fa92279 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
@@ -135,25 +135,24 @@ static u8 __iomem *nfp_net_map_area(struct nfp_cpp *cpp,
 /**
  * nfp_net_get_mac_addr() - Get the MAC address.
  * @pf:       NFP PF handle
- * @nn:       NFP Network structure
+ * @port:     NFP port structure
  * @id:       NFP port id
  *
  * First try to get the MAC address from NSP ETH table. If that
  * fails try HWInfo.  As a last resort generate a random address.
  */
 void
-nfp_net_get_mac_addr(struct nfp_pf *pf, struct nfp_net *nn, unsigned int id)
+nfp_net_get_mac_addr(struct nfp_pf *pf, struct nfp_port *port, unsigned int id)
 {
 	struct nfp_eth_table_port *eth_port;
-	struct nfp_net_dp *dp = &nn->dp;
 	u8 mac_addr[ETH_ALEN];
 	const char *mac_str;
 	char name[32];
 
-	eth_port = __nfp_port_get_eth_port(nn->port);
+	eth_port = __nfp_port_get_eth_port(port);
 	if (eth_port) {
-		ether_addr_copy(dp->netdev->dev_addr, eth_port->mac_addr);
-		ether_addr_copy(dp->netdev->perm_addr, eth_port->mac_addr);
+		ether_addr_copy(port->netdev->dev_addr, eth_port->mac_addr);
+		ether_addr_copy(port->netdev->perm_addr, eth_port->mac_addr);
 		return;
 	}
 
@@ -161,22 +160,22 @@ nfp_net_get_mac_addr(struct nfp_pf *pf, struct nfp_net *nn, unsigned int id)
 
 	mac_str = nfp_hwinfo_lookup(pf->hwinfo, name);
 	if (!mac_str) {
-		dev_warn(dp->dev, "Can't lookup MAC address. Generate\n");
-		eth_hw_addr_random(dp->netdev);
+		nfp_warn(pf->cpp, "Can't lookup MAC address. Generate\n");
+		eth_hw_addr_random(port->netdev);
 		return;
 	}
 
 	if (sscanf(mac_str, "%02hhx:%02hhx:%02hhx:%02hhx:%02hhx:%02hhx",
 		   &mac_addr[0], &mac_addr[1], &mac_addr[2],
 		   &mac_addr[3], &mac_addr[4], &mac_addr[5]) != 6) {
-		dev_warn(dp->dev,
-			 "Can't parse MAC address (%s). Generate.\n", mac_str);
-		eth_hw_addr_random(dp->netdev);
+		nfp_warn(pf->cpp, "Can't parse MAC address (%s). Generate.\n",
+			 mac_str);
+		eth_hw_addr_random(port->netdev);
 		return;
 	}
 
-	ether_addr_copy(dp->netdev->dev_addr, mac_addr);
-	ether_addr_copy(dp->netdev->perm_addr, mac_addr);
+	ether_addr_copy(port->netdev->dev_addr, mac_addr);
+	ether_addr_copy(port->netdev->perm_addr, mac_addr);
 }
 
 struct nfp_eth_table_port *
-- 
2.1.4
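
The lookup order documented for nfp_net_get_mac_addr() (ETH table first, then a HWInfo string, then a generated address) can be tried out in user space. The sketch below is an illustration under stated assumptions: `pick_mac`, `parse_mac` and the `mac_source` enum are hypothetical names, the ETH table entry is modelled as an optional byte array, and the last-resort address is a fixed locally administered placeholder rather than the random address that eth_hw_addr_random() would produce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ETH_ALEN 6

enum mac_source {
	MAC_FROM_ETH_TABLE,
	MAC_FROM_HWINFO,
	MAC_GENERATED,
};

/* Parse "aa:bb:cc:dd:ee:ff"; mac is only written on success. */
static bool parse_mac(const char *str, uint8_t mac[ETH_ALEN])
{
	uint8_t tmp[ETH_ALEN];

	if (!str)
		return false;

	/* Same conversion string as the patch: six two-digit hex bytes. */
	if (sscanf(str, "%02hhx:%02hhx:%02hhx:%02hhx:%02hhx:%02hhx",
		   &tmp[0], &tmp[1], &tmp[2],
		   &tmp[3], &tmp[4], &tmp[5]) != 6)
		return false;

	memcpy(mac, tmp, ETH_ALEN);
	return true;
}

/*
 * Try the sources in priority order and report which one was used.
 * eth_table_mac and hwinfo_str may each be NULL when unavailable.
 */
static enum mac_source pick_mac(const uint8_t *eth_table_mac,
				const char *hwinfo_str,
				uint8_t out[ETH_ALEN])
{
	assert(out);

	if (eth_table_mac) {
		memcpy(out, eth_table_mac, ETH_ALEN);
		return MAC_FROM_ETH_TABLE;
	}

	if (parse_mac(hwinfo_str, out))
		return MAC_FROM_HWINFO;

	/* Placeholder fallback: locally administered unicast address. */
	memset(out, 0, ETH_ALEN);
	out[0] = 0x02;
	return MAC_GENERATED;
}
```

Returning the source alongside the address is a choice made for the sketch so that callers can log which fallback was taken; the patch itself emits a warning at the point where the fallback happens.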
[PATCH net-next v2 10/12] nfp: add support for control messages for flower app
In preparation for adding a new flower app - targeted at offloading the flower classifier - provide support for control message that it will use to communicate with the NFP. Based in part on work by Bert van Leeuwen. Signed-off-by: Simon HormanReviewed-by: Jakub Kicinski --- drivers/net/ethernet/netronome/nfp/Makefile | 1 + drivers/net/ethernet/netronome/nfp/flower/cmsg.c | 159 +++ drivers/net/ethernet/netronome/nfp/flower/cmsg.h | 116 + drivers/net/ethernet/netronome/nfp/nfp_app.c | 5 +- drivers/net/ethernet/netronome/nfp/nfp_app.h | 3 +- 5 files changed, 281 insertions(+), 3 deletions(-) create mode 100644 drivers/net/ethernet/netronome/nfp/flower/cmsg.c create mode 100644 drivers/net/ethernet/netronome/nfp/flower/cmsg.h diff --git a/drivers/net/ethernet/netronome/nfp/Makefile b/drivers/net/ethernet/netronome/nfp/Makefile index a401113035f5..e14f62863add 100644 --- a/drivers/net/ethernet/netronome/nfp/Makefile +++ b/drivers/net/ethernet/netronome/nfp/Makefile @@ -27,6 +27,7 @@ nfp-objs := \ nfp_port.o \ bpf/main.o \ bpf/offload.o \ + flower/cmsg.o \ nic/main.o ifeq ($(CONFIG_BPF_SYSCALL),y) diff --git a/drivers/net/ethernet/netronome/nfp/flower/cmsg.c b/drivers/net/ethernet/netronome/nfp/flower/cmsg.c new file mode 100644 index ..326f17eeaccf --- /dev/null +++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.c @@ -0,0 +1,159 @@ +/* + * Copyright (C) 2015-2017 Netronome Systems, Inc. + * + * This software is dual licensed under the GNU General License Version 2, + * June 1991 as shown in the file COPYING in the top-level directory of this + * source tree or the BSD 2-Clause License provided below. You have the + * option to license this software under the complete terms of either license. + * + * The BSD 2-Clause License: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * 1. 
Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * 2. Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ + +#include +#include +#include +#include + +#include "../nfpcore/nfp_cpp.h" +#include "../nfp_net_repr.h" +#include "./cmsg.h" + +#define nfp_flower_cmsg_warn(app, fmt, args...) 
\ + do {\ + if (net_ratelimit())\ + nfp_warn((app)->cpp, fmt, ## args); \ + } while (0) + +static struct nfp_flower_cmsg_hdr * +nfp_flower_cmsg_get_hdr(struct sk_buff *skb) +{ + return (struct nfp_flower_cmsg_hdr *)skb->data; +} + +static void *nfp_flower_cmsg_get_data(struct sk_buff *skb) +{ + return (unsigned char *)skb->data + NFP_FLOWER_CMSG_HLEN; +} + +static struct sk_buff * +nfp_flower_cmsg_alloc(struct nfp_app *app, unsigned int size, + enum nfp_flower_cmsg_type_port type) +{ + struct nfp_flower_cmsg_hdr *ch; + struct sk_buff *skb; + + size += NFP_FLOWER_CMSG_HLEN; + + skb = nfp_app_ctrl_msg_alloc(app, size, GFP_KERNEL); + if (!skb) + return NULL; + + ch = nfp_flower_cmsg_get_hdr(skb); + ch->pad = 0; + ch->version = NFP_FLOWER_CMSG_VER1; + ch->type = type; + skb_put(skb, size); + + return skb; +} + +int nfp_flower_cmsg_portmod(struct net_device *netdev) +{ + struct nfp_repr *repr = netdev_priv(netdev); + struct nfp_flower_cmsg_portmod *msg; + struct sk_buff *skb; + + skb = nfp_flower_cmsg_alloc(repr->app, sizeof(*msg), + NFP_FLOWER_CMSG_TYPE_PORT_MOD); + if (!skb) + return -ENOMEM; + + msg = nfp_flower_cmsg_get_data(skb); + msg->portnum = cpu_to_be32(repr->dst->u.port_info.port_id); + msg->reserved = 0; + msg->info = netif_carrier_ok(netdev); + msg->mtu = cpu_to_be16(netdev->mtu); + + nfp_ctrl_tx(repr->app->ctrl, skb); + +
[PATCH net-next v2 12/12] nfp: add VF and PF representors to flower app
Initialise VF and PF representors in flower app. Based in part on work by Benjamin LaHaise, Bert van Leeuwen and Jakub Kicinski. Signed-off-by: Simon HormanReviewed-by: Jakub Kicinski --- drivers/net/ethernet/netronome/nfp/flower/main.c | 86 +++- 1 file changed, 84 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c index d1d905727c54..582b7be3e219 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/main.c +++ b/drivers/net/ethernet/netronome/nfp/flower/main.c @@ -149,15 +149,81 @@ static const struct net_device_ops nfp_flower_repr_netdev_ops = { .ndo_get_offload_stats = nfp_repr_get_offload_stats, }; +static void nfp_flower_sriov_disable(struct nfp_app *app) +{ + nfp_reprs_clean_and_free_by_type(app, NFP_REPR_TYPE_VF); +} + +static int +nfp_flower_spawn_vnic_reprs(struct nfp_app *app, + enum nfp_flower_cmsg_port_vnic_type vnic_type, + enum nfp_repr_type repr_type, unsigned int cnt) +{ + u8 nfp_pcie = nfp_cppcore_pcie_unit(app->pf->cpp); + struct nfp_flower_priv *priv = app->priv; + struct nfp_reprs *reprs, *old_reprs; + const u8 queue = 0; + int i, err; + + reprs = nfp_reprs_alloc(cnt); + if (!reprs) + return -ENOMEM; + + for (i = 0; i < cnt; i++) { + u32 port_id; + + reprs->reprs[i] = nfp_repr_alloc(app); + if (!reprs->reprs[i]) { + err = -ENOMEM; + goto err_reprs_clean; + } + + SET_NETDEV_DEV(reprs->reprs[i], >nn->pdev->dev); + eth_hw_addr_inherit(reprs->reprs[i], priv->nn->dp.netdev); + + port_id = nfp_flower_cmsg_pcie_port(nfp_pcie, vnic_type, + i, queue); + err = nfp_repr_init(app, reprs->reprs[i], + _flower_repr_netdev_ops, + port_id, NULL, priv->nn->dp.netdev); + if (err) + goto err_reprs_clean; + + nfp_info(app->cpp, "%s%d Representor(%s) created\n", +repr_type == NFP_REPR_TYPE_PF ? 
"PF" : "VF", i, +reprs->reprs[i]->name); + } + + old_reprs = nfp_app_reprs_set(app, repr_type, reprs); + if (IS_ERR(old_reprs)) { + err = PTR_ERR(old_reprs); + goto err_reprs_clean; + } + + return 0; +err_reprs_clean: + nfp_reprs_clean_and_free(reprs); + return err; +} + +static int nfp_flower_sriov_enable(struct nfp_app *app, int num_vfs) +{ + return nfp_flower_spawn_vnic_reprs(app, + NFP_FLOWER_CMSG_PORT_VNIC_TYPE_VF, + NFP_REPR_TYPE_VF, num_vfs); +} + static void nfp_flower_stop(struct nfp_app *app) { + nfp_reprs_clean_and_free_by_type(app, NFP_REPR_TYPE_PF); nfp_reprs_clean_and_free_by_type(app, NFP_REPR_TYPE_PHYS_PORT); + } -static int nfp_flower_start(struct nfp_app *app) +static int +nfp_flower_spawn_phy_reprs(struct nfp_app *app, struct nfp_flower_priv *priv) { struct nfp_eth_table *eth_tbl = app->pf->eth_tbl; - struct nfp_flower_priv *priv = app->priv; struct nfp_reprs *reprs, *old_reprs; unsigned int i; int err; @@ -218,6 +284,19 @@ static int nfp_flower_start(struct nfp_app *app) return err; } +static int nfp_flower_start(struct nfp_app *app) +{ + int err; + + err = nfp_flower_spawn_phy_reprs(app, app->priv); + if (err) + return err; + + return nfp_flower_spawn_vnic_reprs(app, + NFP_FLOWER_CMSG_PORT_VNIC_TYPE_PF, + NFP_REPR_TYPE_PF, 1); +} + static void nfp_flower_vnic_clean(struct nfp_app *app, struct nfp_net *nn) { kfree(app->priv); @@ -289,6 +368,9 @@ const struct nfp_app_type app_flower = { .ctrl_msg_rx= nfp_flower_cmsg_rx, + .sriov_enable = nfp_flower_sriov_enable, + .sriov_disable = nfp_flower_sriov_disable, + .eswitch_mode_get = eswitch_mode_get, .repr_get = nfp_flower_repr_get, }; -- 2.1.4
[PATCH net-next v2 11/12] nfp: add flower app
Add app for flower offload. At this point the PF netdev and phys port representor netdevs are initialised. Follow-up work will add support for VF and PF representors and beyond that offloading the flower classifier. Based in part on work by Benjamin LaHaise and Bert van Leeuwen. Signed-off-by: Simon HormanReviewed-by: Jakub Kicinski --- drivers/net/ethernet/netronome/nfp/Makefile | 1 + drivers/net/ethernet/netronome/nfp/flower/main.c | 294 +++ drivers/net/ethernet/netronome/nfp/nfp_app.c | 1 + drivers/net/ethernet/netronome/nfp/nfp_app.h | 4 + 4 files changed, 300 insertions(+) create mode 100644 drivers/net/ethernet/netronome/nfp/flower/main.c diff --git a/drivers/net/ethernet/netronome/nfp/Makefile b/drivers/net/ethernet/netronome/nfp/Makefile index e14f62863add..10b556b2c59d 100644 --- a/drivers/net/ethernet/netronome/nfp/Makefile +++ b/drivers/net/ethernet/netronome/nfp/Makefile @@ -28,6 +28,7 @@ nfp-objs := \ bpf/main.o \ bpf/offload.o \ flower/cmsg.o \ + flower/main.o \ nic/main.o ifeq ($(CONFIG_BPF_SYSCALL),y) diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c new file mode 100644 index ..d1d905727c54 --- /dev/null +++ b/drivers/net/ethernet/netronome/nfp/flower/main.c @@ -0,0 +1,294 @@ +/* + * Copyright (C) 2017 Netronome Systems, Inc. + * + * This software is dual licensed under the GNU General License Version 2, + * June 1991 as shown in the file COPYING in the top-level directory of this + * source tree or the BSD 2-Clause License provided below. You have the + * option to license this software under the complete terms of either license. + * + * The BSD 2-Clause License: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * 1. Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * 2. 
Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ + +#include +#include +#include +#include +#include + +#include "../nfpcore/nfp_cpp.h" +#include "../nfpcore/nfp_nsp.h" +#include "../nfp_app.h" +#include "../nfp_main.h" +#include "../nfp_net.h" +#include "../nfp_net_repr.h" +#include "../nfp_port.h" +#include "./cmsg.h" + +/** + * struct nfp_flower_priv - Flower APP per-vNIC priv data + * @nn: Pointer to vNIC + */ +struct nfp_flower_priv { + struct nfp_net *nn; +}; + +static const char *nfp_flower_extra_cap(struct nfp_app *app, struct nfp_net *nn) +{ + return "FLOWER"; +} + +static enum devlink_eswitch_mode eswitch_mode_get(struct nfp_app *app) +{ + return DEVLINK_ESWITCH_MODE_SWITCHDEV; +} + +static enum nfp_repr_type +nfp_flower_repr_get_type_and_port(struct nfp_app *app, u32 port_id, u8 *port) +{ + switch (FIELD_GET(NFP_FLOWER_CMSG_PORT_TYPE, port_id)) { + case NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT: + *port = FIELD_GET(NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM, + port_id); + return NFP_REPR_TYPE_PHYS_PORT; + + case NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT: + *port = FIELD_GET(NFP_FLOWER_CMSG_PORT_VNIC, port_id); + if (FIELD_GET(NFP_FLOWER_CMSG_PORT_VNIC_TYPE, port_id) == + NFP_FLOWER_CMSG_PORT_VNIC_TYPE_PF) + return NFP_REPR_TYPE_PF; + else + return NFP_REPR_TYPE_VF; + } + + return NFP_FLOWER_CMSG_PORT_TYPE_UNSPEC; +} + 
+static struct net_device * +nfp_flower_repr_get(struct nfp_app *app, u32 port_id) +{ + enum nfp_repr_type repr_type; + struct nfp_reprs *reprs; + u8 port = 0; + + repr_type = nfp_flower_repr_get_type_and_port(app, port_id, ); + + reprs = rcu_dereference(app->reprs[repr_type]); + if (!reprs) + return NULL; + + if (port >= reprs->num_reprs) + return NULL; + + return reprs->reprs[port]; +} + +static void +nfp_flower_repr_netdev_get_stats64(struct
[PATCH net-next v2 06/12] nfp: add stats and xmit helpers for representors
Provide helpers for stats and xmit on representor netdevs. Parts based on work by Bert van Leeuwen, Benjamin LaHaise and Jakub Kicinski. Signed-off-by: Simon HormanReviewed-by: Jakub Kicinski --- drivers/net/ethernet/netronome/nfp/nfp_net_repr.c | 199 +- drivers/net/ethernet/netronome/nfp/nfp_net_repr.h | 28 +++ 2 files changed, 226 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c b/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c index 8e02f843ae92..44adcc5df11e 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c @@ -32,15 +32,198 @@ */ #include +#include #include #include #include "nfpcore/nfp_cpp.h" #include "nfp_app.h" #include "nfp_main.h" +#include "nfp_net_ctrl.h" #include "nfp_net_repr.h" #include "nfp_port.h" +static void +nfp_repr_inc_tx_stats(struct net_device *netdev, unsigned int len, + int tx_status) +{ + struct nfp_repr *repr = netdev_priv(netdev); + struct nfp_repr_pcpu_stats *stats; + + if (unlikely(tx_status != NET_XMIT_SUCCESS && +tx_status != NET_XMIT_CN)) { + this_cpu_inc(repr->stats->tx_drops); + return; + } + + stats = this_cpu_ptr(repr->stats); + u64_stats_update_begin(>syncp); + stats->tx_packets++; + stats->tx_bytes += len; + u64_stats_update_end(>syncp); +} + +void nfp_repr_inc_rx_stats(struct net_device *netdev, unsigned int len) +{ + struct nfp_repr *repr = netdev_priv(netdev); + struct nfp_repr_pcpu_stats *stats; + + stats = this_cpu_ptr(repr->stats); + u64_stats_update_begin(>syncp); + stats->rx_packets++; + stats->rx_bytes += len; + u64_stats_update_end(>syncp); +} + +static void +nfp_repr_phy_port_get_stats64(const struct nfp_app *app, u8 phy_port, + struct rtnl_link_stats64 *stats) +{ + u8 __iomem *mem; + + mem = app->pf->mac_stats_mem + phy_port * NFP_MAC_STATS_SIZE; + + /* TX and RX stats are flipped as we are returning the stats as seen +* at the switch port corresponding to the phys port. 
+*/ + stats->tx_packets = readq(mem + NFP_MAC_STATS_RX_FRAMES_RECEIVED_OK); + stats->tx_bytes = readq(mem + NFP_MAC_STATS_RX_IN_OCTETS); + stats->tx_dropped = readq(mem + NFP_MAC_STATS_RX_IN_ERRORS); + + stats->rx_packets = readq(mem + NFP_MAC_STATS_TX_FRAMES_TRANSMITTED_OK); + stats->rx_bytes = readq(mem + NFP_MAC_STATS_TX_OUT_OCTETS); + stats->rx_dropped = readq(mem + NFP_MAC_STATS_TX_OUT_ERRORS); +} + +static void +nfp_repr_vf_get_stats64(const struct nfp_app *app, u8 vf, + struct rtnl_link_stats64 *stats) +{ + u8 __iomem *mem; + + mem = app->pf->vf_cfg_mem + vf * NFP_NET_CFG_BAR_SZ; + + /* TX and RX stats are flipped as we are returning the stats as seen +* at the switch port corresponding to the VF. +*/ + stats->tx_packets = readq(mem + NFP_NET_CFG_STATS_RX_FRAMES); + stats->tx_bytes = readq(mem + NFP_NET_CFG_STATS_RX_OCTETS); + stats->tx_dropped = readq(mem + NFP_NET_CFG_STATS_RX_DISCARDS); + + stats->rx_packets = readq(mem + NFP_NET_CFG_STATS_TX_FRAMES); + stats->rx_bytes = readq(mem + NFP_NET_CFG_STATS_TX_OCTETS); + stats->rx_dropped = readq(mem + NFP_NET_CFG_STATS_TX_DISCARDS); +} + +static void +nfp_repr_pf_get_stats64(const struct nfp_app *app, u8 pf, + struct rtnl_link_stats64 *stats) +{ + u8 __iomem *mem; + + if (pf) + return; + + mem = nfp_cpp_area_iomem(app->pf->data_vnic_bar); + + stats->tx_packets = readq(mem + NFP_NET_CFG_STATS_RX_FRAMES); + stats->tx_bytes = readq(mem + NFP_NET_CFG_STATS_RX_OCTETS); + stats->tx_dropped = readq(mem + NFP_NET_CFG_STATS_RX_DISCARDS); + + stats->rx_packets = readq(mem + NFP_NET_CFG_STATS_TX_FRAMES); + stats->rx_bytes = readq(mem + NFP_NET_CFG_STATS_TX_OCTETS); + stats->rx_dropped = readq(mem + NFP_NET_CFG_STATS_TX_DISCARDS); +} + +void +nfp_repr_get_stats64(const struct nfp_app *app, enum nfp_repr_type type, +u8 port, struct rtnl_link_stats64 *stats) +{ + switch (type) { + case NFP_REPR_TYPE_PHYS_PORT: + nfp_repr_phy_port_get_stats64(app, port, stats); + break; + case NFP_REPR_TYPE_PF: + nfp_repr_pf_get_stats64(app, 
port, stats); + break; + case NFP_REPR_TYPE_VF: + nfp_repr_vf_get_stats64(app, port, stats); + default: + break; + } +} + +bool +nfp_repr_has_offload_stats(const struct net_device *dev, int attr_id) +{ + switch (attr_id) { + case IFLA_OFFLOAD_XSTATS_CPU_HIT: + return true; + } + + return false; +}
[PATCH net-next v2 01/12] net: store port/representator id in metadata_dst
From: Jakub Kicinski

Switches and modern SR-IOV enabled NICs may multiplex traffic from Port
representators and control messages over a single set of hardware queues.
Control messages and muxed traffic may need ordered delivery.

Those requirements make it hard to comfortably use TC infrastructure today
unless we have a way of attaching metadata to skbs at the upper device.
Because a single set of queues is used for many netdevs, stopping TC/sched
queues of all of them reliably is impossible and the lower device has to
retreat to returning NETDEV_TX_BUSY and usually has to take extra locks on
the fastpath.

This patch attempts to enable port/representative devs to attach metadata
to skbs which carry port id.  This way representatives can be queueless and
all queuing can be performed at the lower netdev in the usual way.

Traffic arriving on the port/representative interfaces will have metadata
attached and will subsequently be queued to the lower device for
transmission.  The lower device should recognize the metadata and translate
it to HW specific format which is most likely either a special header
inserted before the network headers or descriptor/metadata fields.

Metadata is associated with the lower device by storing the netdev pointer
along with port id so that if TC decides to redirect or mirror the new
netdev will not try to interpret it.

This is mostly for SR-IOV devices since switches don't have lower netdevs
today.

Signed-off-by: Jakub Kicinski Signed-off-by: Sridhar Samudrala Signed-off-by: Simon Horman --- include/net/dst_metadata.h | 41 - net/core/dst.c | 15 ++- net/core/filter.c | 1 + net/ipv4/ip_tunnel_core.c | 6 -- net/openvswitch/flow_netlink.c | 4 +++- 5 files changed, 50 insertions(+), 17 deletions(-) diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h index 701fc814d0af..a803129a4849 100644 --- a/include/net/dst_metadata.h +++ b/include/net/dst_metadata.h @@ -5,10 +5,22 @@ #include #include +enum metadata_type { + METADATA_IP_TUNNEL, + METADATA_HW_PORT_MUX, +}; + +struct hw_port_info { + struct net_device *lower_dev; + u32 port_id; +}; + struct metadata_dst { struct dst_entrydst; + enum metadata_type type; union { struct ip_tunnel_info tun_info; + struct hw_port_info port_info; } u; }; @@ -27,7 +39,7 @@ static inline struct ip_tunnel_info *skb_tunnel_info(struct sk_buff *skb) struct metadata_dst *md_dst = skb_metadata_dst(skb); struct dst_entry *dst; - if (md_dst) + if (md_dst && md_dst->type == METADATA_IP_TUNNEL) return _dst->u.tun_info; dst = skb_dst(skb); @@ -55,22 +67,33 @@ static inline int skb_metadata_dst_cmp(const struct sk_buff *skb_a, a = (const struct metadata_dst *) skb_dst(skb_a); b = (const struct metadata_dst *) skb_dst(skb_b); - if (!a != !b || a->u.tun_info.options_len != b->u.tun_info.options_len) + if (!a != !b || a->type != b->type) return 1; - return memcmp(>u.tun_info, >u.tun_info, - sizeof(a->u.tun_info) + a->u.tun_info.options_len); + switch (a->type) { + case METADATA_HW_PORT_MUX: + return memcmp(>u.port_info, >u.port_info, + sizeof(a->u.port_info)); + case METADATA_IP_TUNNEL: + return memcmp(>u.tun_info, >u.tun_info, + sizeof(a->u.tun_info) + +a->u.tun_info.options_len); + default: + return 1; + } } void metadata_dst_free(struct metadata_dst *); -struct metadata_dst *metadata_dst_alloc(u8 optslen, gfp_t flags); -struct metadata_dst __percpu *metadata_dst_alloc_percpu(u8 optslen, gfp_t flags); +struct metadata_dst 
*metadata_dst_alloc(u8 optslen, enum metadata_type type, + gfp_t flags); +struct metadata_dst __percpu * +metadata_dst_alloc_percpu(u8 optslen, enum metadata_type type, gfp_t flags); static inline struct metadata_dst *tun_rx_dst(int md_size) { struct metadata_dst *tun_dst; - tun_dst = metadata_dst_alloc(md_size, GFP_ATOMIC); + tun_dst = metadata_dst_alloc(md_size, METADATA_IP_TUNNEL, GFP_ATOMIC); if (!tun_dst) return NULL; @@ -85,11 +108,11 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb) int md_size; struct metadata_dst *new_md; - if (!md_dst) + if (!md_dst || md_dst->type != METADATA_IP_TUNNEL) return ERR_PTR(-EINVAL); md_size = md_dst->u.tun_info.options_len; - new_md = metadata_dst_alloc(md_size, GFP_ATOMIC); +
Re: [RFC 1/2] net-next: fix DSA flow_disection
> On Tue, Jun 20, 2017 at 07:37:35PM +0200, John Crispin wrote: > > > On 20/06/17 16:01, Andrew Lunn wrote: > >On Tue, Jun 20, 2017 at 10:06:54AM +0200, John Crispin wrote: > >>RPS and probably other kernel features are currently broken on some if not > >>all DSA devices. The root cause of this that skb_hash will call the > >>flow_disector. > >Hi John > > > >What is the call path when the flow_disector is called? I'm wondering > >if we can defer this, and call it later, after the tag code has > >removed the header. > > > > Andrew Hi John I follow your logic of doing the hash early Is there any value in including the DSA header in the hash? That might allow frames from different ingress ports to be spread over CPUs? Andrew
Re: [PATCH] dt-bindings: net: sms911x: Add missing optional VDD regulators
From: Krzysztof Kozlowski Date: Mon, 19 Jun 2017 18:05:41 +0200 > The lan911x family of devices require supplying from 3.3 V power > supplies (connected to VDD_IO, VDD_A and VREG_3.3 pins). The existing > driver however obtains only VDD_IO and VDD_A regulators in an optional > way so document this in bindings. > > Signed-off-by: Krzysztof Kozlowski Applied, thanks.
Re: [PATCH net-next v2] enic: Fix format truncation warning
From: Govindarajulu VaradarajanDate: Mon, 19 Jun 2017 16:28:44 -0700 > With -Wformat-truncation, gcc throws the following warning. > > Fix this by increasing the size of devname to accommodate 15 character > netdev interface name and description. > > Remove length format precision for %s. We can fit entire name. > > Also increment the version. > > drivers/net/ethernet/cisco/enic/enic_main.c: In function ‘enic_open’: > drivers/net/ethernet/cisco/enic/enic_main.c:1740:15: warning: ‘%u’ directive > output may be truncated writing between 1 and 2 bytes into a region of size > between 1 and 12 [-Wformat-truncation=] > "%.11s-rx-%u", netdev->name, i); >^~ > drivers/net/ethernet/cisco/enic/enic_main.c:1740:5: note: directive argument > in the range [0, 16] > "%.11s-rx-%u", netdev->name, i); > ^ > drivers/net/ethernet/cisco/enic/enic_main.c:1738:4: note: ‘snprintf’ output > between 6 and 18 bytes into a destination of size 16 > snprintf(enic->msix[intr].devname, > ^~ > sizeof(enic->msix[intr].devname), > ~ > "%.11s-rx-%u", netdev->name, i); > ~~~ > > Signed-off-by: Govindarajulu Varadarajan > --- > v2: dont use kasprintf, increase the devname size > http://patchwork.ozlabs.org/patch/777568/ Applied, thank you.
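The warning quoted above is about the worst-case length of the formatted name exceeding the destination buffer. As a minimal userspace sketch (not the enic driver code), the snippet below shows how the return value of snprintf() reveals truncation. The helper name format_irq_name and the buffer sizes used with it are assumptions made for illustration only.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the enic driver.
 * Formats "<ifname>-rx-<queue>" into buf.
 * Returns 0 if the whole string fit, 1 if it was truncated,
 * and -1 if snprintf() reported an error.
 */
static int format_irq_name(char *buf, size_t len,
			   const char *ifname, unsigned int queue)
{
	int n = snprintf(buf, len, "%s-rx-%u", ifname, queue);

	if (n < 0)
		return -1;

	/*
	 * snprintf() returns the number of characters it wanted to
	 * write, excluding the terminating NUL. Anything >= len means
	 * the output was cut short.
	 */
	return (size_t)n >= len;
}
```

With a 16-byte destination, a 13-character interface name plus "-rx-12" needs 19 characters and is reported as truncated, while the same call with a 32-byte destination fits. This mirrors the choice made in the patch: enlarge the destination rather than clip the name with a precision.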
Re: ipv6: Do not leak throw route references
From: Serhey PopovychDate: Tue, 20 Jun 2017 13:29:25 +0300 > While commit 73ba57bfae4a ("ipv6: fix backtracking for throw routes") > does good job on error propagation to the fib_rules_lookup() > in fib rules core framework that also corrects throw routes > handling, it does not solve route reference leakage problem > happened when we return -EAGAIN to the fib_rules_lookup() > and leave routing table entry referenced in arg->result. > > If rule with matched throw route isn't last matched in the > list we overwrite arg->result losing reference on throw > route stored previously forever. > > We also partially revert commit ab997ad40839 ("ipv6: fix the > incorrect return value of throw route") since we never return > routing table entry with dst.error == -EAGAIN when > CONFIG_IPV6_MULTIPLE_TABLES is on. Also there is no point > to check for RTF_REJECT flag since it is always set throw > route. > > Fixes: 73ba57bfae4a ("ipv6: fix backtracking for throw routes") > Signed-off-by: Serhey Popovych > --- > v2: Rebased to kernel/git/davem/net.git repository > Address several scripts/checkpatch.pl issues. Applied and queue up for -stable, thanks.
Re: pull-request: wireless-drivers 2017-06-20
From: Kalle Valo Date: Tue, 20 Jun 2017 16:39:59 +0300 > here's a pull request to net tree, few important fixes still I would > like to have in 4.12. Please let me know if there are any problems. Pulled, thanks Kalle.
[GIT] Networking
1) Fix refcounting wrt. timers which hold onto inet6 address objects, from Xin Long. 2) Fix an ancient bug in wireless wext ioctls, from Johannes Berg. 3) Firmware handling fixes in brcm80211 driver, from Arend Van Spriel. 4) Several mlx5 driver fixes (firmware readiness, timestamp cap reporting, devlink command validity checking, tc offloading, etc.) From Eli Cohen, Maor Dickman, Chris Mi, and Or Gerlitz. 5) Fix dst leak in IP/IP6 tunnels, from Haishuang Yan. 6) Fix dst refcount bug in decnet, from Wei Wang. 7) Netdev can be double freed in register_vlan_device(). Fix from Gao Feng. 8) Don't allow object to be destroyed while it is being dumped in SCTP, from Xin Long. 9) Fix dpaa_eth build when modular, from Madalin Bucur. 10) Fix throw route leaks, from Serhey Popovych. 11) IFLA_GROUP missing from if_nlmsg_size() and ifla_policy[] table, also from Serhey Popovych. 12) Fix premature TX SKB free in stmmac, from Niklas Cassel. Please pull, thanks a lot! The following changes since commit a090bd4ff8387c409732a8e059fbf264ea0bdd56: Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2017-06-15 18:09:47 +0900) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git for you to fetch changes up to b4846fc3c8559649277e3e4e6b5cec5348a8d208: igmp: add a missing spin_lock_init() (2017-06-20 15:51:57 -0400) Arend Van Spriel (5): brcmfmac: add parameter to pass error code in firmware callback brcmfmac: use firmware callback upon failure to load brcmfmac: unbind all devices upon failure in firmware callback brcmfmac: fix brcmf_fws_add_interface() for USB devices brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2() Chris Mi (1): net/mlx5e: Fix min inline value for VF rep SQs David Howells (1): rxrpc: Fix several cases where a padded len isn't checked in ticket decode David S. 
Miller (4): Merge tag 'mlx5-fixes-2017-06-14' of git://git.kernel.org/.../saeed/linux Merge tag 'mac80211-for-davem-2017-06-16' of git://git.kernel.org/.../jberg/mac80211 Merge branch 'net-fix-loadable-module-for-DPAA-Ethernet' Merge tag 'wireless-drivers-for-davem-2017-06-20' of git://git.kernel.org/.../kvalo/wireless-drivers Edward Cree (1): sfc: remove duplicate up_write on VF filter_sem Eli Cohen (1): net/mlx5: Wait for FW readiness before initializing command interface Gao Feng (1): net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev Haishuang Yan (3): ip_tunnel: fix potential issue in ip_tunnel_rcv ip6_tunnel: fix potential issue in __ip6_tnl_rcv ip6_tunnel: Correct tos value in collect_md mode Johannes Berg (3): wireless: wext: remove ndo_do_ioctl fallback wireless: wext: use struct iwreq earlier in the call chain dev_ioctl: copy only the smaller struct iwreq for wext Krzysztof Kozlowski (1): dt-bindings: net: sms911x: Add missing optional VDD regulators Lin Yun Sheng (1): net/hns:bugfix of ethtool -t phy self_test Madalin Bucur (2): fsl/fman: propagate dma_ops dpaa_eth: reuse the dma_ops provided by the FMan MAC device Maor Dickman (1): net/mlx5e: Fix timestamping capabilities reporting Niklas Cassel (1): net: stmmac: free an skb first when there are no longer any descriptors using it Or Gerlitz (3): net/mlx5: Properly check applicability of devlink eswitch commands net/mlx5e: Remove TC header re-write offloading of ip tos net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it Raju Rangoju (1): cxgb4: notify uP to route ctrlq compl to rdma rspq Sebastian Siewior (1): net/core: remove explicit do_softirq() from busy_poll_stop() Serhey Popovych (3): fib_rules: Resolve goto rules target on delete ipv6: Do not leak throw route references rtnetlink: add IFLA_GROUP to ifla_policy WANG Cong (1): igmp: add a missing spin_lock_init() Wei Wang (1): decnet: always not take dst->__refcnt when inserting dst into hash table Xin Long (3): 
ipv6: fix calling in6_ifa_hold incorrectly for dad work sctp: return next obj by passing pos + 1 into sctp_transport_get_idx sctp: ensure ep is not destroyed before doing the dump xypron.g...@gmx.de (1): Doc: net: dsa: b53: update location of referenced dsa.txt Documentation/devicetree/bindings/net/dsa/b53.txt | 2 +- Documentation/devicetree/bindings/net/smsc911x.txt | 1 + drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 10 ++ drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +- drivers/net/ethernet/freescale/fman/mac.c | 2 ++ drivers/net/ethernet/hisilicon/hns/hns_ethtool.c| 16 ++-- drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c| 8
[PATCH] net: phy: smsc: fix buffer overflow in memcpy
The memcpy annotation triggers for a fixed-length buffer copy: In file included from /git/arm-soc/arch/arm64/include/asm/processor.h:30:0, from /git/arm-soc/arch/arm64/include/asm/spinlock.h:21, from /git/arm-soc/include/linux/spinlock.h:87, from /git/arm-soc/include/linux/seqlock.h:35, from /git/arm-soc/include/linux/time.h:5, from /git/arm-soc/include/linux/stat.h:21, from /git/arm-soc/include/linux/module.h:10, from /git/arm-soc/drivers/net/phy/smsc.c:20: In function 'memcpy', inlined from 'smsc_get_strings' at /git/arm-soc/drivers/net/phy/smsc.c:166:3: /git/arm-soc/include/linux/string.h:309:4: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter Using strncpy instead of memcpy should do the right thing here. Fixes: 030a89028db0 ("net: phy: smsc: Implement PHY statistics") Signed-off-by: Arnd Bergmann --- drivers/net/phy/smsc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c index 1b8204be064c..2306bfae057f 100644 --- a/drivers/net/phy/smsc.c +++ b/drivers/net/phy/smsc.c @@ -163,7 +163,7 @@ static void smsc_get_strings(struct phy_device *phydev, u8 *data) int i; for (i = 0; i < ARRAY_SIZE(smsc_hw_stats); i++) { - memcpy(data + i * ETH_GSTRING_LEN, + strncpy(data + i * ETH_GSTRING_LEN, smsc_hw_stats[i].string, ETH_GSTRING_LEN); } } -- 2.9.0
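To illustrate why strncpy is the safer call here, the following is a rough userspace sketch rather than the smsc driver code. It assumes the statistic names are ordinary NUL-terminated strings shorter than the slot width, so a fixed-length memcpy would read past the end of each source string. GSTRING_LEN, stat_names and fill_strings are hypothetical stand-ins for illustration.

```c
#include <string.h>
#include <stddef.h>

#define GSTRING_LEN 32	/* assumed to mirror ETH_GSTRING_LEN */

/* Hypothetical statistic names, made up for this example. */
static const char *const stat_names[] = {
	"rx_errors",
	"link_down_events",
};

#define N_STATS (sizeof(stat_names) / sizeof(stat_names[0]))

/*
 * Copy each name into its own GSTRING_LEN-wide slot of data.
 * strncpy() stops reading the source at its NUL terminator and
 * zero-fills the rest of the slot, so it never reads beyond the
 * source string. A name of GSTRING_LEN characters or more would
 * be left without a terminator, which this sketch does not handle.
 */
static void fill_strings(unsigned char *data)
{
	size_t i;

	for (i = 0; i < N_STATS; i++)
		strncpy((char *)data + i * GSTRING_LEN,
			stat_names[i], GSTRING_LEN);
}
```

The caller is expected to supply a buffer of at least N_STATS * GSTRING_LEN bytes.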
Re: [PATCH v1 1/2] dt-binding: ptp: add bindings document for dte based ptp clock
Hi Rob, On 17-06-18 07:04 AM, Rob Herring wrote: On Mon, Jun 12, 2017 at 01:26:00PM -0700, Arun Parameswaran wrote: Add device tree binding documentation for the Broadcom DTE PTP clock driver. Signed-off-by: Arun Parameswaran--- Documentation/devicetree/bindings/ptp/brcm,ptp-dte.txt | 13 + 1 file changed, 13 insertions(+) create mode 100644 Documentation/devicetree/bindings/ptp/brcm,ptp-dte.txt diff --git a/Documentation/devicetree/bindings/ptp/brcm,ptp-dte.txt b/Documentation/devicetree/bindings/ptp/brcm,ptp-dte.txt new file mode 100644 index 000..07590bc --- /dev/null +++ b/Documentation/devicetree/bindings/ptp/brcm,ptp-dte.txt @@ -0,0 +1,13 @@ +* Broadcom Digital Timing Engine(DTE) based PTP clock driver Bindings describe h/w, not drivers. + +Required properties: +- compatible: should be "brcm,ptp-dte" Looks too generic. You need SoC specific compatible strings. Rob, could you please help me understand the use of adding SoC specific compatible strings. I still don't get it. It's my understanding that the SoC compatibility string is to future proof against bugs/incompatibilities between different versions of the hardware block due to integration issues or any other reason. You can then compare in your driver because the strings were already used in the dtb. That would make sense if you can't already differentiate what SoC you are running on. But the SoC is already specified in the root of the device tree in the compatible string? Why can't you just use of_machine_is_compatible inside your driver when needed? Please explain what I'm missing. I see other drivers already following the of_machine_is_compatible approach and it makes more sense to me than adding SoC specific compatible strings into every driver. Regards, Scott
Re: [PATCH net-next 06/12] nfp: add stats and xmit helpers for representors
On Wed, Jun 21, 2017 at 01:15:05AM +0800, kbuild test robot wrote: > Hi Simon, > > [auto build test ERROR on net-next/master] > > url: > https://github.com/0day-ci/linux/commits/Simon-Horman/nfp-add-flower-app-with-representors/20170620-233831 > config: arm-allmodconfig (attached as .config) > compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705 > reproduce: > wget > https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O > ~/bin/make.cross > chmod +x ~/bin/make.cross > # save the attached .config to linux build tree > make.cross ARCH=arm It seems that I forgot to add #include I will do so in v2. > > All errors (new ones prefixed by >>): > >drivers/net//ethernet/netronome/nfp/nfp_net_repr.c: In function > 'nfp_repr_phy_port_get_stats64': > >> drivers/net//ethernet/netronome/nfp/nfp_net_repr.c:88:22: error: implicit > >> declaration of function 'readq' [-Werror=implicit-function-declaration] > stats->tx_packets = readq(mem + NFP_MAC_STATS_RX_FRAMES_RECEIVED_OK); > ^ >cc1: some warnings being treated as errors > > vim +/readq +88 drivers/net//ethernet/netronome/nfp/nfp_net_repr.c > > 72stats->rx_packets++; > 73stats->rx_bytes += len; > 74u64_stats_update_end(>syncp); > 75} > 76 > 77void > 78nfp_repr_phy_port_get_stats64(const struct nfp_app *app, u8 > phy_port, > 79 struct rtnl_link_stats64 *stats) > 80{ > 81u8 __iomem *mem; > 82 > 83mem = app->pf->mac_stats_mem + phy_port * > NFP_MAC_STATS_SIZE; > 84 > 85/* TX and RX stats are flipped as we are returning the > stats as seen > 86 * at the switch port corresponding to the phys port. 
> 87 */ > > 88stats->tx_packets = readq(mem + > NFP_MAC_STATS_RX_FRAMES_RECEIVED_OK); > 89stats->tx_bytes = readq(mem + > NFP_MAC_STATS_RX_IN_OCTETS); > 90stats->tx_dropped = readq(mem + > NFP_MAC_STATS_RX_IN_ERRORS); > 91 > 92stats->rx_packets = readq(mem + > NFP_MAC_STATS_TX_FRAMES_TRANSMITTED_OK); > 93stats->rx_bytes = readq(mem + > NFP_MAC_STATS_TX_OUT_OCTETS); > 94stats->rx_dropped = readq(mem + > NFP_MAC_STATS_TX_OUT_ERRORS); > 95} > 96 > > --- > 0-DAY kernel test infrastructureOpen Source Technology Center > https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [PATCH 00/51] rtc: stop using rtc deprecated functions
On Tue, Jun 20, 2017 at 05:07:46PM +0200, Benjamin Gaignard wrote: > 2017-06-20 15:48 GMT+02:00 Alexandre Belloni: > >> Yes, that's argument against changing rtc _drivers_ for hardware that > >> can not do better than 32bit. For generic code (such as 44/51 sysfs, > >> 51/51 suspend test), the change still makes sense. > > What I had in mind when writing those patches was to remove the limitations > coming from those functions usage, even more since they been marked has > deprecated. I'd say that they should not be marked as deprecated. They're entirely appropriate for use with hardware that only supports a 32-bit representation of time. It's entirely reasonable to fix the ones that use other representations that exceed that, but for those which do not, we need to keep using the 32-bit versions. Doing so actually gives us _more_ flexibility in the future. Consider that at the moment, we define the 32-bit RTC representation to start at a well known epoch. We _could_ decide that when it wraps to 0x80000000 seconds, we'll define the lower 0x40000000 seconds to mean dates in the future - and keep rolling that forward each time we cross another 0x40000000 seconds. Unless someone invents a real time machine, we shouldn't need to set a modern RTC back to 1970. If we convert the 32-bit counter RTC drivers to use 64-bit conversions, then we're completely stuffed, because the lower 32-bits will always be relative to the epoch, and we can't change that without breaking the 64-bit users. So, keep the 32-bit conversion functions, do not deprecate them, and think about the future possibilities. I really think this "get rid of 32-bit time representations" is a much too narrow focus on the wrong problem. You can't ever fix 32-bit time representations by just adding additional zeros into the MSB bits. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
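One possible reading of the rolling-window idea described above, sketched in plain C. This is not an existing kernel interface; rtc32_expand and its window_start parameter are hypothetical names. The assumption is that the true time always lies within 2^32 seconds after a chosen window start, which could be moved forward over the years.

```c
#include <stdint.h>

/*
 * Hypothetical helper: expand a 32-bit hardware seconds counter to a
 * 64-bit time value.
 *
 * window_start is the earliest time the counter is allowed to mean.
 * The real time is assumed to lie in
 * [window_start, window_start + 2^32).
 */
static int64_t rtc32_expand(uint32_t hw, int64_t window_start)
{
	/* Distance from the window start, computed modulo 2^32. */
	uint32_t delta = hw - (uint32_t)window_start;

	return window_start + (int64_t)delta;
}
```

With window_start set to 0x40000000, a raw counter value of 0x10000000 is interpreted as 0x110000000, i.e. as a date after the 32-bit wrap rather than one in the early 1970s. Simply zero-extending the counter cannot give that result, which is the point made in the message above.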
Re: [PATCH net-next v3 07/15] bpf: Add setsockopt helper function to bpf
On Mon, Jun 19, 2017 at 11:00 PM, Lawrence Brakmowrote: > Added support for calling a subset of socket setsockopts from > BPF_PROG_TYPE_SOCK_OPS programs. The code was duplicated rather > than making the changes to call the socket setsockopt function because > the changes required would have been larger. > > @@ -2671,6 +2672,69 @@ static const struct bpf_func_proto > bpf_get_socket_uid_proto = { > .arg1_type = ARG_PTR_TO_CTX, > }; > > +BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock, > + int, level, int, optname, char *, optval, int, optlen) > +{ > + struct sock *sk = bpf_sock->sk; > + int ret = 0; > + int val; > + > + if (bpf_sock->is_req_sock) > + return -EINVAL; > + > + if (level == SOL_SOCKET) { > + /* Only some socketops are supported */ > + val = *((int *)optval); > + > + switch (optname) { > + case SO_RCVBUF: > + sk->sk_userlocks |= SOCK_RCVBUF_LOCK; > + sk->sk_rcvbuf = max_t(int, val * 2, SOCK_MIN_RCVBUF); > + break; > + case SO_SNDBUF: > + sk->sk_userlocks |= SOCK_SNDBUF_LOCK; > + sk->sk_sndbuf = max_t(int, val * 2, SOCK_MIN_SNDBUF); > + break; > + case SO_MAX_PACING_RATE: > + sk->sk_max_pacing_rate = val; > + sk->sk_pacing_rate = min(sk->sk_pacing_rate, > +sk->sk_max_pacing_rate); > + break; > + case SO_PRIORITY: > + sk->sk_priority = val; > + break; > + case SO_RCVLOWAT: > + if (val < 0) > + val = INT_MAX; > + sk->sk_rcvlowat = val ? : 1; > + break; > + case SO_MARK: > + sk->sk_mark = val; > + break; Isn't the socket lock required when manipulating these fields? It's not obvious that the lock is held from every bpf hook point that could trigger this function...
Re: [PATCH net-next v3 3/4] ipmr: add netlink notifications on igmpmsg cache reports
On 20/06/17 23:54, Julien Gomes wrote: > Add Netlink notifications on cache reports in ipmr, in addition to the > existing igmpmsg sent to mroute_sk. > Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV4_MROUTE_R. > > MSGTYPE, VIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the > same data as their equivalent fields in the igmpmsg header. > PKT attribute is the packet sent to mroute_sk, without the added igmpmsg > header. > > Suggested-by: Ryan Halbrook > Signed-off-by: Julien Gomes > --- > include/uapi/linux/mroute.h | 12 > net/ipv4/ipmr.c | 69 > +++-- > 2 files changed, 79 insertions(+), 2 deletions(-) > Thanks, Reviewed-by: Nikolay Aleksandrov
Re: [net,v2] ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf
On Mon, Jun 19, 2017 at 11:37 PM, jeffy wrote: > Hi Cong Wang, > > > On 06/20/2017 12:54 PM, Cong Wang wrote: >> >> Interesting, I didn't notice this corner-case, because normally >> we would hit the one in rollback_registered_many(). Probably >> we need to add a check >> >> if (dev->reg_state == NETREG_UNREGISTERING) >> >> in ip6_route_dev_notify(). Can you give it a try? > > the NETREG_UNREGISTERING check works for my test:) > > but i saw dev_change_net_namespace also call NETDEV_UNREGISTER & > NETDEV_REGISTER: Yes we should call it in this case too, only netdev_wait_allrefs() is an exceptional case here. I just sent out a formal patch with you Cc'ed. Thanks!
[PATCH net-next v3 1/4] rtnetlink: add NEWCACHEREPORT message type
New NEWCACHEREPORT message type to be used for cache reports sent via Netlink, effectively allowing splitting cache report reception from mroute programming. Suggested-by: Ryan HalbrookSigned-off-by: Julien Gomes Reviewed-by: Nikolay Aleksandrov --- include/uapi/linux/rtnetlink.h | 3 +++ security/selinux/nlmsgtab.c| 3 ++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 564790e854f7..cd1afb900929 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -146,6 +146,9 @@ enum { RTM_GETSTATS = 94, #define RTM_GETSTATS RTM_GETSTATS + RTM_NEWCACHEREPORT = 96, +#define RTM_NEWCACHEREPORT RTM_NEWCACHEREPORT + __RTM_MAX, #define RTM_MAX(((__RTM_MAX + 3) & ~3) - 1) }; diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c index 5aeaf30b7a13..7b7433a1a34c 100644 --- a/security/selinux/nlmsgtab.c +++ b/security/selinux/nlmsgtab.c @@ -79,6 +79,7 @@ static const struct nlmsg_perm nlmsg_route_perms[] = { RTM_GETNSID, NETLINK_ROUTE_SOCKET__NLMSG_READ }, { RTM_NEWSTATS, NETLINK_ROUTE_SOCKET__NLMSG_READ }, { RTM_GETSTATS, NETLINK_ROUTE_SOCKET__NLMSG_READ }, + { RTM_NEWCACHEREPORT, NETLINK_ROUTE_SOCKET__NLMSG_READ }, }; static const struct nlmsg_perm nlmsg_tcpdiag_perms[] = @@ -158,7 +159,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm) switch (sclass) { case SECCLASS_NETLINK_ROUTE_SOCKET: /* RTM_MAX always point to RTM_SET, ie RTM_NEWxxx + 3 */ - BUILD_BUG_ON(RTM_MAX != (RTM_NEWSTATS + 3)); + BUILD_BUG_ON(RTM_MAX != (RTM_NEWCACHEREPORT + 3)); err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms, sizeof(nlmsg_route_perms)); break; -- 2.13.1
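The BUILD_BUG_ON change in the patch above follows from how RTM_MAX is rounded. As a small worked check, using only the values visible in the diff and a reduced enum (rtm_max_from is a hypothetical helper added for this example):

```c
/* Reduced copy of the enum tail shown in the diff above. */
enum {
	RTM_GETSTATS = 94,
	RTM_NEWCACHEREPORT = 96,
	__RTM_MAX,			/* becomes 97 */
};
#define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1)

/*
 * Each message family reserves a block of four numbers starting at
 * its NEW value, so RTM_MAX is rounded up to the last slot of the
 * final block: ((97 + 3) & ~3) - 1 = 100 - 1 = 99 = 96 + 3.
 */
_Static_assert(RTM_MAX == RTM_NEWCACHEREPORT + 3,
	       "RTM_MAX must point at the last slot of the final block");

/* Hypothetical helper applying the same rounding to any sentinel. */
static int rtm_max_from(int sentinel)
{
	return ((sentinel + 3) & ~3) - 1;
}
```

Before the patch the sentinel was 95 (one past RTM_GETSTATS), which rounds to 95, so the old comparison against the previous last NEW value plus 3 held as well.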
Re: [PATCH net-next v3 4/4] ip6mr: add netlink notifications on mrt6msg cache reports
On 20/06/17 23:54, Julien Gomes wrote: > Add Netlink notifications on cache reports in ip6mr, in addition to the > existing mrt6msg sent to mroute6_sk. > Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV6_MROUTE_R. > > MSGTYPE, MIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the > same data as their equivalent fields in the mrt6msg header. > PKT attribute is the packet sent to mroute6_sk, without the added > mrt6msg header. > > Suggested-by: Ryan Halbrook > Signed-off-by: Julien Gomes > --- > include/uapi/linux/mroute6.h | 12 > net/ipv6/ip6mr.c | 71 > ++-- > 2 files changed, 81 insertions(+), 2 deletions(-) > Reviewed-by: Nikolay Aleksandrov
Re: [PATCH net] dccp: call inet_add_protocol after register_pernet_subsys in dccp_v4_init
From: Xin Long Date: Tue, 20 Jun 2017 15:42:38 +0800 > Now dccp_ipv4 works as a kernel module. During loading this module, if > one dccp packet is being received after inet_add_protocol but before > register_pernet_subsys in which v4_ctl_sk is initialized, a null pointer > dereference may be triggered because of init_net.dccp.v4_ctl_sk is 0x0. > > Jianlin found this issue when the following call trace occurred: ... > This patch is to move inet_add_protocol after register_pernet_subsys in > dccp_v4_init, so that v4_ctl_sk is initialized before any incoming dccp > packets are processed. > > Reported-by: Jianlin Shi > Signed-off-by: Xin Long Applied.
Re: [PATCH net-next] sctp: uncork the old asoc before changing to the new one
From: Xin LongDate: Tue, 20 Jun 2017 16:01:55 +0800 > local_cork is used to decide if it should uncork asoc outq after processing > some cmds, and it is set when replying or sending msgs. local_cork should > always have the same value with current asoc q->cork in some way. > > The thing is when changing to a new asoc by cmd SET_ASOC, local_cork may > not be consistent with the current asoc any more. The cmd seqs can be: > > SCTP_CMD_UPDATE_ASSOC (asoc) > SCTP_CMD_REPLY (asoc) > SCTP_CMD_SET_ASOC (new_asoc) > SCTP_CMD_DELETE_TCB (new_asoc) > SCTP_CMD_SET_ASOC (asoc) > SCTP_CMD_REPLY (asoc) > > The 1st REPLY makes OLD asoc q->cork and local_cork both are 1, and the cmd > DELETE_TCB clears NEW asoc q->cork and local_cork. After asoc goes back to > OLD asoc, q->cork is still 1 while local_cork is 0. The 2nd REPLY will not > set local_cork because q->cork is already set and it can't be uncorked and > sent out because of this. > > To keep local_cork consistent with the current asoc q->cork, this patch is > to uncork the old asoc if local_cork is set before changing to the new one. > > Note that the above cmd seqs will be used in the next patch when updating > asoc and handling errors in it. > > Suggested-by: Marcelo Ricardo Leitner > Signed-off-by: Xin Long Applied.
Re: [PATCH net-next] sctp: handle errors when updating asoc
From: Xin Long Date: Tue, 20 Jun 2017 16:05:11 +0800 > It's a bad thing not to handle errors when updating asoc. The memory > allocation failure in any of the functions called in sctp_assoc_update() > would cause sctp to work unexpectedly. > > This patch is to fix it by aborting the asoc and reporting the error when > any of these functions fails. > > Signed-off-by: Xin Long Also applied, thank you.
Re: [PATCH] cfg80211: Fix a memory leak in error handling path in 'brcmf_cfg80211_attach'
On 20-06-17 08:22, Christophe JAILLET wrote: > If 'wiphy_new()' fails, we leak 'ops'. Add a new label in the error > handling path to free it in such a case. Thanks. Please add the following tags: Cc: sta...@vger.kernel.org > Fixes: 5c22fb85102a7 ("brcmfmac: add wowl gtk rekeying offload support") Acked-by: Arend van Spriel > Signed-off-by: Christophe JAILLET > --- > drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-)
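The fix discussed above is the usual goto-label cleanup pattern: every allocation made before a failing step gets a label that releases it. A minimal userspace sketch follows, with hypothetical demo_* names and an allocation counter that exists only so the leak can be observed; it is not the brcmfmac code.

```c
#include <stdlib.h>

static int live_allocs;	/* outstanding allocations, for demonstration */

/* Allocate, or simulate a failure when fail is non-zero. */
static void *demo_alloc(size_t size, int fail)
{
	void *p = fail ? NULL : malloc(size);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

struct demo_ops { int placeholder; };
struct demo_dev { struct demo_ops *ops; };

/* Returns a device that owns its ops, or NULL with nothing leaked. */
static struct demo_dev *demo_attach(int fail_dev_alloc)
{
	struct demo_ops *ops;
	struct demo_dev *dev;

	ops = demo_alloc(sizeof(*ops), 0);
	if (!ops)
		return NULL;

	dev = demo_alloc(sizeof(*dev), fail_dev_alloc);
	if (!dev)
		goto err_free_ops;	/* a plain return here would leak ops */

	dev->ops = ops;
	return dev;

err_free_ops:
	demo_free(ops);
	return NULL;
}

static void demo_detach(struct demo_dev *dev)
{
	if (!dev)
		return;
	demo_free(dev->ops);
	demo_free(dev);
}
```

Forcing the second allocation to fail leaves live_allocs at zero, which is the property the added label is meant to guarantee.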
[net-next PATCH] tcp: md5: hide unused variable
Changing from a memcpy to per-member comparison left the size variable unused: net/ipv4/tcp_ipv4.c: In function 'tcp_md5_do_lookup': net/ipv4/tcp_ipv4.c:910:15: error: unused variable 'size' [-Werror=unused-variable] This does not show up when CONFIG_IPV6 is enabled, but the variable can be removed either way, along with the now unused assignment. Fixes: 6797318e623d ("tcp: md5: add an address prefix for key lookup") Signed-off-by: Arnd Bergmann --- net/ipv4/tcp_ipv4.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index bf407f3e20dd..e20bcf0061af 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -907,7 +907,6 @@ struct tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk, { const struct tcp_sock *tp = tcp_sk(sk); struct tcp_md5sig_key *key; - unsigned int size = sizeof(struct in_addr); const struct tcp_md5sig_info *md5sig; __be32 mask; struct tcp_md5sig_key *best_match = NULL; @@ -918,10 +917,7 @@ struct tcp_md5sig_key *tcp_md5_do_lookup(const struct sock *sk, lockdep_sock_is_held(sk)); if (!md5sig) return NULL; -#if IS_ENABLED(CONFIG_IPV6) - if (family == AF_INET6) - size = sizeof(struct in6_addr); -#endif + hlist_for_each_entry_rcu(key, &md5sig->head, node) { if (key->family != family) continue; -- 2.9.0
Re: [PATCH v2] net/phy: micrel: configure intterupts after autoneg workaround
On 06/20/2017 10:48 AM, Zach Brown wrote: > The commit ("net/phy: micrel: Add workaround for bad autoneg") fixes an > autoneg failure case by resetting the hardware. This turns off > interrupts. Things will work themselves out if the phy polls, as it will > figure out its state during a poll. However if the phy uses only > interrupts, the phy will stall, since interrupts are off. This patch > fixes the issue by calling config_intr after resetting the phy. > > Fixes: d2fd719bcb0e ("net/phy: micrel: Add workaround for bad autoneg ") > Signed-off-by: Zach Brown Reviewed-by: Florian Fainelli -- Florian
Re: [PATCH 00/51] rtc: stop using rtc deprecated functions
On Tue, 20 Jun 2017, Alexandre Belloni wrote: > On 20/06/2017 at 22:15:36 +0100, Russell King - ARM Linux wrote: > > On Tue, Jun 20, 2017 at 05:07:46PM +0200, Benjamin Gaignard wrote: > > > 2017-06-20 15:48 GMT+02:00 Alexandre Belloni: > > > >> Yes, that's argument against changing rtc _drivers_ for hardware that > > > >> can not do better than 32bit. For generic code (such as 44/51 sysfs, > > > >> 51/51 suspend test), the change still makes sense. > > > > > > What I had in mind when writing those patches was to remove the > > > limitations > > > coming from those functions usage, even more since they been marked has > > > deprecated. > > > > I'd say that they should not be marked as deprecated. They're entirely > > appropriate for use with hardware that only supports a 32-bit > > representation of time. > > > > It's entirely reasonable to fix the ones that use other representations > > that exceed that, but for those which do not, we need to keep using the > > 32-bit versions. Doing so actually gives us _more_ flexibility in the > > future. > > > > Consider that at the moment, we define the 32-bit RTC representation to > > start at a well known epoch. We _could_ decide that when it wraps to > > 0x80000000 seconds, we'll define the lower 0x40000000 seconds to mean > > dates in the future - and keep rolling that forward each time we cross > > another 0x40000000 seconds. Unless someone invents a real time machine, > > we shouldn't need to set a modern RTC back to 1970. > > > > I agree with that but not the android guys. They seem to mandate an RTC > that can store time from 01/01/1970. I don't know much more than that > because they never cared to explain why that was actually necessary > (apart from a laconic "this will result in a bad user experience") > > I think tglx had a plan for offsetting the time at some point so 32-bit > platform can pass 2038 properly. 
Yes, but there are still quite some issues to solve there: 1) How do you tell the system that it should apply the offset in the first place, i.e at boot time before NTP or any other mechanism can correct it? 2) Deal with creative vendors who have their own idea about the 'start of the epoch' 3) Add the information of wraparound time to the rtc device which needs to be filled in for each device. That way the rtc_*** accessor functions can deal with them whether they wrap in 2038 or 2100 or whatever. #3 is the simplest problem of them :) > My opinion is that as long as userspace is not ready to handle those > dates, it doesn't really matter because it is quite unlikely that > anything will be able to continue running anyway. That's a different story. Making the kernel y2038 ready in general is a good thing. Whether userspace will be ready by then or not is completely irrelevant. Thanks, tglx
Re: [PATCH net] dccp: call inet_add_protocol after register_pernet_subsys in dccp_v6_init
From: Xin Long Date: Tue, 20 Jun 2017 15:44:44 +0800 > Patch "call inet_add_protocol after register_pernet_subsys in dccp_v4_init" > fixed a null pointer dereference issue for dccp_ipv4 module. > > The same fix is needed for dccp_ipv6 module. > > Signed-off-by: Xin Long Applied.
RE: [PATCH] liquidio: stop using huge static buffer, save 4096k in .data
> From: David Miller [mailto:da...@davemloft.net] > Sent: Tuesday, June 20, 2017 12:22 PM > > From: Denys Vlasenko> Date: Mon, 19 Jun 2017 21:50:52 +0200 > > > Only compile-tested - I don't have the hardware. > > > > From code inspection, octeon_pci_write_core_mem() appears to be safe wrt > > unaligned source. In any case, u8 fbuf[] was not guaranteed to be aligned > > anyway. > > > > Signed-off-by: Denys Vlasenko > > Looks good to me but I'll let one of the liquidio guys review this first > before I apply it. Felix is going to try this out this week to confirm. Let's wait for his ack.
[PATCH net-next v3 3/4] ipmr: add netlink notifications on igmpmsg cache reports
Add Netlink notifications on cache reports in ipmr, in addition to the
existing igmpmsg sent to mroute_sk.
Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV4_MROUTE_R.

MSGTYPE, VIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the
same data as their equivalent fields in the igmpmsg header.
PKT attribute is the packet sent to mroute_sk, without the added igmpmsg
header.

Suggested-by: Ryan Halbrook
Signed-off-by: Julien Gomes
---
 include/uapi/linux/mroute.h | 12
 net/ipv4/ipmr.c             | 69 +++--
 2 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/mroute.h b/include/uapi/linux/mroute.h
index f904367c0cee..e8e5041dea8e 100644
--- a/include/uapi/linux/mroute.h
+++ b/include/uapi/linux/mroute.h
@@ -152,6 +152,18 @@ enum {
 };
 #define IPMRA_VIFA_MAX (__IPMRA_VIFA_MAX - 1)
 
+/* ipmr netlink cache report attributes */
+enum {
+	IPMRA_CREPORT_UNSPEC,
+	IPMRA_CREPORT_MSGTYPE,
+	IPMRA_CREPORT_VIF_ID,
+	IPMRA_CREPORT_SRC_ADDR,
+	IPMRA_CREPORT_DST_ADDR,
+	IPMRA_CREPORT_PKT,
+	__IPMRA_CREPORT_MAX
+};
+#define IPMRA_CREPORT_MAX (__IPMRA_CREPORT_MAX - 1)
+
 /* That's all usermode folks */
 
 #define MFC_ASSERT_THRESH (3*HZ)		/* Maximal freq. of asserts */
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 3e7454aa49e8..a1d521be612b 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -109,6 +109,7 @@ static int __ipmr_fill_mroute(struct mr_table *mrt, struct sk_buff *skb,
 			      struct mfc_cache *c, struct rtmsg *rtm);
 static void mroute_netlink_event(struct mr_table *mrt, struct mfc_cache *mfc,
 				 int cmd);
+static void igmpmsg_netlink_event(struct mr_table *mrt, struct sk_buff *pkt);
 static void mroute_clean_tables(struct mr_table *mrt, bool all);
 static void ipmr_expire_process(unsigned long arg);
 
@@ -995,8 +996,7 @@ static void ipmr_cache_resolve(struct net *net, struct mr_table *mrt,
 	}
 }
 
-/* Bounce a cache query up to mrouted. We could use netlink for this but mrouted
- * expects the following bizarre scheme.
+/* Bounce a cache query up to mrouted and netlink.
  *
  * Called under mrt_lock.
  */
@@ -1062,6 +1062,8 @@ static int ipmr_cache_report(struct mr_table *mrt,
 		return -EINVAL;
 	}
 
+	igmpmsg_netlink_event(mrt, skb);
+
 	/* Deliver to mrouted */
 	ret = sock_queue_rcv_skb(mroute_sk, skb);
 	rcu_read_unlock();
@@ -2341,6 +2343,69 @@ static void mroute_netlink_event(struct mr_table *mrt, struct mfc_cache *mfc,
 	rtnl_set_sk_err(net, RTNLGRP_IPV4_MROUTE, err);
 }
 
+static size_t igmpmsg_netlink_msgsize(size_t payloadlen)
+{
+	size_t len =
+		NLMSG_ALIGN(sizeof(struct rtgenmsg))
+		+ nla_total_size(1)	/* IPMRA_CREPORT_MSGTYPE */
+		+ nla_total_size(4)	/* IPMRA_CREPORT_VIF_ID */
+		+ nla_total_size(4)	/* IPMRA_CREPORT_SRC_ADDR */
+		+ nla_total_size(4)	/* IPMRA_CREPORT_DST_ADDR */
+					/* IPMRA_CREPORT_PKT */
+		+ nla_total_size(payloadlen)
+		;
+
+	return len;
+}
+
+static void igmpmsg_netlink_event(struct mr_table *mrt, struct sk_buff *pkt)
+{
+	struct net *net = read_pnet(&mrt->net);
+	struct nlmsghdr *nlh;
+	struct rtgenmsg *rtgenm;
+	struct igmpmsg *msg;
+	struct sk_buff *skb;
+	struct nlattr *nla;
+	int payloadlen;
+
+	payloadlen = pkt->len - sizeof(struct igmpmsg);
+	msg = (struct igmpmsg *)skb_network_header(pkt);
+
+	skb = nlmsg_new(igmpmsg_netlink_msgsize(payloadlen), GFP_ATOMIC);
+	if (!skb)
+		goto errout;
+
+	nlh = nlmsg_put(skb, 0, 0, RTM_NEWCACHEREPORT,
+			sizeof(struct rtgenmsg), 0);
+	if (!nlh)
+		goto errout;
+	rtgenm = nlmsg_data(nlh);
+	rtgenm->rtgen_family = RTNL_FAMILY_IPMR;
+	if (nla_put_u8(skb, IPMRA_CREPORT_MSGTYPE, msg->im_msgtype) ||
+	    nla_put_u32(skb, IPMRA_CREPORT_VIF_ID, msg->im_vif) ||
+	    nla_put_in_addr(skb, IPMRA_CREPORT_SRC_ADDR,
+			    msg->im_src.s_addr) ||
+	    nla_put_in_addr(skb, IPMRA_CREPORT_DST_ADDR,
+			    msg->im_dst.s_addr))
+		goto nla_put_failure;
+
+	nla = nla_reserve(skb, IPMRA_CREPORT_PKT, payloadlen);
+	if (!nla || skb_copy_bits(pkt, sizeof(struct igmpmsg),
+				  nla_data(nla), payloadlen))
+		goto nla_put_failure;
+
+	nlmsg_end(skb, nlh);
+
+	rtnl_notify(skb, net, 0, RTNLGRP_IPV4_MROUTE_R, NULL, GFP_ATOMIC);
+	return;
+
+nla_put_failure:
+	nlmsg_cancel(skb, nlh);
+errout:
+	kfree_skb(skb);
+	rtnl_set_sk_err(net, RTNLGRP_IPV4_MROUTE_R, -ENOBUFS);
+}
+
 static int
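For readers who want to check the sizing arithmetic in igmpmsg_netlink_msgsize() outside the kernel, here is a minimal user-space sketch. It is not kernel code: the SK_* macros and sketch_* functions are hypothetical stand-ins that mirror the 4-byte netlink alignment and 4-byte attribute header, and it assumes struct rtgenmsg is a single byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical local stand-ins for NLMSG_ALIGN/NLA_ALIGN and NLA_HDRLEN. */
#define SK_ALIGNTO	((size_t)4)
#define SK_ALIGN(len)	(((len) + SK_ALIGNTO - 1) & ~(SK_ALIGNTO - 1))
#define SK_NLA_HDRLEN	((size_t)4)

/* Attribute header plus payload, padded to the next 4-byte boundary. */
size_t sketch_nla_total_size(size_t payload)
{
	return SK_ALIGN(SK_NLA_HDRLEN + payload);
}

/* Same sum as the patch: family header plus five attributes. */
size_t sketch_igmpmsg_msgsize(size_t payloadlen)
{
	return SK_ALIGN((size_t)1)		/* struct rtgenmsg (assumed 1 byte) */
		+ sketch_nla_total_size(1)	/* MSGTYPE */
		+ sketch_nla_total_size(4)	/* VIF_ID */
		+ sketch_nla_total_size(4)	/* SRC_ADDR */
		+ sketch_nla_total_size(4)	/* DST_ADDR */
		+ sketch_nla_total_size(payloadlen);	/* PKT */
}

/* Usage example: a 20-byte packet needs 4 + 8 + 8 + 8 + 8 + 24 bytes. */
void sketch_msgsize_selfcheck(void)
{
	assert(sketch_nla_total_size(1) == 8);
	assert(sketch_igmpmsg_msgsize(20) == 60);
}
```

Note that even the 1-byte MSGTYPE attribute costs 8 bytes once the header and padding are counted. The netlink message header itself is not part of this sum; as far as I know nlmsg_new() accounts for it separately.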
[PATCH net-next v3 4/4] ip6mr: add netlink notifications on mrt6msg cache reports
Add Netlink notifications on cache reports in ip6mr, in addition to the
existing mrt6msg sent to mroute6_sk.
Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV6_MROUTE_R.

MSGTYPE, MIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the
same data as their equivalent fields in the mrt6msg header.
PKT attribute is the packet sent to mroute6_sk, without the added
mrt6msg header.

Suggested-by: Ryan Halbrook
Signed-off-by: Julien Gomes
---
 include/uapi/linux/mroute6.h | 12
 net/ipv6/ip6mr.c             | 71 ++--
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/mroute6.h b/include/uapi/linux/mroute6.h
index ed5721148768..e4746816c855 100644
--- a/include/uapi/linux/mroute6.h
+++ b/include/uapi/linux/mroute6.h
@@ -133,4 +133,16 @@ struct mrt6msg {
 	struct in6_addr	im6_src, im6_dst;
 };
 
+/* ip6mr netlink cache report attributes */
+enum {
+	IP6MRA_CREPORT_UNSPEC,
+	IP6MRA_CREPORT_MSGTYPE,
+	IP6MRA_CREPORT_MIF_ID,
+	IP6MRA_CREPORT_SRC_ADDR,
+	IP6MRA_CREPORT_DST_ADDR,
+	IP6MRA_CREPORT_PKT,
+	__IP6MRA_CREPORT_MAX
+};
+#define IP6MRA_CREPORT_MAX (__IP6MRA_CREPORT_MAX - 1)
+
 #endif /* _UAPI__LINUX_MROUTE6_H */
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index b0e2bf1f4212..7454850f2098 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -116,6 +116,7 @@ static int __ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
 			       struct mfc6_cache *c, struct rtmsg *rtm);
 static void mr6_netlink_event(struct mr6_table *mrt, struct mfc6_cache *mfc,
 			      int cmd);
+static void mrt6msg_netlink_event(struct mr6_table *mrt, struct sk_buff *pkt);
 static int ip6mr_rtm_dumproute(struct sk_buff *skb,
 			       struct netlink_callback *cb);
 static void mroute_clean_tables(struct mr6_table *mrt, bool all);
@@ -1125,8 +1126,7 @@ static void ip6mr_cache_resolve(struct net *net, struct mr6_table *mrt,
 }
 
 /*
- *	Bounce a cache query up to pim6sd. We could use netlink for this but pim6sd
- *	expects the following bizarre scheme.
+ *	Bounce a cache query up to pim6sd and netlink.
  *
  *	Called under mrt_lock.
  */
@@ -1208,6 +1208,8 @@ static int ip6mr_cache_report(struct mr6_table *mrt, struct sk_buff *pkt,
 		return -EINVAL;
 	}
 
+	mrt6msg_netlink_event(mrt, skb);
+
 	/*
 	 *	Deliver to user space multicast routing algorithms
 	 */
@@ -2457,6 +2459,71 @@ static void mr6_netlink_event(struct mr6_table *mrt, struct mfc6_cache *mfc,
 	rtnl_set_sk_err(net, RTNLGRP_IPV6_MROUTE, err);
 }
 
+static size_t mrt6msg_netlink_msgsize(size_t payloadlen)
+{
+	size_t len =
+		NLMSG_ALIGN(sizeof(struct rtgenmsg))
+		+ nla_total_size(1)	/* IP6MRA_CREPORT_MSGTYPE */
+		+ nla_total_size(4)	/* IP6MRA_CREPORT_MIF_ID */
+					/* IP6MRA_CREPORT_SRC_ADDR */
+		+ nla_total_size(sizeof(struct in6_addr))
+					/* IP6MRA_CREPORT_DST_ADDR */
+		+ nla_total_size(sizeof(struct in6_addr))
+					/* IP6MRA_CREPORT_PKT */
+		+ nla_total_size(payloadlen)
+		;
+
+	return len;
+}
+
+static void mrt6msg_netlink_event(struct mr6_table *mrt, struct sk_buff *pkt)
+{
+	struct net *net = read_pnet(&mrt->net);
+	struct nlmsghdr *nlh;
+	struct rtgenmsg *rtgenm;
+	struct mrt6msg *msg;
+	struct sk_buff *skb;
+	struct nlattr *nla;
+	int payloadlen;
+
+	payloadlen = pkt->len - sizeof(struct mrt6msg);
+	msg = (struct mrt6msg *)skb_transport_header(pkt);
+
+	skb = nlmsg_new(mrt6msg_netlink_msgsize(payloadlen), GFP_ATOMIC);
+	if (!skb)
+		goto errout;
+
+	nlh = nlmsg_put(skb, 0, 0, RTM_NEWCACHEREPORT,
+			sizeof(struct rtgenmsg), 0);
+	if (!nlh)
+		goto errout;
+	rtgenm = nlmsg_data(nlh);
+	rtgenm->rtgen_family = RTNL_FAMILY_IP6MR;
+	if (nla_put_u8(skb, IP6MRA_CREPORT_MSGTYPE, msg->im6_msgtype) ||
+	    nla_put_u32(skb, IP6MRA_CREPORT_MIF_ID, msg->im6_mif) ||
+	    nla_put_in6_addr(skb, IP6MRA_CREPORT_SRC_ADDR,
+			     &msg->im6_src) ||
+	    nla_put_in6_addr(skb, IP6MRA_CREPORT_DST_ADDR,
+			     &msg->im6_dst))
+		goto nla_put_failure;
+
+	nla = nla_reserve(skb, IP6MRA_CREPORT_PKT, payloadlen);
+	if (!nla || skb_copy_bits(pkt, sizeof(struct mrt6msg),
+				  nla_data(nla), payloadlen))
+		goto nla_put_failure;
+
+	nlmsg_end(skb, nlh);
+
+	rtnl_notify(skb, net, 0, RTNLGRP_IPV6_MROUTE_R, NULL, GFP_ATOMIC);
+	return;
+
+nla_put_failure:
+
[PATCH net-next v3 2/4] rtnetlink: add restricted rtnl groups for ipv4 and ipv6 mroute
Add RTNLGRP_{IPV4,IPV6}_MROUTE_R as two new restricted groups for the
NETLINK_ROUTE family.
Binding to these groups specifically requires CAP_NET_ADMIN to allow
multicast of sensitive messages (e.g. mroute cache reports).

Suggested-by: Nikolay Aleksandrov
Signed-off-by: Julien Gomes
Signed-off-by: Nikolay Aleksandrov
---
 include/uapi/linux/rtnetlink.h |  4 ++++
 net/core/rtnetlink.c           | 13 +++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index cd1afb900929..d148505010a7 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -669,6 +669,10 @@ enum rtnetlink_groups {
 #define RTNLGRP_NSID		RTNLGRP_NSID
 	RTNLGRP_MPLS_NETCONF,
 #define RTNLGRP_MPLS_NETCONF	RTNLGRP_MPLS_NETCONF
+	RTNLGRP_IPV4_MROUTE_R,
+#define RTNLGRP_IPV4_MROUTE_R	RTNLGRP_IPV4_MROUTE_R
+	RTNLGRP_IPV6_MROUTE_R,
+#define RTNLGRP_IPV6_MROUTE_R	RTNLGRP_IPV6_MROUTE_R
 	__RTNLGRP_MAX
 };
 #define RTNLGRP_MAX	(__RTNLGRP_MAX - 1)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 3aa57848a895..4aefa5a2625f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4218,6 +4218,18 @@ static void rtnetlink_rcv(struct sk_buff *skb)
 	rtnl_unlock();
 }
 
+static int rtnetlink_bind(struct net *net, int group)
+{
+	switch (group) {
+	case RTNLGRP_IPV4_MROUTE_R:
+	case RTNLGRP_IPV6_MROUTE_R:
+		if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
+			return -EPERM;
+		break;
+	}
+	return 0;
+}
+
 static int rtnetlink_event(struct notifier_block *this, unsigned long event, void *ptr)
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
@@ -4252,6 +4264,7 @@ static int __net_init rtnetlink_net_init(struct net *net)
 		.input		= rtnetlink_rcv,
 		.cb_mutex	= &rtnl_mutex,
 		.flags		= NL_CFG_F_NONROOT_RECV,
+		.bind		= rtnetlink_bind,
 	};
 
 	sk = netlink_kernel_create(net, NETLINK_ROUTE, &cfg);
-- 
2.13.1
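The effect of the bind hook can be modelled in user space as below. This is only a sketch of the decision the patch describes: the SK_GRP_* values are made-up placeholders, not the real RTNLGRP_* numbers, and the boolean stands in for the ns_capable() check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Placeholder group ids; the real values come from enum rtnetlink_groups. */
enum sketch_group {
	SK_GRP_IPV4_MROUTE = 1,
	SK_GRP_IPV4_MROUTE_R,
	SK_GRP_IPV6_MROUTE_R,
};

/* Only the restricted groups consult the capability; others always bind. */
int sketch_bind(int group, bool has_net_admin)
{
	switch (group) {
	case SK_GRP_IPV4_MROUTE_R:
	case SK_GRP_IPV6_MROUTE_R:
		if (!has_net_admin)
			return -EPERM;
		break;
	}
	return 0;
}

/* Usage example: unprivileged listeners keep access to the old group. */
void sketch_bind_selfcheck(void)
{
	assert(sketch_bind(SK_GRP_IPV4_MROUTE, false) == 0);
	assert(sketch_bind(SK_GRP_IPV4_MROUTE_R, false) == -EPERM);
}
```

The point of the design is that existing unprivileged subscribers of the unrestricted groups are unaffected, while cache reports, which carry packet contents, only reach sockets that passed the capability check at bind time.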
Re: [PATCH v2] net/phy: micrel: configure intterupts after autoneg workaround
On Tue, Jun 20, 2017 at 12:48:11PM -0500, Zach Brown wrote:
> The commit ("net/phy: micrel: Add workaround for bad autoneg") fixes an
> autoneg failure case by resetting the hardware. This turns off
> interrupts. Things will work themselves out if the phy polls, as it will
> figure out its state during a poll. However if the phy uses only
> interrupts, the phy will stall, since interrupts are off. This patch
> fixes the issue by calling config_intr after resetting the phy.
>
> Fixes: d2fd719bcb0e ("net/phy: micrel: Add workaround for bad autoneg ")
> Signed-off-by: Zach Brown

Reviewed-by: Andrew Lunn

    Andrew
Re: Repeatable inet6_dump_fib crash in stock 4.12.0-rc4+
On 06/20/2017 11:05 AM, Michal Kubecek wrote:
> On Tue, Jun 20, 2017 at 07:12:27AM -0700, Ben Greear wrote:
>> On 06/14/2017 03:25 PM, David Ahern wrote:
>>> On 6/14/17 4:23 PM, Ben Greear wrote:
>>>> On 06/13/2017 07:27 PM, David Ahern wrote:
>>>>> Let's try a targeted debug patch. See attached
>>>>
>>>> I had to change it to pr_err so it would go to our serial console
>>>> since the system locked hard on crash, and that appears to be enough
>>>> to change the timing where we can no longer reproduce the problem.
>>>
>>> ok, let's figure out which one is doing that. There are 3 debug
>>> statements. I suspect fib6_del_route is the one setting the state to
>>> FWS_U. Can you remove the debug prints in fib6_repair_tree and
>>> fib6_walk_continue and try again?
>>
>> We cannot reproduce with just that one printf in the kernel either. It
>> must change the timing too much to trigger the bug.
>
> You might try trace_printk() which should have less impact (don't forget
> to enable /proc/sys/kernel/ftrace_dump_on_oops).

We cannot reproduce with trace_printk() either.

Thanks,
Ben

> Michal Kubecek

-- 
Ben Greear
Candela Technologies Inc  http://www.candelatech.com
Re: [PATCH net-next 0/1] Introduction of the tc tests
From: Cong Wang
Date: Mon, 19 Jun 2017 21:13:31 -0700

> I thought tools/testing/selftests/ is mainly for those tests close to
> kernel ABI and API. What is the criteria for these tests? If any test
> can fit in, we somehow would merge the whole LTP...
>
> I definitely don't object more tests, I am just wondering if we should
> put it to tools/testing/selftests/ or host it somewhere else.

I want it in the kernel so that it gets the level of exposure, care,
review, and use that the kernel does.

If you put it somewhere else it gets at least 10 times less of all of
those attributes.
Re: [PATCH 00/51] rtc: stop using rtc deprecated functions
2017-06-20 15:48 GMT+02:00 Alexandre Belloni:
> On 20/06/2017 at 15:44:58 +0200, Pavel Machek wrote:
>> On Tue 2017-06-20 13:37:22, Steve Twiss wrote:
>> > Hi Pavel,
>> >
>> > On 20 June 2017 14:26, Pavel Machek wrote:
>> >
>> > > Subject: Re: [PATCH 00/51] rtc: stop using rtc deprecated functions
>> > >
>> > > On Tue 2017-06-20 14:24:00, Alexandre Belloni wrote:
>> > > > On 20/06/2017 at 14:10:11 +0200, Pavel Machek wrote:
>> > > > > On Tue 2017-06-20 12:03:48, Alexandre Belloni wrote:
>> > > > > > On 20/06/2017 at 11:35:08 +0200, Benjamin Gaignard wrote:
>> > > > > > > rtc_time_to_tm() and rtc_tm_to_time() are deprecated because they
>> > > > > > > rely on 32bits variables and that will make rtc break in
>> > > > > > > y2038/2016.
>> > > > > >
>> > > > > > Please don't, because this hide the fact that the hardware will not
>> > > > > > handle dates in y2038 anyway and as pointed by Russell a few month
>> > > > > > ago,
>> > > > > > rtc_time_to_tm will be able to catch it but the 64 bit version will
>> > > > > > silently ignore it.
>> > > > >
>> > > > > Reference? Because rtc on PCs stores date in binary coded decimal, so
>> > > > > it is likely to break in 2100, not 2038...
>> > > >
>> > > > I'm not saying it should be done but clearly, that is not the correct
>> > > > thing to do for RTCs that are using a single 32 bits register to store
>> > > > the time.
>> > > > You give one example, I can give you three: armada38x, at91sam9,
>> > > > at32ap700x and that just in the beginning of the series.
>> > >
>> > > I wanted reference to Russell's mail.
>> >
>> > This is it.
>> > https://patchwork.kernel.org/patch/6219401/
>>
>> Thanks.
>>
>> Yes, that's argument against changing rtc _drivers_ for hardware that
>> can not do better than 32bit. For generic code (such as 44/51 sysfs,
>> 51/51 suspend test), the change still makes sense.

What I had in mind when writing those patches was to remove the
limitations coming from the use of those functions, even more so since
they have been marked as deprecated.
I agree that this changes nothing about the hardware limitation, but at
least the limit will not come from the framework.

>
> Yes, we agree on that but I won't cherry pick working patches from a 51
> patches series.

Maybe only the acked ones?

> --
> Alexandre Belloni, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com
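To make the hardware side of this argument concrete, here is a small user-space sketch. It assumes a hypothetical RTC with a single unsigned 32-bit seconds register; the sketch_* names are made up and are not kernel or driver API. It shows what an explicit range check buys compared with writing a 64-bit time value to such a register unchecked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when a 64-bit seconds-since-epoch value fits the 32-bit register. */
bool sketch_fits_u32_rtc(int64_t secs)
{
	return secs >= 0 && secs <= (int64_t)UINT32_MAX;
}

/* What ends up in the register if the value is written without a check. */
uint32_t sketch_truncate(int64_t secs)
{
	return (uint32_t)secs;
}

/* Usage example: one second past the 32-bit range silently becomes 0. */
void sketch_rtc_selfcheck(void)
{
	assert(!sketch_fits_u32_rtc((int64_t)UINT32_MAX + 1));
	assert(sketch_truncate((int64_t)UINT32_MAX + 1) == 0);
}
```

With a 64-bit time value in hand, a driver for such hardware would have to perform a check like sketch_fits_u32_rtc() itself, otherwise the wrap goes unnoticed. That is the concern raised above about converting drivers whose hardware cannot do better than 32 bits.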
Re: [PATCH net-next] sctp: uncork the old asoc before changing to the new one
On Tue, Jun 20, 2017 at 04:01:55PM +0800, Xin Long wrote:
> local_cork is used to decide if it should uncork asoc outq after processing
> some cmds, and it is set when replying or sending msgs. local_cork should
> always have the same value with current asoc q->cork in some way.
>
> The thing is when changing to a new asoc by cmd SET_ASOC, local_cork may
> not be consistent with the current asoc any more. The cmd seqs can be:
>
>   SCTP_CMD_UPDATE_ASSOC (asoc)
>   SCTP_CMD_REPLY (asoc)
>   SCTP_CMD_SET_ASOC (new_asoc)
>   SCTP_CMD_DELETE_TCB (new_asoc)
>   SCTP_CMD_SET_ASOC (asoc)
>   SCTP_CMD_REPLY (asoc)
>
> The 1st REPLY makes OLD asoc q->cork and local_cork both are 1, and the cmd
> DELETE_TCB clears NEW asoc q->cork and local_cork. After asoc goes back to
> OLD asoc, q->cork is still 1 while local_cork is 0. The 2nd REPLY will not
> set local_cork because q->cork is already set and it can't be uncorked and
> sent out because of this.
>
> To keep local_cork consistent with the current asoc q->cork, this patch is
> to uncork the old asoc if local_cork is set before changing to the new one.
>
> Note that the above cmd seqs will be used in the next patch when updating
> asoc and handling errors in it.
>
> Suggested-by: Marcelo Ricardo Leitner

Acked-by: Marcelo Ricardo Leitner

> Signed-off-by: Xin Long
> ---
>  net/sctp/sm_sideeffect.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
> index 25384fa..7623566 100644
> --- a/net/sctp/sm_sideeffect.c
> +++ b/net/sctp/sm_sideeffect.c
> @@ -1748,6 +1748,10 @@ static int sctp_cmd_interpreter(sctp_event_t event_type,
>  			break;
>  
>  		case SCTP_CMD_SET_ASOC:
> +			if (asoc && local_cork) {
> +				sctp_outq_uncork(&asoc->outqueue, gfp);
> +				local_cork = 0;
> +			}
>  			asoc = cmd->obj.asoc;
>  			break;
>  
> -- 
> 2.1.0
>
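The command sequence in the commit message can be replayed with a toy model. The sketch below is written only from the description above: the sk_* types and functions are hypothetical, the cork flag stands in for asoc->outqueue.cork, and UPDATE_ASSOC is left out because it does not touch the cork state in this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_asoc { bool cork; };		/* stands in for q->cork */

struct sk_interp {
	struct sk_asoc *asoc;		/* association currently operated on */
	bool local_cork;
	bool fixed;			/* apply the SET_ASOC change or not */
};

/* Uncork the current asoc if this interpreter run corked it. */
static void sk_uncork_if_local(struct sk_interp *it)
{
	if (it->asoc && it->local_cork) {
		it->asoc->cork = false;
		it->local_cork = false;
	}
}

void sk_reply(struct sk_interp *it)
{
	if (!it->asoc->cork) {		/* cork only if nobody did already */
		it->asoc->cork = true;
		it->local_cork = true;
	}
}

void sk_set_asoc(struct sk_interp *it, struct sk_asoc *next)
{
	if (it->fixed)			/* the patch: uncork before switching */
		sk_uncork_if_local(it);
	it->asoc = next;
}

void sk_delete_tcb(struct sk_interp *it)
{
	sk_uncork_if_local(it);
	it->asoc = NULL;
}

/* Replays the sequence; returns true if the old asoc is left corked. */
bool sk_old_asoc_stuck(bool fixed)
{
	struct sk_asoc old_asoc = { false }, new_asoc = { false };
	struct sk_interp it = { &old_asoc, false, fixed };

	sk_reply(&it);			/* REPLY (asoc) */
	sk_set_asoc(&it, &new_asoc);	/* SET_ASOC (new_asoc) */
	sk_delete_tcb(&it);		/* DELETE_TCB (new_asoc) */
	sk_set_asoc(&it, &old_asoc);	/* SET_ASOC (asoc) */
	sk_reply(&it);			/* REPLY (asoc) */
	sk_uncork_if_local(&it);	/* end of the interpreter run */
	return old_asoc.cork;
}

/* Usage example. */
void sk_cork_selfcheck(void)
{
	assert(sk_old_asoc_stuck(false));
	assert(!sk_old_asoc_stuck(true));
}
```

Without the change the old association stays corked after the run, because DELETE_TCB consumed local_cork while operating on the new association. With the change the old queue is uncorked before the switch, and the second REPLY corks and uncorks it normally.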
Re: [PATCH net-next 00/12] nfp: add flower app with representors
On Tue, Jun 20, 2017 at 8:51 AM, Simon Horman wrote:
> this series adds a flower app to the NFP driver.
> It initialises four types of netdevs:
>
> * PF netdev - lower-device for communication of packets to device
> * PF representor netdev
> * VF representor netdevs
> * Phys port representor netdevs
>
> The PF netdev acts as a lower-device which sends and receives packets to
> and from the firmware. The representors act as upper-devices. For TX
> representors attach a metadata dst to the skb which is used by the PF
> netdev to prepend metadata to the packet before forwarding the firmware. On
> RX the PF netdev looks up the representor based on the prepended metadata
> received from the firmware and forwards the skb to the representor after
> removing the metadata.

Hi Simon, Jakub

Good to have more VF representors around...

> Control queues are used to send and receive control messages which are
> used to communicate configuration information with the firmware. These
> are in separate vNIC to the queues belonging to the PF netdev. The control
> queues are not exposed to user-space via a netdev or any other means.

Do you have documentation for the control channel, or should I look at
earlier commits? Are the control messages you describe here also the
ones that are used to load/unload a specific app?

> As the name implies this app is targeted at providing offload of TC flower.
> That will be added by follow-up work. This patchset focuses on adding phys
> port and VF representor netdevs to which flower classifiers may be attached.

I guess you want to have a switch ID so that if someone looks at the reps
(ip -d) they can realize they all belong to the same e-switch; we are
using the switchdev attribute for that matter.

A few nits from building with a static checker below.

Or.

drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:78:1: warning: symbol 'nfp_repr_phy_port_get_stats64' was not declared. Should it be static?
drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:98:1: warning: symbol 'nfp_repr_vf_get_stats64' was not declared. Should it be static?
drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:118:1: warning: symbol 'nfp_repr_pf_get_stats64' was not declared. Should it be static?
drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:262:40: warning: incorrect type in assignment (different base types)
drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:262:40:    expected unsigned int [unsigned] [usertype] port_id
drivers/net/ethernet/netronome/nfp/nfp_net_repr.c:262:40:    got restricted __be32 [usertype]
drivers/net/ethernet/netronome/nfp/flower/main.c:116:19: warning: cast to restricted __be32
Re: [PATCH v4 net-next 0/7] qed*: RDMA and infrastructure for iWARP
From: Yuval Mintz
Date: Tue, 20 Jun 2017 15:59:59 +0300

> Please consider applying this series to `net-next'.

Series applied, thanks.
Re: fib_rules: Resolve goto rules target on delete
From: Serhey Popovych
Date: Fri, 16 Jun 2017 15:44:47 +0300

> We should avoid marking goto rules unresolved when their
> target is actually reachable after rule deletion.
>
> Consider following sample scenario:
>
>   # ip -4 ru sh
>   0:      from all lookup local
>   32000:  from all goto 32100
>   32100:  from all lookup main
>   32100:  from all lookup default
>   32766:  from all lookup main
>   32767:  from all lookup default
>
>   # ip -4 ru del pref 32100 table main
>   # ip -4 ru sh
>   0:      from all lookup local
>   32000:  from all goto 32100 [unresolved]
>   32100:  from all lookup default
>   32766:  from all lookup main
>   32767:  from all lookup default
>
> After removal of first rule with preference 32100 we
> mark all goto rules as unreachable, even when rule with
> same preference as removed one still present.
>
> Check if next rule with same preference is available
> and make all rules with goto action pointing to it.
>
> Signed-off-by: Serhey Popovych

Applied, thanks.

It would be awesome if you could distill the above into a test case
that could be run under tools/testing/selftests/networking.

Thanks!
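Until there is a proper selftest, the intended behaviour can be illustrated with a toy model of the rule list. This is a user-space sketch written from the commit message only: struct sk_rule and the sk_* functions are hypothetical and do not reflect the actual fib_rules data structures.

```c
#include <assert.h>
#include <stddef.h>

struct sk_rule {
	unsigned int pref;
	unsigned int goto_pref;		/* 0: not a goto rule */
	struct sk_rule *ctarget;	/* resolved target, NULL if unresolved */
	int deleted;
};

/* Delete rules[idx] from a pref-sorted array and re-resolve goto rules. */
void sk_delete_rule(struct sk_rule *rules, size_t n, size_t idx)
{
	struct sk_rule *victim = &rules[idx];
	struct sk_rule *next = NULL;
	size_t i;

	victim->deleted = 1;

	/* The next live rule is a valid new target only if pref matches. */
	for (i = idx + 1; i < n; i++) {
		if (rules[i].deleted)
			continue;
		if (rules[i].pref == victim->pref)
			next = &rules[i];
		break;
	}

	for (i = 0; i < n; i++)
		if (!rules[i].deleted && rules[i].ctarget == victim)
			rules[i].ctarget = next;
}

/* Scenario from the commit message; deletes 1 or 2 of the pref-32100 rules.
 * Returns 1 if "goto 32100" is still resolved afterwards. */
int sk_demo_goto_resolved(int deletions)
{
	struct sk_rule r[] = {
		{ 0,     0,     NULL, 0 },	/* lookup local */
		{ 32000, 32100, NULL, 0 },	/* goto 32100 */
		{ 32100, 0,     NULL, 0 },	/* lookup main */
		{ 32100, 0,     NULL, 0 },	/* lookup default */
		{ 32766, 0,     NULL, 0 },	/* lookup main */
		{ 32767, 0,     NULL, 0 },	/* lookup default */
	};

	r[1].ctarget = &r[2];
	sk_delete_rule(r, 6, 2);
	if (deletions > 1)
		sk_delete_rule(r, 6, 3);
	return r[1].ctarget != NULL;
}

/* Usage example. */
void sk_goto_selfcheck(void)
{
	assert(sk_demo_goto_resolved(1) == 1);
	assert(sk_demo_goto_resolved(2) == 0);
}
```

After the first deletion the goto rule is re-pointed at the remaining rule with preference 32100. Only when that one is removed as well does the goto become unresolved.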
Re: [PATCH] veth: Be more robust on network device creation when no attributes
From: Serhey Popovych
Date: Fri, 16 Jun 2017 18:05:00 +0300

> There are number of problems with configuration peer
> network device in absence of IFLA_VETH_PEER attributes
> where attributes for main network device shared with
> peer.
>
> First it is not feasible to configure both network
> devices with same MAC address since this makes
> communication in such configuration problematic.
>
> This case can be reproduced with following sequence:
>
>   # ip link add address 02:11:22:33:44:55 type veth
>   # ip li sh
>   ...
>   26: veth0@veth1: mtu 1500 qdisc \
>   noop state DOWN mode DEFAULT qlen 1000
>   link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
>   27: veth1@veth0: mtu 1500 qdisc \
>   noop state DOWN mode DEFAULT qlen 1000
>   link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
>
> Second it is not possible to register main network device
> with given name and automatically create peer network
> device name. That happens because IFLA_IFNAME used when
> creating main and reused when creating peer.
>
> This case can be reproduced with following sequence:
>
>   # ip link add dev veth1a type veth
>   RTNETLINK answers: File exists
>
> To fix both of the cases check if corresponding netlink
> attributes are taken from peer_tb when valid or
> name based on rtnl ops kind and random address is used.
>
> Signed-off-by: Serhey Popovych

This does not apply cleanly to the 'net' tree, please respin.
Re: dev: Reclaim network device indexes
From: Serhey Popovych
Date: Fri, 16 Jun 2017 19:39:34 +0300

> While making dev_new_index() return zero on overrun prevents
> from infinite loop, there is no way to recovery mechanisms
> since namespace ifindex only increases and never reused
> from released network devices.
>
> To address this we introduce dev_free_index() helper which
> is used to reclaim released network device index when it is
> smaller than last allocated index in namespace.
>
> This also has positive side effect for equal distribution
> of network devices per buckets in index hash table. That
> positively affects performance of dev_get_by_index() family.
>
> Signed-off-by: Serhey Popovych

You haven't explained why the current behavior is undesirable, and
why we want reuse.

I don't think we want at all for anything to rely on ifindexes being
allocated one way or another.

Yes, I understand that hash table argument, but that can be solved
other ways. And what you're talking about is more of an error
case not a normal case.
Re: [PATCH net-next] sctp: handle errors when updating asoc
On Tue, Jun 20, 2017 at 04:05:11PM +0800, Xin Long wrote:
> It's a bad thing not to handle errors when updating asoc. The memory
> allocation failure in any of the functions called in sctp_assoc_update()
> would cause sctp to work unexpectedly.
>
> This patch is to fix it by aborting the asoc and reporting the error when
> any of these functions fails.
>
> Signed-off-by: Xin Long

Acked-by: Marcelo Ricardo Leitner

> ---
>  include/net/sctp/structs.h |  4 ++--
>  net/sctp/associola.c       | 25 ++---
>  net/sctp/sm_sideeffect.c   | 24 +++-
>  3 files changed, 39 insertions(+), 14 deletions(-)
>
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 5051317..e26763b 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -1953,8 +1953,8 @@ struct sctp_transport *sctp_assoc_is_match(struct sctp_association *,
>  					   const union sctp_addr *,
>  					   const union sctp_addr *);
>  void sctp_assoc_migrate(struct sctp_association *, struct sock *);
> -void sctp_assoc_update(struct sctp_association *old,
> -		       struct sctp_association *new);
> +int sctp_assoc_update(struct sctp_association *old,
> +		      struct sctp_association *new);
>  
>  __u32 sctp_association_get_next_tsn(struct sctp_association *);
>  
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index 72b07dd..757be41 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -1112,8 +1112,8 @@ void sctp_assoc_migrate(struct sctp_association *assoc, struct sock *newsk)
>  }
>  
>  /* Update an association (possibly from unexpected COOKIE-ECHO processing). */
> -void sctp_assoc_update(struct sctp_association *asoc,
> -		       struct sctp_association *new)
> +int sctp_assoc_update(struct sctp_association *asoc,
> +		      struct sctp_association *new)
>  {
>  	struct sctp_transport *trans;
>  	struct list_head *pos, *temp;
> @@ -1124,8 +1124,10 @@ void sctp_assoc_update(struct sctp_association *asoc,
>  	asoc->peer.sack_needed = new->peer.sack_needed;
>  	asoc->peer.auth_capable = new->peer.auth_capable;
>  	asoc->peer.i = new->peer.i;
> -	sctp_tsnmap_init(&asoc->peer.tsn_map, SCTP_TSN_MAP_INITIAL,
> -			 asoc->peer.i.initial_tsn, GFP_ATOMIC);
> +
> +	if (!sctp_tsnmap_init(&asoc->peer.tsn_map, SCTP_TSN_MAP_INITIAL,
> +			      asoc->peer.i.initial_tsn, GFP_ATOMIC))
> +		return -ENOMEM;
>  
>  	/* Remove any peer addresses not present in the new association. */
>  	list_for_each_safe(pos, temp, &asoc->peer.transport_addr_list) {
> @@ -1169,11 +1171,11 @@ void sctp_assoc_update(struct sctp_association *asoc,
>  	} else {
>  		/* Add any peer addresses from the new association. */
>  		list_for_each_entry(trans, &new->peer.transport_addr_list,
> -				transports) {
> -			if (!sctp_assoc_lookup_paddr(asoc, &trans->ipaddr))
> -				sctp_assoc_add_peer(asoc, &trans->ipaddr,
> -						    GFP_ATOMIC, trans->state);
> -		}
> +				    transports)
> +			if (!sctp_assoc_lookup_paddr(asoc, &trans->ipaddr) &&
> +			    !sctp_assoc_add_peer(asoc, &trans->ipaddr,
> +						 GFP_ATOMIC, trans->state))
> +				return -ENOMEM;
>  
>  		asoc->ctsn_ack_point = asoc->next_tsn - 1;
>  		asoc->adv_peer_ack_point = asoc->ctsn_ack_point;
> @@ -1182,7 +1184,8 @@ void sctp_assoc_update(struct sctp_association *asoc,
>  			sctp_stream_update(&asoc->stream, &new->stream);
>  
>  		/* get a new assoc id if we don't have one yet. */
> -		sctp_assoc_set_id(asoc, GFP_ATOMIC);
> +		if (sctp_assoc_set_id(asoc, GFP_ATOMIC))
> +			return -ENOMEM;
>  	}
>  
>  	/* SCTP-AUTH: Save the peer parameters from the new associations
> @@ -1200,7 +1203,7 @@ void sctp_assoc_update(struct sctp_association *asoc,
>  	asoc->peer.peer_hmacs = new->peer.peer_hmacs;
>  	new->peer.peer_hmacs = NULL;
>  
> -	sctp_auth_asoc_init_active_key(asoc, GFP_ATOMIC);
> +	return sctp_auth_asoc_init_active_key(asoc, GFP_ATOMIC);
>  }
>  
>  /* Update the retran path for sending a retransmitted packet.
> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
> index 7623566..dfe1fcb 100644
> --- a/net/sctp/sm_sideeffect.c
> +++ b/net/sctp/sm_sideeffect.c
> @@ -818,6 +818,28 @@ static void sctp_cmd_setup_t2(sctp_cmd_seq_t *cmds,
>  	asoc->timeouts[SCTP_EVENT_TIMEOUT_T2_SHUTDOWN] = t->rto;
>  }
>  
> +static void sctp_cmd_assoc_update(sctp_cmd_seq_t *cmds,
> +				  struct sctp_association *asoc,
> +
Re: [RFC 1/2] net-next: fix DSA flow_disection
On 06/20/2017 07:01 AM, Andrew Lunn wrote:
> On Tue, Jun 20, 2017 at 10:06:54AM +0200, John Crispin wrote:
>> RPS and probably other kernel features are currently broken on some if not
>> all DSA devices. The root cause of this that skb_hash will call the
>> flow_disector.
>
> Hi John
>
> What is the call path when the flow_disector is called? I'm wondering
> if we can defer this, and call it later, after the tag code has
> removed the header.

Would not you usually want to configure RPS at the DSA network device
level where the switch tag has already been popped and you are
processing a regular Ethernet frame at that point?
-- 
Florian
Re: [PATCH net-next 0/3] more skb_put_[data:zero] related work
From: yuan linyu
Date: Sun, 18 Jun 2017 22:41:54 +0800

> yuan linyu (3):
>   net: introduce __skb_put_[zero, data, u8]
>   net: replace more place to skb_put_[data:zero]
>   net: manual clean code which call skb_put_[data:zero]

Series applied, thanks.
[PATCH net-next] ibmvnic: Correct return code checking for ibmvnic_init during probe
Fixes: 6a2fb0e99f9c (ibmvnic: driver initialization for kdump/kexec)

The update to ibmvnic_init to allow an EAGAIN return code broke
the calling of ibmvnic_init from ibmvnic_probe. The code now
will return from this point in the probe routine if anything
other than EAGAIN is returned. The check should be to see if rc
is non-zero and not equal to EAGAIN.

Without this fix, the vNIC driver can return 0 (success) from
its probe routine due to ibmvnic_init returning zero, but before
completing the probe process and registering with the netdev layer.

Signed-off-by: Nathan Fontenot
---
 drivers/net/ethernet/ibm/ibmvnic.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 722daf5..4e17217 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -3737,7 +3737,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id)
 
 	do {
 		rc = ibmvnic_init(adapter);
-		if (rc != EAGAIN) {
+		if (rc && rc != EAGAIN) {
 			free_netdev(netdev);
 			return rc;
 		}
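The difference between the two checks is easiest to see by classifying the return codes side by side. The sketch below is a user-space illustration only: the enum and function names are made up, and it assumes the surrounding loop retries while rc equals EAGAIN, which the hunk above does not show.

```c
#include <assert.h>
#include <errno.h>

enum sk_probe_action {
	SK_PROBE_CONTINUE,	/* init succeeded, carry on with the probe */
	SK_PROBE_RETRY,		/* init asked to be called again */
	SK_PROBE_RETURN_EARLY,	/* free the netdev and return rc */
};

/* Old check: everything except EAGAIN leaves the probe, including rc == 0. */
enum sk_probe_action sk_classify_old(int rc)
{
	if (rc != EAGAIN)
		return SK_PROBE_RETURN_EARLY;
	return SK_PROBE_RETRY;
}

/* Fixed check: only a real error leaves the probe early. */
enum sk_probe_action sk_classify_fixed(int rc)
{
	if (rc && rc != EAGAIN)
		return SK_PROBE_RETURN_EARLY;
	return rc == EAGAIN ? SK_PROBE_RETRY : SK_PROBE_CONTINUE;
}

/* Usage example: success (0) must not take the early-return path. */
void sk_probe_selfcheck(void)
{
	assert(sk_classify_old(0) == SK_PROBE_RETURN_EARLY);
	assert(sk_classify_fixed(0) == SK_PROBE_CONTINUE);
}
```

With the old check a successful init takes the early-return path with rc == 0, so the caller sees success even though the netdev was freed and never registered.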
Re: [PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct
From: Jia-Ju Bai
Date: Mon, 19 Jun 2017 10:48:53 +0800

> The driver may sleep under a spin lock, and the function call path is:
> netxen_nic_pci_mem_access_direct (acquire the lock by spin_lock)
>   ioremap --> may sleep
>
> To fix it, the lock is released before "ioremap", and the lock is
> acquired again after this function.
>
> Signed-off-by: Jia-Ju Bai

This style of change you are making is really starting to be a
problem.

You can't just drop locks like this, especially without explaining
why it's ok, and why the mutual exclusion this code was trying to
achieve is still going to be OK afterwards.

In fact, I see zero analysis of the locking situation here, why it was
needed in the first place, and why your change is OK in that context.

Any locking change is delicate, and you must put the greatest of care
and consideration into it.

Just putting "unlock/lock" around the sleeping operation shows a very
low level of consideration for the implications of the change you are
making.

This isn't like making whitespace fixes, sorry...
Re: [PATCH net-next] ibmvnic: Return from ibmvnic_resume if not in VNIC_OPEN state
On 06/19/2017 11:27 AM, John Allen wrote:
> If the ibmvnic driver is not in the VNIC_OPEN state, return from
> ibmvnic_resume callback. If we are not in the VNIC_OPEN state, interrupts
> may not be initialized and directly calling the interrupt handler will
> cause a crash.
>
> Signed-off-by: John Allen

Reviewed-by: Nathan Fontenot

> ---
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 722daf5..0135095 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -3859,6 +3859,9 @@ static int ibmvnic_resume(struct device *dev)
>  	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
>  	int i;
>  
> +	if (adapter->state != VNIC_OPEN)
> +		return 0;
> +
>  	/* kick the interrupt handlers just in case we lost an interrupt */
>  	for (i = 0; i < adapter->req_rx_queues; i++)
>  		ibmvnic_interrupt_rx(adapter->rx_scrq[i]->irq,
>
Re: [RFC 1/2] net-next: fix DSA flow_disection
On 20/06/17 16:01, Andrew Lunn wrote:
> On Tue, Jun 20, 2017 at 10:06:54AM +0200, John Crispin wrote:
>> RPS and probably other kernel features are currently broken on some if not
>> all DSA devices. The root cause of this that skb_hash will call the
>> flow_disector.
>
> Hi John
>
> What is the call path when the flow_disector is called? I'm wondering
> if we can defer this, and call it later, after the tag code has
> removed the header.
>
>     Andrew

Hi Andrew,

the ethernet driver receives the frame and passes it down the line.
Eventually it ends up inside netif_receive_skb_internal() where it gets
added to the backlog. At this point get_rps_cpu() is called. Inside
get_rps_cpu() the skb_get_hash() is called which utilizes the
flow_dissector() ... which is broken for DSA devices. get_rps_cpu() will
always return the same hash for all flows and the frame is always added
to the backlog on the same core.

Once inside the backlog it will traverse through the dsa layer and end
up inside the tag driver and be passed to the slave device for further
processing and keep its bad flow hash for its whole life cycle.

In theory we could reset the hash inside the tag driver but ideally the
whole life cycle of the frame should happen on the same core to avoid
possible reordering issues. In addition RPS is broken until the frame
reaches the tag driver. In the case of the mediatek mt7623 we only have
1 RX IRQ and in the worst case the RPS of the frame while still inside
ethX will happen on the same core as where we handle IRQs. This will
increase the IRQ latency and reduce the free cpu time, thus reducing
maximum throughput.

I did test resetting the hash inside the tag driver. Calculating the
correct hash from the start did yield a huge performance difference
however, at least on mt7623. We are talking about 30% extra max
throughput. This might not be such a big problem if the SoC has a multi
queue ethernet core but on mt7623 it does make a huge difference if we
can use RPS to delegate all frame processing away from the core handling
the IRQs.

    John
Re: [PATCH net-next v3 0/6] vxlan: cleanup and IPv6 link-local support
From: Matthias Schiffer
Date: Mon, 19 Jun 2017 10:03:54 +0200

> Running VXLANs over IPv6 link-local addresses allows to use them as a
> drop-in replacement for VLANs, avoiding to allocate additional outer IP
> addresses to run the VXLAN over.
>
> Since v1, I have added a lot more consistency checks to the address
> configuration, making sure address families and scopes match. To simplify
> the implementation, I also did some general refactoring of the
> configuration handling in the new first patch of the series.
>
> The second patch is more cleanup; is slightly touches OVS code, so that
> list is in CC this time, too.
>
> As in v1, the last two patches actually make VXLAN over IPv6 link-local
> work, and allow multiple VXLANs with the same VNI and port, as long as
> link-local addresses on different interfaces are used. As suggested, I now
> store in the flags field if the VXLAN uses link-local addresses or not.
>
> v3 removes log messages as suggested by Roopa Prabhu (as it is very unusual
> for errors in netlink requests to be printed to the kernel log.) The commit
> message of patch 5 has been extended to add a note about IPv4.

Series applied, thanks Matthias.