Re: [PATCH 8/8] Makefile: drop -D__CHECK_ENDIAN__ from cflags
On 12/15/2016 06:15 AM, Michael S. Tsirkin wrote:
> That's the default now, no need for makefiles to set it.
>
> Signed-off-by: Michael S. Tsirkin
> ---
[...]
>  drivers/net/can/Makefile | 1 -

For drivers/net/can/Makefile:

Acked-by: Marc Kleine-Budde

regards,
Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-     |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
Jason A. Donenfeld wrote:
>
> Siphash needs a random secret key, yes. The point is that the hash
> function remains secure so long as the secret key is kept secret.
> Other functions can't make the same guarantee, and so nervous periodic
> key rotation is necessary, but in most cases nothing is done, and so
> things just leak over time.

Actually those users that use rhashtable now have a much more
sophisticated defence against these attacks, dynamic rehashing
when bucket length exceeds a preset limit.

Cheers,
-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
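To make the "rehash when a bucket gets too long" idea concrete, here is a
minimal userspace sketch. It is not the rhashtable implementation: the
table size, the chain limit, the toy mixer and all names (table_insert,
table_rehash, bucket_of) are assumptions made up for illustration, and the
new seed comes from a deterministic step where real code would draw fresh
random bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS  8u	/* hypothetical fixed bucket count */
#define MAX_CHAIN 4u	/* hypothetical chain-length limit */

struct node {
	uint32_t key;
	struct node *next;
};

struct table {
	struct node *bucket[NBUCKETS];
	uint32_t seed;		/* secret mixed into every hash */
	unsigned int rehashes;	/* how many times the seed was replaced */
};

/* Toy seeded mixer; a stand-in, NOT a secure keyed hash like siphash. */
static uint32_t bucket_of(uint32_t key, uint32_t seed)
{
	uint32_t h = key ^ seed;

	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	return h % NBUCKETS;
}

/* Pick a new seed and relink every node under it. */
static void table_rehash(struct table *t)
{
	struct node *all = NULL, *n, *next;
	unsigned int i;

	/* Unlink all nodes into one temporary list. */
	for (i = 0; i < NBUCKETS; i++) {
		for (n = t->bucket[i]; n; n = next) {
			next = n->next;
			n->next = all;
			all = n;
		}
		t->bucket[i] = NULL;
	}
	/* Deterministic step for the sketch; real code would use random bytes. */
	t->seed = t->seed * 1664525u + 1013904223u;
	t->rehashes++;
	for (n = all; n; n = next) {
		uint32_t b = bucket_of(n->key, t->seed);

		next = n->next;
		n->next = t->bucket[b];
		t->bucket[b] = n;
	}
}

/* Insert a key; returns 0 on success, -1 if allocation fails. */
static int table_insert(struct table *t, uint32_t key)
{
	struct node *n, *it;
	unsigned int len = 0;
	uint32_t b;

	assert(t != NULL);
	n = malloc(sizeof(*n));
	if (!n)
		return -1;
	b = bucket_of(key, t->seed);
	n->key = key;
	n->next = t->bucket[b];
	t->bucket[b] = n;
	for (it = t->bucket[b]; it; it = it->next)
		len++;
	if (len > MAX_CHAIN)	/* chain too long: reseed once and relink */
		table_rehash(t);
	return 0;
}

/* Returns 1 if the key is present, 0 otherwise. */
static int table_contains(const struct table *t, uint32_t key)
{
	const struct node *n;

	for (n = t->bucket[bucket_of(key, t->seed)]; n; n = n->next)
		if (n->key == key)
			return 1;
	return 0;
}
```

The sketch reseeds at most once per insert and never grows the table, so
it only shows the trigger condition, not a complete defence.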
[RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)
Thanks Jason for the valuable feedback.
Here is the revised HFI VNIC patch series.

ChangeLog:
=========
v1 => v2:
a) Removed hfi_vnic bus, instead make hfi_vnic driver an 'ib client',
   as per feedback from Jason Gunthorpe.
b) Interface changes, data structure changes and variable name changes
   associated with (a).
c) Add hfi_ibdev abstraction to provide VNIC control operations to
   hfi_vnic client.
d) Minor fixes
e) Moved hfi_vnic driver from .../sw/intel/vnic/hfi_vnic to
   .../sw/intel/hfi_vnic.

v1: Initial post @ https://www.spinics.net/lists/linux-rdma/msg43158.html

Description:
Intel Omni-Path Host Fabric Interface (HFI) Virtual Network Interface
Controller (VNIC) feature supports Ethernet functionality over Omni-Path
fabric by encapsulating the Ethernet packets between HFI nodes.

The patterns of exchanges of Omni-Path encapsulated Ethernet packets
involve one or more virtual Ethernet switches overlaid on the Omni-Path
fabric topology. A subset of HFI nodes on the Omni-Path fabric are
permitted to exchange encapsulated Ethernet packets across a particular
virtual Ethernet switch. The virtual Ethernet switches are logical
abstractions achieved by configuring the HFI nodes on the fabric for
header generation and processing. In the simplest configuration all HFI
nodes across the fabric exchange encapsulated Ethernet packets over a
single virtual Ethernet switch. A virtual Ethernet switch is effectively
an independent Ethernet network. The configuration is performed by an
Ethernet Manager (EM) which is part of the trusted Fabric Manager (FM)
application. HFI nodes can have multiple VNICs each connected to a
different virtual Ethernet switch. The below diagram presents a case
of two virtual Ethernet switches with two HFI nodes.
[Diagram garbled in the archive: a Subnet/Ethernet Manager sits above two
Virtual Ethernet Switches, each with two VPORTs; below them, two HFI nodes
each host two VNICs, one attached to each switch.]

Intel HFI VNIC software design is presented in the below diagram.
HFI VNIC functionality has a HW dependent component and a HW
independent component.

The HW dependent VNIC functionality is part of the HFI1 driver. It
implements the callback functions to do various tasks, which include
adding and removing of VNIC ports, HW resource allocation for VNIC
functionality and actual transmission and reception of encapsulated
Ethernet packets over the fabric. Each VNIC port is addressed by the
HFI port number, and the VNIC port number on that HFI port.

The HFI VNIC module implements the HW independent VNIC functionality.
It consists of two parts. The VNIC Ethernet Management Agent (VEMA)
registers itself with IB core as an IB client and interfaces with the
IB MAD stack. It exchanges the management information with the Ethernet
Manager (EM) and the VNIC netdev. The VNIC netdev part interfaces with
the Linux network stack, thus providing standard Ethernet network
interfaces. It invokes HFI device's VNIC callback functions for HW
access. The VNIC netdev encapsulates the Ethernet packets with an
Omni-Path header before passing them to the HFI1 driver for transmission.
Similarly, it de-encapsulates the received Omni-Path packets before
passing them to the network stack. For each VNIC interface, the
information required for encapsulation is configured by EM via VEMA MAD
interface.

[Diagram truncated in the archive: it begins with an "IB MAD" block and a
"Linux Network Stack" block at the top.]
[RFC v2 06/10] IB/hfi-vnic: VNIC MAC table support
HFI VNIC MAC table contains the MAC address to DLID mappings
provided by the Ethernet manager. During transmission, the MAC table
provides the MAC address to DLID translation. Implement MAC table
using simple hash list. Also provide support to update/query the
MAC table by Ethernet manager.

Reviewed-by: Dennis Dalessandro
Reviewed-by: Ira Weiny
Signed-off-by: Niranjana Vishwanathapura
Signed-off-by: Sadanand Warrier
---
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c  | 236 +
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h          |  53 -
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_netdev.c |   4 +
 3 files changed, 292 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
index 3fdfb7b..e45cff8 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
@@ -104,6 +104,238 @@
 
 #define HFI_VNIC_SC_MASK 0x1f
 
+/*
+ * Using a simple hash table for mac table implementation with the last octet
+ * of mac address as a key.
+ */
+static void hfi_vnic_free_mac_tbl(struct hlist_head *mactbl)
+{
+	struct hfi_vnic_mac_tbl_node *node;
+	struct hlist_node *tmp;
+	int bkt;
+
+	if (!mactbl)
+		return;
+
+	vnic_hash_for_each_safe(mactbl, bkt, tmp, node, hlist) {
+		hash_del(&node->hlist);
+		kfree(node);
+	}
+	kfree(mactbl);
+}
+
+static struct hlist_head *hfi_vnic_alloc_mac_tbl(void)
+{
+	u32 size = sizeof(struct hlist_head) * HFI_VNIC_MAC_TBL_SIZE;
+	struct hlist_head *mactbl;
+
+	mactbl = kzalloc(size, GFP_KERNEL);
+	if (!mactbl)
+		return ERR_PTR(-ENOMEM);
+
+	vnic_hash_init(mactbl);
+	return mactbl;
+}
+
+/* hfi_vnic_release_mac_tbl - empty and free the mac table */
+void hfi_vnic_release_mac_tbl(struct hfi_vnic_adapter *adapter)
+{
+	struct hlist_head *mactbl;
+
+	mutex_lock(&adapter->mactbl_lock);
+	mactbl = rcu_access_pointer(adapter->mactbl);
+	rcu_assign_pointer(adapter->mactbl, NULL);
+	synchronize_rcu();
+	hfi_vnic_free_mac_tbl(mactbl);
+	mutex_unlock(&adapter->mactbl_lock);
+}
+
+/*
+ * hfi_vnic_query_mac_tbl - query the mac table for a section
+ *
+ * This function implements query of specific function of the mac table.
+ * The function also expects the requested range to be valid.
+ */
+void hfi_vnic_query_mac_tbl(struct hfi_vnic_adapter *adapter,
+			    struct hfi_veswport_mactable *tbl)
+{
+	struct hfi_vnic_mac_tbl_node *node;
+	struct hlist_head *mactbl;
+	int bkt;
+	u16 loffset, lnum_entries;
+
+	rcu_read_lock();
+	mactbl = rcu_dereference(adapter->mactbl);
+	if (!mactbl)
+		goto get_mac_done;
+
+	loffset = be16_to_cpu(tbl->offset);
+	lnum_entries = be16_to_cpu(tbl->num_entries);
+
+	vnic_hash_for_each(mactbl, bkt, node, hlist) {
+		struct __hfi_vnic_mactable_entry *nentry = &node->entry;
+		struct hfi_veswport_mactable_entry *entry;
+
+		if ((node->index < loffset) ||
+		    (node->index >= (loffset + lnum_entries)))
+			continue;
+
+		/* populate entry in the tbl corresponding to the index */
+		entry = &tbl->tbl_entries[node->index - loffset];
+		memcpy(entry->mac_addr, nentry->mac_addr,
+		       ARRAY_SIZE(entry->mac_addr));
+		memcpy(entry->mac_addr_mask, nentry->mac_addr_mask,
+		       ARRAY_SIZE(entry->mac_addr_mask));
+		entry->dlid_sd.dw = cpu_to_be32(nentry->dlid_sd.dw);
+	}
+	tbl->mac_tbl_digest = cpu_to_be32(adapter->info.vport.mac_tbl_digest);
+get_mac_done:
+	rcu_read_unlock();
+}
+
+/*
+ * hfi_vnic_update_mac_tbl - update mac table section
+ *
+ * This function updates the specified section of the mac table.
+ * The procedure includes following steps.
+ *  - Allocate a new mac (hash) table.
+ *  - Add the specified entries to the new table.
+ *    (except the ones that are requested to be deleted).
+ *  - Add all the other entries from the old mac table.
+ *  - If there is a failure, free the new table and return.
+ *  - Switch to the new table.
+ *  - Free the old table and return.
+ *
+ * The function also expects the requested range to be valid.
+ */
+int hfi_vnic_update_mac_tbl(struct hfi_vnic_adapter *adapter,
+			    struct hfi_veswport_mactable *tbl)
+{
+	struct hfi_vnic_mac_tbl_node *node, *new_node;
+	struct hlist_head *new_mactbl, *old_mactbl;
+	int i, bkt, rc = 0;
+	u8 key;
+	u16 loffset, lnum_entries;
+
+	mutex_lock(&adapter->mactbl_lock);
+	/* allocate new mac table */
+	new_mactbl = hfi_vnic_alloc_mac_tbl();
+	if (IS_ERR(new_mactbl)) {
+		mutex_unlock(&ada
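The update procedure described in the comment above (build a new table,
carry over the old entries, switch, free the old table) can be sketched in
plain userspace C as below. This is only an illustration under stated
assumptions, not the patch's code: there is no RCU or locking here, the
bucket count and all names (mac_tbl_update, mac_tbl_lookup, mac_key) are
hypothetical, the update handles a single entry instead of a section, and
a DLID of 0 is assumed to mean "no entry".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAC_LEN   6
#define TBL_SIZE 32u	/* hypothetical bucket count */

struct mac_node {
	uint8_t mac[MAC_LEN];
	uint32_t dlid;
	struct mac_node *next;
};

struct mac_tbl {
	struct mac_node *bucket[TBL_SIZE];
};

/* Bucket index derived from the last octet of the MAC address. */
static unsigned int mac_key(const uint8_t *mac)
{
	return mac[MAC_LEN - 1] % TBL_SIZE;
}

/* Free every node and the table itself; NULL is accepted. */
static void mac_tbl_free(struct mac_tbl *t)
{
	struct mac_node *n, *next;
	unsigned int i;

	if (!t)
		return;
	for (i = 0; i < TBL_SIZE; i++)
		for (n = t->bucket[i]; n; n = next) {
			next = n->next;
			free(n);
		}
	free(t);
}

static int mac_tbl_add(struct mac_tbl *t, const uint8_t *mac, uint32_t dlid)
{
	struct mac_node *n = malloc(sizeof(*n));
	unsigned int b = mac_key(mac);

	if (!n)
		return -1;
	memcpy(n->mac, mac, MAC_LEN);
	n->dlid = dlid;
	n->next = t->bucket[b];
	t->bucket[b] = n;
	return 0;
}

/* Returns the DLID for a MAC, or 0 when there is no entry. */
static uint32_t mac_tbl_lookup(const struct mac_tbl *t, const uint8_t *mac)
{
	const struct mac_node *n;

	if (!t)
		return 0;
	for (n = t->bucket[mac_key(mac)]; n; n = n->next)
		if (!memcmp(n->mac, mac, MAC_LEN))
			return n->dlid;
	return 0;
}

/* Add or replace one entry by building a new table and swapping it in. */
static int mac_tbl_update(struct mac_tbl **cur, const uint8_t *mac,
			  uint32_t dlid)
{
	struct mac_tbl *old, *new_tbl;
	const struct mac_node *n;
	unsigned int i;

	assert(cur != NULL);
	old = *cur;
	new_tbl = calloc(1, sizeof(*new_tbl));
	if (!new_tbl)
		return -1;
	if (mac_tbl_add(new_tbl, mac, dlid))
		goto fail;
	/* Carry over every old entry except the one being replaced. */
	for (i = 0; old && i < TBL_SIZE; i++)
		for (n = old->bucket[i]; n; n = n->next)
			if (memcmp(n->mac, mac, MAC_LEN) &&
			    mac_tbl_add(new_tbl, n->mac, n->dlid))
				goto fail;
	/* Kernel code would publish with rcu_assign_pointer() and wait. */
	*cur = new_tbl;
	mac_tbl_free(old);
	return 0;
fail:
	/* On failure the old table stays in place untouched. */
	mac_tbl_free(new_tbl);
	return -1;
}
```

Building the replacement off to the side means a failed allocation never
leaves readers with a half-updated table, which is the point of the
"switch to the new table" step.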
[RFC v2 07/10] IB/hfi-vnic: VNIC Ethernet Management Agent (VEMA) interface
HFI VNIC EMA interface functions are the management interfaces to the
HFI VNIC netdev. Add support to add and remove VNIC ports. Implement the
required GET/SET management interface functions and processing of new
management information. Add support to send trap notifications upon
various events like interface status change, unicast/multicast mac list
update and mac address change.

Reviewed-by: Dennis Dalessandro
Reviewed-by: Ira Weiny
Signed-off-by: Niranjana Vishwanathapura
Signed-off-by: Sadanand Warrier
Signed-off-by: Tanya K Jajodia
---
 drivers/infiniband/sw/intel/hfi_vnic/Makefile      |   3 +-
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h  |   4 +
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h          |  44 +++
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_netdev.c | 153 +++-
 .../sw/intel/hfi_vnic/hfi_vnic_vema_iface.c        | 432 +
 5 files changed, 633 insertions(+), 3 deletions(-)
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_vema_iface.c

diff --git a/drivers/infiniband/sw/intel/hfi_vnic/Makefile b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
index 8e3dca7..a0562af 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/Makefile
+++ b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
@@ -3,4 +3,5 @@
 #
 obj-$(CONFIG_HFI_VNIC) += hfi_vnic.o
 
-hfi_vnic-y := hfi_vnic_netdev.o hfi_vnic_encap.o hfi_vnic_ethtool.o
+hfi_vnic-y := hfi_vnic_netdev.o hfi_vnic_encap.o hfi_vnic_ethtool.o \
+	      hfi_vnic_vema_iface.o
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
index a6770ef..54e9081 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
@@ -99,6 +99,10 @@
 #define HFI_VNIC_STATE_DROP_ALL        0x1
 #define HFI_VNIC_STATE_FORWARDING      0x3
 
+/* VNIC Ethernet link status */
+#define HFI_VNIC_ETH_LINK_UP     1
+#define HFI_VNIC_ETH_LINK_DOWN   2
+
 /**
  * struct hfi_vesw_info - HFI vnic switch information
  * @fabric_id: 10-bit fabric id
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
index 6d5c5f8..7723a4e 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
@@ -243,6 +243,16 @@ struct __hfi_veswport_trap {
 } __packed;
 
 /**
+ * struct hfi_vnic_ctrl_port - HFI virtual NIC control port
+ * @ibdev: pointer to ib device
+ * @ops: hfi vnic control operations
+ */
+struct hfi_vnic_ctrl_port {
+	struct ib_device *ibdev;
+	struct hfi_vnic_ctrl_ops *ops;
+};
+
+/**
  * struct hfi_vnic_rx_queue - HFI VNIC receive queue
  * @idx: queue index
  * @adapter: netdev adapter
@@ -257,11 +267,15 @@ struct hfi_vnic_rx_queue {
 /**
  * struct hfi_vnic_adapter - HFI VNIC netdev private data structure
  * @netdev: pointer to associated netdev
+ * @cport: pointer to hfi vnic control port
  * @vport: pointer to hfi vnic port
  * @flags: flags indicating various states
  * @lock: adapter lock
  * @rxq: receive queue array
  * @info: virtual ethernet switch port information
+ * @vema_mac_addr: mac address configured by vema
+ * @umac_hash: unicast maclist hash
+ * @mmac_hash: multicast maclist hash
  * @mactbl: hash table of MAC entries
  * @mactbl_lock: mac table lock
  * @stats_lock: statistics lock
@@ -278,6 +292,7 @@ struct hfi_vnic_rx_queue {
  */
 struct hfi_vnic_adapter {
 	struct net_device *netdev;
+	struct hfi_vnic_ctrl_port *cport;
 	struct hfi_vnic_port *vport;
 
 	unsigned long flags;
@@ -287,6 +302,9 @@ struct hfi_vnic_adapter {
 	struct hfi_vnic_rx_queue rxq[HFI_VNIC_MAX_QUEUE];
 
 	struct __hfi_veswport_info info;
+	u8 vema_mac_addr[ETH_ALEN];
+	u32 umac_hash;
+	u32 mmac_hash;
 	struct hlist_head __rcu *mactbl;
 
 	/* Lock used to protect updates to mac table */
@@ -338,6 +356,11 @@ struct hfi_vnic_mac_tbl_node {
 #define v_warn(format, arg...) \
 	netdev_warn(adapter->netdev, format, ## arg)
 
+#define c_err(format, arg...) \
+	dev_err(&cport->ibdev->dev, format, ## arg)
+#define c_info(format, arg...) \
+	dev_info(&cport->ibdev->dev, format, ## arg)
+
 /* The maximum allowed entries in the mac table */
 #define HFI_VNIC_MAC_TBL_MAX_ENTRIES 2048
 /* Limit of smac entries in mac table */
@@ -377,12 +400,33 @@ struct hfi_vnic_adapter *hfi_vnic_add_netdev(struct hfi_vnic_port *vport,
 int hfi_vnic_encap_skb(struct hfi_vnic_adapter *adapter, struct sk_buff *skb);
 int hfi_vnic_decap_skb(struct hfi_vnic_rx_queue *rxq, struct sk_buff *skb);
 u8 hfi_vnic_calc_entropy(struct hfi_vnic_adapter *adapter, struct sk_buff *skb);
+void hfi_vnic_process_vema_config(struct hfi_vnic_adapter *adapter);
 void hfi_vnic_release_mac_tbl(struct hf
[RFC v2 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface
Create hfi_ibdev abstraction which hfi1_ibdev will extend.
Define HFI VNIC interface between hardware independent VNIC
functionality and the hardware dependent VNIC functionality.
Add VNIC control operations to add and remove VNIC devices,
to the hfi_ibdev structure.

Reviewed-by: Dennis Dalessandro
Reviewed-by: Ira Weiny
Signed-off-by: Niranjana Vishwanathapura
---
 drivers/infiniband/hw/hfi1/chip.c   |   2 +-
 drivers/infiniband/hw/hfi1/driver.c |  10 +-
 drivers/infiniband/hw/hfi1/hfi.h    |   2 +-
 drivers/infiniband/hw/hfi1/init.c   |   4 +-
 drivers/infiniband/hw/hfi1/intr.c   |   2 +-
 drivers/infiniband/hw/hfi1/mad.c    |   2 +-
 drivers/infiniband/hw/hfi1/qp.c     |  24 +++--
 drivers/infiniband/hw/hfi1/ruc.c    |   2 +-
 drivers/infiniband/hw/hfi1/sysfs.c  |  22 ++--
 drivers/infiniband/hw/hfi1/verbs.c  | 113 ++--
 drivers/infiniband/hw/hfi1/verbs.h  |   9 +-
 include/rdma/opa_hfi.h              | 199
 12 files changed, 298 insertions(+), 93 deletions(-)
 create mode 100644 include/rdma/opa_hfi.h

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 37d8af5..9263984 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -10452,7 +10452,7 @@ int set_link_state(struct hfi1_pportdata *ppd, u32 state)
 		sdma_all_running(dd);
 
 		/* Signal the IB layer that the port has went active */
-		event.device = &dd->verbs_dev.rdi.ibdev;
+		event.device = &dd->verbs_dev.hfidev.rdi.ibdev;
 		event.element.port_num = ppd->port;
 		event.event = IB_EVENT_PORT_ACTIVE;
 	}
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index d426116..e219c3b 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -163,7 +163,8 @@ const char *get_unit_name(int unit)
 
 const char *get_card_name(struct rvt_dev_info *rdi)
 {
-	struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev, rdi);
+	struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev,
+						hfidev.rdi);
 	struct hfi1_devdata *dd = container_of(ibdev,
 					       struct hfi1_devdata, verbs_dev);
 	return get_unit_name(dd->unit);
@@ -171,7 +172,8 @@ const char *get_card_name(struct rvt_dev_info *rdi)
 
 struct pci_dev *get_pci_dev(struct rvt_dev_info *rdi)
 {
-	struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev, rdi);
+	struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev,
+						hfidev.rdi);
 	struct hfi1_devdata *dd = container_of(ibdev,
 					       struct hfi1_devdata, verbs_dev);
 	return dd->pcidev;
@@ -281,7 +283,7 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
 	int lnh = be16_to_cpu(rhdr->lrh[0]) & 3;
 	struct hfi1_ibport *ibp = &ppd->ibport_data;
 	struct hfi1_devdata *dd = ppd->dd;
-	struct rvt_dev_info *rdi = &dd->verbs_dev.rdi;
+	struct rvt_dev_info *rdi = &dd->verbs_dev.hfidev.rdi;
 
 	if (packet->rhf & (RHF_VCRC_ERR | RHF_ICRC_ERR))
 		return;
@@ -600,7 +602,7 @@ static void __prescan_rxq(struct hfi1_packet *packet)
 		struct rvt_qp *qp;
 		struct ib_header *hdr;
 		struct ib_other_headers *ohdr;
-		struct rvt_dev_info *rdi = &dd->verbs_dev.rdi;
+		struct rvt_dev_info *rdi = &dd->verbs_dev.hfidev.rdi;
 		u64 rhf = rhf_to_cpu(rhf_addr);
 		u32 etype = rhf_rcv_type(rhf), qpn, bth1;
 		int is_ecn = 0;
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 4163596..1fc5b68 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1601,7 +1601,7 @@ static inline struct hfi1_pportdata *ppd_from_ibp(struct hfi1_ibport *ibp)
 
 static inline struct hfi1_ibdev *dev_from_rdi(struct rvt_dev_info *rdi)
 {
-	return container_of(rdi, struct hfi1_ibdev, rdi);
+	return container_of(rdi, struct hfi1_ibdev, hfidev.rdi);
 }
 
 static inline struct hfi1_ibport *to_iport(struct ib_device *ibdev, u8 port)
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 60db615..13f6862 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1020,7 +1020,7 @@ static void __hfi1_free_devdata(struct kobject *kobj)
 	free_percpu(dd->int_counter);
 	free_percpu(dd->rcv_limit);
 	free_percpu(dd->send_schedule);
-	rvt_dealloc_device(&dd->verbs_dev.rdi);
+	rvt_dealloc_device(&dd->verbs_dev.hfidev.rdi);
 }
 
 static struct kobj_type hfi1_devdata_type = {
@@ -1133,7 +113
[RFC v2 08/10] IB/hfi-vnic: VNIC Ethernet Management Agent (VEMA) function
HFI VEMA function interfaces with the Infiniband MAD stack to exchange the
management information packets with the Ethernet Manager (EM).
It interfaces with the HFI VNIC netdev function to SET/GET the management
information. The information exchanged with the EM includes class port
details, encapsulation configuration, various counters, unicast and
multicast MAC list and the MAC table. It also supports sending traps
to the EM.

Reviewed-by: Dennis Dalessandro
Reviewed-by: Ira Weiny
Signed-off-by: Sadanand Warrier
Signed-off-by: Niranjana Vishwanathapura
Signed-off-by: Tanya K Jajodia
Signed-off-by: Sudeep Dutt
---
 drivers/infiniband/sw/intel/hfi_vnic/Makefile      |    2 +-
 .../sw/intel/hfi_vnic/hfi_vnic_ethtool.c           |   12 +
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h          |   11 +
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_vema.c   | 1024
 .../sw/intel/hfi_vnic/hfi_vnic_vema_iface.c        |    4 +-
 5 files changed, 1050 insertions(+), 3 deletions(-)
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_vema.c

diff --git a/drivers/infiniband/sw/intel/hfi_vnic/Makefile b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
index a0562af..16c0830 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/Makefile
+++ b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
@@ -4,4 +4,4 @@
 obj-$(CONFIG_HFI_VNIC) += hfi_vnic.o
 
 hfi_vnic-y := hfi_vnic_netdev.o hfi_vnic_encap.o hfi_vnic_ethtool.o \
-	      hfi_vnic_vema_iface.o
+	      hfi_vnic_vema.o hfi_vnic_vema_iface.o
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
index 9289ab2..9c2ed37 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
@@ -130,6 +130,17 @@ struct vnic_stats {
 
 #define VNIC_STATS_LEN ARRAY_SIZE(vnic_gstrings_stats)
 
+/* vnic_get_drvinfo - get driver info */
+static void vnic_get_drvinfo(struct net_device *netdev,
+			     struct ethtool_drvinfo *drvinfo)
+{
+	strlcpy(drvinfo->driver, hfi_vnic_driver_name, sizeof(drvinfo->driver));
+	strlcpy(drvinfo->version, hfi_vnic_driver_version,
+		sizeof(drvinfo->version));
+	strlcpy(drvinfo->bus_info, dev_name(netdev->dev.parent),
+		sizeof(drvinfo->bus_info));
+}
+
 /* vnic_get_sset_count - get string set count */
 static int vnic_get_sset_count(struct net_device *netdev, int sset)
 {
@@ -183,6 +194,7 @@ static void vnic_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
 
 /* ethtool ops */
 static const struct ethtool_ops hfi_vnic_ethtool_ops = {
+	.get_drvinfo = vnic_get_drvinfo,
 	.get_link = ethtool_op_get_link,
 	.get_strings = vnic_get_strings,
 	.get_sset_count = vnic_get_sset_count,
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
index 7723a4e..b36bb76 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
@@ -246,10 +246,12 @@ struct __hfi_veswport_trap {
  * struct hfi_vnic_ctrl_port - HFI virtual NIC control port
  * @ibdev: pointer to ib device
  * @ops: hfi vnic control operations
+ * @num_ports: number of hfi ports
  */
 struct hfi_vnic_ctrl_port {
 	struct ib_device *ibdev;
 	struct hfi_vnic_ctrl_ops *ops;
+	u8 num_ports;
 };
 
 /**
@@ -280,6 +282,8 @@ struct hfi_vnic_rx_queue {
  * @mactbl_lock: mac table lock
  * @stats_lock: statistics lock
  * @flow_tbl: flow to default port redirection table
+ * @trap_timeout: trap timeout
+ * @trap_count: no. of traps allowed within timeout period
  * @q_sum_cntrs: per queue EM summary counters
  * @q_err_cntrs: per queue EM error counters
  * @q_rx_logic_errors: per queue rx logic (default) errors
@@ -314,6 +318,8 @@ struct hfi_vnic_adapter {
 	struct mutex stats_lock;
 
 	u8 flow_tbl[HFI_VNIC_FLOW_TBL_SIZE];
+	unsigned long trap_timeout;
+	u8 trap_count;
 
 	struct __hfi_vnic_summary_counters q_sum_cntrs[HFI_VNIC_MAX_QUEUE];
 	struct __hfi_vnic_error_counters q_err_cntrs[HFI_VNIC_MAX_QUEUE];
@@ -394,6 +400,9 @@ struct hfi_vnic_mac_tbl_node {
 		!obj && (bkt) < HFI_VNIC_MAC_TBL_SIZE; (bkt)++) \
 		hlist_for_each_entry(obj, &name[bkt], member)
 
+extern char hfi_vnic_driver_name[];
+extern const char hfi_vnic_driver_version[];
+
 struct hfi_vnic_adapter *hfi_vnic_add_netdev(struct hfi_vnic_port *vport,
 					     struct device *parent);
 void hfi_vnic_rem_netdev(struct hfi_vnic_port *vport);
@@ -428,5 +437,7 @@ struct hfi_vnic_adapter *hfi_vnic_add_vport(struct hfi_vnic_ctrl_port *cport,
 					    u8 port_num, u8 vport_num);
 void hfi_vnic_rem_vport(struct hfi_vnic_ada
[RFC v2 05/10] IB/hfi-vnic: VNIC statistics support
HFI VNIC driver statistics support maintains various counters including
standard netdev counters and the Ethernet manager defined counters.
Add the Ethtool hook to read the counters.

Reviewed-by: Dennis Dalessandro
Reviewed-by: Ira Weiny
Signed-off-by: Niranjana Vishwanathapura
---
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c  |  19 +-
 .../sw/intel/hfi_vnic/hfi_vnic_ethtool.c           | 131 +++
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h          |  84 +++
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_netdev.c | 260 -
 4 files changed, 486 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
index 093df67..3fdfb7b 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
@@ -209,8 +209,10 @@ int hfi_vnic_encap_skb(struct hfi_vnic_adapter *adapter, struct sk_buff *skb)
 	hdr->slid_high = info->vport.encap_slid >> 20;
 
 	dlid = hfi_vnic_get_dlid(adapter, skb, def_port);
-	if (unlikely(!dlid))
+	if (unlikely(!dlid)) {
+		adapter->q_err_cntrs[skb->queue_mapping].tx_dlid_zero++;
 		return -EFAULT;
+	}
 
 	hdr->dlid = dlid;
 	hdr->dlid_high = dlid >> 20;
@@ -233,6 +235,19 @@ int hfi_vnic_encap_skb(struct hfi_vnic_adapter *adapter, struct sk_buff *skb)
 /* hfi_vnic_decap_skb - strip OPA header from the skb (ethernet) packet */
 int hfi_vnic_decap_skb(struct hfi_vnic_rx_queue *rxq, struct sk_buff *skb)
 {
+	struct hfi_vnic_adapter *adapter = rxq->adapter;
+	int max_len = adapter->netdev->mtu + VLAN_ETH_HLEN;
+	int rc = -EFAULT;
+
 	skb_pull(skb, HFI_VNIC_HDR_LEN);
-	return 0;
+
+	/* Validate Packet length */
+	if (skb->len > max_len)
+		adapter->q_err_cntrs[rxq->idx].rx_oversize++;
+	else if (skb->len < ETH_ZLEN)
+		adapter->q_err_cntrs[rxq->idx].rx_runt++;
+	else
+		rc = 0;
+
+	return rc;
 }
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
index 0b4da5e..9289ab2 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
@@ -53,9 +53,140 @@
 
 #include "hfi_vnic_internal.h"
 
+enum {NETDEV_STATS, VNIC_STATS};
+
+struct vnic_stats {
+	char stat_string[ETH_GSTRING_LEN];
+	struct {
+		int type;
+		int sizeof_stat;
+		int stat_offset;
+	};
+};
+
+#define VNIC_STAT(m)            { VNIC_STATS, \
+				  FIELD_SIZEOF(struct hfi_vnic_adapter, m), \
+				  offsetof(struct hfi_vnic_adapter, m) }
+#define VNIC_NETDEV_STAT(m)     { NETDEV_STATS, \
+				  FIELD_SIZEOF(struct net_device, m), \
+				  offsetof(struct net_device, m) }
+
+static struct vnic_stats vnic_gstrings_stats[] = {
+	/* NETDEV stats */
+	{"rx_packets", VNIC_NETDEV_STAT(stats.rx_packets)},
+	{"tx_packets", VNIC_NETDEV_STAT(stats.tx_packets)},
+	{"rx_bytes", VNIC_NETDEV_STAT(stats.rx_bytes)},
+	{"tx_bytes", VNIC_NETDEV_STAT(stats.tx_bytes)},
+	{"rx_errors", VNIC_NETDEV_STAT(stats.rx_errors)},
+	{"tx_errors", VNIC_NETDEV_STAT(stats.tx_errors)},
+	{"rx_dropped", VNIC_NETDEV_STAT(stats.rx_dropped)},
+	{"tx_dropped", VNIC_NETDEV_STAT(stats.tx_dropped)},
+
+	{"rx_fifo_errors", VNIC_NETDEV_STAT(stats.rx_fifo_errors)},
+	{"rx_missed_errors", VNIC_NETDEV_STAT(stats.rx_missed_errors)},
+	{"tx_carrier_errors", VNIC_NETDEV_STAT(stats.tx_carrier_errors)},
+	{"tx_fifo_errors", VNIC_NETDEV_STAT(stats.tx_fifo_errors)},
+
+	/* SUMMARY counters */
+	{"tx_unicast", VNIC_STAT(sum_cntrs.tx_grp.unicast)},
+	{"tx_mcastbcast", VNIC_STAT(sum_cntrs.tx_grp.mcastbcast)},
+	{"tx_untagged", VNIC_STAT(sum_cntrs.tx_grp.untagged)},
+	{"tx_vlan", VNIC_STAT(sum_cntrs.tx_grp.vlan)},
+
+	{"tx_64_size", VNIC_STAT(sum_cntrs.tx_grp.xx_64_size)},
+	{"tx_65_127", VNIC_STAT(sum_cntrs.tx_grp.xx_65_127)},
+	{"tx_128_255", VNIC_STAT(sum_cntrs.tx_grp.xx_128_255)},
+	{"tx_256_511", VNIC_STAT(sum_cntrs.tx_grp.xx_256_511)},
+	{"tx_512_1023", VNIC_STAT(sum_cntrs.tx_grp.xx_512_1023)},
+	{"tx_1024_1518", VNIC_STAT(sum_cntrs.tx_grp.xx_1024_1518)},
+	{"tx_1519_max", VNIC_STAT(sum_cntrs.tx_grp.xx_1519_max)},
+
+	{"rx_unicast", VNIC_STAT(sum_cntrs.rx_grp.unicast)},
+	{"rx_mcastbcast", VNIC_STAT(sum_cntrs.rx_grp.mcastbcast)},
+	{"rx_untagged", VNIC_STAT(sum_cntrs.rx_grp.untagged)},
+	{"rx_vlan", VNIC_STAT(sum_cntrs.rx_grp.vlan)},
+
+	{"rx_64_size", VNIC_STAT(sum_cntrs.rx_grp.xx_64_size)},
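The receive-side length check that this patch adds to hfi_vnic_decap_skb()
can be tried out in isolation with a small userspace sketch. The struct
and function names below (rx_err_cntrs, validate_len) are hypothetical;
ETH_ZLEN and VLAN_ETH_HLEN are redefined locally with the values the Linux
headers use (60 and 18), and the sk_buff handling is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ETH_ZLEN      60	/* minimum Ethernet frame length, without FCS */
#define VLAN_ETH_HLEN 18	/* Ethernet header plus one VLAN tag */

/* Hypothetical stand-in for the per-queue error counters. */
struct rx_err_cntrs {
	unsigned long rx_oversize;
	unsigned long rx_runt;
};

/*
 * Classify a decapsulated frame by length: count it as oversize or runt
 * and return -EFAULT, or return 0 when the length is acceptable.
 */
static int validate_len(size_t len, unsigned int mtu, struct rx_err_cntrs *c)
{
	size_t max_len = (size_t)mtu + VLAN_ETH_HLEN;

	if (len > max_len) {
		c->rx_oversize++;
		return -EFAULT;
	}
	if (len < ETH_ZLEN) {
		c->rx_runt++;
		return -EFAULT;
	}
	return 0;
}
```

With an MTU of 1500 the accepted range is 60 to 1518 bytes inclusive, so
a 1519-byte frame bumps rx_oversize and a 59-byte frame bumps rx_runt.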
[RFC v2 09/10] IB/hfi1: Virtual Network Interface Controller (VNIC) support
HFI1 HW specific support for VNIC functionality.
Add support to add and remove VNIC ports. Also implement the operations
to allocate resources, transmit and receive of Omni-Path encapsulated
Ethernet packets.

Dynamically allocate a set of contexts for VNIC when the first vnic
port is instantiated. Allocate VNIC contexts from user contexts pool
and return them back to the same pool while freeing up. Set aside
enough MSI-X interrupts for VNIC contexts and assign them when the
contexts are allocated. On the receive side, use an RSM rule to
spread TCP/UDP streams among VNIC contexts.

Reviewed-by: Dennis Dalessandro
Reviewed-by: Ira Weiny
Signed-off-by: Niranjana Vishwanathapura
Signed-off-by: Andrzej Kacprowski
---
 drivers/infiniband/hw/hfi1/Makefile       |   2 +-
 drivers/infiniband/hw/hfi1/aspm.h         |  13 +-
 drivers/infiniband/hw/hfi1/chip.c         | 270 +++--
 drivers/infiniband/hw/hfi1/chip.h         |   2 +
 drivers/infiniband/hw/hfi1/debugfs.c      |   6 +-
 drivers/infiniband/hw/hfi1/driver.c       |  74 +++-
 drivers/infiniband/hw/hfi1/file_ops.c     |  25 +-
 drivers/infiniband/hw/hfi1/hfi.h          |  49 ++-
 drivers/infiniband/hw/hfi1/init.c         |  37 +-
 drivers/infiniband/hw/hfi1/mad.c          |   8 +-
 drivers/infiniband/hw/hfi1/pio.c          |  17 +
 drivers/infiniband/hw/hfi1/pio.h          |   6 +
 drivers/infiniband/hw/hfi1/sysfs.c        |   2 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.c |   6 +-
 drivers/infiniband/hw/hfi1/user_pages.c   |   3 +-
 drivers/infiniband/hw/hfi1/verbs.c        |   7 +
 drivers/infiniband/hw/hfi1/vnic.h         | 145 +++
 drivers/infiniband/hw/hfi1/vnic_main.c    | 614 ++
 drivers/infiniband/hw/hfi1/vnic_sdma.c    |  60 +++
 include/rdma/opa_port_info.h              |   2 +-
 20 files changed, 1252 insertions(+), 96 deletions(-)
 create mode 100644 drivers/infiniband/hw/hfi1/vnic.h
 create mode 100644 drivers/infiniband/hw/hfi1/vnic_main.c
 create mode 100644 drivers/infiniband/hw/hfi1/vnic_sdma.c

diff --git a/drivers/infiniband/hw/hfi1/Makefile b/drivers/infiniband/hw/hfi1/Makefile
index 0cf97a0..88085f6 100644
--- a/drivers/infiniband/hw/hfi1/Makefile
+++ b/drivers/infiniband/hw/hfi1/Makefile
@@ -12,7 +12,7 @@ hfi1-y := affinity.o chip.o device.o driver.o efivar.o \
 	init.o intr.o mad.o mmu_rb.o pcie.o pio.o pio_copy.o platform.o \
 	qp.o qsfp.o rc.o ruc.o sdma.o sysfs.o trace.o \
 	uc.o ud.o user_exp_rcv.o user_pages.o user_sdma.o verbs.o \
-	verbs_txreq.o
+	verbs_txreq.o vnic_main.o vnic_sdma.o
 hfi1-$(CONFIG_DEBUG_FS) += debugfs.o
 
 CFLAGS_trace.o = -I$(src)
diff --git a/drivers/infiniband/hw/hfi1/aspm.h b/drivers/infiniband/hw/hfi1/aspm.h
index 0d58fe3..3a01b69 100644
--- a/drivers/infiniband/hw/hfi1/aspm.h
+++ b/drivers/infiniband/hw/hfi1/aspm.h
@@ -229,14 +229,17 @@ static inline void aspm_ctx_timer_function(unsigned long data)
 	spin_unlock_irqrestore(&rcd->aspm_lock, flags);
 }
 
-/* Disable interrupt processing for verbs contexts when PSM contexts are open */
+/*
+ * Disable interrupt processing for verbs contexts when PSM or VNIC contexts
+ * are open.
+ */
 static inline void aspm_disable_all(struct hfi1_devdata *dd)
 {
 	struct hfi1_ctxtdata *rcd;
 	unsigned long flags;
 	unsigned i;
 
-	for (i = 0; i < dd->first_user_ctxt; i++) {
+	for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
 		rcd = dd->rcd[i];
 		del_timer_sync(&rcd->aspm_timer);
 		spin_lock_irqsave(&rcd->aspm_lock, flags);
@@ -260,7 +263,7 @@ static inline void aspm_enable_all(struct hfi1_devdata *dd)
 	if (aspm_mode != ASPM_MODE_DYNAMIC)
 		return;
 
-	for (i = 0; i < dd->first_user_ctxt; i++) {
+	for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
 		rcd = dd->rcd[i];
 		spin_lock_irqsave(&rcd->aspm_lock, flags);
 		rcd->aspm_intr_enable = true;
@@ -276,7 +279,7 @@ static inline void aspm_ctx_init(struct hfi1_ctxtdata *rcd)
 		    (unsigned long)rcd);
 	rcd->aspm_intr_supported = rcd->dd->aspm_supported &&
 		aspm_mode == ASPM_MODE_DYNAMIC &&
-		rcd->ctxt < rcd->dd->first_user_ctxt;
+		rcd->ctxt < rcd->dd->first_dyn_alloc_ctxt;
 }
 
 static inline void aspm_init(struct hfi1_devdata *dd)
@@ -286,7 +289,7 @@ static inline void aspm_init(struct hfi1_devdata *dd)
 	spin_lock_init(&dd->aspm_lock);
 	dd->aspm_supported = aspm_hw_l1_supported(dd);
 
-	for (i = 0; i < dd->first_user_ctxt; i++)
+	for (i = 0; i < dd->first_dyn_alloc_ctxt; i++)
 		aspm_ctx_init(dd->rcd[i]);
 
 	/* Start with ASPM disabled */
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 9263984..472ce55 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -125,9 +125,16 @@ struct flag_table {
 #define DEFAULT_KRCVQS
[RFC v2 03/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) netdev
HFI VNIC netdev function supports Ethernet functionality over Omni-Path
fabric by encapsulating Ethernet packets inside Omni-Path packet header.
It interfaces with the network stack to provide standard Ethernet network
interfaces. It invokes HFI device's VNIC callback functions for HW access.

Reviewed-by: Dennis Dalessandro
Reviewed-by: Ira Weiny
Signed-off-by: Niranjana Vishwanathapura
Signed-off-by: Sadanand Warrier
Signed-off-by: Sudeep Dutt
Signed-off-by: Tanya K Jajodia
Signed-off-by: Andrzej Kacprowski
---
 MAINTAINERS                                        |   7 +
 drivers/infiniband/Kconfig                         |   1 +
 drivers/infiniband/sw/Makefile                     |   1 +
 drivers/infiniband/sw/intel/hfi_vnic/Kconfig       |   8 +
 drivers/infiniband/sw/intel/hfi_vnic/Makefile      |   6 +
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c  | 238
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h  |  62
 .../sw/intel/hfi_vnic/hfi_vnic_ethtool.c           |  65
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h          | 220 +++
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_netdev.c | 409 +
 10 files changed, 1017 insertions(+)
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/Kconfig
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/Makefile
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_netdev.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 2c7a7b6..62db3ea 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5628,6 +5628,13 @@ F:	drivers/block/cciss*
 F:	include/linux/cciss_ioctl.h
 F:	include/uapi/linux/cciss_ioctl.h
 
+HFI-VNIC DRIVER
+M:	Dennis Dalessandro
+M:	Niranjana Vishwanathapura
+L:	linux-r...@vger.kernel.org
+S:	Supported
+F:	drivers/infiniband/sw/intel/hfi_vnic
+
 HFI1 DRIVER
 M:	Mike Marciniszyn
 M:	Dennis Dalessandro
diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 6709173..900daf3 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -85,6 +85,7 @@ source "drivers/infiniband/ulp/srpt/Kconfig"
 source "drivers/infiniband/ulp/iser/Kconfig"
 source "drivers/infiniband/ulp/isert/Kconfig"
 
+source "drivers/infiniband/sw/intel/hfi_vnic/Kconfig"
 source "drivers/infiniband/sw/rdmavt/Kconfig"
 source "drivers/infiniband/sw/rxe/Kconfig"
 
diff --git a/drivers/infiniband/sw/Makefile b/drivers/infiniband/sw/Makefile
index 8b095b2..2792559 100644
--- a/drivers/infiniband/sw/Makefile
+++ b/drivers/infiniband/sw/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_INFINIBAND_RDMAVT)		+= rdmavt/
 obj-$(CONFIG_RDMA_RXE)			+= rxe/
+obj-$(CONFIG_HFI_VNIC)			+= intel/hfi_vnic/
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/Kconfig b/drivers/infiniband/sw/intel/hfi_vnic/Kconfig
new file mode 100644
index 000..84d13e7
--- /dev/null
+++ b/drivers/infiniband/sw/intel/hfi_vnic/Kconfig
@@ -0,0 +1,8 @@
+config HFI_VNIC
+	tristate "Intel HFI VNIC support"
+	depends on X86_64 && INFINIBAND
+	---help---
+	  This is HFI Virtual Network Interface Controller (VNIC) driver
+	  for Ethernet over HFI feature. It implements the HW independent
+	  VNIC functionality. It interfaces with Linux stack for data path
+	  and IB MAD for the control path.
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/Makefile b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
new file mode 100644
index 000..8e3dca7
--- /dev/null
+++ b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
@@ -0,0 +1,6 @@
+# Makefile - Intel HFI Virtual Network Controller driver
+# Copyright(c) 2016, Intel Corporation.
+#
+obj-$(CONFIG_HFI_VNIC) += hfi_vnic.o
+
+hfi_vnic-y := hfi_vnic_netdev.o hfi_vnic_encap.o hfi_vnic_ethtool.o
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
new file mode 100644
index 000..093df67
--- /dev/null
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
@@ -0,0 +1,238 @@
+/*
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License fo
[RFC v2 10/10] IB/hfi1: VNIC SDMA support
HFI1 VNIC SDMA support enables transmission of VNIC packets over SDMA. Map VNIC queues to SDMA engines and support halting and wakeup of the VNIC queues. Reviewed-by: Dennis Dalessandro Reviewed-by: Ira Weiny Signed-off-by: Niranjana Vishwanathapura --- drivers/infiniband/hw/hfi1/hfi.h | 1 + drivers/infiniband/hw/hfi1/vnic.h | 30 +++- drivers/infiniband/hw/hfi1/vnic_main.c | 21 ++- drivers/infiniband/hw/hfi1/vnic_sdma.c | 260 + 4 files changed, 309 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h index 78d1726..8d5949f 100644 --- a/drivers/infiniband/hw/hfi1/hfi.h +++ b/drivers/infiniband/hw/hfi1/hfi.h @@ -855,6 +855,7 @@ struct hfi1_asic_data { /* Virtual NIC information */ struct hfi1_vnic_data { struct hfi1_ctxtdata *ctxt[HFI1_NUM_VNIC_CTXT]; + struct kmem_cache *txreq_cache; u8 num_vports; struct idr vesw_idr; u8 rmt_start; diff --git a/drivers/infiniband/hw/hfi1/vnic.h b/drivers/infiniband/hw/hfi1/vnic.h index 047845e..2d4eb8f 100644 --- a/drivers/infiniband/hw/hfi1/vnic.h +++ b/drivers/infiniband/hw/hfi1/vnic.h @@ -49,6 +49,7 @@ #include #include "hfi.h" +#include "sdma.h" #define HFI1_VNIC_ICRC_LEN 4 #define HFI1_VNIC_TAIL_LEN 1 @@ -90,6 +91,26 @@ #define HFI1_VNIC_SC_SHIFT 4 /** + * struct hfi1_vnic_sdma - VNIC per Tx ring SDMA information + * @dd - device data pointer + * @sde - sdma engine + * @vinfo - vnic info pointer + * @wait - iowait structure + * @stx - sdma tx request + * @state - vnic Tx ring SDMA state + * @q_idx - vnic Tx queue index + */ +struct hfi1_vnic_sdma { + struct hfi1_devdata *dd; + struct sdma_engine *sde; + struct hfi1_vnic_vport_info *vinfo; + struct iowait wait; + struct sdma_txreq stx; + unsigned int state; + u8 q_idx; +}; + +/** * struct hfi1_vnic_notifier - VNIC notifer structure * @cb - vnic callback function */ @@ -104,6 +125,7 @@ struct hfi1_vnic_notifier { * @event_flags: event notification flags * @vport: vnic port pointer * @skbq: Array of queues for received 
socket buffers + * @sdma: VNIC SDMA structure per TXQ */ struct hfi1_vnic_vport_info { struct hfi1_devdata *dd; @@ -112,7 +134,8 @@ struct hfi1_vnic_vport_info { DECLARE_BITMAP(event_flags, HFI_VNIC_NUM_EVTS); struct hfi_vnic_port *vport; - struct sk_buff_head skbq[HFI1_NUM_VNIC_CTXT]; + struct sk_buff_headskbq[HFI1_NUM_VNIC_CTXT]; + struct hfi1_vnic_sdma sdma[HFI1_VNIC_MAX_TXQ]; }; static inline struct hfi1_devdata *vnic_dev2dd(struct hfi_vnic_port *vport) @@ -131,8 +154,13 @@ static inline void hfi1_vnic_update_pad(unsigned char *pad, u8 plen) /* vnic hfi1 internal functions */ void hfi1_vnic_setup(struct hfi1_devdata *dd); void hfi1_vnic_cleanup(struct hfi1_devdata *dd); +int hfi1_vnic_txreq_init(struct hfi1_devdata *dd); +void hfi1_vnic_txreq_deinit(struct hfi1_devdata *dd); void hfi1_vnic_bypass_rcv(struct hfi1_packet *packet); +void hfi1_vnic_sdma_init(struct hfi1_vnic_vport_info *vinfo); +bool hfi1_vnic_sdma_write_avail(struct hfi1_vnic_vport_info *vinfo, + u8 q_idx); /* vnic port operations */ struct hfi_vnic_port *hfi1_vnic_add_vport(struct ib_device *device, diff --git a/drivers/infiniband/hw/hfi1/vnic_main.c b/drivers/infiniband/hw/hfi1/vnic_main.c index 1e237f3..19843a4 100644 --- a/drivers/infiniband/hw/hfi1/vnic_main.c +++ b/drivers/infiniband/hw/hfi1/vnic_main.c @@ -289,15 +289,21 @@ static int hfi1_vnic_put_skb(struct hfi_vnic_port *vport, static u8 hfi1_vnic_select_queue(struct hfi_vnic_port *vport, u8 vl, u8 entropy) { - return 0; + struct hfi1_vnic_vport_info *vinfo = vport->hfi_priv; + struct sdma_engine *sde; + + sde = sdma_select_engine_vl(vinfo->dd, entropy, vl); + return sde->this_idx; } static bool hfi1_vnic_get_write_avail(struct hfi_vnic_port *vport, u8 q_idx) { + struct hfi1_vnic_vport_info *vinfo = vport->hfi_priv; + if (q_idx >= vport->hfi_info.num_tx_q) return false; - return true; + return hfi1_vnic_sdma_write_avail(vinfo, q_idx); } void hfi1_vnic_bypass_rcv(struct hfi1_packet *packet) @@ -499,6 +505,12 @@ static int 
hfi1_vnic_init(struct hfi_vnic_port *vport) int i, rc = 0; mutex_lock(&hfi1_mutex); + if (!dd->vnic.num_vports) { + rc = hfi1_vnic_txreq_init(dd); + if (rc) + goto txreq_fail; + } + for (i = dd->vnic.num_ctxt; i < vport->hfi_info.num_rx_q; i++) { rc = hfi1_vnic_allot_ctxt(dd, &dd->vnic.ctxt[i]); if (rc) @@ -526,7 +538,11 @@ static int hfi1_vnic_init(struct hfi_vnic_port *vport) dd->vnic.num_vports++; vinfo->vport = vport; + hfi1_vnic_sdma_init(vinfo); alloc_fail: + if (!dd->vnic.num_vports) +
[RFC v2 01/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) documentation
Add HFI VNIC design document explaining the VNIC architecture and the driver design. Reviewed-by: Dennis Dalessandro Reviewed-by: Ira Weiny Signed-off-by: Niranjana Vishwanathapura --- Documentation/infiniband/hfi_vnic.txt | 95 +++ 1 file changed, 95 insertions(+) create mode 100644 Documentation/infiniband/hfi_vnic.txt diff --git a/Documentation/infiniband/hfi_vnic.txt b/Documentation/infiniband/hfi_vnic.txt new file mode 100644 index 000..1f39d8b --- /dev/null +++ b/Documentation/infiniband/hfi_vnic.txt @@ -0,0 +1,95 @@ +Intel Omni-Path Host Fabric Interface (HFI) Virtual Network Interface +Controller (VNIC) feature supports Ethernet functionality over Omni-Path +fabric by encapsulating the Ethernet packets between HFI nodes. + +The patterns of exchanges of Omni-Path encapsulated Ethernet packets +involves one or more virtual Ethernet switches overlaid on the Omni-Path +fabric topology. A subset of HFI nodes on the Omni-Path fabric are +permitted to exchange encapsulated Ethernet packets across a particular +virtual Ethernet switch. The virtual Ethernet switches are logical +abstractions achieved by configuring the HFI nodes on the fabric for +header generation and processing. In the simplest configuration all HFI +nodes across the fabric exchange encapsulated Ethernet packets over a +single virtual Ethernet switch. A virtual Ethernet switch, is effectively +an independent Ethernet network. The configuration is performed by an +Ethernet Manager (EM) which is part of the trusted Fabric Manager (FM) +application. HFI nodes can have multiple VNICs each connected to a +different virtual Ethernet switch. The below diagram presents a case +of two virtual Ethernet switches with two HFI nodes. 
+ + +---+ + | Subnet/ | + | Ethernet | + | Manager | + +---+ +/ / + / / +// + / / ++-+ +--+ +| Virtual Ethernet Switch| | Virtual Ethernet Switch | +| +-++-+ | | +-++-+ | +| | VPORT || VPORT | | | | VPORT || VPORT | | ++--+-++-+-+ +-+-++-+---+ + | \/ | + | \/ | + | \/ | + |/ \| + | / \ | + +---++ +---++ + | VNIC|VNIC| |VNIC |VNIC| + +---++ +---++ + | HFI | | HFI | + ++ ++ + +Intel HFI VNIC software design is presented in the below diagram. +HFI VNIC functionality has a HW dependent component and a HW +independent component. + +The HW dependent VNIC functionality is part of the HFI1 driver. It +implements the callback functions to do various tasks which includes +adding and removing of VNIC ports, HW resource allocation for VNIC +functionality and actual transmission and reception of encapsulated +Ethernet packets over the fabric. Each VNIC port is addressed by the +HFI port number, and the VNIC port number on that HFI port. + +The HFI VNIC module implements the HW independent VNIC functionality. +It consists of two parts. The VNIC Ethernet Management Agent (VEMA) +registers itself with IB core as an IB client and interfaces with the +IB MAD stack. It exchanges the management information with the Ethernet +Manager (EM) and the VNIC netdev. The VNIC netdev part interfaces with +the Linux network stack, thus providing standard Ethernet network +interfaces. It invokes HFI device's VNIC callback functions for HW access. +The VNIC netdev encapsulates the Ethernet packets with an Omni-Path +header before passing them to the HFI1 driver for transmission. +Similarly, it de-encapsulates the received Omni-Path packets before +passing them to the network stack. For each VNIC interface, the +information required for encapsulation is configured by EM via VEMA MAD +interface. + + ++---+ +--+ +| | | Linux | +| IB MAD| | Network | +| | | Stack | ++---+ +--+ + | | + | | +++ +|
[RFC v2 04/10] IB/hfi-vnic: VNIC Ethernet Management (EM) structure definitions
Define VNIC EM MAD structures and the associated macros. These structures are used for information exchange between VNIC EM agent (EMA) on the HFI host and the Ethernet manager. These include the virtual ethernet switch (vesw) port information, vesw port mac table, summary and error counters, vesw port interface mac lists and the EMA trap. Reviewed-by: Dennis Dalessandro Reviewed-by: Ira Weiny Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Sadanand Warrier Signed-off-by: Tanya K Jajodia --- .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h | 444 + .../sw/intel/hfi_vnic/hfi_vnic_internal.h | 33 ++ 2 files changed, 477 insertions(+) diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h index 6786cce..a6770ef 100644 --- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h +++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h @@ -52,11 +52,455 @@ * and decapsulation of Ethernet packets */ +#include +#include + +/* Maximum number of vnics supported */ +#define HFI_MAX_VPORTS_SUPPORTED 256 + +/* EMA class version */ +#define HFI_EMA_CLASS_VERSION 0x80 + +/* + * Define the Intel vendor management class for HFI + * ETHERNET MANAGEMENT + */ +#define HFI_MGMT_CLASS_INTEL_EMA0x34 + +/* EM attribute IDs */ +#define HFI_EM_ATTR_CLASS_PORT_INFO 0x0001 +#define HFI_EM_ATTR_VESWPORT_INFO 0x0011 +#define HFI_EM_ATTR_VESWPORT_MAC_ENTRIES0x0012 +#define HFI_EM_ATTR_IFACE_UCAST_MACS0x0013 +#define HFI_EM_ATTR_IFACE_MCAST_MACS0x0014 +#define HFI_EM_ATTR_DELETE_VESW 0x0015 +#define HFI_EM_ATTR_VESWPORT_SUMMARY_COUNTERS 0x0020 +#define HFI_EM_ATTR_VESWPORT_ERROR_COUNTERS 0x0022 + #define HFI_VESW_MAX_NUM_DEF_PORT 16 #define HFI_VNIC_MAX_NUM_PCP8 +#define HFI_VNIC_EMA_DATA(OPA_MGMT_MAD_SIZE - IB_MGMT_VENDOR_HDR) + +/* Defines for vendor specific notice(trap) attributes */ +#define HFI_INTEL_EMA_NOTICE_TYPE_INFO 0x04 + +/* INTEL OUI */ +#define INTEL_OUI_1 0x00 +#define INTEL_OUI_2 0x06 +#define 
INTEL_OUI_3 0x6a + +/* Trap opcodes sent from VNIC */ +#define HFI_VESWPORT_TRAP_IFACE_UCAST_MAC_CHANGE 0x1 +#define HFI_VESWPORT_TRAP_IFACE_MCAST_MAC_CHANGE 0x2 +#define HFI_VESWPORT_TRAP_ETH_LINK_STATUS_CHANGE 0x3 + /* VNIC configured and operational state values */ #define HFI_VNIC_STATE_DROP_ALL0x1 #define HFI_VNIC_STATE_FORWARDING 0x3 +/** + * struct hfi_vesw_info - HFI vnic switch information + * @fabric_id: 10-bit fabric id + * @vesw_id: 12-bit virtual ethernet switch id + * @def_port_mask: bitmask of default ports + * @pkey: partition key + * @u_mcast_dlid: unknown multicast dlid + * @u_ucast_dlid: array of unknown unicast dlids + * @eth_mtu: MTUs for each vlan PCP + * @eth_mtu_non_vlan: MTU for non vlan packets + */ +struct hfi_vesw_info { + __be16 fabric_id; + __be16 vesw_id; + + u8 rsvd0[6]; + __be16 def_port_mask; + + u8 rsvd1[2]; + __be16 pkey; + + u8 rsvd2[4]; + __be32 u_mcast_dlid; + __be32 u_ucast_dlid[HFI_VESW_MAX_NUM_DEF_PORT]; + + u8 rsvd3[44]; + __be16 eth_mtu[HFI_VNIC_MAX_NUM_PCP]; + __be16 eth_mtu_non_vlan; + u8 rsvd4[2]; +} __packed; + +/** + * struct hfi_per_veswport_info - HFI vnic per port information + * @port_num: port number + * @eth_link_status: current ethernet link state + * @base_mac_addr: base mac address + * @config_state: configured port state + * @oper_state: operational port state + * @max_mac_tbl_ent: max number of mac table entries + * @max_smac_ent: max smac entries in mac table + * @mac_tbl_digest: mac table digest + * @encap_slid: base slid for the port + * @pcp_to_sc_uc: sc by pcp index for unicast ethernet packets + * @pcp_to_vl_uc: vl by pcp index for unicast ethernet packets + * @pcp_to_sc_mc: sc by pcp index for multicast ethernet packets + * @pcp_to_vl_mc: vl by pcp index for multicast ethernet packets + * @non_vlan_sc_uc: sc for non-vlan unicast ethernet packets + * @non_vlan_vl_uc: vl for non-vlan unicast ethernet packets + * @non_vlan_sc_mc: sc for non-vlan multicast ethernet packets + * @non_vlan_vl_mc: vl for 
non-vlan multicast ethernet packets + * @uc_macs_gen_count: generation count for unicast macs list + * @mc_macs_gen_count: generation count for multicast macs list + */ +struct hfi_per_veswport_info { + __be32 port_num; + + u8 eth_link_status; + u8 rsvd0[3]; + + u8 base_mac_addr[ETH_ALEN]; + u8 config_state; + u8 oper_state; + + __be16 max_mac_tbl_ent; + __be16 max_smac_ent; + __be32 mac_tbl_digest; + u8 rsvd1[4]; + + __be32 encap_slid; + + u8 pcp_to_sc_uc[HFI_VNIC_MAX_NUM_PCP]; + u8 pcp_to_vl_uc[HFI_VNIC_MAX_NUM_PCP]; + u8 pcp_to_sc_mc[
Re: [kernel-hardening] Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On Thu, 2016-12-15 at 15:57 +0800, Herbert Xu wrote:
> Jason A. Donenfeld wrote:
> >
> > Siphash needs a random secret key, yes. The point is that the hash
> > function remains secure so long as the secret key is kept secret.
> > Other functions can't make the same guarantee, and so nervous
> > periodic key rotation is necessary, but in most cases nothing is
> > done, and so things just leak over time.
>
> Actually those users that use rhashtable now have a much more
> sophisticated defence against these attacks, dynamic rehashing
> when bucket length exceeds a preset limit.
>
> Cheers,

Key-independent collisions won't be mitigated by picking a new secret.
A simple solution with clear security properties is ideal.
Re: wl1251 & mac address & calibration data
(Adding Luis because he has been working on request_firmware() lately)

Pali Rohár writes:

>> > So no, there is no argument against... request_firmware() in
>> > fallback mode with userspace helper is by design blocking and
>> > waiting for userspace. But waiting for some change in DTS in
>> > kernel is just nonsense.
>>
>> I would just mark the wlan device with status = "disabled" and
>> enable it in the overlay together with adding the NVS & MAC info.
>
> So if you think that this solution makes sense, we can wait what net
> wireless maintainers say about it...
>
> For me it looks like that solution can be:
>
> extending request_firmware() to use only userspace helper

I haven't followed the discussion very closely, but this is my preference for what drivers should do:

1) First the driver should try to get the calibration data and mac address from the device tree.

2) If they are not in DT, the driver should retrieve the calibration data with request_firmware(), BUT with an option for user space to implement that with a helper script so that the data can be created dynamically, which I believe openwrt does with ath10k calibration data right now.

> and load mac address also via request_firmware() either by appending it
> into NVS data or via separate call

I'm not really a fan of the idea of providing the permanent mac address through request_firmware(). For example, how to handle multiple devices on the same host, would there be a need for some kind of bus ids encoded to the filename? And what about devices with multiple mac addresses?

I wish there would be a better way than request_firmware() to provide the permanent mac addresses from user space (if device tree is not available), I just don't know what that could be :) But if we would start to use request_firmware() for this, at least there should be a wider consensus about that and it should be properly documented, just like the device tree bindings.

-- 
Kalle Valo
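To make the ordering in 1) and 2) concrete, here is a minimal userspace sketch of "device tree first, firmware loader second". It is not wl1251 code: `cal_loader`, `load_calibration()` and the three stub loaders are hypothetical names standing in for the DT lookup and for request_firmware(), and the byte values are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: where the calibration blob came from. */
enum cal_source { CAL_NONE, CAL_DEVICE_TREE, CAL_FIRMWARE };

/*
 * Hypothetical loader callback. Stands in for "read property from DT" or
 * "request_firmware()". Returns the number of bytes written, 0 if the
 * source has no data.
 */
typedef size_t (*cal_loader)(unsigned char *buf, size_t len);

/* Try the device tree first, then fall back to the firmware loader. */
static enum cal_source load_calibration(cal_loader from_dt, cal_loader from_fw,
                                        unsigned char *buf, size_t len,
                                        size_t *out_len)
{
    size_t n;

    assert(buf != NULL && out_len != NULL);
    *out_len = 0;

    if (from_dt && (n = from_dt(buf, len)) != 0) {
        *out_len = n;
        return CAL_DEVICE_TREE;
    }
    if (from_fw && (n = from_fw(buf, len)) != 0) {
        *out_len = n;
        return CAL_FIRMWARE;
    }
    return CAL_NONE;
}

/* Stub loaders with made-up contents, for illustration only. */
static size_t dt_missing(unsigned char *buf, size_t len)
{
    (void)buf; (void)len;
    return 0;
}

static size_t dt_present(unsigned char *buf, size_t len)
{
    if (len < 4)
        return 0;
    memcpy(buf, "\x01\x02\x03\x04", 4);
    return 4;
}

static size_t fw_present(unsigned char *buf, size_t len)
{
    if (len < 2)
        return 0;
    memcpy(buf, "\xaa\xbb", 2);
    return 2;
}
```

The sketch only covers calibration data. It deliberately says nothing about the permanent mac address, since that is the open question in this mail.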
Re: [PATCH 3/3] Bluetooth: btusb: Configure Marvel to use one of the pins for oob wakeup
Hi Rajat, On mer., déc. 14 2016, Rajat Jain wrote: In your title unless you speak about the comic books you should do a s/Marvel/Marvell/ :) Gregory > The Marvell devices may have many gpio pins, and hence for wakeup > on these out-of-band pins, the chip needs to be told which pin is > to be used for wakeup, using an hci command. > > Thus, we read the pin number etc from the device tree node and send > a command to the chip. > > Signed-off-by: Rajat Jain > --- > Note that while I would have liked to name the compatible string as more > like "marvell, usb8997-bt", the devicetrees/bindings/usb/usb-device.txt > requires the compatible property to be of the form "usbVID,PID". > > .../{marvell-bt-sd8xxx.txt => marvell-bt-8xxx.txt} | 25 - > drivers/bluetooth/btusb.c | 59 > ++ > 2 files changed, 82 insertions(+), 2 deletions(-) > rename Documentation/devicetree/bindings/net/{marvell-bt-sd8xxx.txt => > marvell-bt-8xxx.txt} (76%) > > diff --git a/Documentation/devicetree/bindings/net/marvell-bt-sd8xxx.txt > b/Documentation/devicetree/bindings/net/marvell-bt-8xxx.txt > similarity index 76% > rename from Documentation/devicetree/bindings/net/marvell-bt-sd8xxx.txt > rename to Documentation/devicetree/bindings/net/marvell-bt-8xxx.txt > index 6a9a63c..471bef8 100644 > --- a/Documentation/devicetree/bindings/net/marvell-bt-sd8xxx.txt > +++ b/Documentation/devicetree/bindings/net/marvell-bt-8xxx.txt > @@ -1,4 +1,4 @@ > -Marvell 8897/8997 (sd8897/sd8997) bluetooth SDIO devices > +Marvell 8897/8997 (sd8897/sd8997) bluetooth devices (SDIO or USB based) > -- > > Required properties: > @@ -6,11 +6,13 @@ Required properties: >- compatible : should be one of the following: > * "marvell,sd8897-bt" > * "marvell,sd8997-bt" > + * "usb1286,204e" > > Optional properties: > >- marvell,cal-data: Calibration data downloaded to the device during > initialization. This is an array of 28 values(u8). > + This is only applicable to SDIO devices. 
> >- marvell,wakeup-pin: It represents wakeup pin number of the bluetooth > chip. > firmware will use the pin to wakeup host system (u16). > @@ -29,7 +31,9 @@ Example: > IRQ pin 119 is used as system wakeup source interrupt. > wakeup pin 13 and gap 100ms are configured so that firmware can wakeup host > using this device side pin and wakeup latency. > -calibration data is also available in below example. > + > +Example for SDIO device follows (calibration data is also available in > +below example). > > &mmc3 { > status = "okay"; > @@ -54,3 +58,20 @@ calibration data is also available in below example. > marvell,wakeup-gap-ms = /bits/ 16 <0x64>; > }; > }; > + > +Example for USB device: > + > +&usb_host1_ohci { > +status = "okay"; > +#address-cells = <1>; > +#size-cells = <0>; > + > +mvl_bt1: bt@1 { > + compatible = "usb1286,204e"; > + reg = <1>; > + interrupt-parent = <&gpio0>; > + interrupts = <119 IRQ_TYPE_LEVEL_LOW>; > + marvell,wakeup-pin = /bits/ 16 <0x0d>; > + marvell,wakeup-gap-ms = /bits/ 16 <0x64>; > +}; > +}; > diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c > index 32a6f22..99d7f6d 100644 > --- a/drivers/bluetooth/btusb.c > +++ b/drivers/bluetooth/btusb.c > @@ -2343,6 +2343,58 @@ static int btusb_shutdown_intel(struct hci_dev *hdev) > return 0; > } > > +#ifdef CONFIG_PM > +static const struct of_device_id mvl_oob_wake_match_table[] = { > + { .compatible = "usb1286,204e" }, > + { } > +}; > +MODULE_DEVICE_TABLE(of, mvl_oob_wake_match_table); > + > +/* Configure an out-of-band gpio as wake-up pin, if specified in device tree > */ > +static int marvell_config_oob_wake(struct hci_dev *hdev) > +{ > + struct sk_buff *skb; > + struct btusb_data *data = hci_get_drvdata(hdev); > + struct device *dev = &data->udev->dev; > + u16 pin, gap, opcode; > + int ret; > + u8 cmd[5]; > + > + if (!of_match_device(mvl_oob_wake_match_table, dev)) > + return 0; > + > + if (of_property_read_u16(dev->of_node, "marvell,wakeup-pin", &pin) || > + 
of_property_read_u16(dev->of_node, "marvell,wakeup-gap-ms", &gap)) > + return -EINVAL; > + > + /* Vendor specific command to configure a GPIO as wake-up pin */ > + opcode = hci_opcode_pack(0x3F, 0x59); > + cmd[0] = opcode & 0xFF; > + cmd[1] = opcode >> 8; > + cmd[2] = 2; /* length of parameters that follow */ > + cmd[3] = pin; > + cmd[4] = gap; /* time in ms, for which wakeup pin should be asserted */ > + > + skb = bt_skb_alloc(sizeof(cmd), GFP_KERNEL); > + if (!skb) { > + bt_dev_err(hdev, "%s: No memory\n", __func__); > + return -ENOMEM; > + } > + > + memcpy(skb_put(skb, sizeof(cmd)), cmd, sizeof(cmd)); > + hci_skb_pkt_type(skb) = HCI_COMMAND_PKT; > + > + ret
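The command layout built in the quoted hunk can be checked in isolation. The following is a userspace sketch, assuming the same bit packing as the kernel's hci_opcode_pack() (10-bit OCF, OGF shifted left by 10); `opcode_pack()` and `build_oob_wake_cmd()` are illustrative names, and the range check is an addition that is not in the patch, where the u16 pin and gap values are stored into single bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as hci_opcode_pack(): OCF in bits 0-9, OGF in bits 10-15. */
static uint16_t opcode_pack(uint8_t ogf, uint16_t ocf)
{
    return (uint16_t)((ocf & 0x03ff) | ((uint16_t)ogf << 10));
}

/*
 * Build the 5-byte vendor command from the patch: opcode (little endian),
 * parameter length, pin, gap. Returns 0 on success, -1 if pin or gap does
 * not fit into one byte (this check is an assumption, not part of the patch).
 */
static int build_oob_wake_cmd(uint16_t pin, uint16_t gap_ms, uint8_t cmd[5])
{
    uint16_t opcode = opcode_pack(0x3F, 0x59);

    assert(cmd != NULL);
    if (pin > 0xFF || gap_ms > 0xFF)
        return -1;

    cmd[0] = opcode & 0xFF;   /* opcode, low byte */
    cmd[1] = opcode >> 8;     /* opcode, high byte */
    cmd[2] = 2;               /* length of parameters that follow */
    cmd[3] = (uint8_t)pin;
    cmd[4] = (uint8_t)gap_ms; /* time in ms the wakeup pin is asserted */
    return 0;
}
```

With the values from the binding example (pin 0x0d, gap 0x64) this yields the bytes 59 fc 02 0d 64.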
Re: Designing a safe RX-zero-copy Memory Model for Networking
On Wed, 14 Dec 2016 14:45:00 -0800 Alexander Duyck wrote: > On Wed, Dec 14, 2016 at 1:29 PM, Jesper Dangaard Brouer > wrote: > > On Wed, 14 Dec 2016 08:45:08 -0800 > > Alexander Duyck wrote: > > > >> I agree. This is a no-go from the performance perspective as well. > >> At a minimum you would have to be zeroing out the page between uses to > >> avoid leaking data, and that assumes that the program we are sending > >> the pages to is slightly well behaved. If we think zeroing out an > >> sk_buff is expensive wait until we are trying to do an entire 4K page. > > > > Again, yes the page will be zero'ed out, but only when entering the > > page_pool. Because they are recycled they are not cleared on every use. > > Thus, performance does not suffer. > > So you are talking about recycling, but not clearing the page when it > is recycled. That right there is my problem with this. It is fine if > you assume the pages are used by the application only, but you are > talking about using them for both the application and for the regular > network path. You can't do that. If you are recycling you will have > to clear the page every time you put it back onto the Rx ring, > otherwise you can leak the recycled memory into user space and end up > with a user space program being able to snoop data out of the skb. > > > Besides clearing large mem area is not as bad as clearing small. > > Clearing an entire page does cost something, as mentioned before 143 > > cycles, which is 28 bytes-per-cycle (4096/143). And clearing 256 bytes > > cost 36 cycles which is only 7 bytes-per-cycle (256/36). > > What I am saying is that you are going to be clearing the 4K blocks > each time they are recycled. You can't have the pages shared between > user-space and the network stack unless you have true isolation. 
If > you are allowing network stack pages to be recycled back into the > user-space application you open up all sorts of leaks where the > application can snoop into data it shouldn't have access to. See later, the "Read-only packet page" mode should provide a mode where the netstack doesn't write into the page, and thus cannot leak kernel data. (CAP_NET_ADMIN already gives it access to other applications' data.) > >> I think we are stuck with having to use a HW filter to split off > >> application traffic to a specific ring, and then having to share the > >> memory between the application and the kernel on that ring only. Any > >> other approach just opens us up to all sorts of security concerns > >> since it would be possible for the application to try to read and > >> possibly write any data it wants into the buffers. > > > > This is why I wrote a document[1], trying to outline how this is possible, > > going through all the combinations, and asking the community to find > > faults in my idea. Inlining it again, as nobody really replied on the > > content of the doc. > > > > - > > Best regards, > > Jesper Dangaard Brouer > > MSc.CS, Principal Kernel Engineer at Red Hat > > LinkedIn: http://www.linkedin.com/in/brouer > > > > [1] > > https://prototype-kernel.readthedocs.io/en/latest/vm/page_pool/design/memory_model_nic.html > > > > === > > Memory Model for Networking > > === > > > > This design describes how the page_pool changes the memory model for > > networking in the NIC (Network Interface Card) drivers. > > > > .. Note:: The catch for driver developers is that, once an application > > requests zero-copy RX, then the driver must use a specific > > SKB allocation mode and might have to reconfigure the > > RX-ring. > > > > > > Design target > > = > > > > Allow the NIC to function as a normal Linux NIC and be shared in a > > safe manner, between the kernel network stack and an accelerated > > userspace application using RX zero-copy delivery. 
> > > > Target is to provide the basis for building RX zero-copy solutions in > > a memory safe manor. An efficient communication channel for userspace > > delivery is out of scope for this document, but OOM considerations are > > discussed below (`Userspace delivery and OOM`_). > > > > Background > > == > > > > The SKB or ``struct sk_buff`` is the fundamental meta-data structure > > for network packets in the Linux Kernel network stack. It is a fairly > > complex object and can be constructed in several ways. > > > > From a memory perspective there are two ways depending on > > RX-buffer/page state: > > > > 1) Writable packet page > > 2) Read-only packet page > > > > To take full potential of the page_pool, the drivers must actually > > support handling both options depending on the configuration state of > > the page_pool. > > > > Writable packet page > > > > > > When the RX packet page is writable, the SKB setup is fairly straight > > forward. The SKB->data (and skb->head) can point directly to the page > > data, adjusting the offset acco
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 00:29, Jason A. Donenfeld wrote:
> Hi Hannes,
>
> On Wed, Dec 14, 2016 at 11:03 PM, Hannes Frederic Sowa wrote:
>> I fear that the alignment requirement will be a source of bugs on 32 bit
>> machines, where you cannot even simply take a well aligned struct on a
>> stack and put it into the normal siphash(aligned) function without
>> adding alignment annotations everywhere. Even blocks returned from
>> kmalloc on 32 bit are not aligned to 64 bit.
>
> That's what the "__aligned(SIPHASH24_ALIGNMENT)" attribute is for. The
> aligned siphash function will be for structs explicitly made for
> siphash consumption. For everything else there's siphash_unaligned.

So in case you have a pointer from somewhere on 32 bit you can essentially only guarantee it has natural alignment or max. native alignment (based on the arch). gcc only fulfills your request for alignment when you allocate on the stack (minus gcc bugs).

Let's say you get a pointer from somewhere, maybe embedded in a struct, which came from kmalloc. kmalloc doesn't care about the aligned attribute, it will align according to the architecture description. That said, if you want to hash that, you would need to manually align the memory returned from kmalloc or make sure the data is more than naturally aligned on that architecture.

>> Can we do this as a runtime check and just have one function (siphash)
>> dealing with that?
>
> Seems like the runtime branching on the aligned function would be bad
> for performance, when we likely know at compile time if it's going to
> be aligned or not. I suppose we could add that check just to the
> unaligned version, and rename it to "maybe_unaligned"? Is this what
> you have in mind?

I argue that you mostly don't know at compile time if it is correctly aligned if the alignment requirements are larger than the natural ones.

Also, we don't even have that for memcpy, even though we use it probably much more than hashing, so I think this is overkill.

Bye,
Hannes
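For readers following the "runtime check" idea, this is roughly what a single entry point with a run-time alignment test looks like. It is a userspace sketch, not siphash and not the proposed kernel API; `is_aligned_64()` and `load_u64()` are hypothetical helpers that only load one 64-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if p sits on a 64-bit boundary. */
static int is_aligned_64(const void *p)
{
    return ((uintptr_t)p & (sizeof(uint64_t) - 1)) == 0;
}

/*
 * Load one 64-bit word in host byte order, choosing the path at run time.
 * The aligned branch assumes the caller really stored a uint64_t there.
 */
static uint64_t load_u64(const void *p)
{
    uint64_t v;

    assert(p != NULL);
    if (is_aligned_64(p))
        return *(const uint64_t *)p;  /* aligned: direct load */

    memcpy(&v, p, sizeof(v));         /* unaligned: byte-wise copy */
    return v;
}
```

The branch costs one test per call, which is the performance concern raised in the quoted reply. The upside is that a pointer of unknown origin (for example from kmalloc on 32 bit) can be passed without the caller having to prove 64-bit alignment.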
Re: [Query] Delayed vxlan socket creation?
On 2016-12-14 17:29, Jiri Benc wrote:
> On Wed, 14 Dec 2016 07:49:24 +, Du, Fan wrote:
>> I'm interested in one Docker issue[1] which looks like it is related to
>> kernel vxlan socket creation, as described in the thread. From my limited
>> knowledge here, socket creation is synchronous, and after the *socket*
>> syscall, the sock handle will be valid and ready to link up.
>>
>> Somehow I'm not sure about the detailed scenario here, and which commit
>> might fix it, and how?
>
> baf606d9c9b1^..56ef9c909b40
>
>  Jiri

Thanks a lot Jiri!
Re: [RFC v2 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface
On Wed, Dec 14, 2016 at 11:59:34PM -0800, Vishwanathapura, Niranjana wrote:
> +
> +static inline bool is_hfi_ibdev(struct ib_device *ibdev)
> +{
> +	return !memcmp(ibdev->name, "hfi", 3);
> +}

I am thinking of adding a device capability flag to indicate HFI VNIC capability instead of relying on the device name as above to identify an HFI IB device. Any comments? Probably it can be addressed by a separate patch later.

Niranjana
Re: [Query] Delayed vxlan socket creation?
On 2016-12-15 01:24, Cong Wang wrote:
> On Tue, Dec 13, 2016 at 11:49 PM, Du, Fan wrote:
>> Hi
>>
>> I'm interested in one Docker issue[1] which looks like it is related to
>> kernel vxlan socket creation, as described in the thread. From my limited
>> knowledge here, socket creation is synchronous, and after the *socket*
>> syscall, the sock handle will be valid and ready to link up.
>
> You need to read the code.
>
> vxlan tunnel is a UDP tunnel, it needs a kernel socket (and a port) to
> set up UDP communication, unlike GRE tunnel etc.

I checked, and the fix is merged in 4.0. My code base is pretty new, so somehow I failed to see the work queue stuff in drivers/net/vxlan.c.

>> Somehow I'm not sure about the detailed scenario here, and which commit
>> might fix it, and how?
>>
>> Thanks!
>>
>> Quoted analysis:
>> --
>> (Found in kernel 3.13) The issue happens because in older kernels when a
>> vxlan interface is created, the socket creation is queued up in a worker
>> thread which actually creates the socket. But this needs to happen before
>> we bring up the link on the vxlan interface. If for some chance, the
>> worker thread hasn't completed the creation of the socket before we did
>> link up then when we do link up the kernel checks if the socket was
>> created and if not it will return ENOTCONN. This was a bug in the kernel
>> which got fixed in later kernels. That is why retrying with a timer fixes
>> the issue.
>
> This was introduced by commit 1c51a9159ddefa5119724a4c7da3fd3ef44b68d5
> and later fixed by commit 56ef9c909b40483d2c8cb63fcbf83865f162d5ec.

Believe in Brother Cong and live forever. Thanks for the offending commit id!
Re: [RFC v2 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface
On Thu, Dec 15, 2016 at 12:53:49AM -0800, Vishwanathapura, Niranjana wrote: > On Wed, Dec 14, 2016 at 11:59:34PM -0800, Vishwanathapura, Niranjana wrote: > > + > > +static inline bool is_hfi_ibdev(struct ib_device *ibdev) > > +{ > > + return !memcmp(ibdev->name, "hfi", 3); > > +} > > I am thinking of adding a device capability flag to indicate HFI VNIC > capability instead of relying on the device name as above to identify an hfi > ib device. Absolutely. > Any comments? Probably it can be addressed by a separate patch later. No, comparing device names is always wrong, please do it ASAP.
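For readers following the exchange above, here is a minimal user-space sketch of the difference between the two checks. It is not the eventual kernel change: `struct demo_ib_device`, `DEMO_DEVICE_CAP_VNIC` and both helpers are hypothetical stand-ins, since the thread names neither the flag nor the field it would live in.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capability bit; the real flag name and value are not in the thread. */
#define DEMO_DEVICE_CAP_VNIC (1ULL << 0)

/* Stand-in for struct ib_device, reduced to the two fields used here. */
struct demo_ib_device {
	char name[64];
	uint64_t cap_flags;
};

/* Name-based test, as in the posted patch: matches any name starting with "hfi". */
static inline bool is_hfi_by_name(const struct demo_ib_device *dev)
{
	return !memcmp(dev->name, "hfi", 3);
}

/* Capability-based test: independent of how the device happens to be named. */
static inline bool has_vnic_cap(const struct demo_ib_device *dev)
{
	return (dev->cap_flags & DEMO_DEVICE_CAP_VNIC) != 0;
}
```

A device that merely has a name beginning with "hfi" passes the first check but not the second, which is the failure mode the review comment is worried about.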
Re: [PATCH] net: sfc: use new api ethtool_{get|set}_link_ksettings
On 14/12/16 23:12, Philippe Reynes wrote: > The ethtool api {get|set}_settings is deprecated. > We move this driver to new api {get|set}_link_ksettings. > > Signed-off-by: Philippe Reynes Thanks Philippe. We'll get some testing done on this. Bert.
[PATCH] net: ipv4: tcp_offload: check segs for NULL
From: Shakya Sundar Das This patch will check segs for being NULL in tcp_gso_segment() before calling skb_shinfo(segs) from skb_is_gso(segs), otherwise kernel can run into a NULL-pointer dereference. Signed-off-by: Shakya Sundar Das --- net/ipv4/tcp_offload.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c index bc68da3..93feefd 100644 --- a/net/ipv4/tcp_offload.c +++ b/net/ipv4/tcp_offload.c @@ -96,7 +96,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb, skb->ooo_okay = 0; segs = skb_segment(skb, features); - if (IS_ERR(segs)) + if (IS_ERR_OR_NULL(segs)) goto out; /* Only first segment might have ooo_okay set */ -- 1.7.9.5
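The distinction the patch relies on can be illustrated outside the kernel. The sketch below re-creates the error-pointer convention with `demo_` names, under the assumption that it matches the kernel's include/linux/err.h in spirit (errors encoded in the top 4095 pointer values); it is an illustration, not the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer near the top of the address space. */
static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True only for encoded errors; NULL is NOT an error pointer. */
static inline bool demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* True for encoded errors and for NULL, the case the patch adds. */
static inline bool demo_is_err_or_null(const void *p)
{
	return p == NULL || demo_is_err(p);
}
```

With this convention a NULL return slips past a plain error-pointer check, so any later dereference of the result would fault; the combined check covers both outcomes.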
Re: sanity checking iov_iter patches
On Thu, 15 Dec 2016 06:23:05 + Al Viro wrote: > Some of the vfs.git#work.iov_iter stuff touches net/*; basically, > there are several missing primitives (copy_from_iter_full(), etc.) for > "try to copy, tell whether it has copied the full amount requested and > advance the iterator only in case of success". Most of the callers were > actually doing just that (see e.g. skb_add_data() and friends) and while > nothing in the current kernel cares whether we advance ->msg_iter on > failure, it's much more consistent semantics. > > If anybody has objections to that stuff (in linux-next, or in > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git#work.iov_iter), > or thinks that some of that should go via net-next.git, yell and I'll > drop the bits in question. If not, to Linus it all goes... Just some links to make it quicker for people see the three patches: http://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git/log/?h=work.iov_iter Patches: http://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git/commit/?h=work.iov_iter&id=cbbd26b8b1a http://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git/commit/?h=work.iov_iter&id=15e6cb46c9b http://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git/commit/?h=work.iov_iter&id=0b62fca2623 -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer
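The "copy everything or leave the iterator alone" semantics described in the quoted text can be sketched with a toy iterator. Everything here is a simplified assumption: `struct demo_iter` has a single flat segment, and the only failure modelled is a short source, whereas the kernel primitives also have to cope with faults in the middle of a copy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy single-segment iterator; the kernel's struct iov_iter is far richer. */
struct demo_iter {
	const unsigned char *data;
	size_t remaining;
};

/* Partial-copy variant: copy what is available, advance by that much. */
static inline size_t demo_copy_from_iter(void *dst, size_t len, struct demo_iter *it)
{
	size_t n = len < it->remaining ? len : it->remaining;

	memcpy(dst, it->data, n);
	it->data += n;
	it->remaining -= n;
	return n;
}

/* All-or-nothing variant: on failure the iterator is left untouched. */
static inline bool demo_copy_from_iter_full(void *dst, size_t len, struct demo_iter *it)
{
	if (len > it->remaining)
		return false;
	memcpy(dst, it->data, len);
	it->data += len;
	it->remaining -= len;
	return true;
}
```

A caller of the second variant can simply bail out on `false` without having to rewind the iterator, which is the consistency argument made in the mail.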
Re: [PATCH 5/8] linux: drop __bitwise__ everywhere
Hello. On 15/12/16 06:15, Michael S. Tsirkin wrote: __bitwise__ used to mean "yes, please enable sparse checks unconditionally", but now that we dropped __CHECK_ENDIAN__ __bitwise is exactly the same. There aren't many users, replace it by __bitwise everywhere. Signed-off-by: Michael S. Tsirkin --- arch/arm/plat-samsung/include/plat/gpio-cfg.h| 2 +- drivers/md/dm-cache-block-types.h| 6 +++--- drivers/net/ethernet/sun/sunhme.h| 2 +- drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h | 4 ++-- include/linux/mmzone.h | 2 +- include/linux/serial_core.h | 4 ++-- include/linux/types.h| 4 ++-- include/scsi/iscsi_proto.h | 2 +- include/target/target_core_base.h| 2 +- include/uapi/linux/virtio_types.h| 6 +++--- net/ieee802154/6lowpan/6lowpan_i.h | 2 +- net/mac80211/ieee80211_i.h | 4 ++-- 12 files changed, 20 insertions(+), 20 deletions(-) diff --git a/arch/arm/plat-samsung/include/plat/gpio-cfg.h b/arch/arm/plat-samsung/include/plat/gpio-cfg.h index 21391fa..e55d1f5 100644 --- a/arch/arm/plat-samsung/include/plat/gpio-cfg.h +++ b/arch/arm/plat-samsung/include/plat/gpio-cfg.h @@ -26,7 +26,7 @@ #include -typedef unsigned int __bitwise__ samsung_gpio_pull_t; +typedef unsigned int __bitwise samsung_gpio_pull_t; /* forward declaration if gpio-core.h hasn't been included */ struct samsung_gpio_chip; diff --git a/drivers/md/dm-cache-block-types.h b/drivers/md/dm-cache-block-types.h index bed4ad4..389c9e8 100644 --- a/drivers/md/dm-cache-block-types.h +++ b/drivers/md/dm-cache-block-types.h @@ -17,9 +17,9 @@ * discard bitset. 
*/ -typedef dm_block_t __bitwise__ dm_oblock_t; -typedef uint32_t __bitwise__ dm_cblock_t; -typedef dm_block_t __bitwise__ dm_dblock_t; +typedef dm_block_t __bitwise dm_oblock_t; +typedef uint32_t __bitwise dm_cblock_t; +typedef dm_block_t __bitwise dm_dblock_t; static inline dm_oblock_t to_oblock(dm_block_t b) { diff --git a/drivers/net/ethernet/sun/sunhme.h b/drivers/net/ethernet/sun/sunhme.h index f430765..4a8d5b1 100644 --- a/drivers/net/ethernet/sun/sunhme.h +++ b/drivers/net/ethernet/sun/sunhme.h @@ -302,7 +302,7 @@ * Always write the address first before setting the ownership * bits to avoid races with the hardware scanning the ring. */ -typedef u32 __bitwise__ hme32; +typedef u32 __bitwise hme32; struct happy_meal_rxd { hme32 rx_flags; diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h b/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h index 1ad0ec1..84813b5 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h @@ -228,7 +228,7 @@ enum iwl_ucode_tlv_flag { IWL_UCODE_TLV_FLAGS_BCAST_FILTERING = BIT(29), }; -typedef unsigned int __bitwise__ iwl_ucode_tlv_api_t; +typedef unsigned int __bitwise iwl_ucode_tlv_api_t; /** * enum iwl_ucode_tlv_api - ucode api @@ -258,7 +258,7 @@ enum iwl_ucode_tlv_api { #endif }; -typedef unsigned int __bitwise__ iwl_ucode_tlv_capa_t; +typedef unsigned int __bitwise iwl_ucode_tlv_capa_t; /** * enum iwl_ucode_tlv_capa - ucode capabilities diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 0f088f3..36d9896 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -246,7 +246,7 @@ struct lruvec { #define ISOLATE_UNEVICTABLE((__force isolate_mode_t)0x8) /* LRU Isolation modes. 
*/ -typedef unsigned __bitwise__ isolate_mode_t; +typedef unsigned __bitwise isolate_mode_t; enum zone_watermarks { WMARK_MIN, diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h index 5d49488..5def8e8 100644 --- a/include/linux/serial_core.h +++ b/include/linux/serial_core.h @@ -111,8 +111,8 @@ struct uart_icount { __u32 buf_overrun; }; -typedef unsigned int __bitwise__ upf_t; -typedef unsigned int __bitwise__ upstat_t; +typedef unsigned int __bitwise upf_t; +typedef unsigned int __bitwise upstat_t; struct uart_port { spinlock_t lock; /* port lock */ diff --git a/include/linux/types.h b/include/linux/types.h index baf7183..d501ad3 100644 --- a/include/linux/types.h +++ b/include/linux/types.h @@ -154,8 +154,8 @@ typedef u64 dma_addr_t; typedef u32 dma_addr_t; #endif -typedef unsigned __bitwise__ gfp_t; -typedef unsigned __bitwise__ fmode_t; +typedef unsigned __bitwise gfp_t; +typedef unsigned __bitwise fmode_t; #ifdef CONFIG_PHYS_ADDR_T_64BIT typedef u64 phys_addr_t; diff --git a/include/scsi/iscsi_proto.h b/include/scsi/iscsi_proto.h index c1260d8..df156f1 100644 --- a/include/scsi/iscsi_proto.h +++ b/include/scsi/iscsi_proto.h @@ -74,7 +74,7 @@ static inline int iscsi_sna_gte(u32 n1, u32 n2) #define zero_data(p) {p[0]=0;p[1]=0;p[2]=0;} /* initiator tags; opaque for target */ -typedef uint32_t __bitwise__ itt_t; +typedef uint32_t _
Re: [PATCH net] vxlan: fix unused variable warning
On Wed, 14 Dec 2016 12:43:55 -0800, Stephen Hemminger wrote: > Fixes commit 4528520d315ac1 ("vxlan: add ipv6 proxy support") Wrong hash, it was commit f564f45c4518. And that commit actually did use saddr, the actual commit that is being fixed is 4b29dba9c085 ("vxlan: fix nonfunctional neigh_reduce()"). Also, please use the standard Fixes: line when resubmitting this. The patch itself looks good. Thanks, Jiri
Re: [RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)
On Wed, Dec 14, 2016 at 11:59:32PM -0800, Vishwanathapura, Niranjana wrote: > Thanks Jason for the valuable feedback. > Here is the revised HFI VNIC patch series. > > ChangeLog: > = > v1 => v2: > a) Removed hfi_vnic bus, instead make hfi_vnic driver an 'ib client', >as per feedback from Jason Gunthorpe. > b) Interface changes, data structure changes and variable name changes >associated with (a). > c) Add hfi_ibdev abstraction to provide VNIC control operations to >hfi_vnic client. > d) Minor fixes > e) Moved hfi_vnic driver from .../sw/intel/vnic/hfi_vnic to >.../sw/intel/hfi_vnic. To put it into proportion, Jason asked you to do a different thing. http://marc.info/?l=linux-rdma&m=147977108302151&w=2 http://marc.info/?l=linux-rdma&m=148000415401842&w=2 And Christoph, http://marc.info/?l=linux-rdma&m=147985587425861&w=2 signature.asc Description: PGP signature
Re: [PATCH v2 2/2] net: ethernet: stmmac: remove private tx queue lock
Hi! > The driver uses a private lock for synchronization of the xmit function and > the xmit completion handler, but since the NETIF_F_LLTX flag is not set, > the xmit function is also called with the xmit_lock held. > > On the other hand the completion handler uses the reverse locking order by > first taking the private lock and (in case that the tx queue had been > stopped) then the xmit_lock. > > Improve the locking by removing the private lock and using only the > xmit_lock for synchronization instead. Do you have stmmac hardware to test on? I believe something is very wrong with the locking there. In particular... scheduling the stmmac_tx_timer() function to run often should not do anything bad if locking is correct... but it breaks the driver rather quickly. [Example patch below, needs applying to two places in net-next.] (Other possibility is that hardware races with the driver.) Giuseppe, is there documentation available for the chip? Driver says Documentation available at: http://www.stlinux.com but that page does not work for me... 
404 Not Found Code: NoSuchBucket Message: The specified bucket does not exist BucketName: www.stlinux.com RequestId: 1C8A20CB99AE7F75 HostId: ljPnqbEpyD8exct5MUgcDXSW8n+I67Yw0aejNhLuBQ0pqN0UCfiRBa3ztlOMngiXoSN+COX+VSw= Best regards, Pavel diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index ffbcd03..8040370 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1973,8 +1973,9 @@ static void stmmac_xmit_common(struct sk_buff *skb, struct net_device *dev, int */ priv->tx_count_frames += nfrags + 1; if (likely(priv->tx_coal_frames > priv->tx_count_frames)) { - mod_timer(&priv->txtimer, - STMMAC_COAL_TIMER(priv->tx_coal_timer)); + if (priv->tx_count_frames == nfrags + 1) + mod_timer(&priv->txtimer, + STMMAC_COAL_TIMER(priv->tx_coal_timer)); } else { priv->tx_count_frames = 0; priv->hw->desc->set_tx_ic(desc); -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html signature.asc Description: Digital signature
[PATCH] bpf: cgroup: annotate pointers in struct cgroup_bpf with __rcu
The member 'effective' in 'struct cgroup_bpf' is protected by RCU. Annotate it accordingly to squelch a sparse warning. Signed-off-by: Daniel Mack --- include/linux/bpf-cgroup.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h index 7b6e5d1..92bc89a 100644 --- a/include/linux/bpf-cgroup.h +++ b/include/linux/bpf-cgroup.h @@ -20,7 +20,7 @@ struct cgroup_bpf { * when this cgroup is accessed. */ struct bpf_prog *prog[MAX_BPF_ATTACH_TYPE]; - struct bpf_prog *effective[MAX_BPF_ATTACH_TYPE]; + struct bpf_prog __rcu *effective[MAX_BPF_ATTACH_TYPE]; }; void cgroup_bpf_put(struct cgroup *cgrp); -- 2.9.3
Re: [PATCH v2 2/2] net: ethernet: stmmac: remove private tx queue lock
On 12/15/2016 10:45 AM, Pavel Machek wrote: Giuseppe, is there documentation available for the chip? Driver says Documentation available at: http://www.stlinux.com but that page does not work for me... Hi Pavel, yes the page has been removed but all the relevant and updated driver doc is inside the kernel sources. Regards Peppe
RE: [PATCH v3 3/3] random: use siphash24 instead of md5 for get_random_int/long
From: Behalf Of Jason A. Donenfeld > Sent: 14 December 2016 18:46 ... > + ret = *chaining = siphash24((u8 *)&combined, offsetof(typeof(combined), > end), If you make the first argument 'const void *' you won't need the cast on every call. I'd also suggest making the key u64[2]. David
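A small sketch of the prototype shape David suggests: a `const void *` data argument and a two-word `u64` key. The body is a placeholder byte mixer, explicitly not SipHash, and `demo_hash` is a hypothetical name; only the calling convention is the point.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder mixer, NOT SipHash: only the prototype shape matters here. */
static inline uint64_t demo_hash(const void *data, size_t len, const uint64_t key[2])
{
	const unsigned char *p = data; /* implicit conversion from void *, no cast */
	uint64_t h = key[0] ^ key[1];

	for (size_t i = 0; i < len; i++)
		h = (h ^ p[i]) * 0x100000001b3ULL;
	return h;
}
```

With this signature a caller can pass the address of any object, for example `demo_hash(&combined, sizeof(combined), key)`, without the `(u8 *)` cast seen in the quoted line.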
RE: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function
From: Linus Torvalds > Sent: 15 December 2016 00:11 > On Wed, Dec 14, 2016 at 3:34 PM, Jason A. Donenfeld wrote: > > > > Or does your reasonable dislike of "word" still allow for the use of > > dword and qword, so that the current function names of: > > dword really is confusing to people. > > If you have a MIPS background, it means 64 bits. While to people with > Windows programming backgrounds it means 32 bits. Guess what a DWORD_PTR is on 64bit windows ... (it is an integer type). David
Re: [PATCH v2 2/2] net: ethernet: stmmac: remove private tx queue lock
On Thu 2016-12-15 11:08:36, Giuseppe CAVALLARO wrote: > On 12/15/2016 10:45 AM, Pavel Machek wrote: > >Giuseppe, is there documentation available for the chip? Driver says > > > > Documentation available at: > > http://www.stlinux.com > > > >but that page does not work for me... > > Hi Pavel, yes the page has been removed but all the relevant and > updated driver doc is inside the kernel sources. Ok, perhaps the link should be removed, then? (Along with the bugzilla link if that is not going to be re-enabled?) Is there documentation for the hardware somewhere? (As something is very wrong with stmmac_tx_clean(), either locking or interface to the DMA engine.) Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html signature.asc Description: Digital signature
[PATCH 1/5] irda: irproc.c: Remove unneeded linux/miscdevice.h include
irproc.c does not use any miscdevice, so this patch removes the unnecessary include. Signed-off-by: Corentin Labbe --- net/irda/irproc.c | 1 - 1 file changed, 1 deletion(-) diff --git a/net/irda/irproc.c b/net/irda/irproc.c index b9ac598..77cfdde 100644 --- a/net/irda/irproc.c +++ b/net/irda/irproc.c @@ -23,7 +23,6 @@ * / -#include #include #include #include -- 2.10.2
[PATCH 3/5] irnet: ppp: move IRNET_MINOR to include/linux/miscdevice.h
This patch moves the define for IRNET_MINOR to include/linux/miscdevice.h. It is better that all minor number definitions are in the same place. Signed-off-by: Corentin Labbe --- include/linux/miscdevice.h | 1 + net/irda/irnet/irnet_ppp.h | 1 - 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h index 18b2e3b..5ea0a65 100644 --- a/include/linux/miscdevice.h +++ b/include/linux/miscdevice.h @@ -37,6 +37,7 @@ #define HWRNG_MINOR 183 #define MICROCODE_MINOR 184 #define KEYPAD_MINOR 185 +#define IRNET_MINOR 187 #define D7S_MINOR 193 #define VFIO_MINOR 196 #define TUN_MINOR 200 diff --git a/net/irda/irnet/irnet_ppp.h b/net/irda/irnet/irnet_ppp.h index 693ebc0..18fcead 100644 --- a/net/irda/irnet/irnet_ppp.h +++ b/net/irda/irnet/irnet_ppp.h @@ -21,7 +21,6 @@ /* /dev/irnet file constants */ #define IRNET_MAJOR 10 /* Misc range */ -#define IRNET_MINOR 187 /* Official allocation */ /* IrNET control channel stuff */ #define IRNET_MAX_COMMAND 256 /* Max length of a command line */ -- 2.10.2
[PATCH 5/5] irda: irnet: add member name to the miscdevice declaration
Since struct miscdevice has many members, it is dangerous to initialize it without member names, relying only on member order. This patch adds member names to the initializer. Signed-off-by: Corentin Labbe --- net/irda/irnet/irnet_ppp.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/net/irda/irnet/irnet_ppp.h b/net/irda/irnet/irnet_ppp.h index ec092c9..1ed17f9 100644 --- a/net/irda/irnet/irnet_ppp.h +++ b/net/irda/irnet/irnet_ppp.h @@ -108,9 +108,9 @@ static const struct file_operations irnet_device_fops = /* Structure so that the misc major (drivers/char/misc.c) take care of us... */ static struct miscdevice irnet_misc_device = { - IRNET_MINOR, - "irnet", - &irnet_device_fops + .minor = IRNET_MINOR, + .name = "irnet", + .fops = &irnet_device_fops }; #endif /* IRNET_PPP_H */ -- 2.10.2
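To make the motivation concrete, here is a self-contained sketch of a designated initializer on a cut-down structure. The `demo_` types are stand-ins, not the kernel definitions; the three member names are chosen to follow include/linux/miscdevice.h, where the member holding the file_operations pointer is named `fops`.

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel types; only three members are modelled. */
struct demo_file_operations {
	int unused;
};

struct demo_miscdevice {
	int minor;
	const char *name;
	const struct demo_file_operations *fops;
};

#define DEMO_IRNET_MINOR 187

static const struct demo_file_operations demo_irnet_fops = { 0 };

/* Each value is tied to a member name, so reordering the struct cannot misassign it. */
static struct demo_miscdevice demo_irnet_misc = {
	.minor = DEMO_IRNET_MINOR,
	.name  = "irnet",
	.fops  = &demo_irnet_fops,
};
```

A misspelled member name in such an initializer is rejected by the compiler, whereas a positional initializer silently keeps compiling if a new member is inserted ahead of the ones being set.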
RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa > Sent: 14 December 2016 22:03 > On 14.12.2016 13:46, Jason A. Donenfeld wrote: > > Hi David, > > > > On Wed, Dec 14, 2016 at 10:56 AM, David Laight > > wrote: > >> ... > >>> +u64 siphash24(const u8 *data, size_t len, const u8 > >>> key[SIPHASH24_KEY_LEN]) > >> ... > >>> + u64 k0 = get_unaligned_le64(key); > >>> + u64 k1 = get_unaligned_le64(key + sizeof(u64)); > >> ... > >>> + m = get_unaligned_le64(data); > >> > >> All these unaligned accesses are going to get expensive on architectures > >> like sparc64. > > > > Yes, the unaligned accesses aren't pretty. Since in pretty much all > > use cases thus far, the data can easily be made aligned, perhaps it > > makes sense to create siphash24() and siphash24_unaligned(). Any > > thoughts on doing something like that? > > I fear that the alignment requirement will be a source of bugs on 32 bit > machines, where you cannot even simply take a well aligned struct on a > stack and put it into the normal siphash(aligned) function without > adding alignment annotations everywhere. Even blocks returned from > kmalloc on 32 bit are not aligned to 64 bit. Are you doing anything that will require 64bit alignment on 32bit systems? It is unlikely that the kernel can use any simd registers that have wider alignment requirements. You also really don't want to request on-stack items have large alignments. While gcc can generate code to do it, it isn't pretty. David
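As background for the alignment discussion, this is one portable way to read a little-endian 64-bit value from an address with no alignment guarantee. It is a sketch with a hypothetical `demo_` name, not the kernel's `get_unaligned_le64()` implementation, which may compile to something cheaper on architectures that tolerate unaligned loads.

```c
#include <stdint.h>

/* Byte-wise little-endian load: valid for any alignment and any host endianness. */
static inline uint64_t demo_get_unaligned_le64(const void *p)
{
	const unsigned char *b = p;
	uint64_t v = 0;

	/* Start from the most significant byte (highest address) and shift down. */
	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}
```

Eight byte loads plus shifts per word are the kind of cost being weighed in the thread on strict-alignment machines, which is why a separate aligned fast path is being considered.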
Re: Synopsys Ethernet QoS
Hi! > I know that this is completely off topic, but I am facing a difficulty with > stmmac. I have interrupts, mac well configured rx packets being received > successfully, but TX is not working, resulting in Tx errors = Total TX > packets. > I have made a lot of debug and my conclusion is that for some reason when > using > stmmac after starting tx dma, the hw state machine enters a deadend state > resulting in those errors. Anyone faced this trouble? Actually, I see you have address @synopsys.com, would you have documentation for the chip? I'm trying to understand stmmac_tx_clean() and docs would help... Thanks, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html signature.asc Description: Digital signature
[PATCH net-next] ixgbevf: fix 'Etherleak' in ixgbevf
Nessus reports that the VF appears to leak memory in network packets (CVE-2003-0001). Fix this by padding all small packets manually. https://ofirarkin.files.wordpress.com/2008/11/atstake_etherleak_report.pdf Signed-off-by: Weilong Chen --- drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 6d4bef5..137a154 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -3654,6 +3654,13 @@ static int ixgbevf_xmit_frame(struct sk_buff *skb, struct net_device *netdev) return NETDEV_TX_OK; } + /* On PCI/PCI-X HW, if packet size is less than ETH_ZLEN, +* packets may get corrupted during padding by HW. +* To WA this issue, pad all small packets manually. +*/ + if (eth_skb_pad(skb)) + return NETDEV_TX_OK; + tx_ring = adapter->tx_ring[skb->queue_mapping]; /* need: 1 descriptor per page * PAGE_SIZE/IXGBE_MAX_DATA_PER_TXD, -- 1.7.12
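The idea behind the fix, zero-filling short frames in software so that no stale memory ends up on the wire, can be sketched on a plain buffer. `demo_pad_frame` and `DEMO_ETH_ZLEN` are hypothetical names; the value 60 is assumed to match the kernel's ETH_ZLEN (minimum frame length without FCS), and the real helper operates on an skb rather than a byte array.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_ETH_ZLEN 60 /* minimum Ethernet frame length, excluding FCS */

/*
 * Zero-pad a frame held in buf (capacity cap bytes) up to DEMO_ETH_ZLEN.
 * Returns the new length, or 0 if the buffer cannot hold the padded frame.
 */
static inline size_t demo_pad_frame(unsigned char *buf, size_t len, size_t cap)
{
	if (len >= DEMO_ETH_ZLEN)
		return len;
	if (cap < DEMO_ETH_ZLEN)
		return 0;
	memset(buf + len, 0, DEMO_ETH_ZLEN - len);
	return DEMO_ETH_ZLEN;
}
```

Because the padding bytes are written explicitly, whatever was previously in the buffer beyond the payload is overwritten before transmission, which is the property the Etherleak report is about.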
[PATCH 4/5] irda: irnet: Remove unused IRNET_MAJOR define
The IRNET_MAJOR define is not used, so this patch removes it. Signed-off-by: Corentin Labbe --- net/irda/irnet/irnet_ppp.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/net/irda/irnet/irnet_ppp.h b/net/irda/irnet/irnet_ppp.h index 18fcead..ec092c9 100644 --- a/net/irda/irnet/irnet_ppp.h +++ b/net/irda/irnet/irnet_ppp.h @@ -19,9 +19,6 @@ / CONSTANTS & MACROS / -/* /dev/irnet file constants */ -#define IRNET_MAJOR 10 /* Misc range */ - /* IrNET control channel stuff */ #define IRNET_MAX_COMMAND 256 /* Max length of a command line */ -- 2.10.2
Re: [PATCH 3/5] irnet: ppp: move IRNET_MINOR to include/linux/miscdevice.h
On Thu, Dec 15, 2016 at 11:42:48AM +0100, Corentin Labbe wrote: > This patch move the define for IRNET_MINOR to include/linux/miscdevice.h > It is better that all minor number definitions are in the same place. > > Signed-off-by: Corentin Labbe > --- > include/linux/miscdevice.h | 1 + > net/irda/irnet/irnet_ppp.h | 1 - > 2 files changed, 1 insertion(+), 1 deletion(-) > > diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h > index 18b2e3b..5ea0a65 100644 > --- a/include/linux/miscdevice.h > +++ b/include/linux/miscdevice.h > @@ -37,6 +37,7 @@ > #define HWRNG_MINOR 183 > #define MICROCODE_MINOR 184 > #define KEYPAD_MINOR 185 > +#define IRNET_MINOR 187 > #define D7S_MINOR193 > #define VFIO_MINOR 196 > #define TUN_MINOR200 > diff --git a/net/irda/irnet/irnet_ppp.h b/net/irda/irnet/irnet_ppp.h > index 693ebc0..18fcead 100644 > --- a/net/irda/irnet/irnet_ppp.h > +++ b/net/irda/irnet/irnet_ppp.h > @@ -21,7 +21,6 @@ > > /* /dev/irnet file constants */ > #define IRNET_MAJOR 10 /* Misc range */ > -#define IRNET_MINOR 187 /* Official allocation */ > > /* IrNET control channel stuff */ > #define IRNET_MAX_COMMAND256 /* Max length of a command line */ > -- > 2.10.2 Acked-by: Greg Kroah-Hartman
[PATCH 2/5] irda: irnet: Move linux/miscdevice.h include
The only user of miscdevice is irnet_ppp, so there is no need to include linux/miscdevice.h for all irda files. This patch moves the linux/miscdevice.h include to irnet_ppp.h. Signed-off-by: Corentin Labbe --- net/irda/irnet/irnet.h | 1 - net/irda/irnet/irnet_ppp.h | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/net/irda/irnet/irnet.h b/net/irda/irnet/irnet.h index 8d65bb9..c69f0f3 100644 --- a/net/irda/irnet/irnet.h +++ b/net/irda/irnet/irnet.h @@ -245,7 +245,6 @@ #include #include #include -#include #include #include #include/* isspace() */ diff --git a/net/irda/irnet/irnet_ppp.h b/net/irda/irnet/irnet_ppp.h index 9402258..693ebc0 100644 --- a/net/irda/irnet/irnet_ppp.h +++ b/net/irda/irnet/irnet_ppp.h @@ -15,6 +15,7 @@ /* INCLUDES */ #include "irnet.h" /* Module global include */ +#include / CONSTANTS & MACROS / -- 2.10.2
Re: [PATCH 8/8] Makefile: drop -D__CHECK_ENDIAN__ from cflags
On Thu, Dec 15, 2016 at 07:15:30AM +0200, Michael S. Tsirkin wrote: > That's the default now, no need for makefiles to set it. > > Signed-off-by: Michael S. Tsirkin > --- > drivers/bluetooth/Makefile| 2 -- > drivers/net/can/Makefile | 1 - > drivers/net/ethernet/altera/Makefile | 1 - > drivers/net/ethernet/atheros/alx/Makefile | 1 - > drivers/net/ethernet/freescale/Makefile | 2 -- > drivers/net/wireless/ath/Makefile | 2 -- > drivers/net/wireless/ath/wil6210/Makefile | 2 -- > drivers/net/wireless/broadcom/brcm80211/brcmfmac/Makefile | 2 -- > drivers/net/wireless/broadcom/brcm80211/brcmsmac/Makefile | 1 - > drivers/net/wireless/intel/iwlegacy/Makefile | 2 -- > drivers/net/wireless/intel/iwlwifi/Makefile | 2 +- > drivers/net/wireless/intel/iwlwifi/dvm/Makefile | 2 +- > drivers/net/wireless/intel/iwlwifi/mvm/Makefile | 2 +- > drivers/net/wireless/intersil/orinoco/Makefile| 3 --- > drivers/net/wireless/mediatek/mt7601u/Makefile| 2 -- > drivers/net/wireless/realtek/rtlwifi/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/btcoexist/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8188ee/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8192c/Makefile| 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8192ce/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8192cu/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8192de/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8192ee/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8192se/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8723ae/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8723be/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8723com/Makefile | 2 -- > drivers/net/wireless/realtek/rtlwifi/rtl8821ae/Makefile | 2 -- > drivers/net/wireless/ti/wl1251/Makefile | 2 -- > drivers/net/wireless/ti/wlcore/Makefile | 2 -- > drivers/staging/rtl8188eu/Makefile| 2 +- > drivers/staging/rtl8192e/Makefile | 2 -- > 
drivers/staging/rtl8192e/rtl8192e/Makefile| 2 -- > net/bluetooth/Makefile| 2 -- > net/ieee802154/Makefile | 2 -- > net/mac80211/Makefile | 2 +- > net/mac802154/Makefile| 2 -- > net/wireless/Makefile | 2 -- > 38 files changed, 5 insertions(+), 68 deletions(-) For drivers/staging: Acked-by: Greg Kroah-Hartman
Re: [PATCH 5/8] linux: drop __bitwise__ everywhere
On Thu, Dec 15, 2016 at 07:15:20AM +0200, Michael S. Tsirkin wrote: > __bitwise__ used to mean "yes, please enable sparse checks > unconditionally", but now that we dropped __CHECK_ENDIAN__ > __bitwise is exactly the same. > There aren't many users, replace it by __bitwise everywhere. > > Signed-off-by: Michael S. Tsirkin > --- > arch/arm/plat-samsung/include/plat/gpio-cfg.h| 2 +- > drivers/md/dm-cache-block-types.h| 6 +++--- > drivers/net/ethernet/sun/sunhme.h| 2 +- > drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h | 4 ++-- > include/linux/mmzone.h | 2 +- > include/linux/serial_core.h | 4 ++-- > include/linux/types.h| 4 ++-- > include/scsi/iscsi_proto.h | 2 +- > include/target/target_core_base.h| 2 +- > include/uapi/linux/virtio_types.h| 6 +++--- > net/ieee802154/6lowpan/6lowpan_i.h | 2 +- > net/mac80211/ieee80211_i.h | 4 ++-- > 12 files changed, 20 insertions(+), 20 deletions(-) for include/linux/serial_core.h: Acked-by: Greg Kroah-Hartman
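For context on what the annotation does, the sketch below shows the usual pattern: the attribute is only visible to the sparse checker (which defines `__CHECKER__`) and expands to nothing for a normal compiler. The `demo_` macro and type names are hypothetical and merely mirror the shape of the kernel's `__bitwise` and `__force`; this is not a copy of the kernel headers.

```c
/* Only sparse defines __CHECKER__; a regular compiler sees empty macros. */
#ifdef __CHECKER__
#define demo_bitwise __attribute__((bitwise))
#define demo_force   __attribute__((force))
#else
#define demo_bitwise
#define demo_force
#endif

/* A flag type that sparse refuses to mix with plain integers. */
typedef unsigned int demo_bitwise demo_mode_t;

/* Constants need an explicit "force" cast to enter the restricted type. */
#define DEMO_MODE_A ((demo_force demo_mode_t)0x1)
#define DEMO_MODE_B ((demo_force demo_mode_t)0x2)

/* Bitwise operators between values of the same restricted type stay legal. */
static inline int demo_mode_has(demo_mode_t mode, demo_mode_t bit)
{
	return (mode & bit) != 0;
}
```

Under a normal compiler `demo_mode_t` is just an `unsigned int`, so the annotation costs nothing at run time; the checking happens only when the tree is built with sparse.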
Re: Synopsys Ethernet QoS
On 12/14/2016 01:57 PM, Pavel Machek wrote: > Hi! > >> So if there is a long time before handling interrupts, >> I guess that it makes sense that one stream could >> get an advantage in the net scheduler. >> >> If I find the time, and if no one beats me to it, I will try to replace >> the normal timers with HR timers + a smaller default timeout. >> > Can you try something like this? Highres timers will be needed, too, > but this fixes the logic problem. Hello Pavel I tried your patch, but unfortunately I get a tx queue timeout. After that, I cannot ping. [ 22.075782] [ cut here ] [ 22.080430] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x240/0x258 [ 22.088704] NETDEV WATCHDOG: eth0 (stmmaceth): transmit queue 0 timed out [ 22.095491] Modules linked in: [ 22.098552] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-axis3-devel #126 [ 22.105592] Hardware name: Axis ARTPEC-6 Platform [ 22.110301] [<80110568>] (unwind_backtrace) from [<8010c2bc>] (show_stack+0x18/0x1c) [ 22.118043] [<8010c2bc>] (show_stack) from [<80433544>] (dump_stack+0x80/0xa0) [ 22.125264] [<80433544>] (dump_stack) from [<8011f9f0>] (__warn+0xe0/0x10c) [ 22.132221] [<8011f9f0>] (__warn) from [<8011fadc>] (warn_slowpath_fmt+0x40/0x50) [ 22.139700] [<8011fadc>] (warn_slowpath_fmt) from [<805e626c>] (dev_watchdog+0x240/0x258) [ 22.147875] [<805e626c>] (dev_watchdog) from [<801826c8>] (call_timer_fn+0x44/0x208) [ 22.155613] [<801826c8>] (call_timer_fn) from [<80182934>] (expire_timers+0xa8/0x15c) [ 22.163437] [<80182934>] (expire_timers) from [<80182a74>] (run_timer_softirq+0x8c/0x164) [ 22.171610] [<80182a74>] (run_timer_softirq) from [<80124a7c>] (__do_softirq+0xac/0x3f0) [ 22.179696] [<80124a7c>] (__do_softirq) from [<80125124>] (irq_exit+0xf0/0x158) [ 22.187003] [<80125124>] (irq_exit) from [<8016ffd4>] (__handle_domain_irq+0x60/0xb8) [ 22.194828] [<8016ffd4>] (__handle_domain_irq) from [<801014c4>] (gic_handle_irq+0x4c/0x9c) [ 22.203175] [<801014c4>] (gic_handle_irq) from 
[<806cc48c>] (__irq_svc+0x6c/0xa8) [ 22.210648] Exception stack(0x80b01f60 to 0x80b01fa8) [ 22.215694] 1f60: bf5c03f0 80b01fb8 8011a060 0001 80b03c9c 80b03c2c [ 22.223865] 1f80: 80b1c045 80b1c045 0001 80a673f0 80b01fb0 801090c0 801090c4 [ 22.232032] 1fa0: 6013 [ 22.235520] [<806cc48c>] (__irq_svc) from [<801090c4>] (arch_cpu_idle+0x38/0x44) [ 22.242914] [<801090c4>] (arch_cpu_idle) from [<80160f00>] (cpu_startup_entry+0xd8/0x148) [ 22.251089] [<80160f00>] (cpu_startup_entry) from [<80a00c44>] (start_kernel+0x360/0x3c8) [ 22.259269] ---[ end trace e04d3944bdde616a ]--- I patched both stmmac_tso_xmit and stmmac_xmit, as instructed. Here is the diff: --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -2090,8 +2090,9 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev) /* Manage tx mitigation */ priv->tx_count_frames += nfrags + 1; if (likely(priv->tx_coal_frames > priv->tx_count_frames)) { - mod_timer(&priv->txtimer, - STMMAC_COAL_TIMER(priv->tx_coal_timer)); + if (priv->tx_count_frames == nfrags + 1) + mod_timer(&priv->txtimer, + STMMAC_COAL_TIMER(priv->tx_coal_timer)); } else { priv->tx_count_frames = 0; priv->hw->desc->set_tx_ic(desc); @@ -2292,8 +2293,9 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev) */ priv->tx_count_frames += nfrags + 1; if (likely(priv->tx_coal_frames > priv->tx_count_frames)) { - mod_timer(&priv->txtimer, - STMMAC_COAL_TIMER(priv->tx_coal_timer)); + if (priv->tx_count_frames == nfrags + 1) + mod_timer(&priv->txtimer, + STMMAC_COAL_TIMER(priv->tx_coal_timer)); } else { priv->tx_count_frames = 0; priv->hw->desc->set_tx_ic(desc); Without your patch, I get no tx queue timeout, and ping works fine. > > You'll need to apply it twice as code is copy&pasted. 
> > Best regards, > Pavel > > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c > >*/ > priv->tx_count_frames += nfrags + 1; > if (likely(priv->tx_coal_frames > priv->tx_count_frames)) { > - mod_timer(&priv->txtimer, > - STMMAC_COAL_TIMER(priv->tx_coal_timer)); > + if (priv->tx_count_frames == nfrags + 1) > + mod_timer(&priv->txtimer, > + STMMAC_COAL_TIMER(priv->tx_coal_timer)); > } else { > priv->tx_count_frames = 0; > priv->hw->desc->set_tx_ic(des
Applied "misc: atmel-ssc: register as sound DAI if #sound-dai-cells is present" to the asoc tree
The patch misc: atmel-ssc: register as sound DAI if #sound-dai-cells is present has been applied to the asoc tree at git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark >From e8314d7d53c8b050aac2828a5de5f28a997b468b Mon Sep 17 00:00:00 2001 From: Peter Rosin Date: Tue, 6 Dec 2016 20:22:36 +0100 Subject: [PATCH] misc: atmel-ssc: register as sound DAI if #sound-dai-cells is present The SSC is currently not usable with the ASoC simple-audio-card, as every SSC audio user has to build a platform driver that may do as little as calling atmel_ssc_set_audio/atmel_ssc_put_audio (which allocates the SSC and registers a DAI with the ASoC subsystem). So, have that happen automatically, if the #sound-dai-cells property is present in devicetree, which it has to be anyway for simple audio card to work. 
Signed-off-by: Peter Rosin Acked-by: Rob Herring Acked-by: Nicolas Ferre Signed-off-by: Mark Brown --- .../devicetree/bindings/misc/atmel-ssc.txt | 2 + drivers/misc/atmel-ssc.c | 50 ++ include/linux/atmel-ssc.h | 1 + 3 files changed, 53 insertions(+) diff --git a/Documentation/devicetree/bindings/misc/atmel-ssc.txt b/Documentation/devicetree/bindings/misc/atmel-ssc.txt index efc98ea1f23d..f8629bb73945 100644 --- a/Documentation/devicetree/bindings/misc/atmel-ssc.txt +++ b/Documentation/devicetree/bindings/misc/atmel-ssc.txt @@ -24,6 +24,8 @@ Optional properties: this parameter to choose where the clock from. - By default the clock is from TK pin, if the clock from RK pin, this property is needed. + - #sound-dai-cells: Should contain <0>. + - This property makes the SSC into an automatically registered DAI. Examples: - PDC transfer: diff --git a/drivers/misc/atmel-ssc.c b/drivers/misc/atmel-ssc.c index 0516ecda54d3..b2a0340f277e 100644 --- a/drivers/misc/atmel-ssc.c +++ b/drivers/misc/atmel-ssc.c @@ -20,6 +20,8 @@ #include +#include "../../sound/soc/atmel/atmel_ssc_dai.h" + /* Serialize access to ssc_list and user count */ static DEFINE_SPINLOCK(user_lock); static LIST_HEAD(ssc_list); @@ -145,6 +147,49 @@ static inline const struct atmel_ssc_platform_data * __init platform_get_device_id(pdev)->driver_data; } +#ifdef CONFIG_SND_ATMEL_SOC_SSC +static int ssc_sound_dai_probe(struct ssc_device *ssc) +{ + struct device_node *np = ssc->pdev->dev.of_node; + int ret; + int id; + + ssc->sound_dai = false; + + if (!of_property_read_bool(np, "#sound-dai-cells")) + return 0; + + id = of_alias_get_id(np, "ssc"); + if (id < 0) + return id; + + ret = atmel_ssc_set_audio(id); + ssc->sound_dai = !ret; + + return ret; +} + +static void ssc_sound_dai_remove(struct ssc_device *ssc) +{ + if (!ssc->sound_dai) + return; + + atmel_ssc_put_audio(of_alias_get_id(ssc->pdev->dev.of_node, "ssc")); +} +#else +static inline int ssc_sound_dai_probe(struct ssc_device *ssc) +{ + if 
(of_property_read_bool(ssc->pdev->dev.of_node, "#sound-dai-cells")) + return -ENOTSUPP; + + return 0; +} + +static inline void ssc_sound_dai_remove(struct ssc_device *ssc) +{ +} +#endif + static int ssc_probe(struct platform_device *pdev) { struct resource *regs; @@ -204,6 +249,9 @@ static int ssc_probe(struct platform_device *pdev) dev_info(&pdev->dev, "Atmel SSC device at 0x%p (irq %d)\n", ssc->regs, ssc->irq); + if (ssc_sound_dai_probe(ssc)) + dev_err(&pdev->dev, "failed to auto-setup ssc for audio\n"); + return 0; } @@ -211,6 +259,8 @@ static int ssc_remove(struct platform_device *pdev) { struct ssc_device *ssc = platform_get_drvdata(pdev); + ssc_sound_dai_remove(ssc); + spin_lock(&user_lock); list_del(&ssc->list); spin_unlock(&user_lock); diff --git a/include/linux/atmel-ssc.h b/include/linux/atmel-ssc.h index 7c0f6549898b..fdb545101ede 100644 --- a/include/linux/atmel-ssc.h +++ b/include/linux/atmel-ssc.h @@ -20,6 +20,7 @@ struct ssc_device { int user; int
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 12:04, David Laight wrote: > From: Hannes Frederic Sowa >> Sent: 14 December 2016 22:03 >> On 14.12.2016 13:46, Jason A. Donenfeld wrote: >>> Hi David, >>> >>> On Wed, Dec 14, 2016 at 10:56 AM, David Laight >>> wrote: ... > +u64 siphash24(const u8 *data, size_t len, const u8 > key[SIPHASH24_KEY_LEN]) ... > + u64 k0 = get_unaligned_le64(key); > + u64 k1 = get_unaligned_le64(key + sizeof(u64)); ... > + m = get_unaligned_le64(data); All these unaligned accesses are going to get expensive on architectures like sparc64. >>> >>> Yes, the unaligned accesses aren't pretty. Since in pretty much all >>> use cases thus far, the data can easily be made aligned, perhaps it >>> makes sense to create siphash24() and siphash24_unaligned(). Any >>> thoughts on doing something like that? >> >> I fear that the alignment requirement will be a source of bugs on 32 bit >> machines, where you cannot even simply take a well aligned struct on a >> stack and put it into the normal siphash(aligned) function without >> adding alignment annotations everywhere. Even blocks returned from >> kmalloc on 32 bit are not aligned to 64 bit. > > Are you doing anything that will require 64bit alignment on 32bit systems? > It is unlikely that the kernel can use any simd registers that have wider > alignment requirements. > > You also really don't want to request on-stack items have large alignments. > While gcc can generate code to do it, it isn't pretty. Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 bytes on 32 bit. Do you question that?
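Editorial sketch of the unaligned-access point discussed above: a portable way to read a 64-bit little-endian word from an address with no alignment guarantee is to assemble it byte by byte. This is a userspace illustration only; `load_le64` is a hypothetical name and is not the kernel's `get_unaligned_le64()` implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a little-endian u64 from any address.
 * Only byte-sized loads are used, so no alignment is assumed. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	size_t i;

	for (i = 0; i < sizeof(v); i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}
```

The cost is up to eight byte loads per word on architectures without cheap unaligned access, which is the overhead the thread is weighing against an aligned-only variant.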
RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa > Sent: 15 December 2016 12:23 ... > Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 > bytes on 32 bit. Do you question that? Yes. The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc). David
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 13:28, David Laight wrote:
> From: Hannes Frederic Sowa
>> Sent: 15 December 2016 12:23
> ...
>> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
>> bytes on 32 bit. Do you question that?
>
> Yes.
>
> The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc).

Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I am actually not sure if the ABI would say anything about that (sorry also for my wrong statement above).

The alignment requirement of unsigned long long on gcc with -m32 actually seems to be 8.
[PATCH iproute2 v2 3/3] ifstat: Add "sw only" extended statistics to ifstat
Add support for extended statistics of the SW-only type, which counts only the packets that went via the CPU (useful for systems with forwarding offload). It is read with filter type IFLA_STATS_LINK_OFFLOAD_XSTATS and sub type IFLA_OFFLOAD_XSTATS_CPU_HIT. It is available under the name 'software' (or any shortened form of it, such as 'soft' or simply 's'). For example: ifstat -x s Signed-off-by: Nogah Frankel Reviewed-by: Jiri Pirko --- misc/ifstat.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/misc/ifstat.c b/misc/ifstat.c index ac99d04..62f1f2b 100644 --- a/misc/ifstat.c +++ b/misc/ifstat.c @@ -730,7 +730,8 @@ static void xstat_usage(void) { fprintf(stderr, "Usage: ifstat supported xstats:\n" -" 64bits default stats, with 64 bits support\n"); +" 64bits default stats, with 64 bits support\n" +" software SW stats. Counts only packets that went via the CPU\n"); } struct extended_stats_options_t { @@ -745,6 +746,7 @@ struct extended_stats_options_t { */ static const struct extended_stats_options_t extended_stats_options[] = { {"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE}, + {"software", IFLA_STATS_LINK_OFFLOAD_XSTATS, IFLA_OFFLOAD_XSTATS_CPU_HIT}, }; static bool get_filter_type(char *name) -- 2.4.3
[PATCH iproute2 v2 1/3] ifstat: Add extended statistics to ifstat
Extended stats are part of the RTM_GETSTATS method. This patch adds them to ifstat. While extended stats can come in many forms, we support only the rtnl_link_stats64 struct for them (which is the 64 bits version of struct rtnl_link_stats). We support stats in the main nesting level, or one lower. The extension can be called by its name or any shorten of it. If there is more than one matched, the first one will be picked. To get the extended stats the flag -x is used. Signed-off-by: Nogah Frankel Reviewed-by: Jiri Pirko --- misc/ifstat.c | 161 -- 1 file changed, 146 insertions(+), 15 deletions(-) diff --git a/misc/ifstat.c b/misc/ifstat.c index 92d67b0..d17ae21 100644 --- a/misc/ifstat.c +++ b/misc/ifstat.c @@ -35,6 +35,7 @@ #include +#include "utils.h" int dump_zeros; int reset_history; int ignore_history; @@ -48,17 +49,21 @@ int pretty; double W; char **patterns; int npatterns; +bool is_extanded; +int filter_type; +int sub_type; char info_source[128]; int source_mismatch; #define MAXS (sizeof(struct rtnl_link_stats)/sizeof(__u32)) +#define NO_SUB_TYPE 0x struct ifstat_ent { struct ifstat_ent *next; char*name; int ifindex; - unsigned long long val[MAXS]; + __u64 val[MAXS]; double rate[MAXS]; __u32 ival[MAXS]; }; @@ -106,6 +111,48 @@ static int match(const char *id) return 0; } +static int get_nlmsg_extanded(const struct sockaddr_nl *who, + struct nlmsghdr *m, void *arg) +{ + struct if_stats_msg *ifsm = NLMSG_DATA(m); + struct rtattr *tb[IFLA_STATS_MAX+1]; + int len = m->nlmsg_len; + struct ifstat_ent *n; + + if (m->nlmsg_type != RTM_NEWSTATS) + return 0; + + len -= NLMSG_LENGTH(sizeof(*ifsm)); + if (len < 0) + return -1; + + parse_rtattr(tb, IFLA_STATS_MAX, IFLA_STATS_RTA(ifsm), len); + if (tb[filter_type] == NULL) + return 0; + + n = malloc(sizeof(*n)); + if (!n) + abort(); + + n->ifindex = ifsm->ifindex; + n->name = strdup(ll_index_to_name(ifsm->ifindex)); + + if (sub_type == NO_SUB_TYPE) { + memcpy(&n->val, RTA_DATA(tb[filter_type]), sizeof(n->val)); + } else { 
+ struct rtattr *attr; + + attr = parse_rtattr_one_nested(sub_type, tb[filter_type]); + if (attr == NULL) + return 0; + memcpy(&n->val, RTA_DATA(attr), sizeof(n->val)); + } + memset(&n->rate, 0, sizeof(n->rate)); + n->next = kern_db; + kern_db = n; + return 0; +} + static int get_nlmsg(const struct sockaddr_nl *who, struct nlmsghdr *m, void *arg) { @@ -147,18 +194,34 @@ static void load_info(void) { struct ifstat_ent *db, *n; struct rtnl_handle rth; + __u32 filter_mask; if (rtnl_open(&rth, 0) < 0) exit(1); - if (rtnl_wilddump_request(&rth, AF_INET, RTM_GETLINK) < 0) { - perror("Cannot send dump request"); - exit(1); - } + if (is_extanded) { + ll_init_map(&rth); + filter_mask = IFLA_STATS_FILTER_BIT(filter_type); + if (rtnl_wilddump_stats_req_filter(&rth, AF_UNSPEC, RTM_GETSTATS, + filter_mask) < 0) { + perror("Cannot send dump request"); + exit(1); + } - if (rtnl_dump_filter(&rth, get_nlmsg, NULL) < 0) { - fprintf(stderr, "Dump terminated\n"); - exit(1); + if (rtnl_dump_filter(&rth, get_nlmsg_extanded, NULL) < 0) { + fprintf(stderr, "Dump terminated\n"); + exit(1); + } + } else { + if (rtnl_wilddump_request(&rth, AF_INET, RTM_GETLINK) < 0) { + perror("Cannot send dump request"); + exit(1); + } + + if (rtnl_dump_filter(&rth, get_nlmsg, NULL) < 0) { + fprintf(stderr, "Dump terminated\n"); + exit(1); + } } rtnl_close(&rth); @@ -553,10 +616,17 @@ static void update_db(int interval) } for (i = 0; i < MAXS; i++) { double sample; - unsigned long incr = h1->ival[i] - n->ival[i]; + __u64 incr; + + if (is_extanded) { + incr = h1->val[i] - n->val[i]; + n->val[i] = h1->val[
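Editorial sketch of the name lookup described in the commit message ("called by its name or any shorten of it", first match wins). It assumes that a shortened name means a leading prefix and that an empty argument is rejected; neither is confirmed by the visible part of the patch. The struct, table and `find_xstat` are hypothetical, and the numeric values are placeholders rather than the real IFLA_* constants.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical option record, loosely modelled on the table in the patch. */
struct xstat_option {
	const char *name;
	int filter_type;	/* placeholder value, not a real IFLA_* constant */
	int sub_type;		/* placeholder value */
};

static const struct xstat_option xstat_options[] = {
	{ "64bits",   1, -1 },
	{ "software", 2,  3 },
};

/* Return the first option whose name starts with arg, or NULL. */
static const struct xstat_option *find_xstat(const char *arg)
{
	size_t len = strlen(arg);
	size_t i;

	if (len == 0)
		return NULL;	/* assumption: an empty name matches nothing */

	for (i = 0; i < sizeof(xstat_options) / sizeof(xstat_options[0]); i++) {
		/* Comparing len bytes also rejects args longer than the name,
		 * because the name's terminating NUL then differs. */
		if (strncmp(arg, xstat_options[i].name, len) == 0)
			return &xstat_options[i];
	}
	return NULL;
}
```

With first-match semantics the order of the table matters: if two names shared a prefix, the earlier entry would always shadow the later one for that prefix.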
[PATCH iproute2 v2 2/3] ifstat: Add 64 bits based stats to extended statistics
The default stats for ifstat are 32 bits based. The kernel supports 64 bits based stats. (They are returned in struct rtnl_link_stats64, which is an exact copy of struct rtnl_link_stats, in which the "normal" stats are returned, but with fields of u64 instead of u32.) This patch adds them as an extended stats type. They are read with filter type IFLA_STATS_LINK_64 and no sub type, and are available under the name 64bits (or any shortened form of it, such as "64"). For example: ifstat -x 64bit Signed-off-by: Nogah Frankel Reviewed-by: Jiri Pirko --- misc/ifstat.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/misc/ifstat.c b/misc/ifstat.c index d17ae21..ac99d04 100644 --- a/misc/ifstat.c +++ b/misc/ifstat.c @@ -729,7 +729,8 @@ static int verify_forging(int fd) static void xstat_usage(void) { fprintf(stderr, -"Usage: ifstat supported xstats:\n"); +"Usage: ifstat supported xstats:\n" +" 64bits default stats, with 64 bits support\n"); } struct extended_stats_options_t { @@ -743,6 +744,7 @@ struct extended_stats_options_t { * Name length must be under 64 chars. */ static const struct extended_stats_options_t extended_stats_options[] = { + {"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE}, }; static bool get_filter_type(char *name) -- 2.4.3
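Editorial sketch of why the counter width matters when computing per-interval increments. Unsigned subtraction gives the right delta across a single wrap of the counter, but a 32-bit counter wraps after 2^32 units while a 64-bit one effectively does not. `delta32` and `delta64` are hypothetical helpers, not code from ifstat.

```c
#include <stdint.h>

/* Increment between two samples of a 32-bit counter.
 * Correct as long as the counter wrapped at most once in between. */
static uint32_t delta32(uint32_t prev, uint32_t cur)
{
	return (uint32_t)(cur - prev);
}

/* Same computation on 64-bit samples. */
static uint64_t delta64(uint64_t prev, uint64_t cur)
{
	return cur - prev;
}
```

If a 32-bit byte counter wraps more than once between two samples, the lost wraps cannot be recovered from the samples alone, which is one practical argument for reading the 64-bit struct.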
[PATCH iproute2 v2 0/3] update ifstat for new stats
Previously, stats were obtained via RTM_GETLINK, which returns 32 bits based statistics. It supports only one type of stats.

Lately, a new method to get stats was added - RTM_GETSTATS. It supports the ability to choose the stats type. The basic stats were changed from 32 bits based to 64 bits based.

This patchset adds to ifstat the ability to get extended stats by this method. It adds two types of extended stats:
64bits - the same as the "normal" stats, but gets the stats from the cpu in a 64 bits based struct.
SW - for packets that hit the cpu.

---
v1->v2:
- change from using RTM_GETSTATS always to using it only for extended stats.
- Add 64bits extended stats type.

Nogah Frankel (3):
ifstat: Add extended statistics to ifstat
ifstat: Add 64 bits based stats to extended statistics
ifstat: Add "sw only" extended statistics to ifstat

misc/ifstat.c | 165 --
1 file changed, 150 insertions(+), 15 deletions(-)

-- 2.4.3
Re: [PATCH iproute2 2/2] tc/m_tunnel_key: Add dest UDP port to tunnel key action
On Tue, Dec 13, 2016 at 10:07:47AM +0200, Hadar Hen Zion wrote: > Enhance tunnel key action parameters by adding destination UDP port. > > Signed-off-by: Hadar Hen Zion > Reviewed-by: Roi Dayan Hi, this looks good to me but could you also update tc/m_tunnel_key.c:usage(); ? With that change: Reviewed-by: Simon Horman
[PATCH net 2/3] dpaa_eth: remove redundant dependency on FSL_SOC
Signed-off-by: Madalin Bucur --- drivers/net/ethernet/freescale/dpaa/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/freescale/dpaa/Kconfig b/drivers/net/ethernet/freescale/dpaa/Kconfig index f3a3454..a654736 100644 --- a/drivers/net/ethernet/freescale/dpaa/Kconfig +++ b/drivers/net/ethernet/freescale/dpaa/Kconfig @@ -1,6 +1,6 @@ menuconfig FSL_DPAA_ETH tristate "DPAA Ethernet" - depends on FSL_SOC && FSL_DPAA && FSL_FMAN + depends on FSL_DPAA && FSL_FMAN select PHYLIB select FSL_FMAN_MAC ---help--- -- 2.1.0
Re: [PATCH iproute2 1/2] tc: flower: Fix typo in the flower man page
On Tue, Dec 13, 2016 at 07:33:51AM +0200, Roi Dayan wrote: > Replace vlan_eth_type with vlan_ethtype. > > Fixes: 745d91726006 ("tc: flower: Introduce vlan support") > Signed-off-by: Roi Dayan > Reviewed-by: Hadar Hen Zion Reviewed-by: Simon Horman
[PATCH net 1/4] fsl/fman: fix 1G support for QSGMII interfaces
QSGMII ports were not advertising 1G speed. Signed-off-by: Madalin Bucur Reviewed-by: Camelia Groza --- drivers/net/ethernet/freescale/fman/mac.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/freescale/fman/mac.c b/drivers/net/ethernet/freescale/fman/mac.c index 69ca42c..0b31f85 100644 --- a/drivers/net/ethernet/freescale/fman/mac.c +++ b/drivers/net/ethernet/freescale/fman/mac.c @@ -594,6 +594,7 @@ static const u16 phy2speed[] = { [PHY_INTERFACE_MODE_RGMII_RXID] = SPEED_1000, [PHY_INTERFACE_MODE_RGMII_TXID] = SPEED_1000, [PHY_INTERFACE_MODE_RTBI] = SPEED_1000, + [PHY_INTERFACE_MODE_QSGMII] = SPEED_1000, [PHY_INTERFACE_MODE_XGMII] = SPEED_1 }; -- 2.1.0
[PATCH net 4/4] fsl/fman: enable compilation on ARM64
Signed-off-by: Madalin Bucur --- drivers/net/ethernet/freescale/fman/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/freescale/fman/Kconfig b/drivers/net/ethernet/freescale/fman/Kconfig index 79b7c84..dc0850b 100644 --- a/drivers/net/ethernet/freescale/fman/Kconfig +++ b/drivers/net/ethernet/freescale/fman/Kconfig @@ -1,6 +1,6 @@ config FSL_FMAN tristate "FMan support" - depends on FSL_SOC || COMPILE_TEST + depends on FSL_SOC || ARCH_LAYERSCAPE || COMPILE_TEST select GENERIC_ALLOCATOR select PHYLIB default n -- 2.1.0
[PATCH net 2/4] fsl/fman: arm: call of_platform_populate() for arm64 platfrom
From: Igal Liberman Signed-off-by: Igal Liberman --- drivers/net/ethernet/freescale/fman/fman.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/net/ethernet/freescale/fman/fman.c b/drivers/net/ethernet/freescale/fman/fman.c index dafd9e1..f36b4eb 100644 --- a/drivers/net/ethernet/freescale/fman/fman.c +++ b/drivers/net/ethernet/freescale/fman/fman.c @@ -2868,6 +2868,16 @@ static struct fman *read_dts_node(struct platform_device *of_dev) fman->dev = &of_dev->dev; +#ifdef CONFIG_ARM64 + /* call of_platform_populate in order to probe sub-nodes on arm64 */ + err = of_platform_populate(fm_node, NULL, NULL, &of_dev->dev); + if (err) { + dev_err(&of_dev->dev, "%s: of_platform_populate() failed\n", + __func__); + goto fman_free; + } +#endif + return fman; fman_node_put: -- 2.1.0
[PATCH net 0/3] dpaa_eth: a couple of fixes
This patch set introduces big endian accessors in the dpaa_eth driver, making sure accesses to the QBMan HW are correct on little endian platforms. It also removes a redundant Kconfig dependency on FSL_SOC and adds me as maintainer of the dpaa_eth driver.

Claudiu Manoil (1):
dpaa_eth: use big endian accessors

Madalin Bucur (2):
dpaa_eth: remove redundant dependency on FSL_SOC
MAINTAINERS: net: add entry for Freescale QorIQ DPAA Ethernet driver

MAINTAINERS | 6 +++
drivers/net/ethernet/freescale/dpaa/Kconfig | 2 +-
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 71 ++
3 files changed, 44 insertions(+), 35 deletions(-)

-- 2.1.0
Re: [PATCH] net: ipv4: tcp_offload: check segs for NULL
On 2016-12-15 at 09:47:41 +0100, shakya@samsung.com wrote: > From: Shakya Sundar Das > > This patch will check segs for being NULL in tcp_gso_segment() > before calling skb_shinfo(segs) from skb_is_gso(segs), otherwise > kernel can run into a NULL-pointer dereference. How can segs ever be NULL here? skb_segment() will always either return an skb or an ERR_PTR(err).
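Editorial sketch of the error-pointer convention the reply relies on: a function following it returns either a valid pointer or a small negative errno encoded in the pointer value, so a NULL check is not the relevant test. This is a userspace re-creation of the idiom for illustration; `err_ptr`, `is_err` and `ptr_err` are hypothetical stand-ins for the kernel's ERR_PTR(), IS_ERR() and PTR_ERR().

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* errno values are encoded in the top 4095 addresses */

/* Encode a negative errno value as a pointer. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if ptr lies in the range reserved for encoded errors. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

Note that NULL is not in the error range, so a caller of such a function tests with the `is_err()` equivalent rather than with `!ptr`.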
[PATCH net 3/3] MAINTAINERS: net: add entry for Freescale QorIQ DPAA Ethernet driver
Add record for Freescale QORIQ DPAA Ethernet driver adding myself as maintainer. Signed-off-by: Madalin Bucur --- MAINTAINERS | 6 ++ 1 file changed, 6 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index e2463ba..0ff9757 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5058,6 +5058,12 @@ S: Maintained F: drivers/net/ethernet/freescale/fman F: Documentation/devicetree/bindings/powerpc/fsl/fman.txt +FREESCALE QORIQ DPAA ETHERNET DRIVER +M: Madalin Bucur +L: netdev@vger.kernel.org +S: Maintained +F: drivers/net/ethernet/freescale/dpaa + FREESCALE QUICC ENGINE LIBRARY L: linuxppc-...@lists.ozlabs.org S: Orphan -- 2.1.0
[PATCH net 1/3] dpaa_eth: use big endian accessors
From: Claudiu Manoil Ensure correct access to the big endian QMan HW through proper accessors. Signed-off-by: Claudiu Manoil Signed-off-by: Madalin Bucur --- drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 71 ++ 1 file changed, 37 insertions(+), 34 deletions(-) diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c index 3c48a84..624ba90 100644 --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c @@ -733,7 +733,7 @@ static int dpaa_eth_cgr_init(struct dpaa_priv *priv) priv->cgr_data.cgr.cb = dpaa_eth_cgscn; /* Enable Congestion State Change Notifications and CS taildrop */ - initcgr.we_mask = QM_CGR_WE_CSCN_EN | QM_CGR_WE_CS_THRES; + initcgr.we_mask = cpu_to_be16(QM_CGR_WE_CSCN_EN | QM_CGR_WE_CS_THRES); initcgr.cgr.cscn_en = QM_CGR_EN; /* Set different thresholds based on the MAC speed. @@ -747,7 +747,7 @@ static int dpaa_eth_cgr_init(struct dpaa_priv *priv) cs_th = DPAA_CS_THRESHOLD_1G; qm_cgr_cs_thres_set64(&initcgr.cgr.cs_thres, cs_th, 1); - initcgr.we_mask |= QM_CGR_WE_CSTD_EN; + initcgr.we_mask |= cpu_to_be16(QM_CGR_WE_CSTD_EN); initcgr.cgr.cstd_en = QM_CGR_EN; err = qman_create_cgr(&priv->cgr_data.cgr, QMAN_CGR_FLAG_USE_INIT, @@ -896,18 +896,18 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool td_enable) if (dpaa_fq->init) { memset(&initfq, 0, sizeof(initfq)); - initfq.we_mask = QM_INITFQ_WE_FQCTRL; + initfq.we_mask = cpu_to_be16(QM_INITFQ_WE_FQCTRL); /* Note: we may get to keep an empty FQ in cache */ - initfq.fqd.fq_ctrl = QM_FQCTRL_PREFERINCACHE; + initfq.fqd.fq_ctrl = cpu_to_be16(QM_FQCTRL_PREFERINCACHE); /* Try to reduce the number of portal interrupts for * Tx Confirmation FQs. 
*/ if (dpaa_fq->fq_type == FQ_TYPE_TX_CONFIRM) - initfq.fqd.fq_ctrl |= QM_FQCTRL_HOLDACTIVE; + initfq.fqd.fq_ctrl |= cpu_to_be16(QM_FQCTRL_HOLDACTIVE); /* FQ placement */ - initfq.we_mask |= QM_INITFQ_WE_DESTWQ; + initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_DESTWQ); qm_fqd_set_destwq(&initfq.fqd, dpaa_fq->channel, dpaa_fq->wq); @@ -920,8 +920,8 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool td_enable) if (dpaa_fq->fq_type == FQ_TYPE_TX || dpaa_fq->fq_type == FQ_TYPE_TX_CONFIRM || dpaa_fq->fq_type == FQ_TYPE_TX_CONF_MQ) { - initfq.we_mask |= QM_INITFQ_WE_CGID; - initfq.fqd.fq_ctrl |= QM_FQCTRL_CGE; + initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_CGID); + initfq.fqd.fq_ctrl |= cpu_to_be16(QM_FQCTRL_CGE); initfq.fqd.cgid = (u8)priv->cgr_data.cgr.cgrid; /* Set a fixed overhead accounting, in an attempt to * reduce the impact of fixed-size skb shells and the @@ -932,7 +932,7 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool td_enable) * insufficient value, but even that is better than * no overhead accounting at all. */ - initfq.we_mask |= QM_INITFQ_WE_OAC; + initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_OAC); qm_fqd_set_oac(&initfq.fqd, QM_OAC_CG); qm_fqd_set_oal(&initfq.fqd, min(sizeof(struct sk_buff) + @@ -941,9 +941,9 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool td_enable) } if (td_enable) { - initfq.we_mask |= QM_INITFQ_WE_TDTHRESH; + initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_TDTHRESH); qm_fqd_set_taildrop(&initfq.fqd, DPAA_FQ_TD, 1); - initfq.fqd.fq_ctrl = QM_FQCTRL_TDE; + initfq.fqd.fq_ctrl = cpu_to_be16(QM_FQCTRL_TDE); } if (dpaa_fq->fq_type == FQ_TYPE_TX) { @@ -951,7 +951,8 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool td_enable) if (queue_id >= 0) confq = priv->conf_fqs[queue_id]; if (confq) { - initfq.we_mask |= QM_INITFQ_WE_CONTEXTA; + initfq.we_mask |= + cpu_to_be16(QM_INITFQ_WE_CONTEXTA); /* ContextA: OVOM=1(use contextA2 bits instead of ICAD) * A2V=1 (contextA A2 field is valid) * A0V=1 (contextA
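Editorial sketch of what the conversions in this patch do to a 16-bit flag word, written for userspace. `to_be16` is a hypothetical stand-in for cpu_to_be16(): it returns a value whose in-memory bytes are in big-endian order regardless of host endianness. Because the conversion only reorders bytes, OR-ing already converted flags gives the same result as converting the OR-ed flags, which is why the `|=` updates in the patch can convert each constant separately.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for cpu_to_be16(): the returned value, when stored
 * to memory, has its most significant byte first. */
static uint16_t to_be16(uint16_t v)
{
	uint8_t bytes[2];
	uint16_t out;

	bytes[0] = (uint8_t)(v >> 8);	/* high byte goes first in memory */
	bytes[1] = (uint8_t)(v & 0xff);
	memcpy(&out, bytes, sizeof(out));
	return out;
}
```

On a big-endian host this function is the identity, which matches the expectation that such accessors cost nothing on platforms whose native order already matches the hardware.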
Re: [PATCH net 2/2] net/sched: cls_flower: Use masked key when calling HW offloads
Hi Paul, On Wed, Dec 14, 2016 at 07:00:58PM +0200, Paul Blakey wrote: > Zero bits on the mask signify a "don't care" on the corresponding bits > in key. Some HWs require those bits on the key to be zero. Since these > bits are masked anyway, it's okay to provide the masked key to all > drivers. > > Fixes: 5b33f48842fa ('net/flower: Introduce hardware offload support') > Signed-off-by: Paul Blakey > Reviewed-by: Roi Dayan > Acked-by: Jiri Pirko While I don't have a specific use case in mind that this change would break it seems to me that it would be better to handle hardware requirements at the driver level. > --- > net/sched/cls_flower.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c > index 9758f5a..35ac28d 100644 > --- a/net/sched/cls_flower.c > +++ b/net/sched/cls_flower.c > @@ -252,7 +252,7 @@ static int fl_hw_replace_filter(struct tcf_proto *tp, > offload.cookie = (unsigned long)f; > offload.dissector = dissector; > offload.mask = mask; > - offload.key = &f->key; > + offload.key = &f->mkey; > offload.exts = &f->exts; > > tc->type = TC_SETUP_CLSFLOWER; > -- > 1.8.3.1 >
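Editorial sketch of the masked-key idea under discussion: every bit that the mask marks as "don't care" (zero) is cleared in the key before it is handed on. The function and its byte-array representation are hypothetical; how cls_flower actually derives `mkey` is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: mkey[i] = key[i] & mask[i] for each byte. */
static void apply_mask(uint8_t *mkey, const uint8_t *key,
		       const uint8_t *mask, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		mkey[i] = key[i] & mask[i];
}
```

A driver that needs the unmasked key can no longer recover it from the masked one, which is the kind of trade-off behind the suggestion to handle such hardware requirements at the driver level.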
Re: [PATCH iproute2 2/2] tc/m_tunnel_key: Add dest UDP port to tunnel key action
On Thu, Dec 15, 2016 at 02:03:36PM +0100, Simon Horman wrote: > On Tue, Dec 13, 2016 at 10:07:47AM +0200, Hadar Hen Zion wrote: > > Enhance tunnel key action parameters by adding destination UDP port. > > > > Signed-off-by: Hadar Hen Zion > > Reviewed-by: Roi Dayan > > Hi, > > this looks good to me but could you also update tc/m_tunnel_key.c:usage(); ? It seems that I was a bit hasty here as I now see that Stephen has indicated that he has applied this series. I also notice that patch 1/2 of this series also misses updating usage(). Let me know if sending some follow-up patches is the best way forwards.
RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa > Sent: 15 December 2016 12:50 > On 15.12.2016 13:28, David Laight wrote: > > From: Hannes Frederic Sowa > >> Sent: 15 December 2016 12:23 > > ... > >> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 > >> bytes on 32 bit. Do you question that? > > > > Yes. > > > > The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc). > > Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I > am actually not sure if the ABI would say anything about that (sorry > also for my wrong statement above). > > Alignment requirement of unsigned long long on gcc with -m32 actually > seem to be 8. It depends on the architecture. For x86 it is definitely 4. It might be 8 for sparc, ppc and/or alpha. David
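Editorial sketch of one way to check the point in dispute on a given compiler and target: the offset of a 64-bit member placed after a single char equals the alignment the compiler applies to that member inside a struct. The struct and function names are hypothetical, and the result depends on the target ABI and compiler flags, so no particular value is claimed here.

```c
#include <stddef.h>
#include <stdint.h>

/* A char followed by a u64: padding after c reveals the member alignment. */
struct align_probe {
	char c;
	uint64_t v;
};

/* Alignment the compiler gives a uint64_t struct member on this target. */
static size_t u64_member_alignment(void)
{
	return offsetof(struct align_probe, v);
}
```

Building this once natively and once for a 32-bit target and comparing the two results is a quick way to see which side of the thread a given toolchain is on.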
[PATCH net] ixgbe: update the rss key on h/w, when ethtool ask for it.
Currently ixgbe_set_rxfh() updates the rss_key copy in the driver memory, but does not push the new value into the h/w. This commit add a new helper for the latter operation and call it in ixgbe_set_rxfh(), so that the h/w rss key value can be really updated via ethtool. Signed-off-by: Paolo Abeni --- drivers/net/ethernet/intel/ixgbe/ixgbe.h | 1 + drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 4 +++- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c| 19 --- 3 files changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h index ef81c3d..8fb9fbf 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h @@ -1026,6 +1026,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb, struct ixgbe_adapter *adapter, struct ixgbe_ring *tx_ring); u32 ixgbe_rss_indir_tbl_entries(struct ixgbe_adapter *adapter); +void ixgbe_store_key(struct ixgbe_adapter *adapter); void ixgbe_store_reta(struct ixgbe_adapter *adapter); s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg, u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm); diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c index fd192bf..e40f9ce 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c @@ -3003,8 +3003,10 @@ static int ixgbe_set_rxfh(struct net_device *netdev, const u32 *indir, } /* Fill out the rss hash key */ - if (key) + if (key) { memcpy(adapter->rss_key, key, ixgbe_get_rxfh_key_size(netdev)); + ixgbe_store_key(adapter); + } ixgbe_store_reta(adapter); diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 1e2f39e..0c23ab8 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -3411,6 +3411,21 @@ u32 ixgbe_rss_indir_tbl_entries(struct 
ixgbe_adapter *adapter) } /** + * ixgbe_store_key - Write the RSS key to HW + * @adapter: device handle + * + * Write the RSS key stored in adapter.rss_key to HW. + */ +void ixgbe_store_key(struct ixgbe_adapter *adapter) +{ + struct ixgbe_hw *hw = &adapter->hw; + int i; + + for (i = 0; i < 10; i++) + IXGBE_WRITE_REG(hw, IXGBE_RSSRK(i), adapter->rss_key[i]); +} + +/** * ixgbe_store_reta - Write the RETA table to HW * @adapter: device handle * @@ -3475,7 +3490,6 @@ static void ixgbe_store_vfreta(struct ixgbe_adapter *adapter) static void ixgbe_setup_reta(struct ixgbe_adapter *adapter) { - struct ixgbe_hw *hw = &adapter->hw; u32 i, j; u32 reta_entries = ixgbe_rss_indir_tbl_entries(adapter); u16 rss_i = adapter->ring_feature[RING_F_RSS].indices; @@ -3488,8 +3502,7 @@ static void ixgbe_setup_reta(struct ixgbe_adapter *adapter) rss_i = 4; /* Fill out hash function seeds */ - for (i = 0; i < 10; i++) - IXGBE_WRITE_REG(hw, IXGBE_RSSRK(i), adapter->rss_key[i]); + ixgbe_store_key(adapter); /* Fill out redirection table */ memset(adapter->rss_indir_tbl, 0, sizeof(adapter->rss_indir_tbl)); -- 1.8.3.1
Re: [PATCHv3 perf/core 0/7] Reuse libbpf from samples/bpf
Em Wed, Dec 14, 2016 at 02:46:23PM -0800, Joe Stringer escreveu: > On 14 December 2016 at 06:55, Arnaldo Carvalho de Melo > wrote: > > So, Joe, can you try refreshing this work, starting from what I have in > > perf/core? It has the changes coming from net-next that Daniel warned us > > about > > and some more. > I've just respun this series based on the version you previously > applied to perf/core. Since bpf_prog_{attach,detach}() were added to > samples/libbpf, a new patch will shift these over to tools/lib/bpf. > Other than that, I folded "samples/bpf: Drop unnecessary build > targets." back into "samples/bpf: Switch over to libbpf", and I > noticed that there were a couple of unnecessary log buffers with the > latest changes. For any new sample programs, those were fixed up to > use libbpf as well. > Don't forget to do a "make headers_install" before attempting to build > the samples, access to the latest headers is required (as per the > readme in samples/bpf). Ah, README, I should read that ;-) I got used to how tools/perf/ work, i.e. it is self sufficient wrt in-flux stuff in the kernel, i.e. headers that are related to features it supports and that are under constant improvements, such as eBPF, kvm, syscall tables, etc. Anyway, will do the headers_install step inside a container, to avoid polluting my workstation. Thanks for doing the respin and for the clarifications about building samples/bpf/. - Arnaldo
[PATCH net 3/4] fsl/fman: A007273 only applies to PPC SoCs
Signed-off-by: Madalin Bucur Reviewed-by: Camelia Groza --- drivers/net/ethernet/freescale/fman/fman.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/net/ethernet/freescale/fman/fman.c b/drivers/net/ethernet/freescale/fman/fman.c index f36b4eb..93d6a36 100644 --- a/drivers/net/ethernet/freescale/fman/fman.c +++ b/drivers/net/ethernet/freescale/fman/fman.c @@ -1890,6 +1890,7 @@ static int fman_reset(struct fman *fman) goto _return; } else { +#ifdef CONFIG_PPC struct device_node *guts_node; struct ccsr_guts __iomem *guts_regs; u32 devdisr2, reg; @@ -1921,6 +1922,7 @@ static int fman_reset(struct fman *fman) /* Enable all MACs */ iowrite32be(reg, &guts_regs->devdisr2); +#endif /* Perform FMan reset */ iowrite32be(FPM_RSTC_FM_RESET, &fman->fpm_regs->fm_rstc); @@ -1932,25 +1934,31 @@ static int fman_reset(struct fman *fman) } while (((ioread32be(&fman->fpm_regs->fm_rstc)) & FPM_RSTC_FM_RESET) && --count); if (count == 0) { +#ifdef CONFIG_PPC iounmap(guts_regs); of_node_put(guts_node); +#endif err = -EBUSY; goto _return; } +#ifdef CONFIG_PPC /* Restore devdisr2 value */ iowrite32be(devdisr2, &guts_regs->devdisr2); iounmap(guts_regs); of_node_put(guts_node); +#endif goto _return; +#ifdef CONFIG_PPC guts_regs: of_node_put(guts_node); guts_node: dev_dbg(fman->dev, "%s: Didn't perform FManV3 reset due to Errata A007273!\n", __func__); +#endif } _return: return err; -- 2.1.0
[PATCH net 0/4] fsl/fman: fixes for ARM
The patch set fixes advertised speeds for QSGMII interfaces, disables A007273 erratum workaround on non-PowerPC platforms where it does not apply, enables compilation on ARM64 and addresses a probing issue on ARM64. Igal Liberman (1): fsl/fman: arm: call of_platform_populate() for arm64 platfrom Madalin Bucur (3): fsl/fman: fix 1G support for QSGMII interfaces fsl/fman: A007273 only applies to PPC SoCs fsl/fman: enable compilation on ARM64 drivers/net/ethernet/freescale/fman/Kconfig | 2 +- drivers/net/ethernet/freescale/fman/fman.c | 18 ++ drivers/net/ethernet/freescale/fman/mac.c | 1 + 3 files changed, 20 insertions(+), 1 deletion(-) -- 2.1.0
Re: [RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)
On Thu, Dec 15, 2016 at 11:12:26AM +0200, Leon Romanovsky wrote: > On Wed, Dec 14, 2016 at 11:59:32PM -0800, Vishwanathapura, Niranjana wrote: > > Thanks Jason for the valuable feedback. > > Here is the revised HFI VNIC patch series. > > > > ChangeLog: > > = > > v1 => v2: > > a) Removed hfi_vnic bus, instead make hfi_vnic driver an 'ib client', > >as per feedback from Jason Gunthorpe. > > b) Interface changes, data structure changes and variable name changes > >associated with (a). > > c) Add hfi_ibdev abstraction to provide VNIC control operations to > >hfi_vnic client. > > d) Minor fixes > > e) Moved hfi_vnic driver from .../sw/intel/vnic/hfi_vnic to > >.../sw/intel/hfi_vnic. > > To put it into proportion, Jason asked you to do different thing. > http://marc.info/?l=linux-rdma&m=147977108302151&w=2 > http://marc.info/?l=linux-rdma&m=148000415401842&w=2 > > And Christoph, > http://marc.info/?l=linux-rdma&m=147985587425861&w=2 Understood. However, we never heard back from Niranjanas analysis of the code which stated that > 60% of the code was dealing with the OPA MADs used to configure this device. https://www.spinics.net/lists/linux-rdma/msg43579.html Furthermore, neither Dave nor Doug has had time to weigh in on what we should do. So before we make that change we wanted to get consensus on using the hfi1_ibdev abstraction rather than the bus. This was the _real_ technical change. Beyond that it is really just which maintainer wants this driver. To that end I've also cc'ed Jeff Kirsher who maintains drivers/net/ethernet/intel. Perhaps Dave would like the driver to go through that tree? I think there are pros and cons to both subtrees and in the end we will do whatever is decided. 
For maintainer review:

1) The driver encapsulates ethernet packets with OPA headers

2) VNIC uses OPA management packets (MADs) for its configuration

3) A significant portion (> 60%) of the code is specific to OPA

   https://www.spinics.net/lists/linux-rdma/msg43579.html

4) The driver is from Intel and we expect Intel to be the primary
   contributor to the code.

5) The driver, like hfi1, is dual licensed (GPL/BSD)

6) Based on Christoph's feedback we will be adding device capability
   bits to the IB core to indicate HFI VNIC support.

   https://www.spinics.net/lists/linux-rdma/msg44113.html

Doug, Dave, Jeff any thoughts?

Ira
[PATCH net] sctp: sctp_epaddr_lookup_transport should be protected by rcu_read_lock
Since commit 7fda702f9315 ("sctp: use new rhlist interface on sctp
transport rhashtable"), sctp has changed to use rhlist_lookup to look
up transport, but rhlist_lookup doesn't call rcu_read_lock inside,
unlike rhashtable_lookup_fast.

It is called in sctp_epaddr_lookup_transport and
sctp_addrs_lookup_transport. sctp_addrs_lookup_transport is always in
the protection of rcu_read_lock(), as __sctp_lookup_association is
called in rx path or sctp_lookup_association which are in the
protection of rcu_read_lock() already.

But sctp_epaddr_lookup_transport is called by
sctp_endpoint_lookup_assoc, which doesn't call rcu_read_lock, and that
may cause "suspicious rcu_dereference_check usage" in
__rhashtable_lookup.

This patch is to fix it by adding rcu_read_lock in
sctp_endpoint_lookup_assoc before calling
sctp_epaddr_lookup_transport.

Fixes: 7fda702f9315 ("sctp: use new rhlist interface on sctp transport rhashtable")
Reported-by: Dmitry Vyukov
Signed-off-by: Xin Long
---
 net/sctp/endpointola.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
index 1f03065..410ddc1 100644
--- a/net/sctp/endpointola.c
+++ b/net/sctp/endpointola.c
@@ -331,7 +331,9 @@ struct sctp_association *sctp_endpoint_lookup_assoc(
 	 * on this endpoint.
 	 */
 	if (!ep->base.bind_addr.port)
-		goto out;
+		return NULL;
+
+	rcu_read_lock();
 	t = sctp_epaddr_lookup_transport(ep, paddr);
 	if (!t)
 		goto out;
@@ -339,6 +341,7 @@ struct sctp_association *sctp_endpoint_lookup_assoc(
 	*transport = t;
 	asoc = t->asoc;
 out:
+	rcu_read_unlock();
 	return asoc;
 }

--
2.1.0
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 14:56, David Laight wrote: > From: Hannes Frederic Sowa >> Sent: 15 December 2016 12:50 >> On 15.12.2016 13:28, David Laight wrote: >>> From: Hannes Frederic Sowa Sent: 15 December 2016 12:23 >>> ... Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 bytes on 32 bit. Do you question that? >>> >>> Yes. >>> >>> The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc). >> >> Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I >> am actually not sure if the ABI would say anything about that (sorry >> also for my wrong statement above). >> >> Alignment requirement of unsigned long long on gcc with -m32 actually >> seem to be 8. > > It depends on the architecture. > For x86 it is definitely 4. May I ask for a reference? I couldn't see unsigned long long being mentioned in the ia32 abi spec that I found. I agree that those accesses might be synthetically assembled by gcc and for me the alignment of 4 would have seemed natural. But my gcc at least in 32 bit mode disagrees with that. > It might be 8 for sparc, ppc and/or alpha. This is something to find out... Right now ipv6 addresses have an alignment of 4. So we couldn't even naturally pass them to siphash but would need to copy them around, which I feel like a source of bugs. Bye, Hannes
Re: [v3] net: ethernet: cavium: octeon: octeon_mgmt: Handle return NULL error from devm_ioremap
Hi David,

I have not tested this feature. I have built it and flashed it on
hardware. You can check the commit ids below, which have a similar
check for ioremap.

1- Commit id - de9e397e40f56b9f34af4bf6a5bd7a75ea02456c
   In 'drivers/net/phy/mdio-octeon.c'

2- Commit id - 592569de4c247fe4f25db8369dc0c63860f9560b
   In 'drivers/gpio/gpio-octeon.c'

Thanks
Arvind

On Thursday 15 December 2016 12:58 AM, David Daney wrote:

On 12/14/2016 11:03 AM, Arvind Yadav wrote:

Here, If devm_ioremap will fail. It will return NULL.
Kernel can run into a NULL-pointer dereference.
This error check will avoid NULL pointer dereference.

I have asked you twice already this question, but could not determine
from your response what the answer is:

Q: Have you tested the patch on OCTEON based hardware that contains the
"octeon_mgmt" Ethernet ports?

Please answer either "yes" or "no".

Thanks,
David Daney

Signed-off-by: Arvind Yadav
---
 drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index 4ab404f..33c2fec 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -1479,6 +1479,12 @@ static int octeon_mgmt_probe(struct platform_device *pdev)
 	p->agl = (u64)devm_ioremap(&pdev->dev, p->agl_phys, p->agl_size);
 	p->agl_prt_ctl = (u64)devm_ioremap(&pdev->dev, p->agl_prt_ctl_phys,
 					   p->agl_prt_ctl_size);
+	if (!p->mix || !p->agl || !p->agl_prt_ctl) {
+		dev_err(&pdev->dev, "failed to map I/O memory\n");
+		result = -ENOMEM;
+		goto err;
+	}
+
 	spin_lock_init(&p->lock);

 	skb_queue_head_init(&p->tx_list);
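The check in the patch above (map several regions, then bail out with -ENOMEM if any mapping came back NULL) can be sketched in userspace as follows. This is only an illustration of the pattern, not the driver code: `struct regs`, `map_regs()`, `map_ok()` and `map_fail_on_second()` are hypothetical names, and the mapper functions stand in for devm_ioremap() by handing out a static buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three mapped register windows. */
struct regs {
	uintptr_t mix;
	uintptr_t agl;
	uintptr_t agl_prt_ctl;
};

/* Stand-in for devm_ioremap(): returns NULL on failure. */
typedef void *(*map_fn)(size_t size);

static unsigned char backing[64];
static int map_calls;

/* Mapper that always succeeds. */
void *map_ok(size_t size)
{
	assert(size <= sizeof(backing));
	map_calls++;
	return backing;
}

/* Mapper whose second call fails, to exercise the error path. */
void *map_fail_on_second(size_t size)
{
	assert(size <= sizeof(backing));
	return ++map_calls == 2 ? NULL : backing;
}

/* Map all three windows, then test them together before any use. */
int map_regs(struct regs *r, map_fn map, size_t size)
{
	r->mix = (uintptr_t)map(size);
	r->agl = (uintptr_t)map(size);
	r->agl_prt_ctl = (uintptr_t)map(size);

	if (!r->mix || !r->agl || !r->agl_prt_ctl)
		return -ENOMEM;
	return 0;
}
```

Checking all three results in one place keeps a single error exit, which is the shape the patch uses with its `goto err`.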
Re: [PATCH net 2/2] net/sched: cls_flower: Use masked key when calling HW offloads
Thu, Dec 15, 2016 at 02:50:44PM CET, simon.hor...@netronome.com wrote: >Hi Paul, > >On Wed, Dec 14, 2016 at 07:00:58PM +0200, Paul Blakey wrote: >> Zero bits on the mask signify a "don't care" on the corresponding bits >> in key. Some HWs require those bits on the key to be zero. Since these >> bits are masked anyway, it's okay to provide the masked key to all >> drivers. >> >> Fixes: 5b33f48842fa ('net/flower: Introduce hardware offload support') >> Signed-off-by: Paul Blakey >> Reviewed-by: Roi Dayan >> Acked-by: Jiri Pirko > >While I don't have a specific use case in mind that this change would break >it seems to me that it would be better to handle hardware requirements >at the driver level. Even though, makes no sense to pass unmasked key down. Is is only confusing. This patch fixes it. > >> --- >> net/sched/cls_flower.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c >> index 9758f5a..35ac28d 100644 >> --- a/net/sched/cls_flower.c >> +++ b/net/sched/cls_flower.c >> @@ -252,7 +252,7 @@ static int fl_hw_replace_filter(struct tcf_proto *tp, >> offload.cookie = (unsigned long)f; >> offload.dissector = dissector; >> offload.mask = mask; >> -offload.key = &f->key; >> +offload.key = &f->mkey; >> offload.exts = &f->exts; >> >> tc->type = TC_SETUP_CLSFLOWER; >> -- >> 1.8.3.1 >>
[PATCH net] sctp: sctp_transport_lookup_process should rcu_read_unlock when transport is null
Prior to this patch, sctp_transport_lookup_process didn't
rcu_read_unlock when it failed to find a transport by
sctp_addrs_lookup_transport.

This patch is to fix it by moving up rcu_read_unlock right before
checking transport and also to remove the out path.

Fixes: 1cceda784980 ("sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock")
Signed-off-by: Xin Long
---
 net/sctp/socket.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index d5f4b4a..318c678 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4472,18 +4472,17 @@ int sctp_transport_lookup_process(int (*cb)(struct sctp_transport *, void *),
 				  const union sctp_addr *paddr, void *p)
 {
 	struct sctp_transport *transport;
-	int err = -ENOENT;
+	int err;

 	rcu_read_lock();
 	transport = sctp_addrs_lookup_transport(net, laddr, paddr);
+	rcu_read_unlock();
 	if (!transport)
-		goto out;
+		return -ENOENT;

-	rcu_read_unlock();
 	err = cb(transport, p);
 	sctp_transport_put(transport);

-out:
 	return err;
 }
 EXPORT_SYMBOL_GPL(sctp_transport_lookup_process);
--
2.1.0
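The control flow of the fix above can be sketched in userspace to show why every exit path ends up balanced. Everything here is a stand-in and an assumption: `fake_read_lock()`/`fake_read_unlock()` only count nesting depth and are not RCU, `struct item` and `lookup_and_hold()` are hypothetical, and the sketch assumes the lookup returns a held reference (which the `sctp_transport_put()` call in the patch suggests).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Depth counter standing in for rcu_read_lock()/rcu_read_unlock(). */
int read_depth;

void fake_read_lock(void)
{
	read_depth++;
}

void fake_read_unlock(void)
{
	assert(read_depth > 0);
	read_depth--;
}

/* Hypothetical refcounted object standing in for a transport. */
struct item {
	int key;
	int refs;
};

struct item table[] = { { 1, 0 }, { 2, 0 } };

/* Lookup that takes a reference on a hit; caller must hold the read lock. */
struct item *lookup_and_hold(int key)
{
	assert(read_depth > 0);
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (table[i].key == key) {
			table[i].refs++;
			return &table[i];
		}
	}
	return NULL;
}

/* Example callback used to exercise the function below. */
int key_times_ten(struct item *it)
{
	return it->key * 10;
}

/*
 * Same shape as the patched function: unlock right after the lookup,
 * before the NULL check, so the miss path cannot leave the lock held.
 */
int lookup_process(int key, int (*cb)(struct item *))
{
	struct item *it;
	int err;

	fake_read_lock();
	it = lookup_and_hold(key);
	fake_read_unlock();
	if (!it)
		return -ENOENT;

	err = cb(it);
	it->refs--;	/* stands in for sctp_transport_put() */
	return err;
}
```

After either a hit or a miss, `read_depth` is back to zero, which is the property the original code lacked on the miss path.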
Re: wl1251 & mac address & calibration data
On Thu Dec 15 09:18:44 2016 Kalle Valo wrote: > (Adding Luis because he has been working on request_firmware() lately) > > Pali Rohár writes: > > > > > So no, there is no argument against... request_firmware() in > > > > fallback mode with userspace helper is by design blocking and > > > > waiting for userspace. But waiting for some change in DTS in > > > > kernel is just nonsense. > > > > > > I would just mark the wlan device with status = "disabled" and > > > enable it in the overlay together with adding the NVS & MAC info. > > > > So if you think that this solution make sense, we can wait what net > > wireless maintainers say about it... > > > > For me it looks like that solution can be: > > > > extending request_firmware() to use only userspace helper > > I haven't followed the discussion very closely but this is my preference > what drivers should do: > > 1) First the driver should do try to get the calibration data and mac > address from the device tree. > Ok, but there is no (dynamic, device specific) data in DTS for N900. So 1) is noop. > 2) If they are not in DT the driver should retrieve the calibration data > with request_firmware(). BUT with an option for user space to > implement that with a helper script so that the data can be created > dynamically, which I believe openwrt does with ath10k calibration > data right now. Currently there is flag for request_firmware() that it should fallback to user helper if direct VFS access not find needed firmware. But this flag is not suitable as /lib/firmware already provides default (not device specific) calibration data. So I would suggest to add another flag/function which will primary use user helper. > > and load mac address also via request_firmware() either by appending > > it into NVS data or via separate call > > I'm not really fan of the idea providing permanent mac address through > request_firmware(). 
For example, how to handle multiple devices on the > same host, would there be a need for some kind of bus ids encoded to the > filename? And what about devices with multiple mac addresses? For N900 there is only one wl1251 device. And... wl12xx is already using appended MAC address in calibration data read by request firmware. So reason why I prefer similar usage also for wl1251. > I wish there would be a better way than request_firmware() to provide > the permanent mac addresses from user space (if device tree is not > available), I just don't know what that could be :) But if we would > start to use request_firmware() for this at least there should be a > wider concensus about that and it should be properly documented, just > like the device tree bindings. > > -- > Kalle Valo I do not know about any other, so reason why I'm asking :-) and there are my proposed solutions. If you (or any other) came up with better we can discuss about it :-) -- Pali Rohár pali.ro...@gmail.com
Re: [PATCHv3 perf/core 0/7] Reuse libbpf from samples/bpf
Em Thu, Dec 15, 2016 at 11:33:29AM -0300, Arnaldo Carvalho de Melo escreveu: > Em Wed, Dec 14, 2016 at 02:46:23PM -0800, Joe Stringer escreveu: > > On 14 December 2016 at 06:55, Arnaldo Carvalho de Melo > > wrote: > > > So, Joe, can you try refreshing this work, starting from what I have in > > > perf/core? It has the changes coming from net-next that Daniel warned us > > > about > > > and some more. > > > I've just respun this series based on the version you previously > > applied to perf/core. Since bpf_prog_{attach,detach}() were added to > > samples/libbpf, a new patch will shift these over to tools/lib/bpf. > > Other than that, I folded "samples/bpf: Drop unnecessary build > > targets." back into "samples/bpf: Switch over to libbpf", and I > > noticed that there were a couple of unnecessary log buffers with the > > latest changes. For any new sample programs, those were fixed up to > > use libbpf as well. > > > Don't forget to do a "make headers_install" before attempting to build > > the samples, access to the latest headers is required (as per the > > readme in samples/bpf). > > Ah, README, I should read that ;-) > > I got used to how tools/perf/ work, i.e. it is self sufficient wrt > in-flux stuff in the kernel, i.e. headers that are related to features > it supports and that are under constant improvements, such as eBPF, kvm, > syscall tables, etc. > > Anyway, will do the headers_install step inside a container, to avoid > polluting my workstation. heh: should've read that file, now I did: There are usually dependencies to header files of the current kernel. To avoid installing devel kernel headers system wide, as a normal user, simply call:: make headers_install This will creates a local "usr/include" directory in the git/build top level directory, that the make system automatically pickup first. > Thanks for doing the respin and for the clarifications about building > samples/bpf/. > > - Arnaldo
RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa
> Sent: 15 December 2016 14:57
> On 15.12.2016 14:56, David Laight wrote:
> > From: Hannes Frederic Sowa
> >> Sent: 15 December 2016 12:50
> >> On 15.12.2016 13:28, David Laight wrote:
> >>> From: Hannes Frederic Sowa
> >>>> Sent: 15 December 2016 12:23
> >>> ...
> >>>> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
> >>>> bytes on 32 bit. Do you question that?
> >>>
> >>> Yes.
> >>>
> >>> The linux ABI for x86 (32 bit) only requires 32bit alignment for u64
> >>> (etc).
> >>
> >> Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I
> >> am actually not sure if the ABI would say anything about that (sorry
> >> also for my wrong statement above).
> >>
> >> Alignment requirement of unsigned long long on gcc with -m32 actually
> >> seem to be 8.
> >
> > It depends on the architecture.
> > For x86 it is definitely 4.
>
> May I ask for a reference?

Ask anyone who has had to do compatibility layers to support 32bit
binaries on 64bit systems.

> I couldn't see unsigned long long being
> mentioned in the ia32 abi spec that I found. I agree that those accesses
> might be synthetically assembled by gcc and for me the alignment of 4
> would have seemed natural. But my gcc at least in 32 bit mode disagrees
> with that.

Try (retyped):

echo 'struct { long a; long long b; } s; int bar(void) { return sizeof s; }' >foo.c
gcc [-m32] -O2 -S foo.c; cat foo.s

And look at what is generated.

> Right now ipv6 addresses have an alignment of 4. So we couldn't even
> naturally pass them to siphash but would need to copy them around, which
> I feel like a source of bugs.

That is more of a problem on systems that don't support misaligned accesses.
Reading the 64bit values with two explicit 32bit reads would work.
I think you can get gcc to do that by adding an aligned(4) attribute to the
structure member.

	David
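A small sketch of the two points made above: how to inspect the in-struct placement of a 64-bit member on a given target, and how to read a 64-bit value from memory that is only 4-byte aligned using two explicit 32-bit loads. The function names are hypothetical, and the byte-order test relies on the `__BYTE_ORDER__` macro predefined by GCC and Clang (an assumption about the compiler). The memcpy variant is shown as a portable alternative, not as what the siphash patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout question as the foo.c test in the mail above. */
struct pair {
	long a;
	long long b;
};

/* Offset of the 64-bit member: 4 or 8 depending on the target ABI. */
size_t pair_b_offset(void)
{
	return offsetof(struct pair, b);
}

size_t pair_size(void)
{
	return sizeof(struct pair);
}

/*
 * Read a 64-bit value in host byte order from a pointer that is only
 * guaranteed to be 4-byte aligned, using two explicit 32-bit loads.
 */
uint64_t read_u64_two_halves(const uint32_t *p)
{
	uint32_t first = p[0];
	uint32_t second = p[1];

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	return ((uint64_t)second << 32) | first;
#else
	return ((uint64_t)first << 32) | second;
#endif
}

/* Portable alternative: let the compiler pick a safe access sequence. */
uint64_t read_u64_memcpy(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

Building this with and without -m32 and printing `pair_b_offset()` is one way to see what a particular compiler and ABI actually do with the struct member, independent of what `__alignof__` reports for the bare type.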
Re: [PATCH net 2/2] net/sched: cls_flower: Use masked key when calling HW offloads
On Thu, Dec 15, 2016 at 04:12:05PM +0200, Or Gerlitz wrote: > On 12/15/2016 3:50 PM, Simon Horman wrote: > >>Zero bits on the mask signify a "don't care" on the corresponding bits > >>in key. Some HWs require those bits on the key to be zero. Since these > >>bits are masked anyway, it's okay to provide the masked key to all > >>drivers. > >> > >>Fixes: 5b33f48842fa ('net/flower: Introduce hardware offload support') > >> > >While I don't have a specific use case in mind that this change would break > >it seems to me that it would be better to handle hardware requirements > >at the driver level. > > Simon, again, since these bits are masked anyway, it would be correct to > provide the masked key to the hw device. > > E.g no matter if the flow key/mask provided to the HW device is is > 1.1.1.10/24 or 1.1.1.0/24, the user expects to the same matching, so > nothing can't happen if we provide the latter to the driver. > >While I don't have a specific use case in mind that this change would break > >it seems to me that it would be better to handle hardware requirements > >at the driver level. > > Even though, makes no sense to pass unmasked key down. Is is only > confusing. This patch fixes it. It seems somewhat arbitrary to me to allow such filters in software but not pass then down to the driver layer. But I don't feel strongly about this and I am happy for the patch to progress as-is.
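The 1.1.1.10/24 versus 1.1.1.0/24 argument above can be checked with a few lines of C. This is a minimal sketch with hypothetical helper names (`mask_key()`, `key_matches()`); it is not the cls_flower code, only the byte-wise AND that turns a key into a masked key and the comparison a matcher would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build the masked key: bits where the mask is zero become zero. */
void mask_key(uint8_t *mkey, const uint8_t *key, const uint8_t *mask,
	      size_t len)
{
	assert(mkey && key && mask);
	for (size_t i = 0; i < len; i++)
		mkey[i] = key[i] & mask[i];
}

/* A packet field matches when its masked bytes equal the masked key. */
int key_matches(const uint8_t *pkt, const uint8_t *mkey, const uint8_t *mask,
		size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if ((pkt[i] & mask[i]) != mkey[i])
			return 0;
	}
	return 1;
}
```

With key 1.1.1.10 and mask 255.255.255.0 the masked key is 1.1.1.0, and any address in 1.1.1.0/24 matches it, so handing the masked key to a driver does not change which packets match.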
Re: [PATCH perf/core REBASE 2/5] samples/bpf: Switch over to libbpf
Em Wed, Dec 14, 2016 at 02:43:39PM -0800, Joe Stringer escreveu: > Now that libbpf under tools/lib/bpf/* is synced with the version from > samples/bpf, we can get rid most of the libbpf library here. > > Signed-off-by: Joe Stringer > Cc: Alexei Starovoitov > Cc: Daniel Borkmann > Cc: Wang Nan > Link: http://lkml.kernel.org/r/20161209024620.31660-6-...@ovn.org > [ Use -I$(srctree)/tools/lib/ to support out of source code tree builds, as > noticed by Wang Nan ] > Signed-off-by: Arnaldo Carvalho de Melo So, right before this patch building samples/bpf works, then, after, it fails, investigating: [root@1e797fdfbf4f linux]# make -j4 O=/tmp/build/linux/ headers_install make[1]: Entering directory '/tmp/build/linux' CHK include/generated/uapi/linux/version.h make[1]: Leaving directory '/tmp/build/linux' [root@1e797fdfbf4f linux]# make -j4 O=/tmp/build/linux/ samples/bpf/ make[1]: Entering directory '/tmp/build/linux' CHK include/config/kernel.release GEN ./Makefile CHK include/generated/uapi/linux/version.h Using /git/linux as source for kernel CHK include/generated/utsrelease.h CHK include/generated/timeconst.h CHK include/generated/bounds.h CHK include/generated/asm-offsets.h CALL/git/linux/scripts/checksyscalls.sh HOSTCC samples/bpf/test_lru_dist.o HOSTCC samples/bpf/libbpf.o HOSTCC samples/bpf/sock_example.o HOSTCC samples/bpf/bpf_load.o In file included from /git/linux/samples/bpf/libbpf.c:12:0: /git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory #include ^ compilation terminated. In file included from /git/linux/samples/bpf/test_lru_dist.c:24:0: /git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory #include ^ compilation terminated. 
make[2]: *** [scripts/Makefile.host:124: samples/bpf/test_lru_dist.o] Error 1 make[2]: *** Waiting for unfinished jobs make[2]: *** [scripts/Makefile.host:124: samples/bpf/libbpf.o] Error 1 In file included from /git/linux/samples/bpf/bpf_load.c:24:0: /git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory #include ^ compilation terminated. make[2]: *** [scripts/Makefile.host:124: samples/bpf/bpf_load.o] Error 1 In file included from /git/linux/samples/bpf/sock_example.c:29:0: /git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory #include ^ compilation terminated. make[2]: *** [scripts/Makefile.host:124: samples/bpf/sock_example.o] Error 1 make[1]: *** [/git/linux/Makefile:1659: samples/bpf/] Error 2 make[1]: Leaving directory '/tmp/build/linux' make: *** [Makefile:150: sub-make] Error 2 [root@1e797fdfbf4f linux]#
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 16:41, David Laight wrote:
> Try (retyped):
>
> echo 'struct { long a; long long b; } s; int bar(void) { return sizeof s; }' >foo.c
> gcc [-m32] -O2 -S foo.c; cat foo.s
>
> And look at what is generated.

I used __alignof__(unsigned long long) with -m32.

>> Right now ipv6 addresses have an alignment of 4. So we couldn't even
>> naturally pass them to siphash but would need to copy them around, which
>> I feel like a source of bugs.
>
> That is more of a problem on systems that don't support misaligned accesses.
> Reading the 64bit values with two explicit 32bit reads would work.
> I think you can get gcc to do that by adding an aligned(4) attribute to the
> structure member.

Yes, and that is actually my fear, because we support those
architectures. I can't comment on that as I don't understand enough of this.

If someone finds a way to cause misaligned reads on a small box this
seems (maybe depending on sysctls they get fixed up or panic) to be a
much bigger issue than having a hash DoS.

Thanks,
Hannes
Re: [PATCH net-next 2/2] inet: Fix get port to handle zero port number with soreuseport set
On Wed, Dec 14, 2016 at 7:54 PM, Tom Herbert wrote:
> A user may call listen without binding an explicit port with the intent
> that the kernel will assign an available port to the socket. In this
> case inet_csk_get_port does a port scan. For such sockets, the user may
> also set soreuseport with the intent of creating more sockets for the
> port that is selected. The problem is that the initial socket being
> opened could inadvertently choose an existing and unrelated port
> number that was already created with soreuseport.

Good catch!
I think this problem may also exist in the UDP path?
(udp_lib_get_port -> udp_lib_lport_inuse[2])
Re: Designing a safe RX-zero-copy Memory Model for Networking
On Thu, Dec 15, 2016 at 12:28 AM, Jesper Dangaard Brouer wrote: > On Wed, 14 Dec 2016 14:45:00 -0800 > Alexander Duyck wrote: > >> On Wed, Dec 14, 2016 at 1:29 PM, Jesper Dangaard Brouer >> wrote: >> > On Wed, 14 Dec 2016 08:45:08 -0800 >> > Alexander Duyck wrote: >> > >> >> I agree. This is a no-go from the performance perspective as well. >> >> At a minimum you would have to be zeroing out the page between uses to >> >> avoid leaking data, and that assumes that the program we are sending >> >> the pages to is slightly well behaved. If we think zeroing out an >> >> sk_buff is expensive wait until we are trying to do an entire 4K page. >> > >> > Again, yes the page will be zero'ed out, but only when entering the >> > page_pool. Because they are recycled they are not cleared on every use. >> > Thus, performance does not suffer. >> >> So you are talking about recycling, but not clearing the page when it >> is recycled. That right there is my problem with this. It is fine if >> you assume the pages are used by the application only, but you are >> talking about using them for both the application and for the regular >> network path. You can't do that. If you are recycling you will have >> to clear the page every time you put it back onto the Rx ring, >> otherwise you can leak the recycled memory into user space and end up >> with a user space program being able to snoop data out of the skb. >> >> > Besides clearing large mem area is not as bad as clearing small. >> > Clearing an entire page does cost something, as mentioned before 143 >> > cycles, which is 28 bytes-per-cycle (4096/143). And clearing 256 bytes >> > cost 36 cycles which is only 7 bytes-per-cycle (256/36). >> >> What I am saying is that you are going to be clearing the 4K blocks >> each time they are recycled. You can't have the pages shared between >> user-space and the network stack unless you have true isolation. 
If >> you are allowing network stack pages to be recycled back into the >> user-space application you open up all sorts of leaks where the >> application can snoop into data it shouldn't have access to. > > See later, the "Read-only packet page" mode should provide a mode where > the netstack doesn't write into the page, and thus cannot leak kernel > data. (CAP_NET_ADMIN already give it access to other applications data.) I think you are kind of missing the point. The device is writing to the page on the kernel's behalf. Therefore the page isn't "Read-only" and you have an issue since you are talking about sharing a ring between kernel and userspace. >> >> I think we are stuck with having to use a HW filter to split off >> >> application traffic to a specific ring, and then having to share the >> >> memory between the application and the kernel on that ring only. Any >> >> other approach just opens us up to all sorts of security concerns >> >> since it would be possible for the application to try to read and >> >> possibly write any data it wants into the buffers. >> > >> > This is why I wrote a document[1], trying to outline how this is possible, >> > going through all the combinations, and asking the community to find >> > faults in my idea. Inlining it again, as nobody really replied on the >> > content of the doc. >> > >> > - >> > Best regards, >> > Jesper Dangaard Brouer >> > MSc.CS, Principal Kernel Engineer at Red Hat >> > LinkedIn: http://www.linkedin.com/in/brouer >> > >> > [1] >> > https://prototype-kernel.readthedocs.io/en/latest/vm/page_pool/design/memory_model_nic.html >> > >> > === >> > Memory Model for Networking >> > === >> > >> > This design describes how the page_pool change the memory model for >> > networking in the NIC (Network Interface Card) drivers. >> > >> > .. 
Note:: The catch for driver developers is that, once an application >> > request zero-copy RX, then the driver must use a specific >> > SKB allocation mode and might have to reconfigure the >> > RX-ring. >> > >> > >> > Design target >> > = >> > >> > Allow the NIC to function as a normal Linux NIC and be shared in a >> > safe manor, between the kernel network stack and an accelerated >> > userspace application using RX zero-copy delivery. >> > >> > Target is to provide the basis for building RX zero-copy solutions in >> > a memory safe manor. An efficient communication channel for userspace >> > delivery is out of scope for this document, but OOM considerations are >> > discussed below (`Userspace delivery and OOM`_). >> > >> > Background >> > == >> > >> > The SKB or ``struct sk_buff`` is the fundamental meta-data structure >> > for network packets in the Linux Kernel network stack. It is a fairly >> > complex object and can be constructed in several ways. >> > >> > From a memory perspective there are two ways depending on >> > RX-buffer/page state: >> > >> > 1) Writable packet page >> > 2) Read-only packet page >> > >> > To take
Re: [PATCH net-next] ixgbevf: fix 'Etherleak' in ixgbevf
On Thu, Dec 15, 2016 at 3:40 AM, Weilong Chen wrote:
> Nessus report the vf appears to leak memory in network packets.
> Fix this by padding all small packets manually.
>
> And the CVE-2003-0001.
> https://ofirarkin.files.wordpress.com/2008/11/atstake_etherleak_report.pdf
>
> Signed-off-by: Weilong Chen
> ---
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> index 6d4bef5..137a154 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> @@ -3654,6 +3654,13 @@ static int ixgbevf_xmit_frame(struct sk_buff *skb,
> struct net_device *netdev)
>                 return NETDEV_TX_OK;
>         }
>
> +       /* On PCI/PCI-X HW, if packet size is less than ETH_ZLEN,
> +        * packets may get corrupted during padding by HW.
> +        * To WA this issue, pad all small packets manually.
> +        */
> +       if (eth_skb_pad(skb))
> +               return NETDEV_TX_OK;
> +

So the patch description for this probably isn't correct. It looks
like the problem isn't leaking data; it is the fact that the frames
aren't being padded to prevent malicious events. The only issue is
that the patch is padding by a bit too much. I would recommend
replacing this with the following from ixgbe:

	/*
	 * The minimum packet size for olinfo paylen is 17 so pad the skb
	 * in order to meet this minimum size requirement.
	 */
	if (skb_put_padto(skb, 17))
		return NETDEV_TX_OK;

>         tx_ring = adapter->tx_ring[skb->queue_mapping];
>
>         /* need: 1 descriptor per page * PAGE_SIZE/IXGBE_MAX_DATA_PER_TXD,
> --
> 1.7.12
>
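What "pad to a minimum length" means for the frame bytes can be sketched in plain C. This is a userspace illustration only: `pad_frame()` is a hypothetical helper, not skb_put_padto() or eth_skb_pad(), and it works on a flat buffer instead of an skb. The point is that the tail is filled with zeros up to the requested minimum (17 in the ixgbe snippet, ETH_ZLEN, i.e. 60, for eth_skb_pad()) so that no stale memory goes out on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Zero-fill the tail of a frame so it is at least min_len bytes long.
 * buf has room for cap bytes and currently holds len bytes of data.
 * Returns the new length, or 0 if the buffer is too small to pad.
 */
size_t pad_frame(unsigned char *buf, size_t cap, size_t len, size_t min_len)
{
	assert(buf && len <= cap);

	if (len >= min_len)
		return len;		/* already long enough */
	if (min_len > cap)
		return 0;		/* cannot pad in place */

	memset(buf + len, 0, min_len - len);
	return min_len;
}
```

Choosing the smaller minimum only changes `min_len`; the zero-fill step is the same either way.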
Re: [net-next PATCH v5 1/6] net: virtio dynamically disable/enable LRO
On Wed, Dec 14, 2016 at 09:01:27AM -0800, John Fastabend wrote: > On 16-12-14 05:31 AM, Michael S. Tsirkin wrote: > > On Thu, Dec 08, 2016 at 04:04:58PM -0800, John Fastabend wrote: > >> On 16-12-08 01:36 PM, Michael S. Tsirkin wrote: > >>> On Wed, Dec 07, 2016 at 12:11:11PM -0800, John Fastabend wrote: > This adds support for dynamically setting the LRO feature flag. The > message to control guest features in the backend uses the > CTRL_GUEST_OFFLOADS msg type. > > Signed-off-by: John Fastabend > --- > > [...] > > > static void virtnet_config_changed_work(struct work_struct *work) > @@ -1815,6 +1846,12 @@ static int virtnet_probe(struct virtio_device > *vdev) > if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM)) > dev->features |= NETIF_F_RXCSUM; > > +if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) && > +virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6)) { > +dev->features |= NETIF_F_LRO; > +dev->hw_features |= NETIF_F_LRO; > >>> > >>> So the issue is I think that the virtio "LRO" isn't really > >>> LRO, it's typically just GRO forwarded to guests. > >>> So these are easily re-split along MTU boundaries, > >>> which makes it ok to forward these across bridges. > >>> > >>> It's not nice that we don't document this in the spec, > >>> but it's the reality and people rely on this. > >>> > >>> For now, how about doing a custom thing and just disable/enable > >>> it as XDP is attached/detached? > >> > >> The annoying part about doing this is ethtool will say that it is fixed > >> yet it will be changed by seemingly unrelated operation. I'm not sure I > >> like the idea to start automatically configuring the link via xdp_set. > > > > I really don't like the idea of dropping performance > > by a factor of 3 for people bridging two virtio net > > interfaces. > > > > So how about a simple approach for now, just disable > > XDP if GUEST_TSO is enabled? > > > > We can discuss better approaches in next version. 
> >
>
> So the proposal is to add a check in XDP setup so that
>
> if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO{4|6}))
> 	return -EOPNOTSUPP;
>
> Or whatever is the most appropriate return code? Then we can
> disable TSO via qemu-system with guest_tso4=off,guest_tso6=off for
> XDP use cases.

Right. It's a start.

> Sounds like a reasonable start to me. I'll make the change should this
> go through DaveMs net-next tree or do you want it on virtio tree? Either
> is fine with me.
>
> Thanks,
> John

I think I'll merge it because I'm tweaking RX processing too,
and this will likely conflict.

--
MST
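The check agreed on above (refuse XDP setup while either guest TSO feature is negotiated) can be sketched in userspace as below. This is an assumption-laden illustration, not the virtio_net change: `has_feature()` and `xdp_setup_check()` are hypothetical names, the feature set is modelled as a plain 64-bit mask, and the bit numbers 7 and 8 are used here as the values of VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6. The error code uses the standard errno name EOPNOTSUPP.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative feature bit numbers (assumed, see lead-in). */
#define F_GUEST_TSO4 7
#define F_GUEST_TSO6 8

/* Test one bit in a negotiated-features mask. */
int has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (int)((features >> bit) & 1);
}

/* Refuse XDP while the host may still send us large merged packets. */
int xdp_setup_check(uint64_t features)
{
	if (has_feature(features, F_GUEST_TSO4) ||
	    has_feature(features, F_GUEST_TSO6))
		return -EOPNOTSUPP;
	return 0;
}
```

With both features turned off on the host side (guest_tso4=off,guest_tso6=off as mentioned above), the mask has neither bit set and the check passes.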
Re: [RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)
On 12/15/2016 9:52 AM, ira.weiny wrote: > On Thu, Dec 15, 2016 at 11:12:26AM +0200, Leon Romanovsky wrote: >> On Wed, Dec 14, 2016 at 11:59:32PM -0800, Vishwanathapura, Niranjana wrote: >>> Thanks Jason for the valuable feedback. >>> Here is the revised HFI VNIC patch series. >>> >>> ChangeLog: >>> = >>> v1 => v2: >>> a) Removed hfi_vnic bus, instead make hfi_vnic driver an 'ib client', >>>as per feedback from Jason Gunthorpe. >>> b) Interface changes, data structure changes and variable name changes >>>associated with (a). >>> c) Add hfi_ibdev abstraction to provide VNIC control operations to >>>hfi_vnic client. >>> d) Minor fixes >>> e) Moved hfi_vnic driver from .../sw/intel/vnic/hfi_vnic to >>>.../sw/intel/hfi_vnic. >> >> To put it into proportion, Jason asked you to do different thing. >> http://marc.info/?l=linux-rdma&m=147977108302151&w=2 >> http://marc.info/?l=linux-rdma&m=148000415401842&w=2 >> >> And Christoph, >> http://marc.info/?l=linux-rdma&m=147985587425861&w=2 > > Understood. However, we never heard back from Niranjanas analysis of the code > which stated that > 60% of the code was dealing with the OPA MADs used to > configure this device. > > https://www.spinics.net/lists/linux-rdma/msg43579.html > > Furthermore, neither Dave nor Doug has had time to weigh in on what we should > do. > > So before we make that change we wanted to get consensus on using the > hfi1_ibdev abstraction rather than the bus. This was the _real_ technical > change. > > Beyond that it is really just which maintainer wants this driver. To that end > I've also cc'ed Jeff Kirsher who maintains drivers/net/ethernet/intel. > Perhaps > Dave would like the driver to go through that tree? > > > I think there are pros and cons to both subtrees and in the end we will do > whatever is decided. 
>
> For maintainer review:
>
> 1) The driver encapsulates ethernet packets with OPA headers
>
> 2) VNIC uses OPA management packets (MADs) for its configuration
>
> 3) A significant portion (> 60% +) of the code is specific to OPA
>
> https://www.spinics.net/lists/linux-rdma/msg43579.html
>
> 4) The driver is from Intel and we expect Intel to be the primary
> contributor to the code.
>
> 5) The driver, like hfi1, is dual licensed (GPL/BSD)
>
> 6) Based on Christophs feedback we will be adding device capability
> bits to the IB core to indicate HFI VNIC support.
>
> https://www.spinics.net/lists/linux-rdma/msg44113.html
>
>
> Doug, Dave, Jeff any thoughts?
>
> Ira
>

Sorry for my late reply. The series is relatively large, and also tagged
with RFC, so it got shuffled to the back burner while I worked on the
stuff for this pull request.

I just read through the comments in the V1 series between Jason et al.,
and my take on things is like this:

1) Since your intent is to make this work with multiple versions of the
hfi drivers, I disagree with Jason that just because there is only one
driver today that we should keep it simple. Designing it right from the
beginning, if multi-driver support is your intent, is, IMO, a better way
to go. You'll work out the bugs in the initial implementation and when it
comes time to add the second driver, things will go much more smoothly.

2) With more than 60% of the code being MAD related, and another
significant chunk being hfi related, and only a minor bit (20% maybe?)
being net related, I disagree that this belongs in the drivers/net or
net/ directories. Part of the purpose of putting code like this in any
given directory is to group it with what it is most tightly tied to.
That way people doing sub-tree wide changes know the rough scope of
their work as the code that needs changed is grouped together.
Putting this or IPoIB in one of the net trees would make it obvious to the casual coder that these need changed for net changes, but would totally hide the fact that once you tear into these drivers, there is a lot more IB to them than there is net. What's more, when 60+% of driver is non-net, then you end up having many more of my patches crossing over into Dave's tree than the opposite if you put the code under my tree. If nothing else, locality of code churn would say both this and IPoIB belong here despite them being net drivers. 3) I would like some hard reasons why this driver deserves to exist? I'm struggling very hard right now with why we would add an entirely new "encapsulate IP over RDMA" driver. Even if you use regular Ethernet MACs instead of IPoIB's 20byte MAC, I'm struggling for why IPoIB couldn't be modified to know it supports two MAC sizes and provide different net devices based on those different types? I'm struggling to see why IPoIB couldn't be modified to essentially have two transport layers underneath? I haven't done a thorough code review yet, but if I get into the net driver portion of this and it has very much similarity to the IPoIB net portion, I'm p
Re: Designing a safe RX-zero-copy Memory Model for Networking
On Thu, 15 Dec 2016, Jesper Dangaard Brouer wrote:

> > It sounds like Christoph's RDMA approach might be the way to go.
>
> I'm getting more and more fond of Christoph's RDMA approach. I do
> think we will end-up with something close to that approach. I just
> wanted to get review on my idea first.
>
> IMHO the major blocker for the RDMA approach is not HW filters
> themselves, but a common API that applications can call to register
> what goes into the HW queues in the driver. I suspect it will be a
> long project agreeing between vendors. And agreeing on semantics.

Some of the methods from the RDMA subsystem (like queue pairs, the various
queues etc) could be extracted and used here. Multiple vendors already
support these features and some devices operate both in an RDMA and a
network stack mode. Having that all supported by the network stack would
reduce overhead for those vendors.

Multiple new vendors are coming up in the RDMA subsystem because the
regular network stack does not have the right performance for high speed
networking. I would rather see them have a way to get that functionality
from the regular network stack. Please add some extensions so that the
RDMA style I/O can be made to work. Even the hardware of the new NICs is
already prepared to work with the data structures of the RDMA subsystem.
That provides an area of standardization that we could hook into, but do
that properly and in a nice way in the context of mainstream network
support.
Re: [PATCH net] sctp: sctp_epaddr_lookup_transport should be protected by rcu_read_lock
On Thu, Dec 15, 2016 at 11:00:55PM +0800, Xin Long wrote:
> Since commit 7fda702f9315 ("sctp: use new rhlist interface on sctp transport
> rhashtable"), sctp has changed to use rhlist_lookup to look up transport, but
> rhlist_lookup doesn't call rcu_read_lock inside, unlike
> rhashtable_lookup_fast.
>
> It is called in sctp_epaddr_lookup_transport and sctp_addrs_lookup_transport.
> sctp_addrs_lookup_transport is always in the protection of rcu_read_lock(),
> as __sctp_lookup_association is called in rx path or sctp_lookup_association
> which are in the protection of rcu_read_lock() already.
>
> But sctp_epaddr_lookup_transport is called by sctp_endpoint_lookup_assoc, it
> doesn't call rcu_read_lock, which may cause "suspicious rcu_dereference_check
> usage' in __rhashtable_lookup.
>
> This patch is to fix it by adding rcu_read_lock in sctp_endpoint_lookup_assoc
> before calling sctp_epaddr_lookup_transport.
>
> Fixes: 7fda702f9315 ("sctp: use new rhlist interface on sctp transport
> rhashtable")
> Reported-by: Dmitry Vyukov
> Signed-off-by: Xin Long

Acked-by: Marcelo Ricardo Leitner

> ---
>  net/sctp/endpointola.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
> index 1f03065..410ddc1 100644
> --- a/net/sctp/endpointola.c
> +++ b/net/sctp/endpointola.c
> @@ -331,7 +331,9 @@ struct sctp_association *sctp_endpoint_lookup_assoc(
>  	 * on this endpoint.
>  	 */
>  	if (!ep->base.bind_addr.port)
> -		goto out;
> +		return NULL;
> +
> +	rcu_read_lock();
>  	t = sctp_epaddr_lookup_transport(ep, paddr);
>  	if (!t)
>  		goto out;
> @@ -339,6 +341,7 @@ struct sctp_association *sctp_endpoint_lookup_assoc(
>  	*transport = t;
>  	asoc = t->asoc;
>  out:
> +	rcu_read_unlock();
>  	return asoc;
>  }
>
> --
> 2.1.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
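The control flow in the quoted patch can be sketched in user space as below. This is only an illustration under stated assumptions: rcu_read_lock() and rcu_read_unlock() are replaced by a depth counter, and struct transport, lookup_transport() and lookup_assoc() are hypothetical simplifications, not the sctp code. The assert in the lookup plays the role of the "suspicious rcu_dereference_check usage" warning.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for RCU: counts how many read-side sections are open. */
static int rcu_depth;
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* Hypothetical, simplified transport entry. */
struct transport {
	int key;
	int asoc;
};

/* Lookup that, like rhlist_lookup, expects the caller to hold the read lock. */
static struct transport *lookup_transport(struct transport *tbl, size_t n, int key)
{
	size_t i;

	assert(rcu_depth > 0);	/* would be a lockdep splat in the kernel */
	for (i = 0; i < n; i++)
		if (tbl[i].key == key)
			return &tbl[i];
	return NULL;
}

/* Same shape as the patched function: early return before the lock is
 * taken, a single unlock at the "out" label for every later path. */
static int *lookup_assoc(struct transport *tbl, size_t n, int port, int key)
{
	int *asoc = NULL;
	struct transport *t;

	if (!port)
		return NULL;	/* nothing locked yet, so do not jump to "out" */

	rcu_read_lock();
	t = lookup_transport(tbl, n, key);
	if (!t)
		goto out;

	asoc = &t->asoc;
out:
	rcu_read_unlock();
	return asoc;
}
```

The point of changing "goto out" to "return NULL" in the first check is visible here: once "out" unlocks, any path that reaches it must have taken the lock first.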
Re: [PATCH net] sctp: sctp_transport_lookup_process should rcu_read_unlock when transport is null
On Thu, Dec 15, 2016 at 11:05:52PM +0800, Xin Long wrote:
> Prior to this patch, sctp_transport_lookup_process didn't rcu_read_unlock
> when it failed to find a transport by sctp_addrs_lookup_transport.
>
> This patch is to fix it by moving up rcu_read_unlock right before checking
> transport and also to remove the out path.
>
> Fixes: 1cceda784980 ("sctp: fix the issue sctp_diag uses lock_sock in
> rcu_read_lock")
> Signed-off-by: Xin Long

Acked-by: Marcelo Ricardo Leitner

> ---
>  net/sctp/socket.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index d5f4b4a..318c678 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -4472,18 +4472,17 @@ int sctp_transport_lookup_process(int (*cb)(struct
> sctp_transport *, void *),
>  				  const union sctp_addr *paddr, void *p)
>  {
>  	struct sctp_transport *transport;
> -	int err = -ENOENT;
> +	int err;
>
>  	rcu_read_lock();
>  	transport = sctp_addrs_lookup_transport(net, laddr, paddr);
> +	rcu_read_unlock();
>  	if (!transport)
> -		goto out;
> +		return -ENOENT;
>
> -	rcu_read_unlock();
>  	err = cb(transport, p);
>  	sctp_transport_put(transport);
>
> -out:
>  	return err;
>  }
>  EXPORT_SYMBOL_GPL(sctp_transport_lookup_process);
> --
> 2.1.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Re: [PATCH] net: wan: Use dma_pool_zalloc
On Thu, 2016-12-15 at 10:41 +0530, Souptick Joarder wrote: > On Mon, Dec 12, 2016 at 10:12 AM, Souptick Joarder > wrote: > > On Fri, Dec 9, 2016 at 6:33 PM, Krzysztof Hałasa wrote: > > > Souptick Joarder writes: > > > > > > > We should use dma_pool_zalloc instead of dma_pool_alloc/memset [] > > > > diff --git a/drivers/net/wan/ixp4xx_hss.c b/drivers/net/wan/ixp4xx_hss.c [] > > > > @@ -976,10 +976,9 @@ static int init_hdlc_queues(struct port *port) > > > > return -ENOMEM; > > > > } > > > > > > > > - if (!(port->desc_tab = dma_pool_alloc(dma_pool, GFP_KERNEL, > > > > - &port->desc_tab_phys))) > > > > + if (!(port->desc_tab = dma_pool_zalloc(dma_pool, GFP_KERNEL, > > > > +&port->desc_tab_phys))) > > > > return -ENOMEM; > > > > - memset(port->desc_tab, 0, POOL_ALLOC_SIZE); > > > > memset(port->rx_buff_tab, 0, sizeof(port->rx_buff_tab)); /* > > > > tables */ > > > > memset(port->tx_buff_tab, 0, sizeof(port->tx_buff_tab)); > > > > > > This look fine, feel free to send it to the netdev mailing list for > > > inclusion. > > > > Including netdev mailing list based as requested. > > > Acked-by: Krzysztof Halasa [] > Any comment on this patch ? Shouldn't the one in drivers/net/ethernet/xscale/ixp4xx_eth.c also be changed?
Re: [RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)
On Wed, Dec 14, 2016 at 11:59:32PM -0800, Vishwanathapura, Niranjana wrote:

>  create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/Kconfig
>  create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/Makefile

Still NAK on these paths, I already explained why 'sw' is totally
unsuitable.

Put it in drivers/net or drivers/infiniband/ulp

Jason
Re: [RFC v2 03/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) netdev
On Wed, Dec 14, 2016 at 11:59:35PM -0800, Vishwanathapura, Niranjana wrote:

> +/**
> + * union hfi_vnic_bypass_hdr - VNIC bypass header
> + * @slid: source lid
> + * @length: length of packet
> + * @becn: backward explicit congestion notification
> + * @dlid: destination lid
> + * @sc: service class
> + * @fecn: forward explicit congestion notification
> + * @l2: L2 type (2=16B)
> + * @lt: link transfer field
> + * @l4: L4 type
> + * @slid_high: upper 4 bits of source lid
> + * @dlid_high: upper 4 bits of destination lid
> + * @pkey: partition key
> + * @entropy: entropy
> + * @age: packet age
> + * @l4_hdr: L4 header
> + */
> +union hfi_vnic_bypass_hdr {
> +	struct {
> +		struct {
> +			uint64_t slid      : 20;
> +			uint64_t length    : 11;
> +			uint64_t becn      : 1;
> +			uint64_t dlid      : 20;
> +			uint64_t sc        : 5;
> +			uint64_t rsvd      : 3;
> +			uint64_t fecn      : 1;
> +			uint64_t l2        : 2;
> +			uint64_t lt        : 1;
> +		};
> +		struct {
> +			uint64_t l4        : 8;
> +			uint64_t slid_high : 4;
> +			uint64_t dlid_high : 4;
> +			uint64_t pkey      : 16;
> +			uint64_t entropy   : 16;
> +			uint64_t age       : 8;
> +			uint64_t rsvd1     : 8;
> +		};
> +		struct {
> +			uint32_t rsvd2     : 16;
> +			uint32_t l4_hdr    : 16;
> +		};
> +	} __packed;
> +	u32 dw[5];
> +};

This isn't going to work on BE, please fix it.

> +/**
> + * struct __hfi_vesw_info - HFI vnic virtual switch info
> + */
> +struct __hfi_vesw_info {
> +	u16  fabric_id;
> +	u16  vesw_id;
> +
> +	u8   rsvd0[6];
> +	u16  def_port_mask;
> +
> +	u8   rsvd1[2];
> +	u16  pkey;
> +
> +	u8   rsvd2[4];
> +	u32  u_mcast_dlid;
> +	u32  u_ucast_dlid[HFI_VESW_MAX_NUM_DEF_PORT];
> +
> +	u8   rsvd3[44];
> +	u16  eth_mtu[HFI_VNIC_MAX_NUM_PCP];
> +	u16  eth_mtu_non_vlan;
> +	u8   rsvd4[2];
> +} __packed;

This goes on the network too? Also looks like it has endian problems.

Ditto for all the __packed structures.

> +#define v_dbg(format, arg...) \
> +	netdev_dbg(adapter->netdev, format, ## arg)
> +#define v_err(format, arg...) \
> +	netdev_err(adapter->netdev, format, ## arg)
> +#define v_info(format, arg...) \
> +	netdev_info(adapter->netdev, format, ## arg)
> +#define v_warn(format, arg...) \
> +	netdev_warn(adapter->netdev, format, ## arg)

Relies on an 'adapter' local variable?? Ugly.

Jason
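One way to address the objection to the logging macros is to pass the adapter as an explicit macro argument, so that nothing depends on a variable that happens to be in scope at the call site. The following is a minimal user-space sketch under stated assumptions: struct fake_netdev, struct fake_adapter, log_buf and last_log_is() are hypothetical stand-ins, and netdev_err() is mocked with snprintf() so the result can be inspected. It is not taken from the patch series.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for struct net_device and the driver's adapter. */
struct fake_netdev {
	const char *name;
};

struct fake_adapter {
	struct fake_netdev *netdev;
};

/* The mock logger writes here instead of the kernel log. */
static char log_buf[128];

/* Mock of netdev_err(): prefixes the message with the device name. */
#define netdev_err(dev, fmt, ...) \
	snprintf(log_buf, sizeof(log_buf), "%s: " fmt, (dev)->name, ##__VA_ARGS__)

/* The adapter is named at every call site, so the macro has no hidden
 * dependency on a local variable called 'adapter'. */
#define v_err(adapter, fmt, ...) \
	netdev_err((adapter)->netdev, fmt, ##__VA_ARGS__)

/* Helper for checking what the mock logger produced. */
static int last_log_is(const char *expected)
{
	return strcmp(log_buf, expected) == 0;
}
```

A call then reads v_err(&ad, "link down on port %d\n", 2), which works in any function regardless of what its locals are called. The ##__VA_ARGS__ form is a GNU extension, as is the "arg..." form used in the quoted macros.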
Re: [RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)
On Thu, Dec 15, 2016 at 11:28:06AM -0500, Doug Ledford wrote:

> 1) Since your intent is to make this work with multiple versions of the
> hfi drivers, I disagree with Jason that just because there is only one
> driver today that we should keep it simple. Design it right from the
> beginning of multi driver is your intent is, IMO, a better way to go.
> You'll work out the bugs in the initial implementation and when it comes
> time to add the second driver, things will go much more smoothly.

If that is your position then this should be a straight up IB ULP that
works with any IB hardware.

There is nothing HFI specific about it except for the micro-optimization
of pushing packets via SDMA instead of post_send, and that same
micro-optimization probably applies to ipoib.

In other words, let's see the first version as a straight ULP with no
special HFI hooks, then we can discuss how best to micro-optimize it for
HFI SDMA.

Jason
Re: [PATCH] net: sfc: use new api ethtool_{get|set}_link_ksettings
On 14/12/16 23:12, Philippe Reynes wrote: > The ethtool api {get|set}_settings is deprecated. > We move this driver to new api {get|set}_link_ksettings. > > Signed-off-by: Philippe Reynes Tested-by: Bert Kenward Acked-by: Bert Kenward
Re: [PATCH net-next 1/3] net:dsa:mv88e6xxx: use hashtable to store multicast entries
Hi Volodymyr,

Volodymyr Bendiuga writes:

> Hi Andrew,
>
> I have tested the approach you wrote in previous mails, the one
> with setting next.mac to address we are looking for -1. It seems
> to be as slow as the original implementation, unfortunately.

Hum, that is what I was expecting... The ATU GetNext operation (alongside
an ether_addr_equal() call) should be quite fast.

> We use 6097 and 6352 chips, and both of them can not do any port
> filtering in hardware for fdb dump operation. Seems like they would
> benefit from cache. But I am not sure about other switches.
>
> Does anyone know about such feature in other switches?

Marvell switches cannot filter ATU entries for a specific port, they
contain a port vector. I guess Florian might answer for Broadcom
switches, and John might answer for Qualcomm switches.

In all cases *if caching is really needed*, I think it won't hurt to do
it in DSA core even if a switch supports FDB dump operations on a
per-port basis, as Andrew mentioned.

Thanks,

Vivien
RE: [RFC v2 03/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) netdev
> This goes on the network too? Also looks like it has endian problems. I don't think OPA supports BE systems, and I think it uses LE on the wire for at least some portions of its protocol.
Re: [RFC v2 03/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) netdev
On Thu, Dec 15, 2016 at 05:21:05PM +, Hefty, Sean wrote:
> > This goes on the network too? Also looks like it has endian problems.
>
> I don't think OPA supports BE systems, and I think it uses LE on the
> wire for at least some portions of its protocol.

This is a Linux driver for a PCI device. It needs to support big endian
systems, that is how we do things in Linux.

If it uses LE on the wire then mark with __le and make it sparse clean.

Do not use bitfields without providing a BE version of the bitfield.

Jason
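An alternative to maintaining separate LE and BE bitfield declarations is to drop bitfields and build each header word with shifts and masks, converting to the wire byte order in exactly one place. The sketch below does this in user space for three fields of the first 64-bit word. It rests on assumptions that the thread does not confirm: that the wire format is little-endian, and that the bit positions follow the declaration order of the quoted bitfield as a little-endian compiler would lay it out. All HDR_* names and the helper functions are hypothetical. In the kernel the stored type would be __le64 with cpu_to_le64() and le64_to_cpu(), which is what lets sparse check the conversions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions within the first 64-bit header word (LSB first). */
#define HDR_SLID_SHIFT		0
#define HDR_LENGTH_SHIFT	20
#define HDR_DLID_SHIFT		32
#define HDR_LID_MASK		0xFFFFFull	/* 20 bits */
#define HDR_LENGTH_MASK		0x7FFull	/* 11 bits */

/* Store a host-order value as little-endian bytes, on any host. */
static void put_le64(uint8_t *dst, uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		dst[i] = (uint8_t)(v >> (8 * i));
}

/* Load little-endian bytes into a host-order value, on any host. */
static uint64_t get_le64(const uint8_t *src)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v |= (uint64_t)src[i] << (8 * i);
	return v;
}

/* Build the host-order word; callers pass values that fit their fields. */
static uint64_t hdr_word0_pack(uint32_t slid, uint32_t length, uint32_t dlid)
{
	assert(slid <= HDR_LID_MASK);
	assert(dlid <= HDR_LID_MASK);
	assert(length <= HDR_LENGTH_MASK);

	return ((uint64_t)slid << HDR_SLID_SHIFT) |
	       ((uint64_t)length << HDR_LENGTH_SHIFT) |
	       ((uint64_t)dlid << HDR_DLID_SHIFT);
}

static uint32_t hdr_word0_slid(uint64_t w)
{
	return (uint32_t)((w >> HDR_SLID_SHIFT) & HDR_LID_MASK);
}

static uint32_t hdr_word0_length(uint64_t w)
{
	return (uint32_t)((w >> HDR_LENGTH_SHIFT) & HDR_LENGTH_MASK);
}

static uint32_t hdr_word0_dlid(uint64_t w)
{
	return (uint32_t)((w >> HDR_DLID_SHIFT) & HDR_LID_MASK);
}
```

Because the byte order is fixed by put_le64() and get_le64() rather than by the compiler's bitfield layout, the same bytes go on the wire whether the host is little or big endian.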