Re: [PATCH 8/8] Makefile: drop -D__CHECK_ENDIAN__ from cflags

2016-12-15 Thread Marc Kleine-Budde
On 12/15/2016 06:15 AM, Michael S. Tsirkin wrote:
> That's the default now, no need for makefiles to set it.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
[...]
>  drivers/net/can/Makefile  | 1 -

For drivers/net/can/Makefile:

Acked-by: Marc Kleine-Budde 

regards,
Marc

-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |





Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread Herbert Xu
Jason A. Donenfeld  wrote:
> 
> Siphash needs a random secret key, yes. The point is that the hash
> function remains secure so long as the secret key is kept secret.
> Other functions can't make the same guarantee, and so nervous periodic
> key rotation is necessary, but in most cases nothing is done, and so
> things just leak over time.

Actually those users that use rhashtable now have a much more
sophisticated defence against these attacks, dynamic rehashing
when bucket length exceeds a preset limit.
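
For readers unfamiliar with that defence, here is a minimal userspace
sketch of the idea: measure the chain length on insert and, once it
exceeds a limit, switch to a new seed and rehash every entry. This is not
rhashtable's code. The names toy_table, CHAIN_LIMIT, NBUCKETS and the
mixing function are assumptions made up for illustration, and the caller
supplies the new seed where a real table would draw a random one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS    8   /* fixed bucket count, illustration only */
#define CHAIN_LIMIT 4   /* hypothetical preset chain-length limit */

struct toy_node {
	uint32_t key;
	struct toy_node *next;
};

struct toy_table {
	uint32_t seed;
	unsigned int rehashes;          /* how many times the seed changed */
	struct toy_node *bucket[NBUCKETS];
};

/* Seeded integer mixer (not cryptographic; stands in for a real hash). */
static unsigned int toy_hash(uint32_t key, uint32_t seed)
{
	uint32_t h = key ^ seed;

	h ^= h >> 16;
	h *= 0x45d9f3bu;
	h ^= h >> 16;
	return h % NBUCKETS;
}

/* Unlink every node, then re-insert each one under the new seed. */
static void toy_rehash(struct toy_table *t, uint32_t new_seed)
{
	struct toy_node *all = NULL, *n, *next;
	unsigned int i;

	assert(t != NULL);
	for (i = 0; i < NBUCKETS; i++) {
		for (n = t->bucket[i]; n; n = next) {
			next = n->next;
			n->next = all;
			all = n;
		}
		t->bucket[i] = NULL;
	}
	t->seed = new_seed;
	t->rehashes++;
	for (n = all; n; n = next) {
		unsigned int b = toy_hash(n->key, new_seed);

		next = n->next;
		n->next = t->bucket[b];
		t->bucket[b] = n;
	}
}

/* Insert a key; rehash with new_seed if its chain grew past the limit. */
static int toy_insert(struct toy_table *t, uint32_t key, uint32_t new_seed)
{
	unsigned int b = toy_hash(key, t->seed), len = 0;
	struct toy_node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->key = key;
	n->next = t->bucket[b];
	t->bucket[b] = n;
	for (n = t->bucket[b]; n; n = n->next)
		len++;
	if (len > CHAIN_LIMIT)
		toy_rehash(t, new_seed);
	return 0;
}

/* Return 1 if the key is present under the current seed, else 0. */
static int toy_contains(const struct toy_table *t, uint32_t key)
{
	const struct toy_node *n;

	for (n = t->bucket[toy_hash(key, t->seed)]; n; n = n->next)
		if (n->key == key)
			return 1;
	return 0;
}
```

An attacker who finds colliding keys for one seed only lengthens a chain
until the limit trips; after the rehash the collisions have to be found
again for a seed the attacker does not know.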

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)

2016-12-15 Thread Vishwanathapura, Niranjana
Thanks Jason for the valuable feedback.
Here is the revised HFI VNIC patch series.

ChangeLog:
==========
v1 => v2:
a) Removed hfi_vnic bus, instead make hfi_vnic driver an 'ib client',
   as per feedback from Jason Gunthorpe.
b) Interface changes, data structure changes and variable name changes
   associated with (a).
c) Add hfi_ibdev abstraction to provide VNIC control operations to
   hfi_vnic client.
d) Minor fixes
e) Moved hfi_vnic driver from .../sw/intel/vnic/hfi_vnic to
   .../sw/intel/hfi_vnic.

v1: Initial post @ https://www.spinics.net/lists/linux-rdma/msg43158.html

Description:

Intel Omni-Path Host Fabric Interface (HFI) Virtual Network Interface
Controller (VNIC) feature supports Ethernet functionality over Omni-Path
fabric by encapsulating the Ethernet packets between HFI nodes.

The pattern of exchanges of Omni-Path encapsulated Ethernet packets
involves one or more virtual Ethernet switches overlaid on the Omni-Path
fabric topology. A subset of HFI nodes on the Omni-Path fabric are
permitted to exchange encapsulated Ethernet packets across a particular
virtual Ethernet switch. The virtual Ethernet switches are logical
abstractions achieved by configuring the HFI nodes on the fabric for
header generation and processing. In the simplest configuration all HFI
nodes across the fabric exchange encapsulated Ethernet packets over a
single virtual Ethernet switch. A virtual Ethernet switch is effectively
an independent Ethernet network. The configuration is performed by an
Ethernet Manager (EM) which is part of the trusted Fabric Manager (FM)
application. HFI nodes can have multiple VNICs each connected to a
different virtual Ethernet switch. The below diagram presents a case
of two virtual Ethernet switches with two HFI nodes.

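                             +-------------------+
                             |      Subnet/      |
                             |     Ethernet      |
                             |      Manager      |
                             +-------------------+
                                /          /
                              /           /
                            /            /
                          /             /
+-----------------------------+  +------------------------------+
|  Virtual Ethernet Switch    |  |  Virtual Ethernet Switch     |
|  +---------+    +---------+ |  | +---------+    +---------+   |
|  | VPORT   |    |  VPORT  | |  | |  VPORT  |    |  VPORT  |   |
+--+---------+----+---------+-+  +-+---------+----+---------+---+
        |              \                /              |
        |                 \          /                 |
        |                      \/                      |
        |                    /    \                    |
        |                 /          \                 |
  +------------+--------------+  +-------------+--------------+
  |    VNIC    |     VNIC     |  |    VNIC     |     VNIC     |
  +------------+--------------+  +-------------+--------------+
  |            HFI            |  |            HFI             |
  +---------------------------+  +----------------------------+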
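
The membership rule in the description (two nodes exchange encapsulated
Ethernet packets over a virtual Ethernet switch only if both have a VNIC
on it) can be modelled in a few lines. This is only a sketch: the
structure names, the vesw_id field and MAX_VNICS_PER_HFI are assumptions
for illustration, not definitions from the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VNICS_PER_HFI 4   /* hypothetical limit for this sketch */

struct toy_vnic {
	uint16_t vesw_id;         /* virtual Ethernet switch this VNIC joins */
};

struct toy_hfi_node {
	size_t num_vnics;
	struct toy_vnic vnic[MAX_VNICS_PER_HFI];
};

/* Two nodes can use a switch only if each has a VNIC attached to it. */
static bool toy_can_exchange(const struct toy_hfi_node *a,
			     const struct toy_hfi_node *b, uint16_t vesw_id)
{
	bool a_on = false, b_on = false;
	size_t i;

	assert(a->num_vnics <= MAX_VNICS_PER_HFI);
	assert(b->num_vnics <= MAX_VNICS_PER_HFI);
	for (i = 0; i < a->num_vnics; i++)
		a_on |= a->vnic[i].vesw_id == vesw_id;
	for (i = 0; i < b->num_vnics; i++)
		b_on |= b->vnic[i].vesw_id == vesw_id;
	return a_on && b_on;
}
```

In the two-switch case shown in the diagram, both nodes carry one VNIC per
switch, so the check succeeds for either switch.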

Intel HFI VNIC software design is presented in the below diagram.
HFI VNIC functionality has a HW dependent component and a HW
independent component.

The HW dependent VNIC functionality is part of the HFI1 driver. It
implements the callback functions for various tasks, which include
adding and removing VNIC ports, HW resource allocation for VNIC
functionality and actual transmission and reception of encapsulated
Ethernet packets over the fabric. Each VNIC port is addressed by the
HFI port number, and the VNIC port number on that HFI port.

The HFI VNIC module implements the HW independent VNIC functionality.
It consists of two parts. The VNIC Ethernet Management Agent (VEMA)
registers itself with IB core as an IB client and interfaces with the
IB MAD stack. It exchanges the management information with the Ethernet
Manager (EM) and the VNIC netdev. The VNIC netdev part interfaces with
the Linux network stack, thus providing standard Ethernet network
interfaces. It invokes HFI device's VNIC callback functions for HW access.
The VNIC netdev encapsulates the Ethernet packets with an Omni-Path
header before passing them to the HFI1 driver for transmission.
Similarly, it de-encapsulates the received Omni-Path packets before
passing them to the network stack. For each VNIC interface, the
information required for encapsulation is configured by EM via VEMA MAD
interface.
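
To make the encapsulation step concrete, here is a userspace sketch of
prepending a fabric header on transmit and stripping it on receive. It
assumes a 16-byte header carrying only a source and destination LID; the
real Omni-Path header layout and HFI_VNIC_HDR_LEN are defined by the
patches and are not reproduced here. All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HDR_LEN 16   /* assumed header size, not the real one */

struct toy_hdr {
	uint32_t slid;               /* source LID */
	uint32_t dlid;               /* destination LID */
	uint8_t  pad[TOY_HDR_LEN - 8];
};

_Static_assert(sizeof(struct toy_hdr) == TOY_HDR_LEN, "header size");

/* Write header + Ethernet frame into out; return total length, 0 if too small. */
static size_t toy_encap(uint8_t *out, size_t out_len, uint32_t slid,
			uint32_t dlid, const uint8_t *frame, size_t frame_len)
{
	struct toy_hdr hdr;

	assert(out != NULL && frame != NULL);
	if (out_len < TOY_HDR_LEN + frame_len)
		return 0;
	memset(&hdr, 0, sizeof(hdr));
	hdr.slid = slid;
	hdr.dlid = dlid;
	memcpy(out, &hdr, TOY_HDR_LEN);
	memcpy(out + TOY_HDR_LEN, frame, frame_len);
	return TOY_HDR_LEN + frame_len;
}

/* Return a pointer to the inner frame and its length, or NULL if too short. */
static const uint8_t *toy_decap(const uint8_t *pkt, size_t pkt_len,
				size_t *frame_len)
{
	if (pkt_len < TOY_HDR_LEN)
		return NULL;
	*frame_len = pkt_len - TOY_HDR_LEN;
	return pkt + TOY_HDR_LEN;
}
```

In the driver the same effect is achieved on an skb rather than a flat
buffer, and the LIDs come from the configuration the EM pushes via VEMA.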


+---+ +--+
|   | |   Linux  |
| IB MAD| |  Network |
|   | |   Stack  |
+---+ +--+
 |   |
 |   |
++
||
|   

[RFC v2 06/10] IB/hfi-vnic: VNIC MAC table support

2016-12-15 Thread Vishwanathapura, Niranjana
HFI VNIC MAC table contains the MAC address to DLID mappings provided by
the Ethernet manager. During transmission, the MAC table provides the MAC
address to DLID translation. Implement the MAC table using a simple hash
list. Also provide support for the Ethernet manager to update and query
the MAC table.
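
As an illustration of the lookup side, the following userspace sketch
keys a chained hash table on the last octet of the MAC address, as the
patch comment describes. It is not the patch's code: the toy_* names and
the 256-bucket size are assumptions, and the sketch uses plain pointers
where the patch uses hlist and RCU.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_ETH_ALEN   6
#define TOY_TBL_SIZE 256   /* one bucket per possible last octet (assumed) */

struct toy_mac_node {
	uint8_t  mac[TOY_ETH_ALEN];
	uint32_t dlid;
	struct toy_mac_node *next;
};

struct toy_mac_tbl {
	struct toy_mac_node *bucket[TOY_TBL_SIZE];
};

/* Add a MAC -> DLID mapping; returns 0 on success, -1 on allocation failure. */
static int toy_mac_add(struct toy_mac_tbl *t, const uint8_t *mac, uint32_t dlid)
{
	struct toy_mac_node *n = malloc(sizeof(*n));
	uint8_t key = mac[TOY_ETH_ALEN - 1];

	assert(t != NULL);
	if (!n)
		return -1;
	memcpy(n->mac, mac, TOY_ETH_ALEN);
	n->dlid = dlid;
	n->next = t->bucket[key];
	t->bucket[key] = n;
	return 0;
}

/* Translate a destination MAC to a DLID; 0 means "no entry". */
static uint32_t toy_mac_lookup(const struct toy_mac_tbl *t, const uint8_t *mac)
{
	const struct toy_mac_node *n;

	for (n = t->bucket[mac[TOY_ETH_ALEN - 1]]; n; n = n->next)
		if (!memcmp(n->mac, mac, TOY_ETH_ALEN))
			return n->dlid;
	return 0;
}
```

The sketch leaves out updates. The patch handles them by building a
complete new table, switching the RCU-protected pointer to it, and freeing
the old table afterwards, so readers never see a half-updated table.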

Reviewed-by: Dennis Dalessandro 
Reviewed-by: Ira Weiny 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Sadanand Warrier 
---
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c  | 236 +
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h  |  53 -
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_netdev.c |   4 +
 3 files changed, 292 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
index 3fdfb7b..e45cff8 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
@@ -104,6 +104,238 @@
 
 #define HFI_VNIC_SC_MASK 0x1f
 
+/*
+ * Using a simple hash table for mac table implementation with the last octet
+ * of mac address as a key.
+ */
+static void hfi_vnic_free_mac_tbl(struct hlist_head *mactbl)
+{
+	struct hfi_vnic_mac_tbl_node *node;
+	struct hlist_node *tmp;
+	int bkt;
+
+	if (!mactbl)
+		return;
+
+	vnic_hash_for_each_safe(mactbl, bkt, tmp, node, hlist) {
+		hash_del(&node->hlist);
+		kfree(node);
+	}
+	kfree(mactbl);
+}
+
+static struct hlist_head *hfi_vnic_alloc_mac_tbl(void)
+{
+	u32 size = sizeof(struct hlist_head) * HFI_VNIC_MAC_TBL_SIZE;
+	struct hlist_head *mactbl;
+
+	mactbl = kzalloc(size, GFP_KERNEL);
+	if (!mactbl)
+		return ERR_PTR(-ENOMEM);
+
+	vnic_hash_init(mactbl);
+	return mactbl;
+}
+
+/* hfi_vnic_release_mac_tbl - empty and free the mac table */
+void hfi_vnic_release_mac_tbl(struct hfi_vnic_adapter *adapter)
+{
+	struct hlist_head *mactbl;
+
+	mutex_lock(&adapter->mactbl_lock);
+	mactbl = rcu_access_pointer(adapter->mactbl);
+	rcu_assign_pointer(adapter->mactbl, NULL);
+	synchronize_rcu();
+	hfi_vnic_free_mac_tbl(mactbl);
+	mutex_unlock(&adapter->mactbl_lock);
+}
+
+/*
+ * hfi_vnic_query_mac_tbl - query the mac table for a section
+ *
+ * This function implements query of specific function of the mac table.
+ * The function also expects the requested range to be valid.
+ */
+void hfi_vnic_query_mac_tbl(struct hfi_vnic_adapter *adapter,
+			    struct hfi_veswport_mactable *tbl)
+{
+	struct hfi_vnic_mac_tbl_node *node;
+	struct hlist_head *mactbl;
+	int bkt;
+	u16 loffset, lnum_entries;
+
+	rcu_read_lock();
+	mactbl = rcu_dereference(adapter->mactbl);
+	if (!mactbl)
+		goto get_mac_done;
+
+	loffset = be16_to_cpu(tbl->offset);
+	lnum_entries = be16_to_cpu(tbl->num_entries);
+
+	vnic_hash_for_each(mactbl, bkt, node, hlist) {
+		struct __hfi_vnic_mactable_entry *nentry = &node->entry;
+		struct hfi_veswport_mactable_entry *entry;
+
+		if ((node->index < loffset) ||
+		    (node->index >= (loffset + lnum_entries)))
+			continue;
+
+		/* populate entry in the tbl corresponding to the index */
+		entry = &tbl->tbl_entries[node->index - loffset];
+		memcpy(entry->mac_addr, nentry->mac_addr,
+		       ARRAY_SIZE(entry->mac_addr));
+		memcpy(entry->mac_addr_mask, nentry->mac_addr_mask,
+		       ARRAY_SIZE(entry->mac_addr_mask));
+		entry->dlid_sd.dw = cpu_to_be32(nentry->dlid_sd.dw);
+	}
+	tbl->mac_tbl_digest = cpu_to_be32(adapter->info.vport.mac_tbl_digest);
+get_mac_done:
+	rcu_read_unlock();
+}
+
+/*
+ * hfi_vnic_update_mac_tbl - update mac table section
+ *
+ * This function updates the specified section of the mac table.
+ * The procedure includes following steps.
+ *  - Allocate a new mac (hash) table.
+ *  - Add the specified entries to the new table.
+ *(except the ones that are requested to be deleted).
+ *  - Add all the other entries from the old mac table.
+ *  - If there is a failure, free the new table and return.
+ *  - Switch to the new table.
+ *  - Free the old table and return.
+ *
+ * The function also expects the requested range to be valid.
+ */
+int hfi_vnic_update_mac_tbl(struct hfi_vnic_adapter *adapter,
+   struct hfi_veswport_mactable *tbl)
+{
+   struct hfi_vnic_mac_tbl_node *node, *new_node;
+   struct hlist_head *new_mactbl, *old_mactbl;
+   int i, bkt, rc = 0;
+   u8 key;
+   u16 loffset, lnum_entries;
+
+   mutex_lock(&adapter->mactbl_lock);
+   /* allocate new mac table */
+   new_mactbl = hfi_vnic_alloc_mac_tbl();
+   if (IS_ERR(new_mactbl)) {
+   mutex_unlock(&ada

[RFC v2 07/10] IB/hfi-vnic: VNIC Ethernet Management Agent (VEMA) interface

2016-12-15 Thread Vishwanathapura, Niranjana
HFI VNIC EMA interface functions are the management interfaces to the HFI
VNIC netdev. Add support to add and remove VNIC ports. Implement the
required GET/SET management interface functions and processing of new
management information. Add support to send trap notifications upon various
events like interface status change, unicast/multicast mac list update and
mac address change.

Reviewed-by: Dennis Dalessandro 
Reviewed-by: Ira Weiny 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Sadanand Warrier 
Signed-off-by: Tanya K Jajodia 
---
 drivers/infiniband/sw/intel/hfi_vnic/Makefile  |   3 +-
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h  |   4 +
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h  |  44 +++
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_netdev.c | 153 +++-
 .../sw/intel/hfi_vnic/hfi_vnic_vema_iface.c| 432 +
 5 files changed, 633 insertions(+), 3 deletions(-)
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_vema_iface.c

diff --git a/drivers/infiniband/sw/intel/hfi_vnic/Makefile b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
index 8e3dca7..a0562af 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/Makefile
+++ b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
@@ -3,4 +3,5 @@
 #
 obj-$(CONFIG_HFI_VNIC) += hfi_vnic.o
 
-hfi_vnic-y := hfi_vnic_netdev.o hfi_vnic_encap.o hfi_vnic_ethtool.o
+hfi_vnic-y := hfi_vnic_netdev.o hfi_vnic_encap.o hfi_vnic_ethtool.o \
+  hfi_vnic_vema_iface.o
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
index a6770ef..54e9081 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
@@ -99,6 +99,10 @@
 #define HFI_VNIC_STATE_DROP_ALL0x1
 #define HFI_VNIC_STATE_FORWARDING  0x3
 
+/* VNIC Ethernet link status */
+#define HFI_VNIC_ETH_LINK_UP 1
+#define HFI_VNIC_ETH_LINK_DOWN   2
+
 /**
  * struct hfi_vesw_info - HFI vnic switch information
  * @fabric_id: 10-bit fabric id
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
index 6d5c5f8..7723a4e 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
@@ -243,6 +243,16 @@ struct __hfi_veswport_trap {
 } __packed;
 
 /**
+ * struct hfi_vnic_ctrl_port - HFI virtual NIC control port
+ * @ibdev: pointer to ib device
+ * @ops: hfi vnic control operations
+ */
+struct hfi_vnic_ctrl_port {
+   struct ib_device   *ibdev;
+   struct hfi_vnic_ctrl_ops   *ops;
+};
+
+/**
  * struct hfi_vnic_rx_queue - HFI VNIC receive queue
  * @idx: queue index
  * @adapter: netdev adapter
@@ -257,11 +267,15 @@ struct hfi_vnic_rx_queue {
 /**
  * struct hfi_vnic_adapter - HFI VNIC netdev private data structure
  * @netdev: pointer to associated netdev
+ * @cport: pointer to hfi vnic control port
  * @vport: pointer to hfi vnic port
  * @flags: flags indicating various states
  * @lock: adapter lock
  * @rxq: receive queue array
  * @info: virtual ethernet switch port information
+ * @vema_mac_addr: mac address configured by vema
+ * @umac_hash: unicast maclist hash
+ * @mmac_hash: multicast maclist hash
  * @mactbl: hash table of MAC entries
  * @mactbl_lock: mac table lock
  * @stats_lock: statistics lock
@@ -278,6 +292,7 @@ struct hfi_vnic_rx_queue {
  */
 struct hfi_vnic_adapter {
struct net_device *netdev;
+   struct hfi_vnic_ctrl_port *cport;
struct hfi_vnic_port  *vport;
unsigned long  flags;
 
@@ -287,6 +302,9 @@ struct hfi_vnic_adapter {
struct hfi_vnic_rx_queue  rxq[HFI_VNIC_MAX_QUEUE];
 
struct __hfi_veswport_info  info;
+   u8  vema_mac_addr[ETH_ALEN];
+   u32 umac_hash;
+   u32 mmac_hash;
struct hlist_head  __rcu   *mactbl;
 
/* Lock used to protect updates to mac table */
@@ -338,6 +356,11 @@ struct hfi_vnic_mac_tbl_node {
 #define v_warn(format, arg...) \
netdev_warn(adapter->netdev, format, ## arg)
 
+#define c_err(format, arg...) \
+   dev_err(&cport->ibdev->dev, format, ## arg)
+#define c_info(format, arg...) \
+   dev_info(&cport->ibdev->dev, format, ## arg)
+
 /* The maximum allowed entries in the mac table */
 #define HFI_VNIC_MAC_TBL_MAX_ENTRIES  2048
 /* Limit of smac entries in mac table */
@@ -377,12 +400,33 @@ struct hfi_vnic_adapter *hfi_vnic_add_netdev(struct hfi_vnic_port *vport,
 int hfi_vnic_encap_skb(struct hfi_vnic_adapter *adapter, struct sk_buff *skb);
 int hfi_vnic_decap_skb(struct hfi_vnic_rx_queue *rxq, struct sk_buff *skb);
 u8 hfi_vnic_calc_entropy(struct hfi_vnic_adapter *adapter, struct sk_buff *skb);
+void hfi_vnic_process_vema_config(struct hfi_vnic_adapter *adapter);
 void hfi_vnic_release_mac_tbl(struct hf

[RFC v2 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface

2016-12-15 Thread Vishwanathapura, Niranjana
Create hfi_ibdev abstraction which hfi1_ibdev will extend.
Define HFI VNIC interface between hardware independent VNIC
functionality and the hardware dependent VNIC functionality.
Add VNIC control operations, which add and remove VNIC devices,
to the hfi_ibdev structure.
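
The many one-line changes from "rdi" to "hfidev.rdi" in this patch follow
from embedding the generic structure inside the hardware-specific one.
The userspace sketch below shows why container_of() with a nested member
path still recovers the outer structure. The toy_* types and their fields
are stand-ins invented for illustration; only the nesting (hfidev.rdi)
mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_rdi {                     /* stands in for rvt_dev_info */
	int id;
};

struct toy_hfi_ibdev {               /* generic layer, holds the VNIC ops */
	struct toy_rdi rdi;
	int num_vnic_ports;          /* hypothetical field */
};

struct toy_hfi1_ibdev {              /* hardware-specific layer */
	int hw_private;              /* hypothetical field */
	struct toy_hfi_ibdev hfidev; /* embedded, not pointed to */
};

/* Recover the outer structure from a pointer to the innermost member. */
static struct toy_hfi1_ibdev *toy_dev_from_rdi(struct toy_rdi *rdi)
{
	assert(rdi != NULL);
	return toy_container_of(rdi, struct toy_hfi1_ibdev, hfidev.rdi);
}
```

Because the inner structures are embedded by value, the offset of
hfidev.rdi within the outer structure is a compile-time constant, so the
conversion costs a single pointer subtraction.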

Reviewed-by: Dennis Dalessandro 
Reviewed-by: Ira Weiny 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/infiniband/hw/hfi1/chip.c   |   2 +-
 drivers/infiniband/hw/hfi1/driver.c |  10 +-
 drivers/infiniband/hw/hfi1/hfi.h|   2 +-
 drivers/infiniband/hw/hfi1/init.c   |   4 +-
 drivers/infiniband/hw/hfi1/intr.c   |   2 +-
 drivers/infiniband/hw/hfi1/mad.c|   2 +-
 drivers/infiniband/hw/hfi1/qp.c |  24 +++--
 drivers/infiniband/hw/hfi1/ruc.c|   2 +-
 drivers/infiniband/hw/hfi1/sysfs.c  |  22 ++--
 drivers/infiniband/hw/hfi1/verbs.c  | 113 ++--
 drivers/infiniband/hw/hfi1/verbs.h  |   9 +-
 include/rdma/opa_hfi.h  | 199 
 12 files changed, 298 insertions(+), 93 deletions(-)
 create mode 100644 include/rdma/opa_hfi.h

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 37d8af5..9263984 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -10452,7 +10452,7 @@ int set_link_state(struct hfi1_pportdata *ppd, u32 state)
sdma_all_running(dd);
 
/* Signal the IB layer that the port has went active */
-   event.device = &dd->verbs_dev.rdi.ibdev;
+   event.device = &dd->verbs_dev.hfidev.rdi.ibdev;
event.element.port_num = ppd->port;
event.event = IB_EVENT_PORT_ACTIVE;
}
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index d426116..e219c3b 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -163,7 +163,8 @@ const char *get_unit_name(int unit)
 
 const char *get_card_name(struct rvt_dev_info *rdi)
 {
-   struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev, rdi);
+   struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev,
+   hfidev.rdi);
struct hfi1_devdata *dd = container_of(ibdev,
   struct hfi1_devdata, verbs_dev);
return get_unit_name(dd->unit);
@@ -171,7 +172,8 @@ const char *get_card_name(struct rvt_dev_info *rdi)
 
 struct pci_dev *get_pci_dev(struct rvt_dev_info *rdi)
 {
-   struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev, rdi);
+   struct hfi1_ibdev *ibdev = container_of(rdi, struct hfi1_ibdev,
+   hfidev.rdi);
struct hfi1_devdata *dd = container_of(ibdev,
   struct hfi1_devdata, verbs_dev);
return dd->pcidev;
@@ -281,7 +283,7 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
int lnh = be16_to_cpu(rhdr->lrh[0]) & 3;
struct hfi1_ibport *ibp = &ppd->ibport_data;
struct hfi1_devdata *dd = ppd->dd;
-   struct rvt_dev_info *rdi = &dd->verbs_dev.rdi;
+   struct rvt_dev_info *rdi = &dd->verbs_dev.hfidev.rdi;
 
if (packet->rhf & (RHF_VCRC_ERR | RHF_ICRC_ERR))
return;
@@ -600,7 +602,7 @@ static void __prescan_rxq(struct hfi1_packet *packet)
struct rvt_qp *qp;
struct ib_header *hdr;
struct ib_other_headers *ohdr;
-   struct rvt_dev_info *rdi = &dd->verbs_dev.rdi;
+   struct rvt_dev_info *rdi = &dd->verbs_dev.hfidev.rdi;
u64 rhf = rhf_to_cpu(rhf_addr);
u32 etype = rhf_rcv_type(rhf), qpn, bth1;
int is_ecn = 0;
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 4163596..1fc5b68 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1601,7 +1601,7 @@ static inline struct hfi1_pportdata *ppd_from_ibp(struct hfi1_ibport *ibp)
 
 static inline struct hfi1_ibdev *dev_from_rdi(struct rvt_dev_info *rdi)
 {
-   return container_of(rdi, struct hfi1_ibdev, rdi);
+   return container_of(rdi, struct hfi1_ibdev, hfidev.rdi);
 }
 
 static inline struct hfi1_ibport *to_iport(struct ib_device *ibdev, u8 port)
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 60db615..13f6862 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1020,7 +1020,7 @@ static void __hfi1_free_devdata(struct kobject *kobj)
free_percpu(dd->int_counter);
free_percpu(dd->rcv_limit);
free_percpu(dd->send_schedule);
-   rvt_dealloc_device(&dd->verbs_dev.rdi);
+   rvt_dealloc_device(&dd->verbs_dev.hfidev.rdi);
 }
 
 static struct kobj_type hfi1_devdata_type = {
@@ -1133,7 +113

[RFC v2 08/10] IB/hfi-vnic: VNIC Ethernet Management Agent (VEMA) function

2016-12-15 Thread Vishwanathapura, Niranjana
HFI VEMA function interfaces with the InfiniBand MAD stack to exchange the
management information packets with the Ethernet Manager (EM).
It interfaces with the HFI VNIC netdev function to SET/GET the management
information. The information exchanged with the EM includes class port
details, encapsulation configuration, various counters, unicast and
multicast MAC list and the MAC table. It also supports sending traps
to the EM.

Reviewed-by: Dennis Dalessandro 
Reviewed-by: Ira Weiny 
Signed-off-by: Sadanand Warrier 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Tanya K Jajodia 
Signed-off-by: Sudeep Dutt 
---
 drivers/infiniband/sw/intel/hfi_vnic/Makefile  |2 +-
 .../sw/intel/hfi_vnic/hfi_vnic_ethtool.c   |   12 +
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h  |   11 +
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_vema.c   | 1024 
 .../sw/intel/hfi_vnic/hfi_vnic_vema_iface.c|4 +-
 5 files changed, 1050 insertions(+), 3 deletions(-)
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_vema.c

diff --git a/drivers/infiniband/sw/intel/hfi_vnic/Makefile b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
index a0562af..16c0830 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/Makefile
+++ b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
@@ -4,4 +4,4 @@
 obj-$(CONFIG_HFI_VNIC) += hfi_vnic.o
 
 hfi_vnic-y := hfi_vnic_netdev.o hfi_vnic_encap.o hfi_vnic_ethtool.o \
-  hfi_vnic_vema_iface.o
+  hfi_vnic_vema.o hfi_vnic_vema_iface.o
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
index 9289ab2..9c2ed37 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
@@ -130,6 +130,17 @@ struct vnic_stats {
 
 #define VNIC_STATS_LEN  ARRAY_SIZE(vnic_gstrings_stats)
 
+/* vnic_get_drvinfo - get driver info */
+static void vnic_get_drvinfo(struct net_device *netdev,
+struct ethtool_drvinfo *drvinfo)
+{
+   strlcpy(drvinfo->driver, hfi_vnic_driver_name, sizeof(drvinfo->driver));
+   strlcpy(drvinfo->version, hfi_vnic_driver_version,
+   sizeof(drvinfo->version));
+   strlcpy(drvinfo->bus_info, dev_name(netdev->dev.parent),
+   sizeof(drvinfo->bus_info));
+}
+
 /* vnic_get_sset_count - get string set count */
 static int vnic_get_sset_count(struct net_device *netdev, int sset)
 {
@@ -183,6 +194,7 @@ static void vnic_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
 
 /* ethtool ops */
 static const struct ethtool_ops hfi_vnic_ethtool_ops = {
+   .get_drvinfo = vnic_get_drvinfo,
.get_link = ethtool_op_get_link,
.get_strings = vnic_get_strings,
.get_sset_count = vnic_get_sset_count,
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
index 7723a4e..b36bb76 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
@@ -246,10 +246,12 @@ struct __hfi_veswport_trap {
  * struct hfi_vnic_ctrl_port - HFI virtual NIC control port
  * @ibdev: pointer to ib device
  * @ops: hfi vnic control operations
+ * @num_ports: number of hfi ports
  */
 struct hfi_vnic_ctrl_port {
struct ib_device   *ibdev;
struct hfi_vnic_ctrl_ops   *ops;
+   u8  num_ports;
 };
 
 /**
@@ -280,6 +282,8 @@ struct hfi_vnic_rx_queue {
  * @mactbl_lock: mac table lock
  * @stats_lock: statistics lock
  * @flow_tbl: flow to default port redirection table
+ * @trap_timeout: trap timeout
+ * @trap_count: no. of traps allowed within timeout period
  * @q_sum_cntrs: per queue EM summary counters
  * @q_err_cntrs: per queue EM error counters
  * @q_rx_logic_errors: per queue rx logic (default) errors
@@ -314,6 +318,8 @@ struct hfi_vnic_adapter {
struct mutex stats_lock;
 
u8 flow_tbl[HFI_VNIC_FLOW_TBL_SIZE];
+   unsigned long trap_timeout;
+   u8trap_count;
 
struct __hfi_vnic_summary_counters  q_sum_cntrs[HFI_VNIC_MAX_QUEUE];
struct __hfi_vnic_error_countersq_err_cntrs[HFI_VNIC_MAX_QUEUE];
@@ -394,6 +400,9 @@ struct hfi_vnic_mac_tbl_node {
!obj && (bkt) < HFI_VNIC_MAC_TBL_SIZE; (bkt)++)   \
hlist_for_each_entry(obj, &name[bkt], member)
 
+extern char hfi_vnic_driver_name[];
+extern const char hfi_vnic_driver_version[];
+
 struct hfi_vnic_adapter *hfi_vnic_add_netdev(struct hfi_vnic_port *vport,
 struct device *parent);
 void hfi_vnic_rem_netdev(struct hfi_vnic_port *vport);
@@ -428,5 +437,7 @@ struct hfi_vnic_adapter *hfi_vnic_add_vport(struct hfi_vnic_ctrl_port *cport,
u8 port_num, u8 vport_num);
 void hfi_vnic_rem_vport(struct hfi_vnic_ada

[RFC v2 05/10] IB/hfi-vnic: VNIC statistics support

2016-12-15 Thread Vishwanathapura, Niranjana
HFI VNIC driver statistics support maintains various counters including
standard netdev counters and the Ethernet manager defined counters.
Add the Ethtool hook to read the counters.
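
The ethtool hook in this patch is table driven: each counter is described
by a name, a field size and an offset, and one loop walks the table. The
userspace sketch below shows that pattern with two made-up counters. The
toy_* names are hypothetical; the patch's own table is built from the
VNIC_STAT and VNIC_NETDEV_STAT macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_adapter {                 /* hypothetical counters */
	uint64_t tx_unicast;
	uint32_t rx_runt;
};

struct toy_stat {
	const char *name;
	size_t size;                 /* sizeof the field */
	size_t offset;               /* offsetof the field */
};

#define TOY_STAT(m) \
	{ #m, sizeof(((struct toy_adapter *)0)->m), offsetof(struct toy_adapter, m) }

static const struct toy_stat toy_stats[] = {
	TOY_STAT(tx_unicast),
	TOY_STAT(rx_runt),
};

#define TOY_STATS_LEN (sizeof(toy_stats) / sizeof(toy_stats[0]))

/* Copy every counter into out[] as a u64, widening 32-bit fields. */
static void toy_get_stats(const struct toy_adapter *a, uint64_t *out)
{
	size_t i;

	for (i = 0; i < TOY_STATS_LEN; i++) {
		const char *p = (const char *)a + toy_stats[i].offset;

		if (toy_stats[i].size == sizeof(uint64_t)) {
			uint64_t v;

			memcpy(&v, p, sizeof(v));
			out[i] = v;
		} else {
			uint32_t v;

			assert(toy_stats[i].size == sizeof(uint32_t));
			memcpy(&v, p, sizeof(v));
			out[i] = v;
		}
	}
}
```

Adding a counter then means adding one table row; the string list and the
value list stay in step because both are generated from the same table.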

Reviewed-by: Dennis Dalessandro 
Reviewed-by: Ira Weiny 
Signed-off-by: Niranjana Vishwanathapura 
---
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c  |  19 +-
 .../sw/intel/hfi_vnic/hfi_vnic_ethtool.c   | 131 +++
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h  |  84 +++
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_netdev.c | 260 -
 4 files changed, 486 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
index 093df67..3fdfb7b 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
@@ -209,8 +209,10 @@ int hfi_vnic_encap_skb(struct hfi_vnic_adapter *adapter, struct sk_buff *skb)
hdr->slid_high = info->vport.encap_slid >> 20;
 
dlid = hfi_vnic_get_dlid(adapter, skb, def_port);
-   if (unlikely(!dlid))
+   if (unlikely(!dlid)) {
+   adapter->q_err_cntrs[skb->queue_mapping].tx_dlid_zero++;
return -EFAULT;
+   }
 
hdr->dlid = dlid;
hdr->dlid_high = dlid >> 20;
@@ -233,6 +235,19 @@ int hfi_vnic_encap_skb(struct hfi_vnic_adapter *adapter, struct sk_buff *skb)
 /* hfi_vnic_decap_skb - strip OPA header from the skb (ethernet) packet */
 int hfi_vnic_decap_skb(struct hfi_vnic_rx_queue *rxq, struct sk_buff *skb)
 {
+	struct hfi_vnic_adapter *adapter = rxq->adapter;
+	int max_len = adapter->netdev->mtu + VLAN_ETH_HLEN;
+	int rc = -EFAULT;
+
 	skb_pull(skb, HFI_VNIC_HDR_LEN);
-	return 0;
+
+	/* Validate Packet length */
+	if (skb->len > max_len)
+		adapter->q_err_cntrs[rxq->idx].rx_oversize++;
+	else if (skb->len < ETH_ZLEN)
+		adapter->q_err_cntrs[rxq->idx].rx_runt++;
+	else
+		rc = 0;
+
+	return rc;
 }
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
index 0b4da5e..9289ab2 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
@@ -53,9 +53,140 @@
 
 #include "hfi_vnic_internal.h"
 
+enum {NETDEV_STATS, VNIC_STATS};
+
+struct vnic_stats {
+   char stat_string[ETH_GSTRING_LEN];
+   struct {
+   int type;
+   int sizeof_stat;
+   int stat_offset;
+   };
+};
+
+#define VNIC_STAT(m){ VNIC_STATS,   \
+ FIELD_SIZEOF(struct hfi_vnic_adapter, m), \
+ offsetof(struct hfi_vnic_adapter, m) }
+#define VNIC_NETDEV_STAT(m) { NETDEV_STATS, \
+ FIELD_SIZEOF(struct net_device, m),   \
+ offsetof(struct net_device, m) }
+
+static struct vnic_stats vnic_gstrings_stats[] = {
+   /* NETDEV stats */
+   {"rx_packets", VNIC_NETDEV_STAT(stats.rx_packets)},
+   {"tx_packets", VNIC_NETDEV_STAT(stats.tx_packets)},
+   {"rx_bytes", VNIC_NETDEV_STAT(stats.rx_bytes)},
+   {"tx_bytes", VNIC_NETDEV_STAT(stats.tx_bytes)},
+   {"rx_errors", VNIC_NETDEV_STAT(stats.rx_errors)},
+   {"tx_errors", VNIC_NETDEV_STAT(stats.tx_errors)},
+   {"rx_dropped", VNIC_NETDEV_STAT(stats.rx_dropped)},
+   {"tx_dropped", VNIC_NETDEV_STAT(stats.tx_dropped)},
+
+   {"rx_fifo_errors", VNIC_NETDEV_STAT(stats.rx_fifo_errors)},
+   {"rx_missed_errors", VNIC_NETDEV_STAT(stats.rx_missed_errors)},
+   {"tx_carrier_errors", VNIC_NETDEV_STAT(stats.tx_carrier_errors)},
+   {"tx_fifo_errors", VNIC_NETDEV_STAT(stats.tx_fifo_errors)},
+
+   /* SUMMARY counters */
+   {"tx_unicast", VNIC_STAT(sum_cntrs.tx_grp.unicast)},
+   {"tx_mcastbcast", VNIC_STAT(sum_cntrs.tx_grp.mcastbcast)},
+   {"tx_untagged", VNIC_STAT(sum_cntrs.tx_grp.untagged)},
+   {"tx_vlan", VNIC_STAT(sum_cntrs.tx_grp.vlan)},
+
+   {"tx_64_size", VNIC_STAT(sum_cntrs.tx_grp.xx_64_size)},
+   {"tx_65_127", VNIC_STAT(sum_cntrs.tx_grp.xx_65_127)},
+   {"tx_128_255", VNIC_STAT(sum_cntrs.tx_grp.xx_128_255)},
+   {"tx_256_511", VNIC_STAT(sum_cntrs.tx_grp.xx_256_511)},
+   {"tx_512_1023", VNIC_STAT(sum_cntrs.tx_grp.xx_512_1023)},
+   {"tx_1024_1518", VNIC_STAT(sum_cntrs.tx_grp.xx_1024_1518)},
+   {"tx_1519_max", VNIC_STAT(sum_cntrs.tx_grp.xx_1519_max)},
+
+   {"rx_unicast", VNIC_STAT(sum_cntrs.rx_grp.unicast)},
+   {"rx_mcastbcast", VNIC_STAT(sum_cntrs.rx_grp.mcastbcast)},
+   {"rx_untagged", VNIC_STAT(sum_cntrs.rx_grp.untagged)},
+   {"rx_vlan", VNIC_STAT(sum_cntrs.rx_grp.vlan)},
+
+   {"rx_64_size", VNIC_STAT(sum_cntrs.rx_grp.xx_64_size)},
+   

[RFC v2 09/10] IB/hfi1: Virtual Network Interface Controller (VNIC) support

2016-12-15 Thread Vishwanathapura, Niranjana
HFI1 HW specific support for VNIC functionality. Add support to add
and remove VNIC ports. Also implement the operations to allocate
resources and to transmit and receive Omni-Path encapsulated Ethernet
packets.

Dynamically allocate a set of contexts for VNIC when the first vnic
port is instantiated. Allocate VNIC contexts from the user contexts pool
and return them to the same pool when they are freed. Set aside
enough MSI-X interrupts for VNIC contexts and assign them when the
contexts are allocated. On the receive side, use an RSM rule to
spread TCP/UDP streams among VNIC contexts.
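
The allocation policy can be pictured with a small reference-counted
sketch: the first VNIC port takes contexts out of the shared pool and
later ports reuse them. Returning the contexts when the last port goes
away is an assumption of this sketch; the description above only says
they go back to the same pool when freed. The pool size, the number of
VNIC contexts and all toy_* names are likewise made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_POOL_CTXTS 8   /* size of the shared (user) context pool, assumed */
#define TOY_VNIC_CTXTS 2   /* contexts taken for VNIC use, assumed */

struct toy_dev {
	int free_ctxts;        /* contexts still available in the pool */
	int vnic_ports;        /* number of live VNIC ports */
};

/* First port takes the VNIC contexts from the pool; later ports share them. */
static bool toy_vnic_add_port(struct toy_dev *dd)
{
	if (dd->vnic_ports == 0) {
		if (dd->free_ctxts < TOY_VNIC_CTXTS)
			return false;
		dd->free_ctxts -= TOY_VNIC_CTXTS;
	}
	dd->vnic_ports++;
	return true;
}

/* Last port to go away returns the contexts to the pool. */
static void toy_vnic_rem_port(struct toy_dev *dd)
{
	assert(dd->vnic_ports > 0);
	if (--dd->vnic_ports == 0)
		dd->free_ctxts += TOY_VNIC_CTXTS;
}
```

With this scheme, systems that never create a VNIC port leave the whole
pool available to user contexts.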

Reviewed-by: Dennis Dalessandro 
Reviewed-by: Ira Weiny 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andrzej Kacprowski 
---
 drivers/infiniband/hw/hfi1/Makefile   |   2 +-
 drivers/infiniband/hw/hfi1/aspm.h |  13 +-
 drivers/infiniband/hw/hfi1/chip.c | 270 +++--
 drivers/infiniband/hw/hfi1/chip.h |   2 +
 drivers/infiniband/hw/hfi1/debugfs.c  |   6 +-
 drivers/infiniband/hw/hfi1/driver.c   |  74 +++-
 drivers/infiniband/hw/hfi1/file_ops.c |  25 +-
 drivers/infiniband/hw/hfi1/hfi.h  |  49 ++-
 drivers/infiniband/hw/hfi1/init.c |  37 +-
 drivers/infiniband/hw/hfi1/mad.c  |   8 +-
 drivers/infiniband/hw/hfi1/pio.c  |  17 +
 drivers/infiniband/hw/hfi1/pio.h  |   6 +
 drivers/infiniband/hw/hfi1/sysfs.c|   2 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.c |   6 +-
 drivers/infiniband/hw/hfi1/user_pages.c   |   3 +-
 drivers/infiniband/hw/hfi1/verbs.c|   7 +
 drivers/infiniband/hw/hfi1/vnic.h | 145 +++
 drivers/infiniband/hw/hfi1/vnic_main.c| 614 ++
 drivers/infiniband/hw/hfi1/vnic_sdma.c|  60 +++
 include/rdma/opa_port_info.h  |   2 +-
 20 files changed, 1252 insertions(+), 96 deletions(-)
 create mode 100644 drivers/infiniband/hw/hfi1/vnic.h
 create mode 100644 drivers/infiniband/hw/hfi1/vnic_main.c
 create mode 100644 drivers/infiniband/hw/hfi1/vnic_sdma.c

diff --git a/drivers/infiniband/hw/hfi1/Makefile b/drivers/infiniband/hw/hfi1/Makefile
index 0cf97a0..88085f6 100644
--- a/drivers/infiniband/hw/hfi1/Makefile
+++ b/drivers/infiniband/hw/hfi1/Makefile
@@ -12,7 +12,7 @@ hfi1-y := affinity.o chip.o device.o driver.o efivar.o \
init.o intr.o mad.o mmu_rb.o pcie.o pio.o pio_copy.o platform.o \
qp.o qsfp.o rc.o ruc.o sdma.o sysfs.o trace.o \
uc.o ud.o user_exp_rcv.o user_pages.o user_sdma.o verbs.o \
-   verbs_txreq.o
+   verbs_txreq.o vnic_main.o vnic_sdma.o
 hfi1-$(CONFIG_DEBUG_FS) += debugfs.o
 
 CFLAGS_trace.o = -I$(src)
diff --git a/drivers/infiniband/hw/hfi1/aspm.h b/drivers/infiniband/hw/hfi1/aspm.h
index 0d58fe3..3a01b69 100644
--- a/drivers/infiniband/hw/hfi1/aspm.h
+++ b/drivers/infiniband/hw/hfi1/aspm.h
@@ -229,14 +229,17 @@ static inline void aspm_ctx_timer_function(unsigned long data)
spin_unlock_irqrestore(&rcd->aspm_lock, flags);
 }
 
-/* Disable interrupt processing for verbs contexts when PSM contexts are open */
+/*
+ * Disable interrupt processing for verbs contexts when PSM or VNIC contexts
+ * are open.
+ */
 static inline void aspm_disable_all(struct hfi1_devdata *dd)
 {
struct hfi1_ctxtdata *rcd;
unsigned long flags;
unsigned i;
 
-   for (i = 0; i < dd->first_user_ctxt; i++) {
+   for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
rcd = dd->rcd[i];
del_timer_sync(&rcd->aspm_timer);
spin_lock_irqsave(&rcd->aspm_lock, flags);
@@ -260,7 +263,7 @@ static inline void aspm_enable_all(struct hfi1_devdata *dd)
if (aspm_mode != ASPM_MODE_DYNAMIC)
return;
 
-   for (i = 0; i < dd->first_user_ctxt; i++) {
+   for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
rcd = dd->rcd[i];
spin_lock_irqsave(&rcd->aspm_lock, flags);
rcd->aspm_intr_enable = true;
@@ -276,7 +279,7 @@ static inline void aspm_ctx_init(struct hfi1_ctxtdata *rcd)
(unsigned long)rcd);
rcd->aspm_intr_supported = rcd->dd->aspm_supported &&
aspm_mode == ASPM_MODE_DYNAMIC &&
-   rcd->ctxt < rcd->dd->first_user_ctxt;
+   rcd->ctxt < rcd->dd->first_dyn_alloc_ctxt;
 }
 
 static inline void aspm_init(struct hfi1_devdata *dd)
@@ -286,7 +289,7 @@ static inline void aspm_init(struct hfi1_devdata *dd)
spin_lock_init(&dd->aspm_lock);
dd->aspm_supported = aspm_hw_l1_supported(dd);
 
-   for (i = 0; i < dd->first_user_ctxt; i++)
+   for (i = 0; i < dd->first_dyn_alloc_ctxt; i++)
aspm_ctx_init(dd->rcd[i]);
 
/* Start with ASPM disabled */
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 9263984..472ce55 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -125,9 +125,16 @@ struct flag_table {
 #define DEFAULT_KRCVQS

[RFC v2 03/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) netdev

2016-12-15 Thread Vishwanathapura, Niranjana
HFI VNIC netdev function supports Ethernet functionality over Omni-Path
fabric by encapsulating Ethernet packets inside Omni-Path packet header.
It interfaces with the network stack to provide standard Ethernet network
interfaces. It invokes HFI device's VNIC callback functions for HW access.

Reviewed-by: Dennis Dalessandro 
Reviewed-by: Ira Weiny 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Sadanand Warrier 
Signed-off-by: Sudeep Dutt 
Signed-off-by: Tanya K Jajodia 
Signed-off-by: Andrzej Kacprowski 
---
 MAINTAINERS|   7 +
 drivers/infiniband/Kconfig |   1 +
 drivers/infiniband/sw/Makefile |   1 +
 drivers/infiniband/sw/intel/hfi_vnic/Kconfig   |   8 +
 drivers/infiniband/sw/intel/hfi_vnic/Makefile  |   6 +
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c  | 238 
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h  |  62 
 .../sw/intel/hfi_vnic/hfi_vnic_ethtool.c   |  65 
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h  | 220 +++
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_netdev.c | 409 +
 10 files changed, 1017 insertions(+)
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/Kconfig
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/Makefile
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_ethtool.c
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_internal.h
 create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_netdev.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 2c7a7b6..62db3ea 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5628,6 +5628,13 @@ F:   drivers/block/cciss*
 F: include/linux/cciss_ioctl.h
 F: include/uapi/linux/cciss_ioctl.h
 
+HFI-VNIC DRIVER
+M: Dennis Dalessandro 
+M: Niranjana Vishwanathapura 
+L: linux-r...@vger.kernel.org
+S: Supported
+F: drivers/infiniband/sw/intel/hfi_vnic
+
 HFI1 DRIVER
 M: Mike Marciniszyn 
 M: Dennis Dalessandro 
diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 6709173..900daf3 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -85,6 +85,7 @@ source "drivers/infiniband/ulp/srpt/Kconfig"
 source "drivers/infiniband/ulp/iser/Kconfig"
 source "drivers/infiniband/ulp/isert/Kconfig"
 
+source "drivers/infiniband/sw/intel/hfi_vnic/Kconfig"
 source "drivers/infiniband/sw/rdmavt/Kconfig"
 source "drivers/infiniband/sw/rxe/Kconfig"
 
diff --git a/drivers/infiniband/sw/Makefile b/drivers/infiniband/sw/Makefile
index 8b095b2..2792559 100644
--- a/drivers/infiniband/sw/Makefile
+++ b/drivers/infiniband/sw/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_INFINIBAND_RDMAVT)+= rdmavt/
 obj-$(CONFIG_RDMA_RXE) += rxe/
+obj-$(CONFIG_HFI_VNIC) += intel/hfi_vnic/
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/Kconfig b/drivers/infiniband/sw/intel/hfi_vnic/Kconfig
new file mode 100644
index 000..84d13e7
--- /dev/null
+++ b/drivers/infiniband/sw/intel/hfi_vnic/Kconfig
@@ -0,0 +1,8 @@
+config HFI_VNIC
+   tristate "Intel HFI VNIC support"
+   depends on X86_64 && INFINIBAND
+   ---help---
+   This is HFI Virtual Network Interface Controller (VNIC) driver
+   for Ethernet over HFI feature. It implements the HW independent
+   VNIC functionality. It interfaces with Linux stack for data path
+   and IB MAD for the control path.
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/Makefile b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
new file mode 100644
index 000..8e3dca7
--- /dev/null
+++ b/drivers/infiniband/sw/intel/hfi_vnic/Makefile
@@ -0,0 +1,6 @@
+# Makefile - Intel HFI Virtual Network Controller driver
+# Copyright(c) 2016, Intel Corporation.
+#
+obj-$(CONFIG_HFI_VNIC) += hfi_vnic.o
+
+hfi_vnic-y := hfi_vnic_netdev.o hfi_vnic_encap.o hfi_vnic_ethtool.o
diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
new file mode 100644
index 000..093df67
--- /dev/null
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.c
@@ -0,0 +1,238 @@
+/*
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License fo

[RFC v2 10/10] IB/hfi1: VNIC SDMA support

2016-12-15 Thread Vishwanathapura, Niranjana
HFI1 VNIC SDMA support enables transmission of VNIC packets over SDMA.
Map VNIC queues to SDMA engines and support halting and wakeup of the
VNIC queues.

Reviewed-by: Dennis Dalessandro 
Reviewed-by: Ira Weiny 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/infiniband/hw/hfi1/hfi.h   |   1 +
 drivers/infiniband/hw/hfi1/vnic.h  |  30 +++-
 drivers/infiniband/hw/hfi1/vnic_main.c |  21 ++-
 drivers/infiniband/hw/hfi1/vnic_sdma.c | 260 +
 4 files changed, 309 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 78d1726..8d5949f 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -855,6 +855,7 @@ struct hfi1_asic_data {
 /* Virtual NIC information */
 struct hfi1_vnic_data {
struct hfi1_ctxtdata *ctxt[HFI1_NUM_VNIC_CTXT];
+   struct kmem_cache *txreq_cache;
u8 num_vports;
struct idr vesw_idr;
u8 rmt_start;
diff --git a/drivers/infiniband/hw/hfi1/vnic.h b/drivers/infiniband/hw/hfi1/vnic.h
index 047845e..2d4eb8f 100644
--- a/drivers/infiniband/hw/hfi1/vnic.h
+++ b/drivers/infiniband/hw/hfi1/vnic.h
@@ -49,6 +49,7 @@
 
 #include 
 #include "hfi.h"
+#include "sdma.h"
 
 #define HFI1_VNIC_ICRC_LEN   4
 #define HFI1_VNIC_TAIL_LEN   1
@@ -90,6 +91,26 @@
 #define HFI1_VNIC_SC_SHIFT  4
 
 /**
+ * struct hfi1_vnic_sdma - VNIC per Tx ring SDMA information
+ * @dd - device data pointer
+ * @sde - sdma engine
+ * @vinfo - vnic info pointer
+ * @wait - iowait structure
+ * @stx - sdma tx request
+ * @state - vnic Tx ring SDMA state
+ * @q_idx - vnic Tx queue index
+ */
+struct hfi1_vnic_sdma {
+   struct hfi1_devdata *dd;
+   struct sdma_engine  *sde;
+   struct hfi1_vnic_vport_info *vinfo;
+   struct iowait wait;
+   struct sdma_txreq stx;
+   unsigned int state;
+   u8 q_idx;
+};
+
+/**
  * struct hfi1_vnic_notifier - VNIC notifer structure
  * @cb - vnic callback function
  */
@@ -104,6 +125,7 @@ struct hfi1_vnic_notifier {
  * @event_flags: event notification flags
  * @vport: vnic port pointer
  * @skbq: Array of queues for received socket buffers
+ * @sdma: VNIC SDMA structure per TXQ
  */
 struct hfi1_vnic_vport_info {
struct hfi1_devdata *dd;
@@ -112,7 +134,8 @@ struct hfi1_vnic_vport_info {
DECLARE_BITMAP(event_flags, HFI_VNIC_NUM_EVTS);
struct hfi_vnic_port *vport;
 
-   struct sk_buff_head skbq[HFI1_NUM_VNIC_CTXT];
+   struct sk_buff_headskbq[HFI1_NUM_VNIC_CTXT];
+   struct hfi1_vnic_sdma  sdma[HFI1_VNIC_MAX_TXQ];
 };
 
 static inline struct hfi1_devdata *vnic_dev2dd(struct hfi_vnic_port *vport)
@@ -131,8 +154,13 @@ static inline void hfi1_vnic_update_pad(unsigned char *pad, u8 plen)
 /* vnic hfi1 internal functions */
 void hfi1_vnic_setup(struct hfi1_devdata *dd);
 void hfi1_vnic_cleanup(struct hfi1_devdata *dd);
+int hfi1_vnic_txreq_init(struct hfi1_devdata *dd);
+void hfi1_vnic_txreq_deinit(struct hfi1_devdata *dd);
 
 void hfi1_vnic_bypass_rcv(struct hfi1_packet *packet);
+void hfi1_vnic_sdma_init(struct hfi1_vnic_vport_info *vinfo);
+bool hfi1_vnic_sdma_write_avail(struct hfi1_vnic_vport_info *vinfo,
+   u8 q_idx);
 
 /* vnic port operations */
 struct hfi_vnic_port *hfi1_vnic_add_vport(struct ib_device *device,
diff --git a/drivers/infiniband/hw/hfi1/vnic_main.c b/drivers/infiniband/hw/hfi1/vnic_main.c
index 1e237f3..19843a4 100644
--- a/drivers/infiniband/hw/hfi1/vnic_main.c
+++ b/drivers/infiniband/hw/hfi1/vnic_main.c
@@ -289,15 +289,21 @@ static int hfi1_vnic_put_skb(struct hfi_vnic_port *vport,
 
 static u8 hfi1_vnic_select_queue(struct hfi_vnic_port *vport, u8 vl, u8 entropy)
 {
-   return 0;
+   struct hfi1_vnic_vport_info *vinfo = vport->hfi_priv;
+   struct sdma_engine *sde;
+
+   sde = sdma_select_engine_vl(vinfo->dd, entropy, vl);
+   return sde->this_idx;
 }
 
 static bool hfi1_vnic_get_write_avail(struct hfi_vnic_port *vport, u8 q_idx)
 {
+   struct hfi1_vnic_vport_info *vinfo = vport->hfi_priv;
+
if (q_idx >= vport->hfi_info.num_tx_q)
return false;
 
-   return true;
+   return hfi1_vnic_sdma_write_avail(vinfo, q_idx);
 }
 
 void hfi1_vnic_bypass_rcv(struct hfi1_packet *packet)
@@ -499,6 +505,12 @@ static int hfi1_vnic_init(struct hfi_vnic_port *vport)
int i, rc = 0;
 
mutex_lock(&hfi1_mutex);
+   if (!dd->vnic.num_vports) {
+   rc = hfi1_vnic_txreq_init(dd);
+   if (rc)
+   goto txreq_fail;
+   }
+
for (i = dd->vnic.num_ctxt; i < vport->hfi_info.num_rx_q; i++) {
rc = hfi1_vnic_allot_ctxt(dd, &dd->vnic.ctxt[i]);
if (rc)
@@ -526,7 +538,11 @@ static int hfi1_vnic_init(struct hfi_vnic_port *vport)
 
dd->vnic.num_vports++;
vinfo->vport = vport;
+   hfi1_vnic_sdma_init(vinfo);
 alloc_fail:
+   if (!dd->vnic.num_vports)
+   

[RFC v2 01/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) documentation

2016-12-15 Thread Vishwanathapura, Niranjana
Add HFI VNIC design document explaining the VNIC architecture and the
driver design.

Reviewed-by: Dennis Dalessandro 
Reviewed-by: Ira Weiny 
Signed-off-by: Niranjana Vishwanathapura 
---
 Documentation/infiniband/hfi_vnic.txt | 95 +++
 1 file changed, 95 insertions(+)
 create mode 100644 Documentation/infiniband/hfi_vnic.txt

diff --git a/Documentation/infiniband/hfi_vnic.txt b/Documentation/infiniband/hfi_vnic.txt
new file mode 100644
index 000..1f39d8b
--- /dev/null
+++ b/Documentation/infiniband/hfi_vnic.txt
@@ -0,0 +1,95 @@
+Intel Omni-Path Host Fabric Interface (HFI) Virtual Network Interface
+Controller (VNIC) feature supports Ethernet functionality over Omni-Path
+fabric by encapsulating the Ethernet packets between HFI nodes.
+
+The patterns of exchanges of Omni-Path encapsulated Ethernet packets
+involve one or more virtual Ethernet switches overlaid on the Omni-Path
+fabric topology. A subset of HFI nodes on the Omni-Path fabric are
+permitted to exchange encapsulated Ethernet packets across a particular
+virtual Ethernet switch. The virtual Ethernet switches are logical
+abstractions achieved by configuring the HFI nodes on the fabric for
+header generation and processing. In the simplest configuration all HFI
+nodes across the fabric exchange encapsulated Ethernet packets over a
+single virtual Ethernet switch. A virtual Ethernet switch is effectively
+an independent Ethernet network. The configuration is performed by an
+Ethernet Manager (EM) which is part of the trusted Fabric Manager (FM)
+application. HFI nodes can have multiple VNICs each connected to a
+different virtual Ethernet switch. The below diagram presents a case
+of two virtual Ethernet switches with two HFI nodes.
+
+ +---+
+ |  Subnet/  |
+ | Ethernet  |
+ |  Manager  |
+ +---+
+/  /
+  /   /
+//
+  / /
++-+  +--+
+|  Virtual Ethernet Switch|  |  Virtual Ethernet Switch |
+|  +-++-+ |  | +-++-+   |
+|  | VPORT   ||  VPORT  | |  | |  VPORT  ||  VPORT  |   |
++--+-++-+-+  +-+-++-+---+
+ | \/ |
+ |   \/   |
+ | \/ |
+ |/  \|
+ |  /  \  |
+ +---++  +---++
+ |   VNIC|VNIC|  |VNIC   |VNIC|
+ +---++  +---++
+ |  HFI   |  |  HFI   |
+ ++  ++
+
+Intel HFI VNIC software design is presented in the below diagram.
+HFI VNIC functionality has a HW dependent component and a HW
+independent component.
+
+The HW dependent VNIC functionality is part of the HFI1 driver. It
+implements the callback functions to do various tasks which include
+adding and removing of VNIC ports, HW resource allocation for VNIC
+functionality and actual transmission and reception of encapsulated
+Ethernet packets over the fabric. Each VNIC port is addressed by the
+HFI port number, and the VNIC port number on that HFI port.
+
+The HFI VNIC module implements the HW independent VNIC functionality.
+It consists of two parts. The VNIC Ethernet Management Agent (VEMA)
+registers itself with IB core as an IB client and interfaces with the
+IB MAD stack. It exchanges the management information with the Ethernet
+Manager (EM) and the VNIC netdev. The VNIC netdev part interfaces with
+the Linux network stack, thus providing standard Ethernet network
+interfaces. It invokes HFI device's VNIC callback functions for HW access.
+The VNIC netdev encapsulates the Ethernet packets with an Omni-Path
+header before passing them to the HFI1 driver for transmission.
+Similarly, it de-encapsulates the received Omni-Path packets before
+passing them to the network stack. For each VNIC interface, the
+information required for encapsulation is configured by EM via VEMA MAD
+interface.
+
+
++---+ +--+
+|   | |   Linux  |
+| IB MAD| |  Network |
+|   | |   Stack  |
++---+ +--+
+ |   |
+ |   |
+++
+| 

[RFC v2 04/10] IB/hfi-vnic: VNIC Ethernet Management (EM) structure definitions

2016-12-15 Thread Vishwanathapura, Niranjana
Define VNIC EM MAD structures and the associated macros. These structures
are used for information exchange between VNIC EM agent (EMA) on the HFI
host and the Ethernet manager. These include the virtual ethernet switch
(vesw) port information, vesw port mac table, summary and error counters,
vesw port interface mac lists and the EMA trap.

Reviewed-by: Dennis Dalessandro 
Reviewed-by: Ira Weiny 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Sadanand Warrier 
Signed-off-by: Tanya K Jajodia 
---
 .../infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h  | 444 +
 .../sw/intel/hfi_vnic/hfi_vnic_internal.h  |  33 ++
 2 files changed, 477 insertions(+)

diff --git a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
index 6786cce..a6770ef 100644
--- a/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
+++ b/drivers/infiniband/sw/intel/hfi_vnic/hfi_vnic_encap.h
@@ -52,11 +52,455 @@
  * and decapsulation of Ethernet packets
  */
 
+#include 
+#include 
+
+/* Maximum number of vnics supported */
+#define HFI_MAX_VPORTS_SUPPORTED 256
+
+/* EMA class version */
+#define HFI_EMA_CLASS_VERSION   0x80
+
+/*
+ * Define the Intel vendor management class for HFI
+ * ETHERNET MANAGEMENT
+ */
+#define HFI_MGMT_CLASS_INTEL_EMA0x34
+
+/* EM attribute IDs */
+#define HFI_EM_ATTR_CLASS_PORT_INFO 0x0001
+#define HFI_EM_ATTR_VESWPORT_INFO   0x0011
+#define HFI_EM_ATTR_VESWPORT_MAC_ENTRIES0x0012
+#define HFI_EM_ATTR_IFACE_UCAST_MACS0x0013
+#define HFI_EM_ATTR_IFACE_MCAST_MACS0x0014
+#define HFI_EM_ATTR_DELETE_VESW 0x0015
+#define HFI_EM_ATTR_VESWPORT_SUMMARY_COUNTERS   0x0020
+#define HFI_EM_ATTR_VESWPORT_ERROR_COUNTERS 0x0022
+
 #define HFI_VESW_MAX_NUM_DEF_PORT   16
 #define HFI_VNIC_MAX_NUM_PCP8
 
+#define HFI_VNIC_EMA_DATA(OPA_MGMT_MAD_SIZE - IB_MGMT_VENDOR_HDR)
+
+/* Defines for vendor specific notice(trap) attributes */
+#define HFI_INTEL_EMA_NOTICE_TYPE_INFO 0x04
+
+/* INTEL OUI */
+#define INTEL_OUI_1 0x00
+#define INTEL_OUI_2 0x06
+#define INTEL_OUI_3 0x6a
+
+/* Trap opcodes sent from VNIC */
+#define HFI_VESWPORT_TRAP_IFACE_UCAST_MAC_CHANGE 0x1
+#define HFI_VESWPORT_TRAP_IFACE_MCAST_MAC_CHANGE 0x2
+#define HFI_VESWPORT_TRAP_ETH_LINK_STATUS_CHANGE 0x3
+
 /* VNIC configured and operational state values */
 #define HFI_VNIC_STATE_DROP_ALL0x1
 #define HFI_VNIC_STATE_FORWARDING  0x3
 
+/**
+ * struct hfi_vesw_info - HFI vnic switch information
+ * @fabric_id: 10-bit fabric id
+ * @vesw_id: 12-bit virtual ethernet switch id
+ * @def_port_mask: bitmask of default ports
+ * @pkey: partition key
+ * @u_mcast_dlid: unknown multicast dlid
+ * @u_ucast_dlid: array of unknown unicast dlids
+ * @eth_mtu: MTUs for each vlan PCP
+ * @eth_mtu_non_vlan: MTU for non vlan packets
+ */
+struct hfi_vesw_info {
+   __be16  fabric_id;
+   __be16  vesw_id;
+
+   u8  rsvd0[6];
+   __be16  def_port_mask;
+
+   u8  rsvd1[2];
+   __be16  pkey;
+
+   u8  rsvd2[4];
+   __be32  u_mcast_dlid;
+   __be32  u_ucast_dlid[HFI_VESW_MAX_NUM_DEF_PORT];
+
+   u8  rsvd3[44];
+   __be16  eth_mtu[HFI_VNIC_MAX_NUM_PCP];
+   __be16  eth_mtu_non_vlan;
+   u8  rsvd4[2];
+} __packed;
+
+/**
+ * struct hfi_per_veswport_info - HFI vnic per port information
+ * @port_num: port number
+ * @eth_link_status: current ethernet link state
+ * @base_mac_addr: base mac address
+ * @config_state: configured port state
+ * @oper_state: operational port state
+ * @max_mac_tbl_ent: max number of mac table entries
+ * @max_smac_ent: max smac entries in mac table
+ * @mac_tbl_digest: mac table digest
+ * @encap_slid: base slid for the port
+ * @pcp_to_sc_uc: sc by pcp index for unicast ethernet packets
+ * @pcp_to_vl_uc: vl by pcp index for unicast ethernet packets
+ * @pcp_to_sc_mc: sc by pcp index for multicast ethernet packets
+ * @pcp_to_vl_mc: vl by pcp index for multicast ethernet packets
+ * @non_vlan_sc_uc: sc for non-vlan unicast ethernet packets
+ * @non_vlan_vl_uc: vl for non-vlan unicast ethernet packets
+ * @non_vlan_sc_mc: sc for non-vlan multicast ethernet packets
+ * @non_vlan_vl_mc: vl for non-vlan multicast ethernet packets
+ * @uc_macs_gen_count: generation count for unicast macs list
+ * @mc_macs_gen_count: generation count for multicast macs list
+ */
+struct hfi_per_veswport_info {
+   __be32  port_num;
+
+   u8  eth_link_status;
+   u8  rsvd0[3];
+
+   u8  base_mac_addr[ETH_ALEN];
+   u8  config_state;
+   u8  oper_state;
+
+   __be16  max_mac_tbl_ent;
+   __be16  max_smac_ent;
+   __be32  mac_tbl_digest;
+   u8  rsvd1[4];
+
+   __be32  encap_slid;
+
+   u8  pcp_to_sc_uc[HFI_VNIC_MAX_NUM_PCP];
+   u8  pcp_to_vl_uc[HFI_VNIC_MAX_NUM_PCP];
+   u8  pcp_to_sc_mc[

Re: [kernel-hardening] Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread Daniel Micay
On Thu, 2016-12-15 at 15:57 +0800, Herbert Xu wrote:
> Jason A. Donenfeld  wrote:
> > 
> > Siphash needs a random secret key, yes. The point is that the hash
> > function remains secure so long as the secret key is kept secret.
> > Other functions can't make the same guarantee, and so nervous
> > periodic
> > key rotation is necessary, but in most cases nothing is done, and so
> > things just leak over time.
> 
> Actually those users that use rhashtable now have a much more
> sophisticated defence against these attacks, dynamic rehashing
> when bucket length exceeds a preset limit.
> 
> Cheers,

Key independent collisions won't be mitigated by picking a new secret.

A simple solution with clear security properties is ideal.
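
To make the point about key-independent collisions concrete, here is a
minimal userspace sketch. The function weak_hash() is a hypothetical toy
and is not any hash used by the kernel. It only shows that when the set
of colliding inputs does not depend on the key, picking a new secret
(or rehashing with one) leaves those inputs in the same bucket.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy keyed hash, for illustration only (hypothetical, not a kernel hash).
 * The key is XORed into the input, so two inputs whose low bits match
 * land in the same bucket no matter which key is chosen.
 * nbuckets must be a nonzero power of two.
 */
static uint32_t weak_hash(uint32_t key, uint32_t x, uint32_t nbuckets)
{
	assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
	return (x ^ key) & (nbuckets - 1);
}

/* Return 1 if a and b share a bucket under the given key, else 0. */
static int collides(uint32_t key, uint32_t a, uint32_t b, uint32_t nbuckets)
{
	return weak_hash(key, a, nbuckets) == weak_hash(key, b, nbuckets);
}
```

With 16 buckets, the inputs 0 and 16 collide under every key tried, so
rotating the key does nothing against an attacker who sends them.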



Re: wl1251 & mac address & calibration data

2016-12-15 Thread Kalle Valo
(Adding Luis because he has been working on request_firmware() lately)

Pali Rohár  writes:

>> > So no, there is no argument against... request_firmware() in
>> > fallback mode with userspace helper is by design blocking and
>> > waiting for userspace. But waiting for some change in DTS in
>> > kernel is just nonsense.
>> 
>> I would just mark the wlan device with status = "disabled" and
>> enable it in the overlay together with adding the NVS & MAC info.
>
> So if you think that this solution make sense, we can wait what net 
> wireless maintainers say about it...
>
> For me it looks like that solution can be:
>
> extending request_firmware() to use only userspace helper

I haven't followed the discussion very closely but this is my preference
what drivers should do:

1) First the driver should try to get the calibration data and mac
   address from the device tree.

2) If they are not in DT the driver should retrieve the calibration data
   with request_firmware(). BUT with an option for user space to
   implement that with a helper script so that the data can be created
   dynamically, which I believe openwrt does with ath10k calibration
   data right now.
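
The lookup order in the two steps above can be sketched in plain
userspace C. This is only an illustration of the fallback logic, not
driver code: in a real driver the two loaders would be device tree
property reads and a request_firmware() call, while here
load_calibration(), cal_loader_fn and the two stand-in loaders are all
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum cal_source { CAL_NONE, CAL_FROM_DT, CAL_FROM_FIRMWARE };

/* A loader fills buf and returns 0 on success, nonzero if it has no data. */
typedef int (*cal_loader_fn)(unsigned char *buf, size_t len);

/* Try the device tree first, then fall back to the firmware-file path. */
static enum cal_source load_calibration(cal_loader_fn from_dt,
					cal_loader_fn from_fw,
					unsigned char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	if (from_dt && from_dt(buf, len) == 0)
		return CAL_FROM_DT;
	if (from_fw && from_fw(buf, len) == 0)
		return CAL_FROM_FIRMWARE;
	return CAL_NONE;
}

/* Stand-in loader that never has data (hypothetical). */
static int loader_absent(unsigned char *buf, size_t len)
{
	(void)buf;
	(void)len;
	return -1;
}

/* Stand-in loader that always succeeds with a fixed pattern (hypothetical). */
static int loader_present(unsigned char *buf, size_t len)
{
	memset(buf, 0xA5, len);
	return 0;
}
```

Passing loader_absent as the first loader and loader_present as the
second models a board without the data in DT, where the helper-script
path supplies it instead.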

> and load mac address also via request_firmware() either by appending it 
> into NVS data or via separate call

I'm not really a fan of the idea of providing a permanent mac address through
request_firmware(). For example, how to handle multiple devices on the
same host, would there be a need for some kind of bus ids encoded to the
filename? And what about devices with multiple mac addresses?

I wish there would be a better way than request_firmware() to provide
the permanent mac addresses from user space (if device tree is not
available), I just don't know what that could be :) But if we would
start to use request_firmware() for this at least there should be a
wider consensus about that and it should be properly documented, just
like the device tree bindings.

-- 
Kalle Valo


Re: [PATCH 3/3] Bluetooth: btusb: Configure Marvel to use one of the pins for oob wakeup

2016-12-15 Thread Gregory CLEMENT
Hi Rajat,
 
 On Wed, Dec 14 2016, Rajat Jain  wrote:

In your title unless you speak about the comic books you should do a
s/Marvel/Marvell/ :)

Gregory

> The Marvell devices may have many gpio pins, and hence for wakeup
> on these out-of-band pins, the chip needs to be told which pin is
> to be used for wakeup, using an hci command.
>
> Thus, we read the pin number etc from the device tree node and send
> a command to the chip.
>
> Signed-off-by: Rajat Jain 
> ---
> Note that while I would have liked to name the compatible string as more
> like "marvell, usb8997-bt", the devicetrees/bindings/usb/usb-device.txt
> requires the compatible property to be of the form "usbVID,PID".
>
>  .../{marvell-bt-sd8xxx.txt => marvell-bt-8xxx.txt} | 25 -
>  drivers/bluetooth/btusb.c  | 59 ++
>  2 files changed, 82 insertions(+), 2 deletions(-)
>  rename Documentation/devicetree/bindings/net/{marvell-bt-sd8xxx.txt => marvell-bt-8xxx.txt} (76%)
>
> diff --git a/Documentation/devicetree/bindings/net/marvell-bt-sd8xxx.txt b/Documentation/devicetree/bindings/net/marvell-bt-8xxx.txt
> similarity index 76%
> rename from Documentation/devicetree/bindings/net/marvell-bt-sd8xxx.txt
> rename to Documentation/devicetree/bindings/net/marvell-bt-8xxx.txt
> index 6a9a63c..471bef8 100644
> --- a/Documentation/devicetree/bindings/net/marvell-bt-sd8xxx.txt
> +++ b/Documentation/devicetree/bindings/net/marvell-bt-8xxx.txt
> @@ -1,4 +1,4 @@
> -Marvell 8897/8997 (sd8897/sd8997) bluetooth SDIO devices
> +Marvell 8897/8997 (sd8897/sd8997) bluetooth devices (SDIO or USB based)
>  --
>  
>  Required properties:
> @@ -6,11 +6,13 @@ Required properties:
>- compatible : should be one of the following:
>   * "marvell,sd8897-bt"
>   * "marvell,sd8997-bt"
> + * "usb1286,204e"
>  
>  Optional properties:
>  
>- marvell,cal-data: Calibration data downloaded to the device during
> initialization. This is an array of 28 values(u8).
> +   This is only applicable to SDIO devices.
>  
>- marvell,wakeup-pin: It represents wakeup pin number of the bluetooth chip.
>   firmware will use the pin to wakeup host system (u16).
> @@ -29,7 +31,9 @@ Example:
>  IRQ pin 119 is used as system wakeup source interrupt.
>  wakeup pin 13 and gap 100ms are configured so that firmware can wakeup host
>  using this device side pin and wakeup latency.
> -calibration data is also available in below example.
> +
> +Example for SDIO device follows (calibration data is also available in
> +below example).
>  
>  &mmc3 {
>   status = "okay";
> @@ -54,3 +58,20 @@ calibration data is also available in below example.
>   marvell,wakeup-gap-ms = /bits/ 16 <0x64>;
>   };
>  };
> +
> +Example for USB device:
> +
> +&usb_host1_ohci {
> +status = "okay";
> +#address-cells = <1>;
> +#size-cells = <0>;
> +
> +mvl_bt1: bt@1 {
> + compatible = "usb1286,204e";
> + reg = <1>;
> + interrupt-parent = <&gpio0>;
> + interrupts = <119 IRQ_TYPE_LEVEL_LOW>;
> + marvell,wakeup-pin = /bits/ 16 <0x0d>;
> + marvell,wakeup-gap-ms = /bits/ 16 <0x64>;
> +};
> +};
> diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
> index 32a6f22..99d7f6d 100644
> --- a/drivers/bluetooth/btusb.c
> +++ b/drivers/bluetooth/btusb.c
> @@ -2343,6 +2343,58 @@ static int btusb_shutdown_intel(struct hci_dev *hdev)
>   return 0;
>  }
>  
> +#ifdef CONFIG_PM
> +static const struct of_device_id mvl_oob_wake_match_table[] = {
> + { .compatible = "usb1286,204e" },
> + { }
> +};
> +MODULE_DEVICE_TABLE(of, mvl_oob_wake_match_table);
> +
> +/* Configure an out-of-band gpio as wake-up pin, if specified in device tree */
> +static int marvell_config_oob_wake(struct hci_dev *hdev)
> +{
> + struct sk_buff *skb;
> + struct btusb_data *data = hci_get_drvdata(hdev);
> + struct device *dev = &data->udev->dev;
> + u16 pin, gap, opcode;
> + int ret;
> + u8 cmd[5];
> +
> + if (!of_match_device(mvl_oob_wake_match_table, dev))
> + return 0;
> +
> + if (of_property_read_u16(dev->of_node, "marvell,wakeup-pin", &pin) ||
> + of_property_read_u16(dev->of_node, "marvell,wakeup-gap-ms", &gap))
> + return -EINVAL;
> +
> + /* Vendor specific command to configure a GPIO as wake-up pin */
> + opcode = hci_opcode_pack(0x3F, 0x59);
> + cmd[0] = opcode & 0xFF;
> + cmd[1] = opcode >> 8;
> + cmd[2] = 2; /* length of parameters that follow */
> + cmd[3] = pin;
> + cmd[4] = gap; /* time in ms, for which wakeup pin should be asserted */
> +
> + skb = bt_skb_alloc(sizeof(cmd), GFP_KERNEL);
> + if (!skb) {
> + bt_dev_err(hdev, "%s: No memory\n", __func__);
> + return -ENOMEM;
> + }
> +
> + memcpy(skb_put(skb, sizeof(cmd)), cmd, sizeof(cmd));
> + hci_skb_pkt_type(skb) = HCI_COMMAND_PKT;
> +
> + ret 

Re: Designing a safe RX-zero-copy Memory Model for Networking

2016-12-15 Thread Jesper Dangaard Brouer
On Wed, 14 Dec 2016 14:45:00 -0800
Alexander Duyck  wrote:

> On Wed, Dec 14, 2016 at 1:29 PM, Jesper Dangaard Brouer
>  wrote:
> > On Wed, 14 Dec 2016 08:45:08 -0800
> > Alexander Duyck  wrote:
> >  
> >> I agree.  This is a no-go from the performance perspective as well.
> >> At a minimum you would have to be zeroing out the page between uses to
> >> avoid leaking data, and that assumes that the program we are sending
> >> the pages to is slightly well behaved.  If we think zeroing out an
> >> sk_buff is expensive wait until we are trying to do an entire 4K page.  
> >
> > Again, yes the page will be zero'ed out, but only when entering the
> > page_pool. Because they are recycled they are not cleared on every use.
> > Thus, performance does not suffer.  
> 
> So you are talking about recycling, but not clearing the page when it
> is recycled.  That right there is my problem with this.  It is fine if
> you assume the pages are used by the application only, but you are
> talking about using them for both the application and for the regular
> network path.  You can't do that.  If you are recycling you will have
> to clear the page every time you put it back onto the Rx ring,
> otherwise you can leak the recycled memory into user space and end up
> with a user space program being able to snoop data out of the skb.
> 
> > Besides clearing large mem area is not as bad as clearing small.
> > Clearing an entire page does cost something, as mentioned before 143
> > cycles, which is 28 bytes-per-cycle (4096/143).  And clearing 256 bytes
> > cost 36 cycles which is only 7 bytes-per-cycle (256/36).  
> 
> What I am saying is that you are going to be clearing the 4K blocks
> each time they are recycled.  You can't have the pages shared between
> user-space and the network stack unless you have true isolation.  If
> you are allowing network stack pages to be recycled back into the
> user-space application you open up all sorts of leaks where the
> application can snoop into data it shouldn't have access to.

See later, the "Read-only packet page" mode should provide a mode where
the netstack doesn't write into the page, and thus cannot leak kernel
data. (CAP_NET_ADMIN already gives it access to other applications' data.)
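
For reference, the bytes-per-cycle figures quoted above can be checked
with a few lines of C. The cycle counts (143 for a 4096-byte page, 36
for 256 bytes) are the ones quoted in this thread, not new
measurements, and bytes_per_cycle() is a hypothetical helper.

```c
#include <assert.h>

/* Bytes cleared per CPU cycle, truncated to a whole number. */
static unsigned int bytes_per_cycle(unsigned int bytes, unsigned int cycles)
{
	assert(cycles > 0);
	return bytes / cycles;
}
```

Calling it with (4096, 143) and (256, 36) reproduces the quoted 28 and
7 bytes per cycle, i.e. roughly a 4x better rate for the full page.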


> >> I think we are stuck with having to use a HW filter to split off
> >> application traffic to a specific ring, and then having to share the
> >> memory between the application and the kernel on that ring only.  Any
> >> other approach just opens us up to all sorts of security concerns
> >> since it would be possible for the application to try to read and
> >> possibly write any data it wants into the buffers.  
> >
> > This is why I wrote a document[1], trying to outline how this is possible,
> > going through all the combinations, and asking the community to find
> > faults in my idea.  Inlining it again, as nobody really replied on the
> > content of the doc.
> >
> > -
> > Best regards,
> >   Jesper Dangaard Brouer
> >   MSc.CS, Principal Kernel Engineer at Red Hat
> >   LinkedIn: http://www.linkedin.com/in/brouer
> >
> > [1] 
> > https://prototype-kernel.readthedocs.io/en/latest/vm/page_pool/design/memory_model_nic.html
> >
> > ===
> > Memory Model for Networking
> > ===
> >
> > This design describes how the page_pool changes the memory model for
> > networking in the NIC (Network Interface Card) drivers.
> >
> > .. Note:: The catch for driver developers is that, once an application
> >   requests zero-copy RX, then the driver must use a specific
> >   SKB allocation mode and might have to reconfigure the
> >   RX-ring.
> >
> >
> > Design target
> > =
> >
> > Allow the NIC to function as a normal Linux NIC and be shared in a
> > safe manner, between the kernel network stack and an accelerated
> > userspace application using RX zero-copy delivery.
> >
> > Target is to provide the basis for building RX zero-copy solutions in
> > a memory safe manner.  An efficient communication channel for userspace
> > delivery is out of scope for this document, but OOM considerations are
> > discussed below (`Userspace delivery and OOM`_).
> >
> > Background
> > ==
> >
> > The SKB or ``struct sk_buff`` is the fundamental meta-data structure
> > for network packets in the Linux Kernel network stack.  It is a fairly
> > complex object and can be constructed in several ways.
> >
> > From a memory perspective there are two ways depending on
> > RX-buffer/page state:
> >
> > 1) Writable packet page
> > 2) Read-only packet page
> >
> > To take full potential of the page_pool, the drivers must actually
> > support handling both options depending on the configuration state of
> > the page_pool.
> >
> > Writable packet page
> > 
> >
> > When the RX packet page is writable, the SKB setup is fairly straight
> > forward.  The SKB->data (and skb->head) can point directly to the page
> > data, adjusting the offset acco

Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread Hannes Frederic Sowa
On 15.12.2016 00:29, Jason A. Donenfeld wrote:
> Hi Hannes,
> 
> On Wed, Dec 14, 2016 at 11:03 PM, Hannes Frederic Sowa
>  wrote:
>> I fear that the alignment requirement will be a source of bugs on 32 bit
>> machines, where you cannot even simply take a well aligned struct on a
>> stack and put it into the normal siphash(aligned) function without
>> adding alignment annotations everywhere. Even blocks returned from
>> kmalloc on 32 bit are not aligned to 64 bit.
> 
> That's what the "__aligned(SIPHASH24_ALIGNMENT)" attribute is for. The
> aligned siphash function will be for structs explicitly made for
> siphash consumption. For everything else there's siphash_unaligned.

So in case you have a pointer from somewhere on 32 bit you can
essentially only guarantee it has natural alignment or max. native
alignment (based on the arch). gcc only fulfills your request for
alignment when you allocate on the stack (minus gcc bugs).

Let's say you get a pointer from somewhere, maybe embedded in a struct,
which came from kmalloc. kmalloc doesn't care about aligned attribute,
it will align according to architecture description. That said, if you
want to hash that, you would need to manually align the memory returned
from kmalloc or make sure the data is more than naturally aligned on
that architecture.
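
A minimal userspace sketch of the distinction drawn above, assuming GCC or Clang attribute syntax. The struct and helper names are hypothetical and are not part of the siphash patch; the attribute merely stands in for __aligned(SIPHASH24_ALIGNMENT). The compiler can honor the attribute for objects it lays out itself, while a pointer that arrives from elsewhere can only be checked at run time:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical input struct; the attribute plays the role of
 * __aligned(SIPHASH24_ALIGNMENT) from the discussion. */
struct toy_hash_input {
	uint32_t a;
	uint32_t b;
} __attribute__((aligned(8)));

/* Returns 1 if p sits on a 64-bit boundary, 0 otherwise. */
static int toy_is_aligned64(const void *p)
{
	return ((uintptr_t)p & 7u) == 0;
}
```

An object declared with the annotated type passes the check, but an arbitrary offset into a byte buffer (or an allocator result with only 4-byte alignment) may not.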

>> Can we do this as a runtime check and just have one function (siphash)
>> dealing with that?
> 
> Seems like the runtime branching on the aligned function would be bad
> for performance, when we likely know at compile time if it's going to
> be aligned or not. I suppose we could add that check just to the
> unaligned version, and rename it to "maybe_unaligned"? Is this what
> you have in mind?

I argue that you mostly don't know at compile time if it is correctly
aligned if the alignment requirements are larger than the natural ones.

Also, we don't even have that for memcpy, even though we probably use it much
more than hashing, so I think this is overkill.

Bye,
Hannes



Re: [Query] Delayed vxlan socket creation?

2016-12-15 Thread Du, Fan



On 2016-12-14 17:29, Jiri Benc wrote:

On Wed, 14 Dec 2016 07:49:24 +, Du, Fan wrote:

I'm interested in one Docker issue [1] which looks related to kernel vxlan
socket creation, as described in the thread. From my limited knowledge here,
socket creation is synchronous, and after the *socket* syscall, the sock
handle will be valid and ready to link up.

I'm not sure about the detailed scenario here, or which commit might fix it?

baf606d9c9b1^..56ef9c909b40

  Jiri


Thanks a lot Jiri!


Re: [RFC v2 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface

2016-12-15 Thread Vishwanathapura, Niranjana

On Wed, Dec 14, 2016 at 11:59:34PM -0800, Vishwanathapura, Niranjana wrote:

+
+static inline bool is_hfi_ibdev(struct ib_device *ibdev)
+{
+   return !memcmp(ibdev->name, "hfi", 3);
+}


I am thinking of adding a device capability flag to indicate HFI VNIC capability
instead of relying on the device name as above to identify a hfi ib device.

Any comments? Probably it can be addressed by a separate patch later.

Niranjana





Re: [Query] Delayed vxlan socket creation?

2016-12-15 Thread Du, Fan



On 2016-12-15 01:24, Cong Wang wrote:

On Tue, Dec 13, 2016 at 11:49 PM, Du, Fan  wrote:

Hi

I'm interested in one Docker issue [1] which looks related to kernel vxlan
socket creation, as described in the thread. From my limited knowledge here,
socket creation is synchronous, and after the *socket* syscall, the sock
handle will be valid and ready to link up.

You need to read the code. vxlan tunnel is a UDP tunnel, it needs a kernel
socket (and a port) to setup UDP communication, unlike GRE tunnel etc.

I checked: the fix was merged in 4.0 and my code base is pretty new,
so that is why I failed to see the work queue stuff in drivers/net/vxlan.c.

I'm not sure about the detailed scenario here, or which commit might fix it?
Thanks!

Quoted analysis:
--
(Found in kernel 3.13)
The issue happens because in older kernels, when a vxlan interface is created,
the socket creation is queued up in a worker thread which actually creates
the socket. But this needs to happen before we bring up the link on the vxlan
interface. If by chance the worker thread hasn't completed the creation of the
socket before we do link up, then the kernel checks whether the socket was
created and, if not, returns ENOTCONN. This was a bug in the kernel which got
fixed in later kernels. That is why retrying with a timer fixes the issue.
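
The retry workaround mentioned in the quoted analysis can be sketched roughly as follows. This is a toy illustration, not Docker's or the kernel's code: the callback type, function names, and the fake link-up routine are all hypothetical, and a real caller would sleep between attempts.

```c
#include <errno.h>

/* Hypothetical callback type: returns 0 on success or a negative errno. */
typedef int (*toy_link_up_fn)(void *ctx);

/* Retry link-up while it fails with -ENOTCONN, up to max_attempts tries.
 * A real caller would wait between attempts; that is omitted here. */
static int toy_link_up_retry(toy_link_up_fn link_up, void *ctx, int max_attempts)
{
	int ret = -ENOTCONN;
	int i;

	for (i = 0; i < max_attempts; i++) {
		ret = link_up(ctx);
		if (ret != -ENOTCONN)
			break;
	}
	return ret;
}

/* Toy stand-in for the old kernel behaviour: the deferred "socket"
 * becomes ready only after *remaining further calls. */
static int toy_fake_link_up(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -ENOTCONN;
	}
	return 0;
}
```

Any error other than -ENOTCONN is returned immediately, since retrying would not help in that case.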


This was introduced by commit 1c51a9159ddefa5119724a4c7da3fd3ef44b68d5
and later fixed by commit 56ef9c909b40483d2c8cb63fcbf83865f162d5ec.

Believe in Brother Cong and live forever.
Thanks for the offending commit id!




Re: [RFC v2 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface

2016-12-15 Thread Christoph Hellwig
On Thu, Dec 15, 2016 at 12:53:49AM -0800, Vishwanathapura, Niranjana wrote:
> On Wed, Dec 14, 2016 at 11:59:34PM -0800, Vishwanathapura, Niranjana wrote:
> > +
> > +static inline bool is_hfi_ibdev(struct ib_device *ibdev)
> > +{
> > +   return !memcmp(ibdev->name, "hfi", 3);
> > +}
> 
> I am thinking of adding a device capability flag to indicate HFI VNIC
> capability instead of relying on the device name as above to identify a hfi
> ib device.

Absolutely.

> Any comments? Probably it can be addressed by a separate patch later.

no, comparing device names is always wrong, please do it ASAP.
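
A small sketch of why a capability flag is more robust than a name prefix. Everything here is hypothetical (the struct, the flag bit, the helper names); it is not the ib_device API, only an illustration of the two approaches side by side:

```c
#include <string.h>

#define TOY_CAP_VNIC (1u << 0)	/* hypothetical capability bit */

/* Hypothetical stand-in for struct ib_device. */
struct toy_ibdev {
	char name[16];
	unsigned int caps;
};

/* Name-prefix test, shaped like the quoted is_hfi_ibdev(). */
static int toy_is_hfi_by_name(const struct toy_ibdev *dev)
{
	return !memcmp(dev->name, "hfi", 3);
}

/* Capability test: depends only on what the device advertises. */
static int toy_has_vnic_cap(const struct toy_ibdev *dev)
{
	return (dev->caps & TOY_CAP_VNIC) != 0;
}
```

A renamed device that does support the feature fails the name test, and an unrelated device whose name happens to start with "hfi" passes it; the flag test gets both cases right.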


Re: [PATCH] net: sfc: use new api ethtool_{get|set}_link_ksettings

2016-12-15 Thread Bert Kenward
On 14/12/16 23:12, Philippe Reynes wrote:
> The ethtool api {get|set}_settings is deprecated.
> We move this driver to new api {get|set}_link_ksettings.
> 
> Signed-off-by: Philippe Reynes 

Thanks Philippe. We'll get some testing done on this.

Bert.


[PATCH] net: ipv4: tcp_offload: check segs for NULL

2016-12-15 Thread shakya . das
From: Shakya Sundar Das 

This patch will check segs for being NULL in tcp_gso_segment()
before calling skb_shinfo(segs) from skb_is_gso(segs), otherwise
kernel can run into a NULL-pointer dereference.

Signed-off-by: Shakya Sundar Das 
---
 net/ipv4/tcp_offload.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index bc68da3..93feefd 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -96,7 +96,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
skb->ooo_okay = 0;
 
segs = skb_segment(skb, features);
-   if (IS_ERR(segs))
+   if (IS_ERR_OR_NULL(segs))
goto out;
 
/* Only first segment might have ooo_okay set */
-- 
1.7.9.5
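
For readers unfamiliar with the error-pointer convention, here is a userspace imitation showing why IS_ERR() alone lets a NULL pointer through. The toy_ names are hypothetical; the encoding is modeled on the kernel's ERR_PTR scheme (the top 4095 addresses are treated as error codes) but this is only a sketch, not the kernel header:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno as a pointer value. */
static void *toy_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers in the reserved top-of-address-space range. */
static int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* True for NULL as well as for encoded errors. */
static int toy_is_err_or_null(const void *ptr)
{
	return ptr == NULL || toy_is_err(ptr);
}
```

NULL is not in the error range, so a check that uses only the first predicate falls through and the caller goes on to dereference the pointer.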



Re: sanity checking iov_iter patches

2016-12-15 Thread Jesper Dangaard Brouer
On Thu, 15 Dec 2016 06:23:05 +
Al Viro  wrote:

>   Some of the vfs.git#work.iov_iter stuff touches net/*; basically,
> there are several missing primitives (copy_from_iter_full(), etc.) for
> "try to copy, tell whether it has copied the full amount requested and
> advance the iterator only in case of success".  Most of the callers were
> actually doing just that (see e.g. skb_add_data() and friends) and while
> nothing in the current kernel cares whether we advance ->msg_iter on
> failure, it's much more consistent semantics.
> 
>   If anybody has objections to that stuff (in linux-next, or in
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git#work.iov_iter),
> or thinks that some of that should go via net-next.git, yell and I'll
> drop the bits in question.  If not, to Linus it all goes...

Just some links to make it quicker for people to see the three patches:
 http://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git/log/?h=work.iov_iter

Patches:
 http://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git/commit/?h=work.iov_iter&id=cbbd26b8b1a
 http://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git/commit/?h=work.iov_iter&id=15e6cb46c9b
 http://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git/commit/?h=work.iov_iter&id=0b62fca2623
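
The "copy everything or nothing, advance only on success" semantics Al describes can be illustrated with a toy iterator. This is a simplified sketch: the struct and function are hypothetical, cover a single flat buffer only, and do not model the faults that the real copy_from_iter_full() has to handle.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical single-buffer iterator; the real iov_iter is more general. */
struct toy_iter {
	const unsigned char *data;
	size_t count;	/* bytes remaining */
};

/* Copy exactly len bytes or nothing. Returns 1 and advances the
 * iterator on success; returns 0 and leaves it untouched otherwise. */
static int toy_copy_from_iter_full(void *dst, size_t len, struct toy_iter *it)
{
	if (len > it->count)
		return 0;
	memcpy(dst, it->data, len);
	it->data += len;
	it->count -= len;
	return 1;
}
```

Because a failed call leaves the iterator where it was, the caller can report the error without first working out how far the iterator moved.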

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [PATCH 5/8] linux: drop __bitwise__ everywhere

2016-12-15 Thread Stefan Schmidt

Hello.

On 15/12/16 06:15, Michael S. Tsirkin wrote:

__bitwise__ used to mean "yes, please enable sparse checks
unconditionally", but now that we dropped __CHECK_ENDIAN__
__bitwise is exactly the same.
There aren't many users, replace it by __bitwise everywhere.

Signed-off-by: Michael S. Tsirkin 
---
 arch/arm/plat-samsung/include/plat/gpio-cfg.h| 2 +-
 drivers/md/dm-cache-block-types.h| 6 +++---
 drivers/net/ethernet/sun/sunhme.h| 2 +-
 drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h | 4 ++--
 include/linux/mmzone.h   | 2 +-
 include/linux/serial_core.h  | 4 ++--
 include/linux/types.h| 4 ++--
 include/scsi/iscsi_proto.h   | 2 +-
 include/target/target_core_base.h| 2 +-
 include/uapi/linux/virtio_types.h| 6 +++---
 net/ieee802154/6lowpan/6lowpan_i.h   | 2 +-
 net/mac80211/ieee80211_i.h   | 4 ++--
 12 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/arch/arm/plat-samsung/include/plat/gpio-cfg.h b/arch/arm/plat-samsung/include/plat/gpio-cfg.h
index 21391fa..e55d1f5 100644
--- a/arch/arm/plat-samsung/include/plat/gpio-cfg.h
+++ b/arch/arm/plat-samsung/include/plat/gpio-cfg.h
@@ -26,7 +26,7 @@

 #include 

-typedef unsigned int __bitwise__ samsung_gpio_pull_t;
+typedef unsigned int __bitwise samsung_gpio_pull_t;

 /* forward declaration if gpio-core.h hasn't been included */
 struct samsung_gpio_chip;
diff --git a/drivers/md/dm-cache-block-types.h b/drivers/md/dm-cache-block-types.h
index bed4ad4..389c9e8 100644
--- a/drivers/md/dm-cache-block-types.h
+++ b/drivers/md/dm-cache-block-types.h
@@ -17,9 +17,9 @@
  * discard bitset.
  */

-typedef dm_block_t __bitwise__ dm_oblock_t;
-typedef uint32_t __bitwise__ dm_cblock_t;
-typedef dm_block_t __bitwise__ dm_dblock_t;
+typedef dm_block_t __bitwise dm_oblock_t;
+typedef uint32_t __bitwise dm_cblock_t;
+typedef dm_block_t __bitwise dm_dblock_t;

 static inline dm_oblock_t to_oblock(dm_block_t b)
 {
diff --git a/drivers/net/ethernet/sun/sunhme.h b/drivers/net/ethernet/sun/sunhme.h
index f430765..4a8d5b1 100644
--- a/drivers/net/ethernet/sun/sunhme.h
+++ b/drivers/net/ethernet/sun/sunhme.h
@@ -302,7 +302,7 @@
  * Always write the address first before setting the ownership
  * bits to avoid races with the hardware scanning the ring.
  */
-typedef u32 __bitwise__ hme32;
+typedef u32 __bitwise hme32;

 struct happy_meal_rxd {
hme32 rx_flags;
diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h b/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h
index 1ad0ec1..84813b5 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h
@@ -228,7 +228,7 @@ enum iwl_ucode_tlv_flag {
IWL_UCODE_TLV_FLAGS_BCAST_FILTERING = BIT(29),
 };

-typedef unsigned int __bitwise__ iwl_ucode_tlv_api_t;
+typedef unsigned int __bitwise iwl_ucode_tlv_api_t;

 /**
  * enum iwl_ucode_tlv_api - ucode api
@@ -258,7 +258,7 @@ enum iwl_ucode_tlv_api {
 #endif
 };

-typedef unsigned int __bitwise__ iwl_ucode_tlv_capa_t;
+typedef unsigned int __bitwise iwl_ucode_tlv_capa_t;

 /**
  * enum iwl_ucode_tlv_capa - ucode capabilities
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0f088f3..36d9896 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -246,7 +246,7 @@ struct lruvec {
 #define ISOLATE_UNEVICTABLE((__force isolate_mode_t)0x8)

 /* LRU Isolation modes. */
-typedef unsigned __bitwise__ isolate_mode_t;
+typedef unsigned __bitwise isolate_mode_t;

 enum zone_watermarks {
WMARK_MIN,
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 5d49488..5def8e8 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -111,8 +111,8 @@ struct uart_icount {
__u32   buf_overrun;
 };

-typedef unsigned int __bitwise__ upf_t;
-typedef unsigned int __bitwise__ upstat_t;
+typedef unsigned int __bitwise upf_t;
+typedef unsigned int __bitwise upstat_t;

 struct uart_port {
spinlock_t  lock;   /* port lock */
diff --git a/include/linux/types.h b/include/linux/types.h
index baf7183..d501ad3 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -154,8 +154,8 @@ typedef u64 dma_addr_t;
 typedef u32 dma_addr_t;
 #endif

-typedef unsigned __bitwise__ gfp_t;
-typedef unsigned __bitwise__ fmode_t;
+typedef unsigned __bitwise gfp_t;
+typedef unsigned __bitwise fmode_t;

 #ifdef CONFIG_PHYS_ADDR_T_64BIT
 typedef u64 phys_addr_t;
diff --git a/include/scsi/iscsi_proto.h b/include/scsi/iscsi_proto.h
index c1260d8..df156f1 100644
--- a/include/scsi/iscsi_proto.h
+++ b/include/scsi/iscsi_proto.h
@@ -74,7 +74,7 @@ static inline int iscsi_sna_gte(u32 n1, u32 n2)
 #define zero_data(p) {p[0]=0;p[1]=0;p[2]=0;}

 /* initiator tags; opaque for target */
-typedef uint32_t __bitwise__ itt_t;
+typedef uint32_t _

Re: [PATCH net] vxlan: fix unused variable warning

2016-12-15 Thread Jiri Benc
On Wed, 14 Dec 2016 12:43:55 -0800, Stephen Hemminger wrote:
> Fixes commit 4528520d315ac1 ("vxlan: add ipv6 proxy support")

Wrong hash, it was commit f564f45c4518. And that commit actually did
use saddr, the actual commit that is being fixed is 4b29dba9c085
("vxlan: fix nonfunctional neigh_reduce()"). Also, please use the
standard Fixes: line when resubmitting this.

The patch itself looks good.

Thanks,

 Jiri


Re: [RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)

2016-12-15 Thread Leon Romanovsky
On Wed, Dec 14, 2016 at 11:59:32PM -0800, Vishwanathapura, Niranjana wrote:
> Thanks Jason for the valuable feedback.
> Here is the revised HFI VNIC patch series.
>
> ChangeLog:
> =
> v1 => v2:
> a) Removed hfi_vnic bus, instead make hfi_vnic driver an 'ib client',
>as per feedback from Jason Gunthorpe.
> b) Interface changes, data structure changes and variable name changes
>associated with (a).
> c) Add hfi_ibdev abstraction to provide VNIC control operations to
>hfi_vnic client.
> d) Minor fixes
> e) Moved hfi_vnic driver from .../sw/intel/vnic/hfi_vnic to
>.../sw/intel/hfi_vnic.

To put it into proportion, Jason asked you to do a different thing.
http://marc.info/?l=linux-rdma&m=147977108302151&w=2
http://marc.info/?l=linux-rdma&m=148000415401842&w=2

And Christoph,
http://marc.info/?l=linux-rdma&m=147985587425861&w=2




Re: [PATCH v2 2/2] net: ethernet: stmmac: remove private tx queue lock

2016-12-15 Thread Pavel Machek
Hi!

> The driver uses a private lock for synchronization of the xmit function and
> the xmit completion handler, but since the NETIF_F_LLTX flag is not set,
> the xmit function is also called with the xmit_lock held.
> 
> On the other hand the completion handler uses the reverse locking order by
> first taking the private lock and (in case that the tx queue had been
> stopped) then the xmit_lock.
> 
> Improve the locking by removing the private lock and using only the
> xmit_lock for synchronization instead.

Do you have stmmac hardware to test on?

I believe something is very wrong with the locking there. In
particular... scheduling the stmmac_tx_timer() function to run often
should not do anything bad if locking is correct... but it breaks the
driver rather quickly. [Example patch below, needs applying to two
places in net-next.]

(Other possibility is that hardware races with the driver.)

Giuseppe, is there documentation available for the chip? Driver says

  Documentation available at:
  http://www.stlinux.com

but that page does not work for me...

404 Not Found

Code: NoSuchBucket
Message: The specified bucket does not exist
BucketName: www.stlinux.com
RequestId: 1C8A20CB99AE7F75
HostId:
ljPnqbEpyD8exct5MUgcDXSW8n+I67Yw0aejNhLuBQ0pqN0UCfiRBa3ztlOMngiXoSN+COX+VSw=

Best regards,
Pavel

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index ffbcd03..8040370 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1973,8 +1973,9 @@ static void stmmac_xmit_common(struct sk_buff *skb, struct net_device *dev, int
 */
priv->tx_count_frames += nfrags + 1;
if (likely(priv->tx_coal_frames > priv->tx_count_frames)) {
-   mod_timer(&priv->txtimer,
- STMMAC_COAL_TIMER(priv->tx_coal_timer));
+   if (priv->tx_count_frames == nfrags + 1)
+   mod_timer(&priv->txtimer,
+ STMMAC_COAL_TIMER(priv->tx_coal_timer));
} else {
priv->tx_count_frames = 0;
priv->hw->desc->set_tx_ic(desc);
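
To make the effect of the diff above easier to follow, here is a standalone model of the patched branch. The struct is a hypothetical miniature of the driver's private state, and the two counters replace mod_timer() and set_tx_ic() so the behaviour can be observed; it is a sketch of the control flow only.

```c
/* Hypothetical miniature of the driver's private state. */
struct toy_priv {
	unsigned int tx_coal_frames;	/* frames per completion interrupt */
	unsigned int tx_count_frames;	/* frames queued since last interrupt */
	unsigned int timer_arms;	/* how often the timer was armed */
	unsigned int irq_requests;	/* how often an interrupt was requested */
};

/* Mirrors the patched branch: arm the timer only for the first frame
 * of a batch instead of pushing it back on every transmit. */
static void toy_xmit(struct toy_priv *priv, unsigned int nfrags)
{
	priv->tx_count_frames += nfrags + 1;
	if (priv->tx_coal_frames > priv->tx_count_frames) {
		if (priv->tx_count_frames == nfrags + 1)
			priv->timer_arms++;
	} else {
		priv->tx_count_frames = 0;
		priv->irq_requests++;
	}
}
```

With tx_coal_frames set to 4 and single-descriptor frames, four transmits arm the timer once and request one interrupt, so the timer fires relative to the first frame of the batch rather than the last.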


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




[PATCH] bpf: cgroup: annotate pointers in struct cgroup_bpf with __rcu

2016-12-15 Thread Daniel Mack
The member 'effective' in 'struct cgroup_bpf' is protected by RCU.
Annotate it accordingly to squelch a sparse warning.

Signed-off-by: Daniel Mack 
---
 include/linux/bpf-cgroup.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 7b6e5d1..92bc89a 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -20,7 +20,7 @@ struct cgroup_bpf {
 * when this cgroup is accessed.
 */
struct bpf_prog *prog[MAX_BPF_ATTACH_TYPE];
-   struct bpf_prog *effective[MAX_BPF_ATTACH_TYPE];
+   struct bpf_prog __rcu *effective[MAX_BPF_ATTACH_TYPE];
 };
 
 void cgroup_bpf_put(struct cgroup *cgrp);
-- 
2.9.3



Re: [PATCH v2 2/2] net: ethernet: stmmac: remove private tx queue lock

2016-12-15 Thread Giuseppe CAVALLARO

On 12/15/2016 10:45 AM, Pavel Machek wrote:

Giuseppe, is there documentation available for the chip? Driver says

  Documentation available at:
  http://www.stlinux.com

but that page does not work for me...


Hi Pavel, yes the page has been removed but all the relevant and
updated driver doc is inside the kernel sources.

Regards
Peppe



RE: [PATCH v3 3/3] random: use siphash24 instead of md5 for get_random_int/long

2016-12-15 Thread David Laight
From: Behalf Of Jason A. Donenfeld
> Sent: 14 December 2016 18:46
...
> + ret = *chaining = siphash24((u8 *)&combined, offsetof(typeof(combined), 
> end),

If you make the first argument 'const void *' you won't need the cast
on every call.

I'd also suggest making the key u64[2].
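
A sketch of the prototype shape suggested here, assuming nothing about the final siphash API. The hash body is a trivial byte sum, NOT siphash, and all names are hypothetical; the point is only that a `const void *` parameter accepts any object pointer without a cast and that the key can be passed as two 64-bit words:

```c
#include <stddef.h>
#include <stdint.h>

/* NOT siphash: a trivial byte sum, used only to show the prototype shape. */
static uint64_t toy_hash(const void *data, size_t len, const uint64_t key[2])
{
	const unsigned char *p = data;	/* implicit conversion from void * */
	uint64_t h = key[0] ^ key[1];
	size_t i;

	for (i = 0; i < len; i++)
		h += p[i];
	return h;
}

/* Hypothetical input struct with an end marker for offsetof(). */
struct toy_combined {
	uint32_t a;
	uint32_t b;
	char end[1];
};
```

A caller can then write `toy_hash(&c, offsetof(struct toy_combined, end), key)` with no `(u8 *)` cast at the call site.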

David





RE: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Linus Torvalds
> Sent: 15 December 2016 00:11
> On Wed, Dec 14, 2016 at 3:34 PM, Jason A. Donenfeld  wrote:
> >
> > Or does your reasonable dislike of "word" still allow for the use of
> > dword and qword, so that the current function names of:
> 
> dword really is confusing to people.
>
> If you have a MIPS background, it means 64 bits. While to people with
> Windows programming backgrounds it means 32 bits.

Guess what a DWORD_PTR is on 64bit windows ...
(it is an integer type).

David



Re: [PATCH v2 2/2] net: ethernet: stmmac: remove private tx queue lock

2016-12-15 Thread Pavel Machek
On Thu 2016-12-15 11:08:36, Giuseppe CAVALLARO wrote:
> On 12/15/2016 10:45 AM, Pavel Machek wrote:
> >Giuseppe, is there documentation available for the chip? Driver says
> >
> >  Documentation available at:
> >  http://www.stlinux.com
> >
> >but that page does not work for me...
> 
> Hi Pavel, yes the page has been removed but all the relevant and
> updated driver doc is inside the kernel sources.

Ok, perhaps the link should be removed, then? (Along with the bugzilla
link if that is not going to be re-enabled?)

Is there documentation for the hardware somewhere?

(As something is very wrong with stmmac_tx_clean(), either locking or
interface to the DMA engine.)

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




[PATCH 1/5] irda: irproc.c: Remove unneeded linux/miscdevice.h include

2016-12-15 Thread Corentin Labbe
irproc.c does not use any miscdevice, so this patch removes this
unnecessary inclusion.

Signed-off-by: Corentin Labbe 
---
 net/irda/irproc.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/irda/irproc.c b/net/irda/irproc.c
index b9ac598..77cfdde 100644
--- a/net/irda/irproc.c
+++ b/net/irda/irproc.c
@@ -23,7 +23,6 @@
  *
  /
 
-#include 
 #include 
 #include 
 #include 
-- 
2.10.2



[PATCH 3/5] irnet: ppp: move IRNET_MINOR to include/linux/miscdevice.h

2016-12-15 Thread Corentin Labbe
This patch moves the define for IRNET_MINOR to include/linux/miscdevice.h
It is better that all minor number definitions are in the same place.

Signed-off-by: Corentin Labbe 
---
 include/linux/miscdevice.h | 1 +
 net/irda/irnet/irnet_ppp.h | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h
index 18b2e3b..5ea0a65 100644
--- a/include/linux/miscdevice.h
+++ b/include/linux/miscdevice.h
@@ -37,6 +37,7 @@
 #define HWRNG_MINOR183
 #define MICROCODE_MINOR184
 #define KEYPAD_MINOR   185
+#define IRNET_MINOR187
 #define D7S_MINOR  193
 #define VFIO_MINOR 196
 #define TUN_MINOR  200
diff --git a/net/irda/irnet/irnet_ppp.h b/net/irda/irnet/irnet_ppp.h
index 693ebc0..18fcead 100644
--- a/net/irda/irnet/irnet_ppp.h
+++ b/net/irda/irnet/irnet_ppp.h
@@ -21,7 +21,6 @@
 
 /* /dev/irnet file constants */
 #define IRNET_MAJOR10  /* Misc range */
-#define IRNET_MINOR187 /* Official allocation */
 
 /* IrNET control channel stuff */
 #define IRNET_MAX_COMMAND  256 /* Max length of a command line */
-- 
2.10.2



[PATCH 5/5] irda: irnet: add member name to the miscdevice declaration

2016-12-15 Thread Corentin Labbe
Since struct miscdevice has many members, it is dangerous to initialize
it without member names, relying only on member order.

This patch adds member names to the initializer.

Signed-off-by: Corentin Labbe 
---
 net/irda/irnet/irnet_ppp.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/irda/irnet/irnet_ppp.h b/net/irda/irnet/irnet_ppp.h
index ec092c9..1ed17f9 100644
--- a/net/irda/irnet/irnet_ppp.h
+++ b/net/irda/irnet/irnet_ppp.h
@@ -108,9 +108,9 @@ static const struct file_operations irnet_device_fops =
 /* Structure so that the misc major (drivers/char/misc.c) take care of us... */
 static struct miscdevice irnet_misc_device =
 {
-   IRNET_MINOR,
-   "irnet",
-   &irnet_device_fops
+   .minor = IRNET_MINOR,
+   .name = "irnet",
+   .fops = &irnet_device_fops
 };
 
 #endif /* IRNET_PPP_H */
-- 
2.10.2
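
As a general illustration of the motivation (not of struct miscdevice itself), here is a hypothetical struct initialized with designated initializers. Values are bound by member name, so the initializer keeps working if members are reordered or new ones are added, and any member not named is zero-initialized:

```c
#include <stddef.h>

/* Hypothetical struct; imagine "flags" was added in front later on. */
struct toy_device {
	int flags;
	int minor;
	const char *name;
};

/* Designated form: each value is tied to its member by name. */
static const struct toy_device toy_dev = {
	.minor = 187,
	.name = "toy",
};
```

A positional initializer written before "flags" was added would have put 187 into the wrong member without any diagnostic from the compiler.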



RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Hannes Frederic Sowa
> Sent: 14 December 2016 22:03
> On 14.12.2016 13:46, Jason A. Donenfeld wrote:
> > Hi David,
> >
> > On Wed, Dec 14, 2016 at 10:56 AM, David Laight  
> > wrote:
> >> ...
> >>> +u64 siphash24(const u8 *data, size_t len, const u8 
> >>> key[SIPHASH24_KEY_LEN])
> >> ...
> >>> + u64 k0 = get_unaligned_le64(key);
> >>> + u64 k1 = get_unaligned_le64(key + sizeof(u64));
> >> ...
> >>> + m = get_unaligned_le64(data);
> >>
> >> All these unaligned accesses are going to get expensive on architectures
> >> like sparc64.
> >
> > Yes, the unaligned accesses aren't pretty. Since in pretty much all
> > use cases thus far, the data can easily be made aligned, perhaps it
> > makes sense to create siphash24() and siphash24_unaligned(). Any
> > thoughts on doing something like that?
> 
> I fear that the alignment requirement will be a source of bugs on 32 bit
> machines, where you cannot even simply take a well aligned struct on a
> stack and put it into the normal siphash(aligned) function without
> adding alignment annotations everywhere. Even blocks returned from
> kmalloc on 32 bit are not aligned to 64 bit.

Are you doing anything that will require 64bit alignment on 32bit systems?
It is unlikely that the kernel can use any simd registers that have wider
alignment requirements.

You also really don't want to request on-stack items have large alignments.
While gcc can generate code to do it, it isn't pretty.
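
For context on the get_unaligned_le64() calls quoted above, a portable userspace equivalent can be written with byte accesses only. This is a sketch with a hypothetical name, not the kernel helper; it shows why such a load is safe at any address, and also why it can cost more than a single aligned 64-bit load on strict-alignment machines:

```c
#include <stdint.h>

/* Little-endian 64-bit load that is safe at any alignment:
 * it only ever performs byte accesses. */
static uint64_t toy_get_unaligned_le64(const void *p)
{
	const unsigned char *b = p;
	uint64_t v = 0;
	int i;

	/* Start from the most significant byte (highest address). */
	for (i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}
```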

David




Re: Synopsys Ethernet QoS

2016-12-15 Thread Pavel Machek
Hi!

> I know that this is completely off topic, but I am facing a difficulty with
> stmmac. I have interrupts and the MAC well configured, and RX packets are
> being received successfully, but TX is not working, resulting in
> TX errors = total TX packets. I have done a lot of debugging and my
> conclusion is that for some reason, when using stmmac, after starting TX DMA
> the HW state machine enters a dead-end state, resulting in those errors.
> Has anyone faced this trouble?

Actually, I see you have address @synopsys.com, would you have
documentation for the chip?

I'm trying to understand stmmac_tx_clean() and docs would help...

Thanks,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




[PATCH net-next] ixgbevf: fix 'Etherleak' in ixgbevf

2016-12-15 Thread Weilong Chen
Nessus reports that the VF appears to leak memory in network packets.
Fix this by padding all small packets manually.

See also CVE-2003-0001:
https://ofirarkin.files.wordpress.com/2008/11/atstake_etherleak_report.pdf

Signed-off-by: Weilong Chen 
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 6d4bef5..137a154 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -3654,6 +3654,13 @@ static int ixgbevf_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
return NETDEV_TX_OK;
}
 
+   /* On PCI/PCI-X HW, if packet size is less than ETH_ZLEN,
+* packets may get corrupted during padding by HW.
+* To WA this issue, pad all small packets manually.
+*/
+   if (eth_skb_pad(skb))
+   return NETDEV_TX_OK;
+
tx_ring = adapter->tx_ring[skb->queue_mapping];
 
/* need: 1 descriptor per page * PAGE_SIZE/IXGBE_MAX_DATA_PER_TXD,
-- 
1.7.12
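
The idea behind padding short frames in software can be sketched in userspace as follows. The function and buffer handling are hypothetical and much simpler than eth_skb_pad(); the sketch only shows that the bytes between the payload and the 60-byte minimum are explicitly zeroed, so stale memory contents cannot end up on the wire:

```c
#include <stddef.h>
#include <string.h>

#define TOY_ETH_ZLEN 60	/* minimum Ethernet frame length without FCS */

/* Zero-pad a frame in buf up to TOY_ETH_ZLEN. Returns the new length,
 * or 0 if the buffer is too small to hold the padded frame. */
static size_t toy_pad_frame(unsigned char *buf, size_t len, size_t cap)
{
	if (len >= TOY_ETH_ZLEN)
		return len;
	if (cap < TOY_ETH_ZLEN)
		return 0;
	memset(buf + len, 0, TOY_ETH_ZLEN - len);
	return TOY_ETH_ZLEN;
}
```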



[PATCH 4/5] irda: irnet: Remove unused IRNET_MAJOR define

2016-12-15 Thread Corentin Labbe
The IRNET_MAJOR define is not used, so this patch removes it.

Signed-off-by: Corentin Labbe 
---
 net/irda/irnet/irnet_ppp.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/irda/irnet/irnet_ppp.h b/net/irda/irnet/irnet_ppp.h
index 18fcead..ec092c9 100644
--- a/net/irda/irnet/irnet_ppp.h
+++ b/net/irda/irnet/irnet_ppp.h
@@ -19,9 +19,6 @@
 
 / CONSTANTS & MACROS /
 
-/* /dev/irnet file constants */
-#define IRNET_MAJOR10  /* Misc range */
-
 /* IrNET control channel stuff */
 #define IRNET_MAX_COMMAND  256 /* Max length of a command line */
 
-- 
2.10.2



Re: [PATCH 3/5] irnet: ppp: move IRNET_MINOR to include/linux/miscdevice.h

2016-12-15 Thread Greg KH
On Thu, Dec 15, 2016 at 11:42:48AM +0100, Corentin Labbe wrote:
> This patch moves the define for IRNET_MINOR to include/linux/miscdevice.h
> It is better that all minor number definitions are in the same place.
> 
> Signed-off-by: Corentin Labbe 
> ---
>  include/linux/miscdevice.h | 1 +
>  net/irda/irnet/irnet_ppp.h | 1 -
>  2 files changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h
> index 18b2e3b..5ea0a65 100644
> --- a/include/linux/miscdevice.h
> +++ b/include/linux/miscdevice.h
> @@ -37,6 +37,7 @@
>  #define HWRNG_MINOR  183
>  #define MICROCODE_MINOR  184
>  #define KEYPAD_MINOR 185
> +#define IRNET_MINOR  187
>  #define D7S_MINOR193
>  #define VFIO_MINOR   196
>  #define TUN_MINOR200
> diff --git a/net/irda/irnet/irnet_ppp.h b/net/irda/irnet/irnet_ppp.h
> index 693ebc0..18fcead 100644
> --- a/net/irda/irnet/irnet_ppp.h
> +++ b/net/irda/irnet/irnet_ppp.h
> @@ -21,7 +21,6 @@
>  
>  /* /dev/irnet file constants */
>  #define IRNET_MAJOR  10  /* Misc range */
> -#define IRNET_MINOR  187 /* Official allocation */
>  
>  /* IrNET control channel stuff */
>  #define IRNET_MAX_COMMAND256 /* Max length of a command line */
> -- 
> 2.10.2

Acked-by: Greg Kroah-Hartman 


[PATCH 2/5] irda: irnet: Move linux/miscdevice.h include

2016-12-15 Thread Corentin Labbe
The only use of miscdevice is in irda_ppp, so there is no need to include
linux/miscdevice.h in all irda files.
This patch moves the linux/miscdevice.h include to irnet_ppp.h.

Signed-off-by: Corentin Labbe 
---
 net/irda/irnet/irnet.h | 1 -
 net/irda/irnet/irnet_ppp.h | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/irda/irnet/irnet.h b/net/irda/irnet/irnet.h
index 8d65bb9..c69f0f3 100644
--- a/net/irda/irnet/irnet.h
+++ b/net/irda/irnet/irnet.h
@@ -245,7 +245,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include/* isspace() */
diff --git a/net/irda/irnet/irnet_ppp.h b/net/irda/irnet/irnet_ppp.h
index 9402258..693ebc0 100644
--- a/net/irda/irnet/irnet_ppp.h
+++ b/net/irda/irnet/irnet_ppp.h
@@ -15,6 +15,7 @@
 /* INCLUDES */
 
 #include "irnet.h" /* Module global include */
+#include 
 
 / CONSTANTS & MACROS /
 
-- 
2.10.2



Re: [PATCH 8/8] Makefile: drop -D__CHECK_ENDIAN__ from cflags

2016-12-15 Thread Greg Kroah-Hartman
On Thu, Dec 15, 2016 at 07:15:30AM +0200, Michael S. Tsirkin wrote:
> That's the default now, no need for makefiles to set it.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/bluetooth/Makefile| 2 --
>  drivers/net/can/Makefile  | 1 -
>  drivers/net/ethernet/altera/Makefile  | 1 -
>  drivers/net/ethernet/atheros/alx/Makefile | 1 -
>  drivers/net/ethernet/freescale/Makefile   | 2 --
>  drivers/net/wireless/ath/Makefile | 2 --
>  drivers/net/wireless/ath/wil6210/Makefile | 2 --
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/Makefile | 2 --
>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/Makefile | 1 -
>  drivers/net/wireless/intel/iwlegacy/Makefile  | 2 --
>  drivers/net/wireless/intel/iwlwifi/Makefile   | 2 +-
>  drivers/net/wireless/intel/iwlwifi/dvm/Makefile   | 2 +-
>  drivers/net/wireless/intel/iwlwifi/mvm/Makefile   | 2 +-
>  drivers/net/wireless/intersil/orinoco/Makefile| 3 ---
>  drivers/net/wireless/mediatek/mt7601u/Makefile| 2 --
>  drivers/net/wireless/realtek/rtlwifi/Makefile | 2 --
>  drivers/net/wireless/realtek/rtlwifi/btcoexist/Makefile   | 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8188ee/Makefile   | 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8192c/Makefile| 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8192ce/Makefile   | 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8192cu/Makefile   | 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8192de/Makefile   | 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8192ee/Makefile   | 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8192se/Makefile   | 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8723ae/Makefile   | 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8723be/Makefile   | 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8723com/Makefile  | 2 --
>  drivers/net/wireless/realtek/rtlwifi/rtl8821ae/Makefile   | 2 --
>  drivers/net/wireless/ti/wl1251/Makefile   | 2 --
>  drivers/net/wireless/ti/wlcore/Makefile   | 2 --
>  drivers/staging/rtl8188eu/Makefile| 2 +-
>  drivers/staging/rtl8192e/Makefile | 2 --
>  drivers/staging/rtl8192e/rtl8192e/Makefile| 2 --
>  net/bluetooth/Makefile| 2 --
>  net/ieee802154/Makefile   | 2 --
>  net/mac80211/Makefile | 2 +-
>  net/mac802154/Makefile| 2 --
>  net/wireless/Makefile | 2 --
>  38 files changed, 5 insertions(+), 68 deletions(-)

For drivers/staging:

Acked-by: Greg Kroah-Hartman 


Re: [PATCH 5/8] linux: drop __bitwise__ everywhere

2016-12-15 Thread Greg Kroah-Hartman
On Thu, Dec 15, 2016 at 07:15:20AM +0200, Michael S. Tsirkin wrote:
> __bitwise__ used to mean "yes, please enable sparse checks
> unconditionally", but now that we dropped __CHECK_ENDIAN__
> __bitwise is exactly the same.
> There aren't many users, replace it by __bitwise everywhere.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  arch/arm/plat-samsung/include/plat/gpio-cfg.h| 2 +-
>  drivers/md/dm-cache-block-types.h| 6 +++---
>  drivers/net/ethernet/sun/sunhme.h| 2 +-
>  drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h | 4 ++--
>  include/linux/mmzone.h   | 2 +-
>  include/linux/serial_core.h  | 4 ++--
>  include/linux/types.h| 4 ++--
>  include/scsi/iscsi_proto.h   | 2 +-
>  include/target/target_core_base.h| 2 +-
>  include/uapi/linux/virtio_types.h| 6 +++---
>  net/ieee802154/6lowpan/6lowpan_i.h   | 2 +-
>  net/mac80211/ieee80211_i.h   | 4 ++--
>  12 files changed, 20 insertions(+), 20 deletions(-)

for include/linux/serial_core.h:

Acked-by: Greg Kroah-Hartman 


Re: Synopsys Ethernet QoS

2016-12-15 Thread Niklas Cassel
On 12/14/2016 01:57 PM, Pavel Machek wrote:
> Hi!
>
>> So if there is a long time before handling interrupts,
>> I guess that it makes sense that one stream could
>> get an advantage in the net scheduler.
>>
>> If I find the time, and if no one beats me to it, I will try to replace
>> the normal timers with HR timers + a smaller default timeout.
>>
> Can you try something like this? Highres timers will be needed, too,
> but this fixes the logic problem.

Hello Pavel

I tried your patch, but unfortunately I get a tx queue timeout.
After that, I cannot ping.

[   22.075782] [ cut here ]
[   22.080430] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 
dev_watchdog+0x240/0x258
[   22.088704] NETDEV WATCHDOG: eth0 (stmmaceth): transmit queue 0 timed out
[   22.095491] Modules linked in:
[   22.098552] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-axis3-devel #126
[   22.105592] Hardware name: Axis ARTPEC-6 Platform
[   22.110301] [<80110568>] (unwind_backtrace) from [<8010c2bc>] 
(show_stack+0x18/0x1c)
[   22.118043] [<8010c2bc>] (show_stack) from [<80433544>] 
(dump_stack+0x80/0xa0)
[   22.125264] [<80433544>] (dump_stack) from [<8011f9f0>] (__warn+0xe0/0x10c)
[   22.132221] [<8011f9f0>] (__warn) from [<8011fadc>] 
(warn_slowpath_fmt+0x40/0x50)
[   22.139700] [<8011fadc>] (warn_slowpath_fmt) from [<805e626c>] 
(dev_watchdog+0x240/0x258)
[   22.147875] [<805e626c>] (dev_watchdog) from [<801826c8>] 
(call_timer_fn+0x44/0x208)
[   22.155613] [<801826c8>] (call_timer_fn) from [<80182934>] 
(expire_timers+0xa8/0x15c)
[   22.163437] [<80182934>] (expire_timers) from [<80182a74>] 
(run_timer_softirq+0x8c/0x164)
[   22.171610] [<80182a74>] (run_timer_softirq) from [<80124a7c>] 
(__do_softirq+0xac/0x3f0)
[   22.179696] [<80124a7c>] (__do_softirq) from [<80125124>] 
(irq_exit+0xf0/0x158)
[   22.187003] [<80125124>] (irq_exit) from [<8016ffd4>] 
(__handle_domain_irq+0x60/0xb8)
[   22.194828] [<8016ffd4>] (__handle_domain_irq) from [<801014c4>] 
(gic_handle_irq+0x4c/0x9c)
[   22.203175] [<801014c4>] (gic_handle_irq) from [<806cc48c>] 
(__irq_svc+0x6c/0xa8)
[   22.210648] Exception stack(0x80b01f60 to 0x80b01fa8)
[   22.215694] 1f60:  bf5c03f0 80b01fb8 8011a060  0001 
80b03c9c 80b03c2c
[   22.223865] 1f80: 80b1c045 80b1c045 0001  80a673f0 80b01fb0 
801090c0 801090c4
[   22.232032] 1fa0: 6013 
[   22.235520] [<806cc48c>] (__irq_svc) from [<801090c4>] 
(arch_cpu_idle+0x38/0x44)
[   22.242914] [<801090c4>] (arch_cpu_idle) from [<80160f00>] 
(cpu_startup_entry+0xd8/0x148)
[   22.251089] [<80160f00>] (cpu_startup_entry) from [<80a00c44>] 
(start_kernel+0x360/0x3c8)
[   22.259269] ---[ end trace e04d3944bdde616a ]---



I patched both stmmac_tso_xmit and stmmac_xmit, as instructed.
Here is the diff:

--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2090,8 +2090,9 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, 
struct net_device *dev)
/* Manage tx mitigation */
priv->tx_count_frames += nfrags + 1;
if (likely(priv->tx_coal_frames > priv->tx_count_frames)) {
-   mod_timer(&priv->txtimer,
- STMMAC_COAL_TIMER(priv->tx_coal_timer));
+   if (priv->tx_count_frames == nfrags + 1)
+   mod_timer(&priv->txtimer,
+ STMMAC_COAL_TIMER(priv->tx_coal_timer));
} else {
priv->tx_count_frames = 0;
priv->hw->desc->set_tx_ic(desc);
@@ -2292,8 +2293,9 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, 
struct net_device *dev)
 */
priv->tx_count_frames += nfrags + 1;
if (likely(priv->tx_coal_frames > priv->tx_count_frames)) {
-   mod_timer(&priv->txtimer,
- STMMAC_COAL_TIMER(priv->tx_coal_timer));
+   if (priv->tx_count_frames == nfrags + 1)
+   mod_timer(&priv->txtimer,
+ STMMAC_COAL_TIMER(priv->tx_coal_timer));
} else {
priv->tx_count_frames = 0;
priv->hw->desc->set_tx_ic(desc);



Without your patch, I get no tx queue timeout, and ping works fine.


>
> You'll need to apply it twice as code is copy&pasted.
>
> Best regards,
>   Pavel
>
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>
>*/
>   priv->tx_count_frames += nfrags + 1;
>   if (likely(priv->tx_coal_frames > priv->tx_count_frames)) {
> - mod_timer(&priv->txtimer,
> -   STMMAC_COAL_TIMER(priv->tx_coal_timer));
> + if (priv->tx_count_frames == nfrags + 1)
> + mod_timer(&priv->txtimer,
> +   STMMAC_COAL_TIMER(priv->tx_coal_timer));
>   } else {
>   priv->tx_count_frames = 0;
>   priv->hw->desc->set_tx_ic(des

Applied "misc: atmel-ssc: register as sound DAI if #sound-dai-cells is present" to the asoc tree

2016-12-15 Thread Mark Brown
The patch

   misc: atmel-ssc: register as sound DAI if #sound-dai-cells is present

has been applied to the asoc tree at

   git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

From e8314d7d53c8b050aac2828a5de5f28a997b468b Mon Sep 17 00:00:00 2001
From: Peter Rosin 
Date: Tue, 6 Dec 2016 20:22:36 +0100
Subject: [PATCH] misc: atmel-ssc: register as sound DAI if #sound-dai-cells is
 present

The SSC is currently not usable with the ASoC simple-audio-card, as
every SSC audio user has to build a platform driver that may do as
little as calling atmel_ssc_set_audio/atmel_ssc_put_audio (which
allocates the SSC and registers a DAI with the ASoC subsystem).

So, have that happen automatically, if the #sound-dai-cells property
is present in devicetree, which it has to be anyway for simple audio
card to work.

Signed-off-by: Peter Rosin 
Acked-by: Rob Herring 
Acked-by: Nicolas Ferre 
Signed-off-by: Mark Brown 
---
 .../devicetree/bindings/misc/atmel-ssc.txt |  2 +
 drivers/misc/atmel-ssc.c   | 50 ++
 include/linux/atmel-ssc.h  |  1 +
 3 files changed, 53 insertions(+)

diff --git a/Documentation/devicetree/bindings/misc/atmel-ssc.txt 
b/Documentation/devicetree/bindings/misc/atmel-ssc.txt
index efc98ea1f23d..f8629bb73945 100644
--- a/Documentation/devicetree/bindings/misc/atmel-ssc.txt
+++ b/Documentation/devicetree/bindings/misc/atmel-ssc.txt
@@ -24,6 +24,8 @@ Optional properties:
this parameter to choose where the clock from.
  - By default the clock is from TK pin, if the clock from RK pin, this
property is needed.
+  - #sound-dai-cells: Should contain <0>.
+ - This property makes the SSC into an automatically registered DAI.
 
 Examples:
 - PDC transfer:
diff --git a/drivers/misc/atmel-ssc.c b/drivers/misc/atmel-ssc.c
index 0516ecda54d3..b2a0340f277e 100644
--- a/drivers/misc/atmel-ssc.c
+++ b/drivers/misc/atmel-ssc.c
@@ -20,6 +20,8 @@
 
 #include 
 
+#include "../../sound/soc/atmel/atmel_ssc_dai.h"
+
 /* Serialize access to ssc_list and user count */
 static DEFINE_SPINLOCK(user_lock);
 static LIST_HEAD(ssc_list);
@@ -145,6 +147,49 @@ static inline const struct atmel_ssc_platform_data * __init
platform_get_device_id(pdev)->driver_data;
 }
 
+#ifdef CONFIG_SND_ATMEL_SOC_SSC
+static int ssc_sound_dai_probe(struct ssc_device *ssc)
+{
+   struct device_node *np = ssc->pdev->dev.of_node;
+   int ret;
+   int id;
+
+   ssc->sound_dai = false;
+
+   if (!of_property_read_bool(np, "#sound-dai-cells"))
+   return 0;
+
+   id = of_alias_get_id(np, "ssc");
+   if (id < 0)
+   return id;
+
+   ret = atmel_ssc_set_audio(id);
+   ssc->sound_dai = !ret;
+
+   return ret;
+}
+
+static void ssc_sound_dai_remove(struct ssc_device *ssc)
+{
+   if (!ssc->sound_dai)
+   return;
+
+   atmel_ssc_put_audio(of_alias_get_id(ssc->pdev->dev.of_node, "ssc"));
+}
+#else
+static inline int ssc_sound_dai_probe(struct ssc_device *ssc)
+{
+   if (of_property_read_bool(ssc->pdev->dev.of_node, "#sound-dai-cells"))
+   return -ENOTSUPP;
+
+   return 0;
+}
+
+static inline void ssc_sound_dai_remove(struct ssc_device *ssc)
+{
+}
+#endif
+
 static int ssc_probe(struct platform_device *pdev)
 {
struct resource *regs;
@@ -204,6 +249,9 @@ static int ssc_probe(struct platform_device *pdev)
dev_info(&pdev->dev, "Atmel SSC device at 0x%p (irq %d)\n",
ssc->regs, ssc->irq);
 
+   if (ssc_sound_dai_probe(ssc))
+   dev_err(&pdev->dev, "failed to auto-setup ssc for audio\n");
+
return 0;
 }
 
@@ -211,6 +259,8 @@ static int ssc_remove(struct platform_device *pdev)
 {
struct ssc_device *ssc = platform_get_drvdata(pdev);
 
+   ssc_sound_dai_remove(ssc);
+
spin_lock(&user_lock);
list_del(&ssc->list);
spin_unlock(&user_lock);
diff --git a/include/linux/atmel-ssc.h b/include/linux/atmel-ssc.h
index 7c0f6549898b..fdb545101ede 100644
--- a/include/linux/atmel-ssc.h
+++ b/include/linux/atmel-ssc.h
@@ -20,6 +20,7 @@ struct ssc_device {
int user;
int  

Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread Hannes Frederic Sowa
On 15.12.2016 12:04, David Laight wrote:
> From: Hannes Frederic Sowa
>> Sent: 14 December 2016 22:03
>> On 14.12.2016 13:46, Jason A. Donenfeld wrote:
>>> Hi David,
>>>
>>> On Wed, Dec 14, 2016 at 10:56 AM, David Laight  
>>> wrote:
 ...
> +u64 siphash24(const u8 *data, size_t len, const u8 
> key[SIPHASH24_KEY_LEN])
 ...
> + u64 k0 = get_unaligned_le64(key);
> + u64 k1 = get_unaligned_le64(key + sizeof(u64));
 ...
> + m = get_unaligned_le64(data);

 All these unaligned accesses are going to get expensive on architectures
 like sparc64.
>>>
>>> Yes, the unaligned accesses aren't pretty. Since in pretty much all
>>> use cases thus far, the data can easily be made aligned, perhaps it
>>> makes sense to create siphash24() and siphash24_unaligned(). Any
>>> thoughts on doing something like that?
>>
>> I fear that the alignment requirement will be a source of bugs on 32 bit
>> machines, where you cannot even simply take a well aligned struct on a
>> stack and put it into the normal siphash(aligned) function without
>> adding alignment annotations everywhere. Even blocks returned from
>> kmalloc on 32 bit are not aligned to 64 bit.
> 
> Are you doing anything that will require 64bit alignment on 32bit systems?
> It is unlikely that the kernel can use any simd registers that have wider
> alignment requirements.
> 
> You also really don't want to request on-stack items have large alignments.
> While gcc can generate code to do it, it isn't pretty.

Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
bytes on 32 bit. Do you question that?
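
For readers following the unaligned-access concern raised above: a portable unaligned little-endian load can be written byte by byte, so that no wide load is ever issued on a misaligned address. The sketch below is a user-space stand-in written for illustration, not the kernel's get_unaligned_le64(); the name load_le64 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for an unaligned little-endian
 * 64-bit load. It reads one byte at a time, so it never issues a
 * misaligned wide load, at the cost of eight loads and shifts. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	assert(p != NULL);
	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];	/* p[0] ends up as the least significant byte */
	return v;
}
```

The cost of the eight loads and shifts is the reason the thread discusses a separate aligned variant: when the caller can guarantee alignment, a single 64-bit load is enough.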





RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Hannes Frederic Sowa
> Sent: 15 December 2016 12:23
...
> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
> bytes on 32 bit. Do you question that?

Yes.

The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc).

David



Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread Hannes Frederic Sowa
On 15.12.2016 13:28, David Laight wrote:
> From: Hannes Frederic Sowa
>> Sent: 15 December 2016 12:23
> ...
>> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
>> bytes on 32 bit. Do you question that?
> 
> Yes.
> 
> The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc).

Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I
am actually not sure if the ABI would say anything about that (sorry
also for my wrong statement above).

The alignment requirement of unsigned long long on gcc with -m32 actually
seems to be 8.




[PATCH iproute2 v2 3/3] ifstat: Add "sw only" extended statistics to ifstat

2016-12-15 Thread Nogah Frankel
Add support for extended statistics of the SW-only type, which count only
the packets that went via the CPU (useful for systems with forwarding
offload). They are read from filter type IFLA_STATS_LINK_OFFLOAD_XSTATS
and sub type IFLA_OFFLOAD_XSTATS_CPU_HIT.

They are available under the name 'software'
(or any prefix of it, such as 'soft' or simply 's').

For example:
ifstat -x s

Signed-off-by: Nogah Frankel 
Reviewed-by: Jiri Pirko 
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index ac99d04..62f1f2b 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -730,7 +730,8 @@ static void xstat_usage(void)
 {
fprintf(stderr,
 "Usage: ifstat supported xstats:\n"
-"   64bits default stats, with 64 bits support\n");
+"   64bits default stats, with 64 bits support\n"
+"   softwareSW stats. Counts only packets that went via the 
CPU\n");
 }
 
 struct extended_stats_options_t {
@@ -745,6 +746,7 @@ struct extended_stats_options_t {
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
{"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE},
+   {"software",  IFLA_STATS_LINK_OFFLOAD_XSTATS, 
IFLA_OFFLOAD_XSTATS_CPU_HIT},
 };
 
 static bool get_filter_type(char *name)
-- 
2.4.3



[PATCH iproute2 v2 1/3] ifstat: Add extended statistics to ifstat

2016-12-15 Thread Nogah Frankel
Extended stats are part of the RTM_GETSTATS method. This patch adds them
to ifstat.
While extended stats can come in many forms, we support only the
rtnl_link_stats64 struct for them (which is the 64 bits version of struct
rtnl_link_stats).
We support stats in the main nesting level, or one lower.
The extension can be called by its name or any prefix of it. If more than
one extension matches, the first one is picked.

To get the extended stats the flag -x  is used.

Signed-off-by: Nogah Frankel 
Reviewed-by: Jiri Pirko 
---
 misc/ifstat.c | 161 --
 1 file changed, 146 insertions(+), 15 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 92d67b0..d17ae21 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -35,6 +35,7 @@
 
 #include 
 
+#include "utils.h"
 int dump_zeros;
 int reset_history;
 int ignore_history;
@@ -48,17 +49,21 @@ int pretty;
 double W;
 char **patterns;
 int npatterns;
+bool is_extanded;
+int filter_type;
+int sub_type;
 
 char info_source[128];
 int source_mismatch;
 
 #define MAXS (sizeof(struct rtnl_link_stats)/sizeof(__u32))
+#define NO_SUB_TYPE 0x
 
 struct ifstat_ent {
struct ifstat_ent   *next;
char*name;
int ifindex;
-   unsigned long long  val[MAXS];
+   __u64   val[MAXS];
double  rate[MAXS];
__u32   ival[MAXS];
 };
@@ -106,6 +111,48 @@ static int match(const char *id)
return 0;
 }
 
+static int get_nlmsg_extanded(const struct sockaddr_nl *who,
+ struct nlmsghdr *m, void *arg)
+{
+   struct if_stats_msg *ifsm = NLMSG_DATA(m);
+   struct rtattr *tb[IFLA_STATS_MAX+1];
+   int len = m->nlmsg_len;
+   struct ifstat_ent *n;
+
+   if (m->nlmsg_type != RTM_NEWSTATS)
+   return 0;
+
+   len -= NLMSG_LENGTH(sizeof(*ifsm));
+   if (len < 0)
+   return -1;
+
+   parse_rtattr(tb, IFLA_STATS_MAX, IFLA_STATS_RTA(ifsm), len);
+   if (tb[filter_type] == NULL)
+   return 0;
+
+   n = malloc(sizeof(*n));
+   if (!n)
+   abort();
+
+   n->ifindex = ifsm->ifindex;
+   n->name = strdup(ll_index_to_name(ifsm->ifindex));
+
+   if (sub_type == NO_SUB_TYPE) {
+   memcpy(&n->val, RTA_DATA(tb[filter_type]), sizeof(n->val));
+   } else {
+   struct rtattr *attr;
+
+   attr = parse_rtattr_one_nested(sub_type, tb[filter_type]);
+   if (attr == NULL)
+   return 0;
+   memcpy(&n->val, RTA_DATA(attr), sizeof(n->val));
+   }
+   memset(&n->rate, 0, sizeof(n->rate));
+   n->next = kern_db;
+   kern_db = n;
+   return 0;
+}
+
 static int get_nlmsg(const struct sockaddr_nl *who,
 struct nlmsghdr *m, void *arg)
 {
@@ -147,18 +194,34 @@ static void load_info(void)
 {
struct ifstat_ent *db, *n;
struct rtnl_handle rth;
+   __u32 filter_mask;
 
if (rtnl_open(&rth, 0) < 0)
exit(1);
 
-   if (rtnl_wilddump_request(&rth, AF_INET, RTM_GETLINK) < 0) {
-   perror("Cannot send dump request");
-   exit(1);
-   }
+   if (is_extanded) {
+   ll_init_map(&rth);
+   filter_mask = IFLA_STATS_FILTER_BIT(filter_type);
+   if (rtnl_wilddump_stats_req_filter(&rth, AF_UNSPEC, 
RTM_GETSTATS,
+  filter_mask) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
 
-   if (rtnl_dump_filter(&rth, get_nlmsg, NULL) < 0) {
-   fprintf(stderr, "Dump terminated\n");
-   exit(1);
+   if (rtnl_dump_filter(&rth, get_nlmsg_extanded, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
+   } else {
+   if (rtnl_wilddump_request(&rth, AF_INET, RTM_GETLINK) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
+
+   if (rtnl_dump_filter(&rth, get_nlmsg, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
}
 
rtnl_close(&rth);
@@ -553,10 +616,17 @@ static void update_db(int interval)
}
for (i = 0; i < MAXS; i++) {
double sample;
-   unsigned long incr = h1->ival[i] - 
n->ival[i];
+   __u64 incr;
+
+   if (is_extanded) {
+   incr = h1->val[i] - n->val[i];
+   n->val[i] = h1->val[

[PATCH iproute2 v2 2/3] ifstat: Add 64 bits based stats to extended statistics

2016-12-15 Thread Nogah Frankel
The default stats for ifstat are 32 bits based.
The kernel also supports 64 bits based stats. (They are returned in struct
rtnl_link_stats64, which is an exact copy of struct rtnl_link_stats, in
which the "normal" stats are returned, but with u64 fields instead of
u32.) This patch adds them as an extended stats type.

They are read with filter type IFLA_STATS_LINK_64 and no sub type.

They are available under the name 64bits
(or any prefix of it, such as "64").

For example:
ifstat -x 64bit

Signed-off-by: Nogah Frankel 
Reviewed-by: Jiri Pirko 
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index d17ae21..ac99d04 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -729,7 +729,8 @@ static int verify_forging(int fd)
 static void xstat_usage(void)
 {
fprintf(stderr,
-"Usage: ifstat supported xstats:\n");
+"Usage: ifstat supported xstats:\n"
+"   64bits default stats, with 64 bits support\n");
 }
 
 struct extended_stats_options_t {
@@ -743,6 +744,7 @@ struct extended_stats_options_t {
  * Name length must be under 64 chars.
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
+   {"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE},
 };
 
 static bool get_filter_type(char *name)
-- 
2.4.3



[PATCH iproute2 v2 0/3] update ifstat for new stats

2016-12-15 Thread Nogah Frankel
Previously, stats were fetched with RTM_GETLINK, which returns 32 bits based
statistics and supports only one type of stats.
Lately, a new method to get stats was added: RTM_GETSTATS. It supports
choosing the stats type, and its basic stats are 64 bits based rather than
32 bits based.

This patchset gives ifstat the ability to get extended stats by this
method. It adds two types of extended stats:
64bits - the same as the "normal" stats, but gets the stats from the cpu
in a 64 bits based struct.
SW - for packets that hit the cpu.

---
v1->v2:
 - change from using RTM_GETSTATS always to using it only for extended
   stats.
 - Add 64bits extended stats type.

Nogah Frankel (3):
  ifstat: Add extended statistics to ifstat
  ifstat: Add 64 bits based stats to extended statistics
  ifstat: Add "sw only" extended statistics to ifstat

 misc/ifstat.c | 165 --
 1 file changed, 150 insertions(+), 15 deletions(-)

-- 
2.4.3
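
The series says an extended stats type can be selected by its full name or by any prefix of it, with the first match winning. A minimal sketch of that lookup, under stated assumptions, could look like the following. The struct, the function name find_xstat_option and the numeric ids are hypothetical and are not taken from misc/ifstat.c; in particular the ids are placeholders, not the real IFLA_* values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option table; the numeric ids are placeholders, not the
 * real IFLA_* values used by the patches. */
struct xstat_option {
	const char *name;
	int id;
};

static const struct xstat_option xstat_options[] = {
	{ "64bits",   1 },
	{ "software", 2 },
};

/* Return the index of the first option whose name starts with `arg`,
 * or -1 if there is none. An empty `arg` is rejected here; that is a
 * choice made for this sketch. */
static int find_xstat_option(const char *arg)
{
	size_t len, i;

	assert(arg != NULL);
	len = strlen(arg);
	if (len == 0)
		return -1;

	for (i = 0; i < sizeof(xstat_options) / sizeof(xstat_options[0]); i++) {
		/* Compare only the characters the user typed. */
		if (strncmp(arg, xstat_options[i].name, len) == 0)
			return (int)i;
	}
	return -1;
}
```

With this table, "64" and "64bit" select the first entry and "s" or "soft" select the second. Because the first match wins, the order of the table decides which entry an ambiguous prefix resolves to.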



Re: [PATCH iproute2 2/2] tc/m_tunnel_key: Add dest UDP port to tunnel key action

2016-12-15 Thread Simon Horman
On Tue, Dec 13, 2016 at 10:07:47AM +0200, Hadar Hen Zion wrote:
> Enhance tunnel key action parameters by adding destination UDP port.
> 
> Signed-off-by: Hadar Hen Zion 
> Reviewed-by: Roi Dayan 

Hi,

this looks good to me but could you also update tc/m_tunnel_key.c:usage(); ?

With that change:

Reviewed-by: Simon Horman 


[PATCH net 2/3] dpaa_eth: remove redundant dependency on FSL_SOC

2016-12-15 Thread Madalin Bucur
Signed-off-by: Madalin Bucur 
---
 drivers/net/ethernet/freescale/dpaa/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/Kconfig 
b/drivers/net/ethernet/freescale/dpaa/Kconfig
index f3a3454..a654736 100644
--- a/drivers/net/ethernet/freescale/dpaa/Kconfig
+++ b/drivers/net/ethernet/freescale/dpaa/Kconfig
@@ -1,6 +1,6 @@
 menuconfig FSL_DPAA_ETH
tristate "DPAA Ethernet"
-   depends on FSL_SOC && FSL_DPAA && FSL_FMAN
+   depends on FSL_DPAA && FSL_FMAN
select PHYLIB
select FSL_FMAN_MAC
---help---
-- 
2.1.0



Re: [PATCH iproute2 1/2] tc: flower: Fix typo in the flower man page

2016-12-15 Thread Simon Horman
On Tue, Dec 13, 2016 at 07:33:51AM +0200, Roi Dayan wrote:
> Replace vlan_eth_type with vlan_ethtype.
> 
> Fixes: 745d91726006 ("tc: flower: Introduce vlan support")
> Signed-off-by: Roi Dayan 
> Reviewed-by: Hadar Hen Zion 

Reviewed-by: Simon Horman 


[PATCH net 1/4] fsl/fman: fix 1G support for QSGMII interfaces

2016-12-15 Thread Madalin Bucur
QSGMII ports were not advertising 1G speed.

Signed-off-by: Madalin Bucur 
Reviewed-by: Camelia Groza 
---
 drivers/net/ethernet/freescale/fman/mac.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/freescale/fman/mac.c 
b/drivers/net/ethernet/freescale/fman/mac.c
index 69ca42c..0b31f85 100644
--- a/drivers/net/ethernet/freescale/fman/mac.c
+++ b/drivers/net/ethernet/freescale/fman/mac.c
@@ -594,6 +594,7 @@ static const u16 phy2speed[] = {
[PHY_INTERFACE_MODE_RGMII_RXID] = SPEED_1000,
[PHY_INTERFACE_MODE_RGMII_TXID] = SPEED_1000,
[PHY_INTERFACE_MODE_RTBI]   = SPEED_1000,
+   [PHY_INTERFACE_MODE_QSGMII] = SPEED_1000,
[PHY_INTERFACE_MODE_XGMII]  = SPEED_1
 };
 
-- 
2.1.0



[PATCH net 4/4] fsl/fman: enable compilation on ARM64

2016-12-15 Thread Madalin Bucur
Signed-off-by: Madalin Bucur 
---
 drivers/net/ethernet/freescale/fman/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fman/Kconfig 
b/drivers/net/ethernet/freescale/fman/Kconfig
index 79b7c84..dc0850b 100644
--- a/drivers/net/ethernet/freescale/fman/Kconfig
+++ b/drivers/net/ethernet/freescale/fman/Kconfig
@@ -1,6 +1,6 @@
 config FSL_FMAN
tristate "FMan support"
-   depends on FSL_SOC || COMPILE_TEST
+   depends on FSL_SOC || ARCH_LAYERSCAPE || COMPILE_TEST
select GENERIC_ALLOCATOR
select PHYLIB
default n
-- 
2.1.0



[PATCH net 2/4] fsl/fman: arm: call of_platform_populate() for arm64 platform

2016-12-15 Thread Madalin Bucur
From: Igal Liberman 

Signed-off-by: Igal Liberman 
---
 drivers/net/ethernet/freescale/fman/fman.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/freescale/fman/fman.c 
b/drivers/net/ethernet/freescale/fman/fman.c
index dafd9e1..f36b4eb 100644
--- a/drivers/net/ethernet/freescale/fman/fman.c
+++ b/drivers/net/ethernet/freescale/fman/fman.c
@@ -2868,6 +2868,16 @@ static struct fman *read_dts_node(struct platform_device 
*of_dev)
 
fman->dev = &of_dev->dev;
 
+#ifdef CONFIG_ARM64
+   /* call of_platform_populate in order to probe sub-nodes on arm64 */
+   err = of_platform_populate(fm_node, NULL, NULL, &of_dev->dev);
+   if (err) {
+   dev_err(&of_dev->dev, "%s: of_platform_populate() failed\n",
+   __func__);
+   goto fman_free;
+   }
+#endif
+
return fman;
 
 fman_node_put:
-- 
2.1.0



[PATCH net 0/3] dpaa_eth: a couple of fixes

2016-12-15 Thread Madalin Bucur
This patch set introduces big endian accessors in the dpaa_eth driver,
making sure accesses to the QBMan HW are correct on little endian
platforms. It also removes a redundant Kconfig dependency on FSL_SOC
and adds myself as maintainer of the dpaa_eth driver.

Claudiu Manoil (1):
  dpaa_eth: use big endian accessors

Madalin Bucur (2):
  dpaa_eth: remove redundant dependency on FSL_SOC
  MAINTAINERS: net: add entry for Freescale QorIQ DPAA Ethernet driver

 MAINTAINERS|  6 +++
 drivers/net/ethernet/freescale/dpaa/Kconfig|  2 +-
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 71 ++
 3 files changed, 44 insertions(+), 35 deletions(-)

-- 
2.1.0
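
To illustrate why the accessors matter on little endian hosts, here is a small user-space sketch. It uses htons() from <arpa/inet.h> as a stand-in for the kernel's cpu_to_be16(); the struct demo_cmd and the DEMO_* flag values are hypothetical and do not correspond to the real QMan definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>	/* htons(): host to big endian, 16 bit */

/* Hypothetical flag values, chosen only for illustration. */
#define DEMO_WE_FLAG_A 0x0001
#define DEMO_WE_FLAG_B 0x0200

/* Hypothetical command layout: a 16-bit mask the device reads as
 * big endian. */
struct demo_cmd {
	uint16_t we_mask;	/* stored in big-endian byte order */
};

/* Set a flag in the mask. Converting the flag and then OR-ing gives
 * the same bytes as OR-ing first and converting afterwards, because a
 * byte swap only moves bits around. */
static void demo_set_flag(struct demo_cmd *cmd, uint16_t flag)
{
	assert(cmd != NULL);
	cmd->we_mask |= htons(flag);
}

/* Copy out the two bytes in the order the device would see them. */
static void demo_wire_bytes(const struct demo_cmd *cmd, uint8_t out[2])
{
	memcpy(out, &cmd->we_mask, sizeof(cmd->we_mask));
}
```

After setting both flags, the bytes in memory are 0x02 followed by 0x01 on any host. Without the conversion a little endian host would store 0x01 followed by 0x02, which a big endian device would read as a different mask.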



Re: [PATCH] net: ipv4: tcp_offload: check segs for NULL

2016-12-15 Thread Tobias Klauser
On 2016-12-15 at 09:47:41 +0100, shakya@samsung.com 
 wrote:
> From: Shakya Sundar Das 
> 
> This patch will check segs for being NULL in tcp_gso_segment()
> before calling skb_shinfo(segs) from skb_is_gso(segs), otherwise
> kernel can run into a NULL-pointer dereference.

How can segs ever be NULL here? skb_segment() will always either return
an skb or an ERR_PTR(err).
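
The distinction between a NULL return and an ERR_PTR() return can be sketched in user space as follows. This is a simplified model of the kernel's error-pointer convention, written for illustration; the demo_* names are hypothetical and the code is not copied from include/linux/err.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value that is encoded in a pointer in this sketch. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer in the top 4095 bytes of
 * the address space, where no valid object is expected to live. */
static void *demo_err_ptr(long err)
{
	assert(err < 0 && err >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* True only for pointers in the reserved error range. NULL is not in
 * that range, so it is not reported as an error. */
static int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

/* Recover the errno value from an error pointer. */
static long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

Under this convention a function that reports failure through an error pointer never needs to return NULL, and a NULL check on its result would not catch the failure case either; the two checks test for different things.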


[PATCH net 3/3] MAINTAINERS: net: add entry for Freescale QorIQ DPAA Ethernet driver

2016-12-15 Thread Madalin Bucur
Add a record for the Freescale QorIQ DPAA Ethernet driver, adding myself
as maintainer.

Signed-off-by: Madalin Bucur 
---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e2463ba..0ff9757 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5058,6 +5058,12 @@ S:   Maintained
 F: drivers/net/ethernet/freescale/fman
 F: Documentation/devicetree/bindings/powerpc/fsl/fman.txt
 
+FREESCALE QORIQ DPAA ETHERNET DRIVER
+M: Madalin Bucur 
+L: netdev@vger.kernel.org
+S: Maintained
+F: drivers/net/ethernet/freescale/dpaa
+
 FREESCALE QUICC ENGINE LIBRARY
 L: linuxppc-...@lists.ozlabs.org
 S: Orphan
-- 
2.1.0



[PATCH net 1/3] dpaa_eth: use big endian accessors

2016-12-15 Thread Madalin Bucur
From: Claudiu Manoil 

Ensure correct access to the big endian QMan HW through proper
accessors.

Signed-off-by: Claudiu Manoil 
Signed-off-by: Madalin Bucur 
---
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 71 ++
 1 file changed, 37 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c 
b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index 3c48a84..624ba90 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -733,7 +733,7 @@ static int dpaa_eth_cgr_init(struct dpaa_priv *priv)
priv->cgr_data.cgr.cb = dpaa_eth_cgscn;
 
/* Enable Congestion State Change Notifications and CS taildrop */
-   initcgr.we_mask = QM_CGR_WE_CSCN_EN | QM_CGR_WE_CS_THRES;
+   initcgr.we_mask = cpu_to_be16(QM_CGR_WE_CSCN_EN | QM_CGR_WE_CS_THRES);
initcgr.cgr.cscn_en = QM_CGR_EN;
 
/* Set different thresholds based on the MAC speed.
@@ -747,7 +747,7 @@ static int dpaa_eth_cgr_init(struct dpaa_priv *priv)
cs_th = DPAA_CS_THRESHOLD_1G;
qm_cgr_cs_thres_set64(&initcgr.cgr.cs_thres, cs_th, 1);
 
-   initcgr.we_mask |= QM_CGR_WE_CSTD_EN;
+   initcgr.we_mask |= cpu_to_be16(QM_CGR_WE_CSTD_EN);
initcgr.cgr.cstd_en = QM_CGR_EN;
 
err = qman_create_cgr(&priv->cgr_data.cgr, QMAN_CGR_FLAG_USE_INIT,
@@ -896,18 +896,18 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool 
td_enable)
if (dpaa_fq->init) {
memset(&initfq, 0, sizeof(initfq));
 
-   initfq.we_mask = QM_INITFQ_WE_FQCTRL;
+   initfq.we_mask = cpu_to_be16(QM_INITFQ_WE_FQCTRL);
/* Note: we may get to keep an empty FQ in cache */
-   initfq.fqd.fq_ctrl = QM_FQCTRL_PREFERINCACHE;
+   initfq.fqd.fq_ctrl = cpu_to_be16(QM_FQCTRL_PREFERINCACHE);
 
/* Try to reduce the number of portal interrupts for
 * Tx Confirmation FQs.
 */
if (dpaa_fq->fq_type == FQ_TYPE_TX_CONFIRM)
-   initfq.fqd.fq_ctrl |= QM_FQCTRL_HOLDACTIVE;
+   initfq.fqd.fq_ctrl |= cpu_to_be16(QM_FQCTRL_HOLDACTIVE);
 
/* FQ placement */
-   initfq.we_mask |= QM_INITFQ_WE_DESTWQ;
+   initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_DESTWQ);
 
qm_fqd_set_destwq(&initfq.fqd, dpaa_fq->channel, dpaa_fq->wq);
 
@@ -920,8 +920,8 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool 
td_enable)
if (dpaa_fq->fq_type == FQ_TYPE_TX ||
dpaa_fq->fq_type == FQ_TYPE_TX_CONFIRM ||
dpaa_fq->fq_type == FQ_TYPE_TX_CONF_MQ) {
-   initfq.we_mask |= QM_INITFQ_WE_CGID;
-   initfq.fqd.fq_ctrl |= QM_FQCTRL_CGE;
+   initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_CGID);
+   initfq.fqd.fq_ctrl |= cpu_to_be16(QM_FQCTRL_CGE);
initfq.fqd.cgid = (u8)priv->cgr_data.cgr.cgrid;
/* Set a fixed overhead accounting, in an attempt to
 * reduce the impact of fixed-size skb shells and the
@@ -932,7 +932,7 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool 
td_enable)
 * insufficient value, but even that is better than
 * no overhead accounting at all.
 */
-   initfq.we_mask |= QM_INITFQ_WE_OAC;
+   initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_OAC);
qm_fqd_set_oac(&initfq.fqd, QM_OAC_CG);
qm_fqd_set_oal(&initfq.fqd,
   min(sizeof(struct sk_buff) +
@@ -941,9 +941,9 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool 
td_enable)
}
 
if (td_enable) {
-   initfq.we_mask |= QM_INITFQ_WE_TDTHRESH;
+   initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_TDTHRESH);
qm_fqd_set_taildrop(&initfq.fqd, DPAA_FQ_TD, 1);
-   initfq.fqd.fq_ctrl = QM_FQCTRL_TDE;
+   initfq.fqd.fq_ctrl = cpu_to_be16(QM_FQCTRL_TDE);
}
 
if (dpaa_fq->fq_type == FQ_TYPE_TX) {
@@ -951,7 +951,8 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool 
td_enable)
if (queue_id >= 0)
confq = priv->conf_fqs[queue_id];
if (confq) {
-   initfq.we_mask |= QM_INITFQ_WE_CONTEXTA;
+   initfq.we_mask |=
+   cpu_to_be16(QM_INITFQ_WE_CONTEXTA);
/* ContextA: OVOM=1(use contextA2 bits instead of ICAD)
 *   A2V=1 (contextA A2 field is valid)
 *   A0V=1 (contextA 

Re: [PATCH net 2/2] net/sched: cls_flower: Use masked key when calling HW offloads

2016-12-15 Thread Simon Horman
Hi Paul,

On Wed, Dec 14, 2016 at 07:00:58PM +0200, Paul Blakey wrote:
> Zero bits on the mask signify a "don't care" on the corresponding bits
> in key. Some HWs require those bits on the key to be zero. Since these
> bits are masked anyway, it's okay to provide the masked key to all
> drivers.
> 
> Fixes: 5b33f48842fa ('net/flower: Introduce hardware offload support')
> Signed-off-by: Paul Blakey 
> Reviewed-by: Roi Dayan 
> Acked-by: Jiri Pirko 

While I don't have a specific use case in mind that this change would break,
it seems to me that it would be better to handle hardware requirements
at the driver level.

> ---
>  net/sched/cls_flower.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index 9758f5a..35ac28d 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -252,7 +252,7 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
>   offload.cookie = (unsigned long)f;
>   offload.dissector = dissector;
>   offload.mask = mask;
> - offload.key = &f->key;
> + offload.key = &f->mkey;
>   offload.exts = &f->exts;
>  
>   tc->type = TC_SETUP_CLSFLOWER;
> -- 
> 1.8.3.1
> 
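
For reference, a masked key in the sense used by the quoted commit message is the key with every "don't care" bit cleared. A minimal sketch, assuming the key and mask are plain byte arrays of equal length, follows; demo_mask_key is a hypothetical helper and is not the function cls_flower uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a masked key by clearing every key bit
 * whose mask bit is zero. All three buffers are `len` bytes long. */
static void demo_mask_key(uint8_t *mkey, const uint8_t *key,
			  const uint8_t *mask, size_t len)
{
	size_t i;

	assert(mkey != NULL && key != NULL && mask != NULL);
	for (i = 0; i < len; i++)
		mkey[i] = key[i] & mask[i];
}
```

For example, the key c0 a8 01 17 with the mask ff ff ff 00 gives the masked key c0 a8 01 00. Matching on the masked key with the same mask selects the same packets as matching on the original key, which is why the commit message calls the change safe for all drivers.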


Re: [PATCH iproute2 2/2] tc/m_tunnel_key: Add dest UDP port to tunnel key action

2016-12-15 Thread Simon Horman
On Thu, Dec 15, 2016 at 02:03:36PM +0100, Simon Horman wrote:
> On Tue, Dec 13, 2016 at 10:07:47AM +0200, Hadar Hen Zion wrote:
> > Enhance tunnel key action parameters by adding destination UDP port.
> > 
> > Signed-off-by: Hadar Hen Zion 
> > Reviewed-by: Roi Dayan 
> 
> Hi,
> 
> this looks good to me but could you also update tc/m_tunnel_key.c:usage(); ?

It seems that I was a bit hasty here as I now see that Stephen has
indicated that he has applied this series. I also notice that
patch 1/2 of this series also misses updating usage(). Let me know
if sending some follow-up patches is the best way forwards.


RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Hannes Frederic Sowa
> Sent: 15 December 2016 12:50
> On 15.12.2016 13:28, David Laight wrote:
> > From: Hannes Frederic Sowa
> >> Sent: 15 December 2016 12:23
> > ...
> >> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
> >> bytes on 32 bit. Do you question that?
> >
> > Yes.
> >
> > The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc).
> 
> Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I
> am actually not sure if the ABI would say anything about that (sorry
> also for my wrong statement above).
> 
> Alignment requirement of unsigned long long on gcc with -m32 actually
> seem to be 8.

It depends on the architecture.
For x86 it is definitely 4.
It might be 8 for sparc, ppc and/or alpha.

David
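
The disagreement above comes down to which alignment the compiler applies to a 64-bit struct member, which may differ from what it reports for the bare type. A minimal C sketch for checking this on a given target follows. The names struct pair, ll_member_alignment() and pair_size() are made up for illustration. On i386 one would expect 4 and 12, and on x86-64 8 and 16; those numbers are stated from memory of the respective ABIs, not measured here.

```c
#include <assert.h>
#include <stddef.h>

/* A long followed by a long long, so padding before b exposes alignment. */
struct pair {
	long a;
	long long b;
};

/*
 * Alignment the ABI applies to a long long inside a struct: the member
 * lands at the first offset after the leading char that satisfies it.
 */
static size_t ll_member_alignment(void)
{
	struct probe {
		char c;
		long long v;
	};
	size_t off = offsetof(struct probe, v);

	/* An alignment never exceeds the size of the type. */
	assert(off > 0 && off <= sizeof(long long));
	return off;
}

/* Total size of the pair, including any padding before or after b. */
static size_t pair_size(void)
{
	return sizeof(struct pair);
}
```

Building this once with -m32 and once without, and printing both values, shows whether a u64 member can sit on a 4-byte boundary on the target in question.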



[PATCH net] ixgbe: update the rss key on h/w, when ethtool ask for it.

2016-12-15 Thread Paolo Abeni
Currently ixgbe_set_rxfh() updates the rss_key copy in the driver
memory, but does not push the new value into the h/w. This commit
adds a new helper for the latter operation and calls it in
ixgbe_set_rxfh(), so that the h/w rss key value can actually be
updated via ethtool.

Signed-off-by: Paolo Abeni 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |  4 +++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 19 ++++++++++++++++---
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index ef81c3d..8fb9fbf 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -1026,6 +1026,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
  struct ixgbe_adapter *adapter,
  struct ixgbe_ring *tx_ring);
 u32 ixgbe_rss_indir_tbl_entries(struct ixgbe_adapter *adapter);
+void ixgbe_store_key(struct ixgbe_adapter *adapter);
 void ixgbe_store_reta(struct ixgbe_adapter *adapter);
 s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
   u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index fd192bf..e40f9ce 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -3003,8 +3003,10 @@ static int ixgbe_set_rxfh(struct net_device *netdev, const u32 *indir,
}
 
/* Fill out the rss hash key */
-   if (key)
+   if (key) {
memcpy(adapter->rss_key, key, ixgbe_get_rxfh_key_size(netdev));
+   ixgbe_store_key(adapter);
+   }
 
ixgbe_store_reta(adapter);
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 1e2f39e..0c23ab8 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3411,6 +3411,21 @@ u32 ixgbe_rss_indir_tbl_entries(struct ixgbe_adapter *adapter)
 }
 
 /**
+ * ixgbe_store_key - Write the RSS key to HW
+ * @adapter: device handle
+ *
+ * Write the RSS key stored in adapter.rss_key to HW.
+ */
+void ixgbe_store_key(struct ixgbe_adapter *adapter)
+{
+   struct ixgbe_hw *hw = &adapter->hw;
+   int i;
+
+   for (i = 0; i < 10; i++)
+   IXGBE_WRITE_REG(hw, IXGBE_RSSRK(i), adapter->rss_key[i]);
+}
+
+/**
  * ixgbe_store_reta - Write the RETA table to HW
  * @adapter: device handle
  *
@@ -3475,7 +3490,6 @@ static void ixgbe_store_vfreta(struct ixgbe_adapter *adapter)
 
 static void ixgbe_setup_reta(struct ixgbe_adapter *adapter)
 {
-   struct ixgbe_hw *hw = &adapter->hw;
u32 i, j;
u32 reta_entries = ixgbe_rss_indir_tbl_entries(adapter);
u16 rss_i = adapter->ring_feature[RING_F_RSS].indices;
@@ -3488,8 +3502,7 @@ static void ixgbe_setup_reta(struct ixgbe_adapter *adapter)
rss_i = 4;
 
/* Fill out hash function seeds */
-   for (i = 0; i < 10; i++)
-   IXGBE_WRITE_REG(hw, IXGBE_RSSRK(i), adapter->rss_key[i]);
+   ixgbe_store_key(adapter);
 
/* Fill out redirection table */
memset(adapter->rss_indir_tbl, 0, sizeof(adapter->rss_indir_tbl));
-- 
1.8.3.1
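
The shape of the fix above (copy the new key into driver memory, then push that copy to the device through one shared helper) can be sketched outside the kernel. This is only an illustration under stated assumptions: struct mock_adapter, hw_rssrk[], mock_store_key() and mock_set_rxfh() are hypothetical stand-ins, and a plain array takes the place of the IXGBE_RSSRK registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MOCK_RSS_KEY_WORDS 10	/* same loop bound as in the patch */

struct mock_adapter {
	uint32_t rss_key[MOCK_RSS_KEY_WORDS];	/* driver-side copy */
	uint32_t hw_rssrk[MOCK_RSS_KEY_WORDS];	/* stand-in for HW registers */
};

/* Push the driver-side key to the (mock) hardware registers. */
static void mock_store_key(struct mock_adapter *a)
{
	int i;

	assert(a != NULL);
	for (i = 0; i < MOCK_RSS_KEY_WORDS; i++)
		a->hw_rssrk[i] = a->rss_key[i];
}

/* Update the key only when one is supplied, then sync it to hardware. */
static void mock_set_rxfh(struct mock_adapter *a, const uint32_t *key)
{
	if (key) {
		memcpy(a->rss_key, key, sizeof(a->rss_key));
		mock_store_key(a);
	}
}
```

Keeping the register write in a single helper means the setup path and the ethtool path cannot drift apart, which appears to be the motivation for factoring it out of ixgbe_setup_reta().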



Re: [PATCHv3 perf/core 0/7] Reuse libbpf from samples/bpf

2016-12-15 Thread Arnaldo Carvalho de Melo
Em Wed, Dec 14, 2016 at 02:46:23PM -0800, Joe Stringer escreveu:
> On 14 December 2016 at 06:55, Arnaldo Carvalho de Melo  
> wrote:
> > So, Joe, can you try refreshing this work, starting from what I have in
> > perf/core? It has the changes coming from net-next that Daniel warned us 
> > about
> > and some more.
 
> I've just respun this series based on the version you previously
> applied to perf/core. Since bpf_prog_{attach,detach}() were added to
> samples/libbpf, a new patch will shift these over to tools/lib/bpf.
> Other than that, I folded "samples/bpf: Drop unnecessary build
> targets." back into "samples/bpf: Switch over to libbpf", and I
> noticed that there were a couple of unnecessary log buffers with the
> latest changes. For any new sample programs, those were fixed up to
> use libbpf as well.
 
> Don't forget to do a "make headers_install" before attempting to build
> the samples, access to the latest headers is required (as per the
> readme in samples/bpf).

Ah, README, I should read that ;-)

I got used to how tools/perf/ works, i.e. it is self-sufficient with
respect to in-flux stuff in the kernel: headers that are related to
features it supports and that are under constant improvement, such as
eBPF, kvm, syscall tables, etc.

Anyway, will do the headers_install step inside a container, to avoid
polluting my workstation.

Thanks for doing the respin and for the clarifications about building
samples/bpf/.

- Arnaldo


[PATCH net 3/4] fsl/fman: A007273 only applies to PPC SoCs

2016-12-15 Thread Madalin Bucur
Signed-off-by: Madalin Bucur 
Reviewed-by: Camelia Groza 
---
 drivers/net/ethernet/freescale/fman/fman.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/freescale/fman/fman.c b/drivers/net/ethernet/freescale/fman/fman.c
index f36b4eb..93d6a36 100644
--- a/drivers/net/ethernet/freescale/fman/fman.c
+++ b/drivers/net/ethernet/freescale/fman/fman.c
@@ -1890,6 +1890,7 @@ static int fman_reset(struct fman *fman)
 
goto _return;
} else {
+#ifdef CONFIG_PPC
struct device_node *guts_node;
struct ccsr_guts __iomem *guts_regs;
u32 devdisr2, reg;
@@ -1921,6 +1922,7 @@ static int fman_reset(struct fman *fman)
 
/* Enable all MACs */
iowrite32be(reg, &guts_regs->devdisr2);
+#endif
 
/* Perform FMan reset */
iowrite32be(FPM_RSTC_FM_RESET, &fman->fpm_regs->fm_rstc);
@@ -1932,25 +1934,31 @@ static int fman_reset(struct fman *fman)
} while (((ioread32be(&fman->fpm_regs->fm_rstc)) &
 FPM_RSTC_FM_RESET) && --count);
if (count == 0) {
+#ifdef CONFIG_PPC
iounmap(guts_regs);
of_node_put(guts_node);
+#endif
err = -EBUSY;
goto _return;
}
+#ifdef CONFIG_PPC
 
/* Restore devdisr2 value */
iowrite32be(devdisr2, &guts_regs->devdisr2);
 
iounmap(guts_regs);
of_node_put(guts_node);
+#endif
 
goto _return;
 
+#ifdef CONFIG_PPC
 guts_regs:
of_node_put(guts_node);
 guts_node:
dev_dbg(fman->dev, "%s: Didn't perform FManV3 reset due to 
Errata A007273!\n",
__func__);
+#endif
}
 _return:
return err;
-- 
2.1.0



[PATCH net 0/4] fsl/fman: fixes for ARM

2016-12-15 Thread Madalin Bucur
The patch set fixes advertised speeds for QSGMII interfaces, disables
A007273 erratum workaround on non-PowerPC platforms where it does not
apply, enables compilation on ARM64 and addresses a probing issue on
ARM64.

Igal Liberman (1):
  fsl/fman: arm: call of_platform_populate() for arm64 platfrom

Madalin Bucur (3):
  fsl/fman: fix 1G support for QSGMII interfaces
  fsl/fman: A007273 only applies to PPC SoCs
  fsl/fman: enable compilation on ARM64

 drivers/net/ethernet/freescale/fman/Kconfig |  2 +-
 drivers/net/ethernet/freescale/fman/fman.c  | 18 ++++++++++++++++++
 drivers/net/ethernet/freescale/fman/mac.c   |  1 +
 3 files changed, 20 insertions(+), 1 deletion(-)

-- 
2.1.0



Re: [RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)

2016-12-15 Thread ira.weiny
On Thu, Dec 15, 2016 at 11:12:26AM +0200, Leon Romanovsky wrote:
> On Wed, Dec 14, 2016 at 11:59:32PM -0800, Vishwanathapura, Niranjana wrote:
> > Thanks Jason for the valuable feedback.
> > Here is the revised HFI VNIC patch series.
> >
> > ChangeLog:
> > =
> > v1 => v2:
> > a) Removed hfi_vnic bus, instead make hfi_vnic driver an 'ib client',
> >as per feedback from Jason Gunthorpe.
> > b) Interface changes, data structure changes and variable name changes
> >associated with (a).
> > c) Add hfi_ibdev abstraction to provide VNIC control operations to
> >hfi_vnic client.
> > d) Minor fixes
> > e) Moved hfi_vnic driver from .../sw/intel/vnic/hfi_vnic to
> >.../sw/intel/hfi_vnic.
> 
> To put it in perspective, Jason asked you to do a different thing.
> http://marc.info/?l=linux-rdma&m=147977108302151&w=2
> http://marc.info/?l=linux-rdma&m=148000415401842&w=2
> 
> And Christoph,
> http://marc.info/?l=linux-rdma&m=147985587425861&w=2

Understood.  However, we never heard back on Niranjana's analysis of the code,
which stated that > 60% of the code was dealing with the OPA MADs used to
configure this device.

https://www.spinics.net/lists/linux-rdma/msg43579.html

Furthermore, neither Dave nor Doug has had time to weigh in on what we should
do.

So before we make that change we wanted to get consensus on using the
hfi1_ibdev abstraction rather than the bus.  This was the _real_ technical
change.

Beyond that it is really just which maintainer wants this driver.  To that end
I've also cc'ed Jeff Kirsher who maintains drivers/net/ethernet/intel.  Perhaps
Dave would like the driver to go through that tree?


I think there are pros and cons to both subtrees and in the end we will do
whatever is decided.

For maintainer review:

1) The driver encapsulates ethernet packets with OPA headers

2) VNIC uses OPA management packets (MADs) for its configuration

3) A significant portion (> 60%) of the code is specific to OPA

https://www.spinics.net/lists/linux-rdma/msg43579.html

4) The driver is from Intel and we expect Intel to be the primary
   contributor to the code.

5) The driver, like hfi1, is dual licensed (GPL/BSD)

6) Based on Christoph's feedback we will be adding device capability
   bits to the IB core to indicate HFI VNIC support.

https://www.spinics.net/lists/linux-rdma/msg44113.html


Doug, Dave, Jeff any thoughts?

Ira



[PATCH net] sctp: sctp_epaddr_lookup_transport should be protected by rcu_read_lock

2016-12-15 Thread Xin Long
Since commit 7fda702f9315 ("sctp: use new rhlist interface on sctp transport
rhashtable"), sctp has changed to use rhlist_lookup to look up transport, but
rhlist_lookup doesn't call rcu_read_lock inside, unlike rhashtable_lookup_fast.

It is called in sctp_epaddr_lookup_transport and sctp_addrs_lookup_transport.
sctp_addrs_lookup_transport is always in the protection of rcu_read_lock(),
as __sctp_lookup_association is called in rx path or sctp_lookup_association
which are in the protection of rcu_read_lock() already.

But sctp_epaddr_lookup_transport is called by sctp_endpoint_lookup_assoc, which
doesn't call rcu_read_lock; this may cause a "suspicious rcu_dereference_check
usage" warning in __rhashtable_lookup.

This patch is to fix it by adding rcu_read_lock in sctp_endpoint_lookup_assoc
before calling sctp_epaddr_lookup_transport.

Fixes: 7fda702f9315 ("sctp: use new rhlist interface on sctp transport rhashtable")
Reported-by: Dmitry Vyukov 
Signed-off-by: Xin Long 
---
 net/sctp/endpointola.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
index 1f03065..410ddc1 100644
--- a/net/sctp/endpointola.c
+++ b/net/sctp/endpointola.c
@@ -331,7 +331,9 @@ struct sctp_association *sctp_endpoint_lookup_assoc(
 * on this endpoint.
 */
if (!ep->base.bind_addr.port)
-   goto out;
+   return NULL;
+
+   rcu_read_lock();
t = sctp_epaddr_lookup_transport(ep, paddr);
if (!t)
goto out;
@@ -339,6 +341,7 @@ struct sctp_association *sctp_endpoint_lookup_assoc(
*transport = t;
asoc = t->asoc;
 out:
+   rcu_read_unlock();
return asoc;
 }
 
-- 
2.1.0



Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread Hannes Frederic Sowa
On 15.12.2016 14:56, David Laight wrote:
> From: Hannes Frederic Sowa
>> Sent: 15 December 2016 12:50
>> On 15.12.2016 13:28, David Laight wrote:
>>> From: Hannes Frederic Sowa
>>>> Sent: 15 December 2016 12:23
>>> ...
>>>> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
>>>> bytes on 32 bit. Do you question that?
>>>
>>> Yes.
>>>
>>> The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc).
>>
>> Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I
>> am actually not sure if the ABI would say anything about that (sorry
>> also for my wrong statement above).
>>
>> Alignment requirement of unsigned long long on gcc with -m32 actually
>> seem to be 8.
> 
> It depends on the architecture.
> For x86 it is definitely 4.

May I ask for a reference? I couldn't see unsigned long long being
mentioned in the ia32 abi spec that I found. I agree that those accesses
might be synthetically assembled by gcc and for me the alignment of 4
would have seemed natural. But my gcc at least in 32 bit mode disagrees
with that.

> It might be 8 for sparc, ppc and/or alpha.

This is something to find out...

Right now ipv6 addresses have an alignment of 4. So we couldn't even
naturally pass them to siphash but would need to copy them around, which
I feel like a source of bugs.

Bye,
Hannes



Re: [v3] net: ethernet: cavium: octeon: octeon_mgmt: Handle return NULL error from devm_ioremap

2016-12-15 Thread arvind Yadav

Hi David,

I have not tested this feature.  I have built it and flashed it onto the hardware.
You can check the commits below, which add a similar check for ioremap.
1- Commit id - de9e397e40f56b9f34af4bf6a5bd7a75ea02456c
  In 'drivers/net/phy/mdio-octeon.c'

2- Commit id - 592569de4c247fe4f25db8369dc0c63860f9560b
  In 'drivers/gpio/gpio-octeon.c'

Thanks
Arvind

On Thursday 15 December 2016 12:58 AM, David Daney wrote:

On 12/14/2016 11:03 AM, Arvind Yadav wrote:

Here, if devm_ioremap fails, it returns NULL, and the
kernel can run into a NULL pointer dereference.
This error check avoids the NULL pointer dereference.


I have asked you twice already this question, but could not determine 
from your response what the answer is:


Q: Have you tested the patch on OCTEON based hardware that contains 
the "octeon_mgmt" Ethernet ports?  Please answer either "yes" or "no".



Thanks,
David Daney




Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index 4ab404f..33c2fec 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -1479,6 +1479,12 @@ static int octeon_mgmt_probe(struct platform_device *pdev)

 p->agl = (u64)devm_ioremap(&pdev->dev, p->agl_phys, p->agl_size);
 p->agl_prt_ctl = (u64)devm_ioremap(&pdev->dev, p->agl_prt_ctl_phys,
p->agl_prt_ctl_size);
+if (!p->mix || !p->agl || !p->agl_prt_ctl) {
+dev_err(&pdev->dev, "failed to map I/O memory\n");
+result = -ENOMEM;
+goto err;
+}
+
 spin_lock_init(&p->lock);

 skb_queue_head_init(&p->tx_list);







Re: [PATCH net 2/2] net/sched: cls_flower: Use masked key when calling HW offloads

2016-12-15 Thread Jiri Pirko
Thu, Dec 15, 2016 at 02:50:44PM CET, simon.hor...@netronome.com wrote:
>Hi Paul,
>
>On Wed, Dec 14, 2016 at 07:00:58PM +0200, Paul Blakey wrote:
>> Zero bits on the mask signify a "don't care" on the corresponding bits
>> in key. Some HWs require those bits on the key to be zero. Since these
>> bits are masked anyway, it's okay to provide the masked key to all
>> drivers.
>> 
>> Fixes: 5b33f48842fa ('net/flower: Introduce hardware offload support')
>> Signed-off-by: Paul Blakey 
>> Reviewed-by: Roi Dayan 
>> Acked-by: Jiri Pirko 
>
>While I don't have a specific use case in mind that this change would break
>it seems to me that it would be better to handle hardware requirements
>at the driver level.

Even so, it makes no sense to pass the unmasked key down. It is only
confusing. This patch fixes it.


>
>> ---
>>  net/sched/cls_flower.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>> index 9758f5a..35ac28d 100644
>> --- a/net/sched/cls_flower.c
>> +++ b/net/sched/cls_flower.c
>> @@ -252,7 +252,7 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
>>  offload.cookie = (unsigned long)f;
>>  offload.dissector = dissector;
>>  offload.mask = mask;
>> -offload.key = &f->key;
>> +offload.key = &f->mkey;
>>  offload.exts = &f->exts;
>>  
>>  tc->type = TC_SETUP_CLSFLOWER;
>> -- 
>> 1.8.3.1
>> 


[PATCH net] sctp: sctp_transport_lookup_process should rcu_read_unlock when transport is null

2016-12-15 Thread Xin Long
Prior to this patch, sctp_transport_lookup_process didn't rcu_read_unlock
when it failed to find a transport by sctp_addrs_lookup_transport.

This patch is to fix it by moving up rcu_read_unlock right before checking
transport and also to remove the out path.

Fixes: 1cceda784980 ("sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock")
Signed-off-by: Xin Long 
---
 net/sctp/socket.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index d5f4b4a..318c678 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4472,18 +4472,17 @@ int sctp_transport_lookup_process(int (*cb)(struct sctp_transport *, void *),
  const union sctp_addr *paddr, void *p)
 {
struct sctp_transport *transport;
-   int err = -ENOENT;
+   int err;
 
rcu_read_lock();
transport = sctp_addrs_lookup_transport(net, laddr, paddr);
+   rcu_read_unlock();
if (!transport)
-   goto out;
+   return -ENOENT;
 
-   rcu_read_unlock();
err = cb(transport, p);
sctp_transport_put(transport);
 
-out:
return err;
 }
 EXPORT_SYMBOL_GPL(sctp_transport_lookup_process);
-- 
2.1.0
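
The control flow the patch arrives at (take the read lock, look up, drop the lock on every path, then act on the result) can be modelled in plain C. The sketch below is not kernel code: a counter stands in for the RCU read-side section, and every mock_* name is hypothetical. It also ignores reference counting; the sctp_transport_put() call in the patch suggests the real lookup returns a held reference, which is what would make using the transport after the unlock safe.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_ENOENT 2	/* stand-in for the kernel's ENOENT */

static int mock_rcu_depth;	/* > 0 while inside a read-side section */

static void mock_rcu_read_lock(void)   { mock_rcu_depth++; }
static void mock_rcu_read_unlock(void) { mock_rcu_depth--; }

struct mock_transport {
	int id;
};

/* Linear search; must be called inside the mock read-side section. */
static struct mock_transport *mock_lookup(struct mock_transport *tbl,
					  size_t n, int id)
{
	size_t i;

	assert(mock_rcu_depth > 0);
	for (i = 0; i < n; i++)
		if (tbl[i].id == id)
			return &tbl[i];
	return NULL;
}

/* Example callback: counts how many times it was invoked. */
static int mock_count_cb(struct mock_transport *t, void *p)
{
	(void)t;
	*(int *)p += 1;
	return 0;
}

static int mock_lookup_process(int (*cb)(struct mock_transport *, void *),
			       struct mock_transport *tbl, size_t n,
			       int id, void *p)
{
	struct mock_transport *t;

	mock_rcu_read_lock();
	t = mock_lookup(tbl, n, id);
	mock_rcu_read_unlock();	/* released on found and not-found paths */
	if (!t)
		return -MOCK_ENOENT;

	return cb(t, p);
}
```

With the unlock placed directly after the lookup there is no exit path that can skip it, so the separate out label becomes unnecessary.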



Re: wl1251 & mac address & calibration data

2016-12-15 Thread Pali Rohár
On Thu Dec 15 09:18:44 2016 Kalle Valo  wrote:
> (Adding Luis because he has been working on request_firmware() lately)
> 
> Pali Rohár  writes:
> 
> > > > So no, there is no argument against... request_firmware() in
> > > > fallback mode with userspace helper is by design blocking and
> > > > waiting for userspace. But waiting for some change in DTS in
> > > > kernel is just nonsense.
> > > 
> > > I would just mark the wlan device with status = "disabled" and
> > > enable it in the overlay together with adding the NVS & MAC info.
> > 
> > So if you think that this solution make sense, we can wait what net 
> > wireless maintainers say about it...
> > 
> > For me it looks like that solution can be:
> > 
> > extending request_firmware() to use only userspace helper
> 
> I haven't followed the discussion very closely but this is my preference
> what drivers should do:
> 
> 1) First the driver should try to get the calibration data and mac
>       address from the device tree.
> 

Ok, but there is no (dynamic, device specific) data in DTS for N900. So 1) is
a no-op.

> 2) If they are not in DT the driver should retrieve the calibration data
>       with request_firmware(). BUT with an option for user space to
>       implement that with a helper script so that the data can be created
>       dynamically, which I believe openwrt does with ath10k calibration
>       data right now.

Currently there is a flag for request_firmware() saying that it should fall
back to the user helper if direct VFS access does not find the needed firmware.

But this flag is not suitable, as /lib/firmware already provides default (not
device specific) calibration data.

So I would suggest adding another flag/function which will primarily use the
user helper.

> > and load mac address also via request_firmware() either by appending
> > it   into NVS data or via separate call
> 
> I'm not really fan of the idea providing permanent mac address through
> request_firmware(). For example, how to handle multiple devices on the
> same host, would there be a need for some kind of bus ids encoded to the
> filename? And what about devices with multiple mac addresses?

For N900 there is only one wl1251 device. And wl12xx already uses a MAC
address appended to the calibration data read by request_firmware(). That is
why I prefer similar usage for wl1251 as well.

> I wish there would be a better way than request_firmware() to provide
> the permanent mac addresses from user space (if device tree is not
> available), I just don't know what that could be :) But if we would
> start to use request_firmware() for this at least there should be a
> wider consensus about that and it should be properly documented, just
> like the device tree bindings.
> 
> -- 
> Kalle Valo

I do not know of any other way, which is why I'm asking :-) and these are my
proposed solutions. If you (or anyone else) come up with a better one, we can
discuss it :-)

-- 
Pali Rohár
pali.ro...@gmail.com



Re: [PATCHv3 perf/core 0/7] Reuse libbpf from samples/bpf

2016-12-15 Thread Arnaldo Carvalho de Melo
Em Thu, Dec 15, 2016 at 11:33:29AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Dec 14, 2016 at 02:46:23PM -0800, Joe Stringer escreveu:
> > On 14 December 2016 at 06:55, Arnaldo Carvalho de Melo  
> > wrote:
> > > So, Joe, can you try refreshing this work, starting from what I have in
> > > perf/core? It has the changes coming from net-next that Daniel warned us 
> > > about
> > > and some more.
>  
> > I've just respun this series based on the version you previously
> > applied to perf/core. Since bpf_prog_{attach,detach}() were added to
> > samples/libbpf, a new patch will shift these over to tools/lib/bpf.
> > Other than that, I folded "samples/bpf: Drop unnecessary build
> > targets." back into "samples/bpf: Switch over to libbpf", and I
> > noticed that there were a couple of unnecessary log buffers with the
> > latest changes. For any new sample programs, those were fixed up to
> > use libbpf as well.
>  
> > Don't forget to do a "make headers_install" before attempting to build
> > the samples, access to the latest headers is required (as per the
> > readme in samples/bpf).
> 
> Ah, README, I should read that ;-)
> 
> I got used to how tools/perf/ works, i.e. it is self-sufficient with
> respect to in-flux stuff in the kernel: headers that are related to
> features it supports and that are under constant improvement, such as
> eBPF, kvm, syscall tables, etc.
> 
> Anyway, will do the headers_install step inside a container, to avoid
> polluting my workstation.

heh: should've read that file, now I did:


There are usually dependencies to header files of the current kernel.
To avoid installing devel kernel headers system wide, as a normal
user, simply call::

 make headers_install

This creates a local "usr/include" directory in the git/build top
level directory, which the make system automatically picks up first.

 
> Thanks for doing the respin and for the clarifications about building
> samples/bpf/.
> 
> - Arnaldo


RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Hannes Frederic Sowa
> Sent: 15 December 2016 14:57
> On 15.12.2016 14:56, David Laight wrote:
> > From: Hannes Frederic Sowa
> >> Sent: 15 December 2016 12:50
> >> On 15.12.2016 13:28, David Laight wrote:
> >>> From: Hannes Frederic Sowa
> >>>> Sent: 15 December 2016 12:23
> >>> ...
> >>>> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
> >>>> bytes on 32 bit. Do you question that?
> >>>
> >>> Yes.
> >>>
> >>> The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 
> >>> (etc).
> >>
> >> Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I
> >> am actually not sure if the ABI would say anything about that (sorry
> >> also for my wrong statement above).
> >>
> >> Alignment requirement of unsigned long long on gcc with -m32 actually
> >> seem to be 8.
> >
> > It depends on the architecture.
> > For x86 it is definitely 4.
> 
> May I ask for a reference?

Ask anyone who has had to do compatibility layers to support 32bit
binaries on 64bit systems.

> I couldn't see unsigned long long being
> mentioned in the ia32 abi spec that I found. I agree that those accesses
> might be synthetically assembled by gcc and for me the alignment of 4
> would have seemed natural. But my gcc at least in 32 bit mode disagrees
> with that.

Try (retyped):

echo 'struct { long a; long long b; } s; int bar(void) { return sizeof s; }' >foo.c
gcc [-m32] -O2 -S foo.c; cat foo.s

And look at what is generated.

> Right now ipv6 addresses have an alignment of 4. So we couldn't even
> naturally pass them to siphash but would need to copy them around, which
> I feel like a source of bugs.

That is more of a problem on systems that don't support misaligned accesses.
Reading the 64bit values with two explicit 32bit reads would work.
I think you can get gcc to do that by adding an aligned(4) attribute to the
structure member.

David



Re: [PATCH net 2/2] net/sched: cls_flower: Use masked key when calling HW offloads

2016-12-15 Thread Simon Horman
On Thu, Dec 15, 2016 at 04:12:05PM +0200, Or Gerlitz wrote:
> On 12/15/2016 3:50 PM, Simon Horman wrote:
> >>Zero bits on the mask signify a "don't care" on the corresponding bits
> >>in key. Some HWs require those bits on the key to be zero. Since these
> >>bits are masked anyway, it's okay to provide the masked key to all
> >>drivers.
> >>
> >>Fixes: 5b33f48842fa ('net/flower: Introduce hardware offload support')
> >>
> >While I don't have a specific use case in mind that this change would break
> >it seems to me that it would be better to handle hardware requirements
> >at the driver level.
> 
> Simon, again, since these bits are masked anyway, it would be correct to
> provide the masked key to the hw device.
> 
> E.g. no matter whether the flow key/mask provided to the HW device is
> 1.1.1.10/24 or 1.1.1.0/24, the user expects the same matching, so
> nothing can go wrong if we provide the latter to the driver.

> >While I don't have a specific use case in mind that this change would break
> >it seems to me that it would be better to handle hardware requirements
> >at the driver level.
> 
> Even so, it makes no sense to pass the unmasked key down. It is only
> confusing. This patch fixes it.

It seems somewhat arbitrary to me to allow such filters in software
but not pass them down to the driver layer. But I don't feel strongly
about this and I am happy for the patch to progress as-is.
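
The 1.1.1.10/24 versus 1.1.1.0/24 argument can be checked with a few lines of C. This is a sketch only: mask_key() is a hypothetical helper that applies a byte-wise AND, which is assumed here to be how a masked key is derived from a key and its mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clear every "don't care" bit: mkey = key & mask, byte by byte. */
static void mask_key(uint8_t *mkey, const uint8_t *key,
		     const uint8_t *mask, size_t len)
{
	size_t i;

	assert(mkey && key && mask);
	for (i = 0; i < len; i++)
		mkey[i] = key[i] & mask[i];
}
```

Applying a 255.255.255.0 mask to both 1.1.1.10 and 1.1.1.0 gives 1.1.1.0 in each case, so hardware that insists on zeroed don't-care bits and hardware that ignores them see the same match.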


Re: [PATCH perf/core REBASE 2/5] samples/bpf: Switch over to libbpf

2016-12-15 Thread Arnaldo Carvalho de Melo
Em Wed, Dec 14, 2016 at 02:43:39PM -0800, Joe Stringer escreveu:
> Now that libbpf under tools/lib/bpf/* is synced with the version from
> samples/bpf, we can get rid of most of the libbpf library here.
> 
> Signed-off-by: Joe Stringer 
> Cc: Alexei Starovoitov 
> Cc: Daniel Borkmann 
> Cc: Wang Nan 
> Link: http://lkml.kernel.org/r/20161209024620.31660-6-...@ovn.org
> [ Use -I$(srctree)/tools/lib/ to support out of source code tree builds, as 
> noticed by Wang Nan ]
> Signed-off-by: Arnaldo Carvalho de Melo 

So, right before this patch building samples/bpf works, then, after, it fails,
investigating:

[root@1e797fdfbf4f linux]# make -j4 O=/tmp/build/linux/ headers_install
make[1]: Entering directory '/tmp/build/linux'
  CHK include/generated/uapi/linux/version.h
make[1]: Leaving directory '/tmp/build/linux'
[root@1e797fdfbf4f linux]# make -j4 O=/tmp/build/linux/ samples/bpf/
make[1]: Entering directory '/tmp/build/linux'
  CHK include/config/kernel.release
  GEN ./Makefile
  CHK include/generated/uapi/linux/version.h
  Using /git/linux as source for kernel
  CHK include/generated/utsrelease.h
  CHK include/generated/timeconst.h
  CHK include/generated/bounds.h
  CHK include/generated/asm-offsets.h
  CALL    /git/linux/scripts/checksyscalls.sh
  HOSTCC  samples/bpf/test_lru_dist.o
  HOSTCC  samples/bpf/libbpf.o
  HOSTCC  samples/bpf/sock_example.o
  HOSTCC  samples/bpf/bpf_load.o
In file included from /git/linux/samples/bpf/libbpf.c:12:0:
/git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory
 #include <bpf/bpf.h>
 ^
compilation terminated.
In file included from /git/linux/samples/bpf/test_lru_dist.c:24:0:
/git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory
 #include <bpf/bpf.h>
 ^
compilation terminated.
make[2]: *** [scripts/Makefile.host:124: samples/bpf/test_lru_dist.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [scripts/Makefile.host:124: samples/bpf/libbpf.o] Error 1
In file included from /git/linux/samples/bpf/bpf_load.c:24:0:
/git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory
 #include <bpf/bpf.h>
 ^
compilation terminated.
make[2]: *** [scripts/Makefile.host:124: samples/bpf/bpf_load.o] Error 1
In file included from /git/linux/samples/bpf/sock_example.c:29:0:
/git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory
 #include <bpf/bpf.h>
 ^
compilation terminated.
make[2]: *** [scripts/Makefile.host:124: samples/bpf/sock_example.o] Error 1
make[1]: *** [/git/linux/Makefile:1659: samples/bpf/] Error 2
make[1]: Leaving directory '/tmp/build/linux'
make: *** [Makefile:150: sub-make] Error 2
[root@1e797fdfbf4f linux]# 


Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread Hannes Frederic Sowa
On 15.12.2016 16:41, David Laight wrote:
> Try (retyped):
> 
> echo 'struct { long a; long long b; } s; int bar(void) { return sizeof s; }' >foo.c
> gcc [-m32] -O2 -S foo.c; cat foo.s
> 
> And look at what is generated.

I used __alignof__(unsigned long long) with -m32.

>> Right now ipv6 addresses have an alignment of 4. So we couldn't even
>> naturally pass them to siphash but would need to copy them around, which
>> I feel like a source of bugs.
> 
> That is more of a problem on systems that don't support misaligned accesses.
> Reading the 64bit values with two explicit 32bit reads would work.
> I think you can get gcc to do that by adding an aligned(4) attribute to the
> structure member.

Yes, and that is actually my fear, because we support those
architectures. I can't comment on that as I don't understand enough of this.

If someone finds a way to cause misaligned reads on a small box this
seems (maybe depending on sysctls they get fixed up or panic) to be a
much bigger issue than having a hash DoS.

Thanks,
Hannes



Re: [PATCH net-next 2/2] inet: Fix get port to handle zero port number with soreuseport set

2016-12-15 Thread Craig Gallek
On Wed, Dec 14, 2016 at 7:54 PM, Tom Herbert  wrote:
> A user may call listen without binding an explicit port, with the intent
> that the kernel will assign an available port to the socket. In this
> case inet_csk_get_port does a port scan. For such sockets, the user may
> also set soreuseport with the intent of creating more sockets for the
> port that is selected. The problem is that the initial socket being
> opened could inadvertently choose an existing and unrelated port
> number that was already created with soreuseport.
Good catch!  I think this problem may also exist in the UDP path?
(udp_lib_get_port -> udp_lib_lport_inuse[2])


Re: Designing a safe RX-zero-copy Memory Model for Networking

2016-12-15 Thread Alexander Duyck
On Thu, Dec 15, 2016 at 12:28 AM, Jesper Dangaard Brouer
 wrote:
> On Wed, 14 Dec 2016 14:45:00 -0800
> Alexander Duyck  wrote:
>
>> On Wed, Dec 14, 2016 at 1:29 PM, Jesper Dangaard Brouer
>>  wrote:
>> > On Wed, 14 Dec 2016 08:45:08 -0800
>> > Alexander Duyck  wrote:
>> >
>> >> I agree.  This is a no-go from the performance perspective as well.
>> >> At a minimum you would have to be zeroing out the page between uses to
>> >> avoid leaking data, and that assumes that the program we are sending
>> >> the pages to is slightly well behaved.  If we think zeroing out an
>> >> sk_buff is expensive wait until we are trying to do an entire 4K page.
>> >
>> > Again, yes the page will be zero'ed out, but only when entering the
>> > page_pool. Because they are recycled they are not cleared on every use.
>> > Thus, performance does not suffer.
>>
>> So you are talking about recycling, but not clearing the page when it
>> is recycled.  That right there is my problem with this.  It is fine if
>> you assume the pages are used by the application only, but you are
>> talking about using them for both the application and for the regular
>> network path.  You can't do that.  If you are recycling you will have
>> to clear the page every time you put it back onto the Rx ring,
>> otherwise you can leak the recycled memory into user space and end up
>> with a user space program being able to snoop data out of the skb.
>>
>> > Besides clearing large mem area is not as bad as clearing small.
>> > Clearing an entire page does cost something, as mentioned before 143
>> > cycles, which is 28 bytes-per-cycle (4096/143).  And clearing 256 bytes
>> > cost 36 cycles which is only 7 bytes-per-cycle (256/36).
>>
>> What I am saying is that you are going to be clearing the 4K blocks
>> each time they are recycled.  You can't have the pages shared between
>> user-space and the network stack unless you have true isolation.  If
>> you are allowing network stack pages to be recycled back into the
>> user-space application you open up all sorts of leaks where the
>> application can snoop into data it shouldn't have access to.
>
> See later, the "Read-only packet page" mode should provide a mode where
> the netstack doesn't write into the page, and thus cannot leak kernel
> data. (CAP_NET_ADMIN already give it access to other applications data.)

I think you are kind of missing the point.  The device is writing to
the page on the kernel's behalf.  Therefore the page isn't "Read-only"
and you have an issue since you are talking about sharing a ring
between kernel and userspace.
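
To make the recycling concern concrete, here is a toy user-space model of a buffer pool that clears a page every time it is handed back. It is only a sketch under my own assumptions: the toy_pool names are hypothetical and this is not the page_pool API from the design document. It shows the cost being argued about, namely one full-page memset per recycle, and the property it buys.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE  4096
#define TOY_POOL_PAGES 4

/* Toy model only: a fixed set of "pages" plus a stack of free indices. */
struct toy_pool {
	unsigned char pages[TOY_POOL_PAGES][TOY_PAGE_SIZE];
	int free_idx[TOY_POOL_PAGES];
	int nfree;
};

static void toy_pool_init(struct toy_pool *p)
{
	memset(p->pages, 0, sizeof(p->pages));
	for (int i = 0; i < TOY_POOL_PAGES; i++)
		p->free_idx[i] = i;
	p->nfree = TOY_POOL_PAGES;
}

/* Hand out a free page, or NULL if the pool is exhausted. */
static unsigned char *toy_pool_get(struct toy_pool *p)
{
	if (p->nfree == 0)
		return NULL;
	return p->pages[p->free_idx[--p->nfree]];
}

/* Recycle a page.  It is cleared first so the next user, which may be
 * a different consumer, cannot read the previous contents. */
static void toy_pool_put(struct toy_pool *p, unsigned char *page)
{
	int idx = (int)((page - &p->pages[0][0]) / TOY_PAGE_SIZE);

	assert(idx >= 0 && idx < TOY_POOL_PAGES);
	assert(p->nfree < TOY_POOL_PAGES);
	memset(page, 0, TOY_PAGE_SIZE);
	p->free_idx[p->nfree++] = idx;
}
```

Dropping the memset() in toy_pool_put() is what makes recycling cheap, and it is also exactly what would let stale data show up in the next consumer's view of the page.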

>> >> I think we are stuck with having to use a HW filter to split off
>> >> application traffic to a specific ring, and then having to share the
>> >> memory between the application and the kernel on that ring only.  Any
>> >> other approach just opens us up to all sorts of security concerns
>> >> since it would be possible for the application to try to read and
>> >> possibly write any data it wants into the buffers.
>> >
>> > This is why I wrote a document[1], trying to outline how this is possible,
>> > going through all the combinations, and asking the community to find
>> > faults in my idea.  Inlining it again, as nobody really replied on the
>> > content of the doc.
>> >
>> > -
>> > Best regards,
>> >   Jesper Dangaard Brouer
>> >   MSc.CS, Principal Kernel Engineer at Red Hat
>> >   LinkedIn: http://www.linkedin.com/in/brouer
>> >
>> > [1] 
>> > https://prototype-kernel.readthedocs.io/en/latest/vm/page_pool/design/memory_model_nic.html
>> >
>> > ===========================
>> > Memory Model for Networking
>> > ===========================
>> >
>> > This design describes how the page_pool changes the memory model for
>> > networking in the NIC (Network Interface Card) drivers.
>> >
>> > .. Note:: The catch for driver developers is that, once an application
>> >   requests zero-copy RX, then the driver must use a specific
>> >   SKB allocation mode and might have to reconfigure the
>> >   RX-ring.
>> >
>> >
>> > Design target
>> > =============
>> >
>> > Allow the NIC to function as a normal Linux NIC and be shared in a
>> > safe manner, between the kernel network stack and an accelerated
>> > userspace application using RX zero-copy delivery.
>> >
>> > Target is to provide the basis for building RX zero-copy solutions in
>> > a memory safe manner.  An efficient communication channel for userspace
>> > delivery is out of scope for this document, but OOM considerations are
>> > discussed below (`Userspace delivery and OOM`_).
>> >
>> > Background
>> > ==========
>> >
>> > The SKB or ``struct sk_buff`` is the fundamental meta-data structure
>> > for network packets in the Linux Kernel network stack.  It is a fairly
>> > complex object and can be constructed in several ways.
>> >
>> > From a memory perspective there are two ways depending on
>> > RX-buffer/page state:
>> >
>> > 1) Writable packet page
>> > 2) Read-only packet page
>> >
>> > To take 

Re: [PATCH net-next] ixgbevf: fix 'Etherleak' in ixgbevf

2016-12-15 Thread Alexander Duyck
On Thu, Dec 15, 2016 at 3:40 AM, Weilong Chen  wrote:
> Nessus reports that the VF appears to leak memory in network packets.
> Fix this by padding all small packets manually.
>
> See CVE-2003-0001:
> https://ofirarkin.files.wordpress.com/2008/11/atstake_etherleak_report.pdf
>
> Signed-off-by: Weilong Chen 
> ---
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c 
> b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> index 6d4bef5..137a154 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> @@ -3654,6 +3654,13 @@ static int ixgbevf_xmit_frame(struct sk_buff *skb, 
> struct net_device *netdev)
> return NETDEV_TX_OK;
> }
>
> +   /* On PCI/PCI-X HW, if packet size is less than ETH_ZLEN,
> +* packets may get corrupted during padding by HW.
> +* To WA this issue, pad all small packets manually.
> +*/
> +   if (eth_skb_pad(skb))
> +   return NETDEV_TX_OK;
> +

So the patch description for this probably isn't correct.  It looks
like the problem isn't leaking data; it is the fact that the frames
aren't being padded to prevent malicious events.  The only issue is
that the patch pads by a bit too much.  I would recommend replacing
this with the following from ixgbe:

	/*
	 * The minimum packet size for olinfo paylen is 17 so pad the skb
	 * in order to meet this minimum size requirement.
	 */
	if (skb_put_padto(skb, 17))
		return NETDEV_TX_OK;
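
For readers who want to see the padding idea in isolation, here is a toy user-space version. struct toy_frame and toy_pad_to() are hypothetical stand-ins, not sk_buff or skb_put_padto(); the sketch only assumes the behaviour relied on above, that a short frame is extended to the minimum length with zero bytes and a non-zero return means the frame could not be padded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy frame buffer; a stand-in for an sk_buff, not the kernel structure. */
struct toy_frame {
	unsigned char data[64];
	size_t len;
};

/* Zero-pad the frame up to min_len.  Returns 0 on success and -1 if
 * the buffer has no room for the padding. */
static int toy_pad_to(struct toy_frame *f, size_t min_len)
{
	assert(f->len <= sizeof(f->data));

	if (f->len >= min_len)
		return 0;		/* already long enough */
	if (min_len > sizeof(f->data))
		return -1;
	/* Explicit zeros: whatever was in the buffer before is not sent. */
	memset(f->data + f->len, 0, min_len - f->len);
	f->len = min_len;
	return 0;
}
```

The memset() is the part that matters for the leak report: the bytes between the old and new length are overwritten rather than sent as they happened to be.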


> tx_ring = adapter->tx_ring[skb->queue_mapping];
>
> /* need: 1 descriptor per page * PAGE_SIZE/IXGBE_MAX_DATA_PER_TXD,
> --
> 1.7.12
>


Re: [net-next PATCH v5 1/6] net: virtio dynamically disable/enable LRO

2016-12-15 Thread Michael S. Tsirkin
On Wed, Dec 14, 2016 at 09:01:27AM -0800, John Fastabend wrote:
> On 16-12-14 05:31 AM, Michael S. Tsirkin wrote:
> > On Thu, Dec 08, 2016 at 04:04:58PM -0800, John Fastabend wrote:
> >> On 16-12-08 01:36 PM, Michael S. Tsirkin wrote:
> >>> On Wed, Dec 07, 2016 at 12:11:11PM -0800, John Fastabend wrote:
>  This adds support for dynamically setting the LRO feature flag. The
>  message to control guest features in the backend uses the
>  CTRL_GUEST_OFFLOADS msg type.
> 
>  Signed-off-by: John Fastabend 
>  ---
> 
> [...]
> 
>   
>   static void virtnet_config_changed_work(struct work_struct *work)
>  @@ -1815,6 +1846,12 @@ static int virtnet_probe(struct virtio_device 
>  *vdev)
>   if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM))
>   dev->features |= NETIF_F_RXCSUM;
>   
>  +if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) &&
>  +virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6)) {
>  +dev->features |= NETIF_F_LRO;
>  +dev->hw_features |= NETIF_F_LRO;
> >>>
> >>> So the issue is I think that the virtio "LRO" isn't really
> >>> LRO, it's typically just GRO forwarded to guests.
> >>> So these are easily re-split along MTU boundaries,
> >>> which makes it ok to forward these across bridges.
> >>>
> >>> It's not nice that we don't document this in the spec,
> >>> but it's the reality and people rely on this.
> >>>
> >>> For now, how about doing a custom thing and just disable/enable
> >>> it as XDP is attached/detached?
> >>
> >> The annoying part about doing this is ethtool will say that it is fixed
> >> yet it will be changed by a seemingly unrelated operation. I'm not sure I
> >> like the idea to start automatically configuring the link via xdp_set.
> > 
> > I really don't like the idea of dropping performance
> > by a factor of 3 for people bridging two virtio net
> > interfaces.
> > 
> > So how about a simple approach for now, just disable
> > XDP if GUEST_TSO is enabled?
> > 
> > We can discuss better approaches in next version.
> > 
> 
> So the proposal is to add a check in XDP setup so that
> 
>   if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO{4|6}))
>           return -EOPNOTSUPP;
> 
> Or whatever is the most appropriate return code? Then we can
> disable TSO via qemu-system with guest_tso4=off,guest_tso6=off for
> XDP use cases.

Right. It's a start.
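
A minimal user-space sketch of the check being proposed, for illustration only. toy_has_feature() and toy_xdp_setup_check() are hypothetical names operating on a plain bitmask; the real check would use virtio_has_feature() on the vdev. The bit numbers are the ones the virtio-net specification assigns to the two guest TSO features.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Feature bit numbers from the virtio-net specification. */
#define VIRTIO_NET_F_GUEST_TSO4 7
#define VIRTIO_NET_F_GUEST_TSO6 8

/* Hypothetical stand-in for virtio_has_feature() over a plain bitmask. */
static int toy_has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (int)((features >> bit) & 1);
}

/* Refuse XDP setup while either guest TSO feature is negotiated. */
static int toy_xdp_setup_check(uint64_t features)
{
	if (toy_has_feature(features, VIRTIO_NET_F_GUEST_TSO4) ||
	    toy_has_feature(features, VIRTIO_NET_F_GUEST_TSO6))
		return -EOPNOTSUPP;
	return 0;
}
```

With guest_tso4=off,guest_tso6=off on the qemu side neither bit would be negotiated, so the check passes and XDP can be attached.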

> Sounds like a reasonable start to me. I'll make the change. Should this
> go through DaveM's net-next tree or do you want it on the virtio tree? Either
> is fine with me.
> 
> Thanks,
> John

I think I'll merge it because I'm tweaking RX processing too,
and this will likely conflict.

-- 
MST


Re: [RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)

2016-12-15 Thread Doug Ledford
On 12/15/2016 9:52 AM, ira.weiny wrote:
> On Thu, Dec 15, 2016 at 11:12:26AM +0200, Leon Romanovsky wrote:
>> On Wed, Dec 14, 2016 at 11:59:32PM -0800, Vishwanathapura, Niranjana wrote:
>>> Thanks Jason for the valuable feedback.
>>> Here is the revised HFI VNIC patch series.
>>>
>>> ChangeLog:
>>> =
>>> v1 => v2:
>>> a) Removed hfi_vnic bus, instead make hfi_vnic driver an 'ib client',
>>>as per feedback from Jason Gunthorpe.
>>> b) Interface changes, data structure changes and variable name changes
>>>associated with (a).
>>> c) Add hfi_ibdev abstraction to provide VNIC control operations to
>>>hfi_vnic client.
>>> d) Minor fixes
>>> e) Moved hfi_vnic driver from .../sw/intel/vnic/hfi_vnic to
>>>.../sw/intel/hfi_vnic.
>>
>> To put it into proportion, Jason asked you to do a different thing.
>> http://marc.info/?l=linux-rdma&m=147977108302151&w=2
>> http://marc.info/?l=linux-rdma&m=148000415401842&w=2
>>
>> And Christoph,
>> http://marc.info/?l=linux-rdma&m=147985587425861&w=2
> 
> Understood.  However, we never heard back on Niranjana's analysis of the code
> which stated that > 60% of the code was dealing with the OPA MADs used to
> configure this device.
> 
> https://www.spinics.net/lists/linux-rdma/msg43579.html
> 
> Furthermore, neither Dave nor Doug has had time to weigh in on what we should
> do.
> 
> So before we make that change we wanted to get consensus on using the
> hfi1_ibdev abstraction rather than the bus.  This was the _real_ technical
> change.
> 
> Beyond that it is really just which maintainer wants this driver.  To that end
> I've also cc'ed Jeff Kirsher who maintains drivers/net/ethernet/intel.
> Perhaps Dave would like the driver to go through that tree?
> 
> 
> I think there are pros and cons to both subtrees and in the end we will do
> whatever is decided.
> 
> For maintainer review:
> 
>   1) The driver encapsulates ethernet packets with OPA headers
> 
>   2) VNIC uses OPA management packets (MADs) for its configuration
> 
>   3) A significant portion (> 60% +) of the code is specific to OPA
> 
>   https://www.spinics.net/lists/linux-rdma/msg43579.html
> 
>   4) The driver is from Intel and we expect Intel to be the primary
>  contributor to the code.
> 
>   5) The driver, like hfi1, is dual licensed (GPL/BSD)
> 
>   6) Based on Christoph's feedback we will be adding device capability
>  bits to the IB core to indicate HFI VNIC support.
> 
>   https://www.spinics.net/lists/linux-rdma/msg44113.html
> 
> 
> Doug, Dave, Jeff any thoughts?
> 
> Ira
> 

Sorry for my late reply.  The series is relatively large, and also
tagged with RFC, so it got shuffled to the back burner while I worked on
the stuff for this pull request.

I just read through the comments in the V1 series between Jason et al.,
and my take on things is like this:

1) Since your intent is to make this work with multiple versions of the
hfi drivers, I disagree with Jason that just because there is only one
driver today that we should keep it simple.  Designing it right from the
beginning, if multi-driver is your intent, is, IMO, a better way to go.
You'll work out the bugs in the initial implementation and when it comes
time to add the second driver, things will go much more smoothly.

2) With more than 60% of the code being MAD related, and another
significant chunk being hfi related, and only a minor bit (20% maybe?)
being net related, I disagree that this belongs in the drivers/net or
net/ directories.  Part of the purpose of putting code like this in any
given directory is to group it with what it is most tightly tied to.
That way people doing sub-tree wide changes know the rough scope of
their work as the code that needs changed is grouped together.  Putting
this or IPoIB in one of the net trees would make it obvious to the
casual coder that these need changed for net changes, but would totally
hide the fact that once you tear into these drivers, there is a lot more
IB to them than there is net.  What's more, when 60+% of the driver is
non-net, then you end up having many more of my patches crossing over
into Dave's tree than the opposite if you put the code under my tree.
If nothing else, locality of code churn would say both this and IPoIB
belong here despite them being net drivers.

3) I would like some hard reasons why this driver deserves to exist?
I'm struggling very hard right now with why we would add an entirely new
"encapsulate IP over RDMA" driver.  Even if you use regular Ethernet
MACs instead of IPoIB's 20-byte MAC, I'm struggling for why IPoIB
couldn't be modified to know it supports two MAC sizes and provide
different net devices based on those different types?  I'm struggling to
see why IPoIB couldn't be modified to essentially have two transport
layers underneath?  I haven't done a thorough code review yet, but if I
get into the net driver portion of this and it has very much similarity
to the IPoIB net portion, I'm p

Re: Designing a safe RX-zero-copy Memory Model for Networking

2016-12-15 Thread Christoph Lameter
On Thu, 15 Dec 2016, Jesper Dangaard Brouer wrote:

> > It sounds like Christoph's RDMA approach might be the way to go.
>
> I'm getting more and more fond of Christoph's RDMA approach.  I do
> think we will end-up with something close to that approach.  I just
> wanted to get review on my idea first.
>
> IMHO the major blocker for the RDMA approach is not HW filters
> themselves, but a common API that applications can call to register
> what goes into the HW queues in the driver.  I suspect it will be a
> long project agreeing between vendors.  And agreeing on semantics.

Some of the methods from the RDMA subsystem (like queue pairs, the various
queues etc) could be extracted and used here. Multiple vendors already
support these features and some devices operate both in an RDMA and a
network stack mode. Having that all supported by the network stack would
reduce overhead for those vendors.

Multiple new vendors are coming up in the RDMA subsystem because the
regular network stack does not have the right performance for high speed
networking. I would rather see them have a way to get that functionality
from the regular network stack. Please add some extensions so that the
RDMA style I/O can be made to work. Even the hardware of the new NICs is
already prepared to work with the data structures of the RDMA subsystem.
That provides an area of standardization that we could hook into, but do
that properly and in a nice way in the context of mainstream network
support.


Re: [PATCH net] sctp: sctp_epaddr_lookup_transport should be protected by rcu_read_lock

2016-12-15 Thread Marcelo Ricardo Leitner
On Thu, Dec 15, 2016 at 11:00:55PM +0800, Xin Long wrote:
> Since commit 7fda702f9315 ("sctp: use new rhlist interface on sctp transport
> rhashtable"), sctp has changed to use rhlist_lookup to look up transport, but
> rhlist_lookup doesn't call rcu_read_lock inside, unlike rhashtable_lookup_fast.
> 
> It is called in sctp_epaddr_lookup_transport and sctp_addrs_lookup_transport.
> sctp_addrs_lookup_transport is always in the protection of rcu_read_lock(),
> as __sctp_lookup_association is called in rx path or sctp_lookup_association
> which are in the protection of rcu_read_lock() already.
> 
> But sctp_epaddr_lookup_transport is called by sctp_endpoint_lookup_assoc, which
> doesn't call rcu_read_lock, and that may cause a "suspicious
> rcu_dereference_check usage" warning in __rhashtable_lookup.
> 
> This patch is to fix it by adding rcu_read_lock in sctp_endpoint_lookup_assoc
> before calling sctp_epaddr_lookup_transport.
> 
> Fixes: 7fda702f9315 ("sctp: use new rhlist interface on sctp transport rhashtable")
> Reported-by: Dmitry Vyukov 
> Signed-off-by: Xin Long 

Acked-by: Marcelo Ricardo Leitner 

> ---
>  net/sctp/endpointola.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
> index 1f03065..410ddc1 100644
> --- a/net/sctp/endpointola.c
> +++ b/net/sctp/endpointola.c
> @@ -331,7 +331,9 @@ struct sctp_association *sctp_endpoint_lookup_assoc(
>* on this endpoint.
>*/
>   if (!ep->base.bind_addr.port)
> - goto out;
> + return NULL;
> +
> + rcu_read_lock();
>   t = sctp_epaddr_lookup_transport(ep, paddr);
>   if (!t)
>   goto out;
> @@ -339,6 +341,7 @@ struct sctp_association *sctp_endpoint_lookup_assoc(
>   *transport = t;
>   asoc = t->asoc;
>  out:
> + rcu_read_unlock();
>   return asoc;
>  }
>  
> -- 
> 2.1.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


Re: [PATCH net] sctp: sctp_transport_lookup_process should rcu_read_unlock when transport is null

2016-12-15 Thread Marcelo Ricardo Leitner
On Thu, Dec 15, 2016 at 11:05:52PM +0800, Xin Long wrote:
> Prior to this patch, sctp_transport_lookup_process didn't rcu_read_unlock
> when it failed to find a transport by sctp_addrs_lookup_transport.
> 
> This patch is to fix it by moving up rcu_read_unlock right before checking
> transport and also to remove the out path.
> 
> Fixes: 1cceda784980 ("sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock")
> Signed-off-by: Xin Long 

Acked-by: Marcelo Ricardo Leitner 

> ---
>  net/sctp/socket.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index d5f4b4a..318c678 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -4472,18 +4472,17 @@ int sctp_transport_lookup_process(int (*cb)(struct 
> sctp_transport *, void *),
> const union sctp_addr *paddr, void *p)
>  {
>   struct sctp_transport *transport;
> - int err = -ENOENT;
> + int err;
>  
>   rcu_read_lock();
>   transport = sctp_addrs_lookup_transport(net, laddr, paddr);
> + rcu_read_unlock();
>   if (!transport)
> - goto out;
> + return -ENOENT;
>  
> - rcu_read_unlock();
>   err = cb(transport, p);
>   sctp_transport_put(transport);
>  
> -out:
>   return err;
>  }
>  EXPORT_SYMBOL_GPL(sctp_transport_lookup_process);
> -- 
> 2.1.0
> 
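
The shape of the fix can be tried out in user space with a mock read-side lock that just counts nesting. Everything named toy_* below is hypothetical; it is a sketch of the control flow in the patch, not SCTP code. The patch calls sctp_transport_put() after the callback, which suggests the lookup returns a held reference; the toy has no refcounting and simply assumes the object stays valid.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock read-side lock: a nesting counter instead of real RCU. */
static int toy_rcu_depth;

static void toy_rcu_read_lock(void)   { toy_rcu_depth++; }
static void toy_rcu_read_unlock(void) { toy_rcu_depth--; }

struct toy_transport {
	int id;
};

/* Mock lookup: returns the transport if "found", NULL otherwise. */
static struct toy_transport *toy_lookup(struct toy_transport *t, int found)
{
	return found ? t : NULL;
}

/* Example callback.  It must run outside the read-side section. */
static int toy_cb(struct toy_transport *t, void *p)
{
	(void)p;
	assert(toy_rcu_depth == 0);
	return t->id;
}

static int toy_lookup_process(int (*cb)(struct toy_transport *, void *),
			      struct toy_transport *t, int found, void *p)
{
	struct toy_transport *hit;

	toy_rcu_read_lock();
	hit = toy_lookup(t, found);
	toy_rcu_read_unlock();	/* dropped before the NULL check */
	if (!hit)
		return -ENOENT;

	return cb(hit, p);
}
```

With the unlock placed before the NULL check, both the not-found path and the callback path leave the counter at zero, which is the imbalance the patch removes.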


Re: [PATCH] net: wan: Use dma_pool_zalloc

2016-12-15 Thread Joe Perches
On Thu, 2016-12-15 at 10:41 +0530, Souptick Joarder wrote:
> On Mon, Dec 12, 2016 at 10:12 AM, Souptick Joarder wrote:
> > On Fri, Dec 9, 2016 at 6:33 PM, Krzysztof Hałasa  wrote:
> > > Souptick Joarder  writes:
> > > 
> > > > We should use dma_pool_zalloc instead of dma_pool_alloc/memset
[]
> > > > diff --git a/drivers/net/wan/ixp4xx_hss.c b/drivers/net/wan/ixp4xx_hss.c
[]
> > > > @@ -976,10 +976,9 @@ static int init_hdlc_queues(struct port *port)
> > > >   return -ENOMEM;
> > > >   }
> > > > 
> > > > - if (!(port->desc_tab = dma_pool_alloc(dma_pool, GFP_KERNEL,
> > > > -   &port->desc_tab_phys)))
> > > > + if (!(port->desc_tab = dma_pool_zalloc(dma_pool, GFP_KERNEL,
> > > > +&port->desc_tab_phys)))
> > > >   return -ENOMEM;
> > > > - memset(port->desc_tab, 0, POOL_ALLOC_SIZE);
> > > >   memset(port->rx_buff_tab, 0, sizeof(port->rx_buff_tab)); /* tables */
> > > >   memset(port->tx_buff_tab, 0, sizeof(port->tx_buff_tab));
> > > 
> > > This looks fine, feel free to send it to the netdev mailing list for
> > > inclusion.
> > 
> > Including the netdev mailing list as requested.
> > > Acked-by: Krzysztof Halasa 
[]
> Any comment on this patch ?

Shouldn't the one in drivers/net/ethernet/xscale/ixp4xx_eth.c
also be changed?


Re: [RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)

2016-12-15 Thread Jason Gunthorpe
On Wed, Dec 14, 2016 at 11:59:32PM -0800, Vishwanathapura, Niranjana wrote:
>  create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/Kconfig
>  create mode 100644 drivers/infiniband/sw/intel/hfi_vnic/Makefile

Still NAK on these paths, I already explained why 'sw' is totally
unsuitable. Put it in drivers/net or drivers/infiniband/ulp

Jason


Re: [RFC v2 03/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) netdev

2016-12-15 Thread Jason Gunthorpe
On Wed, Dec 14, 2016 at 11:59:35PM -0800, Vishwanathapura, Niranjana wrote:
> +/**
> + * union hfi_vnic_bypass_hdr - VNIC bypass header
> + * @slid: source lid
> + * @length: length of packet
> + * @becn: backward explicit congestion notification
> + * @dlid: destination lid
> + * @sc: service class
> + * @fecn: forward explicit congestion notification
> + * @l2: L2 type (2=16B)
> + * @lt: link transfer field
> + * @l4: L4 type
> + * @slid_high: upper 4 bits of source lid
> + * @dlid_high: upper 4 bits of destination lid
> + * @pkey: partition key
> + * @entropy: entropy
> + * @age: packet age
> + * @l4_hdr: L4 header
> + */
> +union hfi_vnic_bypass_hdr {
> + struct {
> + struct {
> + uint64_t slid   : 20;
> + uint64_t length : 11;
> + uint64_t becn   : 1;
> + uint64_t dlid   : 20;
> + uint64_t sc : 5;
> + uint64_t rsvd   : 3;
> + uint64_t fecn   : 1;
> + uint64_t l2 : 2;
> + uint64_t lt : 1;
> + };
> + struct {
> + uint64_t l4: 8;
> + uint64_t slid_high : 4;
> + uint64_t dlid_high : 4;
> + uint64_t pkey  : 16;
> + uint64_t entropy   : 16;
> + uint64_t age   : 8;
> + uint64_t rsvd1 : 8;
> + };
> + struct {
> + uint32_t rsvd2  : 16;
> + uint32_t l4_hdr : 16;
> + };
> + } __packed;
> + u32 dw[5];
> +};

This isn't going to work on BE, please fix it.

> +/**
> + * struct __hfi_vesw_info - HFI vnic virtual switch info
> + */
> +struct __hfi_vesw_info {
> + u16  fabric_id;
> + u16  vesw_id;
> +
> + u8   rsvd0[6];
> + u16  def_port_mask;
> +
> + u8   rsvd1[2];
> + u16  pkey;
> +
> + u8   rsvd2[4];
> + u32  u_mcast_dlid;
> + u32  u_ucast_dlid[HFI_VESW_MAX_NUM_DEF_PORT];
> +
> + u8   rsvd3[44];
> + u16  eth_mtu[HFI_VNIC_MAX_NUM_PCP];
> + u16  eth_mtu_non_vlan;
> + u8   rsvd4[2];
> +} __packed;

This goes on the network too? Also looks like it has endian problems.

Ditto for all the __packed structures.

> +#define v_dbg(format, arg...) \
> + netdev_dbg(adapter->netdev, format, ## arg)
> +#define v_err(format, arg...) \
> + netdev_err(adapter->netdev, format, ## arg)
> +#define v_info(format, arg...) \
> + netdev_info(adapter->netdev, format, ## arg)
> +#define v_warn(format, arg...) \
> + netdev_warn(adapter->netdev, format, ## arg)

Relies on an 'adapter' local variable?? Ugly.
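
One possible shape for this, sketched in user space: pass the adapter to the macro instead of picking it up from the caller's scope. The toy_* names are hypothetical, and fprintf() stands in for the netdev_err() family, so this only illustrates the macro signature, not the kernel logging API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

struct toy_netdev {
	const char *name;
};

struct toy_adapter {
	struct toy_netdev *netdev;
};

/* Stand-in for netdev_err(): prefix the message with the device name.
 * Returns the number of characters written. */
static int toy_netdev_log(const struct toy_netdev *dev, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(dev != NULL && dev->name != NULL);
	n = fprintf(stderr, "%s: ", dev->name);
	va_start(ap, fmt);
	n += vfprintf(stderr, fmt, ap);
	va_end(ap);
	return n;
}

/* The adapter is an explicit argument, so the macro works in any scope
 * and the dependency is visible at the call site. */
#define toy_v_err(adapter, ...) \
	toy_netdev_log((adapter)->netdev, __VA_ARGS__)
```

A call then reads toy_v_err(adapter, "bad mtu %d\n", mtu), which costs one extra argument but no longer breaks when the local variable is named differently.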

Jason


Re: [RFC v2 00/10] HFI Virtual Network Interface Controller (VNIC)

2016-12-15 Thread Jason Gunthorpe
On Thu, Dec 15, 2016 at 11:28:06AM -0500, Doug Ledford wrote:

> 1) Since your intent is to make this work with multiple versions of the
> hfi drivers, I disagree with Jason that just because there is only one
> driver today that we should keep it simple.  Designing it right from the
> beginning, if multi-driver is your intent, is, IMO, a better way to go.
> You'll work out the bugs in the initial implementation and when it comes
> time to add the second driver, things will go much more smoothly.

If that is your position then this should be a straight up IB ULP that
works with any IB hardware.

There is nothing HFI specific about it except for the
micro-optimization of pushing packets via SDMA instead of post_send,
and that same micro optimization probably applies to ipoib.

In other words, let's see the first version as a straight ULP with no
special HFI hooks, then we can discuss how best to micro optimize it
for HFI SDMA.

Jason


Re: [PATCH] net: sfc: use new api ethtool_{get|set}_link_ksettings

2016-12-15 Thread Bert Kenward
On 14/12/16 23:12, Philippe Reynes wrote:
> The ethtool api {get|set}_settings is deprecated.
> We move this driver to new api {get|set}_link_ksettings.
> 
> Signed-off-by: Philippe Reynes 

Tested-by: Bert Kenward 
Acked-by: Bert Kenward 


Re: [PATCH net-next 1/3] net:dsa:mv88e6xxx: use hashtable to store multicast entries

2016-12-15 Thread Vivien Didelot
Hi Volodymyr,

Volodymyr Bendiuga  writes:

> Hi Andrew,
>
> I have tested the approach you wrote in previous mails, the one
> with setting next.mac to address we are looking for -1. It seems
> to be as slow as the original implementation, unfortunately.

Hum, that is what I was expecting... The ATU GetNext operation
(alongside an ether_addr_equal() call) should be quite fast.
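
For reference, the "address minus one, then GetNext" lookup can be modelled in a few lines of user-space C. This is a toy under my own assumptions: toy_get_next() walks a sorted array and returns the first entry strictly greater than the key, which is how the operation is described in this thread, and none of the toy_* names come from the mv88e6xxx driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6

/* Decrement a MAC address treated as a 48-bit big-endian number. */
static void toy_mac_dec(unsigned char mac[MAC_LEN])
{
	for (int i = MAC_LEN - 1; i >= 0; i--) {
		if (mac[i]-- != 0)
			break;	/* no borrow needed */
		/* byte wrapped from 0x00 to 0xff; borrow from the next one */
	}
}

/* Toy GetNext: first entry strictly greater than mac in a sorted
 * table, or NULL when there is none. */
static const unsigned char *toy_get_next(const unsigned char (*tbl)[MAC_LEN],
					 size_t n, const unsigned char *mac)
{
	for (size_t i = 0; i < n; i++)
		if (memcmp(tbl[i], mac, MAC_LEN) > 0)
			return tbl[i];
	return NULL;
}

/* Look up one address: seed GetNext with target - 1 and compare the
 * result.  The all-zero address wraps around and is not handled here. */
static int toy_lookup(const unsigned char (*tbl)[MAC_LEN], size_t n,
		      const unsigned char *target)
{
	unsigned char key[MAC_LEN];
	const unsigned char *hit;

	assert(target != NULL);
	memcpy(key, target, MAC_LEN);
	toy_mac_dec(key);
	hit = toy_get_next(tbl, n, key);
	return hit != NULL && memcmp(hit, target, MAC_LEN) == 0;
}
```

In the toy the cost is one GetNext plus one compare per lookup; on real hardware the register access time for the GetNext dominates, which would be consistent with the measurement reported above.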

> We use 6097 and 6352 chips, and both of them can not do any port
> filtering in hardware for fdb dump operation. Seems like they would
> benefit from cache. But I am not sure about other switches.
>
> Does anyone know about such feature in other switches?

Marvell switches cannot filter ATU entries for a specific port, they
contain a port vector.

I guess Florian might answer for Broadcom switches, and John might
answer for Qualcomm switches.

In all cases *if caching is really needed*, I think it won't hurt to do
it in DSA core even if a switch supports FDB dump operations on a
per-port basis, as Andrew mentioned.

Thanks,

Vivien


RE: [RFC v2 03/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) netdev

2016-12-15 Thread Hefty, Sean
> This goes on the network too? Also looks like it has endian problems.

I don't think OPA supports BE systems, and I think it uses LE on the wire for
at least some portions of its protocol.


Re: [RFC v2 03/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) netdev

2016-12-15 Thread Jason Gunthorpe
On Thu, Dec 15, 2016 at 05:21:05PM +, Hefty, Sean wrote:
> > This goes on the network too? Also looks like it has endian problems.
> 
> I don't think OPA supports BE systems, and I think it uses LE on the
> wire for at least some portions of its protocol.

This is a linux driver for a PCI device.

It needs to support big endian systems, that is how we do things in
Linux.

If it uses LE on the wire then mark with __le and make it sparse clean.

Do not use bitfields without providing a BE version of the bitfield.
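
A sketch of one alternative to C bitfields: keep the first header quadword as little-endian bytes and use shift/mask accessors, so the same code gives the same wire layout on LE and BE hosts. The toy_* names are hypothetical. The bit positions are an assumption, derived from how GCC lays out the posted bitfield on a little-endian machine (first member in the lowest bits); the actual OPA 16B header definition would be the authority.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 64-bit value from 8 bytes in little-endian wire order,
 * independent of host byte order. */
static uint64_t toy_get_le64(const unsigned char *b)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

/* Store a 64-bit value as 8 bytes in little-endian wire order. */
static void toy_put_le64(unsigned char *b, uint64_t v)
{
	for (int i = 0; i < 8; i++) {
		b[i] = (unsigned char)(v & 0xff);
		v >>= 8;
	}
}

/* ASSUMED bit positions within the first quadword (see lead-in). */
#define TOY_SLID_SHIFT		0
#define TOY_SLID_MASK		0xfffffULL	/* 20 bits */
#define TOY_LENGTH_SHIFT	20
#define TOY_LENGTH_MASK		0x7ffULL	/* 11 bits */
#define TOY_DLID_SHIFT		32
#define TOY_DLID_MASK		0xfffffULL	/* 20 bits */

static uint64_t toy_field_get(uint64_t word, unsigned int shift, uint64_t mask)
{
	return (word >> shift) & mask;
}

static uint64_t toy_field_set(uint64_t word, unsigned int shift, uint64_t mask,
			      uint64_t val)
{
	assert((val & ~mask) == 0);	/* value must fit in the field */
	return (word & ~(mask << shift)) | (val << shift);
}
```

In the kernel the storage would be a __le64 accessed through le64_to_cpu()/cpu_to_le64(), which also gives sparse something to check.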

Jason

