date:20170111

Re: [PATCH v2 8/8] crypto/testmgr: Allocate only the required output size for hash tests

2017-01-11 Thread Andy Lutomirski

On Wed, Jan 11, 2017 at 11:47 PM, Herbert Xu wrote:
> Andy Lutomirski  wrote:
>> There are some hashes (e.g. sha224) that have some internal trickery
>> to make sure that only the correct number of output bytes are
>> generated.  If something goes wrong, they could potentially overrun
>> the output buffer.
>>
>> Make the test more robust by allocating only enough space for the
>> correct output size so that memory debugging will catch the error if
>> the output is overrun.
>>
>> Tested by intentionally breaking sha224 to output all 256
>> internally-generated bits while running on KASAN.
>>
>> Cc: Ard Biesheuvel 
>> Cc: Herbert Xu 
>> Signed-off-by: Andy Lutomirski 
>
> This patch doesn't seem to depend on anything else in the series.
> Do you want me to take it separately?

Yes, please.  Its only relation to the rest of the series is that I
wanted to make sure that I didn't mess up sha224's finalization code.

--Andy
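
A minimal userspace sketch of the idea discussed in this thread, not the testmgr code itself: if the test's output buffer is exactly as large as the digest, any extra bytes written by a broken finalization step land outside the allocation, where a memory debugger (KASAN in the kernel, or a tool such as AddressSanitizer in userspace) can flag them. The names toy_hash_final(), toy_hash_test() and the TOY_* sizes are hypothetical stand-ins modelled on sha224.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sizes modelled on sha224: 32 bytes of internal state,
 * of which only the first 28 are meant to be emitted. */
#define TOY_STATE_SIZE  32
#define TOY_DIGEST_SIZE 28

/* Hypothetical finalization step: copies only the truncated digest. */
static void toy_hash_final(const unsigned char state[TOY_STATE_SIZE],
			   unsigned char *out)
{
	memcpy(out, state, TOY_DIGEST_SIZE);
}

/* Run one "test". The output buffer is exactly TOY_DIGEST_SIZE bytes, so a
 * buggy final() that wrote TOY_STATE_SIZE bytes would overrun the heap
 * allocation and be reported by a memory debugger.
 * Returns 0 on match, 1 on mismatch, -1 on allocation failure. */
static int toy_hash_test(const unsigned char state[TOY_STATE_SIZE],
			 const unsigned char expected[TOY_DIGEST_SIZE])
{
	unsigned char *result;
	int ret;

	assert(state && expected);
	result = malloc(TOY_DIGEST_SIZE);
	if (!result)
		return -1;
	toy_hash_final(state, result);
	ret = memcmp(result, expected, TOY_DIGEST_SIZE) ? 1 : 0;
	free(result);
	return ret;
}
```

With an oversized buffer (for example, one sized for the largest supported digest) the same overrun would stay inside the allocation and go unnoticed, which is the gap the patch closes.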

Re: [PATCH v2 8/8] crypto/testmgr: Allocate only the required output size for hash tests

2017-01-11 Thread Herbert Xu

Andy Lutomirski  wrote:
> There are some hashes (e.g. sha224) that have some internal trickery
> to make sure that only the correct number of output bytes are
> generated.  If something goes wrong, they could potentially overrun
> the output buffer.
> 
> Make the test more robust by allocating only enough space for the
> correct output size so that memory debugging will catch the error if
> the output is overrun.
> 
> Tested by intentionally breaking sha224 to output all 256
> internally-generated bits while running on KASAN.
> 
> Cc: Ard Biesheuvel 
> Cc: Herbert Xu 
> Signed-off-by: Andy Lutomirski 

This patch doesn't seem to depend on anything else in the series.
Do you want me to take it separately?

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH net-next] cxgb4: Initialize mbox lock and list for mgmt dev

2017-01-11 Thread Ganesh Goudar

Initialize mbox lock and list for mgmt dev to avoid NULL pointer
dereference when cxgb_set_vf_mac is called.

And also allocate memory for private data while allocating mgmt
netdev.

Signed-off-by: Ganesh Goudar 
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 3349e1f..e95bb6a 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -4516,7 +4516,8 @@ static int config_mgmt_dev(struct pci_dev *pdev)
int err;
 
snprintf(name, IFNAMSIZ, "mgmtpf%d%d", adap->adap_idx, adap->pf);
-   netdev = alloc_netdev(0, name, NET_NAME_UNKNOWN, dummy_setup);
+   netdev = alloc_netdev(sizeof(struct port_info), name, NET_NAME_UNKNOWN,
+ dummy_setup);
if (!netdev)
return -ENOMEM;
 
@@ -4990,6 +4991,8 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
err = -ENOMEM;
goto free_adapter;
}
+   spin_lock_init(&adapter->mbox_lock);
+   INIT_LIST_HEAD(&adapter->mlist.list);
pci_set_drvdata(pdev, adapter);
return 0;
 
-- 
2.1.0

[PATCH v2 net-next] Introduce a sysctl that modifies the value of PROT_SOCK.

2017-01-11 Thread Krister Johansen

Add net.ipv4.ip_unprotected_port_start, which is a per namespace sysctl
that denotes the first unprotected inet port in the namespace.  To
disable all protected ports set this to zero.  It also checks for
overlap with the local port range.  The protected and local range may
not overlap.

The use case for this change is to allow containerized processes to bind
to privileged ports, but prevent them from ever being allowed to modify
their container's network configuration.  The latter is accomplished by
ensuring that the network namespace is not a child of the user
namespace.  This modification was needed to allow the container manager
to disable a namespace's privileged port restrictions without exposing
control of the network namespace to processes in the user namespace.

Signed-off-by: Krister Johansen 
---
 include/net/ip.h   | 10 +
 include/net/netns/ipv4.h   |  1 +
 net/ipv4/af_inet.c |  5 -
 net/ipv4/sysctl_net_ipv4.c | 50 +-
 net/ipv6/af_inet6.c|  3 ++-
 net/netfilter/ipvs/ip_vs_ctl.c |  7 +++---
 net/sctp/socket.c  | 10 +
 security/selinux/hooks.c   |  3 ++-
 8 files changed, 77 insertions(+), 12 deletions(-)

Changes v1 -> v2:

Remove LOWPORT_SYSCTL config option.  This is now always enabled as long
as CONFIG_SYSCTL is.
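
A rough userspace sketch of the two rules described in the commit message: a bind needs privilege only below a per-namespace threshold, and the protected range and the local port range may not overlap. The struct and the toy_* function names are hypothetical, and the check in toy_set_prot_sock() is an assumption, since the corresponding handler is cut off in this archive. Only the local-range check mirrors a hunk that is visible in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-namespace settings; the field names are illustrative
 * and are not the kernel's struct netns_ipv4 layout. */
struct toy_netns {
	int prot_sock;       /* first unprotected port */
	int local_range[2];  /* local (ephemeral) port range, inclusive */
};

/* A bind to port snum needs privilege only when 0 < snum < prot_sock. */
static bool toy_bind_needs_privilege(const struct toy_netns *ns, int snum)
{
	return snum && snum < ns->prot_sock;
}

/* Assumed rule: reject a threshold that reaches into the local range. */
static int toy_set_prot_sock(struct toy_netns *ns, int new_start)
{
	if (new_start < 0 || new_start > 65535)
		return -1;
	if (new_start > ns->local_range[0])
		return -1;
	ns->prot_sock = new_start;
	return 0;
}

/* Reject a local range that is inverted or that starts below the
 * protected-port threshold (this mirrors the visible hunk). */
static int toy_set_local_range(struct toy_netns *ns, int lo, int hi)
{
	if (hi < lo || lo < ns->prot_sock)
		return -1;
	ns->local_range[0] = lo;
	ns->local_range[1] = hi;
	return 0;
}
```

Setting the threshold to zero makes toy_bind_needs_privilege() return false for every port, which matches the "set this to zero" behaviour described above.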

diff --git a/include/net/ip.h b/include/net/ip.h
index ab6761a..bf264a8 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -263,11 +263,21 @@ static inline bool sysctl_dev_name_is_allowed(const char *name)
return strcmp(name, "default") != 0  && strcmp(name, "all") != 0;
 }
 
+static inline int inet_prot_sock(struct net *net)
+{
+   return net->ipv4.sysctl_ip_prot_sock;
+}
+
 #else
 static inline int inet_is_local_reserved_port(struct net *net, int port)
 {
return 0;
 }
+
+static inline int inet_prot_sock(struct net *net)
+{
+   return PROT_SOCK;
+}
 #endif
 
 __be32 inet_current_timestamp(void);
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 8e3f5b6..e365732 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -135,6 +135,7 @@ struct netns_ipv4 {
 
 #ifdef CONFIG_SYSCTL
unsigned long *sysctl_local_reserved_ports;
+   int sysctl_ip_prot_sock;
 #endif
 
 #ifdef CONFIG_IP_MROUTE
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index aae410b..28fe8da 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -479,7 +479,7 @@ int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 
snum = ntohs(addr->sin_port);
err = -EACCES;
-   if (snum && snum < PROT_SOCK &&
+   if (snum && snum < inet_prot_sock(net) &&
!ns_capable(net->user_ns, CAP_NET_BIND_SERVICE))
goto out;
 
@@ -1700,6 +1700,9 @@ static __net_init int inet_init_net(struct net *net)
net->ipv4.sysctl_ip_default_ttl = IPDEFTTL;
net->ipv4.sysctl_ip_dynaddr = 0;
net->ipv4.sysctl_ip_early_demux = 1;
+#ifdef CONFIG_SYSCTL
+   net->ipv4.sysctl_ip_prot_sock = PROT_SOCK;
+#endif
 
return 0;
 }
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 134d8e1..6ad3b39 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -35,6 +35,8 @@ static int ip_local_port_range_min[] = { 1, 1 };
 static int ip_local_port_range_max[] = { 65535, 65535 };
 static int tcp_adv_win_scale_min = -31;
 static int tcp_adv_win_scale_max = 31;
+static int ip_protected_port_min;
+static int ip_protected_port_max = 65535;
 static int ip_ttl_min = 1;
 static int ip_ttl_max = 255;
 static int tcp_syn_retries_min = 1;
@@ -79,7 +81,12 @@ static int ipv4_local_port_range(struct ctl_table *table, int write,
ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
 
if (write && ret == 0) {
-   if (range[1] < range[0])
+   /* Ensure that the upper limit is not smaller than the lower,
+* and that the lower does not encroach upon the protected
+* port limit.
+*/
+   if ((range[1] < range[0]) ||
+   (range[0] < net->ipv4.sysctl_ip_prot_sock))
ret = -EINVAL;
else
set_local_port_range(net, range);
@@ -88,6 +95,40 @@ static int ipv4_local_port_range(struct ctl_table *table, int write,
return ret;
 }
 
+/* Validate changes from /proc interface. */
+static int ipv4_protected_ports(struct ctl_table *table, int write,
+   void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+   struct net *net = container_of(table->data, struct net,
+   ipv4.sysctl_ip_prot_sock);
+   int ret;
+   int pports;
+   int range[2];
+   struct ctl_table tmp = {
+   .data = &pports,
+   .maxlen = sizeof(pports),
+   .mode = table->mode,
+   .extra1 =

[PATCH] can: Fix kernel panic at security_sock_rcv_skb

2017-01-11 Thread Liu ShuoX

From: Zhang Yanmin 

This patch fixes the kernel panic below:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] selinux_socket_sock_rcv_skb+0x65/0x2a0

Call Trace:
 
 [] security_sock_rcv_skb+0x4c/0x60
 [] sk_filter+0x41/0x210
 [] sock_queue_rcv_skb+0x53/0x3a0
 [] raw_rcv+0x2a3/0x3c0
 [] can_rcv_filter+0x12b/0x370
 [] can_receive+0xd9/0x120
 [] can_rcv+0xab/0x100
 [] __netif_receive_skb_core+0xd8c/0x11f0
 [] __netif_receive_skb+0x24/0xb0
 [] process_backlog+0x127/0x280
 [] net_rx_action+0x33b/0x4f0
 [] __do_softirq+0x184/0x440
 [] do_softirq_own_stack+0x1c/0x30
 
 [] do_softirq.part.18+0x3b/0x40
 [] do_softirq+0x1d/0x20
 [] netif_rx_ni+0xe5/0x110
 [] slcan_receive_buf+0x507/0x520
 [] flush_to_ldisc+0x21c/0x230
 [] process_one_work+0x24f/0x670
 [] worker_thread+0x9d/0x6f0
 [] ? rescuer_thread+0x480/0x480
 [] kthread+0x12c/0x150
 [] ret_from_fork+0x3f/0x70

The sk dereferenced in the panic has already been released. After the
call_rcu() in can_rx_unregister(), the receiver itself was protected by RCU
but the data it points to was not, so the sk could be freed while another
CPU was still using it. We need to wait here to make sure the sk referenced
via the receiver is no longer in use.

=> security_sk_free
=> sk_destruct
=> __sk_free
=> sk_free
=> raw_release
=> sock_release
=> sock_close
=> __fput
=> fput
=> task_work_run
=> exit_to_usermode_loop
=> syscall_return_slowpath
=> int_ret_from_sys_call

Signed-off-by: Zhang Yanmin 
Signed-off-by: He, Bo 
Signed-off-by: Liu Shuo A 
---
 net/can/af_can.c | 14 --
 net/can/af_can.h |  1 -
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/can/af_can.c b/net/can/af_can.c
index 1108079..fcbe971 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -517,10 +517,8 @@ int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
 /*
  * can_rx_delete_receiver - rcu callback for single receiver entry removal
  */
-static void can_rx_delete_receiver(struct rcu_head *rp)
+static void can_rx_delete_receiver(struct receiver *r)
 {
-   struct receiver *r = container_of(rp, struct receiver, rcu);
-
kmem_cache_free(rcv_cache, r);
 }
 
@@ -595,9 +593,13 @@ void can_rx_unregister(struct net_device *dev, canid_t can_id, canid_t mask,
  out:
spin_unlock(&can_rcvlists_lock);
 
-   /* schedule the receiver item for deletion */
-   if (r)
-   call_rcu(&r->rcu, can_rx_delete_receiver);
+   /* synchronize_rcu to wait until a grace period has elapsed, to make
+* sure all receiver's sk dereferenced by others.
+*/
+   if (r) {
+   synchronize_rcu();
+   can_rx_delete_receiver(r);
+   }
 }
 EXPORT_SYMBOL(can_rx_unregister);
 
diff --git a/net/can/af_can.h b/net/can/af_can.h
index fca0fe9..a0cbf83 100644
--- a/net/can/af_can.h
+++ b/net/can/af_can.h
@@ -50,7 +50,6 @@
 
 struct receiver {
struct hlist_node list;
-   struct rcu_head rcu;
canid_t can_id;
canid_t mask;
unsigned long matches;
-- 
1.9.4

[PATCH net-next] net: core: Make netif_wake_subqueue a wrapper

2017-01-11 Thread Florian Fainelli

netif_wake_subqueue() is duplicating the same thing that netif_tx_wake_queue()
does, so make it call it directly after looking up the queue from the index.

Signed-off-by: Florian Fainelli 
---
 include/linux/netdevice.h | 14 +-
 net/core/dev.c| 22 --
 2 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d532c070163f..6d021c37b774 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3106,7 +3106,19 @@ static inline bool netif_subqueue_stopped(const struct net_device *dev,
return __netif_subqueue_stopped(dev, skb_get_queue_mapping(skb));
 }
 
-void netif_wake_subqueue(struct net_device *dev, u16 queue_index);
+/**
+ * netif_wake_subqueue - allow sending packets on subqueue
+ * @dev: network device
+ * @queue_index: sub queue index
+ *
+ * Resume individual transmit queue of a device with multiple transmit queues.
+ */
+static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
+{
+   struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index);
+
+   netif_tx_wake_queue(txq);
+}
 
 #ifdef CONFIG_XPS
 int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
diff --git a/net/core/dev.c b/net/core/dev.c
index 643e4a4491c6..7547e2ccc06b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2408,28 +2408,6 @@ void netif_schedule_queue(struct netdev_queue *txq)
 }
 EXPORT_SYMBOL(netif_schedule_queue);
 
-/**
- * netif_wake_subqueue - allow sending packets on subqueue
- * @dev: network device
- * @queue_index: sub queue index
- *
- * Resume individual transmit queue of a device with multiple transmit queues.
- */
-void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
-{
-   struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index);
-
-   if (test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &txq->state)) {
-   struct Qdisc *q;
-
-   rcu_read_lock();
-   q = rcu_dereference(txq->qdisc);
-   __netif_schedule(q);
-   rcu_read_unlock();
-   }
-}
-EXPORT_SYMBOL(netif_wake_subqueue);
-
 void netif_tx_wake_queue(struct netdev_queue *dev_queue)
 {
if (test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &dev_queue->state)) {
-- 
2.9.3
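
The shape of the change above can be modelled in plain C, as a sketch only: the index-based helper does nothing beyond looking up the queue and calling the per-queue helper, so the "clear the stopped bit, then reschedule" logic lives in one place. The toy_* types and functions are hypothetical stand-ins for net_device, netdev_queue, netif_tx_wake_queue() and netif_wake_subqueue(); the wakeups counter replaces the real reschedule.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_QUEUES 4

struct toy_txq {
	bool stopped;          /* models __QUEUE_STATE_DRV_XOFF */
	unsigned int wakeups;  /* counts how often a reschedule was requested */
};

struct toy_dev {
	struct toy_txq tx[TOY_MAX_QUEUES];
};

/* Per-queue wake: clear the stopped flag and reschedule only if it was set. */
static void toy_tx_wake_queue(struct toy_txq *txq)
{
	if (txq->stopped) {
		txq->stopped = false;
		txq->wakeups++;
	}
}

/* Index-based wake: a lookup followed by the per-queue helper. */
static inline void toy_wake_subqueue(struct toy_dev *dev, unsigned int idx)
{
	assert(idx < TOY_MAX_QUEUES);
	toy_tx_wake_queue(&dev->tx[idx]);
}
```

Waking a queue that is not stopped is a no-op in this model, just as the test_and_clear_bit() guard makes it one in the code being replaced.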

Re: [PATCH v2 2/2] stmmac: rename it to synopsys

2017-01-11 Thread Florian Fainelli



On 01/11/2017 07:44 PM, Jie Deng wrote:
> Hi Joao,
> 
> On 2017/1/11 18:35, Joao Pinto wrote:
>> Hi Jie,
>>
>> Às 4:00 AM de 1/11/2017, Jie Deng escreveu:
>>> Hi Joao,
>>>
>>>
>>> On 2017/1/10 22:52, Joao Pinto wrote:
>>>> This patch renames stmicro/stmmac to synopsys/ since it is a standard
>>>> ethernet software package regarding synopsys ethernet controllers, supporting
>>>> the majority of Synopsys Ethernet IPs. The config IDs remain the same, for
>>>> retro-compatibility, only the description was changed.
>>>>
>>>> Signed-off-by: Joao Pinto 
>>>> ---
>>>> changes v1->v2:
>>>> - nothing changed. Just to keep up with patch set version
>>>>
>>>> @@ -1,5 +1,5 @@
>>>>  config STMMAC_ETH
>>>> -  tristate "STMicroelectronics 10/100/1000 Ethernet driver"
>>>> +  tristate "Synopsys Ethernet drivers"
>>>>    depends on HAS_IOMEM && HAS_DMA
>>>>    select MII
>>>>    select PHYLIB
>>>> @@ -14,7 +14,7 @@ config STMMAC_ETH
>>>>  if STMMAC_ETH
>>>>
>>> "Synopsys Ethernet drivers" is too generic. The name should reflect the
>>> controller. This driver is for Synopsys GMAC 10M/100M/1G IPs. We will submit a
>>> driver for the new 25G/40G/50G/100G XLGMAC IP in the future.
>> As you know, Synopsys is an IP vendor that has a wide range of IPs related to
>> Ethernet. stmmac is a driver that was well built and well maintained and
>> supports most of those Ethernet IPs, so it has the potential to be the official
>> Synopsys Ethernet driver suite in the future.
>>
>> Let's take baby steps. For now stmmac supports 10M/100M/1G and also QoS, which is
>> a separate IP, and the different IP cores are selected by device tree.
>>
>> In the future it would be useful for them to be selectable by Kconfig options,
>> making it clearer to the kernel user. For example:
>>
>> Synopsys Ethernet drivers
>>   eQoS Core
>>   10M Core
>>   100M Core
>>   1G Core
>>   25G Core
>>   
>>
>> The XLGMAC will be great for our users and I will be 100% available to help you
>> with it.
>>
>> Thanks,
>> Joao
>>
> Currently, Synopsys has three series of IP cores. They are
> 1. 10M/100M/1G   (GMAC, QoS)
> 2. 1G/2.5G/5G/10G (XGMAC)
> 3. 10G/25G/40G/50G/100G (XLGMAC)
> More info: https://www.synopsys.com/designware-ip/interface-ip/ethernet.html
>  
> You have successfully merged dwc_eth_qos.c into stmmac. stmmac now fully
> supports the 10M/100M/1G series IPs. Personally, I support Florian's
> suggestion not to rename stmmac.
> To avoid future confusion and to ease maintenance, my suggestions are as
> follows:
>    1. do not rename anything.
>    2. keep all 10M/100M/1G IP (GMAC, QoS) development in stmmac.
>    3. keep all 1G/2.5G/5G/10G IP (XGMAC) development in amd-xgbe.
>    4. submit a new driver under synopsys/ for the new 10G/25G/40G/50G/100G
> (XLGMAC) IP.
> 
> Opinions from others are welcome.

Seems like a reasonable plan to me. If it helps avoid confusion, you
could always add an entry under Documentation/networking/ which
describes the history of the various drivers, and the reasons why they
are not consolidated under a synopsys directory.

Cheers
-- 
Florian

[Patch net] atm: switch to the new wait API for vcc_sendmsg()

2017-01-11 Thread Cong Wang

Andrey reported a kernel warning about a blocking operation, alloc_tx(),
being called between prepare_to_wait() and schedule().

Of course, the logic itself is suspicious: other sendmsg()
implementations handle skb allocation failure very well, and it is
not clear why ATM has to wait for a successful one here. But it is
probably too late to change, since the errno and behavior are
visible to user-space. So just leave the logic as it is.

Reported-by: Andrey Konovalov 
Tested-by: Andrey Konovalov 
Signed-off-by: Cong Wang 
---
 net/atm/common.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/net/atm/common.c b/net/atm/common.c
index 7ec3bbc..b672231 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -572,8 +572,8 @@ int vcc_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 
 int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t size)
 {
+   DEFINE_WAIT_FUNC(wait, woken_wake_function);
struct sock *sk = sock->sk;
-   DEFINE_WAIT(wait);
struct atm_vcc *vcc;
struct sk_buff *skb;
int eff, error;
@@ -605,14 +605,14 @@ int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t size)
}
 
eff = (size+3) & ~3; /* align to word boundary */
-   prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
+   add_wait_queue(sk_sleep(sk), &wait);
error = 0;
while (!(skb = alloc_tx(vcc, eff))) {
if (m->msg_flags & MSG_DONTWAIT) {
error = -EAGAIN;
break;
}
-   schedule();
+   wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
if (signal_pending(current)) {
error = -ERESTARTSYS;
break;
@@ -624,9 +624,8 @@ int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t size)
send_sig(SIGPIPE, current, 0);
break;
}
-   prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
}
-   finish_wait(sk_sleep(sk), &wait);
+   remove_wait_queue(sk_sleep(sk), &wait);
if (error)
goto out;
skb->dev = NULL; /* for paths shared with net_device interfaces */
-- 
2.5.5

[Patch net] atm: remove an unnecessary loop

2017-01-11 Thread Cong Wang

alloc_tx() is already called from a loop that waits for a successful
skb allocation, so the loop inside alloc_tx() is unnecessary
and actually problematic.

Signed-off-by: Cong Wang 
---
 net/atm/common.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/atm/common.c b/net/atm/common.c
index a3ca922..7ec3bbc 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -72,10 +72,11 @@ static struct sk_buff *alloc_tx(struct atm_vcc *vcc, unsigned int size)
 sk_wmem_alloc_get(sk), size, sk->sk_sndbuf);
return NULL;
}
-   while (!(skb = alloc_skb(size, GFP_KERNEL)))
-   schedule();
-   pr_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize);
-   atomic_add(skb->truesize, &sk->sk_wmem_alloc);
+   skb = alloc_skb(size, GFP_KERNEL);
+   if (skb) {
+   pr_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize);
+   atomic_add(skb->truesize, &sk->sk_wmem_alloc);
+   }
return skb;
 }
 
-- 
2.5.5
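
A small userspace model of the pattern this patch moves to, offered as a sketch under stated assumptions rather than the ATM code: the allocation helper makes a single attempt, charges the socket only on success, and returns NULL otherwise, leaving the retry policy to the caller's existing wait loop. struct toy_sock, toy_alloc_tx() and the allocator hook are hypothetical, and the send-buffer check is simplified compared with the real alloc_tx().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook so the failure path can be exercised. */
typedef void *(*toy_alloc_fn)(size_t size);

struct toy_sock {
	size_t wmem_alloc;  /* bytes currently charged to the socket */
	size_t sndbuf;      /* send buffer limit */
};

/* Allocator that always fails, standing in for memory pressure. */
static void *toy_alloc_fail(size_t size)
{
	(void)size;
	return NULL;
}

/* One attempt only: charge the socket if the allocation succeeded and
 * otherwise hand NULL back to the caller, which owns the retry policy. */
static void *toy_alloc_tx(struct toy_sock *sk, size_t size, toy_alloc_fn alloc)
{
	void *buf;

	if (sk->wmem_alloc + size > sk->sndbuf)
		return NULL;
	buf = alloc(size);
	if (buf)
		sk->wmem_alloc += size;
	return buf;
}
```

Because the helper never loops, a caller that handles MSG_DONTWAIT or a pending signal gets a chance to act after every failed attempt.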

Re: net/atm: warning in alloc_tx/__might_sleep

2017-01-11 Thread Cong Wang

On Wed, Jan 11, 2017 at 11:46 AM, Michal Hocko  wrote:
> On Wed 11-01-17 20:45:25, Michal Hocko wrote:
>> On Wed 11-01-17 09:37:06, Chas Williams wrote:
>> > On Mon, 2017-01-09 at 18:20 +0100, Andrey Konovalov wrote:
>> > > Hi!
>> > >
>> > > I've got the following error report while running the syzkaller fuzzer.
>> > >
>> > > On commit a121103c922847ba5010819a3f250f1f7fc84ab8 (4.10-rc3).
>> > >
>> > > A reproducer is attached.
>> > >
>> > > [ cut here ]
>> > > WARNING: CPU: 0 PID: 4114 at kernel/sched/core.c:7737 __might_sleep+0x149/0x1a0
>> > > do not call blocking ops when !TASK_RUNNING; state=1 set at
>> > > [] prepare_to_wait+0x182/0x530
>> > > Modules linked in:
>> > > CPU: 0 PID: 4114 Comm: a.out Not tainted 4.10.0-rc3+ #59
>> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> > > Call Trace:
>> > >  __dump_stack lib/dump_stack.c:15
>> > >  dump_stack+0x292/0x398 lib/dump_stack.c:51
>> > >  __warn+0x19f/0x1e0 kernel/panic.c:547
>> > >  warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:562
>> > >  __might_sleep+0x149/0x1a0 kernel/sched/core.c:7732
>> > >  slab_pre_alloc_hook mm/slab.h:408
>> > >  slab_alloc_node mm/slub.c:2634
>> > >  kmem_cache_alloc_node+0x14a/0x280 mm/slub.c:2744
>> > >  __alloc_skb+0x10f/0x800 net/core/skbuff.c:219
>> > >  alloc_skb ./include/linux/skbuff.h:926
>> > >  alloc_tx net/atm/common.c:75
>> >
>> > This is likely alloc_skb(..., GFP_KERNEL) in alloc_tx().  The simplest
>> > fix for this would be simply to switch this GFP_ATOMIC.  See if this is
>> > any better.
>> >
>> > diff --git a/net/atm/common.c b/net/atm/common.c
>> > index a3ca922..d84220c 100644
>> > --- a/net/atm/common.c
>> > +++ b/net/atm/common.c
>> > @@ -72,7 +72,7 @@ static struct sk_buff *alloc_tx(struct atm_vcc *vcc, unsigned int size)
>> >  sk_wmem_alloc_get(sk), size, sk->sk_sndbuf);
>> > return NULL;
>> > }
>> > -   while (!(skb = alloc_skb(size, GFP_KERNEL)))
>> > +   while (!(skb = alloc_skb(size, GFP_ATOMIC)))
>> > schedule();
>> > pr_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize);
>> > atomic_add(skb->truesize, &sk->sk_wmem_alloc);
>>
>> Blee, this code is just horrendous. But the "fix" is obviously broken!
>> schedule() is just a noop if you do not change the task state and what
>> you are just asking for is a never failing non sleeping allocation - aka
>> a busy loop in the kernel!
>
> And btw. this while loop should be really turned into GFP_KERNEL |
> __GFP_NOFAIL with an explanation why this allocation cannot possibly
> fail.

I think a nested loop is quite unnecessary, probably because the code itself
is pretty old. alloc_tx() is in the outer loop and alloc_skb() is in the
inner loop; both seem to wait for a successful GFP allocation. The inner one
is even more unnecessary.

Of course, I would not be surprised if MM already has a mechanism that
implements similar logic.

There may be some reason ATM needs such logic, although other protocols
handle skb allocation failure quite well in ->sendmsg().

[PATCH v4 03/13] net: ethernet: aquantia: Add ring support code

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Add code to support the transmit and receive ring buffers.
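
As an illustration of the wraparound handling that aq_ring_tx_append_buffs() in this patch performs, here is a self-contained sketch with hypothetical names (struct toy_ring, toy_ring_append()) and plain ints in place of buffer descriptors. It copies in one piece when the entries fit before the end of the array and in two pieces otherwise. Advancing the tail inside the helper is an addition made for this sketch; the function in the patch does not update sw_tail itself.

```c
#include <assert.h>
#include <string.h>

#define TOY_RING_SIZE 8

struct toy_ring {
	int buff[TOY_RING_SIZE];
	unsigned int sw_tail;  /* next slot to fill */
};

/* Copy count entries into the ring starting at sw_tail, splitting the copy
 * in two when it would run past the end of the array. The caller is assumed
 * to have checked that count free slots exist. */
static void toy_ring_append(struct toy_ring *ring, const int *src,
			    unsigned int count)
{
	assert(count <= TOY_RING_SIZE);
	if (ring->sw_tail + count <= TOY_RING_SIZE) {
		memcpy(&ring->buff[ring->sw_tail], src, sizeof(src[0]) * count);
	} else {
		unsigned int first = TOY_RING_SIZE - ring->sw_tail;
		unsigned int second = count - first;

		memcpy(&ring->buff[ring->sw_tail], src, sizeof(src[0]) * first);
		memcpy(&ring->buff[0], &src[first], sizeof(src[0]) * second);
	}
	/* Sketch-only convenience: wrap the tail after the copy. */
	ring->sw_tail = (ring->sw_tail + count) % TOY_RING_SIZE;
}
```

For example, appending four entries at tail index 6 of an 8-slot ring fills slots 6, 7, 0 and 1.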

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/aquantia/aq_ring.c | 359 
 drivers/net/ethernet/aquantia/aq_ring.h | 158 ++
 2 files changed, 517 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/aq_ring.c
 create mode 100644 drivers/net/ethernet/aquantia/aq_ring.h

diff --git a/drivers/net/ethernet/aquantia/aq_ring.c b/drivers/net/ethernet/aquantia/aq_ring.c
new file mode 100644
index 0000000..49df496
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/aq_ring.c
@@ -0,0 +1,359 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File aq_ring.c: Definition of functions for Rx/Tx rings. */
+
+#include "aq_ring.h"
+#include "aq_nic.h"
+#include "aq_hw.h"
+
+#include 
+#include 
+
+static struct aq_ring_s *aq_ring_alloc(struct aq_ring_s *self,
+  struct aq_nic_s *aq_nic)
+{
+   int err = 0;
+
+   self->buff_ring = (struct aq_ring_buff_s *)
+   kzalloc(sizeof(struct aq_ring_buff_s) * self->size, GFP_KERNEL);
+
+   if (!self->buff_ring) {
+   err = -ENOMEM;
+   goto err_exit;
+   }
+   self->dx_ring = dma_alloc_coherent(aq_nic_get_dev(aq_nic),
+   self->size * self->dx_size,
+   &self->dx_ring_pa, GFP_KERNEL);
+   if (!self->dx_ring) {
+   err = -ENOMEM;
+   goto err_exit;
+   }
+
+err_exit:
+   if (err < 0) {
+   aq_ring_free(self);
+   self = NULL;
+   }
+   return self;
+}
+
+struct aq_ring_s *aq_ring_tx_alloc(struct aq_ring_s *self,
+  struct aq_nic_s *aq_nic,
+  unsigned int idx,
+  struct aq_nic_cfg_s *aq_nic_cfg)
+{
+   int err = 0;
+
+   self->aq_nic = aq_nic;
+   self->idx = idx;
+   self->size = aq_nic_cfg->txds;
+   self->dx_size = aq_nic_cfg->aq_hw_caps->txd_size;
+
+   self = aq_ring_alloc(self, aq_nic);
+   if (!self) {
+   err = -ENOMEM;
+   goto err_exit;
+   }
+
+err_exit:
+   if (err < 0) {
+   aq_ring_free(self);
+   self = NULL;
+   }
+   return self;
+}
+
+struct aq_ring_s *aq_ring_rx_alloc(struct aq_ring_s *self,
+  struct aq_nic_s *aq_nic,
+  unsigned int idx,
+  struct aq_nic_cfg_s *aq_nic_cfg)
+{
+   int err = 0;
+
+   self->aq_nic = aq_nic;
+   self->idx = idx;
+   self->size = aq_nic_cfg->rxds;
+   self->dx_size = aq_nic_cfg->aq_hw_caps->rxd_size;
+
+   self = aq_ring_alloc(self, aq_nic);
+   if (!self) {
+   err = -ENOMEM;
+   goto err_exit;
+   }
+
+err_exit:
+   if (err < 0) {
+   aq_ring_free(self);
+   self = NULL;
+   }
+   return self;
+}
+
+void aq_ring_init(struct aq_ring_s *self)
+{
+   self->hw_head = 0;
+   self->sw_head = 0;
+   self->sw_tail = 0;
+}
+
+void aq_ring_free(struct aq_ring_s *self)
+{
+   if (!self)
+   return;
+
+   kfree(self->buff_ring);
+
+   if (self->dx_ring)
+   dma_free_coherent(aq_nic_get_dev(self->aq_nic),
+ self->size * self->dx_size, self->dx_ring,
+ self->dx_ring_pa);
+}
+
+void aq_ring_tx_append_buffs(struct aq_ring_s *self,
+struct aq_ring_buff_s *buffer,
+unsigned int buffers)
+{
+   if (likely(self->sw_tail + buffers < self->size)) {
+   memcpy(&self->buff_ring[self->sw_tail], buffer,
+  sizeof(buffer[0]) * buffers);
+   } else {
+   unsigned int first_part = self->size - self->sw_tail;
+   unsigned int second_part = buffers - first_part;
+
+   memcpy(&self->buff_ring[self->sw_tail], buffer,
+  sizeof(buffer[0]) * first_part);
+
+   memcpy(&self->buff_ring[0], &buffer[first_part],
+  sizeof(buffer[0]) * second_part);
+   }
+}
+
+void aq_ring_tx_clean(struct aq_ring_s *self)
+{
+   struct device *dev = aq_nic_get_dev(self->aq_nic);
+
+   for (;

[PATCH v4 09/13] net: ethernet: aquantia: Atlantic hardware abstraction layer

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Add common functions for Atlantic hardware abstraction layer.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 .../net/ethernet/aquantia/hw_atl/hw_atl_utils.c| 548 +
 .../net/ethernet/aquantia/hw_atl/hw_atl_utils.h| 211 
 2 files changed, 759 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/hw_atl/hw_atl_utils.c
 create mode 100644 drivers/net/ethernet/aquantia/hw_atl/hw_atl_utils.h

diff --git a/drivers/net/ethernet/aquantia/hw_atl/hw_atl_utils.c b/drivers/net/ethernet/aquantia/hw_atl/hw_atl_utils.c
new file mode 100644
index 0000000..122cd25
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/hw_atl/hw_atl_utils.c
@@ -0,0 +1,548 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File hw_atl_utils.c: Definition of common functions for Atlantic hardware
+ * abstraction layer.
+ */
+
+#include "../aq_hw.h"
+#include "../aq_hw_utils.h"
+#include "../aq_pci_func.h"
+#include "../aq_ring.h"
+#include "../aq_vec.h"
+#include "hw_atl_utils.h"
+#include "hw_atl_llh.h"
+
+#include 
+
+#define HW_ATL_UCP_0X370_REG    0x0370U
+
+#define HW_ATL_FW_SM_RAM        0x2U
+#define HW_ATL_MPI_CONTROL_ADR  0x0368U
+#define HW_ATL_MPI_STATE_ADR    0x036CU
+
+#define HW_ATL_MPI_STATE_MSK    0x00FFU
+#define HW_ATL_MPI_STATE_SHIFT  0U
+#define HW_ATL_MPI_SPEED_MSK    0xU
+#define HW_ATL_MPI_SPEED_SHIFT  16U
+
+static int hw_atl_utils_fw_downld_dwords(struct aq_hw_s *self, u32 a,
+u32 *p, u32 cnt)
+{
+   int err = 0;
+
+   AQ_HW_WAIT_FOR(reg_glb_cpu_sem_get(self,
+  HW_ATL_FW_SM_RAM) == 1U, 1U, 1000U);
+
+   if (err < 0) {
+   bool is_locked;
+
+   reg_glb_cpu_sem_set(self, 1U, HW_ATL_FW_SM_RAM);
+   is_locked = reg_glb_cpu_sem_get(self, HW_ATL_FW_SM_RAM);
+   if (!is_locked) {
+   err = ETIME;
+   goto err_exit;
+   }
+   }
+
+   aq_hw_write_reg(self, 0x0208U, a);
+
+   for (++cnt; --cnt;) {
+   u32 i = 0U;
+
+   aq_hw_write_reg(self, 0x0200U, 0x8000U);
+
+   for (i = 1024U;
+   (0x100U & aq_hw_read_reg(self, 0x0200U)) && --i;) {
+   }
+
+   *(p++) = aq_hw_read_reg(self, 0x020CU);
+   }
+
+   reg_glb_cpu_sem_set(self, 1U, HW_ATL_FW_SM_RAM);
+
+err_exit:
+   return err;
+}
+
+static void hw_atl_utils_fw_upload_dwords(struct aq_hw_s *self, u32 a, u32 *p,
+ u32 cnt)
+{
+   int err = 0;
+   bool is_locked;
+
+   is_locked = reg_glb_cpu_sem_get(self, HW_ATL_FW_SM_RAM);
+   if (!is_locked) {
+   err = ETIME;
+   goto err_exit;
+   }
+
+   aq_hw_write_reg(self, 0x0208U, a);
+
+   for (++cnt; --cnt;) {
+   u32 i = 0U;
+
+   aq_hw_write_reg(self, 0x020CU, *(p++));
+   aq_hw_write_reg(self, 0x0200U, 0xC000U);
+
+   for (i = 1024U;
+   (0x100U & aq_hw_read_reg(self, 0x0200U)) && --i;) {
+   }
+   }
+
+   reg_glb_cpu_sem_set(self, 1U, HW_ATL_FW_SM_RAM);
+
+err_exit:;
+   (void)err;
+}
+
+static int hw_atl_utils_init_ucp(struct aq_hw_s *self)
+{
+   int err = 0;
+
+   if (!aq_hw_read_reg(self, 0x370U)) {
+   unsigned int rnd = 0U;
+   unsigned int ucp_0x370 = 0U;
+
+   get_random_bytes(&rnd, sizeof(unsigned int));
+
+   ucp_0x370 = 0x02020202U | (0xFEFEFEFEU & rnd);
+   aq_hw_write_reg(self, HW_ATL_UCP_0X370_REG, ucp_0x370);
+   }
+
+   reg_glb_cpu_scratch_scp_set(self, 0xU, 25U);
+
+   /* check 10 times by 1ms */
+   AQ_HW_WAIT_FOR(0U != (PHAL_ATLANTIC_A0->mbox_addr =
+   aq_hw_read_reg(self, 0x360U)), 1000U, 10U);
+
+   return err;
+}
+
+#define HW_ATL_RPC_CONTROL_ADR 0x0338U
+#define HW_ATL_RPC_STATE_ADR   0x033CU
+
+struct aq_hw_atl_utils_fw_rpc_tid_s {
+   union {
+   u32 val;
+   struct {
+   u16 tid;
+   u16 len;
+   };
+   };
+};
+
+#define hw_atl_utils_fw_rpc_init(_H_) hw_atl_utils_fw_rpc_wait(_H_, NULL)
+
+static int hw_atl_utils_fw_rpc_call(struct aq_hw_s *self, unsigned int rpc_size)
+{
+

[PATCH v4 06/13] net: ethernet: aquantia: Atlantic A0 and B0 specific functions.

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Add Atlantic A0 and B0 specific functions.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/aquantia/hw_atl/hw_atl_a0.c   | 908 +++
 drivers/net/ethernet/aquantia/hw_atl/hw_atl_a0.h   |  35 +
 .../ethernet/aquantia/hw_atl/hw_atl_a0_internal.h  | 153 
 drivers/net/ethernet/aquantia/hw_atl/hw_atl_b0.c   | 961 +
 drivers/net/ethernet/aquantia/hw_atl/hw_atl_b0.h   |  35 +
 .../ethernet/aquantia/hw_atl/hw_atl_b0_internal.h  | 206 +
 6 files changed, 2298 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/hw_atl/hw_atl_a0.c
 create mode 100644 drivers/net/ethernet/aquantia/hw_atl/hw_atl_a0.h
 create mode 100644 drivers/net/ethernet/aquantia/hw_atl/hw_atl_a0_internal.h
 create mode 100644 drivers/net/ethernet/aquantia/hw_atl/hw_atl_b0.c
 create mode 100644 drivers/net/ethernet/aquantia/hw_atl/hw_atl_b0.h
 create mode 100644 drivers/net/ethernet/aquantia/hw_atl/hw_atl_b0_internal.h

diff --git a/drivers/net/ethernet/aquantia/hw_atl/hw_atl_a0.c b/drivers/net/ethernet/aquantia/hw_atl/hw_atl_a0.c
new file mode 100644
index 0000000..02863c8
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/hw_atl/hw_atl_a0.c
@@ -0,0 +1,908 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File hw_atl_a0.c: Definition of Atlantic hardware specific functions. */
+
+#include "../aq_hw.h"
+#include "../aq_hw_utils.h"
+#include "../aq_ring.h"
+#include "hw_atl_a0.h"
+#include "hw_atl_utils.h"
+#include "hw_atl_llh.h"
+#include "hw_atl_a0_internal.h"
+
+static int hw_atl_a0_get_hw_caps(struct aq_hw_s *self,
+struct aq_hw_caps_s *aq_hw_caps)
+{
+   memcpy(aq_hw_caps, &hw_atl_a0_hw_caps_, sizeof(*aq_hw_caps));
+   return 0;
+}
+
+static struct aq_hw_s *hw_atl_a0_create(struct aq_pci_func_s *aq_pci_func,
+   unsigned int port,
+   struct aq_hw_ops *ops)
+{
+   struct hw_atl_s *self = NULL;
+   int err = 0;
+
+   self = kzalloc(sizeof(*self), GFP_KERNEL);
+   if (!self) {
+   err = -ENOMEM;
+   goto err_exit;
+   }
+   self->base.aq_pci_func = aq_pci_func;
+
+   self->base.not_ff_addr = 0x10U;
+
+err_exit:
+   return (struct aq_hw_s *)self;
+}
+
+static void hw_atl_a0_destroy(struct aq_hw_s *self)
+{
+   kfree(self);
+}
+
+static int hw_atl_a0_hw_reset(struct aq_hw_s *self)
+{
+   int err = 0;
+
+   glb_glb_reg_res_dis_set(self, 1U);
+   pci_pci_reg_res_dis_set(self, 0U);
+   rx_rx_reg_res_dis_set(self, 0U);
+   tx_tx_reg_res_dis_set(self, 0U);
+
+   HW_ATL_FLUSH();
+   glb_soft_res_set(self, 1);
+
+   /* check 10 times by 1ms */
+   AQ_HW_WAIT_FOR(glb_soft_res_get(self) == 0, 1000U, 10U);
+   if (err < 0)
+   goto err_exit;
+
+   itr_irq_reg_res_dis_set(self, 0U);
+   itr_res_irq_set(self, 1U);
+
+   /* check 10 times by 1ms */
+   AQ_HW_WAIT_FOR(itr_res_irq_get(self) == 0, 1000U, 10U);
+   if (err < 0)
+   goto err_exit;
+
+   hw_atl_utils_mpi_set(self, MPI_RESET, 0x0U);
+
+   err = aq_hw_err_from_flags(self);
+
+err_exit:
+   return err;
+}
+
+static int hw_atl_a0_hw_qos_set(struct aq_hw_s *self)
+{
+   u32 tc = 0U;
+   u32 buff_size = 0U;
+   unsigned int i_priority = 0U;
+   bool is_rx_flow_control = false;
+
+   /* TPS Descriptor rate init */
+   tps_tx_pkt_shed_desc_rate_curr_time_res_set(self, 0x0U);
+   tps_tx_pkt_shed_desc_rate_lim_set(self, 0xA);
+
+   /* TPS VM init */
+   tps_tx_pkt_shed_desc_vm_arb_mode_set(self, 0U);
+
+   /* TPS TC credits init */
+   tps_tx_pkt_shed_desc_tc_arb_mode_set(self, 0U);
+   tps_tx_pkt_shed_data_arb_mode_set(self, 0U);
+
+   tps_tx_pkt_shed_tc_data_max_credit_set(self, 0xFFF, 0U);
+   tps_tx_pkt_shed_tc_data_weight_set(self, 0x64, 0U);
+   tps_tx_pkt_shed_desc_tc_max_credit_set(self, 0x50, 0U);
+   tps_tx_pkt_shed_desc_tc_weight_set(self, 0x1E, 0U);
+
+   /* Tx buf size */
+   buff_size = HW_ATL_A0_TXBUF_MAX;
+
+   tpb_tx_pkt_buff_size_per_tc_set(self, buff_size, tc);
+   tpb_tx_buff_hi_threshold_per_tc_set(self,
+   (buff_size * (1024 / 32U) * 66U) /
+   100U, tc);
+   tpb_tx_buff_lo_threshold_per_tc_set(self,
+

[PATCH v4 08/13] net: ethernet: aquantia: PCI operations

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Add functions that handle the PCI bus interface.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/aquantia/aq_pci_func.c | 348 
 drivers/net/ethernet/aquantia/aq_pci_func.h |  35 +++
 2 files changed, 383 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/aq_pci_func.c
 create mode 100644 drivers/net/ethernet/aquantia/aq_pci_func.h

diff --git a/drivers/net/ethernet/aquantia/aq_pci_func.c b/drivers/net/ethernet/aquantia/aq_pci_func.c
new file mode 100644
index 0000000..18c427d
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/aq_pci_func.c
@@ -0,0 +1,348 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File aq_pci_func.c: Definition of PCI functions. */
+
+#include "aq_pci_func.h"
+#include "aq_nic.h"
+#include "aq_vec.h"
+#include "aq_hw.h"
+#include 
+
+struct aq_pci_func_s {
+   struct pci_dev *pdev;
+   struct aq_nic_s *port[AQ_CFG_PCI_FUNC_PORTS];
+   void __iomem *mmio;
+   void *aq_vec[AQ_CFG_PCI_FUNC_MSIX_IRQS];
+   resource_size_t mmio_pa;
+   unsigned int msix_entry_mask;
+   unsigned int irq_type;
+   unsigned int ports;
+   bool is_pci_enabled;
+   bool is_regions;
+   bool is_pci_using_dac;
+   struct aq_hw_caps_s aq_hw_caps;
+   struct msix_entry msix_entry[AQ_CFG_PCI_FUNC_MSIX_IRQS];
+};
+
+struct aq_pci_func_s *aq_pci_func_alloc(struct aq_hw_ops *aq_hw_ops,
+   struct pci_dev *pdev,
+   const struct net_device_ops *ndev_ops,
+   const struct ethtool_ops *eth_ops)
+{
+   struct aq_pci_func_s *self = NULL;
+   int err = 0;
+   unsigned int port = 0U;
+
+   if (!aq_hw_ops) {
+   err = -EFAULT;
+   goto err_exit;
+   }
+   self = kzalloc(sizeof(*self), GFP_KERNEL);
+   if (!self) {
+   err = -ENOMEM;
+   goto err_exit;
+   }
+
+   pci_set_drvdata(pdev, self);
+   self->pdev = pdev;
+
+   err = aq_hw_ops->get_hw_caps(NULL, &self->aq_hw_caps);
+   if (err < 0)
+   goto err_exit;
+
+   self->ports = self->aq_hw_caps.ports;
+
+   for (port = 0; port < self->ports; ++port) {
+   struct aq_nic_s *aq_nic = aq_nic_alloc_cold(ndev_ops, eth_ops,
+   &pdev->dev, self,
+   port, aq_hw_ops);
+
+   if (!aq_nic) {
+   err = -ENOMEM;
+   goto err_exit;
+   }
+   self->port[port] = aq_nic;
+   }
+
+err_exit:
+   if (err < 0) {
+   if (self)
+   aq_pci_func_free(self);
+   self = NULL;
+   }
+
+   (void)err;
+   return self;
+}
+
+int aq_pci_func_init(struct aq_pci_func_s *self)
+{
+   int err = 0;
+   unsigned int bar = 0U;
+   unsigned int port = 0U;
+   unsigned int i = 0U;
+
+   err = pci_enable_device(self->pdev);
+   if (err < 0)
+   goto err_exit;
+
+   self->is_pci_enabled = true;
+
+   err = pci_set_dma_mask(self->pdev, DMA_BIT_MASK(64));
+   if (!err) {
+   err = pci_set_consistent_dma_mask(self->pdev, DMA_BIT_MASK(64));
+   self->is_pci_using_dac = 1;
+   }
+   if (err) {
+   err = pci_set_dma_mask(self->pdev, DMA_BIT_MASK(32));
+   if (!err)
+   err = pci_set_consistent_dma_mask(self->pdev,
+ DMA_BIT_MASK(32));
+   self->is_pci_using_dac = 0;
+   }
+   if (err != 0) {
+   err = -ENOSR;
+   goto err_exit;
+   }
+
+   err = pci_request_regions(self->pdev, AQ_CFG_DRV_NAME "_mmio");
+   if (err < 0)
+   goto err_exit;
+
+   self->is_regions = true;
+
+   pci_set_master(self->pdev);
+
+   for (bar = 0; bar < 4; ++bar) {
+   if (IORESOURCE_MEM & pci_resource_flags(self->pdev, bar)) {
+   resource_size_t reg_sz;
+
+   self->mmio_pa = pci_resource_start(self->pdev, bar);
+   if (self->mmio_pa == 0U) {
+   err = -EIO;
+   goto err_exit;
+   }
+
+

[PATCH v4 12/13] net: ethernet: aquantia: Receive side scaling

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Add definitions that support receive side scaling.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/aquantia/aq_rss.h | 27 +++
 1 file changed, 27 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/aq_rss.h

diff --git a/drivers/net/ethernet/aquantia/aq_rss.h b/drivers/net/ethernet/aquantia/aq_rss.h
new file mode 100644
index 0000000..5869844
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/aq_rss.h
@@ -0,0 +1,27 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File aq_rss.h: Receive Side Scaling definitions. */
+
+#ifndef AQ_RSS_H
+#define AQ_RSS_H
+
+#include "aq_common.h"
+#include "aq_cfg.h"
+
+struct aq_rss_parameters {
+   u16 base_cpu_number;
+   u16 indirection_table_size;
+   u16 hash_secret_key_size;
+   u32 hash_secret_key[AQ_CFG_RSS_HASHKEY_SIZE / sizeof(u32)];
+   u8 indirection_table[AQ_CFG_RSS_INDIRECTION_TABLE_MAX];
+};
+
+#endif /* AQ_RSS_H */
+
-- 
2.7.4
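
A note on how these fields are typically used: the patch above only declares struct aq_rss_parameters, so the following is a minimal user-space sketch and not the driver's code. It assumes the common RSS scheme in which a packet hash, reduced modulo the table size, indexes the indirection table and the entry found there names the receive queue. The names rss_params, rss_table_fill and rss_queue_for_hash are hypothetical; only the 128-entry limit is taken from AQ_CFG_RSS_INDIRECTION_TABLE_MAX.

```c
#include <stdint.h>

/* Same value as AQ_CFG_RSS_INDIRECTION_TABLE_MAX in aq_cfg.h. */
#define RSS_TABLE_MAX 128U

/* Hypothetical, trimmed-down stand-in for struct aq_rss_parameters. */
struct rss_params {
	uint16_t indirection_table_size;
	uint8_t indirection_table[RSS_TABLE_MAX];
};

/* Fill the table round-robin so that hashes spread evenly over nqueues. */
static void rss_table_fill(struct rss_params *p, uint16_t size,
			   uint8_t nqueues)
{
	uint16_t i;

	if (size > RSS_TABLE_MAX)
		size = RSS_TABLE_MAX;
	if (nqueues == 0)
		nqueues = 1;

	p->indirection_table_size = size;
	for (i = 0; i < size; i++)
		p->indirection_table[i] = (uint8_t)(i % nqueues);
}

/* Map a packet hash to a receive queue through the indirection table. */
static uint8_t rss_queue_for_hash(const struct rss_params *p, uint32_t hash)
{
	if (p->indirection_table_size == 0)
		return 0;

	return p->indirection_table[hash % p->indirection_table_size];
}
```

Keeping the queue choice in a table rather than computing it directly from the hash lets the mapping be rebalanced without changing the hash function or key.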

[PATCH v4 07/13] net: ethernet: aquantia: Vector operations

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Add functions to manipulate the vector of receive and transmit rings.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/aquantia/aq_vec.c | 386 +
 drivers/net/ethernet/aquantia/aq_vec.h |  43 
 2 files changed, 429 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/aq_vec.c
 create mode 100644 drivers/net/ethernet/aquantia/aq_vec.h

diff --git a/drivers/net/ethernet/aquantia/aq_vec.c b/drivers/net/ethernet/aquantia/aq_vec.c
new file mode 100644
index 0000000..8a4fec9
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/aq_vec.c
@@ -0,0 +1,386 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File aq_vec.c: Definition of common structure for vector of Rx and Tx rings.
+ * Definition of functions for Rx and Tx rings. Friendly module for aq_nic.
+ */
+
+#include "aq_vec.h"
+#include "aq_nic.h"
+#include "aq_ring.h"
+#include "aq_hw.h"
+
+#include 
+
+struct aq_vec_s {
+   AQ_OBJ_HEADER;
+   struct aq_hw_ops *aq_hw_ops;
+   struct aq_hw_s *aq_hw;
+   struct aq_nic_s *aq_nic;
+   unsigned int tx_rings;
+   unsigned int rx_rings;
+   struct aq_ring_param_s aq_ring_param;
+   struct napi_struct napi;
+   struct aq_ring_s ring[AQ_CFG_TCS_MAX][2];
+};
+
+#define AQ_VEC_TX_ID 0
+#define AQ_VEC_RX_ID 1
+
+static int aq_vec_poll(struct napi_struct *napi, int budget)
+__releases(&self->lock)
+__acquires(&self->lock)
+{
+   struct aq_vec_s *self = container_of(napi, struct aq_vec_s, napi);
+   struct aq_ring_s *ring = NULL;
+   int work_done = 0;
+   int err = 0;
+   unsigned int i = 0U;
+   unsigned int sw_tail_old = 0U;
+   bool was_tx_cleaned = false;
+   bool is_locked = false;
+
+   if (!self)
+   return 0;
+
+   if (spin_trylock(&self->lock)) {
+   is_locked = true;
+
+   for (i = 0U, ring = self->ring[0];
+   self->tx_rings > i; ++i, ring = self->ring[i]) {
+   if (self->aq_hw_ops->hw_ring_tx_head_update) {
+   err = self->aq_hw_ops->hw_ring_tx_head_update(
+   self->aq_hw,
+   &ring[AQ_VEC_TX_ID]);
+   if (err < 0)
+   goto err_exit;
+   }
+
+   if (ring[AQ_VEC_TX_ID].sw_head !=
+   ring[AQ_VEC_TX_ID].hw_head) {
+   aq_ring_tx_clean(&ring[AQ_VEC_TX_ID]);
+   was_tx_cleaned = true;
+   }
+
+   err = self->aq_hw_ops->hw_ring_rx_receive(self->aq_hw,
+   &ring[AQ_VEC_RX_ID]);
+   if (err < 0)
+   goto err_exit;
+
+   if (ring[AQ_VEC_RX_ID].sw_head !=
+   ring[AQ_VEC_RX_ID].hw_head) {
+   err = aq_ring_rx_clean(&ring[AQ_VEC_RX_ID],
+  &work_done,
+  budget - work_done);
+   if (err < 0)
+   goto err_exit;
+
+   sw_tail_old = ring[AQ_VEC_RX_ID].sw_tail;
+
+   err = aq_ring_rx_fill(&ring[AQ_VEC_RX_ID]);
+   if (err < 0)
+   goto err_exit;
+
+   err = self->aq_hw_ops->hw_ring_rx_fill(
+   self->aq_hw,
+   &ring[AQ_VEC_RX_ID], sw_tail_old);
+   if (err < 0)
+   goto err_exit;
+   }
+   }
+
+   if (was_tx_cleaned)
+   work_done = budget;
+
+   if (work_done < budget) {
+   napi_complete(napi);
+   self->aq_hw_ops->hw_irq_enable(self->aq_hw,
+   1U << self->aq_ring_param.vec_idx);
+   }
+
+err_exit:
+   if (is_locked)
+   spin_unlock(&self->lock);
+   }
+
+   return work_done;
+}
+
+struct aq_vec_s *aq_vec_alloc(struct

[PATCH v4 01/13] net: ethernet: aquantia: Make and configuration files.

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Patches to create the make and configuration files.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/aquantia/Kconfig  | 24 +++
 drivers/net/ethernet/aquantia/Makefile | 43 ++
 drivers/net/ethernet/aquantia/ver.h    | 19 +++
 3 files changed, 86 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/Kconfig
 create mode 100644 drivers/net/ethernet/aquantia/Makefile
 create mode 100644 drivers/net/ethernet/aquantia/ver.h

diff --git a/drivers/net/ethernet/aquantia/Kconfig b/drivers/net/ethernet/aquantia/Kconfig
new file mode 100644
index 0000000..a74a4c0
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/Kconfig
@@ -0,0 +1,24 @@
+#
+# aQuantia device configuration
+#
+
+config NET_VENDOR_AQUANTIA
+   bool "aQuantia devices"
+   default y
+   ---help---
+ Set this to y if you have an Ethernet network card that uses the
+ aQuantia chipset.
+
+ This option does not build any drivers; it causes the aQuantia
+ drivers that can be built to appear in the list of Ethernet drivers.
+
+
+if NET_VENDOR_AQUANTIA
+
+config AQTION
+   tristate "aQuantia AQtion Support"
+   depends on PCI
+   ---help---
+ This enables the support for the aQuantia AQtion Ethernet card.
+
+endif # NET_VENDOR_AQUANTIA
diff --git a/drivers/net/ethernet/aquantia/Makefile b/drivers/net/ethernet/aquantia/Makefile
new file mode 100644
index 0000000..483776a
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/Makefile
@@ -0,0 +1,43 @@
+
+#
+# aQuantia Ethernet Controller AQtion Linux Driver
+# Copyright(c) 2014-2017 aQuantia Corporation.
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms and conditions of the GNU General Public License,
+# version 2, as published by the Free Software Foundation.
+#
+# This program is distributed in the hope it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+# more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program. If not, see .
+#
+# The full GNU General Public License is included in this distribution in
+# the file called "COPYING".
+#
+# Contact Information: 
+# aQuantia Corporation, 105 E. Tasman Dr. San Jose, CA 95134, USA
+#
+
+
+#
+# Makefile for the AQtion(tm) Ethernet driver
+#
+
+obj-$(CONFIG_AQTION) += atlantic.o
+
+atlantic-objs := aq_main.o \
+   aq_nic.o \
+   aq_pci_func.o \
+   aq_vec.o \
+   aq_ring.o \
+   aq_hw_utils.o \
+   aq_ethtool.o \
+   hw_atl/hw_atl_a0.o \
+   hw_atl/hw_atl_b0.o \
+   hw_atl/hw_atl_utils.o \
+   hw_atl/hw_atl_llh.o
+
diff --git a/drivers/net/ethernet/aquantia/ver.h b/drivers/net/ethernet/aquantia/ver.h
new file mode 100644
index 0000000..40be675
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/ver.h
@@ -0,0 +1,19 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+#ifndef VER_H
+#define VER_H
+
+#define NIC_MAJOR_DRIVER_VERSION   1
+#define NIC_MINOR_DRIVER_VERSION   5
+#define NIC_BUILD_DRIVER_VERSION   339
+#define NIC_REVISION_DRIVER_VERSION   0
+
+#endif /* VER_H */
+
-- 
2.7.4
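
For illustration: the four NIC_*_DRIVER_VERSION values in ver.h are joined into a dotted version string by AQ_CFG_DRV_VERSION in aq_cfg.h, which uses the kernel's __stringify(). The user-space sketch below assumes a local two-level STRINGIFY macro as a stand-in for __stringify(); DRV_VERSION and drv_version() are hypothetical names that do not appear in the series.

```c
/* Two-level expansion so the macro's value, not its name, is stringified. */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)

/* Values copied from ver.h in this patch. */
#define NIC_MAJOR_DRIVER_VERSION    1
#define NIC_MINOR_DRIVER_VERSION    5
#define NIC_BUILD_DRIVER_VERSION    339
#define NIC_REVISION_DRIVER_VERSION 0

/* Adjacent string literals are concatenated by the compiler. */
#define DRV_VERSION STRINGIFY(NIC_MAJOR_DRIVER_VERSION) "." \
		    STRINGIFY(NIC_MINOR_DRIVER_VERSION) "." \
		    STRINGIFY(NIC_BUILD_DRIVER_VERSION) "." \
		    STRINGIFY(NIC_REVISION_DRIVER_VERSION)

/* Hypothetical accessor, only here so the string can be inspected. */
static const char *drv_version(void)
{
	return DRV_VERSION;
}
```

With the values above, drv_version() returns "1.5.339.0". A single-level #x macro would instead produce the macro names themselves, which is why the extra level of expansion is needed.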

[PATCH v4 10/13] net: ethernet: aquantia: Hardware interface and utility functions

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Add functions to interface with the hardware and some utility functions.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/aquantia/aq_hw.h   | 170 
 drivers/net/ethernet/aquantia/aq_hw_utils.c |  69 +++
 drivers/net/ethernet/aquantia/aq_hw_utils.h |  48 
 3 files changed, 287 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/aq_hw.h
 create mode 100644 drivers/net/ethernet/aquantia/aq_hw_utils.c
 create mode 100644 drivers/net/ethernet/aquantia/aq_hw_utils.h

diff --git a/drivers/net/ethernet/aquantia/aq_hw.h b/drivers/net/ethernet/aquantia/aq_hw.h
new file mode 100644
index 0000000..e9ac912
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/aq_hw.h
@@ -0,0 +1,170 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File aq_hw.h: Declaration of abstract interface for NIC hardware specific
+ * functions.
+ */
+
+#ifndef AQ_HW_H
+#define AQ_HW_H
+
+#include "aq_common.h"
+
+/* NIC H/W capabilities */
+struct aq_hw_caps_s {
+   u64 hw_features;
+   u64 link_speed_msk;
+   unsigned int hw_priv_flags;
+   u32 rxds;
+   u32 txds;
+   u32 txhwb_alignment;
+   u32 irq_mask;
+   u32 vecs;
+   u32 mtu;
+   u32 mac_regs_count;
+   u8 ports;
+   u8 msix_irqs;
+   u8 tcs;
+   u8 rxd_alignment;
+   u8 rxd_size;
+   u8 txd_alignment;
+   u8 txd_size;
+   u8 tx_rings;
+   u8 rx_rings;
+   bool flow_control;
+   bool is_64_dma;
+};
+
+struct aq_hw_link_status_s {
+   u64 bps;
+};
+
+#define AQ_HW_POWER_STATE_D0   0U
+#define AQ_HW_POWER_STATE_D3   3U
+
+#define AQ_HW_FLAG_STARTED     0x0004U
+#define AQ_HW_FLAG_STOPPING    0x0008U
+#define AQ_HW_FLAG_RESETTING   0x0010U
+#define AQ_HW_FLAG_CLOSING     0x0020U
+#define AQ_HW_LINK_DOWN        0x0400U
+#define AQ_HW_FLAG_ERR_UNPLUG  0x4000U
+#define AQ_HW_FLAG_ERR_HW      0x8000U
+
+#define AQ_HW_FLAG_ERRORS  (AQ_HW_FLAG_ERR_HW | AQ_HW_FLAG_ERR_UNPLUG)
+
+struct aq_hw_s {
+   AQ_OBJ_HEADER;
+   struct aq_nic_cfg_s *aq_nic_cfg;
+   struct aq_pci_func_s *aq_pci_func;
+   void __iomem *mmio;
+   unsigned int not_ff_addr;
+   struct aq_hw_link_status_s aq_link_status;
+};
+
+struct aq_ring_s;
+struct aq_ring_param_s;
+struct aq_nic_cfg_s;
+struct sk_buff;
+
+struct aq_hw_ops {
+   struct aq_hw_s *(*create)(struct aq_pci_func_s *aq_pci_func,
+ unsigned int port, struct aq_hw_ops *ops);
+
+   void (*destroy)(struct aq_hw_s *self);
+
+   int (*get_hw_caps)(struct aq_hw_s *self,
+  struct aq_hw_caps_s *aq_hw_caps);
+
+   int (*hw_ring_tx_xmit)(struct aq_hw_s *self, struct aq_ring_s *aq_ring,
+  unsigned int frags);
+
+   int (*hw_ring_rx_receive)(struct aq_hw_s *self,
+ struct aq_ring_s *aq_ring);
+
+   int (*hw_ring_rx_fill)(struct aq_hw_s *self, struct aq_ring_s *aq_ring,
+  unsigned int sw_tail_old);
+
+   int (*hw_ring_tx_head_update)(struct aq_hw_s *self,
+ struct aq_ring_s *aq_ring);
+
+   int (*hw_get_mac_permanent)(struct aq_hw_s *self, u8 *mac);
+
+   int (*hw_set_mac_address)(struct aq_hw_s *self, u8 *mac_addr);
+
+   int (*hw_get_link_status)(struct aq_hw_s *self,
+ struct aq_hw_link_status_s *link_status);
+
+   int (*hw_set_link_speed)(struct aq_hw_s *self, u32 speed);
+
+   int (*hw_reset)(struct aq_hw_s *self);
+
+   int (*hw_init)(struct aq_hw_s *self, struct aq_nic_cfg_s *aq_nic_cfg,
+  u8 *mac_addr);
+
+   int (*hw_start)(struct aq_hw_s *self);
+
+   int (*hw_stop)(struct aq_hw_s *self);
+
+   int (*hw_ring_tx_init)(struct aq_hw_s *self, struct aq_ring_s *aq_ring,
+  struct aq_ring_param_s *aq_ring_param);
+
+   int (*hw_ring_tx_start)(struct aq_hw_s *self,
+   struct aq_ring_s *aq_ring);
+
+   int (*hw_ring_tx_stop)(struct aq_hw_s *self,
+  struct aq_ring_s *aq_ring);
+
+   int (*hw_ring_rx_init)(struct aq_hw_s *self,
+  struct aq_ring_s *aq_ring,
+  struct aq_ring_param_s *aq_ring_param);
+
+   int

[PATCH v4 05/13] net: ethernet: aquantia: Support for NIC-specific code

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Add support for code specific to the Atlantic NIC.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/aquantia/aq_main.c | 292 
 drivers/net/ethernet/aquantia/aq_main.h |  18 +
 drivers/net/ethernet/aquantia/aq_nic.c  | 911 
 drivers/net/ethernet/aquantia/aq_nic.h  | 109 +++
 drivers/net/ethernet/aquantia/aq_nic_internal.h |  47 ++
 5 files changed, 1377 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/aq_main.c
 create mode 100644 drivers/net/ethernet/aquantia/aq_main.h
 create mode 100644 drivers/net/ethernet/aquantia/aq_nic.c
 create mode 100644 drivers/net/ethernet/aquantia/aq_nic.h
 create mode 100644 drivers/net/ethernet/aquantia/aq_nic_internal.h

diff --git a/drivers/net/ethernet/aquantia/aq_main.c b/drivers/net/ethernet/aquantia/aq_main.c
new file mode 100644
index 0000000..380d557
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/aq_main.c
@@ -0,0 +1,292 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File aq_main.c: Main file for aQuantia Linux driver. */
+
+#include "aq_main.h"
+#include "aq_nic.h"
+#include "aq_pci_func.h"
+#include "aq_ethtool.h"
+#include "hw_atl/hw_atl_a0.h"
+#include "hw_atl/hw_atl_b0.h"
+
+#include 
+#include 
+
+static const struct pci_device_id aq_pci_tbl[] = {
+   { PCI_VDEVICE(AQUANTIA, HW_ATL_DEVICE_ID_0001), },
+   { PCI_VDEVICE(AQUANTIA, HW_ATL_DEVICE_ID_D100), },
+   { PCI_VDEVICE(AQUANTIA, HW_ATL_DEVICE_ID_D107), },
+   { PCI_VDEVICE(AQUANTIA, HW_ATL_DEVICE_ID_D108), },
+   { PCI_VDEVICE(AQUANTIA, HW_ATL_DEVICE_ID_D109), },
+   {}
+};
+
+MODULE_DEVICE_TABLE(pci, aq_pci_tbl);
+
+MODULE_LICENSE("GPL v2");
+MODULE_VERSION(AQ_CFG_DRV_VERSION);
+MODULE_AUTHOR(AQ_CFG_DRV_AUTHOR);
+MODULE_DESCRIPTION(AQ_CFG_DRV_DESC);
+
+static struct aq_hw_ops *aq_pci_probe_get_hw_ops_by_id(struct pci_dev *pdev)
+{
+   struct aq_hw_ops *ops = NULL;
+   int err = 0;
+
+   ops = hw_atl_a0_get_ops_by_id(pdev);
+   if (ops) {
+   err = 0;
+   goto err_exit;
+   }
+
+   ops = hw_atl_b0_get_ops_by_id(pdev);
+   if (ops) {
+   err = 0;
+   goto err_exit;
+   }
+
+/* the H/W was not recognized */
+   err = -EFAULT;
+
+err_exit:
+   return ops;
+}
+
+static int aq_ndev_open(struct net_device *ndev)
+{
+   struct aq_nic_s *aq_nic = NULL;
+   int err = 0;
+
+   aq_nic = aq_nic_alloc_hot(ndev);
+   if (!aq_nic) {
+   err = -ENOMEM;
+   goto err_exit;
+   }
+   err = aq_nic_init(aq_nic);
+   if (err < 0)
+   goto err_exit;
+   err = aq_nic_start(aq_nic);
+   if (err < 0)
+   goto err_exit;
+
+err_exit:
+   if (err < 0) {
+   if (aq_nic)
+   aq_nic_deinit(aq_nic);
+   }
+   return err;
+}
+
+static int aq_ndev_close(struct net_device *ndev)
+{
+   struct aq_nic_s *aq_nic = (struct aq_nic_s *)netdev_priv(ndev);
+
+   aq_nic_stop(aq_nic);
+   aq_nic_deinit(aq_nic);
+   aq_nic_free_hot_resources(aq_nic);
+
+   return 0;
+}
+
+static int aq_ndev_start_xmit(struct sk_buff *skb, struct net_device *ndev)
+{
+   struct aq_nic_s *aq_nic = (struct aq_nic_s *)netdev_priv(ndev);
+   int err = 0;
+
+   err = aq_nic_xmit(aq_nic, skb);
+   if (err < 0)
+   goto err_exit;
+
+err_exit:
+   return err;
+}
+
+static int aq_ndev_change_mtu(struct net_device *ndev, int new_mtu)
+{
+   struct aq_nic_s *aq_nic = (struct aq_nic_s *)netdev_priv(ndev);
+   int err = 0;
+
+   if (new_mtu == ndev->mtu) {
+   err = 0;
+   goto err_exit;
+   }
+   if (new_mtu < 68) {
+   err = -EINVAL;
+   goto err_exit;
+   }
+   err = aq_nic_set_mtu(aq_nic, new_mtu + ETH_HLEN);
+   if (err < 0)
+   goto err_exit;
+   ndev->mtu = new_mtu;
+
+   if (netif_running(ndev)) {
+   aq_ndev_close(ndev);
+   aq_ndev_open(ndev);
+   }
+
+err_exit:
+   return err;
+}
+
+static int aq_ndev_set_features(struct net_device *ndev,
+   netdev_features_t features)
+{
+   struct aq_nic_s *aq_nic = (struct aq_nic_s *)netdev_priv(ndev);
+   struct aq_nic_cfg_s *aq_cfg = aq_nic_get_cfg(aq_nic);
+   bool is_lro = false;
+
+

[PATCH v4 13/13] net: ethernet: aquantia: Integrate AQtion 2.5/5 GB NIC driver

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Modify drivers/net/ethernet/{Makefile,Kconfig} to make the aQuantia
driver part of the network drivers build.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/Kconfig  | 1 +
 drivers/net/ethernet/Makefile | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
index 8cc7467..d467c8b 100644
--- a/drivers/net/ethernet/Kconfig
+++ b/drivers/net/ethernet/Kconfig
@@ -28,6 +28,7 @@ source "drivers/net/ethernet/amazon/Kconfig"
 source "drivers/net/ethernet/amd/Kconfig"
 source "drivers/net/ethernet/apm/Kconfig"
 source "drivers/net/ethernet/apple/Kconfig"
+source "drivers/net/ethernet/aquantia/Kconfig"
 source "drivers/net/ethernet/arc/Kconfig"
 source "drivers/net/ethernet/atheros/Kconfig"
 source "drivers/net/ethernet/aurora/Kconfig"
diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
index a09423d..123ef8e 100644
--- a/drivers/net/ethernet/Makefile
+++ b/drivers/net/ethernet/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_NET_VENDOR_AMAZON) += amazon/
 obj-$(CONFIG_NET_VENDOR_AMD) += amd/
 obj-$(CONFIG_NET_XGENE) += apm/
 obj-$(CONFIG_NET_VENDOR_APPLE) += apple/
+obj-$(CONFIG_NET_VENDOR_AQUANTIA) += aquantia/
 obj-$(CONFIG_NET_VENDOR_ARC) += arc/
 obj-$(CONFIG_NET_VENDOR_ATHEROS) += atheros/
 obj-$(CONFIG_NET_VENDOR_AURORA) += aurora/
-- 
2.7.4

[PATCH v4 11/13] net: ethernet: aquantia: Ethtool support

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Add the driver interfaces required for support by the ethtool utility.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/aquantia/aq_ethtool.c | 251 +
 drivers/net/ethernet/aquantia/aq_ethtool.h |  20 +++
 2 files changed, 271 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/aq_ethtool.c
 create mode 100644 drivers/net/ethernet/aquantia/aq_ethtool.h

diff --git a/drivers/net/ethernet/aquantia/aq_ethtool.c b/drivers/net/ethernet/aquantia/aq_ethtool.c
new file mode 100644
index 0000000..1b21f9f
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/aq_ethtool.c
@@ -0,0 +1,251 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File aq_ethtool.c: Definition of ethtool related functions. */
+
+#include "aq_ethtool.h"
+#include "aq_nic.h"
+
+static void aq_ethtool_get_regs(struct net_device *ndev,
+   struct ethtool_regs *regs, void *p)
+{
+   struct aq_nic_s *aq_nic = (struct aq_nic_s *)netdev_priv(ndev);
+   u32 regs_count = aq_nic_get_regs_count(aq_nic);
+
+   memset(p, 0, regs_count * sizeof(u32));
+   aq_nic_get_regs(aq_nic, regs, p);
+}
+
+static int aq_ethtool_get_regs_len(struct net_device *ndev)
+{
+   struct aq_nic_s *aq_nic = (struct aq_nic_s *)netdev_priv(ndev);
+   u32 regs_count = aq_nic_get_regs_count(aq_nic);
+
+   return regs_count * sizeof(u32);
+}
+
+static u32 aq_ethtool_get_link(struct net_device *ndev)
+{
+   struct aq_nic_s *aq_nic = (struct aq_nic_s *)netdev_priv(ndev);
+
+   return aq_nic_get_link_speed(aq_nic) ? 1U : 0U;
+}
+
+static int aq_ethtool_get_settings(struct net_device *ndev,
+  struct ethtool_cmd *cmd)
+{
+   struct aq_nic_s *aq_nic = (struct aq_nic_s *)netdev_priv(ndev);
+
+   cmd->port = PORT_TP;
+   cmd->transceiver = XCVR_EXTERNAL;
+
+   ethtool_cmd_speed_set(cmd, netif_carrier_ok(ndev) ?
+   aq_nic_get_link_speed(aq_nic) : 0U);
+
+   cmd->duplex = DUPLEX_FULL;
+   aq_nic_get_link_settings(aq_nic, cmd);
+   return 0;
+}
+
+static int aq_ethtool_set_settings(struct net_device *ndev,
+  struct ethtool_cmd *cmd)
+{
+   struct aq_nic_s *aq_nic = (struct aq_nic_s *)netdev_priv(ndev);
+
+   return aq_nic_set_link_settings(aq_nic, cmd);
+}
+
+static const char aq_ethtool_stat_names[][ETH_GSTRING_LEN] = {
+   "InPackets",
+   "InUCast",
+   "InMCast",
+   "InBCast",
+   "InErrors",
+   "OutPackets",
+   "OutUCast",
+   "OutMCast",
+   "OutBCast",
+   "InUCastOctets",
+   "OutUCastOctets",
+   "InMCastOctets",
+   "OutMCastOctets",
+   "InBCastOctets",
+   "OutBCastOctets",
+   "InOctets",
+   "OutOctets",
+   "InPacketsDma",
+   "OutPacketsDma",
+   "InOctetsDma",
+   "OutOctetsDma",
+   "InDroppedDma",
+   "Queue[0] InPackets",
+   "Queue[0] OutPackets",
+   "Queue[0] InJumboPackets",
+   "Queue[0] InLroPackets",
+   "Queue[0] InErrors",
+#if 1 < AQ_CFG_VECS_DEF
+   "Queue[1] InPackets",
+   "Queue[1] OutPackets",
+   "Queue[1] InJumboPackets",
+   "Queue[1] InLroPackets",
+   "Queue[1] InErrors",
+#endif
+#if 2 < AQ_CFG_VECS_DEF
+   "Queue[2] InPackets",
+   "Queue[2] OutPackets",
+   "Queue[2] InJumboPackets",
+   "Queue[2] InLroPackets",
+   "Queue[2] InErrors",
+#endif
+#if 3 < AQ_CFG_VECS_DEF
+   "Queue[3] InPackets",
+   "Queue[3] OutPackets",
+   "Queue[3] InJumboPackets",
+   "Queue[3] InLroPackets",
+   "Queue[3] InErrors",
+#endif
+#if 4 < AQ_CFG_VECS_DEF
+   "Queue[4] InPackets",
+   "Queue[4] OutPackets",
+   "Queue[4] InJumboPackets",
+   "Queue[4] InLroPackets",
+   "Queue[4] InErrors",
+#endif
+#if 5 < AQ_CFG_VECS_DEF
+   "Queue[5] InPackets",
+   "Queue[5] OutPackets",
+   "Queue[5] InJumboPackets",
+   "Queue[5] InLroPackets",
+   "Queue[5] InErrors",
+#endif
+#if 6 < AQ_CFG_VECS_DEF
+   "Queue[6] InPackets",
+   "Queue[6] OutPackets",
+   "Queue[6] InJumboPackets",
+   "Queue[6] InLroPackets",
+   "Queue[6] InErrors",
+#endif
+#if 7 < AQ_CFG_VECS_DEF
+   "Queue[7] InPackets",
+   "Queue[7] OutPackets",
+   "Queue[7] InJumboPackets",
+   "Queue[7] InLroPackets",
+   "Queue[7]

[PATCH v4 02/13] net: ethernet: aquantia: Common functions and definitions

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

Add files containing the functions and definitions used in common in
different functional areas.

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: Dmitry Bezrukov 
Signed-off-by: David M. VomLehn 
---
 drivers/net/ethernet/aquantia/aq_cfg.h| 78 +++
 drivers/net/ethernet/aquantia/aq_common.h | 24 ++
 drivers/net/ethernet/aquantia/aq_utils.h  | 54 +
 3 files changed, 156 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/aq_cfg.h
 create mode 100644 drivers/net/ethernet/aquantia/aq_common.h
 create mode 100644 drivers/net/ethernet/aquantia/aq_utils.h

diff --git a/drivers/net/ethernet/aquantia/aq_cfg.h b/drivers/net/ethernet/aquantia/aq_cfg.h
new file mode 100644
index 0000000..63f01b8
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/aq_cfg.h
@@ -0,0 +1,78 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File aq_cfg.h: Definition of configuration parameters and constants. */
+
+#ifndef AQ_CFG_H
+#define AQ_CFG_H
+
+#define AQ_CFG_VECS_DEF   4U
+#define AQ_CFG_TCS_DEF    1U
+
+#define AQ_CFG_TXDS_DEF   4096U
+#define AQ_CFG_RXDS_DEF   1024U
+
+#define AQ_CFG_IS_POLLING_DEF 0U
+
+#define AQ_CFG_FORCE_LEGACY_INT 0U
+
+#define AQ_CFG_IS_INTERRUPT_MODERATION_DEF   1U
+#define AQ_CFG_INTERRUPT_MODERATION_RATE_DEF 0xU
+#define AQ_CFG_IRQ_MASK  0x1FFU
+
+#define AQ_CFG_VECS_MAX   8U
+#define AQ_CFG_TCS_MAX    8U
+
+#define AQ_CFG_TX_FRAME_MAX  (16U * 1024U)
+#define AQ_CFG_RX_FRAME_MAX  (4U * 1024U)
+
+/* LRO */
+#define AQ_CFG_IS_LRO_DEF   1U
+
+/* RSS */
+#define AQ_CFG_RSS_INDIRECTION_TABLE_MAX  128U
+#define AQ_CFG_RSS_HASHKEY_SIZE   320U
+
+#define AQ_CFG_IS_RSS_DEF   1U
+#define AQ_CFG_NUM_RSS_QUEUES_DEF   AQ_CFG_VECS_DEF
+#define AQ_CFG_RSS_BASE_CPU_NUM_DEF 0U
+
+#define AQ_CFG_PCI_FUNC_MSIX_IRQS   9U
+#define AQ_CFG_PCI_FUNC_PORTS   2U
+
+#define AQ_CFG_SERVICE_TIMER_INTERVAL    (2 * HZ)
+#define AQ_CFG_POLLING_TIMER_INTERVAL   ((unsigned int)(2 * HZ))
+
+#define AQ_CFG_SKB_FRAGS_MAX   32U
+
+#define AQ_CFG_NAPI_WEIGHT 64U
+
+#define AQ_CFG_MULTICAST_ADDRESS_MAX 32U
+
+/*#define AQ_CFG_MAC_ADDR_PERMANENT {0x30, 0x0E, 0xE3, 0x12, 0x34, 0x56}*/
+
+#define AQ_CFG_FC_MODE 3U
+
+#define AQ_CFG_SPEED_MSK  0xU  /* 0xU==auto_neg */
+
+#define AQ_CFG_IS_AUTONEG_DEF   1U
+#define AQ_CFG_MTU_DEF  1514U
+
+#define AQ_CFG_LOCK_TRYS   100U
+
+#define AQ_CFG_DRV_AUTHOR  "aQuantia"
+#define AQ_CFG_DRV_DESC    "aQuantia Corporation(R) Network Driver"
+#define AQ_CFG_DRV_NAME    "aquantia"
+#define AQ_CFG_DRV_VERSION __stringify(NIC_MAJOR_DRIVER_VERSION)"."\
+   __stringify(NIC_MINOR_DRIVER_VERSION)"."\
+   __stringify(NIC_BUILD_DRIVER_VERSION)"."\
+   __stringify(NIC_REVISION_DRIVER_VERSION)
+
+#endif /* AQ_CFG_H */
+
diff --git a/drivers/net/ethernet/aquantia/aq_common.h b/drivers/net/ethernet/aquantia/aq_common.h
new file mode 100644
index 000..af00d48
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/aq_common.h
@@ -0,0 +1,24 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File aq_common.h: Basic includes for all files in project. */
+
+#ifndef AQ_COMMON_H
+#define AQ_COMMON_H
+
+#include 
+#include 
+
+#include "ver.h"
+#include "aq_nic.h"
+#include "aq_cfg.h"
+#include "aq_utils.h"
+
+#endif /* AQ_COMMON_H */
+
diff --git a/drivers/net/ethernet/aquantia/aq_utils.h b/drivers/net/ethernet/aquantia/aq_utils.h
new file mode 100644
index 000..e8b4167
--- /dev/null
+++ b/drivers/net/ethernet/aquantia/aq_utils.h
@@ -0,0 +1,54 @@
+/*
+ * aQuantia Corporation Network Driver
+ * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+/* File aq_utils.h: Useful macro and structures used in all layers of driver. */
+
+#ifndef AQ_UTILS_H
+#define AQ_UTILS_H
+
+#include "aq_common.h"
+
+#ifndef MBIT
+#define MBIT

[PATCH v4 00/13] net: ethernet: aquantia: Add AQtion 2.5/5 GB NIC driver

2017-01-11 Thread Alexander Loktionov

From: David VomLehn 

This series introduces the AQtion NIC driver for the aQuantia
AQC107/AQC108 network devices.

v1: Initial version
v2: o Make necessary drivers/net/ethernet changes to integrate software
    o Drop intermediate atlantic directory
    o Remove Makefile things only appropriate to out of tree module
      building
v3: o Move changes to drivers/net/ethernet/{Kconfig,Makefile} to the last
      patch to ensure clean bisection.
    o Removed inline attribute from aq_hw_write_req() as it was defined in
      only one .c file.
    o #included pci.h in aq_common.h to get struct pci definition.
    o Modified code to unlock based on execution flow rather than using a flag.
    o Made a number of functions that were only used in a single file static.
    o Cleaned up error and return code handling in various places.
    o Remove AQ_CFG_IP_ALIGN definition.
    o Other minor code clean up.
v4: o Using do_div for 64 bit division.
    o Modified NIC statistics code.
    o Using build_skb instead of netdev_alloc_skb for single fragment packets.
    o Removed extra aq_nic.o from Makefile
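
For readers unfamiliar with the do_div item in the v4 changelog, here is a
minimal user-space sketch of its semantics. It is not code from this series.
The kernel's do_div(n, base) macro divides the 64-bit n in place and returns
the 32-bit remainder. The names model_do_div() and model_rate_per_sec() are
hypothetical, and the model takes a pointer where the kernel macro takes the
lvalue directly.

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space model of do_div() semantics (assumption: mirrors the kernel
 * macro's behaviour, not its implementation). The dividend is replaced by
 * the quotient and the remainder is returned.
 */
static uint32_t model_do_div(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

/*
 * Hypothetical example of where a driver might need a 64-bit division:
 * turning a byte counter sampled over interval_ms into bytes per second.
 */
static uint64_t model_rate_per_sec(uint64_t bytes, uint32_t interval_ms)
{
	uint64_t scaled = bytes * 1000ULL;

	model_do_div(&scaled, interval_ms);	/* remainder is discarded */
	return scaled;
}
```

The point of the helper is that a plain 64-bit '/' in kernel code can fail to
link on 32-bit architectures, which is presumably why the changelog mentions
the conversion.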

Signed-off-by: Alexander Loktionov 
Signed-off-by: Dmitrii Tarakanov 
Signed-off-by: Pavel Belous 
Signed-off-by: David M. VomLehn 
---

David VomLehn (13):
  net: ethernet: aquantia: Make and configuration files.
  net: ethernet: aquantia: Common functions and definitions
  net: ethernet: aquantia: Add ring support code
  net: ethernet: aquantia: Low-level hardware interfaces
  net: ethernet: aquantia: Support for NIC-specific code
  net: ethernet: aquantia: Atlantic A0 and B0 specific functions.
  net: ethernet: aquantia: Vector operations
  net: ethernet: aquantia: PCI operations
  net: ethernet: aquantia: Atlantic hardware abstraction layer
  net: ethernet: aquantia: Hardware interface and utility functions
  net: ethernet: aquantia: Ethtool support
  net: ethernet: aquantia: Receive side scaling
  net: ethernet: aquantia: Integrate AQtion 2.5/5 GB NIC driver

 drivers/net/ethernet/Kconfig   |1 +
 drivers/net/ethernet/Makefile  |1 +
 drivers/net/ethernet/aquantia/Kconfig  |   24 +
 drivers/net/ethernet/aquantia/Makefile |   43 +
 drivers/net/ethernet/aquantia/aq_cfg.h |   78 +
 drivers/net/ethernet/aquantia/aq_common.h  |   24 +
 drivers/net/ethernet/aquantia/aq_ethtool.c |  251 +++
 drivers/net/ethernet/aquantia/aq_ethtool.h |   20 +
 drivers/net/ethernet/aquantia/aq_hw.h  |  170 ++
 drivers/net/ethernet/aquantia/aq_hw_utils.c|   69 +
 drivers/net/ethernet/aquantia/aq_hw_utils.h|   48 +
 drivers/net/ethernet/aquantia/aq_main.c|  292 +++
 drivers/net/ethernet/aquantia/aq_main.h|   18 +
 drivers/net/ethernet/aquantia/aq_nic.c |  911 
 drivers/net/ethernet/aquantia/aq_nic.h |  109 +
 drivers/net/ethernet/aquantia/aq_nic_internal.h|   47 +
 drivers/net/ethernet/aquantia/aq_pci_func.c|  348 +++
 drivers/net/ethernet/aquantia/aq_pci_func.h|   35 +
 drivers/net/ethernet/aquantia/aq_ring.c|  359 +++
 drivers/net/ethernet/aquantia/aq_ring.h|  158 ++
 drivers/net/ethernet/aquantia/aq_rss.h |   27 +
 drivers/net/ethernet/aquantia/aq_utils.h   |   54 +
 drivers/net/ethernet/aquantia/aq_vec.c |  386 
 drivers/net/ethernet/aquantia/aq_vec.h |   43 +
 drivers/net/ethernet/aquantia/hw_atl/hw_atl_a0.c   |  908 
 drivers/net/ethernet/aquantia/hw_atl/hw_atl_a0.h   |   35 +
 .../ethernet/aquantia/hw_atl/hw_atl_a0_internal.h  |  153 ++
 drivers/net/ethernet/aquantia/hw_atl/hw_atl_b0.c   |  961 
 drivers/net/ethernet/aquantia/hw_atl/hw_atl_b0.h   |   35 +
 .../ethernet/aquantia/hw_atl/hw_atl_b0_internal.h  |  206 ++
 drivers/net/ethernet/aquantia/hw_atl/hw_atl_llh.c  | 1395 
 drivers/net/ethernet/aquantia/hw_atl/hw_atl_llh.h  |  678 ++
 .../ethernet/aquantia/hw_atl/hw_atl_llh_internal.h | 2376 
 .../net/ethernet/aquantia/hw_atl/hw_atl_utils.c|  548 +
 .../net/ethernet/aquantia/hw_atl/hw_atl_utils.h|  211 ++
 drivers/net/ethernet/aquantia/ver.h|   19 +
 36 files changed, 11041 insertions(+)
 create mode 100644 drivers/net/ethernet/aquantia/Kconfig
 create mode 100644 drivers/net/ethernet/aquantia/Makefile
 create mode 100644 drivers/net/ethernet/aquantia/aq_cfg.h
 create mode 100644 drivers/net/ethernet/aquantia/aq_common.h
 create mode 100644 drivers/net/ethernet/aquantia/aq_ethtool.c
 create mode 100644 drivers/net/ethernet/aquantia/aq_ethtool.h
 create mode 100644 drivers/net/ethernet/aquantia/aq_hw.h
 create mode 100644 drivers/net/ethernet/aquantia/aq_hw_utils.c
 create mode 100644 drivers/net/ethernet/aquantia/aq_hw_utils.h
 create mode 100644

[PATCH net-next v2 01/10] net: dsa: Pass device pointer to dsa_register_switch

2017-01-11 Thread Florian Fainelli

In preparation for allowing dsa_register_switch() to be supplied with
device/platform data, pass down a struct device pointer instead of a
struct device_node.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/b53/b53_common.c |  2 +-
 drivers/net/dsa/mv88e6xxx/chip.c | 11 ++-
 drivers/net/dsa/qca8k.c  |  2 +-
 include/net/dsa.h|  2 +-
 net/dsa/dsa2.c   |  7 ---
 5 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 5102a3701a1a..7179eed9ee6d 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1882,7 +1882,7 @@ int b53_switch_register(struct b53_device *dev)
 
pr_info("found switch: %s, rev %i\n", dev->name, dev->core_rev);
 
-   return dsa_register_switch(dev->ds, dev->ds->dev->of_node);
+   return dsa_register_switch(dev->ds, dev->ds->dev);
 }
 EXPORT_SYMBOL(b53_switch_register);
 
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index eea8e0176e33..1060597e160a 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -4407,8 +4407,7 @@ static struct dsa_switch_driver mv88e6xxx_switch_drv = {
.ops= &mv88e6xxx_switch_ops,
 };
 
-static int mv88e6xxx_register_switch(struct mv88e6xxx_chip *chip,
-struct device_node *np)
+static int mv88e6xxx_register_switch(struct mv88e6xxx_chip *chip)
 {
struct device *dev = chip->dev;
struct dsa_switch *ds;
@@ -4423,7 +4422,7 @@ static int mv88e6xxx_register_switch(struct mv88e6xxx_chip *chip,
 
dev_set_drvdata(dev, ds);
 
-   return dsa_register_switch(ds, np);
+   return dsa_register_switch(ds, dev);
 }
 
 static void mv88e6xxx_unregister_switch(struct mv88e6xxx_chip *chip)
@@ -4507,9 +4506,11 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
if (err)
goto out_g2_irq;
 
-   err = mv88e6xxx_register_switch(chip, np);
-   if (err)
+   err = mv88e6xxx_register_switch(chip);
+   if (err) {
+   mv88e6xxx_mdio_unregister(chip);
goto out_mdio;
+   }
 
return 0;
 
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 54d270d59eb0..c084aa484d2b 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -964,7 +964,7 @@ qca8k_sw_probe(struct mdio_device *mdiodev)
mutex_init(&priv->reg_mutex);
dev_set_drvdata(&mdiodev->dev, priv);
 
-   return dsa_register_switch(priv->ds, priv->ds->dev->of_node);
+   return dsa_register_switch(priv->ds, &mdiodev->dev);
 }
 
 static void
diff --git a/include/net/dsa.h b/include/net/dsa.h
index b94d1f2ef912..16a502a6c26a 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -403,7 +403,7 @@ static inline bool dsa_uses_tagged_protocol(struct dsa_switch_tree *dst)
 }
 
 void dsa_unregister_switch(struct dsa_switch *ds);
-int dsa_register_switch(struct dsa_switch *ds, struct device_node *np);
+int dsa_register_switch(struct dsa_switch *ds, struct device *dev);
 #ifdef CONFIG_PM_SLEEP
 int dsa_switch_suspend(struct dsa_switch *ds);
 int dsa_switch_resume(struct dsa_switch *ds);
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 42a41d84053c..4170f7ea8e28 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -579,8 +579,9 @@ static struct device_node *dsa_get_ports(struct dsa_switch *ds,
return ports;
 }
 
-static int _dsa_register_switch(struct dsa_switch *ds, struct device_node *np)
+static int _dsa_register_switch(struct dsa_switch *ds, struct device *dev)
 {
+   struct device_node *np = dev->of_node;
struct device_node *ports = dsa_get_ports(ds, np);
struct dsa_switch_tree *dst;
u32 tree, index;
@@ -660,12 +661,12 @@ static int _dsa_register_switch(struct dsa_switch *ds, struct device_node *np)
return err;
 }
 
-int dsa_register_switch(struct dsa_switch *ds, struct device_node *np)
+int dsa_register_switch(struct dsa_switch *ds, struct device *dev)
 {
int err;
 
mutex_lock(&dsa2_mutex);
-   err = _dsa_register_switch(ds, np);
+   err = _dsa_register_switch(ds, dev);
mutex_unlock(&dsa2_mutex);
 
return err;
-- 
2.9.3
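
The reasoning behind this patch can be sketched in user space. This is a
simplified model, not the DSA code itself: the structures are stand-ins for
the kernel's, and model_config_source() and model_register_switch() are
hypothetical names. The platform_data branch anticipates the later patches in
the series; the patch above only reads dev->of_node.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumptions). */
struct device_node { const char *name; };

struct device {
	struct device_node *of_node;	/* set when probed from Device Tree */
	void *platform_data;		/* set when declared by board code */
};

struct dsa_switch { struct device *dev; };

enum config_source { CONFIG_NONE, CONFIG_OF, CONFIG_PDATA };

/* Hypothetical helper: where would the switch configuration come from? */
static enum config_source model_config_source(const struct device *dev)
{
	if (dev->of_node)
		return CONFIG_OF;
	if (dev->platform_data)
		return CONFIG_PDATA;
	return CONFIG_NONE;
}

/*
 * Models the new calling convention: the caller hands over the device and
 * the callee derives the device_node (or platform data) from it. With the
 * old signature only the device_node was visible to the callee.
 */
static int model_register_switch(struct dsa_switch *ds, struct device *dev)
{
	if (!ds || !dev)
		return -EINVAL;

	ds->dev = dev;
	return model_config_source(dev) == CONFIG_NONE ? -EINVAL : 0;
}
```

Passing the device rather than one of its members keeps the signature stable
when a second configuration source is added later.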

Re: [PATCH v2 2/2] stmmac: rename it to synopsys

2017-01-11 Thread Jie Deng

Hi Joao,

On 2017/1/11 18:35, Joao Pinto wrote:
> Hi Jie,
>
> At 4:00 AM on 1/11/2017, Jie Deng wrote:
>> Hi Joao,
>>
>>
>> On 2017/1/10 22:52, Joao Pinto wrote:
>>> This patch renames stmicro/stmmac to synopsys/ since it is a standard
>>> ethernet software package regarding synopsys ethernet controllers,
>>> supporting the majority of Synopsys Ethernet IPs. The config IDs remain
>>> the same, for retro-compatibility, only the description was changed.
>>>
>>> Signed-off-by: Joao Pinto 
>>> ---
>>> changes v1->v2:
>>> - nothing changed. Just to keep up with patch set version
>>>
>>> @@ -1,5 +1,5 @@
>>>  config STMMAC_ETH
>>> -   tristate "STMicroelectronics 10/100/1000 Ethernet driver"
>>> +   tristate "Synopsys Ethernet drivers"
>>> depends on HAS_IOMEM && HAS_DMA
>>> select MII
>>> select PHYLIB
>>> @@ -14,7 +14,7 @@ config STMMAC_ETH
>>>  if STMMAC_ETH
>>>  
>> "Synopsys Ethernet drivers" is too generic. The name should reflect the
>> controller. This driver is for Synopsys GMAC 10M/100M/1G IPs. We will submit
>> a driver for the new 25G/40G/50G/100G XLGMAC IP in the future.
> As you know Synopsys is an IP vendor that has a wide range of IPs related to
> Ethernet. stmmac is a driver that was well built and well maintained and
> supports most of those Ethernet IPs, so it has the potential to be the
> official Synopsys Ethernet driver suite in the future.
>
> Let's make baby steps. For now stmmac supports 10M/100M/1G and also QoS which
> is a separate IP and the different IP cores are selected by device tree.
>
> In the future it would be useful to be selectable by Kconfig options, making
> it more clear to the kernel user. For example:
>
> Synopsys Ethernet drivers
>   eQoS Core
>   10M Core
>   100M Core
>   1G Core
>   25G Core
>   
>
> The XLGMAC will be great for our users and I will be 100% available to help
> you with it.
>
> Thanks,
> Joao
>
Currently, Synopsys has three series of IP cores. They are:
1. 10M/100M/1G   (GMAC, QoS)
2. 1G/2.5G/5G/10G (XGMAC)
3. 10G/25G/40G/50G/100G (XLGMAC)
More info: https://www.synopsys.com/designware-ip/interface-ip/ethernet.html
 
You have successfully merged dwc_eth_qos.c into stmmac. stmmac now fully
supports the 10M/100M/1G series IPs. Personally, I support Florian's
suggestion not to rename stmmac.
To avoid future confusion and ease maintenance, my suggestions are as follows:
   1. Do not rename anything.
   2. Keep all 10M/100M/1G IP (GMAC, QoS) development in stmmac.
   3. Keep all 1G/2.5G/5G/10G IP (XGMAC) development in amd-xgbe.
   4. Submit a new driver under synopsys/ for the new 10G/25G/40G/50G/100G
      (XLGMAC) IP.

Opinions from others are welcome.

Thanks,
Jie

[PATCH net-next v2 00/10] net: dsa: Support for pdata in dsa2

2017-01-11 Thread Florian Fainelli

Hi all,

This is not exactly new, and was sent before, although back then, I did not
have a user of the pre-declared MDIO board information, but now we do. Note
that I have additional changes queued up to have b53 register platform data for
MIPS bcm47xx and bcm63xx.

Yes I know that we should have the Orion platforms eventually be converted to
Device Tree, but until that happens, I don't want any remaining users of the
old "dsa" platform device (hence the previous DTS submissions for ARM/mvebu)
and, there will be platforms out there that will most likely never see DT
coming their way (BCM47xx is almost 100% sure, BCM63xx maybe not in a distant
future).

We would probably want the whole series to be merged via David Miller's tree
to simplify things.

Greg, can you Ack/Nack patch 5 since it touches the core LDD?

Thanks!

Changes in v2:

- Rebased against latest net-next/master

- Moved dev_find_class() to device_find_class() into drivers/base/core.c

- Moved dev_to_net_device into net/core/dev.c

- Utilize dsa_chip_data directly instead of dsa_platform_data

- Augmented dsa_chip_data to be multi-CPU port ready

Changes from last submission (few months back):

- rebased against latest net-next

- do not introduce dsa2_platform_data which was overkill and was meant to
  allow us to do exactly the same things with platform data and Device Tree;
  we use the existing dsa_platform_data instead

- properly register MDIO devices when the MDIO bus is registered and associate
  platform_data with them

- add a change to the Orion platform code to demonstrate how this can be used

Thank you

Florian Fainelli (10):
  net: dsa: Pass device pointer to dsa_register_switch
  net: dsa: Make most functions take a dsa_port argument
  net: dsa: Suffix function manipulating device_node with _dn
  net: dsa: Move ports assignment closer to error checking
  drivers: base: Add device_find_class()
  net: dsa: Migrate to device_find_class()
  net: Relocate dev_to_net_device() into core
  net: dsa: Add support for platform data
  net: phy: Allow pre-declaration of MDIO devices
  ARM: orion: Register DSA switch as a MDIO device

 arch/arm/mach-orion5x/common.c   |   2 +-
 arch/arm/mach-orion5x/common.h   |   4 +-
 arch/arm/mach-orion5x/rd88f5181l-fxo-setup.c |   7 +-
 arch/arm/mach-orion5x/rd88f5181l-ge-setup.c  |   7 +-
 arch/arm/mach-orion5x/rd88f6183ap-ge-setup.c |   7 +-
 arch/arm/mach-orion5x/wnr854t-setup.c|   2 +-
 arch/arm/mach-orion5x/wrt350n-v2-setup.c |   7 +-
 arch/arm/plat-orion/common.c |  25 +++-
 arch/arm/plat-orion/include/plat/common.h|   4 +-
 drivers/base/core.c  |  19 +++
 drivers/net/dsa/b53/b53_common.c |   2 +-
 drivers/net/dsa/mv88e6xxx/chip.c |  11 +-
 drivers/net/dsa/qca8k.c  |   2 +-
 drivers/net/phy/Makefile |   3 +-
 drivers/net/phy/mdio-boardinfo.c |  86 ++
 drivers/net/phy/mdio-boardinfo.h |  19 +++
 drivers/net/phy/mdio_bus.c   |   5 +
 drivers/net/phy/mdio_device.c|  11 ++
 include/linux/device.h   |   1 +
 include/linux/mdio.h |   3 +
 include/linux/mod_devicetable.h  |   1 +
 include/linux/netdevice.h|   2 +
 include/linux/phy.h  |  19 +++
 include/net/dsa.h|   8 +-
 net/core/dev.c   |  19 +++
 net/dsa/dsa.c|  53 ++---
 net/dsa/dsa2.c   | 168 +++
 net/dsa/dsa_priv.h   |   4 +-
 28 files changed, 360 insertions(+), 141 deletions(-)
 create mode 100644 drivers/net/phy/mdio-boardinfo.c
 create mode 100644 drivers/net/phy/mdio-boardinfo.h

-- 
2.9.3

[PATCH net-next v2 05/10] drivers: base: Add device_find_class()

2017-01-11 Thread Florian Fainelli

Add a helper function to look up a device reference given a class name.
This is a preliminary patch to remove ad hoc code from net/dsa/dsa.c and
make it more generic.

Signed-off-by: Florian Fainelli 
---
 drivers/base/core.c| 19 +++
 include/linux/device.h |  1 +
 2 files changed, 20 insertions(+)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 020ea7f05520..3dd6047c10d8 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2065,6 +2065,25 @@ struct device *device_find_child(struct device *parent, void *data,
 }
 EXPORT_SYMBOL_GPL(device_find_child);
 
+static int dev_is_class(struct device *dev, void *class)
+{
+   if (dev->class != NULL && !strcmp(dev->class->name, class))
+   return 1;
+
+   return 0;
+}
+
+struct device *device_find_class(struct device *parent, char *class)
+{
+   if (dev_is_class(parent, class)) {
+   get_device(parent);
+   return parent;
+   }
+
+   return device_find_child(parent, class, dev_is_class);
+}
+EXPORT_SYMBOL_GPL(device_find_class);
+
 int __init devices_init(void)
 {
devices_kset = kset_create_and_add("devices", &device_uevent_ops, NULL);
diff --git a/include/linux/device.h b/include/linux/device.h
index 491b4c0ca633..8d37f5ecb972 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1120,6 +1120,7 @@ extern int device_for_each_child_reverse(struct device *dev, void *data,
 int (*fn)(struct device *dev, void *data));
 extern struct device *device_find_child(struct device *dev, void *data,
int (*match)(struct device *dev, void *data));
+extern struct device *device_find_class(struct device *parent, char *class);
 extern int device_rename(struct device *dev, const char *new_name);
 extern int device_move(struct device *dev, struct device *new_parent,
   enum dpm_order dpm_order);
-- 
2.9.3
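
The lookup order of device_find_class() can be illustrated with a small
user-space model: check the parent first, then its direct children, and take
a reference on whatever is returned. This is only a sketch. The structures
are simplified stand-ins, model_find_class() is a hypothetical name, and a
plain counter stands in for get_device().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for struct class and struct device (assumptions). */
struct dev_class { const char *name; };

struct device {
	const struct dev_class *cls;
	struct device *children[4];
	size_t nr_children;
	int refcount;			/* models get_device()/put_device() */
};

/* Same test as dev_is_class() in the patch: class set and name matches. */
static int model_dev_is_class(const struct device *dev, const char *name)
{
	return dev->cls != NULL && strcmp(dev->cls->name, name) == 0;
}

/*
 * Model of the lookup: the parent itself is a candidate, otherwise the
 * first matching direct child is returned. The caller owns one reference
 * on the result and must drop it when done.
 */
static struct device *model_find_class(struct device *parent, const char *name)
{
	size_t i;

	if (model_dev_is_class(parent, name)) {
		parent->refcount++;
		return parent;
	}

	for (i = 0; i < parent->nr_children; i++) {
		struct device *child = parent->children[i];

		if (model_dev_is_class(child, name)) {
			child->refcount++;
			return child;
		}
	}

	return NULL;
}
```

As in the patch, a successful lookup hands the caller a reference, so every
non-NULL result needs a matching put.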

[PATCH net-next v2 04/10] net: dsa: Move ports assignment closer to error checking

2017-01-11 Thread Florian Fainelli

Move the assignment of ports in _dsa_register_switch() closer to where
it is checked, no functional change. Re-order declarations to
preserve the inverted Christmas tree style.

Signed-off-by: Florian Fainelli 
---
 net/dsa/dsa2.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 04ab62251fe3..cd91070b5467 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -587,8 +587,8 @@ static struct device_node *dsa_get_ports(struct dsa_switch *ds,
 static int _dsa_register_switch(struct dsa_switch *ds, struct device *dev)
 {
struct device_node *np = dev->of_node;
-   struct device_node *ports = dsa_get_ports(ds, np);
struct dsa_switch_tree *dst;
+   struct device_node *ports;
u32 tree, index;
int i, err;
 
@@ -596,6 +596,7 @@ static int _dsa_register_switch(struct dsa_switch *ds, struct device *dev)
if (err)
return err;
 
+   ports = dsa_get_ports(ds, np);
if (IS_ERR(ports))
return PTR_ERR(ports);
 
-- 
2.9.3

[PATCH net-next v2 03/10] net: dsa: Suffix function manipulating device_node with _dn

2017-01-11 Thread Florian Fainelli

Make it clear that these functions take a device_node structure pointer

Signed-off-by: Florian Fainelli 
---
 net/dsa/dsa2.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 6e3675220fef..04ab62251fe3 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -94,8 +94,8 @@ static bool dsa_port_is_cpu(struct dsa_port *port)
return !!of_parse_phandle(port->dn, "ethernet", 0);
 }
 
-static bool dsa_ds_find_port(struct dsa_switch *ds,
-struct device_node *port)
+static bool dsa_ds_find_port_dn(struct dsa_switch *ds,
+   struct device_node *port)
 {
u32 index;
 
@@ -105,8 +105,8 @@ static bool dsa_ds_find_port(struct dsa_switch *ds,
return false;
 }
 
-static struct dsa_switch *dsa_dst_find_port(struct dsa_switch_tree *dst,
-   struct device_node *port)
+static struct dsa_switch *dsa_dst_find_port_dn(struct dsa_switch_tree *dst,
+  struct device_node *port)
 {
struct dsa_switch *ds;
u32 index;
@@ -116,7 +116,7 @@ static struct dsa_switch *dsa_dst_find_port(struct dsa_switch_tree *dst,
if (!ds)
continue;
 
-   if (dsa_ds_find_port(ds, port))
+   if (dsa_ds_find_port_dn(ds, port))
return ds;
}
 
@@ -137,7 +137,7 @@ static int dsa_port_complete(struct dsa_switch_tree *dst,
if (!link)
break;
 
-   dst_ds = dsa_dst_find_port(dst, link);
+   dst_ds = dsa_dst_find_port_dn(dst, link);
of_node_put(link);
 
if (!dst_ds)
@@ -546,7 +546,7 @@ static int dsa_parse_ports_dn(struct device_node *ports, struct dsa_switch *ds)
return 0;
 }
 
-static int dsa_parse_member(struct device_node *np, u32 *tree, u32 *index)
+static int dsa_parse_member_dn(struct device_node *np, u32 *tree, u32 *index)
 {
int err;
 
@@ -592,7 +592,7 @@ static int _dsa_register_switch(struct dsa_switch *ds, struct device *dev)
u32 tree, index;
int i, err;
 
-   err = dsa_parse_member(np, &tree, &index);
+   err = dsa_parse_member_dn(np, &tree, &index);
if (err)
return err;
 
-- 
2.9.3

[PATCH net-next v2 02/10] net: dsa: Make most functions take a dsa_port argument

2017-01-11 Thread Florian Fainelli

In preparation for allowing platform data, and therefore no valid
device_node pointer, make most DSA functions take a pointer to a
dsa_port structure whenever possible. While at it, introduce a
dsa_port_is_valid() helper function which checks whether port->dn is
NULL or not at the moment.

Signed-off-by: Florian Fainelli 
---
 net/dsa/dsa.c  | 15 --
 net/dsa/dsa2.c | 61 +-
 net/dsa/dsa_priv.h |  4 ++--
 3 files changed, 44 insertions(+), 36 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index fd532487dfdf..2306d1b87c83 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -110,8 +110,9 @@ dsa_switch_probe(struct device *parent, struct device *host_dev, int sw_addr,
 
 /* basic switch operations **/
 int dsa_cpu_dsa_setup(struct dsa_switch *ds, struct device *dev,
- struct device_node *port_dn, int port)
+ struct dsa_port *dport, int port)
 {
+   struct device_node *port_dn = dport->dn;
struct phy_device *phydev;
int ret, mode;
 
@@ -141,15 +142,15 @@ int dsa_cpu_dsa_setup(struct dsa_switch *ds, struct device *dev,
 
 static int dsa_cpu_dsa_setups(struct dsa_switch *ds, struct device *dev)
 {
-   struct device_node *port_dn;
+   struct dsa_port *dport;
int ret, port;
 
for (port = 0; port < DSA_MAX_PORTS; port++) {
if (!(dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)))
continue;
 
-   port_dn = ds->ports[port].dn;
-   ret = dsa_cpu_dsa_setup(ds, dev, port_dn, port);
+   dport = &ds->ports[port];
+   ret = dsa_cpu_dsa_setup(ds, dev, dport, port);
if (ret)
return ret;
}
@@ -366,8 +367,10 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
return ds;
 }
 
-void dsa_cpu_dsa_destroy(struct device_node *port_dn)
+void dsa_cpu_dsa_destroy(struct dsa_port *port)
 {
+   struct device_node *port_dn = port->dn;
+
if (of_phy_is_fixed_link(port_dn))
of_phy_deregister_fixed_link(port_dn);
 }
@@ -393,7 +396,7 @@ static void dsa_switch_destroy(struct dsa_switch *ds)
for (port = 0; port < DSA_MAX_PORTS; port++) {
if (!(dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)))
continue;
-   dsa_cpu_dsa_destroy(ds->ports[port].dn);
+   dsa_cpu_dsa_destroy(&ds->ports[port]);
 
/* Clearing a bit which is not set does no harm */
ds->cpu_port_mask |= ~(1 << port);
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 4170f7ea8e28..6e3675220fef 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -79,14 +79,19 @@ static void dsa_dst_del_ds(struct dsa_switch_tree *dst,
kref_put(&dst->refcount, dsa_free_dst);
 }
 
-static bool dsa_port_is_dsa(struct device_node *port)
+static bool dsa_port_is_valid(struct dsa_port *port)
 {
-   return !!of_parse_phandle(port, "link", 0);
+   return !!port->dn;
 }
 
-static bool dsa_port_is_cpu(struct device_node *port)
+static bool dsa_port_is_dsa(struct dsa_port *port)
 {
-   return !!of_parse_phandle(port, "ethernet", 0);
+   return !!of_parse_phandle(port->dn, "link", 0);
+}
+
+static bool dsa_port_is_cpu(struct dsa_port *port)
+{
+   return !!of_parse_phandle(port->dn, "ethernet", 0);
 }
 
 static bool dsa_ds_find_port(struct dsa_switch *ds,
@@ -120,7 +125,7 @@ static struct dsa_switch *dsa_dst_find_port(struct dsa_switch_tree *dst,
 
 static int dsa_port_complete(struct dsa_switch_tree *dst,
 struct dsa_switch *src_ds,
-struct device_node *port,
+struct dsa_port *port,
 u32 src_port)
 {
struct device_node *link;
@@ -128,7 +133,7 @@ static int dsa_port_complete(struct dsa_switch_tree *dst,
struct dsa_switch *dst_ds;
 
for (index = 0;; index++) {
-   link = of_parse_phandle(port, "link", index);
+   link = of_parse_phandle(port->dn, "link", index);
if (!link)
break;
 
@@ -151,13 +156,13 @@ static int dsa_port_complete(struct dsa_switch_tree *dst,
  */
 static int dsa_ds_complete(struct dsa_switch_tree *dst, struct dsa_switch *ds)
 {
-   struct device_node *port;
+   struct dsa_port *port;
u32 index;
int err;
 
for (index = 0; index < DSA_MAX_PORTS; index++) {
-   port = ds->ports[index].dn;
-   if (!port)
+   port = &ds->ports[index];
+   if (!dsa_port_is_valid(port))
continue;
 
if (!dsa_port_is_dsa(port))
@@ -197,7 +202,7 @@ static int dsa_dst_complete(struct dsa_switch_tree *dst)
return 0;
 }
 
-static int dsa_dsa_port_apply(struct device_node

[PATCH net-next v2 06/10] net: dsa: Migrate to device_find_class()

2017-01-11 Thread Florian Fainelli

Now that the base device driver code provides an identical
implementation of dev_find_class(), utilize device_find_class() instead
of our own version of it.

Signed-off-by: Florian Fainelli 
---
 net/dsa/dsa.c | 22 ++
 1 file changed, 2 insertions(+), 20 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 2306d1b87c83..77fa4c4f5828 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -455,29 +455,11 @@ EXPORT_SYMBOL_GPL(dsa_switch_resume);
 #endif
 
 /* platform driver init and cleanup */
-static int dev_is_class(struct device *dev, void *class)
-{
-   if (dev->class != NULL && !strcmp(dev->class->name, class))
-   return 1;
-
-   return 0;
-}
-
-static struct device *dev_find_class(struct device *parent, char *class)
-{
-   if (dev_is_class(parent, class)) {
-   get_device(parent);
-   return parent;
-   }
-
-   return device_find_child(parent, class, dev_is_class);
-}
-
 struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev)
 {
struct device *d;
 
-   d = dev_find_class(dev, "mdio_bus");
+   d = device_find_class(dev, "mdio_bus");
if (d != NULL) {
struct mii_bus *bus;
 
@@ -495,7 +477,7 @@ static struct net_device *dev_to_net_device(struct device *dev)
 {
struct device *d;
 
-   d = dev_find_class(dev, "net");
+   d = device_find_class(dev, "net");
if (d != NULL) {
struct net_device *nd;
 
-- 
2.9.3

[PATCH net-next v2 07/10] net: Relocate dev_to_net_device() into core

2017-01-11 Thread Florian Fainelli

dev_to_net_device() is moved from net/dsa/dsa.c to net/core/dev.c since
it is going to be used by net/dsa/dsa2.c and the namespace of the function
justifies making it available to other potential users.

Signed-off-by: Florian Fainelli 
---
 include/linux/netdevice.h |  2 ++
 net/core/dev.c| 19 +++
 net/dsa/dsa.c | 18 --
 3 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ebd9e2c12f44..d532c070163f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4378,4 +4378,6 @@ do { \
 #define PTYPE_HASH_SIZE (16)
 #define PTYPE_HASH_MASK (PTYPE_HASH_SIZE - 1)
 
+struct net_device *dev_to_net_device(struct device *dev);
+
 #endif /* _LINUX_NETDEVICE_H */
diff --git a/net/core/dev.c b/net/core/dev.c
index e98cc06d2577..643e4a4491c6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8150,6 +8150,25 @@ const char *netdev_drivername(const struct net_device *dev)
return empty;
 }
 
+struct net_device *dev_to_net_device(struct device *dev)
+{
+   struct device *d;
+
+   d = device_find_class(dev, "net");
+   if (d) {
+   struct net_device *nd;
+
+   nd = to_net_dev(d);
+   dev_hold(nd);
+   put_device(d);
+
+   return nd;
+   }
+
+   return NULL;
+}
+EXPORT_SYMBOL_GPL(dev_to_net_device);
+
 static void __netdev_printk(const char *level, const struct net_device *dev,
struct va_format *vaf)
 {
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 77fa4c4f5828..6c264f92fec5 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -473,24 +473,6 @@ struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dsa_host_dev_to_mii_bus);
 
-static struct net_device *dev_to_net_device(struct device *dev)
-{
-   struct device *d;
-
-   d = device_find_class(dev, "net");
-   if (d != NULL) {
-   struct net_device *nd;
-
-   nd = to_net_dev(d);
-   dev_hold(nd);
-   put_device(d);
-
-   return nd;
-   }
-
-   return NULL;
-}
-
 #ifdef CONFIG_OF
 static int dsa_of_setup_routing_table(struct dsa_platform_data *pd,
struct dsa_chip_data *cd,
-- 
2.9.3

[PATCH net-next v2 09/10] net: phy: Allow pre-declaration of MDIO devices

2017-01-11 Thread Florian Fainelli

Allow board support code to collect pre-declarations for MDIO devices by
registering them with mdiobus_register_board_info(). SPI and I2C buses
have a similar feature that was missing for MDIO devices. It is
particularly useful for, e.g., MDIO-connected switches which need to
provide their port layout (often board-specific) to an MDIO Ethernet
switch driver.

Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/Makefile |  3 +-
 drivers/net/phy/mdio-boardinfo.c | 86 
 drivers/net/phy/mdio-boardinfo.h | 19 +
 drivers/net/phy/mdio_bus.c   |  5 +++
 drivers/net/phy/mdio_device.c| 11 +
 include/linux/mdio.h |  3 ++
 include/linux/mod_devicetable.h  |  1 +
 include/linux/phy.h  | 19 +
 8 files changed, 146 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/phy/mdio-boardinfo.c
 create mode 100644 drivers/net/phy/mdio-boardinfo.h

diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index 356859ac7c18..407b0b601ea8 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -1,6 +1,7 @@
 # Makefile for Linux PHY drivers and MDIO bus drivers
 
-libphy-y   := phy.o phy_device.o mdio_bus.o mdio_device.o
+libphy-y   := phy.o phy_device.o mdio_bus.o mdio_device.o \
+  mdio-boardinfo.o
 libphy-$(CONFIG_SWPHY) += swphy.o
 libphy-$(CONFIG_LED_TRIGGER_PHY)   += phy_led_triggers.o
 
diff --git a/drivers/net/phy/mdio-boardinfo.c b/drivers/net/phy/mdio-boardinfo.c
new file mode 100644
index 000000000000..230b40ffee5a
--- /dev/null
+++ b/drivers/net/phy/mdio-boardinfo.c
@@ -0,0 +1,86 @@
+/*
+ * mdio-boardinfo - Collect pre-declarations for MDIO devices
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mdio-boardinfo.h"
+
+static LIST_HEAD(mdio_board_list);
+static DEFINE_MUTEX(mdio_board_lock);
+
+/**
+ * mdiobus_setup_mdiodev_from_board_info - create and setup MDIO devices
+ * from pre-collected board specific MDIO information
+ * @bus: MII bus to create the pre-declared MDIO devices on
+ * Context: can sleep
+ */
+void mdiobus_setup_mdiodev_from_board_info(struct mii_bus *bus)
+{
+   struct mdio_board_entry *be;
+   struct mdio_device *mdiodev;
+   struct mdio_board_info *bi;
+   int ret;
+
+   mutex_lock(&mdio_board_lock);
+   list_for_each_entry(be, &mdio_board_list, list) {
+   bi = &be->board_info;
+
+   if (strcmp(bus->id, bi->bus_id))
+   continue;
+
+   mdiodev = mdio_device_create(bus, bi->mdio_addr) ;
+   if (IS_ERR(mdiodev))
+   continue;
+
+   strncpy(mdiodev->modalias, bi->modalias,
+   sizeof(mdiodev->modalias));
+   mdiodev->bus_match = mdio_device_bus_match;
+   mdiodev->dev.platform_data = (void *)bi->platform_data;
+
+   ret = mdio_device_register(mdiodev);
+   if (ret) {
+   mdio_device_free(mdiodev);
+   continue;
+   }
+   }
+   mutex_unlock(&mdio_board_lock);
+}
+
+/**
+ * mdiobus_register_board_info - register MDIO devices for a given board
+ * @info: array of devices descriptors
+ * @n: number of descriptors provided
+ * Context: can sleep
+ *
+ * The board info passed can be marked with __initdata but pointers
+ * such as platform_data etc. are copied as-is
+ */
+int mdiobus_register_board_info(const struct mdio_board_info *info,
+   unsigned int n)
+{
+   struct mdio_board_entry *be;
+   unsigned int i;
+
+   be = kcalloc(n, sizeof(*be), GFP_KERNEL);
+   if (!be)
+   return -ENOMEM;
+
+   for (i = 0; i < n; i++, be++, info++) {
+   memcpy(&be->board_info, info, sizeof(*info));
+   mutex_lock(&mdio_board_lock);
+   list_add_tail(&be->list, &mdio_board_list);
+   mutex_unlock(&mdio_board_lock);
+   }
+
+   return 0;
+}
diff --git a/drivers/net/phy/mdio-boardinfo.h b/drivers/net/phy/mdio-boardinfo.h
new file mode 100644
index 000000000000..00f98163e90e
--- /dev/null
+++ b/drivers/net/phy/mdio-boardinfo.h
@@ -0,0 +1,19 @@
+/*
+ * mdio-boardinfo.h - board info interface internal to the mdio_bus
+ * component
+ */
+
+#ifndef __MDIO_BOARD_INFO_H
+#define __MDIO_BOARD_INFO_H
+
+#include 
+#include 
+
+struct mdio_board_entry {
+   struct list_headlist;
+   struct mdio_board_info  board_info;
+};
+
+void mdiobus_setup_mdiodev_from_board_info(struct mii_bus *bus);
+
+#endif /* __MDIO_BOARD_INFO_H */
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c

[PATCH net-next v2 10/10] ARM: orion: Register DSA switch as a MDIO device

2017-01-11 Thread Florian Fainelli

Utilize the ability to pass board-specific MDIO bus information towards a
particular MDIO device, thus allowing us to provide the per-port switch layout
to the Marvell 88E6XXX switch driver.

Since we would end up with conflicting registration paths, do not register the
"dsa" platform device anymore.

Note that the MDIO devices registered by code in net/dsa/dsa2.c do not
parse a dsa_platform_data, but directly take a dsa_chip_data (specific
to a single switch chip), so we update the different call sites to pass
this structure down to orion_ge00_switch_init().

Signed-off-by: Florian Fainelli 
---
 arch/arm/mach-orion5x/common.c   |  2 +-
 arch/arm/mach-orion5x/common.h   |  4 ++--
 arch/arm/mach-orion5x/rd88f5181l-fxo-setup.c |  7 +--
 arch/arm/mach-orion5x/rd88f5181l-ge-setup.c  |  7 +--
 arch/arm/mach-orion5x/rd88f6183ap-ge-setup.c |  7 +--
 arch/arm/mach-orion5x/wnr854t-setup.c|  2 +-
 arch/arm/mach-orion5x/wrt350n-v2-setup.c |  7 +--
 arch/arm/plat-orion/common.c | 25 +++--
 arch/arm/plat-orion/include/plat/common.h|  4 ++--
 9 files changed, 29 insertions(+), 36 deletions(-)

diff --git a/arch/arm/mach-orion5x/common.c b/arch/arm/mach-orion5x/common.c
index 04910764c385..83a7ec4c16d0 100644
--- a/arch/arm/mach-orion5x/common.c
+++ b/arch/arm/mach-orion5x/common.c
@@ -105,7 +105,7 @@ void __init orion5x_eth_init(struct 
mv643xx_eth_platform_data *eth_data)
 /*
  * Ethernet switch
  /
-void __init orion5x_eth_switch_init(struct dsa_platform_data *d)
+void __init orion5x_eth_switch_init(struct dsa_chip_data *d)
 {
orion_ge00_switch_init(d);
 }
diff --git a/arch/arm/mach-orion5x/common.h b/arch/arm/mach-orion5x/common.h
index 8a4115bd441d..efeffc6b4ebb 100644
--- a/arch/arm/mach-orion5x/common.h
+++ b/arch/arm/mach-orion5x/common.h
@@ -3,7 +3,7 @@
 
 #include 
 
-struct dsa_platform_data;
+struct dsa_chip_data;
 struct mv643xx_eth_platform_data;
 struct mv_sata_platform_data;
 
@@ -41,7 +41,7 @@ void orion5x_setup_wins(void);
 void orion5x_ehci0_init(void);
 void orion5x_ehci1_init(void);
 void orion5x_eth_init(struct mv643xx_eth_platform_data *eth_data);
-void orion5x_eth_switch_init(struct dsa_platform_data *d);
+void orion5x_eth_switch_init(struct dsa_chip_data *d);
 void orion5x_i2c_init(void);
 void orion5x_sata_init(struct mv_sata_platform_data *sata_data);
 void orion5x_spi_init(void);
diff --git a/arch/arm/mach-orion5x/rd88f5181l-fxo-setup.c 
b/arch/arm/mach-orion5x/rd88f5181l-fxo-setup.c
index dccadf68ea2b..a3c1336d30c9 100644
--- a/arch/arm/mach-orion5x/rd88f5181l-fxo-setup.c
+++ b/arch/arm/mach-orion5x/rd88f5181l-fxo-setup.c
@@ -101,11 +101,6 @@ static struct dsa_chip_data 
rd88f5181l_fxo_switch_chip_data = {
.port_names[7]  = "lan3",
 };
 
-static struct dsa_platform_data __initdata rd88f5181l_fxo_switch_plat_data = {
-   .nr_chips   = 1,
-   .chip   = &rd88f5181l_fxo_switch_chip_data,
-};
-
 static void __init rd88f5181l_fxo_init(void)
 {
/*
@@ -120,7 +115,7 @@ static void __init rd88f5181l_fxo_init(void)
 */
orion5x_ehci0_init();
orion5x_eth_init(&rd88f5181l_fxo_eth_data);
-   orion5x_eth_switch_init(&rd88f5181l_fxo_switch_plat_data);
+   orion5x_eth_switch_init(&rd88f5181l_fxo_switch_chip_data);
orion5x_uart0_init();
 
mvebu_mbus_add_window_by_id(ORION_MBUS_DEVBUS_BOOT_TARGET,
diff --git a/arch/arm/mach-orion5x/rd88f5181l-ge-setup.c 
b/arch/arm/mach-orion5x/rd88f5181l-ge-setup.c
index affe5ec825de..252efe29bd1a 100644
--- a/arch/arm/mach-orion5x/rd88f5181l-ge-setup.c
+++ b/arch/arm/mach-orion5x/rd88f5181l-ge-setup.c
@@ -102,11 +102,6 @@ static struct dsa_chip_data rd88f5181l_ge_switch_chip_data 
= {
.port_names[7]  = "lan3",
 };
 
-static struct dsa_platform_data __initdata rd88f5181l_ge_switch_plat_data = {
-   .nr_chips   = 1,
-   .chip   = &rd88f5181l_ge_switch_chip_data,
-};
-
 static struct i2c_board_info __initdata rd88f5181l_ge_i2c_rtc = {
I2C_BOARD_INFO("ds1338", 0x68),
 };
@@ -125,7 +120,7 @@ static void __init rd88f5181l_ge_init(void)
 */
orion5x_ehci0_init();
orion5x_eth_init(&rd88f5181l_ge_eth_data);
-   orion5x_eth_switch_init(&rd88f5181l_ge_switch_plat_data);
+   orion5x_eth_switch_init(&rd88f5181l_ge_switch_chip_data);
orion5x_i2c_init();
orion5x_uart0_init();
 
diff --git a/arch/arm/mach-orion5x/rd88f6183ap-ge-setup.c 
b/arch/arm/mach-orion5x/rd88f6183ap-ge-setup.c
index 67ee8571b03c..f4f1dbe1d91d 100644
--- a/arch/arm/mach-orion5x/rd88f6183ap-ge-setup.c
+++ b/arch/arm/mach-orion5x/rd88f6183ap-ge-setup.c
@@ -40,11 +40,6 @@ static struct dsa_chip_data rd88f6183ap_ge_switch_chip_data 
= {
.port_names[5]  = "cpu",
 };
 
-static struct dsa_platform_data __initdata rd88f6183ap_ge_switch_plat_data = {
-   .nr_chips

[PATCH net-next v2 08/10] net: dsa: Add support for platform data

2017-01-11 Thread Florian Fainelli

Allow drivers to use the new DSA API with platform data. Most of the
code in net/dsa/dsa2.c does not rely so much on device_nodes and can get
the same information from platform_data instead.

We purposely do not support distributed configurations with platform
data, so drivers should be providing a pointer to a 'struct
dsa_chip_data' structure if they wish to communicate per-port layout.

Multiple CPU ports could potentially be supported, and dsa_chip_data is
extended to receive up to one reference to an upstream network device
per port described by a dsa_chip_data structure.
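
To illustrate how a per-port name table can stand in for device tree nodes, here is a minimal userspace sketch modelled on the dsa_parse_ports() logic in this patch. MAX_PORTS, port_is_cpu() and parse_ports() are hypothetical names; only the "cpu" naming convention and the "every named non-CPU port is enabled" rule are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PORTS 12	/* hypothetical stand-in for DSA_MAX_PORTS */

/* A port is the CPU port when platform data names it "cpu". */
static int port_is_cpu(const char *name)
{
	return name != NULL && strcmp(name, "cpu") == 0;
}

/*
 * Walk the per-port name table. Every named port that is not the CPU
 * port gets its bit set in *enabled_mask. Returns 0 on success, or -1
 * when no port is named at all (the patch returns -EINVAL there).
 */
static int parse_ports(const char *const names[MAX_PORTS],
		       uint32_t *enabled_mask)
{
	unsigned int i;
	int found = 0;

	assert(names != NULL && enabled_mask != NULL);
	*enabled_mask = 0;

	for (i = 0; i < MAX_PORTS; i++) {
		if (!names[i])
			continue;
		if (!port_is_cpu(names[i]))
			*enabled_mask |= UINT32_C(1) << i;
		found = 1;
	}

	return found ? 0 : -1;
}
```

With ports 0 and 1 named "lan1"/"lan2" and port 5 named "cpu", the mask covers only bits 0 and 1, which is the value a driver's setup callback would then see.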

Signed-off-by: Florian Fainelli 
---
 include/net/dsa.h |  6 
 net/dsa/dsa2.c| 95 ---
 2 files changed, 83 insertions(+), 18 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 16a502a6c26a..491008792e4d 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -42,6 +42,11 @@ struct dsa_chip_data {
struct device   *host_dev;
int sw_addr;
 
+   /*
+* Reference to network devices
+*/
+   struct device   *netdev[DSA_MAX_PORTS];
+
/* set to size of eeprom if supported by the switch */
int eeprom_len;
 
@@ -140,6 +145,7 @@ struct dsa_switch_tree {
 };
 
 struct dsa_port {
+   const char  *name;
struct net_device   *netdev;
struct device_node  *dn;
unsigned intageing_time;
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index cd91070b5467..d326fc4afad7 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -81,17 +81,23 @@ static void dsa_dst_del_ds(struct dsa_switch_tree *dst,
 
 static bool dsa_port_is_valid(struct dsa_port *port)
 {
-   return !!port->dn;
+   return !!(port->dn || port->name);
 }
 
 static bool dsa_port_is_dsa(struct dsa_port *port)
 {
-   return !!of_parse_phandle(port->dn, "link", 0);
+   if (port->name && !strcmp(port->name, "dsa"))
+   return true;
+   else
+   return !!of_parse_phandle(port->dn, "link", 0);
 }
 
 static bool dsa_port_is_cpu(struct dsa_port *port)
 {
-   return !!of_parse_phandle(port->dn, "ethernet", 0);
+   if (port->name && !strcmp(port->name, "cpu"))
+   return true;
+   else
+   return !!of_parse_phandle(port->dn, "ethernet", 0);
 }
 
 static bool dsa_ds_find_port_dn(struct dsa_switch *ds,
@@ -251,10 +257,11 @@ static void dsa_cpu_port_unapply(struct dsa_port *port, 
u32 index,
 static int dsa_user_port_apply(struct dsa_port *port, u32 index,
   struct dsa_switch *ds)
 {
-   const char *name;
+   const char *name = port->name;
int err;
 
-   name = of_get_property(port->dn, "label", NULL);
+   if (port->dn)
+   name = of_get_property(port->dn, "label", NULL);
if (!name)
name = "eth%d";
 
@@ -439,11 +446,15 @@ static int dsa_cpu_parse(struct dsa_port *port, u32 index,
struct net_device *ethernet_dev;
struct device_node *ethernet;
 
-   ethernet = of_parse_phandle(port->dn, "ethernet", 0);
-   if (!ethernet)
-   return -EINVAL;
+   if (port->dn) {
+   ethernet = of_parse_phandle(port->dn, "ethernet", 0);
+   if (!ethernet)
+   return -EINVAL;
+   ethernet_dev = of_find_net_device_by_node(ethernet);
+   } else {
+   ethernet_dev = dev_to_net_device(ds->cd->netdev[index]);
+   }
 
-   ethernet_dev = of_find_net_device_by_node(ethernet);
if (!ethernet_dev)
return -EPROBE_DEFER;
 
@@ -546,6 +557,33 @@ static int dsa_parse_ports_dn(struct device_node *ports, 
struct dsa_switch *ds)
return 0;
 }
 
+static int dsa_parse_ports(struct dsa_chip_data *cd, struct dsa_switch *ds)
+{
+   bool valid_name_found = false;
+   unsigned int i;
+
+   for (i = 0; i < DSA_MAX_PORTS; i++) {
+   if (!cd->port_names[i])
+   continue;
+
+   ds->ports[i].name = cd->port_names[i];
+
+   /* Initialize enabled_port_mask now for drv->setup()
+* to have access to a correct value, just like what
+* net/dsa/dsa.c::dsa_switch_setup_one does.
+*/
+   if (!dsa_port_is_cpu(&ds->ports[i]))
+   ds->enabled_port_mask |= 1 << i;
+
+   valid_name_found= true;
+   }
+
+   if (!valid_name_found && i == DSA_MAX_PORTS)
+   return -EINVAL;
+
+   return 0;
+}
+
 static int dsa_parse_member_dn(struct device_node *np, u32 *tree, u32 *index)
 {
int err;
@@ -570,6 +608,15 @@ static int dsa_parse_member_dn(struct device_node *np, u32 
*tree, u32 *index)
return 0;
 }
 
+static int dsa_parse_member(struct dsa_chip_data *pd, u32 *tree, u32 *index)
+{
+   /* We do not support complex trees with dsa_chip_data */
+

[PATCH net-next] net/mlx5e: Support bpf_xdp_adjust_head()

2017-01-11 Thread Martin KaFai Lau

This patch adds bpf_xdp_adjust_head() support to mlx5e.

1. rx_headroom is added to struct mlx5e_rq.  It uses
   an existing 4 byte hole in the struct.
2. The adjusted data length is checked against
   MLX5E_XDP_MIN_INLINE and MLX5E_SW2HW_MTU(rq->netdev->mtu).
3. The macro MLX5E_SW2HW_MTU is moved from en_main.c to en.h.
   MLX5E_HW2SW_MTU is also moved to en.h for symmetry,
   although that is not strictly required.
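
The length check in point 2 can be sketched as follows. The two MTU macros are copied from the patch, with ETH_HLEN, VLAN_HLEN and ETH_FCS_LEN set to their usual Linux values. XDP_MIN_LEN is an assumed stand-in for MLX5E_XDP_MIN_INLINE, and xdp_len_ok() is a hypothetical helper; the exact comparison the driver performs may differ.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN	14	/* Ethernet header */
#define VLAN_HLEN	4	/* one VLAN tag */
#define ETH_FCS_LEN	4	/* frame check sequence */

/* As in the patch: the argument is parenthesized so expressions expand safely. */
#define HW2SW_MTU(hwmtu) ((hwmtu) - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
#define SW2HW_MTU(swmtu) ((swmtu) + (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))

/* Assumption: smallest frame the transmit path can inline. */
#define XDP_MIN_LEN	(ETH_HLEN + VLAN_HLEN)

/*
 * After an XDP program moved the packet start, accept the frame only
 * if its new length is still within [XDP_MIN_LEN, SW2HW_MTU(mtu)].
 */
static int xdp_len_ok(const unsigned char *data,
		      const unsigned char *data_end,
		      unsigned int sw_mtu)
{
	long len;

	assert(data != NULL && data_end != NULL);
	len = (long)(data_end - data);

	return len >= XDP_MIN_LEN && len <= (long)SW2HW_MTU(sw_mtu);
}
```

For a 1500-byte software MTU, the upper bound works out to 1522 bytes, so a program that grows the headers past that, or shrinks the frame below the minimum, would have its packet rejected in this model.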

Signed-off-by: Martin KaFai Lau 
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  4 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 18 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   | 63 ++-
 3 files changed, 51 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index a473cea10c16..0d9dd860a295 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -51,6 +51,9 @@
 
 #define MLX5_SET_CFG(p, f, v) MLX5_SET(create_flow_group_in, p, f, v)
 
+#define MLX5E_HW2SW_MTU(hwmtu) ((hwmtu) - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
+#define MLX5E_SW2HW_MTU(swmtu) ((swmtu) + (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
+
 #define MLX5E_MAX_NUM_TC   8
 
 #define MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE0x6
@@ -369,6 +372,7 @@ struct mlx5e_rq {
 
unsigned long  state;
intix;
+   u16rx_headroom;
 
struct mlx5e_rx_am am; /* Adaptive Moderation */
struct bpf_prog   *xdp_prog;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index f74ba73c55c7..aba3691e0919 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -343,9 +343,6 @@ static void mlx5e_disable_async_events(struct mlx5e_priv 
*priv)
synchronize_irq(mlx5_get_msix_vec(priv->mdev, MLX5_EQ_VEC_ASYNC));
 }
 
-#define MLX5E_HW2SW_MTU(hwmtu) (hwmtu - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
-#define MLX5E_SW2HW_MTU(swmtu) (swmtu + (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
-
 static inline int mlx5e_get_wqe_mtt_sz(void)
 {
/* UMR copies MTTs in units of MLX5_UMR_MTT_ALIGNMENT bytes.
@@ -534,9 +531,13 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
goto err_rq_wq_destroy;
}
 
-   rq->buff.map_dir = DMA_FROM_DEVICE;
-   if (rq->xdp_prog)
+   if (rq->xdp_prog) {
rq->buff.map_dir = DMA_BIDIRECTIONAL;
+   rq->rx_headroom = XDP_PACKET_HEADROOM;
+   } else {
+   rq->buff.map_dir = DMA_FROM_DEVICE;
+   rq->rx_headroom = MLX5_RX_HEADROOM;
+   }
 
switch (priv->params.rq_wq_type) {
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
@@ -586,7 +587,7 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
byte_count = rq->buff.wqe_sz;
 
/* calc the required page order */
-   frag_sz = MLX5_RX_HEADROOM +
+   frag_sz = rq->rx_headroom +
  byte_count /* packet data */ +
  SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
frag_sz = SKB_DATA_ALIGN(frag_sz);
@@ -3153,11 +3154,6 @@ static int mlx5e_xdp_set(struct net_device *netdev, 
struct bpf_prog *prog)
bool reset, was_opened;
int i;
 
-   if (prog && prog->xdp_adjust_head) {
-   netdev_err(netdev, "Does not support bpf_xdp_adjust_head()\n");
-   return -EOPNOTSUPP;
-   }
-
mutex_lock(>state_lock);
 
if ((netdev->features & NETIF_F_LRO) && prog) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 0e2fb3ed1790..914e00132e08 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -264,7 +264,7 @@ int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, struct 
mlx5e_rx_wqe *wqe, u16 ix)
if (unlikely(mlx5e_page_alloc_mapped(rq, di)))
return -ENOMEM;
 
-   wqe->data.addr = cpu_to_be64(di->addr + MLX5_RX_HEADROOM);
+   wqe->data.addr = cpu_to_be64(di->addr + rq->rx_headroom);
return 0;
 }
 
@@ -646,8 +646,7 @@ static inline void mlx5e_xmit_xdp_doorbell(struct mlx5e_sq 
*sq)
 
 static inline void mlx5e_xmit_xdp_frame(struct mlx5e_rq *rq,
struct mlx5e_dma_info *di,
-   unsigned int data_offset,
-   int len)
+   const struct xdp_buff *xdp)
 {
struct mlx5e_sq  *sq   = &rq->channel->xdp_sq;
struct mlx5_wq_cyc   *wq   = &sq->wq;
@@ -659,9 +658,17 @@ static inline void mlx5e_xmit_xdp_frame(struct mlx5e_rq 
*rq,
struct mlx5_wqe_eth_seg  *eseg = >eth;
struct mlx5_wqe_data_seg *dseg;
 
+

[PATCH net-next] secure_seq: fix sparse errors

2017-01-11 Thread Eric Dumazet

From: Eric Dumazet 

Fixes the following sparse warnings:

net/core/secure_seq.c:125:28: warning: incorrect type in argument 1 (different base types)
net/core/secure_seq.c:125:28:    expected unsigned int const [unsigned] [usertype] a
net/core/secure_seq.c:125:28:    got restricted __be32 [usertype] saddr
net/core/secure_seq.c:125:35: warning: incorrect type in argument 2 (different base types)
net/core/secure_seq.c:125:35:    expected unsigned int const [unsigned] [usertype] b
net/core/secure_seq.c:125:35:    got restricted __be32 [usertype] daddr
net/core/secure_seq.c:125:43: warning: cast from restricted __be16
net/core/secure_seq.c:125:61: warning: restricted __be16 degrades to integer


Fixes: 7cd23e5300c1 ("secure_seq: use SipHash in place of MD5")
Signed-off-by: Eric Dumazet 
---
 net/core/secure_seq.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 
3a9fcec94ace21238c0d6cd6d9997678e0623ab9..758f140b6bedc51669fed973b39ee317c2bf1570
 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -122,7 +122,9 @@ u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr,
 {
u64 seq;
net_secret_init();
-   seq = siphash_3u32(saddr, daddr, (u32)sport << 16 | dport, &net_secret);
+   seq = siphash_3u32((__force u32)saddr, (__force u32)daddr,
+  (__force u32)sport << 16 | (__force u32)dport,
+  &net_secret);
seq += ktime_get_real_ns();
seq &= (1ull << 48) - 1;
return seq;

[PATCH net-next 0/2] More flexible BPF cb access

2017-01-11 Thread Daniel Borkmann

This patch improves BPF's cb access by allowing b/h/w/dw
access variants on it. For details, please see individual
patches.

Thanks!

Daniel Borkmann (2):
  bpf: pass original insn directly to convert_ctx_access
  bpf: allow b/h/w/dw access for bpf's cb in ctx

 include/linux/bpf.h |   7 +-
 kernel/bpf/verifier.c   |  11 +-
 kernel/trace/bpf_trace.c|  15 +-
 net/core/filter.c   | 176 ++-
 tools/testing/selftests/bpf/test_verifier.c | 442 +++-
 5 files changed, 563 insertions(+), 88 deletions(-)

-- 
1.9.3

[PATCH net-next 1/2] bpf: pass original insn directly to convert_ctx_access

2017-01-11 Thread Daniel Borkmann

Currently, when calling convert_ctx_access() callback for the various
program types, we pass in insn->dst_reg, insn->src_reg, insn->off from
the original instruction. This information is needed to rewrite the
instruction that is based on the user ctx structure into a kernel
representation for the ctx. As we'd like to allow access size beyond
just BPF_W, we'd need also insn->code for that in order to decode the
original access size. Given that, let's just pass insn directly to the
convert_ctx_access() callback and work on that, so as not to clutter the
callback with even more arguments when everything is already contained
in insn. So let's go through that once; no functional change.
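
As a rough illustration of why the whole instruction is worth passing, the access size can be decoded from the opcode alone. The size constants below mirror the BPF uapi encoding; struct insn is a simplified, hypothetical stand-in for struct bpf_insn (the real one packs the registers into bitfields and carries an immediate), and access_bytes() is a made-up helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size field of a BPF load/store opcode (values as in the BPF uapi). */
#define BPF_SIZE(code)	((code) & 0x18)
#define BPF_W		0x00	/* 4 bytes */
#define BPF_H		0x08	/* 2 bytes */
#define BPF_B		0x10	/* 1 byte  */
#define BPF_DW		0x18	/* 8 bytes */

/* Simplified stand-in for struct bpf_insn. */
struct insn {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
};

/*
 * A callback that receives the instruction itself can recover the
 * access width without growing another parameter.
 */
static int access_bytes(const struct insn *si)
{
	assert(si != NULL);

	switch (BPF_SIZE(si->code)) {
	case BPF_B:
		return 1;
	case BPF_H:
		return 2;
	case BPF_W:
		return 4;
	default:	/* BPF_DW */
		return 8;
	}
}
```

A callback taking (type, dst_reg, src_reg, ctx_off) would need a fifth argument for this; one taking the instruction pointer reads si->code, si->dst_reg, si->src_reg and si->off from the same place.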

Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
---
 include/linux/bpf.h  |   7 ++-
 kernel/bpf/verifier.c|   3 +-
 kernel/trace/bpf_trace.c |  15 ++---
 net/core/filter.c| 139 +--
 4 files changed, 87 insertions(+), 77 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 94ea8d2..f8c3560 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -161,9 +161,10 @@ struct bpf_verifier_ops {
enum bpf_reg_type *reg_type);
int (*gen_prologue)(struct bpf_insn *insn, bool direct_write,
const struct bpf_prog *prog);
-   u32 (*convert_ctx_access)(enum bpf_access_type type, int dst_reg,
- int src_reg, int ctx_off,
- struct bpf_insn *insn, struct bpf_prog *prog);
+   u32 (*convert_ctx_access)(enum bpf_access_type type,
+ const struct bpf_insn *src,
+ struct bpf_insn *dst,
+ struct bpf_prog *prog);
 };
 
 struct bpf_prog_type_list {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2efdc91..df7e472 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3177,8 +3177,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env 
*env)
if (env->insn_aux_data[i].ptr_type != PTR_TO_CTX)
continue;
 
-   cnt = ops->convert_ctx_access(type, insn->dst_reg, 
insn->src_reg,
- insn->off, insn_buf, env->prog);
+   cnt = ops->convert_ctx_access(type, insn, insn_buf, env->prog);
if (cnt == 0 || cnt >= ARRAY_SIZE(insn_buf)) {
verbose("bpf verifier is misconfigured\n");
return -EINVAL;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index f883c43..1860e7f 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -572,28 +572,29 @@ static bool pe_prog_is_valid_access(int off, int size, 
enum bpf_access_type type
return true;
 }
 
-static u32 pe_prog_convert_ctx_access(enum bpf_access_type type, int dst_reg,
- int src_reg, int ctx_off,
+static u32 pe_prog_convert_ctx_access(enum bpf_access_type type,
+ const struct bpf_insn *si,
  struct bpf_insn *insn_buf,
  struct bpf_prog *prog)
 {
struct bpf_insn *insn = insn_buf;
 
-   switch (ctx_off) {
+   switch (si->off) {
case offsetof(struct bpf_perf_event_data, sample_period):
BUILD_BUG_ON(FIELD_SIZEOF(struct perf_sample_data, period) != 
sizeof(u64));
 
*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct 
bpf_perf_event_data_kern,
-  data), dst_reg, src_reg,
+  data), si->dst_reg, 
si->src_reg,
  offsetof(struct bpf_perf_event_data_kern, 
data));
-   *insn++ = BPF_LDX_MEM(BPF_DW, dst_reg, dst_reg,
+   *insn++ = BPF_LDX_MEM(BPF_DW, si->dst_reg, si->dst_reg,
  offsetof(struct perf_sample_data, 
period));
break;
default:
*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct 
bpf_perf_event_data_kern,
-  regs), dst_reg, src_reg,
+  regs), si->dst_reg, 
si->src_reg,
  offsetof(struct bpf_perf_event_data_kern, 
regs));
-   *insn++ = BPF_LDX_MEM(BPF_SIZEOF(long), dst_reg, dst_reg, 
ctx_off);
+   *insn++ = BPF_LDX_MEM(BPF_SIZEOF(long), si->dst_reg, 
si->dst_reg,
+ si->off);
break;
}
 
diff --git a/net/core/filter.c b/net/core/filter.c
index f4d16a9..8cfbdef 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2972,32 +2972,33 @@ void bpf_warn_invalid_xdp_action(u32

[PATCH net-next 2/2] bpf: allow b/h/w/dw access for bpf's cb in ctx

2017-01-11 Thread Daniel Borkmann

When structs are used to store temporary state in the cb[] buffer that is
used with programs and among tail calls, the generated code will not
always access the buffer in BPF_W chunks. We can ease programming and
make this more natural by allowing aligned b/h/w/dw sized access to the
cb[] ctx member. Various test cases are added to the selftest suite as
well. Potentially, this can also be reused for other program types to
pass data around.
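
The access rule can be restated as a small model. This sketch uses offsets relative to the start of cb[] and assumes cb[] itself is 8-byte aligned inside the context struct; CB_BYTES and cb_access_ok() are hypothetical names, and the check in the patch is written in terms of absolute struct __sk_buff offsets instead.

```c
#include <assert.h>

#define CB_BYTES 20	/* five __u32 slots: cb[0] .. cb[4] */

/*
 * Accept 1, 2, 4 or 8 byte accesses that are naturally aligned and lie
 * entirely inside cb[]. 'off' is relative to the start of cb[].
 */
static int cb_access_ok(int off, int size)
{
	assert(size > 0);	/* the verifier guarantees this */

	if (size != 1 && size != 2 && size != 4 && size != 8)
		return 0;
	if (off < 0 || off % size != 0)
		return 0;

	return off + size <= CB_BYTES;
}
```

Note the asymmetry at the tail: a 4-byte access at offset 16 (cb[4]) is fine, while an 8-byte access there would run past the end of the buffer, which is why the last valid 8-byte slot starts at cb[2].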

Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
---
 kernel/bpf/verifier.c   |   8 +-
 net/core/filter.c   |  41 ++-
 tools/testing/selftests/bpf/test_verifier.c | 442 +++-
 3 files changed, 478 insertions(+), 13 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index df7e472..d60e12c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3165,10 +3165,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env 
*env)
insn = env->prog->insnsi + delta;
 
for (i = 0; i < insn_cnt; i++, insn++) {
-   if (insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
+   if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) ||
+   insn->code == (BPF_LDX | BPF_MEM | BPF_H) ||
+   insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
insn->code == (BPF_LDX | BPF_MEM | BPF_DW))
type = BPF_READ;
-   else if (insn->code == (BPF_STX | BPF_MEM | BPF_W) ||
+   else if (insn->code == (BPF_STX | BPF_MEM | BPF_B) ||
+insn->code == (BPF_STX | BPF_MEM | BPF_H) ||
+insn->code == (BPF_STX | BPF_MEM | BPF_W) ||
 insn->code == (BPF_STX | BPF_MEM | BPF_DW))
type = BPF_WRITE;
else
diff --git a/net/core/filter.c b/net/core/filter.c
index 8cfbdef..9038386 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2776,11 +2776,33 @@ static bool __is_valid_access(int off, int size)
 {
if (off < 0 || off >= sizeof(struct __sk_buff))
return false;
+
/* The verifier guarantees that size > 0. */
if (off % size != 0)
return false;
-   if (size != sizeof(__u32))
-   return false;
+
+   switch (off) {
+   case offsetof(struct __sk_buff, cb[0]) ...
+offsetof(struct __sk_buff, cb[4]) + sizeof(__u32) - 1:
+   if (size == sizeof(__u16) &&
+   off > offsetof(struct __sk_buff, cb[4]) + sizeof(__u16))
+   return false;
+   if (size == sizeof(__u32) &&
+   off > offsetof(struct __sk_buff, cb[4]))
+   return false;
+   if (size == sizeof(__u64) &&
+   off > offsetof(struct __sk_buff, cb[2]))
+   return false;
+   if (size != sizeof(__u8)  &&
+   size != sizeof(__u16) &&
+   size != sizeof(__u32) &&
+   size != sizeof(__u64))
+   return false;
+   break;
+   default:
+   if (size != sizeof(__u32))
+   return false;
+   }
 
return true;
 }
@@ -2799,7 +2821,7 @@ static bool sk_filter_is_valid_access(int off, int size,
if (type == BPF_WRITE) {
switch (off) {
case offsetof(struct __sk_buff, cb[0]) ...
-offsetof(struct __sk_buff, cb[4]):
+offsetof(struct __sk_buff, cb[4]) + sizeof(__u32) - 1:
break;
default:
return false;
@@ -2823,7 +2845,7 @@ static bool lwt_is_valid_access(int off, int size,
case offsetof(struct __sk_buff, mark):
case offsetof(struct __sk_buff, priority):
case offsetof(struct __sk_buff, cb[0]) ...
-offsetof(struct __sk_buff, cb[4]):
+offsetof(struct __sk_buff, cb[4]) + sizeof(__u32) - 1:
break;
default:
return false;
@@ -2915,7 +2937,7 @@ static bool tc_cls_act_is_valid_access(int off, int size,
case offsetof(struct __sk_buff, tc_index):
case offsetof(struct __sk_buff, priority):
case offsetof(struct __sk_buff, cb[0]) ...
-offsetof(struct __sk_buff, cb[4]):
+offsetof(struct __sk_buff, cb[4]) + sizeof(__u32) - 1:
case offsetof(struct __sk_buff, tc_classid):
break;
default:
@@ -3066,8 +3088,11 @@ static u32 sk_filter_convert_ctx_access(enum 
bpf_access_type type,
  si->dst_reg, si->src_reg, insn);
 
case offsetof(struct __sk_buff, cb[0]) ...
-offsetof(struct

[PATCH net-next] liquidio VF: reduce load time of module

2017-01-11 Thread Felix Manlunas

From: Prasad Kanneganti 

Reduce the load time of the VF driver by decreasing the wait time between
iterations of the loop that polls for a mailbox response from the PF. Also
change the wait time units from jiffies to milliseconds.
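
Why the unit matters can be seen with a small model. A raw count of 10 jiffies lasts a different amount of time depending on the kernel's HZ setting, while a value given in milliseconds is converted per configuration. ms_to_ticks() and ticks_to_ms() below are hypothetical helpers; the first is only a simplified approximation of what msecs_to_jiffies() does (round up to whole ticks).

```c
#include <assert.h>

/* Simplified model of msecs_to_jiffies(): round up to whole ticks. */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
	assert(hz > 0);
	return (ms * hz + 999) / 1000;
}

/* How long a raw tick count lasts, in milliseconds, at a given HZ. */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
	assert(hz > 0);
	return ticks * 1000 / hz;
}
```

In this model a fixed 10-tick wait is 100 ms per iteration at HZ=100 but 10 ms at HZ=1000, whereas a 1 ms request becomes a single tick in both cases. With up to LIO_MBOX_WRITE_WAIT_CNT (1000) iterations, the per-iteration wait dominates the worst-case load time.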

Signed-off-by: Prasad Kanneganti 
Signed-off-by: Felix Manlunas 
Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
---
 drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c | 5 +++--
 drivers/net/ethernet/cavium/liquidio/octeon_mailbox.h | 4 ++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c 
b/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
index 73696b42..201b987 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
@@ -131,6 +131,7 @@ int octeon_mbox_write(struct octeon_device *oct,
 {
struct octeon_mbox *mbox = oct->mbox[mbox_cmd->q_no];
u32 count, i, ret = OCTEON_MBOX_STATUS_SUCCESS;
+   long timeout = LIO_MBOX_WRITE_WAIT_TIME;
unsigned long flags;
 
spin_lock_irqsave(&mbox->lock, flags);
@@ -158,7 +159,7 @@ int octeon_mbox_write(struct octeon_device *oct,
count = 0;
 
while (readq(mbox->mbox_write_reg) != OCTEON_PFVFSIG) {
-   schedule_timeout_uninterruptible(LIO_MBOX_WRITE_WAIT_TIME);
+   schedule_timeout_uninterruptible(timeout);
if (count++ == LIO_MBOX_WRITE_WAIT_CNT) {
ret = OCTEON_MBOX_STATUS_FAILED;
break;
@@ -171,7 +172,7 @@ int octeon_mbox_write(struct octeon_device *oct,
count = 0;
while (readq(mbox->mbox_write_reg) !=
   OCTEON_PFVFACK) {
-   schedule_timeout_uninterruptible(10);
+   schedule_timeout_uninterruptible(timeout);
if (count++ == LIO_MBOX_WRITE_WAIT_CNT) {
ret = OCTEON_MBOX_STATUS_FAILED;
break;
diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.h 
b/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.h
index fe60a3e..c9376fe 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.h
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.h
@@ -31,8 +31,8 @@
 #define OCTEON_PFVFSIG 0x1122334455667788
 #define OCTEON_PFVFERR 0xDEADDEADDEADDEAD
 
-#define LIO_MBOX_WRITE_WAIT_CNT  1000
-#define LIO_MBOX_WRITE_WAIT_TIME   10
+#define LIO_MBOX_WRITE_WAIT_CNT 1000
+#define LIO_MBOX_WRITE_WAIT_TIMEmsecs_to_jiffies(1)
 
 enum octeon_mbox_cmd_status {
OCTEON_MBOX_STATUS_SUCCESS = 0,

RE: Marvell Phy (1510) issue since v4.7 kernel

2017-01-11 Thread Kwok, WingMan



> -Original Message-
> From: Kwok, WingMan
> Sent: Wednesday, January 11, 2017 5:32 PM
> To: 'Andrew Lunn'; 'rmk+ker...@arm.linux.org.uk'
> Cc: Karicheri, Muralidharan; netdev@vger.kernel.org
> Subject: RE: Marvell Phy (1510) issue since v4.7 kernel
> 
> 
> 
> > -Original Message-
> > From: Andrew Lunn [mailto:and...@lunn.ch]
> > Sent: Wednesday, January 11, 2017 5:28 PM
> > To: Kwok, WingMan
> > Cc: Karicheri, Muralidharan; netdev@vger.kernel.org
> > Subject: Re: Marvell Phy (1510) issue since v4.7 kernel
> >
> > On Wed, Jan 11, 2017 at 10:18:02PM +, Kwok, WingMan wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: Andrew Lunn [mailto:and...@lunn.ch]
> > > > Sent: Monday, January 09, 2017 8:54 PM
> > > > To: Kwok, WingMan
> > > > Cc: Karicheri, Muralidharan; netdev@vger.kernel.org
> > > > Subject: Re: Marvell Phy (1510) issue since v4.7 kernel
> > > >
> > > > > From Marvell's brief description
> > > > http://www.marvell.com/transceivers/alaska-gbe/,
> > > > > it seems that 88E1510/1518 don't support fiber. Only 88E1512
> > does.
> > > > In
> > > > > that case, the fiber support patch is not applicable to
> > > > 88E1510/1518.
> > > >
> > > > O.K. That makes it easier.
> > > >
> > > > Please add the relevant IDs to include/linux/marvell.h, and add
> > > > entries to driver/net/phy/marvell.c for 1512 with
> SUPPORTED_FIBRE
> > and
> > > > 1510 and 1518 without.
> > > >
> > > >  Andrew
> > >
> > > By any chance you have the ID of 1512?
> >
> > Nope, sorry.
> >
> > Try Russell King.
> >
> > Andrew
> 
> Russell,
> 
> Do you have the ID of Marvell PHY 88E1512?
> 
> Thanks,
> WingMan
> 

Looping in Charles-Antoine (author of the patch
with commit id 6cfb3bcc).

Charles-Antoine,

Do you have the ID of Marvell PHY 88E1512?

Thanks,
WingMan

[PATCH net-next] liquidio: remove unnecessary code

2017-01-11 Thread Felix Manlunas

Remove code that's no longer needed.  It used to serve a purpose, which was
to fix a link-related bug.  For a while now, the NIC firmware has had a
more elegant fix for that bug.

Signed-off-by: Felix Manlunas 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
---
 drivers/net/ethernet/cavium/liquidio/lio_main.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c 
b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index b8b579d..cc825d5 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -2693,13 +2693,7 @@ static int liquidio_stop(struct net_device *netdev)
lio->linfo.link.s.link_up = 0;
lio->link_changes++;
 
-   /* Pause for a moment and wait for Octeon to flush out (to the wire) any
-* egress packets that are in-flight.
-*/
-   set_current_state(TASK_INTERRUPTIBLE);
-   schedule_timeout(msecs_to_jiffies(100));
-
-   /* Now it should be safe to tell Octeon that nic interface is down. */
+   /* Tell Octeon that nic interface is down. */
send_rx_ctrl_cmd(lio, 0);
 
if (OCTEON_CN23XX_PF(oct)) {

Re: Marvell Phy (1510) issue since v4.7 kernel

2017-01-11 Thread Andrew Lunn

> Hi Andrew,
> 
> I double checked that ours is actually a 88E1514. According
> to Marvell's brief description, it seems that it does support
> fiber.
> 
> Reading page 18 reg 30 returns 6.

O.K, that is consistent. Looking at the Marvell SDK:

phy/Include/madApiDefs.h:#define MAD_SUB_88E1510  4   /* 88E1510 */
phy/Include/madApiDefs.h:#define MAD_SUB_88E1518  5   /* 88E1518 */
phy/Include/madApiDefs.h:#define MAD_SUB_88E1512  6   /* 88E1512/88E1514 */

I took another look at the patch adding Fibre support. The
suspend/resume functions are wrong.

/* Suspend the fiber mode first */
if (!(phydev->supported & SUPPORTED_FIBRE)) {
 
The ! should not be there. Does removing them both fix your problem?

Andrew
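
For reference, the sub-model values quoted from the SDK header can be turned into a small lookup. This is only a sketch: the helper names marvell_sub_model() and marvell_sub_has_fibre() are made up for illustration and do not exist in the kernel driver, and it assumes the value read from page 18, register 30 is compared as-is, the way the thread reports it ("returns 6").

```c
#include <stddef.h>

/* Sub-model codes, copied from the madApiDefs.h lines quoted above. */
enum mad_sub_id {
	MAD_SUB_88E1510 = 4,
	MAD_SUB_88E1518 = 5,
	MAD_SUB_88E1512 = 6,	/* the 88E1514 reports this value too */
};

/* Hypothetical helper (not in the kernel): name the part for a value
 * read from page 18, register 30. Returns NULL for unlisted values.
 */
static const char *marvell_sub_model(unsigned int val)
{
	switch (val) {
	case MAD_SUB_88E1510:
		return "88E1510";
	case MAD_SUB_88E1518:
		return "88E1518";
	case MAD_SUB_88E1512:
		return "88E1512/88E1514";
	default:
		return NULL;
	}
}

/* Hypothetical helper: per this thread only the 1512/1514 do fibre. */
static int marvell_sub_has_fibre(unsigned int val)
{
	return val == MAD_SUB_88E1512;
}
```

With the value WingMan read (6), marvell_sub_model() names the 88E1512/88E1514 and marvell_sub_has_fibre() is true, which matches the brief description mentioned in the thread.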

RE: Marvell Phy (1510) issue since v4.7 kernel

2017-01-11 Thread Kwok, WingMan


> Hi WingMan
> 
> Do you know you really have a 1510? You can see it written on the
> chip?
> 
> Please can you read page 18, register 30 and let us know what value
> you get.
> 
> Andrew

Removed Charles-Antoine from the list since the email
address is no longer valid.

Hi Andrew,

I double checked that ours is actually a 88E1514. According
to Marvell's brief description, it seems that it does support
fiber.

Reading page 18 reg 30 returns 6.

Thanks,
WingMan

Re: TCP using IPv4-mapped IPv6 address as source

2017-01-11 Thread Jonathan T. Leighton


On 1/11/17 4:47 PM, Eric Dumazet wrote:
> On Wed, 2017-01-11 at 16:26 -0500, Jonathan T. Leighton wrote:
>
> > I'm sure I understand what you're saying here. There should be no flow
> > to terminate.

I think you figured out that I meant "I'm not sure I understand...".

> rfc2765 describes a way to use IPv4-mapped IPv6 packets on the wire.

I don't agree - I didn't read rfc2765 because it's obsolete, but the
current version does not allow the use of IPv4-mapped IPv6 addresses.
rfc2765 is obsoleted by rfc6145, and that in turn by rfc7915. rfc7915
refers to both rfc6052 and rfc6219 for descriptions of the allowable
mechanisms for translating from IPv4 to IPv6, and the mechanisms in each
of those documents preclude the use of IPv4-mapped IPv6 addresses
(::ffff:0:0/96). There's no conflict with rfc6890 (BCP153), which
explicitly precludes the use of IPv4-mapped IPv6 addresses as a source
(or destination) address.

> What I meant by 'terminating' was that it does not tell if an end system
> (a host) is allowed to natively generate these packets.
>
> Anyway,
> https://tools.ietf.org/html/draft-itojun-v6ops-v4mapped-harmful-00
>
> (which does not appear to be an RFC), tells us this would be
> dangerous ;)
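
To make the address class under discussion concrete, here is a minimal userspace sketch that tests whether a textual address falls inside the IPv4-mapped prefix (::ffff:0:0/96). It relies on the standard inet_pton() and IN6_IS_ADDR_V4MAPPED() from the sockets headers; the wrapper name is_v4mapped() is made up for this example and says nothing about where the kernel would place such a check.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical wrapper: returns 1 if 'text' is an IPv6 address inside
 * ::ffff:0:0/96, 0 if it is some other IPv6 address, and -1 if it does
 * not parse as an IPv6 address at all.
 */
static int is_v4mapped(const char *text)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return -1;

	return IN6_IS_ADDR_V4MAPPED(&addr) ? 1 : 0;
}
```

For example, is_v4mapped("::ffff:192.0.2.1") yields 1, while an ordinary address such as "2001:db8::1" yields 0.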

[PATCH] net: ipv4: fix table id in getroute response

2017-01-11 Thread David Ahern

rtm_table is an 8-bit field while table ids are allowed up to u32. Commit
709772e6e065 ("net: Fix routing tables with id > 255 for legacy software")
added the preference to set rtm_table in dumps to RT_TABLE_COMPAT if the
table id is > 255. The table id returned on get route requests should do
the same.

Fixes: c36ba6603a11 ("net: Allow user to get table id from route lookup")
Signed-off-by: David Ahern 
---
 net/ipv4/route.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 0fcac8e7a2b2..709ffe67d1de 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2472,7 +2472,7 @@ static int rt_fill_info(struct net *net,  __be32 dst, __be32 src, u32 table_id,
r->rtm_dst_len  = 32;
r->rtm_src_len  = 0;
r->rtm_tos  = fl4->flowi4_tos;
-   r->rtm_table= table_id;
+   r->rtm_table= table_id < 256 ? table_id : RT_TABLE_COMPAT;
if (nla_put_u32(skb, RTA_TABLE, table_id))
goto nla_put_failure;
r->rtm_type = rt->rt_type;
-- 
2.1.4
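
The effect of the one-line change can be tried outside the kernel. The sketch below mirrors the expression from the patch; rtm_table_for() is a made-up name, and RT_TABLE_COMPAT is defined locally with the value 252 that the uapi rtnetlink header uses, so the snippet builds without kernel headers.

```c
#include <stdint.h>

/* Local copy for illustration; mirrors RT_TABLE_COMPAT in the uapi
 * rtnetlink header.
 */
#define RT_TABLE_COMPAT 252

/* Hypothetical helper: value to store in the 8-bit rtm_table field.
 * Ids that fit are passed through; larger ids are reported as
 * RT_TABLE_COMPAT, and the full 32-bit id travels in RTA_TABLE.
 */
static uint8_t rtm_table_for(uint32_t table_id)
{
	return table_id < 256 ? (uint8_t)table_id : RT_TABLE_COMPAT;
}
```

Without the check, an id such as 1000 would be silently truncated to 232 when assigned to the 8-bit field; with it, legacy software sees 252 and can fall back to the RTA_TABLE attribute.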

Re: Marvell Phy (1510) issue since v4.7 kernel

2017-01-11 Thread Andrew Lunn

> > So i think the correct question should be, how can we tell the 88E1512
> > from the 88E1510 if they have the same ID in register 3.

Hi WingMan

Do you know you really have a 1510? You can see it written on the
chip?

Please can you read page 18, register 30 and let us know what value
you get.

Andrew

Re: [net PATCH 0/2] Don't use lco_csum to compute IPv4 checksum

2017-01-11 Thread Stephen Rothwell

Hi Dave,

On Tue, 29 Nov 2016 20:07:08 +1100 Stephen Rothwell  
wrote:
>
> Hi Jeff,
> 
> On Tue, 29 Nov 2016 10:43:04 +1100 Stephen Rothwell  
> wrote:
> >
> > On Mon, 28 Nov 2016 14:26:02 -0800 Jeff Kirsher 
> >  wrote:  
> > >
> > > On Mon, 2016-11-28 at 10:42 -0500, Alexander Duyck wrote:
> > > > When I implemented the GSO partial support in the Intel drivers I was
> > > > using
> > > > lco_csum to compute the checksum that we needed to plug into the IPv4
> > > > checksum field in order to cancel out the data that was not a part of 
> > > > the
> > > > IPv4 header.  However this didn't take into account that the transport
> > > > offset might be pointing to the inner transport header.
> > > > 
> > > > Instead of using lco_csum I have just coded around it so that we can use
> > > > the outer IP header plus the IP header length to determine where we need
> > > > to
> > > > start our checksum and then just call csum_partial ourselves.
> > > > 
> > > > This should fix the SIT issue reported on igb interfaces as well as
> > > > simliar
> > > > issues that would pop up on other Intel NICs.
> > > > 
> > > > ---
> > > > 
> > > > Alexander Duyck (2):
> > > >   igb/igbvf: Don't use lco_csum to compute IPv4 checksum
> > > >   ixgbe/ixgbevf: Don't use lco_csum to compute IPv4 checksum  
> > > 
> > > Stephen, I have applied Alex's patches to my net-queue tree.  Can you
> > > confirm they resolve the bug seen?
> > 
> > Its a bit tricky because the origin problem only happens on my
> > production server (ozlabs.org), but I will see if I can manage to just
> > remove and reload the driver ...  though, the server is running a 4.7.8
> > kernel and I am wondering how well these patches will apply?  
> 
> We have a winner!  This fixes my problem, so I can run at full speed
> with gso and tso enabled in the sit interface and tx-gso-partial
> enabled on the underlying ethernet.
> 
> Thanks to everyone for diagnosis and solution.
> 
> It would be nice if this fix went into the stable kernels as well so it
> will turn up in the distro kernels eventually.

It looks like these 2 commits:

516165a1e2f2 igb/igbvf: Don't use lco_csum to compute IPv4 checksum
c54cdc316dbd ixgbe/ixgbevf: Don't use lco_csum to compute IPv4 checksum

have not been sent to stable.  They fix a bug introduced in v4.7-rc1
and so need to go into 4.7 and 4.8 stable trees.

-- 
Cheers,
Stephen Rothwell

[PATCH] xen-netfront: Fix Rx stall during network stress and OOM

2017-01-11 Thread Vineeth Remanan Pillai

During an OOM scenario, request slots could not be created as skb
allocation fails. So the netback cannot pass in packets and netfront
wrongly assumes that there is no more work to be done and it disables
polling. This causes Rx to stall.

Fix is to consider the skb allocation failure as an error and in the
event of this error, re-enable polling so that request slots could be
created when memory is available.

Signed-off-by: Vineeth Remanan Pillai 
---
 drivers/net/xen-netfront.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 40f26b6..8275549 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -277,13 +277,14 @@ static struct sk_buff *xennet_alloc_one_rx_buffer(struct netfront_queue *queue)
 }
 
 
-static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
+static int xennet_alloc_rx_buffers(struct netfront_queue *queue)
 {
RING_IDX req_prod = queue->rx.req_prod_pvt;
int notify;
+   int err = 0;
 
if (unlikely(!netif_carrier_ok(queue->info->netdev)))
-   return;
+   return err;
 
for (req_prod = queue->rx.req_prod_pvt;
 req_prod - queue->rx.rsp_cons < NET_RX_RING_SIZE;
@@ -295,8 +296,10 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
struct xen_netif_rx_request *req;
 
skb = xennet_alloc_one_rx_buffer(queue);
-   if (!skb)
+   if (!skb) {
+   err = -ENOMEM;
break;
+   }
 
id = xennet_rxidx(req_prod);
 
@@ -321,9 +324,9 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
queue->rx.req_prod_pvt = req_prod;
 
/* Not enough requests? Try again later. */
-   if (req_prod - queue->rx.rsp_cons < NET_RX_SLOTS_MIN) {
+   if (req_prod - queue->rx.sring->rsp_prod < NET_RX_SLOTS_MIN) {
mod_timer(&queue->rx_refill_timer, jiffies + (HZ/10));
-   return;
+   return err;
}
 
wmb();  /* barrier so backend seens requests */
@@ -331,6 +334,8 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&queue->rx, notify);
if (notify)
notify_remote_via_irq(queue->rx_irq);
+
+   return err;
 }
 
 static int xennet_open(struct net_device *dev)
@@ -1046,7 +1051,7 @@ static int xennet_poll(struct napi_struct *napi, int budget)
 
work_done -= handle_incoming_queue(queue, &rxq);
 
-   xennet_alloc_rx_buffers(queue);
+   err = xennet_alloc_rx_buffers(queue);
 
if (work_done < budget) {
int more_to_do = 0;
@@ -1054,7 +1059,11 @@ static int xennet_poll(struct napi_struct *napi, int budget)
napi_complete(napi);
 
RING_FINAL_CHECK_FOR_RESPONSES(&queue->rx, more_to_do);
-   if (more_to_do)
+
+   /* If there is more work to do or could not allocate
+* rx buffers, re-enable polling.
+*/
+   if (more_to_do || err != 0)
napi_schedule(napi);
}
 
-- 
2.1.2.AMZN
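
The decision the patch adds to xennet_poll() can be isolated as a small predicate. The sketch below is a userspace model only: refill_rx_ring() and must_reschedule() are invented names, and the "skbs available" counter stands in for whatever the allocator can actually provide.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the refill step: fails with -ENOMEM when
 * fewer buffers can be allocated than the ring wants.
 */
static int refill_rx_ring(int skbs_available, int slots_wanted)
{
	return skbs_available >= slots_wanted ? 0 : -ENOMEM;
}

/* Hypothetical predicate modelling the patched branch: once the poll
 * finished under budget, polling is re-armed if responses are pending
 * or if the refill failed.
 */
static bool must_reschedule(int work_done, int budget, bool more_to_do,
			    int err)
{
	if (work_done >= budget)
		return false;	/* this branch is not taken at all */

	return more_to_do || err != 0;
}
```

Before the patch the second condition was missing, so a failed refill with no pending responses left polling disabled, which is the Rx stall described in the commit message.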

RE: Marvell Phy (1510) issue since v4.7 kernel

2017-01-11 Thread Kwok, WingMan



> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Wednesday, January 11, 2017 6:00 PM
> To: Kwok, WingMan
> Cc: rmk+ker...@arm.linux.org.uk; charles-antoine.cou...@nexvision.fr;
> Karicheri, Muralidharan; netdev@vger.kernel.org
> Subject: Re: Marvell Phy (1510) issue since v4.7 kernel
> 
> > looping in Charles-Antoine (author of patch
> > with commit id 6cfb3bcc)
> >
> > Charles-Antoine,
> >
> > Do you have the ID of Marvell PHY 88E1512?
> 
> I suspect that is the wrong question to ask.
> 
> The Marvell driver is being loaded, so it must be using one of the IDs
> in the driver. There is no ID in the driver specifically for the
> 88E1512. It seems like the 88E1512 uses the 88E1510 ID.
> 
> So i think the correct question should be, how can we tell the 88E1512
> from the 88E1510 if they have the same ID in register 3.
> 
> It appears that for the 88E1512, page 0 are the copper registers and
> page 1 is the fibre registers. Maybe the 88E1512 has an ID in page 1
> register 3? Maybe the 88E1510 does not have an ID in page 1 register
> 3?
> 
>   Andrew

Andrew,

Would Charles-Antoine be the better person to submit a patch
to fix the original problem then, since he tested the original
fiber support patch with 1512? I unfortunately don't have the
datasheet for 1512, and it does not seem to be available publicly.

Regards,
WingMan

Re: Marvell Phy (1510) issue since v4.7 kernel

2017-01-11 Thread Andrew Lunn

> looping in Charles-Antoine (author of patch
> with commit id 6cfb3bcc)
> 
> Charles-Antoine,
> 
> Do you have the ID of Marvell PHY 88E1512?

I suspect that is the wrong question to ask.

The Marvell driver is being loaded, so it must be using one of the IDs
in the driver. There is no ID in the driver specifically for the
88E1512. It seems like the 88E1512 uses the 88E1510 ID.

So i think the correct question should be, how can we tell the 88E1512
from the 88E1510 if they have the same ID in register 3.

It appears that for the 88E1512, page 0 are the copper registers and
page 1 is the fibre registers. Maybe the 88E1512 has an ID in page 1
register 3? Maybe the 88E1510 does not have an ID in page 1 register
3?

Andrew

[PATCH] tilepro: Fix non-void return from void function

2017-01-11 Thread Joe Perches

commit bc1f44709cf2 ("net: make ndo_get_stats64 a void function")
mistakenly used a return value for this void conversion.

Fix it.

Signed-off-by: Joe Perches 
cc: stephen hemminger 
---
 drivers/net/ethernet/tile/tilepro.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/tile/tilepro.c b/drivers/net/ethernet/tile/tilepro.c
index 30cfea62a356..44f153713791 100644
--- a/drivers/net/ethernet/tile/tilepro.c
+++ b/drivers/net/ethernet/tile/tilepro.c
@@ -2090,12 +2090,8 @@ static void tile_net_get_stats64(struct net_device *dev,
stats->tx_bytes   = tx_bytes;
stats->rx_errors  = rx_errors;
stats->rx_dropped = rx_dropped;
-
-   return stats;
 }
 
-
-
 /*
  * Change the Ethernet Address of the NIC.
  *
-- 
2.10.0.rc2.1.g053435c

[PATCH] [v2] net: qcom/emac: grab a reference to the phydev on ACPI systems

2017-01-11 Thread Timur Tabi

Commit 6ffe1c4cd0a7 ("net: qcom/emac: fix of_node and phydev leaks")
fixed the problem with reference leaks on phydev, but the fix is
device-tree specific.  When the driver unloads, the reference is
dropped only on DT systems.

Instead, it's cleaner if we grab a reference on ACPI systems.
When the driver unloads, we can drop the reference without having
to check whether we're on a DT system.

Signed-off-by: Timur Tabi 
---

Notes:
v2: add check for null pointer

 drivers/net/ethernet/qualcomm/emac/emac-phy.c | 7 +++
 drivers/net/ethernet/qualcomm/emac/emac.c | 6 ++
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-phy.c b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
index 99a14df..2851b4c 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-phy.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
@@ -201,6 +201,13 @@ int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
else
adpt->phydev = mdiobus_get_phy(mii_bus, phy_addr);
 
+   /* of_phy_find_device() claims a reference to the phydev,
+* so we do that here manually as well. When the driver
+* later unloads, it can unilaterally drop the reference
+* without worrying about ACPI vs DT.
+*/
+   if (adpt->phydev)
+   get_device(&adpt->phydev->mdio.dev);
} else {
struct device_node *phy_np;
 
diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c b/drivers/net/ethernet/qualcomm/emac/emac.c
index 6ffe192..0aac0de 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac.c
@@ -729,8 +729,7 @@ static int emac_probe(struct platform_device *pdev)
 err_undo_napi:
netif_napi_del(&adpt->rx_q.napi);
 err_undo_mdiobus:
-   if (!has_acpi_companion(&pdev->dev))
-   put_device(&adpt->phydev->mdio.dev);
+   put_device(&adpt->phydev->mdio.dev);
mdiobus_unregister(adpt->mii_bus);
 err_undo_clocks:
emac_clks_teardown(adpt);
@@ -750,8 +749,7 @@ static int emac_remove(struct platform_device *pdev)
 
emac_clks_teardown(adpt);
 
-   if (!has_acpi_companion(&pdev->dev))
-   put_device(&adpt->phydev->mdio.dev);
+   put_device(&adpt->phydev->mdio.dev);
mdiobus_unregister(adpt->mii_bus);
free_netdev(netdev);
 
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
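
The idea of the patch, taking a reference on both lookup paths so that teardown can drop it unconditionally, can be modelled with a plain counter. Everything in this sketch is hypothetical (struct fake_dev and the fake_* helpers are not kernel APIs); it only assumes what the commit message states: the DT lookup returns with a reference held and the ACPI lookup does not.

```c
#include <stddef.h>

/* Toy device with a bare reference counter (illustration only). */
struct fake_dev {
	int refcount;
};

static void fake_get(struct fake_dev *d) { d->refcount++; }
static void fake_put(struct fake_dev *d) { d->refcount--; }

/* Models a DT-style lookup that hands back a held reference. */
static struct fake_dev *fake_find_dt(struct fake_dev *d)
{
	if (d)
		fake_get(d);
	return d;
}

/* Models an ACPI-style lookup that takes no reference. */
static struct fake_dev *fake_find_acpi(struct fake_dev *d)
{
	return d;
}

/* After the patch: the ACPI path grabs the reference manually,
 * guarded by the NULL check added in v2.
 */
static struct fake_dev *fake_phy_config(struct fake_dev *d, int is_acpi)
{
	struct fake_dev *phy;

	if (is_acpi) {
		phy = fake_find_acpi(d);
		if (phy)
			fake_get(phy);
	} else {
		phy = fake_find_dt(d);
	}
	return phy;
}

/* Teardown no longer needs to know which path was taken. */
static void fake_remove(struct fake_dev *phy)
{
	fake_put(phy);
}
```

Either path leaves the counter one higher after fake_phy_config() and back at its starting value after fake_remove(), which is the symmetry the commit message argues for.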

RE: Marvell Phy (1510) issue since v4.7 kernel

2017-01-11 Thread Kwok, WingMan



> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Wednesday, January 11, 2017 5:28 PM
> To: Kwok, WingMan
> Cc: Karicheri, Muralidharan; netdev@vger.kernel.org
> Subject: Re: Marvell Phy (1510) issue since v4.7 kernel
> 
> On Wed, Jan 11, 2017 at 10:18:02PM +, Kwok, WingMan wrote:
> >
> >
> > > -Original Message-
> > > From: Andrew Lunn [mailto:and...@lunn.ch]
> > > Sent: Monday, January 09, 2017 8:54 PM
> > > To: Kwok, WingMan
> > > Cc: Karicheri, Muralidharan; netdev@vger.kernel.org
> > > Subject: Re: Marvell Phy (1510) issue since v4.7 kernel
> > >
> > > > From Marvell's brief description
> > > > http://www.marvell.com/transceivers/alaska-gbe/,
> > > > it seems that 88E1510/1518 don't support fiber. Only 88E1512 does.
> > > > In that case, the fiber support patch is not applicable to
> > > > 88E1510/1518.
> > >
> > > O.K. That makes it easier.
> > >
> > > Please add the relevant IDs to include/linux/marvell.h, and add
> > > entries to driver/net/phy/marvell.c for 1512 with SUPPORTED_FIBRE
> > > and 1510 and 1518 without.
> > >
> > >  Andrew
> >
> > By any chance you have the ID of 1512?
> 
> Nope, sorry.
> 
> Try Russell King.
> 
> Andrew

Russell,

Do you have the ID of Marvell PHY 88E1512?

Thanks,
WingMan

[PATCH] net: lwtunnel: Handle lwtunnel_fill_encap failure

2017-01-11 Thread David Ahern

Handle failure in lwtunnel_fill_encap adding attributes to skb.

Fixes: 571e722676fe ("ipv4: support for fib route lwtunnel encap attributes")
Fixes: 19e42e451506 ("ipv6: support for fib route lwtunnel encap attributes")
Signed-off-by: David Ahern 
---
 net/ipv4/fib_semantics.c | 11 +++
 net/ipv6/route.c |  3 ++-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index eba1546b5031..9a375b908d01 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1279,8 +1279,9 @@ int fib_dump_info(struct sk_buff *skb, u32 portid, u32 seq, int event,
nla_put_u32(skb, RTA_FLOW, fi->fib_nh[0].nh_tclassid))
goto nla_put_failure;
 #endif
-   if (fi->fib_nh->nh_lwtstate)
-   lwtunnel_fill_encap(skb, fi->fib_nh->nh_lwtstate);
+   if (fi->fib_nh->nh_lwtstate &&
+   lwtunnel_fill_encap(skb, fi->fib_nh->nh_lwtstate) < 0)
+   goto nla_put_failure;
}
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
if (fi->fib_nhs > 1) {
@@ -1316,8 +1317,10 @@ int fib_dump_info(struct sk_buff *skb, u32 portid, u32 seq, int event,
nla_put_u32(skb, RTA_FLOW, nh->nh_tclassid))
goto nla_put_failure;
 #endif
-   if (nh->nh_lwtstate)
-   lwtunnel_fill_encap(skb, nh->nh_lwtstate);
+   if (nh->nh_lwtstate &&
+   lwtunnel_fill_encap(skb, nh->nh_lwtstate) < 0)
+   goto nla_put_failure;
+
/* length of rtnetlink header + attributes */
rtnh->rtnh_len = nlmsg_get_pos(skb) - (void *) rtnh;
} endfor_nexthops(fi);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index ce5aaf448c54..4f6b067c8753 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3317,7 +3317,8 @@ static int rt6_fill_node(struct net *net,
if (nla_put_u8(skb, RTA_PREF, IPV6_EXTRACT_PREF(rt->rt6i_flags)))
goto nla_put_failure;
 
-   lwtunnel_fill_encap(skb, rt->dst.lwtstate);
+   if (lwtunnel_fill_encap(skb, rt->dst.lwtstate) < 0)
+   goto nla_put_failure;
 
nlmsg_end(skb, nlh);
return 0;
-- 
2.1.4
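
The pattern the patch applies, checking the return value of a fill step and jumping to a common failure label that undoes the partial message, looks roughly like this in a userspace model. The struct msg_buf type and the put_attr() and fill_route() helpers are invented for illustration; the rollback only loosely imitates what the kernel's nla_put_failure paths do.

```c
#include <errno.h>
#include <stddef.h>

/* Toy message buffer with a fixed capacity (illustration only). */
struct msg_buf {
	size_t used;
	size_t cap;
};

/* Append 'len' bytes of attribute data, or fail if it does not fit. */
static int put_attr(struct msg_buf *b, size_t len)
{
	if (b->cap - b->used < len)
		return -EMSGSIZE;
	b->used += len;
	return 0;
}

/* Fill a route message; the optional encap attribute is checked the
 * same way as every other attribute instead of being ignored.
 */
static int fill_route(struct msg_buf *b, size_t base_len, size_t encap_len)
{
	size_t start = b->used;

	if (put_attr(b, base_len) < 0)
		goto put_failure;

	if (encap_len && put_attr(b, encap_len) < 0)
		goto put_failure;

	return 0;

put_failure:
	b->used = start;	/* drop the partially built message */
	return -EMSGSIZE;
}
```

If the encap step's result were ignored, as before the patch, the caller would treat a message without its encap attributes as complete.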

Re: Marvell Phy (1510) issue since v4.7 kernel

2017-01-11 Thread Andrew Lunn

On Wed, Jan 11, 2017 at 10:18:02PM +, Kwok, WingMan wrote:
> 
> 
> > -Original Message-
> > From: Andrew Lunn [mailto:and...@lunn.ch]
> > Sent: Monday, January 09, 2017 8:54 PM
> > To: Kwok, WingMan
> > Cc: Karicheri, Muralidharan; netdev@vger.kernel.org
> > Subject: Re: Marvell Phy (1510) issue since v4.7 kernel
> > 
> > > From Marvell's brief description
> > > http://www.marvell.com/transceivers/alaska-gbe/,
> > > it seems that 88E1510/1518 don't support fiber. Only 88E1512 does.
> > > In that case, the fiber support patch is not applicable to
> > > 88E1510/1518.
> > 
> > O.K. That makes it easier.
> > 
> > Please add the relevant IDs to include/linux/marvell.h, and add
> > entries to driver/net/phy/marvell.c for 1512 with SUPPORTED_FIBRE and
> > 1510 and 1518 without.
> > 
> >  Andrew
> 
> By any chance you have the ID of 1512?

Nope, sorry.

Try Russell King.

Andrew

RE: Marvell Phy (1510) issue since v4.7 kernel

2017-01-11 Thread Kwok, WingMan



> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Monday, January 09, 2017 8:54 PM
> To: Kwok, WingMan
> Cc: Karicheri, Muralidharan; netdev@vger.kernel.org
> Subject: Re: Marvell Phy (1510) issue since v4.7 kernel
> 
> > From Marvell's brief description
> > http://www.marvell.com/transceivers/alaska-gbe/,
> > it seems that 88E1510/1518 don't support fiber. Only 88E1512 does.
> > In that case, the fiber support patch is not applicable to
> > 88E1510/1518.
> 
> O.K. That makes it easier.
> 
> Please add the relevant IDs to include/linux/marvell.h, and add
> entries to driver/net/phy/marvell.c for 1512 with SUPPORTED_FIBRE and
> 1510 and 1518 without.
> 
>  Andrew

By any chance you have the ID of 1512?

Thanks,
WingMan

Re: [PATCH v2 01/13] net: ethernet: aquantia: Make and configuration files.

2017-01-11 Thread Rami Rosen

Hi,


> +++ b/drivers/net/ethernet/aquantia/Makefile
> @@ -0,0 +1,44 @@
...
...
> > +obj-$(CONFIG_AQTION) += atlantic.o
> +
> +atlantic-objs := aq_main.o \
> +   aq_nic.o \
> +   aq_pci_func.o \

Why twice aq_nic.o ? it appears two lines earlier:

> +   aq_nic.o \
> +   aq_vec.o \
> +   aq_ring.o \
...
...

Regards,
Rami Rosen

Re: TCP using IPv4-mapped IPv6 address as source

2017-01-11 Thread Eric Dumazet

On Wed, 2017-01-11 at 16:26 -0500, Jonathan T. Leighton wrote:

> I'm sure I understand what you're saying here. There should be no flow 
> to terminate.

rfc2765 describes a way to use IPv4-mapped IPv6 packets on the wire.

What I meant by 'terminating' was that it does not tell if an end system
(a host) is allowed to natively generate these packets.

Anyway,
https://tools.ietf.org/html/draft-itojun-v6ops-v4mapped-harmful-00 

(which does not appear to be an RFC), tells us this would be
dangerous ;)

Re: [PATCH v2 2/2] stmmac: rename it to synopsys

2017-01-11 Thread David Miller

From: Florian Fainelli 
Date: Wed, 11 Jan 2017 13:14:32 -0800

> As mentioned before, although git is able to track renames, git log does
> not automatically have --follow, so it can be hard for people to track
> down the (new) history of the driver.
> 
> Personally, I don't see much value in doing this rename, especially when
> all the driver internal structures are still going to be named with
> stmmac (and please don't even think about doing a s/stmmac/snps/ inside
> the driver ;)).

I agree, this could really make long term maintenance and bug fix
backporting a nightmare for a lot of people.

Please strongly reconsider, I still don't see any true value in this
rename.

Re: [PATCH net v2] netvsc: add rcu_read locking to netvsc callback

2017-01-11 Thread David Miller

From: Stephen Hemminger 
Date: Wed, 11 Jan 2017 09:16:32 -0800

> The receive callback (in tasklet context) is using RCU to get reference
> to associated VF network device but this is not safe. RCU read lock
> needs to be held. Found by running with full lockdep debugging
> enabled.
> 
> Fixes: f207c10d9823 ("hv_netvsc: use RCU to protect vf_netdev")
> Signed-off-by: Stephen Hemminger 
> ---
> v2 - fix commit message

Applied and queued up for -stable.

Re: TCP using IPv4-mapped IPv6 address as source

2017-01-11 Thread Jonathan T. Leighton




On 1/11/17 3:43 PM, Eric Dumazet wrote:
> On Wed, 2017-01-11 at 14:59 -0500, Sowmini Varadhan wrote:
>
> > I think the RFC states somewhere that you should never ever
> > send out a v4 mapped address on the wire.
>
> Can you point the exact RFC ?
>
> https://tools.ietf.org/html/rfc2765  seems to allow just that.

Link was in my original post. See table 20:

https://tools.ietf.org/html/rfc6890#page-14

> Jonathan issue is about terminating such flows in TCP stack, which is
> likely not needed/useful.

I'm sure I understand what you're saying here. There should be no flow
to terminate.

[PATCH] [net] net/mlx5e: fix another -Wmaybe-uninitialized warning

2017-01-11 Thread Arnd Bergmann

As found by Olof's build bot, today's mainline kernel gained a harmless
warning about a potential uninitialized variable reference:

drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function 'parse_tc_fdb_actions':
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:769:13: warning: 'out_dev' may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:811:21: note: 'out_dev' was declared here

This was introduced through the addition of an 'IS_ERR/PTR_ERR' pair that
gcc is unfortunately unable to completely figure out. Replacing it with
PTR_ERR_OR_ZERO makes the code more understandable to gcc so it no longer
warns.

Hadar Hen Zion already attempted to fix the warning earlier by adding
fake initializations, but that ended up just making the code worse without
fully addressing all warnings, so I'm reverting it now that it is no longer
needed.

In order to avoid pulling a variable declaration into the #ifdef, I'm
removing it in favor of a more readable 'if()' statement here that
has the same effect.

Link: http://arm-soc.lixom.net/buildlogs/mainline/v4.10-rc3-98-gcff3b2c/
Fixes: a42485eb0ee4 ("net/mlx5e: TC ipv4 tunnel encap offload error flow fixes")
Fixes: a757d108dc1a ("net/mlx5e: Fix kbuild warnings for uninitialized parameters")
Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 118cea5..07d83835 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -666,14 +666,15 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv,
struct rtable *rt;
struct neighbour *n = NULL;
int ttl;
+   int ret;
+
+   if (!IS_ENABLED(CONFIG_INET))
+   return -EOPNOTSUPP;
 
-#if IS_ENABLED(CONFIG_INET)
rt = ip_route_output_key(dev_net(mirred_dev), fl4);
-   if (IS_ERR(rt))
-   return PTR_ERR(rt);
-#else
-   return -EOPNOTSUPP;
-#endif
+   ret = PTR_ERR_OR_ZERO(rt);
+   if (ret)
+   return ret;
 
if (!switchdev_port_same_parent_id(priv->netdev, rt->dst.dev)) {
pr_warn("%s: can't offload, devices not on same HW e-switch\n", 
__func__);
@@ -741,8 +742,8 @@ static int mlx5e_create_encap_header_ipv4(struct mlx5e_priv *priv,
struct flowi4 fl4 = {};
char *encap_header;
int encap_size;
-   __be32 saddr = 0;
-   int ttl = 0;
+   __be32 saddr;
+   int ttl;
int err;
 
encap_header = kzalloc(max_encap_size, GFP_KERNEL);
-- 
2.9.0
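
For readers unfamiliar with the helpers named above, here is a userspace imitation of the error-pointer convention. The lower-case functions are re-implementations written for this note, not the kernel macros themselves, and they assume the usual convention that the top 4095 pointer values encode negative errno codes.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* assumed size of the reserved errno range */

/* Encode a negative errno value as a pointer. */
static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True when the pointer lies in the reserved top-of-address range. */
static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* One expression replacing the "if (is_err(p)) return ptr_err(p);"
 * pair: the error code for error pointers, 0 for valid ones.
 */
static long ptr_err_or_zero(const void *p)
{
	return is_err(p) ? ptr_err(p) : 0;
}
```

Both forms compute the same result; the single-call form just gives the compiler one value to reason about, which is the effect the commit message attributes to PTR_ERR_OR_ZERO.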

Re: [PATCH net-next] net: thunderx: Fix error return code in nicvf_open()

2017-01-11 Thread David Miller

From: Wei Yongjun 
Date: Wed, 11 Jan 2017 16:32:51 +

> From: Wei Yongjun 
> 
> Fix to return a negative error code from the error handling
> case instead of 0, as done elsewhere in this function.
> 
> Fixes: 712c31853440 ("net: thunderx: Program LMAC credits based on MTU")
> Signed-off-by: Wei Yongjun 

Applied.

Re: [PATCH v2 2/2] stmmac: rename it to synopsys

2017-01-11 Thread Florian Fainelli

On 01/10/2017 06:52 AM, Joao Pinto wrote:
> This patch renames stmicro/stmmac to synopsys/ since it is a standard
> ethernet software package regarding synopsys ethernet controllers, supporting
> the majority of Synopsys Ethernet IPs. The config IDs remain the same, for
> retro-compatibility, only the description was changed.

Do we really have to do this? ST Micro were the first to upstream
support for a Synopsys IP, and it was later on identified as being
"stmicro" instead of "synopsys" (during the big driver move under
drivers/net/ethernet) whichever came first in the driver essentially "wins".

As mentioned before, although git is able to track renames, git log does
not automatically have --follow, so it can be hard for people to track
down the (new) history of the driver.

Personally, I don't see much value in doing this rename, especially when
all the driver internal structures are still going to be named with
stmmac (and please don't even think about doing a s/stmmac/snps/ inside
the driver ;)).

My 2 cents.
-- 
Florian

Re: [PATCH net-next v2] net: thunderx: Make hfunc variable const type in nicvf_set_rxfh()

2017-01-11 Thread David Miller

From: Robert Richter 
Date: Wed, 11 Jan 2017 18:04:32 +0100

> From struct ethtool_ops:
> 
> int (*set_rxfh)(struct net_device *, const u32 *indir,
> const u8 *key, const u8 hfunc);
> 
> Change function arg of hfunc to const type.
> 
> V2: Fixed indentation.
> 
> Signed-off-by: Robert Richter 

Applied.

Re: [PATCH net-next] sfc: efx_get_phys_port_id() can be static

2017-01-11 Thread David Miller

From: Wei Yongjun 
Date: Wed, 11 Jan 2017 16:16:12 +

> From: Wei Yongjun 
> 
> Fixes the following sparse warning:
> 
> drivers/net/ethernet/sfc/efx.c:2337:5: warning:
>  symbol 'efx_get_phys_port_id' was not declared. Should it be static?
> 
> Signed-off-by: Wei Yongjun 

Applied.

Re: [PATCH net] r8152: fix the sw rx checksum is unavailable

2017-01-11 Thread David Miller

From: Hayes Wang 
Date: Wed, 11 Jan 2017 16:25:34 +0800

> Fix the hw rx checksum is always enabled, and the user couldn't switch
> it to sw rx checksum.
> 
> Note that the RTL_VER_01 only support sw rx checksum only. Besides,
> the hw rx checksum for RTL_VER_02 is disabled after
> commit b9a321b48af4 ("r8152: Fix broken RX checksums."). Re-enable it.
> 
> Signed-off-by: Hayes Wang 

Applied and queued up for -stable, thanks.

Re: [PATCH v2] vxlan: Set ports in flow key when doing route lookups

2017-01-11 Thread David Miller

From: Martynas Pumputis 
Date: Wed, 11 Jan 2017 15:18:53 +

> Otherwise, a xfrm policy with sport/dport being set cannot be matched.
> 
> Signed-off-by: Martynas Pumputis 
> ---
> Changes in v2:
> - Set the source port in the flow key.

Applied, thanks.

[PATCH net-next 2/3] net: mdio-gpio: Convert to use gpiod functions where possible

2017-01-11 Thread Florian Fainelli

From: Guenter Roeck 

Using gpiod functions lets us use functionality which is not available
with gpio functions.

There is no gpiod function to match devm_gpio_request_one, so leave it
in place and use gpio_to_desc() to convert absolute pin numbers to gpio
descriptors.

Signed-off-by: Guenter Roeck 
Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/mdio-gpio.c | 43 +--
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/drivers/net/phy/mdio-gpio.c b/drivers/net/phy/mdio-gpio.c
index f6e773256c82..e62fcea2d945 100644
--- a/drivers/net/phy/mdio-gpio.c
+++ b/drivers/net/phy/mdio-gpio.c
@@ -32,7 +32,7 @@
 
 struct mdio_gpio_info {
struct mdiobb_ctrl ctrl;
-   int mdc, mdio, mdo;
+   struct gpio_desc *mdc, *mdio, *mdo;
int mdc_active_low, mdio_active_low, mdo_active_low;
 };
 
@@ -80,16 +80,15 @@ static void mdio_dir(struct mdiobb_ctrl *ctrl, int dir)
 * assume the pin serves as pull-up. If direction is
 * output, the default value is high.
 */
-   gpio_set_value_cansleep(bitbang->mdo,
-   1 ^ bitbang->mdo_active_low);
+   gpiod_set_value(bitbang->mdo, 1 ^ bitbang->mdo_active_low);
return;
}
 
if (dir)
-   gpio_direction_output(bitbang->mdio,
- 1 ^ bitbang->mdio_active_low);
+   gpiod_direction_output(bitbang->mdio,
+  1 ^ bitbang->mdio_active_low);
else
-   gpio_direction_input(bitbang->mdio);
+   gpiod_direction_input(bitbang->mdio);
 }
 
 static int mdio_get(struct mdiobb_ctrl *ctrl)
@@ -97,8 +96,7 @@ static int mdio_get(struct mdiobb_ctrl *ctrl)
struct mdio_gpio_info *bitbang =
container_of(ctrl, struct mdio_gpio_info, ctrl);
 
-   return gpio_get_value_cansleep(bitbang->mdio) ^
-   bitbang->mdio_active_low;
+   return gpiod_get_value(bitbang->mdio) ^ bitbang->mdio_active_low;
 }
 
 static void mdio_set(struct mdiobb_ctrl *ctrl, int what)
@@ -107,11 +105,9 @@ static void mdio_set(struct mdiobb_ctrl *ctrl, int what)
container_of(ctrl, struct mdio_gpio_info, ctrl);
 
if (bitbang->mdo)
-   gpio_set_value_cansleep(bitbang->mdo,
-   what ^ bitbang->mdo_active_low);
+   gpiod_set_value(bitbang->mdo, what ^ bitbang->mdo_active_low);
else
-   gpio_set_value_cansleep(bitbang->mdio,
-   what ^ bitbang->mdio_active_low);
+   gpiod_set_value(bitbang->mdio, what ^ bitbang->mdio_active_low);
 }
 
 static void mdc_set(struct mdiobb_ctrl *ctrl, int what)
@@ -119,7 +115,7 @@ static void mdc_set(struct mdiobb_ctrl *ctrl, int what)
struct mdio_gpio_info *bitbang =
container_of(ctrl, struct mdio_gpio_info, ctrl);
 
-   gpio_set_value_cansleep(bitbang->mdc, what ^ bitbang->mdc_active_low);
+   gpiod_set_value(bitbang->mdc, what ^ bitbang->mdc_active_low);
 }
 
 static struct mdiobb_ops mdio_gpio_ops = {
@@ -137,6 +133,7 @@ static struct mii_bus *mdio_gpio_bus_init(struct device *dev,
struct mii_bus *new_bus;
struct mdio_gpio_info *bitbang;
int i;
+   int mdc, mdio, mdo;
unsigned long mdc_flags = GPIOF_OUT_INIT_LOW;
unsigned long mdio_flags = GPIOF_DIR_IN;
unsigned long mdo_flags = GPIOF_OUT_INIT_HIGH;
@@ -147,11 +144,15 @@ static struct mii_bus *mdio_gpio_bus_init(struct device 
*dev,
 
bitbang->ctrl.ops = &mdio_gpio_ops;
bitbang->ctrl.reset = pdata->reset;
-   bitbang->mdc = pdata->mdc;
+   mdc = pdata->mdc;
+   bitbang->mdc = gpio_to_desc(mdc);
bitbang->mdc_active_low = pdata->mdc_active_low;
-   bitbang->mdio = pdata->mdio;
+   mdio = pdata->mdio;
+   bitbang->mdio = gpio_to_desc(mdio);
bitbang->mdio_active_low = pdata->mdio_active_low;
-   bitbang->mdo = pdata->mdo;
+   mdo = pdata->mdo;
+   if (mdo)
+   bitbang->mdo = gpio_to_desc(mdo);
bitbang->mdo_active_low = pdata->mdo_active_low;
 
new_bus = alloc_mdio_bitbang(&bitbang->ctrl);
@@ -177,16 +178,14 @@ static struct mii_bus *mdio_gpio_bus_init(struct device 
*dev,
else
strncpy(new_bus->id, "gpio", MII_BUS_ID_SIZE);
 
-   if (devm_gpio_request_one(dev, bitbang->mdc, mdc_flags, "mdc"))
+   if (devm_gpio_request_one(dev, mdc, mdc_flags, "mdc"))
goto out_free_bus;
 
-   if (devm_gpio_request_one(dev, bitbang->mdio, mdio_flags, "mdio"))
+   if (devm_gpio_request_one(dev, mdio, mdio_flags, "mdio"))
goto out_free_bus;
 
-   if (bitbang->mdo) {
-   if (devm_gpio_request_one(dev, bitbang->mdo, mdo_flags, "mdo"))
-

[PATCH net-next 0/3] net: mdio-gpio: Use modern GPIO helpers

2017-01-11 Thread Florian Fainelli

Hi David,

This patch series modernizes the mdio-gpio driver and switches it to the
descriptor-based gpiod API for manipulating GPIO lines, which allows
some simplifications in the driver.

Thanks!

Guenter Roeck (3):
  net: mdio-gpio: Use devm_gpio_request_one instead of devm_gpio_request
  net: mdio-gpio: Convert to use gpiod functions where possible
  net: mdio-gpio: Use gpio subsystem to handle low-active pins

 drivers/net/phy/mdio-gpio.c | 60 ++---
 1 file changed, 30 insertions(+), 30 deletions(-)

-- 
2.9.3

[PATCH net-next 3/3] net: mdio-gpio: Use gpio subsystem to handle low-active pins

2017-01-11 Thread Florian Fainelli

From: Guenter Roeck 

gpiod functions support handling low-active pins, so we can move
that code out of this driver into the gpio subsystem and simplify
the code a bit.
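
As a rough illustration of what moving the polarity handling out of the
driver means, here is a minimal userspace sketch. It is not kernel code:
struct model_gpio, model_set_value() and model_get_value() are hypothetical
names invented for this example. The point is only that callers pass
logical values and the inversion for low-active pins happens in one place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one GPIO line; not the kernel's gpio_desc. */
struct model_gpio {
    bool active_low; /* polarity, recorded once when the line is requested */
    int physical;    /* electrical level on the pin, 0 or 1 */
};

/* Drive the line to a logical value; the inversion lives here, not in the caller. */
void model_set_value(struct model_gpio *g, int logical)
{
    assert(g != NULL);
    g->physical = (logical ? 1 : 0) ^ (g->active_low ? 1 : 0);
}

/* Read the logical value back from the physical level. */
int model_get_value(const struct model_gpio *g)
{
    assert(g != NULL);
    return g->physical ^ (g->active_low ? 1 : 0);
}
```

With this shape the caller no longer carries its own "what ^ active_low"
expression; it just sets 1 or 0 and the line model applies the polarity.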

Signed-off-by: Guenter Roeck 
Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/mdio-gpio.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/net/phy/mdio-gpio.c b/drivers/net/phy/mdio-gpio.c
index e62fcea2d945..7faa79b254ef 100644
--- a/drivers/net/phy/mdio-gpio.c
+++ b/drivers/net/phy/mdio-gpio.c
@@ -33,7 +33,6 @@
 struct mdio_gpio_info {
struct mdiobb_ctrl ctrl;
struct gpio_desc *mdc, *mdio, *mdo;
-   int mdc_active_low, mdio_active_low, mdo_active_low;
 };
 
 static void *mdio_gpio_of_get_data(struct platform_device *pdev)
@@ -80,13 +79,12 @@ static void mdio_dir(struct mdiobb_ctrl *ctrl, int dir)
 * assume the pin serves as pull-up. If direction is
 * output, the default value is high.
 */
-   gpiod_set_value(bitbang->mdo, 1 ^ bitbang->mdo_active_low);
+   gpiod_set_value(bitbang->mdo, 1);
return;
}
 
if (dir)
-   gpiod_direction_output(bitbang->mdio,
-  1 ^ bitbang->mdio_active_low);
+   gpiod_direction_output(bitbang->mdio, 1);
else
gpiod_direction_input(bitbang->mdio);
 }
@@ -96,7 +94,7 @@ static int mdio_get(struct mdiobb_ctrl *ctrl)
struct mdio_gpio_info *bitbang =
container_of(ctrl, struct mdio_gpio_info, ctrl);
 
-   return gpiod_get_value(bitbang->mdio) ^ bitbang->mdio_active_low;
+   return gpiod_get_value(bitbang->mdio);
 }
 
 static void mdio_set(struct mdiobb_ctrl *ctrl, int what)
@@ -105,9 +103,9 @@ static void mdio_set(struct mdiobb_ctrl *ctrl, int what)
container_of(ctrl, struct mdio_gpio_info, ctrl);
 
if (bitbang->mdo)
-   gpiod_set_value(bitbang->mdo, what ^ bitbang->mdo_active_low);
+   gpiod_set_value(bitbang->mdo, what);
else
-   gpiod_set_value(bitbang->mdio, what ^ bitbang->mdio_active_low);
+   gpiod_set_value(bitbang->mdio, what);
 }
 
 static void mdc_set(struct mdiobb_ctrl *ctrl, int what)
@@ -115,7 +113,7 @@ static void mdc_set(struct mdiobb_ctrl *ctrl, int what)
struct mdio_gpio_info *bitbang =
container_of(ctrl, struct mdio_gpio_info, ctrl);
 
-   gpiod_set_value(bitbang->mdc, what ^ bitbang->mdc_active_low);
+   gpiod_set_value(bitbang->mdc, what);
 }
 
 static struct mdiobb_ops mdio_gpio_ops = {
@@ -146,14 +144,18 @@ static struct mii_bus *mdio_gpio_bus_init(struct device 
*dev,
bitbang->ctrl.reset = pdata->reset;
mdc = pdata->mdc;
bitbang->mdc = gpio_to_desc(mdc);
-   bitbang->mdc_active_low = pdata->mdc_active_low;
+   if (pdata->mdc_active_low)
+   mdc_flags = GPIOF_OUT_INIT_HIGH | GPIOF_ACTIVE_LOW;
mdio = pdata->mdio;
bitbang->mdio = gpio_to_desc(mdio);
-   bitbang->mdio_active_low = pdata->mdio_active_low;
+   if (pdata->mdio_active_low)
+   mdio_flags |= GPIOF_ACTIVE_LOW;
mdo = pdata->mdo;
-   if (mdo)
+   if (mdo) {
bitbang->mdo = gpio_to_desc(mdo);
-   bitbang->mdo_active_low = pdata->mdo_active_low;
+   if (pdata->mdo_active_low)
+   mdo_flags = GPIOF_OUT_INIT_LOW | GPIOF_ACTIVE_LOW;
+   }
 
new_bus = alloc_mdio_bitbang(&bitbang->ctrl);
if (!new_bus)
-- 
2.9.3

[PATCH net-next 1/3] net: mdio-gpio: Use devm_gpio_request_one instead of devm_gpio_request

2017-01-11 Thread Florian Fainelli

From: Guenter Roeck 

Using devm_gpio_request_one lets us request gpio pins with initial state
in one go.
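
To show the idea of "request plus initial state in one go" outside the
kernel, here is a small userspace sketch. Everything in it is an assumption
made for illustration: struct model_pin, the MODEL_* flags and
model_request_one() are hypothetical and are not the kernel's GPIOF_*
constants or devm_gpio_request_one() itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags for this model only; not the kernel's GPIOF_* values. */
enum model_flags {
    MODEL_DIR_IN = 1,
    MODEL_OUT_INIT_LOW = 2,
    MODEL_OUT_INIT_HIGH = 3,
};

struct model_pin {
    int requested; /* 1 once the pin has been claimed */
    int is_output; /* 1 for output, 0 for input */
    int value;     /* initial level, only meaningful for outputs */
};

/*
 * Claim the pin and apply direction and initial level in a single step.
 * Returns 0 on success, -1 if the pin is already claimed or flags are unknown.
 */
int model_request_one(struct model_pin *p, enum model_flags flags)
{
    assert(p != NULL);
    if (p->requested)
        return -1;

    switch (flags) {
    case MODEL_DIR_IN:
        p->is_output = 0;
        p->value = 0;
        break;
    case MODEL_OUT_INIT_LOW:
        p->is_output = 1;
        p->value = 0;
        break;
    case MODEL_OUT_INIT_HIGH:
        p->is_output = 1;
        p->value = 1;
        break;
    default:
        return -1;
    }
    p->requested = 1;
    return 0;
}
```

Because direction and level are set as part of the request, the separate
direction calls after each request become unnecessary in this model.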

Signed-off-by: Guenter Roeck 
Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/mdio-gpio.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/phy/mdio-gpio.c b/drivers/net/phy/mdio-gpio.c
index 27ab63064f95..f6e773256c82 100644
--- a/drivers/net/phy/mdio-gpio.c
+++ b/drivers/net/phy/mdio-gpio.c
@@ -137,6 +137,9 @@ static struct mii_bus *mdio_gpio_bus_init(struct device 
*dev,
struct mii_bus *new_bus;
struct mdio_gpio_info *bitbang;
int i;
+   unsigned long mdc_flags = GPIOF_OUT_INIT_LOW;
+   unsigned long mdio_flags = GPIOF_DIR_IN;
+   unsigned long mdo_flags = GPIOF_OUT_INIT_HIGH;
 
bitbang = devm_kzalloc(dev, sizeof(*bitbang), GFP_KERNEL);
if (!bitbang)
@@ -174,21 +177,17 @@ static struct mii_bus *mdio_gpio_bus_init(struct device 
*dev,
else
strncpy(new_bus->id, "gpio", MII_BUS_ID_SIZE);
 
-   if (devm_gpio_request(dev, bitbang->mdc, "mdc"))
+   if (devm_gpio_request_one(dev, bitbang->mdc, mdc_flags, "mdc"))
goto out_free_bus;
 
-   if (devm_gpio_request(dev, bitbang->mdio, "mdio"))
+   if (devm_gpio_request_one(dev, bitbang->mdio, mdio_flags, "mdio"))
goto out_free_bus;
 
if (bitbang->mdo) {
-   if (devm_gpio_request(dev, bitbang->mdo, "mdo"))
+   if (devm_gpio_request_one(dev, bitbang->mdo, mdo_flags, "mdo"))
goto out_free_bus;
-   gpio_direction_output(bitbang->mdo, 1);
-   gpio_direction_input(bitbang->mdio);
}
 
-   gpio_direction_output(bitbang->mdc, 0);
-
dev_set_drvdata(dev, new_bus);
 
return new_bus;
-- 
2.9.3

Re: TCP using IPv4-mapped IPv6 address as source

2017-01-11 Thread Sowmini Varadhan

On (01/11/17 12:43), Eric Dumazet wrote:
> 
> On Wed, 2017-01-11 at 14:59 -0500, Sowmini Varadhan wrote:
> 
> > I think the RFC states somewhere that you should never ever
> > send out a v4 mapped address on the wire.
> 
> Can you point the exact RFC ?
> 
> https://tools.ietf.org/html/rfc2765  seems to allow just that.

I have not read the details of 2765, but from a cursory look,
it talks about "IPv4-translatable addresses", not v4-mapped
addrs, and says,
"The address translation mechanisms for the stateless and the stateful
 translations are defined in [RFC6052]"
It's also not clear to me that 2765 warrants the use of these
as ip6 src, or ip6 dst, or the target(s) of NS/NA.

https://www.rfc-editor.org/rfc/rfc4038.txt refers to security
considerations about sending v4-mapped addrs on the wire.
It looks like these security considerations are discussed in
 https://tools.ietf.org/html/draft-itojun-v6ops-v4mapped-harmful-02

In general, I think BSD and Solaris (and probably most
router implementations, esp the BSD-based ones) will not allow
v4 mapped addresses as src or dst of ip6 packets.
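
For reference, a userspace check for "is this a v4-mapped address" could
look like the sketch below. It uses the standard POSIX inet_pton() and
IN6_IS_ADDR_V4MAPPED(); the function name is_v4_mapped() is made up for
this example, and it says nothing about how any particular stack filters
such addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Returns 1 if text is an IPv4-mapped IPv6 address (::ffff:a.b.c.d),
 * 0 if it is some other valid IPv6 address, -1 if it does not parse.
 */
int is_v4_mapped(const char *text)
{
    struct in6_addr addr;

    assert(text != NULL);
    if (inet_pton(AF_INET6, text, &addr) != 1)
        return -1;
    return IN6_IS_ADDR_V4MAPPED(&addr) ? 1 : 0;
}
```

A stack that refuses v4-mapped src/dst addresses would apply a test of this
kind to the addresses in the ip6 header before accepting or sending a packet.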

> Jonathan issue is about terminating such flows in TCP stack, which is
> likely not needed/useful.

sure. but if you configure the v4 mapped address as
a src addr "everything should be fine!"

--Sowmini

Re: TCP using IPv4-mapped IPv6 address as source

2017-01-11 Thread Eric Dumazet

On Wed, 2017-01-11 at 14:59 -0500, Sowmini Varadhan wrote:

> I think the RFC states somewhere that you should never ever
> send out a v4 mapped address on the wire.

Can you point to the exact RFC?

https://tools.ietf.org/html/rfc2765  seems to allow just that.

Jonathan's issue is about terminating such flows in the TCP stack, which is
likely not needed/useful.

Re: [PATCH] wext: handle NULL exta data in iwe_stream_add_point better

2017-01-11 Thread Arnd Bergmann

On Wednesday, January 11, 2017 4:06:17 PM CET Johannes Berg wrote:
> 
> Applied. Also fixed the typo in the subject :)

Thanks! Unfortunately I now got another warning for the same function,
and though I would have expected the patch to fix it, that did not work:

In file included from 
/git/arm-soc/drivers/net/wireless/intersil/prism54/islpci_dev.h:27:0,
 from 
/git/arm-soc/drivers/net/wireless/intersil/prism54/isl_ioctl.h:24,
 from 
/git/arm-soc/drivers/net/wireless/intersil/prism54/isl_ioctl.c:32:
/git/arm-soc/drivers/net/wireless/intersil/prism54/isl_ioctl.c: In function 
'prism54_get_scan':
/git/arm-soc/include/net/iw_handler.h:560:4: error: argument 2 null where 
non-null expected [-Werror=nonnull]
memcpy(stream + point_len, extra, iwe->u.data.length);

The change below kills that warning too, but it gets even uglier there:

diff --git a/include/net/iw_handler.h b/include/net/iw_handler.h
index 1a41043688bc..c2aa73e5e6bb 100644
--- a/include/net/iw_handler.h
+++ b/include/net/iw_handler.h
@@ -556,7 +556,7 @@ iwe_stream_add_point(struct iw_request_info *info, char 
*stream, char *ends,
memcpy(stream + lcp_len,
   ((char *) &iwe->u) + IW_EV_POINT_OFF,
   IW_EV_POINT_PK_LEN - IW_EV_LCP_PK_LEN);
-   if (iwe->u.data.length)
+   if (iwe->u.data.length && extra)
memcpy(stream + point_len, extra, iwe->u.data.length);
stream += event_len;
}

Let me know if you want a proper follow-up patch, or if you can amend your
commit, or you have a better idea for resolving that warning.
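
For illustration, the guard in the change above boils down to the pattern
below, shown as a standalone userspace sketch. copy_extra() is a
hypothetical helper, not something in iw_handler.h; it only demonstrates
skipping the memcpy when the source pointer is NULL even though a nonzero
length was passed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes of extra into dst, tolerating extra == NULL.
 * Returns the number of bytes actually copied.
 */
size_t copy_extra(char *dst, const char *extra, size_t len)
{
    assert(dst != NULL);
    if (len == 0 || extra == NULL)
        return 0; /* nothing to copy; never hand NULL to memcpy */
    memcpy(dst, extra, len);
    return len;
}
```

Checking the pointer as well as the length is what keeps the compiler's
nonnull analysis quiet when a caller passes a NULL payload.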

Arnd

[PATCH 5/6 net-next] inet: split inet_csk_get_port into two functions

2017-01-11 Thread Josef Bacik

inet_csk_get_port does two different things: it either scans for an open port,
or it checks whether the specified port is available for use.  Since these two
operations have different rules and are basically independent, let's split them
into two different functions to make them both more readable.
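
The shape of the split can be sketched in userspace as follows. This is a
toy model under stated assumptions: struct model_table, the tiny port range
and both function names are hypothetical, and there is no locking, reuse
logic or hashing as in the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PORT_LOW  4 /* hypothetical ephemeral range, tiny on purpose */
#define MODEL_PORT_HIGH 7
#define MODEL_PORT_MAX  8

struct model_table {
    bool in_use[MODEL_PORT_MAX];
};

/* Scan path: return any free port in the ephemeral range, or 0 if none. */
int model_find_open_port(const struct model_table *t)
{
    assert(t != NULL);
    for (int port = MODEL_PORT_LOW; port <= MODEL_PORT_HIGH; port++)
        if (!t->in_use[port])
            return port;
    return 0;
}

/*
 * Entry point: snum == 0 means "pick any port", otherwise only that
 * one port is checked. Returns the bound port, or 0 on failure.
 */
int model_get_port(struct model_table *t, int snum)
{
    int port;

    assert(t != NULL);
    assert(snum >= 0 && snum < MODEL_PORT_MAX);

    if (snum == 0) {
        port = model_find_open_port(t);
        if (port == 0)
            return 0;
    } else {
        if (t->in_use[snum])
            return 0;
        port = snum;
    }
    t->in_use[port] = true;
    return port;
}
```

Keeping the scan in its own helper leaves the entry point with one short
branch per case instead of two interleaved code paths.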

Signed-off-by: Josef Bacik 
---
 net/ipv4/inet_connection_sock.c | 66 +++--
 1 file changed, 44 insertions(+), 22 deletions(-)

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index f7e844d..bbe2892 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -156,33 +156,21 @@ static int inet_csk_bind_conflict(const struct sock *sk,
return sk2 != NULL;
 }
 
-/* Obtain a reference to a local port for the given sock,
- * if snum is zero it means select any available local port.
- * We try to allocate an odd port (and leave even ports for connect())
+/*
+ * Find an open port number for the socket.  Returns with the
+ * inet_bind_hashbucket lock held.
  */
-int inet_csk_get_port(struct sock *sk, unsigned short snum)
+static struct inet_bind_hashbucket *
+inet_csk_find_open_port(struct sock *sk, struct inet_bind_bucket **tb_ret, int 
*port_ret)
 {
-   bool reuse = sk->sk_reuse && sk->sk_state != TCP_LISTEN;
struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo;
-   int ret = 1, port = snum;
+   int port = 0;
struct inet_bind_hashbucket *head;
struct net *net = sock_net(sk);
int i, low, high, attempt_half;
struct inet_bind_bucket *tb;
-   kuid_t uid = sock_i_uid(sk);
u32 remaining, offset;
-   bool reuseport_ok = !!snum;
 
-   if (port) {
-   head = &hinfo->bhash[inet_bhashfn(net, port,
- hinfo->bhash_size)];
-   spin_lock_bh(&head->lock);
-   inet_bind_bucket_for_each(tb, &head->chain)
-   if (net_eq(ib_net(tb), net) && tb->port == port)
-   goto tb_found;
-
-   goto tb_not_found;
-   }
attempt_half = (sk->sk_reuse == SK_CAN_REUSE) ? 1 : 0;
 other_half_scan:
inet_get_local_port_range(net, &low, &high);
@@ -219,11 +207,12 @@ int inet_csk_get_port(struct sock *sk, unsigned short 
snum)
spin_lock_bh(&head->lock);
inet_bind_bucket_for_each(tb, &head->chain)
if (net_eq(ib_net(tb), net) && tb->port == port) {
-   if (!inet_csk_bind_conflict(sk, tb, false, 
reuseport_ok))
+   if (!inet_csk_bind_conflict(sk, tb, false, 
false))
goto success;
goto next_port;
}
-   goto tb_not_found;
+   tb = NULL;
+   goto success;
 next_port:
spin_unlock_bh(&head->lock);
cond_resched();
@@ -238,8 +227,41 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
attempt_half = 2;
goto other_half_scan;
}
-   return ret;
+   return NULL;
+success:
+   *port_ret = port;
+   *tb_ret = tb;
+   return head;
+}
+
+/* Obtain a reference to a local port for the given sock,
+ * if snum is zero it means select any available local port.
+ * We try to allocate an odd port (and leave even ports for connect())
+ */
+int inet_csk_get_port(struct sock *sk, unsigned short snum)
+{
+   bool reuse = sk->sk_reuse && sk->sk_state != TCP_LISTEN;
+   struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo;
+   int ret = 1, port = snum;
+   struct inet_bind_hashbucket *head;
+   struct net *net = sock_net(sk);
+   struct inet_bind_bucket *tb = NULL;
+   kuid_t uid = sock_i_uid(sk);
 
+   if (!port) {
+   head = inet_csk_find_open_port(sk, &tb, &port);
+   if (!head)
+   return ret;
+   if (!tb)
+   goto tb_not_found;
+   goto success;
+   }
+   head = &hinfo->bhash[inet_bhashfn(net, port,
+ hinfo->bhash_size)];
+   spin_lock_bh(&head->lock);
+   inet_bind_bucket_for_each(tb, &head->chain)
+   if (net_eq(ib_net(tb), net) && tb->port == port)
+   goto tb_found;
 tb_not_found:
tb = inet_bind_bucket_create(hinfo->bind_bucket_cachep,
 net, head, port);
@@ -255,7 +277,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
  !rcu_access_pointer(sk->sk_reuseport_cb) &&
  sk->sk_reuseport && uid_eq(tb->fastuid, uid)))
goto success;
-   if (inet_csk_bind_conflict(sk, tb, true, reuseport_ok))
+   if (inet_csk_bind_conflict(sk, tb, true, true))
goto fail_unlock;
}
 success:
-- 
2.5.5

[PATCH 3/6 net-next] inet: kill smallest_size and smallest_port

2017-01-11 Thread Josef Bacik

In inet_csk_get_port we seem to be using smallest_port to remember the best
port to retry for a SO_REUSEPORT sk, i.e. the one whose existing set of
SO_REUSEPORT sockets has the fewest owners.  However if we get to the logic

if (smallest_size != -1) {
port = smallest_port;
goto have_port;
}

we will do a useless search, because we would have already done the
inet_csk_bind_conflict for that port and it would have returned 1; otherwise we
would have gone to tb_found and succeeded.  Since this logic makes us do yet
another trip through inet_csk_bind_conflict for a port we know won't work, just
delete this code and save us the time.

Signed-off-by: Josef Bacik 
---
 include/net/inet_hashtables.h   |  1 -
 net/ipv4/inet_connection_sock.c | 26 --
 net/ipv4/inet_hashtables.c  |  3 ---
 3 files changed, 4 insertions(+), 26 deletions(-)

diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 756ed16..3fc0366 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -80,7 +80,6 @@ struct inet_bind_bucket {
signed char fastreuse;
signed char fastreuseport;
kuid_t  fastuid;
-   int num_owners;
struct hlist_node   node;
struct hlist_head   owners;
 };
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index a1c9055..d352366 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -165,7 +165,6 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
bool reuse = sk->sk_reuse && sk->sk_state != TCP_LISTEN;
struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo;
int ret = 1, attempts = 5, port = snum;
-   int smallest_size = -1, smallest_port;
struct inet_bind_hashbucket *head;
struct net *net = sock_net(sk);
int i, low, high, attempt_half;
@@ -175,7 +174,6 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
bool reuseport_ok = !!snum;
 
if (port) {
-have_port:
head = &hinfo->bhash[inet_bhashfn(net, port,
  hinfo->bhash_size)];
spin_lock_bh(&head->lock);
@@ -209,8 +207,6 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
 * We do the opposite to not pollute connect() users.
 */
offset |= 1U;
-   smallest_size = -1;
-   smallest_port = low; /* avoid compiler warning */
 
 other_parity_scan:
port = low + offset;
@@ -224,15 +220,6 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
spin_lock_bh(&head->lock);
inet_bind_bucket_for_each(tb, &head->chain)
if (net_eq(ib_net(tb), net) && tb->port == port) {
-   if (((tb->fastreuse > 0 && reuse) ||
-(tb->fastreuseport > 0 &&
- sk->sk_reuseport &&
- !rcu_access_pointer(sk->sk_reuseport_cb) 
&&
- uid_eq(tb->fastuid, uid))) &&
-   (tb->num_owners < smallest_size || 
smallest_size == -1)) {
-   smallest_size = tb->num_owners;
-   smallest_port = port;
-   }
if (!inet_csk_bind_conflict(sk, tb, false, 
reuseport_ok))
goto tb_found;
goto next_port;
@@ -243,10 +230,6 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
cond_resched();
}
 
-   if (smallest_size != -1) {
-   port = smallest_port;
-   goto have_port;
-   }
offset--;
if (!(offset & 1))
goto other_parity_scan;
@@ -268,19 +251,18 @@ int inet_csk_get_port(struct sock *sk, unsigned short 
snum)
if (sk->sk_reuse == SK_FORCE_REUSE)
goto success;
 
-   if (((tb->fastreuse > 0 && reuse) ||
+   if ((tb->fastreuse > 0 && reuse) ||
 (tb->fastreuseport > 0 &&
  !rcu_access_pointer(sk->sk_reuseport_cb) &&
- sk->sk_reuseport && uid_eq(tb->fastuid, uid))) &&
-   smallest_size == -1)
+ sk->sk_reuseport && uid_eq(tb->fastuid, uid)))
goto success;
if (inet_csk_bind_conflict(sk, tb, true, reuseport_ok)) {
if ((reuse ||
 (tb->fastreuseport > 0 &&
  sk->sk_reuseport &&
  !rcu_access_pointer(sk->sk_reuseport_cb) &&
- uid_eq(tb->fastuid, uid))) &&
-   !snum && smallest_size != -1 &&

[PATCH 2/6 net-next] inet: drop ->bind_conflict

2017-01-11 Thread Josef Bacik

The only difference between inet6_csk_bind_conflict and inet_csk_bind_conflict
is how they check the rcv_saddr, so delete this callback and simply
change inet_csk_bind_conflict to call inet_rcv_saddr_equal.

Signed-off-by: Josef Bacik 
---
 include/net/inet6_connection_sock.h |  5 -
 include/net/inet_connection_sock.h  |  6 --
 net/dccp/ipv4.c |  2 +-
 net/dccp/ipv6.c |  2 --
 net/ipv4/inet_connection_sock.c | 22 +++-
 net/ipv4/tcp_ipv4.c |  2 +-
 net/ipv6/inet6_connection_sock.c| 40 -
 net/ipv6/tcp_ipv6.c |  2 --
 8 files changed, 9 insertions(+), 72 deletions(-)

diff --git a/include/net/inet6_connection_sock.h 
b/include/net/inet6_connection_sock.h
index 3212b39..8ec87b6 100644
--- a/include/net/inet6_connection_sock.h
+++ b/include/net/inet6_connection_sock.h
@@ -15,16 +15,11 @@
 
 #include 
 
-struct inet_bind_bucket;
 struct request_sock;
 struct sk_buff;
 struct sock;
 struct sockaddr;
 
-int inet6_csk_bind_conflict(const struct sock *sk,
-   const struct inet_bind_bucket *tb, bool relax,
-   bool soreuseport_ok);
-
 struct dst_entry *inet6_csk_route_req(const struct sock *sk, struct flowi6 
*fl6,
  const struct request_sock *req, u8 proto);
 
diff --git a/include/net/inet_connection_sock.h 
b/include/net/inet_connection_sock.h
index 85ee387..add75c7 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -62,9 +62,6 @@ struct inet_connection_sock_af_ops {
char __user *optval, int __user *optlen);
 #endif
void(*addr2sockaddr)(struct sock *sk, struct sockaddr *);
-   int (*bind_conflict)(const struct sock *sk,
-const struct inet_bind_bucket *tb,
-bool relax, bool soreuseport_ok);
void(*mtu_reduced)(struct sock *sk);
 };
 
@@ -261,9 +258,6 @@ inet_csk_rto_backoff(const struct inet_connection_sock 
*icsk,
 
 struct sock *inet_csk_accept(struct sock *sk, int flags, int *err);
 
-int inet_csk_bind_conflict(const struct sock *sk,
-  const struct inet_bind_bucket *tb, bool relax,
-  bool soreuseport_ok);
 int inet_csk_get_port(struct sock *sk, unsigned short snum);
 
 struct dst_entry *inet_csk_route_req(const struct sock *sk, struct flowi4 *fl4,
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index d859a5c..ed6f99b 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -904,7 +905,6 @@ static const struct inet_connection_sock_af_ops 
dccp_ipv4_af_ops = {
.getsockopt= ip_getsockopt,
.addr2sockaddr = inet_csk_addr2sockaddr,
.sockaddr_len  = sizeof(struct sockaddr_in),
-   .bind_conflict = inet_csk_bind_conflict,
 #ifdef CONFIG_COMPAT
.compat_setsockopt = compat_ip_setsockopt,
.compat_getsockopt = compat_ip_getsockopt,
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index adfc790..08bcdc3 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -937,7 +937,6 @@ static const struct inet_connection_sock_af_ops 
dccp_ipv6_af_ops = {
.getsockopt= ipv6_getsockopt,
.addr2sockaddr = inet6_csk_addr2sockaddr,
.sockaddr_len  = sizeof(struct sockaddr_in6),
-   .bind_conflict = inet6_csk_bind_conflict,
 #ifdef CONFIG_COMPAT
.compat_setsockopt = compat_ipv6_setsockopt,
.compat_getsockopt = compat_ipv6_getsockopt,
@@ -958,7 +957,6 @@ static const struct inet_connection_sock_af_ops 
dccp_ipv6_mapped = {
.getsockopt= ipv6_getsockopt,
.addr2sockaddr = inet6_csk_addr2sockaddr,
.sockaddr_len  = sizeof(struct sockaddr_in6),
-   .bind_conflict = inet6_csk_bind_conflict,
 #ifdef CONFIG_COMPAT
.compat_setsockopt = compat_ipv6_setsockopt,
.compat_getsockopt = compat_ipv6_getsockopt,
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index ba597cb..a1c9055 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -116,9 +116,9 @@ void inet_get_local_port_range(struct net *net, int *low, 
int *high)
 }
 EXPORT_SYMBOL(inet_get_local_port_range);
 
-int inet_csk_bind_conflict(const struct sock *sk,
-  const struct inet_bind_bucket *tb, bool relax,
-  bool reuseport_ok)
+static int inet_csk_bind_conflict(const struct sock *sk,
+ const struct inet_bind_bucket *tb,
+ bool relax, bool reuseport_ok)
 {
struct sock *sk2;
bool reuse = sk->sk_reuse;
@@ -134,7 +134,6 @@ int inet_csk_bind_conflict(const struct sock *sk,

[PATCH 6/6 net-next] inet: reset tb->fastreuseport when adding a reuseport sk

2017-01-11 Thread Josef Bacik

If we have non-reuseport sockets on a tb we will set tb->fastreuseport to 0 and
never set it again, which means that in the future, if we end up adding a bunch
of reuseport sk's to that tb, we'll have to do the expensive scan every time.
Instead add the ipv4/ipv6 saddr fields to the bind bucket, as well as the family
so we know what comparison to make, and the ipv6 only setting so we can make
sure to compare with new sockets appropriately.  Once one sk has made it onto
the list we know that there are no potential bind conflicts on the owners list
that match that sk's rcv_saddr.  So copy the sk's information into our bind
bucket and set tb->fastreuseport to FASTREUSEPORT_STRICT so we know we have to do
an extra check for subsequent reuseport sockets and can skip the expensive bind
conflict check.
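
A much simplified userspace sketch of the caching idea described above is
given here. It is an assumption-laden model, not the patch: the struct and
function names are hypothetical, only an IPv4 address is cached, addresses
are compared for exact equality, and uid, family and ipv6-only handling are
left out. The state values mirror the 1/2 values the patch defines for
FASTREUSEPORT_ANY and FASTREUSEPORT_STRICT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    MODEL_FAST_NONE = 0,   /* fast path disabled, full scan required */
    MODEL_FAST_ANY = 1,    /* every owner is reuseport, any address is fine */
    MODEL_FAST_STRICT = 2, /* only the cached address may skip the scan */
};

struct model_bucket {
    int fastreuseport;
    uint32_t fast_rcv_saddr; /* address cached from the last vetted socket */
};

struct model_sock {
    bool reuseport;
    uint32_t rcv_saddr;
};

/* Can sk skip the expensive conflict scan on this bucket? */
bool model_fast_match(const struct model_bucket *tb, const struct model_sock *sk)
{
    assert(tb != NULL && sk != NULL);
    if (tb->fastreuseport == MODEL_FAST_NONE || !sk->reuseport)
        return false;
    if (tb->fastreuseport == MODEL_FAST_ANY)
        return true;
    return tb->fast_rcv_saddr == sk->rcv_saddr;
}

/* Called once sk has passed the full scan on a bucket with the fast path off. */
void model_reset_fast(struct model_bucket *tb, const struct model_sock *sk)
{
    assert(tb != NULL && sk != NULL);
    if (!sk->reuseport)
        return;
    tb->fastreuseport = MODEL_FAST_STRICT;
    tb->fast_rcv_saddr = sk->rcv_saddr;
}
```

The first reuseport socket pays for the full scan once; later sockets bound
to the same address can then take the cheap comparison instead.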

Signed-off-by: Josef Bacik 
---
 include/net/inet_hashtables.h   |   9 
 net/ipv4/inet_connection_sock.c | 106 
 2 files changed, 95 insertions(+), 20 deletions(-)

diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 3fc0366..1178931 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -74,12 +74,21 @@ struct inet_ehash_bucket {
  * users logged onto your box, isn't it nice to know that new data
  * ports are created in O(1) time?  I thought so. ;-)  -DaveM
  */
+#define FASTREUSEPORT_ANY  1
+#define FASTREUSEPORT_STRICT   2
+
 struct inet_bind_bucket {
possible_net_t  ib_net;
unsigned short  port;
signed char fastreuse;
signed char fastreuseport;
kuid_t  fastuid;
+#if IS_ENABLED(CONFIG_IPV6)
+   struct in6_addr fast_v6_rcv_saddr;
+#endif
+   __be32  fast_rcv_saddr;
+   unsigned short  fast_sk_family;
+   bool    fast_ipv6_only;
struct hlist_node   node;
struct hlist_head   owners;
 };
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index bbe2892..096a085 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -38,20 +38,21 @@ EXPORT_SYMBOL(inet_csk_timer_bug_msg);
  *  IPV6_ADDR_ANY only equals to IPV6_ADDR_ANY,
  *  and 0.0.0.0 equals to 0.0.0.0 only
  */
-static int ipv6_rcv_saddr_equal(const struct sock *sk, const struct sock *sk2,
+static int ipv6_rcv_saddr_equal(const struct in6_addr *sk1_rcv_saddr6,
+   const struct in6_addr *sk2_rcv_saddr6,
+   __be32 sk1_rcv_saddr, __be32 sk2_rcv_saddr,
+   bool sk1_ipv6only, bool sk2_ipv6only,
bool match_wildcard)
 {
-   const struct in6_addr *sk2_rcv_saddr6 = inet6_rcv_saddr(sk2);
-   int sk2_ipv6only = inet_v6_ipv6only(sk2);
-   int addr_type = ipv6_addr_type(&sk->sk_v6_rcv_saddr);
+   int addr_type = ipv6_addr_type(sk1_rcv_saddr6);
int addr_type2 = sk2_rcv_saddr6 ? ipv6_addr_type(sk2_rcv_saddr6) : 
IPV6_ADDR_MAPPED;
 
/* if both are mapped, treat as IPv4 */
if (addr_type == IPV6_ADDR_MAPPED && addr_type2 == IPV6_ADDR_MAPPED) {
if (!sk2_ipv6only) {
-   if (sk->sk_rcv_saddr == sk2->sk_rcv_saddr)
+   if (sk1_rcv_saddr == sk2_rcv_saddr)
return 1;
-   if (!sk->sk_rcv_saddr || !sk2->sk_rcv_saddr)
+   if (!sk1_rcv_saddr || !sk2_rcv_saddr)
return match_wildcard;
}
return 0;
@@ -65,11 +66,11 @@ static int ipv6_rcv_saddr_equal(const struct sock *sk, 
const struct sock *sk2,
return 1;
 
if (addr_type == IPV6_ADDR_ANY && match_wildcard &&
-   !(ipv6_only_sock(sk) && addr_type2 == IPV6_ADDR_MAPPED))
+   !(sk1_ipv6only && addr_type2 == IPV6_ADDR_MAPPED))
return 1;
 
if (sk2_rcv_saddr6 &&
-   ipv6_addr_equal(&sk->sk_v6_rcv_saddr, sk2_rcv_saddr6))
+   ipv6_addr_equal(sk1_rcv_saddr6, sk2_rcv_saddr6))
return 1;
 
return 0;
@@ -80,13 +81,13 @@ static int ipv6_rcv_saddr_equal(const struct sock *sk, 
const struct sock *sk2,
  * match_wildcard == false: addresses must be exactly the same, i.e.
  *  0.0.0.0 only equals to 0.0.0.0
  */
-static int ipv4_rcv_saddr_equal(const struct sock *sk, const struct sock *sk2,
-   bool match_wildcard)
+static int ipv4_rcv_saddr_equal(__be32 sk1_rcv_saddr, __be32 sk2_rcv_saddr,
+   bool sk2_ipv6only, bool match_wildcard)
 {
-   if (!ipv6_only_sock(sk2)) {
-   if (sk->sk_rcv_saddr == sk2->sk_rcv_saddr)
+   if (!sk2_ipv6only) {
+   if (sk1_rcv_saddr == sk2_rcv_saddr)
return 1;
-

[PATCH 4/6 net-next] inet: don't check for bind conflicts twice when searching for a port

2017-01-11 Thread Josef Bacik

This is just wasted time: we've already found a tb that doesn't have a bind
conflict, and we don't drop the head lock, so scanning again isn't going to give
us a different answer.  Instead move the tb->fastreuse/tb->fastreuseport setting
logic outside of the tb_found path and put it in the success: path.  Then make it
so that we don't goto again if we find a bind conflict in the tb_found path, as we
won't reach this anymore when we are scanning for an ephemeral port.

Signed-off-by: Josef Bacik 
---
 net/ipv4/inet_connection_sock.c | 31 +++
 1 file changed, 11 insertions(+), 20 deletions(-)

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index d352366..f7e844d 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -164,7 +164,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
 {
bool reuse = sk->sk_reuse && sk->sk_state != TCP_LISTEN;
struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo;
-   int ret = 1, attempts = 5, port = snum;
+   int ret = 1, port = snum;
struct inet_bind_hashbucket *head;
struct net *net = sock_net(sk);
int i, low, high, attempt_half;
@@ -183,7 +183,6 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
 
goto tb_not_found;
}
-again:
attempt_half = (sk->sk_reuse == SK_CAN_REUSE) ? 1 : 0;
 other_half_scan:
inet_get_local_port_range(net, &low, &high);
@@ -221,7 +220,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
inet_bind_bucket_for_each(tb, &head->chain)
if (net_eq(ib_net(tb), net) && tb->port == port) {
if (!inet_csk_bind_conflict(sk, tb, false, 
reuseport_ok))
-   goto tb_found;
+   goto success;
goto next_port;
}
goto tb_not_found;
@@ -256,23 +255,11 @@ int inet_csk_get_port(struct sock *sk, unsigned short 
snum)
  !rcu_access_pointer(sk->sk_reuseport_cb) &&
  sk->sk_reuseport && uid_eq(tb->fastuid, uid)))
goto success;
-   if (inet_csk_bind_conflict(sk, tb, true, reuseport_ok)) {
-   if ((reuse ||
-(tb->fastreuseport > 0 &&
- sk->sk_reuseport &&
- !rcu_access_pointer(sk->sk_reuseport_cb) &&
- uid_eq(tb->fastuid, uid))) && !snum &&
-   --attempts >= 0) {
-   spin_unlock_bh(&head->lock);
-   goto again;
-   }
+   if (inet_csk_bind_conflict(sk, tb, true, reuseport_ok))
goto fail_unlock;
-   }
-   if (!reuse)
-   tb->fastreuse = 0;
-   if (!sk->sk_reuseport || !uid_eq(tb->fastuid, uid))
-   tb->fastreuseport = 0;
-   } else {
+   }
+success:
+   if (!hlist_empty(&tb->owners)) {
tb->fastreuse = reuse;
if (sk->sk_reuseport) {
tb->fastreuseport = 1;
@@ -280,8 +267,12 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
} else {
tb->fastreuseport = 0;
}
+   } else {
+   if (!reuse)
+   tb->fastreuse = 0;
+   if (!sk->sk_reuseport || !uid_eq(tb->fastuid, uid))
+   tb->fastreuseport = 0;
}
-success:
if (!inet_csk(sk)->icsk_bind_hash)
inet_bind_hash(sk, tb, port);
WARN_ON(inet_csk(sk)->icsk_bind_hash != tb);
-- 
2.5.5

[PATCH 1/6 net-next] inet: collapse ipv4/v6 rcv_saddr_equal functions into one

2017-01-11 Thread Josef Bacik

We pass these per-protocol equal functions around in various places, but
we can just have one function that checks the sk->sk_family and then calls
the right comparison function.  I've also changed the ipv4 version to
not cast to inet_sock since it is unneeded.
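
The "one function that checks the family" idea can be sketched in userspace
like this. It is a simplified model, not the kernel function: struct
model_sock and all model_* names are hypothetical, v4-mapped addresses and
the ipv6-only flag are ignored, and both sockets are assumed to be of the
same family.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

struct model_sock {
    int family;          /* AF_INET or AF_INET6 */
    uint32_t v4_addr;    /* 0 means wildcard */
    uint8_t v6_addr[16]; /* all zero means wildcard */
};

static bool model_v6_is_any(const uint8_t addr[16])
{
    static const uint8_t zero[16];

    return memcmp(addr, zero, sizeof(zero)) == 0;
}

bool model_v4_equal(const struct model_sock *a, const struct model_sock *b,
                    bool match_wildcard)
{
    if (a->v4_addr == b->v4_addr)
        return true;
    if (a->v4_addr == 0 || b->v4_addr == 0)
        return match_wildcard;
    return false;
}

bool model_v6_equal(const struct model_sock *a, const struct model_sock *b,
                    bool match_wildcard)
{
    if (memcmp(a->v6_addr, b->v6_addr, sizeof(a->v6_addr)) == 0)
        return true;
    if (model_v6_is_any(a->v6_addr) || model_v6_is_any(b->v6_addr))
        return match_wildcard;
    return false;
}

/* Single entry point: callers no longer pass a per-family comparison callback. */
bool model_rcv_saddr_equal(const struct model_sock *a, const struct model_sock *b,
                           bool match_wildcard)
{
    assert(a != NULL && b != NULL);
    if (a->family == AF_INET6)
        return model_v6_equal(a, b, match_wildcard);
    return model_v4_equal(a, b, match_wildcard);
}
```

With the dispatch inside the helper, signatures that used to take a
comparison callback can drop that parameter.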

Signed-off-by: Josef Bacik 
---
 include/net/addrconf.h   |  4 +--
 include/net/inet_hashtables.h|  5 +--
 include/net/udp.h|  1 -
 net/ipv4/inet_connection_sock.c  | 72 
 net/ipv4/inet_hashtables.c   | 16 +++--
 net/ipv4/udp.c   | 58 +++-
 net/ipv6/inet6_connection_sock.c |  4 +--
 net/ipv6/inet6_hashtables.c  | 46 +
 net/ipv6/udp.c   |  2 +-
 9 files changed, 95 insertions(+), 113 deletions(-)

diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 8f998af..17c6fd8 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -88,9 +88,7 @@ int __ipv6_get_lladdr(struct inet6_dev *idev, struct in6_addr 
*addr,
  u32 banned_flags);
 int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr,
u32 banned_flags);
-int ipv4_rcv_saddr_equal(const struct sock *sk, const struct sock *sk2,
-bool match_wildcard);
-int ipv6_rcv_saddr_equal(const struct sock *sk, const struct sock *sk2,
+int inet_rcv_saddr_equal(const struct sock *sk, const struct sock *sk2,
 bool match_wildcard);
 void addrconf_join_solict(struct net_device *dev, const struct in6_addr *addr);
 void addrconf_leave_solict(struct inet6_dev *idev, const struct in6_addr 
*addr);
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 0574493..756ed16 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -203,10 +203,7 @@ void inet_hashinfo_init(struct inet_hashinfo *h);
 
 bool inet_ehash_insert(struct sock *sk, struct sock *osk);
 bool inet_ehash_nolisten(struct sock *sk, struct sock *osk);
-int __inet_hash(struct sock *sk, struct sock *osk,
-   int (*saddr_same)(const struct sock *sk1,
- const struct sock *sk2,
- bool match_wildcard));
+int __inet_hash(struct sock *sk, struct sock *osk);
 int inet_hash(struct sock *sk);
 void inet_unhash(struct sock *sk);
 
diff --git a/include/net/udp.h b/include/net/udp.h
index 1661791..c9d8b8e 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -204,7 +204,6 @@ static inline void udp_lib_close(struct sock *sk, long 
timeout)
 }
 
 int udp_lib_get_port(struct sock *sk, unsigned short snum,
-int (*)(const struct sock *, const struct sock *, bool),
 unsigned int hash2_nulladdr);
 
 u32 udp_flow_hashrnd(void);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 19ea045..ba597cb 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -31,6 +31,78 @@ const char inet_csk_timer_bug_msg[] = "inet_csk BUG: unknown 
timer value\n";
 EXPORT_SYMBOL(inet_csk_timer_bug_msg);
 #endif
 
+#if IS_ENABLED(CONFIG_IPV6)
+/* match_wildcard == true:  IPV6_ADDR_ANY equals to any IPv6 addresses if IPv6
+ *  only, and any IPv4 addresses if not IPv6 only
+ * match_wildcard == false: addresses must be exactly the same, i.e.
+ *  IPV6_ADDR_ANY only equals to IPV6_ADDR_ANY,
+ *  and 0.0.0.0 equals to 0.0.0.0 only
+ */
+static int ipv6_rcv_saddr_equal(const struct sock *sk, const struct sock *sk2,
+   bool match_wildcard)
+{
+   const struct in6_addr *sk2_rcv_saddr6 = inet6_rcv_saddr(sk2);
+   int sk2_ipv6only = inet_v6_ipv6only(sk2);
+   int addr_type = ipv6_addr_type(&sk->sk_v6_rcv_saddr);
+   int addr_type2 = sk2_rcv_saddr6 ? ipv6_addr_type(sk2_rcv_saddr6) : 
IPV6_ADDR_MAPPED;
+
+   /* if both are mapped, treat as IPv4 */
+   if (addr_type == IPV6_ADDR_MAPPED && addr_type2 == IPV6_ADDR_MAPPED) {
+   if (!sk2_ipv6only) {
+   if (sk->sk_rcv_saddr == sk2->sk_rcv_saddr)
+   return 1;
+   if (!sk->sk_rcv_saddr || !sk2->sk_rcv_saddr)
+   return match_wildcard;
+   }
+   return 0;
+   }
+
+   if (addr_type == IPV6_ADDR_ANY && addr_type2 == IPV6_ADDR_ANY)
+   return 1;
+
+   if (addr_type2 == IPV6_ADDR_ANY && match_wildcard &&
+   !(sk2_ipv6only && addr_type == IPV6_ADDR_MAPPED))
+   return 1;
+
+   if (addr_type == IPV6_ADDR_ANY && match_wildcard &&
+   !(ipv6_only_sock(sk) && addr_type2 == IPV6_ADDR_MAPPED))
+   return 1;
+
+   if (sk2_rcv_saddr6 &&
+   ipv6_addr_equal(&sk->sk_v6_rcv_saddr, sk2_rcv_saddr6))
+   return 1;
+
+   return 0;

Re: net: ti: cpsw-phy-sel: RGMII is not working on AM335x

2017-01-11 Thread Alex


Hi Teresa

On 01/11/2017 07:57 AM, Teresa Remmet wrote:
> The phy we use is a KSZ9021. And yes, we add delays to the
> phy, as you can see. Looking at the dts documentation, I probably
> need to set the phy-mode to "rgmii-id" instead, as the phy is providing
> the delays.

Before the change I submitted, every "rgmii*" link was being set up as
"rgmii-id", regardless of phy-mode. I suspected there might have been
the odd board here or there that used "rgmii" when the actual mode is
"rgmii-id".

What confused me the first time around was that I didn't realize the
delay applies to the phy, not the mac. So "-id" means "the PHY provides
the delays, so don't add delays on the MAC", just like you realized.

> I made a quick test with that change and it is working. So this seems
> to solve my problem. Thank you for the hint.

I'm glad it was an easy fix. Last time I had issues with RGMII, I had to
pull out the oscilloscope.

Alex

> Regards,
> Teresa

[PATCH 0/6 net-next][V3] Rework inet_csk_get_port

2017-01-11 Thread Josef Bacik

Sorry the holidays delayed this a bit.  I've made the changes requested and I
also sat down and wrote a reproducer so everybody could see what was happening.
Without my patches my reproducer takes about 50 seconds to run on my 48 CPU
devel box.  With my patches it takes 8 seconds.

V2->V3:
-Dropped the fastsock from the tb and instead just carry the saddrs, family, and
 ipv6 only flag.
-Reworked the helper functions to deal with this change so I could still use
 them when checking the fast path.
-Killed tb->num_owners as per Eric's request.
-Attached a reproducer to the bottom of this email.

V1->V2:
-Added a new patch 'inet: collapse ipv4/v6 rcv_saddr_equal functions into one'
 at Hannes' suggestion.
-Dropped ->bind_conflict and just use the new helper.
-Fixed a compile bug from the original ->bind_conflict patch.

The original description of the series follows

=

At some point recently the guys working on our load balancer added the ability
to use SO_REUSEPORT.  When they restarted their app with this option enabled
they immediately hit a softlockup on what appeared to be the
inet_bind_bucket->lock.  Eventually what all of our debugging and discussion led
us to was the fact that the application comes up without SO_REUSEPORT, shuts
down which creates around 100k twsk's, and then comes up and tries to open a
bunch of sockets using SO_REUSEPORT, which meant traversing the inet_bind_bucket
owners list under the lock.  Since this lock is needed for dealing with the
twsk's and basically anything else related to connections we would softlockup,
and sometimes not ever recover.

To solve this problem I did what you see in Patch 5/5.  Once we have a
SO_REUSEPORT socket on the tb->owners list we know that the socket has no
conflicts with any of the other sockets on that list.  So we can add a copy of
the sock_common (really all we need is the recv_saddr but it seemed ugly to copy
just the ipv6, ipv4, and flag to indicate if we were ipv6 only in there so I've
copied the whole common) in order to check subsequent SO_REUSEPORT sockets.  If
they match the previous one then we can skip the expensive
inet_csk_bind_conflict check.  This is what eliminated the soft lockup that we
were seeing.

Patches 1-4 are cleanups and re-workings.  For instance when we specify port ==
0 we need to find an open port, but we would do two passes through
inet_csk_bind_conflict every time we found a possible port.  We would also keep
track of the smallest_port value in order to try and use it if we found no
port our first run through.  This however made no sense as it would have had to
fail the first pass through inet_csk_bind_conflict, so would not actually pass
the second pass through either.  Finally I split the function into two functions
in order to make it easier to read and to distinguish between the two behaviors.

I have tested this on one of our load balancing boxes during peak traffic and it
hasn't fallen over.  But this is not my area, so obviously feel free to point
out where I'm being stupid and I'll get it fixed up and retested.  Thanks,

Josef


#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

static int ready = 0;
static int done = 0;

static void sigint_handler(int s)
{
ready = 1;
}

static int create_sock(int socktype, int soreuseport)
{
struct addrinfo hints;
struct addrinfo *ai, *ai_orig;
const char *addrs[] = {"::", "0.0.0.0", NULL};
const char *port = "";
int sock;
int num_socks = 0;
int i, ret;

memset(&hints, 0, sizeof(hints));
hints.ai_flags = AI_PASSIVE | AI_ADDRCONFIG;
hints.ai_socktype = SOCK_STREAM;
hints.ai_family = (socktype == 0) ? AF_INET6 : AF_INET;
hints.ai_protocol = IPPROTO_TCP;
ret = getaddrinfo(addrs[socktype], port, &hints, &ai);
if (ret < 0) {
fprintf(stderr, "couldn't get addr info %d\n", errno);
return -1;
}
ai_orig = ai;
while (ai != NULL) {
int yes = 1;
sock = socket(ai->ai_family, ai->ai_socktype,
  ai->ai_protocol);
if (sock < 0) {
fprintf(stderr, "socket failed %d\n", errno);
goto next;
}
if (soreuseport) {
if (setsockopt(sock, SOL_SOCKET, SO_REUSEPORT, &yes,
   sizeof(yes)) < 0) {
fprintf(stderr, "setsockopt failed %d\n", 
errno);
close(sock);
goto next;
}
}
if (bind(sock, ai->ai_addr, ai->ai_addrlen)) {
fprintf(stderr, "bind failed %d\n", errno);
close(sock);
goto next;

Re: [PATCH net-next 0/2] Move hwmon support out of switch and into PHYs.

2017-01-11 Thread Guenter Roeck

On Wed, Jan 11, 2017 at 08:16:43PM +0100, Andrew Lunn wrote:
> >  This makes it kind
> > of a circular argument (not that the dsa hwmon driver supports the
> > new API anyway, but still).
> 
> Actually, the code i posted does. I didn't just move it, i re-wrote it
> to use devm_hwmon_device_register_with_info().
> 
Cool. Sorry, I didn't remember. Then maybe you should ask for the core
to support the necessary callback ;-).

Guenter

Re: TCP using IPv4-mapped IPv6 address as source

2017-01-11 Thread Sowmini Varadhan

On (01/11/17 14:48), Jonathan T. Leighton wrote:
> 
> I would say that an IPv6 socket binds to an IPv4-mapped IPv6
> address, but yes - using it to connect to an IPv6 address would be
> an application bug. I also think that having connect() spend 2
> minutes sending packets containing an illegal source address is a
> bug. (And unfriendly!) I'll look into writing a patch for this, and
> let you know whether or not I think I'm up to it.
> 

BTW, linux probably has a number of bugs in this space. 

you can do a number of "interesting" things, where the
v4-mapped address ends up being treated as a global address.
E.g., you can configure a v4 mapped address on an interface
using /sbin/ip, then you can add an onlink route, like
 # ip -6 route add ::ffff:13.0.0.1 nexthop onlink dev eth1
and you can ping6 ::ffff:13.0.0.1, and (if you do this
on both sides) watch a merry little packet exchange on
the wire, where we send out an NS for ::ffff:13.0.0.1
to the solmcast of the v4 mapped address, the peer politely
responds, and the 2 nodes then have a nice chat. 

I think the RFC states somewhere that you should never ever
send out a v4 mapped address on the wire.

--Sowmini

Re: [PATCH net-next 5/6] bnxt_en: Pass RoCE app priority to firmware.

2017-01-11 Thread Michael Chan

On Wed, Jan 11, 2017 at 10:34 AM, Doug Ledford  wrote:
> On Wed, 2017-01-11 at 09:17 -0800, Michael Chan wrote:
>> On Wed, Jan 11, 2017 at 7:46 AM, David Miller 
>> wrote:
>> >
>> > From: Michael Chan 
>> > Date: Tue, 10 Jan 2017 20:12:38 -0500
>> >
>> > >
>> > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
>> > > b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
>> > > index 35a0d28..f2630cc 100644
>> > > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
>> > > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
>> > > @@ -36,6 +36,9 @@ struct bnxt_cos2bw_cfg {
>> > >
>> > >  #define HWRM_STRUCT_DATA_SUBTYPE_HOST_OPERATIONAL0x0300
>> > >
>> > > +#define ETH_P_ROCE   0x8915
>> >
>> > There's also a similar define in the qedr infiniband driver, this
>> > doesn't
>> > make much sense.
>> >
>> > Please add this to if_ether.h, and reference it from there in the
>> > drivers.
>> >
>> My colleague informed me that he has submitted a patch to do that
>> through the rdma tree:
>>
>> http://marc.info/?l=linux-rdma&m=148217575500983&w=2
>>
>> But it hasn't been merged yet.  I can drop this patch now and wait
>> for
>> the RDMA patch to show up on net-next. Or I make the change later to
>> use the common define when the RDMA patch is merged.  Thanks.
>
> The patch for this is in my for-next area:
>
> 69ae543969ab RDMA: Adding ethertype ETH_P_IBOE
>

David, can you take the patch now and I will fix it later to use
ETH_P_IBOE when the RDMA patch is pulled into net-next?  Or should I
drop the patch for now?

Re: TCP using IPv4-mapped IPv6 address as source

2017-01-11 Thread Jonathan T. Leighton




On 1/11/17 1:31 PM, Eric Dumazet wrote:
> On Wed, 2017-01-11 at 12:34 -0500, Jonathan T. Leighton wrote:
>> On 1/11/17 11:20 AM, Eric Dumazet wrote:
>>> On Thu, 2017-01-05 at 16:25 -0500, Jonathan T. Leighton wrote:
>>>> I've observed TCP using an IPv4-mapped IPv6 address as the source
>>>> address, which I believe contradicts
>>>> https://tools.ietf.org/html/rfc6890#page-14 (BCP 153). This occurs when
>>>> an IPv6 TCP socket, bound to a local IPv4-mapped IPv6 address, attempts
>>>> to connect to a remote IPv6 address. Presumably connect() should return
>>>> EAFNOSUPPORT in this case. Please advise me if this is not the
>>>> appropriate list to report this.
>>> Hi Jonathan
>>>
>>> I believe your concern makes sense.
>>> Do you have a patch to address this issue ?
>>
>> Thanks for responding Eric. I have limited experience with kernel
>> patches. Nevertheless, unless there's someone with the experience and
>> time to jump on this, I'm interested in taking a crack at it. I think
>> the issue certainly warrants attention: instead of returning immediately
>> with EAFNOSUPPORT, connect() retransmits its SYN 6 times, ultimately
>> returning ETIMEDOUT after 127 sec (1+2+4+...+64).
>
> I am not aware of an application trying to perform a bind() to an IPV4
> address before a connect() to an IPv6 destination.
>
> A kernel fix is certainly something that would detect application bugs
> in a more friendly way.

I would say that an IPv6 socket binds to an IPv4-mapped IPv6 address,
but yes - using it to connect to an IPv6 address would be an application
bug. I also think that having connect() spend 2 minutes sending packets
containing an illegal source address is a bug. (And unfriendly!) I'll
look into writing a patch for this, and let you know whether or not I
think I'm up to it.

- Jon

> Thanks !

Re: net/atm: warning in alloc_tx/__might_sleep

2017-01-11 Thread Michal Hocko

On Wed 11-01-17 20:45:25, Michal Hocko wrote:
> On Wed 11-01-17 09:37:06, Chas Williams wrote:
> > On Mon, 2017-01-09 at 18:20 +0100, Andrey Konovalov wrote:
> > > Hi!
> > > 
> > > I've got the following error report while running the syzkaller fuzzer.
> > > 
> > > On commit a121103c922847ba5010819a3f250f1f7fc84ab8 (4.10-rc3).
> > > 
> > > A reproducer is attached.
> > > 
> > > [ cut here ]
> > > WARNING: CPU: 0 PID: 4114 at kernel/sched/core.c:7737 
> > > __might_sleep+0x149/0x1a0
> > > do not call blocking ops when !TASK_RUNNING; state=1 set at
> > > [] prepare_to_wait+0x182/0x530
> > > Modules linked in:
> > > CPU: 0 PID: 4114 Comm: a.out Not tainted 4.10.0-rc3+ #59
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 
> > > 01/01/2011
> > > Call Trace:
> > >  __dump_stack lib/dump_stack.c:15
> > >  dump_stack+0x292/0x398 lib/dump_stack.c:51
> > >  __warn+0x19f/0x1e0 kernel/panic.c:547
> > >  warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:562
> > >  __might_sleep+0x149/0x1a0 kernel/sched/core.c:7732
> > >  slab_pre_alloc_hook mm/slab.h:408
> > >  slab_alloc_node mm/slub.c:2634
> > >  kmem_cache_alloc_node+0x14a/0x280 mm/slub.c:2744
> > >  __alloc_skb+0x10f/0x800 net/core/skbuff.c:219
> > >  alloc_skb ./include/linux/skbuff.h:926
> > >  alloc_tx net/atm/common.c:75
> > 
> > This is likely alloc_skb(..., GFP_KERNEL) in alloc_tx().  The simplest
> > fix for this would be simply to switch this GFP_ATOMIC.  See if this is
> > any better.
> > 
> > diff --git a/net/atm/common.c b/net/atm/common.c
> > index a3ca922..d84220c 100644
> > --- a/net/atm/common.c
> > +++ b/net/atm/common.c
> > @@ -72,7 +72,7 @@ static struct sk_buff *alloc_tx(struct atm_vcc *vcc, 
> > unsigned int size)
> >  sk_wmem_alloc_get(sk), size, sk->sk_sndbuf);
> > return NULL;
> > }
> > -   while (!(skb = alloc_skb(size, GFP_KERNEL)))
> > +   while (!(skb = alloc_skb(size, GFP_ATOMIC)))
> > schedule();
> > pr_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize);
> > atomic_add(skb->truesize, >sk_wmem_alloc);
> 
> Blee, this code is just horrendous. But the "fix" is obviously broken!
> schedule() is just a noop if you do not change the task state and what
> you are just asking for is a never failing non sleeping allocation - aka
> a busy loop in the kernel!

And btw. this while loop should be really turned into GFP_KERNEL |
__GFP_NOFAIL with an explanation why this allocation cannot possibly
fail.
-- 
Michal Hocko
SUSE Labs

Re: net/atm: warning in alloc_tx/__might_sleep

2017-01-11 Thread Michal Hocko

On Wed 11-01-17 09:37:06, Chas Williams wrote:
> On Mon, 2017-01-09 at 18:20 +0100, Andrey Konovalov wrote:
> > Hi!
> > 
> > I've got the following error report while running the syzkaller fuzzer.
> > 
> > On commit a121103c922847ba5010819a3f250f1f7fc84ab8 (4.10-rc3).
> > 
> > A reproducer is attached.
> > 
> > [ cut here ]
> > WARNING: CPU: 0 PID: 4114 at kernel/sched/core.c:7737 
> > __might_sleep+0x149/0x1a0
> > do not call blocking ops when !TASK_RUNNING; state=1 set at
> > [] prepare_to_wait+0x182/0x530
> > Modules linked in:
> > CPU: 0 PID: 4114 Comm: a.out Not tainted 4.10.0-rc3+ #59
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:15
> >  dump_stack+0x292/0x398 lib/dump_stack.c:51
> >  __warn+0x19f/0x1e0 kernel/panic.c:547
> >  warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:562
> >  __might_sleep+0x149/0x1a0 kernel/sched/core.c:7732
> >  slab_pre_alloc_hook mm/slab.h:408
> >  slab_alloc_node mm/slub.c:2634
> >  kmem_cache_alloc_node+0x14a/0x280 mm/slub.c:2744
> >  __alloc_skb+0x10f/0x800 net/core/skbuff.c:219
> >  alloc_skb ./include/linux/skbuff.h:926
> >  alloc_tx net/atm/common.c:75
> 
> This is likely alloc_skb(..., GFP_KERNEL) in alloc_tx().  The simplest
> fix for this would be simply to switch this GFP_ATOMIC.  See if this is
> any better.
> 
> diff --git a/net/atm/common.c b/net/atm/common.c
> index a3ca922..d84220c 100644
> --- a/net/atm/common.c
> +++ b/net/atm/common.c
> @@ -72,7 +72,7 @@ static struct sk_buff *alloc_tx(struct atm_vcc *vcc, 
> unsigned int size)
>sk_wmem_alloc_get(sk), size, sk->sk_sndbuf);
>   return NULL;
>   }
> - while (!(skb = alloc_skb(size, GFP_KERNEL)))
> + while (!(skb = alloc_skb(size, GFP_ATOMIC)))
>   schedule();
>   pr_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize);
>   atomic_add(skb->truesize, &sk->sk_wmem_alloc);

Blee, this code is just horrendous. But the "fix" is obviously broken!
schedule() is just a noop if you do not change the task state and what
you are just asking for is a never failing non sleeping allocation - aka
a busy loop in the kernel!

-- 
Michal Hocko
SUSE Labs

Re: [PATCH net-next 0/2] Move hwmon support out of switch and into PHYs.

2017-01-11 Thread Andrew Lunn

> Are not the *_alarm attributes specifically designed to report such
> things? See Documentation/hwmon/sysfs-interface

Hi Florian

Yes it is, and at the moment, userspace needs to poll it. No need for
an interrupt.

   Andrew

Re: [PATCH net-next 0/2] Move hwmon support out of switch and into PHYs.

2017-01-11 Thread Andrew Lunn

>  This makes it kind
> of a circular argument (not that the dsa hwmon driver supports the
> new API anyway, but still).

Actually, the code i posted does. I didn't just move it, i re-wrote it
to use devm_hwmon_device_register_with_info().

   Andrew

Re: net/atm: warning in alloc_tx/__might_sleep

2017-01-11 Thread Eric Dumazet

On Tue, 2017-01-10 at 23:47 +0100, Francois Romieu wrote:
> Eric Dumazet  :
> > On Tue, Jan 10, 2017 at 9:35 AM, Cong Wang  wrote:
> > > On Mon, Jan 9, 2017 at 9:20 AM, Andrey Konovalov  
> > > wrote:
> > >
> > > The fix should be straight-forward. Mind to try the attached patch?
> > 
> > 
> > You forgot to remove schedule() ?
> 
> It may be clearer to split alloc_tx in two parts: only the unsleepable
> "if (sk_wmem_alloc_get(sk) && !atm_may_send(vcc, size)) {" part of it
> contributes to the inner "while (!(skb = alloc_tx(vcc, eff))) {" block.
> 
> See net/atm/common.c
> [...]
> static struct sk_buff *alloc_tx(struct atm_vcc *vcc, unsigned int size)
> {
> struct sk_buff *skb;
> struct sock *sk = sk_atm(vcc);
> 
> if (sk_wmem_alloc_get(sk) && !atm_may_send(vcc, size)) {
> pr_debug("Sorry: wmem_alloc = %d, size = %d, sndbuf = %d\n",
>  sk_wmem_alloc_get(sk), size, sk->sk_sndbuf);
> return NULL;
> }
> while (!(skb = alloc_skb(size, GFP_KERNEL)))
> schedule();

Yeah, this code looks quite wrong anyway.

We can read it as an infinite loop in some stress conditions or memcg
constraints.


> The waiting stuff is related to vcc drain but the code makes it look as
> if it were also related to skb alloc (it isn't).
> 
> It may be obvious for you but it took me a while to figure what the
> code is supposed to achieve.
>

Re: [PATCH net-next 5/6] bnxt_en: Pass RoCE app priority to firmware.

2017-01-11 Thread Doug Ledford

On Wed, 2017-01-11 at 09:17 -0800, Michael Chan wrote:
> On Wed, Jan 11, 2017 at 7:46 AM, David Miller 
> wrote:
> > 
> > From: Michael Chan 
> > Date: Tue, 10 Jan 2017 20:12:38 -0500
> > 
> > > 
> > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
> > > b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
> > > index 35a0d28..f2630cc 100644
> > > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
> > > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
> > > @@ -36,6 +36,9 @@ struct bnxt_cos2bw_cfg {
> > > 
> > >  #define HWRM_STRUCT_DATA_SUBTYPE_HOST_OPERATIONAL0x0300
> > > 
> > > +#define ETH_P_ROCE   0x8915
> > 
> > There's also a similar define in the qedr infiniband driver, this
> > doesn't
> > make much sense.
> > 
> > Please add this to if_ether.h, and reference it from there in the
> > drivers.
> > 
> My colleague informed me that he has submitted a patch to do that
> through the rdma tree:
> 
> http://marc.info/?l=linux-rdma&m=148217575500983&w=2
> 
> But it hasn't been merged yet.  I can drop this patch now and wait
> for
> the RDMA patch to show up on net-next. Or I make the change later to
> use the common define when the RDMA patch is merged.  Thanks.

The patch for this is in my for-next area:

69ae543969ab RDMA: Adding ethertype ETH_P_IBOE

-- 
Doug Ledford 
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: TCP using IPv4-mapped IPv6 address as source

2017-01-11 Thread Eric Dumazet

On Wed, 2017-01-11 at 12:34 -0500, Jonathan T. Leighton wrote:
> On 1/11/17 11:20 AM, Eric Dumazet wrote:
> > On Thu, 2017-01-05 at 16:25 -0500, Jonathan T. Leighton wrote:
> >> I've observed TCP using an IPv4-mapped IPv6 address as the source
> >> address, which I believe contradicts
> >> https://tools.ietf.org/html/rfc6890#page-14 (BCP 153). This occurs when
> >> an IPv6 TCP socket, bound to a local IPv4-mapped IPv6 address, attempts
> >> to connect to a remote IPv6 address. Presumably connect() should return
> >> EAFNOSUPPORT in this case. Please advise me if this is not the
> >> appropriate list to report this.
> > Hi Jonathan
> >
> > I believe your concern makes sense.
> > Do you have a patch to address this issue ?
> 
> Thanks for responding Eric. I have limited experience with kernel 
> patches. Nevertheless, unless there's someone with the experience and 
> time to jump on this, I'm interested in taking a crack at it. I think 
> the issue certainly warrants attention: instead of returning immediately 
> with EAFNOSUPPORT, connect() retransmits its SYN 6 times, ultimately
> returning ETIMEDOUT after 127 sec (1+2+4+...+64).

I am not aware of an application trying to perform a bind() to an IPV4
address before a connect() to an IPv6 destination.

A kernel fix is certainly something that would detect application bugs
in a more friendly way.

Thanks !

Re: [PATCH v2 7/8] net: Rename TCA*BPF_DIGEST to ..._SHA256

2017-01-11 Thread Andy Lutomirski

On Wed, Jan 11, 2017 at 1:09 AM, Daniel Borkmann  wrote:
> Hi Andy,
>
> On 01/11/2017 04:11 AM, Andy Lutomirski wrote:
>>
>> On Tue, Jan 10, 2017 at 4:50 PM, Daniel Borkmann 
>> wrote:
>>>
>>> On 01/11/2017 12:24 AM, Andy Lutomirski wrote:

>>>> This makes it easier to add another digest algorithm down the road if
>>>> needed.  It also serves to force any programs that might have been
>>>> written against a kernel that had the old field name to notice the
>>>> change and make any necessary changes.
>>>>
>>>> This shouldn't violate any stable API policies, as no released kernel
>>>> has ever had TCA*BPF_DIGEST.
>>>
>>>
>>> Imho, this and patch 6/8 is not really needed. Should there ever
>>> another digest alg be used (doubt it), then you'd need a new nl
>>> attribute and fdinfo line anyway to keep existing stuff intact.
>>> Nobody made the claim that you can just change this underneath
>>> and not respecting abi for existing applications when I read from
>>> above that such apps now will get "forced" to notice a change.
>>
>>
>> Fair enough.  I was more concerned about prerelease iproute2 versions,
>> but maybe that's a nonissue.  I'll drop these two patches.
>
>
> Ok. Sleeping over this a bit, how about a general rename into
> "prog_tag" for fdinfo and TCA_BPF_TAG resp. TCA_ACT_BPF_TAG for
> the netlink attributes, fwiw, it might reduce any assumptions on
> this being made? If this would be preferable, I could cook that
> patch against -net for renaming it?

That would be fine with me.

I think there are two reasonable approaches to computing the actual tag.

1. Use a standard, modern cryptographic hash.  SHA-256, SHA-512,
Blake2b, whatever.  SHA-1 is a bad choice in part because it's partly
broken and in part because the implementation in lib/ is a real mess
to use (as you noticed while writing the code).

2. Use whatever algorithm you like but make the tag so short that it's
obviously not collision-free.  48 or 64 bits is probably reasonable.

The intermediate versions are just asking for trouble.  Alexei wants
to make the tag shorter, but I admit I still don't understand why he
prefers that over using a better crypto hash and letting user code
truncate the tag if it wants.

Re: [Linux-c6x-dev] [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-11 Thread Mark Salter

On Fri, 2017-01-06 at 10:43 +0100, Nicolas Dichtel wrote:
> Regularly, when a new header is created in include/uapi/, the developer
> forgets to add it in the corresponding Kbuild file. This error is usually
> detected after the release is out.
> 
> In fact, all headers under uapi directories should be exported, thus it's
> useless to have an exhaustive list.
> 
> After this patch, the following files, which were not exported, are now
> exported (with make headers_install_all):
> asm-unicore32/shmparam.h
> asm-unicore32/ucontext.h
> asm-hexagon/shmparam.h
> asm-mips/ucontext.h
> asm-mips/hwcap.h
> asm-mips/reg.h
> drm/vgem_drm.h
> drm/armada_drm.h
> drm/omap_drm.h
> drm/etnaviv_drm.h
> asm-tile/shmparam.h
> asm-blackfin/shmparam.h
> asm-blackfin/ucontext.h
> asm-powerpc/perf_regs.h
> rdma/qedr-abi.h
> asm-parisc/kvm_para.h
> asm-openrisc/shmparam.h
> asm-nios2/kvm_para.h
> asm-nios2/ucontext.h
> asm-sh/kvm_para.h
> asm-sh/ucontext.h
> asm-xtensa/kvm_para.h
> asm-avr32/kvm_para.h
> asm-m32r/kvm_para.h
> asm-h8300/shmparam.h
> asm-h8300/ucontext.h
> asm-metag/kvm_para.h
> asm-metag/shmparam.h
> asm-metag/ucontext.h
> asm-m68k/kvm_para.h
> asm-m68k/shmparam.h
> linux/bcache.h
> linux/kvm.h
> linux/kvm_para.h
> linux/kfd_ioctl.h
> linux/cryptouser.h
> linux/kcm.h
> linux/kcov.h
> linux/seg6_iptunnel.h
> linux/stm.h
> linux/genwqe
> linux/genwqe/.install
> linux/genwqe/genwqe_card.h
> linux/genwqe/..install.cmd
> linux/seg6.h
> linux/cifs
> linux/cifs/.install
> linux/cifs/cifs_mount.h
> linux/cifs/..install.cmd
> linux/auto_dev-ioctl.h
> 
> Thanks to Julien Floret  for the tip to get all
> subdirs with a pure makefile command.
> 
> Signed-off-by: Nicolas Dichtel 
> ---
>  Documentation/kbuild/makefiles.txt  |  41 ++-
>  arch/alpha/include/uapi/asm/Kbuild  |  41 ---
>  arch/arc/include/uapi/asm/Kbuild|   3 -
>  arch/arm/include/uapi/asm/Kbuild|  17 -
>  arch/arm64/include/uapi/asm/Kbuild  |  18 --
>  arch/avr32/include/uapi/asm/Kbuild  |  20 --
>  arch/blackfin/include/uapi/asm/Kbuild   |  17 -
>  arch/c6x/include/uapi/asm/Kbuild|   8 -
>  arch/cris/include/uapi/arch-v10/arch/Kbuild |   5 -
>  arch/cris/include/uapi/arch-v32/arch/Kbuild |   3 -
>  arch/cris/include/uapi/asm/Kbuild   |  43 +--
>  arch/frv/include/uapi/asm/Kbuild|  33 --
>  arch/h8300/include/uapi/asm/Kbuild  |  28 --
>  arch/hexagon/include/asm/Kbuild |   3 -
>  arch/hexagon/include/uapi/asm/Kbuild|  13 -
>  arch/ia64/include/uapi/asm/Kbuild   |  45 ---
>  arch/m32r/include/uapi/asm/Kbuild   |  31 --
>  arch/m68k/include/uapi/asm/Kbuild   |  24 --
>  arch/metag/include/uapi/asm/Kbuild  |   8 -
>  arch/microblaze/include/uapi/asm/Kbuild |  32 --
>  arch/mips/include/uapi/asm/Kbuild   |  37 ---
>  arch/mn10300/include/uapi/asm/Kbuild|  32 --
>  arch/nios2/include/uapi/asm/Kbuild  |   4 +-
>  arch/openrisc/include/asm/Kbuild|   3 -
>  arch/openrisc/include/uapi/asm/Kbuild   |   8 -
>  arch/parisc/include/uapi/asm/Kbuild |  28 --
>  arch/powerpc/include/uapi/asm/Kbuild|  45 ---
>  arch/s390/include/uapi/asm/Kbuild   |  52 ---
>  arch/score/include/asm/Kbuild   |   4 -
>  arch/score/include/uapi/asm/Kbuild  |  32 --
>  arch/sh/include/uapi/asm/Kbuild |  23 --
>  arch/sparc/include/uapi/asm/Kbuild  |  48 ---
>  arch/tile/include/asm/Kbuild|   3 -
>  arch/tile/include/uapi/arch/Kbuild  |  17 -
>  arch/tile/include/uapi/asm/Kbuild   |  19 +-
>  arch/unicore32/include/uapi/asm/Kbuild  |   6 -
>  arch/x86/include/uapi/asm/Kbuild|  59 
>  arch/xtensa/include/uapi/asm/Kbuild |  23 --
>  include/Kbuild  |   2 -
>  include/asm-generic/Kbuild.asm  |   1 -
>  include/scsi/fc/Kbuild  |   0
>  include/uapi/Kbuild |  15 -
>  include/uapi/asm-generic/Kbuild |  36 ---
>  include/uapi/asm-generic/Kbuild.asm |  62 ++--
>  include/uapi/drm/Kbuild |  22 --
>  include/uapi/linux/Kbuild   | 482 
> 
>  include/uapi/linux/android/Kbuild   |   2 -
>  include/uapi/linux/byteorder/Kbuild |   3 -
>  include/uapi/linux/caif/Kbuild  |   3 -
>  include/uapi/linux/can/Kbuild   |   6 -
>  include/uapi/linux/dvb/Kbuild   |   9 -
>  include/uapi/linux/hdlc/Kbuild  |   2 -
>  include/uapi/linux/hsi/Kbuild   |   2 -
>  include/uapi/linux/iio/Kbuild   |   3 -
>  include/uapi/linux/isdn/Kbuild  |   2 -
>  include/uapi/linux/mmc/Kbuild   |   2 -
>  include/uapi/linux/netfilter/Kbuild |  89 -
>  include/uapi/linux/netfilter/ipset/Kbuild

Re: [PATCH v2 8/8] crypto/testmgr: Allocate only the required output size for hash tests

2017-01-11 Thread Andy Lutomirski

On Wed, Jan 11, 2017 at 7:13 AM, David Laight  wrote:
> From: Andy Lutomirski
>> Sent: 10 January 2017 23:25
>> There are some hashes (e.g. sha224) that have some internal trickery
>> to make sure that only the correct number of output bytes are
>> generated.  If something goes wrong, they could potentially overrun
>> the output buffer.
>>
>> Make the test more robust by allocating only enough space for the
>> correct output size so that memory debugging will catch the error if
>> the output is overrun.
>
> Might be better to test this by allocating an overlong buffer
> and then explicitly checking that the output hasn't overrun
> the allowed space.
>
> If nothing else the error message will be clearer.

I thought about that, but then I'd have to figure out what poison
value to use.  Both KASAN and the usual slab debugging are quite good
at this kind of checking, and KASAN will even point you to the
problematic code directly.

--Andy


-- 
Andy Lutomirski
AMA Capital Management, LLC

Re: [PATCH net-next 0/2] Move hwmon support out of switch and into PHYs.

2017-01-11 Thread Guenter Roeck

On Wed, Jan 11, 2017 at 06:37:41PM +0100, Andrew Lunn wrote:
> 
> However, at the moment, there is no code to enable interrupts for
> temperature alarms. I also don't see any need to add such code, since
> there is nowhere in HWMON to actively report such an alarm condition.
> 
That is not entirely correct. It is not in the hwmon core, but that is
mostly because the sysfs attributes are managed in the hwmon drivers,
at least with the legacy API, and it is thus the responsibility of
hwmon drivers to implement interrupts and to alert user space.

It is on my task list to add a callback into the core, to support
generating udev and sysfs events with the new API, but so far I don't
see a need for it because no one requested it. This makes it kind
of a circular argument (not that the dsa hwmon driver supports the
new API anyway, but still).

Guenter
