Re: [RFC v1 for accelerated IPoIB 04/25] IB/verb: Add ipoib_options struct and API

2017-03-14 Thread Vishwanathapura, Niranjana

On Mon, Mar 13, 2017 at 02:01:36PM -0600, Jason Gunthorpe wrote:

+   /* multicast */
+   int (*attach_mcast)(struct net_device *dev, struct ib_device *hca,
+   union ib_gid *gid, u16 lid, int set_qkey);
+   int (*detach_mcast)(struct net_device *dev, struct ib_device *hca,
+   union ib_gid *gid, u16 lid);


It would make more sense to store the struct ib_device pointer in the
struct rdma_netdev.



Agreed that it shouldn't be a function parameter.
For opa_vnic, I found it convenient to store the ib_device pointer in the client and
device private structures, as those will be available in most places anyhow.
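
A minimal userspace sketch of that layout, for illustration only. All `toy_*` names are hypothetical stand-ins and not the proposed kernel API, and the gid argument is left out for brevity. The point is that once the device pointer lives in the private structure, the multicast callback no longer needs it as a parameter:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for struct ib_device and struct net_device. */
struct toy_ib_device { const char *name; };
struct toy_net_device { void *priv; };

/* Hypothetical private structure: the HCA pointer is stored once at setup. */
struct toy_rdma_netdev {
	struct toy_ib_device *hca;
	int (*attach_mcast)(struct toy_net_device *dev, uint16_t lid,
			    int set_qkey);
};

/* Any callback can reach the HCA through the netdev private data. */
static const char *toy_hca_name(struct toy_net_device *dev)
{
	struct toy_rdma_netdev *rn = dev->priv;

	assert(rn && rn->hca);
	return rn->hca->name;
}

static int toy_attach_mcast(struct toy_net_device *dev, uint16_t lid,
			    int set_qkey)
{
	(void)lid;
	(void)set_qkey;
	return toy_hca_name(dev) != NULL ? 0 : -1;
}

/* Wire everything up and call the callback without passing the HCA. */
static int toy_demo(void)
{
	static struct toy_ib_device hca = { .name = "hca0" };
	static struct toy_rdma_netdev rn = {
		.hca = &hca,
		.attach_mcast = toy_attach_mcast,
	};
	struct toy_net_device dev = { .priv = &rn };

	if (strcmp(toy_hca_name(&dev), "hca0") != 0)
		return -1;
	return rn.attach_mcast(&dev, 0xc000, 1);
}
```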


Niranjana


Re: [PATCH net-next 4/4] net-next: dsa: add dsa support for Mediatek MT7530 switch

2017-03-14 Thread Sean Wang
On Tue, 2017-03-14 at 00:11 +0100, Andrew Lunn wrote:
> > +static int
> > +mt7530_setup(struct dsa_switch *ds)
> > +{
> > +   struct mt7530_priv *priv = ds->priv;
> > +   int ret, i, phy_mode;
> > +   u8  cpup_mask = 0;
> > +   u32 id, val;
> > +   struct regmap *regmap;
> > +
> > +   /* Make sure that cpu port specified in the dt is appropriate */
> > +   if (!dsa_is_cpu_port(ds, MT7530_CPU_PORT)) {
> > +   dev_err(priv->dev, "port not matched with the CPU port\n");
> > +   return -EINVAL;
> > +   }
> > +
> > +   regmap = devm_regmap_init(ds->dev, NULL, priv,
> > + &mt7530_regmap_config);
> > +   if (IS_ERR(regmap))
> > +   dev_warn(priv->dev, "phy regmap initialization failed");
> > +
> > +   phy_mode = of_get_phy_mode(ds->ports[ds->dst->cpu_port].dn);
> > +   if (phy_mode < 0) {
> > +   dev_err(priv->dev, "Can't find phy-mode for master device\n");
> > +   return phy_mode;
> > +   }
> > +   dev_info(priv->dev, "phy-mode for master device = %x\n", phy_mode);
> 
> Hi Sean
> 
> It is not documented in the binding that a phy-mode is mandatory for
> the cpu port.
> 
> Andrew

Hi Andrew,

Thanks for reviewing. I'll also add the missing part in the next
version.
Sean




[PATCH] tcp_westwood: fix tcp_westwood_info() style mistakes

2017-03-14 Thread Chun Long
From: chun Long 

Replace commas with semicolons in tcp_westwood_info().

---
 net/ipv4/tcp_westwood.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
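
For context, a small userspace sketch of why the old code still compiled and behaved the same. The struct and function names here are hypothetical, not taken from tcp_westwood.c. The comma operator chains the assignments into a single expression statement evaluated left to right, so the change is purely one of style:

```c
#include <assert.h>

/* Hypothetical stand-in for the fields touched by the patch. */
struct toy_info {
	unsigned int rtt;
	unsigned int minrtt;
	unsigned int attr;
};

static struct toy_info fill_with_commas(unsigned int rtt, unsigned int minrtt)
{
	struct toy_info info = { 0, 0, 0 };

	/* One expression statement: operands are evaluated left to right. */
	info.rtt = rtt,
	info.minrtt = minrtt,
	info.attr = 1;
	return info;
}

static struct toy_info fill_with_semicolons(unsigned int rtt,
					    unsigned int minrtt)
{
	struct toy_info info = { 0, 0, 0 };

	/* Three separate statements, as in the fixed code. */
	info.rtt = rtt;
	info.minrtt = minrtt;
	info.attr = 1;
	return info;
}

/* Returns 1 when both forms produce identical results. */
static int fills_match(unsigned int rtt, unsigned int minrtt)
{
	struct toy_info a = fill_with_commas(rtt, minrtt);
	struct toy_info b = fill_with_semicolons(rtt, minrtt);

	assert(a.attr == 1);
	return a.rtt == b.rtt && a.minrtt == b.minrtt && a.attr == b.attr;
}
```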

diff --git a/net/ipv4/tcp_westwood.c b/net/ipv4/tcp_westwood.c
index fed66dc..9775453 100644
--- a/net/ipv4/tcp_westwood.c
+++ b/net/ipv4/tcp_westwood.c
@@ -265,8 +265,8 @@ static size_t tcp_westwood_info(struct sock *sk, u32 ext, int *attr,
if (ext & (1 << (INET_DIAG_VEGASINFO - 1))) {
info->vegas.tcpv_enabled = 1;
info->vegas.tcpv_rttcnt = 0;
-   info->vegas.tcpv_rtt= jiffies_to_usecs(ca->rtt),
-   info->vegas.tcpv_minrtt = jiffies_to_usecs(ca->rtt_min),
+   info->vegas.tcpv_rtt= jiffies_to_usecs(ca->rtt);
+   info->vegas.tcpv_minrtt = jiffies_to_usecs(ca->rtt_min);
 
*attr = INET_DIAG_VEGASINFO;
return sizeof(struct tcpvegas_info);
-- 
1.8.3.1




Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-14 Thread Dmitry Vyukov
On Mon, Mar 6, 2017 at 10:36 AM, Dmitry Vyukov  wrote:
> On Sun, Mar 5, 2017 at 6:36 PM, Dmitry Vyukov  wrote:
>> On Sun, Mar 5, 2017 at 4:08 PM, Dmitry Vyukov  wrote:
>>> Hello,
>>>
>>> I am getting the following deadlock reports while running syzkaller
>>> fuzzer on net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
>>>
>>> ==
>>> [ INFO: possible circular locking dependency detected ]
>>> 4.10.0+ #5 Not tainted
>>> ---
>>> syz-executor6/6143 is trying to acquire lock:
>>>  (nlk->cb_mutex){+.+.+.}, at: []
>>> __netlink_dump_start+0xf4/0x760 net/netlink/af_netlink.c:2187
>>>
>>> but task is already holding lock:
>>>  (crypto_alg_sem){+.}, at: []
>>> crypto_user_rcv_msg+0x136/0x4f0 crypto/crypto_user.c:507
>>>
>>> which lock already depends on the new lock.
>>>
>>>
>>> the existing dependency chain (in reverse order) is:
>>>
>>> -> #4 (crypto_alg_sem){+.}:
>>>validate_chain kernel/locking/lockdep.c:2267 [inline]
>>>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>>>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>>>down_read+0x9b/0x150 kernel/locking/rwsem.c:23
>>>crypto_alg_lookup+0x23/0x50 crypto/api.c:199
>>>crypto_larval_lookup.part.10+0x9a/0x3b0 crypto/api.c:217
>>>crypto_larval_lookup crypto/api.c:211 [inline]
>>>crypto_alg_mod_lookup+0x77/0x1b0 crypto/api.c:270
>>>crypto_alloc_base+0x50/0x1e0 crypto/api.c:416
>>>crypto_alloc_cipher include/linux/crypto.h:1407 [inline]
>>>tcp_fastopen_reset_cipher+0xc2/0x2e0 net/ipv4/tcp_fastopen.c:48
>>>tcp_fastopen_init_key_once+0x114/0x120 net/ipv4/tcp_fastopen.c:29
>>>do_tcp_setsockopt.isra.36+0x140a/0x20a0 net/ipv4/tcp.c:2684
>>>tcp_setsockopt+0xb0/0xd0 net/ipv4/tcp.c:2733
>>>sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2731
>>>SYSC_setsockopt net/socket.c:1786 [inline]
>>>SyS_setsockopt+0x25c/0x390 net/socket.c:1765
>>>entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>
>>> -> #3 (sk_lock-AF_INET){+.+.+.}:
>>>validate_chain kernel/locking/lockdep.c:2267 [inline]
>>>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>>>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>>>lock_sock_nested+0xcb/0x120 net/core/sock.c:2536
>>>lock_sock include/net/sock.h:1460 [inline]
>>>rds_tcp_listen_stop+0x57/0x140 net/rds/tcp_listen.c:284
>>>rds_tcp_kill_sock net/rds/tcp.c:529 [inline]
>>>rds_tcp_dev_event+0x383/0xc50 net/rds/tcp.c:568
>>>notifier_call_chain+0x1b5/0x2b0 kernel/notifier.c:93
>>>__raw_notifier_call_chain kernel/notifier.c:394 [inline]
>>>raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
>>>call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1646
>>>call_netdevice_notifiers net/core/dev.c:1662 [inline]
>>>netdev_run_todo+0x3b2/0xa30 net/core/dev.c:7530
>>>rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:104
>>>default_device_exit_batch+0x504/0x620 net/core/dev.c:8334
>>>ops_exit_list.isra.6+0x100/0x150 net/core/net_namespace.c:144
>>>cleanup_net+0x551/0xa90 net/core/net_namespace.c:463
>>>process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
>>>worker_thread+0x223/0x1990 kernel/workqueue.c:2230
>>>kthread+0x326/0x3f0 kernel/kthread.c:229
>>>ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
>>>
>>> -> #2 (rtnl_mutex){+.+.+.}:
>>>validate_chain kernel/locking/lockdep.c:2267 [inline]
>>>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>>>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>>>__mutex_lock_common kernel/locking/mutex.c:756 [inline]
>>>__mutex_lock+0x172/0x1730 kernel/locking/mutex.c:893
>>>mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>>>rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70
>>>tipc_nl_bearer_dump+0x3ef/0x720 net/tipc/bearer.c:774
>>>genl_lock_dumpit+0x68/0x90 net/netlink/genetlink.c:479
>>>netlink_dump+0x54d/0xd40 net/netlink/af_netlink.c:2127
>>>__netlink_dump_start+0x4e5/0x760 net/netlink/af_netlink.c:2217
>>>genl_family_rcv_msg+0xd9d/0x1040 net/netlink/genetlink.c:546
>>>genl_rcv_msg+0xa6/0x140 net/netlink/genetlink.c:620
>>>netlink_rcv_skb+0x2ab/0x390 net/netlink/af_netlink.c:2298
>>>genl_rcv+0x28/0x40 net/netlink/genetlink.c:631
>>>netlink_unicast_kernel net/netlink/af_netlink.c:1231 [inline]
>>>netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1257
>>>netlink_sendmsg+0xa9f/0xe50 net/netlink/af_netlink.c:1803
>>>sock_sendmsg_nosec net/socket.c:633 [inline]
>>>sock_sendmsg+0xca/0x110 net/socket.c:643
>>>sock_write_iter+0x326/0x600 net/socket.c:846
>>>call_write_iter include/linux/fs.h:1733 [in

Re: [PATCH net] r8152: fix the list rx_done may be used without initialization

2017-03-14 Thread Petr Vorel
Hi Hayes,

> The list rx_done is initialized only when link-on occurs.
> Therefore, if napi is scheduled before any link-on has occurred,
> the following kernel panic happens.

>   BUG: unable to handle kernel NULL pointer dereference at 008
>   IP: [] r8152_poll+0xe1e/0x1210 [r8152]
>   PGD 0
>   Oops: 0002 [#1] SMP

Thanks for fixing!

Kind regards,
Petr


RE: [PATCH net] r8152: fix the list rx_done may be used without initialization

2017-03-14 Thread Hayes Wang
Petr Vorel [mailto:petr.vo...@gmail.com]
> Sent: Tuesday, March 14, 2017 4:54 PM
[...]
> thanks for fixing!

Does it work?

Best Regards,
Hayes




Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-14 Thread Herbert Xu
On Sun, Mar 05, 2017 at 04:08:39PM +0100, Dmitry Vyukov wrote:
>
> -> #1 (genl_mutex){+.+.+.}:
>validate_chain kernel/locking/lockdep.c:2267 [inline]
>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>__mutex_lock_common kernel/locking/mutex.c:756 [inline]
>__mutex_lock+0x172/0x1730 kernel/locking/mutex.c:893
>mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>genl_lock net/netlink/genetlink.c:32 [inline]
>genl_lock_dumpit+0x41/0x90 net/netlink/genetlink.c:478
>netlink_dump+0x54d/0xd40 net/netlink/af_netlink.c:2127
>__netlink_dump_start+0x4e5/0x760 net/netlink/af_netlink.c:2217
>genl_family_rcv_msg+0xd9d/0x1040 net/netlink/genetlink.c:546
>genl_rcv_msg+0xa6/0x140 net/netlink/genetlink.c:620
>netlink_rcv_skb+0x2ab/0x390 net/netlink/af_netlink.c:2298
>genl_rcv+0x28/0x40 net/netlink/genetlink.c:631
>netlink_unicast_kernel net/netlink/af_netlink.c:1231 [inline]
>netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1257
>netlink_sendmsg+0xa9f/0xe50 net/netlink/af_netlink.c:1803
>sock_sendmsg_nosec net/socket.c:633 [inline]
>sock_sendmsg+0xca/0x110 net/socket.c:643
>sock_write_iter+0x326/0x600 net/socket.c:846
>call_write_iter include/linux/fs.h:1733 [inline]
>new_sync_write fs/read_write.c:497 [inline]
>__vfs_write+0x483/0x740 fs/read_write.c:510
>vfs_write+0x187/0x530 fs/read_write.c:558
>SYSC_write fs/read_write.c:605 [inline]
>SyS_write+0xfb/0x230 fs/read_write.c:597
>entry_SYSCALL_64_fastpath+0x1f/0xc2
> 
> -> #0 (nlk->cb_mutex){+.+.+.}:
>check_prev_add kernel/locking/lockdep.c:1830 [inline]
>check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1940
>validate_chain kernel/locking/lockdep.c:2267 [inline]
>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>__mutex_lock_common kernel/locking/mutex.c:756 [inline]
>__mutex_lock+0x172/0x1730 kernel/locking/mutex.c:893
>mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>__netlink_dump_start+0xf4/0x760 net/netlink/af_netlink.c:2187
>netlink_dump_start include/linux/netlink.h:165 [inline]
>crypto_user_rcv_msg+0x2ad/0x4f0 crypto/crypto_user.c:517
>netlink_rcv_skb+0x2ab/0x390 net/netlink/af_netlink.c:2298
>crypto_netlink_rcv+0x2a/0x40 crypto/crypto_user.c:538
>netlink_unicast_kernel net/netlink/af_netlink.c:1231 [inline]
>netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1257
>netlink_sendmsg+0xa9f/0xe50 net/netlink/af_netlink.c:1803
>sock_sendmsg_nosec net/socket.c:633 [inline]
>sock_sendmsg+0xca/0x110 net/socket.c:643
>___sys_sendmsg+0x8fa/0x9f0 net/socket.c:1985
>__sys_sendmsg+0x138/0x300 net/socket.c:2019
>SYSC_sendmsg net/socket.c:2030 [inline]
>SyS_sendmsg+0x2d/0x50 net/socket.c:2026
>entry_SYSCALL_64_fastpath+0x1f/0xc2

This looks like a false positive.  The cb_mutex in #1 is not the
same as the cb_mutex in #0.  The cb_mutex in #0 is obtained
by crypto_user, which uses straight netlink.  The cb_mutex in #1
belongs to a genl netlink socket.

I'll have a look to see if we can annotate this.
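
To illustrate the reasoning, here is a toy userspace model. It is not lockdep code, and the edge list is my reading of the report above. Lockdep tracks ordering between lock classes, so if both cb_mutex instances share one class the chain closes into a cycle, while giving them separate classes leaves a plain chain:

```c
#include <assert.h>
#include <string.h>

#define MAX_LOCKS 8

/* edge[a][b] != 0 means "b was acquired while a was held". */
struct lock_graph {
	int n;
	int edge[MAX_LOCKS][MAX_LOCKS];
};

/* Depth-first search; state: 0 = unvisited, 1 = on current path, 2 = done. */
static int dfs(const struct lock_graph *g, int node, int *state)
{
	state[node] = 1;
	for (int next = 0; next < g->n; next++) {
		if (!g->edge[node][next])
			continue;
		if (state[next] == 1)
			return 1; /* back edge: circular dependency */
		if (state[next] == 0 && dfs(g, next, state))
			return 1;
	}
	state[node] = 2;
	return 0;
}

static int has_cycle(const struct lock_graph *g)
{
	int state[MAX_LOCKS] = { 0 };

	assert(g->n <= MAX_LOCKS);
	for (int i = 0; i < g->n; i++)
		if (state[i] == 0 && dfs(g, i, state))
			return 1;
	return 0;
}

enum { CB_GENL, GENL, RTNL, SK_LOCK, CRYPTO_ALG_SEM, CB_CRYPTO, NLOCKS };

/* Build the chain from the report, with cb_mutex as one class or two. */
static int report_has_cycle(int split_cb_classes)
{
	struct lock_graph g;
	int cb_crypto = split_cb_classes ? CB_CRYPTO : CB_GENL;

	memset(&g, 0, sizeof(g));
	g.n = NLOCKS;
	g.edge[CB_GENL][GENL] = 1;             /* #1 */
	g.edge[GENL][RTNL] = 1;                /* #2 */
	g.edge[RTNL][SK_LOCK] = 1;             /* #3 */
	g.edge[SK_LOCK][CRYPTO_ALG_SEM] = 1;   /* #4 */
	g.edge[CRYPTO_ALG_SEM][cb_crypto] = 1; /* #0 */
	return has_cycle(&g);
}
```

With a single class the model reports a cycle; with the crypto_user cb_mutex in its own class it does not, which is the effect an annotation would aim for.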

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-14 Thread Herbert Xu
On Sun, Mar 05, 2017 at 06:36:12PM +0100, Dmitry Vyukov wrote:
>
> > -> #1 (genl_mutex){+.+.+.}:
> >validate_chain kernel/locking/lockdep.c:2267 [inline]
> >__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
> >lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
> >__mutex_lock_common kernel/locking/mutex.c:756 [inline]
> >__mutex_lock+0x172/0x1730 kernel/locking/mutex.c:893
> >mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> >genl_lock net/netlink/genetlink.c:32 [inline]
> >genl_lock_dumpit+0x41/0x90 net/netlink/genetlink.c:478
> >netlink_dump+0x54d/0xd40 net/netlink/af_netlink.c:2127
> >__netlink_dump_start+0x4e5/0x760 net/netlink/af_netlink.c:2217
> >genl_family_rcv_msg+0xd9d/0x1040 net/netlink/genetlink.c:546
> >genl_rcv_msg+0xa6/0x140 net/netlink/genetlink.c:620
> >netlink_rcv_skb+0x2ab/0x390 net/netlink/af_netlink.c:2298
> >genl_rcv+0x28/0x40 net/netlink/genetlink.c:631
> >netlink_unicast_kernel net/netlink/af_netlink.c:1231 [inline]
> >netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1257
> >netlink_sendmsg+0xa9f/0xe50 net/netlink/af_netlink.c:1803
> >sock_sendmsg_nosec net/socket.c:633 [inline]
> >sock_sendmsg+0xca/0x110 net/socket.c:643
> >sock_write_iter+0x326/0x600 net/socket.c:846
> >call_write_iter include/linux/fs.h:1733 [inline]
> >new_sync_write fs/read_write.c:497 [inline]
> >__vfs_write+0x483/0x740 fs/read_write.c:510
> >vfs_write+0x187/0x530 fs/read_write.c:558
> >SYSC_write fs/read_write.c:605 [inline]
> >SyS_write+0xfb/0x230 fs/read_write.c:597
> >entry_SYSCALL_64_fastpath+0x1f/0xc2
> >
> > -> #0 (nlk->cb_mutex){+.+.+.}:
> >check_prev_add kernel/locking/lockdep.c:1830 [inline]
> >check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1940
> >validate_chain kernel/locking/lockdep.c:2267 [inline]
> >__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
> >lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
> >__mutex_lock_common kernel/locking/mutex.c:756 [inline]
> >__mutex_lock+0x172/0x1730 kernel/locking/mutex.c:893
> >mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> >__netlink_dump_start+0xf4/0x760 net/netlink/af_netlink.c:2187
> >netlink_dump_start include/linux/netlink.h:165 [inline]
> >crypto_user_rcv_msg+0x2ad/0x4f0 crypto/crypto_user.c:517
> >netlink_rcv_skb+0x2ab/0x390 net/netlink/af_netlink.c:2298
> >crypto_netlink_rcv+0x2a/0x40 crypto/crypto_user.c:538
> >netlink_unicast_kernel net/netlink/af_netlink.c:1231 [inline]
> >netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1257
> >netlink_sendmsg+0xa9f/0xe50 net/netlink/af_netlink.c:1803
> >sock_sendmsg_nosec net/socket.c:633 [inline]
> >sock_sendmsg+0xca/0x110 net/socket.c:643
> >___sys_sendmsg+0x8fa/0x9f0 net/socket.c:1985
> >__sys_sendmsg+0x138/0x300 net/socket.c:2019
> >SYSC_sendmsg net/socket.c:2030 [inline]
> >SyS_sendmsg+0x2d/0x50 net/socket.c:2026
> >entry_SYSCALL_64_fastpath+0x1f/0xc2

Ditto.  Please disregard any reports involving genl_mutex and
cb_mutex where the latter comes from crypto_user.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-14 Thread Dmitry Vyukov
On Tue, Mar 14, 2017 at 10:16 AM, Herbert Xu
 wrote:
> On Sun, Mar 05, 2017 at 04:08:39PM +0100, Dmitry Vyukov wrote:
>>
>> -> #1 (genl_mutex){+.+.+.}:
>>validate_chain kernel/locking/lockdep.c:2267 [inline]
>>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>>__mutex_lock_common kernel/locking/mutex.c:756 [inline]
>>__mutex_lock+0x172/0x1730 kernel/locking/mutex.c:893
>>mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>>genl_lock net/netlink/genetlink.c:32 [inline]
>>genl_lock_dumpit+0x41/0x90 net/netlink/genetlink.c:478
>>netlink_dump+0x54d/0xd40 net/netlink/af_netlink.c:2127
>>__netlink_dump_start+0x4e5/0x760 net/netlink/af_netlink.c:2217
>>genl_family_rcv_msg+0xd9d/0x1040 net/netlink/genetlink.c:546
>>genl_rcv_msg+0xa6/0x140 net/netlink/genetlink.c:620
>>netlink_rcv_skb+0x2ab/0x390 net/netlink/af_netlink.c:2298
>>genl_rcv+0x28/0x40 net/netlink/genetlink.c:631
>>netlink_unicast_kernel net/netlink/af_netlink.c:1231 [inline]
>>netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1257
>>netlink_sendmsg+0xa9f/0xe50 net/netlink/af_netlink.c:1803
>>sock_sendmsg_nosec net/socket.c:633 [inline]
>>sock_sendmsg+0xca/0x110 net/socket.c:643
>>sock_write_iter+0x326/0x600 net/socket.c:846
>>call_write_iter include/linux/fs.h:1733 [inline]
>>new_sync_write fs/read_write.c:497 [inline]
>>__vfs_write+0x483/0x740 fs/read_write.c:510
>>vfs_write+0x187/0x530 fs/read_write.c:558
>>SYSC_write fs/read_write.c:605 [inline]
>>SyS_write+0xfb/0x230 fs/read_write.c:597
>>entry_SYSCALL_64_fastpath+0x1f/0xc2
>>
>> -> #0 (nlk->cb_mutex){+.+.+.}:
>>check_prev_add kernel/locking/lockdep.c:1830 [inline]
>>check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1940
>>validate_chain kernel/locking/lockdep.c:2267 [inline]
>>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>>__mutex_lock_common kernel/locking/mutex.c:756 [inline]
>>__mutex_lock+0x172/0x1730 kernel/locking/mutex.c:893
>>mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>>__netlink_dump_start+0xf4/0x760 net/netlink/af_netlink.c:2187
>>netlink_dump_start include/linux/netlink.h:165 [inline]
>>crypto_user_rcv_msg+0x2ad/0x4f0 crypto/crypto_user.c:517
>>netlink_rcv_skb+0x2ab/0x390 net/netlink/af_netlink.c:2298
>>crypto_netlink_rcv+0x2a/0x40 crypto/crypto_user.c:538
>>netlink_unicast_kernel net/netlink/af_netlink.c:1231 [inline]
>>netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1257
>>netlink_sendmsg+0xa9f/0xe50 net/netlink/af_netlink.c:1803
>>sock_sendmsg_nosec net/socket.c:633 [inline]
>>sock_sendmsg+0xca/0x110 net/socket.c:643
>>___sys_sendmsg+0x8fa/0x9f0 net/socket.c:1985
>>__sys_sendmsg+0x138/0x300 net/socket.c:2019
>>SYSC_sendmsg net/socket.c:2030 [inline]
>>SyS_sendmsg+0x2d/0x50 net/socket.c:2026
>>entry_SYSCALL_64_fastpath+0x1f/0xc2
>
> This looks like a false positive.  The cb_mutex in #1 is not the
> same as the cb_mutex in #0.  The cb_mutex in #0 is obtained
> by crypto_user, which uses straight netlink.  The cb_mutex in #1
> belongs to a genl netlink socket.
>
> I'll have a look to see if we can annotate this.

Yes, please.
Disregarding some reports is not a good long-term approach.


Re: [PATCH] net: stmmac: set default number of rx and tx queues in stmmac_pci

2017-03-14 Thread Joao Pinto
Hi David,

On 3/13/2017 at 7:12 PM, David Miller wrote:
> From: Joao Pinto 
> Date: Mon, 13 Mar 2017 10:07:07 +
> 
>> This patch configures default number of RX and TX queues when
>> using the pci glue driver.
>>
>> Signed-off-by: Joao Pinto 
> 
> This is a net-next specific patch, you must specify the target tree properly in your
> Subject line or similar, for example:

You are right, sorry, I forgot to put the target tree!

> 
>   [PATCH net-next] stmmac: set default ...
> 
> I'm taking care of this, and for the next patch too, but in the future you
> need to be clear what tree your changes are targeting.
> 
> Thanks.
> 

Thanks,
Joao


Re: [PATCH net-next RFC v1 00/27] afnetns: new namespace type for separation on protocol level

2017-03-14 Thread Hannes Frederic Sowa
On 13.03.2017 23:06, Eric W. Biederman wrote:
> Michael Kerrisk  writes:
> 
>> On Mon, Mar 13, 2017 at 12:44 AM, Hannes Frederic Sowa
>>  wrote:
>>> Hi,
>>>
>>> On Sun, 2017-03-12 at 16:26 -0700, David Miller wrote:
>>>> From: Hannes Frederic Sowa 
>>>> Date: Mon, 13 Mar 2017 00:01:24 +0100
>>>>
>>>>> afnetns behaves like ordinary namespaces: clone, unshare, setns syscalls
>>>>> can work with afnetns with one limitation: one cannot cross the realm
>>>>> of a network namespace while changing the afnetns compartment. To get
>>>>> into a new afnetns in a different net namespace, one must first change
>>>>> to the net namespace and afterwards switch to the desired afnetns.
>>>>
>>>> Please explain why this is useful, who wants this kind of facility,
>>>> and how it will be used.
>>>
>>> Yes, I have to enhance the cover letter:
>>>
>>> The work behind all this is to provide more dense container hosting.
>>> Right now we lose performance, because all packets need to be forwarded
>>> through either a bridge or must be routed until they reach the
>>> containers. For example, we can't make use of early demuxing for the
>>> incoming packets. We basically pass the networking stack twice for
>>> every packet.
>>>
>>> The usage is very much in line with how network namespaces are used
>>> nowadays:
>>>
>>> ip afnetns add afns-1
>>> ip address add 192.168.1.1/24 dev eth0 afnetns afns-1
>>> ip afnetns exec afns-1 /usr/sbin/httpd
>>>
>>> this spawns a shell where all child processes will only have access to
>>> the specific ip addresses, even though they do a wildcard bind. Source
>>> address selection will also use only the ip addresses available to the
>>> children.
>>>
>>> In some sense it has lots of characteristics like ipvlan, allowing a
>>> single MAC address to host lots of IP addresses which will end up in
>>> different namespaces. Unlike ipvlan, however, it will also solve the
>>> problem around duplicate address detection and multiplexing packets to
>>> the IGMP or MLD state machines.
>>>
>>> The resource consumption in comparison with ordinary namespaces will be
>>> much lower. All in all, we will have far fewer networking subsystems to
>>> cross compared to normal netns solutions.
>>>
>>> Some more information also in the first patch, which adds a
>>> Documentation.
> 
> If the goal is one ip address per network namespace with a network
> device and mac address on the network I have something that I was
> working on that I believe is in the end a much simpler solution.

Actually, it should be possible to use more than one IP address per
namespace, proper source address selection should deal with that and
also correctly select the higher scored ones, based on output device and
distance to the remote ip address.

> Add routes in the routing table between network namespaces.
> 
> AKA in the initial network namespace with the network device have
> an input route not towards the local loopback device but towards
> the network namespace's loopback device.
> 
> Before other issues took precedence I made it half way to implementing
> that.   The ip input path won't get confused if the destination network
> device is not in the same network namespace as the device.  Last I
> looked the ip output path still had a few places where confusion was
> possible between the network socket and the output device.

The ip afnetns input path is also of no concern to me and will work
quite easily. Right now, the different semantics and rules for selecting
a source address are the more problematic ones. I think that in the
case of directly routing from one ns into another this will be the same
and the most complex case to deal with?

> As long as installing such routes is conditional upon having
> CAP_NET_ADMIN in both network namespaces you should be fine and things
> should be very simple and very fast.  Because that won't take a special
> case through the network stack.
> 
> Given that performance is your primary motive I suspect this will yield
> the fastest possible path through the network stack as no extra steps
> need to be taken, and can benefit from any routing improvements to the
> ordinary network stack.

The major performance improvements come from socket early demuxing,
which actually requires the remote netns socket being visible in the
initial netns esock tables. We need the same for the representations for
IP addresses to have ARP/NDISC work correctly. As soon as you try to
just cross one data structure from one netns to another one, it gets
really difficult to keep track of all the dependencies. It felt way more
complex than this approach.
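
As a rough illustration of the idea (a toy userspace model, not code from the patch set; every `toy_*` name and the table layout are assumptions), sockets and addresses can live in one shared table and be tagged with the afnetns that owns them. A wildcard-bound socket then only matches packets whose destination address belongs to its own afnetns:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: an address assigned to an afnetns. */
struct toy_addr { uint32_t ip; int afnetns_id; };
/* Hypothetical: a wildcard-bound listening socket owned by an afnetns. */
struct toy_sock { uint16_t port; int afnetns_id; };

/* Find which afnetns owns the destination address, or -1 if none. */
static int toy_addr_owner(const struct toy_addr *addrs, size_t n,
			  uint32_t daddr)
{
	for (size_t i = 0; i < n; i++)
		if (addrs[i].ip == daddr)
			return addrs[i].afnetns_id;
	return -1;
}

/* Single lookup in the shared table, filtered by the owning afnetns. */
static const struct toy_sock *toy_demux(const struct toy_sock *socks,
					size_t nsocks,
					const struct toy_addr *addrs,
					size_t naddrs,
					uint32_t daddr, uint16_t dport)
{
	int owner = toy_addr_owner(addrs, naddrs, daddr);

	if (owner < 0)
		return NULL;
	for (size_t i = 0; i < nsocks; i++)
		if (socks[i].port == dport && socks[i].afnetns_id == owner)
			return &socks[i];
	return NULL;
}

/* Two afnetns, each with one address and a wildcard listener on port 80. */
static int toy_demo_lookup(uint32_t daddr)
{
	static const struct toy_addr addrs[] = {
		{ 0xC0A80101u, 1 }, /* 192.168.1.1 -> afnetns 1 */
		{ 0xC0A80102u, 2 }, /* 192.168.1.2 -> afnetns 2 */
	};
	static const struct toy_sock socks[] = { { 80, 1 }, { 80, 2 } };
	const struct toy_sock *sk = toy_demux(socks, 2, addrs, 2, daddr, 80);

	assert(sk == NULL || sk->port == 80);
	return sk ? sk->afnetns_id : -1;
}
```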

Thanks for your comments!

Bye,
Hannes




[PATCH v2 net-next 01/11] net: stmmac: prepare dma op mode config for multiple queues

2017-03-14 Thread Joao Pinto
This patch prepares DMA Operation Mode configuration for multiple queues.
The work consisted of breaking the DMA operation Mode configuration function
into RX and TX scope and adapting its mechanism in stmmac_main.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- Just to keep up the patch-set version

 drivers/net/ethernet/stmicro/stmmac/common.h  |   3 +
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c  | 118 +++---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |  84 +++
 3 files changed, 125 insertions(+), 80 deletions(-)
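
A simplified userspace sketch of the split, for illustration only. The `toy_*` names, the register model, and the caller loop are assumptions and not the driver code. It shows how separate RX and TX callbacks let the caller configure each direction per channel, with independent channel counts:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_CHAN 4
#define TOY_SF_MODE  1000      /* hypothetical stand-in for SF_DMA_MODE */
#define TOY_OP_SF    (1u << 0) /* hypothetical store-and-forward bit */

/* Toy register file: one RX and one TX operation-mode word per channel. */
struct toy_regs {
	uint32_t rx_op[TOY_MAX_CHAN];
	uint32_t tx_op[TOY_MAX_CHAN];
};

/* Mirrors the shape of the new dma_rx_mode/dma_tx_mode callbacks. */
struct toy_dma_ops {
	void (*dma_rx_mode)(struct toy_regs *regs, int mode, uint32_t channel,
			    int fifosz);
	void (*dma_tx_mode)(struct toy_regs *regs, int mode, uint32_t channel);
};

static void toy_rx_mode(struct toy_regs *regs, int mode, uint32_t channel,
			int fifosz)
{
	assert(channel < TOY_MAX_CHAN);
	(void)fifosz; /* the driver derives a queue size from this value */
	if (mode == TOY_SF_MODE)
		regs->rx_op[channel] |= TOY_OP_SF;
	else
		regs->rx_op[channel] &= ~TOY_OP_SF;
}

static void toy_tx_mode(struct toy_regs *regs, int mode, uint32_t channel)
{
	assert(channel < TOY_MAX_CHAN);
	if (mode == TOY_SF_MODE)
		regs->tx_op[channel] |= TOY_OP_SF;
	else
		regs->tx_op[channel] &= ~TOY_OP_SF;
}

static const struct toy_dma_ops toy_ops = {
	.dma_rx_mode = toy_rx_mode,
	.dma_tx_mode = toy_tx_mode,
};

/* Assumed caller: RX and TX channels are walked independently. */
static void toy_set_op_mode(const struct toy_dma_ops *ops,
			    struct toy_regs *regs,
			    uint32_t rx_cnt, uint32_t tx_cnt,
			    int rxmode, int txmode, int rxfifosz)
{
	for (uint32_t chan = 0; chan < rx_cnt; chan++)
		ops->dma_rx_mode(regs, rxmode, chan, rxfifosz);
	for (uint32_t chan = 0; chan < tx_cnt; chan++)
		ops->dma_tx_mode(regs, txmode, chan);
}
```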

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 9f0d26d..13bd3d4 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -424,6 +424,9 @@ struct stmmac_dma_ops {
 * An invalid value enables the store-and-forward mode */
void (*dma_mode)(void __iomem *ioaddr, int txmode, int rxmode,
 int rxfifosz);
+   void (*dma_rx_mode)(void __iomem *ioaddr, int mode, u32 channel,
+   int fifosz);
+   void (*dma_tx_mode)(void __iomem *ioaddr, int mode, u32 channel);
/* To track extra statistic (if supported) */
void (*dma_diagnostic_fr) (void *data, struct stmmac_extra_stats *x,
   void __iomem *ioaddr);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
index 6ac6b26..6285e8a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
@@ -182,70 +182,26 @@ static void dwmac4_rx_watchdog(void __iomem *ioaddr, u32 riwt)
writel(riwt, ioaddr + DMA_CHAN_RX_WATCHDOG(i));
 }
 
-static void dwmac4_dma_chan_op_mode(void __iomem *ioaddr, int txmode,
-   int rxmode, u32 channel, int rxfifosz)
+static void dwmac4_dma_rx_chan_op_mode(void __iomem *ioaddr, int mode,
+  u32 channel, int fifosz)
 {
-   unsigned int rqs = rxfifosz / 256 - 1;
-   u32 mtl_tx_op, mtl_rx_op, mtl_rx_int;
-
-   /* Following code only done for channel 0, other channels not yet
-* supported.
-*/
-   mtl_tx_op = readl(ioaddr + MTL_CHAN_TX_OP_MODE(channel));
-
-   if (txmode == SF_DMA_MODE) {
-   pr_debug("GMAC: enable TX store and forward mode\n");
-   /* Transmit COE type 2 cannot be done in cut-through mode. */
-   mtl_tx_op |= MTL_OP_MODE_TSF;
-   } else {
-   pr_debug("GMAC: disabling TX SF (threshold %d)\n", txmode);
-   mtl_tx_op &= ~MTL_OP_MODE_TSF;
-   mtl_tx_op &= MTL_OP_MODE_TTC_MASK;
-   /* Set the transmit threshold */
-   if (txmode <= 32)
-   mtl_tx_op |= MTL_OP_MODE_TTC_32;
-   else if (txmode <= 64)
-   mtl_tx_op |= MTL_OP_MODE_TTC_64;
-   else if (txmode <= 96)
-   mtl_tx_op |= MTL_OP_MODE_TTC_96;
-   else if (txmode <= 128)
-   mtl_tx_op |= MTL_OP_MODE_TTC_128;
-   else if (txmode <= 192)
-   mtl_tx_op |= MTL_OP_MODE_TTC_192;
-   else if (txmode <= 256)
-   mtl_tx_op |= MTL_OP_MODE_TTC_256;
-   else if (txmode <= 384)
-   mtl_tx_op |= MTL_OP_MODE_TTC_384;
-   else
-   mtl_tx_op |= MTL_OP_MODE_TTC_512;
-   }
-   /* For an IP with DWC_EQOS_NUM_TXQ == 1, the fields TXQEN and TQS are RO
-* with reset values: TXQEN on, TQS == DWC_EQOS_TXFIFO_SIZE.
-* For an IP with DWC_EQOS_NUM_TXQ > 1, the fields TXQEN and TQS are R/W
-* with reset values: TXQEN off, TQS 256 bytes.
-*
-* Write the bits in both cases, since it will have no effect when RO.
-* For DWC_EQOS_NUM_TXQ > 1, the top bits in MTL_OP_MODE_TQS_MASK might
-* be RO, however, writing the whole TQS field will result in a value
-* equal to DWC_EQOS_TXFIFO_SIZE, just like for DWC_EQOS_NUM_TXQ == 1.
-*/
-   mtl_tx_op |= MTL_OP_MODE_TXQEN | MTL_OP_MODE_TQS_MASK;
-   writel(mtl_tx_op, ioaddr +  MTL_CHAN_TX_OP_MODE(channel));
+   unsigned int rqs = fifosz / 256 - 1;
+   u32 mtl_rx_op, mtl_rx_int;
 
mtl_rx_op = readl(ioaddr + MTL_CHAN_RX_OP_MODE(channel));
 
-   if (rxmode == SF_DMA_MODE) {
+   if (mode == SF_DMA_MODE) {
pr_debug("GMAC: enable RX store and forward mode\n");
mtl_rx_op |= MTL_OP_MODE_RSF;
} else {
-   pr_debug("GMAC: disable RX SF mode (threshold %d)\n", rxmode);
+   pr_debug("GMAC: disable RX SF mode (threshold %d)\n", mode);
mtl_rx_op &= ~MTL_OP_MODE_RSF;
mtl_rx_op &= MTL_OP_MODE_RTC_MASK;
-   if (rxmode <

[PATCH v2 net-next 00/11] net: stmmac: prepare dma operations for multiple queues

2017-03-14 Thread Joao Pinto
As agreed with David Miller, this patch-set is the second of 3 to enable
multiple queues in stmmac.

This second one concentrates on DMA operations, adding functionality such as:
a) DMA Operation Mode configuration per channel and done in the multiple
queues configuration function
b) DMA IRQ enable and Disable by channel
c) DMA start and stop by channel
d) RX and TX ring length configuration by channel
e) RX and TX set tail pointer by channel
f) DMA channel initialization split into channel common, RX and TX
initialization
g) TSO being configured for all available channels
h) DMA interrupt treatment by channel

Joao Pinto (11):
  net: stmmac: prepare dma op mode config for multiple queues
  net: stmmac: enable/disable dma irq prepared for multiple queues
  net: stmmac: rx/tx dma start/stop prepared for multiple queues
  net: stmmac: prepare stmmac_tx_err for multiple queues
  net: stmmac: prepare dma interrupt treatment for multiple queues
  net: stmmac: rx watchdog config prepared for multiple queues
  net: stmmac: rx and tx ring length prepared for multiple queues
  net: stmmac: prepare rx/tx set tail function for multiple queues
  net: stmmac: dma channel init prepared for multiple queues
  net: stmmac: tso init prepared for multiple queues
  net: stmmac: stmmac interrupt treatment prepared for multiple queues

 drivers/net/ethernet/stmicro/stmmac/common.h   |  31 +-
 .../net/ethernet/stmicro/stmmac/dwmac1000_dma.c|   3 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c   | 192 ++-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h   |  20 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c   |  56 ++--
 drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h|  15 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c|  14 +-
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |   3 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  | 373 -
 9 files changed, 468 insertions(+), 239 deletions(-)

-- 
2.9.3



[PATCH v2 net-next 09/11] net: stmmac: dma channel init prepared for multiple queues

2017-03-14 Thread Joao Pinto
This patch prepares the DMA initialization process for multiple queues.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- dummy_dma_rx_phy was not being initialized

 drivers/net/ethernet/stmicro/stmmac/common.h  |  8 +++
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c  | 66 ++-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 51 ++
 3 files changed, 88 insertions(+), 37 deletions(-)
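
A simplified userspace sketch of the per-channel split, for illustration only. The `toy_*` names are hypothetical, the register model stores values directly instead of OR-ing shifted fields as the driver does, and the caller loop is an assumption. It shows the RX and TX halves being programmed separately per channel, each falling back to the common pbl when no direction-specific value is set:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_CHAN 4

/* Hypothetical stand-in for the pbl fields of struct stmmac_dma_cfg. */
struct toy_dma_cfg { uint32_t pbl, txpbl, rxpbl; };

/* Toy per-channel registers. */
struct toy_chan_regs { uint32_t rx_pbl, tx_pbl, rx_base, tx_base; };

/* Same fallback as "rxpbl ?: pbl" in the patch, written in standard C. */
static uint32_t toy_pick_pbl(uint32_t specific, uint32_t common)
{
	return specific ? specific : common;
}

static void toy_init_rx_chan(struct toy_chan_regs *regs,
			     const struct toy_dma_cfg *cfg,
			     uint32_t dma_rx_phy, uint32_t chan)
{
	assert(chan < TOY_MAX_CHAN);
	regs[chan].rx_pbl = toy_pick_pbl(cfg->rxpbl, cfg->pbl);
	regs[chan].rx_base = dma_rx_phy;
}

static void toy_init_tx_chan(struct toy_chan_regs *regs,
			     const struct toy_dma_cfg *cfg,
			     uint32_t dma_tx_phy, uint32_t chan)
{
	assert(chan < TOY_MAX_CHAN);
	regs[chan].tx_pbl = toy_pick_pbl(cfg->txpbl, cfg->pbl);
	regs[chan].tx_base = dma_tx_phy;
}

/* Assumed caller: each direction is initialized for its own channel count. */
static void toy_init_all(struct toy_chan_regs *regs,
			 const struct toy_dma_cfg *cfg,
			 const uint32_t *rx_phy, uint32_t rx_cnt,
			 const uint32_t *tx_phy, uint32_t tx_cnt)
{
	for (uint32_t chan = 0; chan < rx_cnt; chan++)
		toy_init_rx_chan(regs, cfg, rx_phy[chan], chan);
	for (uint32_t chan = 0; chan < tx_cnt; chan++)
		toy_init_tx_chan(regs, cfg, tx_phy[chan], chan);
}
```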

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index bef1fc6..badc441 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -416,6 +416,14 @@ struct stmmac_dma_ops {
int (*reset)(void __iomem *ioaddr);
void (*init)(void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg,
 u32 dma_tx, u32 dma_rx, int atds);
+   void (*init_chan)(void __iomem *ioaddr,
+ struct stmmac_dma_cfg *dma_cfg, u32 chan);
+   void (*init_rx_chan)(void __iomem *ioaddr,
+struct stmmac_dma_cfg *dma_cfg,
+u32 dma_rx_phy, u32 chan);
+   void (*init_tx_chan)(void __iomem *ioaddr,
+struct stmmac_dma_cfg *dma_cfg,
+u32 dma_tx_phy, u32 chan);
/* Configure the AXI Bus Mode Register */
void (*axi)(void __iomem *ioaddr, struct stmmac_axi *axi);
/* Dump DMA registers */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
index 74177f9..54b3876 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
@@ -71,36 +71,48 @@ static void dwmac4_dma_axi(void __iomem *ioaddr, struct stmmac_axi *axi)
writel(value, ioaddr + DMA_SYS_BUS_MODE);
 }
 
-static void dwmac4_dma_init_channel(void __iomem *ioaddr,
-   struct stmmac_dma_cfg *dma_cfg,
-   u32 dma_tx_phy, u32 dma_rx_phy,
-   u32 channel)
+void dwmac4_dma_init_rx_chan(void __iomem *ioaddr,
+struct stmmac_dma_cfg *dma_cfg,
+u32 dma_rx_phy, u32 chan)
 {
u32 value;
-   int txpbl = dma_cfg->txpbl ?: dma_cfg->pbl;
-   int rxpbl = dma_cfg->rxpbl ?: dma_cfg->pbl;
+   u32 rxpbl = dma_cfg->rxpbl ?: dma_cfg->pbl;
 
-   /* set PBL for each channels. Currently we affect same configuration
-* on each channel
-*/
-   value = readl(ioaddr + DMA_CHAN_CONTROL(channel));
-   if (dma_cfg->pblx8)
-   value = value | DMA_BUS_MODE_PBL;
-   writel(value, ioaddr + DMA_CHAN_CONTROL(channel));
+   value = readl(ioaddr + DMA_CHAN_RX_CONTROL(chan));
+   value = value | (rxpbl << DMA_BUS_MODE_RPBL_SHIFT);
+   writel(value, ioaddr + DMA_CHAN_RX_CONTROL(chan));
+
+   writel(dma_rx_phy, ioaddr + DMA_CHAN_RX_BASE_ADDR(chan));
+}
 
-   value = readl(ioaddr + DMA_CHAN_TX_CONTROL(channel));
+void dwmac4_dma_init_tx_chan(void __iomem *ioaddr,
+struct stmmac_dma_cfg *dma_cfg,
+u32 dma_tx_phy, u32 chan)
+{
+   u32 value;
+   u32 txpbl = dma_cfg->txpbl ?: dma_cfg->pbl;
+
+   value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan));
value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
-   writel(value, ioaddr + DMA_CHAN_TX_CONTROL(channel));
+   writel(value, ioaddr + DMA_CHAN_TX_CONTROL(chan));
 
-   value = readl(ioaddr + DMA_CHAN_RX_CONTROL(channel));
-   value = value | (rxpbl << DMA_BUS_MODE_RPBL_SHIFT);
-   writel(value, ioaddr + DMA_CHAN_RX_CONTROL(channel));
+   writel(dma_tx_phy, ioaddr + DMA_CHAN_TX_BASE_ADDR(chan));
+}
 
-   /* Mask interrupts by writing to CSR7 */
-   writel(DMA_CHAN_INTR_DEFAULT_MASK, ioaddr + DMA_CHAN_INTR_ENA(channel));
+void dwmac4_dma_init_channel(void __iomem *ioaddr,
+struct stmmac_dma_cfg *dma_cfg, u32 chan)
+{
+   u32 value;
+
+   /* common channel control register config */
+   value = readl(ioaddr + DMA_CHAN_CONTROL(chan));
+   if (dma_cfg->pblx8)
+   value = value | DMA_BUS_MODE_PBL;
+   writel(value, ioaddr + DMA_CHAN_CONTROL(chan));
 
-   writel(dma_tx_phy, ioaddr + DMA_CHAN_TX_BASE_ADDR(channel));
-   writel(dma_rx_phy, ioaddr + DMA_CHAN_RX_BASE_ADDR(channel));
+   /* Mask interrupts by writing to CSR7 */
+   writel(DMA_CHAN_INTR_DEFAULT_MASK,
+  ioaddr + DMA_CHAN_INTR_ENA(chan));
 }
 
 static void dwmac4_dma_init(void __iomem *ioaddr,
@@ -108,7 +120,6 @@ static void dwmac4_dma_init(void __iomem *ioaddr,
u32 dma_tx, u32 dma_rx, int atds)
 {
u32 value = readl(ioaddr + DMA_SYS_BUS_MODE);
-   int i;
 
/* Set the Fixed burst mode */
if (dma_cfg->fixed_burst)
@@ -122

[PATCH v2 net-next 04/11] net: stmmac: prepare stmmac_tx_err for multiple queues

2017-03-14 Thread Joao Pinto
This patch prepares stmmac_tx_err for multiple queues.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- Just to keep up the patch-set version

 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 58ad199..9b64f2e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1525,12 +1525,12 @@ static inline void stmmac_disable_dma_irq(struct stmmac_priv *priv, u32 chan)
 /**
  * stmmac_tx_err - to manage the tx error
  * @priv: driver private structure
+ * @chan: channel index
  * Description: it cleans the descriptors and restarts the transmission
  * in case of transmission errors.
  */
-static void stmmac_tx_err(struct stmmac_priv *priv)
+static void stmmac_tx_err(struct stmmac_priv *priv, u32 chan)
 {
-   u32 chan = STMMAC_CHAN0;
int i;
netif_stop_queue(priv->dev);
 
@@ -1616,7 +1616,7 @@ static void stmmac_dma_interrupt(struct stmmac_priv *priv)
priv->xstats.threshold = tc;
}
} else if (unlikely(status == tx_hard_error))
-   stmmac_tx_err(priv);
+   stmmac_tx_err(priv, chan);
 }
 
 /**
@@ -2944,9 +2944,10 @@ static int stmmac_poll(struct napi_struct *napi, int budget)
 static void stmmac_tx_timeout(struct net_device *dev)
 {
struct stmmac_priv *priv = netdev_priv(dev);
+   u32 chan = STMMAC_CHAN0;
 
/* Clear Tx resources and restart transmitting again */
-   stmmac_tx_err(priv);
+   stmmac_tx_err(priv, chan);
 }
 
 /**
-- 
2.9.3



[PATCH v2 net-next 07/11] net: stmmac: rx and tx ring length prepared for multiple queues

2017-03-14 Thread Joao Pinto
This patch prepares tx and rx ring length configuration for multiple queues.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- Just to keep up the patch-set version

 drivers/net/ethernet/stmicro/stmmac/common.h  |  4 +--
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h  |  4 +--
 drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c  |  8 +++---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 32 +--
 4 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 5fa23b1..bef1fc6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -444,8 +444,8 @@ struct stmmac_dma_ops {
   struct dma_features *dma_cap);
/* Program the HW RX Watchdog */
void (*rx_watchdog)(void __iomem *ioaddr, u32 riwt, u32 number_chan);
-   void (*set_tx_ring_len)(void __iomem *ioaddr, u32 len);
-   void (*set_rx_ring_len)(void __iomem *ioaddr, u32 len);
+   void (*set_tx_ring_len)(void __iomem *ioaddr, u32 len, u32 chan);
+   void (*set_rx_ring_len)(void __iomem *ioaddr, u32 len, u32 chan);
void (*set_rx_tail_ptr)(void __iomem *ioaddr, u32 tail_ptr, u32 chan);
void (*set_tx_tail_ptr)(void __iomem *ioaddr, u32 tail_ptr, u32 chan);
void (*enable_tso)(void __iomem *ioaddr, bool en, u32 chan);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
index 946dc14..8474bf9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
@@ -194,8 +194,8 @@ void dwmac4_dma_start_rx(void __iomem *ioaddr, u32 chan);
 void dwmac4_dma_stop_rx(void __iomem *ioaddr, u32 chan);
 int dwmac4_dma_interrupt(void __iomem *ioaddr,
 struct stmmac_extra_stats *x, u32 chan);
-void dwmac4_set_rx_ring_len(void __iomem *ioaddr, u32 len);
-void dwmac4_set_tx_ring_len(void __iomem *ioaddr, u32 len);
+void dwmac4_set_rx_ring_len(void __iomem *ioaddr, u32 len, u32 chan);
+void dwmac4_set_tx_ring_len(void __iomem *ioaddr, u32 len, u32 chan);
 void dwmac4_set_rx_tail_ptr(void __iomem *ioaddr, u32 tail_ptr, u32 chan);
 void dwmac4_set_tx_tail_ptr(void __iomem *ioaddr, u32 tail_ptr, u32 chan);
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
index fcd8ec8..da54c0b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
@@ -94,14 +94,14 @@ void dwmac4_dma_stop_rx(void __iomem *ioaddr, u32 chan)
writel(value, ioaddr + GMAC_CONFIG);
 }
 
-void dwmac4_set_tx_ring_len(void __iomem *ioaddr, u32 len)
+void dwmac4_set_tx_ring_len(void __iomem *ioaddr, u32 len, u32 chan)
 {
-   writel(len, ioaddr + DMA_CHAN_TX_RING_LEN(STMMAC_CHAN0));
+   writel(len, ioaddr + DMA_CHAN_TX_RING_LEN(chan));
 }
 
-void dwmac4_set_rx_ring_len(void __iomem *ioaddr, u32 len)
+void dwmac4_set_rx_ring_len(void __iomem *ioaddr, u32 len, u32 chan)
 {
-   writel(len, ioaddr + DMA_CHAN_RX_RING_LEN(STMMAC_CHAN0));
+   writel(len, ioaddr + DMA_CHAN_RX_RING_LEN(chan));
 }
 
 void dwmac4_enable_dma_irq(void __iomem *ioaddr, u32 chan)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 2af8589..e60e077 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1802,6 +1802,27 @@ static void stmmac_init_tx_coalesce(struct stmmac_priv *priv)
add_timer(&priv->txtimer);
 }
 
+static void stmmac_set_rings_length(struct stmmac_priv *priv)
+{
+   u32 rx_channels_count = priv->plat->rx_queues_to_use;
+   u32 tx_channels_count = priv->plat->tx_queues_to_use;
+   u32 chan;
+
+   /* set TX ring length */
+   if (priv->hw->dma->set_tx_ring_len) {
+   for (chan = 0; chan < tx_channels_count; chan++)
+   priv->hw->dma->set_tx_ring_len(priv->ioaddr,
+  (DMA_TX_SIZE - 1), chan);
+   }
+
+   /* set RX ring length */
+   if (priv->hw->dma->set_rx_ring_len) {
+   for (chan = 0; chan < rx_channels_count; chan++)
+   priv->hw->dma->set_rx_ring_len(priv->ioaddr,
+  (DMA_RX_SIZE - 1), chan);
+   }
+}
+
 /**
  *  stmmac_set_tx_queue_weight - Set TX queue weight
  *  @priv: driver private structure
@@ -1995,14 +2016,9 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
if (priv->hw->pcs && priv->hw->mac->pcs_ctrl_ane)
priv->hw->mac->pcs_ctrl_ane(priv->hw, 1, priv->hw->ps, 0);
 
-   /*  set TX ring length */
-   if (priv->hw->dma->set_tx_ring_len)
-   priv->hw->dma->set_tx_ring_len(priv->ioad

[PATCH v2 net-next 05/11] net: stmmac: prepare dma interrupt treatment for multiple queues

2017-03-14 Thread Joao Pinto
This patch prepares DMA interrupt handling for multiple queues.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- Just to keep up the patch-set version

 drivers/net/ethernet/stmicro/stmmac/common.h  |  2 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h  |  2 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c  |  8 ++--
 drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h   |  3 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c   |  2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 55 +--
 6 files changed, 41 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 042b482..6dfb7f3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -438,7 +438,7 @@ struct stmmac_dma_ops {
void (*start_rx)(void __iomem *ioaddr, u32 chan);
void (*stop_rx)(void __iomem *ioaddr, u32 chan);
int (*dma_interrupt) (void __iomem *ioaddr,
- struct stmmac_extra_stats *x);
+ struct stmmac_extra_stats *x, u32 chan);
/* If supported then get the optional core features */
void (*get_hw_feature)(void __iomem *ioaddr,
   struct dma_features *dma_cap);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
index 2c19042..946dc14 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
@@ -193,7 +193,7 @@ void dwmac4_dma_stop_tx(void __iomem *ioaddr, u32 chan);
 void dwmac4_dma_start_rx(void __iomem *ioaddr, u32 chan);
 void dwmac4_dma_stop_rx(void __iomem *ioaddr, u32 chan);
 int dwmac4_dma_interrupt(void __iomem *ioaddr,
-struct stmmac_extra_stats *x);
+struct stmmac_extra_stats *x, u32 chan);
 void dwmac4_set_rx_ring_len(void __iomem *ioaddr, u32 len);
 void dwmac4_set_tx_ring_len(void __iomem *ioaddr, u32 len);
 void dwmac4_set_rx_tail_ptr(void __iomem *ioaddr, u32 tail_ptr, u32 chan);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
index 3512d18..fcd8ec8 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
@@ -122,11 +122,11 @@ void dwmac4_disable_dma_irq(void __iomem *ioaddr, u32 chan)
 }
 
 int dwmac4_dma_interrupt(void __iomem *ioaddr,
-struct stmmac_extra_stats *x)
+struct stmmac_extra_stats *x, u32 chan)
 {
int ret = 0;
 
-   u32 intr_status = readl(ioaddr + DMA_CHAN_STATUS(0));
+   u32 intr_status = readl(ioaddr + DMA_CHAN_STATUS(chan));
 
/* ABNORMAL interrupts */
if (unlikely(intr_status & DMA_CHAN_STATUS_AIS)) {
@@ -153,7 +153,7 @@ int dwmac4_dma_interrupt(void __iomem *ioaddr,
if (likely(intr_status & DMA_CHAN_STATUS_RI)) {
u32 value;
 
-   value = readl(ioaddr + DMA_CHAN_INTR_ENA(STMMAC_CHAN0));
+   value = readl(ioaddr + DMA_CHAN_INTR_ENA(chan));
/* to schedule NAPI on real RIE event. */
if (likely(value & DMA_CHAN_INTR_ENA_RIE)) {
x->rx_normal_irq_n++;
@@ -172,7 +172,7 @@ int dwmac4_dma_interrupt(void __iomem *ioaddr,
 * status [21-0] expect reserved bits [5-3]
 */
writel((intr_status & 0x3fffc7),
-  ioaddr + DMA_CHAN_STATUS(STMMAC_CHAN0));
+  ioaddr + DMA_CHAN_STATUS(chan));
 
return ret;
 }
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
index 6c6cc71..9091df8 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
@@ -143,7 +143,8 @@ void dwmac_dma_start_tx(void __iomem *ioaddr, u32 chan);
 void dwmac_dma_stop_tx(void __iomem *ioaddr, u32 chan);
 void dwmac_dma_start_rx(void __iomem *ioaddr, u32 chan);
 void dwmac_dma_stop_rx(void __iomem *ioaddr, u32 chan);
-int dwmac_dma_interrupt(void __iomem *ioaddr, struct stmmac_extra_stats *x);
+int dwmac_dma_interrupt(void __iomem *ioaddr, struct stmmac_extra_stats *x,
+   u32 chan);
 int dwmac_dma_reset(void __iomem *ioaddr);
 
 #endif /* __DWMAC_DMA_H__ */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index 7be60c3..38f9430 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -156,7 +156,7 @@ static void show_rx_process_state(unsigned int status)
 #endif
 
 int dwmac_dma_interrupt(void __iomem *ioaddr,
-   struct stmmac_extra_stats *x)
+   struct stmmac_extra_stats *x, u32 chan)
 {
 

[PATCH v2 net-next 06/11] net: stmmac: rx watchdog config prepared for multiple queues

2017-03-14 Thread Joao Pinto
This patch adds rx watchdog configuration for all queues.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- Just to keep up the patch-set version

 drivers/net/ethernet/stmicro/stmmac/common.h | 2 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c  | 3 ++-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c | 8 
 drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 3 ++-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c| 3 ++-
 5 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 6dfb7f3..5fa23b1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -443,7 +443,7 @@ struct stmmac_dma_ops {
void (*get_hw_feature)(void __iomem *ioaddr,
   struct dma_features *dma_cap);
/* Program the HW RX Watchdog */
-   void (*rx_watchdog) (void __iomem *ioaddr, u32 riwt);
+   void (*rx_watchdog)(void __iomem *ioaddr, u32 riwt, u32 number_chan);
void (*set_tx_ring_len)(void __iomem *ioaddr, u32 len);
void (*set_rx_ring_len)(void __iomem *ioaddr, u32 len);
void (*set_rx_tail_ptr)(void __iomem *ioaddr, u32 tail_ptr, u32 chan);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
index d3654a4..471a9aa 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
@@ -247,7 +247,8 @@ static void dwmac1000_get_hw_feature(void __iomem *ioaddr,
dma_cap->enh_desc = (hw_cap & DMA_HW_FEAT_ENHDESSEL) >> 24;
 }
 
-static void dwmac1000_rx_watchdog(void __iomem *ioaddr, u32 riwt)
+static void dwmac1000_rx_watchdog(void __iomem *ioaddr, u32 riwt,
+ u32 number_chan)
 {
writel(riwt, ioaddr + DMA_RX_WATCHDOG);
 }
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
index 6285e8a..74177f9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
@@ -174,12 +174,12 @@ static void dwmac4_dump_dma_regs(void __iomem *ioaddr, u32 *reg_space)
_dwmac4_dump_dma_regs(ioaddr, i, reg_space);
 }
 
-static void dwmac4_rx_watchdog(void __iomem *ioaddr, u32 riwt)
+static void dwmac4_rx_watchdog(void __iomem *ioaddr, u32 riwt, u32 number_chan)
 {
-   int i;
+   u32 chan;
 
-   for (i = 0; i < DMA_CHANNEL_NB_MAX; i++)
-   writel(riwt, ioaddr + DMA_CHAN_RX_WATCHDOG(i));
+   for (chan = 0; chan < number_chan; chan++)
+   writel(riwt, ioaddr + DMA_CHAN_RX_WATCHDOG(chan));
 }
 
 static void dwmac4_dma_rx_chan_op_mode(void __iomem *ioaddr, int mode,
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index 61b9369..16808e4 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -730,6 +730,7 @@ static int stmmac_set_coalesce(struct net_device *dev,
   struct ethtool_coalesce *ec)
 {
struct stmmac_priv *priv = netdev_priv(dev);
+   u32 rx_cnt = priv->plat->rx_queues_to_use;
unsigned int rx_riwt;
 
/* Check not supported parameters  */
@@ -768,7 +769,7 @@ static int stmmac_set_coalesce(struct net_device *dev,
priv->tx_coal_frames = ec->tx_max_coalesced_frames;
priv->tx_coal_timer = ec->tx_coalesce_usecs;
priv->rx_riwt = rx_riwt;
-   priv->hw->dma->rx_watchdog(priv->ioaddr, priv->rx_riwt);
+   priv->hw->dma->rx_watchdog(priv->ioaddr, priv->rx_riwt, rx_cnt);
 
return 0;
 }
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 053a042..2af8589 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1915,6 +1915,7 @@ static void stmmac_mtl_configuration(struct stmmac_priv *priv)
 static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
 {
struct stmmac_priv *priv = netdev_priv(dev);
+   u32 rx_cnt = priv->plat->rx_queues_to_use;
int ret;
 
/* DMA initialization and SW reset */
@@ -1988,7 +1989,7 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
 
if ((priv->use_riwt) && (priv->hw->dma->rx_watchdog)) {
priv->rx_riwt = MAX_DMA_RIWT;
-   priv->hw->dma->rx_watchdog(priv->ioaddr, MAX_DMA_RIWT);
+   priv->hw->dma->rx_watchdog(priv->ioaddr, MAX_DMA_RIWT, rx_cnt);
}
 
if (priv->hw->pcs && priv->hw->mac->pcs_ctrl_ane)
-- 
2.9.3
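
For illustration, the bounded per-channel loop that this patch introduces
in dwmac4_rx_watchdog() can be modelled outside the kernel. What follows is
a minimal sketch, assuming a plain array in place of the MMIO registers;
the sketch_* names and SKETCH_MAX_CHAN are hypothetical and do not appear
in the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_CHAN 8u	/* hypothetical upper bound, not a driver value */

/* Stand-in for the per-channel RX watchdog registers (no real MMIO here). */
static uint32_t sketch_rx_wdt[SKETCH_MAX_CHAN];

/*
 * Program the same watchdog value into the first number_chan channels only,
 * leaving channels that are not in use untouched.
 */
static void sketch_rx_watchdog(uint32_t riwt, uint32_t number_chan)
{
	uint32_t chan;

	assert(number_chan <= SKETCH_MAX_CHAN);
	for (chan = 0; chan < number_chan; chan++)
		sketch_rx_wdt[chan] = riwt;
}
```

Calling sketch_rx_watchdog(0xff, 3) writes entries 0 to 2 and leaves the
remaining entries at their initial value, which mirrors passing
rx_queues_to_use as number_chan instead of looping up to a fixed maximum.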



[PATCH v2 net-next 08/11] net: stmmac: prepare rx/tx set tail function for multiple queues

2017-03-14 Thread Joao Pinto
This patch prepares RX and TX set tail functions for multiple queues.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- Just to keep up the patch-set version

 drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
index da54c0b..49f5687 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
@@ -37,12 +37,12 @@ int dwmac4_dma_reset(void __iomem *ioaddr)
 
 void dwmac4_set_rx_tail_ptr(void __iomem *ioaddr, u32 tail_ptr, u32 chan)
 {
-   writel(tail_ptr, ioaddr + DMA_CHAN_RX_END_ADDR(0));
+   writel(tail_ptr, ioaddr + DMA_CHAN_RX_END_ADDR(chan));
 }
 
 void dwmac4_set_tx_tail_ptr(void __iomem *ioaddr, u32 tail_ptr, u32 chan)
 {
-   writel(tail_ptr, ioaddr + DMA_CHAN_TX_END_ADDR(0));
+   writel(tail_ptr, ioaddr + DMA_CHAN_TX_END_ADDR(chan));
 }
 
 void dwmac4_dma_start_tx(void __iomem *ioaddr, u32 chan)
-- 
2.9.3
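
The change above replaces a hard-coded channel 0 with the chan argument in
the per-channel register macro. As a rough sketch of why that matters,
assuming each channel owns a fixed-size register window, the address
computation could look like the following. The base, stride and offset
values are placeholders chosen for the example, not the values used by
DMA_CHAN_RX_END_ADDR(), and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout only: placeholder numbers, not driver values. */
#define SKETCH_CHAN_BASE	0x1000u
#define SKETCH_CHAN_STRIDE	0x80u
#define SKETCH_RX_END_OFF	0x28u
#define SKETCH_NUM_CHAN		4u

/* Offset of the RX end-address register for a given channel. */
static uint32_t sketch_rx_end_addr(uint32_t chan)
{
	assert(chan < SKETCH_NUM_CHAN);
	return SKETCH_CHAN_BASE + chan * SKETCH_CHAN_STRIDE + SKETCH_RX_END_OFF;
}
```

With a constant 0 in place of chan, every caller would land on the same
register regardless of the channel it passes in, which is what the two
one-line fixes in this patch avoid.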



[PATCH v2 net-next 02/11] net: stmmac: enable/disable dma irq prepared for multiple queues

2017-03-14 Thread Joao Pinto
This patch prepares the DMA IRQ enable/disable process for multiple queues.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- Just to keep up the patch-set version

 drivers/net/ethernet/stmicro/stmmac/common.h  |  4 ++--
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h  |  6 +++---
 drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c  | 12 ++--
 drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h   |  4 ++--
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c   |  4 ++--
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++--
 6 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 13bd3d4..0351b54 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -431,8 +431,8 @@ struct stmmac_dma_ops {
void (*dma_diagnostic_fr) (void *data, struct stmmac_extra_stats *x,
   void __iomem *ioaddr);
void (*enable_dma_transmission) (void __iomem *ioaddr);
-   void (*enable_dma_irq) (void __iomem *ioaddr);
-   void (*disable_dma_irq) (void __iomem *ioaddr);
+   void (*enable_dma_irq)(void __iomem *ioaddr, u32 chan);
+   void (*disable_dma_irq)(void __iomem *ioaddr, u32 chan);
void (*start_tx) (void __iomem *ioaddr);
void (*stop_tx) (void __iomem *ioaddr);
void (*start_rx) (void __iomem *ioaddr);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
index 1b06df7..393a657 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
@@ -185,9 +185,9 @@
 
 int dwmac4_dma_reset(void __iomem *ioaddr);
 void dwmac4_enable_dma_transmission(void __iomem *ioaddr, u32 tail_ptr);
-void dwmac4_enable_dma_irq(void __iomem *ioaddr);
-void dwmac410_enable_dma_irq(void __iomem *ioaddr);
-void dwmac4_disable_dma_irq(void __iomem *ioaddr);
+void dwmac4_enable_dma_irq(void __iomem *ioaddr, u32 chan);
+void dwmac410_enable_dma_irq(void __iomem *ioaddr, u32 chan);
+void dwmac4_disable_dma_irq(void __iomem *ioaddr, u32 chan);
 void dwmac4_dma_start_tx(void __iomem *ioaddr);
 void dwmac4_dma_stop_tx(void __iomem *ioaddr);
 void dwmac4_dma_start_rx(void __iomem *ioaddr);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
index c7326d5..c932791 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
@@ -104,21 +104,21 @@ void dwmac4_set_rx_ring_len(void __iomem *ioaddr, u32 len)
writel(len, ioaddr + DMA_CHAN_RX_RING_LEN(STMMAC_CHAN0));
 }
 
-void dwmac4_enable_dma_irq(void __iomem *ioaddr)
+void dwmac4_enable_dma_irq(void __iomem *ioaddr, u32 chan)
 {
writel(DMA_CHAN_INTR_DEFAULT_MASK, ioaddr +
-  DMA_CHAN_INTR_ENA(STMMAC_CHAN0));
+  DMA_CHAN_INTR_ENA(chan));
 }
 
-void dwmac410_enable_dma_irq(void __iomem *ioaddr)
+void dwmac410_enable_dma_irq(void __iomem *ioaddr, u32 chan)
 {
writel(DMA_CHAN_INTR_DEFAULT_MASK_4_10,
-  ioaddr + DMA_CHAN_INTR_ENA(STMMAC_CHAN0));
+  ioaddr + DMA_CHAN_INTR_ENA(chan));
 }
 
-void dwmac4_disable_dma_irq(void __iomem *ioaddr)
+void dwmac4_disable_dma_irq(void __iomem *ioaddr, u32 chan)
 {
-   writel(0, ioaddr + DMA_CHAN_INTR_ENA(STMMAC_CHAN0));
+   writel(0, ioaddr + DMA_CHAN_INTR_ENA(chan));
 }
 
 int dwmac4_dma_interrupt(void __iomem *ioaddr,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
index 56e485f..dec0816 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
@@ -137,8 +137,8 @@
 #define DMA_CONTROL_FTF0x0010  /* Flush transmit FIFO */
 
 void dwmac_enable_dma_transmission(void __iomem *ioaddr);
-void dwmac_enable_dma_irq(void __iomem *ioaddr);
-void dwmac_disable_dma_irq(void __iomem *ioaddr);
+void dwmac_enable_dma_irq(void __iomem *ioaddr, u32 chan);
+void dwmac_disable_dma_irq(void __iomem *ioaddr, u32 chan);
 void dwmac_dma_start_tx(void __iomem *ioaddr);
 void dwmac_dma_stop_tx(void __iomem *ioaddr);
 void dwmac_dma_start_rx(void __iomem *ioaddr);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index e60bfca..285cfc9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -47,12 +47,12 @@ void dwmac_enable_dma_transmission(void __iomem *ioaddr)
writel(1, ioaddr + DMA_XMT_POLL_DEMAND);
 }
 
-void dwmac_enable_dma_irq(void __iomem *ioaddr)
+void dwmac_enable_dma_irq(void __iomem *ioaddr, u32 chan)
 {
writel(DMA_INTR_DEFAULT_MASK, ioaddr + DMA_INTR_ENA);
 }
 
-void dwmac_disable_dma_irq(void __iomem *ioaddr)
+void dwmac_disable_dma_irq(void

[PATCH v2 net-next 03/11] net: stmmac: rx/tx dma start/stop prepared for multiple queues

2017-03-14 Thread Joao Pinto
This patch prepares the RX/TX DMA stop/start process for multiple queues.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- Just to keep up the patch-set version

 drivers/net/ethernet/stmicro/stmmac/common.h  |   8 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h  |   8 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c  |  24 ++---
 drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h   |   8 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c   |   8 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 108 +++---
 6 files changed, 125 insertions(+), 39 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 0351b54..042b482 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -433,10 +433,10 @@ struct stmmac_dma_ops {
void (*enable_dma_transmission) (void __iomem *ioaddr);
void (*enable_dma_irq)(void __iomem *ioaddr, u32 chan);
void (*disable_dma_irq)(void __iomem *ioaddr, u32 chan);
-   void (*start_tx) (void __iomem *ioaddr);
-   void (*stop_tx) (void __iomem *ioaddr);
-   void (*start_rx) (void __iomem *ioaddr);
-   void (*stop_rx) (void __iomem *ioaddr);
+   void (*start_tx)(void __iomem *ioaddr, u32 chan);
+   void (*stop_tx)(void __iomem *ioaddr, u32 chan);
+   void (*start_rx)(void __iomem *ioaddr, u32 chan);
+   void (*stop_rx)(void __iomem *ioaddr, u32 chan);
int (*dma_interrupt) (void __iomem *ioaddr,
  struct stmmac_extra_stats *x);
/* If supported then get the optional core features */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
index 393a657..2c19042 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
@@ -188,10 +188,10 @@ void dwmac4_enable_dma_transmission(void __iomem *ioaddr, u32 tail_ptr);
 void dwmac4_enable_dma_irq(void __iomem *ioaddr, u32 chan);
 void dwmac410_enable_dma_irq(void __iomem *ioaddr, u32 chan);
 void dwmac4_disable_dma_irq(void __iomem *ioaddr, u32 chan);
-void dwmac4_dma_start_tx(void __iomem *ioaddr);
-void dwmac4_dma_stop_tx(void __iomem *ioaddr);
-void dwmac4_dma_start_rx(void __iomem *ioaddr);
-void dwmac4_dma_stop_rx(void __iomem *ioaddr);
+void dwmac4_dma_start_tx(void __iomem *ioaddr, u32 chan);
+void dwmac4_dma_stop_tx(void __iomem *ioaddr, u32 chan);
+void dwmac4_dma_start_rx(void __iomem *ioaddr, u32 chan);
+void dwmac4_dma_stop_rx(void __iomem *ioaddr, u32 chan);
 int dwmac4_dma_interrupt(void __iomem *ioaddr,
 struct stmmac_extra_stats *x);
 void dwmac4_set_rx_ring_len(void __iomem *ioaddr, u32 len);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
index c932791..3512d18 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
@@ -45,49 +45,49 @@ void dwmac4_set_tx_tail_ptr(void __iomem *ioaddr, u32 tail_ptr, u32 chan)
writel(tail_ptr, ioaddr + DMA_CHAN_TX_END_ADDR(0));
 }
 
-void dwmac4_dma_start_tx(void __iomem *ioaddr)
+void dwmac4_dma_start_tx(void __iomem *ioaddr, u32 chan)
 {
-   u32 value = readl(ioaddr + DMA_CHAN_TX_CONTROL(STMMAC_CHAN0));
+   u32 value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan));
 
value |= DMA_CONTROL_ST;
-   writel(value, ioaddr + DMA_CHAN_TX_CONTROL(STMMAC_CHAN0));
+   writel(value, ioaddr + DMA_CHAN_TX_CONTROL(chan));
 
value = readl(ioaddr + GMAC_CONFIG);
value |= GMAC_CONFIG_TE;
writel(value, ioaddr + GMAC_CONFIG);
 }
 
-void dwmac4_dma_stop_tx(void __iomem *ioaddr)
+void dwmac4_dma_stop_tx(void __iomem *ioaddr, u32 chan)
 {
-   u32 value = readl(ioaddr + DMA_CHAN_TX_CONTROL(STMMAC_CHAN0));
+   u32 value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan));
 
value &= ~DMA_CONTROL_ST;
-   writel(value, ioaddr + DMA_CHAN_TX_CONTROL(STMMAC_CHAN0));
+   writel(value, ioaddr + DMA_CHAN_TX_CONTROL(chan));
 
value = readl(ioaddr + GMAC_CONFIG);
value &= ~GMAC_CONFIG_TE;
writel(value, ioaddr + GMAC_CONFIG);
 }
 
-void dwmac4_dma_start_rx(void __iomem *ioaddr)
+void dwmac4_dma_start_rx(void __iomem *ioaddr, u32 chan)
 {
-   u32 value = readl(ioaddr + DMA_CHAN_RX_CONTROL(STMMAC_CHAN0));
+   u32 value = readl(ioaddr + DMA_CHAN_RX_CONTROL(chan));
 
value |= DMA_CONTROL_SR;
 
-   writel(value, ioaddr + DMA_CHAN_RX_CONTROL(STMMAC_CHAN0));
+   writel(value, ioaddr + DMA_CHAN_RX_CONTROL(chan));
 
value = readl(ioaddr + GMAC_CONFIG);
value |= GMAC_CONFIG_RE;
writel(value, ioaddr + GMAC_CONFIG);
 }
 
-void dwmac4_dma_stop_rx(void __iomem *ioaddr)
+void dwmac4_dma_stop_rx(void __iomem *ioaddr, u32 chan)
 {
-   u32 value = readl(ioaddr + DMA_CHAN_RX_CONTROL(STMMAC_CHAN0)

[PATCH v2 net-next 10/11] net: stmmac: tso init prepared for multiple queues

2017-03-14 Thread Joao Pinto
This patch configures TSO for all available tx queues.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- Just to keep up the patch-set version

 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 725fe3e..fccd3f7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1966,6 +1966,8 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
 {
struct stmmac_priv *priv = netdev_priv(dev);
u32 rx_cnt = priv->plat->rx_queues_to_use;
+   u32 tx_cnt = priv->plat->tx_queues_to_use;
+   u32 chan;
int ret;
 
/* DMA initialization and SW reset */
@@ -2049,8 +2051,10 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
stmmac_set_rings_length(priv);
 
/* Enable TSO */
-   if (priv->tso)
-   priv->hw->dma->enable_tso(priv->ioaddr, 1, STMMAC_CHAN0);
+   if (priv->tso) {
+   for (chan = 0; chan < tx_cnt; chan++)
+   priv->hw->dma->enable_tso(priv->ioaddr, 1, chan);
+   }
 
return 0;
 }
-- 
2.9.3



[PATCH v2 net-next 11/11] net: stmmac: stmmac interrupt treatment prepared for multiple queues

2017-03-14 Thread Joao Pinto
This patch prepares the main ISR for multiple queues.

Signed-off-by: Joao Pinto 
---
changes v1->v2:
- Just to keep up the patch-set version

 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 28 ---
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index fccd3f7..33de7c4 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3115,6 +3115,12 @@ static irqreturn_t stmmac_interrupt(int irq, void *dev_id)
 {
struct net_device *dev = (struct net_device *)dev_id;
struct stmmac_priv *priv = netdev_priv(dev);
+   u32 rx_cnt = priv->plat->rx_queues_to_use;
+   u32 tx_cnt = priv->plat->tx_queues_to_use;
+   u32 queues_count;
+   u32 queue;
+
+   queues_count = (rx_cnt > tx_cnt) ? rx_cnt : tx_cnt;
 
if (priv->irq_wake)
pm_wakeup_event(priv->device, 0);
@@ -3129,20 +3135,26 @@ static irqreturn_t stmmac_interrupt(int irq, void *dev_id)
int status = priv->hw->mac->host_irq_status(priv->hw,
&priv->xstats);
 
-   if (priv->synopsys_id >= DWMAC_CORE_4_00)
-   status |= priv->hw->mac->host_mtl_irq_status(priv->hw,
-   STMMAC_CHAN0);
-
if (unlikely(status)) {
/* For LPI we need to save the tx status */
if (status & CORE_IRQ_TX_PATH_IN_LPI_MODE)
priv->tx_path_in_lpi_mode = true;
if (status & CORE_IRQ_TX_PATH_EXIT_LPI_MODE)
priv->tx_path_in_lpi_mode = false;
-   if (status & CORE_IRQ_MTL_RX_OVERFLOW && priv->hw->dma->set_rx_tail_ptr)
-   priv->hw->dma->set_rx_tail_ptr(priv->ioaddr,
-   priv->rx_tail_addr,
-   STMMAC_CHAN0);
+   }
+
+   if (priv->synopsys_id >= DWMAC_CORE_4_00) {
+   for (queue = 0; queue < queues_count; queue++) {
+   status |=
+   priv->hw->mac->host_mtl_irq_status(priv->hw,
+  queue);
+
+   if (status & CORE_IRQ_MTL_RX_OVERFLOW &&
+   priv->hw->dma->set_rx_tail_ptr)
+   priv->hw->dma->set_rx_tail_ptr(priv->ioaddr,
+   priv->rx_tail_addr,
+   queue);
+   }
}
 
/* PCS link status */
-- 
2.9.3
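
The ISR change above walks max(rx_cnt, tx_cnt) queues and polls the MTL
status of each one. A minimal user-space sketch of that iteration follows,
assuming arrays in place of hardware state. Note one deliberate difference:
the hunk above tests the accumulated status word, whereas this sketch tests
each queue's own status before refreshing its tail pointer; that is a
choice made for the illustration, not a description of the patch. All
sketch_* and SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_IRQ_RX_OVERFLOW	0x1u	/* hypothetical status bit */
#define SKETCH_MAX_QUEUES	8u	/* hypothetical upper bound */

/* Fake per-queue MTL status, standing in for host_mtl_irq_status(). */
static uint32_t sketch_mtl_status[SKETCH_MAX_QUEUES];
/* Records which queues had their RX tail pointer refreshed. */
static uint32_t sketch_tail_refreshed[SKETCH_MAX_QUEUES];

/* Number of queues to visit: the larger of the RX and TX counts. */
static uint32_t sketch_queues_count(uint32_t rx_cnt, uint32_t tx_cnt)
{
	return (rx_cnt > tx_cnt) ? rx_cnt : tx_cnt;
}

/* Visit every queue, merge the status bits, react to per-queue overflow. */
static uint32_t sketch_service_queues(uint32_t rx_cnt, uint32_t tx_cnt)
{
	uint32_t n = sketch_queues_count(rx_cnt, tx_cnt);
	uint32_t status = 0;
	uint32_t queue;

	assert(n <= SKETCH_MAX_QUEUES);
	for (queue = 0; queue < n; queue++) {
		uint32_t qstatus = sketch_mtl_status[queue];

		status |= qstatus;
		/* Only the queue that reported the overflow is refreshed. */
		if (qstatus & SKETCH_IRQ_RX_OVERFLOW)
			sketch_tail_refreshed[queue] = 1;
	}
	return status;
}
```

With an overflow flagged only on queue 1 and counts of 2 RX and 4 TX
queues, the loop visits four queues, returns the merged status, and marks
queue 1 alone as refreshed.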



Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-14 Thread Herbert Xu
On Tue, Mar 14, 2017 at 10:44:10AM +0100, Dmitry Vyukov wrote:
>
> Yes, please.
> Disregarding some reports is not a good way long term.

Please try this patch.

---8<---
Subject: netlink: Annotate nlk cb_mutex by protocol

Currently all occurrences of nlk->cb_mutex are annotated by lockdep
as a single class.  This causes a false lockdep cycle involving
genl and crypto_user.

This patch fixes it by dividing cb_mutex into individual classes
based on the netlink protocol.  As genl and crypto_user do not
use the same netlink protocol this breaks the false dependency
loop.

Reported-by: Dmitry Vyukov 
Signed-off-by: Herbert Xu 

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 7b73c7c..596eaff 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -96,6 +96,44 @@ static inline int netlink_is_kernel(struct sock *sk)
 
 static DECLARE_WAIT_QUEUE_HEAD(nl_table_wait);
 
+static struct lock_class_key nlk_cb_mutex_keys[MAX_LINKS];
+
+static const char *const nlk_cb_mutex_key_strings[MAX_LINKS + 1] = {
+   "nlk_cb_mutex-ROUTE",
+   "nlk_cb_mutex-1",
+   "nlk_cb_mutex-USERSOCK",
+   "nlk_cb_mutex-FIREWALL",
+   "nlk_cb_mutex-SOCK_DIAG",
+   "nlk_cb_mutex-NFLOG",
+   "nlk_cb_mutex-XFRM",
+   "nlk_cb_mutex-SELINUX",
+   "nlk_cb_mutex-ISCSI",
+   "nlk_cb_mutex-AUDIT",
+   "nlk_cb_mutex-FIB_LOOKUP",
+   "nlk_cb_mutex-CONNECTOR",
+   "nlk_cb_mutex-NETFILTER",
+   "nlk_cb_mutex-IP6_FW",
+   "nlk_cb_mutex-DNRTMSG",
+   "nlk_cb_mutex-KOBJECT_UEVENT",
+   "nlk_cb_mutex-GENERIC",
+   "nlk_cb_mutex-17",
+   "nlk_cb_mutex-SCSITRANSPORT",
+   "nlk_cb_mutex-ECRYPTFS",
+   "nlk_cb_mutex-RDMA",
+   "nlk_cb_mutex-CRYPTO",
+   "nlk_cb_mutex-SMC",
+   "nlk_cb_mutex-23",
+   "nlk_cb_mutex-24",
+   "nlk_cb_mutex-25",
+   "nlk_cb_mutex-26",
+   "nlk_cb_mutex-27",
+   "nlk_cb_mutex-28",
+   "nlk_cb_mutex-29",
+   "nlk_cb_mutex-30",
+   "nlk_cb_mutex-31",
+   "nlk_cb_mutex-MAX_LINKS"
+};
+
 static int netlink_dump(struct sock *sk);
 static void netlink_skb_destructor(struct sk_buff *skb);
 
@@ -585,6 +623,9 @@ static int __netlink_create(struct net *net, struct socket 
*sock,
} else {
nlk->cb_mutex = &nlk->cb_def_mutex;
mutex_init(nlk->cb_mutex);
+   lockdep_set_class_and_name(nlk->cb_mutex,
+  nlk_cb_mutex_keys + protocol,
+  nlk_cb_mutex_key_strings[protocol]);
}
init_waitqueue_head(&nlk->wait);
 
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
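
A minimal user-space sketch of the idea behind the patch above, not kernel
code: locks that share a key are treated as one class, so giving each netlink
protocol its own key keeps genl (index 16 in the name table above) and
crypto_user (index 21) apart. All toy_* names are hypothetical stand-ins for
struct lock_class_key, the cb mutex and lockdep_set_class_and_name(); MAX_LINKS
is assumed to be 32, which is consistent with the 33-entry name table.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LINKS 32	/* assumption: matches the 33-entry name table */

/* Hypothetical stand-in for the kernel's struct lock_class_key. */
struct toy_class_key {
	char unused;
};

/* Hypothetical stand-in for a netlink socket's callback mutex. */
struct toy_cb_mutex {
	const struct toy_class_key *class_key;
};

/* One key per netlink protocol, mirroring nlk_cb_mutex_keys[]. */
static struct toy_class_key toy_keys[MAX_LINKS];

/* Tag a mutex with the class that belongs to its protocol. */
static void toy_set_class(struct toy_cb_mutex *m, int protocol)
{
	assert(protocol >= 0 && protocol < MAX_LINKS);
	m->class_key = toy_keys + protocol;
}

/* Two mutexes count as the same class only if their keys match. */
static int toy_same_class(const struct toy_cb_mutex *a,
			  const struct toy_cb_mutex *b)
{
	return a->class_key == b->class_key;
}
```

With this layout two genl sockets still share a class, while a genl socket and
a crypto_user socket no longer do, which is what removes the false dependency.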


Re: [PATCH] net: mpls: Fix nexthop alive tracking on down events

2017-03-14 Thread Robert Shearman

On 13/03/17 23:49, David Ahern wrote:

Alive tracking of nexthops can account for a link twice if the carrier
goes down followed by an admin down of the same link rendering multipath
routes useless. This is similar to 79099aab38c8 for UNREGISTER events and
DOWN events.

Fix by tracking number of alive nexthops in mpls_ifdown similar to the
logic in mpls_ifup. Checking the flags per nexthop once after all events
have been processed is simpler than trying to maintain a running count
through all event combinations.

Also, WRITE_ONCE is used instead of ACCESS_ONCE to set rt_nhn_alive
per a comment from checkpatch:
WARNING: Prefer WRITE_ONCE(<FOO>, <BAR>) over ACCESS_ONCE(<FOO>) = <BAR>

Fixes: c89359a42e2a4 ("mpls: support for dead routes")
Signed-off-by: David Ahern 


Acked-by: Robert Shearman 


Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-14 Thread Dmitry Vyukov
On Tue, Mar 14, 2017 at 11:25 AM, Herbert Xu
 wrote:
> On Tue, Mar 14, 2017 at 10:44:10AM +0100, Dmitry Vyukov wrote:
>>
>> Yes, please.
>> Disregarding some reports is not a good way long term.
>
> Please try this patch.

Applied on bots. I should have a conclusion within a day.
Thanks!


> ---8<---
> Subject: netlink: Annotate nlk cb_mutex by protocol
>
> Currently all occurrences of nlk->cb_mutex are annotated by lockdep
> as a single class.  This causes a false lockdep cycle involving
> genl and crypto_user.
>
> This patch fixes it by dividing cb_mutex into individual classes
> based on the netlink protocol.  As genl and crypto_user do not
> use the same netlink protocol this breaks the false dependency
> loop.
>
> Reported-by: Dmitry Vyukov 
> Signed-off-by: Herbert Xu 
>
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 7b73c7c..596eaff 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -96,6 +96,44 @@ static inline int netlink_is_kernel(struct sock *sk)
>
>  static DECLARE_WAIT_QUEUE_HEAD(nl_table_wait);
>
> +static struct lock_class_key nlk_cb_mutex_keys[MAX_LINKS];
> +
> +static const char *const nlk_cb_mutex_key_strings[MAX_LINKS + 1] = {
> +   "nlk_cb_mutex-ROUTE",
> +   "nlk_cb_mutex-1",
> +   "nlk_cb_mutex-USERSOCK",
> +   "nlk_cb_mutex-FIREWALL",
> +   "nlk_cb_mutex-SOCK_DIAG",
> +   "nlk_cb_mutex-NFLOG",
> +   "nlk_cb_mutex-XFRM",
> +   "nlk_cb_mutex-SELINUX",
> +   "nlk_cb_mutex-ISCSI",
> +   "nlk_cb_mutex-AUDIT",
> +   "nlk_cb_mutex-FIB_LOOKUP",
> +   "nlk_cb_mutex-CONNECTOR",
> +   "nlk_cb_mutex-NETFILTER",
> +   "nlk_cb_mutex-IP6_FW",
> +   "nlk_cb_mutex-DNRTMSG",
> +   "nlk_cb_mutex-KOBJECT_UEVENT",
> +   "nlk_cb_mutex-GENERIC",
> +   "nlk_cb_mutex-17",
> +   "nlk_cb_mutex-SCSITRANSPORT",
> +   "nlk_cb_mutex-ECRYPTFS",
> +   "nlk_cb_mutex-RDMA",
> +   "nlk_cb_mutex-CRYPTO",
> +   "nlk_cb_mutex-SMC",
> +   "nlk_cb_mutex-23",
> +   "nlk_cb_mutex-24",
> +   "nlk_cb_mutex-25",
> +   "nlk_cb_mutex-26",
> +   "nlk_cb_mutex-27",
> +   "nlk_cb_mutex-28",
> +   "nlk_cb_mutex-29",
> +   "nlk_cb_mutex-30",
> +   "nlk_cb_mutex-31",
> +   "nlk_cb_mutex-MAX_LINKS"
> +};
> +
>  static int netlink_dump(struct sock *sk);
>  static void netlink_skb_destructor(struct sk_buff *skb);
>
> @@ -585,6 +623,9 @@ static int __netlink_create(struct net *net, struct 
> socket *sock,
> } else {
> nlk->cb_mutex = &nlk->cb_def_mutex;
> mutex_init(nlk->cb_mutex);
> +   lockdep_set_class_and_name(nlk->cb_mutex,
> +  nlk_cb_mutex_keys + protocol,
> +  
> nlk_cb_mutex_key_strings[protocol]);
> }
> init_waitqueue_head(&nlk->wait);
>
> --
> Email: Herbert Xu 
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH v5] Enable tx timestamping on loopback and dummy

2017-03-14 Thread Ezequiel Lara Gomez

From aafd3312170ed658571ef443675036f96114c2d7 Mon Sep 17 00:00:00 2001
From: Ezequiel Lara Gomez 
Date: Sat, 11 Mar 2017 20:06:54 +
Subject: [PATCH v5] Enable tx timestamping on loopback and dummy

This enables developing code that uses SOF_TIMESTAMPING_TX_SOFTWARE
by using localhost addresses (without needing to send packets outside),
as well as enabling unit and functional testing of TX timestamping code
without needing hardware support or network access.

It also fulfills the expectation of software network devices supporting
software-based timestamping.

Tested on qemu using txtimestamping.c from the kernel selftests, and
ethtool -T.

Signed-off-by: Ezequiel Lara Gomez 
---
Changes:
* v2: split styling changes suggested by checkpatch.pl to a separate patch
* v3: added ethtool reporting of tx timestamping support
* v4: clarified on the commit message the reason to introduce this.
* v5: proper signoff of the change. Apologies!

 drivers/net/dummy.c| 15 +++
 drivers/net/loopback.c | 15 +++
 2 files changed, 30 insertions(+)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 2c80611..149244a 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -125,6 +126,7 @@ static netdev_tx_t dummy_xmit(struct sk_buff *skb, struct 
net_device *dev)
dstats->tx_bytes += skb->len;
u64_stats_update_end(&dstats->syncp);
 
+   skb_tx_timestamp(skb);
dev_kfree_skb(skb);
return NETDEV_TX_OK;
 }
@@ -304,8 +306,21 @@ static void dummy_get_drvinfo(struct net_device *dev,
strlcpy(info->version, DRV_VERSION, sizeof(info->version));
 }
 
+static int dummy_get_ts_info(struct net_device *dev,
+ struct ethtool_ts_info *ts_info)
+{
+   ts_info->so_timestamping = SOF_TIMESTAMPING_TX_SOFTWARE |
+  SOF_TIMESTAMPING_RX_SOFTWARE |
+  SOF_TIMESTAMPING_SOFTWARE;
+
+   ts_info->phc_index = -1;
+
+   return 0;
+};
+
 static const struct ethtool_ops dummy_ethtool_ops = {
.get_drvinfo= dummy_get_drvinfo,
+   .get_ts_info= dummy_get_ts_info,
 };
 
 static void dummy_free_netdev(struct net_device *dev)
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 122cc2d..3a60d27 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -55,6 +55,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -74,6 +75,7 @@ static netdev_tx_t loopback_xmit(struct sk_buff *skb,
struct pcpu_lstats *lb_stats;
int len;
 
+   skb_tx_timestamp(skb);
skb_orphan(skb);
 
/* Before queueing this packet to netif_rx(),
@@ -129,8 +131,21 @@ static u32 always_on(struct net_device *dev)
return 1;
 }
 
+static int loopback_get_ts_info(struct net_device *netdev,
+   struct ethtool_ts_info *ts_info)
+{
+   ts_info->so_timestamping = SOF_TIMESTAMPING_TX_SOFTWARE |
+  SOF_TIMESTAMPING_RX_SOFTWARE |
+  SOF_TIMESTAMPING_SOFTWARE;
+
+   ts_info->phc_index = -1;
+
+   return 0;
+};
+
 static const struct ethtool_ops loopback_ethtool_ops = {
.get_link   = always_on,
+   .get_ts_info= loopback_get_ts_info,
 };
 
 static int loopback_dev_init(struct net_device *dev)
-- 
1.9.1

Amazon Data Services Ireland Limited registered office: One Burlington Plaza, 
Burlington Road, Dublin 4, Ireland. Registered in Ireland. Registration number 
390566.
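
A small user-space sketch of how an application might interpret the
capability mask that the two get_ts_info callbacks in the patch above report.
The TOY_* constants are local copies of the three SOF_TIMESTAMPING_* bits used
in the patch (bit positions as in linux/net_tstamp.h), and the helper names
are hypothetical; nothing here talks to a real device.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the SOF_TIMESTAMPING_* bits the patch sets. */
#define TOY_SOF_TX_SOFTWARE	(1u << 1)
#define TOY_SOF_RX_SOFTWARE	(1u << 3)
#define TOY_SOF_SOFTWARE	(1u << 4)

/* The mask both dummy_get_ts_info() and loopback_get_ts_info() report. */
static uint32_t toy_sw_ts_caps(void)
{
	return TOY_SOF_TX_SOFTWARE | TOY_SOF_RX_SOFTWARE | TOY_SOF_SOFTWARE;
}

/* True if software TX timestamps can be both generated and reported. */
static int toy_can_sw_tx_timestamp(uint32_t so_timestamping)
{
	uint32_t need = TOY_SOF_TX_SOFTWARE | TOY_SOF_SOFTWARE;

	return (so_timestamping & need) == need;
}
```

A test harness would pass the so_timestamping value obtained from the device
to toy_can_sw_tx_timestamp() before enabling SOF_TIMESTAMPING_TX_SOFTWARE on
a socket.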



RE: [PATCH net-next 09/12] net: bcmgenet: return EOPNOTSUPP for unknown ioctl commands

2017-03-14 Thread David Laight
From: Doug Berger
> Sent: 14 March 2017 00:42
> This commit changes the ioctl handling behavior to return the
> EOPNOTSUPP error code instead of the EINVAL error code when an
> unknown ioctl command value is detected.
> 
> It also removes some redundant parsing of the ioctl command value
> and allows the SIOCSHWTSTAMP value to be handled.

A better description would seem to be:
Remove checks on ioctl command and just forward all ioctl requests
to phy_mii_ioctl().

I also thought the 'generic' response to an unknown ioctl command
was ENOTTY.

David



Re: [PATCH net-next] net: dsa: mv88e6xxx: debug ATU Age Time

2017-03-14 Thread Matthias May
On 13/03/17 23:58, Andrew Lunn wrote:
> On Mon, Mar 13, 2017 at 03:42:36PM -0700, Florian Fainelli wrote:
>> On 03/13/2017 03:39 PM, Andrew Lunn wrote:
>>> On Mon, Mar 13, 2017 at 03:20:43PM -0400, Vivien Didelot wrote:
>>>> The ATU ageing time value programmed in the switch is rounded up to the
>>>> nearest multiple of its coefficient (variable depending on the model.)
>>>>
>>>> Add a debug message to inform the user about the exact programmed value.
>>>>
>>>> On 6352, "brctl setageing br0 18" gives "AgeTime set to 0x01 (15000 ms)"
>>>> while on 6390 we get "AgeTime set to 0x05 (18750 ms)".
>>>>
>>>> Signed-off-by: Vivien Didelot
>>>> ---
>>>>  drivers/net/dsa/mv88e6xxx/global1_atu.c | 9 -
>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c
>>>> b/drivers/net/dsa/mv88e6xxx/global1_atu.c
>>>> index f6cd3c939da4..bac34737b096 100644
>>>> --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c
>>>> +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c
>>>> @@ -65,7 +65,14 @@ int mv88e6xxx_g1_atu_set_age_time(struct mv88e6xxx_chip
>>>> *chip,
>>>>    val &= ~0xff0;
>>>>    val |= age_time << 4;
>>>>
>>>> -  return mv88e6xxx_g1_write(chip, GLOBAL_ATU_CONTROL, val);
>>>> +  err = mv88e6xxx_g1_write(chip, GLOBAL_ATU_CONTROL, val);
>>>> +  if (err)
>>>> +  return err;
>>>> +
>>>> +  dev_dbg(chip->dev, "AgeTime set to 0x%02x (%d ms)\n", age_time,
>>>> +  age_time * coeff);
>>>> +
>>>
>>> Hi Vivien
>>>
>>> You could put the dev_dbg before the mv88e6xxx_g1_write(), to keep the
>>> code simpler. If this write fails, we expect a lot of other things to
>>> go horribly wrong, so having one debug message being not quite accurate
>>> is not important.
>>
>> The debug message would not be printed in case mv88e6xxx_g1_write()
>> fails, also, having the message printed after the write occurred is a
>> good way to make sure the write did make it through. Did I miss
>> something in what you are suggesting here?
> 
> We never, ever see a read or a write failure on the MDIO bus. If it
> ever does, i expect the switch is dead, gone, never to be heard from
> again until the power is reset. We are going to have lots of
> failures. So it seems simpler to have:
> 
>   dev_dbg(chip->dev, "Setting AgeTime to 0x%02x (%d ms)\n", age_time,
>   age_time * coeff);
> 
>   return mv88e6xxx_g1_write(chip, GLOBAL_ATU_CONTROL, val);
> 
> and accept that if for some unlikely reason the write does fail, the
> debug message is probably not accurate.
> 
>   Andrew
> 

Hi
The claim that R/W failures are never seen on the MDIO bus is not exactly
accurate.
With art (the Atheros calibration tool) we had the problem that interrupts
were being disabled, which led to MDIO operations timing out or
failing.
For normal PHYs this usually results in a call to phy_error in
.../net/phy/phy.c, which puts the PHY into a defined state (PHY_HALTED).
Granted, this is a problem produced by art2, but couldn't the same be
applied here, i.e. put the device in a defined state?

BR
Matthias
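
For reference, the two values quoted at the top of this thread can be
reproduced with a few lines of user-space C. This is only a sketch: the
coefficients 15000 ms (6352) and 3750 ms (6390) are inferred from the quoted
debug output, the toy_* helper names are hypothetical, and round-to-nearest is
assumed because it is what both quoted results are consistent with (18 s ends
up as 15 s on the 6352).

```c
#include <assert.h>

/* Convert a requested time to register units, rounding to the nearest
 * multiple of the per-model coefficient.
 */
static unsigned int toy_age_units(unsigned int msecs, unsigned int coeff)
{
	assert(coeff > 0);
	return (msecs + coeff / 2) / coeff;
}

/* The ageing time that is effectively programmed, in milliseconds. */
static unsigned int toy_age_msecs(unsigned int msecs, unsigned int coeff)
{
	return toy_age_units(msecs, coeff) * coeff;
}
```

Under these assumptions a request of 18000 ms yields 0x01 (15000 ms) with a
15000 ms coefficient and 0x05 (18750 ms) with a 3750 ms coefficient.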


[PATCH v2] Cleanup some warning from timestamping code.

2017-03-14 Thread Ezequiel Lara Gomez

From f85a5b8bfc420ce1f20d81974e87604bc8d60e45 Mon Sep 17 00:00:00 2001
From: Ezequiel Lara Gomez 
Date: Sat, 11 Mar 2017 20:06:01 +
Subject: [PATCH v2] Cleanup some warning from timestamping code.

Following checkpatch.pl recommendations (which include
replacing with  the , since linux/io.h includes
it).

Signed-off-by: Ezequiel Lara Gomez 
---
Changes:
* v2: fixed two remaining warnings about /* comments. Also, added clarification 
on the commit as to why switch headers.

 drivers/net/loopback.c | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index b23b719..7957b9a 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -13,7 +13,7 @@
  *
  * Alan Cox:   Fixed oddments for NET3.014
  * Alan Cox:   Rejig for NET3.029 snap #3
- * Alan Cox:   Fixed NET3.029 bugs and sped up
+ * Alan Cox:   Fixed NET3.029 bugs and sped up
  * Larry McVoy :   Tiny tweak to double performance
  * Alan Cox:   Backed out LMV's tweak - the linux mm
  * can't take it...
@@ -41,7 +41,7 @@
 #include 
 
 #include 
-#include 
+#include 
 
 #include 
 #include 
@@ -64,8 +64,7 @@ struct pcpu_lstats {
struct u64_stats_sync   syncp;
 };
 
-/*
- * The higher levels take care of making this non-reentrant (it's
+/* The higher levels take care of making this non-reentrant (it's
  * called with bh's disabled).
  */
 static netdev_tx_t loopback_xmit(struct sk_buff *skb,
@@ -149,14 +148,13 @@ static void loopback_dev_free(struct net_device *dev)
 }
 
 static const struct net_device_ops loopback_ops = {
-   .ndo_init  = loopback_dev_init,
-   .ndo_start_xmit= loopback_xmit,
+   .ndo_init= loopback_dev_init,
+   .ndo_start_xmit  = loopback_xmit,
.ndo_get_stats64 = loopback_get_stats64,
.ndo_set_mac_address = eth_mac_addr,
 };
 
-/*
- * The loopback device is special. There is only one instance
+/* The loopback device is special. There is only one instance
  * per network namespace.
  */
 static void loopback_setup(struct net_device *dev)
@@ -170,7 +168,7 @@ static void loopback_setup(struct net_device *dev)
dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_NO_QUEUE;
netif_keep_dst(dev);
dev->hw_features= NETIF_F_GSO_SOFTWARE;
-   dev->features   = NETIF_F_SG | NETIF_F_FRAGLIST
+   dev->features   = NETIF_F_SG | NETIF_F_FRAGLIST
| NETIF_F_GSO_SOFTWARE
| NETIF_F_HW_CSUM
| NETIF_F_RXCSUM
@@ -206,7 +204,6 @@ static __net_init int loopback_net_init(struct net *net)
net->loopback_dev = dev;
return 0;
 
-
 out_free_netdev:
free_netdev(dev);
 out:
@@ -217,5 +214,5 @@ static __net_init int loopback_net_init(struct net *net)
 
 /* Registered in net/core/dev.c */
 struct pernet_operations __net_initdata loopback_net_ops = {
-   .init = loopback_net_init,
+   .init = loopback_net_init,
 };
-- 
1.9.1




Re: [PATCH 2/7] ath9k: ahb: Add OF support

2017-03-14 Thread Sergei Shtylyov

Hello!

On 3/14/2017 12:05 AM, Alban wrote:


Allow registering ath9k AHB devices defined in DT. This just add the
compatible strings to allow matching the driver and setting the proper
device ID.

Signed-off-by: Alban 
---
 drivers/net/wireless/ath/ath9k/ahb.c | 47 +---
 1 file changed, 43 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/ahb.c 
b/drivers/net/wireless/ath/ath9k/ahb.c
index 2bd982c..36a2645 100644
--- a/drivers/net/wireless/ath/ath9k/ahb.c
+++ b/drivers/net/wireless/ath/ath9k/ahb.c

[...]

@@ -79,10 +107,20 @@ static int ath_ahb_probe(struct platform_device *pdev)
int ret = 0;
struct ath_hw *ah;
char hw_name[64];
+   u16 devid;

-   if (!dev_get_platdata(&pdev->dev)) {
-   dev_err(&pdev->dev, "no platform data specified\n");
-   return -EINVAL;
+   if (id) {
+   devid = id->driver_data;
+   } else {
+   const struct of_device_id *match;
+
+   match = of_match_device(ath_ahb_of_match, &pdev->dev);
+   if (!match) {
+   dev_err(&pdev->dev, "no device match found\n");
+   return -EINVAL;
+   }
+
+   devid = (u16)(unsigned long)match->data;


   of_device_get_match_data() instead of the above perhaps?

[...]

MBR, Sergei



[PATCH net-next 0/4] gtp: support multiple APN's per GTP endpoint

2017-03-14 Thread Andreas Schultz
Support multiple APN's per GTP endpoint and as an additional benefit support
multiple GTP sockets per GTP entity.

Use case multiple APN's:


In 3GPP an APN is a control path construct. When mapped into the data path,
it means that UE IPs can be sourced from independent IP networks with
overlapping IP ranges.

3GPP, TS 29.061 version 13.6.0 Release 13, Section 11.3 describes this as:

> 2. each private network manages its own addressing. In general this will
>result in different private networks having overlapping address ranges.
>A logically separate connection (e.g. an IP in IP tunnel or layer 2
>virtual circuit) is used between the GGSN/P-GW and each private network.
>In this case the IP address alone is not necessarily unique. The pair
>of values, Access Point Name (APN) and IPv4 address and/or IPv6 prefixes,
>is unique.

To support such a setup, each APN is mapped to a Linux network device.
VRF-Lite, network namespaces or other mechanisms can then be used to realize
the full separation of the per-APN IP networks.

Use case multiple GTP sockets per GTP entity:
-

A GTP entity like a PGW can use multiple GTP sockets for:

 * separate IPv4 and IPv6 transport endpoints
 * support multiple reference points in separated IP networks, e.g. have
   Gn/Gp/S5/S8 in a GRX-attached network and S2a/S2b in another private
   network

The S2a/S2b separation is an especially important scenario. The networks
used for roaming and non-roaming attachment (Gn/Gp/S5/S8 reference points)
are usually different from the connection for trusted and untrusted WiFi
access (S2a/S2b). While the GTP transport networks are separated, it is
still desirable to terminate the tunnels in the same GTP entity to ensure
uninterrupted IP connectivity during 3G/LTE to/from WiFi handover.

Implementation:
---

APNs are a control path construct; the identification of the associated
network device therefore needs to be bound to the tunnel endpoint identifier.

This series moves the hash for the incoming tunnel endpoint identifiers into
the socket to support multiple network devices per GTP socket. It then adds
a method of enabling the GTP encapsulation on a socket without having to
bind the socket to a network device and finally allows specifying a GTP
socket per PDP context.

API impact:
---

This is probably the most problematic part of this series...

The removal of the TEID from the netdevice also means that the gtp genl API
for retrieving tunnel information and removing tunnels needs to be adjusted.

Before this change it was possible to change a GTP tunnel using the gtp
netdevice id and the teid. The teid is no longer unique per gtp netdevice.
After this change it has to be either the netdevice and MS IP or the GTP
socket and teid.

Fortunately, libgtpnl has always populated the Link Id, TEID, GSN Peer IP and
MS IP. The library interface has ensured that all information that is mandatory
after this change is guaranteed to be present.

The only project that doesn't use libgtpnl (OpenAir-CN) is also populating
all of those values.

The API change will therefore not break any existing userspace applications.

--
Andreas Schultz (4):
  gtp: move TEID hash to per socket structure
  gtp: add genl cmd to enable GTP encapsulation on UDP socket
  gtp: add support to select a GTP socket during PDP context creation
  Extend Kernel GTP-U tunneling documentation

 Documentation/networking/gtp.txt | 103 ++-
 drivers/net/gtp.c| 263 ---
 include/uapi/linux/gtp.h |   4 +
 3 files changed, 292 insertions(+), 78 deletions(-)

-- 
2.10.2
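
The lookup model described in the cover letter can be sketched in user space
roughly as follows. This is an illustration only, under the assumption that a
simple chained hash is an acceptable stand-in for the kernel's RCU hlists: the
toy_* types and helpers are hypothetical, and apn_ifindex merely stands in for
the per-APN network device. The point it shows is that the TEID is looked up
per GTP socket, so the same TEID value may exist on two sockets and resolve to
different APN devices.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified PDP context: one tunnel bound to one APN device. */
struct toy_pdp {
	uint32_t teid;		/* incoming tunnel endpoint identifier */
	int apn_ifindex;	/* stand-in for the APN's network device */
	struct toy_pdp *next;	/* hash bucket chain */
};

/* Per-socket state, shaped like the struct gtp_sock the series adds. */
struct toy_gtp_sock {
	unsigned int hash_size;
	struct toy_pdp *tid_hash[];	/* flexible array of bucket heads */
};

/* Allocate a socket state with hash_size empty buckets. */
static struct toy_gtp_sock *toy_sock_new(unsigned int hash_size)
{
	struct toy_gtp_sock *gsk;

	assert(hash_size > 0);
	gsk = calloc(1, sizeof(*gsk) + hash_size * sizeof(gsk->tid_hash[0]));
	if (gsk)
		gsk->hash_size = hash_size;
	return gsk;
}

/* Link a PDP context into the bucket selected by its TEID. */
static void toy_pdp_add(struct toy_gtp_sock *gsk, struct toy_pdp *pdp)
{
	struct toy_pdp **head = &gsk->tid_hash[pdp->teid % gsk->hash_size];

	pdp->next = *head;
	*head = pdp;
}

/* Resolve a PDP context from the (socket, TEID) pair. */
static struct toy_pdp *toy_pdp_find(const struct toy_gtp_sock *gsk,
				    uint32_t teid)
{
	struct toy_pdp *pdp;

	for (pdp = gsk->tid_hash[teid % gsk->hash_size]; pdp; pdp = pdp->next)
		if (pdp->teid == teid)
			return pdp;
	return NULL;
}
```

Keying the table on the socket instead of the netdevice is what lets several
APN devices share one GTP socket, at the cost of the netdevice plus TEID pair
no longer being a usable key.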



[PATCH net-next 3/4] gtp: add support to select a GTP socket during PDP context creation

2017-03-14 Thread Andreas Schultz
For each new PDP a separate socket can be selected. The per netdevice
default sockets are no longer mandatory.

This also means that multiple gtp netdevices can share the same default
socket and that therefore the destruction of a gtp netdevice can no
longer automatically disable the gtp encapsulation on its sockets.

Signed-off-by: Andreas Schultz 
---
 drivers/net/gtp.c | 79 +++
 1 file changed, 68 insertions(+), 11 deletions(-)

diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index c4cf1b9..afa043d 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -381,9 +381,6 @@ static int gtp_dev_init(struct net_device *dev)
 
 static void gtp_dev_uninit(struct net_device *dev)
 {
-   struct gtp_dev *gtp = netdev_priv(dev);
-
-   gtp_encap_disable(gtp);
free_percpu(dev->tstats);
 }
 
@@ -647,9 +644,6 @@ static int gtp_newlink(struct net *src_net, struct 
net_device *dev,
struct gtp_net *gn;
int hashsize, err;
 
-   if (!data[IFLA_GTP_FD0] && !data[IFLA_GTP_FD1])
-   return -EINVAL;
-
gtp = netdev_priv(dev);
 
if (!data[IFLA_GTP_PDP_HASHSIZE])
@@ -689,8 +683,11 @@ static void gtp_dellink(struct net_device *dev, struct 
list_head *head)
 {
struct gtp_dev *gtp = netdev_priv(dev);
 
-   gtp_encap_disable(gtp);
gtp_hashtable_free(gtp);
+   if (gtp->sk0)
+   sock_put(gtp->sk0);
+   if (gtp->sk1u)
+   sock_put(gtp->sk1u);
list_del_rcu(&gtp->list);
unregister_netdevice_queue(dev, head);
 }
@@ -1008,9 +1005,10 @@ static void pdp_context_delete(struct pdp_ctx *pctx)
 
 static int gtp_genl_new_pdp(struct sk_buff *skb, struct genl_info *info)
 {
+   struct socket *sock = NULL;
+   struct sock *sk = NULL;
unsigned int version;
struct gtp_dev *gtp;
-   struct sock *sk;
int err;
 
if (!info->attrs[GTPA_VERSION] ||
@@ -1045,12 +1043,14 @@ static int gtp_genl_new_pdp(struct sk_buff *skb, struct 
genl_info *info)
goto out_unlock;
}
 
-   if (version == GTP_V0)
+   if (info->attrs[GTPA_FD]) {
+   sock = sockfd_lookup(nla_get_u32(info->attrs[GTPA_FD]), &err);
+   if (sock)
+   sk = sock->sk;
+   } else if (version == GTP_V0)
sk = gtp->sk0;
else if (version == GTP_V1)
sk = gtp->sk1u;
-   else
-   sk = NULL;
 
if (!sk) {
err = -ENODEV;
@@ -1059,6 +1059,9 @@ static int gtp_genl_new_pdp(struct sk_buff *skb, struct 
genl_info *info)
 
err = ipv4_pdp_add(gtp, sk, info);
 
+   if (sock)
+   sockfd_put(sock);
+
 out_unlock:
rcu_read_unlock();
return err;
@@ -1079,12 +1082,66 @@ static struct pdp_ctx *gtp_find_pdp_by_link(struct net 
*net,
return ipv4_pdp_find(gtp, nla_get_be32(nla[GTPA_MS_ADDRESS]));
 }
 
+static struct pdp_ctx *gtp_genl_find_pdp_by_socket(struct net *net,
+  struct nlattr *nla[])
+{
+   struct socket *sock;
+   struct gtp_sock *gsk;
+   struct pdp_ctx *pctx;
+   int fd, err = 0;
+
+   if (!nla[GTPA_FD])
+   return ERR_PTR(-EINVAL);
+
+   fd = nla_get_u32(nla[GTPA_FD]);
+   sock = sockfd_lookup(fd, &err);
+   if (!sock) {
+   pr_debug("gtp socket fd=%d not found\n", fd);
+   return ERR_PTR(-EBADF);
+   }
+
+   gsk = rcu_dereference_sk_user_data(sock->sk);
+   if (!gsk) {
+   pctx = ERR_PTR(-EINVAL);
+   goto out_sock;
+   }
+
+   switch (nla_get_u32(nla[GTPA_VERSION])) {
+   case GTP_V0:
+   if (!nla[GTPA_TID]) {
+   pctx = ERR_PTR(-EINVAL);
+   break;
+   }
+   pctx = gtp0_pdp_find(gsk, nla_get_u64(nla[GTPA_TID]));
+   break;
+
+   case GTP_V1:
+   if (!nla[GTPA_I_TEI]) {
+   pctx = ERR_PTR(-EINVAL);
+   break;
+   }
+   pctx = gtp1_pdp_find(gsk, nla_get_u32(nla[GTPA_I_TEI]));
+   break;
+
+   default:
+   pctx = ERR_PTR(-EINVAL);
+   break;
+   }
+
+out_sock:
+   sockfd_put(sock);
+   return pctx;
+}
+
 static struct pdp_ctx *gtp_find_pdp(struct net *net, struct nlattr *nla[])
 {
struct pdp_ctx *pctx;
 
if (nla[GTPA_LINK])
pctx = gtp_find_pdp_by_link(net, nla);
+   else if (nla[GTPA_FD])
+   pctx = gtp_genl_find_pdp_by_socket(net, nla);
+
else
pctx = ERR_PTR(-EINVAL);
 
-- 
2.10.2
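
The socket selection order that gtp_genl_new_pdp() follows after this patch
can be summarised with a small user-space sketch. The toy_* types and the
have_fd flag are hypothetical simplifications (have_fd stands for "GTPA_FD was
present", from_fd for the result of the fd lookup); the sketch assumes, as the
hunk above suggests, that a failed fd lookup does not fall back to the
per-device default sockets.

```c
#include <assert.h>
#include <stddef.h>

enum toy_gtp_version { TOY_GTP_V0, TOY_GTP_V1 };

/* Hypothetical stand-in for struct sock. */
struct toy_sock {
	int id;
};

/* Hypothetical stand-in for the per-device default sockets. */
struct toy_gtp_dev {
	struct toy_sock *sk0;	/* default GTPv0 socket, may be NULL */
	struct toy_sock *sk1u;	/* default GTPv1-U socket, may be NULL */
};

/* Returns the socket to use, or NULL (the caller then fails with -ENODEV). */
static struct toy_sock *toy_pick_sock(const struct toy_gtp_dev *gtp,
				      int have_fd, struct toy_sock *from_fd,
				      enum toy_gtp_version version)
{
	if (have_fd)
		return from_fd;		/* explicit socket wins, even if NULL */
	if (version == TOY_GTP_V0)
		return gtp->sk0;
	if (version == TOY_GTP_V1)
		return gtp->sk1u;
	return NULL;
}
```

Since the default sockets are no longer mandatory, a device created without
them can only get PDP contexts through the explicit-socket path.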



[PATCH net-next 1/4] gtp: move TEID hash to per socket structure

2017-03-14 Thread Andreas Schultz
Untangle the TEID information from the network device and move
it into a per socket structure.

The removal of the TEID from the netdevice also means that the
gtp genl API for retrieving tunnel information and removing tunnels
needs to be adjusted.
Before this change it was possible to change a GTP tunnel using
the gtp netdevice id and the teid. The teid is no longer unique
per gtp netdevice. So after this change it has to be either the
netdevice and MS IP or the GTP socket and teid.

Fortunately, libgtpnl has always populated the Link Id, TEID,
GSN Peer IP and MS IP. So, the library interface has ensured that
all information that is mandatory after this change is guaranteed
to be present. The only project that doesn't use libgtpnl (OpenAir-CN)
is also populating all of those values.

Signed-off-by: Andreas Schultz 
---
 drivers/net/gtp.c | 145 +-
 1 file changed, 78 insertions(+), 67 deletions(-)

diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 3e1854f..66616f7 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -75,10 +75,15 @@ struct gtp_dev {
struct net_device   *dev;
 
unsigned inthash_size;
-   struct hlist_head   *tid_hash;
struct hlist_head   *addr_hash;
 };
 
+/* One instance of the GTP socket. */
+struct gtp_sock {
+   unsigned inthash_size;
+   struct hlist_head   tid_hash[];
+};
+
 static unsigned int gtp_net_id __read_mostly;
 
 struct gtp_net {
@@ -106,12 +111,12 @@ static inline u32 ipv4_hashfn(__be32 ip)
 }
 
 /* Resolve a PDP context structure based on the 64bit TID. */
-static struct pdp_ctx *gtp0_pdp_find(struct gtp_dev *gtp, u64 tid)
+static struct pdp_ctx *gtp0_pdp_find(struct gtp_sock *gsk, u64 tid)
 {
struct hlist_head *head;
struct pdp_ctx *pdp;
 
-   head = &gtp->tid_hash[gtp0_hashfn(tid) % gtp->hash_size];
+   head = &gsk->tid_hash[gtp0_hashfn(tid) % gsk->hash_size];
 
hlist_for_each_entry_rcu(pdp, head, hlist_tid) {
if (pdp->gtp_version == GTP_V0 &&
@@ -122,12 +127,12 @@ static struct pdp_ctx *gtp0_pdp_find(struct gtp_dev *gtp, 
u64 tid)
 }
 
 /* Resolve a PDP context structure based on the 32bit TEI. */
-static struct pdp_ctx *gtp1_pdp_find(struct gtp_dev *gtp, u32 tid)
+static struct pdp_ctx *gtp1_pdp_find(struct gtp_sock *gsk, u32 tid)
 {
struct hlist_head *head;
struct pdp_ctx *pdp;
 
-   head = &gtp->tid_hash[gtp1u_hashfn(tid) % gtp->hash_size];
+   head = &gsk->tid_hash[gtp1u_hashfn(tid) % gsk->hash_size];
 
hlist_for_each_entry_rcu(pdp, head, hlist_tid) {
if (pdp->gtp_version == GTP_V1 &&
@@ -215,7 +220,7 @@ static int gtp_rx(struct pdp_ctx *pctx, struct sk_buff 
*skb, unsigned int hdrlen
 }
 
 /* 1 means pass up to the stack, -1 means drop and 0 means decapsulated. */
-static int gtp0_udp_encap_recv(struct gtp_dev *gtp, struct sk_buff *skb)
+static int gtp0_udp_encap_recv(struct gtp_sock *gsk, struct sk_buff *skb)
 {
unsigned int hdrlen = sizeof(struct udphdr) +
  sizeof(struct gtp0_header);
@@ -233,16 +238,16 @@ static int gtp0_udp_encap_recv(struct gtp_dev *gtp, 
struct sk_buff *skb)
if (gtp0->type != GTP_TPDU)
return 1;
 
-   pctx = gtp0_pdp_find(gtp, be64_to_cpu(gtp0->tid));
+   pctx = gtp0_pdp_find(gsk, be64_to_cpu(gtp0->tid));
if (!pctx) {
-   netdev_dbg(gtp->dev, "No PDP ctx to decap skb=%p\n", skb);
+   pr_debug("No PDP ctx to decap skb=%p\n", skb);
return 1;
}
 
return gtp_rx(pctx, skb, hdrlen);
 }
 
-static int gtp1u_udp_encap_recv(struct gtp_dev *gtp, struct sk_buff *skb)
+static int gtp1u_udp_encap_recv(struct gtp_sock *gsk, struct sk_buff *skb)
 {
unsigned int hdrlen = sizeof(struct udphdr) +
  sizeof(struct gtp1_header);
@@ -275,9 +280,9 @@ static int gtp1u_udp_encap_recv(struct gtp_dev *gtp, struct 
sk_buff *skb)
 
gtp1 = (struct gtp1_header *)(skb->data + sizeof(struct udphdr));
 
-   pctx = gtp1_pdp_find(gtp, ntohl(gtp1->tid));
+   pctx = gtp1_pdp_find(gsk, ntohl(gtp1->tid));
if (!pctx) {
-   netdev_dbg(gtp->dev, "No PDP ctx to decap skb=%p\n", skb);
+   pr_debug("No PDP ctx to decap skb=%p\n", skb);
return 1;
}
 
@@ -286,13 +291,21 @@ static int gtp1u_udp_encap_recv(struct gtp_dev *gtp, 
struct sk_buff *skb)
 
 static void gtp_encap_destroy(struct sock *sk)
 {
-   struct gtp_dev *gtp;
+   struct gtp_sock *gsk;
+   struct pdp_ctx *pctx;
+   int i;
 
-   gtp = rcu_dereference_sk_user_data(sk);
-   if (gtp) {
+   gsk = rcu_dereference_sk_user_data(sk);
+   if (gsk) {
udp_sk(sk)->encap_type = 0;
rcu_assign_sk_user_data(sk, NULL);
-   sock_put(sk);
+
+   for (i = 0; i < gsk->hash_size; i++)
+

[PATCH net-next 2/4] gtp: add genl cmd to enable GTP encapsulation on UDP socket

2017-03-14 Thread Andreas Schultz
Signed-off-by: Andreas Schultz 
---
 drivers/net/gtp.c| 47 +++
 include/uapi/linux/gtp.h |  4 
 2 files changed, 51 insertions(+)

diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 66616f7..c4cf1b9 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -1244,6 +1244,45 @@ static int gtp_genl_dump_pdp(struct sk_buff *skb,
return skb->len;
 }
 
+static int gtp_genl_enable_socket(struct sk_buff *skb, struct genl_info *info)
+{
+   u32 version, fd, hashsize;
+   struct sock *sk;
+
+   if (!info->attrs[GTPA_VERSION] ||
+   !info->attrs[GTPA_FD])
+   return -EINVAL;
+
+   if (!info->attrs[GTPA_PDP_HASHSIZE])
+   hashsize = 1024;
+   else
+   hashsize = nla_get_u32(info->attrs[GTPA_PDP_HASHSIZE]);
+
+   version = nla_get_u32(info->attrs[GTPA_VERSION]);
+   fd = nla_get_u32(info->attrs[GTPA_FD]);
+
+   switch (version) {
+   case GTP_V0:
+   sk = gtp_encap_enable_socket(fd, UDP_ENCAP_GTP0, hashsize);
+   break;
+
+   case GTP_V1:
+   sk = gtp_encap_enable_socket(fd, UDP_ENCAP_GTP1U, hashsize);
+   break;
+
+   default:
+   return -EINVAL;
+   }
+
+   if (!sk)
+   return -EINVAL;
+
+   if (IS_ERR(sk))
+   return PTR_ERR(sk);
+
+   return 0;
+}
+
 static struct nla_policy gtp_genl_policy[GTPA_MAX + 1] = {
[GTPA_LINK] = { .type = NLA_U32, },
[GTPA_VERSION]  = { .type = NLA_U32, },
@@ -1254,6 +1293,8 @@ static struct nla_policy gtp_genl_policy[GTPA_MAX + 1] = {
[GTPA_NET_NS_FD]= { .type = NLA_U32, },
[GTPA_I_TEI]= { .type = NLA_U32, },
[GTPA_O_TEI]= { .type = NLA_U32, },
+   [GTPA_PDP_HASHSIZE] = { .type = NLA_U32, },
+   [GTPA_FD]   = { .type = NLA_U32, },
 };
 
 static const struct genl_ops gtp_genl_ops[] = {
@@ -1276,6 +1317,12 @@ static const struct genl_ops gtp_genl_ops[] = {
.policy = gtp_genl_policy,
.flags = GENL_ADMIN_PERM,
},
+   {
+   .cmd = GTP_CMD_ENABLE_SOCKET,
+   .doit = gtp_genl_enable_socket,
+   .policy = gtp_genl_policy,
+   .flags = GENL_ADMIN_PERM,
+   },
 };
 
 static struct genl_family gtp_genl_family __ro_after_init = {
diff --git a/include/uapi/linux/gtp.h b/include/uapi/linux/gtp.h
index 72a04a0..a9e9fe0 100644
--- a/include/uapi/linux/gtp.h
+++ b/include/uapi/linux/gtp.h
@@ -6,6 +6,8 @@ enum gtp_genl_cmds {
GTP_CMD_DELPDP,
GTP_CMD_GETPDP,
 
+   GTP_CMD_ENABLE_SOCKET,
+
GTP_CMD_MAX,
 };
 
@@ -26,6 +28,8 @@ enum gtp_attrs {
GTPA_I_TEI, /* for GTPv1 only */
GTPA_O_TEI, /* for GTPv1 only */
GTPA_PAD,
+   GTPA_PDP_HASHSIZE,
+   GTPA_FD,
__GTPA_MAX,
 };
 #define GTPA_MAX (__GTPA_MAX + 1)
-- 
2.10.2



[PATCH net-next 4/4] Extend Kernel GTP-U tunneling documentation

2017-03-14 Thread Andreas Schultz
* clarify specification references for v0/v1
* add section "APN vs. Network device"
* add section "Local GTP-U entity and tunnel identification"

Signed-off-by: Andreas Schultz 
Signed-off-by: Harald Welte 
---
 Documentation/networking/gtp.txt | 103 +--
 1 file changed, 99 insertions(+), 4 deletions(-)

diff --git a/Documentation/networking/gtp.txt b/Documentation/networking/gtp.txt
index 93e9675..0d9c18f 100644
--- a/Documentation/networking/gtp.txt
+++ b/Documentation/networking/gtp.txt
@@ -1,6 +1,7 @@
 The Linux kernel GTP tunneling module
 ==
-Documentation by Harald Welte 
+Documentation by Harald Welte  and
+ Andreas Schultz 
 
 In 'drivers/net/gtp.c' you are finding a kernel-level implementation
 of a GTP tunnel endpoint.
@@ -91,9 +92,13 @@ http://git.osmocom.org/libgtpnl/
 
 == Protocol Versions ==
 
-There are two different versions of GTP-U: v0 and v1.  Both are
-implemented in the Kernel GTP module.  Version 0 is a legacy version,
-and deprecated from recent 3GPP specifications.
+There are two different versions of GTP-U: v0 [GSM TS 09.60] and v1
+[3GPP TS 29.281].  Both are implemented in the Kernel GTP module.
+Version 0 is a legacy version, and deprecated from recent 3GPP
+specifications.
+
+GTP-U uses UDP for transporting PDUs.  The receiving UDP port is 2152
+for GTPv1-U and 3386 for GTPv0-U.
 
 There are three versions of GTP-C: v0, v1, and v2.  As the kernel
 doesn't implement GTP-C, we don't have to worry about this.  It's the
@@ -133,3 +138,93 @@ doe to a lack of user interest, it never got merged.
 In 2015, Andreas Schultz came to the rescue and fixed lots more bugs,
 extended it with new features and finally pushed all of us to get it
 mainline, where it was merged in 4.7.0.
+
+== Architectural Details ==
+
+=== Local GTP-U entity and tunnel identification ===
+
+GTP-U uses UDP for transporting PDUs. The receiving UDP port is 2152
+for GTPv1-U and 3386 for GTPv0-U.
+
+There is only one GTP-U entity (and therefore SGSN/GGSN/S-GW/PDN-GW
+instance) per IP address. Tunnel Endpoint Identifiers (TEIDs) are unique
+per GTP-U entity.
+
+A specific tunnel is only defined by the destination entity. Since the
+destination port is constant, only the destination IP and TEID define
+a tunnel. The source IP and Port have no meaning for the tunnel.
+
+Therefore:
+
+  * when sending, the remote entity is defined by the remote IP and
+the tunnel endpoint id. The source IP and port have no meaning and
+can be changed at any time.
+
+  * when receiving, the local entity is defined by the local
+destination IP and the tunnel endpoint id. The source IP and port
+have no meaning and can change at any time.
+
+[3GPP TS 29.281] Section 4.3.0 defines this as follows:
+
+> The TEID in the GTP-U header is used to de-multiplex traffic
+> incoming from remote tunnel endpoints so that it is delivered to the
+> User plane entities in a way that allows multiplexing of different
+> users, different packet protocols and different QoS levels.
+> Therefore no two remote GTP-U endpoints shall send traffic to a
+> GTP-U protocol entity using the same TEID value except
+> for data forwarding as part of mobility procedures.
+
+The definition above only defines that two remote GTP-U endpoints
+*should not* send to the same TEID, it *does not* forbid or exclude
+such a scenario. In fact, the mentioned mobility procedures make it
+necessary that the GTP-U entity accepts traffic for TEIDs from
+multiple or unknown peers.
+
+Therefore, the receiving side identifies tunnels exclusively based on
+TEIDs, not based on the source IP!
+
+== APN vs. Network Device ==
+
+The GTP-U driver creates a Linux network device for each Gi/SGi
+interface.
+
+[3GPP TS 29.281] calls the Gi/SGi reference point an interface. This
+may lead to the impression that the GGSN/P-GW can have only one such
+interface.
+
+In fact, the Gi/SGi reference point defines the interworking
+between the 3GPP packet domain (PDN) based on GTP-U tunnels and
+IP-based networks.
+
+There is no provision in any of the 3GPP documents that limits the
+number of Gi/SGi interfaces implemented by a GGSN/P-GW.
+
+[3GPP TS 29.061] Section 11.3 makes it clear that the selection of a
+specific Gi/SGi interface is made through the Access Point Name
+(APN):
+
+> 2. each private network manages its own addressing. In general this
+>will result in different private networks having overlapping
+>address ranges. A logically separate connection (e.g. an IP in IP
+>tunnel or layer 2 virtual circuit) is used between the GGSN/P-GW
+>and each private network.
+>
+>In this case the IP address alone is not necessarily unique.  The
+>pair of values, Access Point Name (APN) and IPv4 address and/or
+>IPv6 prefixes, is unique.
+
+In order to support the overlapping address range use case, each APN
+is mapped to a separate Gi/SGi interface (n

Re: [PATCH net-next 1/4] gtp: move TEID hash to per socket structure

2017-03-14 Thread Pablo Neira Ayuso
On Tue, Mar 14, 2017 at 12:25:45PM +0100, Andreas Schultz wrote:
> @@ -275,9 +280,9 @@ static int gtp1u_udp_encap_recv(struct gtp_dev *gtp, 
> struct sk_buff *skb)
>  
>   gtp1 = (struct gtp1_header *)(skb->data + sizeof(struct udphdr));
>  
> - pctx = gtp1_pdp_find(gtp, ntohl(gtp1->tid));
> + pctx = gtp1_pdp_find(gsk, ntohl(gtp1->tid));
>   if (!pctx) {
> - netdev_dbg(gtp->dev, "No PDP ctx to decap skb=%p\n", skb);
> + pr_debug("No PDP ctx to decap skb=%p\n", skb);
>   return 1;

Again, the pr_debug() change has been resurrected.

I already told you: If we are going to have more than one gtp device,
then this doesn't make sense. I have to repeat things over and over
again, just because you don't want to rebase your patchset for some
reason. I can't find any other explanation for this.

So please remove this debugging rather than rendering it completely
useless.

Moreover, this change has nothing to do with this patch, so it breaks
the one-logical-change-per-patch rule.


Re: [PATCH net-next 2/4] gtp: add genl cmd to enable GTP encapsulation on UDP socket

2017-03-14 Thread Pablo Neira Ayuso
On Tue, Mar 14, 2017 at 12:25:46PM +0100, Andreas Schultz wrote:

> @@ -1254,6 +1293,8 @@ static struct nla_policy gtp_genl_policy[GTPA_MAX + 1] 
> = {
>   [GTPA_NET_NS_FD]= { .type = NLA_U32, },
>   [GTPA_I_TEI]= { .type = NLA_U32, },
>   [GTPA_O_TEI]= { .type = NLA_U32, },
> + [GTPA_PDP_HASHSIZE] = { .type = NLA_U32, },

This per PDP hashsize attribute clearly doesn't belong here.

Moreover, we now have a rhashtable implementation, so hopefully we
can get rid of this. It should be very easy to convert this to use
rhashtable, and it is very much desirable.

> + [GTPA_FD]   = { .type = NLA_U32, },

This new attribute has nothing to do with the PDP context.
And enum gtp_attrs *only* describes a PDP context. Adding more
attributes there to mix semantics is not the way to go.

You likely have to inaugurate a new enum. This gtp_attrs enum is only
related to the PDP description.

Why not add some interface to attach more sockets to the gtp device
globally? So the gtp device is still the top-level structure. Then add
a netlink attribute to specify which VRF this tunnel belongs to,
instead of implicitly using the socket to achieve this.

Another possibility is to explicitly have an interface to add/delete
VRFs and attach sockets to them.

In general, I'm still not convinced this is the right design for this.


Re: [PATCH net-next 0/4] gtp: support multiple APN's per GTP endpoint

2017-03-14 Thread Pablo Neira Ayuso
On Tue, Mar 14, 2017 at 12:25:44PM +0100, Andreas Schultz wrote:
[...]
> API impact:
> ---
> 
> This is probably the most problematic part of this series...
> 
> The removal of the TEID from the netdevice also means that the gtp genl API
> for retrieving tunnel information and removing tunnels needs to be adjusted.
> 
> Before this change it was possible to change a GTP tunnel using the gtp
> netdevice id and the teid. The teid is no longer unique per gtp netdevice.
> After this change it has to be either the netdevice and MS IP or the GTP
> socket and teid.

Then we have to introduce some explicit VRF concept or such to sort
out this.

It is definitely not acceptable to break the existing API.


Re: [PATCH net-next 1/4] dt-bindings: net: dsa: add Mediatek MT7530 binding

2017-03-14 Thread Andrew Lunn
> By the way, I have a question: could the current DSA framework
> allow managing the fabric designated from "multiple cpu ports" to "user
> ports" in any combination in brctl and in other existing commands?
> 
> For example:
> 
> I assume that there are two cpu ports called 5 and 6, and there are five
> user ports called 0, 1, 2, 3 and 4, and the default fabric on the switch
> maps { 5 } <-> { 0, 1, 2, 3, 4 }, where I assume the members in the braces
> can also communicate with each other.
> 
> Is it feasible to change the fabric into other combinations at
> runtime such as 
> {5} <-> {0, 1, 2, 3} and {6} <-> {4}
> {5} <-> {0, 1, 2} and {6} <-> {3, 4} or 
> {6} <-> {0, 1} and {6} <-> {2, 3, 4} or
> 
> {6} <-> {0, 1, 2, 3 ,4} ?
> 
> After tracing some code, it seems that only one cpu port can be
> supported per dsa registration. 

Hi Sean

This is on our TODO list, and getting near the top of Florian's list,
as far as I understand. A few years ago I made a proof of concept
implementation for this, and the new device tree binding was designed
with this in mind.

 Andrew


Re: [PATCH net-next] net: dsa: mv88e6xxx: debug ATU Age Time

2017-03-14 Thread Andrew Lunn
> Hi
> The claim of never seeing R/W failures on the MDIO bus is not exactly
> accurate. With ART (Atheros calibration tool) we had the problem that
> interrupts were being disabled, which led to MDIO operations timing out
> or failing.

Yes, I've seen similar with power management bugs for the MDIO
driver. But you get a cascade of failures, lots of warnings and error
prints, it is clear something bad has happened, and the switch is in
an inconsistent state. So having one more debug print which is also
inconsistent does no real harm.

Anyway, this whole conversation has taken more effort than just making
this simple change to remove a few lines of code. So let's drop it and
move on.

Andrew


Re: [PATCH v2 2/2] can: spi: hi311x: Add Holt HI-311x CAN driver

2017-03-14 Thread Wolfgang Grandegger

Hello Akshay,

On 13.03.2017 at 16:38, Akshay Bhat wrote:

Hi Wolfgang,

On 03/09/2017 12:36 PM, Wolfgang Grandegger wrote:

Hello,

doing a quick review... I realized a few issues...

On 17.01.2017 at 20:22, Akshay Bhat wrote:

... snip ...

A few other things to check:

Run "cangen" and monitor the message with "candump -e any,0:0,#FFF".
Then 1) disconnect the cable or 2) short-circuit CAN low and high at the
connector. You should see error messages. After reconnection or removing
the short-circuit (and bus-off recovery) the state should go back to
"active".



With the above sequence, candump reports "ERRORFRAME" with
protocol-violation{{}{acknowledge-slot}}, bus-error. On re-connecting
the cable the can state goes back to ACTIVE and I see the messages that
were in the queue being sent.


Do you get the ACK error also with berr-reporting off? Would be nice if 
you could show a candump log here.


Also, any error message should show the bus error counts in data[7,8]:

http://lxr.free-electrons.com/source/drivers/net/can/sja1000/sja1000.c#L408

And please check bus-off as well (short-circuiting CAN low and high).

Wolfgang.


Re: [PATCH 08/29] drivers, md: convert mddev.active from atomic_t to refcount_t

2017-03-14 Thread Michael Ellerman
Elena Reshetova  writes:

> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
>
> Signed-off-by: Elena Reshetova 
> Signed-off-by: Hans Liljestrand 
> Signed-off-by: Kees Cook 
> Signed-off-by: David Windsor 
> ---
>  drivers/md/md.c | 6 +++---
>  drivers/md/md.h | 3 ++-
>  2 files changed, 5 insertions(+), 4 deletions(-)

When booting linux-next (specifically 5be4921c9958ec) I'm seeing the
backtrace below. I suspect this patch is just exposing an existing
issue?

cheers


[0.230738] md: Waiting for all devices to be available before autodetect
[0.230742] md: If you don't use raid, use raid=noautodetect
[0.230962] refcount_t: increment on 0; use-after-free.
[0.230988] [ cut here ]
[0.230996] WARNING: CPU: 0 PID: 1 at lib/refcount.c:114 
.refcount_inc+0x5c/0x70
[0.231001] Modules linked in:
[0.231006] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.11.0-rc1-gccN-next-20170310-g5be4921 #1
[0.231012] task: c0004940 task.stack: c0004944
[0.231016] NIP: c05ac6bc LR: c05ac6b8 CTR: c0743390
[0.231021] REGS: c00049443160 TRAP: 0700   Not tainted  
(4.11.0-rc1-gccN-next-20170310-g5be4921)
[0.231026] MSR: 80029032 
[0.231033]   CR: 24024422  XER: 000c
[0.231038] CFAR: c0a5356c SOFTE: 1 
[0.231038] GPR00: c05ac6b8 c000494433e0 c1079d00 
002b 
[0.231038] GPR04:  00ef  
c10418a0 
[0.231038] GPR08: 4af8 c0ecc9a8 c0ecc9a8 
 
[0.231038] GPR12: 28024824 c6bb  
c00049443a00 
[0.231038] GPR16:  c00049443a10  
 
[0.231038] GPR20:   c0f7dd20 
 
[0.231038] GPR24: 014080c0 c12060b8 c1206080 
0009 
[0.231038] GPR28: c0f7dde0 0090  
c000461ae800 
[0.231100] NIP [c05ac6bc] .refcount_inc+0x5c/0x70
[0.231104] LR [c05ac6b8] .refcount_inc+0x58/0x70
[0.231108] Call Trace:
[0.231112] [c000494433e0] [c05ac6b8] .refcount_inc+0x58/0x70 
(unreliable)
[0.231120] [c00049443450] [c086c008] .mddev_find+0x1e8/0x430
[0.231125] [c00049443530] [c0872b6c] .md_open+0x2c/0x140
[0.231132] [c000494435c0] [c03962a4] .__blkdev_get+0xd4/0x520
[0.231138] [c00049443690] [c0396cc0] .blkdev_get+0x1c0/0x4f0
[0.231145] [c00049443790] [c0336d64] 
.do_dentry_open.isra.1+0x2a4/0x410
[0.231152] [c00049443830] [c03523f4] .path_openat+0x624/0x1580
[0.231157] [c00049443990] [c0354ce4] .do_filp_open+0x84/0x120
[0.231163] [c00049443b10] [c0338d74] .do_sys_open+0x214/0x300
[0.231170] [c00049443be0] [c0da69ac] .md_run_setup+0xa0/0xec
[0.231176] [c00049443c60] [c0da4fbc] 
.prepare_namespace+0x60/0x240
[0.231182] [c00049443ce0] [c0da47a8] 
.kernel_init_freeable+0x330/0x36c
[0.231190] [c00049443db0] [c000dc44] .kernel_init+0x24/0x160
[0.231197] [c00049443e30] [c000badc] 
.ret_from_kernel_thread+0x58/0x7c
[0.231202] Instruction dump:
[0.231206] 6000 3d22ffee 89296bfb 2f89 409effdc 3c62ffc6 3921 
3d42ffee 
[0.231216] 38630928 992a6bfb 484a6e79 6000 <0fe0> 4bb8 6000 
6000 
[0.231226] ---[ end trace 8c51f269ad91ffc2 ]---
[0.231233] md: Autodetecting RAID arrays.
[0.231236] md: autorun ...
[0.231239] md: ... autorun DONE.
[0.234188] EXT4-fs (sda4): mounting ext3 file system using the ext4 
subsystem
[0.250506] refcount_t: underflow; use-after-free.
[0.250531] [ cut here ]
[0.250537] WARNING: CPU: 0 PID: 3 at lib/refcount.c:207 
.refcount_dec_not_one+0x104/0x120
[0.250542] Modules linked in:
[0.250546] CPU: 0 PID: 3 Comm: kworker/0:0 Tainted: GW   
4.11.0-rc1-gccN-next-20170310-g5be4921 #1
[0.250553] Workqueue: events .delayed_fput
[0.250557] task: c00049404900 task.stack: c00049448000
[0.250562] NIP: c05ac964 LR: c05ac960 CTR: c0743390
[0.250567] REGS: c0004944b530 TRAP: 0700   Tainted: GW
(4.11.0-rc1-gccN-next-20170310-g5be4921)
[0.250572] MSR: 80029032 
[0.250578]   CR: 24002422  XER: 0007
[0.250584] CFAR: c0a5356c SOFTE: 1 
[0.250584] GPR00: c05ac960 c0004944b7b0 c1079d00 
0026 
[0.250584] GPR04:  0113  
c10418a0 
[0.250584] GPR08: 00

Re: [PATCH net-next 2/4] gtp: add genl cmd to enable GTP encapsulation on UDP socket

2017-03-14 Thread Andreas Schultz
- On Mar 14, 2017, at 12:43 PM, pablo pa...@netfilter.org wrote:

> On Tue, Mar 14, 2017 at 12:25:46PM +0100, Andreas Schultz wrote:
> 
>> @@ -1254,6 +1293,8 @@ static struct nla_policy gtp_genl_policy[GTPA_MAX + 1] 
>> = {
>>  [GTPA_NET_NS_FD]= { .type = NLA_U32, },
>>  [GTPA_I_TEI]= { .type = NLA_U32, },
>>  [GTPA_O_TEI]= { .type = NLA_U32, },
>> +[GTPA_PDP_HASHSIZE] = { .type = NLA_U32, },
> 
> This per PDP hashsize attribute clearly doesn't belong here.
> 
> Moreover, we now have a rhashtable implementation, so hopefully we
> can get rid of this. It should be very easy to convert this to use
> rhashtable, and it is very much desirable.

This would mean I have to mix the unrelated rhashtable change with moving the
hash into the socket. This certainly is not desirable either.

So, I'm going to have a look at the rhashtable thing and send a patch first
to convert the hashes to it.

>> +[GTPA_FD]   = { .type = NLA_U32, },
> 
> This new attribute has nothing to do with the PDP context.
> And enum gtp_attrs *only* describes a PDP context. Adding more
> attributes there to mix semantics is not the way to go.

You seem to assume that the network device or the APN/VRF is the root entity
for the GTP tunnels. That is IMHO wrong. The 3GPP specification clearly defines
a GTP entity that is completely independent from an APN or the local IP
endpoint.

A GTP entity serves multiple local IP endpoints. It manages outgoing tunnels
by the local APN/VRF, source IP, destination IP and remote tunnel id; incoming
tunnels are managed by the local destination IP, local tunnel id and VRF/APN.

Therefore a PDP context needs the following attributes:

 * local source/destination IP (and port - but that's for a different series)
 * remote destination IP
 * local and remote TEID
 * VRF/APN

The local source and destination IP is implicitly contained in the socket,
therefore the socket needs to be part of the context. The VRF/APN is contained
in the network device reference, so this also needs to be part of the PDP
context.

Having either the socket or the network device as the sole root container for a
GTP entity is wrong since the PDP context always refers to both.
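
To make the attribute list above concrete, here is a minimal user-space sketch. It is not code from drivers/net/gtp.c: the struct name, field names and the lookup helper are all hypothetical, and integer ids stand in for the socket and network device references. It only illustrates that a context refers to both a socket and a network device, and that the receive-side lookup keys on the socket and local TEID without looking at the sender's source IP.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a PDP context; names are illustrative only. */
struct pdp_ctx_sketch {
	int      sock_id;     /* GTP socket: implies local IP (and port) */
	int      netdev_id;   /* network device: implies the VRF/APN */
	uint32_t remote_ip;   /* remote GTP-U peer, host byte order */
	uint32_t local_teid;  /* TEID we receive on */
	uint32_t remote_teid; /* TEID we send to */
};

/*
 * Receive path: match on (socket, local TEID) only. The source IP of
 * the incoming packet is deliberately not part of the key.
 */
static const struct pdp_ctx_sketch *
pdp_find_rx(const struct pdp_ctx_sketch *tbl, size_t n,
	    int sock_id, uint32_t teid)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].sock_id == sock_id && tbl[i].local_teid == teid)
			return &tbl[i];
	return NULL;
}
```

A linear scan is used purely for brevity; the driver uses hash tables for this lookup.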

> You likely have to inaugurate a new enum. This gtp_attrs enum is only
> related to the PDP description.
> 
> Why not add some interface to attach more sockets to the gtp device
> globally? So still the gtp device is the top-level structure. 

That is IMHO the wrong model. In a real-life setup it is likely to have a
few GTP sockets and possibly hundreds if not thousands of network devices
attached to them (we already had the discussion why this kind of sharing
makes sense).
So from a resource perspective alone, having the network device as root makes
no sense.

> Then add
> a netlink attribute to specify to what VRF this tunnel belongs to,
> instead of implicitly using the socket to achieve this.

You got that the wrong way around. The VRF is already in the PDP context
through the network device reference. The socket is added to the PDP context
to select the outgoing source IP of the GTP tunnel in order to support
multiple GTP source IPs per GTP entity.

> Another possibility is to explicitly have an interface to add
> new/delete VRFs, attach sockets to them.

We already have that interface. It's the genl API that creates a GTP
network interface. I explained a few lines above why I think that adding
sockets to GTP network devices is wrong.

> In general, I'm still not convinced this is the right design for this.

Following your "add VRF" idea, I would end up with a pseudo network device
that represents a GTP entity. This would be the root instance for all the
VRFs and GTP sockets. Although being a network device, it would not
behave in any way like a network device: it would not handle traffic or
have IP(v6) addresses attached to it.
I would then further have GTP VRF network devices. Those would be "real"
network devices that handle traffic and have IP addresses/routes attached
to them.

I'm not sure if this pseudo GTP entity root device fits well with
other networking concepts. Moreover, I can't really see the need
for such a construct.

This need for an in-kernel root entity seems to come from the concept that
the kernel *owns* the tunnels and that tunnels are static and independent
from the user space control instance.

I think that the user space control instance should own the tunnels and
only use the kernel facility to manage them. When the user space instance
goes away, so should the tunnels.
From that perspective, I want to keep the kernel facilities to the absolute
minimum needed.

Regards 
Andreas


RE: [PATCH 08/29] drivers, md: convert mddev.active from atomic_t to refcount_t

2017-03-14 Thread Reshetova, Elena
> Elena Reshetova  writes:
> 
> > refcount_t type and corresponding API should be
> > used instead of atomic_t when the variable is used as
> > a reference counter. This allows to avoid accidental
> > refcounter overflows that might lead to use-after-free
> > situations.
> >
> > Signed-off-by: Elena Reshetova 
> > Signed-off-by: Hans Liljestrand 
> > Signed-off-by: Kees Cook 
> > Signed-off-by: David Windsor 
> > ---
> >  drivers/md/md.c | 6 +++---
> >  drivers/md/md.h | 3 ++-
> >  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> When booting linux-next (specifically 5be4921c9958ec) I'm seeing the
> backtrace below. I suspect this patch is just exposing an existing
> issue?

Yes, we have actually been following this issue in another thread. 
It looks like the object is re-used somehow, but I can't quite understand how 
just by reading the code. 
This is what I put into the previous thread:

"The log below indicates that you are using your refcounter in a bit weird way 
in mddev_find(). 
However, I can't find the place (just by reading the code) where you would 
increment refcounter from zero (vs. setting it to one).
It looks like you either iterate over existing nodes (and increment their 
counters, which should be >= 1 at the time of increment) or create a new node, 
but then mddev_init() sets the counter to 1. "
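
For readers unfamiliar with the two warnings in the log, here is a minimal user-space sketch of the checks involved. It assumes simplified, non-atomic semantics, and the names ref_sketch, ref_set, ref_inc_checked and ref_dec_and_test are hypothetical; this is not the kernel's refcount_t implementation. It only shows why taking a reference on a counter that has already dropped to zero is reported as a potential use-after-free, whereas a plain atomic_t would silently go from 0 to 1.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, non-atomic model of a checked reference counter. */
struct ref_sketch {
	unsigned int count;
};

/* Initialise the counter, e.g. to 1 when the object is created. */
static void ref_set(struct ref_sketch *r, unsigned int n)
{
	r->count = n;
}

/* Take a reference; refuse and report if the object is already dead. */
static bool ref_inc_checked(struct ref_sketch *r)
{
	if (r->count == 0) {
		fprintf(stderr, "ref_sketch: increment on 0; use-after-free.\n");
		return false;
	}
	r->count++;
	return true;
}

/* Drop a reference; returns true when the last reference was released. */
static bool ref_dec_and_test(struct ref_sketch *r)
{
	if (r->count == 0) {
		fprintf(stderr, "ref_sketch: underflow; use-after-free.\n");
		return false;
	}
	return --r->count == 0;
}
```

In this model, an object that is legitimately kept around with a count of zero and later revived by an increment would trip the first check. Whether that is what mddev_find() does is exactly the open question in this thread.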

If you can help us understand what is going on with the object 
creation/destruction, it would be appreciated!

Also Shaohua Li stopped this patch coming from his tree since the issue was 
caught at that time, so we are not going to merge this until we figure it out. 

Best Regards,
Elena.

> 
> cheers
> 
> 
> [0.230738] md: Waiting for all devices to be available before autodetect
> [0.230742] md: If you don't use raid, use raid=noautodetect
> [0.230962] refcount_t: increment on 0; use-after-free.
> [0.230988] [ cut here ]
> [0.230996] WARNING: CPU: 0 PID: 1 at lib/refcount.c:114
> .refcount_inc+0x5c/0x70
> [0.231001] Modules linked in:
> [0.231006] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc1-gccN-next-
> 20170310-g5be4921 #1
> [0.231012] task: c0004940 task.stack: c0004944
> [0.231016] NIP: c05ac6bc LR: c05ac6b8 CTR:
> c0743390
> [0.231021] REGS: c00049443160 TRAP: 0700   Not tainted  (4.11.0-rc1-
> gccN-next-20170310-g5be4921)
> [0.231026] MSR: 80029032 
> [0.231033]   CR: 24024422  XER: 000c
> [0.231038] CFAR: c0a5356c SOFTE: 1
> [0.231038] GPR00: c05ac6b8 c000494433e0 c1079d00
> 002b
> [0.231038] GPR04:  00ef 
> c10418a0
> [0.231038] GPR08: 4af8 c0ecc9a8 c0ecc9a8
> 
> [0.231038] GPR12: 28024824 c6bb 
> c00049443a00
> [0.231038] GPR16:  c00049443a10 
> 
> [0.231038] GPR20:   c0f7dd20
> 
> [0.231038] GPR24: 014080c0 c12060b8 c1206080
> 0009
> [0.231038] GPR28: c0f7dde0 0090 
> c000461ae800
> [0.231100] NIP [c05ac6bc] .refcount_inc+0x5c/0x70
> [0.231104] LR [c05ac6b8] .refcount_inc+0x58/0x70
> [0.231108] Call Trace:
> [0.231112] [c000494433e0] [c05ac6b8] .refcount_inc+0x58/0x70
> (unreliable)
> [0.231120] [c00049443450] [c086c008]
> .mddev_find+0x1e8/0x430
> [0.231125] [c00049443530] [c0872b6c] .md_open+0x2c/0x140
> [0.231132] [c000494435c0] [c03962a4]
> .__blkdev_get+0xd4/0x520
> [0.231138] [c00049443690] [c0396cc0] .blkdev_get+0x1c0/0x4f0
> [0.231145] [c00049443790] [c0336d64]
> .do_dentry_open.isra.1+0x2a4/0x410
> [0.231152] [c00049443830] [c03523f4]
> .path_openat+0x624/0x1580
> [0.231157] [c00049443990] [c0354ce4]
> .do_filp_open+0x84/0x120
> [0.231163] [c00049443b10] [c0338d74]
> .do_sys_open+0x214/0x300
> [0.231170] [c00049443be0] [c0da69ac]
> .md_run_setup+0xa0/0xec
> [0.231176] [c00049443c60] [c0da4fbc]
> .prepare_namespace+0x60/0x240
> [0.231182] [c00049443ce0] [c0da47a8]
> .kernel_init_freeable+0x330/0x36c
> [0.231190] [c00049443db0] [c000dc44] .kernel_init+0x24/0x160
> [0.231197] [c00049443e30] [c000badc]
> .ret_from_kernel_thread+0x58/0x7c
> [0.231202] Instruction dump:
> [0.231206] 6000 3d22ffee 89296bfb 2f89 409effdc 3c62ffc6 3921
> 3d42ffee
> [0.231216] 38630928 992a6bfb 484a6e79 6000 <0fe0> 4bb8
> 6000 6000
> [0.231226] ---[ end trace 8c51f269ad91ffc2 ]---
> [0.231233] md: Autodetecting RAID arrays.
> [0.231236] md: autorun ...
> [0.231239] 

Re: [PATCH net-next 1/4] gtp: move TEID hash to per socket structure

2017-03-14 Thread Andreas Schultz


- On Mar 14, 2017, at 12:33 PM, pablo pa...@netfilter.org wrote:

> On Tue, Mar 14, 2017 at 12:25:45PM +0100, Andreas Schultz wrote:
>> @@ -275,9 +280,9 @@ static int gtp1u_udp_encap_recv(struct gtp_dev *gtp, 
>> struct
>> sk_buff *skb)
>>  
>>  gtp1 = (struct gtp1_header *)(skb->data + sizeof(struct udphdr));
>>  
>> -pctx = gtp1_pdp_find(gtp, ntohl(gtp1->tid));
>> +pctx = gtp1_pdp_find(gsk, ntohl(gtp1->tid));
>>  if (!pctx) {
>> -netdev_dbg(gtp->dev, "No PDP ctx to decap skb=%p\n", skb);
>> +pr_debug("No PDP ctx to decap skb=%p\n", skb);
>>  return 1;
> 
> Again, the pr_debug() change has been resurrected.

Yes, at that point in the code, there is no way to resolve the network device.
Therefore the netdev_dbg has to go.

> I already told you: If we are going to have more than one gtp device,
> then this doesn't make sense. I have to repeat things over and over
> again, just because you don't want to rebase your patchset for some
> reason. I can't find any other explanation for this.

Without a PDP context, there is no network device, so no netdev_dbg.
 
> So please remove this debugging rather than rendering it completely
> useless.

ACK
 
> Moreover, this change has nothing to do with this patch, so it breaks
> the one-logical-change-per-patch rule.

This patch moves the incoming TEID hash from the network device to the
socket. This means that gtp1_pdp_find needs to change, so this is related.
For the debug change, see above why it's related.

Andreas


Re: [PATCH net-next 0/4] gtp: support multiple APN's per GTP endpoint

2017-03-14 Thread Andreas Schultz


- On Mar 14, 2017, at 12:45 PM, pablo pa...@netfilter.org wrote:

> On Tue, Mar 14, 2017 at 12:25:44PM +0100, Andreas Schultz wrote:
> [...]
>> API impact:
>> ---
>> 
>> This is probably the most problematic part of this series...
>> 
>> The removal of the TEID from the netdevice also means that the gtp genl API
>> for retrieving tunnel information and removing tunnels needs to be adjusted.
>> 
>> Before this change it was possible to change a GTP tunnel using the gtp
>> netdevice id and the teid. The teid is no longer unique per gtp netdevice.
>> After this change it has to be either the netdevice and MS IP or the GTP
>> socket and teid.
> 
> Then we have to introduce some explicit VRF concept or such to sort
> out this.
> 
> It is definitely not acceptable to break the existing API.

The specific use case of the API that is no longer supported was never used by
anyone. The only supported and documented API for the GTP module is libgtpnl.
libgtpnl has always required the now mandatory fields. Therefore the externally
supported API does not change.

Regards
Andreas


[PATCH v2 2/2] ARM: dts: am335x-icev2: Add CPSW ethernet0 and ethernet1

2017-03-14 Thread Roger Quadros
Enable the 2 ethernet ports as CPSW ports in dual-mac mode

Signed-off-by: Roger Quadros 
[nsek...@ti.com: use AM33XX_IOPAD()]
Signed-off-by: Sekhar Nori 
---
v2:
- use phy-handle instead of phy_id

 arch/arm/boot/dts/am335x-icev2.dts | 121 +
 1 file changed, 121 insertions(+)

diff --git a/arch/arm/boot/dts/am335x-icev2.dts 
b/arch/arm/boot/dts/am335x-icev2.dts
index a2ad076..415cd46 100644
--- a/arch/arm/boot/dts/am335x-icev2.dts
+++ b/arch/arm/boot/dts/am335x-icev2.dts
@@ -201,6 +201,69 @@
AM33XX_IOPAD(0x938, PIN_OUTPUT_PULLUP | MUX_MODE1) /* 
(L16) gmii1_rxd2.uart3_txd */
>;
};
+
+   cpsw_default: cpsw_default {
+   pinctrl-single,pins = <
+   /* Slave 1, RMII mode */
+   AM33XX_IOPAD(0x90c, (PIN_INPUT_PULLUP | MUX_MODE1)) 
/* mii1_crs.rmii1_crs_dv */
+   AM33XX_IOPAD(0x944, (PIN_INPUT_PULLUP | MUX_MODE0)) 
/* rmii1_refclk.rmii1_refclk */
+   AM33XX_IOPAD(0x940, (PIN_INPUT_PULLUP | MUX_MODE1)) 
/* mii1_rxd0.rmii1_rxd0 */
+   AM33XX_IOPAD(0x93c, (PIN_INPUT_PULLUP | MUX_MODE1)) 
/* mii1_rxd1.rmii1_rxd1 */
+   AM33XX_IOPAD(0x910, (PIN_INPUT_PULLUP | MUX_MODE1)) 
/* mii1_rxerr.rmii1_rxerr */
+   AM33XX_IOPAD(0x928, (PIN_OUTPUT_PULLDOWN | MUX_MODE1))  
/* mii1_txd0.rmii1_txd0 */
+   AM33XX_IOPAD(0x924, (PIN_OUTPUT_PULLDOWN | MUX_MODE1))  
/* mii1_txd1.rmii1_txd1 */
+   AM33XX_IOPAD(0x914, (PIN_OUTPUT_PULLDOWN | MUX_MODE1))  
/* mii1_txen.rmii1_txen */
+   /* Slave 2, RMII mode */
+   AM33XX_IOPAD(0x870, (PIN_INPUT_PULLUP | MUX_MODE3)) 
/* gpmc_wait0.rmii2_crs_dv */
+   AM33XX_IOPAD(0x908, (PIN_INPUT_PULLUP | MUX_MODE1)) 
/* mii1_col.rmii2_refclk */
+   AM33XX_IOPAD(0x86c, (PIN_INPUT_PULLUP | MUX_MODE3)) 
/* gpmc_a11.rmii2_rxd0 */
+   AM33XX_IOPAD(0x868, (PIN_INPUT_PULLUP | MUX_MODE3)) 
/* gpmc_a10.rmii2_rxd1 */
+   AM33XX_IOPAD(0x874, (PIN_INPUT_PULLUP | MUX_MODE3)) 
/* gpmc_wpn.rmii2_rxerr */
+   AM33XX_IOPAD(0x854, (PIN_OUTPUT_PULLDOWN | MUX_MODE3))  
/* gpmc_a5.rmii2_txd0 */
+   AM33XX_IOPAD(0x850, (PIN_OUTPUT_PULLDOWN | MUX_MODE3))  
/* gpmc_a4.rmii2_txd1 */
+   AM33XX_IOPAD(0x840, (PIN_OUTPUT_PULLDOWN | MUX_MODE3))  
/* gpmc_a0.rmii2_txen */
+   >;
+   };
+
+   cpsw_sleep: cpsw_sleep {
+   pinctrl-single,pins = <
+   /* Slave 1 reset value */
+   AM33XX_IOPAD(0x90c, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x944, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x940, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x93c, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x910, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x928, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x924, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x914, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+
+   /* Slave 2 reset value */
+   AM33XX_IOPAD(0x870, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x908, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x86c, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x868, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x874, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x854, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x850, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x840, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   >;
+   };
+
+   davinci_mdio_default: davinci_mdio_default {
+   pinctrl-single,pins = <
+   /* MDIO */
+   AM33XX_IOPAD(0x948, (PIN_INPUT_PULLUP | SLEWCTRL_FAST | 
MUX_MODE0)) /* mdio_data.mdio_data */
+   AM33XX_IOPAD(0x94c, (PIN_OUTPUT_PULLUP | MUX_MODE0))
/* mdio_clk.mdio_clk */
+   >;
+   };
+
+   davinci_mdio_sleep: davinci_mdio_sleep {
+   pinctrl-single,pins = <
+   /* MDIO reset value */
+   AM33XX_IOPAD(0x948, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   AM33XX_IOPAD(0x94c, (PIN_INPUT_PULLDOWN | MUX_MODE7))
+   >;
+   };
 };
 
 &i2c0 {
@@ -350,3 +413,61 @@
pinctrl-0 = <&uart3_pins_default>;
status = "okay";
 };
+
+&gpio3 {
+   p4 {
+   gpio-hog;
+   gpios = <4 GPIO_ACTIVE_HIGH>;

[PATCH 3/3] ARM: omap2plus_defconfig: Enable TI Ethernet PHY

2017-03-14 Thread Roger Quadros
DP83848_PHY i.e. [TI TLK10X 10/100 Mbps PHY] is used on the
am335x-icev2 board. Enable the PHY driver for it.

Signed-off-by: Roger Quadros 
---
 arch/arm/configs/omap2plus_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/omap2plus_defconfig 
b/arch/arm/configs/omap2plus_defconfig
index f2462a6..cfa5bac 100644
--- a/arch/arm/configs/omap2plus_defconfig
+++ b/arch/arm/configs/omap2plus_defconfig
@@ -167,6 +167,7 @@ CONFIG_TI_CPTS=y
 # CONFIG_NET_VENDOR_VIA is not set
 # CONFIG_NET_VENDOR_WIZNET is not set
 CONFIG_AT803X_PHY=y
+CONFIG_DP83848_PHY=y
 CONFIG_MICREL_PHY=y
 CONFIG_SMSC_PHY=y
 CONFIG_USB_USBNET=m
-- 
2.7.4




[PATCH] net: Resend IGMP memberships upon peer notification.

2017-03-14 Thread Vladislav Yasevich
When we notify peers of potential changes,  it's also good to update
IGMP memberships.  For example, during VM migration, updating IGMP
memberships will redirect existing multicast streams to the VM at the
new location.

Signed-off-by: Vladislav Yasevich 
---
 net/core/dev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index a229bf0..1ed927d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1272,6 +1272,7 @@ void netdev_notify_peers(struct net_device *dev)
 {
rtnl_lock();
call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, dev);
+   call_netdevice_notifiers(NETDEV_RESEND_IGMP, dev);
rtnl_unlock();
 }
 EXPORT_SYMBOL(netdev_notify_peers);
-- 
2.7.4



[patch net 1/2] mlxsw: reg: Fix SPVM max record count

2017-03-14 Thread Jiri Pirko
From: Jiri Pirko 

The num_rec field is 8 bit, so the maximal count number is 255. This
fixes vlans not being enabled for wider ranges than 255.

Fixes: b2e345f9a454 ("mlxsw: reg: Add Switch Port VID and Switch Port VLAN 
Membership registers definitions")
Signed-off-by: Jiri Pirko 
Reviewed-by: Ido Schimmel 
---
 drivers/net/ethernet/mellanox/mlxsw/reg.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h 
b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index 0899e2d..65e1942 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -769,7 +769,7 @@ static inline void mlxsw_reg_spvid_pack(char *payload, u8 
local_port, u16 pvid)
 #define MLXSW_REG_SPVM_ID 0x200F
 #define MLXSW_REG_SPVM_BASE_LEN 0x04 /* base length, without records */
 #define MLXSW_REG_SPVM_REC_LEN 0x04 /* record length */
-#define MLXSW_REG_SPVM_REC_MAX_COUNT 256
+#define MLXSW_REG_SPVM_REC_MAX_COUNT 255
 #define MLXSW_REG_SPVM_LEN (MLXSW_REG_SPVM_BASE_LEN +  \
MLXSW_REG_SPVM_REC_LEN * MLXSW_REG_SPVM_REC_MAX_COUNT)
 
-- 
2.7.4
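
For illustration only (not part of the patch): a minimal userspace sketch of why 256 does not fit an 8-bit record count, and how a wider VID range ends up split into several register writes. The names pack_num_rec() and spvm_chunks() are hypothetical helpers, not mlxsw functions.

```c
#include <stdint.h>

/* Largest record count an 8-bit num_rec field can carry. */
#define REC_MAX_COUNT 255u

/* Storing a count in an 8-bit field keeps only the low 8 bits. */
uint8_t pack_num_rec(unsigned int count)
{
	return (uint8_t)count;
}

/*
 * Number of register writes needed to cover VIDs [vid_begin, vid_end]
 * when each write holds at most REC_MAX_COUNT records.
 */
unsigned int spvm_chunks(uint16_t vid_begin, uint16_t vid_end)
{
	unsigned int total = (unsigned int)vid_end - vid_begin + 1;

	return (total + REC_MAX_COUNT - 1) / REC_MAX_COUNT;
}
```

With a limit of 256, a full chunk would be packed as num_rec = 0; with 255 the count always survives the 8-bit field, and a 256-VID range simply takes two writes.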



[patch net 2/2] mlxsw: reg: Fix SPVMLR max record count

2017-03-14 Thread Jiri Pirko
From: Jiri Pirko 

The num_rec field is 8 bit, so the maximal count number is 255.
This fixes vlans learning not being enabled for wider ranges than 255.

Fixes: a4feea74cd7a ("mlxsw: reg: Add Switch Port VLAN MAC Learning register 
definition")
Signed-off-by: Jiri Pirko 
Reviewed-by: Ido Schimmel 
---
 drivers/net/ethernet/mellanox/mlxsw/reg.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h 
b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index 65e1942..d9616da 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -1702,7 +1702,7 @@ static inline void mlxsw_reg_sfmr_pack(char *payload,
 #define MLXSW_REG_SPVMLR_ID 0x2020
 #define MLXSW_REG_SPVMLR_BASE_LEN 0x04 /* base length, without records */
 #define MLXSW_REG_SPVMLR_REC_LEN 0x04 /* record length */
-#define MLXSW_REG_SPVMLR_REC_MAX_COUNT 256
+#define MLXSW_REG_SPVMLR_REC_MAX_COUNT 255
 #define MLXSW_REG_SPVMLR_LEN (MLXSW_REG_SPVMLR_BASE_LEN + \
  MLXSW_REG_SPVMLR_REC_LEN * \
  MLXSW_REG_SPVMLR_REC_MAX_COUNT)
-- 
2.7.4



[patch net 0/2] mlxsw: Couple of fixes

2017-03-14 Thread Jiri Pirko
From: Jiri Pirko 

Couple of small fixes.

Jiri Pirko (2):
  mlxsw: reg: Fix SPVM max record count
  mlxsw: reg: Fix SPVMLR max record count

 drivers/net/ethernet/mellanox/mlxsw/reg.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
2.7.4



Re: [RFC v1 for accelerated IPoIB 04/25] IB/verb: Add ipoib_options struct and API

2017-03-14 Thread Erez Shitrit
On Tue, Mar 14, 2017 at 9:01 AM, Vishwanathapura, Niranjana
 wrote:
> On Mon, Mar 13, 2017 at 02:01:36PM -0600, Jason Gunthorpe wrote:
>>>
>>> +   /* multicast */
>>> +   int (*attach_mcast)(struct net_device *dev, struct ib_device
>>> *hca,
>>> +   union ib_gid *gid, u16 lid, int set_qkey);
>>> +   int (*detach_mcast)(struct net_device *dev, struct ib_device
>>> *hca,
>>> +   union ib_gid *gid, u16 lid);
>>
>>
>> It would make more sense to store the struct ib_device pointer in the
>> struct rdma_netdev.
>>
>
> Agree that it shouldn't be a function parameter.
> For opa_vnic, I found it convenient to store ib_device pointer in client and
> device private structures as those will be available in most places anyhow.

Will add it to the rdma_netdev obj, as Jason suggested.
Thanks,

>
> Niranjana


[PATCH net 0/7] qed: Fixes series

2017-03-14 Thread Yuval Mintz
This series addresses several different issues in qed.
The more significant portions:

Patch #1 fixes an issue that would cause a timeout when qedr utilizes the
highest CIDs available for it [or when future qede adapters would utilize
queues in some constellations].

Patch #4 fixes a leak of mapped addresses; When iommu is enabled,
offloaded storage protocols might eventually run out of resources
and fail to map additional buffers.

Patches #6,#7 were missing in the initial iSCSI infrastructure
submissions, and would hamper qedi's stability when it reaches
out-of-order scenarios.

Dave,

Please consider applying these to 'net'.

Thanks,
Yuval

Ram Amrani (2):
  qed: Align CIDs according to DORQ requirement
  qed: Fix interrupt flags on Rx LL2

Tomer Tayar (1):
  qed: Prevent creation of too-big u32-chains

Yuval Mintz (4):
  qed: Fix mapping leak on LL2 rx flow
  qed: Free previous connections when releasing iSCSI
  qed: Correct out-of-bound access in OOO history
  qed: Enable iSCSI Out-of-Order

 drivers/net/ethernet/qlogic/qed/qed_cxt.c   |  3 ++-
 drivers/net/ethernet/qlogic/qed/qed_dev.c   |  5 ++---
 drivers/net/ethernet/qlogic/qed/qed_iscsi.c | 31 +
 drivers/net/ethernet/qlogic/qed/qed_ll2.c   | 11 ++
 drivers/net/ethernet/qlogic/qed/qed_ooo.c   |  2 ++
 5 files changed, 44 insertions(+), 8 deletions(-)

-- 
1.9.3



[PATCH net 1/7] qed: Align CIDs according to DORQ requirement

2017-03-14 Thread Yuval Mintz
From: Ram Amrani 

The Doorbell HW block can be configured at a granularity
of 16 x CIDs, so we need to make sure that the actual number
of CIDs configured would be a multiple of 16.

Today, when RoCE is enabled - given that the number is unaligned,
doorbelling the higher CIDs would fail to reach the firmware and
would eventually timeout.

Fixes: dbb799c39717 ("qed: Initialize hardware for new protocols")
Signed-off-by: Ram Amrani 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_cxt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_cxt.c 
b/drivers/net/ethernet/qlogic/qed/qed_cxt.c
index d42d03d..7e3a6fe 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_cxt.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_cxt.c
@@ -422,8 +422,9 @@ static void qed_cxt_set_proto_cid_count(struct qed_hwfn 
*p_hwfn,
u32 page_sz = p_mgr->clients[ILT_CLI_CDUC].p_size.val;
u32 cxt_size = CONN_CXT_SIZE(p_hwfn);
u32 elems_per_page = ILT_PAGE_IN_BYTES(page_sz) / cxt_size;
+   u32 align = elems_per_page * DQ_RANGE_ALIGN;
 
-   p_conn->cid_count = roundup(p_conn->cid_count, elems_per_page);
+   p_conn->cid_count = roundup(p_conn->cid_count, align);
}
 }
 
-- 
1.9.3
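
For illustration only (not part of the patch): a minimal userspace sketch of the rounding change. It assumes the doorbell granularity is 16, as the commit message states; DQ_RANGE_ALIGN_SKETCH and both helper names are hypothetical stand-ins, not the driver's definitions.

```c
#include <stdint.h>

/* Assumed for this sketch: the doorbell block works on groups of 16. */
#define DQ_RANGE_ALIGN_SKETCH 16u

/* Round x up to the next multiple of y (y must be non-zero). */
uint32_t roundup_u32(uint32_t x, uint32_t y)
{
	return ((x + y - 1) / y) * y;
}

/*
 * Round the CID count up to a multiple of elems_per_page * 16, so the
 * result is both page-aligned and a multiple of the doorbell granularity.
 */
uint32_t align_cid_count(uint32_t cid_count, uint32_t elems_per_page)
{
	uint32_t align = elems_per_page * DQ_RANGE_ALIGN_SKETCH;

	return roundup_u32(cid_count, align);
}
```

With 4 elements per page, rounding 100 CIDs to a page multiple leaves 100 (not a multiple of 16), while the combined alignment yields 128.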



[PATCH net 3/7] qed: Fix mapping leak on LL2 rx flow

2017-03-14 Thread Yuval Mintz
When receiving an Rx LL2 packet, qed fails to unmap the previous buffer.

Fixes: 0a7fb11c23c0 ("qed: Add Light L2 support")
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_ll2.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_ll2.c 
b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
index 5fb34db..29ae5ec 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
@@ -211,6 +211,8 @@ static void qed_ll2b_complete_rx_packet(struct qed_hwfn 
*p_hwfn,
/* If need to reuse or there's no replacement buffer, repost this */
if (rc)
goto out_post;
+   dma_unmap_single(&cdev->pdev->dev, buffer->phys_addr,
+cdev->ll2->rx_size, DMA_FROM_DEVICE);
 
skb = build_skb(buffer->data, 0);
if (!skb) {
-- 
1.9.3



[PATCH net 5/7] qed: Fix interrupt flags on Rx LL2

2017-03-14 Thread Yuval Mintz
From: Ram Amrani 

Before iterating over the LL2 Rx ring, the ring's
spinlock is taken via spin_lock_irqsave().
The actual processing of the packet [including handling
by the protocol driver] is done without said lock,
so qed releases the spinlock and re-claims it afterwards.

Problem is that the final spin_unlock_irqrestore() at the end
of the iteration uses the original flags saved from the
initial irqsave() instead of the flags from the most recent
irqsave(). So it's possible that the interrupt status would
be incorrect at the end of the processing.

Fixes: 0a7fb11c23c0 ("qed: Add Light L2 support")
CC: Ram Amrani 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_ll2.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_ll2.c 
b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
index 29ae5ec..0d3cef4 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
@@ -476,7 +476,7 @@ static int qed_ll2_txq_completion(struct qed_hwfn *p_hwfn, 
void *p_cookie)
 static int qed_ll2_rxq_completion_reg(struct qed_hwfn *p_hwfn,
  struct qed_ll2_info *p_ll2_conn,
  union core_rx_cqe_union *p_cqe,
- unsigned long lock_flags,
+ unsigned long *p_lock_flags,
  bool b_last_cqe)
 {
struct qed_ll2_rx_queue *p_rx = &p_ll2_conn->rx_queue;
@@ -497,10 +497,10 @@ static int qed_ll2_rxq_completion_reg(struct qed_hwfn 
*p_hwfn,
  "Mismatch between active_descq and the LL2 Rx 
chain\n");
list_add_tail(&p_pkt->list_entry, &p_rx->free_descq);
 
-   spin_unlock_irqrestore(&p_rx->lock, lock_flags);
+   spin_unlock_irqrestore(&p_rx->lock, *p_lock_flags);
qed_ll2b_complete_rx_packet(p_hwfn, p_ll2_conn->my_id,
p_pkt, &p_cqe->rx_cqe_fp, b_last_cqe);
-   spin_lock_irqsave(&p_rx->lock, lock_flags);
+   spin_lock_irqsave(&p_rx->lock, *p_lock_flags);
 
return 0;
 }
@@ -540,7 +540,8 @@ static int qed_ll2_rxq_completion(struct qed_hwfn *p_hwfn, 
void *cookie)
break;
case CORE_RX_CQE_TYPE_REGULAR:
rc = qed_ll2_rxq_completion_reg(p_hwfn, p_ll2_conn,
-   cqe, flags, b_last_cqe);
+   cqe, &flags,
+   b_last_cqe);
break;
default:
rc = -EIO;
-- 
1.9.3
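
For illustration only (not part of the patch): a userspace simulation of the stale-flags problem. Nothing here is kernel API; irqs_enabled, fake_lock_irqsave() and fake_unlock_irqrestore() are hypothetical stand-ins for the CPU interrupt state and the spinlock helpers, and process_packet() is assumed to change the interrupt state while the lock is dropped.

```c
#include <stdbool.h>

/* Simulated interrupt-enable state; stands in for the real CPU flag. */
bool irqs_enabled = true;

/* Like spin_lock_irqsave(): remember the current state, then disable. */
void fake_lock_irqsave(unsigned long *flags)
{
	*flags = irqs_enabled;
	irqs_enabled = false;
}

/* Like spin_unlock_irqrestore(): put back the remembered state. */
void fake_unlock_irqrestore(unsigned long flags)
{
	irqs_enabled = (flags != 0);
}

/* Assumed for the sketch: processing leaves interrupts disabled. */
void process_packet(void)
{
	irqs_enabled = false;
}

/* Old shape: the re-saved flags only update the callee's local copy. */
void completion_by_value(unsigned long flags)
{
	fake_unlock_irqrestore(flags);
	process_packet();
	fake_lock_irqsave(&flags);
}

/* New shape: the caller sees the flags from the most recent save. */
void completion_by_pointer(unsigned long *p_flags)
{
	fake_unlock_irqrestore(*p_flags);
	process_packet();
	fake_lock_irqsave(p_flags);
}

/* Run one lock/complete/unlock cycle and report the final state. */
bool final_irq_state(bool by_pointer)
{
	unsigned long flags;

	irqs_enabled = true;
	fake_lock_irqsave(&flags);
	if (by_pointer)
		completion_by_pointer(&flags);
	else
		completion_by_value(flags);
	fake_unlock_irqrestore(flags);
	return irqs_enabled;
}
```

In the by-value variant the final restore uses the flags from the first save, so the simulated interrupts end up enabled even though they were disabled at the most recent save; the by-pointer variant restores the later state.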



[PATCH net 7/7] qed: Enable iSCSI Out-of-Order

2017-03-14 Thread Yuval Mintz
Missing in the initial submission, qed fails to propagate qedi's
request to enable OOO to firmware.

Fixes: fc831825f99e ("qed: Add support for hardware offloaded iSCSI")
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_iscsi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_iscsi.c 
b/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
index 7e73b05..098766f 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
@@ -190,6 +190,9 @@ struct qed_iscsi_conn {
p_init->num_sq_pages_in_ring = p_params->num_sq_pages_in_ring;
p_init->num_r2tq_pages_in_ring = p_params->num_r2tq_pages_in_ring;
p_init->num_uhq_pages_in_ring = p_params->num_uhq_pages_in_ring;
+   p_init->ooo_enable = p_params->ooo_enable;
+   p_init->ll2_rx_queue_id = p_hwfn->hw_info.resc_start[QED_LL2_QUEUE] +
+ p_params->ll2_ooo_queue_id;
p_init->func_params.log_page_size = p_params->log_page_size;
val = p_params->num_tasks;
p_init->func_params.num_tasks = cpu_to_le16(val);
-- 
1.9.3



[PATCH net 6/7] qed: Correct out-of-bound access in OOO history

2017-03-14 Thread Yuval Mintz
Need to set the number of entries in database, otherwise the logic
would quickly surpass the array.

Fixes: 1d6cff4fca43 ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_ooo.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_ooo.c 
b/drivers/net/ethernet/qlogic/qed/qed_ooo.c
index 7d731c6..378afce 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_ooo.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_ooo.c
@@ -159,6 +159,8 @@ struct qed_ooo_info *qed_ooo_alloc(struct qed_hwfn *p_hwfn)
if (!p_ooo_info->ooo_history.p_cqes)
goto no_history_mem;
 
+   p_ooo_info->ooo_history.num_of_cqes = QED_MAX_NUM_OOO_HISTORY_ENTRIES;
+
return p_ooo_info;
 
 no_history_mem:
-- 
1.9.3



[PATCH net 4/7] qed: Free previous connections when releasing iSCSI

2017-03-14 Thread Yuval Mintz
Fixes: fc831825f99e ("qed: Add support for hardware offloaded iSCSI")
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_iscsi.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_iscsi.c 
b/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
index 3a44d6b..7e73b05 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
@@ -786,6 +786,23 @@ static void qed_iscsi_release_connection(struct qed_hwfn 
*p_hwfn,
spin_unlock_bh(&p_hwfn->p_iscsi_info->lock);
 }
 
+void qed_iscsi_free_connection(struct qed_hwfn *p_hwfn,
+  struct qed_iscsi_conn *p_conn)
+{
+   qed_chain_free(p_hwfn->cdev, &p_conn->xhq);
+   qed_chain_free(p_hwfn->cdev, &p_conn->uhq);
+   qed_chain_free(p_hwfn->cdev, &p_conn->r2tq);
+   dma_free_coherent(&p_hwfn->cdev->pdev->dev,
+ sizeof(struct tcp_upload_params),
+ p_conn->tcp_upload_params_virt_addr,
+ p_conn->tcp_upload_params_phys_addr);
+   dma_free_coherent(&p_hwfn->cdev->pdev->dev,
+ sizeof(struct scsi_terminate_extra_params),
+ p_conn->queue_cnts_virt_addr,
+ p_conn->queue_cnts_phys_addr);
+   kfree(p_conn);
+}
+
 struct qed_iscsi_info *qed_iscsi_alloc(struct qed_hwfn *p_hwfn)
 {
struct qed_iscsi_info *p_iscsi_info;
@@ -807,6 +824,17 @@ void qed_iscsi_setup(struct qed_hwfn *p_hwfn,
 void qed_iscsi_free(struct qed_hwfn *p_hwfn,
struct qed_iscsi_info *p_iscsi_info)
 {
+   struct qed_iscsi_conn *p_conn = NULL;
+
+   while (!list_empty(&p_hwfn->p_iscsi_info->free_list)) {
+   p_conn = list_first_entry(&p_hwfn->p_iscsi_info->free_list,
+ struct qed_iscsi_conn, list_entry);
+   if (p_conn) {
+   list_del(&p_conn->list_entry);
+   qed_iscsi_free_connection(p_hwfn, p_conn);
+   }
+   }
+
kfree(p_iscsi_info);
 }
 
-- 
1.9.3



[PATCH net 2/7] qed: Prevent creation of too-big u32-chains

2017-03-14 Thread Yuval Mintz
From: Tomer Tayar 

Current logic would allow the creation of a chain with U32_MAX + 1
elements, when the actual maximum supported by the driver infrastructure
is U32_MAX.

Fixes: a91eb52abb50 ("qed: Revisit chain implementation")
Signed-off-by: Tomer Tayar 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_dev.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c 
b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index e2a081c..e518f91 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -2389,9 +2389,8 @@ void qed_chain_free(struct qed_dev *cdev, struct 
qed_chain *p_chain)
 * size/capacity fields are of a u32 type.
 */
if ((cnt_type == QED_CHAIN_CNT_TYPE_U16 &&
-chain_size > 0x10000) ||
-   (cnt_type == QED_CHAIN_CNT_TYPE_U32 &&
-chain_size > 0x100000000ULL)) {
+chain_size > ((u32)U16_MAX + 1)) ||
+   (cnt_type == QED_CHAIN_CNT_TYPE_U32 && chain_size > U32_MAX)) {
DP_NOTICE(cdev,
  "The actual chain size (0x%llx) is larger than the 
maximal possible value\n",
  chain_size);
-- 
1.9.3
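
For illustration only (not part of the patch): a minimal userspace sketch of the corrected limit check. The enum and chain_size_ok() are hypothetical names; only the two bounds are taken from the diff above.

```c
#include <stdbool.h>
#include <stdint.h>

enum chain_cnt_type { CNT_TYPE_U16, CNT_TYPE_U32 };

/*
 * Return true when chain_size is acceptable for the counter type:
 * up to U16_MAX + 1 elements for u16 chains, up to U32_MAX for u32.
 */
bool chain_size_ok(enum chain_cnt_type type, uint64_t chain_size)
{
	if (type == CNT_TYPE_U16)
		return chain_size <= (uint64_t)UINT16_MAX + 1;

	return chain_size <= UINT32_MAX;
}
```

A u32 chain of exactly U32_MAX + 1 elements, which the previous check let through, is now rejected.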



FW: [PATCH net 0/7] qed: Fixes series

2017-03-14 Thread Mintz, Yuval
> Dave,
> 
> Please consider applying these to 'net'.

Apparently I failed to add you to the e-mail and sent this only to netdev.
Does that suffice for your needs or do you need a re-send?



net: deadlock between ip_expire/sch_direct_xmit

2017-03-14 Thread Dmitry Vyukov
Hello,

I've got the following deadlock report while running syzkaller fuzzer
on net-next/92cd12c5ed432c5eebd2462d666772a8d8442c3b:


[ INFO: possible circular locking dependency detected ]
4.10.0+ #29 Not tainted
---
modprobe/12392 is trying to acquire lock:
 (_xmit_ETHER#2){+.-...}, at: [] spin_lock
include/linux/spinlock.h:299 [inline]
 (_xmit_ETHER#2){+.-...}, at: [] __netif_tx_lock
include/linux/netdevice.h:3486 [inline]
 (_xmit_ETHER#2){+.-...}, at: []
sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180

but task is already holding lock:
 (&(&q->lock)->rlock){+.-...}, at: [] spin_lock
include/linux/spinlock.h:299 [inline]
 (&(&q->lock)->rlock){+.-...}, at: []
ip_expire+0x51/0x6c0 net/ipv4/ip_fragment.c:201

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&(&q->lock)->rlock){+.-...}:
   validate_chain kernel/locking/lockdep.c:2267 [inline]
   __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
   lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
   _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
   spin_lock include/linux/spinlock.h:299 [inline]
   ip_defrag+0x3a2/0x4130 net/ipv4/ip_fragment.c:669
   ip_check_defrag+0x4e3/0x8b0 net/ipv4/ip_fragment.c:713
   packet_rcv_fanout+0x282/0x800 net/packet/af_packet.c:1459
   deliver_skb net/core/dev.c:1834 [inline]
   dev_queue_xmit_nit+0x294/0xa90 net/core/dev.c:1890
   xmit_one net/core/dev.c:2903 [inline]
   dev_hard_start_xmit+0x16b/0xab0 net/core/dev.c:2923
   sch_direct_xmit+0x31f/0x6d0 net/sched/sch_generic.c:182
   __dev_xmit_skb net/core/dev.c:3092 [inline]
   __dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358
   dev_queue_xmit+0x17/0x20 net/core/dev.c:3423
   neigh_resolve_output+0x6b9/0xb10 net/core/neighbour.c:1308
   neigh_output include/net/neighbour.h:478 [inline]
   ip_finish_output2+0x8b8/0x15a0 net/ipv4/ip_output.c:228
   ip_do_fragment+0x1d93/0x2720 net/ipv4/ip_output.c:672
   ip_fragment.constprop.54+0x145/0x200 net/ipv4/ip_output.c:545
   ip_finish_output+0x82d/0xe10 net/ipv4/ip_output.c:314
   NF_HOOK_COND include/linux/netfilter.h:246 [inline]
   ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404
   dst_output include/net/dst.h:486 [inline]
   ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124
   ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492
   ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512
   raw_sendmsg+0x26de/0x3a00 net/ipv4/raw.c:655
   inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
   sock_sendmsg_nosec net/socket.c:633 [inline]
   sock_sendmsg+0xca/0x110 net/socket.c:643
   ___sys_sendmsg+0x4a3/0x9f0 net/socket.c:1985
   __sys_sendmmsg+0x25c/0x750 net/socket.c:2075
   SYSC_sendmmsg net/socket.c:2106 [inline]
   SyS_sendmmsg+0x35/0x60 net/socket.c:2101
   do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
   return_from_SYSCALL_64+0x0/0x7a

-> #0 (_xmit_ETHER#2){+.-...}:
   check_prev_add kernel/locking/lockdep.c:1830 [inline]
   check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1940
   validate_chain kernel/locking/lockdep.c:2267 [inline]
   __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
   lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
   _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
   spin_lock include/linux/spinlock.h:299 [inline]
   __netif_tx_lock include/linux/netdevice.h:3486 [inline]
   sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180
   __dev_xmit_skb net/core/dev.c:3092 [inline]
   __dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358
   dev_queue_xmit+0x17/0x20 net/core/dev.c:3423
   neigh_hh_output include/net/neighbour.h:468 [inline]
   neigh_output include/net/neighbour.h:476 [inline]
   ip_finish_output2+0xf6c/0x15a0 net/ipv4/ip_output.c:228
   ip_finish_output+0xa29/0xe10 net/ipv4/ip_output.c:316
   NF_HOOK_COND include/linux/netfilter.h:246 [inline]
   ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404
   dst_output include/net/dst.h:486 [inline]
   ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124
   ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492
   ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512
   icmp_push_reply+0x372/0x4d0 net/ipv4/icmp.c:394
   icmp_send+0x156c/0x1c80 net/ipv4/icmp.c:754
   ip_expire+0x40e/0x6c0 net/ipv4/ip_fragment.c:239
   call_timer_fn+0x241/0x820 kernel/time/timer.c:1268
   expire_timers kernel/time/timer.c:1307 [inline]
   __run_timers+0x960/0xcf0 kernel/time/timer.c:1601
   run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
   __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
   invoke_softirq 

Re: [PATCH net-next] mlx4: Better use of order-0 pages in RX path

2017-03-14 Thread Eric Dumazet
On Mon, Mar 13, 2017 at 9:57 PM, Alexei Starovoitov
 wrote:
> On Mon, Mar 13, 2017 at 06:02:11PM -0700, Eric Dumazet wrote:
>> On Mon, 2017-03-13 at 16:40 -0700, Alexei Starovoitov wrote:
>>
>> > that's not how it works. It's a job of submitter to prove
>> > that additional code doesn't cause regressions especially
>> > when there are legitimate concerns.
>>
>> This test was moved out of the mlx4_en_prepare_rx_desc() section into
>> the XDP_TX code path.
>>
>>
>> if (ring->page_cache.index > 0) {
>> /* XDP uses a single page per frame */
>> if (!frags->page) {
>> ring->page_cache.index--;
>> frags->page = 
>> ring->page_cache.buf[ring->page_cache.index].page;
>> frags->dma  = 
>> ring->page_cache.buf[ring->page_cache.index].dma;
>> }
>> frags->page_offset = XDP_PACKET_HEADROOM;
>> rx_desc->data[0].addr = cpu_to_be64(frags->dma +
>> XDP_PACKET_HEADROOM);
>> return 0;
>> }
>>
>> Can you check again your claim, because I see no additional cost
>> for XDP_TX.
>
> Let's look at what it was:
> - xdp_tx xmits the page regardless whether driver can replenish
> - at the end of the napi mlx4_en_refill_rx_buffers() will replenish
> rx in bulk either from page_cache or by allocating one page at a time
>
> after the changes:
> - xdp_tx will check page_cache; if it's empty it will try to do
> order 10 (ten!!) alloc, will fail, will try to alloc single page,
> will xmit the packet, and will place just allocated page into rx ring.
> on the next packet in the same napi loop, it will try to allocate
> order 9 (since the cache is still empty), will fail, will try single
> page, succeed... next packet will try order 8 and so on.
> And that spiky order 10 allocations will be happening 4 times a second
> due to new logic in mlx4_en_recover_from_oom().
> We may get lucky and order 2 alloc will succeed, but before that
> we will waste tons of cycles.
> If an attacker somehow makes existing page recycling logic not effective,
> the xdp performance will be limited by order0 page allocator.
> Not great, but still acceptable.
> After this patch it will just tank due to this crazy scheme.
> Yet you're not talking about this new behavior in the commit log.
> You didn't test XDP at all and still claiming that everything is fine ?!
> NACK


Ouch. You NACK a patch based on fears from your side, just because I
said I would not spend time on XDP at this very moment.

We hardly allocate a page in our workloads, and a failed attempt to
get an order-10 page
with GFP_ATOMIC has exactly the same cost as a failed attempt of
order-0 or order-1 or order-X,
the buddy page allocator gives that for free.

So I will leave this to Mellanox for XDP tests and upstreaming this,
and will stop arguing with you, this is going nowhere.

I suggested some changes but you blocked everything just because I
publicly said that I would not use XDP,
which added a serious mess to this driver.


Re: [PATCH net-next 0/4] gtp: support multiple APN's per GTP endpoint

2017-03-14 Thread Pablo Neira Ayuso
On Tue, Mar 14, 2017 at 01:42:44PM +0100, Andreas Schultz wrote:
> > It is definitely not acceptable to break the existing API.
> 
> The specific use case of the API that is no longer supported was never used by
> anyone. [...]

Yes, this was used by openggsn and I tested this with a full blown FOSS
setup. Yes, a toy thing compared to the proprietary equipment you deal
with, but we always started with things like this.

> The only supported and documented API for the GTP module is libgtpnl.

No, the netlink interface itself is the API.

Stop trying to find a reason to break API, that is a no-go.


Re: [PATCH net-next 2/4] net-next: dsa: add Mediatek tag RX/TX handler

2017-03-14 Thread Vivien Didelot
Hi Sean,

Sean Wang  writes:

>> This won't apply, the port index in now stored in p->dp->index.
>
> It seems that I need to upgrade to newer kernel to verify this

Correct. In fact every time you send patches to net-next (or any other
subsystem branch), you must rebase your patch series onto the latest
version of that tree, in order to avoid eventual conflicts.

Thanks,

Vivien


Re: [PATCH net-next 2/4] gtp: add genl cmd to enable GTP encapsulation on UDP socket

2017-03-14 Thread Pablo Neira Ayuso
On Tue, Mar 14, 2017 at 01:28:25PM +0100, Andreas Schultz wrote:
> - On Mar 14, 2017, at 12:43 PM, pablo pa...@netfilter.org wrote:
[...]
> A GTP entity serves multiple local IP endpoints. It manages outgoing tunnels
> by the local APN/VRF, source IP, destination IP and remote tunnel id, incoming
> tunnels are managed by the local destination IP, local tunnel id and VRF/APN.
> 
> Therefore a PDP context needs the following attributes:
> 
>  * local source/destination IP (and port - but that's for different series)
>  * remote destination IP
>  * local and remote TEID
>  * VRF/APN
[...]
> I'm not sure if this pseudo GTP entity root device fits well with
> other networking concepts. And more over, I can't really see the need
> for such a construct.

Some sort of top-level structure that wraps all these objects is
needed, and that can be a new VRF object itself.

You can add a netlink interface to add/dump/delete VRFs, this VRF
database would be *global* to the GSN. At VRF creation, you attach the
socket and the GTP device. You can share sockets between VRFs. PDP
context objects would be added to the corresponding VRF *not to the
socket*, but actually this will result in inserting this PDP context
into the socket hashtable and the GTP device hashtable.

We need to introduce a default VRF that is assumed to always exist,
and that userspace cannot remove, so things don't break backward. If
no VRF is specified, then we attach things to this default VRF.
Actually, look at this from a different angle: the existing driver is
just supporting *one single VRF* at this moment so we just have to
represent this explicitly. Breaking existing API is a no-go.

This explicit VRF concept would also allow us to dump PDP contexts
that belong to a given VRF, by simply indicating the VRF unique id.
Jamal already requested that we extend iproute2 to have a command to
inspect the gtp driver. We cannot escape this: we should allow
standalone tools to inspect the gtp datapath as we do with other
existing tunnel drivers, no matter what daemon in userspace implements
the control plane.

[...]
> I think that the user space control instance should own the tunnels and
> only use the kernel facility to manage them. When the user space instance
> goes away, so should the tunnels.

This doesn't allow daemon hot restart for whatever administrative
reason without affecting existing traffic. The kernel owns the
datapath indeed, that includes tunnels.

> From that perspective, I want to keep the kernel facilities to the
> absolute needed minimum.

If some simple abstraction that we can insert makes this whole thing
more maintainable, then it makes sense to consider it.

This is all about not exposing the internal layout of the
representation you use for a very specific reason: The more you expose
internal details to userspace, the more problems we'll have to extend
things later on in the future. And don't try to be smart and say:
"Hey, I already know every usecase we will have in the future" because
that is not true.
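
For illustration only: a minimal userspace sketch of the default-VRF idea described above, i.e. a global VRF table in which one VRF always exists, cannot be removed, and receives PDP contexts when no VRF is specified. Every name here (vrf_db, VRF_DEFAULT, VRF_UNSPEC, pdp_add, ...) is hypothetical; this is not the gtp driver's data model or its netlink API.

```c
#include <stdbool.h>
#include <stdint.h>

#define VRF_MAX     8u
#define VRF_DEFAULT 0u          /* always present, never removable */
#define VRF_UNSPEC  UINT32_MAX  /* caller did not name a VRF */

struct vrf {
	bool in_use;
	unsigned int pdp_count;     /* PDP contexts attached to this VRF */
};

struct vrf_db {
	struct vrf vrfs[VRF_MAX];
};

/* Start with only the default VRF, so existing users keep working. */
void vrf_db_init(struct vrf_db *db)
{
	for (uint32_t i = 0; i < VRF_MAX; i++)
		db->vrfs[i] = (struct vrf){ .in_use = false, .pdp_count = 0 };
	db->vrfs[VRF_DEFAULT].in_use = true;
}

/* Create a VRF; returns 0 on success, -1 if invalid or already there. */
int vrf_add(struct vrf_db *db, uint32_t id)
{
	if (id >= VRF_MAX || db->vrfs[id].in_use)
		return -1;
	db->vrfs[id].in_use = true;
	return 0;
}

/* Delete a VRF; the default VRF is refused. */
int vrf_del(struct vrf_db *db, uint32_t id)
{
	if (id == VRF_DEFAULT || id >= VRF_MAX || !db->vrfs[id].in_use)
		return -1;
	db->vrfs[id] = (struct vrf){ .in_use = false, .pdp_count = 0 };
	return 0;
}

/* Attach a PDP context; returns the VRF id used, or -1 on error. */
int pdp_add(struct vrf_db *db, uint32_t vrf_id)
{
	if (vrf_id == VRF_UNSPEC)
		vrf_id = VRF_DEFAULT;
	if (vrf_id >= VRF_MAX || !db->vrfs[vrf_id].in_use)
		return -1;
	db->vrfs[vrf_id].pdp_count++;
	return (int)vrf_id;
}
```

A request that names no VRF lands in VRF 0, which mirrors the "one single VRF" behaviour of the existing driver; dumping the contexts of one VRF then only needs its id.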


Re: [PATCH net-next] net: dsa: mv88e6xxx: debug ATU Age Time

2017-03-14 Thread Vivien Didelot
Hi Andrew,

Andrew Lunn  writes:

>> The never ever seeing R/W failure on MDIO bus is not exactly accurate.
>> We had with art (atheros calibration tool) the problem that interrupts
>> were being disabled which led to MDIO operations running into
>> timeout/failing.
>
> Yes, i've seen similar with power management bugs for the MDIO
> driver. But you get a cascade of failures, lots of warnings and error
> prints, it is clear something bad has happened, and the switch is in
> an inconsistent state. So having one more debug print which is also
> inconsistent does no real harm.
>
> Anyway, this whole conversation has taken more effort than just making
> this simple change to remove a few lines of code. So lets drop it and
> move on.

I don't understand nor agree with the fact that sometimes it's OK to not
check for errors, based on one developer's assumptions. Not checking
return code is wrong and very likely error-prone.

If you really want to stand for that point, please send a patch series
which turns mv88e6xxx_read() and mv88e6xxx_write() into void functions.
I'd be glad to review and discuss this further. That would indeed make
*all* the driver code simpler.

Thanks,

Vivien


[PATCH v4] {net,IB}/{rxe,usnic}: Utilize generic mac to eui32 function

2017-03-14 Thread Yuval Shaia
This logic seems to be duplicated in (at least) three separate files.
Move it to one place so the code can be reused.

Signed-off-by: Yuval Shaia 
---
v0 -> v1:
* Add missing #include
* Rename to genaddrconf_ifid_eui48
v1 -> v2:
* Reset eui[0] to default if dev_id is used
v2 -> v3:
* Add helper function to avoid re-setting eui[0] to default if
  dev_id is used
v3 -> v4:
* Remove RXE wrappers
* Remove addrconf_addr_eui48_xor and do the eui[0] ^= 2 in the
  basic implementation
---
 drivers/infiniband/hw/usnic/usnic_common_util.h | 11 +++---
 drivers/infiniband/sw/rxe/rxe.c |  4 +++-
 drivers/infiniband/sw/rxe/rxe_loc.h |  2 --
 drivers/infiniband/sw/rxe/rxe_net.c | 28 -
 drivers/infiniband/sw/rxe/rxe_verbs.c   |  4 +++-
 include/net/addrconf.h  | 22 +++
 6 files changed, 27 insertions(+), 44 deletions(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_common_util.h 
b/drivers/infiniband/hw/usnic/usnic_common_util.h
index b54986d..d91b035 100644
--- a/drivers/infiniband/hw/usnic/usnic_common_util.h
+++ b/drivers/infiniband/hw/usnic/usnic_common_util.h
@@ -34,6 +34,8 @@
 #ifndef USNIC_CMN_UTIL_H
 #define USNIC_CMN_UTIL_H
 
+#include <net/addrconf.h>
+
 static inline void
 usnic_mac_to_gid(const char *const mac, char *raw_gid)
 {
@@ -57,14 +59,7 @@ usnic_mac_ip_to_gid(const char *const mac, const __be32 
inaddr, char *raw_gid)
raw_gid[1] = 0x80;
memset(&raw_gid[2], 0, 2);
memcpy(&raw_gid[4], &inaddr, 4);
-   raw_gid[8] = mac[0]^2;
-   raw_gid[9] = mac[1];
-   raw_gid[10] = mac[2];
-   raw_gid[11] = 0xff;
-   raw_gid[12] = 0xfe;
-   raw_gid[13] = mac[3];
-   raw_gid[14] = mac[4];
-   raw_gid[15] = mac[5];
+   addrconf_addr_eui48(&raw_gid[8], mac);
 }
 
 static inline void
diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index b12dd9b..e93f81f 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -31,6 +31,7 @@
  * SOFTWARE.
  */
 
+#include <net/addrconf.h>
 #include "rxe.h"
 #include "rxe_loc.h"
 
@@ -178,7 +179,8 @@ static int rxe_init_ports(struct rxe_dev *rxe)
return -ENOMEM;
 
port->pkey_tbl[0] = 0x;
-   port->port_guid = rxe_port_guid(rxe);
+   addrconf_addr_eui48((unsigned char *)&port->port_guid,
+   rxe->ndev->dev_addr);
 
spin_lock_init(&port->port_lock);
 
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h 
b/drivers/infiniband/sw/rxe/rxe_loc.h
index 272337e..13ed8b5 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -145,7 +145,6 @@ int advance_dma_data(struct rxe_dma_info *dma, unsigned int 
length);
 int rxe_loopback(struct sk_buff *skb);
 int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt,
 struct sk_buff *skb);
-__be64 rxe_port_guid(struct rxe_dev *rxe);
 struct sk_buff *rxe_init_packet(struct rxe_dev *rxe, struct rxe_av *av,
int paylen, struct rxe_pkt_info *pkt);
 int rxe_prepare(struct rxe_dev *rxe, struct rxe_pkt_info *pkt,
@@ -153,7 +152,6 @@ int rxe_prepare(struct rxe_dev *rxe, struct rxe_pkt_info 
*pkt,
 enum rdma_link_layer rxe_link_layer(struct rxe_dev *rxe, unsigned int 
port_num);
 const char *rxe_parent_name(struct rxe_dev *rxe, unsigned int port_num);
 struct device *rxe_dma_device(struct rxe_dev *rxe);
-__be64 rxe_node_guid(struct rxe_dev *rxe);
 int rxe_mcast_add(struct rxe_dev *rxe, union ib_gid *mgid);
 int rxe_mcast_delete(struct rxe_dev *rxe, union ib_gid *mgid);
 
diff --git a/drivers/infiniband/sw/rxe/rxe_net.c 
b/drivers/infiniband/sw/rxe/rxe_net.c
index d8610960..43b1a0c 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -84,34 +84,6 @@ struct rxe_dev *get_rxe_by_name(const char *name)
 
 struct rxe_recv_sockets recv_sockets;
 
-static __be64 rxe_mac_to_eui64(struct net_device *ndev)
-{
-   unsigned char *mac_addr = ndev->dev_addr;
-   __be64 eui64;
-   unsigned char *dst = (unsigned char *)&eui64;
-
-   dst[0] = mac_addr[0] ^ 2;
-   dst[1] = mac_addr[1];
-   dst[2] = mac_addr[2];
-   dst[3] = 0xff;
-   dst[4] = 0xfe;
-   dst[5] = mac_addr[3];
-   dst[6] = mac_addr[4];
-   dst[7] = mac_addr[5];
-
-   return eui64;
-}
-
-__be64 rxe_node_guid(struct rxe_dev *rxe)
-{
-   return rxe_mac_to_eui64(rxe->ndev);
-}
-
-__be64 rxe_port_guid(struct rxe_dev *rxe)
-{
-   return rxe_mac_to_eui64(rxe->ndev);
-}
-
 struct device *rxe_dma_device(struct rxe_dev *rxe)
 {
struct net_device *ndev;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c 
b/drivers/infiniband/sw/rxe/rxe_verbs.c
index d2e2eff..09f1bec 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -31,6 +31,7 @@
  * SOFTWARE.
  */
 
+#include 
 #include "rxe.h"
 

Re: [PATCH net-next 0/4] gtp: support multiple APN's per GTP endpoint

2017-03-14 Thread Harald Welte
Hi Andreas,

On Tue, Mar 14, 2017 at 02:42:16PM +0100, Pablo Neira Ayuso wrote:
> On Tue, Mar 14, 2017 at 01:42:44PM +0100, Andreas Schultz wrote:
> > The only supported and documented API for the GTP module is libgtpnl.
> 
> No, the netlink interface itself is the API.
> 
> Stop trying to find a reason to break API, that is a no-go.

As much as one might dislike it as a developer in this particular case,
the Linux kernel has the very well communicated rule: All userspace
visible interfaces must not change in an incompatible way.  This
includes of course all the syscalls, the ioctl() parameters but also the
netlink interfaces of the networking stack.

The statement "nobody ever used it" is a statement you can never make in
FOSS software, as you don't know 99.999% of all the users of your
software.  The fact that none of the FOSS projects that any of us was
involved in has used a certain feature doesn't mean nobody else has been
using it privately, quietly.  Keep in mind that several Linux
distributions have meanwhile been shipping the gtp module as part of their
stable releases.

Also, no matter what Pablo or I may think about, there are general rules
about how Linux kernel development is done (from coding style to merge
windows, and also userspace compatibility), and we all have to obey
them.  There's little point in discussing them; we all just have
to live with them.

Regards,
Harald
-- 
- Harald Welte           http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)


Re: [PATCH net-next] net: dsa: mv88e6xxx: debug ATU Age Time

2017-03-14 Thread Andrew Lunn
On Tue, Mar 14, 2017 at 09:56:41AM -0400, Vivien Didelot wrote:
> Hi Andrew,
> 
> Andrew Lunn  writes:
> 
> >> The claim of never ever seeing R/W failures on the MDIO bus is not exactly
> >> accurate. With art (the Atheros calibration tool) we had the problem that
> >> interrupts were being disabled, which led to MDIO operations running into
> >> timeouts/failing.
> >
> > Yes, i've seen similar with power management bugs for the MDIO
> > driver. But you get a cascade of failures, lots of warnings and error
> > prints, it is clear something bad has happened, and the switch is in
> > an inconsistent state. So having one more debug print which is also
> > inconsistent does no real harm.
> >
> > Anyway, this whole conversation has taken more effort than just making
> > this simple change to remove a few lines of code. So let's drop it and
> > move on.
> 
> I don't understand nor agree with the fact that sometimes it's OK to not
> check for errors, based on one developer's assumptions. Not checking
> return codes is wrong and very likely error-prone.

Please go back and look at what i said. I did check the error code, in
that it gets returned to the caller. I just don't check it before
printing the debug.

But as i said, let's drop this whole topic.

Andrew
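
To make the two positions in this thread concrete, here is a small sketch of
both orderings. Everything in it is hypothetical (demo_read(), the -5 error
value, the function names); it is not the mv88e6xxx code, only an illustration
of "print, then return the error" versus "check, then print".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical register read: returns 0 on success or a negative error. */
static int demo_read(int fail, unsigned int *val)
{
	assert(val != NULL);
	if (fail)
		return -5;	/* stand-in for an I/O error */
	*val = 0x13;
	return 0;
}

/* Variant A: print the debug line first, then return the error code. */
static int demo_age_time_print_always(int fail)
{
	unsigned int val = 0;
	int err = demo_read(fail, &val);

	/* On failure this prints a value that was never read. */
	fprintf(stderr, "age time: %u\n", val);
	return err;
}

/* Variant B: check the error code before printing anything. */
static int demo_age_time_checked(int fail)
{
	unsigned int val = 0;
	int err = demo_read(fail, &val);

	if (err)
		return err;
	fprintf(stderr, "age time: %u\n", val);
	return 0;
}
```

Both variants propagate the error to the caller; they differ only in whether a
debug line with an unread value can appear on the failure path.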


[PATCH v2 18/20] ARM: dts: sun50i-a64: enable dwmac-sun8i on the BananaPi M64

2017-03-14 Thread Corentin Labbe
The dwmac-sun8i hardware is present on the BananaPi M64.
It uses an external PHY rtl8211e via RGMII.

Signed-off-by: Corentin Labbe 
---
 arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts 
b/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
index 6872135..347c262 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
@@ -77,6 +77,20 @@
bias-pull-up;
 };
 
+&mdio {
+   ext_rgmii_phy: ethernet-phy@1 {
+   reg = <1>;
+   };
+};
+
+&emac {
+   pinctrl-names = "default";
+   pinctrl-0 = <&rgmii_pins>;
+   phy-mode = "rgmii";
+   phy-handle = <&ext_rgmii_phy>;
+   status = "okay";
+};
+
 &mmc0 {
pinctrl-names = "default";
pinctrl-0 = <&mmc0_pins>;
-- 
2.10.2



[PATCH v2 20/20] ARM: sunxi: Enable dwmac-sun8i driver on multi_v7_defconfig

2017-03-14 Thread Corentin Labbe
Enable the dwmac-sun8i driver in the multi_v7 default configuration.

Signed-off-by: Corentin Labbe 
---
 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index 36c1b39..380919f 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -257,6 +257,7 @@ CONFIG_SMSC911X=y
 CONFIG_STMMAC_ETH=y
 CONFIG_STMMAC_PLATFORM=y
 CONFIG_DWMAC_DWC_QOS_ETH=y
+CONFIG_DWMAC_SUN8I=y
 CONFIG_TI_CPSW=y
 CONFIG_XILINX_EMACLITE=y
 CONFIG_AT803X_PHY=y
-- 
2.10.2



[PATCH v2 13/20] ARM: dts: sun8i: orangepi-pc-plus: Set EMAC activity LEDs to active high

2017-03-14 Thread Corentin Labbe
On the Orange Pi PC Plus, the polarity of the LEDs on the RJ45 Ethernet
port was changed from active low to active high.

Signed-off-by: Chen-Yu Tsai 
Signed-off-by: Corentin Labbe 
---
 arch/arm/boot/dts/sun8i-h3-orangepi-pc-plus.dts | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-pc-plus.dts 
b/arch/arm/boot/dts/sun8i-h3-orangepi-pc-plus.dts
index 8b93f5c..0380769 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-pc-plus.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-pc-plus.dts
@@ -86,3 +86,8 @@
/* eMMC is missing pull-ups */
bias-pull-up;
 };
+
+&emac {
+   /* LEDs changed to active high on the plus */
+   /delete-property/ allwinner,leds-active-low;
+};
-- 
2.10.2



[PATCH v2 12/20] ARM: dts: sun8i: Enable dwmac-sun8i on the Orange Pi plus

2017-03-14 Thread Corentin Labbe
The dwmac-sun8i hardware is present on the Orange PI plus.
It uses an external PHY rtl8211e via RGMII.

This patch creates the needed regulator, emac and phy nodes.

Signed-off-by: Corentin Labbe 
---
 arch/arm/boot/dts/sun8i-h3-orangepi-plus.dts | 35 
 1 file changed, 35 insertions(+)

diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-plus.dts 
b/arch/arm/boot/dts/sun8i-h3-orangepi-plus.dts
index 8c40ab7..4e075a2 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-plus.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-plus.dts
@@ -58,6 +58,18 @@
enable-active-high;
gpio = <&pio 6 11 GPIO_ACTIVE_HIGH>;
};
+
+   reg_gmac_3v3: gmac-3v3 {
+   compatible = "regulator-fixed";
+   pinctrl-names = "default";
+   pinctrl-0 = <&gmac_power_pin_orangepi>;
+   regulator-name = "gmac-3v3";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   startup-delay-us = <10>;
+   enable-active-high;
+   gpio = <&pio 3 6 GPIO_ACTIVE_HIGH>;
+   };
 };
 
 &ehci3 {
@@ -86,8 +98,31 @@
pins = "PG11";
function = "gpio_out";
};
+
+   gmac_power_pin_orangepi: gmac_power_pin@0 {
+   pins = "PD6";
+   function = "gpio_out";
+   drive-strength = <10>;
+   };
 };
 
 &usbphy {
usb3_vbus-supply = <&reg_usb3_vbus>;
 };
+
+&mdio {
+   ext_rgmii_phy: ethernet-phy@1 {
+   reg = <0>;
+   };
+};
+
+&emac {
+   pinctrl-names = "default";
+   pinctrl-0 = <&emac_rgmii_pins>;
+   phy-supply = <&reg_gmac_3v3>;
+   phy-handle = <&ext_rgmii_phy>;
+   phy-mode = "rgmii";
+
+   allwinner,leds-active-low;
+   status = "okay";
+};
-- 
2.10.2



[PATCH v2 06/20] ARM: dts: sunxi-h3-h5: Add dt node for the syscon control module

2017-03-14 Thread Corentin Labbe
This patch adds the dt node for the syscon register present on the
Allwinner H3/H5.

Only two registers are present in this syscon and the only useful one is
the one dedicated to the EMAC clock.

Signed-off-by: Corentin Labbe 
---
 arch/arm/boot/dts/sunxi-h3-h5.dtsi | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/boot/dts/sunxi-h3-h5.dtsi 
b/arch/arm/boot/dts/sunxi-h3-h5.dtsi
index 2494ea0..07e4f36 100644
--- a/arch/arm/boot/dts/sunxi-h3-h5.dtsi
+++ b/arch/arm/boot/dts/sunxi-h3-h5.dtsi
@@ -102,6 +102,12 @@
#size-cells = <1>;
ranges;
 
+   syscon: syscon@01c0 {
+   compatible = "syscon",
+   "allwinner,sun8i-h3-system-controller";
+   reg = <0x01c0 0x1000>;
+   };
+
dma: dma-controller@01c02000 {
compatible = "allwinner,sun8i-h3-dma";
reg = <0x01c02000 0x1000>;
-- 
2.10.2



[PATCH v2 07/20] ARM: dts: sunxi-h3-h5: add dwmac-sun8i ethernet driver

2017-03-14 Thread Corentin Labbe
The dwmac-sun8i is an Ethernet MAC that supports 10/100/1000 Mbit/s
speeds.

This patch enables the dwmac-sun8i in the Allwinner H3/H5 SoC device tree.
The H3/H5 SoCs have an internal PHY, so the optional syscon and ephy are set.

Signed-off-by: Corentin Labbe 
---
 arch/arm/boot/dts/sunxi-h3-h5.dtsi | 33 +
 1 file changed, 33 insertions(+)

diff --git a/arch/arm/boot/dts/sunxi-h3-h5.dtsi 
b/arch/arm/boot/dts/sunxi-h3-h5.dtsi
index 07e4f36..c35af5e 100644
--- a/arch/arm/boot/dts/sunxi-h3-h5.dtsi
+++ b/arch/arm/boot/dts/sunxi-h3-h5.dtsi
@@ -272,6 +272,14 @@
interrupt-controller;
#interrupt-cells = <3>;
 
+   emac_rgmii_pins: emac0@0 {
+   pins = "PD0", "PD1", "PD2", "PD3", "PD4",
+   "PD5", "PD7", "PD8", "PD9", "PD10",
+   "PD12", "PD13", "PD15", "PD16", "PD17";
+   function = "emac";
+   drive-strength = <40>;
+   };
+
i2c0_pins: i2c0 {
pins = "PA11", "PA12";
function = "i2c0";
@@ -368,6 +376,31 @@
clocks = <&osc24M>;
};
 
+   emac: ethernet@1c3 {
+   compatible = "allwinner,sun8i-h3-emac";
+   syscon = <&syscon>;
+   reg = <0x01c3 0x104>;
+   interrupts = ;
+   interrupt-names = "macirq";
+   resets = <&ccu RST_BUS_EMAC>;
+   reset-names = "stmmaceth";
+   clocks = <&ccu CLK_BUS_EMAC>;
+   clock-names = "stmmaceth";
+   #address-cells = <1>;
+   #size-cells = <0>;
+   status = "disabled";
+
+   mdio: mdio {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   int_mii_phy: ethernet-phy@1 {
+   reg = <1>;
+   clocks = <&ccu CLK_BUS_EPHY>;
+   resets = <&ccu RST_BUS_EPHY>;
+   };
+   };
+   };
+
spi0: spi@01c68000 {
compatible = "allwinner,sun8i-h3-spi";
reg = <0x01c68000 0x1000>;
-- 
2.10.2



[PATCH v2 08/20] ARM: dts: sun8i: Enable dwmac-sun8i on the Banana Pi M2+

2017-03-14 Thread Corentin Labbe
From: LABBE Corentin 

The dwmac-sun8i hardware is present on the Banana Pi M2+.
It uses an external PHY rtl8211e via RGMII.

This patch creates the needed regulator, emac and phy nodes.

Signed-off-by: Corentin Labbe 
---
 arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts | 37 +
 1 file changed, 37 insertions(+)

diff --git a/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts 
b/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts
index 52acbe1..30b0a41 100644
--- a/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts
+++ b/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts
@@ -90,6 +90,18 @@
pinctrl-0 = <&wifi_en_bpi_m2p>;
reset-gpios = <&r_pio 0 7 GPIO_ACTIVE_LOW>; /* PL7 */
};
+
+   reg_gmac_3v3: gmac-3v3 {
+ compatible = "regulator-fixed";
+ pinctrl-names = "default";
+ pinctrl-0 = <&gmac_power_pin_orangepi>;
+ regulator-name = "gmac-3v3";
+ regulator-min-microvolt = <330>;
+ regulator-max-microvolt = <330>;
+ startup-delay-us = <10>;
+ enable-active-high;
+ gpio = <&pio 3 6 GPIO_ACTIVE_HIGH>;
+ };
 };
 
 &ehci1 {
@@ -186,3 +198,28 @@
/* USB VBUS is on as long as VCC-IO is on */
status = "okay";
 };
+
+&pio {
+   gmac_power_pin_orangepi: gmac_power_pin@0 {
+pins = "PD6";
+function = "gpio_out";
+drive-strength = <10>;
+};
+};
+
+&mdio {
+   ext_rgmii_phy: ethernet-phy@1 {
+   reg = <0>;
+   };
+};
+
+&emac {
+   pinctrl-names = "default";
+   pinctrl-0 = <&emac_rgmii_pins>;
+   phy-supply = <&reg_gmac_3v3>;
+   phy-handle = <&ext_rgmii_phy>;
+   phy-mode = "rgmii";
+
+   allwinner,leds-active-low;
+   status = "okay";
+};
-- 
2.10.2



[PATCH v2 09/20] ARM: dts: sun8i: Enable dwmac-sun8i on the Orange PI PC

2017-03-14 Thread Corentin Labbe
From: LABBE Corentin 

The dwmac-sun8i hardware is present on the Orange PI PC.
It uses the internal PHY.

This patch creates the needed emac node.

Signed-off-by: Corentin Labbe 
---
 arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts 
b/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts
index f148111..746c25a 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts
@@ -53,6 +53,7 @@
 
aliases {
serial0 = &uart0;
+   ethernet0 = &emac;
};
 
chosen {
@@ -184,3 +185,10 @@
/* USB VBUS is always on */
status = "okay";
 };
+
+&emac {
+   phy-handle = <&int_mii_phy>;
+   phy-mode = "mii";
+   allwinner,leds-active-low;
+   status = "okay";
+};
-- 
2.10.2



[PATCH v2 03/20] ARM: sun8i: dt: Add DT bindings documentation for Allwinner dwmac-sun8i

2017-03-14 Thread Corentin Labbe
This patch adds documentation for Device-Tree bindings for the
Allwinner dwmac-sun8i driver.

Signed-off-by: Corentin Labbe 
---
 .../devicetree/bindings/net/dwmac-sun8i.txt| 77 ++
 1 file changed, 77 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/dwmac-sun8i.txt

diff --git a/Documentation/devicetree/bindings/net/dwmac-sun8i.txt 
b/Documentation/devicetree/bindings/net/dwmac-sun8i.txt
new file mode 100644
index 000..f01ef17
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dwmac-sun8i.txt
@@ -0,0 +1,77 @@
+* Allwinner sun8i GMAC ethernet controller
+
+This device is a platform glue layer for stmmac.
+Please see stmmac.txt for the other unchanged properties.
+
+Required properties:
+- compatible: should be one of the following strings:
+   "allwinner,sun8i-a83t-emac"
+   "allwinner,sun8i-h3-emac"
+   "allwinner,sun50i-a64-emac"
+- reg: address and length of the register for the device.
+- interrupts: interrupt for the device
+- interrupt-names: should be "macirq"
+- clocks: A phandle to the reference clock for this device
+- clock-names: should be "stmmaceth"
+- resets: A phandle to the reset control for this device
+- reset-names: should be "stmmaceth"
+- phy-mode: See ethernet.txt
+- phy-handle: See ethernet.txt
+- #address-cells: shall be 1
+- #size-cells: shall be 0
+- syscon: A phandle to the syscon of the SoC with one of the following
+ compatible strings:
+  - allwinner,sun8i-h3-system-controller
+  - allwinner,sun8i-a64-system-controller
+  - allwinner,sun8i-a83t-system-controller
+
+Optional properties:
+- allwinner,tx-delay: TX clock delay chain value. Range value is 0-0x07. Default is 0.
+- allwinner,rx-delay: RX clock delay chain value. Range value is 0-0x1F. Default is 0.
+Both delay properties are in 0.1ns steps.
+
+Optional properties for "allwinner,sun8i-h3-emac":
+- allwinner,leds-active-low: EPHY LEDs are active low
+
+Required child node of emac:
+- mdio bus node: should be named mdio
+
+Required properties of the mdio node:
+- #address-cells: shall be 1
+- #size-cells: shall be 0
+
+The device node referenced by "phy" or "phy-handle" should be a child node
+of the mdio node. See phy.txt for the generic PHY bindings.
+
+Required properties of the phy node with "allwinner,sun8i-h3-emac":
+- clocks: a phandle to the reference clock for the EPHY
+- resets: a phandle to the reset control for the EPHY
+
+Example:
+
+emac: ethernet@1c0b000 {
+   compatible = "allwinner,sun8i-h3-emac";
+   syscon = <&syscon>;
+   reg = <0x01c0b000 0x104>;
+   interrupts = ;
+   interrupt-names = "macirq";
+   resets = <&ccu RST_BUS_EMAC>;
+   reset-names = "stmmaceth";
+   clocks = <&ccu CLK_BUS_EMAC>;
+   clock-names = "stmmaceth";
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   phy = <&int_mii_phy>;
+   phy-mode = "mii";
+   allwinner,leds-active-low;
+   mdio: mdio {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   int_mii_phy: ethernet-phy@1 {
+   reg = <1>;
+   clocks = <&ccu CLK_BUS_EPHY>;
+   resets = <&ccu RST_BUS_EPHY>;
+   };
+   };
+};
-- 
2.10.2



[PATCH v2 02/20] net-next: stmmac: add optional setup function

2017-03-14 Thread Corentin Labbe
Instead of adding more if/then logic for each new mac_device_info
setup function, it is easier to add a function pointer to the function
needed.

Signed-off-by: Corentin Labbe 
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 4 +++-
 include/linux/stmmac.h| 3 +++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 4498a38..856ac57 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3101,7 +3101,9 @@ static int stmmac_hw_init(struct stmmac_priv *priv)
struct mac_device_info *mac;
 
/* Identify the MAC HW device */
-   if (priv->plat->has_gmac) {
+   if (priv->plat->setup) {
+   mac = priv->plat->setup(priv);
+   } else if (priv->plat->has_gmac) {
priv->dev->priv_flags |= IFF_UNICAST_FLT;
mac = dwmac1000_setup(priv->ioaddr,
  priv->plat->multicast_filter_bins,
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index fc273e9..8f09f18 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -109,6 +109,8 @@ struct stmmac_axi {
bool axi_rb;
 };
 
+struct stmmac_priv;
+
 struct plat_stmmacenet_data {
int bus_id;
int phy_addr;
@@ -136,6 +138,7 @@ struct plat_stmmacenet_data {
void (*fix_mac_speed)(void *priv, unsigned int speed);
int (*init)(struct platform_device *pdev, void *priv);
void (*exit)(struct platform_device *pdev, void *priv);
+   struct mac_device_info *(*setup)(struct stmmac_priv *priv);
void *bsp_priv;
struct clk *stmmac_clk;
struct clk *pclk;
-- 
2.10.2
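
As an illustration of the approach in this patch, the sketch below shows an
optional setup hook taking precedence over a flag-based selection. All demo_*
names and the two-way fallback are assumptions made for the example; the real
stmmac_hw_init() has more branches than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mac_device_info. */
struct demo_mac_info {
	const char *name;
};

/* Hypothetical stand-in for struct plat_stmmacenet_data. */
struct demo_plat {
	int has_gmac;
	/* Optional hook; NULL means "use the built-in selection". */
	struct demo_mac_info *(*setup)(void *priv);
};

static struct demo_mac_info demo_gmac = { "gmac" };
static struct demo_mac_info demo_mac100 = { "mac100" };
static struct demo_mac_info demo_glue = { "glue" };

/* Example hook a glue driver could install. */
static struct demo_mac_info *demo_glue_setup(void *priv)
{
	(void)priv;
	return &demo_glue;
}

/* The hook, when present, wins over the flag-based branches. */
static struct demo_mac_info *demo_hw_init(const struct demo_plat *plat,
					  void *priv)
{
	assert(plat != NULL);

	if (plat->setup)
		return plat->setup(priv);
	if (plat->has_gmac)
		return &demo_gmac;
	return &demo_mac100;
}
```

The point of the design is that a new glue driver only fills in the pointer;
the core selection code does not grow another branch per driver.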



[PATCH v2 00/20] net-next: stmmac: add dwmac-sun8i ethernet driver

2017-03-14 Thread Corentin Labbe
Hello

This patch series adds the driver for dwmac-sun8i, which handles the Ethernet MAC
present on Allwinner H3/H5/A83T/A64 SoCs.

This driver is the continuation of the sun8i-emac driver.
During the development, it appeared that in fact the hardware was a modified
version of some dwmac.
So the driver is now written as a glue driver for stmmac.

It supports 10/100/1000 Mbit/s speed with half/full duplex.
It can use an internal PHY (MII 10/100) or an external PHY
via RGMII/RMII.

This patch series enables the driver only for the H3/A64/H5 SoCs since the A83T
doesn't have the necessary clocks present in mainline.

The driver has been tested on the following boards:
- H3 Orange PI PC, BananaPI-M2+
- A64 Pine64, BananaPi-M64
- A83T BananaPI-M3

The first two patches are mandatory changes that let dwmac-sun8i be used.
The following three patches add the driver and its documentation.
The remaining ones are DT patches enabling it.

Regards
Corentin Labbe

Changes since v1:
- added TX/RX delay units
- split syscon documentation into its own patch
- regulator is now disabled after clk_prepare_enable(gmac->tx_clk) error
- Fixed a memory leak on mac_device_info
- Use now generic pin config for all DT stuff
- CONFIG_DWMAC_SUN8I is now set to y in defconfigs

Corentin Labbe (17):
  net-next: stmmac: export stmmac_set_mac_addr/stmmac_get_mac_addr
  net-next: stmmac: add optional setup function
  ARM: sun8i: dt: Add DT bindings documentation for Allwinner
dwmac-sun8i
  ARM: sun8i: dt: Add DT bindings documentation for Allwinner syscon
  net-next: stmmac: Add dwmac-sun8i
  ARM: dts: sunxi-h3-h5: Add dt node for the syscon control module
  ARM: dts: sunxi-h3-h5: add dwmac-sun8i ethernet driver
  ARM: dts: sun8i: Enable dwmac-sun8i on the Orange Pi 2
  ARM: dts: sun8i: Enable dwmac-sun8i on the Orange PI One
  ARM: dts: sun8i: Enable dwmac-sun8i on the Orange Pi plus
  ARM: dts: sun8i: orangepi-pc-plus: Set EMAC activity LEDs to active
high
  ARM64: dts: sun50i-a64: Add dt node for the syscon control module
  ARM64: dts: sun50i-a64: add dwmac-sun8i Ethernet driver
  ARM: dts: sun50i-a64: enable dwmac-sun8i on pine64
  ARM: dts: sun50i-a64: enable dwmac-sun8i on pine64 plus
  ARM: dts: sun50i-a64: enable dwmac-sun8i on the BananaPi M64
  ARM: sunxi: Enable dwmac-sun8i driver on multi_v7_defconfig

LABBE Corentin (3):
  ARM: dts: sun8i: Enable dwmac-sun8i on the Banana Pi M2+
  ARM: dts: sun8i: Enable dwmac-sun8i on the Orange PI PC
  ARM: sunxi: Enable dwmac-sun8i driver on sunxi_defconfig

 .../devicetree/bindings/misc/allwinner,syscon.txt  |  19 +
 .../devicetree/bindings/net/dwmac-sun8i.txt|  77 ++
 arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts|  37 +
 arch/arm/boot/dts/sun8i-h3-orangepi-2.dts  |   8 +
 arch/arm/boot/dts/sun8i-h3-orangepi-one.dts|   8 +
 arch/arm/boot/dts/sun8i-h3-orangepi-pc-plus.dts|   5 +
 arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts |   8 +
 arch/arm/boot/dts/sun8i-h3-orangepi-plus.dts   |  35 +
 arch/arm/boot/dts/sunxi-h3-h5.dtsi |  39 +
 arch/arm/configs/multi_v7_defconfig|   1 +
 arch/arm/configs/sunxi_defconfig   |   1 +
 .../boot/dts/allwinner/sun50i-a64-bananapi-m64.dts |  14 +
 .../boot/dts/allwinner/sun50i-a64-pine64-plus.dts  |  16 +-
 .../arm64/boot/dts/allwinner/sun50i-a64-pine64.dts |  15 +
 arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi  |  43 +
 drivers/net/ethernet/stmicro/stmmac/Kconfig|  11 +
 drivers/net/ethernet/stmicro/stmmac/Makefile   |   1 +
 drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c  | 938 +
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c|   3 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  31 +-
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |   9 +-
 include/linux/stmmac.h |   4 +
 22 files changed, 1317 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/misc/allwinner,syscon.txt
 create mode 100644 Documentation/devicetree/bindings/net/dwmac-sun8i.txt
 create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c

-- 
2.10.2



[PATCH v2 01/20] net-next: stmmac: export stmmac_set_mac_addr/stmmac_get_mac_addr

2017-03-14 Thread Corentin Labbe
Those symbols will be needed for the dwmac-sun8i ethernet driver.
To allow it to be built as a module, they need to be exported.

Signed-off-by: Corentin Labbe 
---
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index e60bfca..0ab985c8 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -248,6 +248,7 @@ void stmmac_set_mac_addr(void __iomem *ioaddr, u8 addr[6],
data = (addr[3] << 24) | (addr[2] << 16) | (addr[1] << 8) | addr[0];
writel(data, ioaddr + low);
 }
+EXPORT_SYMBOL_GPL(stmmac_set_mac_addr);
 
 /* Enable disable MAC RX/TX */
 void stmmac_set_mac(void __iomem *ioaddr, bool enable)
@@ -279,4 +280,4 @@ void stmmac_get_mac_addr(void __iomem *ioaddr, unsigned 
char *addr,
addr[4] = hi_addr & 0xff;
addr[5] = (hi_addr >> 8) & 0xff;
 }
-
+EXPORT_SYMBOL_GPL(stmmac_get_mac_addr);
-- 
2.10.2



[PATCH v2 04/20] ARM: sun8i: dt: Add DT bindings documentation for Allwinner syscon

2017-03-14 Thread Corentin Labbe
Signed-off-by: Corentin Labbe 
---
 .../devicetree/bindings/misc/allwinner,syscon.txt | 19 +++
 1 file changed, 19 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/misc/allwinner,syscon.txt

diff --git a/Documentation/devicetree/bindings/misc/allwinner,syscon.txt 
b/Documentation/devicetree/bindings/misc/allwinner,syscon.txt
new file mode 100644
index 000..9f5f1f5
--- /dev/null
+++ b/Documentation/devicetree/bindings/misc/allwinner,syscon.txt
@@ -0,0 +1,19 @@
+* Allwinner sun8i system controller
+
+This file describes the bindings for the system controller present in
+Allwinner SoC H3, A83T and A64.
+The principal function of this syscon is to control EMAC PHY choice and
+config.
+
+Required properties for the system controller:
+- reg: address and length of the register for the device.
+- compatible: should be "syscon" and one of the following strings:
+   "allwinner,sun8i-h3-system-controller"
+   "allwinner,sun8i-a64-system-controller"
+   "allwinner,sun8i-a83t-system-controller"
+
+Example:
+syscon: syscon@01c0 {
+   compatible = "syscon", "allwinner,sun8i-h3-system-controller";
+   reg = <0x01c0 0x1000>;
+};
-- 
2.10.2



[PATCH v2 19/20] ARM: sunxi: Enable dwmac-sun8i driver on sunxi_defconfig

2017-03-14 Thread Corentin Labbe
From: LABBE Corentin 

Enable the dwmac-sun8i driver in the sunxi default configuration.

Signed-off-by: Corentin Labbe 
---
 arch/arm/configs/sunxi_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/sunxi_defconfig b/arch/arm/configs/sunxi_defconfig
index 5cd5dd70..504e022 100644
--- a/arch/arm/configs/sunxi_defconfig
+++ b/arch/arm/configs/sunxi_defconfig
@@ -40,6 +40,7 @@ CONFIG_ATA=y
 CONFIG_AHCI_SUNXI=y
 CONFIG_NETDEVICES=y
 CONFIG_SUN4I_EMAC=y
+CONFIG_DWMAC_SUN8I=y
 # CONFIG_NET_VENDOR_ARC is not set
 # CONFIG_NET_CADENCE is not set
 # CONFIG_NET_VENDOR_BROADCOM is not set
-- 
2.10.2



[PATCH v2 15/20] ARM64: dts: sun50i-a64: add dwmac-sun8i Ethernet driver

2017-03-14 Thread Corentin Labbe
The dwmac-sun8i is an Ethernet MAC that supports 10/100/1000 Mbit
connections. It is very similar to the device found in the Allwinner
H3, but lacks the internal 100 Mbit PHY and its associated control
bits.
This adds the necessary bits to the Allwinner A64 SoC .dtsi, but keeps
it disabled at this level.

Signed-off-by: Corentin Labbe 
---
 arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 37 +++
 1 file changed, 37 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi 
b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
index 3b09af2..57d69e5 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
@@ -277,6 +277,23 @@
bias-pull-up;
};
 
+   rmii_pins: rmii_pins {
+   pins = "PD10", "PD11", "PD13", "PD14",
+   "PD17", "PD18", "PD19", "PD20",
+   "PD22", "PD23";
+   function = "emac";
+   drive-strength = <40>;
+   };
+
+   rgmii_pins: rgmii_pins {
+   pins = "PD8", "PD9", "PD10", "PD11",
+   "PD12", "PD13", "PD15",
+   "PD16", "PD17", "PD18", "PD19",
+   "PD20", "PD21", "PD22", "PD23";
+   function = "emac";
+   drive-strength = <40>;
+   };
+
uart0_pins_a: uart0@0 {
pins = "PB8", "PB9";
function = "uart0";
@@ -381,6 +398,26 @@
#size-cells = <0>;
};
 
+   emac: ethernet@1c3 {
+   compatible = "allwinner,sun50i-a64-emac";
+   syscon = <&syscon>;
+   reg = <0x01c3 0x100>;
+   interrupts = ;
+   interrupt-names = "macirq";
+   resets = <&ccu RST_BUS_EMAC>;
+   reset-names = "stmmaceth";
+   clocks = <&ccu CLK_BUS_EMAC>;
+   clock-names = "stmmaceth";
+   status = "disabled";
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   mdio: mdio {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   };
+   };
+
gic: interrupt-controller@1c81000 {
compatible = "arm,gic-400";
reg = <0x01c81000 0x1000>,
-- 
2.10.2



[PATCH v2 17/20] ARM: dts: sun50i-a64: enable dwmac-sun8i on pine64 plus

2017-03-14 Thread Corentin Labbe
The dwmac-sun8i hardware is present on the pine64 plus.
It uses an external PHY rtl8211e via RGMII.

Signed-off-by: Corentin Labbe 
---
 arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-plus.dts | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-plus.dts 
b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-plus.dts
index 790d14d..8e06aed 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-plus.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-plus.dts
@@ -46,5 +46,19 @@
model = "Pine64+";
compatible = "pine64,pine64-plus", "allwinner,sun50i-a64";
 
-   /* TODO: Camera, Ethernet PHY, touchscreen, etc. */
+   /* TODO: Camera, touchscreen, etc. */
+};
+
+&mdio {
+   ext_rgmii_phy: ethernet-phy@1 {
+   reg = <1>;
+   };
+};
+
+&emac {
+   pinctrl-names = "default";
+   pinctrl-0 = <&rgmii_pins>;
+   phy-mode = "rgmii";
+   phy-handle = <&ext_rgmii_phy>;
+   status = "okay";
 };
-- 
2.10.2



Re: [PATCH net-next] mlx4: Better use of order-0 pages in RX path

2017-03-14 Thread Eric Dumazet
On Tue, 2017-03-14 at 06:34 -0700, Eric Dumazet wrote:

> So I will leave this to Mellanox for XDP tests and upstreaming this,
> and will stop arguing with you, this is going nowhere.

Tariq, I will send a v2, including these changes (plus the missing
include of yesterday)

One is to make sure high order allocations remove __GFP_DIRECT_RECLAIM

The other is changing mlx4_en_recover_from_oom() to increase by 
one rx_alloc_order instead of plain reset to rx_pref_alloc_order

Please test XDP and tell me if you find any issues ?
Thanks !

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c 
b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 
a71554649c25383bb765fa8220bc9cd490247aee..cc41f2f145541b469b52e7014659d5fdbb7dac68
 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -60,8 +60,10 @@ static struct page *mlx4_alloc_page(struct mlx4_en_priv 
*priv,
if (unlikely(!ring->pre_allocated_count)) {
unsigned int order = READ_ONCE(ring->rx_alloc_order);
 
-   page = __alloc_pages_node(node, gfp | __GFP_NOMEMALLOC |
-   __GFP_NOWARN | __GFP_NORETRY,
+   page = __alloc_pages_node(node, (gfp & ~__GFP_DIRECT_RECLAIM) |
+   __GFP_NOMEMALLOC |
+   __GFP_NOWARN |
+   __GFP_NORETRY,
  order);
if (page) {
split_page(page, order);
@@ -412,12 +414,13 @@ int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv)
 }
 
 /* Under memory pressure, each ring->rx_alloc_order might be lowered
- * to very small values. Periodically reset it to initial value for
+ * to very small values. Periodically increase it to initial value for
  * optimal allocations, in case stress is over.
  */
 void mlx4_en_recover_from_oom(struct mlx4_en_priv *priv)
 {
struct mlx4_en_rx_ring *ring;
+   unsigned int order;
int ring_ind;
 
if (!priv->port_up)
@@ -425,7 +428,9 @@ void mlx4_en_recover_from_oom(struct mlx4_en_priv *priv)
 
for (ring_ind = 0; ring_ind < priv->rx_ring_num; ring_ind++) {
ring = priv->rx_ring[ring_ind];
-   WRITE_ONCE(ring->rx_alloc_order, ring->rx_pref_alloc_order);
+   order = min_t(unsigned int, ring->rx_alloc_order + 1,
+ ring->rx_pref_alloc_order);
+   WRITE_ONCE(ring->rx_alloc_order, order);
}
 }
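
The two changes described in this mail can be modelled outside the driver as
follows. This is only a sketch: the demo_* names and the flag bit values are
made up for illustration and are not the kernel's GFP definitions.

```c
#include <assert.h>

/* Made-up flag bits for illustration; not the kernel's GFP values. */
#define DEMO_DIRECT_RECLAIM	0x1u
#define DEMO_NOWARN		0x2u

/* High-order attempts drop direct reclaim and add the "quiet" flag. */
static unsigned int demo_high_order_gfp(unsigned int gfp)
{
	return (gfp & ~DEMO_DIRECT_RECLAIM) | DEMO_NOWARN;
}

/* One recovery tick: raise the order by one, capped at the preferred order. */
static unsigned int demo_recover_order(unsigned int cur, unsigned int pref)
{
	unsigned int next = cur + 1;

	return next < pref ? next : pref;
}

/* Number of ticks needed to climb back from cur to pref. */
static unsigned int demo_ticks_to_recover(unsigned int cur, unsigned int pref)
{
	unsigned int ticks = 0;

	assert(cur <= pref);
	while (cur != pref) {
		cur = demo_recover_order(cur, pref);
		ticks++;
	}
	return ticks;
}
```

Compared with a plain reset to the preferred order, the stepwise increase means
a ring that dropped to order 0 needs several recovery periods before it
attempts the largest allocations again.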
 




[PATCH net-next] qed*: Add support for QL41xxx adapters

2017-03-14 Thread Yuval Mintz
This adds the necessary infrastructure changes for initializing
and working with the new series of QL41xxx adapters.

It also adds 2 new PCI device-IDs to qede:
  - 0x8070 for QL41xxx PFs
  - 0x8090 for VFs spawning from QL41xxx PFs

Signed-off-by: Tomer Tayar 
Signed-off-by: Yuval Mintz 
---
Hi Dave,

Please consider applying this to 'net-next'.

Thanks,
Yuval
---
 drivers/net/ethernet/qlogic/qed/qed.h   |  30 +++-
 drivers/net/ethernet/qlogic/qed/qed_debug.c |   2 +-
 drivers/net/ethernet/qlogic/qed/qed_dev.c   | 186 
 drivers/net/ethernet/qlogic/qed/qed_hsi.h   |  61 +--
 drivers/net/ethernet/qlogic/qed/qed_l2.c| 184 
 drivers/net/ethernet/qlogic/qed/qed_main.c  |   7 +-
 drivers/net/ethernet/qlogic/qed/qed_mcp.h   |   9 +-
 drivers/net/ethernet/qlogic/qed/qed_ptp.c   |  12 +-
 drivers/net/ethernet/qlogic/qed/qed_reg_addr.h  |  17 +-
 drivers/net/ethernet/qlogic/qed/qed_sriov.c |  30 +++-
 drivers/net/ethernet/qlogic/qede/qede.h |  43 +++--
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c |  85 +
 drivers/net/ethernet/qlogic/qede/qede_main.c| 222 +---
 include/linux/qed/qed_if.h  |  48 +++--
 include/linux/qed/rdma_common.h |   3 +-
 15 files changed, 630 insertions(+), 308 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed.h 
b/drivers/net/ethernet/qlogic/qed/qed.h
index be99092..ca30a27 100644
--- a/drivers/net/ethernet/qlogic/qed/qed.h
+++ b/drivers/net/ethernet/qlogic/qed/qed.h
@@ -219,7 +219,9 @@ enum QED_PORT_MODE {
QED_PORT_MODE_DE_4X20G,
QED_PORT_MODE_DE_1X40G,
QED_PORT_MODE_DE_2X25G,
-   QED_PORT_MODE_DE_1X25G
+   QED_PORT_MODE_DE_1X25G,
+   QED_PORT_MODE_DE_4X25G,
+   QED_PORT_MODE_DE_2X10G,
 };
 
 enum qed_dev_cap {
@@ -364,7 +366,8 @@ struct qed_hwfn {
 #define IS_LEAD_HWFN(edev)  (!((edev)->my_id))
u8  rel_pf_id;  /* Relative to engine*/
u8  abs_pf_id;
-#define QED_PATH_ID(_p_hwfn)   ((_p_hwfn)->abs_pf_id & 1)
+#define QED_PATH_ID(_p_hwfn) \
+   (QED_IS_K2((_p_hwfn)->cdev) ? 0 : ((_p_hwfn)->abs_pf_id & 1))
u8  port_id;
boolb_active;
 
@@ -523,9 +526,7 @@ struct qed_dev {
u8  dp_level;
charname[NAME_SIZE];
 
-   u8  type;
-#define QED_DEV_TYPE_BB (0 << 0)
-#define QED_DEV_TYPE_AH BIT(0)
+   enumqed_dev_type type;
 /* Translate type/revision combo into the proper conditions */
 #define QED_IS_BB(dev)  ((dev)->type == QED_DEV_TYPE_BB)
 #define QED_IS_BB_A0(dev)   (QED_IS_BB(dev) && \
@@ -540,6 +541,9 @@ struct qed_dev {
 
u16 vendor_id;
u16 device_id;
+#define QED_DEV_ID_MASK0xff00
+#define QED_DEV_ID_MASK_BB 0x1600
+#define QED_DEV_ID_MASK_AH 0x8000
 
u16 chip_num;
 #define CHIP_NUM_MASK   0x
@@ -654,10 +658,16 @@ struct qed_dev {
u32 rdma_max_srq_sge;
 };
 
-#define NUM_OF_VFS(dev) MAX_NUM_VFS_BB
-#define NUM_OF_L2_QUEUES(dev)  MAX_NUM_L2_QUEUES_BB
-#define NUM_OF_SBS(dev) MAX_SB_PER_PATH_BB
-#define NUM_OF_ENG_PFS(dev) MAX_NUM_PFS_BB
+#define NUM_OF_VFS(dev) (QED_IS_BB(dev) ? MAX_NUM_VFS_BB \
+   : MAX_NUM_VFS_K2)
+#define NUM_OF_L2_QUEUES(dev)   (QED_IS_BB(dev) ? MAX_NUM_L2_QUEUES_BB \
+   : MAX_NUM_L2_QUEUES_K2)
+#define NUM_OF_PORTS(dev)   (QED_IS_BB(dev) ? MAX_NUM_PORTS_BB \
+   : MAX_NUM_PORTS_K2)
+#define NUM_OF_SBS(dev) (QED_IS_BB(dev) ? MAX_SB_PER_PATH_BB \
+   : MAX_SB_PER_PATH_K2)
+#define NUM_OF_ENG_PFS(dev) (QED_IS_BB(dev) ? MAX_NUM_PFS_BB \
+   : MAX_NUM_PFS_K2)
 
 /**
  * @brief qed_concrete_to_sw_fid - get the sw function id from
@@ -694,6 +704,7 @@ void qed_configure_vp_wfq_on_link_change(struct qed_dev 
*cdev,
 
 void qed_clean_wfq_db(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt);
 #define QED_LEADING_HWFN(dev)   (&dev->hwfns[0])
+int qed_device_num_engines(struct qed_dev *cdev);
 
 /* Other Linux specific common definitions */
 #define DP_NAME(cdev) ((cdev)->name)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_debug.c 
b/drivers/net/ethernet/qlogic/qed/qed_debug.c
index 5e81e8a..483241b 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_debug.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_debug.c
@@ -1557,7 +1557,7 @@ static enum dbg_status qed_dbg_dev_init(struct qed_hwfn 
*p_hwfn,
dev_data->mode_enable[MODE_K2] = 1;
} else if (QED_IS_BB_B0(p_hwfn->cdev)) {
dev_data->chip_id = CHIP_BB_B0;
-   dev_data->mode_enable[MODE_BB_B0] = 1;
+   dev_d
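
The visible part of the diff adds three device-ID masks to struct qed_dev but
is cut off before showing how they are used. A minimal sketch, assuming the
device ID is simply masked and compared against the per-family values; the
enum and the function name below are hypothetical and do not come from the
patch:

```c
#include <assert.h>
#include <stdint.h>

/* Mask values as they appear in the qed.h hunk. */
#define QED_DEV_ID_MASK		0xff00
#define QED_DEV_ID_MASK_BB	0x1600
#define QED_DEV_ID_MASK_AH	0x8000

/* Each family value must itself fit under the mask. */
static_assert((QED_DEV_ID_MASK_BB & QED_DEV_ID_MASK) == QED_DEV_ID_MASK_BB,
	      "BB family value must fit under the mask");
static_assert((QED_DEV_ID_MASK_AH & QED_DEV_ID_MASK) == QED_DEV_ID_MASK_AH,
	      "AH family value must fit under the mask");

/* Hypothetical result type, for illustration only. */
enum dev_family {
	DEV_FAMILY_UNKNOWN,
	DEV_FAMILY_BB,
	DEV_FAMILY_AH,
};

/* Assumed usage: keep the high byte of the PCI device ID and compare. */
enum dev_family classify_device_id(uint16_t device_id)
{
	switch (device_id & QED_DEV_ID_MASK) {
	case QED_DEV_ID_MASK_BB:
		return DEV_FAMILY_BB;
	case QED_DEV_ID_MASK_AH:
		return DEV_FAMILY_AH;
	default:
		return DEV_FAMILY_UNKNOWN;
	}
}
```

Under that assumption, both new IDs from the commit message (0x8070 and
0x8090) fall under the AH mask.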

[PATCH v2 16/20] ARM: dts: sun50i-a64: enable dwmac-sun8i on pine64

2017-03-14 Thread Corentin Labbe
The dwmac-sun8i hardware is present on the Pine64.
It uses an external PHY via RMII.

Signed-off-by: Corentin Labbe 
---
 arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts 
b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts
index c680ed3..b53994d 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts
@@ -109,3 +109,18 @@
 &usbphy {
status = "okay";
 };
+
+&mdio {
+   ext_rmii_phy1: ethernet-phy@1 {
+ reg = <1>;
+   };
+};
+
+&emac {
+   pinctrl-names = "default";
+   pinctrl-0 = <&rmii_pins>;
+   phy-mode = "rmii";
+   phy-handle = <&ext_rmii_phy1>;
+   status = "okay";
+
+};
-- 
2.10.2



[PATCH v2 14/20] ARM64: dts: sun50i-a64: Add dt node for the syscon control module

2017-03-14 Thread Corentin Labbe
This patch adds the dt node for the syscon register block present on the
Allwinner A64.

Only two registers are present in this syscon, and the only useful one is
the one dedicated to the EMAC clock.

Signed-off-by: Corentin Labbe 
---
 arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi 
b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
index 1c64ea2..3b09af2 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
@@ -121,6 +121,12 @@
#size-cells = <1>;
ranges;
 
+   syscon: syscon@01c0 {
+   compatible = "syscon",
+   "allwinner,sun8i-h3-system-controller";
+   reg = <0x01c0 0x1000>;
+   };
+
mmc0: mmc@1c0f000 {
compatible = "allwinner,sun50i-a64-mmc";
reg = <0x01c0f000 0x1000>;
-- 
2.10.2



[PATCH v2 05/20] net-next: stmmac: Add dwmac-sun8i

2017-03-14 Thread Corentin Labbe
The dwmac-sun8i is a heavily hacked version of the stmmac hardware by
Allwinner.
In fact, the only common parts are the descriptor management and the first
register function.

Signed-off-by: Corentin Labbe 
---
 drivers/net/ethernet/stmicro/stmmac/Kconfig|  11 +
 drivers/net/ethernet/stmicro/stmmac/Makefile   |   1 +
 drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c  | 938 +
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  27 +-
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |   9 +-
 include/linux/stmmac.h |   1 +
 6 files changed, 984 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c

diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig 
b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index cfbe363..85c0e41 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -145,6 +145,17 @@ config DWMAC_SUNXI
  This selects Allwinner SoC glue layer support for the
  stmmac device driver. This driver is used for A20/A31
  GMAC ethernet controller.
+
+config DWMAC_SUN8I
+   tristate "Allwinner sun8i GMAC support"
+   default ARCH_SUNXI
+   depends on OF && (ARCH_SUNXI || COMPILE_TEST)
+   ---help---
+ Support for Allwinner H3 A83T A64 EMAC ethernet controllers.
+
+ This selects Allwinner SoC glue layer support for the
+ stmmac device driver. This driver is used for H3/A83T/A64
+ EMAC ethernet controller.
 endif
 
 config STMMAC_PCI
diff --git a/drivers/net/ethernet/stmicro/stmmac/Makefile 
b/drivers/net/ethernet/stmicro/stmmac/Makefile
index 700c603..fd4937a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Makefile
+++ b/drivers/net/ethernet/stmicro/stmmac/Makefile
@@ -16,6 +16,7 @@ obj-$(CONFIG_DWMAC_SOCFPGA)   += dwmac-altr-socfpga.o
 obj-$(CONFIG_DWMAC_STI)+= dwmac-sti.o
 obj-$(CONFIG_DWMAC_STM32)  += dwmac-stm32.o
 obj-$(CONFIG_DWMAC_SUNXI)  += dwmac-sunxi.o
+obj-$(CONFIG_DWMAC_SUN8I)  += dwmac-sun8i.o
 obj-$(CONFIG_DWMAC_DWC_QOS_ETH)+= dwmac-dwc-qos-eth.o
 obj-$(CONFIG_DWMAC_GENERIC)+= dwmac-generic.o
 stmmac-platform-objs:= stmmac_platform.o
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
new file mode 100644
index 000..52ab67c
--- /dev/null
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
@@ -0,0 +1,938 @@
+/*
+ * dwmac-sun8i.c - Allwinner sun8i DWMAC specific glue layer
+ *
+ * Copyright (C) 2017 Corentin Labbe 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "stmmac.h"
+#include "stmmac_platform.h"
+
+/* General notes on dwmac-sun8i:
+ * Locking: no locking is necessary in this file because all necessary locking
+ * is done in the "stmmac files"
+ */
+
+/* struct emac_variant - Describe dwmac-sun8i hardware variant
+ * @default_syscon_value:  The default value of the EMAC register in syscon
+ * This value is used for properly disabling the EMAC
+ * and as a good starting value in case the
+ * boot process (U-Boot) leaves some state behind.
+ * @internal_phy:  Does the MAC embed an internal PHY
+ * @support_mii:   Does the MAC handle MII
+ * @support_rmii:  Does the MAC handle RMII
+ * @support_rgmii: Does the MAC handle RGMII
+ */
+struct emac_variant {
+   u32 default_syscon_value;
+   int internal_phy;
+   bool support_mii;
+   bool support_rmii;
+   bool support_rgmii;
+};
+
+/* struct sunxi_priv_data - hold all sunxi private data
+ * @tx_clk:reference to MAC TX clock
+ * @ephy_clk:  reference to the optional EPHY clock for the internal PHY
+ * @regulator: reference to the optional regulator
+ * @rst_ephy:  reference to the optional EPHY reset for the internal PHY
+ * @variant:   reference to the current board variant
+ * @regmap:regmap for using the syscon
+ * @use_internal_phy: Does the current PHY choice imply using the internal PHY
+ */
+struct sunxi_priv_data {
+   struct clk *tx_clk;
+   struct clk *ephy_clk;
+   struct regulator *regulator;
+   struct reset_control *rst_ephy;
+   const struct emac_variant *variant;
+   struct regmap *regma

[PATCH v2 10/20] ARM: dts: sun8i: Enable dwmac-sun8i on the Orange Pi 2

2017-03-14 Thread Corentin Labbe
The dwmac-sun8i hardware is present on the Orange PI 2.
It uses the internal PHY.

This patch creates the needed emac node.

Signed-off-by: Corentin Labbe 
---
 arch/arm/boot/dts/sun8i-h3-orangepi-2.dts | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-2.dts 
b/arch/arm/boot/dts/sun8i-h3-orangepi-2.dts
index 5b6d145..3f54b12 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-2.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-2.dts
@@ -55,6 +55,7 @@
serial0 = &uart0;
/* ethernet0 is the H3 emac, defined in sun8i-h3.dtsi */
ethernet1 = &rtl8189;
+   ethernet0 = &emac;
};
 
chosen {
@@ -203,3 +204,10 @@
usb1_vbus-supply = <&reg_usb1_vbus>;
status = "okay";
 };
+
+&emac {
+   phy-handle = <&int_mii_phy>;
+   phy-mode = "mii";
+   allwinner,leds-active-low;
+   status = "okay";
+};
-- 
2.10.2



[PATCH v2 11/20] ARM: dts: sun8i: Enable dwmac-sun8i on the Orange PI One

2017-03-14 Thread Corentin Labbe
The dwmac-sun8i hardware is present on the Orange PI One.
It uses the internal PHY.

This patch creates the needed emac node.

Signed-off-by: Corentin Labbe 
---
 arch/arm/boot/dts/sun8i-h3-orangepi-one.dts | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts 
b/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
index ea8fd13..1f98ddc 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
@@ -53,6 +53,7 @@
 
aliases {
serial0 = &uart0;
+   ethernet0 = &emac;
};
 
chosen {
@@ -93,6 +94,13 @@
status = "okay";
 };
 
+&emac {
+   phy-handle = <&int_mii_phy>;
+   phy-mode = "mii";
+   allwinner,leds-active-low;
+   status = "okay";
+};
+
 &mmc0 {
pinctrl-names = "default";
pinctrl-0 = <&mmc0_pins_a>, <&mmc0_cd_pin>;
-- 
2.10.2



Re: [PATCH net-next 1/3] vxlan: don't allow link-local IPv6 local/remote addresses without interface

2017-03-14 Thread Jiri Benc
On Fri, 10 Mar 2017 23:39:42 +0100, Matthias Schiffer wrote:
> Signed-off-by: Matthias Schiffer 

Acked-by: Jiri Benc 


Re: [RFC v1 for accelerated IPoIB 05/25] IB/ipoib: Support ipoib acceleration options callbacks

2017-03-14 Thread Erez Shitrit
On Tue, Mar 14, 2017 at 8:35 AM, Vishwanathapura, Niranjana
 wrote:
> On Mon, Mar 13, 2017 at 08:31:16PM +0200, Erez Shitrit wrote:
>>
>> +static struct net_device *ipoib_create_netdev_default(struct ib_device
>> *hca,
>> + const char *name,
>> + void (*setup)(struct
>> net_device *))
>> {
>> struct net_device *dev;
>> +   struct rdma_netdev *rn;
>>
>> -   dev = alloc_netdev((int)sizeof(struct ipoib_dev_priv), name,
>> -  NET_NAME_UNKNOWN, ipoib_setup);
>> +   dev = alloc_netdev((int)sizeof(struct ipoib_rdma_netdev),
>> +  name,
>> +  NET_NAME_UNKNOWN, setup);
>> if (!dev)
>> return NULL;
>>
>> -   return netdev_priv(dev);
>> +   rn = netdev_priv(dev);
>> +
>> +   rn->ib_dev_init = ipoib_dev_init_default;
>> +   rn->ib_dev_cleanup = ipoib_dev_uninit_default;
>> +   rn->send = ipoib_send;
>> +   rn->attach_mcast = ipoib_mcast_attach;
>> +   rn->detach_mcast = ipoib_mcast_detach;
>> +
>> +   dev->netdev_ops = &ipoib_netdev_default_pf;
>> +
>
>
> Probably no need to set netdev_ops here as it gets overwritten.

No, it is switched, and used.

>
>
>> +   return dev;
>> +}
>> +
>> +struct ipoib_dev_priv *ipoib_intf_alloc(struct ib_device *hca, u8 port,
>> +   const char *name)
>> +{
>> +   struct net_device *dev;
>> +   struct ipoib_dev_priv *priv;
>> +   struct rdma_netdev *rn;
>> +
>> +   priv = kzalloc(sizeof(*priv), GFP_KERNEL);
>> +   if (!priv) {
>> +   pr_err("%s failed allocting priv\n", __func__);
>> +   return NULL;
>> +   }
>> +
>> +   if (!hca->alloc_rdma_netdev)
>> +   dev = ipoib_create_netdev_default(hca, name,
>> ipoib_setup_common);
>> +   else
>> +   dev = hca->alloc_rdma_netdev(hca, port, RDMA_NETDEV_IPOIB,
>> +name, NET_NAME_UNKNOWN,
>> +ipoib_setup_common);
>> +   if (!dev) {
>> +   kfree(priv);
>> +   return NULL;
>> +   }
>
>
> This will break ipoib on hfi1 as hfi1 will define alloc_rdma_netdev for
> OPA_VNIC type. We should probably look for a dedicated return type
> (-ENODEV?) to determine if the driver supports the specified rdma netdev type.
> Or use an ib device attribute to indicate that the driver supports ipoib rdma netdev.

Sorry, I don't understand that. We are in the ipoib driver, so the type is
RDMA_NETDEV_IPOIB. If hfi wants to implement it, it should use the same
flag, and use OPA_VNIC for vnic.


>
> Niranjana
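
The "dedicated return type" idea quoted above can be sketched as a user-space
mock. Everything here is an assumption made for illustration: the struct and
function names are hypothetical, and the real alloc_rdma_netdev callback
returns a net_device pointer rather than an int. The sketch only shows the
control flow, falling back to the default netdev when the provider reports
-ENODEV for a type it does not handle.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mirror of the two netdev types discussed in the thread. */
enum netdev_type { NETDEV_IPOIB, NETDEV_OPA_VNIC };

struct mock_netdev {
	int accelerated;	/* 1 if created by the provider, 0 if default */
};

struct mock_hca {
	/* Returns 0 and fills *out, or a negative errno. May be NULL. */
	int (*alloc_netdev)(struct mock_hca *hca, enum netdev_type type,
			    struct mock_netdev *out);
};

/* A provider that only implements the VNIC type, as hfi1 is said to. */
int vnic_only_alloc(struct mock_hca *hca, enum netdev_type type,
		    struct mock_netdev *out)
{
	(void)hca;
	if (type != NETDEV_OPA_VNIC)
		return -ENODEV;
	out->accelerated = 1;
	return 0;
}

/* Try the provider first; use the default path if the callback is
 * missing or if it reports that the type is not supported.
 */
int create_ipoib_netdev(struct mock_hca *hca, struct mock_netdev *out)
{
	int ret;

	assert(hca != NULL && out != NULL);
	if (hca->alloc_netdev) {
		ret = hca->alloc_netdev(hca, NETDEV_IPOIB, out);
		if (ret != -ENODEV)
			return ret;
	}
	out->accelerated = 0;	/* default (non-accelerated) netdev */
	return 0;
}
```

In this mock, a provider that only knows the VNIC type no longer prevents
ipoib from coming up; other error codes are still passed through to the
caller.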


Re: net: deadlock between ip_expire/sch_direct_xmit

2017-03-14 Thread Eric Dumazet
On Tue, 2017-03-14 at 14:31 +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I've got the following deadlock report while running syzkaller fuzzer
> on net-next/92cd12c5ed432c5eebd2462d666772a8d8442c3b:
> 
> 
> [ INFO: possible circular locking dependency detected ]
> 4.10.0+ #29 Not tainted
> ---
> modprobe/12392 is trying to acquire lock:
>  (_xmit_ETHER#2){+.-...}, at: [] spin_lock
> include/linux/spinlock.h:299 [inline]
>  (_xmit_ETHER#2){+.-...}, at: [] __netif_tx_lock
> include/linux/netdevice.h:3486 [inline]
>  (_xmit_ETHER#2){+.-...}, at: []
> sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180
> 
> but task is already holding lock:
>  (&(&q->lock)->rlock){+.-...}, at: [] spin_lock
> include/linux/spinlock.h:299 [inline]
>  (&(&q->lock)->rlock){+.-...}, at: []
> ip_expire+0x51/0x6c0 net/ipv4/ip_fragment.c:201
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (&(&q->lock)->rlock){+.-...}:
>validate_chain kernel/locking/lockdep.c:2267 [inline]
>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>_raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
>spin_lock include/linux/spinlock.h:299 [inline]
>ip_defrag+0x3a2/0x4130 net/ipv4/ip_fragment.c:669
>ip_check_defrag+0x4e3/0x8b0 net/ipv4/ip_fragment.c:713
>packet_rcv_fanout+0x282/0x800 net/packet/af_packet.c:1459
>deliver_skb net/core/dev.c:1834 [inline]
>dev_queue_xmit_nit+0x294/0xa90 net/core/dev.c:1890
>xmit_one net/core/dev.c:2903 [inline]
>dev_hard_start_xmit+0x16b/0xab0 net/core/dev.c:2923
>sch_direct_xmit+0x31f/0x6d0 net/sched/sch_generic.c:182
>__dev_xmit_skb net/core/dev.c:3092 [inline]
>__dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358
>dev_queue_xmit+0x17/0x20 net/core/dev.c:3423
>neigh_resolve_output+0x6b9/0xb10 net/core/neighbour.c:1308
>neigh_output include/net/neighbour.h:478 [inline]
>ip_finish_output2+0x8b8/0x15a0 net/ipv4/ip_output.c:228
>ip_do_fragment+0x1d93/0x2720 net/ipv4/ip_output.c:672
>ip_fragment.constprop.54+0x145/0x200 net/ipv4/ip_output.c:545
>ip_finish_output+0x82d/0xe10 net/ipv4/ip_output.c:314
>NF_HOOK_COND include/linux/netfilter.h:246 [inline]
>ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404
>dst_output include/net/dst.h:486 [inline]
>ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124
>ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492
>ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512
>raw_sendmsg+0x26de/0x3a00 net/ipv4/raw.c:655
>inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
>sock_sendmsg_nosec net/socket.c:633 [inline]
>sock_sendmsg+0xca/0x110 net/socket.c:643
>___sys_sendmsg+0x4a3/0x9f0 net/socket.c:1985
>__sys_sendmmsg+0x25c/0x750 net/socket.c:2075
>SYSC_sendmmsg net/socket.c:2106 [inline]
>SyS_sendmmsg+0x35/0x60 net/socket.c:2101
>do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
>return_from_SYSCALL_64+0x0/0x7a
> 
> -> #0 (_xmit_ETHER#2){+.-...}:
>check_prev_add kernel/locking/lockdep.c:1830 [inline]
>check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1940
>validate_chain kernel/locking/lockdep.c:2267 [inline]
>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>_raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
>spin_lock include/linux/spinlock.h:299 [inline]
>__netif_tx_lock include/linux/netdevice.h:3486 [inline]
>sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180
>__dev_xmit_skb net/core/dev.c:3092 [inline]
>__dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358
>dev_queue_xmit+0x17/0x20 net/core/dev.c:3423
>neigh_hh_output include/net/neighbour.h:468 [inline]
>neigh_output include/net/neighbour.h:476 [inline]
>ip_finish_output2+0xf6c/0x15a0 net/ipv4/ip_output.c:228
>ip_finish_output+0xa29/0xe10 net/ipv4/ip_output.c:316
>NF_HOOK_COND include/linux/netfilter.h:246 [inline]
>ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404
>dst_output include/net/dst.h:486 [inline]
>ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124
>ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492
>ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512
>icmp_push_reply+0x372/0x4d0 net/ipv4/icmp.c:394
>icmp_send+0x156c/0x1c80 net/ipv4/icmp.c:754
>ip_expire+0x40e/0x6c0 net/ipv4/ip_fragment.c:239
>call_timer_fn+0x241/0x820 kernel/time/timer.c:12

Re: net: deadlock between ip_expire/sch_direct_xmit

2017-03-14 Thread Dmitry Vyukov
On Tue, Mar 14, 2017 at 3:41 PM, Eric Dumazet  wrote:
> On Tue, 2017-03-14 at 14:31 +0100, Dmitry Vyukov wrote:
>> Hello,
>>
>> I've got the following deadlock report while running syzkaller fuzzer
>> on net-next/92cd12c5ed432c5eebd2462d666772a8d8442c3b:
>>
>>
>> [ INFO: possible circular locking dependency detected ]
>> 4.10.0+ #29 Not tainted
>> ---
>> modprobe/12392 is trying to acquire lock:
>>  (_xmit_ETHER#2){+.-...}, at: [] spin_lock
>> include/linux/spinlock.h:299 [inline]
>>  (_xmit_ETHER#2){+.-...}, at: [] __netif_tx_lock
>> include/linux/netdevice.h:3486 [inline]
>>  (_xmit_ETHER#2){+.-...}, at: []
>> sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180
>>
>> but task is already holding lock:
>>  (&(&q->lock)->rlock){+.-...}, at: [] spin_lock
>> include/linux/spinlock.h:299 [inline]
>>  (&(&q->lock)->rlock){+.-...}, at: []
>> ip_expire+0x51/0x6c0 net/ipv4/ip_fragment.c:201
>>
>> which lock already depends on the new lock.
>>
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (&(&q->lock)->rlock){+.-...}:
>>validate_chain kernel/locking/lockdep.c:2267 [inline]
>>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>>__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>>_raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
>>spin_lock include/linux/spinlock.h:299 [inline]
>>ip_defrag+0x3a2/0x4130 net/ipv4/ip_fragment.c:669
>>ip_check_defrag+0x4e3/0x8b0 net/ipv4/ip_fragment.c:713
>>packet_rcv_fanout+0x282/0x800 net/packet/af_packet.c:1459
>>deliver_skb net/core/dev.c:1834 [inline]
>>dev_queue_xmit_nit+0x294/0xa90 net/core/dev.c:1890
>>xmit_one net/core/dev.c:2903 [inline]
>>dev_hard_start_xmit+0x16b/0xab0 net/core/dev.c:2923
>>sch_direct_xmit+0x31f/0x6d0 net/sched/sch_generic.c:182
>>__dev_xmit_skb net/core/dev.c:3092 [inline]
>>__dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358
>>dev_queue_xmit+0x17/0x20 net/core/dev.c:3423
>>neigh_resolve_output+0x6b9/0xb10 net/core/neighbour.c:1308
>>neigh_output include/net/neighbour.h:478 [inline]
>>ip_finish_output2+0x8b8/0x15a0 net/ipv4/ip_output.c:228
>>ip_do_fragment+0x1d93/0x2720 net/ipv4/ip_output.c:672
>>ip_fragment.constprop.54+0x145/0x200 net/ipv4/ip_output.c:545
>>ip_finish_output+0x82d/0xe10 net/ipv4/ip_output.c:314
>>NF_HOOK_COND include/linux/netfilter.h:246 [inline]
>>ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404
>>dst_output include/net/dst.h:486 [inline]
>>ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124
>>ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492
>>ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512
>>raw_sendmsg+0x26de/0x3a00 net/ipv4/raw.c:655
>>inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
>>sock_sendmsg_nosec net/socket.c:633 [inline]
>>sock_sendmsg+0xca/0x110 net/socket.c:643
>>___sys_sendmsg+0x4a3/0x9f0 net/socket.c:1985
>>__sys_sendmmsg+0x25c/0x750 net/socket.c:2075
>>SYSC_sendmmsg net/socket.c:2106 [inline]
>>SyS_sendmmsg+0x35/0x60 net/socket.c:2101
>>do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
>>return_from_SYSCALL_64+0x0/0x7a
>>
>> -> #0 (_xmit_ETHER#2){+.-...}:
>>check_prev_add kernel/locking/lockdep.c:1830 [inline]
>>check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1940
>>validate_chain kernel/locking/lockdep.c:2267 [inline]
>>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340
>>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
>>__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>>_raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
>>spin_lock include/linux/spinlock.h:299 [inline]
>>__netif_tx_lock include/linux/netdevice.h:3486 [inline]
>>sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180
>>__dev_xmit_skb net/core/dev.c:3092 [inline]
>>__dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358
>>dev_queue_xmit+0x17/0x20 net/core/dev.c:3423
>>neigh_hh_output include/net/neighbour.h:468 [inline]
>>neigh_output include/net/neighbour.h:476 [inline]
>>ip_finish_output2+0xf6c/0x15a0 net/ipv4/ip_output.c:228
>>ip_finish_output+0xa29/0xe10 net/ipv4/ip_output.c:316
>>NF_HOOK_COND include/linux/netfilter.h:246 [inline]
>>ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404
>>dst_output include/net/dst.h:486 [inline]
>>ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124
>>ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492
>>ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512
>>icmp_push_reply+0x372/0x4d0 net/ipv4/icmp.c:394
>>icmp_send+0x156c/

Re: [PATCH v2] fjes: Do not load fjes driver if system does not have extended socket device.

2017-03-14 Thread YASUAKI ISHIMATSU



On 03/12/2017 02:29 PM, Bjørn Mork wrote:

Yasuaki Ishimatsu  writes:


The fjes driver is used only by FUJITSU servers, and almost all
servers in the world never use it. But currently if ACPI PNP0C02
is defined in the ACPI table, the following message is always shown:

 "FUJITSU Extended Socket Network Device Driver - version 1.2
  - Copyright (c) 2015 FUJITSU LIMITED"


Matching on PNP0C02 is fundamentally wrong. It's a way to load a device
driver on all ACPI systems.  You should not do that. I don't think it is
fair to make everyone suffer because of your inability to properly
narrow down the driver matching rules.


There are many similar matching rules. But those modules are not
listed in blacklists, because they have a proper check like the one in
my patch, and no one suffers.

So I don't think the matching rule is fundamentally wrong.

Thanks,
Yasuaki Ishimatsu


Could we please just delete the whole MODULE_DEVICE_TABLE() from this
driver until a proper solution is found? That way we don't need to
blacklist the driver everywhere.





Bjørn


