date:20170112

[patch net 2/3] mlxsw: switchx2: Fix memory leak at skb reallocation

From: Arkadi Sharshevsky 

During transmission the skb is checked for headroom in order to
add a vendor-specific header. In case the skb needs to be re-allocated,
skb_realloc_headroom() is called to make a private copy of the original,
but it doesn't release the original. The current code assumes that the
original skb is released during reallocation and only releases it on the
error path, which causes a memory leak.

Fix this by adding the original skb release to the main path.

Fixes: d003462a50de ("mlxsw: Simplify mlxsw_sx_port_xmit function")
Signed-off-by: Arkadi Sharshevsky 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/switchx2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
index 150ccf5..2e88115e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
@@ -345,6 +345,7 @@ static netdev_tx_t mlxsw_sx_port_xmit(struct sk_buff *skb,
dev_kfree_skb_any(skb_orig);
return NETDEV_TX_OK;
}
+   dev_consume_skb_any(skb_orig);
}
mlxsw_sx_txhdr_construct(skb, &tx_info);
/* TX header is consumed by HW on the way so we shouldn't count its
-- 
2.7.4
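
The leak described in the changelog above can be reproduced in a small
userspace model. This is only a sketch: struct buf, buf_alloc(), buf_free(),
buf_realloc_headroom() and xmit() are hypothetical stand-ins for the sk_buff
API, and the release_orig flag exists only to compare the code before and
after the fix.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct sk_buff: payload plus reserved headroom. */
struct buf {
	size_t headroom;
	size_t len;
	unsigned char *data;
};

static int live_bufs; /* number of buffers currently allocated */

static struct buf *buf_alloc(size_t headroom, size_t len)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->data = calloc(1, headroom + len);
	if (!b->data) {
		free(b);
		return NULL;
	}
	b->headroom = headroom;
	b->len = len;
	live_bufs++;
	return b;
}

static void buf_free(struct buf *b)
{
	free(b->data);
	free(b);
	live_bufs--;
}

/* Like skb_realloc_headroom(): returns a private copy, leaves orig alone. */
static struct buf *buf_realloc_headroom(const struct buf *orig, size_t headroom)
{
	struct buf *copy = buf_alloc(headroom, orig->len);

	if (copy)
		memcpy(copy->data + headroom, orig->data + orig->headroom,
		       orig->len);
	return copy;
}

/* Transmit path; release_orig selects the fixed (1) or leaky (0) behaviour. */
static int xmit(struct buf *b, size_t hdr_len, int release_orig)
{
	if (b->headroom < hdr_len) {
		struct buf *orig = b;

		b = buf_realloc_headroom(orig, hdr_len);
		if (!b) {
			buf_free(orig); /* error path: handled before the fix */
			return -1;
		}
		if (release_orig)
			buf_free(orig); /* main path: the release the patch adds */
	}
	buf_free(b); /* stands in for the hardware consuming the frame */
	return 0;
}
```

With release_orig set to 0 the live_bufs counter stays at 1 after a
reallocating transmit, which is the leak; with 1 it returns to 0.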

[patch net 3/3] mlxsw: pci: Fix EQE structure definition

From: Elad Raz 

The event_data occupies addresses 0x00-0x0C and not 0x08-0x14. With the old
offsets these fields overlap other fields in the Event Queue Element such as
sub-type, cqn and owner.

Fixes: eda6500a987a0 ("mlxsw: Add PCI bus implementation")
Signed-off-by: Elad Raz 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/pci_hw.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
index d147ddd..0af3338 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
@@ -209,21 +209,21 @@ MLXSW_ITEM32(pci, eqe, owner, 0x0C, 0, 1);
 /* pci_eqe_cmd_token
  * Command completion event - token
  */
-MLXSW_ITEM32(pci, eqe, cmd_token, 0x08, 16, 16);
+MLXSW_ITEM32(pci, eqe, cmd_token, 0x00, 16, 16);
 
 /* pci_eqe_cmd_status
  * Command completion event - status
  */
-MLXSW_ITEM32(pci, eqe, cmd_status, 0x08, 0, 8);
+MLXSW_ITEM32(pci, eqe, cmd_status, 0x00, 0, 8);
 
 /* pci_eqe_cmd_out_param_h
  * Command completion event - output parameter - higher part
  */
-MLXSW_ITEM32(pci, eqe, cmd_out_param_h, 0x0C, 0, 32);
+MLXSW_ITEM32(pci, eqe, cmd_out_param_h, 0x04, 0, 32);
 
 /* pci_eqe_cmd_out_param_l
  * Command completion event - output parameter - lower part
  */
-MLXSW_ITEM32(pci, eqe, cmd_out_param_l, 0x10, 0, 32);
+MLXSW_ITEM32(pci, eqe, cmd_out_param_l, 0x08, 0, 32);
 
 #endif
-- 
2.7.4
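
To see why the old offsets collide, the (offset, shift, size) triples from the
MLXSW_ITEM32() lines above can be modelled in userspace. This is a rough
sketch: struct item32, item32_get() and item32_overlap() are hypothetical
helpers, and the big-endian word layout is an assumption made for the example
rather than something stated in the patch.

```c
#include <stdint.h>

/* Hypothetical descriptor mirroring the (offset, shift, size) arguments. */
struct item32 {
	unsigned int offset; /* byte offset of the 32-bit word */
	unsigned int shift;  /* position of the field's lowest bit */
	unsigned int size;   /* field width in bits */
};

/* Read a field, assuming the element is stored as big-endian words. */
static uint32_t item32_get(const uint8_t *buf, struct item32 it)
{
	const uint8_t *p = buf + it.offset;
	uint32_t word = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
			((uint32_t)p[2] << 8) | (uint32_t)p[3];
	uint32_t mask = it.size == 32 ? UINT32_MAX : ((1u << it.size) - 1);

	return (word >> it.shift) & mask;
}

/* Two fields collide when they share a word and their bit ranges intersect. */
static int item32_overlap(struct item32 a, struct item32 b)
{
	if (a.offset != b.offset)
		return 0;
	return a.shift < b.shift + b.size && b.shift < a.shift + a.size;
}

/* Triples copied from the diff above. */
static const struct item32 eqe_owner = { 0x0C, 0, 1 };
static const struct item32 eqe_cmd_out_param_h_old = { 0x0C, 0, 32 };
static const struct item32 eqe_cmd_out_param_h_new = { 0x04, 0, 32 };
static const struct item32 eqe_cmd_token_new = { 0x00, 16, 16 };
```

In this model the old cmd_out_param_h descriptor shares bit 0 of the word at
0x0C with owner, while the new one at 0x04 does not.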

[patch net 0/3] mlxsw: Couple of fixes

From: Jiri Pirko 

Couple of simple fixes from Arkadi and Elad.

Please queue these up for stable. Thanks.

Arkadi Sharshevsky (2):
  mlxsw: spectrum: Fix memory leak at skb reallocation
  mlxsw: switchx2: Fix memory leak at skb reallocation

Elad Raz (1):
  mlxsw: pci: Fix EQE structure definition

 drivers/net/ethernet/mellanox/mlxsw/pci_hw.h   | 8 ++++----
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 1 +
 drivers/net/ethernet/mellanox/mlxsw/switchx2.c | 1 +
 3 files changed, 6 insertions(+), 4 deletions(-)

-- 
2.7.4

[patch net 1/3] mlxsw: spectrum: Fix memory leak at skb reallocation

From: Arkadi Sharshevsky 

During transmission the skb is checked for headroom in order to
add a vendor-specific header. In case the skb needs to be re-allocated,
skb_realloc_headroom() is called to make a private copy of the original,
but it doesn't release the original. The current code assumes that the
original skb is released during reallocation and only releases it on the
error path, which causes a memory leak.

Fix this by adding the original skb release to the main path.

Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Arkadi Sharshevsky 
Reviewed-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index d768c7b..003093a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -684,6 +684,7 @@ static netdev_tx_t mlxsw_sp_port_xmit(struct sk_buff *skb,
dev_kfree_skb_any(skb_orig);
return NETDEV_TX_OK;
}
+   dev_consume_skb_any(skb_orig);
}
 
if (eth_skb_pad(skb)) {
-- 
2.7.4

[PATCH iproute2/net-next 2/2] tc: flower: Support matching ARP

Support matching on ARP operation, and hardware and protocol addresses
for Ethernet hardware and IPv4 protocol addresses.

Example usage:

tc qdisc add dev eth0 ingress

tc filter add dev eth0 protocol arp parent : flower indev eth0 \
arp_op request arp_sip 10.0.0.1 action drop
tc filter add dev eth0 protocol rarp parent : flower indev eth0 \
arp_op reply arp_tha 52:54:3f:00:00:00/24 action drop

Signed-off-by: Simon Horman 
---
 man/man8/tc-flower.8 |  41 +-
 tc/f_flower.c| 208 +++
 2 files changed, 232 insertions(+), 17 deletions(-)

diff --git a/man/man8/tc-flower.8 b/man/man8/tc-flower.8
index 5904a9ecafdf..2dd2c5e6e4a5 100644
--- a/man/man8/tc-flower.8
+++ b/man/man8/tc-flower.8
@@ -34,7 +34,13 @@ flower \- flow based traffic control filter
 .BR dst_ip " | " src_ip " } "
 .IR PREFIX " | { "
 .BR dst_port " | " src_port " } "
-.IR port_number " } | "
+.IR port_number " } | { "
+.BR arp_tip " | " arp_sip " } "
+.IR PREFIX " | "
+.BR arp_op " { " request " | " reply " | "
+.IR OP " } | { "
+.BR arp_tha " | " arp_sha " } "
+.IR MASKED_LLADDR " | "
 .B enc_key_id
 .IR KEY-ID " | {"
 .BR enc_dst_ip " | " enc_src_ip " } { "
@@ -131,6 +137,36 @@ Match on ICMP type or code. Only available for
 .BR ip_proto " values " icmp  " and " icmpv6
 which have to be specified in beforehand.
 .TP
+.BI arp_tip " PREFIX"
+.TQ
+.BI arp_sip " PREFIX"
+Match on ARP or RARP sender or target IP address.
+.I PREFIX
+must be a valid IPv4 address optionally followed by a slash and the prefix
+length. If the prefix is missing, \fBtc\fR assumes a full-length host
+match.
+.TP
+.BI arp_op " ARP_OP"
+Match on ARP or RARP operation.
+.I ARP_OP
+may be
+.BR request ", " reply
+or an integer value 0, 1 or 2.  A mask may be optionally provided to limit
+the bits of the operation which are matched. A mask is provided by
+following the address with a slash and then the mask. It may be provided as
+an unsigned 8 bit value representing a bitwise mask. If the mask is missing
+then a match on all bits is assumed.
+.TP
+.BI arp_sha " MASKED_LLADDR"
+.TQ
+.BI arp_tha " MASKED_LLADDR"
+Match on ARP or RARP sender or target MAC address.  A mask may be optionally
+provided to limit the bits of the address which are matched. A mask is
+provided by following the address with a slash and then the mask. It may be
+provided in LLADDR format, in which case it is a bitwise mask, or as a
+number of high bits to match. If the mask is missing then a match on all
+bits is assumed.
+.TP
 .BI enc_key_id " NUMBER"
 .TQ
 .BI enc_dst_ip " PREFIX"
@@ -152,7 +188,8 @@ As stated above where applicable, matches of a certain layer implicitly depend
 on the matches of the next lower layer. Precisely, layer one and two matches
 (\fBindev\fR,  \fBdst_mac\fR and \fBsrc_mac\fR)
 have no dependency, layer three matches
-(\fBip_proto\fR, \fBdst_ip\fR and \fBsrc_ip\fR)
+(\fBip_proto\fR, \fBdst_ip\fR, \fBsrc_ip\fR, \fBarp_tip\fR, \fBarp_sip\fR,
+\fBarp_op\fR, \fBarp_tha\fR and \fBarp_sha\fR)
 depend on the
 .B protocol
 option of tc filter, layer four port matches
diff --git a/tc/f_flower.c b/tc/f_flower.c
index 99f5f8163ee0..d301db36a549 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -54,6 +55,11 @@ static void explain(void)
"   src_port PORT-NUMBER |\n"
"   type ICMP-TYPE |\n"
"   code ICMP-CODE |\n"
+   "   arp_tip PREFIX |\n"
+   "   arp_sip PREFIX |\n"
+   "   arp_op [ request | reply | OP ] |\n"
+   "   arp_tha MASKED-LLADDR |\n"
+   "   arp_sha MASKED-LLADDR |\n"
"   enc_dst_ip [ IPV4-ADDR | IPV6-ADDR ] 
|\n"
"   enc_src_ip [ IPV4-ADDR | IPV6-ADDR ] 
|\n"
"   enc_key_id [ KEY-ID ] |\n"
@@ -192,27 +198,16 @@ err:
return -1;
 }
 
-static int flower_parse_ip_addr(char *str, __be16 eth_type,
-   int addr4_type, int mask4_type,
-   int addr6_type, int mask6_type,
-   struct nlmsghdr *n)
+static int __flower_parse_ip_addr(char *str, int family,
+ int addr4_type, int mask4_type,
+ int addr6_type, int mask6_type,
+ struct nlmsghdr *n)
 {
int ret;
inet_prefix addr;
-   int family;
int bits;
int i;
 
-   if (eth_type == htons(ETH_P_IP)) {
-   family = AF_INET;
-   } else if (eth_type == htons(ETH_P_IPV6)) {
-   family = AF_INET6;
-   } else if (!eth_typ

[PATCH iproute2/net-next 1/2] tc: flower: update headers for TCA_FLOWER_KEY_ARP*

Present in net-next.

Signed-off-by: Simon Horman 
---
 include/linux/pkt_cls.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/pkt_cls.h b/include/linux/pkt_cls.h
index a081efbd61a2..1e5e1ddfdaca 100644
--- a/include/linux/pkt_cls.h
+++ b/include/linux/pkt_cls.h
@@ -416,6 +416,17 @@ enum {
TCA_FLOWER_KEY_ICMPV6_TYPE, /* u8 */
TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,/* u8 */
 
+   TCA_FLOWER_KEY_ARP_SIP, /* be32 */
+   TCA_FLOWER_KEY_ARP_SIP_MASK,/* be32 */
+   TCA_FLOWER_KEY_ARP_TIP, /* be32 */
+   TCA_FLOWER_KEY_ARP_TIP_MASK,/* be32 */
+   TCA_FLOWER_KEY_ARP_OP,  /* u8 */
+   TCA_FLOWER_KEY_ARP_OP_MASK, /* u8 */
+   TCA_FLOWER_KEY_ARP_SHA, /* ETH_ALEN */
+   TCA_FLOWER_KEY_ARP_SHA_MASK,/* ETH_ALEN */
+   TCA_FLOWER_KEY_ARP_THA, /* ETH_ALEN */
+   TCA_FLOWER_KEY_ARP_THA_MASK,/* ETH_ALEN */
+
__TCA_FLOWER_MAX,
 };
 
-- 
2.7.0.rc3.207.g0ac5344

[PATCH iproute2/net-next 0/2] net/sched: cls_flower: Support matching ARP

Add support for matching on ARP operation, and hardware and
protocol addresses for Ethernet hardware and IPv4 protocol addresses.

Changes since RFC:
* Drop RFC designation; kernel patches are present in net-next

Simon Horman (2):
  tc: flower: update headers for TCA_FLOWER_KEY_ARP*
  tc: flower: Support matching ARP

 include/linux/pkt_cls.h |  11 +++
 man/man8/tc-flower.8|  41 +-
 tc/f_flower.c   | 208 
 3 files changed, 243 insertions(+), 17 deletions(-)

-- 
2.7.0.rc3.207.g0ac5344
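
The man page text in 2/2 says a hardware address mask may be given "as a
number of high bits to match", as in the 52:54:3f:00:00:00/24 example. A
minimal userspace sketch of that conversion follows. mac_mask_from_bits() and
mac_match() are hypothetical names for illustration only and are not taken
from tc/f_flower.c.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6 /* same value as the kernel's ETH_ALEN */

/* Build a mask with the top `bits` bits set; returns -1 if bits > 48. */
static int mac_mask_from_bits(unsigned int bits, uint8_t mask[MAC_LEN])
{
	unsigned int i;

	if (bits > MAC_LEN * 8)
		return -1;
	memset(mask, 0, MAC_LEN);
	for (i = 0; i < MAC_LEN && bits > 0; i++) {
		unsigned int n = bits >= 8 ? 8 : bits;

		mask[i] = (uint8_t)(0xff << (8 - n));
		bits -= n;
	}
	return 0;
}

/* An address matches when it equals the key on every masked bit. */
static int mac_match(const uint8_t addr[MAC_LEN], const uint8_t key[MAC_LEN],
		     const uint8_t mask[MAC_LEN])
{
	unsigned int i;

	for (i = 0; i < MAC_LEN; i++)
		if ((addr[i] ^ key[i]) & mask[i])
			return 0;
	return 1;
}
```

For /24 this yields the mask ff:ff:ff:00:00:00, so any address starting with
52:54:3f matches the example filter.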

Re: [PATCH] can: Fix kernel panic at security_sock_rcv_skb

2017-01-12 Thread Oliver Hartkopp




On 01/12/2017 07:33 AM, Liu ShuoX wrote:

From: Zhang Yanmin 

This patch fixes the kernel panic below:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] selinux_socket_sock_rcv_skb+0x65/0x2a0

Call Trace:
 
 [] security_sock_rcv_skb+0x4c/0x60
 [] sk_filter+0x41/0x210
 [] sock_queue_rcv_skb+0x53/0x3a0
 [] raw_rcv+0x2a3/0x3c0
 [] can_rcv_filter+0x12b/0x370
 [] can_receive+0xd9/0x120
 [] can_rcv+0xab/0x100
 [] __netif_receive_skb_core+0xd8c/0x11f0
 [] __netif_receive_skb+0x24/0xb0
 [] process_backlog+0x127/0x280
 [] net_rx_action+0x33b/0x4f0
 [] __do_softirq+0x184/0x440
 [] do_softirq_own_stack+0x1c/0x30
 
 [] do_softirq.part.18+0x3b/0x40
 [] do_softirq+0x1d/0x20
 [] netif_rx_ni+0xe5/0x110
 [] slcan_receive_buf+0x507/0x520
 [] flush_to_ldisc+0x21c/0x230
 [] process_one_work+0x24f/0x670
 [] worker_thread+0x9d/0x6f0
 [] ? rescuer_thread+0x480/0x480
 [] kthread+0x12c/0x150
 [] ret_from_fork+0x3f/0x70

The sk dereferenced in the panic had already been released. After the
call_rcu() in can_rx_unregister(), the receiver itself is protected by RCU
but the data it points to is not, so the sk can be freed while another CPU is
still using it. We need to wait here to make sure the sk referenced via the
receiver is still safe to use.

=> security_sk_free
=> sk_destruct
=> __sk_free
=> sk_free
=> raw_release
=> sock_release
=> sock_close
=> __fput
=> fput
=> task_work_run
=> exit_to_usermode_loop
=> syscall_return_slowpath
=> int_ret_from_sys_call

Signed-off-by: Zhang Yanmin 
Signed-off-by: He, Bo 
Signed-off-by: Liu Shuo A 
---
 net/can/af_can.c | 14 --
 net/can/af_can.h |  1 -
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/can/af_can.c b/net/can/af_can.c
index 1108079..fcbe971 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -517,10 +517,8 @@ int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
 /*
  * can_rx_delete_receiver - rcu callback for single receiver entry removal
  */
-static void can_rx_delete_receiver(struct rcu_head *rp)
+static void can_rx_delete_receiver(struct receiver *r)
 {
-   struct receiver *r = container_of(rp, struct receiver, rcu);
-
kmem_cache_free(rcv_cache, r);
 }

@@ -595,9 +593,13 @@ void can_rx_unregister(struct net_device *dev, canid_t can_id, canid_t mask,
  out:
spin_unlock(&can_rcvlists_lock);

-   /* schedule the receiver item for deletion */
-   if (r)
-   call_rcu(&r->rcu, can_rx_delete_receiver);
+   /* synchronize_rcu to wait until a grace period has elapsed, to make
+* sure all receiver's sk dereferenced by others.
+*/
+   if (r) {
+   synchronize_rcu();
+   can_rx_delete_receiver(r);


Nitpick: When can_rx_delete_receiver() just contains 
kmem_cache_free(rcv_cache, r), then the function definition should be 
removed.


But my main concern is:

The reason why can_rx_delete_receiver() was introduced was the need to 
remove a huge number of receivers with can_rx_unregister().


When you call synchronize_rcu() after each receiver removal this would 
potentially lead to a big performance issue when e.g. closing CAN_RAW 
sockets with a high number of receivers.


So the idea was to remove/unlink the receiver hlist_del_rcu(&r->list) 
and also kmem_cache_free(rcv_cache, r) by some rcu mechanism - so that 
all elements are cleaned up by rcu at a later point.


Is it possible that the problems emerge due to hlist_del_rcu(&r->list) 
and you accidentally fix it with your introduced synchronize_rcu()?


Regards,
Oliver



+   }
 }
 EXPORT_SYMBOL(can_rx_unregister);

diff --git a/net/can/af_can.h b/net/can/af_can.h
index fca0fe9..a0cbf83 100644
--- a/net/can/af_can.h
+++ b/net/can/af_can.h
@@ -50,7 +50,6 @@

 struct receiver {
struct hlist_node list;
-   struct rcu_head rcu;
canid_t can_id;
canid_t mask;
unsigned long matches;
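
For readers following the diff: the original callback receives only the
embedded rcu_head and recovers the enclosing receiver with container_of(),
which is why dropping the rcu member and changing the callback signature go
together. The pattern can be sketched in userspace as below. struct cb_head
and struct receiver_model are hypothetical stand-ins, and the freed flag
replaces the real kmem_cache_free() call.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct rcu_head and struct receiver. */
struct cb_head {
	void (*func)(struct cb_head *head);
};

struct receiver_model {
	int can_id;
	struct cb_head rcu; /* embedded, so no separate allocation is needed */
	int freed;
};

/* Callback shape used with call_rcu(): it only receives the embedded head. */
static void receiver_model_delete(struct cb_head *head)
{
	struct receiver_model *r =
		container_of(head, struct receiver_model, rcu);

	r->freed = 1; /* the real callback calls kmem_cache_free() here */
}
```

Because the head lives inside the object, each removal can be queued without
blocking, which is the property Oliver's performance concern relies on.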

Re: [PATCH net-next 2/2] bpf: allow b/h/w/dw access for bpf's cb in ctx

2017-01-12 Thread Quentin Monnet

Hi Daniel,

2017-01-12 (02:21 +0100) ~ Daniel Borkmann 
> When structs are used to store temporary state in cb[] buffer that is
> used with programs and among tail calls, then the generated code will
> not always access the buffer in bpf_w chunks. We can ease programming
> of it and let this act more natural by allowing for aligned b/h/w/dw
> sized access for cb[] ctx member. Various test cases are attached as
> well for the selftest suite. Potentially, this can also be reused for
> other program types to pass data around.
> 
> Signed-off-by: Daniel Borkmann 
> Acked-by: Alexei Starovoitov 
> ---
>  kernel/bpf/verifier.c   |   8 +-
>  net/core/filter.c   |  41 ++-
>  tools/testing/selftests/bpf/test_verifier.c | 442 +++-
>  3 files changed, 478 insertions(+), 13 deletions(-)
> 

[...]

> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> index 9bb4534..f664bed 100644
> --- a/tools/testing/selftests/bpf/test_verifier.c
> +++ b/tools/testing/selftests/bpf/test_verifier.c
> @@ -859,15 +859,451 @@ struct test_val {

[...]

> + {
> + "check cb access: doulbe, oob 5",
> + .insns = {
> + BPF_MOV64_IMM(BPF_REG_0, 0),
> + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1,
> + offsetof(struct __sk_buff, cb[4]) + 8),
> + BPF_EXIT_INSN(),
> + },
> + .errstr = "invalid bpf_context access",
> + .result = REJECT,
> + },

Nitpicking: typo ("doulbe").

Regards,
Quentin
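
As a reading aid for the quoted test case, the rule "aligned b/h/w/dw sized
access for cb[]" can be approximated by the check below. This is a simplified
sketch and not the verifier's actual code: cb_access_ok() is a hypothetical
helper, offsets are taken relative to the start of cb[], and cb[] is assumed
to begin at an 8-byte-aligned offset inside struct __sk_buff.

```c
#include <stddef.h>
#include <stdint.h>

#define CB_BYTES (5 * sizeof(uint32_t)) /* __sk_buff has __u32 cb[5] */

/*
 * Simplified check: off is relative to the start of cb[], size is 1, 2, 4
 * or 8. Returns 1 when the access is naturally aligned and stays inside
 * cb[], 0 otherwise.
 */
static int cb_access_ok(size_t off, size_t size)
{
	if (size != 1 && size != 2 && size != 4 && size != 8)
		return 0;
	if (off % size)
		return 0;
	return off + size <= CB_BYTES;
}
```

The quoted test loads a double word at offsetof(cb[4]) + 8, that is relative
offset 24, which is past the 20 bytes of cb[] and so would be rejected here
too.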

Re: [PATCH] [net] net/mlx5e: fix another -Wmaybe-uninitialized warning

2017-01-12 Thread Or Gerlitz


On 1/11/2017 11:14 PM, Arnd Bergmann wrote:

@@ -666,14 +666,15 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv,
struct rtable *rt;
struct neighbour *n = NULL;
int ttl;
+   int ret;
+
+   if (!IS_ENABLED(CONFIG_INET))
+   return -EOPNOTSUPP;
  
-#if IS_ENABLED(CONFIG_INET)

rt = ip_route_output_key(dev_net(mirred_dev), fl4);
-   if (IS_ERR(rt))
-   return PTR_ERR(rt);
-#else
-   return -EOPNOTSUPP;
-#endif
+   ret = PTR_ERR_OR_ZERO(rt);
+   if (ret)
+   return ret;


but this means that if we got NULL from ip_route_output_key, we will 
return success (0) here which is wrong.
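
The NULL behaviour in question can be checked with userspace
re-implementations of the err.h helpers. This is a sketch written from memory
of their usual definitions, using intptr_t/uintptr_t instead of the kernel's
long casts; it is not copied from the kernel tree.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer in the top 4095 addresses. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers in the error range; NULL is not in that range. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}
```

In this model PTR_ERR_OR_ZERO(NULL) is indeed 0. IS_ERR(NULL) is also false,
so the removed IS_ERR()/PTR_ERR() pair would appear to let a NULL pointer
through in the same way.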

Re: [PATCH iproute2 1/3] sr: add header files for SR-IPv6

2017-01-12 Thread David Lebrun

On 01/10/2017 07:33 PM, Stephen Hemminger wrote:
> I get all headers from sanitized kernel headers generated by
>   $ make headers_install
> but the segmentation stuff is missing.
> 
> When you added segment routing headers you forgot to export them.
> Please send a patch to include/uapi/linux/Kbuild, after that is merged
> I will pick them up.
> 
> Also this patch is only for net-next.

Oops ! Will do that, thanks

David




Re: [PATCH] wext: handle NULL exta data in iwe_stream_add_point better

2017-01-12 Thread Johannes Berg

On Wed, 2017-01-11 at 21:39 +0100, Arnd Bergmann wrote:
> On Wednesday, January 11, 2017 4:06:17 PM CET Johannes Berg wrote:
> > 
> > Applied. Also fixed the typo in the subject :)
> 
> Thanks! Unfortunately I now got another warning for the same
> function, and though I would have expected the patch to fix it, that
> did not work:

I've come to expect better of you (i.e. testing your own patches) ;-)

Come to think of it, I'm thinking I should drop this patch and the
driver should just use iwe_stream_add_event() instead? It'll be
somewhat tricky to get the length correct though.

Alternatively, perhaps we should just uninline all the crap and then
the compiler can't bother us :)

johannes

[PATCH/RFC net] ravb: Remove Rx overflow log messages

From: Kazuya Mizuguchi 

Remove the Rx overflow log messages: in an environment where logging results
in network traffic, logging may cause further overflows.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Kazuya Mizuguchi 
[simon: reworked changelog]
Signed-off-by: Simon Horman 
---
 drivers/net/ethernet/renesas/ravb_main.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 92d7692c840d..5e5ad978eab9 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -926,14 +926,10 @@ static int ravb_poll(struct napi_struct *napi, int budget)
/* Receive error message handling */
priv->rx_over_errors =  priv->stats[RAVB_BE].rx_over_errors;
priv->rx_over_errors += priv->stats[RAVB_NC].rx_over_errors;
-   if (priv->rx_over_errors != ndev->stats.rx_over_errors) {
+   if (priv->rx_over_errors != ndev->stats.rx_over_errors)
ndev->stats.rx_over_errors = priv->rx_over_errors;
-   netif_err(priv, rx_err, ndev, "Receive Descriptor Empty\n");
-   }
-   if (priv->rx_fifo_errors != ndev->stats.rx_fifo_errors) {
+   if (priv->rx_fifo_errors != ndev->stats.rx_fifo_errors)
ndev->stats.rx_fifo_errors = priv->rx_fifo_errors;
-   netif_err(priv, rx_err, ndev, "Receive FIFO Overflow\n");
-   }
 out:
return budget - quota;
 }
-- 
2.7.0.rc3.207.g0ac5344

Re: [Patch net] atm: remove an unnecessary loop

2017-01-12 Thread Michal Hocko

On Wed 11-01-17 21:02:01, Cong Wang wrote:
> alloc_tx() is already inside a wait loop for a successful skb
> allocation, this loop inside alloc_tx() is quite unnecessary
> and actually problematic.

I am not familiar with this code at all but vcc_sendmsg seems to be one
of those cases where open coding __GFP_NOFAIL semantic makes sense as
there is an allocation fallback strategy implemented.

> Signed-off-by: Cong Wang 

I cannot give my reviewed-by because I am not familiar with the code but
this looks like an improvement to me.

> ---
>  net/atm/common.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/net/atm/common.c b/net/atm/common.c
> index a3ca922..7ec3bbc 100644
> --- a/net/atm/common.c
> +++ b/net/atm/common.c
> @@ -72,10 +72,11 @@ static struct sk_buff *alloc_tx(struct atm_vcc *vcc, unsigned int size)
>sk_wmem_alloc_get(sk), size, sk->sk_sndbuf);
>   return NULL;
>   }
> - while (!(skb = alloc_skb(size, GFP_KERNEL)))
> - schedule();
> - pr_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize);
> - atomic_add(skb->truesize, &sk->sk_wmem_alloc);
> + skb = alloc_skb(size, GFP_KERNEL);
> + if (skb) {
> + pr_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize);
> + atomic_add(skb->truesize, &sk->sk_wmem_alloc);
> + }
>   return skb;
>  }
>  
> -- 
> 2.5.5

-- 
Michal Hocko
SUSE Labs

Re: wl1251 & mac address & calibration data

2017-01-12 Thread Pavel Machek

Hi!

> >> But overwriting that one file is not possible, as the next update of the
> >> linux-firmware package will overwrite it back. It breaks any normal usage
> >> of package management.
> >> 
> >> Also it is ridiculously broken by design if some "boot" files need to
> >> be overwritten to initialize hardware properly. To not break booting you
> >> need to overwrite that file before first boot. But without booting the
> >> device you cannot read calibration data. So some hack with autoreboot
> >> after boot is needed.
> 
> Providing the calibration data via Device Tree is the proper way to
> solve this. Yes yes, I know N900 doesn't support it but that's a
> deficiency in N900, not Linux.

Linux has to work with whatever hardware provides. You may not like
N900 design, but we have to support it, anyway.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



Re: [PATCH] [v2] net: qcom/emac: grab a reference to the phydev on ACPI systems

2017-01-12 Thread Johan Hovold

On Wed, Jan 11, 2017 at 04:45:51PM -0600, Timur Tabi wrote:
> Commit 6ffe1c4cd0a7 ("net: qcom/emac: fix of_node and phydev leaks")
> fixed the problem with reference leaks on phydev, but the fix is
> device-tree specific.  When the driver unloads, the reference is
> dropped only on DT systems.
> 
> Instead, it's cleaner if we grab a reference on ACPI systems.
> When the driver unloads, we can drop the reference without having
> to check whether we're on a DT system.
> 
> Signed-off-by: Timur Tabi 

Reviewed-by: Johan Hovold

Re: [PATCH/RFC v2 net-next] ravb: unmap descriptors when freeing rings

On Fri, Jan 06, 2017 at 10:02:36PM +0300, Sergei Shtylyov wrote:
> Hello!
> 
> On 01/05/2017 01:43 PM, Simon Horman wrote:
> 
> >From: Kazuya Mizuguchi 
> >
> >"swiotlb buffer is full" errors occur after repeated initialisation of a
> >device - f.e. suspend/resume or ip link set up/down. This is because memory
> >mapped using dma_map_single() in ravb_ring_format() and ravb_start_xmit()
> >is not released.  Resolve this problem by unmapping descriptors when
> >freeing rings.
> >
> >Note, ravb_tx_free() is moved but not otherwise modified by this patch.
> >
> >Signed-off-by: Kazuya Mizuguchi 
> >[simon: reworked]
> >Signed-off-by: Simon Horman 
> >--
> >v1 [Kazuya Mizuguchi]
> >
> >v2 [Simon Horman]
> >* As suggested by Sergei Shtylyov
> >  - Use dma_mapping_error() and rx_desc->ds_cc when unmapping RX descriptors;
> >this is consistent with the way that they are mapped
> >  - Use ravb_tx_free() to clear TX descriptors
> 
>Not sure that was a good idea (sorry)... ravb_tx_free() only unmaps the
> transmitted buffers, while we need to unmap everything...
> 
> >* Reduce scope of new local variable
> >---
> > drivers/net/ethernet/renesas/ravb_main.c | 89 ++--
> > 1 file changed, 51 insertions(+), 38 deletions(-)
> >
> >diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
> >index 92d7692c840d..1797c48e3176 100644
> >--- a/drivers/net/ethernet/renesas/ravb_main.c
> >+++ b/drivers/net/ethernet/renesas/ravb_main.c
> >@@ -179,6 +179,44 @@ static struct mdiobb_ops bb_ops = {
> > .get_mdio_data = ravb_get_mdio_data,
> > };
> >
> >+/* Free TX skb function for AVB-IP */
> >+static int ravb_tx_free(struct net_device *ndev, int q)
> >+{
> >+struct ravb_private *priv = netdev_priv(ndev);
> >+struct net_device_stats *stats = &priv->stats[q];
> >+struct ravb_tx_desc *desc;
> >+int free_num = 0;
> >+int entry;
> >+u32 size;
> >+
> >+for (; priv->cur_tx[q] - priv->dirty_tx[q] > 0; priv->dirty_tx[q]++) {
> >+entry = priv->dirty_tx[q] % (priv->num_tx_ring[q] *
> >+ NUM_TX_DESC);
> >+desc = &priv->tx_ring[q][entry];
> >+if (desc->die_dt != DT_FEMPTY)
> 
>Here, it stops once an untransmitted buffer is encountered...

Yes, I see that now.

I wonder if we should:

a) parameterise ravb_tx_free() so it may either clear all transmitted buffers
   (current behaviour) or all buffers (new behaviour).
b) provide a different version of this loop in ravb_ring_free()

What are your thoughts?

> >+break;
> >+/* Descriptor type must be checked before all other reads */
> >+dma_rmb();
> >+size = le16_to_cpu(desc->ds_tagl) & TX_DS;
> >+/* Free the original skb. */
> >+if (priv->tx_skb[q][entry / NUM_TX_DESC]) {
> >+dma_unmap_single(ndev->dev.parent, le32_to_cpu(desc->dptr),
> >+ size, DMA_TO_DEVICE);
> >+/* Last packet descriptor? */
> >+if (entry % NUM_TX_DESC == NUM_TX_DESC - 1) {
> >+entry /= NUM_TX_DESC;
> >+dev_kfree_skb_any(priv->tx_skb[q][entry]);
> >+priv->tx_skb[q][entry] = NULL;
> >+stats->tx_packets++;
> >+}
> >+free_num++;
> >+}
> >+stats->tx_bytes += size;
> >+desc->die_dt = DT_EEMPTY;
> >+}
> >+return free_num;
> >+}
> >+
> > /* Free skb's and DMA buffers for Ethernet AVB */
> > static void ravb_ring_free(struct net_device *ndev, int q)
> > {
> >@@ -207,6 +245,18 @@ static void ravb_ring_free(struct net_device *ndev, int q)
> > priv->tx_align[q] = NULL;
> >
> > if (priv->rx_ring[q]) {
> >+for (i = 0; i < priv->num_rx_ring[q]; i++) {
> >+struct ravb_ex_rx_desc *rx_desc = &priv->rx_ring[q][i];
> >+
> >+if (!dma_mapping_error(ndev->dev.parent,
> >+   rx_desc->dptr)) {
> 
>   You forgot le32_to_cpu() here, we can't use the raw descriptor fields.

Thanks, I will fix that.

> >+dma_unmap_single(ndev->dev.parent,
> >+ le32_to_cpu(rx_desc->dptr),
> >+ PKT_BUF_SZ,
> >+ DMA_FROM_DEVICE);
> >+rx_desc->ds_cc = cpu_to_le16(0);
> 
>You don't check it anyway, not sure what that buys...

Thanks, I will see about dropping that.
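
Option (a) above could look roughly like the userspace model below. It is only
a sketch of the idea: struct tx_ring_model, the DESC_* states, the
free_txed_only parameter name and the fixed ring size are all assumptions
made for illustration, and the mapped[] flag stands in for a held DMA mapping.

```c
#include <stdbool.h>

#define RING_SIZE 8 /* hypothetical ring size for the model */

enum { DESC_EMPTY, DESC_PENDING, DESC_DONE }; /* hypothetical states */

struct tx_ring_model {
	int state[RING_SIZE];   /* per-descriptor state */
	bool mapped[RING_SIZE]; /* true while a DMA mapping would be held */
	unsigned int cur_tx;    /* next descriptor to be queued */
	unsigned int dirty_tx;  /* oldest descriptor not yet reclaimed */
};

/*
 * Reclaim descriptors from dirty_tx up to cur_tx. With free_txed_only set,
 * stop at the first descriptor the hardware has not finished (the current
 * behaviour); otherwise reclaim everything, as ring teardown needs.
 */
static int tx_free_model(struct tx_ring_model *r, bool free_txed_only)
{
	int free_num = 0;

	for (; r->cur_tx != r->dirty_tx; r->dirty_tx++) {
		unsigned int entry = r->dirty_tx % RING_SIZE;

		if (free_txed_only && r->state[entry] != DESC_DONE)
			break;
		if (r->mapped[entry]) {
			/* stands in for dma_unmap_single() */
			r->mapped[entry] = false;
			free_num++;
		}
		r->state[entry] = DESC_EMPTY;
	}
	return free_num;
}
```

The poll path would keep passing true, while the ring-free path would pass
false so that untransmitted buffers are unmapped as well.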

Re: [PATCH] wext: handle NULL exta data in iwe_stream_add_point better

2017-01-12 Thread Johannes Berg


> Come to think of it, I'm thinking I should drop this patch and the
> driver should just use iwe_stream_add_event() instead? It'll be
> somewhat tricky to get the length correct though.

No, turns out that's basically impossible with all the compat etc.
stuff here.

johannes

Re: [PATCH] wext: handle NULL exta data in iwe_stream_add_point better

2017-01-12 Thread Johannes Berg

On Wed, 2017-01-11 at 21:39 +0100, Arnd Bergmann wrote:
> On Wednesday, January 11, 2017 4:06:17 PM CET Johannes Berg wrote:
> > 
> > Applied. Also fixed the typo in the subject :)
> 
> Thanks! Unfortunately I now got another warning for the same
> function, and though I would have expected the patch to fix it, that
> did not work:
> 
> In file included from /git/arm-soc/drivers/net/wireless/intersil/prism54/islpci_dev.h:27:0,
>  from /git/arm-soc/drivers/net/wireless/intersil/prism54/isl_ioctl.h:24,
>  from /git/arm-soc/drivers/net/wireless/intersil/prism54/isl_ioctl.c:32:
> /git/arm-soc/drivers/net/wireless/intersil/prism54/isl_ioctl.c: In function 'prism54_get_scan':
> /git/arm-soc/include/net/iw_handler.h:560:4: error: argument 2 null where non-null expected [-Werror=nonnull]
> memcpy(stream + point_len, extra, iwe->u.data.length);

And I realized only now that this was a different place ...

I've just added the check you suggested - spent way too much time
already on this old crap :)

johannes

Re: [PATCH] wext: handle NULL exta data in iwe_stream_add_point better

2017-01-12 Thread Arnd Bergmann

On Thursday, January 12, 2017 10:16:00 AM CET Johannes Berg wrote:
> And I realized only now that this was a different place ...

Right, it was a few hundred randconfigs later after I had confirmed
that the first patch fixed all the configurations that were broken
at first.

> I've just added the check you suggested - spent way too much time
> already on this old crap 

Ok, thanks! Let's hope it doesn't come back once more.

I'm still trying to categorize the newly added warnings in gcc-7.
There are a number of very useful warnings that got added, but some of
them are rather noisy and find both a number of real bugs and
false positives. The NULL check had only a few findings that all
seemed worth fixing.

Arnd
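
The check itself is not shown in this thread, so the following is only a guess
at its general shape: guard the memcpy() from the warning so that a NULL extra
pointer is never passed to it. stream_add_point() is a hypothetical userspace
helper, not the real iwe_stream_add_point(), and its bounds handling is an
assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper modelled on the memcpy() in the warning above: append
 * len bytes of extra to stream, tolerating a NULL extra pointer.
 * Returns the new end of the stream, or stream unchanged if the data would
 * not fit before ends.
 */
static char *stream_add_point(char *stream, const char *ends,
			      const void *extra, size_t len)
{
	if ((size_t)(ends - stream) < len)
		return stream;
	if (len && extra)
		memcpy(stream, extra, len);
	return stream + len;
}
```

With the guard in place the compiler can no longer prove that a NULL second
argument reaches memcpy(), which is what -Werror=nonnull complained about.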

Re: [PATCH v2 2/2] stmmac: rename it to synopsys

2017-01-12 Thread Joao Pinto

Hi Florian,

On 1/11/2017 at 9:14 PM, Florian Fainelli wrote:
> On 01/10/2017 06:52 AM, Joao Pinto wrote:
>> This patch renames stmicro/stmmac to synopsys/ since it is a standard
>> ethernet software package regarding synopsys ethernet controllers, supporting
>> the majority of Synopsys Ethernet IPs. The config IDs remain the same, for
>> retro-compatibility, only the description was changed.
> 
> Do we really have to do this? ST Micro were the first to upstream
> support for a Synopsys IP, and it was later on identified as being
> "stmicro" instead of "synopsys" (during the big driver move under
> drivers/net/ethernet) whichever came first in the driver essentially "wins".
> 
> As mentioned before, although git is able to track renames, git log does
> not automatically have --follow, so it can be hard for people to track
> down the (new) history of the driver.
> 
> Personally, I don't see much value in doing this rename, especially when
> all the driver internal structures are still going to be named with
> stmmac (and please don't even think about doing a s/stmmac/snps/ inside
> the driver ;)).
> 
> My 2 cents.
> 

First of all, I am suggesting an alternative way of organizing the code, and
that's it; I have no ulterior motives :).

Please don't see this as a take-over or an attempt to erase STMicro from the
credits, please... it makes no sense. You can leave the STMicro license and all
the credits; that is fine by me and I insist on it. But let's name it something
that makes sense... let's call it dwc (DesignWare controllers), I am totally
open to suggestions.

I don't understand the hostility of some comments, honestly.

The easiest way is to keep things like they are today, and believe me I have a
lot of things to do, like adding support for multi-queues / multi-channels to
stmmac, so I am not suggesting this because it is fun.

I am suggesting this because it is what I am used to seeing in other subsystems.
USB has dwc2 and dwc3 folders that clearly identify them as DesignWare
(Synopsys) extensions to USB 2.0 and 3.0. The author of dwc3 was Texas
Instruments, and they did not name it ti/usb. For example, I use an AXS101
development board that does not have an STMicro SoC but has a DesignWare
Ethernet IP in it, so it uses stmicro/stmmac. For me it is confusing.

Let's not name it synopsys, that is totally fine for me, but naming it
stmicro/stmmac is not the right way because it makes it seem like a driver just
for STMicro products, which it is not; it is for products that use DesignWare
Ethernet IPs.

I am volunteering to do this work, let's discuss this.

Thanks,
Joao

[PATCH] synopsys: remove dwc_eth_qos driver

2017-01-12 Thread Joao Pinto

This driver is no longer necessary since it was merged into stmmac.

Acked-by: Lars Persson 
Signed-off-by: Joao Pinto 
---
 MAINTAINERS |7 -
 arch/arm/configs/multi_v7_defconfig |3 +-
 drivers/net/ethernet/Kconfig|1 -
 drivers/net/ethernet/Makefile   |1 -
 drivers/net/ethernet/synopsys/Kconfig   |   27 -
 drivers/net/ethernet/synopsys/Makefile  |5 -
 drivers/net/ethernet/synopsys/dwc_eth_qos.c | 2996 ---
 7 files changed, 2 insertions(+), 3038 deletions(-)
 delete mode 100644 drivers/net/ethernet/synopsys/Kconfig
 delete mode 100644 drivers/net/ethernet/synopsys/Makefile
 delete mode 100644 drivers/net/ethernet/synopsys/dwc_eth_qos.c

diff --git a/MAINTAINERS b/MAINTAINERS
index c8df0e1..acfb0a0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10865,13 +10865,6 @@ F: include/linux/dma/dw.h
 F: include/linux/platform_data/dma-dw.h
 F: drivers/dma/dw/
 
-SYNOPSYS DESIGNWARE ETHERNET QOS 4.10a driver
-M: Lars Persson 
-L: netdev@vger.kernel.org
-S: Supported
-F: Documentation/devicetree/bindings/net/snps,dwc-qos-ethernet.txt
-F: drivers/net/ethernet/synopsys/dwc_eth_qos.c
-
 SYNOPSYS DESIGNWARE I2C DRIVER
 M: Jarkko Nikula 
 R: Andy Shevchenko 
diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig
index b01a438..64f4419 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -253,7 +253,8 @@ CONFIG_R8169=y
 CONFIG_SH_ETH=y
 CONFIG_SMSC911X=y
 CONFIG_STMMAC_ETH=y
-CONFIG_SYNOPSYS_DWC_ETH_QOS=y
+CONFIG_STMMAC_PLATFORM=y
+CONFIG_DWMAC_DWC_QOS_ETH=y
 CONFIG_TI_CPSW=y
 CONFIG_XILINX_EMACLITE=y
 CONFIG_AT803X_PHY=y
diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
index e4c28fe..afc07d4 100644
--- a/drivers/net/ethernet/Kconfig
+++ b/drivers/net/ethernet/Kconfig
@@ -170,7 +170,6 @@ source "drivers/net/ethernet/sgi/Kconfig"
 source "drivers/net/ethernet/smsc/Kconfig"
 source "drivers/net/ethernet/stmicro/Kconfig"
 source "drivers/net/ethernet/sun/Kconfig"
-source "drivers/net/ethernet/synopsys/Kconfig"
 source "drivers/net/ethernet/tehuti/Kconfig"
 source "drivers/net/ethernet/ti/Kconfig"
 source "drivers/net/ethernet/tile/Kconfig"
diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
index 24330f4..e7861a8 100644
--- a/drivers/net/ethernet/Makefile
+++ b/drivers/net/ethernet/Makefile
@@ -81,7 +81,6 @@ obj-$(CONFIG_NET_VENDOR_SGI) += sgi/
 obj-$(CONFIG_NET_VENDOR_SMSC) += smsc/
 obj-$(CONFIG_NET_VENDOR_STMICRO) += stmicro/
 obj-$(CONFIG_NET_VENDOR_SUN) += sun/
-obj-$(CONFIG_NET_VENDOR_SYNOPSYS) += synopsys/
 obj-$(CONFIG_NET_VENDOR_TEHUTI) += tehuti/
 obj-$(CONFIG_NET_VENDOR_TI) += ti/
 obj-$(CONFIG_TILE_NET) += tile/
diff --git a/drivers/net/ethernet/synopsys/Kconfig b/drivers/net/ethernet/synopsys/Kconfig
deleted file mode 100644
index 8276ee5..000
--- a/drivers/net/ethernet/synopsys/Kconfig
+++ /dev/null
@@ -1,27 +0,0 @@
-#
-# Synopsys network device configuration
-#
-
-config NET_VENDOR_SYNOPSYS
-   bool "Synopsys devices"
-   default y
-   ---help---
- If you have a network (Ethernet) device belonging to this class, say Y.
-
- Note that the answer to this question doesn't directly affect the
- kernel: saying N will just cause the configurator to skip all
- the questions about Synopsys devices. If you say Y, you will be asked
- for your specific device in the following questions.
-
-if NET_VENDOR_SYNOPSYS
-
-config SYNOPSYS_DWC_ETH_QOS
-   tristate "Sypnopsys DWC Ethernet QOS v4.10a support"
-   select PHYLIB
-   select CRC32
-   select MII
-   depends on OF && HAS_DMA
-   ---help---
- This driver supports the DWC Ethernet QoS from Synopsys
-
-endif # NET_VENDOR_SYNOPSYS
diff --git a/drivers/net/ethernet/synopsys/Makefile b/drivers/net/ethernet/synopsys/Makefile
deleted file mode 100644
index 7a37572..000
--- a/drivers/net/ethernet/synopsys/Makefile
+++ /dev/null
@@ -1,5 +0,0 @@
-#
-# Makefile for the Synopsys network device drivers.
-#
-
-obj-$(CONFIG_SYNOPSYS_DWC_ETH_QOS) += dwc_eth_qos.o
diff --git a/drivers/net/ethernet/synopsys/dwc_eth_qos.c b/drivers/net/ethernet/synopsys/dwc_eth_qos.c
deleted file mode 100644
index 467dcc5..000
--- a/drivers/net/ethernet/synopsys/dwc_eth_qos.c
+++ /dev/null
@@ -1,2996 +0,0 @@
-/*  Synopsys DWC Ethernet Quality-of-Service v4.10a linux driver
- *
- *  This is a driver for the Synopsys DWC Ethernet QoS IP version 4.10a (GMAC).
- *  This version introduced a lot of changes which breaks backwards
- *  compatibility the non-QoS IP from Synopsys (used in the ST Micro drivers).
- *  Some fields differ between version 4.00a and 4.10a, mainly the interrupt
- *  bit fields. The driver could be made compatible with 4.00, if all relevant
- *  HW erratas are handled.
- *
- *  The GMAC is highly configurable at synthesis time. This driver

Re: [PATCH v2 2/2] stmmac: rename it to synopsys

2017-01-12 Thread Alexandre Torgue


Hi Joao

On 01/12/2017 10:43 AM, Joao Pinto wrote:


Hi Florian,

Às 9:14 PM de 1/11/2017, Florian Fainelli escreveu:

On 01/10/2017 06:52 AM, Joao Pinto wrote:

This patch renames stmicro/stmmac to synopsys/ since it is a standard
ethernet software package regarding synopsys ethernet controllers, supporting
the majority of Synopsys Ethernet IPs. The config IDs remain the same, for
retro-compatibility, only the description was changed.


Do we really have to do this? ST Micro were the first to upstream
support for a Synopsys IP, and it was later on identified as being
"stmicro" instead of "synopsys" (during the big driver move under
drivers/net/ethernet) whichever came first in the driver essentially "wins".

As mentioned before, although git is able to track renames, git log does
not automatically have --follow, so it can be hard for people to track
down the (new) history of the driver.

Personally, I don't see much value in doing this rename, especially when
all the driver internal structures are still going to be named with
stmmac (and please don't even think about doing a s/stmmac/snps/ inside
the driver ;)).

My 2 cents.



First of all, I am suggesting an alternative way of organizing the code, and
that's it, I have no ulterior motives about anything :).

Please don't see this as a take-over or as an attempt to erase STMicro from the
credits, please... it makes no sense. You can keep the STMicro license and all
the credits; that is fine by me and I insist on it. But let's name it something
that makes sense... let's call it dwc (DesignWare controllers). I am totally
open to suggestions.

I don't understand the hostility of some comments, honestly.

The easiest way is to keep things like they are today, and believe me I have a
lot of things to do, like adding the support of multi-queues / multi-channels to
stmmac, so I am not suggesting this because it is fun.

I am suggesting this because it is what I am used to seeing in other subsystems.
USB has dwc2 and dwc3 folders, which clearly identifies them as DesignWare
(Synopsys) extensions to USB 2.0 and 3.0. The author of dwc3 was Texas
Instruments, and they did not name it ti/usb. For example, I use an AXS101
development board that does not have an STMicro SoC but has a DesignWare
Ethernet IP in it, so it uses stmicro/stmmac. For me it is confusing.

Let's not name it synopsys; that is totally fine for me. But naming it
stmicro/stmmac is not the right way, because it makes it look like a driver just
for STMicro products, which it is not: it is for products that use DesignWare
Ethernet IPs.

I am volunteering to do this work, let's discuss this.


For me it makes no sense to rename only the folder (stmicro/stmmac to
synopsys) and keep stmmac* inside a synopsys folder (that is very
confusing). If you propose that, you have to change everything.


BUT doing that, we will lose all the stmmac driver history, and we don't want
that.






Thanks,
Joao

Re: [PATCH] [net] net/mlx5e: fix another -Wmaybe-uninitialized warning

2017-01-12 Thread Arnd Bergmann

On Thursday, January 12, 2017 10:30:24 AM CET Or Gerlitz wrote:
> On 1/11/2017 11:14 PM, Arnd Bergmann wrote:
> > @@ -666,14 +666,15 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv 
> > *priv,
> >   struct rtable *rt;
> >   struct neighbour *n = NULL;
> >   int ttl;
> > + int ret;
> > +
> > + if (!IS_ENABLED(CONFIG_INET))
> > + return -EOPNOTSUPP;
> >   
> > -#if IS_ENABLED(CONFIG_INET)
> >   rt = ip_route_output_key(dev_net(mirred_dev), fl4);
> > - if (IS_ERR(rt))
> > - return PTR_ERR(rt);
> > -#else
> > - return -EOPNOTSUPP;
> > -#endif
> > + ret = PTR_ERR_OR_ZERO(rt);
> > + if (ret)
> > + return ret;
> 
> but this means that if we got NULL from ip_route_output_key, we will 
> return success (0) here which is wrong.

I don't think so: if 'rt' is NULL or a valid pointer, then 'ret' is zero
and we will not return here.

Arnd
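
The point about PTR_ERR_OR_ZERO() can be checked outside the kernel. The block below is a minimal userspace sketch, assuming the usual error-pointer convention from include/linux/err.h (the top 4095 addresses encode a negative errno). The lowercase helper names are stand-ins written for this illustration, not the kernel macros themselves.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model of the kernel's error-pointer helpers.
 * Assumption: the top MAX_ERRNO addresses are reserved for error codes. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers that fall in the reserved error range. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Decode the errno value stored in an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Returns the encoded errno for an error pointer and 0 for anything
 * else, including NULL and ordinary valid pointers. */
static inline int ptr_err_or_zero(const void *ptr)
{
	return is_err(ptr) ? (int)ptr_err(ptr) : 0;
}
```

With these definitions, a NULL or valid pointer yields 0, so an `if (ret) return ret;` check falls through and the function carries on; only an encoded error pointer produces an early return with a negative errno.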

Re: net/atm: warning in alloc_tx/__might_sleep

2017-01-12 Thread Chas Williams

On Wed, 2017-01-11 at 20:36 -0800, Cong Wang wrote:
> On Wed, Jan 11, 2017 at 11:46 AM, Michal Hocko  wrote:
> > On Wed 11-01-17 20:45:25, Michal Hocko wrote:
> >> On Wed 11-01-17 09:37:06, Chas Williams wrote:
> >> > On Mon, 2017-01-09 at 18:20 +0100, Andrey Konovalov wrote:
> >> > > Hi!
> >> > >
> >> > > I've got the following error report while running the syzkaller fuzzer.
> >> > >
> >> > > On commit a121103c922847ba5010819a3f250f1f7fc84ab8 (4.10-rc3).
> >> > >
> >> > > A reproducer is attached.
> >> > >
> >> > > [ cut here ]
> >> > > WARNING: CPU: 0 PID: 4114 at kernel/sched/core.c:7737 
> >> > > __might_sleep+0x149/0x1a0
> >> > > do not call blocking ops when !TASK_RUNNING; state=1 set at
> >> > > [] prepare_to_wait+0x182/0x530
> >> > > Modules linked in:
> >> > > CPU: 0 PID: 4114 Comm: a.out Not tainted 4.10.0-rc3+ #59
> >> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 
> >> > > 01/01/2011
> >> > > Call Trace:
> >> > >  __dump_stack lib/dump_stack.c:15
> >> > >  dump_stack+0x292/0x398 lib/dump_stack.c:51
> >> > >  __warn+0x19f/0x1e0 kernel/panic.c:547
> >> > >  warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:562
> >> > >  __might_sleep+0x149/0x1a0 kernel/sched/core.c:7732
> >> > >  slab_pre_alloc_hook mm/slab.h:408
> >> > >  slab_alloc_node mm/slub.c:2634
> >> > >  kmem_cache_alloc_node+0x14a/0x280 mm/slub.c:2744
> >> > >  __alloc_skb+0x10f/0x800 net/core/skbuff.c:219
> >> > >  alloc_skb ./include/linux/skbuff.h:926
> >> > >  alloc_tx net/atm/common.c:75
> >> >
> >> > This is likely alloc_skb(..., GFP_KERNEL) in alloc_tx().  The simplest
> >> > fix for this would be simply to switch this GFP_ATOMIC.  See if this is
> >> > any better.
> >> >
> >> > diff --git a/net/atm/common.c b/net/atm/common.c
> >> > index a3ca922..d84220c 100644
> >> > --- a/net/atm/common.c
> >> > +++ b/net/atm/common.c
> >> > @@ -72,7 +72,7 @@ static struct sk_buff *alloc_tx(struct atm_vcc *vcc, 
> >> > unsigned int size)
> >> >  sk_wmem_alloc_get(sk), size, sk->sk_sndbuf);
> >> > return NULL;
> >> > }
> >> > -   while (!(skb = alloc_skb(size, GFP_KERNEL)))
> >> > +   while (!(skb = alloc_skb(size, GFP_ATOMIC)))
> >> > schedule();
> >> > pr_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize);
> >> > atomic_add(skb->truesize, &sk->sk_wmem_alloc);
> >>
> >> Blee, this code is just horrendous. But the "fix" is obviously broken!
> >> schedule() is just a noop if you do not change the task state and what
> >> you are just asking for is a never failing non sleeping allocation - aka
> >> a busy loop in the kernel!
> >
> > And btw. this while loop should be really turned into GFP_KERNEL |
> > __GFP_NOFAIL with an explanation why this allocation cannot possibly
> > fail.
> 
> I think a nested loop is quite unnecessary, probably because the code itself
> is pretty old. alloc_tx() is in the outer loop, alloc_skb() is in the inner
> loop, and both seem to wait for a successful GFP allocation. The inner one
> is even more unnecessary.
> 
> Of course, I would not be surprised if MM already has a mechanism that does
> similar logic.
> 
> There may be some reason ATM needs such logic, although other protocols
> handle skb allocation failure quite well in ->sendmsg().


I can't think of any particular reason that it needs this loop here.  I suspect
that the loop for alloc_tx() predates the wait logic in ->sendmsg() and that the
original looping was in alloc_tx() initially and was simply never removed.
Changes here would date back to before the git conversion.

Re: [v5,1/5] soc: qcom: smem_state: Fix include for ERR_PTR()

Bjorn Andersson  wrote:
> The correct include file for getting errno constants and ERR_PTR() is
> linux/err.h, rather than linux/errno.h, so fix the include.
> 
> Fixes: e8b123e60084 ("soc: qcom: smem_state: Add stubs for disabled 
> smem_state")
> Acked-by: Andy Gross 
> Signed-off-by: Bjorn Andersson 

5 patches applied to ath-next branch of ath.git, thanks.

6c0b2e833f14 soc: qcom: smem_state: Fix include for ERR_PTR()
f303a9311065 wcn36xx: Transition driver to SMD client
886039036c20 wcn36xx: Implement firmware assisted scan
43efa3c0f241 wcn36xx: Implement print_reg indication
d53628882255 wcn36xx: Don't use the destroyed hal_mutex

-- 
https://patchwork.kernel.org/patch/9429045/

Documentation about submitting wireless patches and checking status
from patchwork:

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

[PATCH net-next v2 1/2] bpf: pass original insn directly to convert_ctx_access

Currently, when calling convert_ctx_access() callback for the various
program types, we pass in insn->dst_reg, insn->src_reg, insn->off from
the original instruction. This information is needed to rewrite the
instruction that is based on the user ctx structure into a kernel
representation for the ctx. As we'd like to allow access size beyond
just BPF_W, we'd need also insn->code for that in order to decode the
original access size. Given that, lets just pass insn directly to the
convert_ctx_access() callback and work on that to not clutter the
callback with even more arguments we need to pass when everything is
already contained in insn. So lets go through that once, no functional
change.

Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
---
 include/linux/bpf.h  |   7 ++-
 kernel/bpf/verifier.c|   3 +-
 kernel/trace/bpf_trace.c |  15 ++---
 net/core/filter.c| 139 +--
 4 files changed, 87 insertions(+), 77 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 94ea8d2..f8c3560 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -161,9 +161,10 @@ struct bpf_verifier_ops {
enum bpf_reg_type *reg_type);
int (*gen_prologue)(struct bpf_insn *insn, bool direct_write,
const struct bpf_prog *prog);
-   u32 (*convert_ctx_access)(enum bpf_access_type type, int dst_reg,
- int src_reg, int ctx_off,
- struct bpf_insn *insn, struct bpf_prog *prog);
+   u32 (*convert_ctx_access)(enum bpf_access_type type,
+ const struct bpf_insn *src,
+ struct bpf_insn *dst,
+ struct bpf_prog *prog);
 };
 
 struct bpf_prog_type_list {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2efdc91..df7e472 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3177,8 +3177,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
if (env->insn_aux_data[i].ptr_type != PTR_TO_CTX)
continue;
 
-   cnt = ops->convert_ctx_access(type, insn->dst_reg, insn->src_reg,
- insn->off, insn_buf, env->prog);
+   cnt = ops->convert_ctx_access(type, insn, insn_buf, env->prog);
if (cnt == 0 || cnt >= ARRAY_SIZE(insn_buf)) {
verbose("bpf verifier is misconfigured\n");
return -EINVAL;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index f883c43..1860e7f 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -572,28 +572,29 @@ static bool pe_prog_is_valid_access(int off, int size, enum bpf_access_type type
return true;
 }
 
-static u32 pe_prog_convert_ctx_access(enum bpf_access_type type, int dst_reg,
- int src_reg, int ctx_off,
+static u32 pe_prog_convert_ctx_access(enum bpf_access_type type,
+ const struct bpf_insn *si,
  struct bpf_insn *insn_buf,
  struct bpf_prog *prog)
 {
struct bpf_insn *insn = insn_buf;
 
-   switch (ctx_off) {
+   switch (si->off) {
case offsetof(struct bpf_perf_event_data, sample_period):
BUILD_BUG_ON(FIELD_SIZEOF(struct perf_sample_data, period) != sizeof(u64));
 
*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_perf_event_data_kern,
-  data), dst_reg, src_reg,
+  data), si->dst_reg, si->src_reg,
  offsetof(struct bpf_perf_event_data_kern, data));
-   *insn++ = BPF_LDX_MEM(BPF_DW, dst_reg, dst_reg,
+   *insn++ = BPF_LDX_MEM(BPF_DW, si->dst_reg, si->dst_reg,
  offsetof(struct perf_sample_data, period));
break;
default:
*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_perf_event_data_kern,
-  regs), dst_reg, src_reg,
+  regs), si->dst_reg, si->src_reg,
  offsetof(struct bpf_perf_event_data_kern, regs));
-   *insn++ = BPF_LDX_MEM(BPF_SIZEOF(long), dst_reg, dst_reg, ctx_off);
+   *insn++ = BPF_LDX_MEM(BPF_SIZEOF(long), si->dst_reg, si->dst_reg,
+ si->off);
break;
}
 
diff --git a/net/core/filter.c b/net/core/filter.c
index f4d16a9..8cfbdef 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2972,32 +2972,33 @@ void bpf_warn_invalid_xdp_action(u32 act)
 }
 EXPORT_SYMBOL_GPL(bpf_warn_invalid

[PATCH net-next v2 2/2] bpf: allow b/h/w/dw access for bpf's cb in ctx

When structs are used to store temporary state in cb[] buffer that is
used with programs and among tail calls, then the generated code will
not always access the buffer in bpf_w chunks. We can ease programming
of it and let this act more natural by allowing for aligned b/h/w/dw
sized access for cb[] ctx member. Various test cases are attached as
well for the selftest suite. Potentially, this can also be reused for
other program types to pass data around.

Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
---
 kernel/bpf/verifier.c   |   8 +-
 net/core/filter.c   |  41 ++-
 tools/testing/selftests/bpf/test_verifier.c | 442 +++-
 3 files changed, 478 insertions(+), 13 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index df7e472..d60e12c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3165,10 +3165,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
insn = env->prog->insnsi + delta;
 
for (i = 0; i < insn_cnt; i++, insn++) {
-   if (insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
+   if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) ||
+   insn->code == (BPF_LDX | BPF_MEM | BPF_H) ||
+   insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
insn->code == (BPF_LDX | BPF_MEM | BPF_DW))
type = BPF_READ;
-   else if (insn->code == (BPF_STX | BPF_MEM | BPF_W) ||
+   else if (insn->code == (BPF_STX | BPF_MEM | BPF_B) ||
+insn->code == (BPF_STX | BPF_MEM | BPF_H) ||
+insn->code == (BPF_STX | BPF_MEM | BPF_W) ||
 insn->code == (BPF_STX | BPF_MEM | BPF_DW))
type = BPF_WRITE;
else
diff --git a/net/core/filter.c b/net/core/filter.c
index 8cfbdef..9038386 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2776,11 +2776,33 @@ static bool __is_valid_access(int off, int size)
 {
if (off < 0 || off >= sizeof(struct __sk_buff))
return false;
+
/* The verifier guarantees that size > 0. */
if (off % size != 0)
return false;
-   if (size != sizeof(__u32))
-   return false;
+
+   switch (off) {
+   case offsetof(struct __sk_buff, cb[0]) ...
+offsetof(struct __sk_buff, cb[4]) + sizeof(__u32) - 1:
+   if (size == sizeof(__u16) &&
+   off > offsetof(struct __sk_buff, cb[4]) + sizeof(__u16))
+   return false;
+   if (size == sizeof(__u32) &&
+   off > offsetof(struct __sk_buff, cb[4]))
+   return false;
+   if (size == sizeof(__u64) &&
+   off > offsetof(struct __sk_buff, cb[2]))
+   return false;
+   if (size != sizeof(__u8)  &&
+   size != sizeof(__u16) &&
+   size != sizeof(__u32) &&
+   size != sizeof(__u64))
+   return false;
+   break;
+   default:
+   if (size != sizeof(__u32))
+   return false;
+   }
 
return true;
 }
@@ -2799,7 +2821,7 @@ static bool sk_filter_is_valid_access(int off, int size,
if (type == BPF_WRITE) {
switch (off) {
case offsetof(struct __sk_buff, cb[0]) ...
-offsetof(struct __sk_buff, cb[4]):
+offsetof(struct __sk_buff, cb[4]) + sizeof(__u32) - 1:
break;
default:
return false;
@@ -2823,7 +2845,7 @@ static bool lwt_is_valid_access(int off, int size,
case offsetof(struct __sk_buff, mark):
case offsetof(struct __sk_buff, priority):
case offsetof(struct __sk_buff, cb[0]) ...
-offsetof(struct __sk_buff, cb[4]):
+offsetof(struct __sk_buff, cb[4]) + sizeof(__u32) - 1:
break;
default:
return false;
@@ -2915,7 +2937,7 @@ static bool tc_cls_act_is_valid_access(int off, int size,
case offsetof(struct __sk_buff, tc_index):
case offsetof(struct __sk_buff, priority):
case offsetof(struct __sk_buff, cb[0]) ...
-offsetof(struct __sk_buff, cb[4]):
+offsetof(struct __sk_buff, cb[4]) + sizeof(__u32) - 1:
case offsetof(struct __sk_buff, tc_classid):
break;
default:
@@ -3066,8 +3088,11 @@ static u32 sk_filter_convert_ctx_access(enum bpf_access_type type,
  si->dst_reg, si->src_reg, insn);
 
case offsetof(struct __sk_buff, cb[0]) ...
-offsetof(struct __sk_buff, cb[4]):
+offseto

[PATCH net-next v2 0/2] More flexible BPF cb access

This patch improves BPF's cb access by allowing b/h/w/dw
access variants on it. For details, please see individual
patches.

Thanks!

v1 -> v2:
 - Fix typo in test case description spotted by Quentin
 - Rest as-is

Daniel Borkmann (2):
  bpf: pass original insn directly to convert_ctx_access
  bpf: allow b/h/w/dw access for bpf's cb in ctx

 include/linux/bpf.h |   7 +-
 kernel/bpf/verifier.c   |  11 +-
 kernel/trace/bpf_trace.c|  15 +-
 net/core/filter.c   | 176 ++-
 tools/testing/selftests/bpf/test_verifier.c | 442 +++-
 5 files changed, 563 insertions(+), 88 deletions(-)

-- 
1.9.3
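
For readers who want to try the cb[] bounds rules from patch 2/2 outside the kernel, here is a rough userspace sketch. It assumes a simplified stand-in for struct __sk_buff (the struct ctx_model layout, the CB_OFF() macro and the function name are invented for this illustration) and uses plain comparisons instead of the GNU case-range syntax from the patch. It also rejects size <= 0 itself, whereas the kernel code relies on the verifier for that.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct __sk_buff. Only the layout around cb[]
 * matters here; other[] and hash are illustrative placeholders. */
struct ctx_model {
	uint32_t other[12];
	uint32_t cb[5];
	uint32_t hash;
};

/* Byte offset of cb[i] inside the model context. */
#define CB_OFF(i) ((int)(offsetof(struct ctx_model, cb) + (i) * sizeof(uint32_t)))

static bool is_valid_access(int off, int size)
{
	if (off < 0 || off >= (int)sizeof(struct ctx_model))
		return false;
	/* Accesses must be naturally aligned for their size. */
	if (size <= 0 || off % size != 0)
		return false;

	if (off >= CB_OFF(0) && off <= CB_OFF(4) + (int)sizeof(uint32_t) - 1) {
		/* Inside cb[]: 1-, 2-, 4- and 8-byte accesses are accepted
		 * as long as they do not run past the end of cb[4]. */
		if (size == 2 && off > CB_OFF(4) + 2)
			return false;
		if (size == 4 && off > CB_OFF(4))
			return false;
		if (size == 8 && off > CB_OFF(2))
			return false;
		return size == 1 || size == 2 || size == 4 || size == 8;
	}

	/* Everywhere else only 32-bit access is accepted. */
	return size == 4;
}
```

In this model a double-word load at cb[0] or cb[2] is accepted, while one at cb[4] is rejected because it would read past the end of the array, and one at cb[1] is rejected for misalignment.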

Re: net: wireless: ath: wil6210: constify cfg80211_ops structures

Bhumika Goyal  wrote:
> cfg80211_ops structures are only passed as an argument to the function
> wiphy_new. This argument is of type const, so cfg80211_ops structures
> having this property can be declared as const.
> Done using Coccinelle
> 
> @r1 disable optional_qualifier @
> identifier i;
> position p;
> @@
> static struct cfg80211_ops i@p = {...};
> 
> @ok1@
> identifier r1.i;
> position p;
> @@
> wiphy_new(&i@p,...)
> 
> @bad@
> position p!={r1.p,ok1.p};
> identifier r1.i;
> @@
> i@p
> 
> @depends on !bad disable optional_qualifier@
> identifier r1.i;
> @@
> +const
> struct cfg80211_ops i;
> 
> File size before:
>    text    data     bss     dec     hex filename
>   18133    6632       0   24765    60bd wireless/ath/wil6210/cfg80211.o
> 
> File size after:
>    text    data     bss     dec     hex filename
>   18933    5832       0   24765    60bd wireless/ath/wil6210/cfg80211.o
> 
> Signed-off-by: Bhumika Goyal 

Patch applied to ath-next branch of ath.git, thanks.

b59eb96181e7 wil6210: constify cfg80211_ops structures

-- 
https://patchwork.kernel.org/patch/9479127/

Documentation about submitting wireless patches and checking status
from patchwork:

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

Re: [1/2] ath9k: ar9002_mac: kill off ACCESS_ONCE()

Mark Rutland  wrote:
> For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
> preference to ACCESS_ONCE(), and new code is expected to use one of the
> former. So far, there's been no reason to change most existing uses of
> ACCESS_ONCE(), as these aren't currently harmful.
> 
> However, for some new features (e.g. KTSAN / Kernel Thread Sanitizer),
> it is necessary to instrument reads and writes separately, which is not
> possible with ACCESS_ONCE(). This distinction is critical to correct
> operation.
> 
> It's possible to transform the bulk of kernel code using the Coccinelle
> script below. However, for some files (including the ath9k ar9002 mac
> driver), this mangles the formatting. As a preparatory step, this patch
> converts the driver to use {READ,WRITE}_ONCE() without said mangling.
> 
> 
> virtual patch
> 
> @ depends on patch @
> expression E1, E2;
> @@
> 
> - ACCESS_ONCE(E1) = E2
> + WRITE_ONCE(E1, E2)
> 
> @ depends on patch @
> expression E;
> @@
> 
> - ACCESS_ONCE(E)
> + READ_ONCE(E)
> 
> 
> Signed-off-by: Mark Rutland 
> Cc: ath9k-de...@qca.qualcomm.com
> Cc: Kalle Valo 
> Cc: linux-wirel...@vger.kernel.org
> Cc: ath9k-de...@lists.ath9k.org
> Cc: netdev@vger.kernel.org

2 patches applied to ath-next branch of ath.git, thanks.

d5a3a76a9cb8 ath9k: ar9002_mac: kill off ACCESS_ONCE()
50f3818196f5 ath9k: ar9003_mac: kill off ACCESS_ONCE()

-- 
https://patchwork.kernel.org/patch/9489799/

Documentation about submitting wireless patches and checking status
from patchwork:

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
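
As a rough illustration of why separate read and write macros matter for instrumentation, the userspace sketch below defines simplified stand-ins for READ_ONCE() and WRITE_ONCE() that bump independent counters. The counters, struct desc_model and the two helper functions are hypothetical; the kernel's real macros in include/linux/compiler.h handle more cases than this. The sketch relies on the GNU __typeof__ extension.

```c
/* Hypothetical instrumentation counters: one per access direction. */
static unsigned long read_count;
static unsigned long write_count;

/* Simplified userspace stand-ins. Each forces a single volatile access
 * and records whether it was a load or a store. */
#define READ_ONCE(x) \
	(read_count++, *(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) \
	(write_count++, *(volatile __typeof__(x) *)&(x) = (val))

/* Illustrative descriptor with one status word. */
struct desc_model {
	unsigned int status;
};

static unsigned int poll_status(struct desc_model *d)
{
	/* was: ACCESS_ONCE(d->status) */
	return READ_ONCE(d->status);
}

static void set_status(struct desc_model *d, unsigned int v)
{
	/* was: ACCESS_ONCE(d->status) = v */
	WRITE_ONCE(d->status, v);
}
```

A single ACCESS_ONCE() macro expands to the same lvalue for loads and stores, so a tool could not tell the two apart at the macro level; with the split, each direction has its own hook.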

Re: ath9k: fix spelling mistake: "meaurement" -> "measurement"

Colin Ian King  wrote:
> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in ath_err message
> 
> Signed-off-by: Colin Ian King 

Patch applied to ath-next branch of ath.git, thanks.

714ee339ff90 ath9k: fix spelling mistake: "meaurement" -> "measurement"

-- 
https://patchwork.kernel.org/patch/9492191/

Documentation about submitting wireless patches and checking status
from patchwork:

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

Re: [PATCH v2 2/2] stmmac: rename it to synopsys

2017-01-12 Thread Joao Pinto

Hi Alex, good morning!

Às 10:11 AM de 1/12/2017, Alexandre Torgue escreveu:
>>
>> Let's not name it synopsys; that is totally fine for me. But naming it
>> stmicro/stmmac is not the right way, because it makes it look like a driver
>> just for STMicro products, which it is not: it is for products that use
>> DesignWare Ethernet IPs.
>>
>> I am volunteering to do this work, let's discuss this.
> 
> For me it makes no sense to rename only the folder (stmicro/stmmac to
> synopsys) and keep stmmac* inside a synopsys folder (that is very confusing).
> If you propose that, you have to change everything.
> 
> BUT doing that, we will lose all the stmmac driver history, and we don't want that.

I totally understand your point. Would you agree with this approach: rename
"stmicro" to "dwc" (DesignWare controllers) and leave stmmac as it is today?
This small change is enough from my point of view and solves the problems you
refer to. We would have net/ethernet/dwc/stmmac/.

I can also rename the dwmac4 files and functions to eqos, since soon we will
have a new eqos version.

dwmac4.h -> eqos.h
dwmac4_core.c -> eqos_core.c
dwmac4_descs.c -> eqos_descs.c
dwmac4_descs.h -> eqos_descs.h
dwmac4_dma.c -> eqos_dma.c
dwmac4_dma.h -> eqos_dma.h
dwmac4_lib.c -> eqos_lib.c

What do you think about this approach?

Thanks,
Joao

> 
> 
> 
>>
>> Thanks,
>> Joao
>>
>>

Re: [PATCH net-next 2/2] bpf: allow b/h/w/dw access for bpf's cb in ctx


On 01/12/2017 09:25 AM, Quentin Monnet wrote:

2017-01-12 (02:21 +0100) ~ Daniel Borkmann 

[...]

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 9bb4534..f664bed 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -859,15 +859,451 @@ struct test_val {


[...]


+   {
+   "check cb access: doulbe, oob 5",
+   .insns = {
+   BPF_MOV64_IMM(BPF_REG_0, 0),
+   BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1,
+   offsetof(struct __sk_buff, cb[4]) + 8),
+   BPF_EXIT_INSN(),
+   },
+   .errstr = "invalid bpf_context access",
+   .result = REJECT,
+   },


Nitpicking: typo ("doulbe").


Thanks for spotting, I've sent out a v2.

Re: [PATCH net-next] secure_seq: fix sparse errors

2017-01-12 Thread Jason A. Donenfeld

Nice catch, thanks.

Reviewed-by: Jason A. Donenfeld

Re: [PATCH 9/9] treewide: Inline ib_dma_map_*() functions

2017-01-12 Thread Sagi Grimberg


Reviewed-by: Sagi Grimberg

Re: [PATCH/RFC net] ravb: Remove Rx overflow log messages

2017-01-12 Thread Sergei Shtylyov


On 01/12/2017 11:41 AM, Simon Horman wrote:


From: Kazuya Mizuguchi 

Remove Rx overflow log messages: in an environment where logging results
in network traffic, logging may cause further overflows.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Kazuya Mizuguchi 
[simon: reworked changelog]
Signed-off-by: Simon Horman 


Acked-by: Sergei Shtylyov 

MBR, Sergei

Re: [PATCH/RFC v2 net-next] ravb: unmap descriptors when freeing rings

2017-01-12 Thread Sergei Shtylyov


On 01/12/2017 12:11 PM, Simon Horman wrote:


From: Kazuya Mizuguchi 

"swiotlb buffer is full" errors occur after repeated initialisation of a
device - f.e. suspend/resume or ip link set up/down. This is because memory
mapped using dma_map_single() in ravb_ring_format() and ravb_start_xmit()
is not released.  Resolve this problem by unmapping descriptors when
freeing rings.

Note, ravb_tx_free() is moved but not otherwise modified by this patch.

Signed-off-by: Kazuya Mizuguchi 
[simon: reworked]
Signed-off-by: Simon Horman 
--
v1 [Kazuya Mizuguchi]

v2 [Simon Horman]
* As suggested by Sergei Shtylyov
 - Use dma_mapping_error() and rx_desc->ds_cc when unmapping RX descriptors;
   this is consistent with the way that they are mapped
 - Use ravb_tx_free() to clear TX descriptors


   Not sure that was a good idea (sorry)... ravb_tx_free() only unmaps the
transmitted buffers, while we need to unmap everything...


* Reduce scope of new local variable
---
drivers/net/ethernet/renesas/ravb_main.c | 89 ++--
1 file changed, 51 insertions(+), 38 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 92d7692c840d..1797c48e3176 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -179,6 +179,44 @@ static struct mdiobb_ops bb_ops = {
.get_mdio_data = ravb_get_mdio_data,
};

+/* Free TX skb function for AVB-IP */
+static int ravb_tx_free(struct net_device *ndev, int q)
+{
+   struct ravb_private *priv = netdev_priv(ndev);
+   struct net_device_stats *stats = &priv->stats[q];
+   struct ravb_tx_desc *desc;
+   int free_num = 0;
+   int entry;
+   u32 size;
+
+   for (; priv->cur_tx[q] - priv->dirty_tx[q] > 0; priv->dirty_tx[q]++) {
+   entry = priv->dirty_tx[q] % (priv->num_tx_ring[q] *
+NUM_TX_DESC);
+   desc = &priv->tx_ring[q][entry];
+   if (desc->die_dt != DT_FEMPTY)


   Here, it stops once an untransmitted buffer is encountered...


Yes, I see that now.

I wonder if we should:

a) parameterise ravb_tx_free() so it may either clear all transmitted buffers
   (current behaviour) or all buffers (new behaviour).
b) provide a different version of this loop in ravb_ring_free()

What are your thoughts?


   I'm voting for (b).

[...]

MBR, Sergei
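
To make the difference between the two options concrete, here is a small userspace model of the ring. It is only a sketch: struct ring_model, the mapped flag, the descriptor type values and both function names are invented for this illustration and are not the ravb driver's data structures. The first function mirrors the reclaim loop quoted above and stops at the first descriptor the hardware has not completed; the second corresponds to option (b) and releases every mapping regardless of state.

```c
#include <stdint.h>

#define RING_SIZE 8

/* Hypothetical descriptor states: FEMPTY means the hardware is done. */
enum { DT_FEMPTY = 0, DT_FSINGLE = 1 };

struct tx_desc_model {
	int die_dt;  /* descriptor type / completion state */
	int mapped;  /* 1 while a DMA mapping is outstanding */
};

struct ring_model {
	struct tx_desc_model desc[RING_SIZE];
	uint32_t cur_tx;   /* next slot to queue */
	uint32_t dirty_tx; /* oldest slot not yet reclaimed */
};

/* Reclaim completed descriptors only; stops at the first pending one. */
static int tx_free_completed(struct ring_model *r)
{
	int freed = 0;

	for (; r->cur_tx - r->dirty_tx > 0; r->dirty_tx++) {
		struct tx_desc_model *d = &r->desc[r->dirty_tx % RING_SIZE];

		if (d->die_dt != DT_FEMPTY)
			break;
		if (d->mapped) {
			d->mapped = 0; /* stands in for dma_unmap_single() */
			freed++;
		}
	}
	return freed;
}

/* Teardown variant: release every outstanding mapping in the ring. */
static int ring_free_all(struct ring_model *r)
{
	int freed = 0;

	for (unsigned int i = 0; i < RING_SIZE; i++) {
		if (r->desc[i].mapped) {
			r->desc[i].mapped = 0;
			freed++;
		}
		r->desc[i].die_dt = DT_FEMPTY;
	}
	r->dirty_tx = r->cur_tx;
	return freed;
}
```

With four queued buffers of which two are complete, the first function releases two mappings and leaves two behind, which is the leak being discussed; the teardown variant then releases the remaining two.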

[PATCH net] ravb: Remove Rx overflow log messages

From: Kazuya Mizuguchi 

Remove Rx overflow log messages: in an environment where logging results
in network traffic, logging may cause further overflows.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Kazuya Mizuguchi 
[simon: reworked changelog]
Signed-off-by: Simon Horman 
Acked-by: Sergei Shtylyov 
---
Changes since RFC:
* Added Ack from Sergei
---
 drivers/net/ethernet/renesas/ravb_main.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 92d7692c840d..5e5ad978eab9 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -926,14 +926,10 @@ static int ravb_poll(struct napi_struct *napi, int budget)
/* Receive error message handling */
priv->rx_over_errors =  priv->stats[RAVB_BE].rx_over_errors;
priv->rx_over_errors += priv->stats[RAVB_NC].rx_over_errors;
-   if (priv->rx_over_errors != ndev->stats.rx_over_errors) {
+   if (priv->rx_over_errors != ndev->stats.rx_over_errors)
ndev->stats.rx_over_errors = priv->rx_over_errors;
-   netif_err(priv, rx_err, ndev, "Receive Descriptor Empty\n");
-   }
-   if (priv->rx_fifo_errors != ndev->stats.rx_fifo_errors) {
+   if (priv->rx_fifo_errors != ndev->stats.rx_fifo_errors)
ndev->stats.rx_fifo_errors = priv->rx_fifo_errors;
-   netif_err(priv, rx_err, ndev, "Receive FIFO Overflow\n");
-   }
 out:
return budget - quota;
 }
-- 
2.7.0.rc3.207.g0ac5344

Re: [PATCHv3 2/6] sh_eth: add generic wake-on-lan support via magic packet

2017-01-12 Thread Sergei Shtylyov


Hello!

On 01/09/2017 06:34 PM, Niklas Söderlund wrote:


Add generic functionality to support Wake-on-LAN using MagicPacket, which
is supported by at least a few versions of sh_eth. Only add
functionality for WoL; no specific sh_eth versions are marked to support
WoL yet.

WoL is enabled in the suspend callback by setting MagicPacket detection
and disabling all interrupts except MagicPacket. In the resume path the
driver needs to reset the hardware to rearm the WoL logic. This prevents
the driver from simply restoring the registers and taking advantage of
the fact that sh_eth was not suspended to reduce resume time. To reset the
hardware the driver closes and reopens the device just like it would do
in a normal suspend/resume scenario without WoL enabled, but it both
closes and opens the device in the resume callback since the device
needs to be open for WoL to work.

One quirk needed for WoL is that the module clock needs to be prevented
from being switched off by Runtime PM. To keep the clock alive the
suspend callback needs to call clk_enable() directly to increase the
usage count of the clock. Then when Runtime PM decreases the clock usage
count it won't reach 0 and be switched off.

Signed-off-by: Niklas Söderlund 
---
 drivers/net/ethernet/renesas/sh_eth.c | 114 +++---
 drivers/net/ethernet/renesas/sh_eth.h |   3 +
 2 files changed, 109 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 8a784dce45fa..542c92b57b35 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -1552,6 +1552,8 @@ static void sh_eth_emac_interrupt(struct net_device *ndev)
sh_eth_rcv_snd_enable(ndev);
}
}
+   if (felic_stat & ECSR_MPD)
+   pm_wakeup_event(&mdp->pdev->dev, 0);


   Hum, seeing a corner case: if we're ignoring the link interrupt (and it
does occur along with ECSR.MPD), we'll return and miss this check. It would
have been preferable to add this code above the ECSR.LCHNG handler...


[...]

@@ -3150,15 +3189,67 @@ static int sh_eth_drv_remove(struct platform_device *pdev)

 #ifdef CONFIG_PM
 #ifdef CONFIG_PM_SLEEP
+static int sh_eth_wol_setup(struct net_device *ndev)
+{
+   struct sh_eth_private *mdp = netdev_priv(ndev);
+
+   /* Only allow ECI interrupts */
+   synchronize_irq(ndev->irq);
+   napi_disable(&mdp->napi);
+   sh_eth_write(ndev, DMAC_M_ECI, EESIPR);
+
+   /* Enable MagicPacket */
+   sh_eth_modify(ndev, ECMR, 0, ECMR_MPDE);


   I'd prefer sh_eth_modify(ndev, ECMR, ECMR_MPDE, ECMR_MPDE) to be 
consistent with my other code...
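
   To make the comparison of the two call forms concrete, here is a minimal
userspace sketch. It assumes sh_eth_modify() is a plain read-modify-write of
the form (reg & ~clear) | set; the reg_modify() helper and the bit value
chosen for the MagicPacket enable bit are hypothetical stand-ins, not the
driver's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the MagicPacket-detect enable bit. */
#define ECMR_MPDE_SKETCH 0x00000200u

/* Assumed read-modify-write semantics: clear bits first, then set bits. */
static uint32_t reg_modify(uint32_t reg, uint32_t clear, uint32_t set)
{
	return (reg & ~clear) | set;
}
```

   Under that assumption, passing 0 or the bit itself as the clear mask
yields the same register value, so the choice between the two forms is one
of consistency rather than behaviour.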


[...]

MBR, Sergei

To netlink or not to netlink, that is the question

2017-01-12 Thread Jason A. Donenfeld

Hey folks,

A few months ago I switched away from using netlink in wireguard,
preferring instead to use ioctl. I had come up against limitations in
rtnetlink, and ioctl presented a straightforward, hard-to-screw-up
alternative. The very simple API is documented here:
https://git.zx2c4.com/WireGuard/tree/src/uapi.h

This works well, and I'm reluctant to change it, but as I do more
complicated things, and as kernel submission time looms nearer, I'm
kept up at night by the notion that maybe I ought to give netlink
another chance. But how?

For each wireguard interface, there are three types of structures for
userspace to configure. There is one wgdevice for each interface. Each
wgdevice has a variable number (up to 2^16) of wgpeers. Each wgpeer
has a variable number (up to 2^16) of wgipmasks. I'd like an interface
to get and set all of these at once, atomically.

Presently, with the ioctl, I just have a simple get ioctl and a simple
set ioctl. The set one passes a user space pointer, which is read
incrementally in kernel space. The get one will first return how much
userspace should allocate, and then when called again will write
incrementally into a provided userspace buffer up to a passed-in
maximum number of bytes. Very basic, I'm quite happy.
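
The two-call get pattern described above can be sketched in userspace as
follows. This is only an illustration under stated assumptions: fake_get()
is a hypothetical stand-in for the get ioctl, and fake_state is made-up
data, not the layout from uapi.h.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for kernel-side state; not WireGuard's layout. */
static const char fake_state[] = "wgdevice|wgpeer|wgipmask|wgipmask";

/*
 * Hypothetical stand-in for the get ioctl: with buf == NULL it reports how
 * many bytes are needed; otherwise it writes at most max bytes into buf
 * and returns the number of bytes written.
 */
static long fake_get(void *buf, size_t max)
{
	size_t need = sizeof(fake_state);

	if (!buf)
		return (long)need;
	if (max > need)
		max = need;
	memcpy(buf, fake_state, max);
	return (long)max;
}

/* First call sizes the buffer, second call fills it. Caller frees. */
static char *fetch_all(size_t *len)
{
	long need = fake_get(NULL, 0);
	long got;
	char *buf;

	if (need <= 0)
		return NULL;
	buf = malloc((size_t)need);
	if (!buf)
		return NULL;
	got = fake_get(buf, (size_t)need);
	if (got < 0) {
		free(buf);
		return NULL;
	}
	*len = (size_t)got;
	return buf;
}
```

In a real caller the state could grow between the two calls, so a retry
loop around the size query would probably be needed; the sketch leaves that
out.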

When I had tried to do this previously with netlink, I did it by defining
changelink and fill_info in rtnl_link_ops. For changelink, I iterated
through the netlink objects, and for fill_info, I filled in the skb
with netlink objects. This was a bit more complex but basically
worked. Except netlink skbs have a maximum size and are buffered,
which means things broke entirely when trying to read or write lots of
wgpeers or lots of wgipmasks. So, the meager interfaces afforded to me
by rtnl_link_ops are insufficient. Doing anything beyond this, either
by registering new rtnetlink messages, or by using generic netlink,
seemed overwhelmingly complex and undesirable.

So I'm wondering -- is there a good way to be doing this with netlink?
Or am I right to stay with ioctl?

Thanks,
Jason

Re: [PATCH] can: Fix kernel panic at security_sock_rcv_skb

2017-01-12 Thread Eric Dumazet

On Thu, 2017-01-12 at 09:22 +0100, Oliver Hartkopp wrote:
> 
> On 01/12/2017 07:33 AM, Liu ShuoX wrote:
> > From: Zhang Yanmin 
> >
> > The patch is for fix the below kernel panic:
> > BUG: unable to handle kernel NULL pointer dereference at (null)
> > IP: [] selinux_socket_sock_rcv_skb+0x65/0x2a0
> >
> > Call Trace:
> >  
> >  [] security_sock_rcv_skb+0x4c/0x60
> >  [] sk_filter+0x41/0x210
> >  [] sock_queue_rcv_skb+0x53/0x3a0
> >  [] raw_rcv+0x2a3/0x3c0
> >  [] can_rcv_filter+0x12b/0x370
> >  [] can_receive+0xd9/0x120
> >  [] can_rcv+0xab/0x100
> >  [] __netif_receive_skb_core+0xd8c/0x11f0
> >  [] __netif_receive_skb+0x24/0xb0
> >  [] process_backlog+0x127/0x280
> >  [] net_rx_action+0x33b/0x4f0
> >  [] __do_softirq+0x184/0x440
> >  [] do_softirq_own_stack+0x1c/0x30
> >  
> >  [] do_softirq.part.18+0x3b/0x40
> >  [] do_softirq+0x1d/0x20
> >  [] netif_rx_ni+0xe5/0x110
> >  [] slcan_receive_buf+0x507/0x520
> >  [] flush_to_ldisc+0x21c/0x230
> >  [] process_one_work+0x24f/0x670
> >  [] worker_thread+0x9d/0x6f0
> >  [] ? rescuer_thread+0x480/0x480
> >  [] kthread+0x12c/0x150
> >  [] ret_from_fork+0x3f/0x70
> >
> > The sk dereferenced in panic has been released. After the rcu_call in
> > can_rx_unregister, receiver was protected by RCU but inner data was
> > not, then later sk will be freed while other CPU is still using it.
> > We need wait here to make sure sk referenced via receiver was safe.
> >
> > => security_sk_free
> > => sk_destruct
> > => __sk_free
> > => sk_free
> > => raw_release
> > => sock_release
> > => sock_close
> > => __fput
> > => fput
> > => task_work_run
> > => exit_to_usermode_loop
> > => syscall_return_slowpath
> > => int_ret_from_sys_call
> >
> > Signed-off-by: Zhang Yanmin 
> > Signed-off-by: He, Bo 
> > Signed-off-by: Liu Shuo A 
> > ---
> >  net/can/af_can.c | 14 --
> >  net/can/af_can.h |  1 -
> >  2 files changed, 8 insertions(+), 7 deletions(-)
> >
> > diff --git a/net/can/af_can.c b/net/can/af_can.c
> > index 1108079..fcbe971 100644
> > --- a/net/can/af_can.c
> > +++ b/net/can/af_can.c
> > @@ -517,10 +517,8 @@ int can_rx_register(struct net_device *dev, canid_t 
> > can_id, canid_t mask,
> >  /*
> >   * can_rx_delete_receiver - rcu callback for single receiver entry removal
> >   */
> > -static void can_rx_delete_receiver(struct rcu_head *rp)
> > +static void can_rx_delete_receiver(struct receiver *r)
> >  {
> > -   struct receiver *r = container_of(rp, struct receiver, rcu);
> > -
> > kmem_cache_free(rcv_cache, r);
> >  }
> >
> > @@ -595,9 +593,13 @@ void can_rx_unregister(struct net_device *dev, canid_t 
> > can_id, canid_t mask,
> >   out:
> > spin_unlock(&can_rcvlists_lock);
> >
> > -   /* schedule the receiver item for deletion */
> > -   if (r)
> > -   call_rcu(&r->rcu, can_rx_delete_receiver);
> > +   /* synchronize_rcu to wait until a grace period has elapsed, to make
> > +* sure all receiver's sk dereferenced by others.
> > +*/
> > +   if (r) {
> > +   synchronize_rcu();
> > +   can_rx_delete_receiver(r);
> 
> Nitpick: When can_rx_delete_receiver() just contains 
> kmem_cache_free(rcv_cache, r), then the function definition should be 
> removed.
> 
> But my main concern is:
> 
> The reason why can_rx_delete_receiver() was introduced was the need to 
> remove a huge number of receivers with can_rx_unregister().
> 
> When you call synchronize_rcu() after each receiver removal this would 
> potentially lead to a big performance issue when e.g. closing CAN_RAW 
> sockets with a high number of receivers.
> 
> So the idea was to remove/unlink the receiver hlist_del_rcu(&r->list) 
> and also kmem_cache_free(rcv_cache, r) by some rcu mechanism - so that 
> all elements are cleaned up by rcu at a later point.
> 
> Is it possible that the problems emerge due to hlist_del_rcu(&r->list) 
> and you accidently fix it with your introduced synchronize_rcu()?

I agree this patch does not fix the root cause.

The main problem seems to be that the sockets themselves are not RCU
protected.

If CAN uses RCU for delivery, then sockets should be freed only after
one RCU grace period.

On recent kernels, the following patch could help:

diff --git a/net/can/af_can.c b/net/can/af_can.c
index 1108079d934f8383a599d7997b08100fca0465e9..353beaefee7ea3631eb429b011604906b964465e 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -189,6 +189,7 @@ static int can_create(struct net *net, struct socket *sock, int protocol,
 
sock_init_data(sock, sk);
sk->sk_destruct = can_sock_destruct;
+   sock_set_flag(sk, SOCK_RCU_FREE);
 
if (sk->sk_prot->init)
err = sk->sk_prot->init(sk);


For older kernels, the following could be used:

 net/can/af_can.c |   13 ++---
 net/can/af_can.h |3 ++-
 net/can/bcm.c|4 ++--
 net/can/gw.c |2 +-
 net/can/raw.c|4 ++--
 5 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/net/can/af_can.c b/net/can/af_can.c
index 
1108079d934f8383a599d

Re: [PATCH net-next] net/mlx5e: Support bpf_xdp_adjust_head()

2017-01-12 Thread Saeed Mahameed

On Thu, Jan 12, 2017 at 4:09 AM, Martin KaFai Lau  wrote:
> This patch adds bpf_xdp_adjust_head() support to mlx5e.

Hi Martin, thanks for the patch!

You can find some comments below.

>
> 1. rx_headroom is added to struct mlx5e_rq.  It uses
>an existing 4 byte hole in the struct.
> 2. The adjusted data length is checked against
>MLX5E_XDP_MIN_INLINE and MLX5E_SW2HW_MTU(rq->netdev->mtu).
> 3. The macro MLX5E_SW2HW_MTU is moved from en_main.c to en.h.
>MLX5E_HW2SW_MTU is also moved to en.h for symmetric reason
>but it is not a must.
>
> Signed-off-by: Martin KaFai Lau 
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en.h  |  4 ++
>  drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 18 +++
>  drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   | 63 
> ++-
>  3 files changed, 51 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
> b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> index a473cea10c16..0d9dd860a295 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> @@ -51,6 +51,9 @@
>
>  #define MLX5_SET_CFG(p, f, v) MLX5_SET(create_flow_group_in, p, f, v)
>
> +#define MLX5E_HW2SW_MTU(hwmtu) ((hwmtu) - (ETH_HLEN + VLAN_HLEN + 
> ETH_FCS_LEN))
> +#define MLX5E_SW2HW_MTU(swmtu) ((swmtu) + (ETH_HLEN + VLAN_HLEN + 
> ETH_FCS_LEN))
> +
>  #define MLX5E_MAX_NUM_TC   8
>
>  #define MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE0x6
> @@ -369,6 +372,7 @@ struct mlx5e_rq {
>
> unsigned long  state;
> intix;
> +   u16rx_headroom;
>
> struct mlx5e_rx_am am; /* Adaptive Moderation */
> struct bpf_prog   *xdp_prog;
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index f74ba73c55c7..aba3691e0919 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -343,9 +343,6 @@ static void mlx5e_disable_async_events(struct mlx5e_priv 
> *priv)
> synchronize_irq(mlx5_get_msix_vec(priv->mdev, MLX5_EQ_VEC_ASYNC));
>  }
>
> -#define MLX5E_HW2SW_MTU(hwmtu) (hwmtu - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
> -#define MLX5E_SW2HW_MTU(swmtu) (swmtu + (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
> -
>  static inline int mlx5e_get_wqe_mtt_sz(void)
>  {
> /* UMR copies MTTs in units of MLX5_UMR_MTT_ALIGNMENT bytes.
> @@ -534,9 +531,13 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
> goto err_rq_wq_destroy;
> }
>
> -   rq->buff.map_dir = DMA_FROM_DEVICE;
> -   if (rq->xdp_prog)
> +   if (rq->xdp_prog) {
> rq->buff.map_dir = DMA_BIDIRECTIONAL;
> +   rq->rx_headroom = XDP_PACKET_HEADROOM;
> +   } else {
> +   rq->buff.map_dir = DMA_FROM_DEVICE;
> +   rq->rx_headroom = MLX5_RX_HEADROOM;
> +   }
>
> switch (priv->params.rq_wq_type) {
> case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
> @@ -586,7 +587,7 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
> byte_count = rq->buff.wqe_sz;
>
> /* calc the required page order */
> -   frag_sz = MLX5_RX_HEADROOM +
> +   frag_sz = rq->rx_headroom +
>   byte_count /* packet data */ +
>   SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> frag_sz = SKB_DATA_ALIGN(frag_sz);
> @@ -3153,11 +3154,6 @@ static int mlx5e_xdp_set(struct net_device *netdev, 
> struct bpf_prog *prog)
> bool reset, was_opened;
> int i;
>
> -   if (prog && prog->xdp_adjust_head) {
> -   netdev_err(netdev, "Does not support 
> bpf_xdp_adjust_head()\n");
> -   return -EOPNOTSUPP;
> -   }
> -
> mutex_lock(&priv->state_lock);
>
> if ((netdev->features & NETIF_F_LRO) && prog) {
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c 
> b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index 0e2fb3ed1790..914e00132e08 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -264,7 +264,7 @@ int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, struct 
> mlx5e_rx_wqe *wqe, u16 ix)
> if (unlikely(mlx5e_page_alloc_mapped(rq, di)))
> return -ENOMEM;
>
> -   wqe->data.addr = cpu_to_be64(di->addr + MLX5_RX_HEADROOM);
> +   wqe->data.addr = cpu_to_be64(di->addr + rq->rx_headroom);
> return 0;
>  }
>
> @@ -646,8 +646,7 @@ static inline void mlx5e_xmit_xdp_doorbell(struct 
> mlx5e_sq *sq)
>
>  static inline void mlx5e_xmit_xdp_frame(struct mlx5e_rq *rq,
> struct mlx5e_dma_info *di,
> -   unsigned int data_offset,
> -   int len)
> +

[PATCH net-next] tools: psock_lib: harden socket filter used by psock tests

2017-01-12 Thread Sowmini Varadhan

The filter added by sock_setfilter is intended to only permit
packets matching the pattern set up by create_payload(), but
we only check the ip_len, and a single test-character in
the IP packet to ensure this condition.

Harden the filter by adding additional constraints so that we only
permit UDP/IPv4 packets that meet the ip_len and test-character
requirements. Include the bpf_asm src as a comment, in case this
needs to be enhanced in the future.
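
For readers who do not read BPF opcodes, the conditions of the hardened
filter can be written as a plain C predicate. This is a rough sketch, not
part of the patch: filter_accepts() and the SKETCH_* names are hypothetical,
the constants are the ones given in the bpf_asm comment of the patch, and
the length check is done first so that the byte loads stay in bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the bpf_asm comment in the patch. */
#define SKETCH_DATA_LEN    100
#define SKETCH_DATA_CHAR   'a'
#define SKETCH_DATA_CHAR_1 'b'

/* Returns true when the frame would be passed by the filter. */
static bool filter_accepts(const uint8_t *pkt, size_t len)
{
	uint16_t ethertype;

	/* ld len; jlt #100, drop */
	if (len < SKETCH_DATA_LEN)
		return false;
	/* ldh [12]; jne #0x800, drop */
	ethertype = (uint16_t)((pkt[12] << 8) | pkt[13]);
	if (ethertype != 0x0800)
		return false;
	/* ldb [23]; jneq #17, drop */
	if (pkt[23] != 17)
		return false;
	/* ldb [80]; jeq #97, pass; jne #98, drop */
	return pkt[80] == SKETCH_DATA_CHAR || pkt[80] == SKETCH_DATA_CHAR_1;
}
```

The offsets 12, 23 and 80 are absolute frame offsets, so the sketch, like
the filter, assumes an untagged Ethernet frame.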

Signed-off-by: Sowmini Varadhan 
---
 tools/testing/selftests/net/psock_lib.h |   39 +-
 1 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/net/psock_lib.h b/tools/testing/selftests/net/psock_lib.h
index 24bc7ec..a77da88 100644
--- a/tools/testing/selftests/net/psock_lib.h
+++ b/tools/testing/selftests/net/psock_lib.h
@@ -40,14 +40,39 @@
 
 static __maybe_unused void sock_setfilter(int fd, int lvl, int optnum)
 {
+   /* the filter below checks for all of the following conditions that
+* are based on the contents of create_payload()
+*  ether type 0x800 and
+*  ip proto udp and
+*  skb->len == DATA_LEN and
+*  udp[38] == 'a' or udp[38] == 'b'
+* It can be generated from the following bpf_asm input:
+*  ldh [12]
+*  jne #0x800, drop; ETH_P_IP
+*  ldb [23]
+*  jneq #17, drop  ; IPPROTO_UDP
+*  ld len  ; ld skb->len
+*  jlt #100, drop  ; DATA_LEN
+*  ldb [80]
+*  jeq #97, pass   ; DATA_CHAR
+*  jne #98, drop   ; DATA_CHAR_1
+*  pass:
+*ret #-1
+*  drop:
+*ret #0
+*/
struct sock_filter bpf_filter[] = {
-   { 0x80, 0, 0, 0x },  /* LD  pktlen*/
-   { 0x35, 0, 4, DATA_LEN   },  /* JGE DATA_LEN  [f goto nomatch]*/
-   { 0x30, 0, 0, 0x0050 },  /* LD  ip[80]*/
-   { 0x15, 1, 0, DATA_CHAR  },  /* JEQ DATA_CHAR   [t goto match]*/
-   { 0x15, 0, 1, DATA_CHAR_1},  /* JEQ DATA_CHAR_1 [t goto match]*/
-   { 0x06, 0, 0, 0x0060 },  /* RET match */
-   { 0x06, 0, 0, 0x },  /* RET no match  */
+   { 0x28,  0,  0, 0x000c },
+   { 0x15,  0,  8, 0x0800 },
+   { 0x30,  0,  0, 0x0017 },
+   { 0x15,  0,  6, 0x0011 },
+   { 0x80,  0,  0, 00 },
+   { 0x35,  0,  4, 0x0064 },
+   { 0x30,  0,  0, 0x0050 },
+   { 0x15,  1,  0, 0x0061 },
+   { 0x15,  0,  1, 0x0062 },
+   { 0x06,  0,  0, 0x },
+   { 0x06,  0,  0, 00 },
};
struct sock_fprog bpf_prog;
 
-- 
1.7.1

Re: [PATCH 8/9] IB: Convert ib_dma_*_coherent() argument type from u64 into dma_addr_t

2017-01-12 Thread Leon Romanovsky

On Tue, Jan 10, 2017 at 04:56:47PM -0800, Bart Van Assche wrote:
> This patch does not change any functionality.
>
> Signed-off-by: Bart Van Assche 
> Cc: David S. Miller 
> Cc: linux-r...@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Cc: rds-de...@oss.oracle.com
> ---
>  include/rdma/ib_verbs.h | 11 +++
>  net/rds/ib.h|  6 +++---
>  2 files changed, 6 insertions(+), 11 deletions(-)
>

Thanks,
Reviewed-by: Leon Romanovsky 



Re: [PATCH/RFC v2 net-next] ravb: unmap descriptors when freeing rings

On Thu, Jan 12, 2017 at 03:03:05PM +0300, Sergei Shtylyov wrote:
> On 01/12/2017 12:11 PM, Simon Horman wrote:

...

> >>   Here, it stop once an untransmitted buffer is encountered...
> >
> >Yes, I see that now.
> >
> >I wonder if we should:
> >
> >a) paramatise ravb_tx_free() so it may either clear all transmitted buffers
> >   (current behaviour) or all buffers (new behaviour).
> >b) provide a different version of this loop in ravb_ring_free()
> >
> >What are your thoughts?
> 
>I'm voting for (b).

Ok, something like this?

@@ -215,6 +225,30 @@ static void ravb_ring_free(struct net_device *ndev, int q)
}
 
if (priv->tx_ring[q]) {
+   for (; priv->cur_tx[q] - priv->dirty_tx[q] > 0; priv->dirty_tx[q]++) {
+   struct ravb_tx_desc *desc;
+   int entry;
+
+   entry = priv->dirty_tx[q] % (priv->num_tx_ring[q] *
+NUM_TX_DESC);
+   desc = &priv->tx_ring[q][entry];
+
+   /* Free the original skb. */
+   if (priv->tx_skb[q][entry / NUM_TX_DESC]) {
+   u32 size = le16_to_cpu(desc->ds_tagl) & TX_DS;
+
+   dma_unmap_single(ndev->dev.parent,
+le32_to_cpu(desc->dptr),
+size, DMA_TO_DEVICE);
+   /* Last packet descriptor? */
+   if (entry % NUM_TX_DESC == NUM_TX_DESC - 1) {
+   entry /= NUM_TX_DESC;
+   
dev_kfree_skb_any(priv->tx_skb[q][entry]);
+   priv->tx_skb[q][entry] = NULL;
+   }
+   }
+   }
+
ring_size = sizeof(struct ravb_tx_desc) *
(priv->num_tx_ring[q] * NUM_TX_DESC + 1);
dma_free_coherent(ndev->dev.parent, ring_size, priv->tx_ring[q],

Re: [PATCH v3 net-next 4/4] syncookies: use SipHash in place of SHA1

2017-01-12 Thread Eric Dumazet

On Sun, 2017-01-08 at 13:54 +0100, Jason A. Donenfeld wrote:
> SHA1 is slower and less secure than SipHash, and so replacing syncookie
> generation with SipHash makes natural sense. Some BSDs have been doing
> this for several years in fact.
> 
> The speedup should be similar -- and even more impressive -- to the
> speedup from the sequence number fix in this series.

I confirm a nice speedup under SYNFLOOD.

sha_transform() used to consume ~12 % of cpu cycles, while the
siphash_2u64() only uses ~1.9 %

Depending on the setup, the gain is about 9 %

 4.48%  [kernel]  [k] ipt_do_table   
 4.39%  [kernel]  [k] fib_table_lookup   
 3.90%  [kernel]  [k] __netif_receive_skb_core   
 3.76%  [kernel]  [k] fib_rules_lookup   
 3.15%  [kernel]  [k] __inet_lookup_established  
 3.11%  [kernel]  [k] tcp_conn_request   
 2.51%  [kernel]  [k] tcp_v4_rcv 
 2.42%  [kernel]  [k] tcp_make_synack
 2.22%  [kernel]  [k] nf_iterate 
 2.16%  [kernel]  [k] ip_rcv 
 1.92%  [kernel]  [k] siphash_2u64   
 1.76%  [kernel]  [k] __ip_route_output_key  
 1.73%  [kernel]  [k] mlx4_en_process_rx_cq  
 1.68%  [kernel]  [k] memcpy_erms
 1.59%  [kernel]  [k] __alloc_skb
 1.49%  [kernel]  [k] __dev_queue_xmit   
 1.48%  [kernel]  [k] kmem_cache_alloc   
 1.38%  [kernel]  [k] __local_bh_enable_ip   
 1.36%  [kernel]  [k] kmem_cache_free
 1.21%  [kernel]  [k] ___cache_free  
 1.09%  [kernel]  [k] __build_skb
 1.07%  [kernel]  [k] inet_reqsk_alloc   
 1.04%  [kernel]  [k] kfree  
 1.04%  [kernel]  [k] ip_build_and_send_pkt  
 1.04%  [kernel]  [k] inet_gro_receive   
 1.01%  [kernel]  [k] fib_validate_source
 0.98%  [kernel]  [k] tcp_openreq_init_rwin  
 0.98%  [kernel]  [k] inet_csk_route_req 
 0.97%  [kernel]  [k] fib_get_table  
 0.96%  [kernel]  [k] ip_finish_output2  
 0.94%  [kernel]  [k] tcp_v4_do_rcv  
 0.91%  [kernel]  [k] ip_local_deliver_finish
 0.91%  [kernel]  [k] netif_skb_features 
 0.91%  [kernel]  [k] dev_hard_start_xmit

Re: [PATCH 9/9] treewide: Inline ib_dma_map_*() functions

2017-01-12 Thread Leon Romanovsky

On Tue, Jan 10, 2017 at 04:56:48PM -0800, Bart Van Assche wrote:
> Almost all changes in this patch except the removal of local variables
> that became superfluous and the actual removal of the ib_dma_map_*()
> functions have been generated as follows:
>
> git grep -lE 'ib_(sg_|)dma_' |
>   xargs -d\\n \
> sed -i -e 
> 's/\([^[:alnum:]_]\)ib_dma_\([^(]*\)(\&\([^,]\+\),/\1dma_\2(\3.dma_device,/g' 
> \
>-e 
> 's/\([^[:alnum:]_]\)ib_dma_\([^(]*\)(\([^,]\+\),/\1dma_\2(\3->dma_device,/g' \
>  -e 's/ib_sg_dma_\(len\|address\)(\([^,]\+\), /sg_dma_\1(/g'
>
> Signed-off-by: Bart Van Assche 
> Reviewed-by: Christoph Hellwig 
> Cc: Andreas Dilger 
> Cc: Anna Schumaker 
> Cc: David S. Miller 
> Cc: Eric Van Hensbergen 
> Cc: James Simmons 
> Cc: Latchesar Ionkov 
> Cc: Oleg Drokin 
> Cc: Ron Minnich 
> Cc: Trond Myklebust 
> Cc: de...@driverdev.osuosl.org
> Cc: linux-...@vger.kernel.org
> Cc: linux-n...@lists.infradead.org
> Cc: linux-r...@vger.kernel.org
> Cc: lustre-de...@lists.lustre.org
> Cc: netdev@vger.kernel.org
> Cc: rds-de...@oss.oracle.com
> Cc: target-de...@vger.kernel.org
> Cc: v9fs-develo...@lists.sourceforge.net
> ---
>  drivers/infiniband/core/mad.c  |  28 +--
>  drivers/infiniband/core/rw.c   |  30 ++-
>  drivers/infiniband/core/umem.c |   4 +-
>  drivers/infiniband/core/umem_odp.c |   6 +-
>  drivers/infiniband/hw/mlx4/cq.c|   2 +-
>  drivers/infiniband/hw/mlx4/mad.c   |  28 +--
>  drivers/infiniband/hw/mlx4/mr.c|   4 +-
>  drivers/infiniband/hw/mlx4/qp.c|  10 +-
>  drivers/infiniband/hw/mlx5/mr.c|   4 +-

For mlx5 and mlx4 parts.
Acked-by: Leon Romanovsky 

Thanks



[PATCH net] mld: do not remove mld souce list info when set link down

2017-01-12 Thread Hangbin Liu

This is an IPv6 version of commit 24803f38a5c0 ("igmp: do not remove igmp
souce list..."). In mld_del_delrec(), we now restore all source filter
info instead of flushing it.

Move mld_clear_delrec() from ipv6_mc_down() to ipv6_mc_destroy_dev() since
we should not remove source list info when the link is set down. Remove
igmp6_group_dropped() from ipv6_mc_destroy_dev() since it has already been
called in ipv6_mc_down().

Also clear all source info after igmp6_group_dropped() instead of inside it,
because ipv6_mc_down() will call igmp6_group_dropped().

Signed-off-by: Hangbin Liu 
---
 net/ipv6/mcast.c | 51 ++-
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 14a3903..7139fff 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -81,7 +81,7 @@ static void mld_gq_timer_expire(unsigned long data);
 static void mld_ifc_timer_expire(unsigned long data);
 static void mld_ifc_event(struct inet6_dev *idev);
 static void mld_add_delrec(struct inet6_dev *idev, struct ifmcaddr6 *pmc);
-static void mld_del_delrec(struct inet6_dev *idev, const struct in6_addr *addr);
+static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *pmc);
 static void mld_clear_delrec(struct inet6_dev *idev);
 static bool mld_in_v1_mode(const struct inet6_dev *idev);
 static int sf_setstate(struct ifmcaddr6 *pmc);
@@ -692,9 +692,9 @@ static void igmp6_group_dropped(struct ifmcaddr6 *mc)
dev_mc_del(dev, buf);
}
 
-   if (mc->mca_flags & MAF_NOREPORT)
-   goto done;
spin_unlock_bh(&mc->mca_lock);
+   if (mc->mca_flags & MAF_NOREPORT)
+   return;
 
if (!mc->idev->dead)
igmp6_leave_group(mc);
@@ -702,8 +702,6 @@ static void igmp6_group_dropped(struct ifmcaddr6 *mc)
spin_lock_bh(&mc->mca_lock);
if (del_timer(&mc->mca_timer))
atomic_dec(&mc->mca_refcnt);
-done:
-   ip6_mc_clear_src(mc);
spin_unlock_bh(&mc->mca_lock);
 }
 
@@ -748,10 +746,11 @@ static void mld_add_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
spin_unlock_bh(&idev->mc_lock);
 }
 
-static void mld_del_delrec(struct inet6_dev *idev, const struct in6_addr *pmca)
+static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 {
struct ifmcaddr6 *pmc, *pmc_prev;
-   struct ip6_sf_list *psf, *psf_next;
+   struct ip6_sf_list *psf;
+   struct in6_addr *pmca = &im->mca_addr;
 
spin_lock_bh(&idev->mc_lock);
pmc_prev = NULL;
@@ -768,14 +767,20 @@ static void mld_del_delrec(struct inet6_dev *idev, const struct in6_addr *pmca)
}
spin_unlock_bh(&idev->mc_lock);
 
+   spin_lock_bh(&im->mca_lock);
if (pmc) {
-   for (psf = pmc->mca_tomb; psf; psf = psf_next) {
-   psf_next = psf->sf_next;
-   kfree(psf);
+   im->idev = pmc->idev;
+   im->mca_crcount = idev->mc_qrv;
+   im->mca_sfmode = pmc->mca_sfmode;
+   if (pmc->mca_sfmode == MCAST_INCLUDE) {
+   im->mca_tomb = pmc->mca_tomb;
+   im->mca_sources = pmc->mca_sources;
+   for (psf = im->mca_sources; psf; psf = psf->sf_next)
+   psf->sf_crcount = im->mca_crcount;
}
in6_dev_put(pmc->idev);
-   kfree(pmc);
}
+   spin_unlock_bh(&im->mca_lock);
 }
 
 static void mld_clear_delrec(struct inet6_dev *idev)
@@ -904,7 +909,7 @@ int ipv6_dev_mc_inc(struct net_device *dev, const struct in6_addr *addr)
mca_get(mc);
write_unlock_bh(&idev->lock);
 
-   mld_del_delrec(idev, &mc->mca_addr);
+   mld_del_delrec(idev, mc);
igmp6_group_added(mc);
ma_put(mc);
return 0;
@@ -927,6 +932,7 @@ int __ipv6_dev_mc_dec(struct inet6_dev *idev, const struct in6_addr *addr)
write_unlock_bh(&idev->lock);
 
igmp6_group_dropped(ma);
+   ip6_mc_clear_src(ma);
 
ma_put(ma);
return 0;
@@ -2501,15 +2507,17 @@ void ipv6_mc_down(struct inet6_dev *idev)
/* Withdraw multicast list */
 
read_lock_bh(&idev->lock);
-   mld_ifc_stop_timer(idev);
-   mld_gq_stop_timer(idev);
-   mld_dad_stop_timer(idev);
 
for (i = idev->mc_list; i; i = i->next)
igmp6_group_dropped(i);
-   read_unlock_bh(&idev->lock);
 
-   mld_clear_delrec(idev);
+   /* Should stop timer after group drop. or we will
+* start timer again in mld_ifc_event()
+*/
+   mld_ifc_stop_timer(idev);
+   mld_gq_stop_timer(idev);
+   mld_dad_stop_timer(idev);
+   read_unlock_bh(&idev->lock);
 }
 
 static void ipv6_mc_reset(struct inet6_dev *idev)
@@ -2531,8 +2539,10 @@ void ipv6_mc_

Re: [PATCH/RFC v2 net-next] ravb: unmap descriptors when freeing rings

2017-01-12 Thread Lino Sanfilippo


Hi,

On 12.01.2017 10:11, Simon Horman wrote:


+
+   for (; priv->cur_tx[q] - priv->dirty_tx[q] > 0; priv->dirty_tx[q]++) {


BTW: How can this work correctly when cur_tx wraps and dirty_tx is greater?
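
One way to look at the wrap question, assuming cur_tx and dirty_tx are
free-running unsigned 32-bit counters (an assumption about the field types,
which the quoted hunk does not show): unsigned subtraction is modulo 2^32,
so the difference still gives the number of outstanding entries after
cur_tx has wrapped. The helper name below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of entries queued but not yet reclaimed. Valid across wraparound
 * as long as fewer than 2^32 entries are outstanding.
 */
static uint32_t tx_pending(uint32_t cur, uint32_t dirty)
{
	return cur - dirty;
}
```

With unsigned operands the "> 0" test is the same as "!= 0". If the fields
were signed, or the difference were converted to a signed type, the
comparison would behave differently, so the question is a fair one to
check against the actual declarations.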

Regards,
Lino

[PATCH net-next] IPsec: do not ignore crypto err in ah input

2017-01-12 Thread Gilad Ben-Yossef

ah input processing uses the asynchronous hash crypto API, which
supplies an error code as part of the operation completion, but
the error code was being ignored.

Treat a crypto API error indication as a verification failure.

While a crypto API reported error would almost certainly result
in a memcpy of the digest failing anyway and thus the security
risk seems minor, performing a memory compare on what might be
uninitialized memory is wrong.

Signed-off-by: Gilad Ben-Yossef 
---

The change was boot tested on Arm64 but I did not exercise
the specific error code path in question.

 net/ipv4/ah4.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c
index f2a7102..22377c8 100644
--- a/net/ipv4/ah4.c
+++ b/net/ipv4/ah4.c
@@ -270,6 +270,9 @@ static void ah_input_done(struct crypto_async_request *base, int err)
int ihl = ip_hdrlen(skb);
int ah_hlen = (ah->hdrlen + 2) << 2;
 
+   if (err)
+   goto out;
+
work_iph = AH_SKB_CB(skb)->tmp;
auth_data = ah_tmp_auth(work_iph, ihl);
icv = ah_tmp_icv(ahp->ahash, auth_data, ahp->icv_trunc_len);
-- 
2.1.4

[PATCH net-next] cdc-ether: usbnet_cdc_zte_status() can be static

2017-01-12 Thread Wei Yongjun

From: Wei Yongjun 

Fixes the following sparse warning:

drivers/net/usb/cdc_ether.c:469:6: warning:
 symbol 'usbnet_cdc_zte_status' was not declared. Should it be static?

Signed-off-by: Wei Yongjun 
---
 drivers/net/usb/cdc_ether.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c
index fe7b288..620ba8e 100644
--- a/drivers/net/usb/cdc_ether.c
+++ b/drivers/net/usb/cdc_ether.c
@@ -466,7 +466,7 @@ static int usbnet_cdc_zte_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
  * connected. This causes the link state to be incorrect. Work around this by
  * always setting the state to off, then on.
  */
-void usbnet_cdc_zte_status(struct usbnet *dev, struct urb *urb)
+static void usbnet_cdc_zte_status(struct usbnet *dev, struct urb *urb)
 {
struct usb_cdc_notification *event;

[PATCH iproute2 v4 1/4] ifstat: Includes reorder

Reorder the includes in misc/ifstat.c to match convention.

Signed-off-by: Nogah Frankel 
Reviewed-by: Jiri Pirko 
---
 misc/ifstat.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 92d67b0..5bcbcc8 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -28,12 +28,12 @@
 #include 
 #include 
 
-#include 
-#include 
 #include 
 #include 
 
-#include 
+#include "libnetlink.h"
+#include "json_writer.h"
+#include "SNAPSHOT.h"
 
 int dump_zeros;
 int reset_history;
-- 
2.4.3

[PATCH iproute2 v4 4/4] ifstat: Add "sw only" extended statistics to ifstat

Add support for extended statistics of the SW-only type, which counts only
the packets that went via the CPU (useful for systems with forwarding
offload). It is read with filter type IFLA_STATS_LINK_OFFLOAD_XSTATS
and sub type IFLA_OFFLOAD_XSTATS_CPU_HIT.

It is available under the name 'cpu_hits'
(or any prefix of it, such as 'cpu' or simply 'c').

For example:
ifstat -x c

Signed-off-by: Nogah Frankel 
Reviewed-by: Jiri Pirko 
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 3478f0a..5b6a36b 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -730,7 +730,8 @@ static void xstat_usage(void)
 {
fprintf(stderr,
 "Usage: ifstat supported xstats:\n"
-"   64bits default stats, with 64 bits support\n");
+"   64bits default stats, with 64 bits support\n"
+"   cpu_hits   Counts only packets that went via the CPU.\n");
 }
 
 struct extended_stats_options_t {
@@ -745,6 +746,7 @@ struct extended_stats_options_t {
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
{"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE},
+   {"cpu_hits",  IFLA_STATS_LINK_OFFLOAD_XSTATS, IFLA_OFFLOAD_XSTATS_CPU_HIT},
 };
 
 static const char *get_filter_type(const char *name)
-- 
2.4.3

[PATCH iproute2 v4 0/4] update ifstat for new stats

Previously, stats were fetched with RTM_GETLINK, which returns 32-bit
statistics and supports only one type of stats.
Recently a new method of getting stats was added, RTM_GETSTATS. It supports
choosing the stats type, and the basic stats were changed from 32-bit
to 64-bit.

This patchset gives ifstat the ability to get extended stats by this
method. It adds two types of extended stats:
64bits - the same as the "normal" stats, but get the stats from the cpu
in a 64-bit struct.
cpu_hits - for packets that hit the CPU.

---
v3->v4:
- patch 2/4:
 - change xstat name read to avoid redundant copy.
 - delete extra line
- patch 4/4:
 - change xstat name.

v2->v3:
- patch 1/4:
 - add a new patch to reorder includes in misc/ifstat.c
- patch 2/4: (previously 1/3)
 - fix typos.
 - change error print to use fprintf.

v1->v2:
 - change from using RTM_GETSTATS always to using it only for extended
   stats.
 - Add 64bits extended stats type.

Nogah Frankel (4):
  ifstat: Includes reorder
  ifstat: Add extended statistics to ifstat
  ifstat: Add 64 bits based stats to extended statistics
  ifstat: Add "sw only" extended statistics to ifstat

 misc/ifstat.c | 170 +++---
 1 file changed, 152 insertions(+), 18 deletions(-)

-- 
2.4.3

[PATCH iproute2 v4 3/4] ifstat: Add 64 bits based stats to extended statistics

The default stats for ifstat are 32 bits based.
The kernel supports 64 bits based stats. (They are returned in struct
rtnl_link_stats64 which is an exact copy of struct rtnl_link_stats, in
which the "normal" stats are returned, but with fields of u64 instead of
u32). This patch adds them as an extended stats type.

It is read with filter type IFLA_STATS_LINK_64 and no sub type.

It is available under the name 64bits
(or any shortened form of it, such as "64").

For example:
ifstat -x 64bit
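
To illustrate why the counter width matters when rates are computed from
two samples, here is a small sketch (the function names are hypothetical
and not part of ifstat). A 32-bit counter wraps after 2^32 units, while a
64-bit one does not wrap in practice.

```c
#include <assert.h>
#include <stdint.h>

/* Difference between two samples of a 32-bit counter. Unsigned
 * arithmetic absorbs at most one wrap-around between the samples. */
static uint32_t sketch_delta32(uint32_t prev, uint32_t cur)
{
	return cur - prev;
}

/* Difference between two samples of a 64-bit counter, which is
 * assumed here never to wrap between samples. */
static uint64_t sketch_delta64(uint64_t prev, uint64_t cur)
{
	assert(cur >= prev);
	return cur - prev;
}
```

If more than 2^32 units pass between two samples, the 32-bit difference
silently loses the excess, whereas the 64-bit difference stays exact.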

Signed-off-by: Nogah Frankel 
Reviewed-by: Jiri Pirko 
---
 misc/ifstat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 9467119..3478f0a 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -729,7 +729,8 @@ static int verify_forging(int fd)
 static void xstat_usage(void)
 {
fprintf(stderr,
-"Usage: ifstat supported xstats:\n");
+"Usage: ifstat supported xstats:\n"
+"   64bits default stats, with 64 bits support\n");
 }
 
 struct extended_stats_options_t {
@@ -743,6 +744,7 @@ struct extended_stats_options_t {
  * Name length must be under 64 chars.
  */
 static const struct extended_stats_options_t extended_stats_options[] = {
+   {"64bits", IFLA_STATS_LINK_64, NO_SUB_TYPE},
 };
 
 static const char *get_filter_type(const char *name)
-- 
2.4.3

[PATCH iproute2 v4 2/4] ifstat: Add extended statistics to ifstat

Extended stats are part of the RTM_GETSTATS method. This patch adds them
to ifstat.
While extended stats can come in many forms, we support only the
rtnl_link_stats64 struct for them (which is the 64 bits version of struct
rtnl_link_stats).
We support stats in the main nesting level, or one lower.
The extension can be selected by its name or any shortened form of it. If
more than one entry matches, the first one will be picked.

To get the extended stats the flag -x  is used.
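
The name lookup described above can be sketched roughly as follows. This
is not the code from the patch: the table layout and the sketch_xstat_*
names are hypothetical, and the two entries are simply the names used in
this series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_xstat_option {
	const char *name;
};

/* Hypothetical table; the names are the ones used in this series. */
static const struct sketch_xstat_option sketch_xstat_options[] = {
	{ "64bits" },
	{ "cpu_hits" },
};

/* Return the index of the first entry whose name starts with the
 * user-supplied string, or -1 when nothing matches. */
static int sketch_xstat_lookup(const char *user)
{
	size_t n, i;
	size_t count = sizeof(sketch_xstat_options) /
		       sizeof(sketch_xstat_options[0]);

	assert(user != NULL);
	n = strlen(user);
	if (n == 0)
		return -1;
	for (i = 0; i < count; i++) {
		if (strncmp(user, sketch_xstat_options[i].name, n) == 0)
			return (int)i;
	}
	return -1;
}
```

Because the table is scanned in order, an ambiguous abbreviation resolves
to whichever entry comes first.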

Signed-off-by: Nogah Frankel 
Reviewed-by: Jiri Pirko 
---
 misc/ifstat.c | 160 --
 1 file changed, 145 insertions(+), 15 deletions(-)

diff --git a/misc/ifstat.c b/misc/ifstat.c
index 5bcbcc8..9467119 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -34,6 +34,7 @@
 #include "libnetlink.h"
 #include "json_writer.h"
 #include "SNAPSHOT.h"
+#include "utils.h"
 
 int dump_zeros;
 int reset_history;
@@ -48,17 +49,21 @@ int pretty;
 double W;
 char **patterns;
 int npatterns;
+bool is_extended;
+int filter_type;
+int sub_type;
 
 char info_source[128];
 int source_mismatch;
 
 #define MAXS (sizeof(struct rtnl_link_stats)/sizeof(__u32))
+#define NO_SUB_TYPE 0x
 
 struct ifstat_ent {
struct ifstat_ent   *next;
char*name;
int ifindex;
-   unsigned long long  val[MAXS];
+   __u64   val[MAXS];
double  rate[MAXS];
__u32   ival[MAXS];
 };
@@ -106,6 +111,48 @@ static int match(const char *id)
return 0;
 }
 
+static int get_nlmsg_extended(const struct sockaddr_nl *who,
+ struct nlmsghdr *m, void *arg)
+{
+   struct if_stats_msg *ifsm = NLMSG_DATA(m);
+   struct rtattr *tb[IFLA_STATS_MAX+1];
+   int len = m->nlmsg_len;
+   struct ifstat_ent *n;
+
+   if (m->nlmsg_type != RTM_NEWSTATS)
+   return 0;
+
+   len -= NLMSG_LENGTH(sizeof(*ifsm));
+   if (len < 0)
+   return -1;
+
+   parse_rtattr(tb, IFLA_STATS_MAX, IFLA_STATS_RTA(ifsm), len);
+   if (tb[filter_type] == NULL)
+   return 0;
+
+   n = malloc(sizeof(*n));
+   if (!n)
+   abort();
+
+   n->ifindex = ifsm->ifindex;
+   n->name = strdup(ll_index_to_name(ifsm->ifindex));
+
+   if (sub_type == NO_SUB_TYPE) {
+   memcpy(&n->val, RTA_DATA(tb[filter_type]), sizeof(n->val));
+   } else {
+   struct rtattr *attr;
+
+   attr = parse_rtattr_one_nested(sub_type, tb[filter_type]);
+   if (attr == NULL)
+   return 0;
+   memcpy(&n->val, RTA_DATA(attr), sizeof(n->val));
+   }
+   memset(&n->rate, 0, sizeof(n->rate));
+   n->next = kern_db;
+   kern_db = n;
+   return 0;
+}
+
 static int get_nlmsg(const struct sockaddr_nl *who,
 struct nlmsghdr *m, void *arg)
 {
@@ -147,18 +194,34 @@ static void load_info(void)
 {
struct ifstat_ent *db, *n;
struct rtnl_handle rth;
+   __u32 filter_mask;
 
if (rtnl_open(&rth, 0) < 0)
exit(1);
 
-   if (rtnl_wilddump_request(&rth, AF_INET, RTM_GETLINK) < 0) {
-   perror("Cannot send dump request");
-   exit(1);
-   }
+   if (is_extended) {
+   ll_init_map(&rth);
+   filter_mask = IFLA_STATS_FILTER_BIT(filter_type);
+   if (rtnl_wilddump_stats_req_filter(&rth, AF_UNSPEC, RTM_GETSTATS,
+  filter_mask) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
 
-   if (rtnl_dump_filter(&rth, get_nlmsg, NULL) < 0) {
-   fprintf(stderr, "Dump terminated\n");
-   exit(1);
+   if (rtnl_dump_filter(&rth, get_nlmsg_extended, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
+   } else {
+   if (rtnl_wilddump_request(&rth, AF_INET, RTM_GETLINK) < 0) {
+   perror("Cannot send dump request");
+   exit(1);
+   }
+
+   if (rtnl_dump_filter(&rth, get_nlmsg, NULL) < 0) {
+   fprintf(stderr, "Dump terminated\n");
+   exit(1);
+   }
}
 
rtnl_close(&rth);
@@ -553,10 +616,17 @@ static void update_db(int interval)
}
for (i = 0; i < MAXS; i++) {
double sample;
-   unsigned long incr = h1->ival[i] - n->ival[i];
+   __u64 incr;
+
+   if (is_extended) {
+   incr = h1->val[i] - n->val[i];
+

Setting link down or up in software

2017-01-12 Thread Mason

Hello,

I'm wondering what are the semantics of calling

ip link set dev eth0 down

I was expecting that to somehow instruct the device's ethernet driver
to shut everything down, have the PHY tell the peer that it's going
away, maybe even put the PHY in some low-power mode, etc.

But it doesn't seem to be doing any of that on my HW.

So what exactly is it supposed to do?


And on top of that, I am seeing random occurrences of

nb8800 26000.ethernet eth0: Link is Down

Sometimes it is printed immediately.
Sometimes it is printed as soon as I run "ip link set dev eth0 up" (?!)
Sometimes it is not printed at all.

I find this erratic behavior very confusing.

Is it the symptom of some deeper bug?

Regards.

[PATCH/RFC net] ravb: do not use zero-length alignment DMA request

From: Masaru Nagai 

Due to alignment requirements of the hardware, transmissions are split
into two DMA requests: a small padding request of 0 - 4 bytes in length,
followed by a request for the rest of the packet.

In the case of IP packets the first request will never be zero due
to the way that the stack aligns buffers for IP packets. However, for
non-IP packets it may be zero.
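
The length of the first request can be pictured with the sketch below.
It is not taken from the driver: the 4-byte alignment constant and the
sketch_* names are assumptions, and the non-zero variant shows only one
plausible way to avoid an empty first request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DMA_ALIGN 4u	/* assumed alignment, not from the driver */

/* Bytes from addr up to the next aligned address; this is 0 when
 * addr is already aligned. */
static size_t sketch_first_len(uintptr_t addr)
{
	/* The mask trick below needs a power-of-two alignment. */
	assert((SKETCH_DMA_ALIGN & (SKETCH_DMA_ALIGN - 1)) == 0);
	return (size_t)(-addr & (SKETCH_DMA_ALIGN - 1));
}

/* One possible workaround: when the data is already aligned, let the
 * first request carry one full alignment unit instead of nothing. */
static size_t sketch_first_len_nonzero(uintptr_t addr)
{
	size_t len = sketch_first_len(addr);

	if (len == 0)
		len = SKETCH_DMA_ALIGN;
	return len;
}
```

With the second helper the first request is always between 1 and 4
bytes, so the hardware never sees a zero-length descriptor.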

In this case it has been reported that timeouts occur, presumably because
transmission stops at the first zero-length DMA request and thus the packet
is not transmitted. However, in my environment a BUG is triggered as
follows:

[   20.381417] [ cut here ]
[   20.386054] kernel BUG at lib/swiotlb.c:495!
[   20.390324] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[   20.395805] Modules linked in:
[   20.398862] CPU: 0 PID: 2089 Comm: mz Not tainted 4.10.0-rc3-1-gf13ad2db193f #162
[   20.406689] Hardware name: Renesas Salvator-X board based on r8a7796 (DT)
[   20.413474] task: 80063b1f1900 task.stack: 80063a71c000
[   20.419404] PC is at swiotlb_tbl_map_single+0x178/0x2ec
[   20.424625] LR is at map_single+0x4c/0x98
[   20.428629] pc : [] lr : [] pstate: 81c5
[   20.436019] sp : 80063a71f9b0
[   20.439327] x29: 80063a71f9b0 x28: 80063a20d500
[   20.444636] x27: 08ed5000 x26: 
[   20.449944] x25: 00067abe2adc x24: 
[   20.455252] x23: 0020 x22: 0001
[   20.460559] x21: 00175ffe x20: 80063b2a0010
[   20.465866] x19:  x18: cae6fb20
[   20.471173] x17: a09ba018 x16: 087c8b70
[   20.476480] x15: a084f588 x14: a09cfa14
[   20.481787] x13: cae87ff0 x12: 0063abe2
[   20.487098] x11: 08096360 x10: 80063abe2adc
[   20.492407] x9 :  x8 : 
[   20.497718] x7 :  x6 : 08ed50d0
[   20.503028] x5 :  x4 : 0001
[   20.508338] x3 :  x2 : 00067abe2adc
[   20.513648] x1 : bafff000 x0 : 
[   20.518958]
[   20.520446] Process mz (pid: 2089, stack limit = 0x80063a71c000)
[   20.526798] Stack: (0x80063a71f9b0 to 0x80063a72)
[   20.532543] f9a0:   80063a71fa30 
0839c680
[   20.540374] f9c0: 80063b2a0010 80063b2a0010 0001 

[   20.548204] f9e0: 006e 80063b23c000 80063b23c000 

[   20.556034] fa00: 80063b23c000 80063a20d500 00013b1f1900 

[   20.563864] fa20: 80063ffd18e0 80063b2a0010 80063a71fa60 
0839cd10
[   20.571694] fa40: 80063b2a0010  80063ffd18e0 
00067abe2adc
[   20.579524] fa60: 80063a71fa90 08096380 80063b2a0010 

[   20.587353] fa80:  0001 80063a71fac0 
0864f770
[   20.595184] faa0: 80063b23caf0   
0140
[   20.603014] fac0: 80063a71fb60 087e6498 80063a20d500 
80063b23c000
[   20.610843] fae0:  08daeaf0  
08daeb00
[   20.618673] fb00: 80063a71fc0c 08da7000 80063b23c090 
80063a44f000
[   20.626503] fb20:  08daeb00 80063a71fc0c 
08da7000
[   20.634333] fb40: 80063b23c090  80060037 
087e63d8
[   20.642163] fb60: 80063a71fbc0 08807510 80063a692400 
80063a20d500
[   20.649993] fb80: 80063a44f000 80063b23c000 80063a69249c 

[   20.657823] fba0:  80063a087800 80063b23c000 
80063a20d500
[   20.665653] fbc0: 80063a71fc10 087e67dc 80063a20d500 
80063a692400
[   20.673483] fbe0: 80063b23c000  80063a44f000 
80063a69249c
[   20.681312] fc00: 80063a5f1a10 00103a087800 80063a71fc70 
087e6b24
[   20.689142] fc20: 80063a5f1a80 80063a71fde8 000f 
05ea
[   20.696972] fc40: 80063a5f1a10  000f 
0887fbd0
[   20.704802] fc60: fff43a5f1a80  80063a71fc80 
08880240
[   20.712632] fc80: 80063a71fd90 087c7a34 80063afc7180 

[   20.720462] fca0: cae6fe18 0014 6000 
0015
[   20.728292] fcc0: 0123 00ce 088d2000 
80063b1f1900
[   20.736122] fce0: 8933 08e7cb80 80063a71fd80 
087c50a4
[   20.743951] fd00: 8933 08e7cb80 08e7cb80 
001e
[   20.751781] fd20: 80063a71fe4c 0300 0123 

[   20.759611] fd40:  80063b1f 000e 
0300
[   20.767441] fd60: 0

[PATCH net-next 0/2] net/smc: fix typo and clc-bug

Dave,

I received 2 bug reports for my new AF_SMC-code. Here are the fixes for them.

Thanks,
Ursula

Ursula Braun (2):
  smc-typo-in-core-sock
  smc-macaddr-len

 net/core/sock.c   |  2 +-
 net/smc/smc_clc.c | 10 --
 net/smc/smc_ib.h  |  4 +++-
 3 files changed, 8 insertions(+), 8 deletions(-)

-- 
2.8.4

[PATCH net-next 2/2] smc: ETH_ALEN as memcpy length for mac addresses

When creating an SMC connection, there is a CLC (connection layer control)
handshake to prepare for RDMA traffic. The corresponding code is part of
commit 0cfdd8f92cac ("smc: connection and link group creation").
MAC addresses to be exchanged in the handshake are copied with a wrong
length of 12 instead of 6 bytes. The code that follows overwrites the
excess bytes, but the correct length should nevertheless be used for the
preceding MAC address copy. Use ETH_ALEN for the memcpy length with
mac addresses.
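
Where the 12 comes from can be seen in the following sketch. The struct
is a cut-down, hypothetical stand-in for the device structure (all
sketch_ and SKETCH_ names are made up); it only keeps the per-port MAC
array with the dimensions used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_PORTS 2
#define SKETCH_ETH_ALEN  6	/* length of an Ethernet MAC address */

struct sketch_ib_device {
	char mac[SKETCH_MAX_PORTS][SKETCH_ETH_ALEN];
};

/* sizeof of the whole array covers every port: 2 * 6 = 12 bytes. */
static size_t sketch_sizeof_all_macs(const struct sketch_ib_device *dev)
{
	return sizeof(dev->mac);
}

/* sizeof of one row is the size of a single MAC address: 6 bytes. */
static size_t sketch_sizeof_one_mac(const struct sketch_ib_device *dev)
{
	return sizeof(dev->mac[0]);
}

/* Copy the MAC of a 1-based port using an explicit length. */
static void sketch_copy_port_mac(char *dst,
				 const struct sketch_ib_device *dev, int port)
{
	assert(port >= 1 && port <= SKETCH_MAX_PORTS);
	memcpy(dst, dev->mac[port - 1], SKETCH_ETH_ALEN);
}
```

Passing the size of the whole array to memcpy copies both ports' worth
of bytes into a 6-byte destination; an explicit ETH_ALEN avoids relying
on which expression sizeof is applied to.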

Signed-off-by: Ursula Braun 
Fixes: 0cfdd8f92cac ("smc: connection and link group creation")
Reported-by: Dan Carpenter 
---
 net/smc/smc_clc.c | 10 --
 net/smc/smc_ib.h  |  4 +++-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c
index e1e684c..cc6b6f8 100644
--- a/net/smc/smc_clc.c
+++ b/net/smc/smc_clc.c
@@ -10,6 +10,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -151,8 +152,7 @@ int smc_clc_send_proposal(struct smc_sock *smc,
pclc.hdr.version = SMC_CLC_V1;  /* SMC version */
memcpy(pclc.lcl.id_for_peer, local_systemid, sizeof(local_systemid));
memcpy(&pclc.lcl.gid, &smcibdev->gid[ibport - 1], SMC_GID_SIZE);
-   memcpy(&pclc.lcl.mac, &smcibdev->mac[ibport - 1],
-  sizeof(smcibdev->mac[ibport - 1]));
+   memcpy(&pclc.lcl.mac, &smcibdev->mac[ibport - 1], ETH_ALEN);
 
/* determine subnet and mask from internal TCP socket */
rc = smc_netinfo_by_tcpsk(smc->clcsock, &pclc.outgoing_subnet,
@@ -199,8 +199,7 @@ int smc_clc_send_confirm(struct smc_sock *smc)
memcpy(cclc.lcl.id_for_peer, local_systemid, sizeof(local_systemid));
memcpy(&cclc.lcl.gid, &link->smcibdev->gid[link->ibport - 1],
   SMC_GID_SIZE);
-   memcpy(&cclc.lcl.mac, &link->smcibdev->mac[link->ibport - 1],
-  sizeof(link->smcibdev->mac));
+   memcpy(&cclc.lcl.mac, &link->smcibdev->mac[link->ibport - 1], ETH_ALEN);
hton24(cclc.qpn, link->roce_qp->qp_num);
cclc.rmb_rkey =
htonl(conn->rmb_desc->mr_rx[SMC_SINGLE_LINK]->rkey);
@@ -252,8 +251,7 @@ int smc_clc_send_accept(struct smc_sock *new_smc, int srv_first_contact)
memcpy(aclc.lcl.id_for_peer, local_systemid, sizeof(local_systemid));
memcpy(&aclc.lcl.gid, &link->smcibdev->gid[link->ibport - 1],
   SMC_GID_SIZE);
-   memcpy(&aclc.lcl.mac, link->smcibdev->mac[link->ibport - 1],
-  sizeof(link->smcibdev->mac[link->ibport - 1]));
+   memcpy(&aclc.lcl.mac, link->smcibdev->mac[link->ibport - 1], ETH_ALEN);
hton24(aclc.qpn, link->roce_qp->qp_num);
aclc.rmb_rkey =
htonl(conn->rmb_desc->mr_rx[SMC_SINGLE_LINK]->rkey);
diff --git a/net/smc/smc_ib.h b/net/smc/smc_ib.h
index 3fe2d55..a95f74b 100644
--- a/net/smc/smc_ib.h
+++ b/net/smc/smc_ib.h
@@ -11,6 +11,7 @@
 #ifndef _SMC_IB_H
 #define _SMC_IB_H
 
+#include 
 #include 
 
 #define SMC_MAX_PORTS  2   /* Max # of ports */
@@ -34,7 +35,8 @@ struct smc_ib_device { /* ib-device infos for smc */
struct ib_cq*roce_cq_recv;  /* recv completion queue */
struct tasklet_struct   send_tasklet;   /* called by send cq handler */
struct tasklet_struct   recv_tasklet;   /* called by recv cq handler */
-   charmac[SMC_MAX_PORTS][6]; /* mac address per port*/
+   charmac[SMC_MAX_PORTS][ETH_ALEN];
+   /* mac address per port*/
union ib_gidgid[SMC_MAX_PORTS]; /* gid per port */
u8  initialized : 1; /* ib dev CQ, evthdl done */
struct work_struct  port_event_work;
-- 
2.8.4

[PATCH net-next 1/2] net: fix AF_SMC related typo

When introducing the new socket family AF_SMC in
commit ac7138746e14 ("smc: establish new socket family"),
a typo in af_family_clock_key_strings has slipped in.
This patch repairs it.

Signed-off-by: Ursula Braun 
Fixes: ac7138746e14 ("smc: establish new socket family")
Reported-by: Andrew Morton 
---
 net/core/sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index dbbdc4f..8b35debf 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -256,7 +256,7 @@ static const char *const af_family_clock_key_strings[AF_MAX+1] = {
   "clock-AF_RXRPC" , "clock-AF_ISDN" , "clock-AF_PHONET"   ,
   "clock-AF_IEEE802154", "clock-AF_CAIF" , "clock-AF_ALG"  ,
   "clock-AF_NFC"   , "clock-AF_VSOCK", "clock-AF_KCM"  ,
-  "clock-AF_QIPCRTR", "closck-AF_smc", "clock-AF_MAX"
+  "clock-AF_QIPCRTR", "clock-AF_SMC" , "clock-AF_MAX"
 };
 
 /*
-- 
2.8.4

Re: [PATCH v4 01/13] net: ethernet: aquantia: Make and configuration files.

From: Alexander Loktionov 
Date: Wed, 11 Jan 2017 19:53:05 -0800

> @@ -0,0 +1,19 @@
> +/*
> + * aQuantia Corporation Network Driver
> + * Copyright (C) 2014-2017 aQuantia Corporation. All rights reserved
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + */
> +
> +#ifndef VER_H
> +#define VER_H
> +
> +#define NIC_MAJOR_DRIVER_VERSION   1
> +#define NIC_MINOR_DRIVER_VERSION   5
> +#define NIC_BUILD_DRIVER_VERSION   339
> +#define NIC_REVISION_DRIVER_VERSION0
> +
> +#endif /* VER_H */
> +

Please do not add empty lines at the end of files, GIT even warns
about this.

Please audit your entire submission for this problem.

Re: [PATCH net-next] net: core: Make netif_wake_subqueue a wrapper

From: Florian Fainelli 
Date: Wed, 11 Jan 2017 21:13:02 -0800

> netif_wake_subqueue() is duplicating the same thing that netif_tx_wake_queue()
> does, so make it call it directly after looking up the queue from the index.
> 
> Signed-off-by: Florian Fainelli 

Looks good, applied.

Re: [PATCH v2 net-next] Introduce a sysctl that modifies the value of PROT_SOCK.

From: Krister Johansen 
Date: Wed, 11 Jan 2017 22:52:25 -0800

> Add net.ipv4.ip_unprotected_port_start, which is a per namespace sysctl
> that denotes the first unprotected inet port in the namespace.  To
> disable all protected ports set this to zero.  It also checks for
> overlap with the local port range.  The protected and local range may
> not overlap.
> 
> The use case for this change is to allow containerized processes to bind
> to privileged ports, but prevent them from ever being allowed to modify
> their container's network configuration.  The latter is accomplished by
> ensuring that the network namespace is not a child of the user
> namespace.  This modification was needed to allow the container manager
> to disable a namespace's privileged port restrictions without exposing
> control of the network namespace to processes in the user namespace.
> 
> Signed-off-by: Krister Johansen 

This is what CAP_NET_BIND_SERVICE is for, and why it is a separate
network privilege, please use it.

[iproute PATCH] tc: m_xt: Fix segfault with iptables-1.6.0

2017-01-12 Thread Phil Sutter

Said iptables version introduced struct xtables_globals field
'compat_rev', a function pointer. Initializing it is mandatory as
libxtables calls it without existence check.

Without this, tc segfaults when using the xt action like so:

| tc filter add dev d0 parent : u32 match u32 0 0 \
|   action xt -j MARK --set-mark 20
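
The failure mode can be pictured with a small sketch. It does not use
libxtables; the structure and all sketch_* names are hypothetical
stand-ins that only mimic a globals structure with a callback member.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a library's globals structure. */
struct sketch_globals {
	const char *program_name;
	int (*compat_rev)(const char *name, int rev);
};

static int sketch_compat_rev(const char *name, int rev)
{
	(void)name;
	return rev == 0;
}

/* Members left out of a designated initializer are zero-initialized,
 * so compat_rev is NULL in this object. */
static const struct sketch_globals sketch_without_cb = {
	.program_name = "tc",
};

static const struct sketch_globals sketch_with_cb = {
	.program_name = "tc",
	.compat_rev = sketch_compat_rev,
};

/* The library described above calls the pointer unconditionally; the
 * assert here only makes that failure explicit in this sketch. */
static int sketch_library_call(const struct sketch_globals *g)
{
	assert(g->compat_rev != NULL);
	return g->compat_rev("MARK", 0);
}
```

Calling sketch_library_call() with sketch_without_cb would jump through
a NULL pointer if the assert were absent, which is the kind of crash the
patch avoids by filling in the member.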

Signed-off-by: Phil Sutter 
---
 tc/m_xt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tc/m_xt.c b/tc/m_xt.c
index dbb54981462ee..57ed40d7aa3a8 100644
--- a/tc/m_xt.c
+++ b/tc/m_xt.c
@@ -77,6 +77,9 @@ static struct xtables_globals tcipt_globals = {
.orig_opts = original_opts,
.opts = original_opts,
.exit_err = NULL,
+#if (XTABLES_VERSION_CODE >= 11)
+   .compat_rev = xtables_compatible_revision,
+#endif
 };
 
 /*
-- 
2.11.0

Re: [PATCH net-next] cxgb4: Initialize mbox lock and list for mgmt dev

From: Ganesh Goudar 
Date: Thu, 12 Jan 2017 12:23:21 +0530

> Initialize mbox lock and list for mgmt dev to avoid NULL pointer
> dereference when cxgb_set_vf_mac is called.
> 
> And also allocate memory for private data while allocating mgmt
> netdev.
> 
> Signed-off-by: Ganesh Goudar 

Applied.

Re: [patch net 0/3] mlxsw: Couple of fixes

From: Jiri Pirko 
Date: Thu, 12 Jan 2017 09:10:36 +0100

> Couple of simple fixes from Arkadi and Elad.
> 
> Please queue these up for stable. Thanks.

Series applied and queued up for -stable, thanks.

Re: [PATCH net-next] tools: psock_lib: harden socket filter used by psock tests


On 01/12/2017 02:10 PM, Sowmini Varadhan wrote:

The filter added by sock_setfilter is intended to only permit
packets matching the pattern set up by create_payload(), but
we only check the ip_len, and a single test-character in
the IP packet to ensure this condition.

Harden the filter by adding additional constraints so that we only
permit UDP/IPv4 packets that meet the ip_len and test-character
requirements. Include the bpf_asm src as a comment, in case this
needs to be enhanced in the future

Signed-off-by: Sowmini Varadhan 


LGTM, thanks!

Acked-by: Daniel Borkmann

Re: [PATCH v2 net-next] Introduce a sysctl that modifies the value of PROT_SOCK.

2017-01-12 Thread Eric Dumazet

On Wed, 2017-01-11 at 22:52 -0800, Krister Johansen wrote:
> Add net.ipv4.ip_unprotected_port_start, which is a per namespace sysctl
> that denotes the first unprotected inet port in the namespace.  To
> disable all protected ports set this to zero.  It also checks for
> overlap with the local port range.  The protected and local range may
> not overlap.
> 
> The use case for this change is to allow containerized processes to bind
> to privileged ports, but prevent them from ever being allowed to modify
> their container's network configuration.  The latter is accomplished by
> ensuring that the network namespace is not a child of the user
> namespace.  This modification was needed to allow the container manager
> to disable a namespace's privileged port restrictions without exposing
> control of the network namespace to processes in the user namespace.
> 
> Signed-off-by: Krister Johansen 
> ---
>  include/net/ip.h   | 10 +
>  include/net/netns/ipv4.h   |  1 +
>  net/ipv4/af_inet.c |  5 -
>  net/ipv4/sysctl_net_ipv4.c | 50 
> +-
>  net/ipv6/af_inet6.c|  3 ++-
>  net/netfilter/ipvs/ip_vs_ctl.c |  7 +++---
>  net/sctp/socket.c  | 10 +
>  security/selinux/hooks.c   |  3 ++-

Adding a new sysctl without documentation is generally not accepted.

Please take a look at Documentation/networking/ip-sysctl.txt

BTW, sticking to 'unprivileged' ports might be better than 'unprotected'
which is vague.

[PATCH net-next] lwt_bpf: bpf_lwt_prog_cmp() can be static

2017-01-12 Thread Wei Yongjun

From: Wei Yongjun 

Fixes the following sparse warning:

net/core/lwt_bpf.c:355:5: warning:
 symbol 'bpf_lwt_prog_cmp' was not declared. Should it be static?

Signed-off-by: Wei Yongjun 
---
 net/core/lwt_bpf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index 71bb3e2..40ef8ae 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -352,7 +352,7 @@ static int bpf_encap_nlsize(struct lwtunnel_state *lwtstate)
   0;
 }
 
-int bpf_lwt_prog_cmp(struct bpf_lwt_prog *a, struct bpf_lwt_prog *b)
+static int bpf_lwt_prog_cmp(struct bpf_lwt_prog *a, struct bpf_lwt_prog *b)
 {
/* FIXME:
 * The LWT state is currently rebuilt for delete requests which

Re: [PATCH net-next] lwt_bpf: bpf_lwt_prog_cmp() can be static


On 01/12/2017 03:39 PM, Wei Yongjun wrote:

From: Wei Yongjun 

Fixes the following sparse warning:

net/core/lwt_bpf.c:355:5: warning:
  symbol 'bpf_lwt_prog_cmp' was not declared. Should it be static?

Signed-off-by: Wei Yongjun 


Acked-by: Daniel Borkmann

[PATCH RESEND net-next 11/12] s390/qeth: issue STARTLAN as first IPA command

From: Julian Wiedmann 

STARTLAN needs to be the first IPA command after MPC initialization
completes.
So move the qeth_send_startlan() call from the layer disciplines
into the core path, right after the MPC handshake.
While at it, replace the magic LAN OFFLINE return code
with the existing enum.
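
The return-code handling that moves into the core path can be sketched
as below. The sketch_* names are hypothetical and the card structure is
reduced to the single flag involved; 0xe080 is the magic value that the
patch replaces with the enum.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* The magic value that the patch replaces with an existing enum. */
enum { SKETCH_RC_LAN_OFFLINE = 0xe080 };

struct sketch_card {
	int lan_online;
};

/* Returns 0 when setup may continue (LAN online or merely offline),
 * -ENODEV for any other STARTLAN failure. */
static int sketch_handle_startlan(struct sketch_card *card, int rc)
{
	assert(card != NULL);
	if (rc == 0) {
		card->lan_online = 1;
		return 0;
	}
	if (rc == SKETCH_RC_LAN_OFFLINE) {
		card->lan_online = 0;
		return 0;
	}
	return -ENODEV;
}
```

An offline LAN is treated as a warning rather than an error, so only the
remaining failure codes abort the setup.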

Signed-off-by: Julian Wiedmann 
Reviewed-by: Thomas Richter 
Reviewed-by: Ursula Braun 
---
 drivers/s390/net/qeth_core.h  |  1 -
 drivers/s390/net/qeth_core_main.c | 21 +
 drivers/s390/net/qeth_l2_main.c   | 15 ---
 drivers/s390/net/qeth_l3_main.c   | 15 ---
 4 files changed, 17 insertions(+), 35 deletions(-)

diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h
index 774ae51..e7addea 100644
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -913,7 +913,6 @@ void qeth_clear_thread_running_bit(struct qeth_card *, unsigned long);
 int qeth_core_hardsetup_card(struct qeth_card *);
 void qeth_print_status_message(struct qeth_card *);
 int qeth_init_qdio_queues(struct qeth_card *);
-int qeth_send_startlan(struct qeth_card *);
 int qeth_send_ipa_cmd(struct qeth_card *, struct qeth_cmd_buffer *,
  int (*reply_cb)
  (struct qeth_card *, struct qeth_reply *, unsigned long),
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index ca8309f..315d8a2 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -2944,7 +2944,7 @@ int qeth_send_ipa_cmd(struct qeth_card *card, struct qeth_cmd_buffer *iob,
 }
 EXPORT_SYMBOL_GPL(qeth_send_ipa_cmd);
 
-int qeth_send_startlan(struct qeth_card *card)
+static int qeth_send_startlan(struct qeth_card *card)
 {
int rc;
struct qeth_cmd_buffer *iob;
@@ -2957,7 +2957,6 @@ int qeth_send_startlan(struct qeth_card *card)
rc = qeth_send_ipa_cmd(card, iob, NULL, NULL);
return rc;
 }
-EXPORT_SYMBOL_GPL(qeth_send_startlan);
 
 static int qeth_default_setadapterparms_cb(struct qeth_card *card,
struct qeth_reply *reply, unsigned long data)
@@ -5087,6 +5086,20 @@ int qeth_core_hardsetup_card(struct qeth_card *card)
goto out;
}
 
+   rc = qeth_send_startlan(card);
+   if (rc) {
+   QETH_DBF_TEXT_(SETUP, 2, "6err%d", rc);
+   if (rc == IPA_RC_LAN_OFFLINE) {
+   dev_warn(&card->gdev->dev,
+   "The LAN is offline\n");
+   card->lan_online = 0;
+   } else {
+   rc = -ENODEV;
+   goto out;
+   }
+   } else
+   card->lan_online = 1;
+
card->options.ipa4.supported_funcs = 0;
card->options.ipa6.supported_funcs = 0;
card->options.adp.supported_funcs = 0;
@@ -5098,14 +5111,14 @@ int qeth_core_hardsetup_card(struct qeth_card *card)
if (qeth_is_supported(card, IPA_SETADAPTERPARMS)) {
rc = qeth_query_setadapterparms(card);
if (rc < 0) {
-   QETH_DBF_TEXT_(SETUP, 2, "6err%d", rc);
+   QETH_DBF_TEXT_(SETUP, 2, "7err%d", rc);
goto out;
}
}
if (qeth_adp_supported(card, IPA_SETADP_SET_DIAG_ASSIST)) {
rc = qeth_query_setdiagass(card);
if (rc < 0) {
-   QETH_DBF_TEXT_(SETUP, 2, "7err%d", rc);
+   QETH_DBF_TEXT_(SETUP, 2, "8err%d", rc);
goto out;
}
}
diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index c298759c..bea4833 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -1177,21 +1177,6 @@ static int __qeth_l2_set_online(struct ccwgroup_device *gdev, int recovery_mode)
/* softsetup */
QETH_DBF_TEXT(SETUP, 2, "softsetp");
 
-   rc = qeth_send_startlan(card);
-   if (rc) {
-   QETH_DBF_TEXT_(SETUP, 2, "1err%d", rc);
-   if (rc == 0xe080) {
-   dev_warn(&card->gdev->dev,
-   "The LAN is offline\n");
-   card->lan_online = 0;
-   goto contin;
-   }
-   rc = -ENODEV;
-   goto out_remove;
-   } else
-   card->lan_online = 1;
-
-contin:
if ((card->info.type == QETH_CARD_TYPE_OSD) ||
(card->info.type == QETH_CARD_TYPE_OSX)) {
rc = qeth_l2_start_ipassists(card);
diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c
index ac37d05..06d0add 100644
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -3227,21 +3227,6 @@ static int __qeth_l3_set_online(struct ccwgroup_device *gdev, int recovery_mode)
/* softsetup */
QETH_DBF_TEXT(SETUP, 2, "softsetp");
 
-   rc = qeth_send_

[PATCH RESEND net-next 02/12] s390/qeth: test RX/TX checksum offload reply

From: Thomas Richter 

Turning on receive and/or transmit checksum offload support
on the OSA card requires 2 commands:
1. start command which replies with available features
2. enable command to turn on selected features.

The current version does not check the reply of the start
command and simply uses the returned value to enable
offload features. When the start command returns zero, this
leads to a situation where no checksum offload
is turned on by the hardware. Even worse no error
indication is returned. The Linux kernel assumes
the OSA card performs RX/TX checksum offload, but the hardware
does not perform any checksum verification at all.

This patch checks the responses of the start and enable
commands from the hardware and turns off
checksum offloading if either command fails or does not
respond with the correct bit setting.
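
The check described above amounts to a mask comparison after each of the
two commands. The following sketch uses hypothetical sketch_* names and
a reduced reply structure; the bit values are the ones listed in the
enum this patch adds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as listed in the enum added by this patch. */
enum {
	SKETCH_CSUM_IP_HDR = 0x0002,
	SKETCH_CSUM_UDP    = 0x0008,
	SKETCH_CSUM_TCP    = 0x0010,
};

#define SKETCH_REQUIRED \
	(SKETCH_CSUM_IP_HDR | SKETCH_CSUM_UDP | SKETCH_CSUM_TCP)

struct sketch_reply {
	uint32_t supported;	/* reported by the start command */
	uint32_t enabled;	/* reported by the enable command */
};

/* Pass a command error through, otherwise require all needed bits. */
static int sketch_check_bits(int rc, uint32_t reported)
{
	if (rc)
		return rc;
	if ((SKETCH_REQUIRED & reported) != SKETCH_REQUIRED)
		return -EIO;
	return 0;
}

/* Both the start and the enable reply must carry the required bits. */
static int sketch_checksum_on(int start_rc, int enable_rc,
			      const struct sketch_reply *reply)
{
	int rc;

	assert(reply != NULL);
	rc = sketch_check_bits(start_rc, reply->supported);
	if (rc)
		return rc;
	return sketch_check_bits(enable_rc, reply->enabled);
}
```

A start reply of zero, the case described in the commit message, now
yields -EIO instead of being passed on silently to the enable command.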

Signed-off-by: Thomas Richter 
Reviewed-by: Julian Wiedmann 
Reviewed-by: Ursula Braun 
---
 drivers/s390/net/qeth_core_main.c | 13 +
 drivers/s390/net/qeth_core_mpc.h  | 10 ++
 2 files changed, 23 insertions(+)

diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 5ab80ea..49b813f 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -6104,11 +6104,19 @@ static int qeth_ipa_checksum_run_cmd(struct qeth_card *card,
 
 static int qeth_send_checksum_on(struct qeth_card *card, int cstype)
 {
+   const __u32 required_features = QETH_IPA_CHECKSUM_IP_HDR |
+   QETH_IPA_CHECKSUM_UDP |
+   QETH_IPA_CHECKSUM_TCP;
struct qeth_checksum_cmd chksum_cb;
int rc;
 
rc = qeth_ipa_checksum_run_cmd(card, cstype, IPA_CMD_ASS_START, 0,
   &chksum_cb);
+   if (!rc) {
+   if ((required_features & chksum_cb.supported) !=
+   required_features)
+   rc = -EIO;
+   }
if (rc) {
qeth_send_simple_setassparms(card, cstype, IPA_CMD_ASS_STOP, 0);
dev_warn(&card->gdev->dev,
@@ -6118,6 +6126,11 @@ static int qeth_send_checksum_on(struct qeth_card *card, int cstype)
}
rc = qeth_ipa_checksum_run_cmd(card, cstype, IPA_CMD_ASS_ENABLE,
   chksum_cb.supported, &chksum_cb);
+   if (!rc) {
+   if ((required_features & chksum_cb.enabled) !=
+   required_features)
+   rc = -EIO;
+   }
if (rc) {
qeth_send_simple_setassparms(card, cstype, IPA_CMD_ASS_STOP, 0);
dev_warn(&card->gdev->dev,
diff --git a/drivers/s390/net/qeth_core_mpc.h b/drivers/s390/net/qeth_core_mpc.h
index f54ea72..bc69d0a 100644
--- a/drivers/s390/net/qeth_core_mpc.h
+++ b/drivers/s390/net/qeth_core_mpc.h
@@ -352,6 +352,16 @@ struct qeth_arp_query_info {
char *udata;
 };
 
+/* IPA set assist segmentation bit definitions for receive and
+ * transmit checksum offloading.
+ */
+enum qeth_ipa_checksum_bits {
+   QETH_IPA_CHECKSUM_IP_HDR= 0x0002,
+   QETH_IPA_CHECKSUM_UDP   = 0x0008,
+   QETH_IPA_CHECKSUM_TCP   = 0x0010,
+   QETH_IPA_CHECKSUM_LP2LP = 0x0020
+};
+
 /* IPA Assist checksum offload reply layout. */
 struct qeth_checksum_cmd {
__u32 supported;
-- 
2.8.4

[PATCH RESEND net-next 10/12] s390/qeth: shuffle MAC management functions around

From: Julian Wiedmann 

Move all MAC utility functions in one place, and drop the
forward declarations.

Signed-off-by: Julian Wiedmann 
Reviewed-by: Thomas Richter 
---
 drivers/s390/net/qeth_l2_main.c | 129 
 1 file changed, 63 insertions(+), 66 deletions(-)

diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index d456740..c298759c 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -27,9 +27,6 @@
 
 static int qeth_l2_set_offline(struct ccwgroup_device *);
 static int qeth_l2_stop(struct net_device *);
-static int qeth_l2_send_delmac(struct qeth_card *, __u8 *);
-static int qeth_l2_send_setdelmac(struct qeth_card *, __u8 *,
-  enum qeth_ipa_cmds);
 static void qeth_l2_set_rx_mode(struct net_device *);
 static int qeth_l2_recover(void *);
 static void qeth_bridgeport_query_support(struct qeth_card *card);
@@ -165,6 +162,64 @@ static int qeth_setdel_makerc(struct qeth_card *card, int retcode)
return rc;
 }
 
+static int qeth_l2_send_setdelmac(struct qeth_card *card, __u8 *mac,
+  enum qeth_ipa_cmds ipacmd)
+{
+   struct qeth_ipa_cmd *cmd;
+   struct qeth_cmd_buffer *iob;
+
+   QETH_CARD_TEXT(card, 2, "L2sdmac");
+   iob = qeth_get_ipacmd_buffer(card, ipacmd, QETH_PROT_IPV4);
+   if (!iob)
+   return -ENOMEM;
+   cmd = (struct qeth_ipa_cmd *)(iob->data+IPA_PDU_HEADER_SIZE);
+   cmd->data.setdelmac.mac_length = OSA_ADDR_LEN;
+   memcpy(&cmd->data.setdelmac.mac, mac, OSA_ADDR_LEN);
+   return qeth_setdel_makerc(card, qeth_send_ipa_cmd(card, iob,
+   NULL, NULL));
+}
+
+static int qeth_l2_send_setmac(struct qeth_card *card, __u8 *mac)
+{
+   int rc;
+
+   QETH_CARD_TEXT(card, 2, "L2Setmac");
+   rc = qeth_l2_send_setdelmac(card, mac, IPA_CMD_SETVMAC);
+   if (rc == 0) {
+   card->info.mac_bits |= QETH_LAYER2_MAC_REGISTERED;
+   memcpy(card->dev->dev_addr, mac, OSA_ADDR_LEN);
+   dev_info(&card->gdev->dev,
+   "MAC address %pM successfully registered on device %s\n",
+   card->dev->dev_addr, card->dev->name);
+   } else {
+   card->info.mac_bits &= ~QETH_LAYER2_MAC_REGISTERED;
+   switch (rc) {
+   case -EEXIST:
+   dev_warn(&card->gdev->dev,
+   "MAC address %pM already exists\n", mac);
+   break;
+   case -EPERM:
+   dev_warn(&card->gdev->dev,
+   "MAC address %pM is not authorized\n", mac);
+   break;
+   }
+   }
+   return rc;
+}
+
+static int qeth_l2_send_delmac(struct qeth_card *card, __u8 *mac)
+{
+   int rc;
+
+   QETH_CARD_TEXT(card, 2, "L2Delmac");
+   if (!(card->info.mac_bits & QETH_LAYER2_MAC_REGISTERED))
+   return 0;
+   rc = qeth_l2_send_setdelmac(card, mac, IPA_CMD_DELVMAC);
+   if (rc == 0)
+   card->info.mac_bits &= ~QETH_LAYER2_MAC_REGISTERED;
+   return rc;
+}
+
 static int qeth_l2_send_setgroupmac(struct qeth_card *card, __u8 *mac)
 {
int rc;
@@ -193,11 +248,6 @@ static int qeth_l2_send_delgroupmac(struct qeth_card *card, __u8 *mac)
return rc;
 }
 
-static inline u32 qeth_l2_mac_hash(const u8 *addr)
-{
-   return get_unaligned((u32 *)(&addr[2]));
-}
-
 static int qeth_l2_write_mac(struct qeth_card *card, struct qeth_mac *mac)
 {
if (mac->is_uc) {
@@ -232,6 +282,11 @@ static void qeth_l2_del_all_macs(struct qeth_card *card)
spin_unlock_bh(&card->mclock);
 }
 
+static inline u32 qeth_l2_mac_hash(const u8 *addr)
+{
+   return get_unaligned((u32 *)(&addr[2]));
+}
+
 static inline int qeth_l2_get_cast_type(struct qeth_card *card,
struct sk_buff *skb)
 {
@@ -572,64 +627,6 @@ static int qeth_l2_poll(struct napi_struct *napi, int budget)
return work_done;
 }
 
-static int qeth_l2_send_setdelmac(struct qeth_card *card, __u8 *mac,
-  enum qeth_ipa_cmds ipacmd)
-{
-   struct qeth_ipa_cmd *cmd;
-   struct qeth_cmd_buffer *iob;
-
-   QETH_CARD_TEXT(card, 2, "L2sdmac");
-   iob = qeth_get_ipacmd_buffer(card, ipacmd, QETH_PROT_IPV4);
-   if (!iob)
-   return -ENOMEM;
-   cmd = (struct qeth_ipa_cmd *)(iob->data+IPA_PDU_HEADER_SIZE);
-   cmd->data.setdelmac.mac_length = OSA_ADDR_LEN;
-   memcpy(&cmd->data.setdelmac.mac, mac, OSA_ADDR_LEN);
-   return qeth_setdel_makerc(card, qeth_send_ipa_cmd(card, iob,
-   NULL, NULL));
-}
-
-static int qeth_l2_send_setmac(struct qeth_card *card, __u8 *mac)
-{
-   int rc;
-
-   QETH_CARD_TEXT(card, 2, "L2Setmac");
-   rc = qeth_l2_send_setdelmac(card, mac, IPA_CMD_SETVMAC);
-   if (rc == 0) {
-

[PATCH RESEND net-next 06/12] s390/qeth: drop qeth_l2_del_all_macs() parameter

From: Julian Wiedmann 

The only caller passes del = 0, so remove both the parameter and
the code that handles != 0.

Signed-off-by: Julian Wiedmann 
Reviewed-by: Thomas Richter 
Acked-by: Ursula Braun 
---
 drivers/s390/net/qeth_l2_main.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 9c921c28..3025f56 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -216,7 +216,7 @@ static int qeth_l2_write_mac(struct qeth_card *card, struct qeth_mac *mac)
return rc;
 }
 
-static void qeth_l2_del_all_macs(struct qeth_card *card, int del)
+static void qeth_l2_del_all_macs(struct qeth_card *card)
 {
struct qeth_mac *mac;
struct hlist_node *tmp;
@@ -224,13 +224,6 @@ static void qeth_l2_del_all_macs(struct qeth_card *card, int del)
 
spin_lock_bh(&card->mclock);
hash_for_each_safe(card->mac_htable, i, tmp, mac, hnode) {
-   if (del) {
-   if (mac->is_uc)
-   qeth_l2_send_setdelmac(card, mac->mac_addr,
-   IPA_CMD_DELVMAC);
-   else
-   qeth_l2_send_delgroupmac(card, mac->mac_addr);
-   }
hash_del(&mac->hnode);
kfree(mac);
}
@@ -425,7 +418,7 @@ static void qeth_l2_stop_card(struct qeth_card *card, int recovery_mode)
card->state = CARD_STATE_SOFTSETUP;
}
if (card->state == CARD_STATE_SOFTSETUP) {
-   qeth_l2_del_all_macs(card, 0);
+   qeth_l2_del_all_macs(card);
qeth_clear_ipacmd_list(card);
card->state = CARD_STATE_HARDSETUP;
}
-- 
2.8.4

[PATCH RESEND net-next 12/12] s390/qeth: fix retrieval of vipa and proxy-arp addresses

qeth devices in layer3 mode need separate handling of vipa and proxy-arp
addresses. The vipa and proxy-arp addresses processed by qeth can be read
from userspace. Since commit 5f78e29ceebf ("qeth: optimize IP handling
in rx_mode callback"), retrieval of these addresses is broken if more
than one vipa or proxy-arp address is set.

The qeth code used the local variable "int i" for two different purposes
(hash bucket index and output string length). This patch uses two
separate local variables of type "int" instead.
While touching these functions, hash_for_each_safe() is converted to
hash_for_each(), since no hash entries are removed.
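
The effect of sharing one counter can be sketched in user space. This is
only an illustration, not the driver code: the table, SKETCH_PAGE_SIZE,
SKETCH_BUCKETS and both function names are hypothetical, and a plain array
stands in for the kernel hash table.

```c
#include <stdio.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096	/* stand-in for the kernel's PAGE_SIZE */
#define SKETCH_BUCKETS   4	/* hypothetical number of hash buckets */

/* Hypothetical table: at most one address string per bucket. */
static const char *sketch_table[SKETCH_BUCKETS] = {
	"10.0.0.1", NULL, "10.0.0.2", "10.0.0.3",
};

/* Old pattern: "i" is both the bucket cursor and the output offset. */
static int sketch_show_shared(char *buf)
{
	int i;

	for (i = 0; i < SKETCH_BUCKETS; i++) {
		if (!sketch_table[i])
			continue;
		/* Adding the bytes written pushes i past the last bucket. */
		i += snprintf(buf + i, (size_t)(SKETCH_PAGE_SIZE - i),
			      "%s\n", sketch_table[i]);
	}
	return i;
}

/* Fixed pattern: "i" walks the buckets, "str_len" tracks the output. */
static int sketch_show_split(char *buf)
{
	int str_len = 0;
	int i;

	for (i = 0; i < SKETCH_BUCKETS; i++) {
		if (!sketch_table[i])
			continue;
		str_len += snprintf(buf + str_len,
				    (size_t)(SKETCH_PAGE_SIZE - str_len),
				    "%s\n", sketch_table[i]);
	}
	str_len += snprintf(buf + str_len,
			    (size_t)(SKETCH_PAGE_SIZE - str_len), "\n");
	return str_len;
}
```

With a zeroed buffer, the shared-counter variant stops after the first
address, while the split variant lists all three entries.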

Signed-off-by: Ursula Braun 
Reviewed-by: Julian Wiedmann 
Reference-ID: RQM 3524
---
 drivers/s390/net/qeth_l3_sys.c | 30 --
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/s390/net/qeth_l3_sys.c b/drivers/s390/net/qeth_l3_sys.c
index 3cd4d9f..05e9471 100644
--- a/drivers/s390/net/qeth_l3_sys.c
+++ b/drivers/s390/net/qeth_l3_sys.c
@@ -689,15 +689,15 @@ static ssize_t qeth_l3_dev_vipa_add_show(char *buf, struct qeth_card *card,
enum qeth_prot_versions proto)
 {
struct qeth_ipaddr *ipaddr;
-   struct hlist_node  *tmp;
char addr_str[40];
+   int str_len = 0;
int entry_len; /* length of 1 entry string, differs between v4 and v6 */
-   int i = 0;
+   int i;
 
entry_len = (proto == QETH_PROT_IPV4)? 12 : 40;
entry_len += 2; /* \n + terminator */
spin_lock_bh(&card->ip_lock);
-   hash_for_each_safe(card->ip_htable, i, tmp, ipaddr, hnode) {
+   hash_for_each(card->ip_htable, i, ipaddr, hnode) {
if (ipaddr->proto != proto)
continue;
if (ipaddr->type != QETH_IP_TYPE_VIPA)
@@ -705,16 +705,17 @@ static ssize_t qeth_l3_dev_vipa_add_show(char *buf, struct qeth_card *card,
/* String must not be longer than PAGE_SIZE. So we check if
 * string length gets near PAGE_SIZE. Then we can savely display
 * the next IPv6 address (worst case, compared to IPv4) */
-   if ((PAGE_SIZE - i) <= entry_len)
+   if ((PAGE_SIZE - str_len) <= entry_len)
break;
qeth_l3_ipaddr_to_string(proto, (const u8 *)&ipaddr->u,
addr_str);
-   i += snprintf(buf + i, PAGE_SIZE - i, "%s\n", addr_str);
+   str_len += snprintf(buf + str_len, PAGE_SIZE - str_len, "%s\n",
+   addr_str);
}
spin_unlock_bh(&card->ip_lock);
-   i += snprintf(buf + i, PAGE_SIZE - i, "\n");
+   str_len += snprintf(buf + str_len, PAGE_SIZE - str_len, "\n");
 
-   return i;
+   return str_len;
 }
 
 static ssize_t qeth_l3_dev_vipa_add4_show(struct device *dev,
@@ -851,15 +852,15 @@ static ssize_t qeth_l3_dev_rxip_add_show(char *buf, struct qeth_card *card,
   enum qeth_prot_versions proto)
 {
struct qeth_ipaddr *ipaddr;
-   struct hlist_node *tmp;
char addr_str[40];
+   int str_len = 0;
int entry_len; /* length of 1 entry string, differs between v4 and v6 */
-   int i = 0;
+   int i;
 
entry_len = (proto == QETH_PROT_IPV4)? 12 : 40;
entry_len += 2; /* \n + terminator */
spin_lock_bh(&card->ip_lock);
-   hash_for_each_safe(card->ip_htable, i, tmp, ipaddr, hnode) {
+   hash_for_each(card->ip_htable, i, ipaddr, hnode) {
if (ipaddr->proto != proto)
continue;
if (ipaddr->type != QETH_IP_TYPE_RXIP)
@@ -867,16 +868,17 @@ static ssize_t qeth_l3_dev_rxip_add_show(char *buf, struct qeth_card *card,
/* String must not be longer than PAGE_SIZE. So we check if
 * string length gets near PAGE_SIZE. Then we can savely display
 * the next IPv6 address (worst case, compared to IPv4) */
-   if ((PAGE_SIZE - i) <= entry_len)
+   if ((PAGE_SIZE - str_len) <= entry_len)
break;
qeth_l3_ipaddr_to_string(proto, (const u8 *)&ipaddr->u,
addr_str);
-   i += snprintf(buf + i, PAGE_SIZE - i, "%s\n", addr_str);
+   str_len += snprintf(buf + str_len, PAGE_SIZE - str_len, "%s\n",
+   addr_str);
}
spin_unlock_bh(&card->ip_lock);
-   i += snprintf(buf + i, PAGE_SIZE - i, "\n");
+   str_len += snprintf(buf + str_len, PAGE_SIZE - str_len, "\n");
 
-   return i;
+   return str_len;
 }
 
 static ssize_t qeth_l3_dev_rxip_add4_show(struct device *dev,
-- 
2.8.4

[PATCH RESEND net-next 09/12] s390/qeth: extract qeth_l2_remove_mac()

From: Julian Wiedmann 

This matches qeth_l2_write_mac().

Signed-off-by: Julian Wiedmann 
Reviewed-by: Thomas Richter 
---
 drivers/s390/net/qeth_l2_main.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 074fc62..d456740 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -200,16 +200,22 @@ static inline u32 qeth_l2_mac_hash(const u8 *addr)
 
 static int qeth_l2_write_mac(struct qeth_card *card, struct qeth_mac *mac)
 {
-
-   int rc;
-
if (mac->is_uc) {
-   rc = qeth_l2_send_setdelmac(card, mac->mac_addr,
+   return qeth_l2_send_setdelmac(card, mac->mac_addr,
IPA_CMD_SETVMAC);
} else {
-   rc = qeth_l2_send_setgroupmac(card, mac->mac_addr);
+   return qeth_l2_send_setgroupmac(card, mac->mac_addr);
+   }
+}
+
+static int qeth_l2_remove_mac(struct qeth_card *card, struct qeth_mac *mac)
+{
+   if (mac->is_uc) {
+   return qeth_l2_send_setdelmac(card, mac->mac_addr,
+   IPA_CMD_DELVMAC);
+   } else {
+   return qeth_l2_send_delgroupmac(card, mac->mac_addr);
}
-   return rc;
 }
 
 static void qeth_l2_del_all_macs(struct qeth_card *card)
@@ -782,14 +788,7 @@ static void qeth_l2_set_rx_mode(struct net_device *dev)
 
hash_for_each_safe(card->mac_htable, i, tmp, mac, hnode) {
if (mac->disp_flag == QETH_DISP_ADDR_DELETE) {
-   if (!mac->is_uc)
-   rc = qeth_l2_send_delgroupmac(card,
-   mac->mac_addr);
-   else {
-   rc = qeth_l2_send_setdelmac(card, mac->mac_addr,
-   IPA_CMD_DELVMAC);
-   }
-
+   qeth_l2_remove_mac(card, mac);
hash_del(&mac->hnode);
kfree(mac);
 
-- 
2.8.4

[PATCH RESEND net-next 00/12] s390: qeth patches

Hi Dave,

yesterday I came up with 13 qeth patches. Since you have not been
happy with the 13th patch, I want to make sure that at least the
remaining 12 qeth patches can be applied to net-next. Here is the
resend of them.

Thanks,
Ursula

Julian Wiedmann (8):
  s390/qeth: Allow reading hsuid in state DOWN
  s390/qeth: Remove QETH_IP_HEADER_SIZE
  s390/qeth: drop qeth_l2_del_all_macs() parameter
  s390/qeth: don't convert return code twice
  s390/qeth: consolidate errno translation
  s390/qeth: extract qeth_l2_remove_mac()
  s390/qeth: shuffle MAC management functions around
  s390/qeth: issue STARTLAN as first IPA command

Thomas Richter (3):
  s390/qeth: rework RX/TX checksum offload
  s390/qeth: test RX/TX checksum offload reply
  s390/qeth: display warning for OSA3 RX/TX checksum offloading

Ursula Braun (1):
  s390/qeth: fix retrieval of vipa and proxy-arp addresses

 drivers/s390/net/qeth_core.h  |   5 -
 drivers/s390/net/qeth_core_main.c | 135 ---
 drivers/s390/net/qeth_core_mpc.h  |  17 
 drivers/s390/net/qeth_l2_main.c   | 189 --
 drivers/s390/net/qeth_l3_main.c   |  15 ---
 drivers/s390/net/qeth_l3_sys.c|  33 ---
 6 files changed, 212 insertions(+), 182 deletions(-)

-- 
2.8.4

[PATCH RESEND net-next 08/12] s390/qeth: consolidate errno translation

From: Julian Wiedmann 

Consolidate errno handling for MAC management: Instead of doing this in every
caller, do it in one place.

Signed-off-by: Julian Wiedmann 
Reviewed-by: Thomas Richter 
Suggested-by: Ursula Braun 
---
 drivers/s390/net/qeth_l2_main.c | 20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 38fae10..074fc62 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -170,8 +170,7 @@ static int qeth_l2_send_setgroupmac(struct qeth_card *card, __u8 *mac)
int rc;
 
QETH_CARD_TEXT(card, 2, "L2Sgmac");
-   rc = qeth_setdel_makerc(card, qeth_l2_send_setdelmac(card, mac,
-   IPA_CMD_SETGMAC));
+   rc = qeth_l2_send_setdelmac(card, mac, IPA_CMD_SETGMAC);
if (rc == -EEXIST)
QETH_DBF_MESSAGE(2, "Group MAC %pM already existing on %s\n",
mac, QETH_CARD_IFNAME(card));
@@ -186,8 +185,7 @@ static int qeth_l2_send_delgroupmac(struct qeth_card *card, __u8 *mac)
int rc;
 
QETH_CARD_TEXT(card, 2, "L2Dgmac");
-   rc = qeth_setdel_makerc(card, qeth_l2_send_setdelmac(card, mac,
-   IPA_CMD_DELGMAC));
+   rc = qeth_l2_send_setdelmac(card, mac, IPA_CMD_DELGMAC);
if (rc)
QETH_DBF_MESSAGE(2,
"Could not delete group MAC %pM on %s: %d\n",
@@ -206,9 +204,8 @@ static int qeth_l2_write_mac(struct qeth_card *card, struct qeth_mac *mac)
int rc;
 
if (mac->is_uc) {
-   rc = qeth_setdel_makerc(card,
-   qeth_l2_send_setdelmac(card, mac->mac_addr,
-   IPA_CMD_SETVMAC));
+   rc = qeth_l2_send_setdelmac(card, mac->mac_addr,
+   IPA_CMD_SETVMAC);
} else {
rc = qeth_l2_send_setgroupmac(card, mac->mac_addr);
}
@@ -582,7 +579,8 @@ static int qeth_l2_send_setdelmac(struct qeth_card *card, __u8 *mac,
cmd = (struct qeth_ipa_cmd *)(iob->data+IPA_PDU_HEADER_SIZE);
cmd->data.setdelmac.mac_length = OSA_ADDR_LEN;
memcpy(&cmd->data.setdelmac.mac, mac, OSA_ADDR_LEN);
-   return qeth_send_ipa_cmd(card, iob, NULL, NULL);
+   return qeth_setdel_makerc(card, qeth_send_ipa_cmd(card, iob,
+   NULL, NULL));
 }
 
 static int qeth_l2_send_setmac(struct qeth_card *card, __u8 *mac)
@@ -590,8 +588,7 @@ static int qeth_l2_send_setmac(struct qeth_card *card, __u8 *mac)
int rc;
 
QETH_CARD_TEXT(card, 2, "L2Setmac");
-   rc = qeth_setdel_makerc(card, qeth_l2_send_setdelmac(card, mac,
-   IPA_CMD_SETVMAC));
+   rc = qeth_l2_send_setdelmac(card, mac, IPA_CMD_SETVMAC);
if (rc == 0) {
card->info.mac_bits |= QETH_LAYER2_MAC_REGISTERED;
memcpy(card->dev->dev_addr, mac, OSA_ADDR_LEN);
@@ -621,8 +618,7 @@ static int qeth_l2_send_delmac(struct qeth_card *card, __u8 *mac)
QETH_CARD_TEXT(card, 2, "L2Delmac");
if (!(card->info.mac_bits & QETH_LAYER2_MAC_REGISTERED))
return 0;
-   rc = qeth_setdel_makerc(card, qeth_l2_send_setdelmac(card, mac,
-   IPA_CMD_DELVMAC));
+   rc = qeth_l2_send_setdelmac(card, mac, IPA_CMD_DELVMAC);
if (rc == 0)
card->info.mac_bits &= ~QETH_LAYER2_MAC_REGISTERED;
return rc;
-- 
2.8.4

[PATCH RESEND net-next 04/12] s390/qeth: Allow reading hsuid in state DOWN

From: Julian Wiedmann 

Accessing the current hsuid via card->options.hsuid is perfectly
fine, even when the card is DOWN.

Signed-off-by: Julian Wiedmann 
Reviewed-by: Thomas Richter 
Acked-by: Ursula Braun 
---
 drivers/s390/net/qeth_l3_sys.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/s390/net/qeth_l3_sys.c b/drivers/s390/net/qeth_l3_sys.c
index 0e00a5c..3cd4d9f 100644
--- a/drivers/s390/net/qeth_l3_sys.c
+++ b/drivers/s390/net/qeth_l3_sys.c
@@ -250,9 +250,6 @@ static ssize_t qeth_l3_dev_hsuid_show(struct device *dev,
if (card->info.type != QETH_CARD_TYPE_IQD)
return -EPERM;
 
-   if (card->state == CARD_STATE_DOWN)
-   return -EPERM;
-
memcpy(tmp_hsuid, card->options.hsuid, sizeof(tmp_hsuid));
EBCASC(tmp_hsuid, 8);
return sprintf(buf, "%s\n", tmp_hsuid);
-- 
2.8.4

[PATCH RESEND net-next 05/12] s390/qeth: Remove QETH_IP_HEADER_SIZE

From: Julian Wiedmann 

Remove unused define QETH_IP_HEADER_SIZE.

Signed-off-by: Julian Wiedmann 
Reviewed-by: Thomas Richter 
Acked-by: Ursula Braun 
---
 drivers/s390/net/qeth_core.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h
index 41e4665..774ae51 100644
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -281,8 +281,6 @@ static inline int qeth_is_ipa_enabled(struct qeth_ipa_info *ipa,
 #define QETH_HIGH_WATERMARK_PACK 5
 #define QETH_WATERMARK_PACK_FUZZ 1
 
-#define QETH_IP_HEADER_SIZE 40
-
 /* large receive scatter gather copy break */
 #define QETH_RX_SG_CB (PAGE_SIZE >> 1)
 #define QETH_RX_PULL_LEN 256
-- 
2.8.4

[PATCH RESEND net-next 07/12] s390/qeth: don't convert return code twice

From: Julian Wiedmann 

qeth_l2_send_setgroupmac() already translates the return code, so
calling qeth_setdel_makerc() a second time only produces garbage.
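
Why a second translation is harmful can be shown with a small stand-alone
sketch. The device codes below are hypothetical placeholders (the real IPA
return codes are not part of this patch), and sketch_makerc() only imitates
the general shape of a "device code to errno" translator.

```c
#include <errno.h>

/* Hypothetical device return codes, chosen not to collide with errnos. */
#define SKETCH_RC_SUCCESS  0x0000
#define SKETCH_RC_DUP_MAC  0x2005
#define SKETCH_RC_NOT_AUTH 0x200b

/* Translate a device return code into a negative errno value. */
static int sketch_makerc(int retcode)
{
	switch (retcode) {
	case SKETCH_RC_SUCCESS:
		return 0;
	case SKETCH_RC_DUP_MAC:
		return -EEXIST;
	case SKETCH_RC_NOT_AUTH:
		return -EPERM;
	case -ENOMEM:
		/* Local allocation failures are passed through unchanged. */
		return -ENOMEM;
	default:
		/* Anything unknown, including an already translated errno. */
		return -EIO;
	}
}
```

Under these assumptions, sketch_makerc(SKETCH_RC_DUP_MAC) yields -EEXIST,
but feeding that result through sketch_makerc() again yields -EIO, so the
caller can no longer tell a duplicate MAC from a generic I/O error.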

Signed-off-by: Julian Wiedmann 
Reviewed-by: Thomas Richter 
Reviewed-by: Ursula Braun 
---
 drivers/s390/net/qeth_l2_main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 3025f56..38fae10 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -210,8 +210,7 @@ static int qeth_l2_write_mac(struct qeth_card *card, struct qeth_mac *mac)
qeth_l2_send_setdelmac(card, mac->mac_addr,
IPA_CMD_SETVMAC));
} else {
-   rc = qeth_setdel_makerc(card,
-   qeth_l2_send_setgroupmac(card, mac->mac_addr));
+   rc = qeth_l2_send_setgroupmac(card, mac->mac_addr);
}
return rc;
 }
-- 
2.8.4

[PATCH RESEND net-next 01/12] s390/qeth: rework RX/TX checksum offload

From: Thomas Richter 

Rework the RX/TX checksum offloading command sequence to use
the provided callback mechanism to return card data to the
device driver.

Signed-off-by: Thomas Richter 
Reviewed-by: Julian Wiedmann 
Reviewed-by: Ursula Braun 
---
 drivers/s390/net/qeth_core.h  |  2 -
 drivers/s390/net/qeth_core_main.c | 96 ++-
 drivers/s390/net/qeth_core_mpc.h  |  7 +++
 3 files changed, 72 insertions(+), 33 deletions(-)

diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h
index 6d4b68c4..41e4665 100644
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -674,8 +674,6 @@ struct qeth_card_info {
int broadcast_capable;
int unique_id;
struct qeth_card_blkt blkt;
-   __u32 csum_mask;
-   __u32 tx_csum_mask;
enum qeth_ipa_promisc_modes promisc_mode;
__u32 diagass_support;
__u32 hwtrap;
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index e335583..5ab80ea 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -5289,18 +5289,6 @@ int qeth_setassparms_cb(struct qeth_card *card,
if (cmd->hdr.prot_version == QETH_PROT_IPV6)
card->options.ipa6.enabled_funcs = cmd->hdr.ipa_enabled;
}
-   if (cmd->data.setassparms.hdr.assist_no == IPA_INBOUND_CHECKSUM &&
-   cmd->data.setassparms.hdr.command_code == IPA_CMD_ASS_START) {
-   card->info.csum_mask = cmd->data.setassparms.data.flags_32bit;
-   QETH_CARD_TEXT_(card, 3, "csum:%d", card->info.csum_mask);
-   }
-   if (cmd->data.setassparms.hdr.assist_no == IPA_OUTBOUND_CHECKSUM &&
-   cmd->data.setassparms.hdr.command_code == IPA_CMD_ASS_START) {
-   card->info.tx_csum_mask =
-   cmd->data.setassparms.data.flags_32bit;
-   QETH_CARD_TEXT_(card, 3, "tcsu:%d", card->info.tx_csum_mask);
-   }
-
return 0;
 }
 EXPORT_SYMBOL_GPL(qeth_setassparms_cb);
@@ -6060,23 +6048,78 @@ int qeth_core_ethtool_get_settings(struct net_device *netdev,
 }
 EXPORT_SYMBOL_GPL(qeth_core_ethtool_get_settings);
 
+/* Callback to handle checksum offload command reply from OSA card.
+ * Verify that required features have been enabled on the card.
+ * Return error in hdr->return_code as this value is checked by caller.
+ *
+ * Always returns zero to indicate no further messages from the OSA card.
+ */
+static int qeth_ipa_checksum_run_cmd_cb(struct qeth_card *card,
+   struct qeth_reply *reply,
+   unsigned long data)
+{
+   struct qeth_ipa_cmd *cmd = (struct qeth_ipa_cmd *) data;
+   struct qeth_checksum_cmd *chksum_cb =
+   (struct qeth_checksum_cmd *)reply->param;
+
+   QETH_CARD_TEXT(card, 4, "chkdoccb");
+   if (cmd->hdr.return_code)
+   return 0;
+
+   memset(chksum_cb, 0, sizeof(*chksum_cb));
+   if (cmd->data.setassparms.hdr.command_code == IPA_CMD_ASS_START) {
+   chksum_cb->supported =
+   cmd->data.setassparms.data.chksum.supported;
+   QETH_CARD_TEXT_(card, 3, "strt:%x", chksum_cb->supported);
+   }
+   if (cmd->data.setassparms.hdr.command_code == IPA_CMD_ASS_ENABLE) {
+   chksum_cb->supported =
+   cmd->data.setassparms.data.chksum.supported;
+   chksum_cb->enabled =
+   cmd->data.setassparms.data.chksum.enabled;
+   QETH_CARD_TEXT_(card, 3, "supp:%x", chksum_cb->supported);
+   QETH_CARD_TEXT_(card, 3, "enab:%x", chksum_cb->enabled);
+   }
+   return 0;
+}
+
+/* Send command to OSA card and check results. */
+static int qeth_ipa_checksum_run_cmd(struct qeth_card *card,
+enum qeth_ipa_funcs ipa_func,
+__u16 cmd_code, long data,
+struct qeth_checksum_cmd *chksum_cb)
+{
+   struct qeth_cmd_buffer *iob;
+   int rc = -ENOMEM;
+
+   QETH_CARD_TEXT(card, 4, "chkdocmd");
+   iob = qeth_get_setassparms_cmd(card, ipa_func, cmd_code,
+  sizeof(__u32), QETH_PROT_IPV4);
+   if (iob)
+   rc = qeth_send_setassparms(card, iob, sizeof(__u32), data,
+  qeth_ipa_checksum_run_cmd_cb,
+  chksum_cb);
+   return rc;
+}
+
 static int qeth_send_checksum_on(struct qeth_card *card, int cstype)
 {
-   long rxtx_arg;
+   struct qeth_checksum_cmd chksum_cb;
int rc;
 
-   rc = qeth_send_simple_setassparms(card, cstype, IPA_CMD_ASS_START, 0);
+   rc = qeth_ipa_checksum_run_cmd(card, cstype, IPA_CMD_ASS_START, 0,
+  &chksum_cb);

[PATCH net-next] ipmr: improve hash scalability

2017-01-12 Thread Nikolay Aleksandrov

Recently we started using ipmr with thousands of entries and easily hit
soft lockups on smaller devices. The reason is that the hash function
uses the high order bits from the src and dst, but those don't change in
many common cases. Also, the hash table has only 64 buckets, so with
thousands of entries it doesn't scale at all.
This patch migrates the hash table to rhashtable, and in particular the
rhl interface, which allows duplicate elements to be chained. That is
needed because of the MFC_PROXY support (*,G; *,*,oif cases), which allows
multiple duplicate entries to be added with different interfaces (IMO
wrong, but it's been in for a long time).

And here are some results from tests I've run in a VM:
 mr_table size (default, allocated for all namespaces):
  Before                 After
   49304 bytes           2400 bytes

 Add 65000 routes (the diff is much larger on smaller devices):
  Before                 After
   1m42s                 58s

 Forwarding 256 byte packets with 65000 routes (test done in a VM):
  Before                 After
   3 Mbps / ~1465 pps    122 Mbps / ~59000 pps

As a bonus we no longer see the soft lockups on smaller devices which
showed up even with 2000 entries before.
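
The collision problem described above can be reproduced with a toy hash.
This is a sketch under assumptions: sketch_hash() is not the kernel's
MFC_HASH macro, it merely looks only at the leading bits of each address
(taken in host byte order here), and SKETCH_LINES mirrors the old
64-entry table size.

```c
#include <stdint.h>

#define SKETCH_LINES 64	/* old table size, matching MFC_LINES in the diff */

/* Toy hash that only looks at the high-order bits of group and origin. */
static unsigned int sketch_hash(uint32_t group, uint32_t origin)
{
	return ((group >> 24) ^ (origin >> 26)) & (SKETCH_LINES - 1);
}

/* Count how many distinct buckets n consecutive groups from one source use. */
static unsigned int sketch_buckets_used(uint32_t first_group, uint32_t origin,
					unsigned int n)
{
	unsigned char used[SKETCH_LINES] = { 0 };
	unsigned int count = 0;
	unsigned int k;

	for (k = 0; k < n; k++) {
		unsigned int h = sketch_hash(first_group + k, origin);

		if (!used[h]) {
			used[h] = 1;
			count++;
		}
	}
	return count;
}
```

For example, 65000 consecutive groups starting at 239.0.0.0 (0xEF000000)
with source 10.0.0.1 (0x0A000001) all share the same leading octet, so this
toy hash puts every one of them into a single bucket and each lookup
degenerates into a linear list walk.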

Signed-off-by: Nikolay Aleksandrov 
---
 include/linux/mroute.h |  57 ---
 net/ipv4/ipmr.c| 255 +++--
 2 files changed, 182 insertions(+), 130 deletions(-)

diff --git a/include/linux/mroute.h b/include/linux/mroute.h
index f019b62f27b5..d7f63339ef0b 100644
--- a/include/linux/mroute.h
+++ b/include/linux/mroute.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -60,7 +61,6 @@ struct vif_device {
 #define VIFF_STATIC 0x8000
 
 #define VIF_EXISTS(_mrt, _idx) ((_mrt)->vif_table[_idx].dev != NULL)
-#define MFC_LINES 64
 
 struct mr_table {
struct list_head list;
@@ -69,8 +69,9 @@ struct mr_table {
struct sock __rcu   *mroute_sk;
struct timer_list   ipmr_expire_timer;
struct list_head mfc_unres_queue;
-   struct list_head mfc_cache_array[MFC_LINES];
struct vif_device   vif_table[MAXVIFS];
+   struct rhltable mfc_hash;
+   struct list_head mfc_cache_list;
int maxvif;
atomic_t cache_resolve_queue_len;
bool mroute_do_assert;
@@ -85,17 +86,48 @@ enum {
MFC_STATIC = BIT(0),
 };
 
+struct mfc_cache_cmp_arg {
+   __be32 mfc_mcastgrp;
+   __be32 mfc_origin;
+};
+
+/**
+ * struct mfc_cache - multicast routing entries
+ * @mnode: rhashtable list
+ * @mfc_mcastgrp: destination multicast group address
+ * @mfc_origin: source address
+ * @cmparg: used for rhashtable comparisons
+ * @mfc_parent: source interface (iif)
+ * @mfc_flags: entry flags
+ * @expires: unresolved entry expire time
+ * @unresolved: unresolved cached skbs
+ * @last_assert: time of last assert
+ * @minvif: minimum VIF id
+ * @maxvif: maximum VIF id
+ * @bytes: bytes that have passed for this entry
+ * @pkt: packets that have passed for this entry
+ * @wrong_if: number of wrong source interface hits
+ * @lastuse: time of last use of the group (traffic or update)
+ * @ttls: OIF TTL threshold array
+ * @list: global entry list
+ * @rcu: used for entry destruction
+ */
 struct mfc_cache {
-   struct list_head list;
-   __be32 mfc_mcastgrp;    /* Group the entry belongs to */
-   __be32 mfc_origin;      /* Source of packet */
-   vifi_t mfc_parent;      /* Source interface */
-   int mfc_flags;          /* Flags on line */
+   struct rhlist_head mnode;
+   union {
+   struct {
+   __be32 mfc_mcastgrp;
+   __be32 mfc_origin;
+   };
+   struct mfc_cache_cmp_arg cmparg;
+   };
+   vifi_t mfc_parent;
+   int mfc_flags;
 
union {
struct {
unsigned long expires;
-   struct sk_buff_head unresolved; /* Unresolved buffers */
+   struct sk_buff_head unresolved;
} unres;
struct {
unsigned long last_assert;
@@ -105,18 +137,13 @@ struct mfc_cache {
unsigned long pkt;
unsigned long wrong_if;
unsigned long lastuse;
-   unsigned char ttls[MAXVIFS];    /* TTL thresholds */
+   unsigned char ttls[MAXVIFS];
} res;
} mfc_un;
+   struct list_head list;
struct rcu_head rcu;
 };
 
-#ifdef __BIG_ENDIAN
-#define MFC_HASH(a,b)  (__force u32)(__be32)a)>>24)^(((__force u32)(__be32)b)>>26))&(MFC_LINES-1))
-#else
-#define MFC_HASH(a,b)  __force u32)(__be32)a)^(((__force

[PATCH RESEND net-next 03/12] s390/qeth: display warning for OSA3 RX/TX checksum offloading

From: Thomas Richter 

When RX/TX checksum offloading is turned on and the adapter is
an OSA 3 card in layer 3 mode, the checksum offloading is only
performed when both peers use different adapters. If both peers
share an OSA 3 card, communication is a memory copy and
checksum offloading is not performed.

This patch adds a warning to inform the administrator.

OSA 3 in layer 2 mode does not offer the RX/TX checksum
offload feature.
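
The decision logic can be sketched outside the driver as follows. The bit
values and all names are hypothetical: SKETCH_CSUM_LP2LP merely stands in
for QETH_IPA_CHECKSUM_LP2LP, whose real value is not shown in this patch,
and the warning is reduced to a flag instead of a dev_warn() call.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_CSUM_IP_HDR 0x0001u	/* hypothetical feature bits */
#define SKETCH_CSUM_TCP    0x0008u
#define SKETCH_CSUM_LP2LP  0x0020u	/* stand-in for the LP2LP capability */

/*
 * Return -EIO if a required feature is missing. Otherwise return 0 and
 * set *warn when inbound offload is enabled without LP2LP support.
 */
static int sketch_check_csum(uint32_t required, uint32_t supported,
			     int inbound, int *warn)
{
	*warn = 0;
	if ((required & supported) != required)
		return -EIO;
	if (!(supported & SKETCH_CSUM_LP2LP) && inbound)
		*warn = 1;
	return 0;
}
```

In this sketch the warning is purely informational: offload stays enabled,
and only a missing required feature turns into an error.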

Signed-off-by: Thomas Richter 
Reviewed-by: Julian Wiedmann 
Reviewed-by: Ursula Braun 
---
 drivers/s390/net/qeth_core_main.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 49b813f..ca8309f 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -6116,6 +6116,11 @@ static int qeth_send_checksum_on(struct qeth_card *card, int cstype)
if ((required_features & chksum_cb.supported) !=
required_features)
rc = -EIO;
+   else if (!(QETH_IPA_CHECKSUM_LP2LP & chksum_cb.supported) &&
+cstype == IPA_INBOUND_CHECKSUM)
+   dev_warn(&card->gdev->dev,
+ "Hardware checksumming is performed only if %s and its peer use different OSA Express 3 ports\n",
+QETH_CARD_IFNAME(card));
}
if (rc) {
qeth_send_simple_setassparms(card, cstype, IPA_CMD_ASS_STOP, 0);
-- 
2.8.4

Re: [PATCH net-next 0/2] net/smc: fix typo and clc-bug

From: Ursula Braun 
Date: Thu, 12 Jan 2017 14:57:13 +0100

> I received 2 bug reports for my new AF_SMC-code. Here are the fixes for them.

Series applied, thanks.

Re: [PATCH net-next v2 0/2] More flexible BPF cb access

From: Daniel Borkmann 
Date: Thu, 12 Jan 2017 11:51:31 +0100

> This patch improves BPF's cb access by allowing b/h/w/dw
> access variants on it. For details, please see individual
> patches.

Series applied, thanks.

Re: [PATCH net-next] lwt_bpf: bpf_lwt_prog_cmp() can be static

From: Wei Yongjun 
Date: Thu, 12 Jan 2017 14:39:28 +

> From: Wei Yongjun 
> 
> Fixes the following sparse warning:
> 
> net/core/lwt_bpf.c:355:5: warning:
>  symbol 'bpf_lwt_prog_cmp' was not declared. Should it be static?
> 
> Signed-off-by: Wei Yongjun 

Applied.

Re: [PATCH RESEND net-next 00/12] s390: qeth patches

From: Ursula Braun 
Date: Thu, 12 Jan 2017 15:48:31 +0100

> yesterday I came up with 13 qeth patches. Since you have not been
> happy with the 13th patch, I want to make sure that at least the
> remaining 12 qeth patches can be applied to net-next. Here is the
> resend of them.

Series applied.

Re: Setting link down or up in software

2017-01-12 Thread Mason

On 12/01/2017 14:05, Mason wrote:

> I'm wondering what are the semantics of calling
> 
>   ip link set dev eth0 down
> 
> I was expecting that to somehow instruct the device's ethernet driver
> to shut everything down, have the PHY tell the peer that it's going
> away, maybe even put the PHY in some low-power mode, etc.
> 
> But it doesn't seem to be doing any of that on my HW.
> 
> So what exactly is it supposed to do?
> 
> 
> And on top of that, I am seeing random occurrences of
> 
>   nb8800 26000.ethernet eth0: Link is Down
> 
> Sometimes it is printed immediately.
> Sometimes it is printed as soon as I run "ip link set dev eth0 up" (?!)
> Sometimes it is not printed at all.
> 
> I find this erratic behavior very confusing.
> 
> Is it the symptom of some deeper bug?

Here's an example of "Link is Down" printed when I set link up:

At [   62.750220] I run ip link set dev eth0 down
Then leave the system idle for 10 minutes.
At [  646.263041] I run ip link set dev eth0 up
At [  647.364079] it prints "Link is Down"
At [  649.417434] it prints "Link is Up - 1Gbps/Full - flow control rx/tx"

I think whether I set up the PHY to use interrupts or polling
does have an influence on the weirdness I observe.

AFAICT, changing the interface flags is done in dev_change_flags
which calls __dev_change_flags and __dev_notify_flags

Is one of these supposed to call the device driver through a
callback at some point?

How/when is the phy_state_machine notified of the change in
interface flags?

Regards.

Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables

2017-01-12 Thread David Ahern

On 1/9/17 7:01 PM, Lorenzo Colitti wrote:
> On Sun, Jan 8, 2017 at 1:24 PM, David Ahern  wrote:
>> Why not use the VRF capability then? create a VRF and assign the interface 
>> to it. End result is the same -- separate tables and the need to use a 
>> bind-to-device API to hit those routes.
> 
> Requiring that VRFs for this creates additional complexity, because
> each network now requires its own VRF. That means that the connection
> manager must create the VRF before the interface comes up and receives
> the RA.
> 
> In some cases this might not be possible. For example, consider a tun
> interface that's created by a different process such as a VPN client.
> In this case the connection manager doesn't know the interface name,
> and the VPN client doesn't know to create the VRF, so if the tun
> interface gets an RA after the tun is created but

Have you looked at adding basic l3mdev capabilities to tun? In this case
only l3mdev_fib_table needs to be implemented. On interface create, push
down a table id and set the IFF_L3MDEV_MASTER flag.

Correct method for initializing Pause and Asymmetrical Pause support in phy drivers

2017-01-12 Thread Marc Bertola

Hello netdev list,

I am currently investigating a problem related to Ethernet
auto-negotiation of Pause and Asymmetrical Pause capabilities.

TL;DR: I am using a Picozed system-on-module with a Xilinx Gigabit
Ethernet MAC and a Marvell PHY. It does not appear to be advertising
support for Pause and Asym Pause, which seems strange to me given that
this is relatively recent hardware. I suspect that may be due to a
problem in the way phydev->supported is initialized in
drivers/net/phy/marvell.c.

I am trying to confirm what the proper method is to initialize
phydev->supported such that it advertises SUPPORTED_Pause and
SUPPORTED_Asym_Pause. Adding these flags to (phy_driver).features
seems to work, but I would like to confirm with people who are more
knowledgeable than me in this regard.

Read on for details about what I have observed and tried so far.

# The System #

The application I am working on uses Avnet's Picozed 7020
System-on-Module (SOM), which contains:
* An on-chip MAC (on a Xilinx Zynq 7000 chip)
* A Marvell 1512 Alaska PHY.
* A daughtercard that provides the actual RJ45 connector.

The Zynq is running a Xilinx fork of Linux. I am working with the
following drivers:

The MAC is the built-in Gigabit Ethernet MAC on a Xilinx Zynq 7000
chip. It uses the xemacps.c driver, which can be found on Xilinx's
official Linux fork:
* RAW:
https://raw.githubusercontent.com/Xilinx/linux-xlnx/master/drivers/net/ethernet/xilinx/xilinx_emacps.c
* GITHUB:
https://github.com/Xilinx/linux-xlnx/blob/master/drivers/net/ethernet/xilinx/xilinx_emacps.c

The PHY is a Marvell 1512 Alaska device that comes on the Picozed 7020
SOM at the heart of the application I am working on.
The version of the driver I am using for this PHY can be found here
(note that this version is slightly different from, and possibly older
than, the one in the mainline Linux repo):
* RAW:
https://raw.githubusercontent.com/Xilinx/linux-xlnx/master/drivers/net/phy/marvell.c
* GITHUB:
https://github.com/Xilinx/linux-xlnx/blob/master/drivers/net/phy/marvell.c

# The Problem: No Flow Control via Auto-Negotiation #

I have noticed that when I connect to the Picozed using a PC, the
connection speed and duplex are negotiated correctly, and the signal
integrity is good. However, the Picozed does not advertise the Pause or
Asymmetrical Pause capabilities during auto-negotiation, so flow
control ends up disabled. I verified this by dumping the
Link Partner capabilities PHY register on my PC.

If I connect another regular PC or a smart switch (on which I have
enabled flow control) to my PC, flow control capability is reported
and thus Pause frames are enabled.

Based on the Zynq's Technical Reference Manual, the MAC supports Pause
frames and Asymmetrical Pause, so I am working with the assumption
that these features should be advertised, contrary to what I am
seeing.

I spent most of the day looking at the PHY abstraction layer, the
marvell driver, and the xemacps driver to figure out where the missing
flow control capability information needed to be added. I also looked
at the phy.txt where I found the following:

"Now just make sure that phydev->supported and phydev->advertising
have any values pruned from them which don't make sense for your
controller (a 10/100 controller may be connected to a gigabit capable
PHY, so you would need to mask off SUPPORTED_1000baseT*). See
include/linux/ethtool.h for definitions for these bitfields. Note that
you should not SET any bits, or the PHY may get put into an
unsupported state."

Source:
https://raw.githubusercontent.com/Xilinx/linux-xlnx/master/Documentation/networking/phy.txt

So my understanding is as follows:

1. The PHY driver sets all of the flags for the capabilities it
supports in phydev->supported.
2. The MAC driver then prunes the capabilities it does not support
from phydev->supported to see what can be safely advertised.

In xemacps.c, the following code appears to be performing this
pruning, by removing all capabilities other than PHY_GBIT_FEATURES and
the Flow Control capability bits.

    phydev->supported &= (PHY_GBIT_FEATURES | SUPPORTED_Pause |
                          SUPPORTED_Asym_Pause);

So, if the PHY had advertised that it supported Flow Control / Pause
Frames, these capabilities would have been preserved. However, with
some variable dumping to dmesg, I can see that SUPPORTED_Pause and
SUPPORTED_Asym_Pause are not present in phydev->supported. What I
observe is that phydev->supported has a value of 0x02ff, and we would
expect bits 13 and 14 to be set, resulting in 0x62ff.
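
The arithmetic behind those two values can be sketched as follows. The
SUPPORTED_* bit positions are the legacy ones from
include/uapi/linux/ethtool.h, and PHY_GBIT_FEATURES is composed the way
include/linux/phy.h did at the time, as far as I can tell; mac_prune()
is a hypothetical stand-alone copy of the mask quoted above, not a
function from the driver.

```c
#include <stdint.h>

/* Legacy link-mode bits, as in include/uapi/linux/ethtool.h. */
#define SUPPORTED_10baseT_Half    (1u << 0)
#define SUPPORTED_10baseT_Full    (1u << 1)
#define SUPPORTED_100baseT_Half   (1u << 2)
#define SUPPORTED_100baseT_Full   (1u << 3)
#define SUPPORTED_1000baseT_Half  (1u << 4)
#define SUPPORTED_1000baseT_Full  (1u << 5)
#define SUPPORTED_Autoneg         (1u << 6)
#define SUPPORTED_TP              (1u << 7)
#define SUPPORTED_MII             (1u << 9)
#define SUPPORTED_Pause           (1u << 13)
#define SUPPORTED_Asym_Pause      (1u << 14)

/* Assumed composition of the feature sets (include/linux/phy.h). */
#define PHY_BASIC_FEATURES (SUPPORTED_10baseT_Half | SUPPORTED_10baseT_Full | \
                            SUPPORTED_100baseT_Half | SUPPORTED_100baseT_Full | \
                            SUPPORTED_Autoneg | SUPPORTED_TP | SUPPORTED_MII)
#define PHY_GBIT_FEATURES  (PHY_BASIC_FEATURES | \
                            SUPPORTED_1000baseT_Half | SUPPORTED_1000baseT_Full)

/* Hypothetical helper mirroring the MAC driver's mask: it can only clear
 * bits, so pause bits missing on input stay missing on output. */
static uint32_t mac_prune(uint32_t phy_supported)
{
    return phy_supported &
           (PHY_GBIT_FEATURES | SUPPORTED_Pause | SUPPORTED_Asym_Pause);
}
```

Under these definitions PHY_GBIT_FEATURES works out to 0x02ff, and the
mask only yields 0x62ff when the PHY driver has already set bits 13 and
14 in its input.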

I dug a bit deeper to see how the PHY driver populates the value of
phydev->supported before it gets passed to the MAC for pruning. I
found that it comes from the .features field in the phy_driver structs
defined at the bottom of the marvell.c file. In the version I am
using, this field only contains PHY_GBIT_FEATURES, which explains why
the SUPPORTED_Pause and SUPPORTED_Asym_Pause flags are not set.

{
.phy_id = MARVELL_PHY_ID_88E1510,
.phy_id_mask

Re: [PATCH] [net] net/mlx5e: fix another -Wmaybe-uninitialized warning

2017-01-12 Thread Or Gerlitz


On 1/11/2017 11:14 PM, Arnd Bergmann wrote:

> As found by Olof's build bot, today's mainline kernel gained a harmless
> warning about a potential uninitialized variable reference:
>
> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function 'parse_tc_fdb_actions':
> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:769:13: warning: 'out_dev' may be used uninitialized in this function [-Wmaybe-uninitialized]
> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:811:21: note: 'out_dev' was declared here
>
> This was introduced through the addition of an 'IS_ERR/PTR_ERR' pair that
> gcc is unfortunately unable to completely figure out. Replacing it with
> PTR_ERR_OR_ZERO makes the code more understandable to gcc so it no longer
> warns.


can you elaborate on this a little further?


> Hadar Hen Zion already attempted to fix the warning earlier by adding
> fake initializations, but that ended up just making the code worse without
> fully addressing all warnings, so I'm reverting it now that it is no longer
> needed.


OK, so if your approach eliminates the warning on out_dev and also on
the variables for which Hadar added the fake initializers, I guess we
should be fine with this change (I saw your reply to my other comment).
Just one more question:



> In order to avoid pulling a variable declaration into the #ifdef, I'm
> removing it in favor of a more readable 'if()' statement here that has the same
> effect.


When I build here without CONFIG_INET in my system, the build goes fine
with this approach. However, we're pretty sure that in the past we got a
0-day report from the kbuild test robot complaining that we made the
ip_route_output_key call without wrapping it in that #if
IS_ENABLED(CONFIG_INET) -- so, we don't want to go there again... thoughts?


Or.

Re: [PATCH net] ravb: Remove Rx overflow log messages

From: Simon Horman 
Date: Thu, 12 Jan 2017 13:21:06 +0100

> From: Kazuya Mizuguchi 
> 
> Remove Rx overflow log messages, as in an environment where logging results
> in network traffic, logging may cause further overflows.
> 
> Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
> Signed-off-by: Kazuya Mizuguchi 
> [simon: reworked changelog]
> Signed-off-by: Simon Horman 
> Acked-by: Sergei Shtylyov 

Applied, thanks.

Re: [PATCH v2 2/2] stmmac: rename it to synopsys


I don't understand at all why it is so important to change the name of
these files nor the directory they live in.

What bona fide benefit will users receive if we do this?

The only clear part is the downside, which is that it is going to make
it painful to browse source history and backport bug fixes.

Please, let's not do this.

Thanks.

Re: [PATCH] synopsys: remove dwc_eth_qos driver