[Patch v1] ax88179_178a: add reset functionality in reset_resume
From d178065c9e3cfa8a45ef537fae7412775339beb0 Mon Sep 17 00:00:00 2001
From: Vivek Kumar Bhagat
Date: Thu, 11 Jun 2015 07:23:46 -0700
Subject: [PATCH] ax88179_178a: add reset functionality in reset_resume

Without reset functionality in reset_resume, an iperf connection cannot
be established after suspend/resume, although ping still works at the
same time. tcpdump shows the iperf connection failing with a wrong
checksum error. Calling the reset function inside reset_resume fixes
this bug.

We have verified this issue on ASIX-based ST Lab and Cadyce dongles.

Signed-off-by: Vivek Kumar Bhagat
Signed-off-by: Praveen Kumar
---
 drivers/net/usb/ax88179_178a.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
index e6338c1..00928c0 100644
--- a/drivers/net/usb/ax88179_178a.c
+++ b/drivers/net/usb/ax88179_178a.c
@@ -1630,6 +1630,18 @@ static int ax88179_stop(struct usbnet *dev)
 	return 0;
 }
 
+static int ax88179_reset_resume(struct usb_interface *intf)
+{
+	struct usbnet *dev = usb_get_intfdata(intf);
+	int ret;
+
+	ret = ax88179_reset(dev);
+	if (ret < 0)
+		return ret;
+
+	return ax88179_resume(intf);
+}
+
 static const struct driver_info ax88179_info = {
 	.description = "ASIX AX88179 USB 3.0 Gigabit Ethernet",
 	.bind = ax88179_bind,
@@ -1744,7 +1756,7 @@ static struct usb_driver ax88179_178a_driver = {
 	.probe = usbnet_probe,
 	.suspend = ax88179_suspend,
 	.resume = ax88179_resume,
-	.reset_resume = ax88179_resume,
+	.reset_resume = ax88179_reset_resume,
 	.disconnect = usbnet_disconnect,
 	.supports_autosuspend = 1,
 	.disable_hub_initiated_lpm = 1,
-- 
1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
Re: [PATCH net-next 0/3 v5] changes to make ipv4 routing table aware of next-hop link status
On Thu, Jun 18, 2015 at 8:22 AM, Andy Gospodarek wrote:
> This series adds the ability to have the Linux kernel track whether or
> not a particular route should be used based on the link-status of the
> interface associated with the next-hop.

[cut]

> Andy Gospodarek (3):
>   net: track link-status of ipv4 nexthops
>   net: ipv4 sysctl option to ignore routes when nexthop link is down
>   iproute2: add support to print 'linkdown' nexthop flag

On the series:

Acked-by: Scott Feldman
Re: [PATCH net-next 0/3 v5] changes to make ipv4 routing table aware of next-hop link status
On Thu, Jun 18, 2015 at 12:39 PM, Andy Gospodarek wrote: > On Thu, Jun 18, 2015 at 10:51:37AM -0700, Scott Feldman wrote: >> On Thu, Jun 18, 2015 at 8:22 AM, Andy Gospodarek >> wrote: >> > This series adds the ability to have the Linux kernel track whether or >> > not a particular route should be used based on the link-status of the >> > interface associated with the next-hop. >> > >> > Before this patch any link-failure on an interface that was serving as a >> > gateway for some systems could result in those systems being isolated >> > from the rest of the network as the stack would continue to attempt to >> > send frames out of an interface that is actually linked-down. When the >> > kernel is responsible for all forwarding, it should also be responsible >> > for taking action when the traffic can no longer be forwarded -- there >> > is no real need to outsource link-monitoring to userspace anymore. >> > >> > This feature is only enabled with the new per-interface or ipv4 global >> > sysctls called 'ignore_routes_with_linkdown'. >> > >> > net.ipv4.conf.all.ignore_routes_with_linkdown = 0 >> > net.ipv4.conf.default.ignore_routes_with_linkdown = 0 >> > net.ipv4.conf.lo.ignore_routes_with_linkdown = 0 >> > ... >> > >> > When the above sysctls are set, the kernel will not only report to >> > userspace that the link is down, but it will also report to userspace >> > that a route is dead. This will signal to userspace that the route will >> > not be selected. 
>> > >> > With the new sysctls set, the following behavior can be observed >> > (interface p8p1 is link-down): >> > >> > # ip route show >> > default via 10.0.5.2 dev p9p1 >> > 10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15 >> > 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1 >> > 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 dead linkdown >> > 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 dead linkdown >> > 90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2 >> > # ip route get 90.0.0.1 >> > 90.0.0.1 via 70.0.0.2 dev p7p1 src 70.0.0.1 >> > cache >> > # ip route get 80.0.0.1 >> > local 80.0.0.1 dev lo src 80.0.0.1 >> > cache >> > # ip route get 80.0.0.2 >> > 80.0.0.2 via 10.0.5.2 dev p9p1 src 10.0.5.15 >> > cache >> > >> > While the route does remain in the table (so it can be modified if >> > needed rather than being wiped away as it would be if IFF_UP was >> > cleared), the proper next-hop is chosen automatically when the link is >> > down. Now interface p8p1 is linked-up: >> > >> > # ip route show >> > default via 10.0.5.2 dev p9p1 >> > 10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15 >> > 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1 >> > 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 >> > 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 >> > 90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2 >> > 192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2 >> > # ip route get 90.0.0.1 >> > 90.0.0.1 via 80.0.0.2 dev p8p1 src 80.0.0.1 >> > cache >> > # ip route get 80.0.0.1 >> > local 80.0.0.1 dev lo src 80.0.0.1 >> > cache >> > # ip route get 80.0.0.2 >> > 80.0.0.2 dev p8p1 src 80.0.0.1 >> > cache >> > >> > and the output changes to what one would expect. 
>> > >> > If the global or interface sysctl is not set, the following output would be >> > expected when p8p1 is down: >> > >> > # ip route show >> > default via 10.0.5.2 dev p9p1 >> > 10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15 >> > 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1 >> > 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 linkdown >> > 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 linkdown >> > 90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2 >> > >> > If the dead flag does not appear there should be no expectation that the >> > kernel would skip using this route due to link being down. >> > >> > v2: Split kernel changes into 2 patches: first to add linkdown flag and >> > second to add new sysctl settings. Also took suggestion from Alex to >> > simplify code by only checking sysctl during fib lookup and suggestion >> > from Scott to add a per-interface sysctl. Added iproute2 patch to >> > recognize and print linkdown flag. >> > >> > v3: Code cleanups along with reverse-path checks suggested by Alex and >> > small fixes related to problems found when multipath was disabled. >> > >> > v4: Drop binary sysctls >> > >> > v5: Whitespace and variable declaration fixups suggested by Dave >> > >> > Though there were some that preferred not to have a configuration option >> > and to make this behavior the default when it was discussed in Ottawa >> > earlier this year since "it was time to do this." I wanted to propose >> > the config option to preserve the current behavior for those that desire >> > it. I'll happily remove it if Dave and Linus approve. >> > >> > An IPv6 implementation is also needed (DECnet too!), but I wanted to start >> > with >> > t
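
The behaviour described in the quoted cover letter can be illustrated with a small stand-alone sketch. This is not the kernel's FIB lookup code: `struct demo_nexthop`, `demo_select_nexthop()` and the `ignore_linkdown` flag are hypothetical names that only model the effect of `ignore_routes_with_linkdown` on the `90.0.0.0/24` example, where the metric 1 route via p8p1 is skipped while its link is down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified next-hop entry; not the kernel's struct fib_nh. */
struct demo_nexthop {
	const char *dev;	/* outgoing interface name */
	int metric;		/* lower metric is preferred */
	bool linkdown;		/* true when the interface has lost its link */
};

/*
 * Return the index of the lowest-metric usable next-hop, or -1 if none.
 * When ignore_linkdown is true (modelling the sysctl being set),
 * next-hops whose link is down are treated as dead and skipped.
 * When it is false, link state is reported but does not affect selection.
 */
static int demo_select_nexthop(const struct demo_nexthop *nh, size_t n,
			       bool ignore_linkdown)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (ignore_linkdown && nh[i].linkdown)
			continue;
		if (best < 0 || nh[i].metric < nh[best].metric)
			best = (int)i;
	}
	return best;
}
```

With p8p1 down and the flag set, the sketch picks the metric 2 next-hop via p7p1; with the flag clear it still picks p8p1, which matches the note that without the dead flag the kernel is not expected to skip the route.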
Re: [PATCH] net: inet_diag: export IPV6_V6ONLY sockopt
On Fri, Jun 19, 2015 at 06:52:00AM -0700, Eric Dumazet wrote:
> On Fri, 2015-06-19 at 14:15 +0200, Phil Sutter wrote:
> > For AF_INET6 sockets, the value of struct ipv6_pinfo.ipv6only is
> > exported to userspace. It indicates whether an unbound socket listens on
> > IPv4 as well as IPv6.
>
> What is an 'unbound socket' ??? This makes no sense to me here.

Indeed, this is just plain wrong. I actually meant "not bound to a
specific IPv6 address".

> > Since the socket is natively IPv6, it is not
> > listed by e.g. 'netstat -l -4'.
>
> netstat does not use this interface. iproute2/ss does.

I just used this as a simple example to illustrate the problem, but
doing the same with 'ss' is indeed the better choice.

[...]

> 1) This certainly should not compile on current linux trees.
>    Always submit such patches on net-next.

It cleanly applies to net.git.

> 2) It is not clear why we would add this attribute if it is 0.
>    This looks a waste of data.
>
> So I would rather use :

ACK.

Thanks for reviewing, v2 follows after I've tested it.

Cheers, Phil
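
For readers unfamiliar with the flag under discussion, here is a minimal userspace sketch of how IPV6_V6ONLY is set and read back on a socket with the standard sockets API. It does not use inet_diag and is not part of the patch; the helper name `demo_v6only_roundtrip()` is made up for illustration.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create an AF_INET6 TCP socket, set IPV6_V6ONLY to `on`, and read the
 * option back. Returns the value read, or -1 if any call fails (for
 * example on a host without IPv6 support).
 */
static int demo_v6only_roundtrip(int on)
{
	int val = -1;
	socklen_t len = sizeof(val);
	int fd = socket(AF_INET6, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on)) < 0 ||
	    getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &val, &len) < 0)
		val = -1;
	close(fd);
	return val;
}
```

A socket with the option cleared and bound to the IPv6 wildcard address also accepts IPv4 connections, which is why a tool listing only AF_INET sockets would miss it.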
Re: [PATCHv2 net-next] net: fec: Ensure clocks are enabled while using mdio bus
On 06/20/15 09:15, Andrew Lunn wrote:
> When a switch is attached to the mdio bus, the mdio bus can be used
> while the interface is not open. If the IPG clock are not enabled,
> MDIO reads/writes will simply time out. So enable the clock before
> starting a transaction, and disable it afterwards. The CCF performs
> reference counting so the clock will only be disabled if there are no
> other users.
>
> Signed-off-by: Andrew Lunn
> ---
> v2:
> Only enable/disable the IPG clock.
>
>  drivers/net/ethernet/freescale/fec_main.c | 21 +++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/fec_main.c
> b/drivers/net/ethernet/freescale/fec_main.c
> index bf4cf3fbb5f2..2b8a043a573c 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -65,6 +65,7 @@
>
>  static void set_multicast_list(struct net_device *ndev);
>  static void fec_enet_itr_coal_init(struct net_device *ndev);
> +static int fec_enet_clk_enable(struct net_device *ndev, bool enable);

You do not seem to be using this, unrelated change?
--
Florian
Re: displayed name changed in ip link show for bridge- and other interfaces
On 06/17/2015 09:26 AM, Nicolas Dichtel wrote:
> On 16/06/2015 19:35, Oliver Hartkopp wrote:
>> On 15.06.2015 17:54, Stephen Hemminger wrote:
>>> On Mon, 15 Jun 2015 11:13:12 +0200
>>> Nicolas Dichtel wrote:
>>>
>>>> Theoretically, virtual interfaces should advertise an IFLA_LINK to 0.
>>>> I don't know what is the best fix:
>>>>  - patching iproute2 to avoid this '@NONE'
>>>>  - patching the kernel (see below).
>>>
>>>
>>> Sorry this is an ABI change. The kernel has to go back
>>> to doing the same thing as before.
>>>
>>
>> Isn't this too late right now at 4.1-rc8 stage???
>>
>> At least the patch suggested for br_device.c at
>>
>> http://marc.info/?l=linux-netdev&m=143435960111768&w=2
>>
>> would been necessary in all networking drivers, right?
>>
>> I currently see this @NONE stuff with virtual CAN devices too.
> Another solution is to revert e1622baf54df ("dev: set iflink to 0 for virtual
> interfaces") and add a ndo_get_iflink handler which returns 0 for all virtual
> interfaces that had this IFLA_LINK set to 0 before the series.
> But it's not consistent between virtual interfaces.

I have no good suggestion, as I don't know whether it makes a difference
for the ABI how 'ip' is finally made to omit the '@NONE' output.

E.g. virtual CAN interfaces (vcan.c) now print this @NONE and they never
have a (physical?) link. So you probably have to deal with different
virtual interfaces anyway, right?

Regards,
Oliver
Re: [PATCH] can: fix loss of frames due to wrong assumption in raw_rcv
Hello Manfred,

On 06/20/2015 07:21 PM, Manfred Schlaegl wrote:
> I've detected a massive loss of can frames on i.MX6 using flexcan
> driver with 4.1-rc8 and tracked this down to following commit:
> 514ac99c64b22d83b52dfee3b8becaa69a92bc4a - "can: fix multiple delivery
> of a single CAN frame for overlapping CAN filters"

thanks for detecting this issue!

> 514ac99c64b22d83b52dfee3b8becaa69a92bc4a introduces a frame equality
> check. Since the sk_buff pointer is not sufficient to do this (buffers
> are reused), the check also compares time stamps.
> In short: pointer+time stamp was assumed as unique key to a specific
> frame.
> The problem with this is, that the time stamp is an optional property
> and not set per default.
> In our case (flexcan) the time stamp is always zero, so the equality
> check is reduced to equality of buffer pointers, resulting in a lot of
> dropped frames.

The question is why your system did not generate a timestamp at the time
of skb reception. Usually when netif_rx() or netif_rx_ni() is invoked,
the timestamp is set in the following reception process. flexcan.c only
uses netif_receive_skb() - but all these functions set the timestamp

	net_timestamp_check(netdev_tstamp_prequeue, skb);

depending on netdev_tstamp_prequeue, which is configured by

	/proc/sys/net/core/netdev_tstamp_prequeue

See the idea of netdev_tstamp_prequeue here:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/net/core/sysctl_net_core.c?id=3b098e2d7c693796cc4dffb07caa249fc0f70771

Can you tell me the output of /proc/sys/net/core/netdev_tstamp_prequeue
on your machine? If it's not '1' can you set it to '1' for a test?

>
> Possible solutions I thought of:
> 1. Every driver has to set a time stamp
>    (possibly error prone and hard to enforce?)
> 2. Change the equality check
> 3. Fulfil the requirements of the equality check by setting a
>    time stamp per default.
>
> This patch fixes the problem with solution 3. A time stamp is set at
> time of allocation in alloc_can_skb.

That's a feasible way if we don't find a better way to make sure the
timestamps are generally set before the skb is processed in the NET_RX
softirq.

> The time stamp may be overridden later, but the function of the equality
> check is ensured.
>
> I'm not really deep in linux network subsystem, so there may exists
> more elegant solutions for the problem.
>
> Signed-off-by: Manfred Schlaegl
> ---
>  drivers/net/can/dev.c |1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
> index b0f6924..282e2e7 100644
> --- a/drivers/net/can/dev.c
> +++ b/drivers/net/can/dev.c
> @@ -575,6 +575,7 @@ struct sk_buff *alloc_can_skb(struct net_device *dev,
> struct can_frame **cf)
>  	if (unlikely(!skb))
>  		return NULL;
>
> +	__net_timestamp(skb);
>  	skb->protocol = htons(ETH_P_CAN);
>  	skb->pkt_type = PACKET_BROADCAST;
>  	skb->ip_summed = CHECKSUM_UNNECESSARY;
>

Please check the netdev_tstamp_prequeue value first.

If we need solution 3, the __net_timestamp(skb) call should be placed
in alloc_canfd_skb() too.

Thanks again for your investigation!

Best regards,
Oliver
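
The failure mode described in this thread can be reproduced in a few lines of plain C. The sketch below is not the code from raw_rcv(): `struct demo_skb`, `struct demo_uniq` and `demo_is_duplicate()` are hypothetical stand-ins that only model the assumption that (buffer pointer, time stamp) uniquely identifies a frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a receive buffer; not the kernel's struct sk_buff. */
struct demo_skb {
	int64_t tstamp;		/* receive time stamp, 0 if it was never set */
};

/* Hypothetical per-socket record of the frame delivered last. */
struct demo_uniq {
	const struct demo_skb *skb;	/* buffer pointer of that frame */
	int64_t tstamp;			/* time stamp of that frame */
};

/*
 * Return true if the frame looks like the one already delivered, using
 * (buffer pointer, time stamp) as the key. Otherwise remember it and
 * return false.
 */
static bool demo_is_duplicate(struct demo_uniq *u, const struct demo_skb *skb)
{
	if (u->skb == skb && u->tstamp == skb->tstamp)
		return true;
	u->skb = skb;
	u->tstamp = skb->tstamp;
	return false;
}
```

If a buffer is reused for a new frame while its time stamp stays at zero, the second call returns true and the new frame would be dropped. Once every frame carries a distinct time stamp, the same reused pointer no longer matches.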
[PATCH v2 1/1] net: fs_enet: Fix NETIF_F_SG feature for Freescale MPC5121
Commit 4fc9b87bae25 ("net: fs_enet: Implement NETIF_F_SG feature") brings a trouble to Freescale MPC512x: a kernel oops happens during sending non-linear sk_buff with .data not aligned by 4. Log quotation: Unable to handle kernel paging request for data at address 0xe467c000 Faulting instruction address: 0xc000cd44 Oops: Kernel access of bad area, sig: 11 [#1] MPC512x generic Modules linked in: CPU: 0 PID: 984 Comm: kworker/0:1H Not tainted 4.1.0-rc8-00024-gbb16140 #2 Workqueue: rpciod rpc_async_schedule task: cf364a50 ti: cf362000 task.ti: cf362000 NIP: c000cd44 LR: c000c720 CTR: 0206 REGS: cf363ac0 TRAP: 0300 Not tainted (4.1.0-rc8-00024-gbb16140) MSR: 9032 CR: 42004082 XER: DAR: e467c000 DSISR: 2000 GPR00: c0279e24 cf363b70 cf364a50 e467c000 0206 001f 0001 0001 GPR08: e467c000 e46800be 000139a6 82002082 c002e46c cf3c3680 GPR16: c044cb30 c04b cf363c48 0001 fde0315c 000b GPR24: 002c 40be cf339aa0 000b 0001 cf873210 00282f85 NIP [c000cd44] clean_dcache_range+0x1c/0x30 LR [c000c720] dma_direct_map_page+0x40/0x94 Call Trace: [cf363b70] [cf339b60] 0xcf339b60 (unreliable) [cf363b90] [c0279e24] fs_enet_start_xmit+0x1c8/0x42c [cf363bd0] [c02ff710] dev_hard_start_xmit+0x2dc/0x3d4 [cf363c40] [c0319c60] sch_direct_xmit+0xcc/0x1cc [cf363c70] [c02ff9c0] __dev_queue_xmit+0x1b8/0x47c [cf363ca0] [c032a3e8] ip_finish_output+0x1fc/0x9a8 [cf363ce0] [c032c31c] ip_send_skb+0x1c/0xa4 [cf363cf0] [c035112c] udp_send_skb+0xe4/0x2e8 [cf363d10] [c0351368] udp_push_pending_frames+0x38/0x84 [cf363d20] [c03537b8] udp_sendpage+0x134/0x174 [cf363d70] [c0384fd4] xs_sendpages+0x21c/0x250 [cf363db0] [c03852bc] xs_udp_send_request+0x50/0xf8 [cf363de0] [c0382f08] xprt_transmit+0x64/0x280 [cf363e20] [c038017c] call_transmit+0x168/0x234 [cf363e40] [c0387918] __rpc_execute+0x88/0x2b0 [cf363e80] [c00296f8] process_one_work+0x124/0x2fc [cf363ea0] [c0029a00] worker_thread+0x130/0x480 [cf363ef0] [c002e528] kthread+0xbc/0xd0 [cf363f40] [c000e4a8] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7c70faa6 
60630800 7c70fba6 4c00012c 4e800020 38a0001f 7c632878 7c832050 7c842a14 5484d97f 4d820020 7c8903a6 <7c00186c> 38630020 4200fff8 7c0004ac ---[ end trace c846c1eceb513c85 ]--- The reason: MPC5121 FEC requires 4-byte alignment for TX data buffer and calls tx_skb_align_workaround() for copying sk_buff with not aligned .data to a new sk_buff with aligned one. But tx_skb_align_workaround() uses skb_copy_from_linear_data() which doesn't work for non-linear sk_buff: a new sk_buff has non-zero nr_frags and zero .data_len. So improve the condition of calling tx_skb_align_workaround() and use skb_linearize() in it. Signed-off-by: Alexander Popov --- .../net/ethernet/freescale/fs_enet/fs_enet-main.c | 26 +++--- 1 file changed, 23 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c index 9b3639e..d92802b 100644 --- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c +++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c @@ -490,6 +490,9 @@ static struct sk_buff *tx_skb_align_workaround(struct net_device *dev, { struct sk_buff *new_skb; + if (skb_linearize(skb)) + return NULL; + /* Alloc new skb */ new_skb = netdev_alloc_skb(dev, skb->len + 4); if (!new_skb) @@ -515,12 +518,27 @@ static int fs_enet_start_xmit(struct sk_buff *skb, struct net_device *dev) cbd_t __iomem *bdp; int curidx; u16 sc; - int nr_frags = skb_shinfo(skb)->nr_frags; + int nr_frags; skb_frag_t *frag; int len; - #ifdef CONFIG_FS_ENET_MPC5121_FEC - if (((unsigned long)skb->data) & 0x3) { + int is_aligned = 1; + int i; + + if (!IS_ALIGNED((unsigned long)skb->data, 4)) { + is_aligned = 0; + } else { + nr_frags = skb_shinfo(skb)->nr_frags; + frag = skb_shinfo(skb)->frags; + for (i = 0; i < nr_frags; i++, frag++) { + if (!IS_ALIGNED(frag->page_offset, 4)) { + is_aligned = 0; + break; + } + } + } + + if (!is_aligned) { skb = tx_skb_align_workaround(dev, skb); if (!skb) { /* @@ -532,6 +550,7 @@ static int 
fs_enet_start_xmit(struct sk_buff *skb, struct net_device *dev) } } #endif + spin_lock(&fep->tx_lock); /* @@ -539,6 +558,7 @@ static int fs_enet_start_xmit(struct sk_buff *skb, struct net_device *dev) */ bdp = fep->cur_tx; + nr_frags = skb_shinfo(skb)->nr_frags; if (fep->tx_free <= nr_frags || (CBDR_SC(bdp) & BD_ENET_TX_READY)) { netif_stop_queue(dev); spin_unlock(&fe
Re: [PATCH 00/12] Netfilter updates for net-next
From: Pablo Neira Ayuso
Date: Fri, 19 Jun 2015 19:17:37 +0200

> The following patchset contains a final Netfilter pull request for net-next
> 4.2. This mostly addresses some fallout from the previous pull request, small
> netns updates and a couple of new features for nfnetlink_log and the socket
> match that didn't get in time for the previous pull request. More specifically
> they are:
 ...
> You can pull these changes from:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Pulled, thanks a lot Pablo.
[PATCH] dsa: mv88x6xxx: Zero statistics counters
Zero the statistics counters when setting up the global registers.
Otherwise the counters will remain from the last boot if the power has
not been removed.

Signed-off-by: Andrew Lunn
---

This patch will only cleanly apply after the debug series. There is no
actual dependency, so applying the patch with some fuzz will allow it
to be applied without the debug series.

 drivers/net/dsa/mv88e6xxx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index cb6c2711d6ea..fc73d809c292 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2061,6 +2061,12 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
 			  0x9000 | (i << 8));
 	}
 
+	/* Clear the statistics counters for all ports */
+	REG_WRITE(REG_GLOBAL, GLOBAL_STATS_OP, GLOBAL_STATS_OP_FLUSH_ALL);
+
+	/* Wait for the flush to complete. */
+	_mv88e6xxx_stats_wait(ds);
+
 	return 0;
 }
 
-- 
2.1.4
Re: [PATCH net] netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook
On Sat, Jun 20, 2015 at 01:32:48PM +0200, Patrick McHardy wrote:
> On 20.06, Pablo Neira Ayuso wrote:
> > On Fri, Jun 19, 2015 at 02:03:39PM -0500, Eric W. Biederman wrote:
> > >
> > > Add code to nf_unregister_hook to flush the nf_queue when a hook is
> > > unregistered. This guarantees that the pointer that the nf_queue code
> > > retains into the nf_hook list will remain valid while a packet is
> > > queued.
> >
> > I think the real problem is that struct nf_queue_entry holds a pointer
> > to struct nf_hook_ops, which will be gone after removal. So you
> > uncovered a long standing problem that will amplify by when pernet
> > hooks are in place.
> >
> > Regarding the pointer to nf_hook_list, now that new netdevice variant
> > doesn't support nf_queue yet, so that nf_hook_list will be always
> > valid since it will point to the global nf_hooks in the core.
>
> I think Eric's patch is the right thing to do. I'm not sure I get
> your netdev comment, but we certainly do want to drop packets once
> a hook is gone.

I agree this patch is fine, of course.

> > > +{
> > > +	const struct nf_queue_handler *qh;
> > > +	struct net *net;
> > > +
> > > +	rtnl_lock();
> >
> > Why rtnl_lock() here?
>
> for_each_net(). Would actually be nice to have a variant that doesn't
> need the rtnl since it makes locking order analysis a lot harder.

OK, thanks for explaining.
Re: [PATCH net] netfilter: nf_queue: Don't recompute the hook_list head
On Sat, Jun 20, 2015 at 09:08:20AM -0500, Eric W. Biederman wrote:
> Pablo Neira Ayuso writes:
>
> > On Fri, Jun 19, 2015 at 05:23:37PM -0500, Eric W. Biederman wrote:
> >>
> >> If someone sends packets from one of the netdevice ingress hooks to
> >> the a userspace queue, and then userspace later accepts the packet,
> >> the netfilter code can enter an infinite loop as the list head will
> >> never be found.
> >>
> >> Pass in the saved list_head to avoid this.
> >
> > There is no userspace queueing for netdevice yet, so this can be route
> > through nf-next. Thanks.
>
> *scratches head* the netdevice queueing is in the netfilter core.
>
> netfilter_ingress calls nf_hook_slow. The queuing happens in
> nf_hook_slow if anything returns the verdict queue it.
>
> This patch applies to Linus's tree.
>
> So how in the world does this not need to be ported to 4.1?

There is no nfnetlink_queue support for the netdev family at this
moment, so this can't be triggered unless you use an out-of-tree
module.

I have a patch here that uses a static key to disable userspace
queueing per family, so that part would be basically inactive.

But if you really want to see this in 4.1, no problem, please just let
me know and I'll pass it to David. As I said, it's basically not
resolving any urgent problem, so waiting does no harm.

Thank you.
Re: [PATCH net-next 2/3] ipv4: L3 and L4 hash-based multipath routing
On Thu, 18 Jun 2015 15:52:22 -0700 Alexander Duyck wrote: > > > On 06/17/2015 01:08 PM, Peter Nørlund wrote: > > This patch adds L3 and L4 hash-based multipath routing, selectable > > on a per-route basis with the reintroduced RTA_MP_ALGO attribute. > > The default is now RT_MP_ALG_L3_HASH. > > > > Signed-off-by: Peter Nørlund > > --- > > include/net/ip_fib.h | 4 ++- > > include/net/route.h| 5 ++-- > > include/uapi/linux/rtnetlink.h | 14 ++- > > net/ipv4/fib_frontend.c| 4 +++ > > net/ipv4/fib_semantics.c | 34 ++--- > > net/ipv4/icmp.c| 4 +-- > > net/ipv4/route.c | 56 > > +++--- > > net/ipv4/xfrm4_policy.c| 2 +- 8 files changed, 103 > > insertions(+), 20 deletions(-) > > > > diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h > > index 4be4f25..250d98e 100644 > > --- a/include/net/ip_fib.h > > +++ b/include/net/ip_fib.h > > @@ -37,6 +37,7 @@ struct fib_config { > > u32 fc_flags; > > u32 fc_priority; > > __be32 fc_prefsrc; > > + int fc_mp_alg; > > struct nlattr *fc_mx; > > struct rtnexthop*fc_mp; > > int fc_mx_len; > > @@ -116,6 +117,7 @@ struct fib_info { > > int fib_nhs; > > #ifdef CONFIG_IP_ROUTE_MULTIPATH > > int fib_mp_weight; > > + int fib_mp_alg; > > #endif > > struct rcu_head rcu; > > struct fib_nh fib_nh[0]; > > @@ -308,7 +310,7 @@ int ip_fib_check_default(__be32 gw, struct > > net_device *dev); int fib_sync_down_dev(struct net_device *dev, int > > force); int fib_sync_down_addr(struct net *net, __be32 local); > > int fib_sync_up(struct net_device *dev); > > -void fib_select_multipath(struct fib_result *res); > > +void fib_select_multipath(struct fib_result *res, const struct > > flowi4 *flow); > > > > /* Exported by fib_trie.c */ > > void fib_trie_init(void); > > diff --git a/include/net/route.h b/include/net/route.h > > index fe22d03..1fc7deb 100644 > > --- a/include/net/route.h > > +++ b/include/net/route.h > > @@ -110,7 +110,8 @@ struct in_device; > > int ip_rt_init(void); > > void rt_cache_flush(struct net *net); > > void rt_flush_dev(struct 
net_device *dev); > > -struct rtable *__ip_route_output_key(struct net *, struct flowi4 > > *flp); +struct rtable *__ip_route_output_key(struct net *, struct > > flowi4 *flp, > > +const struct flowi4 *mp_flow); > > struct rtable *ip_route_output_flow(struct net *, struct flowi4 > > *flp, struct sock *sk); > > struct dst_entry *ipv4_blackhole_route(struct net *net, > > @@ -267,7 +268,7 @@ static inline struct rtable > > *ip_route_connect(struct flowi4 *fl4, sport, dport, sk); > > > > if (!dst || !src) { > > - rt = __ip_route_output_key(net, fl4); > > + rt = __ip_route_output_key(net, fl4, NULL); > > if (IS_ERR(rt)) > > return rt; > > ip_rt_put(rt); > > diff --git a/include/uapi/linux/rtnetlink.h > > b/include/uapi/linux/rtnetlink.h index 17fb02f..dff4a72 100644 > > --- a/include/uapi/linux/rtnetlink.h > > +++ b/include/uapi/linux/rtnetlink.h > > @@ -271,6 +271,18 @@ enum rt_scope_t { > > #define RTM_F_EQUALIZE0x400 /* Multipath > > equalizer: NI */ #define RTM_F_PREFIX > > 0x800 /* Prefix addresses */ > > > > +/* Multipath algorithms */ > > + > > +enum rt_mp_alg_t { > > + RT_MP_ALG_L3_HASH, /* Was IP_MP_ALG_NONE */ > > + RT_MP_ALG_PER_PACKET, /* Was IP_MP_ALG_RR */ > > + RT_MP_ALG_DRR, /* not used */ > > + RT_MP_ALG_RANDOM, /* not used */ > > + RT_MP_ALG_WRANDOM, /* not used */ > > + RT_MP_ALG_L4_HASH, > > + __RT_MP_ALG_MAX > > +}; > > + > > /* Reserved table identifiers */ > > > > enum rt_class_t { > > @@ -301,7 +313,7 @@ enum rtattr_type_t { > > RTA_FLOW, > > RTA_CACHEINFO, > > RTA_SESSION, /* no longer used */ > > - RTA_MP_ALGO, /* no longer used */ > > + RTA_MP_ALGO, > > RTA_TABLE, > > RTA_MARK, > > RTA_MFC_STATS, > > diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c > > index 872494e..376e8c1 100644 > > --- a/net/ipv4/fib_frontend.c > > +++ b/net/ipv4/fib_frontend.c > > @@ -590,6 +590,7 @@ const struct nla_policy rtm_ipv4_policy[RTA_MAX > > + 1] = { [RTA_PREFSRC] = { .type = NLA_U32 }, > > [RTA_METRICS] = { .type = NLA_NESTED }, > > [RTA_MULTIPATH] 
= { .len = sizeof(struct > > rtnexthop) }, > > + [RTA_MP_ALGO] = { .type = NLA_U32 }, > > [RTA_FLOW] = { .type = NLA_U32 }, > > }; > > > > @@ -650,6 +651,9 @@ static int rtm_to_fib_config(struct net *net, > > struct sk_buff *skb, cfg->fc_mp = nla_data(attr); > > cfg->fc_mp_len = nla_
[PATCHv3 net-next] net: fec: Ensure clocks are enabled while using mdio bus
When a switch is attached to the mdio bus, the mdio bus can be used
while the interface is not open. If the IPG clock is not enabled,
MDIO reads/writes will simply time out. So enable the clock before
starting a transaction, and disable it afterwards. The CCF performs
reference counting so the clock will only be disabled if there are no
other users.

Signed-off-by: Andrew Lunn
---
v3:
Return the error code from clk_prepare_enable()

v2:
Only enable the IPG clock.

 drivers/net/ethernet/freescale/fec_main.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index bf4cf3fbb5f2..8d9b1fd175f7 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -65,6 +65,7 @@
 
 static void set_multicast_list(struct net_device *ndev);
 static void fec_enet_itr_coal_init(struct net_device *ndev);
+static int fec_enet_clk_enable(struct net_device *ndev, bool enable);
 
 #define DRIVER_NAME	"fec"
 
@@ -1764,6 +1765,11 @@ static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 {
 	struct fec_enet_private *fep = bus->priv;
 	unsigned long time_left;
+	int ret;
+
+	ret = clk_prepare_enable(fep->clk_ipg);
+	if (ret)
+		return ret;
 
 	fep->mii_timeout = 0;
 	init_completion(&fep->mdio_done);
@@ -1779,11 +1785,14 @@ static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 	if (time_left == 0) {
 		fep->mii_timeout = 1;
 		netdev_err(fep->netdev, "MDIO read timeout\n");
+		clk_disable_unprepare(fep->clk_ipg);
 		return -ETIMEDOUT;
 	}
 
-	/* return value */
-	return FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
+	ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
+	clk_disable_unprepare(fep->clk_ipg);
+
+	return ret;
 }
 
 static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
@@ -1791,10 +1800,15 @@ static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
 {
 	struct fec_enet_private *fep = bus->priv;
 	unsigned long time_left;
+	int ret;
 
 	fep->mii_timeout = 0;
 	init_completion(&fep->mdio_done);
 
+	ret = clk_prepare_enable(fep->clk_ipg);
+	if (ret)
+		return ret;
+
 	/* start a write op */
 	writel(FEC_MMFR_ST | FEC_MMFR_OP_WRITE |
 		FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
@@ -1807,9 +1821,12 @@ static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
 	if (time_left == 0) {
 		fep->mii_timeout = 1;
 		netdev_err(fep->netdev, "MDIO write timeout\n");
+		clk_disable_unprepare(fep->clk_ipg);
 		return -ETIMEDOUT;
 	}
 
+	clk_disable_unprepare(fep->clk_ipg);
+
 	return 0;
 }
 
-- 
2.1.4
Re: [PATCH net-next 1/3] ipv4: Lock-less per-packet multipath
On Thu, 18 Jun 2015 12:42:05 -0700 Alexander Duyck wrote: > On 06/17/2015 01:08 PM, Peter Nørlund wrote: > > The current multipath attempted to be quasi random, but in most > > cases it behaved just like a round robin balancing. This patch > > refactors the algorithm to be exactly that and in doing so, avoids > > the spin lock. > > > > The new design paves the way for hash-based multipath, replacing the > > modulo with thresholds, minimizing disruption in case of failing > > paths or route replacements. > > > > Signed-off-by: Peter Nørlund > > --- > > include/net/ip_fib.h | 6 +-- > > net/ipv4/Kconfig | 1 + > > net/ipv4/fib_semantics.c | 116 > > ++- 3 files changed, 68 > > insertions(+), 55 deletions(-) > > > > diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h > > index 54271ed..4be4f25 100644 > > --- a/include/net/ip_fib.h > > +++ b/include/net/ip_fib.h > > @@ -76,8 +76,8 @@ struct fib_nh { > > unsigned intnh_flags; > > unsigned char nh_scope; > > #ifdef CONFIG_IP_ROUTE_MULTIPATH > > - int nh_weight; > > - int nh_power; > > + int nh_mp_weight; > > + atomic_tnh_mp_upper_bound; > > #endif > > #ifdef CONFIG_IP_ROUTE_CLASSID > > __u32 nh_tclassid; > > @@ -115,7 +115,7 @@ struct fib_info { > > #define fib_advmss fib_metrics[RTAX_ADVMSS-1] > > int fib_nhs; > > #ifdef CONFIG_IP_ROUTE_MULTIPATH > > - int fib_power; > > + int fib_mp_weight; > > #endif > > struct rcu_head rcu; > > struct fib_nh fib_nh[0]; > > I could do without some of this renaming. For example you could > probably not bother with adding the _mp piece to the name. That way > we don't have to track all the nh_weight -> nh_mp_weight changes. > Also you could probably just use the name fib_weight since not > including the _mp was already the convention for the multipath > portions of the structure anyway. > > This isn't really improving readability at all so I would say don't > bother renaming it. > Good point. I'll skip the renaming. 
> > diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig > > index d83071d..cb91f67 100644 > > --- a/net/ipv4/Kconfig > > +++ b/net/ipv4/Kconfig > > @@ -81,6 +81,7 @@ config IP_MULTIPLE_TABLES > > config IP_ROUTE_MULTIPATH > > bool "IP: equal cost multipath" > > depends on IP_ADVANCED_ROUTER > > + select BITREVERSE > > help > > Normally, the routing tables specify a single action to > > be taken in a deterministic manner for a given packet. If you say Y > > here diff --git a/net/ipv4/fib_semantics.c > > b/net/ipv4/fib_semantics.c index 28ec3c1..8c8df80 100644 > > --- a/net/ipv4/fib_semantics.c > > +++ b/net/ipv4/fib_semantics.c > > @@ -15,6 +15,7 @@ > > > > #include > > #include > > +#include > > #include > > #include > > #include > > @@ -57,7 +58,7 @@ static struct hlist_head > > fib_info_devhash[DEVINDEX_HASHSIZE]; > > > > #ifdef CONFIG_IP_ROUTE_MULTIPATH > > > > -static DEFINE_SPINLOCK(fib_multipath_lock); > > +static DEFINE_PER_CPU(u8, fib_mp_rr_counter); > > > > #define for_nexthops(fi) > > { \ int nhsel; const > > struct fib_nh *nh; \ @@ -261,7 > > +262,7 @@ static inline int nh_comp(const struct fib_info *fi, > > const struct fib_info *ofi) nh->nh_gw != onh->nh_gw || > > nh->nh_scope != onh->nh_scope || #ifdef CONFIG_IP_ROUTE_MULTIPATH > > - nh->nh_weight != onh->nh_weight || > > + nh->nh_mp_weight != onh->nh_mp_weight || > > #endif > > #ifdef CONFIG_IP_ROUTE_CLASSID > > nh->nh_tclassid != onh->nh_tclassid || > > @@ -449,6 +450,43 @@ static int fib_count_nexthops(struct rtnexthop > > *rtnh, int remaining) return remaining > 0 ? 0 : nhs; > > } > > > > This is a good example. If we don't do the rename we don't have to > review changes like the one above which just add extra overhead to > the patch. > Right. 
> > +static void fib_rebalance(struct fib_info *fi) > > +{ > > + int factor; > > + int total; > > + int w; > > + > > + if (fi->fib_nhs < 2) > > + return; > > + > > + total = 0; > > + for_nexthops(fi) { > > + if (!(nh->nh_flags & RTNH_F_DEAD)) > > + total += nh->nh_mp_weight; > > + } endfor_nexthops(fi); > > + > > + if (likely(total != 0)) { > > + factor = DIV_ROUND_UP(total, 8388608); > > + total /= factor; > > + } else { > > + factor = 1; > > + } > > + > > So where does the 8388608 value come from? Is it just here to help > restrict the upper_bound to a u8 value? > Yes. Although I think it is rare for the weight to be that large, the API supports it, so I'd better make sure nothing weird happens. Or is it too hypothetical? Today, if one were
Re: linux-next: build warnings after merge of the net-next tree
On Sat, Jun 20, 2015 at 07:40:03PM +0200, Florian Westphal wrote: [...] > > Introduced by commit: > > > > 71ae0dff02d7 ("netfilter: xtables: use percpu rule counters") > > Yes, sorry about this, should be fixed by dcb8f5c8139ef945cdfd > ("netfilter: xtables: fix warnings on 32bit platforms"). There's a pending pull to address this fallout: http://patchwork.ozlabs.org/patch/486819/ Thanks.
Re: [PATCH 00/32] Netfilter updates for net-next
On Sat, Jun 20, 2015 at 03:11:30PM +0200, Jakub Kiciński wrote: > On Mon, 15 Jun 2015 23:25:57 +0200, Pablo Neira Ayuso wrote: > > Hi David, > > > > This is a bit large (and late) patchset that contains Netfilter updates for > > net-next. Most relevantly br_netfilter fixes, ipset RCU support, removal of > > x_tables percpu ruleset copy and rework of the nf_tables netdev support. > > More > > specifically, they are: > [...] > > > > Bernhard Thaler (7): > > netfilter: bridge: refactor clearing BRNF_NF_BRIDGE_PREROUTING > > netfilter: bridge: re-order br_nf_pre_routing_finish_ipv6() > > netfilter: bridge: detect NAT66 correctly and change MAC address > > netfilter: bridge: refactor frag_max_size > > netfilter: bridge: rename br_parse_ip_options > > netfilter: bridge: re-order check_hbh_len() > > netfilter: bridge: forward IPv6 fragmented packets > > Pablo, Bernhard, > > this batch breaks builds with CONFIG_IPV6=n. No idea why build bot > didn't catch that. There is a pending pull request to address this fallout: http://patchwork.ozlabs.org/patch/486819/ Thanks.
Re: [PATCHv2 net-next] net: fec: Ensure clocks are enabled while using mdio bus
On Sat, Jun 20, 2015 at 02:47:20PM -0300, Fabio Estevam wrote: > On Sat, Jun 20, 2015 at 1:15 PM, Andrew Lunn wrote: > > > @@ -1764,6 +1765,11 @@ static int fec_enet_mdio_read(struct mii_bus *bus, > > int mii_id, int regnum) > > { > > struct fec_enet_private *fep = bus->priv; > > unsigned long time_left; > > + int ret; > > + > > + ret = clk_prepare_enable(fep->clk_ipg); > > + if (ret) > > + return 0x; > > Why don't you return ret instead? ret would also work. v3 to follow soon. Andrew
[Patch] ax88179_178a: add reset functionality in reset_resume
Dear All, The attached patch fixes an iperf connection problem after reset-resume of a USB-to-Ethernet dongle. Without the reset, ping still works after reset-resume, but the iperf connection fails and tcpdump reports a wrong-checksum error. Thanks, Vivek Attachment: 0001-ax88179_178a-add-reset-function-in-reset_resume.patch
Re: [PATCHv2 net-next] net: fec: Ensure clocks are enabled while using mdio bus
On Sat, Jun 20, 2015 at 1:15 PM, Andrew Lunn wrote: > @@ -1764,6 +1765,11 @@ static int fec_enet_mdio_read(struct mii_bus *bus, int > mii_id, int regnum) > { > struct fec_enet_private *fep = bus->priv; > unsigned long time_left; > + int ret; > + > + ret = clk_prepare_enable(fep->clk_ipg); > + if (ret) > + return 0x; Why don't you return ret instead?
Re: linux-next: build warnings after merge of the net-next tree
Stephen Rothwell wrote: > After merging the net-next tree, today's linux-next build (i386 defconfig) > produced these warnings: > > In file included from include/net/netfilter/nf_conntrack_tuple.h:13:0, > from include/linux/netfilter/nf_conntrack_dccp.h:28, > from include/net/netfilter/nf_conntrack.h:22, > from net/netfilter/nf_conntrack_core.c:37: > include/linux/netfilter/x_tables.h: In function 'xt_percpu_counter_alloc': > include/linux/netfilter/x_tables.h:373:10: warning: cast from pointer to > integer of different size [-Wpointer-to-int-cast] >return (__force u64) res; > ^ > include/linux/netfilter/x_tables.h: In function 'xt_percpu_counter_free': > include/linux/netfilter/x_tables.h:381:15: warning: cast to pointer from > integer of different size [-Wint-to-pointer-cast] >free_percpu((void __percpu *) pcnt); >^ > In file included from include/asm-generic/percpu.h:6:0, > from arch/x86/include/asm/percpu.h:551, > from arch/x86/include/asm/preempt.h:5, > from include/linux/preempt.h:64, > from include/linux/spinlock.h:50, > from include/linux/mm_types.h:8, > from include/linux/kmemcheck.h:4, > from include/linux/skbuff.h:18, > from include/linux/netfilter.h:5, > from net/netfilter/nf_conntrack_core.c:16: > include/linux/netfilter/x_tables.h: In function 'xt_get_this_cpu_counter': > include/linux/netfilter/x_tables.h:388:23: warning: cast to pointer from > integer of different size [-Wint-to-pointer-cast] >return this_cpu_ptr((void __percpu *) cnt->pcnt); >^ > include/linux/percpu-defs.h:206:47: note: in definition of macro > '__verify_pcpu_ptr' > const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL; \ >^ > include/linux/percpu-defs.h:239:27: note: in expansion of macro 'raw_cpu_ptr' > #define this_cpu_ptr(ptr) raw_cpu_ptr(ptr) >^ > include/linux/netfilter/x_tables.h:388:10: note: in expansion of macro > 'this_cpu_ptr' >return this_cpu_ptr((void __percpu *) cnt->pcnt); > ^ > > and many more. 
> > Introduced by commit: > > 71ae0dff02d7 ("netfilter: xtables: use percpu rule counters") Yes, sorry about this, should be fixed by dcb8f5c8139ef945cdfd ("netfilter: xtables: fix warnings on 32bit platforms"). Thanks, Florian
[PATCH] can: fix loss of frames due to wrong assumption in raw_rcv
I've detected a massive loss of CAN frames on i.MX6 using the flexcan driver with 4.1-rc8 and tracked it down to the following commit: 514ac99c64b22d83b52dfee3b8becaa69a92bc4a - "can: fix multiple delivery of a single CAN frame for overlapping CAN filters" That commit introduces a frame equality check. Since the sk_buff pointer alone is not sufficient for this (buffers are reused), the check also compares time stamps. In short: pointer + time stamp was assumed to be a unique key for a specific frame. The problem is that the time stamp is an optional property and is not set by default. In our case (flexcan) the time stamp is always zero, so the equality check is reduced to equality of buffer pointers, resulting in a lot of dropped frames. Possible solutions I thought of: 1. Every driver has to set a time stamp (possibly error-prone and hard to enforce?) 2. Change the equality check 3. Fulfil the requirements of the equality check by setting a time stamp by default. This patch fixes the problem with solution 3. A time stamp is set at allocation time in alloc_can_skb. The time stamp may be overridden later, but the equality check keeps working. I'm not deeply familiar with the Linux network subsystem, so there may exist more elegant solutions to the problem. Signed-off-by: Manfred Schlaegl --- drivers/net/can/dev.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index b0f6924..282e2e7 100644 --- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -575,6 +575,7 @@ struct sk_buff *alloc_can_skb(struct net_device *dev, struct can_frame **cf) if (unlikely(!skb)) return NULL; + __net_timestamp(skb); skb->protocol = htons(ETH_P_CAN); skb->pkt_type = PACKET_BROADCAST; skb->ip_summed = CHECKSUM_UNNECESSARY; -- 1.7.10.4
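To make the failure mode described in this report concrete, here is a small user-space sketch of a "pointer + time stamp" duplicate check. It is an illustration only: `demo_frame`, `demo_last_seen` and `demo_is_duplicate` are hypothetical names, and the logic is a simplified model of the check as the report describes it, not the code in raw_rcv.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a receive buffer; only the time stamp matters here. */
struct demo_frame {
	int64_t tstamp;
};

/* What the receiver remembers about the last frame it delivered. */
struct demo_last_seen {
	const struct demo_frame *skb;
	int64_t tstamp;
};

/*
 * Return true if the frame looks like the one delivered last (same buffer
 * address and same time stamp); otherwise remember it and return false.
 */
bool demo_is_duplicate(struct demo_last_seen *last, const struct demo_frame *f)
{
	if (last->skb == f && last->tstamp == f->tstamp)
		return true;

	last->skb = f;
	last->tstamp = f->tstamp;
	return false;
}
```

With the time stamp left at zero, a new frame that happens to reuse the same buffer address is indistinguishable from the previous one and would be treated as a duplicate. Once each allocation carries a fresh time stamp, only a genuine second delivery of the same frame compares equal.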
[PATCH 0/6] debugfs for mv88e6xxx
This patchset adds some debugfs files for seeing into a mv88e6xxx family of switch chips. # cat atu DB T/P Vec State Addr 003 Port 008 7 00:22:02:00:18:44 003 Port 008 6 80:ee:73:83:60:27 005 Port 020 7 94:10:3e:80:bc:f3 0f8 Port 001 6 8e:25:13:53:44:de This walks all possible entries, so is a bit slow, but is always correct. # cat device_map Target Port 0 15 1 15 2 15 3 15 4 15 5 15 6 15 7 15 8 15 9 15 -->snip<-- 31 15 A rather boring example, since i only have one switch here. But this shows the routing between multiple switches. # cat regs GLOBAL GLOBAL2 0123456 0: c804 0 1e4f 100f 100f 1e4f 1e0f e07 e07 1:fe 0 33333 c03e c03f 2: 0 0000000 3: 0 1721 1721 1721 1721 1721 1721 1721 4: 6000 258 433 431 431 433 433 373f 433 5: 0 ff 0000000 6: c0001f0f 2026 2025 2023 3020 4020 501f 6020 7: 0707f 0000000 8: 07800 2080 2080 2080 2080 2080 2080 2080 9: 01600 1111111 a: 148 0 0000000 b: 40001000 1248 10 20 40 c: 0 7f 0000000 d: 5f3 0000000 e: 6 0000000 f: f00 dada dada dada dada dada dada dada 10: 0 0 0000000 11: 0 0 0000000 12: 0 0000000 13: 01a00 1df00 1e070 14: 400 0000000 15: 0 0000000 16: 0 6011 6011 6011 6011 33 330 17: 0 0000000 18: fa411844 3210 3210 3210 3210 3210 3210 3210 19: 0 1e1 7654 7654 7654 7654 7654 7654 7654 1a: 5550 0 0000000 1b: 1fbf869 8000 8000 8000 8000 8000 8000 8000 1c: 0 0 0000000 1d: c00 0 0000000 1e: 0 0 0000000 1f: 0 0 0000000 All the switch registers which are directly accessible. 
# cat stats Statistic Port 0 Port 1 Port 2 Port 3 Port 4 Port 5 Port 6 in_good_octets: 217600 42637110 499540 0 in_bad_octets:46050005019600 0 in_unicast:000 76930 7691 0 in_broadcasts:000003 0 in_multicasts: 340000 27 0 in_pause:000000 0 in_undersize:000000 0 in_fragments: 4500200 0 in_oversize:000000 0 in_jabber:000000 0 in_rx_error:000000 0 in_fcs_error: 15900 3700 0 out_octets: 80800 496608 336 4267159 0 out_unicast:000 76910 7693 0 out_broadcasts:100300 0 out_multicasts:90064 34 0 out_pause:000000 0 excessive:000000 0 collisions:000000 0 deferred:000000 0 single:000000 0 multiple:000000 0 out_fcs_error:000000 0 late:000000 0 hist_64bytes: 3600 75770 7574 0 hist_65_127bytes: 5300 2414 298 0 hist_128_255bytes: 5000 120 10 0 hist_256_511bytes: 4300802 0 hist_512_1023bytes: 1800 75730 7564
[PATCH 3/6] dsa: mv88x6xxx: Refactor getting a single statistic
Move the code to retrieve a statistics counter into a function of its own, so it can later be reused. Signed-off-by: Andrew Lunn --- drivers/net/dsa/mv88e6xxx.c | 63 ++--- drivers/net/dsa/mv88e6xxx.h | 4 +++ 2 files changed, 40 insertions(+), 27 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c index 6e684f3d377c..43c1515a8319 100644 --- a/drivers/net/dsa/mv88e6xxx.c +++ b/drivers/net/dsa/mv88e6xxx.c @@ -681,6 +681,40 @@ static void _mv88e6xxx_get_strings(struct dsa_switch *ds, } } +static uint64_t _mv88e6xxx_get_ethtool_stat(struct dsa_switch *ds, + int stat, + struct mv88e6xxx_hw_stat *stats, + int port) +{ + struct mv88e6xxx_hw_stat *s = stats + stat; + u32 low; + u32 high = 0; + int ret; + u64 value; + + if (s->reg >= 0x100) { + ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), + s->reg - 0x100); + if (ret < 0) + return UINT64_MAX; + + low = ret; + if (s->sizeof_stat == 4) { + ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), + s->reg - 0x100 + 1); + if (ret < 0) + return UINT64_MAX; + high = ret; + } + } else { + _mv88e6xxx_stats_read(ds, s->reg, &low); + if (s->sizeof_stat == 8) + _mv88e6xxx_stats_read(ds, s->reg + 1, &high); + } + value = (((u64)high) << 16) | low; + return value; +} + static void _mv88e6xxx_get_ethtool_stats(struct dsa_switch *ds, int nr_stats, struct mv88e6xxx_hw_stat *stats, @@ -699,34 +733,9 @@ static void _mv88e6xxx_get_ethtool_stats(struct dsa_switch *ds, } /* Read each of the counters. 
*/ - for (i = 0; i < nr_stats; i++) { - struct mv88e6xxx_hw_stat *s = stats + i; - u32 low; - u32 high = 0; + for (i = 0; i < nr_stats; i++) + data[i] = _mv88e6xxx_get_ethtool_stat(ds, i, stats, port); - if (s->reg >= 0x100) { - ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), - s->reg - 0x100); - if (ret < 0) - goto error; - low = ret; - if (s->sizeof_stat == 4) { - ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), - s->reg - 0x100 + 1); - if (ret < 0) - goto error; - high = ret; - } - data[i] = (((u64)high) << 16) | low; - continue; - } - _mv88e6xxx_stats_read(ds, s->reg, &low); - if (s->sizeof_stat == 8) - _mv88e6xxx_stats_read(ds, s->reg + 1, &high); - - data[i] = (((u64)high) << 32) | low; - } -error: mutex_unlock(&ps->smi_mutex); } diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h index 8b9c76b66ddb..7cccff202586 100644 --- a/drivers/net/dsa/mv88e6xxx.h +++ b/drivers/net/dsa/mv88e6xxx.h @@ -11,6 +11,10 @@ #ifndef __MV88E6XXX_H #define __MV88E6XXX_H +#ifndef UINT64_MAX +#define UINT64_MAX (u64)(~((u64)0)) +#endif + #define SMI_CMD0x00 #define SMI_CMD_BUSY BIT(15) #define SMI_CMD_CLAUSE_22 BIT(12) -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in
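One detail in the patch above may deserve a second look. The loop being removed combined the two halves with a shift of 16 for the per-port registers and a shift of 32 for the snapshot counters, while the new helper appears to use a single shift of 16 for both paths. The sketch below is not part of the patch; `demo_join_halves` is a hypothetical user-space helper that only shows how the shift has to follow the width of each hardware read.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: join the two halves of a counter. half_bits is the
 * width of each hardware read, assumed here to be 16 for the per-port
 * registers and 32 for the snapshot counters.
 */
uint64_t demo_join_halves(uint32_t low, uint32_t high, unsigned int half_bits)
{
	assert(half_bits == 16 || half_bits == 32);

	if (half_bits == 16) {
		/* Only the low 16 bits of each read carry data. */
		low &= 0xffffu;
		high &= 0xffffu;
	}
	return ((uint64_t)high << half_bits) | low;
}
```

With 32-bit halves, shifting the high word by only 16 would overlap it with the upper bits of the low word, so the two cases cannot share one shift value.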
[PATCH 2/6] dsa: mv88e6xxx: Add debugfs interface for ATU
Dump the Address Translation Unit via a file in debugfs. Signed-off-by: Andrew Lunn --- drivers/net/dsa/mv88e6xxx.c | 81 + drivers/net/dsa/mv88e6xxx.h | 3 ++ 2 files changed, 84 insertions(+) diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c index e6dbc4a8110e..6e684f3d377c 100644 --- a/drivers/net/dsa/mv88e6xxx.c +++ b/drivers/net/dsa/mv88e6xxx.c @@ -1643,6 +1643,84 @@ static const struct file_operations mv88e6xxx_regs_fops = { .owner = THIS_MODULE, }; +static void mv88e6xxx_atu_show_header(struct seq_file *s) +{ + seq_puts(s, "DB T/P Vec State Addr\n"); +} + +static void mv88e6xxx_atu_show_entry(struct seq_file *s, int dbnum, +unsigned char *addr, int data) +{ + bool trunk = !!(data & GLOBAL_ATU_DATA_TRUNK); + int portvec = ((data & GLOBAL_ATU_DATA_PORT_VECTOR_MASK) >> + GLOBAL_ATU_DATA_PORT_VECTOR_SHIFT); + int state = data & GLOBAL_ATU_DATA_STATE_MASK; + + seq_printf(s, "%03x %5s %10pb %x %pM\n", + dbnum, (trunk ? "Trunk" : "Port"), &portvec, state, addr); +} + +static int mv88e6xxx_atu_show_db(struct seq_file *s, struct dsa_switch *ds, +int dbnum) +{ + unsigned char bcast[] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; + unsigned char addr[6]; + int ret, data, state; + + ret = __mv88e6xxx_write_addr(ds, bcast); + if (ret < 0) + return ret; + + do { + ret = _mv88e6xxx_atu_cmd(ds, dbnum, GLOBAL_ATU_OP_GET_NEXT_DB); + if (ret < 0) + return ret; + data = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_ATU_DATA); + if (data < 0) + return data; + + state = data & GLOBAL_ATU_DATA_STATE_MASK; + if (state == GLOBAL_ATU_DATA_STATE_UNUSED) + break; + ret = __mv88e6xxx_read_addr(ds, addr); + if (ret < 0) + return ret; + mv88e6xxx_atu_show_entry(s, dbnum, addr, data); + } while (state != GLOBAL_ATU_DATA_STATE_UNUSED); + + return 0; +} + +static int mv88e6xxx_atu_show(struct seq_file *s, void *p) +{ + struct dsa_switch *ds = s->private; + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); + int dbnum; + + mv88e6xxx_atu_show_header(s); + + for (dbnum = 0; dbnum < 
255; dbnum++) { + mutex_lock(&ps->smi_mutex); + mv88e6xxx_atu_show_db(s, ds, dbnum); + mutex_unlock(&ps->smi_mutex); + } + + return 0; +} + +static int mv88e6xxx_atu_open(struct inode *inode, struct file *file) +{ + return single_open(file, mv88e6xxx_atu_show, inode->i_private); +} + +static const struct file_operations mv88e6xxx_atu_fops = { + .open = mv88e6xxx_atu_open, + .read = seq_read, + .llseek = no_llseek, + .release = single_release, + .owner = THIS_MODULE, +}; + int mv88e6xxx_setup_common(struct dsa_switch *ds) { struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); @@ -1663,6 +1741,9 @@ int mv88e6xxx_setup_common(struct dsa_switch *ds) debugfs_create_file("regs", S_IRUGO, ps->dbgfs, ds, &mv88e6xxx_regs_fops); + debugfs_create_file("atu", S_IRUGO, ps->dbgfs, ds, + &mv88e6xxx_atu_fops); + return 0; } diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h index 5fc291cbdae0..8b9c76b66ddb 100644 --- a/drivers/net/dsa/mv88e6xxx.h +++ b/drivers/net/dsa/mv88e6xxx.h @@ -193,6 +193,9 @@ #define GLOBAL_ATU_OP_FLUSH_NON_STATIC_DB ((6 << 12) | GLOBAL_ATU_OP_BUSY) #define GLOBAL_ATU_OP_GET_CLR_VIOLATION ((7 << 12) | GLOBAL_ATU_OP_BUSY) #define GLOBAL_ATU_DATA0x0c +#define GLOBAL_ATU_DATA_TRUNK BIT(15) +#define GLOBAL_ATU_DATA_PORT_VECTOR_MASK 0x3ff0 +#define GLOBAL_ATU_DATA_PORT_VECTOR_SHIFT 4 #define GLOBAL_ATU_DATA_STATE_MASK 0x0f #define GLOBAL_ATU_DATA_STATE_UNUSED 0x00 #define GLOBAL_ATU_DATA_STATE_UC_MGMT 0x0d -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in
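The field extraction done in mv88e6xxx_atu_show_entry() can be tried out in user space. The sketch below copies the bit layout from the defines added by this patch (trunk flag in bit 15, port vector under mask 0x3ff0 with shift 4, state in the low four bits); the `DEMO_`-prefixed macros, `demo_atu_entry` and `demo_atu_decode` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout taken from the defines in the patch above. */
#define DEMO_ATU_DATA_TRUNK		(1u << 15)
#define DEMO_ATU_DATA_PORT_VECTOR_MASK	0x3ff0u
#define DEMO_ATU_DATA_PORT_VECTOR_SHIFT	4
#define DEMO_ATU_DATA_STATE_MASK	0x0fu

/* Hypothetical decoded view of one ATU data word. */
struct demo_atu_entry {
	bool trunk;		/* entry refers to a trunk rather than ports */
	unsigned int portvec;	/* bitmap of ports (or trunk id) */
	unsigned int state;	/* entry state, 0 means unused */
};

struct demo_atu_entry demo_atu_decode(uint16_t data)
{
	struct demo_atu_entry e;

	e.trunk = (data & DEMO_ATU_DATA_TRUNK) != 0;
	e.portvec = (data & DEMO_ATU_DATA_PORT_VECTOR_MASK) >>
		    DEMO_ATU_DATA_PORT_VECTOR_SHIFT;
	e.state = data & DEMO_ATU_DATA_STATE_MASK;
	return e;
}
```

For example, a data word of 0x0087 decodes to a non-trunk entry with port vector 0x8 and state 7.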
[PATCH 4/6] dsa: mv88x6xxx: Add debugfs interface for statistics
Allow the contents of the statistics counters to be shown in debugfs. This is particularly useful for the CPU and DSA ports, which cannot be seen using ethtool -S. Signed-off-by: Andrew Lunn --- drivers/net/dsa/mv88e6xxx.c | 59 + 1 file changed, 59 insertions(+) diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c index 43c1515a8319..fc5d4fdfcb02 100644 --- a/drivers/net/dsa/mv88e6xxx.c +++ b/drivers/net/dsa/mv88e6xxx.c @@ -1730,6 +1730,62 @@ static const struct file_operations mv88e6xxx_atu_fops = { .owner = THIS_MODULE, }; +static void mv88e6xxx_stats_show_header(struct seq_file *s, + struct mv88e6xxx_priv_state *ps) +{ + int port; + + seq_puts(s, " Statistic "); + for (port = 0 ; port < ps->num_ports; port++) + seq_printf(s, "Port %2d ", port); + seq_puts(s, "\n"); +} + +static int mv88e6xxx_stats_show(struct seq_file *s, void *p) +{ + struct dsa_switch *ds = s->private; + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); + struct mv88e6xxx_hw_stat *stats = mv88e6xxx_hw_stats; + int port, stat, max_stats; + uint64_t value; + + if (have_sw_in_discards(ds)) + max_stats = ARRAY_SIZE(mv88e6xxx_hw_stats); + else + max_stats = ARRAY_SIZE(mv88e6xxx_hw_stats) - 3; + + mv88e6xxx_stats_show_header(s, ps); + + mutex_lock(&ps->smi_mutex); + + for (stat = 0; stat < max_stats; stat++) { + seq_printf(s, "%19s: ", stats[stat].string); + for (port = 0 ; port < ps->num_ports; port++) { + _mv88e6xxx_stats_snapshot(ds, port); + value = _mv88e6xxx_get_ethtool_stat(ds, stat, stats, + port); + seq_printf(s, "%8llu ", value); + } + seq_puts(s, "\n"); + } + mutex_unlock(&ps->smi_mutex); + + return 0; +} + +static int mv88e6xxx_stats_open(struct inode *inode, struct file *file) +{ + return single_open(file, mv88e6xxx_stats_show, inode->i_private); +} + +static const struct file_operations mv88e6xxx_stats_fops = { + .open = mv88e6xxx_stats_open, + .read = seq_read, + .llseek = no_llseek, + .release = single_release, + .owner = THIS_MODULE, +}; + int 
mv88e6xxx_setup_common(struct dsa_switch *ds) { struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); @@ -1753,6 +1809,9 @@ int mv88e6xxx_setup_common(struct dsa_switch *ds) debugfs_create_file("atu", S_IRUGO, ps->dbgfs, ds, &mv88e6xxx_atu_fops); + debugfs_create_file("stats", S_IRUGO, ps->dbgfs, ds, + &mv88e6xxx_stats_fops); + return 0; } -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in
[PATCH 5/6] dsa: mv88x6xxx: Add debugfs interface for device map
The device map is used to route packets between cascaded switches. Add a debugfs file that dumps a switch's device map. Signed-off-by: Andrew Lunn --- drivers/net/dsa/mv88e6xxx.c | 41 + drivers/net/dsa/mv88e6xxx.h | 1 + 2 files changed, 42 insertions(+) diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c index fc5d4fdfcb02..7e12a31e9dae 100644 --- a/drivers/net/dsa/mv88e6xxx.c +++ b/drivers/net/dsa/mv88e6xxx.c @@ -1786,6 +1786,45 @@ static const struct file_operations mv88e6xxx_stats_fops = { .owner = THIS_MODULE, }; +static int mv88e6xxx_device_map_show(struct seq_file *s, void *p) +{ + struct dsa_switch *ds = s->private; + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); + int target, ret; + + seq_puts(s, "Target Port\n"); + + mutex_lock(&ps->smi_mutex); + for (target = 0; target < 32; target++) { + ret = _mv88e6xxx_reg_write( + ds, REG_GLOBAL2, GLOBAL2_DEVICE_MAPPING, + target << GLOBAL2_DEVICE_MAPPING_TARGET_SHIFT); + if (ret < 0) + goto out; + ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL2, + GLOBAL2_DEVICE_MAPPING); + seq_printf(s, " %2d %2d\n", target, + ret & GLOBAL2_DEVICE_MAPPING_PORT_MASK); + } +out: + mutex_unlock(&ps->smi_mutex); + + return 0; +} + +static int mv88e6xxx_device_map_open(struct inode *inode, struct file *file) +{ + return single_open(file, mv88e6xxx_device_map_show, inode->i_private); +} + +static const struct file_operations mv88e6xxx_device_map_fops = { + .open = mv88e6xxx_device_map_open, + .read = seq_read, + .llseek = no_llseek, + .release = single_release, + .owner = THIS_MODULE, +}; + int mv88e6xxx_setup_common(struct dsa_switch *ds) { struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); @@ -1812,6 +1851,8 @@ int mv88e6xxx_setup_common(struct dsa_switch *ds) debugfs_create_file("stats", S_IRUGO, ps->dbgfs, ds, &mv88e6xxx_stats_fops); + debugfs_create_file("device_map", S_IRUGO, ps->dbgfs, ds, + &mv88e6xxx_device_map_fops); return 0; } diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h index 
7cccff202586..a2c9ac0c54ab 100644 --- a/drivers/net/dsa/mv88e6xxx.h +++ b/drivers/net/dsa/mv88e6xxx.h @@ -260,6 +260,7 @@ #define GLOBAL2_DEVICE_MAPPING 0x06 #define GLOBAL2_DEVICE_MAPPING_UPDATE BIT(15) #define GLOBAL2_DEVICE_MAPPING_TARGET_SHIFT8 +#define GLOBAL2_DEVICE_MAPPING_PORT_MASK 0x0f #define GLOBAL2_TRUNK_MASK 0x07 #define GLOBAL2_TRUNK_MASK_UPDATE BIT(15) #define GLOBAL2_TRUNK_MASK_NUM_SHIFT 12 -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in
[PATCH 1/6] dsa: mv88e6xxx: Add debugfs interface for registers
Allow the contents of the registers to be shown in debugfs. Signed-off-by: Andrew Lunn --- drivers/net/dsa/mv88e6xxx.c | 50 + drivers/net/dsa/mv88e6xxx.h | 2 ++ 2 files changed, 52 insertions(+) diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c index 39530fa142b0..e6dbc4a8110e 100644 --- a/drivers/net/dsa/mv88e6xxx.c +++ b/drivers/net/dsa/mv88e6xxx.c @@ -8,6 +8,7 @@ * (at your option) any later version. */ +#include #include #include #include @@ -16,6 +17,7 @@ #include #include #include +#include #include #include "mv88e6xxx.h" @@ -1601,9 +1603,50 @@ int mv88e6xxx_setup_ports(struct dsa_switch *ds) return 0; } +static int mv88e6xxx_regs_show(struct seq_file *s, void *p) +{ + struct dsa_switch *ds = s->private; + + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); + int reg, port; + + seq_puts(s, "GLOBAL GLOBAL2 "); + for (port = 0 ; port < ps->num_ports; port++) + seq_printf(s, " %2d ", port); + seq_puts(s, "\n"); + + for (reg = 0; reg < 32; reg++) { + seq_printf(s, "%2x: ", reg); + seq_printf(s, " %4x%4x ", + mv88e6xxx_reg_read(ds, REG_GLOBAL, reg), + mv88e6xxx_reg_read(ds, REG_GLOBAL2, reg)); + + for (port = 0 ; port < ps->num_ports; port++) + seq_printf(s, "%4x ", + mv88e6xxx_reg_read(ds, REG_PORT(port), reg)); + seq_puts(s, "\n"); + } + + return 0; +} + +static int mv88e6xxx_regs_open(struct inode *inode, struct file *file) +{ + return single_open(file, mv88e6xxx_regs_show, inode->i_private); +} + +static const struct file_operations mv88e6xxx_regs_fops = { + .open = mv88e6xxx_regs_open, + .read = seq_read, + .llseek = no_llseek, + .release = single_release, + .owner = THIS_MODULE, +}; + int mv88e6xxx_setup_common(struct dsa_switch *ds) { struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); + char *name; mutex_init(&ps->smi_mutex); @@ -1613,6 +1656,13 @@ int mv88e6xxx_setup_common(struct dsa_switch *ds) INIT_WORK(&ps->bridge_work, mv88e6xxx_bridge_work); + name = kasprintf(GFP_KERNEL, "dsa%d", ds->index); + ps->dbgfs = debugfs_create_dir(name, 
NULL); + kfree(name); + + debugfs_create_file("regs", S_IRUGO, ps->dbgfs, ds, + &mv88e6xxx_regs_fops); + return 0; } diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h index e10ccdb4ffbc..5fc291cbdae0 100644 --- a/drivers/net/dsa/mv88e6xxx.h +++ b/drivers/net/dsa/mv88e6xxx.h @@ -339,6 +339,8 @@ struct mv88e6xxx_priv_state { u8 port_state[DSA_MAX_PORTS]; struct work_struct bridge_work; + + struct dentry *dbgfs; }; struct mv88e6xxx_hw_stat { -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in
[PATCH 6/6] dsa: mv88x6xxx: Add debugfs interface for scratch registers
Allow the contents of the scratch registers to be shown in debugfs. Signed-off-by: Andrew Lunn --- drivers/net/dsa/mv88e6xxx.c | 54 + drivers/net/dsa/mv88e6xxx.h | 3 +++ 2 files changed, 57 insertions(+) diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c index 7e12a31e9dae..c938d7ce5215 100644 --- a/drivers/net/dsa/mv88e6xxx.c +++ b/drivers/net/dsa/mv88e6xxx.c @@ -901,6 +901,13 @@ static int _mv88e6xxx_atu_wait(struct dsa_switch *ds) GLOBAL_ATU_OP_BUSY); } +/* Must be called with SMI lock held */ +static int _mv88e6xxx_scratch_wait(struct dsa_switch *ds) +{ + return _mv88e6xxx_wait(ds, REG_GLOBAL2, GLOBAL2_SCRATCH_MISC, + GLOBAL2_SCRATCH_BUSY); +} + /* Must be called with SMI mutex held */ static int _mv88e6xxx_phy_read_indirect(struct dsa_switch *ds, int addr, int regnum) @@ -1825,6 +1832,50 @@ static const struct file_operations mv88e6xxx_device_map_fops = { .owner = THIS_MODULE, }; +static int mv88e6xxx_scratch_show(struct seq_file *s, void *p) +{ + struct dsa_switch *ds = s->private; + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); + int reg, ret; + + seq_puts(s, "Register Value\n"); + + mutex_lock(&ps->smi_mutex); + for (reg = 0; reg < 0x80; reg++) { + ret = _mv88e6xxx_reg_write( + ds, REG_GLOBAL2, GLOBAL2_SCRATCH_MISC, + reg << GLOBAL2_SCRATCH_REGISTER_SHIFT); + if (ret < 0) + goto out; + + ret = _mv88e6xxx_scratch_wait(ds); + if (ret < 0) + goto out; + + ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL2, + GLOBAL2_SCRATCH_MISC); + seq_printf(s, " %2x %2x\n", reg, + ret & GLOBAL2_SCRATCH_VALUE_MASK); + } +out: + mutex_unlock(&ps->smi_mutex); + + return 0; +} + +static int mv88e6xxx_scratch_open(struct inode *inode, struct file *file) +{ + return single_open(file, mv88e6xxx_scratch_show, inode->i_private); +} + +static const struct file_operations mv88e6xxx_scratch_fops = { + .open = mv88e6xxx_scratch_open, + .read = seq_read, + .llseek = no_llseek, + .release = single_release, + .owner = THIS_MODULE, +}; + int mv88e6xxx_setup_common(struct 
dsa_switch *ds) { struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); @@ -1853,6 +1904,9 @@ int mv88e6xxx_setup_common(struct dsa_switch *ds) debugfs_create_file("device_map", S_IRUGO, ps->dbgfs, ds, &mv88e6xxx_device_map_fops); + + debugfs_create_file("scratch", S_IRUGO, ps->dbgfs, ds, + &mv88e6xxx_scratch_fops); return 0; } diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h index a2c9ac0c54ab..a650b2656de9 100644 --- a/drivers/net/dsa/mv88e6xxx.h +++ b/drivers/net/dsa/mv88e6xxx.h @@ -297,6 +297,9 @@ #define GLOBAL2_SMI_OP_45_READ_DATA((2 << 10) | GLOBAL2_SMI_OP_BUSY) #define GLOBAL2_SMI_DATA 0x19 #define GLOBAL2_SCRATCH_MISC 0x1a +#define GLOBAL2_SCRATCH_BUSY BIT(15) +#define GLOBAL2_SCRATCH_REGISTER_SHIFT 8 +#define GLOBAL2_SCRATCH_VALUE_MASK 0xff #define GLOBAL2_WDOG_CONTROL 0x1b #define GLOBAL2_QOS_WEIGHT 0x1c #define GLOBAL2_MISC 0x1d -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in
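The scratch dump above follows an indirect-access pattern: write the register index into the upper byte of GLOBAL2_SCRATCH_MISC, wait for the busy bit to clear, then read the value from the low byte. The user-space sketch below only models the packing and unpacking of that 16-bit word, using the bit positions from the defines in this patch. The `DEMO_`-prefixed macros and the three helper functions are hypothetical names, and no hardware access is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout taken from the defines in the patch above. */
#define DEMO_SCRATCH_BUSY		(1u << 15)
#define DEMO_SCRATCH_REGISTER_SHIFT	8
#define DEMO_SCRATCH_VALUE_MASK		0xffu

/* Build the word that selects scratch register 'reg' for reading. */
uint16_t demo_scratch_select(unsigned int reg)
{
	assert(reg < 0x80);	/* the dump loop covers 0x00..0x7f */
	return (uint16_t)(reg << DEMO_SCRATCH_REGISTER_SHIFT);
}

/* True while the device reports the previous operation as in progress. */
bool demo_scratch_busy(uint16_t raw)
{
	return (raw & DEMO_SCRATCH_BUSY) != 0;
}

/* Extract the 8-bit register value from the word read back. */
unsigned int demo_scratch_value(uint16_t raw)
{
	return raw & DEMO_SCRATCH_VALUE_MASK;
}
```

Since the register index is limited to 0x7f, the selection word never sets bit 15, so it cannot be mistaken for the busy flag.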
Re: macb napi strange behavior
Florian Fainelli : [...] > Typically, NAPI is used at the receive side of the Ethernet NIC/driver > to lower the hard/soft interrupt context switch, although there is > nothing that prevents you from implementing a similar scheme for the > transmit side. Usually, for transmit you will be submitting one packet > for transmission and get a completion interrupt, so without interrupt > coalescing (software or hardware) you can end up with 1 interrupt per > packet transmitted. The wording is a bit shy: there is a long-standing policy to move everything to NAPI context (as well as go mostly lockless, etc.). Any takers to move macb Tx processing to NAPI context, or should I consider it? -- Ueimor
Re: [PATCH net-next RFC v2 1/3] lwt: infrastructure to support light weight tunnels
<<>> > diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c > new file mode 100644 > index 000..29c7802 > --- /dev/null > +++ b/net/core/lwtunnel.c > @@ -0,0 +1,162 @@ > +/* > + * lwtunnel Infrastructure for light weight tunnels like mpls > + * > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms of the GNU General Public License > + * as published by the Free Software Foundation; either version > + * 2 of the License, or (at your option) any later version. > + * > + */ > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > +#include > + > +struct lwtunnel_state *lwtunnel_state_alloc(int hdr_len) > +{ > + struct lwtunnel_state *lws; > + > + return kzalloc(sizeof(*lws) + hdr_len, GFP_KERNEL); This seems to be called with rcu_read_lock so GFP_ATOMIC would have to be used. (Judging by patch 3/3’s mpls_build_state and lwtunnel_build_state) > +} > +EXPORT_SYMBOL(lwtunnel_state_alloc); > + > +const struct lwtunnel_encap_ops __rcu * > + lwtun_encaps[MAX_LWTUNNEL_ENCAP_OPS] __read_mostly; > + > +int lwtunnel_encap_add_ops(const struct lwtunnel_encap_ops *ops, > +unsigned int num) > +{ > + if (num >= MAX_LWTUNNEL_ENCAP_OPS) > + return -ERANGE; > + > + return !cmpxchg((const struct lwtunnel_encap_ops **) > + &lwtun_encaps[num], > + NULL, ops) ? 0 : -1; > +} > +EXPORT_SYMBOL(lwtunnel_encap_add_ops); > + > +int lwtunnel_encap_del_ops(const struct lwtunnel_encap_ops *ops, > +unsigned int num) > +{ > + int ret; > + > + if (num >= MAX_LWTUNNEL_ENCAP_OPS) > + return -ERANGE; > + > + ret = (cmpxchg((const struct lwtunnel_encap_ops **) > +&lwtun_encaps[num], > +ops, NULL) == ops) ? 
0 : -1; > + > + synchronize_net(); > + > + return ret; > +} > +EXPORT_SYMBOL(lwtunnel_encap_del_ops); > + > +int lwtunnel_build_state(struct net_device *dev, u16 encap_type, > + struct nlattr *encap, struct lwtunnel_state **lws) > +{ > + const struct lwtunnel_encap_ops *ops; > + int ret = -EINVAL; > + > + if (encap_type == LWTUNNEL_ENCAP_NONE || > + encap_type >= MAX_LWTUNNEL_ENCAP_OPS) > + return ret; > + > + ret = -EOPNOTSUPP; > + rcu_read_lock(); > + ops = rcu_dereference(lwtun_encaps[encap_type]); > + if (likely(ops && ops->build_state)) > + ret = ops->build_state(dev, encap, lws); > + rcu_read_unlock(); > + > + return ret; > +} > +EXPORT_SYMBOL(lwtunnel_build_state); > + > +int lwtunnel_fill_encap(struct sk_buff *skb, struct lwtunnel_state *lwtstate) > +{ > + const struct lwtunnel_encap_ops *ops; > + struct nlattr *nest; > + int ret = -EINVAL; > + > + if (lwtstate->type == LWTUNNEL_ENCAP_NONE || > + lwtstate->type >= MAX_LWTUNNEL_ENCAP_OPS) > + return 0; > + > + ret = -EOPNOTSUPP; > + nest = nla_nest_start(skb, RTA_ENCAP); > + rcu_read_lock(); > + ops = rcu_dereference(lwtun_encaps[lwtstate->type]); > + if (likely(ops && ops->fill_encap)) > + ret = ops->fill_encap(skb, lwtstate); > + rcu_read_unlock(); > + > + if (ret) > + goto errout; > + > + nla_nest_end(skb, nest); > + > + return 0; > + > +errout: > + nla_nest_cancel(skb, nest); > + > + return ret; > +} > +EXPORT_SYMBOL(lwtunnel_fill_encap); > + > +int lwtunnel_get_encap_size(struct lwtunnel_state *lwtstate) > +{ > + const struct lwtunnel_encap_ops *ops; > + int ret = 0; > + > + if (lwtstate->type == LWTUNNEL_ENCAP_NONE || > + lwtstate->type >= MAX_LWTUNNEL_ENCAP_OPS) > + return 0; > + > + rcu_read_lock(); > + ops = rcu_dereference(lwtun_encaps[lwtstate->type]); > + if (likely(ops && ops->get_encap_size)) > + ret = nla_total_size(ops->get_encap_size(lwtstate)); > + rcu_read_unlock(); > + > + return ret; > +} > +EXPORT_SYMBOL(lwtunnel_get_encap_size); > + > +int lwtunnel_output(struct sock *sk, struct 
sk_buff *skb) > +{ > + const struct lwtunnel_encap_ops *ops; > + struct lwtunnel_state *lwtstate = lwtunnel_skb_lwstate(skb); > + int ret = 0; > + > + if (!lwtstate) > + return -EINVAL; > + > + if (lwtstate->type == LWTUNNEL_ENCAP_NONE || > + lwtstate->type >= MAX_LWTUNNEL_ENCAP_OPS) > + return 0; > + > + rcu_read_lock(); > + ops = rcu_dereference(lwtun_encaps[lwtstate->type]); > + if (likely(ops && ops->output)) > + ret = ops->output(sk, skb); > + rcu_read_unlock(); > + > + return ret; > +} > +EXPORT_SYMBOL(lwtunnel_output); > -- > 1.7.10.4 > > -- > To unsubscribe from this list: send the line "unsubs
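The slot registration in the quoted patch relies on an atomic compare-and-exchange so that only one encap type can claim a given index. A minimal userspace sketch of the same idea with C11 atomics follows; it is an analogy, not the kernel implementation, and `encap_ops`, `encap_slots`, `encap_add_ops` and `encap_del_ops` are hypothetical names.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_ENCAP_OPS 8

/* Stand-in for struct lwtunnel_encap_ops. */
struct encap_ops {
	const char *name;
};

/* One slot per encap type; NULL means "unclaimed". */
static _Atomic(const struct encap_ops *) encap_slots[MAX_ENCAP_OPS];

/* Claim slot `num` only if it is currently empty. */
static int encap_add_ops(const struct encap_ops *ops, unsigned int num)
{
	const struct encap_ops *expected = NULL;

	if (num >= MAX_ENCAP_OPS)
		return -ERANGE;

	return atomic_compare_exchange_strong(&encap_slots[num],
					      &expected, ops) ? 0 : -1;
}

/* Release slot `num` only if it is still owned by `ops`. */
static int encap_del_ops(const struct encap_ops *ops, unsigned int num)
{
	const struct encap_ops *expected = ops;

	if (num >= MAX_ENCAP_OPS)
		return -ERANGE;

	return atomic_compare_exchange_strong(&encap_slots[num],
					      &expected, NULL) ? 0 : -1;
}
```

The sketch leaves out what the kernel version adds on top: readers dereference the slot under RCU, and deletion waits for them with synchronize_net() before the ops may go away.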
Re: [PATCH net] netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook
On 20.06, Eric W. Biederman wrote: > Patrick McHardy writes: > > > On 20.06, Pablo Neira Ayuso wrote: > >> On Fri, Jun 19, 2015 at 02:03:39PM -0500, Eric W. Biederman wrote: > >> > > >> > Add code to nf_unregister_hook to flush the nf_queue when a hook is > >> > unregistered. This guarantees that the pointer that the nf_queue code > >> > retains into the nf_hook list will remain valid while a packet is > >> > queued. > >> > >> I think the real problem is that struct nf_queue_entry holds a pointer > >> to struct nf_hook_ops, which will be gone after removal. So you > >> uncovered a long standing problem that will amplify by when pernet > >> hooks are in place. > >> > >> Regarding the pointer to nf_hook_list, now that new netdevice variant > >> doesn't support nf_queue yet, so that nf_hook_list will be always > >> valid since it will point to the global nf_hooks in the core. > > > > I think Eric's patch is the right thing to do. I'm not sure I get > > your netdev comment, but we certainly do want to drop packets once > > a hook is gone. > > > >> > +{ > >> > +const struct nf_queue_handler *qh; > >> > +struct net *net; > >> > + > >> > +rtnl_lock(); > >> > >> Why rtnl_lock() here? > > > > for_each_net(). Would actually be nice to have a variant that doesn't > > need the rtnl since it makes locking order analysis a lot harder. > > Someone added a for_each_net_rcu. But right now I am not at all certain > I trust an rcu variant not to miss something, in a weird corner case. > When missing something translates to an unprivileged user triggerable > kernel oops I am not ready to play games. > > As for the lock analysis. Except for nf_tables nf_unregister_hook is > called by module removal routines where rtnl_lock() is safe. > > With nftables we seem to do everything under some version of the > nfnl_lock. Does the nfnl_lock have any problems with taking the > rtnl_lock to nest underneath it? 
No, it's fine; we have almost no interactions except for network
namespaces and device lookups.

The main reason why I'd prefer a non-RTNL version is that your
callbacks introduce bigger chunks of code under the RTNL, so it might
complicate things in the future. But your reasoning is sound, and for
now this is perfectly fine.
Re: [PATCH 2/3] rcar_can: print signed IRQ #
Hello.

On 6/20/2015 7:02 PM, Marc Kleine-Budde wrote:

>>>> Printing IRQ # using "%x" and "%u" unsigned formats isn't quite correct as
>>>> 'ndev->irq' is of type *int*, so the "%d" format needs to be used instead.
>>>>
>>>> While fixing this, beautify the dev_info() message in rcar_can_probe() a bit.

>>> If you change the message, why don't you make it consistent
>>> ("interrupt" vs. "IRQ")?

>> I decided to change the message in a follow-up patch (posted afterwards).

> Please squash your patches, so that you don't modify code (or error
> messages) that you've added in a previous patch.

I didn't add any messages.

> Marc

WBR, Sergei
[PATCH] can: fix loss of frames due to wrong assumption in raw_rcv
I've detected a massive loss of CAN frames on i.MX6 using the flexcan driver
with 4.1-rc8 and tracked this down to the following commit:

514ac99c64b22d83b52dfee3b8becaa69a92bc4a - "can: fix multiple delivery of a
single CAN frame for overlapping CAN filters"

514ac99c64b22d83b52dfee3b8becaa69a92bc4a introduces a frame equality check.
Since the sk_buff pointer is not sufficient to do this (buffers are reused),
the check also compares time stamps. In short: pointer + time stamp was
assumed to be a unique key for a specific frame.

The problem with this is that the time stamp is an optional property and is
not set by default. In our case (flexcan) the time stamp is always zero, so
the equality check is reduced to equality of buffer pointers, resulting in a
lot of dropped frames.

Possible solutions I thought of:
1. Every driver has to set a time stamp (possibly error prone and hard to
   enforce?)
2. Change the equality check
3. Fulfil the requirements of the equality check by setting a time stamp by
   default.

This patch fixes the problem with solution 3. A time stamp is set at the time
of allocation in alloc_can_skb. The time stamp may be overridden later, but
the function of the equality check is ensured.

I'm not really deep in the Linux network subsystem, so there may exist more
elegant solutions to the problem.

Signed-off-by: Manfred Schlaegl
---
 drivers/net/can/dev.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index b0f6924..282e2e7 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -575,6 +575,7 @@ struct sk_buff *alloc_can_skb(struct net_device *dev, struct can_frame **cf)
 	if (unlikely(!skb))
 		return NULL;
 
+	__net_timestamp(skb);
 	skb->protocol = htons(ETH_P_CAN);
 	skb->pkt_type = PACKET_BROADCAST;
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
-- 
1.7.10.4
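The failure mode described in this report can be reproduced in a few lines of userspace C. The sketch below only models the idea of using (buffer pointer, time stamp) as a frame identity; it is not the code from raw_rcv, and `frame`, `last_seen` and `is_duplicate` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A received frame, identified by its buffer address and time stamp. */
struct frame {
	const void *buf;
	int64_t tstamp;
};

/* Identity of the most recently delivered frame. */
struct last_seen {
	const void *buf;
	int64_t tstamp;
};

/* Return true if `f` looks identical to the last delivered frame;
 * otherwise remember it and return false. */
static bool is_duplicate(struct last_seen *last, const struct frame *f)
{
	if (last->buf == f->buf && last->tstamp == f->tstamp)
		return true;

	last->buf = f->buf;
	last->tstamp = f->tstamp;
	return false;
}
```

If two different frames arrive back to back in a recycled buffer and both carry a zero time stamp, the second one is indistinguishable from the first and is treated as a duplicate. Once each frame gets its own time stamp, the same two frames are told apart.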
[PATCHv2 net-next] net: fec: Ensure clocks are enabled while using mdio bus
When a switch is attached to the mdio bus, the mdio bus can be used
while the interface is not open. If the IPG clock is not enabled, MDIO
reads/writes will simply time out. So enable the clock before starting
a transaction, and disable it afterwards. The CCF performs reference
counting, so the clock will only be disabled if there are no other
users.

Signed-off-by: Andrew Lunn
---
v2: Only enable/disable the IPG clock.

 drivers/net/ethernet/freescale/fec_main.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index bf4cf3fbb5f2..2b8a043a573c 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -65,6 +65,7 @@
 
 static void set_multicast_list(struct net_device *ndev);
 static void fec_enet_itr_coal_init(struct net_device *ndev);
+static int fec_enet_clk_enable(struct net_device *ndev, bool enable);
 
 #define DRIVER_NAME	"fec"
 
@@ -1764,6 +1765,11 @@ static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 {
 	struct fec_enet_private *fep = bus->priv;
 	unsigned long time_left;
+	int ret;
+
+	ret = clk_prepare_enable(fep->clk_ipg);
+	if (ret)
+		return 0x;
 
 	fep->mii_timeout = 0;
 	init_completion(&fep->mdio_done);
@@ -1779,11 +1785,14 @@ static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 	if (time_left == 0) {
 		fep->mii_timeout = 1;
 		netdev_err(fep->netdev, "MDIO read timeout\n");
+		clk_disable_unprepare(fep->clk_ipg);
 		return -ETIMEDOUT;
 	}
 
-	/* return value */
-	return FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
+	ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
+	clk_disable_unprepare(fep->clk_ipg);
+
+	return ret;
 }
 
 static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
@@ -1791,10 +1800,15 @@ static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
 {
 	struct fec_enet_private *fep = bus->priv;
 	unsigned long time_left;
+	int ret;
 
 	fep->mii_timeout = 0;
 	init_completion(&fep->mdio_done);
 
+	ret = clk_prepare_enable(fep->clk_ipg);
+	if (ret)
+		return ret;
+
 	/* start a write op */
 	writel(FEC_MMFR_ST | FEC_MMFR_OP_WRITE |
 		FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
@@ -1807,9 +1821,12 @@ static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
 	if (time_left == 0) {
 		fep->mii_timeout = 1;
 		netdev_err(fep->netdev, "MDIO write timeout\n");
+		clk_disable_unprepare(fep->clk_ipg);
 		return -ETIMEDOUT;
 	}
 
+	clk_disable_unprepare(fep->clk_ipg);
+
 	return 0;
 }
 
-- 
2.1.4
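The commit message leans on the clock framework's reference counting: an MDIO transaction can enable and disable the IPG clock without switching it off underneath an open interface. A toy userspace model of that behaviour is sketched below. It is not the kernel clk API; `clk_sim`, `clk_sim_enable` and `clk_sim_disable` are hypothetical names.

```c
/* Toy reference-counted clock: `hw_on` models the physical gate. */
struct clk_sim {
	int enable_count;
	int hw_on;
};

/* The first user switches the clock on; later users only take a
 * reference. Always succeeds in this model. */
static int clk_sim_enable(struct clk_sim *c)
{
	if (c->enable_count++ == 0)
		c->hw_on = 1;
	return 0;
}

/* The last user to drop its reference switches the clock off. */
static void clk_sim_disable(struct clk_sim *c)
{
	if (c->enable_count == 0)
		return;		/* unbalanced disable: ignored in this toy model */
	if (--c->enable_count == 0)
		c->hw_on = 0;
}
```

If the interface is open (one reference) and an MDIO read takes and drops a second reference, the clock stays on. Only when the interface also releases its reference does the gate close.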
Re: [PATCH 2/3] rcar_can: print signed IRQ #
On 06/20/2015 04:38 PM, Sergei Shtylyov wrote:
> Hello.
>
> On 6/20/2015 3:03 PM, Geert Uytterhoeven wrote:
>
>>> Printing IRQ # using "%x" and "%u" unsigned formats isn't quite correct as
>>> 'ndev->irq' is of type *int*, so the "%d" format needs to be used
>>> instead.
>
>>> While fixing this, beautify the dev_info() message in rcar_can_probe() a
>>> bit.
>
>> If you change the message, why don't you make it consistent
>> ("interrupt" vs. "IRQ")?
>
> I decided to change the message in a follow-up patch (posted afterwards).

Please squash your patches, so that you don't modify code (or error
messages) that you've added in a previous patch.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-     |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |
Re: [PATCH net-next RFC v2 3/3] mpls: support for ip tunnels
On 6/19/15, 9:06 AM, Robert Shearman wrote:
>> +
>> +	/* Push the new labels */
>> +	hdr = mpls_hdr(skb);
>> +	bos = true;
>> +	for (i = tun_encap_info->labels - 1; i >= 0; i--) {
>> +		hdr[i] = mpls_entry_encode(tun_encap_info->label[i],
>> +					   dec.ttl, 0, bos);
>
> dec is never initialised in this function, so this will encode a
> garbage ttl into the packet. This should instead be deriving the ttl
> from the IP packet, as Eric did in his original patch.

Thanks for the pointer, Robert. I will fix it.
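To see why an uninitialised ttl matters, it helps to look at how a label stack entry is packed: 20 bits of label, 3 bits of traffic class, the bottom-of-stack bit, and 8 bits of TTL. The userspace sketch below builds that 32-bit word in host byte order. The function names are hypothetical, the argument order mirrors the mpls_entry_encode() call in the quoted code, and the conversion to network byte order that a real sender needs is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack one MPLS label stack entry (host byte order):
 *   bits 31..12 label, bits 11..9 traffic class,
 *   bit 8 bottom-of-stack, bits 7..0 TTL. */
static uint32_t mpls_lse_encode(uint32_t label, uint8_t ttl,
				uint8_t tc, bool bos)
{
	return ((label & 0xfffffu) << 12) |
	       ((uint32_t)(tc & 0x7u) << 9) |
	       ((uint32_t)(bos ? 1u : 0u) << 8) |
	       (uint32_t)ttl;
}

/* Extract the TTL field from a packed entry. */
static uint8_t mpls_lse_ttl(uint32_t lse)
{
	return (uint8_t)(lse & 0xffu);
}
```

Whatever value is passed as `ttl` lands unchanged in the low byte of every pushed label, so stack garbage in that argument goes straight onto the wire.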
Re: [PATCH net-next RFC v2 3/3] mpls: support for ip tunnels
On 6/19/15, 9:06 AM, Robert Shearman wrote:
> Since the entire label stack and the output device is encoded in the
> route, this means that you won't get prefix-independent convergence
> with this implementation for an IGP route change. I.e. if you've got
> 10 million VPN routes via an IGP route for the BGP nexthop, and the
> IGP route for the BGP nexthop changes (e.g. because a link has gone
> down somewhere in the network) then you'll have to update all 10
> million IP routes to change the output device, gateway and IGP label.
> That's going to represent a scaling obstacle for one of the primary
> MPLS use cases.

I can't say I understand PIC very well, but, assuming PIC is not just
an MPLS thing, PIC does require an alternate nexthop infrastructure in
the kernel (FIB). And if that were present, it would help the MPLS case
too. I am not sure how you would solve PIC just for the MPLS case or if
having a netdevice makes it any easier.

Thanks,
Roopa
Re: [PATCH 2/3] rcar_can: print signed IRQ #
Hello.

On 6/20/2015 3:03 PM, Geert Uytterhoeven wrote:

>> Printing IRQ # using "%x" and "%u" unsigned formats isn't quite correct as
>> 'ndev->irq' is of type *int*, so the "%d" format needs to be used instead.
>>
>> While fixing this, beautify the dev_info() message in rcar_can_probe() a bit.

> If you change the message, why don't you make it consistent
> ("interrupt" vs. "IRQ")?

I decided to change the message in a follow-up patch (posted afterwards).

>> Fixes: fd1159318e55 ("can: add Renesas R-Car CAN driver")
>> Signed-off-by: Sergei Shtylyov

[...]

> Gr{oetje,eeting}s,
> Geert

WBR, Sergei
Re: [PATCH net-next RFC v2 1/3] lwt: infrastructure to support light weight tunnels
On 6/19/15, 11:39 AM, Robert Shearman wrote:
On 19/06/15 19:34, roopa wrote:
On 6/19/15, 10:25 AM, Robert Shearman wrote:
On 19/06/15 16:14, roopa wrote:

In the netdevice case, this output function is not called at all. It
should just follow the existing netdevice the route is pointing to.

Sorry for not being clear, but I meant that there would have to be
lwtunnel_skb_lwstate functions for ipv4 and ipv6 to match the output
functions. So in the vxlan use case where it's using a netdevice, how
would it determine which one to call?

Thanks for that clarification, and good point. I see some areas of the
kernel checking skb->protocol to do the conversion (something like
below). I am guessing that is acceptable.

if (skb->protocol == htons(ETH_P_IPV6))
	struct rt6_info *rt6 = (struct rt6_info *)skb_dst(skb);
Re: [PATCH net] netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook
Patrick McHardy writes:

> On 20.06, Pablo Neira Ayuso wrote:
>> On Fri, Jun 19, 2015 at 02:03:39PM -0500, Eric W. Biederman wrote:
>> >
>> > Add code to nf_unregister_hook to flush the nf_queue when a hook is
>> > unregistered. This guarantees that the pointer that the nf_queue code
>> > retains into the nf_hook list will remain valid while a packet is
>> > queued.
>>
>> I think the real problem is that struct nf_queue_entry holds a pointer
>> to struct nf_hook_ops, which will be gone after removal. So you
>> uncovered a long standing problem that will amplify by when pernet
>> hooks are in place.
>>
>> Regarding the pointer to nf_hook_list, now that new netdevice variant
>> doesn't support nf_queue yet, so that nf_hook_list will be always
>> valid since it will point to the global nf_hooks in the core.
>
> I think Eric's patch is the right thing to do. I'm not sure I get
> your netdev comment, but we certainly do want to drop packets once
> a hook is gone.
>
>> > +{
>> > +	const struct nf_queue_handler *qh;
>> > +	struct net *net;
>> > +
>> > +	rtnl_lock();
>>
>> Why rtnl_lock() here?
>
> for_each_net(). Would actually be nice to have a variant that doesn't
> need the rtnl since it makes locking order analysis a lot harder.

Someone added a for_each_net_rcu. But right now I am not at all certain
I trust an rcu variant not to miss something, in a weird corner case.
When missing something translates to an unprivileged user triggerable
kernel oops I am not ready to play games.

As for the lock analysis. Except for nf_tables nf_unregister_hook is
called by module removal routines where rtnl_lock() is safe.

With nftables we seem to do everything under some version of the
nfnl_lock. Does the nfnl_lock have any problems with taking the
rtnl_lock to nest underneath it?

I tested this path and I did not have any practical problems, but I
don't think I had lockdep enabled at the time.

Eric

>> > +	rcu_read_lock();
>> > +	qh = rcu_dereference(queue_handler);
>> > +	if (qh) {
>> > +		for_each_net(net) {
>> > +			qh->nf_hook_drop(net, ops);
>> > +		}
>> > +	}
>> > +	rcu_read_unlock();
>> > +	rtnl_unlock();
Re: [PATCH net] netfilter: nf_queue: Don't recompute the hook_list head
Pablo Neira Ayuso writes:

> On Fri, Jun 19, 2015 at 05:23:37PM -0500, Eric W. Biederman wrote:
>>
>> If someone sends packets from one of the netdevice ingress hooks to
>> the a userspace queue, and then userspace later accepts the packet,
>> the netfilter code can enter an infinite loop as the list head will
>> never be found.
>>
>> Pass in the saved list_head to avoid this.
>
> There is no userspace queueing for netdevice yet, so this can be route
> through nf-next. Thanks.

*scratches head* The netdevice queueing is in the netfilter core.
netfilter_ingress calls nf_hook_slow. The queueing happens in
nf_hook_slow if anything returns the queue verdict.

This patch applies to Linus's tree. So how in the world does this not
need to be ported to 4.1?

>> Signed-off-by: "Eric W. Biederman"
>> ---
>>  net/netfilter/nf_queue.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
>> index cd60d397fe05..8a8b2abc35ff 100644
>> --- a/net/netfilter/nf_queue.c
>> +++ b/net/netfilter/nf_queue.c
>> @@ -213,7 +213,7 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
>>
>>  	if (verdict == NF_ACCEPT) {
>>  	next_hook:
>> -		verdict = nf_iterate(&nf_hooks[entry->state.pf][entry->state.hook],
>> +		verdict = nf_iterate(entry->state.hook_list,
>>  				     skb, &entry->state, &elem);
>>  	}
>>
>> --
>> 2.2.1
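The infinite loop mentioned in the patch description follows from how a circular list is walked: iteration resumes after a saved element and stops only when it reaches the list head it was given. The userspace sketch below models that with a step limit so the failure can be observed safely. It is an illustration, not nf_iterate(), and `node` and `steps_to_head` are hypothetical names.

```c
/* Minimal singly linked circular list node. */
struct node {
	struct node *next;
};

/* Resume walking after `start` until `head` is reached. Returns the
 * number of further entries visited, or -1 if `head` has not been
 * reached after `limit` steps (the unbounded loop in real code). */
static int steps_to_head(const struct node *start, const struct node *head,
			 int limit)
{
	const struct node *n = start->next;
	int steps = 0;

	while (n != head) {
		if (++steps > limit)
			return -1;
		n = n->next;
	}
	return steps;
}
```

When the saved element lives on one list but the walk is given the head of a different list, the termination test can never succeed, which is why passing the saved list head matters.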
Re: [PATCH net] netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook
Pablo Neira Ayuso writes: > On Fri, Jun 19, 2015 at 02:03:39PM -0500, Eric W. Biederman wrote: >> >> Add code to nf_unregister_hook to flush the nf_queue when a hook is >> unregistered. This guarantees that the pointer that the nf_queue code >> retains into the nf_hook list will remain valid while a packet is >> queued. > > I think the real problem is that struct nf_queue_entry holds a pointer > to struct nf_hook_ops, which will be gone after removal. Yes that is what I meant, when I was talking about the pointer that the nf_queue code holds into the nf_hook list. That list is threaded through nf_hook_ops, and is used to retain the place in the nf_hook list for when the packet returns through nf_reinject. > So you > uncovered a long standing problem that will amplify by when pernet > hooks are in place. Yes. This will apply to more than just nftables when the pernet hooks are in place. The try_module_get prevents this for everything except for nftables today. So in practice this problem has existed since the merge of nftables. The try_module_get shows this problem has existed in some form longer than git. > Regarding the pointer to nf_hook_list, now that new netdevice variant > doesn't support nf_queue yet, so that nf_hook_list will be always > valid since it will point to the global nf_hooks in the core. > >> I tested what would happen if we do not flush queued packets and was >> trivially able to obtain the oops below. All that was required was >> to stop the nf_queue listening process, to delete all of the nf_tables, >> and to awaken the nf_queue listening process. > [...] > > Please, route netfilter patches through the netfilter trees, ie. nf > and nf-next. Whatever works. I just see this as a bug in the networking stack that needs to be fixed. I don't care who I send it to as long as Linus gets it. >> Cc: sta...@vger.kernel.org > > I guess this is a leftover since there is no Cc to stable. 
Anyway, > we have to wait until this hits master before we ask for -stable > inclusion. This is a marker that this should be backported to stable, and the typicall way this is remembered outside of the network trees. The stable folks grep the git log for Cc: stable... > More comments below. Thanks for this fix BTW. > >> Signed-off-by: "Eric W. Biederman" >> --- >> >> Apologies for the duplicate send but I forgot to include the appropriate >> mailing lists. >> >> include/net/netfilter/nf_queue.h | 2 ++ >> net/netfilter/core.c | 1 + >> net/netfilter/nf_internals.h | 1 + >> net/netfilter/nf_queue.c | 17 + >> net/netfilter/nfnetlink_queue_core.c | 24 +++- >> 5 files changed, 44 insertions(+), 1 deletion(-) >> >> diff --git a/include/net/netfilter/nf_queue.h >> b/include/net/netfilter/nf_queue.h >> index d81d584157e1..e8635854a55b 100644 >> --- a/include/net/netfilter/nf_queue.h >> +++ b/include/net/netfilter/nf_queue.h >> @@ -24,6 +24,8 @@ struct nf_queue_entry { >> struct nf_queue_handler { >> int (*outfn)(struct nf_queue_entry *entry, >> unsigned int queuenum); >> +void(*nf_hook_drop)(struct net *net, >> +struct nf_hook_ops *ops); >> }; >> >> void nf_register_queue_handler(const struct nf_queue_handler *qh); >> diff --git a/net/netfilter/core.c b/net/netfilter/core.c >> index 653e32eac08c..a0e54974e2c9 100644 >> --- a/net/netfilter/core.c >> +++ b/net/netfilter/core.c >> @@ -118,6 +118,7 @@ void nf_unregister_hook(struct nf_hook_ops *reg) >> static_key_slow_dec(&nf_hooks_needed[reg->pf][reg->hooknum]); >> #endif >> synchronize_net(); >> +nf_queue_nf_hook_drop(reg); >> } >> EXPORT_SYMBOL(nf_unregister_hook); >> >> diff --git a/net/netfilter/nf_internals.h b/net/netfilter/nf_internals.h >> index ea7f36784b3d..399210693c2a 100644 >> --- a/net/netfilter/nf_internals.h >> +++ b/net/netfilter/nf_internals.h >> @@ -19,6 +19,7 @@ unsigned int nf_iterate(struct list_head *head, struct >> sk_buff *skb, >> /* nf_queue.c */ >> int nf_queue(struct sk_buff *skb, struct nf_hook_ops 
*elem, >> struct nf_hook_state *state, unsigned int queuenum); >> +void nf_queue_nf_hook_drop(struct nf_hook_ops *ops); >> int __init netfilter_queue_init(void); >> >> /* nf_log.c */ >> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c >> index 2e88032cd5ad..cd60d397fe05 100644 >> --- a/net/netfilter/nf_queue.c >> +++ b/net/netfilter/nf_queue.c >> @@ -105,6 +105,23 @@ bool nf_queue_entry_get_refs(struct nf_queue_entry >> *entry) >> } >> EXPORT_SYMBOL_GPL(nf_queue_entry_get_refs); >> >> +void nf_queue_nf_hook_drop(struct nf_hook_ops *ops) > > I'd suggest you rename all these 'nf_hook_drop' to 'flush'. The functions in nfnetfilter_queue_core.c are also named drop, and I am not in a mood to change the convention. >> +{ >> +const struct nf_queue_handler
linux-next: build warnings after merge of the net-next tree
Hi all,

After merging the net-next tree, today's linux-next build (i386 defconfig)
produced these warnings:

In file included from include/net/netfilter/nf_conntrack_tuple.h:13:0,
                 from include/linux/netfilter/nf_conntrack_dccp.h:28,
                 from include/net/netfilter/nf_conntrack.h:22,
                 from net/netfilter/nf_conntrack_core.c:37:
include/linux/netfilter/x_tables.h: In function 'xt_percpu_counter_alloc':
include/linux/netfilter/x_tables.h:373:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   return (__force u64) res;
          ^
include/linux/netfilter/x_tables.h: In function 'xt_percpu_counter_free':
include/linux/netfilter/x_tables.h:381:15: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
   free_percpu((void __percpu *) pcnt);
               ^
In file included from include/asm-generic/percpu.h:6:0,
                 from arch/x86/include/asm/percpu.h:551,
                 from arch/x86/include/asm/preempt.h:5,
                 from include/linux/preempt.h:64,
                 from include/linux/spinlock.h:50,
                 from include/linux/mm_types.h:8,
                 from include/linux/kmemcheck.h:4,
                 from include/linux/skbuff.h:18,
                 from include/linux/netfilter.h:5,
                 from net/netfilter/nf_conntrack_core.c:16:
include/linux/netfilter/x_tables.h: In function 'xt_get_this_cpu_counter':
include/linux/netfilter/x_tables.h:388:23: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
   return this_cpu_ptr((void __percpu *) cnt->pcnt);
                       ^
include/linux/percpu-defs.h:206:47: note: in definition of macro '__verify_pcpu_ptr'
  const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL; \
                                               ^
include/linux/percpu-defs.h:239:27: note: in expansion of macro 'raw_cpu_ptr'
 #define this_cpu_ptr(ptr) raw_cpu_ptr(ptr)
                           ^
include/linux/netfilter/x_tables.h:388:10: note: in expansion of macro 'this_cpu_ptr'
   return this_cpu_ptr((void __percpu *) cnt->pcnt);
          ^

and many more.

Introduced by commit:

  71ae0dff02d7 ("netfilter: xtables: use percpu rule counters")

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
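These warnings appear on i386 because a pointer is 32 bits wide there while u64 is 64 bits, and GCC flags any direct cast between a pointer and an integer of a different size. A common way to make such a conversion explicit is to go through an integer type that has pointer width. The userspace sketch below uses uintptr_t for that (kernel code conventionally uses unsigned long). It shows the general technique only and is not necessarily the fix that was applied to x_tables.h; the helper names are hypothetical.

```c
#include <stdint.h>

/* Store a pointer in a 64-bit field: widen via a pointer-sized
 * integer first, so no cast changes size and pointer-ness at once. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Recover the pointer: narrow to pointer width, then convert. */
static void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

The round trip preserves the pointer on both 32-bit and 64-bit targets, since the only narrowing step discards bits that the widening step set to zero.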
Re: [PATCH 00/32] Netfilter updates for net-next
On Mon, 15 Jun 2015 23:25:57 +0200, Pablo Neira Ayuso wrote: > Hi David, > > This a bit large (and late) patchset that contains Netfilter updates for > net-next. Most relevantly br_netfilter fixes, ipset RCU support, removal of > x_tables percpu ruleset copy and rework of the nf_tables netdev support. More > specifically, they are: [...] > > Bernhard Thaler (7): > netfilter: bridge: refactor clearing BRNF_NF_BRIDGE_PREROUTING > netfilter: bridge: re-order br_nf_pre_routing_finish_ipv6() > netfilter: bridge: detect NAT66 correctly and change MAC address > netfilter: bridge: refactor frag_max_size > netfilter: bridge: rename br_parse_ip_options > netfilter: bridge: re-order check_hbh_len() > netfilter: bridge: forward IPv6 fragmented packets Pablo, Bernhard, this batch breaks builds with CONFIG_IPV6=n. No idea why build bot didn't catch that. linux/net/bridge/br_netfilter.c: In function ‘br_validate_ipv6’: /home/kuba/Development/Linux/linux/net/bridge/br_netfilter.c:350:618: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:350:706: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:350:915: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:350:964: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:350:1031: error: request for member ‘syncp’ in something not a structure or union IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:350:1044: error: request for member ‘mibs’ in something not a structure or union IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:350:1113: error: request for member ‘syncp’ in something not a structure or union IP6_INC_STATS_BH(dev_net(dev), idev, 
linux/net/bridge/br_netfilter.c:355:613: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:355:701: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:355:910: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:355:959: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:355:1026: error: request for member ‘syncp’ in something not a structure or union IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:355:1039: error: request for member ‘mibs’ in something not a structure or union IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:355:1103: error: request for member ‘syncp’ in something not a structure or union IP6_INC_STATS_BH(dev_net(dev), idev, linux/net/bridge/br_netfilter.c:370:612: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, IPSTATS_MIB_INHDRERRORS); linux/net/bridge/br_netfilter.c:370:700: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, IPSTATS_MIB_INHDRERRORS); linux/net/bridge/br_netfilter.c:370:909: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, IPSTATS_MIB_INHDRERRORS); linux/net/bridge/br_netfilter.c:370:958: error: ‘struct netns_mib’ has no member named ‘ipv6_statistics’ IP6_INC_STATS_BH(dev_net(dev), idev, IPSTATS_MIB_INHDRERRORS); linux/net/bridge/br_netfilter.c:370:1025: error: request for member ‘syncp’ in something not a structure or union IP6_INC_STATS_BH(dev_net(dev), idev, IPSTATS_MIB_INHDRERRORS); linux/net/bridge/br_netfilter.c:370:1038: error: request for member ‘mibs’ in something not a structure or union 
IP6_INC_STATS_BH(dev_net(dev), idev, IPSTATS_MIB_INHDRERRORS); linux/net/bridge/br_netfilter.c:370:1103: error: request for member ‘syncp’ in something not a structure or union IP6_INC_STATS_BH(dev_net(dev), idev, IPSTATS_MIB_INHDRERRORS); -- To unsubscribe from this list: send the line "unsubscribe netdev" in
Re: [PATCH 2/3] rcar_can: print signed IRQ #
Hi Sergei,

On Sat, Jun 20, 2015 at 2:33 AM, Sergei Shtylyov wrote:
> Printing IRQ # using "%x" and "%u" unsigned formats isn't quite correct as
> 'ndev->irq' is of type *int*, so the "%d" format needs to be used instead.
>
> While fixing this, beautify the dev_info() message in rcar_can_probe() a bit.

If you change the message, why don't you make it consistent
("interrupt" vs. "IRQ")?

> Fixes: fd1159318e55 ("can: add Renesas R-Car CAN driver")
> Signed-off-by: Sergei Shtylyov
>
> ---
>  drivers/net/can/rcar_can.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: linux-can/drivers/net/can/rcar_can.c
> ===
> --- linux-can.orig/drivers/net/can/rcar_can.c
> +++ linux-can/drivers/net/can/rcar_can.c
> @@ -526,7 +526,7 @@ static int rcar_can_open(struct net_devi
>  	napi_enable(&priv->napi);
>  	err = request_irq(ndev->irq, rcar_can_interrupt, 0, ndev->name, ndev);
>  	if (err) {
> -		netdev_err(ndev, "error requesting interrupt %x\n", ndev->irq);
> +		netdev_err(ndev, "error requesting interrupt %d\n", ndev->irq);
>  		goto out_close;
>  	}
>  	can_led_event(ndev, CAN_LED_EVENT_OPEN);
> @@ -824,7 +824,7 @@ static int rcar_can_probe(struct platfor
>
>  	devm_can_led_init(ndev);
>
> -	dev_info(&pdev->dev, "device registered (reg_base=%p, irq=%u)\n",
> +	dev_info(&pdev->dev, "device registered (regs @ %p, IRQ%d)\n",
>  		 priv->regs, ndev->irq);

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
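The point of the patch is that a conversion specifier should match the signedness of its argument. The small userspace sketch below shows how the same int reads under "%d" and under "%x", with an explicit cast in the hex case so the call stays well defined. The helper names are hypothetical, and the negative value is used purely to make the difference visible, not because the driver is known to see one.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format an IRQ number as a signed decimal, as the patched messages do. */
static int irq_to_str_signed(char *buf, size_t len, int irq)
{
	return snprintf(buf, len, "%d", irq);
}

/* Format the same value the way an unsigned hex conversion shows it. */
static int irq_to_str_hex(char *buf, size_t len, int irq)
{
	return snprintf(buf, len, "%x", (unsigned int)irq);
}
```

For 42 the two helpers yield "42" and "2a". For -6 the signed form yields "-6", while the hex form prints the value reinterpreted as a large unsigned number, which is rarely what a log reader expects.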
Re: [PATCH net] netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook
On 20.06, Pablo Neira Ayuso wrote:
> On Fri, Jun 19, 2015 at 02:03:39PM -0500, Eric W. Biederman wrote:
> >
> > Add code to nf_unregister_hook to flush the nf_queue when a hook is
> > unregistered.  This guarantees that the pointer that the nf_queue code
> > retains into the nf_hook list will remain valid while a packet is
> > queued.
>
> I think the real problem is that struct nf_queue_entry holds a pointer
> to struct nf_hook_ops, which will be gone after removal. So you
> uncovered a long standing problem that will be amplified when pernet
> hooks are in place.
>
> Regarding the pointer to nf_hook_list, now that new netdevice variant
> doesn't support nf_queue yet, so that nf_hook_list will be always
> valid since it will point to the global nf_hooks in the core.

I think Eric's patch is the right thing to do. I'm not sure I get your
netdev comment, but we certainly do want to drop packets once a hook
is gone.

> > +{
> > +	const struct nf_queue_handler *qh;
> > +	struct net *net;
> > +
> > +	rtnl_lock();
>
> Why rtnl_lock() here?

for_each_net(). It would actually be nice to have a variant that doesn't
need the rtnl, since it makes locking order analysis a lot harder.

> > +	rcu_read_lock();
> > +	qh = rcu_dereference(queue_handler);
> > +	if (qh) {
> > +		for_each_net(net) {
> > +			qh->nf_hook_drop(net, ops);
> > +		}
> > +	}
> > +	rcu_read_unlock();
> > +	rtnl_unlock();
Re: [PATCH net] netfilter: nf_queue: Don't recompute the hook_list head
On Fri, Jun 19, 2015 at 05:23:37PM -0500, Eric W. Biederman wrote:
>
> If someone sends packets from one of the netdevice ingress hooks to
> the a userspace queue, and then userspace later accepts the packet,
> the netfilter code can enter an infinite loop as the list head will
> never be found.
>
> Pass in the saved list_head to avoid this.

There is no userspace queueing for netdevice yet, so this can be routed
through nf-next. Thanks.

> Signed-off-by: "Eric W. Biederman"
> ---
>  net/netfilter/nf_queue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index cd60d397fe05..8a8b2abc35ff 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -213,7 +213,7 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
>
>  	if (verdict == NF_ACCEPT) {
>  	next_hook:
> -		verdict = nf_iterate(&nf_hooks[entry->state.pf][entry->state.hook],
> +		verdict = nf_iterate(entry->state.hook_list,
>  				     skb, &entry->state, &elem);
>  	}
>
> --
> 2.2.1
>
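The infinite loop described in the quoted commit message can be reproduced with a small user-space model. This is a sketch under assumptions, not the netfilter code: struct node, list_init(), list_add_tail() and walk_until_head() are hypothetical stand-ins for the kernel's struct list_head helpers, and the max_steps bound exists only so the demonstration terminates. Walking a circular list from a saved element ends only when the head of that same list is reached; given the head of a different list, the walk never finds it.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, modelled loosely on the
 * kernel's struct list_head (hypothetical user-space stand-in). */
struct node {
	struct node *next;
	struct node *prev;
};

static void list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct node *entry, struct node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Continue walking from 'pos' until 'head' is reached.
 * Returns the number of entries visited, or -1 if 'head' was not found
 * within 'max_steps' (the case where an unbounded walk would spin forever).
 */
static int walk_until_head(const struct node *pos, const struct node *head,
			   int max_steps)
{
	int visited = 0;

	assert(pos != NULL && head != NULL);
	for (pos = pos->next; pos != head; pos = pos->next) {
		if (++visited > max_steps)
			return -1;
	}
	return visited;
}
```

If elements a and b sit on one list and the walk resumes from a, passing that list's own head visits one more entry and stops, while passing the head of an unrelated list returns -1. Passing the saved head, as the patch does, corresponds to the first case.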
Re: [PATCH net] netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook
On Fri, Jun 19, 2015 at 02:03:39PM -0500, Eric W. Biederman wrote:
>
> Add code to nf_unregister_hook to flush the nf_queue when a hook is
> unregistered.  This guarantees that the pointer that the nf_queue code
> retains into the nf_hook list will remain valid while a packet is
> queued.

I think the real problem is that struct nf_queue_entry holds a pointer
to struct nf_hook_ops, which will be gone after removal. So you
uncovered a long-standing problem that will be amplified when pernet
hooks are in place.

Regarding the pointer to nf_hook_list: the new netdevice variant
doesn't support nf_queue yet, so nf_hook_list will always be valid
since it will point to the global nf_hooks in the core.

> I tested what would happen if we do not flush queued packets and was
> trivially able to obtain the oops below.  All that was required was
> to stop the nf_queue listening process, to delete all of the nf_tables,
> and to awaken the nf_queue listening process.
[...]

Please route netfilter patches through the netfilter trees, i.e. nf
and nf-next.

> Cc: sta...@vger.kernel.org

I guess this is a leftover since there is no Cc to stable. Anyway, we
have to wait until this hits master before we ask for -stable
inclusion.

More comments below. Thanks for this fix BTW.

> Signed-off-by: "Eric W. Biederman"
> ---
>
> Apologies for the duplicate send but I forgot to include the appropriate
> mailing lists.
>
>  include/net/netfilter/nf_queue.h | 2 ++
>  net/netfilter/core.c | 1 +
>  net/netfilter/nf_internals.h | 1 +
>  net/netfilter/nf_queue.c | 17 +
>  net/netfilter/nfnetlink_queue_core.c | 24 +++-
>  5 files changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/netfilter/nf_queue.h b/include/net/netfilter/nf_queue.h
> index d81d584157e1..e8635854a55b 100644
> --- a/include/net/netfilter/nf_queue.h
> +++ b/include/net/netfilter/nf_queue.h
> @@ -24,6 +24,8 @@ struct nf_queue_entry {
>  struct nf_queue_handler {
>  	int	(*outfn)(struct nf_queue_entry *entry,
>  			 unsigned int queuenum);
> +	void	(*nf_hook_drop)(struct net *net,
> +				struct nf_hook_ops *ops);
>  };
>
>  void nf_register_queue_handler(const struct nf_queue_handler *qh);
> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
> index 653e32eac08c..a0e54974e2c9 100644
> --- a/net/netfilter/core.c
> +++ b/net/netfilter/core.c
> @@ -118,6 +118,7 @@ void nf_unregister_hook(struct nf_hook_ops *reg)
>  	static_key_slow_dec(&nf_hooks_needed[reg->pf][reg->hooknum]);
>  #endif
>  	synchronize_net();
> +	nf_queue_nf_hook_drop(reg);
>  }
>  EXPORT_SYMBOL(nf_unregister_hook);
>
> diff --git a/net/netfilter/nf_internals.h b/net/netfilter/nf_internals.h
> index ea7f36784b3d..399210693c2a 100644
> --- a/net/netfilter/nf_internals.h
> +++ b/net/netfilter/nf_internals.h
> @@ -19,6 +19,7 @@ unsigned int nf_iterate(struct list_head *head, struct sk_buff *skb,
>  /* nf_queue.c */
>  int nf_queue(struct sk_buff *skb, struct nf_hook_ops *elem,
>  	     struct nf_hook_state *state, unsigned int queuenum);
> +void nf_queue_nf_hook_drop(struct nf_hook_ops *ops);
>  int __init netfilter_queue_init(void);
>
>  /* nf_log.c */
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index 2e88032cd5ad..cd60d397fe05 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -105,6 +105,23 @@ bool nf_queue_entry_get_refs(struct nf_queue_entry *entry)
>  }
>  EXPORT_SYMBOL_GPL(nf_queue_entry_get_refs);
>
> +void nf_queue_nf_hook_drop(struct nf_hook_ops *ops)

I'd suggest you rename all these 'nf_hook_drop' to 'flush'.

> +{
> +	const struct nf_queue_handler *qh;
> +	struct net *net;
> +
> +	rtnl_lock();

Why rtnl_lock() here?

> +	rcu_read_lock();
> +	qh = rcu_dereference(queue_handler);
> +	if (qh) {
> +		for_each_net(net) {
> +			qh->nf_hook_drop(net, ops);
> +		}
> +	}
> +	rcu_read_unlock();
> +	rtnl_unlock();
> +}
> +
>  /*
>   * Any packet that leaves via this function must come back
>   * through nf_reinject().
Performance loss due to commit 37c3185 ([NET]: Added GSO toggle)
Hello Herbert,

In commit 37c3185a02d4b85fbe134bf5204535405dd2c957 ("[NET]: Added GSO toggle"),
you force NETIF_F_HW_CSUM if the GSO feature is selected.

By default, SW GSO is active as soon as a network board has the NETIF_F_SG
feature. This means that sk_setup_caps() forces NETIF_F_HW_CSUM for any
board having NETIF_F_SG.

For boards having no HW checksum capability, this results in a performance
loss: the data copy is done in skb_do_copy_data_nocache() with
copy_from_user() and the checksum is computed later with csum_partial(),
instead of both being done at the same time using
csum_and_copy_from_user().

Is there a reason for forcing NETIF_F_HW_CSUM?

Christophe
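The cost difference described above can be illustrated in user space. This is a sketch under assumptions: copy_then_csum() and csum_and_copy_sketch() are hypothetical functions that only model the two strategies with a 16-bit ones'-complement sum; they are not the kernel's copy_from_user(), csum_partial() or csum_and_copy_from_user(). The first version touches the data twice, the second computes the checksum while copying.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a wide accumulator into a 16-bit ones'-complement sum. */
static uint16_t fold16(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Two passes over the data: copy first, then checksum the copy. */
static uint16_t copy_then_csum(uint8_t *dst, const uint8_t *src, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	assert(dst != NULL && src != NULL);
	memcpy(dst, src, len);                          /* pass 1: copy */
	for (i = 0; i + 1 < len; i += 2)                /* pass 2: checksum */
		sum += ((uint32_t)dst[i] << 8) | dst[i + 1];
	if (i < len)
		sum += (uint32_t)dst[i] << 8;           /* odd trailing byte */
	return fold16(sum);
}

/* One pass: each byte is copied and added to the sum in the same loop. */
static uint16_t csum_and_copy_sketch(uint8_t *dst, const uint8_t *src,
				     size_t len)
{
	uint64_t sum = 0;
	size_t i;

	assert(dst != NULL && src != NULL);
	for (i = 0; i + 1 < len; i += 2) {
		dst[i] = src[i];
		dst[i + 1] = src[i + 1];
		sum += ((uint32_t)src[i] << 8) | src[i + 1];
	}
	if (i < len) {
		dst[i] = src[i];
		sum += (uint32_t)src[i] << 8;           /* odd trailing byte */
	}
	return fold16(sum);
}
```

Both functions produce the same copy and the same sum; the single-pass version simply avoids reading the buffer a second time, which is the saving the message refers to on hardware without checksum offload.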
Re: [PATCH v2] bpf: BPF based latency tracing
On 06/19/2015 04:00 PM, Daniel Wagner wrote:

BPF offers another way to generate latency histograms. We attach
kprobes at trace_preempt_off and trace_preempt_on and calculate the
time it takes from seeing the off transition to seeing the on transition.

The first array is used to store the start time stamp. The key is the
CPU id. The second array stores the log2(time diff). We need to use
static allocation here (arrays and not hash tables). The kprobes
hooking into trace_preempt_on|off should not call any dynamic
memory allocation or free path. We need to avoid getting called
recursively. Besides that, it reduces jitter in the measurement.

CPU 0
latency : count distribution
1 -> 1 : 0 ||
2 -> 3 : 0 ||
4 -> 7 : 0 ||
8 -> 15 : 0 ||
16 -> 31 : 0 ||
32 -> 63 : 0 ||
64 -> 127 : 0 ||
128 -> 255 : 0 ||
256 -> 511 : 0 ||
512 -> 1023 : 0 ||
1024 -> 2047 : 0 ||
2048 -> 4095 : 166723 |*** |
4096 -> 8191 : 19870 |*** |
8192 -> 16383 : 6324 ||
16384 -> 32767 : 1098 ||
32768 -> 65535 : 190 ||
65536 -> 131071 : 179 ||
131072 -> 262143 : 18 ||
262144 -> 524287 : 4 ||
524288 -> 1048575 : 1363 ||

CPU 1
latency : count distribution
1 -> 1 : 0 ||
2 -> 3 : 0 ||
4 -> 7 : 0 ||
8 -> 15 : 0 ||
16 -> 31 : 0 ||
32 -> 63 : 0 ||
64 -> 127 : 0 ||
128 -> 255 : 0 ||
256 -> 511 : 0 ||
512 -> 1023 : 0 ||
1024 -> 2047 : 0 ||
2048 -> 4095 : 114042 |*** |
4096 -> 8191 : 9587 |** |
8192 -> 16383 : 4140 ||
16384 -> 32767 : 673 ||
32768 -> 65535 : 179 ||
65536 -> 131071 : 29 ||
131072 -> 262143 : 4 ||
262144 -> 524287 : 1 ||
524288 -> 1048575 : 364 ||

CPU 2
latency : count distribution
1 -> 1 : 0 ||
2 -> 3 : 0 ||
4 -> 7 : 0 ||
8 -> 15 : 0 ||
16 -> 31 : 0 ||
32 -> 63 : 0 ||
64 -> 127 : 0 ||
128 -> 255 : 0 ||
256 -> 511 : 0 ||
512 -> 1023 : 0 ||
1024 -> 2047 : 0 ||
2048 -> 4095 : 40147 |*** |
4096 -> 8191 : 2300 |* |
8192 -> 16383 : 828 ||
16384 -> 32767 : 178 ||
32768 -> 65535 : 59