Re: pull-request: bpf 2018-09-02
From: Daniel Borkmann Date: Sun, 2 Sep 2018 23:20:31 +0200 > The following pull-request contains BPF updates for your *net* tree. > > The main changes are: > > 1) Fix one remaining buggy offset override in sockmap's bpf_msg_pull_data() >when linearizing multiple scatterlist elements, from Tushar. > > 2) Fix BPF sockmap's misuse of ULP when a collision with another ULP is >found on map update where it would release existing ULP. syzbot found and >triggered this couple of times now, fix from John. > > 3) Add missing xskmap type to bpftool so it will properly show the type >on map dump, from Prashant. > > Please consider pulling these changes from: > > git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git Pulled, thanks Daniel.
Re: [PATCH net] net/ipv6: Only update MTU metric if it set
From: dsah...@kernel.org Date: Thu, 30 Aug 2018 14:15:43 -0700 > From: David Ahern > > Jan reported a regression after an update to 4.18.5. In this case ipv6 > default route is setup by systemd-networkd based on data from an RA. The > RA contains an MTU of 1492 which is used when the route is first inserted > but then systemd-networkd pushes down updates to the default route > without the mtu set. > > Prior to the change to fib6_info, metrics such as MTU were held in the > dst_entry and rt6i_pmtu in rt6_info contained an update to the mtu if > any. ip6_mtu would look at rt6i_pmtu first and use it if set. If not, > the value from the metrics is used if it is set and finally falling > back to the idev value. > > After the fib6_info change metrics are contained in the fib6_info struct > and there is no equivalent to rt6i_pmtu. To maintain consistency with > the old behavior the new code should only reset the MTU in the metrics > if the route update has it set. > > Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info") > Reported-by: Jan Janssen > Signed-off-by: David Ahern Applied and queued up for -stable, thanks David.
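The MTU lookup order described in the quoted message, and the rule the fix restores, can be sketched in plain userspace C. This is only an illustration under stated assumptions: the names old_route, route_update, old_ip6_mtu and apply_route_update are hypothetical and are not the kernel's actual structures or functions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the pre-fib6_info state: a cached PMTU
 * (rt6i_pmtu in the message) plus the MTU metric; 0 means "not set". */
struct old_route {
	uint32_t pmtu;
	uint32_t metric_mtu;
};

/* Lookup order described in the message: PMTU, then metric, then idev. */
static uint32_t old_ip6_mtu(const struct old_route *rt, uint32_t idev_mtu)
{
	if (rt->pmtu)
		return rt->pmtu;
	if (rt->metric_mtu)
		return rt->metric_mtu;
	return idev_mtu;
}

/* Hypothetical route update: has_mtu is 0 when userspace sent no MTU. */
struct route_update {
	int has_mtu;
	uint32_t mtu;
};

/* Rule from the fix: only overwrite the metric when the update carries one. */
static void apply_route_update(uint32_t *metric_mtu,
			       const struct route_update *u)
{
	if (u->has_mtu)
		*metric_mtu = u->mtu;
}
```

With this rule, an update that carries no MTU leaves the 1492 learned from the RA in place instead of clearing it.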
Re: [PATCH net-next] net/sched: fix type of htb statistics
From: Florent Fourcot Date: Thu, 30 Aug 2018 16:39:23 +0200 > tokens and ctokens are defined as s64 in htb_class structure, > and clamped to 32bits value during netlink dumps: > > cl->xstats.tokens = clamp_t(s64, PSCHED_NS2TICKS(cl->tokens), > INT_MIN, INT_MAX); > > Defining it as u32 is working since userspace (tc) is printing it as > signed int, but a correct definition from the beginning is probably > better. > > In the same time, 'giants' structure member is unused since years, so > update the comment to mark it unused. > > Signed-off-by: Florent Fourcot Looks good, applied.
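Why the u32 definition "works" can be shown with a small userspace sketch. The helper clamp_s64_to_int stands in for the kernel's clamp_t(s64, ..., INT_MIN, INT_MAX), and the two struct layouts are hypothetical simplifications, not the real xstats definition.

```c
#include <limits.h>
#include <stdint.h>

/* Userspace stand-in for clamp_t(s64, v, INT_MIN, INT_MAX). */
static int64_t clamp_s64_to_int(int64_t v)
{
	if (v < INT_MIN)
		return INT_MIN;
	if (v > INT_MAX)
		return INT_MAX;
	return v;
}

/* Hypothetical layouts: the old unsigned field and the signed one. */
struct xstats_u32 { uint32_t tokens; };
struct xstats_s32 { int32_t tokens; };

/* What a reader printing the field as a signed int effectively does.
 * The conversion is implementation-defined in C; common compilers
 * wrap modulo 2^32, which is what this sketch assumes. */
static int32_t read_as_signed(uint32_t raw)
{
	return (int32_t)raw;
}
```

A negative token count stored in the unsigned field only looks right because the reader reinterprets the bits; the signed field carries the value directly.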
Re: [PATCH 1/2] dt-bindings: net: cpsw: Document cpsw-phy-sel usage but prefer phandle
From: Tony Lindgren Date: Wed, 29 Aug 2018 08:00:23 -0700 > The current cpsw usage for cpsw-phy-sel is undocumented but is used for > all the boards using cpsw. And cpsw-phy-sel is not really a child of > the cpsw device, it lives in the system control module instead. > > Let's document the existing usage, and improve it a bit where we prefer > to use a phandle instead of a child device for it. That way we can > properly describe the hardware in dts files for things like genpd. > > Signed-off-by: Tony Lindgren Applied.
Re: [PATCH 2/2] net: ethernet: cpsw-phy-sel: prefer phandle for phy sel
From: Tony Lindgren Date: Wed, 29 Aug 2018 08:00:24 -0700 > The cpsw-phy-sel device is not a child of the cpsw interconnect target > module. It lives in the system control module. > > Let's fix this issue by trying to use cpsw-phy-sel phandle first if it > exists and if not fall back to current usage of trying to find the > cpsw-phy-sel child. That way the phy sel driver can be a child of the > system control module where it belongs in the device tree. > > Without this fix, we cannot have a proper interconnect target module > hierarchy in device tree for things like genpd. > > Note that deferred probe is mostly not supported by cpsw and this patch > does not attempt to fix that. In case deferred probe support is needed, > this could be added to cpsw_slave_open() and phy_connect() so they start > handling and returning errors. > > For documenting it, looks like the cpsw-phy-sel is used for all cpsw device > tree nodes. It's missing the related binding documentation, so let's also > update the binding documentation accordingly. > > Signed-off-by: Tony Lindgren Applied.
Re: [PATCH net 0/2] igmp: fix two incorrect unsolicit report count issues
From: Hangbin Liu Date: Wed, 29 Aug 2018 18:06:07 +0800 > Just like the subject, fix two minor igmp unsolicit report count issues. Series applied, thanks.
Re: [PATCH RFC net-next 00/18] net: Improve route scalability via support for nexthop objects
From: dsah...@kernel.org Date: Fri, 31 Aug 2018 17:49:35 -0700 > Examples > 1. Single path > $ ip nexthop add id 1 via 10.99.1.2 dev veth1 > $ ip route add 10.1.1.0/24 nhid 1 > > $ ip next ls > id 1 via 10.99.1.2 src 10.99.1.1 dev veth1 scope link > > $ ip ro ls > 10.1.1.0/24 nhid 1 scope link > ... First of all, this whole idea is awesome! But, you knew that already. :) However, I worry about what happens in a mixed environment where we have routing daemons and tools inserting nexthop based routes, and some doing things the old way using and expecting inline nexthop information in the routes. That mixed environment situation has to function correctly. Older apps have to see the per-route nexthop info in the format and layout they expect (gw/dev pairs). They cannot be expected to just suddenly understand the nexthop ID etc. Otherwise the concept and ideas are fine, so as long as you can resolve the mixed environment situation I fully support this work and look forward to it being in a state where I can integrate it :-)
Re: [PATCH] cxgb4: fix abort_req_rss6 struct
From: Steve Wise Date: Fri, 31 Aug 2018 11:52:00 -0700 > Remove the incorrect WR_HDR field which can cause a misinterpretation > of this CPL by ULDs. > > Fixes: a3cdaa69e4ae ("cxgb4: Adds CPL support for Shared Receive Queues") > Signed-off-by: Steve Wise > --- > > Dave, Doug, and Jason, > > I request this merge through the rdma repo since the only user of this > structure is iw_cxgb4. No objections from me.
Re: [PATCH net-next] net: dsa: b53: Provide sensible defaults
From: Florian Fainelli Date: Fri, 31 Aug 2018 12:29:49 -0700 > The SRAB driver is the default way to communicate with the integrated > switch on iProc platforms and the MMAP driver is the way to communicate > with the integrated switch on DSL BCM63xx and CM BCM33xx. > > Signed-off-by: Florian Fainelli Applied.
Re: [PATCH net-next] cxgb4: collect hardware queue descriptors
From: Rahul Lakkireddy Date: Fri, 31 Aug 2018 18:16:34 +0530 > diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c > b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c > index d97e0d7e541a..02fc350f81c9 100644 > --- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c > +++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c ... > +static inline u32 cudbg_uld_txq_to_qtype(u32 uld) Do not use inline in foo.c files, let the compiler decide.
Re: [PATCH v2 net-next] liquidio: remove set but not used variable 'irh'
From: YueHaibing Date: Fri, 31 Aug 2018 12:03:56 + > Fixes gcc '-Wunused-but-set-variable' warning: > > drivers/net/ethernet/cavium/liquidio/request_manager.c: In function > 'lio_process_iq_request_list': > drivers/net/ethernet/cavium/liquidio/request_manager.c:383:27: warning: > variable 'irh' set but not used [-Wunused-but-set-variable] > > Signed-off-by: YueHaibing > --- > v2: fix patch description,remove 'cHECK-' Applied, thanks.
Re: [PATCH net-next 1/1] qed: Lower the severity of a dcbx log message.
From: Sudarsana Reddy Kalluru Date: Fri, 31 Aug 2018 04:10:17 -0700 > Driver displays an error message for each unrecognized dcbx TLV that's > received from the peer or configured on the device. It is observed that > syslog will be flooded with such messages in certain scenarios e.g., > frequent link-flaps/lldp-transactions. Changing the severity of this > message to verbose level as it's not an error scenario/message. > > Signed-off-by: Sudarsana Reddy Kalluru Applied.
Re: [PATCH net-next v2] net/tls: Add support for async decryption of tls records
From: Vakul Garg Date: Sun, 2 Sep 2018 02:28:00 + > I do not find this patch in tree yet. > Can you please check? Thanks and Regards. The perils of working on two different machines :-) It should be there now, sorry about that.
Re: [PATCH net-next] net: remove duplicated include from net_failover.c
From: YueHaibing Date: Fri, 31 Aug 2018 03:44:27 + > Remove duplicated include. > > Signed-off-by: YueHaibing Applied, thanks.
Re: [PATCH net] ipv6: don't get lwtstate twice in ip6_rt_copy_init()
From: Alexey Kodanev Date: Thu, 30 Aug 2018 19:11:24 +0300 > Commit 80f1a0f4e0cd ("net/ipv6: Put lwtstate when destroying fib6_info") > partially fixed the kmemleak [1], lwtstate can be copied from fib6_info, > with ip6_rt_copy_init(), and it should be done only once there. > > rt->dst.lwtstate is set by ip6_rt_init_dst(), at the start of the function > ip6_rt_copy_init(), so there is no need to get it again at the end. > > With this patch, lwtstate also isn't copied from RTF_REJECT routes. ... > Fixes: 6edb3c96a5f0 ("net/ipv6: Defer initialization of dst to data path") > Signed-off-by: Alexey Kodanev Applied and queued up for -stable, thanks.
Re: [PATCH net-next] net: stmmac: Add CBS support in XGMAC2
From: Jose Abreu Date: Thu, 30 Aug 2018 15:09:48 +0100 > XGMAC2 uses the same CBS mechanism as GMAC5, only registers offset > changes. Lets use the same TC callbacks and implement the .config_cbs > callback in XGMAC2 core. > > Signed-off-by: Jose Abreu Applied.
Re: [PATCH net-next 1/2] PCI: hv: support reporting serial number as slot information
From: Stephen Hemminger Date: Wed, 29 Aug 2018 09:24:51 -0700 > + spin_lock_irqsave(&hbus->device_list_lock, flags); > + list_for_each_entry(hpdev, &hbus->children, list_entry) { > + if (hpdev->pci_slot) > + continue; > + > + slot_nr = PCI_SLOT(wslot_to_devfn(hpdev->desc.win_slot.slot)); > + snprintf(name, SLOT_NAME_SIZE, "%u", hpdev->desc.ser); > + hpdev->pci_slot = pci_create_slot(hbus->pci_bus, slot_nr, > + name, NULL); pci_create_slot() takes a mutex, therefore you can't hold a spinlock or disable interrupts here.
Re: [PATCH] neighbour: confirm neigh entries when ARP packet is received
From: Vasily Khoruzhick Date: Tue, 28 Aug 2018 19:48:25 -0700 > Update 'confirmed' timestamp when ARP packet is received. It shouldn't > affect locktime logic and anyway entry can be confirmed by any higher-layer > protocol. Thus it makes no sense not to confirm it when ARP packet is > received. > > Fixes: 77d7123342 ("neighbour: update neigh timestamps iff update is > effective") > > Signed-off-by: Vasily Khoruzhick I'm not so sure. The comment above the code you are moving explains that the current behavior is intentional, and it explains why too. Even if your change is correct, you're now making that comment inaccurate, so you'd have to update it to match the new code. But I still think the current code is intentionally behaving that way, and for good reason.
Re: [PATCH net] ibmvnic: Include missing return code checks in reset function
From: Thomas Falcon Date: Thu, 30 Aug 2018 13:19:53 -0500 > Check the return codes of these functions and halt reset > in case of failure. The driver will remain in a dormant state > until the next reset event, when device initialization will be > re-attempted. > > Signed-off-by: Thomas Falcon Applied.
Re: [PATCH net 2/2] selftests: pmtu: detect correct binary to ping ipv6 addresses
From: Sabrina Dubroca Date: Thu, 30 Aug 2018 16:01:18 +0200 > Some systems don't have the ping6 binary anymore, and use ping for > everything. Detect the absence of ping6 and try to use ping instead. > > Fixes: d1f1b9cbf34c ("selftests: net: Introduce first PMTU test") > Signed-off-by: Sabrina Dubroca > Acked-by: Stefano Brivio Applied.
Re: [PATCH net 1/2] selftests: pmtu: maximum MTU for vti4 is 2^16-1-20
From: Sabrina Dubroca Date: Thu, 30 Aug 2018 16:01:17 +0200 > Since commit 82612de1c98e ("ip_tunnel: restore binding to ifaces with a > large mtu"), the maximum MTU for vti4 is based on IP_MAX_MTU instead of > the mysterious constant 0xFFF8. This makes this selftest fail. > > Fixes: 82612de1c98e ("ip_tunnel: restore binding to ifaces with a large mtu") > Signed-off-by: Sabrina Dubroca > Acked-by: Stefano Brivio Applied.
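The arithmetic in the subject line can be checked with a few lines of C. The macro names below are local to this sketch; it assumes IP_MAX_MTU is 2^16-1, as the subject implies, and that the 20 is the size of an IPv4 header without options.

```c
#include <stdint.h>

/* Assumed values for this sketch only, taken from the subject line. */
#define SKETCH_IP_MAX_MTU   0xFFFFu /* 2^16 - 1 */
#define SKETCH_IPV4_HDR_LEN 20u     /* IPv4 header without options */
#define SKETCH_OLD_CONSTANT 0xFFF8u /* the "mysterious constant" */

/* Maximum vti4 MTU as given in the subject: 2^16 - 1 - 20. */
static uint32_t vti4_max_mtu(void)
{
	return SKETCH_IP_MAX_MTU - SKETCH_IPV4_HDR_LEN;
}
```

That gives 65515 for the new limit, against 65528 for the old constant, which is why a test written against 0xFFF8 no longer passes.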
Re: [PATCH net] tcp: do not restart timewait timer on rst reception
From: Florian Westphal Date: Thu, 30 Aug 2018 14:24:29 +0200 > RFC 1337 says: > ''Ignore RST segments in TIME-WAIT state. > If the 2 minute MSL is enforced, this fix avoids all three hazards.'' > > So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk > expire rather than removing it instantly when a reset is received. > > However, Linux will also re-start the TIME-WAIT timer. > > This causes connect to fail when trying to re-use ports or very long > delays (until syn retry interval exceeds MSL). ... > Without this patch, 'ss' shows restarts of tw timer and last packet is > thus just another pure ack, more than one minute later. > > This restores the original code from commit 283fd6cf0be690a83 > ("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git . > > For some reason the else branch was removed/lost in 1f28b683339f7 > ("Merge in TCP/UDP optimizations and [..]") and timer restart became > unconditional. > > Reported-by: Michal Tesar > Signed-off-by: Florian Westphal Applied and thanks for the packet drill test case :-)
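The behaviour the quoted patch describes can be summarised as a small decision function. This is a userspace sketch, not the kernel code: the enum and function names are hypothetical, and it assumes, based on the "else branch" wording in the message, that the timer restart belongs only to the non-RST path.

```c
/* Hypothetical outcomes for a segment hitting a TIME-WAIT socket. */
enum tw_action {
	TW_KILL,          /* remove the TIME-WAIT socket immediately */
	TW_KEEP_TIMER,    /* ignore the segment, let the timer expire */
	TW_RESTART_TIMER  /* re-arm the TIME-WAIT timer */
};

/* is_rst:  non-zero when the segment carries RST.
 * rfc1337: value of net.ipv4.tcp_rfc1337 (0 or 1). */
static enum tw_action tw_on_segment(int is_rst, int rfc1337)
{
	if (is_rst) {
		if (!rfc1337)
			return TW_KILL;
		/* RFC 1337 mode: ignore the RST and do not touch the timer. */
		return TW_KEEP_TIMER;
	}
	/* The restored else branch: only non-RST segments restart it. */
	return TW_RESTART_TIMER;
}
```

The bug described in the message corresponds to returning TW_RESTART_TIMER for the RST case with rfc1337 set.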
Re: [PATCH net-next] net: dsa: mv88e6xxx: Share main switch IRQ
From: Marek Behún Date: Thu, 30 Aug 2018 02:13:50 +0200 > On some boards the interrupt can be shared between multiple devices. > For example on Turris Mox the interrupt is shared between all switches. > > Signed-off-by: Marek Behun Applied.
Re: [PATCH net-next] net/ipv6: Do not reset nl_net in ip6_route_info_create
From: dsah...@kernel.org Date: Wed, 29 Aug 2018 16:54:01 -0700 > From: David Ahern > > nl_net is set on entry to ip6_route_info_create. Only devices > within that namespace are considered so no need to reset it > before returning. > > Signed-off-by: David Ahern Applied.
Re: [PATCH net-next] net/ipv4: Add extack message that dev is required for ONLINK
From: dsah...@kernel.org Date: Wed, 29 Aug 2018 16:53:27 -0700 > From: David Ahern > > Make IPv4 consistent with IPv6 and return an extack message that the > ONLINK flag requires a nexthop device. > > Signed-off-by: David Ahern Applied.
Re: [PATCH net] nfp: wait for posted reconfigs when disabling the device
From: Jakub Kicinski Date: Wed, 29 Aug 2018 12:46:08 -0700 > To avoid leaking a running timer we need to wait for the > posted reconfigs after netdev is unregistered. In common > case the process of deinitializing the device will perform > synchronous reconfigs which wait for posted requests, but > especially with VXLAN ports being actively added and removed > there can be a race condition leaving a timer running after > adapter structure is freed leading to a crash. > > Add an explicit flush after deregistering and for a good > measure a warning to check if timer is running just before > structures are freed. > > Fixes: 3d780b926a12 ("nfp: add async reconfiguration mechanism") > Signed-off-by: Jakub Kicinski > Reviewed-by: Dirk van der Merwe Applied and queued up for -stable.
Re: [PATCH net-next] tcp: change IPv6 flow-label upon receiving spurious retransmission
From: Yuchung Cheng Date: Wed, 29 Aug 2018 14:53:56 -0700 > Currently a Linux IPv6 TCP sender will change the flow label upon > timeouts to potentially steer away from a data path that has gone > bad. However this does not help if the problem is on the ACK path > and the data path is healthy. In this case the receiver is likely > to receive repeated spurious retransmission because the sender > couldn't get the ACKs in time and has recurring timeouts. > > This patch adds another feature to mitigate this problem. It > leverages the DSACK states in the receiver to change the flow > label of the ACKs to speculatively re-route the ACK packets. > In order to allow triggering on the second consecutive spurious > RTO, the receiver changes the flow label upon sending a second > consecutive DSACK for a sequence number below RCV.NXT. > > Signed-off-by: Yuchung Cheng > Signed-off-by: Neal Cardwell > Signed-off-by: Eric Dumazet Applied.
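The trigger condition in the quoted message, changing the flow label on the second consecutive DSACK below RCV.NXT, can be modelled with a counter. This is a simplified sketch: the struct, the function name and the reset policy are assumptions made for illustration, and the message does not say what happens on a third or later DSACK, so the sketch simply keeps triggering.

```c
/* Hypothetical per-connection receiver state for this sketch. */
struct ack_path_state {
	unsigned int consecutive_dsacks;
};

/* Called once per received data segment.
 * spurious_below_rcv_nxt: non-zero when the segment is a retransmission
 * of data below RCV.NXT, i.e. the receiver answers with a DSACK.
 * Returns non-zero when the sketch would pick a new flow label. */
static int should_change_flow_label(struct ack_path_state *s,
				    int spurious_below_rcv_nxt)
{
	if (!spurious_below_rcv_nxt) {
		/* Assumption: any non-spurious segment breaks the run. */
		s->consecutive_dsacks = 0;
		return 0;
	}
	s->consecutive_dsacks++;
	return s->consecutive_dsacks >= 2;
}
```

A single spurious retransmission does nothing here; only a repeat without intervening progress suggests the ACK path, not the data path, is the one losing packets.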
Re: [PATCH net] Revert "packet: switch kvzalloc to allocate memory"
From: Eric Dumazet Date: Wed, 29 Aug 2018 11:50:12 -0700 > This reverts commit 71e41286203c017d24f041a7cd71abea7ca7b1e0. > > mmap()/munmap() can not be backed by kmalloced pages : > > We fault in : > > VM_BUG_ON_PAGE(PageSlab(page), page); > > unmap_single_vma+0x8a/0x110 > unmap_vmas+0x4b/0x90 > unmap_region+0xc9/0x140 > do_munmap+0x274/0x360 > vm_munmap+0x81/0xc0 > SyS_munmap+0x2b/0x40 > do_syscall_64+0x13e/0x1c0 > entry_SYSCALL_64_after_hwframe+0x42/0xb7 > > Fixes: 71e41286203c ("packet: switch kvzalloc to allocate memory") > Signed-off-by: Eric Dumazet > Reported-by: John Sperbeck > Bisected-by: John Sperbeck Oops, applied, thanks Eric.
Re: [Patch net-nnext] net_sched: add missing tcf_lock for act_connmark
From: Cong Wang Date: Wed, 29 Aug 2018 10:15:36 -0700 > According to the new locking rule, we have to take tcf_lock > for both ->init() and ->dump(), as RTNL will be removed. > However, it is missing for act_connmark. > > Cc: Vlad Buslov > Signed-off-by: Cong Wang Applied.
Re: [Patch net-nnext] Revert "net: sched: act: add extack for lookup callback"
From: Cong Wang Date: Wed, 29 Aug 2018 10:15:35 -0700 > This reverts commit 331a9295de23 ("net: sched: act: add extack for lookup > callback"). > > This extack is never used after 6 months... In fact, it can be just > set in the caller, right after ->lookup(). > > Cc: Alexander Aring > Signed-off-by: Cong Wang Applied.
Re: pull-request: bpf-next 2018-09-01
From: Daniel Borkmann Date: Sat, 1 Sep 2018 02:05:06 +0200 > The following pull-request contains BPF updates for your *net-next* tree. > > The main changes are: > > 1) Add AF_XDP zero-copy support for i40e driver (!), from Björn and Magnus. W00t! > 2) BPF verifier improvements by giving each register its own liveness >chain which allows to simplify and getting rid of skip_callee() logic, >from Edward. > > 3) Add bpf fs pretty print support for percpu arraymap, percpu hashmap >and percpu lru hashmap. Also add generic percpu formatted print on >bpftool so the same can be dumped there, from Yonghong. > > 4) Add bpf_{set,get}sockopt() helper support for TCP_SAVE_SYN and >TCP_SAVED_SYN options to allow reflection of tos/tclass from received >SYN packet, from Nikita. > > 5) Misc improvements to the BPF sockmap test cases in terms of cgroup v2 >interaction and removal of incorrect shutdown() calls, from John. > > 6) Few cleanups in xdp_umem_assign_dev() and xdpsock samples, from Prashant. Pulled, thanks Daniel!
Re: [PATCH net-next v1] selftests/tls: Add test for recv(PEEK) spanning across multiple records
From: Vakul Garg Date: Wed, 29 Aug 2018 15:30:14 +0530 > Added test case to receive multiple records with a single recvmsg() > operation with a MSG_PEEK set. Applied.
Re: [PATCH net-next v2] net/tls: Add support for async decryption of tls records
From: Vakul Garg Date: Wed, 29 Aug 2018 15:26:55 +0530 > When tls records are decrypted using asynchronous accelerators such as > NXP CAAM engine, the crypto apis return -EINPROGRESS. Presently, on > getting -EINPROGRESS, the tls record processing stops till the time the > crypto accelerator finishes off and returns the result. This incurs a > context switch and is not an efficient way of accessing the crypto > accelerators. Crypto accelerators work efficiently when they are queued > with multiple crypto jobs without having to wait for the previous ones > to complete. > > The patch submits multiple crypto requests without having to wait for > previous ones to complete. This has been implemented for records > which are decrypted in zero-copy mode. At the end of recvmsg(), we wait > for all the asynchronous decryption requests to complete. > > The references to records which have been sent for async decryption are > dropped. For cases where record decryption is not possible in zero-copy > mode, asynchronous decryption is not used and we wait for decryption > crypto api to complete. > > For crypto requests executing in async fashion, the memory for > aead_request, sglists and skb etc is freed from the decryption > completion handler. The decryption completion handler wakes up the > sleeping user context when recvmsg() flags that it has done sending > all the decryption requests and there are no more decryption requests > pending to be completed. > > Signed-off-by: Vakul Garg > Reviewed-by: Dave Watson > --- > > Changes since v1: > - Simplified recvmsg() so as to drop reference to skb in case it > was submitted for async decryption. > - Modified tls_sw_advance_skb() to handle case when input skb is > NULL. Applied.
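The submit-many, wait-once pattern from the quoted description can be modelled in single-threaded userspace C. This is a toy model under stated assumptions: dec_ctx, dec_submit, dec_complete and dec_recvmsg_end are hypothetical names, and a real implementation would need atomic counters and a real sleep/wake primitive.

```c
#include <errno.h>

/* Hypothetical bookkeeping for in-flight decryption requests. */
struct dec_ctx {
	int pending; /* requests submitted but not yet completed */
	int notify;  /* set once recvmsg() has submitted everything */
	int woken;   /* set when the completion path wakes the sleeper */
};

/* Queue one request; the accelerator reports it as still in progress. */
static int dec_submit(struct dec_ctx *c)
{
	c->pending++;
	return -EINPROGRESS;
}

/* Completion handler: wake the sleeper only when recvmsg() has flagged
 * that it is done submitting and nothing is left in flight. */
static void dec_complete(struct dec_ctx *c)
{
	c->pending--;
	if (c->pending == 0 && c->notify)
		c->woken = 1;
}

/* End of recvmsg(): flag that submission is finished.
 * Returns 1 if nothing is pending, 0 if the caller has to wait. */
static int dec_recvmsg_end(struct dec_ctx *c)
{
	c->notify = 1;
	return c->pending == 0;
}
```

The point of the design is that dec_submit never blocks, so the accelerator queue stays full and the caller sleeps at most once per recvmsg().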
Re: [net-next v2 00/15][pull request] 40GbE Intel Wired LAN Driver Updates 2018-08-30
From: Jeff Kirsher Date: Thu, 30 Aug 2018 14:11:32 -0700 > This series contains updates to i40e, i40evf and virtchnl. Pulled, thanks Jeff.
Re: [PATCH net v2 0/2] net_sched: reject unknown tcfa_action values
From: Paolo Abeni Date: Wed, 29 Aug 2018 10:22:32 +0200 > As agreed some time ago, this changeset reject unknown tcfa_action values, > instead of changing such values under the hood. > > A tdc test is included to verify the new behavior. > > v1 -> v2: > - helper is now static and renamed according to act_* convention > - updated extack message, according to the new behavior Series applied, thank you.
Re: [PATCH v2] net: mvpp2: initialize port of_node pointer
From: Baruch Siach Date: Wed, 29 Aug 2018 09:44:39 +0300 > Without a valid of_node in struct device we can't find the mvpp2 port > device by its DT node. Specifically, this breaks > of_find_net_device_by_node(). > > For example, the Armada 8040 based Clearfog GT-8K uses Marvell 88E6141 > switch connected to the _eth2 port: > > _mdio { > ... > > switch0: switch0@4 { > compatible = "marvell,mv88e6085"; > ... > > ports { > ... > > port@5 { > reg = <5>; > label = "cpu"; > ethernet = <_eth2>; > }; > }; > }; > }; > > Without this patch, dsa_register_switch() returns -EPROBE_DEFER because > of_find_net_device_by_node() can't find the device_node of the _eth2 > device. > > Reviewed-by: Andrew Lunn > Signed-off-by: Baruch Siach Applied.
Re: [PATCH][net-next] vxlan: reduce dirty cache line in vxlan_find_mac
From: Li RongQing Date: Wed, 29 Aug 2018 11:52:10 +0800 > vxlan_find_mac() unconditionally set f->used for every packet, > this causes a cache miss for every packet, since remote, hlist > and used of vxlan_fdb share the same cache line, which are > accessed when send every packets. > > so f->used is set only if not equal to jiffies, to reduce dirty > cache line times, this gives 3% speed-up with small packets. > > Signed-off-by: Zhang Yu > Signed-off-by: Li RongQing Applied.
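The idea in the quoted patch, skipping the store when the timestamp is already current, fits in a few lines. This is a userspace sketch: fdb_entry and fdb_touch are hypothetical names, and the now argument stands in for the kernel's jiffies counter.

```c
/* Hypothetical, reduced forwarding entry: only the timestamp matters. */
struct fdb_entry {
	unsigned long used;
};

/* Update the last-used timestamp only when it actually changes.
 * Returns 1 if a store happened (the cache line was dirtied), else 0. */
static int fdb_touch(struct fdb_entry *f, unsigned long now)
{
	if (f->used != now) {
		f->used = now;
		return 1;
	}
	return 0;
}
```

Many small packets arrive within the same tick, so most calls take the read-only path and leave the shared cache line clean.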
Re: [PATCH net-next 0/4] liquidio: improve soft command/response handling
From: Felix Manlunas Date: Tue, 28 Aug 2018 18:50:58 -0700 > From: Weilin Chang > > Change soft command handling to fix the possible race condition when the > process handles a response of a soft command that was already freed by an > application which got timeout for this request. Series applied, thank you.
Re: [net-next 02/15] i40e: move ethtool stats boiler plate code to i40e_ethtool_stats.h
From: Jeff Kirsher Date: Wed, 29 Aug 2018 15:48:21 -0700 > diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h > b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h > new file mode 100644 > index ..0290ade7494b > --- /dev/null > +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h > @@ -0,0 +1,221 @@ ... > +/** > + * __i40e_add_stat_strings - copy stat strings into ethtool buffer > + * @p: ethtool supplied buffer > + * @stats: stat definitions array > + * @size: size of the stats array > + * > + * Format and copy the strings described by stats into the buffer pointed at > + * by p. > + **/ > +static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[], > + const unsigned int size, ...) Need to be marked inline.
Re: [PATCH net-next,v5] net/tls: Calculate nsg for zerocopy path without skb_cow_data.
From: Doron Roberts-Kedes Date: Tue, 28 Aug 2018 16:33:57 -0700 > decrypt_skb fails if the number of sg elements required to map it > is greater than MAX_SKB_FRAGS. nsg must always be calculated, but > skb_cow_data adds unnecessary memcpy's for the zerocopy case. > > The new function skb_nsg calculates the number of scatterlist elements > required to map the skb without the extra overhead of skb_cow_data. > This patch reduces memcpy by 50% on my encrypted NBD benchmarks. > > Reported-by: Vakul Garg > Reviewed-by: Vakul Garg > Tested-by: Vakul Garg > Signed-off-by: Doron Roberts-Kedes Applied, thank you.
Re: [PATCH net-next 1/2] ip: fail fast on IP defrag errors
From: Peter Oskolkov Date: Tue, 28 Aug 2018 11:36:19 -0700 > The current behavior of IP defragmentation is inconsistent: > - some overlapping/wrong length fragments are dropped without > affecting the queue; > - most overlapping fragments cause the whole frag queue to be dropped. > > This patch brings consistency: if a bad fragment is detected, > the whole frag queue is dropped. Two major benefits: > - fail fast: corrupted frag queues are cleared immediately, instead of > by timeout; > - testing of overlapping fragments is now much easier: any kind of > random fragment length mutation now leads to the frag queue being > discarded (IP packet dropped); before this patch, some overlaps were > "corrected", with tests not seeing expected packet drops. > > Note that in one case (see "if (end&7)" conditional) the current > behavior is preserved as there are concerns that this could be > legitimate padding. > > Signed-off-by: Peter Oskolkov > Reviewed-by: Eric Dumazet > Reviewed-by: Willem de Bruijn Applied.
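The "fail fast" rule from the quoted patch can be illustrated with a toy fragment queue. This sketch is an assumption-laden model, not the kernel's reassembly code: the types and the fixed-size array are hypothetical, and the "if (end&7)" exception mentioned in the message is deliberately not modelled.

```c
#include <stddef.h>

#define MAX_FRAGS 16

/* A fragment covers the byte range [off, end). */
struct frag {
	unsigned int off;
	unsigned int end;
};

/* Hypothetical reassembly queue; dead is set once it was discarded. */
struct frag_queue {
	struct frag frags[MAX_FRAGS];
	size_t n;
	int dead;
};

/* Insert a fragment in any order. On a bad length or any overlap the
 * whole queue is dropped at once. Returns 0 on success, -1 otherwise. */
static int fq_insert(struct frag_queue *q, unsigned int off, unsigned int end)
{
	size_t i;

	if (q->dead)
		return -1;
	if (end <= off || q->n == MAX_FRAGS)
		goto kill;
	for (i = 0; i < q->n; i++) {
		if (off < q->frags[i].end && q->frags[i].off < end)
			goto kill;
	}
	q->frags[q->n].off = off;
	q->frags[q->n].end = end;
	q->n++;
	return 0;
kill:
	q->n = 0;
	q->dead = 1;
	return -1;
}
```

Out-of-order but non-overlapping fragments are accepted, while a single overlapping one discards everything collected so far, which is what makes randomised overlap tests deterministic.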
Re: [PATCH net-next 2/2] selftests/net: add ip_defrag selftest
From: Peter Oskolkov Date: Tue, 28 Aug 2018 11:36:20 -0700 > This test creates a raw IPv4 socket, fragments a largish UDP > datagram and sends the fragments out of order. > > Then repeats in a loop with different message and fragment lengths. > > Then does the same with overlapping fragments (with overlapping > fragments the expectation is that the recv times out). > > Tested: > > root@# time ./ip_defrag.sh > ipv4 defrag > PASS > ipv4 defrag with overlaps > PASS > > real1m7.679s > user0m0.628s > sys 0m2.242s > > A similar test for IPv6 is to follow. > > Signed-off-by: Peter Oskolkov > Reviewed-by: Willem de Bruijn Applied.
Re: [PATCH net-next] liquidio: fix race condition in instruction completion processing
From: Felix Manlunas Date: Tue, 28 Aug 2018 11:32:55 -0700 > From: Rick Farrington > > In lio_enable_irq, the pkt_in_done count register was being cleared to > zero. However, there could be some completed instructions which were not > yet processed due to budget and limit constraints. > So, only write this register with the number of actual completions > that were processed. > > Signed-off-by: Rick Farrington > Signed-off-by: Felix Manlunas Applied.
Re: [PATCH net-next] liquidio: remove unnecessary delay when processing IQ responses
From: Felix Manlunas Date: Tue, 28 Aug 2018 11:19:54 -0700 > From: Rick Farrington > > Signed-off-by: Rick Farrington > Signed-off-by: Felix Manlunas Applied.
Re: [PATCH net-next] net: thunderbolt: Convert to use SPDX identifier
From: Mika Westerberg Date: Tue, 28 Aug 2018 19:58:43 +0300 > This gets rid of the licence boilerplate in favor of SPDX identifier > which only takes a single line comment. > > Signed-off-by: Mika Westerberg Applied.
Re: [PATCH net 0/3] ipv6: fix error path of inet6_init()
From: Sabrina Dubroca Date: Tue, 28 Aug 2018 13:40:50 +0200 > The error path of inet6_init() can trigger multiple kernel panics, > mostly due to wrong ordering of cleanups. This series fixes those > issues. Series applied, thank you.
Re: [PATCH net] net/sched: act_pedit: fix dump of extended layered op
From: Davide Caratti Date: Mon, 27 Aug 2018 22:56:22 +0200 > in the (rare) case of failure in nla_nest_start(), missing NULL checks in > tcf_pedit_key_ex_dump() can make the following command > > # tc action add action pedit ex munge ip ttl set 64 > > dereference a NULL pointer: ... > Like it's done for other TC actions, give up dumping pedit rules and return > an error if nla_nest_start() returns NULL. > > Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the > conventional network headers") > Signed-off-by: Davide Caratti Applied and queued up for -stable, thanks.
Re: [PATCH] r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices
From: Azat Khuzhin Date: Sun, 26 Aug 2018 17:03:09 +0300 > I have two Ethernet adapters: > r8169 :03:01.0 eth0: RTL8169sb/8110sb, 00:14:d1:14:2d:49, XID 1000, > IRQ 18 > r8169 :01:00.0 eth0: RTL8168e/8111e, 64:66:b3:11:14:5d, XID 2c20, > IRQ 30 > And after upgrading from linux 4.15 [1] to linux 4.18+ [2] RTL8169sb failed to > receive any packets. tcpdump shows a lot of checksum mismatch. > > [1]: a0f79386a4968b4925da6db2d1daffd0605a4402 > [2]: 0519359784328bfa92bf0931bf0cff3b58c16932 (4.19 merge window opened) > > I started bisecting and then found that [3] breaks it. According to [4]: > "For 8110S, 8110SB, and 8110SC series, the initial value of RxConfig > needs to be set after the tx/rx is enabled." > So I moved rtl_init_rxcfg() after enabling tx/rx and now my adapter works > (RTL8168e works too). > > [3]: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace > [4]: e542a2269f232d61270ceddd42b73a4348dee2bb ("r8169: adjust the RxConfig > settings.") > > Also drop "rx" from rtl_set_rx_tx_config_registers(), since it does nothing > with it already. > > Fixes: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace ("r8169: simplify > rtl_hw_start_8169") > > Cc: Heiner Kallweit > Cc: David S. Miller > Cc: netdev@vger.kernel.org > Cc: Realtek linux nic maintainers > Signed-off-by: Azat Khuzhin Applied and queued up for -stable.
Re: [Patch net] tipc: fix a missing rhashtable_walk_exit()
From: Cong Wang Date: Thu, 23 Aug 2018 16:19:44 -0700 > rhashtable_walk_exit() must be paired with rhashtable_walk_enter(). > > Fixes: 40f9f4397060 ("tipc: Fix tipc_sk_reinit race conditions") > Cc: Herbert Xu > Cc: Ying Xue > Signed-off-by: Cong Wang Applied and queued up for -stable, thanks Cong.
Re: [PATCH net] vti6: remove !skb->ignore_df check from vti6_xmit()
From: Alexey Kodanev Date: Thu, 23 Aug 2018 19:49:54 +0300 > Before the commit d6990976af7c ("vti6: fix PMTU caching and reporting > on xmit") '!skb->ignore_df' check was always true because the function > skb_scrub_packet() was called before it, resetting ignore_df to zero. > > In the commit, skb_scrub_packet() was moved below, and now this check > can be false for the packet, e.g. when sending it in the two fragments, > this prevents successful PMTU updates in such case. The next attempts > to send the packet lead to the same tx error. Moreover, vti6 initial > MTU value relies on PMTU adjustments. > > This issue can be reproduced with the following LTP test script: > udp_ipsec_vti.sh -6 -p ah -m tunnel -s 2000 > > Fixes: ccd740cbc6e0 ("vti6: Add pmtu handling to vti6_xmit.") > Signed-off-by: Alexey Kodanev Applied and queued up for -stable, thank you.
Re: pull-request: bpf 2018-08-29
From: Daniel Borkmann Date: Wed, 29 Aug 2018 21:07:24 +0200 > The following pull-request contains BPF updates for your *net* tree. > > The main changes are: > > 1) Fix a build error in sk_reuseport_convert_ctx_access() when >compiling with clang which cannot resolve hweight_long() at >build time inside the BUILD_BUG_ON() assertion, from Stefan. > > 2) Several fixes for BPF sockmap, four of them in getting the >bpf_msg_pull_data() helper to work, one use after free case >in bpf_tcp_close() and one refcount leak in bpf_tcp_recvmsg(), >from Daniel. > > 3) Another fix for BPF sockmap where we misaccount sk_mem_uncharge() >in the socket redirect error case from unwinding scatterlist >twice, from John. > > Please consider pulling these changes from: > > git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git Pulled, thanks Daniel.
Re: [net-next 00/13][pull request] 10GbE Intel Wired LAN Driver Updates 2018-08-28
From: Jeff Kirsher Date: Tue, 28 Aug 2018 14:35:44 -0700 > This series contains updates to ixgbe and ixgbevf only. ... Pulled.
Re: [PATCH net-next 00/15] nfp: add NFP5000 support
From: Jakub Kicinski Date: Tue, 28 Aug 2018 13:20:32 -0700 > This series broadly speaking adds support for NFP5000 and > related products. ... Series applied, thanks Jakub.
Re: [net-next 00/15][pull request] 100GbE Intel Wired LAN Driver Updates 2018-08-28
From: Jeff Kirsher Date: Tue, 28 Aug 2018 12:03:58 -0700 > This series contains new features and implementation updates for the > ice driver. ... Pulled.
net-next is OPEN...
You know the drill... http://vger.kernel.org/~davem/net-next.html
Re: [PATCH 1/1] net/rds: Use rdma_read_gids to get connection SGID/DGID in IPv6
From: Zhu Yanjun Date: Sat, 25 Aug 2018 15:19:05 +0800 > In IPv4, the newly introduced rdma_read_gids is used to read the SGID/DGID > for the connection which returns GID correctly for RoCE transport as well. > > In IPv6, rdma_read_gids is also used. The following are why rdma_read_gids > is introduced. > > rdma_addr_get_dgid() for RoCE for client side connections returns MAC > address, instead of DGID. > rdma_addr_get_sgid() for RoCE doesn't return correct SGID for IPv6 and > when more than one IP address is assigned to the netdevice. > > So the transport agnostic rdma_read_gids() API is provided by rdma_cm > module. > > Signed-off-by: Zhu Yanjun Applied.
Re: [PATCH] r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices
From: Azat Khuzhin Date: Sun, 26 Aug 2018 17:03:09 +0300 > I have two Ethernet adapters: > r8169 :03:01.0 eth0: RTL8169sb/8110sb, 00:14:d1:14:2d:49, XID 1000, > IRQ 18 > r8169 :01:00.0 eth0: RTL8168e/8111e, 64:66:b3:11:14:5d, XID 2c20, > IRQ 30 > And after upgrading from linux 4.15 [1] to linux 4.18+ [2] RTL8169sb failed to > receive any packets. tcpdump shows a lot of checksum mismatches. > > [1]: a0f79386a4968b4925da6db2d1daffd0605a4402 > [2]: 0519359784328bfa92bf0931bf0cff3b58c16932 (4.19 merge window opened) > > I started bisecting and found that [3] breaks it. According to [4]: > "For 8110S, 8110SB, and 8110SC series, the initial value of RxConfig > needs to be set after the tx/rx is enabled." > So I moved rtl_init_rxcfg() after enabling tx/rx and now my adapter works > (RTL8168e works too). > > [3]: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace > [4]: e542a2269f232d61270ceddd42b73a4348dee2bb ("r8169: adjust the RxConfig > settings.") > > Also drop "rx" from rtl_set_rx_tx_config_registers(), since it does nothing > with it already. > > Fixes: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace ("r8169: simplify > rtl_hw_start_8169") > > Cc: Heiner Kallweit > Cc: David S. Miller > Cc: netdev@vger.kernel.org > Cc: Realtek linux nic maintainers > Signed-off-by: Azat Khuzhin > --- > It looks like calling rtl_init_rxcfg() the second time is fine, but I > can move it into rtl_hw_start_8169()) Heiner, please review.
Re: [PATCH] net: dsa: Drop GPIO includes
From: Linus Walleij Date: Mon, 27 Aug 2018 00:20:11 +0200 > Commit 52638f71fcff ("dsa: Move gpio reset into switch driver") > moved the GPIO handling into the switch drivers but forgot > to remove the GPIO header includes. > > Signed-off-by: Linus Walleij Applied.
Re: [patch net 0/2] net: sched: couple of small fixes
From: Cong Wang Date: Mon, 27 Aug 2018 13:44:56 -0700 > On Mon, Aug 27, 2018 at 11:58 AM Jiri Pirko wrote: >> >> From: Jiri Pirko >> >> Jiri Pirko (2): >> net: sched: fix extack error message when chain is failed to be >> created >> net: sched: return -ENOENT when trying to remove filter from >> non-existent chain > > Acked-by: Cong Wang Series applied.
Re: [PATCH net] sctp: hold transport before accessing its asoc in sctp_transport_get_next
From: Xin Long Date: Mon, 27 Aug 2018 18:38:31 +0800 > As Marcelo noticed, in sctp_transport_get_next, it is iterating over > transports but then also accessing the association directly, without > checking any refcnts before that, which can cause a use-after-free > read. > > So fix it by holding transport before accessing the association. With > that, sctp_transport_hold calls can be removed in the later places. > > Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and > reuse some for proc") > Reported-by: syzbot+fe62a0c9aa6a85c6d...@syzkaller.appspotmail.com > Signed-off-by: Xin Long Applied and queued up for -stable.
Re: [PATCH net] erspan: set erspan_ver to 1 by default when adding an erspan dev
From: Xin Long Date: Mon, 27 Aug 2018 18:41:32 +0800 > After erspan_ver is introduced, if erspan_ver is not set in iproute, its > value will be left 0 by default. Since Commit 02f99df1875c ("erspan: fix > invalid erspan version."), it has broken the traffic due to the version > check in erspan_xmit if users are not aware of 'erspan_ver' param, like > using an old version of iproute. > > To fix this compatibility problem, it sets erspan_ver to 1 by default > when adding an erspan dev in erspan_setup. Note that we can't do it in > ipgre_netlink_parms, as this function is also used by ipgre_changelink. > > Fixes: 02f99df1875c ("erspan: fix invalid erspan version.") > Reported-by: Jianlin Shi > Signed-off-by: Xin Long Applied and queued up for -stable.
Re: [PATCH net] sctp: remove useless start_fail from sctp_ht_iter in proc
From: Xin Long Date: Mon, 27 Aug 2018 18:40:18 +0800 > After changing rhashtable_walk_start to return void, start_fail would > never be set other value than 0, and the checking for start_fail is > pointless, so remove it. > > Fixes: 97a6ec4ac021 ("rhashtable: Change rhashtable_walk_start to return > void") > Signed-off-by: Xin Long Applied and queued up for -stable.
Re: any reason for "!!netif_carrier_ok" and "!!netif_dormant" in net-sysfs.c?
From: "Robert P. J. Day" Date: Mon, 27 Aug 2018 04:55:29 -0400 (EDT) > another pedantic oddity -- is there a reason for these two double > negations in net/core/net-sysfs.c? It turns an arbitrary integer into a boolean, this is a common construct across the kernel tree so I'm surprised you've never seen it before. Although, I don't know how much more hand holding we're willing to tolerate continuing to give to you at this point. Thanks.
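The double-negation idiom mentioned in the reply can be shown with a small user-space sketch. The flag name and bit position below are hypothetical and are not taken from net/core/net-sysfs.c; the point is only that a masked value can be any nonzero integer, while `!!` collapses it to exactly 0 or 1.

```c
#include <assert.h>

/* Hypothetical flag word; bit 3 stands in for a "carrier ok" style bit. */
#define DEMO_FLAG_CARRIER (1u << 3)

/* Returns the raw masked value: 0 or 8, never 1. */
static inline unsigned int demo_carrier_raw(unsigned int flags)
{
	return flags & DEMO_FLAG_CARRIER;
}

/* Double negation collapses any nonzero value to exactly 1. */
static inline int demo_carrier_normalized(unsigned int flags)
{
	return !!(flags & DEMO_FLAG_CARRIER);
}
```

This matters when the result is printed or compared against 1: the raw form would show 8 for a set bit, the normalized form shows 1.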
Re: [PATCH net 1/1] qlge: Fix netdev features configuration.
From: Manish Chopra Date: Thu, 23 Aug 2018 13:20:52 -0700 > qlge_fix_features() is not supposed to modify hardware or > driver state, rather it is supposed to only fix requested > feature bits. Currently qlge_fix_features() also goes for > interface down and up unnecessarily if there is not even > any change in features set. > > This patch changes/fixes following - > > 1) Move reload of interface or device re-config from >qlge_fix_features() to qlge_set_features(). > 2) Reload of interface in qlge_set_features() only if >relevant feature bit (NETIF_F_HW_VLAN_CTAG_RX) is changed. > 3) Get rid of qlge_fix_features() since driver is not really >required to fix any features bit. > > Signed-off-by: Manish > Reviewed-by: Benjamin Poirier Applied and queued up for -stable. Please provide a proper Fixes: tag next time. Thanks.
Re: [PATCH v2] net: macb: do not disable MDIO bus at open/close time
From: Anssi Hannula Date: Thu, 23 Aug 2018 10:45:22 +0300 > macb_reset_hw() is called from macb_close() and indirectly from > macb_open(). macb_reset_hw() zeroes the NCR register, including the MPE > (Management Port Enable) bit. > > This will prevent accessing any other PHYs for other Ethernet MACs on > the MDIO bus, which remains registered at macb_reset_hw() time, until > macb_init_hw() is called from macb_open() which sets the MPE bit again. > > I.e. currently the MDIO bus has a short disruption at open time and is > disabled at close time until the interface is opened again. > > Fix that by only touching the RE and TE bits when enabling and disabling > RX/TX. > > v2: Make macb_init_hw() NCR write a single statement. > > Fixes: 6c36a7074436 ("macb: Use generic PHY layer") > Signed-off-by: Anssi Hannula Applied and queued up for -stable.
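The fix described in this patch amounts to a read-modify-write on the control register instead of writing zero. A minimal sketch follows, assuming illustrative bit positions for RE, TE and MPE; the macro names, bit values and helpers are hypothetical and are not copied from the macb driver.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions for a control register; treat as assumptions. */
#define DEMO_NCR_RE  (1u << 2)  /* receive enable */
#define DEMO_NCR_TE  (1u << 3)  /* transmit enable */
#define DEMO_NCR_MPE (1u << 4)  /* management port (MDIO) enable */

/* Clear only RE and TE; every other bit, including MPE, is preserved. */
static inline uint32_t demo_disable_rx_tx(uint32_t ncr)
{
	return ncr & ~(DEMO_NCR_RE | DEMO_NCR_TE);
}

/* Set only RE and TE on top of whatever is already enabled. */
static inline uint32_t demo_enable_rx_tx(uint32_t ncr)
{
	return ncr | DEMO_NCR_RE | DEMO_NCR_TE;
}
```

Writing 0 to the whole register would also clear the MPE bit, which is the MDIO disruption the commit message describes; masking keeps the bus usable for other MACs sharing it.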
Re: [PATCH v3 net 1/1] net: macb: Fix regression breaking non-MDIO fixed-link PHYs
From: Ahmad Fatoum Date: Tue, 21 Aug 2018 17:35:48 +0200 > commit 739de9a1563a ("net: macb: Reorganize macb_mii bringup") broke > initializing macb on the EVB-KSZ9477 eval board. > There, of_mdiobus_register was called even for the fixed-link representing > the RGMII-link to the switch with the result that the driver attempts to > enumerate PHYs on a non-existent MDIO bus: > > libphy: MACB_mii_bus: probed > mdio_bus f0028000.ethernet-: fixed-link has invalid PHY address > mdio_bus f0028000.ethernet-: scan phy fixed-link at address 0 > [snip] > mdio_bus f0028000.ethernet-: scan phy fixed-link at address 31 > > The "MDIO" bus registration succeeds regardless, having claimed the reset > GPIO, > and calling of_phy_register_fixed_link later on fails because it tries > to claim the same GPIO: > > macb f0028000.ethernet: broken fixed-link specification > > Fix this by registering the fixed-link before calling mdiobus_register. > > Fixes: 739de9a1563a ("net: macb: Reorganize macb_mii bringup") > Signed-off-by: Ahmad Fatoum Applied and queued up for -stable, thanks.
Re: [PATCH net] mlxsw: spectrum_switchdev: Do not leak RIFs when removing bridge
From: Ido Schimmel Date: Fri, 24 Aug 2018 15:41:35 +0300 > When a bridge device is removed, the VLANs are flushed from each > configured port. This causes the ports to decrement the reference count > on the associated FIDs (filtering identifier). If the reference count of > a FID is 1 and it has a RIF (router interface), then this RIF is > destroyed. > > However, if no port is member in the VLAN for which a RIF exists, then > the RIF will continue to exist after the removal of the bridge. To > reproduce: > > # ip link add name br0 type bridge vlan_filtering 1 > # ip link set dev swp1 master br0 > # ip link add link br0 name br0.10 type vlan id 10 > # ip address add 192.0.2.0/24 dev br0.10 > # ip link del dev br0 > > The RIF associated with br0.10 continues to exist. > > Fix this by iterating over all the bridge device uppers when it is > destroyed and take care of destroying their RIFs. > > Fixes: 99f44bb3527b ("mlxsw: spectrum: Enable L3 interfaces on top of bridge > devices") > Signed-off-by: Ido Schimmel > Reviewed-by: Petr Machata Applied and queued up for -stable, thanks.
Re: [net 00/11][pull request] Intel Wired LAN Driver Updates 2018-08-24
From: Jeff Kirsher Date: Fri, 24 Aug 2018 11:47:24 -0700 > This series contains fixes to e1000, igb, ixgb, ixgbe and i40e. Pulled, thanks Jeff.
Re: pull-request: bpf 2018-08-24
From: Daniel Borkmann Date: Fri, 24 Aug 2018 01:09:29 +0200 > The following pull-request contains BPF updates for your *net* tree. Pulled, thanks Daniel.
Re: [net 00/13][pull request] Intel Wired LAN Driver Fixes 2018-08-23
From: Jeff Kirsher Date: Thu, 23 Aug 2018 12:14:50 -0700 > This series contains bug fixes to the ice driver. Pulled, thanks Jeff.
Re: pull request: bluetooth 2018-08-23
From: Johan Hedberg Date: Thu, 23 Aug 2018 08:34:40 +0300 > Here are two important Bluetooth fixes for the MediaTek and RealTek HCI > drivers. > > Please let me know if there are any issues pulling, thanks. Pulled, thank you.
Re: [Patch net 0/2] net: hns3: bug fix & optimization for HNS3 driver
From: Huazhong Tan Date: Thu, 23 Aug 2018 11:37:14 +0800 > This patchset presents a bug fix found when CONFIG_ARM64_64K_PAGES > is enabled and an optimization for HNS3 driver. Series applied, thank you.
Re: [PATCH] net/ipv6: init ip6 anycast rt->dst.input as ip6_input
From: Hangbin Liu Date: Thu, 23 Aug 2018 11:31:37 +0800 > Commit 6edb3c96a5f02 ("net/ipv6: Defer initialization of dst to data path") > forgot to handle anycast route and init anycast rt->dst.input to ip6_forward. > Fix it by setting anycast rt->dst.input back to ip6_input. > > Fixes: 6edb3c96a5f02 ("net/ipv6: Defer initialization of dst to data path") > Signed-off-by: Hangbin Liu Applied and queued up for -stable, thanks.
Re: [Patch net 0/4] net: hns: bug fixes & optimization for HNS driver
From: Huazhong Tan Date: Thu, 23 Aug 2018 11:10:09 +0800 > This patchset presents some bug fixes found when > CONFIG_ARM64_64K_PAGES is enabled and an optimization for HNS driver. Series applied, thank you.
Re: [PATCH net 0/3] tcp_bbr: PROBE_RTT minor bug fixes
From: Kevin Yang Date: Wed, 22 Aug 2018 17:43:13 -0400 > From: "Kevin(Yudong) Yang" > > This series includes two minor bug fixes for the TCP BBR PROBE_RTT > mechanism, and one preparatory patch: > > (1) A preparatory patch to reorganize the PROBE_RTT logic by refactoring > (into its own function) the code to exit PROBE_RTT, since the next > patch will be using that code in a new context. > > (2) Fix: When BBR restarts from idle and if BBR is in PROBE_RTT mode, > BBR should check if it's time to exit PROBE_RTT. If yes, then BBR > should exit PROBE_RTT mode and restore the cwnd to its full value. > > (3) Fix: Apply the PROBE_RTT cwnd cap even if the count of fully-ACKed > packets is 0. Series applied, thank you.
Re: [PATCH net] ipv4: tcp: send zero IPID for RST and ACK sent in SYN-RECV and TIME-WAIT state
From: Eric Dumazet Date: Wed, 22 Aug 2018 13:30:45 -0700 > tcp uses per-cpu (and per namespace) sockets (net->ipv4.tcp_sk) internally > to send some control packets. > > 1) RST packets, through tcp_v4_send_reset() > 2) ACK packets in SYN-RECV and TIME-WAIT state, through tcp_v4_send_ack() > > These packets assert IP_DF, and also use the hashed IP ident generator > to provide an IPv4 ID number. > > Geoff Alexander reported this could be used to build off-path attacks. > > These packets should not be fragmented, since their size is smaller than > IPV4_MIN_MTU. Only some tunneled paths could eventually have to fragment, > regardless of inner IPID. > > We really can use zero IPID, to address the flaw, and as a bonus, > avoid a couple of atomic operations in ip_idents_reserve() > > Signed-off-by: Eric Dumazet > Reported-by: Geoff Alexander > Tested-by: Geoff Alexander Applied and queued up for -stable.
Re: [Patch net] addrconf: reduce unnecessary atomic allocations
From: Cong Wang Date: Wed, 22 Aug 2018 12:58:34 -0700 > All the 3 callers of addrconf_add_mroute() assert RTNL > lock, they don't take any additional lock either, so > it is safe to convert it to GFP_KERNEL. > > Same for sit_add_v4_addrs(). > > Cc: David Ahern > Signed-off-by: Cong Wang Applied.
Re: Experimental fix for MSI-X issue on r8169
From: Jian-Hong Pan Date: Wed, 22 Aug 2018 11:01:02 +0800 ... > [ 56.462464] r8169 :02:00.0: MSI-X entry: context resume: > ... > uh! The MSI-X entry seems missed after resume on this laptop! Yeah, having all of the MSI-X entry values be all-1's is not a good sign. But this is quite a curious set of debugging traces we now have. In the working case, the vector number in the DATA field seems to change, which suggests that something is assigning new values and programming them into these fields at resume time. But in the failing cases, all of the values are garbage. I would expect, given what the working trace looks like, that in the failing case some values would be wrong and the DATA value would have some new yet valid value. But that is not what we are seeing here. Weird.
Re: [PATCH v1 3/3] net: WireGuard secure network tunnel
From: "Jason A. Donenfeld" Date: Tue, 21 Aug 2018 16:41:50 -0700 > Is 100 in fact acceptable for new code? 120? 180? What's the > generally accepted limit these days? Please keep it as close to 80 columns as possible. Line breaks are not ugly, please embrace them :)
Re: [PATCH] datapath.c: fix missing return value check of nla_nest_start()
From: Pravin Shelar Date: Tue, 21 Aug 2018 15:38:28 -0700 > On Fri, Aug 17, 2018 at 1:15 AM Jiecheng Wu wrote: >> >> Function queue_userspace_packet() defined in net/openvswitch/datapath.c >> calls nla_nest_start() to allocate memory for struct nlattr which is >> dereferenced immediately. As nla_nest_start() may return NULL on failure, >> this code piece may cause NULL pointer dereference bug. >> --- >> net/openvswitch/datapath.c | 4 >> 1 file changed, 4 insertions(+) >> >> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c >> index 0f5ce77..ff4457d 100644 >> --- a/net/openvswitch/datapath.c >> +++ b/net/openvswitch/datapath.c >> @@ -460,6 +460,8 @@ static int queue_userspace_packet(struct datapath *dp, >> struct sk_buff *skb, >> >> if (upcall_info->egress_tun_info) { >> nla = nla_nest_start(user_skb, >> OVS_PACKET_ATTR_EGRESS_TUN_KEY); >> + if (!nla) >> + return -EMSGSIZE; > It is not possible, since user_skb is allocated to accommodate all > netlink attributes. Pravin, common practice is to always check nla_*() return values even if the SKB is allocated with "enough space". Those calculations can have bugs, and these checks are therefore helpful to avoid crashes and memory corruption in such cases. Thank you.
Re: Experimental fix for MSI-X issue on r8169
From: Heiner Kallweit Date: Tue, 21 Aug 2018 23:19:04 +0200 > That's what I get on my system (RTL8168E-VL). In your case you'll come > only till the first suspend. > > [3.743404] r8169 :03:00.0: MSI-X entry: context probe: fee01004 0 > 40ef 1 On probe, MSI-X is masked (i.e. disabled) and is configured to use: address: 0xfee01004 data: 0x40ef > [ 29.539250] r8169 :03:00.0: MSI-X entry: context suspend: fee02004 0 > 4028 0 At suspend time, MSI-X is unmasked (i.e. enabled) and is configured to use: address: 0xfee01004 data: 0x4028 > [ 29.837457] r8169 :03:00.0: MSI-X entry: context resume: fee01004 0 > 402b 0 At resume time, MSI-X is unmasked (i.e. enabled) and is configured to use: address: 0xfee01004 data: 0x402b > [ 36.921370] r8169 :03:00.0: MSI-X entry: context suspend: fee01004 0 > 402b 0 Second suspend: address: 0xfee01004 data: 0x402b > [ 37.239407] r8169 :03:00.0: MSI-X entry: context resume: fee01004 0 > 402b 0 Second resume: address: 0xfee01004 data: 0x402b And this all looks normal. The data field is changing when you first up the device and interrupts are enabled. This is where the request_irq happens, the MSI vector is allocated, and that vector number is written to the data field of the MSI-X entry. It looks like this (re-)allocation of MSI vectors happens on every resume as well. And that's why the data field changes each resume.
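The way the trace lines are read above can be sketched as a small parser. This assumes the four hex values printed by the debugging patch are, in order, the message address, upper address, message data and vector control words of an MSI-X table entry, with bit 0 of vector control being the mask bit; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One MSI-X table entry as four 32-bit words (field order is an assumption
 * about the debug output, matching the reading given in the reply). */
struct demo_msix_entry {
	uint32_t addr_lo;
	uint32_t addr_hi;
	uint32_t data;
	uint32_t vector_ctrl;
};

/* Parses a string such as "fee01004 0 40ef 1"; returns 0 on success. */
static inline int demo_parse_msix(const char *s, struct demo_msix_entry *e)
{
	unsigned int a, b, c, d;

	if (sscanf(s, "%x %x %x %x", &a, &b, &c, &d) != 4)
		return -1;
	e->addr_lo = a;
	e->addr_hi = b;
	e->data = c;
	e->vector_ctrl = d;
	return 0;
}

/* Bit 0 of the vector control word masks (disables) the vector. */
static inline int demo_msix_masked(const struct demo_msix_entry *e)
{
	return (int)(e->vector_ctrl & 1u);
}
```

Applied to the probe line, this reports a masked entry with data 0x40ef; applied to the resume line, an unmasked entry with data 0x402b. An entry read back as all-1's, as in the failing laptop trace, would parse as masked with an implausible address, which is one quick way to flag it.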
Re: [Patch net 0/9] net_sched: pending clean up and bug fixes
From: Cong Wang Date: Sun, 19 Aug 2018 12:22:04 -0700 > This patchset aims to clean up and fixes some bugs in current > merge window, this is why it is targeting -net. > > Patch 1-5 are clean up Vlad's patches merged in current merge > window, patch 6 is just a trivial cleanup. > > Patch 7 reverts a lockdep warning fix and patch 8 provides a better > fix for it. > > Patch 9 fixes a potential deadlock found by me during code review. > > Please see each patch for details. > > Cc: Jamal Hadi Salim > Signed-off-by: Cong Wang Series applied and patches #8 and #9 queued up for -stable.
Re: [Patch net 8/9] act_ife: move tcfa_lock down to where necessary
From: Cong Wang Date: Mon, 20 Aug 2018 16:57:46 -0700 > Passing 'exists' as 'atomic' is prior to my change. With my change, > they are separated as two parameters: I mis-read the patch, thanks for explaining :)
Re: [PATCH net] hv_netvsc: ignore devices that are not PCI
From: Stephen Hemminger Date: Tue, 21 Aug 2018 10:40:38 -0700 > Registering another device with same MAC address (such as TAP, VPN or > DPDK KNI) will confuse the VF autobinding logic. Restrict the search > to only run if the device is known to be a PCI attached VF. > > Fixes: e8ff40d4bff1 ("hv_netvsc: improve VF device matching") > Signed-off-by: Stephen Hemminger Applied and queued up for -stable.
Re: [bpf-next RFC 0/3] Introduce eBPF flow dissector
From: Alexei Starovoitov Date: Mon, 20 Aug 2018 13:52:07 -0700 > I don't think copy-paste avoids the issue of uapi. > Anything used by BPF program is uapi. > The only exception is offsets of kernel internal structures > passed into bpf_probe_read(). > So we have several options: > 1. be honest and say 'struct flow_dissect_key*' is now uapi > 2. wrap all of them into 'struct bpf_flow_dissect_key*' and do rewrites > when/if 'struct flow_dissect_key*' changes > 3. wait for BTF to solve it for tracing use case and for this one too. ... > The idea is that kernel internal structs can be defined in bpf prog > and since they will be described precisely in BTF that comes with the prog > the kernel can validate that prog's BTF matches what kernel thinks it has. > imo that's the most flexible, but BTF for all of vmlinux won't be ready > tomorrow and looks like this patch set is ready to go, so I would go with 1 > or 2. I would definitely prefer #2 or #3. I personally would like to see us avoid preventing interesting optimizations of the flow key layout and/or accesses in the future.
Re: [PATCH] rhashtable: remove duplicated include from rhashtable.c
From: Yue Haibing Date: Tue, 21 Aug 2018 01:41:56 + > Remove duplicated include. > > Signed-off-by: Yue Haibing Applied, thank you.
Re: [PATCH net] net/ipv6: Put lwtstate when destroying fib6_info
From: dsah...@kernel.org Date: Mon, 20 Aug 2018 13:02:41 -0700 > From: David Ahern > > Prior to the introduction of fib6_info lwtstate was managed by the dst > code. With fib6_info releasing lwtstate needs to be done when the struct > is freed. > > Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst > based routes") > Signed-off-by: David Ahern Applied and queued up for -stable, thanks David.
Re: [PATCH net-next,v4] net/tls: Calculate nsg for zerocopy path without skb_cow_data.
From: Doron Roberts-Kedes Date: Mon, 20 Aug 2018 17:27:23 -0700 > Given that frag_lists are not unlikely in this case, I believe the only > remaining feedback on the original patch was the recursive > implementation. If you'd like, I can re-submit with an iterative > implementation, but I noticed that goes against the existing recursive > pattern in functions like skb_release_data -> kfree_skb_list -> kfree_skb > -> __kfree_skb -> skb_release_all -> skb_release_data, as well as > skb_to_sgvec. Let me know whether an iterative implementation is > preferred here, or whether I can simply rebase and resubmit a patch > similar to the original (modulo some variable renaming improvements). Ok, I guess staying with the recursive implementation is fine. It's a real shame that frag lists are so common in this code path, especially nested ones :-/ In the long term, perhaps we can do something about that. In the short term, I guess this means your original change is OK. Please resubmit when the net-next tree opens back up, thanks.
Re: [PATCH net v2 0/4] qed: Misc fixes in the interface with the MFW
From: Tomer Tayar Date: Mon, 20 Aug 2018 00:01:41 +0300 > This patch series fixes several issues in the driver's interface with the > management FW (MFW). > > v1->v2: > - Fix loop counter decrement to be pre instead of post. Series applied, thank you.
Re: [Patch net 8/9] act_ife: move tcfa_lock down to where necessary
From: Cong Wang Date: Sun, 19 Aug 2018 12:22:12 -0700 > The only time we need to take tcfa_lock is when adding > a new metainfo to an existing ife->metalist. We don't need > to take tcfa_lock so early and so broadly in tcf_ife_init(). > > This means we can always take ife_mod_lock first, avoid the > reverse locking ordering warning as reported by Vlad. > > Reported-by: Vlad Buslov > Tested-by: Vlad Buslov > Cc: Vlad Buslov > Cc: Jamal Hadi Salim > Signed-off-by: Cong Wang After this change we no longer call populate_metalist() in an atomic context via tcf_ife_init(), and populate_metalist() passes 'exists' down to add_metainfo() as an 'atomic' indicator. It doesn't have this meaning if you aren't holding the tcfa_lock in the callers with BH disabled. Therefore, add_metainfo()'s 'atomic' indication is inaccurate in this call chain and will use GFP_ATOMIC unnecessarily. Probably the thing to do is just pass 'false' down to add_metainfo() in populate_metalist().
Re: [PATCH][net-next] vxlan: reduce dirty cache line in vxlan_find_mac
From: Li RongQing Date: Sun, 19 Aug 2018 11:36:08 +0800 > vxlan_find_mac() unconditionally set f->used for every packet, > this cause a cache miss for every packet, since remote, hlist > and used of vxlan_fdb share the same cacheline. > > With this change f->used is set only if not equal to jiffies > This gives up to 5% speed-up with small packets. > > Signed-off-by: Zhang Yu > Signed-off-by: Li RongQing Please resubmit this when the net-next tree opens back up. Thanks.
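The optimization in this patch is the general "check before store" pattern: a write dirties the cache line even when the value does not change, whereas a read of an unchanged value does not. A minimal user-space sketch follows, with a hypothetical struct and helper standing in for the vxlan_fdb entry and its 'used' timestamp.

```c
#include <assert.h>

/* Hypothetical stand-in for an FDB entry with a last-used timestamp. */
struct demo_fdb {
	unsigned long used;
};

/* Store the timestamp only when it differs from the current value.
 * Returns 1 if a store happened, 0 if it was skipped. */
static inline int demo_touch(struct demo_fdb *f, unsigned long now)
{
	if (f->used != now) {
		f->used = now;
		return 1;
	}
	return 0;
}
```

With many packets arriving within the same tick, only the first lookup per tick performs the store, which is where the reported speed-up for small packets would come from.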
Re: [PATCH net 1/4] qed: Wait for ready indication before rereading the shmem
From: Tomer Tayar Date: Sun, 19 Aug 2018 20:58:04 +0300 > + while (!p_info->mfw_mb_length && cnt--) { > + msleep(msec); > + p_info->mfw_mb_length = > + (u16)qed_rd(p_hwfn, p_ptt, > + p_info->mfw_mb_addr + > + offsetof(struct public_mfw_mb, sup_msgs)); > + } > + > + if (!cnt) { Because you use postdecrement on 'cnt', the loop will timeout with 'cnt' equal to '-1' not zero. You need to fix this.
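The counter behaviour pointed out in the review can be reproduced in isolation. The sketch below models the polling loop with hypothetical helpers: 'ready_after' is the number of polls after which the simulated hardware becomes ready, and each function returns the final value of the counter.

```c
#include <assert.h>

/* Post-decrement form, as in the quoted patch. On timeout the final
 * evaluation of cnt-- sees 0 and leaves cnt at -1, so a later
 * "if (!cnt)" check does not fire. */
static inline int demo_poll_postdec(int cnt, int ready_after)
{
	int polls = 0;
	int ready = 0;

	while (!ready && cnt--) {
		polls++;                        /* msleep() + register read here */
		ready = (polls >= ready_after);
	}
	return cnt;
}

/* Pre-decrement form. On timeout the loop exits with cnt == 0, so
 * "if (!cnt)" detects it (at the cost of one fewer poll). */
static inline int demo_poll_predec(int cnt, int ready_after)
{
	int polls = 0;
	int ready = 0;

	while (!ready && --cnt) {
		polls++;
		ready = (polls >= ready_after);
	}
	return cnt;
}
```

With a budget of 3 and hardware that never becomes ready, the post-decrement loop ends at -1 and the pre-decrement loop ends at 0, which matches the v2 note in the series cover letter about switching to pre-decrement.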
Re: [PATCH 1/1] tap: RCU usage and comment fixes
From: Wang Jian Date: Fri, 17 Aug 2018 08:22:53 + > The tap_queue and the 'tap_dev' are loosely coupled, not 'macvlan_dev'. There is another reference to macvlan_dev in that comment, which is therefore also similarly inaccurate. You should add an appropriate Fixes: line for where this inaccuracy was introduced, which is: Fixes: 6fe3faf86757 ("tap: Abstract type of virtual interface from tap implementation") > Taking rcu_read_lock a little later seems can slightly reduce rcu read > critical section. This is a separate change from fixing up a comment.
Re: [PATCH] net: lan743x_ptp: convert to ktime_get_clocktai_ts64
From: Arnd Bergmann Date: Wed, 15 Aug 2018 19:49:49 +0200 > timekeeping_clocktai64() has been renamed to ktime_get_clocktai_ts64() > for consistency with the other ktime_get_* access functions. > > Rename the new caller that has come up as well. > > Question: this is the only ptp driver that sets the hardware time > to the current system time in TAI. Why does it do that? > > Signed-off-by: Arnd Bergmann Deciding whether PTP drivers should set the hardware time at boot to the current system time is a separate discussion from using the new name for the timekeeping_clocktai64() interface, I'm applying this. Thanks Arnd.
Re: [PATCH net-next] net: sched: always disable bh when taking tcf_lock
From: Vlad Buslov Date: Tue, 14 Aug 2018 21:46:16 +0300 > Recently, ops->init() and ops->dump() of all actions were modified to > always obtain tcf_lock when accessing private action state. Actions that > don't depend on tcf_lock for synchronization with their data path use > non-bh locking API. However, tcf_lock is also used to protect rate > estimator stats in softirq context by timer callback. > > Change ops->init() and ops->dump() of all actions to disable bh when using > tcf_lock to prevent deadlock reported by following lockdep warning: ... > Taking tcf_lock in sample action with bh disabled causes lockdep to issue a > warning regarding possible irq lock inversion dependency between tcf_lock, > and psample_groups_lock that is taken when holding tcf_lock in sample init: ... > In order to prevent potential lock inversion dependency between tcf_lock > and psample_groups_lock, extract call to psample_group_get() from tcf_lock > protected section in sample action init function. > > Fixes: 4e232818bd32 ("net: sched: act_mirred: remove dependency on rtnl lock") > Fixes: 764e9a24480f ("net: sched: act_vlan: remove dependency on rtnl lock") > Fixes: 729e01260989 ("net: sched: act_tunnel_key: remove dependency on rtnl > lock") > Fixes: d77284956656 ("net: sched: act_sample: remove dependency on rtnl lock") > Fixes: e8917f437006 ("net: sched: act_gact: remove dependency on rtnl lock") > Fixes: b6a2b971c0b0 ("net: sched: act_csum: remove dependency on rtnl lock") > Fixes: 2142236b4584 ("net: sched: act_bpf: remove dependency on rtnl lock") > Signed-off-by: Vlad Buslov Applied, thanks Vlad.
Re: [PATCH]ipv6: multicast: In mld_send_cr function moving read lock to second for loop
From: Guruswamy Basavaiah Date: Fri, 17 Aug 2018 18:01:41 +0530 > @@ -1860,7 +1860,6 @@ static void mld_send_cr(struct inet6_dev *idev) > struct sk_buff *skb = NULL; > int type, dtype; > > -read_lock_bh(&idev->lock); > spin_lock(&idev->mc_lock); > > /* deleted MCA's */ This will lead to deadlocks, idev->mc_lock must be taken with _bh(). I have zero confidence in this change, did you do any stress testing with lockdep enabled? It would have caught this quickly.
Re: [PATCH] net: nixge: Add support for 64-bit platforms
From: Moritz Fischer Date: Thu, 16 Aug 2018 12:07:06 -0700 > Add support for 64-bit platforms to driver. > > The hardware only supports 32-bit register accesses > so the accesses need to be split up into two writes > when setting the current and tail descriptor values. > > Cc: Florian Fainelli > Signed-off-by: Moritz Fischer Please resubmit when the net-next tree opens back up. Thank you.
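Splitting a 64-bit descriptor address into two 32-bit register writes can be sketched as follows. The register pair is modeled as a two-element array, and the helper names and the low-then-high ordering are assumptions for illustration, not the nixge driver's actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Low and high 32-bit halves of a 64-bit value. */
static inline uint32_t demo_lower_32(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

static inline uint32_t demo_upper_32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Hypothetical register pair: regs[0] is the low word, regs[1] the high
 * word. Real hardware would use two MMIO writes at adjacent offsets. */
static inline void demo_write_desc_addr(uint32_t regs[2], uint64_t addr)
{
	regs[0] = demo_lower_32(addr);
	regs[1] = demo_upper_32(addr);
}
```

On a 32-bit platform the high word is simply zero, so the same code path can serve both cases.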
Re: pull-request: bpf 2018-08-18
From: Daniel Borkmann Date: Sat, 18 Aug 2018 01:29:20 +0200 > The following pull-request contains BPF updates for your *net* tree. > > The main changes are: ... > Please consider pulling these changes from: > > git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git Pulled, thanks.
Re: [PATCH] sunhme: convert printk to pr_cont
From: Mikulas Patocka Date: Fri, 17 Aug 2018 16:08:49 -0400 (EDT) > I'm not an expert on networking code - you can change it if it is more > appropriate this way. What Stephen is asking of you doesn't require networking expertise and he even gave you an example of how to do it. All you would need to do is test his suggestion and make sure it works properly.
Re: [PATCH] sunhme: convert printk to pr_cont
From: Mikulas Patocka Date: Fri, 17 Aug 2018 15:12:22 -0400 (EDT) > The kernel adds newlines automatically unless pr_cont is used. This patch > converts sunhme to use pr_cont, so that the messages are not broken to > multiple lines. > > The patch also adds "\n" to a few strings that were missing it. > > Signed-off-by: Mikulas Patocka > Cc: sta...@vger.kernel.org "stable", are you sure? What crash or memory corruption do these added newlines in the kernel log cause? I don't think this is appropriate for -stable, sorry. At best this is net-next material, and that tree is closed right now. Please resubmit this when the net-next tree opens back up again, thanks.