[patch net-next] net: sched: silence uninitialized parent variable warning in tc_dump_tfilter
From: Jiri Pirko When tcm->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK, parent is still passed down but the value is never used. Compiler does not recognize it and issues a warning. Silence it down initializing parent to 0. Reported-by: David Miller Reported-by: Stephen Rothwell Signed-off-by: Jiri Pirko --- net/sched/cls_api.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index e500d11da9cd..9683f8550f6f 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -1303,7 +1303,7 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb) struct tcmsg *tcm = nlmsg_data(cb->nlh); long index_start; long index; - u32 parent; + u32 parent = 0; int err; if (nlmsg_len(cb->nlh) < sizeof(*tcm)) -- 2.14.3
Re: [PATCH 0/3] Check gso_size of packets when forwarding
On Mon, Jan 15, 2018 at 6:09 PM, Daniel Axtens wrote: > When regular packets are forwarded, we validate their size against the > MTU of the destination device. However, when GSO packets are > forwarded, we do not validate their size against the MTU. We > implicitly assume that when they are segmented, the resultant packets > will be correctly sized. > > This is not always the case. > > We observed a case where a packet received on an ibmveth device had a > GSO size of around 10kB. This was forwarded by Open vSwitch to a bnx2x > device, where it caused a firmware assert. This is described in detail > at [0] and was the genesis of this series. Rather than fixing it in > the driver, this series fixes the forwarding path. > Are there any other possible forwarding path in networking stack? TC is one subsystem that could forward such a packet to the bnx2x device, how is that handled ? > To fix this: > > - Move a helper in patch 1. > > - Validate GSO segment lengths in is_skb_forwardable() in the GSO >case, rather than assuming all will be well. This fixes bridges. >This is patch 2. > > - Open vSwitch uses its own slightly specialised algorithm for >checking lengths. Wire up checking for that in patch 3. > > [0]: https://patchwork.ozlabs.org/patch/859410/ > > Cc: manish.cho...@cavium.com > Cc: d...@openvswitch.org > > Daniel Axtens (3): > net: move skb_gso_mac_seglen to skbuff.h > net: is_skb_forwardable: validate length of GSO packet segments > openvswitch: drop GSO packets that are too large > > include/linux/skbuff.h | 16 > net/core/dev.c | 7 --- > net/core/skbuff.c | 34 ++ > net/openvswitch/vport.c | 37 ++--- > net/sched/sch_tbf.c | 10 -- > 5 files changed, 84 insertions(+), 20 deletions(-) > > -- > 2.14.1 >
Re: iscsi target regression due to "tcp: remove prequeue support" patch
Hi Florian, (Adding netdev + DaveM CC') Thanks for the response. Comments below. On Mon, 2018-01-15 at 11:41 +0100, Florian Westphal wrote: > Mike Christie wrote: > > > Dec 13 17:55:01 rhel73n1 kernel: Got Login Command, Flags 0x81, ITT: > > 0x, CmdSN: 0x, ExpStatSN: 0xf86dc69b, CID: 0, Length: 65 > > > > we have got a login command and we seem to then go into > > iscsit_do_rx_data -> sock_recvmsg > > > > We seem to get stuck in there though, because we stay blocked until: > > > > Dec 13 17:55:01 rhel73n1 kernel: Entering iscsi_target_sk_data_ready: > > conn: 88b35cbb3000 > > Dec 13 17:55:01 rhel73n1 kernel: Got LOGIN_FLAGS_READ_ACTIVE=1, conn: > > 88b35cbb3000 > > > > where initiator side timeout fires 15 seconds later and it disconnects > > the tcp connection, and we eventually break out of the recvmsg call: > > > > Dec 13 17:55:16 rhel73n1 kernel: Entering iscsi_target_sk_state_change > > Dec 13 17:55:16 rhel73n1 kernel: __iscsi_target_sk_check_close: > > TCP_CLOSE_WAIT|TCP_CLOSE,returning FALSE > > > > > > > > Dec 13 17:55:16 rhel73n1 kernel: rx_loop: 68, total_rx: 68, data: 68 > > Dec 13 17:55:16 rhel73n1 kernel: iscsi_target_do_login_rx after > > rx_login_io, 88b35cbb3000, kworker/2:2:1829 > > > > Is the iscsi target doing something incorrect in its use of > > sk_data_ready and sock_recvmsg or is the tcp patch at fault? > > I have not received any bug reports except this one. > > I also have a hard time following iscsi code flow. > > > Dec 13 17:55:01 rhel73n1 kernel: Starting login_timer for kworker/2:2/1829 > > Dec 13 17:55:01 rhel73n1 kernel: rx_loop: 48, total_rx: 48, data: 48 > > Dec 13 17:55:01 rhel73n1 kernel: Got Login Command, Flags 0x81, ITT: > > 0x, CmdSN: 0x, ExpStatSN: 0xf86dc69b, CID: 0, Length: 65 > > Dec 13 17:55:01 rhel73n1 kernel: Entering iscsi_target_sk_data_ready: conn: > > 88b35cbb3000 > > Looks like things are fine up to this point. > > > Dec 13 17:55:01 rhel73n1 kernel: Got LOGIN_FLAGS_READ_ACTIVE=1, conn: > > 88b35cbb3000 > > This makes things return early from sk_data_ready callback. Correct. This is existing behavior for individual iscsi_conn login delayed_work contexts (conn->login_work) which have not yet returned from a previous sock_recvmsg(..., MSG_WAITALL) blocking call. This causes the next iscsi_target_sk_data_ready() callback to hit LOGIN_FLAGS_READ_ACTIVE=1, and return immediately without kicking conn->login_work to process iscsi_target_do_login_rx() -> sock_recvmsg(..., MSG_WAITALL). > > Dec 13 17:55:16 rhel73n1 kernel: Entering iscsi_target_sk_state_change > > Dec 13 17:55:16 rhel73n1 kernel: __iscsi_target_sk_check_close: > > TCP_CLOSE_WAIT|TCP_CLOSE,returning FALSE > > Dec 13 17:55:16 rhel73n1 kernel: __iscsi_target_sk_close_change: state: 1 > > Dec 13 17:55:16 rhel73n1 kernel: Got LOGIN_FLAGS_READ_ACTIVE=1 > > sk_state_change conn: 88b35cbb3000 > > Dec 13 17:55:16 rhel73n1 kernel: rx_loop: 68, total_rx: 68, data: 68 > > So it looks like all data is there, and probably has been there all the > past 15 seconds, but nothing noticed. > > Why is LOGIN_FLAGS_READ_ACTIVE set? Who sets this? Who is supposed to clear > that? > Why does it exist in first place? The bit is set in iscsi_target_sk_data_ready() when conn->login_work is not already blocked by sock_recvmsg(..., MSG_WAITALL). Once it's set, conn->login_work is kicked to run iscsi_target_do_login_rx() -> sock_recvmsg(..., MSG_WAITALL) which blocks waiting for the next 48 byte login request PDU + payload. Once the active conn->login_work context in iscsi_target_do_login_rx() returns from sock_recvmsg(..., MSG_WAITALL) with full login request PDU + payload bytes, the bit is cleared. AFAICT, there was a wake_up removed by commit e7942d063 that results in multi iscsi login PDU authentication exchanges blocking on a incoming login request payload. So I don't know if this is a pre-existing bug hidden in iscsi-target by tcp prequeue login, or not...? It would indicate users providing their own ->sk_data_ready() callback must be responsible for waking up a kthread context blocked on sock_recvmsg(..., MSG_WAITALL), when a second ->sk_data_ready() is received before the first sock_recvmsg(..., MSG_WAITALL) completes. I don't know if this is a pre-existing bug in iscsi-target hidden by tcp prequeue logic, or not...? Any ideas..?
Re: iscsi target regression due to "tcp: remove prequeue support" patch
On Mon, 2018-01-15 at 14:33 -0600, Mike Christie wrote: > On 01/15/2018 12:41 AM, Nicholas A. Bellinger wrote: > > Hey MNC & Co, > > > > Ping on the earlier iscsi-target authentication login failures atop > > 4.14 + commit e7942d063 removing tcp prequeue support. > > > > For reference, what is your pre 4.14 environment using for > > sysctl_tcp_low_latency..? > > tcp_low_latency=1. > > I have tested the current kernel with 1 and 0, and it does not make a > difference. > Likewise, I've always used tcp_low_latency=1 on pre 4.14.y kernels and never triggered this bug with multi login PDU exchanges. What's strange from groking commit e7942d0 though, it appear the removal of tcp prequeue logic should only effect tcp_low_latency=0 users, right..? But here, that doesn't seem to be the case..
Re: [Intel-wired-lan] [RFC v2 net-next 01/10] net: Add a new socket option for a future transmit time.
On Wed, Jan 17, 2018 at 03:06:12PM -0800, Jesus Sanchez-Palencia wrote: > From: Richard Cochran > > This patch introduces SO_TXTIME. User space enables this option in > order to pass a desired future transmit time in a CMSG when calling > sendmsg(2). > > A new field is added to struct sockcm_cookie, and the tstamp from > skbuffs will be used later on. In the discussion about the v1 patchset, there was a question if the cmsg should include a clockid_t. Without that, how can an application prevent the packet from being sent using an incorrect clock, e.g. the system clock when it expects it to be a PHC, or a different PHC when the socket is not bound to a specific interface? At least in some applications it would be preferred to not sent a packet at all instead of sending it at a wrong time. Please keep in mind that the PHCs and the system clock don't have to be synchronized to each other. If I understand the rest of the series correctly, there is an assumption that the PHCs are keeping time in TAI and CLOCK_TAI can be used as a fallback. -- Miroslav Lichvar
Re: DPAA Ethernet traffice troubles with Linux kernel
On Thu, 1970-01-01 at 00:00 +, Joakim Tjernlund wrote: > On Thu, 1970-01-01 at 00:00 +, Madalin-cristian Bucur wrote: > > CAUTION: This email originated from outside of the organization. Do not > > click links or open attachments unless you recognize the sender and know > > the content is safe. > > > > > > > -Original Message- > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com] > > > Sent: Tuesday, January 16, 2018 7:58 PM > > > To: and...@lunn.ch > > > Subject: Re: DPAA Ethernet traffice troubles with Linux kernel > > > > > > On Thu, 1970-01-01 at 00:00 +, Andrew Lunn wrote: > > > > > > > > Hi Joakim > > > > > > > > You appear to be using an old kernel. Take a look at: > > > > > > Not really, I am using 4.14.x and I don't think that is old. Seems like > > > this > > > patch hasn't been sent to 4.14.x. > > > > > > I wonder if I might be missing something else, we just moved to 4.14 and > > > notic that all > > > our fixed PHYs are non functioning: > > > fsl_mac ffe4e2000.ethernet: FMan MEMAC > > > fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20 > > > fsl_mac dpaa-ethernet.0: __devm_request_mem_region(mac) failed > > > fsl_mac: probe of dpaa-ethernet.0 failed with error -16 > > > fsl_mac ffe4e4000.ethernet: FMan MEMAC > > > fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21 > > > fsl_mac dpaa-ethernet.1: __devm_request_mem_region(mac) failed > > > fsl_mac: probe of dpaa-ethernet.1 failed with error -16 > > > fsl_mac ffe4e6000.ethernet: FMan MEMAC > > > fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22 > > > fsl_mac dpaa-ethernet.2: __devm_request_mem_region(mac) failed > > > fsl_mac: probe of dpaa-ethernet.2 failed with error -16 > > > fsl_mac ffe4e8000.ethernet: FMan MEMAC > > > fsl_mac ffe4e8000.ethernet: FMan MAC address: 00:06:9c:0b:06:23 > > > fsl_mac dpaa-ethernet.3: __devm_request_mem_region(mac) failed > > > fsl_mac: probe of dpaa-ethernet.3 failed with error -16 > > > > > > Feels like FMAN still think there are real PHYs there ? > > > > Hi Joakim, > > > > These errors are issued when trying to probe the second time the same > > MAC node. The issue was introduced by this commit: > > > > commit 4d8ee1935bcd666360311dfdadeee235d682d69a > > Author: Florian Fainelli > > Date: Tue Aug 22 15:24:47 2017 -0700 > > fsl/man: Inherit parent device and of_node > > > > and was later addressed by this patch set: > > > > http://patchwork.ozlabs.org/project/netdev/list/?series=8462&state=* > > > > Even with these errors printed, all is working fine, it's just the > > second probing that fails. Adding the latter patches or reverting > > the one above makes the errors prints dissapear. > > > > Madalin > > Ahh, now it starts to look better, reverting "fsl/man: Inherit parent device > and of_node" on 4.14 gives: > libphy: Fixed MDIO Bus: probed > tun: Universal TUN/TAP device driver, 1.6 > libphy: Freescale XGMAC MDIO Bus: probed > iommu: Adding device ffe488000.port to group 10 > libphy: Freescale XGMAC MDIO Bus: probed > mdio_bus ffe4e1000: Error while reading PHY0 reg at 3.3 > iommu: Adding device ffe489000.port to group 22 > libphy: Freescale XGMAC MDIO Bus: probed > mdio_bus ffe4e3000: Error while reading PHY0 reg at 3.3 > iommu: Adding device ffe48a000.port to group 23 > libphy: Freescale XGMAC MDIO Bus: probed > mdio_bus ffe4e5000: Error while reading PHY0 reg at 3.3 > iommu: Adding device ffe48b000.port to group 24 > libphy: Freescale XGMAC MDIO Bus: probed > mdio_bus ffe4e7000: Error while reading PHY0 reg at 3.3 > iommu: Adding device ffe48c000.port to group 25 > libphy: Freescale XGMAC MDIO Bus: probed > mdio_bus ffe4e9000: Error while reading PHY0 reg at 3.3 > fsl_mac ffe4e2000.ethernet: FMan MEMAC > fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20 > fsl_mac ffe4e4000.ethernet: FMan MEMAC > fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21 > fsl_mac ffe4e6000.ethernet: FMan MEMAC > fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22 > fsl_mac ffe4e8000.ethernet: FMan MEMAC > fsl_mac ffe4e8000.ethernet: FMan MAC address: 00:06:9c:0b:06:23 > fsl_mac ffe4e.ethernet: FMan MEMAC > fsl_mac ffe4e.ethernet: FMan MAC address: 00:06:9c:0b:06:1f > fsl_dpa dpaa-ethernet.0 eth0: Probed interface eth0 > fsl_dpa dpaa-ethernet.1 eth1: Probed interface eth1 > fsl_dpa dpaa-ethernet.2 eth2: Probed interface eth2 > fsl_dpa dpaa-ethernet.3 eth3: Probed interface eth3 > fsl_dpa dpaa-ethernet.4 eth4: Probed interface eth4 > > Still some minor errors: mdio_bus ffe4e7000: Error while reading PHY0 reg at > 3.3 > but this is going the right way(I have not had a chance to try if they work > due > to external modules not ported/ready yet) > > The other patch series is still to be tested though but I already now wanted > to stress the importance of getting all upstream fixes into stable, ASAP. > You now what they are, I have no idea. FYI, applied http://patchwork.ozlabs.org/project/ne
Re: [PATCH next-queue 2/2] ixgbe: add unlikely notes to tx fastpath expressions
On 2018/1/9 6:47, Shannon Nelson wrote: Add unlikely() to a few error checking expressions in the Tx offload handling. Suggested-by: Yanjun Zhu Hi, I am fine with this patch. I have a question. The ipsec feature is supported in ixgbevf? Thanks a lot. Zhu Yanjun Signed-off-by: Shannon Nelson --- drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c index 57c10e6..3d069a2 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c @@ -749,28 +749,28 @@ int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct xfrm_state *xs; struct tx_sa *tsa; - if (!first->skb->sp->len) { + if (unlikely(!first->skb->sp->len)) { netdev_err(tx_ring->netdev, "%s: no xfrm state len = %d\n", __func__, first->skb->sp->len); return 0; } xs = xfrm_input_state(first->skb); - if (!xs) { + if (unlikely(!xs)) { netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs = %p\n", __func__, xs); return 0; } itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX; - if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) { + if (unlikely(itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT)) { netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d handle=%lu\n", __func__, itd->sa_idx, xs->xso.offload_handle); return 0; } tsa = &ipsec->tx_tbl[itd->sa_idx]; - if (!tsa->used) { + if (unlikely(!tsa->used)) { netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n", __func__, itd->sa_idx); return 0;
Re: [PATCH net] gso: validate gso_type if SKB_GSO_DODGY
On 2018年01月18日 14:04, Willem de Bruijn wrote: On Thu, Jan 18, 2018 at 12:20 AM, Willem de Bruijn wrote: On Wed, Jan 17, 2018 at 10:48 PM, Jason Wang wrote: On 2018年01月18日 07:11, Willem de Bruijn wrote: From: Willem de Bruijn Validate gso_type of untrusted SKB_GSO_DODGY packets during segmentation. Untrusted user packets are limited to a small set of gso types in virtio_net_hdr_to_skb. But segmentation occurs on packet contents. Syzkaller was able to enter gso callbacks that are not hardened against untrusted user input. Fixes: f43798c27684 ("tun: Allow GSO using virtio_net_hdr") This commit is suspicious, I guess it should be 5c7cdf339af5 ("gso: Remove arbitrary checks for unsupported GSO") The specific SCTP path was introduced with commit 90017accff61 ("sctp: add GSO support"). But the main issue that packets can be delivered to gso handlers different from their gso_type goes back further. The commit you reference is actually older than the sctp gso patch, so it makes sense that it did not have a check in the sctp_gso_segment. I still think that we should check in inet_gso_segment when we have the proto, instead of in each {tcp, sctp, udp, esp, ...} handler having a check of the form. !(type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6 Unless we can create packets that legitimate combine SKB_GSO_DODGY with tunnel headers. As you mentioned below, looks like we can e.g bridge between tunnels (vxlan, gre or others) and tap, or even bpf can produce this (e.g bpf_skb_adjust_room). virtio_net_hdr_to_skb does not accept tunneled gso types. Yes, Vlad is trying to extend virtio to support more kinds of gso types, so it will be supported for sure. But a tun device can be bridged with a gre tunnel in the host, creating a path that will call gre_gso_segment. If that is possible, then this patch is indeed too strict and we do need checks in the individual handlers. I think so. Thanks
Re: [PATCH net] net: validate untrusted gso packets
On 2018年01月18日 13:09, Willem de Bruijn wrote: Also, a packet can just as easily spoof an esp packet. See also the discussion in the Link above. Then we probably need check there. Adding a check in every gso handler that does not expect packets with SKB_GSO_DODGY source seems backwards to me. A packet with gso_type SKB_GSO_TCPV4 should never be delivered to an sctp handler, say. We must check before passing to a handler whether the packet matches the callback. That's the case for trusted source. For dodgy source, we can't assume that. We can address this specific issue in segmentation by placing a check analogous to skb_validate_dodgy_gso in the network layer. Something like this (but with ECN option support): @@ -1258,6 +1258,22 @@ struct sk_buff *inet_gso_segment(struct sk_buff *skb, skb_reset_transport_header(skb); + gso_type = skb_shinfo(skb)->gso_type; + if (gso_type & SKB_GSO_DODGY) { + switch (gso_type & (SKB_GSO_TCPV4 | SKB_GSO_UDP)) { + case SKB_GSO_TCPV4: + if (proto != IPPROTO_TCP) + goto out; + break; + case SKB_GSO_UDP: + if (proto != IPPROTO_UDP) + goto out; + break; + default: + goto out; + } + } This implements subset of function for codes which was removed by the commit I mentioned below. No, as I explain above, it performs a different check. [...] For performance reason. I think we should delay the check or segmentation as much as possible until it was really needed. Going through segmentation is probably as expensive as flow dissector, if not more so because of the indirect branches. I think we don't even need to care about this consider the evil packet should be rare. How does frequency matter when a single packet can crash a host? I mean consider we had fix the crash, we don't care how expensive do we spot this. And what you propose here is just a very small subset of the necessary checking, more comes at gso header checking. So even if we care performance, it only help for some specific case. It also fixed the bug that Eric sent a separate patch for, as that did not dissect as a valid TCP packet, either. I may miss something but how did this patch protects an evil thoff? And now we have two pieces of code that need to be hardened against malicious packets. To me fixing the exist path is a good balance between security and performance. But I did overlook the guest to guest communication, with virtual devices that can set NETIF_F_GSO_ROBUST and so do not enter the stack. That means that one guest can send misrepresented packets to another. Is that preferable to absorbing the cost of validation? The problem is that guests and hosts don't trust each other. Even if one packet has already been validated by one side, it must be validated again by another side when needed. Doing validation twice is meaningless in this case. Ack, agreed that guests need to have defensive coding of themselves. Then it's fine to pass packets to guests where the gso_type does not match the contents. The current patch does not actually deprecate the path through the segmentation layer. So it will indeed add overhead. We would have to remove the DODGY logic in net-next. I don't get the point of removing this. One additional possible optimization is to switch to flow_keys_buf_dissector, which does not dissect as deeply. I did that in an earlier patch; it should be sufficient for this purpose. I suspect the cost is still noticeable, and consider we may support kind of tunnel offload in the future. We should first assume the correctness of metadata provided by untrusted source and validate it before we really want to use it. Validate them during entry means we assume they are wrong, then there's no need for most of the fields of virtio-net header, If you always need to use the data, then you always validate. In which case you want to validate early, as there will be less vulnerable code before validation. But I see what I think is your point: in guest to guest communication we do not need to validate at all, so early validation adds unnecessary cost for this important use case. That's a fair argument for validating later and only when needed (i.e., a path without NETIF_F_GSO_ROBUST). Yes, and looking at Herbert's commit log for 576a30eb6453 ("[NET]: Added GSO header verification"). It was intended do the verification just before gso. Thanks
[PATCH v2] tools/bpftool: silence a static checker warning
There is a static checker warning that "proglen" has an upper bound but no lower bound. The allocation will just fail harmlessly so it's not a big deal. Signed-off-by: Dan Carpenter --- v2: change the type to unsigned instead of checking for negatives diff --git a/tools/bpf/bpf_jit_disasm.c b/tools/bpf/bpf_jit_disasm.c index 30044bc4f389..58c2bab4ef6e 100644 --- a/tools/bpf/bpf_jit_disasm.c +++ b/tools/bpf/bpf_jit_disasm.c @@ -172,7 +172,8 @@ static uint8_t *get_last_jit_image(char *haystack, size_t hlen, { char *ptr, *pptr, *tmp; off_t off = 0; - int ret, flen, proglen, pass, ulen = 0; + unsigned int proglen; + int ret, flen, pass, ulen = 0; regmatch_t pmatch[1]; unsigned long base; regex_t regex; @@ -199,7 +200,7 @@ static uint8_t *get_last_jit_image(char *haystack, size_t hlen, } ptr = haystack + off - (pmatch[0].rm_eo - pmatch[0].rm_so); - ret = sscanf(ptr, "flen=%d proglen=%d pass=%d image=%lx", + ret = sscanf(ptr, "flen=%d proglen=%u pass=%d image=%lx", &flen, &proglen, &pass, &base); if (ret != 4) { regfree(®ex); @@ -239,7 +240,7 @@ static uint8_t *get_last_jit_image(char *haystack, size_t hlen, } assert(ulen == proglen); - printf("%d bytes emitted from JIT compiler (pass:%d, flen:%d)\n", + printf("%u bytes emitted from JIT compiler (pass:%d, flen:%d)\n", proglen, pass, flen); printf("%lx + :\n", base);
Re: [PATCH 0/3] Check gso_size of packets when forwarding
On 2018年01月18日 16:28, Pravin Shelar wrote: On Mon, Jan 15, 2018 at 6:09 PM, Daniel Axtens wrote: When regular packets are forwarded, we validate their size against the MTU of the destination device. However, when GSO packets are forwarded, we do not validate their size against the MTU. We implicitly assume that when they are segmented, the resultant packets will be correctly sized. This is not always the case. We observed a case where a packet received on an ibmveth device had a GSO size of around 10kB. This was forwarded by Open vSwitch to a bnx2x device, where it caused a firmware assert. This is described in detail at [0] and was the genesis of this series. Rather than fixing it in the driver, this series fixes the forwarding path. Are there any other possible forwarding path in networking stack? TC is one subsystem that could forward such a packet to the bnx2x device, how is that handled ? Yes, so it looks to me we should do the check in e.g netif_needs_gso() before passing it to hardware. And bnx2x needs to set its gso_max_size correctly. Btw, looks like this could be triggered from a guest which is a DOS. We need request a CVE for this. Thanks To fix this: - Move a helper in patch 1. - Validate GSO segment lengths in is_skb_forwardable() in the GSO case, rather than assuming all will be well. This fixes bridges. This is patch 2. - Open vSwitch uses its own slightly specialised algorithm for checking lengths. Wire up checking for that in patch 3. [0]: https://patchwork.ozlabs.org/patch/859410/ Cc: manish.cho...@cavium.com Cc: d...@openvswitch.org Daniel Axtens (3): net: move skb_gso_mac_seglen to skbuff.h net: is_skb_forwardable: validate length of GSO packet segments openvswitch: drop GSO packets that are too large include/linux/skbuff.h | 16 net/core/dev.c | 7 --- net/core/skbuff.c | 34 ++ net/openvswitch/vport.c | 37 ++--- net/sched/sch_tbf.c | 10 -- 5 files changed, 84 insertions(+), 20 deletions(-) -- 2.14.1
Re: [PATCH v6 00/36] Andes(nds32) Linux Kernel Port
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > This is the 6th version patchset to add the Linux kernel port for Andes(nds32) > processors. Almost all of the feedbacks from v5 patchseries has been > addressed. > Thanks to everyone who provided feedback on the previous version. > > > This patchset adds core architecture support to Linux for Andestech's > N13, N15, D15, N10, D10 processor cores. > > Based on the 16/32-bit AndeStar RISC-like architecture, we designed the > configurable AndesCore series of embedded processor families. AndesCores > range from highly performance-efficient small-footprint cores for > microcontrollers and deeply-embedded applications to 1GHz+ cores running > Linux, covering general-purpose N-series cores for a wide range of computing > need, DSP-capable D-series cores for digital signal control, > instruction-extensible E-series cores for application-specific acceleration, > and secure S-series cores for best protection of the most valuable. > > The patches are based on v4.14-rc8, and can also be found in the > following git tree: > https://github.com/andestech/linux.git nds32-4.14-rc8-v6 > > The build script and toolchain repositories are able to be found here: > https://github.com/andestech/build_script.git > > Freely available instruction set and architecture overview documents can > be found on the following page: > http://www.andestech.com/product.php?cls=9 > > > Vincent Ren-Wei Chen and I will maintain this port. Thanks to everyone who > helped us and contributed to it. :) Any feedback is welcome. Hi Greentime, I think it's time to move this to the next step towards inclusion, as you appear to have addressed all major issues, and only smaller details remain. This is what I'd suggest for moving forward: - split the patches into two branches in git: one 'next' branch for everything that is ready to get merged in one pull request that you send to Linus, including drivers that you have received an Ack for from the respective subsystem maintainers, and one 'testing' branch for anything that is either not quite ready or that you expect to get merged through someone else's tree (e.g. most device drivers). The 'testing' branch can merge in the 'next' branch or get rebased on it frequently. - Ask reviewers to send 'Acked-by' or 'Reviewed-by' tags for the patches they are happy with, or to complain if they still see any major issues that should prevent the series from going in. I'll go through the submission once more myself and Ack any patches that I have actually reviewed. - Decide what base to use for the 'next' branch, rebase it on that release one more time and then plan to never rebase it again. This will be the branch to send to linux-next and to Linus Torvalds. Given the current timing of the merge window, I would suggest to base it on top of v4.16-rc1 once that gets released, and then send it for inclusion in 4.17. Any changes you do in the meantime would be added on top of the original set. - Ask Stephen Rothwell to include your 'next' branch in linux-next. (if you base on 4.16-rc1, wait until that is out). - Submit any remaining driver patches that you do not have an 'Ack' for to the respective subsystem maintainers for inclusion in their trees. - Upload a gpg key (4096 bits) for your greent...@andestech.com address to the keyservers, and arrange to have that signed by other kernel developers. You will need to sign git tags for pull requests with your key, and the key itself should be signed for this to work best. - Once you have at least three signatures on your gpg key, apply for an account on kernel.org and move your git tree there, see https://www.kernel.org/category/faq.html. Alternatively, host the git tree on a web-facing git server from andestech.com (github works initially, but has a couple of shortcomings). Arnd
Re: [PATCH] net: fec: add necessary defines to work on ARM64
Hi Andy, Am Donnerstag, den 18.01.2018, 01:49 + schrieb Andy Duan: > From: Lucas Stach Sent: Thursday, January > 18, 2018 2:31 AM > > The i.MX8 is a ARMv8 based SoC, that uses the same FEC IP as the > > earlier, > > ARMv7 based, i.MX SoCs. Allow the driver to work on ARM64. > > > > Signed-off-by: Lucas Stach > > --- [...] > Kconfig also need to add ARM64 support. No, not as far as I'm aware. I introduced the ARCH_MXC symbol for ARM64, which unlocks the FEC config option when selected. Regards, Lucas
Re: [PATCH v6 03/36] sparc: io: To use the define of ioremap_[nocache|wc|wb] in asm-generic/io.h
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > It will be built failed if commit id: d25ea659 is selected. This patch > can fix this build error. > > Signed-off-by: Greentime Hu The change is fine, but the reference to commit 'd25ea659' is not for two reasons: - when you rebase the tree, you will get a different ID - the recommended format for referring to another commit is to list it with 12 digits and the one-line summary like this: commit d25ea659bc37 ("asm-generic/io.h: move ioremap_nocache/ioremap_uc/ioremap_wc/ioremap_…") - Ideally you use a 'Fixes:' tag in the patch description. You can add these lines in your .gitconfig to help you here [alias] fixes = show --format='Fixes: %h (\"%s\")' -s [core] abbrev = 12 That will give you a 'git fixes' command to output the Fixes: line in the correct format for future bug fixes. For this particular change, I would actually just merge the two patches into one patch that then doesn't break anything in the first place. Arnd
RE: [PATCH] net: fec: add necessary defines to work on ARM64
From: Lucas Stach Sent: Thursday, January 18, 2018 5:54 PM >Hi Andy, > >Am Donnerstag, den 18.01.2018, 01:49 + schrieb Andy Duan: >> From: Lucas Stach Sent: Thursday, January 18, >> 2018 2:31 AM >> > The i.MX8 is a ARMv8 based SoC, that uses the same FEC IP as the >> > earlier, >> > ARMv7 based, i.MX SoCs. Allow the driver to work on ARM64. >> > >> > Signed-off-by: Lucas Stach >> > --- >[...] > >> Kconfig also need to add ARM64 support. > >No, not as far as I'm aware. I introduced the ARCH_MXC symbol for ARM64, >which unlocks the FEC config option when selected. > >Regards, >Lucas Okay, that make sense. Acked-by: Fugang Duan
Re: [PATCH v6 04/36] earlycon: add reg-offset to physical address before mapping
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > It will get the wrong virtual address because port->mapbase is not added > the correct reg-offset yet. We have to update it before earlycon_map() > is called > > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann Cc: Peter Hurley Cc: sta...@vger.kernel.org Fixes: 088da2a17619 ("of: earlycon: Initialize port fields from DT properties") As this is a bugfix for generic code, this might be good to have Greg pick up directly for his serial drivers tree. An Ack from Peter Hurley would also help, as he introduced the reg-offset handling originally. Arnd
Re: [PATCH v6 05/36] nds32: Assembly macros and definitions
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch includes assembly macros, bit field definitions used in .S > files across arch/nds32/. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 06/36] nds32: Kernel booting and initialization
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: I had not looked at this patch in enough detail earlier, sorry about that. It should be easy enough to fix though. > +#ifdef CONFIG_VGA_CONSOLE > +struct screen_info screen_info; > +#endif I would assume that you can't ever have a VGA console. Just drop all the references here and instead send a patch to the fbdev maintainer to add the dependency at CONFIG_VGA_CONSOLE to prevent selecting it with nds32. > +extern struct mm_struct init_mm; init_mm is declared in linux/mm_types.h, so you should need another declaration. In general, you should never put 'extern' declarations in to .c files anyway. > + > +extern void __init early_init_devtree(void *params); > +extern void __init early_trap_init(void); similarly, these are declared in include/linux/of_fdt.h > +void __init setup_arch(char **cmdline_p) > +{ > + early_init_devtree(__atags_pointer ? > + phys_to_virt(__atags_pointer) : __dtb_start); The reference to '__atags_pointer' appears to be a leftover from pre-DT days. Can you just remove that? > +void calibrate_delay(void) > +{ > + const int *val; > + struct device_node *cpu = NULL; > + cpu = of_find_compatible_node(NULL, NULL, "andestech,nds32v3"); > + val = of_get_property(cpu, "clock-frequency", NULL); > + if (!val || !*val) > + panic("no cpu 'clock-frequency' parameter in device tree"); > + loops_per_jiffy = be32_to_cpup(val) / HZ; > + pr_cont("%lu.%02lu BogoMIPS (lpj=%lu)\n", > + loops_per_jiffy / (50 / HZ), > + (loops_per_jiffy / (5000 / HZ)) % 100, loops_per_jiffy); > +} This seems very odd to me: The 'clock-frequency' property in the cpu node should refer to the actual frequency it is running at, but that tends to be different from the bogomips as needed by the ndelay() function. Can you explain what is going on here? Arnd
Re: [PATCHv3 net-next 2/2] openvswitch: add erspan version I and II support
Looks much better. On Wed, 17 Jan 2018 09:32:51 -0800, William Tu wrote: > + OVS_ERSPAN_OPT_IDX, /* be32 index */ Why don't you convert this to u32 while passing to/from user space? > + [OVS_ERSPAN_OPT_IDX]= { .len = sizeof(u32) }, sizeof(__be32) but see above. Thanks, Jiri
Re: [PATCH v6 07/36] nds32: Exception handling
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch includes the exception/interrupt entries, pt_reg structure and > related accessors. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Here it would be good to have a more detailed explanation about the alignment trap handling. I remember discussing it with you before, but don't remember the exact outcome. In particular you should explain here why you need to handle alignment traps in the first place, and what the expected defaults are (e.g. always disabled unless a user requests it, or always enabled) and what kind of code runs into the traps (e.g. buggy kernel code, correct kernel code, buggy user space code etc). Arnd
[PATCH 1/2] can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
If an invalid CAN frame is received, from a driver or from a tun interface, a Kernel warning is generated. This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a kernel, bootet with panic_on_warn, does not panic. A printk seems to be more appropriate here. Reported-by: syzbot+4386709c0c1284dca...@syzkaller.appspotmail.com Suggested-by: Dmitry Vyukov Acked-by: Oliver Hartkopp Cc: linux-stable Signed-off-by: Marc Kleine-Budde --- net/can/af_can.c | 18 +++--- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/net/can/af_can.c b/net/can/af_can.c index 003b2d6d655f..ae835382e678 100644 --- a/net/can/af_can.c +++ b/net/can/af_can.c @@ -721,20 +721,16 @@ static int can_rcv(struct sk_buff *skb, struct net_device *dev, { struct canfd_frame *cfd = (struct canfd_frame *)skb->data; - if (WARN_ONCE(dev->type != ARPHRD_CAN || - skb->len != CAN_MTU || - cfd->len > CAN_MAX_DLEN, - "PF_CAN: dropped non conform CAN skbuf: " - "dev type %d, len %d, datalen %d\n", - dev->type, skb->len, cfd->len)) - goto drop; + if (unlikely(dev->type != ARPHRD_CAN || skb->len != CAN_MTU || +cfd->len > CAN_MAX_DLEN)) { + pr_warn_once("PF_CAN: dropped non conform CAN skbuf: dev type %d, len %d, datalen %d\n", +dev->type, skb->len, cfd->len); + kfree_skb(skb); + return NET_RX_DROP; + } can_receive(skb, dev); return NET_RX_SUCCESS; - -drop: - kfree_skb(skb); - return NET_RX_DROP; } static int canfd_rcv(struct sk_buff *skb, struct net_device *dev, -- 2.15.1
pull-request: can 2018-01-18
Hello David, this is a pull reqeust of two patches for net/master: The syzkaller project triggered two WARN_ONCE() in the af_can code from userspace and we decided to replace it by a pr_warn_once(). regards, Marc --- The following changes since commit ad9294dbc227cbc8e173b3b963e7dd9af5314f77: bpf: fix cls_bpf on filter replace (2018-01-17 17:14:06 -0500) are available in the Git repository at: ssh://g...@gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can.git tags/linux-can-fixes-for-4.15-20180118 for you to fetch changes up to d4689846881d160a4d12a514e991a740bcb5d65a: can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once (2018-01-18 09:32:54 +0100) linux-can-fixes-for-4.15-20180118 Marc Kleine-Budde (2): can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once net/can/af_can.c | 36 ++-- 1 file changed, 14 insertions(+), 22 deletions(-)
[PATCH 2/2] can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
If an invalid CANFD frame is received, from a driver or from a tun interface, a Kernel warning is generated. This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a kernel, bootet with panic_on_warn, does not panic. A printk seems to be more appropriate here. Reported-by: syzbot+e3b775f40babeff6e...@syzkaller.appspotmail.com Suggested-by: Dmitry Vyukov Acked-by: Oliver Hartkopp Cc: linux-stable Signed-off-by: Marc Kleine-Budde --- net/can/af_can.c | 18 +++--- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/net/can/af_can.c b/net/can/af_can.c index ae835382e678..4d7f988a3130 100644 --- a/net/can/af_can.c +++ b/net/can/af_can.c @@ -738,20 +738,16 @@ static int canfd_rcv(struct sk_buff *skb, struct net_device *dev, { struct canfd_frame *cfd = (struct canfd_frame *)skb->data; - if (WARN_ONCE(dev->type != ARPHRD_CAN || - skb->len != CANFD_MTU || - cfd->len > CANFD_MAX_DLEN, - "PF_CAN: dropped non conform CAN FD skbuf: " - "dev type %d, len %d, datalen %d\n", - dev->type, skb->len, cfd->len)) - goto drop; + if (unlikely(dev->type != ARPHRD_CAN || skb->len != CANFD_MTU || +cfd->len > CANFD_MAX_DLEN)) { + pr_warn_once("PF_CAN: dropped non conform CAN FD skbuf: dev type %d, len %d, datalen %d\n", +dev->type, skb->len, cfd->len); + kfree_skb(skb); + return NET_RX_DROP; + } can_receive(skb, dev); return NET_RX_SUCCESS; - -drop: - kfree_skb(skb); - return NET_RX_DROP; } /* -- 2.15.1
Re: [PATCH v2 03/31] net: Introduce net_sem for protection of pernet_list
On 17.01.2018 23:04, Andrei Vagin wrote: > On Mon, Nov 20, 2017 at 09:32:34PM +0300, Kirill Tkhai wrote: >> Curently mutex is used to protect pernet operations list. It makes >> cleanup_net() to execute ->exit methods of the same operations set, >> which was used on the time of ->init, even after net namespace is >> unlinked from net_namespace_list. >> >> But the problem is it's need to synchronize_rcu() after net is removed >> from net_namespace_list(): >> >> Destroy net_ns: >> cleanup_net() >> mutex_lock(&net_mutex) >> list_del_rcu(&net->list) >> synchronize_rcu() <--- Sleep there for >> ages >> list_for_each_entry_reverse(ops, &pernet_list, list) >> ops_exit_list(ops, &net_exit_list) >> list_for_each_entry_reverse(ops, &pernet_list, list) >> ops_free_list(ops, &net_exit_list) >> mutex_unlock(&net_mutex) >> >> This primitive is not fast, especially on the systems with many processors >> and/or when preemptible RCU is enabled in config. So, all the time, while >> cleanup_net() is waiting for RCU grace period, creation of new net namespaces >> is not possible, the tasks, who makes it, are sleeping on the same mutex: >> >> Create net_ns: >> copy_net_ns() >> mutex_lock_killable(&net_mutex)<--- Sleep there for >> ages >> >> I observed 20-30 seconds hangs of "unshare -n" on ordinary 8-cpu laptop >> with preemptible RCU enabled. >> >> The solution is to convert net_mutex to the rw_semaphore and add small locks >> to really small number of pernet_operations, what really need them. Then, >> pernet_operations::init/::exit methods, modifying the net-related data, >> will require down_read() locking only, while down_write() will be used >> for changing pernet_list. >> >> This gives signify performance increase, after all patch set is applied, >> like you may see here: >> >> %for i in {1..1}; do unshare -n bash -c exit; done >> >> *before* >> real 1m40,377s >> user 0m9,672s >> sys 0m19,928s >> >> *after* >> real 0m17,007s >> user 0m5,311s >> sys 0m11,779 >> >> (5.8 times faster) >> >> This patch starts replacing net_mutex to net_sem. It adds rw_semaphore, >> describes the variables it protects, and makes to use where appropriate. >> net_mutex is still present, and next patches will kick it out step-by-step. >> >> Signed-off-by: Kirill Tkhai >> --- >> include/linux/rtnetlink.h |1 + >> net/core/net_namespace.c | 39 ++- >> net/core/rtnetlink.c |4 ++-- >> 3 files changed, 29 insertions(+), 15 deletions(-) >> >> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h >> index 2032ce2eb20b..f640fc87fe1d 100644 >> --- a/include/linux/rtnetlink.h >> +++ b/include/linux/rtnetlink.h >> @@ -35,6 +35,7 @@ extern int rtnl_is_locked(void); >> >> extern wait_queue_head_t netdev_unregistering_wq; >> extern struct mutex net_mutex; >> +extern struct rw_semaphore net_sem; >> >> #ifdef CONFIG_PROVE_LOCKING >> extern bool lockdep_rtnl_is_held(void); >> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c >> index 2e512965bf42..859dce31e37e 100644 >> --- a/net/core/net_namespace.c >> +++ b/net/core/net_namespace.c >> @@ -41,6 +41,11 @@ struct net init_net = { > >> static LIST_HEAD(pernet_list); >> static struct list_head *first_device = &pernet_list; >> DEFINE_MUTEX(net_mutex); > > With all patches, we still have the net_mutex, I think we need to add a > comment, which explains why we need it. Are "sync" pernet operations > depricated after this series? Or is it ok to have them? net_mutex will leave till the time all pernet_operations are converted. But people, who don't use unconverted operations, will have performance profit already. Comment is not a problem :) Thanks, Kirill >> EXPORT_SYMBOL(init_net); >> >> static bool init_net_initialized; >> +/* >> + * net_sem: protects: pernet_list, net_generic_ids, >> + * init_net_initialized and first_device pointer. >> + */ >> +DECLARE_RWSEM(net_sem); >> >> #define MIN_PERNET_OPS_ID \ >> ((sizeof(struct net_generic) + sizeof(void *) - 1) / sizeof(void *)) >> @@ -279,7 +284,7 @@ struct net *get_net_ns_by_id(struct net *net, int id) >> */ >> static __net_init int setup_net(struct net *net, struct user_namespace >> *user_ns) >> { >> -/* Must be called with net_mutex held */ >> +/* Must be called with net_sem held */ >> const struct pernet_operations *ops, *saved_ops; >> int error = 0; >> LIST_HEAD(net_exit_list); >> @@ -411,12 +416,16 @@ struct net *copy_net_ns(unsigned long flags, >> net->ucounts = ucounts; >> get_user_ns(user_ns); >> >> -rv = mutex_lock_killable(&net_mutex); >> +rv = down_read_killable(&net_sem); >> if (rv < 0) >> goto put_userns; >> - >> +rv = mutex_lock_killable(&net_mutex); >> +if (rv < 0) >> +goto up_read; >> rv = setup_net(net, user_ns); >> mutex_unlock(&net_mutex); >> +up_read: >> +up_read
Re: [PATCH v6 08/36] nds32: MMU definitions
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch includes virtual memory layout, PHYS_OFFSET is defined as 0x0. It > also includes the 4KB/8KB page size configurations and pte operations. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v2 05/31] net: Allow pernet_operations to be executed in parallel
On 17.01.2018 21:34, Andrei Vagin wrote: > On Mon, Nov 20, 2017 at 09:32:55PM +0300, Kirill Tkhai wrote: >> This adds new pernet_operations::async flag to indicate operations, >> which ->init(), ->exit() and ->exit_batch() methods are allowed >> to be executed in parallel with the methods of any other pernet_operations. >> >> When there are only asynchronous pernet_operations in the system, >> net_mutex won't be taken for a net construction and destruction. >> >> Also, remove BUG_ON(mutex_is_locked()) from net_assign_generic() >> without replacing with the equivalent net_sem check, as there is >> one more lockdep assert below. >> >> Suggested-by: Eric W. Biederman >> Signed-off-by: Kirill Tkhai >> --- >> include/net/net_namespace.h |6 ++ >> net/core/net_namespace.c| 29 +++-- >> 2 files changed, 25 insertions(+), 10 deletions(-) >> >> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h >> index 10f99dafd5ac..db978c4755f7 100644 >> --- a/include/net/net_namespace.h >> +++ b/include/net/net_namespace.h >> @@ -303,6 +303,12 @@ struct pernet_operations { >> void (*exit_batch)(struct list_head *net_exit_list); >> unsigned int *id; >> size_t size; >> +/* >> + * Indicates above methods are allowe to be executed in parallel >> + * with methods of any other pernet_operations, i.e. they are not >> + * need synchronization via net_mutex. >> + */ >> +bool async; >> }; >> >> /* >> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c >> index c4f7452906bb..550c766f73aa 100644 >> --- a/net/core/net_namespace.c >> +++ b/net/core/net_namespace.c >> @@ -41,8 +41,9 @@ struct net init_net = { >> EXPORT_SYMBOL(init_net); >> >> static bool init_net_initialized; >> +static unsigned nr_sync_pernet_ops; >> /* >> - * net_sem: protects: pernet_list, net_generic_ids, >> + * net_sem: protects: pernet_list, net_generic_ids, nr_sync_pernet_ops, >> * init_net_initialized and first_device pointer. >> */ >> DECLARE_RWSEM(net_sem); >> @@ -70,11 +71,10 @@ static int net_assign_generic(struct net *net, unsigned >> int id, void *data) >> { >> struct net_generic *ng, *old_ng; >> >> -BUG_ON(!mutex_is_locked(&net_mutex)); >> BUG_ON(id < MIN_PERNET_OPS_ID); >> >> old_ng = rcu_dereference_protected(net->gen, >> - lockdep_is_held(&net_mutex)); >> + lockdep_is_held(&net_sem)); >> if (old_ng->s.len > id) { >> old_ng->ptr[id] = data; >> return 0; >> @@ -419,11 +419,14 @@ struct net *copy_net_ns(unsigned long flags, >> rv = down_read_killable(&net_sem); >> if (rv < 0) >> goto put_userns; >> -rv = mutex_lock_killable(&net_mutex); >> -if (rv < 0) >> -goto up_read; >> +if (nr_sync_pernet_ops) { >> +rv = mutex_lock_killable(&net_mutex); >> +if (rv < 0) >> +goto up_read; >> +} >> rv = setup_net(net, user_ns); >> -mutex_unlock(&net_mutex); >> +if (nr_sync_pernet_ops) >> +mutex_unlock(&net_mutex); >> up_read: >> up_read(&net_sem); >> if (rv < 0) { >> @@ -453,7 +456,8 @@ static void cleanup_net(struct work_struct *work) >> spin_unlock_irq(&cleanup_list_lock); >> >> down_read(&net_sem); >> -mutex_lock(&net_mutex); >> +if (nr_sync_pernet_ops) >> +mutex_lock(&net_mutex); >> >> /* Don't let anyone else find us. */ >> rtnl_lock(); >> @@ -489,7 +493,8 @@ static void cleanup_net(struct work_struct *work) >> list_for_each_entry_reverse(ops, &pernet_list, list) >> ops_exit_list(ops, &net_exit_list); >> >> -mutex_unlock(&net_mutex); >> +if (nr_sync_pernet_ops) >> +mutex_unlock(&net_mutex); >> >> /* Free the net generic variables */ >> list_for_each_entry_reverse(ops, &pernet_list, list) >> @@ -961,6 +966,9 @@ static int register_pernet_operations(struct list_head >> *list, >> rcu_barrier(); >> if (ops->id) >> ida_remove(&net_generic_ids, *ops->id); >> +} else if (!ops->async) { >> +pr_info_once("Pernet operations %ps are sync.\n", ops); > > As far as I understand, we have this sync mode for backward > compatibility with non-upstream modules, don't we? If the answer is yes, > it may be better to add WARN_ONCE here? There are 200+ more pernet operations requiring the review and making them async. This pr_info_once() is to help people find unconverted pernet_operations they use and start the work on converting them. Thanks, Kirill >> +nr_sync_pernet_ops++; >> } >> >> return error; >> @@ -968,7 +976,8 @@ static int register_pernet_operations(struct list_head >> *list, >> >> static void unregister_pernet_operations(struct pernet_operations *ops) >> { >> - >> +if (!ops->async) >> +BU
Re: [PATCH v6 09/36] nds32: MMU initialization
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch includes memory initializations and highmem supporting. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 10/36] nds32: MMU fault handling and page table management
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch includes page fault handler, mmap and fixup implementations. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 11/36] nds32: Cache and TLB routines
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch contains cache and TLB maintenance functions. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 12/36] nds32: Process management
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > +void machine_restart(char *__unused) > +{ > + /* > +* Clean and disable cache, and turn off interrupts > +*/ > + cpu_proc_fin(); > + > + /* > +* Tell the mm system that we are going to reboot - > +* we may need it to insert some 1:1 mappings so that > +* soft boot works. > +*/ > + setup_mm_for_reboot(reboot_mode); > + > + /* > +* Now call the architecture specific reboot code. > +*/ > + arch_reset(reboot_mode); > + > + /* > +* Whoops - the architecture was unable to reboot. > +* Tell the user! > +*/ > + mdelay(1000); > + pr_info("Reboot failed -- System halted\n"); > + while (1) ; > +} You should insert a call to do_kernel_restart() in this function to allow e.g. a watchdog driver to provide a machine-specific reboot method. Otherwise the patch looks good to me, Acked-by: Arnd Bergmann
Re: [PATCH v6 13/36] nds32: IRQ handling
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch includes irq related functions and irqchip_init(). > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 14/36] nds32: Atomic operations
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch includes the atomic and futex operations. Many atomic operations > use > the load-lock word(llw) and store-condition word(scw) operations. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 15/36] nds32: Device specific operations
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch introduces ioremap implementations. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 16/36] nds32: DMA mapping API
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds support for the DMA mapping API. It uses dma_map_ops for > flexibility. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu I'm still unhappy about the way the cache flushes are done here as discussed before. It's not a show-stopped, but no Ack from me. Arnd
Re: [PATCH net-next v5 0/4] dwmac-meson8b: clock fixes for Meson8b
HI Jerome On 01/17/18 01:19, Jerome Brunet wrote: > On Tue, 2018-01-16 at 12:17 +0100, Martin Blumenstingl wrote: Hi Martin I'm having problem with this series applied. I've tested on the A113D (AXG) platform, if this patch is applied, the driver will choose MPLL2 as clk source, and seems it doesn't work out with this clk (fail to get IP) grep mpll2 /sys/kernel/debug/clk/clk_summary mpll2 11 24449 0 0 >>> >>> This is because the fixed pll is 199200 on the axg instead of 2Ghz. >> >> is there a chance that we can have a public AXG datasheet, similar to >> what Hardkernel and Khadas published for the S805, S905, S905X and >> S912? >> this would give us community developers a chance to know that we are >> going to break something on AXG *before* a patch is published (like in >> this case :( ) >> >>> As a consequence fdiv4 is 49800. > > Quick update > I have just been able measure fdiv4 and it is exactly 500Mhz > This means the the fixed_pll is actually 2GHz, as we would expect. > yes, you right here. the output of fixed pll is actually 2GHz (1999.998MHz) > This also means that calculation of the fixed_pll is slightly off on the axg > platform. > the calculation algorithm should be updated because previous one didn't take the 'frac' part into account. > Once the clock calculation is done correctly on axg, there should be no > problem. > there is still a problem, current calculation will still result at fixed_pll frequency - 1999,000,000, I would expect it 2000,000,000 (to round it to closest? or just 1999,998,000) >>> >>> CCF can reach something closer to 250Mhz with the mpll2. >>> >>> The easy way to fix this would be to make the inputs of ethernet mux >>> optional >>> and not provide the MPLL2 on the axg. >>> >>> However, this raises a few question on the axg platform : >>> 1) Why is the root pll 1992Mhz and not 2GHz ? seems a weird choice just for >>> 4MHz >>> This 1992Mhz does not seems to help generate audio freqs anymore than 2GHz >>> would. I don't really get this choice. Could you help us understand Yixun ? >>> >>> 2) Why is ethernet not working with mpll2 on axg ? espcially since we have >>> proven the rate be correct on meson8 ... is there any change on this >>> platform we >>> don't know about ? >> >> I tried to force my Khadas VIM2 (GXM) into using MPLL2 by changing >> meson-gxl.dtsi: >> clocks = <&clkc CLKID_ETH>, >> -<&clkc CLKID_FCLK_DIV2>, >> +<&xtal>, >> <&clkc CLKID_MPLL2>; >> >> the resulting clock tree looks fine: >> fixed_pll 440 20 >> 0 0 >> mpll2_div 110 25000 >>0 0 >> mpll2 110 25000 >>0 0 >> c941.ethernet#m250_sel 110 >> 25000 0 0 >>c941.ethernet#m250_div 110 >> 25000 0 0 >> c941.ethernet#fixed_div2 11 >> 0 12500 0 0 >> c941.ethernet#rgmii_tx_en 11 >> 0 12500 0 0 >> >> however, Ethernet is broken (I can't ping anything) >> >> @Yixun: according to all public datasheets bit 4 is reserved >> however, Mike and Kevin were told that bit 4 controls a mux between >> FCLK_DIV2 and MPLL2 >> could you please clarify the meaning of this bit 4 on: >> - Meson8b >> - Meson8m2 (a confirmation that it uses the same Ethernet registers >> with the same bits/meanings of these as Meson8b would be enough) >> - GXBB / GXL / GXM (assuming that these are using an identical IP for >> PRG_ETH0) >> - AXG > > Indeed, if bit 4 stands for something else, or if some combination are known > to > fail, it would be nice to know. > the bit 4 is acutally a mux which used to choose two clock sources: a) fclk_div2 b) MPLL2 (I couldn't check all the SoC generation, but a least I know..) >>From my point of view, the problem reported by yixun is with the clock > controller, not this series, which I think should be merged. > feedback from ASIC team, they think the MPLL2 clock source should also work.. I don't know why it's broken here)-( > Still, we should clockin of the mac optional so don't have to provide the > inputs > known to fail, if any. > >> >> >> Regards >> Martin > > . >
Re: [PATCH v6 17/36] nds32: ELF definitions
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds definitions for the ELF format, relocation types, vdso > locations and EXEC_PAGESIZE. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 18/36] nds32: System calls handling
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds support for system calls. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu You seem to have finally addressed all my previous concerns, looks good. Reviewed-by: Arnd Bergmann
Re: [PATCH v6 19/36] nds32: VDSO support
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds VDSO support. The VDSO code is currently used for > sys_rt_sigreturn() and optimised gettimeofday() (using the SoC timer counter). > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 20/36] nds32: Signal handling support
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds support for signal handling. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu I never feel qualified enough to properly review signal handling code, so no Ack from me for this code even though I don't see anything wrong with it. Hopefully someone else can give an Ack after looking more closely. Arnd
Re: [PATCH v6 21/36] nds32: Library functions
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch add support for various library functions. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
[PATCH iproute2] tc/lexer: let quotes actually start strings
The lexer will go with the longest match, so previously the starting double quotes of a string would be swallowed by the [^ \t\r\n()]+ pattern leaving the user no way to actually use strings with escape sequences. Fix this by not allowing this case to start with double quotes. Signed-off-by: Wolfgang Bumiller --- tc/emp_ematch.l | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tc/emp_ematch.l b/tc/emp_ematch.l index dc106759..d7a99304 100644 --- a/tc/emp_ematch.l +++ b/tc/emp_ematch.l @@ -137,7 +137,7 @@ ")"{ return yylval.i = *yytext; } -[^ \t\r\n()]+ { +[^" \t\r\n()][^ \t\r\n()]* { yylval.b = bstr_alloc(yytext); if (yylval.b == NULL) return ERROR; -- 2.11.0
[PATCH net 2/2] net_sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr
TCF_LAYER_LINK and TCF_LAYER_NETWORK returned the same pointer as skb->data points to the network header. Use skb_mac_header instead. Signed-off-by: Wolfgang Bumiller --- Alternatively this could return skb->head directly, but 'sk_buff->mac_header' is documented as 'Link layer header' and this seemed more clear. Since on the first read I thought "it looks fine" while in fact skb->head comes before skb->data, so this seems less confusing. include/net/pkt_cls.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index 8e08b6da72f3..753ac9361154 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -522,7 +522,7 @@ static inline unsigned char * tcf_get_base_ptr(struct sk_buff *skb, int layer) { switch (layer) { case TCF_LAYER_LINK: - return skb->data; + return skb_mac_header(skb); case TCF_LAYER_NETWORK: return skb_network_header(skb); case TCF_LAYER_TRANSPORT: -- 2.11.0
[PATCH net 1/2] net: sched: em_nbyte: don't add the data offset twice
'ptr' is shifted by the offset and then validated, the memcmp should not add it a second time. Signed-off-by: Wolfgang Bumiller --- net/sched/em_nbyte.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sched/em_nbyte.c b/net/sched/em_nbyte.c index df3110d69585..07c10bac06a0 100644 --- a/net/sched/em_nbyte.c +++ b/net/sched/em_nbyte.c @@ -47,15 +47,15 @@ static int em_nbyte_match(struct sk_buff *skb, struct tcf_ematch *em, unsigned char *ptr = tcf_get_base_ptr(skb, nbyte->hdr.layer); ptr += nbyte->hdr.off; if (!tcf_valid_offset(skb, ptr, nbyte->hdr.len)) return 0; - return !memcmp(ptr + nbyte->hdr.off, nbyte->pattern, nbyte->hdr.len); + return !memcmp(ptr, nbyte->pattern, nbyte->hdr.len); } static struct tcf_ematch_ops em_nbyte_ops = { .kind = TCF_EM_NBYTE, .change = em_nbyte_change, .match= em_nbyte_match, .owner= THIS_MODULE, -- 2.11.0
[PATCH net+iproute2 0/2] nbyte, cmp and text filter fixups
The iproute2 part allows the the actual use of the already existing quoted string parsing. The kernel side fixes an oob read in em_nbyte and allows 'layer 0' in cmp and nbyte (and em_text whose existence surprised me given that I did not see it exposed via iproute2) to actually match layer 0 rather than being the same as specifying layer 1. I seem to have stumbled upon a layer of dust (says git-blame). Trying to match mac addresses I felt that the examples found online using the 'u32' filter were rather inconvenient, particularly given that there's the 'nbyte' filter around that could just memcmp the entire a byte sequence at once. Wolfgang Bumiller (1; 2): tc/lexer: let quotes actually start strings tc/emp_ematch.l | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) net: sched: em_nbyte: don't add the data offset twice net_sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr include/net/pkt_cls.h | 2 +- net/sched/em_nbyte.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) -- 2.11.0
Re: [PATCH v6 22/36] nds32: Debugging support
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds ptrace support. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu I must have missed this patch earlier, unfortunately I don't think this is ready: > +long arch_ptrace(struct task_struct *child, long request, unsigned long addr, > +unsigned long data) > +{ > + int ret; > + > + switch (request) { > + case PTRACE_PEEKUSR: > + ret = > + ptrace_read_user(child, addr, (unsigned long __user > *)data); > + break; > + > + case PTRACE_POKEUSR: > + ret = ptrace_write_user(child, addr, data); > + break; > + > + case PTRACE_GETREGS: > + ret = ptrace_getregs(child, (void __user *)data); > + break; > + > + case PTRACE_SETREGS: > + ret = ptrace_setregs(child, (void __user *)data); > + break; > + > + case PTRACE_GETFPREGS: > + ret = ptrace_getfpregs(child, (void __user *)data); > + break; > + > + case PTRACE_SETFPREGS: > + ret = ptrace_setfpregs(child, (void __user *)data); > + break; > + > + default: > + ret = ptrace_request(child, request, addr, data); > + break; > + } > + > + return ret; > +} It appears that you are implementing the old-style ptrace handling with architecture specific commands. Please have a look at how this is done in risc-v or arm64. If this takes more too much time to address, I'd suggest using an empty stub function for sys_ptrace and adding it back at a later point, but not send the current version upstream. Arnd
Re: [PATCH v6 23/36] nds32: L2 cache support
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds L2 cache support. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 24/36] nds32: Loadable modules
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds support for loadable modules. One detail: You still seem to have both the ELF_REL and ELF_RELA based functions implemented here, you should drop the unused ELF_REL version: > diff --git a/arch/nds32/kernel/module.c b/arch/nds32/kernel/module.c > new file mode 100644 > index 000..714a6d6 > --- /dev/null > +++ b/arch/nds32/kernel/module.c > @@ -0,0 +1,286 @@ > +// SPDX-License-Identifier: GPL-2.0 > +// Copyright (C) 2005-2017 Andes Technology Corporation > + > +#include > +#include > +#include > + > +#include include to catch this. > +int > +apply_relocate(Elf32_Shdr * sechdrs, const char *strtab, > + unsigned int symindex, unsigned int relsec, > + struct module *module) > +{ > + return 0; > +} and drop this. With that change, Acked-by: Arnd Bergmann
Re: [PATCH v6 25/36] nds32: Generic timers support
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds support for timer. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu > Reviewed-by: Linus Walleij Acked-by: Arnd Bergmann
Re: [PATCH v6 26/36] nds32: Device tree support
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds support for device tree. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 28/36] nds32: defconfig
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds nds32 defconfig. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH v6 30/36] MAINTAINERS: Add nds32
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > Signed-off-by: Greentime Hu Please add a changelog text to every single commit, otherwise: Acked-by: Arnd Bergmann
Re: [PATCH v6 32/36] dt-bindings: nds32 L2 cache controller Bindings
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds nds32 L2 cache controller binding documents. > > Signed-off-by: Greentime Hu > Reviewed-by: Rob Herring Acked-by: Arnd Bergmann
Re: [PATCH v6 34/36] dt-bindings: interrupt-controller: Andestech Internal Vector Interrupt Controller
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds an irqchip driver document for the Andestech Internal Vector > Interrupt Controller. > > Signed-off-by: Rick Chen > Signed-off-by: Greentime Hu > Reviewed-by: Rob Herring Acked-by: Arnd Bergmann
Re: [PATCH v6 27/36] nds32: Miscellaneous header files
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch introduces some miscellaneous header files. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: dangers of bots on the mailing lists was Re: divide error in ___bpf_prog_run
On Wed, Jan 17, 2018 at 11:11 AM, Henrique de Moraes Holschuh wrote: > On Wed, 17 Jan 2018, Dmitry Vyukov wrote: >> On Wed, Jan 17, 2018 at 10:32 AM, Pavel Machek wrote: >> > On Fri 2018-01-12 17:58:01, syzbot wrote: >> >> syzkaller hit the following crash on >> >> 19d28fbd306e7ae7c1acf05c3e6968b56f0d196b >> > >> > What an useful way to describe kernel version. >> > >> > Could we get reasonable subject line? 4.15-rc7: prefix would be nice >> > if it is on mainline, >> >> Yes, I guess. I am all for useful improvements. >> What exactly is reasonable subject line? And how it can be extracted >> for an arbitrary kernel tree? > > It can't, I guess. But maybe you could extract it from syzbot > information about the context of that patch? > > Maybe tagging it with the git tree you fetched when getting it over git, > and mail from[1]+subject+message-id when getting it over email? > > [1] processing of related headers to handle mailing lists and > retransmits is required, e.g. ressent-*, etc. But this is relatively > easy to do as well. > > A map to generate subject prefixes from key git trees or MLs could > enhance that even further, to get at least mainline:, *-next:, etc. Hi Henrique, Re report format. Ted also provided some useful feedback here: https://groups.google.com/d/msg/syzkaller/5hjgr2v_oww/fn5QW6dvDQAJ I've made a bunch of changes yesterday and today. This includes rearranging lines in the email, rearranging attachment order, removing some clutter, providing short repo alias (upstream, linux-next, net, etc), providing commit date and title. syzbot will not also prefer to report crashes on upstream tree, rather than on other trees. Re subject line, I don't think prefixing subject with tree will work. What you see as a single crash actually represents from tens to tens of thousands crashes on some set of trees. And that set grows over time. That can be one set of trees when the bug is first reported, and then another subset of trees when a reproducer is found. It's obviously a bad idea to send a email per crash (every few seconds), and even per crash/tree. To alleviate this, syzbot will now say e.g. "So far this crash happened 185 times on linux-next, mmots, net-next, upstream". So that you can see that it's not only, say, linux-next problem. syzbot just mailed another report with all of these changes which you can see here: https://groups.google.com/forum/#!msg/syzkaller-bugs/u5nq3PdPkIc/F4tXzErxAgAJ Thanks
Re: [PATCH net-next] macvlan: Pass SIOC[SG]HWTSTAMP ioctls and get_ts_info to lower device
On Thu, 18 Jan 2018 02:12:34 +0100, grzegorz.ha...@gmail.com wrote: > This patch allows to enable hardware timestamping on macvlan intefaces and > find out capabilities of the lower device. > > Signed-off-by: Grzegorz Halat NACK. This does not work. For start, how do you deal with fwd_priv? When a packet is sent to other software ports, it wouldn't be time stamped. And I expect more cases like this to be there in macvlan, I only spent 10 seconds checking it. Please study how time stamping in the kernel works. A good start is Documentation/networking/timestamping.txt. Then examine all possible packet paths with macvlan, both egress and ingress. > +static int macvlan_ethtool_get_ts_info(struct net_device *dev, > + struct ethtool_ts_info *ts_info) > +{ > + const struct macvlan_dev *vlan = netdev_priv(dev); > + const struct ethtool_ops *eth_ops = vlan->lowerdev->ethtool_ops; > + > + if (eth_ops->get_ts_info) > + return eth_ops->get_ts_info(vlan->lowerdev, ts_info); > + > + ts_info->so_timestamping = SOF_TIMESTAMPING_RX_SOFTWARE | > +SOF_TIMESTAMPING_SOFTWARE; > + ts_info->phc_index = -1; What calls skb_tx_timestamp if the driver does not support it? Jiri
Re: [PATCH net+iproute2 0/2] nbyte, cmp and text filter fixups
Thu, Jan 18, 2018 at 11:32:33AM CET, w.bumil...@proxmox.com wrote: >The iproute2 part allows the the actual use of the already existing >quoted string parsing. Makes no sense to send iproute2 patch which is not related with the kernel ones in the same set. Please send it separatelly. > >The kernel side fixes an oob read in em_nbyte and allows 'layer 0' in >cmp and nbyte (and em_text whose existence surprised me given that I did >not see it exposed via iproute2) to actually match layer 0 rather than >being the same as specifying layer 1. > >I seem to have stumbled upon a layer of dust (says git-blame). >Trying to match mac addresses I felt that the examples found online >using the 'u32' filter were rather inconvenient, particularly given >that there's the 'nbyte' filter around that could just memcmp the >entire a byte sequence at once. > >Wolfgang Bumiller (1; 2): > tc/lexer: let quotes actually start strings > > tc/emp_ematch.l | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > net: sched: em_nbyte: don't add the data offset twice > net_sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr "net: sched:" or "net_sched:"? - please, try to be consistent
Re: [PATCH v6 29/36] nds32: Build infrastructure
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds Makefile, Kconfig and vmlinux.lds.S files required for > building > an nds32 kernel. > > Signed-off-by: Vincent Chen > Signed-off-by: Greentime Hu I find some new details every time I look here ;-) > @@ -0,0 +1,107 @@ > +# > +# For a description of the syntax of this configuration file, > +# see Documentation/kbuild/kconfig-language.txt. > +# > + > +config NDS32 > +def_bool y > + select ARCH_HAS_RAW_COPY_USER I don't think this symbol was ever merged. Do you remember why you added it? > + select ARCH_WANT_FRAME_POINTERS if FTRACE > + select ARCH_WANT_IPC_PARSE_VERSION You most certainly don't want IPC_PARSE_VERSION, please drop this and adapt your glibc. > + select CLKSRC_MMIO > + select CLONE_BACKWARDS > + select COMMON_CLK > + select FRAME_POINTER Do you need both ARCH_WANT_FRAME_POINTERS and FRAME_POINTER here? > + select GENERIC_ATOMIC64 > + select GENERIC_CPU_DEVICES > + select GENERIC_CLOCKEVENTS > + select GENERIC_IRQ_CHIP > + select GENERIC_IRQ_PROBE I think it's better to drop GENERIC_IRQ_PROBE here, no modern driver should rely on that. > +choice > + prompt "CPU type" > + default CPU_V3 > +config CPU_N15 > + bool "AndesCore N15" > +config CPU_N13 > + bool "AndesCore N13" > + select CPU_CACHE_ALIASING if ANDES_PAGE_SIZE_4KB > +config CPU_N10 > + bool "AndesCore N10" > + select CPU_CACHE_ALIASING > +config CPU_D15 > + bool "AndesCore D15" > +config CPU_D10 > + bool "AndesCore D10" > + select CPU_CACHE_ALIASING > +config CPU_V3 > + bool "AndesCore v3 compatible" > + select ANDES_PAGE_SIZE_8KB > +endchoice I forget what we discussed here earlier, but at the very least, there should be some help text here to explain what the implications are. I assume that you generally want to be able to build one kernel to run on all of the above, right? Will selecting 'CPU_V3' result in a kernel binary that can run on all of them? If so, please explain it here as that is not obvious. For the other CPU types, can you list the what backwards-compatiblity you get? E.g. will a kernel built for N13 run on any of N15, D15 or N10? I think the 'select ANDES_PAGE_SIZE_8KB' cannot work as expected, since ANDES_PAGE_SIZE_8KB is inside of a 'choice' statement. Since there are only two options (4K and 8K), you can address that by making it a simple bool option and fall back to 4K when ANDES_PAGE_SIZE_8KB is disabled. > +config CACHE_L2 > + bool "Support L2 cache" > +default y > + help > + Say Y here to enable L2 cache if your SoC are integrated with L2CC. > + If unsure, say N. > + > +menu "Memory configuration" > + > +choice > + prompt "Memory split" > + depends on MMU > + default VMSPLIT_3G Why not default to VMSPLIT_3G_OPT? Arnd
Re: [PATCH v6 31/36] dt-bindings: nds32 CPU Bindings
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds nds32 CPU binding documents. > > Signed-off-by: Vincent Chen > Signed-off-by: Rick Chen > Signed-off-by: Zong Li > Signed-off-by: Greentime Hu > Reviewed-by: Rob Herring > --- > Documentation/devicetree/bindings/nds32/cpus.txt | 37 > ++ > 1 file changed, 37 insertions(+) > create mode 100644 Documentation/devicetree/bindings/nds32/cpus.txt > > diff --git a/Documentation/devicetree/bindings/nds32/cpus.txt > b/Documentation/devicetree/bindings/nds32/cpus.txt > new file mode 100644 > index 000..9a52937 > --- /dev/null > +++ b/Documentation/devicetree/bindings/nds32/cpus.txt > @@ -0,0 +1,37 @@ > +* Andestech Processor Binding > + > +This binding specifies what properties must be available in the device tree > +representation of a Andestech Processor Core, which is the root node in the > +tree. > + > +Required properties: > + > + - compatible: > + Usage: required > + Value type: > + Definition: should be one of: > + "andestech,n13" > + "andestech,n15" > + "andestech,d15" > + "andestech,n10" > + "andestech,d10" > + "andestech,nds32v3" Based on https://lkml.org/lkml/2017/11/27/1290, this should say that the device tree should always list 'andestech,nds32v3' as the most generic 'compatible' value and list exactly one of the others in addition. Arnd
Re: [PATCH v6 33/36] dt-bindings: nds32 SoC Bindings
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch adds nds32 SoC(AE3XX and AG101P) binding documents. > > Signed-off-by: Greentime Hu > Reviewed-by: Rob Herring Acked-by: Arnd Bergmann
Re: [PATCH v6 36/36] net: faraday add nds32 support.
On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote: > From: Greentime Hu > > This patch is used to support nds32 architecture to use these faraday > mac IP. > > Signed-off-by: Greentime Hu Acked-by: Arnd Bergmann
Re: [PATCH RFC net-next 1/6] sock: MSG_PEEK support for sk_error_queue
On (01/17/18 18:50), Willem de Bruijn wrote: > > This can cause reordering with parallel readers. Can we avoid the need > for peeking? It also caused a slew of subtle bugs previously. Yes, I did notice the potential for re-ordering when writing the patch.. but these are not actuallly messages from the wire, so is re-ordering fatal? In general, I"m not particularly attached to this solution- in my testing, I'm seeing that it's possible to reduce the latency and still take a hit on the throughput if the application does not reap the completion notifciation (and send out new data) efficiently Some (radically differnt) alternatives that were suggested to me - send up all the cookies as ancillary data with recvmsg (i.e., send it as a cmsgdata along with actual data from the wire). In most cases, the application has data to read, anyway. If it doesnt (pure sender), we could wake up recvmsg with 0 bytes of data, but with the cookie info in the ancillary data. This feels not-so-elegant to me, but I suppose it would have the benefit of optimizing on the syscall overhead.. (and you could use MSG_CTRUNC to handle the case of insuufficient bufffer for cookies, sending the rest on the next call).. - allow application to use a setsockopt on the rds socket, with some shmem region, into which the kernel could write the cookies, Let application reap cookies without syscall overhead from that shmem region.. > How about just define a max number of cookies and require the caller > to always read with sufficient room to hold them? This may be "good enough" as well, maybe allow a max of (say) 16 cookies, and set up the skb's in the error queue to send up batches of 16 cookies at a time? --Sowmini
Re: [PATCH v6 3/3] dt-bindings: timer: Add andestech atcpit100 timer binding doc
On Mon, Jan 15, 2018 at 6:57 AM, Greentime Hu wrote: > From: Rick Chen > > Add a document to describe Andestech atcpit100 timer and > binding information. > > Signed-off-by: Rick Chen > Signed-off-by: Greentime Hu > Acked-by: Rob Herring Acked-by: Arnd Bergmann
Re: [PATCH v6 2/3] clocksource/drivers/atcpit100: VDSO support
On Mon, Jan 15, 2018 at 6:57 AM, Greentime Hu wrote: > From: Rick Chen > > VDSO needs real-time cycle count to ensure the time accuracy. > Unlike others, nds32 architecture does not define clock source, > hence VDSO needs atcpit100 offering real-time cycle count > to derive the correct time. > > Signed-off-by: Vincent Chen > Signed-off-by: Rick Chen > Signed-off-by: Greentime Hu I'm a bit puzzled by this patch, can you explain how the vdso actually manages to access the clock hardware? It looks like you make the physical address and the register offset available to user space, but how does it end up accessing it? Arnd
Re: [PATCH RFC net-next 5/6] rds: support for zcopy completion notification
On (01/17/18 19:23), Willem de Bruijn wrote: > > +static void rds_rm_zerocopy_callback(struct rds_sock *rs) : > > + skb = alloc_skb(ncookies * sizeof(u32), GFP_ATOMIC); > > TCP zerocopy avoids this issue by allocating the notification skb when the > zerocopy packet is created, which would be rds_message_copy_from_user. right, I could allocate the skb when we set up the zcopy data strucutres. > This does not add an allocation, if using the same trick of stashing > the intermediate notification object (op_mmp_znotifier) in skb->cb. Though, > alloc_skb is probably more expensive than that kzalloc. If nothing else, > because of more zeroing. I would have liked to reuse skb->cb, but had to fall back to the alloc_skb because of the attempt to fall back to flexibly numbered (and sized) cookie notifications. If we choose the "max number of cookies" option discussed in the previous thread, I think I should be able to do this with the existing 48 byte sized cb and pack in 8 32 bit cookies after the sock_extended_err.. maybe that's sufficient. > > + serr = SKB_EXT_ERR(skb); > > + serr->ee.ee_errno = 0; > > + serr->ee.ee_origin = SO_EE_ORIGIN_ZEROCOPY; > > + serr->ee.ee_data = ncookies; > > This changes the semantics of these fields. Please add a new SO_EE_CODE flag, > even if the semantics can be derived from the packet family (for now). sounds good, I'll take care of this (and other review comments) for the next rev. > Even better would be if we can avoid the cookies completely. I understand > the issue with concurrent send threads racing on obtaining a zckey value. If > the sender could learn which zckey was chosen for a call, would that suffice? I'm not sure I understand the proposal- you want sendmsg to return the cookie ("zckey") for you? How? even if we mangled sendmsg() to return something other than the existing POSIX semantics, how will the application be asynchronously notified that send has completed (on a per-message/per-datagram) basis? > I suspect that in even with concurrent senders, notifications arrive > largely in > order, in which case we could just maintain the existing semantics and even > reuse that implementation. not so. rds-stress [1] with -d8 -t8 quickly disproves this on my 10G ixgbe connection. When you have multiple threads writing to a socket, you cannot know what was the "order of send", unless you bottleneck all the threads to go through a single-point-of-send. rds-stress is an example of this sort of usage (fairly typical in our applications) [1] http://public-yum.oracle.com/repo/OracleLinux/OL6/ofed_UEK/x86_64//getPackageSource/rds-tools-2.0.7-1.12.el6.src.rpm Consider 2 RDS sockets fd1 and fd2, each one sending to the same pair of IP addresses: if fd1 gets bound to tcp sock1, fd2 to tcp sock2, it's very possible that the send completion is not in the same order as the order of send. The application cannot know which socket gets bound to which TCP connection (or even whether/how-many tcp connections are involved: mprds strives to make this transparent to the application) Same problem exists for pure-datagram sockets like PF_PACKET, UDP etc. application may send buf1 (datagram1) and buf2 (datagram2), and buf2 may make it to the destination before buf1. The "notifications arrive largely in order" may be mostly true about a single-stream TCP connection but not for the datagram models (or even threaded tcp apps like iperf?).. > > + tail = skb_peek_tail(q); > > + if (!tail || > > + SKB_EXT_ERR(tail)->ee.ee_origin != SO_EE_ORIGIN_ZEROCOPY) { > > + __skb_queue_tail(q, skb); > > + skb = NULL; > > + } > > This drops the packet if the branch is not taken. In the TCP case > this condition > means that we can try to coalesce packets, but that is not the case here. good point, I'll check into this and fix (same applies for the comments to patches 4/6 and 6/6) > > } rdma; > > struct rm_data_op { > > unsigned intop_active:1; > > - unsigned intop_notify:1; > > + unsigned intop_notify:1, > > + op_zcopy:1, : > > + struct rds_znotifier*op_mmp_znotifier; > > not necessary if op_mmp_znotifier is NULL unless set in > rds_message_copy_from_user To make sure I dont misunderstand, you are suggesting that we dont need op_zcopy, but can just check for the null-ness of op_mmp_znotifier (yes, true, I agree)? or something else? --Sowmini
[patch net-next] mlxsw: spectrum: Upper-bound supported FW version
From: Yuval Mintz During initialization the driver checks whether the flashed FW image suits its requirements by checking that it's sufficiently new. However, there's only a weak backward compatibility scheme that is actually guaranteed by the FW, so driver must also upper bound the version to prevent compatibility issues between current driver and some possible future fw. Signed-off-by: Yuval Mintz Reviewed-by: Ido Schimmel Signed-off-by: Jiri Pirko --- drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 30 ++ 1 file changed, 11 insertions(+), 19 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c index bbe48917dcad..7e2b552c2237 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c @@ -76,12 +76,7 @@ #define MLXSW_FWREV_MAJOR 13 #define MLXSW_FWREV_MINOR 1530 #define MLXSW_FWREV_SUBMINOR 152 - -static const struct mlxsw_fw_rev mlxsw_sp_supported_fw_rev = { - .major = MLXSW_FWREV_MAJOR, - .minor = MLXSW_FWREV_MINOR, - .subminor = MLXSW_FWREV_SUBMINOR -}; +#define MLXSW_FWREV_MINOR_TO_BRANCH(minor) ((minor) / 100) #define MLXSW_SP_FW_FILENAME \ "mellanox/mlxsw_spectrum-" __stringify(MLXSW_FWREV_MAJOR) \ @@ -339,28 +334,25 @@ static int mlxsw_sp_firmware_flash(struct mlxsw_sp *mlxsw_sp, return mlxfw_firmware_flash(&mlxsw_sp_mlxfw_dev.mlxfw_dev, firmware); } -static bool mlxsw_sp_fw_rev_ge(const struct mlxsw_fw_rev *a, - const struct mlxsw_fw_rev *b) -{ - if (a->major != b->major) - return a->major > b->major; - if (a->minor != b->minor) - return a->minor > b->minor; - return a->subminor >= b->subminor; -} - static int mlxsw_sp_fw_rev_validate(struct mlxsw_sp *mlxsw_sp) { const struct mlxsw_fw_rev *rev = &mlxsw_sp->bus_info->fw_rev; const struct firmware *firmware; int err; - if (mlxsw_sp_fw_rev_ge(rev, &mlxsw_sp_supported_fw_rev)) + /* Validate driver & FW are compatible */ + if (rev->major != MLXSW_FWREV_MAJOR) { + WARN(1, "Mismatch in major FW version [%d:%d] is never expected; Please contact support\n", +rev->major, MLXSW_FWREV_MAJOR); + return -EINVAL; + } + if (MLXSW_FWREV_MINOR_TO_BRANCH(rev->minor) == + MLXSW_FWREV_MINOR_TO_BRANCH(MLXSW_FWREV_MINOR)) return 0; - dev_info(mlxsw_sp->bus_info->dev, "The firmware version %d.%d.%d is out of date\n", + dev_info(mlxsw_sp->bus_info->dev, "The firmware version %d.%d.%d is incompatible with the driver\n", rev->major, rev->minor, rev->subminor); - dev_info(mlxsw_sp->bus_info->dev, "Upgrading firmware using file %s\n", + dev_info(mlxsw_sp->bus_info->dev, "Flashing firmware using file %s\n", MLXSW_SP_FW_FILENAME); err = request_firmware_direct(&firmware, MLXSW_SP_FW_FILENAME, -- 2.14.3
Re: [PATCH 1/3 net] ibmvnic: Modify buffer size on failover
Hi John, I love your patch! Yet something to improve: [auto build test ERROR on net/master] url: https://github.com/0day-ci/linux/commits/John-Allen/ibmvnic-Modify-buffer-size-on-failover/20180118-190528 config: powerpc-allyesconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=powerpc All errors (new ones prefixed by >>): drivers/net//ethernet/ibm/ibmvnic.c: In function 'reset_rx_pools': >> drivers/net//ethernet/ibm/ibmvnic.c:414:2: error: 'size_array' undeclared >> (first use in this function); did you mean 'sem_array'? size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + ^~ sem_array drivers/net//ethernet/ibm/ibmvnic.c:414:2: note: each undeclared identifier is reported only once for each function it appears in vim +414 drivers/net//ethernet/ibm/ibmvnic.c 407 408 static int reset_rx_pools(struct ibmvnic_adapter *adapter) 409 { 410 struct ibmvnic_rx_pool *rx_pool; 411 int rx_scrqs; 412 int i, j, rc; 413 > 414 size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + 415 be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size)); 416 417 rx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_rxadd_subcrqs); 418 for (i = 0; i < rx_scrqs; i++) { 419 rx_pool = &adapter->rx_pool[i]; 420 421 netdev_dbg(adapter->netdev, "Re-setting rx_pool[%d]\n", i); 422 423 if (rx_pool->buff_size != be64_to_cpu(size_array[i])) { 424 rx_pool->buff_size = be64_to_cpu(size_array[i]); 425 rc = alloc_long_term_buff(adapter, 426 &rx_pool->long_term_buff, 427rx_pool->size * 428rx_pool->buff_size); 429 } else { 430 rc = reset_long_term_buff(adapter, 431 &rx_pool->long_term_buff); 432 } 433 if (rc) 434 return rc; 435 436 for (j = 0; j < rx_pool->size; j++) 437 rx_pool->free_map[j] = j; 438 439 memset(rx_pool->rx_buff, 0, 440 rx_pool->size * sizeof(struct ibmvnic_rx_buff)); 441 442 atomic_set(&rx_pool->available, 0); 443 rx_pool->next_alloc = 0; 444 rx_pool->next_free = 0; 445 rx_pool->active = 1; 446 } 447 448 return 0; 449 } 450 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
[net-next: PATCH v4 6/7] net: mvpp2: use device_*/fwnode_* APIs instead of of_*
OF functions can be used only for the driver using DT. As a preparation for introducing ACPI support in mvpp2 driver, use struct fwnode_handle in order to obtain properties from the hardware description. This patch replaces of_* function with device_*/fwnode_* where possible in the mvpp2. Signed-off-by: Marcin Wojtas --- drivers/net/ethernet/marvell/mvpp2.c | 45 +++- 1 file changed, 24 insertions(+), 21 deletions(-) diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c index 7f42d90..f16448e 100644 --- a/drivers/net/ethernet/marvell/mvpp2.c +++ b/drivers/net/ethernet/marvell/mvpp2.c @@ -932,6 +932,9 @@ struct mvpp2_port { struct mvpp2 *priv; + /* Firmware node associated to the port */ + struct fwnode_handle *fwnode; + /* Per-port registers' base address */ void __iomem *base; void __iomem *stats_base; @@ -7711,17 +7714,16 @@ static bool mvpp2_port_has_tx_irqs(struct mvpp2 *priv, } static void mvpp2_port_copy_mac_addr(struct net_device *dev, struct mvpp2 *priv, -struct device_node *port_node, +struct fwnode_handle *fwnode, char **mac_from) { struct mvpp2_port *port = netdev_priv(dev); char hw_mac_addr[ETH_ALEN] = {0}; - const char *dt_mac_addr; + char fw_mac_addr[ETH_ALEN]; - dt_mac_addr = of_get_mac_address(port_node); - if (dt_mac_addr && is_valid_ether_addr(dt_mac_addr)) { - *mac_from = "device tree"; - ether_addr_copy(dev->dev_addr, dt_mac_addr); + if (fwnode_get_mac_address(fwnode, fw_mac_addr, ETH_ALEN)) { + *mac_from = "firmware node"; + ether_addr_copy(dev->dev_addr, fw_mac_addr); return; } @@ -7740,13 +7742,14 @@ static void mvpp2_port_copy_mac_addr(struct net_device *dev, struct mvpp2 *priv, /* Ports initialization */ static int mvpp2_port_probe(struct platform_device *pdev, - struct device_node *port_node, + struct fwnode_handle *port_fwnode, struct mvpp2 *priv) { struct device_node *phy_node; struct phy *comphy; struct mvpp2_port *port; struct mvpp2_port_pcpu *port_pcpu; + struct device_node *port_node = to_of_node(port_fwnode); struct net_device *dev; struct resource *res; char *mac_from = ""; @@ -7773,7 +7776,7 @@ static int mvpp2_port_probe(struct platform_device *pdev, return -ENOMEM; phy_node = of_parse_phandle(port_node, "phy", 0); - phy_mode = of_get_phy_mode(port_node); + phy_mode = fwnode_get_phy_mode(port_fwnode); if (phy_mode < 0) { dev_err(&pdev->dev, "incorrect phy mode\n"); err = phy_mode; @@ -7789,7 +7792,7 @@ static int mvpp2_port_probe(struct platform_device *pdev, comphy = NULL; } - if (of_property_read_u32(port_node, "port-id", &id)) { + if (fwnode_property_read_u32(port_fwnode, "port-id", &id)) { err = -EINVAL; dev_err(&pdev->dev, "missing port-id value\n"); goto err_free_netdev; @@ -7820,7 +7823,7 @@ static int mvpp2_port_probe(struct platform_device *pdev, /* the link irq is optional */ port->link_irq = 0; - if (of_property_read_bool(port_node, "marvell,loopback")) + if (fwnode_property_read_bool(port_fwnode, "marvell,loopback")) port->flags |= MVPP2_F_LOOPBACK; port->id = id; @@ -7845,8 +7848,8 @@ static int mvpp2_port_probe(struct platform_device *pdev, MVPP21_MIB_COUNTERS_OFFSET + port->gop_id * MVPP21_MIB_COUNTERS_PORT_SZ; } else { - if (of_property_read_u32(port_node, "gop-port-id", -&port->gop_id)) { + if (fwnode_property_read_u32(port_fwnode, "gop-port-id", +&port->gop_id)) { err = -EINVAL; dev_err(&pdev->dev, "missing gop-port-id value\n"); goto err_deinit_qvecs; @@ -7876,7 +7879,7 @@ static int mvpp2_port_probe(struct platform_device *pdev, mutex_init(&port->gather_stats_lock); INIT_DELAYED_WORK(&port->stats_work, mvpp2_gather_hw_statistics); - mvpp2_port_copy_mac_addr(dev, priv, port_node, &mac_from); + mvpp2_port_copy_mac_addr(dev, priv, port_fwnode, &mac_from); port->tx_ring_size = MVPP2_MAX_TXD_DFLT; port->rx_ring_size = MVPP2_MAX_RXD_DFLT; @@ -8194,8 +8197,8 @@ static int mvpp2_init(struct platform_device *pdev, struct mvpp2 *priv) static int mvpp2_probe(struct platform_device *pdev) { - struct device_node *dn = pdev->dev.o
[net-next: PATCH v4 7/7] net: mvpp2: enable ACPI support in the driver
This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas --- drivers/net/ethernet/marvell/mvpp2.c | 133 ++-- 1 file changed, 94 insertions(+), 39 deletions(-) diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c index f16448e..a1d7b88 100644 --- a/drivers/net/ethernet/marvell/mvpp2.c +++ b/drivers/net/ethernet/marvell/mvpp2.c @@ -10,6 +10,7 @@ * warranty of any kind, whether express or implied. */ +#include #include #include #include @@ -7502,7 +7503,10 @@ static int mvpp2_multi_queue_vectors_init(struct mvpp2_port *port, strncpy(irqname, "rx-shared", sizeof(irqname)); } - v->irq = of_irq_get_byname(port_node, irqname); + if (port_node) + v->irq = of_irq_get_byname(port_node, irqname); + else + v->irq = fwnode_irq_get(port->fwnode, i); if (v->irq <= 0) { ret = -EINVAL; goto err; @@ -7746,7 +7750,7 @@ static int mvpp2_port_probe(struct platform_device *pdev, struct mvpp2 *priv) { struct device_node *phy_node; - struct phy *comphy; + struct phy *comphy = NULL; struct mvpp2_port *port; struct mvpp2_port_pcpu *port_pcpu; struct device_node *port_node = to_of_node(port_fwnode); @@ -7760,7 +7764,12 @@ static int mvpp2_port_probe(struct platform_device *pdev, int phy_mode; int err, i, cpu; - has_tx_irqs = mvpp2_port_has_tx_irqs(priv, port_node); + if (port_node) { + has_tx_irqs = mvpp2_port_has_tx_irqs(priv, port_node); + } else { + has_tx_irqs = true; + queue_mode = MVPP2_QDIST_MULTI_MODE; + } if (!has_tx_irqs) queue_mode = MVPP2_QDIST_SINGLE_MODE; @@ -7775,7 +7784,11 @@ static int mvpp2_port_probe(struct platform_device *pdev, if (!dev) return -ENOMEM; - phy_node = of_parse_phandle(port_node, "phy", 0); + if (port_node) + phy_node = of_parse_phandle(port_node, "phy", 0); + else + phy_node = NULL; + phy_mode = fwnode_get_phy_mode(port_fwnode); if (phy_mode < 0) { dev_err(&pdev->dev, "incorrect phy mode\n"); @@ -7783,13 +7796,15 @@ static int mvpp2_port_probe(struct platform_device *pdev, goto err_free_netdev; } - comphy = devm_of_phy_get(&pdev->dev, port_node, NULL); - if (IS_ERR(comphy)) { - if (PTR_ERR(comphy) == -EPROBE_DEFER) { - err = -EPROBE_DEFER; - goto err_free_netdev; + if (port_node) { + comphy = devm_of_phy_get(&pdev->dev, port_node, NULL); + if (IS_ERR(comphy)) { + if (PTR_ERR(comphy) == -EPROBE_DEFER) { + err = -EPROBE_DEFER; + goto err_free_netdev; + } + comphy = NULL; } - comphy = NULL; } if (fwnode_property_read_u32(port_fwnode, "port-id", &id)) { @@ -7805,6 +7820,7 @@ static int mvpp2_port_probe(struct platform_device *pdev, port = netdev_priv(dev); port->dev = dev; + port->fwnode = port_fwnod
[net-next: PATCH v4 0/7] Armada 7k/8k PP2 ACPI support
Hi, I quickly resend the series, thanks to Antoine Tenart's remark, who spotted !CONFIG_ACPI compilation issue after introducing the new fwnode_irq_get() routine. Please see the details in the changelog below and the 3/7 commit log. mvpp2 driver can work with the ACPI representation, as exposed on a public branch: https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/commits/marvell-armada-wip It was compiled together with the most recent Tianocore EDK2 revision. Please refer to the firmware build instruction on MacchiatoBin board: http://wiki.macchiatobin.net/tiki-index.php?page=Build+from+source+-+UEFI+EDK+II ACPI representation of PP2 controllers (withouth PHY support) can be viewed in the github: * MacchiatoBin: https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/blob/71ae395da1661374b0f07d1602afb1eee56e9794/Platforms/Marvell/Armada/AcpiTables/Armada80x0McBin/Dsdt.asl#L201 * Armada 7040 DB: https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/blob/71ae395da1661374b0f07d1602afb1eee56e9794/Platforms/Marvell/Armada/AcpiTables/Armada70x0/Dsdt.asl#L131 I will appreciate any comments or remarks. Best regards, Marcin Changelog: v3 -> v4: * 3/7 - add new macro (ACPI_HANDLE_FWNODE) and fix compilation with !CONFIG_ACPI - extend commit log and mention usability of fwnode_irq_get for the child nodes as well v2 -> v3: * 1/7, 2/7 - Add Rafael's Acked-by's * 3/7, 4/7 - New patches * 6/7, 7/7 - Update driver with new helper routines usage - Improve commit log. v1 -> v2: * Remove MDIO patches * Use PP2 ports only with link interrupts * Release second region resources in mvpp2 driver (code moved from mvmdio), as explained in details in 5/5 commit message. Marcin Wojtas (7): device property: Introduce fwnode_get_mac_address() device property: Introduce fwnode_get_phy_mode() device property: Introduce fwnode_irq_get() device property: Allow iterating over available child fwnodes net: mvpp2: simplify maintaining enabled ports' list net: mvpp2: use device_*/fwnode_* APIs instead of of_* net: mvpp2: enable ACPI support in the driver drivers/base/property.c | 104 -- drivers/net/ethernet/marvell/mvpp2.c | 206 include/linux/acpi.h | 3 + include/linux/property.h | 11 ++ 4 files changed, 232 insertions(+), 92 deletions(-) -- 2.7.4
[net-next: PATCH v4 3/7] device property: Introduce fwnode_irq_get()
Until now there were two very similar functions allowing to get Linux IRQ number from ACPI handle (acpi_irq_get()) and OF node (of_irq_get()). The first one appeared to be used only as a subroutine of platform_irq_get(), which (in the generic code) limited IRQ obtaining from _CRS method only to nodes associated to kernel's struct platform_device. This patch introduces a new helper routine - fwnode_irq_get(), which allows to get the IRQ number directly from the fwnode to be used as common for OF/ACPI worlds. It is usable not only for the parents fwnodes, but also for the child nodes comprising their own _CRS methods with interrupts description. In order to be able o satisfy compilation with !CONFIG_ACPI and also simplify the new code, introduce a helper macro (ACPI_HANDLE_FWNODE), with which it is possible to reach an ACPI handle directly from its fwnode. Signed-off-by: Marcin Wojtas --- drivers/base/property.c | 26 include/linux/acpi.h | 3 +++ include/linux/property.h | 2 ++ 3 files changed, 31 insertions(+) diff --git a/drivers/base/property.c b/drivers/base/property.c index 7c4a53d..1d6c9d9 100644 --- a/drivers/base/property.c +++ b/drivers/base/property.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include @@ -1230,6 +1231,31 @@ void *device_get_mac_address(struct device *dev, char *addr, int alen) EXPORT_SYMBOL(device_get_mac_address); /** + * fwnode_irq_get - Get IRQ directly from a fwnode + * @fwnode:Pointer to the firmware node + * @index: Zero-based index of the IRQ + * + * Returns Linux IRQ number on success. Other values are determined + * accordingly to acpi_/of_ irq_get() operation. + */ +int fwnode_irq_get(struct fwnode_handle *fwnode, unsigned int index) +{ + struct device_node *of_node = to_of_node(fwnode); + struct resource res; + int ret; + + if (IS_ENABLED(CONFIG_OF) && of_node) + return of_irq_get(of_node, index); + + ret = acpi_irq_get(ACPI_HANDLE_FWNODE(fwnode), index, &res); + if (ret) + return ret; + + return res.start; +} +EXPORT_SYMBOL(fwnode_irq_get); + +/** * device_graph_get_next_endpoint - Get next endpoint firmware node * @fwnode: Pointer to the parent firmware node * @prev: Previous endpoint node or %NULL to get the first diff --git a/include/linux/acpi.h b/include/linux/acpi.h index dc1ebfe..f05b9b6 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -56,6 +56,8 @@ static inline acpi_handle acpi_device_handle(struct acpi_device *adev) #define ACPI_COMPANION_SET(dev, adev) set_primary_fwnode(dev, (adev) ? \ acpi_fwnode_handle(adev) : NULL) #define ACPI_HANDLE(dev) acpi_device_handle(ACPI_COMPANION(dev)) +#define ACPI_HANDLE_FWNODE(fwnode) \ + acpi_device_handle(to_acpi_device_node(fwnode)) static inline struct fwnode_handle *acpi_alloc_fwnode_static(void) { @@ -626,6 +628,7 @@ int acpi_arch_timer_mem_init(struct arch_timer_mem *timer_mem, int *timer_count) #define ACPI_COMPANION(dev)(NULL) #define ACPI_COMPANION_SET(dev, adev) do { } while (0) #define ACPI_HANDLE(dev) (NULL) +#define ACPI_HANDLE_FWNODE(fwnode) (NULL) #define ACPI_DEVICE_CLASS(_cls, _msk) .cls = (0), .cls_msk = (0), struct fwnode_handle; diff --git a/include/linux/property.h b/include/linux/property.h index 9b13332..e05889f 100644 --- a/include/linux/property.h +++ b/include/linux/property.h @@ -103,6 +103,8 @@ struct fwnode_handle *device_get_named_child_node(struct device *dev, struct fwnode_handle *fwnode_handle_get(struct fwnode_handle *fwnode); void fwnode_handle_put(struct fwnode_handle *fwnode); +int fwnode_irq_get(struct fwnode_handle *fwnode, unsigned int index); + unsigned int device_get_child_node_count(struct device *dev); static inline bool device_property_read_bool(struct device *dev, -- 2.7.4
[net-next: PATCH v4 5/7] net: mvpp2: simplify maintaining enabled ports' list
'port_count' field of the mvpp2 structure holds an overall amount of available ports, based on DT nodes status. In order to be prepared to support other HW description, obtain the value by incrementing it upon each successful port initialization. This allowed for simplifying port indexing in the controller's private array, whose size is now not dynamically allocated, but fixed to MVPP2_MAX_PORTS. This patch simplifies creating and filling list of enabled ports and is a part of the preparation for adding ACPI support in the mvpp2 driver. Signed-off-by: Marcin Wojtas --- drivers/net/ethernet/marvell/mvpp2.c | 32 +++- 1 file changed, 11 insertions(+), 21 deletions(-) diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c index a197607..7f42d90 100644 --- a/drivers/net/ethernet/marvell/mvpp2.c +++ b/drivers/net/ethernet/marvell/mvpp2.c @@ -865,7 +865,7 @@ struct mvpp2 { /* List of pointers to port structures */ int port_count; - struct mvpp2_port **port_list; + struct mvpp2_port *port_list[MVPP2_MAX_PORTS]; /* Aggregated TXQs */ struct mvpp2_tx_queue *aggr_txqs; @@ -7741,7 +7741,7 @@ static void mvpp2_port_copy_mac_addr(struct net_device *dev, struct mvpp2 *priv, /* Ports initialization */ static int mvpp2_port_probe(struct platform_device *pdev, struct device_node *port_node, - struct mvpp2 *priv, int index) + struct mvpp2 *priv) { struct device_node *phy_node; struct phy *comphy; @@ -7934,7 +7934,8 @@ static int mvpp2_port_probe(struct platform_device *pdev, } netdev_info(dev, "Using %s mac address %pM\n", mac_from, dev->dev_addr); - priv->port_list[index] = port; + priv->port_list[priv->port_count++] = port; + return 0; err_free_port_pcpu: @@ -8313,28 +8314,17 @@ static int mvpp2_probe(struct platform_device *pdev) goto err_mg_clk; } - priv->port_count = of_get_available_child_count(dn); - if (priv->port_count == 0) { - dev_err(&pdev->dev, "no ports enabled\n"); - err = -ENODEV; - goto err_mg_clk; - } - - priv->port_list = devm_kcalloc(&pdev->dev, priv->port_count, - sizeof(*priv->port_list), - GFP_KERNEL); - if (!priv->port_list) { - err = -ENOMEM; - goto err_mg_clk; - } - /* Initialize ports */ - i = 0; for_each_available_child_of_node(dn, port_node) { - err = mvpp2_port_probe(pdev, port_node, priv, i); + err = mvpp2_port_probe(pdev, port_node, priv); if (err < 0) goto err_port_probe; - i++; + } + + if (priv->port_count == 0) { + dev_err(&pdev->dev, "no ports enabled\n"); + err = -ENODEV; + goto err_mg_clk; } /* Statistics must be gathered regularly because some of them (like -- 2.7.4
[net-next: PATCH v4 1/7] device property: Introduce fwnode_get_mac_address()
Until now there were two almost identical functions for obtaining MAC address - of_get_mac_address() and, more generic, device_get_mac_address(). However it is not uncommon, that the network interface is represented as a child of the actual controller, hence it is not associated directly to any struct device, required by the latter routine. This commit allows for getting the MAC address for children nodes in the ACPI world by introducing a new function - fwnode_get_mac_address(). This commit also changes device_get_mac_address() routine to be its wrapper, in order to prevent unnecessary duplication. Signed-off-by: Marcin Wojtas Acked-by: Rafael J. Wysocki --- drivers/base/property.c | 28 ++-- include/linux/property.h | 2 ++ 2 files changed, 22 insertions(+), 8 deletions(-) diff --git a/drivers/base/property.c b/drivers/base/property.c index 851b1b6..f261d1a 100644 --- a/drivers/base/property.c +++ b/drivers/base/property.c @@ -1153,11 +1153,11 @@ int device_get_phy_mode(struct device *dev) } EXPORT_SYMBOL_GPL(device_get_phy_mode); -static void *device_get_mac_addr(struct device *dev, +static void *fwnode_get_mac_addr(struct fwnode_handle *fwnode, const char *name, char *addr, int alen) { - int ret = device_property_read_u8_array(dev, name, addr, alen); + int ret = fwnode_property_read_u8_array(fwnode, name, addr, alen); if (ret == 0 && alen == ETH_ALEN && is_valid_ether_addr(addr)) return addr; @@ -1165,8 +1165,8 @@ static void *device_get_mac_addr(struct device *dev, } /** - * device_get_mac_address - Get the MAC for a given device - * @dev: Pointer to the device + * fwnode_get_mac_address - Get the MAC from the firmware node + * @fwnode:Pointer to the firmware node * @addr: Address of buffer to store the MAC in * @alen: Length of the buffer pointed to by addr, should be ETH_ALEN * @@ -1187,19 +1187,31 @@ static void *device_get_mac_addr(struct device *dev, * In this case, the real MAC is in 'local-mac-address', and 'mac-address' * exists but is all zeros. */ -void *device_get_mac_address(struct device *dev, char *addr, int alen) +void *fwnode_get_mac_address(struct fwnode_handle *fwnode, char *addr, int alen) { char *res; - res = device_get_mac_addr(dev, "mac-address", addr, alen); + res = fwnode_get_mac_addr(fwnode, "mac-address", addr, alen); if (res) return res; - res = device_get_mac_addr(dev, "local-mac-address", addr, alen); + res = fwnode_get_mac_addr(fwnode, "local-mac-address", addr, alen); if (res) return res; - return device_get_mac_addr(dev, "address", addr, alen); + return fwnode_get_mac_addr(fwnode, "address", addr, alen); +} +EXPORT_SYMBOL(fwnode_get_mac_address); + +/** + * device_get_mac_address - Get the MAC for a given device + * @dev: Pointer to the device + * @addr: Address of buffer to store the MAC in + * @alen: Length of the buffer pointed to by addr, should be ETH_ALEN + */ +void *device_get_mac_address(struct device *dev, char *addr, int alen) +{ + return fwnode_get_mac_address(dev_fwnode(dev), addr, alen); } EXPORT_SYMBOL(device_get_mac_address); diff --git a/include/linux/property.h b/include/linux/property.h index f6189a3..35620e0 100644 --- a/include/linux/property.h +++ b/include/linux/property.h @@ -279,6 +279,8 @@ int device_get_phy_mode(struct device *dev); void *device_get_mac_address(struct device *dev, char *addr, int alen); +void *fwnode_get_mac_address(struct fwnode_handle *fwnode, +char *addr, int alen); struct fwnode_handle *fwnode_graph_get_next_endpoint( const struct fwnode_handle *fwnode, struct fwnode_handle *prev); struct fwnode_handle * -- 2.7.4
[net-next: PATCH v4 2/7] device property: Introduce fwnode_get_phy_mode()
Until now there were two almost identical functions for obtaining network PHY mode - of_get_phy_mode() and, more generic, device_get_phy_mode(). However it is not uncommon, that the network interface is represented as a child of the actual controller, hence it is not associated directly to any struct device, required by the latter routine. This commit allows for getting the PHY mode for children nodes in the ACPI world by introducing a new function - fwnode_get_phy_mode(). This commit also changes device_get_phy_mode() routine to be its wrapper, in order to prevent unnecessary duplication. Signed-off-by: Marcin Wojtas Acked-by: Rafael J. Wysocki --- drivers/base/property.c | 24 include/linux/property.h | 1 + 2 files changed, 20 insertions(+), 5 deletions(-) diff --git a/drivers/base/property.c b/drivers/base/property.c index f261d1a..7c4a53d 100644 --- a/drivers/base/property.c +++ b/drivers/base/property.c @@ -1126,21 +1126,21 @@ enum dev_dma_attr device_get_dma_attr(struct device *dev) EXPORT_SYMBOL_GPL(device_get_dma_attr); /** - * device_get_phy_mode - Get phy mode for given device - * @dev: Pointer to the given device + * fwnode_get_phy_mode - Get phy mode for given firmware node + * @fwnode:Pointer to the given node * * The function gets phy interface string from property 'phy-mode' or * 'phy-connection-type', and return its index in phy_modes table, or errno in * error case. */ -int device_get_phy_mode(struct device *dev) +int fwnode_get_phy_mode(struct fwnode_handle *fwnode) { const char *pm; int err, i; - err = device_property_read_string(dev, "phy-mode", &pm); + err = fwnode_property_read_string(fwnode, "phy-mode", &pm); if (err < 0) - err = device_property_read_string(dev, + err = fwnode_property_read_string(fwnode, "phy-connection-type", &pm); if (err < 0) return err; @@ -1151,6 +1151,20 @@ int device_get_phy_mode(struct device *dev) return -ENODEV; } +EXPORT_SYMBOL_GPL(fwnode_get_phy_mode); + +/** + * device_get_phy_mode - Get phy mode for given device + * @dev: Pointer to the given device + * + * The function gets phy interface string from property 'phy-mode' or + * 'phy-connection-type', and return its index in phy_modes table, or errno in + * error case. + */ +int device_get_phy_mode(struct device *dev) +{ + return fwnode_get_phy_mode(dev_fwnode(dev)); +} EXPORT_SYMBOL_GPL(device_get_phy_mode); static void *fwnode_get_mac_addr(struct fwnode_handle *fwnode, diff --git a/include/linux/property.h b/include/linux/property.h index 35620e0..9b13332 100644 --- a/include/linux/property.h +++ b/include/linux/property.h @@ -279,6 +279,7 @@ int device_get_phy_mode(struct device *dev); void *device_get_mac_address(struct device *dev, char *addr, int alen); +int fwnode_get_phy_mode(struct fwnode_handle *fwnode); void *fwnode_get_mac_address(struct fwnode_handle *fwnode, char *addr, int alen); struct fwnode_handle *fwnode_graph_get_next_endpoint( -- 2.7.4
[net-next: PATCH v4 4/7] device property: Allow iterating over available child fwnodes
Implement a new helper function fwnode_get_next_available_child_node(), which enables obtaining next enabled child fwnode, which works on a similar basis to OF's of_get_next_available_child(). This commit also introduces a macro, thanks to which it is possible to iterate over the available fwnodes, using the new function described above. Signed-off-by: Marcin Wojtas --- drivers/base/property.c | 26 include/linux/property.h | 6 + 2 files changed, 32 insertions(+) diff --git a/drivers/base/property.c b/drivers/base/property.c index 1d6c9d9..613ba82 100644 --- a/drivers/base/property.c +++ b/drivers/base/property.c @@ -998,6 +998,32 @@ fwnode_get_next_child_node(const struct fwnode_handle *fwnode, EXPORT_SYMBOL_GPL(fwnode_get_next_child_node); /** + * fwnode_get_next_available_child_node - Return the next + * available child node handle for a node + * @fwnode: Firmware node to find the next child node for. + * @child: Handle to one of the node's child nodes or a %NULL handle. + */ +struct fwnode_handle * +fwnode_get_next_available_child_node(const struct fwnode_handle *fwnode, +struct fwnode_handle *child) +{ + struct fwnode_handle *next_child = child; + + if (!fwnode) + return NULL; + + do { + next_child = fwnode_get_next_child_node(fwnode, next_child); + + if (!next_child || fwnode_device_is_available(next_child)) + break; + } while (next_child); + + return next_child; +} +EXPORT_SYMBOL_GPL(fwnode_get_next_available_child_node); + +/** * device_get_next_child_node - Return the next child node handle for a device * @dev: Device to find the next child node for. * @child: Handle to one of the device's child nodes or a null handle. diff --git a/include/linux/property.h b/include/linux/property.h index e05889f..5b0563a 100644 --- a/include/linux/property.h +++ b/include/linux/property.h @@ -83,11 +83,17 @@ struct fwnode_handle *fwnode_get_next_parent( struct fwnode_handle *fwnode); struct fwnode_handle *fwnode_get_next_child_node( const struct fwnode_handle *fwnode, struct fwnode_handle *child); +struct fwnode_handle *fwnode_get_next_available_child_node( + const struct fwnode_handle *fwnode, struct fwnode_handle *child); #define fwnode_for_each_child_node(fwnode, child) \ for (child = fwnode_get_next_child_node(fwnode, NULL); child; \ child = fwnode_get_next_child_node(fwnode, child)) +#define fwnode_for_each_available_child_node(fwnode, child) \ + for (child = fwnode_get_next_available_child_node(fwnode, NULL); child;\ +child = fwnode_get_next_available_child_node(fwnode, child)) + struct fwnode_handle *device_get_next_child_node( struct device *dev, struct fwnode_handle *child); -- 2.7.4
Re: [net-next: PATCH 0/8] Armada 7k/8k PP2 ACPI support
[+cc Mika] On Tue, Jan 09, 2018 at 02:00:12PM +0100, Andrew Lunn wrote: > On Tue, Jan 09, 2018 at 11:22:00AM +0100, Marcin Wojtas wrote: > > 2018-01-09 11:19 GMT+01:00 Graeme Gregory : > > > On Mon, Jan 08, 2018 at 06:17:06PM +0100, Marcin Wojtas wrote: > > >> Hi Andrew, > > >> > > >> > > >> > > >> 2018-01-08 16:42 GMT+01:00 Andrew Lunn : > > >> > w> I am not familiar with MDIO, but if its similar or a specific > > >> >> implementation of a serial bus that does sound sane! > > >> > > > >> > > >> Thanks for digging, I will check if and how we can use > > >> GenericSerialBus with MDIO. > > >> > > > Maybe Lorenzo, Hanjun, Sudeep can comment here they might have come > > > across similar on other ARM boards. > > > > > > > I'm looking forward to their feedback, however, what I've noticed, > > each driver handles mdio/phys on its own, not using any generic > > solution, which is what I need to actually avoid :) > > Agreed. Lets define it once for all drivers using phylib/phylink. To start with, I am not entirely familiar with MDIO, apologies in advance. Building something on top of GenericSerialBus sounds correct but if I am not mistaken you would need a new bus type in the ACPI specs. I CC'ed Mika since he is more familiar with handling these bits of ACPI specs - I wonder whether this is a problem that cropped up on x86 systems too. I do not think there is one and only answer but there must be a single set of bindings and if the ACPI specs already cater for some of them we have to reuse them. Please take some time to ensure the solution you are pushing is widely deployable. Thanks, Lorenzo
Re: [RFC PATCH net-next 00/12] selftests: forwarding: Add VRF-based tests
> > > >> > > > >> br0 > > > >> + > > > >>vrf-h1 | vrf-h2 > > > >> ++---+++ > > > >> |||| > > > >> 192.0.2.1/24 ++++ 192.0.2.2/24 > > > >>swp1 swp2 swp3 swp4 > > > >> ++++ > > > >> |||| > > > >> ++++ > > > >> > > > > > Agreed this is really cool! For DSA enabled switches, we usually have a > > > host that does the test sequencing and then execute commands remotely on > > > the DUT, but we might be able to get such a similar framework up and > > > running on the DUT itself without too much hassle. > > > > I think the problem we will have is a lack of ports. Most DSA switches > > have 4 or 5 ports. Given the need for two ports per bridge port, we > > will be limited to bridges with just two members. That really limits > > what sort of tests you can do. > > I was actually interested in feedback from you guys. Looking at > dsa_slave_changeupper() I see you don't forbid the enslavement to a VRF > and that you set STP state to forwarding when a port leaves a bridge > (good). Does that mean you're able to use some of these tests on your > switches? By default, we have port separation. So you can add individual IP addresses to interfaces, and ping from them. In fact, this is the first test i always run. My test setup uses 4 ports of the switch, and connects a USB-Ethernet dongle to each port. The 4 USB cables then connect to my host running the tests. I setup four parallel subnets, and ping across each. I've never used vrf, so i've no idea if that will just work on top. > The reason we can use these tests for mlxsw is that we support VRF and > ACL offload. At least in the above example, swp4 is able to receive > packets directed at 192.0.2.2 because we program the device with the > host route 192.0.2.2/32. We are all L2. So no route needed. No ACLs either. So in theory, this could be used with DSA. But in practice, only having two test ports really restricts you from doing anything interesting. You need a minimum of 3, and all my tests use 4. Given your double requirements, that means you actually need a switch with 8 external ports. I don't know of any DSA board which has that. Andrew
Re: [net-next: PATCH 0/8] Armada 7k/8k PP2 ACPI support
> I CC'ed Mika since he is more familiar with handling these bits of ACPI > specs - I wonder whether this is a problem that cropped up on x86 > systems too. Hi Lorenzo There is nothing about MDIO, PHYs, Ethernet switches, etc in version 6.2 of the spec. If x86 has this, it must be after 6.2 was released. I would not be too surprised if x86 has none of this. If you look at the typical drives used on x86, i210, e1000e, ixgb, r8169, etc. They are all PCI devices, and hide all this. > I do not think there is one and only answer but there must be a single > set of bindings and if the ACPI specs already cater for some of them > we have to reuse them. Agreed. Due diligence so far suggests there is nothing already defined. But im a newbie to ACPI, so could be looking in the wrong place. I really hope there is somebody like Rob Herring, the DT maintainer, who keeps an eye on all ACPI talk and would tell us if we are heading off in the wrong direction. Andrew
Re: dangers of bots on the mailing lists was Re: divide error in ___bpf_prog_run
On Thu, Jan 18, 2018 at 2:09 AM, Theodore Ts'o wrote: > On Wed, Jan 17, 2018 at 04:21:13PM -0800, Alexei Starovoitov wrote: >> >> If syzkaller can only test one tree than linux-next should be the one. > > Well, there's been some controversy about that. The problem is that > it's often not clear if this is long-standing bug, or a bug which is > in a particular subsystem tree --- and if so, *which* subsystem tree, > etc. So it gets blasted to linux-kernel, and to get_maintainer.pl, > which is often not accurate --- since the location of the crash > doesn't necessarily point out where the problem originated, and hence > who should look at the syzbot report. And so this has caused > some irritation. Re set of tested trees. We now have an interesting spectrum of opinions. Some assorted thoughts on this: 1. First, "upstream is clean" won't happen any time soon. There are several reasons for this: - Currently syzkaller only tests a subset of subsystems that it knows how to test, even the ones that it tests it tests poorly. Over time it's improved to test most subsystems and existing subsystems better. Just few weeks ago I've added some descriptions for crypto subsystem and it uncovered 20+ old bugs. - syzkaller is guided, genetic fuzzer over time it leans how to do more complex things by small steps. It takes time. - We have more bug detection tools coming: LEAKCHECK, KMSAN (uninit memory), KTSAN (data races). - generic syzkaller smartness will be improved over time. - it will get more CPU resources. Effect of all of these things is multiplicative: we test more code, smarter, with more bug-detection tools, with more resources. So I think we need to plan for a mix of old and new bugs for foreseeable future. 2. get_maintainer.pl and mix of old and new bugs was mentioned as harming attribution. I don't see what will change when/if we test only upstream. Then the same mix of old/new bugs will be detected just on upstream, with all of the same problems for old/new, maintainers, which subsystem, etc. I think the amount of bugs in the kernel is significant part of the problem, but the exact boundary where we decide to start killing them won't affect number of bugs. 3. If we test only upstream, we increase chances of new security bugs sinking into releases. We sure could raise perceived security value of the bugs by keeping them private, letting them sink into release, letting them sink into distros, and then reporting a high-profile vulnerability. I think that's wrong. There is something broken with value measuring in security community. Bug that is killed before sinking into any release is the highest impact thing. As Alexei noted, fixing bugs es early as possible also reduces fix costs, backporting burden, etc. This also can eliminate need in bisection in some cases, say if you accepted a large change to some files and a bunch of crashes appears for these files on your tree soon, it's obvious what happens. 4. It was mentioned that linux-next can have a broken slab allocator and that will manifest as multiple random crashes. FWIW I don't remember that I ever seen this. Yes, sometimes it does not build/boot, but these builds are just rejected for testing. I don't mind dropping linux-next specifically if that's the common decision. However, (1) Alexei and Gruenter expressed opposite opinion, (2) I don't see what it will change dramatically, (2) as far as I understand Linus actually relies on linux-next giving some concrete testing to the code there. But I think that testing bpf-next is a positive thing provided that there is explicit interest from maintainers. And note that that will be testing targeted specifically at bpf subsystem, so that instance will not generate bugs in SCSI, USB, etc (though it will cover a part of net). Also note that the latest email format includes set of tree where the crash happened, so if you see "upstream" or "upstream and bpf-next", nothing really changes, you still know that it happens upstream. Or if you see only "bpf-next", then you know that it's only that tree.
[PATCH] can: m_can: mark runtime-PM handlers as __maybe_unused
Building without CONFIG_PM results in a harmless warning: drivers/net/can/m_can/m_can.c:1763:12: error: 'm_can_runtime_resume' defined but not used [-Werror=unused-function] drivers/net/can/m_can/m_can.c:1752:12: error: 'm_can_runtime_suspend' defined but not used [-Werror=unused-function] Marking the functions as __maybe_unused lets the compiler silently drop them instead. Fixes: cdf8259d6573 ("can: m_can: Add PM Support") Signed-off-by: Arnd Bergmann --- drivers/net/can/m_can/m_can.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c index 49ed8737871f..2594f7779c6f 100644 --- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -1749,7 +1749,7 @@ static int m_can_plat_remove(struct platform_device *pdev) return 0; } -static int m_can_runtime_suspend(struct device *dev) +static int __maybe_unused m_can_runtime_suspend(struct device *dev) { struct net_device *ndev = dev_get_drvdata(dev); struct m_can_priv *priv = netdev_priv(ndev); @@ -1760,7 +1760,7 @@ static int m_can_runtime_suspend(struct device *dev) return 0; } -static int m_can_runtime_resume(struct device *dev) +static int __maybe_unused m_can_runtime_resume(struct device *dev) { struct net_device *ndev = dev_get_drvdata(dev); struct m_can_priv *priv = netdev_priv(ndev); -- 2.9.0
Re: dangers of bots on the mailing lists was Re: divide error in ___bpf_prog_run
On Thu, Jan 18, 2018 at 2:01 PM, Dmitry Vyukov wrote: > On Thu, Jan 18, 2018 at 2:09 AM, Theodore Ts'o wrote: >> On Wed, Jan 17, 2018 at 04:21:13PM -0800, Alexei Starovoitov wrote: >>> >>> If syzkaller can only test one tree than linux-next should be the one. >> >> Well, there's been some controversy about that. The problem is that >> it's often not clear if this is long-standing bug, or a bug which is >> in a particular subsystem tree --- and if so, *which* subsystem tree, >> etc. So it gets blasted to linux-kernel, and to get_maintainer.pl, >> which is often not accurate --- since the location of the crash >> doesn't necessarily point out where the problem originated, and hence >> who should look at the syzbot report. And so this has caused >> some irritation. > > > Re set of tested trees. > > We now have an interesting spectrum of opinions. > > Some assorted thoughts on this: > > 1. First, "upstream is clean" won't happen any time soon. There are > several reasons for this: > - Currently syzkaller only tests a subset of subsystems that it knows > how to test, even the ones that it tests it tests poorly. Over time > it's improved to test most subsystems and existing subsystems better. > Just few weeks ago I've added some descriptions for crypto subsystem > and it uncovered 20+ old bugs. /\/\/\/\/\/\/\/\/\/\/\/\ While we are here, you can help syzkaller to test your subsystem better (or at all). It frequently requires domain expertise which we don't have for all kernel subsystems (sometimes we don't even know they exist). It can be as simple as this (for /dev/ashmem): https://github.com/google/syzkaller/blob/master/sys/linux/ashmem.txt > - syzkaller is guided, genetic fuzzer over time it leans how to do > more complex things by small steps. It takes time. > - We have more bug detection tools coming: LEAKCHECK, KMSAN (uninit > memory), KTSAN (data races). > - generic syzkaller smartness will be improved over time. > - it will get more CPU resources. > Effect of all of these things is multiplicative: we test more code, > smarter, with more bug-detection tools, with more resources. So I > think we need to plan for a mix of old and new bugs for foreseeable > future. > > 2. get_maintainer.pl and mix of old and new bugs was mentioned as > harming attribution. I don't see what will change when/if we test only > upstream. Then the same mix of old/new bugs will be detected just on > upstream, with all of the same problems for old/new, maintainers, > which subsystem, etc. I think the amount of bugs in the kernel is > significant part of the problem, but the exact boundary where we > decide to start killing them won't affect number of bugs. > > 3. If we test only upstream, we increase chances of new security bugs > sinking into releases. We sure could raise perceived security value of > the bugs by keeping them private, letting them sink into release, > letting them sink into distros, and then reporting a high-profile > vulnerability. I think that's wrong. There is something broken with > value measuring in security community. Bug that is killed before > sinking into any release is the highest impact thing. As Alexei noted, > fixing bugs es early as possible also reduces fix costs, backporting > burden, etc. This also can eliminate need in bisection in some cases, > say if you accepted a large change to some files and a bunch of > crashes appears for these files on your tree soon, it's obvious what > happens. > > 4. It was mentioned that linux-next can have a broken slab allocator > and that will manifest as multiple random crashes. FWIW I don't > remember that I ever seen this. Yes, sometimes it does not build/boot, > but these builds are just rejected for testing. > > I don't mind dropping linux-next specifically if that's the common > decision. However, (1) Alexei and Gruenter expressed opposite opinion, > (2) I don't see what it will change dramatically, (2) as far as I > understand Linus actually relies on linux-next giving some concrete > testing to the code there. > But I think that testing bpf-next is a positive thing provided that > there is explicit interest from maintainers. And note that that will > be testing targeted specifically at bpf subsystem, so that instance > will not generate bugs in SCSI, USB, etc (though it will cover a part > of net). Also note that the latest email format includes set of tree > where the crash happened, so if you see "upstream" or "upstream and > bpf-next", nothing really changes, you still know that it happens > upstream. Or if you see only "bpf-next", then you know that it's only > that tree.
Re: [PATCH 0/3] Check gso_size of packets when forwarding
Pravin Shelar writes: > On Mon, Jan 15, 2018 at 6:09 PM, Daniel Axtens wrote: >> When regular packets are forwarded, we validate their size against the >> MTU of the destination device. However, when GSO packets are >> forwarded, we do not validate their size against the MTU. We >> implicitly assume that when they are segmented, the resultant packets >> will be correctly sized. >> >> This is not always the case. >> >> We observed a case where a packet received on an ibmveth device had a >> GSO size of around 10kB. This was forwarded by Open vSwitch to a bnx2x >> device, where it caused a firmware assert. This is described in detail >> at [0] and was the genesis of this series. Rather than fixing it in >> the driver, this series fixes the forwarding path. >> > Are there any other possible forwarding path in networking stack? TC > is one subsystem that could forward such a packet to the bnx2x device, > how is that handled ? So far I have only looked at bridges, openvswitch and macvlan. In general, if the code uses dev_forward_skb() it should automatically be fine as that invokes is_skb_forwardable(), which we patch. There is potentially a forwarding path in macvlan that doesn't pass through dev_forward_skb() (which calls is_skb_forwardable). I did post a patch to this earlier, but it got somewhat derailed by how you would get such a packet in to a macvlan device and whether that should be fixed instead. Once we have some consensus on Jason's questions and the overall approach is solid, I'm happy to work on that again. To answer your specific question, I'm not aware of how tc can cause a packet to be forwarded - if you can point me to the right general area I can have a look. Regards, Daniel > >> To fix this: >> >> - Move a helper in patch 1. >> >> - Validate GSO segment lengths in is_skb_forwardable() in the GSO >>case, rather than assuming all will be well. This fixes bridges. >>This is patch 2. >> >> - Open vSwitch uses its own slightly specialised algorithm for >>checking lengths. Wire up checking for that in patch 3. >> >> [0]: https://patchwork.ozlabs.org/patch/859410/ >> >> Cc: manish.cho...@cavium.com >> Cc: d...@openvswitch.org >> >> Daniel Axtens (3): >> net: move skb_gso_mac_seglen to skbuff.h >> net: is_skb_forwardable: validate length of GSO packet segments >> openvswitch: drop GSO packets that are too large >> >> include/linux/skbuff.h | 16 >> net/core/dev.c | 7 --- >> net/core/skbuff.c | 34 ++ >> net/openvswitch/vport.c | 37 ++--- >> net/sched/sch_tbf.c | 10 -- >> 5 files changed, 84 insertions(+), 20 deletions(-) >> >> -- >> 2.14.1 >>
Re: dangers of bots on the mailing lists was Re: divide error in ___bpf_prog_run
On Wed, Jan 17, 2018 at 12:09 PM, Dmitry Vyukov wrote: > On Wed, Jan 17, 2018 at 10:49 AM, Daniel Borkmann > wrote: >> Don't know if there's such a possibility, but it would be nice if we could >> target fuzzing for specific subsystems in related subtrees directly (e.g. >> for bpf in bpf and bpf-next trees as one example). Dmitry? > > Hi Daniel, > > It's doable. > Let's start with one bpf tree. Will it be bpf or bpf-next? Which one > contains more ongoing work? What's the exact git repo address/branch, > so that I don't second guess? > Also what syscalls it makes sense to enable there to target it at bpf > specifically? As far as I understand effects of bpf are far beyond the > bpf call and proper testing requires some sockets and other stuff. For > sockets, will it be enough to enable ip/ipv6? Because if we enable all > of sctp/dccp/tipc/pptp/etc, it will sure will be finding lots of bugs > there as well. Does bpf affect incoming network packets? > Also are there any sysctl's, command line arguments, etc that need to > be tuned. I know there are net.core.bpf_jit_enable/harden, but I don't > know what's the most relevant combination. Ideally, we test all of > them, but let start with one of them because it requires separate > instances (since the setting is global and test programs can't just > flip it randomly). > Also do you want testing from root or not from root? We generally > don't test under root, because syzkaller comes up with legal ways to > shut everything down even if we try to contain it (e.g. kill init > somehow or shut down network using netlink). But if we limit syscall > surface, then root may work and allow testing staging bpf features. So, Daniel, Alexei, I understand that I asked lots of questions, but they are relatively simple. I need that info to setup proper testing.
[PATCH] [wireless-next] mt76: fix building without CONFIG_LEDS_CLASS
When CONFIG_LEDS_CLASS is disabled, or it is a loadable module while mt76 is built-in, we run into a link error: drivers/net/wireless/mediatek/mt76/mac80211.o: In function `mt76_register_device': mac80211.c:(.text+0xb78): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `devm_of_led_classdev_register' We don't really need a hard dependency here as the driver can presumably work just fine without LEDs, so this follows the iwlwifi example and adds a separate Kconfig option for the LED support, this will be available whenever it will link, and otherwise the respective code gets left out from the driver object. Fixes: 17f1de56df05 ("mt76: add common code shared between multiple chipsets") Signed-off-by: Arnd Bergmann --- drivers/net/wireless/mediatek/mt76/Kconfig | 5 + drivers/net/wireless/mediatek/mt76/mac80211.c| 8 +--- drivers/net/wireless/mediatek/mt76/mt76x2_init.c | 6 -- 3 files changed, 14 insertions(+), 5 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/Kconfig b/drivers/net/wireless/mediatek/mt76/Kconfig index fc05d79c80d0..9d12c1f5284e 100644 --- a/drivers/net/wireless/mediatek/mt76/Kconfig +++ b/drivers/net/wireless/mediatek/mt76/Kconfig @@ -8,3 +8,8 @@ config MT76x2E depends on PCI ---help--- This adds support for MT7612/MT7602/MT7662-based wireless PCIe devices. + +config MT76_LEDS + bool "LED support for MT76" + depends on MT76_CORE + depends on LEDS_CLASS=y || MT76_CORE=LEDS_CLASS diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c index 3acf0e175d71..144e312a9407 100644 --- a/drivers/net/wireless/mediatek/mt76/mac80211.c +++ b/drivers/net/wireless/mediatek/mt76/mac80211.c @@ -295,9 +295,11 @@ int mt76_register_device(struct mt76_dev *dev, bool vht, mt76_check_sband(dev, NL80211_BAND_2GHZ); mt76_check_sband(dev, NL80211_BAND_5GHZ); - ret = mt76_led_init(dev); - if (ret) - return ret; + if (IS_ENABLED(CONFIG_MT76_LEDS)) { + ret = mt76_led_init(dev); + if (ret) + return ret; + } return ieee80211_register_hw(hw); } diff --git a/drivers/net/wireless/mediatek/mt76/mt76x2_init.c b/drivers/net/wireless/mediatek/mt76/mt76x2_init.c index 4373a2ba5143..0c9d02af205d 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76x2_init.c +++ b/drivers/net/wireless/mediatek/mt76/mt76x2_init.c @@ -849,8 +849,10 @@ int mt76x2_register_device(struct mt76x2_dev *dev) mt76x2_dfs_init_detector(dev); /* init led callbacks */ - dev->mt76.led_cdev.brightness_set = mt76x2_led_set_brightness; - dev->mt76.led_cdev.blink_set = mt76x2_led_set_blink; + if (IS_ENABLED(CONFIG_MT76_LEDS)) { + dev->mt76.led_cdev.brightness_set = mt76x2_led_set_brightness; + dev->mt76.led_cdev.blink_set = mt76x2_led_set_blink; + } ret = mt76_register_device(&dev->mt76, true, mt76x2_rates, ARRAY_SIZE(mt76x2_rates)); -- 2.9.0
Re: [PATCH] can: m_can: mark runtime-PM handlers as __maybe_unused
On 01/18/2018 02:04 PM, Arnd Bergmann wrote: > Building without CONFIG_PM results in a harmless warning: > > drivers/net/can/m_can/m_can.c:1763:12: error: 'm_can_runtime_resume' defined > but not used [-Werror=unused-function] > drivers/net/can/m_can/m_can.c:1752:12: error: 'm_can_runtime_suspend' defined > but not used [-Werror=unused-function] > > Marking the functions as __maybe_unused lets the compiler > silently drop them instead. > > Fixes: cdf8259d6573 ("can: m_can: Add PM Support") > Signed-off-by: Arnd Bergmann Tnx - applied. However all archs using this driver have mandatory PM support and the driver depends on it. Marc -- Pengutronix e.K. | Marc Kleine-Budde | Industrial Linux Solutions| Phone: +49-231-2826-924 | Vertretung West/Dortmund | Fax: +49-5121-206917- | Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de | signature.asc Description: OpenPGP digital signature
Re: [PATCH 0/3] Check gso_size of packets when forwarding
Jason Wang writes: > On 2018年01月18日 16:28, Pravin Shelar wrote: >> On Mon, Jan 15, 2018 at 6:09 PM, Daniel Axtens wrote: >>> When regular packets are forwarded, we validate their size against the >>> MTU of the destination device. However, when GSO packets are >>> forwarded, we do not validate their size against the MTU. We >>> implicitly assume that when they are segmented, the resultant packets >>> will be correctly sized. >>> >>> This is not always the case. >>> >>> We observed a case where a packet received on an ibmveth device had a >>> GSO size of around 10kB. This was forwarded by Open vSwitch to a bnx2x >>> device, where it caused a firmware assert. This is described in detail >>> at [0] and was the genesis of this series. Rather than fixing it in >>> the driver, this series fixes the forwarding path. >>> >> Are there any other possible forwarding path in networking stack? TC >> is one subsystem that could forward such a packet to the bnx2x device, >> how is that handled ? > > Yes, so it looks to me we should do the check in e.g netif_needs_gso() > before passing it to hardware. And bnx2x needs to set its gso_max_size > correctly. I don't think gso_max_size is quite the same. If I understand net/ipv4/tcp.c correctly, gso_max_size is supposed to cap the total length of the skb, not the length of each segmented packet. The problem for the bnx2x card is the length of the segment, not the overall length. > > Btw, looks like this could be triggered from a guest which is a DOS. We > need request a CVE for this. > You are correct about how this can be triggered: in fact it came to my attention because a network packet from one LPAR (PowerVM virtual machine) brought down the card attached to a different LPAR. It didn't occur to me that it was potentially a security issue. I am talking with the security team at Canonical regarding this. Regards, Daniel > Thanks > >> >>> To fix this: >>> >>> - Move a helper in patch 1. >>> >>> - Validate GSO segment lengths in is_skb_forwardable() in the GSO >>> case, rather than assuming all will be well. This fixes bridges. >>> This is patch 2. >>> >>> - Open vSwitch uses its own slightly specialised algorithm for >>> checking lengths. Wire up checking for that in patch 3. >>> >>> [0]: https://patchwork.ozlabs.org/patch/859410/ >>> >>> Cc: manish.cho...@cavium.com >>> Cc: d...@openvswitch.org >>> >>> Daniel Axtens (3): >>>net: move skb_gso_mac_seglen to skbuff.h >>>net: is_skb_forwardable: validate length of GSO packet segments >>>openvswitch: drop GSO packets that are too large >>> >>> include/linux/skbuff.h | 16 >>> net/core/dev.c | 7 --- >>> net/core/skbuff.c | 34 ++ >>> net/openvswitch/vport.c | 37 ++--- >>> net/sched/sch_tbf.c | 10 -- >>> 5 files changed, 84 insertions(+), 20 deletions(-) >>> >>> -- >>> 2.14.1 >>>
[PATCH] [net-next] net: sched: avoid uninitialized variable use
gcc has identified a code path in which we pass uninitialized data into tc_dump_tfilter(): net/sched/cls_api.c: In function 'tc_dump_tfilter': net/sched/cls_api.c:1268:8: error: 'parent' may be used uninitialized in this function [-Werror=maybe-uninitialized] This initializes the variable to the value it had before the previous change. Fixes: 7960d1daf278 ("net: sched: use block index as a handle instead of qdisc when block is shared") Signed-off-by: Arnd Bergmann I don't know if my patch is the best way to address the issue, but if not, then at least it helps show what the warning is about and lets someone else come up with a better solution. --- net/sched/cls_api.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index e500d11da9cd..a95bfc8fc442 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -1317,6 +1317,7 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb) block = tcf_block_lookup(net, tcm->tcm_block_index); if (!block) goto out; + parent = tcm->tcm_parent; } else { const struct Qdisc_class_ops *cops; struct net_device *dev; -- 2.9.0
Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
Hi Dan, Linus, On Thu, Jan 11, 2018 at 05:41:08PM -0800, Dan Williams wrote: > On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds > wrote: > > On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams > > wrote: > >> > >> This series incorporates Mark Rutland's latest ARM changes and adds > >> the x86 specific implementation of 'ifence_array_ptr'. That ifence > >> based approach is provided as an opt-in fallback, but the default > >> mitigation, '__array_ptr', uses a 'mask' approach that removes > >> conditional branches instructions, and otherwise aims to redirect > >> speculation to use a NULL pointer rather than a user controlled value. > > > > Do you have any performance numbers and perhaps example code > > generation? Is this noticeable? Are there any microbenchmarks showing > > the difference between lfence use and the masking model? > > I don't have performance numbers, but here's a sample code generation > from __fcheck_files, where the 'and; lea; and' sequence is portion of > array_ptr() after the mask generation with 'sbb'. > > fdp = array_ptr(fdt->fd, fd, fdt->max_fds); > 8e7: 8b 02 mov(%rdx),%eax > 8e9: 48 39 c7cmp%rax,%rdi > 8ec: 48 19 c9sbb%rcx,%rcx > 8ef: 48 8b 42 08 mov0x8(%rdx),%rax > 8f3: 48 89 femov%rdi,%rsi > 8f6: 48 21 ceand%rcx,%rsi > 8f9: 48 8d 04 f0 lea(%rax,%rsi,8),%rax > 8fd: 48 21 c8and%rcx,%rax > > > > Having both seems good for testing, but wouldn't we want to pick one in the > > end? > > I was thinking we'd keep it as a 'just in case' sort of thing, at > least until the 'probably safe' assumption of the 'mask' approach has > more time to settle out. >From the arm64 side, the only concern I have (and this actually applies to our CSDB sequence as well) is the calculation of the array size by the caller. As Linus mentioned at the end of [1], if the determination of the size argument is based on a conditional branch, then masking doesn't help because you bound within the wrong range under speculation. We ran into this when trying to use masking to protect our uaccess routines where the conditional bound is either KERNEL_DS or USER_DS. It's possible that a prior conditional set_fs(KERNEL_DS) could defeat the masking and so we'd need to throw some heavy barriers in set_fs to make it robust. The question then is whether or not we're likely to run into this elsewhere, and I don't have a good feel for that. Will [1] http://lkml.kernel.org/r/CA+55aFz0tsreoa=5ud2nofcpng-dizlbht9wu9asyhplfjd...@mail.gmail.com
Re: [net-next: PATCH v4 0/7] Armada 7k/8k PP2 ACPI support
Hi Marcin, I tested the series on a MacchiatoBin to ensure the mvpp2 DT support was still working. I was able to use all supported ports as before, and saw no issue. For all mvpp2 patches, you can add: Tested-by: Antoine Tenart Thanks! Antoine On Thu, Jan 18, 2018 at 01:31:37PM +0100, Marcin Wojtas wrote: > Hi, > > I quickly resend the series, thanks to Antoine Tenart's remark, > who spotted !CONFIG_ACPI compilation issue after introducing > the new fwnode_irq_get() routine. Please see the details in the changelog > below and the 3/7 commit log. > > mvpp2 driver can work with the ACPI representation, as exposed > on a public branch: > https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/commits/marvell-armada-wip > It was compiled together with the most recent Tianocore EDK2 revision. > Please refer to the firmware build instruction on MacchiatoBin board: > http://wiki.macchiatobin.net/tiki-index.php?page=Build+from+source+-+UEFI+EDK+II > > ACPI representation of PP2 controllers (withouth PHY support) can > be viewed in the github: > * MacchiatoBin: > https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/blob/71ae395da1661374b0f07d1602afb1eee56e9794/Platforms/Marvell/Armada/AcpiTables/Armada80x0McBin/Dsdt.asl#L201 > > * Armada 7040 DB: > https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/blob/71ae395da1661374b0f07d1602afb1eee56e9794/Platforms/Marvell/Armada/AcpiTables/Armada70x0/Dsdt.asl#L131 > > I will appreciate any comments or remarks. > > Best regards, > Marcin > > Changelog: > v3 -> v4: > * 3/7 > - add new macro (ACPI_HANDLE_FWNODE) and fix > compilation with !CONFIG_ACPI > - extend commit log and mention usability of fwnode_irq_get > for the child nodes as well > > v2 -> v3: > * 1/7, 2/7 > - Add Rafael's Acked-by's > * 3/7, 4/7 > - New patches > * 6/7, 7/7 > - Update driver with new helper routines usage > - Improve commit log. > > v1 -> v2: > * Remove MDIO patches > * Use PP2 ports only with link interrupts > * Release second region resources in mvpp2 driver (code moved from > mvmdio), as explained in details in 5/5 commit message. > > Marcin Wojtas (7): > device property: Introduce fwnode_get_mac_address() > device property: Introduce fwnode_get_phy_mode() > device property: Introduce fwnode_irq_get() > device property: Allow iterating over available child fwnodes > net: mvpp2: simplify maintaining enabled ports' list > net: mvpp2: use device_*/fwnode_* APIs instead of of_* > net: mvpp2: enable ACPI support in the driver > > drivers/base/property.c | 104 -- > drivers/net/ethernet/marvell/mvpp2.c | 206 > include/linux/acpi.h | 3 + > include/linux/property.h | 11 ++ > 4 files changed, 232 insertions(+), 92 deletions(-) > > -- > 2.7.4 > -- Antoine Ténart, Free Electrons Embedded Linux and Kernel engineering http://free-electrons.com
Re: [PATCH] [wireless-next] mt76: fix building without CONFIG_LEDS_CLASS
Arnd Bergmann writes: > When CONFIG_LEDS_CLASS is disabled, or it is a loadable module while > mt76 is built-in, we run into a link error: > > drivers/net/wireless/mediatek/mt76/mac80211.o: In function > `mt76_register_device': > mac80211.c:(.text+0xb78): relocation truncated to fit: R_AARCH64_CALL26 > against undefined symbol `devm_of_led_classdev_register' > > We don't really need a hard dependency here as the driver can presumably > work just fine without LEDs, so this follows the iwlwifi example and > adds a separate Kconfig option for the LED support, this will be available > whenever it will link, and otherwise the respective code gets left out from > the driver object. > > Fixes: 17f1de56df05 ("mt76: add common code shared between multiple chipsets") > Signed-off-by: Arnd Bergmann I'm planning to queue this for 4.16. -- Kalle Valo
Re: [RFC v2 net-next 06/10] net/sched: Introduce the TBS Qdisc
On 18-01-17 06:06 PM, Jesus Sanchez-Palencia wrote: From: Vinicius Costa Gomes TBS (Time Based Scheduler) uses the information added earlier in this series (the socket option SO_TXTIME and the new role of sk_buff->tstamp) to schedule traffic transmission based on absolute time. For some workloads, just bandwidth enforcement is not enough, and precise control of the transmission of packets is necessary. Example: $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \ handle 100:0 ? map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0 $ tc qdisc add dev enp2s0 parent 100:1 tbs delta 6 clockid 11 offload 1 In this example, the Qdisc will try to enable offloading (offload 1) the control of the transmission time to the network adapter, the time stamp in socket are in reference to the clockid '11' (CLOCK_TAI) and packets leave the Qdisc "delta" (6) nanoseconds before its transmission time. When offloading is disabled, the network adapter will ignore the sk_buff time stamp, and so, the transmission time will be only "best effort" from the Qdisc. General comments: 1) iproute2: Avoid magic numbers like 1 or 11 please; "offload" (without 1) and "TAI" will be more human friendly. 2) Experience shows that adding padding fields in the control structs implies they will _never ever_ be used. That was not design intent for netlink but over years shit like that has happened. Maybe look at using a 32 bitmap? It is more "future proof". You seem to only have 2-3 flags but it gives you opportunity to add more changes later. If you are 100% sure youll never need it - then maybe just move the tc_tbs_qopt::offload to the end of of the struct. 3)It would be helpful for debugging to increment some stats other than drop counters on enqueu/dequeue obsolete packet drop. Maybe use overlimits for the dequeu drops (in addition)? 4) I may be misreading things - but did you need to reset the watchdog on dequeue? It is already being kicked for every incoming packet. cheers, jamal
[PATCH xfrm v1] xfrm: fix error flow in case of add state fails
If add state fails in case of device offload, netdev refcount will be negative since gc task is attempting to dev_free this state. This is fixed by putting NULL in state dev field. Signed-off-by: Aviad Yehezkel Signed-off-by: Boris Pismeny --- net/xfrm/xfrm_device.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c index fb3f920..cd2b7d3 100644 --- a/net/xfrm/xfrm_device.c +++ b/net/xfrm/xfrm_device.c @@ -109,6 +109,7 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x, err = dev->xfrmdev_ops->xdo_dev_state_add(x); if (err) { + xso->dev = NULL; dev_put(dev); return err; } -- 2.7.4
Re: dangers of bots on the mailing lists was Re: divide error in ___bpf_prog_run
On Thu, 18 Jan 2018, Dmitry Vyukov wrote: > I've made a bunch of changes yesterday and today. This includes ... > and even per crash/tree. To alleviate this, syzbot will now say e.g. > "So far this crash happened 185 times on linux-next, mmots, net-next, > upstream". So that you can see that it's not only, say, linux-next > problem. > > syzbot just mailed another report with all of these changes which you > can see here: > https://groups.google.com/forum/#!msg/syzkaller-bugs/u5nq3PdPkIc/F4tXzErxAgAJ Looks good to me. Not that I had anything against what it did before, but it is much more recipient-friendly now, IMHO. Thanks for all the hard work on syzkaller! -- Henrique Holschuh
Re: [PATCH v2] FIRMWARE: bcm47xx_nvram: Replace mac address parsing
On Thu, 2017-12-21 at 17:42 +0100, Hauke Mehrtens wrote: > > On 12/21/2017 03:40 PM, Andy Shevchenko wrote: > > Replace sscanf() with mac_pton(). > > > > Signed-off-by: Andy Shevchenko > > Acked-by: Hauke Mehrtens > > The patch looks good, but I haven't tested them on my devices. Any news on this, anyone? > > --- > > - use negative condition to be consistent with the rest code > > drivers/firmware/broadcom/Kconfig | 1 + > > drivers/firmware/broadcom/bcm47xx_sprom.c | 18 +++--- > > 2 files changed, 4 insertions(+), 15 deletions(-) > > > > diff --git a/drivers/firmware/broadcom/Kconfig > > b/drivers/firmware/broadcom/Kconfig > > index 5e29f83e7b39..c041dcb7ea52 100644 > > --- a/drivers/firmware/broadcom/Kconfig > > +++ b/drivers/firmware/broadcom/Kconfig > > @@ -13,6 +13,7 @@ config BCM47XX_NVRAM > > config BCM47XX_SPROM > > bool "Broadcom SPROM driver" > > depends on BCM47XX_NVRAM > > + select GENERIC_NET_UTILS > > help > > Broadcom devices store configuration data in SPROM. > > Accessing it is > > specific to the bus host type, e.g. PCI(e) devices have > > it mapped in > > diff --git a/drivers/firmware/broadcom/bcm47xx_sprom.c > > b/drivers/firmware/broadcom/bcm47xx_sprom.c > > index 62aa3cf09b4d..4787f86c8ac1 100644 > > --- a/drivers/firmware/broadcom/bcm47xx_sprom.c > > +++ b/drivers/firmware/broadcom/bcm47xx_sprom.c > > @@ -137,20 +137,6 @@ static void nvram_read_leddc(const char > > *prefix, const char *name, > > *leddc_off_time = (val >> 16) & 0xff; > > } > > > > -static void bcm47xx_nvram_parse_macaddr(char *buf, u8 macaddr[6]) > > -{ > > - if (strchr(buf, ':')) > > - sscanf(buf, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx", > > &macaddr[0], > > - &macaddr[1], &macaddr[2], &macaddr[3], > > &macaddr[4], > > - &macaddr[5]); > > - else if (strchr(buf, '-')) > > - sscanf(buf, "%hhx-%hhx-%hhx-%hhx-%hhx-%hhx", > > &macaddr[0], > > - &macaddr[1], &macaddr[2], &macaddr[3], > > &macaddr[4], > > - &macaddr[5]); > > - else > > - pr_warn("Can not parse mac address: %s\n", buf); > > -} > > - > > static void nvram_read_macaddr(const char *prefix, const char > > *name, > >u8 val[6], bool fallback) > > { > > @@ -161,7 +147,9 @@ static void nvram_read_macaddr(const char > > *prefix, const char *name, > > if (err < 0) > > return; > > > > - bcm47xx_nvram_parse_macaddr(buf, val); > > + strreplace(buf, '-', ':'); > > + if (!mac_pton(buf, val)) > > + pr_warn("Can not parse mac address: %s\n", buf); > > } > > > > static void nvram_read_alpha2(const char *prefix, const char *name, > > -- Andy Shevchenko Intel Finland Oy
Re: [RFC v2 net-next 06/10] net/sched: Introduce the TBS Qdisc
One more comment: Probably try to run a test with a very small delta with no offload (probably using something like prio as the root qdisc) and dump the stats. My gut feeling is your accounting of the backlog in particular is off. cheers, jamal On 18-01-18 08:35 AM, Jamal Hadi Salim wrote: On 18-01-17 06:06 PM, Jesus Sanchez-Palencia wrote: From: Vinicius Costa Gomes TBS (Time Based Scheduler) uses the information added earlier in this series (the socket option SO_TXTIME and the new role of sk_buff->tstamp) to schedule traffic transmission based on absolute time. For some workloads, just bandwidth enforcement is not enough, and precise control of the transmission of packets is necessary. Example: $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \ handle 100:0 ? map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0 $ tc qdisc add dev enp2s0 parent 100:1 tbs delta 6 clockid 11 offload 1 In this example, the Qdisc will try to enable offloading (offload 1) the control of the transmission time to the network adapter, the time stamp in socket are in reference to the clockid '11' (CLOCK_TAI) and packets leave the Qdisc "delta" (6) nanoseconds before its transmission time. When offloading is disabled, the network adapter will ignore the sk_buff time stamp, and so, the transmission time will be only "best effort" from the Qdisc. General comments: 1) iproute2: Avoid magic numbers like 1 or 11 please; "offload" (without 1) and "TAI" will be more human friendly. 2) Experience shows that adding padding fields in the control structs implies they will _never ever_ be used. That was not design intent for netlink but over years shit like that has happened. Maybe look at using a 32 bitmap? It is more "future proof". You seem to only have 2-3 flags but it gives you opportunity to add more changes later. If you are 100% sure youll never need it - then maybe just move the tc_tbs_qopt::offload to the end of of the struct. 3)It would be helpful for debugging to increment some stats other than drop counters on enqueu/dequeue obsolete packet drop. Maybe use overlimits for the dequeu drops (in addition)? 4) I may be misreading things - but did you need to reset the watchdog on dequeue? It is already being kicked for every incoming packet. cheers, jamal
Re: [PATCH] [net-next] net: sched: avoid uninitialized variable use
Thu, Jan 18, 2018 at 02:17:28PM CET, a...@arndb.de wrote: >gcc has identified a code path in which we pass uninitialized >data into tc_dump_tfilter(): > >net/sched/cls_api.c: In function 'tc_dump_tfilter': >net/sched/cls_api.c:1268:8: error: 'parent' may be used uninitialized in this >function [-Werror=maybe-uninitialized] > >This initializes the variable to the value it had before the previous >change. > >Fixes: 7960d1daf278 ("net: sched: use block index as a handle instead of qdisc >when block is shared") >Signed-off-by: Arnd Bergmann > >I don't know if my patch is the best way to address the issue, but >if not, then at least it helps show what the warning is about >and lets someone else come up with a better solution. I already sent a fix for this: http://patchwork.ozlabs.org/patch/862787/
Re: [PATCH 0/3] Check gso_size of packets when forwarding
Daniel Axtens writes: > Jason Wang writes: > >> On 2018年01月18日 16:28, Pravin Shelar wrote: >>> On Mon, Jan 15, 2018 at 6:09 PM, Daniel Axtens wrote: When regular packets are forwarded, we validate their size against the MTU of the destination device. However, when GSO packets are forwarded, we do not validate their size against the MTU. We implicitly assume that when they are segmented, the resultant packets will be correctly sized. This is not always the case. We observed a case where a packet received on an ibmveth device had a GSO size of around 10kB. This was forwarded by Open vSwitch to a bnx2x device, where it caused a firmware assert. This is described in detail at [0] and was the genesis of this series. Rather than fixing it in the driver, this series fixes the forwarding path. >>> Are there any other possible forwarding path in networking stack? TC >>> is one subsystem that could forward such a packet to the bnx2x device, >>> how is that handled ? >> >> Yes, so it looks to me we should do the check in e.g netif_needs_gso() >> before passing it to hardware. And bnx2x needs to set its gso_max_size >> correctly. > > I don't think gso_max_size is quite the same. If I understand > net/ipv4/tcp.c correctly, gso_max_size is supposed to cap the total > length of the skb, not the length of each segmented packet. The problem > for the bnx2x card is the length of the segment, not the overall length. > >> >> Btw, looks like this could be triggered from a guest which is a DOS. We >> need request a CVE for this. >> > > You are correct about how this can be triggered: in fact it came to my > attention because a network packet from one LPAR (PowerVM virtual > machine) brought down the card attached to a different LPAR. It didn't > occur to me that it was potentially a security issue. I am talking with > the security team at Canonical regarding this. I have requested a CVE from the Distributed Weakness Filing. Regards, Daniel > > Regards, > Daniel > >> Thanks >> >>> To fix this: - Move a helper in patch 1. - Validate GSO segment lengths in is_skb_forwardable() in the GSO case, rather than assuming all will be well. This fixes bridges. This is patch 2. - Open vSwitch uses its own slightly specialised algorithm for checking lengths. Wire up checking for that in patch 3. [0]: https://patchwork.ozlabs.org/patch/859410/ Cc: manish.cho...@cavium.com Cc: d...@openvswitch.org Daniel Axtens (3): net: move skb_gso_mac_seglen to skbuff.h net: is_skb_forwardable: validate length of GSO packet segments openvswitch: drop GSO packets that are too large include/linux/skbuff.h | 16 net/core/dev.c | 7 --- net/core/skbuff.c | 34 ++ net/openvswitch/vport.c | 37 ++--- net/sched/sch_tbf.c | 10 -- 5 files changed, 84 insertions(+), 20 deletions(-) -- 2.14.1
[PATCH iproute2 v2 0/9] ip/tunnel: Improve tunnel parameters printing
Continue improvements to tunnel modules. Final goal is to make merge IP and IPv6 variants and make that transition as transparent as possible. Everything within this series is open for your comments, suggestions and criticism. See individual patch description message for details. v2 1) Fix checkpatch issues: no assignment in condtional anymore. 2) Better code abstraction: introduce tnl_print_encap() helper and get rid of duplicated code for printing tunnel encapsulation options. Patch with subject "ip/tunnel: Abstract tunnel encapsulation options printing" replaces two previous in v1 series "ip/tunnel: Use print_string() and simplify encap option printing" and "ip/tunnel: Minor cleanups in print routines" 3) Include patch with subject "tunnel: Return constant string without copying it" in the series: it is related to tunneling code too. Thanks, Serhii Serhey Popovych (9): iplink: Use ll_index_to_name() instead of if_indextoname() ip/tunnel: Correct and unify ttl/hoplimit printing ip/tunnel: Simplify and unify tos printing ip/tunnel: Use print_0xhex() instead of print_string() ip/tunnel: Abstract tunnel encapsulation options printing gre/tunnel: Print erspan_index using print_uint() vti/tunnel: Unify ikey/okey printing vti6/tunnel: Unify and simplify link type help functions tunnel: Return constant string without copying it bridge/fdb.c |5 +- bridge/link.c | 19 +++ ip/ip6tunnel.c|5 +- ip/iplink_bond.c | 38 ++ ip/iplink_geneve.c| 38 ++ ip/iplink_vxlan.c | 49 -- ip/iproute_lwtunnel.c | 11 ++--- ip/iptunnel.c |2 +- ip/link_gre.c | 132 +++-- ip/link_gre6.c| 103 +- ip/link_ip6tnl.c | 105 ++- ip/link_iptnl.c | 128 +++ ip/link_vti.c | 42 ip/link_vti6.c| 64 +++- ip/tunnel.c | 116 --- ip/tunnel.h |5 ++ 16 files changed, 340 insertions(+), 522 deletions(-) -- 1.7.10.4
Re: dangers of bots on the mailing lists was Re: divide error in ___bpf_prog_run
On Thu, Jan 18, 2018 at 02:01:28PM +0100, Dmitry Vyukov wrote: > On Thu, Jan 18, 2018 at 2:09 AM, Theodore Ts'o wrote: > > On Wed, Jan 17, 2018 at 04:21:13PM -0800, Alexei Starovoitov wrote: > >> > >> If syzkaller can only test one tree than linux-next should be the one. > > > > Well, there's been some controversy about that. The problem is that > > it's often not clear if this is long-standing bug, or a bug which is > > in a particular subsystem tree --- and if so, *which* subsystem tree, > > etc. So it gets blasted to linux-kernel, and to get_maintainer.pl, > > which is often not accurate --- since the location of the crash > > doesn't necessarily point out where the problem originated, and hence > > who should look at the syzbot report. And so this has caused > > some irritation. > > > Re set of tested trees. > > We now have an interesting spectrum of opinions. > > Some assorted thoughts on this: > > 1. First, "upstream is clean" won't happen any time soon. There are > several reasons for this: > - Currently syzkaller only tests a subset of subsystems that it knows > how to test, even the ones that it tests it tests poorly. Over time > it's improved to test most subsystems and existing subsystems better. > Just few weeks ago I've added some descriptions for crypto subsystem > and it uncovered 20+ old bugs. > - syzkaller is guided, genetic fuzzer over time it leans how to do > more complex things by small steps. It takes time. > - We have more bug detection tools coming: LEAKCHECK, KMSAN (uninit > memory), KTSAN (data races). > - generic syzkaller smartness will be improved over time. > - it will get more CPU resources. > Effect of all of these things is multiplicative: we test more code, > smarter, with more bug-detection tools, with more resources. So I > think we need to plan for a mix of old and new bugs for foreseeable > future. That's fine, but when you test Linus's tree, we "know" you are hitting something that really is an issue, and it's not due to linux-next oddities. When I see a linux-next report, and it looks "odd", my default reaction is "ugh, must be a crazy patch in some other subsystem, I _know_ my code in linux-next is just fine." :) > 2. get_maintainer.pl and mix of old and new bugs was mentioned as > harming attribution. I don't see what will change when/if we test only > upstream. Then the same mix of old/new bugs will be detected just on > upstream, with all of the same problems for old/new, maintainers, > which subsystem, etc. I think the amount of bugs in the kernel is > significant part of the problem, but the exact boundary where we > decide to start killing them won't affect number of bugs. I don't worry about that, the traceback should tell you a lot, and even when that is wrong (i.e. warnings thrown up by sysfs core calls that are obviously not a sysfs issue, but rather a subsystem issue), it's easy to see. > 3. If we test only upstream, we increase chances of new security bugs > sinking into releases. We sure could raise perceived security value of > the bugs by keeping them private, letting them sink into release, > letting them sink into distros, and then reporting a high-profile > vulnerability. I think that's wrong. There is something broken with > value measuring in security community. Bug that is killed before > sinking into any release is the highest impact thing. As Alexei noted, > fixing bugs es early as possible also reduces fix costs, backporting > burden, etc. This also can eliminate need in bisection in some cases, > say if you accepted a large change to some files and a bunch of > crashes appears for these files on your tree soon, it's obvious what > happens. I agree, this is an issue, but I think you have a lot of "low hanging fruit" in Linus's tree left to find. Testing linux-next is great, but the odds of something "new" being added there for your type of testing right now is usually pretty low, right? thanks, greg k-h
[PATCH iproute2 v2 8/9] vti6/tunnel: Unify and simplify link type help functions
Both of these two changes are missing for link_vti6.c: commit 8b47135474cd ("ip: link: Unify link type help functions a bit") commit 561e650eff67 ("ip link: Shortify printing the usage of link type") Replay them on link_vti6.c to bring link type help functions inline with other tunneling code. Signed-off-by: Serhey Popovych --- ip/link_vti6.c | 32 ++-- 1 file changed, 22 insertions(+), 10 deletions(-) diff --git a/ip/link_vti6.c b/ip/link_vti6.c index 6b61e3d..2a86d59 100644 --- a/ip/link_vti6.c +++ b/ip/link_vti6.c @@ -24,20 +24,25 @@ #include "ip_common.h" #include "tunnel.h" +static void print_usage(FILE *f) +{ + fprintf(f, + "Usage: ... vti6 [ remote ADDR ]\n" + "[ local ADDR ]\n" + "[ [i|o]key KEY ]\n" + "[ dev PHYS_DEV ]\n" + "[ fwmark MARK ]\n" + "\n" + "Where: ADDR := { IPV6_ADDRESS }\n" + " KEY := { DOTTED_QUAD | NUMBER }\n" + " MARK := { 0x0..0x }\n" + ); +} static void usage(void) __attribute__((noreturn)); static void usage(void) { - fprintf(stderr, "Usage: ip link { add | set | change | replace | del } NAME\n"); - fprintf(stderr, " type { vti6 } [ remote ADDR ] [ local ADDR ]\n"); - fprintf(stderr, " [ [i|o]key KEY ]\n"); - fprintf(stderr, " [ dev PHYS_DEV ]\n"); - fprintf(stderr, " [ fwmark MARK ]\n"); - fprintf(stderr, "\n"); - fprintf(stderr, "Where: NAME := STRING\n"); - fprintf(stderr, " ADDR := { IPV6_ADDRESS }\n"); - fprintf(stderr, " KEY := { DOTTED_QUAD | NUMBER }\n"); - fprintf(stderr, " MARK := { 0x0..0x }\n"); + print_usage(stderr); exit(-1); } @@ -224,9 +229,16 @@ static void vti6_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[]) } } +static void vti6_print_help(struct link_util *lu, int argc, char **argv, + FILE *f) +{ + print_usage(f); +} + struct link_util vti6_link_util = { .id = "vti6", .maxattr = IFLA_VTI_MAX, .parse_opt = vti6_parse_opt, .print_opt = vti6_print_opt, + .print_help = vti6_print_help, }; -- 1.7.10.4