Re: [PATCH net v2] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
Thu, May 28, 2015 at 05:35:05PM CEST, john.fastab...@gmail.com wrote:
On 05/28/2015 02:42 AM, Jiri Pirko wrote:
Mon, May 18, 2015 at 10:19:16PM CEST, da...@davemloft.net wrote:
From: Roopa Prabhu ro...@cumulusnetworks.com
Date: Sun, 17 May 2015 16:42:05 -0700

On most systems where you can offload routes to hardware, doing routing in software is not an option (the cpu limitations make routing impossible in software).

You absolutely do not get to determine this policy, none of us do. What matters is that by default the damn switch device being there is 100% transparent to the user. And the way to achieve that default is to do software routes as a fallback. I am not going to entertain changes of this nature which fail route loading by default just because we've exceeded a device's HW capacity to offload. I thought I was _really_ clear about this at netdev 0.1

I certainly agree that by default, transparency 1:1 sw:hw mapping is what we need for fib. The current code is a good start!

I see a couple of issues regarding switchdev_fib_ipv4_abort:

1) If user adds an entry and switchdev_fib_ipv4_add fails, abort is executed and an error is returned. I would expect that the route entry should be added in this case. The next attempt of adding the same entry will be successful. The current behaviour breaks the transparency you are referring to.
2) When switchdev_fib_ipv4_abort happens to be executed, the offload is disabled for good (until reboot). That is certainly not nice, although I understand that is the easiest solution for now.

I believe that we all agree that the 1:1 transparency, although it is a default, may not be optimal for real-life usage. HW resources are limited and user does not know them. The danger of hitting _abort and screwing-up the whole system is huge, unacceptable.

So here, there are a couple of more or less simple things that I suggest to do in order to move a little bit forward:

1) Introduce system-wide option to switch _abort to just plain fail. When HW does not have capacity, do not flush and fallback to sw, but rather just fail to add the entry. This would not break anything. Userspace has to be prepared that entry add could fail.
2) Introduce a way to propagate resources to userspace. Driver knows about resources used/available/potentially_available. Switchdev infra could be extended in order to propagate the info to the user.

I currently use the FlowAPI work I presented at netdev conference for this. Perhaps I was a bit reaching by trying to also push it as a replacement for the ethtool flow classification mechanism all in one shot. For what it is worth, replacing the 'ethtool' flow classifier when I have a pipeline of tables in a NIC is really my first use case for the 'set' operations, but that is off-topic probably.

The benefit I see of using this interface (or if you want, rename it and push it into a different netlink type) is that it gives you the entire view of the switch resources and pipeline from a single interface. Also, because you are talking about system-wide behaviour above, it nicely rolls up into user space software where we can act on it with the flags we have for l2 already and, if we pursue your option (3), also l3.

I like the single interface vs. scattering the information across many different interfaces; this way we can do it once and be done with it. If you scatter it across all the interfaces (just l2, l3 for now, but we will get more), then each interface will have its own mechanism and I have no idea where you put global information such as table ordering.

I think that for fib capacities/capabilities, user should be able to use extended existing Netlink interface. Not some parallel one.

I'm still not convinced that user should care about the actual hw pipeline. We already have a pipeline in kernel. Switch drivers should just do mapping, easy as that.
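To make the two behaviours discussed above concrete, here is a toy C model. It is only a sketch and not kernel code: `struct fib_model`, `enum hw_full_policy` and `fib_model_add()` are hypothetical names invented for illustration, and "abort" is modelled the way the thread describes the intended default (flush hardware, disable offload, keep routing in software).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical policy switch; models the proposed system-wide option. */
enum hw_full_policy {
	POLICY_ABORT,	/* flush HW, disable offload, fall back to SW */
	POLICY_FAIL	/* reject the new entry, leave everything else alone */
};

/* Hypothetical toy state; counts routes instead of storing them. */
struct fib_model {
	int sw_routes;
	int hw_routes;
	int hw_capacity;
	bool offload_disabled;
	enum hw_full_policy policy;
};

/* Add one route to the model. Returns 0 on success, -ENOSPC on plain fail. */
static int fib_model_add(struct fib_model *f)
{
	bool want_hw = !f->offload_disabled;

	if (want_hw && f->hw_routes >= f->hw_capacity) {
		if (f->policy == POLICY_FAIL)
			return -ENOSPC;	/* no state is changed */

		/* POLICY_ABORT: every offloaded route leaves the hardware */
		f->hw_routes = 0;
		f->offload_disabled = true;
		want_hw = false;
	}

	f->sw_routes++;
	if (want_hw)
		f->hw_routes++;
	return 0;
}
```

With a capacity of one, the second add under POLICY_ABORT succeeds but empties the hardware table for good, while under POLICY_FAIL it returns -ENOSPC and the first route stays offloaded.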
-- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net-next] bpf: allow BPF programs access skb->skb_iif and skb->dev->ifindex fields
On 05/29/2015 05:43 AM, David Miller wrote:
From: Alexei Starovoitov a...@plumgrid.com
Date: Thu, 28 May 2015 20:40:44 -0700

...
btw, Daniel, your ack appeared on the mailing list, but didn't make it into patchwork...
Same issue as I was having. I noticed when I send email via gmail web interface it always appears as separate thread and doesn't register in patchwork. So now I've switched to mutt and thunderbird only :)

patchwork lost 22 hours of list traffic, but it should be functioning normally now

Ok, great. So, let's try once more. :)

Acked-by: Daniel Borkmann dan...@iogearbox.net
Re: [PATCH net v2] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
Thu, May 21, 2015 at 07:46:54AM CEST, sfel...@gmail.com wrote:
On Tue, May 19, 2015 at 1:28 PM, David Miller da...@davemloft.net wrote:
From: Andy Gospodarek go...@cumulusnetworks.com
Date: Tue, 19 May 2015 15:47:32 -0400

Are you actually saying that if users complain loudly enough about the current behavior (not the change Roopa has proposed) that you would be open to considering a change to the current behavior?

I am saying that we have a contract with users not to break existing behavior. Full stop.

After rehearing David's argument, we should probably explore option d) which is a refinement on the fib_offload_disable mechanism we have today. fib_offload_disable is global for all routes. Once we hit a HW install problem, the global flag is set and all routes fallback to SW. We did this because we can't allow the failed route to exist in SW and not in HW because it could mess up LPM searches (HW could hit on a lesser prefix even when SW has the true LPM, because HW gets first shot at match).

The refinement on fib_offload_disable is this: make it per-related-prefix rather than global, and on a HW install problem, set the flag for the related-prefix and uninstall only those routes from HW. Related-prefix (is there a correct term for this?) are routes to the same dst addr but with different prefix lengths. I haven't parsed the fib_trie structure to see how routes are organized, but I suspect since it's optimized for lookup the related-prefix tracking is already there and we can build on that.

This looks interesting. However, I'm not sure that it is acceptable for the user to experience this hw eviction of random entries. User knows what entries are essential to have in hw. With your solution, I can see no way user can actually say what should be offloaded or not. Kernel just automagically decides.
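The LPM concern raised above (hardware matching a shorter prefix because the longer one failed to install) can be shown with a small C sketch. This is an illustration under stated assumptions, not the kernel's fib_trie: `struct route`, `mask_of()` and `lpm_lookup()` are hypothetical names, and the lookup is a plain linear scan.

```c
#include <stdint.h>

/* Hypothetical, simplified route entry; not a kernel structure. */
struct route {
	uint32_t prefix;	/* network address, host byte order */
	int len;		/* prefix length, 0..32 */
	int nexthop;		/* opaque next-hop id */
};

/* Netmask for a prefix length; len == 0 is special-cased to avoid a 32-bit shift. */
static uint32_t mask_of(int len)
{
	return len == 0 ? 0 : ~(uint32_t)0 << (32 - len);
}

/* Linear longest-prefix match. Returns the next-hop id, or -1 if nothing matches. */
static int lpm_lookup(const struct route *tbl, int n, uint32_t dst)
{
	int best_len = -1;
	int best_nh = -1;
	int i;

	for (i = 0; i < n; i++) {
		if ((dst & mask_of(tbl[i].len)) == tbl[i].prefix &&
		    tbl[i].len > best_len) {
			best_len = tbl[i].len;
			best_nh = tbl[i].nexthop;
		}
	}
	return best_nh;
}
```

If the software table holds 10.0.0.0/8 (next hop 1) and 10.1.0.0/16 (next hop 2) but the hardware table holds only the /8, a lookup of 10.1.2.3 yields next hop 2 in software and next hop 1 in hardware. Since hardware gets the first shot at the match, the packet takes the wrong route.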
Re: [PATCH net v2] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
Thu, May 28, 2015 at 05:40:11PM CEST, sfel...@gmail.com wrote:
On Thu, May 28, 2015 at 2:42 AM, Jiri Pirko j...@resnulli.us wrote:
Mon, May 18, 2015 at 10:19:16PM CEST, da...@davemloft.net wrote:
From: Roopa Prabhu ro...@cumulusnetworks.com
Date: Sun, 17 May 2015 16:42:05 -0700

On most systems where you can offload routes to hardware, doing routing in software is not an option (the cpu limitations make routing impossible in software).

You absolutely do not get to determine this policy, none of us do. What matters is that by default the damn switch device being there is 100% transparent to the user. And the way to achieve that default is to do software routes as a fallback. I am not going to entertain changes of this nature which fail route loading by default just because we've exceeded a device's HW capacity to offload. I thought I was _really_ clear about this at netdev 0.1

I certainly agree that by default, transparency 1:1 sw:hw mapping is what we need for fib. The current code is a good start!

I see a couple of issues regarding switchdev_fib_ipv4_abort:

1) If user adds an entry and switchdev_fib_ipv4_add fails, abort is executed and an error is returned. I would expect that the route entry should be added in this case. The next attempt of adding the same entry will be successful. The current behaviour breaks the transparency you are referring to.
2) When switchdev_fib_ipv4_abort happens to be executed, the offload is disabled for good (until reboot). That is certainly not nice, although I understand that is the easiest solution for now.

I believe that we all agree that the 1:1 transparency, although it is a default, may not be optimal for real-life usage. HW resources are limited and user does not know them. The danger of hitting _abort and screwing-up the whole system is huge, unacceptable.

So here, there are a couple of more or less simple things that I suggest to do in order to move a little bit forward:

1) Introduce system-wide option to switch _abort to just plain fail. When HW does not have capacity, do not flush and fallback to sw, but rather just fail to add the entry. This would not break anything. Userspace has to be prepared that entry add could fail.
2) Introduce a way to propagate resources to userspace. Driver knows about resources used/available/potentially_available. Switchdev infra could be extended in order to propagate the info to the user.
3) Introduce a couple of flags for entry add that would alter the default behaviour. Something like:
NLM_F_SKIP_KERNEL
NLM_F_SKIP_OFFLOAD
Again, this does not break the current users. On the other hand, this gives new users a leverage to instruct kernel where the entry should be added to (or not added to).

Any thoughts? Objections?

I don't like these. Breaks transparency and forces the user in a position of having to know hardware failure modes (unique to each hardware device).

Can you please elaborate on this a bit more? I fail to see transparency breaking in my proposal :/ Maybe it is by different understanding of the term? Also I do not understand the remark about user having to know hardware failure modes. Could you please explain?

I presented an option d) which avoids these issues; was it not understood?

I just commented on option d) in the other email.
Re: Possible issue in iproute2 package
Hi Jose,

On Thu, May 28, 2015 at 09:12:15PM +, Guzman Mosqueda, Jose R wrote:

Hi all

I'm Jose Guzman from a security team at Intel. We're using iproute2 in a GNU-Linux project and I'm analyzing the code to try to find possible issues/gaps/risks. Since I'm not too familiar with the package yet I have a question about a particular piece of code that could result in a memory corruption:

Version: 4.0.0
File: misc/ss.c
Function: static void tcp_show_info(...)
Line: ~1903

Description: There is a memory allocation for the s.cong_alg variable:

s.cong_alg = malloc(strlen(cong_attr + 1));

The length is calculated from the position after the starting character. But the next line copies the whole content:

strcpy(s.cong_alg, cong_attr);

I think there is a mistake and it should be something like:

s.cong_alg = malloc(strlen(cong_attr) + 1);

I think strdup can be used here. I will send a patch.

Thank You!
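The size difference between the two expressions quoted above can be checked with a short C sketch. The helper names `buggy_alloc_size()`, `correct_alloc_size()` and `copy_name()` are hypothetical and exist only for this illustration; they are not part of iproute2.

```c
#include <stdlib.h>
#include <string.h>

/* Bytes requested by malloc(strlen(s + 1)). Only meaningful for a
 * non-empty s; for an empty string s + 1 points past the terminator. */
static size_t buggy_alloc_size(const char *s)
{
	return strlen(s + 1);
}

/* Bytes actually needed by strcpy(): all characters plus the NUL. */
static size_t correct_alloc_size(const char *s)
{
	return strlen(s) + 1;
}

/* Allocate-and-copy with the correct size; this is what strdup() does. */
static char *copy_name(const char *s)
{
	size_t n = correct_alloc_size(s);
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}
```

For a name such as "cubic" the buggy expression requests 4 bytes while strcpy() writes 6, so the copy overruns the buffer by two bytes. Replacing the malloc()/strcpy() pair with strdup(), as suggested in the reply, removes the manual size calculation altogether.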
Re: [PATCH 2/4] netfilter: default CONFIG_NETFILTER_INGRESS to y
On Friday 2015-05-29 01:44, Pablo Neira Ayuso wrote:

Useful to compile-test all options.

--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -3,6 +3,7 @@ menu "Core Netfilter Configuration"
 config NETFILTER_INGRESS
 	bool "Netfilter ingress support"
+	default y
 	select NET_INGRESS
 	help
 	  This allows you to classify packets from ingress using the Netfilter

Careful with default y. I seem to remember that someone higher up (perhaps Linus himself) was against default y for features deemed not essential (especially hardware drivers), as no driver is any more important than another. If compile-test is your reason for the patch, it might fall into the same category.
RE: [Xen-devel] [RFC PATCH 00/13] Persistent grant maps for xen net drivers
Hi,

About rx zerocopy, I have a question: if some application makes a socket, then listens and accepts, and the client sends packets to it, but the application doesn't recv from this socket right now, all persistent grant pages would be in use. So other applications cannot receive any packets. Is my guess right or wrong?

YuZhou

-Original Message-
From: xen-devel-boun...@lists.xen.org [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Joao Martins
Sent: Friday, May 22, 2015 6:27 PM
To: Wei Liu
Cc: ian.campb...@citrix.com; netdev@vger.kernel.org; david.vra...@citrix.com; xen-de...@lists.xenproject.org; boris.ostrov...@oracle.com
Subject: Re: [Xen-devel] [RFC PATCH 00/13] Persistent grant maps for xen net drivers

On 19 May 2015, at 17:39, Wei Liu wei.l...@citrix.com wrote:
On Tue, May 12, 2015 at 07:18:24PM +0200, Joao Martins wrote:

There have been recently[3] some discussions and issues raised on persistent grants for the block layer, though the numbers above show some significant improvements especially on more network intensive workloads and provide a margin for comparison against future map/unmap improvements.

Any comments or suggestions are welcome,
Thanks!

Thanks, the numbers certainly look interesting. I'm just a bit concerned about the complexity of netback. I've commented on individual patches, we can discuss the issues there.

Thanks a lot for the review! It does add more complexity, mainly for the TX path, but I also would like to mention that a portion of this changeset is also the persistent grants ops that could potentially live outside.

Joao

[1] http://article.gmane.org/gmane.linux.network/249383
[2] http://bit.ly/1IhJfXD
[3] http://lists.xen.org/archives/html/xen-devel/2015-02/msg02292.html

Joao Martins (13):
  xen-netback: add persistent grant tree ops
  xen-netback: xenbus feature persistent support
  xen-netback: implement TX persistent grants
  xen-netback: implement RX persistent grants
  xen-netback: refactor xenvif_rx_action
  xen-netback: copy buffer on xenvif_start_xmit()
  xen-netback: add persistent tree counters to debugfs
  xen-netback: clone skb if skb->xmit_more is set
  xen-netfront: move grant_{ref,page} to struct grant
  xen-netfront: refactor claim/release grant
  xen-netfront: feature-persistent xenbus support
  xen-netfront: implement TX persistent grants
  xen-netfront: implement RX persistent grants

 drivers/net/xen-netback/common.h    |  79
 drivers/net/xen-netback/interface.c |  78 +++-
 drivers/net/xen-netback/netback.c   | 873 ++--
 drivers/net/xen-netback/xenbus.c    |  24 +
 drivers/net/xen-netfront.c          | 362 ---
 5 files changed, 1216 insertions(+), 200 deletions(-)

-- 
2.1.3
pull-request: mac80211-next 2015-05-29
Hi Dave,

It's been a while since I sent anything for -next, but with the merge window getting closer I wanted to send a few more things, mostly fixes.

Of course I expect that tomorrow somebody will send an important fix, but hey :-)

johannes

The following changes since commit 658358cec93a7130615cfc1d6843ab07e49625e6:

  mac80211: fix throughput LED trigger (2015-05-11 19:16:04 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next.git tags/mac80211-next-for-davem-2015-05-29

for you to fetch changes up to f7959e9c73200f2ae361d0d311aa501f2c6a05c7:

  net: rfkill: gpio: make better use of gpiod API (2015-05-29 13:13:45 +0200)

As we get closer to the merge window, here are a few more things for -next:
 * disconnect TDLS stations on CSA to avoid issues
 * fix a memory leak introduced in a recent commit
 * switch rfkill and cfg80211 to PM ops
 * in an unlikely scenario, prevent a bookkeeping value to get corrupted leading to dropped packets
 * fix a crash in VLAN assignment
 * switch rfkill-gpio to more modern gpiod API
 * send disconnected event to userspace with proper local/remote indication

Arik Nemtsov (1):
      mac80211: disconnect TDLS stations on STA CSA

Johannes Berg (3):
      mac80211: fix memory leak
      mac80211: add missing drv_priv description for TXQ struct
      cfg80211: properly send NL80211_ATTR_DISCONNECTED_BY_AP in disconnect

Lars-Peter Clausen (2):
      net: rfkill: Switch to PM ops
      cfg80211: Switch to PM ops

Michal Kazior (3):
      mac80211: check fast-xmit on station change
      mac80211: prevent possible crypto tx tailroom corruption
      cfg80211: ignore netif running state when changing iftype

Uwe Kleine-König (1):
      net: rfkill: gpio: make better use of gpiod API

 drivers/net/wireless/ath/ath6kl/cfg80211.c         |  4 ++--
 drivers/net/wireless/ath/wil6210/main.c            |  2 +-
 drivers/net/wireless/brcm80211/brcmfmac/cfg80211.c |  4 ++--
 drivers/net/wireless/libertas/cfg.c                | 13 +--
 drivers/net/wireless/libertas/cfg.h                |  3 ++-
 drivers/net/wireless/libertas/cmd.h                |  3 ++-
 drivers/net/wireless/libertas/cmdresp.c            | 13 ++-
 drivers/net/wireless/mwifiex/join.c                |  2 +-
 drivers/net/wireless/mwifiex/sta_event.c           |  2 +-
 drivers/net/wireless/rndis_wlan.c                  |  2 +-
 drivers/staging/rtl8723au/os_dep/ioctl_cfg80211.c  |  2 +-
 drivers/staging/wlan-ng/cfg80211.c                 |  2 +-
 include/net/cfg80211.h                             |  4 +++-
 include/net/mac80211.h                             |  1 +
 net/mac80211/cfg.c                                 |  1 +
 net/mac80211/main.c                                |  7 +-
 net/mac80211/mlme.c                                | 26 ++
 net/mac80211/tdls.c                                |  6 +
 net/rfkill/core.c                                  | 12 +++---
 net/rfkill/rfkill-gpio.c                           | 24 +---
 net/wireless/core.h                                |  1 +
 net/wireless/sme.c                                 |  4 +++-
 net/wireless/sysfs.c                               | 14 +++-
 net/wireless/util.c                                |  5 +++--
 24 files changed, 105 insertions(+), 52 deletions(-)
[PATCH net 1/1] sfc: free multiple Rx buffers when required
From: Daniel Pieczko dpiec...@solarflare.com

When Rx packet data must be dropped, all the buffers associated with that Rx packet must be freed. Extend and rename efx_free_rx_buffer() to efx_free_rx_buffers() and loop through all the fragments. By doing so this patch fixes a possible memory leak.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/rx.c | 42 +-
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c
index c0ad95d..809ea461 100644
--- a/drivers/net/ethernet/sfc/rx.c
+++ b/drivers/net/ethernet/sfc/rx.c
@@ -224,12 +224,17 @@ static void efx_unmap_rx_buffer(struct efx_nic *efx,
 	}
 }

-static void efx_free_rx_buffer(struct efx_rx_buffer *rx_buf)
+static void efx_free_rx_buffers(struct efx_rx_queue *rx_queue,
+				struct efx_rx_buffer *rx_buf,
+				unsigned int num_bufs)
 {
-	if (rx_buf->page) {
-		put_page(rx_buf->page);
-		rx_buf->page = NULL;
-	}
+	do {
+		if (rx_buf->page) {
+			put_page(rx_buf->page);
+			rx_buf->page = NULL;
+		}
+		rx_buf = efx_rx_buf_next(rx_queue, rx_buf);
+	} while (--num_bufs);
 }

 /* Attempt to recycle the page if there is an RX recycle ring; the page can
@@ -278,7 +283,7 @@ static void efx_fini_rx_buffer(struct efx_rx_queue *rx_queue,
 	/* If this is the last buffer in a page, unmap and free it. */
 	if (rx_buf->flags & EFX_RX_BUF_LAST_IN_PAGE) {
 		efx_unmap_rx_buffer(rx_queue->efx, rx_buf);
-		efx_free_rx_buffer(rx_buf);
+		efx_free_rx_buffers(rx_queue, rx_buf, 1);
 	}
 	rx_buf->page = NULL;
 }
@@ -304,10 +309,7 @@ static void efx_discard_rx_packet(struct efx_channel *channel,

 	efx_recycle_rx_pages(channel, rx_buf, n_frags);

-	do {
-		efx_free_rx_buffer(rx_buf);
-		rx_buf = efx_rx_buf_next(rx_queue, rx_buf);
-	} while (--n_frags);
+	efx_free_rx_buffers(rx_queue, rx_buf, n_frags);
 }

 /**
@@ -431,11 +433,10 @@ efx_rx_packet_gro(struct efx_channel *channel, struct efx_rx_buffer *rx_buf,

 	skb = napi_get_frags(napi);
 	if (unlikely(!skb)) {
-		while (n_frags--) {
-			put_page(rx_buf->page);
-			rx_buf->page = NULL;
-			rx_buf = efx_rx_buf_next(&channel->rx_queue, rx_buf);
-		}
+		struct efx_rx_queue *rx_queue;
+
+		rx_queue = efx_channel_get_rx_queue(channel);
+		efx_free_rx_buffers(rx_queue, rx_buf, n_frags);
 		return;
 	}

@@ -622,7 +623,10 @@ static void efx_rx_deliver(struct efx_channel *channel, u8 *eh,

 	skb = efx_rx_mk_skb(channel, rx_buf, n_frags, eh, hdr_len);
 	if (unlikely(skb == NULL)) {
-		efx_free_rx_buffer(rx_buf);
+		struct efx_rx_queue *rx_queue;
+
+		rx_queue = efx_channel_get_rx_queue(channel);
+		efx_free_rx_buffers(rx_queue, rx_buf, n_frags);
 		return;
 	}
 	skb_record_rx_queue(skb, channel->rx_queue.core_index);
@@ -661,8 +665,12 @@ void __efx_rx_packet(struct efx_channel *channel)
 	 * loopback layer, and free the rx_buf here
 	 */
 	if (unlikely(efx->loopback_selftest)) {
+		struct efx_rx_queue *rx_queue;
+
 		efx_loopback_rx_packet(efx, eh, rx_buf->len);
-		efx_free_rx_buffer(rx_buf);
+		rx_queue = efx_channel_get_rx_queue(channel);
+		efx_free_rx_buffers(rx_queue, rx_buf,
+				    channel->rx_pkt_n_frags);
 		goto out;
 	}

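The pattern the patch introduces (walk every fragment of a packet through the ring and release each one) can be tried outside the driver with a stand-alone C sketch. Everything here is a hypothetical stand-in: `struct ring`, `struct buf`, `ring_next()` and `free_bufs()` are not sfc names, and a counter replaces put_page().

```c
#include <stddef.h>

#define RING_SIZE 8	/* hypothetical ring size for the sketch */

/* Stand-in for struct efx_rx_buffer: only the page pointer matters here. */
struct buf {
	void *page;
};

/* Stand-in for the RX queue: a fixed ring plus a release counter. */
struct ring {
	struct buf bufs[RING_SIZE];
	int freed;	/* counts releases in place of put_page() */
};

/* Next buffer in the ring, wrapping from the last slot to the first. */
static struct buf *ring_next(struct ring *r, struct buf *b)
{
	return b == &r->bufs[RING_SIZE - 1] ? &r->bufs[0] : b + 1;
}

/* Release n consecutive buffers starting at b. As with the do/while
 * loop in the patch, n must be at least 1. */
static void free_bufs(struct ring *r, struct buf *b, unsigned int n)
{
	do {
		if (b->page) {
			r->freed++;
			b->page = NULL;
		}
		b = ring_next(r, b);
	} while (--n);
}
```

A three-fragment packet that starts in slot 6 spans slots 6, 7 and 0. Calling free_bufs() with a count of three releases all of them, whereas releasing only the first buffer (the old efx_rx_deliver() and loopback paths) would leave two pages behind.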
[PATCH iproute2] ss: speedup resolve_service()
From: Eric Dumazet eduma...@google.com

Lets implement a full cache with proper hash table, memory got cheaper these days.

Before :

$ time ss -t | wc -l
529678

real	0m22.708s
user	0m19.591s
sys	0m2.969s

After :

$ time ss -t | wc -l
528291

real	0m5.078s
user	0m4.099s
sys	0m0.985s

Signed-off-by: Eric Dumazet eduma...@google.com
---
 misc/ss.c | 71 +++-
 1 file changed, 32 insertions(+), 39 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index 347e3a1..3f4b7f1 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -858,8 +858,7 @@ static const char *print_ms_timer(int timeout)
 	return buf;
 }

-struct scache
-{
+struct scache {
 	struct scache *next;
 	int port;
 	char *name;
@@ -949,11 +948,15 @@ static const char *__resolve_service(int port)
 	return NULL;
 }

+#define SCACHE_BUCKETS 1024
+static struct scache *cache_htab[SCACHE_BUCKETS];

 static const char *resolve_service(int port)
 {
 	static char buf[128];
-	static struct scache cache[256];
+	struct scache *c;
+	const char *res;
+	int hash;

 	if (port == 0) {
 		buf[0] = '*';
@@ -961,45 +964,35 @@ static const char *resolve_service(int port)
 		return buf;
 	}

-	if (resolve_services) {
-		if (dg_proto == RAW_PROTO) {
-			return inet_proto_n2a(port, buf, sizeof(buf));
-		} else {
-			struct scache *c;
-			const char *res;
-			int hash = (port^(((unsigned long)dg_proto)>>2))&255;
-
-			for (c = &cache[hash]; c; c = c->next) {
-				if (c->port == port &&
-				    c->proto == dg_proto) {
-					if (c->name)
-						return c->name;
-					goto do_numeric;
-				}
-			}
+	if (!resolve_services)
+		goto do_numeric;

-			if ((res = __resolve_service(port)) != NULL) {
-				if ((c = malloc(sizeof(*c))) == NULL)
-					goto do_numeric;
-			} else {
-				c = &cache[hash];
-				if (c->name)
-					free(c->name);
-			}
-			c->port = port;
-			c->name = NULL;
-			c->proto = dg_proto;
-			if (res) {
-				c->name = strdup(res);
-				c->next = cache[hash].next;
-				cache[hash].next = c;
-			}
-			if (c->name)
-				return c->name;
-		}
+	if (dg_proto == RAW_PROTO)
+		return inet_proto_n2a(port, buf, sizeof(buf));
+
+
+	hash = (port^(((unsigned long)dg_proto)>>2)) % SCACHE_BUCKETS;
+
+	for (c = cache_htab[hash]; c; c = c->next) {
+		if (c->port == port && c->proto == dg_proto)
+			goto do_cache;
 	}

-	do_numeric:
+	c = malloc(sizeof(*c));
+	if (!c)
+		goto do_numeric;
+	res = __resolve_service(port);
+	c->port = port;
+	c->name = res ? strdup(res) : NULL;
+	c->proto = dg_proto;
+	c->next = cache_htab[hash];
+	cache_htab[hash] = c;
+
+do_cache:
+	if (c->name)
+		return c->name;
+
+do_numeric:
 	sprintf(buf, "%u", port);
 	return buf;
 }
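The caching scheme in the patch (a chained hash table keyed by port and protocol, which also remembers failed lookups) can be sketched as a stand-alone C fragment. This is an illustration under assumptions, not the ss code: `struct entry`, `slow_resolve()`, `cached_resolve()`, the hash function and the "ssh" mapping are all hypothetical stand-ins.

```c
#include <stdlib.h>
#include <string.h>

#define BUCKETS 1024	/* same bucket count as the patch; any value works */

struct entry {
	struct entry *next;
	int port;
	int proto;
	char *name;	/* NULL means no name is known; that result is cached too */
};

static struct entry *htab[BUCKETS];
static int slow_lookups;	/* counts calls to the expensive resolver */

/* Hypothetical stand-in for an expensive resolver such as __resolve_service(). */
static const char *slow_resolve(int port)
{
	slow_lookups++;
	return port == 22 ? "ssh" : NULL;
}

/* Portable equivalent of strdup(). */
static char *copy_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/* Resolve through the cache; returns NULL when no name is known. */
static const char *cached_resolve(int port, int proto)
{
	unsigned int hash = ((unsigned int)port * 31u + (unsigned int)proto) % BUCKETS;
	struct entry *e;
	const char *res;

	for (e = htab[hash]; e; e = e->next)
		if (e->port == port && e->proto == proto)
			return e->name;

	e = malloc(sizeof(*e));
	if (!e)
		return slow_resolve(port);	/* no memory: answer without caching */

	res = slow_resolve(port);
	e->port = port;
	e->proto = proto;
	e->name = res ? copy_string(res) : NULL;
	e->next = htab[hash];
	htab[hash] = e;
	return e->name;
}
```

Each (port, proto) pair reaches the slow resolver at most once, including pairs that have no name. The old fixed-size array kept only one negative entry per bucket and overwrote it on each new miss, so repeated misses could keep going back to the slow resolver.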
[RFC][PATCH] x86: remove vmalloc.h from asm/io.h
Nothing in asm/io.h uses anything from vmalloc.h, so remove the include and fix up the build problems in an allmodconfig (64 bit and 32 bit) build.

This may be the place where x86 builds get vmalloc.h implicitly included and that tends to hide places where vmalloc() et al are added to files but the include of vmalloc.h is forgotten.

Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
Cc: x...@kernel.org
Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Cc: Boris Ostrovsky boris.ostrov...@oracle.com
Cc: David Vrabel david.vra...@citrix.com
Cc: Anton Vorontsov an...@enomsg.org
Cc: Colin Cross ccr...@android.com
Cc: Kees Cook keesc...@chromium.org
Cc: Tony Luck tony.l...@intel.com
Cc: Rafael J. Wysocki r...@rjwysocki.net
Cc: Len Brown l...@kernel.org
Cc: Kristen Carlson Accardi kris...@linux.intel.com
Cc: Viresh Kumar viresh.ku...@linaro.org
Cc: Vinod Koul vinod.k...@intel.com
Cc: K. Y. Srinivasan k...@microsoft.com
Cc: Haiyang Zhang haiya...@microsoft.com
Cc: Hiral Patel hiral...@cisco.com
Cc: Suma Ramars sram...@cisco.com
Cc: Brian Uchino buch...@cisco.com
Cc: James E.J. Bottomley jbottom...@odin.com
Cc: Jaroslav Kysela pe...@perex.cz
Cc: Takashi Iwai ti...@suse.de
Cc: Andrew Morton a...@linux-foundation.org
Suggested-by: David Miller da...@davemloft.net
Signed-off-by: Stephen Rothwell s...@canb.auug.org.au
---
Based on Linus' tree of today. There are probably more places that need vmalloc.h included, but this passes 64 bit and 32 bit allmodconfig builds, so is a place to start.

Dave Miller suggested that I start this journey.

 arch/x86/include/asm/io.h          | 2 --
 arch/x86/kernel/crash.c            | 1 +
 arch/x86/kernel/machine_kexec_64.c | 1 +
 arch/x86/mm/pageattr-test.c        | 1 +
 arch/x86/mm/pageattr.c             | 1 +
 arch/x86/xen/p2m.c                 | 1 +
 drivers/acpi/apei/erst.c           | 1 +
 drivers/cpufreq/intel_pstate.c     | 1 +
 drivers/dma/mic_x100_dma.c         | 1 +
 drivers/net/hyperv/netvsc.c        | 1 +
 drivers/net/hyperv/rndis_filter.c  | 1 +
 drivers/scsi/fnic/fnic_debugfs.c   | 1 +
 drivers/scsi/fnic/fnic_trace.c     | 1 +
 sound/pci/asihpi/hpioctl.c         | 1 +
 14 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 34a5b93704d3..5791e7ace9db 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -197,8 +197,6 @@ extern void set_iounmap_nonlazy(void);

 #include <asm-generic/iomap.h>

-#include <linux/vmalloc.h>
-
 /*
  * Convert a virtual cached pointer to an uncached pointer
  */
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index c76d3e37c6e1..e068d6683dba 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -22,6 +22,7 @@
 #include <linux/elfcore.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/vmalloc.h>

 #include <asm/processor.h>
 #include <asm/hardirq.h>
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 415480d3ea84..11546b462fa6 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -17,6 +17,7 @@
 #include <linux/ftrace.h>
 #include <linux/io.h>
 #include <linux/suspend.h>
+#include <linux/vmalloc.h>

 #include <asm/init.h>
 #include <asm/pgtable.h>
diff --git a/arch/x86/mm/pageattr-test.c b/arch/x86/mm/pageattr-test.c
index 6629f397b467..8ff686aa7e8c 100644
--- a/arch/x86/mm/pageattr-test.c
+++ b/arch/x86/mm/pageattr-test.c
@@ -9,6 +9,7 @@
 #include <linux/random.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <linux/vmalloc.h>

 #include <asm/cacheflush.h>
 #include <asm/pgtable.h>
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 89af288ec674..bedfc794b4ba 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -14,6 +14,7 @@
 #include <linux/percpu.h>
 #include <linux/gfp.h>
 #include <linux/pci.h>
+#include <linux/vmalloc.h>

 #include <asm/e820.h>
 #include <asm/processor.h>
diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index b47124d4cd67..8b7f18e200aa 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -67,6 +67,7 @@
 #include <linux/seq_file.h>
 #include <linux/bootmem.h>
 #include <linux/slab.h>
+#include <linux/vmalloc.h>

 #include <asm/cache.h>
 #include <asm/setup.h>
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index ed65e9c4b5b0..3670bbab57a3 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -35,6 +35,7 @@
 #include <linux/nmi.h>
 #include <linux/hardirq.h>
 #include <linux/pstore.h>
+#include <linux/vmalloc.h>
 #include <acpi/apei.h>

 #include "apei-internal.h"
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 6414661ac1c4..2ba53f4f6af2 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -26,6 +26,7 @@
 #include <linux/fs.h>
 #include <linux/debugfs.h>
 #include <linux/acpi.h>
+#include <linux/vmalloc.h>
 #include <trace/events/power.h>

 #include <asm/div64.h>
diff --git
[PATCH net-next 14/14] sfc: leak vports if a VF is assigned during PF unload
From: Daniel Pieczko dpiec...@solarflare.com

If any VF is assigned as the PF is unloaded, do not attempt to remove its vport or the vswitch. These will be removed if the driver binds to the PF again, as an entity reset occurs during probe.

A 'force' flag is added to efx_ef10_pci_sriov_disable() to distinguish between disabling SR-IOV and driver unload. SR-IOV cannot be disabled if VFs are assigned to guests.

If the PF driver is unloaded while VFs are assigned, the driver may try to bind to the VF again at a later point if the driver has been reloaded and the VF returns to the same domain as the PF. In this case, the PF will not have a VF data structure, so the VF can check this and drop out of probe early.

In this case, efx->vf_count will be zero but VFs will be present. The user is advised to remove the VF and re-create it. The check at the beginning of efx_ef10_pci_sriov_disable() that efx->vf_count is non-zero is removed to allow SR-IOV to be disabled in this case. Also, if the PF driver is unloaded, it will disable SR-IOV to remove these unknown VFs.

By not disabling bus-mastering if VFs are still assigned, the VF will continue to pass traffic after the PF has been removed.

When using the max_vfs module parameter, if VFs are already present do not try to initialise any more.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c       | 20 
 drivers/net/ethernet/sfc/ef10_sriov.c | 35 ---
 drivers/net/ethernet/sfc/ef10_sriov.h |  2 ++
 drivers/net/ethernet/sfc/efx.c        |  4 +++-
 4 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 859980d..07c645a 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -695,6 +695,24 @@ static int efx_ef10_probe_pf(struct efx_nic *efx)
 static int efx_ef10_probe_vf(struct efx_nic *efx)
 {
 	int rc;
+	struct pci_dev *pci_dev_pf;
+
+	/* If the parent PF has no VF data structure, it doesn't know about this
+	 * VF so fail probe. The VF needs to be re-created. This can happen
+	 * if the PF driver is unloaded while the VF is assigned to a guest.
+	 */
+	pci_dev_pf = efx->pci_dev->physfn;
+	if (pci_dev_pf) {
+		struct efx_nic *efx_pf = pci_get_drvdata(pci_dev_pf);
+		struct efx_ef10_nic_data *nic_data_pf = efx_pf->nic_data;
+
+		if (!nic_data_pf->vf) {
+			netif_info(efx, drv, efx->net_dev,
+				   "The VF cannot link to its parent PF; "
+				   "please destroy and re-create the VF\n");
+			return -EBUSY;
+		}
+	}

 	rc = efx_ef10_probe(efx);
 	if (rc)
@@ -712,6 +730,8 @@ static int efx_ef10_probe_vf(struct efx_nic *efx)
 			struct efx_ef10_nic_data *nic_data = efx->nic_data;

 			nic_data_p->vf[nic_data->vf_index].efx = efx;
+			nic_data_p->vf[nic_data->vf_index].pci_dev =
+				efx->pci_dev;
 		} else
 			netif_info(efx, drv, efx->net_dev,
 				   "Could not get the PF id from VF\n");
diff --git a/drivers/net/ethernet/sfc/ef10_sriov.c b/drivers/net/ethernet/sfc/ef10_sriov.c
index 41ab18d..6c9b6e4 100644
--- a/drivers/net/ethernet/sfc/ef10_sriov.c
+++ b/drivers/net/ethernet/sfc/ef10_sriov.c
@@ -165,6 +165,11 @@ static void efx_ef10_sriov_free_vf_vports(struct efx_nic *efx)
 	for (i = 0; i < efx->vf_count; i++) {
 		struct ef10_vf *vf = nic_data->vf + i;

+		/* If VF is assigned, do not free the vport */
+		if (vf->pci_dev &&
+		    vf->pci_dev->dev_flags & PCI_DEV_FLAGS_ASSIGNED)
+			continue;
+
 		if (vf->vport_assigned) {
 			efx_ef10_evb_port_assign(efx, EVB_PORT_ID_NULL, i);
 			vf->vport_assigned = 0;
@@ -380,7 +385,9 @@ void efx_ef10_vswitching_remove_pf(struct efx_nic *efx)
 	efx_ef10_vport_free(efx, nic_data->vport_id);
 	nic_data->vport_id = EVB_PORT_ID_ASSIGNED;

-	efx_ef10_vswitch_free(efx, nic_data->vport_id);
+	/* Only free the vswitch if no VFs are assigned */
+	if (!pci_vfs_assigned(efx->pci_dev))
+		efx_ef10_vswitch_free(efx, nic_data->vport_id);
 }

 void efx_ef10_vswitching_remove_vf(struct efx_nic *efx)
@@ -413,20 +420,22 @@ fail1:
 	return rc;
 }

-static int efx_ef10_pci_sriov_disable(struct efx_nic *efx)
+static int efx_ef10_pci_sriov_disable(struct efx_nic *efx, bool force)
 {
 	struct pci_dev *dev = efx->pci_dev;
+	unsigned int vfs_assigned = 0;

-	if (!efx->vf_count)
-		return 0;
+	vfs_assigned = pci_vfs_assigned(dev);

-	if (pci_vfs_assigned(dev)) {
-		netif_err(efx, drv,
[PATCH net-next 13/14] sfc: force removal of VF and vport on driver removal
From: Daniel Pieczko dpiec...@solarflare.com

When the driver unloads, force the unbind and removal of any VFs
in the host with the PF.  The PF cannot remove vports and
vswitches if they are still being used by a VF driver, and when
unloading the sfc driver the removal order is not guaranteed, so
the instruction from the PF to the VF to unbind enforces a
suitable ordering so that vswitches and vports can be removed.

As a result of this, manually unbinding the driver from a single
PF will result in all of its VFs in the host also being removed.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10_sriov.c | 9 +++++++++
 drivers/net/ethernet/sfc/efx.c        | 3 ++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sfc/ef10_sriov.c b/drivers/net/ethernet/sfc/ef10_sriov.c
index 083c534..41ab18d 100644
--- a/drivers/net/ethernet/sfc/ef10_sriov.c
+++ b/drivers/net/ethernet/sfc/ef10_sriov.c
@@ -448,11 +448,20 @@ int efx_ef10_sriov_init(struct efx_nic *efx)
 void efx_ef10_sriov_fini(struct efx_nic *efx)
 {
 	struct efx_ef10_nic_data *nic_data = efx->nic_data;
+	unsigned int i;
 	int rc;
 
 	if (!nic_data->vf)
 		return;
 
+	/* Remove any VFs in the host */
+	for (i = 0; i < efx->vf_count; ++i) {
+		struct efx_nic *vf_efx = nic_data->vf[i].efx;
+
+		if (vf_efx)
+			vf_efx->pci_dev->driver->remove(vf_efx->pci_dev);
+	}
+
 	rc = efx_ef10_pci_sriov_disable(efx);
 	if (rc)
 		netif_dbg(efx, drv, efx->net_dev,
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index fe3481c..6887871 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -2867,7 +2867,8 @@ static void efx_pci_remove_main(struct efx_nic *efx)
 }
 
 /* Final NIC shutdown
- * This is called only at module unload (or hotplug removal).
+ * This is called only at module unload (or hotplug removal).  A PF can call
+ * this on its VFs to ensure they are unbound first.
  */
 static void efx_pci_remove(struct pci_dev *pci_dev)
 {
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
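The removal loop in this patch can be pictured with a small userspace sketch in C. Everything named here (vf_slot, remove_host_vfs, count_remove) is hypothetical and only models the idea of unbinding each host-bound VF before the PF tears down SR-IOV. Clearing the slot after the callback is an assumption of the sketch, not something the patch shows.

```c
#include <stddef.h>

/* Hypothetical model of one entry in the PF's VF table. */
struct vf_slot {
	void *drv; /* non-NULL while a driver instance in the host is bound */
};

/* Example callback: treats drv as an int counter and bumps it. */
void count_remove(void *drv)
{
	++*(int *)drv;
}

/* Call remove() on every VF that is bound in the host and return how many
 * were removed. Unbound slots are skipped, as in the patch's NULL check.
 */
unsigned int remove_host_vfs(struct vf_slot *vf, unsigned int vf_count,
			     void (*remove)(void *drv))
{
	unsigned int i, removed = 0;

	for (i = 0; i < vf_count; ++i) {
		if (vf[i].drv) {
			remove(vf[i].drv);
			vf[i].drv = NULL; /* assumption: slot is cleared on unbind */
			removed++;
		}
	}
	return removed;
}
```

Only after such a pass would the sketch's caller go on to disable SR-IOV, which is the ordering the commit message asks for.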
[PATCH net-next 12/14] sfc: do not allow VFs to be destroyed if assigned to guests
From: Daniel Pieczko dpiec...@solarflare.com

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10_sriov.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/sfc/ef10_sriov.c b/drivers/net/ethernet/sfc/ef10_sriov.c
index cd52454..083c534 100644
--- a/drivers/net/ethernet/sfc/ef10_sriov.c
+++ b/drivers/net/ethernet/sfc/ef10_sriov.c
@@ -417,6 +417,15 @@ static int efx_ef10_pci_sriov_disable(struct efx_nic *efx)
 {
 	struct pci_dev *dev = efx->pci_dev;
 
+	if (!efx->vf_count)
+		return 0;
+
+	if (pci_vfs_assigned(dev)) {
+		netif_err(efx, drv, efx->net_dev, "VFs are assigned to guests; "
+			  "please detach them before disabling SR-IOV\n");
+		return -EBUSY;
+	}
+
 	pci_disable_sriov(dev);
 	efx_ef10_sriov_free_vf_vswitching(efx);
 	efx->vf_count = 0;
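As a rough illustration of the guard this patch adds, here is a userspace sketch in C. The struct and function names are invented for the example and are not the driver's. The vfs_assigned field stands in for whatever pci_vfs_assigned() would report in the kernel.

```c
#include <errno.h>

/* Hypothetical stand-in for the relevant driver state. */
struct sriov_state {
	unsigned int vf_count;     /* VFs currently enabled on the PF */
	unsigned int vfs_assigned; /* VFs attached to guests */
};

/* Same order of checks as the patch: nothing to do without VFs, and
 * refuse with -EBUSY while any VF is still assigned to a guest.
 */
int sriov_disable(struct sriov_state *s)
{
	if (!s->vf_count)
		return 0;

	if (s->vfs_assigned)
		return -EBUSY;

	s->vf_count = 0; /* models the teardown that follows the checks */
	return 0;
}
```

In this model the VF count is left untouched on the -EBUSY path, so a later call can succeed once the guests have released their VFs.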
Re: Connection tracking and soft lockups with certain field values
On Thu, May 28, 2015 at 10:41 PM, Andrey Korolyov and...@xdel.ru wrote:
> Hi,
>
> I am currently playing with SYNPROXY target to optimize SYN filtering
> performance and by occasion found that TCP SYN packets containing port
> 0 can result in a soft lockup when conntrack is enabled just by
> itself, given high packet rate (I've reached 450kpps so far with 60b
> packets on a /32-/32 flood with enabled flow control at the media
> level and middle-level E3 Xeon on receiver side). Same flood with port
> > 0 going just well, producing same ceil numbers but without visible
> lockups in kernel log. I've tested the issue on a broad range of 3.x
> kernels and all of them are seemingly affected. Fast and dirty grep
> revealed special conditions for port 0 only for protocol-specific
> helpers, but there are none of them. Please find both sample captures
> and traceback below.

Attached is a trace without GSO, as its presence can be somewhat
confusing in the previous sample. The testbed uses
net.nf_conntrack_max = 200; I forgot to mention that previously.

[52671.706307] BUG: soft lockup - CPU#0 stuck for 24s!
[rcuos/0:18] [52671.706331] Modules linked in: ixgbe mdio xt_CT iptable_raw ipt_SYNPROXY nf_synproxy_core nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_conntrack nf_conntrack iptable_filter ip_tables x_tables tun openvswitch nfsd auth_rpcgss oid_registry nfs_acl nfs lockd dns_resolver fscache sunrpc bridge stp llc w83627ehf hwmon_vid loop fuse dm_crypt dm_mod coretemp kvm_intel snd_pcm kvm snd_page_alloc snd_timer crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 ablk_helper snd cryptd lrw iTCO_wdt soundcore iTCO_vendor_support gf128mul glue_helper joydev evdev pcspkr video processor button i2c_i801 lpc_ich mfd_core shpchp ext4 crc16 jbd2 mbcache microcode sg sd_mod crc_t10dif hid_generic usbhid hid ahci libahci libata mpt2sas igb raid_class i2c_algo_bit dca scsi_transport_sas ptp pps_core i2c_core ehci_pci [52671.706364] scsi_mod ehci_hcd xhci_hcd usbcore usb_common thermal fan thermal_sys [52671.706369] CPU: 0 PID: 18 Comm: rcuos/0 Not tainted 3.10-0.bpo.3-amd64 #1 Debian 3.10.11-1~bpo70+19 [52671.706370] Hardware name: Supermicro X10SL7-F/X10SL7-F, BIOS 2.00 04/24/2014 [52671.706372] task: 88040d528790 ti: 88040d532000 task.ti: 88040d532000 [52671.706373] RIP: 0010:[812bf927] [812bf927] sock_wfree+0x42/0x5b [52671.706377] RSP: 0018:88041fc03b18 EFLAGS: 0202 [52671.706378] RAX: 812bf801 RBX: 88041fc10110 RCX: 0024 [52671.706379] RDX: 8802e0751610 RSI: RDI: 88041fc0fe40 [52671.706380] RBP: 0300 R08: R09: [52671.706381] R10: 88041fc0ff44 R11: 0001 R12: 88041fc03a88 [52671.706382] R13: 8139705d R14: 0300 R15: 88041fc0fe40 [52671.706383] FS: () GS:88041fc0() knlGS: [52671.706384] CS: 0010 DS: ES: CR0: 80050033 [52671.706385] CR2: 7fb866cc2140 CR3: 0160c000 CR4: 001407f0 [52671.706386] DR0: DR1: DR2: [52671.706387] DR3: DR6: 0ff0 DR7: 0400 [52671.706388] Stack: [52671.706389] 88040ac7b000 88040ac7aa00 88040ac7b000 812f8747 [52671.706391] 88041fc0fe40 812fb0d8 88041fc03b80 88041fc03b78 [52671.706392] 0040 0286 817dda00 0214140a [52671.706394] Call Trace: 
[52671.706395] IRQ [52671.706399] [812f8747] ? skb_orphan+0x12/0x27 [52671.706402] [812fb0d8] ? ip_send_unicast_reply+0x243/0x297 [52671.706406] [810477bf] ? mod_timer+0x7b/0x89 [52671.706409] [813120c2] ? tcp_v4_send_reset+0x2db/0x324 [52671.706411] [81312e56] ? tcp_v4_rcv+0x387/0x559 [52671.706413] [812f6017] ? __xfrm_policy_check2.constprop.9+0x50/0x50 [52671.706415] [812f6117] ? ip_local_deliver_finish+0x100/0x176 [52671.706418] [812cd640] ? __netif_receive_skb_core+0x447/0x4bf [52671.706420] [812cd893] ? netif_receive_skb+0x4c/0x7d [52671.706422] [812ce013] ? napi_gro_receive+0x35/0x76 [52671.706427] [a04ab24c] ? ixgbe_poll+0xbc9/0xe0a [ixgbe] [52671.706429] [812cddaa] ? net_rx_action+0xa7/0x1e1 [52671.706431] [8106442c] ? account_system_time+0x113/0x12c [52671.706433] [81041683] ? __do_softirq+0xf1/0x216 [52671.706436] [8139781c] ? call_softirq+0x1c/0x30 [52671.706436] EOI [52671.706439] [8100eade] ? do_softirq+0x3a/0x78 [52671.706440] [8104142e] ? _local_bh_enable_ip.isra.11+0x6a/0x88 [52671.706443] [810a2413] ? rcu_nocb_kthread+0x25e/0x298 [52671.706445] [810573e3] ? abort_exclusive_wait+0x79/0x79 [52671.706447] [810a21b5] ? force_qs_rnp+0x120/0x120 [52671.706448] [810a21b5] ? force_qs_rnp+0x120/0x120 [52671.706450]
Re: [PATCH iproute2] ss: Fix allocation of cong control alg name
On 05/29/2015 01:04 PM, Eric Dumazet wrote:
...
> I doubt TCP_CA_NAME_MAX will ever change in the kernel : 16 bytes.
>
> It's typically cubic and less than 8 bytes.
>
> Using 8 bytes to point to a malloc(8) is a waste.
>
> Please remove the memory allocation, or store the pointer, since
> tcp_show_info() does the malloc()/free() before return.

+1, much better
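To make the suggestion concrete, a minimal sketch in C of storing the algorithm name inline instead of behind a malloc()ed pointer follows. The struct and function names are hypothetical and do not come from ss.c. CA_NAME_MAX is assumed to mirror the 16-byte TCP_CA_NAME_MAX mentioned above.

```c
#include <string.h>

/* Assumed to mirror the kernel's TCP_CA_NAME_MAX (16 bytes per the thread). */
#define CA_NAME_MAX 16

/* Hypothetical per-socket record; the real structure in ss.c may differ. */
struct tcp_stat_sketch {
	char cong_alg[CA_NAME_MAX]; /* stored inline, no allocation to free */
};

/* Copy the name into the embedded buffer and always NUL-terminate it.
 * Names longer than CA_NAME_MAX - 1 characters are truncated.
 */
void set_cong_alg(struct tcp_stat_sketch *s, const char *name)
{
	strncpy(s->cong_alg, name, sizeof(s->cong_alg) - 1);
	s->cong_alg[sizeof(s->cong_alg) - 1] = '\0';
}
```

On a 64-bit build the inline array costs 8 bytes more than a pointer, but it removes the separate allocation and the matching free().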
pull request: batman-adv 2015-05-29
Hello David,

after quite some time of silence (mostly due to me being rather busy and not
because we did not have any active development) here you have my first batch
intended for net-next/linux-4.1.

In this patchset you have quite some code cleanup and style fixes. A big chunk
of the cleanup work has been performed by Markus Pargmann, followed by a couple
of checkpatch fixes brought by Marek Lindner.

Then we have a patch by Sven Eckelmann that disables some of the features we
had enabled by default, because not all the users having them enabled at compile
time are aware of those. For example, distributions like Debian enable any
compile-time option without asking the user, who then ends up using all these
features without explicit consent and with potentially unexpected behaviour.
Hence the decision to disable not-so-user-safe options by default.

This is all for now.
Please pull or let me know of any problem!

Thanks a lot!


The following changes since commit a74eab639ec502eb744528ef7c271576d670aa7a:

  Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next (2015-05-28 20:23:01 -0700)

are available in the git repository at:

  git://git.open-mesh.org/linux-merge.git tags/batman-adv-for-davem

for you to fetch changes up to 8ea64e27080eb66dc26f64f28485c0bc9fc06b36:

  batman-adv: Use common declaration order in *_send_skb_(packet|unicast) (2015-05-29 10:13:37 +0200)

Included changes:
- checkpatch fixes
- code cleanup
- debugfs component is now compiled only if DEBUG_FS is selected
- update copyright years
- disable by default not-so-user-safe features

Antonio Quartulli (1):
      batman-adv: Use common declaration order in *_send_skb_(packet|unicast)

Marek Lindner (2):
      batman-adv: checkpatch - comparison to NULL could be rewritten
      batman-adv: checkpatch - spaces preferred around that '*'

Markus Pargmann (8):
      batman-adv: debugfs, avoid compiling for !DEBUG_FS
      batman-adv: tvlv realloc, move error handling into if block
      batman-adv: Makefile, Sort alphabetically
      batman-adv: iv_ogm_iface_enable, direct return values
      batman-adv: iv_ogm_aggr_packet, bool return value
      batman-adv: iv_ogm_send_to_if, declare char* as const
      batman-adv: iv_ogm_can_aggregate, code readability
      batman-adv: iv_ogm_orig_update, remove unnecessary brackets

Simon Wunderlich (1):
      batman-adv: Start new development cycle

Sven Eckelmann (4):
      batman-adv: update copyright years for 2015
      batman-adv: Check total_size when queueing fragments
      batman-adv: Use only queued fragments when merging
      batman-adv: Use safer default config for optional features

 net/batman-adv/Makefile                |   6 +-
 net/batman-adv/bat_algo.h              |   2 +-
 net/batman-adv/bat_iv_ogm.c            | 120 -
 net/batman-adv/bitarray.c              |   2 +-
 net/batman-adv/bitarray.h              |   2 +-
 net/batman-adv/bridge_loop_avoidance.c |   2 +-
 net/batman-adv/bridge_loop_avoidance.h |   2 +-
 net/batman-adv/debugfs.c               |  10 +--
 net/batman-adv/debugfs.h               |  36 +-
 net/batman-adv/distributed-arp-table.c |   2 +-
 net/batman-adv/distributed-arp-table.h |   2 +-
 net/batman-adv/fragmentation.c         |  22 +++---
 net/batman-adv/fragmentation.h         |   2 +-
 net/batman-adv/gateway_client.c        |   2 +-
 net/batman-adv/gateway_client.h        |   2 +-
 net/batman-adv/gateway_common.c        |   2 +-
 net/batman-adv/gateway_common.h        |   2 +-
 net/batman-adv/hard-interface.c        |   2 +-
 net/batman-adv/hard-interface.h        |   2 +-
 net/batman-adv/hash.c                  |   2 +-
 net/batman-adv/hash.h                  |   2 +-
 net/batman-adv/icmp_socket.c           |   2 +-
 net/batman-adv/icmp_socket.h           |   2 +-
 net/batman-adv/main.c                  |  18 ++---
 net/batman-adv/main.h                  |   6 +-
 net/batman-adv/multicast.c             |   2 +-
 net/batman-adv/multicast.h             |   2 +-
 net/batman-adv/network-coding.c        |   6 +-
 net/batman-adv/network-coding.h        |   2 +-
 net/batman-adv/originator.c            |   2 +-
 net/batman-adv/originator.h            |   2 +-
 net/batman-adv/packet.h                |   2 +-
 net/batman-adv/routing.c               |   2 +-
 net/batman-adv/routing.h               |   2 +-
 net/batman-adv/send.c                  |   4 +-
 net/batman-adv/send.h                  |   2 +-
 net/batman-adv/soft-interface.c        |   6 +-
 net/batman-adv/soft-interface.h        |   2 +-
 net/batman-adv/sysfs.c                 |   2 +-
 net/batman-adv/sysfs.h                 |   2 +-
 net/batman-adv/translation-table.c     |   2 +-
 net/batman-adv/translation-table.h     |   2 +-
 net/batman-adv/types.h                 |   4 +-
 43 files
Re: [PATCH 2/4] netfilter: default CONFIG_NETFILTER_INGRESS to y
On Fri, May 29, 2015 at 08:19:35AM +0200, Jan Engelhardt wrote:
> On Friday 2015-05-29 01:44, Pablo Neira Ayuso wrote:
> >Useful to compile-test all options.
> >
> >--- a/net/netfilter/Kconfig
> >+++ b/net/netfilter/Kconfig
> >@@ -3,6 +3,7 @@ menu "Core Netfilter Configuration"
> > config NETFILTER_INGRESS
> > 	bool "Netfilter ingress support"
> >+	default y
> > 	select NET_INGRESS
> > 	help
> > 	  This allows you to classify packets from ingress using the Netfilter
>
> Careful with "default y". I seem to remember that someone higher up
> (perhaps Linus himself) was against default y for features deemed
> not essential (especially hardware drivers), as no driver is any
> more important than another. If compile-test is your reason for
> the patch, it might fall into the same category.

This config option is hiding behind the global CONFIG_NETFILTER switch
that, if enabled, gets the very basic hook infrastructure, and this
ingress hook falls into that category.

I agree this makes sense for hardware drivers, but this is not the
case.
[PATCH net-next 06/14] sfc: display vadaptor statistics for all interfaces
From: Daniel Pieczko dpiec...@solarflare.com

All interfaces will display vadaptor statistics, so set all the
relevant bits in the stats bitmask.  Only functions with the
LINKCTRL flag will see other stats, including (per-port) MAC
stats.

The vadaptor stats are from rx_unicast to tx_overflow.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c      | 39 +++++++++++++++++++++++++++++++++++----
 drivers/net/ethernet/sfc/mcdi_pcol.h | 20 ++++++++++++++++++++
 drivers/net/ethernet/sfc/nic.h       | 18 ++++++++++++++++++
 3 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 46ddd6e..bf4677e 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1060,6 +1060,24 @@ static const struct efx_hw_stat_desc efx_ef10_stat_desc[EF10_STAT_COUNT] = {
 	EF10_DMA_STAT(port_rx_dp_streaming_packets, RXDP_STREAMING_PKTS),
 	EF10_DMA_STAT(port_rx_dp_hlb_fetch, RXDP_HLB_FETCH_CONDITIONS),
 	EF10_DMA_STAT(port_rx_dp_hlb_wait, RXDP_HLB_WAIT_CONDITIONS),
+	EF10_DMA_STAT(rx_unicast, VADAPTER_RX_UNICAST_PACKETS),
+	EF10_DMA_STAT(rx_unicast_bytes, VADAPTER_RX_UNICAST_BYTES),
+	EF10_DMA_STAT(rx_multicast, VADAPTER_RX_MULTICAST_PACKETS),
+	EF10_DMA_STAT(rx_multicast_bytes, VADAPTER_RX_MULTICAST_BYTES),
+	EF10_DMA_STAT(rx_broadcast, VADAPTER_RX_BROADCAST_PACKETS),
+	EF10_DMA_STAT(rx_broadcast_bytes, VADAPTER_RX_BROADCAST_BYTES),
+	EF10_DMA_STAT(rx_bad, VADAPTER_RX_BAD_PACKETS),
+	EF10_DMA_STAT(rx_bad_bytes, VADAPTER_RX_BAD_BYTES),
+	EF10_DMA_STAT(rx_overflow, VADAPTER_RX_OVERFLOW),
+	EF10_DMA_STAT(tx_unicast, VADAPTER_TX_UNICAST_PACKETS),
+	EF10_DMA_STAT(tx_unicast_bytes, VADAPTER_TX_UNICAST_BYTES),
+	EF10_DMA_STAT(tx_multicast, VADAPTER_TX_MULTICAST_PACKETS),
+	EF10_DMA_STAT(tx_multicast_bytes, VADAPTER_TX_MULTICAST_BYTES),
+	EF10_DMA_STAT(tx_broadcast, VADAPTER_TX_BROADCAST_PACKETS),
+	EF10_DMA_STAT(tx_broadcast_bytes, VADAPTER_TX_BROADCAST_BYTES),
+	EF10_DMA_STAT(tx_bad, VADAPTER_TX_BAD_PACKETS),
+	EF10_DMA_STAT(tx_bad_bytes, VADAPTER_TX_BAD_BYTES),
+	EF10_DMA_STAT(tx_overflow, VADAPTER_TX_OVERFLOW),
 };
 
 #define HUNT_COMMON_STAT_MASK ((1ULL << EF10_STAT_port_tx_bytes) |	\
@@ -1140,6 +1158,10 @@ static u64 efx_ef10_raw_stat_mask(struct efx_nic *efx)
 	u32 port_caps = efx_mcdi_phy_get_caps(efx);
 	struct efx_ef10_nic_data *nic_data = efx->nic_data;
 
+	if (!(efx->mcdi->fn_flags &
+	      1 << MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_LINKCTRL))
+		return 0;
+
 	if (port_caps & (1 << MC_CMD_PHY_CAP_40000FDX_LBN))
 		raw_mask |= HUNT_40G_EXTRA_STAT_MASK;
 	else
@@ -1154,13 +1176,22 @@ static u64 efx_ef10_raw_stat_mask(struct efx_nic *efx)
 
 static void efx_ef10_get_stat_mask(struct efx_nic *efx, unsigned long *mask)
 {
-	u64 raw_mask = efx_ef10_raw_stat_mask(efx);
+	u64 raw_mask[2];
+
+	raw_mask[0] = efx_ef10_raw_stat_mask(efx);
+
+	/* All functions see the vadaptor stats */
+	raw_mask[0] |= ~((1ULL << EF10_STAT_rx_unicast) - 1);
+	raw_mask[1] = (1ULL << (EF10_STAT_COUNT - 63)) - 1;
 
 #if BITS_PER_LONG == 64
-	mask[0] = raw_mask;
+	mask[0] = raw_mask[0];
+	mask[1] = raw_mask[1];
 #else
-	mask[0] = raw_mask & 0xffffffff;
-	mask[1] = raw_mask >> 32;
+	mask[0] = raw_mask[0] & 0xffffffff;
+	mask[1] = raw_mask[0] >> 32;
+	mask[2] = raw_mask[1] & 0xffffffff;
+	mask[3] = raw_mask[1] >> 32;
 #endif
 }
 
diff --git a/drivers/net/ethernet/sfc/mcdi_pcol.h b/drivers/net/ethernet/sfc/mcdi_pcol.h
index 1e11bb8..0e497b3 100644
--- a/drivers/net/ethernet/sfc/mcdi_pcol.h
+++ b/drivers/net/ethernet/sfc/mcdi_pcol.h
@@ -2896,6 +2896,26 @@
  * descriptor fetch. Valid for EF10 with PM_AND_RXDP_COUNTERS capability only.
  */
 #define MC_CMD_MAC_RXDP_HLB_WAIT_CONDITIONS 0x48
+#define MC_CMD_MAC_VADAPTER_RX_DMABUF_START 0x4c /* enum */
+#define MC_CMD_MAC_VADAPTER_RX_UNICAST_PACKETS 0x4c /* enum */
+#define MC_CMD_MAC_VADAPTER_RX_UNICAST_BYTES 0x4d /* enum */
+#define MC_CMD_MAC_VADAPTER_RX_MULTICAST_PACKETS 0x4e /* enum */
+#define MC_CMD_MAC_VADAPTER_RX_MULTICAST_BYTES 0x4f /* enum */
+#define MC_CMD_MAC_VADAPTER_RX_BROADCAST_PACKETS 0x50 /* enum */
+#define MC_CMD_MAC_VADAPTER_RX_BROADCAST_BYTES 0x51 /* enum */
+#define MC_CMD_MAC_VADAPTER_RX_BAD_PACKETS 0x52 /* enum */
+#define MC_CMD_MAC_VADAPTER_RX_BAD_BYTES 0x53 /* enum */
+#define MC_CMD_MAC_VADAPTER_RX_OVERFLOW 0x54 /* enum */
+#define MC_CMD_MAC_VADAPTER_TX_DMABUF_START 0x57 /* enum */
+#define MC_CMD_MAC_VADAPTER_TX_UNICAST_PACKETS 0x57 /* enum */
+#define MC_CMD_MAC_VADAPTER_TX_UNICAST_BYTES 0x58 /* enum */
+#define MC_CMD_MAC_VADAPTER_TX_MULTICAST_PACKETS 0x59 /*
[PATCH net-next 11/14] sfc: don't update stats on VF when called in atomic context
From: Daniel Pieczko dpiec...@solarflare.com

The ifenslave command to set up a bond runs in an atomic
context, and it queries the stats on the devices that are
being enslaved.  A VF needs to make an MCDI call to update
its stats, which is not allowed in atomic context.

The releasing of the stats_lock is moved to the beginning
of the VF stats update function so that in_interrupt() can
be used; it must be taken again before returning from this
function.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 8be9191..859980d 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1320,11 +1320,24 @@ static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
 	__le64 *dma_stats;
 	int rc;
 
+	spin_unlock_bh(&efx->stats_lock);
+
+	if (in_interrupt()) {
+		/* If in atomic context, cannot update stats.  Just update the
+		 * software stats and return so the caller can continue.
+		 */
+		spin_lock_bh(&efx->stats_lock);
+		efx_update_sw_stats(efx, stats);
+		return 0;
+	}
+
 	efx_ef10_get_stat_mask(efx, mask);
 
 	rc = efx_nic_alloc_buffer(efx, &stats_buf, dma_len, GFP_ATOMIC);
-	if (rc)
+	if (rc) {
+		spin_lock_bh(&efx->stats_lock);
 		return rc;
+	}
 
 	dma_stats = stats_buf.addr;
 	dma_stats[MC_CMD_MAC_GENERATION_END] = EFX_MC_STATS_GENERATION_INVALID;
@@ -1335,7 +1348,6 @@ static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
 	MCDI_SET_DWORD(inbuf, MAC_STATS_IN_DMA_LEN, dma_len);
 	MCDI_SET_DWORD(inbuf, MAC_STATS_IN_PORT_ID, EVB_PORT_ID_ASSIGNED);
 
-	spin_unlock_bh(&efx->stats_lock);
 	rc = efx_mcdi_rpc_quiet(efx, MC_CMD_MAC_STATS, inbuf, sizeof(inbuf),
 				NULL, 0, NULL);
 	spin_lock_bh(&efx->stats_lock);
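The locking rule described here (drop the lock on entry, retake it on every return path) can be modelled in userspace C as below. This is only a sketch: dev_sketch and its fields are invented, the boolean lock_held replaces the real spinlock, and the fw_rc argument stands in for the result of the firmware request.

```c
#include <stdbool.h>

/* Hypothetical model: lock_held plays the role of efx->stats_lock. */
struct dev_sketch {
	bool lock_held;
	unsigned long sw_only_updates; /* atomic-context fallbacks taken */
	unsigned long hw_updates;      /* successful firmware refreshes */
};

/* Called with the lock held and must return with it held again. */
int try_update_stats_vf(struct dev_sketch *d, bool atomic_ctx, int fw_rc)
{
	d->lock_held = false; /* dropped first, before anything may sleep */

	if (atomic_ctx) {
		/* Cannot issue a blocking request: refresh software stats only. */
		d->lock_held = true;
		d->sw_only_updates++;
		return 0;
	}

	/* The blocking firmware request would be issued here, lock not held. */

	d->lock_held = true; /* retaken before any return below */
	if (fw_rc)
		return fw_rc;

	d->hw_updates++;
	return 0;
}
```

The point of the model is that no return path leaves lock_held false, which is the property the patch restores for the early error return after the buffer allocation.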
[PATCH net-next 09/14] sfc: suppress ENOENT error messages from MC_CMD_MAC_STATS
From: Daniel Pieczko dpiec...@solarflare.com

MC_CMD_MAC_STATS can be called on a function before a
vadaptor has been created, as the kernel can call into this
through ndo_get_stats/ndo_get_stats64.

If MC_CMD_MAC_STATS is called before the DMA queues have been
setup, so that a vadaptor has not been created yet, firmware
will return ENOENT.  This is expected, so suppress the MCDI
error message in this case.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c      | 11 ++++++++---
 drivers/net/ethernet/sfc/mcdi_port.c |  8 ++++++--
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 1e83c18..c68bf55 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1330,11 +1330,16 @@ static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
 	MCDI_SET_DWORD(inbuf, MAC_STATS_IN_PORT_ID, EVB_PORT_ID_ASSIGNED);
 
 	spin_unlock_bh(&efx->stats_lock);
-	rc = efx_mcdi_rpc(efx, MC_CMD_MAC_STATS, inbuf, sizeof(inbuf), NULL,
-			  0, NULL);
+	rc = efx_mcdi_rpc_quiet(efx, MC_CMD_MAC_STATS, inbuf, sizeof(inbuf),
+				NULL, 0, NULL);
 	spin_lock_bh(&efx->stats_lock);
-	if (rc)
+	if (rc) {
+		/* Expect ENOENT if DMA queues have not been set up */
+		if (rc != -ENOENT || atomic_read(&efx->active_queues))
+			efx_mcdi_display_error(efx, MC_CMD_MAC_STATS,
+					       sizeof(inbuf), NULL, 0, rc);
 		goto out;
+	}
 
 	generation_end = dma_stats[MC_CMD_MAC_GENERATION_END];
 	if (generation_end == EFX_MC_STATS_GENERATION_INVALID) {
diff --git a/drivers/net/ethernet/sfc/mcdi_port.c b/drivers/net/ethernet/sfc/mcdi_port.c
index fffc348..7f295c4 100644
--- a/drivers/net/ethernet/sfc/mcdi_port.c
+++ b/drivers/net/ethernet/sfc/mcdi_port.c
@@ -948,8 +948,12 @@ static int efx_mcdi_mac_stats(struct efx_nic *efx,
 	MCDI_SET_DWORD(inbuf, MAC_STATS_IN_DMA_LEN, dma_len);
 	MCDI_SET_DWORD(inbuf, MAC_STATS_IN_PORT_ID, nic_data->vport_id);
 
-	rc = efx_mcdi_rpc(efx, MC_CMD_MAC_STATS, inbuf, sizeof(inbuf),
-			  NULL, 0, NULL);
+	rc = efx_mcdi_rpc_quiet(efx, MC_CMD_MAC_STATS, inbuf, sizeof(inbuf),
+				NULL, 0, NULL);
+	/* Expect ENOENT if DMA queues have not been set up */
+	if (rc && (rc != -ENOENT || atomic_read(&efx->active_queues)))
+		efx_mcdi_display_error(efx, MC_CMD_MAC_STATS, sizeof(inbuf),
+				       NULL, 0, rc);
 	return rc;
 }
 
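The condition both hunks share can be isolated as a small predicate. The following C sketch uses a made-up function name and plain integers in place of the driver's atomic counter; it only restates the rule that ENOENT is silent while no queues are active.

```c
#include <errno.h>
#include <stdbool.h>

/* Decide whether a failed MC_CMD_MAC_STATS request should be reported.
 * rc is the negative errno from the request (0 on success) and
 * active_queues models the driver's count of active DMA queues.
 */
bool should_log_stats_error(int rc, int active_queues)
{
	if (rc == 0)
		return false; /* success: nothing to report */

	/* -ENOENT is expected only while no queues have been set up. */
	return rc != -ENOENT || active_queues != 0;
}
```

So -ENOENT with zero active queues stays quiet, while -ENOENT once queues exist, or any other error at any time, is still reported.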
[PATCH net-next 07/14] sfc: DMA the VF stats only when requested
From: Daniel Pieczko dpiec...@solarflare.com

Firmware does not support a periodic DMA of vadaptor-stats on
VFs, so only update the stats buffer when stats are requested
(when running ethtool -S or an ip/ifconfig command that reports
stats).

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c      | 149 +--
 drivers/net/ethernet/sfc/mcdi_pcol.h |   4 +-
 2 files changed, 112 insertions(+), 41 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index bf4677e..6aaec4e 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1204,7 +1204,50 @@ static size_t efx_ef10_describe_stats(struct efx_nic *efx, u8 *names)
 				      mask, names);
 }
 
-static int efx_ef10_try_update_nic_stats(struct efx_nic *efx)
+static size_t efx_ef10_update_stats_common(struct efx_nic *efx, u64 *full_stats,
+					   struct rtnl_link_stats64 *core_stats)
+{
+	DECLARE_BITMAP(mask, EF10_STAT_COUNT);
+	struct efx_ef10_nic_data *nic_data = efx->nic_data;
+	u64 *stats = nic_data->stats;
+	size_t stats_count = 0, index;
+
+	efx_ef10_get_stat_mask(efx, mask);
+
+	if (full_stats) {
+		for_each_set_bit(index, mask, EF10_STAT_COUNT) {
+			if (efx_ef10_stat_desc[index].name) {
+				*full_stats++ = stats[index];
+				++stats_count;
+			}
+		}
+	}
+
+	if (core_stats) {
+		core_stats->rx_packets = stats[EF10_STAT_port_rx_packets];
+		core_stats->tx_packets = stats[EF10_STAT_port_tx_packets];
+		core_stats->rx_bytes = stats[EF10_STAT_port_rx_bytes];
+		core_stats->tx_bytes = stats[EF10_STAT_port_tx_bytes];
+		core_stats->rx_dropped = stats[EF10_STAT_port_rx_nodesc_drops] +
+					 stats[GENERIC_STAT_rx_nodesc_trunc] +
+					 stats[GENERIC_STAT_rx_noskb_drops];
+		core_stats->multicast = stats[EF10_STAT_port_rx_multicast];
+		core_stats->rx_length_errors =
+				stats[EF10_STAT_port_rx_gtjumbo] +
+				stats[EF10_STAT_port_rx_length_error];
+		core_stats->rx_crc_errors = stats[EF10_STAT_port_rx_bad];
+		core_stats->rx_frame_errors =
+				stats[EF10_STAT_port_rx_align_error];
+		core_stats->rx_fifo_errors = stats[EF10_STAT_port_rx_overflow];
+		core_stats->rx_errors = (core_stats->rx_length_errors +
+					 core_stats->rx_crc_errors +
+					 core_stats->rx_frame_errors);
+	}
+
+	return stats_count;
+}
+
+static int efx_ef10_try_update_nic_stats_pf(struct efx_nic *efx)
 {
 	struct efx_ef10_nic_data *nic_data = efx->nic_data;
 	DECLARE_BITMAP(mask, EF10_STAT_COUNT);
@@ -1241,57 +1284,83 @@ static int efx_ef10_try_update_nic_stats(struct efx_nic *efx)
 }
 
-static size_t efx_ef10_update_stats(struct efx_nic *efx, u64 *full_stats,
-				    struct rtnl_link_stats64 *core_stats)
+static size_t efx_ef10_update_stats_pf(struct efx_nic *efx, u64 *full_stats,
+				       struct rtnl_link_stats64 *core_stats)
 {
-	DECLARE_BITMAP(mask, EF10_STAT_COUNT);
-	struct efx_ef10_nic_data *nic_data = efx->nic_data;
-	u64 *stats = nic_data->stats;
-	size_t stats_count = 0, index;
 	int retry;
 
-	efx_ef10_get_stat_mask(efx, mask);
-
 	/* If we're unlucky enough to read statistics during the DMA, wait
 	 * up to 10ms for it to finish (typically takes 500us)
 	 */
 	for (retry = 0; retry < 100; ++retry) {
-		if (efx_ef10_try_update_nic_stats(efx) == 0)
+		if (efx_ef10_try_update_nic_stats_pf(efx) == 0)
 			break;
 		udelay(100);
 	}
 
-	if (full_stats) {
-		for_each_set_bit(index, mask, EF10_STAT_COUNT) {
-			if (efx_ef10_stat_desc[index].name) {
-				*full_stats++ = stats[index];
-				++stats_count;
-			}
-		}
-	}
+	return efx_ef10_update_stats_common(efx, full_stats, core_stats);
+}
 
-	if (core_stats) {
-		core_stats->rx_packets = stats[EF10_STAT_port_rx_packets];
-		core_stats->tx_packets = stats[EF10_STAT_port_tx_packets];
-		core_stats->rx_bytes = stats[EF10_STAT_port_rx_bytes];
-		core_stats->tx_bytes = stats[EF10_STAT_port_tx_bytes];
-		core_stats->rx_dropped = stats[EF10_STAT_port_rx_nodesc_drops] +
-					 stats[GENERIC_STAT_rx_nodesc_trunc] +
-
[PATCH net-next 08/14] sfc: update netdevice statistics to use vadaptor stats
From: Daniel Pieczko dpiec...@solarflare.com

The netdevice statistics (in /proc/net/dev) are per-function
stats so they must use the vadaptor stats.  Change the use of
MAC stats to vadaptor stats, and remove any statistics that
can only be measured per-port.

All stats that are removed will be shown as zeroes when these
statistics are displayed.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c | 41 ++++++++++++++++++++++-------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 6aaec4e..1e83c18 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1224,24 +1224,25 @@ static size_t efx_ef10_update_stats_common(struct efx_nic *efx, u64 *full_stats,
 	}
 
 	if (core_stats) {
-		core_stats->rx_packets = stats[EF10_STAT_port_rx_packets];
-		core_stats->tx_packets = stats[EF10_STAT_port_tx_packets];
-		core_stats->rx_bytes = stats[EF10_STAT_port_rx_bytes];
-		core_stats->tx_bytes = stats[EF10_STAT_port_tx_bytes];
-		core_stats->rx_dropped = stats[EF10_STAT_port_rx_nodesc_drops] +
-					 stats[GENERIC_STAT_rx_nodesc_trunc] +
+		core_stats->rx_packets = stats[EF10_STAT_rx_unicast] +
+					 stats[EF10_STAT_rx_multicast] +
+					 stats[EF10_STAT_rx_broadcast];
+		core_stats->tx_packets = stats[EF10_STAT_tx_unicast] +
+					 stats[EF10_STAT_tx_multicast] +
+					 stats[EF10_STAT_tx_broadcast];
+		core_stats->rx_bytes = stats[EF10_STAT_rx_unicast_bytes] +
+				       stats[EF10_STAT_rx_multicast_bytes] +
+				       stats[EF10_STAT_rx_broadcast_bytes];
+		core_stats->tx_bytes = stats[EF10_STAT_tx_unicast_bytes] +
+				       stats[EF10_STAT_tx_multicast_bytes] +
+				       stats[EF10_STAT_tx_broadcast_bytes];
+		core_stats->rx_dropped = stats[GENERIC_STAT_rx_nodesc_trunc] +
 					 stats[GENERIC_STAT_rx_noskb_drops];
-		core_stats->multicast = stats[EF10_STAT_port_rx_multicast];
-		core_stats->rx_length_errors =
-				stats[EF10_STAT_port_rx_gtjumbo] +
-				stats[EF10_STAT_port_rx_length_error];
-		core_stats->rx_crc_errors = stats[EF10_STAT_port_rx_bad];
-		core_stats->rx_frame_errors =
-				stats[EF10_STAT_port_rx_align_error];
-		core_stats->rx_fifo_errors = stats[EF10_STAT_port_rx_overflow];
-		core_stats->rx_errors = (core_stats->rx_length_errors +
-					 core_stats->rx_crc_errors +
-					 core_stats->rx_frame_errors);
+		core_stats->multicast = stats[EF10_STAT_rx_multicast];
+		core_stats->rx_crc_errors = stats[EF10_STAT_rx_bad];
+		core_stats->rx_fifo_errors = stats[EF10_STAT_rx_overflow];
+		core_stats->rx_errors = core_stats->rx_crc_errors;
+		core_stats->tx_errors = stats[EF10_STAT_tx_bad];
 	}
 
 	return stats_count;
@@ -1324,7 +1325,7 @@ static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
 
 	MCDI_SET_QWORD(inbuf, MAC_STATS_IN_DMA_ADDR, stats_buf.dma_addr);
 	MCDI_POPULATE_DWORD_1(inbuf, MAC_STATS_IN_CMD,
-			      MAC_STATS_IN_DMA, true);
+			      MAC_STATS_IN_DMA, 1);
 	MCDI_SET_DWORD(inbuf, MAC_STATS_IN_DMA_LEN, dma_len);
 	MCDI_SET_DWORD(inbuf, MAC_STATS_IN_PORT_ID, EVB_PORT_ID_ASSIGNED);
 
@@ -1336,8 +1337,10 @@ static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
 		goto out;
 
 	generation_end = dma_stats[MC_CMD_MAC_GENERATION_END];
-	if (generation_end == EFX_MC_STATS_GENERATION_INVALID)
+	if (generation_end == EFX_MC_STATS_GENERATION_INVALID) {
+		WARN_ON_ONCE(1);
 		goto out;
+	}
 	rmb();
 	efx_nic_update_stats(efx_ef10_stat_desc, EF10_STAT_COUNT, mask,
 			     stats, stats_buf.addr, false);
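The new derivation of the receive-side netdevice counters can be shown in isolation. This C sketch uses invented struct names and covers only the RX fields. It follows the assignments in the hunk above, where the packet count is the sum of the three per-class vadaptor counters and the error count comes from the bad-packet counter alone.

```c
#include <stdint.h>

/* Hypothetical container for the per-function (vadaptor) RX counters. */
struct vadaptor_rx {
	uint64_t unicast;
	uint64_t multicast;
	uint64_t broadcast;
	uint64_t bad;
};

/* Hypothetical subset of the netdevice RX statistics. */
struct core_rx {
	uint64_t packets;
	uint64_t multicast;
	uint64_t crc_errors;
	uint64_t errors;
};

/* Fill the netdevice view from the vadaptor counters. */
void fill_core_rx(const struct vadaptor_rx *v, struct core_rx *c)
{
	c->packets = v->unicast + v->multicast + v->broadcast;
	c->multicast = v->multicast;
	c->crc_errors = v->bad;
	c->errors = c->crc_errors; /* no per-port length/frame errors here */
}
```

For example, counters of 10 unicast, 3 multicast, 2 broadcast and 1 bad packet give 15 received packets and 1 error in this model.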
[PATCH net-next 04/14] sfc: add port_ prefix to MAC stats
From: Daniel Pieczko dpiec...@solarflare.com

The MAC stats are per-port and will only be displayed on the
PF with control of the link (one per physical port).  Vadapter
stats will also be displayed for this PF, so distinguish the
MAC stats by adding a prefix of port_.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c      | 251 ++-
 drivers/net/ethernet/sfc/mcdi_pcol.h |   4 +-
 drivers/net/ethernet/sfc/nic.h       | 106 +++
 3 files changed, 182 insertions(+), 179 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 5c9576d..46ddd6e 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1005,93 +1005,94 @@ static int efx_ef10_reset(struct efx_nic *efx, enum reset_type reset_type)
 	[GENERIC_STAT_ ## ext_name] = { #ext_name, 0, 0 }
 
 static const struct efx_hw_stat_desc efx_ef10_stat_desc[EF10_STAT_COUNT] = {
-	EF10_DMA_STAT(tx_bytes, TX_BYTES),
-	EF10_DMA_STAT(tx_packets, TX_PKTS),
-	EF10_DMA_STAT(tx_pause, TX_PAUSE_PKTS),
-	EF10_DMA_STAT(tx_control, TX_CONTROL_PKTS),
-	EF10_DMA_STAT(tx_unicast, TX_UNICAST_PKTS),
-	EF10_DMA_STAT(tx_multicast, TX_MULTICAST_PKTS),
-	EF10_DMA_STAT(tx_broadcast, TX_BROADCAST_PKTS),
-	EF10_DMA_STAT(tx_lt64, TX_LT64_PKTS),
-	EF10_DMA_STAT(tx_64, TX_64_PKTS),
-	EF10_DMA_STAT(tx_65_to_127, TX_65_TO_127_PKTS),
-	EF10_DMA_STAT(tx_128_to_255, TX_128_TO_255_PKTS),
-	EF10_DMA_STAT(tx_256_to_511, TX_256_TO_511_PKTS),
-	EF10_DMA_STAT(tx_512_to_1023, TX_512_TO_1023_PKTS),
-	EF10_DMA_STAT(tx_1024_to_15xx, TX_1024_TO_15XX_PKTS),
-	EF10_DMA_STAT(tx_15xx_to_jumbo, TX_15XX_TO_JUMBO_PKTS),
-	EF10_DMA_STAT(rx_bytes, RX_BYTES),
-	EF10_DMA_INVIS_STAT(rx_bytes_minus_good_bytes, RX_BAD_BYTES),
-	EF10_OTHER_STAT(rx_good_bytes),
-	EF10_OTHER_STAT(rx_bad_bytes),
-	EF10_DMA_STAT(rx_packets, RX_PKTS),
-	EF10_DMA_STAT(rx_good, RX_GOOD_PKTS),
-	EF10_DMA_STAT(rx_bad, RX_BAD_FCS_PKTS),
-	EF10_DMA_STAT(rx_pause, RX_PAUSE_PKTS),
-	EF10_DMA_STAT(rx_control, RX_CONTROL_PKTS),
-	EF10_DMA_STAT(rx_unicast, RX_UNICAST_PKTS),
-	EF10_DMA_STAT(rx_multicast, RX_MULTICAST_PKTS),
-	EF10_DMA_STAT(rx_broadcast, RX_BROADCAST_PKTS),
-	EF10_DMA_STAT(rx_lt64, RX_UNDERSIZE_PKTS),
-	EF10_DMA_STAT(rx_64, RX_64_PKTS),
-	EF10_DMA_STAT(rx_65_to_127, RX_65_TO_127_PKTS),
-	EF10_DMA_STAT(rx_128_to_255, RX_128_TO_255_PKTS),
-	EF10_DMA_STAT(rx_256_to_511, RX_256_TO_511_PKTS),
-	EF10_DMA_STAT(rx_512_to_1023, RX_512_TO_1023_PKTS),
-	EF10_DMA_STAT(rx_1024_to_15xx, RX_1024_TO_15XX_PKTS),
-	EF10_DMA_STAT(rx_15xx_to_jumbo, RX_15XX_TO_JUMBO_PKTS),
-	EF10_DMA_STAT(rx_gtjumbo, RX_GTJUMBO_PKTS),
-	EF10_DMA_STAT(rx_bad_gtjumbo, RX_JABBER_PKTS),
-	EF10_DMA_STAT(rx_overflow, RX_OVERFLOW_PKTS),
-	EF10_DMA_STAT(rx_align_error, RX_ALIGN_ERROR_PKTS),
-	EF10_DMA_STAT(rx_length_error, RX_LENGTH_ERROR_PKTS),
-	EF10_DMA_STAT(rx_nodesc_drops, RX_NODESC_DROPS),
+	EF10_DMA_STAT(port_tx_bytes, TX_BYTES),
+	EF10_DMA_STAT(port_tx_packets, TX_PKTS),
+	EF10_DMA_STAT(port_tx_pause, TX_PAUSE_PKTS),
+	EF10_DMA_STAT(port_tx_control, TX_CONTROL_PKTS),
+	EF10_DMA_STAT(port_tx_unicast, TX_UNICAST_PKTS),
+	EF10_DMA_STAT(port_tx_multicast, TX_MULTICAST_PKTS),
+	EF10_DMA_STAT(port_tx_broadcast, TX_BROADCAST_PKTS),
+	EF10_DMA_STAT(port_tx_lt64, TX_LT64_PKTS),
+	EF10_DMA_STAT(port_tx_64, TX_64_PKTS),
+	EF10_DMA_STAT(port_tx_65_to_127, TX_65_TO_127_PKTS),
+	EF10_DMA_STAT(port_tx_128_to_255, TX_128_TO_255_PKTS),
+	EF10_DMA_STAT(port_tx_256_to_511, TX_256_TO_511_PKTS),
+	EF10_DMA_STAT(port_tx_512_to_1023, TX_512_TO_1023_PKTS),
+	EF10_DMA_STAT(port_tx_1024_to_15xx, TX_1024_TO_15XX_PKTS),
+	EF10_DMA_STAT(port_tx_15xx_to_jumbo, TX_15XX_TO_JUMBO_PKTS),
+	EF10_DMA_STAT(port_rx_bytes, RX_BYTES),
+	EF10_DMA_INVIS_STAT(port_rx_bytes_minus_good_bytes, RX_BAD_BYTES),
+	EF10_OTHER_STAT(port_rx_good_bytes),
+	EF10_OTHER_STAT(port_rx_bad_bytes),
+	EF10_DMA_STAT(port_rx_packets, RX_PKTS),
+	EF10_DMA_STAT(port_rx_good, RX_GOOD_PKTS),
+	EF10_DMA_STAT(port_rx_bad, RX_BAD_FCS_PKTS),
+	EF10_DMA_STAT(port_rx_pause, RX_PAUSE_PKTS),
+	EF10_DMA_STAT(port_rx_control, RX_CONTROL_PKTS),
+	EF10_DMA_STAT(port_rx_unicast, RX_UNICAST_PKTS),
+	EF10_DMA_STAT(port_rx_multicast, RX_MULTICAST_PKTS),
+	EF10_DMA_STAT(port_rx_broadcast, RX_BROADCAST_PKTS),
+	EF10_DMA_STAT(port_rx_lt64, RX_UNDERSIZE_PKTS),
+	EF10_DMA_STAT(port_rx_64, RX_64_PKTS),
+	EF10_DMA_STAT(port_rx_65_to_127, RX_65_TO_127_PKTS),
+	EF10_DMA_STAT(port_rx_128_to_255, RX_128_TO_255_PKTS),
+	EF10_DMA_STAT(port_rx_256_to_511, RX_256_TO_511_PKTS),
+
[PATCH iproute2] configure: Check for libmnl
From: Vadim Kochan vadi...@gmail.com Indicate existence of libmnl which is required by tipc. Signed-off-by: Vadim Kochan vadi...@gmail.com --- configure | 16 1 file changed, 16 insertions(+) diff --git a/configure b/configure index f1325df..1605464 100755 --- a/configure +++ b/configure @@ -284,6 +284,17 @@ check_selinux() fi } +check_mnl() +{ + if ${PKG_CONFIG} libmnl --exists + then + echo HAVE_MNL:=y Config + echo -n yes + else + echo -n no + fi +} + echo # Generated config based on $INCLUDE Config check_toolchain @@ -313,5 +324,10 @@ check_selinux echo -n ELF support: check_elf +echo -n libmnl support: +check_mnl +echo (required by tipc) + echo -e \nDocs check_docs +echo -- 2.3.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH iproute2] ss: Fix allocation of cong control alg name
Hi Vadim,

On 05/29/2015 12:30 PM, Vadim Kochan wrote:
> From: Vadim Kochan <vadi...@gmail.com>
>
> Use strdup instead of malloc, and get rid of bad strcpy.
>
> Signed-off-by: Vadim Kochan <vadi...@gmail.com>

Please also Cc the reporter (done here), and add a:

Fixes: 8250bc9ff4e5 (ss: Unify inet sockets output)
Reported-by: Jose R. Guzman Mosqueda <jose.r.guzman.mosqu...@intel.com>

Fixes tag is _very useful_ for distros to easily identify if additional
follow-up commits would be needed when backporting the original change.
Then, this can be easily identified when going through the git log.

> ---
>  misc/ss.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/misc/ss.c b/misc/ss.c
> index 347e3a1..a719466 100644
> --- a/misc/ss.c
> +++ b/misc/ss.c
> @@ -1908,8 +1908,7 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r,
>
>  		if (tb[INET_DIAG_CONG]) {
>  			const char *cong_attr = rta_getattr_str(tb[INET_DIAG_CONG]);
> -			s.cong_alg = malloc(strlen(cong_attr + 1));
> -			strcpy(s.cong_alg, cong_attr);
> +			s.cong_alg = strdup(cong_attr);

strdup(3) can still return NULL.

>  		}
>
>  		if (TCPI_HAS_OPT(info, TCPI_OPT_WSCALE)) {
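The review remark about strdup(3) returning NULL could be addressed along the following lines. This is only a minimal userspace sketch: `struct sock_info` and `set_cong_alg()` are hypothetical names made up for illustration and are not taken from misc/ss.c.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() under strict ISO C modes */
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-socket state that holds the name. */
struct sock_info {
	char *cong_alg;
};

/* Duplicate the congestion control name into s->cong_alg.
 * Returns 0 on success and -1 if strdup() could not allocate memory;
 * in the failure case cong_alg is NULL, so callers can skip printing it.
 */
static int set_cong_alg(struct sock_info *s, const char *cong_attr)
{
	s->cong_alg = strdup(cong_attr);
	if (!s->cong_alg)
		return -1;

	return 0;
}
```

Whether the caller should bail out or just omit the field on failure is a policy choice this sketch leaves open.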
[PATCH 06/16] batman-adv: tvlv realloc, move error handling into if block
From: Markus Pargmann <m...@pengutronix.de>

Instead of hiding the normal function flow inside an if block, we should
just put the error handling into the if block.

Signed-off-by: Markus Pargmann <m...@pengutronix.de>
Signed-off-by: Marek Lindner <mareklind...@neomailbox.ch>
Signed-off-by: Antonio Quartulli <anto...@meshcoding.com>
---
 net/batman-adv/main.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index ca63d48..fd9333d 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
@@ -819,15 +819,15 @@ static bool batadv_tvlv_realloc_packet_buff(unsigned char **packet_buff,
 	new_buff = kmalloc(min_packet_len + additional_packet_len, GFP_ATOMIC);
 
 	/* keep old buffer if kmalloc should fail */
-	if (new_buff) {
-		memcpy(new_buff, *packet_buff, min_packet_len);
-		kfree(*packet_buff);
-		*packet_buff = new_buff;
-		*packet_buff_len = min_packet_len + additional_packet_len;
-		return true;
-	}
+	if (!new_buff)
+		return false;
 
-	return false;
+	memcpy(new_buff, *packet_buff, min_packet_len);
+	kfree(*packet_buff);
+	*packet_buff = new_buff;
+	*packet_buff_len = min_packet_len + additional_packet_len;
+
+	return true;
 }
 
 /**
-- 
2.4.2
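For readers outside the kernel tree, the same "error handling in the if block, normal flow unindented" shape can be sketched in plain userspace C. `grow_buffer()` and its parameters are hypothetical, and malloc()/free() stand in for kmalloc()/kfree(); this is not the batman-adv function itself.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Replace *buf with a new buffer of new_len bytes, preserving the first
 * keep_len bytes. On failure the old buffer and length stay untouched.
 */
static bool grow_buffer(unsigned char **buf, size_t *buf_len,
			size_t keep_len, size_t new_len)
{
	unsigned char *new_buf;

	/* error cases leave early ... */
	if (new_len < keep_len)
		return false;

	new_buf = malloc(new_len);
	if (!new_buf)
		return false;

	/* ... so the normal flow needs no extra indentation level */
	memcpy(new_buf, *buf, keep_len);
	free(*buf);
	*buf = new_buf;
	*buf_len = new_len;

	return true;
}
```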
[PATCH 05/16] batman-adv: debugfs, avoid compiling for !DEBUG_FS
From: Markus Pargmann <m...@pengutronix.de>

Normally the debugfs framework will return error pointer with -ENODEV
for function calls when DEBUG_FS is not set.

batman does not notice this error code and continues trying to create
debugfs files and executes more code. We can avoid this code execution
by disabling compiling debugfs.c when DEBUG_FS is not set.

Signed-off-by: Markus Pargmann <m...@pengutronix.de>
Signed-off-by: Marek Lindner <mareklind...@neomailbox.ch>
Signed-off-by: Antonio Quartulli <anto...@meshcoding.com>
---
 net/batman-adv/Makefile  |  2 +-
 net/batman-adv/debugfs.c |  8 --------
 net/batman-adv/debugfs.h | 34 ++++++++++++++++++++++++++++++++++
 3 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/net/batman-adv/Makefile b/net/batman-adv/Makefile
index 79833eb..bd7e343 100644
--- a/net/batman-adv/Makefile
+++ b/net/batman-adv/Makefile
@@ -20,7 +20,7 @@ obj-$(CONFIG_BATMAN_ADV) += batman-adv.o
 batman-adv-y += bat_iv_ogm.o
 batman-adv-y += bitarray.o
 batman-adv-$(CONFIG_BATMAN_ADV_BLA) += bridge_loop_avoidance.o
-batman-adv-y += debugfs.o
+batman-adv-$(CONFIG_DEBUG_FS) += debugfs.o
 batman-adv-$(CONFIG_BATMAN_ADV_DAT) += distributed-arp-table.o
 batman-adv-y += fragmentation.o
 batman-adv-y += gateway_client.o
diff --git a/net/batman-adv/debugfs.c b/net/batman-adv/debugfs.c
index 0c213a8..4611808 100644
--- a/net/batman-adv/debugfs.c
+++ b/net/batman-adv/debugfs.c
@@ -482,11 +482,7 @@ rem_attr:
 	debugfs_remove_recursive(hard_iface->debug_dir);
 	hard_iface->debug_dir = NULL;
 out:
-#ifdef CONFIG_DEBUG_FS
 	return -ENOMEM;
-#else
-	return 0;
-#endif /* CONFIG_DEBUG_FS */
 }
 
 /**
@@ -541,11 +537,7 @@ rem_attr:
 	debugfs_remove_recursive(bat_priv->debug_dir);
 	bat_priv->debug_dir = NULL;
 out:
-#ifdef CONFIG_DEBUG_FS
 	return -ENOMEM;
-#else
-	return 0;
-#endif /* CONFIG_DEBUG_FS */
 }
 
 void batadv_debugfs_del_meshif(struct net_device *dev)
diff --git a/net/batman-adv/debugfs.h b/net/batman-adv/debugfs.h
index f3b49c3..ed25605 100644
--- a/net/batman-adv/debugfs.h
+++ b/net/batman-adv/debugfs.h
@@ -20,6 +20,8 @@
 
 #define BATADV_DEBUGFS_SUBDIR "batman_adv"
 
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+
 void batadv_debugfs_init(void);
 void batadv_debugfs_destroy(void);
 int batadv_debugfs_add_meshif(struct net_device *dev);
@@ -27,4 +29,36 @@ void batadv_debugfs_del_meshif(struct net_device *dev);
 int batadv_debugfs_add_hardif(struct batadv_hard_iface *hard_iface);
 void batadv_debugfs_del_hardif(struct batadv_hard_iface *hard_iface);
 
+#else
+
+static inline void batadv_debugfs_init(void)
+{
+}
+
+static inline void batadv_debugfs_destroy(void)
+{
+}
+
+static inline int batadv_debugfs_add_meshif(struct net_device *dev)
+{
+	return 0;
+}
+
+static inline void batadv_debugfs_del_meshif(struct net_device *dev)
+{
+}
+
+static inline
+int batadv_debugfs_add_hardif(struct batadv_hard_iface *hard_iface)
+{
+	return 0;
+}
+
+static inline
+void batadv_debugfs_del_hardif(struct batadv_hard_iface *hard_iface)
+{
+}
+
+#endif
+
 #endif /* _NET_BATMAN_ADV_DEBUGFS_H_ */
-- 
2.4.2
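The header change in this patch follows the usual "empty inline stubs when the feature is compiled out" pattern. A minimal standalone sketch of that pattern follows; `FEATURE_DEBUG`, `struct device_ctx`, `debug_add()`, `debug_del()` and `device_setup()` are all hypothetical names for illustration, with `FEATURE_DEBUG` playing the role of CONFIG_DEBUG_FS.

```c
/* Hypothetical per-device state; only used to make the effect visible. */
struct device_ctx {
	int debug_entries;
};

#ifdef FEATURE_DEBUG

/* Feature compiled in: these would normally be real functions declared
 * here and defined in a separate .c file that the build only includes
 * when the option is set.
 */
static inline int debug_add(struct device_ctx *ctx)
{
	ctx->debug_entries++;
	return 0;
}

static inline void debug_del(struct device_ctx *ctx)
{
	ctx->debug_entries--;
}

#else

/* Feature compiled out: stubs that do nothing and report success. */
static inline int debug_add(struct device_ctx *ctx)
{
	(void)ctx;
	return 0;
}

static inline void debug_del(struct device_ctx *ctx)
{
	(void)ctx;
}

#endif

/* The caller needs no #ifdef of its own and never sees a spurious error. */
static int device_setup(struct device_ctx *ctx)
{
	int ret = debug_add(ctx);

	if (ret)
		return ret;

	return 0;
}
```

With the stubs in the header, the compiler can drop the calls entirely when the feature is off, which is the code execution the commit message wants to avoid.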
[PATCH 11/16] batman-adv: Use safer default config for optional features
From: Sven Eckelmann <s...@narfation.org>

The current default settings for optional features in batman-adv seem to
be based around the idea that the user only compiles what he requires.
They will automatically be enabled when they are compiled in. For example
the network coding part of batman-adv is by default disabled in the
out-of-tree module but will be enabled when the code is compiled during
the module build.

But distributions like Debian just enable all features of the batman-adv
kernel module and hope that more experimental features or features with
possible negative effects have to be enabled using some runtime
configuration interface.

The network_coding feature can help in specific setups but also has
drawbacks and is not disabled by default in the out-of-tree module.
Disabling by default in the runtime config seems to be also quite sane.

The bridge_loop_avoidance is the only feature which is disabled by
default but may be necessary even in simple setups. Packet loops may even
be created during the initial node setup when this is not enabled. This
is different than STP on bridges because mesh is usually used on Adhoc
WiFi. Having two nodes (by accident) in the same LAN segment and in the
same mesh network is rather common in this situation.

Signed-off-by: Sven Eckelmann <s...@narfation.org>
Acked-by: Martin Hundebøll <mar...@hundeboll.net>
Signed-off-by: Marek Lindner <mareklind...@neomailbox.ch>
Signed-off-by: Antonio Quartulli <anto...@meshcoding.com>
---
 net/batman-adv/network-coding.c | 2 +-
 net/batman-adv/soft-interface.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c
index 89e1d47..4cb70bb 100644
--- a/net/batman-adv/network-coding.c
+++ b/net/batman-adv/network-coding.c
@@ -155,7 +155,7 @@ err:
  */
 void batadv_nc_init_bat_priv(struct batadv_priv *bat_priv)
 {
-	atomic_set(&bat_priv->network_coding, 1);
+	atomic_set(&bat_priv->network_coding, 0);
 	bat_priv->nc.min_tq = 200;
 	bat_priv->nc.max_fwd_delay = 10;
 	bat_priv->nc.max_buffer_time = 200;
diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c
index d85a45c..9426b83 100644
--- a/net/batman-adv/soft-interface.c
+++ b/net/batman-adv/soft-interface.c
@@ -732,7 +732,7 @@ static int batadv_softif_init_late(struct net_device *dev)
 	atomic_set(&bat_priv->aggregated_ogms, 1);
 	atomic_set(&bat_priv->bonding, 0);
 #ifdef CONFIG_BATMAN_ADV_BLA
-	atomic_set(&bat_priv->bridge_loop_avoidance, 0);
+	atomic_set(&bat_priv->bridge_loop_avoidance, 1);
 #endif
 #ifdef CONFIG_BATMAN_ADV_DAT
 	atomic_set(&bat_priv->distributed_arp_table, 1);
-- 
2.4.2
Re: [RFC][PATCH] x86: remove vmalloc.h from asm/io.h
* Stephen Rothwell <s...@canb.auug.org.au> wrote:

> Nothing in asm/io.h uses anything from vmalloc.h, so remove the include
> and fix up the build problems in an allmodconfig (64 bit and 32 bit)
> build.
>
> This may be the place where x86 builds get vmalloc.h implicitly included
> and that tends to hide places where vmalloc() et al are added to files
> but the include of vmalloc.h is forgotten.

Good idea.

Acked-by: Ingo Molnar <mi...@kernel.org>

> Based in Linus' tree of today. There are probably more places that need
> vmalloc.h included, but this passes 64 bit and 32 bit allmodconfig
> builds, so is a place to start.

Please also test x86 allnoconfig and defconfig 32/64, that tends to
unearth the remaining places. People doing randconfig testing will find
the rest.

Thanks,

	Ingo
[PATCH 12/16] batman-adv: checkpatch - comparison to NULL could be rewritten
From: Marek Lindner <mareklind...@neomailbox.ch>

Signed-off-by: Marek Lindner <mareklind...@neomailbox.ch>
Signed-off-by: Antonio Quartulli <anto...@meshcoding.com>
---
 net/batman-adv/soft-interface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c
index 9426b83..50cf722 100644
--- a/net/batman-adv/soft-interface.c
+++ b/net/batman-adv/soft-interface.c
@@ -818,7 +818,7 @@ static int batadv_softif_slave_add(struct net_device *dev,
 	int ret = -EINVAL;
 
 	hard_iface = batadv_hardif_get_by_netdev(slave_dev);
-	if (!hard_iface || hard_iface->soft_iface != NULL)
+	if (!hard_iface || hard_iface->soft_iface)
 		goto out;
 
 	ret = batadv_hardif_enable_interface(hard_iface, dev->name);
-- 
2.4.2
[PATCH 02/16] batman-adv: update copyright years for 2015
From: Sven Eckelmann s...@narfation.org Signed-off-by: Sven Eckelmann s...@narfation.org Signed-off-by: Marek Lindner mareklind...@neomailbox.ch Signed-off-by: Antonio Quartulli anto...@meshcoding.com --- net/batman-adv/Makefile| 2 +- net/batman-adv/bat_algo.h | 2 +- net/batman-adv/bat_iv_ogm.c| 2 +- net/batman-adv/bitarray.c | 2 +- net/batman-adv/bitarray.h | 2 +- net/batman-adv/bridge_loop_avoidance.c | 2 +- net/batman-adv/bridge_loop_avoidance.h | 2 +- net/batman-adv/debugfs.c | 2 +- net/batman-adv/debugfs.h | 2 +- net/batman-adv/distributed-arp-table.c | 2 +- net/batman-adv/distributed-arp-table.h | 2 +- net/batman-adv/fragmentation.c | 2 +- net/batman-adv/fragmentation.h | 2 +- net/batman-adv/gateway_client.c| 2 +- net/batman-adv/gateway_client.h| 2 +- net/batman-adv/gateway_common.c| 2 +- net/batman-adv/gateway_common.h| 2 +- net/batman-adv/hard-interface.c| 2 +- net/batman-adv/hard-interface.h| 2 +- net/batman-adv/hash.c | 2 +- net/batman-adv/hash.h | 2 +- net/batman-adv/icmp_socket.c | 2 +- net/batman-adv/icmp_socket.h | 2 +- net/batman-adv/main.c | 2 +- net/batman-adv/main.h | 2 +- net/batman-adv/multicast.c | 2 +- net/batman-adv/multicast.h | 2 +- net/batman-adv/network-coding.c| 2 +- net/batman-adv/network-coding.h| 2 +- net/batman-adv/originator.c| 2 +- net/batman-adv/originator.h| 2 +- net/batman-adv/packet.h| 2 +- net/batman-adv/routing.c | 2 +- net/batman-adv/routing.h | 2 +- net/batman-adv/send.c | 2 +- net/batman-adv/send.h | 2 +- net/batman-adv/soft-interface.c| 2 +- net/batman-adv/soft-interface.h| 2 +- net/batman-adv/sysfs.c | 2 +- net/batman-adv/sysfs.h | 2 +- net/batman-adv/translation-table.c | 2 +- net/batman-adv/translation-table.h | 2 +- net/batman-adv/types.h | 2 +- 43 files changed, 43 insertions(+), 43 deletions(-) diff --git a/net/batman-adv/Makefile b/net/batman-adv/Makefile index eb7d8c03..79833eb 100644 --- a/net/batman-adv/Makefile +++ b/net/batman-adv/Makefile @@ -1,5 +1,5 @@ # -# Copyright (C) 2007-2014 B.A.T.M.A.N. 
contributors: +# Copyright (C) 2007-2015 B.A.T.M.A.N. contributors: # # Marek Lindner, Simon Wunderlich # diff --git a/net/batman-adv/bat_algo.h b/net/batman-adv/bat_algo.h index 4e49666..4e59cf3 100644 --- a/net/batman-adv/bat_algo.h +++ b/net/batman-adv/bat_algo.h @@ -1,4 +1,4 @@ -/* Copyright (C) 2011-2014 B.A.T.M.A.N. contributors: +/* Copyright (C) 2011-2015 B.A.T.M.A.N. contributors: * * Marek Lindner * diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c index 00e00e0..fc299c0 100644 --- a/net/batman-adv/bat_iv_ogm.c +++ b/net/batman-adv/bat_iv_ogm.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2007-2014 B.A.T.M.A.N. contributors: +/* Copyright (C) 2007-2015 B.A.T.M.A.N. contributors: * * Marek Lindner, Simon Wunderlich * diff --git a/net/batman-adv/bitarray.c b/net/batman-adv/bitarray.c index e3da07a..40e4a2a 100644 --- a/net/batman-adv/bitarray.c +++ b/net/batman-adv/bitarray.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2006-2014 B.A.T.M.A.N. contributors: +/* Copyright (C) 2006-2015 B.A.T.M.A.N. contributors: * * Simon Wunderlich, Marek Lindner * diff --git a/net/batman-adv/bitarray.h b/net/batman-adv/bitarray.h index 2acaafe..be497be 100644 --- a/net/batman-adv/bitarray.h +++ b/net/batman-adv/bitarray.h @@ -1,4 +1,4 @@ -/* Copyright (C) 2006-2014 B.A.T.M.A.N. contributors: +/* Copyright (C) 2006-2015 B.A.T.M.A.N. contributors: * * Simon Wunderlich, Marek Lindner * diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c index ac4b96e..fa941cd 100644 --- a/net/batman-adv/bridge_loop_avoidance.c +++ b/net/batman-adv/bridge_loop_avoidance.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2011-2014 B.A.T.M.A.N. contributors: +/* Copyright (C) 2011-2015 B.A.T.M.A.N. 
contributors: * * Simon Wunderlich * diff --git a/net/batman-adv/bridge_loop_avoidance.h b/net/batman-adv/bridge_loop_avoidance.h index 43c985d..1f506d3 100644 --- a/net/batman-adv/bridge_loop_avoidance.h +++ b/net/batman-adv/bridge_loop_avoidance.h @@ -1,4 +1,4 @@ -/* Copyright (C) 2011-2014 B.A.T.M.A.N. contributors: +/* Copyright (C) 2011-2015 B.A.T.M.A.N. contributors: * * Simon Wunderlich * diff --git a/net/batman-adv/debugfs.c b/net/batman-adv/debugfs.c index a4972874..0c213a8 100644 --- a/net/batman-adv/debugfs.c +++ b/net/batman-adv/debugfs.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2010-2014 B.A.T.M.A.N. contributors: +/* Copyright (C) 2010-2015 B.A.T.M.A.N. contributors: * * Marek Lindner * diff --git
[PATCH 01/16] batman-adv: Start new development cycle
From: Simon Wunderlich <s...@simonwunderlich.de>

Signed-off-by: Simon Wunderlich <s...@simonwunderlich.de>
Signed-off-by: Antonio Quartulli <anto...@meshcoding.com>
---
 net/batman-adv/main.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h
index 4d23188..9d5657e 100644
--- a/net/batman-adv/main.h
+++ b/net/batman-adv/main.h
@@ -24,7 +24,7 @@
 #define BATADV_DRIVER_DEVICE "batman-adv"
 
 #ifndef BATADV_SOURCE_VERSION
-#define BATADV_SOURCE_VERSION "2015.0"
+#define BATADV_SOURCE_VERSION "2015.1"
 #endif
 
 /* B.A.T.M.A.N. parameters */
-- 
2.4.2
[PATCH 04/16] batman-adv: Use only queued fragments when merging
From: Sven Eckelmann <s...@narfation.org>

The fragment queueing code now validates the total_size of each fragment,
checks when enough fragments are queued to allow to merge them into a
single packet and if the fragments have the correct size. Therefore, it
is not required to have any other parameter for the merging function than
a list of queued fragments.

This change should avoid problems like in the past when the different skb
from the list and the function parameter were mixed incorrectly.

Signed-off-by: Sven Eckelmann <s...@narfation.org>
Acked-by: Martin Hundebøll <mar...@hundeboll.net>
Signed-off-by: Marek Lindner <mareklind...@neomailbox.ch>
Signed-off-by: Antonio Quartulli <anto...@meshcoding.com>
---
 net/batman-adv/fragmentation.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index a9fc653..6ce3c84 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -231,19 +231,13 @@ err:
  * Returns the merged skb or NULL on error.
  */
 static struct sk_buff *
-batadv_frag_merge_packets(struct hlist_head *chain, struct sk_buff *skb)
+batadv_frag_merge_packets(struct hlist_head *chain)
 {
 	struct batadv_frag_packet *packet;
 	struct batadv_frag_list_entry *entry;
 	struct sk_buff *skb_out = NULL;
 	int size, hdr_size = sizeof(struct batadv_frag_packet);
 
-	/* Make sure incoming skb has non-bogus data. */
-	packet = (struct batadv_frag_packet *)skb->data;
-	size = ntohs(packet->total_size);
-	if (size > batadv_frag_size_limit())
-		goto free;
-
 	/* Remove first entry, as this is the destination for the rest of the
 	 * fragments.
 	 */
@@ -252,6 +246,9 @@ batadv_frag_merge_packets(struct hlist_head *chain, struct sk_buff *skb)
 	skb_out = entry->skb;
 	kfree(entry);
 
+	packet = (struct batadv_frag_packet *)skb_out->data;
+	size = ntohs(packet->total_size);
+
 	/* Make room for the rest of the fragments. */
 	if (pskb_expand_head(skb_out, 0, size - skb_out->len, GFP_ATOMIC) < 0) {
 		kfree_skb(skb_out);
@@ -307,7 +304,7 @@ bool batadv_frag_skb_buffer(struct sk_buff **skb,
 	if (hlist_empty(&head))
 		goto out;
 
-	skb_out = batadv_frag_merge_packets(&head, *skb);
+	skb_out = batadv_frag_merge_packets(&head);
 	if (!skb_out)
 		goto out_err;
 
-- 
2.4.2
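The idea of the patch, that the merge step takes everything it needs from the queued fragments and no longer from a separately passed skb, can be illustrated with a small userspace sketch. `struct frag` and `merge_frags()` are hypothetical and use a plain singly linked list rather than the kernel's hlist and sk_buff types; the extra length checks are part of the sketch, not a statement about what the kernel function does.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical queued fragment: every fragment of one packet carries the
 * same total_size, the size of the payload once merged.
 */
struct frag {
	const struct frag *next;
	size_t total_size;
	size_t len;
	const unsigned char *data;
};

/* Merge a queue of fragments into one freshly allocated buffer.
 * The total size is read from the first queued fragment, so the queue is
 * the only input. Returns NULL on error.
 */
static unsigned char *merge_frags(const struct frag *head, size_t *out_len)
{
	const struct frag *f;
	unsigned char *out;
	size_t total, off = 0;

	if (!head || head->total_size == 0)
		return NULL;

	total = head->total_size;
	out = malloc(total);
	if (!out)
		return NULL;

	for (f = head; f; f = f->next) {
		/* never copy past the announced total size */
		if (f->len > total - off) {
			free(out);
			return NULL;
		}
		memcpy(out + off, f->data, f->len);
		off += f->len;
	}

	/* the queue must add up to exactly the announced size */
	if (off != total) {
		free(out);
		return NULL;
	}

	*out_len = total;
	return out;
}
```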
[PATCH net-next 10/14] sfc: suppress vadaptor stats when EVB is not present
From: Daniel Pieczko <dpiec...@solarflare.com>

The raw_mask array is not initialised, so it needs to be explicitly set
to zero in the 'else' branch.

If the EVB capability is not present, a port cannot have multiple
functions so the per-port MAC stats are correct and should match the
corresponding vadaptor stats, so this redundancy can be removed from the
ethtool stats output.

Signed-off-by: Shradha Shah <ss...@solarflare.com>
---
 drivers/net/ethernet/sfc/ef10.c      | 12 +++++++++---
 drivers/net/ethernet/sfc/mcdi_pcol.h |  2 ++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index c68bf55..8be9191 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1176,13 +1176,19 @@ static u64 efx_ef10_raw_stat_mask(struct efx_nic *efx)
 
 static void efx_ef10_get_stat_mask(struct efx_nic *efx, unsigned long *mask)
 {
+	struct efx_ef10_nic_data *nic_data = efx->nic_data;
 	u64 raw_mask[2];
 
 	raw_mask[0] = efx_ef10_raw_stat_mask(efx);
 
-	/* All functions see the vadaptor stats */
-	raw_mask[0] |= ~((1ULL << EF10_STAT_rx_unicast) - 1);
-	raw_mask[1] = (1ULL << (EF10_STAT_COUNT - 63)) - 1;
+	/* Only show vadaptor stats when EVB capability is present */
+	if (nic_data->datapath_caps &
+	    (1 << MC_CMD_GET_CAPABILITIES_OUT_EVB_LBN)) {
+		raw_mask[0] |= ~((1ULL << EF10_STAT_rx_unicast) - 1);
+		raw_mask[1] = (1ULL << (EF10_STAT_COUNT - 63)) - 1;
+	} else {
+		raw_mask[1] = 0;
+	}
 
 #if BITS_PER_LONG == 64
 	mask[0] = raw_mask[0];
diff --git a/drivers/net/ethernet/sfc/mcdi_pcol.h b/drivers/net/ethernet/sfc/mcdi_pcol.h
index 181978d..45fca9f 100644
--- a/drivers/net/ethernet/sfc/mcdi_pcol.h
+++ b/drivers/net/ethernet/sfc/mcdi_pcol.h
@@ -5600,6 +5600,8 @@
 #define	MC_CMD_GET_CAPABILITIES_OUT_MCAST_FILTER_CHAINING_WIDTH 1
 #define	MC_CMD_GET_CAPABILITIES_OUT_PM_AND_RXDP_COUNTERS_LBN 27
 #define	MC_CMD_GET_CAPABILITIES_OUT_PM_AND_RXDP_COUNTERS_WIDTH 1
+#define	MC_CMD_GET_CAPABILITIES_OUT_EVB_LBN 30
+#define	MC_CMD_GET_CAPABILITIES_OUT_EVB_WIDTH 1
 /* RxDPCPU firmware id. */
 #define	MC_CMD_GET_CAPABILITIES_OUT_RX_DPCPU_FW_ID_OFST 4
 #define	MC_CMD_GET_CAPABILITIES_OUT_RX_DPCPU_FW_ID_LEN 2
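The two points of this patch (test a capability bit before exposing the extra stats, and write the second mask word on every path because it starts out uninitialised) can be sketched in isolation. All names and numbers below are hypothetical except the bit position 30, which mirrors MC_CMD_GET_CAPABILITIES_OUT_EVB_LBN from the diff; the index arithmetic is the sketch's own and is not copied from the driver.

```c
#include <stdint.h>

#define STAT_COUNT       70 /* hypothetical total number of stats */
#define FIRST_EXTRA_STAT 40 /* hypothetical index of the first gated stat */
#define CAP_EVB_BIT      30 /* bit position of the EVB capability flag */

/* Fill a 128-bit stat mask held in two 64-bit words.
 * base_mask holds the always-visible stats in word 0. The stats from
 * FIRST_EXTRA_STAT up to STAT_COUNT - 1 are only exposed when the
 * capability bit is set in caps.
 */
static void build_stat_mask(uint32_t caps, uint64_t base_mask,
			    uint64_t raw_mask[2])
{
	raw_mask[0] = base_mask;

	if (caps & (UINT32_C(1) << CAP_EVB_BIT)) {
		/* bits FIRST_EXTRA_STAT..63 of word 0 */
		raw_mask[0] |= ~((UINT64_C(1) << FIRST_EXTRA_STAT) - 1);
		/* bits 64..STAT_COUNT-1, stored in word 1 */
		raw_mask[1] = (UINT64_C(1) << (STAT_COUNT - 64)) - 1;
	} else {
		/* word 1 must be written on this path too, otherwise
		 * the caller would read an indeterminate value
		 */
		raw_mask[1] = 0;
	}
}
```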
[PATCH iproute2] ss: Fix allocation of cong control alg name
From: Vadim Kochan <vadi...@gmail.com>

Use strdup instead of malloc, and get rid of bad strcpy.

Signed-off-by: Vadim Kochan <vadi...@gmail.com>
---
 misc/ss.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index 347e3a1..a719466 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -1908,8 +1908,7 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r,
 
 		if (tb[INET_DIAG_CONG]) {
 			const char *cong_attr = rta_getattr_str(tb[INET_DIAG_CONG]);
-			s.cong_alg = malloc(strlen(cong_attr + 1));
-			strcpy(s.cong_alg, cong_attr);
+			s.cong_alg = strdup(cong_attr);
 		}
 
 		if (TCPI_HAS_OPT(info, TCPI_OPT_WSCALE)) {
-- 
2.3.1
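The commit message only says "bad strcpy"; the underlying problem appears to be the misplaced parenthesis in `malloc(strlen(cong_attr + 1))`, which measures the string from its second character and so allocates two bytes fewer than strcpy() then writes. A small sketch makes the sizes explicit; the helper names are hypothetical and only serve to compare the two expressions.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Size the removed code requested: the length of the string starting at
 * its second character. Only meaningful for a non-empty s; for an empty
 * string s + 1 already points past the terminator.
 */
static size_t buggy_alloc_size(const char *s)
{
	return strlen(s + 1);
}

/* Size strcpy() actually needs: every character plus the trailing NUL. */
static size_t needed_alloc_size(const char *s)
{
	return strlen(s) + 1;
}

/* Hand-rolled equivalent of what strdup() does, for comparison.
 * Returns NULL if the allocation fails.
 */
static char *copy_string(const char *s)
{
	size_t n = needed_alloc_size(s);
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);

	return p;
}
```

For a name such as "cubic" the removed code asked for 4 bytes while the copy writes 6, so the old strcpy() overflowed the heap buffer by two bytes.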
Re: [Xen-devel] xen-netfront sets partial checksum at wrong offset
On 11.05.15 at 19:25, venkat.x.venkatsu...@oracle.com wrote: Please CC the maintainers of the driver. You can get that from 'scripts/get_maintainer.pl' I've done that for you. Thanks, Konrad. I am copying Wei too who had fixed the below problem earlier. It fixed the incorrect ip_hdr(). tcp_hdr() still needs to fixed. commit d554f73df6bc35ac8f6a65e5560bf1d31dfebed9 Author: Wei Liu wei.l...@citrix.com Date: Wed Feb 19 18:48:34 2014 + xen-netfront: reset skb network header before checksum So no response at all so far from the maintainers made me look into this: I first thought what we need would be calls to skb_probe_transport_header() in skb_checksum_setup_ip() after each of the skb_maybe_pull_tail() functions. But skb_partial_csum_set() already calls skb_set_transport_header(), so I now think things ought to be fine without any change. Can you clarify what you think is missing? Or is this an issue in just the old (3.8.x) kernel you're using? (In either case netback's xenvif_tx_submit() calling skb_probe_transport_header() would seem pointless, as skb_checksum_setup() - with or without a fix - ought to be taking care of this anyway.) Jan
Re: [Xen-devel] xen-netfront sets partial checksum at wrong offset
On Fri, May 29, 2015 at 11:34:07AM +0100, Jan Beulich wrote: On 11.05.15 at 19:25, venkat.x.venkatsu...@oracle.com wrote: Please CC the maintainers of the driver. You can get that from 'scripts/get_maintainer.pl' I've done that for you. Thanks, Konrad. I am copying Wei too who had fixed the below problem earlier. It fixed the incorrect ip_hdr(). tcp_hdr() still needs to fixed. commit d554f73df6bc35ac8f6a65e5560bf1d31dfebed9 Author: Wei Liu wei.l...@citrix.com Date: Wed Feb 19 18:48:34 2014 + xen-netfront: reset skb network header before checksum So no response at all so far from the maintainers made me look into this: I first thought what we need would be calls to skb_probe_transport_header() in skb_checksum_setup_ip() after each of the skb_maybe_pull_tail() functions. But skb_partial_csum_set() already calls skb_set_transport_header(), so I now think things ought to be fine without any change. Can you clarify what you think is missing? Or is this an issue in just the old (3.8.x) kernel you're using? I think this is the follow-up thread 20150512013424.ga7...@oracle.com And the conclusion is 3.8 is too old so the fix is not there. (In either case netback's xenvif_tx_submit() calling skb_probe_transport_header() would seem pointless, as skb_checksum_setup() - with or without a fix - ought to be taking care of this anyway.) Patches welcome. :-) Wei. Jan
[PATCH 16/16] batman-adv: Use common declaration order in *_send_skb_(packet|unicast)
From: Antonio Quartulli <anto...@open-mesh.com>

Signed-off-by: Antonio Quartulli <anto...@open-mesh.com>
Signed-off-by: Antonio Quartulli <anto...@meshcoding.com>
---
 net/batman-adv/send.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index fa70ae8..23635bd 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
@@ -255,8 +255,8 @@ int batadv_send_skb_unicast(struct batadv_priv *bat_priv,
 			    struct batadv_orig_node *orig_node,
 			    unsigned short vid)
 {
-	struct ethhdr *ethhdr;
 	struct batadv_unicast_packet *unicast_packet;
+	struct ethhdr *ethhdr;
 	int ret = NET_XMIT_DROP;
 
 	if (!orig_node)
-- 
2.4.2
[PATCH 07/16] batman-adv: Makefile, Sort alphabetically
From: Markus Pargmann <m...@pengutronix.de>

The whole Makefile is sorted, just the multicast rule is not at the right
position.

Signed-off-by: Markus Pargmann <m...@pengutronix.de>
Signed-off-by: Marek Lindner <mareklind...@neomailbox.ch>
Signed-off-by: Antonio Quartulli <anto...@meshcoding.com>
---
 net/batman-adv/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/Makefile b/net/batman-adv/Makefile
index bd7e343..21434ab 100644
--- a/net/batman-adv/Makefile
+++ b/net/batman-adv/Makefile
@@ -29,6 +29,7 @@ batman-adv-y += hard-interface.o
 batman-adv-y += hash.o
 batman-adv-y += icmp_socket.o
 batman-adv-y += main.o
+batman-adv-$(CONFIG_BATMAN_ADV_MCAST) += multicast.o
 batman-adv-$(CONFIG_BATMAN_ADV_NC) += network-coding.o
 batman-adv-y += originator.o
 batman-adv-y += routing.o
@@ -36,4 +37,3 @@ batman-adv-y += send.o
 batman-adv-y += soft-interface.o
 batman-adv-y += sysfs.o
 batman-adv-y += translation-table.o
-batman-adv-$(CONFIG_BATMAN_ADV_MCAST) += multicast.o
-- 
2.4.2
[PATCH 14/16] batman-adv: iv_ogm_can_aggregate, code readability
From: Markus Pargmann m...@pengutronix.de This patch tries to increase code readability by negating the first if block and rearranging some of the other conditional blocks. This way we save an indentation level, we also save some allocation that is not necessary for one of the conditions. Signed-off-by: Markus Pargmann m...@pengutronix.de Signed-off-by: Marek Lindner mareklind...@neomailbox.ch Signed-off-by: Antonio Quartulli anto...@meshcoding.com --- net/batman-adv/bat_iv_ogm.c | 96 +++-- 1 file changed, 50 insertions(+), 46 deletions(-) diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c index f9c5610..ce04cd7 100644 --- a/net/batman-adv/bat_iv_ogm.c +++ b/net/batman-adv/bat_iv_ogm.c @@ -544,58 +544,62 @@ batadv_iv_ogm_can_aggregate(const struct batadv_ogm_packet *new_bat_ogm_packet, * - the send time is within our MAX_AGGREGATION_MS time * - the resulting packet wont be bigger than * MAX_AGGREGATION_BYTES +* otherwise aggregation is not possible */ - if (time_before(send_time, forw_packet-send_time) - time_after_eq(aggregation_end_time, forw_packet-send_time) - (aggregated_bytes = BATADV_MAX_AGGREGATION_BYTES)) { - /* check aggregation compatibility -* - direct link packets are broadcasted on -*their interface only -* - aggregate packet if the current packet is -*a global packet as well as the base -*packet -*/ - primary_if = batadv_primary_if_get_selected(bat_priv); - if (!primary_if) - goto out; + if (!time_before(send_time, forw_packet-send_time) || + !time_after_eq(aggregation_end_time, forw_packet-send_time)) + return false; - /* packet is not leaving on the same interface. */ - if (forw_packet-if_outgoing != if_outgoing) - goto out; + if (aggregated_bytes BATADV_MAX_AGGREGATION_BYTES) + return false; - /* packets without direct link flag and high TTL -* are flooded through the net -*/ - if ((!directlink) - (!(batadv_ogm_packet-flags BATADV_DIRECTLINK)) - (batadv_ogm_packet-ttl != 1) + /* packet is not leaving on the same interface. 
*/ + if (forw_packet-if_outgoing != if_outgoing) + return false; - /* own packets originating non-primary -* interfaces leave only that interface -*/ - ((!forw_packet-own) || -(forw_packet-if_incoming == primary_if))) { - res = true; - goto out; - } + /* check aggregation compatibility +* - direct link packets are broadcasted on +*their interface only +* - aggregate packet if the current packet is +*a global packet as well as the base +*packet +*/ + primary_if = batadv_primary_if_get_selected(bat_priv); + if (!primary_if) + return false; - /* if the incoming packet is sent via this one -* interface only - we still can aggregate -*/ - if ((directlink) - (new_bat_ogm_packet-ttl == 1) - (forw_packet-if_incoming == if_incoming) + /* packets without direct link flag and high TTL +* are flooded through the net +*/ + if (!directlink + !(batadv_ogm_packet-flags BATADV_DIRECTLINK) + batadv_ogm_packet-ttl != 1 - /* packets from direct neighbors or -* own secondary interface packets -* (= secondary interface packets in general) -*/ - (batadv_ogm_packet-flags BATADV_DIRECTLINK || -(forw_packet-own - forw_packet-if_incoming != primary_if))) { - res = true; - goto out; - } + /* own packets originating non-primary +* interfaces leave only that interface +*/ + (!forw_packet-own || +forw_packet-if_incoming == primary_if)) { + res = true; + goto out; + } + + /* if the incoming packet is sent via this one +* interface only - we still can aggregate +*/ + if (directlink + new_bat_ogm_packet-ttl == 1 + forw_packet-if_incoming == if_incoming + + /* packets from direct neighbors or +* own secondary interface packets +* (= secondary interface packets in general) +*/ +
[PATCH net-next 00/14] sfc: ndo_get_phys_port_id, vadaptor stats and PF unload when Vf's assigned to guest
This is the third and last instalment of SRIOV for EF10 patches.

This patch set includes implementation of ndo_get_phys_port_id and changes to the MAC statistics code in order to support vadaptor statistics. It also includes code to deal with PF unload when Vf's are still assigned to the guest.

The first couple of patches create sysfs files for physical port and link control flags which are particularly useful when we have enabled a large number of VF's.

These patches have been tested with and without CONFIG_SFC_SRIOV. The creation and content of the sysfs files has been tested. The statistics are tested using ethtool for monitoring.

Daniel Pieczko (11):
  sfc: add port_ prefix to MAC stats
  sfc: set the port-id when calling MC_CMD_MAC_STATS
  sfc: display vadaptor statistics for all interfaces
  sfc: DMA the VF stats only when requested
  sfc: update netdevice statistics to use vadaptor stats
  sfc: suppress ENOENT error messages from MC_CMD_MAC_STATS
  sfc: suppress vadaptor stats when EVB is not present
  sfc: don't update stats on VF when called in atomic context
  sfc: do not allow VFs to be destroyed if assigned to guests
  sfc: force removal of VF and vport on driver removal
  sfc: leak vports if a VF is assigned during PF unload

Shradha Shah (3):
  sfc: Add sysfs entry for physical port
  sfc: Add sysfs entry for flags (link control and primary)
  sfc: Implement ndo_gets_phys_port_id() for EF10 VFs

 drivers/net/ethernet/sfc/ef10.c       | 553 --
 drivers/net/ethernet/sfc/ef10_sriov.c |  57 +++-
 drivers/net/ethernet/sfc/ef10_sriov.h |   5 +
 drivers/net/ethernet/sfc/efx.c        |   8 +-
 drivers/net/ethernet/sfc/mcdi_pcol.h  |  30 +-
 drivers/net/ethernet/sfc/mcdi_port.c  |  12 +-
 drivers/net/ethernet/sfc/net_driver.h |   2 +
 drivers/net/ethernet/sfc/nic.h        | 117 ---
 drivers/net/ethernet/sfc/sriov.c      |  11 +
 drivers/net/ethernet/sfc/sriov.h      |   2 +
 10 files changed, 569 insertions(+), 228 deletions(-)

--
To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to
majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-next 01/14] sfc: Add sysfs entry for physical port
In the case where we have multiple functions (PFs and VFs), this sysfs entry is useful to identify the physical port corresponding to the function we are interested in. Signed-off-by: Shradha Shah ss...@solarflare.com --- drivers/net/ethernet/sfc/ef10.c | 35 --- 1 file changed, 28 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c index a547ceb..ee20d96 100644 --- a/drivers/net/ethernet/sfc/ef10.c +++ b/drivers/net/ethernet/sfc/ef10.c @@ -246,6 +246,18 @@ static int efx_ef10_get_mac_address_vf(struct efx_nic *efx, u8 *mac_address) return 0; } +static ssize_t efx_ef10_show_physical_port(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct efx_nic *efx = pci_get_drvdata(to_pci_dev(dev)); + + return sprintf(buf, %d\n, efx-port_num); +} + +static DEVICE_ATTR(physical_port, 0444, efx_ef10_show_physical_port, + NULL); + static int efx_ef10_probe(struct efx_nic *efx) { struct efx_ef10_nic_data *nic_data; @@ -326,14 +338,18 @@ static int efx_ef10_probe(struct efx_nic *efx) if (rc 0) goto fail3; efx-port_num = rc; + rc = device_create_file(efx-pci_dev-dev, dev_attr_physical_port); + if (rc) + goto fail3; rc = efx-type-get_mac_address(efx, efx-net_dev-perm_addr); if (rc) - goto fail3; + goto fail4; rc = efx_ef10_get_sysclk_freq(efx); if (rc 0) - goto fail3; + goto fail4; + efx-timer_quantum_ns = 1536000 / rc; /* 1536 cycles */ /* Check whether firmware supports bug 35388 workaround. 
@@ -341,9 +357,9 @@ static int efx_ef10_probe(struct efx_nic *efx)
 	 * ask if it's already enabled
 	 */
 	rc = efx_mcdi_set_workaround(efx, MC_CMD_WORKAROUND_BUG35388, true);
-	if (rc == 0)
+	if (rc == 0) {
 		nic_data->workaround_35388 = true;
-	else if (rc == -EPERM) {
+	} else if (rc == -EPERM) {
 		unsigned int enabled;
 
 		rc = efx_mcdi_get_workarounds(efx, NULL, &enabled);
@@ -351,21 +367,24 @@ static int efx_ef10_probe(struct efx_nic *efx)
 			goto fail3;
 		nic_data->workaround_35388 = enabled &
 			MC_CMD_GET_WORKAROUNDS_OUT_BUG35388;
+	} else if (rc != -ENOSYS && rc != -ENOENT) {
+		goto fail4;
 	}
-	else if (rc != -ENOSYS && rc != -ENOENT)
-		goto fail3;
+
 	netif_dbg(efx, probe, efx->net_dev,
 		  "workaround for bug 35388 is %sabled\n",
 		  nic_data->workaround_35388 ? "en" : "dis");
 
 	rc = efx_mcdi_mon_probe(efx);
 	if (rc && rc != -EPERM)
-		goto fail3;
+		goto fail4;
 
 	efx_ptp_probe(efx, NULL);
 
 	return 0;
 
+fail4:
+	device_remove_file(&efx->pci_dev->dev, &dev_attr_physical_port);
 fail3:
 	efx_mcdi_fini(efx);
 fail2:
@@ -608,6 +627,8 @@ static void efx_ef10_remove(struct efx_nic *efx)
 	if (!nic_data->must_restore_piobufs)
 		efx_ef10_free_piobufs(efx);
 
+	device_remove_file(&efx->pci_dev->dev, &dev_attr_physical_port);
+
 	efx_mcdi_fini(efx);
 	efx_nic_free_buffer(efx, &nic_data->mcdi_buf);
 	kfree(nic_data);
RE: [PATCH net-next 02/14] sfc: Add sysfs entry for flags (link control and primary)
From: Shradha Shah
> Sent: 29 May 2015 11:01
> On every adapter there will be one primary PF per adaptor and one link
> control PF per port.
...
> +	return sprintf(buf, "%d\n",
> +		       ((efx->mcdi->fn_flags) &
> +			(1 << MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_LINKCTRL))
> +		       ? 1 : 0);

Horrid expression. Why not:
	(efx->mcdi->fn_flags >> MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_LINKCTRL) & 1
using sprintf() is also excessive. Maybe:
	*buf = '0' + (expression);
	return 1;
You may also need to check for buffer overrun.

	David
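As a rough illustration of the review comment above, the sketch below extracts a single flag bit by shifting and masking, then writes it as one ASCII digit. It is plain user-space C, not the sfc driver code: `FLAG_LINKCTRL_BIT`, `flag_bit()` and `show_flag()` are hypothetical names, and the bit position is made up for the example. The sprintf() version in the patch emitted a trailing newline, so the sketch keeps one, and it also includes the buffer-size check David mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position, standing in for
 * MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_LINKCTRL. */
#define FLAG_LINKCTRL_BIT 3

/* Return 1 if bit `bit` (0..31) is set in `flags`, 0 otherwise. */
static unsigned int flag_bit(uint32_t flags, unsigned int bit)
{
	return (flags >> bit) & 1;
}

/* Write "0\n" or "1\n" into buf and return the number of bytes
 * written (excluding the NUL), or 0 if the buffer is too small. */
static size_t show_flag(char *buf, size_t len, uint32_t flags,
			unsigned int bit)
{
	assert(buf != NULL);
	if (len < 3)
		return 0;
	buf[0] = (char)('0' + flag_bit(flags, bit));
	buf[1] = '\n';
	buf[2] = '\0';
	return 2;
}
```

Shifting first and masking with 1 gives a value that is already 0 or 1, so the `? 1 : 0` in the original expression becomes unnecessary.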
Re: [Xen-devel] xen-netfront sets partial checksum at wrong offset
On 29.05.15 at 12:39, wei.l...@citrix.com wrote:
> On Fri, May 29, 2015 at 11:34:07AM +0100, Jan Beulich wrote:
>> On 11.05.15 at 19:25, venkat.x.venkatsu...@oracle.com wrote:
>>>> Please CC the maintainers of the driver. You can get that from
>>>> 'scripts/get_maintainer.pl'
>>>>
>>>> I've done that for you.
>>>
>>> Thanks, Konrad.
>>>
>>> I am copying Wei too who had fixed the below problem earlier. It fixed
>>> the incorrect ip_hdr(). tcp_hdr() still needs to be fixed.
>>>
>>> commit d554f73df6bc35ac8f6a65e5560bf1d31dfebed9
>>> Author: Wei Liu wei.l...@citrix.com
>>> Date: Wed Feb 19 18:48:34 2014 +
>>>
>>>     xen-netfront: reset skb network header before checksum
>>
>> So no response at all so far from the maintainers made me look into
>> this: I first thought what we need would be calls to
>> skb_probe_transport_header() in skb_checksum_setup_ip() after each of
>> the skb_maybe_pull_tail() functions. But skb_partial_csum_set() already
>> calls skb_set_transport_header(), so I now think things ought to be fine
>> without any change. Can you clarify what you think is missing? Or is
>> this an issue in just the old (3.8.x) kernel you're using?
>
> I think this is the follow-up thread
>
> 20150512013424.ga7...@oracle.com
>
> And the conclusion is 3.8 is too old so the fix is not there.

Ah, right. Seems like I failed to remove the earlier mail from my list of things to look at when this came through. Sorry for the noise then.

Jan
Re: [PATCH iproute2] ss: Fix allocation of cong control alg name
On Fri, 2015-05-29 at 13:30 +0300, Vadim Kochan wrote:
> From: Vadim Kochan vadi...@gmail.com
>
> Use strdup instead of malloc, and get rid of bad strcpy.
>
> Signed-off-by: Vadim Kochan vadi...@gmail.com
> ---
>  misc/ss.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/misc/ss.c b/misc/ss.c
> index 347e3a1..a719466 100644
> --- a/misc/ss.c
> +++ b/misc/ss.c
> @@ -1908,8 +1908,7 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r,
>
>  		if (tb[INET_DIAG_CONG]) {
>  			const char *cong_attr = rta_getattr_str(tb[INET_DIAG_CONG]);
> -			s.cong_alg = malloc(strlen(cong_attr + 1));
> -			strcpy(s.cong_alg, cong_attr);
> +			s.cong_alg = strdup(cong_attr);
>  		}
>
>  		if (TCPI_HAS_OPT(info, TCPI_OPT_WSCALE)) {

I doubt TCP_CA_NAME_MAX will ever change in the kernel: 16 bytes.

It's typically "cubic" and less than 8 bytes.

Using 8 bytes to point to a malloc(8) is a waste.

Please remove the memory allocation, or store the pointer, since tcp_show_info() does the malloc()/free() before return.

diff --git a/misc/ss.c b/misc/ss.c
index 347e3a1..9fe229f 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -755,7 +755,7 @@ struct tcpstat
 	int		timer;
 	int		timeout;
 	int		probes;
-	char		*cong_alg;
+	char		cong_alg[16];
 	double		rto, ato, rtt, rttvar;
 	int		qack, cwnd, ssthresh, backoff;
 	double		send_bps;
@@ -1664,7 +1664,7 @@ static void tcp_stats_print(struct tcpstat *s)
 		printf(" ecnseen");
 	if (s->has_fastopen_opt)
 		printf(" fastopen");
-	if (s->cong_alg)
+	if (s->cong_alg[0])
 		printf(" %s", s->cong_alg);
 	if (s->has_wscale_opt)
 		printf(" wscale:%d,%d", s->snd_wscale, s->rcv_wscale);
@@ -1906,11 +1906,10 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r,
 			s.has_fastopen_opt = TCPI_HAS_OPT(info, TCPI_OPT_SYN_DATA);
 		}
 
-		if (tb[INET_DIAG_CONG]) {
-			const char *cong_attr = rta_getattr_str(tb[INET_DIAG_CONG]);
-			s.cong_alg = malloc(strlen(cong_attr + 1));
-			strcpy(s.cong_alg, cong_attr);
-		}
+		if (tb[INET_DIAG_CONG])
+			strncpy(s.cong_alg,
+				rta_getattr_str(tb[INET_DIAG_CONG]),
+				sizeof(s.cong_alg) - 1);
 
 		if (TCPI_HAS_OPT(info, TCPI_OPT_WSCALE)) {
 			s.has_wscale_opt = true;
@@ -1984,8 +1983,6 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r,
 		tcp_stats_print(&s);
 		if (s.dctcp)
 			free(s.dctcp);
-		if (s.cong_alg)
-			free(s.cong_alg);
 	}
 }
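The off-by-one in the removed lines is easy to miss: `strlen(cong_attr + 1)` measures the string starting at its second character, so for a non-empty name the allocation is two bytes short of what `strcpy()` needs. Below is a minimal stand-alone sketch of the two fixes discussed in this thread. `struct stat_sketch`, `CA_NAME_MAX` and both helper names are hypothetical, and 16 is simply the TCP_CA_NAME_MAX value quoted above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CA_NAME_MAX 16	/* assumed to match the kernel's TCP_CA_NAME_MAX */

struct stat_sketch {
	char cong_alg[CA_NAME_MAX];	/* fixed buffer, no malloc()/free() */
};

/* Fix 1: allocate strlen(name) + 1 bytes and copy, which is what
 * strdup() does. The caller owns the result and must free() it. */
static char *copy_name_heap(const char *name)
{
	size_t len;
	char *copy;

	assert(name != NULL);
	len = strlen(name) + 1;		/* "+ 1" goes outside strlen() */
	copy = malloc(len);
	if (copy)
		memcpy(copy, name, len);
	return copy;
}

/* Fix 2: copy into the fixed buffer, truncating if needed and always
 * NUL-terminating (strncpy() alone does not terminate on truncation). */
static void copy_name_fixed(struct stat_sketch *s, const char *name)
{
	assert(s != NULL && name != NULL);
	strncpy(s->cong_alg, name, sizeof(s->cong_alg) - 1);
	s->cong_alg[sizeof(s->cong_alg) - 1] = '\0';
}
```

The strncpy() call in the proposed diff has no explicit terminator, so it appears to rely on the struct having been zeroed beforehand; the sketch writes the terminator itself to avoid that assumption.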
[PATCH 15/16] batman-adv: iv_ogm_orig_update, remove unnecessary brackets
From: Markus Pargmann m...@pengutronix.de

Remove these unnecessary brackets inside a condition.

Signed-off-by: Markus Pargmann m...@pengutronix.de
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/bat_iv_ogm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index ce04cd7..c5ba7a7 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -1081,7 +1081,7 @@ batadv_iv_ogm_orig_update(struct batadv_priv *bat_priv,
 		 * won't consider it either
 		 */
 		if (router_ifinfo &&
-		    (neigh_ifinfo->bat_iv.tq_avg == router_ifinfo->bat_iv.tq_avg)) {
+		    neigh_ifinfo->bat_iv.tq_avg == router_ifinfo->bat_iv.tq_avg) {
 			orig_node_tmp = router->orig_node;
 			spin_lock_bh(&orig_node_tmp->bat_iv.ogm_cnt_lock);
 			if_num = router->if_incoming->if_num;
--
2.4.2
[PATCH 10/16] batman-adv: iv_ogm_send_to_if, declare char* as const
From: Markus Pargmann m...@pengutronix.de

This string pointer is later assigned to a constant string, so it should be defined constant at the beginning.

Signed-off-by: Markus Pargmann m...@pengutronix.de
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/bat_iv_ogm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 0015138..f9c5610 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -409,7 +409,7 @@ static void batadv_iv_ogm_send_to_if(struct batadv_forw_packet *forw_packet,
 				     struct batadv_hard_iface *hard_iface)
 {
 	struct batadv_priv *bat_priv = netdev_priv(hard_iface->soft_iface);
-	char *fwd_str;
+	const char *fwd_str;
 	uint8_t packet_num;
 	int16_t buff_pos;
 	struct batadv_ogm_packet *batadv_ogm_packet;
--
2.4.2
[PATCH 09/16] batman-adv: iv_ogm_aggr_packet, bool return value
From: Markus Pargmann m...@pengutronix.de

This function returns bool values, so it should be defined to return them instead of the whole int range.

Signed-off-by: Markus Pargmann m...@pengutronix.de
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/bat_iv_ogm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 123aabb..0015138 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -392,8 +392,8 @@ static uint8_t batadv_hop_penalty(uint8_t tq,
 }
 
 /* is there another aggregated packet here? */
-static int batadv_iv_ogm_aggr_packet(int buff_pos, int packet_len,
-				     __be16 tvlv_len)
+static bool batadv_iv_ogm_aggr_packet(int buff_pos, int packet_len,
+				      __be16 tvlv_len)
 {
 	int next_buff_pos = 0;
 
--
2.4.2
[PATCH 13/16] batman-adv: checkpatch - spaces preferred around that '*'
From: Marek Lindner mareklind...@neomailbox.ch

Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/main.h           | 2 +-
 net/batman-adv/network-coding.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h
index 11d2d9f..026ba37 100644
--- a/net/batman-adv/main.h
+++ b/net/batman-adv/main.h
@@ -44,7 +44,7 @@
 #define BATADV_TT_CLIENT_TEMP_TIMEOUT 600000 /* in milliseconds */
 #define BATADV_TT_WORK_PERIOD 5000 /* 5 seconds */
 #define BATADV_ORIG_WORK_PERIOD 1000 /* 1 second */
-#define BATADV_DAT_ENTRY_TIMEOUT (5*60000) /* 5 mins in milliseconds */
+#define BATADV_DAT_ENTRY_TIMEOUT (5 * 60000) /* 5 mins in milliseconds */
 /* sliding packet range of received originator messages in sequence numbers
  * (should be a multiple of our word size)
  */
diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c
index 4cb70bb..b984bc4 100644
--- a/net/batman-adv/network-coding.c
+++ b/net/batman-adv/network-coding.c
@@ -275,7 +275,7 @@ static bool batadv_nc_to_purge_nc_path_decoding(struct batadv_priv *bat_priv,
 	 * max_buffer time
 	 */
 	return batadv_has_timed_out(nc_path->last_valid,
-				    bat_priv->nc.max_buffer_time*10);
+				    bat_priv->nc.max_buffer_time * 10);
 }
 
 /**
--
2.4.2
[PATCH 03/16] batman-adv: Check total_size when queueing fragments
From: Sven Eckelmann s...@narfation.org

The fragmentation code was replaced in 610bfc6bc99bc83680d190ebc69359a05fc7f605 ("batman-adv: Receive fragmented packets and merge") by an implementation which handles the queueing+merging of fragments based on their size and the total_size of the non-fragmented packet. This total_size is announced by each fragment. The new implementation doesn't check if the total_size information of the packets inside one chain is consistent.

This consistency check is recommended to allow using any of the packets in the queue to decide whether all fragments of a packet are received or not.

Signed-off-by: Sven Eckelmann s...@narfation.org
Acked-by: Martin Hundebøll mar...@hundeboll.net
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/fragmentation.c | 7 +--
 net/batman-adv/types.h         | 2 ++
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index af495cc..a9fc653 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -161,6 +161,7 @@ static bool batadv_frag_insert_packet(struct batadv_orig_node *orig_node,
 		hlist_add_head(&frag_entry_new->list, &chain->head);
 		chain->size = skb->len - hdr_size;
 		chain->timestamp = jiffies;
+		chain->total_size = ntohs(frag_packet->total_size);
 		ret = true;
 		goto out;
 	}
@@ -195,9 +196,11 @@ static bool batadv_frag_insert_packet(struct batadv_orig_node *orig_node,
 
 out:
 	if (chain->size > batadv_frag_size_limit() ||
-	    ntohs(frag_packet->total_size) > batadv_frag_size_limit()) {
+	    chain->total_size != ntohs(frag_packet->total_size) ||
+	    chain->total_size > batadv_frag_size_limit()) {
 		/* Clear chain if total size of either the list or the packet
-		 * exceeds the maximum size of one merged packet.
+		 * exceeds the maximum size of one merged packet. Don't allow
+		 * packets to have different total_size.
 		 */
 		batadv_frag_clear_chain(&chain->head);
 		chain->size = 0;
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index 28f2461..e95db42 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -132,6 +132,7 @@ struct batadv_orig_ifinfo {
  * @timestamp: time (jiffie) of last received fragment
  * @seqno: sequence number of the fragments in the list
  * @size: accumulated size of packets in list
+ * @total_size: expected size of the assembled packet
  */
 struct batadv_frag_table_entry {
 	struct hlist_head head;
@@ -139,6 +140,7 @@ struct batadv_frag_table_entry {
 	unsigned long timestamp;
 	uint16_t seqno;
 	uint16_t size;
+	uint16_t total_size;
 };
 
 /**
--
2.4.2
[PATCH 08/16] batman-adv: iv_ogm_iface_enable, direct return values
From: Markus Pargmann m...@pengutronix.de

Directly return error values. No need to use a return variable.

Signed-off-by: Markus Pargmann m...@pengutronix.de
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/bat_iv_ogm.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index fc299c0..123aabb 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -308,7 +308,6 @@ static int batadv_iv_ogm_iface_enable(struct batadv_hard_iface *hard_iface)
 	struct batadv_ogm_packet *batadv_ogm_packet;
 	unsigned char *ogm_buff;
 	uint32_t random_seqno;
-	int res = -ENOMEM;
 
 	/* randomize initial seqno to avoid collision */
 	get_random_bytes(&random_seqno, sizeof(random_seqno));
@@ -317,7 +316,7 @@ static int batadv_iv_ogm_iface_enable(struct batadv_hard_iface *hard_iface)
 	hard_iface->bat_iv.ogm_buff_len = BATADV_OGM_HLEN;
 	ogm_buff = kmalloc(hard_iface->bat_iv.ogm_buff_len, GFP_ATOMIC);
 	if (!ogm_buff)
-		goto out;
+		return -ENOMEM;
 
 	hard_iface->bat_iv.ogm_buff = ogm_buff;
 
@@ -329,10 +328,7 @@ static int batadv_iv_ogm_iface_enable(struct batadv_hard_iface *hard_iface)
 	batadv_ogm_packet->reserved = 0;
 	batadv_ogm_packet->tq = BATADV_TQ_MAX_VALUE;
 
-	res = 0;
-
-out:
-	return res;
+	return 0;
 }
 
 static void batadv_iv_ogm_iface_disable(struct batadv_hard_iface *hard_iface)
--
2.4.2
[PATCH net-next 02/14] sfc: Add sysfs entry for flags (link control and primary)
On every adapter there will be one primary PF per adaptor and one link control PF per port. Signed-off-by: Shradha Shah ss...@solarflare.com --- drivers/net/ethernet/sfc/ef10.c | 62 ++--- 1 file changed, 52 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c index ee20d96..ebdf6ee 100644 --- a/drivers/net/ethernet/sfc/ef10.c +++ b/drivers/net/ethernet/sfc/ef10.c @@ -255,8 +255,34 @@ static ssize_t efx_ef10_show_physical_port(struct device *dev, return sprintf(buf, %d\n, efx-port_num); } -static DEVICE_ATTR(physical_port, 0444, efx_ef10_show_physical_port, +static ssize_t efx_ef10_show_link_control_flag(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct efx_nic *efx = pci_get_drvdata(to_pci_dev(dev)); + + return sprintf(buf, %d\n, + ((efx-mcdi-fn_flags) + (1 MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_LINKCTRL)) + ? 1 : 0); +} + +static ssize_t efx_ef10_show_primary_flag(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct efx_nic *efx = pci_get_drvdata(to_pci_dev(dev)); + + return sprintf(buf, %d\n, + ((efx-mcdi-fn_flags) + (1 MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_PRIMARY)) + ? 
1 : 0); +} + +static DEVICE_ATTR(physical_port, 0444, efx_ef10_show_physical_port, NULL); +static DEVICE_ATTR(link_control_flag, 0444, efx_ef10_show_link_control_flag, NULL); +static DEVICE_ATTR(primary_flag, 0444, efx_ef10_show_primary_flag, NULL); static int efx_ef10_probe(struct efx_nic *efx) { @@ -323,32 +349,41 @@ static int efx_ef10_probe(struct efx_nic *efx) if (rc) goto fail3; - rc = efx_ef10_get_pf_index(efx); + rc = device_create_file(efx-pci_dev-dev, + dev_attr_link_control_flag); if (rc) goto fail3; + rc = device_create_file(efx-pci_dev-dev, dev_attr_primary_flag); + if (rc) + goto fail4; + + rc = efx_ef10_get_pf_index(efx); + if (rc) + goto fail5; + rc = efx_ef10_init_datapath_caps(efx); if (rc 0) - goto fail3; + goto fail5; efx-rx_packet_len_offset = ES_DZ_RX_PREFIX_PKTLEN_OFST - ES_DZ_RX_PREFIX_SIZE; rc = efx_mcdi_port_get_number(efx); if (rc 0) - goto fail3; + goto fail5; efx-port_num = rc; rc = device_create_file(efx-pci_dev-dev, dev_attr_physical_port); if (rc) - goto fail3; + goto fail5; rc = efx-type-get_mac_address(efx, efx-net_dev-perm_addr); if (rc) - goto fail4; + goto fail6; rc = efx_ef10_get_sysclk_freq(efx); if (rc 0) - goto fail4; + goto fail6; efx-timer_quantum_ns = 1536000 / rc; /* 1536 cycles */ @@ -368,7 +403,7 @@ static int efx_ef10_probe(struct efx_nic *efx) nic_data-workaround_35388 = enabled MC_CMD_GET_WORKAROUNDS_OUT_BUG35388; } else if (rc != -ENOSYS rc != -ENOENT) { - goto fail4; + goto fail6; } netif_dbg(efx, probe, efx-net_dev, @@ -377,14 +412,18 @@ static int efx_ef10_probe(struct efx_nic *efx) rc = efx_mcdi_mon_probe(efx); if (rc rc != -EPERM) - goto fail4; + goto fail6; efx_ptp_probe(efx, NULL); return 0; -fail4: +fail6: device_remove_file(efx-pci_dev-dev, dev_attr_physical_port); +fail5: + device_remove_file(efx-pci_dev-dev, dev_attr_primary_flag); +fail4: + device_remove_file(efx-pci_dev-dev, dev_attr_link_control_flag); fail3: efx_mcdi_fini(efx); fail2: @@ -629,6 +668,9 @@ static void efx_ef10_remove(struct efx_nic 
*efx) device_remove_file(efx-pci_dev-dev, dev_attr_physical_port); + device_remove_file(efx-pci_dev-dev, dev_attr_primary_flag); + device_remove_file(efx-pci_dev-dev, dev_attr_link_control_flag); + efx_mcdi_fini(efx); efx_nic_free_buffer(efx, nic_data-mcdi_buf); kfree(nic_data);
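The renumbered fail labels in this patch follow the usual unwind-ladder idiom: each resource acquired in the probe path gets a label that releases it, and a failure jumps to the label that undoes only what has been set up so far. A small stand-alone sketch of that idiom follows. `create_attr()` and `remove_attr()` are hypothetical helpers standing in for device_create_file() and device_remove_file(), and `fail_at` exists only so the failure paths can be exercised.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_ATTRS 3

static bool attr_present[NUM_ATTRS];	/* which attributes currently exist */
static int fail_at = -1;		/* attribute index to fail on, -1 = never */

/* Hypothetical stand-in for device_create_file(): 0 on success. */
static int create_attr(int idx)
{
	assert(idx >= 0 && idx < NUM_ATTRS);
	if (idx == fail_at)
		return -1;
	attr_present[idx] = true;
	return 0;
}

/* Hypothetical stand-in for device_remove_file(). */
static void remove_attr(int idx)
{
	assert(idx >= 0 && idx < NUM_ATTRS);
	attr_present[idx] = false;
}

/* Create three attributes; on failure undo only what was created,
 * in reverse order of creation. */
static int probe_sketch(void)
{
	int rc;

	rc = create_attr(0);
	if (rc)
		goto fail0;
	rc = create_attr(1);
	if (rc)
		goto fail1;
	rc = create_attr(2);
	if (rc)
		goto fail2;
	return 0;

fail2:
	remove_attr(1);
fail1:
	remove_attr(0);
fail0:
	return rc;
}
```

The cost of numbered labels, visible in the diff, is that adding a resource in the middle of the sequence means renumbering every later label and re-checking each `goto`.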
[PATCH net-next 03/14] sfc: Implement ndo_gets_phys_port_id() for EF10 VFs
Signed-off-by: Shradha Shah ss...@solarflare.com --- drivers/net/ethernet/sfc/ef10.c | 11 +++ drivers/net/ethernet/sfc/ef10_sriov.c | 14 ++ drivers/net/ethernet/sfc/ef10_sriov.h | 3 +++ drivers/net/ethernet/sfc/efx.c| 1 + drivers/net/ethernet/sfc/net_driver.h | 2 ++ drivers/net/ethernet/sfc/nic.h| 1 + drivers/net/ethernet/sfc/sriov.c | 11 +++ drivers/net/ethernet/sfc/sriov.h | 2 ++ 8 files changed, 45 insertions(+) diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c index ebdf6ee..5c9576d 100644 --- a/drivers/net/ethernet/sfc/ef10.c +++ b/drivers/net/ethernet/sfc/ef10.c @@ -416,6 +416,16 @@ static int efx_ef10_probe(struct efx_nic *efx) efx_ptp_probe(efx, NULL); +#ifdef CONFIG_SFC_SRIOV + if ((efx-pci_dev-physfn) (!efx-pci_dev-is_physfn)) { + struct pci_dev *pci_dev_pf = efx-pci_dev-physfn; + struct efx_nic *efx_pf = pci_get_drvdata(pci_dev_pf); + + efx_pf-type-get_mac_address(efx_pf, nic_data-port_id); + } else +#endif + ether_addr_copy(nic_data-port_id, efx-net_dev-perm_addr); + return 0; fail6: @@ -4154,6 +4164,7 @@ const struct efx_nic_type efx_hunt_a0_vf_nic_type = { .vswitching_probe = efx_ef10_vswitching_probe_vf, .vswitching_restore = efx_ef10_vswitching_restore_vf, .vswitching_remove = efx_ef10_vswitching_remove_vf, + .sriov_get_phys_port_id = efx_ef10_sriov_get_phys_port_id, #endif .get_mac_address = efx_ef10_get_mac_address_vf, .set_mac_address = efx_ef10_set_mac_address, diff --git a/drivers/net/ethernet/sfc/ef10_sriov.c b/drivers/net/ethernet/sfc/ef10_sriov.c index 3969b1b..cd52454 100644 --- a/drivers/net/ethernet/sfc/ef10_sriov.c +++ b/drivers/net/ethernet/sfc/ef10_sriov.c @@ -736,3 +736,17 @@ int efx_ef10_sriov_get_vf_config(struct efx_nic *efx, int vf_i, return 0; } + +int efx_ef10_sriov_get_phys_port_id(struct efx_nic *efx, + struct netdev_phys_item_id *ppid) +{ + struct efx_ef10_nic_data *nic_data = efx-nic_data; + + if (!is_valid_ether_addr(nic_data-port_id)) + return -EOPNOTSUPP; + + ppid-id_len = ETH_ALEN; + 
memcpy(ppid-id, nic_data-port_id, ppid-id_len); + + return 0; +} diff --git a/drivers/net/ethernet/sfc/ef10_sriov.h b/drivers/net/ethernet/sfc/ef10_sriov.h index b985576..ffc92a5 100644 --- a/drivers/net/ethernet/sfc/ef10_sriov.h +++ b/drivers/net/ethernet/sfc/ef10_sriov.h @@ -54,6 +54,9 @@ int efx_ef10_sriov_get_vf_config(struct efx_nic *efx, int vf_i, int efx_ef10_sriov_set_vf_link_state(struct efx_nic *efx, int vf_i, int link_state); +int efx_ef10_sriov_get_phys_port_id(struct efx_nic *efx, + struct netdev_phys_item_id *ppid); + int efx_ef10_vswitching_probe_pf(struct efx_nic *efx); int efx_ef10_vswitching_probe_vf(struct efx_nic *efx); int efx_ef10_vswitching_restore_pf(struct efx_nic *efx); diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c index 9eafa39..fe3481c 100644 --- a/drivers/net/ethernet/sfc/efx.c +++ b/drivers/net/ethernet/sfc/efx.c @@ -2282,6 +2282,7 @@ static const struct net_device_ops efx_netdev_ops = { .ndo_set_vf_spoofchk= efx_sriov_set_vf_spoofchk, .ndo_get_vf_config = efx_sriov_get_vf_config, .ndo_set_vf_link_state = efx_sriov_set_vf_link_state, + .ndo_get_phys_port_id = efx_sriov_get_phys_port_id, #endif #ifdef CONFIG_NET_POLL_CONTROLLER .ndo_poll_controller = efx_netpoll, diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h index a468a22..d72f522 100644 --- a/drivers/net/ethernet/sfc/net_driver.h +++ b/drivers/net/ethernet/sfc/net_driver.h @@ -1350,6 +1350,8 @@ struct efx_nic_type { struct ifla_vf_info *ivi); int (*sriov_set_vf_link_state)(struct efx_nic *efx, int vf_i, int link_state); + int (*sriov_get_phys_port_id)(struct efx_nic *efx, + struct netdev_phys_item_id *ppid); int (*vswitching_probe)(struct efx_nic *efx); int (*vswitching_restore)(struct efx_nic *efx); void (*vswitching_remove)(struct efx_nic *efx); diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h index db8562e..e146e30 100644 --- a/drivers/net/ethernet/sfc/nic.h +++ 
b/drivers/net/ethernet/sfc/nic.h @@ -524,6 +524,7 @@ struct efx_ef10_nic_data { unsigned int vport_id; bool must_probe_vswitching; unsigned int pf_index; + u8 port_id[ETH_ALEN]; #ifdef CONFIG_SFC_SRIOV unsigned int vf_index; struct ef10_vf *vf; diff --git a/drivers/net/ethernet/sfc/sriov.c b/drivers/net/ethernet/sfc/sriov.c index 6c5edbd..816c446 100644 --- a/drivers/net/ethernet/sfc/sriov.c
[PATCH net-next 05/14] sfc: set the port-id when calling MC_CMD_MAC_STATS
From: Daniel Pieczko dpiec...@solarflare.com

The port-id must be known so that the RMON level can be set for the collection of vadapter stats.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/mcdi_port.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/sfc/mcdi_port.c b/drivers/net/ethernet/sfc/mcdi_port.c
index 9bf04cb..fffc348 100644
--- a/drivers/net/ethernet/sfc/mcdi_port.c
+++ b/drivers/net/ethernet/sfc/mcdi_port.c
@@ -924,6 +924,7 @@ enum efx_stats_action {
 static int efx_mcdi_mac_stats(struct efx_nic *efx,
 			      enum efx_stats_action action, int clear)
 {
+	struct efx_ef10_nic_data *nic_data = efx->nic_data;
 	MCDI_DECLARE_BUF(inbuf, MC_CMD_MAC_STATS_IN_LEN);
 	int rc;
 	int change = action == EFX_STATS_PULL ? 0 : 1;
@@ -945,6 +946,7 @@ static int efx_mcdi_mac_stats(struct efx_nic *efx,
 			      MAC_STATS_IN_PERIODIC_NOEVENT, 1,
 			      MAC_STATS_IN_PERIOD_MS, period);
 	MCDI_SET_DWORD(inbuf, MAC_STATS_IN_DMA_LEN, dma_len);
+	MCDI_SET_DWORD(inbuf, MAC_STATS_IN_PORT_ID, nic_data->vport_id);
 
 	rc = efx_mcdi_rpc(efx, MC_CMD_MAC_STATS, inbuf, sizeof(inbuf),
 			  NULL, 0, NULL);
Re: [RFC][PATCH] x86: remove vmalloc.h from asm/io.h
Hi Ingo,

On Fri, 29 May 2015 11:21:05 +0200 Ingo Molnar mi...@kernel.org wrote:
>
> Good idea.
>
> Acked-by: Ingo Molnar mi...@kernel.org

Thanks.

> Please also test x86 allnoconfig and defconfig 32/64, that tends to unearth
> the remaining places. People doing randconfig testing will find the rest.

Good idea. The allnoconfigs produced this further patch. I will squash it into the original. The defconfigs built ok.

From: Stephen Rothwell s...@canb.auug.org.au
Date: Fri, 29 May 2015 22:01:41 +1000
Subject: [PATCH] x86: more fixes for removing vmalloc.h from asm/io.h

Signed-off-by: Stephen Rothwell s...@canb.auug.org.au
---
 arch/x86/include/asm/io.h | 1 +
 include/linux/io.h        | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 5791e7ace9db..2a3543a4db1d 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -40,6 +40,7 @@
 #include <linux/compiler.h>
 #include <asm/page.h>
 #include <asm/early_ioremap.h>
+#include <asm/pgtable_types.h>
 
 #define build_mmio_read(name, size, type, reg, barrier) \
 static inline type name(const volatile void __iomem *addr) \
diff --git a/include/linux/io.h b/include/linux/io.h
index 986f2bffea1e..cb753a2450b8 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -19,6 +19,7 @@
 #define _LINUX_IO_H
 
 #include <linux/types.h>
+#include <linux/init.h>
 #include <asm/io.h>
 #include <asm/page.h>
 
--
2.1.4

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au
Re: [RFC][PATCH] x86: remove vmalloc.h from asm/io.h
At Fri, 29 May 2015 19:18:47 +1000, Stephen Rothwell wrote: Nothing in asm/io.h uses anything from vmalloc.h, so remove the include and fix up the build problems in an allmodconfig (64 bit and 32 bit) build. This may be the place where x86 builds get vmalloc.h implicitly included and that tends to hide places where vmalloc() et al are added to files but the include of vmalloc.h is forgotten. Cc: Thomas Gleixner t...@linutronix.de Cc: Ingo Molnar mi...@redhat.com Cc: H. Peter Anvin h...@zytor.com Cc: x...@kernel.org Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Boris Ostrovsky boris.ostrov...@oracle.com Cc: David Vrabel david.vra...@citrix.com Cc: Anton Vorontsov an...@enomsg.org Cc: Colin Cross ccr...@android.com Cc: Kees Cook keesc...@chromium.org Cc: Tony Luck tony.l...@intel.com Cc: Rafael J. Wysocki r...@rjwysocki.net Cc: Len Brown l...@kernel.org Cc: Kristen Carlson Accardi kris...@linux.intel.com Cc: Viresh Kumar viresh.ku...@linaro.org Cc: Vinod Koul vinod.k...@intel.com Cc: K. Y. Srinivasan k...@microsoft.com Cc: Haiyang Zhang haiya...@microsoft.com Cc: Hiral Patel hiral...@cisco.com Cc: Suma Ramars sram...@cisco.com Cc: Brian Uchino buch...@cisco.com Cc: James E.J. Bottomley jbottom...@odin.com Cc: Jaroslav Kysela pe...@perex.cz Cc: Takashi Iwai ti...@suse.de For the sound bits, Acked-by: Takashi Iwai ti...@suse.de thanks, Takashi Cc: Andrew Morton a...@linux-foundation.org Suggested-by: David Miller da...@davemloft.net Signed-off-by: Stephen Rothwell s...@canb.auug.org.au --- Based in Linus' tree of today. There are probably more places that need vmalloc.h included, but this passes 64 bit and 32 bit allmodconfig builds, so is a place to start. Dave Miller suggested that I start this journey. 
arch/x86/include/asm/io.h | 2 -- arch/x86/kernel/crash.c| 1 + arch/x86/kernel/machine_kexec_64.c | 1 + arch/x86/mm/pageattr-test.c| 1 + arch/x86/mm/pageattr.c | 1 + arch/x86/xen/p2m.c | 1 + drivers/acpi/apei/erst.c | 1 + drivers/cpufreq/intel_pstate.c | 1 + drivers/dma/mic_x100_dma.c | 1 + drivers/net/hyperv/netvsc.c| 1 + drivers/net/hyperv/rndis_filter.c | 1 + drivers/scsi/fnic/fnic_debugfs.c | 1 + drivers/scsi/fnic/fnic_trace.c | 1 + sound/pci/asihpi/hpioctl.c | 1 + 14 files changed, 13 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h index 34a5b93704d3..5791e7ace9db 100644 --- a/arch/x86/include/asm/io.h +++ b/arch/x86/include/asm/io.h @@ -197,8 +197,6 @@ extern void set_iounmap_nonlazy(void); #include asm-generic/iomap.h -#include linux/vmalloc.h - /* * Convert a virtual cached pointer to an uncached pointer */ diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c index c76d3e37c6e1..e068d6683dba 100644 --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -22,6 +22,7 @@ #include linux/elfcore.h #include linux/module.h #include linux/slab.h +#include linux/vmalloc.h #include asm/processor.h #include asm/hardirq.h diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 415480d3ea84..11546b462fa6 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -17,6 +17,7 @@ #include linux/ftrace.h #include linux/io.h #include linux/suspend.h +#include linux/vmalloc.h #include asm/init.h #include asm/pgtable.h diff --git a/arch/x86/mm/pageattr-test.c b/arch/x86/mm/pageattr-test.c index 6629f397b467..8ff686aa7e8c 100644 --- a/arch/x86/mm/pageattr-test.c +++ b/arch/x86/mm/pageattr-test.c @@ -9,6 +9,7 @@ #include linux/random.h #include linux/kernel.h #include linux/mm.h +#include linux/vmalloc.h #include asm/cacheflush.h #include asm/pgtable.h diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c index 89af288ec674..bedfc794b4ba 
100644 --- a/arch/x86/mm/pageattr.c +++ b/arch/x86/mm/pageattr.c @@ -14,6 +14,7 @@ #include linux/percpu.h #include linux/gfp.h #include linux/pci.h +#include linux/vmalloc.h #include asm/e820.h #include asm/processor.h diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c index b47124d4cd67..8b7f18e200aa 100644 --- a/arch/x86/xen/p2m.c +++ b/arch/x86/xen/p2m.c @@ -67,6 +67,7 @@ #include linux/seq_file.h #include linux/bootmem.h #include linux/slab.h +#include linux/vmalloc.h #include asm/cache.h #include asm/setup.h diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c index ed65e9c4b5b0..3670bbab57a3 100644 --- a/drivers/acpi/apei/erst.c +++ b/drivers/acpi/apei/erst.c @@ -35,6 +35,7 @@ #include linux/nmi.h #include linux/hardirq.h #include linux/pstore.h +#include linux/vmalloc.h #include acpi/apei.h #include apei-internal.h diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
Re: [PATCH iproute2] ss: Fix allocation of cong control alg name
On Fri, May 29, 2015 at 04:04:05AM -0700, Eric Dumazet wrote: On Fri, 2015-05-29 at 13:30 +0300, Vadim Kochan wrote: From: Vadim Kochan vadi...@gmail.com Use strdup instead of malloc, and get rid of bad strcpy. Signed-off-by: Vadim Kochan vadi...@gmail.com --- misc/ss.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/misc/ss.c b/misc/ss.c index 347e3a1..a719466 100644 --- a/misc/ss.c +++ b/misc/ss.c @@ -1908,8 +1908,7 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r, if (tb[INET_DIAG_CONG]) { const char *cong_attr = rta_getattr_str(tb[INET_DIAG_CONG]); - s.cong_alg = malloc(strlen(cong_attr + 1)); - strcpy(s.cong_alg, cong_attr); + s.cong_alg = strdup(cong_attr); } if (TCPI_HAS_OPT(info, TCPI_OPT_WSCALE)) { I doubt TCP_CA_NAME_MAX will ever change in the kernel : 16 bytes. Its typically cubic and less than 8 bytes. Using 8 bytes to point to a malloc(8) is a waste. Please remove the memory allocation, or store the pointer, since tcp_show_info() does the malloc()/free() before return. 
diff --git a/misc/ss.c b/misc/ss.c
index 347e3a1..9fe229f 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -755,7 +755,7 @@ struct tcpstat
 	int		    timer;
 	int		    timeout;
 	int		    probes;
-	char		    *cong_alg;
+	char		    cong_alg[16];
 	double		    rto, ato, rtt, rttvar;
 	int		    qack, cwnd, ssthresh, backoff;
 	double		    send_bps;
@@ -1664,7 +1664,7 @@ static void tcp_stats_print(struct tcpstat *s)
 		printf(" ecnseen");
 	if (s->has_fastopen_opt)
 		printf(" fastopen");
-	if (s->cong_alg)
+	if (s->cong_alg[0])
 		printf(" %s", s->cong_alg);
 	if (s->has_wscale_opt)
 		printf(" wscale:%d,%d", s->snd_wscale, s->rcv_wscale);
@@ -1906,11 +1906,10 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r,
 		s.has_fastopen_opt = TCPI_HAS_OPT(info, TCPI_OPT_SYN_DATA);
 	}
 
-	if (tb[INET_DIAG_CONG]) {
-		const char *cong_attr = rta_getattr_str(tb[INET_DIAG_CONG]);
-		s.cong_alg = malloc(strlen(cong_attr + 1));
-		strcpy(s.cong_alg, cong_attr);
-	}
+	if (tb[INET_DIAG_CONG])
+		strncpy(s.cong_alg,
+			rta_getattr_str(tb[INET_DIAG_CONG]),
+			sizeof(s.cong_alg) - 1);
 
 	if (TCPI_HAS_OPT(info, TCPI_OPT_WSCALE)) {
 		s.has_wscale_opt = true;
@@ -1984,8 +1983,6 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r,
 		tcp_stats_print(&s);
 		if (s.dctcp)
 			free(s.dctcp);
-		if (s.cong_alg)
-			free(s.cong_alg);
 	}
 }

Thanks!

Should I put you in the From tag or in Signed-off-by? Or might your diff be used directly from this email thread?
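The two points raised in this exchange can be sketched in plain userspace C. The original `malloc(strlen(cong_attr + 1))` measures the string starting one byte in, so it allocates two bytes fewer than `strcpy()` needs, and a bounded copy into a fixed array is only safe if the result is terminated. The helper below is hypothetical (`copy_cong_name` and `CONG_NAME_MAX` are not names from ss.c), and the 16-byte size is taken from the TCP_CA_NAME_MAX figure quoted in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: mirrors the 16-byte TCP_CA_NAME_MAX mentioned in the thread. */
#define CONG_NAME_MAX 16

/*
 * Hypothetical helper, not part of ss.c: copy a congestion control name
 * into a fixed-size buffer, truncating if needed and always terminating.
 */
void copy_cong_name(char dst[CONG_NAME_MAX], const char *src)
{
	size_t n;

	assert(dst != NULL && src != NULL);
	n = strlen(src);
	if (n > CONG_NAME_MAX - 1)
		n = CONG_NAME_MAX - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';	/* explicit terminator, so no reliance on a zeroed struct */
}
```

The `strncpy(..., sizeof(s.cong_alg) - 1)` form in the diff above reaches the same result only when the destination was zero-initialised beforehand; the explicit terminator here removes that dependency.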
[PATCH iproute2 -next] tc: {f,m}_bpf: add tail call support for parser
Kernel commit 04fd61ab36ec (bpf: allow bpf programs to tail-call other bpf programs) added support for tail calls, this patch here adds tc front end parts for the object parser to prepopulate a given eBPF prog array before the root prog is pushed down for classifier creation. The prepopulation works with any number of prog arrays in any dependencies, e.g. prog or normal maps could also be used from progs that are tail-called themself, etc. Signed-off-by: Daniel Borkmann dan...@iogearbox.net --- [ doc will follow later some time after -next branch has man page again. ] tc/tc_bpf.c | 139 1 file changed, 111 insertions(+), 28 deletions(-) diff --git a/tc/tc_bpf.c b/tc/tc_bpf.c index 59493df..276871a 100644 --- a/tc/tc_bpf.c +++ b/tc/tc_bpf.c @@ -16,6 +16,7 @@ #include unistd.h #include string.h #include stdbool.h +#include stdint.h #include errno.h #include fcntl.h #include stdarg.h @@ -266,6 +267,19 @@ static int bpf_create_map(enum bpf_map_type type, unsigned int size_key, return bpf(BPF_MAP_CREATE, attr, sizeof(attr)); } +static int bpf_update_map(int fd, const void *key, const void *value, + uint64_t flags) +{ + union bpf_attr attr = { + .map_fd = fd, + .key= bpf_ptr_to_u64(key), + .value = bpf_ptr_to_u64(value), + .flags = flags, + }; + + return bpf(BPF_MAP_UPDATE_ELEM, attr, sizeof(attr)); +} + static int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns, unsigned int len, const char *license) { @@ -282,15 +296,17 @@ static int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns, return bpf(BPF_PROG_LOAD, attr, sizeof(attr)); } -static int bpf_prog_attach(enum bpf_prog_type type, const struct bpf_insn *insns, - unsigned int size, const char *license) +static int bpf_prog_attach(enum bpf_prog_type type, const char *sec, + const struct bpf_insn *insns, unsigned int size, + const char *license) { int prog_fd = bpf_prog_load(type, insns, size, license); if (prog_fd 0 || bpf_verbose) { - bpf_dump_error(%s: %s\n, prog_fd 0 ? 
+ bpf_dump_error(%s (section \'%s\'): %s\n, prog_fd 0 ? BPF program rejected : - BPF program verification, strerror(errno)); + BPF program verification, + sec, strerror(errno)); } return prog_fd; @@ -436,7 +452,7 @@ static int bpf_apply_relo_data(struct bpf_elf_sec_data *data_relo, } static int bpf_fetch_ancillary(int file_fd, Elf *elf_fd, GElf_Ehdr *elf_hdr, - bool *sec_seen, char *license, unsigned int lic_len, + bool *sec_done, char *license, unsigned int lic_len, Elf_Data **sym_tab) { int sec_index, ret = -1; @@ -462,23 +478,24 @@ static int bpf_fetch_ancillary(int file_fd, Elf *elf_fd, GElf_Ehdr *elf_hdr, maps_num = data_anc.sec_data-d_size / sizeof(*maps); memcpy(map_ent, maps, data_anc.sec_data-d_size); - sec_seen[sec_index] = true; ret = bpf_maps_attach(maps, maps_num); if (ret 0) return ret; + + sec_done[sec_index] = true; } /* Extract eBPF license. */ else if (!strcmp(data_anc.sec_name, ELF_SECTION_LICENSE)) { if (data_anc.sec_data-d_size lic_len) return -ENOMEM; - sec_seen[sec_index] = true; + sec_done[sec_index] = true; memcpy(license, data_anc.sec_data-d_buf, data_anc.sec_data-d_size); } /* Extract symbol table for relocations (map fd fixups). */ else if (data_anc.sec_hdr.sh_type == SHT_SYMTAB) { - sec_seen[sec_index] = true; + sec_done[sec_index] = true; *sym_tab = data_anc.sec_data; } } @@ -486,7 +503,7 @@ static int bpf_fetch_ancillary(int file_fd, Elf *elf_fd, GElf_Ehdr *elf_hdr, return ret; } -static int bpf_fetch_prog_relo(Elf *elf_fd, GElf_Ehdr *elf_hdr, bool *sec_seen, +static int bpf_fetch_prog_relo(Elf *elf_fd, GElf_Ehdr *elf_hdr, bool *sec_done, enum bpf_prog_type type, const char *sec, const char *license, Elf_Data *sym_tab) { @@ -511,25 +528,24 @@ static int bpf_fetch_prog_relo(Elf *elf_fd, GElf_Ehdr *elf_hdr, bool *sec_seen,
[PATCH iproute2] ss: do not blindly dump two families
From: Eric Dumazet eduma...@google.com

ss currently dumps IPv4 sockets, then IPv6 sockets from the kernel, even if -4 or -6 option was given. Filtering in user space then has to drop all sockets of wrong family. Such a waste of time...

Before:

$ time ss -tn -4 | wc -l
251659

real	0m1.241s
user	0m0.423s
sys	0m0.806s

After:

$ time ss -tn -4 | wc -l
251672

real	0m0.779s
user	0m0.412s
sys	0m0.386s

Signed-off-by: Eric Dumazet eduma...@google.com
---
 misc/ss.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/misc/ss.c b/misc/ss.c
index 347e3a1..4ef8fea 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -2209,6 +2209,8 @@ static int inet_show_netlink(struct filter *f, FILE *dump_fp, int protocol)
 		return -1;
 	rth.dump = MAGIC_SEQ;
 	rth.dump_fp = dump_fp;
+	if (preferred_family == PF_INET6)
+		family = PF_INET6;
 
 again:
 	if ((err = sockdiag_send(family, rth.fd, protocol, f)))
@@ -2221,7 +2223,7 @@ again:
 		}
 		goto Exit;
 	}
-	if (family == PF_INET) {
+	if (family == PF_INET && preferred_family != PF_INET) {
 		family = PF_INET6;
 		goto again;
 	}
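The family selection this patch introduces can be restated as a small pure function. This is only a sketch of the decision logic, assuming that an unset preference is represented by AF_UNSPEC; `families_to_dump` is a hypothetical name and does not exist in ss.c, which uses a `goto again` loop instead.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, not taken from ss.c: list the address families a
 * dump should request for a given preference (AF_UNSPEC means "both").
 * Returns the number of entries written to out[].
 */
size_t families_to_dump(int preferred_family, int out[2])
{
	size_t n = 0;

	assert(out != NULL);
	if (preferred_family != AF_INET6)
		out[n++] = AF_INET;	/* skipped only when -6 was given */
	if (preferred_family != AF_INET)
		out[n++] = AF_INET6;	/* skipped only when -4 was given */
	return n;
}
```

With -4 or -6 only one dump request is sent, which is consistent with the roughly halved system time in the measurements above.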
Re: [PATCH net-next 02/14] sfc: Add sysfs entry for flags (link control and primary)
On 29/05/15 11:48, David Laight wrote:
> From: Shradha Shah
>> Sent: 29 May 2015 11:01
>> On every adapter there will be one primary PF per adaptor and one link
>> control PF per port.
> ...
>> +	return sprintf(buf, "%d\n",
>> +		       ((efx->mcdi->fn_flags) &
>> +			(1 << MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_LINKCTRL))
>> +		       ? 1 : 0);
>
> Horrid expression. Why not:
> 	(efx->mcdi->fn_flags >> MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_LINKCTRL) & 1

I think the idea is that this is more explicit about what it's doing. It's a toss-up which is more readable / idiomatic; I prefer the OP version. (They probably compile to the same thing, though I haven't checked.)

> using sprintf() is also excessive. Maybe:
> 	*buf = '0' + (expression);
> 	return 1;

That loses the '\n'; it's annoying when you cat a file and it doesn't end in a '\n', because it gloms onto your shell prompt. sprintf isn't really that expensive, this isn't likely to be called very frequently.

> You may also need to check for buffer overrun.

In fact Documentation/filesystems/sysfs.txt says that show() should always use scnprintf() and that "The buffer will always be PAGE_SIZE bytes in length." So if we want to be consistent, it should be

	return scnprintf(buf, PAGE_SIZE, "%d\n", expression);

although it'd be rather surprising if either "0\n" or "1\n" were ever too big for PAGE_SIZE :grin:.
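For readers without a kernel tree at hand, the shape being argued over can be approximated in userspace. This is a sketch under stated assumptions: `snprintf()` stands in for the kernel's `scnprintf()` (the two differ in their return value when output is truncated), `show_flag` is a hypothetical name, and `EXAMPLE_FLAG_LINKCTRL` is an illustrative bit position, not the real MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_LINKCTRL value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative bit position only; the real MC_CMD_* value is not shown in the thread. */
#define EXAMPLE_FLAG_LINKCTRL 1

/*
 * Hypothetical userspace analogue of a sysfs show() callback: print one
 * flag bit as "0\n" or "1\n" into a bounded buffer.
 */
int show_flag(char *buf, size_t size, unsigned int fn_flags)
{
	assert(buf != NULL && size > 0);
	/* Shift-and-mask form suggested in the review. */
	return snprintf(buf, size, "%u\n",
			(fn_flags >> EXAMPLE_FLAG_LINKCTRL) & 1u);
}
```

Either spelling of the expression yields 0 or 1; the bounded print keeps the trailing newline that the one-character variant would drop.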
Re: [PATCH] sctp: fix ASCONF list handling
On Thu, May 28, 2015 at 11:46:29AM -0300, Marcelo Ricardo Leitner wrote: On Thu, May 28, 2015 at 10:27:32AM -0300, Marcelo Ricardo Leitner wrote: On Thu, May 28, 2015 at 08:17:27AM -0300, Marcelo Ricardo Leitner wrote: On Thu, May 28, 2015 at 06:15:11AM -0400, Neil Horman wrote: On Wed, May 27, 2015 at 09:52:17PM -0300, mleit...@redhat.com wrote: From: Marcelo Ricardo Leitner marcelo.leit...@gmail.com -auto_asconf_splist is per namespace and mangled by functions like sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization. Also, the call to inet_sk_copy_descendant() was backuping -auto_asconf_list through the copy but was not honoring -do_auto_asconf, which could lead to list corruption if it was different between both sockets. This commit thus fixes the list handling by adding a spinlock to protect against multiple writers and converts the list to be protected by RCU too, so that we don't have a lock inverstion issue at sctp_addr_wq_timeout_handler(). And as this list now uses RCU, we cannot do such backup and restore while copying descendant data anymore as readers may be traversing the list meanwhile. We fix this by simply ignoring/not copying those fields, placed at the end of struct sctp_sock, so we can just ignore it together with struct ipv6_pinfo data. For that we create sctp_copy_descendant() so we don't clutter inet_sk_copy_descendant() with SCTP info. Issue was found with a test application that kept flipping sysctl default_auto_asconf on and off. Fixes: 9f7d653b67ae (sctp: Add Auto-ASCONF support (core).) 
Signed-off-by: Marcelo Ricardo Leitner marcelo.leit...@gmail.com --- include/net/netns/sctp.h | 6 +- include/net/sctp/structs.h | 2 ++ net/sctp/protocol.c| 6 +- net/sctp/socket.c | 39 ++- 4 files changed, 38 insertions(+), 15 deletions(-) diff --git a/include/net/netns/sctp.h b/include/net/netns/sctp.h index 3573a81815ad9e0efb6ceb721eb066d3726419f0..e080bebb3147af39c8275261f57018eb01e917b0 100644 --- a/include/net/netns/sctp.h +++ b/include/net/netns/sctp.h @@ -30,12 +30,15 @@ struct netns_sctp { struct list_head local_addr_list; struct list_head addr_waitq; struct timer_list addr_wq_timer; - struct list_head auto_asconf_splist; + struct list_head __rcu auto_asconf_splist; You should use the addr_wq_lock here instead of creating a new lock, as thats already used to protect most accesses to the list you are concerned about. Ok, that works too. Though truthfully, that shouldn't be necessecary. The list in question is only read in one location and only written in one location. You can likely just rcu-ify, as the write side is in process context and protected by lock_sock. It should, it's not protected by lock_sock as this list resides in netns_sctp structure, which lock_sock doesn't cover. Write side is in process context yes, but this list is written in sctp_init_sock(), sctp_destroy_sock() and sctp_setsockopt_auto_asconf(), so one could trigger this by either creating/destroying sockets if default_auto_asconf=1 or just by creating a bunch of sockets and flipping asconf via setsockopt (or a combination of these operations). (I'll point this out in the changelog) Hmm.. by reusing addr_wq_lock we don't need to rcu-ify the list, as the reader is inside that lock too, so I can just protect auto_asconf_splist writers with addr_wq_lock. Nice, thanks Neil. Cannot really do that.. as that creates a lock inversion between sctp_destroy_sock() (which already holds lock_sock) and sctp_addr_wq_timeout_handler(), which first grabs addr_wq_lock and then locks socket by socket. 
Due to that, I'm afraid reusing this lock is not possible, and we should stick with the patch.. what do you think? (though I have to fix the nits in there)

I don't think that's accurate. You are correct in that the locks are taken in opposing order, which would imply a lock inversion that could result in deadlock, but we can avoid that by deferring the asconf list removal until after sk_common_release and unlock_sock_bh is called in sctp_close. That will make the lock ordering consistent. Alternatively, we can pre-emptively take the asconf_lock in sctp_close before locking the socket. I'd really rather avoid creating an additional lock here if we don't have to.

Neil
Re: [PATCH net-next] vlan: Add GRO support for non hardware accelerated vlan
On 15/05/28 (Thu) 21:02, Eric Dumazet wrote:
> On Thu, 2015-05-28 at 20:17 +0900, Toshiaki Makita wrote:
>> Currently packets with non-hardware-accelerated vlan cannot be handled
>> by GRO. This causes low performance for 802.1ad and stacked vlan, as their
>> vlan tags are currently not stripped by hardware.
>>
>> This patch adds GRO support for non-hardware-accelerated vlan and
>> improves receive performance of them.
>
> Very nice patch !
>
>> Signed-off-by: Toshiaki Makita makita.toshi...@lab.ntt.co.jp
>> ---
>>  net/8021q/vlan.c | 94
>>  1 file changed, 94 insertions(+)
>>
>> diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
>> index 59555f0..0a9e8e1 100644
>> --- a/net/8021q/vlan.c
>> +++ b/net/8021q/vlan.c
>> @@ -618,6 +618,90 @@ out:
>>  	return err;
>>  }
>>
>> +		vhdr2 = (struct vlan_hdr *)(p->data + off_vlan);
>> +		if (memcmp(vhdr, vhdr2, VLAN_HLEN))
>> +			NAPI_GRO_CB(p)->same_flow = 0;
>> +	}
>
> This memcmp() is quite expensive, you better use a helper like :
>
> /* vlan header only guaranteed to be 16bit aligned */
> static bool vlan_hdr_compare(const struct vlan_hdr *h1, const struct vlan_hdr *h2)
> {
> #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
> 	return *(u32 *)h1 != *(u32 *)h2;
> #else
> 	return (((__force u32)h1->h_vlan_TCI ^ (__force u32)h2->h_vlan_TCI) |
> 		((__force u32)h1->h_vlan_encapsulated_proto ^
> 		 (__force u32)h2->h_vlan_encapsulated_proto)) != 0;
> #endif
> }

Hi Eric,

Thank you for reviewing. Indeed, memcmp() is not good for performance. I'll include your feedback in v2.

Thanks,
Toshiaki Makita
Re: Fw: [Bug 99161] New: 2.6.32.66 PPC Oops in tcp_send_fin
Hi Eric,

On Fri, May 29, 2015 at 08:52:11AM -0700, Eric Dumazet wrote:
> On Fri, 2015-05-29 at 08:12 -0700, Stephen Hemminger wrote:
>> I think 2.6.32 is so old no one will care.

A few will still, but at least we must ensure the old guy finishes his days nicely :-)

> (...)
> I guess a backport went wrong.

Ah crap, sorry about that :-(

> Willy, please add following to your tree :
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 5339f066234b..d1e2895bb63c 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2136,7 +2136,7 @@ void tcp_send_fin(struct sock *sk)
>  	 */
>  	if (tskb && (tcp_send_head(sk) || tcp_memory_pressure)) {
>  coalesce:
> -		TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_FIN;
> +		TCP_SKB_CB(tskb)->flags |= TCPCB_FLAG_FIN;
>  		TCP_SKB_CB(tskb)->end_seq++;
>  		tp->write_seq++;
>  		if (!tcp_send_head(sk)) {

Thanks Eric, I'm queuing this now.

Best regards,
Willy
Re: Request for advice on where to put Root Complex fix up code for downstream device
Hi Casey,

Sorry, this one slipped through and I forgot to respond earlier.

On Thu, May 07, 2015 at 11:31:58PM +, Casey Leedom wrote:
> | From: Bjorn Helgaas [bhelg...@google.com]
> | Sent: Thursday, May 07, 2015 4:04 PM
> |
> | There are a lot of fixups in drivers/pci/quirks.c. For things that have to
> | be worked around either before a driver claims the device or if there is no
> | driver at all, the fixup *has* to go in drivers/pci/quirks.c
> |
> | But for things like this, where the problem can only occur after a driver
> | claims the device, I think it makes more sense to put the fixup in the
> | driver itself. The only wrinkle here is that the fixup has to be done on a
> | separate device, not the device claimed by the driver. But I think it
> | probably still makes sense to put this fixup in the driver.
>
> Okay, the example code that I provided (still quoted below) was indeed done as a fix within the cxgb4 Network Driver. I've also worked up a version as a PCI Quirk but if you and David Miller agree that the fixup code should go into cxgb4, I'm comfortable with that. I can also provide the example PCI Quirk code I worked up if you like.
>
> One complication to doing this in cxgb4 is that it attaches to Physical Function 4 of our T5 chip. Meanwhile, a completely separate storage driver, csiostor, connects to PF5 and PF6 and there's no requirement at all that cxgb4 be loaded. So if we go down the road of putting the fixup code in the cxgb4 driver, we'll also need to duplicate that code in the csiostor driver.

Sounds simpler to just put the quirk in drivers/pci/quirks.c.

> | +static void clear_root_complex_tlp_attributes(struct pci_dev *pdev)
> | +{
> | +	struct pci_bus *bus = pdev->bus;
> | +	struct pci_dev *highest_pcie_bridge = NULL;
> | +
> | +	while (bus) {
> | +		struct pci_dev *bridge = bus->self;
> | +
> | +		if (!bridge || !bridge->pcie_cap)
> | +			break;
> | +		highest_pcie_bridge = bridge;
> | +		bus = bus->parent;
> | +	}
> |
> | Can you use pci_upstream_bridge() here? There are a couple places where we
> | want to find the Root Port, so we might factor that out someday. It'll be
> | easier to find all those places if they use pci_upstream_bridge().
>
> It looks like pci_upstream_bridge() just traverses one link upstream toward the Root Complex? Or am I misunderstanding that function?

No, you're right. I was just trying to suggest using pci_upstream_bridge() instead of bus->parent->self in your loop. It wouldn't replace the loop completely.

Bjorn
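The suggestion at the end of this message (step upward one bridge at a time with a single-hop helper, but keep the loop) can be illustrated without kernel headers. This is a hedged sketch: `struct dev_example`, its `upstream` pointer and its `is_pcie` flag are hypothetical stand-ins for struct pci_dev, `pci_upstream_bridge()` and a non-zero `pcie_cap`; it is not the code that went into any driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: only what the walk needs. */
struct dev_example {
	struct dev_example *upstream;	/* NULL at the top of the hierarchy */
	bool is_pcie;			/* stands in for a non-zero pcie_cap */
};

/* Return the highest PCIe bridge above dev, or NULL if there is none. */
struct dev_example *highest_pcie_bridge(struct dev_example *dev)
{
	struct dev_example *highest = NULL;
	struct dev_example *bridge;

	assert(dev != NULL);
	/* One hop per iteration; stop at the top or at the first non-PCIe bridge. */
	for (bridge = dev->upstream; bridge && bridge->is_pcie;
	     bridge = bridge->upstream)
		highest = bridge;
	return highest;
}
```

The single-hop accessor replaces only the pointer chasing inside the loop; the loop itself remains, as the reply above notes.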
Re: netlink and user namespaces
Alexander Larsson al...@redhat.com writes:

> Now that I'm using a non-privileged user namespace for my desktop sandboxing system all kinds of network status things are breaking. The reason for this is that they use netlink to enumerate interfaces, and to verify that the replies are from the kernel (apparently anyone can send anyone netlink messages) this code is verifying that the SCM_CREDENTIALS sender of the netlink messages is uid 0. For instance:
> http://git.0pointer.net/avahi.git/commit/avahi-core/netlink.c?id=37b2be93e63ceff95698f24cd91cb11774eb621c
> and:
> https://git.gnome.org/browse/glib/tree/gio/gnetworkmonitornetlink.c#n340
>
> This obviously breaks when uid is not mapped (as it can't be in an unprivileged user namespace), as uid will be overflowuid.
>
> Is there any other way to check that a netlink message is from the kernel?

*scratches my head* Those are weird pieces of code.

The answer is the way you do this with any other socket. Call recvfrom or recvmsg and look at the sender's address. If in the sender's address nl_pid == 0 then the message is from the kernel. Otherwise the message is from userspace.

Looking anywhere else at anything else is bogus.

And a note. nl_pid is short for netlink port id. It is not a process id.

Eric
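The check described in this reply is small enough to sketch. The helper below is hypothetical (`netlink_sender_is_kernel` is not an existing API); it takes the sender address that `recvfrom()` or `recvmsg()` filled in and applies the `nl_pid == 0` test. The extra length and family checks are defensive additions of this sketch, not something the reply requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Hypothetical helper: decide whether a received netlink message came from
 * the kernel, given the sender address filled in by recvfrom()/recvmsg().
 */
bool netlink_sender_is_kernel(const struct sockaddr_nl *from, socklen_t len)
{
	assert(from != NULL);
	return len == sizeof(*from) &&		/* address was fully filled in */
	       from->nl_family == AF_NETLINK &&
	       from->nl_pid == 0;		/* port id 0 belongs to the kernel */
}
```

A caller would pass the `msg_name` buffer and `msg_namelen` from its `struct msghdr` after `recvmsg()` returns. Because the test does not involve uids, it behaves the same inside an unprivileged user namespace.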
Re: [RFC 1/3] net: dsa: add basic support for VLAN ndo
On Fri, May 29, 2015 at 6:38 PM, Vivien Didelot vivien.dide...@savoirfairelinux.com wrote:
> Hi,
>
> - On May 29, 2015, at 11:24 AM, Or Gerlitz gerlitz...@gmail.com wrote:
>> On Fri, May 29, 2015 at 12:37 AM, Vivien Didelot vivien.dide...@savoirfairelinux.com wrote:
>>> @@ -854,7 +922,9 @@ int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
>>>  	if (slave_dev == NULL)
>>>  		return -ENOMEM;
>>>
>>> -	slave_dev->features = master->vlan_features;
>>> +	slave_dev->features = master->vlan_features |
>>> +			      NETIF_F_VLAN_FEATURES |
>>> +			      NETIF_F_HW_SWITCH_OFFLOAD;
>>
>> wait... didn't commit 7889cbee8357aaed85898d028829dfb4f75bae2c remove NETIF_F_HW_SWITCH_OFFLOAD?
>
> Indeed, note that this RFC is based on v4.1-rc3. This will become unneeded I guess.

You should rebase networking patches proposed for the next kernel against the net-next tree.

> BTW, given the commit message, I didn't really understand why?

M2, I thought it was unsuccessful commit message and made a comment to the maintainer, he didn't accept it.

Or.
RE: [PATCH iproute2] ss: Fix allocation of cong control alg name
Hi Daniel and Vadim

Thanks for your prompt response and for the patch.

Also, what about the other one? Do you think it is an issue or not?

File: tc/tc_util.c
Function: void print_rate(char *buf, int len, __u64 rate)
Line: ~264

If the user inputs a high value for rate, the for loop will exit on its loop condition, meaning that variable i gets the value 5, which is an invalid index for the units array because that array has only 5 elements.

I know a very high value is invalid, but in the case that it comes directly from the user, it could cause an issue. What do you think?

Thanks,
-José R.

-Original Message-
From: Daniel Borkmann [mailto:dan...@iogearbox.net]
Sent: Friday, May 29, 2015 6:10 AM
To: Vadim Kochan
Cc: netdev@vger.kernel.org; Guzman Mosqueda, Jose R
Subject: Re: [PATCH iproute2] ss: Fix allocation of cong control alg name

Hi Vadim,

On 05/29/2015 12:30 PM, Vadim Kochan wrote:
> From: Vadim Kochan vadi...@gmail.com
>
> Use strdup instead of malloc, and get rid of bad strcpy.
>
> Signed-off-by: Vadim Kochan vadi...@gmail.com

Please also Cc the reporter (done here), and add a:

Fixes: 8250bc9ff4e5 ("ss: Unify inet sockets output")
Reported-by: Jose R. Guzman Mosqueda jose.r.guzman.mosqu...@intel.com

Fixes tag is _very useful_ for distros to easily identify if additional follow-up commits would be needed when backporting the original change. Then, this can be easily identified when going through the git log.

> ---
>  misc/ss.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/misc/ss.c b/misc/ss.c
> index 347e3a1..a719466 100644
> --- a/misc/ss.c
> +++ b/misc/ss.c
> @@ -1908,8 +1908,7 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r,
>  	if (tb[INET_DIAG_CONG]) {
>  		const char *cong_attr = rta_getattr_str(tb[INET_DIAG_CONG]);
> -		s.cong_alg = malloc(strlen(cong_attr + 1));
> -		strcpy(s.cong_alg, cong_attr);
> +		s.cong_alg = strdup(cong_attr);

strdup(3) can still return NULL.

>  	}
>
>  	if (TCPI_HAS_OPT(info, TCPI_OPT_WSCALE)) {
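The indexing concern raised at the top of this message (a unit-scaling loop that can run one step past a five-entry table) can be shown with a simplified sketch. This is not the tc/tc_util.c code: `format_rate`, `rate_units` and the plain truncating division by 1000 are assumptions made for illustration. The point is only the loop bound, which stops at the last valid index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static const char *const rate_units[] = { "", "K", "M", "G", "T" };
#define N_RATE_UNITS (sizeof(rate_units) / sizeof(rate_units[0]))

/*
 * Hypothetical, simplified formatter: scale a bit rate by 1000 until it is
 * below 1000 or the largest unit is reached, then print it with that unit.
 */
int format_rate(char *buf, size_t len, uint64_t rate_bits)
{
	size_t i = 0;

	assert(buf != NULL && len > 0);
	/* The bound is N_RATE_UNITS - 1, so i never leaves the table. */
	while (i < N_RATE_UNITS - 1 && rate_bits >= 1000) {
		rate_bits /= 1000;
		i++;
	}
	return snprintf(buf, len, "%llu%sbit",
			(unsigned long long)rate_bits, rate_units[i]);
}
```

With this bound, even the largest 64-bit input is printed with the "T" suffix and a large mantissa instead of reading past the end of the array.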
[PATCH net-next v2 2/3] net: systemport: rewrite bcm_sysport_rx_refill
Currently, bcm_sysport_desc_rx() calls bcm_sysport_rx_refill() at the end of Rx packet processing loop, after the current Rx packet has already been passed to napi_gro_receive(). However, bcm_sysport_rx_refill() might fail to allocate a new Rx skb, thus leaving a hole on the Rx queue where no valid Rx buffer exists. To eliminate this situation: 1. Rewrite bcm_sysport_rx_refill() to retain the current Rx skb on the Rx queue if a new replacement Rx skb can't be allocated and DMA-mapped. In this case, the data on the current Rx skb is effectively dropped. 2. Modify bcm_sysport_desc_rx() to call bcm_sysport_rx_refill() at the top of Rx packet processing loop, so that the new replacement Rx skb is already in place before the current Rx skb is processed. This is loosely inspired from d6707bec5986 (net: bcmgenet: rewrite bcmgenet_rx_refill()) Signed-off-by: Florian Fainelli f.faine...@gmail.com --- drivers/net/ethernet/broadcom/bcmsysport.c | 87 ++ 1 file changed, 41 insertions(+), 46 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c index 267330ccd595..62ea403e15b8 100644 --- a/drivers/net/ethernet/broadcom/bcmsysport.c +++ b/drivers/net/ethernet/broadcom/bcmsysport.c @@ -524,62 +524,70 @@ static void bcm_sysport_free_cb(struct bcm_sysport_cb *cb) dma_unmap_addr_set(cb, dma_addr, 0); } -static int bcm_sysport_rx_refill(struct bcm_sysport_priv *priv, -struct bcm_sysport_cb *cb) +static struct sk_buff *bcm_sysport_rx_refill(struct bcm_sysport_priv *priv, +struct bcm_sysport_cb *cb) { struct device *kdev = priv-pdev-dev; struct net_device *ndev = priv-netdev; + struct sk_buff *skb, *rx_skb; dma_addr_t mapping; - int ret; - cb-skb = netdev_alloc_skb(priv-netdev, RX_BUF_LENGTH); - if (!cb-skb) { + /* Allocate a new SKB for a new packet */ + skb = netdev_alloc_skb(priv-netdev, RX_BUF_LENGTH); + if (!skb) { + priv-mib.alloc_rx_buff_failed++; netif_err(priv, rx_err, ndev, SKB alloc failed\n); - return -ENOMEM; + 
return NULL; } - mapping = dma_map_single(kdev, cb-skb-data, + mapping = dma_map_single(kdev, skb-data, RX_BUF_LENGTH, DMA_FROM_DEVICE); - ret = dma_mapping_error(kdev, mapping); - if (ret) { + if (dma_mapping_error(kdev, mapping)) { priv-mib.rx_dma_failed++; - bcm_sysport_free_cb(cb); + dev_kfree_skb_any(skb); netif_err(priv, rx_err, ndev, DMA mapping failure\n); - return ret; + return NULL; } + /* Grab the current SKB on the ring */ + rx_skb = cb-skb; + if (likely(rx_skb)) + dma_unmap_single(kdev, dma_unmap_addr(cb, dma_addr), +RX_BUF_LENGTH, DMA_FROM_DEVICE); + + /* Put the new SKB on the ring */ + cb-skb = skb; dma_unmap_addr_set(cb, dma_addr, mapping); dma_desc_set_addr(priv, cb-bd_addr, mapping); netif_dbg(priv, rx_status, ndev, RX refill\n); - return 0; + /* Return the current SKB to the caller */ + return rx_skb; } static int bcm_sysport_alloc_rx_bufs(struct bcm_sysport_priv *priv) { struct bcm_sysport_cb *cb; - int ret = 0; + struct sk_buff *skb; unsigned int i; for (i = 0; i priv-num_rx_bds; i++) { cb = priv-rx_cbs[i]; - if (cb-skb) - continue; - - ret = bcm_sysport_rx_refill(priv, cb); - if (ret) - break; + skb = bcm_sysport_rx_refill(priv, cb); + if (skb) + dev_kfree_skb(skb); + if (!cb-skb) + return -ENOMEM; } - return ret; + return 0; } /* Poll the hardware for up to budget packets to process */ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv, unsigned int budget) { - struct device *kdev = priv-pdev-dev; struct net_device *ndev = priv-netdev; unsigned int processed = 0, to_process; struct bcm_sysport_cb *cb; @@ -587,7 +595,6 @@ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv, unsigned int p_index; u16 len, status; struct bcm_rsb *rsb; - int ret; /* Determine how much we should process since last call */ p_index = rdma_readl(priv, RDMA_PROD_INDEX); @@ -605,29 +612,15 @@ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv, while ((processed to_process) (processed budget)) { cb = 
priv-rx_cbs[priv-rx_read_ptr]; - skb = cb-skb; -
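The refill strategy described in the commit message can be modelled in plain userspace C. This is only a sketch: demo_slot, demo_alloc(), demo_refill() and demo_fail_alloc are hypothetical names, malloc() stands in for netdev_alloc_skb(), and DMA mapping is left out. The point it illustrates is that the slot is only updated once a replacement buffer exists, so a failed allocation never leaves a hole on the ring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_BUF_LEN 2048	/* hypothetical buffer size */

struct demo_slot {
	void *buf;		/* buffer currently owned by the ring slot */
};

/* Hypothetical failure injection so the error path can be exercised. */
static int demo_fail_alloc;

static void *demo_alloc(void)
{
	return demo_fail_alloc ? NULL : malloc(DEMO_BUF_LEN);
}

/*
 * Try to put a fresh buffer in the slot.
 * Success: the previous buffer (NULL on the very first fill) is returned
 *          and now belongs to the caller.
 * Failure: NULL is returned and the slot keeps its current buffer, so the
 *          ring stays fully populated and the pending data is dropped.
 */
void *demo_refill(struct demo_slot *slot)
{
	void *fresh = demo_alloc();
	void *old;

	assert(slot != NULL);
	if (!fresh)
		return NULL;

	old = slot->buf;
	slot->buf = fresh;
	return old;
}
```

As in the patch, a NULL return is ambiguous between "first fill" and "allocation failed", which is why the initial fill loop checks the slot itself afterwards rather than the return value.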
Re: [PATCH] sctp: fix ASCONF list handling
On Fri, May 29, 2015 at 09:17:26AM -0400, Neil Horman wrote: On Thu, May 28, 2015 at 11:46:29AM -0300, Marcelo Ricardo Leitner wrote: On Thu, May 28, 2015 at 10:27:32AM -0300, Marcelo Ricardo Leitner wrote: On Thu, May 28, 2015 at 08:17:27AM -0300, Marcelo Ricardo Leitner wrote: On Thu, May 28, 2015 at 06:15:11AM -0400, Neil Horman wrote: On Wed, May 27, 2015 at 09:52:17PM -0300, mleit...@redhat.com wrote: From: Marcelo Ricardo Leitner marcelo.leit...@gmail.com -auto_asconf_splist is per namespace and mangled by functions like sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization. Also, the call to inet_sk_copy_descendant() was backuping -auto_asconf_list through the copy but was not honoring -do_auto_asconf, which could lead to list corruption if it was different between both sockets. This commit thus fixes the list handling by adding a spinlock to protect against multiple writers and converts the list to be protected by RCU too, so that we don't have a lock inverstion issue at sctp_addr_wq_timeout_handler(). And as this list now uses RCU, we cannot do such backup and restore while copying descendant data anymore as readers may be traversing the list meanwhile. We fix this by simply ignoring/not copying those fields, placed at the end of struct sctp_sock, so we can just ignore it together with struct ipv6_pinfo data. For that we create sctp_copy_descendant() so we don't clutter inet_sk_copy_descendant() with SCTP info. Issue was found with a test application that kept flipping sysctl default_auto_asconf on and off. Fixes: 9f7d653b67ae (sctp: Add Auto-ASCONF support (core).) 
Signed-off-by: Marcelo Ricardo Leitner marcelo.leit...@gmail.com --- include/net/netns/sctp.h | 6 +- include/net/sctp/structs.h | 2 ++ net/sctp/protocol.c| 6 +- net/sctp/socket.c | 39 ++- 4 files changed, 38 insertions(+), 15 deletions(-) diff --git a/include/net/netns/sctp.h b/include/net/netns/sctp.h index 3573a81815ad9e0efb6ceb721eb066d3726419f0..e080bebb3147af39c8275261f57018eb01e917b0 100644 --- a/include/net/netns/sctp.h +++ b/include/net/netns/sctp.h @@ -30,12 +30,15 @@ struct netns_sctp { struct list_head local_addr_list; struct list_head addr_waitq; struct timer_list addr_wq_timer; - struct list_head auto_asconf_splist; + struct list_head __rcu auto_asconf_splist; You should use the addr_wq_lock here instead of creating a new lock, as thats already used to protect most accesses to the list you are concerned about. Ok, that works too. Though truthfully, that shouldn't be necessecary. The list in question is only read in one location and only written in one location. You can likely just rcu-ify, as the write side is in process context and protected by lock_sock. It should, it's not protected by lock_sock as this list resides in netns_sctp structure, which lock_sock doesn't cover. Write side is in process context yes, but this list is written in sctp_init_sock(), sctp_destroy_sock() and sctp_setsockopt_auto_asconf(), so one could trigger this by either creating/destroying sockets if default_auto_asconf=1 or just by creating a bunch of sockets and flipping asconf via setsockopt (or a combination of these operations). (I'll point this out in the changelog) Hmm.. by reusing addr_wq_lock we don't need to rcu-ify the list, as the reader is inside that lock too, so I can just protect auto_asconf_splist writers with addr_wq_lock. Nice, thanks Neil. Cannot really do that.. as that creates a lock inversion between sctp_destroy_sock() (which already holds lock_sock) and sctp_addr_wq_timeout_handler(), which first grabs addr_wq_lock and then locks socket by socket. 
Due to that, I'm afraid reusing this lock is not possible, and we should stick with the patch.. what do you think? (though I have to fix the nits in there) I don't think thats accurate. You are correct in that the the locks are taken in opposing order, which would imply a lock inversion that could result in deadlock, but we can avoid that by deferring the asconf list removal until after sk_common_release and unlock_sock_bh is called in sctp_close. That will make the lock ordering consistent. Alternatively, we can pre-emptively take the asconf_lock in sctp_close before locking the socket. For your first approach, deferring the asconf list removal, we can only do that reliably via some work queue, because we initialize asconf stuff on sctp_init_sock() and it should be de-initialized
[PATCH net-next v2 0/3] net: systemport: misc. improvements
Hi David, These patches are highly inspired by changes from Petri on bcmgenet, last patch is a misc fix that I had pending for a while, but is not a candidate for 'net' at this point. Changes in v2: - added Petri's reviewed-by tag for patches 1 and 2 - reworked patch 2 to remove a now stale comment and use an unlikely optimization Thanks! Florian Fainelli (3): net: systemport: Pre-calculate and utilize cb-bd_addr net: systemport: rewrite bcm_sysport_rx_refill net: systemport: Add a check for oversized packets drivers/net/ethernet/broadcom/bcmsysport.c | 113 +++-- drivers/net/ethernet/broadcom/bcmsysport.h | 2 - 2 files changed, 58 insertions(+), 57 deletions(-) -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-next v2 1/3] net: systemport: Pre-calculate and utilize cb-bd_addr
There is a 1:1 mapping between the software-maintained control block in
priv->rx_cbs and the buffer address in priv->rx_bds, so there is no need
to keep computing the buffer address when refilling a control block.

Reviewed-by: Petri Gynther <pgynt...@google.com>
Signed-off-by: Florian Fainelli <f.faine...@gmail.com>
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 18 +++++++++---------
 drivers/net/ethernet/broadcom/bcmsysport.h |  2 --
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 084a50a555de..267330ccd595 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -549,12 +549,7 @@ static int bcm_sysport_rx_refill(struct bcm_sysport_priv *priv,
 	}
 
 	dma_unmap_addr_set(cb, dma_addr, mapping);
-	dma_desc_set_addr(priv, priv->rx_bd_assign_ptr, mapping);
-
-	priv->rx_bd_assign_index++;
-	priv->rx_bd_assign_index &= (priv->num_rx_bds - 1);
-	priv->rx_bd_assign_ptr = priv->rx_bds +
-		(priv->rx_bd_assign_index * DESC_SIZE);
+	dma_desc_set_addr(priv, cb->bd_addr, mapping);
 
 	netif_dbg(priv, rx_status, ndev, "RX refill\n");
 
@@ -568,7 +563,7 @@ static int bcm_sysport_alloc_rx_bufs(struct bcm_sysport_priv *priv)
 	unsigned int i;
 
 	for (i = 0; i < priv->num_rx_bds; i++) {
-		cb = &priv->rx_cbs[priv->rx_bd_assign_index];
+		cb = &priv->rx_cbs[i];
 		if (cb->skb)
 			continue;
 
@@ -1330,14 +1325,14 @@ static inline int tdma_enable_set(struct bcm_sysport_priv *priv,
 
 static int bcm_sysport_init_rx_ring(struct bcm_sysport_priv *priv)
 {
+	struct bcm_sysport_cb *cb;
 	u32 reg;
 	int ret;
+	int i;
 
 	/* Initialize SW view of the RX ring */
 	priv->num_rx_bds = NUM_RX_DESC;
 	priv->rx_bds = priv->base + SYS_PORT_RDMA_OFFSET;
-	priv->rx_bd_assign_ptr = priv->rx_bds;
-	priv->rx_bd_assign_index = 0;
 	priv->rx_c_index = 0;
 	priv->rx_read_ptr = 0;
 	priv->rx_cbs = kcalloc(priv->num_rx_bds, sizeof(struct bcm_sysport_cb),
@@ -1347,6 +1342,11 @@ static int bcm_sysport_init_rx_ring(struct bcm_sysport_priv *priv)
 		return -ENOMEM;
 	}
 
+	for (i = 0; i < priv->num_rx_bds; i++) {
+		cb = priv->rx_cbs + i;
+		cb->bd_addr = priv->rx_bds + i * DESC_SIZE;
+	}
+
 	ret = bcm_sysport_alloc_rx_bufs(priv);
 	if (ret) {
 		netif_err(priv, hw, priv->netdev, "SKB allocation failed\n");
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.h b/drivers/net/ethernet/broadcom/bcmsysport.h
index 42a4b4a0bc14..f28bf545d7f4 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.h
+++ b/drivers/net/ethernet/broadcom/bcmsysport.h
@@ -663,8 +663,6 @@ struct bcm_sysport_priv {
 
 	/* Receive queue */
 	void __iomem		*rx_bds;
-	void __iomem		*rx_bd_assign_ptr;
-	unsigned int		rx_bd_assign_index;
 	struct bcm_sysport_cb	*rx_cbs;
 	unsigned int		num_rx_bds;
 	unsigned int		rx_read_ptr;
-- 
2.1.0
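The idea of recording each control block's descriptor address once at ring setup, instead of tracking a running assign pointer and index, can be shown with a small userspace sketch. The names demo_cb, demo_init_ring() and DEMO_DESC_SIZE are hypothetical, and a plain byte array stands in for the descriptor memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DESC_SIZE 8	/* hypothetical descriptor size in bytes */

struct demo_cb {
	uint8_t *bd_addr;	/* fixed descriptor address for this slot */
};

/*
 * Slot i always maps to descriptor i, so the address is computed once
 * here and never needs to be recomputed when the slot is refilled.
 */
void demo_init_ring(struct demo_cb *cbs, uint8_t *bds, size_t n)
{
	size_t i;

	assert(cbs != NULL && bds != NULL);
	for (i = 0; i < n; i++)
		cbs[i].bd_addr = bds + i * DEMO_DESC_SIZE;
}
```

With the address stored per slot, a refill only has to touch the control block it was handed, which is what lets the patch drop the two bookkeeping fields.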
[PATCH net-next v2 3/3] net: systemport: Add a check for oversized packets
Occasionally we may get oversized packets from the hardware which exceed
the nominal 2KiB buffer size we allocate SKBs with. Add an early check
which drops the packet to avoid invoking skb_over_panic() and move on to
processing the next packet.

Reviewed-by: Petri Gynther <pgynt...@google.com>
Signed-off-by: Florian Fainelli <f.faine...@gmail.com>
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 62ea403e15b8..bbd8676a9675 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -632,6 +632,14 @@ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv,
 			  p_index, priv->rx_c_index, priv->rx_read_ptr,
 			  len, status);
 
+		if (unlikely(len > RX_BUF_LENGTH)) {
+			netif_err(priv, rx_status, ndev, "oversized packet\n");
+			ndev->stats.rx_length_errors++;
+			ndev->stats.rx_errors++;
+			dev_kfree_skb_any(skb);
+			goto next;
+		}
+
 		if (unlikely(!(status & DESC_EOP) || !(status & DESC_SOP))) {
 			netif_err(priv, rx_status, ndev, "fragmented packet!\n");
 			ndev->stats.rx_dropped++;
-- 
2.1.0
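A userspace analogue of the early length check, for illustration only: demo_rx_frame(), demo_stats and DEMO_RX_BUF_LENGTH are hypothetical names, and memcpy() into a fixed buffer stands in for the step that would otherwise overrun the SKB. The check runs before any data is touched, and the frame is counted as both a length error and a general RX error, mirroring the counters bumped in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_RX_BUF_LENGTH 2048	/* hypothetical fixed buffer size */

struct demo_stats {
	unsigned long rx_length_errors;
	unsigned long rx_errors;
};

/*
 * Accept one received frame into a DEMO_RX_BUF_LENGTH-byte buffer.
 * Returns 0 when the frame was copied, -1 when it was dropped because
 * the reported length does not fit.
 */
int demo_rx_frame(struct demo_stats *st, unsigned char *dst,
		  const unsigned char *src, size_t len)
{
	assert(st != NULL && dst != NULL && src != NULL);

	/* Reject oversized frames before touching the buffer. */
	if (len > DEMO_RX_BUF_LENGTH) {
		st->rx_length_errors++;
		st->rx_errors++;
		return -1;
	}

	memcpy(dst, src, len);
	return 0;
}
```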
Re: [RFC 1/3] net: dsa: add basic support for VLAN ndo
On Fri, May 29, 2015 at 12:37 AM, Vivien Didelot
<vivien.dide...@savoirfairelinux.com> wrote:
> @@ -854,7 +922,9 @@ int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
>  	if (slave_dev == NULL)
>  		return -ENOMEM;
>
> -	slave_dev->features = master->vlan_features;
> +	slave_dev->features = master->vlan_features |
> +			      NETIF_F_VLAN_FEATURES |
> +			      NETIF_F_HW_SWITCH_OFFLOAD;

wait... didn't commit 7889cbee8357aaed85898d028829dfb4f75bae2c remove
NETIF_F_HW_SWITCH_OFFLOAD?
Re: [RFC 0/3] DSA and Marvell 88E6352 802.1q support
Hi Scott,

On May 29, 2015, at 1:02 AM, Scott Feldman <sfel...@gmail.com> wrote:
> On Thu, May 28, 2015 at 2:37 PM, Vivien Didelot
> <vivien.dide...@savoirfairelinux.com> wrote:
>> This RFC is based on v4.1-rc3. It is meant to give a glance at the
>> commits that implement the necessary NDOs between DSA and the Marvell
>> 88E6352 switch driver.
>>
>> With this support, I am able to create VLANs with (un)tagged ports,
>> setting their default VID, from a bridge.
>>
>> To create a bridge containing all switch ports, with a VLAN ID 400,
>> swp2 and swp3 untagged (pvid), and swp4 tagged, the userspace commands
>> look like this:
>>
>>     ip link add name br0 type bridge
>>     [...]
>>     ip link set dev swp2 up master br0
>>     [...]
>>     bridge vlan add vid 400 pvid untagged dev swp2
>>     bridge vlan add vid 400 pvid untagged dev swp3
>>     bridge vlan add vid 400 dev swp4
>>     [...]
>>     ip link add link br0 name br0.400 type vlan id 400
>>     [...]
>>     bridge vlan add dev br0 vid 400 self
>>
>> The code is currently being rebased to the latest net-next/master.
>> Seems like the way to go now is through switchdev attr getter/setter...
>
> Indeed, for dsa_slave you should be able to port this to switchdev and
> set your ndo_bridge_setlink/dellink handlers to
> switchdev_port_bridge_setlink/dellink. (And also implement the
> switchdev ops for vlans).
>
> If you use switchdev_port_bridge_setlink/dellink, you shouldn't need
> to implement ndo_vlan_rx_add_vid/ndo_vlan_rx_kill_vid at all. The
> setlink/dellink callbacks will give the same info (and more, e.g.
> pvid, untagged flags) and you'll automatically get support for stacked
> drivers, for example if you bonded swp2/3 and then included that bond
> in your vlan bridge.
>
> Your commands will be slightly modified: when adding the vid to the
> port, specify master and self:
>
>     bridge vlan add vid 400 dev swp4 master self

Thanks a lot! I will send a complete RFC soon.

-v
[PATCH net] xen: netback: read hotplug script once at start of day.
When we come to tear things down in netback_remove() and generate the uevent it is possible that the xenstore directory has already been removed (details below). In such cases netback_uevent() won't be able to read the hotplug script and will write a xenstore error node. A recent change to the hypervisor exposed this race such that we now sometimes lose it (where apparently we didn't ever before). Instead read the hotplug script configuration during setup and use it for the lifetime of the vif device. The apparently more obvious fix of moving the transition to state=Closed in netback_remove() to after the uevent does not work because it is possible that we are already in state=Closed (in reaction to the guest having disconnected as it shutdown). Being already in Closed means the toolstack is at liberty to start tearing down the xenstore directories. In principal it might be possible to arrange to unregister the device sooner (e.g on transition to Closing) such that xenstore would still be there but this state machine is fragile and prone to anger... A modern Xen system only relies on the hotplug uevent for driver domains, when the backend is in the same domain as the toolstack it will run the necessary setup/teardown directly in the correct sequence wrt xenstore changes. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- DaveM, could this go to all stable trees please. --- drivers/net/xen-netback/common.h |2 ++ drivers/net/xen-netback/xenbus.c | 29 - 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 8a495b3..01b54e9 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -248,6 +248,8 @@ struct xenvif { /* Miscellaneous private stuff. 
*/ struct net_device *dev; + + const char *hotplug_script; }; struct xenvif_rx_cb { diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 3d8dbf5..698b4ad 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -235,6 +235,7 @@ static int netback_remove(struct xenbus_device *dev) kobject_uevent(dev-dev.kobj, KOBJ_OFFLINE); xen_unregister_watchers(be-vif); xenbus_rm(XBT_NIL, dev-nodename, hotplug-status); + kfree(be-vif-hotplug_script); xenvif_free(be-vif); be-vif = NULL; } @@ -379,24 +380,14 @@ static int netback_uevent(struct xenbus_device *xdev, struct kobj_uevent_env *env) { struct backend_info *be = dev_get_drvdata(xdev-dev); - char *val; - val = xenbus_read(XBT_NIL, xdev-nodename, script, NULL); - if (IS_ERR(val)) { - int err = PTR_ERR(val); - xenbus_dev_fatal(xdev, err, reading script); - return err; - } else { - if (add_uevent_var(env, script=%s, val)) { - kfree(val); - return -ENOMEM; - } - kfree(val); - } if (!be || !be-vif) return 0; + if (add_uevent_var(env, script=%s, be-vif-hotplug_script)) + return -ENOMEM; + return add_uevent_var(env, vif=%s, be-vif-dev-name); } @@ -407,6 +398,7 @@ static int backend_create_xenvif(struct backend_info *be) long handle; struct xenbus_device *dev = be-dev; struct xenvif *vif; + const char *script; if (be-vif != NULL) return 0; @@ -417,12 +409,23 @@ static int backend_create_xenvif(struct backend_info *be) return (err 0) ? 
err : -EINVAL; } + script = xenbus_read(XBT_NIL, dev-nodename, script, NULL); + if (IS_ERR(script)) { + int err = PTR_ERR(script); + xenbus_dev_fatal(dev, err, reading script); + return err; + } + vif = xenvif_alloc(dev-dev, dev-otherend_id, handle); if (IS_ERR(vif)) { err = PTR_ERR(vif); xenbus_dev_fatal(dev, err, creating interface); + kfree(script); return err; } + + vif-hotplug_script = script; + be-vif = vif; kobject_uevent(dev-dev.kobj, KOBJ_ONLINE); -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
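The "read once at start of day" approach from the commit message can be sketched outside the kernel. Everything here is hypothetical: demo_store_value and demo_store_read() stand in for a xenstore lookup that may start failing once the toolstack removes the directory, and demo_vif mimics only the new hotplug_script field. The setup path takes a private copy, the uevent path uses that copy, and the copy is freed on removal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical backing store; NULL models "directory already removed". */
static const char *demo_store_value = "vif-script";

static const char *demo_store_read(void)
{
	return demo_store_value;
}

struct demo_vif {
	char *hotplug_script;	/* private copy, valid for the vif's lifetime */
};

/* Setup: read the script name once and keep our own copy. */
int demo_vif_setup(struct demo_vif *vif)
{
	const char *val = demo_store_read();

	assert(vif != NULL);
	if (!val)
		return -1;

	vif->hotplug_script = malloc(strlen(val) + 1);
	if (!vif->hotplug_script)
		return -1;
	strcpy(vif->hotplug_script, val);
	return 0;
}

/* Uevent: use the cached copy; no store access, so no teardown race. */
const char *demo_vif_uevent_script(const struct demo_vif *vif)
{
	return vif->hotplug_script;
}

/* Removal: release the cached copy. */
void demo_vif_teardown(struct demo_vif *vif)
{
	free(vif->hotplug_script);
	vif->hotplug_script = NULL;
}
```

The trade-off is that a later change to the stored value is not picked up, which matches the commit message's intent of using one value for the lifetime of the vif device.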
Re: [PATCH net v2] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
Fri, May 29, 2015 at 05:12:35PM CEST, sfel...@gmail.com wrote: On Thu, May 28, 2015 at 2:42 AM, Jiri Pirko j...@resnulli.us wrote: Mon, May 18, 2015 at 10:19:16PM CEST, da...@davemloft.net wrote: From: Roopa Prabhu ro...@cumulusnetworks.com Date: Sun, 17 May 2015 16:42:05 -0700 On most systems where you can offload routes to hardware, doing routing in software is not an option (the cpu limitations make routing impossible in software). You absolutely do not get to determine this policy, none of us do. What matters is that by default the damn switch device being there is %100 transparent to the user. And the way to achieve that default is to do software routes as a fallback. I am not going to entertain changes of this nature which fail route loading by default just because we've exceeded a device's HW capacity to offload. I thought I was _really_ clear about this at netdev 0.1 I certainly agree that by default, transparency 1:1 sw:hw mapping is what we need for fib. The current code is a good start! I see couple of issues regarding switchdev_fib_ipv4_abort: 1) If user adds and entry, switchdev_fib_ipv4_add fails, abort is executed - and, error returned. I would expect that route entry should be added in this case. The next attempt of adding the same entry will be successful. The current behaviour breaks the transparency you are reffering to. 2) When switchdev_fib_ipv4_abort happens to be executed, the offload is disabled for good (until reboot). That is certainly not nice, alhough I understand that is the easiest solution for now. I believe that we all agree that the 1:1 transparency, although it is a default, may not be optimal for real-life usage. HW resources are limited and user does not know them. The danger of hitting _abort and screwing-up the whole system is huge, unacceptable. So here, there are couple of more or less simple things that I suggest to do in order to move a little bit forward: 1) Introduce system-wide option to switch _abort to just plain fail. 
When HW does not have capacity, do not flush and fallback to sw, but rather just fail to add the entry. This would not break anything. Userspace has to be prepared that entry add could fail. This breaks 1:1 transparency. A route now fails to install and the user is scratching his/her head as to why it failed. It used to work when there was no switch offload. It works with switch offload on this other device. So it must be a failure due to switch offload on this device. But why this route? I just installed 20 IPv4 routes and 10 IPv6 routes. Why did this 11th IPv6 route fail to install? See, now user needs to learn about details of that particular device's limits to understand failure. When they move their application to another device, they need to re-learn failure modes. I don't want this behaviour as the default. Default should be what is at this moment. This would be tunable by user. That, I believe is correct. 2) Introduce a way to propagate resources to userspace. Driver knows about resources used/available/potentially_available. Switchdev infra could be extended in order to propagate the info to the user. 3) Introduce couple of flags for entry add that would alter the default behaviour. Something like: NLM_F_SKIP_KERNEL NLM_F_SKIP_OFFLOAD Again, this does not break the current users. On the other hand, this gives new users a leverage to instruct kernel where the entry should be added to (or not added to). I don't think we want an NLM_F_SKIP_KERNEL option and only have the route installed on the device. We want offload to be an acceleration of the kernel's FIB, not a bypass. Okay, fair enough. Let's have NLM_F_SKIP_OFFLOAD only. SKIP_OFFLOAD can mess up LPM if the user is not really really careful. Any thoughts? Objections? Thanks! Jiri -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Fw: [Bug 99161] New: 2.6.32.66 PPC Oops in tcp_send_fin
On Fri, 2015-05-29 at 08:12 -0700, Stephen Hemminger wrote: I think 2.6.32 is so old no one will care. Begin forwarded message: Date: Fri, 29 May 2015 09:12:45 + From: bugzilla-dae...@bugzilla.kernel.org bugzilla-dae...@bugzilla.kernel.org To: shemmin...@linux-foundation.org shemmin...@linux-foundation.org Subject: [Bug 99161] New: 2.6.32.66 PPC Oops in tcp_send_fin https://bugzilla.kernel.org/show_bug.cgi?id=99161 Bug ID: 99161 Summary: 2.6.32.66 PPC Oops in tcp_send_fin Product: Networking Version: 2.5 Kernel Version: 2.6.32.66 Hardware: PPC-32 OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: IPV4 Assignee: shemmin...@linux-foundation.org Reporter: vare...@parisc-linux.org Regression: No I just updated my trusty old PPC box to longterm 2.6.32.66 (was running .65 before that with zero issue) and it started spewing oopses at me like hell broke loose. This machine is primarily used as a DNS and MX (albeit under low pressure). Unable to handle kernel paging request for data at address 0x003c Faulting instruction address: 0xc0344ffc Oops: Kernel access of bad area, sig: 11 [#1] PowerMac Modules linked in: sch_sfq cls_u32 sch_cbq xt_recent xt_length iptable_mangle NIP: c0344ffc LR: c0335b00 CTR: c03357b0 REGS: cb441dd0 TRAP: 0300 Not tainted (2.6.32.66) MSR: 9032 EE,ME,IR,DR CR: 44244488 XER: DAR: 003c, DSISR: 4000 TASK = e39f0900[14281] 'smtpd' THREAD: cb44 GPR00: dbc0 cb441e80 e39f0900 e397cc60 0004 e3948100 0003 GPR08: 0020 01af ffe4 24244482 207bb198 201322b4 2065d898 GPR16: 2065d878 2065d7e0 2065d858 2065d7e0 2065d7e0 206733b0 20673060 bfcc7f50 GPR24: bfcc7f40 20b7eeb0 bfcc7f40 e397ccc4 dbc00020 e397cc60 NIP [c0344ffc] tcp_send_fin+0x48/0x21c LR [c0335b00] tcp_close+0x350/0x3fc Call Trace: [cb441e80] [cb441e84] 0xcb441e84 (unreliable) [cb441ea0] [c0335b00] tcp_close+0x350/0x3fc [cb441ec0] [c035733c] inet_release+0x58/0x88 [cb441ed0] [c02e1fe8] sock_release+0x34/0xa8 [cb441ee0] [c02e2078] sock_close+0x1c/0x40 [cb441ef0] [c009cddc] 
__fput+0xf4/0x22c [cb441f10] [c0098ea4] filp_close+0x64/0xa0 [cb441f30] [c0098f7c] sys_close+0x9c/0xc0 [cb441f40] [c0012988] ret_from_syscall+0x0/0x38 --- Exception: c01 at 0x20368780 LR = 0x2064bc48 Instruction dump: 90010024 93c10018 83dd0004 7f9df000 419e0080 2f9e 419e007c 80030104 2f80 419e0180 39200020 3bde0020 8809001c 6001 9809001c 813e0014 ---[ end trace 13772745934a0d1f ]--- Unable to handle kernel paging request for data at address 0x003c Faulting instruction address: 0xc0344ffc Oops: Kernel access of bad area, sig: 11 [#2] PowerMac Modules linked in: sch_sfq cls_u32 sch_cbq xt_recent xt_length iptable_mangle NIP: c0344ffc LR: c0335b00 CTR: c03357b0 REGS: dbc09d60 TRAP: 0300 Tainted: G D (2.6.32.66) MSR: 9032 EE,ME,IR,DR CR: 42004288 XER: 2000 DAR: 003c, DSISR: 4000 TASK = e394f180[14867] 'imapd' THREAD: dbc08000 GPR00: dbc00d80 dbc09e10 e394f180 e397c420 0009 ef10eb80 0003 GPR08: 0020 e397c498 22004282 1002bad4 1023e7b0 1002 GPR16: 1002 1002 1002 1002 10007678 1000766c 0008 1023d168 GPR24: 1002 10018c28 e397c484 ef327c20 e397c420 NIP [c0344ffc] tcp_send_fin+0x48/0x21c LR [c0335b00] tcp_close+0x350/0x3fc Call Trace: [dbc09e10] [1000766c] 0x1000766c (unreliable) [dbc09e30] [c0335b00] tcp_close+0x350/0x3fc [dbc09e50] [c035733c] inet_release+0x58/0x88 [dbc09e60] [c02e1fe8] sock_release+0x34/0xa8 [dbc09e70] [c02e2078] sock_close+0x1c/0x40 [dbc09e80] [c009cddc] __fput+0xf4/0x22c [dbc09ea0] [c0098ea4] filp_close+0x64/0xa0 [dbc09ec0] [c00318e0] put_files_struct+0x108/0x124 [dbc09ee0] [c0033824] do_exit+0x4fc/0x630 [dbc09f20] [c003399c] do_group_exit+0x44/0xa4 [dbc09f30] [c0033a10] sys_exit_group+0x14/0x28 [dbc09f40] [c0012988] ret_from_syscall+0x0/0x38 --- Exception: c01 at 0xfd96f38 LR = 0xfd96f04 Instruction dump: 90010024 93c10018 83dd0004 7f9df000 419e0080 2f9e 419e007c 80030104 2f80 419e0180 39200020 3bde0020 8809001c 6001 9809001c 813e0014 ---[ end trace 13772745934a0d20 ]--- Fixing recursive fault but reboot is needed! 
Unable to handle kernel paging request for data at address 0x003c Faulting instruction address: 0xc0344ffc Oops: Kernel access of bad area, sig: 11 [#3] PowerMac Modules linked in: sch_sfq cls_u32 sch_cbq xt_recent xt_length iptable_mangle NIP: c0344ffc LR: c0335b00 CTR: c03357b0 REGS: cb463dd0 TRAP: 0300 Tainted: G D (2.6.32.66) MSR: 9032 EE,ME,IR,DR CR: 44244488 XER: DAR: 003c, DSISR: 4000 TASK = e39f1f80[15093] 'smtpd' THREAD:
Re: Request for advice on where to put Root Complex fix up code for downstream device
On Fri, May 29, 2015 at 11:46 AM, Casey Leedom lee...@chelsio.com wrote: Thanks Bjorn and no issues at all about the delay -- I definitely understand how busy we all are. I'll go ahead and submit a PCI Quirk. As part of this, would you like me to also commit a new PCI-E routine to find the Root Complex Port for a given PCI Device? It seem like it might prove useful in the future. Otherwise I'll just incorporate that loop in my PCI Quirk. Sure, I wouldn't mind seeing a new interface for that. Bjorn From: Bjorn Helgaas [bhelg...@google.com] Sent: Friday, May 29, 2015 9:20 AM To: Casey Leedom Cc: netdev@vger.kernel.org; linux-...@vger.kernel.org Subject: Re: Request for advice on where to put Root Complex fix up code for downstream device Hi Casey, Sorry, this one slipped through and I forgot to respond earlier. On Thu, May 07, 2015 at 11:31:58PM +, Casey Leedom wrote: | From: Bjorn Helgaas [bhelg...@google.com] | Sent: Thursday, May 07, 2015 4:04 PM | | There are a lot of fixups in drivers/pci/quirks.c. For things that have to | be worked around either before a driver claims the device or if there is no | driver at all, the fixup *has* to go in drivers/pci/quirks.c | | But for things like this, where the problem can only occur after a driver | claims the device, I think it makes more sense to put the fixup in the | driver itself. The only wrinkle here is that the fixup has to be done on a | separate device, not the device claimed by the driver. But I think it | probably still makes sense to put this fixup in the driver. Okay, the example code that I provided (still quoted below) was indeed done as a fix within the cxgb4 Network Driver. I've also worked up a version as a PCI Quirk but if you and David Miller agree that the fixup code should go into cxgb4, I'm comfortable with that. I can also provide the example PCI Quirk code I worked up if you like. One complication to doing this in cxgb4 is that it attaches to Physical Function 4 of our T5 chip. 
Meanwhile, a completely separate storage driver, csiostor, connections to PF5 and PF6 and there's no requirement at all that cxgb4 be loaded. So if we go down the road of putting the fixup code in the cxgb4 driver, we'll also need to duplicate that code in the csiostor driver. Sounds simpler to just put the quirk in drivers/pci/quirks.c. | +static void clear_root_complex_tlp_attributes(struct pci_dev *pdev) | +{ | + struct pci_bus *bus = pdev-bus; | + struct pci_dev *highest_pcie_bridge = NULL; | + | + while (bus) { | + struct pci_dev *bridge = bus-self; | + | + if (!bridge || !bridge-pcie_cap) | + break; | + highest_pcie_bridge = bridge; | + bus = bus-parent; | + } | | Can you use pci_upstream_bridge() here? There are a couple places where we | want to find the Root Port, so we might factor that out someday. It'll be | easier to find all those places if they use with pci_upstream_bridge(). It looks like pci_upstream_bridge() just traverses one like upstream toward the Root Complex? Or am I misunderstanding that function? No, you're right. I was just trying to suggest using pci_upstream_bridge() instead of bus-parent-self in your loop. It wouldn't replace the loop completely. Bjorn -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
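The loop under discussion, rewritten around a one-hop "upstream bridge" helper as Bjorn suggests, might look roughly like the following userspace model. The demo_dev and demo_bus structures and both functions are hypothetical stand-ins, not the kernel's struct pci_dev, struct pci_bus or pci_upstream_bridge(); the model only assumes that a root bus has no bridge leading to it.

```c
#include <assert.h>
#include <stddef.h>

struct demo_dev;

struct demo_bus {
	struct demo_dev *self;	/* bridge leading to this bus, NULL on a root bus */
};

struct demo_dev {
	struct demo_bus *bus;	/* bus the device sits on */
	int is_pcie;		/* nonzero if the device has a PCIe capability */
};

/* One hop towards the root: the bridge above dev, or NULL at the top. */
struct demo_dev *demo_upstream_bridge(const struct demo_dev *dev)
{
	assert(dev != NULL);
	return dev->bus ? dev->bus->self : NULL;
}

/*
 * Walk upstream and return the highest PCIe bridge above dev, which in
 * this model is the root port. Returns NULL if there is none.
 */
struct demo_dev *demo_find_root_port(const struct demo_dev *dev)
{
	struct demo_dev *highest = NULL;
	struct demo_dev *bridge = demo_upstream_bridge(dev);

	while (bridge && bridge->is_pcie) {
		highest = bridge;
		bridge = demo_upstream_bridge(bridge);
	}
	return highest;
}
```

Keeping the single-hop step in its own helper is what makes such call sites easy to find later, which is the factoring-out Bjorn mentions.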
Re: [Xen-devel] [RFC PATCH 00/13] Persistent grant maps for xen net drivers
On 29 May 2015, at 08:53, Yuzhou (C) vitas.yuz...@huawei.com wrote: Hi, About rx zerocopy, I have a question: If some application make a socket, then listen and accept, the client sends packets to it, but it doesn't recv from this socket right now, all persistent grant page would be in used. So other application cannot receive any packets. Is my guess right or wrong? I believe that doesn’t happen: before the skb gets delivered to the protocol stack, skb_orphan_frags gets called which releases the original pages (i.e. the persistent grants) and memcpy to new ones (if the skb is fragmented). This happens because I previously set the flag SKBTX_DEV_ZEROCOPY, which also invokes a callback in such event. Once the callback is invoked, the released pages are added a pool within xen-netfront which later will use for new requests to the backend. Note that part of the data is previously copied on pskb_pull_tail and may unref the initial frag before skb_orphan_frags is called. The callback is still invoked and this page is added to the pool as well. Joao YuZhou -Original Message- From: xen-devel-boun...@lists.xen.org [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Joao Martins Sent: Friday, May 22, 2015 6:27 PM To: Wei Liu Cc: ian.campb...@citrix.com; netdev@vger.kernel.org; david.vra...@citrix.com; xen-de...@lists.xenproject.org;boris.ostrov...@oracle.com Subject: Re: [Xen-devel] [RFC PATCH 00/13] Persistent grant maps for xen net drivers On 19 May 2015, at 17:39, Wei Liu wei.l...@citrix.com wrote: On Tue, May 12, 2015 at 07:18:24PM +0200, Joao Martins wrote: There have been recently[3] some discussions and issues raised on persistent grants for the block layer, though the numbers above show some significant improvements specially on more network intensive workloads and provide a margin for comparison against future map/unmap improvements. Any comments or suggestions are welcome, Thanks! Thanks, the numbers certainly look interesting. 
I'm just a bit concerned about the complexity of netback. I've commented on individual patches, we can discuss the issues there. Thanks a lot for the review! It does add more complexity, mainly for the TX path, but I also would like to mention that a portion of this changeset is also the persistent grants ops that could potentially live outside. Joao [1] http://article.gmane.org/gmane.linux.network/249383 [2] http://bit.ly/1IhJfXD [3] http://lists.xen.org/archives/html/xen-devel/2015-02/msg02292.html Joao Martins (13): xen-netback: add persistent grant tree ops xen-netback: xenbus feature persistent support xen-netback: implement TX persistent grants xen-netback: implement RX persistent grants xen-netback: refactor xenvif_rx_action xen-netback: copy buffer on xenvif_start_xmit() xen-netback: add persistent tree counters to debugfs xen-netback: clone skb if skb-xmit_more is set xen-netfront: move grant_{ref,page} to struct grant xen-netfront: refactor claim/release grant xen-netfront: feature-persistent xenbus support xen-netfront: implement TX persistent grants xen-netfront: implement RX persistent grants drivers/net/xen-netback/common.h| 79 drivers/net/xen-netback/interface.c | 78 +++- drivers/net/xen-netback/netback.c | 873 ++-- drivers/net/xen-netback/xenbus.c| 24 + drivers/net/xen-netfront.c | 362 --- 5 files changed, 1216 insertions(+), 200 deletions(-) -- 2.1.3 ___ Xen-devel mailing list xen-de...@lists.xen.org http://lists.xen.org/xen-devel === João Martins Research Scientist, Networked Systems and Data Analytics Group NEC Laboratories Europe Kurfuerstenanlage 36 D-69115 Heidelberg Tel. 
+49 (0)6221 4342-208 Fax: +49 (0)6221 4342-155 e-mail: joao.mart...@neclab.eu === NEC Europe Ltd | Registered Office: Athene, Odyssey Business Park, West End Road, London, HA4 6QE, GB | Registered in England 2832014 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH iproute2] ss: Fix allocation of cong control alg name
On Fri, 2015-05-29 at 15:53 +0300, Vadim Kochan wrote:
> Thanks! Should I put you in From tag or in Signed-off-by ? Or your diff might be used from this email thread ?

Don't worry, just submit the patch officially on your own ;)

Thanks.
Re: [PATCH] xen: netback: fix error printf format string.
On Fri, May 29, 2015 at 05:22:04PM +0100, Ian Campbell wrote:
> drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’:
> drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
>         (txreq.offset&~PAGE_MASK) + txreq.size);
>         ^
>
> txreq.offset and .size are uint16_t fields.
>
> Signed-off-by: Ian Campbell ian.campb...@citrix.com

Acked-by: Wei Liu wei.l...@citrix.com

> ---
>  drivers/net/xen-netback/netback.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 4de46aa..a3b1cbb 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1248,7 +1248,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>  		/* No crossing a page as the payload mustn't fragment. */
>  		if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
>  			netdev_err(queue->vif->dev,
> -				   "txreq.offset: %x, size: %u, end: %lu\n",
> +				   "txreq.offset: %x, size: %u, end: %u\n",
>  				   txreq.offset, txreq.size,
>  				   (txreq.offset&~PAGE_MASK) + txreq.size);
>  			xenvif_fatal_tx_err(queue->vif);
> --
> 1.7.10.4
Re: [PATCH iproute2] ss: Fix allocation of cong control alg name
On 05/29/2015 06:17 PM, Guzman Mosqueda, Jose R wrote:
> Hi Daniel and Vadim
>
> Thanks for your prompt response and for the patch.
>
> Also, what about the other one? Do you think it is an issue or not?
>
> File: tc/tc_util.c
> Function: void print_rate(char *buf, int len, __u64 rate)
> Line: ~264
>
> In the case that the user inputs a high value for rate, the for loop will exit on its loop condition, meaning that variable i gets the value of 5, which is an invalid index for the units array because that array has only 5 elements.
>
> I know a very high value is invalid, but in the case that it comes directly from the user it could cause an issue. What do you think?

Hm, this prints just the netlink dump from kernel side, but perhaps we should just change it ...

diff --git a/tc/tc_util.c b/tc/tc_util.c
index dc2b70f..aa6de24 100644
--- a/tc/tc_util.c
+++ b/tc/tc_util.c
@@ -250,18 +250,19 @@ void print_rate(char *buf, int len, __u64 rate)
 	extern int use_iec;
 	unsigned long kilo = use_iec ? 1024 : 1000;
 	const char *str = use_iec ? "i" : "";
-	int i = 0;
 	static char *units[5] = {"", "K", "M", "G", "T"};
+	int i;

 	rate <<= 3; /* bytes/sec -> bits/sec */

-	for (i = 0; i < ARRAY_SIZE(units); i++)  {
+	for (i = 0; i < ARRAY_SIZE(units) - 1; i++)  {
 		if (rate < kilo)
 			break;
 		if (((rate % kilo) != 0) && rate < 1000*kilo)
 			break;
 		rate /= kilo;
 	}
+
 	snprintf(buf, len, "%.0f%s%sbit", (double)rate, units[i], str);
 }
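The effect of the proposed loop bound can be checked in isolation. The following is a minimal userspace sketch, not the iproute2 code itself: format_rate() is a hypothetical stand-in for print_rate(), and use_iec is passed as a parameter here instead of being read from a global.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical stand-in for print_rate(): same scaling logic, but
 * use_iec is a parameter so the sketch has no external dependencies.
 */
static void format_rate(char *buf, size_t len, uint64_t rate, int use_iec)
{
	static const char *const units[] = { "", "K", "M", "G", "T" };
	const uint64_t kilo = use_iec ? 1024 : 1000;
	const char *str = use_iec ? "i" : "";
	size_t i;

	rate <<= 3; /* bytes/sec -> bits/sec */

	/*
	 * Stop one step short of ARRAY_SIZE(units): when the loop runs to
	 * completion, i ends up at the last valid index instead of one
	 * past the end of the array.
	 */
	for (i = 0; i < ARRAY_SIZE(units) - 1; i++) {
		if (rate < kilo)
			break;
		if ((rate % kilo) != 0 && rate < 1000 * kilo)
			break;
		rate /= kilo;
	}

	snprintf(buf, len, "%.0f%s%sbit", (double)rate, units[i], str);
}
```

With the unpatched bound, a rate that stays divisible by kilo through all five iterations leaves i equal to 5 and units[i] reads past the array; with the bound above, the value is left unscaled in the largest unit instead.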
Re: Hardware support for devices using the driver file,dm9000.c
On 05/29/2015 09:23 AM, nick wrote:
> Greetings All, I am wondering if anything is still using the Davicom DM9000 driver recently as the code base states its from the late 1990s. Furthermore I see no reason to keep it around if something recent i.e. last 10 years is using it that is based off the mainline/stable releases as otherwise it's legacy code and should be removed in my opinion. Cheers, Nick

Nick,

Yes, it is still used.

Andy

P.S. Some context for those yelling WTF at their computer right now: https://lkml.org/lkml/2014/8/4/206 Pretty sure he's banned from the mailing lists though.
Fw: [Bug 99161] New: 2.6.32.66 PPC Oops in tcp_send_fin
I think 2.6.32 is so old no one will care. Begin forwarded message: Date: Fri, 29 May 2015 09:12:45 + From: bugzilla-dae...@bugzilla.kernel.org bugzilla-dae...@bugzilla.kernel.org To: shemmin...@linux-foundation.org shemmin...@linux-foundation.org Subject: [Bug 99161] New: 2.6.32.66 PPC Oops in tcp_send_fin https://bugzilla.kernel.org/show_bug.cgi?id=99161 Bug ID: 99161 Summary: 2.6.32.66 PPC Oops in tcp_send_fin Product: Networking Version: 2.5 Kernel Version: 2.6.32.66 Hardware: PPC-32 OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: IPV4 Assignee: shemmin...@linux-foundation.org Reporter: vare...@parisc-linux.org Regression: No I just updated my trusty old PPC box to longterm 2.6.32.66 (was running .65 before that with zero issue) and it started spewing oopses at me like hell broke loose. This machine is primarily used as a DNS and MX (albeit under low pressure). Unable to handle kernel paging request for data at address 0x003c Faulting instruction address: 0xc0344ffc Oops: Kernel access of bad area, sig: 11 [#1] PowerMac Modules linked in: sch_sfq cls_u32 sch_cbq xt_recent xt_length iptable_mangle NIP: c0344ffc LR: c0335b00 CTR: c03357b0 REGS: cb441dd0 TRAP: 0300 Not tainted (2.6.32.66) MSR: 9032 EE,ME,IR,DR CR: 44244488 XER: DAR: 003c, DSISR: 4000 TASK = e39f0900[14281] 'smtpd' THREAD: cb44 GPR00: dbc0 cb441e80 e39f0900 e397cc60 0004 e3948100 0003 GPR08: 0020 01af ffe4 24244482 207bb198 201322b4 2065d898 GPR16: 2065d878 2065d7e0 2065d858 2065d7e0 2065d7e0 206733b0 20673060 bfcc7f50 GPR24: bfcc7f40 20b7eeb0 bfcc7f40 e397ccc4 dbc00020 e397cc60 NIP [c0344ffc] tcp_send_fin+0x48/0x21c LR [c0335b00] tcp_close+0x350/0x3fc Call Trace: [cb441e80] [cb441e84] 0xcb441e84 (unreliable) [cb441ea0] [c0335b00] tcp_close+0x350/0x3fc [cb441ec0] [c035733c] inet_release+0x58/0x88 [cb441ed0] [c02e1fe8] sock_release+0x34/0xa8 [cb441ee0] [c02e2078] sock_close+0x1c/0x40 [cb441ef0] [c009cddc] __fput+0xf4/0x22c [cb441f10] [c0098ea4] filp_close+0x64/0xa0 
[cb441f30] [c0098f7c] sys_close+0x9c/0xc0 [cb441f40] [c0012988] ret_from_syscall+0x0/0x38 --- Exception: c01 at 0x20368780 LR = 0x2064bc48 Instruction dump: 90010024 93c10018 83dd0004 7f9df000 419e0080 2f9e 419e007c 80030104 2f80 419e0180 39200020 3bde0020 8809001c 6001 9809001c 813e0014 ---[ end trace 13772745934a0d1f ]--- Unable to handle kernel paging request for data at address 0x003c Faulting instruction address: 0xc0344ffc Oops: Kernel access of bad area, sig: 11 [#2] PowerMac Modules linked in: sch_sfq cls_u32 sch_cbq xt_recent xt_length iptable_mangle NIP: c0344ffc LR: c0335b00 CTR: c03357b0 REGS: dbc09d60 TRAP: 0300 Tainted: G D (2.6.32.66) MSR: 9032 EE,ME,IR,DR CR: 42004288 XER: 2000 DAR: 003c, DSISR: 4000 TASK = e394f180[14867] 'imapd' THREAD: dbc08000 GPR00: dbc00d80 dbc09e10 e394f180 e397c420 0009 ef10eb80 0003 GPR08: 0020 e397c498 22004282 1002bad4 1023e7b0 1002 GPR16: 1002 1002 1002 1002 10007678 1000766c 0008 1023d168 GPR24: 1002 10018c28 e397c484 ef327c20 e397c420 NIP [c0344ffc] tcp_send_fin+0x48/0x21c LR [c0335b00] tcp_close+0x350/0x3fc Call Trace: [dbc09e10] [1000766c] 0x1000766c (unreliable) [dbc09e30] [c0335b00] tcp_close+0x350/0x3fc [dbc09e50] [c035733c] inet_release+0x58/0x88 [dbc09e60] [c02e1fe8] sock_release+0x34/0xa8 [dbc09e70] [c02e2078] sock_close+0x1c/0x40 [dbc09e80] [c009cddc] __fput+0xf4/0x22c [dbc09ea0] [c0098ea4] filp_close+0x64/0xa0 [dbc09ec0] [c00318e0] put_files_struct+0x108/0x124 [dbc09ee0] [c0033824] do_exit+0x4fc/0x630 [dbc09f20] [c003399c] do_group_exit+0x44/0xa4 [dbc09f30] [c0033a10] sys_exit_group+0x14/0x28 [dbc09f40] [c0012988] ret_from_syscall+0x0/0x38 --- Exception: c01 at 0xfd96f38 LR = 0xfd96f04 Instruction dump: 90010024 93c10018 83dd0004 7f9df000 419e0080 2f9e 419e007c 80030104 2f80 419e0180 39200020 3bde0020 8809001c 6001 9809001c 813e0014 ---[ end trace 13772745934a0d20 ]--- Fixing recursive fault but reboot is needed! 
Unable to handle kernel paging request for data at address 0x003c Faulting instruction address: 0xc0344ffc Oops: Kernel access of bad area, sig: 11 [#3] PowerMac Modules linked in: sch_sfq cls_u32 sch_cbq xt_recent xt_length iptable_mangle NIP: c0344ffc LR: c0335b00 CTR: c03357b0 REGS: cb463dd0 TRAP: 0300 Tainted: G D (2.6.32.66) MSR: 9032 EE,ME,IR,DR CR: 44244488 XER: DAR: 003c, DSISR: 4000 TASK = e39f1f80[15093] 'smtpd' THREAD: cb462000 GPR00: dbc00480 cb463e80 e39f1f80 e397d4a0 0004 e3878f80 0003 GPR08: 0020 01af ffd6 24244482 206eb198 200622b4 2058d898 GPR16: 2058d878 2058d7e0
Re: [PATCH net v2] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
On Thu, May 28, 2015 at 2:42 AM, Jiri Pirko j...@resnulli.us wrote: Mon, May 18, 2015 at 10:19:16PM CEST, da...@davemloft.net wrote: From: Roopa Prabhu ro...@cumulusnetworks.com Date: Sun, 17 May 2015 16:42:05 -0700 On most systems where you can offload routes to hardware, doing routing in software is not an option (the cpu limitations make routing impossible in software). You absolutely do not get to determine this policy, none of us do. What matters is that by default the damn switch device being there is %100 transparent to the user. And the way to achieve that default is to do software routes as a fallback. I am not going to entertain changes of this nature which fail route loading by default just because we've exceeded a device's HW capacity to offload. I thought I was _really_ clear about this at netdev 0.1 I certainly agree that by default, transparency 1:1 sw:hw mapping is what we need for fib. The current code is a good start! I see couple of issues regarding switchdev_fib_ipv4_abort: 1) If user adds and entry, switchdev_fib_ipv4_add fails, abort is executed - and, error returned. I would expect that route entry should be added in this case. The next attempt of adding the same entry will be successful. The current behaviour breaks the transparency you are reffering to. 2) When switchdev_fib_ipv4_abort happens to be executed, the offload is disabled for good (until reboot). That is certainly not nice, alhough I understand that is the easiest solution for now. I believe that we all agree that the 1:1 transparency, although it is a default, may not be optimal for real-life usage. HW resources are limited and user does not know them. The danger of hitting _abort and screwing-up the whole system is huge, unacceptable. So here, there are couple of more or less simple things that I suggest to do in order to move a little bit forward: 1) Introduce system-wide option to switch _abort to just plain fail. 
When HW does not have capacity, do not flush and fall back to sw, but rather just fail to add the entry. This would not break anything. Userspace has to be prepared that entry add could fail. This breaks 1:1 transparency. A route now fails to install and the user is scratching his/her head as to why it failed. It used to work when there was no switch offload. It works with switch offload on this other device. So it must be a failure due to switch offload on this device. But why this route? I just installed 20 IPv4 routes and 10 IPv6 routes. Why did this 11th IPv6 route fail to install? See, now the user needs to learn about details of that particular device's limits to understand the failure. When they move their application to another device, they need to re-learn failure modes. 2) Introduce a way to propagate resources to userspace. Driver knows about resources used/available/potentially_available. Switchdev infra could be extended in order to propagate the info to the user. 3) Introduce a couple of flags for entry add that would alter the default behaviour. Something like: NLM_F_SKIP_KERNEL NLM_F_SKIP_OFFLOAD Again, this does not break the current users. On the other hand, this gives new users leverage to instruct the kernel where the entry should be added to (or not added to). I don't think we want an NLM_F_SKIP_KERNEL option and only have the route installed on the device. We want offload to be an acceleration of the kernel's FIB, not a bypass. SKIP_OFFLOAD can mess up LPM if the user is not really really careful. Any thoughts? Objections? Thanks! Jiri
Re: [RFC][PATCH] x86: remove vmalloc.h from asm/io.h
Hi Takashi,

On Fri, 29 May 2015 14:43:14 +0200 Takashi Iwai ti...@suse.de wrote:
> For the sound bits,
> Acked-by: Takashi Iwai ti...@suse.de

Thanks, noted.

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au
Re: [PATCH net v2] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
On Fri, May 29, 2015 at 12:50 AM, Jiri Pirko j...@resnulli.us wrote: Thu, May 21, 2015 at 07:46:54AM CEST, sfel...@gmail.com wrote: On Tue, May 19, 2015 at 1:28 PM, David Miller da...@davemloft.net wrote: From: Andy Gospodarek go...@cumulusnetworks.com Date: Tue, 19 May 2015 15:47:32 -0400 Are you actually saying that if users complain loudly enough about the current behavior (not the change Roopa has proposed) that you would be open to considering a change the current behavior? I am saying that we have a contract with users not to break existing behavior. Full stop. After rehearing David's argument, we should probably explore option d) which is a refinement on the fib_offload_disable mechanism we have today. fib_offload_disable is global for all routes. Once we hit a HW install problem, the global flag is set and all routes fallback to SW. We did this because we can't allow the failed route to exist in SW and not in HW because it could mess up LPM searches (HW could hit on a lesser prefix even when SW has the true LPM, because HW gets first shot at match). The refinement on fib_offload_disable is this: make it per-related-prefix rather than global, and on a HW install problem, set the flag for the related-prefix and uninstall only those routes from HW. Related-prefix (is there a correct term for this?) are routes to the same dst addr but with different prefix lengths. I haven't parsed the fib_trie structure to see how routes are organized, but I suspect since it's optimized for lookup the related-prefix tracking is already there and we can build on that. This looks interesting. However, I'm not sure that it is acceptable for user to experience this hw evict of random entries. User knows what entries are essential to have in hw. With your solution, I can see no way user can actually say what should be offloaded or not. Kernel just automagically decides. The default eviction policy could be based on RTA_PRIORITY: evict lower priority routes first. 
It would be up to the device driver to decide between two routes of the same priority. To help the device driver make the decision, we could have eviction policy options:

  Priority-based (default)
  Prefer IPv6 over IPv4
  Prefer IPv4 over IPv6
  Prefer single path over multipath
  Prefer longer prefix lengths over shorter
  Optimize for resource utilization

These are portable across different switches. They're in terms a user understands. It's up to the device driver, which truly understands the device constraints, to translate the user's eviction policy choices into something that makes sense to that device.

-scott
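The LPM hazard described in this message (HW hitting on a lesser prefix even though SW holds the true longest match) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct route, lpm_lookup() and forward() are hypothetical names, the tables are plain arrays rather than a fib_trie, and "HW gets first shot" is modelled as consulting the hardware table before the software one.

```c
#include <stddef.h>
#include <stdint.h>

/* Build an IPv4 address in host byte order from dotted-quad parts. */
#define IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

struct route {
	uint32_t prefix;	/* network address, host byte order */
	unsigned int len;	/* prefix length, 0..32 */
	const char *nexthop;	/* label for the chosen path */
};

static uint32_t prefix_mask(unsigned int len)
{
	return len == 0 ? 0 : ~(uint32_t)0 << (32 - len);
}

/* Linear longest-prefix match over a small table; NULL if nothing matches. */
static const struct route *lpm_lookup(const struct route *tbl, size_t n,
				      uint32_t addr)
{
	const struct route *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if ((addr & prefix_mask(tbl[i].len)) != tbl[i].prefix)
			continue;
		if (!best || tbl[i].len > best->len)
			best = &tbl[i];
	}
	return best;
}

/* Hardware is consulted first; software only sees hardware misses. */
static const struct route *forward(const struct route *hw, size_t nhw,
				   const struct route *sw, size_t nsw,
				   uint32_t addr)
{
	const struct route *hit = lpm_lookup(hw, nhw, addr);

	return hit ? hit : lpm_lookup(sw, nsw, addr);
}
```

If the software table holds 10.0.0.0/8 and 10.1.0.0/16 but only the /8 made it into the hardware table, a packet to 10.1.2.3 is forwarded by the /8 entry even though the /16 is the correct match. That is why a route that failed to install cannot simply be left in SW while its shorter related prefixes stay in HW.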
Re: [RFC 1/3] net: dsa: add basic support for VLAN ndo
Hi,

- On May 29, 2015, at 11:24 AM, Or Gerlitz gerlitz...@gmail.com wrote:
> On Fri, May 29, 2015 at 12:37 AM, Vivien Didelot vivien.dide...@savoirfairelinux.com wrote:
>> @@ -854,7 +922,9 @@ int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
>>  	if (slave_dev == NULL)
>>  		return -ENOMEM;
>>
>> -	slave_dev->features = master->vlan_features;
>> +	slave_dev->features = master->vlan_features |
>> +			      NETIF_F_VLAN_FEATURES |
>> +			      NETIF_F_HW_SWITCH_OFFLOAD;
>
> wait... didn't commit 7889cbee8357aaed85898d028829dfb4f75bae2c remove NETIF_F_HW_SWITCH_OFFLOAD?

Indeed, note that this RFC is based on v4.1-rc3. This will become unneeded I guess.

BTW, given the commit message, I didn't really understand why?

Thanks,
-v
[PATCH] xen: netback: fix error printf format string.
drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’:
drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
        (txreq.offset&~PAGE_MASK) + txreq.size);
        ^

txreq.offset and .size are uint16_t fields.

Signed-off-by: Ian Campbell ian.campb...@citrix.com
---
 drivers/net/xen-netback/netback.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 4de46aa..a3b1cbb 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1248,7 +1248,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
 		/* No crossing a page as the payload mustn't fragment. */
 		if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
 			netdev_err(queue->vif->dev,
-				   "txreq.offset: %x, size: %u, end: %lu\n",
+				   "txreq.offset: %x, size: %u, end: %u\n",
 				   txreq.offset, txreq.size,
 				   (txreq.offset&~PAGE_MASK) + txreq.size);
 			xenvif_fatal_tx_err(queue->vif);
--
1.7.10.4
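The compiler reports type int for the fifth argument even though both fields are uint16_t; that follows from C integer promotion. The sketch below is a standalone illustration, not netback code. The constants 0xfff and 0xfffUL are assumptions standing in for a ~PAGE_MASK of type int and of type unsigned long respectively, and the helper names are hypothetical. It uses C11 _Generic to name the type of the sum.

```c
#include <stdint.h>

/* Name the type of an expression without evaluating it (C11). */
#define TYPE_NAME(x) _Generic((x), \
	int: "int", \
	unsigned int: "unsigned int", \
	long: "long", \
	unsigned long: "unsigned long", \
	default: "other")

/* Mask of type int: both uint16_t operands are promoted to int. */
static const char *end_type_int_mask(uint16_t offset, uint16_t size)
{
	return TYPE_NAME((offset & 0xfff) + size);
}

/* Mask of type unsigned long: the whole sum becomes unsigned long. */
static const char *end_type_ulong_mask(uint16_t offset, uint16_t size)
{
	return TYPE_NAME((offset & 0xfffUL) + size);
}
```

So the matching format specifier depends on the type of the mask as well as on the fields. Where the mask is an int the sum is an int, which is the type the warning reports; where the mask is an unsigned long the sum is an unsigned long.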
Re: [PATCH net] xen: netback: read hotplug script once at start of day.
On 29/05/15 17:24, Ian Campbell wrote:
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -235,6 +235,7 @@ static int netback_remove(struct xenbus_device *dev)
>  		kobject_uevent(&dev->dev.kobj, KOBJ_OFFLINE);
>  		xen_unregister_watchers(be->vif);
>  		xenbus_rm(XBT_NIL, dev->nodename, "hotplug-status");
> +		kfree(be->vif->hotplug_script);

Should this kfree() be in xenvif_free()?

>  		xenvif_free(be->vif);
>  		be->vif = NULL;
>  	}

David
RE: Request for advice on where to put Root Complex fix up code for downstream device
Thanks Bjorn and no issues at all about the delay -- I definitely understand how busy we all are.

I'll go ahead and submit a PCI Quirk. As part of this, would you like me to also commit a new PCI-E routine to find the Root Complex Port for a given PCI Device? It seems like it might prove useful in the future. Otherwise I'll just incorporate that loop in my PCI Quirk.

Casey

From: Bjorn Helgaas [bhelg...@google.com]
Sent: Friday, May 29, 2015 9:20 AM
To: Casey Leedom
Cc: netdev@vger.kernel.org; linux-...@vger.kernel.org
Subject: Re: Request for advice on where to put Root Complex fix up code for downstream device

Hi Casey,

Sorry, this one slipped through and I forgot to respond earlier.

On Thu, May 07, 2015 at 11:31:58PM +, Casey Leedom wrote:
> | From: Bjorn Helgaas [bhelg...@google.com]
> | Sent: Thursday, May 07, 2015 4:04 PM
> |
> | There are a lot of fixups in drivers/pci/quirks.c. For things that have to
> | be worked around either before a driver claims the device or if there is no
> | driver at all, the fixup *has* to go in drivers/pci/quirks.c
> |
> | But for things like this, where the problem can only occur after a driver
> | claims the device, I think it makes more sense to put the fixup in the
> | driver itself. The only wrinkle here is that the fixup has to be done on a
> | separate device, not the device claimed by the driver. But I think it
> | probably still makes sense to put this fixup in the driver.
>
> Okay, the example code that I provided (still quoted below) was indeed done as a fix within the cxgb4 Network Driver. I've also worked up a version as a PCI Quirk but if you and David Miller agree that the fixup code should go into cxgb4, I'm comfortable with that. I can also provide the example PCI Quirk code I worked up if you like.
>
> One complication to doing this in cxgb4 is that it attaches to Physical Function 4 of our T5 chip. Meanwhile, a completely separate storage driver, csiostor, connects to PF5 and PF6 and there's no requirement at all that cxgb4 be loaded. So if we go down the road of putting the fixup code in the cxgb4 driver, we'll also need to duplicate that code in the csiostor driver.

Sounds simpler to just put the quirk in drivers/pci/quirks.c.

> | +static void clear_root_complex_tlp_attributes(struct pci_dev *pdev)
> | +{
> | +	struct pci_bus *bus = pdev->bus;
> | +	struct pci_dev *highest_pcie_bridge = NULL;
> | +
> | +	while (bus) {
> | +		struct pci_dev *bridge = bus->self;
> | +
> | +		if (!bridge || !bridge->pcie_cap)
> | +			break;
> | +		highest_pcie_bridge = bridge;
> | +		bus = bus->parent;
> | +	}
> |
> | Can you use pci_upstream_bridge() here? There are a couple places where we
> | want to find the Root Port, so we might factor that out someday. It'll be
> | easier to find all those places if they use pci_upstream_bridge().
>
> It looks like pci_upstream_bridge() just traverses one link upstream toward the Root Complex? Or am I misunderstanding that function?

No, you're right. I was just trying to suggest using pci_upstream_bridge() instead of bus->parent->self in your loop. It wouldn't replace the loop completely.

Bjorn
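The shape of the loop Bjorn is suggesting can be sketched outside the kernel. The model below is an assumption-laden illustration only: struct model_dev, model_upstream_bridge() and model_find_root_port() are hypothetical names, the one-hop helper merely imitates how pci_upstream_bridge() is described in this thread, and is_pcie stands in for a non-zero pcie_cap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of a device in a PCI hierarchy (not struct pci_dev). */
struct model_dev {
	struct model_dev *upstream;	/* bridge directly above, NULL on the root bus */
	bool is_pcie;			/* models a non-zero pcie_cap */
};

/* One hop toward the root, as pci_upstream_bridge() is described above. */
static struct model_dev *model_upstream_bridge(struct model_dev *dev)
{
	return dev->upstream;
}

/*
 * Walk upstream one bridge at a time and remember the highest PCIe
 * bridge seen. The helper replaces the bus->parent->self step, but the
 * loop itself is still needed.
 */
static struct model_dev *model_find_root_port(struct model_dev *dev)
{
	struct model_dev *highest = NULL;
	struct model_dev *bridge = model_upstream_bridge(dev);

	while (bridge && bridge->is_pcie) {
		highest = bridge;
		bridge = model_upstream_bridge(bridge);
	}
	return highest;
}
```

For an endpoint behind a switch, the walk ends at the topmost PCIe bridge; for a device sitting directly on the root bus, or behind a non-PCIe bridge, it returns NULL, matching the early break in the quoted loop.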
Re: [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels
On 05/28/2015 12:15 PM, Alexander Duyck wrote:
> On 05/28/2015 01:40 AM, Steffen Klassert wrote:
>> On Thu, May 28, 2015 at 12:18:51AM -0700, Alexander Duyck wrote:
>>> On 05/27/2015 10:36 PM, Steffen Klassert wrote:
>>>> On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
>>>>> This change makes it so that we use icmpv6_send to report PMTU issues back into tunnels in the case that the resulting packet is larger than the MTU of the outgoing interface. Previously xfrm_local_error was being used in this case, however this was resulting in no changes, I suspect due to the fact that the tunnel itself was being kept out of the loop. This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the behavior seen if the socket was orphaned. Instead of requiring the socket to be orphaned this patch simply defaults to using icmpv6_send in the case that the frame came through a tunnel.
>>>>
>>>> We can use icmpv6_send() just in the case that the packet was already transmitted by a tunnel device, otherwise we get the bug back that I mentioned in my other mail. Not sure if we have something to know that the packet traversed a tunnel device. That's what I asked in the thread 'Looking for a lost patch'.
>>>
>>> Okay I will try to do some more digging. From what I can tell right now it looks like my ping attempts are getting hung up on the xfrm_local_error in __xfrm6_output. I wonder if we couldn't somehow make use of the skb->cb to store a pointer to the tunnel that could be checked to determine if we are going through a VTI or not.
>>
>> Maybe it is as easy as the patch below, could you please test it?
>>
>> Subject: [PATCH RFC] vti6: Add pmtu handling to vti6_xmit.
>>
>> We currently rely on the PMTU discovery of xfrm. However, if a packet is sent locally, the PMTU mechanism of xfrm tries to do local socket notification, which might not work for applications like ping that don't check for this. So add pmtu handling to vti6_xmit to report MTU changes immediately.
>>
>> Signed-off-by: Steffen Klassert steffen.klass...@secunet.com
>> ---
>>  net/ipv6/ip6_vti.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
>> index ff3bd86..13cb771 100644
>> --- a/net/ipv6/ip6_vti.c
>> +++ b/net/ipv6/ip6_vti.c
>> @@ -434,6 +434,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>>  	struct dst_entry *dst = skb_dst(skb);
>>  	struct net_device *tdev;
>>  	struct xfrm_state *x;
>> +	int mtu;
>>  	int err = -1;
>>
>>  	if (!dst)
>> @@ -468,6 +469,15 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>>  	skb_dst_set(skb, dst);
>>  	skb->dev = skb_dst(skb)->dev;
>>
>> +	mtu = dst_mtu(dst);
>> +	if (!skb->ignore_df && skb->len > mtu) {
>> +		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
>> +
>> +		icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
>> +
>> +		return -EMSGSIZE;
>> +	}
>> +
>>  	err = dst_output(skb);
>>  	if (net_xmit_eval(err) == 0) {
>>  		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);
>
> That seems to be working for me. I'm able to ping and while the first packet fails the second one and all that follow make it through correctly after the PMTU update.
>
> - Alex

It looks like I spoke too soon. It resolves it for IPv6, but IPv4 over the tunnel has the same issue. Probably need to have some sort of protocol based check to determine which version of the call to use.

- Alex
RE: [PATCH 3/7] net: dsa: ar8xxx: add regmap support
Alternatively, we could have something similar to what happens for the phy in the wireless subsystems. Wireless PHYs are not registered as net_device but they can still be listed, queried or configured through netlink. Just thinking out loud here.

Thanks,
Mathieu

-Original Message-
From: Andrew Lunn [mailto:and...@lunn.ch]
Sent: Thursday, May 28, 2015 7:44 PM
To: Florian Fainelli
Cc: Mathieu Olivari; robh...@kernel.org; pawel.m...@arm.com; mark.rutl...@arm.com; ijc+devicet...@hellion.org.uk; ga...@codeaurora.org; da...@davemloft.net; li...@roeck-us.net; gang.chen.5...@gmail.com; j...@resnulli.us; lei...@staticky.com; f...@skynet.be; pavel.nakonec...@skitlab.ru; j...@perches.com; sfel...@gmail.com; n...@openwrt.org; juh...@openwrt.org; devicet...@vger.kernel.org; linux-ker...@vger.kernel.org; netdev@vger.kernel.org
Subject: Re: [PATCH 3/7] net: dsa: ar8xxx: add regmap support

> Fair enough, are there other global things besides counters that could deserve adding maybe some sort of global/master net_device to help query switch-wide information?

This was discussed a while back. I like the current abstraction, all interfaces are real interfaces you can send and receive packets over. This pseudo interface cannot be used for packet transfer, which seems odd. Having access to registers for debugging, so debugfs seems like the best option to me.

Andrew
[PATCH] vti6: Add pmtu handling to vti6_xmit.
From: Steffen Klassert steffen.klass...@secunet.com

We currently rely on the PMTU discovery of xfrm. However, if a packet is sent locally, the PMTU mechanism of xfrm tries to do local socket notification, which might not work for applications like ping that don't check for this. So add pmtu handling to vti6_xmit to report MTU changes immediately.

Signed-off-by: Steffen Klassert steffen.klass...@secunet.com
Signed-off-by: Alexander Duyck alexander.h.du...@redhat.com
---

So this version is slightly modified to cover the IPv4 case in addition to the IPv6 case. With this patch I was able to run netperf over either an IPv4 or IPv6 address routed over the ip6_vti tunnel.

 net/ipv6/ip6_vti.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index d25209657edc..3b5c1ea50d2f 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -435,6 +435,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	struct net_device *tdev;
 	struct xfrm_state *x;
 	int err = -1;
+	int mtu;

 	if (!dst)
 		goto tx_err_link_failure;
@@ -468,6 +469,19 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	skb_dst_set(skb, dst);
 	skb->dev = skb_dst(skb)->dev;

+	mtu = dst_mtu(dst);
+	if (!skb->ignore_df && skb->len > mtu) {
+		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
+
+		if (skb->protocol == htons(ETH_P_IPV6))
+			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+		else
+			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+				  htonl(mtu));
+
+		return -EMSGSIZE;
+	}
+
 	err = dst_output(skb);
 	if (net_xmit_eval(err) == 0) {
 		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);
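The decision the added hunk makes can be modelled without any kernel types. This is a sketch only: struct model_pkt, enum pmtu_action and pmtu_check() are hypothetical, the protocol field is kept in host byte order for simplicity (the patch compares against htons(ETH_P_IPV6)), and the returned action merely names which ICMP error the patch would send.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ETH_P_IPV6 0x86DD	/* EtherType for IPv6 */

enum pmtu_action {
	PMTU_PASS,		/* packet fits or DF is ignored: transmit */
	PMTU_ICMPV6_TOOBIG,	/* IPv6 inner packet: Packet Too Big */
	PMTU_ICMP_FRAG_NEEDED,	/* otherwise: Fragmentation Needed */
};

/* Userspace model of the few skb fields the check looks at. */
struct model_pkt {
	bool ignore_df;
	unsigned int len;
	uint16_t protocol;	/* EtherType, host byte order in this model */
};

/* Mirror of the check order in the patch: size first, then protocol. */
static enum pmtu_action pmtu_check(const struct model_pkt *pkt,
				   unsigned int mtu)
{
	if (pkt->ignore_df || pkt->len <= mtu)
		return PMTU_PASS;

	if (pkt->protocol == MODEL_ETH_P_IPV6)
		return PMTU_ICMPV6_TOOBIG;

	return PMTU_ICMP_FRAG_NEEDED;
}
```

The protocol test is what distinguishes this version from the earlier RFC, which always sent the ICMPv6 error and therefore left IPv4 traffic over the tunnel unresolved.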
Re: [PATCH 0/7] net: dsa: add QCA AR8xxx switch family support
On Fri, May 29, 2015 at 04:00:01AM +0200, Andrew Lunn wrote:
> FYI: I have patches which allow DSA to use two cpu interfaces. Seems to work on my DIR665 with a Marvell Switch. I will post the patches as an RFC.
>
> Andrew

Does it require the switch CPU ports to support LAG or is it generic enough to allow switch partitioning?
Re: [PATCH net-next v2 2/3] net: systemport: rewrite bcm_sysport_rx_refill
On Fri, May 29, 2015 at 9:42 AM, Florian Fainelli f.faine...@gmail.com wrote:

Currently, bcm_sysport_desc_rx() calls bcm_sysport_rx_refill() at the end of Rx packet processing loop, after the current Rx packet has already been passed to napi_gro_receive(). However, bcm_sysport_rx_refill() might fail to allocate a new Rx skb, thus leaving a hole on the Rx queue where no valid Rx buffer exists.

To eliminate this situation:

1. Rewrite bcm_sysport_rx_refill() to retain the current Rx skb on the Rx queue if a new replacement Rx skb can't be allocated and DMA-mapped. In this case, the data on the current Rx skb is effectively dropped.

2. Modify bcm_sysport_desc_rx() to call bcm_sysport_rx_refill() at the top of Rx packet processing loop, so that the new replacement Rx skb is already in place before the current Rx skb is processed.

This is loosely inspired from d6707bec5986 ("net: bcmgenet: rewrite bcmgenet_rx_refill()")

Signed-off-by: Florian Fainelli f.faine...@gmail.com

Reviewed-by: Petri Gynther pgynt...@google.com

---
 drivers/net/ethernet/broadcom/bcmsysport.c | 87 ++
 1 file changed, 41 insertions(+), 46 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 267330ccd595..62ea403e15b8 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -524,62 +524,70 @@ static void bcm_sysport_free_cb(struct bcm_sysport_cb *cb)
 	dma_unmap_addr_set(cb, dma_addr, 0);
 }
 
-static int bcm_sysport_rx_refill(struct bcm_sysport_priv *priv,
-				 struct bcm_sysport_cb *cb)
+static struct sk_buff *bcm_sysport_rx_refill(struct bcm_sysport_priv *priv,
+					     struct bcm_sysport_cb *cb)
 {
 	struct device *kdev = &priv->pdev->dev;
 	struct net_device *ndev = priv->netdev;
+	struct sk_buff *skb, *rx_skb;
 	dma_addr_t mapping;
-	int ret;
 
-	cb->skb = netdev_alloc_skb(priv->netdev, RX_BUF_LENGTH);
-	if (!cb->skb) {
+	/* Allocate a new SKB for a new packet */
+	skb = netdev_alloc_skb(priv->netdev, RX_BUF_LENGTH);
+	if (!skb) {
+		priv->mib.alloc_rx_buff_failed++;
 		netif_err(priv, rx_err, ndev, "SKB alloc failed\n");
-		return -ENOMEM;
+		return NULL;
 	}
 
-	mapping = dma_map_single(kdev, cb->skb->data,
+	mapping = dma_map_single(kdev, skb->data,
 				 RX_BUF_LENGTH, DMA_FROM_DEVICE);
-	ret = dma_mapping_error(kdev, mapping);
-	if (ret) {
+	if (dma_mapping_error(kdev, mapping)) {
 		priv->mib.rx_dma_failed++;
-		bcm_sysport_free_cb(cb);
+		dev_kfree_skb_any(skb);
 		netif_err(priv, rx_err, ndev, "DMA mapping failure\n");
-		return ret;
+		return NULL;
 	}
 
+	/* Grab the current SKB on the ring */
+	rx_skb = cb->skb;
+	if (likely(rx_skb))
+		dma_unmap_single(kdev, dma_unmap_addr(cb, dma_addr),
+				 RX_BUF_LENGTH, DMA_FROM_DEVICE);
+
+	/* Put the new SKB on the ring */
+	cb->skb = skb;
 	dma_unmap_addr_set(cb, dma_addr, mapping);
 	dma_desc_set_addr(priv, cb->bd_addr, mapping);
 
 	netif_dbg(priv, rx_status, ndev, "RX refill\n");
 
-	return 0;
+	/* Return the current SKB to the caller */
+	return rx_skb;
 }
 
 static int bcm_sysport_alloc_rx_bufs(struct bcm_sysport_priv *priv)
 {
 	struct bcm_sysport_cb *cb;
-	int ret = 0;
+	struct sk_buff *skb;
 	unsigned int i;
 
 	for (i = 0; i < priv->num_rx_bds; i++) {
 		cb = &priv->rx_cbs[i];
-		if (cb->skb)
-			continue;
-
-		ret = bcm_sysport_rx_refill(priv, cb);
-		if (ret)
-			break;
+		skb = bcm_sysport_rx_refill(priv, cb);
+		if (skb)
+			dev_kfree_skb(skb);
+		if (!cb->skb)
+			return -ENOMEM;
 	}
 
-	return ret;
+	return 0;
 }
 
 /* Poll the hardware for up to budget packets to process */
 static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv,
 					unsigned int budget)
 {
-	struct device *kdev = &priv->pdev->dev;
 	struct net_device *ndev = priv->netdev;
 	unsigned int processed = 0, to_process;
 	struct bcm_sysport_cb *cb;
@@ -587,7 +595,6 @@ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv,
 	unsigned int p_index;
 	u16 len, status;
 	struct bcm_rsb *rsb;
-	int ret;
 
 	/* Determine how much we should process since last call */
 	p_index = rdma_readl(priv, RDMA_PROD_INDEX);
@@
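For readers who want to see the refill-before-process idea outside the driver, here is a minimal userspace sketch. It is only a model: struct rx_slot, rx_buf_alloc(), rx_slot_refill(), rx_slot_receive() and the fail_next_alloc hook are all hypothetical names, malloc() stands in for netdev_alloc_skb() plus DMA mapping, and free() stands in for handing the packet up the stack.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one Rx ring slot (not the driver's struct). */
struct rx_slot {
	void *buf;	/* buffer currently owned by the ring */
};

/* Test hook: when nonzero, the next allocation fails (models memory pressure). */
static int fail_next_alloc;

static void *rx_buf_alloc(size_t len)
{
	if (fail_next_alloc) {
		fail_next_alloc = 0;
		return NULL;
	}
	return malloc(len);
}

/*
 * Swap a fresh buffer into the slot and return the old one to the caller.
 * On allocation failure the slot is left untouched and NULL is returned,
 * so the ring never ends up with a hole.
 */
static void *rx_slot_refill(struct rx_slot *slot, size_t len)
{
	void *fresh = rx_buf_alloc(len);
	void *old;

	if (!fresh)
		return NULL;	/* keep the current buffer on the ring */

	old = slot->buf;
	slot->buf = fresh;
	return old;		/* NULL when the slot was empty (initial fill) */
}

/*
 * Handle one received packet: refill first, then consume the old buffer.
 * Returns 1 if the packet was delivered, 0 if it was dropped.
 */
static int rx_slot_receive(struct rx_slot *slot, size_t len)
{
	void *pkt = rx_slot_refill(slot, len);

	if (!pkt)
		return 0;	/* drop: the buffer stays in place and is reused */

	free(pkt);		/* stands in for passing the packet up the stack */
	return 1;
}
```

The point of the ordering is that a failed allocation costs one packet rather than one ring slot: the slot keeps a valid buffer either way.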
Re: [PATCH 3/7] net: dsa: ar8xxx: add regmap support
On Fri, May 29, 2015 at 10:36:49AM -0700, Mathieu Olivari wrote:

Alternatively, we could have something similar to what happens for the phy in the wireless subsystems. Wireless PHYs are not registered as net_device but they can still be listed, queried or configured through netlink.

It is a reasonable idea, but you retrieve most of the useful information using ethtool. That, as far as I know, operates on net_devices, not phys.

Andrew

-- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net] tcp: fix child sockets to use system default congestion control if not set
Linux 3.17 and earlier are explicitly engineered so that if the app doesn't specifically request a CC module on a listener before the SYN arrives, then the child gets the system default CC when the connection is established. See tcp_init_congestion_control() in 3.17 or earlier, which says "if no choice made yet assign the current value set as default".

The change ("net: tcp: assign tcp cong_ops when tcp sk is created") altered these semantics, so that children got their parent listener's congestion control even if the system default had changed after the listener was created.

This commit returns to those original semantics from 3.17 and earlier, since they are the original semantics from 2007 in 4d4d3d1e8 ("[TCP]: Congestion control initialization."), and some Linux congestion control workflows depend on that.

In summary, if a listener socket specifically sets TCP_CONGESTION to "x", or the route locks the CC module to "x", then the child gets "x". Otherwise the child gets current system default from net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and earlier, and this commit restores that.
Fixes: 55d8694fa82c ("net: tcp: assign tcp cong_ops when tcp sk is created")
Cc: Florian Westphal f...@strlen.de
Cc: Daniel Borkmann dbork...@redhat.com
Cc: Glenn Judd glenn.j...@morganstanley.com
Cc: Stephen Hemminger step...@networkplumber.org
Signed-off-by: Neal Cardwell ncardw...@google.com
Signed-off-by: Eric Dumazet eduma...@google.com
Signed-off-by: Yuchung Cheng ych...@google.com
---
 include/net/inet_connection_sock.h | 3 ++-
 net/ipv4/tcp_cong.c                | 5 -
 net/ipv4/tcp_minisocks.c           | 5 -
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 497bc14..0320bbb 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -98,7 +98,8 @@ struct inet_connection_sock {
 	const struct tcp_congestion_ops *icsk_ca_ops;
 	const struct inet_connection_sock_af_ops *icsk_af_ops;
 	unsigned int		  (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
-	__u8			  icsk_ca_state:7,
+	__u8			  icsk_ca_state:6,
+				  icsk_ca_setsockopt:1,
 				  icsk_ca_dst_locked:1;
 	__u8			  icsk_retransmits;
 	__u8			  icsk_pending;
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index 7a5ae50..84be008 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -187,6 +187,7 @@ static void tcp_reinit_congestion_control(struct sock *sk,
 	tcp_cleanup_congestion_control(sk);
 
 	icsk->icsk_ca_ops = ca;
+	icsk->icsk_ca_setsockopt = 1;
 
 	if (sk->sk_state != TCP_CLOSE && icsk->icsk_ca_ops->init)
 		icsk->icsk_ca_ops->init(sk);
@@ -335,8 +336,10 @@ int tcp_set_congestion_control(struct sock *sk, const char *name)
 	rcu_read_lock();
 	ca = __tcp_ca_find_autoload(name);
 	/* No change asking for existing value */
-	if (ca == icsk->icsk_ca_ops)
+	if (ca == icsk->icsk_ca_ops) {
+		icsk->icsk_ca_setsockopt = 1;
 		goto out;
+	}
 	if (!ca)
 		err = -ENOENT;
 	else if (!((ca->flags & TCP_CONG_NON_RESTRICTED) ||
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index b5732a5..17e7339 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -420,7 +420,10 @@ void tcp_ca_openreq_child(struct sock *sk, const struct dst_entry *dst)
 		rcu_read_unlock();
 	}
 
-	if (!ca_got_dst && !try_module_get(icsk->icsk_ca_ops->owner))
+	/* If no valid choice made yet, assign current system default ca. */
+	if (!ca_got_dst &&
+	    (!icsk->icsk_ca_setsockopt ||
+	     !try_module_get(icsk->icsk_ca_ops->owner)))
 		tcp_assign_congestion_control(sk);
 
 	tcp_set_ca_state(sk, TCP_CA_Open);
--
2.2.0.rc0.207.ga3a616c
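The selection rule from the summary can be written down as a small userspace decision function. This is a sketch, not kernel code: struct listener_cc and child_cc() are hypothetical names. Checking the route before the listener's own choice is my reading of the tcp_minisocks.c hunk, where the ca_got_dst test comes first; the commit message itself does not state a precedence.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a listener's state; not a kernel structure. */
struct listener_cc {
	const char *cc;		/* CC module currently attached to the listener */
	bool set_by_sockopt;	/* app called setsockopt(TCP_CONGESTION) */
};

/*
 * Pick the child's congestion control:
 *  - a CC locked by the route wins (route_cc is NULL when nothing is locked),
 *  - else an explicit setsockopt() choice on the listener is inherited,
 *  - else the child takes the system default as it is at establishment time.
 */
static const char *child_cc(const struct listener_cc *l,
			    const char *route_cc,
			    const char *system_default)
{
	if (route_cc)
		return route_cc;
	if (l->set_by_sockopt)
		return l->cc;
	return system_default;
}
```

Note that a listener created while the default was, say, cubic still carries cubic in l->cc; without the set_by_sockopt flag the child cannot tell that apart from an explicit choice, which is what the new icsk_ca_setsockopt bit is for.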
Re: [PATCH net] net: dsa: Properly propagate errors from dsa_switch_setup_one
On Fri, May 29, 2015 at 10:29:46AM -0700, Florian Fainelli wrote:

While shuffling some code around, dsa_switch_setup_one() was introduced, and it was modified to return either an error code using ERR_PTR() or a NULL pointer when running out of memory or failing to set up a switch.

This is a problem for its caller: dsa_switch_setup() which uses IS_ERR() and expects to find an error code, not a NULL pointer, so we still try to proceed with dsa_switch_setup() and operate on invalid memory addresses. This can be easily reproduced by having e.g. the bcm_sf2 driver built-in, but having no such switch, such that drv->setup will fail.

Fix this by using ERR_PTR() consistently, which is both more informative and spares the caller from having to use IS_ERR_OR_NULL().

Fixes: df197195a5248 ("net: dsa: split dsa_switch_setup into two functions")
Reported-by: Andrew Lunn and...@lunn.ch
Signed-off-by: Florian Fainelli f.faine...@gmail.com

Hi Florian

No more crash and burn :-)

Tested-by: Andrew Lunn and...@lunn.ch

Thanks
Andrew

---
 net/dsa/dsa.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index e6f6cc3a1bcf..392e29a0227d 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -359,7 +359,7 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
 	 */
 	ds = kzalloc(sizeof(*ds) + drv->priv_size, GFP_KERNEL);
 	if (ds == NULL)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	ds->dst = dst;
 	ds->index = index;
@@ -370,7 +370,7 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
 
 	ret = dsa_switch_setup_one(ds, parent);
 	if (ret)
-		return NULL;
+		return ERR_PTR(ret);
 
 	return ds;
 }
--
2.1.0
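To see why a NULL return slips past an IS_ERR() check, the error-pointer helpers can be re-created in userspace. This is an illustrative sketch: err_ptr(), ptr_err() and is_err() mimic the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() but are not the kernel's definitions, and setup_old()/setup_new() are hypothetical stand-ins for the function before and after the fix.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (models ERR_PTR()). */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer (models PTR_ERR()). */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for the top MAX_ERRNO addresses; NULL is NOT in that range. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Before the fix: failure is reported as NULL. */
static void *setup_old(int fail)
{
	static int obj;

	return fail ? NULL : (void *)&obj;
}

/* After the fix: failure carries an error code. */
static void *setup_new(int fail)
{
	static int obj;

	return fail ? err_ptr(-ENOMEM) : (void *)&obj;
}
```

With setup_old(), a caller that only tests is_err() treats the NULL as a valid object and goes on to dereference it; with setup_new(), the same test catches the failure and ptr_err() recovers the reason.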