[PATCH V1 1/3] asix: Add rx->ax_skb = NULL after usbnet_skb_return()
In asix_rx_fixup_internal() there is a risk that rx->ax_skb gets reused after passing the Ethernet frame into the network stack via usbnet_skb_return(). The risks include: a) asynchronously freeing rx->ax_skb after passing the netdev buffer to the NAPI layer which might corrupt the backlog queue. b) erroneously reusing rx->ax_skb such as calling skb_put_data() multiple times which causes writing off the end of the netdev buffer. Therefore add a defensive rx->ax_skb = NULL after usbnet_skb_return() so that it is not possible to free rx->ax_skb or to apply skb_put_data() too many times. Signed-off-by: Dean Jenkins --- drivers/net/usb/asix_common.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index 7847436..6983b6b 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -168,8 +168,10 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, if (rx->ax_skb) { skb_put_data(rx->ax_skb, skb->data + offset, copy_length); - if (!rx->remaining) + if (!rx->remaining) { usbnet_skb_return(dev, rx->ax_skb); + rx->ax_skb = NULL; + } } offset += (copy_length + 1) & 0xfffe; -- 2.7.4
[PATCH V1 3/3] asix: Fix small memory leak in ax88772_unbind()
When Ethernet frames span mulitple URBs, the netdev buffer memory pointed to by the asix_rx_fixup_info structure remains allocated during the time gap between the 2 executions of asix_rx_fixup_internal(). This means that if ax88772_unbind() is called within this time gap to free the memory of the parent private data structure then a memory leak of the part filled netdev buffer memory will occur. Therefore, create a new function asix_rx_fixup_common_free() to free the memory of the netdev buffer and add a call to asix_rx_fixup_common_free() from inside ax88772_unbind(). Consequently when an unbind occurs part way through receiving an Ethernet frame, the netdev buffer memory that is holding part of the received Ethernet frame will now be freed. Signed-off-by: Dean Jenkins --- drivers/net/usb/asix.h | 1 + drivers/net/usb/asix_common.c | 15 +++ drivers/net/usb/asix_devices.c | 1 + 3 files changed, 17 insertions(+) diff --git a/drivers/net/usb/asix.h b/drivers/net/usb/asix.h index d109242..9a4171b 100644 --- a/drivers/net/usb/asix.h +++ b/drivers/net/usb/asix.h @@ -209,6 +209,7 @@ void asix_write_cmd_async(struct usbnet *dev, u8 cmd, u16 value, int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, struct asix_rx_fixup_info *rx); int asix_rx_fixup_common(struct usbnet *dev, struct sk_buff *skb); +void asix_rx_fixup_common_free(struct asix_common_private *dp); struct sk_buff *asix_tx_fixup(struct usbnet *dev, struct sk_buff *skb, gfp_t flags); diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index fda74f3..522d290 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -210,6 +210,21 @@ int asix_rx_fixup_common(struct usbnet *dev, struct sk_buff *skb) return asix_rx_fixup_internal(dev, skb, rx); } +void asix_rx_fixup_common_free(struct asix_common_private *dp) +{ + struct asix_rx_fixup_info *rx; + + if (!dp) + return; + + rx = &dp->rx_fixup_info; + + if (rx->ax_skb) { + kfree_skb(rx->ax_skb); + rx->ax_skb = NULL; + } +} + struct sk_buff *asix_tx_fixup(struct usbnet *dev, struct sk_buff *skb, gfp_t flags) { diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c index a3aa0a2..b2ff88e 100644 --- a/drivers/net/usb/asix_devices.c +++ b/drivers/net/usb/asix_devices.c @@ -764,6 +764,7 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf) static void ax88772_unbind(struct usbnet *dev, struct usb_interface *intf) { + asix_rx_fixup_common_free(dev->driver_priv); kfree(dev->driver_priv); } -- 2.7.4
[PATCH V1 0/3] asix: Improve robustness
s the received Ethernet frame to the NAPI layer of the network stack via the call to usbnet_skb_return() in asix_rx_fixup_internal() but retains the rx->ax_skb pointer to the netdev buffer. The driver no longer needs the rx->ax_skb pointer at this point because the NAPI layer now has the Ethernet frame. This means that asix_rx_fixup_internal() must not use rx->ax_skb after the call to usbnet_skb_return() because it could corrupt the handling of the Ethernet frame within the network layer. Therefore, to remove the risk of erroneous usage of rx->ax_skb, set rx->ax_skb to NULL after the call to usbnet_skb_return(). This avoids potential erroneous freeing of rx->ax_skb and erroneous writing to the netdev buffer. If the software now somehow inappropriately reused rx->ax_skb, then a NULL pointer dereference of rx->ax_skb would occur which makes investigation easier. 2. asix: Ensure asix_rx_fixup_info members are all reset This patch creates reset_asix_rx_fixup_info() to allow all the asix_rx_fixup_info structure members to be consistently reset to initial conditions. Call reset_asix_rx_fixup_info() upon each detectable error condition so that the next URB is processed from a known state. Otherwise, there is a risk that some members of the asix_rx_fixup_info structure may be incorrect after an error occurred so potentially leading to a malfunction. 3. asix: Fix small memory leak in ax88772_unbind() This patch creates asix_rx_fixup_common_free() to allow the rx->ax_skb to be freed when necessary. asix_rx_fixup_common_free() is called from ax88772_unbind() before the parent private data structure is freed. Without this patch, there is a risk of a small netdev buffer memory leak each time ax88772_unbind() is called during the reception of an Ethernet frame that spans across 2 URBs. Testing === The patches have been sanity tested on a 64-bit Linux laptop running kernel 4.13-rc2 with the 3 patches applied on top. The ASIX USB to Adaptor used for testing was (output of lsusb): ID 0b95:772b ASIX Electronics Corp. AX88772B Test #1 --- The test ran a flood ping test script which slowly incremented the ICMP Echo Request's payload from 0 to 5000 octets. This eventually causes IPv4 fragmentation to occur which causes Ethernet frames to be sent very close to each other so increases the probability that an Ethernet frame will span 2 URBs. The test showed that all pings were successful. The test took about 15 minutes to complete. Test #2 --- A script was run on the laptop to periodically run ifdown and ifup every second so that the ASIX USB to Adaptor was up for 1 second and down for 1 second. >From a Linux PC connected to the laptop, the following ping command was used ping -f -s 5000 The large ICMP payload causes IPv4 fragmentation resulting in multiple Ethernet frames per original IP packet. Kernel debug within the ASIX driver was enabled to see whether any ASIX errors were generated. The test was run for about 24 hours and no ASIX errors were seen. Patches === The 3 patches have been rebased off the net-next repo master branch with HEAD fbbeefd net: fec: Allow reception of frames bigger than 1522 bytes Dean Jenkins (3): asix: Add rx->ax_skb = NULL after usbnet_skb_return() asix: Ensure asix_rx_fixup_info members are all reset asix: Fix small memory leak in ax88772_unbind() drivers/net/usb/asix.h | 1 + drivers/net/usb/asix_common.c | 53 ++ drivers/net/usb/asix_devices.c | 1 + 3 files changed, 45 insertions(+), 10 deletions(-) -- 2.7.4
[PATCH V1 2/3] asix: Ensure asix_rx_fixup_info members are all reset
There is a risk that the members of the structure asix_rx_fixup_info become unsynchronised leading to the possibility of a malfunction. For example, rx->split_head was not being set to false after an error was detected so potentially could cause a malformed 32-bit Data header word to be formed. Therefore add function reset_asix_rx_fixup_info() to reset all the members of asix_rx_fixup_info so that future processing will start with known initial conditions. Also, if (skb->len != offset) becomes true then call reset_asix_rx_fixup_info() so that the processing of the next URB starts with known initial conditions. Without the call, the check does nothing which potentially could lead to a malfunction when the next URB is processed. In addition, for robustness, call reset_asix_rx_fixup_info() before every error path's "return 0". This ensures that the next URB is processed from known initial conditions. Signed-off-by: Dean Jenkins --- drivers/net/usb/asix_common.c | 34 +- 1 file changed, 25 insertions(+), 9 deletions(-) diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index 6983b6b..fda74f3 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -75,6 +75,27 @@ void asix_write_cmd_async(struct usbnet *dev, u8 cmd, u16 value, u16 index, value, index, data, size); } +static void reset_asix_rx_fixup_info(struct asix_rx_fixup_info *rx) +{ + /* Reset the variables that have a lifetime outside of +* asix_rx_fixup_internal() so that future processing starts from a +* known set of initial conditions. +*/ + + if (rx->ax_skb) { + /* Discard any incomplete Ethernet frame in the netdev buffer */ + kfree_skb(rx->ax_skb); + rx->ax_skb = NULL; + } + + /* Assume the Data header 32-bit word is at the start of the current +* or next URB socket buffer so reset all the state variables. +*/ + rx->remaining = 0; + rx->split_head = false; + rx->header = 0; +} + int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, struct asix_rx_fixup_info *rx) { @@ -99,15 +120,7 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, if (size != ((~rx->header >> 16) & 0x7ff)) { netdev_err(dev->net, "asix_rx_fixup() Data Header synchronisation was lost, remaining %d\n", rx->remaining); - if (rx->ax_skb) { - kfree_skb(rx->ax_skb); - rx->ax_skb = NULL; - /* Discard the incomplete netdev Ethernet frame -* and assume the Data header is at the start of -* the current URB socket buffer. -*/ - } - rx->remaining = 0; + reset_asix_rx_fixup_info(rx); } } @@ -139,11 +152,13 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, if (size != ((~rx->header >> 16) & 0x7ff)) { netdev_err(dev->net, "asix_rx_fixup() Bad Header Length 0x%x, offset %d\n", rx->header, offset); + reset_asix_rx_fixup_info(rx); return 0; } if (size > dev->net->mtu + ETH_HLEN + VLAN_HLEN) { netdev_dbg(dev->net, "asix_rx_fixup() Bad RX Length %d\n", size); + reset_asix_rx_fixup_info(rx); return 0; } @@ -180,6 +195,7 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, if (skb->len != offset) { netdev_err(dev->net, "asix_rx_fixup() Bad SKB Length %d, %d\n", skb->len, offset); + reset_asix_rx_fixup_info(rx); return 0; } -- 2.7.4
Re: [PATCH] asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
Hi John, Thanks for your patch. I think the patch has already been applied. The git commit subject of "asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions" I think is a bit misleading as the bug relates to reception and not transmission. I guess that your intent was to say that "the through-put of communications was low" due to the bug. Personally, I would of used a git subject like "asix: Fix asix_rx_fixup_interval() offset calculation for spanned frames" But anyway, I have no real issue with the patch. On 17/05/16 04:36, John Stultz wrote: In testing with HiKey, we found that since commit 3f30b158eba5 ("asix: On RX avoid creating bad Ethernet frames"), we're seeing lots of noise during network transfers: [ 239.027993] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 239.037310] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x54ebb5ec, offset 4 [ 239.045519] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0xcdffe7a2, offset 4 [ 239.275044] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 239.284355] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x1d36f59d, offset 4 [ 239.292541] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0xaef3c1e9, offset 4 [ 239.518996] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 239.528300] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x2881912, offset 4 [ 239.536413] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x5638f7e2, offset 4 And network throughput ends up being pretty bursty and slow with a overall throughput of at best ~30kB/s (where as previously we got 1.1MB/s with the slower USB1.1 "full speed" host). We found the issue also was reproducible on a x86_64 system, using a "high-speed" USB2.0 port but the throughput did not measurably drop (possibly due to the scp transfer being cpu bound on my slow test hardware). After lots of debugging, I found the check added in the problematic commit seems to be calculating the offset incorrectly. In the normal case, in the main loop of the function, we do: (where offset is zero, or set to "offset += (copy_length + 1) & 0xfffe" in the previous loop) rx->header = get_unaligned_le32(skb->data + offset); offset += sizeof(u32); But the problematic patch calculates: offset = ((rx->remaining + 1) & 0xfffe) + sizeof(u32); rx->header = get_unaligned_le32(skb->data + offset); Adding some debug logic to check those offset calculation used to find rx->header, the one in problematic code is always too large by sizeof(u32). Thus, this patch removes the incorrect " + sizeof(u32)" addition in the problematic calculation, and resolves the issue. Cc: Dean Jenkins Cc: "David B. Robins" Cc: Mark Craske Cc: Emil Goode Cc: "David S. Miller" Cc: YongQin Liu Cc: Guodong Xu Cc: Ivan Vecera Cc: linux-...@vger.kernel.org Cc: netdev@vger.kernel.org Cc: stable #4.4+ Reported-by: Yongqin Liu Signed-off-by: John Stultz --- drivers/net/usb/asix_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index 0c5c22b..7de5ab5 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -66,7 +66,7 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, * buffer. */ if (rx->remaining && (rx->remaining + sizeof(u32) <= skb->len)) { - offset = ((rx->remaining + 1) & 0xfffe) + sizeof(u32); + offset = ((rx->remaining + 1) & 0xfffe); I have verified that this fixes my ARM board. Thanks for finding the mistake. Note that the outer set of brackets could of been removed as they are redundant. rx->header = get_unaligned_le32(skb->data + offset); offset = 0; Thanks, Best regards, Dean -- Dean Jenkins Embedded Software Engineer Linux Transportation Solutions Mentor Embedded Software Division Mentor Graphics (UK) Ltd.
Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions
= 118) Ethernet frame This could be due to IP fragmentation. ie: [ 106.946473] JDB set remaining to 1514 (skblen: 1518) [ 106.946525] JDB set remaining to 1514 (skblen: 1640) [ 106.946546] JDB set remaining to 118 (skblen: 1640) Yes, that 118 confirms my reasoning above. [ 106.946586] JDB set remaining to 1514 (skblen: 1518) So yea.. maybe that will help clue things in a bit? I'm still a bit lost. :) Your observations are consistent with missing URBs from the USB host controller. Here is a summary of what I think is happening in your case: Good case: URB #1: 1514 octets of 1514 Ethernet frame (A) URB #2: 1514 octets of 1514 Ethernet frame (B) + 526 octets of 1514 Ethernet frame (C) URB #3: 988 octets of 1514 Ethernet frame (C) URB #4: 1514 octets of 1514 Ethernet frame (D) Therefore, Ethernet frame (C) is spanning URBs #2 and #3. Bad case, URB #3 is lost: URB #1: 1514 octets of 1514 Ethernet frame (A) URB #2: 1514 octets of 1514 Ethernet frame (B) + 526 octets of 1514 Ethernet frame (C) Remaining is 988 URB #4: 1514 octets of 1514 Ethernet frame (D) But when URB #4 is analysed the 32-bit Header word is not found after 988 octets in the URB buffer so "sync lost". The end of Ethernet frame (C) is missing so drop the Ethernet frame. Now look at the start of the URB #4 buffer and find a 32-bit header word so Ethernet frame (D) can be consumed. So I think the commit is acting as intended and you are suffering from lost URBs. So perhaps you have a URB memory allocation issue for the USB host controller ? However, this would also mean x86 has the same issue. So perhaps it is a bug in common USB kernel code ? You might find it helpful to get USB analyser logs or use the software utility usbmon to grab the USB host controller data. You might then be able to compare the data from the USB log with the URB buffer data. This might confirm that URBs are going missing. On the other hand, it was suggested that removing my commit improved through-put. Is that really true ? Perhaps you can try an experiment where you add test code to add a counter to use my "sync lost" detector for 1000 URBs then by-pass the "sync lost" detector for 1000 URBs and repeat it cyclically. 1000 is a guess, you might need a higher value. Then run some tests to see whether the through-put periodically goes up and down in sympathy with the "sync lost" detector being used or not being used. Also track the RX error counter from ifconfig to see how many bad Ethernet frames got to the IP stack. Also try using wireshark to see whether you are generating IP fragmented frames. Unfortunately, using iperf with IPv6 with defaults will generate IPv6 fragmentation causing twice as many Ethernet frames to be sent as needed so is inefficient. To fix that, in iperf reduce the data length so that the MTU size of 1500 is not exceeded. The commit only runs when Ethernet frames span URBs which typically occurs when IP fragmentation happens or when there is a very high rate of Ethernet frames per second. At least, it is now clear why you see "988 remaining", that is because you are using maximum length Ethernet frames and 2 of those frames are written into a single URB which means 988 octets of the 2nd Ethernet frame goes into the 2nd URB buffer. So that makes sense and is correct. Regards, Dean thanks -john -- Dean Jenkins Embedded Software Engineer Linux Transportation Solutions Mentor Embedded Software Division Mentor Graphics (UK) Ltd.
Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions
On 06/05/16 16:27, Andrew Lunn wrote: In other words, the full-speed hub is restricting the USB to Ethernet Adaptor to a 12Mbps (half-duplex) bandwidth to support Ethernet 100Mbps (full-duplex) traffic. That is not going to work very well because Ethernet frames (perhaps partial Ethernet frames) need to be discarded within the USB link. If that really is true, the design is broken. I would expect the adaptor to reliably transfer whole frames over USB, and drop whole frames from its receive queue when the USB is congested. TCP is also going to see the USB bottleneck as just like any bottleneck in the network and back off. So TCP streams should not cause major congestion on the USB link. The host's USB host controller polls the USB to Ethernet adaptor for more data. The USB to Ethernet adaptor cannot predict when the next poll request comes. The AX88772B can span Ethernet frames across multiple poll requests. This means it is possible get a partial Ethernet frame received in the USB host controller on one poll and it is assumed that the next poll (sometime in the near future) will get the remaining part of the Ethernet frame. However, the USB to Ethernet adaptor does not contain an infinitely sized RX Ethernet buffer for the incoming Ethernet frames. I believe the USB to Ethernet adaptor is just a pipe and does not directly implement flow control for Ethernet frames so the RX buffer is going to overflow causing loss of whole Ethernet frames. I suspect the IP stack in the host computer implements flow control for Ethernet frames. Because the AX88772B can span Ethernet frames across multiple poll requests there is a risk that the designers of the device could of implemented a solution to discard the remaining part of the Ethernet frame before the next poll arrives due to the RX buffer overflowing. I don't know the algorithm used in the AX88772B but there will be loss of data due to the mismatch in bandwidths. I agree that dropping whole Ethernet frames would be preferable to dropping partial Ethernet frames which would corrupt the data stream. My suspicion is that the URB buffers are containing discontinues in the data stream because of lost data due to insufficient bandwidth on the USB link. Going over a 12Mbps USB link should be no different to hitting an old Ethernet hub which can only do 10/Half. Not exactly, because USB is a transport link which is agnostic to the type of data that is flowing. It is up to the layers above USB to manage the data content. In other words, the USB speed needs to be higher than the Ethernet speed to avoid mismatches in bandwidth. Therefore please retest with a working high-speed USB hub or remove the full-speed USB hub from the test environment and directly connect the USB to Ethernet Adaptor to the root hub of the USB port. Then repeat the tests to see whether anything improved. In other words, you need to eliminate the dmesg messages saying "not running at top speed; connect to a high speed hub". I would also suggest testing with the Ethernet at 10/half. You should be able to use Ethtool to set that up. Your USB and Ethernet bandwidth become more equal. If you still see errors, it suggests a protocol implementation error somewhere. I agree with the suggestion but I hope USB high speed (480Mbps) operation was the intended environment rather than the useless USB full speed (12Mbps) operation. Let's hope that not using the USB hub improves things. Regards, Dean Andrew -- Dean Jenkins Embedded Software Engineer Linux Transportation Solutions Mentor Embedded Software Division Mentor Graphics (UK) Ltd.
Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions
not found in the expected location at the start of the URB buffer. [ 41.938370] asix 1-1.2:1.0 eth0: asix_rx_fixup() Bad Header Length 0x11400040, offset 4 This evidence is suggesting either the data stream is garbled such as by many dropped URBs or lost partial Ethernet frames, or the ASIX protocol for the AX88772B chipset differs from the AX88772 chipset so the ASIX driver is looking in the wrong place for the 32-bit header word. I suspect data is lost due to the restricted 12Mbps bandwidth. My conclusion is that your USB to Ethernet Adaptor is not running at high speed (480Mbps) mode which is causing a partial loss (corruption) of Ethernet frames across the USB link. A USB Protocol Analyser or software tool usbmon could be used to confirm this scenario. Therefore please retest with a working high-speed USB hub or remove the full-speed USB hub from the test environment and directly connect the USB to Ethernet Adaptor to the root hub of the USB port. Then repeat the tests to see whether anything improved. In other words, you need to eliminate the dmesg messages saying "not running at top speed; connect to a high speed hub". Best regards, Dean -Guodong Xu -- Dean Jenkins Embedded Software Engineer Linux Transportation Solutions Mentor Embedded Software Division Mentor Graphics (UK) Ltd.
Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions
On 05/05/16 00:45, John Stultz wrote: On Tue, May 3, 2016 at 3:54 AM, Dean Jenkins wrote: On 03/05/16 11:04, Guodong Xu wrote: did you test on ARM 64-bit system or ARM 32-bit? I ask because HiKey is an ARM 64-bit system. I suggest we should be careful on that. I saw similar issues when transferring to a 64-bit system in other net drivers. We used 32-bit ARM and never tested on 64-bit ARM so I suggest that the commits need to be reviewed with 64-bit OS in mind. Do you have any suggestion on this regard? Try testing on a Linux PC x86 32-bit OS which has has a kernel containing my ASIX commits. This will help to confirm whether the failure is related to 32-bit or 64-bit OS. Then try with Linux PC x86 64-bit OS, this should fail otherwise it points to something specific in your ARM 64-bit platform. Just as a sample point, I have managed to reproduce exactly this issue on an x86_64 system by simply scp'ing a large file. Please tell us the x86_64 kernel version number that you used and which Linux Distribution it was ? This allows other people a chance to reproduce your observations. [ 417.819276] asix 1-5:1.0 eth1: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 It is interesting that the reported "remaining" value is 988. Is 988 always shown ? I mean that do you see any other "remaining" values for the "Data Header synchronisation was lost" error message ? [ 417.823415] asix 1-5:1.0 eth1: asix_rx_fixup() Bad Header Length 0xef830347, offset 4 The gap in the timestamps shows 417.823415 - 417.819276 = 0.004139 = 4ms which is a large gap in terms of USB 2.0 high speed communications. This gap is expected to be in the 100us range for consecutive URBs. So 4ms is strange. The expectation is that the "Data Header synchronisation was lost" error message resets the 32-bit header word synchronisation to the start of the next URB buffer. The "Bad Header Length, offset 4" is the expected outcome for the next URB because it is unlikely the 32-bit header word is at the start of URB buffer due to Ethernet frames spanning across URBs. [ 417.827502] asix 1-5:1.0 eth1: asix_rx_fixup() Bad Header Length 0x31e2b348, offset 4 Timestamps show the gap to be 4ms which is strange for USB 2.0 high speed, are you sure high speed mode is being used ? [ 417.843779] asix 1-5:1.0 eth1: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 417.847921] asix 1-5:1.0 eth1: asix_rx_fixup() Bad Header Length 0x8af91035, offset 4 [ 417.852004] asix 1-5:1.0 eth1: asix_rx_fixup() Bad Header Length 0x8521fa03, offset 4 [ 418.273394] asix 1-5:1.0 eth1: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 418.277532] asix 1-5:1.0 eth1: asix_rx_fixup() Bad Header Length 0x33cd9c7c, offset 4 [ 418.281683] asix 1-5:1.0 eth1: asix_rx_fixup() Bad Header Length 0x3d850896, offset 4 [ 418.286227] asix 1-5:1.0 eth1: asix_rx_fixup() Bad Header Length 0x86443357, offset 4 [ 418.290319] asix 1-5:1.0 eth1: asix_rx_fixup() Bad Header Length 0xee6c81d1, offset 4 I don't have any 32bit x86 installs around so I'm not sure I can easly test there, but its clear its not arm64 specific. I agree the issue is not specific to your ARM 64 bit platform. Please can you supply the output of ifconfig for the USB to Ethernet adaptor, your example above shows eth1 as the device. Please show the output of ifconfig eth1 before and after the issue is seen. This will show us whether the kernel logs any network errors and how many bytes have been transferred. After the issue is seen, please can you show us the output of "dmesg | grep asix" so that we can see status messages from the ASIX driver that the USB to Ethernet adaptor is using. In particular we need to check that USB high speed operation (480Mbps) is being used and not full speed operation (12Mbps). Thanks, Regards, Dean thanks -john -- Dean Jenkins Embedded Software Engineer Linux Transportation Solutions Mentor Embedded Software Division Mentor Graphics (UK) Ltd.
Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions
On 04/05/16 01:28, David B. Robins wrote: Here is the code snippet from the patch with my annotations between # #, I will try to explain my intentions. Feel free to point out any flaws: if (rx->remaining && (rx->remaining + sizeof(u32) <= skb->len)) { # Only runs when rx->remaining !=0 and the end of the Ethernet frame + next 32-bit header word is within the URB buffer. # # Therefore, this code does not run when the end of an Ethernet frame has been reached in the previous URB # # or when the end of the Ethernet frame + next 32-bit header word will be in a later URB buffer # It may well be. I don't have the setup with me now, but I can try tomorrow to reproduce an environment where I can add some more detailed logging. Since the URB length has to be >= than the remaining data plus a u32, the devices that John Stultz and I are using (AX88772B in my case) may be adding some additional data/padding after an Ethernet frame, expecting it to be discarded, and running into this check and its consequences. This may mean the device is badly behaved, if it is specified not to send anything extra; in any case, a well-intentioned error correction has gone badly, but I better understand the intent now. I am curious to know how often the device you are using benefits from this block of code. The issue is that the driver should be robust to cope with missing URBs. Whilst testing with D-Link DUB-E100 C1 AX88772 USB to Ethernet adaptor in our ARM embedded system which runs in hostile environments, it was noticed that URBs could be lost (probably due to a bug elsewhere or low memory issue). Without this patch, a missing URB causes bad Ethernet frames to be passed up to the IP stack because rx->remaining spans multiple URBs. In the good case of an Ethernet frame spanning 2 URBs, the 1st URB is processed and copies the 1st part of the Ethernet frame into the netdev buffer, for the 2nd URB the remaining part of the Ethernet frame is copied into the same netdev buffer to complete the Ethernet frame. The netdev buffer is then sent up to the IP stack. In the case of a missing URB, a bad Ethernet frame is created as follows: The 1st URB is processed and copies the 1st part of the Ethernet frame into the netdev buffer, the 2nd URB is lost (somehow), the 3rd URB is processed and blindly copies what it thinks is the remaining part of the Ethernet frame in the same netdev buffer which corrupts the Ethernet frame. The netdev buffer is then sent up to the IP stack. The 3rd URB and subsequent URBs are processed but synchronisation has been lost so can misread data as a 32-bit header word. It is likely that some good Ethernet frames get discarded whilst trying to resynchronise. A recovery strategy for regaining lock with the 32-bit header word is necessary otherwise the driver will have difficulty in recovering from a lost URB. In the "olden days", the 32-bit header word was always at the start of the URB buffer so previous URBs did not influence the current URB. So no recovery strategy was needed at that time. But now we have to remember what happened in the previous URB and a lost URB can cause a discontinuity in the data stream because the data is not always aligned to the start of the URB buffer. I agree that your environment may never suffer from lost URBs so removal of the patch would work OK. I will try to find some time to setup a test environment. Regards, Dean -- Dean Jenkins Embedded Software Engineer Linux Transportation Solutions Mentor Embedded Software Division Mentor Graphics (UK) Ltd.
Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions
s provided by John. This does not suggest a random failure unless there are other examples of a non 988 remaining value error message. 988 is well within a Ethernet frame length so seems to be valid. I think a good step would be to add some debug to print the rx->remaining value at entry to asix_rx_fixup_internal(). This would generate a lot of debug but a pattern of the values might emerge. A good test would be to run "ping -c 1 -s $packet_length $ip_address" inside a script which has a loop with an increasing payload length $packet_length with a small delay between ping calls. This will show whether particular packet sizes trigger the failures. Then try with "ping -f -c 200 -s $packet_length $ip_address" to load up the USB link. Seems that I need kernel v4.4 or later to get a kernel with my patch in. This will take me a few days to find time to rig something up to test... Regards, Dean -- Dean Jenkins Embedded Software Engineer Linux Transportation Solutions Mentor Embedded Software Division Mentor Graphics (UK) Ltd.
Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions
On 03/05/16 11:04, Guodong Xu wrote: On 3 May 2016 at 17:23, Dean Jenkins wrote: On 03/05/16 05:55, John Stultz wrote: In testing with HiKey, we found that since commit 3f30b158eba5c60 (asix: On RX avoid creating bad Ethernet frames), we're seeing lots of noise during network transfers: [ 239.027993] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 239.037310] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x54ebb5ec, offset 4 [ 239.045519] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0xcdffe7a2, offset 4 [ 239.275044] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 239.284355] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x1d36f59d, offset 4 [ 239.292541] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0xaef3c1e9, offset 4 [ 239.518996] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 239.528300] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x2881912, offset 4 [ 239.536413] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x5638f7e2, offset 4 And network throughput ends up being pretty bursty and slow with a overall throughput of at best ~30kB/s. Looking through the commits since the v4.1 kernel where we didn't see this, I narrowed the regression down, and reverting the following two commits seems to avoid the problem: 6a570814cd430fa5ef4f278e8046dcf12ee63f13 asix: Continue processing URB if no RX netdev buffer 3f30b158eba5c604b6e0870027eef5d19fc9271d asix: On RX avoid creating bad Ethernet frames With these reverted, we don't see all the error messages, and we see better ~1.1MB/s throughput (I've got a mouse plugged in, so I think the usb host is only running at "full-speed" mode here). This worries me some, as the patches seem to describe trying to fix the issue they seem to cause, so I suspect a revert isn't the correct solution, but am not sure why we're having such trouble and the patch authors did not. I'd be happy to do further testing of patches if folks have any ideas. Originally Reported-by: Yongqin Liu thanks -john Hi John, Some ASIX chipsets span the Ethernet frame over consecutive URBs which requires successful transfer of 2 URBs. This means states of a previous URB influences the processing of the next URB including a dropped URB (causes a discontinuity in the data stream). In other words synchronisation of the in-band 32-bit header word needs to be tracked between URBs. Some ASIX chipsets allow the in-band 32-bit header word to be no longer fixed to the start of the URB buffer so it moves to any position within the URB buffer. I understand your point of suggesting it is a "regression" for your device but the driver was broken for DUB-E100 C1 (small black USB device). So you cannot revert the commits as this would break DUB-E100 C1 (small black USB device). 6a570814cd430fa5ef4f278e8046dcf12ee63f13 asix: Continue processing URB if no RX netdev buffer This commit is necessary because it avoids a crash when netdev buffer failed to be allocated for the 1st URB and the 2nd URB containing a spanned Ethernet frame is processed. The crash happens because the 2nd URB assumed that the netdev buffer had been allocated. 3f30b158eba5c604b6e0870027eef5d19fc9271d asix: On RX avoid creating bad Ethernet frames This commit is necessary to avoid sending bad Ethernet frames into the IP stack during loss of synchronisation and to dropping good Ethernet frames. This commit improves the synchronisation recovery mechanism of the in-band 32-bit header word. The ASIX USB to Ethernet devices these commits were tested on where DUB-E100 C1 (small black USB device). Embedded ARM based systems were used where memory resources can run out. I don't have the chance to look into detail yet. But just a caution, did you test on ARM 64-bit system or ARM 32-bit? I ask because HiKey is an ARM 64-bit system. I suggest we should be careful on that. I saw similar issues when transferring to a 64-bit system in other net drivers. We used 32-bit ARM and never tested on 64-bit ARM so I suggest that the commits need to be reviewed with 64-bit OS in mind. Do you have any suggestion on this regard? Try testing on a Linux PC x86 32-bit OS which has has a kernel containing my ASIX commits. This will help to confirm whether the failure is related to 32-bit or 64-bit OS. Then try with Linux PC x86 64-bit OS, this should fail otherwise it points to something specific in your ARM 64-bit platform. It could be that for your USB to Ethernet device that the wrong configuration settings have been used. In other words the ASIX driver is flexible to support various variants of the ASIX chipsets. For example, does your device support Ethernet frames spanning multiple URBs (multiple USB transfers) ? Would you please suggest how to find out this information? How can I c
Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions
On 03/05/16 05:55, John Stultz wrote: In testing with HiKey, we found that since commit 3f30b158eba5c60 (asix: On RX avoid creating bad Ethernet frames), we're seeing lots of noise during network transfers: [ 239.027993] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 239.037310] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x54ebb5ec, offset 4 [ 239.045519] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0xcdffe7a2, offset 4 [ 239.275044] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 239.284355] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x1d36f59d, offset 4 [ 239.292541] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0xaef3c1e9, offset 4 [ 239.518996] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988 [ 239.528300] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x2881912, offset 4 [ 239.536413] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x5638f7e2, offset 4 And network throughput ends up being pretty bursty and slow with a overall throughput of at best ~30kB/s. Looking through the commits since the v4.1 kernel where we didn't see this, I narrowed the regression down, and reverting the following two commits seems to avoid the problem: 6a570814cd430fa5ef4f278e8046dcf12ee63f13 asix: Continue processing URB if no RX netdev buffer 3f30b158eba5c604b6e0870027eef5d19fc9271d asix: On RX avoid creating bad Ethernet frames With these reverted, we don't see all the error messages, and we see better ~1.1MB/s throughput (I've got a mouse plugged in, so I think the usb host is only running at "full-speed" mode here). This worries me some, as the patches seem to describe trying to fix the issue they seem to cause, so I suspect a revert isn't the correct solution, but am not sure why we're having such trouble and the patch authors did not. I'd be happy to do further testing of patches if folks have any ideas. Originally Reported-by: Yongqin Liu thanks -john Hi John, Some ASIX chipsets span the Ethernet frame over consecutive URBs which requires successful transfer of 2 URBs. This means states of a previous URB influences the processing of the next URB including a dropped URB (causes a discontinuity in the data stream). In other words synchronisation of the in-band 32-bit header word needs to be tracked between URBs. Some ASIX chipsets allow the in-band 32-bit header word to be no longer fixed to the start of the URB buffer so it moves to any position within the URB buffer. I understand your point of suggesting it is a "regression" for your device but the driver was broken for DUB-E100 C1 (small black USB device). So you cannot revert the commits as this would break DUB-E100 C1 (small black USB device). 6a570814cd430fa5ef4f278e8046dcf12ee63f13 asix: Continue processing URB if no RX netdev buffer This commit is necessary because it avoids a crash when netdev buffer failed to be allocated for the 1st URB and the 2nd URB containing a spanned Ethernet frame is processed. The crash happens because the 2nd URB assumed that the netdev buffer had been allocated. 3f30b158eba5c604b6e0870027eef5d19fc9271d asix: On RX avoid creating bad Ethernet frames This commit is necessary to avoid sending bad Ethernet frames into the IP stack during loss of synchronisation and to dropping good Ethernet frames. This commit improves the synchronisation recovery mechanism of the in-band 32-bit header word. The ASIX USB to Ethernet devices these commits were tested on where DUB-E100 C1 (small black USB device). Embedded ARM based systems were used where memory resources can run out. It could be that for your USB to Ethernet device that the wrong configuration settings have been used. In other words the ASIX driver is flexible to support various variants of the ASIX chipsets. For example, does your device support Ethernet frames spanning multiple URBs (multiple USB transfers) ? So I doubt my commits are "broken" because we don't see your failures (not tested your device). It is more likely that your ASIX device needs to be properly identified and configured to be compatible with the ASIX driver. At least, I suggest that is the best place to start your investigation. Of course, your ASIX chipset might have a different behaviour for how the in-band 32-bit header word operates so perhaps special treatment is needed for your chipset ? Please send to the mailing list the output of lsusb for your device so that people can know the USB product ID and vendor ID for your device. This is allows people to assist with the investigation. Do you have any links to websites that sell your device ? Are you using UDP or TCP connections ? Regards, Dean -- Dean Jenkins Embedded Software Engineer Linux Transportation Solutions Mentor Embedded Software Division Mentor Graphics (UK) Ltd.
[PATCH v1 0/5] Improve ASIX RX memory allocation error handling
From: Mark Craske Please ignore the cover letter PATCH v2 as sent in error. Patches are all v1, (there are no v2 patches yet) The ASIX RX handler algorithm is weak on error handling. There is a design flaw in the ASIX RX handler algorithm because the implementation for handling RX Ethernet frames for the DUB-E100 C1 can have Ethernet frames spanning multiple URBs. This means that payload data from more than 1 URB is sometimes needed to fill the socket buffer with a complete Ethernet frame. When the URB with the start of an Ethernet frame is received then an attempt is made to allocate a socket buffer. If the memory allocation fails then the algorithm sets the buffer pointer member to NULL and the function exits (no crash yet). Subsequently, the RX hander is called again to process the next URB which assumes there is a socket buffer available and the kernel crashes when there is no buffer. This patchset implements an improvement to the RX handling algorithm to avoid a crash when no memory is available for the socket buffer. The patchset will apply cleanly to the net-next master branch but the created kernel has not been tested. The driver was tested on ARM kernels v3.8 and v3.14 for a commercial product. Dean Jenkins (5): asix: Rename remaining and size for clarity asix: Tidy-up 32-bit header word synchronisation asix: Simplify asix_rx_fixup_internal() netdev alloc asix: On RX avoid creating bad Ethernet frames asix: Continue processing URB if no RX netdev buffer drivers/net/usb/asix.h|2 +- drivers/net/usb/asix_common.c | 114 ++--- 2 files changed, 74 insertions(+), 42 deletions(-) -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v1 4/5] asix: On RX avoid creating bad Ethernet frames
When RX Ethernet frames span multiple URB socket buffers, the data stream may suffer a discontinuity which will cause the current Ethernet frame in the netdev socket buffer to be incomplete. This frame needs to be discarded instead of appending unrelated data from the current URB socket buffer to the Ethernet frame in the netdev socket buffer. This avoids creating a corrupted Ethernet frame in the netdev socket buffer. A discontinuity can occur when the previous URB socket buffer held an incomplete Ethernet frame due to truncation or a URB socket buffer containing the end of the Ethernet frame was missing. Therefore, add a sanity test for when an Ethernet frame spans multiple URB socket buffers to check that the remaining bytes of the currently received Ethernet frame point to a good Data header 32-bit word of the next Ethernet frame. Upon error, reset the remaining bytes variable to zero and discard the current netdev socket buffer. Assume that the Data header is located at the start of the current socket buffer and attempt to process the next Ethernet frame from there. This avoids unnecessarily discarding a good URB socket buffer that contains a new Ethernet frame. Signed-off-by: Dean Jenkins Signed-off-by: Mark Craske --- drivers/net/usb/asix_common.c | 28 1 file changed, 28 insertions(+) diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index 6a8eddf..1e4768d 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -56,6 +56,34 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, int offset = 0; u16 size; + /* When an Ethernet frame spans multiple URB socket buffers, +* do a sanity test for the Data header synchronisation. +* Attempt to detect the situation of the previous socket buffer having +* been truncated or a socket buffer was missing. These situations +* cause a discontinuity in the data stream and therefore need to avoid +* appending bad data to the end of the current netdev socket buffer. +* Also avoid unnecessarily discarding a good current netdev socket +* buffer. +*/ + if (rx->remaining && (rx->remaining + sizeof(u32) <= skb->len)) { + offset = ((rx->remaining + 1) & 0xfffe) + sizeof(u32); + rx->header = get_unaligned_le32(skb->data + offset); + offset = 0; + + size = (u16)(rx->header & 0x7ff); + if (size != ((~rx->header >> 16) & 0x7ff)) { + netdev_err(dev->net, "asix_rx_fixup() Data Header synchronisation was lost, remaining %d\n", + rx->remaining); + kfree_skb(rx->ax_skb); + rx->ax_skb = NULL; + /* Discard the incomplete netdev Ethernet frame and +* assume the Data header is at the start of the current +* URB socket buffer. +*/ + rx->remaining = 0; + } + } + while (offset + sizeof(u16) <= skb->len) { u16 copy_length; unsigned char *data; -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/5] Improve ASIX RX memory allocation error handling
From: Mark Craske The ASIX RX handler algorithm is weak on error handling. There is a design flaw in the ASIX RX handler algorithm because the implementation for handling RX Ethernet frames for the DUB-E100 C1 can have Ethernet frames spanning multiple URBs. This means that payload data from more than 1 URB is sometimes needed to fill the socket buffer with a complete Ethernet frame. When the URB with the start of an Ethernet frame is received then an attempt is made to allocate a socket buffer. If the memory allocation fails then the algorithm sets the buffer pointer member to NULL and the function exits (no crash yet). Subsequently, the RX hander is called again to process the next URB which assumes there is a socket buffer available and the kernel crashes when there is no buffer. This patchset implements an improvement to the RX handling algorithm to avoid a crash when no memory is available for the socket buffer. The patchset will apply cleanly to the net-next master branch but the created kernel has not been tested. The driver was tested on ARM kernels v3.8 and v3.14 for a commercial product. Dean Jenkins (5): asix: Rename remaining and size for clarity asix: Tidy-up 32-bit header word synchronisation asix: Simplify asix_rx_fixup_internal() netdev alloc asix: On RX avoid creating bad Ethernet frames asix: Continue processing URB if no RX netdev buffer drivers/net/usb/asix.h|2 +- drivers/net/usb/asix_common.c | 114 ++--- 2 files changed, 74 insertions(+), 42 deletions(-) -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v1 2/5] asix: Tidy-up 32-bit header word synchronisation
Tidy-up the Data header 32-bit word synchronisation logic in asix_rx_fixup_internal() by removing redundant logic tests. The code is looking at the following cases of the Data header 32-bit word that is present before each Ethernet frame: a) all 32 bits of the Data header word are in the URB socket buffer b) first 16 bits of the Data header word are at the end of the URB socket buffer c) last 16 bits of the Data header word are at the start of the URB socket buffer eg. split_head = true Note that the lifetime of rx->split_head exists outside of the function call and is accessed per processing of each URB. Therefore, split_head being true acts on the next URB to be processed. To check for b) the offset will be 16 bits (2 bytes) from the end of the buffer then indicate split_head is true. To check for c) split_head must be true because the first 16 bits have been found. To check for a) else c) Note that the || logic of the old code included the state (skb->len - offset == sizeof(u16) && rx->split_head) which is not possible because the split_head cannot be true whilst checking for b). This is because the split_head indicates that the first 16 bits have been found and that is not possible whilst checking for the first 16 bits. Therefore simplify the logic. Signed-off-by: Dean Jenkins Signed-off-by: Mark Craske --- drivers/net/usb/asix_common.c | 28 +--- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index 2bd5bdd..89efd6a 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -61,21 +61,19 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, unsigned char *data; if (!rx->remaining) { - if ((skb->len - offset == sizeof(u16)) || - rx->split_head) { - if(!rx->split_head) { - rx->header = get_unaligned_le16( - skb->data + offset); - rx->split_head = true; - offset += sizeof(u16); - break; - } else { - rx->header |= (get_unaligned_le16( - skb->data + offset) - << 16); - rx->split_head = false; - offset += sizeof(u16); - } + if (skb->len - offset == sizeof(u16)) { + rx->header = get_unaligned_le16( + skb->data + offset); + rx->split_head = true; + offset += sizeof(u16); + break; + } + + if (rx->split_head == true) { + rx->header |= (get_unaligned_le16( + skb->data + offset) << 16); + rx->split_head = false; + offset += sizeof(u16); } else { rx->header = get_unaligned_le32(skb->data + offset); -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v1 5/5] asix: Continue processing URB if no RX netdev buffer
Avoid a loss of synchronisation of the Ethernet Data header 32-bit word due to a failure to get a netdev socket buffer. The ASIX RX handling algorithm returned 0 upon a failure to get an allocation of a netdev socket buffer. This causes the URB processing to stop which potentially causes a loss of synchronisation with the Ethernet Data header 32-bit word. Therefore, subsequent processing of URBs may be rejected due to a loss of synchronisation. This may cause additional good Ethernet frames to be discarded along with outputting of synchronisation error messages. Implement a solution which checks whether a netdev socket buffer has been allocated before trying to copy the Ethernet frame into the netdev socket buffer. But continue to process the URB so that synchronisation is maintained. Therefore, only a single Ethernet frame is discarded when no netdev socket buffer is available. Signed-off-by: Dean Jenkins Signed-off-by: Mark Craske --- drivers/net/usb/asix_common.c | 31 +++ 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index 1e4768d..a186b0a 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -74,12 +74,14 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, if (size != ((~rx->header >> 16) & 0x7ff)) { netdev_err(dev->net, "asix_rx_fixup() Data Header synchronisation was lost, remaining %d\n", rx->remaining); - kfree_skb(rx->ax_skb); - rx->ax_skb = NULL; - /* Discard the incomplete netdev Ethernet frame and -* assume the Data header is at the start of the current -* URB socket buffer. -*/ + if (rx->ax_skb) { + kfree_skb(rx->ax_skb); + rx->ax_skb = NULL; + /* Discard the incomplete netdev Ethernet frame +* and assume the Data header is at the start of +* the current URB socket buffer. +*/ + } rx->remaining = 0; } } @@ -121,9 +123,12 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, return 0; } + /* Sometimes may fail to get a netdev socket buffer but +* continue to process the URB socket buffer so that +* synchronisation of the Ethernet frame Data header +* word is maintained. +*/ rx->ax_skb = netdev_alloc_skb_ip_align(dev->net, size); - if (!rx->ax_skb) - return 0; rx->remaining = size; } @@ -136,10 +141,12 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, rx->remaining = 0; } - data = skb_put(rx->ax_skb, copy_length); - memcpy(data, skb->data + offset, copy_length); - if (!rx->remaining) - usbnet_skb_return(dev, rx->ax_skb); + if (rx->ax_skb) { + data = skb_put(rx->ax_skb, copy_length); + memcpy(data, skb->data + offset, copy_length); + if (!rx->remaining) + usbnet_skb_return(dev, rx->ax_skb); + } offset += (copy_length + 1) & 0xfffe; } -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v1 3/5] asix: Simplify asix_rx_fixup_internal() netdev alloc
The code is checking that the Ethernet frame will fit into a netdev allocated socket buffer within the constraints of MTU size, Ethernet header length plus VLAN header length. The original code was checking rx->remaining each loop of the while loop that processes multiple Ethernet frames per URB and/or Ethernet frames that span across URBs. rx->remaining decreases per while loop so there is no point in potentially checking multiple times that the Ethernet frame (remaining part) will fit into the netdev socket buffer. The modification checks that the size of the Ethernet frame will fit the netdev socket buffer before allocating the netdev socket buffer. This avoids grabbing memory and then deciding that the Ethernet frame is too big and then freeing the memory. Signed-off-by: Dean Jenkins Signed-off-by: Mark Craske --- drivers/net/usb/asix_common.c | 16 +++- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index 89efd6a..6a8eddf 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -87,19 +87,17 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, rx->header, offset); return 0; } + if (size > dev->net->mtu + ETH_HLEN + VLAN_HLEN) { + netdev_err(dev->net, "asix_rx_fixup() Bad RX Length %d\n", + size); + return 0; + } + rx->ax_skb = netdev_alloc_skb_ip_align(dev->net, size); if (!rx->ax_skb) return 0; - rx->remaining = size; - } - if (rx->remaining > dev->net->mtu + ETH_HLEN + VLAN_HLEN) { - netdev_err(dev->net, "asix_rx_fixup() Bad RX Length %d\n", - rx->remaining); - kfree_skb(rx->ax_skb); - rx->ax_skb = NULL; - rx->remaining = 0; - return 0; + rx->remaining = size; } if (rx->remaining > skb->len - offset) { -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v1 1/5] asix: Rename remaining and size for clarity
The Data header synchronisation is easier to understand if the variables "remaining" and "size" are renamed. Therefore, the lifetime of the "remaining" variable exists outside of asix_rx_fixup_internal() and is used to indicate any remaining pending bytes of the Ethernet frame that need to be obtained from the next socket buffer. This allows an Ethernet frame to span across multiple socket buffers. "size" is now local to asix_rx_fixup_internal() and contains the size read from the Data header 32-bit word. Add "copy_length" to hold the number of the Ethernet frame bytes (maybe a part of a full frame) that are to be copied out of the socket buffer. Signed-off-by: Dean Jenkins Signed-off-by: Mark Craske --- drivers/net/usb/asix.h|2 +- drivers/net/usb/asix_common.c | 41 + 2 files changed, 22 insertions(+), 21 deletions(-) diff --git a/drivers/net/usb/asix.h b/drivers/net/usb/asix.h index 5d049d0..a2d3ea6 100644 --- a/drivers/net/usb/asix.h +++ b/drivers/net/usb/asix.h @@ -168,7 +168,7 @@ struct asix_data { struct asix_rx_fixup_info { struct sk_buff *ax_skb; u32 header; - u16 size; + u16 remaining; bool split_head; }; diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index 75d6f26..2bd5bdd 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -54,12 +54,13 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, struct asix_rx_fixup_info *rx) { int offset = 0; + u16 size; while (offset + sizeof(u16) <= skb->len) { - u16 remaining = 0; + u16 copy_length; unsigned char *data; - if (!rx->size) { + if (!rx->remaining) { if ((skb->len - offset == sizeof(u16)) || rx->split_head) { if(!rx->split_head) { @@ -81,42 +82,42 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, offset += sizeof(u32); } - /* get the packet length */ - rx->size = (u16) (rx->header & 0x7ff); - if (rx->size != ((~rx->header >> 16) & 0x7ff)) { + /* take frame length from Data header 32-bit word */ + size = (u16)(rx->header & 0x7ff); + if (size != ((~rx->header >> 16) & 0x7ff)) { netdev_err(dev->net, "asix_rx_fixup() Bad Header Length 0x%x, offset %d\n", rx->header, offset); - rx->size = 0; return 0; } - rx->ax_skb = netdev_alloc_skb_ip_align(dev->net, - rx->size); + rx->ax_skb = netdev_alloc_skb_ip_align(dev->net, size); if (!rx->ax_skb) return 0; + rx->remaining = size; } - if (rx->size > dev->net->mtu + ETH_HLEN + VLAN_HLEN) { + if (rx->remaining > dev->net->mtu + ETH_HLEN + VLAN_HLEN) { netdev_err(dev->net, "asix_rx_fixup() Bad RX Length %d\n", - rx->size); + rx->remaining); kfree_skb(rx->ax_skb); rx->ax_skb = NULL; - rx->size = 0U; - + rx->remaining = 0; return 0; } - if (rx->size > skb->len - offset) { - remaining = rx->size - (skb->len - offset); - rx->size = skb->len - offset; + if (rx->remaining > skb->len - offset) { + copy_length = skb->len - offset; + rx->remaining -= copy_length; + } else { + copy_length = rx->remaining; + rx->remaining = 0; } - data = skb_put(rx->ax_skb, rx->size); - memcpy(data, skb->data + offset, rx->size); - if (!remaining) + data = skb_put(rx->ax_skb, copy_length); + memcpy(data, skb->data + offset, copy_length); + if (!rx->remaining) usbnet_skb_return(dev, rx->ax_skb); - offset += (rx->size + 1) & 0xfffe; - r
Re: [PATCH] net: usb: asix: Fix crash on skb alloc failure
If asix_rx_fixup_internal() fails to allocate rx->ax_skb, it will return but not clear rx->size. rx points to driver private data. A later call assumes that nonzero size means ax_skb was allocated and passes a null ax_skb to skb_put. Changed allocation failure return to clear size first. Found testing board with AX88772B devices. Signed-off-by: David B. Robins --- drivers/net/usb/asix_common.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index 75d6f26..079069a 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -91,8 +91,10 @@ int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb, } rx->ax_skb = netdev_alloc_skb_ip_align(dev->net, rx->size); - if (!rx->ax_skb) + if (!rx->ax_skb) { + rx->size = 0; return 0; + } } if (rx->size > dev->net->mtu + ETH_HLEN + VLAN_HLEN) { -- 1.9.1 Hi David, I copied your patch from http://www.spinics.net/lists/netdev/msg345724.html and resubscribed to netdev@vger.kernel.org so I am unable to directly reply to your original post. But I should now see any subsequent reply on the mailing list. We are preparing to release some fixes in this area of the asix driver which fixes your observation. Unfortunately, your simple proposal has a flaw because state variables exist outside of the scope of asix_rx_fixup_internal() which handles Ethernet frames spanning multiple URBs (depends on the variant of the USB ASIX chipset). Therefore, subsequent URBs with the remainder of the Ethernet frame need to be handled when no netdev socket buffer exists. We intend to release the patches within the next few days so please watch out for them. Regards, Dean -- Dean Jenkins Embedded Software Engineer Linux Transportation Solutions Mentor Embedded Software Division Mentor Graphics (UK) Ltd. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html