Re: List of Networking enhancements and bug fixes in a particular release

2018-06-14 Thread tedheadster
On Thu, Jun 14, 2018 at 1:21 PM, Joe Smith wrote:
>
> What is the best and authoritative mechanism to find out networking
> enhancements in a Linux release?
>

Joe,
  there usually is a good summary a few days after a kernel release on
the Kernel Newbies site. Here is a recent one:

https://kernelnewbies.org/Linux_4.15

- Matthew


3c59x: Transmit timeouts

2018-05-07 Thread tedheadster
Steffen,
  I am getting mostly transmit errors on a 3c597 Fast Ethernet card.
It is EISA, not PCI. Here are some of the logs with verbosity turned up
a bit:

eth3: using default media 100baseTX
[372] eth3: Initial media type 100baseTX.
eth3:  setting half-duplex.
[372] eth3: vortex_up() irq 9 media status 8882.
 eth3: Media 100baseTX has link beat, 8882.
eth3: transmit timed out, tx_status 00 status 8000.
  diagnostics: net 0c80 media 8882 dma  fifo 
eth3: transmit timed out, tx_status 00 status 8000.
  diagnostics: net 0c80 media 8882 dma  fifo 
eth3: transmit timed out, tx_status 00 status 8000.
  diagnostics: net 0c80 media 8882 dma  fifo 
eth3: transmit timed out, tx_status 00 status 8000.
  diagnostics: net 0c80 media 8882 dma  fifo 
[378] eth3: vortex_close() status 8000, Tx status 00.
[378] eth3: vortex close stats: rx_nocopy 0 rx_copy 0 tx_queued 0 Rx pre-checksummed 0.

Here is what I got from the registers:

NetworkDiagnostic (net): txEnabled, rxEnabled, statisticsEnabled
MediaStatus (media): auiDisable, linkBeatDetect, linkBeatEnable, crcStripDisable
IntStatus (status): register window 4 selected
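
As a cross-check on the decoding above, here is a minimal sketch (not driver code) that lists which bit positions are set in a 16-bit register value and pulls the window number out of the status word. The helper names are hypothetical, and reading the window from the top three bits of the status word is an assumption based on how the status value 8000 maps to window 4 here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store the positions of all set bits in a 16-bit register value,
 * lowest bit first, and return how many were found.
 * Hypothetical helper for illustration; not part of the 3c59x driver.
 */
int set_bit_positions(uint16_t reg, int positions[16])
{
    int count = 0;

    assert(positions != NULL);
    for (int bit = 0; bit < 16; bit++) {
        if (reg & (1u << bit))
            positions[count++] = bit;
    }
    return count;
}

/*
 * Assumption: the currently selected register window is reported in
 * the top three bits (15:13) of the status word.
 */
unsigned int status_window(uint16_t status)
{
    return (unsigned int)(status >> 13);
}
```

For the values logged above, 0x0c80 has bits 7, 10, and 11 set and 0x8882 has bits 1, 7, 11, and 15 set, which matches the three and four flags listed, and 0x8000 >> 13 gives 4.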

I am seeing /proc/interrupts incrementing, so at least some interrupts
are getting to the card.

'tc -s qdisc' reports 4 requeues on 13 packets sent.

'ifconfig' reports 13 tx packets, 11 tx errors, 11 tx dropped (no
overrun, carrier, or collision errors).

I can do the legwork on debugging this. What should I investigate next?

- Matthew


Re: [PATCH net] 8139too: revisit napi_complete_done() usage

2018-01-22 Thread tedheadster
On Mon, Sep 18, 2017 at 11:57 PM, David Miller wrote:
> From: Eric Dumazet 
> Date: Mon, 18 Sep 2017 13:03:43 -0700
>
>> From: Eric Dumazet 
>>
>> It seems we have to be more careful in napi_complete_done()
>> use. This patch is not a revert, as it seems we can
>> avoid bug that Ville reported by moving the napi_complete_done()
>> test in the spinlock section.
>>
>> Many thanks to Ville for detective work and all tests.
>>
>> Fixes: 617f01211baf ("8139too: use napi_complete_done()")
>> Reported-by: Ville Syrjälä 
>> Tested-by: Ville Syrjälä 
>
> Applied and queued up for -stable.

Eric,
  sorry to bring up this old thread, but I had a question. Do we have
to surround most usage of napi_complete_done() with a spinlock, or was
this problem restricted to just the 8139too driver?

- Matthew Whitehead


Re: [PATCHv3] 3c59x: fix missing dma_mapping_error check and bad ring refill logic

2018-01-21 Thread tedheadster
On Wed, Jan 3, 2018 at 1:44 PM, David Miller wrote:
> From: Neil Horman 
> Date: Wed,  3 Jan 2018 13:09:23 -0500
>
>> A few spots in 3c59x missed calls to dma_mapping_error checks, causing
>> WARN_ONS to trigger.  Clean those up.  While we're at it, refactor the
>> refill code a bit so that if skb allocation or dma mapping fails, we
>> recycle the existing buffer.  This prevents holes in the rx ring, and
>> makes for much simpler logic
>>
>> Note: This is compile only tested.  Ted, if you could run this and
>> confirm that it continues to work properly, I would appreciate it, as I
>> currently don't have access to this hardware
>>

Neil,
  I was able to test this patch. I did not get any WARN_ON messages.
However, I am getting a lot of dropped receive packets; uptime is 11
minutes and it has already dropped 214 of 743 receive packets.

Admittedly this is on a slow i486 regression testing system, but the
drop rate is approximately 30% which seems high even for this system
because it is on a very quiet switched network.

I enabled some debugging messages by setting msglvl to 4 and
recompiling with CONFIG_DYNAMIC_DEBUG=y. I did not see any messages of the
form "No memory to allocate a sk_buff of size" so that leaves the
following two cases:

boomerang_rx()
...
    newskb = netdev_alloc_skb_ip_align(dev, PKT_BUF_SZ);
    if (!newskb) {
        dev->stats.rx_dropped++;
        goto clear_complete;
    }
    newdma = pci_map_single(VORTEX_PCI(vp), newskb->data,
                            PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
    if (dma_mapping_error(&VORTEX_PCI(vp)->dev, newdma)) {
        dev->stats.rx_dropped++;
        consume_skb(newskb);
        goto clear_complete;
    }

What shall we do to determine if it is hitting the pci_map_single() or
netdev_alloc_skb_ip_align() failure?
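
One option would be to count the two failure paths separately. Below is a minimal user-space sketch of that bookkeeping, not a patch to the driver: the struct, the enum, and account_refill() are hypothetical names, and the two boolean arguments stand in for the results of netdev_alloc_skb_ip_align() and the dma_mapping_error() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cause counters; the snippet above only bumps rx_dropped. */
struct rx_drop_stats {
    unsigned long rx_dropped;   /* mirrors dev->stats.rx_dropped */
    unsigned long alloc_failed; /* hypothetical: skb allocation failures */
    unsigned long map_failed;   /* hypothetical: DMA mapping failures */
};

enum refill_result {
    REFILL_OK,
    REFILL_ALLOC_FAILED,
    REFILL_MAP_FAILED
};

/*
 * Account for one refill attempt. alloc_ok and map_ok are stand-ins for
 * "the skb allocation succeeded" and "the DMA mapping succeeded".
 * As in the snippet above, allocation is checked first, so map_ok is
 * ignored when alloc_ok is false.
 */
enum refill_result account_refill(struct rx_drop_stats *stats,
                                  bool alloc_ok, bool map_ok)
{
    assert(stats != NULL);

    if (!alloc_ok) {
        stats->rx_dropped++;
        stats->alloc_failed++;
        return REFILL_ALLOC_FAILED;
    }
    if (!map_ok) {
        stats->rx_dropped++;
        stats->map_failed++;
        return REFILL_MAP_FAILED;
    }
    return REFILL_OK;
}
```

With counters like these, rx_dropped still gives the total while alloc_failed and map_failed show which of the two paths accounts for it.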

- Matthew


3c59x: pci_unmap_single() oops

2017-12-29 Thread tedheadster
In the 4.15.0-rc5 kernel (and likely earlier) I get the following oops.

3c59x :00:0c.0 enp0s12: renamed from eth0
enp0s12:  setting half-duplex.
------------[ cut here ]------------
3c59x :00:0c.0: DMA-API: device driver failed to check map
error[device address=0x09e1b040] [size=1536 bytes] [mapped as
single]
WARNING: CPU: 0 PID: 1 at check_unmap+0x559/0x695
Modules linked in: ohci_pci ohci_hcd ehci_pci ehci_hcd usbcore pcspkr
serio_raw 3c59x mii usb_common ipv6
CPU: 0 PID: 1 Comm: systemd Not tainted 4.15.0-rc5.i486 #10
EIP: check_unmap+0x559/0x695
EFLAGS: 00010096 CPU: 0
EAX: 008c EBX: cb8a8660 ECX: c0881544 EDX: 0001
ESI: cb8e5280 EDI: c06e8b9f EBP: cb821e50 ESP: cb821df8
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: 004f1700 CR3: 0b2f9000 CR4: 
Call Trace:
 ? mntput_no_expire+0x13/0x105
 debug_dma_unmap_page+0x61/0x69
 pci_unmap_single+0x4c/0x56 [3c59x]
 boomerang_rx+0x250/0x42a [3c59x]
 boomerang_interrupt+0xde/0x3ea [3c59x]
 __handle_irq_event_percpu+0x2a/0xaf
 handle_irq_event_percpu+0x17/0x3d
 handle_irq_event+0x22/0x3b
 handle_level_irq+0x55/0x7a
 handle_irq+0x4f/0x58
 do_IRQ+0x35/0x95
 common_interrupt+0x34/0x40
EIP: 0xb7ae6970
EFLAGS: 0246 CPU: 0
EAX: b7e1c3d8 EBX: b7eef344 ECX:  EDX: 007048c4
ESI: 0013 EDI: 007048c4 EBP: bf8d81c8 ESP: bf8d8094
 DS: 007b ES: 007b FS:  GS: 0033 SS: 007b
Code: 01 00 00 8b 58 08 e9 4a 01 00 00 bb 1b 2f 70 c0 89 d8 57 ff 75
e4 ff 75 e0 ff 75 dc ff 75 d8 53 50 68 bd 16 71 c0 e8 1f 3c e1 ff <0f>
ff 83 c4 20 83 3d 44 d5 7e c0 00 75 0f a1 f0 e6 7b c0 85 c0
---[ end trace 8b519628d8703199 ]---

This may relate to "3c59x: Add dma error checking and recovery"

- Matthew Whitehead


Re: Thoughts on staging and on fixing up drivers?

2017-08-17 Thread tedheadster
>
> Larry, you've migrated a bunch of staging code, and tried various
> approaches.  Do you have any lessons on what has worked and what hasn't
> and if there is anything we can do to make the process better?

I am also quite interested in such work. We asked for a Birds of a
Feather discussion at the upcoming Linux Plumbers Conference on
exactly this sort of work.

- Matthew


Re: [regression v4.11] 617f01211baf ("8139too: use napi_complete_done()")

2017-04-10 Thread tedheadster
On Sat, Apr 8, 2017 at 6:23 AM, Francois Romieu wrote:
> David Miller  :
> [...]
>> One theory is that the interrupt masking isn't working properly
>> and interrupts are still arriving and hitting the NAPI state even
>> when we are actively polling NAPI.
>>
>> And this problem was masked by the locking done here.
>
> Yes.
>
> Ville, can you rule out irq sharing between the 8139 and some other
> device ? It's a candidate for unexpected interrupt handler invocation
> with older pc, even with properly working hardware.
>

Eric,
  If napi_complete_done() calls could affect drivers on older
hardware, I can test the following:

drivers/net/ethernet/3com/typhoon.c
drivers/net/ethernet/amd/pcnet32.c
drivers/net/ethernet/broadcom/tg3.c
drivers/net/ethernet/dec/tulip/interrupt.c
drivers/net/ethernet/intel/e100.c
drivers/net/ethernet/intel/e1000/e1000_main.c
drivers/net/ethernet/smsc/epic100.c
drivers/net/ethernet/via/via-rhine.c

- Matthew