On Tue, Aug 09, 2016 at 12:26:00PM +0200, Momtchil Momtchev wrote:
>     I tried this patch (5f22 in the RL_IM register) on 5.9-stable, no
> POOL_DEBUG, 30 pf rules and about 10k states and I got about 8000 IRQ/s (up
> from 6000/s on vanilla) and iperf RX throughput was down to 280 MBit/s.

Are you running iperf on the APU?  I'm running it on separate machines
either side.

> I then tried upping the RXPKTS to 15 and the RXTIME to 4 (100us) - this is
> 5f4f in RL_IM. The IRQs dropped to 5000 IRQ/s and the CPU utilisation went
> down, but RX throughput went further down to 240 MBit/s. Next I tried the
> magic value 5151 and IRQs dropped to 4000 IRQ/s (???), CPU went down and
> throughput went down to 200 MBit/s.
>     All these are RX values. On the TX side I get slightly higher throughput
> and a higher IRQ rate.
>     There are no public datasheets for the 8111E and I think there is
> something fishy with those values. The original source of the 5151 magic is
> the open-source driver by Realtek. Hopefully they know what they are doing.
> They have a Linux and a FreeBSD driver on their site. The one for FreeBSD
> seems to be based on that same driver by Bill Paul from Windriver and it
> uses this magic value which doesn't make any sense if 1 is the number of
> packets to wait before sending an interrupt.
> 
>     Also what I find very puzzling is that lower IRQ rates lead to lower CPU
> utilization but not higher throughput??
> 
>     What hardware are you using? This is not an APU? Maybe there is some
> other bottleneck on the APU?

It's an early (2 core) APU.  dmesg below.

> Are you getting higher performance than on vanilla?

It's not consistent and I think there's a variable I haven't controlled
for. I've been fiddling a lot but right now a stock -current kernel
performance is about in the same ballpark as with the diff.  Have you
tested with -current?  I might try 5.9 for comparison.

Things I've tried to control for:
hw.perfpolicy=manual
hw.setperf=0 (also 100)
kern.pool_debug=0

I did find that the results seemed a bit more consistent when disabling
Nagle (iperf -N) which would probably make it less sensitive to latency
or loss during TCP slow start.

OpenBSD 6.0 (GENERIC) #2135: Sun Jul 24 10:30:11 MDT 2016
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 2098520064 (2001MB)
avail mem = 2030505984 (1936MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0x7e16d820 (7 entries)
bios0: vendor coreboot version "4.0" date 09/08/2014
bios0: PC Engines APU
acpi0 at bios0: rev 0
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP SPCR HPET APIC HEST SSDT SSDT SSDT
acpi0: wakeup devices AGPB(S4) HDMI(S4) PBR4(S4) PBR5(S4) PBR6(S4) PBR7(S4) 
PE20(S4) PE21(S4) PE22(S4) PE23(S4) PIBR(S4) UOH1(S3) UOH2(S3) UOH3(S3) 
UOH4(S3) UOH5(S3) [...]
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpihpet0 at acpi0: 14318180 Hz
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD G-T40N Processor, 1000.13 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,SSSE3,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,IBS,SKINIT,ITSC
cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 
16-way L2 cache
cpu0: 8 4MB entries fully associative
cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 199MHz
cpu0: mwait min=64, max=64, IBE
cpu at mainbus0: not configured
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 21, 24 pins
acpiprt0 at acpi0: bus -1 (AGPB)
acpiprt1 at acpi0: bus -1 (HDMI)
acpiprt2 at acpi0: bus 1 (PBR4)
acpiprt3 at acpi0: bus 2 (PBR5)
acpiprt4 at acpi0: bus 3 (PBR6)
acpiprt5 at acpi0: bus -1 (PBR7)
acpiprt6 at acpi0: bus 5 (PE20)
acpiprt7 at acpi0: bus -1 (PE21)
acpiprt8 at acpi0: bus -1 (PE22)
acpiprt9 at acpi0: bus -1 (PE23)
acpiprt10 at acpi0: bus 0 (PCI0)
acpiprt11 at acpi0: bus 4 (PIBR)
acpicpu0 at acpi0: C2(0@100 io@0x841), C1(@1 halt!), PSS
acpibtn0 at acpi0: PWRB
cpu0: 1000 MHz: speeds: 1000 800 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "AMD AMD64 14h Host" rev 0x00
ppb0 at pci0 dev 4 function 0 "AMD AMD64 14h PCIE" rev 0x00: msi
pci1 at ppb0 bus 1
re0 at pci1 dev 0 function 0 "Realtek 8168" rev 0x06: RTL8168E/8111E (0x2c00), 
msi, address 00:0d:b9:31:30:74
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 PHY, rev. 4
ppb1 at pci0 dev 5 function 0 "AMD AMD64 14h PCIE" rev 0x00: msi
pci2 at ppb1 bus 2
re1 at pci2 dev 0 function 0 "Realtek 8168" rev 0x06: RTL8168E/8111E (0x2c00), 
msi, address 00:0d:b9:31:30:75
rgephy1 at re1 phy 7: RTL8169S/8110S/8211 PHY, rev. 4
ppb2 at pci0 dev 6 function 0 "AMD AMD64 14h PCIE" rev 0x00: msi
pci3 at ppb2 bus 3
re2 at pci3 dev 0 function 0 "Realtek 8168" rev 0x06: RTL8168E/8111E (0x2c00), 
msi, address 00:0d:b9:31:30:76
rgephy2 at re2 phy 7: RTL8169S/8110S/8211 PHY, rev. 4
ahci0 at pci0 dev 17 function 0 "ATI SBx00 SATA" rev 0x40: apic 2 int 19, AHCI 
1.2
ahci0: port 0: 6.0Gb/s
scsibus1 at ahci0: 32 targets
sd0 at scsibus1 targ 0 lun 0: <ATA, GLOWAY VAL32GS3-, V1BB> SCSI3 0/direct 
fixed naa.0000000000000000
sd0: 30029MB, 512 bytes/sector, 61500000 sectors, thin
ohci0 at pci0 dev 18 function 0 "ATI SB700 USB" rev 0x00: apic 2 int 18, 
version 1.0, legacy support
ehci0 at pci0 dev 18 function 2 "ATI SB700 USB2" rev 0x00: apic 2 int 17
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "ATI EHCI root hub" rev 2.00/1.00 addr 1
ohci1 at pci0 dev 19 function 0 "ATI SB700 USB" rev 0x00: apic 2 int 18, 
version 1.0, legacy support
ehci1 at pci0 dev 19 function 2 "ATI SB700 USB2" rev 0x00: apic 2 int 17
usb1 at ehci1: USB revision 2.0
uhub1 at usb1 "ATI EHCI root hub" rev 2.00/1.00 addr 1
piixpm0 at pci0 dev 20 function 0 "ATI SBx00 SMBus" rev 0x42: polling
iic0 at piixpm0
pcib0 at pci0 dev 20 function 3 "ATI SB700 ISA" rev 0x40
ppb3 at pci0 dev 20 function 4 "ATI SB600 PCI" rev 0x40
pci4 at ppb3 bus 4
ohci2 at pci0 dev 20 function 5 "ATI SB700 USB" rev 0x00: apic 2 int 18, 
version 1.0, legacy support
ppb4 at pci0 dev 21 function 0 "ATI SB800 PCIE" rev 0x00
pci5 at ppb4 bus 5
ohci3 at pci0 dev 22 function 0 "ATI SB700 USB" rev 0x00: apic 2 int 18, 
version 1.0, legacy support
ehci2 at pci0 dev 22 function 2 "ATI SB700 USB2" rev 0x00: apic 2 int 17
usb2 at ehci2: USB revision 2.0
uhub2 at usb2 "ATI EHCI root hub" rev 2.00/1.00 addr 1
pchb1 at pci0 dev 24 function 0 "AMD AMD64 14h Link Cfg" rev 0x43
pchb2 at pci0 dev 24 function 1 "AMD AMD64 14h Address Map" rev 0x00
pchb3 at pci0 dev 24 function 2 "AMD AMD64 14h DRAM Cfg" rev 0x00
km0 at pci0 dev 24 function 3 "AMD AMD64 14h Misc Cfg" rev 0x00
pchb4 at pci0 dev 24 function 4 "AMD AMD64 14h CPU Power" rev 0x00
pchb5 at pci0 dev 24 function 5 "AMD AMD64 14h Reserved" rev 0x00
pchb6 at pci0 dev 24 function 6 "AMD AMD64 14h NB Power" rev 0x00
pchb7 at pci0 dev 24 function 7 "AMD AMD64 14h Reserved" rev 0x00
usb3 at ohci0: USB revision 1.0
uhub3 at usb3 "ATI OHCI root hub" rev 1.00/1.00 addr 1
usb4 at ohci1: USB revision 1.0
uhub4 at usb4 "ATI OHCI root hub" rev 1.00/1.00 addr 1
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
wbsio0 at isa0 port 0x2e/2: NCT5104D rev 0x52
usb5 at ohci2: USB revision 1.0
uhub5 at usb5 "ATI OHCI root hub" rev 1.00/1.00 addr 1
usb6 at ohci3: USB revision 1.0
uhub6 at usb6 "ATI OHCI root hub" rev 1.00/1.00 addr 1
umass0 at uhub2 port 1 configuration 1 interface 0 "Generic Flash Card 
Reader/Writer" rev 2.01/1.00 addr 2
umass0: using SCSI over Bulk-Only
scsibus2 at umass0: 2 targets, initiator 0
sd1 at scsibus2 targ 1 lun 0: <Multiple, Card Reader, 1.00> SCSI2 0/direct 
removable serial.058f6366058F63666485
vscsi0 at root
scsibus3 at vscsi0: 256 targets
softraid0 at root
scsibus4 at softraid0: 256 targets
root on sd0a (2b4cdf5e1e14b9e7.a) swap on sd0b dump on sd0b
-- 
Darren Tucker (dtucker at zip.com.au)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860  37F4 9357 ECEF 11EA A6FA (new)
    Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.

Reply via email to