Re: ping time fluctuates, any idea?
Hello, > On 9/09/2019, at 8:45 PM, Jihyun Yu wrote: > > It seems that time from ping command fluctuates. Here's a output from ping > command. > [...snip ping with negative rtt...] This is symptomatic of unsynchronized time stamp counters (TSC). I would expect that setting: # sysctl kern.timecounter.hardware=acpihpet0 would fix your ping results, and probably improve your ntpd(8) performance, too. There has been some work in this area on -current. best, Richard. PS. Please make sure to include a complete dmesg next time - half a dmesg is like half a photograph! > 00=2828 01=a1a1 02=f7f7 03=2020 04=d9d9 05=5b5b 06=0b0b 07=3030 > isa0 at pcib0 > isadma0 at isa0 > com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo > pckbc0 at isa0 port 0x60/5 irq 1 irq 12 > pckbd0 at pckbc0 (kbd slot) > wskbd0 at pckbd0: console keyboard, using wsdisplay0 > pcppi0 at isa0 port 0x61 > spkr0 at pcppi0 > wbsio0 at isa0 port 0x2e/2: NCT6776F rev 0x33 > lm1 at wbsio0 port 0x290/8: NCT6776F > pci13 at mainbus0 bus 255 > "Intel E5 QPI Link" rev 0x07 at pci13 dev 8 function 0 not configured > vendor "Intel", unknown product 0x3c83 (class system subclass > miscellaneous, rev 0x07) at pci13 dev 8 function 3 not configured > vendor "Intel", unknown product 0x3c84 (class system subclass > miscellaneous, rev 0x07) at pci13 dev 8 function 4 not configured > "Intel E5 QPI Link" rev 0x07 at pci13 dev 9 function 0 not configured > vendor "Intel", unknown product 0x3c93 (class system subclass > miscellaneous, rev 0x07) at pci13 dev 9 function 3 not configured > vendor "Intel", unknown product 0x3c94 (class system subclass > miscellaneous, rev 0x07) at pci13 dev 9 function 4 not configured > "Intel E5 PCU" rev 0x07 at pci13 dev 10 function 0 not configured > "Intel E5 PCU" rev 0x07 at pci13 dev 10 function 1 not configured > "Intel E5 PCU" rev 0x07 at pci13 dev 10 function 2 not configured > "Intel E5 PCU" rev 0x07 at pci13 dev 10 function 3 not configured > "Intel E5 Scratch" rev 0x07 at pci13 dev 
11 function 0 not configured > "Intel E5 Scratch" rev 0x07 at pci13 dev 11 function 3 not configured > "Intel E5 Unicast" rev 0x07 at pci13 dev 12 function 0 not configured > "Intel E5 Unicast" rev 0x07 at pci13 dev 12 function 1 not configured > "Intel E5 Unicast" rev 0x07 at pci13 dev 12 function 2 not configured > "Intel E5 SAD" rev 0x07 at pci13 dev 12 function 6 not configured > "Intel E5 SAD" rev 0x07 at pci13 dev 12 function 7 not configured > "Intel E5 Unicast" rev 0x07 at pci13 dev 13 function 0 not configured > "Intel E5 Unicast" rev 0x07 at pci13 dev 13 function 1 not configured > "Intel E5 Unicast" rev 0x07 at pci13 dev 13 function 2 not configured > "Intel E5 Broadcast" rev 0x07 at pci13 dev 13 function 6 not configured > "Intel E5 Home Agent" rev 0x07 at pci13 dev 14 function 0 not configured > "Intel E5 Home Agent" rev 0x07 at pci13 dev 14 function 1 not configured > "Intel E5 TA" rev 0x07 at pci13 dev 15 function 0 not configured > "Intel E5 RAS" rev 0x07 at pci13 dev 15 function 1 not configured > "Intel E5 TAD" rev 0x07 at pci13 dev 15 function 2 not configured > "Intel E5 TAD" rev 0x07 at pci13 dev 15 function 3 not configured > "Intel E5 TAD" rev 0x07 at pci13 dev 15 function 4 not configured > "Intel E5 TAD" rev 0x07 at pci13 dev 15 function 5 not configured > "Intel E5 TAD" rev 0x07 at pci13 dev 15 function 6 not configured > "Intel E5 Thermal" rev 0x07 at pci13 dev 16 function 0 not configured > "Intel E5 Thermal" rev 0x07 at pci13 dev 16 function 1 not configured > "Intel E5 Error" rev 0x07 at pci13 dev 16 function 2 not configured > "Intel E5 Error" rev 0x07 at pci13 dev 16 function 3 not configured > "Intel E5 Thermal" rev 0x07 at pci13 dev 16 function 4 not configured > "Intel E5 Thermal" rev 0x07 at pci13 dev 16 function 5 not configured > "Intel E5 Error" rev 0x07 at pci13 dev 16 function 6 not configured > "Intel E5 Error" rev 0x07 at pci13 dev 16 function 7 not configured > "Intel E5 DDRIO" rev 0x07 at pci13 dev 17 function 0 not 
configured > "Intel E5 R2PCIE" rev 0x07 at pci13 dev 19 function 0 not configured > "Intel E5 PCIE Monitor" rev 0x07 at pci13 dev 19 function 1 not configured > "Intel E5 QPI" rev 0x07 at pci13 dev 19 function 4 not configured > "Intel E5 QPI Link Monitor" rev 0x07 at pci13 dev 19 function 5 not > configured > "Intel E5 QPI Link Monitor" rev 0x07 at pci13 dev 19 function 6 not > configured > vmm0 at mainbus0: VMX/EPT > uhub4 at uhub0 port 1 configuration 1 interface 0 "Intel Rate Matching Hub" > rev 2.00/0.00 addr 2 > uhub5 at uhub3 port 1 configuration 1 interface 0 "Intel Rate Matching Hub" > rev 2.00/0.00 addr 2 > vscsi0 at root > scsibus4 at vscsi0: 256 targets > softraid0 at root > scsibus5 at softraid0: 256 targets > root on sd0a (74fd07b06a4f30a1.a) swap on sd0b dump on sd0b > > > Thanks, > Jihyun Yu
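A rough sketch of why unsynchronized counters can produce a negative round-trip time, assuming a simplified model in which each CPU's TSC carries a fixed offset and the send and receive timestamps happen to be read on different CPUs. The frequency, the offset, and the function names are made up for illustration; this is not how the kernel's timecounter code is written.

```python
# Hypothetical model: every CPU has its own counter, and the counters were
# not started at the same instant, so each carries a fixed offset.
TICKS_PER_US = 2000                      # assumed 2 GHz counter


def read_tsc(true_time_us, cpu_offset_ticks):
    """Ticks reported by one CPU's counter at a given true time."""
    return true_time_us * TICKS_PER_US + cpu_offset_ticks


def measured_rtt_us(send_time_us, true_rtt_us, send_cpu_offset, recv_cpu_offset):
    """RTT computed from two counter reads that may come from different CPUs."""
    sent = read_tsc(send_time_us, send_cpu_offset)
    received = read_tsc(send_time_us + true_rtt_us, recv_cpu_offset)
    return (received - sent) / TICKS_PER_US


# cpu1's counter lags cpu0's by 3 ms worth of ticks (made-up figure).
CPU0_OFFSET = 0
CPU1_OFFSET = -3000 * TICKS_PER_US

same_cpu = measured_rtt_us(10_000_000, 1000, CPU0_OFFSET, CPU0_OFFSET)
cross_cpu = measured_rtt_us(10_000_000, 1000, CPU0_OFFSET, CPU1_OFFSET)
print(f"same cpu:  {same_cpu / 1000:.3f} ms")
print(f"cross cpu: {cross_cpu / 1000:.3f} ms")
```

In this model the true round trip is 1 ms in both cases; only the second measurement goes negative, because the reply is stamped by the lagging counter. A single shared timecounter such as the HPET avoids the problem by construction.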
Re: ral(4) problems on current/i386 ALIX
On 28/11/2016, at 4:25 AM, Jan Stary wrote: > [...] > What kind of wifi are people using > on the ALIX serving as an AP? I'm running an RT2860 via ral(4) on an Alix 2d2 -- I'm seeing about 1.1MB/s when transferring 47MB from it through a couple of walls, and with another network at -74dBm on the same channel. The network is otherwise quiet and I'm maybe 12M from the AP. best, Richard. (The kernel is running a patch but this shouldn't be affecting throughput.) OpenBSD 6.0-current (GENERIC) #8: Sun Nov 20 12:51:52 NZDT 2016 build@build.localdomain:/usr/src/sys/arch/i386/compile/GENERIC cpu0: Geode(TM) Integrated Processor by AMD PCS ("AuthenticAMD" 586-class) 499 MHz cpu0: FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CFLUSH,MMX,MMXX,3DNOW2,3DNOW real mem = 267931648 (255MB) avail mem = 250114048 (238MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: date 11/05/08, BIOS32 rev. 0 @ 0xfd088 pcibios0 at bios0: rev 2.1 @ 0xf/0x1 pcibios0: pcibios_get_intr_routing - function not supported pcibios0: PCI IRQ Routing information unavailable. pcibios0: PCI bus #0 is the last bus bios0: ROM list: 0xe/0xa800 cpu0 at mainbus0: (uniprocessor) mtrr: K6-family MTRR support (2 registers) pci0 at mainbus0 bus 0: configuration mode 1 (bios) pchb0 at pci0 dev 1 function 0 "AMD Geode LX" rev 0x33 glxsb0 at pci0 dev 1 function 2 "AMD Geode LX Crypto" rev 0x00: RNG AES vr0 at pci0 dev 9 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 10, address xx:xx:xx:xx:xx:xx ukphy0 at vr0 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI 0x004063, model 0x0034 vr1 at pci0 dev 11 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 15, address xx:xx:xx:xx:xx:xx ukphy1 at vr1 phy 1: Generic IEEE 802.3u media interface, rev. 
3: OUI 0x004063, model 0x0034 ral0 at pci0 dev 12 function 0 "Ralink RT2860" rev 0x00: irq 9, address xx:xx:xx:xx:xx:xx ral0: MAC/BBP RT2860 (rev 0x0103), RF RT2850 (MIMO 2T3R) glxpcib0 at pci0 dev 15 function 0 "AMD CS5536 ISA" rev 0x03: rev 3, 32-bit 3579545Hz timer, watchdog, gpio, i2c gpio0 at glxpcib0: 32 pins iic0 at glxpcib0 maxtmp0 at iic0 addr 0x4c: lm86 pciide0 at pci0 dev 15 function 2 "AMD CS5536 IDE" rev 0x01: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility wd0 at pciide0 channel 0 drive 0: wd0: 1-sector PIO, LBA, 3831MB, 7847280 sectors wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2 pciide0: channel 1 ignored (disabled) ohci0 at pci0 dev 15 function 4 "AMD CS5536 USB" rev 0x02: irq 12, version 1.0, legacy support ehci0 at pci0 dev 15 function 5 "AMD CS5536 USB" rev 0x02: irq 12 usb0 at ehci0: USB revision 2.0 uhub0 at usb0 configuration 1 interface 0 "AMD EHCI root hub" rev 2.00/1.00 addr 1 isa0 at glxpcib0 isadma0 at isa0 com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo com0: console com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo pcppi0 at isa0 port 0x61 spkr0 at pcppi0 npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16 usb1 at ohci0: USB revision 1.0 uhub1 at usb1 configuration 1 interface 0 "AMD OHCI root hub" rev 1.00/1.00 addr 1 vmm at mainbus0 not configured nvram: invalid checksum vscsi0 at root scsibus1 at vscsi0: 256 targets softraid0 at root scsibus2 at softraid0: 256 targets
Re: cuaU0 problems
On 20/09/2016, at 9:53 PM, Richard Procter wrote: > > On 20/09/2016, at 8:00 AM, Edgar Pettijohn wrote: > >> On 16-09-19 19:56:31, Kapfhammer, Stefan wrote: >>> Hello Edgar, >>> >>> I have no Soekris, but Apu2 is also connected >>> with a serial cable. >>> >>> When cable is plugged in the controlling pc >>> before booting, it is to be found as /dev/cuaU???0. >>> >>> When I plug it in after the boot completed, it is to be >>> found as /dev/cuaU3. (0/1/2 is normally int. 3G modem) >>> >>> Hope this helps debugging. Feedback would be fine. >> >> Thanks for the reply. It has always worked with: >> >> # cu -l cuaU0 >> >> when it stopped working I tried cuaU{1,2} with same result. > > Although my MacBook works at > [ details on breakage ] P.S. here's the dmesg for the snapshot where I can't login via console. best, Richard OpenBSD 6.0-current (GENERIC.MP) #2473: Sun Sep 18 23:24:19 MDT 2016 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP RTC BIOS diagnostic error ff real mem = 4005785600 (3820MB) avail mem = 3879882752 (3700MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbf719000 (44 entries) bios0: vendor Apple Inc. version "MBP71.88Z.0039.B05.1003251322" date 03/25/10 bios0: Apple Inc. MacBookPro7,1 acpi0 at bios0: rev 2 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP HPET APIC APIC ASF! 
SBST ECDT SSDT SSDT SSDT MCFG acpi0: wakeup devices ADP1(S3) LID0(S3) EC__(S3) OHC1(S3) EHC1(S3) OHC2(S3) EHC2(S3) ARPT(S5) GIGE(S5) acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 2500 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz, 2389.64 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUS H,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,ES T,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR cpu0: 3MB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges cpu0: apic clock running at 265MHz cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz, 2389.25 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUS H,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,ES T,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR cpu1: 3MB 64b/line 8-way L2 cache cpu1: smt 0, core 1, package 0 ioapic0 at mainbus0: apid 1 pa 0xfec0, version 11, 24 pins acpiec0 at acpi0 acpimcfg0 at acpi0 addr 0xf000, bus 0-4 acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus 4 (IXVE) acpicpu0 at acpi0: !C3(100@57 mwait.3@0x31), !C2(500@1 mwait@0x10), C1(1000@1 mwait), PSS acpicpu1 at acpi0: !C3(100@57 mwait.3@0x31), !C2(500@1 mwait@0x10), C1(1000@1 mwait), PSS acpiac0 at acpi0: AC unit offline acpibtn0 at acpi0: LID0 "APP0002" at acpi0 not configured acpibtn1 at acpi0: PWRB acpibtn2 at acpi0: SLPB "APP0001" at acpi0 not configured "APP0003" at acpi0 not configured acpials0 at acpi0: ALS0 "ACPI0002" at acpi0 not configured acpibat0 at acpi0: BAT0 model "3545797981023400290" type 3545797981528607052 oem "3545797981528673619" cpu0: Enhanced SpeedStep 2389 MHz: speeds: 2394, 2128, 1862, 
1596, 798 MHz
pci0 at mainbus0 bus 0
0:3:4: mem address conflict 0xd340/0x8
pchb0 at pci0 dev 0 function 0 "NVIDIA MCP89 Host" rev 0xa1
"NVIDIA MCP89 Memory" rev 0xa1 at pci0 dev 0 function 1 not configured
vendor "NVIDIA", unknown product 0x0d6d (class memory subclass RAM, rev 0xa1) at pci0 dev 1 function 0 not configured
vendor "NVIDIA", unknown product 0x0d6e (class memory subclass RAM, rev 0xa1) at pci0 dev 1 function 1 not configured
vendor "NVIDIA", unknown product 0x0d6f (class memory subclass RAM, rev 0xa1) at pci0 dev 1 function 2 not configured
vendor "NVIDIA", unknown product 0x0d70 (class memory subclass RAM, rev 0xa1) at pci0 dev 1 function 3 not configured
vendor "NVIDIA", unknown product 0x0d71 (class memory subclass RAM, rev 0xa1) at pci0 dev 2 function 0 not configured
vendor "NVIDIA", unknown product 0x0d72 (class memory subclass RAM, rev 0xa1) at pci0 dev 2 function 1 not configured
pcib0 at pci0 dev 3 function 0 "NVIDIA MCP89 LPC" rev 0xa2
"NVIDIA MCP89 Memory" rev 0xa1 at pci0 dev 3 function 1 not configured
nviic0 at pci0 dev 3 function 2 "NVIDIA MCP89 SMBus" rev 0xa1
iic0 at nviic0
iic1 at nviic0
"NVIDIA MCP89 Memory" rev 0xa1 at pci0 dev 3 function 3 not configured
Re: cuaU0 problems
On 20/09/2016, at 8:00 AM, Edgar Pettijohn wrote:
> On 16-09-19 19:56:31, Kapfhammer, Stefan wrote:
>> Hello Edgar,
>>
>> I have no Soekris, but Apu2 is also connected
>> with a serial cable.
>>
>> When cable is plugged in the controlling pc
>> before booting, it is to be found as /dev/cuaU???0.
>>
>> When I plug it in after the boot completed, it is to be
>> found as /dev/cuaU3. (0/1/2 is normally int. 3G modem)
>>
>> Hope this helps debugging. Feedback would be fine.
>
> Thanks for the reply. It has always worked with:
>
> # cu -l cuaU0
>
> when it stopped working I tried cuaU{1,2} with same result.

Although my MacBook works at

OpenBSD 6.0-current (GENERIC.MP) #2466: Sat Sep 17 23:07:05 MDT 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

(dmesg attached), it breaks at

OpenBSD 6.0-current (GENERIC.MP) #2473: Sun Sep 18 23:24:19 MDT 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

I see some disabled USB ports and my keyboard is apparently attached to one of them because I cannot log in via console.

sys/dev/usb/usb_subr.c revision 1.129 lies between these dates. Reverting this to revision 1.128 restores my keyboard, etc.

best,
Richard.

[known good - and working FTDI USB->serial attached at end]

OpenBSD 6.0-current (GENERIC.MP) #2466: Sat Sep 17 23:07:05 MDT 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
RTC BIOS diagnostic error ff
real mem = 4005785600 (3820MB)
avail mem = 3879882752 (3700MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbf719000 (44 entries)
bios0: vendor Apple Inc. version "MBP71.88Z.0039.B05.1003251322" date 03/25/10
bios0: Apple Inc. MacBookPro7,1
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP HPET APIC APIC ASF!
SBST ECDT SSDT SSDT SSDT MCFG acpi0: wakeup devices ADP1(S3) LID0(S3) EC__(S3) OHC1(S3) EHC1(S3) OHC2(S3) EHC2(S3) ARPT(S5) GIGE(S5) acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 2500 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz, 2389.57 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUS H,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,ES T,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR cpu0: 3MB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges cpu0: apic clock running at 265MHz cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz, 2389.25 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUS H,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,ES T,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR cpu1: 3MB 64b/line 8-way L2 cache cpu1: smt 0, core 1, package 0 ioapic0 at mainbus0: apid 1 pa 0xfec0, version 11, 24 pins acpiec0 at acpi0 acpimcfg0 at acpi0 addr 0xf000, bus 0-4 acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus 4 (IXVE) acpicpu0 at acpi0: !C3(100@57 mwait.3@0x31), !C2(500@1 mwait@0x10), C1(1000@1 mwait), PSS acpicpu1 at acpi0: !C3(100@57 mwait.3@0x31), !C2(500@1 mwait@0x10), C1(1000@1 mwait), PSS acpiac0 at acpi0: AC unit offline acpibtn0 at acpi0: LID0 "APP0002" at acpi0 not configured acpibtn1 at acpi0: PWRB acpibtn2 at acpi0: SLPB "APP0001" at acpi0 not configured "APP0003" at acpi0 not configured acpials0 at acpi0: ALS0 "ACPI0002" at acpi0 not configured acpibat0 at acpi0: BAT0 model "3545797981023400290" type 3545797981528607052 oem "3545797981528673619" cpu0: Enhanced SpeedStep 2389 MHz: speeds: 2394, 2128, 1862, 
1596, 798 MHz pci0 at mainbus0 bus 0 0:3:4: mem address conflict 0xd340/0x8 pchb0 at pci0 dev 0 function 0 "NVIDIA MCP89 Host" rev 0xa1 "NVIDIA MCP89 Memory" rev 0xa1 at pci0 dev 0 function 1 not configured vendor "NVIDIA", unknown product 0x0d6d (class memory subclass RAM, rev 0xa1) at pci0 dev 1 function 0 not configured vendor "NVIDIA", unknown product 0x0d6e (class memory subclass RAM, rev 0xa1) at pci0 dev 1 function 1 not configured vendor "NVIDIA", unknown product 0x0d6f (class memory subclass RAM, rev 0xa1) at pci0 dev 1 function 2 not configured vendor "NVIDIA", unknown product 0x0d70 (class memory subclass RAM, rev 0xa1) at pci0 dev 1 function 3 not configured vendor "NVIDIA", unknown product 0x0d71 (class memory subclass RAM, rev 0xa1) at pci0 dev 2 function 0 not configured vendor "NVIDIA", unknown product 0x0d72 (class memory subclass RAM, rev 0xa1) at pci0 dev 2 function 1 not configured pcib0 at pci0 dev 3 function 0 "NVIDIA MCP89 LPC" rev 0xa2 "NVIDIA MCP89 Memory" rev 0xa1 at pci0 dev 3 function 1 not configured nviic0 at pci0 dev 3 function 2 "NVIDIA MCP89 SMBus" rev 0xa1 iic0 at nviic0 iic1
Re: NAT reliability in light of recent checksum changes
On 7/03/2014, at 2:15 PM, Richard Procter wrote:
>
> I've some ideas about solutions [for modifying checksums more cleanly] but
> will leave those for another email.

Shifting this old thread to tech@: I've posted a patch that re-instates the pf algorithm of OpenBSD 5.4 for preserving payload checksums end-to-end but rewritten without the ugly and error-prone (but speedy!) code and aiming to have no significant impact on performance.

best,
Richard.
Re: NAT reliability in light of recent checksum changes
On 27/02/2014, at 11:04 AM, Theo de Raadt wrote:
>
> There was a method of converting an in-bound checksum, due to NAT
> conversion, into a new out-bound checksum. A process is required,
> it's how NAT works.
>
> A new method of version is being used. It is mathematically equivalent
> to the old method.

First, I agree with Theo that modifying a checksum is mathematically equivalent to regenerating it; both give the same result on ideal hardware.

Of course, we use checksums because our hardware isn't ideal, so let's look at how the two approaches differ when a router fault occurs. Take Stuart Henderson's example:

> Consider this scenario, which has happened in real life.
>
> - NIC supports checksum offloading, verified checksum is OK.
>
> - PCI transfers are broken (in my case it affected multiple
> machines of a certain type, so most likely a motherboard bug),
> causing some corruption in the payload, but the machine won't
> detect them because it doesn't look at checksums itself, just
> trusts the NIC's "rx csum good" flag.
>
> In this situation, packets which have been NATted that are
> corrupt now get a new checksum that is valid; so the final
> endpoint can not detect the breakage.

That is, when the router offloads and regenerates, the router's egress NIC will hide any card, stack, bus or memory fault a verified packet suffered in passing through the router when it regenerates a new checksum from the now corrupt data.

Looking at the code, the relevant functions are pf.c:pf_check_proto_cksum(), which trusts the ingress NIC's checksum good flag, and pf.c:pf_cksum(), which zeros the existing checksum on that basis and flags it to be regenerated by the egress NIC[1].

By contrast, checksum modification is far more reliable. In order to hide payload corruption the update code[1] would have to modify the checksum to exactly account for it. But that would have to happen by accident --- by a fault that in effect computes the necessary change --- as the update code never considers the payload[0]. It's not impossible but, on the other hand, checksum regeneration is guaranteed to hide faults in the regenerating router.

We conclude that in the typical offloading case, regenerated checksums, unlike modified ones, cannot detect faults in the regenerating routers.

Whether this difference is significant is a matter of judgment and a separate issue.

I've some ideas about solutions but will leave those for another email.

best,
Richard.

PS. I find the following terminology helpful: Checksums calculated from the origin data are 'original'; checksums calculated from a copy are 'regenerated'. Checksums may also be 'modified' to account for altered data in such a way as to preserve originality for any unaltered data[0]. A checksum is 'end-to-end' if it is delivered original with respect to the payload. A modified checksum may be end-to-end but never a regenerated checksum as it is not original.

[0] Strikingly, RFC1631 (1994) and RFC3022 (2001), the NAT RFCs, fail to say end-to-end preservation is a property of their checksum modification algorithm. I presume it just didn't seem worth mentioning as, lacking hardware offload back then, one wouldn't regenerate in software on performance grounds alone. It is only alluded to in RFC1071 (1988) "Computing the Internet Checksum", which states that a checksum remains end-to-end when modified 'since it was not fully recomputed'. Although that's still true if NAT modifies it, NAT makes the meaning of 'end-to-end' more complex; I think my above terminology helps there.

[1] I'll quote OpenBSD code here for completeness, contrasting modification (OpenBSD 5.3) with regeneration (OpenBSD 5.4)

OpenBSD 5.3 NAT modified the checksum as follows:

--- pf.c 1.818 (OPENBSD_5_3) ---

Assuming an AF_INET <-> AF_INET TCP connection.

pf_test_rule()
  3862: pf_translate()
  3881: pf_change_ap() [ src addr/port ]
  1671: PF_ACPY [ = pf_addrcpy() ]
  1689: pf_cksum_fixup(...)
        [ pseudo code is:
            sum = fixup(sum, addr16[1])
            sum = fixup(sum, addr16[0])
            sum = fixup(sum, port) ]
  1662: l = cksum + old - new   <--- checksum modified
        [ then presumably account for ones-complement carries ]
  3887: pf_change_ap() etc [ dst addr/port ]

On subsequent state matching:

pf_test()
  6788: pf_test_state_tcp() [ for TCP ]
  4566: pf_change_ap() etc [ for src addr/port ]
  4574: pf_change_ap() etc [ for dst addr/port ]

---

OpenBSD 5.4 NAT regenerates checksums as follows:

--- pf.c 1.863 (post OPENBSD_5_4) ---

Assuming an AF_INET <-> AF_INET TCP connection.

On initial rule match:

pf_test_rule()
  3445: pf_translate()
  3707: pf_change_ap()
  1677: PF_ACPY [= pf_addrcpy()]
  3461: pf_cksum()
  6775: pd->hdr.tcp->th_sum = 0;   <--- checksum zeroed
        m->m_pkthdr.csum_flags |= M_TCP_CSUM_OUT   <--- flagged for recalculation
        (if orig checksum good)

On s
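To make the "mathematically equivalent on ideal hardware" point concrete, here is a minimal sketch, assuming a toy segment of three header words and one payload word. The `fixup()` helper follows the `l = cksum + old - new` line quoted in the trace above plus a carry fold; it is an illustration written for this note, not a copy of pf's function, and the addresses and port are made up.

```python
def ones_complement_sum(words):
    """16-bit ones-complement sum with end-around carry."""
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def regenerate(words):
    """Checksum computed from scratch over every covered 16-bit word."""
    return ~ones_complement_sum(words) & 0xFFFF


def fixup(cksum, old, new):
    """Adjust cksum for one 16-bit word changing old -> new.

    The payload is never read; only the old and new header values are used.
    """
    l = (cksum + old - new) & 0xFFFFFFFF   # emulate 32-bit unsigned arithmetic
    l = (l >> 16) + (l & 0xFFFF)           # fold the carry or borrow back in
    return l & 0xFFFF


# Made-up segment: source address (two words), source port, one payload word.
before = [0xC0A8, 0x0001, 0x1234, 0xABCD]
after = [0xCB00, 0x7105, 0xC350, 0xABCD]   # NAT rewrote address and port only

modified = regenerate(before)              # the checksum the sender computed
for old, new in zip(before[:3], after[:3]):
    modified = fixup(modified, old, new)

print(hex(modified), hex(regenerate(after)))
```

With no fault in play both routes arrive at the same value. The difference is in what each one reads: `regenerate()` consumes the router's copy of the payload, while `fixup()` only ever sees the header words that changed.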
Re: NAT reliability in light of recent checksum changes
On 27/02/2014, at 11:04 AM, Theo de Raadt wrote:
> I believe you are posting cast aspersions on the pf efforts.

Theo, I'll insist then that I think pf is a superior piece of code which I benefit from every day, and that Henning's efforts to simplify it are so very welcome in a world addicted to complexity.

My beef is solely with the technique of regenerating checksums, not the people working on the code. Criticising a design choice with argument and evidence is not the same as attacking the designer's integrity or competence and if I seem to be playing the men and not the ball, it is not my intent and I apologise.

As to your other points, I will hopefully address them in another email I have been drafting and should have finished over the next few days.

best,
Richard.
Re: NAT reliability in light of recent checksum changes
On 24/02/2014, at 9:33 PM, Henning Brauer wrote:
> * Richard Procter [2014-01-25 20:41]:
>> On 22/01/2014, at 7:19 PM, Henning Brauer wrote:
>>> * Richard Procter [2014-01-22 06:44]:
>>>> This fundamentally weakens its usefulness, though: a correct
>>>> checksum now implies only that the payload likely matches
>>>> what the last NAT router happened to have in its memory
>>> huh?
>>> we receive a packet with correct cksum -> NAT -> packet goes out with
>>> correct cksum.
>>> we receive a packet with broken cksum -> NAT -> we leave the cksum
>>> alone, i. e. leave it broken.
>> Christian said it better than me: routers may corrupt data
>> and regenerating the checksum will hide it.
>
> if that happened we had much bigger problems than NAT.

By bigger problems do you mean obvious router stability issues?

Suppose someone argued that as we'd have obvious stability issues if unprotected memory was unreliable, ECC memory is unnecessary. That argument is logically equivalent to what seems to be yours, that as we'd see obvious issues if routers were corrupting data, end-to-end checksums are unnecessary, but I don't buy it.

We know that routers corrupt data. Right now my home firewall shows 30 TCP segments dropped for bad checksums. Since every sane link layer uses checks at least as strong, this all but implies that the dropped packets suffered router or end-point faults.

Again, it's not just me saying it:

"...checksums are used by higher layers to ensure that data was not corrupted in intermediate routers or by the sending or receiving host. The fact that checksums are typically the secondary level of protection has often led to suggestions that checksums are superfluous. Hard won experience, however, has shown that checksums are necessary. Software errors (such as buffer mismanagement) and even hardware errors (such as network adapters with poor DMA hardware that sometimes fail to fully DMA data) are surprisingly common [let alone memory faults! RP] and checksums have been very useful in protecting against such errors."[0]

best,
Richard.

[0] Craig Partridge, Jim Hughes, and Jonathan Stone. 1995. Performance of checksums and CRCs over real data. SIGCOMM Comput. Commun. Rev. 25, 4 (October 1995), 68-76. DOI=10.1145/217391.217413 http://doi.acm.org/10.1145/217391.217413 Page 1.
Re: NAT reliability in light of recent checksum changes
On 28/01/2014, at 4:19 AM, Simon Perreault wrote:
> On 2014-01-25 14:40, Richard Procter wrote:
>> I'm not saying the calculation is bad. I'm saying it's being
>> calculated from the wrong copy of the data and by the wrong
>> device. And it's not just me saying it: I'm quoting the guys
>> who designed TCP.
>
> Those guys didn't envision NAT.
>
> If you want end-to-end checksum purity, don't do NAT.

Let's look at the options.

The world needs more addresses than IPv4 provides and NAT gives them to us. There's IPv6, which has about a hundred million addresses for every bacterium estimated to live on the planet[0], but it's not looking to replace IPv4 any time soon. So NAT is here to stay for a good while longer.

Perhaps I can at least stop using NAT on my own network. In my case I can't but let's assume I do. This eliminates one source of error. But my TCP streams may still have now-undetected one-bit errors (at least) if there may be routers out there regenerating checksums. As long as there are, good checksums no longer mean as much by themselves and if I want at least some assurance the network did its job, I still need some other way (e.g., checking the network path contains no such routers, either by inspection or statistically, or by reimplementing an end-to-end checksum at a higher layer, etc). Regenerated checksums affect me whether or not I use NAT myself.

Another option is to always update the checksum as versions prior to version 5.4 did. It's reasonable to ask: is that any more reliable than recomputing them as 5.4 does? That is, can the old update code hide payload corruption, too?

In order to hide payload corruption the update code would have to modify the checksum to exactly account for it. But that would have to happen by accident, as it never considers the payload. It's not impossible, but, on the other hand, checksum regeneration is guaranteed to hide any bad data. So updates are more reliable.

A lot more reliable, in fact, as you'd require precisely those memory errors necessary to in effect compute the correct update, or some freak fault in the ALU that did the same thing, or some combination of both. And as that has nothing to do with the update code it is in principle possible for non-NAT connections, too. For the hardware, updates are just an extra load/modify/store and so the chances of a checksum update hiding a corrupted payload are in practical terms equivalent to those of normal forwarding.

So your statement holds only if checksums are being regenerated. In general, NAT needn't compromise end-to-end TCP payload checksum integrity, and in versions prior to 5.4, it didn't.

best,
Richard.

[0] "Prokaryotes: The unseen majority" Proc Natl Acad Sci U S A. 1998 June 9; 95(12): 6578–6583. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC33863/

2^128 IPv6 addresses = ~ 10^38
~ 10^38 IPv6 addresses / ~ 10^30 bacteria cells = ~ 10^8 addresses per cell.

[1] RFC1071 "Computing the Internet Checksum" p21 "If anything, [this end-to-end property] is the most powerful feature of the TCP checksum!". Page 15 also touches on the end-to-end preserving properties of checksum update.
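The reliability gap argued above can be checked exhaustively on a toy segment. This is a sketch under stated assumptions: the header and payload words are made up, the fault model is a single flipped payload bit inside the router, and "modification" is represented by simply leaving the payload's contribution to the checksum alone.

```python
def ones_complement_sum(words):
    """16-bit ones-complement sum with end-around carry."""
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(words):
    """Internet-style checksum: complement of the ones-complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF


def verifies(words, cksum):
    """Receiver-side check: data plus checksum must sum to 0xFFFF."""
    return ones_complement_sum(words + [cksum]) == 0xFFFF


header = [0xCB00, 0x7105, 0xC350]              # already-translated header (made up)
payload = [0x4865, 0x6C6C, 0x6F21, 0xABCD]     # made-up payload
original = checksum(header + payload)          # valid for the undamaged payload

hidden_by_regeneration = 0
hidden_by_modification = 0
for i in range(len(payload)):
    for bit in range(16):
        damaged = payload.copy()
        damaged[i] ^= 1 << bit                 # single-bit fault inside the router
        # Regeneration: egress computes a fresh checksum from the damaged copy.
        if verifies(header + damaged, checksum(header + damaged)):
            hidden_by_regeneration += 1
        # Modification: the checksum only ever tracked header changes.
        if verifies(header + damaged, original):
            hidden_by_modification += 1

print(hidden_by_regeneration, hidden_by_modification)
```

Every one of the 64 single-bit faults passes verification when the checksum is regenerated after the fault, and none does when the checksum is carried through. Multi-bit faults can still slip past a carried checksum, but only by coincidence.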
Re: NAT reliability in light of recent checksum changes
On 22/01/2014, at 7:19 PM, Henning Brauer wrote:
> * Richard Procter [2014-01-22 06:44]:
>> This fundamentally weakens its usefulness, though: a correct
>> checksum now implies only that the payload likely matches
>> what the last NAT router happened to have in its memory
>
> huh?
> we receive a packet with correct cksum -> NAT -> packet goes out with
> correct cksum.
> we receive a packet with broken cksum -> NAT -> we leave the cksum
> alone, i. e. leave it broken.

Christian said it better than me: routers may corrupt data and regenerating the checksum will hide it.

That's more than a theoretical concern. The article I referenced is a detailed study of real-world traces co-authored by a member of the Stanford distributed systems group that concludes "Probably the strongest message of this study is that the networking hardware is often trashing the packets which are entrusted to it"[0].

More generally, TCP checksums provide for an acceptable error rate that is independent of the reliability of the underlying network[*] by allowing us to verify its workings. But it's no longer possible to verify network operation if it may be regenerating TCP checksums, as these may hide network faults. That's a fundamental change from the scheme Cerf and Kahn emphasized in their design notes for what became known as TCP:

"The remainder of the packet consists of text for delivery to the destination and a trailing check sum used for end-to-end software verification. The GATEWAY does /not/ modify the text and merely forwards the check sum along without computing or recomputing it."[1]

> It doesn't seem you know what you are talking about. the
> cksum is dead simple, if we had bugs in calculating or
> verifying it, we really had a LOT of other problems.

I'm not saying the calculation is bad. I'm saying it's being calculated from the wrong copy of the data and by the wrong device. And it's not just me saying it: I'm quoting the guys who designed TCP.

> There is no "undetected error rate", nothing really changes
> there.

I disagree. Every TCP stream containing arbitrary data may have undetected errors as checksums cannot detect all the errors networks may make (being shorter than the data they cover). The engineer's task is to make network errors reliably negligible in practice. As network regenerated checksums may hide any amount of arbitrary data corruption I believe it's correct to say the network error rate undetected by TCP is then "unknown and unbounded".

best,
Richard.

[*] Under reasonable assumptions of the error modes most likely in practice. And some applications require lower error rates than TCP checksums can provide.

[0] http://conferences.sigcomm.org/sigcomm/2000/conf/paper/sigcomm2000-9-1.pdf Jonathan Stone and Craig Partridge. 2000. When the CRC and TCP checksum disagree. In Proceedings of the conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM '00). ACM, New York, NY, USA, 309-319. DOI=10.1145/347059.347561 http://doi.acm.org/10.1145/347059.347561

[1] "A Protocol for Packet Network Intercommunication" V. Cerf, R. Kahn, IEEE Trans on Comms, Vol Com-22, No 5 May 1974. Page 3; emphasis in original.
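As a small illustration of the point that a 16-bit checksum cannot catch every error in data longer than itself, the sketch below shows two kinds of damage that leave an Internet-style checksum unchanged. The payload words are made up, and these two cases are examples only, not a catalogue of what the checksum misses.

```python
def internet_checksum(words):
    """Complement of the 16-bit ones-complement sum of the words."""
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


good = [0x4865, 0x6C6C, 0x6F21, 0x0A00]

# Two words exchanged: addition is commutative, so the sum cannot tell.
swapped = [0x6C6C, 0x4865, 0x6F21, 0x0A00]

# One word incremented and another decremented: the changes cancel.
offsetting = [0x4866, 0x6C6B, 0x6F21, 0x0A00]

for name, words in (("good", good), ("swapped", swapped), ("offsetting", offsetting)):
    print(f"{name:10s} {internet_checksum(words):#06x}")
```

All three lines print the same checksum even though two of the payloads are wrong, which is why the residual error rate is never zero and why the aim is to keep it reliably negligible.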
Re: NAT reliability in light of recent checksum changes
On 2014-01-15, Stuart Henderson wrote:

> On 2014-01-14, Richard Procter wrote:
>>
>> I've a question about the new checksum changes. [...]
>> My understanding is that checksums are now always recalculated when
>> a header is altered, never updated.
>>
>> Is that right and if so has this affected NAT reliability?
>>
>> Recalculation here would compromise reliable end-to-end transport
>> as the payload checksum no longer covers the entire network path,
>> and so break a basic transport layer design principle.
>
> That is exactly what slides 30-33 talk about. PF now checks
> the incoming packets before it rewrites the checksum, so it can
> reject them if they are broken.

Right -- so NAT now replaces the existing transport checksum with one
newly computed from the payload [0].

This fundamentally weakens its usefulness, though: a correct checksum
now implies only that the payload likely matches what the last NAT
router happened to have in its memory, whereas the receiver wants to
know whether what it got is what was originally transmitted. In the
worst case, with NAT on every intermediate node, the transport checksum
is effectively reduced to an adjunct of the link layer checksum.

This means transport layer payload integrity is no longer reliant on
the quality of the checksum algorithm alone but now depends too on the
reliability of the path the packet took through the network.

I think it's great to see someone working hard to simplify crucial code
but in light of the above I believe pf should always update the
checksum, as it did in versions prior to 5.4, as the alternative
fundamentally undermines TCP by making the undetected error rate of its
streams unknown and unbounded.

One might argue networks these days are reliable; I think it better to
avoid the need to make the argument. In any case the work I've found on
that question is not reassuring [1].

best,
Richard.

[0] pf.c 1.863

    On initial rule match:
      pf_test_rule()
        3445: pf_translate()
          3707: pf_change_ap()
            1677: PF_ACPY [= pf_addrcpy()]
        3461: pf_cksum()
          6775: pd->hdr.tcp->th_sum = 0;
                m->m_pkthdr.csum_flags |= M_TCP_CSUM_OUT
                (if orig checksum good)

    On subsequent state matching:
      pf_test_state()
        ~4445: pf_change_ap() etc
        4471: pf_cksum() etc

[1] "Probably the strongest message of this study is that the
    networking hardware is often trashing the packets which are
    entrusted to it"
    http://conferences.sigcomm.org/sigcomm/2000/conf/paper/sigcomm2000-9-1.pdf
    Jonathan Stone and Craig Partridge. 2000. When the CRC and TCP
    checksum disagree. In Proceedings of the conference on
    Applications, Technologies, Architectures, and Protocols for
    Computer Communication (SIGCOMM '00). ACM, New York, NY, USA,
    309-319. DOI=10.1145/347059.347561
    http://doi.acm.org/10.1145/347059.347561
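
The difference between updating and recalculating a checksum during NAT can be sketched as follows. This is an illustration only, not pf's code. The "packet" is simplified to a source address plus payload (a real TCP checksum also covers the rest of the pseudo-header and the TCP header), the addresses, payload and function names are hypothetical, and the corruption is injected by hand to stand in for a fault in router memory after the incoming checksum has been verified. The incremental update follows RFC 1624, equation 3.

```python
def fold(total: int) -> int:
    """Fold carries above 16 bits back in (one's-complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ones_sum(data: bytes) -> int:
    """One's-complement sum of 16-bit big-endian words (even length assumed)."""
    return fold(sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2)))


def checksum(data: bytes) -> int:
    """Full recalculation from whatever bytes are passed in."""
    return ~ones_sum(data) & 0xFFFF


def verify(data: bytes, cksum: int) -> bool:
    """Receiver check: data plus checksum must sum to 0xFFFF."""
    return ones_sum(data + cksum.to_bytes(2, "big")) == 0xFFFF


def update(old_cksum: int, old_field: bytes, new_field: bytes) -> int:
    """Incremental update, RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m')."""
    inverted_old = bytes(~b & 0xFF for b in old_field)
    total = (~old_cksum & 0xFFFF) + ones_sum(inverted_old) + ones_sum(new_field)
    return ~fold(total) & 0xFFFF


src_private = bytes([192, 168, 1, 10])  # hypothetical pre-NAT source address
src_public = bytes([203, 0, 113, 5])    # hypothetical post-NAT source address
payload = b"PAY 100 TO ALICE"           # hypothetical payload
sent_cksum = checksum(src_private + payload)

# The router verifies the incoming packet, and the check passes.
incoming_ok = verify(src_private + payload, sent_cksum)

# Assumed fault: the payload is corrupted in router memory afterwards.
corrupted = payload.replace(b"100", b"900")

# Strategy 1: update the sender's checksum using only the changed field.
updated = update(sent_cksum, src_private, src_public)
# Strategy 2: recalculate from the (now corrupted) copy in memory.
recomputed = checksum(src_public + corrupted)

update_accepts = verify(src_public + corrupted, updated)
recompute_accepts = verify(src_public + corrupted, recomputed)
print(incoming_ok, update_accepts, recompute_accepts)  # True False True
```

In this sketch the updated checksum still reflects the sender's original payload, so the receiver rejects the damaged packet; the recalculated checksum matches the damaged copy, so the receiver accepts it.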
NAT reliability in light of recent checksum changes
Hi all,

I'm using OpenBSD 5.3 to provide an Alix-based home firewall. Thank you
all for the commitment to elegant, well-documented software which isn't
pernicious to the mental health of its users.

I've a question about the new checksum changes[0], being interested in
such things and having listened to Henning's presentation and poked
around in the archives a little. My understanding is that checksums are
now always recalculated when a header is altered, never updated.[1]

Is that right and if so has this affected NAT reliability?

Recalculation here would compromise reliable end-to-end transport as
the payload checksum no longer covers the entire network path, and so
break a basic transport layer design principle.[2][3]

best,
Richard.

[0] http://www.openbsd.org/54.html
    "Reworked checksum handling for network protocols."

[1] e.g.
    26:45 slide 27, 'use protocol checksum offloading better'
    http://quigon.bsws.de/papers/2013/EuroBSDcon/mgp00027.html
    30:51 slide 30, 'consequences in pf'
    http://quigon.bsws.de/papers/2013/EuroBSDcon/mgp00030.html
    https://www.youtube.com/watch?v=AymV11igbLY
    'The surprising complexity of checksums in TCP/IP'

[2] V. Cerf, R. Kahn, IEEE Trans on Comms, Vol Com-22, No 5, May 1974.
    Page 3, emphasis in original.

    > The remainder of the packet consists of text for delivery to the
    > destination and a trailing check sum used for end-to-end software
    > verification. The GATEWAY does /not/ modify the text and merely
    > forwards the check sum along without computing or recomputing it.

[3] Page 3. http://www.ietf.org/rfc/rfc793.txt

    > The TCP must recover from data that is damaged, lost, duplicated, or
    > delivered out of order by the internet communication system. [...]
    > Damage is handled by adding a checksum to each segment transmitted,
    > checking it at the receiver, and discarding damaged segments.