Re: ping time fluctuates, any idea?

2019-09-09 Thread Richard Procter
Hello,

> On 9/09/2019, at 8:45 PM, Jihyun Yu  wrote:
> 
> It seems that time from ping command fluctuates. Here's an output from ping
> command.
> [...snip ping with negative rtt...]

This is symptomatic of unsynchronized time stamp counters (TSC).

I would expect that setting: 

# sysctl kern.timecounter.hardware=acpihpet0 

would fix your ping results, and probably improve your ntpd(8) 
performance, too. 

There has been some work in this area on -current.
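
To illustrate why unsynchronized counters can yield a negative rtt, here is a toy model in Python. It is only a sketch: the frequency, the offset and the function names are made up for illustration, and it does not reflect how the kernel timecounter code is actually structured. The point is just that if the two timestamps of one ping are read from counters on different CPUs, any offset between those counters is added to the measured interval.

```python
# Toy model: each CPU has its own cycle counter, and the counters
# were not started at the same instant (hypothetical numbers).
TSC_HZ = 2_000_000_000              # assumed 2 GHz counter frequency
TSC_OFFSET = {0: 0, 1: -3_000_000}  # cpu1's counter lags cpu0's by 1.5 ms


def read_tsc(cpu, true_seconds):
    """Counter value seen on `cpu` at a given instant of real time."""
    return round(true_seconds * TSC_HZ) + TSC_OFFSET[cpu]


def measured_rtt_ms(send_cpu, recv_cpu, sent_at, true_rtt):
    """RTT as computed from two raw counter reads, in milliseconds."""
    start = read_tsc(send_cpu, sent_at)
    end = read_tsc(recv_cpu, sent_at + true_rtt)
    return (end - start) * 1000 / TSC_HZ


# Same true round trip (0.5 ms), timestamped on the same or on different CPUs.
same_cpu = measured_rtt_ms(0, 0, sent_at=10.0, true_rtt=0.0005)
cross_cpu = measured_rtt_ms(0, 1, sent_at=10.0, true_rtt=0.0005)
print(f"same cpu:  {same_cpu:.3f} ms")
print(f"cross cpu: {cross_cpu:.3f} ms")
```

A single shared counter such as the HPET has no per-CPU offset, which is why switching the timecounter would be expected to help.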

best, 
Richard. 

PS. Please make sure to include a complete dmesg next time -  
half a dmesg is like half a photograph!

> 00=2828 01=a1a1 02=f7f7 03=2020 04=d9d9 05=5b5b 06=0b0b 07=3030
> isa0 at pcib0
> isadma0 at isa0
> com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
> pckbc0 at isa0 port 0x60/5 irq 1 irq 12
> pckbd0 at pckbc0 (kbd slot)
> wskbd0 at pckbd0: console keyboard, using wsdisplay0
> pcppi0 at isa0 port 0x61
> spkr0 at pcppi0
> wbsio0 at isa0 port 0x2e/2: NCT6776F rev 0x33
> lm1 at wbsio0 port 0x290/8: NCT6776F
> pci13 at mainbus0 bus 255
> "Intel E5 QPI Link" rev 0x07 at pci13 dev 8 function 0 not configured
> vendor "Intel", unknown product 0x3c83 (class system subclass
> miscellaneous, rev 0x07) at pci13 dev 8 function 3 not configured
> vendor "Intel", unknown product 0x3c84 (class system subclass
> miscellaneous, rev 0x07) at pci13 dev 8 function 4 not configured
> "Intel E5 QPI Link" rev 0x07 at pci13 dev 9 function 0 not configured
> vendor "Intel", unknown product 0x3c93 (class system subclass
> miscellaneous, rev 0x07) at pci13 dev 9 function 3 not configured
> vendor "Intel", unknown product 0x3c94 (class system subclass
> miscellaneous, rev 0x07) at pci13 dev 9 function 4 not configured
> "Intel E5 PCU" rev 0x07 at pci13 dev 10 function 0 not configured
> "Intel E5 PCU" rev 0x07 at pci13 dev 10 function 1 not configured
> "Intel E5 PCU" rev 0x07 at pci13 dev 10 function 2 not configured
> "Intel E5 PCU" rev 0x07 at pci13 dev 10 function 3 not configured
> "Intel E5 Scratch" rev 0x07 at pci13 dev 11 function 0 not configured
> "Intel E5 Scratch" rev 0x07 at pci13 dev 11 function 3 not configured
> "Intel E5 Unicast" rev 0x07 at pci13 dev 12 function 0 not configured
> "Intel E5 Unicast" rev 0x07 at pci13 dev 12 function 1 not configured
> "Intel E5 Unicast" rev 0x07 at pci13 dev 12 function 2 not configured
> "Intel E5 SAD" rev 0x07 at pci13 dev 12 function 6 not configured
> "Intel E5 SAD" rev 0x07 at pci13 dev 12 function 7 not configured
> "Intel E5 Unicast" rev 0x07 at pci13 dev 13 function 0 not configured
> "Intel E5 Unicast" rev 0x07 at pci13 dev 13 function 1 not configured
> "Intel E5 Unicast" rev 0x07 at pci13 dev 13 function 2 not configured
> "Intel E5 Broadcast" rev 0x07 at pci13 dev 13 function 6 not configured
> "Intel E5 Home Agent" rev 0x07 at pci13 dev 14 function 0 not configured
> "Intel E5 Home Agent" rev 0x07 at pci13 dev 14 function 1 not configured
> "Intel E5 TA" rev 0x07 at pci13 dev 15 function 0 not configured
> "Intel E5 RAS" rev 0x07 at pci13 dev 15 function 1 not configured
> "Intel E5 TAD" rev 0x07 at pci13 dev 15 function 2 not configured
> "Intel E5 TAD" rev 0x07 at pci13 dev 15 function 3 not configured
> "Intel E5 TAD" rev 0x07 at pci13 dev 15 function 4 not configured
> "Intel E5 TAD" rev 0x07 at pci13 dev 15 function 5 not configured
> "Intel E5 TAD" rev 0x07 at pci13 dev 15 function 6 not configured
> "Intel E5 Thermal" rev 0x07 at pci13 dev 16 function 0 not configured
> "Intel E5 Thermal" rev 0x07 at pci13 dev 16 function 1 not configured
> "Intel E5 Error" rev 0x07 at pci13 dev 16 function 2 not configured
> "Intel E5 Error" rev 0x07 at pci13 dev 16 function 3 not configured
> "Intel E5 Thermal" rev 0x07 at pci13 dev 16 function 4 not configured
> "Intel E5 Thermal" rev 0x07 at pci13 dev 16 function 5 not configured
> "Intel E5 Error" rev 0x07 at pci13 dev 16 function 6 not configured
> "Intel E5 Error" rev 0x07 at pci13 dev 16 function 7 not configured
> "Intel E5 DDRIO" rev 0x07 at pci13 dev 17 function 0 not configured
> "Intel E5 R2PCIE" rev 0x07 at pci13 dev 19 function 0 not configured
> "Intel E5 PCIE Monitor" rev 0x07 at pci13 dev 19 function 1 not configured
> "Intel E5 QPI" rev 0x07 at pci13 dev 19 function 4 not configured
> "Intel E5 QPI Link Monitor" rev 0x07 at pci13 dev 19 function 5 not
> configured
> "Intel E5 QPI Link Monitor" rev 0x07 at pci13 dev 19 function 6 not
> configured
> vmm0 at mainbus0: VMX/EPT
> uhub4 at uhub0 port 1 configuration 1 interface 0 "Intel Rate Matching Hub"
> rev 2.00/0.00 addr 2
> uhub5 at uhub3 port 1 configuration 1 interface 0 "Intel Rate Matching Hub"
> rev 2.00/0.00 addr 2
> vscsi0 at root
> scsibus4 at vscsi0: 256 targets
> softraid0 at root
> scsibus5 at softraid0: 256 targets
> root on sd0a (74fd07b06a4f30a1.a) swap on sd0b dump on sd0b
> 
> 
> Thanks,
> Jihyun Yu



Re: ral(4) problems on current/i386 ALIX

2016-11-28 Thread Richard Procter
On 28/11/2016, at 4:25 AM, Jan Stary wrote:
> [...]
> What kind of wifi are people using
> on the ALIX serving as an AP?

I'm running an RT2860 via ral(4) on an Alix 2d2 -- I'm seeing
about 1.1MB/s when transferring 47MB from it through a couple
of walls, and with another network at -74dBm on the same channel.
The network is otherwise quiet and I'm maybe 12 m from the AP.

best,
Richard.

(The kernel is running a patch but this shouldn't be affecting throughput.)

OpenBSD 6.0-current (GENERIC) #8: Sun Nov 20 12:51:52 NZDT 2016
build@build.localdomain:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Geode(TM) Integrated Processor by AMD PCS ("AuthenticAMD" 586-class) 499
MHz
cpu0: FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CFLUSH,MMX,MMXX,3DNOW2,3DNOW
real mem  = 267931648 (255MB)
avail mem = 250114048 (238MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: date 11/05/08, BIOS32 rev. 0 @ 0xfd088
pcibios0 at bios0: rev 2.1 @ 0xf/0x1
pcibios0: pcibios_get_intr_routing - function not supported
pcibios0: PCI IRQ Routing information unavailable.
pcibios0: PCI bus #0 is the last bus
bios0: ROM list: 0xe/0xa800
cpu0 at mainbus0: (uniprocessor)
mtrr: K6-family MTRR support (2 registers)
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 1 function 0 "AMD Geode LX" rev 0x33
glxsb0 at pci0 dev 1 function 2 "AMD Geode LX Crypto" rev 0x00: RNG AES
vr0 at pci0 dev 9 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 10, address
xx:xx:xx:xx:xx:xx
ukphy0 at vr0 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI
0x004063, model 0x0034
vr1 at pci0 dev 11 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 15, address
xx:xx:xx:xx:xx:xx
ukphy1 at vr1 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI
0x004063, model 0x0034
ral0 at pci0 dev 12 function 0 "Ralink RT2860" rev 0x00: irq 9, address
xx:xx:xx:xx:xx:xx
ral0: MAC/BBP RT2860 (rev 0x0103), RF RT2850 (MIMO 2T3R)
glxpcib0 at pci0 dev 15 function 0 "AMD CS5536 ISA" rev 0x03: rev 3, 32-bit
3579545Hz timer, watchdog, gpio, i2c
gpio0 at glxpcib0: 32 pins
iic0 at glxpcib0
maxtmp0 at iic0 addr 0x4c: lm86
pciide0 at pci0 dev 15 function 2 "AMD CS5536 IDE" rev 0x01: DMA, channel 0
wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: 
wd0: 1-sector PIO, LBA, 3831MB, 7847280 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
pciide0: channel 1 ignored (disabled)
ohci0 at pci0 dev 15 function 4 "AMD CS5536 USB" rev 0x02: irq 12, version
1.0, legacy support
ehci0 at pci0 dev 15 function 5 "AMD CS5536 USB" rev 0x02: irq 12
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "AMD EHCI root hub" rev 2.00/1.00
addr 1
isa0 at glxpcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
usb1 at ohci0: USB revision 1.0
uhub1 at usb1 configuration 1 interface 0 "AMD OHCI root hub" rev 1.00/1.00
addr 1
vmm at mainbus0 not configured
nvram: invalid checksum
vscsi0 at root
scsibus1 at vscsi0: 256 targets
softraid0 at root
scsibus2 at softraid0: 256 targets



Re: cuaU0 problems

2016-09-20 Thread Richard Procter
On 20/09/2016, at 9:53 PM, Richard Procter wrote:

>
> On 20/09/2016, at 8:00 AM, Edgar Pettijohn wrote:
>
>> On 16-09-19 19:56:31, Kapfhammer, Stefan wrote:
>>> Hello Edgar,
>>>
>>> I have no Soekris, but Apu2 is also connected
>>> with a serial cable.
>>>
>>> When cable is plugged in the controlling pc
>>> before booting, it is to be found as /dev/cuaU0.
>>>
>>> When I plug it in after the boot completed, it is to be
>>> found as /dev/cuaU3. (0/1/2 is normally int. 3G modem)
>>>
>>> Hope this helps debugging. Feedback would be fine.
>>
>> Thanks for the reply. It has always worked with:
>>
>> # cu -l cuaU0
>>
>> when it stopped working I tried cuaU{1,2} with same result.
>
> Although my MacBook works at
> [ details on breakage ]

P.S. here's the dmesg for the snapshot where I can't login via console.

best,
Richard

OpenBSD 6.0-current (GENERIC.MP) #2473: Sun Sep 18 23:24:19 MDT 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
RTC BIOS diagnostic error
ff
real mem = 4005785600 (3820MB)
avail mem = 3879882752 (3700MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbf719000 (44 entries)
bios0: vendor Apple Inc. version "MBP71.88Z.0039.B05.1003251322" date
03/25/10
bios0: Apple Inc. MacBookPro7,1
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP HPET APIC APIC ASF! SBST ECDT SSDT SSDT SSDT MCFG
acpi0: wakeup devices ADP1(S3) LID0(S3) EC__(S3) OHC1(S3) EHC1(S3) OHC2(S3)
EHC2(S3) ARPT(S5) GIGE(S5)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 2500 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz, 2389.64 MHz
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUS
H,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,ES
T,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR
cpu0: 3MB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 265MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz, 2389.25 MHz
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUS
H,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,ES
T,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR
cpu1: 3MB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 11, 24 pins
acpiec0 at acpi0
acpimcfg0 at acpi0 addr 0xf000, bus 0-4
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 4 (IXVE)
acpicpu0 at acpi0: !C3(100@57 mwait.3@0x31), !C2(500@1 mwait@0x10), C1(1000@1
mwait), PSS
acpicpu1 at acpi0: !C3(100@57 mwait.3@0x31), !C2(500@1 mwait@0x10), C1(1000@1
mwait), PSS
acpiac0 at acpi0: AC unit offline
acpibtn0 at acpi0: LID0
"APP0002" at acpi0 not configured
acpibtn1 at acpi0: PWRB
acpibtn2 at acpi0: SLPB
"APP0001" at acpi0 not configured
"APP0003" at acpi0 not configured
acpials0 at acpi0: ALS0
"ACPI0002" at acpi0 not configured
acpibat0 at acpi0: BAT0 model "3545797981023400290" type 3545797981528607052
oem "3545797981528673619"
cpu0: Enhanced SpeedStep 2389 MHz: speeds: 2394, 2128, 1862, 1596, 798 MHz
pci0 at mainbus0 bus 0
0:3:4: mem address conflict 0xd340/0x8
pchb0 at pci0 dev 0 function 0 "NVIDIA MCP89 Host" rev 0xa1
"NVIDIA MCP89 Memory" rev 0xa1 at pci0 dev 0 function 1 not configured
vendor "NVIDIA", unknown product 0x0d6d (class memory subclass RAM, rev 0xa1)
at pci0 dev 1 function 0 not configured
vendor "NVIDIA", unknown product 0x0d6e (class memory subclass RAM, rev 0xa1)
at pci0 dev 1 function 1 not configured
vendor "NVIDIA", unknown product 0x0d6f (class memory subclass RAM, rev 0xa1)
at pci0 dev 1 function 2 not configured
vendor "NVIDIA", unknown product 0x0d70 (class memory subclass RAM, rev 0xa1)
at pci0 dev 1 function 3 not configured
vendor "NVIDIA", unknown product 0x0d71 (class memory subclass RAM, rev 0xa1)
at pci0 dev 2 function 0 not configured
vendor "NVIDIA", unknown product 0x0d72 (class memory subclass RAM, rev 0xa1)
at pci0 dev 2 function 1 not configured
pcib0 at pci0 dev 3 function 0 "NVIDIA MCP89 LPC" rev 0xa2
"NVIDIA MCP89 Memory" rev 0xa1 at pci0 dev 3 function 1 not configured
nviic0 at pci0 dev 3 function 2 "NVIDIA MCP89 SMBus" rev 0xa1
iic0 at nviic0
iic1 at nviic0
"NVIDIA MCP89 Memory" rev 0xa1 at pci0 dev 3 function 3 not configured

Re: cuaU0 problems

2016-09-20 Thread Richard Procter
On 20/09/2016, at 8:00 AM, Edgar Pettijohn wrote:

> On 16-09-19 19:56:31, Kapfhammer, Stefan wrote:
>> Hello Edgar,
>>
>> I have no Soekris, but Apu2 is also connected
>> with a serial cable.
>>
>> When cable is plugged in the controlling pc
>> before booting, it is to be found as /dev/cuaU0.
>>
>> When I plug it in after the boot completed, it is to be
>> found as /dev/cuaU3. (0/1/2 is normally int. 3G modem)
>>
>> Hope this helps debugging. Feedback would be fine.
>
> Thanks for the reply. It has always worked with:
>
> # cu -l cuaU0
>
> when it stopped working I tried cuaU{1,2} with same result.

Although my MacBook works at

OpenBSD 6.0-current (GENERIC.MP) #2466: Sat Sep 17 23:07:05 MDT 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

(dmesg attached)

, it breaks at

OpenBSD 6.0-current (GENERIC.MP) #2473: Sun Sep 18 23:24:19 MDT 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

I see some disabled USB ports and my keyboard is apparently attached to one
of them because I cannot log in via console.

sys/dev/usb/usb_subr.c revision 1.129 lies between these dates. Reverting
this to revision 1.128 restores my keyboard, etc.

best,
Richard.

[known good - and working FTDI USB->serial attached at end]

OpenBSD 6.0-current (GENERIC.MP) #2466: Sat Sep 17 23:07:05 MDT 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
RTC BIOS diagnostic error
ff
real mem = 4005785600 (3820MB)
avail mem = 3879882752 (3700MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xbf719000 (44 entries)
bios0: vendor Apple Inc. version "MBP71.88Z.0039.B05.1003251322" date
03/25/10
bios0: Apple Inc. MacBookPro7,1
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP HPET APIC APIC ASF! SBST ECDT SSDT SSDT SSDT MCFG
acpi0: wakeup devices ADP1(S3) LID0(S3) EC__(S3) OHC1(S3) EHC1(S3) OHC2(S3)
EHC2(S3) ARPT(S5) GIGE(S5)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 2500 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz, 2389.57 MHz
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUS
H,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,ES
T,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR
cpu0: 3MB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 265MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz, 2389.25 MHz
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUS
H,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,ES
T,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR
cpu1: 3MB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 11, 24 pins
acpiec0 at acpi0
acpimcfg0 at acpi0 addr 0xf000, bus 0-4
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 4 (IXVE)
acpicpu0 at acpi0: !C3(100@57 mwait.3@0x31), !C2(500@1 mwait@0x10), C1(1000@1
mwait), PSS
acpicpu1 at acpi0: !C3(100@57 mwait.3@0x31), !C2(500@1 mwait@0x10), C1(1000@1
mwait), PSS
acpiac0 at acpi0: AC unit offline
acpibtn0 at acpi0: LID0
"APP0002" at acpi0 not configured
acpibtn1 at acpi0: PWRB
acpibtn2 at acpi0: SLPB
"APP0001" at acpi0 not configured
"APP0003" at acpi0 not configured
acpials0 at acpi0: ALS0
"ACPI0002" at acpi0 not configured
acpibat0 at acpi0: BAT0 model "3545797981023400290" type 3545797981528607052
oem "3545797981528673619"
cpu0: Enhanced SpeedStep 2389 MHz: speeds: 2394, 2128, 1862, 1596, 798 MHz
pci0 at mainbus0 bus 0
0:3:4: mem address conflict 0xd340/0x8
pchb0 at pci0 dev 0 function 0 "NVIDIA MCP89 Host" rev 0xa1
"NVIDIA MCP89 Memory" rev 0xa1 at pci0 dev 0 function 1 not configured
vendor "NVIDIA", unknown product 0x0d6d (class memory subclass RAM, rev 0xa1)
at pci0 dev 1 function 0 not configured
vendor "NVIDIA", unknown product 0x0d6e (class memory subclass RAM, rev 0xa1)
at pci0 dev 1 function 1 not configured
vendor "NVIDIA", unknown product 0x0d6f (class memory subclass RAM, rev 0xa1)
at pci0 dev 1 function 2 not configured
vendor "NVIDIA", unknown product 0x0d70 (class memory subclass RAM, rev 0xa1)
at pci0 dev 1 function 3 not configured
vendor "NVIDIA", unknown product 0x0d71 (class memory subclass RAM, rev 0xa1)
at pci0 dev 2 function 0 not configured
vendor "NVIDIA", unknown product 0x0d72 (class memory subclass RAM, rev 0xa1)
at pci0 dev 2 function 1 not configured
pcib0 at pci0 dev 3 function 0 "NVIDIA MCP89 LPC" rev 0xa2
"NVIDIA MCP89 Memory" rev 0xa1 at pci0 dev 3 function 1 not configured
nviic0 at pci0 dev 3 function 2 "NVIDIA MCP89 SMBus" rev 0xa1
iic0 at nviic0
iic1 

Re: NAT reliability in light of recent checksum changes

2015-06-15 Thread Richard Procter
On 7/03/2014, at 2:15 PM, Richard Procter wrote:
> 
> I've some ideas about solutions [for modifying checksums more cleanly] but will
> leave those for another email.

Shifting this old thread to tech@: I've posted a patch that re-instates 
the pf algorithm of OpenBSD 5.4 for preserving payload checksums end-to-end
but rewritten without the ugly and error-prone (but speedy!) code and 
aiming to have no significant impact on performance. 

best, 
Richard. 



Re: NAT reliability in light of recent checksum changes

2014-03-06 Thread Richard Procter
On 27/02/2014, at 11:04 AM, Theo de Raadt wrote:
> 
> There was a method of converting an in-bound checksum, due to NAT
> conversion, into a new out-bound checksum.  A process is required,
> it's how NAT works.
> 
> A new method of conversion is being used.  It is mathematically equivalent
> to the old method.

First, I agree with Theo that modifying a checksum is
mathematically equivalent to regenerating it; both give the same
result on ideal hardware.

Of course, we use checksums because our hardware isn't ideal, so
let's look at how the two approaches differ when a router 
fault occurs.

Take Stuart Henderson's example:

> Consider this scenario, which has happened in real life.
> 
> - NIC supports checksum offloading, verified checksum is OK.
> 
> - PCI transfers are broken (in my case it affected multiple
> machines of a certain type, so most likely a motherboard bug),
> causing some corruption in the payload, but the machine won't
> detect them because it doesn't look at checksums itself, just
> trusts the NIC's "rx csum good" flag.
> 
> In this situation, packets which have been NATted that are
> corrupt now get a new checksum that is valid; so the final
> endpoint can not detect the breakage.

That is, when the router offloads and regenerates, the router's
egress NIC will hide any card, stack, bus or memory fault a
verified packet suffered in passing through the router when it
regenerates a new checksum from the now corrupt data.

Looking at the code, the relevant functions are
pf.c:pf_check_proto_cksum(), which trusts the ingress NIC's
checksum good flag, and pf.c:pf_cksum(), which zeros the existing
checksum on that basis and flags it to be regenerated by the
egress NIC[1].

By contrast, checksum modification is far more reliable. In order
to hide payload corruption the update code[1] would have to
modify the checksum to exactly account for it. But that would
have to happen by accident --- by a fault that in effect computes
the necessary change --- as the update code never considers the
payload[0]. It's not impossible but, on the other hand,
checksum regeneration guarantees to hide faults in the
regenerating router. 

We conclude that in the typical offloading case, regenerated
checksums, unlike modified ones, cannot detect faults in the
regenerating routers. 
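
The difference can be shown with a small Python sketch. It is an illustration under simplifying assumptions, not pf's code: the "segment" is just one 16-bit port followed by a payload, the function names are hypothetical, and the update uses the incremental formula from RFC 1624 rather than whatever a given NAT implementation does. A fault flips one payload bit inside the router after the ingress check; regeneration then blesses the corrupt copy, while modification leaves the mismatch visible to the receiver.

```python
import struct


def cksum(data: bytes) -> int:
    """RFC 1071 Internet checksum over an even-length byte string."""
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def update(old_cksum: int, old_word: int, new_word: int) -> int:
    """Incremental update per RFC 1624: HC' = ~(~HC + ~m + m')."""
    total = (~old_cksum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def verifies(segment: bytes, stored: int) -> bool:
    """What the receiving end-point checks."""
    return cksum(segment) == stored


# Toy segment: a 16-bit port plus payload (not a real TCP header).
old_port, new_port = 0x1F90, 0xC350
payload = b"end-to-end data!"
original = cksum(struct.pack("!H", old_port) + payload)

# A router fault flips one payload bit after ingress verification.
corrupt = bytes([payload[0] ^ 0x01]) + payload[1:]
natted = struct.pack("!H", new_port) + corrupt

regenerated = cksum(natted)                      # computed from the corrupt copy
modified = update(original, old_port, new_port)  # never reads the payload

print("regenerated verifies at receiver:", verifies(natted, regenerated))
print("modified verifies at receiver:   ", verifies(natted, modified))
```

The regenerated checksum always verifies, whatever happened to the payload, whereas the modified one verifies only if the payload still matches what the origin summed.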

Whether this difference is significant is a matter 
of judgment and a separate issue.

I've some ideas about solutions but will leave those for 
another email.

best, 
Richard. 

PS. I find the following terminology helpful:

Checksums calculated from the origin data are 'original';
checksums calculated from a copy are 'regenerated'.

Checksums may also be 'modified' to account for altered data in
such a way as to preserve originality for any unaltered data[0].

A checksum is 'end-to-end' if it is delivered original with
respect to the payload. A modified checksum may be end-to-end but
never a regenerated checksum as it is not original.

[0] Strikingly, RFC1631 (1994) and RFC3022 (2001), the NAT RFCs, 
fail to say end-to-end preservation is a property of their checksum 
modification algorithm. I presume it just didn't seem worth 
mentioning as, lacking hardware offload back then, one wouldn't 
regenerate in software on performance grounds alone. It is only 
alluded to in RFC1071 (1988) "Computing the Internet Checksum", 
which states that a checksum remains end-to-end when modified 
'since it was not fully recomputed'. Although that's still true 
if NAT modifies it, NAT makes the meaning of 'end-to-end' 
more complex; I think my above terminology helps there. 

[1]
I'll quote OpenBSD code here for completeness, contrasting 
modification (OpenBSD 5.3) with regeneration (OpenBSD 5.4) 

OpenBSD 5.3 NAT modified the checksum as follows: 

--- pf.c 1.818 (OPENBSD_5_3) --- 

Assuming an AF_INET <-> AF_INET TCP connection. 

pf_test_rule()
3862: pf_translate()
  3881: pf_change_ap()  [ src addr/port ]
    1671: PF_ACPY  [ = pf_addrcpy() ]
    1689: pf_cksum_fixup(...)
      [ pseudocode is:
          sum = fixup(sum, addr16[1])
          sum = fixup(sum, addr16[0])
          sum = fixup(sum, port) ]
      1662: l = cksum + old - new  <--- checksum modified
        [ then presumably account for ones-complement carries ]
  3887: pf_change_ap() etc [ dst addr/port ]

On subsequent state matching:  

pf_test()
6788: pf_test_state_tcp() [ for TCP ] 
  4566: pf_change_ap() etc [ for src addr/port ] 
  4574: pf_change_ap() etc [ for dst addr/port ]  
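
The `l = cksum + old - new` step in the 5.3 trace above can be sketched in Python as follows. This is modelled only on the one line quoted from the source and on ordinary ones-complement folding; the function names and sample values are hypothetical, and any special cases the real function handles are omitted. It applies the three fixups in the order given by the pseudocode and compares the result against a full recomputation.

```python
def cksum_fixup(cksum: int, old: int, new: int) -> int:
    """Adjust a stored 16-bit checksum for one 16-bit word changing old -> new."""
    l = (cksum + old - new) & 0xFFFFFFFF    # emulate 32-bit unsigned wraparound
    l = (l >> 16) + (l & 0xFFFF)            # end-around carry (or borrow)
    return l & 0xFFFF


def full_cksum(words) -> int:
    """Reference: ones-complement sum of 16-bit words, complemented."""
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical packet: two address words, a port, and one payload word.
payload_word = 0x1234
old_addr, old_port = (0xC0A8, 0x0001), 0x1F90   # 192.168.0.1:8080
new_addr, new_port = (0xCB00, 0x7109), 0xC350   # 203.0.113.9:50000

stored = full_cksum([*old_addr, old_port, payload_word])

# Same order as the pseudocode: addr16[1], addr16[0], then port.
fixed = cksum_fixup(stored, old_addr[1], new_addr[1])
fixed = cksum_fixup(fixed, old_addr[0], new_addr[0])
fixed = cksum_fixup(fixed, old_port, new_port)

recomputed = full_cksum([*new_addr, new_port, payload_word])
print(hex(fixed), hex(recomputed))
```

Note that the fixup only ever sees the old and new header words; the payload word contributes to the result solely through the checksum the origin computed.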

---

OpenBSD 5.4 NAT regenerates checksums as follows: 

--- pf.c 1.863 (post OPENBSD_5_4) --- 

Assuming an AF_INET <-> AF_INET TCP connection.

On initial rule match: 
pf_test_rule()
3445: pf_translate()
  3707: pf_change_ap()
    1677: PF_ACPY [ = pf_addrcpy() ]
3461: pf_cksum()
  6775: pd->hdr.tcp->th_sum = 0;  <--- checksum zeroed
        m->m_pkthdr.csum_flags |= M_TCP_CSUM_OUT  <--- flagged for recalculation
                                                       (if orig checksum good)

On s

Re: NAT reliability in light of recent checksum changes

2014-02-26 Thread Richard Procter
On 27/02/2014, at 11:04 AM, Theo de Raadt wrote:

> I believe you are posting cast aspersions on the pf efforts.

Theo, 

I'll insist then that I think pf is a superior piece of code
which I benefit from every day, and that Henning's efforts
to simplify it are so very welcome in a world addicted to
complexity.

My beef is solely with the technique of regenerating
checksums, not the people working on the code. Criticising a
design choice with argument and evidence is not the same as
attacking the designer's integrity or competence and if I
seem to be playing the men and not the ball, it is not my
intent and I apologise.

As to your other points, I will hopefully address them in
another email I have been drafting and should have finished
over the next few days.

best, 
Richard. 



Re: NAT reliability in light of recent checksum changes

2014-02-26 Thread Richard Procter
On 24/02/2014, at 9:33 PM, Henning Brauer wrote:

> * Richard Procter  [2014-01-25 20:41]:
>> On 22/01/2014, at 7:19 PM, Henning Brauer wrote:
>>> * Richard Procter  [2014-01-22 06:44]:
>>>> This fundamentally weakens its usefulness, though: a correct
>>>> checksum now implies only that the payload likely matches
>>>> what the last NAT router happened to have in its memory
>>> huh?
>>> we receive a packet with correct cksum -> NAT -> packet goes out with
>>> correct cksum.
>>> we receive a packet with broken cksum -> NAT -> we leave the cksum
>>> alone, i. e. leave it broken.
>> Christian said it better than me: routers may corrupt data
>> and regenerating the checksum will hide it.
> 
> if that happened we had much bigger problems than NAT.

By bigger problems do you mean obvious router stability
issues?  Suppose someone argued that as we'd have obvious
stability issues if unprotected memory was unreliable, ECC
memory is unnecessary. That argument is logically equivalent
to what seems to be yours, that as we'd see obvious
issues if routers were corrupting data, end-to-end
checksums are unnecessary, but I don't buy it.

We know that routers corrupt data. Right now my home
firewall shows 30 TCP segments dropped for bad checksums. As
checks at least as strong are used by every sane link-layer
this virtually implies the dropped packets suffered router
or end-point faults.

Again, it's not just me saying it: "...checksums are used by
higher layers to ensure that data was not corrupted in
intermediate routers or by the sending or receiving host.
The fact that checksums are typically the secondary level of
protection has often led to suggestions that checksums are
superfluous. Hard won experience, however, has shown that
checksums are necessary.  Software errors (such as buffer
mismanagement) and even hardware errors (such as network
adapters with poor DMA hardware that sometimes fail to fully
DMA data) are surprisingly common [let alone memory faults!
RP] and checksums have been very useful in protecting
against such errors."[0]

best, 
Richard. 

[0] Craig Partridge, Jim Hughes, and Jonathan Stone. 1995. 
Performance of checksums and CRCs over real data. SIGCOMM Comput. 
Commun. Rev. 25, 4 (October 1995), 68-76. DOI=10.1145/217391.217413 
http://doi.acm.org/10.1145/217391.217413 page 1 



Re: NAT reliability in light of recent checksum changes

2014-01-28 Thread Richard Procter
On 28/01/2014, at 4:19 AM, Simon Perreault wrote:

> Le 2014-01-25 14:40, Richard Procter a écrit :
>> I'm not saying the calculation is bad. I'm saying it's being
>> calculated from the wrong copy of the data and by the wrong
>> device. And it's not just me saying it: I'm quoting the guys
>> who designed TCP.
> 
> Those guys didn't envision NAT.
> 
> If you want end-to-end checksum purity, don't do NAT.

Let's look at the options.

The world needs more addresses than IPv4 provides and NAT
gives them to us. There's IPv6, which has about a hundred
million addresses for every bacterium estimated to live on
the planet[0], but it's not looking to replace IPv4 any time
soon. So NAT is here to stay for a good while longer.

Perhaps I can at least stop using NAT on my own network. In
my case I can't but let's assume I do. This eliminates one
source of error. But my TCP streams may still have
now-undetected one-bit errors (at least) if there may be
routers out there regenerating checksums. As long as there
are, good checksums no longer mean as much by themselves and
if I want at least some assurance the network did its job, I
still need some other way (e.g, checking the network path
contains no such routers, either by inspection or
statistically, or by reimplementing an end-to-end checksum
at a higher layer, etc). Regenerated checksums affect me
whether or not I use NAT myself.

Another option is to always update the checksum, as versions
prior to 5.4 did. It's reasonable to ask: is that any more
reliable than recomputing it, as 5.4 does? That is, can the
old update code hide payload corruption, too?

In order to hide payload corruption the update code would
have to modify the checksum to exactly account for it. But
that would have to happen by accident, as it never considers
the payload. It's not impossible, but, on the other hand,
checksum regeneration guarantees to hide any bad data.
So updates are more reliable.

A lot more reliable, in fact, as you'd require precisely
those memory errors necessary to in effect compute the
correct update, or some freak fault in the ALU that did the
same thing, or some combination of both. And as that has
nothing to do with the update code it is in principle
possible for non-NAT connections, too. For the hardware,
updates are just an extra load/modify/store and so the
chances of a checksum update hiding a corrupted payload are
in practical terms equivalent to those of normal forwarding.

So your statement holds only if checksums are being
regenerated. In general, NAT needn't compromise end-to-end
TCP payload checksum integrity, and in versions prior to
5.4, it didn't.

best, 
Richard. 


[0] "Prokaryotes: The unseen majority" 
Proc Natl Acad Sci U S A. 1998 June 9; 95(12): 6578–6583.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC33863/

2^128 IPv6 addresses = ~ 10^38 

 ~ 10^38 IPv6 addresses / ~ 10^30 bacteria cells
= 
 ~ 10^8 addresses per cell. 
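
The arithmetic above can be checked in a few lines of Python. This is only a sanity check of the orders of magnitude; the 10^30 cell count is the rounded figure used in this footnote, not a precise value.

```python
import math

ipv6_addresses = 2 ** 128          # size of the IPv6 address space
bacteria_cells = 10 ** 30          # order-of-magnitude figure used above

per_cell = ipv6_addresses / bacteria_cells
print(f"2^128 is about 10^{math.log10(ipv6_addresses):.1f}")
print(f"addresses per cell is about 10^{math.log10(per_cell):.1f}")
```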

[1] RFC1071 "Computing the Internet Checksum" p21 
"If anything, [this end-to-end property] is the most powerful 
 feature of the TCP checksum!". Page 15 also touches on 
 the end-to-end preserving properties of checksum update. 



Re: NAT reliability in light of recent checksum changes

2014-01-25 Thread Richard Procter
On 22/01/2014, at 7:19 PM, Henning Brauer wrote:

> * Richard Procter  [2014-01-22 06:44]:
>> This fundamentally weakens its usefulness, though: a correct
>> checksum now implies only that the payload likely matches
>> what the last NAT router happened to have in its memory
> 
> huh?
> we receive a packet with correct cksum -> NAT -> packet goes out with
> correct cksum.
> we receive a packet with broken cksum -> NAT -> we leave the cksum
> alone, i. e. leave it broken.

Christian said it better than me: routers may corrupt data
and regenerating the checksum will hide it.

That's more than a theoretical concern. The article I
referenced is a detailed study of real-world traces
co-authored by a member of the Stanford distributed systems
group that concludes "Probably the strongest message of this
study is that the networking hardware is often trashing the
packets which are entrusted to it"[0].

More generally, TCP checksums provide for an acceptable
error rate that is independent of the reliability of the
underlying network[*] by allowing us to verify its workings.
But it's no longer possible to verify network operation if 
it may be regenerating TCP checksums, as these may hide 
network faults. That's a fundamental change from the scheme 
Cerf and Kahn emphasized in their design notes for what 
became known as TCP:

"The remainder of the packet consists of text for delivery
to the destination and a trailing check sum used for
end-to-end software verification. The GATEWAY does /not/
modify the text and merely forwards the check sum along
without computing or recomputing it."[1]

> It doesn't seem you know what you are talking about. the
> cksum is dead simple, if we had bugs in calculating or
> verifying it, we really had a LOT of other problems.

I'm not saying the calculation is bad. I'm saying it's being
calculated from the wrong copy of the data and by the wrong
device. And it's not just me saying it: I'm quoting the guys 
who designed TCP. 

> There is no "undetected error rate", nothing really changes
> there.

I disagree. Every TCP stream containing arbitrary data may
have undetected errors as checksums cannot detect all the
errors networks may make (being shorter than the data they
cover). The engineer's task is to make network errors
reliably negligible in practice.
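
As an illustration of errors the checksum cannot see, here is a short
Python sketch (my own example data, not taken from any trace). It
assumes the RFC 1071 one's complement sum and shows two damaged copies
of a message that carry the same checksum as the original: one with
two 16-bit words swapped, one with compensating changes to two words.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\xab\xcd"

# The sum does not depend on word order, so a swap goes unnoticed.
swapped = b"\xab\xcd\x12\x34"

# One word is incremented and another decremented: the sum is unchanged.
compensated = b"\x12\x35\xab\xcc"

same_after_swap = internet_checksum(swapped) == internet_checksum(original)
same_after_compensation = (
    internet_checksum(compensated) == internet_checksum(original)
)
```

Both comparisons are True although neither damaged copy equals the
original, which is why the checksum bounds the undetected error rate
rather than eliminating it.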

As network-regenerated checksums may hide any amount of
arbitrary data corruption, I believe it's correct to say the
rate of network errors undetected by TCP is then "unknown and
unbounded".

best, 
Richard. 

[*] Under reasonable assumptions of the error modes most likely
in practice. And some applications require lower error rates 
than TCP checksums can provide.

[0]
http://conferences.sigcomm.org/sigcomm/2000/conf/paper/sigcomm2000-9-1.pdf

Jonathan Stone and Craig Partridge. 2000. When the CRC and
TCP checksum disagree.  In Proceedings of the conference on
Applications, Technologies, Architectures, and Protocols for
Computer Communication (SIGCOMM '00). ACM, New York, NY,
USA, 309-319.  DOI=10.1145/347059.347561
http://doi.acm.org/10.1145/347059.347561

[1] "A Protocol for Packet Network Intercommunication" 
V. Cerf, R. Kahn, IEEE Trans on Comms, Vol Com-22, No 5, May
1974, page 3; emphasis in original.



Re: NAT reliability in light of recent checksum changes

2014-01-21 Thread Richard Procter
On 2014-01-15, Stuart Henderson wrote:
> On 2014-01-14, Richard Procter wrote:
>> 
>> I've a question about the new checksum changes. [...] 
>> My understanding is that checksums are now always recalculated when
>> a header is altered, never updated.
>> 
>> Is that right and if so has this affected NAT reliability? 
>> 
>> Recalculation here would compromise reliable end-to-end transport 
>> as the payload checksum no longer covers the entire network path, 
>> and so break a basic transport layer design principle.
> 
> That is exactly what slides 30-33 talk about. PF now checks
> the incoming packets before it rewrites the checksum, so it can
> reject them if they are broken.

Right -- so NAT now replaces the existing transport checksum
with one newly computed from the payload [0].

This fundamentally weakens its usefulness, though: a correct
checksum now implies only that the payload likely matches
what the last NAT router happened to have in its memory,
whereas the receiver wants to know whether what it got is
what was originally transmitted. In the worst case of NAT on
every intermediate node the transport checksum is
effectively reduced to an adjunct of the link layer
checksum.

This means transport layer payload integrity is no longer
reliant on the quality of the checksum algorithm alone but 
now depends too on the reliability of the path the packet 
took through the network.

I think it's great to see someone working hard to simplify 
crucial code but in light of the above I believe pf should 
always update the checksum, as it did in versions prior to 
5.4, as the alternative fundamentally undermines TCP by 
making the undetected error rate of its streams unknown and 
unbounded. One might argue networks these days are reliable; 
I think it better to avoid the need to make the argument. 
In any case the work I've found on that question is not 
reassuring [1].
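
For readers unfamiliar with the distinction between updating and
recalculating, here is a minimal Python sketch of an incremental
update in the style of RFC 1624 (eqn. 3). It is not pf's code. The
"packet" is a simplification I have assumed for illustration: one
16-bit port field followed by the payload, with no pseudo-header. All
names and values are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ones_add(a: int, b: int) -> int:
    """One's complement addition of two 16-bit values."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def update_checksum(old_cksum: int, old_word: int, new_word: int) -> int:
    """HC' = ~(~HC + ~m + m'), using only the old checksum and the changed field."""
    s = ones_add(~old_cksum & 0xFFFF, ~old_word & 0xFFFF)
    s = ones_add(s, new_word)
    return ~s & 0xFFFF


old_port, new_port = 0x1F90, 0xC350      # hypothetical NAT port rewrite
payload = b"some payload bytes"
cksum = internet_checksum(old_port.to_bytes(2, "big") + payload)

# The update never reads the payload, yet matches a full recalculation
# when the payload is intact.
updated = update_checksum(cksum, old_port, new_port)
matches_recalculation = updated == internet_checksum(
    new_port.to_bytes(2, "big") + payload
)

# Hypothetical fault: a payload bit flips inside the router. The updated
# checksum still reflects the sender's data, so the receiver's check fails.
damaged = bytes([payload[0] ^ 0x01]) + payload[1:]
update_detects = updated != internet_checksum(
    new_port.to_bytes(2, "big") + damaged
)
```

Both matches_recalculation and update_detects are True. The point of
the design is that the updated checksum is derived from the sender's
checksum, so it continues to describe the data as originally sent.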

best, 
Richard. 

[0] pf.c 1.863

On initial rule match: 
pf_test_rule()
  3445: pf_translate()
    3707: pf_change_ap()
      1677: PF_ACPY [= pf_addrcpy()] 
  3461: pf_cksum()
    6775: pd->hdr.tcp->th_sum = 0;
          m->m_pkthdr.csum_flags |= M_TCP_CSUM_OUT 
          (if orig checksum good) 

On subsequent state matching: 
pf_test_state() 
  ~4445: pf_change_ap() etc
  4471: pf_cksum() etc

[1] "Probably the strongest message of this study is that the 
networking hardware is often trashing the packets which are 
entrusted to it"

http://conferences.sigcomm.org/sigcomm/2000/conf/paper/sigcomm2000-9-1.pdf

Jonathan Stone and Craig Partridge. 2000. When the CRC and
TCP checksum disagree.  In Proceedings of the conference on
Applications, Technologies, Architectures, and Protocols for
Computer Communication (SIGCOMM '00). ACM, New York, NY,
USA, 309-319.  DOI=10.1145/347059.347561
http://doi.acm.org/10.1145/347059.347561



NAT reliability in light of recent checksum changes

2014-01-14 Thread Richard Procter
Hi all, 

I'm using OpenBSD 5.3 to provide an Alix-based home firewall. Thank
you all for the commitment to elegant, well-documented software which
isn't pernicious to the mental health of its users.

I've a question about the new checksum changes[0], being interested 
in such things and having listened to Henning's presentation and 
poked around in the archives a little. My understanding is that 
checksums are now always recalculated when a header is altered, 
never updated.[1]

Is that right and if so has this affected NAT reliability? 
Recalculation here would compromise reliable end-to-end transport 
as the payload checksum no longer covers the entire network path, 
and so break a basic transport layer design principle.[2][3]

best, 
Richard.

[0] http://www.openbsd.org/54.html "Reworked checksum handling for
network protocols."

[1] e.g.
   26:45 slide 27, 'use protocol checksum offloading better'
   http://quigon.bsws.de/papers/2013/EuroBSDcon/mgp00027.html 
   30:51 slide 30, 'consequences in pf'
   http://quigon.bsws.de/papers/2013/EuroBSDcon/mgp00030.html
   https://www.youtube.com/watch?v=AymV11igbLY 
   'The surprising complexity of checksums in TCP/IP'

[2] V. Cerf, R. Kahn, IEEE Trans on Comms, Vol Com-22, No 5, May 1974,
page 3; emphasis in original.

> The remainder of the packet consists of text for delivery to the
> destination and a trailing check sum used for end-to-end software
> verification. The GATEWAY does /not/ modify the text and merely
> forwards the check sum along without computing or recomputing it.

[3] Page 3. http://www.ietf.org/rfc/rfc793.txt

> The TCP must recover from data that is damaged, lost, duplicated, or
> delivered out of order by the internet communication system. [...]
> Damage is handled by adding a checksum to each segment transmitted,
> checking it at the receiver, and discarding damaged segments.
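
The receiver-side rule quoted above can be sketched in a few lines of
Python. This is a simplified model I have assumed for illustration,
not real TCP: the checksum trails the payload, there is no header or
pseudo-header, and the names make_segment and is_intact are
hypothetical. It relies on the property that a one's complement sum
over data plus its own checksum comes out as zero after complementing.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def make_segment(payload: bytes) -> bytes:
    """Sender: append the checksum to the (even-length padded) payload."""
    if len(payload) % 2:
        payload += b"\x00"
    return payload + internet_checksum(payload).to_bytes(2, "big")


def is_intact(segment: bytes) -> bool:
    """Receiver: the checksum over payload plus checksum field must be zero."""
    return internet_checksum(segment) == 0


good = make_segment(b"hello, world")
damaged = bytes([good[0] ^ 0x01]) + good[1:]   # hypothetical bit flip in transit

# Damaged segments are discarded; only intact ones are delivered.
delivered = [s for s in (good, damaged) if is_intact(s)]
```

Only the undamaged segment ends up in delivered. The check is only as
meaningful as the checksum it is given, which is the subject of the
question above.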