softraid crypto performance regression

2017-05-07 Thread m . reed

Synopsis:   softraid crypto performance regression
Category:   system
Environment:

System  : OpenBSD 6.1
	Details : OpenBSD 6.1-current (GENERIC.MP) #51: Sat May  6 12:01:40 MDT 2017
		 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64

Description:

The issue appeared after upgrading from the April 20th snapshot to
the May 6th one.  For context, my whole disk is encrypted, as
described in the FAQ (https://www.openbsd.org/faq/faq14.html#softraidFDE);
see below for disklabel information.

With the April 20th snapshot, disk performance was fine, but with
the May 6th snapshot everything is slow.  For example, LibreOffice
used to take ~5 seconds to open; now it takes ~30 seconds.



How-To-Repeat:
These instructions assume that you have the same disk setup as me; see 
below for my disklabel information.

1. download OpenBSD 6.1 miniroot.fs
2. dd it to a USB drive
3. boot it
4. when the OpenBSD installer prompt comes up, hit "s" for (S)hell
5. configure the existing crypto volume:
 # bioctl -c C -l /dev/sd0a softraid0
 (enter existing volume password)
 (crypto volume now attached as /dev/sd2*)
6. mount a partition in the crypto volume:
 # mount /dev/sd2k /mnt
 # cd /mnt
7. create a blob of random data:
 # dd if=/dev/random of=random_data bs=1m count=512
8. test disk performance:
 # for i in 1 2 3; do sync && time cp random_data test$i; done
9. record results
10. repeat from step 1, replacing the 6.1 miniroot.fs with the May 6th
    snapshot miniroot.fs
11. compare results

Here are my results:
  6.1:            28.89s,   36.39s,   27.63s
  May 6 snapshot: 2m12.01s, 2m16.31s, 2m30.47s

I know that many commits occurred between 6.1's release and May 6,
so, if needed, I can bisect for the problem commit. Besides that,
let me know if you need more info.
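
For a rough sense of scale, the timings reported above can be reduced to a
mean and a slowdown factor. The following is only a sketch: the helper
names (to_seconds, mean) are made up for illustration, the timing strings
are copied from the results above, and the 512 MiB figure comes from the
dd invocation in step 7. Since cp both reads and writes on the encrypted
volume, the MiB/s figure is a copy rate, not raw cipher throughput.

```python
import re

def to_seconds(t):
    """Convert a timing string such as '2m12.01s' or '28.89s' to seconds."""
    m = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", t)
    if m is None:
        raise ValueError(f"unrecognized time format: {t!r}")
    minutes = int(m.group(1) or 0)
    return minutes * 60 + float(m.group(2))

def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

# Timings copied from the report above.
release_61 = ["28.89s", "36.39s", "27.63s"]
snapshot_may6 = ["2m12.01s", "2m16.31s", "2m30.47s"]

FILE_MIB = 512  # size of random_data: dd bs=1m count=512

mean_61 = mean([to_seconds(t) for t in release_61])
mean_may6 = mean([to_seconds(t) for t in snapshot_may6])
slowdown = mean_may6 / mean_61

print(f"6.1:            {mean_61:7.2f}s mean, {FILE_MIB / mean_61:5.1f} MiB/s")
print(f"May 6 snapshot: {mean_may6:7.2f}s mean, {FILE_MIB / mean_may6:5.1f} MiB/s")
print(f"slowdown:       {slowdown:.1f}x")
```

On these numbers the May 6th snapshot is roughly 4.5 times slower than
6.1 for this copy test.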



Fix:

Not known.



# /dev/rsd0c:
type: SCSI
disk: SCSI disk
label: M4-CT032M4SSD3
duid: b2aa6a55bf5ee149
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 3892
total sectors: 62533296
boundstart: 64
boundend: 62524980
drivedata: 0

16 partitions:
#                size           offset  fstype [fsize bsize   cpg]
  a:         62524916               64    RAID
  c:         62533296                0  unused



# /dev/rsd1c:
type: SCSI
disk: SCSI disk
label: SR CRYPTO
duid: 3b07dc2689260fe1
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 3891
total sectors: 62524388
boundstart: 64
boundend: 62508915
drivedata: 0

16 partitions:
#                size           offset  fstype [fsize bsize   cpg]
  a:          2097152               64  4.2BSD   2048 16384 12958 # /
  b:          4548352          2097216    swap                    # none
  c:         62524388                0  unused
  d:          4072032          6645568  4.2BSD   2048 16384 12958 # /tmp
  e:          6381568         10717600  4.2BSD   2048 16384 12958 # /var
  f:          4194304         17099168  4.2BSD   2048 16384 12958 # /usr
  g:          2097152         21293472  4.2BSD   2048 16384 12958 # /usr/X11R6
  h:          8977152         23390624  4.2BSD   2048 16384 12958 # /usr/local
  i:          3053696         32367776  4.2BSD   2048 16384 12958 # /usr/src
  j:          4194304         35421472  4.2BSD   2048 16384 12958 # /usr/obj
  k:         22893056         39615776  4.2BSD   2048 16384 12958 # /home




dmesg:
OpenBSD 6.1-current (GENERIC.MP) #51: Sat May  6 12:01:40 MDT 2017
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2060320768 (1964MB)
avail mem = 1992077312 (1899MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xeb600 (47 entries)
bios0: vendor Intel Corp. version "GKPPT10H.86A.0058.2015.0630.1349" date 06/30/2015
bios0: Intel Corporation D33217GKE
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT MCFG HPET SSDT SSDT
acpi0: wakeup devices P0P1(S4) USB1(S3) USB2(S3) USB3(S3) USB4(S3) USB5(S3) USB6(S3) USB7(S3) PXSX(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i3-3217U CPU @ 1.80GHz, 1797.91 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,XSAVE,AVX,F16C,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: TSC frequency 1797909400 Hz
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running a

Re: softraid crypto performance regression

2017-05-07 Thread Mike Belopuhov
On 8 May 2017 at 01:04,  wrote:

> Synopsis:   softraid crypto performance regression
>
Hi,

You observe a decrease in performance because we've switched to
a constant-time, machine-independent AES implementation, which is
inherently slower than the T-table version.  Users with CPUs
supporting AES-NI are not affected by this, since the AES-NI
driver provides its own constant-time implementation.

Regards,
Mike
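
One way to tell which group a machine falls into is to look for the AES
flag in the cpu0 feature list printed in dmesg. The sketch below assumes
that the kernel reports AES-NI support as a flag named "AES" in that
comma-separated list; the function name has_aesni is hypothetical, and
the two strings are shortened excerpts of the cpu0 feature lists from the
dmesg listings on this page.

```python
def has_aesni(cpu_flags):
    """Return True if a comma-separated CPU feature list contains 'AES'.

    Assumption: AES-NI support shows up as a flag named exactly 'AES'.
    Matching whole flags avoids false hits on unrelated names.
    """
    flags = {f.strip() for f in cpu_flags.split(",")}
    return "AES" in flags

# Shortened excerpts of the cpu0 feature lists shown in the dmesg output.
i3_3217u = "SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,XSAVE,AVX,F16C,NXE"
xeon_e312xx = "SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,HV,NXE"

for name, flags in [("i3-3217U", i3_3217u), ("Xeon E312xx", xeon_e312xx)]:
    verdict = "AES flag present" if has_aesni(flags) else "no AES flag"
    print(f"{name}: {verdict}")
```

The i3-3217U from the report has no AES flag in its list, which is
consistent with it being affected by the slower software implementation.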


ipsec connections dying after 15 minutes with ios road warrior

2017-05-07 Thread Daniel Jakots
Hi,

I'm trying to set up an IPsec VPN for my iPhone. I tried both iked
and ipsec/isakmpd + npppd. Both work for 10 to 15 minutes, then the
connection dies. This mail is only about the iked problem (at least
for now).

The server runs 6.1-stable and isn't behind any NAT. The client runs
iOS 10.3.1, uses the built-in IPsec client, and is behind NAT.

The iked.conf file I use is:

ikev2 "ios10" passive esp from 0.0.0.0/0 to 192.168.222.0/24 \
 local egress peer any \
 ikesa enc aes-256 auth hmac-sha2-256 group modp2048 \
 childsa enc aes-256 auth hmac-sha2-256 group modp2048 \
 psk "whatever" config address 192.168.222.0/24 \
 config name-server 192.168.222.254 config access-server 192.168.222.1

I ran iked -dvvv (full log attached); the SA seems to close cleanly:
ikev2_msg_send: INFORMATIONAL response from 159.100.249.61:4500 to 198.48.213.186:58457 msgid 1, 80 bytes, NAT-T
sa_state: ESTABLISHED -> CLOSED from 198.48.213.186:58457 to 159.100.249.61:4500 policy 'ios10'

and I'm not sure why.

Any idea?

Cheers,
Daniel

OpenBSD 6.1 (GENERIC) #6: Sat May  6 09:33:26 CEST 2017
rob...@syspatch-61-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 519962624 (495MB)
avail mem = 499658752 (476MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf63a0 (9 entries)
bios0: vendor SeaBIOS version "Ubuntu-1.8.2-1ubuntu1~precise0+exo1" date 04/01/2014
bios0: QEMU Standard PC (i440FX + PIIX, 1996)
acpi0 at bios0: rev 0
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP SSDT APIC HPET
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel Xeon E312xx (Sandy Bridge), 2594.10 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,VMX,SSSE3,CX16,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,HV,NXE,RDTSCP,LONG,LAHF
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 1000MHz
ioapic0 at mainbus0: apid 0 pa 0xfec0, version 11, 24 pins
acpihpet0 at acpi0: 1 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: C1(@1 halt!)
"ACPI0006" at acpi0 not configured
"PNP0303" at acpi0 not configured
"PNP0F13" at acpi0 not configured
"PNP0700" at acpi0 not configured
"PNP0501" at acpi0 not configured
"PNP0A06" at acpi0 not configured
"PNP0A06" at acpi0 not configured
"PNP0A06" at acpi0 not configured
pvbus0 at mainbus0: KVM
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
pciide0: channel 0 disabled (no drives)
atapiscsi0 at pciide0 channel 1 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0:  ATAPI 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, DMA mode 2
uhci0 at pci0 dev 1 function 2 "Intel 82371SB USB" rev 0x01: apic 0 int 11
piixpm0 at pci0 dev 1 function 3 "Intel 82371AB Power" rev 0x03: apic 0 int 9
iic0 at piixpm0
vga1 at pci0 dev 2 function 0 "Cirrus Logic CL-GD5446" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
virtio0 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00
vio0 at virtio0: address 06:eb:fa:00:01:45
virtio0: msix shared
virtio1 at pci0 dev 4 function 0 "Qumranet Virtio Storage" rev 0x00
vioblk0 at virtio1
scsibus2 at vioblk0: 2 targets
sd0 at scsibus2 targ 0 lun 0:  SCSI3 0/direct fixed
sd0: 51200MB, 512 bytes/sector, 104857600 sectors
virtio1: msix shared
isa0 at pcib0
isadma0 at isa0
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 1: density unknown
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
usb0 at uhci0: USB revision 1.0
uhub0 at usb0 configuration 1 interface 0 "Intel UHCI root hub" rev 1.00/1.00 addr 1
vmm0 at mainbus0: VMX/EPT
uhidev0 at uhub0 port 1 configuration 1 interface 0 "QEMU QEMU USB Tablet" rev 2.00/0.00 addr 2
uhidev0: iclass 3/0
ums0 at uhidev0: 3 buttons, Z dir
wsmouse1 at ums0 mux 0
vscsi0 at root
scsibus3 at vscsi0: 256 targets
softraid0 at root
scsibus4 at softraid0: 256 targets
root on sd0a (bfd80eba6139c686.a) swap on sd0b dump on sd0b
ikev2 "ios10" passive esp inet from 0.0.0.0/0 to 192.168.222.0/24 local 159.100.249.61 peer any ikesa enc aes-256 prf hmac-sha2-256,hmac-sha1

Re: softraid crypto performance regression

2017-05-07 Thread m . reed

On 2017-05-07 19:30, Mike Belopuhov wrote:

> Hi,
>
> You observe a decrease in performance because we've switched to
> a constant-time, machine-independent AES implementation, which is
> inherently slower than the T-table version.  Users with CPUs
> supporting AES-NI are not affected by this, since the AES-NI
> driver provides its own constant-time implementation.
>
> Regards,
> Mike


Hi Mike,

Thanks for the info, and for your work on the AES implementation.
With that said, is there any chance that this issue could be solved
such that CPUs like mine (which lack AES-NI) won't become super slow?

I can always stop using softraid crypto or buy a new CPU, but I'd
like to avoid that :)


Michael Reed