Re: Should I be tuning relayd?

2013-02-27 Thread Peter Farmer
Yep, I'm going to set it up to 100k to be on the safe side.
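
A minimal sketch of that change, with 100k used purely as the example value mentioned above:

# /etc/pf.conf -- raise the maximum number of state-table entries
set limit states 100000

# reload the ruleset, then watch usage against the new limit
pfctl -f /etc/pf.conf
pfctl -s info      # "current entries" under the State Table section
pfctl -s memory    # prints the configured "states hard limit"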



On 27 February 2013 09:47, Janne Johansson  wrote:

> I would raise it far more, since you're at 60-something percent when
> you peak at 22k.
>
>
> 2013/2/26 Peter Farmer :
> > Thanks Vadim, with "set limit states 30000" I now see the state table balloon
> > up to nearly 22000 states at peak, and no more "state up -> down".
> >
> >
> > Peter
> >
> >
> > On 26 February 2013 17:41, Vadim Zhukov  wrote:
> >
> >> On 26.02.2013 20:06, "Peter Farmer" wrote:
> >>
> >> >
> >> > Hi All,
> >> >
> >> > Whilst load testing my website (being balanced via relayd) I see this
> >> from
> >> > time to time (when running "relayd -d"):
> >> >
> >> > relay www, session 2410 (1 active), 0, 195.143.230.243 ->
> 10.201.0.7:80,
> >> > done
> >> > relay www, session 3479 (1 active), 0, 195.143.230.242 ->
> 10.201.0.6:80,
> >> > done
> >> > relay www, session 2411 (1 active), 0, 195.143.230.243 ->
> 10.201.0.6:80,
> >> > done
> >> > relay www, session 3480 (1 active), 0, 195.143.230.242 ->
> 10.201.0.7:80,
> >> > done
> >> > host 10.201.0.6, check http code (0ms), state up -> down, availability
> >> > 92.31%
> >> > host 10.201.0.7, check http code (0ms), state up -> down, availability
> >> > 84.62%
> >> > relay www, session 2412 (1 active), 0, 195.143.230.242 -> :80, session
> >> > failed
> >> > relay www, session 2413 (1 active), 0, 195.143.230.243 -> :80, session
> >> > failed
> >> > relay www, session 2414 (1 active), 0, 195.143.230.242 -> :80, session
> >> > failed
> >> >
> >> > I also periodically see:
> >> >
> >> > relay www, session 1609 (1 active), 0, 195.143.230.243 ->
> 10.201.0.6:80,
> >> > session failed
> >> >
> >> > I know that the webservers are available because I also have tests
> >> > running against each of the webservers and can see they are available all
> >> > the time.
> >> >
> >> > Should I be adding something to relayd.conf or should I be tuning OpenBSD
> >> > in any way? There are typically between 6000 - 9000 states in the state
> >> > table during the test.
> >>
> >> And the default PF state limit is 10000. Too close to be safe. Try setting
> >> it in pf.conf to, e.g., 30000 first.
> >>
> >> > The ab command I am running is:
> >> >
> >> > ab -v -c100 -n100000 http://beta.digidayoff.com/
> >> >
> >> > My relayd conf is:
> >> >
> >> > ext_addr="10.201.0.3"
> >> > www1="10.201.0.6"
> >> > www2="10.201.0.7"
> >> >
> >> > log all
> >> >
> >> > table  { $www1 $www2 }
> >> > relay www {
> >> > listen on $ext_addr port http
> >> > forward to  port http mode roundrobin check http "/"
> code
> >> 200
> >> > }
> >> >
> >> >
> >> > My pf.conf is:
> >> >
> >> > set skip on lo
> >> > anchor "relayd/*"
> >> > pass quick on em1 proto pfsync keep state (no-sync)
> >> > pass on em1 proto carp keep state
> >> > pass  # to establish keep-state
> >> > # By default, do not permit remote connections to X11
> >> > block in on ! lo0 proto tcp to port 6000:6010
> >> >
> >> >
> >> > dmesg:
> >> >
> >> > OpenBSD 5.2 (GENERIC) #309: Wed Aug  1 09:58:55 MDT 2012
> >> > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> >> > real mem = 535756800 (510MB)
> >> > avail mem = 499208192 (476MB)
> >> > mainbus0 at root
> >> > bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xe0010 (268 entries)
> >> > bios0: vendor Phoenix Technologies LTD version "6.00" date 09/21/2011
> >> > bios0: VMware, Inc. VMware Virtual Platform
> >> > acpi0 at bios0: rev 2
> >> > acpi0: sleep states S0 S1 S4 S5
> >> > acpi0: tables DSDT FACP BOOT APIC MCFG SRAT HPET WAET
> >> > acpi0: wakeup devices PCI0(S3) USB_(S1) P2P0(S3) S1F0(S3) S2F0(S3)
> >> S3F0(S3)
> >> > S4

Re: Should I be tuning relayd?

2013-02-26 Thread Peter Farmer
Thanks Vadim, with "set limit states 30000" I now see the state table balloon
up to nearly 22000 states at peak, and no more "state up -> down".


Peter
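
As a side note, relayctl gives a quick view of whether relayd is still marking
backends down during a test run; for example:

relayctl show summary     # relays, tables and host up/down state
relayctl show hosts       # per-host availability and check counters
relayctl monitor          # stream host and session events as they happen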


On 26 February 2013 17:41, Vadim Zhukov  wrote:

> On 26.02.2013 20:06, "Peter Farmer" wrote:
>
> >
> > Hi All,
> >
> > Whilst load testing my website (being balanced via relayd) I see this
> from
> > time to time (when running "relayd -d"):
> >
> > relay www, session 2410 (1 active), 0, 195.143.230.243 -> 10.201.0.7:80,
> > done
> > relay www, session 3479 (1 active), 0, 195.143.230.242 -> 10.201.0.6:80,
> > done
> > relay www, session 2411 (1 active), 0, 195.143.230.243 -> 10.201.0.6:80,
> > done
> > relay www, session 3480 (1 active), 0, 195.143.230.242 -> 10.201.0.7:80,
> > done
> > host 10.201.0.6, check http code (0ms), state up -> down, availability
> > 92.31%
> > host 10.201.0.7, check http code (0ms), state up -> down, availability
> > 84.62%
> > relay www, session 2412 (1 active), 0, 195.143.230.242 -> :80, session
> > failed
> > relay www, session 2413 (1 active), 0, 195.143.230.243 -> :80, session
> > failed
> > relay www, session 2414 (1 active), 0, 195.143.230.242 -> :80, session
> > failed
> >
> > I also periodically see:
> >
> > relay www, session 1609 (1 active), 0, 195.143.230.243 -> 10.201.0.6:80,
> > session failed
> >
> > I know that the webservers are available because I also have tests
> > running against each of the webservers and can see they are available all
> > the time.
> >
> > Should I be adding something to relayd.conf or should I be tuning OpenBSD
> > in any way? There are typically between 6000 - 9000 states in the state
> > table during the test.
>
> And the default PF state limit is 10000. Too close to be safe. Try setting
> it in pf.conf to, e.g., 30000 first.
>
> > The ab command I am running is:
> >
> > ab -v -c100 -n100000 http://beta.digidayoff.com/
> >
> > My relayd conf is:
> >
> > ext_addr="10.201.0.3"
> > www1="10.201.0.6"
> > www2="10.201.0.7"
> >
> > log all
> >
> > table  { $www1 $www2 }
> > relay www {
> > listen on $ext_addr port http
> > forward to  port http mode roundrobin check http "/" code
> 200
> > }
> >
> >
> > My pf.conf is:
> >
> > set skip on lo
> > anchor "relayd/*"
> > pass quick on em1 proto pfsync keep state (no-sync)
> > pass on em1 proto carp keep state
> > pass  # to establish keep-state
> > # By default, do not permit remote connections to X11
> > block in on ! lo0 proto tcp to port 6000:6010
> >
> >
> > dmesg:
> >
> > OpenBSD 5.2 (GENERIC) #309: Wed Aug  1 09:58:55 MDT 2012
> > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> > real mem = 535756800 (510MB)
> > avail mem = 499208192 (476MB)
> > mainbus0 at root
> > bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xe0010 (268 entries)
> > bios0: vendor Phoenix Technologies LTD version "6.00" date 09/21/2011
> > bios0: VMware, Inc. VMware Virtual Platform
> > acpi0 at bios0: rev 2
> > acpi0: sleep states S0 S1 S4 S5
> > acpi0: tables DSDT FACP BOOT APIC MCFG SRAT HPET WAET
> > acpi0: wakeup devices PCI0(S3) USB_(S1) P2P0(S3) S1F0(S3) S2F0(S3)
> S3F0(S3)
> > S4F0(S3) S5F0(S3) S6F0(S3) S7F0(S3) S8F0(S3) S9F0(S3) Z00S(S3) Z00T(S3)
> > Z00U(S3) Z00V(S3) Z00W(S3) Z00X(S3) Z00Y(S3) Z00Z(S3) Z010(S3) Z011(S3)
> > Z012(S3) Z013(S3) Z014(S3) Z015(S3) Z016(S3) Z017(S3) Z018(S3) Z019(S3)
> > Z01A(S3) Z01B(S3) Z01C(S3) Z01D(S3) Z01E(S3) P2P1(S3) S1F0(S3) S2F0(S3)
> > S3F0(S3) S4F0(S3) S5F0(S3) S6F0(S3) S7F0(S3) S8F0(S3) S9F0(S3) Z00S(S3)
> > Z00T(S3) Z00U(S3) Z00V(S3) Z00W(S3) Z00X(S3) Z00Y(S3) Z00Z(S3) Z010(S3)
> > Z011(S3) Z012(S3) Z013(S3) Z014(S3) Z015(S3) Z016(S3) Z017(S3) Z018(S3)
> > Z019(S3) Z01A(S3) Z01B(S3) Z01C(S3) Z01D(S3) Z01E(S3) P2P2(S3) S1F0(S3)
> > S2F0(S3) S3F0(S3) S4F0(S3) S5F0(S3) S6F0(S3) S7F0(S3) S8F0(S3) S9F0(S3)
> > Z00S(S3) Z00T(S3) Z00U(S3) Z00V(S3) Z00W(S3) Z00X(S3) Z00Y(S3) Z00Z(S3)
> > Z010(S3) Z011(S3) Z012(S3) Z013(S3) Z014(S3) Z015(S3) Z016(S3) Z017(S3)
> > Z018(S3) Z019(S3) Z01A(S3) Z01B(S3) Z01C(S3) Z01D(S3) Z01E(S3) P2P3(S3)
> > S1F0(S3) S2F0(S3) S3F0(S3) S4F0(S3) S5F0(S3) S6F0(S3) S7F0(S3) S8F0(S3)
> > S9F0(S3) Z00S(S3) Z00T(S3) Z00U(S3) Z00V(S3) Z00W(S3) Z00X(S3) Z00Y(S3)
> > Z00Z(S3) Z010(S3) Z011(S3) Z012(S3) Z013(S3) Z014(S3) Z015(S3) Z016(S3)
> > Z017(S

Re: Should I be tuning relayd?

2013-02-26 Thread Peter Farmer
OK, I'll try with 5.3-beta and report back.

Thanks,

Peter


On 26 February 2013 16:10, Theo de Raadt  wrote:

> Note that there were some big bug fixes in relayd in the last year,
> and some of them landed after 5.2



Re: Kernel Panic on 5.2 running on KVM

2013-02-23 Thread Peter Farmer
On Friday, February 22, 2013, Stuart Henderson wrote:

> On 2013-02-22, Peter Farmer  wrote:
> > Unfortunately now getting "em0: watchdog timeout -- resetting" on my VMs
> > (on 5.3-beta) , which also locks the terminal for me, so can't bring the
> > network up :(
>
> with -current you might want to try switching the network interface type
> to virtio, using the vio(4) driver
>
>
Unfortunately I'm restricted to the e1000 NIC by my provider.



Re: Kernel Panic on 5.2 running on KVM

2013-02-22 Thread Peter Farmer
That's a little tricky from a VNC console, so this is the best I can do:

http://habanero.projectchilli.com/~pfarmer/screens/
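
An alternative, where the hosting setup allows it, is to put the guest's console
on the emulated serial port and capture it on the KVM host, so the whole ddb
trace lands in a file instead of a VNC screenshot. A rough sketch (the qemu flag
and log path are illustrative; libvirt setups expose the same thing through the
domain XML):

# in the guest: /etc/boot.conf -- send the console (and ddb output) to com0
stty com0 115200
set tty com0

# on the host: point the guest's serial port at a file, e.g. with plain qemu
qemu-system-x86_64 ... -serial file:/var/log/obsd-console.log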



On 22 February 2013 17:26, Chris Cappuccio  wrote:

> dmesg?
>
> Peter Farmer [pfarmer...@gmail.com] wrote:
> > Unfortunately now getting "em0: watchdog timeout -- resetting" on my VMs
> > (on 5.3-beta) , which also locks the terminal for me, so can't bring the
> > network up :(
> >
> >
> > On 22 February 2013 15:49, Peter Farmer  wrote:
> >
> > > Building a 5.3-beta template now, will let you know.
> > >
> > >
> > > On 22 February 2013 15:26, Chris Cappuccio  wrote:
> > >
> > >> before you go much further, try openbsd 5.3-beta first
> > >>
> > >> ftp://ftp.openbsd.org/pub/OpenBSD/snapshots/amd64/
> > >>
> > >> Peter Farmer [pfarmer...@gmail.com] wrote:
> > >> > Hi,
> > >> >
> > >> > I have a pair of OpenBSD 5.2 VMs running on KVM, they have a carp
> > >> interface
> > >> > and are running relayd to load balance http traffic into two
> webservers
> > >> > (also VMs). While benchmarking the setup with ab, I noticed that the
> > >> > OpenBSD VMs panic'd, I can easily reproduce the panics. Here is a
> > >> typical
> > >> > stack trace:
> > >> >
> > >> > uvm_fault(0xfe807d0c62a8, 0x0, 0, 1) -> e
> > >> > kernel: page fault trap, code=0
> > >> > Stopped at  somove+0x22:movq0x78(%rdi),%r14
> > >> > ddb> somove() at somove+0x22
> > >> > sowwakeup() at sowwakeup+0x26
> > >> > tcp_input() at tcp_input+0x2a37
> > >> > ipv4_input() at ipv4_input+0x584
> > >> > ipintr() at ipintr+0x7f
> > >> > netintr() at netintr+0xd5
> > >> > softintr_dispatch() at softintr_dispatch+0x5d
> > >> > Xsoftnet() at Xsoftnet+0x28
> > >> > --- interrupt ---
> > >> > (null)() at 0x800021454e30
> > >> > end of kernel
> > >> > end trace frame: 0x4043c748, count: -9
> > >> > ddb>PID   PPID   PGRPUID  S   FLAGS  WAIT
>  COMMAND
> > >> >
> > >> >  13819  1  13819  0  30x80  selectsendmail
> > >> >  15713  1  15713  0  30x80  ttyin getty
> > >> >   3077  1   3077  0  30x80  ttyin getty
> > >> >   1982  1   1982  0  30x80  ttyin getty
> > >> >  12235  1  12235  0  30x80  ttyin getty
> > >> >  17057  1  17057  0  30x80  ttyin getty
> > >> >  23271  1  23271  0  30x80  selectcron
> > >> >   4619  1   4619  0  30x80  selectruby18
> > >> >  13722  1  13722 99  30x80  poll  sndiod
> > >> >  22844  18069  18069 89  30x80  kqreadrelayd
> > >> >  19323  18069  18069 89  30x80  kqreadrelayd
> > >> >   1643  18069  18069 89  30x80  kqreadrelayd
> > >> > *26499  18069  18069 89  7   0relayd
> > >> >  18069   9864  18069 89  30x80  kqreadrelayd
> > >> >  10272   9864  10272 89  30x80  kqreadrelayd
> > >> >  13354   9864  13354 89  30x80  kqreadrelayd
> > >> >   9864  1   9864  0  30x80  kqreadrelayd
> > >> >  22085  1  22085  0  30x80  selectsshd
> > >> >  18165  18463  19253 83  30x80  poll  ntpd
> > >> >  18463  19253  19253 83  30x80  poll  ntpd
> > >> >  19253  1  19253  0  30x80  poll  ntpd
> > >> >  26963  18156  18156 74  30x80  bpf   pflogd
> > >> >  18156  1  18156  0  30x80  netio pflogd
> > >> >  30594  10090  10090 73  20x80syslogd
> > >> >  10090  1  10090  0  30x80  netio syslogd
> > >> >   3510  1   3510 77  30x80  poll  dhclient
> > >> >  20348  1  22482  0  30x80  poll  dhclient
> > >> >  25124  1  25124 77  30x80  poll  dhclient
> > >

Re: Kernel Panic on 5.2 running on KVM

2013-02-22 Thread Peter Farmer
Unfortunately now getting "em0: watchdog timeout -- resetting" on my VMs
(on 5.3-beta), which also locks the terminal for me, so I can't bring the
network up :(


On 22 February 2013 15:49, Peter Farmer  wrote:

> Building a 5.3-beta template now, will let you know.
>
>
> On 22 February 2013 15:26, Chris Cappuccio  wrote:
>
>> before you go much further, try openbsd 5.3-beta first
>>
>> ftp://ftp.openbsd.org/pub/OpenBSD/snapshots/amd64/
>>
>> Peter Farmer [pfarmer...@gmail.com] wrote:
>> > Hi,
>> >
>> > I have a pair of OpenBSD 5.2 VMs running on KVM, they have a carp
>> interface
>> > and are running relayd to load balance http traffic into two webservers
>> > (also VMs). While benchmarking the setup with ab, I noticed that the
>> > OpenBSD VMs panic'd, I can easily reproduce the panics. Here is a
>> typical
>> > stack trace:
>> >
>> > uvm_fault(0xfe807d0c62a8, 0x0, 0, 1) -> e
>> > kernel: page fault trap, code=0
>> > Stopped at  somove+0x22:movq0x78(%rdi),%r14
>> > ddb> somove() at somove+0x22
>> > sowwakeup() at sowwakeup+0x26
>> > tcp_input() at tcp_input+0x2a37
>> > ipv4_input() at ipv4_input+0x584
>> > ipintr() at ipintr+0x7f
>> > netintr() at netintr+0xd5
>> > softintr_dispatch() at softintr_dispatch+0x5d
>> > Xsoftnet() at Xsoftnet+0x28
>> > --- interrupt ---
>> > (null)() at 0x800021454e30
>> > end of kernel
>> > end trace frame: 0x4043c748, count: -9
>> > ddb>PID   PPID   PGRPUID  S   FLAGS  WAIT  COMMAND
>> >
>> >  13819  1  13819  0  30x80  selectsendmail
>> >  15713  1  15713  0  30x80  ttyin getty
>> >   3077  1   3077  0  30x80  ttyin getty
>> >   1982  1   1982  0  30x80  ttyin getty
>> >  12235  1  12235  0  30x80  ttyin getty
>> >  17057  1  17057  0  30x80  ttyin getty
>> >  23271  1  23271  0  30x80  selectcron
>> >   4619  1   4619  0  30x80  selectruby18
>> >  13722  1  13722 99  30x80  poll  sndiod
>> >  22844  18069  18069 89  30x80  kqreadrelayd
>> >  19323  18069  18069 89  30x80  kqreadrelayd
>> >   1643  18069  18069 89  30x80  kqreadrelayd
>> > *26499  18069  18069 89  7   0relayd
>> >  18069   9864  18069 89  30x80  kqreadrelayd
>> >  10272   9864  10272 89  30x80  kqreadrelayd
>> >  13354   9864  13354 89  30x80  kqreadrelayd
>> >   9864  1   9864  0  30x80  kqreadrelayd
>> >  22085  1  22085  0  30x80  selectsshd
>> >  18165  18463  19253 83  30x80  poll  ntpd
>> >  18463  19253  19253 83  30x80  poll  ntpd
>> >  19253  1  19253  0  30x80  poll  ntpd
>> >  26963  18156  18156 74  30x80  bpf   pflogd
>> >  18156  1  18156  0  30x80  netio pflogd
>> >  30594  10090  10090 73  20x80syslogd
>> >  10090  1  10090  0  30x80  netio syslogd
>> >   3510  1   3510 77  30x80  poll  dhclient
>> >  20348  1  22482  0  30x80  poll  dhclient
>> >  25124  1  25124 77  30x80  poll  dhclient
>> >  12672  1  22482  0  30x80  poll  dhclient
>> > 13  0  0  0  30x100200  aiodoned  aiodoned
>> > 12  0  0  0  30x100200  syncerupdate
>> > 11  0  0  0  30x100200  cleaner   cleaner
>> > 10  0  0  0  30x100200  reaperreaper
>> >  9  0  0  0  30x100200  pgdaemon  pagedaemon
>> >  8  0  0  0  30x100200  bored crypto
>> >  7  0  0  0  30x100200  pftm  pfpurge
>> >  6  0  0  0  30x100200  usbtskusbtask
>> >  5  0  0  0  30x100200  usbatsk   usbatsk
>> >  4  0  0  0  30x100200  acpi0 acpi0
>> >  3  0  0  0  30x100200  bored syswq
>> >  2  0  0

Re: Kernel Panic on 5.2 running on KVM

2013-02-22 Thread Peter Farmer
Building a 5.3-beta template now, will let you know.


On 22 February 2013 15:26, Chris Cappuccio  wrote:

> before you go much further, try openbsd 5.3-beta first
>
> ftp://ftp.openbsd.org/pub/OpenBSD/snapshots/amd64/
>
> Peter Farmer [pfarmer...@gmail.com] wrote:
> > Hi,
> >
> > I have a pair of OpenBSD 5.2 VMs running on KVM, they have a carp
> interface
> > > and are running relayd to load balance http traffic into two webservers
> > (also VMs). While benchmarking the setup with ab, I noticed that the
> > OpenBSD VMs panic'd, I can easily reproduce the panics. Here is a typical
> > stack trace:
> >
> > uvm_fault(0xfe807d0c62a8, 0x0, 0, 1) -> e
> > kernel: page fault trap, code=0
> > Stopped at  somove+0x22:movq0x78(%rdi),%r14
> > ddb> somove() at somove+0x22
> > sowwakeup() at sowwakeup+0x26
> > tcp_input() at tcp_input+0x2a37
> > ipv4_input() at ipv4_input+0x584
> > ipintr() at ipintr+0x7f
> > netintr() at netintr+0xd5
> > softintr_dispatch() at softintr_dispatch+0x5d
> > Xsoftnet() at Xsoftnet+0x28
> > --- interrupt ---
> > (null)() at 0x800021454e30
> > end of kernel
> > end trace frame: 0x4043c748, count: -9
> > ddb>PID   PPID   PGRPUID  S   FLAGS  WAIT  COMMAND
> >
> >  13819  1  13819  0  30x80  selectsendmail
> >  15713  1  15713  0  30x80  ttyin getty
> >   3077  1   3077  0  30x80  ttyin getty
> >   1982  1   1982  0  30x80  ttyin getty
> >  12235  1  12235  0  30x80  ttyin getty
> >  17057  1  17057  0  30x80  ttyin getty
> >  23271  1  23271  0  30x80  selectcron
> >   4619  1   4619  0  30x80  selectruby18
> >  13722  1  13722 99  30x80  poll  sndiod
> >  22844  18069  18069 89  30x80  kqreadrelayd
> >  19323  18069  18069 89  30x80  kqreadrelayd
> >   1643  18069  18069 89  30x80  kqreadrelayd
> > *26499  18069  18069 89  7   0relayd
> >  18069   9864  18069 89  30x80  kqreadrelayd
> >  10272   9864  10272 89  30x80  kqreadrelayd
> >  13354   9864  13354 89  30x80  kqreadrelayd
> >   9864  1   9864  0  30x80  kqreadrelayd
> >  22085  1  22085  0  30x80  selectsshd
> >  18165  18463  19253 83  30x80  poll  ntpd
> >  18463  19253  19253 83  30x80  poll  ntpd
> >  19253  1  19253  0  30x80  poll  ntpd
> >  26963  18156  18156 74  30x80  bpf   pflogd
> >  18156  1  18156  0  30x80  netio pflogd
> >  30594  10090  10090 73  20x80syslogd
> >  10090  1  10090  0  30x80  netio syslogd
> >   3510  1   3510 77  30x80  poll  dhclient
> >  20348  1  22482  0  30x80  poll  dhclient
> >  25124  1  25124 77  30x80  poll  dhclient
> >  12672  1  22482  0  30x80  poll  dhclient
> > 13  0  0  0  30x100200  aiodoned  aiodoned
> > 12  0  0  0  30x100200  syncerupdate
> > 11  0  0  0  30x100200  cleaner   cleaner
> > 10  0  0  0  30x100200  reaperreaper
> >  9  0  0  0  30x100200  pgdaemon  pagedaemon
> >  8  0  0  0  30x100200  bored crypto
> >  7  0  0  0  30x100200  pftm  pfpurge
> >  6  0  0  0  30x100200  usbtskusbtask
> >  5  0  0  0  30x100200  usbatsk   usbatsk
> >  4  0  0  0  30x100200  acpi0 acpi0
> >  3  0  0  0  30x100200  bored syswq
> >  2  0  0  0  3  0x40100200idle0
> >  1  0  1  0  30x80  wait  init
> >  0 -1  0  0  3   0x200  scheduler swapper
> > ddb> rebooting...
> >
> >
> > dmesg from same machine:
> >
> > OpenBSD 5.2 (GENERIC) #309: Wed Aug  1 09:58:55 MDT 2012
> > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> > real mem = 2146369536 (2046MB)
> > avail mem = 2066952192 (1971MB)
> > mainbus0 at root

Kernel Panic on 5.2 running on KVM

2013-02-22 Thread Peter Farmer
Hi,

I have a pair of OpenBSD 5.2 VMs running on KVM; they have a carp interface
and are running relayd to load balance http traffic into two webservers
(also VMs). While benchmarking the setup with ab, I noticed that the
OpenBSD VMs panicked, and I can easily reproduce the panics. Here is a typical
stack trace:

uvm_fault(0xfe807d0c62a8, 0x0, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at  somove+0x22:    movq    0x78(%rdi),%r14
ddb> somove() at somove+0x22
sowwakeup() at sowwakeup+0x26
tcp_input() at tcp_input+0x2a37
ipv4_input() at ipv4_input+0x584
ipintr() at ipintr+0x7f
netintr() at netintr+0xd5
softintr_dispatch() at softintr_dispatch+0x5d
Xsoftnet() at Xsoftnet+0x28
--- interrupt ---
(null)() at 0x800021454e30
end of kernel
end trace frame: 0x4043c748, count: -9
ddb>
   PID   PPID   PGRP  UID  S       FLAGS  WAIT      COMMAND
 13819      1  13819    0  3        0x80  select    sendmail
 15713      1  15713    0  3        0x80  ttyin     getty
  3077      1   3077    0  3        0x80  ttyin     getty
  1982      1   1982    0  3        0x80  ttyin     getty
 12235      1  12235    0  3        0x80  ttyin     getty
 17057      1  17057    0  3        0x80  ttyin     getty
 23271      1  23271    0  3        0x80  select    cron
  4619      1   4619    0  3        0x80  select    ruby18
 13722      1  13722   99  3        0x80  poll      sndiod
 22844  18069  18069   89  3        0x80  kqread    relayd
 19323  18069  18069   89  3        0x80  kqread    relayd
  1643  18069  18069   89  3        0x80  kqread    relayd
*26499  18069  18069   89  7           0            relayd
 18069   9864  18069   89  3        0x80  kqread    relayd
 10272   9864  10272   89  3        0x80  kqread    relayd
 13354   9864  13354   89  3        0x80  kqread    relayd
  9864      1   9864    0  3        0x80  kqread    relayd
 22085      1  22085    0  3        0x80  select    sshd
 18165  18463  19253   83  3        0x80  poll      ntpd
 18463  19253  19253   83  3        0x80  poll      ntpd
 19253      1  19253    0  3        0x80  poll      ntpd
 26963  18156  18156   74  3        0x80  bpf       pflogd
 18156      1  18156    0  3        0x80  netio     pflogd
 30594  10090  10090   73  2        0x80            syslogd
 10090      1  10090    0  3        0x80  netio     syslogd
  3510      1   3510   77  3        0x80  poll      dhclient
 20348      1  22482    0  3        0x80  poll      dhclient
 25124      1  25124   77  3        0x80  poll      dhclient
 12672      1  22482    0  3        0x80  poll      dhclient
    13      0      0    0  3    0x100200  aiodoned  aiodoned
    12      0      0    0  3    0x100200  syncer    update
    11      0      0    0  3    0x100200  cleaner   cleaner
    10      0      0    0  3    0x100200  reaper    reaper
     9      0      0    0  3    0x100200  pgdaemon  pagedaemon
     8      0      0    0  3    0x100200  bored     crypto
     7      0      0    0  3    0x100200  pftm      pfpurge
     6      0      0    0  3    0x100200  usbtsk    usbtask
     5      0      0    0  3    0x100200  usbatsk   usbatsk
     4      0      0    0  3    0x100200  acpi0     acpi0
     3      0      0    0  3    0x100200  bored     syswq
     2      0      0    0  3  0x40100200            idle0
     1      0      1    0  3        0x80  wait      init
     0     -1      0    0  3       0x200  scheduler swapper
ddb> rebooting...


dmesg from same machine:

OpenBSD 5.2 (GENERIC) #309: Wed Aug  1 09:58:55 MDT 2012
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 2146369536 (2046MB)
avail mem = 2066952192 (1971MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xfbc4f (10 entries)
bios0: vendor QEMU version "QEMU" date 01/01/2007
acpi0 at bios0: rev 0
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP SSDT APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0
mpbios at bios0 not configured
vmt0 at mainbus0
vmware: open failed, eax=564d5868, ecx=001e, edx=5658
vmt0: failed to open backdoor RPC channel (TCLO protocol)
cpu0 at mainbus0: (uniprocessor)
cpu0: QEMU Virtual CPU version 0.10.50, 2200.26 MHz
cpu0:
FPU,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,NXE,LONG
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB
64b/line 16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
pcib0 at pci0 dev 1 function 0 "Intel 8237

ALTQ and VLAN interfaces

2012-04-04 Thread Peter Farmer
Hi All,

I have the following OpenBSD multi-tenant firewall setup:

   |
+-+---+++---+---+
| |   vlan10  |||vlan11 |   |
| | 195.188.200.a |--(em0)--| 195.188.201.a |   |
| | 195.188.200.b | | 195.188.201.b |   |
| |   rdomain 1   | |   rdomain 2   |   |
| +---+ +---+   |
|   |
| +---+ +---+   |
| |vlan160| |vlan161|   |
| |  10.1.160.1   |--(em1)--|  10.1.160.1   |   |
| |  rdomain 160  |||  rdomain 161  |   |
+-+---+++---+---+
   |

vlan10 and vlan11 represent the PUBLIC side of the firewall, and each
vlan has a separate rdomain. A customer could be assigned IP addresses
from both vlan10 and vlan11. Traffic from vlans 160 and 161 is then
NATted out of vlan10 and vlan11 using pf rules (and vice versa, with
some tagging). vlan160 and vlan161 represent the customer side of the
firewall; IP addresses on this side can only be RFC 1918, but can be
the same subnets in each vlan (hence the separate rdomains). What I'd like
to be able to do is queue traffic as it leaves the firewall, both
north and south, but I'm unsure where to enable altq. Should I
do:

# "out" being out of em0
altq on em0 cbq bandwidth 300Mb queue { INT_em0, queue1_out, queue2_out }
queue INT_em0 bandwidth 100Mb cbq(default)
queue queue1_out bandwidth 100Mb cbq(ecn)
queue queue2_out bandwidth 100Mb cbq(ecn)

# Using pass in to keep state for packets coming back out of vlan10
pass in on vlan10 from any to 195.188.200.a queue queue1_out
pass in on vlan10 from any to 195.188.200.b queue queue2_out

# "in" being out of em1
altq on em1 cbq bandwidth 300Mb queue { INT_em1, queue1_in, queue2_in }
queue INT_em1 bandwidth 100Mb cbq(default)
queue queue1_in bandwidth 100Mb cbq(ecn)
queue queue2_in bandwidth 100Mb cbq(ecn)

# Using pass in to keep state for packets coming back out of vlan160 or vlan161
pass in on vlan160 from any to any queue queue1_in
pass in on vlan161 from any to any queue queue2_in



or should I do:

altq on vlan10 cbq bandwidth 300Mb queue { INT_vlan10, queue1_out, queue2_out }
queue INT_vlan10 bandwidth 100Mb cbq(default)
queue queue1_out bandwidth 100Mb cbq(ecn)
queue queue2_out bandwidth 100Mb cbq(ecn)

# Using pass in to keep state for packets coming back out of vlan10
pass in on vlan10 from any to 195.188.200.a queue queue1_out
pass in on vlan10 from any to 195.188.200.b queue queue2_out

# "in" being out of vlan160
altq on vlan160 cbq bandwidth 100Mb queue { INT_vlan160 }
queue INT_vlan160 bandwidth 100Mb cbq(default)

# Using pass in to keep state for packets coming back out of vlan160 or vlan161
pass in on vlan160 from any to any queue queue1_in
pass in on vlan161 from any to any queue queue2_in


With altq statements for each vlan interface.

Ideally I'd want to do altq on the vlan parent interface.


Thanks,

Peter
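
For illustration only, here is roughly how the first variant could be combined
with the tagging that is already in place, so the per-customer queue is picked
by tag rather than by address. The queue, tag and interface names are made up,
and this does not settle the underlying question of whether a queue attached to
the physical parent is honoured from a rule on the vlan interface:

# ALTQ attached to the physical parent, as in the first variant above
altq on em0 cbq bandwidth 300Mb queue { INT_em0, queue1_out, queue2_out }
queue INT_em0 bandwidth 100Mb cbq(default)
queue queue1_out bandwidth 100Mb cbq(ecn)
queue queue2_out bandwidth 100Mb cbq(ecn)

# tag customer 1's traffic as it enters on its customer-side vlan ...
pass in on vlan160 from any to any tag CUST1

# ... and assign the queue on the public-side vlan it leaves through
pass out on vlan10 tagged CUST1 queue queue1_out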