Re: PF and States

2011-01-24 Thread dabheeruz

Hi Stuart,

Thanks a bunch for your suggestions.  This email got lost in my inbox.  
Will let you know if I have any questions.  Appreciate your help :)


Thx





Re: PF and States

2011-01-11 Thread Stuart Henderson
On 2010-12-03, Godesi  wrote:
> relay web {

Try applying this diff from -current and rebuilding relayd.
It is an inline diff; if your mail client has problems giving
you valid plaintext, try pasting it from a web-based
mailing list archive instead.

I think the diff will probably apply fairly cleanly as I don't
think there have been big changes in relayd since 4.7, but I am not
certain. If you don't know how or have problems patching/building,
hopefully someone else will have time to explain things, or you
could try a -current snapshot which includes this already.

Also check that the following limits are sufficiently high for
the number of TCP connections:

login.conf, "daemon" class, openfiles-cur
sysctl kern.maxfiles
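
A rough way to sanity-check the open-file limit against an expected connection count is sketched below. It is only an illustration: it reads the limit of the process that runs it (not relayd's, which comes from the "daemon" class in login.conf), and the figure of two descriptors per relayed connection (client side plus backend side) and the `reserve` value are assumptions, not numbers taken from relayd.

```python
import resource


def fd_headroom(expected_connections, fds_per_connection=2, reserve=64):
    """Compare this process's soft open-file limit with a rough requirement.

    fds_per_connection and reserve are illustrative assumptions.
    Returns (soft_limit, needed, ok).
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = expected_connections * fds_per_connection + reserve
    ok = soft == resource.RLIM_INFINITY or soft >= needed
    return soft, needed, ok


if __name__ == "__main__":
    soft, needed, ok = fd_headroom(10000)
    print("soft limit:", soft, "needed:", needed, "enough:", ok)
```

If `ok` comes back false for the connection count you expect, the limits named above are the first thing to raise.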

-
PatchSet 489 
Date: 2010/12/20 12:38:06
Author: dhill
Branch: HEAD
Tag: (none) 
Log:
Only set SO_REUSEPORT for listening ports.

Fixes "Address already in use" errors seen on high load.

OK reyk@ pyr@

Members: 
check_tcp.c:1.38->1.39 
relay.c:1.127->1.128 

Index: src/usr.sbin/relayd/check_tcp.c
diff -u src/usr.sbin/relayd/check_tcp.c:1.38 src/usr.sbin/relayd/check_tcp.c:1.39
--- src/usr.sbin/relayd/check_tcp.c:1.38	Tue Nov 30 14:38:45 2010
+++ src/usr.sbin/relayd/check_tcp.c	Mon Dec 20 12:38:06 2010
@@ -50,7 +50,6 @@
 check_tcp(struct ctl_tcp_event *cte)
 {
 	int			 s;
-	int			 type;
 	socklen_t		 len;
 	struct timeval		 tv;
 	struct linger		 lng;
@@ -79,10 +78,6 @@
 
 	bzero(&lng, sizeof(lng));
 	if (setsockopt(s, SOL_SOCKET, SO_LINGER, &lng, sizeof(lng)) == -1)
-		goto bad;
-
-	type = 1;
-	if (setsockopt(s, SOL_SOCKET, SO_REUSEPORT, &type, sizeof(type)) == -1)
 		goto bad;
 
 	if (cte->host->conf.ttl > 0) {
Index: src/usr.sbin/relayd/relay.c
diff -u src/usr.sbin/relayd/relay.c:1.127 src/usr.sbin/relayd/relay.c:1.128
--- src/usr.sbin/relayd/relay.c:1.127	Tue Nov 30 14:49:14 2010
+++ src/usr.sbin/relayd/relay.c	Mon Dec 20 12:38:06 2010
@@ -59,7 +59,7 @@
 void		 relay_init(void);
 void		 relay_launch(void);
 int		 relay_socket(struct sockaddr_storage *, in_port_t,
-		    struct protocol *, int);
+		    struct protocol *, int, int);
 int		 relay_socket_listen(struct sockaddr_storage *, in_port_t,
 		    struct protocol *);
 int		 relay_socket_connect(struct sockaddr_storage *, in_port_t,
@@ -622,7 +622,7 @@
 
 int
 relay_socket(struct sockaddr_storage *ss, in_port_t port,
-    struct protocol *proto, int fd)
+    struct protocol *proto, int fd, int reuseport)
 {
 	int s = -1, val;
 	struct linger lng;
@@ -640,9 +640,12 @@
 	bzero(&lng, sizeof(lng));
 	if (setsockopt(s, SOL_SOCKET, SO_LINGER, &lng, sizeof(lng)) == -1)
 		goto bad;
-	val = 1;
-	if (setsockopt(s, SOL_SOCKET, SO_REUSEPORT, &val, sizeof(int)) == -1)
-		goto bad;
+	if (reuseport) {
+		val = 1;
+		if (setsockopt(s, SOL_SOCKET, SO_REUSEPORT, &val,
+		    sizeof(int)) == -1)
+			goto bad;
+	}
 	if (fcntl(s, F_SETFL, O_NONBLOCK) == -1)
 		goto bad;
 	if (proto->tcpflags & TCPFLAG_BUFSIZ) {
@@ -708,7 +711,7 @@
 {
 	int s;
 
-	if ((s = relay_socket(ss, port, proto, fd)) == -1)
+	if ((s = relay_socket(ss, port, proto, fd, 0)) == -1)
 		return (-1);
 
 	if (connect(s, (struct sockaddr *)ss, ss->ss_len) == -1) {
@@ -729,7 +732,7 @@
 {
 	int s;
 
-	if ((s = relay_socket(ss, port, proto, -1)) == -1)
+	if ((s = relay_socket(ss, port, proto, -1, 1)) == -1)
 		return (-1);
 
 	if (bind(s, (struct sockaddr *)ss, ss->ss_len) == -1)
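
The idea of the patch (set SO_REUSEPORT on listening sockets only, never on outgoing ones) can be illustrated outside relayd with a minimal Python sketch. This is not relayd code: the function names are made up for the example, SO_LINGER and the non-blocking setup from the diff are left out, and `socket.SO_REUSEPORT` is only used where the platform provides it.

```python
import socket


def make_tcp_socket(reuseport):
    """Create a TCP socket; set SO_REUSEPORT only when asked (hypothetical helper)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuseport and hasattr(socket, "SO_REUSEPORT"):
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    return s


def listening_socket(addr, port):
    """Listening side: the only place the option is requested."""
    s = make_tcp_socket(reuseport=True)
    s.bind((addr, port))
    s.listen(5)
    return s


def connecting_socket():
    """Outgoing side (backend connections, health checks): option left unset."""
    return make_tcp_socket(reuseport=False)
```

This mirrors the last two hunks of the diff, where the connect path passes 0 and the listen path passes 1 as the new `reuseport` argument.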



Re: PF and States

2010-12-21 Thread Henning Brauer
* Kevin Wilcox  [2010-12-20 16:01]:
> On 19 December 2010 07:16, Henning Brauer  wrote:
> > * Ryan McBride  [2010-12-03 09:52]:
> >> More than 100,000. I haven't tested lately (planning to do so soon), but I
> >> would expect somewhere closer to 500,000.
> > you're way off ;)
> > I had 2 million during a DDoS. things got a bit slow but everything
> > worked.
> Henning - out of curiosity, what were the specs on that hardware?

OpenBSD 4.8-stable (GENERIC) #1: Mon Oct  4 16:19:06 CEST 2010
henn...@terak.bsws.de:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz ("GenuineIntel" 686-class) 2.40 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM
real mem  = 1072128000 (1022MB)
avail mem = 1044631552 (996MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 08/25/07, BIOS32 rev. 0 @ 0xfd470, SMBIOS rev. 2.51 @ 0x3feeb000 (31 entries)
bios0: vendor Phoenix Technologies LTD version "6.00" date 08/25/2007
bios0: Supermicro PDSMi
acpi0 at bios0: rev 0
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP MCFG HPET APIC BOOT ASF! SSDT SSDT
acpi0: wakeup devices DEV1(S5) EXP1(S5) PXHA(S5) EXP5(S5) EXP6(S5) PCIB(S5) KBC0(S1) MSE0(S1) COM1(S5) COM2(S5) USB1(S4) USB2(S4) USB3(S4) USB4(S4) EUSB(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 268MHz
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins
ioapic1 at mainbus0: apid 2 pa 0xfec1, version 20, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (DEV1)
acpiprt2 at acpi0: bus 9 (EXP1)
acpiprt3 at acpi0: bus 10 (PXHA)
acpiprt4 at acpi0: bus 13 (EXP5)
acpiprt5 at acpi0: bus 14 (EXP6)
acpiprt6 at acpi0: bus 15 (PCIB)
acpicpu0 at acpi0: PSS
acpibtn0 at acpi0: PWRB
bios0: ROM list: 0xc/0xb000
ipmi at mainbus0 not configured
cpu0: Enhanced SpeedStep 2395 MHz: speeds: 900, 600 MHz
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel E7230 Host" rev 0xc0
ppb0 at pci0 dev 1 function 0 "Intel E7230 PCIE" rev 0xc0: apic 1 int 16 (irq 11)
pci1 at ppb0 bus 1
ppb1 at pci0 dev 28 function 0 "Intel 82801GB PCIE" rev 0x01: apic 1 int 17 (irq 12)
pci2 at ppb1 bus 9
ppb2 at pci2 dev 0 function 0 "Intel PCIE-PCIE" rev 0x09
pci3 at ppb2 bus 10
em0 at pci3 dev 1 function 0 "Intel PRO/1000MT (82541GI)" rev 0x00: apic 2 int 0 (irq 11), address 00:0e:0c:37:d1:86
"Intel IOxAPIC" rev 0x09 at pci2 dev 0 function 1 not configured
ppb3 at pci0 dev 28 function 4 "Intel 82801G PCIE" rev 0x01: apic 1 int 17 (irq 12)
pci4 at ppb3 bus 13
em1 at pci4 dev 0 function 0 "Intel PRO/1000MT (82573E)" rev 0x03: apic 1 int 16 (irq 11), address 00:30:48:92:08:32
ppb4 at pci0 dev 28 function 5 "Intel 82801G PCIE" rev 0x01: apic 1 int 16 (irq 11)
pci5 at ppb4 bus 14
em2 at pci5 dev 0 function 0 "Intel PRO/1000MT (82573L)" rev 0x00: apic 1 int 17 (irq 12), address 00:30:48:92:08:33
uhci0 at pci0 dev 29 function 0 "Intel 82801GB USB" rev 0x01: apic 1 int 23 (irq 10)
uhci1 at pci0 dev 29 function 1 "Intel 82801GB USB" rev 0x01: apic 1 int 19 (irq 11)
uhci2 at pci0 dev 29 function 2 "Intel 82801GB USB" rev 0x01: apic 1 int 18 (irq 5)
uhci3 at pci0 dev 29 function 3 "Intel 82801GB USB" rev 0x01: apic 1 int 16 (irq 11)
ehci0 at pci0 dev 29 function 7 "Intel 82801GB USB" rev 0x01: apic 1 int 23 (irq 10)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb5 at pci0 dev 30 function 0 "Intel 82801BA Hub-to-PCI" rev 0xe1
pci6 at ppb5 bus 15
vga1 at pci6 dev 0 function 0 "ATI ES1000" rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
radeondrm0 at vga1: apic 1 int 16 (irq 11)
drm0 at radeondrm0
ichpcib0 at pci0 dev 31 function 0 "Intel 82801GB LPC" rev 0x01: PM disabled
pciide0 at pci0 dev 31 function 1 "Intel 82801GB IDE" rev 0x01: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
pciide0: channel 0 disabled (no drives)
pciide0: channel 1 disabled (no drives)
ahci0 at pci0 dev 31 function 2 "Intel 82801GR AHCI" rev 0x01: apic 1 int 19 (irq 11), AHCI 1.1
scsibus0 at ahci0: 32 targets
sd0 at scsibus0 targ 0 lun 0:  SCSI3 0/direct fixed
sd0: 76319MB, 512 bytes/sec, 156301488 sec total
ichiic0 at pci0 dev 31 function 3 "Intel 82801GB SMBus" rev 0x01: apic 1 int 19 (irq 11)
iic0 at ichiic0
lm1 at iic0 addr 0x2d: W83627HF
wbng0 at iic0 addr 0x2f: w83793g
spdmem0 at iic0 addr 0x50: 512MB DDR2 SDRAM non-parity PC2-5300CL5
spdmem1 at iic0 addr 0x52: 512MB DDR2 SDRAM non-parity PC2-5300CL5
usb1 at uhci0: USB revision 1.0
uhub1 at usb1 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb2 at uhci1: USB revision 1.0
uhub2 at usb2 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb3 at uhci2: USB revision 1.0
uhub3

Re: PF and States

2010-12-21 Thread Gabriel Linder

On 12/20/10 15:52, Kevin Wilcox wrote:
> On 19 December 2010 07:16, Henning Brauer  wrote:
> > you're way off ;)
> > I had 2 million during a DDoS. things got a bit slow but everything
> > worked.
>
> Henning - out of curiosity, what were the specs on that hardware?

It may be interesting to know of any specific tweaks in that setup
(besides net.inet.ip.ifq.maxlen and set limit states), if any.

> My understanding was that pf won't use more than 1GB of RAM, which I
> thought to equal about 1 million states, but I never verified that
> information and now it's been so long I can't recall the source.


According to pf_var.h, a struct pf_state is roughly 212 bytes on amd64.
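
As a back-of-the-envelope check of the numbers in this subthread, the sketch below multiplies state counts by the roughly 212 bytes per struct pf_state quoted above. It is only arithmetic: it assumes that figure and ignores any per-state overhead beyond the struct itself, so real memory use will be higher.

```python
PF_STATE_BYTES = 212  # "roughly 212 bytes on amd64", as quoted in this thread


def state_table_bytes(n_states, bytes_per_state=PF_STATE_BYTES):
    """Memory taken by n_states structs alone (no other per-state overhead)."""
    return n_states * bytes_per_state


def states_that_fit(budget_bytes, bytes_per_state=PF_STATE_BYTES):
    """How many structs of that size fit in a given memory budget."""
    return budget_bytes // bytes_per_state


if __name__ == "__main__":
    for n in (100_000, 1_000_000, 2_000_000):
        print(n, "states ->", state_table_bytes(n) / 2**20, "MiB")
    print("structs per GiB:", states_that_fit(2**30))
```

On these assumptions 2 million states come to 424,000,000 bytes for the structs alone, which suggests that 1 GB would hold well over 1 million of them.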



Re: PF and States

2010-12-20 Thread Kevin Wilcox
On 19 December 2010 07:16, Henning Brauer  wrote:

> * Ryan McBride  [2010-12-03 09:52]:

>> More than 100,000. I haven't tested lately (planning to do so soon), but I
>> would expect somewhere closer to 500,000.

> you're way off ;)
> I had 2 million during a DDoS. things got a bit slow but everything
> worked.

Henning - out of curiosity, what were the specs on that hardware?

My understanding was that pf won't use more than 1GB of RAM, which I
thought to equal about 1 million states, but I never verified that
information and now it's been so long I can't recall the source.

Obviously, my incorrectness probably exists on several levels here...

kmw



Re: PF and States

2010-12-19 Thread dabheeruz

On 12/19/10 4:16 AM, Henning Brauer wrote:
> * Ryan McBride  [2010-12-03 09:52]:
> > On Thu, Dec 02, 2010 at 11:22:08PM -0500, Godesi wrote:
> > > 2.  How many states can I "really" have on a box that has 4 gig ram?
> > More than 100,000. I haven't tested lately (planning to do so soon), but I
> > would expect somewhere closer to 500,000.
>
> you're way off ;)
> I had 2 million during a DDoS. things got a bit slow but everything
> worked.

Hmm.. thanks guys.  I am stumped, as even with 100K states set in pf the
box was dying.  By dying I mean that ssh failed intermittently, carp was
failing, and the relayd checks failed intermittently.

Using pftop I saw only a slight increase in states (around
15-20K total).  I tried a bunch of things, none of which worked.  When
the traffic was around 8-10K states (total), the box was responding
perfectly well.  I am on 4.7 for amd64.  This has now happened around 4
times and I am clueless as to what my next troubleshooting step
should be.  Wondering if there is some issue with 4.7 amd64.




Re: PF and States

2010-12-19 Thread Henning Brauer
* Ryan McBride  [2010-12-03 09:52]:
> On Thu, Dec 02, 2010 at 11:22:08PM -0500, Godesi wrote:
> > 2.  How many states can I "really" have on a box that has 4 gig ram?
> More than 100,000. I haven't tested lately (planning to do so soon), but I
> would expect somewhere closer to 500,000.

you're way off ;)
I had 2 million during a DDoS. things got a bit slow but everything
worked.

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting



Re: PF and States

2010-12-11 Thread dabheeruz

On 12/8/10 2:09 PM, Ryan McBride wrote:
> On Wed, Dec 08, 2010 at 12:39:12PM -0800, dabheeruz wrote:
> > We are seeing the issue again and I am writing a script to get the
> > "pfctl -vvsi" data at regular intervals.  Can you please point me to
> > what values I should be looking out for?
>
> You want to look for any of the counters in the Counters section besides
> 'match' increasing "A Lot". How much depends on your specific situation,
> but if you get a feel for what you see when you're NOT having problems,
> you should be able to see if any of the counters increases suddenly.
>
> In your case, the most likely ones are:
>
> - memory
> - congestion
> - state-limit

Thanks Ryan!!



Re: PF and States

2010-12-08 Thread Ryan McBride
On Wed, Dec 08, 2010 at 12:39:12PM -0800, dabheeruz wrote:
> We are seeing the issue again and I am writing a script to get the
> "pfctl -vvsi" data at regular intervals.  Can you please point me to
> what values I should be looking out for?

You want to look for any of the counters in the Counters section besides
'match' increasing "A Lot". How much depends on your specific situation,
but if you get a feel for what you see when you're NOT having problems,
you should be able to see if any of the counters increases suddenly.

In your case, the most likely ones are:

- memory
- congestion
- state-limit
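
The comparison described above (watch for counters other than 'match' jumping between samples) could be scripted roughly as follows. This is a sketch under assumptions: it expects the captured text to contain a "Counters" heading followed by indented "name  total  rate/s" lines, which is how the Counters section of `pfctl -si` output is usually laid out, and the function names are made up for the example.

```python
import re

# The three counters singled out in the message above.
WATCHED = ("memory", "congestion", "state-limit")

# Assumed line layout: leading whitespace, counter name, total, rate such as "0.0/s".
_COUNTER_LINE = re.compile(r"^\s+(\S+)\s+(\d+)\s+\S+/s\s*$")


def parse_counters(text):
    """Return {counter name: total} from the Counters section of the given text."""
    counters = {}
    in_section = False
    for line in text.splitlines():
        if line.strip() == "Counters":
            in_section = True
            continue
        if in_section:
            match = _COUNTER_LINE.match(line)
            if match is None:
                break  # first line that is not a counter ends the section
            counters[match.group(1)] = int(match.group(2))
    return counters


def increased(previous, current, watched=WATCHED):
    """Return {name: delta} for watched counters that grew between two samples."""
    return {
        name: current[name] - previous.get(name, 0)
        for name in watched
        if name in current and current[name] > previous.get(name, 0)
    }
```

To use it, capture the output of `pfctl -vvsi` at regular intervals (for example with `subprocess.run`), feed each capture to `parse_counters`, and log whatever `increased` returns for consecutive samples. What counts as "A Lot" still has to be judged against a baseline taken when the box is healthy.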



Re: PF and States

2010-12-08 Thread dabheeruz

Hi Ryan,

We are seeing the issue again and I am writing a script to get the 
"pfctl -vvsi" data at regular intervals.  Can you please point me to 
what values I should be looking out for?


Thanks
Parvinder Bhasin

On 12/3/10 11:32 AM, dabheeruz wrote:
> Thanks Ryan! Unfortunately when this happened I was remote and could
> not grab those stats.  But what should I be looking for in terms of
> badness?  Maybe I can quickly set up something to monitor a
> particular stat.  Really appreciate your input.
>
> Thx.
>
> On 12/3/10 12:41 AM, Ryan McBride wrote:
> > On Thu, Dec 02, 2010 at 11:22:08PM -0500, Godesi wrote:
> > > 1.  Do I need pf for relayd when I am not doing redirects?
> >
> > I don't think so, but this is easy for you to test...
> >
> > > 2.  How many states can I "really" have on a box that has 4 gig ram?
> >
> > More than 100,000. I haven't tested lately (planning to do so soon), but I
> > would expect somewhere closer to 500,000.
> >
> > > Is it governed by how much mem is allocated to kernel?
> >
> > Yes.
> >
> > > Can I change that?
> >
> > Not directly. In fact, having too much RAM in your box will COST you
> > memory, as more kernel memory is used up tracking all your RAM. So
> > cutting your ram to 2 GB will probably improve the upper limit, though
> > it doesn't seem that that's the limit you are hitting.
> >
> > What does 'pfctl -vvsi' show when this problem is happening?




Re: PF and States

2010-12-05 Thread dabheeruz

Hi Jan,

This actually happened again really late at night.  One strange thing
was that nagios, which we have set up to monitor CARP state, showed
the secondary lb (same config etc) with its carp interface
in "init" state, and once again the primary relayd box was having
problems: users could only sometimes get to the site.
When I tried to ssh into the box I couldn't, and only after a couple
of retries was I finally logged in.  When I tried "relayctl show
hosts", "relayctl show sessions" or any other command, I got an error.
When I looked at PF states they were around 20K.  I logged on to the
secondary (backup carp) and of course saw that it was confused.  These
two boxes are connected directly, no switches or anything.  It seems
the secondary box also wasn't able to fully communicate with the
MASTER.  When the states were back to around 8K, everything was back to
normal and I could do "relayctl show sessions" etc.


Very strange, this problem!! Is it PF or relayd?  I can't really tell, but
I have to come up with something soon, otherwise I will have to part ways
with this solution, which I really don't want to do :(


thx
On 12/3/10 11:58 PM, Jan Johansson wrote:
> Godesi  wrote:
> > We recently deployed OBSD4.7 boxes to do load balancing in our
> > environment with relayd.
> >
> > After a few hours we encountered problems with the server going beyond
> > 10,000 states.
>
> Are you convinced that it is a state problem?
>
> In our tests we have found that a default setup of relayd will
> handle 2540 connections and will then stop responding to new
> connections. Might this be the limit you are seeing?
>
> Our pf.conf is the default that comes with the install.




Re: PF and States

2010-12-04 Thread Jan Johansson
Godesi  wrote:
> We recently deployed OBSD4.7 boxes to do load balancing in our
> environment with relayd.
> 
> After a few hours we encountered problems with the server going beyond
> 10,000 states.

Are you convinced that it is a state problem?

In our tests we have found that a default setup of relayd will
handle 2540 connections and will then stop responding to new
connections. Might this be the limit you are seeing?

Our pf.conf is the default that comes with the install.



Re: PF and States

2010-12-03 Thread dabheeruz
Thanks Ryan! Unfortunately when this happened I was remote and could not
grab those stats.  But what should I be looking for in terms of badness?
Maybe I can quickly set up something to monitor a particular stat.
Really appreciate your input.


Thx.

On 12/3/10 12:41 AM, Ryan McBride wrote:
> On Thu, Dec 02, 2010 at 11:22:08PM -0500, Godesi wrote:
> > 1.  Do I need pf for relayd when I am not doing redirects?
>
> I don't think so, but this is easy for you to test...
>
> > 2.  How many states can I "really" have on a box that has 4 gig ram?
>
> More than 100,000. I haven't tested lately (planning to do so soon), but I
> would expect somewhere closer to 500,000.
>
> > Is it governed by how much mem is allocated to kernel?
>
> Yes.
>
> > Can I change that?
>
> Not directly. In fact, having too much RAM in your box will COST you
> memory, as more kernel memory is used up tracking all your RAM. So
> cutting your ram to 2 GB will probably improve the upper limit, though
> it doesn't seem that that's the limit you are hitting.
>
> What does 'pfctl -vvsi' show when this problem is happening?




Re: PF and States

2010-12-03 Thread Ryan McBride
On Thu, Dec 02, 2010 at 11:22:08PM -0500, Godesi wrote:
> 1.  Do I need pf for relayd when I am not doing redirects?

I don't think so, but this is easy for you to test...


> 2.  How many states can I "really" have on a box that has 4 gig ram?

More than 100,000. I haven't tested lately (planning to do so soon), but I
would expect somewhere closer to 500,000.


> Is it governed by how much mem is allocated to kernel?

Yes.

> Can I change that?

Not directly. In fact, having too much RAM in your box will COST you
memory, as more kernel memory is used up tracking all your RAM. So
cutting your ram to 2 GB will probably improve the upper limit, though
it doesn't seem that that's the limit you are hitting.


What does 'pfctl -vvsi' show when this problem is happening?



PF and States

2010-12-02 Thread Godesi

Hi,

We recently deployed OBSD4.7 boxes to do load balancing in our
environment with relayd.

After a few hours we encountered problems with the server going beyond
10,000 states.  After much research and reading of man pages, we set the
state limit to a "ridiculous" number.
Yes, the number was 100,000.  We also changed the states to expire much
faster.  We redeployed the box and everything was normal for a few days,
until we started having issues with the box again.
This time the states were at 20,000 and again pf/relayd started having
issues.  The box has about 4 gig of ram, multiple cores etc.  By issues I
mean that sometimes I can't ssh to the box, can't get relayctl to show
hosts, etc.

Can someone who is an expert at this look at it and tell me what may be
wrong here?
I have a couple of questions:

1.  Do I need pf for relayd when I am not doing redirects?
2.  How many states can I "really" have on a box that has 4 gig ram?
Is it governed by how much mem is allocated to kernel? (I read it
somewhere while googling).  Can I change that?


Here is pf.conf.  Since the box is BEHIND a corporate Juniper
firewall, we didn't really need to block anything, so pf.conf
is very simple and so is the relayd.conf.

I would really appreciate any help.

ext_if="fxp0"
web_if="fxp1"

set loginterface $ext_if
set optimization aggressive
set skip on lo
set limit { states 10  }


set timeout tcp.first   10
set timeout tcp.opening 10
set timeout tcp.established 60
set timeout tcp.closing 10
set timeout tcp.finwait 10
set timeout tcp.closed  10


pass quick on $ext_if
pass quick on $mgt_if


Here is the relayd.conf file:


# $OpenBSD: relayd.conf,v 1.13 2008/03/03 16:58:41 reyk Exp $
#
# Macros
#

images_vip="10.1.0.107"

#
# Global Options
#
interval 30
#timeout 180
#
# Each table will be mapped to a pf table.
#
table  {   web01 web02  web03   web04   web05  web06 }
   table  { 127.0.0.1 }

#
# Services will be mapped to a rdr rule.
#

#
# Relay and protocol for HTTP layer 7 loadbalancing and SSL acceleration
#
relay web {
   listen on $webip port 80
   session timeout 180
   forward to  port 8080 mode roundrobin \
   check tcp
}

thank you



Re: PF and states of connections with same src port

2008-05-04 Thread Jordi Espasa Clofent

It's related to the timeout options.
See pf.conf(5), Options section, timeouts.

By default, pf offers several predefined sets of timeout values,
including Conservative, Normal and Aggressive.


If you want connection states to be dropped early, you can use the
Aggressive set. But PF is extremely flexible: you can also configure
every timeout value according to your specific needs.


I recommend reading this valuable resource:
http://undeadly.org/cgi?action=article&sid=20060927091645


--
Thanks,
Jordi Espasa Clofent



Re: PF and states of connections with same src port

2008-05-02 Thread B A
I found these notes:

http://www.openbsd.org/cgi-bin/cvsweb/src/sys/net/pf.c?rev=1.559&content-type=text/x-cvsweb-markup

Will try upgrading (I'm running 4.1) and see.

02.05.08, 20:21, "Kian Mohageri" <[EMAIL PROTECTED]>:

> States aren't purged immediately.  Take a look at the timeout values,
> specifically tcp.closed.
> -Kian



Re: PF and states of connections with same src port

2008-05-02 Thread Kian Mohageri
On Fri, May 2, 2008 at 7:35 AM, B A <[EMAIL PROTECTED]> wrote:
> Hello!
>
>  I have a question about PF.
>
>  I have just found some interesting behavior in PF.
>  For example, if I fix the source port and run from my PC:
>
>    echo 'aaa' | nc -p  www.my.rerver 80
>
>  I get a response.
>  But if I run this command again right away, the connection gets stuck.
>  I have to wait about 1 min before I can make a connection with the
>  same src port. It looks like pf states aren't immediately removed after
>  the FIN is sent.
>  A directly connected PC doesn't show this behavior; I get a response
>  immediately.
>
>  Am I wrong, or is something wrong with PF? How can I fix this behavior?
>

States aren't purged immediately.  Take a look at the timeout values,
specifically tcp.closed.

-Kian



PF and states of connections with same src port

2008-05-02 Thread B A
Hello!

I have a question about PF.

I have just found some interesting behavior in PF.
For example, if I fix the source port and run from my PC:

   echo 'aaa' | nc -p  www.my.rerver 80

I get a response.
But if I run this command again right away, the connection gets stuck.
I have to wait about 1 min before I can make a connection with the
same src port. It looks like pf states aren't immediately removed after
the FIN is sent.
A directly connected PC doesn't show this behavior; I get a response
immediately.

Am I wrong, or is something wrong with PF? How can I fix this behavior?

Thank you.