Re: [pfSense Support] Inbound Loadbalancing problem - SOLVED

2007-04-25 Thread Gary Buckmaster

Bill Marquette wrote:

On 4/24/07, Gary Buckmaster [EMAIL PROTECTED] wrote:

This issue turned out to be primarily a configuration problem, although
it serves as a good lesson for others to learn from so I'll post the
reply for the sake of posterity.

<background>
We currently have 16 web servers in production handling requests.  They
are sitting behind Cisco LocalDirectors.  Because of how the
LocalDirectors are configured, it's not a simple plug-and-play scenario
to substitute in the pfSense boxes.  In order to make the transition
smoother, a number of machines were multi-homed so as to exist behind
the LocalDirectors and the new pfSense network.
</background>

The astute reader will quickly surmise what happened.  Although the web
servers were located on both networks, their default route was
inadvertently left alone.  Thus traffic coming from the pfSense boxes
was replied to using the wrong network card, causing the timeout issues.

This turned out to be a blessing in disguise because it demonstrated a
more gentle way we could transition to the new machines without
interrupting service dramatically as DNS propagated to the new cluster.


I'm not following what the gentle way of transitioning to the new
machines is.  Care to elaborate a little?  Did you change the default
route on part of the farm and disable the interfaces on the machines
that should still be going through the LocalDirector?

--Bill

PS. I'm very happy to see pfSense replace a LocalDirector - I honestly
didn't expect to see anyone using the load balancing code when I wrote
it, except for the one person that requested it.

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Bill,

We ended up putting the pfSense cluster into the LocalDirector pool and 
slowly transitioned more servers behind the pfSense cluster and away 
from the LocalDirector pool as traffic stopped hitting the 
LocalDirectors and started hitting the pfSense cluster directly.  Due to 
the odd DNS caching employed by some ISPs, I suspect that it'll be about 
a week before we can pull the LocalDirectors completely out of the mix, 
but we're already seeing a significant amount of traffic going straight 
to the pfSense boxes. 

I'm happy to report that, other than the backup pfSense box randomly 
promoting itself to master, the load balancing is working flawlessly.  
Mind you, this is being done on hardware that is nowhere near top of the 
line.  The pfSense boxes are 1.2 GHz Celerons with 1 GB of DDR-266 memory
and IDE hard drives.  Considering that they'll ultimately be handling
more than 65 million web requests daily, that's pretty cost-effective,
especially compared with commercial load balancing products.

At some point it might be good to talk about adding some additional
reporting functionality to the load balancer but, as you pointed out, it's
probably being under-utilized by the general pfSense community.  We're
planning on putting it extensively to use for our customers and so this 
was a very good trial by fire for the functionality.  The fact that we 
felt comfortable enough setting this up with a beta speaks volumes for 
the quality of the pfSense product. 


-Gary




Re: [pfSense Support] Inbound Loadbalancing problem - SOLVED

2007-04-25 Thread Michael Oh

Bill -

FYI, I find the load balancing part of pfSense to be invaluable! I'd bet
there are a number more out there who feel the same! I'd love to see more
development in that area for pfSense - for example, handling a Squid proxy on
the box, etc.

Michael Oh

On 4/24/07, Bill Marquette [EMAIL PROTECTED] wrote:



PS. I'm very happy to see pfSense replace a LocalDirector - I honestly
didn't expect to see anyone using the load balancing code when I wrote
it, except for the one person that requested it.






--

Thanks!
Michael Oh


Re: [pfSense Support] Inbound Loadbalancing problem - SOLVED

2007-04-24 Thread Gary Buckmaster
This issue turned out to be primarily a configuration problem, although 
it serves as a good lesson for others to learn from so I'll post the 
reply for the sake of posterity. 


<background>
We currently have 16 web servers in production handling requests.  They
are sitting behind Cisco LocalDirectors.  Because of how the
LocalDirectors are configured, it's not a simple plug-and-play scenario
to substitute in the pfSense boxes.  In order to make the transition
smoother, a number of machines were multi-homed so as to exist behind
the LocalDirectors and the new pfSense network.
</background>


The astute reader will quickly surmise what happened.  Although the web 
servers were located on both networks, their default route was 
inadvertently left alone.  Thus traffic coming from the pfSense boxes 
was replied to using the wrong network card, causing the timeout issues. 
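
The routing mistake can be illustrated with a small Python sketch.  It is
not taken from the thread: the 10.0.0.0/24 LocalDirector subnet, the
interface names nic0/nic1 and the client address 203.0.113.10 are
assumptions made up for illustration; only 192.168.100.0/24 appears in the
capture further down.  It also assumes the load balancer forwards requests
with the client's source address intact, so each reply is routed by the web
server's own table.

```python
import ipaddress

def pick_route(routes, destination):
    """Return the most specific (longest-prefix) route containing destination."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r["net"])]
    return max(candidates, key=lambda r: ipaddress.ip_network(r["net"]).prefixlen)

# Hypothetical routing table of one multi-homed web server.
routes = [
    {"net": "10.0.0.0/24", "iface": "nic0", "via": None},       # LocalDirector side (assumed subnet)
    {"net": "192.168.100.0/24", "iface": "nic1", "via": None},  # pfSense LAN
    {"net": "0.0.0.0/0", "iface": "nic0", "via": "10.0.0.1"},   # default route left untouched
]

client = "203.0.113.10"   # stand-in for an Internet client (assumed)
probe = "192.168.100.3"   # one of the pfSense boxes on the shared LAN

reply_to_client = pick_route(routes, client)
reply_to_probe = pick_route(routes, probe)

print("reply to client leaves via", reply_to_client["iface"])
print("reply to pfSense box leaves via", reply_to_probe["iface"])
```

Under these assumptions the reply to the pfSense box stays on the pfSense
LAN, while the reply to the real client follows the untouched default route
out of the LocalDirector-side card.  That would be consistent with polls
succeeding while client connections time out.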

This turned out to be a blessing in disguise because it demonstrated a 
more gentle way we could transition to the new machines without 
interrupting service dramatically as DNS propagated to the new cluster. 

Thanks to the pfSense team for such a great product and their help in 
figuring out the issue.


Bill Marquette wrote:

Both boxes are likely polling the web servers in question, hence the
traffic from both machines.

You might confirm that you have rules loaded to allow this traffic.
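
One way to check the polling explanation against a capture is sketched
below in Python.  It groups old-style tcpdump header lines by the peer that
opened the connection and flags connections that open and close without
sending data.  The function names and the "connect then close" heuristic
are assumptions for illustration, not part of pfSense or tcpdump; the
sample lines are the header lines of the capture quoted further down.

```python
import re
from collections import defaultdict

# Header line of an old-style tcpdump TCP record.  The '>' is optional so the
# pattern also copes with archives that stripped it.
HEADER = re.compile(
    r"^\d{2}:\d{2}:\d{2}\.\d+ IP "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\w+)\s+>?\s*"
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\w+): (?P<flags>\S+)"
)

def flags_by_initiator(capture, server):
    """Map each (peer_ip, peer_port) sending to server to the TCP flags it sent."""
    seen = defaultdict(list)
    for line in capture.splitlines():
        m = HEADER.match(line)
        if m and m["dst"] == server:
            seen[(m["src"], m["sport"])].append(m["flags"])
    return dict(seen)

def looks_like_probe(flags):
    """SYN and FIN seen, but never a push flag: a connect-and-close check."""
    return "S" in flags and "F" in flags and not any("P" in f for f in flags)

capture = """\
10:10:56.089142 IP 192.168.100.3.62747 > 192.168.100.161.http: S
10:10:56.089780 IP 192.168.100.3.62747 > 192.168.100.161.http: . ack 1
10:10:56.090036 IP 192.168.100.3.62747 > 192.168.100.161.http: F 1:1(0)
10:10:57.186346 IP 192.168.100.2.60821 > 192.168.100.161.http: S
10:10:57.186941 IP 192.168.100.2.60821 > 192.168.100.161.http: F 1:1(0)
"""

result = flags_by_initiator(capture, "192.168.100.161")
for (ip, port), flags in result.items():
    print(ip, port, flags, "probe-like" if looks_like_probe(flags) else "carries data")
```

Both 192.168.100.2 and 192.168.100.3 show up as separate initiators whose
connections carry no payload, which fits each box polling the pool on its
own rather than forwarding client requests.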

--Bill

On 4/24/07, Gary Buckmaster [EMAIL PROTECTED] wrote:

Prior to trying to install this into production, I had this entire
scenario working perfectly in a test environment.  Something, it seems,
has changed between testing and production.

I have a cluster of 15 web servers which I intend to load balance with a
CARP'd cluster.  I've created a CARP VIP address which will be the
virtual server address and another one on the LAN to serve as the
gateway for the server pool.  CARP failover has been configured and
appears to work properly, although the secondary load balancer, for some
odd reason, is always the Master.

The problem comes when I try to test web connectivity to the balanced
servers.  Traffic hits the virtual server address, hits the load
balanced pool of servers and that appears to be where things stop.  A
tcpdump shows that traffic appears to be coming from both pfSense boxes,
which seems contrary to the way the load balancer should be working:

10:10:56.089142 IP 192.168.100.3.62747 > 192.168.100.161.http: S
2531494251:2531494251(0) win 65228 <mss 1460,nop,wscale
0,nop,nop,timestamp 7490736 0,sackOK,eol>
10:10:56.089220 IP 192.168.100.161.http > 192.168.100.3.62747: S
1542065227:1542065227(0) ack 2531494252 win 65535 <mss 1460,nop,wscale
1,nop,nop,timestamp 6878409 7490736,nop,nop,sackOK>
10:10:56.089780 IP 192.168.100.3.62747 > 192.168.100.161.http: . ack 1
win 65535 <nop,nop,timestamp 7490737 6878409>
10:10:56.090036 IP 192.168.100.3.62747 > 192.168.100.161.http: F 1:1(0)
ack 1 win 65535 <nop,nop,timestamp 7490737 6878409>
10:10:56.090081 IP 192.168.100.161.http > 192.168.100.3.62747: . ack 2
win 33304 <nop,nop,timestamp 6878409 7490737>
10:10:56.090129 IP 192.168.100.161.http > 192.168.100.3.62747: F 1:1(0)
ack 2 win 33304 <nop,nop,timestamp 6878409 7490737>
10:10:56.090800 IP 192.168.100.3.62747 > 192.168.100.161.http: . ack 2
win 1071 <nop,nop,timestamp 7490738 6878409>
10:10:57.186346 IP 192.168.100.2.60821 > 192.168.100.161.http: S
4259965474:4259965474(0) win 65228 <mss 1460,nop,wscale
0,nop,nop,timestamp 5838503 0,sackOK,eol>
10:10:57.186401 IP 192.168.100.161.http > 192.168.100.2.60821: S
1151731680:1151731680(0) ack 4259965475 win 65535 <mss 1460,nop,wscale
1,nop,nop,timestamp 6878519 5838503,nop,nop,sackOK>
10:10:57.186673 IP 192.168.100.2.60821 > 192.168.100.161.http: . ack 1
win 65535 <nop,nop,timestamp 5838504 6878519>
10:10:57.186941 IP 192.168.100.2.60821 > 192.168.100.161.http: F 1:1(0)
ack 1 win 65535 <nop,nop,timestamp 5838504 6878519>
10:10:57.186984 IP 192.168.100.161.http > 192.168.100.2.60821: . ack 2
win 33304 <nop,nop,timestamp 6878519 5838504>
10:10:57.187037 IP 192.168.100.161.http > 192.168.100.2.60821: F 1:1(0)
ack 2 win 33304 <nop,nop,timestamp 6878519 5838504>
10:10:57.187747 IP 192.168.100.2.60821 > 192.168.100.161.http: . ack 2
win 1071 <nop,nop,timestamp 5838505 6878519>

I'm at a loss trying to figure out what the issue is.

-Gary









Re: [pfSense Support] Inbound Loadbalancing problem - SOLVED

2007-04-24 Thread Bill Marquette

On 4/24/07, Gary Buckmaster [EMAIL PROTECTED] wrote:

This issue turned out to be primarily a configuration problem, although
it serves as a good lesson for others to learn from so I'll post the
reply for the sake of posterity.

<background>
We currently have 16 web servers in production handling requests.  They
are sitting behind Cisco LocalDirectors.  Because of how the
LocalDirectors are configured, it's not a simple plug-and-play scenario
to substitute in the pfSense boxes.  In order to make the transition
smoother, a number of machines were multi-homed so as to exist behind
the LocalDirectors and the new pfSense network.
</background>

The astute reader will quickly surmise what happened.  Although the web
servers were located on both networks, their default route was
inadvertently left alone.  Thus traffic coming from the pfSense boxes
was replied to using the wrong network card, causing the timeout issues.

This turned out to be a blessing in disguise because it demonstrated a
more gentle way we could transition to the new machines without
interrupting service dramatically as DNS propagated to the new cluster.


I'm not following what the gentle way of transitioning to the new
machines is.  Care to elaborate a little?  Did you change the default
route on part of the farm and disable the interfaces on the machines
that should still be going through the LocalDirector?

--Bill

PS. I'm very happy to see pfSense replace a LocalDirector - I honestly
didn't expect to see anyone using the load balancing code when I wrote
it, except for the one person that requested it.
