enabling SPDY

2013-01-15 Thread KT Walrus
I'd like to enable the SPDY protocol using nginx.  For this to work, do the 
frontend haproxy load balancers need to operate in tcp mode only?  Or can they 
use http mode to pass the SPDY requests to nginx?
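
For what it's worth, the setup I have in mind is plain TCP passthrough, where 
nginx terminates both SSL and SPDY itself.  A minimal sketch (addresses and 
names are made up, untested):

    frontend spdy_in
        mode tcp
        bind :443
        default_backend nginx_spdy

    backend nginx_spdy
        mode tcp
        balance roundrobin
        server web1 10.0.0.11:443 check
        server web2 10.0.0.12:443 check

In http mode haproxy would have to understand the SPDY framing itself, so tcp 
mode looks like the safe assumption unless someone knows better.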


Re: Backend Server Dynamic Configuration

2013-01-09 Thread KT Walrus
I think you might have misunderstood.  By adding a new server, I mean adding it 
as a server in the HAProxy configuration.  That is, the effect is the same as 
adding a server line for the new server to the config file.  This has nothing to do 
with launching the server in the cloud.  It is the reverse of marking a server 
DOWN, except that the server being marked UP was not originally included in the 
list of servers for the HAProxy backend.
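
Marking an existing server DOWN or UP is already possible at runtime over the 
stats socket; what I'm asking for is the same thing for a server that isn't in 
the config yet.  Roughly, the part that works today (socket path and names are 
my own invention):

    global
        # runtime CLI, admin level so servers can be disabled/enabled
        stats socket /var/run/haproxy.sock level admin

    # at runtime, for a server that already has a "server" line:
    #   echo "disable server www_backend/web3" | socat stdio /var/run/haproxy.sock
    #   echo "enable server www_backend/web3"  | socat stdio /var/run/haproxy.sock

The missing piece is being able to do the equivalent of "enable server" for a 
server line that was never declared in the config file.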

On Jan 9, 2013, at 4:21 PM, Zachary Stern z...@enternewmedia.com wrote:

 
 
 On Wed, Jan 9, 2013 at 4:13 PM, Kevin Heatwole ke...@heatwoles.us wrote:
 4.  Adding new server to backend by having configuration check return new 
 server configuration.
 
 I don't know about the other features, but this one I think violates the UNIX 
 philosophy of "do one thing and do it well". There are already plenty of 
 tools you can use to achieve this with HAProxy, like Puppet or Chef, and 
 things like the Ruby fog gem for cloud provisioning, etc.
 
 
 -- 
 
 zachary alex stern I systems architect
 
 o: 212.363.1654 x106 | f: 212.202.6488 | z...@enternewmedia.com
 
 60-62 e. 11th street, 4th floor | new york, ny | 10003
 
 www.enternewmedia.com
 



Re: Backend Server Dynamic Configuration

2013-01-09 Thread KT Walrus
Willy,

Thanks for your thoughts.  I see that you have thought about these issues much 
more than I have.  I just wanted to get you some feedback from a potential 
haproxy user.

As for your point that I would need to edit the static configuration file 
in case of a complete restart of HAProxy, I would point out that if the backend 
servers are responsible for their HAProxy configuration, then the static 
configuration file only needs to configure the minimal set of backend servers.  
When HAProxy restarts, it would send its configuration data to the statically 
defined backends, and those servers would be responsible for dynamically 
reconfiguring HAProxy back to its desired settings.  This could also include 
adding any additional servers.

I guess my feedback to you would be that I would rather see my backend servers 
define how HAProxy sees them through some sort of dynamic configuration 
interface.  I can see how I could implement this by using other scripts/tools 
to dynamically rewrite the config file and reload HAProxy.  This is what I'll 
do if you don't choose to implement more support for dynamic configuration from 
the backend.
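
For reference, the rewrite-and-reload fallback I have in mind is just the 
standard soft reload (paths are my own, untested):

    # regenerate /etc/haproxy/haproxy.cfg from whatever source of truth the
    # backends update, then reload without dropping established connections:
    haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid \
            -sf $(cat /var/run/haproxy.pid)

The old process finishes its existing connections while the new one picks up 
the regenerated config.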

On Jan 9, 2013, at 6:14 PM, Willy Tarreau w...@1wt.eu wrote:

 Hi Kevin,
 
 On Wed, Jan 09, 2013 at 04:13:28PM -0500, Kevin Heatwole wrote:
 (...)
 1.  Setting new interval time for subsequent configuration checks to the 
 server.
 2.  Setting new maxconn or weight for the server (allowing backend to 
 throttle or increase load for itself).
 3.  Setting server state (DOWN, MAINTENANCE, UP, STARTING, STOPPING, 
 DAMAGED) changing how HAProxy treats existing connections or new connections 
 for the server.
 4.  Change server from active to backup (or vice versa).
 4.  Adding new server to backend by having configuration check return new 
 server configuration.
 5.  Changing any other useful settings that affect backend servers.
 
 This is more or less what was spec'd by Malcolm recently and implemented
 by Simon. We found that some details could be adapted before merging, but
 in general it's very close.
 
 Anyway you won't hot-add a server that soon: lots of internal sizes
 (including file descriptor limits) depend on the number of servers. At
 most we'd have provisions for adding up to N extra servers. Also, a lot
 of checks are currently performed on server configuration (e.g. don't
 track a server that tracks itself). Doing some of these checks at
 runtime and providing enough feedback to the user to react will be quite
 tricky.
 
 In practice, you can already tweak a number of things on the stats CLI.
 I'd like to be able to switch a server's state between active/backup, but
 it's already algorithmically complex. Changing an address would be really
 nice, but comes with issues again.
 
 Anyway, whatever changes you make, you'll have to do them on the config
 file too because if you restart your process, you don't want all these
 changes to be lost ! So right now changing the config only and doing a
 reload is already a reasonable tradeoff.
 
 Willy
 
 




Re: Backend Server Dynamic Configuration

2013-01-09 Thread KT Walrus
Zachary,

I guess I'll have to spend some time researching puppet (and/or chef).  Thanks 
for the reference.

Kevin

On Jan 9, 2013, at 6:18 PM, Zachary Stern z...@enternewmedia.com wrote:

 Case in point for why Puppet (and probably also Chef) is perfect here.
 
 You can manage the config with Puppet, and have the service subscribe to 
 the config file, so that it auto-restarts or reloads every time Puppet changes 
 it.
 
 
 On Wed, Jan 9, 2013 at 6:14 PM, Willy Tarreau w...@1wt.eu wrote:
 Hi Kevin,
 
 On Wed, Jan 09, 2013 at 04:13:28PM -0500, Kevin Heatwole wrote:
 (...)
  1.  Setting new interval time for subsequent configuration checks to the 
  server.
  2.  Setting new maxconn or weight for the server (allowing backend to 
  throttle or increase load for itself).
  3.  Setting server state (DOWN, MAINTENANCE, UP, STARTING, STOPPING, 
  DAMAGED) changing how HAProxy treats existing connections or new 
  connections for the server.
  4.  Change server from active to backup (or vice versa).
  4.  Adding new server to backend by having configuration check return new 
  server configuration.
  5.  Changing any other useful settings that affect backend servers.
 
 This is more or less what was spec'd by Malcolm recently and implemented
 by Simon. We found that some details could be adapted before merging, but
 in general it's very close.
 
 Anyway you won't hot-add a server that soon: lots of internal sizes
 (including file descriptor limits) depend on the number of servers. At
 most we'd have provisions for adding up to N extra servers. Also, a lot
 of checks are currently performed on server configuration (e.g. don't
 track a server that tracks itself). Doing some of these checks at
 runtime and providing enough feedback to the user to react will be quite
 tricky.
 
 In practice, you can already tweak a number of things on the stats CLI.
 I'd like to be able to switch a server's state between active/backup, but
 it's already algorithmically complex. Changing an address would be really
 nice, but comes with issues again.
 
 Anyway, whatever changes you make, you'll have to do them on the config
 file too because if you restart your process, you don't want all these
 changes to be lost ! So right now changing the config only and doing a
 reload is already a reasonable tradeoff.
 
 Willy
 
 
 
 
 
 -- 
 
 zachary alex stern I systems architect
 
 o: 212.363.1654 x106 | f: 212.202.6488 | z...@enternewmedia.com
 
 60-62 e. 11th street, 4th floor | new york, ny | 10003
 
 www.enternewmedia.com
 



Re: dynamic upstream configuration

2013-01-05 Thread KT Walrus
Willy, 

 I think you shouldn't try that hard to adapt the tools to your needs.

You are correct.  Clearly, HAProxy is among the best at load balancing and I 
definitely need a load balancer.  NGINX is among the best at web serving and I 
definitely need a web server.  Varnish is among the best at web caching and I 
definitely need a static content web cache.

HAProxy has most of what I want in letting the backend server communicate back 
to the load balancer.  I'm not so sure whether I can adjust MAXCONN for the 
front end from a backend server, but this feature would be just a nice thing to 
have and not essential.  I can live with a static MAXCONN.

My problem is that I also need SSL termination, and I need to do it in 
the load balancer.  I'm a bit hesitant to use HAProxy for SSL since it is a new 
feature.  I've used NGINX for SSL and it seems very stable.  I also think NGINX 
can probably scale to use more CPU cores on a server, but CPU isn't my 
bottleneck; both NGINX and HAProxy will saturate my network link to the 
internet before running out of CPU.
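
If I do end up terminating SSL in HAProxy, I assume the setup would look 
roughly like this (certificate path and backend name are made up, and the 
nbproc value depends on the machine; untested):

    global
        nbproc 4                    # SSL is CPU-bound, spread it over the cores

    frontend https_in
        mode http
        bind :443 ssl crt /etc/haproxy/site.pem
        default_backend web_servers

As I understand it, the "bind ... ssl crt" syntax is exactly the new 1.5-dev 
SSL support I'm hesitant about, so I'd treat it as just that: new.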

Anyway, do I put NGINX on my load balancers too?  My needs in the frontend 
load balancers are very simple, and NGINX covers them with ip_hash balancing.  
On the other hand, I really like the stats page of HAProxy, and maybe that is 
enough of a feature to get me to choose HAProxy for the load balancers (and SSL).

As for the backend servers, I only need HAProxy as a guard to keep the 
particular backend server from ever getting overloaded (never exceeding 
MAXCONN) no matter how many load balancers send requests to it.  It would be 
nice to have NGINX provide this guard.  One way to provide it is to 
use HAProxy with the first load balancing algorithm.  Maybe I can figure out 
how to use NGINX to do this too.  If NGINX has a variable that says how many 
requests are being processed by a server and I can access that variable when 
deciding how to process a new request, then I can eliminate HAProxy for guard 
purposes.  I still might need HAProxy for MySQL to copy session data to the 
local sessionDB if the load balancers need to change the backend server for a 
user request.  But I may be able to handle this in PHP just as well.

Anyway, thanks for your thoughts.  I have a couple of months to get this all 
figured out, but I can already see that good solutions and tools exist 
to do what I want to do.  Still, it is very tempting to minimize the number 
of software packages in the system just to keep the complexity of 
administration low.

Kevin

On Jan 5, 2013, at 3:19 AM, Willy Tarreau w...@1wt.eu wrote:

 Kevin,
 
 On Fri, Jan 04, 2013 at 03:44:40PM -0500, KT Walrus wrote:
 HAProxy also does all this stuff too (more or less).  I plan on using HAProxy
 if NGINX doesn't support everything I need done.
 
 I have a preference to use as few software packages as possible so I keep the
 system as simple and maintainable as possible.  Being able to eliminate
 HAProxy would be a plus for me.  I already need NGINX to handle basic web
 serving, so I think I just need a few new features in the load balancing area
 to go without HAProxy.
 
 I think you shouldn't try that hard to adapt the tools to your needs.
 Many large scalable websites use haproxy+varnish+nginx together and
 are very happy with that combination. Each one has a separate clearly
 identified role. All 3 perform extremely well and they try to address
 different needs, so in general what you don't find in one of them is
 doable with the other two. And at least each of them is better than
 the two others at something. So why try to ditch any of them for the
 sake of package reduction ? This means you'll give up some capabilities.
 
 Focusing on reducing the number of packages only means that your system
 is not yet ready for deployment because you're managing everything by
 hand, and that does not scale.
 
 So I think you should definitely start by deciding how you want to
 manage your system, packages and configurations, and once you're done
 with that, simply install the 3 and try to build your scalable and
 manageable architecture. There are many people on these products'
 mailing lists who have a lot of experience (good and bad) with various
 models and who can probably suggest what to do and what not to do
 depending on your needs.
 
 Regards,
 Willy
 




Re: dynamic upstream configuration

2013-01-04 Thread KT Walrus
HAProxy also does all this stuff too (more or less).  I plan on using HAProxy 
if NGINX doesn't support everything I need done.

I have a preference to use as few software packages as possible so I keep the 
system as simple and maintainable as possible.  Being able to eliminate HAProxy 
would be a plus for me.  I already need NGINX to handle basic web serving, so I 
think I just need a few new features in the load balancing area to go without 
HAProxy.

Kevin

On Jan 4, 2013, at 3:34 PM, Kasper Grubbe kasp...@benjamin.dk wrote:

 Hi Kevin,
 
 Not relevant for the NGINX mailing list, but perhaps for you: Varnish already 
 has most of that functionality; it is missing the MAXCONN feature, but it will 
 avoid backend servers that behave strangely. 
 
 It does so through HTTP status codes. 
 
 You might want to look into it, instead of waiting for this NGINX-feature :-)
 
 
 Greetings,
 
 Kasper Grubbe
 




Re: My Scalable Architecture using HAProxy

2013-01-03 Thread KT Walrus
One more tweak…  I think the frontend LBs could be made to distribute the load 
so that requests go to the backend holding the sessionDB that will be used for 
the request, rather than doing simple RR (by using cookies).  This would keep most 
requests handled entirely by a single backend server.  I kind of like this, 
from an efficiency and simplicity point of view.

Most setups seem to want you to place each individual component of the backend 
(HAProxy, Nginx/Varnish, PHP, and MySQL) in separate VPSs (in a cloud 
architecture).  But, I'm thinking that it will simplify things if I don't use 
virtualization and have each backend capable of handling the entire request.  
If I need more capacity in the backend, I simply add another backend server 
that functions independently of the other backends (except for handling HA in 
times of high load where one backend forwards the excess requests to its next 
neighbor backend).

I do have one problem in my proposed architecture.  A sessionDB could, 
theoretically, get much more than MAXCONN connections (in the worst case, all 
current requests could be using a single sessionDB).  This is because once a 
sessionDB is selected for an individual user, all subsequent requests from that 
user must be handled using this sessionDB.  This means I have to keep MAXCONN 
low enough that if the sessionDB on one backend does have to handle all 
requests to all backends, the server will still function and not be overloaded. 
 It would be nice if this wasn't the case, but I can't think of how to avoid 
this possibility.  If I could, I could probably set MAXCONN to utilize 80% of 
the backend rather than a more conservative 50%, eventually saving significant 
money in scale-out.
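
The cookie-based routing on the frontend LBs would presumably be something 
like this (server names and addresses are placeholders, untested):

    backend app
        balance roundrobin
        # first request is round-robined, then the inserted cookie pins the
        # user to the backend server that holds their sessionDB
        cookie SRV insert indirect nocache
        server backend1 10.0.1.1:80 cookie s1 check
        server backend2 10.0.1.2:80 cookie s2 check
        server backend3 10.0.1.3:80 cookie s3 check
        server backend4 10.0.1.4:80 cookie s4 check

If the pinned server goes down, HAProxy redispatches the user to another 
server, which is exactly the case where the PHP code has to copy the session 
data over.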

On Jan 3, 2013, at 2:56 AM, KT Walrus ke...@my.walr.us wrote:

 Thanks for the reply.
 
 Why installing 2 layers of HAProxy???
 A single one (on the 2 servers is enough).
 
 My thought was that the second layer of HAProxy would ensure that the 
 individual backend server would never have more than MAXCONN requests so I 
 know the server will never be overloaded possibly leading to the server going 
 down or taking too long to process a request.
 
 I want multiple active frontend lbs so that my architecture will scale 
 infinitely to many more frontends if necessary.  If I  eventually needed more 
 than 6 servers, I would set up another 6 servers (using the same setup at 2 
 data centers for additional HA.
 
 Since you're doing SSL, try
 to make it start multiple processes, a single one dedicated to HTTP
 and all other one for cyphering/deciphering processing…
 
 Yes.  I planned on doing that.  My 2 frontend servers are UP (4 cores) while 
 the 4 backend servers can be upgraded to DP (16 cores) and huge RAM (256GBs). 
  I've already purchased these servers.  I expect that 1 frontend server would 
 be sufficient for a long time, but I want HA by having the two frontends on 
 separate independent power/ethernet connections within the datacenter.
 
 I'm not a fan of first algo, unless you pay the resource per number of
 backend server, which is not your case.
 
 I just thought first load balancing was perfect for guarding that an 
 individual backend server never exceeded MAXCONN concurrent requests.  The 
 overhead should be minimal since this guard HAProxy almost always will pass 
 the request to localhost nginx/varnish.  I need this guard because there 
 are multiple frontend LBs doing simple round robin to the backends 
 independently.  This might become more of a possibility when and if I need 
 more LBs independently distributing requests to the backends.
 
 Prefer using a hash in your case (even multiple hash with different
 backends and content switching), that way, your hit rate would be much
 better.
 
 I'm not so concerned about individual hit rate as I am about HA and infinite 
 scalability.  It is relatively cheap to add a new server to handle more 
 backend or frontend load or split to placing some servers in a new 
 datacenter.  I'd rather have my servers run at 50% capacity (purchasing twice 
 the hardware) if that means increased HA from having the guard HAProxy's and 
 never coming close to pushing them too hard that individual pieces of the 
 software/hardware stack start to fail.
 
 no need to host a sorry page on a far away server, host it on your
 frontend LBs and HAProxy can deliver it once your server farm is
 full…
 
 That is true.  I was really thinking that maybe the first Amazon overflow 
 server might be set up to actually have a full backend server if the sorry 
 page ever starts to be served by Amazon, I would simply create one or more 
 EC2 servers to take the temporary load.  I actually plan on implementing the 
 website as EC2 instances (using this architecture) until my Amazon bill goes 
 over $500 a month at which time I would go colo.
 
 An other remark, it may be hard to troubleshoot such infra with 2
 Active/active LBs.
 
 I think I have to deal with this, but since each LB

Re: My Scalable Architecture using HAProxy

2013-01-03 Thread KT Walrus
 This is the role of the first layer of load balancer.
 The maxconn feature makes it smart: as soon as one of your backend
 servers reaches its maxconn, it's pulled out of the LB algorithm
 until the number of connections decreases.
 I definitely would not use 2 layers of HAProxy and would not use
 the first algo either…

I don't quite understand.  Are you saying that if I have 2 or more LBs routing 
requests to the same backends, each LB knows how many requests are 
actually being serviced by each backend (including counting the requests sent to a 
backend by the other LBs)?

I really want my architecture to have multiple active LBs so the architecture 
will infinitely scale.

 note that, an active/active HAProxy can be designed (with tricked
 configuration) to adapt its maxconn based on the number of LBs
 available.

Is this documented anywhere?  I would like to understand how to set this up.  
In the meantime, I don't have a problem with having a guard HAProxy in each 
backend.  I'll have to have an HAProxy instance to handle the HA for the 
sessionDBs anyway, so it shouldn't be that much more to have the extra hop in 
going through the guard.
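
My reading of the "tricked configuration" idea is simply that each HAProxy 
only counts its own connections, so with N active LBs you give each of them 
1/N of what a backend server can really take.  A sketch for 2 LBs and a server 
that can handle about 400 concurrent requests (numbers and names invented, 
untested):

    backend web
        balance roundrobin
        # each LB gets half of the real capacity; excess requests queue here
        # instead of overloading the server
        server backend1 10.0.1.1:80 maxconn 200 check
        server backend2 10.0.1.2:80 maxconn 200 check

The "adaptive" part would then be regenerating these maxconn values and doing 
a soft reload whenever the number of live LBs changes.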

 no, you said your domain name would be propagated over 2 dns A entries...
 so a client could arrive at any time on any LB (modulo the DNS TTL)

Yes.  I understand that.  This is why the architecture doesn't require session 
stickiness to a particular backend.  It just requires that the session use 
the same backend sessionDB for all requests.  So, I hope to have 1 A record in 
the DNS for each active frontend LB and I hope to be able to scale to as many 
frontend LBs as needed for all possible loads.

 To avoid losing traffic, HAProxy could also be used to forward
 traffic to the other DC if all the server in its local DC are
 unavailable.

Actually, if all backend servers in its local DC are unavailable, the frontend 
HAProxy should down itself.  This way, the DNS Failover will kick in and all 
subsequent requests (after the DNS TTL refreshes) will go to the other DC.  The 
frontend HAProxy will only need to forward traffic to the other DC until the 
DNS Failover has kicked in for all users.
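
Concretely, I'm assuming "down itself" can be expressed with a monitor check 
plus a backup server pointing at the other DC, roughly like this (addresses, 
names and the health URI are made up, untested):

    frontend https_in
        bind :443 ssl crt /etc/haproxy/site.pem
        # DNS Failover polls this URI; report failure once no local backend is left
        acl local_dead nbsrv(local_dc) lt 1
        monitor-uri /lb-health
        monitor fail if local_dead
        default_backend local_dc

    backend local_dc
        server backend1 10.0.1.1:80 check
        server backend2 10.0.1.2:80 check
        # last resort while the DNS TTL expires: forward to the other DC
        server other_dc 203.0.113.10:80 backup

So the LB only has to forward traffic to the other DC during the few minutes 
it takes the DNS Failover to propagate.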

It is more likely that some core DC routers fail or a power outage at the DC 
knocks all servers offline, both frontend LBs and backends.  In this case, you 
really need the DNS Failover to route all traffic to the surviving DC to 
minimize user perceived downtime (at DNSMadeEasy, they say that Failover can 
happen fully in under 5 minutes if TTLs are set low and DNS caching respects 
TTLs).

Kevin

On Jan 3, 2013, at 3:45 AM, Baptiste bed...@gmail.com wrote:

 On Thu, Jan 3, 2013 at 8:56 AM, KT Walrus ke...@my.walr.us wrote:
 Thanks for the reply.
 
 Why installing 2 layers of HAProxy???
 A single one (on the 2 servers is enough).
 
 My thought was that the second layer of HAProxy would ensure that the 
 individual backend server would never have more than MAXCONN requests so I 
 know the server will never be overloaded possibly leading to the server 
 going down or taking too long to process a request.
 
 cf below
 
 I'm not a fan of first algo, unless you pay the resource per number of
 backend server, which is not your case.
 
 I just thought first load balancing was perfect for guarding that an 
 individual backend server never exceeded MAXCONN concurrent requests.  The 
 overhead should be minimal since this guard HAProxy almost always will 
 pass the request to localhost nginx/varnish.  I need this guard because 
 there are multiple frontend LBs doing simple round robin to the backends 
 independently.  This might become more of a possibility when and if I need 
 more LBs independently distributing requests to the backends.
 
 
 This is the role of the first layer of load balancer.
 The maxconn feature makes it smart: as soon as one of your backend
 servers reaches its maxconn, it's pulled out of the LB algorithm
 until the number of connections decreases.
 I definitely would not use 2 layers of HAProxy and would not use
 the first algo either...
 
 note that, an active/active HAProxy can be designed (with tricked
 configuration) to adapt its maxconn based on the number of LBs
 available.
 
 
 Prefer using a hash in your case (even multiple hash with different
 backends and content switching), that way, your hit rate would be much
 better.
 
 I'm not so concerned about individual hit rate as I am about HA and infinite 
 scalability.  It is relatively cheap to add a new server to handle more 
 backend or frontend load or split to placing some servers in a new 
 datacenter.  I'd rather have my servers run at 50% capacity (purchasing 
 twice the hardware) if that means increased HA from having the guard 
 HAProxy's and never coming close to pushing them too hard that individual 
 pieces of the software/hardware stack start to fail.
 
 
 This is not a question of cost, this is a question of efficiency and
 response time (on the client side)
 
 An other remark, it may

Re: My Scalable Architecture using HAProxy

2013-01-03 Thread KT Walrus
 basically, you need persistence :)

Well, I only need persistence to optimize traffic flow so the correct sessionDB 
is used (eliminating a network hop).  But, the system will still function 
without persistence (in HAProxy) as the PHP code will know which sessionDB it 
needs to use for a given user.  In this case, persistence can be ensured by the 
PHP code even if HAProxy routes to a suboptimal initial backend.

In the multiple DC case, I will lose persistence if one DC fails.  The 
forwarded requests to the other DC will have to establish a new session in a 
new sessionDB, but DC failure should be rare enough that I don't care about 
this.  My site doesn't need 100% availability, just minimized user perceived 
downtime of minutes rather than hours.

On Jan 3, 2013, at 3:49 AM, Baptiste bed...@gmail.com wrote:

 basically, you need persistence :)
 
 On Thu, Jan 3, 2013 at 9:45 AM, KT Walrus ke...@my.walr.us wrote:
 One more tweak…  I think the frontend LBs could be made to distribute the 
 load so that requests go to the backend that has the sessionDB that will be 
 used for the request rather than simple RR (by using cookies).  This would 
 keep most requests handled entirely by a single backend server.  I kind of 
 like this, from an efficiency and simplicity point of view.
 
 Most setups seem to want you to place each individual component of the 
 backend (HAProxy, Nginx/Varnish, PHP, and MySQL) in separate VPSs (in a 
 cloud architecture).  But, I'm thinking that it will simplify things if I 
 don't use virtualization and have each backend capable of handling the 
 entire request.  If I need more capacity in the backend, I simply add 
 another backend server that functions independently of the other backends 
 (except for handling HA in times of high load where one backend forwards the 
 excess requests to its next neighbor backend).
 
 I do have one problem in my proposed architecture.  A sessionDB could, 
 theoretically, get much more than MAXCONN connections (up to and including 
 all current requests could use a single sessionDB).  This is because once a 
 sessionDB is selected for an individual user, all subsequent request from 
 that user must be handled using this sessionDB.  This means I have to keep 
 MAXCONN low enough that if the sessionDB in the backend does have to handle 
 all requests to all backends, the server will still function and not be 
 overloaded.  It would be nice if this wasn't the case, but I can't think of 
 how to avoid this possibility.  If I could, I could probably set MAXCONN to 
 utilize 80% of the backend rather than a more conservative 50%, eventually, 
 saving significant money in scale out.
 
 On Jan 3, 2013, at 2:56 AM, KT Walrus ke...@my.walr.us wrote:
 
 Thanks for the reply.
 
 Why installing 2 layers of HAProxy???
 A single one (on the 2 servers is enough).
 
 My thought was that the second layer of HAProxy would ensure that the 
 individual backend server would never have more than MAXCONN requests so I 
 know the server will never be overloaded possibly leading to the server 
 going down or taking too long to process a request.
 
 I want multiple active frontend lbs so that my architecture will scale 
 infinitely to many more frontends if necessary.  If I  eventually needed 
 more than 6 servers, I would set up another 6 servers (using the same setup 
 at 2 data centers for additional HA.
 
 Since you're doing SSL, try
 to make it start multiple processes, a single one dedicated to HTTP
 and all other one for cyphering/deciphering processing…
 
 Yes.  I planned on doing that.  My 2 frontend servers are UP (4 cores) 
 while the 4 backend servers can be upgraded to DP (16 cores) and huge RAM 
 (256GBs).  I've already purchased these servers.  I expect that 1 frontend 
 server would be sufficient for a long time, but I want HA by having the two 
 frontends on separate independent power/ethernet connections within the 
 datacenter.
 
 I'm not a fan of first algo, unless you pay the resource per number of
 backend server, which is not your case.
 
 I just thought first load balancing was perfect for guarding that an 
 individual backend server never exceeded MAXCONN concurrent requests.  The 
 overhead should be minimal since this guard HAProxy almost always will 
 pass the request to localhost nginx/varnish.  I need this guard because 
 there are multiple frontend LBs doing simple round robin to the backends 
 independently.  This might become more of a possibility when and if I need 
 more LBs independently distributing requests to the backends.
 
 Prefer using a hash in your case (even multiple hash with different
 backends and content switching), that way, your hit rate would be much
 better.
 
 I'm not so concerned about individual hit rate as I am about HA and 
 infinite scalability.  It is relatively cheap to add a new server to handle 
 more backend or frontend load or split to placing some servers in a new 
 datacenter.  I'd rather have my servers run

Re: My Scalable Architecture using HAProxy

2013-01-03 Thread KT Walrus
 Also Kevin, I don't really know what's the database usage profile of your 
 app, but I'd immediately rule out installing the DB on the web servers, 
 especially having two MySQL instances on *each* machine that will be serving 
 PHP...

Why?

I like the idea that each physical backend server can independently service a 
full request.  Therefore, if the physical server is up and running, likely all 
the services running on the server will be up and running (especially if I 
don't push the server so hard that it is overloaded).

Need to handle more requests?  Simply add another backend server or upgrade the 
specs (and MAXCONN/server weight) of one or more existing servers.

Kevin

On Jan 3, 2013, at 5:06 AM, Pedro Mata-Mouros pedro.matamou...@sapo.pt wrote:

 Also Kevin, I don't really know what's the database usage profile of your 
 app, but I'd immediately rule out installing the DB on the web servers, 
 especially having two MySQL instances on *each* machine that will be serving 
 PHP...
 
 Cheers,
 Pedro.
 
 On 3 Jan 2013, at 09:25, KT Walrus ke...@my.walr.us wrote:
 
 basically, you need persistence :)
 
 Well, I only need persistence to optimize traffic flow so the correct 
 sessionDB is used (eliminating a network hop).  But, the system will still 
 function without persistence (in HAProxy) as the PHP code will know which 
 sessionDB it needs to use for a given user.  In this case, persistence can 
 be ensured by the PHP code even if HAProxy routes to a suboptimal initial 
 backend.
 
 In the multiple DC case, I will lose persistence if one DC fails.  The 
 forwarded requests to the other DC will have to establish a new session in a 
 new sessionDB, but DC failure should be rare enough that I don't care about 
 this.  My site doesn't need 100% availability, just minimized user perceived 
 downtime of minutes rather than hours.
 
 On Jan 3, 2013, at 3:49 AM, Baptiste bed...@gmail.com wrote:
 
 basically, you need persistence :)
 
 On Thu, Jan 3, 2013 at 9:45 AM, KT Walrus ke...@my.walr.us wrote:
 One more tweak…  I think the frontend LBs could be made to distribute the 
 load so that requests go to the backend that has the sessionDB that will 
 be used for the request rather than simple RR (by using cookies).  This 
 would keep most requests handled entirely by a single backend server.  I 
 kind of like this, from an efficiency and simplicity point of view.
 
 Most setups seem to want you to place each individual component of the 
 backend (HAProxy, Nginx/Varnish, PHP, and MySQL) in separate VPSs (in a 
 cloud architecture).  But, I'm thinking that it will simplify things if 
 I don't use virtualization and have each backend capable of handling the 
 entire request.  If I need more capacity in the backend, I simply add 
 another backend server that functions independently of the other backends 
 (except for handling HA in times of high load where one backend forwards 
 the excess requests to its next neighbor backend).
 
 I do have one problem in my proposed architecture.  A sessionDB could, 
 theoretically, get much more than MAXCONN connections (up to and including 
 all current requests could use a single sessionDB).  This is because once 
 a sessionDB is selected for an individual user, all subsequent request 
 from that user must be handled using this sessionDB.  This means I have to 
 keep MAXCONN low enough that if the sessionDB in the backend does have to 
 handle all requests to all backends, the server will still function and 
 not be overloaded.  It would be nice if this wasn't the case, but I can't 
 think of how to avoid this possibility.  If I could, I could probably set 
 MAXCONN to utilize 80% of the backend rather than a more conservative 50%, 
 eventually, saving significant money in scale out.
 
 On Jan 3, 2013, at 2:56 AM, KT Walrus ke...@my.walr.us wrote:
 
 Thanks for the reply.
 
 Why installing 2 layers of HAProxy???
 A single one (on the 2 servers is enough).
 
 My thought was that the second layer of HAProxy would ensure that the 
 individual backend server would never have more than MAXCONN requests so 
 I know the server will never be overloaded possibly leading to the server 
 going down or taking too long to process a request.
 
 I want multiple active frontend lbs so that my architecture will scale 
 infinitely to many more frontends if necessary.  If I  eventually needed 
 more than 6 servers, I would set up another 6 servers (using the same 
 setup at 2 data centers for additional HA.
 
 Since you're doing SSL, try
 to make it start multiple processes, a single one dedicated to HTTP
 and all other one for cyphering/deciphering processing…
 
 Yes.  I planned on doing that.  My 2 frontend servers are UP (4 cores) 
 while the 4 backend servers can be upgraded to DP (16 cores) and huge RAM 
 (256GBs).  I've already purchased these servers.  I expect that 1 
 frontend server would be sufficient for a long time, but I want HA by 
 having the two frontends on separate

Re: Cookie Persistence and Backend Recognition of Server Change

2013-01-03 Thread KT Walrus
Never mind.  I solved my problem by having the backend save the sessionDB server 
id in its SESSION_ID cookie.  If the SESSION_ID cookie doesn't carry the same server id 
as the localhost sessionDB, the backend knows a change has been made, and it will first 
copy the session data out of the read-only slave sessionDB to the localhost 
sessionDB (updating the SESSION_ID cookie) before proceeding to handle the 
request.

On Jan 3, 2013, at 12:47 PM, Kevin Heatwole ke...@heatwoles.us wrote:

 I'm thinking of using cookie persistence to stick a user to the same backend 
 (if available) for all requests coming from the user.
 
 But, I need to handle the case where HAProxy switches the user to a different 
 backend (because the original backend has gone offline or MAXCONN reached) 
 than the one saved in the cookie.  
 
 My question is:  Can the backends tell when the frontend has switched to a 
 different backend server than the one saved in the cookie?
 
 I assume so, but I'm wondering how to do this.  Have the backend save the 
 frontend cookie value in another cookie, if the frontend cookie has changed?  
 Or, is it simpler than this and the frontend can set a request attribute 
 (X-Server-Changed?) that the backend simply checks?
 
 I need to copy previous session data to the new backend sessionDB (from the 
 slave sessionDB backup) to continue processing the user requests 
 uninterrupted on the new backend.
 
 Kevin
 
 




My Scalable Architecture using HAProxy

2013-01-02 Thread KT Walrus
I'm setting up a new website in the next month or two.  Even though the traffic 
won't require a scalable HA website, I'm going to start out as if the website 
needs to support huge traffic so I can get some experience running such a 
website.

I'd like any feedback on what I am thinking of doing…

As for hardware, I am colocating 6 servers at this time and plan to use Amazon 
S3 to host the static files (which should grow quickly to 1TB or 2TB of mostly 
images).  2 of the servers are going to be my frontend load balancers running 
haproxy.  The remaining 4 servers will be nginx/varnish servers (nginx for the 
PHP/MySQL part of the site and varnish to cache the Amazon S3 files to save 
bandwidth charges by Amazon).

I plan on doing DNS load balancing using pairs of A records for each hosted 
domain, pointing to each of my frontend haproxy load balancers.  Most 
traffic will be HTTPS, so I plan on having the frontend load balancers 
handle the SSL (using the new haproxy support for SSL).

The two load balancers will proxy to the 4 backend servers.  These 4 backend 
servers will run haproxy in front of nginx/varnish with the first load 
balancing algorithm and a suitable MAXCONN.  Server 1's haproxy will first route to the 
localhost nginx/varnish and, when MAXCONN connections are active to the 
localhost, will forward the connection to Server 2's haproxy.  Servers 2 and 3 
will be set up similarly to first route requests to localhost and, when full, 
route subsequent requests to the next server.  Server 4 will route excess 
requests to a small Amazon EC2 instance that returns a "servers are all busy" 
page.  Hopefully, I will be able to add a 5th backend server at Amazon to 
handle the overload if it looks like I really do have traffic that will fill 
all 4 backend servers that I am colo'ing (I don't really expect this to ever be 
necessary).
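
As a sketch, the guard instance on backend server 1 would look something like 
this, with servers 2-4 configured the same way but pointing at their own next 
neighbour, and server 4 pointing at the EC2 sorry-page instance (addresses, 
ports and the maxconn value are placeholders, untested):

    frontend guard_in
        mode http
        bind :80
        default_backend guard

    backend guard
        balance first
        # fill the local nginx/varnish up to its maxconn, then spill over
        server local      127.0.0.1:8080 maxconn 100 check
        server neighbour2 10.0.1.2:80    check

With "first", the overflow server only sees traffic once the local one has 
reached its maxconn.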

Nginx will proxy to PHP on localhost and each localhost (of my 4 backend 
servers) will have 2 MySQL instances - one for the main Read-Only DB and one 
for a Read-Write SessionDB.  PHP will go directly to the main DB (not through 
HAProxy) and will use HAProxy to select the proper SessionDB to use (each user 
session must use the same SessionDB so the one a request needs might be on any 
of the backend servers).  Each SessionDB will be the master of one slave 
SessionDB on a different backend server for handling the failure of the master 
(haproxy will send requests to the slave SessionDB if the master is down or  
failing).
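
I'm assuming the sessionDB selection through HAProxy is just a set of small 
TCP listeners, one per sessionDB, each with the master first and its slave as 
backup.  Something like this for sessionDB 1 (addresses, ports and the check 
user are invented, and failing writes over to a read-only slave would still 
need handling on the MySQL side; untested):

    listen sessiondb1
        mode tcp
        bind 127.0.0.1:3307
        option mysql-check user haproxy_check
        server master 10.0.1.1:3306 check
        server slave  10.0.1.2:3306 check backup

PHP would then connect to 127.0.0.1:3307 and always reach whichever copy of 
sessionDB 1 is currently alive.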

So, each backend server will run haproxy using first to balance HTTP to 
nginx/varnish.  The backends also have PHP and 3 instances of MySQL (one for 
the mainDB, one for the master sessionDB, and one for another backend's slave 
sessionDB).

Also, the 2 frontend servers will be running separate instances of haproxy.  I 
hope to use keepalived to route the VIPs for one frontend to the other frontend 
in case of failure.  Or should I use heartbeat?  There seem to be two HA 
solutions here.

I know this is a very long description of what I am thinking of doing, and I 
thank you if you have read this far.  I'm looking for any comments on this 
setup.  In particular, comments on using first load balancing/MAXCONN on the 
backend servers so that a request load balanced from the frontend will not 
overload the backend servers (possibly bouncing a request from server 1 to 
server 2 to server 3 to server 4 to the EC2 server busy server) are especially 
appreciated.  Also, any comments on using pairs of master/slave sessionDBs to 
provide high availability while still having session data saved/retrieved for a 
given user from the same DB are appreciated.  I believe this setup will allow 
the load to be distributed evenly over the 4 backends and let the frontend 
load balancers do simple round robin without session stickiness.

Kevin




Re: My Scalable Architecture using HAProxy

2013-01-02 Thread KT Walrus
Thanks for the reply.

 Why installing 2 layers of HAProxy???
 A single one (on the 2 servers is enough).

My thought was that the second layer of HAProxy would ensure that an 
individual backend server never has more than MAXCONN requests, so I know 
the server will never be overloaded to the point of going down 
or taking too long to process a request.

I want multiple active frontend LBs so that my architecture will scale 
infinitely to many more frontends if necessary.  If I eventually needed more 
than 6 servers, I would set up another 6 servers (using the same setup at 2 
data centers for additional HA).

 Since you're doing SSL, try
 to make it start multiple processes, a single one dedicated to HTTP
 and all other one for cyphering/deciphering processing…

Yes.  I planned on doing that.  My 2 frontend servers are UP (4 cores) while 
the 4 backend servers can be upgraded to DP (16 cores) and huge RAM (256GBs).  
I've already purchased these servers.  I expect that 1 frontend server would be 
sufficient for a long time, but I want HA by having the two frontends on 
separate independent power/ethernet connections within the datacenter.

 I'm not a fan of first algo, unless you pay the resource per number of
 backend server, which is not your case.

I just thought first load balancing was perfect for guaranteeing that an 
individual backend server never exceeds MAXCONN concurrent requests.  The 
overhead should be minimal since this guard HAProxy will almost always pass 
the request to the localhost nginx/varnish.  I need this guard because there are 
multiple frontend LBs doing simple round robin to the backends independently.  
This might become more of a possibility when and if I need more LBs 
independently distributing requests to the backends.

 Prefer using a hash in your case (even multiple hash with different
 backends and content switching), that way, your hit rate would be much
 better.

I'm not so concerned about individual hit rate as I am about HA and infinite 
scalability.  It is relatively cheap to add a new server to handle more backend 
or frontend load, or to split off some servers into a new datacenter.  I'd 
rather have my servers run at 50% capacity (purchasing twice the hardware) if 
that means increased HA from having the guard HAProxys and never coming close 
to pushing them so hard that individual pieces of the software/hardware stack 
start to fail.

 no need to host a sorry page on a far away server, host it on your
 frontend LBs and HAProxy can deliver it once your server farm is
 full…

That is true.  I was really thinking that the first Amazon overflow 
server might actually be set up as a full backend server: if the sorry page 
ever starts to be served by Amazon, I would simply create one or more EC2 
servers to take the temporary load.  I actually plan on implementing the 
website as EC2 instances (using this architecture) until my Amazon bill goes 
over $500 a month, at which time I would go colo.
 
 An other remark, it may be hard to troubleshoot such infra with 2
 Active/active LBs.

I think I have to deal with this, but since each LB is handling unique VIPs 
(unless keepalived kicks in due to failure), I don't think there is going to be 
that much trouble.

 And using DNS rr does not prevent you from using keepalived to ensure
 HA between your 2 HAProxys.

Yes.  I am hoping this is the case.  I eventually want at least two geographic 
locations (east and west coast data centers), so 4 IPs in the DNS to distribute 
users to the closest datacenter.  I use DNSMadeEasy, which supports both DNS Global 
Traffic Director (east coast and west coast IP Anycast) and DNS Failover 
(in case one datacenter goes offline).

 
 cheers
 
 
 On Thu, Jan 3, 2013 at 12:20 AM, KT Walrus ke...@my.walr.us wrote:
 I'm setting up a new website in the next month or two.  Even though the 
 traffic won't require a scalable HA website, I'm going to start out as if 
 the website needs to support huge traffic so I can get some experience 
 running such a website.
 
 I'd like any feedback on what I am thinking of doing…
 
 As for hardware, I am colocating 6 servers at this time and plan to use 
 Amazon S3 to host the static files (which should grow quickly to 1TB or 2TB 
 of mostly images).  2 of the servers are going to be my frontend load 
 balancers running haproxy.  The remaining 4 servers will be nginx/varnish 
 servers (nginx for the PHP/MySQL part of the site and varnish to cache the 
 Amazon S3 files to save bandwidth charges by Amazon).
 
 I plan on doing DNS load balancing using pairs of A records for each hosted 
 domain that will point to each of my frontend haproxy load balancers.  Most 
 traffic will be HTTPS, so I plan on having the frontend load balancers to 
 handle the SSL (using the new haproxy support for SSL).
 
 The two load balancers will proxy to the 4 backend servers.  These 4 backend 
 servers will run haproxy in front of nginx/varnish with load

Re: SPDY support?

2012-05-08 Thread KT Walrus

On May 8, 2012, at 2:01 PM, Willy Tarreau wrote:

 That's why with the guys from Squid, Varnish and Wingate we presented
 a concurrent proposal to the IETF one month ago:
 
  http://tools.ietf.org/html/draft-tarreau-httpbis-network-friendly-00
 

I hope that HTTP 2.0 requires encryption/compression for all traffic.  

Also, I would hope that geographic/distributed load balancing is better 
addressed in the protocol.  That is, any request could get forwarded to another 
IP immediately (along with any session data needed by the new server), with a 
short response sent back to the client (if the new server accepts the request) 
containing a unique Request ID and the IP the client should connect to for the 
response.  The client would, on seeing this redirect response, connect to that 
IP with the Request ID to get the response.  Subsequent requests from the 
client would be made to the new IP for the given host, and could be changed 
again.

I'm thinking this could make geographic load balancing easy without using DNS 
to make the geo decisions based only on source IP.  And this might really help 
with DDoS attack mitigation, in that a server/haproxy could easily transfer 
authenticated users (e.g., users logged in to the site) to separate networks 
(that only accept authenticated requests) while severely limiting the connection 
rate to the domain's DNS IP.

Kevin