basically, you need persistence :)
On Thu, Jan 3, 2013 at 9:45 AM, KT Walrus <ke...@my.walr.us> wrote:
> One more tweak… I think the frontend LBs could be made to distribute the
> load so that requests go to the backend that has the sessionDB that will be
> used for the request rather than simple RR (by using cookies). This would
> keep most requests handled entirely by a single backend server. I kind of
> like this, from an efficiency and simplicity point of view.
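>
> Roughly what I have in mind on the frontend LBs (server names and
> addresses below are placeholders): HAProxy inserts a cookie naming the
> chosen backend on the first response, and subsequent requests from that
> user stick to it:
>
>   backend web_backends
>       balance roundrobin
>       cookie SRVID insert indirect nocache
>       server be1 192.0.2.1:80 cookie be1 check
>       server be2 192.0.2.2:80 cookie be2 check
>       server be3 192.0.2.3:80 cookie be3 check
>       server be4 192.0.2.4:80 cookie be4 check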
>
> Most setups seem to want you to place each individual component of the
> backend (HAProxy, Nginx/Varnish, PHP, and MySQL) in separate VPSs (in a
> "cloud" architecture). But, I'm thinking that it will simplify things if I
> don't use virtualization and have each backend capable of handling the entire
> request. If I need more capacity in the backend, I simply add another
> backend server that functions independently of the other backends (except for
> handling HA in times of high load where one backend forwards the excess
> requests to its next neighbor backend).
>
> I do have one problem in my proposed architecture. A sessionDB could,
> theoretically, get many more than MAXCONN connections (in the worst case,
> all current requests could use a single sessionDB). This is because once a
> sessionDB is selected for an individual user, all subsequent requests from
> that user must be handled using this sessionDB. This means I have to keep
> MAXCONN low enough that if the sessionDB in the backend does have to handle
> all requests to all backends, the server will still function and not be
> overloaded. It would be nice if this wasn't the case, but I can't think of
> how to avoid this possibility. If I could, I could probably set MAXCONN to
> utilize 80% of each backend rather than a more conservative 50%, eventually
> saving significant money in scaling out.
>
> On Jan 3, 2013, at 2:56 AM, KT Walrus <ke...@my.walr.us> wrote:
>
>> Thanks for the reply.
>>
>>> Why installing 2 layers of HAProxy???
>>> A single one (on the 2 servers) is enough.
>>
>> My thought was that the second layer of HAProxy would ensure that an
>> individual backend server never has more than MAXCONN requests, so I know
>> the server can't be overloaded to the point of going down or taking too
>> long to process a request.
>>
>> I want multiple active frontend LBs so that my architecture will scale
>> infinitely to many more frontends if necessary. If I eventually needed
>> more than 6 servers, I would set up another 6 servers (using the same setup
>> at 2 data centers for additional HA).
>>
>>> Since you're doing SSL, try
>>> to make it start multiple processes, a single one dedicated to HTTP
>>> and all the others for cyphering/deciphering processing…
>>
>> Yes. I planned on doing that. My 2 frontend servers are UP (4 cores) while
>> the 4 backend servers can be upgraded to DP (16 cores) and huge RAM
>> (256GBs). I've already purchased these servers. I expect that 1 frontend
>> server would be sufficient for a long time, but I want HA by having the two
>> frontends on separate independent power/ethernet connections within the
>> datacenter.
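>>
>> For the SSL split, something like this is what I had in mind (cert path
>> and process counts are placeholders; I still need to verify the exact
>> syntax in 1.5-dev):
>>
>>   global
>>       nbproc 4
>>
>>   frontend https_in
>>       bind-process 2-4
>>       bind :443 ssl crt /etc/haproxy/site.pem
>>
>>   frontend http_in
>>       bind-process 1
>>       bind :80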
>>
>>> I'm not a fan of first algo, unless you pay the resource per number of
>>> backend server, which is not your case.
>>
>> I just thought "first" load balancing was perfect for "guarding" that an
>> individual backend server never exceeded MAXCONN concurrent requests. The
>> overhead should be minimal since this "guard" HAProxy almost always will
>> pass the request to localhost nginx/varnish. I need this "guard" because
>> there are multiple frontend LBs doing simple round robin to the backends
>> independently. This might become more of a possibility when and if I need
>> more LBs independently distributing requests to the backends.
>>
>>> Prefer using a hash in your case (even multiple hash with different
>>> backends and content switching), that way, your hit rate would be much
>>> better.
>>
>> I'm not so concerned about individual hit rate as I am about HA and infinite
>> scalability. It is relatively cheap to add a new server to handle more
>> backend or frontend load or split to placing some servers in a new
>> datacenter. I'd rather have my servers run at 50% capacity (purchasing
>> twice the hardware) if that means increased HA from having the guard
>> HAProxys and never pushing them so hard that individual pieces of the
>> software/hardware stack start to fail.
>>
>>> no need to host a sorry page on a far away server, host it on your
>>> frontend LBs and HAProxy can deliver it once your server farm is
>>> full…
>>
>> That is true. I was really thinking that the first Amazon "overflow"
>> server might be set up as a full backend server: if the sorry page ever
>> starts to be served by Amazon, I would simply create one or more EC2
>> servers to take the temporary load. I actually plan on implementing the
>> website as EC2 instances (using this architecture) until my Amazon bill goes
>> over $500 a month at which time I would go colo.
>>
>>> Another remark: it may be hard to troubleshoot such infra with 2
>>> Active/active LBs.
>>
>> I think I have to deal with this, but since each LB is handling unique VIPs
>> (unless keepalived kicks in due to failure), I don't think there is going to
>> be that much trouble.
>>
>>> And using DNS rr does not prevent you from using keepalived to ensure
>>> HA between your 2 HAProxys.
>>
>> Yes. I am hoping this is the case. I eventually want at least two
>> geographic locations (east and west coast data centers) so 4 IPs in the DNS
>> to distribute to the closest datacenter. I use DNSMadeEasy which can
>> support both DNS Global Traffic Director (east coast and west coast IP
>> Anycast) and DNS Failover (in case one datacenter goes offline).
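>>
>> For keepalived, I'm picturing something like this on frontend 1 (VIP,
>> interface, and priority are placeholders; frontend 2 would be MASTER
>> for the second VIP and BACKUP for this one):
>>
>>   vrrp_instance VI_1 {
>>       state MASTER
>>       interface eth0
>>       virtual_router_id 51
>>       priority 101
>>       advert_int 1
>>       virtual_ipaddress {
>>           203.0.113.10
>>       }
>>   }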
>>
>>>
>>> cheers
>>>
>>>
>>> On Thu, Jan 3, 2013 at 12:20 AM, KT Walrus <ke...@my.walr.us> wrote:
>>>> I'm setting up a new website in the next month or two. Even though the
>>>> traffic won't require a scalable HA website, I'm going to start out as if
>>>> the website needs to support huge traffic so I can get some experience
>>>> running such a website.
>>>>
>>>> I'd like any feedback on what I am thinking of doing…
>>>>
>>>> As for hardware, I am colocating 6 servers at this time and plan to use
>>>> Amazon S3 to host the static files (which should grow quickly to 1TB or
>>>> 2TB of mostly images). 2 of the servers are going to be my frontend load
>>>> balancers running haproxy. The remaining 4 servers will be nginx/varnish
>>>> servers (nginx for the PHP/MySQL part of the site and varnish to cache the
>>>> Amazon S3 files to save bandwidth charges by Amazon).
>>>>
>>>> I plan on doing DNS load balancing using pairs of A records for each
>>>> hosted domain that will point to each of my frontend haproxy load
>>>> balancers. Most traffic will be HTTPS, so I plan on having the frontend
>>>> load balancers handle the SSL (using the new haproxy support for SSL).
>>>>
>>>> The two load balancers will proxy to the 4 backend servers. These 4
>>>> backend servers will run haproxy in front of nginx/varnish with load
>>>> balancing of "first" and a suitable MAXCONN. Server 1 haproxy will first
>>>> route to the localhost nginx/varnish and when MAXCONN connections are
>>>> active to the localhost, will forward the connection to Server 2 haproxy.
>>>> Servers 2 and 3 will be set up similarly, first routing requests to
>>>> localhost and, when full, routing subsequent requests to the next server.
>>>> Server 4 will route excess requests to a small Amazon EC2 instance to
>>>> return a "servers are all busy" page. Hopefully, I will be able to add a
>>>> 5th backend server at Amazon to handle the overload if it looks like I
>>>> really do have traffic that will fill all 4 backend servers that I am
>>>> colo'ing (I don't really expect this to ever be necessary).
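>>>>
>>>> On backend server 1, I'm thinking the guard haproxy would look roughly
>>>> like this (addresses, ports, and the maxconn value are placeholders);
>>>> servers 2 and 3 would be the same with "next" pointing at their
>>>> neighbor, and server 4's "next" would point at the EC2 sorry server:
>>>>
>>>>   listen web_guard
>>>>       bind :80
>>>>       balance first
>>>>       server local 127.0.0.1:8080 maxconn 500 check
>>>>       server next 192.0.2.2:80 check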
>>>>
>>>> Nginx will proxy to PHP on localhost and each localhost (of my 4 backend
>>>> servers) will have 2 MySQL instances - one for the main Read-Only DB and
>>>> one for a Read-Write SessionDB. PHP will go directly to the main DB (not
>>>> through HAProxy) and will use HAProxy to select the proper SessionDB
>>>> (each user session must use the same SessionDB, and the one a given
>>>> request needs might be on any of the backend servers). Each SessionDB will be the
>>>> master of one slave SessionDB on a different backend server for handling
>>>> the failure of the master (haproxy will send requests to the slave
>>>> SessionDB if the master is down or failing).
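>>>>
>>>> For the sessionDB selection, PHP would connect to a local haproxy port
>>>> per sessionDB, roughly like this for sessionDB 1 (addresses and the
>>>> check user are placeholders):
>>>>
>>>>   listen sessiondb1
>>>>       bind 127.0.0.1:3307
>>>>       mode tcp
>>>>       option mysql-check user haproxy_check
>>>>       server master1 192.0.2.1:3306 check
>>>>       server slave1 192.0.2.2:3306 check backup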
>>>>
>>>> So, each backend server will have haproxy to "first" balance HTTP to
>>>> nginx/varnish. The backends also have PHP and 3 instances of MySQL (one
>>>> for mainDB, one for master sessionDB, and one for another backend's slave
>>>> sessionDB).
>>>>
>>>> Also, the 2 frontend servers will be running separate instances of
>>>> haproxy. I hope to use keepalived to route the VIPs for one frontend to
>>>> the other frontend in case of failure. Or, should I use heartbeat? There
>>>> seem to be two HA solutions here.
>>>>
>>>> I know this is a very long description of what I am thinking of doing and
>>>> I thank you if you have read this far. I'm looking for any comments on
>>>> this setup. Comments on using "first" load balancing/MAXCONN on the
>>>> backend servers to keep them from overloading (possibly bouncing a
>>>> request from server 1 to server 2 to server 3 to server 4 to the EC2
>>>> "server busy" server) are especially appreciated. Also, any comments
>>>> on using pairs of master/slave sessionDBs to provide high availability but
>>>> still have session data saved/retrieved for a given user from the same DB
>>>> are appreciated. I believe this setup will allow the load to be
>>>> distributed evenly over the 4 backends and only have the front end load
>>>> balancers do simple round robin without session stickiness.
>>>>
>>>> Kevin
>>>>
>>>>
>>
>