basically, you need persistence :)
On Thu, Jan 3, 2013 at 9:45 AM, KT Walrus <ke...@my.walr.us> wrote:
> One more tweak… I think the frontend LBs could be made to distribute the
> load so that requests go to the backend that has the sessionDB that will be
> used for the request rather than simple RR (by using cookies). This would
> keep most requests handled entirely by a single backend server. I kind of
> like this, from an efficiency and simplicity point of view.
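>
> Roughly what I have in mind on the frontend LBs (server names and
> addresses below are placeholders): HAProxy inserts a cookie naming the
> chosen backend on the first response, and subsequent requests from that
> user stick to it:
>
>   backend web_backends
>       balance roundrobin
>       cookie SRVID insert indirect nocache
>       server be1 192.0.2.1:80 cookie be1 check
>       server be2 192.0.2.2:80 cookie be2 check
>       server be3 192.0.2.3:80 cookie be3 check
>       server be4 192.0.2.4:80 cookie be4 check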
>
> Most setups seem to want you to place each individual component of the
> backend (HAProxy, Nginx/Varnish, PHP, and MySQL) in separate VPSs (in a
> "cloud" architecture). But, I'm thinking that it will simplify things if I
> don't use virtualization and have each backend capable of handling the entire
> request. If I need more capacity in the backend, I simply add another
> backend server that functions independently of the other backends (except for
> handling HA in times of high load where one backend forwards the excess
> requests to its next neighbor backend).
>
> I do have one problem in my proposed architecture. A sessionDB could,
> theoretically, get many more than MAXCONN connections (in the worst case,
> all current requests could use a single sessionDB). This is because once a
> sessionDB is selected for an individual user, all subsequent requests from
> that user must be handled using this sessionDB. This means I have to keep
> MAXCONN low enough that if the sessionDB in the backend does have to handle
> all requests to all backends, the server will still function and not be
> overloaded. It would be nice if this wasn't the case, but I can't think of
> how to avoid this possibility. If I could, I could probably set MAXCONN to
> utilize 80% of each backend rather than a more conservative 50%, eventually
> saving significant money in scaling out.
>
> On Jan 3, 2013, at 2:56 AM, KT Walrus <ke...@my.walr.us> wrote:
>
>> Thanks for the reply.
>>
>>> Why installing 2 layers of HAProxy???
>>> A single one (on the 2 servers) is enough.
>>
>> My thought was that the second layer of HAProxy would ensure that an
>> individual backend server never has more than MAXCONN requests, so I know
>> the server can't be overloaded to the point of going down or taking too
>> long to process a request.
>>
>> I want multiple active frontend LBs so that my architecture will scale
>> infinitely to many more frontends if necessary. If I eventually needed
>> more than 6 servers, I would set up another 6 servers (using the same setup
>> at 2 data centers for additional HA).
>>
>>> Since you're doing SSL, try
>>> to make it start multiple processes, a single one dedicated to HTTP
>>> and all the others for cyphering/deciphering processing…
>>
>> Yes. I planned on doing that. My 2 frontend servers are UP (4 cores) while
>> the 4 backend servers can be upgraded to DP (16 cores) and huge RAM
>> (256GBs). I've already purchased these servers. I expect that 1 frontend
>> server would be sufficient for a long time, but I want HA by having the two
>> frontends on separate independent power/ethernet connections within the
>> datacenter.
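>>
>> For the SSL split, something like this is what I had in mind (cert path
>> and process counts are placeholders; I still need to verify the exact
>> syntax in 1.5-dev):
>>
>>   global
>>       nbproc 4
>>
>>   frontend https_in
>>       bind-process 2-4
>>       bind :443 ssl crt /etc/haproxy/site.pem
>>
>>   frontend http_in
>>       bind-process 1
>>       bind :80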
>>
>>> I'm not a fan of first algo, unless you pay the resource per number of
>>> backend server, which is not your case.
>>
>> I just thought "first" load balancing was perfect for "guarding" that an
>> individual backend server never exceeded MAXCONN concurrent requests. The
>> overhead should be minimal since this "guard" HAProxy almost always will
>> pass the request to localhost nginx/varnish. I need this "guard" because
>> there are multiple frontend LBs doing simple round robin to the backends
>> independently. This might become more of a possibility when and if I need
>> more LBs independently distributing requests to the backends.
>>
>>> Prefer using a hash in your case (even multiple hash with different
>>> backends and content switching), that way, your hit rate would be much
>>> better.
>>
>> I'm not so concerned about individual hit rate as I am about HA and infinite
>> scalability. It is relatively cheap to add a new server to handle more
>> backend or frontend load or split to placing some servers in a new
>> datacenter. I'd rather have my servers run at 50% capacity (purchasing
>> twice the hardware) if that means increased HA from having the guard
>> HAProxys and never pushing them so hard that individual pieces of the
>> software/hardware stack start to fail.
>>
>>> no need to host a sorry page on a far away server, host it on your
>>> frontend LBs and HAProxy can deliver it once your server farm is
>>> full…
>>
>> That is true. I was really thinking that the first Amazon "overflow"
>> server might be set up as a full backend server: if the sorry page ever
>> starts to be served by Amazon, I would simply create one or more EC2
>> servers to take the temporary load. I actually plan on implementing the
>> website as EC2 instances (using this architecture) until my Amazon bill goes
>> over $500 a month at which time I would go colo.
>>
>>> Another remark: it may be hard to troubleshoot such infra with 2
>>> Active/active LBs.
>>
>> I think I have to deal with this, but since each LB is handling unique VIPs
>> (unless keepalived kicks in due to failure), I don't think there is going to
>> be that much trouble.
>>
>>> And using DNS rr does not prevent you from using keepalived to ensure
>>> HA between your 2 HAProxys.
>>
>> Yes. I am hoping this is the case. I eventually want at least two
>> geographic locations (east and west coast data centers) so 4 IPs in the DNS
>> to distribute to the closest datacenter. I use DNSMadeEasy which can
>> support both DNS Global Traffic Director (east coast and west coast IP
>> Anycast) and DNS Failover (in case one datacenter goes offline).
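>>
>> For keepalived, I'm picturing something like this on frontend 1 (VIP,
>> interface, and priority are placeholders; frontend 2 would be MASTER
>> for the second VIP and BACKUP for this one):
>>
>>   vrrp_instance VI_1 {
>>       state MASTER
>>       interface eth0
>>       virtual_router_id 51
>>       priority 101
>>       advert_int 1
>>       virtual_ipaddress {
>>           203.0.113.10
>>       }
>>   }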
>>
>>>
>>> cheers
>>>
>>>
>>> On Thu, Jan 3, 2013 at 12:20 AM, KT Walrus <ke...@my.walr.us> wrote:
>>>> I'm setting up a new website in the next month or two. Even though the
>>>> traffic won't require a scalable HA website, I'm going to start out as if
>>>> the website needs to support huge traffic so I can get some experience
>>>> running such a website.
>>>>
>>>> I'd like any feedback on what I am thinking of doing…
>>>>
>>>> As for hardware, I am colocating 6 servers at this time and plan to use
>>>> Amazon S3 to host the static files (which should grow quickly to 1TB or
>>>> 2TB of mostly images). 2 of the servers are going to be my frontend load
>>>> balancers running haproxy. The remaining 4 servers will be nginx/varnish
>>>> servers (nginx for the PHP/MySQL part of the site and varnish to cache the
>>>> Amazon S3 files to save bandwidth charges by Amazon).
>>>>
>>>> I plan on doing DNS load balancing using pairs of A records for each
>>>> hosted domain that will point to each of my frontend haproxy load
>>>> balancers. Most traffic will be HTTPS, so I plan on having the frontend
>>>> load balancers handle the SSL (using the new haproxy support for SSL).
>>>>
>>>> The two load balancers will proxy to the 4 backend servers. These 4
>>>> backend servers will run haproxy in front of nginx/varnish with load
>>>> balancing of "first" and a suitable MAXCONN. Server 1 haproxy will first
>>>> route to the localhost nginx/varnish and when MAXCONN connections are
>>>> active to the localhost, will forward the connection to Server 2 haproxy.
>>>> Servers 2 and 3 will be set up similarly, first routing requests to
>>>> localhost and, when full, routing subsequent requests to the next server.
>>>> Server 4 will route excess requests to a small Amazon EC2 instance to
>>>> return a "servers are all busy" page. Hopefully, I will be able to add a
>>>> 5th backend server at Amazon to handle the overload if it looks like I
>>>> really do have traffic that will fill all 4 backend servers that I am
>>>> colo'ing (I don't really expect this to ever be necessary).
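>>>>
>>>> On backend server 1, I'm thinking the guard haproxy would look roughly
>>>> like this (addresses, ports, and the maxconn value are placeholders);
>>>> servers 2 and 3 would be the same with "next" pointing at their
>>>> neighbor, and server 4's "next" would point at the EC2 sorry server:
>>>>
>>>>   listen web_guard
>>>>       bind :80
>>>>       balance first
>>>>       server local 127.0.0.1:8080 maxconn 500 check
>>>>       server next 192.0.2.2:80 check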
>>>>
>>>> Nginx will proxy to PHP on localhost and each localhost (of my 4 backend
>>>> servers) will have 2 MySQL instances - one for the main Read-Only DB and
>>>> one for a Read-Write SessionDB. PHP will go directly to the main DB (not
>>>> through HAProxy) and will use HAProxy to select the proper SessionDB
>>>> (each user session must use the same SessionDB, and the one a given
>>>> request needs might be on any of the backend servers). Each SessionDB will be the
>>>> master of one slave SessionDB on a different backend server for handling
>>>> the failure of the master (haproxy will send requests to the slave
>>>> SessionDB if the master is down or failing).
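>>>>
>>>> For the sessionDB selection, PHP would connect to a local haproxy port
>>>> per sessionDB, roughly like this for sessionDB 1 (addresses and the
>>>> check user are placeholders):
>>>>
>>>>   listen sessiondb1
>>>>       bind 127.0.0.1:3307
>>>>       mode tcp
>>>>       option mysql-check user haproxy_check
>>>>       server master1 192.0.2.1:3306 check
>>>>       server slave1 192.0.2.2:3306 check backup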
>>>>
>>>> So, each backend server will have haproxy to "first" balance HTTP to
>>>> nginx/varnish. The backends also have PHP and 3 instances of MySQL (one
>>>> for mainDB, one for master sessionDB, and one for another backend's slave
>>>> sessionDB).
>>>>
>>>> Also, the 2 frontend servers will be running separate instances of
>>>> haproxy. I hope to use keepalived to route the VIPs for one frontend to
>>>> the other frontend in case of failure. Or, should I use heartbeat? There
>>>> seem to be two HA solutions here.
>>>>
>>>> I know this is a very long description of what I am thinking of doing and
>>>> I thank you if you have read this far. I'm looking for any comments on
>>>> this setup. Comments on using "first" load balancing/MAXCONN on the
>>>> backend servers to keep them from overloading (possibly bouncing a
>>>> request from server 1 to server 2 to server 3 to server 4 to the EC2
>>>> "server busy" server) are especially appreciated. Also, any comments
>>>> on using pairs of master/slave sessionDBs to provide high availability but
>>>> still have session data saved/retrieved for a given user from the same DB
>>>> are appreciated. I believe this setup will allow the load to be
>>>> distributed evenly over the 4 backends and only have the front end load
>>>> balancers do simple round robin without session stickiness.
>>>>
>>>> Kevin
>>>>
>>>>
>>
>