Hello, I am wondering what the best design is for a high-volume RADIUS system. We are looking at on the order of 100-150 requests/second (auth + acct) on average. Does anyone here have a load-balancing system set up? If so, I'd appreciate any tips on how you set it up.

After using Radiator for quite a while, I've found that the main causes of slowdowns are database queries and network outages. During network outages, I've noticed that some RASes (we have mostly Ascend) and proxy servers start flooding the server once connectivity comes back. These appear to be requests (mostly accounting) that were queued on those systems. This pretty much kills our RADIUS server (CPU -> 99%), and many times we have to run Radiator in a very basic configuration (no database, no authentication) for some time to cool things down. Many times I've even had to go to our firewall and block some RAS traffic. So I am looking for tips on how to set up a scalable system.

We have a test system set up with a Foundry switch load balancing to 2 Radiator servers via round-robin. However, in our tests we are noticing that the load is not balanced evenly when the source UDP port stays constant, which happens, for example, when another Radiator is forwarding requests to it. It only seems to balance properly when the source ports change. Does anyone have any idea what could be wrong here?

What I was thinking was to instead set up one Radiator system that uses the AuthBy loadbalance clause in place of the Foundry switch. Any thoughts on this as an alternative to hardware load balancing?

The next issue is database slowdowns. I am thinking that the best setup would be for the RASes to go directly to Radiators that have no DB dependency at all, and that instead proxy to the respective servers that do have DB dependencies. For example:

         A
        / \
       /   \
      B     C
     / \   / \
    D   E F   G

A = Radiator doing AuthBy loadbalance to B and C (or hardware switch)
B/C = Radiator with only AuthBy RADIUS clauses
D/E/F/G = Radiator with DB access

The B and C trees would be identical. Does this sound like a proper setup?

As far as the type of database access goes, we've mostly seen that accounting is what causes problems. I believe this is due to our table designs.
For example, we have unique indexes, covering many fields, that are used to drop duplicate accounting records. At some point, when there is a lot of data, inserts become slow. I was thinking that Radiator's access to the DB should be made as fast as possible, and that Radiator should instead use the DB as a sort of log table for accounting (with no indexes at all), similar to writing to raw files. Then, periodically, an external process would process this data and move it to the real accounting tables (with indexes, etc.). This way, DB query time for accounting is kept to a minimum.

Another problem we have is the number of Handlers. We handle requests depending on the following:

  RAS IP
  RAS IP+DNIS
  RAS IP+DNIS+Realm

With all of our devices, the number of handlers is getting quite large. We have almost 500 handlers at this point. I'm wondering what the upper bound on this would be, and whether there is a better way to handle it.

Anyhow, I'd appreciate any info or tips anyone has on a large setup like this.

Thanks,
Viraj.