Greetings fellow CF Fans!

I've received and read a lot of comments to the postings I did yesterday
regarding scalability and thought I'd put them all together in one e-mail
for simplicity.  Sorry for the long post...

--Doug

CFSWITCH - YES!

Consider this code:

<cfif left(mystring, 2) IS "AB">
        ...
<cfelseif left(mystring, 2) IS "BC">
        ...
<cfelseif left(mystring, 2) IS "DE">
        ...
<cfelse>
        ...
</cfif>

If left(mystring, 2) comes back "FG", you're doing that string operation
(and comparison) three times just to fall through to CFELSE.  That's a lot
of unnecessary CF processing.  With a CFSWITCH, CF evaluates the expression
once and then jumps to the matching case.  This is much faster and usually
cleaner code to read as well.
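As a sketch, the CFIF chain above could be rewritten with CFSWITCH so the
left() call runs only once:

```cfml
<cfswitch expression="#left(mystring, 2)#">
        <cfcase value="AB">
                <!--- handle AB --->
        </cfcase>
        <cfcase value="BC">
                <!--- handle BC --->
        </cfcase>
        <cfcase value="DE">
                <!--- handle DE --->
        </cfcase>
        <cfdefaultcase>
                <!--- everything else, including "FG" --->
        </cfdefaultcase>
</cfswitch>
```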

SELECT *  -- NEVER!

This was a good suggestion.  First, suppose your DBA (you have a
stellar DBA, right?  No?  Don't plan an IPO in your near future then...)
decides the table needs three more columns at 16 chars each.
Now you have 48 chars PER RECORD coming back to your application that you
don't need.  That adds up quickly and slows your application down, even
though it seemed to run smoothly when you designed it.  Secondly, the
database has to do minor extra work to expand * into a column list.  It's
minor, but why make it do the work, ESPECIALLY if this query is run
frequently.  Finally, an explicit column list makes it easier for the "next
guy" to debug your code.
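A minimal sketch of the explicit-column approach (the datasource, table, and
column names here are made up for illustration):

```cfml
<!--- Ask for exactly the columns this page uses, nothing more --->
<cfquery name="getCustomer" datasource="myDSN">
        SELECT customer_id, first_name, last_name
        FROM customers
        WHERE customer_id = #url.customer_id#
</cfquery>
```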

Scoping Variables - Always

YES!  Always scope variables.  You get two benefits.  First, the developer
after you who reads your code can figure it out much more quickly.
Secondly, you aren't forcing CF to spend valuable milliseconds figuring out
which scope the variable lives in, since it's already stated.  It's a minor
point, but well worth making a part of your daily coding ritual.
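A quick sketch of the difference (the variable names are hypothetical):

```cfml
<!--- Unscoped: CF searches several scopes before it resolves "username" --->
<cfoutput>#username#</cfoutput>

<!--- Scoped: both CF and the next developer know exactly where it lives --->
<cfset variables.username = form.username>
<cfoutput>#variables.username#</cfoutput>
```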

CFSETTING - White Space

The CFSETTING tag, and the corresponding global setting in the CF
Administrator, appear to be great tools.  In some cases they are.  However,
remember that you're essentially asking CF to do extra work to remove these
"offensive items" (tabs, spaces, etc.).  If your page is now 5k instead of
6k, is that worth the CPU overhead?  Each developer will have to decide
whether it makes cost-effective sense.  I've found cases where it was good
AND cases where it caused more processing overhead and really didn't save
much bandwidth.  So be careful and decide where this is best used, rather
than setting the entire application or even the server to remove white
space all the time.
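One targeted alternative, as a sketch: rather than flipping the global
switch, suppress white space only around output-heavy blocks with
CFPROCESSINGDIRECTIVE (the query and column names are made up):

```cfml
<!--- Strip white space around this loop only, not server-wide --->
<cfprocessingdirective suppresswhitespace="Yes">
        <cfoutput query="getCustomer">
                #first_name# #last_name#<br>
        </cfoutput>
</cfprocessingdirective>
```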

Lots of Cheap Servers or Big Beefy Servers?

First of all, the SQL server should have a twin (or a slightly lesser
machine if cost is an issue), and these should be as powerful as you can
reasonably afford.  Put your horsepower into your database.

Secondly, get a load balancing solution that has fail-over built in.  This
way you can put ANY web servers (cheap OR expensive, NT, Solaris or
otherwise) all in the same solution and the load balancer will find the best
server for each request.  Note that session management is a big issue here I
won't discuss now.

Finally, buy web servers.  Consider the cost of the hardware AND software
together.  Sure, you could get a cheap 1U Linux box.  The Linux kernel,
though, still has limited multi-CPU support (talk to Linus Torvalds about
that one), so you can keep these machines fairly cheap.  Allaire, however,
will charge you for a CF license for each one, so there's a hit there.  If
your applications are CPU
intensive, get stronger hardware.  If you're doing very lightweight pages
and lots of them, a farm is probably good.  What's EVEN BETTER is to see if
you can build a scheduler to convert pages to static HTML and put them on a
NON-CF server and preferably in a caching environment outside your network.
That's often the best and cheapest solution.  Look at Yahoo.  Most of their
content is static pages rebuilt regularly with CGI to just help steer you to
a static page.  That model makes lots of sense and is cheap to build.
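A rough sketch of such a scheduler in CF itself (the URL and file path are
made up; you'd kick this off from CFSCHEDULE or the CF Administrator's
scheduled tasks):

```cfml
<!--- Fetch the dynamic page once, then save the result as static HTML --->
<cfhttp url="http://www.mysite.com/catalog.cfm" method="GET" resolveurl="Yes">
<cffile action="WRITE"
        file="d:\inetpub\static\catalog.html"
        output="#cfhttp.FileContent#">
```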

NT v. Solaris

My research and opinion says six of one, half dozen of the other.  Where one
has more power, it has more cost to balance either in hardware or in labor
costs to support more advanced gear.  I've yet to see research that strongly
indicated one solution greatly outperformed the other in comparable farms
and while doing so at a greatly reduced cost.  So the end result is build
what's comfortable and cost-effective for you... but don't expect a one
solution fits all answer to this question.

The "2 NIC" approach

Yes, we've done research here and it's highly recommended.  Even Compaq
recommends it in their DISA architecture.  Why?  Because it creates two
networks, both high speed and isolated.  The first is so your web servers
talk directly to a dedicated switch that talks to your firewall, router or
DMZ.  This way you're isolated from your regular office network traffic and
get right out to the internet to deal with customers.  The second network is
so web servers have high speed and isolated connections to the SQL cluster.
This is the fastest way to get data to these machines.  If you just put
everything into one large switch (which often has a common backplane) and
just use one NIC per system, you're putting both the Internet and SQL
traffic on the same cards, ports, etc. and eventually you will see
degradation.  Also, this is a security issue (although minor) because in
theory the router could talk to the switch and directly to SQL.  It's far
easier to block this port in your firewall with a two-network approach and
that way ensure that only internal servers talk to SQL.  It's a minor point,
but worth noting... especially if you plan to file an S-1 document in your
future.

Log Analyzers

Yes, under high load this needs to be a dedicated server.  Set it up and
archive the raw logs (I agree with the recommendation to use .ZIP archives
to save archive storage space).  Once the analyzer processes them, the
raw logs don't need to remain on the server.  Don't forget to "flush" the
dataset as well since you probably don't need to keep all this data live.
Static HTML reports can easily be archived, but the gigabytes of data it
took to make them don't have to remain online.  Also, by removing these logs from
the web servers ASAP, you don't need big drives on your web servers and can
keep the costs down.  Writing this data to a file share on the fly is not
recommended as it will greatly increase the "chattiness" of your network
segment.
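As a sketch of the archive-then-delete step (gzip standing in for .ZIP here;
the paths and log naming are assumptions for illustration):

```shell
#!/bin/sh
# Sketch: compress raw logs into an archive area, then remove the originals
# so the web server's disks stay small. Paths are assumptions.
LOGDIR=/tmp/weblogs/raw
ARCHIVE=/tmp/weblogs/archive
mkdir -p "$LOGDIR" "$ARCHIVE"

# (demo only) pretend the web server left a raw log behind
echo "127.0.0.1 - - [01/Jan/2000] GET /" > "$LOGDIR/access_20000101.log"

for f in "$LOGDIR"/access_*.log; do
    [ -f "$f" ] || continue
    gzip -c "$f" > "$ARCHIVE/$(basename "$f").gz"  # compressed copy off the web server
    rm "$f"                                        # raw log no longer needed locally
done
```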

IPs or Virtual Hosting

I've seen both done in various environments and projects and haven't seen
any measurable difference.  Yes you can host a LOT of domains on a single
server as long as the bandwidth limits of the box don't become an issue.
Lots of "inexpensive" hosting solutions do this.  It's fast, cheap and easy.
If you run your own network, however, I always like a well documented IP
strategy... but it's not always needed.

Lots of Servers - One File Share for Content

Having your web servers grab their content from a file share makes it very
easy to scale and distribute content.  You can essentially make each one a
"mirror image" of the others.  It's fast, cheap and easy as long as the CF
licensing costs don't become an issue.  This is a great way to have lots of
domains on lots of servers and provide redundancy.  It's also great for
security because you put FTP on the file server and the customers never
actually hit the web server farm.

___________________________________________
Douglas Nottage
Director of Advanced Technology
Autobytel.com (NASDAQ:  ABTL )
___________________________________________
"Never doubt that a small group of thoughtful,
committed people can change the world.
Indeed, it is the only thing that ever has."
--- Margaret Mead
___________________________________________


------------------------------------------------------------------------------
Archives: http://www.mail-archive.com/cf-talk@houseoffusion.com/
