JB - 

I can shed some light on this kind of system, and give you some suggestions
about building one. Just about every dynamic, scalable site employs some
form of caching, and the following is a model I have found useful in
increasing the scalability of my sites.

The idea behind caching is to pre-generate as much content as possible to
reduce overhead on your Web server. One of the big 'hits' we take as
developers is with database access, so, naturally, caching data is probably
going to be a big part of your strategy. 

There are a number of issues with doing this, however: deciding what data to
cache, indexing your data store, handling inserts / updates / deletes,
handling record locking, detecting the presence of data, and error handling
for when something goes wrong - all of these things need to be thought out
before building a system like this.

Now, it's great that you want to use cached queries to store data and QoQs
to access it. But I have found this approach to be a little slower than
dumping the same information into structures. For instance, in one
application I have a list of values specific to congressional districts and
use a structure indexed by congressional district name to contain all the
data for each area. Each entry includes nested structures plus string and
numeric values. Structure lookups are fast compared to QoQ, at least in
CF5.
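
As a rough sketch of what I mean (all names here are invented, CF5-era
syntax), the cache is just a nested struct in the application scope, and a
lookup is a plain struct reference rather than a QoQ:

```cfml
<!--- Build the cache once, e.g. at application startup (hypothetical names) --->
<cfquery name="qDistricts" datasource="myDSN">
    SELECT district, rep_name, population FROM district_info
</cfquery>

<cfset cache = StructNew()>
<cfloop query="qDistricts">
    <cfset cache[qDistricts.district] = StructNew()>
    <cfset cache[qDistricts.district].repName = qDistricts.rep_name>
    <cfset cache[qDistricts.district].population = qDistricts.population>
</cfloop>

<cflock scope="application" type="exclusive" timeout="10">
    <cfset application.districts = cache>
</cflock>

<!--- A lookup is then just a struct reference - no QoQ needed --->
<cflock scope="application" type="readonly" timeout="10">
    <cfset districtData = Duplicate(application.districts[url.district])>
</cflock>
```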

Whenever a user requests information for a particular district, the data is
pulled directly out of that structure. Whenever a user adds / updates /
deletes information on that district, I update the database and then the key
in the structure they changed. The process looks like this:

1) CFQuery with Insert / Update / Delete
2) Use a readonly CFLock to copy the structure out of the application scope
3) Perform the action on the copy of the structure
4) Use an exclusive CFLock to copy the new structure back into the
application scope
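
In CFML, those four steps come out roughly like this (a sketch only; table,
field, and struct names are invented):

```cfml
<!--- 1) Write the change to the database first --->
<cfquery datasource="myDSN">
    UPDATE district_info
    SET    rep_name = <cfqueryparam value="#form.repName#" cfsqltype="CF_SQL_VARCHAR">
    WHERE  district = <cfqueryparam value="#form.district#" cfsqltype="CF_SQL_VARCHAR">
</cfquery>

<!--- 2) Copy the structure out under a readonly lock --->
<cflock scope="application" type="readonly" timeout="10">
    <cfset localDistricts = Duplicate(application.districts)>
</cflock>

<!--- 3) Make the change on the local copy, outside any lock --->
<cfset localDistricts[form.district].repName = form.repName>

<!--- 4) Swap the updated copy back in under an exclusive lock --->
<cflock scope="application" type="exclusive" timeout="10">
    <cfset application.districts = localDistricts>
</cflock>
```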

By handling changes to the cached data in this way, you can maintain
synchronicity between the database and the server without having to run a
select statement. This method, however, also illustrates the source of a lot
of problems. Locking that shared information can be really troublesome in
several situations:

1) Dirty reads (strictly speaking, lost updates). Between the time the data
structure is copied out of the application scope and the time it is copied
back in, another user can copy the same data, make their own change, and
write it back in. The first user's changes would then be lost.
2) When you have a lot of users, the high volume of requests can make the
update take a long time, or users can encounter errors. Each exclusive lock
means only one user has access to that data and all others wait. Usually,
this is easier to work around than the next problem.
3) When you are operating in a clustered environment, there is no way to
share application scopes between servers. You need a mechanism for alerting
each server that it is time to update its cached data.

There are a number of ways around these problems. I maintain a lot of
metadata for each data structure, including when it was last updated - down
to the millisecond. In the process above, I add a check of when the
structure was last updated and use that to decide whether this is a
legitimate update. In the event that a dirty read has occurred, an alternate
process kicks in:

1) Exclusively lock the data structure to be changed
2) Change it directly
3) Release the lock. 
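
The timestamp check that decides between the two processes looks something
like this (the metadata struct is my own convention; all names are
invented):

```cfml
<!--- Noted inside the same readonly lock used to copy the structure out --->
<cfset copiedAt = application.meta.districts.lastUpdated>

<!--- ... make the change on the local copy ... --->

<cflock scope="application" type="exclusive" timeout="10">
    <cfif application.meta.districts.lastUpdated EQ copiedAt>
        <!--- Nobody else touched it: safe to swap the whole copy in --->
        <cfset application.districts = localDistricts>
    <cfelse>
        <!--- Someone beat us to it: fall back to changing the key directly --->
        <cfset application.districts[form.district].repName = form.repName>
    </cfif>
    <cfset application.meta.districts.lastUpdated = GetTickCount()>
</cflock>
```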

This process works, for the most part, without a serious performance hit (on
our Red Hat servers, the difference is +/- 3ms). There have been times,
though, under heavy load, when this process led to the number of locks
overwhelming CF's ability to deal with them. After this occurred several
times, we decided to move to a clustered server approach.

The problem with data caching and clustered servers is that servers do not
share data scopes. They can use a common set of session variables, but the
ideal would be for them to share the application scope (since that's where
all the data is).

I get around this problem by passing data back and forth in the database. I
have a separate database dedicated exclusively to 'state' reporting. Each
time I do an update to a specific data structure that needs to be mirrored
on another server, I record the server, the structure, the key, and the time
last updated. On each page request, each server runs a quick select
statement to check (based on internal IP) if it needs to update any data.
When one server needs to update information, it follows a process similar to
the one used for record updates:

1) Use a readonly CFLock to copy the key of the structure out of the
application scope
2) Perform the action on the copy of the key of the structure
3) Use an exclusive CFLock to copy the new key back into the application
scope
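
A sketch of that state-database handshake (the schema, datasource, and
variable names are invented):

```cfml
<!--- After an update on this server, flag each peer server that must mirror it --->
<cfquery datasource="stateDSN">
    INSERT INTO cache_state (server_ip, struct_name, struct_key, updated_at)
    VALUES ('#peerServerIP#', 'districts', '#form.district#', #GetTickCount()#)
</cfquery>

<!--- On each page request, each server checks for rows addressed to it.
      application.thisServerIP is set to the box's internal IP at startup. --->
<cfquery name="qPending" datasource="stateDSN">
    SELECT struct_name, struct_key
    FROM   cache_state
    WHERE  server_ip = '#application.thisServerIP#'
</cfquery>

<cfloop query="qPending">
    <!--- For each pending row: requery just that key, run the
          readonly-copy / change / exclusive-write-back steps,
          then delete the row from cache_state --->
</cfloop>
```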

The performance hit on this one depends on the particular data structure,
and the goal is to make the process invisible to the user. The benchmark I
use is 5 +/- 3ms per record; if I cannot get it to take less time, I will go
another route.

Now, despite all this, there are still cracks in the system. Occasionally, a
really nasty dirty read happens, I hear about it, and it drives me up a
wall trying to understand what went wrong. One thing I do to keep the data
current is refresh one data structure every ten minutes: a scheduled task
loops through a list of all the data structures on each server and
refreshes the next one each time it runs. This procedure takes an average of
45 seconds each time it runs, so, obviously, the data needs to be handled in
a certain way. What I do to handle the refreshes is:

1) Call the query and construct the data
2) Lock the application scope
3) Copy the new structure and update all appropriate metadata about each
structure
4) Release the lock
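
The key to making a 45-second refresh invisible is that the expensive part -
the query and the loop that rebuilds the structure - happens entirely
outside the lock, so users only ever wait on a momentary reference swap.
Sketched out (names invented):

```cfml
<!--- 1) Build the replacement structure with no lock held --->
<cfquery name="qDistricts" datasource="myDSN">
    SELECT district, rep_name, population FROM district_info
</cfquery>
<cfset fresh = StructNew()>
<cfloop query="qDistricts">
    <cfset fresh[qDistricts.district] = StructNew()>
    <cfset fresh[qDistricts.district].repName = qDistricts.rep_name>
    <cfset fresh[qDistricts.district].population = qDistricts.population>
</cfloop>

<!--- 2-4) The lock is held only long enough to swap one reference
      and stamp the metadata --->
<cflock scope="application" type="exclusive" timeout="30">
    <cfset application.districts = fresh>
    <cfset application.meta.districts.lastUpdated = GetTickCount()>
</cflock>
```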

This strategy really cuts down on dirty reads - we specifically track them
using error pages, and the error rate is less than .01% for any hour in a
24-hour period (and most days far less than that). It was as high as 3%
when we first moved to clustered servers, but has not moved above .01% in
the last 9 months. There have also been a number of change-management
issues. Generally, I use multiple actions for each data structure; for
instance, on the district structure there are files for updating the whole
structure, for updating a specific key in the structure, for updating a key
of a key, etc. Every once in a while, I will make an update to handle the
entire data structure that is not reflected at all the other levels, and
chaos will ensue on our testing server. It is important that processes be
built around a site like this to ensure that changes in how the data
structure is handled occur in every action that affects the structure. Of
course, this is why we use Visio.

This overly-long message started off with a discussion about caching in
general. The other big thing I do is cache dynamic content, and I use
several schemes for caching based on user permissions. If anyone finds this
useful, let me know and I might be willing to write that up as well.

M

-----Original Message-----
From: James Blaha [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, July 15, 2003 1:35 PM
To: CF-Talk
Subject: Re: I need some good old advice.


Jon,

Your comment has the meat I'm looking for! Thanks.

What exactly do you mean by:

"schedule the cache update, and build in logic on each page request to 
test if the cache exists, if not, requery immediately"

Can you please give me some kind of an example of what the templates 
involved would have for code and how you would use it?

What do you mean by:

If possible, abstract access to the cache as much as possible right
now...since it is your datastore, you are bound to think of new ways to use
it, or need to use it down the road.

FYI: My data from the table involved is affected on two sides. On one 
side there are users on the web entering data which goes into a table. 
On the other side is a BackOffice where staff edit and query that 
data. Querying, updates and deletions only happen in the BackOffice.

Regards,
JB


jon hall wrote:

>If this is a high traffic site, or the query takes extraordinarily long 
>to execute, keep in mind the people hitting the site while the query is 
>recaching, and right after the server is restarted, etc. They could get 
>errors, bad data, or long delays...
>
>I've got an app that does something similar now where we chose to 
>schedule the cache update, and build in logic on each page request to 
>test if the cache exists, if not, requery immediately. It works fairly 
>well under load...other than a few possible CF5 bugs with cached 
>queries under load that I can't control, but thankfully are rare.
>
>If possible, abstract access to the cache as much as possible right 
>now...since it is your datastore, you are bound to think of new ways to 
>use it, or need to use it down the road.
>
>  
>

