md wrote:
> I haven't looked at the cache modules docs yet...would
> it be possible to build cache on the separate
> load-balanced machines as we go along...as we do with
> template caching?

Of course.  However, if a user is sent to a random machine each time you 
won't be able to cache anything that a user is allowed to change during 
their time on the site, because they could end up on a machine that has 
an old cached value for it.  Sticky load-balancing or a cluster-wide 
cache (which you can update when data changes) deals with this problem.
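
To make the stale-value problem concrete, here is a rough sketch. It assumes two machines called A and B, with plain hashes standing in for the per-machine caches, the cluster-wide cache and the database. All of the names are made up for illustration, not taken from any module.

```perl
use strict;
use warnings;

# Hypothetical stand-ins: hashes instead of a real database and cache.
my %db           = (greeting => 'Hi');
my %local_cache  = (A => {}, B => {});   # one private cache per machine
my %shared_cache;                        # one cache visible to every machine

# Read through the private cache of a single machine.
sub read_local {
    my ($machine, $key) = @_;
    my $cache = $local_cache{$machine};
    $cache->{$key} = $db{$key} unless exists $cache->{$key};
    return $cache->{$key};
}

# Read through the cluster-wide cache.
sub read_shared {
    my ($key) = @_;
    $shared_cache{$key} = $db{$key} unless exists $shared_cache{$key};
    return $shared_cache{$key};
}

# A change made while the user happens to be on $machine.
sub update {
    my ($machine, $key, $value) = @_;
    $db{$key}                   = $value;
    $local_cache{$machine}{$key} = $value;   # only this machine sees it
    $shared_cache{$key}          = $value;   # every machine sees it
    return;
}

read_local('B', 'greeting');         # machine B caches the old value
read_shared('greeting');             # so does the shared cache
update('A', 'greeting', 'Hello');    # the user changes it while on machine A

print "local on B: ", read_local('B', 'greeting'), "\n";   # still the old value
print "shared:     ", read_shared('greeting'), "\n";       # the new value
```

Sticky load-balancing avoids the problem differently: the user never lands on machine B in the first place.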

> everything seems so user specific...

That doesn't mean you can't cache it.  You can do basically the same 
thing you were doing with the session: stuff a hash of user-specific 
stuff into the cache.  The next time that user sends a request, you 
check the cache for data on that user ID (you get the user ID from the 
session) and if you don't find any you just fetch it from the db.

Pseudo-code:

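sub fetch_user_data {
   my $user_id = shift;
   my $user_data;
   unless ($user_data = fetch_from_cache($user_id)) {
     # cache miss: go to the db, then save the result for next time
     $user_data = fetch_from_db($user_id);
     store_in_cache($user_id, $user_data);
   }
   return $user_data;
}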
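
If you want something you can actually run, here is a minimal sketch of the same lookup. It assumes an in-memory hash for the cache and another for the db, so the helper names and the sample user are hypothetical. A real app would use a cache module and DBI instead. It also writes the db result back into the cache, so the second request is a hit.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for a real cache module and a real database.
my %cache;
my %db = (42 => { name => 'md', pages => ['Home', 'Calendar'] });
my $db_hits = 0;

sub fetch_from_cache { my ($user_id) = @_; return $cache{$user_id}; }
sub store_in_cache   { my ($user_id, $data) = @_; $cache{$user_id} = $data; return; }
sub fetch_from_db    { my ($user_id) = @_; $db_hits++; return $db{$user_id}; }

sub fetch_user_data {
    my ($user_id) = @_;
    my $user_data = fetch_from_cache($user_id);
    unless (defined $user_data) {
        # Cache miss: fall back to the db and remember the answer.
        $user_data = fetch_from_db($user_id);
        store_in_cache($user_id, $user_data) if defined $user_data;
    }
    return $user_data;
}

fetch_user_data(42);   # miss: goes to the db and fills the cache
fetch_user_data(42);   # hit: served from the cache
print "db hits: $db_hits\n";   # prints "db hits: 1"
```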

> I would be curious though that if my choice is simply
> that the data is stored in the session or comes from
> the database with each request, would it still be best
> to essentially only store the session id in the
> session and pull everything else from the db? It still
> seems that something trivial like a greeting name (a
> preference) could go in the session.

Your decision about what to put in the session is not connected to your 
decision about what to pull from the db each time.  You can cache all 
the data if you want to, and still have very little in the session.

This might sound like an academic distinction, but I think it's 
important to keep the concepts separate.  A session is a place to store 
transient state information that is irrelevant as soon as the user logs 
out.  A cache is a way of speeding up access to a slow resource like 
a database.  The two things should not be confused.  You can actually 
cache the session data if you need to (with a write-through cache that 
updates the backing database as well).  A cache will typically be faster 
than session storage because it doesn't need to be very reliable and 
because you can store and retrieve individual chunks of data (user's 
name, page names) when you need them instead of storing and retrieving 
everything on every request.  Separating these concepts allows you to do 
things like migrate the session storage to a transactional database some 
day, and move your cache storage to a distributed multicast cache when 
someone comes out with a module for that.
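
A write-through cache is simple to sketch. The version below assumes plain hashes for both the cache and the backing database, and the key format is just an example of storing one small chunk per key. None of it comes from a particular module.

```perl
use strict;
use warnings;

my %cache;           # fast, and allowed to lose data
my %backing_store;   # slow but reliable (stands in for the database)

# Every write goes to the reliable store and to the cache.
sub write_through {
    my ($key, $value) = @_;
    $backing_store{$key} = $value;   # reliable copy first
    $cache{$key}         = $value;   # then the fast copy
    return;
}

# Reads try the cache and refill it from the backing store if needed.
sub read_cached {
    my ($key) = @_;
    return $cache{$key} if exists $cache{$key};
    return unless exists $backing_store{$key};
    return $cache{$key} = $backing_store{$key};
}

write_through('user:42:name', 'md');   # one small chunk, not the whole session
%cache = ();                           # simulate the cache being wiped
print read_cached('user:42:name'), "\n";   # prints "md", refilled from the store
```

Losing the cache costs you one slow read and nothing else, which is why it doesn't need to be as reliable as the session storage.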

> The only
> gotcha would be that the calendar would need to update
> every day, at least on the current month's pages.

The cache modules I mentioned have a concept of "timeout" so that you 
can say "cache this for 12 hours" and then when it expires you fetch it 
again and update the cache for another 12 hours.
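
The cache module does the expiry bookkeeping for you, but the idea is roughly the following. This is a sketch with a plain hash and made-up function names, and the clock is passed in as an optional argument only so the example is repeatable.

```perl
use strict;
use warnings;

my %cache;   # key => { value => ..., expires_at => epoch seconds }

# Store a value that stays valid for $ttl_seconds.
sub cache_set {
    my ($key, $value, $ttl_seconds, $now) = @_;
    $now = time unless defined $now;
    $cache{$key} = { value => $value, expires_at => $now + $ttl_seconds };
    return;
}

# Return the value, or undef if it is missing or has expired.
sub cache_get {
    my ($key, $now) = @_;
    $now = time unless defined $now;
    my $entry = $cache{$key} or return;
    if ($now >= $entry->{expires_at}) {
        delete $cache{$key};   # expired: the caller fetches and stores it again
        return;
    }
    return $entry->{value};
}

my $twelve_hours = 12 * 60 * 60;
cache_set('calendar:current-month', 'rendered month', $twelve_hours, 0);

my $early = cache_get('calendar:current-month', $twelve_hours - 1);
my $late  = cache_get('calendar:current-month', $twelve_hours);
print defined $early ? "fresh\n" : "expired\n";   # prints "fresh"
print defined $late  ? "fresh\n" : "expired\n";   # prints "expired"
```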

> Even though there are some "preset" pages, the user
> can change the names and the user can also create a
> custom page with its own name.

No problem, you can cache data that's only useful for a single user, as 
I explained above.

> Not
> to mention that between the fact that the users' daily
> pages can have any number of user selected features
> per page and features themselves can have archive
> depths of anywhere from 3 to 20 years, there's a lot
> of info.

No problem, disks are cheap.  400MB of disk space will cost you about as 
much as a movie in New York these days.

- Perrin
