md wrote:
> I haven't looked at the cache modules docs yet...would
> it be possible to build cache on the separate
> load-balanced machines as we go along...as we do with
> template caching?
Of course. However, if a user is sent to a random machine each time, you
won't be able to cache anything that a user is allowed to change during
their time on the site, because they could end up on a machine that has
an old cached value for it. Sticky load-balancing or a cluster-wide
cache (which you can update when data changes) deals with this problem.
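To make the stale-value problem concrete, here is a rough sketch. Everything in it is a hypothetical stand-in: the %db hash plays the database, each entry in %cache plays one machine's private cache, and read_name/write_name are made-up helpers, not part of any real module.

```perl
use strict;
use warnings;

# Hypothetical stand-ins: one shared "database" and a private cache per machine.
my %db    = (42 => 'Old Name');
my %cache = (machine_a => {}, machine_b => {});

# Read through the local cache of whichever machine got the request.
sub read_name {
    my ($machine, $user_id) = @_;
    my $local = $cache{$machine};
    $local->{$user_id} = $db{$user_id} unless exists $local->{$user_id};
    return $local->{$user_id};
}

# An update only refreshes the cache on the machine that handled it.
sub write_name {
    my ($machine, $user_id, $name) = @_;
    $db{$user_id} = $name;
    $cache{$machine}{$user_id} = $name;
}

read_name('machine_b', 42);                 # machine_b caches the old value
write_name('machine_a', 42, 'New Name');    # user changes it via machine_a
my $from_a = read_name('machine_a', 42);    # sees the new value
my $from_b = read_name('machine_b', 42);    # still sees the old (stale) value
```

With sticky load-balancing the user would never have hit machine_b, and with a single cluster-wide cache both reads would go through the same entry that write_name updates.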
> everything seems so user specific...
That doesn't mean you can't cache it. You can do basically the same
thing you were doing with the session: stuff a hash of user-specific
stuff into the cache. The next time that user sends a request, you
check the cache for data on that user ID (you get the user ID from the
session) and if you don't find any you just fetch it from the db.
Pseudo-code:
sub fetch_user_data {
    my $user_id = shift;
    my $user_data = fetch_from_cache($user_id);
    unless ($user_data) {
        # cache miss: go to the db and save the result for next time
        $user_data = fetch_from_db($user_id);
        store_in_cache($user_id, $user_data);
    }
    return $user_data;
}
> I would be curious though that if my choice is simply
> that the data is stored in the session or comes from
> the database with each request, would it still be best
> to essentially only store the session id in the
> session and pull everything else from the db? It still
> seems that something trivial like a greeting name (a
> preference) could go in the session.
Your decision about what to put in the session is not connected to your
decision about what to pull from the db each time. You can cache all
the data if you want to, and still have very little in the session.
This might sound like an academic distinction, but I think it's
important to keep the concepts separate. A session is a place to store
transient state information that is irrelevant as soon as the user logs
out. A cache is a way of speeding up access to a slow resource like a
database. You can actually cache the session data if you need to (with
a write-through cache that updates the backing database as well). A
cache will typically be faster
than session storage because it doesn't need to be very reliable and
because you can store and retrieve individual chunks of data (user's
name, page names) when you need them instead of storing and retrieving
everything on every request. Separating these concepts allows you to do
things like migrate the session storage to a transactional database some
day, and move your cache storage to a distributed multicast cache when
someone comes out with a module for that.
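A minimal sketch of the write-through idea with individually cached chunks might look like this. The %db and %cache hashes, the store_chunk/fetch_chunk helpers and the "user_id:field" key layout are all assumptions made up for illustration; a real setup would put a database and a cache module behind them.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the slow backing store and the fast cache.
my %db;
my %cache;
my $db_reads = 0;

# Write-through: update the backing store first, then the cache.
sub store_chunk {
    my ($user_id, $field, $value) = @_;
    $db{"$user_id:$field"}    = $value;
    $cache{"$user_id:$field"} = $value;
}

# Fetch a single chunk; only go to the backing store on a cache miss.
sub fetch_chunk {
    my ($user_id, $field) = @_;
    my $key = "$user_id:$field";
    unless (exists $cache{$key}) {
        $db_reads++;
        $cache{$key} = $db{$key};
    }
    return $cache{$key};
}

store_chunk(42, 'greeting_name', 'Pat');
store_chunk(42, 'page_names', ['Home', 'Calendar']);
my $name = fetch_chunk(42, 'greeting_name');   # served from the cache
%cache = ();                                   # losing the cache loses no data
my $pages = fetch_chunk(42, 'page_names');     # reloaded from the backing store
```

Because every write also lands in the backing store, the cache can be thrown away at any time, which is why it doesn't need to be very reliable.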
> The only
> gotcha would be that the calendar would need to update
> every day, at least on the current month's pages.
The cache modules I mentioned have a concept of "timeout" so that you
can say "cache this for 12 hours" and then when it expires you fetch it
again and update the cache for another 12 hours.
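Here is a rough sketch of how the timeout works, using a plain hash rather than any particular cache module. The cache_set/cache_get/calendar_page names are hypothetical, and the optional $now argument exists only so the example can simulate the clock moving forward.

```perl
use strict;
use warnings;

my %cache;   # key => { value => ..., expires_at => epoch seconds }

sub cache_set {
    my ($key, $value, $ttl_seconds, $now) = @_;
    $now = time() unless defined $now;
    $cache{$key} = { value => $value, expires_at => $now + $ttl_seconds };
}

# Returns the cached value, or undef if it is missing or has expired.
sub cache_get {
    my ($key, $now) = @_;
    $now = time() unless defined $now;
    my $entry = $cache{$key} or return;
    if ($now >= $entry->{expires_at}) {
        delete $cache{$key};
        return;
    }
    return $entry->{value};
}

my $builds = 0;
sub calendar_page {
    my ($month, $now) = @_;
    my $page = cache_get("calendar:$month", $now);
    unless (defined $page) {
        $builds++;
        $page = "calendar for $month (build $builds)";   # placeholder for real rendering
        cache_set("calendar:$month", $page, 12 * 60 * 60, $now);
    }
    return $page;
}

my $t0     = 1_000_000;
my $first  = calendar_page('july', $t0);
my $second = calendar_page('july', $t0 + 3600);        # one hour later: still cached
my $third  = calendar_page('july', $t0 + 13 * 3600);   # 13 hours later: rebuilt
```

For the calendar you would pick whatever timeout is short enough to keep the current month's pages fresh.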
> Even though there are some "preset" pages, the user
> can change the names and the user can also create a
> custom page with its own name.
No problem, you can cache data that's only useful for a single user, as
I explained above.
> Not
> to mention that between the fact that the users' daily
> pages can have any number of user selected features
> per page and features themselves can have archive
> depths of anywhere from 3 to 20 years, there's a lot
> of info.
No problem, disks are cheap. 400MB of disk space will cost you about as
much as a movie in New York these days.
- Perrin