On December 14, 2001 03:53 pm, Robert Landrum wrote:
> At 6:04 PM -0500 12/14/01, Perrin Harkins wrote:
> >That's actually a bit different.  That would fail to notice updates
> > between processes until the in-memory cache was cleared.  Still very
> > useful for read-only data or data that can be out of sync for some period
> > though.
>
> The primary problem with everything mentioned thus far is that they
> are almost entirely based around the concept of a single server.
> Caching schemes that are both fast and work across multiple servers
> and process instances are very hard to find.  After reading the eToys
> article, I decided that BerkeleyDB was worth a look.  We discovered
> that it was very fast and would make a great cache for some of our
> commonly used database queries.  The problem was that we are running
> on 5 different servers, load balanced by Big/IP.  Again, taking my
> cue from the eToys article, I began working on a data exchange system
> (using perl instead of C) which multicasts data packets to the
> "listening" servers.  So far, our only problems have been with the
> deadlocking (solved with db_deadlock), and a few corrupt records.
>
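(Just to make Rob's scheme concrete: I imagine the listening end of such
a multicast system looks roughly like the sketch below. Untested; the
group address, port, and packet handling are invented, and it uses
IO::Socket::Multicast from CPAN rather than whatever Rob actually wrote.)

    use IO::Socket::Multicast;

    # Each web server joins the multicast group and applies update
    # packets to its local BerkeleyDB cache as they arrive.
    my $sock = IO::Socket::Multicast->new(LocalPort => 20000)
        or die "socket: $!";
    $sock->mcast_add('226.1.1.1') or die "mcast_add: $!";

    while (1) {
        my $packet;
        next unless $sock->recv($packet, 8192);
        # unpack the key/value from $packet and write it into the
        # local cache (BerkeleyDB, shared memory, whatever) here
    }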

We did something based on what you might call an "acceptable differences" 
concept.

Using round-robin DNS with predominantly WinXX clients, you tend to get the 
same IP address from one request to the next, changing (IIRC) when the 
client's DNS cache expires, per the TTL in the SOA record. With a BIG-IP or 
something similar, you can actually guarantee this user locality.

So all you really need is to *appear* in sync from the point of view of the 
one machine a given user sticks to.

Using a database with shared connections from multiple machines, or the "one 
writer DB machine, many reader DB machines" concept with replication (or what 
have you) for pushing updates, you can present a pretty consistent database 
view, where the maximum staleness on the read side is the replication time.
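A rough sketch of the "one writer, many readers" wiring, assuming DBI and 
invented DSNs and credentials:

    use DBI;

    # One writer, several read-only replicas (hostnames invented).
    my $writer_dsn  = 'dbi:mysql:database=app;host=db-master';
    my @reader_dsns = map { "dbi:mysql:database=app;host=db-read$_" } 1 .. 4;

    sub write_dbh {
        DBI->connect($writer_dsn, 'app', 'secret', { RaiseError => 1 });
    }

    sub read_dbh {
        # Any replica will do; a read may lag the master by the
        # replication delay, which is the "acceptable difference".
        my $dsn = $reader_dsns[ int rand @reader_dsns ];
        DBI->connect($dsn, 'app', 'secret', { RaiseError => 1 });
    }

(Under mod_perl you would let Apache::DBI cache those connections per 
process, of course.)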

We found that the database approach, plus a local cache with a 5 second 
timeout, not only gave us a huge performance increase, it also capped the 
data difference between machines at about 6 seconds (cache timeout plus 
replication time). And since your average human visitor hits pages no faster 
than about once every 20 seconds, the differences were pretty much invisible 
to the individual user. Search engines will crawl faster than that, of 
course, but does being consistent for search engines matter all that much?
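The local cache itself was nothing fancy; the idea is just time-bounded 
memoization of query results. A minimal per-process sketch (names invented, 
not our production code):

    # In-memory cache with a 5 second TTL per web server process.
    my %cache;
    my $TTL = 5;    # seconds of staleness we're willing to accept

    sub cached {
        my ($key, $fetch) = @_;    # $fetch is a code ref that hits the DB
        my $entry = $cache{$key};
        unless ($entry && time() - $entry->{at} <= $TTL) {
            $entry = $cache{$key} = { at => time(), value => $fetch->() };
        }
        return $entry->{value};
    }

    # e.g. my $rows = cached('top_sellers',
    #                        sub { read_dbh()->selectall_arrayref($sql) });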


> I'm considering uploading to CPAN the stuff I've written for the
> caching, but haven't had the time to make it generic enough.  I also
> haven't performed any benchmarks other than the "Wow, that's a lot
> faster" benchmark.  One limitation to the stuff I've written is that
> the daemons (listeners) are non-threaded, non-forking.  That and it's
> all based on UDP.
>
>
> Rob

-- 
Jay "yohimbe" Thorne  [EMAIL PROTECTED]
Mgr Sys & Tech, Userfriendly.org
