Hi,

I agree that in most cases the LRU is good enough. I thought I'd share our scenario to give some context where the LRU does not suit. Chances are lots of people do this and I'm not saying anything new, but...

So

We have a high-volume distributed application. Not as big as the bloke with ten 30 gig instances, but for us it's big and getting bigger. The application uses PostgreSQL as a backend database and is designed as a "plugNprocess" environment: if the load gets too high we start up another blade, it connects and immediately starts sharing the load.

One of the challenges we face is the replication of data across multiple sites, plus database throughput. This is where memcached comes in. So far we have implemented memcached for all records which already naturally expire, so the LRU is a perfect fit and we get a tremendous boost in performance, overhead and throughput, with cache hit rates of 40%+.

The next stage, where the LRU does not fit, is the caching of an almost-static dataset. We would do two lookups: one to see if the class of record is cached, and another to see if the record itself exists in cache.

- If the class is cached and the record exists, the action proceeds.
- If the class is not cached, the action proceeds anyway.
- If the class is cached but the record does not exist, the action is denied.
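The three cases above boil down to one small check. Here is a minimal sketch of that logic in Python; a plain dict stands in for the memcached client, and the key naming scheme ("class:...", "record:...") is made up for illustration:

```python
# Sketch of the class/record validation described above.
# A dict stands in for memcached; key names are hypothetical.

def is_action_allowed(cache, class_name, record_id):
    """Return True if the action should be allowed to proceed."""
    class_key = "class:%s" % class_name
    record_key = "record:%s:%s" % (class_name, record_id)

    if cache.get(class_key) is None:
        # Class of record is not cached: allow, validation is skipped.
        return True
    # Class is cached: the individual record must also be present.
    return cache.get(record_key) is not None

cache = {
    "class:accounts": "1",
    "record:accounts:42": "1",
}
print(is_action_allowed(cache, "accounts", 42))  # True: class and record cached
print(is_action_allowed(cache, "orders", 7))     # True: class not cached
print(is_action_allowed(cache, "accounts", 99))  # False: class cached, record gone
```

The danger is exactly the third case: if the LRU silently evicts "record:accounts:99" while keeping the class entry, a perfectly valid action gets denied.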

So we need to know that if we add the class and its records, they will stay. Otherwise we could block actions because a record was not there - for whatever reason. The other failure mode is that the "class" record is lost, in which case the action simply will not be validated. This latter scenario does not matter so much, as this validation is part of a bigger process; but the case where the class record exists and the data record does not is, on its own, enough to refuse the action - and that is bad.

Now this particular set of records is used for every transaction, but some of the individual records may not be accessed for hours or days.

A "don't touch this" flag added to the LRU would let us know that if we put a record there, it will still be there.
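To make the proposal concrete, here is a toy model of what such a flag could mean, assuming memcached's eviction loop simply skipped pinned entries. This is a hypothetical sketch of the semantics, not memcached's actual implementation:

```python
from collections import OrderedDict

class PinnableLRU:
    """Toy LRU cache with a hypothetical 'don't touch this' flag:
    pinned entries are never evicted, only unpinned ones are."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # key -> (value, pinned), oldest first

    def set(self, key, value, pinned=False):
        self.data.pop(key, None)
        self.data[key] = (value, pinned)
        self._evict()

    def get(self, key):
        if key not in self.data:
            return None
        value, pinned = self.data.pop(key)
        self.data[key] = (value, pinned)  # move to most-recently-used slot
        return value

    def _evict(self):
        # Evict least-recently-used *unpinned* entries while over capacity.
        while len(self.data) > self.capacity:
            for key, (_, pinned) in self.data.items():
                if not pinned:
                    self.data.pop(key)
                    break
            else:
                break  # everything left is pinned; cannot shrink further

cache = PinnableLRU(capacity=2)
cache.set("class:accounts", "1", pinned=True)
cache.set("a", 1)
cache.set("b", 2)                   # over capacity: "a" is evicted, pin survives
print(cache.get("class:accounts"))  # "1"
print(cache.get("a"))               # None
```

The point of the sketch: under memory pressure the rarely-read pinned entry survives while ordinary items churn, which is exactly the guarantee the validation scheme above needs.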

I agree that this behavior may not be consistent with how memcached was designed, but I think it is consistent with the underlying philosophy of caching data to provide scalability. In our scenario the impact would be significant: we could probably triple our transaction volume, or more, with NO additional database hits. Now THAT'S scalability :)

regards
Grant


On 12/06/2008, at 9:32 AM, Reinis Rozitis wrote:

In most cases the LRU should be good enough. If the 'important' items are used very often, they will naturally be kept in cache. As for locking some rarely-used items in cache, I cannot see how that justifies the cost of discarding other often-used items and constantly re-creating them as they are referenced frequently.

Yes and no. Looking at some real-world cases (ours), we have seen that objects without a specified TTL (which kind of qualify as permanent items) get evicted more often/quicker than those with an expiration set (as they probably fall into the same slabs). At some point we reach the state where a newly created item is pushed out a few moments after it is added. You could say that just shows memcached needs more RAM, but we are already running ten 30 gig instances with a zillion items :)


Running separate servers with different settings is an option, but IMHO it throws the whole idea of transparently scaling/growing memcached servers horizontally out the window..

Anyway, my point here was just to give some hints on the initial questions..

There have been answers to various proposals like full cache dumps, replication, BDB backends and so on (I have always liked the responses to be super-fast rather than feature-rich bloat), but since no one really answered Paul's mail, it was an opportunity to throw it in again :)

rr
