Kenner,

If any machine is too CPU-bound to return memcached responses at anything other than ethernet speed, you should probably step back and really evaluate your hardware plan.

You'll get a lot of mileage out of memcached, but in this case you'll be better off rethinking your approach to caching -- if you really need transactional integrity or absolutely assured consistency, you'll want to use an in-memory database like MySQL Cluster. If you rework your application's cache layer to "appreciate, not require" consistency, you'll be much happier. Lots of folks on this list successfully use memcached as a source of authority, but it's generally a) backed by a more expensive db query if necessary, or b) not actually critical data. Whatever you do, make the kind of consistency problem you describe cost at most an annoyance ("grr, have to hit the disk and recalculate x, y, z"), not "OMG, two users got the same UID".
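As an illustration of pattern (a), here's a minimal read-through sketch -- just the general shape, not anything in particular from this thread -- assuming a connected PECL Memcache object and a hypothetical load_from_db() helper:

    <?php
    // Pattern (a): memcached answers if it can; otherwise fall back to the
    // more expensive, authoritative database query and repopulate the cache.
    // load_from_db() is a placeholder for whatever the real query is.
    function get_cached(Memcache $mc, $key, $ttl = 300) {
        $value = $mc->get($key);
        if ($value === false) {
            $value = load_from_db($key);      // expensive but authoritative
            $mc->set($key, $value, 0, $ttl);  // best effort; a miss only costs a recalculation
        }
        return $value;
    }

A missed or flushed cache entry then costs you a database hit, never correctness.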

For very common data which must be _available_ we keep a separate pool of a couple of servers that all get the same data written to them -- we've written a MultiputMemcacheDriver class which handles that logic. If you write a timestamp as part of your payload data, you can resolve ambiguity in a pinch -- data with the later timestamp is 'more authoritative'. It's not terribly complex, but it makes for better sleep.
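For illustration, a bare-bones sketch of that multi-write / timestamp idea -- not the actual MultiputMemcacheDriver, just the concept -- assuming $pool is an array of connected Memcache objects, one per redundant server:

    <?php
    // Write the same payload (value plus timestamp) to every server in the pool.
    function multiput_set(array $pool, $key, $value, $expire = 0) {
        $payload = array('ts' => microtime(true), 'value' => $value);
        foreach ($pool as $mc) {
            $mc->set($key, $payload, 0, $expire);  // best effort on each server
        }
    }

    // Read from every server and keep the copy with the later timestamp --
    // that one is treated as 'more authoritative'.
    function multiput_get(array $pool, $key) {
        $best = false;
        foreach ($pool as $mc) {
            $payload = $mc->get($key);
            if ($payload !== false && ($best === false || $payload['ts'] > $best['ts'])) {
                $best = $payload;
            }
        }
        return $best === false ? false : $best['value'];
    }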

-Nathan / PBwiki


On May 18, 2007, at 9:59 AM, Kenner Stross wrote:

Hello,

I am using the PECL PHP extension for memcached access, and am confused/concerned about data integrity in the case of a failure. I have already found some discussions on this list regarding this issue, but I don't see how those solutions hold up in a multi-server environment.

What I've found so far is basically this: disable automatic failover, use a callback method to catch the failure, and in that callback routine set the server status to off and stop any further retrying (-1); lastly, implement an external service monitor that can detect the problem, flush the cache, and then mark the server as available again. That way, you can be sure all stale entries are flushed before the server rejoins the pool of active servers.
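(For reference, here's a minimal sketch of that setup, assuming the PECL Memcache extension's addServer / setServerParams API; the callback name and server address are placeholders.)

    <?php
    ini_set('memcache.allow_failover', '0');  // disable automatic failover

    $mc = new Memcache();

    // Called by the extension when a request to a server fails.
    function on_cache_failure($host, $port) {
        global $mc;
        // Mark the server offline and stop retrying (-1) so this client won't
        // silently read stale or rehashed data from it. An external monitor is
        // expected to flush the server and bring it back online later.
        $mc->setServerParams($host, $port, 1, -1, false, 'on_cache_failure');
    }

    $mc->addServer('m3.example.com', 11211, true, 1, 1, 15, true, 'on_cache_failure');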

Fine for one client accessing the cache server. But I don't see how that guarantees integrity in a multi-client environment. In particular, I don't see how it works when the failure is quite temporary, due to a heavy load that made the response too sluggish. Hopefully I'm just overlooking the obvious and one of you will straighten me out.

Let's imagine a simple 3 machine setup (m1 - m3), where each machine is acting as a web server and a memcached server.

m1 web --> attempts write to m3 cache, but it fails due to extreme load. Marks it as failed and offline (in the callback routine).
m2 web --> accesses m3 cache successfully (no load problem on m2, so no failure). Doesn't see that m1 took it offline.

m2 is using invalid cache data (it's missing m1's activity) but doesn't realize it. An external service monitor may or may not notice this brief, intermittent problem, but even if it does, that doesn't help m2 avoid the m3 cache once m1 has experienced an m3 cache failure.

I'm sure I must be missing something. Your help is greatly appreciated.

Thanks,
Kenner
