On Jun 18, 2008, at 14:29, Joyesh Mishra wrote:
1  function get_foo(int userid) {
2      result = memcached_fetch("userrow:" + userid);
3      if (!result) {
4          result = db_select("SELECT * FROM users WHERE userid = ?", userid);
5          memcached_add("userrow:" + userid, result);
6      }
7      return result;
8  }
9  function update_foo(int userid, string dbUpdateString) {
10     result = db_execute(dbUpdateString);
11     if (result) {
12         data = createUserDataFromDBString(dbUpdateString);
13         memcached_set("userrow:" + userid, data);
14     }
15 }
*******
Imagine a table now getting queried on 2 columns, say userid and username.
Q1:
Suppose we have 100 processes each executing the get_foo function, and let's say memcached does not have the key. As there would be a delay between executing Line 2 and Line 5, there would be at least dozens of processes querying the DB and executing Line 5, creating more bottleneck on the memcached server - how does it scale then (imagine a million processes now getting triggered)?
I understand it is the initial load factor, but how do you take this into account while starting up the memcached servers?
The bottleneck isn't on the memcache server, it's on your DB. In
that case, sounds like you've got a really popular user. :)
You may have a bit of a thundering herd problem. If it's too
intense, you can create (or find) a locking mechanism to prevent the
thundering herd from thundering too hard.
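One way to sketch such a lock is to use add() itself as a mutex: whichever caller successfully add()s a sentinel lock key recomputes the value, while everyone else waits for the cache to be populated. The FakeCache class below is an in-memory stand-in so the sketch runs without a memcached server; its method names and the key names are illustrative, not a real client API.

```python
import threading
import time

# Minimal in-memory stand-in for a memcached client; add() fails if the
# key already exists, mirroring memcached's add semantics.
class FakeCache:
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def add(self, key, value):
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def delete(self, key):
        with self._lock:
            self._data.pop(key, None)

cache = FakeCache()
db_queries = []  # records every caller that actually hits the "database"

def db_select(userid):
    db_queries.append(userid)
    time.sleep(0.05)  # simulate a slow query
    return {"userid": userid}

def get_foo(userid):
    key = "userrow:%d" % userid
    row = cache.get(key)
    if row is not None:
        return row
    # add() on a sentinel key acts as the lock: only the caller whose
    # add succeeds recomputes; the rest poll until the cache is filled.
    lock_key = key + ":lock"
    if cache.add(lock_key, 1):
        try:
            row = db_select(userid)
            cache.set(key, row)
        finally:
            cache.delete(lock_key)
        return row
    for _ in range(100):          # lost the race: wait for the winner
        time.sleep(0.01)
        row = cache.get(key)
        if row is not None:
            return row
    return db_select(userid)      # fallback if the winner died

threads = [threading.Thread(target=get_foo, args=(42,)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(db_queries))
```

With 20 concurrent misses, only the lock winner queries the database; the other 19 get the cached row. With a real memcached client you would put an expiry on the lock key so a crashed winner cannot wedge everyone else.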
Q2:
Now imagine you have 100 processes again querying the key, out of which 50 execute get_foo() and 50 execute update_foo(). And let's say the key is not there on the memcached server. Imagine T1 doing a select operation followed by T2 doing an update. T1 is in Line 4 doing the select and *GOING* to add the key to the cache, while T2 goes ahead, updates the DB, and executes Line 13 (i.e. updates the cache). Now if T1 executes Line 5 it would have stale results (in such a case memcached_add fails, basically - but is that a sufficient guarantee that such a case would never arise?)
I don't know what API you're using, but memcached's add fails if a
value is already in the cache for the specified key.
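To replay the Q2 ordering concretely: because T2's set lands first, T1's late add is rejected and the stale row never overwrites the fresh one. The snippet below uses a plain dict and toy re-implementations of the add/set semantics, just for illustration, not a real client.

```python
# Dict-backed stand-in for memcached, to illustrate add vs. set semantics.
cache = {}

def memcached_add(key, value):
    # add: store only if the key is absent; returns False otherwise.
    if key in cache:
        return False
    cache[key] = value
    return True

def memcached_set(key, value):
    # set: unconditional store.
    cache[key] = value
    return True

# T2 updates the cache first (Line 13), then T1's stale add (Line 5) arrives:
memcached_set("userrow:42", {"name": "fresh"})
ok = memcached_add("userrow:42", {"name": "stale"})
print(ok, cache["userrow:42"]["name"])  # False fresh
```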
Q3:
Now we have 2 queries, say:
select * from users where userid = abc;
select * from users where username = xyz;
Users
|userid|username|userinfo|
and I want memcached to improve the query performance.
I had 2 approaches:
1. Cache1: Key=userid Value=User_Object
   Cache2: Key=username Value=userid
2. Cache1: Key=userid Value=User_Object
   Cache2: Key=username Value=User_Object
Do you see potential flaws in either of these approaches? I tried to trace the flaws in the first one using various db calls; still, I would ask if you guys have seen it before.
If you're really concerned about stale objects here, you can use
CAS. For most of these issues, `get || add' combinations give you a
reasonable level of atomicity. Most of the time, however, it really
doesn't matter.
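The CAS round-trip looks roughly like this: gets() returns the value together with a CAS token, and cas() stores only if nobody has written the key since that token was issued. CasCache below is an in-memory illustration of that contract (real clients expose gets/cas against the server; the class itself is made up for the sketch).

```python
import itertools

# Toy CAS-capable store sketching memcached's gets/cas contract.
class CasCache:
    def __init__(self):
        self._data = {}                  # key -> (value, cas_token)
        self._tokens = itertools.count(1)

    def gets(self, key):
        # Return (value, token); (None, None) on a miss.
        entry = self._data.get(key)
        return entry if entry else (None, None)

    def cas(self, key, value, token):
        # Store only if nobody wrote the key since our gets().
        _, current = self.gets(key)
        if current != token:
            return False
        self._data[key] = (value, next(self._tokens))
        return True

    def set(self, key, value):
        self._data[key] = (value, next(self._tokens))

cache = CasCache()
cache.set("userrow:1", {"info": "old"})

value, token = cache.gets("userrow:1")   # reader snapshots value + token
cache.set("userrow:1", {"info": "new"})  # concurrent writer sneaks in
ok = cache.cas("userrow:1", {"info": "reader-update"}, token)
print(ok)  # False: the token is stale, so the overwrite is rejected
```

The losing caller then re-reads (gets again) and retries, which is what gives you the read-modify-write atomicity without a lock.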
I would like to know in detail how memcached server handles queueing
of these requests and atomicity of requests. If there are any posts/
info on it, please let me know.
There's no real queue other than connection management threads
huddled around the storage mutex. At the point where memcached says
you've written, it's done.
--
Dustin Sallings