Hi All,
I have some questions regarding memcached updates and would like your comments. I
am a newbie here (in case there is a separate list I should post to, please let
me know).
** Source Wikipedia **
1  function get_foo(int userid) {
2      result = memcached_fetch("userrow:" + userid);
3      if (!result) {
4          result = db_select("SELECT * FROM users WHERE userid = ?", userid);
5          memcached_add("userrow:" + userid, result);
6      }
7      return result;
8  }
9  function update_foo(int userid, string dbUpdateString) {
10     result = db_execute(dbUpdateString);
11     if (result) {
12         data = createUserDataFromDBString(dbUpdateString);
13         memcached_set("userrow:" + userid, data);
14     }
15 }
*******
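To make my questions concrete, here is roughly how I read that pseudocode in
Python, using the pymemcache client (my own choice; db_select, db_execute and
createUserDataFromDBString are placeholders for whatever DB layer sits
underneath, so please treat this as a sketch rather than real code):

import json
from pymemcache.client.base import Client

# Assumed connection details; adjust host/port for your setup.
mc = Client(("localhost", 11211))

def get_foo(userid):
    key = "userrow:%d" % userid
    cached = mc.get(key)                      # Line 2: try the cache first
    if cached is not None:
        return json.loads(cached)
    row = db_select("SELECT * FROM users WHERE userid = %s", userid)  # Line 4, placeholder helper
    # Line 5: add() only stores if the key is not already present.
    mc.add(key, json.dumps(row).encode("utf-8"), noreply=False)
    return row

def update_foo(userid, db_update_string):
    ok = db_execute(db_update_string)         # Line 10, placeholder helper
    if ok:
        data = createUserDataFromDBString(db_update_string)  # Line 12, placeholder helper
        # Line 13: set() unconditionally overwrites the cached row.
        mc.set("userrow:%d" % userid, json.dumps(data).encode("utf-8"))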
Now imagine a table that gets queried on two columns, say userid and username.
Q1:
If we have 100 processes each executing get_foo() and memcached does not have
the key yet, then because of the delay between executing Line 2 and Line 5
there would be at least dozens of processes querying the DB and then executing
Line 5, creating more of a bottleneck on the memcached server. How does that
scale (now imagine a million processes being triggered)? I understand this is
the initial, cold-cache load, but how do you take it into account when
starting up the memcached servers?
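From what I have read, one common way to limit this thundering-herd effect on
a cold key is to let only one process rebuild the value, using memcached's
add() as a cheap lock. Here is just a sketch of that idea under the same
assumptions as my Python version above (pymemcache, placeholder db_select; the
TTLs and the sleep are numbers I made up):

import json
import time
from pymemcache.client.base import Client

mc = Client(("localhost", 11211))

def get_foo_with_lock(userid):
    key = "userrow:%d" % userid
    cached = mc.get(key)
    if cached is not None:
        return json.loads(cached)

    # Cache miss: try to become the single rebuilder. add() is atomic on the
    # server, so at most one process gets the lock per 10-second window.
    lock_key = key + ":rebuild"
    if mc.add(lock_key, b"1", expire=10, noreply=False):
        row = db_select("SELECT * FROM users WHERE userid = %s", userid)  # placeholder helper
        mc.set(key, json.dumps(row).encode("utf-8"), expire=300)
        mc.delete(lock_key)
        return row

    # Someone else is rebuilding: wait briefly, re-check the cache, and only
    # fall back to the database if the value still has not appeared.
    time.sleep(0.05)
    cached = mc.get(key)
    if cached is not None:
        return json.loads(cached)
    return db_select("SELECT * FROM users WHERE userid = %s", userid)

Is something like this (or simply pre-warming the cache before taking traffic)
what people actually do in practice?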
Q2:
Now imagine you again have 100 processes working on the same key, of which 50
execute get_foo() and 50 execute update_foo(), and let's say the key is not on
the memcached server. Suppose T1 does a select followed by T2 doing an update:
T1 is at Line 4 doing the select and is *GOING* to add the key to the cache,
while T2 goes ahead, updates the DB and executes Line 13 (i.e. sets the cache).
If T1 now executes Line 5 it would be writing stale results (in this particular
case memcached_add basically fails because the key already exists, but is that
a sufficient guarantee that such a case can never arise?)
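Related to this, I came across memcached's check-and-set commands (gets/cas),
which look like they are meant for exactly this kind of read-modify-write race.
Here is a rough sketch of how I understand them, again with pymemcache and a
hypothetical fresh_row coming straight from the DB; please correct me if I have
the semantics wrong:

import json
from pymemcache.client.base import Client

mc = Client(("localhost", 11211))

def store_user_row(userid, fresh_row):
    # gets() returns the value together with a CAS token; cas() only stores
    # if the key has not been written since that gets(). Sketch only, this is
    # not part of the Wikipedia example.
    key = "userrow:%d" % userid
    _value, cas_token = mc.gets(key)
    if cas_token is None:
        # Key absent: add() fails if a concurrent update_foo() has already
        # set a newer row, which is the situation described above.
        return mc.add(key, json.dumps(fresh_row).encode("utf-8"), noreply=False)
    # Key present: overwrite only if nobody has written it since our gets().
    return mc.cas(key, json.dumps(fresh_row).encode("utf-8"), cas_token)

Does relying on add()/cas() like this cover all the interleavings, or are there
still windows where stale data can sit in the cache until it expires?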
Q3:
Now say we have two queries:
select * from users where userid = abc;
select * from users where username = xyz;
Users
|userid|username|userinfo|
and I want to use memcached to improve the query performance. I had two
approaches in mind:
1. Cache1: Key=userid Value=User_Object
Cache2: Key=username Value=userid
2. Cache1: Key=userid Value=User_Object
Cache2: Key=username Value=User_Object
Do you see potential flaws in either of these approaches? I tried to trace the
flaws in the first one by walking through various DB call sequences, but I
would still like to ask whether you have seen this before.
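To make approach 1 concrete, this is roughly what I had in mind: the username
cache stores only the userid, so the actual row lives in exactly one place
keyed by userid (pymemcache again; db_select_by_username is a placeholder
helper of mine):

import json
from pymemcache.client.base import Client

mc = Client(("localhost", 11211))

def get_user_by_username(username):
    # Cache2: username -> userid (indirection only).
    userid = mc.get("username:" + username)
    if userid is None:
        row = db_select_by_username(username)   # placeholder helper
        mc.set("username:" + username, str(row["userid"]).encode("utf-8"))
        mc.set("userrow:%d" % row["userid"], json.dumps(row).encode("utf-8"))
        return row

    # Cache1: userid -> full User_Object, shared with lookups by userid.
    cached = mc.get("userrow:%d" % int(userid))
    if cached is not None:
        return json.loads(cached)
    row = db_select_by_username(username)       # placeholder helper
    mc.set("userrow:%d" % row["userid"], json.dumps(row).encode("utf-8"))
    return row

My (possibly wrong) reading of the trade-off is that approach 2 keeps two
copies of the same User_Object that can drift apart on update, while approach 1
keeps a single copy at the cost of an extra cache round trip. Does that match
what you have seen?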
I would also like to understand in detail how the memcached server handles
queueing of these requests and the atomicity of individual operations. If there
are any posts or other information on this, please let me know.
Thanks
-J