> Nope, basically it's like this: on every request (with an associated session) 
> the session is served from local memory (the jvm local session map). When the 
> request is finished, the session is sent to
> memcached (with the session timeout as expiration, e.g. 3600). It's still 
> held in the local session map, memcached is just there as a kind of backup 
> device. The session is only pulled from memcached if the tomcat 
> originally serving this session (say tomcat1) died and therefore 
> another tomcat (tomcat2) is asked to serve this session. tomcat2 then does 
> not have a session for the requested sessionId and
> therefore loads this session from memcached.
>
> The case that I was describing before (sending only modified sessions to 
> memcached) is a feature for performance optimization: the assumption is 
> that there are more requests just accessing the session and 
> fewer requests that actually modify the session. So the idea is that I don't 
> have to update a session in memcached that was accessed but not modified. The 
> issue that had to be handled then was only the case
> of the different timeouts/expiration times:
> - a session was first created/updated and stored in memcached: both in tomcat 
> and memcached it has an expiration of 1h
> - this session is accessed 10 minutes later; as it was not modified it is not 
> updated in memcached. Then this session has an expiration of 1 hour again in 
> tomcat, but in memcached it's already 10 minutes
> old, so it would expire 50 minutes later.
>
> To prevent this premature expiration I used the mentioned background thread 
> to update the expiration of such items in memcached to the remaining 
> expiration time they have in tomcat. In the example above, nearly 
> 50 minutes later, the session would be updated with an expiration time 
> of 10 minutes.

For sake of argument I think that background thread should just journal to
DB, but eh :P

I had a thought: you could use ADD to "ping" the sessions every time
they're accessed. When a session is served from local memory, do an async
ADD of a 0 byte value with a 1 second expiration time against that
session. If the add command fails it actually promotes the existing item
to the head of the LRU. If it doesn't fail you probably want to SET the
session back into memcached.
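Rough sketch of that flow in Java. A tiny in-memory map stands in for memcached here, and the method names are illustrative, not any particular client library's API:

```java
import java.util.HashMap;
import java.util.Map;

public class AddPing {
    /** Minimal stand-in for a memcached client: add() fails if the key exists. */
    static class FakeMemcached {
        final Map<String, byte[]> store = new HashMap<>();
        boolean add(String key, int expSeconds, byte[] value) {
            // Real memcached: a failed ADD also bumps the item to the LRU head.
            return store.putIfAbsent(key, value) == null;
        }
        void set(String key, int expSeconds, byte[] value) {
            store.put(key, value);
        }
    }

    /** Ping a session: if the ADD succeeds the item was gone, so SET it back. */
    static boolean ping(FakeMemcached mc, String sessionId, byte[] serialized) {
        boolean added = mc.add(sessionId, 1, new byte[0]); // 0-byte, 1s expiry
        if (added) {
            mc.set(sessionId, 3600, serialized); // restore the missing session
            return false; // was not present
        }
        return true; // was present; existing item promoted to head of LRU
    }

    public static void main(String[] args) {
        FakeMemcached mc = new FakeMemcached();
        byte[] session = "session-data".getBytes();
        mc.set("sid-1", 3600, session);
        System.out.println(ping(mc, "sid-1", session)); // present
        System.out.println(ping(mc, "sid-2", session)); // missing, restored
        System.out.println(mc.store.containsKey("sid-2"));
    }
}
```

When the ADD succeeds, the 1-second tombstone is immediately overwritten by the real SET, so nothing stale lingers.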

Feels like your background thread is a little wrong as well. If you can
journal in the session somewhere the last time it was written to
memcached, your thread could update purely based on that. ie; if they're
off by 5+ minutes, sync them asap. So then:

- session created, 1hr expiration. session notes it is "clean" with
memcached as of that second.

- session accessed 10 minutes later, not modified. 1hr expiration, 50 mins
in memcached. Session is pinged via ADD and moves to head of LRU. Fresh.

- 5 minutes later, background thread trawls through sessions in local
memory whose "memcached clean" timestamp is 5+ minutes off from its last
accessed time. syncs to memcached, updates session locally to note when it
was synced? Session is still relatively "fresh" (last accessed 10 minutes
ago). Bumping it to the top of the LRU isn't as bad of a problem as
bumping a 50 minute old session to the top for no reason.

- 55 minutes later, session expires from tomcat. Thread issues DELETE
against memcached which cleans up the session, if it's still there.
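The decision rule in those steps boils down to one comparison. A sketch, with made-up field names (a real version would hang these timestamps off the Tomcat session object):

```java
public class SyncCheck {
    static final long SYNC_LAG_MS = 5 * 60 * 1000; // the "5+ minutes off" rule

    static class TrackedSession {
        long lastAccessedMs;
        long memcachedCleanMs; // when this session was last written to memcached

        TrackedSession(long createdMs) {
            lastAccessedMs = createdMs;
            memcachedCleanMs = createdMs; // created == synced, so "clean"
        }
    }

    /** True if the background thread should push this session to memcached. */
    static boolean needsSync(TrackedSession s) {
        return s.lastAccessedMs - s.memcachedCleanMs >= SYNC_LAG_MS;
    }

    public static void main(String[] args) {
        long t0 = 0;
        TrackedSession s = new TrackedSession(t0);
        System.out.println(needsSync(s)); // just created: clean

        s.lastAccessedMs = t0 + 10 * 60 * 1000; // accessed 10 minutes later
        System.out.println(needsSync(s)); // 10 min behind: sync it

        s.memcachedCleanMs = s.lastAccessedMs; // background thread synced
        System.out.println(needsSync(s)); // clean again
    }
}
```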

> That's already done: sessions that are expiring in tomcat are deleted from 
> memcached
>
> The issue that I described regarding sessions, that were only accessed by the 
> application but not modified and therefore were not updated in memcached was 
> the following: when such a session (session A) is
> updated in memcached (just before it would expire in memcached) with a new 
> expiration time of say then 10 minutes (the time that it has left in tomcat), 
> it will be pushed to the head of the LRU.
> Another session (session B) might have been modified just 20 minutes before 
> and sent to memcached with an expiration of 60 minutes, this one will be 
> closer to the tail of the LRU than session A, even though 
> session B still has 40 minutes to live - 30 minutes more than session 
> A. And because it's closer to the tail, session B might be dropped before 
> session A, even though session A would already have expired.
>
> However, this would only be an issue if there are too many sessions for the 
> available memory of a slab.

Yeah. Think what I described above solves almost everything :p granted you
can add that sync timestamp. That's essentially the same algorithm I
describe for syncing to a database in my old post, + the ADD trick. Guess
I should go rewrite the post.

This sorta gates on your background thread being able to update a
timestamp in the local memory session though... Failing that you still
have some workaround options but I can't think of any non-ugly ones.

I'll push again on my main point though... Don't overengineer the
slabbing unless you prove that it's a problem first. The 1.4 series has many
counters available to tell if one slab class does not have enough memory
relative to other slabs. If you can prove that happens often, you'll need
to check this out more carefully.

> Tomcat provides a PersistentManager ([1]) which allows storing sessions in 
> the database. But this manager backs up all sessions in batches every 10 
> seconds. For one thing, scalability of the application is
> then directly dependent on the database (more than it already is if a 
> database is used), and there's a timeframe where sessions can be lost. If the 
> session backup frequency is shortened, the database is
> hit more often. Additionally, sessions are stored in the database again and 
> again even if they were not changed at all. That was the reason I decided 
> not to use this.

Yeah, I get why the original one is crap. I just don't see why the swing
from "it batches everything every 10 seconds because some guy lacked total
clue" to "holy _SHIT_ we can't touch the database ever!!!" is necessary.

In my old post I basically describe something where:

- A new session gets INSERT'ed into DB, and memcached.

- Fetched from memcached and updated back into memcached.

- On fetch, if you haven't synced with the DB in more than N minutes, sync
with the DB again.

- Alternatively a background thread could trawl through "select
session_id, sync_timestamp, last_known_fetch from session where etc"
every few minutes and sync stuff back to the DB if it's been changed (but
it doesn't have to keep checking if the last_accessed time isn't being
updated with the synced time).

- At some point background process or whatever DELETE's expired crap from
the DB.

So you still do some writes, but it should be vastly reduced compared to
updating the "last accessed" timestamp on every view, and reads against
the DB should almost never happen for active sessions that stay in cache.
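To get a rough feel for the write reduction, here's a toy simulation (numbers are illustrative only): a session fetched once a minute for an hour, written back to the DB only when the last sync is more than N=10 minutes old:

```java
public class WriteReduction {
    public static void main(String[] args) {
        long syncIntervalMs = 10 * 60 * 1000; // "N minutes" between DB syncs
        long lastDbSyncMs = 0;
        int dbWrites = 0;
        // One fetch per minute for 60 minutes.
        for (int minute = 1; minute <= 60; minute++) {
            long nowMs = minute * 60L * 1000;
            if (nowMs - lastDbSyncMs > syncIntervalMs) {
                dbWrites++;           // write-back to the DB
                lastDbSyncMs = nowMs; // note when we synced
            }
        }
        // Updating "last accessed" on every view would be 60 writes.
        System.out.println(dbWrites);
    }
}
```

Five writes instead of sixty for the same hour of traffic, and that gap only widens for hotter sessions.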

Given the addition of the ADD bit, I guess you could build it so that in a
pinch you could just shut off DB journaling to deal with some overload
scenario.

-Dormando
