Re: Apache::Session - What goes in session?
On Tue, 20 Aug 2002 [EMAIL PROTECTED] wrote:

> We are investigating using IPC rather than a file-based structure, but it's purely investigation at this point. What are the speed diffs between an IPC cache and a Berkeley DB cache? My gut instinct always screams 'Stay Off The Disk' but my gut is not always right.. Ok, rarely right.. ;)

IPC (for many definitions of that) has all sorts of odd limitations and isn't that fast. Don't go there.

The disk is usually much faster than you think. Often overlooked for caching is a simple file-based cache.

Here's a story about that: A while ago Graham Barr and I spent some time going through a number of iterations for a self-cleaning cache system. It would take lots of writes and fewer reads. In each cache entry a number of integers would be stored. Just storing the last thousand entries would be enough.

We tried quite a few different approaches; the most noteworthy was a system of semaphores to control access to a number of slots in a BerkeleyDB. That should be pretty fast, right? It got a bit complicated as our systems didn't support that many semaphores, so we had to come up with a system for sharing the semaphores across multiple slots in the database.

Designing and writing this implementation took a few days. It was really cool.

Anyway, after fixing that and a few deadlocks we were benchmarking away. The system was so clever. We thought it was simple and neat. Okay, neat at least. And it was really slow. Slow. (~200 writes a second on a 400MHz Pentium II if I recall correctly.)

First we suspected we did something wrong with the semaphores, but further benchmarking showed that the BerkeleyDB just wasn't that fast for writing.

30 minutes of thinking and 30 minutes of typing code later we had a prototype for a simple file-based system, now using good old Fcntl to control access to simple flat files. (Data serialized with pack("N*", ...); I don't think anything beats pack and unpack for serializing data.)

The expiration went into the data, and purging the cache was a simple cron job to find files older than a few minutes and delete them.

The performance? I don't remember the exact figure, but it was at least several times faster than the BerkeleyDB system. And *much* simpler.

The moral of the story: Flat files rock! ;-)

 - ask

-- 
ask bjoern hansen, http://www.askbjoernhansen.com/ !try; do();
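The flat-file approach in the story above can be sketched roughly as follows. This is not the code Ask and Graham wrote; it is a minimal illustration that assumes one file per cache key, flock() with the Fcntl constants for access control, and pack("N*") for the integers. The names cache_write, cache_read and cache_purge and the temporary directory are made up for the example, and the purge here goes by file modification time, the way the cron job is described.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical cache location; a real setup would use a fixed directory.
my $cache_dir = tempdir(CLEANUP => 1);

sub cache_path { File::Spec->catfile($cache_dir, $_[0]) }

# Store a list of unsigned 32-bit integers under $key.
sub cache_write {
    my ($key, @ints) = @_;
    my $path = cache_path($key);
    # Open without truncating, so the file only changes once we hold the lock.
    sysopen(my $fh, $path, O_RDWR | O_CREAT) or die "open $path: $!";
    flock($fh, LOCK_EX)                      or die "lock $path: $!";
    truncate($fh, 0)                         or die "truncate $path: $!";
    binmode($fh);
    print {$fh} pack('N*', @ints)            or die "write $path: $!";
    close($fh)                               or die "close $path: $!";  # unlocks
    return;
}

# Return the stored integers, or an empty list on a cache miss.
sub cache_read {
    my ($key) = @_;
    my $path = cache_path($key);
    open(my $fh, '<', $path) or return;
    flock($fh, LOCK_SH)      or die "lock $path: $!";
    binmode($fh);
    my $raw = do { local $/; <$fh> };   # slurp the whole file
    close($fh);
    return unpack('N*', defined $raw ? $raw : '');
}

# What the cron job would do: delete entries older than $max_age seconds.
sub cache_purge {
    my ($max_age) = @_;
    opendir(my $dh, $cache_dir) or die "opendir $cache_dir: $!";
    for my $name (readdir $dh) {
        my $path = cache_path($name);
        next unless -f $path;
        my $mtime = (stat $path)[9];
        unlink $path if defined $mtime && time - $mtime > $max_age;
    }
    closedir($dh);
    return;
}

cache_write('user42', 10, 20, 30);
my @ints = cache_read('user42');
print "@ints\n";   # prints "10 20 30"
```

Opening the file without truncating and only calling truncate() after the exclusive lock is held keeps a concurrent reader from seeing a half-written entry.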
Re: Apache::Session - What goes in session?
On 21 Aug 2002 at 2:09, Ask Bjoern Hansen wrote:

> Now using good old Fcntl to control access to simple flat files. (Data serialized with pack("N*", ...); I don't think anything beats pack and unpack for serializing data.)
>
> The expiration went into the data, and purging the cache was a simple cron job to find files older than a few minutes and delete them.
>
> The performance? I don't remember the exact figure, but it was at least several times faster than the BerkeleyDB system. And *much* simpler.
>
> The moral of the story: Flat files rock! ;-)

If I'm using Apache::DBI so I have a persistent connection to MySQL, would it not be faster to simply use a table in MySQL?

Peter

---
Reality is that which, when you stop believing in it, doesn't go away.
-- Philip K. Dick
RE: Apache::Session - What goes in session?
Hi Peter --

> > The moral of the story: Flat files rock! ;-)
>
> If I'm using Apache::DBI so I have a persistent connection to MySQL, would it not be faster to simply use a table in MySQL?

Unlikely. Even with cached database connections you are probably not going to beat the performance of going to a flat text file. Accessing files is something the OS is optimized to do. The process of issuing a SQL query, having it parsed, and retrieving results is probably more time-consuming than you think.

One way to think about it is this: MySQL stores its data in files. There are many layers of code between DBI and those files, each of which adds processing time. Going directly to files is far less code, and less code is most often faster code.

The best way to be sure is to benchmark the difference yourself. Try out the Benchmark module. Quantitative data trumps anecdotal data every time.

Warmest regards,

-Jesse-

-- 
Jesse Erlbaum
The Erlbaum Group
[EMAIL PROTECTED]
Phone: 212-684-6161
Fax: 212-684-6226
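As a rough illustration of the "benchmark it yourself" advice, the core Benchmark module can compare candidates side by side. The sketch below is only an assumption about what one might measure: it pits pack/unpack against Storable for serializing a thousand integers, since both come up in this thread. To compare a flat file against MySQL you would replace the two subs with your own file read and your own DBI query. The payload and the iteration count are arbitrary.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);
use Storable qw(freeze thaw);

# Illustrative payload: a thousand integers, as in the cache story.
my @ints = (1 .. 1000);

# Run each candidate 2000 times; 'none' suppresses the per-test output.
my $results = timethese(2000, {
    pack_unpack => sub {
        my @copy = unpack('N*', pack('N*', @ints));
    },
    storable => sub {
        my @copy = @{ thaw(freeze(\@ints)) };
    },
}, 'none');

# Print a table of rates and relative differences.
cmpthese($results);
```

The numbers depend entirely on the machine and the data, so treat any single run as a hint rather than a verdict.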
Re: Apache::Session - What goes in session?
Jesse Erlbaum [EMAIL PROTECTED] wrote:

> > If I'm using Apache::DBI so I have a persistent connection to MySQL, would it not be faster to simply use a table in MySQL?
>
> Unlikely. Even with cached database connections you are probably not going to beat the performance of going to a flat text file. Accessing files is something the OS is optimized to do. The process of issuing a SQL query, having it parsed, and retrieving results is probably more time-consuming than you think.

All depends on the file structure. A linear search through a thousand records can be slower than a binary search through a million (500 ave. compares vs. about 20 max [10 ave.] compares - hope the extra overhead for the binary search is worth the savings in comparisons).

> One way to think about it is this: MySQL stores its data in files. There are many layers of code between DBI and those files, each of which adds processing time. Going directly to files is far less code, and less code is most often faster code.

MySQL also stores indices. As soon as you start having to store data in files and maintain indices, you might as well start using a database.

> The best way to be sure is to benchmark the difference yourself. Try out the Benchmark module. Quantitative data trumps anecdotal data every time.

Definitely. But before you do, make sure the proper indices are created on the MySQL side. Wrong database configurations can kill any performance gain.

-- 
James Smith [EMAIL PROTECTED], 979-862-3725
Texas A&M CIS Operating Systems Group, Unix
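The comparison counts in James's reply can be checked with a small counting experiment. This is a sketch under simple assumptions: records are sorted integers held in memory, and one "compare" means one record examined. The function names are invented for the example. Counted this way, the average for a binary search tends to land near its maximum rather than at half of it, which does not change the argument that the indexed lookup wins by a wide margin.

```perl
use strict;
use warnings;

# Number of records examined by a linear scan until $target is found.
sub linear_probes {
    my ($list, $target) = @_;
    my $n = 0;
    for my $item (@$list) {
        $n++;
        last if $item == $target;
    }
    return $n;
}

# Number of records examined by a binary search over a sorted list.
sub binary_probes {
    my ($list, $target) = @_;
    my ($lo, $hi, $n) = (0, $#$list, 0);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        $n++;
        return $n if $list->[$mid] == $target;
        if ($list->[$mid] < $target) { $lo = $mid + 1 }
        else                         { $hi = $mid - 1 }
    }
    return $n;
}

# Linear search: look up every one of a thousand records once.
my @small = (1 .. 1_000);
my $total = 0;
$total += linear_probes(\@small, $_) for @small;
my $linear_avg = $total / @small;

# Binary search: sample targets spread across a million records.
my @large = (1 .. 1_000_000);
my ($max, $sum, $count) = (0, 0, 0);
for (my $t = 1; $t <= @large; $t += 997) {
    my $p = binary_probes(\@large, $t);
    $max = $p if $p > $max;
    $sum += $p;
    $count++;
}

printf "linear, 1,000 records: %.1f compares on average\n", $linear_avg;
printf "binary, 1,000,000 records (sampled): max %d, average %.1f\n",
    $max, $sum / $count;
```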
RE: Apache::Session - What goes in session?
Hey James --

> > One way to think about it is this: MySQL stores its data in files. There are many layers of code between DBI and those files, each of which adds processing time. Going directly to files is far less code, and less code is most often faster code.
>
> MySQL also stores indices. As soon as you start having to store data in files and maintain indices, you might as well start using a database.

You bring up a really important point: Scale. If a custom file-based data storage system starts growing in both size and functionality, it will sooner or later reach a point where it is a far worse solution.

Relational databases are optimized for two things: ease of access and management of large data sets. If the data set is small and the functional requirements are very narrow, then a custom system can outperform a database most of the time (not including poorly written systems!). Once you have to deal with large amounts of data, or you need to have an interface which allows customizable retrieval of sub-sets of data (a la SQL), a database is going to be the way to go.

The trick is knowing which path to choose. Having an idea of the potential growth of the system and use of the data, combined with a few well-chosen benchmarks, comes in handy here.

TTYL,

-Jesse-
Re: Apache::Session - What goes in session?
Ask Bjoern Hansen wrote:

> The performance? I don't remember the exact figure, but it was at least several times faster than the BerkeleyDB system. And *much* simpler.

In my benchmarks, recent versions of BerkeleyDB, used with the BerkeleyDB module and allowed to manage their own locking, beat all available flat-file modules. It may be possible to improve the flat-file ones, but it even beat Tie::TextDir, which is about as simple (and therefore fast) as they come. The only thing that did better was IPC::MM.

- Perrin
Re: Apache::Session - What goes in session?
Peter J. Schoenster wrote:

> If I'm using Apache::DBI so I have a persistent connection to MySQL, would it not be faster to simply use a table in MySQL?

Probably not, if the MySQL server is on a separate machine. If it's on the same machine, it would be close. Remember, MySQL has more work to do (parse SQL statement, make query plan, etc.) than a simple hash-based system like BerkeleyDB does. Best thing would be to benchmark it though.

- Perrin
Re: Apache::Session - What goes in session?
--- Perrin Harkins [EMAIL PROTECTED] wrote:

> There are a few ways to deal with this. The simplest is to use the sticky load-balancing feature that many load-balancers have. Failing that, you can store to a network file system like NFS or CIFS, or use a database. (There are also fancier options with things like Spread, but that's getting a little ahead of the game.) You can use MySQL for caching, and it will probably have similar performance to a networked file system. Unfortunately, the Apache::Session code isn't all that easy to use for this, since it assumes you want to generate IDs for the objects you store rather than passing them in. You could adapt the code from it to suit your needs though. The important thing is to leave out all of the mutually exclusive locking it implements, since a cache is all about getting the latest as quickly as you can, and lost updates are not a problem (last save wins is good enough for a cache).

I haven't looked at the cache modules docs yet...would it be possible to build cache on the separate load-balanced machines as we go along...as we do with template caching? By that I mean if an item has cached on machine one then further requests on machine one will come from cache, where if on machine two the same item hasn't cached, it will be pulled from the db the first time and then cached?

If this isn't possible, I'm not sure if I'll be able to implement any caching or not (some of the site configuration is out of my hands) and everything seems so user specific...I'll definitely reread your posts and go through my app for things that should be cached.

I would be curious though that if my choice is simply that the data is stored in the session or comes from the database with each request, would it still be best to essentially only store the session id in the session and pull everything else from the db? It still seems that something trivial like a greeting name (a preference) could go in the session.

> The relationships to the features and pages differ by user, but there might be general information about the features themselves that is stored in the database and is not user-specific. That could be cached separately, to save some trips to the db for each user.

The only thing I can think of right now is a calendar...that should probably be cached. The only gotcha would be that the calendar would need to update every day, at least on the current month's pages. But this is only on a feature page, not a user-created page (that is, a user can click a link on their daily page that takes them to a feature page where they can go through archives).

> You can cache the names too if you want to, but keeping them out of the session means that you won't be slowed down by fetching that extra data and de-serializing it with Storable unless the page you're on actually needs it.

Even though there are some preset pages, the user can change the names and the user can also create a custom page with its own name. So there could be thousands of unique page names, many (most) specific to unique users (like Jim's Sports Page, etc.).

Not to mention that between the fact that the users' daily pages can have any number of user-selected features per page and features themselves can have archive depths of anywhere from 3 to 20 years, there's a lot of info.

> It's also good to separate things that have to be reliable (like the ID of the current user, since without that you have to send them back to log in again) from things that don't need to be (you could always fetch the list of pages from the db if your cache went down).

Very good advice. I've found that occasionally something happens to my session where the session id is ok but some of the other data disappears (like current page id), which really screws things up until you log out and log back in again.

This leads me to suspect that I've answered my own question from above. It's just whether I can cache or not.

Thanks for all your time and help.
Re: Apache::Session - What goes in session?
On Mon, Aug 19, 2002 at 06:54:01PM -0700, md wrote:

> I can definitely get it all from the db, but that doesn't seem very efficient.

Don't worry about whether it *seems* efficient. Do it right, and then worry about how to speed that up - if, and only if, it's too slow. Premature optimisation is the root of all evil, and all that ..

At BlackStar the session was just a single hashed ID and all other info was loaded from the database every time. We thought about caching some info a few times, but always ran into problems with replication.

In the end we discovered that fetching everything from the database on every request wasn't noticeably slower than anything else we could come up with, and was a lot more flexible.

Throwing more memory at the database servers was usually quicker, cheaper and more effective than micro-optimising our session vs caching strategy...

Tony
Re: Apache::Session - What goes in session?
We do see some slowdown on our language translation db calls since they are so intensive. Moving to a 'per child' cache for each string as it came out of the db sped page loads up from 4.5 seconds to .6-1.0 seconds per page, which is significant.

Currently we are working on a 'per machine' cache so all children can benefit from each child's initial database read of the translated string; the differential between children is annoying in the 'per child cache' strategy.

John-

On Tue, 20 Aug 2002 16:33:07 +0100 Tony Bowden [EMAIL PROTECTED] wrote:

> On Mon, Aug 19, 2002 at 06:54:01PM -0700, md wrote:
> > I can definitely get it all from the db, but that doesn't seem very efficient.
>
> Don't worry about whether it *seems* efficient. Do it right, and then worry about how to speed that up - if, and only if, it's too slow. Premature optimisation is the root of all evil, and all that ..
>
> At BlackStar the session was just a single hashed ID and all other info was loaded from the database every time. We thought about caching some info a few times, but always ran into problems with replication.
>
> In the end we discovered that fetching everything from the database on every request wasn't noticeably slower than anything else we could come up with, and was a lot more flexible.
>
> Throwing more memory at the database servers was usually quicker, cheaper and more effective than micro-optimising our session vs caching strategy...
>
> Tony
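A 'per child' cache of the kind John describes can be as small as a hash that lives for the lifetime of the process. The sketch below is an assumption about the general shape, not John's code: fetch_translation_from_db is a made-up stand-in for the expensive query, and the counter exists only to show how often it runs.

```perl
use strict;
use warnings;

# One copy of this hash exists per process (per Apache child), which is
# why each child has to pay for its own first lookup of every string.
my %translation_cache;
my $db_calls = 0;

# Hypothetical stand-in for the expensive translation query.
sub fetch_translation_from_db {
    my ($lang, $key) = @_;
    $db_calls++;
    return "[$lang] $key";
}

# Look in the in-process cache first; fall back to the database once.
sub translate {
    my ($lang, $key) = @_;
    my $cache_key = "$lang\0$key";
    $translation_cache{$cache_key} = fetch_translation_from_db($lang, $key)
        unless exists $translation_cache{$cache_key};
    return $translation_cache{$cache_key};
}

translate('fr', 'welcome') for 1 .. 5;
print "database calls: $db_calls\n";   # prints "database calls: 1"
```

Because the hash is private to each child, the same string is still fetched once per child, which is the differential between children that the 'per machine' cache is meant to remove.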
Re: Apache::Session - What goes in session?
On Tue, 20 Aug 2002 [EMAIL PROTECTED] wrote:

> Currently we are working on a 'per machine' cache so all children can benefit from each child's initial database read of the translated string; the differential between children is annoying in the 'per child cache' strategy.

Sounds like you want BerkeleyDB.pm (not DB_File), which is quite fast and handles locking/concurrent access internally (when set up properly). See the Alzabo::ObjectCache::{Store,Sync}::BerkeleyDB modules for examples.

For Alzabo, I also have a caching system that caches data in a database, for cross-machine caching/syncing. I haven't really benchmarked it yet but I imagine it could be a win in some situations. For example, you could set up the cache as a separate machine running MySQL and still pull your data from another machine, possibly running a different RDBMS.

-dave

/*== www.urth.org we await the New Sun ==*/
Re: Apache::Session - What goes in session?
We are investigating using IPC rather than a file-based structure, but it's purely investigation at this point.

What are the speed diffs between an IPC cache and a Berkeley DB cache? My gut instinct always screams 'Stay Off The Disk' but my gut is not always right.. Ok, rarely right.. ;)

John-

On Tue, 20 Aug 2002 11:49:52 -0500 (CDT) Dave Rolsky [EMAIL PROTECTED] wrote:

> Sounds like you want BerkeleyDB.pm (not DB_File), which is quite fast and handles locking/concurrent access internally (when set up properly). See the Alzabo::ObjectCache::{Store,Sync}::BerkeleyDB modules for examples.
>
> For Alzabo, I also have a caching system that caches data in a database, for cross-machine caching/syncing. I haven't really benchmarked it yet but I imagine it could be a win in some situations. For example, you could set up the cache as a separate machine running MySQL and still pull your data from another machine, possibly running a different RDBMS.
>
> -dave
Re: Apache::Session - What goes in session?
md wrote:

> I haven't looked at the cache modules docs yet...would it be possible to build cache on the separate load-balanced machines as we go along...as we do with template caching?

Of course. However, if a user is sent to a random machine each time you won't be able to cache anything that a user is allowed to change during their time on the site, because they could end up on a machine that has an old cached value for it. Sticky load-balancing or a cluster-wide cache (which you can update when data changes) deals with this problem.

> everything seems so user specific...

That doesn't mean you can't cache it. You can do basically the same thing you were doing with the session: stuff a hash of user-specific stuff into the cache. The next time that user sends a request, you check the cache for data on that user ID (you get the user ID from the session) and if you don't find any you just fetch it from the db. Pseudo-code:

    sub fetch_user_data {
        my $user_id = shift;
        my $user_data;
        unless ($user_data = fetch_from_cache($user_id)) {
            $user_data = fetch_from_db($user_id);
        }
        return $user_data;
    }

> I would be curious though that if my choice is simply that the data is stored in the session or comes from the database with each request, would it still be best to essentially only store the session id in the session and pull everything else from the db? It still seems that something trivial like a greeting name (a preference) could go in the session.

Your decision about what to put in the session is not connected to your decision about what to pull from the db each time. You can cache all the data if you want to, and still have very little in the session.

This might sound like an academic distinction, but I think it's important to keep the concepts separate: a session is a place to store transient state information that is irrelevant as soon as the user logs out, and a cache is a way of speeding up access to a slow resource like a database, and the two things should not be confused. You can actually cache the session data if you need to (with a write-through cache that updates the backing database as well). A cache will typically be faster than session storage because it doesn't need to be very reliable and because you can store and retrieve individual chunks of data (user's name, page names) when you need them instead of storing and retrieving everything on every request.

Separating these concepts allows you to do things like migrate the session storage to a transactional database some day, and move your cache storage to a distributed multicast cache when someone comes out with a module for that.

> The only gotcha would be that the calendar would need to update every day, at least on the current month's pages.

The cache modules I mentioned have a concept of timeout so that you can say "cache this for 12 hours" and then when it expires you fetch it again and update the cache for another 12 hours.

> Even though there are some preset pages, the user can change the names and the user can also create a custom page with its own name.

No problem, you can cache data that's only useful for a single user, as I explained above.

> Not to mention that between the fact that the users' daily pages can have any number of user-selected features per page and features themselves can have archive depths of anywhere from 3 to 20 years, there's a lot of info.

No problem, disks are cheap. 400MB of disk space will cost you about as much as a movie in New York these days.

- Perrin
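To make the fetch-on-miss idea and the timeout concrete, here is a self-contained sketch. It deliberately uses a plain in-memory hash instead of any of the cache modules named in this thread, so cache_set, cache_get and the "user:" key prefix are assumptions for illustration only. It also stores the result after a database fetch, a step the pseudo-code above leaves implicit, and fetch_from_db is a made-up stand-in.

```perl
use strict;
use warnings;

my %cache;        # key => { value => ..., expires => epoch seconds }
my $db_hits = 0;  # only here to show how often the "database" is consulted

# Store a value together with the time at which it stops being valid.
sub cache_set {
    my ($key, $value, $ttl) = @_;
    $cache{$key} = { value => $value, expires => time + $ttl };
    return $value;
}

# Return the cached value, or undef if it is missing or has timed out.
sub cache_get {
    my ($key) = @_;
    my $entry = $cache{$key};
    return undef unless $entry;
    if ($entry->{expires} <= time) {   # timed out: treat as a miss
        delete $cache{$key};
        return undef;
    }
    return $entry->{value};
}

# Hypothetical stand-in for the real database query.
sub fetch_from_db {
    my ($user_id) = @_;
    $db_hits++;
    return { greeting => "Hello, user $user_id" };
}

# Check the cache first; on a miss, fetch from the db and remember it.
sub fetch_user_data {
    my ($user_id) = @_;
    my $user_data = cache_get("user:$user_id");
    unless ($user_data) {
        $user_data = fetch_from_db($user_id);
        cache_set("user:$user_id", $user_data, 12 * 60 * 60);  # 12 hours
    }
    return $user_data;
}

fetch_user_data(7) for 1 .. 3;
print "database hits: $db_hits\n";   # prints "database hits: 1"
```

The session would hold only the user ID; everything looked up through fetch_user_data can be thrown away and rebuilt from the database at any time.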
Re: Apache::Session - What goes in session?
Thanks...you've given me plenty to work with. Great explanation. This is good pragmatic stuff to know!
Re: Apache::Session - What goes in session?
[EMAIL PROTECTED] wrote:

> We are investigating using IPC rather than a file-based structure, but it's purely investigation at this point.
>
> What are the speed diffs between an IPC cache and a Berkeley DB cache? My gut instinct always screams 'Stay Off The Disk' but my gut is not always right.. Ok, rarely right.. ;)

Most of the shared memory modules are much slower than Berkeley DB. The fastest option around is IPC::MM, but data you store in that does not persist if you restart the server, which is a problem for some. BerkeleyDB (the new one, not DB_File) is also very fast, and other options like Cache::Mmap and Cache::FileCache are much faster than anything based on IPC::ShareLite and the like.

I have charts and numbers in my TPC presentation, which I will be putting up soon.

- Perrin
Re: Apache::Session - What goes in session?
Thanks, you just saved us a ton of time. Off to change course ;)

J

On Tue, 20 Aug 2002 13:12:29 -0400 Perrin Harkins [EMAIL PROTECTED] wrote:

> Most of the shared memory modules are much slower than Berkeley DB. The fastest option around is IPC::MM, but data you store in that does not persist if you restart the server, which is a problem for some. BerkeleyDB (the new one, not DB_File) is also very fast, and other options like Cache::Mmap and Cache::FileCache are much faster than anything based on IPC::ShareLite and the like.
>
> I have charts and numbers in my TPC presentation, which I will be putting up soon.
>
> - Perrin
Re: Apache::Session - What goes in session?
Just to jump in here - as I understand it you can split a hash across multiple threads if you preload it before Apache forks. So load it in your startup.pl and get it in memory prior to forking. It'll be part of the shared memory since you aren't writing to it.

Or at least that's how I understand the theory to work anyway.

Josh

[EMAIL PROTECTED]
08/20/2002 10:54 AM
To: Tony Bowden [EMAIL PROTECTED], md [EMAIL PROTECTED]
cc: Perrin Harkins [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: Apache::Session - What goes in session?

> We do see some slowdown on our language translation db calls since they are so intensive. Moving to a 'per child' cache for each string as it came out of the db sped page loads up from 4.5 seconds to .6-1.0 seconds per page, which is significant.
>
> Currently we are working on a 'per machine' cache so all children can benefit from each child's initial database read of the translated string; the differential between children is annoying in the 'per child cache' strategy.
>
> John-
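The preload-before-fork idea can be demonstrated outside Apache with a plain fork(). This is only a sketch of the mechanism: in a mod_perl 1.x prefork setup the children are separate processes rather than threads, the hash contents here are invented, and how much memory actually stays shared depends on the operating system's copy-on-write behaviour and on how Perl touches the data when reading it.

```perl
use strict;
use warnings;

# Loaded once in the parent, before any fork. In a mod_perl setup this
# would happen in startup.pl. The contents are illustrative only.
our %TRANSLATIONS = (hello => 'bonjour', goodbye => 'au revoir');

pipe(my $reader, my $writer) or die "pipe: $!";

my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the hash is already populated; no database read is needed.
    close $reader;
    print {$writer} $TRANSLATIONS{hello}, "\n";
    close $writer;
    exit 0;
}

# Parent: read what the child found in its inherited copy of the hash.
close $writer;
my $from_child = <$reader>;
chomp $from_child;
close $reader;
waitpid($pid, 0);

print "child saw: $from_child\n";   # prints "child saw: bonjour"
```

Anything a child writes to the hash after the fork is visible only to that child, so this suits read-mostly data such as translation strings.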
Re: Apache::Session - What goes in session?
I haven't had much luck with that but we will look at it again and see what we can get from it. We want to avoid preloading all data per child direct from the database but I wouldn't mind doing it on startup for the root process and then copying it to each child.

J

On Tue, 20 Aug 2002 16:39:45 -0500 [EMAIL PROTECTED] wrote:

> Just to jump in here - as I understand it you can split a hash across multiple threads if you preload it before Apache forks. So load it in your startup.pl and get it in memory prior to forking. It'll be part of the shared memory since you aren't writing to it.
>
> Or at least that's how I understand the theory to work anyway.
>
> Josh
Re: Apache::Session - What goes in session?
Not in the MS house that I am living in right now :^(

On Tue, 20 Aug 2002, Perrin Harkins wrote:

> Ian Struble wrote:
> > And just to throw one more wrench into the works. You could load up only the most popular data at startup and let the rest of the data get loaded on a cache miss. That is one technique that we have used for some customer session servers. It allowed each server to start up in well under a minute instead of in 15-30 minutes while pegging the DB. The 15-30 minutes was when we were dealing with ~5mil total entries and I would hate to see it now that the size of the table has doubled. Now we just need to do some batch processing to determine what subset gets loaded at startup.
>
> You could also just dump the whole thing into a Berkeley DB file every now and then.
>
> - Perrin
Apache::Session - What goes in session?
I'm using mod_perl and Apache::Session on an app that is similar to MyYahoo. I found a few bits of info from a previous thread, but I'm curious as to what type of information should go in the session and what should come from the database.

Currently I'm putting very little in the session, but what I am putting in the session is more global in nature...greeting, current page number, current page name...data that doesn't change very often. I'm pulling a lot of info from the database and I wonder if my design is sound. Most of the info being pulled from the database is features for the page.

Now I need to add global modules to the page which will show user info like which pages they have created and which features are being emailed to the user. These modules will display on every page unless the user turns them off. It seems that since this info wouldn't change very often that I should put the data in the session...

Anyone have any general tips on session design?

Thanks.
RE: Apache::Session - What goes in session?
Hello md --

> I'm using mod_perl and Apache::Session on an app that is similar to MyYahoo. I found a few bits of info from a previous thread, but I'm curious as to what type of information should go in the session and what should come from the database.

One thing to watch out for is the trap of using session data as a dumping ground for global variables. Since you are asking what belongs in a session, it seems you are already thinking along those lines.

I have found that many people who are fond of sessions often use them to store data which I would be personally inclined to store in hidden form data, in a simple cookie, or retrieve from a database when needed. In my systems I usually only store a single session ID in a cookie -- a key which references a database row. This allows me to have as much data as I like but keep it all in the database.

There is one case where it might make sense to put data into a session of some sort -- to cache information which is very time-consuming to retrieve. Minimizing time-consuming database operations is an important thing to think about in large systems, and a place where session data might come in handy.

Warmest regards,

-Jesse-
Re: Apache::Session - What goes in session?
md wrote:

> Currently I'm putting very little in the session

Good. You should put in as little as possible.

> what I am putting in the session is more global in nature...greeting, current page number, current page name...

That doesn't sound very global to me. What happens when users open multiple browser windows on your site? Doesn't it screw up the current page data?

> I'm pulling a lot of info from the database and I wonder if my design is sound.

Optimizing database fetches or caching data is independent of the session issue. Nothing that is relevant to more than one user should ever go in the session.

> Now I need to add global modules to the page which will show user info like which pages they have created and which features are being emailed to the user. These modules will display on every page unless the user turns them off.

That sounds like a user or subscriptions object to me, not session data.

> It seems that since this info wouldn't change very often that I should put the data in the session...

No, that's caching. Don't use the session for caching, use a cache for it. They're not the same. A session is often stored in a database so that it can be reliable. A cache is usually stored on the file system so it can be fast.

Things like the login status of this session, and the user ID that is associated with it, go in the session. Status of a particular page has to be passed in query args or hidden fields, to avoid problems with multiple browser windows. Data that applies to multiple users or lasts more than the current browsing session never goes in the session.

- Perrin
RE: Apache::Session - What goes in session?
Thanks though. That was succinctly put. Could you go back in time and tell me that a year or two ago? That would be great, thanks again.

-Josh :)

> Things like the login status of this session, and the user ID that is associated with it, go in the session. Status of a particular page has to be passed in query args or hidden fields, to avoid problems with multiple browser windows. Data that applies to multiple users or lasts more than the current browsing session never goes in the session.
Re: Apache::Session - What goes in session?
--- Perrin Harkins [EMAIL PROTECTED] wrote: md wrote: That doesn't sound very global to me. What happens when users open multiple browser windows on your site? Doesn't it screw up the current page data? I don't think global was the term I should have used. What I mean is data that will be seen on all or most pages by the same user...like Hello Jim, where Jim is pulled from the database when the session is created and passed around in the session after that (and updated in the db and session if user changes their greeting name). Current page name and id are never stored in db, so different browser windows can be on different pages...I'm not sure if that's good or bad. However, changes to the user name will be seen in both browser windows since that's updated both in the session and db. Optimizing database fetches or caching data is independent of the session issue. Nothing that is relevant to more than one user should ever go in the session. Correct. That little info I am putting in the session corresponds directly to a single user. That sounds like a user or subscriptions object to me, not session data. Once again, I shouldn't have used the term global. This is the subscriptions info for a single user...that's why I had thought to put this in the session instead of pulling from the db each page call since the data will rarely change. This info will be displayed on every page the user visits (unless they turn off this module). No, that's caching. Don't use the session for caching, use a cache for it. They're not the same. A session is often stored in a database so that it can be reliable. A cache is usually stored on the file system so it can be fast. The session is stored in a database (Apache::Session::MySQL), and I am using TT caching for the templates, but I'm not sure how to cache the non-session data. I've seen this discussed but I definitely need more info on this. 
As it stands I see two options: get data from the session or get it from the db...how do I bring caching into play? Things like the login status of this session, and the user ID that is associated with it go in the session. Status of a particular page has to be passed in query args or hidden fields, to avoid problems with multiple browser windows. Data that applies to multiple users or lasts more than the current browsing session never goes in the session. What about something like default page id, which is the page that is considered your home page? This id is stored permanently in the db (lasts more than the current browsing session) but I keep it in the session since this also rarely changes so I don't want to keep hitting the db to get it. Thanks again...
Re: Apache::Session - What goes in session?
md wrote: I don't think global was the term I should have used. What I mean is data that will be seen on all or most pages by the same user...like Hello Jim Okay, don't put that in the session. It belongs in a cache. The session is for transient state information, that you don't want to keep after the user logs out. Current page name and id are never stored in db, so different browser windows can be on different pages... I thought your session was all stored in MySQL. Why are you putting these in the session exactly? If these things are not relevant to more than one request (page), they don't belong in the session. They should just be in ordinary variables. That sounds like a user or subscriptions object to me, not session data. Once again, I shouldn't have used the term global. This is the subscriptions info for a single user...that's why I had thought to put this in the session instead of pulling from the db each page call since the data will rarely change. You should use a cache for that, rather than the session. This is long-term data that you just want quicker access to. I am using TT caching for the templates, but I'm not sure how to cache the non-session data. Template Toolkit caches the compiled template code, but it doesn't cache your data or the output of the templates. What you should do is grab a module like Cache::Cache or Cache::Mmap and take a look at the examples there. You use it in a way that's very similar to what you're doing with Apache::Session for the things you referred to as global. There are also good examples in the documentation for the Memoize module. There are various reasons to use a cache rather than treating the session like a cache. If you put a lot of data in the session, it will slow down every hit loading and saving that data. In a cache, you can just keep multiple cached items separately and only grab them if you need them for this page. 
With a cache you can store things that come from the database but are not user-specific, like today's weather. What about something like default page id, which is the page that is considered your home page? This id is stored permanently in the db (lasts more than the current browsing session) but I keep it in the session since this also rarely changes so I don't want to keep hitting the db to get it. I would have some kind of user object which has a property of default_page_id. The first time the user logs in I would fetch that from the database, and then I would cache it so that I wouldn't need to go back to the database for it on future requests. - Perrin
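A minimal sketch of the "fetch once, then cache" user object described above. It uses only core modules (Storable, File::Temp, File::Spec) so it runs as-is; a module such as Cache::Cache would normally take the place of the hand-rolled file handling. The names `fetch_user_from_db`, `get_user` and `invalidate_user`, the sample data, and the ten-minute lifetime are all assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use Storable qw(nstore retrieve);

# Hypothetical cache location; a real site would use a fixed directory.
my $CACHE_DIR = tempdir(CLEANUP => 1);
my $MAX_AGE   = 600;    # seconds before a cached entry counts as stale

our $DB_HITS = 0;       # counts simulated database fetches

# Stand-in for the real database query (hypothetical data).
sub fetch_user_from_db {
    my ($user_id) = @_;
    $DB_HITS++;
    return { user_id => $user_id, greeting => 'Jim', default_page_id => 7 };
}

# One cache file per user; only numeric IDs are accepted as file names.
sub cache_path {
    my ($user_id) = @_;
    die "bad user id\n" unless $user_id =~ /\A\d+\z/;
    return File::Spec->catfile($CACHE_DIR, "user_$user_id.sto");
}

# Return the user record from the cache if it is fresh, otherwise fetch
# it from the database and cache it for later requests.
sub get_user {
    my ($user_id) = @_;
    my $path = cache_path($user_id);
    if (-e $path && time() - (stat($path))[9] < $MAX_AGE) {
        my $user = eval { retrieve($path) };
        return $user if $user;
    }
    my $user = fetch_user_from_db($user_id);
    nstore($user, $path);
    return $user;
}

# Drop the cached copy, e.g. after the user changes their greeting.
sub invalidate_user {
    my ($user_id) = @_;
    unlink cache_path($user_id);
    return;
}

my $user = get_user(42);    # first call goes to the "database"
$user = get_user(42);       # second call is served from the cache file
print "default page: $user->{default_page_id}, db hits: $DB_HITS\n";
```

The session then only needs the user ID; everything else about the user is looked up through `get_user` when a page actually needs it.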
Re: Apache::Session - What goes in session?
--- Perrin Harkins [EMAIL PROTECTED] wrote: Current page name and id are never stored in db, so different browser windows can be on different pages... I thought your session was all stored in MySQL. Why are you putting these in the session exactly? If these things are not relevant to more than one request (page), they don't belong in the session. They should just be in ordinary variables. You are correct, these items are in the session in the db. I meant that they weren't kept in long term storage in the db after the session ended like the default page id and user name are. The current page id/name is only relevant for an active session. Once a session is started current page is set to whatever the default page id is and will change as the user changes pages. The only reason I did this (as I recall) is that way I can get the page name once. You should use a cache for that, rather than the session. This is long-term data that you just want quicker access to. Yes, that's exactly what I want to do. My main concern is long-term data that I want quicker access to. I can definitely get it all from the db, but that doesn't seem very efficient. Template Toolkit caches the compiled template code, but it doesn't cache your data or the output of the templates. What you should do is grab a module like Cache::Cache or Cache::Mmap and take a look at the examples there. You use it in a way that's very similar to what you're doing with Apache::Session for the things you referred to as global. There are also good examples in the documentation for the Memoize module. Great...exactly the kind of info I was looking for. I'll look at those. We are using a load-balanced system; I should have mentioned that earlier. Won't that be an issue with caching to disk? Is it possible to cache to the db? There are various reasons to use a cache rather than treating the session like a cache. If you put a lot of data in the session, it will slow down every hit loading and saving that data. 
In a cache, you can just keep multiple cached items separately and only grab them if you need them for this page. With a cache you can store things that come from the database but are not user-specific, like today's weather. Thank you for all the excellent advice and explanation (in this and other posts). Most of the info I'll be pulling is *very* user-specific...user name, which features to display on which page, what features the user gets by email, etc. What happens is the user logs in and then the username (greeting), the default page id (the user can create many pages with different features per page) and what features go on the default page are pulled from the database and the default page is displayed, as well as any module info. The modules will consist of a pages module with the names of all the pages the user has created (with links) and an emails module which will display all the features that the user is getting via email. These modules will be displayed on every page. You can see that almost everything is user-specific. Right now I'm storing the page names/ids in a hash ref in the session (the emails module isn't live yet), but I thought that I would change that and only store the module id and pull the names from the db (if the user hasn't turned off the module) with each page call. Thanks again for all the info!
Re: Apache::Session - What goes in session?
md wrote: We are using a load-balanced system; I should have mentioned that earlier. Won't that be an issue with caching to disk? Is it possible to cache to the db? There are a few ways to deal with this. The simplest is to use the sticky load-balancing feature that many load-balancers have. Failing that, you can store to a network file system like NFS or CIFS, or use a database. (There are also fancier options with things like Spread, but that's getting a little ahead of the game.) You can use MySQL for caching, and it will probably have similar performance to a networked file system. Unfortunately, the Apache::Session code isn't all that easy to use for this, since it assumes you want to generate IDs for the objects you store rather than passing them in. You could adapt the code from it to suit your needs though. The important thing is to leave out all of the mutually exclusive locking it implements, since a cache is all about "get the latest as quick as you can" and lost updates are not a problem ("last save wins" is good enough for a cache). The modules will consist of a pages module with the names of all the pages the user has created (with links) and an emails module which will display all the features that the user is getting via email. These modules will be displayed on every page. You can see that almost everything is user-specific. The relationships to the features and pages differ by user, but there might be general information about the features themselves that is stored in the database and is not user-specific. That could be cached separately, to save some trips to the db for each user. Right now I'm storing the page names/ids in a hash ref in the session (the emails module isn't live yet), but I thought that I would change that and only store the module id and pull the names from the db (if the user hasn't turned off the module) with each page call. 
You can cache the names too if you want to, but keeping them out of the session means that you won't be slowed down by fetching that extra data and de-serializing it with Storable unless the page you're on actually needs it. It's also good to separate things that have to be reliable (like the ID of the current user, since without that you have to send them back to log in again) from things that don't need to be (you could always fetch the list of pages from the db if your cache went down). - Perrin
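A minimal sketch of a lock-free, "last save wins" cache keyed by IDs the caller passes in, as outlined above. It is not adapted from the Apache::Session source; it is a from-scratch illustration using core modules only. It assumes the cache directory sits on a file system where `rename` within one directory replaces the target atomically (true for local POSIX file systems). The names `cache_set`, `cache_get` and `key_path` and the sample keys are hypothetical.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Spec;
use File::Temp qw(tempdir tempfile);
use Storable qw(nstore retrieve);

# Hypothetical location; on a load-balanced site this would be a shared
# directory that every web server can reach.
my $CACHE_DIR = tempdir(CLEANUP => 1);

# Map a caller-supplied key (plain ASCII here) to a safe file name.
sub key_path {
    my ($key) = @_;
    return File::Spec->catfile($CACHE_DIR, md5_hex($key) . '.sto');
}

# Store without locking: write a private temp file, then rename it over
# the target. Concurrent writers just replace each other's entry, and a
# reader never sees a half-written file. $data must be a reference.
sub cache_set {
    my ($key, $data) = @_;
    my ($fh, $tmp) = tempfile('tmpXXXXXX', DIR => $CACHE_DIR);
    close $fh;
    nstore($data, $tmp);
    rename $tmp, key_path($key) or die "rename failed: $!";
    return;
}

# Fetch an entry; a missing or unreadable file is simply a cache miss.
sub cache_get {
    my ($key) = @_;
    my $path = key_path($key);
    return undef unless -e $path;
    return eval { retrieve($path) };
}

cache_set('pages:42', [ { id => 7, name => 'Home' }, { id => 9, name => 'News' } ]);
cache_set('pages:42', [ { id => 7, name => 'Home' } ]);    # later save replaces it
my $pages = cache_get('pages:42');
print scalar(@$pages), " page(s) cached\n";
```

On a miss the caller falls back to the database and calls `cache_set` again, which is why losing an entry costs only one extra query.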