Thanks for your detailed answer. Regarding object compatibility: the value objects we plan to cache are developed in releases and change infrequently. Most development work in the company affects how data is processed and routed; the structure and fields of the data objects themselves tend to remain static. We are breaking the rules of object-oriented design by separating behaviour from data, I know, but at least some code remains static this way.
Some background info: right now we have about 10 client machines. It is not the number of client boxes that causes load for us, though; it's the volume of data they feed into our systems. We are a message routing company. Load is currently extremely high on our databases and routing applications, largely because we repeatedly perform expensive DB operations and employ little caching in general.

Where we do currently employ caching, it is in an ad-hoc, per-application/per-server manner, which means multiple applications can cache the same database data separately, get out of sync, and need to be restarted individually when the underlying database data changes. We are trying to move towards a more coordinated caching strategy. Most of the data we would like to cache is common data needed across all applications and servers: lookup data (changes infrequently and is expensive to load) and account data (just expensive to load).

We are using JBoss application server to manage DB transactions. Since the cache layer will need to be kept in sync with the DB as much as possible, I want changes to the cache to be tied to the success or failure of DB transactions. It makes sense to me for application logic on the JBoss machines to decide when data should be cached and when it should be expired. If a DB stored procedure or an application not linked to the cache alters DB data, cache data can be updated centrally by the central applications.

The architecture for this project will be close to what you mention: PT*10 -> ST*1 -> DB*4. We will likely expand the service tier (ST) layer by adding more machines as a cluster, but that's some time away; this project will serve as a proof of concept for that. We will need to install JBoss on every machine in the cluster, so why not have the application running in JBoss provide a data caching service? This seems like a clean solution to me because we won't need to set up standalone remote caches in addition.
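The transaction-tied cache updates described above could be sketched like this (a minimal simulation, not JCS- or JTA-specific; the class and method names are hypothetical — in a real JBoss deployment the `afterCompletion` hook would come from registering a `javax.transaction.Synchronization` with the transaction):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: buffer cache writes during a DB transaction and
// apply them only if the transaction commits, so the cache never holds
// data from a rolled-back transaction.
public class TxAwareCache {
    private final Map<String, Object> cache = new HashMap<>();
    private final List<Runnable> pendingOps = new ArrayList<>();

    // Called by application logic while the transaction is in flight.
    public void putOnCommit(final String key, final Object value) {
        pendingOps.add(() -> cache.put(key, value));
    }

    public void removeOnCommit(final String key) {
        pendingOps.add(() -> cache.remove(key));
    }

    // Transaction completion hook: apply buffered operations on commit,
    // discard them on rollback.
    public void afterCompletion(boolean committed) {
        if (committed) {
            for (Runnable op : pendingOps) {
                op.run();
            }
        }
        pendingOps.clear();
    }

    public Object get(String key) {
        return cache.get(key);
    }
}
```

The point of the buffering is that expiry/update of cache entries and the DB write succeed or fail together from the cache's point of view.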
I think performance should be slightly better also, because whenever a JBoss machine wants to add something to the cache it will not need to make an RMI connection to a remote cache; the remote cache is in-process. As you suggest, we could use lateral replication to expand ST cache capacity. This would fit neatly with JBoss clustering and distributed transactions.

I want to cache at the presentation tier (PT) to avoid calling the ST over and over, yes. There will be regions defined in all caches (ST & PT) which hold company-wide shared data. On PT clients, these regions will be read-only (allowPut=false?). There will be some other regions configured in the PT caches which will be used for data specific to each PT application. These regions will be read-write and act as standalone caches independent of the central server.

Is it possible to configure JCS to run in-process in an application server, as both a local cache to the app server and a remote cache to clients? Or are these functions mutually exclusive?

Many thanks,
Niall

On Thu, 2006-02-23 at 11:53 -0600, Smuts, Aaron wrote:
> How many clients do you have? How heavy is the load?
>
> I don't exactly recommend using JBoss or any other app server, but that
> is beside the point. I'll just call your JBoss layer the middle tier or
> your service tier (ST). Ideally, you could break this up into a bunch
> of independent services that would be responsible for some group of data.
> For now, let's say that you have a service layer tier that provides
> data retrieval and storage services to your presentation layer. If your
> presentation layer is just a web service layer for a client above it, it
> doesn't matter. I'll just call it the presentation tier (PT) for now.
>
> I suppose you have something like this:
>
> PT*10 --> ST*4 --> Database
>
> 10 presentation tier boxes that talk to 4 service tier boxes that sit on
> top of your database.
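The region setup I have in mind could look roughly like this in a client-side cache.ccf (a sketch only — region names and host are invented, and the attribute names are from the JCS 1.x remote auxiliary as best I understand them; `GetOnly` appears to be the closest real equivalent of the "allowPut=false" behaviour I asked about):

```
# Shared, read-only region backed by the central remote cache:
jcs.region.sharedLookup=DC
jcs.region.sharedLookup.cacheattributes=org.apache.jcs.engine.CompositeCacheAttributes
jcs.region.sharedLookup.cacheattributes.MaxObjects=10000

# Purely local, read-write region for per-application data (no auxiliary):
jcs.region.appLocal=
jcs.region.appLocal.cacheattributes=org.apache.jcs.engine.CompositeCacheAttributes
jcs.region.appLocal.cacheattributes.MaxObjects=1000

# Remote auxiliary pointing at the central JBoss machine. GetOnly=true
# keeps this client from pushing puts to the remote cache.
jcs.auxiliary.DC=org.apache.jcs.auxiliary.remote.RemoteCacheFactory
jcs.auxiliary.DC.attributes=org.apache.jcs.auxiliary.remote.RemoteCacheAttributes
jcs.auxiliary.DC.attributes.RemoteHost=central.example.com
jcs.auxiliary.DC.attributes.RemotePort=1101
jcs.auxiliary.DC.attributes.GetOnly=true
```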
> If your data changes a lot, then one thing you can do is just cache at
> the service tier level, or just cache the regions that change a lot in
> the service tier.
>
> Then you can just link the mid-tier boxes together. This will allow you
> to expand with cache replication for some time. If you expect to need
> hundreds of mid-tier boxes any time soon, then you will need to come up
> with some other strategy.
>
> There are two options here. First, you can just hook up the lateral
> cache between the ST boxes. This is very simple, since with UDP
> discovery you don't have to do much at all to get them talking. They
> will all replicate their data to each other. The second major strategy
> is to use a remote cache server. You can hook the 4 ST boxes up to a
> centralized remote cache and share data that way. You can configure
> them to issue removes on put, so that only the cache that created the
> data and the remote server have the data. Getting the remote cache
> running properly is a bit tricky, but I'm trying to improve the scripts
> and documentation right now.
>
> I assume that you want to cache at the presentation tier to avoid having
> to call the service tier over and over for the same data. I would avoid
> distributing cache data between tiers. It seems unclean to separate the
> application and then share the data. It would also make it more
> difficult to release one tier without changing the other, since you
> might make your objects incompatible. (A good reason to decouple tiers
> and link them with just XML. . . )
>
> If you were to avoid the tier coupling, you could use either of the two
> strategies used on the ST on the PT. That is, if you don't have too
> many PT boxes and you are not pushing thousands of new items into the
> cache a second, then you could link them with the lateral cache. If you
> have a relatively low put rate, then you can scale to more boxes. You
> could also put a remote cache server in place for your PT.
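The lateral-cache-with-UDP-discovery option quoted above might be configured with a cache.ccf fragment along these lines (a sketch; attribute names are from the JCS 1.x lateral TCP auxiliary as best I recall, and the port numbers and multicast address are placeholders):

```
# Lateral TCP auxiliary with UDP discovery, so ST boxes find each
# other automatically and replicate puts/removes to one another.
jcs.auxiliary.LTCP=org.apache.jcs.auxiliary.lateral.socket.tcp.LateralTCPCacheFactory
jcs.auxiliary.LTCP.attributes=org.apache.jcs.auxiliary.lateral.socket.tcp.TCPLateralCacheAttributes
jcs.auxiliary.LTCP.attributes.TcpListenerPort=1110
jcs.auxiliary.LTCP.attributes.UdpDiscoveryAddr=228.5.6.7
jcs.auxiliary.LTCP.attributes.UdpDiscoveryPort=6780
jcs.auxiliary.LTCP.attributes.UdpDiscoveryEnabled=true
```

With this in place on each ST box, no per-box peer list is needed; discovery handles membership as the cluster grows.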
> The flow with the two-tiered cache would be like this:
>
> PTn1 checks its local cache. If the item is not in memory or on disk, it could
> check the PT remote cache server. If that doesn't have the item, PTn1
> calls STn1. STn1 would go through the same procedure, except it would
> go to the database for the item. If it got the item, it would then
> update the ST remote cache, or just broadcast the item out to the other
> service tier members (STn1 --> STn2-4; this is done asynchronously).
> PTn1 gets the data back from STn1. PTn1 then puts the item in its
> cache. This will result in the item being sent to the PT remote cache
> server, or being sent to the other PT boxes directly if you use the
> lateral cache.
>
> If you want to share cached data between the tiers, then run a remote
> server in the middle. Your firewall configuration will determine where
> this needs to go. Make sure to define a serialVersionUID on all of
> your objects. . . .
>
> If you run a remote server in the middle, then make sure that the PT
> clients only get from the cache and do not put into it. You don't want PTn1
> putting into the remote cache data it got from STn1, since STn1 just put
> it in the remote cache.
>
> Cheers,
>
> Aaron Smuts
>
> > -----Original Message-----
> > From: Niall Gallagher [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, February 23, 2006 12:15 PM
> > To: [email protected]
> > Subject: JCS as both local and remote cache?
> >
> > Hi,
> >
> > Can anybody answer the following for me?
> >
> > I have a JBoss server which I want to use as a central point of access
> > to a database. Client machines will only be able to write to the
> > database by calling EJB methods on the JBoss machine. Client machines
> > will not hold direct connections to the database themselves.
> >
> > I want the JBoss server to use JCS as a caching layer to sit between all
> > clients and the database. JCS will run inside JBoss.
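The tiered lookup flow quoted above (local cache, then remote cache, then the tier below, back-filling caches on the way up) can be sketched as plain Java. This is a hypothetical simulation, not JCS code: the maps stand in for the cache regions, and in real JCS the remote/lateral propagation is asynchronous rather than a direct put.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of one tier in the two-tiered lookup flow:
// check local cache, then the tier's shared remote cache, then fall
// through to the next tier (ultimately the database), back-filling
// both caches with whatever was found.
public class TieredLookup {
    private final Map<String, String> localCache = new HashMap<>();
    private final Map<String, String> remoteCache;   // shared within the tier
    private final Function<String, String> nextTier; // e.g. STn1 call, or DB query

    public TieredLookup(Map<String, String> remoteCache,
                        Function<String, String> nextTier) {
        this.remoteCache = remoteCache;
        this.nextTier = nextTier;
    }

    public String get(String key) {
        String value = localCache.get(key);
        if (value != null) {
            return value;                 // local hit
        }
        value = remoteCache.get(key);
        if (value == null) {
            value = nextTier.apply(key);  // miss everywhere: go down a tier
            remoteCache.put(key, value);  // async broadcast in real JCS
        }
        localCache.put(key, value);       // back-fill the local cache
        return value;
    }
}
```

Chaining two instances (PT on top of ST, with the DB as the ST's `nextTier`) reproduces the PTn1 -> STn1 -> database path described in the quoted flow.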
> > Also, I want client
> > machines to run JCS locally as a local cache, configured to use the
> > JBoss central machine as a remote cache.
> >
> > Client machines will not write to the central cache directly. If they
> > call EJBs on the central JBoss machine to update the database, the
> > central EJBs will update the central cache with the new data at the
> > same time. If clients then try to access data from their local cache
> > and it is not cached locally, the client-side JCS cache will download
> > it from the central (remote) cache automatically. If clients find that
> > required data is not available locally or centrally, they will call
> > methods on the JBoss machine to have it loaded into the cache.
> >
> > If the JBoss server updates or removes data in the central cache, JCS
> > will automatically issue 'remove' commands to all client caches.
> >
> > So I think this approach will ensure all caches remain in sync almost
> > all of the time (ignoring the intricacies of asynchronous queueing!).
> >
> > My question is: how do I configure the central cache?
> >
> > If a line of code on the central server reads jcs.put("myKey",
> > "myValue"); will the central JCS cache automatically issue remove
> > commands to clients for "myKey"?
> >
> > The central server needs to be configured as a local cache for its own
> > use, but it should also know that it is a remote cache to clients, and
> > therefore needs to issue these remove commands.
> >
> > Any suggestions?
> >
> > By the way, I think JCS is an extremely well-written piece of
> > software. Hopefully the recent additions to the JCS website will help
> > it get some more recognition for this, which I think it deserves.

> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
