asmuts 02/02/18 16:44:21
Added: xdocs UsingJCSBasicWeb.xml
Log:
this is really rough. i'm not sure what to do
some section could become entire docs
didn't create link to other pages yet
Revision Changes Path
1.1 jakarta-turbine-stratum/xdocs/UsingJCSBasicWeb.xml
Index: UsingJCSBasicWeb.xml
===================================================================
<?xml version="1.0"?>
<document>
<properties>
<title>Using JCS: Some basics for the web</title>
<author email="[EMAIL PROTECTED]">Aaron Smuts</author>
</properties>
<body>
<section name="Using JCS: Some basics for the web">
<p>
The primary bottleneck in most dynamic web-based applications is the retrieval
of data from the database. While it is relatively inexpensive to add more front
end servers to scale the serving of pages and images and the processing of
content, it is an expensive and complex ordeal to scale the database. By taking
advantage of data caching, most web applications can reduce latency and scale
farther with fewer machines.
</p>
<p>
JCS is a front tier cache that can be configured to maintain consistency across
multiple servers by using a centralized remote server or by lateral distribution
of cache updates. Other caches, like the Javlin EJB data cache, are basically
in-memory databases that sit between your EJBs and your database. Rather than
trying to speed up your slow EJBs, you can avoid most of the network traffic and
the complexity by implementing JCS front tier caching. Centralize your EJB access
or your JDBC data access into local managers and perform the caching there.
</p>
<subsection name="What to cache?">
<p>
The data used by most web applications varies in how dynamic it is, from
completely static to changing at every request. Everything that has some degree
of stability can be cached. Prime candidates for caching range from the list
data for stable dropdowns, user information, and discrete, infrequently changing
information to stable search results that could be sorted in memory.
</p>
<p>
Since JCS is distributed and allows updates and invalidations to be broadcast to
multiple listeners, frequently changing items can be easily cached and kept in
sync through your data access layer. Data that must be 100% up to date, say an
account balance prior to a transfer, should be retrieved directly from the
database. If your application allows for the viewing and editing of data, the
data for the view pages could be cached, but the edit pages should, in most
cases, pull the data directly from the database.
</p>
</subsection>
<subsection name="How to cache discrete data">
<p>
Let's say that you have an e-commerce book store. Each book has a related set of
information that you must present to the user. Let's say that 70% of your hits
during a
particular day are for the same 1,000 popular items that you advertise on key pages
of your
site, but users are still actively browsing your catalog of over a million books.
You cannot
possibly cache your entire database, but you could dramatically decrease the load on
your
database by caching the 1,000 or so most popular items.
</p>
<p>
For the sake of simplicity let's ignore tie-ins and user-profile based suggestions
(also good
candidates for caching) and focus on the core of the book detail page.
</p>
<p>
A simple way to cache the core book information would be to create a value
object for book data that contains the necessary information to build the
display page. This value object could hold data from multiple related tables or
book subtype tables, but let's say that you have a simple table called BOOK that
looks something like this:
</p>
<source><![CDATA[
Table BOOK
    BOOK_ID_PK
    TITLE
    AUTHOR
    ISBN
    PRICE
    PUBLISH_DATE
]]></source>
<p>
We could create a value object for this table called BookVObj that has variables
with the same names as the table columns. It might look like this:
</p>
<source><![CDATA[
package com.genericbookstore.data;

import java.io.Serializable;
import java.util.Date;

// Serializable so that the disk and remote auxiliaries can store it
public class BookVObj implements Serializable {

    public int book_id_pk = 0;
    public String title;
    public String author;
    public String ISBN;
    public String price;
    public Date publish_date;

    public BookVObj() {
    }
}
]]></source>
<p>
Then we can create a manager called BookVObjManager to store and retrieve
BookVObj objects. All access to core book data should go through this class,
including inserts and updates, to keep the caching simple. Let's make
BookVObjManager a singleton that gets a JCS access object during initialization.
The start of the class might look like:
</p>
<source><![CDATA[
package com.genericbookstore.data;

import org.apache.stratum.jcs.JCS;
// in case we want to set some special behavior
import org.apache.stratum.jcs.engine.behavior.IElementAttributes;

public class BookVObjManager {

    private static BookVObjManager instance;
    private static int checkedOut = 0;
    private static JCS bookCache;
    // dataAccess

    private BookVObjManager() {
        try {
            bookCache = JCS.getInstance("bookCache");
        } catch ( Exception e ) {
            // handle the failure, e.g. log it; bookCache is still null here
        }
        // get dataAccess
    }

    /**
     * Singleton access point to the manager. The method is synchronized
     * so that concurrent first calls cannot create two instances.
     */
    public static synchronized BookVObjManager getInstance () {
        if (instance == null) {
            instance = new BookVObjManager();
        }
        checkedOut++;
        return instance;
    }
]]></source>
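<p>
The singleton accessor has to stay correct when several request threads call it
at the same time. As one possible alternative, here is a minimal sketch of the
initialization-on-demand holder idiom. CatalogManager is a hypothetical stand-in
for BookVObjManager, and nothing in it comes from JCS.
</p>
<source><![CDATA[
```java
// Sketch only: CatalogManager is a hypothetical stand-in for BookVObjManager.
final class CatalogManager {

    // Counts constructor calls so the demo can show that one instance is built.
    private static int constructed = 0;

    private CatalogManager() {
        constructed++;
        // A real manager would obtain its cache region and data access here.
    }

    // The JVM initializes Holder, and with it INSTANCE, the first time
    // getInstance() touches it. Class initialization is thread safe.
    private static final class Holder {
        static final CatalogManager INSTANCE = new CatalogManager();
    }

    static CatalogManager getInstance() {
        return Holder.INSTANCE;
    }

    static int constructedCount() {
        return constructed;
    }

    public static void main(String[] args) {
        CatalogManager a = CatalogManager.getInstance();
        CatalogManager b = CatalogManager.getInstance();
        System.out.println("same instance: " + (a == b));
        System.out.println("constructor calls: " + constructedCount());
    }
}
```
]]></source>
<p>
The holder idiom needs no explicit locking, but it cannot keep a per-call
counter such as checkedOut without adding synchronization of its own.
</p>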
<p>
To get a BookVObj we will need some access methods in the manager. We should be
able
to get a non-cached version if necessary, say before allowing an administrator to
edit the
book data. The methods might look like:
</p>
<source><![CDATA[
    /**
     * Retrieves a BookVObj. Default to look in the cache.
     */
    public BookVObj getBookVObj (int id) {
        return getBookVObj(id, true);
    }

    /**
     * Retrieves a BookVObj. Second argument decides whether to look in the cache.
     * Returns a new value object if one can't be loaded from the database.
     * Database cache synchronization is handled by removing cache elements
     * upon modification.
     */
    public BookVObj getBookVObj (int id, boolean fromCache) {
        BookVObj vObj = null;
        if (fromCache) {
            vObj = (BookVObj)bookCache.get("BookVObj" + id);
        }
        if (vObj == null) {
            // not cached, or the caller asked to bypass the cache
            vObj = loadvObj(id);
        }
        if (vObj == null) {
            vObj = new BookVObj();
        }
        return vObj;
    } // End getBookVObj()

    /**
     * Creates a BookVObj based on the id of the BOOK table.
     * Data access could be direct JDBC, some O/R mapping tool, or an EJB.
     */
    public BookVObj loadvObj( int id ) {
        BookVObj vObj = new BookVObj();
        vObj.book_id_pk = id;
        try {
            boolean found = false;
            // load the data and set the rest of the fields
            // set found to true if it was found
            found = true;
            // cache the value object
            if ( found ) {
                // could use the defaults like this
                //bookCache.put( "BookVObj" + id, vObj );
                // or specify special characteristics
                // get the default attributes and copy them
                IElementAttributes attr = bookCache.getElementAttributes().copy();
                attr.setIsEternal(false);
                // expire after two hours (60 * 120 = 7200 seconds)
                attr.setMaxLifeSeconds(60*120);
                bookCache.put( "BookVObj" + id, vObj, attr );
            }
        } catch (Exception e ) {
            // do something
        }
        return vObj;
    }
]]></source>
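<p>
To see the read path in isolation, here is a minimal, self-contained sketch that
runs without JCS. A ConcurrentHashMap stands in for the cache region, and
loadFromDatabase() is a hypothetical placeholder for real data access. Both are
assumptions made for illustration.
</p>
<source><![CDATA[
```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: a ConcurrentHashMap stands in for the JCS region, and
// loadFromDatabase() is a hypothetical placeholder for real data access.
final class ReadPathSketch {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private int databaseLoads = 0;

    // Looks in the cache first unless the caller asks for a fresh copy.
    String getTitle(int id, boolean fromCache) {
        String key = "BookVObj" + id;
        String title = fromCache ? cache.get(key) : null;
        if (title == null) {
            title = loadFromDatabase(id);
            cache.put(key, title); // later cached reads are served from memory
        }
        return title;
    }

    // Placeholder for JDBC, an O/R mapping tool, or an EJB call.
    private String loadFromDatabase(int id) {
        databaseLoads++;
        return "Title of book " + id;
    }

    int databaseLoads() {
        return databaseLoads;
    }

    public static void main(String[] args) {
        ReadPathSketch manager = new ReadPathSketch();
        manager.getTitle(42, true);  // miss: loads from the database
        manager.getTitle(42, true);  // hit: no database call
        manager.getTitle(42, false); // bypasses the cache, e.g. for an edit page
        System.out.println("database loads: " + manager.databaseLoads());
    }
}
```
]]></source>
<p>
Only the first read and the explicit bypass reach the database. Every other read
of the same id is answered from memory.
</p>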
<p>
We will also need a method to insert and update book data. To keep the caching in
one
place, this should be the primary way core book data is created. The method might
look
like:
</p>
<source><![CDATA[
    /**
     * Stores BookVObj's in database. Clears old items and caches new.
     * Returns the id of the stored book.
     */
    public int storeBookVObj( BookVObj vObj ) {
        try {
            // since any cached data is no longer valid, we should
            // remove the item from the cache if it is an update.
            if ( vObj.book_id_pk != 0 ) {
                bookCache.remove( "BookVObj" + vObj.book_id_pk );
            }
            // determine if it is an update or insert and store the item
            // if it is successful, cache the item
            // get the new id if it is an insert and put it in the vObj
            // get the default attributes and copy them
            IElementAttributes attr = bookCache.getElementAttributes().copy();
            attr.setIsEternal(false);
            // expire after two hours (60 * 120 = 7200 seconds)
            attr.setMaxLifeSeconds(60*120);
            bookCache.put( "BookVObj" + vObj.book_id_pk, vObj, attr );
        } catch (Exception e ) {
            // do something
        }
        return vObj.book_id_pk;
    }
]]></source>
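<p>
The write path can be sketched the same way, without JCS. In this hypothetical
example two HashMaps stand in for the BOOK table and the cache region, and an id
of 0 is assumed to mean "not stored yet", as in the value object above.
</p>
<source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: two HashMaps stand in for the BOOK table and the JCS region.
final class StorePathSketch {

    private final Map<Integer, String> database = new HashMap<>();
    private final Map<String, String> cache = new HashMap<>();
    private int nextId = 1;

    // Inserts when id is 0, updates otherwise, and returns the id of the row.
    int storeTitle(int id, String title) {
        if (id != 0) {
            cache.remove("BookVObj" + id); // the cached copy is about to be stale
        } else {
            id = nextId++; // a real insert would get the key from the database
        }
        database.put(id, title);
        cache.put("BookVObj" + id, title); // cache only after the store succeeded
        return id;
    }

    String cachedTitle(int id) {
        return cache.get("BookVObj" + id);
    }

    public static void main(String[] args) {
        StorePathSketch manager = new StorePathSketch();
        int id = manager.storeTitle(0, "First edition");
        manager.storeTitle(id, "Second edition");
        System.out.println(id + " -> " + manager.cachedTitle(id));
    }
}
```
]]></source>
<p>
Removing the old element before the write means a failed update leaves the cache
empty for that key rather than stale, so the next read falls through to the
database.
</p>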
<p>
We now have the basic infrastructure for caching the book data. I added
automatic expiration to the elements to be safe. The rest of the work is to
configure the cache region.
</p>
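<p>
As an illustration of what a maximum life per element means, here is a minimal
sketch of time-based expiration. It is not how JCS implements it. The class, its
methods, and the explicit clock argument are all assumptions chosen to keep the
example deterministic.
</p>
<source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: illustrates a maximum life per element, not the JCS internals.
// The current time is passed in so that the demo does not depend on the clock.
final class ExpiringCacheSketch {

    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    void put(String key, String value, long maxLifeSeconds, long nowMillis) {
        entries.put(key, new Entry(value, nowMillis + maxLifeSeconds * 1000L));
    }

    // Returns null for missing or expired elements, so the caller reloads.
    String get(String key, long nowMillis) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (nowMillis >= entry.expiresAtMillis) {
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    public static void main(String[] args) {
        ExpiringCacheSketch cache = new ExpiringCacheSketch();
        cache.put("BookVObj7", "cached book", 3600, 0L);
        System.out.println("just before expiry: " + cache.get("BookVObj7", 3_599_999L));
        System.out.println("at expiry: " + cache.get("BookVObj7", 3_600_000L));
    }
}
```
]]></source>
<p>
Expiration bounds how long a stale element can survive if an invalidation is
ever missed, which is why it is a useful safety net on top of explicit removal.
</p>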
</subsection>
<subsection name="Selecting the appropriate auxiliary caches">
<p>
The first step in creating a cache region is to determine the makeup of the
memory cache. For the book store example, I would create a region that could
store a bit over the minimum number I want to have in memory, so the core items
are always readily available. Since about 1,000 items are popular, I would set
the maximum number of objects in memory to 1200.
</p>
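<p>
To make the effect of a maximum object count concrete, here is a minimal sketch
of least-recently-used eviction. JCS ships its own LRU memory cache, so the
class below is only a stand-in built on java.util.LinkedHashMap, with a limit of
2 instead of 1200 to keep the demo short.
</p>
<source><![CDATA[
```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a stand-in for an LRU memory cache, built on LinkedHashMap
// in access order. It is not the JCS implementation.
final class LruSketch<K, V> extends LinkedHashMap<K, V> {

    private final int maxObjects;

    LruSketch(int maxObjects) {
        super(16, 0.75f, true); // true = order entries by most recent access
        this.maxObjects = maxObjects;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxObjects; // evict once the limit is exceeded
    }

    public static void main(String[] args) {
        LruSketch<String, String> memory = new LruSketch<>(2);
        memory.put("BookVObj1", "first");
        memory.put("BookVObj2", "second");
        memory.get("BookVObj1");          // touch 1, so 2 is now least recently used
        memory.put("BookVObj3", "third"); // exceeds the limit and evicts 2
        System.out.println(memory.keySet());
    }
}
```
]]></source>
<p>
With a limit slightly above the number of popular items, the books that are
requested most often tend to stay in memory while rarely viewed ones are pushed
out first.
</p>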
<p>
For most cache regions you will want to use a disk cache if the data takes more
than about 0.5 milliseconds to create. The indexed disk cache is the most
efficient disk caching auxiliary, and it is recommended for normal usage. See
the documentation.
</p>
<p>
The next step will be to select an appropriate distribution layer. If you have a
backend server running an application server or scripts, or are running multiple
webserver VMs on one machine, you might want to use the centralized remote
cache. See the documentation. The lateral cache would also work, but since the
lateral cache binds to a port, you would have to configure each VM's lateral
cache to listen on a different port on that machine.
</p>
<p>
If your environment is very flat, say a few load-balanced webservers and a
database machine, or one webserver with multiple VMs and a database machine,
then the lateral cache will probably make more sense. The TCP lateral cache is
recommended. See the documentation.
</p>
<p>
For the book store configuration I will set up a region for the bookCache that uses
the LRU
memory cache, the indexed disk auxiliary cache, and the remote cache. The
configuration
file might look like this:
</p>
<source><![CDATA[
##############################################################
################## DEFAULT CACHE REGION  #####################
# sets the default aux value for any non configured caches
jcs.default=DC,RFailover
jcs.default.cacheattributes=org.apache.stratum.jcs.engine.CompositeCacheAttributes
jcs.default.cacheattributes.MaxObjects=1000
jcs.default.cacheattributes.MemoryCacheName=org.apache.stratum.jcs.engine.memory.lru.LRUMemoryCache

# SYSTEM CACHE
# should be defined for the storage of group attribute list
jcs.system.groupIdCache=DC,RFailover
jcs.system.groupIdCache.cacheattributes=org.apache.stratum.jcs.engine.CompositeCacheAttributes
jcs.system.groupIdCache.cacheattributes.MaxObjects=10000
jcs.system.groupIdCache.cacheattributes.MemoryCacheName=org.apache.stratum.jcs.engine.memory.lru.LRUMemoryCache

##############################################################
################## CACHE REGIONS AVAILABLE ###################
# Regions preconfigured for caching
jcs.region.bookCache=DC,RFailover
jcs.region.bookCache.cacheattributes=org.apache.stratum.jcs.engine.CompositeCacheAttributes
jcs.region.bookCache.cacheattributes.MaxObjects=1200
jcs.region.bookCache.cacheattributes.MemoryCacheName=org.apache.stratum.jcs.engine.memory.lru.LRUMemoryCache

##############################################################
################## AUXILIARY CACHES AVAILABLE ################
# Primary Disk Cache -- faster than the rest because of memory key storage
jcs.auxiliary.DC=org.apache.stratum.jcs.auxiliary.disk.indexed.IndexedDiskCacheFactory
jcs.auxiliary.DC.attributes=org.apache.stratum.jcs.auxiliary.disk.indexed.IndexedDiskCacheAttributes
jcs.auxiliary.DC.attributes.DiskPath=/usr/opt/bookstore/raf

# Remote RMI Cache set up to failover
jcs.auxiliary.RFailover=org.apache.stratum.jcs.auxiliary.remote.RemoteCacheFactory
jcs.auxiliary.RFailover.attributes=org.apache.stratum.jcs.auxiliary.remote.RemoteCacheAttributes
jcs.auxiliary.RFailover.attributes.RemoteTypeName=LOCAL
jcs.auxiliary.RFailover.attributes.FailoverServers=scriptserver:1102
jcs.auxiliary.RFailover.attributes.GetOnly=false
]]></source>
<p>
I've set up the default cache settings in the above file to approximate the
bookCache settings. Other non-preconfigured cache regions will use the default
settings. You only have to configure the auxiliary caches once. For most caches
you will not need to pre-configure your regions unless the size of the elements
varies radically. We could easily put several hundred thousand BookVObj objects
in memory. The 1200 limit is very conservative and would be more appropriate for
a large data structure.
</p>
<p>
To get running with the book store example, I will also need to start up the remote
cache server
on the scriptserver machine. The remote cache documentation describes the
configuration.
</p>
<p>
I now have a basic caching system implemented for my book data. Performance should
improve immediately.
</p>
</subsection>
</section>
</body>
</document>