Re: Speed of tags.

Steven J. Owens Wed, 10 Nov 2004 21:21:57 -0800

Andoni,

On Wed, Nov 10, 2004 at 02:19:16PM +0000, Andoni wrote:
> There are two alternatives with the Mainframe. One is to do a call to a
> batch file which searches through a directory tree for the files I want and
> then opens them and returns the data from them. This is the one currently in
> use live.
> 
> The second alternative is the one I am trying to implement and is just going
> to the appropriate directory using Java code and replacing the
> Runtime.exec() call which was calling the batch file. I thought that this
> would significantly improve performance but my initial tests don't agree
> (though there is an improvement). Now I am not sure whether to spend more
> time programming this solution for the few seconds it gains or to put it on
> hold.



1) cache your directory lookup (in memory as a collection of some sort,
or possibly as a lucene searchable index)

2) if your application allows, consider setting it up as an
asynchronous lookup;  the browser submit fires off the request, your
page starts up the lookup asynchronously and immediately returns with
a placeholder page with a reload header.

3) watch out for tying up all your apache processes.


For #1 and #2, we'd have to hear a bit more about what you're doing,
and why, to tell you if this is advisable.  Here are some general
thoughts:

Cached Lookup

If you're actually searching through a directory tree to find a
specific file, then it might be worthwhile to build an in-memory
cache using simple objects and collections, and store it in the
ServletContext.

Or, if you really need complex and/or voluminous searching, consider
using the Jakarta Lucene search engine API.  It's pretty easy to build
a search index with it, and searches are blazingly fast.
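
To make the in-memory option a bit more concrete, here's a minimal
sketch, assuming you look files up by name and that the tree doesn't
change while the index is in use.  The class name DirectoryIndex and
its methods are made up for illustration; only java.io.File and the
collections are real.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical index: file name -> location, built by one walk of the tree. */
class DirectoryIndex {
    private final Map<String, File> byName = new HashMap<>();

    DirectoryIndex(File root) {
        walk(root);
    }

    private void walk(File dir) {
        File[] entries = dir.listFiles();  // null if dir is not a readable directory
        if (entries == null) {
            return;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                walk(entry);
            } else {
                byName.put(entry.getName(), entry);  // on duplicate names, the last one seen wins
            }
        }
    }

    /** Returns the cached location, or null if no file of that name was seen. */
    File find(String name) {
        return byName.get(name);
    }

    int size() {
        return byName.size();
    }
}
```

In a servlet you'd build it once at startup and park it with
getServletContext().setAttribute("dirIndex", index), where "dirIndex"
is just a name I picked.  If files come and go, you'd have to rebuild
or refresh it somehow, which this sketch doesn't attempt.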

It may not really be necessary to do a full-blown cache, though,
depending on what you need.  One of the better programmers I know told
me, many years ago, of a learning experience he had with implementing
an LRU (least recently used) cache.  He started by doing a quick hack
to cache the results of the last request - and saw a massive
improvement in performance.  Then he added a full-blown LRU, and saw
only a trivial further improvement.  So it may be that you just need
to do some very simple stuff.  
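
The "quick hack" version might look something like this.  It's my
own guess at the idea, not his code: LastResultCache is a made-up
name, and the slow lookup is passed in as a function so the sketch
stays independent of whatever your mainframe call looks like.

```java
import java.util.Objects;
import java.util.function.Function;

/** Remembers only the most recent key/result pair. */
class LastResultCache<K, V> {
    private final Function<K, V> slowLookup;
    private K lastKey;
    private V lastValue;
    private boolean hasEntry = false;

    LastResultCache(Function<K, V> slowLookup) {
        this.slowLookup = slowLookup;
    }

    /** Returns the remembered value on a repeat of the last key, else does the slow lookup. */
    synchronized V get(K key) {
        if (hasEntry && Objects.equals(lastKey, key)) {
            return lastValue;
        }
        V value = slowLookup.apply(key);
        lastKey = key;
        lastValue = value;
        hasEntry = true;
        return value;
    }
}
```

Note that get() holds the lock during the slow lookup, so concurrent
callers queue up behind it.  That's fine for a first experiment to
see whether caching helps at all, and probably not what you'd ship.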

Asynchronous

For #2, I'm open to comments from others on the best way to do an
asynchronous lookup.  The approaches all pretty much boil down to
sending back a placeholder page with a reload-in-8-seconds header,
then continuing the mainframe request "behind the scenes" and caching
the results in the user's session.  When the browser reloads 8
seconds later, the results are there, waiting.
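
As a rough illustration of the placeholder page, here's a sketch that
uses the HTML meta-refresh tag, which is the in-page equivalent of a
reload header.  PlaceholderPage and render() are names I made up, and
the sketch assumes reloadUrl is a URL your own application generated,
since it isn't escaped.

```java
/** Hypothetical helper that builds the "please wait" page. */
class PlaceholderPage {
    /** The meta tag asks the browser to load reloadUrl after delaySeconds. */
    static String render(String reloadUrl, int delaySeconds) {
        return "<html><head>"
             + "<meta http-equiv=\"refresh\" content=\"" + delaySeconds + ";url=" + reloadUrl + "\">"
             + "<title>Working...</title></head>"
             + "<body><p>Looking up your data.  This page reloads in "
             + delaySeconds + " seconds.</p></body></html>";
    }
}
```

In the servlet you'd write the string to response.getWriter().
Calling response.setHeader("Refresh", "8") does the same job as a
real header in the browsers I know of, though Refresh isn't part of
the HTTP standard.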

There are various opinions about starting your own threads in a
servlet engine.  I'm not entirely certain what's the *best* way, or if
there are any hidden traps (aside from the usual trickiness of doing
multithreaded programming), but here are the ways you could do it.

There's nothing that says the doGet() has to end when you send the
page back to the user.  So you could just have the first chunk of the
code send the placeholder page back, then flush and close the
connection.  Then continue, launch the mainframe request from inside
the doGet() body, and write the result to the user's session.

You could fire off a separate thread to do the mainframe request.
I've seen a few comments about this being not-strictly-proper in the
servlet API, but not uncommon, either.  You'd make the mainframe
request an independent, runnable object.  For more info on this, see
the various chapters, books and tutorials on multithreaded programming
in java.
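
A bare-bones sketch of the runnable-object approach follows.
Everything named here is hypothetical: MainframeLookup is my name for
the object, fetchFromMainframe() is a stand-in for your real call, and
a shared map stands in for the user's session so the example runs
without a servlet container.

```java
import java.util.Map;

/** Hypothetical runnable that performs one lookup and stores the result. */
class MainframeLookup implements Runnable {
    private final String key;
    private final Map<String, String> results;  // stand-in for the user's session

    MainframeLookup(String key, Map<String, String> results) {
        this.key = key;
        this.results = results;
    }

    public void run() {
        String data = fetchFromMainframe(key);
        results.put(key, data);  // the reloaded page looks for the result here
    }

    /** Placeholder for the real mainframe call, which is not shown. */
    private String fetchFromMainframe(String key) {
        return "data for " + key;
    }
}
```

You'd kick it off with new Thread(new MainframeLookup(key,
results)).start() and return the placeholder page right away.  The
map needs to be safe for concurrent use (a ConcurrentHashMap, say),
since the lookup thread writes it while request threads read it.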
 
I have heard (and am planning shortly to try it myself) that you could
use Doug Lea's excellent concurrent utils library, or the version of
the same that has been included in Java as of 1.5, to set up a
thread-safe queue and worker pool that handles the mainframe lookups.
Your servlet would instantiate a mainframe request lookup object and
place it on the queue.  When a worker pool thread became available to
handle it, it'd pull it off the queue and process it.  This is mostly
useful if you're worried about too many requests coming in, spawning
too many threads, and bogging down your system.
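
Here's a sketch of the queue-and-worker-pool idea using the
java.util.concurrent classes from Java 1.5 onward.  Take it as one
possible shape rather than a tested design: LookupPool, its methods,
and fetchFromMainframe() are all made up, and the pool and queue
sizes are whatever you pass in.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical fixed-size worker pool fed by a bounded queue. */
class LookupPool {
    private final ThreadPoolExecutor pool;
    private final Map<String, String> results = new ConcurrentHashMap<>();

    LookupPool(int workers, int queueCapacity) {
        // Core size == max size, so the thread count never exceeds 'workers'.
        pool = new ThreadPoolExecutor(workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity));
    }

    /** Queues a lookup; throws RejectedExecutionException if the queue is full. */
    void submit(final String key) {
        pool.execute(new Runnable() {
            public void run() {
                results.put(key, fetchFromMainframe(key));
            }
        });
    }

    /** Returns the result, or null if the lookup has not finished yet. */
    String resultFor(String key) {
        return results.get(key);
    }

    /** Stops accepting work and waits for queued lookups; false on timeout or interrupt. */
    boolean shutdownAndWait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Placeholder for the real mainframe call, which is not shown. */
    private static String fetchFromMainframe(String key) {
        return "data for " + key;
    }
}
```

The bounded queue is the point of the exercise: when it fills up,
submit() fails fast instead of piling up more threads, and the servlet
can catch that and tell the user to try again later.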

(I'm actually planning to get dirty with concurrent utils (we're not
up to java 1.5 yet) queue and worker pool sometime real soon, any
comments or advice are welcome, especially with setting up a queue
backed by a database table).


Tying Up Your Apache Processes

This is a gotcha to watch out for with servlets that take a while to
complete the request.  We're used to ignoring the connection overhead,
but the Apache server typically only has a fairly small pool of
processes, measured in tens, while Tomcat can easily have hundreds of
threads going.

For the most part this is only a problem if one of the numbers
(requests per second, delay per request, max processes in pool) gets
really out of proportion, but be aware of it.  (I've asked a
mathematically-inclined friend to cobble up a rule of thumb for
sanity-checking this, I'll post it if he comes up with something).
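
In the meantime, one standard starting point (not my friend's rule,
just textbook queueing theory) is Little's law: the average number of
busy processes equals requests per second times seconds per request.
A sketch, with made-up names:

```java
/** Hypothetical sanity check based on Little's law. */
class PoolSizing {
    /** Average busy processes = arrival rate x average time each request holds a process. */
    static int busyProcesses(double requestsPerSecond, double secondsPerRequest) {
        return (int) Math.ceil(requestsPerSecond * secondsPerRequest);
    }

    /** True if the average load fits within the process pool. */
    static boolean fits(double requestsPerSecond, double secondsPerRequest, int maxProcesses) {
        return busyProcesses(requestsPerSecond, secondsPerRequest) <= maxProcesses;
    }
}
```

So 5 requests per second that each hold a process for 8 seconds keep
about 40 processes busy on average.  That's an average, so you'd
want headroom above it for bursts.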


-- 
Steven J. Owens
[EMAIL PROTECTED]

"I'm going to make broad, sweeping generalizations and strong,
 declarative statements, because otherwise I'll be here all night and
 this document will be four times longer and much less fun to read.
 Take it all with a grain of salt." - http://darksleep.com/notablog

