Suggestions:

[a]

Try invoking the VM with an option like "-XX:CompileThreshold=100", or an even smaller number. This encourages the HotSpot VM to compile methods sooner, so the app takes less time to "warm up".

http://java.sun.com/docs/hotspot/VMOptions.html#additional

You might want to search the web for references to this, especially how things like Eclipse are launched, as I think their invocation scripts set other obscure options to guide GC too.
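To confirm that a flag like this actually reached the VM, something along these lines may help. It is only a sketch: the class name JvmFlagCheck and the helper findOption are made up for illustration, and it relies on java.lang.management, which needs Java 5 or later (so it would not run on a 1.4.2 VM).

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class, not part of any library.
public class JvmFlagCheck {

    /** Returns the first argument that starts with the given prefix, or null if none does. */
    public static String findOption(List<String> jvmArgs, String prefix) {
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                return arg;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // These are the arguments passed to the VM itself, not the ones passed to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String option = findOption(jvmArgs, "-XX:CompileThreshold=");
        if (option != null) {
            System.out.println("Found " + option);
        } else {
            System.out.println("CompileThreshold not set on the command line; the VM default applies");
        }
    }
}
```

Run it as "java -XX:CompileThreshold=100 JvmFlagCheck" and it should report the option back.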

[b]

Any time I've worked with a hard-core Java server I've found it helpful to have a loop that explicitly tries to force GC. This is the idiom I use (i.e. you may have to do more than just call System.gc()), and my suggestion is to try calling it every 15-60 seconds so that memory use never jumps. I know that in theory you should never need to, but it may help.

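        // mem() and sleep() are helper methods that are not shown here.
        public static long gc()
        {
                long bef = mem();
                System.gc();
                sleep(100);               // give the collector some time to run
                System.runFinalization();
                sleep(100);
                System.gc();              // second pass, after finalizers have run
                long aft = mem();
                return aft - bef;         // change in mem() across the forced collection
        }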
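For completeness, here is one way the idiom above could be made self-contained and run on a timer. Treat it as a sketch under assumptions: mem() and sleep() are my guesses at the helpers (used heap from Runtime, and a plain Thread.sleep wrapper), the class name PeriodicGc and the start() method are made up, and the lambda and ScheduledExecutorService need Java 8 or later. Note that System.runFinalization() is deprecated on recent JDKs, so expect a compiler warning there.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper class for the gc() idiom.
public class PeriodicGc {

    /** Assumed helper: heap bytes currently in use, as reported by Runtime. */
    public static long mem() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Assumed helper: sleep for the given time, restoring the interrupt flag if interrupted. */
    public static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Requests a collection, runs finalizers, collects again; returns the change in used heap. */
    public static long gc() {
        long bef = mem();
        System.gc();
        sleep(100);
        System.runFinalization();
        sleep(100);
        System.gc();
        long aft = mem();
        return aft - bef;
    }

    /** Starts a daemon thread that calls gc() every periodSeconds seconds. */
    public static ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-gc");
            t.setDaemon(true); // do not keep the VM alive just for this thread
            return t;
        });
        timer.scheduleWithFixedDelay(PeriodicGc::gc, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```

Calling PeriodicGc.start(30) once at server startup would fall in the 15-60 second range suggested above. System.gc() is only a request to the VM, so how much this helps will depend on the collector and its settings.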


Gordon Riggs wrote:

Hi,
I am working on a web development project using PHP and MySQL. The team has
implemented full text search with MySQL, but is now researching Lucene to
help with performance/scalability issues. The team is looking for a
developer who has experience working with Lucene and can assist with
integrating it into our environment. What follows is a brief overview of the
problems that we're working to address. If you have experience
using Lucene with large amounts of data (we have roughly 16 million records)
where search time is critical (needs to be under 0.2 seconds), then please
respond.
Thanks,
Gordon Riggs
[EMAIL PROTECTED]
1. Loading index into memory using Lucene's RAMDirectory
Why is the Java heap 2.9GB for a 1.4GB index?
Why can we not load an index over 1.4GB in size? We receive
'java.lang.OutOfMemoryError' even with the -mx flag set to as high as '10g'.
We're using a dedicated test machine which has dual AMD Opteron processors
and 12GB of memory. The OS is SuSE Linux Enterprise Server 9 (x86_64). The
java version is: Java(TM) 2 Runtime Environment, Standard Edition (build
Blackdown-1.4.2) Java HotSpot(TM) 64-Bit Server VM (build
Blackdown-1.4.2-fcs, mixed mode)
We also get similar results with: Java(TM) 2 Runtime Environment, Standard
Edition (build 1.4.2_03-b02) Java HotSpot(TM) Client VM (build 1.4.2_03-b02,
mixed mode)


2. How to keep Lucene and Java in memory, to improve performance
The idea is to have a Lucene "daemon" that loads the index into memory once
on startup. It then listens for connections and performs search requests for
clients using that single index instance.
Do you foresee any problems (other than the ones stated above) with this
approach?
Garbage collection and/or memory leaks? Performance issues? Concurrency issues with multiple searches coming in at once?
What's involved in writing the daemon?
Assuming that we need the daemon, we need to find out how big a job it is to
develop, what requirements need to be specified, etc.


3. How to interface our PHP web application with Java
Our web application is written in PHP so we need a communication interface
for performing search queries that is both PHP and Java friendly.
What do you think would be a good solution?  XML-RPC?
What's involved in developing the solution?

4. How to tune Lucene
Are there ways to "tune" Lucene in order to improve performance? We already
plan on moving the index into memory.
What else can be done to improve the search times? Can the way the index is
built affect performance?


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]





