Re: Performance tips?

2006-01-26 Thread Chris Lamprecht
I seem to say this a lot :), but, assuming your OS has a decent filesystem cache, try reducing your JVM heapsize, using an FSDirectory instead of RAMDirectory, and see if your filesystem cache does ok. If you have 12GB, then you should have enough RAM to hold both the old and new indexes during th

RE: Performance tips?

2006-01-27 Thread Daniel Pfeifer
> I seem to say this a lot :), but, assuming your OS has a decent filesystem cache, try reducing your JVM heapsize, using an FSDirectory instead of RAMDirectory, and see if your filesystem cache does ok. If you have 12GB, then you should

Re: Performance tips?

2006-01-27 Thread Doug Cutting
Daniel Pfeifer wrote:
> We are sporting Solaris 10 on a Sun Fire-machine with four cores and 12GB of RAM and mirrored Ultra 320-disks. I guess I could try switching to FSDirectory and hope for the best.
Or, since you're on a 64-bit platform, try MMapDirectory, which supports greater parallelism
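For anyone wondering what MMapDirectory builds on: it reads index files through memory-mapped buffers (java.nio) instead of explicit read calls, so the data is served from the OS filesystem cache. Below is a minimal stdlib-only sketch of that mechanism. It is not Lucene code, and the class MmapSketch and its methods are made up purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Lucene.
final class MmapSketch {

    /** Maps the whole file read-only and decodes its bytes as UTF-8. */
    static String readMapped(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping itself lives outside the JVM heap; the OS pages
            // file contents in on demand from its filesystem cache.
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // Copy to a heap array only so the demo can build a String.
            byte[] bytes = new byte[buffer.remaining()];
            buffer.get(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes text to a temporary file and reads it back through a mapping. */
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("mmap-sketch", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            return readMapped(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello index"));
    }
}
```

Because the mapped region is addressed directly, several threads can read it without sharing one file position, which is presumably part of the parallelism point above.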

RE: Performance tips on searching

2009-03-20 Thread Uwe Schindler
Why not use a MultiSearcher on all single searchers? Or a Searcher on a MultiReader consisting of all IndexReaders? With that you do not need to merge the results. By the way: instead of creating a TopDocCollector, you could also directly call Searcher.search(Query query, Filter filter, int n, S

Re: Performance tips on searching

2009-03-20 Thread Amin Mohammed-Coleman
Hi, how do you expose pagination without a customized hit collector? The MultiSearcher does not expose a method for hit collector and sort. Maybe this is not an issue for people ... Cheers, Amin
On 20 Mar 2009, at 17:25, "Uwe Schindler" wrote: Why not use a MultiSearcher on all single
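On the pagination question, one common approach (not necessarily what this thread settled on) is to ask the searcher for the top (page + 1) * pageSize hits and then slice out the requested page, so no custom collector is needed. A rough stdlib-only sketch follows; PagerSketch, page and hitsNeeded are hypothetical names, and the plain list stands in for whatever ranked hits the search call returns.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; the list stands in for ranked search hits.
final class PagerSketch {

    /** Number of top hits to request so that the given 0-based page is covered. */
    static int hitsNeeded(int page, int pageSize) {
        return Math.multiplyExact(page + 1, pageSize);
    }

    /** Returns the given 0-based page from an already ranked list of hits. */
    static <T> List<T> page(List<T> rankedHits, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        long from = (long) page * pageSize;
        if (from >= rankedHits.size()) {
            return Collections.emptyList(); // page lies past the last hit
        }
        int to = (int) Math.min(from + pageSize, rankedHits.size());
        return new ArrayList<>(rankedHits.subList((int) from, to));
    }

    public static void main(String[] args) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            hits.add(i);
        }
        System.out.println(page(hits, 2, 10));
    }
}
```

The cost grows with the page number, since deeper pages require collecting more top hits, but for the first few pages that is usually acceptable.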

RE: Performance tips on searching

2009-03-20 Thread Uwe Schindler
> Hi
>
> How do you expose a pagination without a customized hit collector. The
> multi searcher does not expose a method for hit collector and sort.
> Maybe this is not an issue for people ...
>
> Chee

Re: Performance tips on searching

2009-03-20 Thread Amin Mohammed-Coleman
s worked. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Amin Mohammed-Coleman [mailto:ami...@gmail.com] Sent: Friday, March 20, 2009 6:43 PM To: java-user@lucene.apache.org Cc: ; Subject: Re: Performance t

RE: Performance tips on searching

2009-03-20 Thread Uwe Schindler
lee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Amin Mohammed-Coleman [mailto:ami...@gmail.com] > Sent: Friday, March 20, 2009 6:58 PM > To: java-user@lucene.apache.org > Cc: > Subject: Re: Performance tips on searching

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Glen Newton
You might want to consider using LuSql, which is a high performance, multithreaded, well documented tool designed specifically for moving data from a JDBC database into Lucene (you didn't say if it was a JDBC-accessible db...) http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql Di

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Ian Lea
See also http://wiki.apache.org/lucene-java/ImproveIndexingSpeed. That includes some info on merge and buffer factors, and recommends multiple threads. When I've done this sort of thing in the past it has tended to be the database that is the problem, but maybe your database is faster than mine.

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Erick Erickson
Besides the other suggestions, I'd really, really, really put some instrumentation in the code and see where you're spending your time. For a fast hint, put a cumulative timer around your indexing part only. This will indicate whether the time is consumed in querying your database or indexing..
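A cumulative timer of the kind suggested here could look roughly like the sketch below. It uses only the JDK; CumulativeTimer and the phase names "fetch" and "index" are made up for illustration, and the lambdas are stand-ins for the real database read and the real indexing call.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper: sums the wall-clock time spent in named phases.
final class CumulativeTimer {
    private final Map<String, Long> totals = new LinkedHashMap<>();

    /** Runs the work and adds its elapsed nanoseconds to the phase total. */
    <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            totals.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total nanoseconds recorded for the phase, or 0 if it never ran. */
    long totalNanos(String phase) {
        return totals.getOrDefault(phase, 0L);
    }

    public static void main(String[] args) {
        CumulativeTimer timer = new CumulativeTimer();
        for (int i = 0; i < 1000; i++) {
            final int id = i;
            // Stand-in for reading one row from the database.
            String row = timer.time("fetch", () -> "row-" + id);
            // Stand-in for adding one document to the index.
            timer.time("index", () -> row.length());
        }
        System.out.printf("fetch: %d ms, index: %d ms%n",
                timer.totalNanos("fetch") / 1_000_000,
                timer.totalNanos("index") / 1_000_000);
    }
}
```

Comparing the two totals after a full run shows at a glance whether the database side or the indexing side dominates. This helper is not thread-safe, so it only fits a single-threaded indexing loop as written.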

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Paul Taylor
Glen Newton wrote: You might want to consider using LuSql, which is a high performance, multithreaded, well documented tool designed specifically for moving data from a JDBC database into Lucene (you didn't say if it was a JDBC-accessible db...) http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswik

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Marcelo Ochoa
Hi Paul: Most of the time spent indexing big tables goes to the full table scan and network data transfer. Please take a quick look at my OOW08 presentation about Oracle Lucene integration: http://docs.google.com/present/view?id=ddgw7sjp_156gf9hczxv especially slides 13 and 14 wh

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Thomas Becker
Profile your application first and find out where the bottlenecks really are during indexing. For me it was clearly the database calls which took most of the time, due to a very complex SQL query. I applied the producer-consumer pattern and put a blocking queue in between. I have a threadpo
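The producer-consumer setup with a blocking queue might be sketched as below. The post is cut off before it says how the thread pool is arranged, so the layout here (one producer feeding a pool of consumers) is an assumption, as are the name QueueIndexerSketch and the counter that stands in for the real indexing call.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one producer, a pool of consumers, a bounded queue.
final class QueueIndexerSketch {

    /** Pushes every row through the queue and returns how many were "indexed". */
    static int run(List<String> rows, int consumerCount, int queueCapacity) {
        // An empty Optional marks the end of input for one consumer.
        BlockingQueue<Optional<String>> queue = new ArrayBlockingQueue<>(queueCapacity);
        AtomicInteger indexed = new AtomicInteger();
        ExecutorService consumers = Executors.newFixedThreadPool(consumerCount);

        for (int i = 0; i < consumerCount; i++) {
            consumers.execute(() -> {
                try {
                    while (true) {
                        Optional<String> item = queue.take();
                        if (!item.isPresent()) {
                            return; // end marker: this consumer is done
                        }
                        indexed.incrementAndGet(); // stand-in for indexing the row
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            // Producer side: put() blocks when the queue is full, so a slow
            // consumer throttles the database reads instead of filling memory.
            for (String row : rows) {
                queue.put(Optional.of(row));
            }
            for (int i = 0; i < consumerCount; i++) {
                queue.put(Optional.empty());
            }
            consumers.shutdown();
            if (!consumers.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("consumers did not finish in time");
            }
        } catch (InterruptedException e) {
            consumers.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        }
        return indexed.get();
    }
}
```

A call such as QueueIndexerSketch.run(rows, 4, 16) would push all rows through four consumer threads. If the database is the bottleneck, the mirror-image layout (several producer threads, one indexing thread) may fit better.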

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Glen Newton
This is basically what LuSql does. The speedups ("8h to 30 min") are similar, usually about an order of magnitude. Oh, the comments suggesting most of the interaction is with the database? The answer is: it depends. With large Lucene documents: Lucene is the limiting factor (worsen

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Chris Lu
All previous suggestions are very good. It's usually just the database. Lucene itself is fast enough. Years ago, when I used a Pentium III, indexing speed mattered. But after upgrading the CPU to a Xeon etc., the indexing bottleneck is on the database side. Basically use the simplest SQL as

Re: Performance tips when creating a large index from database.

2009-10-27 Thread Toke Eskildsen
On Thu, 2009-10-22 at 15:14 +0200, Erick Erickson wrote:
> Besides the other suggestions, I'd really, really, really put
> some instrumentation in the code and see where you're spending your time. For
> a fast hint, put
> a cumulative timer around your indexing part only. This will indicate
> whethe

RE: [SPAM] - Re: Performance tips? - Sending mail server found on bl.spamcop.net

2006-01-27 Thread Daniel Pfeifer
Daniel Pfeifer wrote:
> We are sporting Solaris 10 on a Sun Fire-machine with four cores and
> 12GB of RAM and mirrored Ultra 320-disks. I guess I could try switching
> to FSDirectory and hope for the best.
Or, since you're o

Re: [SPAM] - Re: Performance tips? - Sending mail server found on bl.spamcop.net

2006-01-27 Thread Doug Cutting
Daniel Pfeifer wrote:
> Are we both talking about Lucene? I am using Lucene 1.4.3 and can't find a class called MapDirectory or MMapDirectory.
It is post-1.4. You can download a nightly build of the current trunk at: http://cvs.apache.org/dist/lucene/java/nightly/
Doug