Re: lucene (search) performance tuning

2012-05-28 Thread Lance Norskog
And no, RAMDirectory does not help. On Mon, May 28, 2012 at 5:54 PM, Lance Norskog wrote: > Can you use filter queries? Filters short-circuit a lot of search > processing. "City:San Francisco" is a classic filter - it is a small > part of the documents and it is reused a lot. > > On Sat, May 26,

Re: lucene (search) performance tuning

2012-05-28 Thread Lance Norskog
Can you use filter queries? Filters short-circuit a lot of search processing. "City:San Francisco" is a classic filter - it is a small part of the documents and it is reused a lot. On Sat, May 26, 2012 at 7:32 AM, Yang wrote: > I'm using disjunction (OR) query. unfortunately all of the clauses ar
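
The filter idea above can be illustrated without Lucene at all: compute the set of matching document ids for a reusable clause once, cache it as a bitset, and intersect it with each query's candidates so that documents outside the filter are never scored. This is only a minimal sketch in plain Java. `FilterCacheSketch`, its methods, and the toy city array are hypothetical names made up for illustration and are not Lucene API; in Lucene 3.x the corresponding building blocks would be classes such as QueryWrapperFilter and CachingWrapperFilter.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a cached filter; not Lucene code.
final class FilterCacheSketch {
    // Toy "index": the city value stored for each document id.
    private final String[] cityByDoc;
    // One cached bitset of matching document ids per filter value.
    private final Map<String, BitSet> cache = new HashMap<>();
    // Counts how often a filter bitset was actually computed.
    int filterBuilds = 0;

    FilterCacheSketch(String[] cityByDoc) {
        this.cityByDoc = cityByDoc;
    }

    // Returns the bitset for a city, building it only on first use.
    BitSet cityFilter(String city) {
        BitSet bits = cache.get(city);
        if (bits == null) {
            bits = new BitSet(cityByDoc.length);
            for (int doc = 0; doc < cityByDoc.length; doc++) {
                if (city.equals(cityByDoc[doc])) {
                    bits.set(doc);
                }
            }
            cache.put(city, bits);
            filterBuilds++;
        }
        return bits;
    }

    // Keeps only candidates that pass the filter; only these would be scored.
    BitSet restrict(BitSet candidates, String city) {
        BitSet result = (BitSet) candidates.clone();
        result.and(cityFilter(city));
        return result;
    }
}
```

The payoff comes from reuse: the second query that restricts to the same city pays only for a bitset intersection, not for evaluating the clause again.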

Re: lucene (search) performance tuning

2012-05-26 Thread Yang
I'm using disjunction (OR) query. unfortunately all of the clauses are optional On Sat, May 26, 2012 at 4:38 AM, Simon Willnauer < simon.willna...@googlemail.com> wrote: > On Sat, May 26, 2012 at 2:59 AM, Yang wrote: > > I tested with more threads / processes. indeed this is completely > > cpu-b

Re: lucene (search) performance tuning

2012-05-26 Thread Li Li
If you don't score but sort by id, it may be a little faster. But on 3.x you can hardly speed things up with a simpler scoring function. In your situation the bottleneck is CPU, so you can speed up by parallelizing: the best option is to split the index and search the parts concurrently, so the CPUs can be fully used
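
A rough sketch of the split-and-search-concurrently idea, in plain Java rather than Lucene: each shard is searched on its own thread, each returns its local top k, and the partial results are merged into a global top k. The class `ShardedSearchSketch`, the `searchShard` placeholder, and the use of bare score arrays as "shards" are all assumptions made for illustration; a real setup would run a Lucene searcher per sub-index (or use Solr's distributed search, as suggested elsewhere in the thread).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of searching index shards in parallel.
final class ShardedSearchSketch {
    // Placeholder for a per-shard search: returns the k best scores, descending.
    static List<Double> searchShard(double[] shardScores, int k) {
        List<Double> hits = new ArrayList<>();
        for (double score : shardScores) {
            hits.add(score);
        }
        Collections.sort(hits, Collections.reverseOrder());
        return new ArrayList<>(hits.subList(0, Math.min(k, hits.size())));
    }

    // Searches all shards concurrently and merges their local top-k lists.
    static List<Double> search(List<double[]> shards, int k) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Future<List<Double>>> futures = new ArrayList<>();
            for (double[] shard : shards) {
                futures.add(pool.submit(() -> searchShard(shard, k)));
            }
            List<Double> merged = new ArrayList<>();
            for (Future<List<Double>> future : futures) {
                merged.addAll(future.get());
            }
            Collections.sort(merged, Collections.reverseOrder());
            return new ArrayList<>(merged.subList(0, Math.min(k, merged.size())));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while searching shards", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("shard search failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Taking the top k from every shard before merging is what keeps the result identical to a single-index search: any document in the global top k must also be in its own shard's top k.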

Re: lucene (search) performance tuning

2012-05-26 Thread Simon Willnauer
On Sat, May 26, 2012 at 2:59 AM, Yang wrote: > I tested with more threads / processes. indeed this is completely > cpu-bound, since running 1 thread gives the same latency as 4 threads (my > box has 4 cores) > > > given this, is there any way to simplify the scoring computation (i'm only > using l

Re: lucene (search) performance tuning

2012-05-25 Thread Yang
I tested with more threads / processes. indeed this is completely cpu-bound, since running 1 thread gives the same latency as 4 threads (my box has 4 cores) given this, is there any way to simplify the scoring computation (i'm only using lucene as a first level "rough" search, so the search quali

Re: lucene (search) performance tuning

2012-05-25 Thread Yang
thanks a lot guys On Tue, May 22, 2012 at 1:34 AM, Ian Lea wrote: > Lots of good tips in > http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, linked from > the FAQ. > > > -- > Ian. > > > On Tue, May 22, 2012 at 2:08 AM, Li Li wrote: > > something wrong when writing in my android client.

Re: lucene (search) performance tuning

2012-05-22 Thread Ian Lea
Lots of good tips in http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, linked from the FAQ. -- Ian. On Tue, May 22, 2012 at 2:08 AM, Li Li wrote: > something wrong when writing in my android client. > if RAMDirectory do not help, i think the bottleneck is cpu. you may try to > tune jvm

Re: lucene (search) performance tuning

2012-05-21 Thread Li Li
Something went wrong when writing in my Android client. If RAMDirectory does not help, I think the bottleneck is CPU. You may try to tune the JVM, but I do not expect much improvement. The best option is to split your index into 2 or more smaller ones; you can then use Solr's distributed search. If the CPU is

Re: lucene (search) performance tuning

2012-05-21 Thread Li Li
On 2012-5-22 at 4:59 AM, "Yang" wrote: > > I'm trying to make my search faster. right now a query like > > name:Joe Moe Pizza address:77 main street city:San Francisco >is this a conjunction query or a disjunction query? > in a index with 20mil such short business descriptions (total size about 3GB) take

Re: Improving Lucene Search Performance

2011-12-09 Thread Chris Hostetter
: Subject: Improving Lucene Search Performance : In-Reply-To: : : References: : <161fd7d0-e01f-42f2-a02a-a4e4b182c...@ebi.ac.uk><347A161B-6C7B-4DC3-ACD0-9A804E2 : dd...@ebi.ac.uk><007613f0-8529-47a3-95c4-7839e1d3e...@ebi.ac.uk> : https://people.apache.org/~hos

Re: Improving Lucene Search Performance

2011-12-08 Thread Ian Lea
See http://wiki.apache.org/lucene-java/ImproveSearchingSpeed. Some of the tips relate to indexing but most to search time stuff. -- Ian. On Thu, Dec 8, 2011 at 10:45 AM, Dilshad K. P. wrote: > Hi, > Is there any thing to take care while creating index for improving lucene > text search speed

Improving Lucene Search Performance

2011-12-08 Thread Dilshad K. P.
Hi, is there anything to take care of when creating an index in order to improve Lucene text search speed? Thanks and regards, Dilshad K.P

Lucene Search Performance Analysis Workshop

2009-08-26 Thread Andrzej Bialecki
Hi all, I am giving a free talk/ workshop next week on how to analyze and improve Lucene search performance for native lucene apps. If you've ever been challenged to get your Java Lucene search apps running faster, I think you might find the talk of interest. Free online workshop: Thu

Re: Lucene search performance on Sun UltraSparc T2 (T5120) servers

2009-02-19 Thread Glen Newton
I will look a little deeper into the information you supplied and comment, but will suggest this on my initial cursory review: 1 - You have 32GB of memory. Using the 64-bit VM, try using a 16GB or 24GB heap; 2 - Turn on huge pages: -XX:+UseLargePages -XX:LargePageSizeInBytes=256m 3 - Tu

Re: Lucene search performance on Sun UltraSparc T2 (T5120) servers

2009-02-18 Thread Varun Dhussa
Hi, The details are as follows: Solaris version: Solaris 10 U5 and U6 For the Java Setup, I have tried with: Sun JDK 1.5 (32 & 64) Sun JDK 1.6 (32 & 64) Heap Space: 2G from 32 bit and 4G for 64 bit (Set the same values for both XMS and XMX) Disk: Tried with ZFS (U6) and UFS (U5) I reduced the

Re: Lucene search performance on Sun UltraSparc T2 (T5120) servers

2009-02-18 Thread Michael Stoppelman
kly" and how we determine what prefix to use in 2) are to >> be determined but the principle seems reasonable >> >> Thoughts? >> >> >> >> >> - Original Message >> From: Varun Dhussa >> To: java-user@lucene.apache.org >>

Re: Lucene search performance on Sun UltraSparc T2 (T5120) servers

2009-02-18 Thread Glen Newton
Could you give some configuration details: - Solaris version - Java VM version, heap size, and any other flags - disk setup You should also consider using huge pages (see http://zzzoot.blogspot.com/2009/02/java-mysql-increased-performance-with.html) I will also be posting performance gains using

Re: Lucene search performance on Sun UltraSparc T2 (T5120) servers

2009-02-18 Thread eks dev
st score in the priority queue. > > How we "exit quickly" and how we determine what prefix to use in 2) are to be > determined but the principle seems reasonable > > Thoughts? > > > > > - Original Message > From: Varun Dhussa > To: java-user@lucene.apac

Re: Lucene search performance on Sun UltraSparc T2 (T5120) servers

2009-02-18 Thread Varun Dhussa
asonable Thoughts? - Original Message From: Varun Dhussa To: java-user@lucene.apache.org Sent: Wednesday, 18 February, 2009 10:36:07 Subject: Lucene search performance on Sun UltraSparc T2 (T5120) servers Hi, I have had a bad experience when migrating my application from Intel Xeon based serv

Re: Lucene search performance on Sun UltraSparc T2 (T5120) servers

2009-02-18 Thread mark harwood
"z", exiting quickly if the term fails to meet lowest score in the priority queue. How we "exit quickly" and how we determine what prefix to use in 2) are to be determined but the principle seems reasonable Thoughts? - Original Message From: Varun Dhussa To: java-user

Lucene search performance on Sun UltraSparc T2 (T5120) servers

2009-02-18 Thread Varun Dhussa
Hi, I have had a bad experience when migrating my application from Intel Xeon based servers to Sun UltraSparc T2 T5120 servers. Lucene fuzzy search just does not perform. A search which took approximately 500 ms takes more than 6 seconds to execute. The index has about 100,000,000 records. S

RE: Lucene Search Performance

2008-02-29 Thread Andreas Guther
. We can provide the user an option to search across older indexes as well, if wanted. Andreas -Original Message- From: Jamie [mailto:[EMAIL PROTECTED] Sent: Wednesday, February 27, 2008 10:17 PM To: java-user@lucene.apache.org Subject: Re: Lucene Search Performance Hi Thanks for the

Re: Lucene Search Performance

2008-02-27 Thread Jamie
Hi, thanks for the suggestions. This would require us to change the index, and right now we literally have millions of documents stored in the current index format. I'll bear it in mind, but I am not entirely sure how I would go about implementing the change at this point. Much appreciated, Jamie

Re: Lucene Search Performance

2008-02-27 Thread h t
1. redefine the archivedate field as YYmmDD format, 2. add another field using timestamp for sort use. 3. use RangeFilter to get result and then sort by timestamp. 2008/2/27, Jamie <[EMAIL PROTECTED]>: > > Hi Michael & Others > > Ok. I've gathered some more statistics from a different machine for

Re: Lucene Search Performance

2008-02-27 Thread Michael Prichard
I'm wondering if your date field's precision may be a little too much? What I mean is that you are going all the way down to seconds. Whenever you do a range query you are essentially spawning a BooleanQuery with a representation of that range. Do you really need to be that precise? I u
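
To make the precision point concrete, here is a small worked calculation in plain Java. It assumes, as the message describes, that a range query expands into one clause per distinct term inside the range, and it computes an upper bound on that number of terms at two resolutions. `DatePrecisionSketch` and `maxTermsInRange` are hypothetical names for illustration only, and the bound assumes non-negative epoch-millisecond timestamps.

```java
// Hypothetical illustration: how date resolution bounds the number of
// distinct terms a range query may have to expand into.
final class DatePrecisionSketch {
    static final long SECOND_MILLIS = 1000L;
    static final long DAY_MILLIS = 24L * 60 * 60 * 1000;

    // Upper bound on distinct terms in [startMillis, endMillis] when each
    // timestamp is truncated to the given resolution before indexing.
    // Assumes non-negative timestamps and startMillis <= endMillis.
    static long maxTermsInRange(long startMillis, long endMillis, long resolutionMillis) {
        return (endMillis / resolutionMillis) - (startMillis / resolutionMillis) + 1;
    }

    public static void main(String[] args) {
        long start = 0L;
        long end = 30 * DAY_MILLIS - 1; // a 30-day window
        System.out.println("second resolution: " + maxTermsInRange(start, end, SECOND_MILLIS));
        System.out.println("day resolution:    " + maxTermsInRange(start, end, DAY_MILLIS));
    }
}
```

For a 30-day window the bound is 2,592,000 terms at second resolution but only 30 at day resolution. Lucene releases of that era also capped a BooleanQuery at 1024 clauses by default, so a second-level range could hit that limit long before it became merely slow.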

Re: Lucene Search Performance

2008-02-26 Thread Anshum
Hi Jamie, Are you running concurrent searches on the index i.e. spawning multiple threads and not handling them? I have been having similar issues and I am planning to try out a workaround for it using Java's Interface Executor. http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/Executor
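
A minimal sketch of what handling concurrent searches through an executor could look like: a fixed-size pool caps how many searches run at once, and extra requests wait in the pool's queue instead of each spawning a thread. Everything here is an assumption for illustration: `BoundedSearchSketch` is a hypothetical name, and the short sleep merely stands in for a real call such as IndexSearcher.search().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of bounding concurrent searches with a thread pool.
final class BoundedSearchSketch {
    // Runs `requests` placeholder searches on at most `maxThreads` threads
    // and returns the highest number of searches observed running at once.
    static int runBounded(int requests, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                futures.add(pool.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(5); // stands in for the real search call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                    }
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for every queued search to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for searches", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("search task failed", e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }
}
```

Sizing the pool near the number of cores is a common starting point for CPU-bound searches, though the right value depends on the workload and should be measured.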

Re: Lucene Search Performance

2008-02-26 Thread h t
Hi Michael, I guess the hotspot in Lucene is org.apache.lucene.search.IndexSearcher.search(). Hi Jamie, what's the original text size of a million emails? I estimate the size of an email is around 100k; is this true? When you search, what kind of keywords do you enter: single words or short sentences?

Re: Lucene Search Performance

2008-02-26 Thread Michael Stoppelman
So you're saying searches are taking 10 seconds on a 5G index? If so that seems ungodly slow. If you're on *nix, have you watched your iostat statistics? Maybe something is hammering your hds. Something seems amiss. What lucene methods were pointed to as hotspots by YourKit? -M On Tue, Feb 26, 2

Re: Lucene Search Performance

2008-02-26 Thread Jamie
Hi Michael, perhaps this will help. We are using Lucene to index emails and provide a search interface to search through those emails. Many of our customers have 3-5 TB or more of email data. The index size tends to be around 5 GB per million messages. On a 3 GHz Intel Core Duo with standard

Re: Lucene Search Performance

2008-02-26 Thread Michael Stoppelman
On Tue, Feb 26, 2008 at 10:18 AM, Jamie <[EMAIL PROTECTED]> wrote: > Hi > > I am looking for a way to improve the search performance of my > application. I've followed every suggestion in the Lucene Wiki but the > search is still too slow with large indexes. I was wondering whether Did you optim

Lucene Search Performance

2008-02-26 Thread Jamie
Hi I am looking for a way to improve the search performance of my application. I've followed every suggestion in the Lucene Wiki but the search is still too slow with large indexes. I was wondering whether there was a way to restrict a search to a specific time period and in doing so sacrific

Re: Lucene search performance: linear?

2007-03-21 Thread Yonik Seeley
On 3/21/07, Peter Keegan <[EMAIL PROTECTED]> wrote: On a similar topic, has anybody measured query performance as a function of index size? Well, I did and the results surprised me. I measured query throughput on 8 indexes that varied in size from 55,000 to 4.4 million documents. When plotted on

Re: Lucene search performance: linear?

2007-03-21 Thread Peter Keegan
Best regards, Lisheng -Original Message- From: Soeren Pekrul [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 05, 2006 10:37 AM To: java-user@lucene.apache.org Subject: Re: Lucene search performance: linear? Hello Lisheng, a search process usually has to do two things. First it has to fi

RE: Lucene search performance: linear?

2006-12-05 Thread Zhang, Lisheng
@lucene.apache.org Subject: Re: Lucene search performance: linear? Hello Lisheng, a search process usually has to do two things. First it has to find the term in the index. I don’t know the implementation of finding a term in Lucene. I hope that the index is at least a sorted list or a binary tree, so it

Re: Lucene search performance: linear?

2006-12-05 Thread Soeren Pekrul
Hello Lisheng, a search process usually has to do two things. First it has to find the term in the index. I don’t know how Lucene implements term lookup. I hope that the index is at least a sorted list or a binary tree, so it can use binary search. The time to find a term depends on t

RE: Lucene search performance: linear?

2006-12-05 Thread Zhang, Lisheng
Hi, Thanks for the reply, I only measure search(), I cached IndexSearcher in memory. Best regards, Lisheng -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 05, 2006 12:22 AM To: java-user@lucene.apache.org Subject: Re: Lucene search performance

Re: Lucene search performance: linear?

2006-12-05 Thread Michael McCandless
Zhang, Lisheng wrote: Hi, I indexed first 220,000, all with a special keyword, I did a simple query and only fetched 5 docs, with Hits.length()=220,000. Then I indexed 440,000 docs, with the same keyword, query it again and fetched a few docs, with Hits.length()=440,000. I found that search ti

Re: Lucene search performance: linear?

2006-12-05 Thread Daniel Naber
On Tuesday 05 December 2006 03:49, Zhang, Lisheng wrote: > I found that search time is about linear: 2nd time is about 2 times > longer than 1st query. What exactly did you measure, only the search() or also opening the IndexSearcher? The latter depends on index size, thus you shouldn't re-open

Lucene search performance: linear?

2006-12-04 Thread Zhang, Lisheng
Hi, I first indexed 220,000 docs, all with a special keyword, then ran a simple query and fetched only 5 docs, with Hits.length()=220,000. Then I indexed 440,000 docs with the same keyword, queried again and fetched a few docs, with Hits.length()=440,000. I found that search time is about linear: 2nd