And no, RAMDirectory does not help.

On Mon, May 28, 2012 at 5:54 PM, Lance Norskog <[email protected]> wrote:
> Can you use filter queries? Filters short-circuit a lot of search
> processing. "City:San Francisco" is a classic filter - it is a small
> part of the documents and it is reused a lot.
>
> On Sat, May 26, 2012 at 7:32 AM, Yang <[email protected]> wrote:
>> I'm using disjunction (OR) query. unfortunately all of the clauses are
>> optional
>>
>> On Sat, May 26, 2012 at 4:38 AM, Simon Willnauer <[email protected]> wrote:
>>
>>> On Sat, May 26, 2012 at 2:59 AM, Yang <[email protected]> wrote:
>>> > I tested with more threads / processes. indeed this is completely
>>> > cpu-bound, since running 1 thread gives the same latency as 4 threads
>>> > (my box has 4 cores)
>>> >
>>> > given this, is there any way to simplify the scoring computation (i'm
>>> > only using lucene as a first level "rough" search, so the search
>>> > quality is not a huge issue here), so that, for example, fewer fields
>>> > are evaluated or a simpler scoring function is used?
>>>
>>> are you using disjunction or conjunction queries? Can you make some
>>> parts of the query mandatory?
>>>
>>> simon
>>> >
>>> > thanks
>>> > Yang
>>> >
>>> > On Fri, May 25, 2012 at 5:47 PM, Yang <[email protected]> wrote:
>>> >
>>> >> thanks a lot guys
>>> >>
>>> >> On Tue, May 22, 2012 at 1:34 AM, Ian Lea <[email protected]> wrote:
>>> >>
>>> >>> Lots of good tips in
>>> >>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, linked from
>>> >>> the FAQ.
>>> >>>
>>> >>> --
>>> >>> Ian.
>>> >>>
>>> >>> On Tue, May 22, 2012 at 2:08 AM, Li Li <[email protected]> wrote:
>>> >>> > something wrong when writing in my android client.
>>> >>> > if RAMDirectory does not help, i think the bottleneck is cpu. you may
>>> >>> > try to tune jvm but i do not expect much improvement.
>>> >>> > the best one is splitting your index into 2 or more smaller ones.
>>> >>> > you can then use Solr's distributed searching.
>>> >>> > if the cpu is not fully used, you can do this in one physical machine
>>> >>> >
>>> >>> > On 2012-5-22 at 8:50 AM, "Li Li" <[email protected]> wrote:
>>> >>> >>
>>> >>> >> On 2012-5-22 at 4:59 AM, "Yang" <[email protected]> wrote:
>>> >>> >>
>>> >>> >> > I'm trying to make my search faster. right now a query like
>>> >>> >> >
>>> >>> >> > name:Joe Moe Pizza address:77 main street city:San Francisco
>>> >>> >>
>>> >>> >> is this a conjunction query or a disjunction query?
>>> >>> >>
>>> >>> >> > in an index with 20mil such short business descriptions (total
>>> >>> >> > size about 3GB) takes about 100-200ms.
>>> >>> >>
>>> >>> >> 20m is not a small size, how many results for a query on average?
>>> >>> >>
>>> >>> >> > I profiled the query, most time is spent in TermScorer.score(),
>>> >>> >> > as is shown by the attached YourKit screenshot.
>>> >>> >>
>>> >>> >> that's true, for a query, matching and scoring is very time
>>> >>> >> consuming and cpu intensive. another one is io for reading postings.
>>> >>> >>
>>> >>> >> > I tried loading the index onto tmpfs (in-memory block device), and
>>> >>> >> > also tried RAMDirectory, neither helps much.
>>> >>> >>
>>> >>> >> if that is true. it seems that io is not the
>>> >>> >>
>>> >>> >> > I am reading
>>> >>> >> > http://www.cnlp.org/presentations/slides/AdvancedLuceneEU.pdf
>>> >>> >> > it mentions
>>> >>> >> > Size
>>> >>> >> >   – Stopword removal
>>> >>> >> >   – Stemming
>>> >>> >> >       • Lucene has a number of stemmers available
>>> >>> >> >       • Light versus Aggressive
>>> >>> >> >       • May prevent fine-grained matches in some cases
>>> >>> >> >   – Not a linear factor (usually) due to index compression
>>> >>> >> >
>>> >>> >> > so for "stopword removal", I'm already using the standard
>>> >>> >> > analyzer, so stop word removal is already included, right?
>>> >>> >> >
>>> >>> >> > also generally any other tricks to try for reducing the search
>>> >>> >> > latency?
>>> >>> >> >
>>> >>> >> > Thanks!
>>> >>> >> > Yang
>>> >>> >> >
>>> >>> >> > ---------------------------------------------------------------------
>>> >>> >> > To unsubscribe, e-mail: [email protected]
>>> >>> >> > For additional commands, e-mail: [email protected]
>
> --
> Lance Norskog
> [email protected]
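
To illustrate the filter-query suggestion quoted above, here is a toy sketch in plain Java. It is not Lucene code: the class FilterSketch, its postings data, and the term names are all made up for illustration, and the "score" is just a count of matching terms. It only shows why restricting candidates with a filter reduces the number of documents that reach the scorer. Note that this makes the city clause mandatory, which changes the semantics of a pure OR query, so it only applies if that restriction is acceptable.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy inverted index (hypothetical, NOT the Lucene API).
class FilterSketch {
    // term -> ids of documents containing that term (made-up data)
    static final Map<String, int[]> POSTINGS = new HashMap<>();
    static {
        POSTINGS.put("name:pizza", new int[] {0, 2, 3, 5, 7});
        POSTINGS.put("address:main", new int[] {1, 2, 4, 5, 6, 7});
        POSTINGS.put("city:sf", new int[] {2, 5});
    }

    // Pure disjunction: every document matching ANY term gets scored.
    static Map<Integer, Integer> scoreDisjunction(List<String> terms) {
        Map<Integer, Integer> scores = new TreeMap<>();
        for (String term : terms) {
            for (int doc : POSTINGS.getOrDefault(term, new int[0])) {
                scores.merge(doc, 1, Integer::sum); // stand-in for real scoring
            }
        }
        return scores;
    }

    // Filter first: only documents inside the filter set are scored.
    static Map<Integer, Integer> scoreWithFilter(String filterTerm, List<String> terms) {
        Set<Integer> allowed = new HashSet<>();
        for (int doc : POSTINGS.getOrDefault(filterTerm, new int[0])) {
            allowed.add(doc);
        }
        Map<Integer, Integer> scores = new TreeMap<>();
        for (String term : terms) {
            for (int doc : POSTINGS.getOrDefault(term, new int[0])) {
                if (allowed.contains(doc)) {
                    scores.merge(doc, 1, Integer::sum);
                }
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> all = scoreDisjunction(
                Arrays.asList("name:pizza", "address:main", "city:sf"));
        Map<Integer, Integer> filtered = scoreWithFilter(
                "city:sf", Arrays.asList("name:pizza", "address:main"));
        System.out.println("docs scored, pure OR:     " + all.size());
        System.out.println("docs scored, city filter: " + filtered.size());
    }
}
```

With this made-up data the pure OR query scores all 8 documents, while the filtered version scores only the 2 documents in the city. The toy version still walks every posting; a real engine can also skip postings outside the filter, which this sketch does not model.
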

--
Lance Norskog
[email protected]
