On 2012-5-22 at 4:59 AM, "Yang" <teddyyyy...@gmail.com> wrote:
>
> I'm trying to make my search faster. Right now a query like
>
> name:Joe Moe Pizza   address:77 main street  city:San Francisco

Is this a conjunction query or a disjunction query?

> in an index with 20 million such short business descriptions (total size about
> 3GB) takes about 100-200ms.

20 million is not a small index. How many results does a query return on average?

> I profiled the query, most time is spent in TermScorer.score(), as is
> shown by the attached YourKit screenshot.

That's true. For a query, matching and scoring are very time-consuming and
CPU-intensive. Another cost is I/O for reading postings.
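
As a rough illustration of why the conjunction/disjunction question matters for scoring cost, here is a minimal sketch in plain Java. It is a toy model and not Lucene's implementation: the class PostingListDemo, its two methods, and the posting lists are all made up for this example. The assumption is simply that each term has a sorted list of matching doc IDs; an AND query only has to score the intersection, while an OR query has to score the union.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class PostingListDemo {
    // Conjunction (AND): only doc IDs present in both sorted lists
    // become candidates for scoring.
    public static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                out.add(a[i]);
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i++;
            } else {
                j++;
            }
        }
        return out;
    }

    // Disjunction (OR): every doc ID in either list has to be scored.
    public static List<Integer> union(int[] a, int[] b) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int x : a) set.add(x);
        for (int x : b) set.add(x);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // Hypothetical postings: "street" is common, "pizza" is rarer.
        int[] street = {1, 2, 3, 5, 8, 9};
        int[] pizza = {2, 8, 11};
        System.out.println("AND scores " + intersect(street, pizza).size() + " docs");
        System.out.println("OR scores " + union(street, pizza).size() + " docs");
    }
}
```

With common terms such as "street" or "san" in a 20 million document index, the union can be very large, which would be consistent with most of the time showing up in scoring.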
>
>
>
> I tried loading the index onto tmpfs (in-memory block device), and also
> tried RAMDirectory, neither helps much.

If that is true, it seems that I/O is not the

> I am reading http://www.cnlp.org/presentations/slides/AdvancedLuceneEU.pdf
> it mentions
> Size
> - Stopword removal
> - Stemming
>   * Lucene has a number of stemmers available
>   * Light versus Aggressive
>   * May prevent fine-grained matches in some cases
> - Not a linear factor (usually) due to index compression
>
> so for "stopword removal", I'm already using the standard analyzer, so
> stop word removal is already included, right?
>
> also generally any other tricks to try for reducing the search latency?
>
> Thanks!
> Yang
>
>
