Re: Performance issue when using multiple PhraseQueries against a 1+ million entries index

2014-05-19 Thread Liviu Matei
Thanks for the reply. When you mention system memory, are you referring to RAM (or heap, as this is running as a Java process)? The index size is around 13G and the Java process is not given that much memory (in terms of -Xmx). Could this be the cause? My understanding while reading some articles on the in

Re: Performance issue when using multiple PhraseQueries against a 1+ million entries index

2014-05-19 Thread Jack Krupansky
Does your index fit fully in system memory - the OS file cache? If not, there could be a lot of thrashing (I/O) as Lucene accesses the index. -- Jack Krupansky -Original Message- From: Liviu Matei Sent: Monday, May 19, 2014 4:21 PM To: [email protected] Subject: Performance

Re: Performance issue

2009-02-03 Thread mittals
Hi, all of the fields are text. We have 3 @IndexedEmbedded objects within an object. Users can search for any string; a few example cases are morgan, john, orcl, healthcare, pa ma, ma fin. Here is an example of one of our queries: (+(trading.tn:pa^150.0 lastName:pa*^45.0 firstName:pa*^30.0 roleDesc:pa^3.0 rol

Re: Performance issue

2009-02-02 Thread Matthew Hall
Do you NEED to be using 7 fields here? Like Erick said, if you could give us an example of the types of data you are trying to search against, it would be quite helpful. It's possible that you might be able to, say, collapse your 7 fields down to a single field, which would likely reduce the ove

Re: Performance issue

2009-02-02 Thread Erick Erickson
Prefix queries are expensive here. The problem is that each one forms a very large OR clause over all the terms that start with those two letters. For instance, if a field in your index contained the terms mine, milanta, and mica, a prefix search on "mi" would form mine OR milanta OR mica. Doing this across seven f
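
To make the expansion described above concrete, here is a minimal Python sketch. It is not Lucene code: it assumes a sorted in-memory term list standing in for one field's term dictionary, and `expand_prefix` plus the sample terms other than mine, milanta, and mica are hypothetical names chosen only for illustration.

```python
from bisect import bisect_left


def expand_prefix(sorted_terms, prefix):
    """Return every term in a sorted term list that starts with prefix."""
    # Jump to the first term that could match, then scan until terms stop matching.
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


# Hypothetical term dictionary for a single field.
terms = sorted(["mine", "milanta", "mica", "john", "morgan"])

# Each matching term becomes one clause of the rewritten OR query.
clauses = expand_prefix(terms, "mi")
rewritten = " OR ".join(clauses)
```

The number of clauses grows with the number of distinct terms sharing the prefix, so a two-letter prefix on a large field can expand to a very long OR, and the expansion is repeated for every field the prefix is applied to.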

Re: Performance issue

2009-02-02 Thread Grant Ingersoll
Can you give us more info on what they are searching for with 2-letter searches? Typically, prefix queries that short are going to have a lot of terms to match. You might try having a field that you index using a variation of ngrams that are anchored at the first character. For example, en
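
A rough Python sketch of the first-character-anchored n-gram idea follows. It is not Lucene's analyzer: `edge_ngrams`, `build_prefix_index`, the length limits of 1 to 3, and the sample documents are all assumptions made for illustration. The point it tries to show is that once the anchored n-grams are indexed as ordinary terms, a two-letter search becomes a single exact term lookup instead of a prefix expansion.

```python
from collections import defaultdict


def edge_ngrams(token, min_len=1, max_len=3):
    """Return the prefixes of token with lengths min_len..max_len."""
    upper = min(max_len, len(token))
    return [token[:n] for n in range(min_len, upper + 1)]


def build_prefix_index(docs, min_len=1, max_len=3):
    """Map each anchored n-gram to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            for gram in edge_ngrams(token, min_len, max_len):
                index[gram].add(doc_id)
    return index


# Hypothetical documents; ids and names are illustrative only.
docs = {1: "Morgan", 2: "John", 3: "Monroe"}
index = build_prefix_index(docs)

# A two-letter search is now one exact lookup rather than an OR over many terms.
hits = sorted(index.get("mo", set()))
```

The trade-off in this sketch is a larger index (each token contributes several extra terms) in exchange for cheaper short-prefix lookups at query time.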