Re: lucene 2.9.0RC4 slower than 2.4.1?

Mark Miller Wed, 16 Sep 2009 10:04:47 -0700

Something is very odd about this if they both cover the same search and
the environment for both is identical. Even if one search was done twice,
and we divide the numbers for the new API by 2, it's still *very* odd.


With 2.4, ScorerDocQueue.topDoc is called half a million times.
With 2.9, it's called over 4 million times.

Huh?
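
To put rough numbers on that, here is a minimal sketch of the arithmetic. The
counts are rounded from the two figures above ("half a million" and "over 4
million"), so treat them as assumptions rather than exact profiler output; the
class and method names are made up for illustration.

```java
public class CallCountRatio {

    /**
     * Ratio of per-search call counts: (newCalls / searchesInNewRun) / oldCalls.
     * searchesInNewRun is how many searches we assume the new-version profile covers.
     */
    static double perSearchRatio(long newCalls, long oldCalls, int searchesInNewRun) {
        if (oldCalls <= 0 || searchesInNewRun <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        return ((double) newCalls / searchesInNewRun) / oldCalls;
    }

    public static void main(String[] args) {
        long calls24 = 500_000L;   // 2.4: "half a million" (rounded)
        long calls29 = 4_000_000L; // 2.9: "over 4 million" (rounded down)

        // Taking the 2.9 profile as a single search:
        System.out.println(perSearchRatio(calls29, calls24, 1)); // prints 8.0
        // Assuming the 2.9 profile actually covers two searches:
        System.out.println(perSearchRatio(calls29, calls24, 2)); // prints 4.0
    }
}
```

So even with the benefit of the doubt, 2.9 makes roughly four times as many
topDoc calls per search as 2.4.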

Thomas Becker wrote:
> No it's only a single segment. But two calls. One doing a getHitsCount first
> and the other doing the actual search. I'll paste both methods below if
> someone's interested.
>
> Will dig into lucene's sources and compare 2.4 search behaviour for my case
> with 2.9 tomorrow. It was about time to get more into lucene-core sources
> anyhow. :)
>
> See you tomorrow guys and thanks a lot again! It's a pleasure.
>
>       public int getHitsCount(String query, Filter filter) throws LuceneServiceException {
>               log.debug("getHitsCount('{}, {}')", query, filter);
>               if (StringUtils.isBlank(query)) {
>                       log.warn("getHitsCount: empty lucene query");
>                       return 0;
>               }
>               long startTimeMillis = System.currentTimeMillis();
>               int count = 0;
>
>               if (indexSearcher == null) {
>                       return 0;
>               }
>
>               BooleanQuery.setMaxClauseCount(MAXCLAUSECOUNT);
>               Query q = null;
>               try {
>                       q = createQuery(query);
>                       TopScoreDocCollector tsdc = TopScoreDocCollector.create(1, true);
>                       indexSearcher.search(q, filter, tsdc);
>                       count = tsdc.getTotalHits();
>                       log.info("getHitsCount: count = {}",count);
>               } catch (ParseException ex) {
>                       throw new LuceneServiceException("invalid lucene query:" + query, ex);
>               } catch (IOException e) {
>                       throw new LuceneServiceException(" indexSearcher could be corrupted", e);
>               } finally {
>                       long durationMillis = System.currentTimeMillis() - startTimeMillis;
>                       if (durationMillis > slowQueryLimit) {
>                               log.warn("getHitsCount: Slow query: {} ms, query={}", durationMillis, query);
>                       }
>                       log.debug("getHitsCount: query took {} ms", durationMillis);
>               }
>               return count;
>       }
>
>       public List<Document> search(String query, Filter filter, Sort sort, int from, int size)
>                       throws LuceneServiceException {
>               log.debug("{} search('{}', {}, {}, {}, {})",
>                               new Object[] { indexAlias, query, filter, sort, from, size });
>               long startTimeMillis = System.currentTimeMillis();
>
>               List<Document> docs = new ArrayList<Document>();
>               if (indexSearcher == null) {
>                       return docs;
>               }
>               Query q = null;
>               try {
>                       if (query == null) {
>                               log.warn("search: lucene query is null...");
>                               return docs;
>                       }
>                       q = createQuery(query);
>                       BooleanQuery.setMaxClauseCount(MAXCLAUSECOUNT);
>                       if (size < 0 || size > maxNumHits) {
>                               // set hard limit for numHits
>                               size = maxNumHits;
>                               if (log.isDebugEnabled())
>                                       log.debug("search: Size set to hardlimit: {} for query: {} with filter: {}",
>                                                       new Object[] { size, query, filter });
>                       }
>                       TopFieldCollector collector = TopFieldCollector.create(sort, size + from,
>                                       true, false, false, true);
>                       indexSearcher.search(q, filter, collector);
>                       if(size > collector.getTotalHits())
>                               size = collector.getTotalHits();
>                       if (size > 100000)
>                               log.info("search: size: {} bigger than 100.000 for query: {} with filter: {}",
>                                               new Object[] { size, query, filter });
>                       TopDocs td = collector.topDocs(from, size);
>                       ScoreDoc[] scoreDocs = td.scoreDocs;
>                       for (ScoreDoc scoreDoc : scoreDocs) {
>                               docs.add(indexSearcher.doc(scoreDoc.doc));
>                       }
>               } catch (ParseException e) {
>                       log.warn("search: ParseException: {}", e.getMessage());
>                       if (log.isDebugEnabled())
>                               log.warn("search: ParseException: ", e);
>                       return Collections.emptyList();
>               } catch (IOException e) {
>                       log.warn("search: IOException: ", e);
>                       return Collections.emptyList();
>               } finally {
>                       long durationMillis = System.currentTimeMillis() - startTimeMillis;
>                       if (durationMillis > slowQueryLimit) {
>                               log.warn("search: Slow query: {} ms, query={}, indexUsed={}",
>                                               new Object[] { durationMillis, query,
>                                                               indexSearcher.getIndexReader().directory() });
>                       }
>                       log.debug("search: query took {} ms", durationMillis);
>               }
>               return docs;
>       }
>
>
> Uwe Schindler wrote:
>   
>>>> http://ankeschwarzer.de/tmp/lucene_29_newapi_mmap_singlereq.png
>>>>
>>>> Have to verify that the last one is not by accident more than one request.
>>>> Will do the run again and then post the required info.
>>>
>>> The last figure shows that IndexSearcher.searchWithFilter was called twice,
>>> in contrast to the first figure, where IndexSearcher.search was called only
>>> once.
>>
>> I forgot, searchWithFilter is called per segment in 2.9. If it was only
>> one search, you must have two segments and therefore no optimized index for
>> this to be correct?
>>
>> Uwe
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>     
>
>   


-- 
- Mark

http://www.lucidimagination.com




