Re: lucene 2.9.0RC4 slower than 2.4.1?

Mark Miller Wed, 16 Sep 2009 10:14:45 -0700

Notice that while both DisjunctionScorer.advance and
DisjunctionScorer.advanceAfterCurrent appear to be called
in 2.9, in 2.4 I am only seeing DisjunctionScorer.advanceAfterCurrent
called.


Can someone explain that?
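
To make the question concrete, here is a minimal, self-contained sketch of how the way a scorer is driven changes which of its methods show up in a profile. It is a toy, not Lucene code: DocIterator, ArrayDocIterator and CountingDocIterator are hypothetical names invented for illustration, and the idea that the filter drives the scorer via advance() in 2.9 is an assumption that nobody in this thread has confirmed yet.

```java
public class CallCountSketch {

    /** Toy stand-in for a doc-id iterator. Hypothetical, not the Lucene class. */
    interface DocIterator {
        int NO_MORE_DOCS = Integer.MAX_VALUE;
        int nextDoc();            // step to the next doc id
        int advance(int target);  // step at least once, stop at first doc id >= target
    }

    /** Iterates a sorted array of doc ids. */
    static final class ArrayDocIterator implements DocIterator {
        private final int[] docs;
        private int pos = -1;

        ArrayDocIterator(int[] docs) { this.docs = docs; }

        public int nextDoc() {
            if (pos < docs.length) pos++;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        public int advance(int target) {
            int doc;
            do { doc = nextDoc(); } while (doc < target);
            return doc;
        }
    }

    /** Wraps an iterator and counts only the calls made from outside. */
    static final class CountingDocIterator implements DocIterator {
        private final DocIterator in;
        int nextDocCalls;
        int advanceCalls;

        CountingDocIterator(DocIterator in) { this.in = in; }

        public int nextDoc() { nextDocCalls++; return in.nextDoc(); }
        public int advance(int target) { advanceCalls++; return in.advance(target); }
    }

    /** Pattern A: the caller walks every match with nextDoc(). */
    static CountingDocIterator driveByNext(int[] matches) {
        CountingDocIterator scorer = new CountingDocIterator(new ArrayDocIterator(matches));
        while (scorer.nextDoc() != DocIterator.NO_MORE_DOCS) {
            // a collector would check the filter and record the hit here
        }
        return scorer;
    }

    /** Pattern B: filter and scorer leapfrog each other with advance(). */
    static CountingDocIterator driveByFilter(int[] matches, int[] filterDocs) {
        CountingDocIterator scorer = new CountingDocIterator(new ArrayDocIterator(matches));
        DocIterator filter = new ArrayDocIterator(filterDocs);
        int filterDoc = filter.nextDoc();
        int scorerDoc = scorer.advance(filterDoc);
        while (true) {
            if (scorerDoc == filterDoc) {
                if (scorerDoc == DocIterator.NO_MORE_DOCS) break; // both exhausted
                filterDoc = filter.nextDoc();                     // a hit; move both on
                scorerDoc = scorer.advance(filterDoc);
            } else if (scorerDoc > filterDoc) {
                filterDoc = filter.advance(scorerDoc);            // filter catches up
            } else {
                scorerDoc = scorer.advance(filterDoc);            // scorer catches up
            }
        }
        return scorer;
    }

    public static void main(String[] args) {
        int[] matches = {1, 3, 5, 7, 9};
        int[] filterDocs = {5, 9};
        CountingDocIterator a = driveByNext(matches);
        CountingDocIterator b = driveByFilter(matches, filterDocs);
        System.out.println("next-driven:   nextDoc=" + a.nextDocCalls + " advance=" + a.advanceCalls);
        System.out.println("filter-driven: nextDoc=" + b.nextDocCalls + " advance=" + b.advanceCalls);
    }
}
```

With the arrays in main, the next-driven run makes 6 nextDoc calls and no advance calls, while the filter-driven run makes 3 advance calls and no nextDoc calls. If something like pattern B is what happens with a Filter in 2.9, that would account for advance appearing in the profile, though it would not by itself explain the much higher call counts.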

Mark Miller wrote:
> Something is very odd about this if they both cover the same search and
> the environment for both is identical. Even if one search was done twice,
> and we divide the numbers for the new API by 2, it's still *very* odd.
>
> With 2.4, ScorerDocQueue.topDoc is called half a million times.
> With 2.9, it's called over 4 million times.
>
> Huh?
>
> Thomas Becker wrote:
>   
>> No, it's only a single segment. But two calls: one doing a getHitsCount
>> first, and the other doing the actual search. I'll paste both methods below
>> if someone's interested.
>>
>> Will dig into Lucene's sources and compare 2.4 search behaviour for my case
>> with 2.9 tomorrow. It was about time to get more into lucene-core sources
>> anyhow. :)
>>
>> See you tomorrow guys and thanks a lot again! It's a pleasure.
>>
>>      public int getHitsCount(String query, Filter filter) throws LuceneServiceException {
>>              log.debug("getHitsCount('{}, {}')", query, filter);
>>              if (StringUtils.isBlank(query)) {
>>                      log.warn("getHitsCount: empty lucene query");
>>                      return 0;
>>              }
>>              long startTimeMillis = System.currentTimeMillis();
>>              int count = 0;
>>
>>              if (indexSearcher == null) {
>>                      return 0;
>>              }
>>
>>              BooleanQuery.setMaxClauseCount(MAXCLAUSECOUNT);
>>              Query q = null;
>>              try {
>>                      q = createQuery(query);
>>                      TopScoreDocCollector tsdc = TopScoreDocCollector.create(1, true);
>>                      indexSearcher.search(q, filter, tsdc);
>>                      count = tsdc.getTotalHits();
>>                      log.info("getHitsCount: count = {}",count);
>>              } catch (ParseException ex) {
>>                      throw new LuceneServiceException("invalid lucene query:" + query, ex);
>>              } catch (IOException e) {
>>                      throw new LuceneServiceException(" indexSearcher could be corrupted", e);
>>              } finally {
>>                      long durationMillis = System.currentTimeMillis() - startTimeMillis;
>>                      if (durationMillis > slowQueryLimit) {
>>                              log.warn("getHitsCount: Slow query: {} ms, query={}", durationMillis, query);
>>                      }
>>                      log.debug("getHitsCount: query took {} ms", durationMillis);
>>              }
>>              return count;
>>      }
>>
>>      public List<Document> search(String query, Filter filter, Sort sort, int from, int size)
>>                      throws LuceneServiceException {
>>              log.debug("{} search('{}', {}, {}, {}, {})",
>>                              new Object[] { indexAlias, query, filter, sort, from, size });
>>              long startTimeMillis = System.currentTimeMillis();
>>
>>              List<Document> docs = new ArrayList<Document>();
>>              if (indexSearcher == null) {
>>                      return docs;
>>              }
>>              Query q = null;
>>              try {
>>                      if (query == null) {
>>                              log.warn("search: lucene query is null...");
>>                              return docs;
>>                      }
>>                      // Raise the clause limit before parsing, as getHitsCount does.
>>                      BooleanQuery.setMaxClauseCount(MAXCLAUSECOUNT);
>>                      q = createQuery(query);
>>                      if (size < 0 || size > maxNumHits) {
>>                              // set hard limit for numHits
>>                              size = maxNumHits;
>>                              if (log.isDebugEnabled())
>>                                      log.debug("search: Size set to hardlimit: {} for query: {} with filter: {}",
>>                                                      new Object[] { size, query, filter });
>>                      }
>>                      TopFieldCollector collector = TopFieldCollector.create(sort, size + from,
>>                                      true, false, false, true);
>>                      indexSearcher.search(q, filter, collector);
>>                      if(size > collector.getTotalHits())
>>                              size = collector.getTotalHits();
>>                      if (size > 100000)
>>                              log.info("search: size: {} bigger than 100.000 for query: {} with filter: {}",
>>                                              new Object[] { size, query, filter });
>>                      TopDocs td = collector.topDocs(from, size);
>>                      ScoreDoc[] scoreDocs = td.scoreDocs;
>>                      for (ScoreDoc scoreDoc : scoreDocs) {
>>                              docs.add(indexSearcher.doc(scoreDoc.doc));
>>                      }
>>              } catch (ParseException e) {
>>                      log.warn("search: ParseException: {}", e.getMessage());
>>                      if (log.isDebugEnabled())
>>                              log.debug("search: ParseException: ", e);
>>                      return Collections.emptyList();
>>              } catch (IOException e) {
>>                      log.warn("search: IOException: ", e);
>>                      return Collections.emptyList();
>>              } finally {
>>                      long durationMillis = System.currentTimeMillis() - startTimeMillis;
>>                      if (durationMillis > slowQueryLimit) {
>>                              log.warn("search: Slow query: {} ms, query={}, indexUsed={}",
>>                                              new Object[] { durationMillis, query,
>>                                                              indexSearcher.getIndexReader().directory() });
>>                      }
>>                      log.debug("search: query took {} ms", durationMillis);
>>              }
>>              return docs;
>>      }
>>
>>
>> Uwe Schindler wrote:
>>>>> http://ankeschwarzer.de/tmp/lucene_29_newapi_mmap_singlereq.png
>>>>>
>>>>> Have to verify that the last one is not by accident more than one
>>>>> request. Will do the run again and then post the required info.
>>>>
>>>> The last figure shows that IndexSearcher.searchWithFilter was called
>>>> twice, in contrast to the first figure, where IndexSearcher.search was
>>>> called only once.
>>>
>>> I forgot: searchWithFilter is called per segment in 2.9. If it was only
>>> one search, you must have two segments, and therefore an unoptimized
>>> index, for this to be correct?
>>>
>>> Uwe
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>


-- 
- Mark

http://www.lucidimagination.com




