Never mind. I see advance() wasn't around in 2.4. It is part of the DocIdSetIterator changes.
Anyway - either these are just not comparable runs, or there is a major bug (which seems unlikely).

Just to keep pointing out the obvious:

2.4 calls doc() 195,000 times
2.9 calls docID() 1.4 million times

That just doesn't jibe.

Mark Miller wrote:
> Notice that while DisjunctionScorer.advance and
> DisjunctionScorer.advanceAfterCurrent appear to be called
> in 2.9, in 2.4 I am only seeing DisjunctionScorer.advanceAfterCurrent
> called.
>
> Can someone explain that?
>
> Mark Miller wrote:
>
>> Something is very odd about this if they both cover the same search and
>> the environment for both is identical. Even if one search was done twice,
>> and we divide the numbers for the new API by 2 - it's still *very* odd.
>>
>> With 2.4, ScorerDocQueue.topDoc is called half a million times.
>> With 2.9, it's called over 4 million times.
>>
>> Huh?
>>
>> Thomas Becker wrote:
>>
>>> No, it's only a single segment. But two calls: one doing a getHitsCount first and
>>> the other doing the actual search. I'll paste both methods below if someone's
>>> interested.
>>>
>>> Will dig into lucene's sources and compare 2.4 search behaviour for my case with
>>> 2.9 tomorrow. It was about time to get more into lucene-core sources anyhow. :)
>>>
>>> See you tomorrow guys and thanks a lot again! It's a pleasure.
>>>
>>> public int getHitsCount(String query, Filter filter) throws LuceneServiceException {
>>>     log.debug("getHitsCount('{}, {}')", query, filter);
>>>     if (StringUtils.isBlank(query)) {
>>>         log.warn("getHitsCount: empty lucene query");
>>>         return 0;
>>>     }
>>>     long startTimeMillis = System.currentTimeMillis();
>>>     int count = 0;
>>>
>>>     if (indexSearcher == null) {
>>>         return 0;
>>>     }
>>>
>>>     BooleanQuery.setMaxClauseCount(MAXCLAUSECOUNT);
>>>     Query q = null;
>>>     try {
>>>         q = createQuery(query);
>>>         // Only the total hit count is needed, so keep a single top document.
>>>         TopScoreDocCollector tsdc = TopScoreDocCollector.create(1, true);
>>>         indexSearcher.search(q, filter, tsdc);
>>>         count = tsdc.getTotalHits();
>>>         log.info("getHitsCount: count = {}", count);
>>>     } catch (ParseException ex) {
>>>         throw new LuceneServiceException("invalid lucene query:" + query, ex);
>>>     } catch (IOException e) {
>>>         throw new LuceneServiceException(" indexSearcher could be corrupted", e);
>>>     } finally {
>>>         long durationMillis = System.currentTimeMillis() - startTimeMillis;
>>>         if (durationMillis > slowQueryLimit) {
>>>             log.warn("getHitsCount: Slow query: {} ms, query={}", durationMillis, query);
>>>         }
>>>         log.debug("getHitsCount: query took {} ms", durationMillis);
>>>     }
>>>     return count;
>>> }
>>>
>>> public List<Document> search(String query, Filter filter, Sort sort, int from,
>>>         int size) throws LuceneServiceException {
>>>     log.debug("{} search('{}', {}, {}, {}, {})", new Object[] { indexAlias, query,
>>>             filter, sort, from, size });
>>>     long startTimeMillis = System.currentTimeMillis();
>>>
>>>     List<Document> docs = new ArrayList<Document>();
>>>     if (indexSearcher == null) {
>>>         return docs;
>>>     }
>>>     Query q = null;
>>>     try {
>>>         if (query == null) {
>>>             log.warn("search: lucene query is null...");
>>>             return docs;
>>>         }
>>>         q = createQuery(query);
>>>         BooleanQuery.setMaxClauseCount(MAXCLAUSECOUNT);
>>>         if (size < 0 || size > maxNumHits) {
>>>             // set hard limit for numHits
>>>             size = maxNumHits;
>>>             if (log.isDebugEnabled())
>>>                 log.debug("search: Size set to hardlimit: {} for query: {} with filter: {}",
>>>                         new Object[] { size, query, filter });
>>>         }
>>>         // Collect enough sorted hits to cover the requested page (from .. from + size).
>>>         TopFieldCollector collector = TopFieldCollector.create(sort, size + from,
>>>                 true, false, false, true);
>>>         indexSearcher.search(q, filter, collector);
>>>         if (size > collector.getTotalHits())
>>>             size = collector.getTotalHits();
>>>         if (size > 100000)
>>>             log.info("search: size: {} bigger than 100.000 for query: {} with filter: {}",
>>>                     new Object[] { size, query, filter });
>>>         TopDocs td = collector.topDocs(from, size);
>>>         ScoreDoc[] scoreDocs = td.scoreDocs;
>>>         for (ScoreDoc scoreDoc : scoreDocs) {
>>>             docs.add(indexSearcher.doc(scoreDoc.doc));
>>>         }
>>>     } catch (ParseException e) {
>>>         log.warn("search: ParseException: {}", e.getMessage());
>>>         if (log.isDebugEnabled())
>>>             log.warn("search: ParseException: ", e);
>>>         return Collections.emptyList();
>>>     } catch (IOException e) {
>>>         log.warn("search: IOException: ", e);
>>>         return Collections.emptyList();
>>>     } finally {
>>>         long durationMillis = System.currentTimeMillis() - startTimeMillis;
>>>         if (durationMillis > slowQueryLimit) {
>>>             log.warn("search: Slow query: {} ms, query={}, indexUsed={}",
>>>                     new Object[] { durationMillis, query,
>>>                             indexSearcher.getIndexReader().directory() });
>>>         }
>>>         log.debug("search: query took {} ms", durationMillis);
>>>     }
>>>     return docs;
>>> }
>>>
>>> Uwe Schindler wrote:
>>>
>>>>>> http://ankeschwarzer.de/tmp/lucene_29_newapi_mmap_singlereq.png
>>>>>>
>>>>>> Have to verify that the last one is not by accident more than one request. Will
>>>>>> do the run again and then post the required info.
>>>>>
>>>>> The last figure shows that IndexSearcher.searchWithFilter was called twice,
>>>>> in contrast to the first figure, where IndexSearcher.search was called only
>>>>> once.
>>>>>
>>>> I forgot, searchWithFilter is called per segment in 2.9. If it was only
>>>> one search, you must have two segments and therefore no optimized index for
>>>> this to be correct?
>>>>
>>>> Uwe
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org

--
- Mark

http://www.lucidimagination.com
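
The per-pass comparison made in this thread (halve the 2.9 counts because the query ran twice, once for getHitsCount and once for the actual search, then compare against 2.4) can be written down as a small sketch. The class and method names here are hypothetical, and the inputs are only the rounded call counts quoted above, not fresh profiler output.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: compares profiler call counts
// between two runs after normalizing for repeated search passes.
final class CallCountCheck {

    /**
     * Returns how many times more often a method was called per search pass
     * in the new run than in the old run.
     *
     * @param newCalls total calls observed in the new run (e.g. 2.9)
     * @param passes   how many times the same query was executed in the new run
     * @param oldCalls total calls observed in the old run (e.g. 2.4), one pass
     */
    public static double perPassRatio(long newCalls, int passes, long oldCalls) {
        if (passes <= 0 || oldCalls <= 0) {
            throw new IllegalArgumentException("passes and oldCalls must be positive");
        }
        return ((double) newCalls / passes) / oldCalls;
    }

    public static void main(String[] args) {
        // Rounded figures from the thread; assumes exactly two passes in the 2.9 run.
        double topDoc = perPassRatio(4_000_000L, 2, 500_000L);
        double docId = perPassRatio(1_400_000L, 2, 195_000L);

        System.out.println(String.format(Locale.ROOT,
                "ScorerDocQueue.topDoc: %.1fx per pass", topDoc));
        System.out.println(String.format(Locale.ROOT,
                "doc()/docID(): %.1fx per pass", docId));
    }
}
```

Even after dividing by two, the rounded numbers still leave the 2.9 run at roughly 4x (topDoc) and 3.6x (doc/docID) the 2.4 call counts, which is why the double search alone does not seem to explain the difference.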