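The Hits class collects the document ids from the query in batches. If you
iterate beyond what was collected, the query is re-executed to collect more
ids. So walking all of the hits of an expensive query pays the query cost
again on each re-execution, which is why the loop is slower for *abcd*.
You can use the expert-level search methods on IndexSearcher if this isn't
what you want.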
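
To make the cost concrete, here is a minimal, self-contained sketch of that
batching behavior. It is not Lucene's code: BatchedHits, executionsToIterate,
the first batch of 100 and the doubling growth policy are all assumptions
made for illustration, and the "query" is just a function that returns the
top-n ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Illustrative model only; these names are hypothetical, not Lucene classes.
final class BatchedHitsSketch {

    /** Caches the ids collected so far and re-runs the query when it runs out. */
    static final class BatchedHits {
        private final IntFunction<List<Integer>> query; // returns the top-n doc ids
        private final int total;                        // total number of matches
        private List<Integer> ids = new ArrayList<>();
        private int executions = 0;

        BatchedHits(IntFunction<List<Integer>> query, int total, int firstBatch) {
            if (firstBatch <= 0) {
                throw new IllegalArgumentException("firstBatch must be positive");
            }
            this.query = query;
            this.total = total;
            fetch(firstBatch);
        }

        // Every fetch re-runs the whole query, asking for a larger top-n.
        private void fetch(int n) {
            executions++;
            ids = query.apply(Math.min(n, total));
        }

        int id(int i) {
            if (i < 0 || i >= total) {
                throw new IndexOutOfBoundsException("hit " + i);
            }
            while (i >= ids.size()) {
                fetch(ids.size() * 2); // assumed growth policy: double the batch
            }
            return ids.get(i);
        }

        int length() { return total; }

        int executions() { return executions; }
    }

    /** Iterates over every hit and reports how many times the query ran. */
    static int executionsToIterate(int total, int firstBatch) {
        IntFunction<List<Integer>> query = n -> {
            List<Integer> top = new ArrayList<>(n);
            for (int d = 0; d < n; d++) {
                top.add(d);
            }
            return top;
        };
        BatchedHits hits = new BatchedHits(query, total, firstBatch);
        for (int i = 0; i < hits.length(); i++) {
            hits.id(i);
        }
        return hits.executions();
    }

    public static void main(String[] args) {
        // 1,000 matches, first batch of 100: the query runs 5 times.
        System.out.println(executionsToIterate(1000, 100));
    }
}
```

Under these assumptions the number of re-executions grows with the log of
the result size, so a query that is cheap to run hides the effect and a
query that has to scan every term makes it very visible.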
-Yonik
On 9/8/05, Richard Krenek <[EMAIL PROTECTED]> wrote:
>
> I understand that for the query, but why does it matter once you have the
> Hits object? That is the part I'm baffled by. The query with the wildcard
> at the front takes a lot longer, but we expected that.
>
> On 9/8/05, Jeremy Meyer <[EMAIL PROTECTED]> wrote:
> >
> > The issue isn't with multiple wildcards exactly. Specifically, the problem
> > is a query that starts with a wildcard. In that case, Lucene has no option
> > but to go linearly over every term in the index to see if it matches your
> > pattern. It must visit every single term in the index. If the query doesn't
> > start with a wildcard, Lucene can skip to the relevant part of the index
> > and only visit the relevant terms. For this reason, many people who use
> > Lucene choose to disallow a wildcard at the start of a search term. This
> > is discussed in the "Lucene in Action" book.
> >
> > ~Jack~
> >
> > >>Hello All,
> > >>I am getting some weird time results when retrieving documents back
> > >>from a Hits object. I am just timing this bit of code:
> > >>
> > >>Hits hits = searcher.search(query);
> > >>long startTime = System.currentTimeMillis();
> > >>for (int i = 0; i < hits.length(); i++) {
> > >>    Document doc = hits.doc(i);
> > >>    String field = doc.get(defaultField);
> > >>}
> > >>System.out.println("Cycle Time: " + (System.currentTimeMillis() - startTime));
> > >>
> > >>It seems that when I have a wildcard query like *abcd* vs. weqrew*, the
> > >>*abcd* query always takes longer to retrieve the documents, even if the
> > >>result sizes are similar. We are talking about a big difference: 1 second
> > >>vs. 16. It is consistent no matter what order I run the queries in: terms
> > >>with multiple wildcards always take longer to retrieve the documents. I
> > >>am not counting the time of the query.
> > >>
> > >>The index is 2.18 GB, 9 fields per document, 10,694,190 documents,
> > >>25,538,793 terms and has been optimized.
> > >>
> > >>I am not sure if this is a real or just a perceived issue. We cannot
> > >>figure out why the type of query would affect the time it takes to
> > >>retrieve each document. We have run this on both Windows XP and Linux,
> > >>with the same results. Also of note: we did watch GC, and it did not have
> > >>any significant impact that we could see.
> > >>
> > >>We are trying to figure out what could cause this and how we can work
> > >>around it.
> > >>
> > >>
> > >>Thanks,
> > >>Richard
> >
>
>
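
On the quoted point about leading wildcards, a rough sketch of why the
query itself is so much more expensive: the terms are kept in sorted order,
so a pattern with a fixed prefix can seek straight to the matching range,
while a pattern that starts with a wildcard has to test every term. The
TreeSet below only stands in for the term dictionary, and TermScanSketch,
prefixScan, infixScan and sampleTerms are hypothetical names, not Lucene
APIs.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustrative model only: a TreeSet stands in for the sorted term dictionary.
final class TermScanSketch {

    static SortedSet<String> sampleTerms() {
        return new TreeSet<>(
                Arrays.asList("abcd", "weqrew1", "weqrew2", "xabcdx", "zzz"));
    }

    /** prefix*: seek to the prefix, stop at the first term that does not match. */
    static int[] prefixScan(SortedSet<String> terms, String prefix) {
        int visited = 0;
        int matched = 0;
        for (String term : terms.tailSet(prefix)) {
            visited++;
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match either
            }
            matched++;
        }
        return new int[] {visited, matched};
    }

    /** *infix*: there is nothing to seek to, so every term is tested. */
    static int[] infixScan(SortedSet<String> terms, String infix) {
        int visited = 0;
        int matched = 0;
        for (String term : terms) {
            visited++;
            if (term.contains(infix)) {
                matched++;
            }
        }
        return new int[] {visited, matched};
    }

    public static void main(String[] args) {
        SortedSet<String> terms = sampleTerms();
        // Each array is {terms visited, terms matched}.
        System.out.println(Arrays.toString(prefixScan(terms, "weqrew")));
        System.out.println(Arrays.toString(infixScan(terms, "abcd")));
    }
}
```

Both patterns match two terms in the sample set, but the prefix scan visits
three terms and the infix scan visits all five. With 25 million terms that
full scan is what gets repeated each time the query is re-executed.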