The issue isn't with multiple wildcards exactly. Specifically, the problem is if the query starts with a wildcard. In the case where it starts with a wildcard, lucene has no option but to linearly go over every term in the index to see if it matches your pattern. It must visit every singe term in the index. If it doesn't start with a wildcard, lucene can skip to the relevant part of the index and only visit the relevant terms. For this reason, many people that use Lucene choose to disable having wildcard at the start of a search term. This is discussed in the "Lucene in Action" book.
~Jack~ >>Hello All, >>I am getting some weird time results when retrieving documents back from a hits object. I am just timing this bit of code: >>Hits hits = searcher.search(query); >>long startTime = System.currentTimeMillis(); for (int i = 0; i < hits.length(); i++) { Document doc = hits.doc(i); String field = doc.get(defaultField); } System.out.println("Cycle Time: "+(System.currentTimeMillis()-startTime)); >> >>It seems when I have a wilcard query like *abcd* vs weqrew*, the *abcd* query will always take longer to retrieve the documents even if they are of simular result sizes. We are talking a big difference 1 second vs 16. It is consistent no matter >>what order I run the queries in, terms with multiple wildcards always take longer to retrieve the documents. I am not counting the time of the query. >> >>The index is 2.18 GB, 9 fields per document, 10,694,190 documents, >>25,538,793 terms and has been optimized. >> >>I am not sure if this is a real or just a percieved issue. We cannot figure out why the type of query would affect the speed it takes to retrieve each document. We have run this on both Windows XP and Linux. With the same results. Also to >>note we did watch GC and this did not have any significant impact that we could se. >> >>We are trying to figure out what could cause this and how we can work around it. >> >> >>Thanks, >>Richard