Re: Google performance bottlenecks ;-) (Re: Lucene performance bottlenecks)

Andrzej Bialecki Mon, 12 Dec 2005 01:59:01 -0800

Dawid Weiss wrote:

Hi Andrzej,
This was a very interesting experiment -- thanks for sharing the results with us.
The last range was the maximum in this case - Google wouldn't display any hit above 652 (which I find curious, too - because the total number of hits is, well, significantly higher - and Google claims to return up to the first 1000 results).
I believe this may have something to do with the way Google compacts URLs. My guess is that initially 1000 results are found and ranked. Then pruning is performed on that, leaving just a subset of results for the user to select from.


That was my guess, too ...
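
To make the "rank first, prune afterwards" guess concrete, here is a toy sketch in Python. Nothing in it reflects how Google actually works: the function name prune_ranked, the per-host cap, and the made-up URLs are all assumptions chosen only to show how a fixed 1000-result window can shrink to a smaller visible set.

```python
from urllib.parse import urlparse


def prune_ranked(urls, limit=1000, per_host=2):
    """Take the top `limit` ranked URLs, then keep at most `per_host` per host.

    Hypothetical illustration only: the per-host cap stands in for whatever
    URL compaction the search engine really performs.
    """
    seen = {}   # host -> number of results already kept
    kept = []
    for url in urls[:limit]:
        host = urlparse(url).netloc
        count = seen.get(host, 0)
        if count < per_host:
            kept.append(url)
            seen[host] = count + 1
    return kept


# Made-up ranked list: 5000 hits spread over 400 hosts.
ranked = [f"http://site{i % 400}.example/page{i}" for i in range(5000)]
visible = prune_ranked(ranked)
print(len(visible))  # fewer than 1000, although 5000 hits "exist"
```

Under this model the number of visible results depends on how the top 1000 happen to be distributed across hosts, not on the total hit count.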

Sorry, my initial intuition proved wrong -- there is no clear logic behind the maximum limit of results you can see (unless you can find some logic in the fact that I can see _more_ results when I _exclude_ repeated ones from the total).

Well, trying not to sound too much like Spock... Fascinating :-), but the only logical conclusion is that at the user end we never deal with any hard results calculated directly from the hypothetical "main index", we deal just with rough estimates from the "estimated indexes". These change over time, and perhaps even with the group of servers that answered this particular query... My guess is that there could be different "estimated" indexes prepared for different values of the main boolean parameters, like filter=0...


--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
