Hi,

I had a brief brainstorm on my run today about profiling the 
indexing. Obviously htword/mifluz performance still needs to improve 
significantly. But another slowdown relative to 3.1 is from the way 3.2 
treats hopcounts. To ensure that restricting indexes by hopcount works 
correctly, the "queue" for URLs is really a priority queue. URLs with 
lower hopcounts move up the heap. Of course, this means some sorting 
overhead each time a URL is added to or removed from the queue.

Right now, I don't think this needs to happen *unless* we're restricting 
indexing based on hopcount. So the proposal is that when we're not 
restricting by hopcount, the Server objects would switch back to the 
previous system (i.e. no sorting).
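
To make the proposal concrete, here is a rough sketch of the idea, not the
actual Server code. The names URLQueue, QueuedURL and HigherHopcount are
made up for illustration, and I'm assuming the caller passes in a flag
saying whether a hopcount limit is configured. With the flag set, URLs go
through a heap ordered by hopcount; without it, they go through a plain
FIFO with no sorting at all.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <queue>
#include <string>
#include <vector>

// Hypothetical queue entry; not ht://Dig's real data structure.
struct QueuedURL {
    std::string url;
    int hopcount;
    std::size_t seq;  // insertion order, keeps equal hopcounts in FIFO order
};

// Comparator that puts the lowest hopcount (then earliest insert) on top.
struct HigherHopcount {
    bool operator()(const QueuedURL &a, const QueuedURL &b) const {
        if (a.hopcount != b.hopcount)
            return a.hopcount > b.hopcount;
        return a.seq > b.seq;
    }
};

// Hypothetical per-server queue that only sorts when it has to.
class URLQueue {
public:
    // restrict_by_hopcount is assumed to come from the configuration.
    explicit URLQueue(bool restrict_by_hopcount)
        : sorted_(restrict_by_hopcount), next_seq_(0) {}

    void push(const std::string &url, int hopcount) {
        QueuedURL item{url, hopcount, next_seq_++};
        if (sorted_)
            heap_.push(item);       // O(log n): heap reordering
        else
            fifo_.push_back(item);  // O(1): no sorting
    }

    bool empty() const { return sorted_ ? heap_.empty() : fifo_.empty(); }

    QueuedURL pop() {
        assert(!empty());
        if (sorted_) {
            QueuedURL top = heap_.top();
            heap_.pop();
            return top;
        }
        QueuedURL front = fifo_.front();
        fifo_.pop_front();
        return front;
    }

private:
    bool sorted_;
    std::size_t next_seq_;
    std::priority_queue<QueuedURL, std::vector<QueuedURL>, HigherHopcount> heap_;
    std::deque<QueuedURL> fifo_;
};
```

The heap path pays O(log n) per push and pop, while the FIFO path is
constant time, which is where any savings would come from.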

I think this should shave a few percent off indexing time. Does this seem 
like an OK idea? Can anyone come up with an example where this would be 
a Bad Idea(tm)?

-Geoff



_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev
