[Message reposted due to mailing list problems]
Remy Schleimer <[EMAIL PROTECTED]> wrote:
> Reading about indexing problems, I came to wonder whether wwwoffle might
> have a function that would index its cache 'on the fly'. I think wwwoffle
> might create a task running htdig and then send any new pages being
> cached to that process, which in turn would take care of on the fly indexing.
This is unfortunately not possible due to the design of htdig.
There are two phases to using htdig to create the index of words
that can be searched. The first phase is to find all of the web pages
and extract the words from them. The second phase is to create an
index that can be searched.
The problem is that the second phase takes almost the same amount of
time when adding in the words from one page as it does when performing
a full search. This is the reason that the wwwoffle-htdig-lasttime
script takes so long to run even though it is only adding in the pages
that were visited the last time online.
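As a rough illustration of why this hurts, here is a toy sketch in
Python of a two-phase indexer. It is not htdig's actual code or file
format; the function names and example URLs are made up. It only shows
that when the second phase rebuilds the index from all the collected
words, adding one page costs about as much as indexing everything.

```python
from collections import defaultdict

def extract_words(pages):
    """Phase 1: collect (word, url) pairs from every page given."""
    pairs = []
    for url, text in pages.items():
        for word in set(text.lower().split()):
            pairs.append((word, url))
    return pairs

def build_index(pairs):
    """Phase 2: sort all pairs and merge them into a searchable index.

    The whole list is processed every time, however few pairs are new.
    """
    index = defaultdict(list)
    for word, url in sorted(pairs):
        index[word].append(url)
    return dict(index)

# Hypothetical cache contents, for illustration only.
pages = {
    "http://example.com/a": "offline cache proxy",
    "http://example.com/b": "cache index search",
}
pairs = extract_words(pages)
index = build_index(pairs)

# Adding one page: phase 1 only looks at the new page ...
pairs += extract_words({"http://example.com/c": "search proxy"})
# ... but phase 2 still has to go through every pair again.
index = build_index(pairs)
```

In this sketch the cost of the final build_index() call depends on the
total number of pairs, not on how many of them are new.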
With Udmsearch the addition of a single page is quicker since a full
database backend is used. The drawback that I have found is that it
is slower to use than htdig.
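To show the general idea of a database-backed index, here is a small
sketch using Python's sqlite3 module. This is not Udmsearch's schema or
code; the table layout, function names and URLs are assumptions chosen
for illustration. The point is only that adding a page becomes a few
row inserts, with no rebuild of the existing entries.

```python
import sqlite3

# In-memory database with a hypothetical one-table word index.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE words (word TEXT, url TEXT, PRIMARY KEY (word, url))"
)

def add_page(db, url, text):
    """Insert the words of one page; existing rows are left untouched."""
    rows = [(word, url) for word in set(text.lower().split())]
    db.executemany("INSERT OR IGNORE INTO words VALUES (?, ?)", rows)
    db.commit()

def search(db, word):
    """Return the URLs of pages containing the word, sorted by URL."""
    cur = db.execute(
        "SELECT url FROM words WHERE word = ? ORDER BY url",
        (word.lower(),),
    )
    return [row[0] for row in cur]

add_page(db, "http://example.com/a", "offline cache proxy")
add_page(db, "http://example.com/b", "cache index search")
```

Each add_page() call touches only the rows for that page, which is the
property that would be needed for indexing pages as they are cached.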
I don't know what the situation is with the namazu search program. I
will be adding this into WWWOFFLE as an alternative to the current
search options.
I think that the idea is a good one though. If there is a search
program that is available as free software that is quick to add pages
then I would consider doing this.
--
Andrew.
----------------------------------------------------------------------
Andrew M. Bishop [EMAIL PROTECTED]
http://www.gedanken.demon.co.uk/
WWWOFFLE users page:
http://www.gedanken.demon.co.uk/wwwoffle/version-2.6/user.html