Hi,

I read the two tutorials on the architecture and I would like to ask two
questions about the db design.
If I understand this right, the search words are keys in word.db. A query
on word.db yields a document id, which in turn matches a key in
docs.index. This yields an (encoded) URL, which matches a key in docdb,
and we get the document we want!
Why do we need the docs.index?
My first thought would be: OK, we have word.db, which gives us a
document id. This document id could match a key in docdb directly,
and that's it. What do we gain from docs.index?
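
To make the question concrete, here is a small sketch of the two lookup
chains as I understand them. Plain Python dicts stand in for the database
files; the sample words, ids, URLs and the name doc_db_by_id are made up
for illustration and say nothing about the real on-disk formats.

```python
# Plain dicts stand in for the database files. All entries are invented
# sample data; I assume here that one word can map to several document ids.
word_db = {"berkeley": [1, 2], "hash": [2]}            # word -> document ids
docs_index = {                                          # document id -> encoded URL
    1: "http%3A//example.org/a",
    2: "http%3A//example.org/b",
}
doc_db = {                                              # encoded URL -> document
    "http%3A//example.org/a": "Document A",
    "http%3A//example.org/b": "Document B",
}


def lookup_current(word):
    """Current chain as I read it: word -> id -> encoded URL -> document."""
    return [doc_db[docs_index[doc_id]] for doc_id in word_db.get(word, [])]


# Hypothetical alternative: the document store is keyed by document id,
# so the docs.index step disappears.
doc_db_by_id = {1: "Document A", 2: "Document B"}


def lookup_proposed(word):
    """Proposed chain: word -> id -> document."""
    return [doc_db_by_id[doc_id] for doc_id in word_db.get(word, [])]
```

Both functions return the same documents for the same word; the second
just needs one lookup fewer per hit.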

My second question is about the choice of db type. If I understand this
right, all three dbs are of type B+Tree. A db with unique keys, like
docs.index, should be much faster (1 access instead of 3-4, if the
fill factor is big enough) with db type HASH. Berkeley DB
supports this easily and the access methods are identical, so it
should be easy to test this.
Please tell me if I am making a mistake here. I am a newbie in things
like db design and searching, but I have just read some parts of the
Berkeley DB documentation.
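
My "3-4 accesses" figure for the B+Tree is only a back-of-the-envelope
estimate, roughly along the lines of the sketch below. The key count and
the keys-per-page values are assumptions I picked for illustration, not
measured values, and I have not taken page caching into account.

```python
def btree_levels(n_keys, keys_per_page):
    """Smallest number of tree levels whose capacity covers n_keys,
    assuming every page holds exactly keys_per_page entries."""
    levels = 1
    capacity = keys_per_page
    while capacity < n_keys:
        capacity *= keys_per_page
        levels += 1
    return levels


# Assumed numbers: one million unique keys, 50 or 100 keys per page.
for keys_per_page in (50, 100):
    print(keys_per_page, btree_levels(1_000_000, keys_per_page))
```

With these assumed numbers the tree is 3 or 4 levels deep, whereas a hash
lookup would ideally touch a single bucket, provided the bucket has not
overflowed.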

Yours, mentos

--
Mentos Hoffmann, Roonstr.17, 76137 Karlsruhe, Germany
email: [EMAIL PROTECTED]



