[htdig3-dev] Re: Databases...

Hans-Peter Nilsson Sun, 21 Feb 1999 17:48:25 -0500

> Date: Sun, 21 Feb 1999 16:45:48 -0500
> From: Geoff Hutchison <[EMAIL PROTECTED]>

> The idea of plan 3 is that you don't store the location of the word.
> Instead, you store which words are before and after it. Since phrases will
> occur multiple times, this should provide some builtin space savings, since
> you could simply store one record.
> 
> Make sense?

Not totally, or I'm confused.

With plan 2, you store every word in a document, either uniquely
with a list of locations (not as you put it) or as separate
records per location (as you put it).

With plan 3 (as I understand it), you similarly store a record
for each unique word in a document, and list the WordID:s of
"before" (and "after").
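To make the two record shapes concrete, here is a minimal sketch of how I read them. It is not ht://Dig code: the function names are made up for illustration, and plain word strings stand in for WordIDs.

```python
from collections import defaultdict

def tokenize(text):
    # Naive whitespace tokenizer; an assumption made purely for illustration.
    return text.lower().split()

def build_plan2(words):
    """Plan 2 (list variant): one record per unique word, holding its positions."""
    index = defaultdict(list)
    for pos, word in enumerate(words):
        index[word].append(pos)
    return dict(index)

def build_plan3(words):
    """Plan 3: one record per unique word, holding the distinct neighbouring words."""
    index = defaultdict(lambda: {"before": set(), "after": set()})
    for pos, word in enumerate(words):
        if pos > 0:
            index[word]["before"].add(words[pos - 1])
        if pos + 1 < len(words):
            index[word]["after"].add(words[pos + 1])
    return dict(index)

words = tokenize("the cat sat on the mat the cat ran")
plan2 = build_plan2(words)   # e.g. plan2["the"] is a list of positions
plan3 = build_plan3(words)   # e.g. plan3["the"]["after"] is a set of words
```

Note that the plan 3 record only deduplicates neighbours that repeat; a word that appears in many different contexts still accumulates one entry per distinct neighbour.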

The speculation is that the before-and-after lists would be
sufficiently smaller than the locations list to make up for
e.g. losing the "near" functionality.  I don't really know, but
offhand I don't think so.  Maybe someone has studied this
somewhere?

It seems to me that both plans can take up roughly the same
space (a list of locations or before/after WordID:s), while plan
2 should be preferred as being less constrained and directly
giving more functionality than plan 3.
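As an illustration of the extra functionality, a "near" query falls out directly from per-word location lists. The following is only a sketch under my own assumptions: the `near` helper, the `max_distance` parameter and the toy index are hypothetical, not anything taken from ht://Dig.

```python
def near(index, a, b, max_distance):
    """Return True if some occurrence of a lies within max_distance positions of b.

    index maps each word to the list of positions where it occurs
    (the plan 2 record shape, with plain strings standing in for WordIDs).
    """
    return any(
        abs(i - j) <= max_distance
        for i in index.get(a, [])
        for j in index.get(b, [])
    )

# Toy location index for "the cat sat on the mat the cat ran".
positions = {
    "the": [0, 4, 6],
    "cat": [1, 7],
    "sat": [2],
    "on": [3],
    "mat": [5],
    "ran": [8],
}

cat_near_mat = near(positions, "cat", "mat", 2)   # positions 7 and 5 are 2 apart
sat_near_ran = near(positions, "sat", "ran", 3)   # positions 2 and 8 are 6 apart
```

With only before/after neighbour lists there is no distance information to compare, so the same query cannot be answered beyond directly adjacent words.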

brgds, H-P