On Thu, 7 Jun 2001, Marcio Marchini wrote:

> Is it fair to say that if I am indexing HTML files and throwing away
> the text (max_head_length: 1), the index will be about 30% of the
> original data with 3.1?

Not necessarily; the ratio depends on your documents. If you're only
interested in space savings and rarely update your databases, then you
can delete db.wordlist on top of what you described, and you'd probably
be OK.

> Is it true that 3.2 produces indexes 1/2 the size? So, would it be
> 15% of the total size of the HTML files?

Probably not quite. (And I'm not sure where you got the 1/2 figure from.)

The databases in the 3.2 code are compressed, and the amount of
compression varies quite a bit with things like how frequently words
occur. Again, if you're only updating rarely, you can delete the doc
index and save some space.
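
Since the ratio varies so much, the easiest thing is to measure it on
your own data. Here is a minimal Python sketch of that, not anything
that ships with ht://Dig. The function names and the example paths are
made up for illustration; point them at your own document tree and
database directory. The skip argument lets you see what the ratio
would be without a file such as db.wordlist before you actually delete
it.

```python
import os


def dir_size(path, skip=()):
    """Total size in bytes of regular files under path.

    Files whose base name appears in skip are ignored.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            if name in skip:
                continue
            full = os.path.join(root, name)
            if os.path.isfile(full):  # ignore broken symlinks
                total += os.path.getsize(full)
    return total


def index_ratio(html_dir, db_dir, skip=()):
    """Size of the databases as a fraction of the source documents."""
    source = dir_size(html_dir)
    if source == 0:
        raise ValueError("no source data found under %r" % html_dir)
    return dir_size(db_dir, skip) / source


# Hypothetical usage; both paths are assumptions, adjust to your setup:
#   index_ratio("/var/www/html", "/var/lib/htdig/db")
#   index_ratio("/var/www/html", "/var/lib/htdig/db", skip=("db.wordlist",))
```

Run it once after a full dig and compare the two numbers; that tells
you more about your own collection than any rule of thumb.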

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

