According to Neal Richter:
> 1.  Add a new config verb to let users use zlib WordDB-page compression.
>       This would be an option for users who run into this error:
> 
> FATAL ERROR:Compressor::get_vals invalid comptype
> FATAL ERROR at file:WordBitCompress.cc line:827 !!!
> 
>  If you look into the db/mp_cmpr.c code (Loic's Compressed BDB page code)
>  you'll find these two functions:
>       CDB___memp_cmpr_inflate(...)
>       CDB___memp_cmpr_deflate(...)
...
>    Merging Loic's latest mifluz is supposed to fix this problem (Geoff
> and I have been working on this), but so far the merge is fairly complex
> and needs much more work and long term testing.  This is a decent
> solution.

Sounds reasonable as an interim solution.  I wonder, though, if
it wouldn't be a quicker/easier fix to backport just the inflate and
deflate code from the latest mifluz package to the existing 3.2.0b4 code.
Would that fix this particular problem without all the headaches of
merging in all the latest mifluz code?
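
To make the idea concrete, here is a minimal sketch of what zlib compression
of a single page amounts to.  This is not the mp_cmpr.c code and does not
call the CDB___memp_cmpr_* functions; it is a standalone Python illustration
of the deflate/inflate round trip.  PAGE_SIZE, deflate_page and inflate_page
are names I made up for the example, and the 8192-byte page size is only an
assumption.

```python
import zlib

PAGE_SIZE = 8192  # assumed page size, for illustration only


def deflate_page(page: bytes) -> bytes:
    """Compress one fixed-size page with zlib (hypothetical helper)."""
    if len(page) != PAGE_SIZE:
        raise ValueError("page must be exactly PAGE_SIZE bytes")
    return zlib.compress(page, 6)  # 6 is a middle-of-the-road compression level


def inflate_page(blob: bytes) -> bytes:
    """Decompress a blob back into one fixed-size page (hypothetical helper)."""
    page = zlib.decompress(blob)
    if len(page) != PAGE_SIZE:
        raise ValueError("decompressed page has unexpected size")
    return page


# Build a sample page out of repetitive word/docid/location records.
record = b"affect\x00323\x0043\n"
page = (record * (PAGE_SIZE // len(record) + 1))[:PAGE_SIZE]

blob = deflate_page(page)
restored = inflate_page(blob)
```

The point is only that a general-purpose compressor round-trips any page
content, so it cannot hit an "invalid comptype" style failure on unusual
data; how well it compresses real WordDB pages would need measuring.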

> 2.  The inverted index is not very efficient in general.
> 
> The current scheme:
> 
> WORD    DOCID   LOCATION
> affect  323    43  
> affect  323    53  
...
> A more efficient inverted system 
> 
> affect  323    43, 53
...
> If the fixed-width Location field were around 256 characters, it would
> hold roughly 40-50 location codes of 1 to 4 digits each, so the vast
> majority of the time a second row would not be needed.  For large
> documents this would change, but it would still be much more efficient.
> 
> Eh? Feedback?

Sounds like an excellent idea to me.  I'm rather surprised they didn't do
that in mifluz already (or is something like this in the newer code?).
This does mean a deviation from the mifluz code base, but it seems
that's inevitable anyway, given the efforts to crowbar the latest code
into ht://Dig, and the lack of support from the mifluz developers.

I guess it also means making the change twice - once in the current
ht://Dig code and again after the mifluz code merge.  Or is all this
at a level that can be done with minimal changes after the merge?
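
For discussion, here is a rough sketch of the packing scheme as I understand
it from the description above.  It is Python rather than ht://Dig code, and
pack_postings, FIELD_WIDTH and the comma-separated text encoding are my own
assumptions, not anything in the current or mifluz code.

```python
from collections import defaultdict

FIELD_WIDTH = 256  # assumed width of the fixed Location field, in characters


def pack_postings(rows, field_width=FIELD_WIDTH):
    """Group (word, docid, location) rows into (word, docid, "loc,loc,...") rows.

    A new row is started for the same word/docid whenever the next
    location would overflow the fixed-width field.
    """
    grouped = defaultdict(list)
    for word, docid, loc in rows:
        grouped[(word, docid)].append(loc)

    packed = []
    for (word, docid), locs in sorted(grouped.items()):
        field = ""
        for loc in sorted(locs):
            text = str(loc)
            candidate = text if not field else field + "," + text
            if field and len(candidate) > field_width:
                # Field is full: emit this row and start an overflow row.
                packed.append((word, docid, field))
                field = text
            else:
                field = candidate
        packed.append((word, docid, field))
    return packed


current = [("affect", 323, 43), ("affect", 323, 53)]
packed = pack_postings(current)
```

With the two sample rows, `packed` holds a single row for "affect" in
document 323.  With four-digit codes and a one-character separator, a
256-character field fits 51 locations before an overflow row is needed,
which is in line with the estimate above.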

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)


_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev
