> > Do we REALLY need this function to go to the complexity of calling
> > 'Unpack' and comparing the keys?  Why not treat the entire key
> > bitstream after the word-string as a binary compare and return?
>
> That depends on how the packing is done. Udi Manber (among others
> probably) outlined various strategies for comparing compressed strings
> w/o unpacking. In this case, I think as long as you have a one-to-one
> mapping and you're consistently doing comparisons for *every* key,
> you're fine. The ordering will be different on the compressed strings,
> but as long as everything is unique, I can't think of a problem.


FYI:

For a small dataset of ~630 documents, with nearly 1000 word-rows per
document added to WordDB, the profile shows these call counts:

713123     0.00     0.00  word_db_cmp(__db_dbt const *, __db_dbt const *)
713123     0.00     0.00  WordKey::Compare(char const *, int, char const *, int)
580392     0.00     0.00  WordKey::UnpackNumber(unsigned char const *, int, unsigned int &, int, int)

This is a lot of calls!

The number of calls to word_db_cmp won't change, but the number of
calls to the other two can be reduced!
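
To make the binary-compare idea from the quoted question concrete, here
is a minimal sketch of what the tail comparison might look like. It
assumes the caller has already compared the word-string and passes in
only the packed bytes that follow it. The name packed_tail_cmp and its
signature are hypothetical, not part of WordKey. As noted above, the
resulting order can differ from the order of the unpacked numeric
fields, so this only works if it is used consistently for every key.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not an existing WordKey method): compares the
// packed bytes that follow the word-string without unpacking them.
// Returns <0, 0 or >0, like memcmp. If one tail is a prefix of the
// other, the shorter one sorts first.
int packed_tail_cmp(const unsigned char *a, std::size_t a_len,
                    const unsigned char *b, std::size_t b_len)
{
    const std::size_t n = a_len < b_len ? a_len : b_len;

    // Compare the shared prefix byte by byte.
    const int r = (n != 0) ? std::memcmp(a, b, n) : 0;
    if (r != 0)
        return r;

    // Shared prefix is identical: fall back to comparing lengths.
    if (a_len == b_len)
        return 0;
    return a_len < b_len ? -1 : 1;
}
```

With something like this, each call to word_db_cmp would cost one
memcmp over the tail instead of a series of UnpackNumber calls.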

Thanks.

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485



