> > Do we REALLY need this function to go to the complexity of calling
> > 'Unpack' and comparing the keys? Why not treat the entire key
> > bitstream after the word-string as a binary compare and return?
>
> That depends on how the packing is done. Udi Manber (among others
> probably) outlined various strategies for comparing compressed strings
> w/o unpacking. In this case, I think as long as you have a one-to-one
> mapping and you're consistently doing comparisons for *every* key,
> you're fine. The ordering will be different on the compressed strings,
> but as long as everything is unique, I can't think of a problem.
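
To make the ordering point concrete, here is a minimal C++ sketch of a raw byte-wise compare over packed buffers. The names raw_key_compare, pack_le and pack_be are hypothetical, and the 4-byte layouts are assumptions chosen for illustration; they are not the actual WordKey packed format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical raw comparison: three-way, byte-wise, no unpacking.
// Same argument shape as a (buffer, length, buffer, length) compare.
int raw_key_compare(const char* a, int a_len, const char* b, int b_len) {
    assert(a_len >= 0 && b_len >= 0);
    int n = a_len < b_len ? a_len : b_len;
    // memcmp compares bytes as unsigned char, so the result does not
    // depend on whether plain char is signed on this platform.
    int c = std::memcmp(a, b, static_cast<std::size_t>(n));
    if (c != 0) return c < 0 ? -1 : 1;
    if (a_len == b_len) return 0;
    return a_len < b_len ? -1 : 1;  // shorter buffer sorts first on a tie
}

// Assumed layout 1: 32-bit value, least-significant byte first.
std::string pack_le(unsigned int v) {
    std::string s(4, '\0');
    for (int i = 0; i < 4; ++i)
        s[i] = static_cast<char>((v >> (8 * i)) & 0xFF);
    return s;
}

// Assumed layout 2: 32-bit value, most-significant byte first.
std::string pack_be(unsigned int v) {
    std::string s(4, '\0');
    for (int i = 0; i < 4; ++i)
        s[i] = static_cast<char>((v >> (8 * (3 - i))) & 0xFF);
    return s;
}
```

Under the little-endian layout, 1 sorts after 256 in the raw compare, so the order differs from the numeric one. Under the big-endian layout the raw order matches the numeric order. In both cases the packing is one-to-one, so two buffers compare equal only when the values are equal. That should be enough for exact-key lookups, while anything that walks keys in numeric order would see a different sequence.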

FYI: For a small dataset of ~630 documents with nearly 1000 word-rows
per document added to WordDB:

713123  0.00  0.00  word_db_cmp(__db_dbt const *, __db_dbt const *)
713123  0.00  0.00  WordKey::Compare(char const *, int, char const *, int)
580392  0.00  0.00  WordKey::UnpackNumber(unsigned char const *, int, unsigned int &, int, int)

This is a lot of calls! The number of calls to word_db_cmp won't
change, but the others can!

Thanks.

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
