Hey,

So here's a very weird observation.

It looks to me like db.words.db stores everything in the key and has a blank value for each and every key!

How did I find this?

1. I indexed a single web page consisting of the 'Gettysburg Address' (250+ words).
2. I added printfs to htPack/htUnpack & WordDB::Get & WordDB::Put.

This is what I see during an htdig run:

    HtPack [] db->put key=[people] data=[] flags=[0]
    HtPack [] db->put key=[people] data=[] flags=[0]
    HtPack [] db->put key=[people] data=[] flags=[0]
    HtPack [] db->put key=[perish] data=[] flags=[0]
    HtPack [] db->put key=[earth] data=[] flags=[0]
    HtPack [] db->put key=[abraham] data=[] flags=[0]
    HtPack [] db->put key=[lincoln] data=[] flags=[0]

Nothing in the data value! This seems to contradict (in spirit) what's in the db.worddump produced by htdump!

I also downloaded and built the 3.0.55 BDB and used its db_dump utility to dump db.words.db. This is what I get (the first line of each pair is the key, the following line is the value):

    %db_dump_3055 -pk db.words.db
    people\02\00\00\00\00\c9\00
    \00
    people\02\00\00\00\00\cb\00
    \00
    people\02\00\00\00\00\ce\00
    \00
    perish\02\00\00\00\00\d1\00
    \00
    poor\02\00\00\00\00j\00
    \00
    portion\02\00\00\00\009\00
    \00
    power\02\00\00\00\00k\00
    \00

Note the zeros in the VALUE!!

Here are the relevant entries in db.worddump:

    people 2 0 201 0
    people 2 0 203 0
    people 2 0 206 0
    perish 2 0 209 0
    poor 2 0 106 0

    c9 = 201
    cb = 203
    ce = 206

So the word, the document data, and the location are all packed into the key, one record per occurrence, and the value carries nothing. This is brain dead for an inverted index! It should at least be:

    key = 'people\02', value = '\00\00\00\00\c9\00'

A more efficient solution, which would make the index smaller, would be this:

    key = 'people\02', value = '\00\00\00\00\c9\cb\ce\00'

Eh?

Thanks.

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
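For anyone who wants to check the byte arithmetic above, here is a minimal Python sketch (not ht://Dig code). It decodes the printable form that db_dump -p emits (printable characters as themselves, a doubled backslash for a backslash, anything else as a backslash plus two hex digits) and then groups locations per word and document, roughly as in the proposed layout. The helper names decode_db_dump and group_locations are hypothetical, and two things are assumptions drawn only from the samples in this message: that every key ends in a 7-byte suffix, and that the second and fourth columns of db.worddump are the document ID and the word location.

```python
from collections import defaultdict


def decode_db_dump(text: str) -> bytes:
    """Decode one key or value line of `db_dump -p` output into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\":
            if text[i + 1] == "\\":
                # A doubled backslash stands for one literal backslash.
                out.append(0x5C)
                i += 2
            else:
                # Backslash plus two hex digits stands for one byte.
                out.append(int(text[i + 1:i + 3], 16))
                i += 3
        else:
            # Printable characters are written as themselves.
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


def group_locations(rows):
    """Collect locations per (word, doc_id), one entry per word and document.

    `rows` is an iterable of (word, doc_id, location) tuples. The column
    meanings are an assumption, not something taken from the ht://Dig source.
    """
    grouped = defaultdict(list)
    for word, doc_id, location in rows:
        grouped[(word, doc_id)].append(location)
    return dict(grouped)


# Keys copied from the db_dump output in this message.
dump_keys = [
    r"people\02\00\00\00\00\c9\00",
    r"people\02\00\00\00\00\cb\00",
    r"people\02\00\00\00\00\ce\00",
    r"perish\02\00\00\00\00\d1\00",
    r"poor\02\00\00\00\00j\00",
]

SUFFIX_LEN = 7  # assumption: observed in the samples above, not a documented size

for line in dump_keys:
    raw = decode_db_dump(line)
    word, suffix = raw[:-SUFFIX_LEN], raw[-SUFFIX_LEN:]
    # Show the word and the packed bytes that follow it, in hex.
    print(word.decode("ascii"), suffix.hex(" "))

# Rows taken from the db.worddump lines (columns 1, 2 and 4).
worddump_rows = [
    ("people", 2, 201),
    ("people", 2, 203),
    ("people", 2, 206),
    ("perish", 2, 209),
    ("poor", 2, 106),
]
print(group_locations(worddump_rows))
```

With the current layout the three 'people' occurrences cost three full records; grouped this way they become one entry holding three locations, which is where the size saving would come from. Whether prefix compression inside BDB already recovers part of that saving is not something this sketch measures.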
_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev
